Nexdata-AI / 1796-Hours-German-Speech-Data-by-Mobile-Phone

German Speech Dataset

Home Page: https://www.nexdata.ai/datasets/949?source=Github

1796-Hours-German-Speech-Data-by-Mobile-Phone

Description

This dataset contains 1,796 hours of German speech recorded on mobile phones by 3,442 native German speakers. The recording text was designed by linguistic experts and covers generic, interactive, in-car, smart home, and other categories. The text has been manually proofread to a high level of accuracy. The data can be used for automatic speech recognition, machine translation, and voiceprint recognition.

For more details, please refer to the link: https://bit.ly/3SeiITw

Format

16 kHz, 16-bit, uncompressed WAV, mono channel
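
As a quick sanity check, the format listed above can be verified from a file's WAV header. The following is a minimal sketch using only Python's standard `wave` module. The function names `matches_spec` and `duration_seconds` are hypothetical helpers for illustration, not part of any Nexdata tooling, and the sketch assumes the files are plain PCM WAV.

```python
import wave


def matches_spec(path):
    """Return True if the WAV header reports 16 kHz, 16-bit, mono, uncompressed PCM."""
    with wave.open(path, "rb") as wav:
        return (
            wav.getframerate() == 16000      # 16 kHz sample rate
            and wav.getsampwidth() == 2      # 2 bytes per sample = 16-bit
            and wav.getnchannels() == 1      # mono
            and wav.getcomptype() == "NONE"  # uncompressed PCM
        )


def duration_seconds(path):
    """Return the clip length in seconds, computed from the frame count."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

Running `matches_spec` over every file before training is a cheap way to catch clips that were resampled or converted somewhere along the way.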

Recording Environment

quiet indoor environment, low background noise, without echo

Recording content (read speech)

generic category; human-machine interaction category; smart home command and control category; in-car command and control category; numbers

Demographics

3,442 speakers in total: 44% male and 56% female. 60% of speakers are aged 18-25, 35% are aged 26-45, and 5% are aged 46-60.
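
To put the percentages above in absolute terms, the short sketch below converts them to approximate speaker counts and estimates the average recording time per speaker. The published percentages are rounded, so the counts are estimates only. The average assumes the 1,796 hours are spread evenly across speakers, which the source does not state.

```python
TOTAL_SPEAKERS = 3442
TOTAL_HOURS = 1796

# Percentages as published; they are rounded, so derived counts are approximate.
GENDER_PCT = {"male": 44, "female": 56}
AGE_PCT = {"18-25": 60, "26-45": 35, "46-60": 5}


def approx_counts(percentages, total=TOTAL_SPEAKERS):
    """Convert a percentage breakdown into rounded speaker counts."""
    return {group: round(total * pct / 100) for group, pct in percentages.items()}


gender_counts = approx_counts(GENDER_PCT)
age_counts = approx_counts(AGE_PCT)

# Assumption: recording time is spread evenly across speakers.
avg_minutes_per_speaker = TOTAL_HOURS / TOTAL_SPEAKERS * 60

print(gender_counts)
print(age_counts)
print(f"{avg_minutes_per_speaker:.1f} minutes per speaker on average")
```

Under these assumptions, each speaker contributes roughly half an hour of audio, and the youngest age group accounts for about 2,000 of the speakers.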

Device

Android mobile phone, iPhone

Language

German

Application Scene

speech recognition; voiceprint recognition

Licensing Information

Commercial License: https://drive.google.com/file/d/1saDCPm74D4UWfBL17VbkTsZLGfpOQj1J/view?usp=sharing