asr audio dataset deep-learning machine-learning speech speech-recognition speech-to-text

800-Hours-American-English-Speech-Data-by-Mobile-Phone

Description

1,842 native speakers of American English participated in the recording, each speaking with an authentic accent. The recording script was designed by linguists around usage scenarios and covers a wide range of topics, including generic, interactive, in-car and home content. The text is manually proofread to ensure high accuracy. The data is matched to mainstream Android and Apple phones.

For more details, please refer to the link: https://www.nexdata.ai/datasets/999?source=Github

Format

16kHz, 16bit, uncompressed wav, mono channel
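
As a minimal sketch, the stated format can be checked on a downloaded file with Python's standard `wave` module. The function names and the file path in the usage note are hypothetical and are not part of the dataset or any Nexdata tooling. The expected values come from the format line above.

```python
import wave

# Expected values taken from the Format section of this README.
EXPECTED_RATE_HZ = 16_000
EXPECTED_SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
EXPECTED_CHANNELS = 1            # mono


def check_wav_format(path):
    """Return a list of mismatches between a WAV file and the stated format.

    An empty list means the file header agrees with 16 kHz, 16-bit,
    uncompressed, mono.
    """
    problems = []
    with wave.open(path, "rb") as wav:
        if wav.getframerate() != EXPECTED_RATE_HZ:
            problems.append(f"sample rate is {wav.getframerate()} Hz")
        if wav.getsampwidth() != EXPECTED_SAMPLE_WIDTH_BYTES:
            problems.append(f"sample width is {wav.getsampwidth() * 8} bits")
        if wav.getnchannels() != EXPECTED_CHANNELS:
            problems.append(f"channel count is {wav.getnchannels()}")
        if wav.getcomptype() != "NONE":
            problems.append(f"compression type is {wav.getcomptype()}")
    return problems


def nominal_size_bytes(hours):
    """Raw PCM payload size for a given duration, ignoring WAV headers."""
    bytes_per_second = (
        EXPECTED_RATE_HZ * EXPECTED_SAMPLE_WIDTH_BYTES * EXPECTED_CHANNELS
    )
    return int(hours * 3600 * bytes_per_second)
```

For example, `check_wav_format("sample.wav")` (a hypothetical path) returns an empty list when the header matches. At this format, `nominal_size_bytes(800)` gives 92,160,000,000 bytes, roughly 92 GB of raw audio for the nominal 800 hours in the title. The actual download size may differ.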

Recording Environment

quiet indoor environment, low background noise, without echo

Recording content (read speech)

generic category; human-machine interaction category; smart home command and in-car command category; numbers

Demographics

1,842 speakers in total, 39% male and 61% female; 55% of speakers are in the 16-25 age group, 41% in the 26-45 age group, and 4% in the 46-69 age group
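
The percentages above can be turned into approximate head counts, which may help when planning balanced train and test splits. This is only a rough sketch: the published shares are rounded, so the results are estimates and not official per-group counts. The function and variable names are illustrative.

```python
# Figures taken from the Demographics section of this README.
TOTAL_SPEAKERS = 1842
SHARES_PERCENT = {
    "male": 39,
    "female": 61,
    "age 16-25": 55,
    "age 26-45": 41,
    "age 46-69": 4,
}


def approximate_counts(total, shares_percent):
    """Estimate speakers per group from rounded percentage shares."""
    return {
        group: round(total * percent / 100)
        for group, percent in shares_percent.items()
    }


if __name__ == "__main__":
    for group, count in approximate_counts(TOTAL_SPEAKERS, SHARES_PERCENT).items():
        print(f"{group}: about {count} speakers")
```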

Device

Android mobile phone, iPhone

Language

American English

Application scenarios

speech recognition; voiceprint recognition

Licensing Information

Commercial License
