kevobt / speech-to-text-voxforge

Downloader for the voxforge corpus

speech-to-text-voxforge

Download the speech corpus

To download the speech corpus, run:

python downloader.py "voxforge-corpus"

You can additionally specify the number of speaker directories to download using -n, and the number of threads to use for the download using -w:

python downloader.py "voxforge-corpus" -n 20000 -w 15
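As a rough illustration of how the two options could fit together, here is a minimal sketch of a command line that accepts -n and -w and fans the work out over a thread pool. It is not the actual code of downloader.py: the names parse_args, download_all and fetch, the attribute names limit and workers, and the default of 4 threads are all assumptions made for this example.

```python
import argparse
from concurrent.futures import ThreadPoolExecutor


def parse_args(argv=None):
    """Parse a command line shaped like the one above (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="Download the voxforge corpus (sketch)")
    parser.add_argument("target", help="directory the corpus is downloaded into")
    # -n limits how many speaker directories are fetched; None means no limit.
    parser.add_argument("-n", dest="limit", type=int, default=None)
    # -w sets the number of worker threads; the default of 4 is an assumption.
    parser.add_argument("-w", dest="workers", type=int, default=4)
    return parser.parse_args(argv)


def download_all(names, fetch, limit=None, workers=4):
    """Call fetch(name) for the first `limit` names using `workers` threads.

    `fetch` is a caller-supplied function that downloads one speaker
    directory; it is injected here so the sketch does not assume any URL.
    """
    selected = list(names) if limit is None else list(names)[:limit]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as `selected`.
        return list(pool.map(fetch, selected))
```

With this shape, -n only trims the list of speaker directories before any work starts, while -w only controls how many downloads run at the same time.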

Generate training data

To generate a training data file for the speech recognition tool, run generator.py with two arguments: the path to the directory the voxforge corpus was downloaded into, and the path of the new file in which the training data should be stored. The data is stored as JSON.

python generator.py "voxforge-corpus" "training_data.json"
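The following sketch shows one way such a JSON file could be built from a downloaded corpus. It is not the actual code of generator.py, and the JSON layout it writes is not necessarily the one the tool produces. It assumes that every speaker directory holds a transcript file at etc/prompts-original, with one "<utterance id> <transcript>" pair per line, and the matching audio at wav/<utterance id>.wav. The function names and the "audio" and "text" keys are hypothetical.

```python
import json
from pathlib import Path


def collect_samples(corpus_dir):
    """Pair each transcript line with its audio file (assumed corpus layout)."""
    samples = []
    for speaker_dir in sorted(Path(corpus_dir).iterdir()):
        prompts = speaker_dir / "etc" / "prompts-original"
        if not prompts.is_file():
            continue  # skip entries that do not look like a speaker directory
        text = prompts.read_text(encoding="utf-8", errors="replace")
        for line in text.splitlines():
            parts = line.split(maxsplit=1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            utterance_id, transcript = parts
            wav = speaker_dir / "wav" / f"{utterance_id}.wav"
            if wav.is_file():
                samples.append({"audio": str(wav), "text": transcript.strip()})
    return samples


def write_training_data(corpus_dir, output_file):
    """Write all collected samples to output_file as JSON; return the count."""
    samples = collect_samples(corpus_dir)
    with open(output_file, "w", encoding="utf-8") as handle:
        json.dump(samples, handle, indent=2)
    return len(samples)
```

Under these assumptions, write_training_data("voxforge-corpus", "training_data.json") would mirror the command above. Transcript lines without a matching audio file are skipped, so the output only lists samples that can actually be loaded.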

About

License:GNU General Public License v3.0


Languages

Language:Python 100.0%