asr automatic-speech-recognition corpus corpus-tools

ASR Corpus by Microphone

Overview

This repository contains a script that collects speech data from your microphone.

Watch the video below to see how it works:

Examples of Usage

Collect an ASR corpus with your computer in places without an internet connection (this is important for low-resource languages)
Split speech into chunks using a Voice Activity Detection (VAD) mechanism
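
To illustrate the idea of splitting speech into chunks with VAD, here is a minimal sketch. The acknowledgements credit Silero VAD, but this sketch does not use it: it substitutes a plain energy threshold so that it runs with the standard library only. The function names and the parameters frame_size, threshold and min_silence_frames are illustrative assumptions and are not taken from record_and_split.py.

```python
import math
from typing import List, Sequence, Tuple


def frame_rms(samples: Sequence[float]) -> float:
    """Root-mean-square energy of one frame of samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def split_by_energy(
    samples: Sequence[float],
    frame_size: int = 160,
    threshold: float = 0.1,
    min_silence_frames: int = 3,
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of chunks with speech-like energy.

    A chunk opens at the first frame whose RMS exceeds `threshold` and closes
    after `min_silence_frames` consecutive quiet frames. The end index points
    just past the last loud frame, so trailing silence is trimmed.
    """
    chunks: List[Tuple[int, int]] = []
    start = None        # sample index where the current chunk began
    last_loud_end = 0   # index just past the most recent loud frame
    quiet = 0           # consecutive quiet frames inside the current chunk

    for offset in range(0, len(samples), frame_size):
        frame = samples[offset:offset + frame_size]
        if frame_rms(frame) > threshold:
            if start is None:
                start = offset
            last_loud_end = offset + len(frame)
            quiet = 0
        elif start is not None:
            quiet += 1
            if quiet >= min_silence_frames:
                chunks.append((start, last_loud_end))
                start, quiet = None, 0

    # The audio may end while a chunk is still open.
    if start is not None:
        chunks.append((start, last_loud_end))
    return chunks
```

A short pause (fewer than min_silence_frames quiet frames) stays inside one chunk, while a longer pause starts a new one. A trained model such as Silero VAD is likely to be far more robust to background noise than a fixed threshold.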

Installation

Install Python requirements:

Linux

# the author has successfully tested the project with wave==0.0.2, torch==1.11.0, torchaudio==0.11.0, sox==1.4.1, and pyaudio==0.2.11
pip install wave torch torchaudio pyaudio sox

MacOS

brew install portaudio sox

pip install wave
pip install --global-option='build_ext' --global-option='-I/usr/local/include' --global-option='-L/usr/local/lib' pyaudio

To install torch and torchaudio on MacOS, first install conda or miniconda (I recommend this), then install the torch libraries:

For Intel:

conda install pytorch torchaudio -c pytorch

For M1:

pip3 install torch torchaudio

If you have problems installing pyaudio, check out this link. For me, the command below works:

pip3 install --global-option='build_ext' --global-option='-I/opt/homebrew/Cellar/portaudio/19.7.0/include/' --global-option='-L/opt/homebrew/Cellar/portaudio/19.7.0/lib/' pyaudio
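
After installation, it can help to confirm that Python can locate every package before starting a recording session. The following is a small sketch of such a check. The helper name check_modules and the REQUIRED_MODULES tuple are assumptions made for illustration; the module names are simply the packages from the install commands above.

```python
import importlib.util
from typing import Dict, Iterable

# Import names of the packages listed in the install commands above.
# Caveats: `wave` also exists in the Python standard library, so it is
# always reported as found; `sox` here is the Python package, and the
# separate sox binary is not checked.
REQUIRED_MODULES = ("wave", "torch", "torchaudio", "pyaudio", "sox")


def check_modules(names: Iterable[str] = REQUIRED_MODULES) -> Dict[str, bool]:
    """Map each top-level module name to True if Python can locate it.

    find_spec only looks the module up; it does not import it, so the
    check stays fast even for large packages such as torch.
    """
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    for name, found in check_modules().items():
        print(f"{name:12s} {'ok' if found else 'MISSING'}")
```

A package reported as MISSING points to the install step that needs to be repeated. A package reported as ok can still fail at import time, for example if pyaudio was built against a missing portaudio library.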

Running

# Create folders where audio files will appear
mkdir data
mkdir speech

# Run the loop (this script will record speech and save it into the speech/ folder)
# Use Ctrl-C to stop the script
python record_and_split.py
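
Once the script has run for a while, you may want to know how much audio has been collected. The sketch below sums the durations of the files in the speech/ folder using the wave module from the Python standard library. It assumes the recordings are PCM files with a .wav extension, which this README does not state explicitly; the function names are hypothetical.

```python
import wave
from pathlib import Path
from typing import Dict


def wav_duration_seconds(path: Path) -> float:
    """Duration of a PCM WAV file, computed from its header."""
    with wave.open(str(path), "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def summarize_corpus(folder: str = "speech") -> Dict[str, float]:
    """Map each .wav file name in `folder` to its duration in seconds."""
    return {
        path.name: wav_duration_seconds(path)
        for path in sorted(Path(folder).glob("*.wav"))
    }
```

For example, calling sum(summarize_corpus("speech").values()) gives the total number of seconds recorded so far.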

Help

If you run into any problems, create an issue in the repository
Currently tested on Linux and MacOS; for Windows, you need to change the script slightly

Acknowledgements

Silero VAD: https://github.com/snakers4/silero-vad
PyAudio: https://people.csail.mit.edu/hubert/pyaudio/
wave: https://pythonhosted.org/Wave/

About

This is a simple solution for people who want to create their own corpus for Automatic Speech Recognition with just a microphone

MIT License