egorsmkv / asr-corpus-by-microphone

This is a simple solution for people who want to create their own corpus for Automatic Speech Recognition (ASR) with just a microphone.

ASR Corpus by Microphone

Overview

This repository contains a script that collects speech data from your microphone.

Examples of Usage

  • Collect an ASR corpus on your computer in places without an internet connection (this is important for low-resource languages)
  • Split speech into chunks using a Voice Activity Detection (VAD) mechanism
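
To illustrate what splitting speech into chunks by voice activity means, here is a minimal sketch that uses a plain energy threshold. It is not the method used by the script in this repository, which may rely on a different (for example, model-based) detector. The function names `frame_rms` and `split_by_energy`, and the parameters `frame_len`, `threshold`, and `min_silence_frames`, are hypothetical and chosen only for this example.

```python
import math
from typing import List, Tuple


def frame_rms(samples: List[float], frame_len: int) -> List[float]:
    """Return the root-mean-square energy of each full frame."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(math.sqrt(sum(s * s for s in frame) / frame_len))
    return energies


def split_by_energy(
    samples: List[float],
    frame_len: int = 160,          # assumed: 10 ms at 16 kHz
    threshold: float = 0.1,        # assumed energy threshold, tune per microphone
    min_silence_frames: int = 3,   # silent frames needed to close a chunk
) -> List[Tuple[int, int]]:
    """Return (start, end) sample indices of chunks whose frames exceed the threshold."""
    chunks = []
    chunk_start = None
    silence_run = 0
    energies = frame_rms(samples, frame_len)
    for i, energy in enumerate(energies):
        if energy >= threshold:
            if chunk_start is None:
                chunk_start = i * frame_len
            silence_run = 0
        elif chunk_start is not None:
            silence_run += 1
            if silence_run >= min_silence_frames:
                # Close the chunk at the end of the last voiced frame.
                chunks.append((chunk_start, (i - silence_run + 1) * frame_len))
                chunk_start = None
                silence_run = 0
    if chunk_start is not None:
        # The recording ended while a chunk was still open.
        chunks.append((chunk_start, (len(energies) - silence_run) * frame_len))
    return chunks
```

An energy threshold is easy to reason about but is sensitive to background noise, so a trained VAD model is usually a better choice for real recordings.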

Installation

Install Python requirements:

Linux

# the author has successfully tested the project with wave==0.0.2, torch==1.11.0, torchaudio==0.11.0, sox==1.4.1, and pyaudio==0.2.11
pip install wave torch torchaudio pyaudio sox

macOS

brew install portaudio sox

pip install wave
pip install --global-option='build_ext' --global-option='-I/usr/local/include' --global-option='-L/usr/local/lib' pyaudio

To install torch and torchaudio on macOS, first install conda or miniconda (which I recommend), then install the torch libraries:

For Intel:

conda install pytorch torchaudio -c pytorch

For M1:

pip3 install torch torchaudio

If you have problems installing pyaudio, check out this link. The command below works for me:

pip3 install --global-option='build_ext' --global-option='-I/opt/homebrew/Cellar/portaudio/19.7.0/include/' --global-option='-L/opt/homebrew/Cellar/portaudio/19.7.0/lib/' pyaudio
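
After installation, it can help to confirm that the packages are importable before running the script. The following is a small sketch, not part of the repository; the helper name `check_packages` is hypothetical, and the module names are assumed to match the package names listed in the installation steps.

```python
import importlib.util


def check_packages(names):
    """Map each top-level module name to True if Python can locate it, else False."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


if __name__ == "__main__":
    # Assumed import names for the packages installed above.
    required = ["torch", "torchaudio", "pyaudio", "sox", "wave"]
    for name, found in check_packages(required).items():
        print(f"{name}: {'ok' if found else 'missing'}")
```

This only checks that each module can be located; it does not verify versions or that the microphone is accessible.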

Running

# Create folders where audio files will appear
mkdir data
mkdir speech

# Run the loop (this script will record speech and save it into the speech/ folder)
# Use Ctrl-C to stop the script
python record_and_split.py
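
For readers curious about what saving a recorded chunk into a folder such as speech/ can look like, here is a minimal sketch using Python's standard-library `wave` module. It is an illustration under assumptions, not the code of `record_and_split.py`: the function name `save_chunk`, the 16 kHz sample rate, the mono 16-bit format, and the file name in the usage example are all hypothetical.

```python
import struct
import wave
from pathlib import Path


def save_chunk(samples, path, sample_rate=16000):
    """Write float samples in [-1.0, 1.0] as a mono 16-bit PCM WAV file."""
    path = Path(path)
    # Create the target folder if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    # Clip to the valid range and scale to signed 16-bit integers.
    pcm = [int(max(-1.0, min(1.0, s)) * 32767) for s in samples]
    with wave.open(str(path), "wb") as wav_file:
        wav_file.setnchannels(1)           # mono (assumed)
        wav_file.setsampwidth(2)           # 2 bytes = 16 bits per sample
        wav_file.setframerate(sample_rate)
        wav_file.writeframes(struct.pack(f"<{len(pcm)}h", *pcm))
    return path
```

For example, `save_chunk(chunk, "speech/chunk_0001.wav")` would write one chunk to the speech/ folder, creating the folder first if needed.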

Help

Acknowledgements

About

License:MIT License

