PerceptiLabs / yesno_voice_recognition

Yes No voice recognition Use Case.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Voice classification to recognize Yes-No instructions

This dataset1 contains sounds and spectrogram images of the words "yes" and "no".

The spectrogram images can be used to build and train an ML model that can classify a yes/no sound converted to spectrogram with high accuracy.

Structure

This repo contains the following structure:

  • sound_2_spectrograms.ipynb Jupyter notebook for generating spectrogram image files from sound files.
  • train: original sounds (in .wav format), converted to mono.
  • train_images: spectrogram images (in .jpg format) corresponding to each sound. These were generated using sound_2_spectrograms.ipynb.
  • dataset.csv: CSV file that maps spectrogram files to yes/no classification labels.

The following shows a partial example of the data stored in dataset.csv:

labels image
yes train_images/yes0.jpg
yes train_images/yes1.jpg
no train_images/no0.jpg
no train_images/no1.jpg

The yes/no labels in the CSV map the image files to the yes or no classifications respectively.

After the model has been trained it can be used for inference. For this, you could use sound_2_spectrograms.ipynb to generate a spectrogram from a given .wav file and then pass that spectrogram image file to the trained model for classification.

Community

Got questions, feedback, or want to join a community of machine learning practitioners working with exciting tools and projects? Check out our Community!

1 Dataset Credits: https://github.com/vi/codegolf-jein

About

Yes No voice recognition Use Case.


Languages

Language:Jupyter Notebook 100.0%