keyword-spotter speech-recognition speech-commands tensorflow2 flask-restful

Keyword spotter

A simple project on speech recognition.

Sebastian Thomas (datascience at sebastianthomas dot de)

In this project, we aim to recognize a spoken keyword from a fixed list of ten keywords.

It is an extension of the introductory tutorial on speech command recognition from TensorFlow.

It uses version 0.0.2 of Pete Warden's speech_commands dataset. The dataset contains 105,829 WAV files, each at most 1 second long. Each file contains one spoken command from a list of 35 commands.
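
Because the clips are at most 1 second long but not all the same length, a classifier with a fixed input size needs them brought to a common length first. The following is a minimal sketch of that step, not necessarily how this project does it. It assumes 16 kHz, 16-bit mono PCM audio as described in Warden's paper; the helper names (`read_wav_samples`, `pad_or_trim`, `make_test_clip`) are hypothetical, and the clip is generated in memory as a stand-in for a real dataset file.

```python
import io
import struct
import wave

SAMPLE_RATE = 16000          # assumption: 16 kHz sampling rate
TARGET_LENGTH = SAMPLE_RATE  # number of samples in one second


def read_wav_samples(file_obj):
    """Read 16-bit mono PCM samples from a WAV file (path or file object)."""
    with wave.open(file_obj, "rb") as wav:
        if wav.getnchannels() != 1 or wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit mono PCM audio")
        frames = wav.readframes(wav.getnframes())
    # Each sample is a little-endian signed 16-bit integer.
    return list(struct.unpack("<%dh" % (len(frames) // 2), frames))


def pad_or_trim(samples, target_length=TARGET_LENGTH):
    """Cut the clip to target_length or zero-pad it at the end."""
    samples = list(samples[:target_length])
    return samples + [0] * (target_length - len(samples))


def make_test_clip(num_samples, value=1000):
    """Build an in-memory WAV clip (hypothetical stand-in for a dataset file)."""
    buffer = io.BytesIO()
    with wave.open(buffer, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(SAMPLE_RATE)
        wav.writeframes(struct.pack("<%dh" % num_samples, *([value] * num_samples)))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    clip = make_test_clip(12000)  # a 0.75 second clip
    samples = pad_or_trim(read_wav_samples(clip))
    print(len(samples))
```

Padding at the end is only one option; centering the clip or padding at a random offset would work as well and doubles as a mild form of augmentation.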

For demonstration purposes, a REST API was implemented. This was inspired by a tutorial by Valerio Velardo from his series Deep Learning (Audio) Application: From Design to Deployment.

Content

Data mining, analysis, training and evaluation of the classifier:

Predictive analysis

Main development:

REST API:

Future work

tune more hyperparameters
use class weights for training (we have imbalanced classes)
add background noise to the instances
use other forms of data augmentation, e.g. time shifting
add a silence label
consider other classifier models
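
Two of the items above, class weights and time shifting, can be sketched in a few lines. This is an illustration under stated assumptions, not part of the project. The weighting uses the common "balanced" heuristic n_samples / (n_classes * class_count), which is one choice among several, and both function names are hypothetical.

```python
from collections import Counter


def balanced_class_weights(labels):
    """Return a dict mapping each class to n_samples / (n_classes * count).

    Rare classes get weights above 1, frequent classes get weights below 1.
    """
    counts = Counter(labels)
    n_samples = len(labels)
    n_classes = len(counts)
    return {label: n_samples / (n_classes * count)
            for label, count in counts.items()}


def time_shift(samples, shift):
    """Shift a clip by `shift` samples and zero-fill the gap.

    A positive shift delays the audio, a negative shift advances it.
    The length of the clip is preserved.
    """
    n = len(samples)
    if shift == 0:
        return list(samples)
    if abs(shift) >= n:
        return [0] * n
    if shift > 0:
        return [0] * shift + list(samples[:n - shift])
    return list(samples[-shift:]) + [0] * (-shift)


if __name__ == "__main__":
    # Integer class indices: class 0 occurs three times, class 1 once.
    print(balanced_class_weights([0, 0, 0, 1]))
    print(time_shift([1, 2, 3, 4], 1))
    print(time_shift([1, 2, 3, 4], -2))
```

A dictionary keyed by integer class index, like the one returned here, is the form that the `class_weight` argument of Keras `Model.fit` accepts. For augmentation, the shift would typically be drawn at random per training example, for instance within a small fraction of a second.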

References

Warden, Pete: Speech Commands: A Dataset for Limited-Vocabulary Speech Recognition. arXiv:1804.03209, 2018.

Velardo, Valerio: Deep Learning (Audio) Application: From Design to Deployment. YouTube, 2020.
