bytewife/automatic-speech-recognition-guide

Automatic Speech Recognition Primer

Motivation

Automatic Speech Recognition (ASR) is an interdisciplinary task that draws on Signal Processing, Natural Language Processing, and Machine Learning. This notebook explains each of these aspects for anyone new to ASR, and it's also how I'm teaching myself!

Status

(07/08/2021)
The Notebook's implementation of the learning model isn't complete, but its written explanations of DSP (digital signal processing), NLP, and ML-based ASR architecture still hold. For a working implementation of the Transformer architecture, see Apoorv Nandan's fantastic example, which makes up the entirety of my Notebook's current learning architecture.

Help Wanted

The model doesn't yet produce an appropriate loss! If you'd like to help with that, open an Issue :)

References

See the bottom of the Notebook for important readings!

About

Information on the ASR process, but the model needs your help!

https://colab.research.google.com/github/aith/speech-recognition-from-scratch/blob/main/index.ipynb

automatic-speech-recognition

Languages

Jupyter Notebook 95.5%, Python 4.5%