Arjunp24 / automatic-speech-recognizer

Automatic Speech Recognizer


This project aims to build a deep neural network that functions as part of an end-to-end automatic speech recognition (ASR) pipeline.

The LibriSpeech dataset is used to train and evaluate the models. The pipeline first converts raw audio into feature representations that are commonly used for ASR. Neural networks are then built to map these audio features to transcribed text. Two kinds of audio features are considered: MFCC features and spectrograms.
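
As an illustration of the feature-extraction step, the following is a minimal NumPy sketch of a log-power spectrogram. It is not the project's own code: the function name `log_spectrogram`, the 20 ms window, the 10 ms step, and the Hann window are assumptions chosen for the example. The 16 kHz default matches the LibriSpeech sampling rate.

```python
import numpy as np


def log_spectrogram(signal, sample_rate=16000, window_ms=20, step_ms=10, eps=1e-10):
    """Return a (num_frames, num_freq_bins) log-power spectrogram.

    Hypothetical helper: window and step sizes are illustrative assumptions.
    """
    signal = np.asarray(signal, dtype=np.float64)
    window = int(sample_rate * window_ms / 1000)  # samples per analysis frame
    step = int(sample_rate * step_ms / 1000)      # hop between frame starts
    if len(signal) < window:
        raise ValueError("signal is shorter than one analysis window")

    num_frames = 1 + (len(signal) - window) // step
    # Index matrix of shape (num_frames, window): row i selects frame i.
    idx = np.arange(window)[None, :] + step * np.arange(num_frames)[:, None]
    frames = signal[idx] * np.hanning(window)     # taper each frame

    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + eps)                    # eps avoids log(0)


# Usage: one second of a 440 Hz tone stands in for a real utterance.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)
features = log_spectrogram(tone, sample_rate)
print(features.shape)
```

Each row of `features` is one time step that a network can consume. MFCC features would add a mel filterbank and a discrete cosine transform on top of this representation, giving a smaller vector per time step.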
The various models that were implemented include:

  1. Deep RNN + TimeDistributed Dense
  2. CNN + RNN + TimeDistributed Dense
  3. Bidirectional RNN + TimeDistributed Dense
  4. RNN + TimeDistributed Dense
  5. Vanilla RNN
(The list is presented in decreasing order of validation accuracy.)
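
To make the "RNN + TimeDistributed Dense" pattern in the list above concrete, here is a small NumPy sketch of the forward pass only. It is an illustration under stated assumptions, not the project's implementation: the function `rnn_time_distributed`, the layer sizes, the tanh recurrence, and the 29-class output (a hypothetical character set of 26 letters, space, apostrophe, and a blank symbol) are all made up for the example, and the weights are random rather than trained.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def rnn_time_distributed(features, params):
    """Map (T, input_dim) features to (T, output_dim) per-step probabilities.

    Hypothetical sketch: a simple tanh RNN followed by one dense layer
    whose weights are shared across all time steps.
    """
    W_x, W_h, b_h, W_y, b_y = params
    h = np.zeros(W_h.shape[0])                   # initial hidden state
    logits = []
    for x_t in features:                         # iterate over time steps
        h = np.tanh(x_t @ W_x + h @ W_h + b_h)   # recurrent update
        logits.append(h @ W_y + b_y)             # same dense layer at every step
    return softmax(np.stack(logits), axis=-1)


# Usage with random weights and random input (assumed sizes).
rng = np.random.default_rng(0)
input_dim, hidden_dim, output_dim = 161, 32, 29
params = (
    rng.normal(scale=0.1, size=(input_dim, hidden_dim)),
    rng.normal(scale=0.1, size=(hidden_dim, hidden_dim)),
    np.zeros(hidden_dim),
    rng.normal(scale=0.1, size=(hidden_dim, output_dim)),
    np.zeros(output_dim),
)
features = rng.normal(size=(99, input_dim))
probs = rnn_time_distributed(features, params)
print(probs.shape)
```

The output keeps one probability distribution per input time step, which is what lets the network emit a character sequence for audio of any length. The deeper, bidirectional, and CNN-fronted variants in the list change how the hidden states are computed but keep this per-step output layer.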

This project was completed as part of the Udacity Natural Language Processing Nanodegree Program.

Languages

Jupyter Notebook 95.3%, Python 4.7%