Transformer-Based Conversational Bot
The implementation follows the paper "Attention Is All You Need" (Vaswani et al., 2017).
The conversational bot consists of two modules:
- Speech-to-Text module
  - DeepSpeech pre-trained model
  - Model re-trained on the Common Voice dataset
- Text-to-Text response module
  - Transformer model
    - Preprocessing (word tokenizer + positional encoding)
    - Encoder (N = 6 layers)
    - Decoder (N = 6 layers)
    - Adam optimizer with a custom learning-rate schedule
  - Trained on the Cornell Movie-Dialogs Corpus
  - Analysis of the learning-rate schedule from the paper
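The positional encoding used in the preprocessing step can be sketched with the sinusoidal formulation from the paper (the function and variable names here are illustrative, not the repository's actual code):

```python
import numpy as np

def positional_encoding(max_len, d_model):
    """Sinusoidal positional encoding from 'Attention Is All You Need':
    PE[pos, 2i]   = sin(pos / 10000^(2i / d_model))
    PE[pos, 2i+1] = cos(pos / 10000^(2i / d_model))
    Returns an array of shape (max_len, d_model)."""
    pos = np.arange(max_len)[:, None]        # (max_len, 1)
    i = np.arange(d_model)[None, :]          # (1, d_model)
    angle = pos / np.power(10000.0, (2 * (i // 2)) / d_model)
    pe = np.zeros((max_len, d_model))
    pe[:, 0::2] = np.sin(angle[:, 0::2])     # even indices: sine
    pe[:, 1::2] = np.cos(angle[:, 1::2])     # odd indices: cosine
    return pe
```

The encoding is added to the token embeddings so the model can use absolute and relative position information without recurrence.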
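The custom learning-rate schedule from the paper increases the rate linearly over a warm-up phase and then decays it proportionally to the inverse square root of the step number: lrate = d_model^-0.5 * min(step^-0.5, step * warmup_steps^-1.5). A minimal sketch (parameter defaults are the paper's, not necessarily this repository's):

```python
def transformer_lr(step, d_model=512, warmup_steps=4000):
    """Learning rate at a given training step, per the schedule in
    'Attention Is All You Need'. Peaks at step == warmup_steps."""
    step = max(step, 1)  # avoid division by zero at step 0
    return d_model ** -0.5 * min(step ** -0.5,
                                 step * warmup_steps ** -1.5)
```

Plotting this schedule over training steps reproduces the warm-up/decay curve analyzed in the repository.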
Speech-to-Text (DeepSpeech) Module References:
wget https://github.com/mozilla/DeepSpeech/releases/download/v0.5.1/deepspeech-0.5.1-models.tar.gz
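After downloading, the archive can be unpacked and the DeepSpeech 0.5.x command-line client used to transcribe a WAV file roughly as below (file paths are illustrative; the flags follow the DeepSpeech 0.5.1 release documentation):

```shell
# Unpack the pre-trained model files
tar xvfz deepspeech-0.5.1-models.tar.gz

# Transcribe a 16-bit, 16 kHz, mono WAV file (paths are examples)
deepspeech --model deepspeech-0.5.1-models/output_graph.pbmm \
           --alphabet deepspeech-0.5.1-models/alphabet.txt \
           --lm deepspeech-0.5.1-models/lm.binary \
           --trie deepspeech-0.5.1-models/trie \
           --audio out.wav
```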
Converting sample audio to 16-bit, 16 kHz, mono-channel WAV (the input format DeepSpeech expects):
ffmpeg -i 111.mp3 -acodec pcm_s16le -ac 1 -ar 16000 out.wav
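To confirm that a converted file actually matches the format DeepSpeech expects, the standard-library `wave` module can inspect it. This helper is illustrative and not part of the repository:

```python
import wave

def check_deepspeech_format(path):
    """Return True if the WAV file is mono, 16-bit, 16 kHz --
    the input format the pre-trained DeepSpeech model expects."""
    with wave.open(path, "rb") as w:
        return (w.getnchannels() == 1        # mono
                and w.getsampwidth() == 2    # 2 bytes per sample = 16-bit
                and w.getframerate() == 16000)
```

Running it on `out.wav` from the ffmpeg command above should return True if the conversion succeeded.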
Reference for dataset and sample audio: