Real-time speech emotion recognition using LSTM
Conventional LSTM-based demo: https://tabahi.github.io/SER-LSTM-test/
It uses the TensorFlow.js library to predict emotions from speech MFCC features.
The model is trained on the IEMOCAP database to predict four basic emotions plus silence (Anger, Happy, Sad, Neutral, and Silence), using the following layers and parameters:
- Dense, 33
- LSTM, 16
- LSTM, 8
- Dropout, rate 0.8
- Time Distributed Dense, 5
- Softmax
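The layer stack above might be declared in TensorFlow.js roughly as sketched below. The input shape is an assumption (a buffer of `timesteps` MFCC frames with `nMfcc` coefficients each, with `nMfcc = 33` guessed from the first Dense width); the actual repo's values may differ. Note that both LSTM layers must return sequences so the final Time Distributed Dense can emit a softmax per frame.

```javascript
// Hedged sketch of the listed architecture, not the repo's actual code.
const tf = require('@tensorflow/tfjs');

const timesteps = 32;  // assumed MFCC buffer length in frames
const nMfcc = 33;      // assumed number of MFCC coefficients per frame

const model = tf.sequential();
// Dense applied independently to each timestep's feature vector
model.add(tf.layers.dense({units: 33, inputShape: [timesteps, nMfcc]}));
// Stacked LSTMs; returnSequences keeps one output vector per frame
model.add(tf.layers.lstm({units: 16, returnSequences: true}));
model.add(tf.layers.lstm({units: 8, returnSequences: true}));
model.add(tf.layers.dropout({rate: 0.8}));
// One 5-way softmax per frame: 4 emotions + silence
model.add(tf.layers.timeDistributed({
  layer: tf.layers.dense({units: 5, activation: 'softmax'}),
}));
```

A per-frame softmax (rather than a single utterance-level output) is what lets the demo update its prediction continuously as audio streams in.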
Note: Latency grows with the MFCC buffer size, since the model must wait for the buffer to fill before each prediction.
A newer proposed method that performs better is demonstrated at: https://realtime-speech-emotion.netlify.app/