SER (Speech Emotion Recognition)

A deep learning approach to predicting emotions from speech audio clips.

To learn more about the approach: article

Folders

datasets: contains data downloaded from Kaggle (the feature_extraction notebook downloads it)

saved_datasets: contains locally saved NumPy datasets

models: contains model checkpoints

logs: contains TensorBoard logs
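
The folder layout above can be sketched in code. This is a minimal illustration, not the repository's own setup code: the `ensure_project_folders` helper and its `root` argument are hypothetical, and only the four folder names come from this README.

```python
from pathlib import Path

# Folder names taken from the list above.
PROJECT_FOLDERS = ("datasets", "saved_datasets", "models", "logs")


def ensure_project_folders(root="."):
    """Create the four project folders under `root` if they are missing.

    Hypothetical helper for illustration. Returns a dict that maps each
    folder name to its Path.
    """
    root = Path(root)
    paths = {}
    for name in PROJECT_FOLDERS:
        path = root / name
        path.mkdir(parents=True, exist_ok=True)  # no error if it already exists
        paths[name] = path
    return paths
```

Calling `ensure_project_folders()` from the repository root before running the notebooks would leave any existing folders and their contents untouched.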

Python version: 3.6.9

CUDA version: 10.1

Audio emotion notebooks (series of 5, starting at part 1): https://www.kaggle.com/ejlok1/audio-emotion-part-1-explore-data by https://www.kaggle.com/ejlok1

Audio data analysis: https://www.kdnuggets.com/2020/02/audio-data-analysis-deep-learning-python-part-1.html

RAVDESS dataset:
- https://www.kaggle.com/uwrfkaggler/ravdess-emotional-speech-audio
- https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0196391

TESS dataset:
- https://www.kaggle.com/ejlok1/toronto-emotional-speech-set-tess
- https://tspace.library.utoronto.ca/handle/1807/24487

SAVEE dataset:
- http://kahlan.eps.surrey.ac.uk/savee/
- https://www.kaggle.com/barelydedicated/savee-database

CREMA-D dataset:
- https://github.com/CheyneyComputerScience/CREMA-D
- https://www.kaggle.com/ejlok1/cremad
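
As an illustration of working with the datasets listed above, the sketch below reads an emotion label from a RAVDESS file name. RAVDESS names consist of seven dash-separated two-digit fields, and the third field is the emotion code. The `ravdess_emotion` helper is hypothetical and is not taken from the feature_extraction notebook, which may extract labels differently.

```python
from pathlib import Path

# Emotion codes from the RAVDESS file naming convention (third field).
RAVDESS_EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}


def ravdess_emotion(filename):
    """Return the emotion label encoded in a RAVDESS file name.

    Hypothetical helper: expects names such as "03-01-06-01-02-01-12.wav",
    with or without a leading directory.
    """
    fields = Path(filename).stem.split("-")
    if len(fields) != 7 or fields[2] not in RAVDESS_EMOTIONS:
        raise ValueError("not a RAVDESS file name: {!r}".format(filename))
    return RAVDESS_EMOTIONS[fields[2]]


print(ravdess_emotion("03-01-06-01-02-01-12.wav"))  # fearful
```

The other three datasets use different naming schemes, so each would need its own parser before the labels can be merged into one training set.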