Inferring emotions from multiple modalities is critical for social communication, and deficits in emotion recognition are an important marker in the diagnosis of autism spectrum disorder. This project uses AI to help autistic individuals recognize emotions in speech.
Model: tf-wav2vec2-base (Keras and Keras Core)
The RAVDESS dataset (Ryerson Audio-Visual Database of Emotional Speech and Song) contains 7,356 audio files labeled with emotions in speech (calm, happy, sad, angry, fearful, surprised, and disgusted expressions) and song (calm, happy, sad, angry, and fearful emotions). This project uses a sample of the files from the original RAVDESS dataset.
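The emotion label for each RAVDESS clip is encoded in its filename, which consists of seven dash-separated numeric fields; the third field is the emotion code. A minimal sketch of reading that label (the helper name and mapping dictionary are illustrative, not taken from this project's code):

```python
# RAVDESS filenames look like "03-01-05-01-02-01-12.wav"; the third
# dash-separated field ("05" here) is the emotion code.
EMOTIONS = {
    "01": "neutral",
    "02": "calm",
    "03": "happy",
    "04": "sad",
    "05": "angry",
    "06": "fearful",
    "07": "disgust",
    "08": "surprised",
}

def emotion_from_filename(filename: str) -> str:
    """Return the emotion label encoded in a RAVDESS audio filename."""
    stem = filename.rsplit("/", 1)[-1].removesuffix(".wav")
    fields = stem.split("-")
    return EMOTIONS[fields[2]]

print(emotion_from_filename("03-01-05-01-02-01-12.wav"))  # angry
```

Parsing labels from filenames this way avoids needing a separate annotation file when building the training set.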
HuggingFace spaces Demo: https://huggingface.co/spaces/tensorgirl/audio_classification
Run pip install --upgrade intel-extension-for-tensorflow[cpu] at the beginning to use the Intel Extension for TensorFlow.
HuggingFace - tf-wav2vec2-base
Official Keras Core Documentation
https://humansofdata.atlan.com/2019/08/unravel-the-mystery-of-the-human-brain-at-neuroai/