Speech Emotion Classification

This repository contains a Jupyter notebook that classifies emotions from speech signals with machine learning models. The notebook uses RAVDESS (Ryerson Audio-Visual Database of Emotional Speech and Song), a public dataset of emotional speech recordings.

Introduction
Data Exploration
Feature Extraction
Model Training
Model Evaluation
Conclusion

Introduction

The objective of this project is to classify the emotion expressed in a speech recording using machine learning algorithms. The models are trained and evaluated on the RAVDESS dataset.

Data Exploration

The data exploration section examines the RAVDESS dataset, visualizes a sample of the audio files, and shows how the recordings are distributed across the emotion categories.
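As a rough illustration of how the emotion distribution can be derived, the sketch below reads the label from each file name. It assumes the standard RAVDESS naming convention, in which the third hyphen-separated field is the emotion code. The helper names and the example file paths are hypothetical and are not taken from the notebook.

```python
from collections import Counter
from pathlib import Path

# Emotion codes from the RAVDESS file-naming convention (third field).
EMOTIONS = {
    "01": "neutral", "02": "calm", "03": "happy", "04": "sad",
    "05": "angry", "06": "fearful", "07": "disgust", "08": "surprised",
}


def emotion_from_filename(path):
    """Return the emotion label encoded in a RAVDESS file name."""
    fields = Path(path).stem.split("-")
    return EMOTIONS[fields[2]]


def emotion_counts(paths):
    """Count how many files belong to each emotion."""
    return Counter(emotion_from_filename(p) for p in paths)


# Hypothetical file names, used here only to illustrate the parsing.
example_files = [
    "Actor_01/03-01-05-01-01-01-01.wav",
    "Actor_01/03-01-05-02-02-01-01.wav",
    "Actor_02/03-01-03-01-01-02-02.wav",
]
print(emotion_counts(example_files))
```

On the real dataset, the list of paths could come from something like `Path(data_dir).rglob("*.wav")`, where `data_dir` is wherever the recordings were downloaded.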

Feature Extraction

The feature extraction section describes how features are extracted from the audio files. The notebook represents each recording by its Mel-Frequency Cepstral Coefficients (MFCCs). The section also includes code for visualizing the MFCCs.

Model Training

The model training section describes the process of training machine learning models to classify emotions from speech signals. The notebook uses several models including Random Forest, Logistic Regression, Support Vector Machine and Multilayer Perceptron.
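The sketch below shows one way the four model families could be trained side by side with Scikit-Learn. The data is synthetic (generated with `make_classification` as a stand-in for MFCC feature vectors and eight emotion classes), and the hyperparameters and scaling step are assumptions, so the printed scores say nothing about the notebook's actual results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a matrix of MFCC features with 8 emotion classes.
X, y = make_classification(
    n_samples=400, n_features=40, n_informative=20, n_classes=8, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Scale-sensitive models are wrapped in a pipeline with a StandardScaler.
models = {
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Support Vector Machine": make_pipeline(StandardScaler(), SVC()),
    "Multilayer Perceptron": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(64,), max_iter=500, random_state=0),
    ),
}

# Fit each model and record its accuracy on the held-out split.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = model.score(X_test, y_test)
    print(f"{name}: {scores[name]:.3f}")
```

Keeping the models in a dictionary makes it easy to add or remove a classifier without touching the training loop.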

Model Evaluation

The model evaluation section evaluates the performance of the models using various metrics such as accuracy, precision, recall and F1-score. The section also includes code for visualizing the confusion matrix.
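To make the metrics concrete, here is a small sketch using Scikit-Learn's metric functions. The label lists are hypothetical and exist only to illustrate the calls. Macro averaging is an assumption, since the notebook's averaging choice is not stated here.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    precision_recall_fscore_support,
)

# Hypothetical true and predicted labels, used only to illustrate the metrics.
y_true = ["happy", "sad", "angry", "happy", "sad", "angry", "happy", "sad"]
y_pred = ["happy", "sad", "happy", "happy", "angry", "angry", "happy", "sad"]
labels = ["angry", "happy", "sad"]

accuracy = accuracy_score(y_true, y_pred)

# Macro averaging gives every emotion class equal weight.
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, labels=labels, average="macro", zero_division=0
)

# Rows are true labels, columns are predicted labels, both ordered as `labels`.
matrix = confusion_matrix(y_true, y_pred, labels=labels)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
print(matrix)
```

The resulting matrix can be plotted with `sklearn.metrics.ConfusionMatrixDisplay` if Matplotlib is available.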

Conclusion

In this project, I explored the task of classifying emotions from speech signals using machine learning models. The RAVDESS dataset, a public dataset of emotional speech recordings, was used to train and evaluate several machine learning models.

My analysis showed that the models performed well on the emotion classification task, with the Random Forest and Support Vector Machine models performing best. I also observed that the Mel-Frequency Cepstral Coefficients (MFCCs) were effective in capturing the relevant features of the audio signals.

Requirements

The notebook requires several Python libraries, including NumPy, pandas, scikit-learn, and librosa.

Usage

If you want to replicate the results or experiment with the code, you can download the notebook and run it on your own machine. You can also run the notebook using Google Colab.

Note: The code in this notebook is for educational purposes only and is not intended for use in production environments.

Credits

This notebook was created by Aman Singh and is based on the RAVDESS dataset.

Aman0307 / Speech_emotion_recognition