saquibntt / speech_emotion_recognition

Predicting emotion in speech




The code here implements several approaches to classifying emotions from recorded speech utterances. The project was mostly inspired by the requirements of the CS221 class at Stanford.

Code structure

  • data/
    The ARFF feature files extracted from the original dataset using the openSmile package.

  • modeltrain/
    The scripts used to automate feature extraction, model training, and classification of the test set from the original data. They are mostly copied from the corresponding directory of the openSmile package.

  • nn/
    Implementations of a simple multilayer perceptron and a single-layer LSTM RNN, used in the project to analyze how well neural networks classify utterances into emotion categories.

  • svm_gmm_logreg/
    The implementation of Support Vector Machine, Gaussian Mixture Model and Logistic Regression models. Includes training and test set error analysis.
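
As an illustration of how the ARFF feature files in data/ could be consumed, here is a minimal sketch of a reader written with the standard library only. It is not the project's own loader. It assumes dense rows of unquoted, comma-separated values, with the class label in the last column and no missing values; the function names read_arff and to_xy are hypothetical.

```python
import io


def read_arff(stream):
    """Parse a dense ARFF stream into (attribute_names, rows).

    Assumption: values are comma-separated and contain no embedded commas.
    """
    names, rows, in_data = [], [], False
    for raw in stream:
        line = raw.strip()
        if not line or line.startswith("%"):
            continue  # skip blank lines and ARFF comments
        lower = line.lower()
        if not in_data:
            if lower.startswith("@attribute"):
                # "@attribute <name> <type>" -> keep only the name
                names.append(line.split(None, 2)[1])
            elif lower.startswith("@data"):
                in_data = True
            continue
        rows.append([value.strip() for value in line.split(",")])
    return names, rows


def to_xy(rows):
    """Split rows into numeric feature vectors and class labels.

    Assumption: the last column is the label; non-numeric columns before it
    (for example an instance name) are skipped.
    """
    X, y = [], []
    for row in rows:
        features = []
        for value in row[:-1]:
            try:
                features.append(float(value))
            except ValueError:
                pass  # not a number, e.g. a quoted instance name
        X.append(features)
        y.append(row[-1])
    return X, y
```

A caller would open one of the feature files, pass the file object to read_arff, and hand the resulting X and y to whichever classifier is being trained.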


The following packages should be pre-installed to use the code from this project:

Expected dataset layout

The following is the expected layout of the original wave files dataset on the filesystem:


    • train/
        • emotion1/
            • File1
            • File2
            • ...
        • emotion2/
            • File1
            • File2
            • ...
        • .../
    • test/
        • emotion1/
            • File1
            • File2
            • ...
        • emotion2/
            • File1
            • File2
            • ...
        • .../
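
The layout above maps each emotion label to a directory name, so the labels can be recovered by walking the tree. The following is a minimal sketch of that idea, not the project's actual loading code; the function names list_utterances and load_layout are hypothetical, and the dataset root path is whatever location you keep the wave files in.

```python
import os


def list_utterances(split_dir):
    """Return sorted (file_path, emotion_label) pairs for one split directory.

    Each immediate subdirectory name is taken as the emotion label.
    """
    pairs = []
    for emotion in sorted(os.listdir(split_dir)):
        emotion_dir = os.path.join(split_dir, emotion)
        if not os.path.isdir(emotion_dir):
            continue  # ignore stray files next to the emotion directories
        for fname in sorted(os.listdir(emotion_dir)):
            path = os.path.join(emotion_dir, fname)
            if os.path.isfile(path):
                pairs.append((path, emotion))
    return pairs


def load_layout(root):
    """Collect the utterance lists for the train/ and test/ splits under root."""
    return {
        split: list_utterances(os.path.join(root, split))
        for split in ("train", "test")
    }
```

Calling load_layout on the dataset root returns a dictionary with the keys "train" and "test", each holding a list of (path, label) pairs ready for feature extraction.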


  • To train and test the multilayer perceptron classifier, run cd nn && python with the perceptron script as the argument.
  • To train and test the LSTM RNN classifier, run cd nn && python with the LSTM script as the argument. Note: you will need to edit the script to provide the proper path to the original dataset.
  • To train and test the SVM, GMM and Logistic Regression models, run cd svm_gmm_logreg && python with the corresponding script as the argument. Note: you will need to edit the script to provide the proper path to the original dataset (../data/etc.).



