shrutikshirsagar / Bag-of-word-SER

Bag-of-word-SER

Fig 1: End-to-end pipeline for the proposed bag-of-modulation-spectral-features extraction and SER. The top part shows the signal processing steps involved in the BoAW computation.

Step 1: Extract modulation spectral features using a window size of 256 ms and a frame size of 40 ms (or 64 ms).
Step 2: Extract a bag-of-words (BoW) representation on top of these modulation spectral features.
Step 3: Feed the BoW-represented features as input to the LSTM model.
Step 4: Extract SRMR as a quality feature; it can be fused with the BoW modulation features to provide robustness.
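
The bag-of-words step (Step 2) can be illustrated with a minimal NumPy sketch. This is not the repository's code: the helper names (`learn_codebook`, `bag_of_words`, `sliding_bow`), the random-sampling codebook, the codebook size, the block length, and the random toy features are all assumptions chosen for illustration. The idea is to assign each feature frame to its nearest codeword and count the assignments over a block of frames, which yields one histogram per block that a sequence model could consume.

```python
import numpy as np

def learn_codebook(frames, size, seed=0):
    """Pick `size` training frames at random to serve as codewords (assumed strategy)."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(frames), size=size, replace=False)
    return frames[idx]

def bag_of_words(frames, codebook):
    """Histogram of nearest-codeword assignments, normalised to sum to 1."""
    # Squared Euclidean distance from every frame to every codeword.
    dists = ((frames[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    nearest = dists.argmin(axis=1)
    counts = np.bincount(nearest, minlength=len(codebook)).astype(float)
    return counts / counts.sum()

def sliding_bow(frames, codebook, block, hop):
    """One BoW vector per block of `block` frames, advancing by `hop` frames."""
    starts = range(0, len(frames) - block + 1, hop)
    return np.stack([bag_of_words(frames[s:s + block], codebook) for s in starts])

# Toy stand-in for per-frame modulation spectral features (random, not real data).
rng = np.random.default_rng(1)
features = rng.normal(size=(200, 8))      # 200 frames, 8 features each (hypothetical sizes)
codebook = learn_codebook(features, size=16)
bow_sequence = sliding_bow(features, codebook, block=50, hop=25)
print(bow_sequence.shape)                 # (number of blocks, codebook size)
```

Each row of `bow_sequence` is a normalised histogram over codewords, so the sequence of rows could serve as the time steps fed to an LSTM as in Step 3. In practice the codebook would be learned from training data only.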

Fig 2: Average modulation spectrograms of unprocessed (top row) and processed (bottom row) speech for high-valence (left column) and low-valence (right column) emotional states.

Languages

Jupyter Notebook 85.7%, Python 11.4%, MATLAB 2.9%