gcunhase / Emotional-Video-to-Audio-with-ANFIS-DeepRNN

Emotional Video to Audio Transformation with ANFIS-DeepRNN (Vanilla RNN and LSTM-DeepRNN) [MPE 2020]

Home Page: https://www.hindawi.com/journals/mpe/2020/8478527/


About

Repository for paper titled "Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System".

Contents

  • Requirements
  • Dataset
  • How to Use
  • How to Cite

Requirements

MATLAB 2017, macOS

Toolboxes: Fuzzy Logic, Deep Learning
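
As a quick sanity check before running anything, the snippet below verifies that both toolboxes are installed. The folder names passed to ver ('fuzzy', 'nnet') are assumptions for a 2017-era installation, where the Deep Learning Toolbox was still named Neural Network Toolbox.

    % Verify the required toolboxes are installed (folder names assumed
    % for a 2017-era release; Deep Learning Toolbox == Neural Network Toolbox).
    assert(~isempty(ver('fuzzy')), 'Fuzzy Logic Toolbox not found.');
    assert(~isempty(ver('nnet')),  'Deep Learning (Neural Network) Toolbox not found.');
    disp('Required toolboxes found.');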

Dataset

Both datasets provide emotion labels along two dimensions (Valence and Arousal).

  • Lindsey Stirling dataset: 8 music videos
    • Emotion labels: dataset/lindsey stirling dataset/user_response*.tsv
  • DEAP dataset: 38 music videos
    • Emotion labels: dataset/deap dataset/participant_ratings.csv
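
For a quick look at the label files (tab-separated per-user responses for Lindsey Stirling, a single CSV for DEAP), they can be loaded with readtable. The column layout is not assumed here and should be inspected after loading.

    % Load the DEAP participant ratings (CSV with the valence/arousal ratings).
    deap = readtable(fullfile('dataset', 'deap dataset', 'participant_ratings.csv'));
    summary(deap)   % inspect the available columns

    % Load every Lindsey Stirling user-response file (tab-separated).
    files = dir(fullfile('dataset', 'lindsey stirling dataset', 'user_response*.tsv'));
    responses = cell(numel(files), 1);
    for k = 1:numel(files)
        responses{k} = readtable(fullfile(files(k).folder, files(k).name), ...
            'FileType', 'text', 'Delimiter', '\t');
    end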

Model

  • Extract audio and visual features
  • ANFIS for emotion classification of visual features
  • Seq2Seq for audio feature generation (multi-modal domain transformation)
  • Mapping of audio features to audio snippets for music generation
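
The repository's actual training code lives in scripts/model/ (step 4 below). Purely as an illustration of the two modeling ingredients, the sketch below uses the toolbox built-ins anfis and trainNetwork on synthetic stand-in data; the feature dimensions, variable names, and network sizes are assumptions, not the paper's configuration (the paper's Vanilla RNN and LSTM-DeepRNN are implemented in the repo scripts, not reproduced here).

    % --- ANFIS: HSL visual features -> emotion score (illustrative only) ---
    trainHSL     = rand(100, 3);         % stand-in per-clip H, S, L features
    trainValence = rand(100, 1)*8 + 1;   % stand-in valence scores in [1, 9]
    opt = anfisOptions('InitialFIS', 3, ...          % 3 membership functions per input
                       'EpochNumber', 20, ...
                       'DisplayANFISInformation', 0);
    fis = anfis([trainHSL, trainValence], opt);      % last column is the output
    pred = evalfis(rand(10, 3), fis);                % pre-R2018a argument order

    % --- Seq2Seq: visual feature sequences -> audio feature sequences ------
    numVisual = 3;  numAudio = 5;  T = 40;           % assumed dimensions
    visualSeqs = arrayfun(@(~) rand(numVisual, T), 1:20, 'UniformOutput', false)';
    audioSeqs  = arrayfun(@(~) rand(numAudio,  T), 1:20, 'UniformOutput', false)';
    layers = [ ...
        sequenceInputLayer(numVisual)
        lstmLayer(128, 'OutputMode', 'sequence')     % LSTM stand-in for the DeepRNN
        fullyConnectedLayer(numAudio)
        regressionLayer];
    options = trainingOptions('adam', 'MaxEpochs', 50, 'Verbose', false);
    net = trainNetwork(visualSeqs, audioSeqs, layers, options);
    audioPred = predict(net, visualSeqs(1));         % generated audio features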

How to Use

All the scripts below are for the Lindsey Stirling dataset; the corresponding scripts for the DEAP dataset are also available. A condensed driver that chains these steps is sketched after the list.

  1. Change current folder to where this file is located

  2. Download datasets

  3. Extract audio and visual features

    • Extract sound features:
      scripts/emotion_from_sound/main_sound2feat_lindsey.m
      
    • Extract visual features:
      scripts/emotion_from_visual/main_video2feat_lindsey.m
      
  4. Train:

    • Settings and Load data:
      scripts/model/main_settings.m
      
    • ANFIS for emotion classification from HSL (visual features):
      scripts/model/main_anfis.m
      
    • Seq2Seq for domain transformation from visual to audio features:
      scripts/model/main_seq2seq_train.m
      
  5. Evaluation (music generation from visual features)

    • Extract sound features (test data):
      scripts/emotion_from_sound/main_sound2feat_lindsey_test_individual.m
      
    • Extract visual features (test data):
      scripts/emotion_from_visual/main_video2feat_lindsey_test_individual.m
      
    • Settings and Load data:
      scripts/model/main_settings.m
      
    • Evaluate:
      scripts/model/main_anfis_seq2seq_test.m
      
  6. Evaluate the Amazon Mechanical Turk (MTurk) results with the scripts in scripts/eval_mturk
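
Assuming the scripts are executed from the repository root and that each main_*.m reads and writes its own intermediate results (an assumption; check the individual scripts), steps 3-5 for the Lindsey Stirling dataset can be chained in a single driver:

    % Feature extraction (step 3).
    run(fullfile('scripts', 'emotion_from_sound',  'main_sound2feat_lindsey.m'));
    run(fullfile('scripts', 'emotion_from_visual', 'main_video2feat_lindsey.m'));

    % Training (step 4).
    run(fullfile('scripts', 'model', 'main_settings.m'));        % settings + load data
    run(fullfile('scripts', 'model', 'main_anfis.m'));           % ANFIS emotion classifier
    run(fullfile('scripts', 'model', 'main_seq2seq_train.m'));   % Seq2Seq training

    % Evaluation / music generation on the test clips (step 5).
    run(fullfile('scripts', 'emotion_from_sound',  'main_sound2feat_lindsey_test_individual.m'));
    run(fullfile('scripts', 'emotion_from_visual', 'main_video2feat_lindsey_test_individual.m'));
    run(fullfile('scripts', 'model', 'main_settings.m'));
    run(fullfile('scripts', 'model', 'main_anfis_seq2seq_test.m'));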

Notes

Acknowledgement

If you use this code, please use the following citation:

@article{sergio2020mpe,
   AUTHOR={Sergio, G. C. and Lee, M.},
   TITLE={Emotional Video to Audio Transformation Using Deep Recurrent Neural Networks and a Neuro-Fuzzy System},
   JOURNAL={Mathematical Problems in Engineering},
   VOLUME={2020},
   PAGES={1--15},
   DOI={10.1155/2020/8478527},
   YEAR={2020}
}

Contact: gwena.cs@gmail.com

