
# Automatic musical instrument recognition in audiovisual recordings by combining image and audio classification strategies (SMC 2016)

This repository contains the source code for our paper at the SMC 2016 conference, which compares audio-based and image-based strategies for musical instrument recognition.

## Requirements

Python requirements are listed in the `requirements.txt` file.

## Dataset requirements

To reproduce the results presented in the paper, please download the following datasets:

## Step-by-step Guide

### Audio-based approach

#### Feature extraction

This step is optional: you can use the features provided in `./audio/irmas/irmas_essentia_features.csv` and `./audio/rwc/rwc_essentia_features.csv`. To extract the features with the Essentia library yourself, run the following for each dataset:

```
python ./audio/feature_extraction.py data_directory_path output_file_path
```
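The extraction script writes one feature vector per audio file into a CSV. A minimal sketch of how such a feature file can be split into a feature matrix and a label vector (the column names here are purely illustrative assumptions, not the actual columns produced by `feature_extraction.py`):

```python
import io

import pandas as pd

# Hypothetical two-row CSV standing in for an extracted feature file;
# the real files live at ./audio/irmas/ and ./audio/rwc/.
csv_text = """mfcc_mean_0,mfcc_mean_1,spectral_centroid_mean,label
-512.3,101.7,1830.5,cello
-498.1,95.2,2210.9,flute
"""

df = pd.read_csv(io.StringIO(csv_text))
X = df.drop(columns=["label"]).to_numpy()  # feature matrix, one row per clip
y = df["label"].to_numpy()                 # instrument labels
print(X.shape, list(y))
```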

#### Training classifiers

We perform 10-fold cross-validation for audio. The `dataset_name` parameter can only be `RWC` or `IRMAS`.

##### SVM classification

```
python ./audio/svm_classification.py path_to_features_file.csv dataset_name
```
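The evaluation pattern behind this script can be sketched as follows: scale the features, fit an SVM, and score it with 10-fold cross-validation. This is a generic scikit-learn sketch on synthetic data; the actual kernel, hyperparameters, and preprocessing in `svm_classification.py` may differ.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for the Essentia feature matrix and instrument labels.
X, y = make_classification(n_samples=200, n_features=20, n_informative=10,
                           n_classes=4, random_state=0)

# Scale features, then fit an RBF-kernel SVM (assumed settings),
# evaluated with stratified 10-fold cross-validation as described above.
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf", C=1.0))
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
scores = cross_val_score(clf, X, y, cv=cv)
print(f"mean accuracy over 10 folds: {scores.mean():.3f}")
```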

##### XGBoost classification

```
python ./audio/xgb_classification.py path_to_features_file.csv dataset_name
```

The trained classifier is stored in the same directory as a `.pkl` file for subsequent cross-evaluation on other datasets. To reproduce the test results, please also save the label encoder, or use the encoders provided in `./audio/irmas_le.pkl` and `./audio/rwc_le.pkl`.
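Saving the label encoder alongside the classifier matters because the encoder fixes the mapping from integer predictions back to instrument names. A minimal sketch of the pattern (file names and data here are hypothetical, not the repo's actual pickles):

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import LabelEncoder
from sklearn.svm import SVC

# Encode string labels to integers and train a toy classifier.
le = LabelEncoder()
y = le.fit_transform(["cello", "flute", "cello", "violin"])
X = np.random.RandomState(0).rand(4, 5)
clf = SVC().fit(X, y)

# Pickle both objects; without the encoder, integer predictions
# cannot be mapped back to instrument names on another dataset.
out_dir = tempfile.mkdtemp()
with open(os.path.join(out_dir, "clf.pkl"), "wb") as f:
    pickle.dump(clf, f)
with open(os.path.join(out_dir, "le.pkl"), "wb") as f:
    pickle.dump(le, f)

# Later (e.g. during cross-evaluation): reload and decode predictions.
with open(os.path.join(out_dir, "le.pkl"), "rb") as f:
    le2 = pickle.load(f)
print(list(le2.inverse_transform([0, 1, 2])))
```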

### Image-based approach

To reproduce the fine-tuning, make sure that you have the ImageNet subset stored at `./../dataset/images`, or change the `IMAGES_DIR` variable in `./utils/settings.py`. You also need to download the pretrained weights for the VGG-16 model and store them in the `./image/cnnnet` folder.

Then run:

```
python ./image/train_classify.py
```

The fine-tuning runs for 5 epochs, displays intermediate results, and stores the new weights for each epoch in a separate `.pkl` file.
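The per-epoch checkpointing pattern can be sketched as below. The "model" is a stand-in dict of NumPy arrays rather than VGG-16, and the file-name pattern is an assumption; only the structure (one `.pkl` dump per epoch) mirrors the description above.

```python
import os
import pickle
import tempfile

import numpy as np

# Stand-in weights for a single fully-connected layer (not real VGG-16 weights).
rng = np.random.default_rng(0)
weights = {"fc": rng.normal(size=(4, 3))}
ckpt_dir = tempfile.mkdtemp()

for epoch in range(5):
    # ...a real forward/backward pass would update the weights here...
    weights["fc"] -= 0.01 * rng.normal(size=weights["fc"].shape)

    # Dump the current weights to a separate file per epoch, so any
    # intermediate state can be restored or evaluated later.
    path = os.path.join(ckpt_dir, f"weights_epoch_{epoch}.pkl")
    with open(path, "wb") as f:
        pickle.dump(weights, f)

print(sorted(os.listdir(ckpt_dir)))
```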

## Reference

- Olga Slizovskaia, Emilia Gómez & Gloria Haro (2016, September). "Automatic musical instrument recognition in audiovisual recordings by combining image and audio classification strategies". In Proceedings of the 13th Sound and Music Computing Conference (SMC), Hamburg, Germany.
