Veleslavia/SMC2016

#Automatic musical instrument recognition in audiovisual recordings by combining image and audio classification strategies : SMC 2016

This is a code sources for our paper at SMC 2016 conference dedicated to the comparison of audio-based and image-based strategies for musical instrument recognition.

##Requirements

Python requirement can be found in requirements.txt file.

##Dataset requirements

In order to reproduce results presented in the paper, please, download the following datasets:

ImageNet dataset (synsets n02672831, n02787622, n02992211, n03110669, n03249569, n03372029, n03467517, n03838899, n03928116, n04141076, n04487394, n04536866)
IRMAS dataset
RWC Music Database: Musical Instrument Sound

##Step-by-step Guide

Audio-based approach

Feature extraction

This step is optional. You can use features provided in ./audio/irmas/irmas_essentia_features.csv and ./audio/rwc/rwc_essentia_features.csv files In order to extract features with Essentia library, run for each dataset

python ./audio/feature_extraction.py data_directory_path output_file_path

Training classifiers

We perform 10-fold cross-validation for audio. The parameter dataset_name can be only RWC or IRMAS

SVM classification

python ./audio/svm_classification.py path_to_features_file.csv dataset_name

XGBoost classification

python ./audio/xgb_classification.py path_to_features_file.csv dataset_name

The trained classifier stores at the same directory as a .plk file for the following cross-evaluation on other datasets. To reproduce the test results, please, save a label encoder additionally or use the encoder provided ./audio/irmas_le.pkl and ./audio/rwc_le.pkl.

Image-based approach

In order to reproduce fine-tuning, be sure, that you have ImageNet subset stored at ./../dataset/images or change IMAGES_DIR variable in ./utils/settings/py You also need to download pretrained weights for VGG-16 model and store it at ./image/cnnnet folder.

Then run

python ./image/train_classify.py

The fine-tuning will perform 5 epoch, display intermediate results and store the new weights for each epoch in separated .pkl file.

##Reference

Olga Slizovskaia, Emilia Gomez & Gloria Haro (2016, September). "Automatic musical instrument recognition in audiovisual recordings by combining image and audio classification strategies" in 13th Sound and Music Computing Conference (SMC), Hamburg, Germany.

Veleslavia / SMC2016

Audio-based approach

Feature extraction

Training classifiers

Image-based approach

About

Languages