christofw / pitchclass_mctc

PyTorch project accompanying the paper "Training Deep Pitch-Class Representations With a Multi-Label CTC Loss", ISMIR 2021.

pitchclass_mctc

This is a PyTorch code repository accompanying the following paper:

Christof Weiß and Geoffroy Peeters
Training Deep Pitch-Class Representations With a Multi-Label CTC Loss
Proceedings of the International Society for Music Information Retrieval Conference (ISMIR), 2021

This repository only contains exemplary code and pre-trained models for most of the paper's experiments, as well as some individual examples. All datasets used in the paper are publicly available (at least partially), e.g. our main datasets Schubert Winterreise, MusicNet, and MAESTRO.

Feature extraction and prediction (Jupyter notebooks)

In this top folder, three Jupyter notebooks demonstrate how to

  • preprocess audio files for running our models (01_precompute_features),
  • load a pretrained model for predicting pitches (02_predict_with_pretrained_model; a minimal sketch follows this list),
  • generate the visualizations of the paper's Figure 5 (03_visualize_pitch_class_features).
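
As a rough orientation (not the notebooks' actual code), a prediction pipeline of this kind might look as follows; the CQT parameters, file names, and the model input layout below are assumptions, and the notebooks contain the real settings:

    # Minimal sketch of the workflow in notebooks 01 and 02; parameters and
    # file names are placeholders, not the repository's actual settings.
    import librosa
    import numpy as np
    import torch

    # 1) Precompute a log-compressed constant-Q spectrogram (cf. 01_precompute_features).
    audio, sr = librosa.load('example.wav', sr=22050)
    cqt = np.abs(librosa.cqt(audio, sr=sr, hop_length=512,
                             fmin=librosa.note_to_hz('C1'),
                             n_bins=72, bins_per_octave=12))
    features = torch.tensor(np.log1p(cqt), dtype=torch.float32)

    # 2) Load a pre-trained model from experiments/models_pretrained and predict
    #    pitch-class activations (cf. 02_predict_with_pretrained_model). Whether the
    #    checkpoints store full modules or state_dicts differs; see the notebook.
    model = torch.load('experiments/models_pretrained/model_checkpoint.pt',  # placeholder name
                       map_location='cpu')
    model.eval()
    with torch.no_grad():
        # the (batch, channel, freq, time) input layout is an assumption
        pitch_class_activations = torch.sigmoid(model(features[None, None]))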

Experiments from the paper (Python scripts)

In the experiments folder, all experimental scripts as well as the log files (subfolder logs) and the filewise results (subfolder results_filewise) can be found. The folder models_pretrained contains pre-trained models for the main experiments. The subfolder predictions contains exemplary model predictions for two of the experiments. Please note that re-training requires a GPU as well as the pre-processed training data (see the notebook 01_precompute_features for an example). Every script must be started from the repository's top folder so that the relative paths resolve correctly; a quick sanity check is sketched below.
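
For example, a short check along these lines (the script name in the message is just a placeholder) can confirm the working directory and GPU availability before launching a run:

    # Sanity check before starting a training script: paths in the experiment
    # scripts are relative to the repository root, and re-training needs a GPU.
    import os
    import torch

    assert os.path.isdir('experiments'), (
        'Start scripts from the repository top folder, '
        'e.g. python experiments/<script_name>.py')
    print('CUDA available:', torch.cuda.is_available())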

The experiment files' names relate to the paper's results in the following way:

Experiment 1 (Table 3) - Loss and model variants

  • exp136b_traintest_schubert_sctcthreecomp_pitchclass.py (All-Zero baseline)
  • exp136f2_traintest_schubert_librosa_pitchclass_maxnorm.py (CQT-Chroma baseline)
  • exp136b_traintest_schubert_sctcthreecomp_pitchclass.py (Separable CTC (SCTC) loss)
  • exp136d_traintest_schubert_mctcnethreecomp_pitchclass.py (Non-Epsilon MCTC (MCTC:NE) loss)
  • exp136e_traintest_schubert_mctcwe_pitchclass.py (MCTC with epsilon (MCTC:WE) loss)
  • exp136h_traintest_schubert_aligned_pitchclass.py (Strongly-aligned training (BCE loss); a loss sketch follows this list)
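
As a rough illustration, the strongly-aligned baseline optimizes a frame-wise multi-label binary cross-entropy between pitch-class logits and binary targets, as sketched below with dummy shapes and data; the SCTC/MCTC variants replace this loss with the CTC-based losses described in the paper:

    # Sketch of the strongly-aligned baseline's loss (cf. exp136h): multi-label
    # BCE over frame-wise pitch-class predictions. Shapes and values are dummies.
    import torch
    import torch.nn.functional as F

    logits  = torch.randn(8, 100, 12)                     # (batch, frames, 12 pitch classes)
    targets = torch.randint(0, 2, (8, 100, 12)).float()   # binary frame-wise pitch-class annotations
    loss = F.binary_cross_entropy_with_logits(logits, targets)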

Experiment 2 (Figure 4) - Cross-dataset experiment

  • exp131b_trainmaestromunet_testmix_mctcwe_pitchclass_basiccnn_normtargl_SGD.py (Train MusicNet & MAESTRO, test others, MCTC)
  • exp131e_trainmaestromunet_testmix_aligned_pitchclass_basiccnn_SGD.py (Train MusicNet & MAESTRO, test others, aligned)
  • exp137a_trainmix_testmusicnet_mctcwe_pitchclass_basiccnn.py (Test MusicNet, train others, MCTC)
  • exp137b_trainmix_testmusicnet_aligned_pitchclass_basiccnn_segmmodel.py (Test MusicNet, train others, aligned)
  • exp138a_trainmix_testmaestro_mctcwe_pitchclass_basiccnn.py (Test MAESTRO, train others, MCTC)
  • exp138b_trainmix_testmaestro_aligned_pitchclass_basiccnn_segmmodel.py (Test MAESTRO, train others aligned)

AddOn: Extra experiment with a deep residual CNN

  • exp136hR_traintest_schubert_aligned_pitchclass_resnet (Train/Test Schubert, Strongly-aligned training (BCE loss))

Run scripts using, e.g., the following commands:

    conda activate pitchclass_mctc
    export CUDA_VISIBLE_DEVICES=1
    python experiments/exp136b_traintest_schubert_sctcthreecomp_pitchclass.py

Application: Visualization (Figure 5)

  • Please see the Jupyter notebook 03_visualize_pitch_class_features; a generic plotting sketch follows below.
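
A generic sketch (dummy data, not the notebook's actual code) for plotting a 12-by-frames pitch-class representation in the style of Figure 5:

    # Visualize a pitch-class activation matrix (12 pitch classes x frames).
    import matplotlib.pyplot as plt
    import numpy as np

    pitch_classes = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']
    activations = np.random.rand(12, 500)   # dummy data in place of model output
    plt.figure(figsize=(10, 3))
    plt.imshow(activations, aspect='auto', origin='lower', cmap='gray_r')
    plt.yticks(range(12), pitch_classes)
    plt.xlabel('Time (frames)')
    plt.ylabel('Pitch class')
    plt.tight_layout()
    plt.show()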
