This repository contains the code and the dataset for our paper:
- Kyungsu Kim*, Minju Park*, Haesun Joung*, Yunkee Chae, Yeongbeom Hong, Seonghyeon Go, Kyogu Lee. “Show Me the Instruments: Musical Instrument Retrieval from Mixture Audio”. 2022.
For audio samples and a demo, visit our website.
- Clone the repository

  ```
  git clone https://github.com/minju0821/musical_instrument_retrieval.git
  ```

- Install requirements

  ```
  pip3 install -r requirements.txt
  ```
- Install the Nlakh dataset
| Model | EER |
|---|---|
| Single-Instrument Encoder | 0.026 |
```
python Single_Instrument_Encoder/train.py
```
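The EER (equal error rate) reported above is the operating point where the false-acceptance and false-rejection rates coincide. As an illustrative sketch only (not the repository's evaluation code), it can be computed from similarity scores with scikit-learn:

```python
import numpy as np
from sklearn.metrics import roc_curve

def equal_error_rate(labels, scores):
    """EER: the point on the ROC curve where FPR equals FNR (= 1 - TPR)."""
    fpr, tpr, _ = roc_curve(labels, scores)
    fnr = 1.0 - tpr
    idx = np.nanargmin(np.abs(fnr - fpr))   # closest crossing point
    return (fpr[idx] + fnr[idx]) / 2.0

# Toy example: 1 = same instrument, 0 = different instrument.
labels = np.array([1, 1, 1, 0, 0, 0])
scores = np.array([0.9, 0.8, 0.7, 0.4, 0.3, 0.2])
print(equal_error_rate(labels, scores))  # perfectly separable scores give 0.0
```

Lower is better; 0.026 means the encoder separates matching and non-matching instrument pairs almost perfectly.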
| Model | Encoder Architecture | Train Dataset | F1 (macro) | F1 (weighted) | mAP (macro) | mAP (weighted) |
|---|---|---|---|---|---|---|
| Small-Nlakh | DeepCNN | Nlakh | 0.482 | 0.524 | 0.553 | 0.597 |
| Large-Nlakh | ConvNeXT | Nlakh | 0.533 | 0.578 | 0.635 | 0.666 |
| Small-Random | DeepCNN | Randomly mixed | 0.528 | 0.543 | 0.598 | 0.615 |
| Large-Random | ConvNeXT | Randomly mixed | 0.694 | 0.712 | 0.752 | 0.760 |
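The F1 and mAP columns are standard multi-label metrics over the set of instruments present in a mixture. A minimal sketch of computing them with scikit-learn; the labels, scores, and 0.5 threshold below are toy values for illustration, not the paper's evaluation pipeline:

```python
import numpy as np
from sklearn.metrics import f1_score, average_precision_score

# Toy ground truth: 4 mixtures x 3 instrument classes (illustrative only).
y_true = np.array([[1, 0, 1],
                   [0, 1, 1],
                   [1, 1, 0],
                   [0, 0, 1]])
# Hypothetical per-instrument retrieval scores from an encoder.
y_score = np.array([[0.9, 0.2, 0.8],
                    [0.1, 0.7, 0.6],
                    [0.8, 0.9, 0.3],
                    [0.2, 0.4, 0.7]])
y_pred = (y_score >= 0.5).astype(int)   # binarize for F1

f1_macro = f1_score(y_true, y_pred, average="macro")
f1_weighted = f1_score(y_true, y_pred, average="weighted")
map_macro = average_precision_score(y_true, y_score, average="macro")
map_weighted = average_precision_score(y_true, y_score, average="weighted")
print(f1_macro, f1_weighted, map_macro, map_weighted)
```

"macro" averages the per-instrument scores equally, while "weighted" weights each instrument by how often it actually appears.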
- Customize the arguments in the `parse_args` function in `train.py` before training.
```
python Multi_Instrument_Encoder/train.py
```
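The actual flags are defined in each `train.py`; as a hedged illustration of the pattern (every argument name and default below is hypothetical, check `parse_args` in the repository for the real ones):

```python
import argparse

def parse_args(argv=None):
    # Illustrative only: the real arguments live in train.py.
    p = argparse.ArgumentParser(description="Train the Multi-Instrument Encoder")
    p.add_argument("--data_dir", type=str, default="data/nlakh",
                   help="path to the training dataset")
    p.add_argument("--batch_size", type=int, default=32)
    p.add_argument("--lr", type=float, default=1e-4)
    p.add_argument("--epochs", type=int, default=100)
    return p.parse_args(argv)

args = parse_args([])          # defaults; pass e.g. ["--lr", "3e-4"] to override
print(args)
```

Editing the defaults in `parse_args` (or passing the flags on the command line) changes the training configuration without touching the rest of the script.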