minzwon / sota-music-tagging-models


My roc_auc is higher than reported

pengbo-learn opened this issue · comments

I trained harmonicnn on MTAT, where MTAT was downloaded via https://github.com/Spijkervet/CLMR/blob/master/clmr/datasets/magnatagatune.py
I trained it 4 times; the evaluation outputs are:
loss: 0.1400
roc_auc: 0.9151
pr_auc: 0.4636

loss: 0.1399
roc_auc: 0.9157
pr_auc: 0.4666

loss: 0.1405
roc_auc: 0.9155
pr_auc: 0.4617

loss: 0.1403
roc_auc: 0.9153
pr_auc: 0.4643

loss: 0.1402
roc_auc: 0.9148
pr_auc: 0.4641

The roc_auc is much higher than the reported 0.9126.
Did I miss something?

Hi,

That's interesting.
There can be many reasons. Sometimes it's because of a different PyTorch version (for example, I once experienced a similar thing after they updated batch normalization in PyTorch). Or it can be a different data split. Can you double-check that your data split is identical to mine? I included lists of the train/valid/test sets in this repo.

I did use your data split provided in split/mtat.

My environment: torch==1.2.0, torchaudio==0.3.0

Some of my MTAT mp3 files are 0 bytes, which may be the root of the problem. Could you share how you downloaded the MTAT mp3s, so I can make sure the data source is identical?
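As a quick sanity check, zero-size files can be listed with a few lines of Python (a sketch; the directory layout of your MTAT download is an assumption):

```python
from pathlib import Path

def find_empty_mp3s(mp3_root):
    """Return paths of MP3 files under mp3_root whose size is 0 bytes."""
    return [p for p in Path(mp3_root).rglob("*.mp3") if p.stat().st_size == 0]
```

Running this over the MTAT audio root would show whether only the one known file is empty or whether the download is more broadly corrupted.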

To make sure, the scores that you mentioned are the "test set" score, not the "validation set", right?

I used the dataset that already existed in my university's cluster. I believe it's collected from the original web page (https://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset).

Yeah, the results are generated by running python -u eval.py --data_path YOUR_DATA_PATH

Thanks a lot. I will report the scores once the experiments are finished.

The md5 values of MP3 files downloaded from https://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset and https://github.com/Spijkervet/CLMR/blob/master/clmr/datasets/magnatagatune.py are the same.
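For reference, the per-file comparison above can be done with Python's hashlib, streaming each file so large MP3s don't need to fit in memory (a generic sketch, not the exact script used):

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing md5_of(a) == md5_of(b) for each pair of files from the two sources confirms the audio data is byte-identical.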

To make sure I did not modify your original implementation, I git cloned the repo and tried again.

I did modify the preprocessing script mtat_read.py: 6/norine_braun-now_and_zen-08-gently-117-146.mp3 is 0 bytes, which makes the program crash with an EOFError. Therefore, I catch the error and skip the file.

Is this file 0 bytes in your MTAT data as well?
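The workaround described above might look like the following sketch. The actual loading code in mtat_read.py may differ; load_fn stands in for whatever audio decoder the script uses, and the .npy output path is an assumption:

```python
import os

import numpy as np

def safe_preprocess(mp3_path, npy_path, load_fn):
    """Convert one MP3 to a .npy array, skipping zero-size or truncated files.

    load_fn is whatever audio loader the preprocessing script uses (an
    assumption here); it should return a 1-D numpy array of samples.
    """
    if os.path.getsize(mp3_path) == 0:
        print(f"skipping empty file: {mp3_path}")
        return False
    try:
        np.save(npy_path, load_fn(mp3_path))
        return True
    except EOFError:
        # Truncated MP3s make the decoder hit end-of-file early.
        print(f"skipping corrupt file: {mp3_path}")
        return False
```

Skipping the file rather than aborting means the one corrupt track is simply absent from the preprocessed set, which should not measurably change the evaluation scores.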

@pengbo-learn The file size is 0 for me as well. Can you run the experiment with another model, please? If you see the performance gain with other models too, the improvement surely comes from the different PyTorch versions.

You are right, other models do better as well.

harmonicnn
loss: 0.1405
roc_auc: 0.9142
pr_auc: 0.4658

fcn
loss: 0.1405
roc_auc: 0.9006
pr_auc: 0.4347

musicnn
loss: 0.1461
roc_auc: 0.9112
pr_auc: 0.4520

Possibly it's a version issue, then. Thank you for reporting this; I will close the issue.