minzwon / sota-music-tagging-models


My roc_auc is higher than reported

pengbo-learn opened this issue · comments

I trained harmonicnn on MTAT, where MTAT was downloaded via https://github.com/Spijkervet/CLMR/blob/master/clmr/datasets/magnatagatune.py
I trained it 4 times; the evaluation outputs are:
loss: 0.1400
roc_auc: 0.9151
pr_auc: 0.4636

loss: 0.1399
roc_auc: 0.9157
pr_auc: 0.4666

loss: 0.1405
roc_auc: 0.9155
pr_auc: 0.4617

loss: 0.1403
roc_auc: 0.9153
pr_auc: 0.4643

loss: 0.1402
roc_auc: 0.9148
pr_auc: 0.4641

The roc_auc is much higher than the reported 0.9126.
Did I miss something?

Hi,

That's interesting.
There can be many reasons. Sometimes it's because of a different PyTorch version (for example, I once experienced a similar thing after they updated batch normalization in PyTorch). Or it can be a different data split. Can you double-check that your data split is identical to mine? I included lists of the train/valid/test sets in this repo.

I did use your data split provided in split/mtat.

My environment: torch==1.2.0, torchaudio==0.3.0

Some of my MTAT mp3 files are 0 bytes, which may be the root of the problem. Could you share how you downloaded the MTAT mp3s, so I can make sure the data source is identical?
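As a quick sanity check, zero-size files can be listed with a few lines of Python (a sketch; the directory layout of your MTAT download is an assumption):

```python
from pathlib import Path

def find_empty_mp3s(mp3_root):
    """Return paths of MP3 files under mp3_root whose size is 0 bytes."""
    return [p for p in Path(mp3_root).rglob("*.mp3") if p.stat().st_size == 0]
```

Running this over the MTAT audio root would show whether only the one known file is empty or whether the download is more broadly corrupted.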

To make sure, the scores that you mentioned are the "test set" score, not the "validation set", right?

I used the dataset that already existed in my university's cluster. I believe it's collected from the original web page (https://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset).

Yeah, the results are generated by running python -u eval.py --data_path YOUR_DATA_PATH

Thanks a lot. I will report the scores once the experiments are finished.

The md5 values of MP3 files downloaded from https://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset and https://github.com/Spijkervet/CLMR/blob/master/clmr/datasets/magnatagatune.py are the same.
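For reference, the per-file comparison above can be done with Python's hashlib, streaming each file so large MP3s don't need to fit in memory (a generic sketch, not the exact script used):

```python
import hashlib

def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in 1 MiB chunks."""
    h = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()
```

Comparing md5_of(a) == md5_of(b) for each pair of files from the two sources confirms the audio data is byte-identical.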

To make sure I did not modify your original implementation, I git cloned the repo and tried again.

I did modify the preprocessing script mtat_read.py: 6/norine_braun-now_and_zen-08-gently-117-146.mp3 is 0 bytes, which makes the program crash with an EOFError. Therefore, I catch the error and skip the file.

Is this file 0 bytes in your MTAT data as well?
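The workaround described above might look like the following sketch. The actual loading code in mtat_read.py may differ; load_fn stands in for whatever audio decoder the script uses, and the .npy output path is an assumption:

```python
import os

import numpy as np

def safe_preprocess(mp3_path, npy_path, load_fn):
    """Convert one MP3 to a .npy array, skipping zero-size or truncated files.

    load_fn is whatever audio loader the preprocessing script uses (an
    assumption here); it should return a 1-D numpy array of samples.
    """
    if os.path.getsize(mp3_path) == 0:
        print(f"skipping empty file: {mp3_path}")
        return False
    try:
        np.save(npy_path, load_fn(mp3_path))
        return True
    except EOFError:
        # Truncated MP3s make the decoder hit end-of-file early.
        print(f"skipping corrupt file: {mp3_path}")
        return False
```

Skipping the file rather than aborting means the one corrupt track is simply absent from the preprocessed set, which should not measurably change the evaluation scores.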

@pengbo-learn The file size is 0 for me as well. Can you run the experiment with another model, please? If you see the performance gain with other models too, the improvement surely comes from the different PyTorch versions.

You are right, other models do better as well.

harmonicnn
loss: 0.1405
roc_auc: 0.9142
pr_auc: 0.4658

fcn
loss: 0.1405
roc_auc: 0.9006
pr_auc: 0.4347

musicnn
loss: 0.1461
roc_auc: 0.9112
pr_auc: 0.4520

Possibly it's a version issue, then. Thank you for reporting this; I will close the issue.