The experimental results could not be reproduced

Question

JJJTF opened this issue 2 years ago

I tried to reproduce the CIFAR100 result but only reached 57%, whereas the paper reports 75%. I have not been able to find the cause.

Zizheng Pan · Answer 1 · Fri May 13 2022 10:47:26 GMT+0800 (China Standard Time)

Hi @JJJTF, thanks for your interest! Could you please be more specific about which model you cannot reproduce the results for? Please also share your experiment settings, e.g. the number of GPUs and hyperparameters such as learning rate and batch size. If you follow the default hyperparameter settings in this repo, you should get similar results.
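For anyone gathering these details, a minimal sketch of a script that prints the environment side of the report (Python version, PyTorch version, CUDA version, visible GPU count). The function name `collect_environment` is hypothetical and not part of this repo; hyperparameters such as learning rate and batch size still have to be copied from your config or command line.

```python
import platform


def collect_environment():
    """Gather basic environment details that help compare two training runs."""
    info = {"python": platform.python_version()}
    try:
        import torch  # optional: only reported if PyTorch is installed
    except ImportError:
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    info["cuda"] = torch.version.cuda  # None on CPU-only builds
    info["gpu_count"] = torch.cuda.device_count()
    return info


if __name__ == "__main__":
    for key, value in collect_environment().items():
        print(f"{key}: {value}")
```

Pasting this output alongside the exact training command usually makes it much easier to spot a mismatch in versions or GPU count.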

J_JJ · Answer 2 · Thu May 19 2022 15:13:21 GMT+0800 (China Standard Time)

My experiment settings are the hvt-s-1.json file with cuda:0,1; everything else is default.

Zizheng Pan · Answer 3 · Thu May 19 2022 19:34:21 GMT+0800 (China Standard Time)

Hi @JJJTF, I just trained this model on 2 V100 GPUs. My Python environment has PyTorch 1.7.1 and CUDA 10.1. Without modifying any code in this repo, I got a final accuracy of 73.68% using the following command for training:

python -m torch.distributed.launch --nproc_per_node=2 --use_env main.py --config config/hvt-s-1.json --data-set CIFAR --data-path [path/to/cifar100]

Note that your final result could differ slightly depending on your environment, but ideally it should not be significantly lower, i.e. 57% is not normal. Please double-check your code.

Here is the training log for your reference: all_logs.txt
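One way to double-check that nothing has drifted from the defaults is to compare the config file you actually trained with against a fresh copy from the repo. Below is a minimal sketch, assuming both configs are flat JSON objects; the helper names `load_config` and `diff_configs` and the keys in the usage example are hypothetical and do not reflect the real contents of hvt-s-1.json.

```python
import json


def load_config(path):
    """Read a JSON config file into a dict."""
    with open(path) as f:
        return json.load(f)


def diff_configs(mine, reference):
    """Return {key: (my_value, reference_value)} for every top-level key that differs.

    A key that is absent on one side is reported with None for that side,
    so an absent key and an explicit null are not distinguished.
    """
    differences = {}
    for key in sorted(set(mine) | set(reference)):
        if mine.get(key) != reference.get(key):
            differences[key] = (mine.get(key), reference.get(key))
    return differences


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    mine = {"lr": 0.001, "batch_size": 128}
    reference = {"lr": 0.001, "batch_size": 256}
    print(diff_configs(mine, reference))
```

With real files, the comparison would be something like `diff_configs(load_config("my-hvt-s-1.json"), load_config("config/hvt-s-1.json"))`, where the first path is a placeholder for your own copy. An empty result means the two configs match at the top level, which points the search toward the environment, the data, or the code instead.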

J_JJ · Answer 4 · Fri May 20 2022 09:02:36 GMT+0800 (China Standard Time)

Thank you very much for your timely answers to each question. I will run the training again using the parameters from your all_logs.txt.