Cannot reproduce the paper's results using its hyperparameters for fine-tuning ViT on CIFAR-100

Question

NamlessM opened this issue 2 years ago

I ran the provided code with the hyperparameters lr = 1e-2, alpha = 0.3, dropout = 0.1, resolution = 384×384, 10,000 global steps, and batch size = 512, yet the result I got is far from the improvement reported in the paper.

wangyuenlp · Answer 1 · Mon Jun 27 2022 20:12:48 GMT+0800 (China Standard Time)

Hi! I can reproduce the ViT-B_16 R-Drop results on CIFAR-100 when I change img_size to 224, learning_rate to 8e-3, and num_steps to 20,000. Hope this helps!
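
For anyone comparing the two runs, here is a minimal sketch that lists which settings differ. It assumes that lr, resolution, and global steps from the original post correspond to learning_rate, img_size, and num_steps here; `diff_settings` is a hypothetical helper written for this illustration, not part of the repository.

```python
# Settings from the original post, renamed to img_size / learning_rate / num_steps
# (the name mapping is an assumption made for this sketch).
reported = {"img_size": 384, "learning_rate": 1e-2, "num_steps": 10_000}

# Settings the reply above says reproduce the ViT-B_16 R-Drop result.
working = {"img_size": 224, "learning_rate": 8e-3, "num_steps": 20_000}


def diff_settings(old, new):
    """Return {name: (old_value, new_value)} for every setting that differs."""
    return {name: (old[name], new[name]) for name in old if old[name] != new[name]}


changes = diff_settings(reported, working)
for name, (old_value, new_value) in changes.items():
    print(f"{name}: {old_value} -> {new_value}")
```

All three settings differ between the runs, so this thread alone does not show which change matters most.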

NamlessM · Answer 2 · Fri Jul 01 2022 04:43:29 GMT+0800 (China Standard Time)

Thank you very much for your help! I am still wondering which alpha you are using in addition to the hyperparameters listed above.

wangyuenlp · Answer 3 · Fri Jul 01 2022 10:50:03 GMT+0800 (China Standard Time)

I didn't change any hyperparameters other than img_size, learning_rate and num_steps. Therefore, the alpha is still 0.3, which is the default value.
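
To make the role of alpha concrete, below is a rough pure-Python sketch of an R-Drop style loss for a single example: the cross-entropy of two dropout forward passes plus alpha times the symmetric KL divergence between their predictions. This follows the general form described in the R-Drop paper as I understand it; the repository's actual code may scale or average the terms differently, and `log_softmax`, `kl_div`, and `r_drop_loss` are hypothetical helper names used only for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - lse for x in logits]


def kl_div(log_p, log_q):
    """KL(p || q), computed from log-probabilities."""
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))


def r_drop_loss(logits1, logits2, label, alpha=0.3):
    """Cross-entropy of both passes plus alpha * symmetric KL (a sketch, not the repo code)."""
    lp1, lp2 = log_softmax(logits1), log_softmax(logits2)
    ce = -(lp1[label] + lp2[label])                      # sum of the two NLL terms
    kl = 0.5 * (kl_div(lp1, lp2) + kl_div(lp2, lp1))     # symmetric KL between passes
    return ce + alpha * kl


# Two forward passes of the same input that disagree slightly because of dropout
# (the logits are made-up values for illustration).
loss = r_drop_loss([2.0, 0.5, -1.0], [1.5, 0.8, -0.7], label=0, alpha=0.3)
print(loss)
```

With alpha = 0 the KL term vanishes and only the two cross-entropy terms remain, so alpha controls how strongly the two passes are pushed to agree.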

NamlessM · Answer 4 · Sat Jul 02 2022 05:21:43 GMT+0800 (China Standard Time)

Thank you! I will try this setting!