dropreg / R-Drop

Cannot reproduce the results when following the paper's hyperparameters for fine-tuning ViT on CIFAR-100

NamlessM opened this issue

I ran the provided code with the hyperparameters lr = 1e-2, alpha = 0.3, dropout = 0.1, resolution = 384×384, 10,000 global steps, and batch size = 512, yet the result I got is far from the improvement reported in the paper.

Hi! I can reproduce the R-Drop results for ViT-B_16 on CIFAR-100 when I change img_size to 224, learning_rate to 8e-3, and num_steps to 20,000. Hope this helps!

Thank you very much for your help! I'm still wondering which alpha you used in addition to the hyperparameters listed above.

I didn't change any hyperparameters other than img_size, learning_rate and num_steps. Therefore, the alpha is still 0.3, which is the default value.
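
For readers wondering what alpha controls: below is a minimal sketch of the R-Drop objective for a single example, following the formulation in the paper (the summed negative log-likelihood of two dropout forward passes plus alpha times a symmetric KL term). It is written in plain Python for illustration. The function names are hypothetical, and the repository's own implementation may scale or average the terms differently.

```python
import math

def log_softmax(logits):
    # Numerically stable log-softmax over a list of floats.
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]

def kl_div(log_p, log_q):
    # KL(p || q), computed from log-probabilities.
    return sum(math.exp(lp) * (lp - lq) for lp, lq in zip(log_p, log_q))

def r_drop_loss(logits1, logits2, target, alpha=0.3):
    # logits1 and logits2 stand for two forward passes of the same input,
    # each with a different dropout mask (hypothetical helper, not the repo's API).
    log_p1 = log_softmax(logits1)
    log_p2 = log_softmax(logits2)
    # Negative log-likelihood summed over both passes.
    nll = -(log_p1[target] + log_p2[target])
    # Symmetric KL divergence between the two predicted distributions.
    sym_kl = 0.5 * (kl_div(log_p1, log_p2) + kl_div(log_p2, log_p1))
    return nll + alpha * sym_kl
```

With alpha = 0 this reduces to ordinary cross-entropy on both passes, and a larger alpha penalizes disagreement between the two passes more strongly.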

Thank you! I will try this setting!