idstcv / GPU-Efficient-Networks

Training details

ZichaoGuo opened this issue · comments

Could you release the training code, or describe the training details? Many details are missing from the paper.

There are sufficient details in our paper.

What is your learning rate schedule? I didn't see it in the paper.
“The final networks are trained up to 480 epochs with label-smoothing [Szegedy et al., 2016], mix-up [Zhang et al., 2018], random-erase [Zhong et al., 2020] and auto-augmentation [Cubuk et al., 2019]. Due to the space limitation, more details and results could be found in appendix.”
I didn't see the training details in the appendix. Could you share more here?
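For reference, here is a minimal PyTorch/torchvision sketch of how the four regularizers named in the quoted paper text are typically wired together. The specific values (smoothing 0.1, mixup alpha 0.2, erase probability 0.25) are common defaults, not settings confirmed by the authors.

```python
# Sketch of label-smoothing, mix-up, random-erase and auto-augmentation.
# All hyperparameter values here are common defaults, NOT confirmed by the paper.
import torch
import torch.nn as nn
from torchvision import transforms
from torchvision.transforms import AutoAugment, AutoAugmentPolicy

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(),
    AutoAugment(AutoAugmentPolicy.IMAGENET),   # auto-augmentation
    transforms.ToTensor(),
    transforms.RandomErasing(p=0.25),          # random-erase (applied on tensors)
])

# Label smoothing is built into CrossEntropyLoss since PyTorch 1.10.
criterion = nn.CrossEntropyLoss(label_smoothing=0.1)

def mixup(x, y, alpha=0.2):
    """Mix-up: blend random pairs of images, return both label sets and the mix weight."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    return lam * x + (1 - lam) * x[perm], y, y[perm], lam

# Inside the training loop:
#   mixed, y_a, y_b, lam = mixup(images, labels)
#   outputs = model(mixed)
#   loss = lam * criterion(outputs, y_a) + (1 - lam) * criterion(outputs, y_b)
```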

I came across your GENet and found it very interesting, so I tried to reproduce the paper's results, but the training details in the paper are not entirely clear. With batch size 1024, lr 0.5, weight decay 1e-4, 360 epochs, a 5-epoch warmup, cosine learning-rate decay, and no dropout, I only reached 76.1 accuracy with the GENet-normal architecture.

Could you share the training strategy for GENet-normal, e.g. lr, batch size, weight decay, dropout rate, epochs, the learning-rate decay schedule, and whether warmup was used? I'd appreciate your help.

Have you managed to reproduce the GENet results? I've tried adjusting several training strategies, and there is still a large gap to the paper. If you haven't reproduced them either, I'll give up.

@pawopawo I couldn't reproduce them either; my results are about the same as yours.

We will update our draft this week to include more detailed training parameters. We use cosine learning-rate decay with a 5-epoch warm-up, weight decay 4e-5, lr = 0.1, and batch size 256.
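A minimal sketch of that schedule in PyTorch: SGD with lr 0.1 and weight decay 4e-5, a 5-epoch linear warm-up, then cosine decay. The 480-epoch total comes from the quoted paper text; momentum 0.9, per-epoch stepping, and the placeholder model are assumptions, not confirmed by the authors.

```python
# Warm-up + cosine learning-rate schedule with the hyperparameters stated above.
# momentum=0.9 and per-epoch scheduler stepping are assumptions.
import math
import torch

total_epochs, warmup_epochs = 480, 5

model = torch.nn.Linear(10, 10)  # placeholder; substitute GENet-normal here
optimizer = torch.optim.SGD(model.parameters(), lr=0.1,
                            momentum=0.9, weight_decay=4e-5)

def lr_lambda(epoch):
    if epoch < warmup_epochs:
        return (epoch + 1) / warmup_epochs              # linear warm-up over 5 epochs
    progress = (epoch - warmup_epochs) / (total_epochs - warmup_epochs)
    return 0.5 * (1.0 + math.cos(math.pi * progress))   # cosine decay to ~0

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda)

# for epoch in range(total_epochs):
#     train_one_epoch(...)  # batch size 256
#     scheduler.step()
```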