Questions about batch size and learning rate
kuang23 opened this issue
Hi,
First, thanks for your great work!
After studying the code, I have two questions about the batch size and learning rate settings in main.py:
(1) Why is batch_size divided by 2 in line 70?
(2) Why is lr_schedule divided by sqrt(n_replicas) in line 174?
Hi, those settings aren't necessarily the best or a definitive configuration. Specifically, these adjustments were needed when I was experimenting with multi-GPU training (two GPUs). It's a bit of historical cruft; the training code is not fully streamlined or cleaned up in this respect. Take it as just one possible example training configuration.
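For readers wondering what adjustments of this kind usually look like, here is a minimal sketch in plain Python. It is not the code from main.py: the function names `per_replica_batch_size` and `scale_lr` are hypothetical, and the reading that the division by 2 splits a global batch across two GPUs is an assumption, not something confirmed in this thread. The square-root rule is one common heuristic for adjusting the learning rate with the number of replicas; here it simply mirrors the division by sqrt(n_replicas) mentioned in the question.

```python
import math


def per_replica_batch_size(global_batch_size: int, n_replicas: int) -> int:
    """Split a global batch evenly across replicas (assumed interpretation)."""
    if n_replicas < 1:
        raise ValueError("n_replicas must be at least 1")
    if global_batch_size % n_replicas != 0:
        raise ValueError("global_batch_size must be divisible by n_replicas")
    return global_batch_size // n_replicas


def scale_lr(base_lr: float, n_replicas: int) -> float:
    """Divide the learning rate by sqrt(n_replicas), as in the question."""
    if n_replicas < 1:
        raise ValueError("n_replicas must be at least 1")
    return base_lr / math.sqrt(n_replicas)


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(per_replica_batch_size(64, 2))  # 32 samples per GPU
    print(scale_lr(1e-3, 2))              # roughly 7.07e-4
```

With a single replica both helpers return their inputs unchanged, so a configuration written this way behaves the same on one GPU as it would without the adjustments.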
OK, I see.
Thanks for your reply!