Pre-training Time
haoshuai714 opened this issue · comments
haoshuai714 commented
Thanks for your great code!
In your paper, the pre-training experiments require 64 V100 GPUs.
How long did the pre-training take on 64 V100 GPUs?
Thank you!
Wonjae Kim commented
Hi @haoshuai714
https://tensorboard.dev/experiment/mNHxDM08R6eHKeU0JHn5vg/#scalars
This is the log of MLM+ITM pre-training with 64 V100 GPUs.