What is the number of epochs of the final training?
cmsflash opened this issue
The config file lists a dataset sample count of 220M and a global batch size of 2048, which works out to ~107K steps per epoch. The main README says the total number of training steps is 95K, which would mean the first epoch was never completed. However, the training chronicles suggest more than one epoch of training.
How many epochs did the final training run for, and what am I missing?
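For reference, here is a minimal sketch of the arithmetic behind the figures above. It assumes each training step consumes exactly one global batch, that the batch size stays constant at 2048 for the whole run (no ramp-up), and that 220M is the exact sample count. The function name `epochs_covered` is hypothetical and only used for illustration.

```python
def epochs_covered(num_samples: int, global_batch_size: int, total_steps: int):
    """Return (steps per epoch, fraction of epochs completed).

    Assumes one global batch per step and a constant batch size,
    which may not match the actual training schedule.
    """
    steps_per_epoch = num_samples / global_batch_size
    epochs = total_steps / steps_per_epoch
    return steps_per_epoch, epochs


# Numbers quoted in this issue: 220M samples, batch size 2048, 95K steps.
steps_per_epoch, epochs = epochs_covered(220_000_000, 2048, 95_000)

print(f"{steps_per_epoch:,.0f} steps per epoch")  # 107,422 steps per epoch
print(f"{epochs:.2f} epochs")                     # 0.88 epochs
```

Under these assumptions, 95K steps cover about 0.88 of one epoch, which is the mismatch with the training chronicles that this issue is asking about.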