Minimum learning rate
artnoage opened this issue
The minimum learning rate is the same as the "max". Is this intentional or a mistake? If it is intentional, why (you can skip the explanation if it is too bothersome)?
This is a mistake on our part, and thanks a million for spotting it! (By rights it should be 4e-5.)
Fortunately, we are still at a very early stage of training, so there is still room to correct it. I have corrected the mistake, and TinyLlama's lr curve will now look like the one below:
[Image: corrected TinyLlama lr curve]
I will update the README to point to this issue when the 500B-token checkpoint is released later.
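
For readers wondering why the minimum matters, here is a minimal sketch of a warmup-plus-cosine schedule. It is not the project's actual training code: the function name `get_lr`, the peak value `MAX_LR = 4e-4`, and the step counts are assumptions chosen for illustration; only the corrected minimum of 4e-5 comes from this thread. It shows that when the minimum equals the maximum, the cosine term is multiplied by zero and the learning rate never decays.

```python
import math

# Hypothetical values for illustration; only MIN_LR = 4e-5 is stated in this thread.
MAX_LR = 4e-4          # assumed peak learning rate
MIN_LR = 4e-5          # corrected minimum learning rate
WARMUP_ITERS = 2_000   # assumed number of linear warmup steps
DECAY_ITERS = 100_000  # assumed step at which the cosine decay ends


def get_lr(it, max_lr=MAX_LR, min_lr=MIN_LR,
           warmup_iters=WARMUP_ITERS, decay_iters=DECAY_ITERS):
    """Linear warmup to max_lr, then cosine decay down to min_lr."""
    if it < warmup_iters:
        # Linear warmup from 0 up to max_lr.
        return max_lr * it / warmup_iters
    if it > decay_iters:
        # After the decay window, hold the learning rate at the floor.
        return min_lr
    # Fraction of the decay window completed, in [0, 1].
    decay_ratio = (it - warmup_iters) / (decay_iters - warmup_iters)
    # Cosine coefficient goes from 1 (start of decay) to 0 (end of decay).
    coeff = 0.5 * (1.0 + math.cos(math.pi * decay_ratio))
    return min_lr + coeff * (max_lr - min_lr)


if __name__ == "__main__":
    for step in (0, 1_000, 2_000, 51_000, 100_000):
        fixed = get_lr(step)
        buggy = get_lr(step, min_lr=MAX_LR)  # min == max: no decay after warmup
        print(f"step {step:>7}: corrected={fixed:.2e}  min==max={buggy:.2e}")
```

With `min_lr` set equal to `max_lr`, the term `max_lr - min_lr` is zero, so the schedule is flat after warmup, which is the behaviour the issue describes.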