jzhang38 / TinyLlama

The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.


Minimum learning rate

artnoage opened this issue · comments

The minimum learning rate is set to the same value as the maximum. Is this intentional or a mistake? If intentional, why? (Feel free to skip the explanation if it is too bothersome.)

This is a mistake on our part, and thanks a million for spotting it! (It should rightly be 4e-5.)
Fortunately, we are still at a very early stage of training, so there is still room to correct it. I have fixed the mistake, and TinyLlama's learning-rate curve will now look like the one below:
[image: corrected learning-rate curve, decaying to a minimum of 4e-5]
I will update the README to point to this issue together with the release of the 500B-token checkpoint.
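For reference, below is a minimal Python sketch of the kind of schedule being described: cosine decay that bottoms out at a nonzero minimum learning rate instead of staying pinned at the maximum. The `max_lr` and `warmup_steps` values are illustrative assumptions, not taken from this thread; the thread only establishes that the minimum should be 4e-5.

```python
import math

def cosine_lr(step, max_steps, max_lr=4e-4, min_lr=4e-5, warmup_steps=2000):
    """Linear warmup followed by cosine decay down to min_lr.

    max_lr and warmup_steps are illustrative placeholders; the fix
    discussed in this issue is that min_lr should be 4e-5 rather
    than equal to the maximum learning rate.
    """
    # Linear warmup from 0 to max_lr over the first warmup_steps steps.
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    # Cosine anneal from max_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, max_steps - warmup_steps)
    cosine = 0.5 * (1.0 + math.cos(math.pi * min(progress, 1.0)))
    return min_lr + (max_lr - min_lr) * cosine
```

With `min_lr` mistakenly set equal to `max_lr`, the decay term vanishes and the schedule stays flat after warmup, which is the bug reported above.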