mosaicml / llm-foundry

LLM training code for Databricks foundation models

Home Page: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

Docstring in DecoupledAdaLRLion is inconsistent with the code

ericxsun opened this issue

The docstring of DecoupledAdaLRLion says the LR is scaled down by min(`lr_penalty` ** N, `min_scale`), but the code in adjust_lr computes lr * max(min_scale, lr_penalty**num_times).

So which is the right one, max or min?

You're correct. The docstring is wrong, thanks for catching the typo! Fixing with #563.
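
For anyone landing here later, a minimal sketch of why max is the form that makes min_scale act as a floor. The two functions below are standalone, hypothetical stand-ins written for illustration (they are not the llm-foundry source), and the values lr=1.0, lr_penalty=0.5, min_scale=0.1 are assumptions chosen to keep the arithmetic exact.

```python
def scale_with_max(lr: float, lr_penalty: float, num_times: int, min_scale: float) -> float:
    # Form quoted from the code in this issue: the multiplier decays as
    # lr_penalty ** num_times but never drops below min_scale.
    return lr * max(min_scale, lr_penalty ** num_times)


def scale_with_min(lr: float, lr_penalty: float, num_times: int, min_scale: float) -> float:
    # Form quoted from the old docstring: the multiplier is capped at
    # min_scale from the very first step and keeps shrinking afterwards.
    return lr * min(lr_penalty ** num_times, min_scale)


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the library defaults).
    lr, lr_penalty, min_scale = 1.0, 0.5, 0.1
    for n in range(6):
        print(n, scale_with_max(lr, lr_penalty, n, min_scale),
              scale_with_min(lr, lr_penalty, n, min_scale))
```

With these values the max form goes 1.0, 0.5, 0.25, 0.125 and then stays at 0.1, while the min form starts at 0.1 with no penalties applied and falls below it once lr_penalty ** N is smaller than min_scale. That is consistent with the reply above: the code is right and the docstring was the typo.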