cofe-ai / Mu-scaling

Research without Re-search: Maximal Update Parametrization Yields Accurate Loss Prediction across Scales
