Minimalistic large language model 3D-parallelism training
xrsrke opened this issue 5 months ago
Why don't we save a model checkpoint before terminating the training run? [link]