huggingface / nanotron

Minimalistic large language model 3D-parallelism training

Save checkpoint before terminating the training run

xrsrke opened this issue

Why don't we save a model checkpoint before terminating the training run? [link]
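To make the question concrete, here is a minimal sketch of one way to save a checkpoint before a run terminates. It uses only the Python standard library and is not nanotron's implementation: `GracefulStop`, `save_checkpoint`, `train`, and the JSON checkpoint format are all hypothetical names and choices made for illustration. The idea is that the signal handler only records the termination request, and the loop exits at a step boundary and writes the checkpoint there.

```python
import json
import os
import signal
import tempfile


class GracefulStop:
    """Hypothetical helper: records a termination request for the training loop."""

    def __init__(self):
        self.requested = False

    def install(self, signals=(signal.SIGINT, signal.SIGTERM)):
        # signal.signal must be called from the main thread.
        for sig in signals:
            signal.signal(sig, self._handle)

    def _handle(self, signum, frame):
        # Only set a flag here; the checkpoint is written by the loop,
        # so it is never saved in the middle of a step.
        self.requested = True


def save_checkpoint(state, path):
    """Write the state to a temp file, then rename, so a crash cannot leave a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w") as f:
        json.dump(state, f)
    os.replace(tmp_path, path)


def train(total_steps, checkpoint_path, stop, on_step=None):
    """Toy loop (assumption: a real trainer would save model and optimizer state)."""
    state = {"step": 0, "loss": None}
    for step in range(1, total_steps + 1):
        state["step"] = step
        state["loss"] = 1.0 / step  # stand-in for a real training step
        if on_step is not None:
            on_step(step)
        if stop.requested:
            break  # leave the loop at a step boundary
    # Reached both on normal completion and on early termination.
    save_checkpoint(state, checkpoint_path)
    return state
```

In a real script one would create `stop = GracefulStop()`, call `stop.install()` once at startup, and pass `stop` to the loop. In a multi-process run, every rank would also need to agree on the step at which to stop before saving, which this sketch does not cover.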
