TsinghuaAI / CPM-2-Pretrain

Code for CPM-2 Pre-Train

Model checkpoint convert

k15201363625 opened this issue

How can I complete the following conversion?
DeepSpeed checkpoint to mp_model (tensor-parallel shards) to single_model (a single checkpoint), for a model trained with ZeRO-1/2 and mp (tensor parallelism).
Also, which DeepSpeed version should I use?
Thanks very much.

You can refer to https://github.com/TsinghuaAI/CPM-1-Generate/blob/main/change_mp.py for the conversion.
The DeepSpeed version is 0.3.9+59e4dbb. You can use our Docker image directly to run the model.
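
For readers who want the general idea before opening the script, here is a minimal sketch of how tensor-parallel shards are typically merged into a single state dict. It is not the code from change_mp.py. The parameter-name suffixes and the `merge_shards` helper are hypothetical, and nested Python lists stand in for tensors so the example runs without PyTorch. In a real conversion each shard would usually be loaded with `torch.load` and joined with `torch.cat`, and the split axis of each parameter depends on the model code.

```python
from typing import Dict, List, Optional

Matrix = List[List[float]]

# Hypothetical naming rules (assumption, for illustration only):
# which axis each kind of parameter was split along during training.
SPLIT_DIM_BY_SUFFIX = {
    "column_parallel.weight": 0,  # output dimension split across ranks
    "row_parallel.weight": 1,     # input dimension split across ranks
}


def split_dim(name: str) -> Optional[int]:
    """Return the axis a parameter was split along, or None if replicated."""
    for suffix, dim in SPLIT_DIM_BY_SUFFIX.items():
        if name.endswith(suffix):
            return dim
    return None


def concat(parts: List[Matrix], dim: int) -> Matrix:
    """Concatenate 2-D nested lists along axis 0 or axis 1."""
    if dim == 0:
        return [row for part in parts for row in part]
    # dim == 1: join the corresponding rows of every shard
    return [sum(rows, []) for rows in zip(*parts)]


def merge_shards(shards: List[Dict[str, Matrix]]) -> Dict[str, Matrix]:
    """Merge per-rank state dicts (ordered by mp rank) into one state dict."""
    merged = {}
    for name in shards[0]:
        dim = split_dim(name)
        parts = [shard[name] for shard in shards]
        # Replicated parameters are identical on every rank: keep rank 0's copy.
        merged[name] = parts[0] if dim is None else concat(parts, dim)
    return merged


if __name__ == "__main__":
    rank0 = {"layer.column_parallel.weight": [[1, 2]], "layer.norm.weight": [[9, 9]]}
    rank1 = {"layer.column_parallel.weight": [[3, 4]], "layer.norm.weight": [[9, 9]]}
    print(merge_shards([rank0, rank1]))
```

The shards must be passed in mp-rank order, otherwise the concatenated rows or columns end up permuted. Going the other way (one checkpoint to several mp shards) would split each parameter along the same axis instead of concatenating.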

Thank you very much!