microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

[QUESTION] How can I finetune a model without zero_pp_rank_*_mp_rank_00_optim_states.pt

SefaZeng opened this issue · comments

I converted a model from Hugging Face to the Megatron-DeepSpeed format, so the checkpoint contains only the model weights. But when I set --deepspeed, the code always tries to load zero_pp_rank_*_mp_rank_00_optim_states.pt. How can I skip this, since I only need the weights and want to start a fresh training run from them?
I have set --finetune --no-load-optim --no-load-rng --no-load-lr-state, but it does not work.
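For reference, what I am effectively trying to achieve is having the DeepSpeed engine load only the module weights and ignore the ZeRO optimizer shards. A minimal sketch of that intent, assuming DeepSpeed's `engine.load_checkpoint` with its `load_module_only` / `load_optimizer_states` arguments; the model, config, and checkpoint path below are placeholders, not the actual Megatron-DeepSpeed call site:

```python
import torch
import deepspeed

# Placeholder model; in practice this is the Megatron GPT/BERT model instance.
model = torch.nn.Linear(1024, 1024)

# Hypothetical minimal DeepSpeed config (not the config used in my run).
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "optimizer": {"type": "Adam", "params": {"lr": 1e-4}},
    "zero_optimization": {"stage": 1},
}

model_engine, _, _, _ = deepspeed.initialize(
    model=model,
    model_parameters=model.parameters(),
    config=ds_config,
)

# Load module weights only; the zero_pp_rank_*_mp_rank_00_optim_states.pt files
# would not need to be read in this mode.
model_engine.load_checkpoint(
    "/path/to/converted_checkpoint",   # placeholder checkpoint directory
    load_optimizer_states=False,
    load_lr_scheduler_states=False,
    load_module_only=True,
)
```

Is there a way to get the Megatron-DeepSpeed checkpoint-loading path to behave like this when only weight files are present?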