microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Details on ZeRO++ tutorials

R0n12 opened this issue

I am trying to evaluate ZeRO++ by following this tutorial.

I was looking for pretrain_zeropp_gpt.py in this repo but had no luck. Is it still being developed, or will the regular pretrain_gpt.py work with the ZeRO++ configs?
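For reference, my understanding from the ZeRO++ tutorial is that ZeRO++ is enabled through the `zero_optimization` section of the DeepSpeed config, roughly like this (a minimal sketch; the batch size, dtype, and hpZ partition size are placeholder values I picked for a single 8-GPU node, not values taken from the tutorial):

```json
{
  "train_batch_size": 32,
  "bf16": { "enabled": true },
  "zero_optimization": {
    "stage": 3,
    "overlap_comm": true,
    "contiguous_gradients": true,
    "zero_quantized_weights": true,
    "zero_hpz_partition_size": 8,
    "zero_quantized_gradients": true
  }
}
```

I am assuming this file would then be passed to pretrain_gpt.py through the usual --deepspeed_config flag, unless a dedicated ZeRO++ script is required.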

Just wondering where I can find a concrete example to reproduce GPT-2 training with ZeRO++.

I am using the main branch here and DeepSpeed v0.11.1.
Much appreciated!

I have the same question.
Also, have you reproduced GPT-2 training with ZeRO using pretrain_gpt.py? I don't know which script I should use. Will examples_deepspeed/rebase/ds_pretrain_gpt_125M.sh work? Thanks.

It looks like neither the scripts for ZeRO++ nor the appendix of https://arxiv.org/pdf/2306.10209.pdf is available. We can't run ZeRO++ and verify the paper's speedups without these.