microsoft / Megatron-DeepSpeed

Ongoing research training transformer language models at scale, including: BERT & GPT-2

Will Offload++ forcibly offload after specifying a ratio?

xvanQ opened this issue · comments

Does enabling Twin-Flow offload force the specified ratio to be offloaded to the given device (such as the CPU), even if GPU memory is far larger than what training requires? When running DeepSpeed on a V100, the parameters do not seem to be offloaded to the CPU as specified.
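For context, partial offload with ZeRO-Offload++ (Twin-Flow) is configured through a `ratio` field under `offload_optimizer` in the DeepSpeed config JSON. A minimal sketch, with illustrative values (the exact ratio and batch settings here are assumptions, not taken from the original report):

```json
{
    "train_micro_batch_size_per_gpu": 4,
    "zero_optimization": {
        "stage": 3,
        "offload_optimizer": {
            "device": "cpu",
            "ratio": 0.3
        }
    }
}
```

With this sketch, roughly 30% of optimizer states would be placed on the CPU and the rest kept on the GPU; the question above is whether that split is enforced unconditionally or skipped when GPU memory is plentiful.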

When running ds_pretrain_gpt_350M.sh with the default configuration on a single V100, the parameters do not appear to be offloaded to the CPU as specified.