Pipeline parallelism + CPU offload?
webber26232 opened this issue · comments
Wenbo Zhao commented
The flags below are required to run CPU offload together with Megatron features:
--no-pipeline-parallel --cpu-optimizer
Could anyone explain why pipeline parallelism is not supported together with CPU offload? As far as I can tell, these two optimizations should be able to work together. Please let me know if I am wrong.