Ongoing research training transformer language models at scale, including: BERT & GPT-2
Bob-cby opened this issue a year ago · comments
Does Megatron-DeepSpeed target only specific models such as GPT-2? Can it also support parallel partitioning of relatively lightweight models such as CLIP?