NVIDIA / FasterTransformer

Transformer related optimization, including BERT, GPT


GPT-MoE support for expert parallelism

YJHMITWEB opened this issue

Hi, I am wondering whether the GPT-with-MoE example at https://github.com/NVIDIA/FasterTransformer/blob/main/docs/gpt_guide.md#gpt-with-moe supports expert parallelism. The provided examples use nlp_gpt3_text-generation_0.35B_MoE-64, but only tensor parallel and pipeline parallel options are exposed (see the sketch below).
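To illustrate what the two existing options cover, here is a minimal conceptual sketch (not FasterTransformer code; the names `tensor_para_size`, `pipeline_para_size`, and `world_size` just mirror the guide's terminology) of how tensor and pipeline parallelism carve up the model, and why a 64-expert MoE layer is a separate axis that they do not shard:

```python
# Illustrative only: how tensor/pipeline parallelism split a GPT-MoE model,
# assuming the two options exposed by the example. None of these names are
# FasterTransformer APIs.

def partition(world_size: int, tensor_para_size: int, pipeline_para_size: int,
              num_layers: int, num_experts: int) -> None:
    # Every GPU is consumed by the tensor x pipeline grid.
    assert tensor_para_size * pipeline_para_size == world_size

    layers_per_stage = num_layers // pipeline_para_size
    print(f"each pipeline stage holds {layers_per_stage} layers")
    print(f"each layer's weights are split {tensor_para_size}-way (tensor parallel)")

    # Without an expert-parallel axis, every rank still has to hold a shard
    # of all num_experts experts in each MoE layer.
    print(f"every rank touches all {num_experts} experts")

partition(world_size=8, tensor_para_size=4, pipeline_para_size=2,
          num_layers=24, num_experts=64)
```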

Since the Swin-Transformer-Quantization folder shows FasterTransformer building on the Swin-MoE repo, which does support expert parallelism, I'd like to know how to enable this feature for GPT-MoE as well.
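For comparison, the sketch below shows the kind of sharding expert parallelism implies (as in Swin-MoE): each rank owns a disjoint subset of the experts and routed tokens are exchanged between ranks, typically via an all-to-all collective, instead of every rank holding every expert. This is a conceptual illustration only, not FasterTransformer's GPT-MoE implementation; `expert_para_size` is a hypothetical name.

```python
# Conceptual sketch of expert parallelism (not a FasterTransformer API):
# experts are partitioned across ranks; tokens routed to a remote expert
# would be exchanged via an all-to-all in a real system.

def expert_assignment(num_experts: int, expert_para_size: int) -> dict[int, int]:
    """Map each expert id to the rank that owns it (contiguous blocks)."""
    experts_per_rank = num_experts // expert_para_size
    return {e: e // experts_per_rank for e in range(num_experts)}

owner = expert_assignment(num_experts=64, expert_para_size=8)
print(owner[0], owner[7], owner[8], owner[63])  # -> 0 0 1 7

# A token routed to expert 42 would be sent to rank owner[42] for its FFN
# and then returned to its source rank; this sharding of the expert axis is
# exactly what the tensor/pipeline options alone do not provide.
```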