Support for expert parallelism in MoE inference

Question

iteratorlee opened this issue a year ago

#743 also mentions this issue. Is there a tutorial or guide on how to use expert parallelism in MoE inference?