hpcaitech / EnergonAI

Large-scale model inference.


[Feature]: Automatic Pipeline Parallelism

dujiangsu opened this issue · comments


Describe the feature:
We plan to introduce automatic pipeline parallelism into EnergonAI, so that users only need to specify a few simple arguments to obtain pipeline parallelism.
Built on torch.fx, the pipelinable directory provides functions that can split a model into multiple submodules.
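As a rough illustration of the idea (not EnergonAI's actual implementation), torch.fx's built-in `split_module` utility can partition a traced model into per-stage submodules from a node-to-stage mapping; `ToyModel` and the naive halfway split below are made up for this sketch:

```python
import torch
import torch.nn as nn
import torch.fx as fx
from torch.fx.passes.split_module import split_module

# Toy model used only for illustration.
class ToyModel(nn.Module):
    def __init__(self):
        super().__init__()
        self.layer0 = nn.Linear(16, 16)
        self.layer1 = nn.Linear(16, 16)

    def forward(self, x):
        x = torch.relu(self.layer0(x))
        return self.layer1(x)

model = ToyModel()
traced = fx.symbolic_trace(model)

# Assign every fx node to a pipeline stage; here we simply switch
# stages at the midpoint of the graph.
nodes = list(traced.graph.nodes)
stage_of = {node: (0 if i < len(nodes) // 2 else 1) for i, node in enumerate(nodes)}

split = split_module(traced, model, lambda node: stage_of[node])
print(split)  # contains submod_0 and submod_1, one GraphModule per stage
```

In practice the stage assignment would come from a partitioning policy (e.g. balancing parameters or compute per stage) rather than a fixed halfway cut.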

Difficulty:

  1. Use the meta device during fx.GraphModule generation to reduce peak memory usage (see the sketch after this list).
  2. auto_pipeline_wrapper.py is not yet fully automated.
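As a rough sketch of point 1 (not existing EnergonAI code): the model can be constructed on PyTorch's meta device so that no real parameter storage is allocated while the fx graph is generated; the `nn.Sequential` model below is only a placeholder:

```python
import torch.nn as nn
import torch.fx as fx

# Construct the model on the meta device: parameters carry shapes and dtypes
# but no real storage, so building the graph does not pay for the weights.
meta_model = nn.Sequential(
    nn.Linear(4096, 4096, device="meta"),
    nn.ReLU(),
    nn.Linear(4096, 4096, device="meta"),
)

# Symbolic tracing records the graph structure without executing real
# tensor kernels, so it works on meta parameters.
traced = fx.symbolic_trace(meta_model)
print(traced.graph)
```

The per-stage submodules could then be materialized on their target devices, so each pipeline stage only ever holds the weights it actually needs.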