Repositories under the pipeline-parallelism topic:
Making large AI models cheaper, faster and more accessible
🌸 Run LLMs at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading
A GPipe implementation in PyTorch
PaddlePaddle's large-model development suite, providing an end-to-end development toolchain for large language models, cross-modal large models, bio-computing large models, and more.
LiBai (李白): A Toolbox for Large-Scale Distributed Parallel Training
Easy Parallel Library (EPL) is a general and efficient deep learning framework for distributed model training.
A curated list of awesome projects and papers for distributed training or inference
An Efficient Pipelined Data Parallel Approach for Training Large Model
Implementation of autoregressive language model using improved Transformer and DeepSpeed pipeline parallelism.
Official implementation of DynPartition: Automatic Optimal Pipeline Parallelism of Dynamic Neural Networks over Heterogeneous GPU Systems for Inference Tasks
Model parallelism for NN architectures with skip connections (e.g. ResNets, UNets)
Docs for torchpipe: https://github.com/torchpipe/torchpipe
pipeDejavu: Hardware-aware Latency Predictable, Differentiable Search for Faster Config and Convergence of Distributed ML Pipeline Parallelism
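Several of the projects above implement GPipe-style pipeline parallelism, where a model is partitioned into sequential stages and each input batch is split into micro-batches that flow through the stages. A minimal sketch of that micro-batch scheduling idea, using plain Python functions as hypothetical stand-ins for model partitions (this is an illustration only, not code from any listed repository):

```python
def pipeline_forward(stages, batch, num_microbatches):
    """Run `batch` through sequential `stages`, micro-batch by micro-batch.

    In a real pipeline-parallel system each stage lives on its own device
    and the micro-batches overlap in time; here they run sequentially,
    which preserves the numerics but not the speedup.
    """
    size = len(batch)
    chunk = (size + num_microbatches - 1) // num_microbatches  # ceil division
    microbatches = [batch[i:i + chunk] for i in range(0, size, chunk)]
    outputs = []
    for mb in microbatches:
        for stage in stages:  # each stage transforms the micro-batch in turn
            mb = stage(mb)
        outputs.extend(mb)
    return outputs

# Two toy stages standing in for model partitions: scale, then shift.
stages = [lambda xs: [2 * x for x in xs],
          lambda xs: [x + 1 for x in xs]]
print(pipeline_forward(stages, [1, 2, 3, 4], num_microbatches=2))  # [3, 5, 7, 9]
```

Splitting into micro-batches is what lets real implementations (e.g. the GPipe port above, or DeepSpeed's pipeline engine) keep all stages busy instead of idling while one batch traverses the pipeline.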