Youhe Jiang's repositories
IJCAI2023-OptimalShardedDataParallel
[IJCAI2023] An automated parallel training system that combines the advantages of both data and model parallelism. If you are interested, please visit/star/fork https://github.com/Youhe-Jiang/OptimalShardedDataParallel
alpa
Training and serving large-scale neural networks with auto parallelization.
cutlass
CUDA Templates for Linear Algebra Subroutines
DeepRL_PyTorch
Deep reinforcement learning code for study. Currently covers only the following algorithms: DQN, C51, QR-DQN, IQN, QUOTA.
examples
A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.
FasterTransformer
Transformer-related optimization, including BERT and GPT
flash-attention
Fast and memory-efficient exact attention
FlexFlow
A distributed deep learning framework that supports flexible parallelization strategies.
FlexGen
Running large language models on a single GPU for throughput-oriented scenarios.
Hetu-Galvatron
Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you are interested, please visit/star/fork https://github.com/PKU-DAIR/Hetu-Galvatron
HexGen
Serving LLMs on heterogeneous decentralized clusters.
llama
Inference code for LLaMA models
Megatron-DeepSpeed
Ongoing research on training transformer language models at scale, including BERT and GPT-2
mms
Multi-model serving
MS-AMP
Microsoft Automatic Mixed Precision Library
MS-AMP-Examples
Examples for MS-AMP package.
ninja
A small build system with a focus on speed
pytorch-CycleGAN-and-pix2pix
Image-to-Image Translation in PyTorch
Pytorch-UNet
PyTorch implementation of the U-Net for image semantic segmentation with high quality images
TransformerEngine
A library for accelerating Transformer models on NVIDIA GPUs, including using 8-bit floating point (FP8) precision on Hopper and Ada GPUs, to provide better performance with lower memory utilization in both training and inference.
yolov7
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
Youhe-Jiang
Config files for my GitHub profile.
youhe-jiang.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes