Youhe Jiang's repositories

IJCAI2023-OptimalShardedDataParallel

[IJCAI2023] An automated parallel training system that combines the advantages of both data and model parallelism. If you are interested, please visit/star/fork https://github.com/Youhe-Jiang/OptimalShardedDataParallel

Language: Python · License: MIT · Stargazers: 50 · Issues: 1
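
The sharded-data-parallel idea behind OSDP can be illustrated with PyTorch's built-in FullyShardedDataParallel wrapper. This is a minimal sketch of the general technique, not OSDP's own API; the model and tensor shapes are placeholders.

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import FullyShardedDataParallel as FSDP

# Run under torchrun so rank/world-size env vars are set, e.g.:
#   torchrun --nproc_per_node=2 this_script.py
dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank())

model = nn.Transformer(d_model=512, nhead=8).cuda()
model = FSDP(model)  # shards parameters, gradients, and optimizer state across ranks

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
src = torch.randn(10, 32, 512, device="cuda")
tgt = torch.randn(10, 32, 512, device="cuda")

loss = model(src, tgt).sum()
loss.backward()   # gradients are reduce-scattered to the ranks owning each shard
optimizer.step()  # each rank updates only its own shard
```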

Hetu

A high-performance distributed deep learning system targeting large-scale and automated distributed training.

Language: Python · License: Apache-2.0 · Stargazers: 2 · Issues: 0

alpa

Training and serving large-scale neural networks with auto parallelization.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

cutlass

CUDA Templates for Linear Algebra Subroutines

License: NOASSERTION · Stargazers: 0 · Issues: 0

DeepRL_PyTorch

Deep Reinforcement Learning code for study. Currently it covers the following algorithms: DQN, C51, QR-DQN, IQN, and QUOTA.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0
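
For context on this algorithm family, here is a minimal sketch of the DQN temporal-difference update in PyTorch. The network sizes and batch data are made up for illustration; this is not code from the repository.

```python
import torch
import torch.nn as nn

# Toy online and target networks (4-dim state, 2 actions)
q_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net = nn.Sequential(nn.Linear(4, 64), nn.ReLU(), nn.Linear(64, 2))
target_net.load_state_dict(q_net.state_dict())

gamma = 0.99
states = torch.randn(32, 4)            # fake replay-buffer batch
actions = torch.randint(0, 2, (32,))
rewards = torch.randn(32)
next_states = torch.randn(32, 4)
dones = torch.zeros(32)

# Q(s, a) for the actions actually taken
q_values = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

# TD target: r + gamma * max_a' Q_target(s', a'), zeroed at terminal states
with torch.no_grad():
    next_q = target_net(next_states).max(dim=1).values
    target = rewards + gamma * (1.0 - dones) * next_q

loss = nn.functional.mse_loss(q_values, target)
loss.backward()
```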

examples

A set of examples around PyTorch in Vision, Text, Reinforcement Learning, etc.

License: BSD-3-Clause · Stargazers: 0 · Issues: 0

FasterTransformer

Transformer-related optimization, including BERT and GPT

Language: C++ · License: Apache-2.0 · Stargazers: 0 · Issues: 0

flash-attention

Fast and memory-efficient exact attention

License: BSD-3-Clause · Stargazers: 0 · Issues: 0
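
A minimal usage sketch, assuming the flash_attn package's public flash_attn_func (tensor layout (batch, seqlen, nheads, headdim) in half precision, which may vary across versions); the shapes here are placeholders.

```python
import torch
from flash_attn import flash_attn_func  # flash-attention v2 public API

# (batch, seqlen, nheads, headdim), fp16 or bf16, on GPU
q = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
k = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")
v = torch.randn(2, 1024, 8, 64, dtype=torch.float16, device="cuda")

# Exact attention computed in tiles, never materializing the full
# (seqlen x seqlen) score matrix in GPU memory.
out = flash_attn_func(q, k, v, causal=True)  # (2, 1024, 8, 64)
```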

FlexFlow

A distributed deep learning framework that supports flexible parallelization strategies.

License: Apache-2.0 · Stargazers: 0 · Issues: 0

FlexGen

Running large language models on a single GPU for throughput-oriented scenarios.

License: Apache-2.0 · Stargazers: 0 · Issues: 0

Hetu-Galvatron

Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs). If you are interested, please visit/star/fork https://github.com/PKU-DAIR/Hetu-Galvatron

Stargazers: 0 · Issues: 0

HexGen

Serving LLMs on heterogeneous decentralized clusters.

Language: Python · License: Apache-2.0 · Stargazers: 0 · Issues: 0

llama

Inference code for LLaMA models

License: GPL-3.0 · Stargazers: 0 · Issues: 0

Megatron-DeepSpeed

Ongoing research on training transformer language models at scale, including BERT and GPT-2

Language: Python · License: NOASSERTION · Stargazers: 0 · Issues: 0

mms

Multi-model serving

Language: Python · Stargazers: 0 · Issues: 0

MS-AMP

Microsoft Automatic Mixed Precision Library

License: MIT · Stargazers: 0 · Issues: 0
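
A minimal sketch of how such a mixed-precision wrapper is typically applied, following MS-AMP's documented msamp.initialize entry point; treat the exact signature and the "O2" opt level as assumptions, and the model and shapes as placeholders.

```python
import torch
import msamp  # assumption: MS-AMP's documented top-level initialize API

model = torch.nn.Linear(1024, 1024).cuda()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# Wraps model and optimizer so weights/gradients use low-precision (FP8)
# storage, with scaling handled by the library.
model, optimizer = msamp.initialize(model, optimizer, opt_level="O2")

x = torch.randn(16, 1024, device="cuda")
loss = model(x).float().sum()
loss.backward()
optimizer.step()
```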

MS-AMP-Examples

Examples for the MS-AMP package.

License: MIT · Stargazers: 0 · Issues: 0

ninja

A small build system with a focus on speed

License: Apache-2.0 · Stargazers: 0 · Issues: 0

pytorch-CycleGAN-and-pix2pix

Image-to-Image Translation in PyTorch

Language: Python · License: NOASSERTION · Stargazers: 0 · Issues: 0

Pytorch-UNet

PyTorch implementation of U-Net for semantic image segmentation with high-quality images

License: GPL-3.0 · Stargazers: 0 · Issues: 0

TransformerEngine

A library for accelerating Transformer models on NVIDIA GPUs, including support for 8-bit floating point (FP8) precision on Hopper and Ada GPUs, providing better performance with lower memory utilization in both training and inference.

License: Apache-2.0 · Stargazers: 0 · Issues: 0
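
A minimal FP8 sketch assuming Transformer Engine's PyTorch API (te.Linear and te.fp8_autocast, as in its quickstart examples); the layer sizes are placeholders, and FP8 execution requires a Hopper- or Ada-class GPU.

```python
import torch
import transformer_engine.pytorch as te
from transformer_engine.common import recipe

# FP8 scaling recipe; HYBRID uses E4M3 for forward and E5M2 for backward.
fp8_recipe = recipe.DelayedScaling(fp8_format=recipe.Format.HYBRID)

layer = te.Linear(768, 768, bias=True).cuda()
x = torch.randn(32, 768, device="cuda")

# Matmuls inside this context run in FP8 on supported GPUs.
with te.fp8_autocast(enabled=True, fp8_recipe=fp8_recipe):
    y = layer(x)

y.sum().backward()
```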

yolov7

Implementation of the paper "YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors"

License: GPL-3.0 · Stargazers: 0 · Issues: 0

Youhe-Jiang

Config files for my GitHub profile.

Stargazers: 0 · Issues: 1

youhe-jiang.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript · License: MIT · Stargazers: 0 · Issues: 0