Zehao Shi's starred repositories
ColossalAI
Making large AI models cheaper, faster and more accessible
flash-attention
Fast and memory-efficient exact attention
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
DeepSpeedExamples
Example models using DeepSpeed
lion-pytorch
🦁 Lion, a new optimizer discovered by Google Brain using genetic algorithms that is purportedly better than Adam(W), in PyTorch
how-to-optim-algorithm-in-cuda
How to optimize some algorithms in CUDA.
python3-source-code-analysis
Python 3 Source Code Analysis (《Python 3 源码剖析》)
PatrickStar
PatrickStar enables Larger, Faster, Greener Pretrained Models for NLP and democratizes AI for everyone.
dino-vit-features
Official implementation for the paper "Deep ViT Features as Dense Visual Descriptors".
algorithmic-efficiency
MLCommons Algorithmic Efficiency is a benchmark and competition measuring neural network training speedups due to algorithmic improvements in both training algorithms and models.
Pytorch-PCGrad
PyTorch reimplementation of "Gradient Surgery for Multi-Task Learning"
sagemaker-debugger
Amazon SageMaker Debugger provides functionality to save tensors during training of machine learning jobs and analyze those tensors
SHARK-Turbine
Unified compiler/runtime for interfacing with PyTorch Dynamo.