yh8899's starred repositories
llama3-from-scratch
llama3 implementation one matrix multiplication at a time
ThunderKittens
Tile primitives for speedy kernels
cudnn-frontend
cudnn_frontend provides a C++ wrapper for the cuDNN backend API and samples showing how to use it
Megatron-DeepSpeed
Ongoing research training transformer language models at scale, including: BERT & GPT-2
DeepSpeedExamples
Example models using DeepSpeed
Open-Sora-Plan
This project aims to reproduce Sora (OpenAI's T2V model); we hope the open-source community will contribute to it.
generative-models
Generative Models by Stability AI
optimum-quanto
A PyTorch quantization backend for Optimum
text-generation-inference
Large Language Model Text Generation Inference
pytorch-original-transformer
My implementation of the original transformer model (Vaswani et al.). I've additionally included the playground.py file for visualizing otherwise seemingly hard concepts. IWSLT pretrained models are currently included.
ml-engineering
Machine Learning Engineering Open Book
torch-mlir
The Torch-MLIR project aims to provide first class support from the PyTorch ecosystem to the MLIR ecosystem.
bitsandbytes
Accessible large language models via k-bit quantization for PyTorch.