Jiashu's starred repositories
readerwriterqueue
A fast single-producer, single-consumer lock-free queue for C++
transformer-walkthrough
A walkthrough of transformer architecture code
intel-extension-for-deepspeed
Intel® Extension for DeepSpeed* is an extension to DeepSpeed that brings feature support with SYCL kernels on Intel GPU (XPU) devices. Note that XPU is already supported by stock DeepSpeed.
generative-ai-for-beginners
18 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
dlrm_datasets
Set of datasets for the deep learning recommendation model (DLRM).
sc23-dl-tutorial
SC23 Deep Learning at Scale Tutorial Material
llm-analysis
Latency and Memory Analysis of Transformer Models for Training and Inference
DeepSpeed-MII
MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.
how-to-optimize-gemm
Row-major matmul optimization
Megatron-LM
Ongoing research training transformer models at scale
ElasticFlow
Artifacts for our ASPLOS'23 paper ElasticFlow