Chaofan Lin's starred repositories
cs-self-learning
A self-learning guide to computer science
PowerInfer
High-speed Large Language Model Serving on PCs with Consumer-grade GPUs
SJTUThesis
Shanghai Jiao Tong University LaTeX Thesis Template
Awesome-GPTs
Curated list of awesome GPTs 👍.
Checkpoint
Fast and simple homebrew save manager for 3DS and Switch.
CUDA-Learn-Notes
🎉 CUDA notes / hand-written CUDA kernels for large models / C++ notes, updated irregularly: flash_attn, sgemm, sgemv, warp reduce, block reduce, dot product, elementwise, softmax, layernorm, rmsnorm, hist, etc.
Triton-Puzzles
Puzzles for learning Triton
How_to_optimize_in_GPU
A series of GPU optimization topics that explains in detail how to optimize CUDA kernels. It covers several basic kernels, including elementwise, reduce, sgemv, and sgemm. The performance of these kernels is at or near the theoretical limit.
pokeyellow
Disassembly of Pokemon Yellow
Awesome-CUDA
This is a list of useful libraries and resources for CUDA development.
LLMSys-PaperList
Large Language Model (LLM) Systems Paper List
ParrotServe
[OSDI'24] Serving LLM-based Applications Efficiently with Semantic Variable