Wenhao Xie's starred repositories
ccf-deadlines
⏰ Collaboratively track deadlines of conferences recommended by CCF (website, Python CLI, WeChat applet). If you find it useful, please star this project. Thanks!
llama3-from-scratch
Llama 3 implementation, one matrix multiplication at a time
NN-CUDA-Example
Several simple examples for popular neural network toolkits calling custom CUDA operators.
flash-attention
Fast and memory-efficient exact attention
cuda-samples
Samples for CUDA developers that demonstrate features of the CUDA Toolkit
CUDALibrarySamples
CUDA Library Samples
text-generation-inference
Large Language Model Text Generation Inference
llm-course
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
mlir-tutorial
MLIR For Beginners tutorial
Awesome-Efficient-LLM
A curated list for Efficient Large Language Models
awesome-generative-ai-guide
A one-stop repository for generative AI research updates, interview resources, notebooks, and much more!
Awesome-LLM-Inference
📖 A curated list of awesome LLM inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
LLMs-from-scratch
Implementing a ChatGPT-like LLM in PyTorch from scratch, step by step
Modern-CPP-Programming
Modern C++ Programming Course (C++03/11/14/17/20/23/26)