Yan Chunwei's repositories
cutlass
CUDA Templates for Linear Algebra Subroutines
doom-emacs-docker
A minimal Dockerized Emacs environment for development
explor
Miscellaneous code for surveying features of tools and libraries.
hugo-PaperMod
A fast, clean, responsive Hugo theme.
llm-scratch
A record of some experiments.
pimacs
A programming language and transpiler for Emacs Lisp
TensorRT-LLM
TensorRT-LLM provides users with an easy-to-use Python API to define Large Language Models (LLMs) and build TensorRT engines that contain state-of-the-art optimizations to perform inference efficiently on NVIDIA GPUs. TensorRT-LLM also contains components to create Python and C++ runtimes that execute those TensorRT engines.
tensorrtllm_backend
The Triton TensorRT-LLM Backend
the-algorithm
Source code for Twitter's Recommendation Algorithm
yallm
Assorted code and working notes related to LLM inference