Wenxiang's repositories
mmoe
Mixed precision MoE kernels
Language: C++ | License: MIT
nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
Language: Python | License: MIT
pycublas
Updated Python interface for cuBLAS.
Language: CUDA | License: MIT
vllm-pr
A high-throughput and memory-efficient inference and serving engine for LLMs
Language: Python | License: Apache-2.0