hlu1's repositories
AITemplate_public
AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
caffe2
Caffe2 is a lightweight, modular, and scalable deep learning framework.
cutlass
CUDA Templates for Linear Algebra Subroutines
dmlc-core
A common bricks library for building scalable and portable distributed machine learning.
KeepingYouAwake
Prevents your Mac from going to sleep.
minGPT
A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training
models
A repository for storing pre-trained Caffe2 models.
TASO
A Tensor Algebra SuperOptimizer for Deep Learning
vllm
A high-throughput and memory-efficient inference and serving engine for LLMs