Hongzheng Chen's repositories
Assignments
Assignments for Computer Science courses at SYSU
ToolsSeminar-CS
Seminar on selected tools in Computer Science
ptc-tutorial
PyTorch compilation tutorial covering TorchScript, torch.fx, and Slapo
slapo-artifact
Artifact evaluation of ASPLOS 2024 paper "Slapo: A Schedule Language for Progressive Optimization of Large Deep Learning Model Training"
heterocl-demo
Demo programs for HeteroCL
awesome-tensor-compilers
A list of awesome compiler projects and papers for tensor computation and deep learning.
ByteTransformer
Optimized BERT transformer inference on NVIDIA GPUs. https://arxiv.org/abs/2210.03052
DeepSpeed
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
epoi
Benchmark PyTorch Custom Operators
FasterTransformer
Transformer-related optimization, including BERT and GPT
GeminiGraph
A computation-centric distributed graph processing system.
hcl-dialect
HeteroCL-MLIR dialect for accelerator design
hidet
Hidet: A compilation-based DNN inference framework
llvm-pass-skeleton
example LLVM pass
llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept GitHub pull requests at the moment. Please submit your patches at http://reviews.llvm.org.
Megatron-LM
Ongoing research training transformer models at scale
pl.cs.cornell.edu
Website for PL@Cornell
riscv-innovations
RISC-V is where innovation happens!
scalehls
A scalable High-Level Synthesis framework on MLIR
slapo
A schedule language for progressive optimization of large deep learning model training
transformers
🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.