Shizhi Tang's repositories
FreeTensor
A language and compiler for irregular tensor programs.
vfk_uoj_sandbox
vfk's sandbox for uoj
FreeTensor_experiments
Experiments on FreeTensor
async-syscall-app
Userspace for roastduck/linux:async. Work in progress.
AutoGPTQ
An easy-to-use LLM quantization package with user-friendly APIs, based on the GPTQ algorithm.
benchmark
A microbenchmark support library
capsule
Capsule network implemented with TVM
checkout
Action for checking out a repo
cutlass
CUDA Templates for Linear Algebra Subroutines
EETQ
Easy and Efficient Quantization for Transformers
Enzyme
High-performance automatic differentiation of LLVM and MLIR.
fastmoe
A fast MoE impl for PyTorch
googletest
GoogleTest - Google Testing and Mocking Framework
incubator-tvm
Open deep learning compiler stack for CPU, GPU and specialized accelerators
llm-awq
[MLSys 2024 Best Paper Award] AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
longformer
Longformer: The Long-Document Transformer
onnx
Open standard for machine learning interoperability
pytorch-benchmark
TorchBench is a collection of open source benchmarks used to evaluate PyTorch performance.
spdlog
Fast C++ logging library.
taskflow
A General-purpose Task-parallel Programming System using Modern C++
transformers
🤗 Transformers: State-of-the-art Natural Language Processing for PyTorch and TensorFlow 2.0.