zhangxs's starred repositories
tvm_mlir_learn
A collection of compiler learning resources.
Awesome-LLM-Inference
📖 A curated list of Awesome LLM Inference papers with code: TensorRT-LLM, vLLM, streaming-llm, AWQ, SmoothQuant, WINT8/4, Continuous Batching, FlashAttention, PagedAttention, etc.
distrifuser
[CVPR 2024 Highlight] DistriFusion: Distributed Parallel Inference for High-Resolution Diffusion Models
dash-infer
DashInfer is a native LLM inference engine aiming to deliver industry-leading performance atop various hardware architectures, including x86 and ARMv9.
PipeFusion
A Suite of Parallel Approaches for Inference of Diffusion Transformer Models on GPU Clusters
Effective-Fusion-Factor
Effective Fusion Factor in FPN for Tiny Object Detection (WACV 2021)
u-mixformer
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
Hetu-Galvatron
Galvatron is an automatic distributed training system designed for Transformer models, including Large Language Models (LLMs).