ZCHNO's repositories
compiler-and-arch
A list of tutorials, papers, talks, and open-source projects for emerging compilers and architectures
MatmulTutorial
An Easy-to-understand TensorOp Matmul Tutorial
ZSZ_Samples
Benchmark & Study materials
Beijing_Daxuexi_Simple
Automatically completes Beijing's Qingnian Daxuexi (青年大学习, "Youth Study") sessions using GitHub Actions
mlc-llm
Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.
academicpages.github.io
GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes
alpa
Auto parallelization for large-scale neural networks
cuda-samples
Samples for CUDA Developers which demonstrate features in CUDA Toolkit
flashinfer
FlashInfer: Kernel Library for LLM Serving
FlexFlow
A distributed deep learning framework.
generative-ai-for-beginners
12 Lessons, Get Started Building with Generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/
gpgpu-sim_distribution
GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.
llvm-project
The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept GitHub pull requests at this moment. Please submit your patches at http://reviews.llvm.org.
nccl
Optimized primitives for collective multi-GPU communication
tensorflow
An Open Source Machine Learning Framework for Everyone
tflite-micro
TensorFlow Lite for Microcontrollers
thrust
[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl
uwsampl.github.io
The UW SAMPL group's website.