ZCHNO's repositories

compiler-and-arch

A list of tutorials, papers, talks, and open-source projects on emerging compilers and architectures

Language: Python · License: Apache-2.0 · Stars: 8 · Issues: 1

MatmulTutorial

An easy-to-understand TensorOp matmul tutorial

Language: C++ · License: Apache-2.0 · Stars: 212 · Issues: 8

ZSZ_Samples

Benchmark & Study materials

Language: C++ · Stars: 7 · Issues: 0

tvm

Open deep learning compiler stack for CPUs, GPUs, and specialized accelerators

Language: Python · License: Apache-2.0 · Stars: 4 · Issues: 3

Beijing_Daxuexi_Simple

Automatically completes Beijing's 青年大学习 (Youth Study) sessions using GitHub Actions

Language: Python · License: MIT · Stars: 2 · Issues: 2

mlc-llm

Enable everyone to develop, optimize, and deploy AI models natively on everyone's devices.

Language: Python · License: Apache-2.0 · Stars: 0 · Issues: 0

academicpages.github.io

GitHub Pages template for academic personal websites, forked from mmistakes/minimal-mistakes

Language: JavaScript · License: MIT · Stars: 0 · Issues: 2

akg-test

Hard copy of AKG without git-lfs

Language: Python · License: Apache-2.0 · Stars: 0 · Issues: 3

alpa

Auto parallelization for large-scale neural networks

License: Apache-2.0 · Stars: 0 · Issues: 0

byteir

ByteIR

Language: MLIR · License: NOASSERTION · Stars: 0 · Issues: 2

cuda-samples

Samples for CUDA developers demonstrating features in the CUDA Toolkit

Language: C · License: NOASSERTION · Stars: 0 · Issues: 2

cutlass

CUDA Templates for Linear Algebra Subroutines

Language: C++ · License: NOASSERTION · Stars: 0 · Issues: 2

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and FastChat-T5.

Language: Python · License: Apache-2.0 · Stars: 0 · Issues: 2

flashinfer

FlashInfer: Kernel Library for LLM Serving

License: Apache-2.0 · Stars: 0 · Issues: 0

FlexFlow

A distributed deep learning framework.

Language: C++ · License: Apache-2.0 · Stars: 0 · Issues: 0

gem5-zsz

This is a read-only mirror of the gem5 simulator. The upstream repository is stored at https://gem5.googlesource.com; code reviews should be submitted to https://gem5-review.googlesource.com/. The mirrors are synchronized every 15 minutes.

Language: C++ · License: BSD-3-Clause · Stars: 0 · Issues: 2

generative-ai-for-beginners

12 lessons to get started building with generative AI 🔗 https://microsoft.github.io/generative-ai-for-beginners/

Language: Jupyter Notebook · License: MIT · Stars: 0 · Issues: 0

gpgpu-sim_distribution

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism, as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.

Language: C++ · License: NOASSERTION · Stars: 0 · Issues: 2

jax

Composable transformations of Python+NumPy programs: differentiate, vectorize, JIT to GPU/TPU, and more

Language: Python · License: Apache-2.0 · Stars: 0 · Issues: 2

llvm-project

The LLVM Project is a collection of modular and reusable compiler and toolchain technologies. Note: the repository does not accept GitHub pull requests at this moment. Please submit your patches at http://reviews.llvm.org.

License: NOASSERTION · Stars: 0 · Issues: 0

nccl

Optimized primitives for collective multi-GPU communication

License: NOASSERTION · Stars: 0 · Issues: 0

silo-lm

SILO Language Models code repository

Language: Python · License: MIT · Stars: 0 · Issues: 1

tensorflow

An Open Source Machine Learning Framework for Everyone

License: Apache-2.0 · Stars: 0 · Issues: 0

tflite-micro

TensorFlow Lite for Microcontrollers

Language: C++ · License: Apache-2.0 · Stars: 0 · Issues: 2

thrust

[ARCHIVED] The C++ parallel algorithms library. See https://github.com/NVIDIA/cccl

License: NOASSERTION · Stars: 0 · Issues: 0

uwsampl.github.io

The UW SAMPL group's website.

Language: HTML · License: NOASSERTION · Stars: 0 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stars: 0 · Issues: 2