yzh119

Zihao Ye's repositories

bibfetch

Fetch bibtex entries from academic search engines like dblp.

Language: Python | License: GPL-3.0 | 3 20

mirage

A multi-level tensor algebra superoptimizer

License: Apache-2.0 | 200

punica

Serving multiple LoRA-finetuned LLMs as one

Language: Python | License: Apache-2.0 | 2 10

mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Language: Python | License: Apache-2.0 | 1 10

relax

Temp repo for prototyping relax (relay next); the effort will be upstreamed. We use the wiki pages on this repo to host design docs.

Language: Python | License: Apache-2.0 | 1 10

dgsparse

Language: Cuda | License: Apache-2.0 | 010

envd

🏕️ Reproducible development environment for AI/ML

Language: Go | License: Apache-2.0 | 010

flashinfer-ai.github.io

Project website of FlashInfer project

Language: HTML | 000

graphiler

Language: Cuda | License: Apache-2.0 | 010

llm-perf-bench

Language: Shell | 010

Magicube

Magicube is a high-performance library for quantized sparse matrix operations (SpMM and SDDMM) of deep learning on Tensor Cores.

Language: C++ | License: GPL-3.0 | 010

metal-benchmarks

Apple GPU microarchitecture

License: MIT | 000

mlx

MLX: An array framework for Apple silicon

License: MIT | 000

mogan

Mogan Editor / 墨干编辑器

Language: Tcl | License: GPL-3.0 | 010

nccl

Optimized primitives for collective multi-GPU communication

Language: C++ | License: NOASSERTION | 000

nnfusion

A flexible and efficient deep neural network (DNN) compiler that generates high-performance executable from a DNN model description.

Language: C++ | License: MIT | 010

relax-sparse

Temp repo for prototyping relax (relay next); the effort will be upstreamed. We use the wiki pages on this repo to host design docs.

Language: Python | License: Apache-2.0 | 010

sam

Language: Python | License: MIT | 010

smoothquant

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Language: Python | License: MIT | 010

sputnik

A library of GPU kernels for sparse matrix operations.

Language: C++ | License: Apache-2.0 | 020

taco

The Tensor Algebra Compiler (taco) computes sparse tensor expressions on CPUs and GPUs

Language: C++ | License: NOASSERTION | 030

tlcpack

Language: Groovy | License: Apache-2.0 | 020

triton

Development repository for the Triton language and compiler

Language: C++ | License: MIT | 010

tvm

Open deep learning compiler stack for cpu, gpu and specialized accelerators

Language: Python | License: Apache-2.0 | 040

tvm-rfcs

A home for the final text of all TVM RFCs.

License: Apache-2.0 | 010

utils

Language: Python | License: Apache-2.0 | 010

uwsampl.github.io

The UW SAMPL group's website.

Language: HTML | License: NOASSERTION | 000

web-data

020

web-llm

Bringing large-language models and chat to web browsers. Everything runs inside the browser with no server support.

Language: Python | License: Apache-2.0 | 010

web-stable-diffusion

Bringing stable diffusion models to web browsers. Everything runs inside the browser with no server support.

Language: Jupyter Notebook | License: Apache-2.0 | 010