ZZK's repositories
BitBLAS
BitBLAS is a library to support mixed-precision matrix multiplications, especially for quantized LLM deployment.
EETQ
Easy and Efficient Quantization for Transformers
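Weight-only int8 quantization of the kind such libraries target can be illustrated with a tiny NumPy sketch. This uses per-row symmetric scales; the function names are illustrative, not EETQ's actual API, which runs fused CUDA kernels rather than NumPy.

```python
import numpy as np

def quantize_int8(w):
    """Per-row symmetric int8 quantization of a 2-D weight matrix.
    Returns the int8 codes and the per-row float scales."""
    # One scale per output row; epsilon guards all-zero rows.
    scale = np.maximum(np.abs(w).max(axis=1, keepdims=True) / 127.0, 1e-8)
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize_int8(q, scale):
    """Recover an approximate float32 weight matrix."""
    return q.astype(np.float32) * scale
```

The round-trip error per element is bounded by half a quantization step (scale / 2), which is why weight-only int8 usually preserves accuracy for inference.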
fast-hadamard-transform
Fast Hadamard transform in CUDA, with a PyTorch interface
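The transform this repo accelerates has a classic O(n log n) butterfly algorithm, the fast Walsh-Hadamard transform. Below is a minimal pure-Python sketch of the unnormalized transform for intuition; it is independent of the repo's CUDA kernels and PyTorch interface.

```python
def fwht(a):
    """Unnormalized fast Walsh-Hadamard transform of a length-2^k sequence.
    Equivalent to multiplying by the Hadamard matrix H_n, in O(n log n)."""
    a = list(a)
    n = len(a)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    h = 1
    while h < n:
        # Butterfly: combine pairs at distance h.
        for i in range(0, n, h * 2):
            for j in range(i, i + h):
                x, y = a[j], a[j + h]
                a[j], a[j + h] = x + y, x - y
        h *= 2
    return a
```

Applying the transform twice scales the input by n, since H_n H_n = n I.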
faster-nougat
An implementation of Nougat focused on processing PDFs locally.

flux
A fast communication-overlapping library for tensor parallelism on GPUs.
kvikio
KvikIO - High Performance File IO
lightning-thunder
Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch that enables using different hardware executors at once, across one or thousands of GPUs.
LLM101n
LLM101n: Let's build a Storyteller
MInference
To speed up long-context LLM inference, MInference computes attention with approximate, dynamic sparsity, reducing pre-filling latency by up to 10x on an A100 while maintaining accuracy.
nvmath-python
NVIDIA Math Libraries for the Python Ecosystem
one-api
An OpenAI API management & distribution system supporting Azure, Anthropic Claude, Google PaLM 2 & Gemini, Zhipu ChatGLM, Baidu ERNIE Bot, iFlytek Spark, Alibaba Tongyi Qianwen, 360 Zhinao, and Tencent Hunyuan. Useful for secondary distribution and management of API keys; ships as a single executable with a prebuilt Docker image for one-click deployment, ready to use out of the box. Exposes a single API for all LLMs and features an English UI.
QuaRot
Code for QuaRot, an end-to-end 4-bit inference scheme for large language models.
sarathi-serve
A low-latency & high-throughput serving engine for LLMs
SpeculativeDecodingPapers
📰 Must-read papers and blogs on Speculative Decoding ⚡️
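The core draft-then-verify idea behind these papers can be sketched in a few lines. This is a toy greedy variant, assuming `target_next` and `draft_next` are hypothetical stand-ins for a model's greedy next-token function; a real implementation scores all draft tokens in one batched forward pass rather than a Python loop.

```python
def speculative_decode_greedy(target_next, draft_next, prompt, k=4, max_new=16):
    """Greedy speculative decoding: a cheap draft model proposes k tokens,
    the target model verifies them. The output is identical to greedy
    decoding with the target alone; the speedup comes from verifying
    several draft tokens per target step."""
    seq = list(prompt)
    limit = len(prompt) + max_new
    while len(seq) < limit:
        # Draft proposes k tokens autoregressively.
        ctx = list(seq)
        proposal = []
        for _ in range(k):
            t = draft_next(ctx)
            proposal.append(t)
            ctx.append(t)
        # Target verifies: accept the longest agreeing prefix, then emit its
        # own token at the first disagreement (or one bonus token if every
        # draft token was accepted).
        ctx = list(seq)
        for t in proposal:
            want = target_next(ctx)
            ctx.append(want)
            if want != t:
                break
        else:
            ctx.append(target_next(ctx))
        seq = ctx[:limit]
    return seq
```

Because every emitted token is the target's own greedy choice, the method is lossless: a poor draft model only costs speed, never output quality.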
SpinQuant
Code repo for the paper "SpinQuant: LLM quantization with learned rotations"
TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization and sparsity. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
ThunderKittens
Tile primitives for speedy kernels
tiny-gpu
A minimal GPU design in Verilog to learn how GPUs work from the ground up
triton-linalg
Development repository for the Triton-Linalg conversion
unsloth
Finetune Llama 3, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory
vidur
A large-scale simulation framework for LLM inference