vLLM (vllm-project)

Organization data from GitHub: https://github.com/vllm-project

vLLM's repositories

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python | License: Apache-2.0 | Stargazers: 62,533 | Issues: 452 | Issues: 11,822
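Much of vLLM's throughput comes from continuous batching: finished sequences leave the running batch at every decode step and queued requests take their freed slots immediately, instead of the whole batch draining before new work starts. A minimal, illustrative scheduler sketch (not vLLM's actual scheduler; the function name and `max_batch` knob are invented for illustration):

```python
from collections import deque

def continuous_batching(requests, max_batch=2):
    """Toy continuous-batching loop.

    `requests` maps request id -> number of tokens to generate.
    Each iteration runs one decode step for every in-flight request;
    finished requests leave the batch that same step, and queued
    requests are admitted as soon as a slot is free.
    Returns the per-step list of request ids that ran.
    """
    queue = deque(requests.items())
    running = {}   # request id -> tokens still to generate
    schedule = []  # per-step record of which requests ran
    while queue or running:
        # admit queued requests into any free batch slots
        while queue and len(running) < max_batch:
            rid, need = queue.popleft()
            running[rid] = need
        schedule.append(sorted(running))
        # one decode step for every running request
        for rid in list(running):
            running[rid] -= 1
            if running[rid] == 0:
                del running[rid]  # slot freed; refilled next step
    return schedule
```

With a static batch, request "c" would wait until both "a" and "b" finished; here it joins as soon as "a" completes.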

aibrix

Cost-efficient and pluggable Infrastructure components for GenAI inference

Language: Go | License: Apache-2.0 | Stargazers: 4,358 | Issues: 46 | Issues: 760

llm-compressor

Transformers-compatible library for applying various compression algorithms to LLMs for optimized deployment with vLLM

Language: Python | License: Apache-2.0 | Stargazers: 2,212 | Issues: 24 | Issues: 451

semantic-router

Intelligent router for mixture-of-models

Language: Rust | License: Apache-2.0 | Stargazers: 2,199 | Issues: 46 | Issues: 208

production-stack

vLLM's reference system for K8s-native, cluster-wide deployment with community-driven performance optimization

Language: Python | License: Apache-2.0 | Stargazers: 1,924 | Issues: 25 | Issues: 228

vllm-ascend

Community-maintained hardware plugin for vLLM on Ascend

Language: Python | License: Apache-2.0 | Stargazers: 1,326 | Issues: 16 | Issues: 998

guidellm

Evaluate and enhance your LLM deployments for real-world inference needs

Language: Python | License: Apache-2.0 | Stargazers: 690 | Issues: 17 | Issues: 155

recipes

Common recipes to run vLLM

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 216 | Issues: 1 | Issues: 7

compressed-tensors

A safetensors extension to efficiently store sparse quantized tensors on disk

Language: Python | License: Apache-2.0 | Stargazers: 195 | Issues: 12 | Issues: 35

tpu-inference

TPU inference for vLLM, with unified JAX and PyTorch support.

Language: Python | License: Apache-2.0 | Stargazers: 156 | Issues: 0 | Issues: 0

flash-attention

Fast and memory-efficient exact attention

Language: Python | License: BSD-3-Clause | Stargazers: 97 | Issues: 3 | Issues: 0
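The "exact attention" these kernels accelerate is standard scaled dot-product attention, softmax(QKᵀ/√d)V; FlashAttention computes the same result while tiling the work to avoid materializing the full score matrix. A pure-Python reference of the math (not the optimized kernel; function name and list-of-rows representation are chosen here for illustration):

```python
import math

def attention(Q, K, V):
    """Reference scaled dot-product attention: softmax(Q K^T / sqrt(d)) V.

    Q, K, V are lists of row vectors (lists of floats). Returns one
    output row per query row. Purely illustrative: O(n^2) memory,
    which is exactly what FlashAttention's tiled kernels avoid.
    """
    d = len(Q[0])
    out = []
    for q in Q:
        # scores between this query and every key, scaled by sqrt(d)
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        weights = [e / z for e in exps]
        # weighted sum of value rows
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```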

speculators

A unified library for building, evaluating, and storing speculative decoding algorithms for LLM inference in vLLM

Language: Python | License: Apache-2.0 | Stargazers: 64 | Issues: 0 | Issues: 0
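The core idea behind speculative decoding is a draft-and-verify loop: a cheap draft model proposes several tokens, and the target model keeps the longest prefix it agrees with, so one target pass can yield multiple tokens. A toy greedy sketch of that loop (not speculators' API; real schemes use probabilistic rejection sampling rather than exact-match acceptance, and all names here are invented):

```python
def speculative_decode(target, draft, prompt, k=4, max_new=8):
    """Toy greedy draft-and-verify loop.

    `target` and `draft` are callables: token context (list) -> next token.
    The draft proposes k tokens; the target keeps the longest prefix it
    agrees with, plus one corrected token on the first mismatch.
    """
    tokens = list(prompt)
    while len(tokens) - len(prompt) < max_new:
        # draft phase: propose k tokens autoregressively
        proposal, ctx = [], list(tokens)
        for _ in range(k):
            t = draft(ctx)
            proposal.append(t)
            ctx.append(t)
        # verify phase: accept the longest prefix the target also predicts
        accepted, ctx = [], list(tokens)
        for t in proposal:
            if target(ctx) == t:
                accepted.append(t)
                ctx.append(t)
            else:
                break
        # on a mismatch, the target's own token is emitted instead
        if len(accepted) < len(proposal):
            accepted.append(target(ctx))
        tokens.extend(accepted)
    return tokens[:len(prompt) + max_new]
```

When draft and target agree, each verify round advances k tokens; when the draft is useless, the loop degrades gracefully to one target token per round.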

dashboard

vLLM performance dashboard

Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 1 | Issues: 0

vllm-spyre

Community-maintained hardware plugin for vLLM on Spyre

Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 8 | Issues: 27

ci-infra

This repo hosts code for vLLM's CI and performance-benchmark infrastructure.

Language: HCL | License: Apache-2.0 | Stargazers: 26 | Issues: 0 | Issues: 0

vllm-nccl

Manages the vllm-nccl dependency

Language: Python | License: Apache-2.0 | Stargazers: 17 | Issues: 1 | Issues: 3

vllm-gaudi

Community-maintained hardware plugin for vLLM on Intel Gaudi

Language: Python | License: Apache-2.0 | Stargazers: 15 | Issues: 0 | Issues: 0

vllm-neuron

Community-maintained hardware plugin for vLLM on AWS Neuron

Language: Python | License: Apache-2.0 | Stargazers: 11 | Issues: 0 | Issues: 0

vllm-xpu-kernels

vLLM XPU kernels for Intel GPUs

Language: C++ | License: Apache-2.0 | Stargazers: 11 | Issues: 3 | Issues: 0

media-kit

vLLM Logo Assets

DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

License: MIT | Stargazers: 0 | Issues: 0 | Issues: 0