Tianqi Chen (tqchen)

Company: CMU, OctoML

Home Page: https://tqchen.com/

Organizations
apache
dmlc
octoml
uwsampl

Tianqi Chen's starred repositories

myscaledb

An open-source, high-performance SQL vector database built on ClickHouse.

Language: C++ · License: Apache-2.0 · Stargazers: 606 · Issues: 0

grok-1

Grok open release

Language: Python · License: Apache-2.0 · Stargazers: 47662 · Issues: 0

asyncio

asyncio is a C++20 library for writing concurrent code using the async/await syntax.

Language: C++ · License: MIT · Stargazers: 761 · Issues: 0

marlin

FP16×INT4 LLM inference kernel that achieves near-ideal ~4× speedups up to medium batch sizes of 16-32 tokens.

Language: Python · License: Apache-2.0 · Stargazers: 306 · Issues: 0
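
Such a kernel fuses INT4 weight dequantization into an FP16 GEMM. A hedged plain-PyTorch reference of the computation it accelerates (the group size and layout below are illustrative assumptions, not Marlin's actual packing):

```python
import torch

def int4_gemm_reference(x, q, scales, group_size=128):
    # x: (m, k) activations; q: (k, n) INT4 weights stored as int8 in [-8, 7];
    # scales: (k // group_size, n) per-group dequantization scales.
    # Group size and layout are illustrative, not Marlin's actual format.
    k, n = q.shape
    w = q.to(x.dtype).view(k // group_size, group_size, n) * scales[:, None, :]
    return x @ w.view(k, n)  # a real kernel fuses the dequant into the GEMM

# Toy usage (float32 here; the real kernel runs FP16 activations on GPU).
m, k, n = 16, 256, 128
x = torch.randn(m, k)
q = torch.randint(-8, 8, (k, n), dtype=torch.int8)
scales = torch.rand(k // 128, n) * 0.1
y = int4_gemm_reference(x, q, scales)  # (16, 128)
```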

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes interactions with models faster and more controllable.

Language: Python · License: Apache-2.0 · Stargazers: 2251 · Issues: 0
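
A minimal sketch of SGLang's Python frontend, assuming a local SGLang server is already running on port 30000 (the prompt, token budget, and endpoint are illustrative):

```python
import sglang as sgl

@sgl.function
def qa(s, question):
    s += sgl.user(question)
    s += sgl.assistant(sgl.gen("answer", max_tokens=64))

# Assumes a local SGLang runtime serving a chat model on this endpoint.
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
state = qa.run(question="What is the capital of France?")
print(state["answer"])
```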

wordflow

Social and customizable AI writing assistant! ✍️

Language: TypeScript · License: MIT · Stargazers: 129 · Issues: 0

open-webui

User-friendly WebUI for LLMs (Formerly Ollama WebUI)

Language: Svelte · License: MIT · Stargazers: 13730 · Issues: 0

tvm.tl

An extension of TVMScript for writing simple, high-performance GPU kernels with tensor cores.

Language: Python · License: Apache-2.0 · Stargazers: 45 · Issues: 0

mlx

MLX: An array framework for Apple silicon

Language: C++ · License: MIT · Stargazers: 13961 · Issues: 0
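
A small taste of MLX's NumPy-like, lazily evaluated API (the shapes and loss function are arbitrary examples):

```python
import mlx.core as mx

x = mx.random.normal((4, 8))
w = mx.random.normal((8, 2))

def loss(w):
    return mx.mean((x @ w) ** 2)

# Computation is lazy; mx.eval materializes the result.
grads = mx.grad(loss)(w)
mx.eval(grads)
print(grads.shape)  # (8, 2)
```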

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda · License: Apache-2.0 · Stargazers: 603 · Issues: 0
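
For context, this is the single-request decode-attention step that serving kernels like these accelerate, written as a plain-PyTorch reference (this shows only the math, not FlashInfer's API; head counts and dimensions are made up):

```python
import torch

def decode_attention(q, k_cache, v_cache):
    # q: (num_heads, head_dim) for the one new token;
    # k_cache, v_cache: (seq_len, num_heads, head_dim).
    scale = q.shape[-1] ** -0.5
    scores = torch.einsum("hd,shd->hs", q, k_cache) * scale  # attend over cache
    probs = scores.softmax(dim=-1)
    return torch.einsum("hs,shd->hd", probs, v_cache)

q = torch.randn(8, 64)
k = torch.randn(128, 8, 64)
v = torch.randn(128, 8, 64)
out = decode_attention(q, k, v)  # (8, 64)
```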

gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Language: Python · License: BSD-3-Clause · Stargazers: 5057 · Issues: 0
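
The pattern it exemplifies is a compiled, PyTorch-native decode loop. A hedged sketch of that pattern (the embedding/linear "model" below is a toy placeholder, not gpt-fast's transformer):

```python
import torch

vocab, dim = 256, 64
embed = torch.nn.Embedding(vocab, dim)
head = torch.nn.Linear(dim, vocab)

def forward(tokens):  # placeholder for a real transformer forward pass
    return head(embed(tokens))[:, -1, :]  # logits at the last position

@torch.compile  # gpt-fast leans heavily on torch.compile for speed
def decode_one(tokens):
    return forward(tokens).argmax(dim=-1, keepdim=True)  # greedy next token

tokens = torch.randint(0, vocab, (1, 4))
for _ in range(8):  # autoregressive greedy decoding
    tokens = torch.cat([tokens, decode_one(tokens)], dim=1)
print(tokens)
```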

webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Language: Python · License: Apache-2.0 · Stargazers: 533 · Issues: 0

wasm-ai

Vercel and web-llm template to run wasm models directly in the browser.

Language: TypeScript · License: Apache-2.0 · Stargazers: 81 · Issues: 0

punica

Serving multiple LoRA-finetuned LLMs as one

Language: Python · License: Apache-2.0 · Stargazers: 803 · Issues: 0
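
The core idea: every request i shares the base weight W but adds its own low-rank update, y_i = x_i W + x_i A_i B_i, so the expensive GEMM is batched once. A toy PyTorch sketch of that computation (rank and shapes are illustrative; Punica implements this as fused CUDA kernels):

```python
import torch

d, r, n_req = 64, 8, 3
W = torch.randn(d, d)         # base weight shared by all requests
A = torch.randn(n_req, d, r)  # per-request LoRA down-projections
B = torch.randn(n_req, r, d)  # per-request LoRA up-projections
x = torch.randn(n_req, d)     # one token per request

base = x @ W                                     # one shared GEMM
delta = torch.einsum("ni,nir,nro->no", x, A, B)  # per-request low-rank update
y = base + delta                                 # (n_req, d)
```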

MemGPT

Building persistent LLM agents with long-term memory 📚🦙

Language: Python · License: Apache-2.0 · Stargazers: 8563 · Issues: 0

jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Language: C++ · License: MIT · Stargazers: 7306 · Issues: 0

ad-llama

Structured inference with Llama 2 in your browser

Language: TypeScript · License: MIT · Stargazers: 46 · Issues: 0

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python · License: Apache-2.0 · Stargazers: 1787 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 17957 · Issues: 0
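
Its offline entry point is a few lines; a minimal sketch (the model name and sampling parameters are arbitrary choices):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # any HF causal LM works here
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```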

mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Language: Python · License: Apache-2.0 · Stargazers: 16749 · Issues: 0

cutlass_fpA_intB_gemm

A standalone GEMM kernel for FP16 activations and quantized weights, extracted from FasterTransformer

Language: C++ · License: Apache-2.0 · Stargazers: 75 · Issues: 0

llm-awq

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python · License: MIT · Stargazers: 1766 · Issues: 0
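
The idea in a toy sketch: scale weight channels by a power of their average activation magnitude before round-to-nearest quantization, so salient channels lose less precision (the exponent, scaling scheme, and per-tensor step below are illustrative assumptions, not the repo's tuned implementation):

```python
import torch

def awq_style_quant(w, act_mag, n_bits=4, alpha=0.5):
    # w: (out, in) weights; act_mag: (in,) mean |activation| per input channel.
    # alpha plays the role of the salience exponent that AWQ searches over.
    s = act_mag.clamp(min=1e-5) ** alpha
    w_scaled = w * s                         # protect salient channels
    qmax = 2 ** (n_bits - 1) - 1
    step = w_scaled.abs().max() / qmax
    q = (w_scaled / step).round().clamp(-qmax - 1, qmax)
    return q, step, s  # inference computes (x / s) @ (q * step).T

w = torch.randn(32, 64)
act_mag = torch.rand(64)
q, step, s = awq_style_quant(w, act_mag)
```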

gorilla

Gorilla: An API store for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 9952 · Issues: 0

daedalOS

Desktop environment in the browser

Language: JavaScript · License: MIT · Stargazers: 8059 · Issues: 0

tokenizers-cpp

Universal cross-platform tokenizer bindings to Hugging Face tokenizers and SentencePiece

Language: C++ · License: Apache-2.0 · Stargazers: 176 · Issues: 0

zeno-build

Build, evaluate, understand, and fix LLM-based apps

Language: Jupyter Notebook · License: MIT · Stargazers: 468 · Issues: 0

react-llm

Easy-to-use headless React Hooks to run LLMs in the browser with WebGPU. Just useLLM().

Language: TypeScript · License: MIT · Stargazers: 643 · Issues: 0

ChatLLM-Web

🗣️ Chat with LLMs like Vicuna entirely in your browser with WebGPU, safely, privately, and with no server. Powered by WebLLM.

Language: JavaScript · License: MIT · Stargazers: 602 · Issues: 0

open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset

License: Apache-2.0 · Stargazers: 7186 · Issues: 0
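
Because the weights are released in Hugging Face format, loading follows the standard transformers path, roughly as in the project's README (generation settings here are arbitrary):

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "openlm-research/open_llama_7b"
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)

prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0]))
```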

argparse

Argument Parser for Modern C++

Language: C++ · License: MIT · Stargazers: 2353 · Issues: 0