Tianqi Chen (tqchen)

Company: CMU, OctoML

Home Page: https://tqchen.com/

Organizations
apache
dmlc
octoml
uwsampl

Tianqi Chen's starred repositories

myscaledb

An open-source, high-performance SQL vector database built on ClickHouse.

Language: C++ · License: Apache-2.0 · Stargazers: 606 · Issues: 0

grok-1

Grok open release

Language: Python · License: Apache-2.0 · Stargazers: 47662 · Issues: 0

asyncio

asyncio is a C++20 library for writing concurrent code using the async/await syntax.

Language: C++ · License: MIT · Stargazers: 761 · Issues: 0

marlin

FP16×INT4 LLM inference kernel that achieves near-ideal ~4× speedups up to medium batch sizes of 16-32 tokens.

Language: Python · License: Apache-2.0 · Stargazers: 306 · Issues: 0
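
Such a kernel fuses INT4 weight dequantization into an FP16 GEMM. A hedged plain-PyTorch reference of the computation it accelerates (the group size and layout below are illustrative assumptions, not Marlin's actual packing):

```python
import torch

def int4_gemm_reference(x, q, scales, group_size=128):
    # x: (m, k) activations; q: (k, n) INT4 weights stored as int8 in [-8, 7];
    # scales: (k // group_size, n) per-group dequantization scales.
    # Group size and layout are illustrative, not Marlin's actual format.
    k, n = q.shape
    w = q.to(x.dtype).view(k // group_size, group_size, n) * scales[:, None, :]
    return x @ w.view(k, n)  # a real kernel fuses the dequant into the GEMM

# Toy usage (float32 here; the real kernel runs FP16 activations on GPU).
m, k, n = 16, 256, 128
x = torch.randn(m, k)
q = torch.randint(-8, 8, (k, n), dtype=torch.int8)
scales = torch.rand(k // 128, n) * 0.1
y = int4_gemm_reference(x, q, scales)  # (16, 128)
```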

sglang

SGLang is a structured generation language designed for large language models (LLMs). It makes interactions with models faster and more controllable.

Language: Python · License: Apache-2.0 · Stargazers: 2251 · Issues: 0
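
A minimal sketch of SGLang's Python frontend, assuming a local SGLang server is already running on port 30000 (the prompt, token budget, and endpoint are illustrative):

```python
import sglang as sgl

@sgl.function
def qa(s, question):
    s += sgl.user(question)
    s += sgl.assistant(sgl.gen("answer", max_tokens=64))

# Assumes a local SGLang runtime serving a chat model on this endpoint.
sgl.set_default_backend(sgl.RuntimeEndpoint("http://localhost:30000"))
state = qa.run(question="What is the capital of France?")
print(state["answer"])
```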

wordflow

Social and customizable AI writing assistant! ✍️

Language: TypeScript · License: MIT · Stargazers: 129 · Issues: 0

open-webui

User-friendly WebUI for LLMs (Formerly Ollama WebUI)

Language: Svelte · License: MIT · Stargazers: 13730 · Issues: 0

tvm.tl

An extension of TVMScript for writing simple, high-performance GPU kernels with tensor cores.

Language: Python · License: Apache-2.0 · Stargazers: 45 · Issues: 0

mlx

MLX: An array framework for Apple silicon

Language: C++ · License: MIT · Stargazers: 13961 · Issues: 0
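
A small taste of MLX's NumPy-like, lazily evaluated API (the shapes and loss function are arbitrary examples):

```python
import mlx.core as mx

x = mx.random.normal((4, 8))
w = mx.random.normal((8, 2))

def loss(w):
    return mx.mean((x @ w) ** 2)

# Computation is lazy; mx.eval materializes the result.
grads = mx.grad(loss)(w)
mx.eval(grads)
print(grads.shape)  # (8, 2)
```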

flashinfer

FlashInfer: Kernel Library for LLM Serving

Language: Cuda · License: Apache-2.0 · Stargazers: 603 · Issues: 0
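
For context, this is the single-request decode-attention step that serving kernels like these accelerate, written as a plain-PyTorch reference (this shows only the math, not FlashInfer's API; head counts and dimensions are made up):

```python
import torch

def decode_attention(q, k_cache, v_cache):
    # q: (num_heads, head_dim) for the one new token;
    # k_cache, v_cache: (seq_len, num_heads, head_dim).
    scale = q.shape[-1] ** -0.5
    scores = torch.einsum("hd,shd->hs", q, k_cache) * scale  # attend over cache
    probs = scores.softmax(dim=-1)
    return torch.einsum("hs,shd->hd", probs, v_cache)

q = torch.randn(8, 64)
k = torch.randn(128, 8, 64)
v = torch.randn(128, 8, 64)
out = decode_attention(q, k, v)  # (8, 64)
```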

gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.

Language: Python · License: BSD-3-Clause · Stargazers: 5057 · Issues: 0
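
The pattern it exemplifies is a compiled, PyTorch-native decode loop. A hedged sketch of that pattern (the embedding/linear "model" below is a toy placeholder, not gpt-fast's transformer):

```python
import torch

vocab, dim = 256, 64
embed = torch.nn.Embedding(vocab, dim)
head = torch.nn.Linear(dim, vocab)

def forward(tokens):  # placeholder for a real transformer forward pass
    return head(embed(tokens))[:, -1, :]  # logits at the last position

@torch.compile  # gpt-fast leans heavily on torch.compile for speed
def decode_one(tokens):
    return forward(tokens).argmax(dim=-1, keepdim=True)  # greedy next token

tokens = torch.randint(0, vocab, (1, 4))
for _ in range(8):  # autoregressive greedy decoding
    tokens = torch.cat([tokens, decode_one(tokens)], dim=1)
print(tokens)
```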

webarena

Code repo for "WebArena: A Realistic Web Environment for Building Autonomous Agents"

Language: Python · License: Apache-2.0 · Stargazers: 533 · Issues: 0

wasm-ai

Vercel and web-llm template to run wasm models directly in the browser.

Language: TypeScript · License: Apache-2.0 · Stargazers: 81 · Issues: 0

punica

Serving multiple LoRA-finetuned LLMs as one

Language: Python · License: Apache-2.0 · Stargazers: 803 · Issues: 0
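
The core idea: every request i shares the base weight W but adds its own low-rank update, y_i = x_i W + x_i A_i B_i, so the expensive GEMM is batched once. A toy PyTorch sketch of that computation (rank and shapes are illustrative; Punica implements this as fused CUDA kernels):

```python
import torch

d, r, n_req = 64, 8, 3
W = torch.randn(d, d)         # base weight shared by all requests
A = torch.randn(n_req, d, r)  # per-request LoRA down-projections
B = torch.randn(n_req, r, d)  # per-request LoRA up-projections
x = torch.randn(n_req, d)     # one token per request

base = x @ W                                     # one shared GEMM
delta = torch.einsum("ni,nir,nro->no", x, A, B)  # per-request low-rank update
y = base + delta                                 # (n_req, d)
```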

MemGPT

Building persistent LLM agents with long-term memory 📚🦙

Language: Python · License: Apache-2.0 · Stargazers: 8563 · Issues: 0

jetson-inference

Hello AI World guide to deploying deep-learning inference networks and deep vision primitives with TensorRT and NVIDIA Jetson.

Language: C++ · License: MIT · Stargazers: 7306 · Issues: 0

ad-llama

Structured inference with Llama 2 in your browser

Language: TypeScript · License: MIT · Stargazers: 46 · Issues: 0

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python · License: Apache-2.0 · Stargazers: 1787 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 17957 · Issues: 0
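
Its offline entry point is a few lines; a minimal sketch (the model name and sampling parameters are arbitrary choices):

```python
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # any HF causal LM works here
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

outputs = llm.generate(["The capital of France is"], params)
for out in outputs:
    print(out.outputs[0].text)
```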

mlc-llm

Enable everyone to develop, optimize and deploy AI models natively on everyone's devices.

Language: Python · License: Apache-2.0 · Stargazers: 16749 · Issues: 0

cutlass_fpA_intB_gemm

A standalone GEMM kernel for FP16 activations and quantized weights, extracted from FasterTransformer

Language: C++ · License: Apache-2.0 · Stargazers: 75 · Issues: 0

llm-awq

AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration

Language: Python · License: MIT · Stargazers: 1766 · Issues: 0
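
The idea in a toy sketch: scale weight channels by a power of their average activation magnitude before round-to-nearest quantization, so salient channels lose less precision (the exponent, scaling scheme, and per-tensor step below are illustrative assumptions, not the repo's tuned implementation):

```python
import torch

def awq_style_quant(w, act_mag, n_bits=4, alpha=0.5):
    # w: (out, in) weights; act_mag: (in,) mean |activation| per input channel.
    # alpha plays the role of the salience exponent that AWQ searches over.
    s = act_mag.clamp(min=1e-5) ** alpha
    w_scaled = w * s                         # protect salient channels
    qmax = 2 ** (n_bits - 1) - 1
    step = w_scaled.abs().max() / qmax
    q = (w_scaled / step).round().clamp(-qmax - 1, qmax)
    return q, step, s  # inference computes (x / s) @ (q * step).T

w = torch.randn(32, 64)
act_mag = torch.rand(64)
q, step, s = awq_style_quant(w, act_mag)
```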

gorilla

Gorilla: An API store for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 9952 · Issues: 0

daedalOS

Desktop environment in the browser

Language: JavaScript · License: MIT · Stargazers: 8059 · Issues: 0

tokenizers-cpp

Universal cross-platform tokenizer bindings to Hugging Face tokenizers and SentencePiece

Language: C++ · License: Apache-2.0 · Stargazers: 176 · Issues: 0

zeno-build

Build, evaluate, understand, and fix LLM-based apps

Language: Jupyter Notebook · License: MIT · Stargazers: 468 · Issues: 0

react-llm

Easy-to-use headless React Hooks to run LLMs in the browser with WebGPU. Just useLLM().

Language: TypeScript · License: MIT · Stargazers: 643 · Issues: 0

ChatLLM-Web

🗣️ Chat with LLMs like Vicuna entirely in your browser with WebGPU, safely, privately, and with no server. Powered by WebLLM.

Language: JavaScript · License: MIT · Stargazers: 602 · Issues: 0

open_llama

OpenLLaMA, a permissively licensed open source reproduction of Meta AI’s LLaMA 7B trained on the RedPajama dataset

License: Apache-2.0 · Stargazers: 7186 · Issues: 0
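
Because the weights are released in Hugging Face format, loading follows the standard transformers path, roughly as in the project's README (generation settings here are arbitrary):

```python
import torch
from transformers import LlamaTokenizer, LlamaForCausalLM

model_path = "openlm-research/open_llama_7b"
tokenizer = LlamaTokenizer.from_pretrained(model_path)
model = LlamaForCausalLM.from_pretrained(model_path, torch_dtype=torch.float16)

prompt = "Q: What is the largest animal?\nA:"
input_ids = tokenizer(prompt, return_tensors="pt").input_ids
output = model.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0]))
```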

argparse

Argument Parser for Modern C++

Language: C++ · License: MIT · Stargazers: 2353 · Issues: 0