Peng Wu (pengwu22)

Company: ByteDance

Location: Mountain View

Peng Wu's starred repositories

open-interpreter

A natural language interface for computers

Language: Python · License: AGPL-3.0 · Stargazers: 52454 · Issues: 0

Mooncake

Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.

Stargazers: 1066 · Issues: 0

Quest

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Language: Cuda · Stargazers: 173 · Issues: 0

llm.c

LLM training in simple, raw C/CUDA

Language: Cuda · License: MIT · Stargazers: 23702 · Issues: 0

JetStream

JetStream is a throughput and memory optimized engine for LLM inference on XLA devices, starting with TPUs (and GPUs in future -- PRs welcome).

Language: Python · License: Apache-2.0 · Stargazers: 202 · Issues: 0

gemma_pytorch

The official PyTorch implementation of Google's Gemma models

Language: Python · License: Apache-2.0 · Stargazers: 5248 · Issues: 0

DeepSeek-Coder

DeepSeek Coder: Let the Code Write Itself

Language: Python · License: MIT · Stargazers: 6620 · Issues: 0

basic-pitch

A lightweight yet powerful audio-to-MIDI converter with pitch bend detection

Language: Python · License: Apache-2.0 · Stargazers: 3371 · Issues: 0

photoprism

AI-Powered Photos App for the Decentralized Web 🌈💎✨

Language: Go · License: NOASSERTION · Stargazers: 34897 · Issues: 0

codellama

Inference code for CodeLlama models

Language: Python · License: NOASSERTION · Stargazers: 15925 · Issues: 0

BladeDISC

BladeDISC is an end-to-end DynamIc Shape Compiler project for machine learning workloads.

Language: C++ · License: Apache-2.0 · Stargazers: 802 · Issues: 0

nccl

Optimized primitives for collective multi-GPU communication

Language: C++ · License: NOASSERTION · Stargazers: 3155 · Issues: 0
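
NCCL itself is a C library; most users reach it through a framework backend such as PyTorch's torch.distributed. A minimal sketch of an NCCL-backed all-reduce, assuming a multi-GPU host and a launch via torchrun (the tensor contents are illustrative):

```python
# Minimal sketch: an all-reduce over the NCCL backend via torch.distributed.
# Assumes PyTorch built with CUDA/NCCL support; launch with
#   torchrun --nproc_per_node=<num_gpus> this_script.py
import os
import torch
import torch.distributed as dist

def main():
    # torchrun sets RANK, WORLD_SIZE, and LOCAL_RANK in the environment.
    dist.init_process_group(backend="nccl")
    local_rank = int(os.environ["LOCAL_RANK"])
    torch.cuda.set_device(local_rank)

    # Each rank holds a tensor filled with its rank; after all_reduce(SUM)
    # every rank holds 0 + 1 + ... + (world_size - 1).
    x = torch.full((4,), float(dist.get_rank()), device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)
    print(f"rank {dist.get_rank()}: {x}")

    dist.destroy_process_group()

if __name__ == "__main__":
    main()
```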

audiocraft

Audiocraft is a library for audio processing and generation with deep learning. It features the state-of-the-art EnCodec audio compressor / tokenizer, along with MusicGen, a simple and controllable music generation LM with textual and melodic conditioning.

Language: Python · License: MIT · Stargazers: 20696 · Issues: 0

bitsandbytes

Accessible large language models via k-bit quantization for PyTorch.

Language: Python · License: MIT · Stargazers: 6129 · Issues: 0
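
A minimal sketch of the common usage path through the Hugging Face transformers integration, assuming the transformers and bitsandbytes packages and a CUDA GPU; the model id is a placeholder:

```python
# Load a causal LM with 4-bit (NF4) weight quantization via bitsandbytes.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_id = "facebook/opt-1.3b"  # placeholder; substitute any causal LM you can access

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # quantize weights to 4 bits
    bnb_4bit_quant_type="nf4",              # NormalFloat4 data type
    bnb_4bit_compute_dtype=torch.bfloat16,  # run matmuls in bf16
)

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=bnb_config,
    device_map="auto",
)

inputs = tokenizer("Quantization lets a large model fit on", return_tensors="pt").to(model.device)
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=20)[0]))
```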

llama.cpp

LLM inference in C/C++

Language: C++ · License: MIT · Stargazers: 65780 · Issues: 0

vllm

A high-throughput and memory-efficient inference and serving engine for LLMs

Language: Python · License: Apache-2.0 · Stargazers: 27809 · Issues: 0
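
A minimal sketch of vLLM's offline batch-inference API (LLM plus SamplingParams); the model id and sampling values are placeholders, and defaults vary between releases:

```python
# Offline batched generation with vLLM.
from vllm import LLM, SamplingParams

llm = LLM(model="facebook/opt-125m")  # placeholder model
params = SamplingParams(temperature=0.8, top_p=0.95, max_tokens=64)

prompts = [
    "High-throughput LLM serving requires",
    "Continuous batching improves throughput because",
]
for output in llm.generate(prompts, params):
    print(output.prompt, "->", output.outputs[0].text)
```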

FastChat

An open platform for training, serving, and evaluating large language models. Release repo for Vicuna and Chatbot Arena.

Language: Python · License: Apache-2.0 · Stargazers: 36584 · Issues: 0

stable-diffusion-webui

Stable Diffusion web UI

Language: Python · License: AGPL-3.0 · Stargazers: 140644 · Issues: 0

ControlNet

Let us control diffusion models!

Language: Python · License: Apache-2.0 · Stargazers: 29942 · Issues: 0

lora

Using Low-rank adaptation to quickly fine-tune diffusion models.

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 6991 · Issues: 0
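
The repository targets diffusion models; the sketch below is not its API, just a generic PyTorch illustration of the low-rank adaptation idea it builds on: a frozen pretrained weight plus a trainable low-rank update.

```python
# Generic LoRA illustration (NOT the lora repository's API):
# y = base(x) + (alpha / r) * x @ A^T @ B^T, with the base layer frozen.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wrap a frozen nn.Linear with a trainable low-rank update."""
    def __init__(self, base: nn.Linear, r: int = 4, alpha: float = 1.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad_(False)          # freeze the pretrained layer
        self.lora_a = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # zero init: starts as a no-op
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.lora_a.T @ self.lora_b.T)

layer = LoRALinear(nn.Linear(768, 768))
print(layer(torch.randn(2, 768)).shape)  # torch.Size([2, 768])
```

Only the two small matrices are trained, which is why LoRA fine-tunes quickly and produces adapters far smaller than the base model.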

opennurbs

OpenNURBS libraries allow anyone to read and write the 3DM file format without the need for Rhino.

Language: C++ · License: NOASSERTION · Stargazers: 409 · Issues: 0

evals

Evals is a framework for evaluating LLMs and LLM systems, and an open-source registry of benchmarks.

Language: Python · License: NOASSERTION · Stargazers: 14760 · Issues: 0

FlexiGen

Running large language models on a single GPU for throughput-oriented scenarios.

Language: Python · License: Apache-2.0 · Stargazers: 9149 · Issues: 0

byteir

A model compilation solution for various hardware

Language: MLIR · License: Apache-2.0 · Stargazers: 362 · Issues: 0

sheetsage

Transcribe music into lead sheets!

Language: Python · License: NOASSERTION · Stargazers: 301 · Issues: 0

triton

Development repository for the Triton language and compiler

Language: C++ · License: MIT · Stargazers: 12921 · Issues: 0
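
A minimal sketch in the style of Triton's introductory tutorial: an element-wise add kernel launched over a 1-D grid, assuming a CUDA GPU and the triton package.

```python
# Element-wise vector add written as a Triton kernel.
import torch
import triton
import triton.language as tl

@triton.jit
def add_kernel(x_ptr, y_ptr, out_ptr, n_elements, BLOCK_SIZE: tl.constexpr):
    pid = tl.program_id(axis=0)                       # which block this program handles
    offsets = pid * BLOCK_SIZE + tl.arange(0, BLOCK_SIZE)
    mask = offsets < n_elements                       # guard the ragged tail
    x = tl.load(x_ptr + offsets, mask=mask)
    y = tl.load(y_ptr + offsets, mask=mask)
    tl.store(out_ptr + offsets, x + y, mask=mask)

def add(x: torch.Tensor, y: torch.Tensor) -> torch.Tensor:
    out = torch.empty_like(x)
    n = out.numel()
    grid = (triton.cdiv(n, 1024),)                    # one program per 1024 elements
    add_kernel[grid](x, y, out, n, BLOCK_SIZE=1024)
    return out

a = torch.rand(4096, device="cuda")
b = torch.rand(4096, device="cuda")
assert torch.allclose(add(a, b), a + b)
```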

llama_index

LlamaIndex is a data framework for your LLM applications

Language: Python · License: MIT · Stargazers: 35857 · Issues: 0
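
A minimal sketch of the LlamaIndex quickstart, assuming a recent release where the top-level modules live under llama_index.core and an embedding/LLM backend (e.g. an OpenAI API key) is configured; the data directory and question are placeholders:

```python
# Index a folder of documents and query it with LlamaIndex.
from llama_index.core import SimpleDirectoryReader, VectorStoreIndex

documents = SimpleDirectoryReader("data").load_data()   # "./data" is a placeholder folder
index = VectorStoreIndex.from_documents(documents)      # embed and index the documents

query_engine = index.as_query_engine()
print(query_engine.query("What do these documents say about serving LLMs?"))
```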

DeepSpeed-MII

MII makes low-latency and high-throughput inference possible, powered by DeepSpeed.

Language: Python · License: Apache-2.0 · Stargazers: 1853 · Issues: 0

google-research

Google Research

Language: Jupyter Notebook · License: Apache-2.0 · Stargazers: 33973 · Issues: 0