ChaosCodes's starred repositories
attention-gym
Helpful tools and examples for working with flex-attention
torchtitan
A native PyTorch library for large model training
Knowledge-Conflicts-Survey
[EMNLP 2024] The official GitHub repo for the survey paper "Knowledge Conflicts for LLMs: A Survey"
TensorRT-Model-Optimizer
TensorRT Model Optimizer is a unified library of state-of-the-art model optimization techniques such as quantization, pruning, distillation, etc. It compresses deep learning models for downstream deployment frameworks like TensorRT-LLM or TensorRT to optimize inference speed on NVIDIA GPUs.
ThunderKittens
Tile primitives for speedy kernels
flash-linear-attention
Efficient implementations of state-of-the-art linear attention models in PyTorch and Triton
EasyContext
Memory-optimization and training recipes for extrapolating language models' context length to 1 million tokens, with minimal hardware.
x-transformers
A concise but complete full-attention transformer with a set of promising experimental features from various papers
scattermoe
Triton-based implementation of Sparse Mixture of Experts.
LLaMA-Factory
Efficiently Fine-Tune 100+ LLMs in WebUI (ACL 2024)
sailor-llm
⚓️ Sailor: Open Language Models for South-East Asia
DataDreamer
DataDreamer: Prompt. Generate Synthetic Data. Train & Align Models. 🤖💤
text-dedup
All-in-one text de-duplication
tangent_task_arithmetic
Source code of "Task arithmetic in the tangent space: Improved editing of pre-trained models".