Beast code in Giters

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and we are continuously improving the project. PRs adding works (papers, repositories) that the repo has missed are welcome.

182600

kompute

General-purpose GPU compute framework built on Vulkan to support 1000s of cross-vendor graphics cards (AMD, Qualcomm, NVIDIA & friends). Blazing fast, mobile-enabled, asynchronous and optimized for advanced GPU data processing use cases. Backed by the Linux Foundation.

Language: C++ | Apache-2.0 | 196500

opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Language: Python | Apache-2.0 | 383200

persona-hub

Official repo for the paper "Scaling Synthetic Data Creation with 1,000,000,000 Personas"

Language: Python | 80600

pyreft

ReFT: Representation Finetuning for Language Models

Language: Python | Apache-2.0 | 111600

redco

NAACL '24 (Best Demo Paper Runner-Up) / MLSys @ NeurIPS '23 - RedCoast: A Lightweight Tool to Automate Distributed Training and Inference

Language: Python | Apache-2.0 | 5800

Perplexica

Perplexica is an AI-powered search engine. It is an open-source alternative to Perplexity AI.

Language: TypeScript | MIT | 1376500

optimum-nvidia

Language: Python | Apache-2.0 | 88600

weight-selection

Language: Python | 16600

FoFo

Language: Python | Apache-2.0 | 1600

unsloth

Finetune Llama 3.2, Mistral, Phi & Gemma LLMs 2-5x faster with 80% less memory

Language: Python | Apache-2.0 | 1638500

mambaformer-icl

MambaFormer in-context learning experiments and implementation for https://arxiv.org/abs/2402.04248

Language: Python | Apache-2.0 | 3300

OSWorld

OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments

Language: Python | Apache-2.0 | 113700

SWE-agent

SWE-agent takes a GitHub issue and tries to automatically fix it, using GPT-4, or your LM of choice. It can also be employed for offensive cybersecurity or competitive coding challenges.

Language: Python | MIT | 1339500

Triton-Puzzles

Puzzles for learning Triton

Language: Jupyter Notebook | Apache-2.0 | 101500

lightning-thunder

Make PyTorch models up to 40% faster! Thunder is a source-to-source compiler for PyTorch. It enables using different hardware executors at once, across one or thousands of GPUs.

Language: Python | Apache-2.0 | 114900

modules

🧩 Official registry of Rivet Modules.

Language: TypeScript | Apache-2.0 | 11000

Open-Sora

Open-Sora: Democratizing Efficient Video Production for All

Language: Python | Apache-2.0 | 2176300

transformer-debugger

Language: Python | MIT | 401700

wns823

SJYang's starred repositories

llm-sp

ADAS

llm-compressor

sgcrl

LLM-QAT

torchchat

hqq

qllm-eval

llama-stack

llama-stack-apps

llmtools

awesome-model-quantization