ht-zhou

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

License: Apache-2.0

lightseq

LightSeq: A High Performance Library for Sequence Processing and Generation

Language: C++ | License: NOASSERTION

MNN

MNN is a blazing fast, lightweight deep learning framework, battle-tested by business-critical use cases in Alibaba

Language: C++

model-quantization

A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research and is continuously being improved. PRs adding works (papers, repositories) that the repo is missing are welcome.

Nonuniform-to-Uniform-Quantization

Nonuniform-to-Uniform Quantization: Towards Accurate Quantization via Generalized Straight-Through Estimation. In CVPR 2022.

Language: Python

parallelformers

Parallelformers: An Efficient Model Parallelization Toolkit for Deployment

Language: Python | License: Apache-2.0

prm800k

800,000 step-level correctness labels on LLM solutions to MATH problems

Language: Python | License: MIT

smoothquant

SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models

Language: Python | License: MIT

stable-baselines3

PyTorch version of Stable Baselines, reliable implementations of reinforcement learning algorithms.

Language: Python | License: MIT

TensorRT_Tutorial

Language: C++

transformers

🤗 Transformers: State-of-the-art Machine Learning for PyTorch, TensorFlow, and JAX.

License: Apache-2.0

trlx

A repo for distributed training of language models with Reinforcement Learning via Human Feedback (RLHF)

Language: Python | License: MIT

BlueRum's repositories

awesome-Auto-Parallelism

awesome-RLHF

binary-bert

diffusers

EnergonAI

Best-README-Template

binary-quantization-Meta

bitsandbytes

ColoBloom

ColossalAI

Colossalai-Bloom

ColossalAI-Examples

DeepSpeed

FQ-ViT

ht-zhou

InfiAgent.github.io

Int8TP

leecode

lightllm

lightseq

MNN

model-quantization

Nonuniform-to-Uniform-Quantization

parallelformers

prm800k

smoothquant

stable-baselines3

TensorRT_Tutorial

transformers

trlx