ModelTC

ModelTC's repositories

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | Apache-2.0 | 2113 24 174

MQBench

Model Quantization Benchmark

Language: Shell | Apache-2.0 | 741 14 196

United-Perception

United Perception

Language: Python | Apache-2.0 | 427 20 65

llmc

This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit"

Language: Python | Apache-2.0 | 128 9 3

Dipoorlet

Offline quantization tools for deployment.

Language: Python | Apache-2.0 | 108 16 9

awesome-lm-system

Summary of system papers, frameworks, code, and tools for training or serving large models

Apache-2.0 | 55 90

TFMQ-DM

[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".

Language: Jupyter Notebook | Apache-2.0 | 48 10 4

NART

NART = NART is not A RunTime, a deep learning inference framework.

Language: Python | Apache-2.0 | 37 10 1

Outlier_Suppression_Plus

Official implementation of the EMNLP23 paper: Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling

Language: Python | MIT | 35 8 6

NNLQP

Language: Python | Apache-2.0 | 33 2 8

EasyLLM

Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM reorganizes the code logic with a focus on usability while preserving training efficiency.

Language: Python | Apache-2.0 | 31 8 1