ModelTC

ModelTC's repositories

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | Apache-2.0 | 217100

llmc

This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Language: Python | Apache-2.0 | 18300

OmniBal

Language: Python | 600

TFMQ-DM

[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".

Language: Jupyter Notebook | Apache-2.0 | 4900

EasyLLM

Built on Megatron-DeepSpeed and the HuggingFace Trainer, EasyLLM reorganizes the code logic with a focus on usability while preserving training efficiency.

Language: Python | Apache-2.0 | 3500

general-sam-py

Python bindings for general-sam and some utilities

Language: Python | Apache-2.0 | 100

mtc-token-healing

Token healing implementation in Rust

Language: Rust | Apache-2.0 | 300

L2_Compression

Language: Python | Apache-2.0 | 1100

msbench

A tool for model sparsification based on torch.fx

Language: Python | Apache-2.0 | 700

MQBench

Model Quantization Benchmark

Language: Shell | Apache-2.0 | 74200

FCPTS

Language: Python | 200

statecs

Language: Rust | Apache-2.0 | 100

general-sam

A general suffix automaton implementation in Rust with Python bindings

Language: Rust | Apache-2.0 | 200

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

Apache-2.0 | 000

greedy-tokenizer

Greedily tokenize strings with the longest tokens iteratively.

Language: Python | Apache-2.0 | 000
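
As an illustration of the idea in the greedy-tokenizer description, here is a minimal sketch of left-to-right longest-match tokenization. It is one plausible reading of "greedily tokenize with the longest tokens", not the repository's actual API: the function name `greedy_tokenize`, the set-based `vocab`, and the single-character fallback for unmatched text are all assumptions made for this example.

```python
def greedy_tokenize(text, vocab):
    """Split text left to right, always taking the longest vocab entry that matches.

    Hypothetical helper for illustration only; not taken from the repository.
    """
    max_len = max((len(tok) for tok in vocab), default=0)
    tokens = []
    i = 0
    while i < len(text):
        match = None
        # Try the longest candidate first and shrink until something matches.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in vocab:
                match = candidate
                break
        if match is None:
            # Assumed fallback: emit the single character as its own token.
            match = text[i]
        tokens.append(match)
        i += len(match)
    return tokens


vocab = {"token", "tokenize", "ize", "r", "s"}
print(greedy_tokenize("tokenizers", vocab))  # ['tokenize', 'r', 's']
```

Note that "tokenize" wins over the shorter "token" at position 0 because longer candidates are tried first. A trie or an automaton would avoid the repeated slicing, but the nested loop keeps the sketch short.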

QLLM

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"

Language: Python | Apache-2.0 | 2900

Dipoorlet

Offline quantization tools for deployment.

Language: Python | Apache-2.0 | 10900

awesome-lm-system

A summary of system papers, frameworks, code, and tools for training or serving large models

Apache-2.0 | 5600

LPCV_2023_solution

Language: Python | 1800

Outlier_Suppression_Plus

Official implementation of the EMNLP 2023 paper "Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling"

Language: Python | MIT | 3500