ModelTC

Model Infra

ModelTC's repositories

lightllm

LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.

Language: Python | License: Apache-2.0 | Stargazers: 2171 | Issues: 0

llmc

This is the official PyTorch implementation of "LLMC: Benchmarking Large Language Model Quantization with a Versatile Compression Toolkit".

Language: Python | License: Apache-2.0 | Stargazers: 183 | Issues: 0

Language: Python | Stargazers: 6 | Issues: 0

TFMQ-DM

[CVPR 2024 Highlight] This is the official PyTorch implementation of "TFMQ-DM: Temporal Feature Maintenance Quantization for Diffusion Models".

Language: Jupyter Notebook | License: Apache-2.0 | Stargazers: 49 | Issues: 0

EasyLLM

Built upon Megatron-Deepspeed and HuggingFace Trainer, EasyLLM has reorganized the code logic with a focus on usability. While enhancing usability, it also ensures training efficiency.

Language: Python | License: Apache-2.0 | Stargazers: 35 | Issues: 0

general-sam-py

Python bindings for general-sam and some utilities

Language: Python | License: Apache-2.0 | Stargazers: 1 | Issues: 0

mtc-token-healing

Token healing implementation in Rust

Language: Rust | License: Apache-2.0 | Stargazers: 3 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 11 | Issues: 0

msbench

A tool for model sparsification based on torch.fx

Language: Python | License: Apache-2.0 | Stargazers: 7 | Issues: 0

MQBench

Model Quantization Benchmark

Language: Shell | License: Apache-2.0 | Stargazers: 742 | Issues: 0

Language: Python | Stargazers: 2 | Issues: 0

Language: Rust | License: Apache-2.0 | Stargazers: 1 | Issues: 0

general-sam

A general suffix automaton implementation in Rust with Python bindings

Language: Rust | License: Apache-2.0 | Stargazers: 2 | Issues: 0

DeepSpeed

DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.

License: Apache-2.0 | Stargazers: 0 | Issues: 0

greedy-tokenizer

Greedily tokenize strings with the longest tokens iteratively.

Language: Python | License: Apache-2.0 | Stargazers: 0 | Issues: 0

QLLM

[ICLR 2024] This is the official PyTorch implementation of "QLLM: Accurate and Efficient Low-Bitwidth Quantization for Large Language Models"

Language: Python | License: Apache-2.0 | Stargazers: 29 | Issues: 0

Dipoorlet

Offline quantization tools for deployment.

Language: Python | License: Apache-2.0 | Stargazers: 109 | Issues: 0

awesome-lm-system

A summary of system papers, frameworks, code, and tools for training or serving large models

License: Apache-2.0 | Stargazers: 56 | Issues: 0

Language: Python | Stargazers: 18 | Issues: 0

Outlier_Suppression_Plus

Official implementation of the EMNLP 2023 paper "Outlier Suppression+: Accurate quantization of large language models by equivalent and optimal shifting and scaling"

Language: Python | License: MIT | Stargazers: 35 | Issues: 0

Language: Python | Stargazers: 0 | Issues: 0

ChatGLM-6B

ChatGLM-6B: An Open Bilingual Dialogue Language Model

License: Apache-2.0 | Stargazers: 2 | Issues: 0

pyvlova

Yet another polyhedral compiler for deep learning

Language: Python | License: Apache-2.0 | Stargazers: 19 | Issues: 0

Language: HTML | Stargazers: 0 | Issues: 0

NART

NART = NART is not A RunTime, a deep learning inference framework.

Language: Python | License: Apache-2.0 | Stargazers: 37 | Issues: 0

United-Perception

United Perception

Language: Python | License: Apache-2.0 | Stargazers: 427 | Issues: 0

AAAI2023_EAMPD

AAAI2023 Efficient and Accurate Models towards Practical Deep Learning Baseline

Stargazers: 13 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 33 | Issues: 0

Imagenet-S

Robustness for real-world system noise

Language: Python | Stargazers: 4 | Issues: 0

Language: Python | License: Apache-2.0 | Stargazers: 29 | Issues: 0