model-compression

There are 67 repositories under the model-compression topic.

microsoft / nni
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
automl deep-learning neural-architecture-search hyperparameter-optimization distributed bayesian-optimization automated-machine-learning machine-learning machine-learning-algorithms data-science tensorflow pytorch neural-network deep-neural-network model-compression feature-engineering nas python hyperparameter-tuning mlops
Language:Python 13726
huawei-noah / Efficient-AI-Backbones
Efficient AI Backbones including GhostNet, TNT and MLP, developed by Huawei Noah's Ark Lab.
convolutional-neural-networks efficient-inference imagenet model-compression tensorflow pytorch ghostnet transformer pretrained-models vision-transformer
Language:Python 3782
dkozlov / awesome-knowledge-distillation
Awesome Knowledge Distillation
knowledge-distillation knowledge-transfer teacher-student co-training distillation distillation-model model-distillation knowldge-distillation kd model-compression deep-learning
3311
huawei-noah / Pretrained-Language-Model
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
knowledge-distillation model-compression quantization pretrained-models large-scale-distributed
Language:Python 2954
Tencent / PocketFlow
An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
deep-learning model-compression mobile-app automl computer-vision
Language:Python 2780
FLHonker / Awesome-Knowledge-Distillation
Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).
kd knowldge-distillation distillation deep-learning transfer-learning model-compression
2392
VainF / Torch-Pruning
[CVPR 2023] Towards Any Structural Pruning; LLMs / SAM / Diffusion / Transformers / YOLOv8 / CNNs
pruning model-compression network-pruning channel-pruning structural-pruning efficient-deep-learning depgraph cvpr2023
Language:Python 2284
he-y / Awesome-Pruning
A curated list of neural network pruning resources.
awesome-list model-acceleration model-compression pruning
2204
666DZY666 / micronet
micronet, a model compression and deployment library. Compression: (1) quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); (2) pruning: normal, regular, and group convolutional channel pruning; (3) group convolution structure; (4) batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.
quantization pruning dorefa twn bnn xnor-net pytorch model-compression group-convolution network-slimming neuromorphic-computing convolutional-networks network-in-network integer-arithmetic-only quantization-aware-training post-training-quantization tensorrt onnx tensorrt-int8-python batch-normalization-fuse
Language:Python 2175
haitongli / knowledge-distillation-pytorch
A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
pytorch knowledge-distillation deep-neural-networks cifar10 model-compression dark-knowledge computer-vision
Language:Python 1779
htqin / awesome-model-quantization
A list of papers, docs, and code about model quantization. This repo aims to provide information for model quantization research, and the project is continuously being improved. PRs adding works (papers, repositories) that the repo has missed are welcome.
awesome binarization binarized-neural-networks binary-network deep-learning efficient-deep-learning lightweight-neural-network model-acceleration model-compression model-quantization quantization
1595
AberHu / Knowledge-Distillation-Zoo
PyTorch implementation of various Knowledge Distillation (KD) methods.
distillation kd kd-methods knowledge-distillation knowledge-transfer model-compression teacher-student
Language:Python 1508
tensorflow / model-optimization
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
tensorflow machine-learning deep-learning optimization quantized-neural-networks quantized-networks quantized-training keras model-compression compression ml pruning sparsity quantization
Language:Python 1464
microsoft / NeuronBlocks
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
question-answering deep-learning pytorch natural-language-processing text-classification artificial-intelligence dnn qna text-matching knowledge-distillation model-compression sequence-labeling
Language:Python 1440
huawei-noah / Efficient-Computing
Efficient computing methods developed by Huawei Noah's Ark Lab
knowledge-distillation model-compression binary-neural-networks pruning quantization self-supervised
Language:Jupyter Notebook 1107
ethanhe42 / channel-pruning
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)
image-recognition model-compression acceleration object-detection image-classification channel-pruning deep-neural-networks
Language:Python 1065
MingSun-Tse / Efficient-Deep-Learning
Collection of recent methods on (deep) neural network compression and acceleration.
model-compression network-pruning knowledge-distillation deep-learning deep-neural-networks efficient-deep-learning
895
guan-yuan / awesome-AutoML-and-Lightweight-Models
A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
automl meta-learning automated-feature-engineering hyperparameter-optimization architecture-search model-compression model-acceleration awesome-list neural-architecture-search nas pytorch quantization quantized-neural-network quantized-training tensorflow
827
lhyfst / knowledge-distillation-papers
knowledge distillation papers
knowledge-distillation model-compression paper dark-knowledge reading-list
724
alibaba / TinyNeuralNetwork
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
pytorch deep-learning model-compression pruning model-converter quantization-aware-training deep-neural-networks post-training-quantization
Language:Python 703
cnkuangshi / LightCTR
A lightweight and scalable framework that combines mainstream Click-Through-Rate prediction algorithms, built on a computational DAG, with the Parameter Server philosophy and Ring-AllReduce collective communication.
machine-learning deep-learning factorization-machines distributed-systems parameter-server model-compression computational-graphs
Language:C++ 674
horseee / DeepCache
[CVPR 2024] DeepCache: Accelerating Diffusion Models for Free
diffusion-models efficient-inference model-compression stable-diffusion training-free
Language:Python 585
SforAiDl / KD_Lib
A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
knowledge-distillation model-compression pruning quantization pytorch deep-learning-library machine-learning data-science benchmarking algorithm-implementations
Language:Python 570
SqueezeAILab / SqueezeLLM
SqueezeLLM: Dense-and-Sparse Quantization
efficient-inference large-language-models llm model-compression natural-language-processing post-training-quantization quantization text-generation transformer llama localllm small-models
Language:Python 560
he-y / filter-pruning-geometric-median
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)
pruning pytorch model-compression
Language:Python 539
iamhankai / ghostnet.pytorch
[CVPR2020] GhostNet: More Features from Cheap Operations
convolutional-neural-networks mobilenetv3 model-compression pytorch fbnet
Language:Python 519
microsoft / archai
Accelerate your Neural Architecture Search (NAS) through fast, reproducible and modular research.
python pytorch machine-learning deep-learning neural-architecture-search nas automated-machine-learning model-compression darts petridish hyperparameter-optimization automl
Language:Python 452
cedrickchee / awesome-ml-model-compression
Awesome machine learning model compression research papers, tools, and learning material.
machine-learning model-compression quantization pruning awesome-list neural-networks
444
mit-han-lab / amc
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
automl automl-for-compression model-compression channel-pruning efficient-model on-device-ai
Language:Python 417
chester256 / Model-Compression-Papers
Papers for deep neural network compression and acceleration
deep-learning model-compression papers deep-neural-networks model-acceleration
393
Zhen-Dong / HAWQ
Quantization library for PyTorch. Supports low-precision and mixed-precision quantization, with hardware implementation through TVM.
quantization tvm model-compression distillation quantized-neural-networks pytorch hardware-aware mixed-precision efficient-neural-networks 8-bit 4-bit tensorcore hessian
Language:Python 391
he-y / soft-filter-pruning
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
pruning pytorch model-compression
Language:Python 374
1duo / awesome-ai-infrastructures
Infrastructures™ for Machine Learning Training/Inference in Production.
artificial-intelligence machine-learning deep-learning machine-learning-systems awesome-list deep-learning-framework kubernetes apache-spark apache-arrow apache-mesos pruning quantization knowledge-distillation model-compression federated-learning
368
pratyushasharma / laser
The Truth Is In There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction
gpt-j interpretability laser llama2 llm llms model-compression transformers
Language:Python 321
JetRunner / BERT-of-Theseus
⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).
bert transformers nlp glue model-compression
Language:Python 308
Zhen-Dong / Awesome-Quantization-Papers
List of papers related to neural network quantization in recent AI conferences and journals.
quantization awesome-list papers neural-networks model-compression edge-computing efficient-inference diffusion-models large-language-models
295
