There are 54 repositories under the model-compression topic.
An open source AutoML toolkit for automating the machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Awesome Knowledge Distillation
An Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications.
Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
CV backbones including GhostNet, TinyNet and TNT, developed by Huawei Noah's Ark Lab.
Awesome Knowledge-Distillation. A categorized collection of knowledge distillation papers (2014-2021).
micronet, a model compression and deployment library. Compression: 1. quantization: quantization-aware training (QAT), high-bit (>2b) (DoReFa / Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference), low-bit (≤2b) / ternary and binary (TWN/BNN/XNOR-Net); post-training quantization (PTQ), 8-bit (TensorRT); 2. pruning: normal, regular and group convolutional channel pruning; 3. group convolution structure; 4. batch-normalization fusion for quantization. Deployment: TensorRT, fp32/fp16/int8 (PTQ calibration), op-adapt (upsample), dynamic_shape.
A curated list of neural network pruning resources.
NLP DNN Toolkit - Building Your NLP DNN Models Like Playing Lego
A PyTorch implementation for exploring deep and shallow knowledge distillation (KD) experiments with flexibility
A toolkit to optimize ML models for deployment for Keras and TensorFlow, including quantization and pruning.
Channel Pruning for Accelerating Very Deep Neural Networks (ICCV'17)
PyTorch implementation of various Knowledge Distillation (KD) methods.
PaddleSlim is an open-source library for deep model compression and architecture search.
A list of high-quality (newest) AutoML works and lightweight models including 1.) Neural Architecture Search, 2.) Lightweight Structures, 3.) Model Compression, Quantization and Acceleration, 4.) Hyperparameter Optimization, 5.) Automated Feature Engineering.
Collection of recent methods on (deep) neural network compression and acceleration.
Lightweight and scalable framework that combines mainstream click-through-rate prediction algorithms, built on a computational DAG, with the Parameter Server philosophy and Ring-AllReduce collective communication.
Pruning channels for model acceleration
knowledge distillation papers
[CVPR2020] GhostNet: More Features from Cheap Operations
Filter Pruning via Geometric Median for Deep Convolutional Neural Networks Acceleration (CVPR 2019 Oral)
A PyTorch Knowledge Distillation library for benchmarking and extending works in the domains of Knowledge Distillation, Pruning, and Quantization.
Papers for deep neural network compression and acceleration
Archai accelerates Neural Architecture Search (NAS) through fast, reproducible and modular research.
[ECCV 2018] AMC: AutoML for Model Compression and Acceleration on Mobile Devices
Soft Filter Pruning for Accelerating Deep Convolutional Neural Networks
Infrastructures™ for Machine Learning Training/Inference in Production.
TinyNeuralNetwork is an efficient and easy-to-use deep learning model compression framework.
⛵️The official PyTorch implementation for "BERT-of-Theseus: Compressing BERT by Progressive Module Replacing" (EMNLP 2020).
Awesome machine learning model compression research papers, tools, and learning material.
Quantization library for PyTorch. Support low-precision and mixed-precision quantization, with hardware implementation through TVM.
Code for "Co-Evolutionary Compression for Unpaired Image Translation" (ICCV 2019), "SCOP: Scientific Control for Reliable Neural Network Pruning" (NeurIPS 2020) and "Manifold Regularized Dynamic Network Pruning" (CVPR 2021).
Java interface for fastText
Deep Face Model Compression