juliagusak / model-compression-and-acceleration-progress

Repository to track the progress in model compression and acceleration

Model Compression and Acceleration Progress

Low-rank approximation

  • T-Net: Parametrizing Fully Convolutional Nets with a Single High-Order Tensor (CVPR 2019) paper
  • MUSCO: Multi-Stage COmpression of neural networks (ICCVW 2019) paper | code (PyTorch)
  • Efficient Neural Network Compression (CVPR 2019) paper | code (Caffe)
  • Adaptive Mixture of Low-Rank Factorizations for Compact Neural Modeling (ICLR 2019) paper | code (PyTorch)
  • Extreme Network Compression via Filter Group Approximation (ECCV 2018) paper
  • Ultimate tensorization: compressing convolutional and FC layers alike (NIPS 2016 workshop) paper | code (TensorFlow) | code (MATLAB, Theano + Lasagne)
  • Compression of Deep Convolutional Neural Networks for Fast and Low Power Mobile Applications (ICLR 2016) paper
  • Accelerating Very Deep Convolutional Networks for Classification and Detection (IEEE TPAMI 2016) paper
  • Speeding-up Convolutional Neural Networks Using Fine-tuned CP-Decomposition (ICLR 2015) paper | code (Caffe)
  • Exploiting Linear Structure Within Convolutional Networks for Efficient Evaluation (NIPS 2014) paper
  • Speeding up Convolutional Neural Networks with Low Rank Expansions (2014) paper

Pruning & Sparsification

Papers

Repos

Knowledge distillation

Papers

  • Learning Efficient Detector with Semi-supervised Adaptive Distillation (arxiv 2019) paper | code (Caffe)
  • Model compression via distillation and quantization (ICLR 2018) paper | code (PyTorch)
  • Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (ICLR 2018 workshop) paper
  • Training Shallow and Thin Networks for Acceleration via Knowledge Distillation with Conditional Adversarial Networks (BMVC 2018) paper
  • Net2Net: Accelerating Learning via Knowledge Transfer (ICLR 2016) paper
  • Distilling the Knowledge in a Neural Network (NIPS 2014) paper
  • FitNets: Hints for Thin Deep Nets (2014) paper | code (Theano + Pylearn2)

Repos

  • TensorFlow implementation of three papers, with results for CIFAR-10: https://github.com/chengshengchan/model_compression
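For readers new to the topic, here is a minimal sketch of a distillation loss in the spirit of "Distilling the Knowledge in a Neural Network": the student is trained to match temperature-softened teacher probabilities, alongside the usual cross-entropy on the true label. The function names, the default `temperature` and `alpha` values, and the plain-Python implementation are illustrative assumptions and are not taken from any repository listed here.

```python
import math


def log_softmax(logits, temperature=1.0):
    """Numerically stable log-softmax of logits divided by a temperature."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in scaled))
    return [z - log_sum_exp for z in scaled]


def distillation_loss(student_logits, teacher_logits, label,
                      temperature=4.0, alpha=0.5):
    """Weighted sum of a soft-target term and a hard-label term.

    soft: KL(teacher || student) on temperature-softened distributions,
          scaled by temperature**2 so its gradient magnitude stays
          comparable across temperatures.
    hard: cross-entropy of the student (temperature 1) with the true label.
    alpha and temperature are hypothetical defaults chosen for illustration.
    """
    log_t = log_softmax(teacher_logits, temperature)
    log_s = log_softmax(student_logits, temperature)
    soft = sum(math.exp(lt) * (lt - ls) for lt, ls in zip(log_t, log_s))
    hard = -log_softmax(student_logits)[label]
    return alpha * temperature ** 2 * soft + (1.0 - alpha) * hard


# Usage: a three-class example with made-up logits.
loss = distillation_loss([1.0, 0.5, 2.0], [0.2, 0.1, 3.0], label=2)
```

Setting `alpha=0.0` recovers ordinary cross-entropy training, and `alpha=1.0` trains on the teacher's soft targets only.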

Quantization

  • Bayesian Bits: Unifying Quantization and Pruning (2020) paper
  • Up or Down? Adaptive Rounding for Post-Training Quantization (2020) paper
  • Gradient $\ell_1$ Regularization for Quantization Robustness (ICLR 2020) paper
  • Training Binary Neural Networks with Real-to-Binary Convolutions (ICLR 2020) paper | code (coming soon)
  • Data-Free Quantization Through Weight Equalization and Bias Correction (ICCV 2019) paper | code (PyTorch)
  • XNOR-Net++ (2019) paper
  • Matrix and tensor decompositions for training binary neural networks (2019) paper
  • XNOR-Net (ECCV 2016) paper | code (PyTorch)
  • Trained Quantization Thresholds for Accurate and Efficient Fixed-Point Inference of Deep Neural Networks (2019) paper | code (TensorFlow)
  • Relaxed Quantization for Discretized Neural Networks (ICLR 2019) paper
  • Training and Inference with Integers in Deep Neural Networks (ICLR 2018) paper | code (TensorFlow)
  • Training Quantized Nets: A Deeper Understanding (NeurIPS 2017) paper
  • Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference (2017) paper
  • Deep Learning with Limited Numerical Precision (2015) paper
  • Estimating or Propagating Gradients Through Stochastic Neurons for Conditional Computation (2013) paper
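As a rough illustration of what several of the papers above build on, the following sketch shows uniform affine (asymmetric) quantization of a list of floats to unsigned integers and back. The function names and the choice of an unsigned 8-bit range are assumptions made for this example; real toolkits add per-channel scales, calibration, and fused integer kernels that are not shown here.

```python
def choose_qparams(values, num_bits=8):
    """Pick a scale and zero point so that [min, max] maps onto [0, 2**num_bits - 1].

    The range is widened to include 0.0 so that zero is exactly representable.
    """
    qmin, qmax = 0, 2 ** num_bits - 1
    lo = min(min(values), 0.0)
    hi = max(max(values), 0.0)
    if hi == lo:  # all values are zero; any positive scale works
        return 1.0, 0
    scale = (hi - lo) / (qmax - qmin)
    zero_point = int(round(qmin - lo / scale))
    zero_point = max(qmin, min(qmax, zero_point))
    return scale, zero_point


def quantize(values, scale, zero_point, num_bits=8):
    """Map floats to integers: q = clamp(round(x / scale) + zero_point)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point))
            for v in values]


def dequantize(q_values, scale, zero_point):
    """Map integers back to floats: x_hat = scale * (q - zero_point)."""
    return [scale * (q - zero_point) for q in q_values]


# Usage: quantize a few made-up weights and reconstruct them.
weights = [-1.0, -0.5, 0.0, 0.5, 1.0]
scale, zero_point = choose_qparams(weights)
q = quantize(weights, scale, zero_point)
restored = dequantize(q, scale, zero_point)
```

The reconstruction error of each value is bounded by roughly one quantization step (`scale`), which is the quantity that quantization-aware training and post-training methods try to make harmless to accuracy.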

Architecture search

  • MobileNets
  • EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks (ICML 2019) paper | code and pretrained models (TensorFlow)
  • MnasNet: Platform-Aware Neural Architecture Search for Mobile (CVPR 2019) paper | code (TensorFlow)
  • MorphNet: Fast & Simple Resource-Constrained Learning of Deep Network Structure (CVPR 2018) paper | code (TensorFlow)
  • ShuffleNets
    • ShuffleNet V2: Practical Guidelines for Efficient CNN Architecture Design (ECCV 2018) paper
    • ShuffleNet: An Extremely Efficient Convolutional Neural Network for Mobile Devices (CVPR 2018) paper
  • Multi-Fiber Networks for Video Recognition (ECCV 2018) paper | code (PyTorch)
  • IGCVs
    • IGCV3: Interleaved Low-Rank Group Convolutions for Efficient Deep Neural Networks (BMVC 2018) paper | code and pretrained models (MXNet)
    • IGCV2: Interleaved Structured Sparse Convolutional Neural Networks (CVPR 2018) paper
    • Interleaved Group Convolutions for Deep Neural Networks (ICCV 2017) paper

PhD theses and overviews

  • Quantizing deep convolutional networks for efficient inference: A whitepaper (2018) paper
  • Algorithms for speeding up convolutional neural networks (2018) thesis
  • Model Compression and Acceleration for Deep Neural Networks: The Principles, Progress, and Challenges (2018) paper
  • Efficient methods and hardware for deep learning (2017) thesis

Frameworks

  • MUSCO - framework for model compression using tensor decompositions (PyTorch, TensorFlow)
  • AIMET - AI Model Efficiency Toolkit (PyTorch, TensorFlow)
  • Distiller - package for compression using pruning and low-precision arithmetic (PyTorch)
  • MorphNet - framework for neural networks architecture learning (TensorFlow)
  • Mayo - deep learning framework with fine- and coarse-grained pruning, network slimming, and quantization methods
  • PocketFlow - framework for model pruning, sparsification, and quantization (TensorFlow implementation)
  • Keras compressor - compression using low-rank approximations: SVD for matrices, Tucker for tensors
  • Caffe compressor - K-means based quantization
  • gemmlowp - Building a quantization paradigm from first principles (C++)
  • NNI - framework for feature engineering, NAS, hyperparameter tuning, and model compression

Comparison of different approaches

Please see comparative_results.pdf

Similar repos