nncompression

Master's Thesis Project: https://davidturner94.github.com/nncompression


Master's Thesis

TODO


  • Finish moving the previous folder into the new module
  • Find a way to programmatically use the OpenVINO Model Optimizer - should it be through an include or a symlink?
  • Write methods for compressing networks
  • Explore NCS2 Python tools for model performance statistics
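One way to address the Model Optimizer TODO without an include or symlink is to invoke the `mo.py` script as a subprocess. A minimal sketch, assuming a typical OpenVINO install path (the `MO_SCRIPT` location is an assumption; the `--input_model`, `--output_dir`, and `--data_type` flags match the documented `mo.py` CLI, but verify against the installed version):

```python
import subprocess
from pathlib import Path

# Assumed install location -- adjust to the local OpenVINO installation.
MO_SCRIPT = Path("/opt/intel/openvino/deployment_tools/model_optimizer/mo.py")

def build_mo_command(model_path, output_dir, data_type="FP16"):
    """Build the argv list for invoking the Model Optimizer as a subprocess.

    Driving the script via subprocess keeps the project decoupled from the
    OpenVINO install layout (no include or symlink needed).
    """
    return [
        "python3", str(MO_SCRIPT),
        "--input_model", str(model_path),
        "--output_dir", str(output_dir),
        "--data_type", data_type,
    ]

cmd = build_mo_command("resnet18.onnx", "ir_models")
# subprocess.run(cmd, check=True)  # uncomment once OpenVINO is installed
print(cmd)
```

The returned IR (`.xml`/`.bin` pair) can then be loaded by the NCS2 inference tooling.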

Neural Network Compression

Tools

Model Description Libraries

| Library | Maintainer | Description | Status |
| --- | --- | --- | --- |
| PyTorch | Facebook | Facebook-backed and open source. Dynamic computational graph. | Actively developed |
| TensorFlow | Google | Google-backed. TensorFlow Lite for mobile deployment. Python and C++ APIs. Static graph. | Actively developed |
| Keras | François Chollet/Google | Developed by Google employee François Chollet. Great for quick prototyping. Wrapper for TensorFlow, Theano, and CNTK; tightly integrated with TensorFlow 2. | Active |
| CNTK | Microsoft | Toolkit that describes neural networks as a series of computational steps via a directed graph. | Deprecated |
| ONNX | Facebook/Microsoft (joint venture) | Widespread enterprise use. Multiple libraries convert to a universal model format. Interchangeable AI models. | Actively developed |
| MXNet | Apache | Wide industry support and a large user community. Focus on scalability over multiple GPUs and portability, with a large number of supported languages and most major OSes. Has libraries expanding on core functionality for NLP and CV. | Actively developed |
| Chainer | Community/Preferred Networks, Inc. | Open-source deep learning framework written purely in Python on top of NumPy and CuPy. First to use "define-by-run" (dynamic computational graph); focuses on object-oriented design for defining neural networks. | Actively developed |
| Caffe | UC Berkeley | Focus on expression, speed, and modularity. Last stable release over two years ago, but still supported by many pruning and deployment libraries, so still used in industry. | Deprecated |
| Caffe2 | Facebook (merged with PyTorch as of 2018) | Focus on being lightweight and modular. Caffe2Go for mobile deployment. | Deprecated |
| Theano | Montreal Institute for Learning Algorithms, University of Montreal | Focus on defining, optimizing, and evaluating mathematical expressions involving multi-dimensional arrays efficiently. Last release in 2017; only just being phased out as a backend in major libraries. | Deprecated |

Compression Libraries and extensions

| Tool | Maintainer | Description |
| --- | --- | --- |
| Brevitas | Xilinx Research | Neural network quantization framework from the Xilinx Research FINN project; originally built on Theano, now being migrated to PyTorch. |
| Vitis | Xilinx | Commercial software suite whose "Vitis AI" library supports Caffe and TensorFlow (and potentially PyTorch). Includes an AI Optimizer module for pruning, an AI Quantizer for quantization, and an AI Compiler for optimizing code for the DPU (Deep Learning Processing Unit), a layer on top of the bare-metal FPGA. |
| Distiller | Intel | Compression library on top of PyTorch with state-of-the-art algorithms. Includes pruning, quantization, regularization, knowledge distillation, and conditional computation. |
| Keras-Surgeon | Ben Whetton | Pruning for Keras models. Last updated a year ago. |
| QNNPACK | Facebook | Mobile-optimized library for low-precision, high-performance neural network inference. Focus on quantization. Intended not for research but for high-level frameworks. |
| MXNet Contrib Quantization | Apache/Community | MXNet Contrib library for quantization. |
| tensorflow_model_optimization | Google | Model optimization techniques via quantization and pruning APIs. |
| ChainerPruner | tkat0 | Channel pruning for Chainer. |
| TensorFlow Lite | Google | Geared towards mobile deployment. Model optimization through quantization. Two components: the TF Lite interpreter, and the TF Lite converter, which converts TensorFlow models to TFLite-optimized models. |
| Optuna | Optuna | Automatic hyperparameter optimization software framework. |
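The libraries above largely implement variants of two core techniques: magnitude pruning (zero out the smallest weights) and affine quantization (map floats onto a low-precision integer grid). A minimal framework-independent sketch of both, in pure Python (the function names and the 8-bit range are illustrative choices, not any library's API):

```python
def magnitude_prune(weights, sparsity):
    """Zero out the smallest-magnitude weights until `sparsity` fraction is zero."""
    n_prune = int(len(weights) * sparsity)
    if n_prune == 0:
        return list(weights)
    # Threshold = magnitude of the n_prune-th smallest weight (ties may prune extra).
    threshold = sorted(abs(w) for w in weights)[n_prune - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize_uint8(weights):
    """Affine quantization: w ~ scale * (q - zero_point), q in [0, 255]."""
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / 255 or 1.0  # avoid div-by-zero for constant inputs
    zero_point = round(-lo / scale)
    q = [max(0, min(255, round(w / scale) + zero_point)) for w in weights]
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Map quantized integers back to approximate float weights."""
    return [scale * (qi - zero_point) for qi in q]

w = [0.31, -0.02, 0.77, 0.05, -0.64, 0.01]
pruned = magnitude_prune(w, 0.5)   # half the weights become exactly 0.0
q, s, z = quantize_uint8(w)
recovered = dequantize(q, s, z)    # close to w, within one quantization step
```

The frameworks add the parts that matter in practice: pruning schedules with fine-tuning between steps, per-channel scales, and fake-quantization during training.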

Deployment Tools

| Tool | Maintainer | Description |
| --- | --- | --- |
| TVM Neural Network Compiler Stack | Apache | Relay is an intermediate-representation library for building dataflow computational graphs. VTA is a programmable accelerator and an end-to-end solution including drivers, a JIT runtime, and an optimizing compiler stack based on TVM. Includes deployment and simulation tools for FPGAs. |
| Glow | Facebook | PyTorch compiler. |
| OpenVINO | Intel | Deployment and model optimization for FPGA. |
| FINN-HLS | Xilinx Research | Vivado HLS library for FINN. |
| Vitis | Xilinx | Commercial software stack including Vivado for simulation and deployment. |

References

Papers I'm Reading


Papers on pruning

List of papers on pruning: Link. Another collection of compression techniques: link.
