sithu31296 / torch_optimize

Optimize PyTorch Models for faster inference

Optimize PyTorch Models


This project optimizes PyTorch models for production. Optimization includes the following:

  • Optimizing PyTorch models
  • Converting to other frameworks (ONNX, TFLite, TensorRT, OpenVINO, NCNN, etc.)
  • Optimizing the converted models in those frameworks


Installing OpenVINO

Download the OpenVINO toolkit from here.

On Linux:

$ tar -xvzf l_openvino_toolkit_p_<version>.tgz
$ cd l_openvino_toolkit_p_<version>
$ sudo ./

[Optional] Install External Software Dependencies

These dependencies are required by:

  • Intel-optimized build of OpenCV library
  • Inference Engine
  • Model Optimizer Tools

On Linux:

$ cd /opt/intel/openvino_2021/install_dependencies
$ sudo -E ./

Set the Environment Variables

  • Open the .bashrc file.
$ gedit ~/.bashrc
  • Add this line to the end of the file.
source /opt/intel/openvino_2021/bin/
  • Save and close the file.
  • Open a new terminal and you will see [] OpenVINO environment initialized.

Configure the Model Optimizer

  • Go to the Model Optimizer pre-requisites directory.
$ cd /opt/intel/openvino_2021/deployment_tools/model_optimizer/install_prerequisites
  • Run the script for the ONNX framework.
$ sudo ./

Uninstall OpenVINO

Run the following command.

$ sudo /opt/intel/openvino_2021/openvino_toolkit_uninstaller/ -s

Installing openvino2tensorflow

The openvino2tensorflow tool converts an OpenVINO model to a TensorFlow model. Install it as follows:

$ pip install -U git+

PyTorch to TFLite

Step 1: Convert PyTorch to ONNX

$ python convert/
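
The conversion script's name is truncated above, so the following is only a minimal sketch of what a PyTorch-to-ONNX export typically looks like with `torch.onnx.export`; it is not the repository's script. The helper name `export_to_onnx`, the default input shape, the tensor names `input`/`output`, and opset 11 are all assumptions chosen for illustration.

```python
from pathlib import Path


def export_to_onnx(model, input_shape=(1, 3, 224, 224),
                   onnx_path="model.onnx", opset_version=11):
    """Trace `model` with a random input and write an ONNX file (hypothetical helper)."""
    # Imported lazily so this helper can be defined without PyTorch installed.
    import torch

    model.eval()  # put dropout / batch-norm layers in inference mode before tracing
    dummy_input = torch.randn(*input_shape)  # static B, C, H, W shape (assumed)
    with torch.no_grad():
        torch.onnx.export(
            model,
            dummy_input,
            str(onnx_path),
            input_names=["input"],    # illustrative tensor names
            output_names=["output"],
            opset_version=opset_version,
        )
    return Path(onnx_path)
```

Calling `export_to_onnx(my_model, onnx_path="my_model.onnx")` with any `torch.nn.Module` should produce a file usable as `<MODEL>.onnx` in Step 2, provided every operator in the model is supported by the chosen opset.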

Step 2: Convert ONNX to OpenVINO

$ python <OpenVINO_INSTALL_DIR>/deployment_tools/model_optimizer/ \
    --input_model <MODEL>.onnx \
    --output_dir <OpenVINO_MODEL_PATH> \
    --input_shape [B,C,H,W] \
    --data_type {FP16,FP32,half,float}
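
To make the placeholders concrete, here is a hedged sketch that assembles the same command from Python. `--input_shape` takes one bracketed token without spaces, and `--data_type` takes exactly one of the listed values. The Model Optimizer script name is truncated in the command above, so it is left as a caller-supplied argument; `build_mo_command` is a hypothetical helper, not part of OpenVINO.

```python
import shlex


def build_mo_command(mo_script, onnx_model, output_dir,
                     input_shape=(1, 3, 224, 224), data_type="FP16"):
    """Return the Model Optimizer invocation as an argument list (hypothetical helper)."""
    allowed = {"FP16", "FP32", "half", "float"}
    if data_type not in allowed:
        raise ValueError(f"data_type must be one of {sorted(allowed)}")
    # The shape is passed as a single token with no spaces, e.g. [1,3,224,224].
    shape_arg = "[" + ",".join(str(dim) for dim in input_shape) + "]"
    return [
        "python", str(mo_script),
        "--input_model", str(onnx_model),
        "--output_dir", str(output_dir),
        "--input_shape", shape_arg,
        "--data_type", data_type,
    ]


# "<MO_SCRIPT>" stands in for the script path that is elided above.
cmd = build_mo_command("<MO_SCRIPT>", "model.onnx", "openvino_model")
print(shlex.join(cmd))  # shell-quoted form, for copy-pasting into a terminal
```

Passing the list to `subprocess.run(cmd, check=True)` would run the conversion without going through a shell, which avoids quoting problems with the bracketed shape.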

Step 3: Convert OpenVINO to TensorFlow

$ openvino2tensorflow \
    --model_path <OpenVINO_MODEL_PATH>/<MODEL>.xml \
    --model_output_path <TF_SAVED_MODEL_PATH> \
    --output_saved_model

Step 4: Convert TensorFlow to TFLite

$ python convert/ \
    --model-path <TF_SAVED_MODEL_PATH> \
    --model-output-path <TFLITE_MODEL_PATH> \
    --quant {float32,float16,int8}

Note: For int8 quantization, also pass --dataset-path <CALIBRATE_DATASET_PATH>, pointing to unlabelled calibration data in NumPy format.
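
The exact array layout the conversion script expects is not stated here, so the sketch below is only one plausible way to prepare such a file. It assumes a single `.npy` array of shape (N, H, W, C), float32, scaled to [0, 1]; check the script's preprocessing before relying on it. `make_calibration_file` and the file name `calib_data.npy` are hypothetical.

```python
import numpy as np


def make_calibration_file(images, out_path="calib_data.npy"):
    """Stack same-sized images into one float32 array and save it as .npy."""
    batch = np.stack([np.asarray(img, dtype=np.float32) for img in images])
    # Assumed preprocessing: scale 0-255 pixel values into [0, 1].
    batch /= 255.0
    np.save(out_path, batch)
    return batch.shape


# Stand-in data: 8 random 224x224 RGB images in H, W, C layout.
# Real calibration should use unlabelled samples from the target domain.
rng = np.random.default_rng(0)
fake_images = rng.integers(0, 256, size=(8, 224, 224, 3), dtype=np.uint8)
print(make_calibration_file(fake_images))
```

A few hundred representative samples are commonly used for calibration; random data like the stand-in above only demonstrates the file format and would give poor quantization ranges.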



Methods             Inference Time (ms)   Model Size (MB)   Improvements (%)
original            -                     -                 -
orig+quantize       -                     -                 -
orig+prune          -                     -                 -
orig+quant+prune    -                     -                 -
orig2onnx           -                     -                 -
tflite              -                     -                 -
tflite+quantize     -                     -                 -


Methods             Inference Time (ms)   Model Size (MB)   Improvements (%)
original (FP32)     -                     -                 -
original (FP16)     -                     -                 -
tensorrt (FP32)     -                     -                 -
tensorrt (FP16)     -                     -                 -
tensorrt (int8)     -                     -                 -
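
The tables above are still unfilled. As a rough sketch of how their columns could be measured, the helpers below time any inference callable, read a model file's size, and compute a relative improvement. All three function names are hypothetical, and defining "Improvements (%)" as the reduction relative to the original model is an assumption, since the tables do not define it.

```python
import os
import statistics
import time


def mean_latency_ms(run_once, warmup=5, repeats=50):
    """Average wall-clock time of `run_once()` in milliseconds."""
    for _ in range(warmup):  # warm-up runs are not timed
        run_once()
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_once()
        timings.append((time.perf_counter() - start) * 1000.0)
    return statistics.mean(timings)


def model_size_mb(path):
    """Size of a model file on disk in megabytes (1 MB = 1024 * 1024 bytes here)."""
    return os.path.getsize(path) / (1024 * 1024)


def improvement_percent(baseline, optimized):
    """Reduction relative to the baseline, in percent (assumed definition)."""
    return (baseline - optimized) / baseline * 100.0
```

For example, `mean_latency_ms(lambda: model(x))` would time one forward pass per call. On a GPU the callable should also wait for the device to finish (in PyTorch, `torch.cuda.synchronize()`), otherwise only the kernel launch is timed.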


License: MIT License

