TensorRT_Parser_Python

TensorRT engine conversion (from ONNX models) and inference in Python.

An ONNX model can run on any system, regardless of platform (operating system / CUDA / cuDNN / TensorRT version), but parsing it takes a long time. Converting the ONNX model to a TensorRT engine (.trt) saves that parsing time (4-10 minutes), but the resulting engine can only run on the fixed system it was built on.
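For reference, here is a minimal sketch of what the export step boils down to with the TensorRT Python API (TensorRT 7/8-era calls; the build_engine helper name and file paths are placeholders, not this repo's actual code):

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    def build_engine(onnx_path, max_workspace_size_mb=1300, fp16=False):
        builder = trt.Builder(TRT_LOGGER)
        # Explicit batch is required when parsing ONNX models
        network = builder.create_network(
            1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        parser = trt.OnnxParser(network, TRT_LOGGER)
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                raise RuntimeError("Failed to parse the ONNX model")
        config = builder.create_builder_config()
        config.max_workspace_size = max_workspace_size_mb << 20  # MB -> bytes
        if fp16:
            config.set_flag(trt.BuilderFlag.FP16)  # --fp16
        engine = builder.build_engine(network, config)
        with open(onnx_path.replace(".onnx", ".trt"), "wb") as f:
            f.write(engine.serialize())
        return engine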

I. Prerequisites.

II. Export an ONNX model to a TensorRT engine.

python3 main.py export --weight (--saved_name) (--max_batch_size) (--max_workspace_size) (--fp16) (--input_tensor_name) (--dim) 
Arguments Details

| Argument | Type | Default | Note |
| --- | --- | --- | --- |
| --weight | str | required | Path to the ONNX model. |
| --saved_name | str | 'weight_path'.trt | Saved name of the TensorRT engine. |
| --fp16 | store_true | false | Enable FP16 fast mode (up to 2x faster inference). |
| --max_batch_size | int | 1 | Maximum inference batch size. |
| --max_workspace_size | int | 1300 | Maximum workspace size (MB). |
| --input_tensor_name | str | None | Input tensor name (dynamic-shape input only). |
| --dim | int_array | None | Input tensor dimensions (dynamic-shape input only). |

Note: The only GPUs with full-rate FP16 Fast mode performance are Tesla P100, Quadro GP100, and Jetson TX1/TX2.

Note: To get the input tensor name/shape of a model, use Netron (or the snippet below).
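Alternatively, a short sketch using the onnx Python package (the model path is a placeholder):

    import onnx

    model = onnx.load("model.onnx")  # placeholder path
    for inp in model.graph.input:
        # dim_param is set for symbolic dims (e.g. batch), dim_value for fixed ones
        dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
        print(inp.name, dims)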

Examples
  • Export an ONNX model to a TensorRT engine.

    python3 main.py export --weight ../2020_0421_0925.onnx 
    python3 main.py export --weight ../2020_0421_0925.onnx --saved_name model.trt --max_batch_size 10 --fp16
  • Export an ONNX model with dynamic-shape input (batch size x 3 x 416 x 416); see the optimization-profile sketch after these examples.

     --input_tensor_name tensorName --dim dims1(,dims2,dims3)  (does not include the batch-size dimension)
     python3 main.py export --ds --weight ../2020_0421_0925.onnx --input_tensor_name input_1 --dim 128 128 3
     python3 main.py export --ds --weight ../Keras.onnx --input_tensor_name input:0 --dim 3 640 640 --fp16
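For context, here is roughly how --input_tensor_name/--dim map onto a TensorRT optimization profile (a sketch continuing the build_engine snippet above; the tensor name and dims below are the example values, not requirements):

    # Inside build_engine(), before building, for dynamic-shape inputs:
    profile = builder.create_optimization_profile()
    # The batch dimension varies from 1 up to max_batch_size (assumed variable);
    # the remaining dimensions come from --dim
    profile.set_shape("input_1",
                      min=(1, 128, 128, 3),
                      opt=(1, 128, 128, 3),
                      max=(max_batch_size, 128, 128, 3))
    config.add_optimization_profile(profile)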

III. Inference.

python3 main.py infer --weight --data (--batch_size) (--softmax)
Arguments Details

| Argument | Type | Default | Note |
| --- | --- | --- | --- |
| --weight | str | required | Path to the TensorRT engine (.trt). |
| --data | str | required | Path to the inference data. |
| --batch_size | int | 1 | Inference batch size. |
| --softmax | store_true | false | Apply softmax to the output layer. |
Examples

    python3 main.py infer --weight ../2020_0421_0925.trt --data ../Dataset/Train/
    python3 main.py infer --weight ../2020_0421_0925.trt --data ../Dataset/Train/ --batch_size 6 --softmax
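For reference, a minimal inference sketch (TensorRT 7/8-era API with pycuda, assuming a single fixed-shape input and output; paths and shapes are placeholders, not this repo's actual code):

    import numpy as np
    import pycuda.autoinit  # noqa: F401 - creates a CUDA context on import
    import pycuda.driver as cuda
    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    # Deserializing a prebuilt .trt engine skips the slow ONNX parse/build step
    with open("model.trt", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    batch = np.random.rand(1, 3, 416, 416).astype(np.float32)  # placeholder input
    output = np.empty(trt.volume(engine.get_binding_shape(1)), dtype=np.float32)
    d_input = cuda.mem_alloc(batch.nbytes)
    d_output = cuda.mem_alloc(output.nbytes)

    cuda.memcpy_htod(d_input, batch)                   # host -> device
    context.execute_v2([int(d_input), int(d_output)])  # run the engine
    cuda.memcpy_dtoh(output, d_output)                 # device -> host

    # --softmax just normalizes the raw logits on the host:
    probs = np.exp(output - output.max()) / np.exp(output - output.max()).sum()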

TO-DO

  • Batch-size inference.
  • Add missing params (max_workspace_size, gpu).
  • Multiple inputs support.
  • Multiple outputs support.
  • Multiple types of inference data (video/folder/image).
