TensorRT_Parser_Python

TensorRT engine conversion (from ONNX models) and inference in Python.

An ONNX model can run on any system, regardless of platform (operating system / CUDA / cuDNN / TensorRT version), but parsing it takes a long time. Converting the ONNX model to a TensorRT engine (.trt) saves that parsing time (4-10 minutes), but the resulting engine can only run on the fixed system it was built on.
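For reference, here is a minimal sketch of what the export step boils down to with the TensorRT Python API (TensorRT 7/8-era calls; the build_engine helper name and file paths are placeholders, not this repo's actual code):

    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    def build_engine(onnx_path, max_workspace_size_mb=1300, fp16=False):
        builder = trt.Builder(TRT_LOGGER)
        # Explicit batch is required when parsing ONNX models
        network = builder.create_network(
            1 << int(trt.NetworkDefinitionCreationFlag.EXPLICIT_BATCH))
        parser = trt.OnnxParser(network, TRT_LOGGER)
        with open(onnx_path, "rb") as f:
            if not parser.parse(f.read()):
                for i in range(parser.num_errors):
                    print(parser.get_error(i))
                raise RuntimeError("Failed to parse the ONNX model")
        config = builder.create_builder_config()
        config.max_workspace_size = max_workspace_size_mb << 20  # MB -> bytes
        if fp16:
            config.set_flag(trt.BuilderFlag.FP16)  # --fp16
        engine = builder.build_engine(network, config)
        with open(onnx_path.replace(".onnx", ".trt"), "wb") as f:
            f.write(engine.serialize())
        return engine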

I. Prerequisites.

II. Export an ONNX model to a TensorRT engine.

python3 main.py export --weight (--saved_name) (--max_batch_size) (--max_workspace_size) (--fp16) (--input_tensor_name) (--dim) 
Arguments Details

| Argument | Type | Default | Note |
| --- | --- | --- | --- |
| --weight | str | required | Path to the ONNX model. |
| --saved_name | str | 'weight_path'.trt | Saved name of the TensorRT engine. |
| --fp16 | store_true | false | Enable FP16 fast mode (up to 2x faster inference). |
| --max_batch_size | int | 1 | Maximum inference batch size. |
| --max_workspace_size | int | 1300 | Maximum workspace size (MB). |
| --input_tensor_name | str | None | Input tensor name (dynamic-shape input only). |
| --dim | int_array | None | Input tensor dimensions (dynamic-shape input only). |

Note: The only GPUs with full-rate FP16 Fast mode performance are Tesla P100, Quadro GP100, and Jetson TX1/TX2.

Note: To get the input tensor name/shape of a model, use Netron (or the snippet below).
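Alternatively, a short sketch using the onnx Python package (the model path is a placeholder):

    import onnx

    model = onnx.load("model.onnx")  # placeholder path
    for inp in model.graph.input:
        # dim_param is set for symbolic dims (e.g. batch), dim_value for fixed ones
        dims = [d.dim_param or d.dim_value for d in inp.type.tensor_type.shape.dim]
        print(inp.name, dims)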

Examples
  • Export an ONNX model to a TensorRT engine.

    python3 main.py export --weight ../2020_0421_0925.onnx 
    python3 main.py export --weight ../2020_0421_0925.onnx --saved_name model.trt --max_batch_size 10 --fp16
  • Export an ONNX model with dynamic-shape input (batch size x 3 x 416 x 416); see the optimization-profile sketch after these examples.

     --input_tensor_name tensorName --dim dims1(,dims2,dims3)  (does not include the batch-size dimension)
     python3 main.py export --ds --weight ../2020_0421_0925.onnx --input_tensor_name input_1 --dim 128 128 3
     python3 main.py export --ds --weight ../Keras.onnx --input_tensor_name input:0 --dim 3 640 640 --fp16
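For context, here is roughly how --input_tensor_name/--dim map onto a TensorRT optimization profile (a sketch continuing the build_engine snippet above; the tensor name and dims below are the example values, not requirements):

    # Inside build_engine(), before building, for dynamic-shape inputs:
    profile = builder.create_optimization_profile()
    # The batch dimension varies from 1 up to max_batch_size (assumed variable);
    # the remaining dimensions come from --dim
    profile.set_shape("input_1",
                      min=(1, 128, 128, 3),
                      opt=(1, 128, 128, 3),
                      max=(max_batch_size, 128, 128, 3))
    config.add_optimization_profile(profile)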

III. Inference.

python3 main.py infer --weight --data (--batch_size) (--softmax)
Arguments Details

| Argument | Type | Default | Note |
| --- | --- | --- | --- |
| --weight | str | required | Path to the TensorRT engine (.trt). |
| --data | str | required | Path to the inference data. |
| --batch_size | int | 1 | Inference batch size. |
| --softmax | store_true | false | Apply softmax to the output layer. |
Examples

    python3 main.py infer --weight ../2020_0421_0925.trt --data ../Dataset/Train/
    python3 main.py infer --weight ../2020_0421_0925.trt --data ../Dataset/Train/ --batch_size 6 --softmax
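For reference, a minimal inference sketch (TensorRT 7/8-era API with pycuda, assuming a single fixed-shape input and output; paths and shapes are placeholders, not this repo's actual code):

    import numpy as np
    import pycuda.autoinit  # noqa: F401 - creates a CUDA context on import
    import pycuda.driver as cuda
    import tensorrt as trt

    TRT_LOGGER = trt.Logger(trt.Logger.WARNING)

    # Deserializing a prebuilt .trt engine skips the slow ONNX parse/build step
    with open("model.trt", "rb") as f, trt.Runtime(TRT_LOGGER) as runtime:
        engine = runtime.deserialize_cuda_engine(f.read())
    context = engine.create_execution_context()

    batch = np.random.rand(1, 3, 416, 416).astype(np.float32)  # placeholder input
    output = np.empty(trt.volume(engine.get_binding_shape(1)), dtype=np.float32)
    d_input = cuda.mem_alloc(batch.nbytes)
    d_output = cuda.mem_alloc(output.nbytes)

    cuda.memcpy_htod(d_input, batch)                   # host -> device
    context.execute_v2([int(d_input), int(d_output)])  # run the engine
    cuda.memcpy_dtoh(output, d_output)                 # device -> host

    # --softmax just normalizes the raw logits on the host:
    probs = np.exp(output - output.max()) / np.exp(output - output.max()).sum()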

TO-DO

  • Batch-size inference.
  • Add missing params (max_workspace_size, gpu).
  • Multiple inputs support.
  • Multiple outputs support.
  • Multiple types of inference data (video/folder/image).
