onnx file won't work in deepstream_python_apps without EXPLICIT_BATCH flag set

Question

onnx file won't work in deepstream_python_apps without EXPLICIT_BATCH flag set

whutchi opened this issue 5 months ago · comments

I have a trained detection model that I trained in this excellent Jetson-Inference series. In that process instructed by pytorch-ssd.md page, training produces a .pth file that is then converted by the onnx_export.py utility to an onnx model. I want to use that trained model in one of the deepstream_python_apps that are perfect for my needs. To use an onnx model, you change the config file to comment out the tlt-encoded-model and the tlt-model-key, add a line with the name of the onnx file, then use trtexec to create an engine file from the onnx file and put that in the model-engine-file entry in place of the sample that was there. With that setup, it starts up great, but then hits an error saying that the onnx model must have been created with NetworkDefinitionCreationFlag.EXPLICIT_BATCH, since trt no longer accepts implict batch size. Adding the argument --explictBatch to the trtexec command doesn't solve it. The documentation at https://docs.nvidia.com/deeplearning/tensorrt/developer-guide/index.html#explicit-implicit-batch also doesn't help.

I've tried to get around that by going all the way back to the training and the conversion to onnx and setting the batch size explicitly, but that does not fulfill the requirement. Is there any way to use an onnx network created by the training or the onnx_export.py conversion to be able to run in the deepstream_python_apps? Perhaps some way something could be added to the onnx_export code to set that parameter?
I've got such a nice trained onnx network and the deepstream_python_app examples are perfect for what I want to do if I can solve this.

whutchi · Answer 1 · Fri Jun 14 2024 23:27:58 GMT+0800 (China Standard Time)

It does seem like a major flaw to have the main network training method in the dusty-nv/jetson-inference tutorials produce an onnx file that can't be used in the main sample apps in the deepstream_python_apps.... How can that be remedied?

Dustin Franklin · Answer 2 · Sat Jun 15 2024 01:51:47 GMT+0800 (China Standard Time)

@whutchi the deepstream models are typically trained with TAO (or YOLO iirc) and if you were to get past the explicit batch thing, I'm not sure those other samples support ssd-mobilenet and the pre/post-processing for it. Supporting models requires more than just the ONNX, and the pytorch training scripts in jetson-inference are basically for educational examples.

…

________________________________ From: whutchi ***@***.***> Sent: Friday, June 14, 2024 11:28:30 AM To: dusty-nv/jetson-inference ***@***.***> Cc: Subscribed ***@***.***> Subject: Re: [dusty-nv/jetson-inference] onnx file won't work in deepstream_python_apps without EXPLICIT_BATCH flag set (Issue #1864) It does seem like a major flaw to have the main network training method in the dusty-nv/jetson-inference tutorials produce an onnx file that can't be used in the main sample apps in the deepstream_python_apps.... How can that be remedied? — Reply to this email directly, view it on GitHub<#1864 (comment)>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/ADVEGK2SYOVYPJM4YR5RXMTZHMDZ5AVCNFSM6AAAAABJDL6222VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRYGI3TEOJXGE>. You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

whutchi · Answer 3 · Tue Jun 18 2024 23:23:46 GMT+0800 (China Standard Time)

Dustin, thanks again for the quick response. I'm at a bit of a loss now figuring out how to train a network that will work in the deepstream python apps since they are perfect for my application (in fact, I've already successfully modified the code in the usbcam example to do object detection and face recognition at the same time). I have a jetson orin nano devkit, a powerful board, but Tao doesn't work for training on it, nor does the Transfer Learning Toolkit! I've been looking at ultralytics to use yolov8, which would be a great detection model, but their deployment method doesn't fit into the deepstream python app and it requires a continuing license. I just want to train a network that I can put into the deepstream python app. Given that I have the Orin, it doesn't make sense to me to have to buy a separate workstation to train using Tao or TLT. I presume I'm not understanding the options, so could you advise me how to train on the Orin? Could your python training scripts be used with yolov8, for example? Thanks for your help.

…

On Fri, Jun 14, 2024 at 11:52 AM Dustin Franklin ***@***.***> wrote: @whutchi the deepstream models are typically trained with TAO (or YOLO iirc) and if you were to get past the explicit batch thing, I'm not sure those other samples support ssd-mobilenet and the pre/post-processing for it. Supporting models requires more than just the ONNX, and the pytorch training scripts in jetson-inference are basically for educational examples. ________________________________ From: whutchi ***@***.***> Sent: Friday, June 14, 2024 11:28:30 AM To: dusty-nv/jetson-inference ***@***.***> Cc: Subscribed ***@***.***> Subject: Re: [dusty-nv/jetson-inference] onnx file won't work in deepstream_python_apps without EXPLICIT_BATCH flag set (Issue #1864) It does seem like a major flaw to have the main network training method in the dusty-nv/jetson-inference tutorials produce an onnx file that can't be used in the main sample apps in the deepstream_python_apps.... How can that be remedied? — Reply to this email directly, view it on GitHub< #1864 (comment)>, or unsubscribe< https://github.com/notifications/unsubscribe-auth/ADVEGK2SYOVYPJM4YR5RXMTZHMDZ5AVCNFSM6AAAAABJDL6222VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRYGI3TEOJXGE>. You are receiving this because you are subscribed to this thread.Message ID: ***@***.***> — Reply to this email directly, view it on GitHub <#1864 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/ABXRPXAJ25FKYV2JJCSE5ULZHMUURAVCNFSM6AAAAABJDL6222VHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMZDCNRYGQ4TSOBXGY> . You are receiving this because you were mentioned.Message ID: ***@***.***>