[Bug] TensorRT and ONNX Runtime C++ deployment fails on Jetson Nano with JetPack 4.6.1
HEIseYOUmolc opened this issue · comments
Checklist
- 1. I have searched related issues but cannot get the expected help.
- 2. I have read the FAQ documentation but cannot get the expected help.
- 3. The bug has not been fixed in the latest version.
Describe the bug
The model conversion was done on Windows, and the resulting ONNX model and engine files were copied to the Jetson Nano module; the errors below appeared during deployment. Only the onnxruntime CPU configuration runs correctly; neither backend runs in CUDA mode.
In addition, I can use TensorRT on the Jetson Nano to convert the ONNX model to engine format, but I know little about engine files. Is there another deployment method?
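For context, a serialized TensorRT engine is tied to the exact TensorRT version and GPU it was built with (the traceback below confirms this: "expecting library version 8.2.1 got 8.2.3"), so an engine produced on Windows cannot be deserialized on the Nano. A minimal on-device rebuild sketch, assuming the model directory paths used elsewhere in this report and a `libmmdeploy_tensorrt_ops.so` built on the device itself:

```shell
# Rebuild the engine on the Jetson so it matches the local TensorRT
# version (8.2.1 on JetPack 4.6.1). Paths are assumptions based on
# the directories used elsewhere in this report.
trtexec \
  --onnx=/home/nvidia/文档/mmdeploy_models/rtdetr-trt-sta-640/end2end.onnx \
  --saveEngine=/home/nvidia/文档/mmdeploy_models/rtdetr-trt-sta-640/end2end.engine \
  --plugins=/home/nvidia/文档/mmdeploy/mmdeploy/lib/libmmdeploy_tensorrt_ops.so \
  --workspace=1024
```

The `--plugins` flag is needed so trtexec can resolve the MMDeploy custom ops baked into the ONNX graph.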
Reproduction
Run the binaries produced by the different build configurations below.
TensorRT build
cmake .. -DMMDEPLOY_BUILD_SDK=ON -DMMDEPLOY_BUILD_SDK_PYTHON_API=ON -DMMDEPLOY_BUILD_EXAMPLES=ON -DMMDEPLOY_TARGET_DEVICES="cuda;cpu" -DMMDEPLOY_TARGET_BACKENDS="trt" -DMMDEPLOY_CODEBASES=all -Dpplcv_DIR=${PPLCV_DIR}/cuda-build/install/lib/cmake/ppl
ONNX Runtime build
cmake .. -DMMDEPLOY_BUILD_SDK=ON -DMMDEPLOY_BUILD_SDK_PYTHON_API=ON -DMMDEPLOY_BUILD_EXAMPLES=ON -DMMDEPLOY_TARGET_DEVICES="cuda;cpu" -DMMDEPLOY_TARGET_BACKENDS="ort" -DMMDEPLOY_CODEBASES=all -Dpplcv_DIR=${PPLCV_DIR}/cuda-build/install/lib/cmake/ppl -DONNXRUNTIME_DIR=${ONNXRUNTIME_DIR}
Run commands
With TensorRT
./object_detection cuda /home/nvidia/文档/mmdeploy_models/rtdetr-trt-sta-640/ /home/nvidia/图片/resources/test.jpg
With ONNX Runtime
./object_detection cuda /home/nvidia/文档/mmdeploy_models/rtdetr-ort-dyn/ /home/nvidia/图片/resources/test.jpg
Environment
01/08 17:35:54 - mmengine - INFO -
01/08 17:35:54 - mmengine - INFO - **********Environmental information**********
01/08 17:35:56 - mmengine - INFO - sys.platform: linux
01/08 17:35:56 - mmengine - INFO - Python: 3.6.15 | packaged by conda-forge | (default, Dec 3 2021, 19:12:04) [GCC 9.4.0]
01/08 17:35:56 - mmengine - INFO - CUDA available: True
01/08 17:35:56 - mmengine - INFO - GPU 0: NVIDIA Tegra X1
01/08 17:35:56 - mmengine - INFO - CUDA_HOME: /usr/local/cuda
01/08 17:35:56 - mmengine - INFO - NVCC: Cuda compilation tools, release 10.2, V10.2.300
01/08 17:35:56 - mmengine - INFO - GCC: gcc (Ubuntu/Linaro 7.5.0-3ubuntu1~18.04) 7.5.0
01/08 17:35:56 - mmengine - INFO - PyTorch: 1.10.0
01/08 17:35:56 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
- GCC 7.5
- C++ Version: 201402
- OpenMP 201511 (a.k.a. OpenMP 4.5)
- LAPACK is enabled (usually provided by MKL)
- NNPACK is enabled
- CPU capability usage: NO AVX
- CUDA Runtime 10.2
- NVCC architecture flags: -gencode;arch=compute_53,code=sm_53;-gencode;arch=compute_62,code=sm_62;-gencode;arch=compute_72,code=sm_72
- CuDNN 8.2.1
- Built with CuDNN 8.0
- Build settings: BLAS_INFO=open, BUILD_TYPE=Release, CUDA_VERSION=10.2, CUDNN_VERSION=8.0.0, CXX_COMPILER=/usr/bin/c++, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -DUSE_PTHREADPOOL -fopenmp -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO -O2 -fPIC -Wno-narrowing -Wall -Wextra -Werror=return-type -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-unused-local-typedefs -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-psabi -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Werror=format -DMISSING_ARM_VST1 -DMISSING_ARM_VLD1 -Wno-stringop-overflow, FORCE_FALLBACK_CUDA_MPI=1, LAPACK_INFO=open, TORCH_VERSION=1.10.0, USE_CUDA=ON, USE_CUDNN=ON, USE_EIGEN_FOR_BLAS=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=OFF, USE_MKLDNN=OFF, USE_MPI=ON, USE_NCCL=0, USE_NNPACK=ON, USE_OPENMP=ON,
01/08 17:35:56 - mmengine - INFO - TorchVision: 0.11.1
01/08 17:35:56 - mmengine - INFO - OpenCV: 4.6.0
01/08 17:35:56 - mmengine - INFO - MMCV: 1.7.1
01/08 17:35:56 - mmengine - INFO - MMCV Compiler: GCC 7.5
01/08 17:35:56 - mmengine - INFO - MMCV CUDA Compiler: 10.2
01/08 17:35:56 - mmengine - INFO - MMDeploy: 1.3.1+bc75c9d
01/08 17:35:56 - mmengine - INFO -
01/08 17:35:56 - mmengine - INFO - **********Backend information**********
01/08 17:35:57 - mmengine - INFO - tensorrt: 8.2.1.8
01/08 17:35:57 - mmengine - INFO - tensorrt custom ops: Available
01/08 17:35:57 - mmengine - INFO - ONNXRuntime: None
01/08 17:35:57 - mmengine - INFO - ONNXRuntime-gpu: 1.10.0
01/08 17:35:57 - mmengine - INFO - ONNXRuntime custom ops: Available
01/08 17:35:57 - mmengine - INFO - pplnn: None
01/08 17:35:57 - mmengine - INFO - ncnn: None
01/08 17:35:57 - mmengine - INFO - snpe: None
01/08 17:35:57 - mmengine - INFO - openvino: None
01/08 17:35:57 - mmengine - INFO - torchscript: 1.10.0
01/08 17:35:57 - mmengine - INFO - torchscript custom ops: NotAvailable
01/08 17:35:57 - mmengine - INFO - rknn-toolkit: None
01/08 17:35:57 - mmengine - INFO - rknn-toolkit2: None
01/08 17:35:57 - mmengine - INFO - ascend: None
01/08 17:35:57 - mmengine - INFO - coreml: None
01/08 17:35:57 - mmengine - INFO - tvm: None
01/08 17:35:57 - mmengine - INFO - vacc: None
01/08 17:35:57 - mmengine - INFO -
01/08 17:35:57 - mmengine - INFO - **********Codebase information**********
01/08 17:35:57 - mmengine - INFO - mmdet: None
01/08 17:35:57 - mmengine - INFO - mmseg: None
01/08 17:35:57 - mmengine - INFO - mmpretrain: None
01/08 17:35:57 - mmengine - INFO - mmocr: None
01/08 17:35:57 - mmengine - INFO - mmagic: None
01/08 17:35:57 - mmengine - INFO - mmdet3d: None
01/08 17:35:57 - mmengine - INFO - mmpose: None
01/08 17:35:57 - mmengine - INFO - mmrotate: None
01/08 17:35:57 - mmengine - INFO - mmaction: None
01/08 17:35:57 - mmengine - INFO - mmrazor: None
01/08 17:35:57 - mmengine - INFO - mmyolo: None
Windows environment information
01/08 17:52:08 - mmengine - INFO - **********Environmental information**********
E:\conda_env\RT-DETR\lib\site-packages\requests\__init__.py:102: RequestsDependencyWarning: urllib3 (1.26.16) or chardet (5.2.0)/charset_normalizer (None) doesn't match a supported version!
warnings.warn("urllib3 ({}) or chardet ({})/charset_normalizer ({}) doesn't match a supported "
01/08 17:52:10 - mmengine - INFO - sys.platform: win32
01/08 17:52:10 - mmengine - INFO - Python: 3.9.17 (main, Jul 5 2023, 20:47:11) [MSC v.1916 64 bit (AMD64)]
01/08 17:52:10 - mmengine - INFO - CUDA available: True
01/08 17:52:10 - mmengine - INFO - numpy_random_seed: 2147483648
01/08 17:52:10 - mmengine - INFO - GPU 0: NVIDIA GeForce GTX 1060 6GB
01/08 17:52:10 - mmengine - INFO - CUDA_HOME: C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.3
01/08 17:52:10 - mmengine - INFO - NVCC: Cuda compilation tools, release 11.3, V11.3.58
01/08 17:52:10 - mmengine - INFO - MSVC: Microsoft (R) C/C++ Optimizing Compiler Version 19.36.32532 for x64
01/08 17:52:10 - mmengine - INFO - GCC: n/a
01/08 17:52:10 - mmengine - INFO - PyTorch: 1.12.1+cu113
01/08 17:52:10 - mmengine - INFO - PyTorch compiling details: PyTorch built with:
- C++ Version: 199711
- MSVC 192829337
- Intel(R) Math Kernel Library Version 2020.0.2 Product Build 20200624 for Intel(R) 64 architecture applications
- Intel(R) MKL-DNN v2.6.0 (Git Hash 52b5f107dd9cf10910aaa19cb47f3abf9b349815)
- OpenMP 2019
- LAPACK is enabled (usually provided by MKL)
- CPU capability usage: AVX2
- CUDA Runtime 11.3
- NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_80,code=sm_80;-gencode;arch=compute_86,code=sm_86;-gencode;arch=compute_37,code=compute_37
- CuDNN 8.3.2 (built against CUDA 11.5)
- Magma 2.5.4
- Build settings: BLAS_INFO=mkl, BUILD_TYPE=Release, CUDA_VERSION=11.3, CUDNN_VERSION=8.3.2, CXX_COMPILER=C:/actions-runner/_work/pytorch/pytorch/builder/windows/tmp_bin/sccache-cl.exe, CXX_FLAGS=/DWIN32 /D_WINDOWS /GR /EHsc /w /bigobj -DUSE_PTHREADPOOL -openmp:experimental -IC:/actions-runner/_work/pytorch/pytorch/builder/windows/mkl/include -DNDEBUG -DUSE_KINETO -DLIBKINETO_NOCUPTI -DUSE_FBGEMM -DUSE_XNNPACK -DSYMBOLICATE_MOBILE_DEBUG_HANDLE -DEDGE_PROFILER_USE_KINETO, LAPACK_INFO=mkl, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, TORCH_VERSION=1.12.1, USE_CUDA=ON, USE_CUDNN=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=OFF, USE_MPI=OFF, USE_NCCL=OFF, USE_NNPACK=OFF, USE_OPENMP=ON, USE_ROCM=OFF,
01/08 17:52:10 - mmengine - INFO - TorchVision: 0.13.1+cu113
01/08 17:52:10 - mmengine - INFO - OpenCV: 4.8.0
01/08 17:52:10 - mmengine - INFO - MMEngine: 0.9.0
01/08 17:52:10 - mmengine - INFO - MMCV: 2.1.0
01/08 17:52:10 - mmengine - INFO - MMCV Compiler: MSVC 192930148
01/08 17:52:10 - mmengine - INFO - MMCV CUDA Compiler: 11.3
01/08 17:52:10 - mmengine - INFO - MMDeploy: 1.3.0+2882c64
01/08 17:52:10 - mmengine - INFO -
01/08 17:52:10 - mmengine - INFO - **********Backend information**********
01/08 17:52:10 - mmengine - INFO - tensorrt: 8.2.1.8
01/08 17:52:10 - mmengine - INFO - tensorrt custom ops: Available
01/08 17:52:10 - mmengine - INFO - ONNXRuntime: None
01/08 17:52:10 - mmengine - INFO - ONNXRuntime-gpu: 1.10.0
01/08 17:52:10 - mmengine - INFO - ONNXRuntime custom ops: Available
01/08 17:52:10 - mmengine - INFO - pplnn: None
01/08 17:52:10 - mmengine - INFO - ncnn: None
01/08 17:52:10 - mmengine - INFO - snpe: None
01/08 17:52:10 - mmengine - INFO - openvino: None
01/08 17:52:10 - mmengine - INFO - torchscript: 1.12.1+cu113
01/08 17:52:10 - mmengine - INFO - torchscript custom ops: NotAvailable
01/08 17:52:10 - mmengine - INFO - rknn-toolkit: None
01/08 17:52:10 - mmengine - INFO - rknn-toolkit2: None
01/08 17:52:10 - mmengine - INFO - ascend: None
01/08 17:52:10 - mmengine - INFO - coreml: None
01/08 17:52:10 - mmengine - INFO - tvm: None
01/08 17:52:10 - mmengine - INFO - vacc: None
01/08 17:52:10 - mmengine - INFO -
01/08 17:52:10 - mmengine - INFO - **********Codebase information**********
01/08 17:52:10 - mmengine - INFO - mmdet: 3.2.0
01/08 17:52:10 - mmengine - INFO - mmseg: None
01/08 17:52:10 - mmengine - INFO - mmpretrain: None
01/08 17:52:10 - mmengine - INFO - mmocr: None
01/08 17:52:10 - mmengine - INFO - mmagic: None
01/08 17:52:10 - mmengine - INFO - mmdet3d: None
01/08 17:52:10 - mmengine - INFO - mmpose: None
01/08 17:52:10 - mmengine - INFO - mmrotate: None
01/08 17:52:10 - mmengine - INFO - mmaction: None
01/08 17:52:10 - mmengine - INFO - mmrazor: None
01/08 17:52:10 - mmengine - INFO - mmyolo: None
Error traceback
TensorRT error message
[2024-01-08 17:42:04.214] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "/home/nvidia/文档/mmdeploy_models/rtdetr-trt-sta-640/"
[2024-01-08 17:42:04.459] [mmdeploy] [error] [resize.cpp:84] unsupported interpolation method: bicubic
[2024-01-08 17:42:04.460] [mmdeploy] [error] [task.cpp:99] error parsing config: {
"context": {
"device": "<any>",
"model": "<any>",
"stream": "<any>"
},
"input": [
"img"
],
"module": "Transform",
"name": "Preprocess",
"output": [
"prep_output"
],
"transforms": [
{
"backend_args": null,
"type": "LoadImageFromFile"
},
{
"interpolation": "bicubic",
"keep_ratio": false,
"size": [
640,
640
],
"type": "Resize"
},
{
"mean": [
0,
0,
0
],
"std": [
255,
255,
255
],
"to_rgb": true,
"type": "Normalize"
},
{
"size_divisor": 32,
"type": "Pad"
},
{
"type": "DefaultFormatBundle"
},
{
"keys": [
"img"
],
"meta_keys": [
"pad_param",
"ori_filename",
"ori_shape",
"filename",
"flip",
"valid_ratio",
"img_id",
"pad_shape",
"img_path",
"img_norm_cfg",
"img_shape",
"flip_direction",
"scale_factor"
],
"type": "Collect"
}
],
"type": "Task"
}
[2024-01-08 17:42:06.949] [mmdeploy] [error] [trt_net.cpp:28] TRTNet: 6: The engine plan file is not compatible with this version of TensorRT, expecting library version 8.2.1 got 8.2.3, please rebuild.
[2024-01-08 17:42:06.950] [mmdeploy] [error] [trt_net.cpp:28] TRTNet: 4: [runtime.cpp::deserializeCudaEngine::50] Error Code 4: Internal Error (Engine deserialization failed.)
[2024-01-08 17:42:06.950] [mmdeploy] [error] [trt_net.cpp:75] failed to deserialize TRT CUDA engine
[2024-01-08 17:42:06.981] [mmdeploy] [error] [net_module.cpp:54] Failed to create Net backend: tensorrt, config: {
"context": {
"device": "<any>",
"model": "<any>",
"stream": "<any>"
},
"input": [
"prep_output"
],
"input_map": {
"img": "input"
},
"is_batched": false,
"module": "Net",
"name": "rtdetr",
"output": [
"infer_output"
],
"output_map": {},
"type": "Task"
}
[2024-01-08 17:42:06.982] [mmdeploy] [error] [task.cpp:99] error parsing config: {
"context": {
"device": "<any>",
"model": "<any>",
"stream": "<any>"
},
"input": [
"prep_output"
],
"input_map": {
"img": "input"
},
"is_batched": false,
"module": "Net",
"name": "rtdetr",
"output": [
"infer_output"
],
"output_map": {},
"type": "Task"
}
Segmentation fault (core dumped)
ONNX Runtime error message
[2024-01-08 17:46:37.772] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "/home/nvidia/文档/mmdeploy_models/rtdetr-ort-dyn/"
[2024-01-08 17:46:38.355] [mmdeploy] [error] [resize.cpp:84] unsupported interpolation method: bicubic
[2024-01-08 17:46:38.356] [mmdeploy] [error] [task.cpp:99] error parsing config: {
"context": {
"device": "<any>",
"model": "<any>",
"stream": "<any>"
},
"input": [
"img"
],
"module": "Transform",
"name": "Preprocess",
"output": [
"prep_output"
],
"transforms": [
{
"backend_args": null,
"type": "LoadImageFromFile"
},
{
"interpolation": "bicubic",
"keep_ratio": false,
"size": [
640,
640
],
"type": "Resize"
},
{
"mean": [
0,
0,
0
],
"std": [
255,
255,
255
],
"to_rgb": true,
"type": "Normalize"
},
{
"size_divisor": 32,
"type": "Pad"
},
{
"type": "DefaultFormatBundle"
},
{
"keys": [
"img"
],
"meta_keys": [
"flip_direction",
"ori_shape",
"pad_shape",
"scale_factor",
"img_shape",
"ori_filename",
"pad_param",
"img_id",
"img_norm_cfg",
"img_path",
"flip",
"filename",
"valid_ratio"
],
"type": "Collect"
}
],
"type": "Task"
}
Segmentation fault (core dumped)
Hi, I think you converted your ONNX model to TensorRT on Windows?
This step should be done on your device.
@yinfan98
Thanks for the reply. Is converting on the target device a must? My Jetson Nano only supports mmdet 2.27, which makes applying my model more difficult /(ㄒoㄒ)/~~
Hi @HEIseYOUmolc , you need to export the TensorRT model on the Jetson device. Maybe you can try deploee: https://platform.openmmlab.com/deploee to export the TensorRT model on Jetson. I've submitted a PR that supports mmdet 3.0 on JetPack 4.6 devices. It will be live shortly!
@yinfan98
Great, waiting for your release.
I already tried converting my model with trtexec and replacing the .engine file with the trtexec output, but the C++ demo still fails with the same error as with onnxruntime.
[01/07/2024-18:35:51] [I] === Trace details ===
[01/07/2024-18:35:51] [I] Trace averages of 10 runs:
[01/07/2024-18:35:51] [I] Average on 10 runs - GPU latency: 500.888 ms - Host latency: 501.397 ms (end to end 502 ms, enqueue 191.146 ms)
[01/07/2024-18:35:51] [I]
[01/07/2024-18:35:51] [I] === Performance summary ===
[01/07/2024-18:35:51] [I] Throughput: 1.99203 qps
[01/07/2024-18:35:51] [I] Latency: min = 498.864 ms, max = 503.506 ms, mean = 501.397 ms, median = 501.815 ms, percentile(99%) = 503.506 ms
[01/07/2024-18:35:51] [I] End-to-End Host Latency: min = 499.323 ms, max = 504.746 ms, mean = 502 ms, median = 502.486 ms, percentile(99%) = 504.746 ms
[01/07/2024-18:35:51] [I] Enqueue Time: min = 25.5835 ms, max = 363.854 ms, mean = 191.146 ms, median = 189.779 ms, percentile(99%) = 363.854 ms
[01/07/2024-18:35:51] [I] H2D Latency: min = 0.48291 ms, max = 0.531982 ms, mean = 0.504181 ms, median = 0.501709 ms, percentile(99%) = 0.531982 ms
[01/07/2024-18:35:51] [I] GPU Compute Time: min = 498.351 ms, max = 503.006 ms, mean = 500.888 ms, median = 501.289 ms, percentile(99%) = 503.006 ms
[01/07/2024-18:35:51] [I] D2H Latency: min = 0.00305176 ms, max = 0.00585938 ms, mean = 0.00477295 ms, median = 0.00488281 ms, percentile(99%) = 0.00585938 ms
[01/07/2024-18:35:51] [I] Total Host Walltime: 5.02 s
[01/07/2024-18:35:51] [I] Total GPU Compute Time: 5.00888 s
[01/07/2024-18:35:51] [I] Explanations of the performance metrics are printed in the verbose logs.
[01/07/2024-18:35:51] [I]
&&&& PASSED TensorRT.trtexec [TensorRT v8201] # trtexec --onnx=/home/nvidia/文档/mmdeploy_models/test/end2end.onnx --saveEngine=end2end.engine --plugins=/home/nvidia/文档/mmdeploy/mmdeploy/lib/libmmdeploy_tensorrt_ops.so --workspace=1024
Error message
[2024-01-09 15:26:53.939] [mmdeploy] [info] [model.cpp:35] [DirectoryModel] Load model: "/home/nvidia/文档/mmdeploy_models/rtdetr-trt-sta-640/"
[2024-01-09 15:26:54.174] [mmdeploy] [error] [resize.cpp:84] unsupported interpolation method: bicubic
[2024-01-09 15:26:54.174] [mmdeploy] [error] [task.cpp:99] error parsing config: {
"context": {
"device": "",
"model": "",
"stream": ""
},
"input": [
"img"
],
"module": "Transform",
"name": "Preprocess",
"output": [
"prep_output"
],
"transforms": [
{
"backend_args": null,
"type": "LoadImageFromFile"
},
{
"interpolation": "bicubic",
"keep_ratio": false,
"size": [
640,
640
],
"type": "Resize"
},
{
"mean": [
0,
0,
0
],
"std": [
255,
255,
255
],
"to_rgb": true,
"type": "Normalize"
},
{
"size_divisor": 32,
"type": "Pad"
},
{
"type": "DefaultFormatBundle"
},
{
"keys": [
"img"
],
"meta_keys": [
"pad_param",
"ori_filename",
"ori_shape",
"filename",
"flip",
"valid_ratio",
"img_id",
"pad_shape",
"img_path",
"img_norm_cfg",
"img_shape",
"flip_direction",
"scale_factor"
],
"type": "Collect"
}
],
"type": "Task"
}
Segmentation fault (core dumped)
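Note that the "unsupported interpolation method: bicubic" error comes from the SDK preprocessing step (resize.cpp), not from the backend, which is why it appears identically with TensorRT and ONNX Runtime. One possible workaround (an assumption on my part, not a confirmed fix) is to rewrite the Resize transform's interpolation in the deployed model directory's `pipeline.json` to a method the SDK resize supports, such as bilinear:

```python
import json
from pathlib import Path


def patch_interpolation(model_dir: str, new_method: str = "bilinear") -> bool:
    """Rewrite any bicubic Resize interpolation in pipeline.json.

    The file name 'pipeline.json' matches the MMDeploy SDK model
    directory layout. Returns True if a change was made.
    """
    cfg_path = Path(model_dir) / "pipeline.json"
    cfg = json.loads(cfg_path.read_text())
    changed = False

    def walk(node):
        nonlocal changed
        if isinstance(node, dict):
            # Patch Resize transforms that request bicubic interpolation.
            if node.get("type") == "Resize" and node.get("interpolation") == "bicubic":
                node["interpolation"] = new_method
                changed = True
            for value in node.values():
                walk(value)
        elif isinstance(node, list):
            for value in node:
                walk(value)

    walk(cfg)
    if changed:
        cfg_path.write_text(json.dumps(cfg, indent=2))
    return changed
```

Since the model was exported with bicubic preprocessing, resizing with bilinear will produce slightly different inputs, so accuracy should be re-checked after applying this.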