Mismatch in number of detections: TensorRT ONNX inference vs. PyTorch .pth file

Question

Allamrahul opened this issue 2 years ago

Dataset: I am using a custom dataset with .npy point cloud files and annotations. I followed all the steps required for custom dataset preparation, and I get great results with PyTorch: 90% mAP on my eval set.

However, once I convert the .pth file to ONNX format using exporter.py, TensorRT inference with the C++ program returns fewer detections than PyTorch inference with the .pth file for every point cloud in my eval dataset.

For the export, exporter.py and simplifier_onnx.py are used. However, both scripts are hardcoded for the 3 classes of the KITTI dataset, and I have just one class to detect. I therefore referred to the following commit to make the ONNX export work: https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars/pull/77/commits. After this, I was able to export, but I then ran into the following issue: #82. I resolved it by tinkering with the export script, as mentioned in the following comment: #77 (comment). After this, my TensorRT ONNX detections were at least a subset of what I was seeing with PyTorch, but not the whole set. There is a clear delta between the TensorRT ONNX results and the PyTorch .pth results for the majority of my eval set, as the following table shows:

Bounding box delta comparison: PyTorch .pth vs. TensorRT ONNX

File	PyTorch (.pth) detections	TensorRT C++ (.onnx) detections, separated by ";"	Delta (PyTorch count minus TensorRT count)
000000.npy	tensor([[ 9.6498, 1.1609, 1.9397, 0.2856, 0.4898, 2.8947, 6.2814], [ 24.8358, 1.3459, 2.5912, 0.2332, 0.4984, 3.0438, 6.2827], [ 24.9936, -10.4810, 3.2429, 0.2568, 0.4702, 3.1647, 6.2816], [ 9.8542, -10.6894, 2.1888, 0.4316, 0.4553, 2.7412, 6.2486]], device='cuda:0')	24.8358 1.34592 2.59117 0.23324 0.498444 3.04383 6.28266 0 0.46325 ; 24.9936 -10.481 3.24294 0.256755 0.47017 3.16474 6.28156 0 0.445165 ; 9.8573 -10.6925 2.17166 0.433223 0.452724 2.7258 6.24912 0 0.445157	1
000001.npy	tensor([[ 9.6501, 1.1778, 1.8507, 0.2533, 0.4935, 2.7208, 6.2741], [ 24.9947, -10.4883, 3.0557, 0.2706, 0.4838, 3.0915, 6.2594], [ 24.8404, 1.3479, 2.6033, 0.2287, 0.4947, 3.0391, 6.2825], [ 9.8570, -10.6883, 2.1663, 0.4322, 0.4521, 2.7124, 6.2346]], device='cuda:0')	9.65337 1.1817 1.80798 0.248034 0.493837 2.66008 6.27361 0 ; 24.9947 -10.4883 3.05572 0.270619 0.483843 3.09145 6.25942 0 0.670895 ; 24.8404 1.34787 2.60326 0.228719 0.494724 3.03909 6.28252 0 0.459299 ; 9.8545 -10.6925 2.1472 0.438129 0.448904 2.7132 6.23376 0 0.424986	0
000002.npy	tensor([[ 9.6042, 1.1503, 2.0593, 0.2839, 0.4955, 2.9902, 6.3128], [ 24.7882, 1.3638, 2.6522, 0.2538, 0.5039, 3.1623, 6.2903], [ 9.7436, -10.6760, 2.1350, 0.3712, 0.4578, 2.6609, 6.2507], [ 24.9494, -10.5134, 3.2150, 0.2888, 0.4944, 3.3462, 6.2143]], device='cuda:0')	9.74478 -10.6817 2.1041 0.374984 0.453993 2.63108 6.25019 0 0.532783 ; 24.9494 -10.5134 3.21504 0.288844 0.494413 3.34624 6.21432 0 0.515557 ; 0.309276 -10.6853 2.08503 0.458935 0.413923 3.13058 6.09365 0 0.412784	1
000003.npy	tensor([[ 9.5610, -10.4589, 2.1206, 0.4139, 0.4505, 2.7193, 6.2802], [ 24.3758, 1.7272, 2.6000, 0.2396, 0.4966, 3.0571, 6.1985], [ 24.7097, -10.1406, 3.0566, 0.2619, 0.4718, 3.0835, 6.2728], [ 9.2311, 1.3354, 1.8251, 0.2543, 0.4891, 2.7015, 6.2441], [ 8.9262, 7.8720, 2.1033, 0.3872, 0.4424, 2.7067, 6.3819]], device='cuda:0')	9.56115 -10.4598 2.09798 0.418282 0.448642 2.68469 6.27597 0 0.735731 ; 24.3758 1.72724 2.59998 0.239596 0.496643 3.05714 6.19854 0 0.629267 ; 24.7097 -10.1406 3.0566 0.26186 0.471776 3.08349 6.27275 0 0.585723 ; 9.21606 1.33047 1.82858 0.254299 0.490583 2.66956 6.23728 0 0.471899	1
000004.npy	tensor([[ 6.4732, 2.6481, 1.7006, 0.2879, 0.4678, 2.6444, 6.3118], [21.4290, 4.8774, 2.5937, 0.2325, 0.5022, 3.1258, 6.4040], [23.1383, -6.8599, 2.7714, 0.2839, 0.4960, 3.0160, 6.3080], [ 8.1175, -8.9831, 2.2486, 0.3856, 0.4450, 2.7676, 6.3550]], device='cuda:0')	23.1383 -6.85986 2.77142 0.283893 0.495966 3.01596 6.30801 0 0.580739 ; 8.11463 -8.9818 2.12152 0.396575 0.436063 2.65015 6.35895 0 0.429396	2
000005.npy	tensor([[ 5.5251, 2.7731, 1.6679, 0.3284, 0.4662, 2.6940, 6.2788], [20.4834, 5.0487, 2.5489, 0.2769, 0.5241, 3.1817, 6.4027], [ 7.3220, -8.8810, 2.1011, 0.4506, 0.4281, 2.6641, 6.3688], [22.2850, -6.6383, 2.6867, 0.2744, 0.4986, 3.0367, 6.3119]], device='cuda:0')	7.32207 -8.88152 2.0861 0.445914 0.430497 2.6552 6.36896 0 0.696223	3
000006.npy	tensor([[18.0280, 4.9469, 2.4509, 0.3035, 0.5205, 3.1520, 6.3221], [19.8413, -6.7181, 2.7475, 0.3097, 0.5246, 3.2910, 6.3001], [ 3.1871, 2.6373, 1.7287, 0.4621, 0.4224, 2.9021, 6.3156], [ 4.8621, -8.9172, 1.8402, 0.4540, 0.3952, 2.5332, 6.3420], [32.0742, 7.1384, 3.3039, 0.2361, 0.4806, 3.3647, 6.4108], [21.2824, 12.1162, 3.6256, 0.2676, 0.4659, 3.5638, 6.5643], [ 0.6082, 4.4304, 1.8762, 0.4470, 0.4348, 3.4172, 6.2065]], device='cuda:0')	4.85492 -8.92965 1.819 0.460386 0.396642 2.5153 6.34298 0 0.494817	6
000007.npy	tensor([[18.2038, -6.8837, 2.5308, 0.3099, 0.5277, 3.1208, 6.3168], [16.5025, 4.7925, 2.3577, 0.3065, 0.5248, 3.0787, 6.3005], [ 1.5735, 2.6487, 1.6249, 0.5034, 0.4109, 2.6605, 6.3160], [ 2.2250, 2.7058, 1.8312, 0.4703, 0.4060, 3.0384, 6.3380], [ 3.2350, -8.9478, 1.8462, 0.4438, 0.4085, 2.5771, 6.3109], [19.7396, 11.9755, 3.2925, 0.2890, 0.5000, 3.6453, 6.5671], [ 3.5311, 2.8095, 2.3147, 0.4571, 0.4455, 4.2559, 6.3274], [30.5054, 6.8140, 3.3753, 0.2804, 0.5016, 3.6093, 6.2777]], device='cuda:0')	18.2057 -6.88499 2.4907 0.307031 0.527094 3.07328 6.31815 0 0.636754 ; 16.502 4.79033 2.33373 0.299566 0.523598 3.0561 6.3044 0 0.532995 ; 1.56738 2.64373 1.68283 0.506594 0.412098 2.66617 6.31967 0 0.51762 ; 3.22002 -8.95614 1.8366 0.449459 0.409571 2.56386 6.3068 0 0.431358 ; 2.2279 2.70934 1.85016 0.464891 0.40516 3.07841 6.33425 0 0.391239 ; 19.7397 11.9755 3.29258 0.28902 0.499917 3.64496 6.56848 0 0.381675	2
000008.npy	tensor([[ 8.7021, -7.9169, 2.6375, 0.3647, 0.4888, 3.5404, 6.2655], [ 7.7196, 3.7774, 2.3025, 0.4060, 0.4704, 3.2993, 6.2707], [22.8483, -6.6640, 3.5341, 0.3350, 0.5277, 4.1040, 6.3141], [21.7832, 5.1120, 2.8534, 0.2781, 0.5178, 3.2145, 6.1912], [ 3.2359, -8.4495, 2.0291, 0.4187, 0.4105, 3.2451, 6.2915]], device='cuda:0')	8.70127 -7.92042 2.62612 0.36539 0.486129 3.51703 6.26476 0 0.864963 ; 7.6994 3.79393 2.24546 0.40736 0.469539 3.21603 6.25044 0 0.73586 ; 22.8483 -6.66398 3.53411 0.335008 0.527745 4.10398 6.31413 0 0.605781 ; 21.7832 5.11193 2.85462 0.278421 0.517415 3.21271 6.21329 0 0.508611 ;	1
000009.npy	tensor([[19.5711, 4.7877, 2.6956, 0.3077, 0.5412, 3.3734, 6.2451], [ 6.3672, -8.0972, 2.7778, 0.4181, 0.4778, 4.1039, 6.2421], [ 5.4901, 3.6080, 2.3323, 0.4340, 0.4502, 3.7175, 6.2740], [20.3728, -7.0433, 3.3803, 0.3514, 0.5351, 4.1972, 6.3070], [26.6330, 11.8861, 3.9950, 0.3089, 0.5019, 4.1503, 6.6127]], device='cuda:0')	5.47306 3.61103 2.394 0.432978 0.453338 3.80027 6.32163 0 0.714706 ; 19.5717 4.78751 2.71062 0.308163 0.539413 3.36241 6.27686 0 0.621834 ; 6.35329 -8.10289 2.76789 0.422266 0.47866 4.13415 6.24032 0 0.606208	2
000010.npy	tensor([[18.3196, 4.6323, 3.2815, 0.3700, 0.5370, 4.5950, 6.3164], [ 5.0913, -8.1561, 2.6470, 0.4329, 0.4667, 4.0704, 6.2747], [19.1831, -7.1906, 3.3499, 0.3578, 0.5279, 4.2080, 6.3127], [ 2.5482, 4.3696, 1.6065, 0.4281, 0.3918, 2.8003, 6.2634]], device='cuda:0')	5.08485 -8.16716 2.64149 0.431825 0.466464 4.03816 6.27571 0 0.731938 ; 19.1846 -7.19002 3.2872 0.352221 0.529464 4.08496 6.31286 0 0.591408	2
000011.npy	tensor([[15.3577, -7.3005, 3.0413, 0.3812, 0.5104, 4.2909, 6.3159], [ 0.6093, 3.4074, 1.9033, 0.5056, 0.4306, 3.3583, 6.1790], [14.5397, 4.4909, 3.0513, 0.3723, 0.5222, 4.3821, 6.2383], [30.4700, -6.2796, 4.0225, 0.2914, 0.4843, 3.8403, 6.3179], [29.6795, 5.5980, 4.0535, 0.2816, 0.4877, 3.9741, 6.2869]], device='cuda:0')	0.594493 3.41456 2.11992 0.502219 0.441799 3.74912 6.17387 0 0.828488 ; 15.3587 -7.29961 2.99875 0.375657 0.512654 4.18005 6.31556 0 0.798267 ; 30.47 -6.27963 4.02255 0.29143 0.484331 3.84032 6.31788 0 0.434042	2
000012.npy	tensor([[ 11.2944, 4.3980, 3.0133, 0.3911, 0.5198, 4.6365, 6.2670], [ 26.4576, 5.3648, 3.6263, 0.3002, 0.5062, 3.8833, 6.3176], [ 12.0963, -7.3715, 3.0630, 0.3846, 0.5122, 4.3017, 6.2922], [ 8.1463, -12.5014, 2.9129, 0.3691, 0.4980, 3.9686, 6.1562], [ 27.1433, -6.4810, 3.9175, 0.3048, 0.5110, 3.9699, 6.3372], [ 18.4373, 11.4960, 3.7129, 0.3159, 0.4918, 4.2750, 6.4670]], device='cuda:0')	8.14566 -12.506 2.84502 0.364298 0.498799 3.84938 6.15557 0 0.378752 ; 12.0904 -7.37816 2.90519 0.378811 0.516209 4.00902 6.29017 0 0.376648	4
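A count-only delta does not show which boxes went missing. The following is a minimal sketch, not part of either code base, that matches the two outputs by box center for one row of the table (000004.npy). The helper names `parse_trt_line` and `unmatched_boxes` and the 0.5 distance threshold are assumptions made for illustration, and only the first seven fields of each TensorRT box are used because the meaning of the trailing fields is assumed (class id and score) rather than confirmed.

```python
import math

def parse_trt_line(line):
    """Parse one TensorRT output string: boxes separated by ';', fields by spaces."""
    boxes = []
    for chunk in line.split(";"):
        fields = chunk.split()
        if len(fields) < 7:
            continue  # skip empty chunks, e.g. after a trailing ';'
        boxes.append([float(v) for v in fields[:7]])
    return boxes

def unmatched_boxes(ref_boxes, test_boxes, max_center_dist=0.5):
    """Return the reference boxes that have no test box center within max_center_dist."""
    remaining = list(test_boxes)
    missing = []
    for ref in ref_boxes:
        best, best_d = None, max_center_dist
        for cand in remaining:
            d = math.dist(ref[:3], cand[:3])  # Euclidean distance between centers
            if d <= best_d:
                best, best_d = cand, d
        if best is None:
            missing.append(ref)
        else:
            remaining.remove(best)  # each test box may match only once
    return missing

# Values copied from the 000004.npy row of the table.
pytorch_boxes = [
    [6.4732, 2.6481, 1.7006, 0.2879, 0.4678, 2.6444, 6.3118],
    [21.4290, 4.8774, 2.5937, 0.2325, 0.5022, 3.1258, 6.4040],
    [23.1383, -6.8599, 2.7714, 0.2839, 0.4960, 3.0160, 6.3080],
    [8.1175, -8.9831, 2.2486, 0.3856, 0.4450, 2.7676, 6.3550],
]
trt_line = ("23.1383 -6.85986 2.77142 0.283893 0.495966 3.01596 6.30801 0 0.580739 ; "
            "8.11463 -8.9818 2.12152 0.396575 0.436063 2.65015 6.35895 0 0.429396")

trt_boxes = parse_trt_line(trt_line)
missing = unmatched_boxes(pytorch_boxes, trt_boxes)
for box in missing:
    print("missing in TensorRT output:", box)
```

Listing the missing boxes per file this way makes it easier to check whether the dropped detections share a pattern, for example a particular region or box size.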

Please let me know if you know something that could help me.

Rahul Allam · Answer 1 · Thu Feb 16 2023 02:38:54 GMT+0800 (China Standard Time)

I see the same behavior with the KITTI dataset as well.

Can anyone confirm whether this is expected behavior, or is it not supposed to happen?

Kwangjin Choi · Answer 2 · Fri Feb 24 2023 14:58:40 GMT+0800 (China Standard Time)

Hello, can you tell me how much the 3D detection performance drops?

Rahul Allam · Answer 3 · Thu Mar 09 2023 03:19:09 GMT+0800 (China Standard Time)

Hi, as shown in my initial comment, the delta between PyTorch (.pth) and TensorRT inference is as large as 6 detections (000006.npy). I have about 30 evaluation point clouds, and I see this drop in 90% of them. Is there anything I can do to avoid it?
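To put a rough number on the drop for the 13 files listed in the table of the original post, the per-file detection counts can be tallied. This is a minimal sketch with the counts transcribed by hand from that table. It compares counts only, so it is not an mAP measurement and does not check that the TensorRT boxes coincide with the PyTorch ones.

```python
# (PyTorch count, TensorRT count) per file, transcribed from the table.
counts = {
    "000000.npy": (4, 3), "000001.npy": (4, 4), "000002.npy": (4, 3),
    "000003.npy": (5, 4), "000004.npy": (4, 2), "000005.npy": (4, 1),
    "000006.npy": (7, 1), "000007.npy": (8, 6), "000008.npy": (5, 4),
    "000009.npy": (5, 3), "000010.npy": (4, 2), "000011.npy": (5, 3),
    "000012.npy": (6, 2),
}

deltas = {name: pt - trt for name, (pt, trt) in counts.items()}
total_pytorch = sum(pt for pt, _ in counts.values())
total_trt = sum(trt for _, trt in counts.values())
affected = [name for name, d in deltas.items() if d > 0]
worst = max(deltas, key=deltas.get)

print(f"files with fewer TensorRT detections: {len(affected)} of {len(counts)}")
print(f"detections: PyTorch {total_pytorch}, TensorRT {total_trt}")
print(f"share of PyTorch detections also reported by count: {total_trt / total_pytorch:.1%}")
print(f"largest delta: {deltas[worst]} in {worst}")
```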

wangxj2014 · Answer 4 · Mon Jun 19 2023 15:35:16 GMT+0800 (China Standard Time)

I also encountered the same problem. Is there any way to solve this problem?

Dreamdreams8 · Answer 5 · Fri Feb 02 2024 09:54:06 GMT+0800 (China Standard Time)

The same problem. Has anyone solved it?