NVIDIA-AI-IOT / redtail

Perception and AI components for autonomous mobile robotics.


stereoDNN resnet18 test fails

mtbsteve opened this issue · comments

@Alexey-Kamenev my TX2 is running out of memory when I run the stereoDNN model tests for resnet18 fp16 and fp32. When I run:
./bin/nvstereo_sample_app_debug resnet18 1025 321 ./models/ResNet-18/TensorRT/trt_weights.bin ./sample_app/data/img_left.png ./sample_app/data/img_right.png ./bin/disp.bin
partway through the run, memory consumption spikes from approximately 4 GB to the TX2's limit and the process gets killed.

All other models (NVsmall, NVtiny, and ResNet18-2D) run without problems, within the processing times indicated in the wiki.

I am running Redtail on a TX2 with JetPack 4.2.2 / Ubuntu 18.04 and ROS Melodic. I know this setup is not supported, but I am not sure it is the root cause of this issue, since everything else seems to work. Also, I don't want to go back to the old 3.2 release due to other dependencies.
Any ideas are appreciated!

One idea worth checking is which call/plugin causes the memory spike. I would specifically check the cuDNN autotuner in the Conv3DPlugin plugin (also read my comments around that line for more details). The function cudnnFindConvolutionForwardAlgorithm tries to allocate workspace memory while benchmarking algorithms, and although it is supposed to fail gracefully and return an appropriate error code, that may not always be the case.

Ideally, the Ex version of the tuner should be used, since it does not allocate any memory itself, unlike its non-Ex counterpart, but I did not have time to implement that properly.
Note that the Conv3DTransposePlugin plugin does not have this problem: it uses the Get version of the tuner, which does not allocate any memory at all.
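Roughly, the difference between the three tuner entry points looks like this (a pseudocode sketch against the cuDNN 7 API; handle, descriptor, and buffer setup omitted):

```cpp
// 1) Find: benchmarks candidate algorithms and allocates its own workspace
//    per algorithm internally. On a memory-constrained TX2 that internal
//    allocation is the suspected source of the OOM.
cudnnFindConvolutionForwardAlgorithm(handle, xDesc, wDesc, convDesc, yDesc,
                                     reqCount, &retCount, perfResults);

// 2) FindEx: also benchmarks, but runs inside a caller-provided workspace,
//    so the caller controls the memory ceiling.
cudnnFindConvolutionForwardAlgorithmEx(handle, xDesc, x, wDesc, w, convDesc,
                                       yDesc, y, reqCount, &retCount,
                                       perfResults, workspace, workspaceSize);

// 3) Get: heuristic choice under an explicit memory limit, no benchmarking
//    and no allocation at all (the approach Conv3DTransposePlugin uses).
cudnnGetConvolutionForwardAlgorithm(handle, xDesc, wDesc, convDesc, yDesc,
                                    CUDNN_CONVOLUTION_FWD_SPECIFY_WORKSPACE_LIMIT,
                                    memLimitBytes, &algo);
```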

Finally, as of the TensorRT 6.0 release these plugins are no longer needed: TRT now supports 3D convolutions natively. However, the model generation code needs to be updated to use TRT layers rather than our plugins.
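For reference, a native 3D convolution in TensorRT 6+ would be built roughly like this (a sketch only; the output-map count, kernel size, and weights below are placeholders, and the real model-generation code would supply its own):

```cpp
// Sketch: replacing Conv3DPlugin with TensorRT 6's built-in NdConvolution.
// `input` is a 5D tensor; kernelWeights/biasWeights are nvinfer1::Weights.
nvinfer1::IConvolutionLayer* conv = network->addConvolutionNd(
    *input,
    nbOutputMaps,                 // placeholder: number of output channels
    nvinfer1::Dims3{3, 3, 3},     // placeholder: 3x3x3 kernel
    kernelWeights, biasWeights);
conv->setStrideNd(nvinfer1::Dims3{1, 1, 1});
conv->setPaddingNd(nvinfer1::Dims3{1, 1, 1});
```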

Thanks for getting back, @Alexey-Kamenev.
Here is the tail of the output when I run the test program with resnet18:

.....
TRT INFO: --------------- Timing deconv3D_3_add_skip(5)
TRT INFO: Tactic 1 time 13.8621
TRT INFO: Tactic 2 time 21.4979
TRT INFO: 
TRT INFO: --------------- Timing deconv3D_4_add_skip(5)
TRT INFO: Tactic 1 time 56.5356
TRT INFO: Tactic 2 time 83.2072
TRT INFO: Formats and tactics selection completed in 41.6912 seconds.
TRT INFO: After reformat layers: 137 layers
TRT INFO: Block size 1437778944
TRT INFO: Block size 1073741824
TRT INFO: Block size 718889472
TRT INFO: Block size 718889472
TRT INFO: Block size 181191168
TRT INFO: Block size 181191168
TRT INFO: Block size 3145216
TRT INFO: Block size 929280
TRT INFO: Total Activation Memory: 4315756544
TRT INFO: Detected 2 input and 1 output network tensors.
TRT INFO: left_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock1_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock1_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock1_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock1_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock2_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock2_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock2_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock2_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock3_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock3_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock3_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock3_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock4_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock4_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock4_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock4_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock5_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock5_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock5_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock5_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock6_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock6_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock6_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock6_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock7_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock7_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock7_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock7_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock8_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: left_resblock8_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock8_conv1_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: right_resblock8_conv2_add_act: Dims: {  32, 161, 513}, Format: [Float, NCHW]
TRT INFO: cost_vol: InDims(x2): {  32, 161, 513}
TRT INFO: cost_vol: OutDims   : {  68,  64, 161, 513}
Killed
apsync@apsync:~/redtail/stereoDNN$