uzh-rpg / vilib

CUDA Visual Library by RPG

test_vilib results - Jetson Nano vs TX2

brendanlb opened this issue

Hi,

I'm trying to run vilib on my Jetson Nano to compare the results with a TX2. I managed to run the EuRoC Machine Hall 1 dataset example on my Jetson Nano by following the Examples section.

Could you please provide the console output from running the same example on a Jetson TX2? Here is the output that I get:

### Image Pyramid
 CPU (w/o. preallocated array)      : min: 248, max: 750, avg: 265.69 usec
 GPU Device (w. preallocated array) : min: 286.458, max: 303.334, avg: 290.041 usec
 GPU Host (w. preallocated array)   : min: 102, max: 156, avg: 106.98 usec
 Success: OK
### SubframePool
 Pool creation (with 10 frames)     : min: 3537, max: 4025, avg: 3710.1 usec
 Preallocated access                : min: 0, max: 2, avg: 0.47 usec
 New allocation                     : min: 10, max: 882, avg: 342.326 usec
 Success: OK
### PyramidPool
 Pool creation (with 10 frames)     : min: 5460, max: 5769, avg: 5564.71 usec
 Preallocated access                : min: 1, max: 108, avg: 2.527 usec
 New allocation                     : min: 45, max: 2643, avg: 525.929 usec
 Success: OK
### FAST detector
 CPU ---
 FAST: min: 11306, max: 16425, avg: 13537.4 [usec]
 FAST feature count: min: 3258, max: 7875, avg: 4671.76 [1]
 GPU ---
 FAST: min: 7523, max: 10749, avg: 8008.92 [usec]
 FAST feature count: min: 270, max: 340, avg: 316.68 [1]
 Success: OK
### Feature Tracker
 Note: No verification performed
 GPU ---
 Tracker execution time: min: 1523, max: 11502, avg: 2262.44 [usec]
 Tracked feature count: min: 0, max: 49, avg: 19.98 [1]
 Detected feature count: min: 0, max: 50, avg: 0.87 [1]
 Total feature count: min: 16, max: 50, avg: 20.85 [1]
 Feature track life: min: 0, max: 99, avg: 22.9655 [1]
 Success: OK
### Overall
Success: OK

Thanks

Hi brendanlb,
Thanks for your question. Personally, I can't help with this one: I no longer have access to a Jetson TX2, and the logs I have are from the paper measurements rather than the default configuration of the test executable. However, if there's anything else we can help you with, let us know.
One side comment on your timings: I believe the Jetson Nano has power modes similar to those of the Jetson TX2, so whenever you take timings, also check which power mode the device is currently in. The results in the paper were collected in MaxN mode (the unrestricted power mode).
Thanks!
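
As a rough illustration of the power-mode check suggested above, the sketch below queries the `nvpmodel` tool that ships with JetPack and extracts the reported mode name. It assumes `nvpmodel` is on the PATH, that `nvpmodel -q` prints a line of the form `NV Power Mode: <name>`, and that it may need root privileges on some releases; the helper names `parse_power_mode` and `query_power_mode` are hypothetical and not part of vilib.

```python
import re
import subprocess


def parse_power_mode(output):
    """Extract the mode name from `nvpmodel -q` style output.

    Assumption: the output contains a line like 'NV Power Mode: MAXN'.
    Returns None if no such line is found.
    """
    match = re.search(r"NV Power Mode:\s*(\S+)", output)
    return match.group(1) if match else None


def query_power_mode():
    """Run `nvpmodel -q` and return the mode name, or None if unavailable."""
    try:
        proc = subprocess.run(
            ["nvpmodel", "-q"], capture_output=True, text=True, check=False
        )
    except FileNotFoundError:
        # Not running on a Jetson, or nvpmodel is not installed.
        return None
    return parse_power_mode(proc.stdout)


if __name__ == "__main__":
    mode = query_power_mode()
    print("power mode:", mode if mode else "unknown (nvpmodel not available?)")
```

Recording the reported mode next to each timing run makes it easier to tell later whether two logs were collected under comparable conditions.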

Hi baliika,

Thank you for your quick answer. I used MaxN mode to get these results. Does anyone else have the corresponding timing information for a Jetson TX2? I don't think I can compare my test results with the ones you gave in the article.

Thanks

Dear @brendanlb

I'm a coauthor of the article and am currently using the library that @baliika developed.
I might be able to get the timings in the coming week.
However, I'm working toward a deadline at the moment and can't promise anything by a specific date.

If you don't hear back from me, feel free to ping me in a week or two.

Hi @foehnx

That would be great! Don't worry about the delay; I'm not in a rush to get these results. I'm mostly curious about the computing power of Jetson devices, especially when running a computer vision program optimized for the platform.

Thanks!

Hi @foehnx

Did you have time to run the test on your Jetson TX2?

Thanks.

Any update on the TX2, by any chance?
Thanks a lot, and great repo.

Dear @brendanlb, catching up on this: since I was working on the repo, I reran the tests:

### Image Pyramid
 CPU (w/o. preallocated array)      : min: 169, max: 478, avg: 190.65 usec
 GPU Device (w. preallocated array) : min: 76.128, max: 110.592, avg: 79.3299 usec
 GPU Host (w. preallocated array)   : min: 66, max: 107, avg: 69.26 usec
 Success: OK
### SubframePool
 Pool creation (with 10 frames)     : min: 490, max: 1553, avg: 585.39 usec
 Preallocated access                : min: 0, max: 2, avg: 0.307 usec
 New allocation                     : min: 6, max: 1170, avg: 50.978 usec
 Success: OK
### PyramidPool
 Pool creation (with 10 frames)     : min: 965, max: 1487, avg: 1001.29 usec
 Preallocated access                : min: 1, max: 28, avg: 1.727 usec
 New allocation                     : min: 37, max: 1214, avg: 211.913 usec
 Success: OK
### FAST detector
 CPU ---
 FAST: min: 8180, max: 11814, avg: 9736.55 [usec]
 FAST feature count: min: 3258, max: 7875, avg: 4671.76 [1]
 GPU ---
 FAST: min: 1164, max: 1348, avg: 1220.09 [usec]
 FAST feature count: min: 270, max: 340, avg: 316.68 [1]
 Success: OK
### Harris detector
 CPU ---
 Harris: min: 15837, max: 23198, avg: 16018 [usec]
 Harris feature count: min: 17, max: 86, avg: 36.15 [1]
 GPU ---
 Harris: min: 1710, max: 1781, avg: 1732.39 [usec]
 Harris feature count: min: 19, max: 84, avg: 35.99 [1]
 Success: OK
### Feature Tracker
 Note: No verification performed
 GPU ---
 Tracker execution time: min: 694, max: 2541, avg: 878.54 [usec]
 Tracked feature count: min: 0, max: 49, avg: 25.3 [1]
 Detected feature count: min: 0, max: 50, avg: 1.22 [1]
 Total feature count: min: 15, max: 50, avg: 26.52 [1]
 Feature track life: min: 0, max: 99, avg: 20.7377 [1]
 Success: OK
### Overall
Success: OK

What's interesting is, for example, the GPU tracker execution time: it averages 878.54 usec in my case versus 2262.44 usec in your evaluation.
Can you verify that your device is not in a power-saving mode?
You can switch the Jetson's performance modes using one of our scripts found here, e.g. mode_max_n.sh.
You might have to adapt it for the Jetson Nano; I don't know whether the commands are the same across devices.
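
To compare the two logs more systematically than by eye, the following sketch parses `min/max/avg` lines in the format shown in this thread and computes the Nano-to-TX2 ratio of the averages. The parser and the names `parse_log` and `slowdown` are hypothetical helpers, not part of vilib, and the embedded excerpts contain only a few lines copied from the outputs above. The ratios are indicative at best, since the two runs may come from different versions of the repository.

```python
import re

# Matches lines such as " FAST: min: 7523, max: 10749, avg: 8008.92 [usec]"
LINE_RE = re.compile(
    r"^\s*(?P<name>[^:]+?)\s*:\s*min:\s*(?P<min>[\d.]+),"
    r"\s*max:\s*(?P<max>[\d.]+),\s*avg:\s*(?P<avg>[\d.]+)"
)


def parse_log(text):
    """Map (section, subsection, metric name) to the reported average."""
    results = {}
    section = sub = ""
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("###"):        # e.g. "### FAST detector"
            section, sub = line.lstrip("# ").strip(), ""
        elif line.endswith("---"):        # e.g. "GPU ---"
            sub = line.rstrip("- ").strip()
        else:
            m = LINE_RE.match(line)
            if m:
                results[(section, sub, m["name"])] = float(m["avg"])
    return results


def slowdown(nano, tx2):
    """Ratio nano_avg / tx2_avg for every metric present in both logs."""
    return {key: nano[key] / tx2[key] for key in nano.keys() & tx2.keys()}


# Excerpts copied from the Jetson Nano and Jetson TX2 outputs in this thread.
NANO_LOG = """\
### FAST detector
 GPU ---
 FAST: min: 7523, max: 10749, avg: 8008.92 [usec]
### Feature Tracker
 GPU ---
 Tracker execution time: min: 1523, max: 11502, avg: 2262.44 [usec]
"""

TX2_LOG = """\
### FAST detector
 GPU ---
 FAST: min: 1164, max: 1348, avg: 1220.09 [usec]
### Feature Tracker
 GPU ---
 Tracker execution time: min: 694, max: 2541, avg: 878.54 [usec]
"""

if __name__ == "__main__":
    ratios = slowdown(parse_log(NANO_LOG), parse_log(TX2_LOG))
    for (section, sub, name), ratio in sorted(ratios.items()):
        print(f"{section} / {sub} / {name}: Nano is {ratio:.2f}x slower")
```

Pasting the full console output of each run into the two strings extends the comparison to every metric that appears in both logs.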