gustavz / realtime_object_detection

Plug and Play Real-Time Object Detection App with Tensorflow and OpenCV

Hi Gustav I have questions!

sankim90 opened this issue · comments

First, I am amazed by your work. It fits my own project perfectly.

I am working on the Jetson TX2 and Drive PX2, and as you know, inference speed is an issue on these boards.

I gathered information from several related threads and GitHub issues:

  1. tensorflow/models#3270
  2. https://devtalk.nvidia.com/default/topic/1028234/jetson-tx2/low-gpu-usage-with-tensorflow-inference-on-jetson-tx2/
  3. https://devtalk.nvidia.com/default/topic/1027819/jetson-tx2/object-detection-performance-jetson-tx2-slower-than-expected/

Q1. How do you achieve 30 fps with SSD MobileNet on the Jetson TX2?
As mentioned in (1), do you manually assign the CNN-related nodes to the GPU and the remaining nodes to the CPU in TensorFlow? If so, how?

Q2. Have you experimented with other frameworks?
I have experimented with OpenCV DNN (SSD MobileNet), Caffe (SSD MobileNet), Darknet (YOLO v2, v3) and TensorFlow (SSD MobileNet).

However, I only reached up to 9 fps.

Do you think the above frameworks lack the ability to optimize GPU/CPU allocation?

Thank you

Q1: The problem is that TensorFlow's NMS (non-maximum suppression) implementation does not run fast on the GPU. Therefore I go through all layers/nodes and place the ones connected to the NMS on the CPU, which handles them much faster.
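
To make the idea concrete, here is a minimal sketch in plain Python (no TensorFlow needed) of how such a placement could be computed. It is not the code from this repository. It assumes that "connected to the NMS" means the NMS nodes plus everything downstream of them, and the function name `assign_devices`, the `nms_marker` substring, and the node names in the toy graph are all hypothetical.

```python
from collections import deque


def assign_devices(graph, nms_marker="NonMaxSuppression"):
    """Return a {node_name: device_string} placement.

    `graph` maps each node name to the list of its input node names.
    Nodes whose name contains `nms_marker` (an assumed naming scheme),
    and every node that consumes their output directly or indirectly,
    go to the CPU. All remaining nodes go to the GPU.
    """
    # Build forward edges: node -> nodes that consume its output.
    consumers = {name: [] for name in graph}
    for name, inputs in graph.items():
        for src in inputs:
            consumers.setdefault(src, []).append(name)

    # Start from the NMS nodes and walk downstream (breadth-first).
    seeds = [name for name in consumers if nms_marker in name]
    on_cpu = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for nxt in consumers[node]:
            if nxt not in on_cpu:
                on_cpu.add(nxt)
                queue.append(nxt)

    return {
        name: "/cpu:0" if name in on_cpu else "/gpu:0"
        for name in consumers
    }


# Toy graph with hypothetical node names, loosely shaped like an SSD model.
graph = {
    "image": [],
    "FeatureExtractor/conv": ["image"],
    "BoxPredictor/boxes": ["FeatureExtractor/conv"],
    "BoxPredictor/scores": ["FeatureExtractor/conv"],
    "Postprocessor/NonMaxSuppression": [
        "BoxPredictor/boxes",
        "BoxPredictor/scores",
    ],
    "detection_boxes": ["Postprocessor/NonMaxSuppression"],
}

placement = assign_devices(graph)
for name, device in placement.items():
    print(device, name)
```

In TensorFlow 1.x, device strings like these could be written to each node's `device` field in a `GraphDef` before importing it, so the convolutional part stays on the GPU while the post-processing runs on the CPU. Whether the project does exactly this is not stated in this thread.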

Q2: No, only Darknet, which also performs well but is still slower than my approach.