Inference time for images

Question

ShrutheeshIR opened this issue 4 years ago

Shrutheesh Raman Iyer commented 4 years ago

Thanks for releasing the code for this amazing work!
I have a query regarding the execution time on the test images. How long does the code take to extract and describe features for a single image? The paper claims that feature extraction can be performed at 25 fps on VGA frames.

I tried running the code on Google Colaboratory (due to lack of access to a GPU at this time) on the given examples, as instructed in the Readme. Each image takes about 1.7 seconds to complete. Admittedly, the GPU provided is not as good as the Titan X Pascal mentioned in the paper, but is such a huge difference to be expected?

Is the FPS given only for feature extraction, and not for the extraction and description stages combined? If so, what FPS do you expect for the entire process for a single image?

Thanks!

Eduard Trulls · Answer 1 · Tue Jun 09 2020 19:14:58 GMT+0800 (China Standard Time)

Is the FPS given only for feature extraction, and not for the extraction and description stages combined? If so, what FPS do you expect for the entire process for a single image?

The figure is for the entire process, for a single image. That obviously excludes one-off overhead such as loading the network, but 2 seconds isn't even in the ballpark.

Shrutheesh Raman Iyer · Answer 2 · Tue Jun 09 2020 19:38:28 GMT+0800 (China Standard Time)

That is what seemed concerning to me.
I have now timed the sess.run call itself, putting a timer just before and after it, and it takes about 0.45 seconds per image, or roughly 2.2 FPS. I suppose it must be a problem with the GPU then.
Thanks for the clarification.
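
For anyone reproducing this measurement, here is a minimal sketch of the timing approach described above, not the repository's own code. It assumes the forward pass can be wrapped in a zero-argument callable. The names `time_per_call` and `fps` are hypothetical helpers, and the stand-in workload replaces the actual sess.run call. A few untimed warm-up calls are included because the first calls usually carry one-off setup cost, which is the overhead the reply above excludes.

```python
import time
from statistics import median


def time_per_call(fn, warmup=3, runs=20):
    """Median wall-clock seconds per call of fn(), measured after untimed warm-up calls."""
    for _ in range(warmup):
        fn()  # untimed: absorbs one-off setup cost
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        samples.append(time.perf_counter() - start)
    return median(samples)


def fps(seconds_per_image):
    """Convert a per-image latency in seconds to frames per second."""
    return 1.0 / seconds_per_image


if __name__ == "__main__":
    # Stand-in workload (assumption); in the real script this lambda would
    # wrap the sess.run call for a single image.
    latency = time_per_call(lambda: sum(range(100_000)))
    print(f"{latency * 1000:.2f} ms per call, {fps(latency):.1f} FPS")

    # Figures quoted in this thread, for comparison:
    print(f"25 FPS budget: {1000 / 25:.0f} ms per image")
    print(f"0.45 s per image: {fps(0.45):.1f} FPS")
    print(f"1.7 s per image: {fps(1.7):.2f} FPS")
```

Taking the median rather than the mean keeps a single slow call from skewing the result. The 25 FPS figure corresponds to a budget of 40 ms per image, so 0.45 s is still roughly 11 times over it.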

Eduard Trulls · Answer 3 · Tue Jun 09 2020 19:42:44 GMT+0800 (China Standard Time)

It still seems high, but it's a bit more reasonable. There are other variables, such as the number of points we extracted for that experiment (which I don't remember), given the cropping with the transformers. Glad you figured it out.