vcg-uvic / lf-net-release

Code Release for LF-Net: Learning Local Features from Images

Inference time for images

ShrutheeshIR opened this issue

Thanks for releasing the code for this amazing work!
I have a question about the execution time on the test images. How long does the code take to extract and describe features for a single image? The paper claims that feature extraction can be performed at 25 fps on VGA frames.

I ran the code on Google Colaboratory (since I don't have access to a GPU of my own right now) on the given examples, as instructed in the README. Each image takes about 1.7 seconds to complete. Admittedly, the GPU provided is not as good as the Titan X Pascal mentioned in the paper, but is such a huge difference to be expected?

Is the FPS figure given only for feature extraction, rather than for the extraction and description stages combined? If so, what FPS do you expect for the entire process on a single image?

Thanks!

> Is the FPS given only for feature extraction and not combining extraction and description stage? If so, what is the FPS you expect for the entire process for a single image?

The figure is for the entire process, for a single image. That obviously excludes one-time overhead such as loading the network, but 2 seconds isn't even in the ballpark.

That is what seemed concerning to me.
I timed the sess.run call by putting a timer just before and after it, and it takes about 0.45 seconds per image, or roughly 2.2 FPS. I suppose it must be a problem with the GPU then.
Thanks for the clarification.
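
For anyone else measuring this, here is a minimal sketch of the kind of timing described above. It is not code from this repository: `time_per_call` and `fps` are hypothetical helper names, and the `sess`, `fetches`, and `feed_dict` in the usage comment stand in for whatever your own script passes to sess.run. It assumes the first few calls are slower (graph setup, GPU warm-up), so it discards them before averaging.

```python
import time


def time_per_call(fn, n_warmup=3, n_runs=20):
    """Return the mean wall-clock seconds per call of fn().

    The first n_warmup calls are executed but not timed, on the assumption
    that they include one-time setup cost rather than steady-state cost.
    """
    for _ in range(n_warmup):
        fn()
    start = time.perf_counter()
    for _ in range(n_runs):
        fn()
    elapsed = time.perf_counter() - start
    return elapsed / n_runs


def fps(seconds_per_image):
    """Convert seconds per image into frames per second."""
    return 1.0 / seconds_per_image


# Hypothetical usage with an existing TF1 session (names are placeholders):
#   sec = time_per_call(lambda: sess.run(fetches, feed_dict=feed_dict))
#   print("%.3f s/image, %.1f FPS" % (sec, fps(sec)))
```

For reference, 25 fps corresponds to 0.04 seconds per image, while the 0.45 seconds measured above works out to about 2.2 FPS.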

That still seems high, but it's a bit more reasonable. There are other variables, such as the number of points we extracted for that experiment (which I don't remember), given the cropping with the transformers. Glad you figured it out.