Liumouliu / deepIBL

Stochastic Attraction-Repulsion Embedding for Large Scale Localization


Some questions about the result table

yxgeee opened this issue

Hi,

Thanks for your nice work!
I have several questions about the following table:

Best epoch: 4 (out of 7)
===========================================
Recall@N      0001 0002 0003 0004 0005 0010 
===========================================
off-the-shelf 0.80 0.86 0.90 0.92 0.93 0.96 
our trained   0.89 0.94 0.95 0.96 0.97 0.98 
trained/shelf 1.12 1.09 1.06 1.05 1.05 1.02
  1. Does "off-the-shelf" here refer to (ImageNet-pretrained) VGG16+VLAD+PCA?
  2. What does trained/shelf mean?
  3. I am now trying to reproduce your results in PyTorch, but my results are similar to those of the raw NetVLAD. Do you have any suggestions?

Thank you very much!

Hi,

Thank you for the questions.
For questions 1 and 2, I would suggest reading the function pickBestNet in the NetVLAD repository, since the table is the output of that function.
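(Judging from the numbers, the trained/shelf row looks like the elementwise ratio of the trained recalls to the off-the-shelf recalls, e.g. 0.89/0.80 ≈ 1.12.) For reference, Recall@N counts a query as correct if at least one of its top-N retrieved database images is a ground-truth positive. A minimal sketch of that metric (hypothetical variable names, not the pickBestNet code itself):

```python
def recall_at_n(ranked_db_ids, positives_per_query, ns=(1, 2, 3, 4, 5, 10)):
    """ranked_db_ids: (num_queries, k) NumPy array of database indices
    sorted by descriptor distance; positives_per_query: one set of
    ground-truth positive database indices per query."""
    num_queries = len(positives_per_query)
    recalls = {}
    for n in ns:
        # A query is a hit at rank n if any of its top-n results is a positive.
        hits = sum(
            1 for q in range(num_queries)
            if positives_per_query[q] & set(ranked_db_ids[q, :n].tolist())
        )
        recalls[n] = hits / num_queries
    return recalls
```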

For question 3, that is a hard question to answer, since many details (often a small one) can affect the result.

Hi @Liumouliu

Thanks for your reply!

I have a small question: do you use the same model for all the tests in the paper, including the same PCA weights?

And another small question: do you train the model from ImageNet-pretrained weights or from NetVLAD-pretrained weights?

Thanks a lot!

Hi,

  1. Do you use the same model for all the tests in the paper, including the same PCA weights?

Yes, only one model is used.

  2. Do you train the model from ImageNet-pretrained weights or from NetVLAD-pretrained weights?

I use the ImageNet-pretrained weights.
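For reference, applying a fixed set of PCA (whitening) weights to the VLAD descriptors is just centering, a linear projection, and an L2 re-normalization. A minimal sketch, assuming the PCA parameters were learned once on training descriptors and are reused unchanged for every benchmark (variable names are hypothetical):

```python
import numpy as np

def apply_pca(descs, mean, proj):
    """descs: (n, D) L2-normalized VLAD descriptors.
    mean (D,) and proj (D, d) are the fixed PCA/whitening parameters."""
    reduced = (descs - mean) @ proj                        # project to d dims
    norms = np.linalg.norm(reduced, axis=1, keepdims=True)
    return reduced / norms                                 # unit length again
```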

Best,

liu

Good!

I assume the following steps have been done:

To run experiments on the Tokyo 24/7 dataset, you need to resize the query images (only the queries; don't resize the database images):
ims_ = vl_imreadjpeg(thisImageFns, 'numThreads', opts.numThreads, 'Resize', 640);
You also need to convert all .png images to .jpg so that vl_imreadjpeg can read them. You can use convertPngtoJPG.m to do that.
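If you are not in MATLAB, the same conversion is a few lines of Python with Pillow (the paths here are hypothetical; convertPngtoJPG.m in the repo is the reference implementation):

```python
from pathlib import Path
from PIL import Image

# Convert every .png under the dataset root to a .jpg sibling so that
# vl_imreadjpeg (a JPEG-only loader) can read the images.
root = Path('datasets/tokyo247')  # hypothetical dataset root
for png in root.rglob('*.png'):
    Image.open(png).convert('RGB').save(png.with_suffix('.jpg'), quality=95)
```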

Since you have already achieved good performance on Pitts250k-test, I think the cause could be the MATLAB function vl_imreadjpeg.

Please read the documentation of vl_imreadjpeg carefully.

For example,

ims_ = vl_imreadjpeg(thisImageFns, 'numThreads', opts.numThreads, 'Resize', 640);

resizes the shorter side of each image to 640 pixels while keeping the aspect ratio.
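If you are reproducing this step in PyTorch, torchvision's Resize with a single integer has the same semantics: it matches the shorter edge to that size and preserves the aspect ratio. A small sketch (the file path is hypothetical):

```python
from PIL import Image
import torchvision.transforms as T

# Resize(640) mirrors vl_imreadjpeg's 'Resize', 640: the shorter side of
# the image becomes 640 pixels and the aspect ratio is preserved.
resize_query = T.Compose([T.Resize(640), T.ToTensor()])

img = Image.open('query.jpg').convert('RGB')   # hypothetical query image
tensor = resize_query(img)                     # (3, H, W), min(H, W) == 640
```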

I hope the above helps.

If it does not, I would suggest first downloading my pretrained model and testing it with MatConvNet to see whether you can reproduce my results. We can start from there.

I was finally able to achieve results similar to yours on the Pitts and Tokyo datasets.
Thanks a lot for your quick replies!

I found that loading the same MatConvNet-pretrained VGG-16 weights and clustering centers is the key to reproducing the final results.
I will look into why training from the PyTorch-pretrained weights performs worse.
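For anyone hitting the same issue, here is a minimal sketch of copying MatConvNet VGG-16 conv weights into a torchvision VGG-16. It assumes the checkpoint follows the simplenn format (field names may differ in your .mat file); the clustering centers from the checkpoint would similarly initialize the NetVLAD centroids, which is omitted here:

```python
import numpy as np
import scipy.io as sio
import torch
import torchvision

# Load the MatConvNet checkpoint (hypothetical filename / simplenn layout).
mat = sio.loadmat('vgg16_matconvnet.mat', simplify_cells=True)
conv_layers = [l for l in mat['layers'] if l['type'] == 'conv']

model = torchvision.models.vgg16()
torch_convs = [m for m in model.features if isinstance(m, torch.nn.Conv2d)]

# MatConvNet stores filters as (H, W, in, out); PyTorch wants (out, in, H, W).
# zip stops after the 13 feature convs; fc layers stored as convs are skipped.
for conv, layer in zip(torch_convs, conv_layers):
    w, b = layer['weights']
    conv.weight.data = torch.from_numpy(
        np.ascontiguousarray(w.transpose(3, 2, 0, 1))).float()
    conv.bias.data = torch.from_numpy(b.ravel()).float()
```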