WXinlong / DenseCL

Dense Contrastive Learning (DenseCL) for self-supervised representation learning, CVPR 2021 Oral.

Is it possible to gain dense correspondence from the known data augmentation?

lilanxiao opened this issue · comments

Hi, thank you very much for the nice work!

I have a question about the dense correspondence between views. In the paper, the correspondence is obtained by computing the similarity between feature vectors from the backbone. Since the data augmentations (e.g. rotation, cropping, flipping) applied to each view of the same image are known, the correspondence could also be obtained directly from these transformations.

For example, suppose Image A is a left-right flipped copy of Image B. The two images are encoded into 3x3 feature maps, which can be written as:

fa1, fa2, fa3
fa4, fa5, fa6
fa7, fa8, fa9

and

fb1, fb2, fb3
fb4, fb5, fb6
fb7, fb8, fb9

Since A and B are flipped views of the same image, the correspondence would be (fa1, fb3), (fa2, fb2), (fa3, fb1), ... .
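The flipped-view correspondence above can be sketched in a few lines. This is a hypothetical illustration (the function name and row-major indexing are my own, not from the DenseCL code): each cell (r, c) in view A maps to cell (r, w-1-c) in the horizontally flipped view B.

```python
# Hypothetical sketch: dense correspondence derived directly from a
# known left-right flip, for an h x w feature map whose cells are
# indexed 0..h*w-1 in row-major order.
def flip_correspondence(h, w):
    """Return (index_in_A, index_in_B) pairs: cell (r, c) in view A
    corresponds to cell (r, w - 1 - c) in the flipped view B."""
    pairs = []
    for r in range(h):
        for c in range(w):
            pairs.append((r * w + c, r * w + (w - 1 - c)))
    return pairs

pairs = flip_correspondence(3, 3)
# Matches the example above: fa1<->fb3, fa2<->fb2, fa3<->fb1, ...
print(pairs[:3])  # [(0, 2), (1, 1), (2, 0)]
```

The same idea extends to crops (map grid cells through the crop box offsets) and rotations, as long as the augmentation parameters are recorded.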

From my perspective, the transformation-based correspondence is more straightforward, but the paper doesn't use it. Is there any intuition behind this choice?

Thank you again!

Hi, I notice there is already a paper with a similar idea (https://arxiv.org/pdf/2011.10043.pdf). Please correct me if I misunderstood what you mean.

Yes, using the geometric transformations is a straightforward way. In our framework, the two approaches achieve almost the same results. This part of the experiments will be added in the next version of the paper.

As discussed in our paper, our proposed method is simpler and more flexible.
Please refer to the paper for a detailed discussion (the end of Sec. 1.1, Related Work, "Pre-training for dense prediction tasks").
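For comparison, the similarity-based matching described in the question (each backbone feature vector in one view matched to its most similar vector in the other) can be sketched as follows. This is a minimal numpy illustration, not the actual DenseCL implementation; the function name and shapes are assumptions for the example.

```python
import numpy as np

def similarity_correspondence(feat_a, feat_b):
    """feat_a, feat_b: (N, D) arrays of N spatial feature vectors.
    Match each vector in A to the most similar vector in B by
    cosine similarity (argmax over B for each row of A)."""
    a = feat_a / np.linalg.norm(feat_a, axis=1, keepdims=True)
    b = feat_b / np.linalg.norm(feat_b, axis=1, keepdims=True)
    sim = a @ b.T              # (N, N) cosine-similarity matrix
    return sim.argmax(axis=1)  # index in B matched to each cell of A

# Toy check: if B is A with its cells reversed (a flipped 1x3 map),
# similarity matching recovers the flip correspondence.
rng = np.random.default_rng(0)
feat_a = rng.normal(size=(3, 8))
feat_b = feat_a[::-1]
print(similarity_correspondence(feat_a, feat_b))  # [2 1 0]
```

Unlike the transformation-based variant, this matching needs no record of which augmentations were applied, which is one sense in which it is the more flexible option.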

Hi, thank you for the pointer! I hadn't read that paper before; it looks interesting. And yes, that's what I mean.

Hi, thank you for your reply. Yeah, I get your point. I'm looking forward to the updated version.
The issue is closed.