Why run inference twice on the same image at different scales?
SomeoneDaniel opened this issue
Does combining focus predictions from the 1/2-scale image with glance predictions from the 1/3-scale image produce better results?
Hi @SomeoneDaniel, this is explained in Section 6.3 (Hybrid-resolution Test) of our paper. The glance decoder aims to extract semantic information from the image, so it needs a larger receptive field and works better on a more heavily down-sampled input. The focus decoder aims to extract fine details, so it benefits from a higher-resolution input. After running experiments on all combinations of down-sampling ratios, we found that 1/2 for the focus decoder and 1/3 for the glance decoder gave the best results.
For more details and the experiment results, please see our paper. Thanks!
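For readers who want to see the idea in code, here is a minimal sketch of a two-pass, hybrid-resolution test. It is not the repository's implementation. The names `resize_nearest`, `hybrid_resolution_test`, `toy_glance`, and `toy_focus` are hypothetical. The fusion rule is also an assumption: the glance output is treated as a trimap-like map (0 background, 0.5 transition, 1 foreground), and the focus output is used only inside the transition region. To stay dependency-free it uses nested lists and nearest-neighbour resizing; a real pipeline would likely use the framework's own interpolation.

```python
from typing import Callable, List

Image = List[List[float]]  # single-channel image stored as rows of floats

TRANSITION = 0.5  # assumed label for the "unknown" band in the glance output


def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Nearest-neighbour resize; a stand-in for a real interpolation routine."""
    in_h, in_w = len(img), len(img[0])
    return [
        [img[y * in_h // out_h][x * in_w // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]


def hybrid_resolution_test(
    img: Image,
    glance_fn: Callable[[Image], Image],
    focus_fn: Callable[[Image], Image],
    glance_ratio: int = 3,  # glance runs on the 1/3-scale image
    focus_ratio: int = 2,   # focus runs on the 1/2-scale image
) -> Image:
    h, w = len(img), len(img[0])

    # Pass 1: glance prediction on the more heavily down-sampled input.
    small = resize_nearest(img, max(1, h // glance_ratio), max(1, w // glance_ratio))
    glance = resize_nearest(glance_fn(small), h, w)

    # Pass 2: focus prediction on the higher-resolution input.
    medium = resize_nearest(img, max(1, h // focus_ratio), max(1, w // focus_ratio))
    focus = resize_nearest(focus_fn(medium), h, w)

    # Assumed fusion: keep glance where it is confident (0 or 1),
    # and take the focus value inside the transition band.
    return [
        [f if g == TRANSITION else g for g, f in zip(g_row, f_row)]
        for g_row, f_row in zip(glance, focus)
    ]


def toy_glance(x: Image) -> Image:
    """Hypothetical glance decoder: thresholds intensities into a trimap."""
    return [[0.0 if v < 0.25 else 1.0 if v > 0.75 else TRANSITION for v in row] for row in x]


def toy_focus(x: Image) -> Image:
    """Hypothetical focus decoder: returns the input unchanged as 'alpha'."""
    return [row[:] for row in x]


image = [[0.0, 0.0, 0.4, 0.4, 1.0, 1.0] for _ in range(6)]
alpha = hybrid_resolution_test(image, toy_glance, toy_focus)
```

The two passes are independent, so each decoder sees the scale that suits it, and only the final merge happens at the original resolution.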