Tengfei-Wang / DCSR

[ICCV 2021 (Oral Presentation)] Dual-Camera Super-Resolution with Aligned Attention Modules (RefSR)

Home Page: https://tengfei-wang.github.io/Dual-Camera-SR/index.html


Question about 4X SR on CUFED5

Brightlcz opened this issue · comments

commented

Hi,
Thanks for your tips on 4X SR! I am working hard to modify the released code for 4X SR on CUFED5, since your results on CUFED5 are quite appealing. I have a few questions :)

Datasets:

  1. For 2X SR, the LR images are the same size as the Ref images (2016×1512). For 4X SR, you said the Ref image is 4x the size of the LR image. Does that mean only the input images are downsampled 4x? Or did you mean that the I_Ref patch is 4x the size of the I_LR patch? In your paper, you say the resolutions of HR and Ref are about 300×500. Also, which downsampling method do you use to process the input, as in Figure 5 of the main paper?

  2. Does the CUFED5 dataset refer to the training set with 11,871 photos and the testing set with 126 photos?

Modify:

  1. For 4X SR, I assume I need to add an extra reference feature extracted from I_Ref, plus an attention module and a fusion module, on top of the released code.
  2. When changing the VGG19 layers from 7 to 11, there are 2 MaxPool2d layers. Maybe it is the input that confuses me. I still need a few tips to modify this successfully.

Thanks.

Hi,

Datasets:

  1. Our inputs for the CUFED5 and CameraFusion datasets are different. For CUFED5, the Ref image is 4x the size of the LR image: we downsample the HR images in CUFED5 by 4x to obtain the LR images and keep the Ref images unchanged. In our method, for each patch in the LR image, we select a Ref patch 4x the size of the LR patch (see the sketch after this list).
  2. Yes! We follow the same training and testing setting as SRNTT and TTSR.
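
For concreteness, here is a minimal sketch of the preprocessing described in item 1. It is an assumed illustration, not the authors' actual script: the HR image is bicubic-downsampled 4x to produce the LR input, while the Ref image is left untouched. The file path and function name are hypothetical.

```python
# Hedged sketch of the assumed CUFED5 preprocessing: build the 4x LR input
# by bicubic-downsampling the HR image; Ref images are kept unchanged.
from PIL import Image

def make_lr(hr_path, scale=4):
    hr = Image.open(hr_path).convert("RGB")
    # Crop so height and width are divisible by the scale factor.
    w, h = hr.size
    hr = hr.crop((0, 0, w - w % scale, h - h % scale))
    # Bicubic downsampling by the given scale.
    return hr.resize((hr.width // scale, hr.height // scale), Image.BICUBIC)

lr = make_lr("CUFED5/000_0.png")  # hypothetical path
```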

Modify:

  1. Yes! There should be 3 ref_encoderx, 3 resx, 3 fusionx, 3 decoderx, 4 fusion1x, 4 alphax, and 4 aax; each module type gains one instance.
  2. For CUFED5 it is 4x SR, so we need 2 MaxPool2d layers. We checked the code: we use `for x in range(12):`, so you need to change the VGG layer count from 7 to 12 at line 26 of model/attention.py (see the sketch below this list). Our previous answer got this wrong.
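
To make the VGG change concrete, here is a minimal sketch assuming torchvision's VGG19 layout (it is not the repo's exact model/attention.py code). Slicing the first 12 layers keeps two MaxPool2d layers (at indices 4 and 9), matching the 4x resolution gap between LR and Ref, whereas the first 7 layers contain only one.

```python
# Sketch of the VGG19 feature slice discussed above, using torchvision's
# layer indexing: range(12) spans exactly two MaxPool2d layers, range(7) one.
import torch
import torchvision.models as models

class VGGFeat(torch.nn.Module):
    def __init__(self, n_layers=12):  # 7 for 2x SR, 12 for 4x SR
        super().__init__()
        vgg = models.vgg19(pretrained=True).features
        self.slice = torch.nn.Sequential(*[vgg[x] for x in range(n_layers)])
        for p in self.slice.parameters():
            p.requires_grad = False  # frozen feature extractor

    def forward(self, x):
        return self.slice(x)
```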
commented

Thank you for your reply. I'll try it right now.

Hello, what is the meaning of scale in AlignedAttention? Does it represent a downsampling scale relative to the original input?

commented

Hi, I think the scale in AA depends on the ksizes used for extract_image_patches (unfold); see the illustration below.
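
To illustrate: a toy example with assumed feature sizes (not the repo's code) showing how the ksize passed to unfold / extract_image_patches sets the scale between LR and Ref patches. With a 4x gap, the Ref side uses a 4x larger kernel and stride, so both unfolds yield the same number of patch locations.

```python
# Toy illustration: ksize and stride on the Ref side are 4x those on the
# LR side, so the two unfolds produce matching numbers of patch locations.
import torch
import torch.nn.functional as F

lr = torch.randn(1, 64, 40, 40)     # LR features (assumed size)
ref = torch.randn(1, 64, 160, 160)  # Ref features at 4x the LR resolution

lr_patches = F.unfold(lr, kernel_size=3, stride=1)     # -> (1, 64*3*3, 1444)
ref_patches = F.unfold(ref, kernel_size=12, stride=4)  # -> (1, 64*12*12, 1444)
print(lr_patches.shape[-1] == ref_patches.shape[-1])   # True: locations align
```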

commented

Hi, @jiaxinxie97

According to your answer, is the input size in the CUFED5 testing set 83×125 and in the training set 40×40? I ask because the testing images on CUFED5 shown in your paper do not look like 4x bicubic-downsampled images; they look more like images downsampled 4x and then upsampled again.

I'm sorry to ask you and Wang so many naive questions, but Dual-Camera SR is so charming.

Hi,
Yes, the input (LR) size at test time is around 80x120 (or 120x80), though the sizes vary slightly across images in the CUFED5 dataset. We directly use the LR images provided with the dataset as inputs; they are 4x bicubic-downsampled. For the figures displayed in the paper, we upsample the LR images with 4x bicubic upsampling for better comparison (a sketch of this step follows).
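
For reference, the display step described above could look like the following assumed snippet (paths are hypothetical); the upsampling is purely for visualization and does not change the model input.

```python
# Assumed display-only step: bicubic-upsample the LR input 4x so it can be
# shown side by side with the SR and HR images in the paper's figures.
from PIL import Image

lr = Image.open("CUFED5/test/lr_000.png")  # hypothetical path
lr_up = lr.resize((lr.width * 4, lr.height * 4), Image.BICUBIC)
lr_up.save("lr_000_bicubic_x4.png")
```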

Hello, did you modify the dataset code?