JingyunLiang / MANet

Official PyTorch code for Mutual Affine Network for Spatially Variant Kernel Estimation in Blind Image Super-Resolution (MANet, ICCV2021)

Home Page: https://arxiv.org/abs/2108.05302

Issue about sv mode

sujyQ opened this issue

Hi.

I'm trying to train MANet in spatially-variant mode.
I changed your code here (the block below the "#### init online degradation function" comment in train.py) to this:

    prepro_train = util.SRMDPreprocessing(opt['scale'], random=True, l=opt['kernel_size'], add_noise=opt['train_noise'],
                                          noise_high=opt['noise_high'] / 255., add_jpeg=opt['train_jpeg'], jpeg_low=opt['jpeg_low'],
                                          rate_cln=-1, device=torch.device('cuda:{}'.format(device_id)), sig=opt['sig'],
                                          sig1=opt['sig1'], sig2=opt['sig2'], theta=opt['theta'],
                                          sig_min=opt['sig_min'], sig_max=opt['sig_max'], rate_iso=opt['rate_iso'],
                                          is_training=True, sv_mode=1)
    prepro_val = util.SRMDPreprocessing(opt['scale'], random=False, l=opt['kernel_size'], add_noise=opt['test_noise'],
                                        noise_high=opt['noise'], add_jpeg=opt['test_jpeg'], jpeg_low=opt['jpeg'],
                                        rate_cln=-1, device=torch.device('cuda:{}'.format(device_id)), sig=opt['sig'],
                                        sig1=opt['sig1'], sig2=opt['sig2'], theta=opt['theta'],
                                        sig_min=opt['sig_min'], sig_max=opt['sig_max'], rate_iso=opt['rate_iso'],
                                        is_training=False, sv_mode=1)

But it returns this error:

    Traceback (most recent call last):
      File "train.py", line 347, in <module>
        main()
      File "train.py", line 210, in main
        model.optimize_parameters(current_step, scaler)
      File "/home/hsj/d_drive/hsj/hsj/MANet/codes/models/B_model.py", line 165, in optimize_parameters
        -1) * 10000) / self.fake_K.size(1)
    RuntimeError: expand(torch.cuda.FloatTensor{[16, 1, 36864, 21, 21]}, size=[-1, 36864, -1, -1]): the number of sizes provided (4) must be greater or equal to the number of dimensions in the tensor (5)
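
For reference, here is a minimal sketch of why that expand call fails in sv mode. The shapes are scaled down from the B=16, HW=36864 reported in the traceback, and the variable names are only placeholders, not the repository's code:

    import torch

    # Scaled-down stand-in for the traceback shapes (real run: B=16, HW=36864, 21x21 kernels).
    B, HW, k = 2, 16, 21

    # sv mode: one ground-truth kernel per LR pixel, so real_K is already 4-D
    real_K_sv = torch.zeros(B, HW, k, k)
    real_K_5d = real_K_sv.unsqueeze(1)      # B x 1 x HW x 21 x 21, now 5-D

    try:
        # The kernel-loss line passes only 4 target sizes, which works for the
        # invariant case (B x 21 x 21 -> B x 1 x 21 x 21) but not for a 5-D tensor.
        real_K_5d.expand(-1, HW, -1, -1)
    except RuntimeError as e:
        print(e)  # "the number of sizes provided (4) must be greater or equal to ..."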

So I erased the unsqueeze from this line in B_model.py:

    l_ker = self.l_ker_w * self.cri_ker(self.fake_K * 10000,
                                        self.real_K.unsqueeze(1).expand(-1, self.fake_K.size(1), -1,
                                                                        -1) * 10000) / self.fake_K.size(1)

changing it to:

    l_ker = self.l_ker_w * self.cri_ker(self.fake_K * 10000,
                                        self.real_K.expand(-1, self.fake_K.size(1), -1,
                                                           -1) * 10000) / self.fake_K.size(1)

However, a CUDA OOM error occurs:

    RuntimeError: CUDA out of memory. Tried to allocate 2.91 GiB (GPU 0; 11.93 GiB total capacity; 8.78 GiB already allocated; 1.57 GiB free; 9.73 GiB reserved in total by PyTorch)

Is 12 GB of GPU memory not enough to train MANet in sv mode?
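
For a rough sense of scale, here is a back-of-the-envelope sketch (not a profile of the actual run): at the shapes in the traceback, a single B x HW x 21 x 21 float32 kernel tensor is already close to 1 GiB, and the loss builds scaled copies (the * 10000 terms) plus gradients on top of the usual activations, so 12 GB can run out quickly.

    # Back-of-the-envelope size of one per-pixel GT kernel tensor at the traceback shapes.
    B, HW, k, bytes_per_float32 = 16, 36864, 21, 4
    tensor_gib = B * HW * k * k * bytes_per_float32 / 2**30
    print(f"{tensor_gib:.2f} GiB per copy")  # ~0.97 GiB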

Hi, may I ask a question, please?
I wonder if there's a small mistake in the code comments: the comment in the SRMDPreprocessing class says that sv_mode=0 is spatially-variant, but in fact sv_mode=1 is the spatially-variant setting, right?

I think so. "1<= sv_mode <= 5" is spatially-variant and "sv_mode=0" is spatially-invariant.

Thanks so much for the response.
I've successfully run the spatially-variant version; the GPU usage is 17380 MiB.

commented

Sorry, have you run into the dimension error? The shape of self.fake_K is Bx441xhxw, but the shape of self.real_K is Bx21x21 in the spatially-invariant case (BxHWx21x21 in the spatially-variant case). How can the kernel loss be computed between these tensors?

Should I reshape the real kernel, e.g. using self.real_K.view(B, -1, 1, 1).expand(-1, -1, self.fake_K.size(2), self.fake_K.size(3)), to change its shape to Bx441xhxw so that it is consistent with self.fake_K in the spatially-invariant case?
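
If fake_K really is B x 441 x h x w and real_K is B x 21 x 21 in the invariant case, something along these lines should give matching shapes. This is only a sketch: the tensors are random placeholders, and using F.l1_loss in place of cri_ker is an assumption, not the repository's loss.

    import torch
    import torch.nn.functional as F

    B, h, w, k = 4, 48, 48, 21
    fake_K = torch.randn(B, k * k, h, w)   # estimated kernels: B x 441 x h x w
    real_K = torch.randn(B, k, k)          # spatially-invariant GT kernel: B x 21 x 21

    # Flatten the GT kernel to B x 441 x 1 x 1, then broadcast it to every pixel.
    real_K_map = real_K.view(B, k * k, 1, 1).expand(-1, -1, h, w)   # B x 441 x h x w

    l_ker = F.l1_loss(fake_K * 10000, real_K_map * 10000)  # kernel loss over matching shapes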

commented

Could you please help me understand what exactly the ground truth kernel is?
In super-resolution we consider the ground truth image to be the actual HR image and the LR image to be its downsampled version. So my doubt is: what do we actually consider to be the ground truth kernel?
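
In blind SR the LR input is usually modeled as the HR image blurred by a kernel and then downsampled (optionally with noise); the kernel used in that synthesis is the ground truth kernel the network must estimate. Below is a toy illustration of the idea with an isotropic Gaussian; it is illustrative only and not MANet's exact SRMDPreprocessing pipeline.

    import torch
    import torch.nn.functional as F

    def gaussian_kernel(size=21, sigma=2.0):
        # Isotropic Gaussian blur kernel; this plays the role of the GT kernel here.
        ax = torch.arange(size, dtype=torch.float32) - (size - 1) / 2
        xx, yy = torch.meshgrid(ax, ax, indexing="ij")
        kern = torch.exp(-(xx ** 2 + yy ** 2) / (2 * sigma ** 2))
        return kern / kern.sum()

    scale = 4
    hr = torch.rand(1, 1, 96, 96)                         # toy single-channel HR patch
    k = gaussian_kernel()                                 # the "ground truth" kernel
    blurred = F.conv2d(hr, k.view(1, 1, 21, 21), padding=10)
    lr = blurred[..., ::scale, ::scale]                   # blur, then subsample -> LR input
    # The network only ever sees lr; k (one per image, or one per pixel in sv mode)
    # is the label it is trained to recover.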