What is wrong with augmentation code (Why not to use it?)

Question

What is wrong with augmentation code (Why not to use it?)

mramzy25 opened this issue 2 years ago · comments

Mohamed Ramzy Ibrahim commented 2 years ago

`

##########` need to fix, and do not use it ##########
        # data_arguments, we need to process padded_lr_batch, padded_lm_batch, hr_batch, hm_batch
        if self.config["training"]["data_arguments"]:
            # print(padded_lr_batch.shape) # [12,16,64,64]
            # print(padded_lm_batch.shape) # [12,16,64,64]
            # print(hr_batch.shape)        # [12,192,192]
            # print(hm_batch.shape)        # [12,192,192]
            np.random.seed(int(1000 * time.time()) % 2**32)
            if np.random.random() <= self.config["training"]["probability of flipping horizontally"]:
                padded_lr_batch = torch.flip(padded_lr_batch, [3]) # Horizontal flip of lr images
                padded_lm_batch = torch.flip(padded_lm_batch, [3]) # Horizontal flip of lm images
                hr_batch = torch.flip(hr_batch, [2]) # Horizontal flip of hr images
                hm_batch = torch.flip(hm_batch, [2]) # Horizontal flip of hm images
            np.random.seed(int(1000 * time.time()) % 2**32)
            if np.random.random() <= self.config["training"]["probability of flipping vertically"]:
                padded_lr_batch = torch.flip(padded_lr_batch, [2]) # Vertical flip of lr images
                padded_lm_batch = torch.flip(padded_lm_batch, [2]) # Vertical flip of lm images
                hr_batch = torch.flip(hr_batch, [1]) # Horizontal flip of hr images
                hm_batch = torch.flip(hm_batch, [1]) # Horizontal flip of hm images
            np.random.seed(int(1000 * time.time()) % 2**32)
            k_num = np.random.choice(a=self.config["training"]["corresponding angles(x90)"],
                                     replace=True,
                                     p=self.config["training"]["probability of rotation"])
            padded_lr_batch = torch.rot90(padded_lr_batch, k=k_num, dims=[2,3]) # Rotate k times ninety degrees counterclockwise of lr images
            padded_lm_batch = torch.rot90(padded_lm_batch, k=k_num, dims=[2,3]) # Rotate k times ninety degrees counterclockwise of lm images
            hr_batch = torch.rot90(hr_batch, k=k_num, dims=[1,2]) # Rotate k times ninety degrees counterclockwise of hr images
            hm_batch = torch.rot90(hm_batch, k=k_num, dims=[1,2]) # Rotate k times ninety degrees counterclockwise of hm images
            np.random.seed(int(1000 * time.time()) % 2**32)

There is a comment saying not to use the augmentation and I don't know what is the problem in using it (I could not detect from code tracing). So can you explain me the problem occuring ?

An Tai · Answer 1 · Thu Aug 04 2022 17:49:46 GMT+0800 (China Standard Time)

It is because the enhancement code in this version contains errors and is less efficient. we have abandoned it, and you can continue to improve it. By the way, I vaguely remember that the code causes the image to be incorrectly rotated.

…

------------------ 原始邮件 ------------------ 发件人: "Suanmd/TR-MISR" ***@***.***>; 发送时间: 2022年8月4日(星期四) 下午5:40 ***@***.***>; ***@***.***>; 主题: [Suanmd/TR-MISR] What is wrong with augmentation code (Why not to use it?) (Issue #4) ########## need to fix, and do not use it ########## # data_arguments, we need to process padded_lr_batch, padded_lm_batch, hr_batch, hm_batch if self.config["training"]["data_arguments"]: # print(padded_lr_batch.shape) # [12,16,64,64] # print(padded_lm_batch.shape) # [12,16,64,64] # print(hr_batch.shape) # [12,192,192] # print(hm_batch.shape) # [12,192,192] np.random.seed(int(1000 * time.time()) % 2**32) if np.random.random() <= self.config["training"]["probability of flipping horizontally"]: padded_lr_batch = torch.flip(padded_lr_batch, [3]) # Horizontal flip of lr images padded_lm_batch = torch.flip(padded_lm_batch, [3]) # Horizontal flip of lm images hr_batch = torch.flip(hr_batch, [2]) # Horizontal flip of hr images hm_batch = torch.flip(hm_batch, [2]) # Horizontal flip of hm images np.random.seed(int(1000 * time.time()) % 2**32) if np.random.random() <= self.config["training"]["probability of flipping vertically"]: padded_lr_batch = torch.flip(padded_lr_batch, [2]) # Vertical flip of lr images padded_lm_batch = torch.flip(padded_lm_batch, [2]) # Vertical flip of lm images hr_batch = torch.flip(hr_batch, [1]) # Horizontal flip of hr images hm_batch = torch.flip(hm_batch, [1]) # Horizontal flip of hm images np.random.seed(int(1000 * time.time()) % 2**32) k_num = np.random.choice(a=self.config["training"]["corresponding angles(x90)"], replace=True, p=self.config["training"]["probability of rotation"]) padded_lr_batch = torch.rot90(padded_lr_batch, k=k_num, dims=[2,3]) # Rotate k times ninety degrees counterclockwise of lr images padded_lm_batch = torch.rot90(padded_lm_batch, k=k_num, dims=[2,3]) # Rotate k times ninety degrees counterclockwise of lm images hr_batch = torch.rot90(hr_batch, k=k_num, dims=[1,2]) # Rotate k times ninety degrees counterclockwise of hr images hm_batch = torch.rot90(hm_batch, k=k_num, dims=[1,2]) # Rotate k times ninety degrees counterclockwise of hm images np.random.seed(int(1000 * time.time()) % 2**32) There is a comment saying not to use the augmentation and I don't know what is the problem in using it (I could not detect from code tracing). So can you explain me the problem occuring.. — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>

Mohamed Ramzy Ibrahim · Answer 2 · Thu Aug 04 2022 21:06:26 GMT+0800 (China Standard Time)

Sorry, I have another question in training he loss appears in negative... Is that normal or I am running the code in correctly ??

An Tai · Answer 3 · Thu Aug 04 2022 21:12:45 GMT+0800 (China Standard Time)

Normal. If you are uncomfortable with negative values, you can add a constant to the loss (just a joke).

…

------------------ 原始邮件 ------------------ 发件人: "Suanmd/TR-MISR" ***@***.***>; 发送时间: 2022年8月4日(星期四) 晚上9:06 ***@***.***>; ***@***.******@***.***>; 主题: Re: [Suanmd/TR-MISR] What is wrong with augmentation code (Why not to use it?) (Issue #4) Sorry, I have another question in training he loss appears in negative... Is that normal or I am running the code in correctly ?? — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: ***@***.***>

Mohamed Ramzy Ibrahim · Answer 4 · Thu Aug 04 2022 21:39:27 GMT+0800 (China Standard Time)

and also the psnr is in negative. I don't know is that also normal ?

An Tai · Answer 5 · Thu Aug 04 2022 21:43:57 GMT+0800 (China Standard Time)

PSNR cannot be a negative number. You need to find out why, e.g., you can start by checking whether the generated image is normal.

…

------------------ 原始邮件 ------------------ 发件人: "Suanmd/TR-MISR" ***@***.***>; 发送时间: 2022年8月4日(星期四) 晚上9:39 ***@***.***>; ***@***.******@***.***>; 主题: Re: [Suanmd/TR-MISR] What is wrong with augmentation code (Why not to use it?) (Issue #4) and also the psnr is in negative. I don't know is that also normal ? — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: ***@***.***>

Mohamed Ramzy Ibrahim · Answer 6 · Fri Aug 05 2022 18:03:42 GMT+0800 (China Standard Time)

`
if not config["training"]["use_all_data_to_fight_leaderboard"]:
# Eval
fusion_model.eval()
val_score = 0.0 # monitor val score

        for lrs, lr_maps, alphas, hrs, hr_maps, names in dataloaders['val']:
            lrs = lrs.float().to(device)
            lr_maps = lr_maps.float().to(device)
            alphas = alphas.float().to(device)
            hrs = hrs.numpy()
            hr_maps = hr_maps.numpy()

            srs = fusion_model(lrs)

            # compute ESA score
            srs = srs[0].detach().cpu().numpy()
            for i in range(srs.shape[0]):

                if baseline_cpsnrs is None:
                    if config["training"]["truncate values"]:
                        val_score -= shift_cPSNR(np.clip((srs[i] - np.min(srs[i])), 0, 16383/65535), hrs[i], hr_maps[i])
                    else:
                        val_score -= shift_cPSNR(srs[i], hrs[i], hr_maps[i])
                else:
                    ESA = baseline_cpsnrs[names[i]]
                    # val_score += ESA / shift_cPSNR(srs[i], hrs[i], hr_maps[i])
                    if config["training"]["truncate values"]:
                        val_score -= shift_cPSNR(np.clip((srs[i] - np.min(srs[i])), 0, 16383/65535), hrs[i], hr_maps[i])
                    else:
                        val_score -= shift_cPSNR(srs[i], hrs[i], hr_maps[i])

        val_score /= len(dataloaders['val'].dataset)

`

Why is the validation score here subtracted? I think that is the reason that after each epoch the psnr is negative. am I correct? as I understand that shift _cPSNR is a function that returns the max cPSNR with regard to LR image registration (crop of borders)

Elwarfalli · Answer 7 · Thu Sep 15 2022 05:00:28 GMT+0800 (China Standard Time)

Same! getting negative values. Have you solved the problem??

An Tai · Answer 8 · Thu Sep 15 2022 11:27:53 GMT+0800 (China Standard Time)

Hi, negative values are OK. You can validate the model and check the generated results. The augmentation code may cause extra rotations resulting in incorrect LR-HR pairs, and furthermore, the code leads to a greater computation.

…

------------------ 原始邮件 ------------------ 发件人: ***@***.***>; 发送时间: 2022年9月15日(星期四) 凌晨5:00 收件人: ***@***.***>; 抄送: ***@***.***>; ***@***.***>; 主题: Re: [Suanmd/TR-MISR] What is wrong with augmentation code (Why not to use it?) (Issue #4) Same! getting negative values. Have you solved the problem?? — Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you commented.Message ID: ***@***.***>

nonick2k23 · Answer 9 · Tue Feb 07 2023 17:21:15 GMT+0800 (China Standard Time)

Can you explain the logic behind using a negative loss for training? You are using either L1 or L2 or SSIM or a combination of the three. Why would you have a negative loss value in the end? What's the purpose?

Same thing regarding validation loss, why are you using cPSNR then multiply the value by -1?

Thanks.

An Tai · Answer 10 · Tue Feb 07 2023 17:47:32 GMT+0800 (China Standard Time)

@nonick2k23 The lower the value, the better the optimization. Negative values work here. I optimise a negative PSNR, but of course, 1/PSNR could also be used. In general, the MSE, which is the denominator component of the PSNR, should be optimized. Anything goes.

nonick2k23 · Answer 11 · Tue Feb 07 2023 18:17:05 GMT+0800 (China Standard Time)

When using L2 loss, the distance between the ground truth and model output, by definition, is squared, which means, all the values returned by the loss are positive. This does not happen in your case. The function "get_loss" function when using L2 loss returns both negative and positive values.

Why is that?

After further investigation, you do not use L2 as a loss function, but cPSNR as a loss function, that is why you have negative values since (log(0<x<1) results in a negative value.

Okay another edit -

You don't even use cPSNR as defined in PROBA-V. You just modified it to something else, that works differently from cPSNR.

It would be great if you'd explain the reasoning behind this custom loss, and how does it affect the convergence of the architecture and the results.

Like previously said, my "best_score" and "val_score" are also in the negative, which is odd.

More edits...

The shift_cPSNR code used to calculate the "val_score" is taken from HighRes-Net as is, so it seem that it is fine. However the code wrapping it just reduces the returned value from 100 (?), and we end up going down from 100 to -47 on first epoch? Can you explain the reasoning behind this? It doesn't make much sense...

Okay - So you just used HighRes-Net code for the wrapping function as well. Still it doesn't make sense why the values are negative.

And the loss still needs to be investigated... to be continued.

seeuao · Answer 12 · Wed Feb 15 2023 11:44:00 GMT+0800 (China Standard Time)

Can your trained network achieve the cPSNR given in the paper?
After I use all the data as the training set and set the number of input images to 32, the average cPSNR obtained is only 49.3784. Why?

nonick2k23 · Answer 13 · Wed Feb 15 2023 17:10:12 GMT+0800 (China Standard Time)

The values are negative, so I am not sure how they decided on the number 49. They start with "score" of 100 and reduce from there for some reason.

This whole metric is a mess. No way their results are real.

The problem it is also used in HighRes-Net where also I have no clue how they decided on cPSNR from negative values.

Maybe the authors can shed some light, if they ever answer.

An Tai · Answer 14 · Wed Feb 15 2023 18:40:35 GMT+0800 (China Standard Time)

Our work is improved from HighRes-Net, so our loss is similar to his. @nonick2k23

An Tai · Answer 15 · Wed Feb 15 2023 18:46:47 GMT+0800 (China Standard Time)

You need tricks, for example, by supervising the training process to decide to early-stop and retrain. Direct training is not going to get high scores. @cvkaiming

An Tai · Answer 16 · Thu Feb 16 2023 09:27:27 GMT+0800 (China Standard Time)

You need tricks, for example, by supervising the training process to decide to early-stop and retrain. Direct training is not going to get high scores. @cvkaiming

I recall that throwing away the alignment during inference may improve the results, I can't explain it, you can try it.