cszn / BSRGAN

Designing a Practical Degradation Model for Deep Blind Image Super-Resolution (ICCV, 2021) (PyTorch) - We released the training code!

Dataset and Degradation Model

ShiinaMitsuki opened this issue

Great work! Are there any plans to release the DIV2k3D dataset or the degradation model from the paper?
Thanks!

Hello! A couple of questions regarding the published degradation code. It differs from the pipeline described in the paper: it doesn't have the aligned nearest-neighbor interpolation option, it adds a second resizing step inside the pipeline, it has the "up, down, keep" random scale direction, and there is no code for the unprocess ISP camera degradation; speckle and Poisson noise are used instead. That makes it pretty much the same as the Real-ESRGAN pipeline. Is there any particular reason for these changes? Were there any model performance issues?

Also, as I understand it, the degradation shuffling strategy in the paper expands the degradation space to something similar to the repeated blur->resize->noise->jpeg model from Real-ESRGAN, but now the shuffling happens over the double degradation model. Any comments on that change? How does shuffling the double pipeline change the results?

I've added both the BSRGAN and Real-ESRGAN (and Real-SR) training strategies to a project by way of configurable presets: https://github.com/victorca25/traiNNer/blob/master/codes/options/presets/README.md, but I was wondering whether the options were changed for some particular reason.

Also, in case it's of interest, I've converted the unprocess/process camera noise to run 100% in numpy, so there's no need for TensorFlow or for converting between numpy arrays and Torch tensors. I also added the missing sinc blur option as an augmentation in: https://github.com/victorca25/augmennt (unsharp masking, speckle, Gaussian, JPEG and many other degradations are also there, just in case).

I will upload our original degradation model today. Real-ESRGAN follows our BSRGAN work. The idea of using double JPEG, double blur, down-up resizing, and Gaussian noise for color and grayscale images for SR degradation was first proposed in our work.

Yes, I know Real-ESRGAN uses the same ideas as BSRGAN, I mentioned it in my readme file: "very similar to BSRGAN, but adds sinc filter to the two blur operations, replaces the realistic camera noise for a simpler poisson noise augmentation and adds a second in-pipeline scaling operation."

For what it's worth, I've been using an unsharp mask on HR images and double noise as on-the-fly augmentations, besides blur and resizing, for about 2 years in my repo's pipeline :D

But degradation_bsrgan_plus() differs substantially from what was in the BSRGAN paper and is closer to the brute-force approach of Real-ESRGAN, duplicating the pipeline instead of shuffling the degradations. So I was wondering whether there's a reason for that, and whether it replaces the original strategy or is just an alternative.
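To make the three strategies discussed here concrete, the following is a minimal sketch of the ordering logic only. The step names are illustrative labels (assumptions, not identifiers from the BSRGAN or Real-ESRGAN code), and no image processing is performed; in a real pipeline each label would be an image operation with its own random parameters.

```python
import math
import random

# Illustrative step labels only (assumptions); each would be an image operation in practice.
SINGLE_PASS_STEPS = ["blur_1", "blur_2", "resize", "gaussian_noise", "jpeg", "camera_noise"]
FIXED_PASS_STEPS = ["blur", "resize", "noise", "jpeg"]


def shuffled_single_pass(rng):
    """Apply every step once, in a random order."""
    steps = list(SINGLE_PASS_STEPS)
    rng.shuffle(steps)
    return steps


def fixed_double_pass():
    """Run the same fixed sequence twice, with no reordering."""
    return FIXED_PASS_STEPS * 2


def shuffled_double_pass(rng):
    """Duplicate the steps first, then shuffle the doubled list."""
    steps = FIXED_PASS_STEPS * 2
    rng.shuffle(steps)
    return steps


rng = random.Random(0)  # fixed seed so the run is repeatable
print(shuffled_single_pass(rng))
print(fixed_double_pass())
print(shuffled_double_pass(rng))
# Number of possible orderings of six distinct steps
print(math.factorial(len(SINGLE_PASS_STEPS)))  # 720
```

The sketch only shows how the order of operations is chosen; how much each variant widens the degradation space in practice also depends on the parameter ranges sampled inside each step.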

I have uploaded the degradation_bsrgan().

The idea of BSRGAN was mainly inspired by my previous works SRMD, DPSR, IRCNN, USRNet, DnCNN, FFDNet and CBDNet. For example, the degradation shuffle and double blur were inspired by SRMD, DPSR, IRCNN and USRNet, which use different degradation orders for blur and resizing.

I do not use a sharpening operation. Your work should be known by more people. However, sharpening is not new for SR, and blur and resizing are the normal degradation operations for SR. The idea of using double blur and resizing would be novel.

As stated in our paper, one can conveniently modify the degradation model by changing the degradation parameter settings and adding more reasonable degradation types to improve the practicability for a certain application.

There's the add_sharpening() function in degradation_bsrgan_plus(); that's why I mentioned the unsharp mask. I don't see it commonly used in any online pipeline strategy, only in some very rare cases as offline data preparation. Also, the double noise was not standard before I added it a while back. Single blur and resizing are standard, I agree. Using the presets in my code you can make all these changes without touching the code, only by configuring in a YAML file which augmentations are used and the parameters for each one.

Great to see the original pipeline, but now I have another question. If I remember correctly, a Gaussian blur of kernel size 21 was mentioned in the paper for the aligned nearest neighbor, but in the code the kernel size is 25. Is this a typo?

In my code I have tried sizes from 17 to 21, and in most cases they produce almost the same aligned results with the same sigma values, but I didn't test up to 25.

My main question still stands: is there any performance comparison between degradation_bsrgan_plus() and the original degradation_bsrgan(), or is that yet to be done?

There is no difference between kernels of size 25 and size 21 because their border elements are almost zero.
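This claim can be checked numerically with a small sketch. It assumes an isotropic Gaussian kernel and a sigma of 2.8, both chosen here only for illustration (the repository's kernels and sigma ranges may differ), and it measures how much of a normalized 25x25 kernel's weight lies outside its central 21x21 window.

```python
import math


def gaussian_kernel_2d(size, sigma):
    """Normalized isotropic 2D Gaussian kernel as a list of rows (odd size)."""
    half = size // 2
    g = [math.exp(-(x * x) / (2.0 * sigma * sigma)) for x in range(-half, half + 1)]
    total = sum(g)
    g = [v / total for v in g]
    # The 2D kernel is the outer product of the normalized 1D profile with itself.
    return [[a * b for b in g] for a in g]


def mass_outside_center(kernel, inner):
    """Sum of kernel weights that fall outside the central inner x inner window."""
    size = len(kernel)
    pad = (size - inner) // 2
    outside = 0.0
    for i in range(size):
        for j in range(size):
            in_center = pad <= i < size - pad and pad <= j < size - pad
            if not in_center:
                outside += kernel[i][j]
    return outside


# sigma = 2.8 is an assumed value for illustration, not a setting taken from the repository.
kernel = gaussian_kernel_2d(25, 2.8)
print(f"weight outside the central 21x21 window: {mass_outside_center(kernel, 21):.2e}")
```

With these assumed settings the weight in the two outer rings is well below 0.1% of the total, so cropping the kernel to 21x21 barely changes the blur. The fraction grows with sigma, so the check is worth repeating for the largest sigma actually used.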

I have not compared them yet. In my opinion, the results would be similar for most real images if you use the same training strategy and the same training data.

Awesome! Thanks for clarifying my questions!

One additional comment: I don't know if you have seen this work before, but it uses SRMD and USRNet (among others) for benchmarking and the results are pretty interesting: https://openaccess.thecvf.com/content/CVPR2021/papers/Jo_Tackling_the_Ill-Posedness_of_Super-Resolution_Through_Adaptive_Target_Generation_CVPR_2021_paper.pdf

The name is a bit misleading, since during actual model training it modifies the model output (not the target images). In my tests it helps even with complex pipelines like BSRGAN; the results are even sharper.

It's very simple to integrate, I'll be committing it to my code soon so more people can test it and compare results, but it may give you some more ideas too.

Cheers!

I have not read this CVPR2021 paper yet since our work was also previously submitted to CVPR2021.

Thanks for your hard work.

Hello! How can I get the ISP model?

Hello victorca,
May I ask whether your experiments used the ISP model?
Where can I get the ISP model?
Thanks.