xinntao / Real-ESRGAN

Real-ESRGAN aims at developing Practical Algorithms for General Image/Video Restoration.


Fine-tuned Model Gets Worse

Jared-02 opened this issue · comments


I am new to deep learning, and this is my first time fine-tuning a pre-trained model. I ran into some problems while testing the Real-ESRGAN Anime model.

First of all, its denoising is remarkably good. That said, I think it also has some unavoidable problems: over-sharpening and poor anti-aliasing. So I read the fine-tuning instructions in Training.md to try to address them.

  • Dataset preparation: I collected 120 screenshots from anime Blu-rays (several shows with different art styles) and used a script to generate multi-scale ground-truth images, 700 images in total (see the sketch after this list).

  • Modify the options file: mainly the datasets part; the num_block of network_g was set to 6 (the official recommendation), the milestone in the learning-rate schedule was set to 120,000, and total_iter was set to 120,000.

  • P.S.: I know I'm a beginner, so most options were left at their official defaults.
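To make the multi-scale step concrete, here is a minimal Pillow-based sketch of the idea. The folder names and scale factors are placeholders I chose for illustration, not the exact values from the repo's script or my own setup.

# Minimal sketch of multi-scale ground-truth generation.
# Paths and scale factors are assumptions for illustration only.
import os
from PIL import Image

INPUT_DIR = 'datasets/anime_gt'             # placeholder folder of Blu-ray screenshots
OUTPUT_DIR = 'datasets/anime_gt_multiscale'
SCALES = [1.0, 0.75, 0.5, 1 / 3]            # assumed downscale factors

os.makedirs(OUTPUT_DIR, exist_ok=True)
for name in os.listdir(INPUT_DIR):
    img = Image.open(os.path.join(INPUT_DIR, name)).convert('RGB')
    base, _ = os.path.splitext(name)
    for i, s in enumerate(SCALES):
        w, h = max(1, int(img.width * s)), max(1, int(img.height * s))
        # LANCZOS keeps line art and gradients clean when downscaling anime frames
        img.resize((w, h), Image.LANCZOS).save(os.path.join(OUTPUT_DIR, f'{base}_s{i}.png'))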

But things were not as simple as I expected. After 120,000 iterations, the fine-tuned model's results seem worse than the original's. Apart from somewhat better anti-aliasing, both overall sharpness and color handling are worse than before.

I don't know what I did wrong to end up with this result; maybe the learning rate is too high? I didn't dare to change these settings arbitrarily, so I'd like to ask for your advice!

Comparison:

(comparison image)

Finetune Options:

# path
path:
  # use the pre-trained RealESRGAN anime model
  pretrain_network_g: experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth
  param_key_g: params_ema
  strict_load_g: true
  pretrain_network_d: experiments/pretrained_models/RealESRGAN_x4plus_anime_6B_netD.pth
  param_key_d: params
  strict_load_d: true
  resume_state: ~

# training settings
train:
  ema_decay: 0.999
  optim_g:
    type: Adam
    lr: !!float 1e-4
    weight_decay: 0
    betas: [0.9, 0.99]
  optim_d:
    type: Adam
    lr: !!float 1e-4
    weight_decay: 0
    betas: [0.9, 0.99]

  scheduler:
    type: MultiStepLR
    milestones: [120000]
    gamma: 0.5

  total_iter: 120000
  warmup_iter: -1  # no warm up
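For completeness, the network_g section is not shown above. In code, the 6-block anime generator that num_block: 6 refers to corresponds to something like the following, assuming the RRDBNet class from basicsr that Real-ESRGAN builds on; the checkpoint path and params_ema key are the ones from my options file above.

# Sketch: the 6-block RRDB generator matching "num_block: 6" and the checkpoint above.
# Assumes basicsr is installed; this is for reference, not part of the options file.
import torch
from basicsr.archs.rrdbnet_arch import RRDBNet

model = RRDBNet(num_in_ch=3, num_out_ch=3, num_feat=64,
                num_block=6, num_grow_ch=32, scale=4)
state = torch.load('experiments/pretrained_models/RealESRGAN_x4plus_anime_6B.pth',
                   map_location='cpu')
model.load_state_dict(state['params_ema'], strict=True)  # matches param_key_g above
model.eval()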


Have you looked into this any further?

The author said that the ground-truth images used in his training were sharpened, but when he was asked how the sharpening was done, there has never been a reply.
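For context, ground-truth sharpening in this kind of training is typically some form of unsharp masking. Below is a generic USM sketch with OpenCV; since the author never confirmed his settings, the sigma and amount values (and the file paths) are placeholders, not his actual pipeline.

# Generic unsharp-mask (USM) sharpening sketch for ground-truth images.
# The sigma/amount values are placeholders, NOT the author's confirmed settings.
import os
import cv2

def usm_sharpen(img, sigma=2.0, amount=0.5):
    # unsharp mask: img + amount * (img - GaussianBlur(img))
    blurred = cv2.GaussianBlur(img, (0, 0), sigma)
    return cv2.addWeighted(img, 1.0 + amount, blurred, -amount, 0)

os.makedirs('datasets/anime_gt_usm', exist_ok=True)
gt = cv2.imread('datasets/anime_gt/frame_0001.png')   # placeholder input path
cv2.imwrite('datasets/anime_gt_usm/frame_0001.png', usm_sharpen(gt))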

Can you tell me how you resolved it subsequently? I'm having a similar problem.


I don't have a good solution either. The advice I received after contacting the repo author was to use a large-scale training dataset; as mentioned above, the dataset I used was too small, which degraded the whole model. You could also try a smaller lr, or techniques such as warm-up and other lr-decay strategies, which might lead to improvements (a rough sketch follows).
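One detail worth noting: with milestones: [120000] and total_iter: 120000, the MultiStepLR decay would only fire at the very end of training, so the lr effectively stays at 1e-4 for the whole fine-tune. In plain PyTorch terms, "smaller lr + warm-up + stepped decay" looks roughly like the sketch below; all numbers are my own assumptions, not official recommendations, and the Conv2d is just a stand-in for the real generator.

# Plain-PyTorch sketch of "smaller lr + warm-up + stepped decay".
# All numbers are illustrative assumptions, not official recommendations.
import torch
from torch.optim.lr_scheduler import MultiStepLR

base_lr = 5e-5          # smaller than the 1e-4 above (assumption)
warmup_iters = 1000     # assumption
total_iters = 120000

model = torch.nn.Conv2d(3, 3, 3)   # stand-in for the real generator
optimizer = torch.optim.Adam(model.parameters(), lr=base_lr, betas=(0.9, 0.99))
# decay well before the end of training, instead of only at the final iteration
scheduler = MultiStepLR(optimizer, milestones=[60000, 90000], gamma=0.5)

for it in range(total_iters):
    if it < warmup_iters:          # simple linear warm-up
        for group in optimizer.param_groups:
            group['lr'] = base_lr * (it + 1) / warmup_iters
    # ... forward pass, loss.backward(), optimizer.step() would go here ...
    scheduler.step()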