JingyunLiang / SwinIR

SwinIR: Image Restoration Using Swin Transformer (official repository)

Home Page: https://arxiv.org/abs/2108.10257

About training code

Pexure opened this issue · comments

Hi, there. Thanks for your amazing work, but I have some questions about the training code.

  1. Do we need to modify main_train_psnr.py (KAIR) to cap training at 500K iterations? The original file loops over 1M epochs.

  2. I ran training with `python -m torch.distributed.launch --nproc_per_node=8 --master_port=1234 main_train_psnr.py --opt options/swinir/train_swinir_sr_classical.json --dist True` on 8 RTX 3090 GPUs, using the DIV2K train split (default ×2). The estimated training time for 500K iterations is ~3.5 days (1 min / 100 iters), much longer than your reported 1.8 days on 8 2080 Ti GPUs. Do you have any idea why?
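For question 1, one way to stop at 500K iterations is a check on the global step inside the epoch loop. This is a hypothetical sketch, not a verbatim patch of main_train_psnr.py: the real script's variable names may differ by KAIR version, and `train_loader` below is a stand-in for the actual DataLoader.

```python
# Sketch: capping training at 500K iterations. main_train_psnr.py
# loops `for epoch in range(1000000)` with no iteration cap; adding
# a check on the global step is one way to stop at 500K.
max_iters = 500_000
current_step = 0
train_loader = range(1000)          # placeholder: one "epoch" of batches

for epoch in range(1000000):        # same outer loop as the original file
    stop = False
    for train_data in train_loader:
        current_step += 1
        # model.feed_data(train_data) / model.optimize_parameters(...)
        # would run here in the real training script
        if current_step >= max_iters:
            stop = True
            break
    if stop:
        break
```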

1. Yes, 500K iterations is enough for SR.
2. No idea. Maybe you can increase `n_workers`, or you can try the codes here.
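For reference, the worker count is set per dataset in the options JSON. A fragment like the following shows where it lives (key names as used in recent versions of KAIR's train_swinir_sr_classical.json; verify against your copy of the file):

```json
{
  "datasets": {
    "train": {
      "dataloader_num_workers": 16,
      "dataloader_batch_size": 32
    }
  }
}
```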

Thanks for your reply. I have found the reason: I'm new to SR and missed the data preparation step described in BasicSR. I think it would be better to document it clearly in KAIR :)
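For context, that preparation step crops each DIV2K HR image into small sub-images so the DataLoader reads many tiny files instead of decoding 2K-resolution PNGs every iteration. Below is a rough sketch of the tiling logic only, a hypothetical re-implementation rather than BasicSR's actual extract_subimages.py script; its defaults for DIV2K use a 480-pixel crop and 240-pixel step.

```python
def subimage_boxes(h, w, crop=480, step=240, thresh=48):
    """Return (top, left) corners of crop x crop tiles covering an h x w image."""
    def positions(length):
        pos = list(range(0, length - crop + 1, step))
        # add a final crop flush with the border if the leftover strip is large
        if length - (pos[-1] + crop) > thresh:
            pos.append(length - crop)
        return pos

    return [(t, l) for t in positions(h) for l in positions(w)]

# e.g. a 2048x1080 DIV2K HR image is tiled into 480x480 sub-images
boxes = subimage_boxes(1080, 2048)
```

Cropping once, offline, trades disk space for much faster random-access reads during training, which is typically where the 1 min / 100 iters slowdown comes from.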

Thanks for your advice.