ckkelvinchan / RealBasicVSR

Official repository of "Investigating Tradeoffs in Real-World Video Super-Resolution"

GPU num for training

Xiao-R-Y opened this issue

Thanks for your excellent work. I ran into a problem when training with only one GPU. Could you please give me some guidance on the commands for non-distributed training? Thank you.

The logs are as follows:
Training command is /home/zhangyang/envs/anaconda3/envs/realVSR/bin/python -m torch.distributed.launch --nproc_per_node=1 --master_port=21932 /home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmedit/.mim/tools/train.py configs/realbasicvsr_wogan_c64b20_2x30x8_lr1e-4_300k_reds.py --launcher pytorch.
/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmcv/__init__.py:21: UserWarning: On January 1, 2023, MMCV will release v2.0.0, in which it will remove components related to the training process and add a data transformation module. In addition, it will rename the package names mmcv to mmcv-lite and mmcv-full to mmcv. See https://github.com/open-mmlab/mmcv/blob/master/docs/en/compatibility.md for more details.
'On January 1, 2023, MMCV will release v2.0.0, in which it will remove '
/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmedit/utils/setup_env.py:33: UserWarning: Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
f'Setting OMP_NUM_THREADS environment variable for each process '
/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmedit/utils/setup_env.py:43: UserWarning: Setting MKL_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
f'Setting MKL_NUM_THREADS environment variable for each process '
Traceback (most recent call last):
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmedit/.mim/tools/train.py", line 171, in <module>
main()
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmedit/.mim/tools/train.py", line 108, in main
cfg.dump(osp.join(cfg.work_dir, osp.basename(args.config)))
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmcv/utils/config.py", line 596, in dump
f.write(self.pretty_text)
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmcv/utils/config.py", line 508, in pretty_text
text, _ = FormatCode(text, style_config=yapf_style, verify=True)
TypeError: FormatCode() got an unexpected keyword argument 'verify'
Traceback (most recent call last):
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/runpy.py", line 193, in _run_module_as_main
"__main__", mod_spec)
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/runpy.py", line 85, in _run_code
exec(code, run_globals)
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/torch/distributed/launch.py", line 260, in <module>
main()
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/torch/distributed/launch.py", line 256, in main
cmd=cmd)
subprocess.CalledProcessError: Command '['/home/zhangyang/envs/anaconda3/envs/realVSR/bin/python', '-u', '/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmedit/.mim/tools/train.py', '--local_rank=0', 'configs/realbasicvsr_wogan_c64b20_2x30x8_lr1e-4_300k_reds.py', '--launcher', 'pytorch']' returned non-zero exit status 1.
Traceback (most recent call last):
File "/home/zhangyang/envs/anaconda3/envs/realVSR/bin/mim", line 8, in <module>
sys.exit(cli())
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/click/core.py", line 1157, in __call__
return self.main(*args, **kwargs)
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/click/core.py", line 1078, in main
rv = self.invoke(ctx)
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/click/core.py", line 1688, in invoke
return _process_result(sub_ctx.command.invoke(sub_ctx))
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/click/core.py", line 1434, in invoke
return ctx.invoke(self.callback, **ctx.params)
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/click/core.py", line 783, in invoke
return __callback(*args, **kwargs)
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mim/commands/train.py", line 111, in cli
other_args=other_args)
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mim/commands/train.py", line 262, in train
cmd, env=dict(os.environ, MASTER_PORT=str(port)))
File "/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/subprocess.py", line 328, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/zhangyang/envs/anaconda3/envs/realVSR/bin/python', '-m', 'torch.distributed.launch', '--nproc_per_node=1', '--master_port=21932', '/home/zhangyang/envs/anaconda3/envs/realVSR/lib/python3.7/site-packages/mmedit/.mim/tools/train.py', 'configs/realbasicvsr_wogan_c64b20_2x30x8_lr1e-4_300k_reds.py', '--launcher', 'pytorch']' returned non-zero exit status 1.
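
A note on reading these logs: the first traceback looks like the root failure, and the two `CalledProcessError` tracebacks after it only report that the child process exited with status 1. `FormatCode() got an unexpected keyword argument 'verify'` suggests that the installed yapf does not accept the `verify` argument that this mmcv version passes when dumping the config, so the crash may be unrelated to the number of GPUs. Below is a minimal sketch for checking this in the same environment. It assumes yapf is importable there, and `accepts_keyword` and `check_yapf` are hypothetical helper names, not part of mmcv or yapf.

```python
import inspect


def accepts_keyword(func, name):
    """Return True if `func` can be called with keyword argument `name`."""
    params = inspect.signature(func).parameters
    if name in params and params[name].kind in (
        inspect.Parameter.POSITIONAL_OR_KEYWORD,
        inspect.Parameter.KEYWORD_ONLY,
    ):
        return True
    # A **kwargs catch-all also accepts any keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def check_yapf():
    """Report whether the installed yapf's FormatCode accepts `verify`.

    Returns None when yapf is not installed, otherwise True or False.
    """
    try:
        # Same import path that mmcv's config.py uses for FormatCode.
        from yapf.yapflib.yapf_api import FormatCode
    except ImportError:
        return None
    return accepts_keyword(FormatCode, "verify")


if __name__ == "__main__":
    result = check_yapf()
    if result is None:
        print("yapf is not installed")
    elif result:
        print("FormatCode accepts 'verify'; the cause is probably elsewhere")
    else:
        print("FormatCode rejects 'verify'; this matches the TypeError in the log")
```

If the check reports that `verify` is rejected, the things to try would be a yapf release that still accepts the argument or an mmcv release that no longer passes it. Which versions those are is not stated in this thread.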