There seems to be a problem with the distributed training code. When I entered the training command, the program did not respond.
pfeducode opened this issue
Hi, @pfeducode! Did you solve this problem?
I also ran into this problem, but I couldn't solve it.
If you have solved it, would you share any ideas?
I deleted the distributed code, and then it could run normally.
There seems to be a problem with the distributed training code. When I entered the training command, the program did not respond, and I had to delete the distributed code.
Please use this command line:
CUDA_VISIBLE_DEVICES=0,1,2,3 python run_dataparallel.py --config config/vox-adv-256.yaml --device_ids 0,1,2,3 --name DaGAN_voxceleb2_depth --rgbd --batchsize 48 --kp_num 15 --generator DepthAwareGenerator
After removing the distributed code for the generator and discriminator and making the corresponding device changes in the "model_dataparallel.py" file, I successfully got it working on a single GPU.
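The workaround above boils down to replacing the `torch.distributed` wrappers with plain single-device execution (or `nn.DataParallel` for multiple GPUs, which needs no process-group initialization). A minimal sketch of that pattern, with an illustrative toy model in place of the repo's `DepthAwareGenerator`:

```python
import torch
import torch.nn as nn

# Toy stand-in for the generator; the real model comes from the repo's config.
generator = nn.Sequential(nn.Linear(8, 8), nn.ReLU(), nn.Linear(8, 3))

# Instead of wrapping with nn.parallel.DistributedDataParallel (which hangs
# if the process group never initializes), move the model to one device:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
generator = generator.to(device)

# Optional multi-GPU without torch.distributed: DataParallel splits each
# batch across all visible GPUs inside a single process.
if torch.cuda.device_count() > 1:
    generator = nn.DataParallel(generator)

x = torch.randn(4, 8, device=device)  # a dummy batch of 4 samples
out = generator(x)
print(tuple(out.shape))
```

Note that `nn.DataParallel` is simpler but generally slower than a correctly configured `DistributedDataParallel`; it trades throughput for not having to launch and synchronize multiple processes.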