what's the main difference between single gpu and data parallel?
HoracceFeng opened this issue
Hi, I'm wondering what the difference is between the "single_gpu" and "data_parallel" training scripts, since they seem to have the same structure and modules and to use the same API.
By the way, could you explain how to use the distributed one? I'm a little confused about how to set the URL and how to get started.
Thanks.
os.environ["CUDA_VISIBLE_DEVICES"] = gpu_devices  # select which GPUs are visible
net = nn.DataParallel(net)  # wrap the model so each batch is split across them
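In other words, the single-GPU script runs the bare module, while the data-parallel script only adds the nn.DataParallel wrapper, which scatters each input batch across the visible GPUs and gathers the outputs (with no GPUs it simply calls the underlying module). A minimal sketch of the two, assuming an illustrative two-layer model rather than anything from this repo:

```python
import torch
import torch.nn as nn

# Illustrative model; any nn.Module behaves the same way.
net = nn.Sequential(nn.Linear(8, 16), nn.ReLU(), nn.Linear(16, 4))

# Single-GPU (or CPU) script: call the module directly.
single_out = net(torch.randn(32, 8))

# Data-parallel script: identical training code, but the module is
# wrapped. DataParallel splits the batch over the GPUs listed in
# CUDA_VISIBLE_DEVICES and concatenates the per-GPU outputs; when no
# GPU is available it falls back to the plain module.
parallel_net = nn.DataParallel(net)
parallel_out = parallel_net(torch.randn(32, 8))

print(single_out.shape, parallel_out.shape)  # both torch.Size([32, 4])
```

Everything else (loss, optimizer, training loop) stays the same, which is why the two scripts look nearly identical.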
I'd suggest taking a little time to look at the code yourself before posting questions. I'm not a frequent PyTorch user myself, but this one isn't hard to spot.
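On the distributed question: the URL you set is the init_method passed to torch.distributed.init_process_group, typically a tcp:// address of the rank-0 machine (or a shared-filesystem file:// path) that all processes use to find each other. A minimal sketch that runs as a single process with the gloo backend; the host, port, and model here are placeholders, not values from this repo:

```python
import torch
import torch.distributed as dist
import torch.nn as nn
from torch.nn.parallel import DistributedDataParallel as DDP

# Every process connects to the same URL; with world_size=1 and a
# local address this sketch runs standalone on CPU.
dist.init_process_group(
    backend="gloo",                       # use "nccl" for multi-GPU training
    init_method="tcp://127.0.0.1:29500",  # placeholder host:port of rank 0
    world_size=1,                         # total number of processes
    rank=0,                               # this process's index
)

net = nn.Linear(8, 4)             # illustrative model
ddp_net = DDP(net)                # gradients are all-reduced across ranks
out = ddp_net(torch.randn(2, 8))
print(out.shape)                  # torch.Size([2, 4])

dist.destroy_process_group()
```

In a real multi-GPU run you would launch one process per GPU (for example with torch.distributed.launch or torchrun) and give each process its own rank while keeping the same init_method URL.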