Terrible! Cannot continue training a model on a different GPU!
Naruto-Sasuke opened this issue
Hungryof commented
It is quite strange! I have multiple GPUs. I used your Instance Norm code in my model and trained it on GPU-4. When I load the model on GPU-5 and continue training, it fails with this error:
InstanceNormalization.lua:50: arguments are located on different GPUs at
/home/clp001/torch/extra/cutorch/lib/THC/generated/../generic/THCTensorMathReduce.cu:38
stack traceback:
[C]: in function 'mean'
I then tried to continue training on the same GPU-4, and it also broke with the same error.
Hungryof commented
It is not related to InstanceNorm. It was a bug in my code.