Terrible! Cannot continue training the model on a different GPU!

Question

Naruto-Sasuke opened this issue 7 years ago

It is quite strange! I have multiple GPUs. I use your Instance Norm code in my model and train it on GPU-4. When I load the model on GPU-5 and continue training, it fails with this error:

InstanceNormalization.lua:50: arguments are located on different GPUs at
 /home/clp001/torch/extra/cutorch/lib/THC/generated/../generic/THCTensorMathReduce.cu:38
stack traceback:
        [C]: in function 'mean'

When I then try to continue training on the same GPU-4, it also breaks with the same error.

Hungryof · Answer 1 · Thu Oct 12 2017 13:48:48 GMT+0800 (China Standard Time)

It is not related to InstanceNorm. It is a bug in my code.