Multi-GPU support?
psu1 opened this issue · comments
When I run with torch.nn.DataParallel(net).cuda(), I get "AttributeError: 'DataParallel' object has no attribute 'loss'".
After I change loss = net.loss to loss = net.module.loss, I get the error "TypeError: unsupported operand type(s) for +: 'NoneType' and 'NoneType'" at return self.bbox_loss + self.iou_loss + self.cls_loss
Do I need to rewrite the loss function outside "class Darknet19(nn.Module)"?
Any better idea?
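Yes, you need to rewrite the loss function outside the model.
DataParallel replicates your model onto each GPU on every forward pass, so anything assigned to self inside forward (such as self.bbox_loss) ends up on the replicas, not on the original module you reach through net.module. That is why those member variables are still None when you read them.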
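To make the suggestion above concrete, here is a minimal sketch of computing the loss outside a DataParallel-wrapped model. TinyNet and compute_loss are hypothetical stand-ins, not the repository's Darknet19 or its actual bbox/iou/cls losses; the only point is that forward returns tensors and the loss is built from the gathered output.

```python
import torch
import torch.nn as nn


class TinyNet(nn.Module):
    """Hypothetical stand-in for the detector: forward only returns predictions."""

    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(8, 3)

    def forward(self, x):
        # Return outputs instead of storing losses on self. DataParallel
        # gathers the returned tensors from all replicas onto one device.
        return self.fc(x)


def compute_loss(pred, target):
    # Hypothetical placeholder; the real bbox/iou/cls terms would go here.
    return nn.functional.mse_loss(pred, target)


# Falls back to CPU when no GPU is present (DataParallel then just calls the module).
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
net = nn.DataParallel(TinyNet()).to(device)
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

x = torch.randn(16, 8, device=device)
target = torch.randn(16, 3, device=device)

pred = net(x)                      # batch is split across GPUs, outputs gathered
loss = compute_loss(pred, target)  # loss computed outside the model

optimizer.zero_grad()
loss.backward()
optimizer.step()
```

Another option, if the loss is expensive, is to have forward take the targets and return the loss tensor itself; each replica then returns its own value and you would average the gathered results before calling backward.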
Hi, I am trying to run the code with multiple GPUs and I have rewritten the loss function outside the model. Training looks normal; however, when I test it, I get a lot of negative APs. Do you have any idea what the reason might be? Thanks!