GPU not utilized
alexhuang2020 opened this issue
I have a GTX 1050 Ti, and I checked:
```
>>> torch.cuda.device_count()
1
>>> torch.cuda.is_available()
True
```
However, `nvidia-smi` reported 0% GPU usage when I ran `run('training.LunaTrainingApp', '--epochs=1')`.
However, when I run
```
import torch

a = torch.rand(20000, 20000).cuda()
while True:
    a += 1
    a -= 1
```
GPU utilization goes up to close to 100%.
What could be the problem?
Thanks.
Now I have loaded 10,000 samples into Google Colab with a GPU and run `run('training.LunaTrainingApp', '--epochs=1')`; GPU utilization there is also 0%.
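One way to narrow this down (a minimal sketch, not from the original report; `nn.Linear` stands in for `LunaModel`) is to check which device the model's parameters actually live on:
```
import torch
import torch.nn as nn

model = nn.Linear(10, 10)                    # stand-in for LunaModel
print(next(model.parameters()).device)       # cpu -> nvidia-smi will show ~0% GPU usage

if torch.cuda.is_available():
    model = model.to(torch.device("cuda"))   # moves parameters and buffers to the GPU
    print(next(model.parameters()).device)   # cuda:0 -> forward/backward now run on the GPU
```
If the parameters still report `cpu` after `initModel` runs, the model was never moved, which would match the 0% utilization above.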
The code is wrong in the `initModel` function in training.py.
Current code:
```
def initModel(self):
    model = LunaModel()
    if self.use_cuda:
        log.info("Using CUDA; {} devices.".format(torch.cuda.device_count()))
        if torch.cuda.device_count() > 1:
            model = nn.DataParallel(model)  #! This doesn't work; the model stays on the CPU.
        model = model.to(self.device)  # Not exactly sure what this is doing.
    return model
```
Needs to be:
```
def initModel(self):
    model = LunaModel()
    if self.use_cuda:
        log.info("Using CUDA; {} devices.".format(torch.cuda.device_count()))
        model = model.to(self.device)  # Move to the GPU first; DataParallel expects the parameters to already be on a CUDA device.
        if torch.cuda.device_count() > 1:
            model = nn.DataParallel(model, device_ids=range(torch.cuda.device_count()))  # Pass the device list explicitly.
    return model
```
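As a quick sanity check after the reorder (a sketch; the helper below is hypothetical and not part of training.py), you can assert that whatever `initModel` returns actually has its parameters on a CUDA device:
```
import torch
import torch.nn as nn

def assert_on_gpu(model):
    # Unwrap DataParallel if present, then check where the parameters live.
    inner = model.module if isinstance(model, nn.DataParallel) else model
    device = next(inner.parameters()).device
    assert device.type == "cuda", f"model parameters are still on {device}"
    return device
```
Calling something like `assert_on_gpu(self.initModel())` before the training loop starts would fail fast if the model was left on the CPU.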