VainF / DeepLabV3Plus-Pytorch

Pretrained DeepLabv3 and DeepLabv3+ for Pascal VOC & Cityscapes

Issue with Multi-GPU Training/Predicting using --gpu_id

Biste-Wang opened this issue

Problem:

When I train or predict on multiple GPUs using the --gpu_id flag, only one GPU is utilized, even though I specify several (--gpu_id 0,1).

Environment:

  • PyTorch version: 1.12.0
  • CUDA version: 12.2
  • GPU model(s): 0,1
  • Operating System: Windows

Reproducible Example:

python predict.py --input datasets/data/mydata --dataset cityscapes --model deeplabv3plus_resnet101 --val_batch_size 64 --ckpt checkpoints/best_deeplabv3plus_resnet101_cityscapes_os16.pth --save_val_results_to test_results/myresult --gpu_id 0,1

Expected Behavior:
I expect the training or predicting to utilize both GPUs specified in --gpu_id.

Actual Behavior:
Only one GPU is being used, and the workload is not distributed across the specified GPUs.
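
One thing worth ruling out, as an assumption rather than something confirmed in this report: if the script wraps the model in torch.nn.DataParallel, each input batch is split along dimension 0 across the visible devices. A script that forwards one image at a time would then keep only the first GPU busy, whatever --val_batch_size says. The pure-Python sketch below mimics that chunking to show how many devices would receive data; `parse_gpu_ids` and `chunk_sizes` are hypothetical helper names, not part of the repository.

```python
def parse_gpu_ids(gpu_id: str) -> list:
    """Turn a comma-separated string such as "0,1" into [0, 1]."""
    return [int(part) for part in gpu_id.split(",") if part.strip()]


def chunk_sizes(batch_size: int, num_devices: int) -> list:
    """Per-device batch sizes when a batch is chunked along dim 0.

    Mirrors torch.chunk semantics: every chunk has ceil(batch / devices)
    items except possibly the last, and fewer chunks than devices are
    produced when the batch is too small.
    """
    if batch_size <= 0 or num_devices <= 0:
        return []
    per_device = -(-batch_size // num_devices)  # ceiling division
    sizes = []
    remaining = batch_size
    while remaining > 0:
        take = min(per_device, remaining)
        sizes.append(take)
        remaining -= take
    return sizes


if __name__ == "__main__":
    devices = parse_gpu_ids("0,1")
    for batch in (64, 1):
        sizes = chunk_sizes(batch, len(devices))
        print(f"batch={batch}: {len(sizes)} device(s) receive data, sizes={sizes}")
```

If the forward pass really does see a batch of one image, the second GPU has nothing to do, which would match the behavior described above.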

Additional Information:

  • I have verified that both GPUs are available and functional.
  • My PyTorch version is up-to-date.
  • CUDA and cuDNN versions are compatible with PyTorch.
  • The model and optimizer are moved to the correct device in the script.
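
To back up the checks above, a small diagnostic can report which devices PyTorch actually sees. This is a minimal sketch, assuming the --gpu_id string is passed to the CUDA_VISIBLE_DEVICES environment variable (an assumption about the script, not verified here); `visible_gpu_report` is a hypothetical helper. It has to run before anything initializes CUDA, because the variable is read only once.

```python
import os


def visible_gpu_report(gpu_id: str) -> dict:
    """Set CUDA_VISIBLE_DEVICES and report the devices PyTorch can see."""
    # Only takes effect if CUDA has not been initialized in this process yet.
    os.environ["CUDA_VISIBLE_DEVICES"] = gpu_id
    report = {
        "requested": [part.strip() for part in gpu_id.split(",") if part.strip()],
        "torch_installed": False,
        "device_count": None,
        "device_names": [],
    }
    try:
        import torch  # optional: the report degrades gracefully without it
    except ImportError:
        return report
    report["torch_installed"] = True
    report["device_count"] = torch.cuda.device_count()
    report["device_names"] = [
        torch.cuda.get_device_name(i) for i in range(report["device_count"])
    ]
    return report


if __name__ == "__main__":
    print(visible_gpu_report("0,1"))
```

A device_count lower than the number of requested IDs would point to a visibility or driver problem rather than to the model code.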

Any help or suggestions to troubleshoot this issue would be greatly appreciated!