About GPU memory usage
muzi2045 opened this issue · comments
I tried to train the model on the second GPU card in the server,
but some GPU memory is still allocated on the first GPU card.
Is there any way to force the training process to use only a single card's memory?
BTW: starting the training with Python 3 multiprocessing is really slow.
Hi, did you try with export CUDA_VISIBLE_DEVICES=1?
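If exporting the variable in the shell is inconvenient, the same effect can be had from inside the script. This is a minimal sketch, not something from this repository: `restrict_to_gpu` is a hypothetical helper name, and it assumes it runs before anything initializes CUDA.

```python
import os


def restrict_to_gpu(physical_index: int) -> None:
    """Hypothetical helper: expose only one physical GPU to this process."""
    # Must run before the first CUDA call; the safest place is the very top
    # of the entry script, before importing torch.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(physical_index)


restrict_to_gpu(1)
# From here on, the only visible device is physical GPU 1,
# and frameworks address it as "cuda:0".
```

Child processes inherit the environment, so data loader workers started afterwards see the same single card.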
OK, I can force CUDA_VISIBLE_DEVICES=1,
but the main cause is that the dataset loader processes some things on GPU:0:
this part in proposal_targets.py is computed in a forked process in Python 3.
Can the training data preparation be switched to the CPU device?
Then the multiprocessing.set_start_method('spawn') call
could be commented out.
Update:
The proposal target computation is really slow on CPU...
I'm looking for a faster way to prepare data without the GPU.
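For reference, the CPU-only data preparation pattern could look roughly like the sketch below. It is an illustration under assumptions, not this repository's code: `CpuTargetDataset`, `make_loader`, and `move_batch` are hypothetical names, and the random tensors stand in for real point clouds and targets. The idea is that workers only ever touch CPU tensors, and the training process moves each finished batch to the GPU.

```python
import torch
from torch.utils.data import DataLoader, Dataset


class CpuTargetDataset(Dataset):
    """Hypothetical dataset: all per-sample work stays on CPU tensors."""

    def __init__(self, num_samples=8):
        self.num_samples = num_samples

    def __len__(self):
        return self.num_samples

    def __getitem__(self, idx):
        points = torch.rand(16, 4)  # stand-in for a point cloud
        targets = (points[:, 0] > 0.5).long()  # stand-in for target assignment
        return points, targets  # no .cuda() call inside the worker


def make_loader(num_workers=0):
    # Workers never touch CUDA, so the default start method is sufficient.
    return DataLoader(CpuTargetDataset(), batch_size=4, num_workers=num_workers)


def move_batch(batch, device):
    # Only the training process transfers data to the GPU.
    return tuple(t.to(device, non_blocking=True) for t in batch)
```

In the training loop, `move_batch(batch, torch.device("cuda:0"))` would be called once per batch. With `CUDA_VISIBLE_DEVICES=1` set, `cuda:0` refers to the second physical card.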
I agree that target assignment needing to run on the GPU is a problem. I have some fixes for this:
The proposal target part is probably slow on CPU because of the rotated IoU computation. But actually, the original SECOND does not use rotated IoU for assigning anchors to ground truth. Instead it uses the "nearest standing/lying axis-aligned IoU", which I think can run fast on CPU. In my private branch I have replaced the rotated IoU with nearest standing/lying IoU. I will push this change, hopefully soon.
The database sampling also uses rotated IoU for collision checks, but the dimensions of the IoU matrix are much smaller (ground truth x ground truth as opposed to anchors x ground truth), so I think that can run fast on CPU.
Here is the code I use for nearest standing/lying IoU:
import math

import torch
from torchvision.ops import boxes as box_ops


def _snap_boxes_axis_aligned(boxes):
    """Snap (x, y, dx, dy, theta) boxes to axis-aligned (x1, y1, x2, y2)."""
    xy, dxy, r = boxes.split([2, 2, 1], -1)
    # Swap dx and dy when the box is closer to a 90-degree rotation than to 0.
    flip = r.sin().abs() > 1 / math.sqrt(2)
    dxy = torch.where(flip, dxy.flip(-1), dxy)
    # Note: this treats (x, y) as the minimum corner of the box.
    boxes = torch.cat((xy, xy + dxy), -1)
    return boxes


def box_iou_snapped(boxes1, boxes2):
    """Boxes in (x, y, dx, dy, theta) format."""
    iou = box_ops.box_iou(
        _snap_boxes_axis_aligned(boxes1),
        _snap_boxes_axis_aligned(boxes2),
    )
    return iou
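To check the idea by hand without torchvision, here is a NumPy-only sketch of the same snapping approach. It is a variant, not the code above: it assumes (x, y) is the box center, and the names `snap_axis_aligned` and `box_iou_snapped_np` are made up for illustration. Zero-area boxes are not handled.

```python
import numpy as np


def snap_axis_aligned(boxes):
    """(x, y, dx, dy, theta), with (x, y) assumed to be the center, to corners."""
    boxes = np.asarray(boxes, dtype=np.float64)
    xy, dxy, r = boxes[:, :2], boxes[:, 2:4], boxes[:, 4:5]
    # Swap dx and dy when the rotation is closer to 90 degrees than to 0.
    flip = np.abs(np.sin(r)) > 1 / np.sqrt(2)
    dxy = np.where(flip, dxy[:, ::-1], dxy)
    return np.concatenate((xy - dxy / 2, xy + dxy / 2), axis=-1)


def box_iou_snapped_np(boxes1, boxes2):
    """Pairwise IoU matrix of shape (len(boxes1), len(boxes2))."""
    a = snap_axis_aligned(boxes1)
    b = snap_axis_aligned(boxes2)
    # Intersection rectangle for every pair, clipped to zero when disjoint.
    lt = np.maximum(a[:, None, :2], b[None, :, :2])
    rb = np.minimum(a[:, None, 2:], b[None, :, 2:])
    wh = np.clip(rb - lt, 0, None)
    inter = wh[..., 0] * wh[..., 1]
    area_a = (a[:, 2] - a[:, 0]) * (a[:, 3] - a[:, 1])
    area_b = (b[:, 2] - b[:, 0]) * (b[:, 3] - b[:, 1])
    return inter / (area_a[:, None] + area_b[None, :] - inter)


lying = [0.0, 0.0, 4.0, 2.0, 0.0]          # 4 x 2 box, no rotation
standing = [0.0, 0.0, 2.0, 4.0, np.pi / 2]  # 2 x 4 box rotated 90 degrees
shifted = [2.0, 0.0, 4.0, 2.0, 0.0]         # 4 x 2 box moved 2 units in x
iou = box_iou_snapped_np([lying], [standing, shifted])
```

The 2 x 4 box rotated by 90 degrees snaps to the same footprint as the unrotated 4 x 2 box, so the first entry is 1. The shifted box overlaps half of the first box, giving 4 / (8 + 8 - 4) = 1/3.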