jhultman / vision3d

Research platform for 3D object detection in PyTorch.

infer top K predicted boxes?

eraofelix opened this issue

How can I get the top K predicted boxes from out = net.proposal(item) in inference.py?

Hi, I will add this code to the repo soon, but I want to refactor it first. In the meantime you can use:

import torch
from pvrcnn.ops import nms_rotated, box_iou_rotated
from pvrcnn.core import cfg, AnchorGenerator

def inference(out, anchors, cfg):
    # Drop the batch dimension (assumes a batch size of 1).
    cls_map, reg_map = out['P_cls'].squeeze(0), out['P_reg'].squeeze(0)
    score_map = cls_map.sigmoid()
    # Best class per anchor, then keep the TOPK highest-scoring anchors.
    top_scores, class_idx = score_map.view(cfg.NUM_CLASSES, -1).max(0)
    top_scores, anchor_idx = top_scores.topk(k=cfg.PROPOSAL.TOPK)
    class_idx = class_idx[anchor_idx]
    # Gather the matching anchors and regression outputs.
    top_anchors = anchors.view(cfg.NUM_CLASSES, -1, cfg.BOX_DOF)[class_idx, anchor_idx]
    top_boxes = reg_map.reshape(cfg.NUM_CLASSES, -1, cfg.BOX_DOF)[class_idx, anchor_idx]

    # Split into center (xyz), size (wlh) and yaw.
    P_xyz, P_wlh, P_yaw = top_boxes.split([3, 3, 1], dim=1)
    A_xyz, A_wlh, A_yaw = top_anchors.split([3, 3, 1], dim=1)

    # Normalizer: norm of the anchor's (w, l) for x and y, anchor height for z.
    A_wl, A_h = A_wlh.split([2, 1], -1)
    A_norm = A_wl.norm(dim=-1, keepdim=True).expand(-1, 2)
    A_norm = torch.cat((A_norm, A_h), dim=-1)

    # Decode the regression outputs relative to the anchors.
    top_boxes = torch.cat((
        (P_xyz * A_norm + A_xyz),
        (torch.exp(P_wlh) * A_wlh),
        (P_yaw + A_yaw)), dim=1
    )

    # Rotated NMS on columns [0, 1, 3, 4, 6], i.e. [x, y, w, l, yaw].
    nms_idx = nms_rotated(top_boxes[:, [0, 1, 3, 4, 6]], top_scores, iou_threshold=0.01)
    top_boxes = top_boxes[nms_idx]
    top_scores = top_scores[nms_idx]
    top_classes = class_idx[nms_idx]
    return top_boxes, top_scores, top_classes


anchors = AnchorGenerator(cfg).anchors.cuda()
out = net.proposal(item)
top_boxes, top_scores, top_classes = inference(out, anchors, cfg)

Note that I haven't tested this code thoroughly for multi-class inference, so I suggest you work with a single-class (car category) model for now.
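The top-K selection and box decoding in the snippet above can be tried on toy tensors without the pvrcnn package. This is only a minimal sketch: the helper name decode_boxes and all numbers are made up for illustration, and the [x, y, z, w, l, h, yaw] layout is assumed from the split([3, 3, 1]) calls.

```python
import torch

def decode_boxes(deltas, anchors):
    """Decode (N, 7) regression outputs against (N, 7) anchors.

    Assumed layout: [x, y, z, w, l, h, yaw]. Hypothetical helper that
    mirrors the decoding arithmetic of the snippet above.
    """
    P_xyz, P_wlh, P_yaw = deltas.split([3, 3, 1], dim=1)
    A_xyz, A_wlh, A_yaw = anchors.split([3, 3, 1], dim=1)
    A_wl, A_h = A_wlh.split([2, 1], dim=-1)
    # x and y offsets are scaled by the norm of (w, l), z by the anchor height.
    diag = A_wl.norm(dim=-1, keepdim=True).expand(-1, 2)
    scale = torch.cat((diag, A_h), dim=-1)
    return torch.cat(
        (P_xyz * scale + A_xyz, torch.exp(P_wlh) * A_wlh, P_yaw + A_yaw), dim=1
    )

# Toy problem: 1 class, 4 anchor positions, made-up scores.
scores = torch.tensor([[0.1, 0.9, 0.3, 0.7]])             # (NUM_CLASSES, N)
anchors = torch.tensor([10.0, 0.0, -1.0, 3.0, 4.0, 1.5, 0.0]).repeat(1, 4, 1)
deltas = torch.zeros(1, 4, 7)
deltas[0, 1] = torch.tensor([0.2, -0.1, 0.1, 0.0, 0.0, 0.0, 0.5])

best_scores, class_idx = scores.max(0)                     # best class per anchor
top_scores, anchor_idx = best_scores.topk(k=2)             # keep the 2 best anchors
class_idx = class_idx[anchor_idx]
boxes = decode_boxes(deltas[class_idx, anchor_idx], anchors[class_idx, anchor_idx])
print(anchor_idx.tolist())
print(boxes)
```

Anchors with zero regression output decode to the anchor itself, which is a quick sanity check before running NMS on real predictions.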

Thanks!

No problem! By the way, I am working on reproducing SECOND/PointPillars results using this repo and hope to have good results soon. At that point I will release more complete training and inference code.

Will the PV-RCNN network perform much better on the nuScenes dataset?
I am working on a nuScenes dataset reader and a reader for my own dataset.
@jhultman

Hi, I trained the single-class (car category) model on my own lidar dataset (10k lidar frames). Here is the loss curve:
[image: loss curve]

I used your inference code but got an error: RuntimeError: CUDA error: device-side assert triggered
So I wrote a NumPy version of your inference(out, anchors, cfg) function, but the inference result does not look right:
[image: inference result]
Here is my inference code:
import copy
import numpy as np
import torch
from pvrcnn.ops import nms_rotated

def inference(out, anchors, cfg):
    # tensor2numpy
    cls_map = out['P_cls'].cpu().numpy().squeeze(0)  # (2, 2, 200, 176) (NUM_CLASSES+1, NUM_YAW, ny, nx)
    reg_map = out['P_reg'].cpu().numpy().squeeze(0)  # (1, 2, 200, 176, 7)
    anchors = anchors.cpu().numpy()       # (1, 2, 200, 176, 7)
    score_map = 1/(1+np.exp(-cls_map))
    top_scores = score_map.reshape([cfg.NUM_CLASSES+1, -1]).max(0)
    top_scores_copy = copy.deepcopy(top_scores)
    top_scores.sort()
    top_scores = top_scores[-cfg.PROPOSAL.TOPK:][::-1]
    anchor_idx = top_scores_copy.argsort()[-cfg.PROPOSAL.TOPK:][::-1]

    top_anchors = anchors.reshape([cfg.NUM_CLASSES, -1, cfg.BOX_DOF])
    top_anchors = top_anchors[:, anchor_idx, :]

    top_boxes = reg_map.reshape([cfg.NUM_CLASSES, -1, cfg.BOX_DOF])
    top_boxes = top_boxes[:, anchor_idx, :]

    P_xyz, P_wlh, P_yaw = top_boxes[:, :, :3], top_boxes[:, :, 3:6], top_boxes[:, :, 6:]
    A_xyz, A_wlh, A_yaw = top_anchors[:, :, :3], top_anchors[:, :, 3:6], top_anchors[:, :, 6:]

    A_wl, A_h = A_wlh[:, :, :2], A_wlh[:, :, 2:]
    A_norm = np.linalg.norm(A_wl, axis=-1, keepdims=True)  # (1,100,1)
    A_norm = np.concatenate([A_norm, A_norm, A_h], axis=-1)  # (1,100,3)
    top_boxes = np.concatenate([P_xyz * A_norm + A_xyz,
                                np.exp(P_wlh) * A_wlh,
                                P_yaw + A_yaw], axis=-1)[0]
    # numpy2tensor
    # Subtracting zeros makes a copy with positive strides; torch.from_numpy
    # rejects the negatively strided view produced by [::-1].
    top_scores = torch.from_numpy(top_scores-np.zeros_like(top_scores)).float().cuda()
    top_boxes = torch.from_numpy(top_boxes).float().cuda()

    # [0, 1, 3, 4, 6] ==>[x, y, w, l, yaw]
    nms_idx = nms_rotated(top_boxes[:, [0, 1, 3, 4, 6]], top_scores, iou_threshold=0.2)
    top_boxes = top_boxes[nms_idx]
    top_scores = top_scores[nms_idx]
    return top_boxes, top_scores

In addition, is this line in your inference code correct? Shouldn't it use cfg.NUM_CLASSES+1?
top_scores, class_idx = score_map.view(cfg.NUM_CLASSES, -1).max(0)
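A quick shape check suggests one plausible cause of the device-side assert, assuming the shapes noted in the code comments above (a class map with NUM_CLASSES+1 = 2 channels, anchors with 1 class). Viewing the class map as (NUM_CLASSES, -1) then yields twice as many scores as there are anchors, so a top-K index can point past the end of the anchor array. The sketch below uses zero-filled NumPy arrays purely to compare sizes.

```python
import numpy as np

# Shapes taken from the comments in the snippet above (assumed, not verified).
NUM_CLASSES, NUM_YAW, NY, NX, BOX_DOF = 1, 2, 200, 176, 7
cls_map = np.zeros((NUM_CLASSES + 1, NUM_YAW, NY, NX))     # extra background channel
anchors = np.zeros((NUM_CLASSES, NUM_YAW, NY, NX, BOX_DOF))

# Number of candidate scores vs. number of anchors after flattening.
num_scores = cls_map.reshape(NUM_CLASSES, -1).shape[1]
num_anchors = anchors.reshape(NUM_CLASSES, -1, BOX_DOF).shape[1]
print(num_scores, num_anchors)

# Any selected index >= num_anchors is out of range for the anchor array.
try:
    anchors.reshape(NUM_CLASSES, -1, BOX_DOF)[0, num_scores - 1]
    out_of_range = False
except IndexError:
    out_of_range = True
print(out_of_range)
```

On the GPU the same out-of-range gather surfaces as an asynchronous assert rather than an IndexError, which is why the CPU version is easier to read.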

Hi @eraofelix. Sorry, I realized I did not share some of the other uncommitted changes I have made. I will push these changes within the hour. I switched to representing the background class by zeros (as is done in the SECOND repository) instead of a one-hot background channel. (I would prefer to use softmax output at inference time, as is done in RetinaNet, but first I will try to reproduce the results with sigmoid output.) I also now normalize cls_loss by the number of positive examples instead of the total number of non-ignored anchors. This prevents reg_loss from dominating.
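The two changes described above could look roughly like the following. This is a hedged sketch, not the repository's code: the function name and the label convention (-1 = ignore, 0 = background, 1..num_classes = foreground) are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

def cls_loss_pos_normalized(logits, labels, num_classes):
    """Sigmoid classification loss with all-zero background targets.

    logits: (N, num_classes) raw scores.
    labels: (N,) int64; hypothetical convention -1 = ignore, 0 = background,
            1..num_classes = foreground classes.
    """
    keep = (labels >= 0).unsqueeze(1).float()               # mask out ignored anchors
    # One-hot over num_classes + 1, then drop the background column so that
    # background anchors get an all-zero target row.
    targets = F.one_hot(labels.clamp(min=0), num_classes + 1)[:, 1:].float()
    loss = F.binary_cross_entropy_with_logits(logits, targets, reduction='none')
    # Normalize by the number of positives rather than all non-ignored anchors.
    num_pos = (labels > 0).sum().clamp(min=1)
    return (loss * keep).sum() / num_pos

# Toy check: one positive, two background, one ignored anchor, zero logits.
logits = torch.zeros(4, 1)
labels = torch.tensor([1, 0, 0, -1])
loss = cls_loss_pos_normalized(logits, labels, num_classes=1)
print(loss.item())
```

Because positives are usually far fewer than non-ignored anchors, dividing by their count makes the classification term larger relative to the regression term than dividing by all non-ignored anchors would.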

With these changes, I am getting closer to reproducing SECOND results (using the inference code from above):
[image: second]

@muzi2045 Yes, I would like to get PV-RCNN working eventually (nuScenes would be great). So far I have been focusing on the proposal stage to make sure there are no bugs in the target assigner or preprocessing.

@eraofelix I have just pushed the changes. There may be some breaking changes to the dataset class if you are working on your own reader, but they are only stylistic, so feel free to keep your version of that file.

Great, but I had to add these 4 lines to second.py to run training:
[image]