soeaver / py-RFCN-priv

code for py-R-FCN-multiGPU maintained by bupt-priv

Question about the bbox_transform_inv function

lucasjinreal opened this issue

import numpy as np

def bbox_transform_inv(boxes, deltas):
    if boxes.shape[0] == 0:
        return np.zeros((0, deltas.shape[1]), dtype=deltas.dtype)
    boxes = boxes.astype(deltas.dtype, copy=False)

    widths = boxes[:, 2] - boxes[:, 0] + 1.0
    heights = boxes[:, 3] - boxes[:, 1] + 1.0
    ctr_x = boxes[:, 0] + 0.5 * widths
    ctr_y = boxes[:, 1] + 0.5 * heights

    dx = deltas[:, 0::4]
    dy = deltas[:, 1::4]
    dw = deltas[:, 2::4]
    dh = deltas[:, 3::4]

    pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]
    pred_ctr_y = dy * heights[:, np.newaxis] + ctr_y[:, np.newaxis]
    pred_w = np.exp(dw) * widths[:, np.newaxis]
    pred_h = np.exp(dh) * heights[:, np.newaxis]

    pred_boxes = np.zeros(deltas.shape, dtype=deltas.dtype)
    # x1
    pred_boxes[:, 0::4] = pred_ctr_x - 0.5 * pred_w
    # y1
    pred_boxes[:, 1::4] = pred_ctr_y - 0.5 * pred_h
    # x2
    pred_boxes[:, 2::4] = pred_ctr_x + 0.5 * pred_w
    # y2
    pred_boxes[:, 3::4] = pred_ctr_y + 0.5 * pred_h

    return pred_boxes

This function is located in bbox_transform.py.
In this line:

    pred_ctr_x = dx * widths[:, np.newaxis] + ctr_x[:, np.newaxis]

dx and widths[:, np.newaxis] mostly do not have the same size, for example [1769, 1] vs. [1290, 1]. dx comes from deltas and widths comes from boxes, and those two arrays do not match in dimension 0.

How can a * b be applied if the operands do not match in dimension 0?

For example, a (3, 1) array cannot be multiplied elementwise by a (2, 1) array:

>>> a = np.array([[3], [4], [5]])
>>> a
array([[3],
       [4],
       [5]])
>>> b = np.array([[4], [8]])
>>> a*b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: operands could not be broadcast together with shapes (3,1) (2,1) 
>>> 
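
For contrast, here is a small sketch of the case the function seems to rely on. When boxes and deltas have the same number of rows N, widths[:, np.newaxis] has shape (N, 1) and dx has shape (N, K), so NumPy only has to stretch dimension 1. The array values and the choice of K = 2 classes are made up for illustration; they are not taken from the repository.

```python
import numpy as np

# Hypothetical inputs: N = 3 boxes, K = 2 classes (deltas has 4 * K = 8 columns).
boxes = np.array([[0, 0, 9, 9],
                  [10, 10, 29, 19],
                  [5, 5, 14, 24]], dtype=np.float32)
deltas = np.zeros((3, 8), dtype=np.float32)

widths = boxes[:, 2] - boxes[:, 0] + 1.0   # shape (3,)
dx = deltas[:, 0::4]                        # shape (3, 2)

# (3, 2) * (3, 1): dimension 0 matches, dimension 1 is stretched from 1 to 2.
scaled = dx * widths[:, np.newaxis]
print(scaled.shape)

# With a different number of rows the same expression fails.
try:
    deltas[:2, 0::4] * widths[:, np.newaxis]   # (2, 2) * (3, 1)
    mismatch_error = None
except ValueError as exc:
    mismatch_error = str(exc)
print(mismatch_error)
```

So a size-1 axis can be broadcast, but two different sizes greater than 1 on the same axis cannot, which matches the traceback above.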

I get exactly the same error when I train with VOC data.
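
One way to narrow this down might be to check the shapes just before the call. The sketch below is only an assumption about how that could look: check_bbox_shapes is a hypothetical helper, not part of py-RFCN-priv, and the arrays are zero-filled placeholders that reuse the row counts quoted above.

```python
import numpy as np

def check_bbox_shapes(boxes, deltas):
    """Hypothetical helper: fail early with a readable message on bad shapes."""
    if boxes.ndim != 2 or boxes.shape[1] < 4:
        raise ValueError("boxes needs at least 4 columns, got %s" % (boxes.shape,))
    if deltas.ndim != 2 or deltas.shape[1] % 4 != 0:
        raise ValueError("deltas should be (N, 4 * K), got %s" % (deltas.shape,))
    if boxes.shape[0] != deltas.shape[0]:
        raise ValueError(
            "row mismatch: boxes has %d rows but deltas has %d"
            % (boxes.shape[0], deltas.shape[0]))

# Row counts taken from the numbers quoted above; the values are placeholders.
boxes = np.zeros((1290, 4), dtype=np.float32)
deltas = np.zeros((1769, 4), dtype=np.float32)

try:
    check_bbox_shapes(boxes, deltas)
    message = "shapes are compatible"
except ValueError as exc:
    message = str(exc)
print(message)
```

Calling such a check right before bbox_transform_inv would show which of the two inputs has the unexpected number of rows, instead of the generic broadcast error.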