experiencor / keras-yolo2

Easy training on custom datasets. Various backends (MobileNet and SqueezeNet) are supported. A YOLO demo that detects raccoons, running entirely in the browser, is accessible at https://git.io/vF7vI (not on Windows).

Custom Loss Function - Adjust Confidence Section

pepo023 opened this issue · comments

I've spent some time playing around with this code, and during training I found that to get a good total_recall I had to crank the OBJECT_SCALE hyperparameter up to 1,000. After going through the custom loss function code and reading the YOLO paper, I've arrived at the following conclusions:

  1. If the predicted boxes at the beginning of training are very off with respect to the ground truth boxes, then the IOU will be very small.
  2. If the IOU is very small, then multiplying it by 1 (the ground truth confidence) gives a target that is much less than 1 in every location where the ground truth had a confidence score of 1.
  3. The predicted confidence score is passed through a sigmoid function, so all the predicted confidence values are between 0 and 1.
  4. The code then computes the difference between the target and predicted confidence scores and squares it, which yields an even smaller number.
  5. The net effect is that the confidence term of the loss ends up being negligible unless OBJECT_SCALE is a "big" number.
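To make the magnitudes in the list above concrete, here is a minimal sketch for a single object cell. It assumes OBJECT_SCALE enters as a plain multiplicative weight on the squared confidence error; the helper names `sigmoid` and `conf_term` are hypothetical and not part of the repository.

```python
import math


def sigmoid(x):
    """Squash a raw network output (logit) into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def conf_term(iou, logit, object_scale):
    """Weighted squared confidence error for one cell that contains an object.

    Assumption: object_scale simply multiplies the squared error.
    """
    target = iou * 1.0      # ground truth confidence (1) scaled by the IoU
    pred = sigmoid(logit)   # predicted confidence, always in (0, 1)
    return object_scale * (target - pred) ** 2


# Early in training: poor overlap and an uninformative logit.
early_iou, early_logit = 0.05, 0.0
for scale in (1.0, 5.0, 1000.0):
    print(scale, conf_term(early_iou, early_logit, scale))
```

Because both the target and the prediction lie between 0 and 1, the unweighted squared error can never exceed 1, so the weight is the only thing that can make this term large relative to the other parts of the loss.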

The question then is, why multiply the ground truth confidence by the IOU?

Could you show where in the code you saw that?
If you multiplied GT by 1/IOU, it would look like you are making the network pay more attention to the samples where the IOU is not so good yet, but multiplying IOU * GT doesn't make sense to me.

This section:

# adjust confidence

true_wh_half = true_box_wh / 2.
true_mins    = true_box_xy - true_wh_half
true_maxes   = true_box_xy + true_wh_half

pred_wh_half = pred_box_wh / 2.
pred_mins    = pred_box_xy - pred_wh_half
pred_maxes   = pred_box_xy + pred_wh_half       

intersect_mins  = tf.maximum(pred_mins,  true_mins)
intersect_maxes = tf.minimum(pred_maxes, true_maxes)
intersect_wh    = tf.maximum(intersect_maxes - intersect_mins, 0.)
intersect_areas = intersect_wh[..., 0] * intersect_wh[..., 1]

true_areas = true_box_wh[..., 0] * true_box_wh[..., 1]
pred_areas = pred_box_wh[..., 0] * pred_box_wh[..., 1]

union_areas = pred_areas + true_areas - intersect_areas
iou_scores  = tf.truediv(intersect_areas, union_areas)

true_box_conf = iou_scores * y_true[..., 4]
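The same computation can be traced by hand for a single pair of boxes. The following is a pure-Python sketch of the steps above, assuming boxes are given as (center_x, center_y, width, height); the function name `iou_xywh` and the example boxes are made up for illustration.

```python
def iou_xywh(true_box, pred_box):
    """IoU of two boxes given as (center_x, center_y, width, height)."""
    tx, ty, tw, th = true_box
    px, py, pw, ph = pred_box

    # Convert center/size to corner coordinates (mins and maxes).
    t_min, t_max = (tx - tw / 2.0, ty - th / 2.0), (tx + tw / 2.0, ty + th / 2.0)
    p_min, p_max = (px - pw / 2.0, py - ph / 2.0), (px + pw / 2.0, py + ph / 2.0)

    # Overlap along each axis, clamped at zero for disjoint boxes.
    inter_w = max(min(t_max[0], p_max[0]) - max(t_min[0], p_min[0]), 0.0)
    inter_h = max(min(t_max[1], p_max[1]) - max(t_min[1], p_min[1]), 0.0)
    inter_area = inter_w * inter_h

    union_area = tw * th + pw * ph - inter_area
    return inter_area / union_area if union_area > 0.0 else 0.0


true_box = (5.0, 5.0, 4.0, 4.0)
pred_box = (6.0, 6.0, 4.0, 4.0)   # shifted by one unit in x and y

iou = iou_xywh(true_box, pred_box)
true_box_conf = iou * 1.0         # y_true[..., 4] is 1 where an object exists
print(iou, true_box_conf)
```

For these two boxes the intersection is 3 x 3 = 9 and the union is 16 + 16 - 9 = 23, so the confidence target for that cell is 9/23 rather than 1.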

I haven't modified the loss function code in any way, only the code around it. Now that I have the time, I'll start exploring changes to the loss function. One difference I noticed is that the square root is not applied to the width and height, as the YOLO paper describes.
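For context on the square-root remark: the YOLO paper compares the square roots of width and height so that a given absolute error counts for more on a small box than on a large one. A minimal sketch of that effect, with made-up widths and a hypothetical helper `size_error`:

```python
import math


def size_error(true_size, pred_size, use_sqrt):
    """Squared error on one box dimension, optionally on square roots."""
    if use_sqrt:
        return (math.sqrt(true_size) - math.sqrt(pred_size)) ** 2
    return (true_size - pred_size) ** 2


# The same absolute error of 1.0 on a small box and on a large box.
small_true, small_pred = 1.0, 2.0
large_true, large_pred = 10.0, 11.0

for use_sqrt in (False, True):
    print(
        "sqrt" if use_sqrt else "plain",
        size_error(small_true, small_pred, use_sqrt),
        size_error(large_true, large_pred, use_sqrt),
    )
```

Without the square root both boxes receive the same penalty; with it, the small box is penalized more heavily than the large one for the same absolute deviation.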

If you look further down, the final object confidence loss is the following:
loss_conf = tf.reduce_sum(tf.square(true_box_conf-pred_box_conf) * conf_mask) / (nb_conf_box + 1e-6) / 2.

which in turn can be viewed as

loss_conf = tf.reduce_sum(tf.square(iou_scores-pred_box_conf) * conf_mask) / (nb_conf_box + 1e-6) / 2.

for the object locations where y_true[..., 4] is 1.

Hence, it essentially forms a regressor that predicts the IoU of the predicted box, and that IoU is the object presence confidence.
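A small pure-Python sketch of the expression above may help show why this behaves as an IoU regressor. It assumes nb_conf_box counts the cells whose mask entry is positive; the function name `conf_loss` and the example values are hypothetical.

```python
def conf_loss(true_box_conf, pred_box_conf, conf_mask):
    """Masked, normalized squared confidence error over a flat list of cells.

    Assumption: nb_conf_box is the number of cells with a positive mask entry.
    """
    nb_conf_box = sum(1.0 for m in conf_mask if m > 0.0)
    total = sum(
        m * (t - p) ** 2
        for t, p, m in zip(true_box_conf, pred_box_conf, conf_mask)
    )
    return total / (nb_conf_box + 1e-6) / 2.0


iou_scores = [0.8, 0.3]     # overlap of each predicted box with its ground truth
y_true_conf = [1.0, 1.0]    # both cells contain an object
conf_mask = [1.0, 1.0]

# Target confidence is the IoU wherever an object is present.
true_box_conf = [i * c for i, c in zip(iou_scores, y_true_conf)]

print(conf_loss(true_box_conf, [0.8, 0.3], conf_mask))   # prediction equals the IoU
print(conf_loss(true_box_conf, [1.0, 1.0], conf_mask))   # prediction is always 1
```

The loss is zero only when the predicted confidence matches the IoU, so a network that always outputs a confidence of 1 for object cells is still penalized whenever its boxes overlap poorly.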

Hope this helped!

Do you know why true_conf is the IoU between pred_box and true_box, rather than between true_box and the anchor box?

Can you share your updated loss function, please?