Cost function in Yolo-v1 #3

Open
christophesaintjean opened this issue May 29, 2017 · 4 comments

@christophesaintjean

Hi,
I think there is an error in the cost function for Yolo-v1.
In the original paper, the authors say that the confidence value should be:
- the IOU for the best box in a cell that contains an object ($1_{ij}^{obj}$)
- and zero elsewhere ($1_{ij}^{noobj}$)
As I read it, the confidence of boxes that are in a cell containing an object BUT are not the best box should be 0.
This leads to a different formulation of self['iou_normal'], which appears hard to reproduce without a model.confidence variable:
self['iou_normal'] = tf.reduce_sum(mask_normal * tf.square(model.confidence), name='iou_normal')
Do you think I am right or wrong?
Best regards,
Christophe.

@TaihuLight

Is the cost function for YOLO-v2 correct in this project?

@ruiminshen
Owner

@christophesaintjean Thank you for your question. In the function model.yolo.Objectives.__init__, the tensors "mask_best" and "mask_normal" represent $1_{ij}^{obj}$ and $1_{ij}^{noobj}$, respectively.

The tensor "mask_best" requires two conditions: the cell contains an object (self.mask) AND the bbox has the best IoU value in its cell (best_box). "best_box_iou" holds the best IoU value of each cell, computed independently per cell, and "best_box" is set only where a bbox's IoU equals "best_box_iou". So "mask_best" is 0 for any bbox whose IoU is not the best in its cell.
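
The two conditions above can be sketched in NumPy for a single image. This is only an illustration, not the project's code: the function name `build_masks`, the `[cells, boxes]` layout, and the toy values are assumptions, and `mask_normal` is taken to be the complement of `mask_best`.

```python
import numpy as np

def build_masks(iou, cell_has_object):
    """Hypothetical helper. iou: [cells, boxes]; cell_has_object: [cells] of 0/1."""
    iou = np.asarray(iou, dtype=float)
    # Plays the role of self.mask, broadcast over the boxes of each cell.
    cell_mask = np.asarray(cell_has_object, dtype=float)[:, None]
    # Best IoU of each cell, computed whether or not the cell holds an object.
    best_box_iou = iou.max(axis=-1, keepdims=True)
    # 1 where a box reaches the best IoU of its own cell.
    best_box = (iou == best_box_iou).astype(float)
    # Best box AND the cell contains an object.
    mask_best = best_box * cell_mask
    # Assumed complement: every other box.
    mask_normal = 1.0 - mask_best
    return mask_best, mask_normal

# Three cells with two boxes each; the second cell contains no object.
iou = [[0.2, 0.7], [0.5, 0.1], [0.3, 0.4]]
cell_has_object = [1, 0, 1]
mask_best, mask_normal = build_masks(iou, cell_has_object)
print(mask_best)
print(mask_normal)
```

In this toy input, the second box of the first and third cells is selected, and no box is selected in the empty second cell, even though one of its boxes has the best IoU there.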

@ruiminshen
Owner

@TaihuLight Yes, I've checked it.

@christophesaintjean
Author

christophesaintjean commented Jun 3, 2017

@ruiminshen, I studied your code further and now agree with your comment.
I explain it below for the interested reader.

  • best_box_iou: (tensor) best IOU within each cell, whether or not the cell contains an object
  • best_box: (tensor) indicator for being the best box in its cell
  • mask_best: (tensor) indicator for being "the best box and in a cell that contains an object" $= 1_{ij}^{obj}$
  • mask_normal: (tensor) indicator for being "not the best box or in a cell that doesn't contain an object" $= 1_{ij}^{noobj}$
  • iou_dist = tf.square(model.iou - mask_best, name='iou_dist')
    • for boxes where $1_{ij}^{obj} = 1$, iou_dist $= ||model.iou - 1||^2$
    • for boxes where $1_{ij}^{obj} = 0$, iou_dist $= ||model.iou - 0||^2$; they are not responsible for the prediction
  • cnt = np.multiply.reduce(iou_dist.get_shape().as_list())
    Is this a new term in this version? It is the number of boxes in the whole batch, correct?
  • self['iou_best'] = tf.identity(tf.reduce_sum(mask_best * iou_dist) / cnt, name='iou_best')
    • this loss is now an expectation (an average) over all boxes
    • for boxes where $1_{ij}^{obj} = 1$, the loss is $E[||iou - 1||^2]$ -> push them to 1 (the ideal IOU)
    • for boxes where $1_{ij}^{obj} = 0$, the loss is 0, so no influence
  • self['iou_normal'] = tf.identity(tf.reduce_sum(mask_normal * iou_dist) / cnt, name='iou_normal')
    • for boxes where $1_{ij}^{obj} = 1$, the loss is 0, so no influence
    • for boxes where $1_{ij}^{obj} = 0$, the loss is $E[||iou||^2]$ -> push them to 0 (no confidence)
    • This is what I called 'tf.square(model.confidence)'
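
The walkthrough above can be reproduced in plain NumPy. This is a rough sketch that only mirrors the quoted lines: `confidence_losses` is a hypothetical name, `iou` stands in for `model.iou`, `mask_normal` is assumed to be `1 - mask_best`, and the numbers are made up.

```python
import numpy as np

def confidence_losses(iou, mask_best):
    """Hypothetical NumPy mirror of the iou_best / iou_normal terms."""
    iou = np.asarray(iou, dtype=float)
    mask_best = np.asarray(mask_best, dtype=float)
    # Assumed complement of mask_best.
    mask_normal = 1.0 - mask_best
    # Squared distance to 1 for responsible boxes, to 0 for all others.
    iou_dist = np.square(iou - mask_best)
    # Total number of entries, like multiplying the static shape together.
    cnt = iou_dist.size
    iou_best = np.sum(mask_best * iou_dist) / cnt
    iou_normal = np.sum(mask_normal * iou_dist) / cnt
    return iou_best, iou_normal

# Two cells with two boxes each; only box 1 of cell 0 is responsible.
iou = [[0.2, 0.7], [0.5, 0.1]]
mask_best = [[0, 1], [0, 0]]
iou_best, iou_normal = confidence_losses(iou, mask_best)
print(iou_best, iou_normal)
```

Here only the responsible box contributes to `iou_best`, through $(0.7 - 1)^2$, while the three other boxes contribute their squared values to `iou_normal`; both sums are divided by the same count of 4.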

I am implementing Yolo-v1 with Keras. My implementation of these two losses is as follows:

  • confidence_positive_loss = K.sum(one_ij_obj * K.square(iou - C_)) # or K.mean(...)
  • confidence_negative_loss = K.sum(one_ij_noobj * K.square(C_))
    where iou is computed as in your code and C_ is the confidence output by the model.
    The latter is a tensor of shape [batch_size, nb_cells, nb_boxes] in your code.

So maybe there is a very subtle difference between our interpretations:

  • I want the model confidence for the best box to be equal to the IOU (and the IOU is maximized through the coordinates loss)
  • you want the model confidence for the best box to be equal to 1

In the end, it is the same loss, since the IOU is 1 for the best box in the learning step.
I arrived at this because I encoded the image annotation into a vector of size nb_cells * (nb_classes + boxes_per_cell * (1 + 4)). However, only the first box in each cell is used for the desired output.
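
The relation between the two interpretations can be checked numerically. The following is a minimal NumPy sketch, not taken from either implementation: `positive_confidence_loss` is a hypothetical helper, and the confidence, mask, and IOU values are invented for illustration.

```python
import numpy as np

def positive_confidence_loss(confidence, one_ij_obj, target):
    """Hypothetical helper: squared error between the predicted confidence
    and a target, summed over responsible boxes only."""
    confidence = np.asarray(confidence, dtype=float)
    one_ij_obj = np.asarray(one_ij_obj, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.sum(one_ij_obj * np.square(target - confidence)))

# One cell with two boxes; box 1 is the responsible (best) box.
C_ = [[0.6, 0.7]]
one_ij_obj = [[0, 1]]
iou = [[0.3, 0.8]]

# Interpretation 1: the target is the IOU of the best box.
loss_iou_target = positive_confidence_loss(C_, one_ij_obj, iou)
# Interpretation 2: the target is 1.
loss_one_target = positive_confidence_loss(C_, one_ij_obj, np.ones((1, 2)))
# If the best box had an IOU of exactly 1, interpretation 1 gives the same value.
loss_perfect_iou = positive_confidence_loss(C_, one_ij_obj, [[0.3, 1.0]])
print(loss_iou_target, loss_one_target, loss_perfect_iou)
```

With these numbers the two targets give different losses while the best box's IOU is 0.8, and identical losses once that IOU is exactly 1.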

Best regards,
Christophe

ps: thank you very much for having shared your valuable code.
