Detectron: Why use sigmoid cross entropy instead of softmax in RPN?

Created on 2 May 2018  ·  3 Comments  ·  Source: facebookresearch/Detectron

Hi,

In the original Faster R-CNN, you used a softmax loss when training the RPN. (https://github.com/rbgirshick/py-faster-rcnn/blob/master/models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_rpn_train.pt#L447)

In FPN, you use sigmoid cross entropy for the RPN loss.
(https://github.com/facebookresearch/Detectron/blob/master/lib/modeling/FPN.py#L459)

In my experiments, I found that RPN recall dropped by about 4 points when using sigmoid cross entropy.

So why use sigmoid cross entropy in FPN? Have you tried the softmax loss?
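For concreteness, here is a minimal NumPy sketch of the two objectness losses being compared (hypothetical logits and labels; illustrative only, not Detectron code):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
labels = rng.integers(0, 2, size=n)           # 1 = foreground anchor, 0 = background

# Softmax formulation (as in py-faster-rcnn): two logits per anchor.
logits2 = rng.normal(size=(n, 2))
log_probs = logits2 - np.log(np.exp(logits2).sum(axis=1, keepdims=True))
softmax_ce = -log_probs[np.arange(n), labels].mean()

# Sigmoid formulation (as in Detectron's FPN RPN): one logit per anchor.
logit1 = rng.normal(size=n)
p = 1.0 / (1.0 + np.exp(-logit1))
sigmoid_ce = -(labels * np.log(p) + (1 - labels) * np.log(1 - p)).mean()

print("softmax CE:", softmax_ce, "sigmoid CE:", sigmoid_ce)
```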

thanks!

All 3 comments

Using softmax for binary classification is an over-parameterization and shouldn't be necessary. When porting from py-faster-rcnn to Detectron, I tried both softmax and sigmoid for the RPN and obtained similar RPN recall. I have not revisited using softmax for the RPN with FPN.
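To make the over-parameterization point concrete: a two-way softmax over per-anchor logits (z0, z1) yields exactly the same foreground probability as a sigmoid applied to the difference z1 - z0, so the second logit adds parameters but no expressive power. A minimal NumPy check (illustrative values, not Detectron code):

```python
import numpy as np

z0, z1 = 0.3, 1.7                              # arbitrary background/foreground logits
softmax_fg = np.exp(z1) / (np.exp(z0) + np.exp(z1))
sigmoid_fg = 1.0 / (1.0 + np.exp(-(z1 - z0)))
assert np.isclose(softmax_fg, sigmoid_fg)      # identical foreground probability
```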

@rbgirshick Is it possible to support softmax in "rpn_heads.py"?

I tried but didn't succeed. The reason is that when the shape of rpn_cls_logits is (1, 30, H, W) instead of (1, 15, H, W), the "SpatialNarrowAs" operation can't be applied to "rpn_labels_int32_wide", since its depth no longer matches that of rpn_cls_logits.
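For illustration, a NumPy sketch of the mismatch, assuming the shapes from the comment above; the reshape at the end mirrors the workaround py-faster-rcnn uses for its two-way softmax (folding the anchor axis into height), not existing Detectron code:

```python
import numpy as np

A, H, W = 15, 10, 10                           # anchors per location (hypothetical H, W)
rpn_cls_logits = np.zeros((1, 2 * A, H, W))    # softmax head: 2 logits per anchor
rpn_labels = np.zeros((1, A, H, W))            # one label per anchor

# Depths differ (30 vs 15), so a channel-aligned op like SpatialNarrowAs
# cannot pair them directly. Folding the anchor axis into height, as
# py-faster-rcnn does before its 2-way softmax, realigns the shapes:
logits_r = rpn_cls_logits.reshape(1, 2, A * H, W)   # class axis of depth 2
labels_r = rpn_labels.reshape(1, 1, A * H, W)       # matching spatial extent
```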

I ran into the same issue, and it's great to see it raised here. I'll try both loss functions and see which one works better.
