Hi,
In the original Faster R-CNN, you used a softmax loss when training the RPN. (https://github.com/rbgirshick/py-faster-rcnn/blob/master/models/pascal_voc/VGG16/faster_rcnn_alt_opt/stage1_rpn_train.pt#L447)
In FPN, you use sigmoid cross-entropy for the RPN loss.
(https://github.com/facebookresearch/Detectron/blob/master/lib/modeling/FPN.py#L459)
In my experiments, RPN recall dropped by about 4 points when using sigmoid cross-entropy.
So why use sigmoid cross-entropy in FPN? Have you tried the softmax loss?
Thanks!
Using softmax for binary classification is an over-parameterization and shouldn't be necessary. When porting from py-faster-rcnn to Detectron, I tried both softmax and sigmoid for RPN and obtained similar RPN recall. I have not revisited using softmax for RPN with FPN.
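To illustrate the over-parameterization point: a 2-way softmax over logits `(a, b)` yields exactly the same foreground probability as a sigmoid applied to the single logit `b - a`, so the one-logit sigmoid head loses no expressive power. A quick numpy check (a sketch of the math, not Detectron code):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
# Two logits per anchor (background, foreground), as a softmax RPN head would emit.
two_class_logits = rng.normal(size=(5, 2))

# Foreground probability via the 2-way softmax...
p_softmax = softmax(two_class_logits, axis=1)[:, 1]
# ...equals a sigmoid applied to the logit difference.
p_sigmoid = sigmoid(two_class_logits[:, 1] - two_class_logits[:, 0])

assert np.allclose(p_softmax, p_sigmoid)
```

So any 2-class softmax head can be re-expressed as a sigmoid head; any recall gap between the two must come from optimization or hyperparameters rather than model capacity.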
@rbgirshick Is it possible to support a softmax loss in `rpn_heads.py`?
I tried but didn't succeed. The problem is that when the shape of `rpn_cls_logits` is (1, 30, H, W) instead of (1, 15, H, W), the `SpatialNarrowAs` op can't be applied to `rpn_labels_int32_wide`, since its depth no longer matches that of `rpn_cls_logits`.
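One way around the depth mismatch would be the py-faster-rcnn approach: reshape the 2A-channel score map so the two class logits occupy their own dimension (its `rpn_cls_score_reshape` layer) before applying the softmax. A sketch of just the shapes involved (numpy stand-ins, not working Detectron code; `H` and `W` are hypothetical):

```python
import numpy as np

A, H, W = 15, 4, 6  # anchors per location; hypothetical feature-map size

# Per-anchor labels, one channel per anchor, as prepared for the sigmoid head.
rpn_labels = np.zeros((1, A, H, W), dtype=np.int32)

# Sigmoid head: one logit per anchor -> depths match, elementwise ops apply.
sigmoid_logits = np.zeros((1, A, H, W), dtype=np.float32)
assert sigmoid_logits.shape[1] == rpn_labels.shape[1]

# Softmax head: two logits per anchor -> depth 2A no longer matches the labels,
# which is why a depth-matching crop op like SpatialNarrowAs fails here.
softmax_logits = np.zeros((1, 2 * A, H, W), dtype=np.float32)
assert softmax_logits.shape[1] != rpn_labels.shape[1]

# py-faster-rcnn sidesteps this by reshaping to (1, 2, A*H, W), separating the
# 2-way class dimension from the anchor/spatial dimensions before the softmax.
reshaped = softmax_logits.reshape(1, 2, A * H, W)
assert reshaped.shape == (1, 2, A * H, W)
```

With that reshape, the labels can be flattened to the same (A*H, W) layout and compared per position, so the crop/narrow step never has to match a 2A-deep blob against an A-deep one.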
I ran into the same issue, so it's great to see it raised here. I'll try both loss functions and check which one works better.