Detectron: Model trains successfully on a custom dataset, but at test time it outputs COCO class names instead of the custom class names

Created on 4 Jun 2018 · 4 Comments · Source: facebookresearch/Detectron

Hello,

I have successfully trained a Faster R-CNN model on my custom dataset (my config file is based on e2e_faster_rcnn_X-101-32x8d-FPN_1x.yaml, with only the DATASETS lines modified). However, when using tools/infer_simple.py to test the model:

python2 tools/infer_simple.py \
 --cfg configs/mydataset/mydataset_faster_rcnn_X-101-32x8d-FPN_1x.yaml \
 --output-dir /home/ubuntu/tmp/mydataset_faster_rcnn_X-101-32x8d-FPN_1x/ \
 --image-ext jpg \
 --wts /home/ubuntu/detectron/output/train/mydataset_train:mydataset_val/generalized_rcnn/model_final.pkl \
/home/ubuntu//test-data/

it runs without any errors, but the results are very poor and, more strangely, the names of the predicted objects are from COCO (e.g. car, airplane, etc.) and not from my custom dataset.

Could you please help?
Thank you very much in advance!

netw0rkf10w

Most helpful comment

The error looks unrelated to the number of classes; it is thrown by a Conv op with grouped convolution where the number of input channels, the number of groups, and the number of filter channels do not match. This indicates that you have accidentally used mismatched config and model files.

rbgirshick on 4 Jun 2018

👍2
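
The grouped-convolution check mentioned in this comment can be illustrated with a small sketch. It is not Caffe2 code; the helper names are hypothetical. It assumes the enforce message quoted in the thread, filter.dim32(1) == C / group_. 64 vs 2, should be read as "the filter has 64 input channels, but the op expected C / group = 2", which together with group = 32 from the operator arguments would imply C = 64. The second call uses made-up numbers that merely satisfy the same rule.

```python
def expected_filter_in_channels(in_channels, groups):
    """Input channels each filter must have in a grouped convolution."""
    if groups <= 0 or in_channels % groups != 0:
        raise ValueError('in_channels must be divisible by groups')
    return in_channels // groups


def check_grouped_conv(in_channels, groups, filter_in_channels):
    """Return (ok, expected), mirroring filter.dim32(1) == C / group_."""
    expected = expected_filter_in_channels(in_channels, groups)
    return filter_in_channels == expected, expected


# Numbers as read from the log (assumption: C = 64, group = 32, filter dim 1 = 64).
ok, expected = check_grouped_conv(in_channels=64, groups=32, filter_in_channels=64)
print('ok={} expected={}'.format(ok, expected))  # ok=False expected=2

# Hypothetical consistent case: 256 input channels split into 32 groups of 8.
ok, expected = check_grouped_conv(in_channels=256, groups=32, filter_in_channels=8)
print('ok={} expected={}'.format(ok, expected))  # ok=True expected=8
```

Under that reading, the loaded weights look like those of an ungrouped convolution while the config requests 32 groups, which would be consistent with a config file and a weights file that do not belong together.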

All 4 comments

Hi @netw0rkf10w, the infer_simple tool uses the COCO classes by default (see this). You can replace this line with your dataset (a JsonDataset object or a dummy one) to use the classes from your dataset instead.

ir413 on 4 Jun 2018

👍1
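
Why the displayed names depend only on the dataset object, and not on the trained model, can be sketched as follows. This is a minimal, self-contained illustration and not Detectron's code: AttrDict here is a simplified stand-in, and make_dummy_dataset and class_label are hypothetical helpers. The assumption is that the visualization step looks up each predicted class index in the dataset's classes mapping.

```python
class AttrDict(dict):
    """Simplified stand-in for an attribute-access dict (assumption)."""

    def __getattr__(self, name):
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name)

    def __setattr__(self, name, value):
        self[name] = value


def make_dummy_dataset(class_names):
    """Build a dummy dataset that carries only an index -> name mapping."""
    ds = AttrDict()
    ds.classes = {i: name for i, name in enumerate(class_names)}
    return ds


def class_label(dataset, class_index, score):
    """Hypothetical label formatter: name looked up by predicted index."""
    return '{} {:.2f}'.format(dataset.classes[class_index], score)


# The same predicted index yields different names depending on the mapping.
coco_like = make_dummy_dataset(['__background__', 'person', 'bicycle', 'car'])
custom = make_dummy_dataset(['__background__', 'class1', 'class2', 'class3'])

print(class_label(coco_like, 3, 0.91))  # car 0.91
print(class_label(custom, 3, 0.91))     # class3 0.91
```

In this sketch the detector's output (index 3, score 0.91) is identical in both calls; only the mapping passed to the label formatter changes the name that is shown.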

@ir413 Thanks a lot!
Unfortunately I still cannot make it work :(

As you suggested, I replaced the line dummy_coco_dataset = dummy_datasets.get_coco_dataset() with dummy_coco_dataset = dummy_datasets.get_custom_dataset(), where my get_custom_dataset() is defined as follows:

def get_custom_dataset():
    """A dummy custom dataset that includes only the 'classes' field."""
    ds = AttrDict()
    # Index 0 is the background class, followed by the 6 custom classes.
    classes = [
        '__background__', 'class1', 'class2', 'class3', 'class4', 'class5', 'class6'
    ]
    # Map each class index to its name.
    ds.classes = {i: name for i, name in enumerate(classes)}
    return ds

Then after executing the command (as shown in the original question), I got the following error:

INFO infer_qopius.py: 132: Processing /home/ubuntu//test-data/IMG_0560.jpg -> /home/ubuntu/tmp/mydataset_faster_rcnn_X-101-32x8d-FPN_1x/IMG_0560.jpg.pdf
E0604 13:19:32.223304 71160 net_dag.cc:195] Exception from operator chain starting at '' (type 'Conv'): caffe2::EnforceNotMet: [enforce fail at conv_op_cudnn.cc:546] filter.dim32(1) == C / group_. 64 vs 2 Error from operator:
input: "gpu_0/res2_0_branch2a" input: "gpu_0/res2_0_branch2b_w" output: "gpu_0/res2_0_branch2b" name: "" type: "Conv" arg { name: "kernel" i: 3 } arg { name: "group" i: 32 } arg { name: "exhaustive_search" i: 0 } arg { name: "stride" i: 1 } arg { name: "pad" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "dilation" i: 1 } device_option { device_type: 1 cuda_gpu_id: 0 } engine: "CUDNN"
WARNING workspace.py: 185: Original python traceback for operator 7 in network generalized_rcnn in exception above (most recent call last):
WARNING workspace.py: 190: File "tools/infer_qopius.py", line 168, in <module>
WARNING workspace.py: 190: File "tools/infer_qopius.py", line 118, in main
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/core/test_engine.py", line 328, in initialize_model_from_cfg
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/model_builder.py", line 124, in create
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/model_builder.py", line 89, in generalized_rcnn
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/model_builder.py", line 229, in build_generic_detection_model
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/optimizer.py", line 54, in build_data_parallel_model
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/model_builder.py", line 169, in _single_gpu_build_func
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/FPN.py", line 62, in add_fpn_ResNet101_conv5_body
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/FPN.py", line 103, in add_fpn_onto_conv_body
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/ResNet.py", line 46, in add_ResNet101_conv5_body
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/ResNet.py", line 101, in add_ResNet_convX_body
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/ResNet.py", line 83, in add_stage
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/ResNet.py", line 181, in add_residual_block
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/ResNet.py", line 255, in bottleneck_transformation
WARNING workspace.py: 190: File "/home/ubuntu/detectron/lib/modeling/detector.py", line 406, in ConvAffine
WARNING workspace.py: 190: File "/usr/local/lib/python2.7/dist-packages/caffe2/python/cnn.py", line 97, in Conv
WARNING workspace.py: 190: File "/usr/local/lib/python2.7/dist-packages/caffe2/python/brew.py", line 107, in scope_wrapper
WARNING workspace.py: 190: File "/usr/local/lib/python2.7/dist-packages/caffe2/python/helpers/conv.py", line 186, in conv
WARNING workspace.py: 190: File "/usr/local/lib/python2.7/dist-packages/caffe2/python/helpers/conv.py", line 139, in _ConvBase
Traceback (most recent call last):
  File "tools/infer_qopius.py", line 168, in <module>
    main(args)
  File "tools/infer_qopius.py", line 138, in main
    model, im, None, timers=timers
  File "/home/ubuntu/detectron/lib/core/test.py", line 66, in im_detect_all
    model, im, cfg.TEST.SCALE, cfg.TEST.MAX_SIZE, boxes=box_proposals
  File "/home/ubuntu/detectron/lib/core/test.py", line 158, in im_detect_bbox
    workspace.RunNet(model.net.Proto().name)
  File "/usr/local/lib/python2.7/dist-packages/caffe2/python/workspace.py", line 217, in RunNet
    StringifyNetName(name), num_iter, allow_fail,
  File "/usr/local/lib/python2.7/dist-packages/caffe2/python/workspace.py", line 178, in CallWithExceptionIntercept
    return func(*args, **kwargs)
RuntimeError: [enforce fail at conv_op_cudnn.cc:546] filter.dim32(1) == C / group_. 64 vs 2 Error from operator:
input: "gpu_0/res2_0_branch2a" input: "gpu_0/res2_0_branch2b_w" output: "gpu_0/res2_0_branch2b" name: "" type: "Conv" arg { name: "kernel" i: 3 } arg { name: "group" i: 32 } arg { name: "exhaustive_search" i: 0 } arg { name: "stride" i: 1 } arg { name: "pad" i: 1 } arg { name: "order" s: "NCHW" } arg { name: "dilation" i: 1 } device_option { device_type: 1 cuda_gpu_id: 0 } engine: "CUDNN"

As you can see, my custom dataset has only 6 classes. Maybe this is the reason?

netw0rkf10w on 4 Jun 2018

rbgirshick on 4 Jun 2018

👍2

@rbgirshick Thank you very much for your answer! To make sure everything is done properly, I have started the training again, and now it works. Sorry for the stupid question.

netw0rkf10w on 5 Jun 2018
