Detectron: Custom dataset without segmentations

Created on 27 Jan 2018  ·  7 Comments  ·  Source: facebookresearch/Detectron

Is there a way to run a custom dataset that only has bounding boxes? I have the WIDER FACE dataset in COCO API json format, but it won't train without a segmentation field in the annotations. I had to use two workarounds:
1) Set annotation['segmentation'] = 0 and annotation['area'] = bbox_height * bbox_width (and of course leave TRAIN.GT_MIN_AREA = -1)
2) To allow TRAIN.USE_FLIPPED = True, some code in Detectron/lib/datasets/roidb.py had to be commented out in this function:

def extend_with_flipped_entries(roidb, dataset):
    """Flip each entry in the given roidb and return a new roidb that is the
    concatenation of the original roidb and the flipped entries.
    "Flipping" an entry means that that image and associated metadata (e.g.,
    ground truth boxes and object proposals) are horizontally flipped.
    """
    flipped_roidb = []
    for entry in roidb:
        width = entry['width']
        boxes = entry['boxes'].copy()
        oldx1 = boxes[:, 0].copy()
        oldx2 = boxes[:, 2].copy()
        boxes[:, 0] = width - oldx2 - 1
        boxes[:, 2] = width - oldx1 - 1
        assert (boxes[:, 2] >= boxes[:, 0]).all()
        flipped_entry = {}
        dont_copy = ('boxes', 'segms', 'gt_keypoints', 'flipped')
        for k, v in entry.items():
            if k not in dont_copy:
                flipped_entry[k] = v
        flipped_entry['boxes'] = boxes
        ### commenting out to allow flipping for datasets without segmentations annotated
        #flipped_entry['segms'] = segm_utils.flip_segms(
        #   entry['segms'], entry['height'], entry['width']
        #)
        if dataset.keypoints is not None:
            flipped_entry['gt_keypoints'] = keypoint_utils.flip_keypoints(
                dataset.keypoints, dataset.keypoint_flip_map,
                entry['gt_keypoints'], entry['width']
            )
        flipped_entry['flipped'] = True
        flipped_roidb.append(flipped_entry)
    roidb.extend(flipped_roidb)
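
A guard like the following might be a cleaner alternative to deleting those lines (an untested sketch on my part; it assumes bbox-only entries simply carry an empty 'segms' list):

# Only flip segmentations when the entry actually has them, so that
# bbox-only datasets skip this step without any code deletion.
if entry.get('segms'):
    flipped_entry['segms'] = segm_utils.flip_segms(
        entry['segms'], entry['height'], entry['width']
    )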

With those two adjustments the code runs beautifully. Is there a flag or config param that I am missing, or an easier way to run datasets with only bboxes?

Thank you for the code and the docker version!

All 7 comments

Hi @learnbott, I think it may be simpler and less error-prone to preprocess your json annotations to include the following entries:

'segmentation': []
'area': box_width * box_height
'iscrowd': 0

I believe this would allow you to perform training (w/ flipping) without any modifications to the code.
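
For example, a one-off script along these lines would patch every annotation (just a sketch; the file paths are placeholders and it assumes COCO-style [x, y, width, height] boxes):

import json

# Placeholder path to the COCO-format annotation file.
with open('wider_face_train.json') as f:
    coco = json.load(f)

for ann in coco['annotations']:
    # COCO bboxes are stored as [x, y, width, height].
    box_width, box_height = ann['bbox'][2], ann['bbox'][3]
    ann['segmentation'] = []
    ann['area'] = box_width * box_height
    ann['iscrowd'] = 0

with open('wider_face_train_patched.json', 'w') as f:
    json.dump(coco, f)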

Much appreciated!

@learnbott: what was the speed of the WIDER FACE detector you trained (on HD frames)?

@borisfom, training and inference speed is comparable to the numbers reported in MODEL_ZOO.md and depends on which model you use. Unfortunately, Detectron does not currently support CPU inference (see #36).

Thanks @learnbott! I don't care about CPU...
Would you share any of your trained models for faces?

Mine is pretty simple, but if you want a great face detector I would use Dockerface (https://github.com/natanielruiz/dockerface). It's a Faster R-CNN trained on multiple face datasets: https://arxiv.org/abs/1708.04370

@learnbott Were your errors resolved after using

'segmentation': []
'area': box_width * box_height
'iscrowd': 0

I am able to train the model after adding the above fields, but I am not able to detect objects in the validation images.

I am using the code below for detection:

import os
import random

import cv2
from detectron2.engine import DefaultPredictor
from detectron2.utils.visualizer import ColorMode, Visualizer
from google.colab.patches import cv2_imshow  # snippet is based on the Colab balloon tutorial

cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7  # set the testing threshold for this model
cfg.DATASETS.TEST = ("balloon_val",)
predictor = DefaultPredictor(cfg)

dataset_dicts = get_balloon_dicts("balloon/val")
for d in random.sample(dataset_dicts, 3):
    im = cv2.imread(d["file_name"])
    outputs = predictor(im)
    v = Visualizer(
        im[:, :, ::-1],
        metadata=balloon_metadata,
        scale=0.8,
        instance_mode=ColorMode.IMAGE_BW,  # remove the colors of unsegmented pixels
    )
    v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(v.get_image()[:, :, ::-1])
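
Is there a standard way to check whether the trained model predicts anything at all? Something along these lines is what I had in mind (just a sketch):

# Drop the score threshold and count raw detections; if this is still 0,
# the problem is likely in training rather than in the 0.7 threshold.
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.05
predictor = DefaultPredictor(cfg)
outputs = predictor(cv2.imread(dataset_dicts[0]["file_name"]))
print(len(outputs["instances"]))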
