Detectron: RLE or polygon format of "segmentation" for extending the COCO dataset

Created on 2018-02-02 · 38 comments · Source: facebookresearch/Detectron

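Hi Detectron,

Recently I tried to add my custom COCO-style data to run Detectron and ran into the following issues.
(1) The "segmentation" field in COCO data looks like this:

{"segmentation": [[499.71, 397.28, ...... 342.71, 172.31]], "area": 43466.12825, "iscrowd": 0, "image_id": 182155, "bbox": [338.89, 51.69, 205.82, 367.61], "category_id": 1, "id": 1248258},

{"segmentation": {"counts": [66916, 6, 587, ..... 1, 114303], "size": [594, 640]}, "area": 6197, "iscrowd": 1, "image_id": 284445, "bbox": [112, 322, 335, 94], "category_id": 1, "id": 9.001002844e+11},
The first "segmentation" format is a polygon; the second is RLE, which needs to be encoded/decoded.

Both of the formats above run on Detectron.

(2) I added a new category and generated a new RLE format for the "segmentation" field using the COCO API's mask encode()/decode().
The data I generated looks like this:

"segmentation": [{"counts": "MNG = 1fb02O1O1O001N2O001O1O0O2O1O1O001N2O001O1O0O2O1O001O1O1O010000O01000O010000O01000O01000O01000O01N2N2M2O2N2N1O2N2O001O10O B000O10O1O001 ^ OQ ^ O9Pb0EQ ^ O; Wb0OO01O1O1O001O1N2N`jT3？", "size": [600, 1000]}]

I found that the bold characters differ from the original COCO "segmentation" JSON format, although this data runs on Matterport's Mask R-CNN implementation.

I also tried to modify some of Detectron's code to fit my needs, but that is very hard for me because a lot of code would have to change.

Could you give me some suggestions for running my custom data?

Thanks.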

community help wanted

topcomma

👍1

All 38 comments

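I have a similar problem: some functions in lib/utils/segms.py expect segmentations in "poly" format and break when they are given in RLE.
This is inconvenient, but it seems to be in line with the spec for non-crowd regions (iscrowd=0):

The segmentation format depends on whether the instance represents a single object (iscrowd=0, in which case polygons are used) or a collection of objects (iscrowd=1, in which case RLE is used).

[1] Section "4.1. Object Instance Annotations" at http://cocodataset.org/#download

My workaround was to convert to the "poly" format, which is essentially a list of (x, y) vertices.

-Lesha.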
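
A minimal sketch of how the convention quoted above could be checked on your own annotations before training. The helper names `segmentation_kind` and `check_annotation` are hypothetical (they are not part of Detectron or pycocotools), and the three sample annotations are made up for illustration:

```python
def segmentation_kind(segm):
    """Classify a COCO "segmentation" value by its JSON shape."""
    if isinstance(segm, dict) and "counts" in segm and "size" in segm:
        # RLE: counts is a list of ints when uncompressed, a string when compressed
        if isinstance(segm["counts"], list):
            return "uncompressed_rle"
        return "compressed_rle"
    if isinstance(segm, list) and segm and all(isinstance(p, (list, tuple)) for p in segm):
        return "polygon"
    return "unknown"


def check_annotation(ann):
    """Return (kind, ok); ok is True when the kind matches the iscrowd convention."""
    kind = segmentation_kind(ann["segmentation"])
    if ann.get("iscrowd", 0) == 1:
        ok = kind in ("uncompressed_rle", "compressed_rle")
    else:
        ok = kind == "polygon"
    return kind, ok


# Hypothetical annotations, for illustration only.
poly_ann = {"iscrowd": 0, "segmentation": [[10.0, 10.0, 50.0, 10.0, 30.0, 40.0]]}
crowd_ann = {"iscrowd": 1, "segmentation": {"counts": [5, 2, 5], "size": [3, 4]}}
odd_ann = {"iscrowd": 0, "segmentation": [{"counts": "abc", "size": [3, 4]}]}

for ann in (poly_ann, crowd_ann, odd_ann):
    print(check_annotation(ann))
```

The third sample wraps an RLE dict in a list, so it matches neither the polygon nor the RLE shape and is reported as unknown.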

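amokeev on 2018-02-02

👍3 ❤1

@amokeev, how did you convert "RLE" to the "poly" format to work around your problem?

topcomma on 2018-02-02

@amokeev, are you sure the numbers in the segmentation are (x, y) vertices? "66916", for example, is a very large number! Also, even though I set "iscrowd" to "1" for the RLE format, it would not run on Detectron.

topcomma on 2018-02-02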

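I interpret poly as a list of polygons defined by their vertices, like [[x1, y1, x2, y2 ... xN, yN], ... [x1, y1, x2, y2 ... xN, yN]], where the coordinates are on the same scale as the image.
Masks encoded this way are displayed correctly by the COCO API [1].

But you may want to get an "official" answer.

[1] https://github.com/cocodataset/cocoapi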
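
As an illustration of that layout, here is a minimal sketch that flattens a list of (x, y) vertices into the nested-list form and derives a COCO-style [x, y, width, height] box from it. The function names and the triangle are hypothetical, chosen only for this example:

```python
def vertices_to_coco_polygon(vertices):
    """Flatten [(x1, y1), (x2, y2), ...] into [x1, y1, x2, y2, ...]."""
    if len(vertices) < 3:
        raise ValueError("a polygon needs at least 3 vertices")
    flat = []
    for x, y in vertices:
        flat.extend([float(x), float(y)])
    return flat


def polygon_bbox(flat):
    """Return the [x, y, width, height] box around one flat polygon."""
    xs, ys = flat[0::2], flat[1::2]
    return [min(xs), min(ys), max(xs) - min(xs), max(ys) - min(ys)]


# Hypothetical triangle, in pixel coordinates of its image.
triangle = vertices_to_coco_polygon([(10, 20), (50, 20), (30, 60)])
segmentation = [triangle]  # one annotation may hold several polygons
print(segmentation)
print(polygon_bbox(triangle))
```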

amokeev on 2018-02-02

👍2

@topcomma
Maybe you can try converting the mask to polygons.

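import cv2
import numpy as np

labels_info = []
for mask in mask_list:  # mask_list: one 2-D binary mask (NumPy array) per instance
    mask = mask.astype(np.uint8)
    # cv2.findContours returns (image, contours, hierarchy) in OpenCV 3.x and
    # (contours, hierarchy) in 2.x and 4.x; contours are always second to last.
    # Older versions modify the input, so pass a copy.
    contours = cv2.findContours(mask.copy(), cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)[-2]
    segmentation = []

    for contour in contours:
        contour = contour.flatten().tolist()  # [x1, y1, x2, y2, ...]
        # keep only contours with at least 3 points (6 values)
        if len(contour) > 4:
            segmentation.append(contour)
    if len(segmentation) == 0:
        continue
    # get area, bbox, category_id and so on
    # (the area and bbox below are one simple choice derived from the mask, added
    # as an assumption; index, category_id and label_id come from your dataset)
    ys, xs = np.nonzero(mask)
    x1, y1 = int(xs.min()), int(ys.min())
    bbox_w, bbox_h = int(xs.max()) - x1 + 1, int(ys.max()) - y1 + 1
    area = int(np.count_nonzero(mask))
    labels_info.append(
        {
            "segmentation": segmentation,  # poly
            "area": area,  # segmentation area
            "iscrowd": 0,
            "image_id": index,
            "bbox": [x1, y1, bbox_w, bbox_h],
            "category_id": category_id,
            "id": label_id
        },
    )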

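Sundrops on 2018-02-04

👍41 🎉2

@amokeev,
@Sundrops

Thank you for your suggestions.
I will give it a try.

topcomma on 2018-02-05

@Sundrops, with your conversion method I can get the "poly" list. Thank you very much! But I still don't understand why the coordinates in the COCO JSON file are such large or small numbers, for example "66916" or "1"?

topcomma on 2018-02-05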

@topcomma COCO annotations have two types of segmentation annotation:

polygon (object instance): [[499.71, 397.28, ...... 342.71, 172.31]] are vertices
uncompressed RLE (crowd): "segmentation": {"counts": [66916, 6, 587, ..... 1, 114303], "size": [594, 640]}, where 66916 is the number of pixels with label 0.

Both polygons and uncompressed RLE are converted to the compact RLE format by the MaskApi.
Compact RLE format:
"segmentation": [{"counts": "MNG = 1fb02O1O1O001N2O001O1O0O2O1O1O001N2O001O1O0O2O1O001O1O1O010000O01000O010000O01000O01000O01000O01N2N2M2O2N2N1O2N2O001O10O B000O10O1O001 ^ OQ ^ O9Pb0EQ ^ O; Wb0OO01O1O1O001O1N2N`jT3", "size": [600, 1000]}]
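
To make the "counts" layout concrete, here is a minimal pure-Python sketch that expands an uncompressed RLE back into a mask. It relies on the COCO convention that runs alternate between 0 and 1, always start with 0, and walk the image column by column, with "size" given as [height, width]. `decode_uncompressed_rle` is a hypothetical helper name and the 3x4 example is made-up data:

```python
def decode_uncompressed_rle(counts, size):
    """Expand uncompressed RLE counts into a 2-D list indexed as mask[row][col]."""
    height, width = size
    if sum(counts) != height * width:
        raise ValueError("counts must add up to height * width")
    flat = []
    value = 0  # runs alternate 0, 1, 0, 1, ... and always start with 0
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    # Pixels are stored column by column, so flat index = col * height + row.
    return [[flat[col * height + row] for col in range(width)]
            for row in range(height)]


# Toy 3x4 mask: 5 background pixels, 2 foreground pixels, 5 background pixels.
mask = decode_uncompressed_rle([5, 2, 5], [3, 4])
for row in mask:
    print(row)
```

Read this way, a large first count such as 66916 simply means that the first 66916 pixels, in column order, are background.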

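Sundrops on 2018-02-05

👍10

@Sundrops,
Thank you very much for your help.
My custom COCO-like data can now be trained on Detectron.

topcomma on 2018-02-05

@topcomma,
I have the same problem as you.
Following Sundrops' method, I can't find the file that converts masks to polys.
Could you tell me which file it is? Thank you very much!

lg12170226 on 2018-02-06

@lg12170226,
You can refer to the Python code of COCO-Stuff (https://github.com/nightrome/cocostuff) and implement it yourself.
There is no related annotation file in the codebase.

topcomma on 2018-02-06

@topcomma: I have one raw image and N label images. Each label is stored in its own file, so I have N images as labels. I want to train Mask R-CNN on my own dataset, so I first need to convert it to COCO format. Could you share code showing how to convert it to COCO style? Thanks.

John1231983 on 2018-02-06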

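Just wondering, is there a way to convert compressed RLE to polygons or to uncompressed RLE?

realwecan on 2018-02-24

@realwecan After decoding the RLE with pycocotools.mask.decode, you can check my implementation, which generates polygons with OpenCV:

coco-json-converter

hazirbas on 2018-02-28

👍5

@hazirbas: Thanks for your code. Why not use DAVIS 2017, which includes instance segmentation? Can we use your code to convert DAVIS 2017 to COCO format for use with Mask R-CNN implementations?

John1231983 on 2018-02-28

@John1231983, you would need to modify the script accordingly to read the split files as well as the db_info.yml file. I needed it for DAVIS 2016 for my own research.

hazirbas on 2018-02-28

Another solution for generating polygons, but using skimage instead of opencv.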

import json
import numpy as np
from pycocotools import mask
from skimage import measure

ground_truth_binary_mask = np.array([[  0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
                                     [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
                                     [  0,   0,   0,   0,   0,   1,   1,   1,   0,   0],
                                     [  0,   0,   0,   0,   0,   1,   1,   1,   0,   0],
                                     [  0,   0,   0,   0,   0,   1,   1,   1,   0,   0],
                                     [  0,   0,   0,   0,   0,   1,   1,   1,   0,   0],
                                     [  1,   0,   0,   0,   0,   0,   0,   0,   0,   0],
                                     [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0],
                                     [  0,   0,   0,   0,   0,   0,   0,   0,   0,   0]], dtype=np.uint8)

fortran_ground_truth_binary_mask = np.asfortranarray(ground_truth_binary_mask)
encoded_ground_truth = mask.encode(fortran_ground_truth_binary_mask)
ground_truth_area = mask.area(encoded_ground_truth)
ground_truth_bounding_box = mask.toBbox(encoded_ground_truth)
contours = measure.find_contours(ground_truth_binary_mask, 0.5)

annotation = {
        "segmentation": [],
        "area": ground_truth_area.tolist(),
        "iscrowd": 0,
        "image_id": 123,
        "bbox": ground_truth_bounding_box.tolist(),
        "category_id": 1,
        "id": 1
    }

for contour in contours:
    # find_contours returns (row, col) pairs; COCO polygons use (x, y), so swap the columns
    contour = np.flip(contour, axis=1)
    segmentation = contour.ravel().tolist()
    annotation["segmentation"].append(segmentation)

print(json.dumps(annotation, indent=4))

waspinator on 2018-03-15

👍6

How do you convert a binary mask, or an encoded RLE, into the uncompressed RLE used in the "counts" field when "iscrowd" is 1?
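
One way to do this without extra dependencies, as a minimal sketch: walk the binary mask column by column and record the length of each run, starting with a run of zeros (which has length 0 if the mask begins with a foreground pixel). `encode_uncompressed_rle` is a hypothetical helper name, and the input is assumed to be a 2-D list (or array) of 0/1 values:

```python
def encode_uncompressed_rle(mask):
    """Encode a 2-D 0/1 mask as {"counts": [...], "size": [height, width]}."""
    height, width = len(mask), len(mask[0])
    counts = []
    previous = 0  # the first run always counts zeros, even if its length is 0
    run = 0
    for col in range(width):  # column-major traversal
        for row in range(height):
            value = 1 if mask[row][col] else 0
            if value != previous:
                counts.append(run)
                run = 0
                previous = value
            run += 1
    counts.append(run)
    return {"counts": counts, "size": [height, width]}


# Made-up 3x4 mask with two foreground pixels.
toy_mask = [
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]
print(encode_uncompressed_rle(toy_mask))
```

For a compressed RLE, the mask would first have to be decoded (for example with pycocotools.mask.decode, as mentioned earlier in the thread) before it can be passed to a function like this.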

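waspinator on 2018-03-29

👍1

@waspinator The segmentation produced by your skimage code is different from the one produced by @Sundrops's cv2 code.
Are both results correct, and can Detectron use both of them? Please give me some advice, thank you. @waspinator @Sundrops

Result using your code with skimage: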

{"segmentation": [[0.0, 252.00196078431372, 1.0, 252.00196078431372, 2.0, 252.00196078431372, 3.0, 252.00196078431372, 4.0, 252.00196078431372, 5.0, 252.00196078431372, 6.0, 252.00196078431372, ..., 92.0, 252.00196078431372, 93.0, 252.00196078431372, 93.00196078431372, 252.0, 94.0, 251.00196078431372, 95.0, 251.00196078431372, 96.0 ...

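Result using @Sundrops's code with cv2:

[94, 252, 93, 253, 0, 253, 0, 286, 188, 286, 188, 269, 187, 268, 187, 252]

Kongsea on 2018-04-05

@Kongsea I haven't tested @Sundrops's cv2 implementation, but the basic idea should be the same. The two produce different results because a shape can be described by infinitely many sets of points, but otherwise both should work. I simply didn't have cv2 installed, so I wrote something that doesn't need it.

waspinator on 2018-04-05

@Kongsea @waspinator I have tested my code. It works.

Sundrops on 2018-04-05

Thanks @Sundrops @waspinator.
I will give it a try.

Kongsea on 2018-04-06

@waspinator Is there any way to convert segmentation polygons to RLE? My goal is iscrowd = 1.

ambigus9 on 2018-04-13

@Sundrops Why is this part commented out in your code?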

# if len(contour) > 4:
    #     segmentation.append(contour)
# if len(segmentation) == 0:
#     continue

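We do need to handle this case, right?

Yuliang-Zou on 2018-04-13

@Yuliang-Zou You should uncomment this part when using the contours with Detectron. When len(contour) == 4, Detectron treats it as a rectangle. I have updated my earlier code.

Sundrops on 2018-04-13

@Sundrops Thanks. But we still need to handle len(contour)==2, right?

Yuliang-Zou on 2018-04-13

@Yuliang-Zou Yes, but the check if len(contour) > 4: handles both len(contour)==2 and len(contour)==4.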
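
The check can be wrapped in a small helper, sketched here under the assumption that each contour has already been flattened to [x1, y1, x2, y2, ...]. `keep_valid_polygons` is a hypothetical name, and the extra even-length test is an added safeguard that is not in the original code:

```python
def keep_valid_polygons(flat_contours):
    """Keep only flat contours that describe a polygon with 3 or more points."""
    valid = []
    for contour in flat_contours:
        # 2 values = a single point; 4 values = two points, which would be
        # read as a rectangle; a polygon needs at least 6 values in x/y pairs.
        if len(contour) > 4 and len(contour) % 2 == 0:
            valid.append(contour)
    return valid


# Made-up contours: a point, a two-point segment, and a triangle.
candidates = [[1, 2], [1, 2, 3, 4], [0, 0, 4, 0, 4, 3]]
print(keep_valid_polygons(candidates))
```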

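Sundrops on 2018-04-13

@Sundrops I see, thank you!

Yuliang-Zou on 2018-04-13

I wrote a library and an article to help with creating COCO-style datasets.

https://patrickwasp.com/create-your-own-coco-style-dataset/

waspinator on 2018-04-13

@Sundrops and @topcomma I can't load my annotated data in pycocotools because my annotations include segmentations without masks. Any idea how to visualize annotations without masks in pycocotools?

soumenms2015 on 2018-04-23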

@Sundrops
ann: {
"segmentation": [
[312.29, 562.89, 402.25, 511.49, 400.96, 425.38, 398.39, 372.69, 388.11, 332.85, 318.71, 325.14, 295.58, 305.86, 269.88, 314.86, 258.31, 337.99, 217.19, ..., 358.55, 159.36, 377.83, 116.95, 421.53, 167.07, 499.92, 232.61, 560.32, 300.72, 571.89]
],
"area": 54652.9556,
"iscrowd": 0,
"image_id": 480023,
"bbox": [116.95, 305.86, 285.3, 266.03],
"category_id": 58,
"id": 86
}
How can I compute the area using mask.py in the COCO API? Thank you.
My code is below, but it fails:

segmentation = ann['segmentation']
bimask = np.array(segmentation, dtype=np.uint8, order='F')
print("bimask:", bimask)
rleObjs = mask.encode(bimask)
print("rleObjs:", rleObjs)
area = mask.area(rleObjs)
print("area:", area)

manketon on 2018-04-27

@manketon
Maybe you can try cv2, though I'm not sure it is correct. It is just an example of removing contours from an image using Python and OpenCV.

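import cv2
import numpy as np

def is_contour_bad(c):
    # approximate the contour
    peri = cv2.arcLength(c, True)
    approx = cv2.approxPolyDP(c, 0.02 * peri, True)
    # return True if it is not a rectangle
    return not len(approx) == 4

image = cv2.imread('xx.jpg')
# COCO polygons are flat [x1, y1, x2, y2, ...] lists; the OpenCV contour
# functions need (N, 1, 2) int32 arrays, so convert each polygon first.
contours = [np.round(poly).astype(np.int32).reshape(-1, 1, 2)
            for poly in ann['segmentation']]
mask = np.ones(image.shape[:2], dtype="uint8") * 255
# loop over the contours
for c in contours:
    # if the contour is not a rectangle, draw it on the mask
    if is_contour_bad(c):
        cv2.drawContours(mask, [c], -1, 0, -1)
# filled pixels were set to 0, so counting zeros gives the area in pixels
area = (mask == 0).sum()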
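
If OpenCV is not available, the geometric area of a polygon annotation can also be computed directly from its vertices with the shoelace formula. This is a minimal sketch and not the COCO API's method: pycocotools rasterizes polygons (frPyObjects, merge, area) and counts pixels, so its result can differ slightly. `polygon_area` and `annotation_area` are hypothetical helper names, and summing the parts assumes they do not overlap:

```python
def polygon_area(flat):
    """Shoelace area of one flat [x1, y1, x2, y2, ...] polygon."""
    xs, ys = flat[0::2], flat[1::2]
    n = len(xs)
    twice_area = 0.0
    for i in range(n):
        j = (i + 1) % n  # next vertex, wrapping around to close the polygon
        twice_area += xs[i] * ys[j] - xs[j] * ys[i]
    return abs(twice_area) / 2.0


def annotation_area(segmentation):
    """Sum the areas of all polygons in one annotation's "segmentation" list."""
    return sum(polygon_area(poly) for poly in segmentation)


# Made-up shapes: a 4x3 rectangle and a right triangle with legs 4 and 3.
rectangle = [0, 0, 4, 0, 4, 3, 0, 3]
right_triangle = [0, 0, 4, 0, 0, 3]
print(polygon_area(rectangle))
print(annotation_area([rectangle, right_triangle]))
```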

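Sundrops on 2018-04-27

@Sundrops @waspinator I have a question. If my original object mask has large holes, how should I convert it back correctly after converting it to polygons? The decode and merge functions in the COCO API treat the holes as part of the object, so after converting back the holes become part of the mask. What should I do in this case?

wangg12 on 2018-09-23

@wangg12 As far as I know, COCO has no native way to encode holes.

waspinator on 2018-09-24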

for contour in contours:
        contour = contour.flatten().tolist()
        segmentation.append(contour)
        if len(contour) > 4:
            segmentation.append(contour)

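Hi @Sundrops, sincere thanks for your code. I'm a beginner with Detectron. I'm confused about why the length of the contour (which is a list) should be larger than 4; in other words, what happens to the model when the length is smaller than that? As mentioned above, it would be treated as a rectangle. In my view, some objects may well have a rectangular shape, so I think that is fine. I also wonder whether the contour is appended twice in one iteration. I think the correct code should be: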

for contour in contours:
        contour = contour.flatten().tolist()
        if len(contour) > 4:
            segmentation.append(contour)

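I would appreciate any advice you can give. Thank you very much!

BobZhangHT on 2019-01-07

👍1

@BobZhangHT Yes, the contour should be appended only once.
As for your first question, if len(ann['segmentation'][0]) == 4, cocoapi will assume that all of them are rectangles.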

# cocoapi/PythonAPI/pycocotools/coco.py
def annToRLE(self, ann):
    t = self.imgs[ann['image_id']]
    h, w = t['height'], t['width']
    segm = ann['segmentation']
    if type(segm) == list:
        rles = maskUtils.frPyObjects(segm, h, w) 
        rle = maskUtils.merge(rles)
   ......

# cocoapi/PythonAPI/pycocotools/_mask.pyx
def frPyObjects(pyobj, h, w):
    # encode rle from a list of python objects
    if type(pyobj) == np.ndarray:
        objs = frBbox(pyobj, h, w)
    elif type(pyobj) == list and len(pyobj[0]) == 4:
        objs = frBbox(pyobj, h, w)
    elif type(pyobj) == list and len(pyobj[0]) > 4:
        objs = frPoly(pyobj, h, w)
   ......

Sundrops on 2019-01-07

👍2 😄1

@Sundrops Thanks for your reply!

BobZhangHT on 2019-01-07

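How can I convert RLE to polygons when "iscrowd" is "1"? I ask because Matterport's Mask R-CNN only works with polygons. @amokeev. I want to use the Matterport implementation of Mask R-CNN with a COCO dataset created using pycococreator.

enemni on 2019-04-25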
