Convert the COCO RLE format to YOLOv5/v8 segmentation format.

Question

ryouchinsa opened this issue a year ago

Hi, thanks for your useful script.

We added rle2polygon() to general_json2yolo.py so that you can convert the COCO RLE format to YOLOv5/v8 segmentation format. Please let us know your opinion.
https://github.com/ryouchinsa/Rectlabel-support/blob/master/general_json2yolo.py

if use_segments:
    if len(ann['segmentation']) == 0:
        segments.append([])
        continue
    if isinstance(ann['segmentation'], dict):
        ann['segmentation'] = rle2polygon(ann['segmentation'])
    if len(ann['segmentation']) > 1:
        s = merge_multi_segment(ann['segmentation'])
        s = (np.concatenate(s, axis=0) / np.array([w, h])).reshape(-1).tolist()

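import cv2
import numpy as np
from pycocotools import mask

def is_clockwise(contour):
    # Signed-sum orientation test; points are shaped (N, 1, 2) as returned by cv2.findContours().
    value = 0
    num = len(contour)
    for i in range(num):
        p1 = contour[i]
        p2 = contour[(i + 1) % num]  # wrap around to close the contour
        value += (p2[0][0] - p1[0][0]) * (p2[0][1] + p1[0][1])
    return value < 0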

def get_merge_point_idx(contour1, contour2):
    idx1 = 0
    idx2 = 0
    distance_min = -1
    for i, p1 in enumerate(contour1):
        for j, p2 in enumerate(contour2):
            distance = pow(p2[0][0] - p1[0][0], 2) + pow(p2[0][1] - p1[0][1], 2)
            if distance_min < 0:
                distance_min = distance
                idx1 = i
                idx2 = j
            elif distance < distance_min:
                distance_min = distance
                idx1 = i
                idx2 = j
    return idx1, idx2

def merge_contours(contour1, contour2, idx1, idx2):
    contour = []
    for i in list(range(0, idx1 + 1)):
        contour.append(contour1[i])
    for i in list(range(idx2, len(contour2))):
        contour.append(contour2[i])
    for i in list(range(0, idx2 + 1)):
        contour.append(contour2[i])
    for i in list(range(idx1, len(contour1))):
        contour.append(contour1[i])
    contour = np.array(contour)
    return contour

def merge_with_parent(contour_parent, contour):
    if not is_clockwise(contour_parent):
        contour_parent = contour_parent[::-1]
    if is_clockwise(contour):
        contour = contour[::-1]
    idx1, idx2 = get_merge_point_idx(contour_parent, contour)
    return merge_contours(contour_parent, contour, idx1, idx2)

def mask2polygon(image):
    # RETR_CCOMP yields a two-level hierarchy: outer boundaries and the holes inside them.
    contours, hierarchies = cv2.findContours(image, cv2.RETR_CCOMP, cv2.CHAIN_APPROX_TC89_KCOS)
    contours_approx = []
    for contour in contours:
        epsilon = 0.001 * cv2.arcLength(contour, True)
        contour_approx = cv2.approxPolyDP(contour, epsilon, True)
        contours_approx.append(contour_approx)

    contours_parent = []
    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx < 0 and len(contour) >= 3:
            contours_parent.append(contour)
        else:
            contours_parent.append([])

    for i, contour in enumerate(contours_approx):
        parent_idx = hierarchies[0][i][3]
        if parent_idx >= 0 and len(contour) >= 3:
            contour_parent = contours_parent[parent_idx]
            if len(contour_parent) == 0:
                continue
            contours_parent[parent_idx] = merge_with_parent(contour_parent, contour)

    contours_parent_tmp = []
    for contour in contours_parent:
        if len(contour) == 0:
            continue
        contours_parent_tmp.append(contour)

    polygons = []
    for contour in contours_parent_tmp:
        polygon = contour.flatten().tolist()
        polygons.append(polygon)
    return polygons

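def rle2polygon(segmentation):
    # Uncompressed RLE stores "counts" as a list; convert it to compressed RLE first.
    if isinstance(segmentation["counts"], list):
        segmentation = mask.frPyObjects(segmentation, *segmentation["size"])
    m = mask.decode(segmentation)
    m[m > 0] = 255  # binary 0/255 image for cv2.findContours()
    polygons = mask2polygon(m)
    return polygons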
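
For readers unfamiliar with the uncompressed RLE form that rle2polygon() handles when segmentation["counts"] is a list, here is a minimal pure-Python sketch of what decoding amounts to. It assumes the COCO convention that the counts alternate between runs of 0s and 1s, starting with 0s, laid out in column-major order. decode_uncompressed_rle is a hypothetical helper for illustration only; it is not part of pycocotools or of the script.

```python
def decode_uncompressed_rle(counts, size):
    """Decode COCO-style uncompressed RLE into a list of rows of 0/1 values.

    counts: run lengths alternating 0-run, 1-run, 0-run, ...
    size:   [height, width], as in the COCO "size" field.
    """
    height, width = size
    flat = []
    value = 0  # runs start with background (0)
    for run in counts:
        flat.extend([value] * run)
        value = 1 - value
    if len(flat) != height * width:
        raise ValueError("run lengths do not cover the mask")
    # Pixels are stored column by column, so index = x * height + y.
    return [[flat[x * height + y] for x in range(width)] for y in range(height)]


# A 3x3 mask whose middle column is foreground.
mask_rows = decode_uncompressed_rle([3, 3, 3], [3, 3])
for row in mask_rows:
    print(row)
```

In the script itself this step is delegated to mask.frPyObjects() and mask.decode(); the sketch is only meant to show what the counts encode.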

Collin McCarthy · Answer 1 · Sun Sep 10 2023 03:31:16 GMT+0800 (China Standard Time)

Hi @ryouchinsa, I noticed you are approximating the contour in a different way than this answer here - cocodataset/cocoapi#476 (comment)

Why are you using this:

contours, _ = cv2.findContours(m, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_TC89_KCOS)
polygons = []
for contour in contours:
    epsilon = 0.001 * cv2.arcLength(contour, True)
    contour_approx = cv2.approxPolyDP(contour, epsilon, True)
    polygon = contour_approx.flatten().tolist()
    polygons.append(polygon)

instead of this, which produces significantly more polygon vertices/coordinates:

contours, _ = cv2.findContours(mask, cv2.RETR_TREE, cv2.CHAIN_APPROX_SIMPLE)
polygons = []
for contour in contours:
    polygons.append(contour.astype(float).flatten().tolist())

I'm not saying your approach is wrong. I'm just curious if you chose a faster (but less accurate) method for your application, rather than a slower but more accurate method, or whether I'm misunderstanding something. Thanks.

Ryo Kawamura · Answer 2 · Fri Sep 15 2023 02:39:41 GMT+0800 (China Standard Time)

Thanks for the detailed feedback.

If we combine findContours() and approxPolyDP(), we can reduce the number of polygon points, for example from 500 points to 50.
For editing a polygon by hand, 500 points are too many; we think tens of points are appropriate.
If you want to preserve the mask shape as closely as possible for training, you do not have to use approxPolyDP().
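
To illustrate why the point count drops, here is a rough pure-Python sketch of Douglas-Peucker simplification, the algorithm that OpenCV documents for approxPolyDP(). It works on an open polyline and is not OpenCV's implementation; rdp and point_segment_distance are hypothetical names, and the sample points are made up.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def rdp(points, epsilon):
    """Simplify an open polyline, dropping points within epsilon of the chord."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= epsilon:
        return [first, last]
    # Keep the farthest point and simplify both halves recursively.
    left = rdp(points[: index + 1], epsilon)
    right = rdp(points[index:], epsilon)
    return left[:-1] + right


# A nearly flat run, a rise to (5, 5), then another nearly flat run.
line = [(0, 0), (1, 0.05), (2, -0.05), (3, 0.02), (5, 5), (6, 5.01), (8, 5)]
simplified = rdp(line, epsilon=0.1)
print(len(line), "->", len(simplified), simplified)
```

A larger epsilon removes more points. The script ties epsilon to the contour perimeter (0.001 * cv2.arcLength(contour, True)), so the tolerance scales with the size of the object.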

Glenn Jocher · Answer 3 · Fri Nov 10 2023 02:03:17 GMT+0800 (China Standard Time)

Thanks for the detailed feedback.

We chose to use a combination of findContours() and approxPolyDP() to reduce the number of polygon points, optimizing for a decrease from 500 points to around 50. This approach balances accuracy with efficiency, ensuring a manageable number of points while retaining the essential shape of the mask.

If preserving the mask shape as closely as possible during training is a priority, it's not necessary to use approxPolyDP().

Ryo Kawamura · Answer 4 · Wed Nov 22 2023 21:09:38 GMT+0800 (China Standard Time)

Using the script general_json2yolo.py, you can convert the RLE mask with holes to the YOLO segmentation format.

The RLE mask is converted to a parent polygon and a child polygon using cv2.findContours().
The parent polygon points are sorted in clockwise order.
The child polygon points are sorted in counterclockwise order.
The nearest pair of points, one in the parent polygon and one in the child polygon, is detected.
Those two points are connected with two narrow lines.
This way, the polygon with a hole is saved as a single polygon in the YOLO segmentation format.
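
The steps above can be sketched on plain (x, y) tuples as follows. This is a simplified illustration under the same sign convention as the script's is_clockwise(), not the script's own code; nearest_pair and bridge are hypothetical names, and the square with a square hole is made-up data.

```python
def is_clockwise(points):
    """Same signed-sum test as the script, on plain (x, y) tuples."""
    total = 0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += (x2 - x1) * (y2 + y1)
    return total < 0


def nearest_pair(outer, inner):
    """Indices of the closest (outer, inner) point pair, by squared distance."""
    best = None
    for i, (x1, y1) in enumerate(outer):
        for j, (x2, y2) in enumerate(inner):
            d = (x2 - x1) ** 2 + (y2 - y1) ** 2
            if best is None or d < best[0]:
                best = (d, i, j)
    return best[1], best[2]


def bridge(outer, inner):
    """Merge a hole into its parent so one closed polygon traces both."""
    if not is_clockwise(outer):
        outer = outer[::-1]
    if is_clockwise(inner):
        inner = inner[::-1]
    i, j = nearest_pair(outer, inner)
    # Walk the parent up to the bridge, go around the hole, come back, finish the parent.
    return outer[: i + 1] + inner[j:] + inner[: j + 1] + outer[i:]


outer = [(0, 0), (10, 0), (10, 10), (0, 10)]  # 10x10 square
inner = [(4, 4), (6, 4), (6, 6), (4, 6)]      # 2x2 hole
merged = bridge(outer, inner)
print(len(merged), merged)
```

The two bridge endpoints each appear twice in the result, which is what produces the two coincident connecting lines between parent and hole.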

The RLE mask.

The converted YOLO segmentation format.

To run the script, put the COCO JSON file coco_train.json into datasets/coco/annotations.
Run the script: python general_json2yolo.py
The converted YOLO txt files are saved in new_dir/labels/coco_train.

Edit use_segments and use_keypoints in the script to choose the output format.

if __name__ == '__main__':
    source = 'COCO'

    if source == 'COCO':
        convert_coco_json('../datasets/coco/annotations',  # directory with *.json
                          use_segments=True,
                          use_keypoints=False,
                          cls91to80=False)

To convert the COCO bbox format to YOLO bbox format:

use_segments=False,
use_keypoints=False,
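
As a rough illustration of what the bbox conversion involves, the sketch below maps a COCO box (top-left corner plus width and height, in pixels) to the YOLO form (center plus width and height, normalized by the image size). coco_bbox_to_yolo is a hypothetical helper and the numbers are made up; the script's own implementation may differ in details.

```python
def coco_bbox_to_yolo(bbox, img_w, img_h):
    """COCO [x_min, y_min, width, height] in pixels -> YOLO [cx, cy, w, h] in 0..1."""
    x, y, w, h = bbox
    return [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]


box = coco_bbox_to_yolo([100, 50, 200, 100], img_w=640, img_h=480)
# One label line: class index followed by the four normalized values.
print("0 " + " ".join(f"{v:.6f}" for v in box))
```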

To convert the COCO segmentation format to YOLO segmentation format:

use_segments=True,
use_keypoints=False,
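
For segmentation, each label line is the class index followed by the polygon's x, y pairs divided by the image width and height. A minimal sketch, assuming a flat COCO-style polygon list; polygon_to_yolo_line is a hypothetical helper and the values are made up.

```python
def polygon_to_yolo_line(class_id, polygon, img_w, img_h):
    """Flat pixel polygon [x1, y1, x2, y2, ...] -> 'class x1 y1 x2 y2 ...' normalized."""
    if len(polygon) % 2 != 0 or len(polygon) < 6:
        raise ValueError("expected at least three (x, y) pairs")
    coords = []
    for i in range(0, len(polygon), 2):
        coords.append(polygon[i] / img_w)
        coords.append(polygon[i + 1] / img_h)
    return f"{class_id} " + " ".join(f"{c:.6f}" for c in coords)


line = polygon_to_yolo_line(0, [64, 48, 320, 48, 320, 240], img_w=640, img_h=480)
print(line)
```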

To convert the COCO keypoints format to YOLO keypoints format:

use_segments=False,
use_keypoints=True,
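
For keypoints, a plausible reading of the conversion is that the normalized box is followed by each keypoint's normalized x and y plus its visibility flag. The sketch below assumes COCO's flat [x, y, v, ...] keypoint list; coco_keypoints_to_yolo is a hypothetical helper, and the exact layout the script writes should be checked against the script itself.

```python
def coco_keypoints_to_yolo(bbox, keypoints, img_w, img_h):
    """Return YOLO-style values: normalized box followed by (x, y, v) per keypoint."""
    x, y, w, h = bbox
    values = [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]
    for i in range(0, len(keypoints), 3):
        kx, ky, v = keypoints[i : i + 3]
        values += [kx / img_w, ky / img_h, v]  # visibility flag is passed through
    return values


vals = coco_keypoints_to_yolo([0, 0, 320, 240], [160, 120, 2, 0, 0, 0], 640, 480)
print(vals)
```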

This script originates from the Ultralytics JSON2YOLO repository.
We hope this script helps your business.

Glenn Jocher · Answer 5 · Thu Nov 23 2023 04:35:16 GMT+0800 (China Standard Time)

@ryouchinsa thanks for sharing the updated script and examples of the RLE mask and the converted YOLO segmentation format. Your efforts to enhance the functionality of the script are much appreciated. It's great to see the improvements you've made and how they translate into the YOLO segmentation format. Good job!

Ryo Kawamura · Answer 6 · Thu Nov 23 2023 17:56:25 GMT+0800 (China Standard Time)

We updated the general_json2yolo.py script so that RLE masks with holes are converted to the YOLO segmentation format.

We believe that this script would be beneficial for your company and users.
Could you review the script before making a PR?

Glenn Jocher · Answer 7 · Thu Nov 23 2023 20:34:45 GMT+0800 (China Standard Time)

@ryouchinsa thank you for the update and for considering our input. We appreciate your effort in enhancing the script to accommodate RLE masks with holes. We will review the script and provide feedback as soon as possible. Keep up the great work!

Ryo Kawamura · Answer 8 · Fri Nov 24 2023 12:25:15 GMT+0800 (China Standard Time)

Thanks for reviewing our script.
Using a small dataset, we checked whether YOLO can train on polygon masks with holes.

Donut images and YOLO segmentation text files to confirm that YOLO can train polygon masks with holes.

Glenn Jocher · Answer 9 · Fri Nov 24 2023 14:37:59 GMT+0800 (China Standard Time)

@ryouchinsa thank you for sharing the donut images and YOLO segmentation text files. We'll take a look and confirm that the YOLO model can effectively train polygon masks with holes using this dataset. Your contribution is valuable, and we appreciate your efforts in enhancing the YOLO functionality.

Ryo Kawamura · Answer 10 · Wed Nov 29 2023 16:14:45 GMT+0800 (China Standard Time)

Hi @glenn-jocher, I submitted the PR about this update.
#61

Please let us know if there are any problems in the PR.

Glenn Jocher · Answer 11 · Wed Nov 29 2023 21:29:44 GMT+0800 (China Standard Time)

@ryouchinsa thanks for submitting the PR. I will review it and get back to you if there are any issues. Appreciate your contribution!

G'ayrat Tangriberganov · Answer 12 · Fri Feb 23 2024 17:10:20 GMT+0800 (China Standard Time)

Hi @ryouchinsa, my question is similar to the others, but I have a labeled image like the one below:

{
"version": "5.4.1",
"flags": {},
"shapes": [
{
"label": "food",
"points": [
[
239.0,
196.0
],
[
285.0,
297.0
]
],
"group_id": null,
"description": "",
"shape_type": "mask",
"flags": {},
"mask": "iVBORw0KGgoAAAANSUhEUgAAAC8AAABmAQAAAABzC/WlAAAAxUlEQVR4nI2RwRGCMBBFX3YY5SYdSCfSlicpLSVYAiVw4MABEg8fhVUcPb15f7NJNoFjBIwyAnACMGiF5gAG9TUCp5w7DLSkgmq1BfXLyo8aUOyFC0wIa9gA7feGNzRCLVSuVr4s7rV3qxUhJGA2qRVCuWmwt/bq9wXrP2fAmo0lV5ucjULvwk6I2zDrBZM7aBB6YfnUKLQYMGvlJIzCHUKGAAZZmyU3w+RsdNY76z5nb58WAW55U+OSATgnhUI/ANhwB3gAGE8tVLpKku0AAAAASUVORK5CYII="
},
{
"label": "food",
"points": [
[
159.0,
137.0
],
[
177.0,
166.0
]
],
"group_id": null,
"description": "",
"shape_type": "mask",
"flags": {},
"mask": "iVBORw0KGgoAAAANSUhEUgAAABMAAAAeAQAAAADmtRi/AAAASElEQVR4nEXKsRFAQABE0XdLESJDJ9cZpdGJEoSyEzBE+//8ZRSzqAKaFGtCejKomUglKM0eDweWdny+vXv8PaenX++/a5TTDZA+DWjnicmcAAAAAElFTkSuQmCC"
},
.....
.....
"imagePath": "gray-a-0-0.jpg",
"imageData": ......
"imageHeight": 640,
"imageWidth": 640
}

I couldn't find any resources on converting this mask to YOLO format.

Ryo Kawamura · Answer 13 · Sat Feb 24 2024 03:11:14 GMT+0800 (China Standard Time)

Hi @Harry-KIT,
could you convert the Labelme format to COCO format? Then, you can convert the COCO format to YOLO format using this script general_json2yolo.py.
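
As a possible first step toward that conversion, the snippet below inspects the header of the first "mask" value from the Labelme JSON above, using only the standard library. It suggests that the mask is a base64-encoded PNG whose size matches the box spanned by the shape's two "points". Reading the points as top-left and bottom-right corners is an assumption inferred from these numbers, not something confirmed by the Labelme documentation here.

```python
import base64
import struct

# First 44 base64 characters of the first "mask" value above (the full string is longer).
MASK_PREFIX = "iVBORw0KGgoAAAANSUhEUgAAAC8AAABmAQAAAABzC/Wl"

raw = base64.b64decode(MASK_PREFIX)
assert raw[:8] == b"\x89PNG\r\n\x1a\n", "not a PNG signature"
assert raw[12:16] == b"IHDR"
# IHDR payload: width and height (big-endian uint32), then bit depth and color type.
width, height, bit_depth, color_type = struct.unpack(">IIBB", raw[16:26])

# "points" of the same shape, read here as [top-left, bottom-right] corners.
(x1, y1), (x2, y2) = (239.0, 196.0), (285.0, 297.0)
print(width, height, bit_depth, color_type)
print(x2 - x1 + 1, y2 - y1 + 1)
```

If that reading holds, a converter would decode the full PNG, paste it into an empty image of imageHeight by imageWidth at (x1, y1), and then encode the result as a COCO mask.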

G'ayrat Tangriberganov · Answer 14 · Wed Feb 28 2024 07:33:37 GMT+0800 (China Standard Time)

Hi @ryouchinsa, thank you very much.

403F · Answer 15 · Fri Mar 22 2024 01:38:23 GMT+0800 (China Standard Time)

Hi @ryouchinsa,
have you tested your script on an RLE mask whose "iscrowd" is 1, meaning it contains multiple objects?
Currently your script gives incorrect output, just like the one here:
ultralytics/ultralytics#2090 (comment)
It merges all the objects into a single object.

Ryo Kawamura · Answer 16 · Sat Mar 30 2024 00:31:07 GMT+0800 (China Standard Time)

Hi @4o3F, thanks for your detailed feedback.

You mean that converting RLE masks with "iscrowd": 1 to YOLO format might decrease the segmentation accuracy, correct?

However, another user told us that they need RLE masks with "iscrowd": 1 to be converted from COCO to YOLO format.

"I am trying to convert the COCO1.0 annotation files generated in CVAT to Yolo format. The COCO json file created consists of segmentation masks in RLE format therefore 'iscrowd' variable is True across all annotations."
ryouchinsa/Rectlabel-support#241 (comment)

So we added the skip_iscrowd_1 flag to the convert_coco_json() function in the general_json2yolo.py script.
Please give us your feedback.

Set skip_iscrowd_1=True.

Set skip_iscrowd_1=False.

Glenn Jocher · Answer 17 · Sat Mar 30 2024 10:36:39 GMT+0800 (China Standard Time)

Hi @ryouchinsa, thanks for bringing this to our attention! 👍

Indeed, handling RLE masks with "iscrowd": 1 can be tricky as it represents multiple objects as a single mask. To accommodate this, we've introduced a skip_iscrowd_1 flag in the conversion function. This allows for flexibility depending on the user's needs.

For datasets where iscrowd is significant, and individual object segmentation is required, setting skip_iscrowd_1=True will skip these masks, avoiding the merge of multiple objects into one. However, if preserving the semantic segmentation of crowded areas without distinguishing between individual objects is desired, you might opt to set skip_iscrowd_1=False.

Each approach has its use case, depending on the goal of your model. If incorrect merging is a concern in your context, I recommend experimenting with the flag to see which setting fits your needs best.
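
To make the effect of the flag concrete, here is a minimal sketch of the kind of filtering it implies. select_annotations is a hypothetical helper written for this illustration; it is not the code inside convert_coco_json(), and treating a missing "iscrowd" key as 0 is an assumption.

```python
def select_annotations(annotations, skip_iscrowd_1):
    """Drop annotations flagged iscrowd=1 when skip_iscrowd_1 is True."""
    kept = []
    for ann in annotations:
        if skip_iscrowd_1 and ann.get("iscrowd", 0) == 1:
            continue
        kept.append(ann)
    return kept


anns = [
    {"id": 1, "iscrowd": 0},
    {"id": 2, "iscrowd": 1},  # e.g. an RLE mask covering several objects
    {"id": 3},                # a missing key is treated as iscrowd=0 here
]
print([a["id"] for a in select_annotations(anns, skip_iscrowd_1=True)])
print([a["id"] for a in select_annotations(anns, skip_iscrowd_1=False)])
```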

Your feedback and further observations on this would be highly appreciated!

403F · Answer 18 · Sat Mar 30 2024 10:38:25 GMT+0800 (China Standard Time)

Thanks! This indeed fixed the problem.

Glenn Jocher · Answer 19 · Sat Mar 30 2024 13:21:10 GMT+0800 (China Standard Time)

Hi @4o3F, I’m thrilled to hear that! 🎉 Your feedback has been incredibly helpful in refining our approach to handling iscrowd flags for RLE masks. Don't hesitate to reach out if you have more insights or further questions. Cheers to improving together!