laclouis5 / globox

A package to read and convert object detection datasets (COCO, YOLO, PascalVOC, LabelMe, CVAT, OpenImage, ...) and evaluate them with COCO and PascalVOC metrics.

Convert yolov5 dataset labels into COCO labels

Alberto1404 opened this issue

Hi. I want to compare the performance of YOLOv5 and DETR (from Hugging Face). My custom dataset already has .txt labels in YOLO format, but DETR expects labels in COCO format.
Can anybody help me figure out how to use this tool for that?

dataset.yaml

# Paths
train: ~/Dataset/train/images
val: ~/Dataset/val/images
test: ~/Dataset/test/images

# Classes
names: 
    0: person

You can read the YOLOv5 predictions with AnnotationSet.from_yolo_v5() and then convert them to COCO using the .save_coco() method:

from globox import AnnotationSet

predictions = AnnotationSet.from_yolo_v5("path/to/annotations", image_folder="path/to/images")
predictions.save_coco(
  "coco_preds.json", 
  label_to_id={"person": 0}, 
  imageid_to_id={im: i for i, im in enumerate(sorted(predictions.image_ids))}
)

Conversion to COCO is a little tricky, as you can see, because COCO uses integer ids rather than image names and string class names. In any case, you should adapt label_to_id and imageid_to_id to match the id mapping of the DETR dataset; the values above are only an example.
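
If the DETR dataset already has a COCO annotation file, one way to derive both mappings from it is sketched below. It relies only on the standard COCO layout ("categories" entries with "id" and "name", "images" entries with "id" and "file_name"). The helper name mappings_from_coco and the inline sample are hypothetical, and the sketch assumes that the image ids in predictions.image_ids are the image file names.

```python
import json


def mappings_from_coco(coco: dict) -> tuple[dict[str, int], dict[str, int]]:
    """Build (label_to_id, imageid_to_id) from a parsed COCO annotation dict."""
    # Category name -> integer category id.
    label_to_id = {cat["name"]: cat["id"] for cat in coco["categories"]}
    # Image file name -> integer image id (assumes file names are the image ids).
    imageid_to_id = {img["file_name"]: img["id"] for img in coco["images"]}
    return label_to_id, imageid_to_id


# Tiny inline stand-in for the content of a real COCO annotation file.
sample = json.loads("""
{
  "categories": [{"id": 0, "name": "person"}],
  "images": [
    {"id": 10, "file_name": "img_a.jpg", "width": 640, "height": 480},
    {"id": 11, "file_name": "img_b.jpg", "width": 640, "height": 480}
  ],
  "annotations": []
}
""")

label_to_id, imageid_to_id = mappings_from_coco(sample)
print(label_to_id)
print(imageid_to_id)
```

The two dictionaries can then be passed as label_to_id and imageid_to_id to .save_coco(), so that the predictions use the same ids as the ground truth.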

Note that you can also read COCO annotations with AnnotationSet.from_coco(), which could simplify your workflow.

commented

Hi, I have the same goal, but when I run this code:

from globox import AnnotationSet

predictions = AnnotationSet.from_yolo_v5("little/labels/", image_folder="little/img/")
predictions.save_coco(
  "coco_preds.json",
  label_to_id={"one": 0, "two": 1},
  imageid_to_id={im: i for i, im in enumerate(sorted(predictions.image_ids))}
)

I get this error:

Traceback (most recent call last):
  File "C:\Users\nolro\PycharmProjects\chekc\COCO\yolo_to_coco.py", line 27, in <module>
    predictions.save_coco(
  File "C:\Users\nolro\PycharmProjects\chekc\venv\lib\site-packages\globox\annotationset.py", line 856, in save_coco
    content = self.to_coco(
  File "C:\Users\nolro\PycharmProjects\chekc\venv\lib\site-packages\globox\annotationset.py", line 812, in to_coco
    "category_id": label_to_id[box.label],
KeyError: '0'

Could you help me figure it out?

@tesla150600 When you specify label_to_id = {"one": 0, "two": 1}, the bounding boxes you are dealing with (predictions) must only have the labels "one" and "two". However, as the error message points out, at least one box has the label "0", which is not a key of label_to_id.

Maybe your predictions are already in the right format or some predictions have the wrong label.
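
To check which labels the annotation files actually contain, a quick script like the one below may help. It is only a sketch and does not use globox: it reads the first field of each line in YOLO-style .txt files (the class index) and builds a label_to_id whose keys are those strings, which is what the KeyError: '0' suggests the boxes carry. The function name yolo_labels and the temporary sample files are made up for the example.

```python
from pathlib import Path
import tempfile


def yolo_labels(label_dir: Path) -> set[str]:
    """Collect the distinct class fields (first token of each line) from YOLO .txt files."""
    labels = set()
    for txt in sorted(label_dir.glob("*.txt")):
        for line in txt.read_text().splitlines():
            fields = line.split()
            if fields:  # skip blank lines
                labels.add(fields[0])
    return labels


# Demo on two throwaway files; point label_dir at your own labels folder instead.
with tempfile.TemporaryDirectory() as tmp:
    label_dir = Path(tmp)
    (label_dir / "a.txt").write_text("0 0.5 0.5 0.2 0.3\n1 0.1 0.1 0.05 0.05\n")
    (label_dir / "b.txt").write_text("0 0.3 0.4 0.1 0.1\n")
    found = yolo_labels(label_dir)

# Keys are the labels exactly as they appear in the files (strings);
# values are the integer category ids to write into the COCO file.
label_to_id = {label: int(label) for label in sorted(found, key=int)}
print(label_to_id)
```

If the keys come out as "0" and "1", passing a mapping with those keys to .save_coco() (instead of {"one": 0, "two": 1}) should avoid the KeyError. Using names such as "one" and "two" would require renaming the box labels first.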