CasiaFan / Dataset_to_VOC_converter

Scripts to convert datasets (Caltech pedestrian, MS COCO, HDA) to PASCAL VOC format

Some annotations are missing.

DonghoonPark12 opened this issue

In the 2017 set, the number of train images is 118,237, but the number of annotation XML files covering the 80 categories is 117,266.

Likewise, the number of val images is 5,000, but the number of annotation XML files covering the 80 categories is 4,872.

My question is: why are some XML files missing?
Is it related to the 80 categories? MS COCO actually has 90 categories.
If so, why are 10 categories excluded when the XML files are generated?

Thanks in advance!

@DonghunP First, the script anno_coco2voc just parses the official annotation JSON file and exports its content to XML files. If some images have no corresponding annotation file, it is probably because the JSON file contains no annotation information for those images.
BTW, I checked the COCO website, and the "What is COCO?" panel says there are only 80 object categories. If your evidence comes from TensorFlow object detection, note that some ids are skipped in the label pbtxt file.
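
For anyone who wants to check both points against their own copy of the annotations, here is a minimal sketch. It is not part of anno_coco2voc; the helper names `images_without_annotations` and `skipped_category_ids` are made up for illustration, and it assumes the usual COCO instances layout with top-level `images`, `annotations`, and `categories` lists.

```python
import json
import sys


def images_without_annotations(coco):
    """Return sorted ids of images that no annotation entry refers to."""
    annotated = {ann["image_id"] for ann in coco["annotations"]}
    return sorted(img["id"] for img in coco["images"] if img["id"] not in annotated)


def skipped_category_ids(coco):
    """Return ids missing between the smallest and largest category id."""
    used = {cat["id"] for cat in coco["categories"]}
    return [i for i in range(min(used), max(used) + 1) if i not in used]


if __name__ == "__main__":
    # The path is supplied by the user, e.g. the instances JSON for train or val.
    with open(sys.argv[1]) as f:
        coco = json.load(f)
    print("images:", len(coco["images"]))
    print("images with no annotations:", len(images_without_annotations(coco)))
    print("categories:", len(coco["categories"]))
    print("skipped category ids:", skipped_category_ids(coco))
```

If the count of images with no annotations matches the gap between the image count and the XML count, the missing XML files come from the JSON itself rather than from the conversion.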

Thanks a lot. I understand now.