CasiaFan / Dataset_to_VOC_converter

Scripts to convert datasets (Caltech pedestrian, MS COCO, HDA) to PASCAL VOC format

problem in url.txt

mna12478 opened this issue

When I run anno_json_image_urls.py, the second column of the generated url.txt contains a URL such as 'http://images.cocodataset.org/val2017/000000052412.jpg', not 1 or -1.

In my opinion, the second column in url.txt should be 1 or -1, shouldn't it?

@mna12478 This script extracts each image's name and download URL from a COCO JSON annotation file (mainly the instances file). I am not sure what the 1/-1 you expected here would mean.
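
For illustration, the kind of extraction described above might look like the sketch below. This is not the code from anno_json_image_urls.py. It assumes the standard COCO layout, where each entry under "images" carries a "file_name" and a "coco_url" field. The function names and the space-separated output format are made up for this example.

```python
import json


def extract_name_url_pairs(coco):
    """Return (file_name, coco_url) pairs from a loaded COCO annotation dict."""
    return [(img["file_name"], img["coco_url"]) for img in coco["images"]]


def format_line(name, url):
    """Format one output line; the space separator is an assumption."""
    return f"{name} {url}"


def write_url_file(annotation_path, out_path):
    """Read a COCO annotation JSON file and write one 'name url' line per image."""
    with open(annotation_path) as f:
        coco = json.load(f)
    with open(out_path, "w") as f:
        for name, url in extract_name_url_pairs(coco):
            f.write(format_line(name, url) + "\n")
```

With output like this, the second column is always a URL, which matches what the generated url.txt shows.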

@CasiaFan When I try to convert the MS COCO 2017 dataset to VOC format, I find that some XML files are missing. For example, there are 5000 images in the validation set, but when I create the XML files for these validation images, I get only 4952 XML files in total. Did you run into this problem, or do you have any idea why it happens?

@mna12478 Some images may be broken due to unknown errors, and I skip the images that fail to open. So the number of output files can differ from the expected count.

I'm guessing that some images don't contain any objects and thus don't need a label file. There might just be some photos of a white wall or something similar.
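
One way to test this guess is to count the images that no annotation refers to. The sketch below assumes the standard COCO instances layout ("images" entries with "id" and "file_name", "annotations" entries with "image_id"). The function name and the file path in the usage comment are hypothetical.

```python
import json


def images_without_annotations(coco):
    """Return file names of images that no annotation entry points to."""
    annotated_ids = {ann["image_id"] for ann in coco["annotations"]}
    return [img["file_name"] for img in coco["images"]
            if img["id"] not in annotated_ids]


# Usage (the path is an assumption, adjust it to your local copy):
# with open("annotations/instances_val2017.json") as f:
#     missing = images_without_annotations(json.load(f))
# print(len(missing), "images have no annotations")
```

If the count printed here equals the number of missing XML files, the gap comes from unannotated images rather than from images that failed to open.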