OpenImages support would be great
rodrigob opened this issue · comments
Would be great to also support the OpenImages dataset.
(15M boxes over 600 categories; 2.7M instance segmentations over 350 categories)
This dataset was part of the RVC 2020 challenge and its own Kaggle competitions in 2019.
Thanks for the suggestion!
I don't have much experience with OpenImages, so would it be possible for you to implement a dataset driver for it as a pull request? You can use the COCO one as a reference:
Lines 60 to 107 in 49a5d2a
The official toolkit from the RVC competition might be helpful: https://github.com/ozendelait/rvc_devkit/tree/master/objdet
Specifically, it downloads the original CSV annotations from OID (V5|V6) and resorts to https://github.com/bethgelab/openimages2coco to convert them into COCO instance/segmentation format, which can then be a drop-in replacement for TIDE.
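For orientation, the core of the box conversion is a change of coordinate convention: the OID CSVs store boxes as normalized XMin, XMax, YMin, YMax values, while COCO expects [x, y, width, height] in pixels. Here is a minimal sketch of that one step. It is not the actual openimages2coco code, and the function name is hypothetical:

```python
def oid_box_to_coco_bbox(xmin, xmax, ymin, ymax, img_width, img_height):
    """Convert one normalized OID box to a COCO-style [x, y, w, h] pixel box.

    Hypothetical helper for illustration only; openimages2coco may handle
    rounding, clipping, and image-size lookup differently.
    """
    x = xmin * img_width
    y = ymin * img_height
    w = (xmax - xmin) * img_width
    h = (ymax - ymin) * img_height
    return [x, y, w, h]


# Example: a box covering the lower middle of a 200x100 image.
bbox = oid_box_to_coco_bbox(0.25, 0.75, 0.5, 1.0, img_width=200, img_height=100)
```

Note that the image width and height are needed for the conversion, which is presumably part of why the converter works from the image metadata as well as the annotation CSVs.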
The steps to produce a working solution seem to be as follows:
- Remove the unnecessary lines used to download OID images https://github.com/ozendelait/rvc_devkit/blob/c986717abc24eba99a259e203a9ce4e182b2124e/objdet/download_oid_boxable.sh#L21 and run download_oid_boxable.sh
- Modify the line https://github.com/ozendelait/rvc_devkit/blob/c986717abc24eba99a259e203a9ce4e182b2124e/objdet/convert_oid_coco.sh#L32 to conform to the change in https://github.com/bethgelab/openimages2coco, where convert.py has been renamed to convert_annotations.py, and run the conversion script
- For bounding box annotations, it seems a dummy 'segmentation' field is needed, or one can fork TIDE and make necessary adjustments. For mask annotations, somehow openimages2coco decides to use 'segments_info' as the field name https://github.com/bethgelab/openimages2coco/blob/8991d9bccbd3d91f32b87f04dab60b2a61cb608e/utils.py#L238 , so that needs to be converted as well.
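The last step could be sketched roughly as below. This rests on two assumptions that I have not verified against either code base: that 'segments_info' appears as a top-level key in place of 'annotations', and that a rectangle polygon derived from the box is an acceptable dummy 'segmentation' for TIDE. The function names are hypothetical.

```python
import copy


def bbox_to_polygon(bbox):
    """Return a COCO polygon (list of one flat x/y list) tracing the box outline."""
    x, y, w, h = bbox
    return [[x, y, x + w, y, x + w, y + h, x, y + h]]


def patch_converted_annotations(coco, wrong_key="segments_info"):
    """Return a patched copy of a converted OID annotation dict.

    Assumption: `wrong_key` is a top-level key holding the annotation list.
    Assumption: a box-shaped polygon is a usable dummy 'segmentation'.
    """
    patched = copy.deepcopy(coco)  # leave the caller's dict untouched

    # Rename the annotation list to the key COCO-style readers expect.
    if "annotations" not in patched and wrong_key in patched:
        patched["annotations"] = patched.pop(wrong_key)

    # Fill in a dummy segmentation wherever one is missing.
    for ann in patched.get("annotations", []):
        if "segmentation" not in ann:
            ann["segmentation"] = bbox_to_polygon(ann["bbox"])
    return patched
```

In practice one would load the converted file with json.load, pass the dict through patch_converted_annotations, and write it back with json.dump before pointing TIDE at it.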
Then simply substitute the OID path:
from tidecv import TIDE, datasets

# path_to_oid_converted: the converted OID annotation file
# path_to_preds: the predictions in COCO result format
tide = TIDE()
tide.evaluate(
    datasets.COCO(path_to_oid_converted),
    datasets.COCOResult(path_to_preds),
    mode=TIDE.BOX,
)  # Use TIDE.MASK for masks
tide.summarize()  # Summarize the results as tables in the console
tide.plot()  # Show a summary figure. Specify a folder and it'll output a png to that folder.
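Before running the evaluation, it may be worth checking that the predictions actually line up with the converted ground truth, since mismatched image or category ids are an easy mistake after a format conversion. Here is a small sketch, assuming the predictions are a standard COCO results list (dicts with image_id, category_id, bbox, score). The function name is hypothetical and not part of TIDE:

```python
def find_unmatched_results(gt, results):
    """List detections whose image_id or category_id is absent from the ground truth.

    `gt` is the converted annotation dict, `results` the COCO-style results list.
    Returns (result_index, field_name, offending_value) tuples.
    """
    image_ids = {img["id"] for img in gt.get("images", [])}
    category_ids = {cat["id"] for cat in gt.get("categories", [])}

    problems = []
    for index, det in enumerate(results):
        if det.get("image_id") not in image_ids:
            problems.append((index, "image_id", det.get("image_id")))
        if det.get("category_id") not in category_ids:
            problems.append((index, "category_id", det.get("category_id")))
    return problems
```

An empty return value only means the ids are consistent; it says nothing about whether the boxes themselves are in the right coordinate convention.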