mmaaz60 / mvits_for_class_agnostic_od

[ECCV'22] Official repository of paper titled "Class-agnostic Object Detection with Multi-modal Transformer".

using my own custom dataset

nikky4D opened this issue

I would like to finetune on my own dataset. Do you have recommendations on how I can create my own dataset for this?

I also have a question about pretraining. I want to pretrain only on my dataset. Can I modify pretrain.json to specify only the path to my dataset? What else should I change to get pretraining to work?

Hi @nikky4D,

Thank you for your interest in our work. We use the same setup as MDETR for pretraining our model. Specifically, we trained on approximately 1.3M image-caption pairs from GQA, COCO, and Flickr.

To train on your custom dataset, you will need to convert it to COCO format, with captions and tokens_positive entries that define the alignment between caption text and bounding boxes. The issue at explains the required format of tokens_positive. Further, the standard data loader used can be found at.
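As a rough illustration of the conversion described above, here is a minimal sketch that builds a COCO-style dictionary with a caption per image and a tokens_positive entry per box. It rests on several assumptions that should be checked against the data loader: that tokens_positive is a list of [start, end) character spans into the caption, that the caption is stored on the image entry, and that a single placeholder category is acceptable for class-agnostic training. The helper names (char_spans, build_dataset) and the sample input layout are hypothetical and not part of the repository.

```python
import json


def char_spans(caption, phrase):
    """Return the [start, end) character span of every occurrence of phrase in caption."""
    spans = []
    start = caption.find(phrase)
    while start != -1:
        spans.append([start, start + len(phrase)])
        start = caption.find(phrase, start + 1)
    return spans


def build_dataset(samples):
    """Convert a list of hypothetical sample dicts into a COCO-style dict."""
    images, annotations = [], []
    for image_id, sample in enumerate(samples):
        images.append({
            "id": image_id,
            "file_name": sample["file_name"],
            "width": sample["width"],
            "height": sample["height"],
            "caption": sample["caption"],  # assumed to live on the image entry
        })
        for phrase, (x, y, w, h) in sample["boxes"]:
            spans = char_spans(sample["caption"], phrase)
            if not spans:
                raise ValueError(f"phrase {phrase!r} not found in caption")
            annotations.append({
                "id": len(annotations),
                "image_id": image_id,
                "category_id": 1,            # placeholder class (assumption)
                "bbox": [x, y, w, h],        # COCO convention: x, y, width, height
                "area": w * h,
                "iscrowd": 0,
                "tokens_positive": spans,    # assumed: character spans in the caption
            })
    return {
        "images": images,
        "annotations": annotations,
        "categories": [{"id": 1, "name": "object"}],
    }


# Made-up example input, for illustration only.
samples = [{
    "file_name": "0001.jpg",
    "width": 640,
    "height": 480,
    "caption": "a dog next to a red ball",
    "boxes": [("dog", (50, 60, 200, 180)), ("red ball", (300, 220, 80, 80))],
}]
dataset = build_dataset(samples)
print(json.dumps(dataset["annotations"], indent=2))
```

Each box is tied to the phrase that describes it, so a phrase that appears more than once in the caption yields several spans for the same box. Whether that is the desired behaviour depends on the format the linked issue describes.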

You can also evaluate MDef-DETR on your dataset without any pretraining or fine-tuning. Please refer to this issue for details.

I hope this information will be helpful.