facebookresearch / CutLER

Code release for "Cut and Learn for Unsupervised Object Detection and Instance Segmentation" and "VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation"


Custom COCO dataset in self-training

tianyufang1958 opened this issue · comments

commented

Thanks for the nice work. I have a question regarding the custom COCO dataset used in self-training. For my COCO data, I have instances_train.json and instances_val.json, and I registered two datasets, one for train and one for val, but in the first step of self-training, --test-dataset only takes 'imagenet_train'.

Does this mean ImageNet uses only one JSON file for both training and validation? Or can the JSON file generation in self-training only be applied to the training data itself, not the validation data? I am confused about this.
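(Not part of the repo, just a stdlib sketch for sanity-checking the registered splits: the function below counts the top-level entries of a COCO-format annotation file. The filename in the usage comment is hypothetical.)

```python
import json

def summarize_coco(ann_path):
    """Return basic entry counts from a COCO-format annotation file."""
    with open(ann_path) as f:
        coco = json.load(f)
    # A COCO annotation file has three main top-level lists.
    return {
        "images": len(coco.get("images", [])),
        "annotations": len(coco.get("annotations", [])),
        "categories": len(coco.get("categories", [])),
    }

# Example (hypothetical path):
# print(summarize_coco("instances_train.json"))
```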


commented

Duplicate of #16. Please check #16 for more details on working with custom datasets.

About the self-training dataset, you can train CutLER on any dataset you specify. But you must let the model know which dataset/split to work on by changing the command accordingly.

@frank-xwang Sorry, maybe my question was not clear.
I have split the dataset 80%/20% in COCO format and registered the splits as training and validation datasets. The command below is only for the training dataset; should I also run it on the validation dataset to generate pseudo-labels? Just want to confirm this.

python maskcut.py \
  --vit-arch base --patch-size 8 \
  --tau 0.15 --fixed_size 480 --N 3 \
  --num-folder-per-job 1000 --job-index 0 \
  --dataset-path /path/to/dataset/traindir \
  --out-dir /path/to/save/annotations
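(For reference, the 80%/20% split mentioned above can be sketched with the standard library alone; this is not from the repo, and all paths/names are hypothetical. It splits by image and keeps each image's annotations in the same split.)

```python
import json
import random

def split_coco(ann_path, train_path, val_path, train_frac=0.8, seed=0):
    """Split a COCO-style annotation file into train/val JSONs by image."""
    with open(ann_path) as f:
        coco = json.load(f)
    images = list(coco["images"])
    random.Random(seed).shuffle(images)  # deterministic shuffle
    n_train = int(len(images) * train_frac)
    splits = {train_path: images[:n_train], val_path: images[n_train:]}
    for path, split_images in splits.items():
        ids = {img["id"] for img in split_images}
        out = {
            "images": split_images,
            # keep only annotations whose image belongs to this split
            "annotations": [a for a in coco["annotations"] if a["image_id"] in ids],
            "categories": coco["categories"],
        }
        with open(path, "w") as f:
            json.dump(out, f)
```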


commented

If you plan to use pseudo-masks for your validation dataset, then it is necessary to provide the path to the dataset that contains the validation split using the "--dataset-path" argument.

@frank-xwang My understanding is that we first use the whole image dataset to generate the pseudo-masks. After that, the dataset can be split into training and validation sets (e.g., 80%/20%) as the input for the second-phase training. Could you please confirm whether this is correct?

No, for self-training we still utilize 100% of the data. Our experimental setup is: use all ImageNet data as the training set and evaluate the model's performance on 11 different detection datasets to demonstrate zero-shot unsupervised learning.

Closing it now, please feel free to reopen it if you have further questions.