facebookresearch / CutLER

Code release for "Cut and Learn for Unsupervised Object Detection and Instance Segmentation" and "VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation"


Correct steps for self-training (custom dataset w/o annotations)

alexaatm opened this issue · comments

Hi and thank you for the cool work! :)

I am trying to perform unsupervised segmentation on a custom dataset (let's call it customdataset here for less confusion) using CutLER, and several questions came up while performing the following steps.
(I already referred to #16, which is related, but here I am asking about more things related to the usage and understanding of the repo.)

  1. Generate pseudo-masks using MaskCut -> output is a .json file.
  2. Modify dataset scripts to enable registering a custom dataset. For that I added the following in cutler/data/datasets/builtin.py:

```python
_PREDEFINED_SPLITS_customdataset = {}
_PREDEFINED_SPLITS_customdataset["custom_dataset"] = {
    'custom_dataset_train': (
        "custom_dataset/images/train",
        "custom_dataset/annotations/merged_imagenet_train_fixsize480_tau0.15_N3.json",
    ),
}
```

and

```python
def register_all_customdataset(root):
    for dataset_name, splits_per_dataset in _PREDEFINED_SPLITS_customdataset.items():
        for key, (image_root, json_file) in splits_per_dataset.items():
            # Assume pre-defined datasets live in ./datasets.
            register_coco_instances(
                key,
                _get_builtin_metadata(dataset_name),
                os.path.join(root, json_file) if "://" not in json_file else json_file,
                os.path.join(root, image_root),
            )
```

and

```python
register_all_customdataset(_root)
```
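As a sanity check (my own snippet, not from the repo), the registration can be verified with plain Detectron2 catalogs before any training:

```python
# My own sanity check (not from the repo): confirm the split is registered
# and that the MaskCut json actually loads.
from detectron2.data import DatasetCatalog, MetadataCatalog

dataset_dicts = DatasetCatalog.get("custom_dataset_train")  # triggers json loading
print(len(dataset_dicts), "images registered")
print(MetadataCatalog.get("custom_dataset_train").thing_classes)
```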

In the file cutler/data/datasets/builtin_meta.py, it is written that for custom datasets it is not necessary to write hard-coded metadata. But when debugging errors with registration, I added the following in the function _get_builtin_metadata:

```python
elif dataset_name in ["imagenet", "kitti", "cross_domain", "lvis", "voc",
                      "coco_cls_agnostic", "objects365", "openimages",
                      "custom_dataset"]:
    return _get_imagenet_instances_meta()
```

Question: Is this a correct way to handle metadata? Or should annotations created by MaskCut be used with coco_instances instead? That is, should I add my dataset name to this list here?

```python
if dataset_name in ["coco", "coco_semi"]:
    return _get_coco_instances_meta()
```

Or is it a wrong approach altogether? My CustomDataset is not real-world data and the categories do not match. At this point, if I only care about segmenting out different objects without naming them, should I use the UVO function?

  3. Use the generated pseudo-masks for performing self-training.

In the Self-Training CutLER section, there are 3 steps described for self-training:
step 1 - "Firstly, we can get model predictions on ImageNet via running".
step 2 - "Secondly, we can run the following command to generate the json file for the first round of self-training".
step 3 - "Finally, place "cutler_imagenet1k_train_r1.json" under "DETECTRON2_DATASETS/imagenet/annotations/", then launch the self-training process".

Question: For custom datasets, should I skip step 1 and step 2? I thought MaskCut already gives us the .json file that can be used for self-training?

I did not run step 1 and step 2, and directly ran the following command from step 3 to train the model on a custom dataset using the MaskCut annotations, with the ImageNet CutLER checkpoint as weight initialization:

```bash
python train_net.py --num-gpus 1 \
  --config-file model_zoo/configs/CutLER-ImageNet/cascade_mask_rcnn_R_50_FPN_self_train.yaml \
  --train-dataset custom_dataset_train \
  MODEL.WEIGHTS http://dl.fbaipublicfiles.com/cutler/checkpoints/cutler_mrcnn_final.pth \
  OUTPUT_DIR outputs/cascade/custom_dataset_selftrain-r1
```

It launched the training and I got a model.

  4. (optional) Do another round of self-training.

Question: After the first round, do I understand correctly that I would need to run step 1 (get predictions on my data from the newly trained model), step 2 (generate a json file from those predictions), and then step 3 (launch the self-training process using the new json file)? Right? And the self-training rounds should all be done on the same data? Only the ground-truth pseudo-annotations are updated, right?

  5. Inference.

I ran only one round of self-training (just on the MaskCut annotations) and then ran the demo to visualize the learned masks using the command:

```bash
python demo/demo.py \
  --config-file model_zoo/configs/CutLER-ImageNet/mask_rcnn_R_50_FPN.yaml \
  --input ../../data/custom_dataset/images/train/*.jpg \
  --output outputs/inference/custom_dataset_selftrain1 \
  --opts MODEL.WEIGHTS outputs/custom_dataset_selftrain-r1/model_final.pth
```

But the demo images were crowded with the label "person" and confidence percentages.
Question: I understand that the problem must be related to the use of ImageNet metadata, right? Is there a way to only visualize the segmentations without any labels?
So far, my intuition is to create a custom Visualizer for Detectron2... But I still wanted to ask...

Looking forward to hearing any feedback! :)

Hi, thank you for your interest in our work. For your questions:

Q1: That is, I should add my dataset name to this list here?
A1: You should add the dataset name to this line. All these datasets are class-agnostic.
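For intuition, class-agnostic metadata boils down to a single foreground class. A minimal sketch (the function name is hypothetical; in the repo, _get_imagenet_instances_meta plays this role):

```python
# Hypothetical sketch of class-agnostic metadata: a single "foreground" class,
# which is all that MaskCut pseudo-masks need.
def _get_custom_instances_meta():
    return {
        "thing_ids": [1],
        "thing_classes": ["fg"],                      # one class-agnostic label
        "thing_dataset_id_to_contiguous_id": {1: 0},  # json category 1 -> model class 0
    }
```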

Q2: For custom datasets, should I skip step 1 and step 2? As I thought the maskCut already gives us the .json file that can be used for self-training?
A2: No, you should not skip step 1 and step 2. The self-training stage comes after the unsupervised model learning stage. Although MaskCut provides the .json file with pseudo-masks, the self-training stage is essential to improve the quantity and quality of these pseudo-masks, resulting in a 1.3% improvement. For more details, please refer to Table 6 in the paper. However, if you only want a quick check of the model performance, you may choose to skip the self-training stage.
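In rough terms, step 2 boils down to something like the following (a simplified sketch of the idea, not the exact logic of tools/get_self_training_ann.py): keep confident step-1 predictions and write them back as the next round's annotations.

```python
# Simplified sketch of step 2: filter step-1 predictions by confidence and
# save them as the annotation json for the next self-training round.
import json

def make_round_json(pred_json, prev_ann_json, out_json, score_thresh=0.7):
    with open(pred_json) as f:
        preds = json.load(f)       # COCO-style predictions from step 1
    with open(prev_ann_json) as f:
        coco = json.load(f)        # previous round's pseudo-annotations
    keep = [p for p in preds if p.get("score", 0.0) >= score_thresh]
    for i, ann in enumerate(keep, start=1):
        ann["id"] = i
        ann["iscrowd"] = 0
        ann["area"] = ann["bbox"][2] * ann["bbox"][3]  # rough area from the box
    coco["annotations"] = keep     # images/categories carry over unchanged
    with open(out_json, "w") as f:
        json.dump(coco, f)
```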

Q3: The self-training rounds should all be done on the same data? Only the ground-truth predictions are updated, right?
A3: Yes.

Q4: But the demo images were crowded with label "person" and confidence percentage. Is there a way to only visualize the segmentations without any labels?
A4: Yes, absolutely! We are using Detectron2's visualizer, which by default uses the COCO classes and names. That's why the demo results show "person" as the object name; it is the first class in MSCOCO's label space. However, I have updated the Colab code and demo code to show "fg" as the class name now. This change uses the class-agnostic IMAGENET_CATEGORIES as the "thing_classes" in the demo visualizer.
If you want to completely remove the label names, you can add "labels=None" before this line in your Detectron2 repository. This will ensure that only the segmentations are visualized without any label names.
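For example, with Detectron2's Visualizer you can call overlay_instances directly and pass labels=None. A minimal sketch (file paths are placeholders, and "predictor" is assumed to be an already-built DefaultPredictor):

```python
# Minimal sketch: draw only predicted boxes and masks, with labels=None so
# no class names or scores are rendered.
import cv2
from detectron2.utils.visualizer import Visualizer

img_bgr = cv2.imread("input.jpg")
instances = predictor(img_bgr)["instances"].to("cpu")  # predictor: DefaultPredictor

vis = Visualizer(img_bgr[:, :, ::-1])  # Visualizer expects an RGB image
out = vis.overlay_instances(
    boxes=instances.pred_boxes,
    masks=instances.pred_masks,
    labels=None,  # suppress label text entirely
)
cv2.imwrite("output.jpg", out.get_image()[:, :, ::-1])
```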

Please let me know if this resolves your issues.

Closing it now. Please feel free to reopen it if you have further questions.

I am getting back to experiments, so I am sorry for such a delayed reply! It helped a lot!

I got another question actually, if I may.. 😅

In the scenario where I want to do inference (i.e., get predictions via running train_net.py with the --eval-only flag) using the already trained ImageNet CutLER model, but on my custom data: are the MaskCut masks generated for my data somehow used during this process?

I am asking only because, for inference, we need to provide a test dataset, and when registering a dataset we also give the initial annotation file (generated by MaskCut).

My intuition is that the masks are not used, since it is a test mode.
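(To make concrete what I mean by inference without annotations, here is my own sketch, assuming Detectron2's DefaultPredictor can be used with the CutLER config; it never reads any annotation json.)

```python
# My own sketch (an assumption, not from the repo docs): pure prediction with
# DefaultPredictor. The CutLER yaml may additionally require the repo's own
# config extensions before merge_from_file accepts its custom keys.
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("model_zoo/configs/CutLER-ImageNet/cascade_mask_rcnn_R_50_FPN_self_train.yaml")
cfg.MODEL.WEIGHTS = "outputs/cascade/custom_dataset_selftrain-r1/model_final.pth"
predictor = DefaultPredictor(cfg)

img = cv2.imread("some_test_image.jpg")
instances = predictor(img)["instances"]  # boxes + masks; no ground truth involved
```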

Thank you in advance!