facebookresearch / CutLER

Code release for "Cut and Learn for Unsupervised Object Detection and Instance Segmentation" and "VideoCutLER: Surprisingly Simple Unsupervised Video Instance Segmentation"


Set up panoptic/semantic segmentation with CutLER

alexaatm opened this issue · comments

Hi!
Thanks for your awesome work!

I ran MaskCut and CutLER on a custom dataset and registered the dataset the same way as ImageNet, so it used two categories: background and foreground. I am wondering how to set up custom CutLER training on a dataset with more classes (or panoptic segmentation, which would include stuff).

My concern and confusion is that MaskCut produces individual foreground object masks with no semantic meaning (class) attached to them. How should this situation be handled?

My intuition was that I could learn foreground object segmentation by training CutLER on class-agnostic masks (obtained from MaskCut or elsewhere), but that also seems wrong, since Mask R-CNN's ability to distinguish classes is not utilised to its full extent in such a case.
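For the class-agnostic setup described here, a minimal sketch of what one training record could look like, assuming detectron2's standard dataset-dict format (the file name, image size, and coordinates below are made up for illustration); every MaskCut mask is mapped to the single category 0, "foreground":

```python
# Hypothetical record in detectron2's standard dataset format for
# class-agnostic training: every annotation uses category_id 0.
record = {
    "file_name": "images/0001.jpg",   # made-up path
    "image_id": 0,
    "height": 480,
    "width": 640,
    "annotations": [
        {
            "bbox": [100, 120, 300, 380],          # XYXY absolute coords
            "bbox_mode": 0,                        # BoxMode.XYXY_ABS in detectron2
            "category_id": 0,                      # all MaskCut masks -> "foreground"
            "segmentation": [[100, 120, 300, 120,  # polygon (x, y) pairs
                              300, 380, 100, 380]],
        }
    ],
}
```

A dataset of such records would then be registered via `DatasetCatalog.register(...)` with `thing_classes = ["foreground"]` in the metadata, mirroring the two-category ImageNet-style registration mentioned above.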

looking forward to hearing your thoughts!

Kind regards,
Alexa

Hi Alexa,

Thank you for your interest in our work! We are currently working on a paper that extends CutLER to include unsupervised panoptic segmentation and universal image segmentation. We have plans to release both the code and the paper in the coming weeks. Stay tuned!

Best,
XD


Hello, are you still studying this? I would like to ask you some questions.

Hey! Yes, we are still working on it and have released the paper (https://arxiv.org/pdf/2312.17243.pdf) and code (https://github.com/u2seg/U2Seg).


Hello, I would like to ask a question. I trained with train_net, and now I want to run inference on new images and visualize the results, but the following problem appears:

import random

import cv2
from detectron2.utils.visualizer import Visualizer

for d in random.sample(dataset_dicts, 3):
    im = cv2.imread(d["file_name"])
    # output format is documented at
    # https://detectron2.readthedocs.io/tutorials/models.html#model-output-format
    outputs = predictor(im)
    print(outputs["instances"].pred_classes)
    print(outputs["instances"].pred_boxes)
    v = Visualizer(im[:, :, ::-1],
                   metadata=tixie_train_metadata,
                   scale=1.2)
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))

tensor([], device='cuda:0', dtype=torch.int64)
Boxes(tensor([], device='cuda:0', size=(0, 4)))
tensor([], device='cuda:0', dtype=torch.int64)
Boxes(tensor([], device='cuda:0', size=(0, 4)))
tensor([], device='cuda:0', dtype=torch.int64)
Boxes(tensor([], device='cuda:0', size=(0, 4)))
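Empty pred_classes and pred_boxes tensors like these typically mean that no detections survived the test-time score threshold. A hedged sketch of one thing to try, assuming a standard detectron2 config object `cfg` whose `MODEL.WEIGHTS` already points at the checkpoint produced by train_net (the threshold value 0.3 is just an illustrative choice, not the repository's recommended setting):

```python
# Sketch only: keep lower-confidence detections by reducing the
# test-time score threshold before building the predictor.
from detectron2.engine import DefaultPredictor

cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.3  # illustrative value
predictor = DefaultPredictor(cfg)
```

If the outputs are still empty at a low threshold, it is also worth double-checking that the correct trained weights are being loaded and that the input images resemble the training data.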