icoz69 / CaNet

The code for paper "CANet: Class-Agnostic Segmentation Networks with Iterative Refinement and Attentive Few-Shot Learning"


model does not upsample predictions

SMHendryx opened this issue

In the paper you note that: "Our final result is achieved by bilinearly upsampling the confidence map to the same spatial size of the query image and classifying each location according to the confidence maps." As far as I can tell from the code, your model is not upsampling; the predictions come from here:

```python
out = self.layer9(out)
```

How are you evaluating the results when the predictions and labels are of different shapes? Do you downsample the masks to the size of the predictions, or upsample the predictions to the size of the masks in other, unreleased code?

Also, during training and eval, do you first resize the images and masks before passing them into the graph? Or do you use the original PASCAL image dimensions?

This could relate to the difficulties in producing the reported results in #4.

Hello, I have updated the training scripts now. The non-learnable upsampling is done outside the forward function of the network. For the final evaluation, all metrics, i.e. meanIoU and FB-IoU, are based on the original image size.
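For concreteness, here is a minimal sketch of that flow, assuming a model whose forward returns an (N, C, h, w) confidence map at reduced resolution; the function and argument names below are illustrative, not the repo's actual API:

```python
import torch
import torch.nn.functional as F

def predict_full_resolution(model, query, support, support_mask, original_hw):
    """Run the network, then bilinearly upsample the confidence map to the
    query image's original (H, W) before classifying each location."""
    with torch.no_grad():
        logits = model(query, support, support_mask)   # (N, C, h, w), reduced size
    # Non-learnable bilinear upsampling, applied outside forward()
    logits = F.interpolate(logits, size=original_hw,
                           mode='bilinear', align_corners=True)
    return logits.argmax(dim=1)                        # (N, H, W) label map
```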

Hi @icoz69, thank you for getting back to me. It's cool to see some more of the training details. It looks like you are training and evaluating with input_size = (321, 321), not the original image shapes, which are not all 321 by 321, correct? Also, it looks like you are saving the best model found during training (on the 15 training classes) by evaluating on the test set of 5 held-out classes here:

CaNet/train.py, line 269 in fdce946:

```python
torch.save(model.cpu().state_dict(), osp.join(checkpoint_dir, 'model', 'best' '.pth'))
```

Is that correct?
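For readers following along, here is a minimal sketch of the input pipeline implied by input_size = (321, 321): images are resized to a fixed shape for the network, while the metric is computed against the mask at its original size. The transform names and file path are illustrative assumptions, not the repo's actual data code:

```python
from PIL import Image
from torchvision import transforms

input_size = (321, 321)
to_net = transforms.Compose([
    transforms.Resize(input_size),   # fixed network input size
    transforms.ToTensor(),
])

img = Image.open('image1.jpg').convert('RGB')   # hypothetical example image
original_hw = img.size[::-1]         # PIL .size is (W, H); metrics need (H, W)
query = to_net(img).unsqueeze(0)     # (1, 3, 321, 321) tensor for the network
# ...run the model on `query`, then upsample the logits back to original_hw
# (as in the sketch above) before comparing against the original-size mask.
```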

Thanks for the additional details @icoz69. One more question about the dataset that would be helpful for people to know: how did you combine the examples from SBD and PASCAL to make the PASCAL-5^i dataset? Given that there is some amount of overlap, did you just use SBD? Or did you simply put the images and masks from both PASCAL and SBD into the same data directory?

I ask because if you simply put both datasets into the same directory, there may be some amount of overlap, meaning you will sometimes have the same images in the sampled few-shot training and validation sets.

Or, when an image-mask pair appears in both parent datasets, do you prefer one over the other? I.e., if the mask for image1.jpg is in both SBD and PASCAL, do you just take the pair from SBD?

There is a split of SBD that adopts the same val set as VOC 2012; that's the split we use. You can simply combine all the training images from the two datasets.
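A hedged sketch of that merge, assuming the standard SBD (benchmark_RELEASE) and VOCdevkit directory layouts: take the union of the two training ID lists and drop anything that appears in the VOC 2012 val list, so no evaluation image leaks into a sampled few-shot training episode.

```python
def read_ids(path):
    """Read one image ID per line from a split file."""
    with open(path) as f:
        return {line.strip() for line in f if line.strip()}

# Paths below assume the standard SBD and VOCdevkit layouts.
sbd_train = read_ids('benchmark_RELEASE/dataset/train.txt')
voc_train = read_ids('VOCdevkit/VOC2012/ImageSets/Segmentation/train.txt')
voc_val   = read_ids('VOCdevkit/VOC2012/ImageSets/Segmentation/val.txt')

# Union of the two training splits, with every VOC 2012 val image removed.
train_ids = (sbd_train | voc_train) - voc_val
val_ids   = voc_val
```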

Thanks for the clarification @icoz69. I am closing this issue, as it is now clearer how to reproduce your work.

You are welcome. Good luck with your project.