Request Demo

jbeomlee93 opened this issue 2 years ago · comments

Dear authors.

Thank you for your nice work, and congratulations on MaskCLIP being accepted to ECCV as an oral paper!

I tried running your code on a single image with a set of object classes, but failed to obtain meaningful localization results from MaskCLIP.

I'm sorry to bother you, but could you provide a demo file? The Batman example would be nice.

Thank you so much!

free_wind · Answer 1 · Wed Oct 05 2022 12:58:44 GMT+0800 (China Standard Time)

I have the same issue. Is there any update? It seems the pretrained CLIP model fails to provide meaningful results.

Kunyang Zhou · Answer 2 · Thu Oct 06 2022 09:02:56 GMT+0800 (China Standard Time)

I have run into the same problem. I use the following code to run inference on a new image, but the result is unsatisfactory.

from mmseg.apis import inference_segmentor, init_segmentor, show_result_pyplot
from mmseg.core.evaluation import get_palette

if __name__ == "__main__":
    configs = "configs/maskclip/maskclip_vit16_1024x512_cityscapes.py"
    # checkpoint_file = '/home/fyj/zky/Seg/MaskCLIP/pretrain/ViT16_clip_backbone.pth'
    # No checkpoint is passed here, so nothing is loaded through this argument.
    checkpoint_file = None

    # Build the segmentor from the config and run it on one Cityscapes image.
    model = init_segmentor(configs, checkpoint_file, device='cuda:0')
    img = 'demo/munster_000024_000019_leftImg8bit.png'
    result = inference_segmentor(model, img)
    print("result:", result[0].shape)

    # Overlay the prediction on the image and save it to fine.jpg.
    show_result_pyplot(model, img, result, get_palette('cityscapes'), out_file='fine.jpg')

Theodoros Pissas · Answer 3 · Sat Nov 12 2022 02:29:00 GMT+0800 (China Standard Time)

Hi, and thanks for sharing the code of this interesting work. I have drawn similar conclusions to the comments above. I reimplemented the MaskCLIP baseline (i.e., using pretrained CLIP features only) in my own codebase and tested it on images from ADE20K and PASCAL Context. It frequently recognizes the salient semantics in the image, as expected given CLIP's zero-shot classification capabilities, and prompt denoising/key smoothing is particularly helpful for limiting predictions to salient classes. However, it clearly fails to produce a usable segmentation mask.

Thus, I wonder if the authors can provide the requested demo on PASCAL-context.

Also, it is unclear in the paper (to the best of my understanding) how the final segmentation of MaskCLIP is obtained. Are the CLIP features bilinearly upsampled, or is the low-resolution segmentation (i.e., after class-wise argmax) upsampled using nearest-neighbour interpolation?

Many thanks for any help with this.
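To make the two alternatives in the question concrete, here is a minimal NumPy sketch on toy data. It is not taken from the MaskCLIP code and does not say which option the authors use; `upsample_bilinear` and `upsample_nearest` are hypothetical helpers written for illustration, and the sketch interpolates per-class logits rather than the raw CLIP features. It only shows that the two orderings can produce different masks near class boundaries.

```python
import numpy as np


def upsample_nearest(label_map, scale):
    """Repeat every low-resolution label `scale` times along both axes."""
    return np.repeat(np.repeat(label_map, scale, axis=0), scale, axis=1)


def upsample_bilinear(logits, scale):
    """Bilinearly resize (C, H, W) logits by an integer factor.

    Uses half-pixel centres and clamps coordinates at the borders
    (an illustrative convention, assumed here).
    """
    _, h, w = logits.shape
    # Source coordinate of each output pixel centre, clamped to the valid range.
    ys = np.clip((np.arange(h * scale) + 0.5) / scale - 0.5, 0, h - 1)
    xs = np.clip((np.arange(w * scale) + 0.5) / scale - 0.5, 0, w - 1)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, h - 1)
    x1 = np.minimum(x0 + 1, w - 1)
    wy = (ys - y0)[None, :, None]  # vertical interpolation weights
    wx = (xs - x0)[None, None, :]  # horizontal interpolation weights
    top = logits[:, y0][:, :, x0] * (1 - wx) + logits[:, y0][:, :, x1] * wx
    bottom = logits[:, y1][:, :, x0] * (1 - wx) + logits[:, y1][:, :, x1] * wx
    return top * (1 - wy) + bottom * wy


# Toy patch-level logits with shape (C=2, H=2, W=2); the values are made up.
logits = np.array([
    [[4.0, 0.0], [0.0, 0.0]],  # class 0: one confident patch
    [[0.0, 1.0], [1.0, 1.0]],  # class 1: weak evidence elsewhere
])
scale = 4

# Option A: interpolate the logits first, then take the class-wise argmax.
option_a = upsample_bilinear(logits, scale).argmax(axis=0)
# Option B: take the argmax at low resolution, then upsample the labels.
option_b = upsample_nearest(logits.argmax(axis=0), scale)

print("pixels where the two options disagree:", int((option_a != option_b).sum()))
```

In this toy case option B yields blocky, patch-aligned regions, while option A lets the confident patch extend past its own block, so the choice affects the mask wherever neighbouring patches disagree.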

111chengxuyuan · Answer 4 · Wed Feb 08 2023 09:52:10 GMT+0800 (China Standard Time)

Hello, I have a question: which version of mmsegmentation should I install to run this code properly? I installed 0.20.0 but couldn't run it.
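The thread does not state which version is required. When asking a version question like this, it can help to post the versions actually installed in the environment. The following is a small sketch using only the standard library; `installed_versions` is a hypothetical helper, and the distribution names passed to it are assumptions about how the packages were installed (for example, mmcv may be installed as `mmcv` rather than `mmcv-full`).

```python
from importlib import metadata


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in distributions:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names are assumptions; adjust them to match your environment.
print(installed_versions(["mmsegmentation", "mmcv-full", "torch"]))
```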

Kanchana Ranasinghe · Answer 5 · Wed Mar 08 2023 00:02:41 GMT+0800 (China Standard Time)

I am facing similar issues trying to replicate this. Sharing a demo notebook could be really useful.

sans · Answer 6 · Sat Mar 16 2024 19:39:32 GMT+0800 (China Standard Time)

Hi, I am working on a project involving MaskCLIP. Please help me understand how to reproduce the baseline results shown in the paper!

Ｌｕｃｙ · Answer 7 · Sun May 05 2024 17:15:38 GMT+0800 (China Standard Time)

I don't know how to solve the following dataset problem. I hope someone can help:
ImportError: cannot import name 'Detail' from 'detail' (/home/asc005/anaconda3/envs/MaskCLIP/lib/python3.8/site-packages/detail/__init__.py)
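An error of this form means Python did find a package named `detail` at the path shown, but that package does not define `Detail`. One possible cause, which is an assumption here and not confirmed in the thread, is that a different package with the same name is installed in place of the PASCAL Context Detail API. A short check like the one below reports which file the name resolves to and whether the expected attribute exists; `describe_module` is a hypothetical helper, not part of any library.

```python
import importlib
import importlib.util


def describe_module(module_name, attribute):
    """Report where a top-level module resolves and whether it exposes an attribute."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return {"found": False, "origin": None, "has_attribute": False}
    module = importlib.import_module(module_name)
    return {
        "found": True,
        "origin": spec.origin,  # file the import system would load
        "has_attribute": hasattr(module, attribute),
    }


print(describe_module("detail", "Detail"))
```

If `found` is true but `has_attribute` is false, the `origin` path shows which installed package is shadowing the expected one.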