NoelShin / reco

[NeurIPS'22] ReCo: Retrieve and Co-segment for Zero-shot Transfer

Home Page: https://www.robots.ox.ac.uk/~vgg/research/reco/

Problem of evaluating ReCo on CityScapes

xs1317 opened this issue

Hello, thanks for sharing your fantastic work.

I evaluated ReCo on the Cityscapes validation split but didn't get the same result as in your paper. Here are my results and configs:

Results: deit_s_16_sin_in_train_ce_ta_dc/results_crf.json

"Pixel Acc": 0.7456942998704943, "Mean IoU": 0.19347486644280978
I get a better Pixel Acc and a worse mIoU.
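
For reference, here is a minimal sketch of how I compute these two metrics from a confusion matrix; it is my own code rather than this repo's evaluation code, which may differ in details such as ignored classes:

    import numpy as np

    def metrics_from_confusion(conf: np.ndarray):
        """Pixel accuracy and mean IoU from a (C, C) confusion matrix,
        where conf[gt, pred] counts pixels."""
        tp = np.diag(conf)
        pixel_acc = tp.sum() / conf.sum()
        # Per-class IoU = TP / (TP + FP + FN); skip classes that never occur.
        union = conf.sum(axis=0) + conf.sum(axis=1) - tp
        valid = union > 0
        mean_iou = (tp[valid] / union[valid]).mean()
        return pixel_acc, mean_iou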

configs:

        imagenet_split: "train"  
        dataset_name: "cityscapes"  
        clip_arch: "ViT-L/14@336px" 
        dense_clip_arch: "RN50x16"  
        dense_clip_inference: true
        encoder_arch: "DeiT-S/16-SIN"  
        patch_size: 16
        context_categories: ["tree", "sky", "building", "road", "person"]
        context_elimination: true
        text_attention: true 

I directly downloaded the reference image embeddings for Cityscapes and the pre-trained models.
The problem should not be caused by image preprocessing, because I got matching results when evaluating ReCo+ on Cityscapes.
Would you help me solve this problem? Thanks!
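
In case it helps with debugging, this is roughly how I inspected the downloaded embedding file; the filename and the assumption that it is a dict of tensors are mine, since I don't know the file's exact layout:

    import torch

    # Hypothetical filename; I don't know the exact structure of the
    # released file, so this just prints whatever is inside.
    embeddings = torch.load("cityscapes_reference_embeddings.pt", map_location="cpu")
    if isinstance(embeddings, dict):
        for key, value in embeddings.items():
            print(key, getattr(value, "shape", type(value)))
    else:
        print(type(embeddings), getattr(embeddings, "shape", None))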

Hi @xs1317 and thank you for your kind words and interest in our work.

I'm having difficulties understanding your question as it seems you got the same results as the ones reported in the paper after rounding, i.e., 74.6 and 19.3 for pixel accuracy and mIoU (in percent).
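
Concretely, rounding your raw fractions to one decimal place in percent reproduces the paper's numbers:

    # Rounding the raw fractions from results_crf.json to the paper's precision.
    print(round(0.7456942998704943 * 100, 1))   # 74.6 (pixel accuracy)
    print(round(0.19347486644280978 * 100, 1))  # 19.3 (mIoU)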

Could you please double-check whether your question is correct, or whether by any chance you got the numbers confused with another dataset such as KITTI-STEP?

Kind regards,
Noel

Sorry, I was too quick to reply to you earlier - I noticed that the numbers you compared to are the ones in the revised version rather than the arXiv version. The difference is: in the new version, ReCo was evaluated at original resolutions using the inference code of Drive&Segment (https://github.com/vobecant/DriveAndSegment) for a fair comparison, whereas ReCo+ is evaluated at 320×320 pixels as in STEGO (https://arxiv.org/abs/2203.08414). As the image pre-processing used in this repo is for the latter case, there are some differences as you noted.
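
For illustration, the two settings differ roughly as below; the exact transforms live in the Drive&Segment and STEGO code, so the normalisation constants (ImageNet statistics) and the interpolation mode here are placeholders:

    import torchvision.transforms as T
    from torchvision.transforms import InterpolationMode

    # ReCo (revised version): evaluate at the original image resolution,
    # only converting to a tensor and normalising.
    reco_eval = T.Compose([
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    # ReCo+ (and this repo's pre-processing): resize to 320x320 as in STEGO.
    reco_plus_eval = T.Compose([
        T.Resize((320, 320), interpolation=InterpolationMode.BILINEAR),
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])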

Noel

Hi @NoelShin, thanks for your reply.
I read the arXiv version and got the same result on Cityscapes. There's another problem when I evaluated ReCo on the KITTI-STEP validation split: the result is 72.0 and 31.0 for pixel accuracy and mIoU (in percent), which is better than the numbers in the arXiv version, and I can't find any reported result matching it.
The exact results are "Pixel Acc": 0.7199210254853453, "Mean IoU": 0.30981611409230164.
The configs are the same as in my original issue, and the result of ReCo+ on KITTI-STEP matches the result in the arXiv version.

Thank you very much for letting me know - I could confirm that the embedding file produces the numbers you got. To double-check the code, I re-implemented the computation of the image embeddings for KITTI-STEP and, in that case, obtained the correct numbers as reported in our paper. (I think I mistakenly uploaded a different embedding file from some other setting I tried during the ablation studies.) Please download the embedding file again and verify that you get the exact numbers. If not, please let me know again. :)

Download link
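
If you want to be sure you are reading the re-uploaded file rather than a cached copy, hashing the file is a quick check (the filename is a placeholder, and there is no official checksum, so compare against a freshly downloaded copy):

    import hashlib

    def sha256sum(path: str) -> str:
        """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
        digest = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(1 << 20), b""):
                digest.update(chunk)
        return digest.hexdigest()

    print(sha256sum("kitti_step_reference_embeddings.pt"))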

Hi @NoelShin, thanks for your help.
I downloaded the embedding again and the result now matches your paper. I'm curious which setting the original embedding file came from; if it's convenient, please let me know.