kazuto1011 / deeplab-pytorch

PyTorch re-implementation of DeepLab v2 on COCO-Stuff / PASCAL VOC datasets

Poorer performance on VOC dataset

veizgyauzgyauz opened this issue · comments

When I trained DeepLab v2 on the PASCAL VOC dataset, I followed every step you recommended and used the default settings. However, after 20000 iterations, the mIoU on the validation set is only ~70% (~71% after CRF), which is much lower than the numbers reported in this repository.

I checked all the steps and can't find any change I might have made that would affect the performance. Could you tell me how to achieve ~77% mIoU on VOC? Maybe I missed some key training strategy.

commented

Maybe you need to replace the ground truth used for training with the ground truth used for validation (the labels with eroded boundaries).
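If it helps, a quick way to tell whether a label map uses the eroded-boundary encoding is to check for the VOC ignore label 255, which marks the boundary band in the official segmentation ground truth. A minimal sketch, assuming masks loaded as NumPy arrays (the example mask below is hypothetical):

```python
import numpy as np

def has_boundary_ignore(mask):
    """Return True if the mask contains the VOC ignore label (255),
    which marks the eroded boundary band around objects."""
    return bool((np.asarray(mask) == 255).any())

# Hypothetical 4x4 mask: class IDs 0/1, with one boundary pixel set to 255.
mask = np.array([[0,   0, 1, 1],
                 [0, 255, 1, 1],
                 [0,   0, 1, 1],
                 [0,   0, 1, 1]], dtype=np.uint8)
```

If 255 never appears in your training masks, boundary pixels are being treated as ordinary class labels rather than ignored, which can shift the trained model's behavior near object edges.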

What about the pre-trained model provided in the README?
https://github.com/kazuto1011/deeplab-pytorch#pascal-voc-2012

Sorry for my late reply. Yes, I used the pre-trained model provided in the README and ran testing directly. I got

{
  "Class IoU": {
    "0": 0.9113745134698907,
    "1": 0.7548100049092207,
    "2": 0.2776642437140412,
    "3": 0.7920075503332934,
    "4": 0.563795441754022,
    "5": 0.7542276295105741,
    "6": 0.9114552179884103,
    "7": 0.7890135201761682,
    "8": 0.8494311737935578,
    "9": 0.33949498209200407,
    "10": 0.7794567412074204,
    "11": 0.5943384756179321,
    "12": 0.7938190258452087,
    "13": 0.7703424174613404,
    "14": 0.7382430275456281,
    "15": 0.7793527737300838,
    "16": 0.5481757602864855,
    "17": 0.7822673523992802,
    "18": 0.48848244811029945,
    "19": 0.8376429456716374,
    "20": 0.6833195377677724
  },
  "Frequency Weighted IoU": 0.8683934872494381,
  "Mean Accuracy": 0.863490887130375,
  "Mean IoU": 0.7018435611135367,
  "Pixel Accuracy": 0.9233609260634299
}

without CRF and

{
  "Class IoU": {
    "0": 0.9185947560528145,
    "1": 0.7932974029185603,
    "2": 0.28935500540235826,
    "3": 0.8134413220566067,
    "4": 0.5817749729606366,
    "5": 0.7717773328834131,
    "6": 0.9150898919980759,
    "7": 0.7970463399174688,
    "8": 0.8626382896898597,
    "9": 0.35755957861652815,
    "10": 0.7968930911567433,
    "11": 0.6120302892044005,
    "12": 0.8142565295271454,
    "13": 0.7937575856786301,
    "14": 0.749067429869795,
    "15": 0.7954213089776719,
    "16": 0.5768634716444693,
    "17": 0.8067201509800754,
    "18": 0.4941964832793712,
    "19": 0.8447897899532443,
    "20": 0.6998212263610648
  },
  "Frequency Weighted IoU": 0.8776901827779566,
  "Mean Accuracy": 0.8649591453224956,
  "Mean IoU": 0.7183043928156635,
  "Pixel Accuracy": 0.9295523734432205
}

after CRF.
The results are lower than those reported in the README (76.65% mIoU without CRF, 77.93% with CRF).
Any idea about that?
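As a side note, the "Mean IoU" in these score files is just the unweighted average of the 21 per-class IoU values, so the output is easy to sanity-check. A small script using the no-CRF numbers posted above:

```python
# Per-class IoU values copied from the no-CRF result above.
class_iou = {
    "0": 0.9113745134698907, "1": 0.7548100049092207,
    "2": 0.2776642437140412, "3": 0.7920075503332934,
    "4": 0.563795441754022, "5": 0.7542276295105741,
    "6": 0.9114552179884103, "7": 0.7890135201761682,
    "8": 0.8494311737935578, "9": 0.33949498209200407,
    "10": 0.7794567412074204, "11": 0.5943384756179321,
    "12": 0.7938190258452087, "13": 0.7703424174613404,
    "14": 0.7382430275456281, "15": 0.7793527737300838,
    "16": 0.5481757602864855, "17": 0.7822673523992802,
    "18": 0.48848244811029945, "19": 0.8376429456716374,
    "20": 0.6833195377677724,
}

# Unweighted mean over the 21 classes; should agree with "Mean IoU".
mean_iou = sum(class_iou.values()) / len(class_iou)
```

This confirms the file is internally consistent; the gap to the README numbers therefore comes from the data or pipeline, not from the metric computation.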

@veizgyauzgyauz Hey! I tested the model provided in the README and got the same results as yours. I also tried to train the model myself on the voc_aug data, starting from the ImageNet-pretrained model, but the test results didn't get better. Have you solved the problem?

Oh! I got 76.65 mIoU on the val set, the same as the result reported in the README.

The previous poor results were because I accidentally used weakly supervised data during testing; I had modified the code for a weakly supervised setting a few days ago.