Mis-implementation of JS divergence
greatwallet opened this issue
Hi, according to the definition of JS divergence (as given in your supplementary file), the JS divergence is computed as the entropy of the average probabilities minus the average of the per-model entropies.
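For reference, that definition (with $H$ denoting Shannon entropy and $p_m$ the probability map of the $m$-th of $M$ classifiers; the notation is mine) reads:

```math
\mathrm{JSD}(p_1,\dots,p_M) \;=\; H\!\Big(\tfrac{1}{M}\sum_{m=1}^{M} p_m\Big) \;-\; \tfrac{1}{M}\sum_{m=1}^{M} H(p_m)
```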
However, in your code the first term of the JS, i.e., the entropy of the average probabilities, is computed from `mean_seg`, which is defined as the average segmentation map over the 10 outputs of the ensembled `pixel_classifier`s.
Specifically, I have traced the implementation of `mean_seg` (datasetGAN_release/datasetGAN/train_interpreter.py, lines 291 to 294 in dee6d7d) and of `img_seg` (datasetGAN_release/datasetGAN/train_interpreter.py, lines 282 to 284 in dee6d7d).
In fact, the `img_seg` values are all unnormalized probabilities, i.e., logits in the sense of the argument to PyTorch's distributions. I think the code ends up averaging over logits instead of probabilities, since `Sigmoid` is commented out in `pixel_classifier` (datasetGAN_release/datasetGAN/train_interpreter.py, lines 68 to 92 in dee6d7d).
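A minimal sketch of why this matters (shapes and variable names here are illustrative, not the repo's exact code):

```python
import torch
import torch.nn.functional as F

# 10 ensemble members, 5 classes, for a single pixel.
logits = torch.randn(10, 5)

# What averaging the raw outputs amounts to: softmax of the mean logits.
softmax_of_mean = F.softmax(logits.mean(dim=0), dim=0)

# What the definition requires: mean of the per-model softmax probabilities.
mean_of_softmax = F.softmax(logits, dim=1).mean(dim=0)

print(torch.allclose(softmax_of_mean, mean_of_softmax))  # False in general
```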
TL;DR
Softmax does not commute with the linear averaging operation, so interchanging them mis-implements the JS divergence.
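For concreteness, here is a minimal sketch of the corrected computation under the definition above (the function name, the `(M, C, H, W)` shape, and `eps` are my assumptions, not the repo's code):

```python
import torch
import torch.nn.functional as F

def js_divergence(logits: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Per-pixel JSD = H(mean p) - mean H(p) for logits of shape (M, C, H, W)."""
    probs = F.softmax(logits, dim=1)       # per-model probabilities, (M, C, H, W)
    mean_prob = probs.mean(dim=0)          # average probabilities, (C, H, W)
    entropy_of_mean = -(mean_prob * (mean_prob + eps).log()).sum(dim=0)       # (H, W)
    mean_entropy = -(probs * (probs + eps).log()).sum(dim=1).mean(dim=0)      # (H, W)
    return entropy_of_mean - mean_entropy
```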
Thanks a lot for pointing this out. We are looking into it.
@greatwallet Thank you again for pointing this bug out!
We have fixed the bug in commit d9564d4.
The numbers in the README have also been updated.