brain-research / realistic-ssl-evaluation

Open source release of the evaluation benchmark suite described in "Realistic Evaluation of Deep Semi-Supervised Learning Algorithms"

VAT implementation is wrong

AskAukNuTutor opened this issue

Thank you for your code.

Following the original VAT paper, consistency_func should be reverse_kl for VAT, but it is set to forward_kl in your code.

The adversarial noise r in VAT is obtained by maximizing D_KL(p(y|x)||p(y|x+r)). However, when consistency_func=forward_kl, the consistency loss is D_KL(p(y|x+r)||p(y|x)), with the arguments swapped. I think this matters because KL divergence is asymmetric.
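To illustrate the asymmetry, here is a minimal, self-contained sketch in plain Python. The logits are made-up numbers standing in for p(y|x) and p(y|x+r); the helper names `softmax` and `kl` are hypothetical and are not taken from this repository.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl(p, q):
    """D_KL(p || q) for two discrete distributions with full support."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


# Made-up logits for illustration only (assumption, not model outputs).
p_clean = softmax([2.0, 0.5, -1.0])  # stands in for p(y|x)
p_adv = softmax([0.5, 1.5, -0.5])    # stands in for p(y|x+r)

kl_clean_to_adv = kl(p_clean, p_adv)  # D_KL(p(y|x) || p(y|x+r))
kl_adv_to_clean = kl(p_adv, p_clean)  # D_KL(p(y|x+r) || p(y|x))

print("D_KL(p(y|x) || p(y|x+r)) =", kl_clean_to_adv)
print("D_KL(p(y|x+r) || p(y|x)) =", kl_adv_to_clean)
```

The two printed values differ, so a perturbation r chosen to maximize one direction is not, in general, the one that maximizes the other.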

We tried both and forward_kl worked better.

IMO, consistency_func should not be treated as a hyper-parameter to be tuned, since it is part of the VAT model itself. If you do consider consistency_func a hyper-parameter, that should be noted in Table 4 of your NIPS'18 paper.

For instance, I compared the VAT+EntMin with the following two settings in the CIFAR10-4000 scenario:
Setting-A: consistency_func=forward_kl and max_cons_multiplier=0.3 (original parameters)
Setting-B: consistency_func=reverse_kl and max_cons_multiplier=1.0 (modified parameters)

As a result, I observed that VAT+EntMin with Setting-B outperformed Setting-A by about 2 percentage points in test error rate (11.7% vs 13.7%). Of course, this is the result of a single run, so I do not claim that Setting-B outperforms Setting-A in general.