hardmaru / WorldModelsExperiments

World Models Experiments

Question about the VAE's KL-loss

kaiolae opened this issue · comments

Hi!
I'm trying to reproduce the Doom example in Keras, and was curious about the KL-loss calculation in the VAE, specifically the kl_tolerance parameter. As far as I understand, it prevents the KL loss from ever going below 32. What is the purpose of this? What effect would removing this tolerance have?
Thanks, and thanks for a very well written paper!
-Kai

Hi @kaiolae

Thanks for the comment. Someone else has asked me this before (see discussion)

Basically, I stop optimizing the KL loss term once it is lower than some level, rather than letting it go to near zero. So I optimize tf.maximum(KL, good_enough_kl_level) instead, which relaxes the information bottleneck of the VAE.

This method was inspired by the "free bits" concept in the appendix of this paper: https://arxiv.org/abs/1606.04934. I also did this in the sketch-rnn paper.
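For reference, here is a minimal NumPy sketch of the clamped KL term described above. The function name `vae_kl_loss` and the batch-averaging convention are illustrative assumptions; the values kl_tolerance = 0.5 and z_size = 64 (giving a floor of 32) follow the World Models setup. In a real training graph, the max would be taken on tensors so that the KL term stops contributing gradients once it falls below the floor.

```python
import numpy as np

def vae_kl_loss(mu, logvar, kl_tolerance=0.5, z_size=64):
    """KL divergence between N(mu, exp(logvar)) and N(0, I),
    clamped from below ("free bits" style).

    mu, logvar: arrays of shape (batch, z_size).
    """
    # Closed-form KL for a diagonal Gaussian vs. standard normal,
    # summed over latent dimensions.
    kl = -0.5 * np.sum(1.0 + logvar - mu**2 - np.exp(logvar), axis=1)
    kl = np.mean(kl)  # average over the batch
    # Stop optimizing once KL drops below kl_tolerance * z_size
    # (0.5 * 64 = 32 here), instead of letting it go to near zero.
    return np.maximum(kl, kl_tolerance * z_size)
```

With mu = 0 and logvar = 0, the true KL is 0, so the clamped loss returns the floor of 32; for a posterior far from the prior, the loss passes through unchanged.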