Sigma VAE (Paper)
Sigma VAE is a VAE that automatically balances the reconstruction loss and the KL divergence by learning the variance of the data.
A vanilla VAE requires a balancing hyperparameter to weight the reconstruction loss against the KL divergence.
A value around 0.001 is commonly used,
but it has to be re-tuned for the input size and the latent dimension.
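To make the balancing concrete, here is a minimal sketch of the weighted vanilla VAE loss. The function name `beta_vae_loss` and the default `beta=0.001` are illustrative choices matching the value mentioned above, not code from the paper.

```python
import torch
import torch.nn.functional as F

def beta_vae_loss(x, x_recon, mu, logvar, beta=0.001):
    """Vanilla VAE loss with a balancing hyperparameter beta.

    beta trades off reconstruction against the KL term; ~0.001 is a
    common starting point, but it must be re-tuned whenever the input
    size or latent dimension changes (illustrative sketch).
    """
    # Reconstruction error: sum over pixels, mean over the batch.
    recon = F.mse_loss(x_recon, x, reduction="none").flatten(1).sum(dim=1).mean()
    # KL divergence between q(z|x) = N(mu, sigma^2) and the prior N(0, I).
    kl = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).sum(dim=1).mean()
    return recon + beta * kl
```

Note how the reconstruction term scales with the number of pixels while the KL term scales with the latent dimension, which is exactly why beta needs re-tuning per setup.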
In addition, a vanilla VAE may not train well without a KL annealing (warm-up) schedule, even when the balancing hyperparameter is used.
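A typical KL annealing schedule ramps the KL weight linearly from zero; a minimal sketch follows. The values `warmup_steps=10_000` and `beta_max=0.001` are assumed for illustration, not taken from the paper.

```python
def kl_anneal_weight(step, warmup_steps=10_000, beta_max=0.001):
    """Linear KL warm-up: ramp the KL weight from 0 up to beta_max.

    Illustrative schedule; warmup_steps and beta_max are yet more
    hyperparameters that have to be tuned per model.
    """
    return beta_max * min(1.0, step / warmup_steps)
```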
Re-tuning these hyperparameters for every VAE training run was tedious, and Sigma VAE removes the problem entirely.
Because Sigma VAE learns the variance of the data directly, it trains stably without any of these hyperparameters.
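A minimal sketch of the idea: instead of MSE plus a weighted KL term, use the full Gaussian negative log-likelihood with a learned, shared `log_sigma` (e.g. an `nn.Parameter`). The function below is my illustration of this, not the paper's code; learning sigma effectively rescales the reconstruction term, which is what makes a hand-tuned beta unnecessary.

```python
import math
import torch

def gaussian_nll(x, x_recon, log_sigma):
    """Gaussian NLL with a single learned log sigma shared over pixels.

    log_sigma would be a learnable nn.Parameter in a real model
    (illustrative sketch of the Sigma VAE reconstruction term).
    """
    nll = 0.5 * (((x - x_recon) / log_sigma.exp()) ** 2
                 + 2 * log_sigma
                 + math.log(2 * math.pi))
    # Sum over pixels, mean over the batch; add the unweighted KL term
    # of the usual ELBO to get the full training loss.
    return nll.flatten(1).sum(dim=1).mean()
```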
Even more surprisingly, you can replace the learned variance with a simple analytic estimate of the data variance and still get stable training.
This variant is very easy to implement, and it works just as well as learning the variance directly.