Reduce_sum instead of reduce_mean
Kh4L opened this issue
Serge Panev commented
Hi @altosaar
First, thank you for this small tutorial on using tf.distributions with a VAE!
elbo = tf.reduce_sum(expected_log_likelihood - kl, 0)
I wanted to know why you chose to use reduce_sum instead of reduce_mean when computing the ELBO?
Thanks!
Jaan Lı 李 PhD commented
reduce_mean also works. It might have an empirical advantage if the gradients are very large, since averaging divides the objective (and therefore its gradient) by the number of datapoints being reduced over. I chose the sum to match the math, which hopefully makes the code easier to understand: the ELBO for the dataset is the sum of the per-datapoint ELBOs.
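
To make the relationship between the two reductions concrete, here is a minimal sketch in plain Python (no TensorFlow needed). The per-datapoint values, the quadratic toy objective, and the names grad_sum and grad_mean are all hypothetical stand-ins chosen for illustration, not the tutorial's actual ELBO terms. It only shows that the mean is the sum divided by the number of datapoints N, so the two objectives share the same optimum and their gradients differ by a factor of 1/N.

```python
# Hypothetical per-datapoint ELBO values (illustrative numbers, not from a real VAE).
per_datapoint_elbo = [-1.5, -2.0, -0.5, -3.0]
n = len(per_datapoint_elbo)

elbo_sum = sum(per_datapoint_elbo)   # plays the role of tf.reduce_sum(..., 0)
elbo_mean = elbo_sum / n             # plays the role of tf.reduce_mean(..., 0)

# Toy objective (an assumption for this sketch): f_i(theta) = -(x_i - theta)^2 / 2,
# so d f_i / d theta = x_i - theta.
xs = [1.0, 2.0, 3.0, 6.0]


def grad_sum(theta, data):
    """Gradient of the summed toy objective with respect to theta."""
    return sum(x - theta for x in data)


def grad_mean(theta, data):
    """Gradient of the averaged toy objective: the summed gradient divided by N."""
    return grad_sum(theta, data) / len(data)


theta = 0.5
print("sum objective: ", elbo_sum, " gradient:", grad_sum(theta, xs))
print("mean objective:", elbo_mean, " gradient:", grad_mean(theta, xs))
```

With plain gradient ascent, switching from the sum to the mean is therefore equivalent to dividing the learning rate by N. Adaptive optimizers rescale gradients internally, so the practical difference there may be smaller.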