Usage of BCE for likelihood of Gaussian-distributed data?

Question

chedatomasz opened this issue a year ago

In the code for the non-convolutional VQ-VAE, you seem to use BCE as the reconstruction loss for images. If I understand correctly, that corresponds to assuming that each pixel follows a Bernoulli distribution, rather than the usual Gaussian assumption underlying MSE loss. Is this a deliberate choice? The convolutional VQ-VAE, which operates on the same MNIST dataset, uses MSE.
The relevant line: https://github.com/nadavbh12/VQ-VAE/blob/a360e77d43ec43dd5a989f057cbf8e0843bb9b1f/vq_vae/auto_encoder.py#LL158C50-L158C50
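
To make the question concrete, here is a minimal sketch in plain Python (not the repository's code) of the two likelihoods being compared. The function names `bce`, `gaussian_nll`, and `sse`, the pixel values, and the fixed unit variance for the Gaussian are all assumptions made for illustration.

```python
import math

def bce(x, p, eps=1e-12):
    """Negative log-likelihood of targets x under Bernoulli(p), summed over pixels."""
    return -sum(
        t * math.log(max(q, eps)) + (1 - t) * math.log(max(1 - q, eps))
        for t, q in zip(x, p)
    )

def sse(x, mu):
    """Sum of squared errors between targets x and reconstruction mu."""
    return sum((t - m) ** 2 for t, m in zip(x, mu))

def gaussian_nll(x, mu, sigma=1.0):
    """Negative log-likelihood of x under N(mu, sigma^2 I), summed over pixels."""
    const = 0.5 * math.log(2 * math.pi * sigma ** 2)
    return sum(const + (t - m) ** 2 / (2 * sigma ** 2) for t, m in zip(x, mu))

# Hypothetical pixel intensities in [0, 1] and decoder outputs in (0, 1).
target = [0.0, 1.0, 0.5, 0.9]
recon = [0.1, 0.8, 0.5, 0.7]

print("BCE (Bernoulli NLL):", bce(target, recon))
print("Gaussian NLL       :", gaussian_nll(target, recon))
print("0.5 * SSE          :", 0.5 * sse(target, recon))
```

With a fixed variance, the Gaussian negative log-likelihood equals half the squared error plus a constant, so minimizing it is equivalent to minimizing MSE. BCE is still minimized when the reconstruction equals the target, but for non-binary targets its minimum is the entropy of the target rather than zero (for example, `bce([0.5], [0.5])` equals `log 2`).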