google-deepmind / neural-processes

This repository contains notebook implementations of the following Neural Process variants: Conditional Neural Processes (CNPs), Neural Processes (NPs), and Attentive Neural Processes (ANPs).

Latent Encoder and Decoder: log_sigma transformation

FabricioArendTorres opened this issue:

Hi,

I was wondering about the reasoning behind the specific parameterization of the standard deviation in the latent encoder, i.e. why is it bounded?
I usually just use a softplus in such settings.
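For concreteness, something like this (a minimal sketch, with log_sigma being the raw network output as in the repo):

    # Plain softplus: sigma is strictly positive and smooth, but unbounded above
    sigma = tf.nn.softplus(log_sigma)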

Also, the parameterizations differ between the encoder and the decoder; I'm not sure if that is intentional.
I'd have assumed both should use tf.sigmoid?

(Latent) Encoder: Bounds SD between 0.1 and 1

    # Compute sigma
    sigma = 0.1 + 0.9 * tf.sigmoid(log_sigma)

Decoder: Bounds SD to be higher than 0.1, but with no upper bound?

    # Bound the variance
    sigma = 0.1 + 0.9 * tf.nn.softplus(log_sigma)

And thank you for that well-documented repository :).

Hi, the difference in the parameterisation of the sigma is not intentional - in this particular 1D regression case, either choice should be fine. In general, the appropriate range of the sigma is problem-dependent, but it usually helps to lower-bound it by some small value > 0 for stability during optimisation. Hope that helps!
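
For illustration, here is a quick numerical comparison of the two transforms (a sketch, not from the repo); both share the 0.1 lower bound, but only the sigmoid version also has an upper bound:

    import tensorflow as tf

    log_sigma = tf.constant([-5.0, 0.0, 5.0])

    # Encoder-style: sigmoid saturates, so sigma stays in (0.1, 1.0)
    sigma_enc = 0.1 + 0.9 * tf.sigmoid(log_sigma)      # ~[0.106, 0.550, 0.994]

    # Decoder-style: softplus is unbounded above, so sigma lies in (0.1, inf)
    sigma_dec = 0.1 + 0.9 * tf.nn.softplus(log_sigma)  # ~[0.106, 0.724, 4.606]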