probabilists / zuko

Normalizing flows in PyTorch

Home Page: https://zuko.readthedocs.io

Sampling from a NAF gives inf

MouzaouiMatthieu opened this issue · comments

Description

Thank you for the package, which has been useful for my current internship with @plcodrigues. However, I have encountered an issue and would like to know whether I am using the package correctly.
I am trying to reproduce the code from D. Ward et al., "Robust Neural Posterior Estimation and Statistical Model Criticism", and at one point I need to train an unconditional NAF on a two-dimensional training set. Once trained, sampling from the flow often produces inf.

Reproduce

I found that the following code reproduces the issue:

import torch
import zuko

# Two-dimensional training set, standardized to zero mean and unit variance
train_set = torch.distributions.Normal(0.0, 25.0).sample((10_000, 2))
train_set = (train_set - train_set.mean(0)) / train_set.std(0)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True)

flow = zuko.flows.NAF(features=2, context=0)  # unconditional flow
optimizer = torch.optim.AdamW(flow.parameters(), lr=1e-3)

# One pass over the training set, minimizing the negative log-likelihood
for x in train_loader:
    loss = -flow().log_prob(x).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Sampling from the trained flow sometimes produces inf
print(flow().sample((10_000,)).isinf().any())

Environment

  • Zuko version: 0.2.0
  • PyTorch version: 2.0.0+cu117
  • Python version: 3.8.16
  • OS: Windows 10

Thank you very much for the bug report! I'll look into it. At first sight, this is likely due to the numerical inversion of the neural transformations, which is not always well behaved.

What happens is that the neural transformations (NeuralAutoregressiveTransform) expect their inputs to lie within $[-5, 5]$ and map them to a learnable interval $[a, b]$. To stack several NeuralAutoregressiveTransform, it is therefore necessary to insert clipping transformations (SoftclipTransform) in between.
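
One way to see this stacking is to instantiate a NAF and print it; the printed module tree should list the stacked transforms (the exact names and layout depend on the Zuko version):

import zuko

# Inspect the composition of a NAF flow; the module tree should show the
# autoregressive neural transforms with clipping transforms in between
# (exact names and nesting may differ between Zuko versions).
flow = zuko.flows.NAF(features=2, context=0)
print(flow)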

In the forward pass (log_prob) this is never an issue. At sampling time, however, if intermediate values fall outside the range $[a, b]$, the numerical inverse does not map them into $[-5, 5]$ but clips them to either $-5$ or $5$. When these clipped values are then passed to the inverse of SoftclipTransform, they are mapped to $\pm\infty$. In summary, the issue arises when $[a, b]$ does not cover the output domain, which can happen when a transformation is poorly trained (as in your example) or overfits the training data.
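
To make the mechanism concrete, here is a toy soft-clipping map onto $(-5, 5)$ whose inverse diverges at the boundary. This is only an illustration of the failure mode, not Zuko's actual SoftclipTransform:

import torch

def softclip(x, bound=5.0):
    # toy soft clip: maps the real line into (-bound, bound)
    return bound * torch.tanh(x / bound)

def softclip_inv(y, bound=5.0):
    # inverse of the toy soft clip: diverges as |y| approaches bound
    return bound * torch.atanh(y / bound)

y = torch.tensor([4.0, 4.9, 5.0])  # 5.0 mimics a value clipped to the boundary
print(softclip_inv(y))  # the last entry is inf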

I don't think it is possible to fix the core of the issue ($[a, b]$ not covering the output domain): it is a drawback of the method itself. However, it is possible to handle the failure more gracefully and prevent $\infty$ values. I'll submit a PR to fix this soon.
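
In the meantime, a possible user-side workaround is to discard non-finite draws after sampling (a sketch, not the actual fix):

# Reusing `flow` from the snippet above: keep only samples whose
# coordinates are all finite.
x = flow().sample((10_000,))
x = x[x.isfinite().all(dim=-1)]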

Thank you very much for your (quick!) answer.

This should be fixed in the latest version.

pip install git+https://github.com/francois-rozet/zuko