probabilists / zuko

Normalizing flows in PyTorch

Home Page: https://zuko.readthedocs.io

Sampling from a NAF gives inf

MouzaouiMatthieu opened this issue · comments

Description

Thank you for the package, which has been useful for my current internship with @plcodrigues. However, I have encountered an issue and would like to know whether I am using the package correctly.
I am trying to reproduce the code from D. Ward et al., "Robust Neural Posterior Estimation and Statistical Model Criticism", and at one point I need to train an unconditional NAF on a two-dimensional training set. Once trained, sampling from the flow often produces inf.

Reproduce

I found that the following code reproduces the issue:

import torch
import zuko

# Two-dimensional training set, standardized to zero mean and unit variance
train_set = torch.distributions.Normal(0.0, 25.0).sample((10_000, 2))
train_set = (train_set - train_set.mean(0)) / train_set.std(0)
train_loader = torch.utils.data.DataLoader(train_set, batch_size=256, shuffle=True)

flow = zuko.flows.NAF(features=2, context=0)  # unconditional flow
optimizer = torch.optim.AdamW(flow.parameters(), lr=1e-3)

# One pass over the training set, minimizing the negative log-likelihood
for x in train_loader:
    loss = -flow().log_prob(x).mean()

    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# Sampling from the trained flow sometimes produces inf
print(flow().sample((10_000,)).isinf().any())

Environment

  • Zuko version: 0.2.0
  • PyTorch version: 2.0.0+cu117
  • Python version: 3.8.16
  • OS: Windows 10

Thank you very much for the bug report! I'll look into it. At first sight, this is likely due to the numerical inversion of the neural transformations, which is not always well behaved.

What happens is that the neural transformations (NeuralAutoregressiveTransform) expect their inputs to lie within $[-5, 5]$ and map them to a learnable interval $[a, b]$. To stack several NeuralAutoregressiveTransform, it is therefore necessary to insert clipping transformations (SoftclipTransform) in between.
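
One way to see this stacking is to instantiate a NAF and print it; the printed module tree should list the stacked transforms (the exact names and layout depend on the Zuko version):

import zuko

# Inspect the composition of a NAF flow; the module tree should show the
# autoregressive neural transforms with clipping transforms in between
# (exact names and nesting may differ between Zuko versions).
flow = zuko.flows.NAF(features=2, context=0)
print(flow)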

In the forward pass (log_prob) this is never an issue. At sampling time, however, if intermediate values fall outside the range $[a, b]$, the numerical inverse does not map them into $[-5, 5]$ but clips them to either $-5$ or $5$. When these clipped values are then passed to the inverse of SoftclipTransform, they are mapped to $\pm\infty$. In summary, the issue arises when $[a, b]$ does not cover the output domain, which can happen when a transformation is poorly trained (as in your example) or overfits the training data.
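
To make the mechanism concrete, here is a toy soft-clipping map onto $(-5, 5)$ whose inverse diverges at the boundary. This is only an illustration of the failure mode, not Zuko's actual SoftclipTransform:

import torch

def softclip(x, bound=5.0):
    # toy soft clip: maps the real line into (-bound, bound)
    return bound * torch.tanh(x / bound)

def softclip_inv(y, bound=5.0):
    # inverse of the toy soft clip: diverges as |y| approaches bound
    return bound * torch.atanh(y / bound)

y = torch.tensor([4.0, 4.9, 5.0])  # 5.0 mimics a value clipped to the boundary
print(softclip_inv(y))  # the last entry is inf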

I don't think it is possible to fix the core of the issue ($[a, b]$ not covering the output domain): it is a drawback of the method itself. However, it is possible to handle the failure more gracefully and prevent $\infty$ values. I'll submit a PR to fix this soon.
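
In the meantime, a possible user-side workaround is to discard non-finite draws after sampling (a sketch, not the actual fix):

# Reusing `flow` from the snippet above: keep only samples whose
# coordinates are all finite.
x = flow().sample((10_000,))
x = x[x.isfinite().all(dim=-1)]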

Thank you very much for your (quick!) answer.

This should be fixed in the latest version.

pip install git+https://github.com/francois-rozet/zuko