cvignac / DiGress

Code for the paper "DiGress: Discrete Denoising diffusion for graph generation"


Validation results show nan all the time

FairyFali opened this issue:

I encountered a strange result during validation. The result is:

Starting train epoch...
Epoch X: Val NLL nan -- Val Atom type KL nan -- Val Edge type KL: nan
Val loss: nan Best val loss: 100000000.0000

The NLL is always nan. Why does this happen?

Hello, this is not normal. What command are you running? Is it on a custom dataset?

I ran the command mentioned in the README file. It is on QM9 with discrete noise. Specifically, it is python3 main.py.

I'm not sure where the exact issue came from (probably a different behavior of mask_distributions with recent Python versions), but it's now fixed. You can use the latest commit.

I think I figured it out: functions like kl_prior and compute_Lt call .log() / torch.log without accounting for entries that are exactly zero, so log(0) is what produces the nan values.
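
For illustration, here is a minimal sketch of that failure mode (the tensors and the epsilon value are made up for the example, not taken from the repo): when a distribution contains exact zeros, the zero-probability terms in a hand-written KL evaluate to nan, and clamping the probabilities before taking the log avoids it.

```python
import torch

# Target and predicted distributions with an entry that is exactly zero.
p = torch.tensor([0.5, 0.5, 0.0])
q = torch.tensor([0.7, 0.3, 0.0])

# Naive KL: the third entry gives 0 * (log 0 - log 0),
# i.e. 0 * (-inf - (-inf)) = 0 * nan = nan, which poisons the sum.
kl_naive = (p * (p.log() - q.log())).sum()
print(kl_naive)  # tensor(nan)

# Guarded KL: clamp the probabilities away from zero before the log.
eps = 1e-12
kl_safe = (p * (p.clamp(min=eps).log() - q.clamp(min=eps).log())).sum()
print(kl_safe)  # finite value
```

Clamping (or adding a small epsilon and renormalizing) is one common guard; masking out the zero-probability entries before summing is another.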