cvignac / DiGress

code for the paper "DiGress: Discrete Denoising diffusion for graph generation"


Cross-entropy in minibatching

najwalb opened this issue · comments

I have a question about your use of cross-entropy over nodes/edges when mini-batching graphs. If I understood your implementation correctly, to compute the loss for one minibatch you compute the cross-entropy of each node and each edge in the minibatch of graphs, average these cross-entropies over the entire minibatch, and then combine them via the following formula: $L_{ce} = L_{nodes} + \lambda L_{edges}$ (same as your Equation 3).

To me $L_{ce}$ represents the loss for one graph, so I think you should first sum the losses for nodes and edges per graph, then take the mean of such sums over a minibatch. What do you think?
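For concreteness, a minimal sketch of the batch-wide averaging described above, assuming dense padded tensors with boolean masks; the names, shapes, and the helper `batch_ce` are illustrative assumptions, not the repository's actual code:

```python
import torch.nn.functional as F

def batch_ce(node_logits, node_targets, edge_logits, edge_targets,
             node_mask, edge_mask, lam=1.0):
    """One mean over every valid node / edge of the whole minibatch,
    then L_nodes + lambda * L_edges as in Equation 3."""
    # node_logits: (B, N, Cx), node_targets: (B, N), node_mask: (B, N) bool
    node_ce = F.cross_entropy(node_logits[node_mask], node_targets[node_mask])
    # edge_logits: (B, N, N, Ce), edge_targets: (B, N, N), edge_mask: (B, N, N) bool
    edge_ce = F.cross_entropy(edge_logits[edge_mask], edge_targets[edge_mask])
    return node_ce + lam * edge_ce
```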

Cool, thanks for your reply. I am not sure I understand the intuition behind why summing over all nodes and edges favors bigger graphs. Is it because, with my method, bigger graphs would have a higher CE and would thus be penalized more?

I am actually proposing a sum over the nodes and edges of one graph, then taking the mean over the batch. So the loss per batch would be: $L = \frac{1}{N} \sum_{i=1}^{N} \left( \sum_{n} CE_{n,i} + \lambda \sum_{n,m} CE_{nm,i} \right)$, where $n, m$ are vertices of graph $i$ and $N$ is the number of graphs in the batch. This way, big graphs will still have a large loss and will contribute more to the batch loss, no?
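A rough sketch of this per-graph alternative, under the same assumptions as the snippet above (per-element cross-entropies kept unreduced and summed within each graph before averaging over the batch; the helper `per_graph_ce` is hypothetical):

```python
import torch.nn.functional as F

def per_graph_ce(node_logits, node_targets, edge_logits, edge_targets,
                 node_mask, edge_mask, lam=1.0):
    """Sum the CE over the nodes and edges of each graph, then average over the batch."""
    # Per-element CE, kept unreduced so it can be summed inside each graph.
    # Padded positions are assumed to hold a valid (dummy) class index; they are
    # zeroed out by the masks below.
    node_ce = F.cross_entropy(node_logits.transpose(1, 2), node_targets,
                              reduction='none')                # (B, N)
    edge_ce = F.cross_entropy(edge_logits.permute(0, 3, 1, 2), edge_targets,
                              reduction='none')                # (B, N, N)
    # sum_n CE_{n,i} + lambda * sum_{n,m} CE_{nm,i} for every graph i
    per_graph = (node_ce * node_mask.float()).sum(dim=1) \
                + lam * (edge_ce * edge_mask.float()).sum(dim=(1, 2))
    # Mean over the B graphs of the minibatch: bigger graphs keep a larger loss.
    return per_graph.mean()
```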

Ok, I see your point. Did you try it? Does it work better?