graphdeeplearning / graphtransformer

Hi there,

I was reading your graphtransformer code and I'm curious about the operation shown below. Why do you divide the wV term by w (the so-called 'score' term)? I don't see a corresponding term in equation 4 or equation 9 of the paper. Could you explain that?

graphtransformer/layers/graph_transformer_edge_layer.py

Line 112 in c9cd493

    
           h_out = g.ndata['wV'] / (g.ndata['z'] + torch.full_like(g.ndata['z'], 1e-6)) # adding eps to all values here

Thanks

Hi @sperfu, it is part of the softmax term. Please refer to this issue for pointers to the explanation.
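
To make the "part of the softmax" point concrete, here is a minimal sketch in plain Python, not the repository's code. It assumes that the per-edge score is an exponentiated attention logit, that `wV` is the sum of score-weighted value vectors over a node's incoming edges, and that `z` is the sum of those scores; these assumptions are a reading of the quoted line, not something confirmed in this thread. The function names and the toy numbers are hypothetical. Under those assumptions, `wV / (z + eps)` matches an explicit softmax-weighted sum up to the small `eps` term.

```python
import math

def aggregate_softmax(logits, values):
    """Weighted sum of value vectors using explicit softmax weights."""
    exps = [math.exp(s) for s in logits]
    total = sum(exps)
    weights = [e / total for e in exps]  # softmax over incoming edges
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]

def aggregate_two_step(logits, values, eps=1e-6):
    """Same aggregation split into numerator and denominator: wV / (z + eps)."""
    scores = [math.exp(s) for s in logits]  # assumed unnormalized per-edge score
    dim = len(values[0])
    # Numerator: sum over edges of score * V (plays the role of 'wV').
    wV = [sum(s * v[d] for s, v in zip(scores, values)) for d in range(dim)]
    # Denominator: sum over edges of score (plays the role of 'z').
    z = sum(scores)
    return [x / (z + eps) for x in wV]

# Toy example: one node with three incoming edges and 2-dimensional values.
logits = [0.5, -1.0, 2.0]
values = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]

h_softmax = aggregate_softmax(logits, values)
h_two_step = aggregate_two_step(logits, values)
```

The two results differ only by the effect of `eps`, which is there to avoid division by zero for nodes whose score sum is zero. Splitting the softmax this way fits message passing naturally, since both the numerator and the denominator are plain sums over incoming edges.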

Why did you divide this term?