graphdeeplearning / graphtransformer

Graph Transformer Architecture. Source code for "A Generalization of Transformer Networks to Graphs", DLG-AAAI'21.

Home Page: https://arxiv.org/abs/2012.09699


about attention

xinsheng44 opened this issue

[image]
[image]
Hello, a question about the attention computation: in the paper's formula, attention is computed only between node i and its adjacent nodes, but in the code it looks like attention is computed over all nodes, without distinguishing whether two nodes are connected. Is there a problem with my understanding?
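For reference, the formula I mean (rewritten here from the paper as I read it, so the indexing may differ slightly from the original) restricts the softmax to the neighborhood of node i:

w_{ij}^{k,\ell} = \mathrm{softmax}_{j \in \mathcal{N}(i)} \left( \frac{Q^{k,\ell} h_i^{\ell} \cdot K^{k,\ell} h_j^{\ell}}{\sqrt{d_k}} \right)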

Hi @lao-sheng,

The code section you highlighted gives the outputs of the Q, K, V projections.

Restricting each node's attention to its local neighborhood is handled by the following function, which is called right after the projections.

self.propagate_attention(g)

def propagate_attention(self, g):
    # (fn is dgl.function and np is numpy, imported at module level in the repo.)
    # Compute attention scores on edges only: apply_edges evaluates the
    # K.Q dot product once per existing edge (i, j), so non-adjacent
    # node pairs never receive a score.
    g.apply_edges(src_dot_dst('K_h', 'Q_h', 'score'))
    g.apply_edges(scaled_exp('score', np.sqrt(self.out_dim)))
    # Send weighted values to target nodes, again along edges only.
    eids = g.edges()
    g.send_and_recv(eids, fn.src_mul_edge('V_h', 'score', 'V_h'), fn.sum('V_h', 'wV'))
    g.send_and_recv(eids, fn.copy_edge('score', 'score'), fn.sum('score', 'z'))
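To see concretely that nothing is computed for non-adjacent pairs, here is a minimal standalone sketch. The toy graph, the feature size, and the inline edge function are mine for illustration (not taken from the repo); the inline lambda plays the role of src_dot_dst('K_h', 'Q_h', 'score'):

import torch
import dgl

# Toy directed graph with 3 nodes and only two edges: 0->1 and 1->2.
g = dgl.graph((torch.tensor([0, 1]), torch.tensor([1, 2])))

# Random Q/K projections per node (hypothetical feature size 8).
g.ndata['Q_h'] = torch.randn(3, 8)
g.ndata['K_h'] = torch.randn(3, 8)

# apply_edges evaluates the dot product of the source node's K with the
# destination node's Q, once per existing edge.
g.apply_edges(lambda edges: {
    'score': (edges.src['K_h'] * edges.dst['Q_h']).sum(-1, keepdim=True)
})

# One score per edge; the non-adjacent pair (0, 2) never gets a score.
print(g.edata['score'].shape)  # torch.Size([2, 1])

The same holds for send_and_recv: messages flow only along eids, so wV and z aggregate over each node's in-neighbors, and the final per-node output wV / z is exactly a softmax-weighted sum over the local neighborhood.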

Oh, I see. Thanks!