jack57lee / Diversify-MHA

EMNLP 2018: Multi-Head Attention with Disagreement Regularization; NAACL 2019: Information Aggregation for Multi-Head Attention with Routing-by-Agreement

Question about the sign of the loss function

lsquser opened this issue

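Does the code implement the negative cosine similarity?
In the code, `cos_diff = tf.reduce_mean(cos_diff, axis=[-2,-1]) + 1.0`: what is the purpose of the 1.0 added at the end?
Does cos_diff already include the negative sign, or not? In the paper, the term carries a negative sign.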
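
To make the question about the trailing 1.0 concrete, here is a minimal plain-Python sketch of what a constant offset does to a mean cosine similarity. It is not the repository's code: the helper names (`cosine`, `mean_pairwise_cosine`) and the toy head vectors are hypothetical, and averaging over all head pairs including each head with itself is an assumption. It shows only that a mean cosine in [-1, 1] shifted by 1.0 lands in [0, 2], and that the shift leaves differences between configurations unchanged. It says nothing about which sign the repository applies.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_pairwise_cosine(heads):
    """Mean cosine similarity over all ordered head pairs.

    Assumption: pairs with i == j are included, mimicking a mean over a
    full heads-by-heads similarity matrix.
    """
    n = len(heads)
    total = sum(cosine(heads[i], heads[j]) for i in range(n) for j in range(n))
    return total / (n * n)


# Hypothetical toy "head outputs" (two heads, two dimensions each).
identical_heads = [[1.0, 0.0], [1.0, 0.0]]
orthogonal_heads = [[1.0, 0.0], [0.0, 1.0]]

raw_identical = mean_pairwise_cosine(identical_heads)
raw_orthogonal = mean_pairwise_cosine(orthogonal_heads)

# Adding 1.0 moves a value from [-1, 1] into [0, 2], so it is never negative.
shifted_identical = raw_identical + 1.0
shifted_orthogonal = raw_orthogonal + 1.0

# The offset is a constant, so the gap between configurations is unchanged.
raw_gap = raw_identical - raw_orthogonal
shifted_gap = shifted_identical - shifted_orthogonal
```

Because the offset is constant, it contributes nothing to the gradient. Whether the term is then added to or subtracted from the training loss is a separate matter that this sketch does not address.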