jack57lee / Diversify-MHA

EMNLP 2018: Multi-Head Attention with Disagreement Regularization; NAACL 2019: Information Aggregation for Multi-Head Attention with Routing-by-Agreement

cos_diff_square

jxxiao opened this issue

cos_diff_square is not used in the computation at all.