zihangdai / xlnet

XLNet: Generalized Autoregressive Pretraining for Language Understanding

Why are activation and dropout added after the classification layer?

MrInouye opened this issue