lonePatient / NeZha_Chinese_PyTorch

NEZHA: Neural Contextualized Representation for Chinese Language Understanding

The two self.relative_positions_encoding[:to_seq_length, :to_seq_length, :].to(hidden_states.device) calls hurt performance too much

huangyc0618 opened this issue

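These calls consume a lot of CPU resources and time. I suggest moving the tensor to the device once, right after it is initialized in init, instead of on every forward pass.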
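
One possible way to do this, sketched under assumptions rather than taken from the repository: register the table as a buffer so that it moves with the module when `model.to(device)` is called, which makes the per-forward `.to(hidden_states.device)` unnecessary. The class name `RelativePositionsEncoding`, its constructor arguments, and the zero-filled table are hypothetical stand-ins; only the attribute name `relative_positions_encoding` and the slicing pattern come from the issue title.

```python
import torch
from torch import nn


class RelativePositionsEncoding(nn.Module):
    """Hypothetical stand-in for the module discussed in this issue."""

    def __init__(self, max_length: int, depth: int):
        super().__init__()
        # Placeholder values (assumption): the real model would fill this
        # table with its actual relative position encodings.
        table = torch.zeros(max_length, max_length, depth)
        # A registered buffer follows the module on model.to(device), so it
        # is copied to the device once instead of on every forward pass.
        # persistent=False keeps it out of state_dict, so the checkpoint
        # keys stay the same as with a plain tensor attribute.
        self.register_buffer("relative_positions_encoding", table, persistent=False)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        to_seq_length = hidden_states.size(1)
        # The buffer already lives on the module's device: no .to() here.
        return self.relative_positions_encoding[:to_seq_length, :to_seq_length, :]
```

With this layout, a single `model.to(device)` after construction moves the table along with the weights. The slice in `forward` is then a view on the device tensor rather than a fresh host-to-device copy.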