Pretrained language model and its related optimization techniques developed by Huawei Noah's Ark Lab.
Weries opened this issue 2 years ago · comments
Do we need to distill a 2-layer TinyBERT starting from general distillation, or can we take a 4-layer TinyBERT and, in the second stage (task distillation), distill it directly into a 2-layer TinyBERT?
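For context on what changing the student depth involves: TinyBERT maps each student layer m to a teacher layer via a uniform mapping g(m) = m × (N / M), where M is the number of student layers and N the number of teacher layers. A minimal sketch (function name and code are illustrative, not the repo's actual API):

```python
def layer_mapping(num_student_layers: int, num_teacher_layers: int) -> dict:
    """Uniform TinyBERT-style layer mapping g(m) = m * (N / M).

    Maps each student layer m (1-indexed) to the teacher layer whose
    hidden states and attention it is trained to imitate.
    """
    ratio = num_teacher_layers // num_student_layers
    return {m: m * ratio for m in range(1, num_student_layers + 1)}

# A 2-layer student distilled from a 12-layer BERT teacher:
print(layer_mapping(2, 12))   # {1: 6, 2: 12}
# The released 4-layer TinyBERT setting:
print(layer_mapping(4, 12))   # {1: 3, 2: 6, 3: 9, 4: 12}
```

So going from a 4-layer to a 2-layer student changes which teacher layers supervise each student layer, which is why the choice of where (general vs. task distillation) to shrink the model matters.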