Code for the paper "Towards the Law of Capacity Gap in Distilling Language Models"