A PyTorch implementation of Sparsely-Gated Mixture of Experts, for massively increasing the parameter count of language models
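
To make the routing idea concrete, here is a minimal sketch of top-k sparse gating in plain Python. It is not this repository's API: the names `top_k_gate` and `moe_forward`, the toy experts, and the scalar input are all assumptions made for illustration, and the noise term and load-balancing losses used in sparsely-gated MoE layers are left out.

```python
import math

def top_k_gate(logits, k=2):
    """Return sparse gate weights: softmax over the k largest logits, zero elsewhere."""
    if not 1 <= k <= len(logits):
        raise ValueError("k must be between 1 and the number of experts")
    # Indices of the k highest-scoring experts (ties keep the lower index first).
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Numerically stable softmax restricted to the selected experts.
    m = max(logits[i] for i in top)
    exps = {i: math.exp(logits[i] - m) for i in top}
    total = sum(exps.values())
    return [exps[i] / total if i in exps else 0.0 for i in range(len(logits))]

def moe_forward(x, experts, gate_logits, k=2):
    """Combine the outputs of only the selected experts, weighted by their gates."""
    gates = top_k_gate(gate_logits, k)
    # Experts with a zero gate are skipped, so compute scales with k, not with len(experts).
    return sum(g * expert(x) for g, expert in zip(gates, experts) if g > 0.0)

# Hypothetical toy experts: each is just a scalar function.
experts = [lambda x: x + 1, lambda x: 2 * x, lambda x: -x, lambda x: x * x]
gate_logits = [0.1, 2.0, -1.0, 2.0]

gates = top_k_gate(gate_logits, k=2)
output = moe_forward(3.0, experts, gate_logits, k=2)
```

Only the experts with a nonzero gate are evaluated, so the total parameter count can grow with the number of experts while the work per input stays roughly proportional to `k`.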