deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Question about open-sourcing the code

DXZDXZ opened this issue

Will the code for the device-level balance loss and the communication balance loss from the Aux Loss section be open-sourced? And what about the Token Dropping strategy described after it?