DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
pandaupc opened this issue 2 months ago
The paper states: "We obtain code preference data based on compiler-feedback, and mathematical preference data based on the ground-truth labels." Could you explain in detail how this is done?
@pandaupc There are currently no plans to disclose information beyond what is in the technical report.