deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Preference data construction method

pandaupc opened this issue

The paper states:
We obtain code preference data based on compiler-feedback, and mathematical
preference data based on the ground-truth labels
Could you explain in detail how this is done?

We currently have no plans to disclose information beyond what is in the technical report.
@pandaupc
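
Since the actual procedure is not disclosed, the following is only a generic sketch of how preference pairs are commonly derived from ground-truth labels. It is not DeepSeek's method. The function names (`normalize`, `build_math_pairs`), the exact-match comparison, and the `prompt`/`chosen`/`rejected` record layout are all assumptions made for illustration.

```python
from itertools import product


def normalize(answer: str) -> str:
    """Canonicalize a final answer for exact-match comparison (hypothetical helper)."""
    return answer.strip().rstrip(".").replace(" ", "").lower()


def build_math_pairs(prompt, responses, ground_truth):
    """Pair every correct response (chosen) with every incorrect one (rejected).

    `responses` is a list of (response_text, extracted_final_answer) tuples.
    Assumption: correctness is decided by exact match against the label.
    """
    target = normalize(ground_truth)
    correct = [text for text, ans in responses if normalize(ans) == target]
    wrong = [text for text, ans in responses if normalize(ans) != target]
    return [
        {"prompt": prompt, "chosen": c, "rejected": w}
        for c, w in product(correct, wrong)
    ]


# Toy data: three sampled responses to one prompt, with their final answers.
samples = [
    ("2 + 2 = 4, so the answer is 4.", "4"),
    ("2 + 2 = 5, so the answer is 5.", "5"),
    ("Adding gives 4.", " 4. "),
]
pairs = build_math_pairs("What is 2 + 2?", samples, "4")
for pair in pairs:
    print(pair["chosen"], "|", pair["rejected"])
```

For code, the same pairing could in principle use a compile or test outcome as the label instead of an answer match, but how the report's "compiler-feedback" signal is actually defined is not stated.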