DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
ftgreat opened this issue 2 months ago
Hi, thank you for your great work!
Could you provide more details about the pretraining dataset? How was it optimized in DeepSeek-V2 compared to the previous version, DeepSeek?
Thank you.