deepseek-ai / DeepSeek-V2

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Could we have scores for `LongBookQA Eng` and `LongBookSum Eng`?

zxzzz0 opened this issue

Some results pasted below from this link:

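| Task Name | GPT-4 | YaRN-Mistral-7B | Kimi-Chat | Claude 2 | Yi-6B-200K | Yi-34B-200K | Chatglm3-6B-128K |
| --- | --- | --- | --- | --- | --- | --- | --- |
| En.Sum | 14.73% | 9.09% | 17.93% | 14.45% | < 5% | < 5% | < 5% |
| En.QA | 22.22% | 9.55% | 16.52% | 11.97% | 9.20% | 12.17% | < 5% |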