open-compass / opencompass

OpenCompass is an LLM evaluation platform, supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Home Page: https://opencompass.org.cn/

[Feature] Support QuALITY dataset

Ezra-Yu opened this issue

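Describe the feature

Claude 3 added QuALITY ("Question Answering with Long Input Texts, Yes!"), a long-text test set averaging about 5k tokens per input. It is human-annotated and of fairly high quality, so I hope OpenCompass can support it.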

arXiv: https://arxiv.org/abs/2112.08608
GitHub: https://github.com/nyu-mll/quality

Would you like to implement this feature yourself?

  • I would like to implement this feature myself and contribute the code to OpenCompass!