open-compass / opencompass

OpenCompass is an LLM evaluation platform supporting a wide range of models (Llama3, Mistral, InternLM2, GPT-4, LLaMa2, Qwen, GLM, Claude, etc.) over 100+ datasets.

Home Page: https://opencompass.org.cn/

[Bug] The case in Figure 11 is not in the MMBench?

Richar-Du opened this issue

Prerequisite

Type

I'm evaluating with the officially supported tasks/models/datasets.

Environment

The same as the official environment.

Reproduces the problem - code/configuration sample

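import pandas as pd

# Load the MMBench dev split (tab-separated) and keep only the physical_relation rows.
question_file = 'mmbench_dev_20230712.tsv'
question_df = pd.read_csv(question_file, sep='\t', encoding='utf-8')
physical_relation = question_df[question_df['category'] == 'physical_relation']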

Reproduces the problem - command or script

NO

Reproduces the problem - error message

I iterated over the physical_relation rows and did not find the cases shown in Figure 11.

Other information

No

@Richar-Du, the two cases do exist in MMBench, but in the test split. That's why you cannot find them in the dev split you loaded.
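
For anyone who wants to check which split a given case lives in, here is a minimal sketch. It assumes each split's TSV has a `question` column alongside the `category` column used in the snippet above. The helper names `load_split` and `find_cases`, the test-split path, and the keyword are hypothetical placeholders, not part of OpenCompass.

```python
import pandas as pd


def load_split(tsv_path: str) -> pd.DataFrame:
    """Read one MMBench split (tab-separated, as in the issue's snippet)."""
    return pd.read_csv(tsv_path, sep="\t", encoding="utf-8")


def find_cases(df: pd.DataFrame, category: str, keyword: str) -> pd.DataFrame:
    """Return rows of `category` whose question text contains `keyword`.

    Assumes the frame has `category` and `question` columns.
    """
    in_category = df["category"] == category
    # regex=False matches the keyword literally; na=False treats
    # missing questions as non-matches instead of propagating NaN.
    has_keyword = df["question"].str.contains(
        keyword, case=False, regex=False, na=False
    )
    return df[in_category & has_keyword]


# Example usage (paths and keyword are placeholders):
# for path in ["mmbench_dev_20230712.tsv", "path/to/test_split.tsv"]:
#     hits = find_cases(load_split(path), "physical_relation", "some keyword")
#     print(path, len(hits))
```

Running the same search over both the dev and test files should show whether a case is missing entirely or just sits in a different split.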