thu-coai / SafetyBench

Official GitHub repo for SafetyBench, a comprehensive benchmark to evaluate LLMs' safety.

What are the differences between the Chinese and Chinese Subset leaderboards?

zhimin-z opened this issue

[image]
Looking at the evaluation benchmark, I don't see any difference between the two leaderboards apart from the number of models tested.
Is that the only difference?

I solved it by checking this sentence:
[image]