THUDM/AlignBench
A multi-dimensional benchmark for evaluating the Chinese alignment of large language models (ACL 2024)
Stargazers: 334
Watchers: 12
Issues: 31
Forks: 23
THUDM/AlignBench Issues
The answers to some questions are wrong!!!
Updated
a month ago
Comments count
4
The llmbench.ai SSL certificate appears to have expired
Updated
3 months ago
Many math and logic questions still have incorrect reference answers
Closed
4 months ago
About the results for Llama 3-70B
Updated
5 months ago
Errors in reference (especially for reasoning)
Closed
5 months ago
Comments count
1
Evaluation has been very slow recently
Updated
5 months ago
Recently, some questions cannot be evaluated correctly
Updated
6 months ago
Why have all recently submitted evaluations ended in an error?
Updated
6 months ago
Folks, will this evaluation model be open-sourced?
Closed
a year ago
Comments count
2
What causes the extract_score error, and how can it be resolved?
Updated
7 months ago
Comments count
5
Some of the data in data_release.jsonl has problems
Updated
7 months ago
The evaluation website reports an error
Updated
9 months ago
Comments count
8
How can test results be published to the LeaderBoard?
Updated
10 months ago
Why does the result show 'pending' (待定) when evaluating with GPT-4?
Updated
10 months ago
Comments count
1
Cannot log in to the website
Updated
10 months ago
Has the Qwen72B model been evaluated?
Closed
a year ago
Comments count
3
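The reference answers contain many errors and appear to be AI-generated without careful proofreading. One example: what is the 7th note of the C Mixolydian scale?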
Closed
a year ago
Comments count
5
Why do longer model outputs receive higher scores?
Updated
a year ago
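Cannot register or log in: the registration page shows a red cross, and when logging in with another account, the login button does not respond after the basic information is filled in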
Closed
a year ago
Comments count
3
Duplicated judge content in the detailed results of a submitted task causes incorrect score calculation
Closed
a year ago
Comments count
3
Large discrepancy in chatglm3 test results
Updated
a year ago
Comments count
1
Uploading an evaluation to the website failed
Closed
a year ago
Comments count
1
Calling a local API for evaluation sometimes returns a 404 error
Closed
a year ago
Comments count
3
In the model scoring stage, is it normal for two identical CSV files to produce slightly different final scores?
Closed
a year ago
Comments count
1
Is there a detailed description of how the outputs of the models on the leaderboard were generated?
Closed
a year ago
Comments count
5
Cannot register on the website
Closed
a year ago
Cannot submit new results
Closed
a year ago
Comments count
1
The answers to some questions are debatable
Closed
a year ago
Comments count
1
The example.csv file provided on the official website shows garbled text when opened, so its format cannot be viewed
Closed
a year ago
Comments count
3
About meta-evaluation dataset with human annotations
Closed
a year ago
Comments count
1
After a successful submission, the downloaded score and detailed result files are both empty
Closed
a year ago
Comments count
2