CMMLU: Measuring massive multitask language understanding in Chinese
mMrBun opened this issue a year ago
Thanks for your work! I want to compare ChatGLM and Qwen. Do you plan to support Qwen?
I attempted to evaluate Qwen-7b using hf_causal_model.py, but encountered an error at line 43. The specific reason is that the choice_ids at line 21 are ["None", "None", "None", "None"].
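A minimal sketch of one way to guard against this, not the repository's actual code. It assumes the answer letters are looked up with the tokenizer's convert_tokens_to_ids and that the lookup can return None for some tokenizers, as the values above suggest. The helper name resolve_choice_ids and the StubTokenizer class are hypothetical, and falling back to encode(..., add_special_tokens=False) is an assumption that should be checked against the real Qwen tokenizer.

```python
from typing import List, Sequence


def resolve_choice_ids(tokenizer, choices: Sequence[str] = ("A", "B", "C", "D")) -> List[int]:
    """Map each answer letter to exactly one token id, or raise if that is impossible."""
    unk_id = getattr(tokenizer, "unk_token_id", None)
    ids = []
    for choice in choices:
        token_id = tokenizer.convert_tokens_to_ids(choice)
        if token_id is None or token_id == unk_id:
            # Fallback (assumption): encode the letter as text instead of
            # looking it up as a literal vocabulary entry.
            encoded = tokenizer.encode(choice, add_special_tokens=False)
            if len(encoded) != 1:
                raise ValueError(f"{choice!r} does not map to a single token: {encoded}")
            token_id = encoded[0]
        ids.append(token_id)
    return ids


class StubTokenizer:
    """Hypothetical stand-in that mimics the reported behaviour (lookup returns None)."""

    unk_token_id = None

    def convert_tokens_to_ids(self, token):
        return None

    def encode(self, text, add_special_tokens=True):
        return [ord(text)]


print(resolve_choice_ids(StubTokenizer()))  # prints [65, 66, 67, 68]
```

Failing early with a clear message, rather than letting None values reach the logits indexing, would at least turn the error at line 43 into something easier to diagnose.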
Thank you for reporting the issue. We will look into it and add support for Qwen soon.