Dereck0602 / Bolaco

Evaluation metrics

yaya-sy opened this issue

Hi, thank you for sharing your work. Very interesting!

Which metric do you use for zero-shot evaluation? lm-evaluation-harness reports two accuracy metrics (acc and acc_norm), and I was wondering which one you use.

Thanks a lot.

Hi! We report acc_norm for hellaswag, arc_challenge and openbookqa, and acc for all other tasks.
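
For anyone reproducing the numbers, the rule above could be applied to harness output roughly as follows. This is only a sketch, not the authors' script: `ACC_NORM_TASKS`, `metric_for` and `select_scores` are hypothetical names, the input is assumed to be a plain dict mapping each task name to its metric values, and the fallback to keys such as `acc,none` is an assumption about how some harness versions name their metrics.

```python
# Tasks for which the reply above says acc_norm is reported.
ACC_NORM_TASKS = {"hellaswag", "arc_challenge", "openbookqa"}


def metric_for(task: str) -> str:
    """Return the metric name to report for a task, following the reply above."""
    return "acc_norm" if task in ACC_NORM_TASKS else "acc"


def select_scores(results: dict) -> dict:
    """Pick one score per task from a {task: {metric_name: value}} dict.

    Assumption: metric keys are either plain ("acc") or carry a
    filter suffix ("acc,none"); both spellings are tried.
    """
    selected = {}
    for task, metrics in results.items():
        name = metric_for(task)
        value = metrics.get(name, metrics.get(f"{name},none"))
        if value is None:
            raise KeyError(f"no {name} entry found for task {task!r}")
        selected[task] = value
    return selected


if __name__ == "__main__":
    # Placeholder values for illustration only, not results from the paper.
    # "piqa" stands in for any task outside the acc_norm list.
    example = {
        "hellaswag": {"acc": 0.5, "acc_norm": 0.7},
        "piqa": {"acc,none": 0.8},
    }
    print(select_scores(example))
```

If your harness version nests the per-task metrics under a top-level `results` key, pass that inner dict to `select_scores`.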

Thank you very much!