Evaluation metrics
yaya-sy opened this issue
Yaya Sy commented
Hi, thank you for sharing your work. Very interesting!
Which metrics do you use for zero-shot evaluation? lm-evaluation-harness
reports two metrics (acc and acc_norm), and I was wondering which one you use.
Thanks a lot.
Yixin commented
Hi! We report acc_norm for hellaswag, arc_challenge, and openbookqa, and acc for all other tasks.
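For anyone post-processing harness output, the selection rule above could be sketched roughly as follows. This is only an illustration, not the authors' actual script: the helper names `metric_name` and `pick_metric` are hypothetical, the per-task result dicts are assumed to use either plain keys (`acc`, `acc_norm`) or filter-suffixed keys (`acc,none`, `acc_norm,none`) depending on the harness version, and the numbers are placeholders rather than real scores.

```python
# Tasks for which acc_norm is reported, per the comment above.
ACC_NORM_TASKS = {"hellaswag", "arc_challenge", "openbookqa"}


def metric_name(task):
    """Return the metric to report for a task: acc_norm for the three listed tasks, acc otherwise."""
    return "acc_norm" if task in ACC_NORM_TASKS else "acc"


def pick_metric(task, task_results):
    """Look up the reported metric in one task's result dict (hypothetical helper)."""
    name = metric_name(task)
    # Assumption: some harness versions suffix metric keys with the filter name,
    # e.g. "acc,none", so both spellings are tried.
    for key in (name, f"{name},none"):
        if key in task_results:
            return task_results[key]
    raise KeyError(f"{name} not found for task {task!r}")


# Placeholder values for illustration only; these are not real evaluation results.
example_results = {
    "hellaswag": {"acc": 0.50, "acc_norm": 0.60},
    "piqa": {"acc,none": 0.70, "acc_norm,none": 0.71},
}

reported = {task: pick_metric(task, res) for task, res in example_results.items()}
print(reported)
```

With the placeholder dict, `hellaswag` resolves to its `acc_norm` value and `piqa` to its `acc` value.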
Yaya Sy commented
Thank you very much!