huggingface / evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.

Home Page: https://huggingface.co/docs/evaluate

SQuAD v2 is missing the NoAns metrics

GitMew opened this issue

The "Output values" section of the documentation for the SQuAD v2 metric says that metric.compute() should return 13 different measures. Yet, as the "Examples" section shows, all the "NoAns" measures are missing from the output.

This is not just a documentation error; running the examples in evaluate 0.3.0 gives the same result, without NoAns.

On the topic of documentation (although this is a separate issue): exact, HasAns_exact and NoAns_exact are described with the same sentence, but they clearly should have different meanings. Furthermore, I don't see what the difference is between NoAns_exact and NoAns_f1, because if there is no answer, then the reference span is empty and hence F1 is measured over a reference of 0 words.
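
To make the NoAns_exact versus NoAns_f1 question concrete, here is a minimal sketch of a token-level F1 and an exact-match score. It is not the evaluate implementation: the function names token_f1 and exact_match are hypothetical, the whitespace tokenization is a simplification, and the handling of empty answers (score 1.0 only when both reference and prediction are empty) is an assumption about the convention such a metric would need. Under that assumption the two scores coincide whenever the reference is empty, which is exactly the redundancy described above.

```python
from collections import Counter


def token_f1(reference: str, prediction: str) -> float:
    """Simplified token-level F1 between a reference and a prediction."""
    ref_tokens = reference.lower().split()
    pred_tokens = prediction.lower().split()
    # Assumed convention: if either side is empty, F1 is 1.0 only when
    # both are empty, otherwise 0.0 (precision/recall are undefined).
    if not ref_tokens or not pred_tokens:
        return float(ref_tokens == pred_tokens)
    # Count tokens shared between reference and prediction (multiset overlap).
    overlap = sum((Counter(ref_tokens) & Counter(pred_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


def exact_match(reference: str, prediction: str) -> float:
    """1.0 if the token sequences are identical, else 0.0."""
    return float(reference.lower().split() == prediction.lower().split())


if __name__ == "__main__":
    # Unanswerable question: the reference is the empty string.
    for prediction in ["", "Paris"]:
        print(repr(prediction), exact_match("", prediction), token_f1("", prediction))
```

With an empty reference, both functions return the same value for any prediction, so a separate NoAns_f1 would carry no information beyond NoAns_exact under this convention.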