Incorrect P@5 Values

Question

ryan-clancy opened this issue 5 years ago

When doing the official run, I get the following values from trec_eval:

Starting container from saved image...
Logs for search in container with ID eade5686b363406fb16c3d8c51b37436cd1879429c53ee11ba0cb60cae33f752...
CREATE DATABASE
CREATE CONNECTION
SCORING TOPICS
Evaluating results using trec_eval...
###
# /tmp/output/olddog/run.bm25.robust04
###
map                     all     0.1771
P_5                     all     0.4385

MAP looks good, but the P@5 value isn't what's expected.

Log and run files available at https://github.com/osirrc/osirrc2019-runs/tree/master/olddog/robust04/2019-06-17

Arjen P. de Vries · Answer 1 · Tue Jun 18 2019 16:23:03 GMT+0800 (China Standard Time)

I am checking right now, but I think the measure in the table is P@30 rather than P@5, so the mistake is ours: we reported the wrong number.

Arjen P. de Vries · Answer 2 · Tue Jun 18 2019 16:47:50 GMT+0800 (China Standard Time)

Confirmed - the number we put in the table is P@30.

Will be fixed!

Arjen P. de Vries · Answer 3 · Tue Jun 18 2019 17:05:56 GMT+0800 (China Standard Time)

@r-clancy and @lintool a question:
I computed P@5 using trec_eval and get the value @r-clancy reports above (Phew);
but if I use trec_eval -c, I get P@5 = 0.4297.

I think people should report the latter, not the former: a system should be evaluated on all topics, not only on the topics it returned answers for. (Otherwise, I could get a very high MAP by being selective about which topics to include; that would be difficult to do, but it is not unimaginable.)
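
To illustrate the difference between the two averages, here is a minimal sketch on toy data. It is not trec_eval's implementation; the function names (`precision_at_k`, `mean_p_at_k`), the `complete` parameter, and the topic and document IDs are all made up for illustration. The only assumption taken from trec_eval is the documented meaning of `-c`: average over every topic in the relevance judgements, with topics missing from the run contributing 0.

```python
from typing import Dict, List, Set


def precision_at_k(ranked: List[str], relevant: Set[str], k: int = 5) -> float:
    """Fraction of the top-k ranked documents that are relevant (divides by k)."""
    return sum(1 for doc in ranked[:k] if doc in relevant) / k


def mean_p_at_k(
    run: Dict[str, List[str]],
    qrels: Dict[str, Set[str]],
    k: int = 5,
    complete: bool = False,
) -> float:
    """Mean P@k over topics.

    complete=False: average only over judged topics that appear in the run.
    complete=True:  average over all judged topics; a topic with no results
                    scores 0 (the behaviour assumed here for `trec_eval -c`).
    """
    if complete:
        topics = sorted(qrels)
    else:
        topics = sorted(t for t in qrels if t in run)
    if not topics:
        return 0.0
    total = sum(precision_at_k(run.get(t, []), qrels[t], k) for t in topics)
    return total / len(topics)


# Hypothetical toy data: three judged topics, but the run answers only two.
qrels = {
    "301": {"d1", "d2", "d3"},
    "302": {"d4"},
    "303": {"d7", "d8"},
}
run = {
    "301": ["d1", "d2", "x1", "d3", "x2"],  # 3 relevant in top 5 -> 0.6
    "302": ["x3", "d4", "x4", "x5", "x6"],  # 1 relevant in top 5 -> 0.2
}

p5_returned = mean_p_at_k(run, qrels, k=5, complete=False)
p5_all = mean_p_at_k(run, qrels, k=5, complete=True)

print(f"P@5 over topics with results: {p5_returned:.4f}")  # 0.4000
print(f"P@5 over all judged topics:   {p5_all:.4f}")       # 0.2667
```

Under the same assumption, the ratio of the two values equals the fraction of judged topics the run answered. For the numbers in this thread, 0.4297 / 0.4385 ≈ 0.98, which would be consistent with a small number of topics having no results in the run.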

Jimmy Lin · Answer 4 · Tue Jun 18 2019 18:18:22 GMT+0800 (China Standard Time)

@arjenpdevries filed issue here: osirrc/jig#105