castorini / pyserini

Pyserini is a Python toolkit for reproducible information retrieval research with sparse and dense representations.

Home Page: http://pyserini.io/

number of hits for a given query is not as specified in the retrieval command

NourOM02 opened this issue · comments

I ran BM25 on the cqadupstack/english dataset using the following command:

python -m pyserini.search.lucene --threads 16 --batch-size 128 --index beir-v1.0.0-cqadupstack-english.flat --topics beir-v1.0.0-cqadupstack-english-test --output run.beir.bm25-flat.cqadupstack-english.txt --output-format trec --hits 1000 --bm25 --remove-query

For almost every query the results look as expected, but the query with ID 84177 returns only 1 hit instead of the 1000 requested with --hits (as you can see in the image).

image
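
To check the hit counts without relying on the screenshot, the run file can be tallied per query. The following is only a minimal sketch: it assumes the output is in the standard six-column TREC run format (query ID, Q0, document ID, rank, score, run tag), and the helper names `count_hits_per_query` and `queries_below` are made up for illustration, not part of Pyserini.

```python
import os
from collections import Counter


def count_hits_per_query(lines):
    """Count result lines per query ID, assuming six-column TREC run format."""
    counts = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) != 6:
            continue  # skip blank or malformed lines
        counts[fields[0]] += 1  # first column is the query ID
    return counts


def queries_below(counts, expected_hits):
    """Return {query_id: hit_count} for queries with fewer hits than expected."""
    return {qid: n for qid, n in counts.items() if n < expected_hits}


if __name__ == "__main__":
    # File name taken from the --output argument of the command above.
    run_path = "run.beir.bm25-flat.cqadupstack-english.txt"
    if os.path.exists(run_path):
        with open(run_path) as f:
            counts = count_hits_per_query(f)
        # 1000 matches the --hits value used in the command.
        for qid, n in sorted(queries_below(counts, 1000).items()):
            print(qid, n)
```

This lists every query that came back with fewer than 1000 hits, which shows whether 84177 is the only one affected. Note that a query with zero hits would not appear in the run file at all, so it would not be reported by this check.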