EstQA contains multiple ids for the same context document
x-tabdeveloping opened this issue
Márton Kardos commented
I noticed that the way I uploaded EstQA to HuggingFace and then used it in the benchmark is not suitable for a retrieval task: multiple contexts can belong to the same question, and this is not accounted for.
Potential fixes:
- Just use the answer as the retrieved passage instead of the context.
- Fix the dataset on HuggingFace with multiple tables.
Kenneth Enevoldsen commented
Since you can have multiple positive retrievals, shouldn't you simply have multiple positive pairs?
Márton Kardos commented
The problem is that the same context gets multiple IDs, so even when the model retrieves the correct context for a question, the result might be counted as a false positive because the retrieved ID differs from the one paired with that question. But I'm on it, submitting a PR soon.
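For illustration, here is a minimal sketch of one way to give each unique context a single ID and build the query-to-document relevance mapping from it. It is not the actual fix from the PR. The function name `build_retrieval_tables`, the column names `question` and `context`, and the ID formats are all assumptions made for this example.

```python
from collections import defaultdict


def build_retrieval_tables(rows):
    """Deduplicate contexts so each unique context text has exactly one ID.

    `rows` is an iterable of dicts with the (assumed) keys
    'question' and 'context'.
    """
    corpus = {}                # doc_id -> context text
    context_to_id = {}         # context text -> doc_id
    queries = {}               # query_id -> question text
    qrels = defaultdict(dict)  # query_id -> {doc_id: relevance}

    for i, row in enumerate(rows):
        context = row["context"]
        # Assign a new document ID only the first time a context is seen.
        if context not in context_to_id:
            doc_id = f"d{len(context_to_id)}"
            context_to_id[context] = doc_id
            corpus[doc_id] = context
        query_id = f"q{i}"
        queries[query_id] = row["question"]
        qrels[query_id][context_to_id[context]] = 1

    return corpus, queries, dict(qrels)


# Toy data: two questions share context "A".
rows = [
    {"question": "Q1?", "context": "A"},
    {"question": "Q2?", "context": "A"},
    {"question": "Q3?", "context": "B"},
]
corpus, queries, qrels = build_retrieval_tables(rows)
```

With the toy data, the two questions that share context "A" point at the same document ID, so retrieving that context counts as a hit for either question.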