BEIR: A Heterogenous Benchmark for Zero-shot Evaluation of Information Retrieval Models
Nandan Thakur et al.
NeurIPS 2021
DATA
MS MARCO: A Human Generated MAchine Reading COmprehension Dataset
Payal Bajaj et al.
NeurIPS 2016
DATA
Natural Questions: a Benchmark for Question Answering Research
Tom Kwiatkowski et al.
TACL 2019
DATA |
TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension. |
Mandar Joshi et al. |
ACL 2017 |
DATA |
mMARCO: A Multilingual Version of the MS MARCO Passage Ranking Dataset. |
Luiz Henrique Bonifacio et al. |
arXiv 2021
DATA |
TREC 2019 News Track Overview. |
Ian Soboroff et al.
TREC 2019 |
DATA |
TREC-COVID: rationale and structure of an information retrieval shared task for COVID-19. |
Kirk Roberts et al. |
J Am Med Inform Assoc. 2020 |
DATA |
A Full-Text Learning to Rank Dataset for Medical Information Retrieval. |
Vera Boteva et al. |
ECIR 2016 |
DATA |
A Data Collection for Evaluating the Retrieval of Related Tweets to News Articles. |
Axel Suarez et al. |
ECIR 2018 |
DATA |
Overview of Touché 2020: Argument Retrieval. |
Alexander Bondarenko et al. |
CLEF 2020 |
DATA |
Retrieval of the Best Counterargument without Prior Topic Knowledge. |
Henning Wachsmuth et al. |
ACL 2018 |
DATA |
DBpedia-Entity v2: A Test Collection for Entity Search. |
Faegheh Hasibi et al. |
SIGIR 2017 |
DATA |
ORCAS: 20 Million Clicked Query-Document Pairs for Analyzing Search. |
Nick Craswell et al. |
CIKM 2020 |
DATA |
TREC 2022 Deep Learning Track Guidelines |
Nick Craswell et al. |
TREC 2021 |
DATA |
DuReader_retrieval: A Large-scale Chinese Benchmark for Passage Retrieval from Web Search Engine. |
Yifu Qiu et al. |
arXiv 2022
DATA |
SQuAD: 100,000+ Questions for Machine Comprehension of Text. |
Pranav Rajpurkar et al. |
EMNLP 2016 |
DATA |
HOTPOTQA: A Dataset for Diverse, Explainable Multi-hop Question Answering. |
Zhilin Yang et al. |
EMNLP 2018 |
DATA |
Semantic Parsing on Freebase from Question-Answer Pairs. |
Jonathan Berant et al. |
EMNLP 2013 |
DATA |
Modeling of the Question Answering Task in the YodaQA System. |
Petr Baudiš et al. |
CLEF 2015 |
DATA |
WWW'18 Open Challenge: Financial Opinion Mining and Question Answering. |
Macedo Maia et al. |
WWW 2018 |
DATA |
An overview of the BIOASQ large-scale biomedical semantic indexing and question answering competition. |
George Tsatsaronis et al. |
BMC Bioinform. 2015 |
DATA |
CQADupStack: A Benchmark Data Set for Community Question-Answering Research. |
Doris Hoogeveen et al. |
ADCS 2015 |
DATA |
First Quora Dataset Release: Question Pairs. |
Shankar Iyer et al. |
Webpage |
DATA |
CCQA: A New Web-Scale Question Answering Dataset for Model Pre-Training. |
Patrick Huber et al. |
NAACL 2022 |
DATA |
FEVER: a Large-scale Dataset for Fact Extraction and VERification. |
James Thorne et al. |
NAACL 2018 |
DATA |
CLIMATE-FEVER: A Dataset for Verification of Real-World Climate Claims. |
Thomas Diggelmann et al. |
NeurIPS 2020 |
DATA |
Fact or Fiction: Verifying Scientific Claims. |
David Wadden et al. |
EMNLP 2020 |
DATA |
SPECTER: Document-level Representation Learning using Citation-informed Transformers. |
Arman Cohan et al. |
ACL 2020 |
DATA |
Simple Entity-Centric Questions Challenge Dense Retrievers. |
Christopher Sciavolino et al. |
EMNLP 2021 |
DATA |
ArchivalQA: A Large-scale Benchmark Dataset for Open Domain Question Answering over Archival News Collections. |
Jiexin Wang et al. |
arXiv 2021
NA |
Multi-CPR: A Multi Domain Chinese Dataset for Passage Retrieval. |
Dingkun Long et al. |
SIGIR 2022 |
DATA |
HOVER: A Dataset for Many-Hop Fact Extraction And Claim Verification. |
Yichen Jiang et al. |
EMNLP 2020 |
DATA |
TREC 2021 Deep Learning Track Guidelines. |
Nick Craswell et al. |
NA |
NA |
MSMarco Chameleons: Challenging the MSMarco Leaderboard with Extremely Obstinate Queries. |
Negar Arabzadeh et al. |
CIKM 2021 |
Roff |