Sean MacAvaney's repositories
pushshift-reddit-sync
A simple utility to sync a local copy of the pushshift.io Reddit submission and comment dumps
trello-track
Tracks running processes via Trello
bert-axioms
Code for the ECIR'20 paper "Diagnosing BERT with Retrieval Heuristics"
GenderBias_IR
Tools and resources for measuring gender bias in Information Retrieval models
ir_datasets
Provides a common interface to many IR ranking datasets.
MSMARCO-Passage-Ranking
MS MARCO (Microsoft Machine Reading Comprehension) is a large-scale dataset focused on machine reading comprehension, question answering, and passage ranking. A variant of this task will be part of TREC and AFIRM 2019. For updates about TREC 2019, please follow this repository. Passage reranking task: given a query q and the 1000 most relevant passages P = p1, p2, p3, ..., p1000, as retrieved by BM25, a successful system is expected to rank the most relevant passage as high as possible. For this task, not all sets of 1000 candidate passages contain a human-labeled relevant passage. Evaluation will be done using MRR.
pytrec_eval
pytrec_eval is an Information Retrieval evaluation tool for Python, based on the popular trec_eval.
terrier-core
Terrier IR Platform
trec-car-tools
Tools for working with the TREC CAR dataset.
TREC-COVID
TREC-COVID results
warc3-clueweb
Python 3 library for reading and writing WARC files
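
The MSMARCO-Passage-Ranking entry above says evaluation uses MRR (mean reciprocal rank). As a rough illustration of that metric, here is a minimal sketch in plain Python. The function names, the toy run and judgments, and the rank cutoff of 10 are all assumptions made for this example; none of it is taken from the repository's own evaluation script.

```python
# Minimal MRR sketch. All names and data here are hypothetical illustrations.

def reciprocal_rank(ranked_ids, relevant_ids, cutoff=10):
    """Return 1/rank of the first relevant passage within the cutoff, else 0."""
    for rank, passage_id in enumerate(ranked_ids[:cutoff], start=1):
        if passage_id in relevant_ids:
            return 1.0 / rank
    return 0.0


def mean_reciprocal_rank(run, qrels, cutoff=10):
    """Average the reciprocal rank over every query in the run.

    run:   dict mapping query id -> list of passage ids, best first
    qrels: dict mapping query id -> set of relevant passage ids
    Queries with no judged relevant passage contribute 0.
    """
    if not run:
        return 0.0
    total = sum(
        reciprocal_rank(ranked, qrels.get(query_id, set()), cutoff)
        for query_id, ranked in run.items()
    )
    return total / len(run)


# Toy data: q1 has its relevant passage at rank 2, q2 retrieves nothing relevant.
run = {"q1": ["p3", "p1", "p2"], "q2": ["p9", "p8", "p7"]}
qrels = {"q1": {"p1"}, "q2": {"p4"}}

mrr = mean_reciprocal_rank(run, qrels)
print(mrr)  # (1/2 + 0) / 2 = 0.25
```

Only the first relevant passage per query affects the score, so a system is rewarded for placing one relevant passage as high as possible, not for retrieving many.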