pisa-engine / ciff-hub

Hosting some useful CIFFs

CIFF Hub

The Common Index File Format (CIFF) is an inverted index exchange format defined as part of the Open-Source IR Replicability Challenge (OSIRRC) initiative.

CIFF Hub hosts prebuilt indexes and the matching query sets for a variety of collections and retrieval models.
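To make "inverted index" concrete, here is a minimal Python sketch of the kind of data such an exchange format carries: for each term, a docid-sorted list of (docid, term frequency) postings, optionally stored as docid gaps. It is illustrative only. The function names `build_inverted_index` and `gap_encode` are hypothetical, and the sketch neither reads nor writes CIFF files nor reflects CIFF's actual schema.

```python
from collections import Counter, defaultdict


def build_inverted_index(docs):
    """Map each term to a docid-sorted list of (docid, term_frequency) postings."""
    index = defaultdict(list)
    # Docids are assigned in increasing order, so each postings list stays sorted.
    for docid, text in enumerate(docs):
        for term, tf in Counter(text.lower().split()).items():
            index[term].append((docid, tf))
    return dict(index)


def gap_encode(postings):
    """Replace absolute docids with the gap from the previous docid (first gap is from 0)."""
    encoded, prev = [], 0
    for docid, tf in postings:
        encoded.append((docid - prev, tf))
        prev = docid
    return encoded


# Toy collection (assumed data, not taken from any hosted index).
docs = [
    "sparse retrieval with sparse indexes",
    "dense retrieval",
    "sparse index exchange",
]
index = build_inverted_index(docs)
```

With this toy collection, `index["sparse"]` holds one posting for document 0 (frequency 2) and one for document 2 (frequency 1). Gap encoding keeps the stored integers small, which is one reason document reordering can shrink an index.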

MS MARCO

The MS MARCO passage ranking dataset consists of 8.8M passages.

ESPLADE

Lassance, Carlos, and Stéphane Clinchant. "An efficiency study for splade models." Proceedings of the 45th International ACM SIGIR Conference on Research and Development in Information Retrieval. 2022.

| Name | Description | CIFF | Dev | DL 2019 | DL 2020 |
|---|---|---|---|---|---|
| ESPLADE | efficient-splade-V-large-doc reordered w/ BP | Download | Download | Download | Download |
| SPLADE | splade-cocondenser-ensemble distil reordered w/ BP | Download | Download | Download | Download |

uniCOIL

Jimmy Lin and Xueguang Ma. A Few Brief Notes on DeepImpact, COIL, and a Conceptual Framework for Information Retrieval Techniques. arXiv:2106.14807.

| Name | Description | CIFF | Dev | DL 2019 | DL 2020 |
|---|---|---|---|---|---|
| uniCOIL-TILDE | uniCOIL w/ TILDE expansion reordered w/ BP | Download | Download | Download | Download |

CSV

Yu, Puxuan, Antonio Mallia, and Matthias Petri. "Improved Learned Sparse Retrieval with Corpus-Specific Vocabularies." arXiv preprint arXiv:2401.06703 (2024).

| Name | Description | CIFF | Dev | DL 2019 | DL 2020 |
|---|---|---|---|---|---|
| CSV-30k | csv-30k reordered w/ BP | Download | Download | Download | Download |
| CSV-100k | csv-100k reordered w/ BP | Download | Download | Download | Download |
| CSV-300k | csv-300k reordered w/ BP | Download | Download | Download | Download |

License: Apache License 2.0