OSIRRC Docker Image for IRC-CENTRE2019

Timo Breuer and Philipp Schaer

This is the Docker image for our replicated submission to CENTRE@CLEF2019, conforming to the OSIRRC jig for the Open-Source IR Replicability Challenge (OSIRRC 2019) at SIGIR 2019. This image is available on Docker Hub and has been tested with the jig at commit ca31987 (6/5/2019).

  • Supported test collections: core17
  • Required training collections: robust04, robust05
  • Supported hooks: init, index, search

Quick Start

Use the commands below to get the runs for WCRobust04 and WCRobust0405 as they were replicated in the course of our participation in CENTRE@CLEF2019.

The following jig command can be used to index the New York Times corpus and prepare training data for WCRobust04:

python run.py prepare \
    --repo osirrc2019/irc-centre2019 \
    --tag v0.1.3 \
    --collections robust04=/path/to/robust04/=trectext \
                  core17=/path/to/core17/=trectext \
    --opts run="wcrobust04"

The run option can be set to run="wcrobust0405" to prepare training data for WCRobust0405 instead. In this case, the robust05 corpus has to be mounted as an additional volume:

python run.py prepare \
    --repo osirrc2019/irc-centre2019 \
    --tag v0.1.3 \
    --collections robust04=/path/to/robust04/=trectext \
                  robust05=/path/to/robust05/=trectext \
                  core17=/path/to/core17/=trectext \
    --opts run="wcrobust0405"
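The relationship between the run option and the collections that have to be mounted can be summarised in a small sketch. The names REQUIRED_COLLECTIONS, parse_collection and missing_collections are hypothetical and not part of the jig or of this image; only the name=path=format layout and the collection names are taken from the commands above.

```python
# Hypothetical helper, not part of the jig or this image: checks that the
# collections given on the command line cover what the chosen run needs.
REQUIRED_COLLECTIONS = {
    "wcrobust04": {"robust04", "core17"},
    "wcrobust0405": {"robust04", "robust05", "core17"},
}


def parse_collection(spec):
    """Split a 'name=path=format' spec into its three parts."""
    name, rest = spec.split("=", 1)
    path, fmt = rest.rsplit("=", 1)
    return name, path, fmt


def missing_collections(run, specs):
    """Return the sorted names of collections the run needs but specs lack."""
    provided = {parse_collection(spec)[0] for spec in specs}
    return sorted(REQUIRED_COLLECTIONS[run] - provided)


if __name__ == "__main__":
    specs = [
        "robust04=/path/to/robust04/=trectext",
        "core17=/path/to/core17/=trectext",
    ]
    print(missing_collections("wcrobust04", specs))
    print(missing_collections("wcrobust0405", specs))
```

With only robust04 and core17 supplied, the check passes for wcrobust04 and reports robust05 as missing for wcrobust0405.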

The following jig command can be used to perform a retrieval run on the New York Times corpus, based on the training corpora specified in the previous step:

python run.py search \
    --repo osirrc2019/irc-centre2019 \
    --tag v0.1.3 \
    --collection core17 \
    --topic topics/topics.core17.txt \
    --output /path/to/output/ \
    --qrels qrels/qrels.core17.txt

Expected Results

Run           MAP     P@10    P@30
WCRobust04    0.2971  0.6820  0.5613
WCRobust0405  0.3539  0.7360  0.6347
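For readers unfamiliar with the measures in the table, the following is a simplified sketch of how P@k and MAP are defined. It is not the evaluation tool used to produce the figures above, and the document identifiers in the example are made up.

```python
def precision_at_k(ranking, relevant, k):
    """Fraction of the top-k ranked documents that are relevant."""
    return sum(1 for doc in ranking[:k] if doc in relevant) / k


def average_precision(ranking, relevant):
    """Sum of the precision at each rank holding a relevant document,
    divided by the total number of relevant documents."""
    if not relevant:
        return 0.0
    hits = 0
    total = 0.0
    for rank, doc in enumerate(ranking, start=1):
        if doc in relevant:
            hits += 1
            total += hits / rank
    return total / len(relevant)


def mean_average_precision(rankings, qrels):
    """Average of the per-topic average precision over all topics in qrels."""
    scores = [average_precision(rankings.get(t, []), qrels[t]) for t in qrels]
    return sum(scores) / len(scores)


if __name__ == "__main__":
    # Made-up toy data: one topic, four retrieved documents, three relevant.
    rankings = {"t1": ["d1", "d2", "d3", "d4"]}
    qrels = {"t1": {"d1", "d3", "d5"}}
    print(precision_at_k(rankings["t1"], qrels["t1"], 2))
    print(mean_average_precision(rankings, qrels))
```

Relevant documents that are never retrieved (d5 in the toy data) still count in the denominator of average precision, which is why MAP rewards recall as well as early precision.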

Implementation

The following is a short summary of what happens in each of the scripts in this repo.

Dockerfile

The Dockerfile installs python3, copies the scripts for the corresponding hooks, and creates the required directory. The working directory is set to /work/.

init

The init script downloads the code from a repository and installs the required Python packages from the requirements.txt file. Depending on the specified run, the scripts for WCRobust04 or WCRobust0405 are prepared.

index

The index script runs a subprocess which starts indexing.
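The pattern of a hook script handing work to a child process might look roughly like the sketch below. The function name run_step is hypothetical, and the command is a placeholder that only prints a message; the actual indexing command used by the image is not shown in this README.

```python
import subprocess
import sys


def run_step(command):
    """Run a command as a child process and return its captured standard output.

    check=True raises CalledProcessError if the command exits with a
    non-zero status, so a failed step is not silently ignored.
    """
    result = subprocess.run(command, check=True, capture_output=True, text=True)
    return result.stdout


if __name__ == "__main__":
    # Placeholder command standing in for the real indexing step.
    placeholder = [sys.executable, "-c", "print('indexing started')"]
    print(run_step(placeholder), end="")
```

Passing the command as a list rather than a single string avoids shell quoting issues with paths that contain spaces.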

search

The search script starts the ranking for the previously specified run.
