snapbug / questionable-qa

Everything required to get the results presented in the Questionable QA paper.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Questionable Answers

This repository contains everything required to completely replicate the results presented in:

Matt Crane. "Questionable Answers in Question Answering Research: Reproducibility and Variability of Published Results". In: Transactions of the Association for Computational Linguistics 6 (2018), pp. 241–252. url: https://transacl.org/ojs/index.php/tacl/article/view/1299.

Status

Unfortunately, the upstream repository castorini/castor has diverged due to history rewriting changes, so the changesets don't match the official current repository.

Unfortunately this repository was not forked in time to capture the cf0e269 SHA from the official repository before that repositories history was re-written. This means that if building from source, you'll have a different SHA which is used to build this image. The setup.sh script will make this change, the contents of which can be verified against the official repository diff.

Running setup.sh build will build the docker images from the source, including making the un-captured change above, while setup.sh pull will pull the prebuilt docker images.

Requirements

Running on GPU

nvidia-docker is required to run the GPU based experiments, and for these experiments version 1 was used. This has since been deprecated by nVidia in favour of version 2. The results should be the same, but for guarantees install version 1.

Building images

The embeddings used by the network should be downloaded from Aliaksei Severyn's shared file (520MB), and placed in the working directory for this repository. The docker image builder will verify checksums to ensure that the same file is used.

Pulling images

All the docker images generated are available online to download/run without having to be built from scratch. These are listed on Docker hub

By default the setup.sh script if run with will pull all the tagged images, this can take a substantial amount of disk space, even though they share a lot of commonality. If you only, for example, want to replicate the math library experiments, then manually pull the required images. Look at run.sh for which images are required for which experiments.

Image Figure/Table Notes
sha-* Table 4 See note above regarding sha-cf0e269
pytorch-* Table 5
*mkl Table 6
sha-cf0e269 or mkl or pytorch-0.1.12 Table 7
sha-cf0e269 or mkl or pytorch-0.1.12 Table 8
sha-cf0e269 or mkl or pytorch-0.1.12 Figure 2 (left) Just the CPU seeds
sha-cf0e269 or mkl or pytorch-0.1.12 Figure 2 (right) Just the GPU seeds
sha-cf0e269 or mkl or pytorch-0.1.12 Figure 2 Both CPU and GPU seeds
Figure 3 Use the output from the logs of run.sh seeds
Table 9 Use the output from the logs of run.sh seeds

Replication

run.sh will successfully replicate all the experiments in the paper using either the built docker images, or pulled docker images from setup.sh. It takes a single argument that specifies which experiments to run.

Argument Figure/Table Notes
all All of the experiments
network Table 4
pytorch Table 5
mathlib Table 6
thread Table 7
gpu Table 8
seeds-cpu Figure 2 (left) Just the CPU seeds
seeds-gpu Figure 2 (right) Just the GPU seeds
seeds Figure 2 Both CPU and GPU seeds
Figure 3 Use the output from the logs of run.sh seeds
Table 9 Use the output from the logs of run.sh seeds

Log files are generated in the form qqa.[dataset].log.[experiment], at the end of training the network performs a feed-forward pass of the datasets, which is where the numbers for the paper are extracted.

Model files will be generated in the form: qqa.[dataset].model.[experiment], to allow for feed-forward verification, or re-creation of the results without retraining the network. These models, in my experimentation, are reproducible across different hardware setups, although I would be interested in hearing of situations where they aren't.

Issues

If you encounter any issues with the scripts etc. in this repository, then either file an issue on github, or email me.

About

Everything required to get the results presented in the Questionable QA paper.


Languages

Language:Shell 100.0%