This repository contains code by Mor Geva, Ankit Gupta and Tomer Wolfson for our paper, "Break It Down: A Question Understanding Benchmark" (TACL 2020). The repository features the codebase and models from our paper.
For the Break dataset, please refer to: https://allenai.github.io/Break
Break is a human-annotated dataset of natural language questions and their Question Decomposition Meaning Representations (QDMRs). Break consists of 83,978 examples sampled from 10 question answering datasets over text, images and databases.
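To make the idea of a QDMR concrete, here is a minimal sketch of splitting a decomposition string into ordered steps and the earlier steps each one refers to. It assumes a decomposition is written as `;`-separated steps that point back to earlier steps with `#N`; the example question, the `Step` type and the `parse_qdmr` helper are illustrative and not part of this repository's code, so check the released data files for the exact format.

```python
import re
from typing import List, NamedTuple


class Step(NamedTuple):
    index: int        # 1-based position of the step
    text: str         # step text, e.g. "return #1 from Denver"
    refs: List[int]   # indices of earlier steps referenced via "#N"


def parse_qdmr(decomposition: str, sep: str = ";") -> List[Step]:
    """Split a decomposition string into steps (assumed ';'-separated)."""
    steps: List[Step] = []
    for raw in decomposition.split(sep):
        text = raw.strip()
        if not text:
            continue  # ignore empty fragments such as a trailing separator
        refs = [int(n) for n in re.findall(r"#(\d+)", text)]
        steps.append(Step(len(steps) + 1, text, refs))
    return steps


# Hypothetical decomposition of "What flights go from Denver to Philadelphia?"
example = "return flights ;return #1 from Denver ;return #2 to Philadelphia"
for step in parse_qdmr(example):
    print(step.index, step.text, step.refs)
```

Each step is kept together with the indices it depends on, so the decomposition can be treated as a small dependency graph over the question's sub-parts.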
- 4/10/2020: Pretrained QDMR parsing models are now available.
- 2/24/2020: Open-domain QA experiments are now available.
- 2/20/2020: QDMR parsing models and evaluation are now available.
- 2/1/2020: The full dataset has been publicly released at https://allenai.github.io/Break.
The repository features:
- The QDMR Parsing models, by Mor Geva
- The Open-domain QA models utilizing QDMR, by Ankit Gupta
- The annotation pipeline of Break
- Code for converting QDMR to logical form
@article{Wolfson2020Break,
title={Break It Down: A Question Understanding Benchmark},
author={Wolfson, Tomer and Geva, Mor and Gupta, Ankit and Gardner, Matt and Goldberg, Yoav and Deutch, Daniel and Berant, Jonathan},
journal={Transactions of the Association for Computational Linguistics},
year={2020},
}