This repository contains the code for NC-BERT: Exploiting Numerical-Contextual Knowledge to Improve Numerical Reasoning in Question Answering.
NC-BERT is a numerical reasoning QA model that handles discrete reasoning (e.g., addition, subtraction, counting) to answer a question based on the given passage.
The target task is DROP, a numerical question answering dataset from AllenNLP.
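For illustration, a DROP-style instance looks roughly like the following (a made-up example, not taken from the dataset):

```python
# A made-up DROP-style instance (illustrative only, not from the dataset):
# answering it requires subtracting two numbers that appear in the passage.
example = {
    "passage": "The Bears scored 24 points in the first half "
               "and 13 points in the second half.",
    "question": "How many more points did the Bears score in the first half "
                "than in the second half?",
    "answer": "11",  # 24 - 13, obtained by discrete reasoning over passage numbers
}
```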
Our model leverages a novel attention masking scheme (namely, the NC-Mask) to:
- Reduce over-reliance on parametric knowledge by inducing the model to leverage number-related contextual knowledge.
- Thereby enable the model to correctly interpret the numbers in the passage, improving its numerical reasoning performance (a rough sketch of the masking idea follows below).
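As a rough, hypothetical illustration of what such a number-aware attention mask could look like (the function name, the number/context-window policy, and all parameters below are assumptions made for this sketch; the actual NC-Mask construction lives in the `pre_training` code):

```python
import torch

def build_nc_mask(seq_len, number_positions, window=16):
    """Hypothetical numerical-contextual attention mask (illustrative only).

    In this toy variant, each query token may attend to (i) all number
    tokens and (ii) the tokens in its own local context window; every
    other position is blocked.  The real NC-Mask may differ.
    """
    allowed = torch.zeros(seq_len, seq_len, dtype=torch.bool)

    # (i) every token may attend to the number tokens
    allowed[:, number_positions] = True

    # (ii) every token may attend to its local context window
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        allowed[i, lo:hi] = True

    # Additive form expected by transformer attention layers:
    # 0 where attention is allowed, -inf where it is blocked.
    additive = torch.full((seq_len, seq_len), float("-inf"))
    additive[allowed] = 0.0
    return additive

# Example: a 64-token sequence whose number tokens sit at positions 5, 17, and 42.
nc_mask = build_nc_mask(seq_len=64, number_positions=[5, 17, 42])
```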
We also provide the code for pre-training the ALBERT-xxlarge-v2
model as the initial backbone of the NC-BERT model (in this case, the NC-ALBERT).
The NC-ALBERT model, unlike its BERT counterpart, is trained using the sentence order prediction (SOP) task along with the masked language modeling (MLM) task (Lan et al., 2019).
Note
- The sentence order prediction is implemented not on the "sentence level," but on the "text chunk level" (see the sketch below).
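For reference, here is a minimal sketch of what MLM plus chunk-level SOP pre-training could look like with the Hugging Face `transformers` `AlbertForPreTraining` head. The helper functions, the 50/50 chunk-swapping policy, and the simplified masking (always `[MASK]`, no 80/10/10 split) are assumptions of this sketch, not the exact scripts in the `pre_training` dir:

```python
import random
import torch
from transformers import AlbertForPreTraining, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("albert-xxlarge-v2")
model = AlbertForPreTraining.from_pretrained("albert-xxlarge-v2")

def make_chunk_sop_example(text, max_length=128):
    """Split the text into two chunks (not sentences) and swap them
    with probability 0.5; label 0 = original order, 1 = swapped."""
    words = text.split()
    half = len(words) // 2
    chunk_a, chunk_b = " ".join(words[:half]), " ".join(words[half:])
    swapped = random.random() < 0.5
    if swapped:
        chunk_a, chunk_b = chunk_b, chunk_a
    enc = tokenizer(chunk_a, chunk_b, truncation=True,
                    max_length=max_length, return_tensors="pt")
    return enc, torch.tensor([1 if swapped else 0])

def mask_tokens(input_ids, mlm_probability=0.15):
    """Simplified MLM masking: ~15% of non-special tokens become [MASK];
    labels are -100 everywhere except the masked positions."""
    labels = input_ids.clone()
    special = torch.tensor(
        tokenizer.get_special_tokens_mask(input_ids[0].tolist(),
                                          already_has_special_tokens=True),
        dtype=torch.bool).unsqueeze(0)
    prob = torch.full(labels.shape, mlm_probability)
    prob.masked_fill_(special, 0.0)
    masked = torch.bernoulli(prob).bool()
    labels[~masked] = -100
    masked_ids = input_ids.clone()
    masked_ids[masked] = tokenizer.mask_token_id
    return masked_ids, labels

enc, sop_label = make_chunk_sop_example(
    "In 2007 the city reported 364 crimes . The number dropped to 298 in 2008 .")
masked_ids, mlm_labels = mask_tokens(enc["input_ids"])

outputs = model(input_ids=masked_ids,
                attention_mask=enc["attention_mask"],
                token_type_ids=enc["token_type_ids"],
                labels=mlm_labels,
                sentence_order_label=sop_label)
loss = outputs.loss  # combined MLM + SOP loss; backprop in a normal training loop
```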
The repository contains:
- Implementation, pre-training, and finetuning of NC-BERT on MLM / synthetic data / DROP / SQuAD (in the `pre_training` dir)
- Code and vocabularies for textual data generation (in the `textual_data_generation` dir)
- Code for numerical data generation (in the `pre_training/numeric_data_generation` dir)
Instructions for downloading the data and the pre-trained baseline models are in the README of the `pre_training` dir.
This repository builds on the codebase of Geva et al.