
This repo provides code and data used in our TANDA paper.

TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection

We put together a script, data, and trained models used in our paper. In a nutshell, TANDA is a technique for fine-tuning pre-trained Transformer models sequentially in two steps:

  • first, transfer a pre-trained model to a model for a general task by fine-tuning it on a large and high-quality dataset;
  • then, perform a second fine-tuning step to adapt the transferred model to the target domain.
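The two steps above can be sketched with a toy stand-in. This is purely illustrative (plain gradient descent on a one-parameter linear model replaces Transformer fine-tuning, and the datasets are synthetic), but it shows the shape of the procedure, including the lower learning rate typically used in the adapt step:

```python
import random

def fine_tune(w, data, lr, epochs=50):
    """One fine-tuning stage: per-example gradient descent on squared error."""
    for _ in range(epochs):
        for x, y in data:
            grad = 2 * (w * x - y) * x
            w -= lr * grad
    return w

random.seed(0)
# Step 1 (transfer): a large, high-quality "general task" dataset (y near 3x).
general = [(x, 3 * x + random.gauss(0, 0.1)) for x in [i / 10 for i in range(10)]]
# Step 2 (adapt): a small target-domain dataset with a slightly shifted slope.
target = [(x, 3.2 * x) for x in [0.5, 1.0, 1.5]]

w = 0.0                             # "pre-trained" starting point
w = fine_tune(w, general, lr=0.05)  # transfer step: higher learning rate
w = fine_tune(w, target, lr=0.005)  # adapt step: lower learning rate
print(round(w, 2))
```

The adapt step starts from the transferred weight rather than from scratch, which is the core idea of TANDA.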

Script

We base our implementation on the transformers package. The following commands check out the package and apply a patch that enables the sequential fine-tuning option.

git clone https://github.com/huggingface/transformers.git
cd transformers
git checkout f3386 -b tanda-sequential-finetuning
git apply tanda-sequential-finetuning-with-asnq.diff
  • f3386 is the latest commit as of Sun Nov 17 18:08:51 2019 +0900, and tanda-sequential-finetuning-with-asnq.diff is the diff that enables the option.

For example, to transfer with ASNQ and adapt with a target dataset:

  • download the ASNQ dataset and the target dataset (e.g. Wiki-QA, formatted similarly to ASNQ), and
  • run the following two commands
python run_glue.py \
    --model_type bert \
    --model_name_or_path bert-base-uncased \
    --task_name ASNQ \
    --do_train \
    --do_eval \
    --do_lower_case \
    --data_dir [PATH-TO-ASNQ] \
    --per_gpu_train_batch_size 150 \
    --learning_rate 2e-5 \
    --num_train_epochs 2.0 \
    --output_dir [PATH-TO-TRANSFER-FOLDER]

python run_glue.py \
    --model_type bert \
    --model_name_or_path [PATH-TO-TRANSFER-FOLDER] \
    --task_name ASNQ \
    --do_train \
    --do_eval \
    --sequential \
    --do_lower_case \
    --data_dir [PATH-TO-WIKI-QA] \
    --per_gpu_train_batch_size 150 \
    --learning_rate 1e-6 \
    --num_train_epochs 2.0 \
    --output_dir [PATH-TO-OUTPUT-FOLDER]
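Answer sentence selection is typically evaluated with MAP and MRR over the candidate sentences ranked by model score. A minimal, self-contained sketch of these two metrics (the example rankings below are invented for illustration):

```python
def average_precision(labels_ranked):
    """AP for one question: 0/1 relevance labels in model-score order."""
    hits, total = 0, 0.0
    for rank, label in enumerate(labels_ranked, start=1):
        if label:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

def reciprocal_rank(labels_ranked):
    """1 / rank of the first correct sentence, 0 if none is correct."""
    for rank, label in enumerate(labels_ranked, start=1):
        if label:
            return 1.0 / rank
    return 0.0

# One list of 0/1 relevance labels per question, sorted by model score.
rankings = [
    [0, 1, 0, 1],   # correct answers at ranks 2 and 4
    [1, 0, 0],      # correct answer at rank 1
]
map_score = sum(average_precision(r) for r in rankings) / len(rankings)
mrr_score = sum(reciprocal_rank(r) for r in rankings) / len(rankings)
print(map_score, mrr_score)  # prints: 0.75 0.75
```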

Data

We use the following datasets in the paper:

Answer-Sentence Natural Questions (ASNQ)

  • ASNQ is a dataset for answer sentence selection derived from the Google Natural Questions (NQ) dataset (Kwiatkowski et al. 2019). See our paper for the dataset details.
  • ASNQ is used to transfer the pre-trained models in the paper, and can be downloaded here.

Domain Datasets

  • Wiki-QA: we used the Wiki-QA dataset from here and removed all the questions that have no correct answers.
  • TREC-QA: we used the *-filtered.jsonl version of this dataset from here.
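Since the target dataset is expected in an ASNQ-like layout, a small conversion and filtering pass is usually needed. The sketch below assumes a simple tab-separated question / sentence / 0-1 label layout (an assumption for illustration; check the released ASNQ files for the exact columns and label scheme) and shows the Wiki-QA filtering mentioned above, i.e. dropping questions that have no correct answer:

```python
import csv
import io
from collections import defaultdict

def to_rows(tsv_text):
    """Parse a Wiki-QA-style TSV (question, sentence, 0/1 label) into tuples."""
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t")
    return [(q, s, int(label)) for q, s, label in reader]

def drop_all_negative_questions(rows):
    """Remove every question that has no correct answer sentence."""
    positives = defaultdict(int)
    for q, _, label in rows:
        positives[q] += label
    return [r for r in rows if positives[r[0]] > 0]

raw = (
    "who wrote hamlet\tHamlet is a tragedy by William Shakespeare.\t1\n"
    "who wrote hamlet\tIt is set in Denmark.\t0\n"
    "capital of mars\tMars is the fourth planet from the Sun.\t0\n"
)
kept = drop_all_negative_questions(to_rows(raw))
print(len(kept))  # the all-negative question is dropped, leaving 2 rows
```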

Models

Models Transferred on ASNQ

TANDA: Models Transferred on ASNQ, then Fine-Tuned with Wiki-QA

TANDA: Models Transferred on ASNQ, then Fine-Tuned with TREC-QA

How To Cite TANDA

The paper will appear in the AAAI 2020 proceedings. For now, please cite the arXiv version:

@article{garg2019tanda,
  title={TANDA: Transfer and Adapt Pre-Trained Transformer Models for Answer Sentence Selection},
  author={Siddhant Garg and Thuy Vu and Alessandro Moschitti},
  year={2019},
  eprint={1911.04118},
  archivePrefix={arXiv},
}

License Summary

The documentation, including the shared data and models, is made available under the Creative Commons Attribution-ShareAlike 3.0 Unported License. See the LICENSE file.

The sample script within this documentation is made available under the MIT-0 license. See the LICENSE-SAMPLECODE file.

Contact

For help or issues, please submit a GitHub issue.

For direct communication, please contact Siddhant Garg (sgarg33 is at wisc dot edu, https://github.com/sid7954), Thuy Vu (thuyvu is at amazon dot com), or Alessandro Moschitti (amosch is at amazon dot com).
