songfeng / sharedtask-dialdoc2021

DialDoc21: Shared Task on Doc2Dial Dataset

The DialDoc21 Shared Task at ACL 2021 includes two subtasks for building goal-oriented document-grounded dialogue systems. The first subtask is to predict the grounding in the given document for the next agent response; the second subtask is to generate the agent response in natural language given the contexts.

Data

This shared task is based on Doc2Dial v1.0.1, located in the folder data/doc2dial. For more information about the dataset, please refer to the README, the paper, and the Doc2Dial Project Page.

Note: you can choose to utilize other public datasets in addition to Doc2Dial data for training. See example here.

Shared Task

Subtask 1

The task is to predict the knowledge grounding, in the form of a document span, for the next agent response given the dialogue history and the associated documents.

  • Input: the associated document and dialogue history.

  • Output: the grounding text.

  • Evaluation: exact match and F1 scores. Please refer to the script for more details.
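To make the two Subtask 1 metrics concrete, here is a minimal sketch of exact match and token-level F1 between a predicted span and a reference span. It assumes SQuAD-style answer normalization (lowercasing, removing punctuation and English articles), which may differ from what the official evaluation script does; the function names `normalize`, `exact_match` and `token_f1` are illustrative and are not taken from the repository. Use the official script for any reported numbers.

```python
import re
import string
from collections import Counter


def normalize(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: SQuAD-style normalization; the official script may differ.
    """
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))


def token_f1(prediction, reference):
    """Harmonic mean of token precision and recall after normalization."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    if not pred_tokens or not ref_tokens:
        # Both empty counts as a match; one empty counts as a miss.
        return float(pred_tokens == ref_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(ref_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    print(exact_match("The DMV office.", "dmv office"))
    print(token_f1("you must renew your license", "renew your license online"))
```

A predicted span that only partially overlaps the reference gets no credit under exact match but partial credit under F1, which is why the two scores reported for the baselines differ.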

Subtask 2

The task is to generate the next agent response in natural language given dialogue-based and document-based contexts.

  • Input: the associated document and dialogue history.

  • Output: the agent's dialogue utterance.

  • Evaluation: sacrebleu and human evaluation. Please refer to the script for more details. Stay tuned for more details about human evaluations.
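For intuition about the automatic metric in Subtask 2, the following is a simplified, standard-library-only sketch of corpus-level BLEU. It is not sacrebleu: it assumes whitespace tokenization, a single reference per hypothesis, and no smoothing, so its scores will not match the official script. The function names `ngrams` and `corpus_bleu` are illustrative only.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Count all n-grams of length n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def corpus_bleu(hypotheses, references, max_n=4):
    """Unsmoothed corpus BLEU on a 0-100 scale, one reference per hypothesis.

    Assumption: whitespace tokenization; sacrebleu applies its own tokenizer
    and smoothing, so treat this as an illustration only.
    """
    matches = [0] * max_n
    totals = [0] * max_n
    hyp_len = ref_len = 0
    for hyp, ref in zip(hypotheses, references):
        hyp_tokens, ref_tokens = hyp.split(), ref.split()
        hyp_len += len(hyp_tokens)
        ref_len += len(ref_tokens)
        for n in range(1, max_n + 1):
            hyp_counts = ngrams(hyp_tokens, n)
            ref_counts = ngrams(ref_tokens, n)
            # Clipped matches: each n-gram counts at most as often as in the reference.
            matches[n - 1] += sum((hyp_counts & ref_counts).values())
            totals[n - 1] += sum(hyp_counts.values())
    if hyp_len == 0 or min(matches) == 0:
        return 0.0
    log_precision = sum(math.log(m / t) for m, t in zip(matches, totals)) / max_n
    brevity_penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return 100 * brevity_penalty * math.exp(log_precision)


if __name__ == "__main__":
    hyps = ["you can renew your license online today"]
    refs = ["you can renew your license online"]
    print(round(corpus_bleu(hyps, refs), 2))
```

The score is the geometric mean of the clipped 1- to 4-gram precisions, scaled down by a brevity penalty when the generated responses are shorter than the references.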

Baselines

Environment Setup

Create a virtual environment

conda create -n ENV_NAME python=3.7
conda activate ENV_NAME

Install PyTorch

conda install pytorch cudatoolkit=10.2 -c pytorch

Install Huggingface Transformers, Datasets and a few more dependencies

pip install -r requirements.txt

Install NVIDIA/apex

conda install -c conda-forge nvidia-apex 

Load Dataset

You can use Hugging Face Datasets to load the Doc2Dial datasets.

To load v1.0.1 for subtask 1, point to scripts/datasets/doc2dial/doc2dial.py.

from datasets import load_dataset
datasets = load_dataset("scripts/datasets/doc2dial/", "doc2dial_rc", cache_dir="YOUR_LOCAL_CACHE")

The script shows how to obtain the ID and the expected output format for an agent turn in the prediction or generation task.

Run Baseline for Subtask 1

Run HuggingFace QA on Doc2Dial

  • To fine-tune BERT on Doc2Dial:

    cd sharedtask-dialdoc2021/scripts/subtask1
    ./run_qa.sh

  • Results on validation set:

    # bert-base-uncased
    f1 = 56.29 
    exact_match = 39.73
    # bert-large-uncased-whole-word-masking
    f1 = 62.98
    exact_match = 50.50

Evaluating your model output

  • Output format and sample file

    Please see the format in the sample file.

  • Evaluation script

    Please refer to the script for evaluating your model predictions.

    python sharedtask_utils.py --task subtask1 --prediction_json sample_files/sample_prediction_subtask1.json

Run Baseline for Subtask 2

Run HuggingFace Seq2Seq on Doc2Dial

  • To generate the input files:

    First create the source and target files. Please see the run script for the required parameters and the default values of the other parameters.

    cd scripts/subtask2
    python seq2seq_utils.py --split validation --output_dir seq2seq_files

  • To fine-tune BART on Doc2Dial:

    cd scripts/subtask2
    ./run_seq2seq.sh

  • Results on validation set:

    # bart-large-cnn
    val_bleu = 17.72

Evaluating your model output

  • Output format and sample file

    Please see the format in the sample file.

  • Evaluation script

    Please refer to the script for evaluating your model predictions.

    python sharedtask_utils.py --task subtask2 --prediction_json sample_files/sample_prediction_subtask2.json

About Participation

For up-to-date information about participating in the DialDoc21 Shared Task, please check our workshop page.
