DialDoc21 Shared Task at ACL 2021 includes two subtasks for building goal-oriented document-grounded dialogue systems. The first subtask is to predict the grounding in the given document for the next agent response; the second subtask is to generate the agent response in natural language given the contexts.
This shared task is based on Doc2Dial v1.0.1 in the folder data/doc2dial. For more information about the dataset, please refer to the README, the paper, and the Doc2Dial Project Page.
Note: you can choose to utilize other public datasets in addition to the Doc2Dial data for training. See the example here.
The task is to predict the knowledge grounding, in the form of a document span, for the next agent response, given the dialogue history and the associated document.
- Input: the associated document and dialogue history.
- Output: the grounding text.
- Evaluation: exact match and F1 scores. Please refer to the script for more details.
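The exact-match and F1 metrics above follow the common SQuAD-style convention: strings are normalized, then compared exactly (EM) or by token overlap (F1). The official numbers come from the shared-task script; the sketch below only illustrates the idea, and the normalization details are assumptions.

```python
import re
import string
from collections import Counter

def normalize(text):
    """Lowercase, drop punctuation and articles, collapse whitespace (SQuAD-style)."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in set(string.punctuation))
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())

def exact_match(prediction, reference):
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize(prediction) == normalize(reference))

def f1(prediction, reference):
    """Token-overlap F1 between the normalized prediction and reference."""
    pred_tokens = normalize(prediction).split()
    ref_tokens = normalize(reference).split()
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)
```

For instance, a predicted span that differs from the reference only by an article or punctuation still gets an exact match of 1.0 after normalization.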
The task is to generate the next agent response in natural language given dialogue-based and document-based contexts.
- Input: the associated document and dialogue history.
- Output: the dialogue utterance.
- Evaluation: sacrebleu and human evaluation. Please refer to the script for more details. Stay tuned for more details about human evaluations.
Create a virtual environment
conda create -n ENV_NAME python=3.7
conda activate ENV_NAME
Install PyTorch
conda install pytorch cudatoolkit=10.2 -c pytorch
Install Huggingface Transformers, Datasets, and a few other dependencies
pip install -r requirements.txt
Install NVIDIA/apex
conda install -c conda-forge nvidia-apex
You can use Huggingface Datasets to load the Doc2Dial dataset. To load v1.0.1 for subtask 1, point to scripts/datasets/doc2dial/doc2dial.py.
from datasets import load_dataset
datasets = load_dataset("scripts/datasets/doc2dial/", "doc2dial_rc", cache_dir="YOUR_LOCAL_CACHE")
The script shows how to obtain the ID and the expected output format, given an agent turn, for the prediction or generation task.
Run HuggingFace QA on Doc2Dial
- For fine-tuning BERT on Doc2Dial:
  cd sharedtask-dialdoc2021/scripts/subtask1
  ./run_qa.sh
- Results on the validation set:
  # bert-base-uncased
  f1 = 56.29
  exact_match = 39.73
  # bert-large-uncased-whole-word-masking
  f1 = 62.98
  exact_match = 50.50
Evaluating your model output
- Output format and sample file: please see the format in the sample file.
- Evaluation script: please refer to the script for evaluating your model predictions.
  python sharedtask_utils.py --task subtask1 --prediction_json sample_files/sample_prediction_subtask1.json
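Your prediction file is a JSON document passed via --prediction_json. A minimal sketch of writing one; the exact ID scheme and field names are defined by the sample file and sharedtask_utils.py, so the "id"/"prediction_text" keys below are assumptions based on the SQuAD-style format, not the authoritative spec:

```python
import json

# Hypothetical entries for illustration only: check the sample file for the
# real ID scheme and field names before submitting.
predictions = [
    {"id": "dialogue-1_turn-3", "prediction_text": "You can file the form online."},
]

with open("my_prediction_subtask1.json", "w") as f:
    json.dump(predictions, f, indent=2)
```

Validate the resulting file against the provided sample prediction before running the evaluation script.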
Run HuggingFace Seq2Seq on Doc2Dial
- For generating the input files, we first create the source and target files. Please see the run script for the required parameters and the other default values.
  cd scripts/subtask2
  python seq2seq_utils.py --split validation --output_dir seq2seq_files
- For fine-tuning BART on Doc2Dial:
  cd scripts/subtask2
  ./run_seq2seq.sh
- Results on the validation set:
  # bart-large-cnn
  val_bleu = 17.72
Evaluating your model output
- Output format and sample file: please see the format in the sample file.
- Evaluation script: please refer to the script for evaluating your model predictions.
  python sharedtask_utils.py --task subtask2 --prediction_json sample_files/sample_prediction_subtask2.json
For the most up-to-date information about participating in the DialDoc21 Shared Task, please check our workshop page.