This repository accompanies our paper Exploring the Robustness of Task-oriented Dialogue Systems for Colloquial German Varieties (Ekaterina Artemova, Verena Blaschke, & Barbara Plank, to be published at EACL 2024). It contains code for automatically applying morphosyntactic perturbation rules to German sentences in order to mimic grammatical structures found in colloquial varieties (details in the paper).
We release this code for research purposes only, and expressly forbid usage for mockery or parody of any dialects or registers.
We implemented 18 perturbations covering a wide range of dialect phenomena in German. The code is available in the dialect_perturbations.py file, and example usage is demonstrated in the perturbation_test.ipynb notebook.
To test the perturbations, you'll need the dictionaries and word lists from the resources folder, as well as the following packages:
- SoMaJo for tokenization
- SpaCy for POS tagging
- Stanza for POS tagging and dependency parsing
- DERBI for inflection -- at the moment, the 2022 version is needed for the code to run (integrated as a submodule here)
- Pattern-de for verb conjugation
Clone the repo + submodule:
git clone --recursive git@github.com:mainlp/dialect-ToD-robustness.git
If you already cloned the repo without the recursive flag:
cd dialect-ToD-robustness
git submodule init
git submodule update
Install dependencies:
python -m pip install -r requirements.txt
or:
# pip install jupyter # optional; only for sample notebook
pip install GitPython
pip install pandas
pip install somajo
pip install stanza
pip install spacy
pip install pattern
Install the SpaCy model:
python -m spacy download de_core_news_sm
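After installation, a quick check that the packages can be found may save a failed notebook run. The following is a minimal sketch and not part of the repository: the helper name `missing_modules` is hypothetical, and the list assumes the usual import names (GitPython is imported as `git`, and the SpaCy model is installed as the package `de_core_news_sm`). It does not verify the DERBI submodule.

```python
from importlib.util import find_spec

# Assumed import names for the dependencies listed above.
# GitPython is imported as "git"; the SpaCy model installs as a package.
REQUIRED_MODULES = [
    "git",
    "pandas",
    "somajo",
    "stanza",
    "spacy",
    "pattern",
    "de_core_news_sm",
]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not found:", ", ".join(missing))
else:
    print("All listed modules were found.")
```

The check only locates the modules without importing them, so it runs quickly and does not load any models.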
The table in the human_eval folder contains the results of the human evaluation of perturbations on a Likert scale from 1 to 5. Each row corresponds to a pair of sentences where one sentence is a perturbation of the other. The columns are as follows:
- sentence: the intact sentence
- perturbed_sentence: the perturbed sentence
- perturbation: the perturbation applied
- ann_x: the score from annotator x
- ann_y: the score from annotator y
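As an illustration of how these columns can be used, the sketch below averages both annotators' scores per perturbation. It is not part of the repository. The rows are made-up placeholders rather than data from the released table, the tab delimiter is an assumption about the file format, and the function name `mean_score_per_perturbation` is hypothetical.

```python
import csv
import io
from collections import defaultdict
from statistics import mean

# Made-up rows for illustration only; they are NOT taken from the released table.
# The tab-separated layout is an assumption about the file format.
SAMPLE = (
    "sentence\tperturbed_sentence\tperturbation\tann_x\tann_y\n"
    "intact sentence 1\tperturbed sentence 1\tperturbation_A\t4\t5\n"
    "intact sentence 2\tperturbed sentence 2\tperturbation_A\t2\t3\n"
    "intact sentence 3\tperturbed sentence 3\tperturbation_B\t5\t5\n"
)


def mean_score_per_perturbation(handle, delimiter="\t"):
    """Average the ann_x and ann_y scores for each perturbation."""
    scores = defaultdict(list)
    for row in csv.DictReader(handle, delimiter=delimiter):
        # Pool both annotators' scores under the perturbation name.
        scores[row["perturbation"]].append(float(row["ann_x"]))
        scores[row["perturbation"]].append(float(row["ann_y"]))
    return {name: mean(values) for name, values in scores.items()}


averages = mean_score_per_perturbation(io.StringIO(SAMPLE))
for name, value in sorted(averages.items()):
    print(f"{name}: {value:.2f}")
```

To run this on the actual table, pass an open file handle instead of the `io.StringIO` object and adjust the delimiter to match the file.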
The folder plots contains plots used in the main part of the paper and Appendices C and D.
The folder results contains resulting tables. Each table contains intent accuracy and slot F1 values for intact and perturbed test sets.
Files are named according to the pattern '{train language}{dev language}.{test language}.{dataset}'. The suffix '1p' denotes cases where single perturbations are applied; in all other cases, all perturbations are applied simultaneously.
To replicate the perturbation rules exactly as used in the paper (without potential later improvements), use this commit.
@inproceedings{artemova-etal-2024-exploring,
author = {Artemova, Ekaterina and Blaschke, Verena and Plank, Barbara},
title = {Exploring the Robustness of Task-oriented Dialogue Systems for Colloquial German Varieties},
booktitle = {Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics},
year = {2024},
publisher = {Association for Computational Linguistics},
note = {To appear},
}