This package contains outputs of participating systems and raw human ratings for the E2E NLG Challenge. The systems have been trained and tested on the E2E NLG Dataset to generate restaurant recommendations from a flat meaning representation (attribute-value sets).
See the Challenge website and the detailed results draft paper on arXiv for more information.
The directory structure is the following:
* `participants.tsv` -- "legend" to labels for participating teams (these are used for directory names and in system labels in the human rating files; see below for primary system variants)
* `human_ratings/` -- raw human ratings
* `system_outputs/primary/` -- outputs of primary systems (these were used for human ratings)
* `system_outputs/all/` -- outputs of all systems submitted to the challenge (including primary)
The system output files are TSV files with UTF-8 encoding, containing two tab-separated columns:
* `MR` = the input MR
* `output` = the corresponding system output
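The two-column layout described above can be read with Python's standard `csv` module. The following is a minimal sketch, not part of the package: it assumes each file starts with a header row naming the columns `MR` and `output`, and the helper name `read_system_outputs` and the sample row are made up for illustration.

```python
import csv
import io


def read_system_outputs(handle):
    """Return a list of (mr, output) pairs from an open TSV file handle.

    Assumption: the first row is a header with the columns "MR" and "output".
    """
    reader = csv.DictReader(handle, delimiter="\t")
    return [(row["MR"], row["output"]) for row in reader]


# Tiny in-memory stand-in for a real file; the row content is invented.
sample = io.StringIO(
    "MR\toutput\n"
    "name[The Mill], eatType[pub]\tThe Mill is a pub.\n"
)
pairs = read_system_outputs(sample)
```

For a file on disk, pass a handle opened with `open(path, encoding="utf-8", newline="")`, where `path` points to one of the TSV files under `system_outputs/`.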
In the human ratings CSV files, these columns are of interest:
* `mr` = the input MR
* `sys1`-`sys5` = system labels
* `ref1`-`ref5` = corresponding system outputs
* `quality1`-`quality5` / `natur1`-`natur5` = corresponding human ratings
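Because each row of a ratings file holds up to five systems side by side, it is often convenient to unpack it into one record per system. The sketch below shows one way to do that under stated assumptions: the column names are taken from the list above, a file is assumed to carry either the `quality` or the `natur` columns (a missing one is returned as `None`), and rating values are kept as strings because their scale is not described here. The helper name `iter_ratings` and the sample values are hypothetical.

```python
import csv
import io


def iter_ratings(handle, n_slots=5):
    """Yield one dict per (row, system slot) from an open ratings CSV handle."""
    for row in csv.DictReader(handle):
        for i in range(1, n_slots + 1):
            yield {
                "mr": row["mr"],
                "system": row[f"sys{i}"],
                "output": row[f"ref{i}"],
                # Either rating type may be absent from a given file.
                "quality": row.get(f"quality{i}"),
                "naturalness": row.get(f"natur{i}"),
            }


# Invented two-slot sample; real files use slots 1 to 5.
sample = io.StringIO(
    "mr,sys1,sys2,ref1,ref2,quality1,quality2\n"
    '"name[The Mill], eatType[pub]",tgen,slug,'
    "The Mill is a pub.,The Mill is a pub place.,80,60\n"
)
records = list(iter_ratings(sample, n_slots=2))
```

The `system` field of each record can then be looked up in `participants.tsv` or in the label table to recover the team and architecture.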
The following system labels are used in the human rating files (they correspond to the labeling used in the paper):
label | affiliation | architecture |
---|---|---|
adapt | AdaptCentre | seq2seq |
chen | Harbin Institute of Technology | seq2seq |
dangnt | University of Information Technology VNU-HCM | rule-based |
forge1 | Pompeu Fabra University | rule-based |
forge3 | Pompeu Fabra University | templates |
gong | Harbin Institute of Technology | seq2seq |
harv | HarvardNLP | seq2seq |
nle | NAVER Labs Europe | seq2seq |
sheff1 | Sheffield NLP | data-driven |
sheff2 | Sheffield NLP | seq2seq |
slug | UC Santa Cruz (Slug2Slug) | seq2seq |
slug-alt | UC Santa Cruz (Slug2Slug) | seq2seq |
tgen | Heriot-Watt University (baseline) | seq2seq |
tnt1 | UC Santa Cruz (TNT-NLG) | seq2seq |
tnt2 | UC Santa Cruz (TNT-NLG) | seq2seq |
tr1 | Thomson Reuters NLP | seq2seq |
tr2 | Thomson Reuters NLP | templates |
tuda | UKP TU Darmstadt | templates |
zhang | Xiamen University | seq2seq |
zhaw1 | Zürcher Hochschule für Angewandte Wissenschaften | data-driven |
zhaw2 | Zürcher Hochschule für Angewandte Wissenschaften | data-driven |
Ondřej Dušek, Jekaterina Novikova & Verena Rieser, Heriot-Watt University
To cite this data, please refer to this paper:
@article{dusek_evaluating_2019,
title = {Evaluating the {State}-of-the-{Art} of {End}-to-{End} {Natural} {Language} {Generation}: {The} {E}2E {NLG} {Challenge}},
journal = {arXiv:1901.07931 [cs]},
author = {Dušek, Ondřej and Novikova, Jekaterina and Rieser, Verena},
month = jan,
year = {2019},
url = {http://arxiv.org/abs/1901.07931},