Pipeline vs. End-to-End Architecture to Neural Data-to-Text

This is the code used to obtain the results reported in the manuscript "Neural data-to-text generation: A comparison between pipeline and end-to-end architectures".

The file main.sh is the main script of this project. Run it to extract the intermediate representations from the data, train the models, evaluate each step of the pipeline approach (reported in Section 6 of the paper), and generate text from the non-linguistic input with each model (reported in Section 7 of the paper).

To run the script, first install the Python dependencies by running the following command:

pip install -r requirements.txt

Then update the root and dependency paths in vars. This code depends on Moses, Nematus and Subword NMT. Once the paths are set, the script can be executed:

./main.sh
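
Because main.sh relies on external tools, it may help to confirm that the configured dependency paths point at real directories before launching a long run. The following is a minimal sketch of such a check, not part of the project itself: the function check_paths and the variable names MOSES_DIR, NEMATUS_DIR and SUBWORD_NMT_DIR are hypothetical and are not necessarily the names used in vars.

```shell
#!/usr/bin/env bash
# Pre-flight check: report every given directory that does not exist.
# check_paths is a hypothetical helper, not part of this repository.

check_paths() {
    local missing=0
    local dir
    for dir in "$@"; do
        if [ ! -d "$dir" ]; then
            echo "missing directory: $dir" >&2
            missing=1
        fi
    done
    # Returns 0 when all directories exist, 1 otherwise.
    return "$missing"
}

# Example usage (variable names and paths are assumptions; adjust to your setup):
# MOSES_DIR=/path/to/mosesdecoder
# NEMATUS_DIR=/path/to/nematus
# SUBWORD_NMT_DIR=/path/to/subword-nmt
# check_paths "$MOSES_DIR" "$NEMATUS_DIR" "$SUBWORD_NMT_DIR" && ./main.sh
```

Chaining the check with && means main.sh only starts when every listed directory is present.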

The augmented version of the WebNLG corpus is available here. To see information about the evaluation, go to the evaluation folder.

Reproducibility

For reproducibility reasons, the data used in the experiments can be found here, whereas the results are here.

About

A systematic comparison between pipeline and end-to-end architectures in the RDF-to-text task
