XinyuHua / textgen-emnlp19

Code for our EMNLP 2019 paper titled "Sentence-Level Content Planning and Style Specification for Neural Text Generation"

About

This repository contains code for the following paper:

Xinyu Hua and Lu Wang. "Sentence-Level Content Planning and Style Specification for Neural Text Generation." EMNLP 2019.

If you find our work useful, please cite:

@inproceedings{hua-wang-2019-sentence,
    title = "Sentence-Level Content Planning and Style Specification for Neural Text Generation",
    author = "Hua, Xinyu  and
              Wang, Lu",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    year = "2019",
    address = "Hong Kong, China",
    publisher = "Association for Computational Linguistics",
}

Dataset

download link: link

arggen test set with target arguments: link

arggen untokenized datasets: link

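| task    | # tokens target | # keyphrase | source                  |
|---------|-----------------|-------------|-------------------------|
| arggen  | 54.87           | 55.80       | changemyview            |
| wikigen | 70.57/48.60     | 23.56       | Normal/Simple Wikipedia |
| absgen  | 141.34          | 12.23       | AGENDA                  |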

Errata: prior to March 4, 2020, the vocabulary file for absgen had a problem. If you are using the absgen model, please replace the old vocab.txt with the new one.

Quickstart

note: all actions below assume src/ to be the working directory.

To train an argument generation model:

python main.py --mode=train \
    --exp_name=arggen_exp \
    --encode_passage \
    --type_conditional_lm \
    --task=arggen \
    --batch_size=30 \
    --num_train_epochs=30 \
    --logging_freq=2 \
    --max_src_words=500 \
    --max_passage_words=400 \
    --max_sent_num=10 \
    --max_bank_size=70

To train an abstract generation model, which has no sentence-level style labels:

python main.py --mode=train \
    --exp_name=absgen_exp \
    --task=absgen \
    --batch_size=30 \
    --num_train_epochs=30 \
    --max_src_words=1000 \
    --max_bank_size=30 \
    --logging_freq=2

To train a Wikipedia generation model:

python main.py --mode=train \
    --exp_name=wikigen_exp \
    --type_conditional_lm \
    --task=wikigen \
    --batch_size=30 \
    --max_bank_size=30 \
    --num_train_epochs=30 \
    --max_src_words=1000 \
    --logging_freq=2
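
The three training commands above differ only in a handful of flags, so it can be convenient to keep them in one place. The following is a minimal sketch, not part of this repository: the `TASK_FLAGS` table and the `build_command` helper are hypothetical names, and the flag values are simply copied from the commands above. It only assembles and prints the command lines; it does not import or run `main.py`.

```python
import shlex

# Flags copied from the three training commands above (hypothetical table,
# not part of the repository). Switches that take no value, such as
# --encode_passage, are stored as True.
TASK_FLAGS = {
    "arggen": {
        "encode_passage": True,
        "type_conditional_lm": True,
        "batch_size": 30,
        "num_train_epochs": 30,
        "logging_freq": 2,
        "max_src_words": 500,
        "max_passage_words": 400,
        "max_sent_num": 10,
        "max_bank_size": 70,
    },
    "absgen": {
        "batch_size": 30,
        "num_train_epochs": 30,
        "max_src_words": 1000,
        "max_bank_size": 30,
        "logging_freq": 2,
    },
    "wikigen": {
        "type_conditional_lm": True,
        "batch_size": 30,
        "max_bank_size": 30,
        "num_train_epochs": 30,
        "max_src_words": 1000,
        "logging_freq": 2,
    },
}


def build_command(task, exp_name=None):
    """Return the argument list that trains `task` (hypothetical helper)."""
    if task not in TASK_FLAGS:
        raise ValueError(f"unknown task: {task!r}")
    argv = [
        "python",
        "main.py",
        "--mode=train",
        f"--exp_name={exp_name or task + '_exp'}",
        f"--task={task}",
    ]
    for name, value in TASK_FLAGS[task].items():
        if value is True:
            argv.append(f"--{name}")          # value-less switch
        else:
            argv.append(f"--{name}={value}")  # --name=value pair
    return argv


if __name__ == "__main__":
    # Print one shell-ready command per task (shlex.join needs Python 3.8+).
    for task in TASK_FLAGS:
        print(shlex.join(build_command(task)))
```

To launch a run rather than print it, the returned list could be passed to `subprocess.run(argv, cwd="src", check=True)`, which matches the note that `src/` is the working directory.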

License

See the LICENSE file for details.

License:MIT License