Effective Estimation of Deep Generative Language Models

Overview

This repository contains the code needed to run the experiments presented in the paper Effective Estimation of Deep Generative Language Models [1].

Setup

To start experimenting, clone the repository to your local machine and install its dependencies.
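
For example, assuming the repository lives at its usual GitHub location (tom-pelsmaeker/deep-generative-lm), you can clone it together with its submodule in one go:

git clone --recurse-submodules https://github.com/tom-pelsmaeker/deep-generative-lm.git
cd deep-generative-lm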

Quick Start

  1. Download and pre-process the Penn Treebank data; see the data folder.
  2. Train an RNNLM:
./main.py --model deterministic --mode train --pre_def 1 --ptb_type mik
  3. Train a default SenVAE:
./main.py --model bowman --mode train --pre_def 1 --ptb_type mik
  4. Train a SenVAE with a target rate of 5 nats, using MDR:
./main.py --model bowman --mode train --pre_def 1 --ptb_type mik --lagrangian 1 --min_rate 5 --save_suffix mdr
  5. Train a SenVAE with a MoG prior and a target rate of 5 nats, using MDR:
./main.py --model flowbowman --prior mog --mode train --pre_def 1 --ptb_type mik --lagrangian 1 --min_rate 5 --save_suffix mog
  6. Evaluate the models:
./main.py --model deterministic --mode test --pre_def 1 --ptb_type mik
./main.py --model bowman --mode test --pre_def 1 --ptb_type mik
./main.py --model bowman --mode test --pre_def 1 --ptb_type mik --lagrangian 1 --min_rate 5 --save_suffix mdr
./main.py --model flowbowman --prior mog --mode test --pre_def 1 --ptb_type mik --lagrangian 1 --min_rate 5 --save_suffix mog
  7. Print some samples:
./main.py --model deterministic --mode qualitative --pre_def 1 --ptb_type mik
./main.py --model bowman --mode qualitative --pre_def 1 --ptb_type mik
./main.py --model bowman --mode qualitative --pre_def 1 --ptb_type mik --lagrangian 1 --min_rate 5 --save_suffix mdr
./main.py --model flowbowman --prior mog --mode qualitative --pre_def 1 --ptb_type mik --lagrangian 1 --min_rate 5 --save_suffix mog

Yahoo and Yelp experiments

We used a fork of the vae-lagging-encoder repo to run the experiments on the Yahoo and Yelp data. Please use the submodule to recreate these experiments; it has been modified to support training with MDR and the MoG prior.
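
If you cloned without --recurse-submodules, standard git commands will fetch the fork (its path is whatever the repository's .gitmodules defines):

git submodule update --init --recursive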

Structure

  • main.py: the main script that handles all command line arguments.
  • dataset: expected location of data. Contains code for preprocessing and batching PTB.
  • model: all components for the various models tested in the paper.
  • scripts: various scripts for training, testing, Bayesian optimization, etc.
  • util: utility functions for storage, evaluation and more.

Settings

There are many command-line settings available to tweak the experimental setup. Please see the settings file for a complete overview. Here we highlight the most important ones:

  • --script: [generative|bayesopt|grid] chooses which script to run. generative trains/tests a single model; bayesopt and grid run Bayesian optimization and grid search, respectively. Please see the scripts for more information about their usage.
  • --mode: [train|test|novelty|qualitative] selects which mode to run the generative script in.
  • --save_suffix: gives your model a name.
  • --seed: sets a random seed.
  • --model: [deterministic|bowman|flowbowman] the model to use. deterministic refers to the RNNLM, bowman to the SenVAE, and flowbowman to the SenVAE with expressive latent structure.
  • --lagrangian: set to 1 to use the MDR objective (see the sketch after this list).
  • --min_rate: specifies a minimum rate, in nats.
  • --flow: [diag|iaf|vpiaf|planar] the type of flow to use with the flowbowman model.
  • --prior: [weak|mog|vamp] the type of prior to use with the flowbowman model.
  • --data_folder: path to your pre-processed data.
  • --out_folder: path where experiments are stored.
  • --ptb_type: [|mik|dyer] chooses between simple (mik) and expressive (dyer) unked PTB. Paths to the out and data folders are set automatically.
  • --pre_def: set to 1 to use encoder-decoder hyperparameters that match those in the paper.
  • --local_rank: which GPU to use. Set to -1 to run on the CPU.
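
For intuition, here is a minimal sketch of the kind of update the MDR objective performs: the ELBO is maximized subject to the constraint KL >= min_rate by introducing a Lagrange multiplier trained with gradient ascent on the constraint. This is an illustrative toy, not the repository's implementation; the variational parameter, optimizers, and the toy KL/likelihood terms below are all hypothetical.

import torch

min_rate = 5.0                             # target rate in nats (cf. --min_rate)
mu = torch.randn(8, requires_grad=True)    # hypothetical variational parameter
lamb = torch.zeros(1, requires_grad=True)  # Lagrange multiplier for the rate constraint
opt_model = torch.optim.Adam([mu], lr=1e-2)
opt_lamb = torch.optim.SGD([lamb], lr=1e-2)

for _ in range(1000):
    kl = 0.5 * (mu ** 2).sum()             # toy KL(q||p) for unit-variance Gaussians
    log_lik = -((mu - 1.0) ** 2).sum()     # toy reconstruction term
    # Relaxed objective: the model minimizes it, the multiplier maximizes it.
    loss = -(log_lik - kl) + lamb * (min_rate - kl)
    opt_model.zero_grad()
    opt_lamb.zero_grad()
    loss.backward()
    opt_model.step()
    lamb.grad.neg_()                       # flip the sign: gradient ascent on lamb
    opt_lamb.step()
    lamb.data.clamp_(min=0.0)              # the multiplier stays non-negative

Whenever the rate (KL) falls below min_rate, the multiplier grows and pushes the model to encode more information in the latent code; once the constraint is satisfied, the multiplier shrinks back toward zero.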

Citation

If you use this code in your project, please cite:

[1] Pelsmaeker, T., and Aziz, W. (2019). Effective Estimation of Deep Generative Language Models. arXiv preprint arXiv:1904.08194.

BibTeX format:

@article{effective2019pelsmaeker,
  title={Effective Estimation of Deep Generative Language Models},
  author={Pelsmaeker, Tom and
          Aziz, Wilker},
  journal={arXiv preprint arXiv:1904.08194},
  year={2019}
}

License

MIT
