fangleai / Implicit-LVM

This repository presents the PyTorch implementation of the paper "Implicit Deep Latent Variable Models for Text Generation" (EMNLP 2019).

Implicit Deep Latent Variable Models for Text Generation


[03/10/2022] A researcher reported that API changes in newer library versions may cause the training loss to diverge (e.g., the ELBO not converging as expected). Please refer to this blog for the loss behavior originally observed.

[03/11/2022] Thanks to Kushal Jain! The code runs well with torch 0.4.1. FYI, the following conda virtual environment has been tested to run the code:

conda create -n myenv python=3.6
conda install pytorch=0.4.1 cuda92 -c pytorch
conda install tqdm
conda install h5py
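To confirm the environment is set up, a quick sanity check (our suggestion, not part of the original instructions):

import torch
print(torch.__version__)          # expect 0.4.1
print(torch.cuda.is_available())  # True if the CUDA 9.2 build sees a GPU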

This repository contains source code to reproduce the results presented in the paper Implicit Deep Latent Variable Models for Text Generation (EMNLP 2019):

@inproceedings{Fang_iLVM_2019_EMNLP,
  title={Implicit Deep Latent Variable Models for Text Generation},
  author={Le Fang and Chunyuan Li and Jianfeng Gao and Wei Dong and Changyou Chen},
  booktitle={EMNLP},
  year={2019}
}

We present two types of implicit deep latent variable models, iVAE and iVAE_MI. Core to both variants is a sample-based representation of the latent features in LVMs, replacing the traditional Gaussian-based posterior distributions. In this repo, each folder corresponds to a specific experiment with a different dataset or task; to run the code, go to the subfolder for the experiment you want. The file named train_xxx.py in each folder is the main script to train and test our implicit VAEs.
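To make the idea concrete, here is a minimal sketch (ours, not the repo's exact code; layer sizes and names are illustrative) contrasting a Gaussian posterior with a sample-based implicit posterior. Instead of predicting a mean and variance and reparameterizing, the implicit encoder concatenates a noise vector with the input features and pushes them through an MLP, so posterior samples are drawn directly without assuming a closed-form density.

import torch
import torch.nn as nn

class GaussianEncoder(nn.Module):
    # standard VAE posterior: q(z|x) = N(mu(x), sigma(x)^2)
    def __init__(self, x_dim, z_dim):
        super().__init__()
        self.mu = nn.Linear(x_dim, z_dim)
        self.logvar = nn.Linear(x_dim, z_dim)

    def forward(self, x):
        mu, logvar = self.mu(x), self.logvar(x)
        eps = torch.randn_like(mu)
        return mu + eps * (0.5 * logvar).exp()   # reparameterized sample

class ImplicitEncoder(nn.Module):
    # sample-based posterior: z = G(x, eps); no closed-form density assumed
    def __init__(self, x_dim, z_dim, noise_dim=32):
        super().__init__()
        self.noise_dim = noise_dim
        self.net = nn.Sequential(
            nn.Linear(x_dim + noise_dim, 64), nn.ReLU(),
            nn.Linear(64, z_dim))

    def forward(self, x):
        eps = torch.randn(x.size(0), self.noise_dim, device=x.device)
        return self.net(torch.cat([x, eps], dim=1))  # one posterior sample

Because q(z|x) then has no analytic density, the KL term in the ELBO must be estimated from samples rather than computed in closed form; the paper handles this with an auxiliary network.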

Contents

  1. Proof-of-Concept on a Toy Dataset

  2. Language modeling

    2.1. Language modeling on PTB

    2.2. Language modeling on Yahoo

    2.3. Language modeling on Yelp

  3. Style transfer on Yelp

  4. Dialog response generation

    4.1. Dialog response generation on Switchboard

    4.2. Dialog response generation on DailyDialog

1. Proof-of-Concept on a Toy Dataset

The toy dataset contains 4 data points x: 4 different one-hot four-dimensional vectors, and we learn the corresponding latent code z in 2D space for each x. Run the following on the command line:

cd toy_onehot/
python vae_onehot.py
python train_onehot.py
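For reference, the toy inputs themselves are just the rows of a 4x4 identity matrix; a minimal illustration:

import torch
x = torch.eye(4)   # the 4 one-hot data points, one per row
print(x)           # each row is mapped to a latent code z in 2D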

The results are as follows:

(Figures: latent codes learned by VAE vs. iVAE)

2. Language modeling

2.1. Language modeling on PTB

After downloading the dataset (data/train.txt, data/val.txt, data/test.txt), run

cd lang_model_ptb/
python preprocess_ptb.py --trainfile data/train.txt --valfile data/val.txt --testfile data/test.txt --outputfile data/ptb

This will create the *.hdf5 files (data tensors) to be used by the model, as well as the *.dict file, which contains the word-to-integer mapping.
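If you want to inspect the resulting tensors, something like the following works (the exact output file name is an assumption based on the --outputfile prefix; check your data/ folder):

import h5py
# file name 'data/ptb-train.hdf5' is an assumption
with h5py.File('data/ptb-train.hdf5', 'r') as f:
    for name, dset in f.items():
        print(name, dset.shape, dset.dtype)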

Specify '--model' on the command line: '--model mle' trains the implicit VAE (the default if not specified) and '--model mle_mi' trains the implicit VAE with mutual information maximized. For example, the command for training is

python train_ptb.py

The command for evaluation after the 30th epoch is

python train_ptb.py --test --train_from results_mle/030.pt

The command for training baseline VAEs ('vae', 'beta_vae', 'savae', 'cyc_vae') is, for example,

python train_ptb_vaes.py --model vae
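Among these baselines, 'cyc_vae' usually denotes a VAE trained with a cyclical KL-annealing schedule; a minimal sketch of such a schedule (our illustration; the repo's exact schedule and hyperparameters may differ):

def cyclical_beta(step, cycle_len=10000, ramp_frac=0.5):
    # KL weight rises linearly from 0 to 1 over the first half of each
    # cycle, then holds at 1 until the cycle restarts
    pos = (step % cycle_len) / cycle_len
    return min(pos / ramp_frac, 1.0)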

To interpolate between two sentences after training for 40 epochs, run

python interpolation.py
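Conceptually, interpolation encodes the two sentences, linearly mixes their latent codes, and decodes each mixture; a hedged sketch with hypothetical encode/decode helpers (interpolation.py implements the real procedure):

import torch

def interpolate(encode, decode, sent_a, sent_b, steps=7):
    # encode/decode are hypothetical wrappers around the trained model
    z_a, z_b = encode(sent_a), encode(sent_b)
    for alpha in torch.linspace(0, 1, steps):
        z = (1 - alpha) * z_a + alpha * z_b   # linear mix of the two codes
        print(decode(z))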

To evaluate decoders on latent codes drawn from the prior and to compute forward and reverse PPL after training, install the KenLM Language Model Toolkit and run

python generative_model.py --model MODEL

where MODEL is one of 'mle', 'mle_mi', 'vae', 'beta_vae', 'savae', 'cyc_vae'.
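Forward and reverse PPL cross-evaluate language models between real and generated text: one direction measures the fluency of samples, the other their diversity. For reference, scoring text with KenLM's Python bindings looks roughly like this (the LM file name is hypothetical):

import kenlm
lm = kenlm.Model('ptb_lm.arpa')            # hypothetical KenLM model file
sent = 'the market rallied today'
print(lm.perplexity(sent))                 # per-word perplexity
print(lm.score(sent, bos=True, eos=True))  # log10 probability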

2.2. Language modeling on Yahoo

After downloading the dataset (data/train.txt, data/val.txt, data/test.txt), run

cd lang_model_yahoo/
python preprocess_yahoo.py --trainfile data/train.txt --valfile data/val.txt --testfile data/test.txt --outputfile data/yahoo

This will create the *.hdf5 files (data tensors) to be used by the model, as well as the *.dict file, which contains the word-to-integer mapping.

Specify '--model' on the command line: '--model mle' trains the implicit VAE (the default if not specified) and '--model mle_mi' trains the implicit VAE with mutual information maximized. For example, the command for training is

python train_yahoo.py

The command for evaluation after the 30th epoch is

python train_yahoo.py --test --train_from results_mle/030.pt

2.3. Language modeling on Yelp

Specify '--model' on the command line: '--model mle' trains the implicit VAE (the default if not specified) and '--model mle_mi' trains the implicit VAE with mutual information maximized. For example, the command for training is

python train_yelp.py

The command for evaluation after, e.g., the 30th epoch is

python train_yelp.py --test --train_from results_mle/030.pt

The command for training an autoencoder (AE) is

python train_yelp_ae.py

The command for training baseline VAEs ('vae', 'beta_vae', 'cyc_vae') is, for example,

python train_yelp_vaes.py --model vae

3. Style transfer on Yelp

Evaluation requires the fastText library and the KenLM Language Model Toolkit; please install them beforehand (a fastText scoring sketch follows the evaluation command below). The command for training the implicit VAE with mutual information maximized is

python train_yelp.py

The command for training an adversarially regularized autoencoder is

python arae_train_yelp.py

To evaluate the model after training for 25 epochs, run

python train_yelp.py --eval --load_path ./output/ILVM --load_epoch 25
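For reference, transfer accuracy is typically measured with a fastText classifier over the style labels; a hedged sketch (the classifier file name is hypothetical and would be trained separately on labeled Yelp sentences):

import fasttext
clf = fasttext.load_model('yelp_sentiment.bin')   # hypothetical classifier
labels, probs = clf.predict('the food was great !')
print(labels[0], probs[0])   # predicted style label and its confidence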

4. Dialog response generation

4.1. Dialog response generation on Switchboard

Use pre-trained word embeddings: download the GloVe embeddings glove.twitter.27B.200d.txt from https://nlp.stanford.edu/projects/glove/ and save the file to the ./data folder. The default setting uses 200-dimensional word embeddings trained on Twitter.
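For reference, a GloVe text file is typically parsed into a word-to-vector map like this (a minimal sketch, independent of the repo's own loading code):

import numpy as np

embeddings = {}
with open('data/glove.twitter.27B.200d.txt', encoding='utf-8') as f:
    for line in f:
        parts = line.rstrip().split(' ')
        embeddings[parts[0]] = np.asarray(parts[1:], dtype=np.float32)
print(len(embeddings))   # vocabulary size; each vector has 200 dimensions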

The command for training the implicit VAE with mutual information maximized is

python train_swda.py

The command for training DialogWAE is

python DialogWAE_train_swda.py

4.2. Dialog response generation on DailyDialog

Use pre-trained word embeddings: download the GloVe embeddings glove.twitter.27B.200d.txt from https://nlp.stanford.edu/projects/glove/ and save the file to the ./data folder. The default setting uses 200-dimensional word embeddings trained on Twitter (see the loading sketch under 4.1).

The command for training the implicit VAE with mutual information maximized is

python train_dailydial.py

The command for training DialogWAE is

python DialogWAE_train_dailydial.py

Questions?

Please drop us (Le or Chunyuan) a line if you have any questions.
