Name-generation
Repository for a generative model of names, trained on the Olympic Name Dataset
Created on March 2, 2017 at Korea University, Data-Mining Lab
TensorFlow implementation of Name-GAN (Generative Adversarial Network).
Requirements
See the installation instructions for details.
```bash
# Install TensorFlow GPU version
$ sudo apt-get install python-pip python-dev
$ pip install tensorflow-gpu
# If the code above doesn't work, try
$ sudo -H pip install tensorflow-gpu
$ sudo pip install --upgrade
```
Directories
- utils.py : Progress bar function
- model.py : GAN (encoder + decoder + generator + discriminator models) + load + save
- ops.py : Basic functions for RNN, LSTM, feed-forward neural network, dropout
- dataset.py : Crawls the name dataset and trains the autoencoder (encoder + decoder) model
- main.py : Training script for the GAN model (requires pretrained autoencoder model from dataset.py)
Modules
- Each name string is represented as an encoded vector consisting of (h + c)
- h : LSTM's hidden state
- c : LSTM's cell state
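
As a minimal sketch of the (h + c) representation above, the two LSTM states can be joined end to end into one vector of length 2 x cell-dim. Plain Python lists stand in for TensorFlow tensors here, and `cell_dim` and the state values are made-up numbers for illustration, not values from this repository.

```python
# Illustrative only: plain Python lists stand in for TensorFlow tensors.
cell_dim = 4  # hypothetical LSTM cell size

h = [0.1] * cell_dim  # stand-in for the LSTM's final h state
c = [0.2] * cell_dim  # stand-in for the LSTM's final c state

# The encoded name vector is the two states joined end to end.
encoded = h + c

print(len(encoded))  # 8, i.e. 2 x cell_dim
```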
1. Encoder
- Encodes the given string value into a hidden vector h
- Output of the model = (2 x cell-dim) = (LSTM's h) + (LSTM's c)

| Division | Representation | Specifics |
| --- | --- | --- |
| input | x | character-level embedding of name strings |
| output | h | vector-level representation of name strings |
| model | RNN | input-size * time-step -> (2 x cell-dim) |
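
The encoder input x has shape time-step x input-size. One simple way to build such a matrix is a one-hot row per character, sketched below. The alphabet, the padding symbol, the maximum length, and the helper name `one_hot_name` are all assumptions made for illustration; the character embedding actually used in `dataset.py` may differ.

```python
import string

# Hypothetical settings; the repository's alphabet and maximum length may differ.
ALPHABET = string.ascii_lowercase + " "  # input_size = 27 symbols, space = padding
TIME_STEPS = 10                          # names are padded or cut to this length


def one_hot_name(name):
    """Return a TIME_STEPS x len(ALPHABET) matrix, one one-hot row per character."""
    padded = name.lower()[:TIME_STEPS].ljust(TIME_STEPS)
    rows = []
    for ch in padded:
        row = [0.0] * len(ALPHABET)
        index = ALPHABET.find(ch)
        # Characters outside the alphabet are mapped to the padding symbol.
        row[index if index >= 0 else ALPHABET.index(" ")] = 1.0
        rows.append(row)
    return rows


x = one_hot_name("Kim")
print(len(x), len(x[0]))  # 10 27
```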
2. Decoder
- Decodes the given hidden vector into an approximated string value x-hat
- Output of the model is (time_steps x input_dim)

| Division | Representation | Specifics |
| --- | --- | --- |
| input | h | vector-level representation of name strings |
| output | x-hat | near-value reconstruction of 'x' |
| model | RNN | (cell-dim) -> input-size * time-step |
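
To read a name back out of a decoder output x-hat of shape time_steps x input_dim, one common approach is to take the highest-scoring character at each time step. The sketch below assumes a hypothetical alphabet with a space as padding and a hypothetical helper `matrix_to_name`; it is not taken from this repository's code.

```python
import string

ALPHABET = string.ascii_lowercase + " "  # hypothetical alphabet, space = padding


def matrix_to_name(x_hat):
    """Pick the highest-scoring character at every time step and strip padding."""
    chars = []
    for row in x_hat:
        best = max(range(len(row)), key=lambda i: row[i])
        chars.append(ALPHABET[best])
    return "".join(chars).rstrip()


# Build a fake reconstruction: 5 time steps spelling "lee" followed by padding.
x_hat = []
for ch in "lee  ":
    row = [0.01] * len(ALPHABET)
    row[ALPHABET.index(ch)] = 0.9
    x_hat.append(row)

print(matrix_to_name(x_hat))  # lee
```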
3. Generator (G)
- Generates a fake hidden vector representing a name string
- Output of the model = (2 x cell-dim) = (LSTM's h) + (LSTM's c)

| Division | Representation | Specifics |
| --- | --- | --- |
| input | Zc | Random input vector with class info |
| output | Xc | Generated name hidden vectors |
| model | Linear | (z_dim + class_dim) => (cell_dim * 2) |
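
The generator's shape mapping, (z_dim + class_dim) => (cell_dim * 2), can be sketched as a single linear layer in plain Python. The sizes, the random weights, the one-hot class encoding, and the names `linear` and `generate` are assumptions for illustration only; the TensorFlow model in `model.py` may be built differently.

```python
import random

Z_DIM, CLASS_DIM, CELL_DIM = 8, 3, 4  # hypothetical sizes


def linear(inputs, weights, bias):
    """Compute one output per weight column: dot(inputs, column) + bias."""
    return [
        sum(x * w for x, w in zip(inputs, column)) + b
        for column, b in zip(weights, bias)
    ]


rng = random.Random(0)
# One weight column per output unit: (cell_dim * 2) columns of length (z_dim + class_dim).
G_W = [[rng.gauss(0, 0.1) for _ in range(Z_DIM + CLASS_DIM)]
       for _ in range(CELL_DIM * 2)]
G_b = [0.0] * (CELL_DIM * 2)


def generate(class_index):
    z = [rng.gauss(0, 1) for _ in range(Z_DIM)]  # random noise
    # Assumption: class info is appended as a one-hot vector.
    class_one_hot = [1.0 if i == class_index else 0.0 for i in range(CLASS_DIM)]
    zc = z + class_one_hot                       # Zc: noise with class info
    return linear(zc, G_W, G_b)                  # Xc: fake (h + c) vector


fake = generate(class_index=1)
print(len(fake))  # 8, i.e. cell_dim * 2
```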
4. Discriminator (D)
- Binary classification. Determines whether the given input is fake or not.

| Division | Representation | Specifics |
| --- | --- | --- |
| input | Xc | Hidden vector for name-class representation |
| output | Pc | Probability indicating whether the input is fake or not |
| model | Linear | (cell_dim * 2 + class_dim) => p |
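
The discriminator's mapping, (cell_dim * 2 + class_dim) => p, can be sketched as a linear layer followed by a sigmoid. The sizes, random weights, one-hot class encoding, and the name `discriminate` are assumptions for illustration; which end of the probability range means "fake" is also a convention this sketch does not settle.

```python
import math
import random

CELL_DIM, CLASS_DIM = 4, 3  # hypothetical sizes

rng = random.Random(0)
# A single output unit: one weight per input feature plus a scalar bias.
D_w = [rng.gauss(0, 0.1) for _ in range(CELL_DIM * 2 + CLASS_DIM)]
D_b = 0.0


def discriminate(xc, class_index):
    """Map a name vector plus class info to a probability in (0, 1)."""
    # Assumption: class info is appended as a one-hot vector.
    class_one_hot = [1.0 if i == class_index else 0.0 for i in range(CLASS_DIM)]
    features = xc + class_one_hot
    logit = sum(x * w for x, w in zip(features, D_w)) + D_b
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid squashes the logit into (0, 1)


p = discriminate([0.0] * (CELL_DIM * 2), class_index=0)
print(0.0 < p < 1.0)  # True
```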