shatu / HyperGAN

A composable Generative Adversarial Network(GAN) with API and command line tool.

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

HyperGAN

A versatile GAN(generative adversarial network) implementation focused on scalability and ease-of-use.

hypergan logo 1 hypergan logo 2 hypergan logo 4

Logos generated with examples/colorizer

Table of contents

Changelog

0.8 ~ "LSGAN / 2d test"

  • Tensorflow 1.0 support
  • New configuration format and refactored api.
  • New loss function based on least squared GAN. See lsgan implementation.
  • API example 2d-test - tests a trainer/encoder/loss combination against a known distribution.
  • API example 2d-measure - measure and report the above test by randomly combining options.
  • And more

_Still training, larger samples to come._

0.7 ~ "WGAN API"

  • New loss function based on wgan :. Fixes many classes of mode collapse! See wgan implementation
  • Initial Public API Release
  • API example: colorizer - re-colorize an image!
  • API example: inpainter - remove a section of an image and have your GAN repaint it
  • API example: super-resolution - zoom in and enhance. We've caught the bad guy!
  • 4 new samplers. --sampler flag. Valid options are: batch,progressive,static_batch,grid.

0.6 ~ "MultiGAN"

  • New encoders
  • Support for multiple discriminators
  • Support for discriminators on different image resolutions

0.5 ~ "FaceGAN"

0.5.x

  • fixed configuration save/load
  • cleaner cli output
  • documentation cleanup

0.5.0

  • pip package released!
  • Better defaults. Good variance. 256x256. The broken images showed up after training for 5 days.

0.1-0.4

  • Initial private release

Quick start

Minimum requirements

  1. For 256x256, we recommend a GTX 1080 or better. 32x32 can be run on lower-end GPUs.
  2. CPU mode is extremely slow. Never train with it!
  3. Python3

Install

  pip3 install hypergan --upgrade

Train

  # Train a 32x32 gan with batch size 32 on a folder of pngs
  hypergan train [folder] -s 32x32x3 -f png -b 32

Increasing performance

On ubuntu sudo apt-get install libgoogle-perftools4 and make sure to include this environment variable before training

  LD_PRELOAD="/usr/lib/libtcmalloc.so.4" hypergan train my_dataset

Development mode

If you wish to modify hypergan

git clone https://github.com/255BITS/hypergan
cd hypergan
python3 setup.py develop

Running on CPU

Make sure to include the following 2 arguments:

CUDA_VISIBLE_DEVICES= hypergan --device '/cpu:0'

Don't train on CPU! It's too slow.

Configuration

Configuration in HyperGAN uses JSON files. You can create a new config by running hypergan train. By default, configurations are randomly generated using Hyperchamber.

Configurations are located in:

  ~/.hypergan/configs/

Usage

  --config [name]

Naming a configuration during training required.

Architecture

A hypergan configuration contains multiple encoders, multiple discriminators, multiple loss functions, and a single generator.

Generator

A generator is responsible for projecting an encoding (sometimes called z space) to an output (normally an image). A single GAN object from HyperGAN has one generator.

Resize Conv

Resize conv pseudo code looks like this

 1.  net = linear(z, z_projection_depth)
 2.  net = resize net to max(output width/height, double input width/height)
 3.  add layer filter if defined
 4.  convolution block
 5.  If at output size: 
 6.  Else add first 3 layers to progressive enhancement output and go to 2
attribute description type
create Called during graph creation f(config, gan, net):net
z_projection_depth The output size of the linear layer before the resize-conv stack. int > 0
activation Activations to use. See activations f(net):net
final_activation Final activation to use. This is usually set to tanh to squash the output range. f(net):net
depth_reduction Reduces the filter sizes on each convolution by this multiple. float > 0
layer_filter On each resize of G, we call this method. Anything returned from this method is added to the graph before the next convolution block. See common layer filters f(net):net
layer_regularizer This "regularizes" each layer of the generator with a type. See layer regularizers f(name)(net):net

Encoders

You can combine multiple encoders into a single GAN.

Uniform Encoder

attribute description type
create Called during graph creation f(config, gan, net):net
z The dimensions of random uniform noise inputs int > 0
min Lower bound of the random uniform noise int > 0
max Upper bound of the random uniform noise int > min
projections See more about projections below [f(config, gan, net):net, ...]
modes If using modes, the number of modes to have per dimension int > 0

Projections

This encoder takes a random uniform value and outputs it as many possible types. The primary idea is that you are able to query Z as a random uniform distribution, even if the gan is using a spherical representation.

Some projection types are listed below.

"identity" projection

"sphere" projection

"gaussian" projection

"modal" projection

One of many

"binary" projection

On/Off

Category Encoder

Uses categorical prior to choose 'one-of-many' options. Can be paired with Categorical Loss.

Discriminators

You can combine multiple discriminators in a single GAN. This type of ensembling can be useful, but by default only 1 is enabled.

Pyramid Discriminator

attribute description type
create Called during graph creation f(config, gan, net):net
activation Activations to use. See activations f(net):net
depth_increase Increases the filter sizes on each convolution by this multiple. float > 0
final_activation Final activation to use. This is usually set to tanh to squash the output range. f(net):net
layers The number of convolution layers int > 0
layer_filter Append information to each layer of the discriminator f(config, net):net
layer_regularizer batch_norm_1, layer_norm_1, or None f(batch_size, name)(net):net
fc_layer_size The size of the linear layers at the end of this network(if any). int > 0
fc_layers fully connected layers at the end of the discriminator(standard dcgan is 0) int >= 0
noise Instance noise. Can be added to the input X float >= 0
progressive_enhancement If true, enable progressive enhancement boolean

progressive enhancement

If true, each layer of the discriminator gets a resized version of X and additional outputs from G.

Losses

WGAN

Our implementation of WGAN is based off the paper. WGAN loss in Tensorflow can look like:

 d_loss = d_real - d_fake
 g_loss = d_fake

d_loss and g_loss can be reversed as well - just add a '-' sign.

LS-GAN

 d_loss = (d_real-b)**2 - (d_fake-a)**2
 g_loss = (d_fake-c)**2

a, b, and c are all hyperparameters.

Standard GAN and Improved GAN

Includes support for Improved GAN. See hypergan/losses/standard_gan_loss.py for details.

Supervised loss

Supervised loss is for labeled datasets. This uses a standard softmax loss function on the outputs of the discriminator.

Categorical loss

This is currently untested.

Loss configuration

attribute description type
batch_norm batch_norm_1, layer_norm_1, or None f(batch_size, name)(net):net
create Called during graph creation f(config, gan, net):net
discriminator Set to restrict this loss to a single discriminator(defaults to all) int >= 0 or None
label_smooth improved gan - Label smoothing. float > 0
labels lsgan - A triplet of values containing (a,b,c) terms. [a,b,c] floats
reduce Reduces the output before applying loss f(net):net
reverse Reverses the loss terms, if applicable boolean

Trainers

RMSProp

Uses RMSProp on G and D

Adam

Uses Adam on G and D

SGD

Uses SGD on G and D

Configuration

attribute description type
create Called during graph creation f(config, gan, net):net
run Steps forward once in training. f(gan):[d_cost, g_cost]
g_learn_rate Learning rate for the generator float >= 0
g_beta1 (adam) float >= 0
g_beta2 (adam) float >= 0
g_epsilon (adam) float >= 0
g_decay (rmsprop) float >= 0
g_momentum (rmsprop) float >= 0
d_learn_rate Learning rate for the discriminator float >= 0
d_beta1 (adam) float >= 0
d_beta2 (adam) float >= 0
d_epsilon (adam) float >= 0
d_decay (rmsprop) float >= 0
d_momentum (rmsprop) float >= 0
clipped_gradients If set, gradients will be clipped to this value. float > 0 or None
d_clipped_weights If set, the discriminator will be clipped by value. float > 0 or None

The pip package hypergan

 hypergan -h

Training

  # Train a 256x256 gan with batch size 32 on a folder of pngs
  hypergan train [folder] -s 32x32x3 -f png -b 32 --config [name]

Sampling

  # Train a 256x256 gan with batch size 32 on a folder of pngs
  hypergan train [folder] -s 32x32x3 -f png -b 32 --config [name] --sampler static_batch --sample_every 5

One way a network learns:

Demo CountPages alpha

To create videos:

  ffmpeg -i samples/%06d.png -vcodec libx264 -crf 22 -threads 0 gan.mp4

Web Server

  # Train a 256x256 gan with batch size 32 on a folder of pngs
  hypergan serve [folder] -s 32x32x3 -f png -b 32 --config [name]

Serve starts a flask server. You can then access:

http://localhost:5000/sample.png?type=batch

To prevent the GPU from allocating space, see Running on CPU.

hypergan build

Build takes the same arguments as train and builds a generator. It's required for serve.

Building does 2 things:

  • Loads the training model, which include the discriminator
  • Saves into a ckpt model containing only the generator

Saves

Saves are stored in ~/.hypergan/saves/

They can be large.

Formats

--format <type>

Type can be one of:

  • jpg
  • png

Arguments

To see a detailed list, run

  hypergan -h
  • -s, --size, optional(default 64x64x3), the size of your data in the form 'width'x'height'x'channels'
  • -f, --format, optional(default png), file format of the images. Only supports jpg and png for now.

API

  import hypergan as hg

Examples

API is currently under development. The best reference are the examples in the examples directory.

Examples

2d test

Runs a 2d toy problem for a given configuration. Can be sampled to show how a given configuration learns.

2d measure accuracy

Applies a batch accuracy (nearest neighbor) measurement to the 2d toy problem.

Colorizer

Colorizer feeds a black and white version of the input into the generator.

Inpainting

Hides a random part of the image from the discriminator and the generator.

Super Resolution

Provides a low resolution image to the generator.

Constant inpainting

Applies a constant mask over part of the image. An easier problem than general inpainting.

GAN object

The GAN object consists of:

  • The config(configuration) used
  • The graph - specific named Tensors in the Tensorflow graph
  • The tensorflow sess(session)

Constructor

hg.GAN(config, initial_graph, graph_type='full', device='/gpu:0')

When a GAN constructor is called, the Tensorflow graph will be constructed.

Arguments

  • config - The graph configuration. See examples or the CLI tool for usage.
  • initial_graph - a Dictionary consisting of any variables used by the GAN
  • graph_type - Either 'full' or 'generator'
  • device - Tensorflow device id

Properties

property type description
gan.graph Dictionary Maps names to tensors
gan.config Dictionary Maps names to options(from the json)
gan.sess tf.Session The tensorflow session

Methods

save

 gan.save(save_file)

save_file - a string designating the save path

Saves the GAN

sample_to_file

 gan.sample_to_file(name, sampler=grid_sampler.sample)
  • name - the name of the file to sample to
  • sampler - the sampler method to use

Sample to a specified path.

train

 gan.train()

Steps the gan forward in training once. Trains the D and G according to your specified trainer.

Datasets

To build a new network you need a dataset. Your data should be structured like:

  [folder]/[directory]/*.png

Creating a Dataset

Supervised learning

Training with labels allows you to train a classifier.

Each directory in your dataset represents a classification.

Example: Dataset setup for classification of apple and orange images:

 /dataset/apples
 /dataset/oranges

Unsupervised learning

You can still build a GAN if your dataset is unlabelled. Just make sure your folder is formatted like

 [folder]/[directory]/*.png

where all files are in 1 directory.

Downloadable datasets

Contributing

Contributions are welcome and appreciated! We have many open issues in the Issues tab that have the label Help Wanted.

Our process

HyperGAN uses semantic versioning. http://semver.org/

TLDR: x.y.z

  • x is incremented on stable public releases.
  • y is incremented on API breaking changes. This includes configuration file changes and graph construction changes.
  • z is incremented on non-API breaking changes. z changes will be able to reload a saved graph.

Branches

The branches are:

  • master contains the best GAN we've found as default. It aims to just work for most use cases.
  • develop contains the latest and can be in a broken state.

Bug fixes and showcases can be merged into master

Configuration changes, new architectures, and generally anything experimental belongs in develop.

Showcase

If you create something cool with this let us know! Open a pull request and add your links, and screenshots here!

In case you are interested, our pivotal board is here: https://www.pivotaltracker.com/n/projects/1886395

Notable Configurations

Notable configurations are stored in example/configs Feel free to submit additional ones.

About

Generative Adversarial Networks consist of 2 learning systems that learn together. HyperGAN implements these learning systems in Tensorflow with deep learning.

The discriminator learns the difference between real and fake data. The generator learns to create fake data.

For a more in-depth introduction, see here http://blog.aylien.com/introduction-generative-adversarial-networks-code-tensorflow/

A single fully trained GAN consists of the following useful networks:

  • generator - Generates content that fools the discriminator. If using supervised learning mode, can generate data on a specific classification.
  • discriminator - The discriminator learns how to identify real data and how to detect fake data from the generator.
  • classifier - Only available when using supervised learning. Classifies an image by type. Some examples of possible datasets are 'apple/orange', 'cat/dog/squirrel'. See Creating a Dataset.

HyperGAN is currently in open beta.

Papers

Sources

Citation

If you wish to cite this project, do so like this:

  255bits (M. Garcia),
  HyperGAN, (2017), 
  GitHub repository, 
  https://github.com/255BITS/HyperGAN

About

A composable Generative Adversarial Network(GAN) with API and command line tool.

License:MIT License


Languages

Language:Python 100.0%Language:Shell 0.0%