Mind2Mind : transfer learning for GANs
Jean-Baptiste Gouray • Yaël Frégier
Official repository of the paper
This repository contains a Mind2Mind transfer module. We have added it to a fork of the ALAE repository. We have kept from this fork only the modules essential for running Mind2Mind. If you need the full capacities of ALAE, add our module to the original ALAE repository.
Google Drive folder with models and qualitative results
Mind2Mind
Transfer Learning for GANs
Abstract: Training generative adversarial networks (GANs) on high-quality (HQ) images requires substantial computing resources. This requirement is a bottleneck for the development of GAN applications. We propose a transfer learning technique for GANs that significantly reduces training time. Our approach consists of freezing the low-level layers of both the critic and the generator of the original GAN. We assume an auto-encoder constraint in order to ensure the compatibility of the internal representations of the critic and the generator. This assumption explains the gain in training time, as it enables us to bypass the low-level layers during the forward and backward passes. We compare our method to baselines and observe a significant acceleration of the training, which can reach two orders of magnitude on HQ datasets compared with StyleGAN. We prove rigorously, within the framework of optimal transport, a theorem ensuring the convergence of the learning of the transferred GAN. We moreover provide a precise bound for the convergence of the training in terms of the distance between the source and target datasets.
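The freezing step described in the abstract can be sketched in PyTorch as follows. The layer names and sizes below are purely illustrative stand-ins, not the paper's architecture; the point is that frozen parameters are excluded from gradient computation and from the optimizer.

```python
import torch
import torch.nn as nn

# Toy generator: a low-level block (shared with the source GAN, frozen)
# followed by a high-level block (trained on the target dataset).
# Sizes are illustrative only.
generator = nn.Sequential(
    nn.Linear(64, 128),   # low-level block: will be frozen
    nn.ReLU(),
    nn.Linear(128, 256),  # high-level block: stays trainable
)

# Freeze the low-level block so no gradients are computed for it.
for p in generator[0].parameters():
    p.requires_grad = False

# Hand only the trainable parameters to the optimizer.
trainable = [p for p in generator.parameters() if p.requires_grad]
optimizer = torch.optim.Adam(trainable, lr=2e-4)
```

The same pattern applies to the critic; in Mind2Mind the frozen low-level layers can then be bypassed entirely by training directly on encoded data.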
Repository organization
To run the scripts, you will need a CUDA-capable GPU, PyTorch >= 1.3.1, and the CUDA/cuDNN drivers installed. Install the required packages:
pip install -r requirements.txt
Running scripts
The code in the repository is organized in such a way that all scripts must be run from the root of the repository. If you use an IDE (e.g. PyCharm or Visual Studio Code), just set Working Directory to point to the root of the repository.
If you want to run from the command line, you also need to set the PYTHONPATH variable to point to the root of the repository.
For example, if you have cloned the repository to the ~/ALAE directory, do:
$ cd ~/ALAE
$ export PYTHONPATH=$PYTHONPATH:$(pwd)
Now you can run scripts as follows:
$ python module_mind/generate_images.py
Repository structure
| Path | Description |
| --- | --- |
ALAE | Repository root folder |
├ configs | Folder with yaml config files. |
│ └ ffhq.yaml | Config file for FFHQ dataset at 1024x1024 resolution. |
├ module_mind | Folder with Mind2Mind module. |
│ ├ data_loader.py | Class to define loaders for encoded data. |
│ ├ download_mindGAN.py | Script to download a pre-trained MindGAN. |
│ ├ generate_images.py | Script to generate images from the MindGAN. |
│ ├ model.py | MindGAN model. |
│ ├ prepare_data.py | Script to download CelebA-HQ and encode the data. |
│ ├ train.py | Script to train the MindGAN on CelebA-HQ using the ALAE autoencoder trained on FFHQ. |
│ └ trainer.py | Class for handling training loops. |
├ checkpointer.py | Module for saving/restoring model weights, optimizer state and loss history. |
├ defaults.py | Definition for config variables with default values. |
├ losses.py | Definitions of the loss functions. |
├ lreq.py | Custom Linear, Conv2d, and ConvTranspose2d modules for learning rate equalization. |
├ model.py | Module with high-level model definition. |
├ net.py | Definition of all network blocks for multiple architectures. |
├ registry.py | Registry of network blocks for selecting from config file. |
├ random_choice.png | Sample of images (for this readme). |
├ requirements.txt | List of python modules needed. |
└ utils.py | Decorator for async call, decorator for caching, registry for network blocks. |
Configs
In ALAE, you can specify which yaml config yacs will use. However, our Mind2Mind module currently only supports the ffhq
config. Since it is the default config for ALAE, you do not need to do anything. In particular, if you are used to the -c
parameter from ALAE, do not use it here to select another config.
Datasets
You must prepare the data with the command:
$ python module_mind/prepare_data.py
Pre-trained models
To download the pre-trained models, run:
$ python module_mind/download_mindGAN.py
Generating figures
To generate sample images, run:
$ python module_mind/generate_images.py
By default, it generates one batch of 4 images. To change the number of batches or the number of images per batch, edit lines 19-20 of module_mind/generate_images.py. In particular, if your system runs out of memory, lower the number of images per batch and restart the kernel. The generated samples can be found in the folder module_mind/images_generated/mind2mind.
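The batching logic amounts to the following sketch; the variable names and the latent dimension are assumptions (check the config for the real latent size), not the actual contents of generate_images.py.

```python
import torch

# Defaults mentioned above: one batch of 4 images.
# latent_dim is an assumed value; see the config for the real one.
n_batches, batch_size, latent_dim = 1, 4, 512

@torch.no_grad()  # inference only: no gradients, lower memory use
def generate(model, n_batches, batch_size):
    outputs = []
    for _ in range(n_batches):
        z = torch.randn(batch_size, latent_dim)  # sample latent codes
        outputs.append(model(z))
    # Concatenate the per-batch outputs into one tensor.
    return torch.cat(outputs)
```

If memory is tight, keep `n_batches * batch_size` constant while lowering `batch_size`, e.g. two batches of 2 instead of one batch of 4.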
Training
To run training:
$ python module_mind/train.py
We have only tested our Mind2Mind module on a single GPU.
You might need to adjust the batch size in the config file depending on the memory size of the GPU.
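Conceptually, the MindGAN trains in the latent space of the frozen ALAE autoencoder, alternating critic and generator updates. The sketch below is a generic, simplified WGAN-style step on latents; the architectures, latent size, and loss details are illustrative assumptions, not the repo's trainer.

```python
import torch
import torch.nn as nn

latent_dim = 512  # assumed latent size; see the config for the real value

# Tiny MLPs standing in for the MindGAN generator and critic, which
# operate on encoded latents rather than full-resolution images.
G = nn.Sequential(nn.Linear(latent_dim, latent_dim))
C = nn.Sequential(nn.Linear(latent_dim, 1))
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_c = torch.optim.Adam(C.parameters(), lr=2e-4)

real = torch.randn(4, latent_dim)  # stands in for encoded CelebA-HQ latents
z = torch.randn(4, latent_dim)

# Critic step: raise scores on real latents, lower them on generated ones.
loss_c = C(G(z).detach()).mean() - C(real).mean()
opt_c.zero_grad(); loss_c.backward(); opt_c.step()

# Generator step: maximize the critic score of generated latents.
loss_g = -C(G(z)).mean()
opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Because both networks act on low-dimensional latents, each step is far cheaper than a full image-space GAN update, which is where the training-time gain comes from.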
Computation of FID
To compute the FID score, install the pytorch-fid package and run:
python fid_score $ALAE_PATH/module_mind/Dataset/Celeba-HQ/data1024x1024 $ALAE_PATH/module_mind/images_generated/mind2mind/
where $ALAE_PATH is the directory in which ALAE is located.
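For reference, FID is the Fréchet distance between two Gaussians fitted to the Inception features of the real and generated images. The sketch below implements only that formula (not pytorch-fid's feature extraction pipeline):

```python
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(mu1, sigma1, mu2, sigma2):
    """Frechet distance between N(mu1, sigma1) and N(mu2, sigma2).

    FID applies this to the mean and covariance of Inception features
    computed over the real and generated image sets.
    """
    diff = mu1 - mu2
    # Matrix square root of the covariance product; numerical error can
    # introduce a tiny imaginary part, so keep only the real part.
    covmean = sqrtm(sigma1 @ sigma2)
    if np.iscomplexobj(covmean):
        covmean = covmean.real
    return diff @ diff + np.trace(sigma1 + sigma2 - 2.0 * covmean)
```

Identical statistics give a distance of zero; the score grows as the generated distribution drifts from the real one.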
Our settings description
**Datasets.** We have tested our algorithm at resolutions 28×28 and 1024×1024.
Description of hyperparameters
**At resolution 28×28.** The encoder
**At resolution 1024×1024.**