Genesis
This is the official PyTorch implementation of "GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations" by Martin Engelcke, Adam R. Kosiorek, Oiwi Parker Jones, and Ingmar Posner; accepted for publication at the International Conference on Learning Representations (ICLR) 2020.
Setup
Start by cloning the repository, e.g. into ~/code/genesis:
git clone --recursive https://github.com/applied-ai-lab/genesis.git ~/code/genesis
Forge
We use Forge (https://github.com/akosiorek/forge) to save some legwork. It is included as a submodule, but you need to add it to your Python path, e.g. with:
# If needed, replace .bashrc with .zshrc or similar
echo 'export PYTHONPATH="${PYTHONPATH}:${HOME}/code/genesis/forge"' >> ~/.bashrc
Python dependencies
You can either install PyTorch, TensorFlow, and all other dependencies manually, or you can set up a conda environment with all required dependencies using the environment.yml file:
conda env create -f environment.yml
conda activate genesis_env
Datasets
This repository contains data loaders for the three datasets considered in the paper:
- Multi-dSprites
- GQN (rooms-ring-camera)
- ShapeStacks
The datasets are listed in order of increasing visual complexity. A few steps are required to set up each individual dataset.
Multi-dSprites
Generate coloured Multi-dSprites from the original dSprites with:
cd ~/code/genesis
mkdir -p data/multi_dsprites/processed
git clone https://github.com/deepmind/dsprites-dataset.git data/multi_dsprites/dsprites-dataset
python scripts/generate_multid.py
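The generation script composites several dSprites shapes into a single coloured image. The core idea can be sketched as follows; note that composite_sprites is a hypothetical helper for illustration, not the actual code in scripts/generate_multid.py, which handles the real dSprites arrays and dataset splits:

```python
import numpy as np

def composite_sprites(sprite_masks, rng=None):
    """Composite binary sprite masks into one RGB image.

    sprite_masks: list of (H, W) binary arrays, one per sprite.
    Later sprites occlude earlier ones; each sprite and the
    background receive a random RGB colour.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    h, w = sprite_masks[0].shape
    image = np.ones((h, w, 3)) * rng.uniform(size=3)  # background colour
    for mask in sprite_masks:
        colour = rng.uniform(size=3)
        image[mask.astype(bool)] = colour
    return image

# Two toy "sprites": a small square and a horizontal bar
square = np.zeros((8, 8)); square[1:4, 1:4] = 1
bar = np.zeros((8, 8)); bar[5:7, :] = 1
img = composite_sprites([square, bar])
```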
GQN (rooms-ring-camera)
The GQN datasets are quite large. The rooms_ring_camera dataset as used in the paper takes about 250GB and can be downloaded with:
pip install gsutil
cd ~/code/genesis
mkdir -p data/gqn_datasets
gsutil -m cp -r gs://gqn-dataset/rooms_ring_camera data/gqn_datasets
Note that we use a modified version of the TensorFlow GQN data loader from https://github.com/ogroth/tf-gqn, which is based on https://github.com/deepmind/gqn-datasets.git and included in third_party/tf_gqn.
ShapeStacks
You need about 30GB of free disk space for ShapeStacks:
cd ~/code/genesis
mkdir -p data/shapestacks
cp utils/shapestacks_urls.txt data/shapestacks
cd data/shapestacks
# Download compressed dataset
wget -i shapestacks_urls.txt
# Uncompress files
bash ../../utils/uncompress_shapestacks.sh
The instance segmentation labels for ShapeStacks can be downloaded from here.
Experiments
Visualising data
You can visualise your data with, e.g.:
python scripts/visualise_data.py --data_config datasets/multid_config.py
python scripts/visualise_data.py --data_config datasets/gqn_config.py
python scripts/visualise_data.py --data_config datasets/shapestacks_config.py
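A visualisation script of this kind typically tiles a batch of images into a single grid before displaying or saving it. A minimal, self-contained sketch of that tiling step (not the actual code in scripts/visualise_data.py):

```python
import numpy as np

def tile_images(batch, cols):
    """Arrange a (N, H, W, C) image batch into a (rows*H, cols*W, C) grid,
    padding with black images when N is not a multiple of cols."""
    n, h, w, c = batch.shape
    rows = -(-n // cols)  # ceiling division
    padded = np.zeros((rows * cols, h, w, c), dtype=batch.dtype)
    padded[:n] = batch
    grid = padded.reshape(rows, cols, h, w, c)
    # Interleave the row/height and column/width axes, then flatten
    grid = grid.transpose(0, 2, 1, 3, 4).reshape(rows * h, cols * w, c)
    return grid

batch = np.random.rand(5, 16, 16, 3)   # five 16x16 RGB images
grid = tile_images(batch, cols=3)      # 2x3 grid, one slot padded
```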
Training models
You can train Genesis, MONet and baseline VAEs on the datasets using the default hyperparameters with, e.g.:
python train.py --data_config datasets/multid_config.py --model_config models/genesis_config.py
python train.py --data_config datasets/gqn_config.py --model_config models/monet_config.py
python train.py --data_config datasets/shapestacks_config.py --model_config models/vae_config.py
You can change many of the hyperparameters via the Forge command line flags in the respective config files, e.g.:
python train.py --data_config datasets/multid_config.py --model_config models/genesis_config.py --batch_size 64 --learning_rate 0.001
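Forge's flag mechanism lets config files register defaults that command-line arguments then override. The effect is roughly like argparse defaults being overridden on the command line; this is an illustrative sketch, not Forge's actual API:

```python
import argparse

# Defaults that a config file might register; the command line overrides them.
DEFAULTS = {'batch_size': 32, 'learning_rate': 0.0001}

def parse_flags(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument('--batch_size', type=int,
                        default=DEFAULTS['batch_size'])
    parser.add_argument('--learning_rate', type=float,
                        default=DEFAULTS['learning_rate'])
    return parser.parse_args(argv)

# Equivalent of: train.py ... --batch_size 64 --learning_rate 0.001
cfg = parse_flags(['--batch_size', '64', '--learning_rate', '0.001'])
```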
See train.py and the config files for the available flags.
Monitoring training
TensorBoard logs are written to file with TensorboardX. Run tensorboard --logdir checkpoints to monitor training.
Pretrained models
Models trained on the three datasets with the default flags are available here.
Evaluation metrics
See scripts/compute_fid.py and scripts/compute_seg_metrics.py.
Visualise generation
See scripts/visualise_generation.py.
Further particulars
License
This source code is licensed under the GNU General Public License (GPL) v3, which is included in the LICENSE file in the root directory.
Copyright
Copyright (c) University of Oxford. All rights reserved.
Authors: Applied AI Lab, Oxford Robotics Institute, University of Oxford, https://ori.ox.ac.uk/labs/a2i/
No warranty, explicit or implicit, provided.
Citation
If you use this repository in your research, please cite our paper:
@article{engelcke2019genesis,
title={{GENESIS: Generative Scene Inference and Sampling with Object-Centric Latent Representations}},
author={Engelcke, Martin and Kosiorek, Adam R. and Parker Jones, Oiwi and Posner, Ingmar},
journal={Proceedings of the International Conference on Learning Representations (ICLR)},
year={2020}
}
Third party code
This repository builds upon code from the following third party repositories, which are included in the third_party folder:
- tf-gqn (Apache v2 license)
- shapestacks (GPL v3.0)
- sylvester-flows (MIT license)
- pytorch-fid (Apache v2 license)
The full licenses are included in the respective folders.
Release notes
v1.0: First release.