
RG-UNIT, ACM MM 2020.


Retrieval Guided Unsupervised Multi-domain Image to Image Translation, accepted to the ACM International Conference on Multimedia (ACM MM), 2020. [Paper] | [arXiv] | [code]

Image-to-image translation aims to learn a mapping that transforms an image from one visual domain to another. Recent works assume that image descriptors can be disentangled into a domain-invariant content representation and a domain-specific style representation. Thus, translation models seek to preserve the content of source images while changing the style to a target visual domain. However, synthesizing new images is extremely challenging, especially in multi-domain translations, as the network has to compose content and style to generate reliable and diverse images in multiple domains. In this paper we propose the use of an image retrieval system to assist the image-to-image translation task. First, we train an image-to-image translation model to map images to multiple domains. Then, we train an image retrieval model using real and generated images to find images similar to a query one in content but in a different domain. Finally, we exploit the image retrieval system to fine-tune the image-to-image translation model and generate higher quality images. Our experiments show the effectiveness of the proposed solution and highlight the contribution of the retrieval network, which can benefit from additional unlabeled data and help image-to-image translation models in the presence of scarce data.
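To make the three-stage pipeline above concrete, here is a minimal Python sketch; every name in it (train_translation, train_retrieval, finetune_with_retrieval, the placeholder data) is hypothetical and is not this repository's API.

# Hypothetical outline of the three-stage pipeline described above;
# none of these functions exist in this repository under these names.

def train_translation(images, domains):
    """Stage 1: learn a multi-domain image-to-image translation model."""
    ...

def train_retrieval(real_images, generated_images):
    """Stage 2: learn to retrieve images similar in content to a query
    but belonging to a different domain, using real + generated images."""
    ...

def finetune_with_retrieval(translator, retriever, images, domains):
    """Stage 3: fine-tune the translator, guided by retrieved images."""
    ...

images, domains = [], ["domain_a", "domain_b"]   # placeholder data
translator = train_translation(images, domains)
generated = [translator(x, d) for x in images for d in domains]
retriever = train_retrieval(images, generated)
translator = finetune_with_retrieval(translator, retriever, images, domains)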

1. Configuration

  • See environment.yaml (Python 3.6). We provide a user-friendly configuration method via Conda, and you can create a new Conda environment using the command:
conda env create -f environment.yaml
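Once the environment is created, activate it with conda activate followed by the name field declared at the top of environment.yaml (the exact environment name is set inside that file) before running any of the scripts below.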

2. Datasets

  • CelebA: Download img_align_celeba.zip from Google Drive, copy it into datasets/celeba, and unpack it:
$ cd datasets/celeba
$ unzip img_align_celeba.zip
$ mv img_align_celeba images

Then, prepare the dataset in the following structure:

./datasets
  |__celeba
       |__images
       |    |__xxx.jpg
       |    |__...
       |__list_attr_celeba-train.txt
       |__list_attr_celeba-val.txt
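The list_attr_celeba-train.txt / list_attr_celeba-val.txt split files are assumed here to follow the standard CelebA attribute layout: a row of attribute names, then one "<filename> ±1 ±1 ..." row per image. The parser below is a hypothetical sketch under that assumption; verify against the actual files and the repo's data loader.

# Hypothetical parser for list_attr_celeba-*.txt, assuming the standard
# CelebA attribute format (header row of attribute names, then one
# "<filename> <±1> <±1> ..." row per image).

def load_celeba_attrs(path):
    with open(path) as f:
        rows = [line.split() for line in f if line.strip()]
    attr_names = rows[0]                            # e.g. 40 attribute names
    labels = {}
    for row in rows[1:]:
        labels[row[0]] = [int(v) for v in row[1:]]  # +1 / -1 per attribute
    return attr_names, labels

attr_names, labels = load_celeba_attrs("datasets/celeba/list_attr_celeba-train.txt")
print(len(attr_names), "attributes,", len(labels), "images")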

  • AFHQ: prepare the dataset in the following structure:

./datasets
  |__afhq
       |__train
       |    |__cat
       |    |    |__xxx.jpg
       |    |    |__...
       |    |__dog
       |         |__xxx.jpg
       |         |__...
       |__val
       |    |__cat
       |    |    |__xxx.jpg
       |    |    |__...
       |    |__dog
       |         |__xxx.jpg
       |         |__...
       |
       |__train_list.txt
       |__test_list.txt
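The format of train_list.txt and test_list.txt is not documented here; the helper below is a hypothetical sketch that assumes one "<relative_path> <domain_index>" line per image. Check this repo's data-loading code for the real expected format before using it.

# Hypothetical helper to build train_list.txt / test_list.txt from the AFHQ
# folder layout above. The "<relative_path> <domain_index>" line format is an
# assumption, not this repo's documented format.
import os

def build_list(split_dir, out_path):
    domains = sorted(d for d in os.listdir(split_dir)
                     if os.path.isdir(os.path.join(split_dir, d)))  # e.g. ["cat", "dog"]
    with open(out_path, "w") as f:
        for idx, dom in enumerate(domains):
            dom_dir = os.path.join(split_dir, dom)
            for fname in sorted(os.listdir(dom_dir)):
                if fname.lower().endswith((".jpg", ".png")):
                    f.write(f"{dom}/{fname} {idx}\n")

build_list("datasets/afhq/train", "datasets/afhq/train_list.txt")
build_list("datasets/afhq/val", "datasets/afhq/test_list.txt")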

3. Train

  • Train the GMM-UNIT model:
$ sh train_gmmunit.sh
  • Train the retrieval network:
$ sh train_retrieval.sh
  • Fine-tune the GMM-UNIT model with the aid of the retrieval network (a conceptual sketch of this step follows the list):
$ sh train_gmmunit_retrieval.sh
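Conceptually, the third step uses the retrieval network to fetch real images that share content with the input but lie in the target domain, and uses them as extra guidance while fine-tuning the generator. The sketch below is hypothetical: the generator/retriever signatures and the loss are placeholders, not the repo's actual training loop (which lives behind train_gmmunit_retrieval.sh).

# Hypothetical sketch of one retrieval-guided fine-tuning step (stage 3).
# `generator` and `retriever` and their methods are placeholders.
import torch

def finetune_step(generator, retriever, x_src, target_domain, optimizer):
    # 1) Embed the source image and retrieve similar-content images
    #    from the *target* domain among the stored training embeddings.
    query = retriever.embed(x_src)
    guides = retriever.nearest(query, domain=target_domain, k=3)

    # 2) Generate, conditioning the generator on the retrieved guides.
    x_fake = generator(x_src, target_domain, guides)

    # 3) The real losses (adversarial, reconstruction, etc.) would be
    #    applied here; a dummy placeholder keeps this sketch runnable.
    loss = x_fake.mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()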

4. Pretrained models

We release the pretrained models on Google Drive:

  • GMM-UNIT without self-attention:

Name                 | Description                                                                         | Download Link
-------------------- | ----------------------------------------------------------------------------------- | -------------
gmmunit_gen.pth      | Pretrained generator of the GMM-UNIT model                                          | 👌 link
gmmunit_dis.pth      | Pretrained discriminator of the GMM-UNIT model                                      | 👌 link
gmmunit_ret.pth      | Pretrained retrieval network                                                        | link
image_embeddings.pth | Image embeddings extracted by the pretrained model from the CelebA training images  | link
rgunit_gen.pth       | Pretrained generator of the RG-UNIT model                                           | link
rgunit_dis.pth       | Pretrained discriminator of the RG-UNIT model                                       | link

For fast verification, you can run the following demo:

$ 
  • GMM-UNIT with self-attention:

Name                 | Description                                                                         | Download Link
-------------------- | ----------------------------------------------------------------------------------- | -------------
gmmunit_gen.pth      | Pretrained generator of the GMM-UNIT model                                          | 👌 link
gmmunit_dis.pth      | Pretrained discriminator of the GMM-UNIT model                                      | 👌 link
gmmunit_ret.pth      | Pretrained retrieval network                                                        | link
image_embeddings.pth | Image embeddings extracted by the pretrained model from the CelebA training images  | link
rgunit_gen.pth       | Pretrained generator of the RG-UNIT model                                           | link
rgunit_dis.pth       | Pretrained discriminator of the RG-UNIT model                                       | link

For fast verification, you can run the following demo:

$ 
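The image_embeddings.pth files above store embeddings of the CelebA training images, which the retrieval network compares a query embedding against at inference time. Below is a minimal cosine-similarity lookup sketch; the tensor layout inside the file is an assumption (inspect it with torch.load to see its real structure).

# Hypothetical nearest-neighbour lookup over the released image embeddings.
import torch
import torch.nn.functional as F

embeddings = torch.load("image_embeddings.pth", map_location="cpu")  # assumed: (N, D) tensor
embeddings = F.normalize(embeddings, dim=1)

def retrieve(query, k=5):
    """Return indices of the k training images closest to `query` (a D-dim tensor)."""
    q = F.normalize(query.view(1, -1), dim=1)
    sims = (q @ embeddings.t()).squeeze(0)   # cosine similarity to every image
    return sims.topk(k).indices.tolist()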

5. Evaluation code

We evaluate the performance of the compared models mainly with this repo: GAN-Metrics.
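One standard metric that such toolkits report is FID, which compares Gaussian fits of real and generated feature distributions: FID = ||mu_r - mu_g||^2 + Tr(C_r + C_g - 2 (C_r C_g)^{1/2}). Below is a self-contained sketch of that formula only; it is not GAN-Metrics' API.

# FID between two sets of feature vectors (e.g., Inception activations),
# given as (N, D) NumPy arrays. Sketch of the formula, not GAN-Metrics' API.
import numpy as np
from scipy import linalg

def fid(feats_real, feats_gen):
    mu_r, mu_g = feats_real.mean(0), feats_gen.mean(0)
    c_r = np.cov(feats_real, rowvar=False)
    c_g = np.cov(feats_gen, rowvar=False)
    covmean = linalg.sqrtm(c_r @ c_g)
    if np.iscomplexobj(covmean):   # numerical noise can yield tiny
        covmean = covmean.real     # imaginary parts; drop them
    diff = mu_r - mu_g
    return float(diff @ diff + np.trace(c_r + c_g - 2.0 * covmean))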

References

If you find our project useful, please cite our papers:

@inproceedings{raul2020retrieval,
  author = {Gomez, Raul and Liu, Yahui and De Nadai, Marco and Karatzas, Dimosthenis and Lepri, Bruno and Sebe, Nicu},
  title = {Retrieval Guided Unsupervised Multi-Domain Image to Image Translation},
  booktitle = {Proceedings of the 28th ACM International Conference on Multimedia},
  year = {2020},
  doi = {10.1145/3394171.3413785},
  url = {https://doi.org/10.1145/3394171.3413785}
}

@article{liu2020gmm,
  title={GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image Translation via Attribute Gaussian Mixture Modeling},
  author={Liu, Yahui and De Nadai, Marco and Yao, Jian and Sebe, Nicu and Lepri, Bruno and Alameda-Pineda, Xavier},
  journal={arXiv preprint arXiv:2003.06788},
  year={2020}
}

The baseline model GMM-UNIT is introduced in this paper: GMM-UNIT: Unsupervised Multi-Domain and Multi-Modal Image-to-Image Translation via Attribute Gaussian Mixture Modeling.
