Ganghun Lee, Minji Kim, Yunsu Lee, Minsu Lee, and Byoung-Tak Zhang
An official implementation of the paper "Neural Collage Transfer: Artistic Reconstruction via Material Manipulation".
- Python 3.8.5 (Conda)
- PyTorch 1.11.0
We recommend running the following command after creating a new Python 3.8.5 environment:
$ pip install -r requirements.txt
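For reference, a full setup with Conda might look like the sketch below; the environment name `collage` is just a placeholder:

```bash
# Hypothetical environment name; any name works.
conda create -n collage python=3.8.5
conda activate collage
pip install -r requirements.txt
```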
You can find `infer.sh` for testing your own image. The goal image should be placed in `samples/goal/`.
ex) samples/goal/boat.jpg
The materials are a set of images, so please make your own folder (e.g., `newspaper/`) containing all your material images. Then move the folder into `samples/materials/`.
ex) samples/materials/newspaper/
To make it quick, you can download a prepared set of newspapers from here.
(Vilkin, Aleksey and Safonov, Ilia. (2014). Newspaper and magazine images segmentation dataset. UCI Machine Learning Repository. https://doi.org/10.24432/C5N60V.)
The download contains several kinds of files, but we only need the `.jpg` files (please delete the others).
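For example, assuming the images were extracted into `samples/materials/newspaper/` (the folder name is just an example), the non-`.jpg` files can be removed like this:

```bash
# Keep only the .jpg files in the material folder (path is an example).
find samples/materials/newspaper -type f ! -name '*.jpg' -delete
```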
Please make sure to set your goal/material paths in `infer.sh`:

GOAL_PATH='samples/goals/your_own_goal.jpg'   (not necessarily a .jpg extension)
SOURCE_DIR='samples/materials/your_own_material_folder'
Now you can run the code:
$ bash infer.sh
It will take some time, and the results will be saved at `samples/results/`.
The following parameters can be adjusted:
- `GOAL_RESOLUTION` - result image resolution
- `GOAL_RESOLUTION_FIT` - how to fit the resolution (horizontal | vertical | square)
- `SOURCE_RESOLUTION_RATIO` - material image casting size (0-1)
- `SOURCE_LOAD_LIMIT` - maximum number of material images to load (prevents RAM overload)
- `SOURCE_SAMPLE_SIZE` - number of material images the agent sees at each step
- `MIN_SOURCE_COMPLEXITY` - minimum allowed complexity for materials (prevents using overly simple ones) (>= 0)
- `SCALE_ORDER` - scale sequence for the multi-scale collage
- `NUM_CYCLES` - number of steps for each sliding window
- `WINDOW_RATIO` - stride ratio of the sliding window (0-1) (0.5 means stride = window_size x 0.5)
- `MIN_SCRAP_SIZE` - minimum allowed scrap size (prevents too-small scraps) (0-1)
- `SENSITIVITY` - complexity-sensitivity value for the multi-scale collage
- `FIXED_T` - fixed value of the t channel for the multi-scale collage
- `FPS` - frame rate of the result video
You can also toggle the following options:
- `skip_negative_reward` - whether to undo actions that led to a negative MSE reward
- `paper_like` - whether to use the torn-paper effect
- `disallow_duplicate` - whether to disallow duplicate usage of materials
We recommend adjusting `SENSITIVITY` first, in a range of about 1-5.
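As a concrete illustration, a few of these settings in `infer.sh` might look like the sketch below; the values are placeholders to experiment with, not recommendations:

```bash
# Placeholder values; tune them for your own goal/material pair.
SENSITIVITY=3                      # try the 1-5 range suggested above
SOURCE_LOAD_LIMIT=1000             # cap loaded materials to avoid RAM overload
GOAL_RESOLUTION_FIT='horizontal'   # or 'vertical' / 'square'
```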
Goals and materials should be prepared for training.
This code supports the following datasets for goals:
- ImageNet (2012)
- MNIST
- Flowers-102
- Scene
For materials, this code officially supports only the Describable Textures Dataset (DTD) for training.
Please place all datasets in the same data directory. As an example, here is our directory tree for `~/Datasets`:
Datasets/
├── dtd
│   ├── images
│   ├── imdb
│   └── labels
├── flowers-102
│   ├── imagelabels.mat
│   ├── jpg
│   └── setid.mat
├── imagenet
│   ├── meta.bin
│   ├── train
│   └── val
├── IntelScene
│   ├── train
│   └── val
└── MNIST
    └── raw
Then set `--data_path` in `train.sh` to your data directory.
ex) --data_path ~/Datasets
Before training, please set up and log in to your wandb account for logging.
Set `--goal` in `train.sh` to the right name (imagenet | mnist | flower | scene). Tip: `imagenet` is for general use.
`--source` specifies the material, and it basically supports `dtd` only. However, you can use other materials for specific goal-material pairs: (imagenet-imagenet, mnist-mnist, flower-flower, scene-scene).
Now just run the code to train:
$ bash train.sh
The progress and results will be saved at `outputs/`.
If your RAM gets overloaded, you can decrease the replay memory size with `--replay_size`.
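For reference, a hypothetical excerpt of `train.sh` using the flags described above might look like this; the entry-point script name and all values are placeholders:

```bash
# Log in to wandb once so training can log metrics.
wandb login

# Hypothetical excerpt of train.sh; the script name and values are placeholders.
python train.py \
    --data_path ~/Datasets \
    --goal imagenet \
    --source dtd \
    --replay_size 800
```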
To make the rendering process differentiable, we implemented and pretrained the shaper network, as shown in `shaper/shaper_training.ipynb`.
We also used the Kornia library for differentiable image translation.
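If you want to inspect or retrain the shaper yourself, you can open the notebook (assuming Jupyter is installed):

```bash
# Open the shaper pretraining notebook.
jupyter notebook shaper/shaper_training.ipynb
```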
If you find this work useful, please cite the paper as follows:
@inproceedings{lee2023neural,
title={Neural Collage Transfer: Artistic Reconstruction via Material Manipulation},
author={Lee, Ganghun and Kim, Minji and Lee, Yunsu and Lee, Minsu and Zhang, Byoung-Tak},
booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision},
pages={2394--2405},
year={2023}
}
Many thanks to the authors of Learning to Paint for inspiring this work. They also inspired our other work From Scratch to Sketch.
We also appreciate the contributors of Kornia for providing useful differentiable image processing operators.