Pulling Docker Image
docker pull hellog2n/style_transformer_image:latest
Creating the Docker Container
nvidia-docker run -it --name style_transformer \
  -v ~/style-transformer:/workspace/style-transformer \
  -v /nas2/lait/5000_Dataset/Video/GRID/preprocess/:/workspace/dataset/GRID \
  --gpus=all -p [YOUR_PORT_NUM]:[YOUR_PORT_NUM] --shm-size=8g \
  hellog2n/style_transformer_image:latest /bin/bash
Update configs/paths_config.py with the necessary data paths and model paths for training and inference.
dataset_paths = {
    'train_data': '/path/to/train/data',
    'test_data': '/path/to/test/data',
}
model_paths = {
    'stylegan_ffhq': 'pretrained_models/your_stylegan2_model',
    'ir_se50': 'pretrained_models/your_ir_se50_model',
}
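For reference, the --dataset_type values used in the training commands below are typically resolved through configs/data_configs.py, which maps each dataset name onto the entries in dataset_paths. The sketch below assumes the pSp-style layout this repository follows; the actual entry names may differ:

# configs/data_configs.py (sketch; entry names are assumptions)
from configs import transforms_config
from configs.paths_config import dataset_paths

DATASETS = {
    'ffhq_encode': {
        'transforms': transforms_config.EncodeTransforms,
        'train_source_root': dataset_paths['train_data'],
        'train_target_root': dataset_paths['train_data'],
        'test_source_root': dataset_paths['test_data'],
        'test_target_root': dataset_paths['test_data'],
    },
}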
If you want to use the GRID dataset, use and update the make_grid_dataset function in utils/data_utils.py; a sketch of such a helper is shown below.
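This is a minimal sketch assuming the helper only needs to collect the preprocessed GRID frame paths mounted at /workspace/dataset/GRID; the actual function in utils/data_utils.py may take different arguments and return a different structure:

# utils/data_utils.py (sketch; signature and behavior are assumptions)
import os

IMG_EXTENSIONS = ['.jpg', '.jpeg', '.png']

def make_grid_dataset(root_dir):
    # Walk the preprocessed GRID directory and collect every frame path.
    assert os.path.isdir(root_dir), f'{root_dir} is not a valid directory'
    images = []
    for dirpath, _, fnames in sorted(os.walk(root_dir)):
        for fname in sorted(fnames):
            if os.path.splitext(fname)[1].lower() in IMG_EXTENSIONS:
                images.append(os.path.join(dirpath, fname))
    return images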
python scripts/train.py \
--dataset_type=grid_encode \
--exp_dir=results/train_style_transformer \
--batch_size=8 \
--test_batch_size=8 \
--val_interval=5000 \
--save_interval=10000
Requirements
- Ubuntu 16.04
- NVIDIA GPU + CUDA CuDNN
- Python 3
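Before training, a quick check that the environment can see the GPU (this assumes PyTorch is already installed in the container):

# check_env.py -- hypothetical helper, not part of the repository
import torch

print('PyTorch:', torch.__version__)
print('CUDA available:', torch.cuda.is_available())
if torch.cuda.is_available():
    print('Device:', torch.cuda.get_device_name(0))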
We provide pre-trained inversion models for the face and car domains.
We use rosinality's StyleGAN2 implementation. You can download the 256px pre-trained model from that project and put it in the pretrained_models/ directory.
Moreover, following pSp, we use some pre-trained models to initialize the encoder and to compute the ID loss; you can download them from here and put them in the pretrained_models/ directory as well.
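Before launching training you can sanity-check that the StyleGAN2 weights load; the snippet below assumes the rosinality checkpoint format, which typically stores the generator under a 'g_ema' key:

# verify_stylegan_ckpt.py -- hypothetical helper, not part of the repository
import torch

ckpt = torch.load('pretrained_models/stylegan2-ffhq-config-f.pt', map_location='cpu')
print('Checkpoint keys:', list(ckpt.keys()))  # rosinality checkpoints usually include 'g_ema'
print("Has 'g_ema':", 'g_ema' in ckpt)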
python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=results/train_style_transformer \
--batch_size=8 \
--test_batch_size=8 \
--val_interval=5000 \
--save_interval=10000 \
--stylegan_weights=pretrained_models/stylegan2-ffhq-config-f.pt
python scripts/inference.py \
--exp_dir=results/infer_style_transformer \
--checkpoint_path=results/train_style_transformer/checkpoints/best_model.pt \
--data_path=/test_data \
--test_batch_size=8
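Inference results are written under the given --exp_dir. If you need to double-check which options a checkpoint was trained with, the sketch below assumes the pSp-style convention of storing both 'state_dict' and 'opts' inside the .pt file:

# inspect_checkpoint.py -- hypothetical helper, not part of the repository
import torch

ckpt = torch.load('results/train_style_transformer/checkpoints/best_model.pt', map_location='cpu')
print('Keys:', list(ckpt.keys()))      # typically 'state_dict' and 'opts'
print('Training options:', ckpt.get('opts'))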
If you use this code for your research, please cite:
@inproceedings{hu2022style,
  title={Style Transformer for Image Inversion and Editing},
  author={Hu, Xueqi and Huang, Qiusheng and Shi, Zhengyi and Li, Siyuan and Gao, Changxin and Sun, Li and Li, Qingli},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11337--11346},
  year={2022}
}