hellog2n / style-transformer

Official implementation for "Style Transformer for Image Inversion and Editing" (CVPR 2022)


Style Transformer for Image Inversion and Editing (CVPR 2022)


Style Transformer for LAIT

Updated by @yoojin

Pretrained weights for face images

  • IR-SE50 Model for ID Loss [LINK]
  • FFHQ Pretrained StyleGAN2 Generator [LINK]

Docker Image

Pulling the Docker Image

docker pull hellog2n/style_transformer_image:latest

Creating the Docker Container

nvidia-docker run -it --name style_transformer -v ~/style-transformer:/workspace/style-transformer \
-v /nas2/lait/5000_Dataset/Video/GRID/preprocess/:/workspace/dataset/GRID \
 --gpus=all -p [YOUR_PORT_NUM]:[YOUR_PORT_NUM] --shm-size=8g \
pytorch/pytorch:1.12.1-cuda11.3-cudnn8-devel /bin/bash

Getting Started

Training

Update configs/paths_config.py with the necessary data paths and model paths for training and inference.

dataset_paths = {
    'train_data': '/path/to/train/data',
    'test_data': '/path/to/test/data',
}

model_paths = {
    'stylegan_ffhq': 'pretrained_models/your_stylegan2_model',
    'ir_se50': 'pretrained_models/your_ir_se50_model',
}

If you want to use the GRID dataset, update and use make_grid_dataset in utils/data_utils.py, as sketched below.
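
For reference, a minimal sketch of what such a helper might look like. This is an assumption, not the repo's actual code: the real make_grid_dataset in utils/data_utils.py may take different arguments, and the layout of your preprocessed GRID frames will likely differ.

# Hypothetical sketch only -- check utils/data_utils.py for the real
# signature and adapt the globbing to your preprocessed GRID layout.
from pathlib import Path

IMG_EXTENSIONS = ('.jpg', '.jpeg', '.png')

def make_grid_dataset(root):
    """Collect frame paths from a GRID-style tree, e.g. <root>/<speaker>/<video>/<frame>.png."""
    root = Path(root)
    paths = sorted(p for p in root.rglob('*') if p.suffix.lower() in IMG_EXTENSIONS)
    assert paths, f'No images found under {root}'
    return [str(p) for p in paths]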

Training command

python scripts/train.py \
--dataset_type=grid_encode \
--exp_dir=results/train_style_transformer \
--batch_size=8 \
--test_batch_size=8 \
--val_interval=5000 \
--save_interval=10000



Style Transformer (Original Code)

Getting Started

Prerequisites

  • Ubuntu 16.04
  • NVIDIA GPU + CUDA CuDNN
  • Python 3

Pretrained Models

We provide pre-trained inversion models for the face and car domains.
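
A quick way to sanity-check a downloaded checkpoint before training or inference. This sketch assumes the pSp-style convention of storing weights under 'state_dict' and training options under 'opts'; inspect the keys and adjust if the file is laid out differently (the path below is a placeholder):

# Hedged sketch: assumes a pSp-style checkpoint layout ('state_dict' + 'opts').
import torch

ckpt = torch.load('pretrained_models/best_model.pt', map_location='cpu')  # placeholder path
print(list(ckpt.keys()))            # inspect the actual layout
opts = ckpt.get('opts', {})         # training options, if present
print(opts.get('dataset_type'))     # e.g. 'ffhq_encode' or 'cars_encode'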

Training

Preparing Datasets

Update configs/paths_config.py with the necessary data paths and model paths for training and inference.

dataset_paths = {
    'train_data': '/path/to/train/data',
    'test_data': '/path/to/test/data',
}

Preparing Generator and Encoder

We use rosinality's StyleGAN2 implementation. You can download the 256px pretrained model from that project and put it in the directory pretrained_models/.
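
To verify that the generator checkpoint loads, here is a sketch assuming rosinality's stylegan2-pytorch model.Generator API and a 256px checkpoint with that project's usual 'g_ema' key; the wrapper used inside this repo may differ:

# Sketch assuming rosinality's stylegan2-pytorch is on PYTHONPATH
# and the checkpoint follows its 'g_ema' convention.
import torch
from model import Generator  # rosinality's stylegan2-pytorch

g_ema = Generator(size=256, style_dim=512, n_mlp=8)
ckpt = torch.load('pretrained_models/your_stylegan2_model', map_location='cpu')
g_ema.load_state_dict(ckpt['g_ema'], strict=False)
g_ema.eval()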

Moreover, following pSp, we use pretrained models to initialize the encoder and to compute the ID loss. You can download them from here and put them in the directory pretrained_models/.
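
For context, the ID loss in pSp-style codebases compares IR-SE50 face embeddings of the input and the reconstruction. A minimal sketch of that idea; the facenet argument and the exact cropping/weighting are assumptions, not this repo's exact code:

# Hedged sketch of an identity loss built on IR-SE50 embeddings;
# the real implementation follows the pSp criteria modules.
import torch.nn.functional as F

def id_loss(facenet, x, x_hat):
    """1 - cosine similarity between embeddings of target x and reconstruction x_hat."""
    emb_x = F.normalize(facenet(x), dim=1)       # (B, 512) identity embeddings
    emb_hat = F.normalize(facenet(x_hat), dim=1)
    return (1 - (emb_x * emb_hat).sum(dim=1)).mean()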

Training Inversion Model

python scripts/train.py \
--dataset_type=ffhq_encode \
--exp_dir=results/train_style_transformer \
--batch_size=8 \
--test_batch_size=8 \
--val_interval=5000 \
--save_interval=10000 \
--stylegan_weights=pretrained_models/stylegan2-ffhq-config-f.pt

Inference

python scripts/inference.py \
--exp_dir=results/infer_style_transformer \
--checkpoint_path=results/train_style_transformer/checkpoints/best_model.pt \
--data_path=/test_data \
--test_batch_size=8
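
If you need to preprocess your own test images to match training, pSp-style pipelines typically resize to 256px and normalize to [-1, 1]; a sketch under that assumption (the repo's actual transforms are defined in its configs and may differ):

# Hedged sketch: a typical pSp-style test transform (resize + normalize to [-1, 1]).
from PIL import Image
from torchvision import transforms

transform = transforms.Compose([
    transforms.Resize((256, 256)),
    transforms.ToTensor(),
    transforms.Normalize([0.5, 0.5, 0.5], [0.5, 0.5, 0.5]),
])

img = transform(Image.open('face.jpg').convert('RGB')).unsqueeze(0)  # (1, 3, 256, 256); placeholder filename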

Citation

If you use this code for your research, please cite:

@inproceedings{hu2022style,
  title={Style Transformer for Image Inversion and Editing},
  author={Hu, Xueqi and Huang, Qiusheng and Shi, Zhengyi and Li, Siyuan and Gao, Changxin and Sun, Li and Li, Qingli},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={11337--11346},
  year={2022}
}
