StyleGANEX - Official PyTorch Implementation


This repository provides the official PyTorch implementation for the following paper:

StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces
Shuai Yang, Liming Jiang, Ziwei Liu and Chen Change Loy
Project Page | Paper | Supplementary Video

Abstract: Recent advances in face manipulation using StyleGAN have produced impressive results. However, StyleGAN is inherently limited to cropped aligned faces at a fixed image resolution it is pre-trained on. In this paper, we propose a simple and effective solution to this limitation by using dilated convolutions to rescale the receptive fields of shallow layers in StyleGAN, without altering any model parameters. This allows fixed-size small features at shallow layers to be extended into larger ones that can accommodate variable resolutions, making them more robust in characterizing unaligned faces. To enable real face inversion and manipulation, we introduce a corresponding encoder that provides the first-layer feature of the extended StyleGAN in addition to the latent style code. We validate the effectiveness of our method using unaligned face inputs of various resolutions in a diverse set of face manipulation tasks, including facial attribute editing, super-resolution, sketch/mask-to-face translation, and face toonification.

Features:

Support for Unaligned Faces: StyleGANEX can manipulate normal field-of-view face images and videos.
Compatibility: StyleGANEX can directly load pre-trained StyleGAN parameters without retraining.
Flexible Manipulation: StyleGANEX retains the style representation and editing ability of StyleGAN.

Updates

[03/2023] Inference code is released.
[03/2023] This website is created.

Installation

Clone this repo:

git clone https://github.com/williamyang1991/StyleGANEX.git
cd StyleGANEX

Dependencies:

We have tested on:

CUDA 10.1
PyTorch 1.7.1
Pillow 8.3.1; Matplotlib 3.4.2; opencv-python 4.5.3; Faiss 1.7.1; tqdm 4.61.2; Ninja 1.10.2; dlib 19.24.0
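
To compare a local environment against the tested versions listed above, a small script along these lines may help. It is only a sketch: the pip distribution names are assumptions (for example, Faiss may be installed as faiss-gpu or faiss-cpu), and the helper names are hypothetical, not part of this repository.

```python
from importlib import metadata

# Versions listed as tested in this README. The distribution names used as
# keys are assumptions about how each package was installed with pip.
TESTED = {
    "torch": "1.7.1",
    "Pillow": "8.3.1",
    "matplotlib": "3.4.2",
    "opencv-python": "4.5.3",
    "faiss-gpu": "1.7.1",  # assumption: could also be faiss-cpu
    "tqdm": "4.61.2",
    "ninja": "1.10.2",
    "dlib": "19.24.0",
}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def report(tested=TESTED):
    """Map each distribution to (tested_version, installed_version_or_None)."""
    return {name: (want, installed_version(name)) for name, want in tested.items()}


if __name__ == "__main__":
    for name, (want, have) in report().items():
        print(f"{name:15s} tested {want:8s} installed {have or 'not found'}")
```

A mismatch does not necessarily mean the code will fail; the list above only records the versions that were tested.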

(1) Inference

Inference Notebook

To help users get started, we provide a Jupyter notebook found in ./inference_playground.ipynb that allows one to visualize the performance of StyleGANEX. The notebook will download the necessary pretrained models and run inference on the images found in ./data/.

Pre-trained Models

Pre-trained models can be downloaded from Google Drive, Baidu Cloud (access code: luck) or Hugging Face:

Task	Model	Description
Inversion	styleganex_inversion.pt	pre-trained model for StyleGANEX inversion
Image translation	styleganex_sr32.pt	pre-trained model specially for 32x face super resolution
	styleganex_sr.pt	pre-trained model for 4x-48x face super resolution
	styleganex_sketch2face.pt	pre-trained model for sketch-to-face translation
	styleganex_mask2face.pt	pre-trained model for parsing map-to-face translation
Video editing	styleganex_edit_hair.pt	pre-trained model for hair color editing on videos
	styleganex_edit_age.pt	pre-trained model for age editing on videos
	styleganex_toonify_cartoon.pt	pre-trained Cartoon model for video face toonification
	styleganex_toonify_arcane.pt	pre-trained Arcane model for video face toonification
	styleganex_toonify_pixar.pt	pre-trained Pixar model for video face toonification
Supporting model	faceparsing.pth	BiSeNet for face parsing from face-parsing.PyTorch

We suggest placing the downloaded models in ./pretrained_models/.
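
After downloading, it can be handy to check which files are still missing from ./pretrained_models/. The sketch below does that with the file names from the table above; the function name missing_models is hypothetical and not part of this repository.

```python
from pathlib import Path

# File names copied from the pre-trained model table in this README.
MODEL_FILES = [
    "styleganex_inversion.pt",
    "styleganex_sr32.pt",
    "styleganex_sr.pt",
    "styleganex_sketch2face.pt",
    "styleganex_mask2face.pt",
    "styleganex_edit_hair.pt",
    "styleganex_edit_age.pt",
    "styleganex_toonify_cartoon.pt",
    "styleganex_toonify_arcane.pt",
    "styleganex_toonify_pixar.pt",
    "faceparsing.pth",
]


def missing_models(model_dir="./pretrained_models", names=MODEL_FILES):
    """Return the model file names that are not present in model_dir."""
    root = Path(model_dir)
    return [name for name in names if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_models()
    if missing:
        print("Missing models:", ", ".join(missing))
    else:
        print("All listed models are present.")
```

Only the models needed for the task at hand have to be downloaded, so a non-empty list is not an error in itself.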

StyleGANEX Inversion

We can embed a face image into the latent space of StyleGANEX to obtain its w+ latent code and the first-layer feature f with inversion.py.

python inversion.py --ckpt STYLEGANEX_MODEL_PATH --data_path FACE_IMAGE_PATH

The results are saved in the folder ./output/ and consist of a reconstructed image FILE_NAME_inversion.jpg and a FILE_NAME_inversion.pt file. You can load the w+ latent code and the first-layer feature f from the .pt file as follows:

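import torch
device = 'cuda' if torch.cuda.is_available() else 'cpu'
latents = torch.load('./output/FILE_NAME_inversion.pt')
wplus_hat = latents['wplus'].to(device) # w+ latent code
f_hat = [latents['f'][0].to(device)]    # first-layer feature f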

The ./inference_playground.ipynb provides some face editing examples based on wplus_hat and f_hat.
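
To invert several images in one go, the command above can be assembled per file. The following is a minimal sketch under two assumptions: FILE_NAME is taken to be the input file name without its extension, and the helper names (inversion_command, expected_outputs, invert_folder) are hypothetical and not part of this repository. Only the --ckpt and --data_path flags shown above are used.

```python
import subprocess
from pathlib import Path


def inversion_command(ckpt, image_path):
    """Build the argument list for one inversion.py run."""
    return ["python", "inversion.py", "--ckpt", str(ckpt), "--data_path", str(image_path)]


def expected_outputs(image_path, output_dir="./output"):
    """Guess the output paths, assuming FILE_NAME is the input file stem."""
    stem = Path(image_path).stem
    out = Path(output_dir)
    return out / f"{stem}_inversion.jpg", out / f"{stem}_inversion.pt"


def invert_folder(ckpt, image_dir, pattern="*.png"):
    """Run inversion.py once per matching image (call from the repo root)."""
    for image_path in sorted(Path(image_dir).glob(pattern)):
        subprocess.run(inversion_command(ckpt, image_path), check=True)
        jpg, pt = expected_outputs(image_path)
        print(f"{image_path} -> {jpg}, {pt}")
```

For example, invert_folder("./pretrained_models/styleganex_inversion.pt", "./data") would process every PNG file in ./data/.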

Image Translation

image_translation.py supports face super-resolution, sketch-to-face translation and parsing map-to-face translation.

python image_translation.py --ckpt STYLEGANEX_MODEL_PATH --data_path FACE_INPUT_PATH

The results are saved in the folder ./output/.

Additional notes to consider:

--parsing_model_ckpt (default: pretrained_models/faceparsing.pth): path to the pre-trained parsing model
--resize_factor (default: 32): super resolution resize factor
- For styleganex_sr.pt, should be in [4, 48]
- For styleganex_sr32.pt, should be 32
--number (default: 4): output number of multi-modal translation (for sketch/mask-to-face translation task)
--use_raw_data (default: False):
- if not specified, apply possible pre-processing to the input data
  - For styleganex_sr/sr32.pt, the input face image, e.g., ./data/ILip77SbmOE.png, will be downsampled based on --resize_factor. The downsampled image will also be saved in ./output/.
  - For styleganex_sketch2face.pt, no pre-processing will be applied.
  - For styleganex_mask2face.pt, the input face image, e.g., ./data/ILip77SbmOE.png, will be transformed into a parsing map. The parsing map and its visualization will also be saved in ./output/.
- if specified, directly load input data without pre-processing
  - For styleganex_sr/sr32.pt, the input should be downsampled face images, e.g., ./data/ILip77SbmOE_45x45.png
  - For styleganex_sketch2face.pt, the input should be a one-channel sketch image e.g., ./data/234_sketch.jpg
  - For styleganex_mask2face.pt, the input should be a one-channel parsing map e.g., ./data/ILip77SbmOE_mask.png
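
The constraints on --resize_factor listed above can be checked before launching a run. The sketch below assembles an image_translation.py command from the documented flags and rejects a factor that does not match the chosen super-resolution model. The helper names are hypothetical, and treating --use_raw_data as a value-less switch is an assumption based on the "if specified" wording.

```python
from pathlib import Path


def check_resize_factor(ckpt_path, resize_factor):
    """Return True if resize_factor fits the super-resolution model in ckpt_path."""
    name = Path(ckpt_path).name
    if name == "styleganex_sr32.pt":
        return resize_factor == 32
    if name == "styleganex_sr.pt":
        return 4 <= resize_factor <= 48
    # Other models are not constrained by the notes above.
    return True


def translation_command(ckpt, data_path, resize_factor=None, number=None,
                        use_raw_data=False):
    """Build the argument list for one image_translation.py run."""
    cmd = ["python", "image_translation.py",
           "--ckpt", str(ckpt), "--data_path", str(data_path)]
    if use_raw_data:
        cmd.append("--use_raw_data")  # assumption: flag takes no value
    if number is not None:
        cmd += ["--number", str(number)]
    if resize_factor is not None:
        if not check_resize_factor(ckpt, resize_factor):
            raise ValueError(f"resize factor {resize_factor} does not fit {ckpt}")
        cmd += ["--resize_factor", str(resize_factor)]
    return cmd
```

For instance, translation_command("./pretrained_models/styleganex_sr.pt", "./data/ILip77SbmOE.png", resize_factor=8) yields a list that can be passed to subprocess.run from the repository root.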

Video Editing

video_editing.py supports video facial attribute editing and video face toonification.

python video_editing.py --ckpt STYLEGANEX_MODEL_PATH --data_path FACE_INPUT_PATH

The results are saved in the folder ./output/.

Additional notes to consider:

--data_path: the input can be either an image or a video.
--scale_factor: controls the editing degree for the attribute editing task (styleganex_edit_hair/age).

(2) Training

The training code will be released upon the publication of the paper.

(3) Results

Overview of StyleGANEX inversion and facial attribute/style editing on unaligned faces:

Video facial attribute editing:


Video face toonification:


Citation

If you find this work useful for your research, please consider citing our paper:

@article{yang2023styleganex,
 title = {StyleGANEX: StyleGAN-Based Manipulation Beyond Cropped Aligned Faces},
 author = {Yang, Shuai and Jiang, Liming and Liu, Ziwei and Loy, Chen Change},
 journal = {arXiv preprint arXiv:2303.06146},
 year = {2023},
}

Acknowledgments

The code is mainly developed based on stylegan2-pytorch, pixel2style2pixel and VToonify.
