
Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation (ICCV 2023)

Project Page | Paper

Requirements and Data Preparation

  • Our code is adapted from EG3D and follows its requirements and data preparation.
  • Create an environment:
    conda env create -f environment.yml
    conda activate eg3d
  • Follow EG3D to pre-process FFHQ, AFHQ, and ShapeNet data.
  • Pretrained models are available at Google Drive.
  • The data and model folders should look as follows (a layout-check sketch follows the tree):
    ROOT
        ├──data
            ├──AFHQ
                ├──afhq_v2_256.zip
                ├──afhq_v2_512.zip
            ├──FFHQ
                ├──FFHQ_256.zip
                ├──FFHQ_512.zip
            ├──ShapeNet
                ├──car_128.zip
        ├──out
            ├──afhq256_2d
            ├──afhq256_3d
            ├──afhq512_2d
            ├──afhq512_3d
            ├──ffhq256_2d
            ├──ffhq256_3d
            ├──ffhq512_2d
            ├──ffhq512_3d
            ├──shapenet128_2d
            ├──shapenet128_3d
    
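
As a quick sanity check, the layout above can be verified before training or inference. The snippet below is a hypothetical helper, not part of this repository; the zip names come from the tree above and the experiment folders match the pretrained-model names.

    # check_layout.py -- hypothetical helper, not included in the Mimic3D repo.
    # Verifies the data zips listed above and creates the experiment folders.
    from pathlib import Path

    ROOT = Path(".")  # adjust if your ROOT is elsewhere

    expected_zips = [
        "data/AFHQ/afhq_v2_256.zip",
        "data/AFHQ/afhq_v2_512.zip",
        "data/FFHQ/FFHQ_256.zip",
        "data/FFHQ/FFHQ_512.zip",
        "data/ShapeNet/car_128.zip",
    ]
    expected_experiments = [
        f"out/{name}{suffix}"
        for name in ("afhq256", "afhq512", "ffhq256", "ffhq512", "shapenet128")
        for suffix in ("_2d", "_3d")
    ]

    for rel in expected_zips:
        print(f"[{'ok' if (ROOT / rel).is_file() else 'missing'}] {rel}")
    for rel in expected_experiments:
        (ROOT / rel).mkdir(parents=True, exist_ok=True)  # pretrained models go here
        print(f"[ok] {rel}")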

Inference

./scripts/infer.sh
  • Results will be saved to out/{experiment}/infer

Evaluation

./scripts/val.sh

Training

./scripts/train.sh
  • All assets produced by the training process will be saved to out/{experiment}

Config file

  • In the above .sh files, --cfg can be changed to select different models.
  • In a config file (e.g., configs/ffhq_3d.yaml), key settings are explained as follows:
    # your experiment name
    experiment: 'ffhq512_3d' 
    # training a 512-size 3D model with 8 GPUs and a batch size of 32 takes ~40G of GPU memory
    gpus: 8
    batch: 32
    # resolutions at which 3D-aware convolution is applied
    aware3d_res: [4,8,16,32,64,128,256]
    # model to load; None means training from scratch; we suggest loading a 2D model before training a 3D model, though a 3D model can also be trained from scratch
    resume: '017000' 
    # loss weight for patch discrimination
    patch_gan: 0.1 
    # for a 512-size model w/o 2D super-resolution, FID evaluation takes ~4h; you may want to set `metrics: []` to skip evaluation during training
    metrics: [] 
    # select from video|videos|image
    inference_mode: 'video' 
    # radiance-field rendering resolution for inference and evaluation
    neural_rendering_resolution_infer: 512
    # which tri-plane is used for rendering. 0: coarse & detail tri-planes; 1: coarse tri-plane; 2: detail tri-plane
    coarse: 0 
    # whether to return the tri-plane
    retplane: -1 
    # whether to extract geometry
    shapes: False 
    # maximum number of points per forward pass, to avoid OOM during inference and evaluation
    chunk: 500000 
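
The `chunk` setting reflects a standard volumetric-rendering pattern: ray-sample points are processed in fixed-size batches so a single forward pass never exhausts GPU memory. The sketch below only illustrates that pattern in generic PyTorch; it is not this repository's renderer, and `query_radiance` is a hypothetical stand-in for the actual tri-plane decoder.

    import torch

    def chunked_forward(points, query_radiance, chunk=500_000):
        """Apply `query_radiance` to `points` in chunks of at most `chunk` rows.

        points: (N, 3) tensor of ray-sample positions.
        query_radiance: hypothetical callable mapping (M, 3) -> (M, C) features.
        """
        outputs = []
        for start in range(0, points.shape[0], chunk):
            outputs.append(query_radiance(points[start:start + chunk]))
        return torch.cat(outputs, dim=0)

    # Usage sketch: 512x512 rays with 48 samples per ray is ~12.6M points,
    # so chunking keeps each forward pass at 500k points.
    if __name__ == "__main__":
        pts = torch.rand(512 * 512 * 48, 3)
        feats = chunked_forward(pts, lambda p: p.norm(dim=-1, keepdim=True))
        print(feats.shape)  # torch.Size([12582912, 1])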

Reference

@inproceedings{bib:mimic3d,
  title={Mimic3D: Thriving 3D-Aware GANs via 3D-to-2D Imitation},
  author={Chen, Xingyu and Deng, Yu and Wang, Baoyuan},
  booktitle={Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
  year={2023}
}

Acknowledgement

Our implementation is based on EG3D. We thank the authors for their inspiring implementation.

License

MIT License
