phymhan / P2GAN

Dual Projection Generative Adversarial Networks for Conditional Image Generation

P2GAN: Dual Projection Generative Adversarial Networks for Conditional Image Generation

[pdf] [supp] [arXiv] [slides] [poster]

Figure: Discriminator models for conditional GANs.

The code consists of forks of BigGAN-PyTorch and PyTorch-StudioGAN, described in the sections below.

1-D Mixture-of-Gaussian Experiment (Based on TAC-GAN Code)

Run the 1-D MoG experiments with --distance set to 0, 2, 4, 6, and 8, respectively. Use --num_runs to specify the number of runs and --gan_loss to specify the GAN loss type (bce or hinge).

python run_1D_MoG.py --distance 0 --num_runs 100 --gan_loss bce

Results will be saved under MOG/1D as .txt files in the following format:

method:
mean_0, var_0
mean_1, var_1
mean_2, var_2
mean_m, var_m
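A minimal Python sketch for loading these result files, assuming exactly the layout shown above (the parse_mog_results helper is hypothetical, not part of the repo):

from pathlib import Path

def parse_mog_results(path):
    """Parse a MOG/1D result file into {method: [(mean, var), ...]}."""
    results, method = {}, None
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line:
            continue
        if line.endswith(":"):            # a new method block starts
            method = line[:-1]
            results[method] = []
        elif method is not None:          # a "mean, var" row
            mean, var = (float(v) for v in line.split(","))
            results[method].append((mean, var))
    return results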

VGGFace2 Experiments (BigGAN Codebase)

Experiments for VGGFace2, CIFAR100, and ImageNet at 64-by-64 resolution are based on the BigGAN-PyTorch codebase.

Prepare VGGFace2 subsets

Download the VGGFace2 dataset from the official website and create subsets containing 200, 500, and 2000 identities. The lists of identities used in our experiments are provided in id_v200.txt, id_v500.txt, and id_v2000.txt, respectively.
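VGGFace2 stores one directory per identity, so a subset can be assembled by symlinking the listed identities into a new root. A minimal, hypothetical sketch (paths illustrative; this helper is not part of the repo):

import os
from pathlib import Path

def make_subset(vgg_root, id_file, out_root):
    """Symlink the identity folders named in id_file into out_root."""
    out = Path(out_root)
    out.mkdir(parents=True, exist_ok=True)
    ids = [l.strip() for l in Path(id_file).read_text().splitlines() if l.strip()]
    for identity in ids:
        src = Path(vgg_root) / identity
        dst = out / identity
        if not dst.exists():
            os.symlink(src.resolve(), dst)  # symlink to avoid copying images

make_subset("data/VGGFace2/train", "id_v200.txt", "data/V200")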

To create an HDF5 dataset, run:

python make_hdf5.py --dataset V2000 --dataset_hdf5 VGGFace2000_ --num_workers 8

Fine-tune an Inception V3 model for evaluation on VGGFace2

CUDA_VISIBLE_DEVICES=0,1 python train_inception.py \
--dataset V2000_hdf5 \
--experiment_name newinc_v2000 \
--optimizer adam \
--tensorboard \
--shuffle --batch_size 256 --parallel \
--num_epochs 100 \
--seed 0 --save_every 200

To load the model used in our experiments, download the checkpoint provided here and save it as weights/newinc_v2000/model_itr_20000.pth.
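If you need to reload this checkpoint in your own scripts, a minimal sketch follows, assuming train_inception.py fine-tunes a torchvision Inception V3 with a 2000-way head (check train_inception.py for the exact architecture and checkpoint format):

import torch
import torchvision.models as models

# Assumed architecture: torchvision Inception V3 with a 2000-class head.
net = models.inception_v3(num_classes=2000, aux_logits=False)
state = torch.load("weights/newinc_v2000/model_itr_20000.pth", map_location="cpu")
net.load_state_dict(state)  # unwrap first if the checkpoint stores extra metadata
net.eval()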

Prepare Inception Moments

Prepare inception moments for calculating FID:

python calculate_inception_moments.py --dataset V200_hdf5 --custom_inception_model_path weights/newinc_v2000/model_itr_20000.pth --inception_moments_path data/v200_inc_itr20000.npz

Prepare inception moments for calculating Intra-FID:

python calculate_intra_inception_moments.py --dataset V200_hdf5 --custom_inception_model_path weights/newinc_v2000/model_itr_20000.pth --intra_inception_moments_path data/v200_intra_inc_itr20000
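These .npz files feed the standard FID computation. For reference, a minimal sketch that computes FID directly from two moment files, assuming each stores the activation mean and covariance under the keys mu and sigma (the convention in the BigGAN-PyTorch codebase):

import numpy as np
from scipy import linalg

def fid_from_moments(path1, path2):
    # FID = ||mu1 - mu2||^2 + Tr(S1 + S2 - 2 (S1 S2)^(1/2))
    d1, d2 = np.load(path1), np.load(path2)
    mu1, s1 = d1["mu"], d1["sigma"]
    mu2, s2 = d2["mu"], d2["sigma"]
    covmean, _ = linalg.sqrtm(s1 @ s2, disp=False)
    covmean = covmean.real  # drop tiny imaginary parts from numerical error
    return float(np.sum((mu1 - mu2) ** 2) + np.trace(s1 + s2 - 2.0 * covmean))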

Train P2GAN and baseline models

The current implementation requires training on multiple GPUs; for the VGGFace2 experiments we observe a significant performance drop when running on a single GPU. The following commands were tested on two GPUs.

Train a P2GAN:

CUDA_VISIBLE_DEVICES=0,1 python train.py \
--dataset V200_hdf5 \
--custom_inception_model_path weights/newinc_v2000/model_itr_20000.pth --custom_num_classes 2000 \
--inception_moments_path data/v200_inc_itr20000.npz \
--experiment_name V200_p2 --seed 2018 \
--f_div_loss revkl \
--loss_type hybrid --no_projection --AC --TAC \
--AC_weight 1.0 \
--model BigGAN_hybrid --which_train_fn hybrid \
--tensorboard \
--parallel --shuffle --num_workers 16 --batch_size 256  \
--num_G_accumulations 1 --num_D_accumulations 1 \
--G_ch 32 --D_ch 32 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 32 --D_attn 32 \
--G_nl inplace_relu --D_nl inplace_relu \
--SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 \
--G_ortho 0.0 \
--G_shared \
--G_init ortho --D_init ortho \
--hier --dim_z 120 --shared_dim 128 \
--G_eval_mode \
--ema --use_ema --ema_start 20000 --num_epochs 100 \
--test_every 2000 --save_every 2000 --num_best_copies 2 --num_save_copies 2 \
--use_multiepoch_sampler --sv_log_interval -1 \
--which_best FID \
--save_test_iteration --no_intra_fid 
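For intuition, --loss_type hybrid with --no_projection --AC --TAC corresponds to a discriminator that pairs an unconditional head with twin auxiliary classifiers: AC is trained on real data, TAC on generated data, and their logit difference estimates log p(y|x) - log q(y|x). The sketch below is a simplified paraphrase under that reading, not the repo's BigGAN_hybrid module:

import torch
import torch.nn as nn

class DualHeadD(nn.Module):
    """Toy dual-head discriminator: unconditional score + twin-classifier log-ratio."""
    def __init__(self, feat_dim, n_classes):
        super().__init__()
        self.phi = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, feat_dim), nn.ReLU())
        self.psi = nn.Linear(feat_dim, 1)          # unconditional real/fake head
        self.ac = nn.Linear(feat_dim, n_classes)   # classifier trained on real data
        self.tac = nn.Linear(feat_dim, n_classes)  # twin classifier trained on fake data

    def forward(self, x, y):
        h = self.phi(x)
        # AC minus TAC logits at the true label estimate log p(y|x) - log q(y|x).
        cond = self.ac(h).gather(1, y[:, None]) - self.tac(h).gather(1, y[:, None])
        return self.psi(h) + cond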

Train a P2GAN-w (P2GAN-ap):

CUDA_VISIBLE_DEVICES=0,1 python train.py \
--dataset V200_hdf5 \
--custom_inception_model_path weights/newinc_v2000/model_itr_20000.pth --custom_num_classes 2000 \
--inception_moments_path data/v200_inc_itr20000.npz \
--experiment_name V200_p2ap --seed 2018 \
--f_div_loss revkl \
--detach_weight_linear \
--add_weight_penalty \
--use_hybrid --adaptive_loss sigmoid --adaptive_loss_detach \
--loss_type hybrid --no_projection --AC --TAC \
--AC_weight 1.0 \
--model BigGAN_hybrid --which_train_fn amortised \
--tensorboard \
--parallel --shuffle --num_workers 16 --batch_size 256  \
--num_G_accumulations 1 --num_D_accumulations 1 \
--G_ch 32 --D_ch 32 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 32 --D_attn 32 \
--G_nl inplace_relu --D_nl inplace_relu \
--SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 \
--G_ortho 0.0 \
--G_shared \
--G_init ortho --D_init ortho \
--hier --dim_z 120 --shared_dim 128 \
--G_eval_mode \
--ema --use_ema --ema_start 20000 --num_epochs 100 \
--test_every 2000 --save_every 2000 --num_best_copies 2 --num_save_copies 2 \
--use_multiepoch_sampler --sv_log_interval -1 \
--which_best FID \
--save_test_iteration --no_intra_fid 
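Relative to the plain P2GAN command, the extra flags point to an adaptively weighted objective: --adaptive_loss sigmoid suggests a learned sigmoid gate mixing the loss components, --adaptive_loss_detach stops gradients through the gate, and --add_weight_penalty regularizes the learned weight. A minimal illustrative sketch of such a gate (names hypothetical; see the repo's amortised training function for the actual mechanics):

import torch

alpha = torch.zeros(1, requires_grad=True)  # learned gate parameter

def gated_loss(loss_a, loss_b, detach_gate=True):
    w = torch.sigmoid(alpha)        # squash to (0, 1)
    if detach_gate:
        w = w.detach()              # cf. --adaptive_loss_detach
    return w * loss_a + (1.0 - w) * loss_b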

Train a f-cGAN:

CUDA_VISIBLE_DEVICES=0,1 python train.py \
--dataset V200_hdf5 \
--custom_inception_model_path weights/newinc_v2000/model_itr_20000.pth --custom_num_classes 2000 \
--inception_moments_path data/v200_inc_itr20000.npz \
--experiment_name V200_fc --seed 2018 \
--f_div_loss revkl \
--loss_type TAC --no_projection --AC --TAC \
--AC_weight 1.0 \
--model BigGAN_hybrid --which_train_fn hybrid \
--tensorboard \
--parallel --shuffle --num_workers 16 --batch_size 256 \
--num_G_accumulations 1 --num_D_accumulations 1 \
--G_ch 32 --D_ch 32 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 32 --D_attn 32 \
--G_nl inplace_relu --D_nl inplace_relu \
--SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 \
--G_ortho 0.0 \
--G_shared \
--G_init ortho --D_init ortho \
--hier --dim_z 120 --shared_dim 128 \
--G_eval_mode \
--ema --use_ema --ema_start 20000 --num_epochs 100 \
--test_every 2000 --save_every 2000 --num_best_copies 2 --num_save_copies 2 \
--use_multiepoch_sampler --sv_log_interval -1 \
--which_best FID \
--save_test_iteration --no_intra_fid 
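Both f-cGAN and the TAC-GAN baseline below rest on the decomposition of the joint log-density ratio,

log p(x,y)/q(x,y) = log p(x)/q(x) + log p(y|x)/q(y|x),

where the unconditional discriminator estimates the first term and the difference of the twin classifiers' logits estimates the second; --f_div_loss revkl presumably selects the reverse-KL member of the f-divergence family for this objective.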

Train a Proj-GAN:

CUDA_VISIBLE_DEVICES=0,1 python train.py \
--dataset V200_hdf5 \
--custom_inception_model_path weights/newinc_v2000/model_itr_20000.pth --custom_num_classes 2000 \
--inception_moments_path data/v200_inc_itr20000.npz \
--experiment_name V200_proj --seed 2018 \
--f_div_loss revkl \
--loss_type Projection \
--AC_weight 1.0 \
--model BigGAN --which_train_fn GAN \
--tensorboard \
--parallel --shuffle --num_workers 16 --batch_size 256 \
--num_G_accumulations 1 --num_D_accumulations 1 \
--G_ch 32 --D_ch 32 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 32 --D_attn 32 \
--G_nl inplace_relu --D_nl inplace_relu \
--SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 \
--G_ortho 0.0 \
--G_shared \
--G_init ortho --D_init ortho \
--hier --dim_z 120 --shared_dim 128 \
--G_eval_mode \
--ema --use_ema --ema_start 20000 --num_epochs 100 \
--test_every 2000 --save_every 2000 --num_best_copies 2 --num_save_copies 2 \
--use_multiepoch_sampler --sv_log_interval -1 \
--which_best FID \
--save_test_iteration --no_intra_fid 
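For reference, the projection discriminator of Miyato & Koyama (2018) computes its conditional logit as

D(x, y) = psi(phi(x)) + y^T V phi(x),

where phi is the shared feature extractor, psi a learned unconditional head, y the one-hot label, and V a learned class-embedding matrix.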

Train a TAC-GAN:

CUDA_VISIBLE_DEVICES=0,1 python train.py \
--dataset V200_hdf5 \
--custom_inception_model_path weights/newinc_v2000/model_itr_20000.pth --custom_num_classes 2000 \
--inception_moments_path data/v200_inc_itr20000.npz \
--experiment_name V200_tac --seed 2018 \
--f_div_loss revkl \
--loss_type TAC --no_projection --AC --TAC \
--train_AC_on_fake \
--AC_weight 1.0 \
--model BigGAN_hybrid --which_train_fn hybrid \
--tensorboard \
--parallel --shuffle --num_workers 16 --batch_size 256 \
--num_G_accumulations 1 --num_D_accumulations 1 \
--G_ch 32 --D_ch 32 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 32 --D_attn 32 \
--G_nl inplace_relu --D_nl inplace_relu \
--SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 \
--G_ortho 0.0 \
--G_shared \
--G_init ortho --D_init ortho \
--hier --dim_z 120 --shared_dim 128 \
--G_eval_mode \
--ema --use_ema --ema_start 20000 --num_epochs 100 \
--test_every 2000 --save_every 2000 --num_best_copies 2 --num_save_copies 2 \
--use_multiepoch_sampler --sv_log_interval -1 \
--which_best FID \
--save_test_iteration --no_intra_fid 
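Relative to the f-cGAN command above, the only change is --train_AC_on_fake, which, as the flag name suggests, also trains the AC head on generated samples; this is the configuration used here to reproduce the TAC-GAN baseline.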

ImageNet 64x64 Resolution Experiments (BigGAN Codebase)

Train a P2GAN:

CUDA_VISIBLE_DEVICES=0,1 python train.py \
--dataset I64 \
--experiment_name I64_p2 --seed 0 \
--f_div_loss revkl \
--loss_type hybrid --no_projection --AC --TAC \
--AC_weight 1.0 \
--model BigGAN_hybrid --which_train_fn hybrid \
--tensorboard \
--parallel --shuffle --num_workers 16 --batch_size 256 \
--num_G_accumulations 8 --num_D_accumulations 8 \
--G_ch 32 --D_ch 32 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 32 --D_attn 32 \
--G_nl inplace_relu --D_nl inplace_relu \
--SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 \
--G_ortho 0.0 \
--G_shared \
--G_init ortho --D_init ortho \
--hier --dim_z 120 --shared_dim 128 \
--G_eval_mode \
--ema --use_ema --ema_start 20000 --num_epochs 200 \
--test_every 2000 --save_every 2000 --num_best_copies 2 --num_save_copies 2 \
--use_multiepoch_sampler --sv_log_interval -1 \
--no_intra_fid \
--which_best FID --save_test_iteration 
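Note that, unlike the VGGFace2 runs, the ImageNet-64 commands (including the P2GAN-ap one below) set --num_G_accumulations 8 and --num_D_accumulations 8: gradients are accumulated over 8 passes, giving an effective batch size of 256 x 8 = 2048, in line with standard BigGAN practice.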

Train a P2GAN-w (P2GAN-ap):

CUDA_VISIBLE_DEVICES=0,1 python train.py \
--dataset I64 \
--experiment_name I64_p2ap --seed 0 \
--add_weight_penalty \
--detach_weight_linear \
--use_hybrid --adaptive_loss sigmoid --adaptive_loss_detach \
--loss_type hybrid --no_projection --AC --TAC \
--AC_weight 1.0 \
--model BigGAN_hybrid --which_train_fn amortised \
--tensorboard \
--parallel --shuffle --num_workers 16 --batch_size 256 \
--num_G_accumulations 8 --num_D_accumulations 8 \
--G_ch 32 --D_ch 32 \
--num_D_steps 1 --G_lr 1e-4 --D_lr 4e-4 --D_B2 0.999 --G_B2 0.999 \
--G_attn 32 --D_attn 32 \
--G_nl inplace_relu --D_nl inplace_relu \
--SN_eps 1e-6 --BN_eps 1e-5 --adam_eps 1e-6 \
--G_ortho 0.0 \
--G_shared \
--G_init ortho --D_init ortho \
--hier --dim_z 120 --shared_dim 128 \
--G_eval_mode \
--ema --use_ema --ema_start 20000 --num_epochs 200 \
--test_every 2000 --save_every 2000 --num_best_copies 2 --num_save_copies 2 \
--use_multiepoch_sampler --sv_log_interval -1 \
--no_intra_fid \
--which_best FID --save_test_iteration 

ImageNet 128x128 Resolution Experiments (StudioGAN Codebase)

Experiments for ImageNet at 128-by-128 resolution and for CIFAR10 are based on the StudioGAN codebase. The code was tested on 4 A100 GPUs.

To train a P2GAN model:

CUDA_VISIBLE_DEVICES=0,1,2,3 python src/main.py -t -e -sync_bn -c src/configs/P2GAN/I128_p2.json --eval_type "valid"

To train a P2GAN-ap model:

CUDA_VISIBLE_DEVICES=0,1,2,3 python src/main.py -t -e -sync_bn -c src/configs/P2GAN/I128_p2ap.json --eval_type "valid"

To train a P2GAN-ap-alt model:

CUDA_VISIBLE_DEVICES=0,1,2,3 python src/main.py -t -e -sync_bn -c src/configs/P2GAN/I128_p2ap_exp.json --eval_type "valid"
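In these commands, -t enables training, -e runs evaluation, -sync_bn turns on synchronized batch normalization across GPUs, and -c selects the experiment config; --eval_type "valid" chooses the reference split used for evaluation (flag semantics per the StudioGAN CLI).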

Pretrained Weights and Training Logs

ImageNet 128:

Model         Weight  Log  FID
Proj-GAN      gema    log  23.07
P2GAN         gema    log  16.98
P2GAN-ap      gema    log  19.20
P2GAN-ap-alt  gema    log  16.53

VGGFace2-200:

Model     Weight  Log  FID
Proj-GAN  gema    log  61.43
TAC-GAN   gema    log  96.06
f-cGAN    gema    log  29.54
P2GAN     gema    log  20.70
P2GAN-w   gema    log  15.70

VGGFace2-500:

Model     Weight  Log  FID
Proj-GAN  gema    log  23.57
TAC-GAN   gema    log  19.30
f-cGAN    gema    log  16.74
P2GAN     gema    log  12.09
P2GAN-w   gema    log  12.73

Citation

The P2GAN implementation is heavily based on BigGAN, StudioGAN, and TAC-GAN. If you use this code, please cite:

@InProceedings{Han_2021_ICCV,
    author    = {Han, Ligong and Min, Martin Renqiang and Stathopoulos, Anastasis and Tian, Yu and Gao, Ruijiang and Kadav, Asim and Metaxas, Dimitris N.},
    title     = {Dual Projection Generative Adversarial Networks for Conditional Image Generation},
    booktitle = {Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV)},
    month     = {October},
    year      = {2021},
    pages     = {14438-14447}
}
