shaoanlu / faceswap-GAN

A denoising autoencoder + adversarial losses and attention mechanisms for face swapping.

Timesteps (# gen_iterations) for changing loss_config

zikuicai opened this issue · comments

Could you please explain your scheme for adjusting the loss_config? The reason behind your choice of hyperparameters and the timestep gen_iterations for adjusting? Thanks in advance!

```python
if gen_iterations == (TOTAL_ITERS//5 - display_iters//2):
    # why at iter (TOTAL_ITERS//5 - display_iters//2)?
    loss_config['m_mask'] = 0.0  # why 0.0?
    ...
elif gen_iterations == (TOTAL_ITERS//5 + TOTAL_ITERS//10 - display_iters//2):
    loss_config['m_mask'] = 0.5
    ...
...
```

The setting `loss_config['m_mask'] = 0.0` controls an additional hinge loss on the alpha mask. It works together with `loss_config['use_mask_hinge_loss']`; setting the latter to `False` simply disables the hinge loss on the predicted alpha channel. So in this training scheme, no hinge loss is introduced until `gen_iterations == (TOTAL_ITERS//5 + TOTAL_ITERS//10 - display_iters//2)`.
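As a rough illustration of how the two settings could interact, here is a minimal NumPy sketch, assuming `m_mask` acts as a hinge margin on the predicted alpha values (the repo's actual loss is built from Keras tensors, and the exact formulation may differ):

```python
import numpy as np

def mask_hinge_loss(alpha, m_mask, use_mask_hinge_loss=True):
    """Hinge penalty on the predicted alpha mask (hypothetical sketch).

    Only alpha values above the margin `m_mask` contribute, which
    discourages the mask from saturating toward 1 everywhere.
    With use_mask_hinge_loss=False the term vanishes entirely.
    """
    if not use_mask_hinge_loss:
        return 0.0  # no hinge term during the early training stage
    # max(alpha - m_mask, 0): penalize only values exceeding the margin
    return float(np.mean(np.maximum(alpha - m_mask, 0.0)))

alpha = np.array([0.1, 0.4, 0.9])
mask_hinge_loss(alpha, m_mask=0.5)  # only the 0.9 entry exceeds the margin
mask_hinge_loss(alpha, m_mask=0.0)  # margin 0.0: every nonzero value is penalized
```

Under this reading, raising `m_mask` from 0.0 to 0.5 over training loosens the penalty, letting the mask grow once the RGB outputs have stabilized.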

The reason for rejecting the hinge loss on the alpha mask at the early stage of training is that the faces generated at this stage are very blurry, and it is hard to learn good alpha masks before the RGB outputs converge. My worry is that if we introduce the hinge loss too early, the backprop from the alpha mask might hurt the model.

In addition, it is suggested in this post that one should not even use adversarial losses for most of the training process, and only apply them for fine-tuning in the last 2600 ~ 7800 iterations.
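That suggestion could be sketched as a simple weight schedule, assuming a hypothetical scalar GAN-loss weight (the 2600-iteration window comes from this thread; `TOTAL_ITERS` and the 0.1 weight are made-up placeholders, not the repo's actual values):

```python
# Reconstruction-only training for most iterations, adversarial loss
# enabled only as a fine-tuning phase near the end (hypothetical sketch).
TOTAL_ITERS = 10000
ADV_FINETUNE_ITERS = 2600  # lower end of the suggested 2600 ~ 7800 range

def adversarial_weight(gen_iterations):
    """Return the weight of the adversarial loss term at this iteration."""
    if gen_iterations >= TOTAL_ITERS - ADV_FINETUNE_ITERS:
        return 0.1  # placeholder weight for the fine-tuning phase
    return 0.0      # no adversarial loss for most of training

adversarial_weight(5000)  # mid-training: adversarial loss is off
adversarial_weight(9000)  # final phase: adversarial loss is on
```

The total generator loss would then be `reconstruction_loss + adversarial_weight(gen_iterations) * gan_loss`, so the discriminator only starts shaping the generator once the reconstruction is already decent.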

Enjoy the alchemy!😂😂😂