junyanz / pytorch-CycleGAN-and-pix2pix

Image-to-Image Translation in PyTorch

freezing D when optimizing G

YoojLee opened this issue · comments

self.set_requires_grad([self.netD_A, self.netD_B], False) # Ds require no gradients when optimizing Gs

Thanks for the nice work! I am quite confused: is freezing D when optimizing G really just a speed-up (according to a reply to a previous issue on this topic)?
I thought freezing D when optimizing G was quite important, since G and D should be isolated from each other during optimization. Does it really have nothing to do with the "performance" of training?
I would like to know whether the code I mentioned was written merely to speed up the training process.

Thanks!
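For reference, the set_requires_grad helper called in the quoted line (defined in models/base_model.py) essentially just toggles requires_grad on every parameter of the given networks. A minimal sketch of such a helper:

```python
def set_requires_grad(nets, requires_grad=False):
    """Toggle requires_grad for all parameters of the given network(s).

    With requires_grad=False, autograd will not accumulate .grad for these
    parameters, which saves computation and memory during G's backward pass.
    """
    if not isinstance(nets, list):
        nets = [nets]
    for net in nets:
        if net is not None:
            for param in net.parameters():
                param.requires_grad = requires_grad
```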

These are two separate questions:
(1) Should we optimize G and D jointly or separately?
(2) If we optimize G and D separately, do we need to compute gradients for D while updating G?

For (2): as long as we don't call optimizer_D.step(), the gradients for D will not be used by the optimizer. Therefore, we set requires_grad to False in Line 185 only to skip the unnecessary gradient computation for D.
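To make this concrete, here is a minimal, self-contained sketch of a G update with D frozen (the tiny nn.Linear networks and variable names are illustrative stand-ins, not the repo's actual models). Gradients still flow through D to reach G, but D's own .grad buffers are never filled, and D's weights could not change anyway because optimizer_D.step() is never called here.

```python
import torch
import torch.nn as nn

# Tiny stand-ins for illustration only; the real netG/netD are CycleGAN's generator/discriminator.
netG = nn.Linear(8, 8)
netD = nn.Linear(8, 1)
optimizer_G = torch.optim.Adam(netG.parameters(), lr=2e-4)
criterionGAN = nn.BCEWithLogitsLoss()

for p in netD.parameters():          # freeze D: autograd will skip its .grad buffers
    p.requires_grad = False

real = torch.randn(4, 8)
optimizer_G.zero_grad()
fake = netG(real)
pred = netD(fake)                    # gradients still flow THROUGH D back to G
loss_G = criterionGAN(pred, torch.ones_like(pred))
loss_G.backward()
optimizer_G.step()                   # updates G only; optimizer_D.step() is never called

assert all(p.grad is None for p in netD.parameters())      # D received no gradients
assert all(p.grad is not None for p in netG.parameters())  # G did
```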

For (1), most authors optimize them separately, following the original paper's practice.
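For completeness, the separate updates look roughly like the alternating scheme below (illustrative only, reusing the set_requires_grad sketch from above; not the repo's exact optimize_parameters):

```python
import torch

def optimize_parameters(netG, netD, real, criterionGAN, optimizer_G, optimizer_D):
    # (a) Update G with D frozen, so the backward pass skips D's .grad buffers.
    set_requires_grad(netD, False)
    optimizer_G.zero_grad()
    fake = netG(real)
    pred_fake = netD(fake)
    loss_G = criterionGAN(pred_fake, torch.ones_like(pred_fake))
    loss_G.backward()
    optimizer_G.step()

    # (b) Update D with its gradients enabled; detach fake so this backward never reaches G.
    set_requires_grad(netD, True)
    optimizer_D.zero_grad()
    pred_real = netD(real)
    pred_fake = netD(fake.detach())
    loss_D = 0.5 * (criterionGAN(pred_real, torch.ones_like(pred_real)) +
                    criterionGAN(pred_fake, torch.zeros_like(pred_fake)))
    loss_D.backward()
    optimizer_D.step()
```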

@junyanz Thank you for your reply! Still, I am wondering why we set requires_grad to False in Line 185. Is it mandatory, or does it serve some other purpose, such as a speed-up?