freezing D when optimizing G
YoojLee opened this issue · comments
Thanks for the nice work! I am a bit confused: is freezing D when optimizing G done only as a speed-up (according to a reply in a previous issue on this topic)?
I thought freezing D during the G update was quite important, since G and D should be isolated from each other in the optimization process. Does it really have nothing to do with the "performance" of training?
I would like to confirm that the code I mentioned was written merely to speed up the training process.
Thanks!
These are two separate questions:
(1) should we optimize G and D jointly or not?
(2) If we optimize G and D separately, do we need to compute gradients for D while updating G?
For (2): as long as we don't call optimizer_D.step(), any gradients accumulated in D are never applied by the optimizer, so computing them during the G update is wasted work. Freezing D therefore only saves computation and memory; it does not change the result. That is why we set requires_grad to False in Line 185.
For (1), most authors optimize them separately, following the original paper's practice.
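A minimal PyTorch sketch of this alternating scheme may make the point concrete. The tiny linear networks, the names `netG`/`netD`, and the loss setup below are illustrative assumptions, not the repository's actual code; the key detail is that during the G step, D's parameters have `requires_grad = False` and `optimizer_D.step()` is never called, so D cannot change either way:

```python
import torch
import torch.nn as nn

# Toy stand-ins for the real generator and discriminator (assumed names).
netG = nn.Linear(8, 8)
netD = nn.Sequential(nn.Linear(8, 1), nn.Sigmoid())
criterion = nn.BCELoss()

optimizer_G = torch.optim.Adam(netG.parameters(), lr=2e-4)
optimizer_D = torch.optim.Adam(netD.parameters(), lr=2e-4)

def set_requires_grad(net, flag):
    """Enable/disable gradient computation for all parameters of net."""
    for p in net.parameters():
        p.requires_grad = flag

real = torch.randn(4, 8)
noise = torch.randn(4, 8)

# --- D step: only optimizer_D.step() runs, so G is untouched ---
set_requires_grad(netD, True)
optimizer_D.zero_grad()
fake = netG(noise).detach()  # detach: no gradients flow back into G here
loss_D = criterion(netD(real), torch.ones(4, 1)) + \
         criterion(netD(fake), torch.zeros(4, 1))
loss_D.backward()
optimizer_D.step()

# --- G step: freezing D is purely a speed/memory optimization, because
#     even with requires_grad=True on D, optimizer_D.step() is never
#     called in this phase, so D's weights would not change anyway ---
set_requires_grad(netD, False)
optimizer_G.zero_grad()
loss_G = criterion(netD(netG(noise)), torch.ones(4, 1))
loss_G.backward()  # gradients still flow *through* D into G's parameters
optimizer_G.step()
```

Note that gradients still propagate through D's frozen layers to reach G during `loss_G.backward()`; setting `requires_grad = False` only skips materializing gradients with respect to D's own weights.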