Why do a_p and a_n use .detach()?
zs-zhong opened this issue · comments
Zhisheng Zhong commented
Hi, thanks for your code!
I wonder why a_p and a_n use .detach() here?
Line 28 in d002ecd
As Eq. (9) and (10) in the original paper, it seems a_p and a_n are involved in the gradient computation.
Hope to get your answer, thanks!