KaiyangZhou / mixstyle-release

Domain Generalization with MixStyle (ICLR'21)

The use of detach

SY-Xuan opened this issue

Thanks for your nice work.
mu, sig = mu.detach(), sig.detach()
Why do you use detach on these two parameters?
Could you explain?

This is related to the central idea in MixStyle: perturbing the features in a layer so that the next layer sees data in a "new" style. Blocking the gradients through mu and sigma might prevent the network from erasing this augmentation effect by adjusting its weights.
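To make the role of detach concrete, here is a minimal sketch of a MixStyle-like operation in PyTorch. It is not the repository's exact code: the function name mixstyle, the detach_stats switch, and the default values of alpha and eps are assumptions added for illustration. With detach_stats=True, mu and sig are treated as constants in the backward pass, so gradients reach x only through the normalized features.

```python
import torch


def mixstyle(x, alpha=0.1, eps=1e-6, detach_stats=True):
    """Sketch of style mixing on a feature map x of shape (B, C, H, W).

    `detach_stats` is a hypothetical flag used here only to compare
    the behaviour with and without detach.
    """
    batch_size = x.size(0)

    # Per-instance, per-channel statistics over the spatial dimensions.
    mu = x.mean(dim=[2, 3], keepdim=True)
    var = x.var(dim=[2, 3], keepdim=True)
    sig = (var + eps).sqrt()

    if detach_stats:
        # Block gradients through the statistics, as in the line quoted above.
        mu, sig = mu.detach(), sig.detach()

    x_normed = (x - mu) / sig

    # One mixing weight per instance, drawn from Beta(alpha, alpha).
    lmda = torch.distributions.Beta(alpha, alpha).sample((batch_size, 1, 1, 1))
    lmda = lmda.to(x.device)

    # Pair every instance with another instance of the batch.
    perm = torch.randperm(batch_size, device=x.device)

    mu_mix = lmda * mu + (1 - lmda) * mu[perm]
    sig_mix = lmda * sig + (1 - lmda) * sig[perm]

    # Re-style the normalized features with the mixed statistics.
    return x_normed * sig_mix + mu_mix
```

Note that detach does not change the forward output at all. Given the same random draws, both settings of detach_stats produce identical features, and only the gradients with respect to x differ.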

However, I didn't extensively evaluate this design. I guess it might not affect the performance too much.

Got it. Thank you for your reply.