TinyZeaMays / CircleLoss

PyTorch implementation of the paper "Circle Loss: A Unified Perspective of Pair Similarity Optimization"


Why do a_p and a_n use .detach()?

zs-zhong opened this issue

Hi, thanks for your code!

I wonder why a_p and a_n are computed with .detach() here:

ap = torch.clamp_min(- sp.detach() + 1 + self.m, min=0.)
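For context, the surrounding forward pass presumably looks something like the sketch below, reconstructed from the quoted line and the paper's definitions (O_p = 1 + m, O_n = -m, Delta_p = 1 - m, Delta_n = m). The names an, delta_p, and delta_n and the softplus/logsumexp formulation are my assumptions, not a verbatim copy of the repository:

```python
import torch
from torch import nn, Tensor


class CircleLoss(nn.Module):
    def __init__(self, m: float, gamma: float) -> None:
        super().__init__()
        self.m = m          # relaxation margin
        self.gamma = gamma  # scale factor
        self.soft_plus = nn.Softplus()

    def forward(self, sp: Tensor, sn: Tensor) -> Tensor:
        # Weighting factors: a_p = [1 + m - s_p]_+ and a_n = [s_n + m]_+.
        # Both are computed from detached similarities, so autograd
        # treats them as constants.
        ap = torch.clamp_min(- sp.detach() + 1 + self.m, min=0.)
        an = torch.clamp_min(sn.detach() + self.m, min=0.)

        delta_p = 1 - self.m  # within-class margin Delta_p
        delta_n = self.m      # between-class margin Delta_n

        logit_p = - ap * (sp - delta_p) * self.gamma
        logit_n = an * (sn - delta_n) * self.gamma

        # log(1 + sum_j exp(logit_n_j) * sum_i exp(logit_p_i))
        return self.soft_plus(torch.logsumexp(logit_n, dim=0)
                              + torch.logsumexp(logit_p, dim=0))
```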

According to Eq. (9) and (10) in the original paper, it seems that a_p and a_n are involved in the gradient computation.

Hope to get your answer, thanks!
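To make the question concrete, the effect of .detach() can be checked directly with autograd. The snippet below is a minimal, self-contained sketch; the values of m, gamma, and the similarities are arbitrary, chosen only for illustration:

```python
import torch
import torch.nn.functional as F

m, gamma = 0.25, 256.0  # arbitrary hyper-parameters for illustration
sp = torch.tensor([0.8], requires_grad=True)  # one positive similarity
sn = torch.tensor([0.4], requires_grad=True)  # one negative similarity

def loss(detach: bool) -> torch.Tensor:
    # Optionally cut the autograd graph through the weighting factors.
    src_p = sp.detach() if detach else sp
    src_n = sn.detach() if detach else sn
    ap = torch.clamp_min(- src_p + 1 + m, min=0.)
    an = torch.clamp_min(src_n + m, min=0.)
    logit_p = - ap * (sp - (1 - m)) * gamma
    logit_n = an * (sn - m) * gamma
    return F.softplus(torch.logsumexp(logit_n, 0) + torch.logsumexp(logit_p, 0))

for detach in (True, False):
    gp, gn = torch.autograd.grad(loss(detach), (sp, sn))
    print(f"detach={detach}: dL/dsp = {gp.item():+.3f}, dL/dsn = {gn.item():+.3f}")
```

With detach=True, a_p and a_n enter the gradient only as constant multipliers on gamma; with detach=False, differentiating through the clamp adds extra terms to dL/dsp and dL/dsn, so the two versions print different gradients.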