TinyZeaMays / CircleLoss

PyTorch implementation of the paper "Circle Loss: A Unified Perspective of Pair Similarity Optimization"


Why do a_p and a_n use .detach()?

zs-zhong opened this issue

Hi, thanks for your code!

I wonder why a_p and a_n are computed with .detach() here:

ap = torch.clamp_min(- sp.detach() + 1 + self.m, min=0.)
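For context, the surrounding forward pass presumably looks something like the sketch below, reconstructed from the quoted line and the paper's definitions (O_p = 1 + m, O_n = -m, Delta_p = 1 - m, Delta_n = m). The names an, delta_p, and delta_n and the softplus/logsumexp formulation are my assumptions, not a verbatim copy of the repository:

```python
import torch
from torch import nn, Tensor


class CircleLoss(nn.Module):
    def __init__(self, m: float, gamma: float) -> None:
        super().__init__()
        self.m = m          # relaxation margin
        self.gamma = gamma  # scale factor
        self.soft_plus = nn.Softplus()

    def forward(self, sp: Tensor, sn: Tensor) -> Tensor:
        # Weighting factors: a_p = [1 + m - s_p]_+ and a_n = [s_n + m]_+.
        # Both are computed from detached similarities, so autograd
        # treats them as constants.
        ap = torch.clamp_min(- sp.detach() + 1 + self.m, min=0.)
        an = torch.clamp_min(sn.detach() + self.m, min=0.)

        delta_p = 1 - self.m  # within-class margin Delta_p
        delta_n = self.m      # between-class margin Delta_n

        logit_p = - ap * (sp - delta_p) * self.gamma
        logit_n = an * (sn - delta_n) * self.gamma

        # log(1 + sum_j exp(logit_n_j) * sum_i exp(logit_p_i))
        return self.soft_plus(torch.logsumexp(logit_n, dim=0)
                              + torch.logsumexp(logit_p, dim=0))
```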

According to Eq. (9) and (10) in the original paper, it seems that a_p and a_n are involved in the gradient computation.

Hope to get your answer, thanks!
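To make the question concrete, the effect of .detach() can be checked directly with autograd. The snippet below is a minimal, self-contained sketch; the values of m, gamma, and the similarities are arbitrary, chosen only for illustration:

```python
import torch
import torch.nn.functional as F

m, gamma = 0.25, 256.0  # arbitrary hyper-parameters for illustration
sp = torch.tensor([0.8], requires_grad=True)  # one positive similarity
sn = torch.tensor([0.4], requires_grad=True)  # one negative similarity

def loss(detach: bool) -> torch.Tensor:
    # Optionally cut the autograd graph through the weighting factors.
    src_p = sp.detach() if detach else sp
    src_n = sn.detach() if detach else sn
    ap = torch.clamp_min(- src_p + 1 + m, min=0.)
    an = torch.clamp_min(src_n + m, min=0.)
    logit_p = - ap * (sp - (1 - m)) * gamma
    logit_n = an * (sn - m) * gamma
    return F.softplus(torch.logsumexp(logit_n, 0) + torch.logsumexp(logit_p, 0))

for detach in (True, False):
    gp, gn = torch.autograd.grad(loss(detach), (sp, sn))
    print(f"detach={detach}: dL/dsp = {gp.item():+.3f}, dL/dsn = {gn.item():+.3f}")
```

With detach=True, a_p and a_n enter the gradient only as constant multipliers on gamma; with detach=False, differentiating through the clamp adds extra terms to dL/dsp and dL/dsn, so the two versions print different gradients.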