Objective for push force

JaWeyl opened this issue 2 years ago

Hi all,

I'm a bit confused by the push force in your paper (see eq. 2).

In particular, I'm wondering why the push force is calculated between "mu_k" and "mu_l" even when the index "k" is equal to the index "l".

Intuitively, you cannot push a mean embedding away from itself. Thus, in my opinion, you should not compute this objective when k == l. This idea follows the original paper by Brabandere et al.

I'd appreciate any comments on this.

Adrian Wolny · Answer 1 · Thu Oct 20 2022 13:48:41 GMT+0800 (China Standard Time)

Hi @JaWeyl,

thanks for the question. When k == l the norm between the mean embeddings is zero anyway, so there is no contribution to the loss (to be precise there is an additional constant term in the push loss due to the hinge, see eq. 2, but it doesn't change the training dynamics). In other words, although the push loss is slightly different than the one from Brabandere et al., the underlying optimization problem is the same.

Constantin Pape · Answer 2 · Thu Oct 20 2022 15:18:33 GMT+0800 (China Standard Time)

> When k == l the norm between the mean embeddings is zero anyway, so there is no contribution to the loss

Sorry to jump in here, but this doesn't really make sense to me. It would mean that the solution of having the same mean embedding for all clusters wouldn't induce any contribution from the push term.
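
To make the objection concrete, here is a short worked step. It assumes the push term uses a squared hinge with margin 2 * delta_d, as in Brabandere et al.; eq. 2 is not reproduced in this thread, so the exact form and the symbol delta_d are assumptions. Under that assumption, a pair of identical means yields the largest possible hinge value rather than zero:

```latex
\left[\, 2\delta_d - \lVert \mu_k - \mu_l \rVert \,\right]_+^2
\;\Big|_{\mu_k = \mu_l}
= \left[\, 2\delta_d - 0 \,\right]_+^2
= 4\,\delta_d^2
```

So a zero distance between two different cluster means is exactly the case the push term penalises most.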

Adrian Wolny · Answer 3 · Sat Oct 22 2022 21:15:23 GMT+0800 (China Standard Time)

Hi @constantinpape,

you're totally correct, and the point raised by @JaWeyl is indeed valid. I had something else in mind, and the explanation I gave is incorrect. The implementation, however, is correct: see https://github.com/kreshuklab/spoco/blob/main/spoco/losses.py#L448, where the diagonal in the hinged distance matrix is zero, so there is indeed no contribution to the push loss when k == l. This, however, is not reflected in eq. 2 in the paper, which is a mistake. Thanks for pointing that out, @JaWeyl. I'll update eq. 2 in the new version on arXiv.
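
For readers who want to see the effect of skipping the diagonal, below is a minimal plain-Python sketch. It is not the code from the linked repository (which is written in PyTorch); the function name `push_loss`, the parameter `delta_dist`, its default value, and the `exclude_diagonal` switch are all hypothetical and chosen for illustration. It assumes a squared hinge with margin 2 * delta_dist.

```python
import math


def push_loss(means, delta_dist=1.5, exclude_diagonal=True):
    """Squared-hinge push term over cluster means (illustrative sketch only).

    means: list of equal-length coordinate tuples, one per cluster.
    delta_dist: assumed hinge parameter; the margin used is 2 * delta_dist.
    exclude_diagonal: if True, pairs with k == l are skipped.
    """
    num_clusters = len(means)
    if exclude_diagonal:
        n_pairs = num_clusters * (num_clusters - 1)  # C * (C - 1) ordered pairs
    else:
        n_pairs = num_clusters ** 2  # all C ** 2 pairs, diagonal included
    if n_pairs == 0:
        return 0.0  # no pairs to push apart

    total = 0.0
    for k in range(num_clusters):
        for l in range(num_clusters):
            if exclude_diagonal and k == l:
                continue  # a mean embedding is not pushed away from itself
            dist = math.dist(means[k], means[l])
            total += max(0.0, 2.0 * delta_dist - dist) ** 2
    return total / n_pairs


# Three well-separated means: every off-diagonal distance exceeds the margin.
separated = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0)]
# Three identical means: the collapsed solution discussed above.
collapsed = [(0.0, 0.0), (0.0, 0.0), (0.0, 0.0)]

print(push_loss(separated))                          # off-diagonal pairs only
print(push_loss(separated, exclude_diagonal=False))  # diagonal adds a constant
print(push_loss(collapsed))                          # collapsed means are penalised
```

With the diagonal included, well-separated means still leave a constant offset from the k == l terms; with it excluded, the loss reaches zero once all distinct pairs are beyond the margin.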

JaWeyl · Answer 4 · Sun Oct 23 2022 18:21:18 GMT+0800 (China Standard Time)

Hi @wolny and @constantinpape,

thank you both for the discussion!

@wolny great - just be sure to also change the normalisation factor, which should then be (C ** 2 - C), i.e. C * (C - 1).
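
For reference, a push term that excludes k == l might be written as follows. This is a sketch in the style of Brabandere et al., not a quotation of the updated eq. 2; the squared hinge and the margin symbol delta_d are assumptions. The normaliser counts the ordered pairs that remain after removing the C diagonal entries from the C ** 2 total:

```latex
L_{\text{push}}
= \frac{1}{C(C-1)}
  \sum_{k=1}^{C} \sum_{\substack{l=1 \\ l \neq k}}^{C}
  \left[\, 2\delta_d - \lVert \mu_k - \mu_l \rVert \,\right]_+^2 ,
\qquad
C(C-1) = C^2 - C
```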