kreshuklab / spoco

PyTorch implementation of SPOCO

Objective for push force

JaWeyl opened this issue

Hi all,

I'm a bit confused by the push force in your paper (see eq. 2).

In particular, I'm wondering why the push force is calculated between "mu_k" and "mu_l" even when the index "k" is equal to the index "l".

Intuitively, you cannot push a mean embedding away from itself. Thus, in my opinion, you should not compute this objective when k == l. This idea follows the original paper by De Brabandere et al.

I'd appreciate any comments on this.

Hi @JaWeyl,

thanks for the question. When k == l the norm between the mean embeddings is zero anyway, so there is no contribution to the loss (to be precise there is an additional constant term in the push loss due to the hinge, see eq. 2, but it doesn't change the training dynamics). In other words, although the push loss is slightly different than the one from Brabandere et al., the underlying optimization problem is the same.

> When k == l the norm between the mean embeddings is zero anyway, so there is no contribution to the loss

Sorry to jump in here, but this doesn't really make sense to me. It would mean that the solution of having the same mean embedding for all clusters wouldn't induce any contribution from the push term.
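To make this objection concrete, here is a minimal sketch of a single hinged push term. It assumes the squared-hinge form with margin `2 * delta_d` used by De Brabandere et al.; the function name `hinged_push_term` and the default `delta_d=1.0` are illustrative and not taken from the SPOCO code.

```python
import math


def hinged_push_term(mu_k, mu_l, delta_d=1.0):
    """Squared hinge on the distance between two mean embeddings.

    Assumed form: max(0, 2 * delta_d - ||mu_k - mu_l||) ** 2.
    """
    dist = math.dist(mu_k, mu_l)
    return max(0.0, 2.0 * delta_d - dist) ** 2


# Two clusters collapsed onto the same mean: the distance is zero,
# so the hinge is fully active and the term is (2 * delta_d) ** 2.
print(hinged_push_term([0.5, 0.5], [0.5, 0.5]))  # 4.0

# Two clusters further apart than the margin: the term vanishes.
print(hinged_push_term([0.0, 0.0], [3.0, 4.0]))  # 0.0
```

A zero distance therefore gives the largest possible term rather than no contribution, which is why collapsed means are penalised by the push term.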

Hi @constantinpape,

you're totally correct, and the point raised by @JaWeyl is indeed valid. I had something else in mind, and the explanation I gave is incorrect. The implementation, however, is correct: see https://github.com/kreshuklab/spoco/blob/main/spoco/losses.py#L448, where the diagonal in the hinged distance matrix is zero, so there is indeed no contribution to the push loss when k == l. This, however, is not reflected in eq. 2 in the paper, which is a mistake. Thanks for pointing that out, @JaWeyl. I'll update eq. 2 in the new version on arXiv.
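The behaviour described here can be sketched in a few lines. This is not the code from `spoco/losses.py`: it is a dependency-free illustration in plain Python, and the name `push_loss`, the `skip_diagonal` flag, the margin `2 * delta_d` and the `C * (C - 1)` normalisation are all assumptions made for the example.

```python
import math


def push_loss(means, delta_d=1.0, skip_diagonal=True):
    """Average squared hinge over pairs of cluster mean embeddings.

    With skip_diagonal=True the k == l pairs are left out and the sum is
    divided by C * (C - 1); otherwise all C * C pairs are averaged.
    """
    C = len(means)
    if C < 2:
        return 0.0  # a single cluster has nothing to be pushed away from
    total = 0.0
    for k in range(C):
        for l in range(C):
            if skip_diagonal and k == l:
                continue
            dist = math.dist(means[k], means[l])
            total += max(0.0, 2.0 * delta_d - dist) ** 2
    norm = C * (C - 1) if skip_diagonal else C * C
    return total / norm


means = [[0.0, 0.0], [1.0, 0.0], [0.0, 5.0]]
print(push_loss(means))                       # diagonal excluded
print(push_loss(means, skip_diagonal=False))  # diagonal included

# All means collapsed to one point: every off-diagonal pair is penalised.
print(push_loss([[1.0, 1.0]] * 3))
```

Including the diagonal adds one `(2 * delta_d) ** 2` term per cluster regardless of where the means lie, so excluding it changes the value of the loss but not which pairs are pushed apart.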

Hi @wolny and @constantinpape,

thank you both for the discussion!

@wolny great! Just be sure to also change the normalisation factor, which should then be C ** 2 - C, i.e. C * (C - 1), the number of ordered pairs with k != l.
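For reference, a push term with both changes applied (no k == l pairs, normalisation C(C - 1)) could be written as below. The notation follows De Brabandere et al.; the margin symbol \delta_d and the hinge [x]_+ = max(0, x) are assumptions about how eq. 2 is written, not a copy of the paper.

```latex
L_{\mathrm{push}}
  = \frac{1}{C(C-1)}
    \sum_{k=1}^{C} \sum_{\substack{l=1 \\ l \neq k}}^{C}
    \left[ 2\delta_d - \lVert \mu_k - \mu_l \rVert \right]_+^{2}
```

If the uncorrected version sums over all pairs with a 1/C^2 normalisation (again an assumption), each of the C diagonal terms equals (2\delta_d)^2, so for a fixed C the two versions are related by

```latex
\frac{1}{C^{2}} \sum_{k=1}^{C} \sum_{l=1}^{C}
  \left[ 2\delta_d - \lVert \mu_k - \mu_l \rVert \right]_+^{2}
  = \frac{C-1}{C}\, L_{\mathrm{push}} + \frac{(2\delta_d)^{2}}{C}
```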