bioinf-jku / TTUR

Two time-scale update rule for training GANs

Between np.trace(whole term) and sum of np.trace(each term)

miqbal23 opened this issue

Hi, I found that you implemented the final FID calculation in this line:

return diff.dot(diff) + np.trace(sigma1) + np.trace(sigma2) - 2 * tr_covmean

But according to your formula, the trace is applied once to the whole term, after both sigmas and covmean have been calculated, so it should be:

return diff.dot(diff) + np.trace(sigma1 + sigma2 - 2 * covmean)

I have experimented with both, and the results are nearly identical (they agree to about 13 decimal places). Is there a reason for using the former?

Hi! The two expressions are mathematically equivalent because the trace is a linear operator (the trace of a sum is the sum of the traces, and scalar factors can be pulled out), so for the result it doesn't really matter which one we use. We picked the former as it has a slightly better running time: a trace is O(N), while adding two matrices is O(N^2) (where N is the side length of the matrix). But this is of course negligible compared to the calculation of the activations anyway.
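For anyone who wants to check the equivalence numerically, here is a minimal sketch. It does not use the repository's code: `sigma1`, `sigma2`, and `covmean` are random stand-ins built only for illustration, and the matrix size `n` is an arbitrary assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 64  # arbitrary side length, chosen only for this illustration

# Hypothetical stand-ins for the covariance matrices (symmetric PSD by construction).
a = rng.standard_normal((n, n))
b = rng.standard_normal((n, n))
sigma1 = a @ a.T
sigma2 = b @ b.T

# Stand-in for covmean; linearity of the trace holds for any square matrix,
# so a random matrix is enough here.
covmean = rng.standard_normal((n, n))

# Trace of the whole term: builds an n x n matrix first (O(N^2) additions).
trace_of_sum = np.trace(sigma1 + sigma2 - 2 * covmean)

# Sum of the individual traces: only touches the diagonals (O(N)).
sum_of_traces = np.trace(sigma1) + np.trace(sigma2) - 2 * np.trace(covmean)

# The two values agree up to floating-point rounding.
agree = bool(np.isclose(trace_of_sum, sum_of_traces))
print(agree)
```

Any tiny difference in the last digits, like the one reported above, is most likely floating-point rounding: the two expressions add the same numbers in a different order.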

I see. I have confirmed it by reading up on the trace operation. Thank you for the confirmation.