bioinf-jku / TTUR

Two time-scale update rule for training GANs


FID get "nan" or "complex number"

ZhimingZhou opened this issue · comments

Hi,

I'm trying FID.

I get "nan" or "complex number", which stems from the "sp.linalg.sqrtm".

Have you ever faced similar issues? How should I solve this problem?

Thanks a lot.

Hi,
to calculate the covariance matrix you need at least 2048 examples (the dimension of the Inception coding layer); otherwise the covariance matrix is not full rank and you do indeed get complex numbers. Maybe that's the reason here?
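As a quick made-up sketch of why (with a smaller dimension than 2048 just to keep it fast): the sample covariance of n samples has rank at most n - 1, so with fewer samples than dimensions it can never be full rank.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 300, 50                     # d-dimensional codings, far fewer samples
x = rng.standard_normal((n, d))
cov = np.cov(x, rowvar=False)      # (d, d) covariance estimate

# one degree of freedom is lost to the mean, so rank <= n - 1
rank = np.linalg.matrix_rank(cov)
print(rank)                        # at most 49, far below d = 300
```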

Thanks for your response, that solved my problem: it needs more samples to avoid complex numbers.

Hi,

With enough samples, I no longer get complex numbers.

But I still sometimes get "nan". How can I fix this?

Thanks for your help!

Hi!
In my experience this happens when a model mode-collapses, so that the covariance matrix does not have full rank (or rather, a rank smaller than 2048). Have you tried plotting the generated images to check whether this is the case?
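A small made-up sketch of that effect: even with thousands of samples, a handful of distinct modes keeps the covariance rank tiny.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256
modes = rng.standard_normal((5, d))        # 5 distinct "images"
x = modes[rng.integers(0, 5, size=5000)]   # 5000 samples, only 5 modes

cov = np.cov(x, rowvar=False)
rank = np.linalg.matrix_rank(cov)
print(rank)                                # at most 5, despite 5000 samples
```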

Hi,

Thanks for your reply! I'm using FID with a different pretrained model, whose activations have only 250 dimensions. I checked the covariance matrix and found that even the covariance matrix of the training data is not full rank.

It seems the covariance can fail to be full rank when the pretrained network places no constraint on its activation space. (Right?) By that reasoning, the Inception model's activation space usually yields a full-rank covariance matrix, but with mode-collapsed data it loses full rank and people get "nan".

Requiring a full-rank covariance matrix in the activation space makes the FID slightly hard to use in practice. Is it OK to simply use the squared distances between the means and between the covariances? What is the benefit of the Fréchet distance?

@ZhimingZhou
Hi, my program gets stuck at the "sp.linalg.sqrtm" step. Do you know the reason?

Here's some code I've used that's a little more forgiving about numerical errors:

import numpy as np
from scipy import linalg
import warnings

def fid(mn1, cov1, mn2, cov2, eps=1e-6):
    # means as 1-d arrays, covariances as 2-d arrays
    mn1 = np.atleast_1d(mn1)
    mn2 = np.atleast_1d(mn2)

    cov1 = np.atleast_2d(cov1)
    cov2 = np.atleast_2d(cov2)

    diff = mn1 - mn2

    # product might be almost singular
    covmean, _ = linalg.sqrtm(cov1.dot(cov2), disp=False)
    if not np.isfinite(covmean).all():
        warnings.warn(("fid() got singular product; adding {} to diagonal of "
                       "cov estimates").format(eps))
        # regularise: add eps to the diagonal and retry
        offset = np.eye(cov1.shape[0]) * eps
        covmean = linalg.sqrtm((cov1 + offset).dot(cov2 + offset))

    # numerical error might give slight imaginary component
    if np.iscomplexobj(covmean):
        if not np.allclose(np.diagonal(covmean).imag, 0, atol=1e-3):
            m = np.max(np.abs(covmean.imag))
            raise ValueError("Imaginary component {}".format(m))
        covmean = covmean.real

    tr_covmean = np.trace(covmean)

    return diff.dot(diff) + np.trace(cov1) + np.trace(cov2) - 2 * tr_covmean
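Here's a standalone sketch of what that eps fallback does, with made-up rank-deficient covariances: adding eps to the diagonal makes the matrices positive definite, so sqrtm of their product stays finite.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
v = rng.standard_normal((3, 6))
cov1 = v.T @ v                 # 6x6 but only rank 3: singular
cov2 = cov1.copy()

eps = 1e-6
offset = np.eye(cov1.shape[0]) * eps
covmean = linalg.sqrtm((cov1 + offset).dot(cov2 + offset))
if np.iscomplexobj(covmean):
    covmean = covmean.real     # drop tiny imaginary round-off

print(np.isfinite(covmean).all())   # True: the offset made the product PD
```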

Hi Dougal,

This is indeed a more stable implementation of the FID calculation and it reproduces the results of the current version. Can we use it to replace our code in calculate_frechet_distance()? Thanks for contributing!

Of course, feel free. :)

Done, thnx :)

@ZhimingZhou,

you can use the net inputs wx + b instead of the activations act(wx + b) from your coding layer, e.g. if it is not a pooling layer like the one we use in the Inception net. The benefit is that net inputs are more likely Gaussian, which is better for calculating the FID, because the FID is actually the distance between two Gaussians. You should then get full-rank matrices. Simply using squared distances between the means and covariance matrices (and how would you combine them?) will definitely give you wrong results in the sense of a distance between probability distributions.


This code does not seem to fix the problem: I still get the error: ValueError: Imaginary component 2.0828361505257897e+111

Downgrading scipy to 1.11.1 solved the problem for me.

It works! Thank you!