bioinf-jku / TTUR

Two time-scale update rule for training GANs


FID get "nan" or "complex number"

ZhimingZhou opened this issue · comments

Hi,

I'm trying FID.

I get "nan" or "complex number", which stems from the "sp.linalg.sqrtm".

Have you ever faced similar issues? How should I solve this problem?

Thanks a lot.

Hi,
to calculate the covariance matrix you need at least 2048 examples (the dimension of the Inception coding layer); otherwise the covariance matrix is not full rank and you do indeed get complex numbers. Maybe that's the reason here?
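As a quick made-up sketch of why (with a smaller dimension than 2048 just to keep it fast): the sample covariance of n samples has rank at most n - 1, so with fewer samples than dimensions it can never be full rank.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 300, 50                     # d-dimensional codings, far fewer samples
x = rng.standard_normal((n, d))
cov = np.cov(x, rowvar=False)      # (d, d) covariance estimate

# one degree of freedom is lost to the mean, so rank <= n - 1
rank = np.linalg.matrix_rank(cov)
print(rank)                        # at most 49, far below d = 300
```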

Thanks for your response, that solved my problem: it needs more samples to avoid complex numbers.

Hi,

With enough samples, I no longer get complex numbers.

But I still sometimes get "nan". How can I fix this?

Thanks for your help!

Hi!
In my experience this happens when a model mode-collapses, so that the covariance matrix does not have full rank (or rather, a rank smaller than 2048). Have you tried plotting the generated images to check whether this is the case?
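A small made-up sketch of that effect: even with thousands of samples, a handful of distinct modes keeps the covariance rank tiny.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 256
modes = rng.standard_normal((5, d))        # 5 distinct "images"
x = modes[rng.integers(0, 5, size=5000)]   # 5000 samples, only 5 modes

cov = np.cov(x, rowvar=False)
rank = np.linalg.matrix_rank(cov)
print(rank)                                # at most 5, despite 5000 samples
```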

Hi,

Thanks for your reply! I'm using FID with a different pretrained model, whose activations have only 250 dimensions. I checked the covariance matrix and found that even the covariance matrix of the training data is not full rank.

It seems the covariance can fail to be full rank when the pretrained network places no constraint on its activation space. (Right?) By that reasoning, the Inception model's activation space usually yields a full-rank covariance matrix, but with mode-collapsed data it loses full rank and people get "nan".

Requiring a full-rank covariance matrix in the activation space makes the FID slightly hard to use in practice. Is it OK to simply use the squared distances between the means and between the covariances? What is the benefit of the Fréchet distance?

@ZhimingZhou
Hi, my program gets stuck at the "sp.linalg.sqrtm" step. Do you know the reason?

Here's some code I've used that's a little more forgiving about numerical errors:

import numpy as np
from scipy import linalg
import warnings

def fid(mn1, cov1, mn2, cov2, eps=1e-6):
    # means as 1-d arrays, covariances as 2-d arrays
    mn1 = np.atleast_1d(mn1)
    mn2 = np.atleast_1d(mn2)

    cov1 = np.atleast_2d(cov1)
    cov2 = np.atleast_2d(cov2)

    diff = mn1 - mn2

    # product might be almost singular
    covmean, _ = linalg.sqrtm(cov1.dot(cov2), disp=False)
    if not np.isfinite(covmean).all():
        warnings.warn(("fid() got singular product; adding {} to diagonal of "
                       "cov estimates").format(eps))
        # regularise: add eps to the diagonal and retry
        offset = np.eye(cov1.shape[0]) * eps
        covmean = linalg.sqrtm((cov1 + offset).dot(cov2 + offset))

    # numerical error might give slight imaginary component
    if np.iscomplexobj(covmean):
        if not np.allclose(np.diagonal(covmean).imag, 0, atol=1e-3):
            m = np.max(np.abs(covmean.imag))
            raise ValueError("Imaginary component {}".format(m))
        covmean = covmean.real

    tr_covmean = np.trace(covmean)

    return diff.dot(diff) + np.trace(cov1) + np.trace(cov2) - 2 * tr_covmean
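Here's a standalone sketch of what that eps fallback does, with made-up rank-deficient covariances: adding eps to the diagonal makes the matrices positive definite, so sqrtm of their product stays finite.

```python
import numpy as np
from scipy import linalg

rng = np.random.default_rng(0)
v = rng.standard_normal((3, 6))
cov1 = v.T @ v                 # 6x6 but only rank 3: singular
cov2 = cov1.copy()

eps = 1e-6
offset = np.eye(cov1.shape[0]) * eps
covmean = linalg.sqrtm((cov1 + offset).dot(cov2 + offset))
if np.iscomplexobj(covmean):
    covmean = covmean.real     # drop tiny imaginary round-off

print(np.isfinite(covmean).all())   # True: the offset made the product PD
```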

Hi Dougal,

This is indeed a more stable implementation of the FID calculation and it reproduces the results of the current version. Can we use it to replace our code in calculate_frechet_distance()? Thanks for contributing!

Of course, feel free. :)

Done, thnx :)

@ZhimingZhou,

you can use the net inputs wx + b instead of the activations act(wx + b) from your coding layer, e.g. if it is not a pooling layer like the one we use in the Inception net. The benefit is that net inputs are more likely Gaussian, which is better for calculating the FID, because the FID is actually the distance between two Gaussians. You should then get full-rank matrices. Simply using squared distances between the means and covariance matrices (and how would you combine them?) will definitely give you wrong results in the sense of a distance between probability distributions.


This code does not seem to fix the problem: I still get the error: ValueError: Imaginary component 2.0828361505257897e+111

Downgrading scipy to 1.11.1 solved the problem for me.

It works! Thank you!