rsagroup / rsatoolbox

Python library for Representational Similarity Analysis

Correlation values could be Fisher z-transformed before t-tests?

alexeperon opened this issue

Hi all,

When performing a group-level analysis with RSA, we might want to test correlation values against 0. In rsatoolbox this can be done, for example, with the .inference.eval_fixed function, which yields the per-subject distribution of correlation values that is then tested against 0.
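For example (a minimal sketch, not rsatoolbox's built-in inference; models and group_rdms are placeholders for the user's models and subject RDMs, and the (1, n_models, n_rdm) layout of result.evaluations is taken from the eval_fixed source quoted below):

from scipy import stats
import rsatoolbox

result = rsatoolbox.inference.eval_fixed(models, group_rdms, method='corr')
per_subject = result.evaluations[0]  # shape (n_models, n_rdm): one correlation per model and subject
for i, scores in enumerate(per_subject):
    t, p = stats.ttest_1samp(scores, 0.0)  # raw r values, no Fisher transform
    print(i, t, p)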

Conventionally, correlation values (r or rho) are z-transformed using the Fisher transform prior to running t-tests. This makes the sampling distribution of the correlation scores approximately normal (normality being an assumption of the t-test).
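For reference, the Fisher transform is just the inverse hyperbolic tangent, z = arctanh(r) = 0.5 * ln((1 + r) / (1 - r)), so in NumPy it is a one-liner:

import numpy as np

r = 0.9
z_arctanh = np.arctanh(r)                      # Fisher z-transform
z_explicit = 0.5 * np.log((1 + r) / (1 - r))   # equivalent closed form
print(np.isclose(z_arctanh, z_explicit), round(z_arctanh, 3))  # True 1.472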

It appears that rsatoolbox does not currently do this, which might bias results for groups of subjects with high correlation scores.

There are multiple points where this could be added, but probably the logical point is in the rsatoolbox.inference.evaluate module, specifically the eval_fixed function. This means that (1) the compare functions still provide correlation scores, and (2) any downstream functions use z-scored evaluation values.

The current version of the function (0.1.5) looks like this:

def eval_fixed(models, data, theta=None, method='cosine'):
    """evaluates models on data, without any bootstrapping or
    cross-validation

    Args:
        models(list of rsatoolbox.model.Model or list): models to be evaluated
        data(rsatoolbox.rdm.RDMs): data to evaluate on
        theta(numpy.ndarray): parameter vector for the models
        method(string): comparison method to use

    Returns:
        float: evaluation

    """
    models, evaluations, theta, _ = input_check_model(models, theta, None, 1)
    evaluations = np.repeat(np.expand_dims(evaluations, -1),
                            data.n_rdm, -1)
    for k, model in enumerate(models):
        rdm_pred = model.predict_rdm(theta=theta[k])
        evaluations[k] = compare(rdm_pred, data, method)
    evaluations = evaluations.reshape((1, len(models), data.n_rdm))
    noise_ceil = boot_noise_ceiling(
        data, method=method, rdm_descriptor='index')
    if data.n_rdm > 1:
        variances = np.cov(evaluations[0], ddof=0) \
            / evaluations.shape[-1]
        dof = evaluations.shape[-1] - 1
    else:
        variances = None
        dof = 0
    result = Result(models, evaluations, method=method,
                    cv_method='fixed', noise_ceiling=noise_ceil,
                    variances=variances, dof=dof, n_rdm=data.n_rdm,
                    n_pattern=None)
    result.n_pattern = data.n_cond
    return result

Instead, we could use the method argument to check whether the scores are correlations and, if so, z-transform them. Let's also add a 'zscore' input in case we want to avoid this for other downstream functions. This results in the following:


def eval_fixed(models, data, theta=None, method='cosine', zscore=True):
    """evaluates models on data, without any bootstrapping or
    cross-validation

    Args:
        models(list of rsatoolbox.model.Model or list): models to be evaluated
        data(rsatoolbox.rdm.RDMs): data to evaluate on
        theta(numpy.ndarray): parameter vector for the models
        method(string): comparison method to use
        zscore(bool): if True and method is a correlation ('corr', 'rho-a'
            or 'tau-a'), return Fisher z-transformed evaluations

    Returns:
        float: evaluation

    """
    models, evaluations, theta, _ = input_check_model(models, theta, None, 1)
    evaluations = np.repeat(np.expand_dims(evaluations, -1),
                            data.n_rdm, -1)
    for k, model in enumerate(models):
        rdm_pred = model.predict_rdm(theta=theta[k])
        # if our comparison is a correlation, return the values after a Fisher
        # z-transform, so that they are approximately normally distributed,
        # which is useful for downstream analyses such as t-tests
        if method in ['rho-a', 'corr', 'tau-a'] and zscore:
            evaluations[k] = np.sqrt((data.n_rdm - 3) / 1.06) \
                * np.arctanh(compare(rdm_pred, data, method))
        else:
            evaluations[k] = compare(rdm_pred, data, method)
    evaluations = evaluations.reshape((1, len(models), data.n_rdm))
    noise_ceil = boot_noise_ceiling(
        data, method=method, rdm_descriptor='index')
    if data.n_rdm > 1:
        variances = np.cov(evaluations[0], ddof=0) \
            / evaluations.shape[-1]
        dof = evaluations.shape[-1] - 1
    else:
        variances = None
        dof = 0

    result = Result(models, evaluations, method=method,
                    cv_method='fixed', noise_ceiling=noise_ceil,
                    variances=variances, dof=dof, n_rdm=data.n_rdm,
                    n_pattern=None)
    result.n_pattern = data.n_cond
    return result
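As a hypothetical usage sketch of the modified function (the zscore flag is the proposed new argument, not part of the released API; models and group_rdms again stand in for the user's models and subject RDMs):

result_z = eval_fixed(models, group_rdms, method='corr')                   # Fisher z-scored (default)
result_raw = eval_fixed(models, group_rdms, method='corr', zscore=False)   # raw correlation values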

Please note I am not an expert statistician, so this suggestion comes with the caveat that I don't know how the distributions of the other methods (cosine, cosine_cov, corr_cov) behave under a t-test, which is why only 'corr', 'tau-a' and 'rho-a' are included in the above code. It is also worth noting that this is a substantial change: all downstream functions using correlation values would then use z-scored values, whether they run a t-test or not. However, it avoids changing all of those downstream functions, and the added 'zscore' input gives the user control over whether to use raw correlation values or not.

A further caveat is that I have only implemented this for the eval_fixed function. It may also be useful for other functions in rsatoolbox.inference.evaluate, but honestly I'm not quite sure, so I don't want to tamper with them!

I hope this is somewhat useful. For a few quick online sources, see:

https://en.wikipedia.org/wiki/Fisher_transformation
https://www.newbi4fmri.com/tutorial-9-mvpa-rsa
https://dartbrains.org/content/RSA.html

Best wishes,
Alex

Dear @alexeperon

first of all: Thanks for your comments!

This is a valid concern: the t-test does indeed assume a normal distribution, and this is of course not really true for correlation values (nor for any of the other comparison measures, for that matter). Whether we really should apply this transformation is a different question, though. The proofs that this transform improves the match to the normal distribution generally assume that the underlying data come from a normal distribution (or satisfy similar distributional assumptions), which is also not true in our case.

I do not think this would be the most important correction to add for three reasons:

  1. For fMRI and similar techniques we are typically in a range where the transformation doesn't do much (-.5 to .5, for example; see the quick check after this list). If you are actually running experiments that yield very high correlations, it would be more important.
  2. We did run quite extensive simulations to test our software and found quite accurate false positive rates for the tests we tried without the correction.
  3. The bootstrap methods and similar approximations are often much more approximate than the skewness deviations that the Fisher transformation tries to fix.
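A quick numeric check of point 1 (exact arctanh values): in the -.5 to .5 range the transform barely changes the values, while it grows quickly for correlations close to 1.

import numpy as np

r = np.array([-0.5, -0.25, 0.25, 0.5, 0.8, 0.95])
print(np.round(np.arctanh(r), 3))
# [-0.549 -0.255  0.255  0.549  1.099  1.832]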

Thus, I am happy for people to use the tests without the transformation and believe this is not an error per se.

Implementation

That being said, I don't see any immediate reason why applying the transformation should hurt, and it may well help. Thus, adding this as a possibility seems like a good idea.

Rather than implementing this in the eval functions, I would tend to just write a transform-results function that takes a results object, applies the transform to all the model evaluations, and returns a new results object. For consistency, one should then display the transformed correlations in plots, too, I think.
This would avoid adding something to every evaluation function and keeps it clearly optional, as I think it should be.

Hi Heiko,

Thank you for the speedy and thorough response! This sounds good - and it's reassuring to know that false positive rates are about right with or without correction.

A separate transform results function seems like a good way to handle things.

Thanks again and have a good evening,
Alex

@alexeperon are you happy for me to use the code above in a PR to add this?

Hi Jasper,

I'd add the following as a separate function, following discussions with Heiko above. I've checked it and it seems to work well.


import numpy as np

# Result is assumed to be importable from rsatoolbox.inference; adjust the import
# if this function lives inside rsatoolbox.inference.evaluate, where Result and
# numpy are already available
from rsatoolbox.inference import Result


def fisher_transform_results(result_corr):
    """Fisher z-transform the evaluations and noise ceiling of a Results object.

    This is conventionally used to approximate a normal distribution before
    running t-tests on correlation-based model evaluations.

    Args:
        result_corr(Result): Results object containing correlation-based
            evaluations and a noise ceiling.

    Returns:
        Result: new Results object with Fisher z-transformed evaluations
            and noise ceiling.

    """
    # copy, so the original Results object is left untouched
    evals_to_transform = np.array(result_corr.evaluations)
    noise_ceil_to_transform = np.array(result_corr.noise_ceiling)

    # values of exactly -1 or 1 would map to +/-np.inf under arctanh;
    # replace them with +/-0.9999
    evals_to_transform[evals_to_transform == -1] = -0.9999
    evals_to_transform[evals_to_transform == 1] = 0.9999
    noise_ceil_to_transform[noise_ceil_to_transform == -1] = -0.9999
    noise_ceil_to_transform[noise_ceil_to_transform == 1] = 0.9999

    # Fisher transform using arctanh
    evaluations_ztrans = np.arctanh(evals_to_transform)
    noise_ceil_ztrans = np.arctanh(noise_ceil_to_transform)

    # the transform changes the spread of the evaluations, so recompute
    # the variances and degrees of freedom from the transformed values
    if result_corr.n_rdm > 1:
        variances = np.cov(evaluations_ztrans[0], ddof=0) \
            / evaluations_ztrans.shape[-1]
        dof = evaluations_ztrans.shape[-1] - 1
    else:
        variances = None
        dof = 0

    result_ztrans = Result(result_corr.models, evaluations_ztrans,
                           method=result_corr.method,
                           cv_method=result_corr.cv_method,
                           noise_ceiling=noise_ceil_ztrans,
                           variances=variances, dof=dof,
                           n_rdm=result_corr.n_rdm,
                           n_pattern=result_corr.n_pattern)

    # record the transformation on the new Results object
    result_ztrans.transform = "Fisher z-transformed"

    return result_ztrans
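A hypothetical usage sketch of the function above (the eval_fixed call, models and group_rdms are only illustrative):

from scipy import stats

result = eval_fixed(models, group_rdms, method='corr')
result_z = fisher_transform_results(result)

# one-sample t-test of the transformed evaluations against 0 for the first model
t, p = stats.ttest_1samp(result_z.evaluations[0, 0], 0.0)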