huggingface / evaluate

🤗 Evaluate: A library for easily evaluating machine learning models and datasets.

Home Page: https://huggingface.co/docs/evaluate

Mahalanobis distance computes X_minus_mu incorrectly

j0ma opened this issue

It seems that on Line 91 in mahalanobis.py the quantity X_minus_mu is computed incorrectly.

Based on Wikipedia, both x and mu should be vectors:

[image: Wikipedia's definition of the Mahalanobis distance, d_M(x) = sqrt((x - mu)^T S^{-1} (x - mu))]

However, using np.mean(...) without specifying the axis returns a scalar:

import numpy as np 
reference_distribution = [[1,2], [3,4]]
print(np.mean(reference_distribution))

> 2.5

Instead, if the array has shape (N, D), then we can take an average over the first dimension with np.mean(..., axis=0):

import numpy as np 
reference_distribution = [[1,2], [3,4]]
print(np.mean(reference_distribution, axis=0))

> [2. 3.]

This means that the same scalar is being subtracted from all components of X to create X_minus_mu, which isn't what we want, right?
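To make the difference concrete, here is a minimal sketch of the computation with a per-feature mean. It is not the code from mahalanobis.py: the function name mahalanobis_squared is hypothetical, and returning the squared distance and using np.linalg.pinv for the covariance are assumptions made only for illustration.

```python
import numpy as np


def mahalanobis_squared(X, reference_distribution):
    """Squared Mahalanobis distance of each row of X to the reference set.

    Hypothetical helper for illustration only, not the library's implementation.
    X has shape (M, D); reference_distribution has shape (N, D).
    """
    X = np.asarray(X, dtype=float)
    ref = np.asarray(reference_distribution, dtype=float)

    # Per-feature mean, shape (D,). Without axis=0 this would be a single scalar.
    mu = np.mean(ref, axis=0)

    # Broadcasting subtracts mu[j] from column j of X.
    X_minus_mu = X - mu

    # Covariance of the reference features, shape (D, D).
    cov = np.cov(ref.T)

    # Assumption: the pseudo-inverse is used so a singular covariance does not raise.
    inv_cov = np.linalg.pinv(cov)

    # Row-wise quadratic form (x - mu)^T S^{-1} (x - mu).
    return np.einsum("ij,jk,ik->i", X_minus_mu, inv_cov, X_minus_mu)


reference = [[0, 0], [2, 0], [0, 2], [2, 2]]
points = [[1, 1], [3, 1]]
print(mahalanobis_squared(points, reference))
```

With the scalar mean, a point located exactly at the centre of the reference data would generally not get a distance of zero unless all feature means happen to coincide, which is a quick way to check which variant is in use.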

Please correct me if my understanding is wrong. Thanks!