weighted_mean() gives error if x is np.nan (but not if weights sum to 0?!)

Question

ftobin opened this issue 2 years ago

The following should return nan, but it raises instead:

datar.all.weighted_mean(np.nan, 5)
ZeroDivisionError: Weights sum to zero, can't be normalized

Compare with NumPy:

np.average(np.nan, weights=5)
nan

The error raised by weighted_mean() is the one you would get from calling

np.average(3, weights=0)
ZeroDivisionError: Weights sum to zero, can't be normalized

Surprisingly, the equivalent datar call does not raise:

datar.all.weighted_mean(3, 0)
np.nan  # error expected!

I can't for the life of me figure out why the code seems to be switching arguments when called with missing values, causing this error message to be "backwards".

pwwang · Answer 1 · Tue Aug 30 2022 12:27:31 GMT+0800 (China Standard Time)

The arguments were not switched.

It should definitely give NA in the first case. The error appears because x and w are first aligned into a frame with two columns, x and w. Since na_rm=True by default, the row holding the NaN is dropped and the frame ends up empty (no rows). So what actually ran was np.average([], weights=[]), whose weights sum to zero.
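
To make the explanation above concrete, here is a minimal sketch of the align-then-drop behaviour using only NumPy. The function name weighted_mean_sketch is hypothetical and this is not datar's actual code; it only assumes that missing values are filtered out of both x and w before np.average is called.

```python
import numpy as np

def weighted_mean_sketch(x, w, na_rm=True):
    # Hypothetical stand-in for the behaviour described above, not datar's code.
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = np.atleast_1d(np.asarray(w, dtype=float))
    # "Align" x and w so they have the same length, like two columns of a frame.
    x, w = np.broadcast_arrays(x, w)
    if na_rm:
        # Drop every row whose x is missing; the matching weight goes with it.
        keep = ~np.isnan(x)
        x, w = x[keep], w[keep]
    return np.average(x, weights=w)

try:
    # The only row is NaN, so after filtering this is np.average([], weights=[]).
    weighted_mean_sketch(np.nan, 5)
except ZeroDivisionError as exc:
    print("raised:", exc)
```

The weights of an empty array sum to zero, which is why the message talks about weights even though the original call passed a weight of 5.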

In the second case, weighted_mean(3, 0), I would expect NA rather than an error, because with R:

r$> weighted.mean(3, 0)
[1] NaN
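
For comparison, one possible way to get the R-like result in Python is to return nan whenever the remaining weights sum to zero. This is only a sketch under that assumption: the name weighted_mean_r_like is hypothetical, and it is not necessarily the fix adopted in datar.

```python
import numpy as np

def weighted_mean_r_like(x, w, na_rm=True):
    # Hypothetical helper: returns nan instead of raising when weights sum to zero.
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = np.atleast_1d(np.asarray(w, dtype=float))
    x, w = np.broadcast_arrays(x, w)
    if na_rm:
        # Drop rows with a missing x, together with their weights.
        keep = ~np.isnan(x)
        x, w = x[keep], w[keep]
    total = w.sum()
    if total == 0:
        # Covers both zero weights and the "no rows left" case.
        return float("nan")
    return float((x * w).sum() / total)

print(weighted_mean_r_like(np.nan, 5))  # no rows left after dropping the NaN
print(weighted_mean_r_like(3, 0))       # weights sum to zero
```

With this guard, both calls from the issue give nan rather than a ZeroDivisionError.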