pwwang / datar

A Grammar of Data Manipulation in python

Home Page: https://pwwang.github.io/datar/

weighted_mean() gives error if x is np.nan (but not if weights sum to 0?!)

ftobin opened this issue

The following should return nan

datar.all.weighted_mean(np.nan, 5)
ZeroDivisionError: Weights sum to zero, can't be normalized

Consider

np.average(np.nan, weights=5)
nan

The error raised by weighted_mean() is the one you would get by calling

np.average(3, weights=0)
ZeroDivisionError: Weights sum to zero, can't be normalized

Surprisingly:

datar.all.weighted_mean(3, 0)
np.nan  # error expected!

I can't for the life of me figure out why the code seems to be switching arguments when called with missing values, causing this error message to be "backwards".

The arguments were not switched.

It should definitely give NA in the first case. The error appears because x and w are first aligned into a frame with two columns, x and w. Since na_rm=True by default, the row containing the missing value is dropped, leaving the frame empty (no rows). So what actually ran was np.average([], weights=[]), where the weights sum to zero.
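To make that sequence concrete, here is a minimal sketch of the mechanism. It is not datar's actual code: weighted_mean_sketch is a hypothetical name, and a NumPy boolean mask stands in for the frame alignment and NA removal described above.

```python
import numpy as np


def weighted_mean_sketch(x, w, na_rm=True):
    """Hypothetical stand-in for the behaviour described above (not datar's code)."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = np.atleast_1d(np.asarray(w, dtype=float))
    # Align x and w to a common length, roughly like putting them in one frame.
    x, w = np.broadcast_arrays(x, w)
    if na_rm:
        # Drop every "row" whose x is missing; with x = nan nothing is left.
        keep = ~np.isnan(x)
        x, w = x[keep], w[keep]
    # With empty inputs the weights sum to 0, so np.average raises.
    return np.average(x, weights=w)


try:
    weighted_mean_sketch(np.nan, 5)
except ZeroDivisionError as exc:
    print("raised:", exc)
```

With na_rm=False the missing value is kept, so the same call returns nan instead of raising.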

In the second case, weighted_mean(3, 0), I would expect NA rather than an error, because with R:

r$> weighted.mean(3, 0)
[1] NaN
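
One way to get the R-like result is to compute sum(x * w) / sum(w) directly and let floating-point division produce NaN, rather than going through np.average. This is only a sketch under that assumption, not the fix that went into datar; weighted_mean_r_like is a hypothetical name and NA removal is left out.

```python
import numpy as np


def weighted_mean_r_like(x, w):
    """Hypothetical helper: sum(x * w) / sum(w) with IEEE division semantics."""
    x = np.atleast_1d(np.asarray(x, dtype=float))
    w = np.atleast_1d(np.asarray(w, dtype=float))
    x, w = np.broadcast_arrays(x, w)
    # Silence the divide-by-zero and invalid-value warnings so that 0 / 0
    # quietly yields nan instead of emitting a RuntimeWarning.
    with np.errstate(divide="ignore", invalid="ignore"):
        return float((x * w).sum() / w.sum())


print(weighted_mean_r_like(3, 0))
print(weighted_mean_r_like([], []))
```

Both calls give nan, so the zero-weight case and the empty case left after NA removal behave the same way.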