JuliaStats / StatsBase.jl

Basic statistics for Julia

StatsBase.mad computes the wrong median absolute deviation

astrobc1 opened this issue · comments

By default, StatsBase.mad computes a "normalized" median absolute deviation, where the result is scaled by 1 / quantile(Normal(), 3/4) ≈ 1.4826. The default behavior of this function should just be the MAD, not an estimator of the standard deviation. The current default is unexpected, and the scaling only makes sense for normally distributed data; it does not apply to most other distributions.

Also got caught by this!
Median absolute deviation is a well-defined statistic that doesn't include the normalization factor. Other software, such as SciPy, MATLAB, and Mathematica, follows the definition (of course) and computes the equivalent of median(abs.(x .- median(x))). It would be totally sensible for Julia to follow the definition as well.
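
To make the difference concrete, here is a minimal sketch that uses only the Statistics standard library. The names `raw_mad`, `scaled_mad`, and `NORMAL_CONSISTENCY` are hypothetical helpers for illustration, not part of StatsBase, and the constant is a rounded approximation of 1 / quantile(Normal(), 3/4).

```julia
using Statistics  # standard library; provides `median`

# Hypothetical helper: the textbook median absolute deviation.
raw_mad(x) = median(abs.(x .- median(x)))

# Rounded approximation of 1 / quantile(Normal(), 3/4) (assumed value).
const NORMAL_CONSISTENCY = 1.4826

# Hypothetical helper: MAD rescaled to estimate the standard deviation
# of normally distributed data.
scaled_mad(x) = NORMAL_CONSISTENCY * raw_mad(x)

x = [1.0, 2.0, 3.0, 4.0, 100.0]

# median(x) is 3.0, the absolute deviations are [2, 1, 0, 1, 97],
# and their median is 1.0.
println(raw_mad(x))     # 1.0
println(scaled_mad(x))  # 1.4826
```

The two values differ by a constant factor, so anyone expecting the plain definition gets a result that is about 48% too large.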

Such behavior can only be changed in a breaking version, because it's documented. StatsBase does have breaking versions from time to time. Can this be added to a list for the next potential breaking version? It amounts to changing the default value from normalize=true to normalize=false.
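
Until the default changes, passing the keyword explicitly avoids the surprise. A small sketch, assuming the StatsBase package is installed and that `mad` accepts the `normalize` keyword discussed above:

```julia
using StatsBase   # assumes the StatsBase package is installed
using Statistics  # standard library; provides `median`

x = [1.0, 2.0, 3.0, 4.0, 100.0]

# Plain median absolute deviation, matching the textbook definition.
raw = mad(x; normalize=false)

# Scaled variant, intended as a standard deviation estimate for normal data.
scaled = mad(x; normalize=true)

println(raw)
println(scaled)
```

Spelling out `normalize` on every call also keeps existing code stable if the default is flipped in a future breaking release.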

I think the reason for this behavior in StatsBase is mostly historical. See earlier issues on this topic:
#97
#347