StatsBase.mad computes the wrong median absolute deviation
astrobc1 opened this issue · comments
By default, StatsBase.mad computes a "normalized" median absolute deviation, where the raw value is multiplied by 1 / quantile(Normal(), 3/4) ≈ 1.4826. The default behavior of this function should be the plain MAD, not an estimator of the standard deviation. The current default is unexpected and, moreover, not applicable to most distributions anyway.
Also got caught by this!
Median absolute deviation is a well-defined statistic that doesn't include the normalization factor. Other software, such as SciPy, MATLAB, and Mathematica, follows the definition (of course) and computes median(abs.(x .- median(x))). It would be totally sensible for Julia to follow the definition as well.
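To make the difference concrete, here is a minimal sketch that uses only the Statistics standard library. The names raw_mad, normalized_mad, and MAD_NORMAL_SCALE are hypothetical helpers for illustration, not part of StatsBase, and the constant is an approximation of 1 / quantile(Normal(), 3/4).

```julia
using Statistics  # provides `median`

# Approximate value of 1 / quantile(Normal(), 3/4); hard-coded here so the
# sketch does not depend on Distributions.jl (an assumption for illustration).
const MAD_NORMAL_SCALE = 1.4826

# Plain median absolute deviation, following the textbook definition.
raw_mad(x) = median(abs.(x .- median(x)))

# Rescaled variant, which estimates the standard deviation
# only if the data are normally distributed.
normalized_mad(x) = MAD_NORMAL_SCALE * raw_mad(x)

x = [1, 2, 3, 4, 100]

println(raw_mad(x))         # deviations from the median 3 are 2, 1, 0, 1, 97
println(normalized_mad(x))  # the same value scaled by about 1.4826
```

For this sample the two results differ by the constant factor alone, which is why the choice of default matters for anyone expecting the unscaled statistic.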
Such behavior can only be changed in a breaking version, because it's documented. StatsBase does have breaking versions from time to time; can this be added to a list for the next potential breaking version? This amounts to changing the default value from normalize=true to normalize=false.
I think the reason for this behavior in StatsBase is mostly historical. See earlier issues on this topic:
#97
#347