statrs-dev / statrs

Statistical computation library for Rust

Home Page: https://docs.rs/statrs/latest/statrs/

Beta skewness limits incorrect

Bi-Modal opened this issue

Hello everyone,

I believe I've spotted a bug with the skewness for a beta distribution.

In distribution::beta the skewness function returns incorrect values when shape_a or shape_b are infinite. Currently this function gives -2.0 for infinite shape_a and 2.0 for infinite shape_b but these are not the correct limits for the skewness of a beta distribution.

The skewness of a beta distribution with shape parameters α, β is:
[ 2(β - α)/(α + β + 2) ] * sqrt( (α + β + 1) / (αβ) )

The limit of this as α tends to infinity is -2/sqrt(β), and the limit as β tends to infinity is 2/sqrt(α). I've attached a short proof of these.
beta_skewness_limits.pdf
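
A quick numerical check of the limiting behaviour, as a minimal sketch. `beta_skewness` is a hypothetical free function that evaluates the closed form above for finite parameters; it is not the statrs API. With α large the mass concentrates near 1, so the skewness comes out negative and approaches -2/sqrt(β); with β large it approaches 2/sqrt(α).

```rust
/// Closed-form skewness of Beta(a, b) for finite, positive a and b.
/// Hypothetical helper for illustration only, not part of statrs.
fn beta_skewness(a: f64, b: f64) -> f64 {
    2.0 * (b - a) / (a + b + 2.0) * ((a + b + 1.0) / (a * b)).sqrt()
}

fn main() {
    let fixed = 4.0;

    // Let shape_a grow while shape_b stays fixed.
    for a in [1e2, 1e4, 1e6, 1e8] {
        println!("a = {a:e}, b = {fixed}: skewness = {}", beta_skewness(a, fixed));
    }
    println!("-2/sqrt(b) = {}", -2.0 / fixed.sqrt());

    // Let shape_b grow while shape_a stays fixed.
    for b in [1e2, 1e4, 1e6, 1e8] {
        println!("a = {fixed}, b = {b:e}: skewness = {}", beta_skewness(fixed, b));
    }
    println!("2/sqrt(a) = {}", 2.0 / fixed.sqrt());
}
```

In both cases the limit depends on the parameter that stays finite, so no single constant such as ±2.0 can match it for every value of that parameter.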

If I have access I'll make a PR to fix this.

references:
https://en.wikipedia.org/wiki/Beta_distribution#Skewness
https://mathworld.wolfram.com/BetaDistribution.html

Thanks for the report. This is correct for that choice of limit. However, it's not clear that it is the obvious thing to do. The edge cases in this library are generally a bit wonky and have been discussed before in #102.
For a value at infinity to be well defined, lim X -> Y f(X) must equal f(lim X -> Y) = f(Y) for a reasonable notion of convergence X -> Y. The space of all Beta distributions has two parameters, but in this case the limit is an artifact of the parameterization, whereas the established notions of convergence for random variables are defined independently of it.
Barring any math errors:
As a distribution, the limit for alpha towards inf is the point mass at 1, and the limit for beta towards inf is the point mass at 0, independent of the other parameter.
Let's define f(X) = skewness(X) = E((X-mu)^3) / E((X-mu)^2)^(3/2)
One sequence of Beta distributed random variables that converges to the point mass at 1 is a sequence X[n] distributed as B(n,b), and another is a sequence X'[n] distributed as B(n,b+1). The two sequences have the same limit distribution but different limiting skewness. Thus the limit does not exist.
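
The two-sequence argument can be checked numerically. The following is a sketch under stated assumptions: `Moments` and `beta_moments` are hypothetical names introduced here, using the standard closed forms for the mean, variance and skewness of a Beta distribution with finite parameters. Both sequences have mean tending to 1 and variance tending to 0, yet their skewness settles at different values.

```rust
/// Mean, variance and skewness of Beta(a, b) for finite, positive a and b.
/// Hypothetical helper types for illustration only, not part of statrs.
struct Moments {
    mean: f64,
    variance: f64,
    skewness: f64,
}

fn beta_moments(a: f64, b: f64) -> Moments {
    let s = a + b;
    Moments {
        mean: a / s,
        variance: a * b / (s * s * (s + 1.0)),
        skewness: 2.0 * (b - a) / (s + 2.0) * ((s + 1.0) / (a * b)).sqrt(),
    }
}

fn main() {
    let b = 1.0;
    for n in [1e2, 1e4, 1e6, 1e8] {
        let x = beta_moments(n, b); // X[n]  ~ B(n, b)
        let y = beta_moments(n, b + 1.0); // X'[n] ~ B(n, b+1)
        println!(
            "n = {n:e}: mean {} / {}, variance {:e} / {:e}, skewness {} / {}",
            x.mean, y.mean, x.variance, y.variance, x.skewness, y.skewness
        );
    }
}
```

Since both sequences collapse onto the same point mass, any skewness value assigned to that degenerate distribution would have to agree with both, which it cannot.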
In light of these problems, I intended to remove all the special, degenerate cases, but haven't gotten around to it.

Edit: Corrected a wrong, confusing example.