JuliaStats / GLM.jl

Generalized linear models in Julia

methods for `StatsBase.leverage`

palday opened this issue

Now that we have Cook's Distance, we can define this rather straightforwardly.

I've been looking for this functionality and couldn't find it, so I wrote a function for it (derived from https://en.wikipedia.org/wiki/Leverage_(statistics)). I tested it against R's `hatvalues` and it gives similar results.

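    using LinearAlgebra               # provides diag and inv
    using StatsBase: RegressionModel

    # Leverage values: the diagonal of the hat matrix H = X (X'X)^-1 X'
    function leverage(model::RegressionModel)
        X = model.mm.m                # model matrix, read from the model's fields
        return diag(X * inv(X' * X) * X')
    end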

Does anyone see an issue with what I have so far?

I think @ararslan had an implementation somewhere -- oh yeah as part of his work on Cook's Distance for GLM: #510

@HiramTheHero your proposal is mathematically correct but generally computing an explicit matrix inverse is less than ideal. The inverse is slow to compute and the result is often very sensitive to numerical details.

(Almost nothing in statistics except the sample mean is computed in practice with the textbook formulas. 🙁 )
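For anyone landing here, a minimal sketch of one way to get the same values without forming an explicit inverse: with a thin QR factorization X = QR, the hat matrix is QQ', so each leverage is the squared norm of the corresponding row of Q. This is not necessarily what #510 does. `leverage_qr` is a hypothetical name, the design matrix is made up, and the sketch assumes X has full column rank.

```julia
using LinearAlgebra

# Hypothetical helper (not part of GLM.jl or StatsBase): leverage values from
# a thin QR factorization of the model matrix X, assuming full column rank.
function leverage_qr(X::AbstractMatrix{<:Real})
    F = qr(X)
    Q = Matrix(F.Q)                    # thin Q factor, size n × p
    return vec(sum(abs2, Q; dims=2))   # h_i = squared norm of row i of Q
end

# Made-up design matrix: an intercept column plus one predictor.
X = [ones(5) [1.0, 2.0, 3.0, 4.0, 10.0]]

h_qr  = leverage_qr(X)
h_inv = diag(X * inv(X' * X) * X')     # textbook formula, for comparison only

println(h_qr ≈ h_inv)                  # agreement up to floating-point rounding
println(sum(h_qr) ≈ size(X, 2))        # leverages sum to the number of columns
```

On a small, well-conditioned matrix like this one the two approaches should agree to rounding error. The difference shows up with larger or nearly collinear designs, where forming `X' * X` squares the condition number.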