jknowles / merTools

Convenience functions for working with merMod objects from lme4

predictInterval ignores covariance between fixed and random effects - how to fix

jamesonquinn opened this issue · comments

predictInterval ignores the covariance between the fixed and random effects. In many cases that covariance is negligible and ignoring it is fine. But in other common cases the two are correlated in such a way that, in a correct predictInterval, their variances would almost entirely cancel. Ignoring the covariance can therefore make predictInterval extremely over-conservative, particularly for the fixed intercept in random-intercept models with relatively few groups.
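To see where the cancellation comes from, consider predicting a new observation in an existing group j of a random-intercept model; the identity below is just standard variance algebra for the sum of the estimated fixed intercept and that group's predicted random intercept:

```math
\operatorname{Var}(\hat\beta_0 + \hat u_j)
  = \operatorname{Var}(\hat\beta_0)
  + \operatorname{Var}(\hat u_j)
  + 2\,\operatorname{Cov}(\hat\beta_0, \hat u_j)
```

Simulating the fixed and random effects independently drops the covariance term; when that covariance is close to the negative of the fixed-intercept variance, the resulting interval is far too wide.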

Consider a simple random-intercept model with 4 groups and 100 subjects per group, where the true variance of the random intercepts is 4, the true within-group variance is 1, and the mean of the random intercepts is 0. If you fit a fixed-effects model to these data, the standard error of the overall intercept term will be about 0.05, which is very low. But if you fit a random-intercept model, the standard error of the overall intercept term will be about 1, and it should correlate at almost -1 with the random intercept for each group. Essentially, the model is saying that it doesn't know the true mean of the random intercepts because it only has 4 of them. From a prediction point of view, though, that standard error is irrelevant for a new data point in one of the existing groups: the data show quite well what the average for that group is.
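A minimal R sketch of this setup (the seed, object names, and the use of sum-to-zero contrasts so that the fixed-model intercept is the grand mean are my own choices for illustration, not part of the original report):

```r
library(lme4)

set.seed(42)
n_groups <- 4
n_per    <- 100
group    <- factor(rep(seq_len(n_groups), each = n_per))
u        <- rnorm(n_groups, mean = 0, sd = 2)           # random-intercept variance = 4
y        <- u[group] + rnorm(n_groups * n_per, sd = 1)  # within-group variance = 1

# Fixed-effects fit with sum-to-zero contrasts: the intercept is the grand mean,
# with standard error roughly 1 / sqrt(400) = 0.05.
fixed_fit <- lm(y ~ group, contrasts = list(group = "contr.sum"))

# Random-intercept fit: the intercept standard error is on the order of
# sqrt(tau^2 / 4), i.e. about 1 here, because there are only 4 groups.
mixed_fit <- lmer(y ~ 1 + (1 | group))

coef(summary(fixed_fit))["(Intercept)", ]
coef(summary(mixed_fit))["(Intercept)", ]
```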

There are three ways to potentially "fix" this, which I'll list as roughly "order-0", "order-1", and "order-n"; that is, going from simplest and roughest to most well-justified and complex.

Order-0 fix: simply allow the user to treat certain specific fixed effects as truly fixed. The call would look something like predictInterval(mod, newData, ignore.fixed.terms=c(1,4)), which would set rows and columns 1 and 4 of the vcov.tmp matrix to 0 before the mvtnorm::rmvnorm step. This is mathematically ugly and poorly justified, but in practice, in a situation like the example above, it would basically solve the problem (see the sketch after the three options).

Order-1 fix: try to calculate the correct variance to use for the intercept term alone, using quick-and-dirty variance formulas that rely heavily on assuming normality and good model convergence. The formula is: [equation not reproduced here].

Order-n fix: actually fix lmer and/or blmer so that they fit only n-1 parameters for n groups, and set the last parameter to ensure that the random effects have mean zero.
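For concreteness, a minimal sketch of the order-0 idea; vcov.tmp and ignore.fixed.terms are the names used in the proposal above, and betaSim / n.sims are illustrative stand-ins for predictInterval's internals, not its current API:

```r
# Zero out the rows and columns of the fixed-effect covariance matrix that
# correspond to terms the user asks to treat as truly fixed.
zero_fixed_terms <- function(vcov.tmp, ignore.fixed.terms) {
  vcov.tmp[ignore.fixed.terms, ] <- 0
  vcov.tmp[, ignore.fixed.terms] <- 0
  vcov.tmp
}

# The fixed-effect simulation step inside predictInterval would then look
# something like:
#   betaSim <- mvtnorm::rmvnorm(n.sims,
#                               mean  = lme4::fixef(mod),
#                               sigma = zero_fixed_terms(vcov.tmp, ignore.fixed.terms))
# so the ignored terms stay at their point estimates while everything else is
# simulated as before.
```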

I believe I could submit a patch with both order-0 and order-1 fixes using optional parameters.

Hi @jamesonquinn

We'd be open to a pull request implementing this. It was out of scope for our initial work on merTools because we were just seeking to replicate the functionality of bootMer but in a way that would work for extremely large models (~500k observations, ~100k groups).

Giving users the option to use more efficient intervals that are not overly conservative seems like a huge added value.

Hi @jamesonquinn,

Where does the equation you suggest come from? Is there a similar formula to address correlation between fixed and random slopes?

I would like to create a function/wrapper that provides confidence intervals for the mean response at the group level without simulation. Think of the longitudinal performance of students in classrooms: the idea is to provide confidence intervals around each time point in the class trajectories.

Since point estimates at the group level are linear combinations of fixed and random effects, their variance combines the covariance matrices of the fixed and random coefficients*. As you know, lme4 provides covariance matrices for the fixed and random effects separately, but what I am missing is the covariance between the fixed and random coefficients.
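For reference, a quick sketch of the two pieces lme4 does expose, using its built-in sleepstudy data as a stand-in for the classroom example; the cross-covariance between the fixed and random coefficients is the piece that is not available directly:

```r
library(lme4)

# Built-in sleepstudy data as a stand-in for the classroom example
fit <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# Covariance matrix of the fixed-effect estimates
V_beta <- as.matrix(vcov(fit))

# Conditional means and standard deviations of the random effects, per group
re <- as.data.frame(ranef(fit, condVar = TRUE))
head(re[, c("grpvar", "term", "grp", "condval", "condsd")])

# Missing piece: Cov(beta-hat, u-hat). Summing the two variance components
# above ignores this (often strongly negative) cross term in
# Var(x' beta-hat + z' u-hat).
```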

My line of thinking is that their correlation depends on the underlying distribution of the observations, but I could not find any literature about it, which is what makes your post so interesting.

Thanks

*Of course, whether the distribution of y-hat is a t distribution or even symmetric is another story...

Hi @jamesonquinn -- have you had a chance to make the small changes and add yourself to the contributor list? I'd love to get your code merged in ahead of a release at the end of the summer. Thanks!

Closed by #77