romainkp / stremr

Streamlined Estimation for Static, Dynamic and Stochastic Treatment Regimes in Longitudinal Data

Add option for observation weights

ck37 opened this issue

Hello,

I mentioned this in person, but thought I'd start a quick feature request issue to support observation weights. I believe Mark is on board with simply passing the weights through to the Q & g estimation as well as the fluctuation. For the variance of the IC, he suggested that the weights be normalized to sum to 1 (possibly applied to the Q & g estimation as well; I'm not sure of the specifics).
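To make the normalization suggestion concrete, here is a minimal Python sketch (not stremr code, and only one possible reading of the suggestion). The function names and the example values are hypothetical. It rescales the weights to sum to 1 and computes a weighted variance of influence-curve (IC) values; when the weights are replication counts, this matches the plain variance over the expanded, unaggregated data.

```python
def normalize_weights(weights):
    """Rescale non-negative weights so that they sum to 1."""
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return [w / total for w in weights]


def weighted_ic_variance(ic_values, weights):
    """Weighted variance of IC values under normalized weights."""
    w = normalize_weights(weights)
    mean = sum(wi * ic for wi, ic in zip(w, ic_values))
    return sum(wi * (ic - mean) ** 2 for wi, ic in zip(w, ic_values))


# Hypothetical IC values for three distinct covariate patterns, and the
# number of replicated observations behind each pattern (the weights).
ic_values = [0.5, -1.0, 2.0]
counts = [3, 5, 2]
var_ic = weighted_ic_variance(ic_values, counts)
```

How this variance is turned into a standard error (for example, which effective sample size to divide by) is left open here, since the thread does not settle it.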

Personally I could use this for a study where we have 30 million observations but only a few categorical covariates, so we can aggregate the replicated observations and incorporate observation weights to drastically speed up the superlearning & reduce memory usage.
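As a rough illustration of the aggregation idea, the following Python sketch collapses replicated observations into unique rows with a count that serves as the observation weight. The toy data and column layout are made up for the example; it only shows that a weighted summary on the aggregated rows matches the unweighted summary on the raw rows.

```python
from collections import Counter

# Hypothetical raw data: one (covariate, treatment, outcome) tuple per observation.
rows = [
    ("a", 1, 1), ("a", 1, 1), ("a", 0, 0),
    ("b", 1, 0), ("b", 1, 0), ("b", 0, 1), ("a", 1, 1),
]

# Collapse identical rows; the count becomes the observation weight.
counts = Counter(rows)
aggregated = [(w, a, y, n) for (w, a, y), n in counts.items()]


def weighted_mean(values, weights):
    """Mean of values where each value is repeated according to its weight."""
    return sum(v * wt for v, wt in zip(values, weights)) / sum(weights)


# The mean outcome is the same whether computed on raw or aggregated data.
raw_mean = sum(y for _, _, y in rows) / len(rows)
agg_mean = weighted_mean(
    [y for _, _, y, _ in aggregated],
    [n for _, _, _, n in aggregated],
)
```

With only a few categorical covariates, the number of unique rows stays small no matter how many raw observations there are, which is where the speed and memory savings would come from.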

Susan's tmle package doesn't support observation weights but ltmle does, so I'm using that right now. Mark suggested that observation weights would generally be important to support because they can be used to solve many different problems.

Thanks,
Chris

Hi Chris,

Thanks for opening this issue. I agree that weights are very important and can be used to address many problems (yours is definitely new to me).

I was a bit surprised that you would want to use weights for estimation of g, but I think I see that now. Indeed, you want the MSE of your Super Learner to be representative of your target (re-weighted) population, not the observed sample, so the weights have to apply to every estimation step of TMLE: Q, g and the epsilon update.
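A small Python sketch of that point, with a hypothetical `weighted_mse` helper and made-up numbers: the same predictions can look fine on the observed sample and worse on the re-weighted population if the poorly predicted observation carries a large weight.

```python
def weighted_mse(y_true, y_pred, weights):
    """Mean squared error where each observation counts in proportion to its weight."""
    total = sum(weights)
    return sum(
        w * (y - p) ** 2 for y, p, w in zip(y_true, y_pred, weights)
    ) / total


# Hypothetical outcomes, predictions and observation weights.
y_true = [1, 0, 1]
y_pred = [0.8, 0.3, 0.4]
weights = [1, 1, 4]  # the third observation stands in for four units

# With unit weights this reduces to the ordinary (sample) MSE.
sample_mse = weighted_mse(y_true, y_pred, [1, 1, 1])
# With the observation weights, the badly predicted third point dominates.
target_mse = weighted_mse(y_true, y_pred, weights)
```

A learner or ensemble selector that minimizes `sample_mse` rather than `target_mse` would be tuned to the wrong population, which is the reason to pass weights to every fitting step.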

I think it should be fairly easy to implement this, as long as the underlying learners support the weights argument. I'll try to implement this as part of the move to sl3 learners, so I'm keeping this open for now.

Hi Oleg,

I was also wondering if we can get the weights for each observation. It would be helpful to be able to look at the weights, and I would like to summarize them.

Thanks,
Soudeh