T-Learner ATE, SE calculations
ras44 opened this issue
Describe the bug
Less of a bug than a question:
The T-learner takes the mean of the treatment effect te, which is calculated over all subjects (that is, the mean over all rows of the difference between each treatment group's model prediction and the control model's prediction):
causalml/causalml/inference/meta/tlearner.py, lines 242 to 243 in a031566
However, the standard error of the ATE is calculated on a filtered subset that includes only the subjects in a particular treatment group and those in the control group:
causalml/causalml/inference/meta/tlearner.py, lines 254 to 261 in a031566
It seems like the subjects in the ATE calculation should match the subjects in the SE calculation. If all subjects are meant to be included, the SE could simply be the SE of the te measurements over all subjects.
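As a concrete version of the all-subjects option, here is a minimal NumPy sketch. The array te and its synthetic values are assumptions for illustration, not causalml output. The standard error of the mean is taken as the sample standard deviation divided by sqrt(n), which treats the per-row effects as independent draws and ignores any uncertainty in the fitted models.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical per-subject treatment effects for one treatment group:
# treatment-model prediction minus control-model prediction on every row.
te = rng.normal(loc=0.5, scale=1.0, size=1000)

# ATE over all rows.
ate_all = te.mean()

# Standard error of the mean over the same rows used for the ATE.
# ddof=1 gives the sample (unbiased) variance; NumPy defaults to ddof=0.
se_all = te.std(ddof=1) / np.sqrt(te.shape[0])

print(ate_all, se_all)
```

Here the ATE and its SE are computed from exactly the same set of rows, which is the consistency the question is asking about.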
If instead the ATE calculation is meant to be group-specific, so that not all subjects are included, then it seems like we should have:
_ate = (yhat_t - yhat_c).mean()
with the SE again simply being the SE of that series:
se = np.sqrt((yhat_t - yhat_c).var())
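The group-specific option can be sketched the same way. Everything below (the label names, the synthetic predictions, and the variable names ending in _all) is hypothetical and only meant to show the ATE and the SE being computed on the same filtered rows. One caveat: the square root of the variance is the standard deviation of the series, so a standard error of the mean would additionally divide by sqrt(n), as done here.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 1200

# Hypothetical assignment labels: one control arm and two treatment arms.
treatment = rng.choice(["control", "treatment_a", "treatment_b"], size=n)

# Hypothetical control-model and treatment-model predictions on every row.
yhat_c_all = rng.normal(1.0, 0.5, size=n)
yhat_t_all = yhat_c_all + rng.normal(0.3, 0.2, size=n)

# Keep only subjects in this treatment group or in the control group.
mask = (treatment == "treatment_a") | (treatment == "control")
yhat_t = yhat_t_all[mask]
yhat_c = yhat_c_all[mask]

diff = yhat_t - yhat_c
ate_group = diff.mean()

# Spread of the per-subject effects (standard deviation, ddof=1).
sd_group = np.sqrt(diff.var(ddof=1))
# Standard error of the mean over the same filtered rows.
se_group = sd_group / np.sqrt(diff.shape[0])

print(ate_group, sd_group, se_group)
```

Whether the filtered or the unfiltered population is the right one depends on which estimand is intended; the sketch only shows that either choice can be applied to both quantities.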