uber / causalml

Describe the bug
Less of a bug than a question:

The T-learner computes the ATE as the mean of the treatment effect te over all subjects, that is, the mean over all rows of the difference between each treatment group's model prediction and the control model's prediction:

causalml/causalml/inference/meta/tlearner.py

Lines 242 to 243 in a031566

    
    for i, group in enumerate(self.t_groups):
        _ate = te[:, i].mean()

However, the standard error of the ATE is calculated on a filtered subset that includes only the subjects in a particular treatment group and those in the control group:

causalml/causalml/inference/meta/tlearner.py

Lines 254 to 261 in a031566

    
    se = np.sqrt(
        (
            (y_filt[w == 0] - yhat_c[w == 0]).var() / (1 - prob_treatment)
            + (y_filt[w == 1] - yhat_t[w == 1]).var() / prob_treatment
            + (yhat_t - yhat_c).var()
        )
        / y_filt.shape[0]
    )
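
Written out, the snippet above appears to compute the following. This is only a transcription of the code, with n standing for y_filt.shape[0] and p for prob_treatment. Reading p as the share of treated subjects within the filtered subset is an assumption, since its definition is not shown in the quoted lines. Var is the sample variance as returned by NumPy's .var().

```latex
\[
\widehat{\mathrm{SE}}
= \sqrt{\frac{1}{n}\left[
\frac{\operatorname{Var}\!\left(y - \hat{y}_c \mid w = 0\right)}{1 - p}
+ \frac{\operatorname{Var}\!\left(y - \hat{y}_t \mid w = 1\right)}{p}
+ \operatorname{Var}\!\left(\hat{y}_t - \hat{y}_c\right)
\right]}
\]
```

Every term here is taken over the filtered rows, whereas the ATE in the first snippet averages over all rows.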

It seems the subjects in the ATE calculation should match the subjects in the SE calculation. If all subjects are meant to be included, the SE could simply be the standard error of the te values over all subjects.
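
To make the mismatch concrete, here is a minimal sketch on synthetic data. It does not call causalml: the arrays treatment, yhat_c and yhat_ts are hypothetical stand-ins for the treatment labels and the model predictions, and the mask is an assumption about how the filtering is done, since that step is not shown in the quoted lines. The sketch only counts how many rows enter each calculation.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000

# Hypothetical treatment labels: one control group and two treatment groups.
treatment = rng.choice(["control", "A", "B"], size=n)
t_groups = ["A", "B"]

# Stand-ins for model predictions on every row (not causalml output).
yhat_c = rng.normal(1.0, 0.5, size=n)
yhat_ts = {g: yhat_c + rng.normal(0.3, 0.2, size=n) for g in t_groups}

# Per-row treatment effect for each group, shape (n, len(t_groups)).
te = np.column_stack([yhat_ts[g] - yhat_c for g in t_groups])


def rows_used(group_index):
    """Return (rows in the ATE mean, rows in the filtered SE subset)."""
    group = t_groups[group_index]
    # The ATE is a mean over every row of the te column.
    n_ate = te[:, group_index].shape[0]
    # Assumed filter: keep this treatment group plus the control group.
    mask = (treatment == group) | (treatment == "control")
    n_se = int(mask.sum())
    return n_ate, n_se


for i, g in enumerate(t_groups):
    print(g, rows_used(i))
```

With more than one treatment group, the second count is smaller than the first, so the ATE and its SE describe different sets of subjects.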

If instead the ATE is meant to be group-specific, so that not all subjects are included, then it seems we should have:

_ate  = (yhat_t - yhat_c).mean()

And again the SE would simply be the standard error of that series:

se = np.sqrt((yhat_t - yhat_c).var() / yhat_t.shape[0])
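
The two consistent options can be sketched side by side. This is an illustration on synthetic data, not causalml code: mean_and_se is a hypothetical helper, the arrays are stand-ins, and the mask is an assumed filter. The standard error of a mean is taken here as the sample standard deviation divided by sqrt(n). Whether it is appropriate to drop the residual-variance terms of the current formula is a separate question that this sketch does not answer.

```python
import numpy as np


def mean_and_se(x):
    """Mean of x and the standard error of that mean (sd / sqrt(n))."""
    x = np.asarray(x, dtype=float)
    # ddof=1 gives the unbiased sample variance; NumPy's default is ddof=0.
    return x.mean(), x.std(ddof=1) / np.sqrt(x.shape[0])


rng = np.random.default_rng(1)
n = 2000
treatment = rng.choice(["control", "A", "B"], size=n)  # hypothetical labels
yhat_c = rng.normal(1.0, 0.5, size=n)                  # stand-in control predictions
yhat_t = yhat_c + rng.normal(0.3, 0.2, size=n)         # stand-in predictions for group A
te = yhat_t - yhat_c

# Option 1: every subject enters both the ATE and its SE.
ate_all, se_all = mean_and_se(te)

# Option 2: only group A plus the control group enter both.
mask = (treatment == "A") | (treatment == "control")
ate_grp, se_grp = mean_and_se(te[mask])

print("all rows:     ", te.shape[0], ate_all, se_all)
print("filtered rows:", int(mask.sum()), ate_grp, se_grp)
```

In either option the point estimate and its uncertainty are computed from the same rows, which is the consistency being asked about.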

T-Learner ATE, SE calculations