Issue on page /22-Debiased-Orthogonal-Machine-Learning.html
SebKrantz opened this issue
I fail to understand why, in the section "Non-Scientific Double/Debiased ML", it is necessary to save the first-stage models and predict with them. Adding counterfactual treatments does not change any part of the covariates X, which are the sole input to the first-stage models. The first-stage predictions are therefore the same with or without counterfactual treatments, so we don't need those models.
In addition, I don't quite understand the value of the train/test split and the `ensamble_pred()` function here. If my goal is to get counterfactual predictions for all my data (which is typically the case), I would just use `cross_val_predict()` to get the first-stage residuals (as in the section on DML) on the entire data, then fit cross-validated final models using `cv_estimate()`, additionally saving the indices for each fold. I would then create a predict method that uses the final-stage models and indices to produce proper cross-validated final predictions for different price levels (with the first-stage prediction, which remains the same, subtracted from each price level).
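The proposal above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the book's code: the data is synthetic, `LinearRegression` stands in for whatever first-stage and final-stage learners the chapter uses, the final model takes the covariates plus the treatment residual as features, and `predict_counterfactual` is a hypothetical helper name. The per-fold loop plays the role of `cv_estimate()` without reproducing its actual signature.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold, cross_val_predict

# Synthetic data (assumption): T is the treatment (e.g. price), Y the outcome.
rng = np.random.default_rng(0)
n = 300
X = rng.normal(size=(n, 3))
T = X[:, 0] + rng.normal(size=n)
Y = 2.0 * X[:, 1] - 1.5 * T + rng.normal(size=n)

# Fix the folds once so that both stages share the same indices.
folds = list(KFold(n_splits=5, shuffle=True, random_state=0).split(X))

# First stage: out-of-fold predictions use X only, so they do not
# change when counterfactual treatments are plugged in later.
t_hat = cross_val_predict(LinearRegression(), X, T, cv=folds)
y_hat = cross_val_predict(LinearRegression(), X, Y, cv=folds)
t_res, y_res = T - t_hat, Y - y_hat

# Final stage: one model per fold, each trained on the other folds.
final_models = []
for train_idx, _ in folds:
    feats = np.column_stack([X[train_idx], t_res[train_idx]])
    final_models.append(LinearRegression().fit(feats, y_res[train_idx]))

def predict_counterfactual(t_new):
    """Out-of-fold final-stage prediction at treatment level(s) t_new.

    Hypothetical helper: every row is scored by the final model that
    did not see it in training, reusing the stored first-stage t_hat.
    """
    t_new = np.broadcast_to(np.asarray(t_new, dtype=float), (n,))
    out = np.empty(n)
    for model, (_, test_idx) in zip(final_models, folds):
        feats = np.column_stack([X[test_idx], t_new[test_idx] - t_hat[test_idx]])
        out[test_idx] = model.predict(feats)
    return out

# Predictions (on the residualized outcome scale) at two price levels.
pred_low = predict_counterfactual(0.0)
pred_high = predict_counterfactual(1.0)
```

In this sketch no first-stage model object is stored, only the out-of-fold arrays `t_hat` and `y_hat` and the fold indices. Whether that is enough depends on never needing to score rows outside the original data, which would still require fitted first-stage models.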