matheusfacure / python-causality-handbook

Causal Inference for the Brave and True. A light-hearted yet rigorous approach to learning about impact estimation and causality.

Home Page: https://matheusfacure.github.io/python-causality-handbook/landing-page.html


Issue on page /22-Debiased-Orthogonal-Machine-Learning.html

SebKrantz opened this issue

I fail to understand why, in the section "Non-Scientific Double/Debiased ML", it is necessary to save the first-stage models and predict with them. When adding counterfactual treatments, we are not changing any part of the covariates X, which are the sole input to the first-stage models. Thus the first-stage predictions are the same with or without the counterfactual treatments, and we don't need to keep those models.
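A quick sketch of this point (the data frame, column names, and covariates here are made up, not the chapter's): since the debiasing/denoising models take only X, their out-of-fold predictions can be computed once and reused for any counterfactual treatment grid.

```python
# Hypothetical data; only the structure matters for the argument.
import numpy as np
import pandas as pd
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import cross_val_predict

rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({"cost": rng.normal(size=n),
                   "weekday": rng.integers(0, 7, size=n),
                   "temp": rng.normal(25, 5, size=n)})
df["price"] = 5 + df["cost"] + rng.normal(size=n)
df["sales"] = 100 - 3 * df["price"] + 2 * df["temp"] + rng.normal(size=n)

X_cols = ["cost", "weekday", "temp"]

# First-stage predictions depend only on X ...
t_hat = cross_val_predict(GradientBoostingRegressor(), df[X_cols], df["price"], cv=5)
y_hat = cross_val_predict(GradientBoostingRegressor(), df[X_cols], df["sales"], cv=5)

# ... so for any counterfactual price level, the treatment residual is just
# (counterfactual price - t_hat); the fitted first-stage models are never needed again.
price_grid = np.linspace(df["price"].min(), df["price"].max(), 5)
t_res_cf = {p: p - t_hat for p in price_grid}
```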

In addition, I don't quite understand the value of the train/test split and the ensamble_pred() function here. If my goal is to get counterfactual predictions for all of my data (which is typically the case), I would just use cross_val_predict() to get the first-stage residuals on the entire data (as in the section on DML), then fit cross-validated final models with cv_estimate(), additionally saving the indices of each fold, and finally write a predict method that uses the final-stage models and the saved indices to produce proper cross-validated final predictions for different price levels (after subtracting from each price level its first-stage prediction, which remains the same).
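A minimal sketch of that alternative, reusing t_hat, y_hat, and price_grid from the snippet above. The cv_estimate() here is my own simplified version that also returns each fold's held-out indices (the chapter's function may differ), and predict_counterfactual() is a hypothetical helper, not the book's API.

```python
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import KFold

# residuals on the entire data, from the out-of-fold first-stage predictions
t_res = df["price"] - t_hat
y_res = df["sales"] - y_hat

def cv_estimate(X, y, n_splits=5):
    """Fit one final-stage model per fold and keep each fold's test indices."""
    models, fold_indices = [], []
    for train_idx, test_idx in KFold(n_splits).split(X):
        m = GradientBoostingRegressor().fit(X.iloc[train_idx], y.iloc[train_idx])
        models.append(m)
        fold_indices.append(test_idx)
    return models, fold_indices

# Final stage: predict the outcome residual from the treatment residual and X.
final_X = df[X_cols].assign(t_res=t_res)
models, fold_indices = cv_estimate(final_X, y_res)

def predict_counterfactual(price):
    """Cross-validated final-stage predictions when everyone gets `price`.
    The first-stage fit is unchanged, so only the treatment residual moves."""
    X_cf = df[X_cols].assign(t_res=price - t_hat)
    pred = np.empty(len(df))
    for m, idx in zip(models, fold_indices):
        pred[idx] = m.predict(X_cf.iloc[idx])  # each model scores only its held-out fold
    return pred

# e.g. counterfactual predictions across the price grid
cf_preds = {p: predict_counterfactual(p) for p in price_grid}
```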