py-why / EconML

ALICE (Automated Learning and Intelligence for Causation and Economics) is a Microsoft Research project aimed at applying Artificial Intelligence concepts to economic decision making. One of its goals is to build a toolkit that combines state-of-the-art machine learning techniques with econometrics in order to bring automation to complex causal inference problems. To date, the ALICE Python SDK (econml) implements orthogonal machine learning algorithms such as the double machine learning work of Chernozhukov et al. This toolkit is designed to measure the causal effect of some treatment variable(s) t on an outcome variable y, controlling for a set of features x.

Home Page: https://www.microsoft.com/en-us/research/project/alice/


CausalForestDML to get ate and ate confidence interval on training data

DailiZhang2010 opened this issue:

There are two ways to do it:

  1. Use the est.ate_ and est.ate_stderr_ attributes. Under the hood, these use a doubly robust ATE estimate computed on the training data.
  2. Call est.ate(X=X, T0=T0, T1=T1) and est.ate_interval(X=X, T0=T0, T1=T1).

My goal is to get the ATE and its confidence interval on the training data set.

The question is: which one is more reliable?

Thanks for the great package and awesome documentation.

I would recommend ate_ for your use case, since you should get tighter confidence intervals.

For more information, see #753 (comment).

This is expected. As you note, the ate_ attribute applies a double-robustness correction when computing the ATE on the training data. The ate() method instead computes the ATE for any population by averaging the estimated CATE values over its individuals, so the two will not give exactly the same result. If you need the ATE for a data set that was not used in training, only the ate() method can be used.