py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Home Page: https://www.pywhy.org/dowhy


Unexpected results for CATE methods when predicting on new data

AlxndrMlk opened this issue

Describe the bug
When using DoWhy 0.10.1, I am getting unexpected predictions on new data.

In particular, I tried to replicate the results from the "S-Learner: The Lone Ranger" section of this notebook, which was originally developed using DoWhy 0.8.

I used two ways to generate the predictions on the test set:

[1] Using _estimator_object:

estimate._estimator_object.effect(my_test_data)

[2] Using model.estimate_effect() method:

model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.metalearners.SLearner',
    fit_estimator=False,
    target_units=my_test_data,
).cate_estimates

In both cases, the results were constant across all rows.

To make sure that it was not an issue with the data, I generated predictions for the training data using both methods ([1] and [2]).

The results were constant again, which is inconsistent with the original predictions generated on the training data.
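
For reference, the estimator in the notebook is set up roughly like the sketch below (the exact base learner and method_params in the notebook may differ; GradientBoostingRegressor is just a stand-in):

from dowhy import CausalModel
from sklearn.ensemble import GradientBoostingRegressor

# Causal model over the training data: 'took_a_course' is the treatment,
# 'earnings' the outcome, 'python_proficiency' and 'age' the common causes.
model = CausalModel(
    data=earnings_interaction_train,
    treatment='took_a_course',
    outcome='earnings',
    common_causes=['python_proficiency', 'age'],
)

estimand = model.identify_effect(proceed_when_unidentifiable=True)

# Fit an EconML S-Learner through DoWhy.
estimate = model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.metalearners.SLearner',
    method_params={
        'init_params': {'overall_model': GradientBoostingRegressor()},
        'fit_params': {},
    },
)

estimate.cate_estimates  # per-row CATE estimates on the training data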

Steps to reproduce the behavior

  1. Install DoWhy 0.10.1
  2. Run the following cells from this notebook
  • imports
  • cells 47-58
  • run the following code (generates predictions on the test set):
estimate._estimator_object.effect(earnings_interaction_test.drop(['true_effect', 'took_a_course'], axis=1))

or

model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.metalearners.SLearner',
    fit_estimator=False,
    target_units=earnings_interaction_test.drop(['true_effect', 'took_a_course'], axis=1),
).cate_estimates
  3. Compare the original predictions on the training data:
estimate.cate_estimates

(Let's call this result ORIGINAL_PRED)

with the predictions on the training data generated using methods [1] and [2]:

estimate._estimator_object.effect(earnings_interaction_train.drop(['earnings', 'took_a_course'], axis=1))

(Let's call this result NEW_PRED_1)

model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.metalearners.SLearner',
    fit_estimator=False,
    target_units=earnings_interaction_train.drop(['earnings', 'took_a_course'], axis=1),
).cate_estimates

(Let's call this result NEW_PRED_2)
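
Putting the comparison together (np.ravel is only there to normalize any (n, 1) vs (n,) shape difference between the outputs):

import numpy as np

original_pred = np.ravel(estimate.cate_estimates)  # ORIGINAL_PRED

new_pred_1 = np.ravel(
    estimate._estimator_object.effect(
        earnings_interaction_train.drop(['earnings', 'took_a_course'], axis=1)
    )
)  # NEW_PRED_1

new_pred_2 = np.ravel(
    model.estimate_effect(
        identified_estimand=estimand,
        method_name='backdoor.econml.metalearners.SLearner',
        fit_estimator=False,
        target_units=earnings_interaction_train.drop(['earnings', 'took_a_course'], axis=1),
    ).cate_estimates
)  # NEW_PRED_2

print(np.allclose(original_pred, new_pred_1))  # expected True, observed False on 0.10.1
print(np.allclose(original_pred, new_pred_2))  # expected True, observed False on 0.10.1
print(np.unique(new_pred_1).size)              # observed 1, i.e. a constant prediction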

Expected behavior
We expect ORIGINAL_PRED to be identical to NEW_PRED_1 and NEW_PRED_2 (at least with a fixed seed), but this was not the case for me.

The results are as expected in DoWhy 0.8 when using:

model.causal_estimator.effect(earnings_interaction_test.drop(['true_effect', 'took_a_course'], axis=1))

to generate predictions on new data.

Version information:

  • DoWhy version 0.10.1

Additional context
...

Thanks for raising this, @AlxndrMlk.
I had a chance to check this issue, and the error is due to the columns being passed in the incorrect order.

If you change earnings_interaction_test.drop(['true_effect', 'took_a_course'], axis=1) to earnings_interaction_test[['python_proficiency', 'age']], then the code runs as expected.
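
In other words, the workaround on v0.10.1 is to select the covariate columns explicitly, in the order the estimator was fitted on:

# Pass covariates in the same column order the estimator was fitted with,
# instead of relying on DataFrame.drop().
X_test = earnings_interaction_test[['python_proficiency', 'age']]
test_cates = estimate._estimator_object.effect(X_test)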

However, I do realize that this behavior in v0.10 places an additional burden on the user. I am adding PR #1061 to restore the v0.8 behavior, where the user can simply provide the dataframe (earnings_interaction_test) and the column selection is done automatically.
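
Once that PR is in, something along these lines should work again, with the column selection handled internally:

# v0.8-style behavior: pass the full dataframe and let DoWhy pick out the
# relevant covariate columns automatically.
model.estimate_effect(
    identified_estimand=estimand,
    method_name='backdoor.econml.metalearners.SLearner',
    fit_estimator=False,
    target_units=earnings_interaction_test,
).cate_estimates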