py-why / dowhy

DoWhy is a Python library for causal inference that supports explicit modeling and testing of causal assumptions. DoWhy is based on a unified language for causal inference, combining causal graphical models and potential outcomes frameworks.

Home Page: https://www.pywhy.org/dowhy

Linear dataset functionality and parameters

priamai opened this issue

Example from here:

https://www.pywhy.org/dowhy/v0.9/example_notebooks/dowhy_estimation_methods.html

Are those 1000 samples individual units, e.g. patients?
Can we also generate multiple samples per unit? For example, this could be a treatment test over many days to measure the response.
I would then like to be able to say: 10 units x 10 days = 100 samples.

Why are the ATE, ATT, and ATC identical?

Since the W0 treatment is continuous, how does the system know how to discriminate between the treated and the untreated?

How can we constrain the generation? For example, I want to have only treatment and outcome values in the positive range between 0 and 100.

Are those 1000 samples individual units, e.g. patients?

Yes

Can we also generate multiple samples per unit? For example, this could be a treatment test over many days to measure the response. I would then like to be able to say: 10 units x 10 days = 100 samples.

This is not supported. It would be great if you could add such a dataset simulator.
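As a starting point, here is a minimal sketch of what such a simulator could look like. It is not part of DoWhy: the function name `simulate_panel_dataset`, its parameters, and the data-generating equations are all assumptions made for illustration. The column names W0, v0, and y are borrowed from the notebook.

```python
import numpy as np
import pandas as pd


def simulate_panel_dataset(num_units=10, num_days=10, beta=10.0, seed=0):
    """Hypothetical helper (not a DoWhy function): repeated measurements per unit."""
    rng = np.random.default_rng(seed)
    # One time-invariant confounder value per unit, repeated for every day.
    w0_per_unit = rng.normal(size=num_units)
    unit = np.repeat(np.arange(num_units), num_days)
    day = np.tile(np.arange(num_days), num_units)
    w0 = w0_per_unit[unit]
    # Treatment depends on the confounder plus day-level noise.
    v0 = w0 + rng.normal(size=unit.size)
    # Outcome is linear in treatment and confounder; beta is the true effect.
    y = beta * v0 + 2.0 * w0 + rng.normal(size=unit.size)
    return pd.DataFrame({"unit": unit, "day": day, "W0": w0, "v0": v0, "y": y})


panel = simulate_panel_dataset()
print(panel.shape)  # (100, 5): 10 units x 10 days, five columns
```

Note that rows belonging to the same unit are not independent here, so any estimator applied to such data would need to account for that.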

Why are the ATE, ATT, and ATC identical?

This is because the true effect is linear: it is homogeneous across the entire population. So it does not matter whether you compute the causal effect on everyone, only on the treated, or only on the untreated; it is the same effect.
With a different simulated dataset, one where the effect varies across units, these quantities will be different.
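To make this concrete, here is a small NumPy sketch that does not use DoWhy. It assigns each simulated unit a known individual effect and averages it over everyone (ATE), the treated (ATT), and the untreated (ATC). The data-generating process and the helper name `average_effects` are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 1000
w0 = rng.normal(size=n)  # confounder
# Units with a higher confounder value are more likely to be treated.
treated = rng.random(n) < 1.0 / (1.0 + np.exp(-w0))


def average_effects(individual_effect):
    """Return (ATE, ATT, ATC) as averages of known unit-level effects."""
    return (
        individual_effect.mean(),
        individual_effect[treated].mean(),
        individual_effect[~treated].mean(),
    )


# Homogeneous case: every unit has the same effect, so all three agree.
homogeneous = average_effects(np.full(n, 10.0))
# Heterogeneous case: the effect grows with the confounder, so they differ.
heterogeneous = average_effects(10.0 + 5.0 * w0)

print("homogeneous  ATE, ATT, ATC:", homogeneous)
print("heterogeneous ATE, ATT, ATC:", heterogeneous)
```

In the heterogeneous case the treated group has larger confounder values on average, so its average effect ends up above the population average and the untreated group's ends up below it.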

Since the W0 treatment is continuous, how does the system know how to discriminate between the treated and the untreated?

Treatment is v0. W0 is a confounder. For linear treatments, the user has to specify the "treatment" and "control" values (usually 1 and 0 respectively).
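To illustrate what the treatment and control values mean for a continuous treatment, here is a plain NumPy sketch rather than DoWhy code. It fits a linear regression of the outcome on v0 and W0, then reports the effect of moving v0 from a control value to a treatment value. The simulated data and the helper `linear_effect` are assumptions for illustration. In DoWhy itself, the corresponding arguments to `estimate_effect` are, as far as I know, `control_value` and `treatment_value`.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000
w0 = rng.normal(size=n)                         # confounder
v0 = 0.5 * w0 + rng.normal(size=n)              # continuous treatment
y = 10.0 * v0 + 2.0 * w0 + rng.normal(size=n)   # true effect per unit of v0 is 10

# Ordinary least squares of y on [1, v0, W0]; adjusting for W0 removes confounding.
X = np.column_stack([np.ones(n), v0, w0])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)


def linear_effect(control_value, treatment_value):
    """Estimated change in y when v0 moves from control_value to treatment_value."""
    return coef[1] * (treatment_value - control_value)


effect_0_to_1 = linear_effect(0, 1)
effect_0_to_3 = linear_effect(0, 3)
print(effect_0_to_1, effect_0_to_3)
```

Under a linear model the reported effect scales with the gap between the two values, which is why they have to be stated explicitly when the treatment is not binary.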

How can we constrain the generation? For example, I want to have only treatment and outcome values in the positive range between 0 and 100.

Not possible with the current function. You will need to add a new function or modify this one.
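One possible workaround, sketched here under assumptions, is to min-max rescale the generated columns afterwards. The helper `rescale_to_range` and the simulated data below are hypothetical and not part of DoWhy. Because the rescaling is affine, the relationship stays linear, but the true effect on the new scale changes by the ratio of the two original ranges.

```python
import numpy as np


def rescale_to_range(x, low=0.0, high=100.0):
    """Hypothetical helper: affine min-max rescaling of x into [low, high]."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return low + (high - low) * (x - x.min()) / span


rng = np.random.default_rng(0)
n = 1000
beta = 10.0                                   # true effect on the original scale
w0 = rng.normal(size=n)                       # confounder
v0 = w0 + rng.normal(size=n)                  # continuous treatment
y = beta * v0 + 2.0 * w0 + rng.normal(size=n)

v0_scaled = rescale_to_range(v0)
y_scaled = rescale_to_range(y)

# Both columns are stretched to the same width, so the slope is multiplied by
# (original treatment range) / (original outcome range).
beta_scaled = beta * (v0.max() - v0.min()) / (y.max() - y.min())
print(v0_scaled.min(), v0_scaled.max(), y_scaled.min(), y_scaled.max(), beta_scaled)
```

Clipping values to the range instead would also enforce the bounds, but it would break the linear relationship at the edges, so rescaling is the gentler choice for this kind of dataset.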