Cannot get the "do" operator to work for the front-door criterion with an unobserved variable ...

Question

grahamharrison68 opened this issue 2 years ago · comments

Graham Harrison commented 2 years ago

Your environment

pgmpy 0.1.20
Python 3.8
Windows 10

Steps to reproduce

from pgmpy.models import BayesianNetwork
from pgmpy.inference import CausalInference

# "U" is the unobserved confounder of smoker and cancer.
edges: list = [("smoker", "tar"), ("tar", "cancer"), ("U", "smoker"), ("U", "cancer")]

causal_model = BayesianNetwork(edges)

# df_smoking is a pandas DataFrame of the observed data (not shown here).
causal_model.fit(df_smoking)
causal_model.check_model()

Expected behaviour

The model fits, so that a call to do:

causal_inference = CausalInference(causal_model)
do_smoking = causal_inference.query(variables=["cancer"], do={"smoker": 1}, show_progress=False)

... will use the front-door adjustment formula to calculate the effect of smoking on cancer
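
For reference, with tar as the mediator on the path smoker → tar → cancer and U unobserved, the standard front-door adjustment formula reads as follows (here t ranges over the values of tar and s' over the values of smoker; the symbols are notation chosen for this note, not pgmpy names):

```latex
P(\text{cancer} \mid do(\text{smoker} = s))
  = \sum_{t} P(\text{tar} = t \mid \text{smoker} = s)
    \sum_{s'} P(\text{cancer} \mid \text{tar} = t, \text{smoker} = s') \, P(\text{smoker} = s')
```

Every term on the right-hand side involves only smoker, tar and cancer, so in principle no column for U is needed to evaluate it.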

Actual behaviour

ValueError: Maximum Likelihood Estimator works only for models with all observed variables. Found latent variables: set().

Jaime Pérez Sánchez · Answer 1 · Tue Jan 24 2023 23:42:07 GMT+0800 (China Standard Time)

I think you should use the Expectation Maximization algorithm, which does support unobserved data.
I assume that, under the hood, the do operator uses the Maximum Likelihood Estimator, but I'm not sure whether that can be modified easily.

Graham Harrison · Answer 2 · Sun Jan 29 2023 16:47:04 GMT+0800 (China Standard Time)

Appreciate the comment, thanks. I believe I have tried that with the same results, but if you have any sample code I would love to see it. Thanks.

Ankur Ankan · Answer 3 · Sun Feb 12 2023 22:54:17 GMT+0800 (China Standard Time)

@jaimeperezsanchez @grahamharrison68 For missing data, Expectation Maximization should be able to learn the model parameters. However, there also seems to be a bug in inference with the front-door criterion.
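
As a way to sanity-check whatever number the library eventually returns, the front-door adjustment can also be computed by hand from the observed columns alone. The following is only a minimal sketch in plain Python and does not use pgmpy: the function name `front_door_estimate` and the toy counts are made up for illustration, and it assumes discrete variables with every (smoker, tar) combination present in the data.

```python
from collections import Counter


def front_door_estimate(rows, x_value, y_value):
    """Estimate P(Y = y_value | do(X = x_value)) from (x, m, y) samples.

    Front-door adjustment:
        sum over m of P(m | x) * sum over x' of P(y | m, x') * P(x')
    """
    rows = [tuple(r) for r in rows]
    n = len(rows)
    x_counts = Counter(x for x, _, _ in rows)
    xm_counts = Counter((x, m) for x, m, _ in rows)
    xmy_counts = Counter(rows)
    mediator_values = {m for _, m, _ in rows}

    if x_counts[x_value] == 0:
        raise ValueError(f"no samples with x = {x_value!r}")

    total = 0.0
    for m in mediator_values:
        # P(m | x): how the treatment shifts the mediator.
        p_m_given_x = xm_counts[(x_value, m)] / x_counts[x_value]
        inner = 0.0
        for x_prime, x_count in x_counts.items():
            if xm_counts[(x_prime, m)] == 0:
                # P(y | m, x') cannot be estimated from an empty cell.
                raise ValueError(f"no samples with x = {x_prime!r}, m = {m!r}")
            p_y_given_m_x = xmy_counts[(x_prime, m, y_value)] / xm_counts[(x_prime, m)]
            # Weight by the marginal P(x'), not by P(x' | m).
            inner += p_y_given_m_x * (x_count / n)
        total += p_m_given_x * inner
    return total


# Made-up counts of (smoker, tar, cancer) observations, for illustration only.
toy_counts = {
    (0, 0, 0): 27, (0, 0, 1): 3,
    (0, 1, 0): 5, (0, 1, 1): 5,
    (1, 0, 0): 6, (1, 0, 1): 4,
    (1, 1, 0): 6, (1, 1, 1): 24,
}
rows = [row for row, count in toy_counts.items() for _ in range(count)]

p_cancer_do_smoker = front_door_estimate(rows, x_value=1, y_value=1)
p_cancer_do_nonsmoker = front_door_estimate(rows, x_value=0, y_value=1)
print(p_cancer_do_smoker, p_cancer_do_nonsmoker)
```

With a real DataFrame, something like `df_smoking[["smoker", "tar", "cancer"]].itertuples(index=False)` could supply the rows, assuming those three columns exist. The sketch raises an error on empty (smoker, tar) cells rather than guessing, since the formula is not estimable there.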