pgmpy / pgmpy

Python Library for learning (Structure and Parameter), inference (Probabilistic and Causal), and simulations in Bayesian Networks.

Home Page: https://pgmpy.org/

Cannot get the "do" operator to work for front-door criteria with an unobserved criteria ...

grahamharrison68 opened this issue

Subject of the issue

Describe your issue here.

Your environment

  • pgmpy 0.1.20
  • Python 3.8
  • Windows 10

Steps to reproduce

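from pgmpy.models import BayesianNetwork
from pgmpy.inference import CausalInference

# "U" is the unobserved variable that points into both smoker and cancer.
edges: list = [("smoker", "tar"), ("tar", "cancer"), ("U", "smoker"), ("U", "cancer")]

causal_model = BayesianNetwork(edges)

# df_smoking is the reporter's DataFrame; it is not shown in this issue.
causal_model.fit(df_smoking)
causal_model.check_model()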

Expected behaviour

The model fits, so that a call to do:

causal_inference = CausalInference(causal_model)
do_smoking = causal_inference.query(variables=["cancer"], do={"smoker": 1}, show_progress=False)

... will use the front-door adjustment formula to calculate the effect of smoking on cancer.
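
For reference, here is a minimal sketch of the front-door adjustment formula computed by hand, using only the three observed variables. It is not how pgmpy implements the query; it only illustrates the quantity being asked for, P(cancer | do(smoker)) = sum over tar of P(tar | smoker) * sum over smoker' of P(cancer | tar, smoker') * P(smoker'). The function name front_door_effect and the toy counts are made up for illustration and do not come from df_smoking.

```python
from collections import Counter


def front_door_effect(rows, smoker_value):
    """Estimate P(cancer | do(smoker=smoker_value)) with the front-door formula.

    rows: iterable of (smoker, tar, cancer) tuples. The unobserved U is never used.
    Returns a dict mapping each cancer state to its interventional probability.
    """
    rows = [tuple(r) for r in rows]
    n = len(rows)
    n_s = Counter(s for s, _, _ in rows)        # counts per smoker state
    n_st = Counter((s, t) for s, t, _ in rows)  # counts per (smoker, tar)
    n_stc = Counter(rows)                       # counts per (smoker, tar, cancer)
    tar_states = sorted({t for _, t, _ in rows})
    cancer_states = sorted({c for _, _, c in rows})

    result = {}
    for c in cancer_states:
        total = 0.0
        for t in tar_states:
            # P(tar = t | smoker = smoker_value)
            p_t_given_s = n_st[(smoker_value, t)] / n_s[smoker_value]
            # Inner sum over s2 of P(cancer = c | tar = t, smoker = s2) * P(smoker = s2)
            inner = 0.0
            for s2, count_s2 in n_s.items():
                if n_st[(s2, t)] == 0:
                    continue  # stratum never observed, so the conditional is undefined
                inner += n_stc[(s2, t, c)] / n_st[(s2, t)] * count_s2 / n
            total += p_t_given_s * inner
        result[c] = total
    return result


# Invented toy counts for (smoker, tar, cancer); NOT the reporter's data.
toy_counts = {
    (0, 0, 0): 361, (0, 0, 1): 19,
    (0, 1, 0): 10,  (0, 1, 1): 10,
    (1, 0, 0): 18,  (1, 0, 1): 2,
    (1, 1, 0): 152, (1, 1, 1): 228,
}
rows = [key for key, count in toy_counts.items() for _ in range(count)]

p_do_smoke = front_door_effect(rows, smoker_value=1)
p_do_none = front_door_effect(rows, smoker_value=0)
print("P(cancer=1 | do(smoker=1)) =", p_do_smoke[1])
print("P(cancer=1 | do(smoker=0)) =", p_do_none[1])
```

Because the estimate needs only smoker, tar and cancer, a hand computation like this can serve as a cross-check for whatever CausalInference.query returns once the model fits.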

Actual behaviour

ValueError: Maximum Likelihood Estimator works only for models with all observed variables. Found latent variables: set().

I think you should use the Expectation Maximization algorithm, which does support unobserved data.
I assume the do operator uses the Maximum Likelihood Estimator underneath, but I'm not sure whether that can be changed easily.

Appreciate the comment, thanks. I believe I have tried that with the same results, but if you have any sample code I would love to see it. Thanks.

@jaimeperezsanchez @grahamharrison68 For missing data, Expectation Maximization should be able to learn the model parameters. But there seems to be a bug in inference with the front-door criterion as well.