Cannot get the "do" operator to work for front door criteria with an unobserved criteria ...
grahamharrison68 opened this issue · comments
Your environment
- pgmpy 0.1.20
- Python 3.8
- Windows 10
Steps to reproduce
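from pgmpy.models import BayesianNetwork
from pgmpy.inference import CausalInference

# "U" is the unobserved confounder of smoker and cancer.
edges: list = [("smoker", "tar"), ("tar", "cancer"), ("U", "smoker"), ("U", "cancer")]
causal_model = BayesianNetwork(edges)
# df_smoking: the observed data (a pandas DataFrame, not shown in this issue)
causal_model.fit(df_smoking)
causal_model.check_model()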
Expected behaviour
The model fits, so that a call to do:
causal_inference = CausalInference(causal_model)
do_smoking = causal_inference.query(variables=["cancer"], do={"smoker": 1}, show_progress=False)
... will use the front-door adjustment formula to calculate the effect of smoking on cancer.
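For the graph in the steps above (smoker -> tar -> cancer, with U pointing into both smoker and cancer), the front-door adjustment I have in mind is the standard one, written here with s' ranging over the states of smoker and t over the states of tar. This is the textbook formula, not something taken from the pgmpy source:

```latex
P(\text{cancer} \mid do(\text{smoker} = s))
  = \sum_{t} P(\text{tar} = t \mid \text{smoker} = s)
    \sum_{s'} P(\text{cancer} \mid \text{tar} = t, \text{smoker} = s') \, P(\text{smoker} = s')
```

Every term on the right-hand side involves only smoker, tar and cancer, which is why U should not need to be observed.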
Actual behaviour
ValueError: Maximum Likelihood Estimator works only for models with all observed variables. Found latent variables: set().
I think you should use the Expectation Maximization algorithm, which does support unobserved data.
I assume that, underneath, the do operator uses the Maximum Likelihood Estimator, but I'm not sure whether that can be modified easily.
Appreciate the comment, thanks. I believe I have tried that with the same results, but if you have any sample code I would love to see it. Thanks
@jaimeperezsanchez @grahamharrison68 For missing data Expectation Maximization should be able to learn the model parameters. But there seems to be a bug in inference with the front-door criterion as well.
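For anyone who wants a reference value to compare a library result against, the front-door adjustment can be worked by hand on a small synthetic model. The sketch below is plain Python, not pgmpy, and it does not reproduce or fix the bug. All variables are assumed binary, and the probability tables (P_U, P_S_GIVEN_U, P_T_GIVEN_S, P_C_GIVEN_TU) are invented purely for illustration. Because U is known in this toy setup, the true interventional probability can be computed directly and compared with the front-door estimate, which uses only the joint distribution of smoker, tar and cancer.

```python
from itertools import product

# Hypothetical parameters for a synthetic smoking graph (illustration only).
P_U = {0: 0.5, 1: 0.5}             # P(U = u), U is the hidden confounder
P_S_GIVEN_U = {0: 0.2, 1: 0.8}     # P(smoker = 1 | U = u)
P_T_GIVEN_S = {0: 0.1, 1: 0.9}     # P(tar = 1 | smoker = s)
P_C_GIVEN_TU = {(0, 0): 0.1, (0, 1): 0.3,
                (1, 0): 0.5, (1, 1): 0.8}  # P(cancer = 1 | tar = t, U = u)


def bern(p_one, value):
    """Probability that a binary variable equals `value` when P(1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


def observed_joint():
    """P(smoker, tar, cancer) with U summed out, i.e. what the data can show."""
    joint = {}
    for s, t, c in product((0, 1), repeat=3):
        joint[(s, t, c)] = sum(
            P_U[u] * bern(P_S_GIVEN_U[u], s) * bern(P_T_GIVEN_S[s], t)
            * bern(P_C_GIVEN_TU[(t, u)], c)
            for u in (0, 1)
        )
    return joint


def marginal(joint, s=None, t=None, c=None):
    """Sum the joint over every variable left as None."""
    return sum(
        pr for (s_, t_, c_), pr in joint.items()
        if s in (None, s_) and t in (None, t_) and c in (None, c_)
    )


def front_door(joint, s, c=1):
    """P(cancer = c | do(smoker = s)) using only the observed joint."""
    total = 0.0
    for t in (0, 1):
        p_t_given_s = marginal(joint, s=s, t=t) / marginal(joint, s=s)
        inner = sum(
            marginal(joint, s=s2, t=t, c=c) / marginal(joint, s=s2, t=t)
            * marginal(joint, s=s2)
            for s2 in (0, 1)
        )
        total += p_t_given_s * inner
    return total


def true_effect(s, c=1):
    """Ground truth P(cancer = c | do(smoker = s)); needs U, so synthetic only."""
    return sum(
        P_U[u] * bern(P_T_GIVEN_S[s], t) * bern(P_C_GIVEN_TU[(t, u)], c)
        for u in (0, 1) for t in (0, 1)
    )


def naive_conditional(joint, s, c=1):
    """Plain P(cancer = c | smoker = s), which is biased by the hidden U."""
    return marginal(joint, s=s, c=c) / marginal(joint, s=s)


if __name__ == "__main__":
    joint = observed_joint()
    for s in (0, 1):
        print(s, front_door(joint, s), true_effect(s), naive_conditional(joint, s))
```

In this setup the front-door estimate should agree with the ground truth up to floating-point error, while the plain conditional probability does not, so a correct implementation run on data from such a model would be expected to land near the front-door value.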