Unable to create an HMM using the Dynamic Bayesian Network.
ngobibibnbe opened this issue
Subject of the issue
I am creating a multivariate, non-stationary HMM using the DBN class in pgmpy. I run into an error stating that the first time instance of the variable is not defined in the model.
Your environment
- pgmpy 0.1.20
- Python 3.8
- ubuntu
Steps to reproduce
from pgmpy.factors.discrete import TabularCPD
from pgmpy.models import DynamicBayesianNetwork as DBN
from pgmpy.inference import DBNInference
from pgmpy.models import BayesianNetwork
dbnet = DBN() # BayesianNetwork()
dbnet.add_edges_from([(('Z', 0), ('Z', 1)), (('Z', 1), ('Z', 2))])  # (('X', 0), ('X', 1)),
z_start_cpd = TabularCPD(('Z', 0), 2, [[1.0 / 2], [1.0 / 2]])
z_trans_cpd = TabularCPD(('Z', 1), 2,
                         [[0.7, 0.8],
                          [0.3, 0.2]],
                         evidence=[('Z', 0)],
                         evidence_card=[2])
z_trans_cpd_2 = TabularCPD(('Z', 2), 2,
                           [[0.7, 0.8],
                            [0.3, 0.2]],
                           evidence=[('Z', 1)],
                           evidence_card=[2])
dbnet.add_cpds(z_start_cpd, z_trans_cpd, z_trans_cpd_2)
dbnet.initialize_initial_state()
dbn_inf = DBNInference(dbnet)
dbn_inf.backward_inference([('Z', 0), ('Z', 1)])
Expected behaviour
1- print(dbnet.nodes): I should get (Z, 0), (Z, 1) and (Z, 2).
2- When I execute the whole code above, I expect the probabilities of (Z, 0) and (Z, 1).
Actual behaviour
1- print(dbnet.nodes): [<DynamicNode(Z, 0) at 0x7faf5731aac0>, <DynamicNode(Z, 1) at 0x7faf5731ae80>]
2- When I execute the whole code above, I obtain: ('CPD defined on variable not in the model', <TabularCPD representing P(('Z', 2):2 | ('Z', 1):2) at 0x7faf573e04c0>), which is expected since the node isn't in the model. I even added dbnet.add_nodes_from(nodes=[('Z', 0), ('Z', 1), ('Z', 2)]) before adding the edges, but the same error occurs.
Full error:
ValueError Traceback (most recent call last)
in
52
53 print(dbnet.nodes)
---> 54 dbnet.add_cpds(z_start_cpd, z_trans_cpd, z_trans_cpd_2 )#, y_i_cpd,y_cpd)
55 from pgmpy.inference import VariableElimination
56
/usr/local/lib/python3.8/dist-packages/pgmpy/models/DynamicBayesianNetwork.py in add_cpds(self, *cpds)
471 set(super(DynamicBayesianNetwork, self).nodes())
472 ):
--> 473 raise ValueError("CPD defined on variable not in the model", cpd)
474
475 self.cpds.extend(cpds)
ValueError: ('CPD defined on variable not in the model', <TabularCPD representing P(('Z', 2):2 | ('Z', 1):2) at 0x7faf57308dc0>)
@ngobibibnbe For DBNs, pgmpy assumes the model is a 2-TBN, i.e. the transition CPDs stay the same for every time slice. So you only need to specify the first one and a half time slices (the slice-0 CPDs plus the transition from slice 0 to slice 1) to fully specify the network. For the example above, you should specify just this:
from pgmpy.factors.discrete import TabularCPD
from pgmpy.models import DynamicBayesianNetwork as DBN
from pgmpy.inference import DBNInference
from pgmpy.models import BayesianNetwork
dbnet = DBN() # BayesianNetwork()
dbnet.add_edges_from([(('Z', 0), ('Z', 1))])  # (('X', 0), ('X', 1)),
z_start_cpd = TabularCPD(('Z', 0), 2, [[1.0 / 2], [1.0 / 2]])
z_trans_cpd = TabularCPD(('Z', 1), 2,
                         [[0.7, 0.8],
                          [0.3, 0.2]],
                         evidence=[('Z', 0)],
                         evidence_card=[2])
dbnet.add_cpds(z_start_cpd, z_trans_cpd)
dbnet.initialize_initial_state()
dbn_inf = DBNInference(dbnet)
dbn_inf.backward_inference([('Z', 0), ('Z', 1)])
pgmpy should automatically create nodes and edges for later time slices as required. Inference still fails on the example above, though; this is a known bug that hasn't been fixed yet. Duplicate of #1583
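For the non-stationary case from the original question, where each step may have its own transition table, the 2-TBN assumption does not apply directly. As a rough illustration of what unrolling the chain computes, here is a minimal pure-Python sketch that propagates the prior marginal P(Z_t) slice by slice, taking one transition table per step. It does not use pgmpy and does not handle evidence; the names `step` and `unroll_marginals` are made up for this example, and the tables reuse the numbers from the CPDs above.

```python
# Transition table in the same layout as the TabularCPD values above:
# rows index the state of Z at t+1, columns index the state of Z at t.
TRANSITION = [[0.7, 0.8],
              [0.3, 0.2]]
INITIAL = [0.5, 0.5]  # P(Z_0)


def step(prior, transition):
    """Return P(Z_{t+1}) from P(Z_t) and the table P(Z_{t+1} | Z_t)."""
    return [
        sum(transition[i][j] * prior[j] for j in range(len(prior)))
        for i in range(len(transition))
    ]


def unroll_marginals(initial, transitions):
    """Prior marginals for slices 0..len(transitions), one table per step."""
    marginals = [list(initial)]
    for transition in transitions:
        marginals.append(step(marginals[-1], transition))
    return marginals


# Stationary case: the same table is reused for every step (2-TBN assumption).
# A non-stationary chain would pass a different table for each step instead.
marginals = unroll_marginals(INITIAL, [TRANSITION, TRANSITION])
for t, dist in enumerate(marginals):
    print(f"P(Z_{t}) = {dist}")
```

With the same table repeated, slice 1 comes out as approximately [0.75, 0.25]. Passing a list of distinct tables gives the non-stationary behaviour, at the cost of doing the bookkeeping outside the DBN class.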