mattjj / pyhsmm

Problems with EM for Gaussian-Mixture HDP-HSMM

james-GA opened this issue · comments

Hey guys,

I have been using hmmlearn's GMMHMM (https://hmmlearn.readthedocs.io/en/latest/api.html#gmmhmm) to learn HMMs whose state distributions are Gaussian mixtures, which better approximate my data. I am now trying to build a similar model with pyhsmm, with the HDP and semi-Markov extensions, but I am running into problems applying EM_fit.
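For context, the per-point emission likelihood of a Gaussian-mixture state is a log-sum-exp over the component densities. A minimal, self-contained NumPy sketch of that computation (function names here are illustrative, not pyhsmm's API):

```python
import numpy as np

def gaussian_logpdf(X, mu, Sigma):
    """Log density of N(mu, Sigma) at each row of X, via a Cholesky solve."""
    d = mu.shape[0]
    diff = X - mu
    L = np.linalg.cholesky(Sigma)
    sol = np.linalg.solve(L, diff.T)                # (d, N)
    maha = (sol ** 2).sum(axis=0)                   # Mahalanobis distances
    logdet = 2.0 * np.log(np.diag(L)).sum()
    return -0.5 * (d * np.log(2 * np.pi) + logdet + maha)

def gmm_log_likelihood(X, weights, means, covs):
    """log p(x_n) = logsumexp_k [ log pi_k + log N(x_n | mu_k, Sigma_k) ]."""
    log_terms = np.stack([np.log(w) + gaussian_logpdf(X, m, c)
                          for w, m, c in zip(weights, means, covs)], axis=1)
    mx = log_terms.max(axis=1, keepdims=True)       # stabilized log-sum-exp
    return (mx + np.log(np.exp(log_terms - mx).sum(axis=1, keepdims=True))).ravel()
```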

Here is the model set-up for some 2D data X:

    import numpy as np
    import pyhsmm
    from pyhsmm.util.text import progprint_xrange

    # nMix, n_components, and nGibbsSamples are defined elsewhere
    obs_dim = X.shape[1]
    obs_hypparams = {'mu_0': np.zeros(obs_dim),
                     'sigma_0': np.eye(obs_dim),
                     'kappa_0': 0.25,
                     'nu_0': obs_dim + 2}
    dur_hypparams = {'alpha_0': 2*30,
                     'beta_0': 2}
    alpha_0 = 5.0
    # Distributions: build a fresh mixture per state so states don't share parameters
    obs_distns = [
        pyhsmm.distributions.MixtureDistribution(
            alpha_0=alpha_0,
            components=[pyhsmm.distributions.Gaussian(**obs_hypparams)
                        for itr in range(nMix)])
        for state in range(n_components)]
    dur_distns = [pyhsmm.distributions.PoissonDuration(**dur_hypparams)
                  for state in range(n_components)]
    # Build the iHSMM model that will represent the fit model
    model = pyhsmm.models.WeakLimitHDPHSMM(
            #alpha=6.,gamma=6., # these can matter; see concentration-resampling.py
            alpha_a_0=1., alpha_b_0=1./4,
            gamma_a_0=1., gamma_b_0=1./4,
            init_state_concentration=6., # pretty inconsequential
            obs_distns=obs_distns,
            dur_distns=dur_distns)
    model.add_data(X)  # ,trunc=60) # duration truncation speeds things up when it's possible
    # Gibbs sampling for initialization
    for idx in progprint_xrange(nGibbsSamples):
        model.resample_model()
    # EM fit
    likes = model.EM_fit()

As far as I can tell, my model set-up for this distribution is correct, and the Gibbs sampling works, but when I run model.EM_fit() I get the following error:

<ipython-input-2-4ea26e5a1b5e> in computeAllScores(X, cutPoint, nMix, n_components, randState, nGibbsSamples, useEM, showPlots)
     37     # EM fit
---> 39    likes = model.EM_fit()

~/anaconda3/lib/python3.6/site-packages/pybasicbayes/abstractions.py in EM_fit(self, tol, maxiter)
    219 class ModelEM(with_metaclass(abc.ABCMeta, _EMBase)):
    220     def EM_fit(self,tol=1e-1,maxiter=100):
--> 221         return self._EM_fit(self.EM_step,tol=tol,maxiter=maxiter)
    222 
    223     @abc.abstractmethod

~/anaconda3/lib/python3.6/site-packages/pybasicbayes/abstractions.py in _EM_fit(self, method, tol, maxiter, progprint)
    204         step_iterator = range(maxiter) if not progprint else progprint_xrange(maxiter)
    205         for itr in step_iterator:
--> 206             method()
    207             likes.append(self.log_likelihood())
    208             if len(likes) > 1:

~/anaconda3/lib/python3.6/site-packages/pyhsmm/models.py in EM_step(self)
    638         self._clear_caches()
    639         self._E_step()
--> 640         self._M_step()
    641 
    642     def _E_step(self):

~/anaconda3/lib/python3.6/site-packages/pyhsmm/models.py in _M_step(self)
    971 class _HSMMEM(_HSMMBase,_HMMEM):
    972     def _M_step(self):
--> 973         super(_HSMMEM,self)._M_step()
    974         self._M_step_dur_distns()
    975 

~/anaconda3/lib/python3.6/site-packages/pyhsmm/models.py in _M_step(self)
    645 
    646     def _M_step(self):
--> 647         self._M_step_obs_distns()
    648         self._M_step_init_state_distn()
    649         self._M_step_trans_distn()

~/anaconda3/lib/python3.6/site-packages/pyhsmm/models.py in _M_step_obs_distns(self)
    652         for state, distn in enumerate(self.obs_distns):
    653             distn.max_likelihood([s.data for s in self.states_list],
--> 654                     [s.expected_states[:,state] for s in self.states_list])
    655 
    656     def _M_step_init_state_distn(self):

~/anaconda3/lib/python3.6/site-packages/pybasicbayes/models/mixture.py in max_likelihood(self, data, weights)
    644     def max_likelihood(self,data,weights=None):
    645         if weights is not None:
--> 646             raise NotImplementedError
    647         assert isinstance(data,list) or isinstance(data,np.ndarray)
    648         if isinstance(data,list):

NotImplementedError: 

So it looks like EM for mixture observation distributions has not been implemented in pyhsmm yet? If so, is this something that will be added soon or is work in progress? If it is not going to be added, could you please point me in the right direction as to what exactly needs adding to get this to work (as much detail as possible please, as this is my first time using these models!)? I guess MixtureDistribution.max_likelihood() must be altered to accept the per-state weights that the M-step passes in?
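In case it helps anyone sketching a patch: the traceback shows the M-step calling max_likelihood(data, weights), where weights are the expected state occupancies from the E-step, and pybasicbayes's MixtureDistribution raises NotImplementedError on that path. Conceptually, a weighted M-step would scale each point's component responsibilities by its state weight before re-estimating parameters. A standalone NumPy sketch of such an update (variable names hypothetical, not pyhsmm's internals):

```python
import numpy as np

def weighted_gmm_m_step(X, resp, state_weights):
    """One weighted M-step for a Gaussian mixture.

    X             : (N, D) data
    resp          : (N, K) per-point component responsibilities (rows sum to 1)
    state_weights : (N,) expected occupancy of this HSMM state, from the E-step
    """
    w = resp * state_weights[:, None]          # (N, K) combined weights
    Nk = w.sum(axis=0)                         # effective counts per component
    pi = Nk / Nk.sum()                         # mixing proportions
    means = (w.T @ X) / Nk[:, None]            # weighted component means
    covs = []
    for k in range(w.shape[1]):
        diff = X - means[k]
        covs.append((w[:, k, None] * diff).T @ diff / Nk[k])
    return pi, means, np.stack(covs)
```

(A real patch would also need to guard against empty components, i.e. Nk near zero, and to compute the responsibilities from the current mixture parameters in an inner E-step.)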

Many thanks for the help!!