chenrui33 / PyBGMM

Bayesian inference for Gaussian mixture model with some novel algorithms

PyBGMM: Bayesian inference for Gaussian mixture model

More coming...

Overview

Approximate (Bayesian) inference for the finite Gaussian mixture model (FGMM) and the infinite Gaussian mixture model (IGMM) can be carried out with variational inference or with Monte Carlo methods. This package uses Monte Carlo methods only; specifically, inference is done by collapsed Gibbs sampling.
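To make the idea of collapsed Gibbs sampling concrete, here is a minimal sketch for a finite mixture on 1d data. It is not PyBGMM's implementation: it assumes a known observation variance `sigma2`, a Normal prior `N(mu0, tau2)` on each component mean, and a symmetric Dirichlet prior with total mass `alpha` on the weights. The function and parameter names are hypothetical. "Collapsed" means the weights and means are integrated out, so only the assignments `z` are sampled.

```python
import math
import random


def log_predictive(x, n, total, sigma2, mu0, tau2):
    """Log density of x under a cluster holding n points that sum to `total`.

    The cluster mean is integrated out (Normal prior, known variance),
    which leaves a Normal posterior predictive.
    """
    post_var = 1.0 / (1.0 / tau2 + n / sigma2)
    post_mean = post_var * (mu0 / tau2 + total / sigma2)
    var = post_var + sigma2
    return -0.5 * (math.log(2.0 * math.pi * var) + (x - post_mean) ** 2 / var)


def collapsed_gibbs(data, K, alpha=1.0, sigma2=1.0, mu0=0.0, tau2=100.0,
                    n_iter=50, seed=0):
    """Sample cluster assignments for a K-component 1d Gaussian mixture."""
    rng = random.Random(seed)
    z = [rng.randrange(K) for _ in data]  # random initial assignments
    counts = [0] * K                      # points per cluster
    sums = [0.0] * K                      # sum of points per cluster
    for x, k in zip(data, z):
        counts[k] += 1
        sums[k] += x

    for _ in range(n_iter):
        for i, x in enumerate(data):
            # Remove point i from its current cluster.
            counts[z[i]] -= 1
            sums[z[i]] -= x
            # log p(z_i = k | rest) = log(n_k + alpha / K) + log predictive.
            logp = [math.log(counts[k] + alpha / K)
                    + log_predictive(x, counts[k], sums[k], sigma2, mu0, tau2)
                    for k in range(K)]
            m = max(logp)  # subtract the max for numerical stability
            weights = [math.exp(lp - m) for lp in logp]
            z[i] = rng.choices(range(K), weights=weights)[0]
            # Add point i back under its newly sampled cluster.
            counts[z[i]] += 1
            sums[z[i]] += x
    return z
```

Calling `collapsed_gibbs(data, K=2)` on two well-separated groups of points should place each group in its own cluster, although the cluster labels themselves are arbitrary.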

Code Structure

|-- GMM # base class for Gaussian mixture model
    |---- FGMM  # base class for finite Gaussian mixture model
        |------ PFGMM
        |------ CSFGMM
        |------ LSFGMM

    |---- IGMM  # base class for infinite Gaussian mixture model
        |------ CRPMM
        |------ PCRPMM    ## powered Chinese restaurant process (pCRP) mixture model
        |------ CSIGMM
        |------ LSIGMM
        |------ SubCRPMM  ## Sub-clustering with CRP mixture model for high-dimensional data

Documentation

What we include:

  • Finite Gaussian mixture model

  • Hyperprior on Dirichlet distribution (for finite Gaussian mixture model)

  • Chinese restaurant process mixture model (CRPMM)

  • Powered Chinese restaurant process (pCRP) mixture model

  • Adaptive powered Chinese restaurant process (Ada-pCRP) mixture model

  • Constrained sampling for Chinese restaurant process mixture model

  • Bayesian variable selection in Chinese restaurant process mixture (Sub-CRP)
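As an illustration of how the CRP and pCRP priors differ, the sketch below computes the prior seating probabilities for the next customer. It assumes the usual form of the powered CRP, in which an existing table of size n_k gets weight n_k^r and a new table gets weight alpha, so that r = 1 recovers the plain CRP. The function name and signature are hypothetical and are not taken from PyBGMM.

```python
def pcrp_assignment_probs(counts, alpha, r=1.0):
    """Prior probabilities for the next customer's table.

    counts: current table sizes
    alpha:  concentration parameter (weight of opening a new table)
    r:      power applied to table sizes (r = 1.0 is the plain CRP)

    Returns one probability per existing table, followed by a final
    entry for opening a new table.
    """
    weights = [n ** r for n in counts] + [alpha]
    total = sum(weights)
    return [w / total for w in weights]
```

With table sizes `[3, 1]` and `alpha=1.0`, the plain CRP (`r=1.0`) gives probabilities of about 0.6, 0.2 and 0.2. Raising `r` to 2.0 shifts mass toward the larger table and lowers the probability of both the small table and a new table.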

What we plan to include:

  • Hyperprior on Dirichlet process prior (for infinite Gaussian mixture model)

Examples

| Code | Description |
|------|-------------|
| CRPMM 1d | Chinese restaurant process mixture model for 1d data |
| CRPMM 2d | Chinese restaurant process mixture model for 2d data |
| pCRPMM 1d | Powered Chinese restaurant process mixture model for 1d data |
| pCRPMM 2d | Powered Chinese restaurant process mixture model for 2d data |
| SubCRP | Several tests of the SubCRP mixture model (Bayesian variable selection for high-dimensional data in CRP) |
| CSIGMM | Demo of constrained sampling for CRPMM |
| CRP draw | A basic demo of drawing from the CRP prior |
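For readers who want a feel for the "CRP draw" demo without opening it, here is a minimal, standard-library sketch of one draw from a CRP prior. It is not the repository's demo code, and the name `crp_draw` and its arguments are hypothetical. Customers are seated one at a time: an existing table is chosen with weight equal to its size, and a new table with weight `alpha` (which must be positive).

```python
import random


def crp_draw(n, alpha, seed=None):
    """Seat n customers by the Chinese restaurant process.

    Returns (assignments, counts): the table index of each customer and
    the final size of each table. Tables are numbered in order of creation.
    """
    rng = random.Random(seed)
    assignments = []
    counts = []
    for _ in range(n):
        # Existing table k has weight n_k; a new table has weight alpha.
        weights = counts + [alpha]
        k = rng.choices(range(len(weights)), weights=weights)[0]
        if k == len(counts):
            counts.append(1)   # open a new table
        else:
            counts[k] += 1     # join an existing table
        assignments.append(k)
    return assignments, counts
```

Larger values of `alpha` tend to produce more tables for the same number of customers; the expected number of tables grows roughly like `alpha * log(n)`.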

Dependencies

  1. Adaptive Rejection Sampling (ARS) - Python implementation of ARS.
  2. Clustering accuracy - infopy: Python implementation of information-theoretic computations.
  3. See requirements.txt
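Information-theoretic scores are a common way to compare a clustering against reference labels, because they are unaffected by how the clusters happen to be numbered. The sketch below computes normalized mutual information with the standard library only. It does not use or reproduce infopy's API, and the README does not say which score PyBGMM actually reports, so treat the choice of NMI (and the function name) as an assumption.

```python
import math
from collections import Counter


def normalized_mutual_info(labels_a, labels_b):
    """NMI between two labelings of the same items (geometric-mean form)."""
    n = len(labels_a)
    count_a = Counter(labels_a)
    count_b = Counter(labels_b)
    joint = Counter(zip(labels_a, labels_b))

    # Mutual information from the empirical joint distribution.
    mi = sum(c / n * math.log(c * n / (count_a[a] * count_b[b]))
             for (a, b), c in joint.items())
    # Entropy of each labeling.
    h_a = -sum(c / n * math.log(c / n) for c in count_a.values())
    h_b = -sum(c / n * math.log(c / n) for c in count_b.values())

    if h_a == 0.0 or h_b == 0.0:
        # A labeling with a single cluster carries no information.
        return 1.0 if h_a == h_b else 0.0
    return mi / math.sqrt(h_a * h_b)
```

Two labelings that agree up to a renaming of clusters score 1, and statistically independent labelings score 0.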

License

MIT

References

[1]. H. Kamper, A. Jansen, S. King, and S. Goldwater, "Unsupervised lexical clustering of speech segments using fixed-dimensional acoustic embeddings", in Proceedings of the IEEE Spoken Language Technology Workshop (SLT), 2014.

[2]. Murphy, Kevin P. "Conjugate Bayesian analysis of the Gaussian distribution." Technical report, 2007.

[3]. Murphy, Kevin P. Machine learning: a probabilistic perspective. MIT press, 2012.

[4]. Pedregosa, Fabian, et al. "Scikit-learn: Machine learning in Python." Journal of Machine Learning Research 12.Oct (2011): 2825-2830.

[5]. Rasmussen, Carl Edward. "The infinite Gaussian mixture model." Advances in neural information processing systems. 2000.

[6]. Tadesse, Mahlet G., Naijun Sha, and Marina Vannucci. "Bayesian variable selection in clustering high-dimensional data." Journal of the American Statistical Association 100.470 (2005): 602-617.
