vsmolyakov / DP_means

Dirichlet Process K-means

Description

DP K-means is a Bayesian nonparametric extension of the K-means algorithm, derived from a small-variance asymptotics (SVA) approximation of the Dirichlet process mixture model.
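For orientation, the objective that results from the SVA limit in the Kulis and Jordan paper cited under Reference can be written roughly as below. The symbols are notation chosen here rather than taken from this repository: k is the number of clusters, \ell_c is the set of points assigned to cluster c, \mu_c is its mean, and \lambda is the cluster penalty.

```latex
\min_{k,\ \{\ell_c\}_{c=1}^{k}} \;
  \sum_{c=1}^{k} \sum_{x \in \ell_c} \lVert x - \mu_c \rVert^2 \;+\; \lambda k,
\qquad
\mu_c = \frac{1}{\lvert \ell_c \rvert} \sum_{x \in \ell_c} x
```

The first term is the usual K-means distortion; the second charges \lambda for every cluster, which is what lets the number of clusters be chosen by the optimization instead of being fixed in advance.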

It does not require prior knowledge of the number of clusters K. The cluster penalty parameter lambda is set from the data by taking the maximum distance to the k-means++ initialization. Normalized Mutual Information (NMI) is used to compare the posterior cluster assignments with the ground truth.
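The paragraph above can be illustrated with a minimal pure-Python sketch. It is not the code in this repository. The function names (`choose_lambda`, `dp_means`, `nmi`), the `k_init` and `seed` parameters, the use of squared distances for lambda, and the geometric-mean normalization of NMI are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return [sum(col) / len(points) for col in zip(*points)]


def choose_lambda(data, k_init, seed=0):
    """Assumed heuristic: seed k_init centers k-means++ style, then return the
    largest squared distance from any point to its nearest seed."""
    rng = random.Random(seed)
    first = data[rng.randrange(len(data))]
    nearest = [sq_dist(x, first) for x in data]
    for _ in range(k_init - 1):
        if sum(nearest) == 0:  # every point coincides with a seed
            break
        # D^2 sampling: next seed drawn with probability proportional to its
        # squared distance from the closest seed chosen so far.
        idx = rng.choices(range(len(data)), weights=nearest, k=1)[0]
        nearest = [min(d, sq_dist(x, data[idx])) for d, x in zip(nearest, data)]
    return max(nearest)


def dp_means(data, lam, max_iter=100):
    """DP-means sketch: like K-means, but a point whose squared distance to
    every center exceeds lam opens a new cluster centered on itself."""
    centers = [centroid(data)]  # start with one global cluster
    labels = [0] * len(data)
    for _ in range(max_iter):
        changed = False
        for i, x in enumerate(data):
            dists = [sq_dist(x, c) for c in centers]
            best = min(range(len(centers)), key=dists.__getitem__)
            if dists[best] > lam:
                centers.append(list(x))
                best = len(centers) - 1
            if labels[i] != best:
                labels[i] = best
                changed = True
        # Recompute means, dropping clusters that ended up empty.
        groups = {}
        for x, z in zip(data, labels):
            groups.setdefault(z, []).append(x)
        keep = sorted(groups)
        remap = {old: new for new, old in enumerate(keep)}
        centers = [centroid(groups[old]) for old in keep]
        labels = [remap[z] for z in labels]
        if not changed:
            break
    return labels, centers


def nmi(a, b):
    """Normalized mutual information, I(a;b) / sqrt(H(a) * H(b))."""
    n = len(a)
    pa, pb, pab = Counter(a), Counter(b), Counter(zip(a, b))
    mi = sum(c / n * math.log(c * n / (pa[u] * pb[v]))
             for (u, v), c in pab.items())
    ha = -sum(c / n * math.log(c / n) for c in pa.values())
    hb = -sum(c / n * math.log(c / n) for c in pb.values())
    if ha == 0 and hb == 0:
        return 1.0  # both labelings put everything in one cluster
    if ha == 0 or hb == 0:
        return 0.0
    return mi / math.sqrt(ha * hb)


if __name__ == "__main__":
    data = [(0, 0), (0, 1), (1, 0), (1, 1),
            (10, 10), (10, 11), (11, 10), (11, 11)]
    truth = [0, 0, 0, 0, 1, 1, 1, 1]
    labels, centers = dp_means(data, lam=4.0)
    print(len(centers), nmi(labels, truth))
```

Because the penalty is compared against squared distances here, lambda should be chosen on the same scale; a larger lambda yields fewer clusters, and a lambda larger than any point's squared distance to the global mean leaves everything in a single cluster.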

Reference

B. Kulis and M. Jordan, "Revisiting k-means: New Algorithms via Bayesian Nonparametrics", ICML 2012.

Dependencies

MATLAB R2015a
Python 3.11.2
Eigen3

License: MIT License


Languages

C++ 34.4%, Julia 24.5%, Python 21.3%, MATLAB 19.1%, CMake 0.7%