pneuvial / cvxclustr

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

#cvxclustr

Clustering is a fundamental problem in science and engineering. Many classic methods such as $k$-means, Gaussian mixture models, and hierarchical clustering, however, employ greedy algorithms which can be entrapped in local minima, sometimes drastical suboptimal ones at that. Recently introduced convex relaxations of $k$-means and hierarchical clustering shrink cluster centroids toward one another and ensure a unique global minimizer. This R package provides two variable splitting methods

* Alternating Method of Multipliers (ADMM)
* Alternating Minimization Algorithm (AMA)

for solving this convex formulation of the clustering problem. We seek the centroids $u_i$ that minimize

$\frac{1}{2} \sum_i || x_i - u_i||2^2 + \gamma \sum_l w{l} ||u_{l1} - u_{l2} ||$

Two penalty norms are currently supported: 1-norm and 2-norm.

Installation

You can install the stable version on [CRAN:] (http://cran.r-project.org/web/packages/cvxclustr/)

install.packages('cvxclustr', dependencies = TRUE)

The developmental version can be pulled from github using the R package [devtools] (http://cran.r-project.org/web/packages/devtools/index.html)

install_github('cvxclustr','echi')

or from the command line

git clone https://github.com/echi/cvxclustr.git	

R CMD build cvxclustr	

R CMD INSTALL cvxclustr_*.tar.gz

Remarks

Details on the algorithms are in the paper [Splitting Methods for Convex Clustering] (http://arxiv.org/abs/1304.0499) by Chi and Lange.

Previous takes on this formulation of clustering:

Just Relax and Come Clustering! A Convexification of k-Means Clustering by Lindsten, Ohlsson, and Ljung.

Clusterpath: An Algorithm for Clustering using Convex Fusion Penalties by Hocking, Joulin, Bach, and Vert.

About


Languages

Language:R 54.6%Language:Fortran 45.4%