lukaszbrzozowski / msts

Selected algorithms for MST-based clustering

Geek Repo:Geek Repo

Github PK Tool:Github PK Tool

Clustering with Minimum Spanning Trees (MSTs)

Python implementations of a few algorithms to cluster data based on minimum spanning trees, including:

  1. HEMST (Grygorash, Zhou, Jorgensen, 2006)
  2. CTCEHC (Ma et al., 2021)
  3. MSDR [experimental]
  4. Zahn's method (1971) [experimental]

For other algorithms based on MSTs, see:

  1. single linkage in sklearn and fastcluster
  2. genieclust – the Genie algorithm
  3. ITM – information-theoretic clustering
  4. hdbscan

References

HEMST and MSDR: Grygorash, O., Zhou, Y., Jorgensen, Z., 2006. Minimum spanning tree based clustering algorithms, in: Proc. ICTAI’06, pp. 1–9. DOI: 10.1109/ICTAI.2006.83.

CTCEHC: Ma, Y., Lin, H., Wang, Y., Huang, H., He, X., 2021. A multi-stage hierarchical clustering algorithm based on centroid of tree and cut edge constraint. Information Sciences 557, 194–219. DOI: 10.1016/j.ins.2020.12.016.

Zahn's algoritm: Zahn, C., 1971. Graph-theoretical methods for detecting and describing gestalt clusters. IEEE Transactions on Computers C-20, 68–86. DOI: 10.1109/T-C.1971.223083.

Campello R.J.G.B., Moulavi D., Sander J. (2013). Density-based clustering based on hierarchical density estimates. Lecture Notes in Computer Science 7819:160–172, DOI:10.1007/978-3-642-37456-2_14.

Gagolewski M. (2022). A framework for benchmarking clustering algorithms, SoftwareX 20:101270. URL: https://clustering-benchmarks.gagolewski.com. DOI: 10.1016/j.softx.2022.101270.

Gagolewski M., Bartoszuk M., Cena A. (2016). Genie: A new, fast, and outlier-resistant hierarchical clustering algorithm. Information Sciences 363:8–23. DOI: 10.1016/j.ins.2016.05.003.

Gagolewski M. (2021). genieclust: Fast and robust hierarchical clustering. SoftwareX 15:100722. URL: https://genieclust.gagolewski.com/. DOI: 10.1016/j.softx.2021.100722.

McInnes L., Healy J., Astels S. (2017). hdbscan: Hierarchical density based clustering. The Journal of Open Source Software 2(11):205. DOI: 10.21105/joss.00205.

Müller A.C., Nowozin S., Lampert C.H. (2012). Information theoretic clustering using minimum spanning trees. In: Proc. German Conference on Pattern Recognition. URL: https://github.com/amueller/information-theoretic-mst.

Müllner D. (2013). fastcluster: Fast hierarchical, agglomerative clustering routines for R and Python. Journal of Statistical Software 53(9):1–18. DOI: 10.18637/jss.v053.i09.

About

Selected algorithms for MST-based clustering

License:BSD 3-Clause "New" or "Revised" License


Languages

Language:Python 100.0%