pbnjay / clustering

Clustering algorithm implementations for Go

Threshold documentation

PaluMacil opened this issue

I'm new to ML, but the documents I read on single-linkage hierarchical clustering didn't mention a threshold. How do I pick one for single linkage? Or is that a different formula than what you have here?

Thank you.

Hierarchical clustering doesn't typically give you clusters; it gives you a tree. If you know k, the number of clusters you expect, you can cut the tree when there are k branches. This is done with the MaxClusters(k) Checker. If you don't know k, you can use a threshold to set a limit on when to stop agglomerating clusters. The right value depends entirely on your data and on the distance and linkage method you use.

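For single linkage, as you mention, the distance between two clusters is the distance between their closest pair of members, so the threshold is essentially the maximum distance you allow for the next merge. Once the two closest remaining clusters are farther apart than that, agglomeration stops.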

got it! Thanks for the teaching. 👍