kaurao / hopkinsstatistic

R function to calculate the Hopkins statistic of clustering tendency

Hopkins statistic

R function to calculate the Hopkins statistic of clustering tendency (H).

For background on why the Hopkins statistic is worth calculating, see this blog entry: https://www.r-bloggers.com/assessing-clustering-tendency-a-vital-issue-unsupervised-machine-learning/

However, this blog entry (and the original post it is derived from) contains a mismatch between the equation for H and the implementations in the packages "clustertend" and "factoextra". According to the equation, H should increase with clustering tendency, but the clustertend and factoextra implementations do the opposite (they calculate 1-H).

The function provided here implements H as per the equation, which is also the definition given on Wikipedia: https://en.wikipedia.org/wiki/Hopkins_statistic.
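
For reference, the Wikipedia definition can be written out as below. The notation follows the Wikipedia article and is not necessarily the naming used in this repository's code: X is the data set of n points in d dimensions, m points are sampled from X without replacement, and m further points are drawn uniformly at random from the data space. u_i is the distance from the i-th uniform point to its nearest neighbour in X, and w_i is the distance from the i-th sampled data point to its nearest neighbour in X.

$$
H = \frac{\sum_{i=1}^{m} u_i^{d}}{\sum_{i=1}^{m} u_i^{d} + \sum_{i=1}^{m} w_i^{d}},
\qquad
1 - H = \frac{\sum_{i=1}^{m} w_i^{d}}{\sum_{i=1}^{m} u_i^{d} + \sum_{i=1}^{m} w_i^{d}}
$$

In clustered data the w_i are small relative to the u_i, so H approaches 1, while 1-H approaches 0. For data without cluster structure both sums are of similar size and H is close to 0.5.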

It also uses the FNN package for the nearest-neighbour searches, which makes it much faster than the implementations in both "clustertend" and "factoextra".
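
To illustrate the idea, here is a minimal sketch of how H could be computed with FNN. It is not the function shipped in this repository: the name `hopkins_sketch`, the default for `m`, and the choice to draw the uniform points from the bounding box of the data are all assumptions made for this example.

```r
library(FNN)  # provides get.knn() and get.knnx()

# Hypothetical helper for illustration; not the repository's function.
# X: numeric matrix or data frame (rows = observations)
# m: number of sampled points (assumed default: 10% of the rows)
hopkins_sketch <- function(X, m = ceiling(nrow(X) / 10)) {
  X <- as.matrix(X)
  n <- nrow(X)
  d <- ncol(X)
  stopifnot(m >= 1, m < n)

  # m real observations, sampled without replacement
  idx <- sample.int(n, m)

  # m artificial points, uniform over the bounding box of X (an assumption)
  lo <- apply(X, 2, min)
  hi <- apply(X, 2, max)
  Y <- matrix(runif(m * d,
                    min = rep(lo, each = m),
                    max = rep(hi, each = m)),
              nrow = m, ncol = d)

  # u: distance from each artificial point to its nearest real observation
  u <- get.knnx(data = X, query = Y, k = 1)$nn.dist[, 1]

  # w: distance from each sampled observation to its nearest *other* observation
  # (get.knn() excludes the point itself)
  w <- get.knn(X, k = 1)$nn.dist[idx, 1]

  # H increases with clustering tendency
  sum(u^d) / (sum(u^d) + sum(w^d))
}

# Usage: H should be near 0.5 for uniform data and close to 1 for clustered data
set.seed(1)
uniform   <- matrix(runif(400), ncol = 2)
clustered <- rbind(matrix(rnorm(200, 0, 0.05), ncol = 2),
                   matrix(rnorm(200, 1, 0.05), ncol = 2))
hopkins_sketch(uniform)
hopkins_sketch(clustered)
```

Because the points are sampled at random, the value varies from run to run; fix the seed with `set.seed()` or average over several runs if a stable number is needed.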
