mlampros / ClusterR

Gaussian mixture models, k-means, mini-batch-kmeans and k-medoids clustering

Home Page: https://mlampros.github.io/ClusterR/

Optimal_Clusters_GMM warning number of columns

FMKerckhof opened this issue

I am not sure whether this is a bug or intended behavior, but when using ClusterR::Optimal_Clusters_GMM I get a warning that the number of columns of the data should be larger than 'max_clusters'. It is triggered by:

if (ncol(data) < max(max_clusters) && verbose) {
  warning("the number of columns of the data should be larger than the maximum value of 'max_clusters'", call. = F)
  cat(" ", '\n')
}

However, from the examples I would assume that we are clustering observations (rows) rather than parameters (columns). Shouldn't the number of rows therefore be used to trigger this warning?
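
To make the question concrete, here is a minimal base R sketch comparing the quoted column-based condition with a row-based one. The matrix dimensions and the `max_clusters` value are made up for illustration and are not taken from the package examples.

```r
# Hypothetical data: 100 observations (rows) described by 3 variables (columns).
set.seed(1)
data <- matrix(rnorm(100 * 3), nrow = 100, ncol = 3)
max_clusters <- 10

# Condition as quoted above: compares the number of columns with max_clusters.
column_check <- ncol(data) < max(max_clusters)

# Alternative asked about here: compares the number of rows with max_clusters.
row_check <- nrow(data) < max(max_clusters)

column_check  # TRUE: the warning fires even though there are 100 observations
row_check     # FALSE: no warning, since 100 observations exceed 10 clusters
```

With the column-based condition, any data set with fewer variables than candidate clusters would trigger the warning, regardless of how many observations it has.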

@FMKerckhof yes, that's true. Based on the Armadillo documentation:

The k parameter indicates the number of centroids; the number of samples in the data matrix should be much larger than k

Give me a few days and I'll fix both the code and the documentation of the package.

I just updated the code; it should now show the warning if the number of clusters is larger than the number of observations.
I'll close the issue for now. Feel free to re-open it if the code does not work as expected.
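
For readers who want to see what a row-based check along these lines could look like, here is a rough sketch. It is not the actual ClusterR code: the helper name `check_max_clusters` and the wording of the warning message are assumptions made for illustration only.

```r
# Hypothetical helper, NOT the actual ClusterR implementation.
# Warns when there are fewer observations (rows) than the largest
# requested number of clusters, and returns the result of the check.
check_max_clusters <- function(data, max_clusters, verbose = TRUE) {
  too_few_rows <- nrow(data) < max(max_clusters)
  if (too_few_rows && verbose) {
    warning("the number of rows of the data should be larger than the maximum value of 'max_clusters'",
            call. = FALSE)
  }
  invisible(too_few_rows)
}

# 100 observations, up to 10 clusters: no warning is raised.
tall <- matrix(0, nrow = 100, ncol = 3)
res_tall <- check_max_clusters(tall, 10)

# 5 observations, up to 10 clusters: the warning is raised
# (suppressed here so that only the returned flag is kept).
short <- matrix(0, nrow = 5, ncol = 3)
res_short <- suppressWarnings(check_max_clusters(short, 10))
```

Using `max(max_clusters)` mirrors the condition quoted in the original report, so the sketch also works when `max_clusters` is a vector of candidate cluster counts.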