Bug Report: Slower than k-means on `n=10,000` moon dataset
motiwari opened this issue · comments
Original comment: https://news.ycombinator.com/reply?id=35464068&goto=item%3Fid%3D35445312%2335464068
Hi Mo, thanks for this work. It seems interesting.
I had the chance to play with it a little and wanted to compare it with KMeans. I relied on sklearn's KMeans implementation.
Furthermore, I ran some examples (mostly the ones that are available). One interesting thing I did was generate some isotropic Gaussian blobs for clustering (using `make_blobs`) and then compare the two methods. BanditPAM was a little better on a couple of the metrics I used, and also much faster. That was with `n_samples=1000`, but when I increased it to `n_samples=10000` I found that it is much slower than KMeans; see [1], and the code is in [2]. Is there a particular reason for that?
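
For anyone who wants to reproduce the comparison, here is a minimal timing sketch. It is not the code from [2]. The helper names (`time_fit`, `benchmark`, `run_comparison`), the choice of 5 clusters, and the fixed seed are assumptions made for illustration. The `banditpam` call (`KMedoids(n_medoids=..., algorithm="BanditPAM")` followed by `fit(X, "L2")`) follows the package's documented usage as far as I know, so please check it against the version you have installed.

```python
import time


def time_fit(fit, X, repeats=3):
    """Return the best wall-clock time in seconds of fit(X) over several runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fit(X)
        best = min(best, time.perf_counter() - start)
    return best


def benchmark(fitters, make_data, sizes):
    """Time every fitter at every dataset size.

    Returns {name: [(n_samples, seconds), ...]}.
    """
    results = {name: [] for name in fitters}
    for n in sizes:
        X = make_data(n)  # the same data is handed to every method
        for name, fit in fitters.items():
            results[name].append((n, time_fit(fit, X)))
    return results


def run_comparison(sizes=(1000, 10000), k=5):
    """Compare KMeans and (if installed) BanditPAM on Gaussian blobs.

    k=5 and random_state=0 are illustrative assumptions, not values
    taken from the original report.
    """
    from sklearn.cluster import KMeans
    from sklearn.datasets import make_blobs

    def make_data(n):
        X, _ = make_blobs(n_samples=n, centers=k, random_state=0)
        return X

    fitters = {
        "KMeans": lambda X: KMeans(n_clusters=k, n_init=10, random_state=0).fit(X),
    }
    try:
        # Assumed API; verify against your installed banditpam version.
        from banditpam import KMedoids
        fitters["BanditPAM"] = lambda X: KMedoids(
            n_medoids=k, algorithm="BanditPAM"
        ).fit(X, "L2")
    except ImportError:
        print("banditpam is not installed; timing KMeans only")

    for name, rows in benchmark(fitters, make_data, sizes).items():
        for n, seconds in rows:
            print(f"{name:10s} n={n:6d}  {seconds:.3f}s")
```

Calling `run_comparison()` prints one line per method and dataset size. Taking the best of a few repeats reduces noise from one-off startup costs, though it will not hide a real difference in how the two methods scale with `n_samples`.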