Clustering from scratch
This repo contains implementation of k-means and DBSCAN algorithm from scratch on a sample dataset.
Dataset
In the below figure, green and blue points represent cluster 1 and cluster 2 respectively. Red points represent noise.
K-Means output
Applying K-means clustering algorithm for given dataset with k=2,
TRUE POSITIVE RATE FOR CLUSTER-1 = 15%
TRUE POSITIVE RATE FOR CLUSTER-2 = 16%
No noise points
DBSCAN output As observed from the above figure and also from code, we get epsilon=1.22 for given data and k=4. Applying DBSCAN algorithm with a value of k=4,
TRUE POSITIVE RATE FOR CLUSTER-1 = 100%
TRUE POSITIVE RATE FOR CLUSTER-2 = 100%
Thus, DBSCAN performs better than k-means for the given dataset from figures and true positive rates.