AUTHOR : Aadarsh Gupta
Clustering or cluster analysis is a machine learning technique, which groups the unlabelled dataset. K-Means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters in which each observation belongs to the cluster with the nearest mean, serving as a prototype of the cluster. The following is the implementation of K-Means Clustering algorithm to a given income dataset of 21 subjects consisting of age and income (resepective names are given, but not required for training).
The following packages have been used while training of the model and visualization puposes :
- pandas : for data processing operations I/O in CSV file
- scikit-learn : to implement machine learning methods and pre-processing techniques
- matplotlib : for data visualization and graphical plotting
- seaborn : for data visualization for statistical graphics plotting
- Download the dataset from here
- Download this dataset, extract and store it in localdisk