aadarshgupta1412 / K_Means_Clustering

K_Means_Clustering

AUTHOR : Aadarsh Gupta

Introduction

Clustering, or cluster analysis, is an unsupervised machine learning technique that groups unlabelled data points by similarity. K-Means clustering is a method of vector quantization, originally from signal processing, that aims to partition n observations into k clusters so that each observation belongs to the cluster with the nearest mean, which serves as the prototype of that cluster. This repository applies the K-Means clustering algorithm to an income dataset of 21 subjects, each described by age and income (the subjects' names are also included but are not needed for training).
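The "nearest mean" rule described above can be sketched in a few lines of NumPy. This is a minimal illustration of the standard Lloyd iteration, not the notebook's own code: the function name `k_means`, its parameters, and the sample points are all assumptions made for the example, and the points are made-up values rather than rows from the income dataset.

```python
import numpy as np


def k_means(points, k, n_iters=100, seed=0):
    """Lloyd's iteration: assign each point to its nearest mean, then update the means."""
    rng = np.random.default_rng(seed)
    # Initialise the centroids with k distinct observations picked at random.
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(n_iters):
        # Euclidean distance from every point to every centroid, shape (n, k).
        distances = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = distances.argmin(axis=1)
        # Recompute each centroid as the mean of its members;
        # keep the old centroid if a cluster happens to be empty.
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new_centroids, centroids):
            break  # assignments can no longer change
        centroids = new_centroids
    return labels, centroids


# Hypothetical (age, income) pairs already scaled to [0, 1]; NOT the real dataset.
points = np.array([
    [0.10, 0.15], [0.15, 0.10], [0.12, 0.20],
    [0.50, 0.90], [0.55, 0.95], [0.60, 0.85],
    [0.90, 0.40], [0.95, 0.45], [0.85, 0.35],
])
labels, centroids = k_means(points, k=3)
print(labels)
print(centroids)
```

Because the starting centroids are random, a single run can settle in a poor local optimum. Library implementations such as scikit-learn's `KMeans` typically restart from several initialisations and keep the best result.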

Packages Used

The following packages are used for training the model and for visualization:

  • pandas : for data processing and CSV file I/O
  • scikit-learn : for machine learning methods and pre-processing techniques
  • matplotlib : for data visualization and graphical plotting
  • seaborn : for statistical data visualization
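As a rough sketch of how these four packages might fit together, the example below builds a small table, rescales it, clusters it, and plots the result. Everything specific in it is an assumption for illustration: the column names `Name`, `Age`, and `Income`, the placeholder rows, the choice of `MinMaxScaler` as the pre-processing step, and `n_clusters=2`. The notebook itself may differ.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the script also runs without a display
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

# Placeholder rows standing in for the income CSV (hypothetical values and
# column names). With the downloaded file, pd.read_csv would be used instead.
df = pd.DataFrame({
    "Name": ["A", "B", "C", "D", "E", "F"],
    "Age": [26, 28, 29, 40, 42, 41],
    "Income": [45000, 52000, 48000, 150000, 160000, 155000],
})

# Age and income are on very different scales, so rescale both to [0, 1]
# before clustering; otherwise income would dominate the Euclidean distance.
features = ["Age", "Income"]
scaled = MinMaxScaler().fit_transform(df[features])

# n_clusters=2 suits this toy table only; the names column is not used.
model = KMeans(n_clusters=2, n_init=10, random_state=0)
df["Cluster"] = model.fit_predict(scaled)

# Scatter plot of the original values, coloured by cluster label.
ax = sns.scatterplot(data=df, x="Age", y="Income", hue="Cluster")
ax.set_title("K-Means clusters (toy data)")
plt.tight_layout()
```

In a Jupyter notebook the figure renders inline; in a script, `plt.savefig(...)` can be called afterwards to write it to a file.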

Requirements

  • Download the dataset from here
  • Download this dataset, extract it, and store it on your local disk
