kennedyCzar / HIGH-DIMENSIONAL-DATA-CLUSTERING

Implementation of hierarchical clustering on a dataset with few samples and very high dimensionality, together with visualizations of the results, implemented in R and Python.

HIGH-DIMENSIONAL-DATA-CLUSTERING

Implementation of hierarchical clustering on a dataset with few samples and very high dimensionality. The analysis aims to provide:

  1. A clear visual representation of the data in which outliers and differently trending data are easy to spot.

  2. A comparison of the five subgroups within a cluster of devices, highlighting any device that trends differently from, or higher than, the others.

  3. Device clusters among like kinds (device names are similar across the five subgroups) that show which devices fall outside the pack, i.e. the outliers.
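
As a rough illustration of the kind of clustering described above, here is a minimal sketch using SciPy. The README does not state which linkage, distance metric, or preprocessing the repository uses, so Ward linkage, Euclidean distance, and per-feature standardization are assumptions, and the data is a synthetic stand-in (20 samples, 500 features) rather than the device dataset.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

# Synthetic stand-in for the real data (an assumption): 20 samples with
# 500 features each, drawn around 4 well-separated centers.
rng = np.random.default_rng(0)
n_per_group, n_features, n_groups = 5, 500, 4
centers = rng.normal(scale=5.0, size=(n_groups, n_features))
X = np.vstack([c + rng.normal(size=(n_per_group, n_features)) for c in centers])

# Standardize each feature so that no single dimension dominates the distances.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)

# Agglomerative clustering; Ward linkage on Euclidean distances is assumed here.
Z = linkage(X_std, method="ward")

# Cut the tree into at most 4 flat clusters.
labels = fcluster(Z, t=4, criterion="maxclust")
print(labels)
```

The linkage matrix `Z` can also be passed to `scipy.cluster.hierarchy.dendrogram` to draw the tree, which is one way to make samples that merge late (potential outliers) visible.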
    

Clustering results

alt image

alt image2

alt image3

The optimal number of clusters, k, is 4 in most cases.

alt image4
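
The README does not say how the value of k was chosen. One common approach, sketched here purely as an assumption, is to cut the dendrogram at several values of k and keep the one with the highest mean silhouette score. The data is again synthetic (four groups by construction), and `mean_silhouette` is a hypothetical helper written for this example, not a function from the repository.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.spatial.distance import pdist, squareform


def mean_silhouette(D, labels):
    """Mean silhouette coefficient computed from a square distance matrix D."""
    scores = []
    for i in range(len(labels)):
        same = labels == labels[i]
        n_same = same.sum()
        if n_same == 1:
            scores.append(0.0)  # convention: a singleton cluster scores 0
            continue
        # Mean distance to the other members of the same cluster (D[i, i] is 0).
        a = D[i, same].sum() / (n_same - 1)
        # Smallest mean distance to the members of any other cluster.
        b = min(D[i, labels == other].mean()
                for other in np.unique(labels) if other != labels[i])
        scores.append((b - a) / max(a, b))
    return float(np.mean(scores))


# Synthetic stand-in data (an assumption): 4 groups of 5 samples, 500 features.
rng = np.random.default_rng(0)
centers = rng.normal(scale=5.0, size=(4, 500))
X = np.vstack([c + rng.normal(size=(5, 500)) for c in centers])

Z = linkage(X, method="ward")   # Ward linkage is an assumption
D = squareform(pdist(X))        # pairwise Euclidean distances

# Score every candidate k and keep the best one.
scores = {}
for k in range(2, 9):
    labels = fcluster(Z, t=k, criterion="maxclust")
    scores[k] = mean_silhouette(D, labels)

best_k = max(scores, key=scores.get)
print(best_k, scores[best_k])
```

On real data the silhouette curve is usually flatter than on this toy example, so it is worth inspecting the scores for neighboring values of k rather than relying on the maximum alone.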

License:MIT License


Languages

Python 87.6%, R 12.4%