him4318 / Gulpie

Repository from GitHub: https://github.com/him4318/Gulpie

Libraries

1. Pandas

2. scikit-learn

3. Matplotlib

4. Seaborn

5. Plotly

The task was to cluster restaurants using the Yelp dataset. The basic idea is that restaurants with similar facilities should end up in the same group. The main difficulty was that the data has a huge number of variables (very high dimensionality) and 90% of the attributes are categorical. To reduce the dimensionality, I converted every categorical variable into a binary value.
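As a rough illustration of turning categorical attributes into binary values, here is a minimal sketch using `pandas.get_dummies`. The toy data and the column names (`wifi`, `parking`, `takeout`) are assumptions made up for this example and are not taken from the Yelp dataset; the notebook may binarize the attributes differently.

```python
import pandas as pd

# Hypothetical toy data; column names and values are illustrative only.
restaurants = pd.DataFrame(
    {
        "wifi": ["free", "no", "free"],
        "parking": ["yes", "yes", "no"],
        "takeout": ["no", "yes", "no"],
    },
    index=["r1", "r2", "r3"],
)

# One binary (0/1) column per category value of each attribute.
binary = pd.get_dummies(restaurants).astype(int)
print(binary)
```

Each restaurant is now a row of 0s and 1s, which is the form the rest of the approach works with.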

For example, the value of the first row looks something like "0000000000000111011111100111...00". I converted this binary string into an integer. This greatly reduces the dimensionality, and restaurants with similar values end up close together. You can see this in the Gulpie notebook.
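The conversion from a row of binary flags to a single integer can be sketched as follows. This is a minimal illustration in plain Python, not the notebook's actual code; the helper `row_to_int` and the sample rows are hypothetical.

```python
def row_to_int(bits):
    """Join 0/1 flags into a bit string and parse it as a base-2 integer."""
    bit_string = "".join(str(int(b)) for b in bits)
    return int(bit_string, 2)


# Hypothetical rows of binary attributes (not taken from the dataset).
rows = {
    "r1": [0, 0, 1, 1, 1],
    "r2": [0, 0, 1, 1, 0],
    "r3": [1, 0, 0, 0, 0],
}

# Each restaurant collapses to one integer feature.
codes = {name: row_to_int(bits) for name, bits in rows.items()}
print(codes)
```

Python integers have arbitrary precision, so a long bit string does not overflow. Note that in this encoding the leftmost attributes carry the most weight: rows that differ only in the rightmost bits get nearby integers, while a difference in an early bit moves them far apart.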

3D Plotly graphs can be seen in the Gulpie2 notebook.

Languages

Language: Jupyter Notebook 100.0%