adityakumaar / DataAnalytics-Internship-Project

This project is for a Data Analytics Internship.

DataAnalytics-Internship-Project

This is an Exploratory Data Analysis (EDA) of the customer data provided by the company.

As an intern, my tasks were to:

  1. Clean the dataset and standardise the data
  2. Perform Exploratory Data Analysis on the dataset
  3. Analyse the results and build simple visualizations
  4. Extract insights from the data and visualizations
  5. Create a presentation about the findings
  6. Determine the states and cities in which the company should expand its business to increase sales and profit

Note: The data consists mostly of Likert values (scale data), so in my opinion bar plots are the most suitable visualization.

We have the customer dataset provided by the company. This dataset contains only Likert and categorical data.
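
As an illustration of why bar plots suit Likert data, here is a minimal sketch that counts responses per scale point and draws them as bars. The ratings are made-up values on an assumed 1 to 5 scale, not the company's data, and the axis labels are placeholders.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend, so the sketch runs without a display
import pandas as pd

# Hypothetical Likert responses (assumed scale: 1 = lowest, 5 = highest).
responses = pd.Series([5, 4, 4, 3, 5, 2, 4, 5, 1, 4], name="rating")

# Count each rating and keep every scale point, even ones nobody chose,
# so the x-axis always shows the full scale in order.
counts = responses.value_counts().reindex(range(1, 6), fill_value=0)

# One bar per scale point; bar height is the number of customers.
ax = counts.plot(kind="bar")
ax.set_xlabel("Rating")
ax.set_ylabel("Number of customers")
```

Reindexing onto the full scale matters for Likert data: a rating that no customer selected still appears as an empty bar instead of silently disappearing from the axis.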


Previous Operations Performed on The DataSet:
1. Renamed all the columns
2. Standardized the data
3. Sorted the cities based on given states
4. Deleted extra columns which contained personal information of the customers
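
The preparation steps listed above could look roughly like the following sketch. The column names, the sample rows, and the helper `clean` are all hypothetical, and "standardized" is assumed here to mean trimming whitespace and normalising capitalisation; the original notebook may have done this differently.

```python
import pandas as pd

# Hypothetical raw data; the real column names and values are not given in this README.
raw = pd.DataFrame({
    "Customer Name": ["A", "B"],
    "Phone No.": ["000", "111"],
    "City ": [" pune", "MUMBAI "],
    "State": ["maharashtra", "Maharashtra"],
})

def clean(df, personal_columns, text_columns):
    """Drop personal columns, rename columns, standardise text, sort by state and city."""
    out = df.drop(columns=personal_columns)
    # Rename: trim, lower-case, and replace spaces with underscores.
    out.columns = [c.strip().lower().replace(" ", "_") for c in out.columns]
    # Standardise: strip stray whitespace and use consistent capitalisation.
    for col in text_columns:
        out[col] = out[col].str.strip().str.title()
    # Group the cities under their states.
    return out.sort_values(["state", "city"]).reset_index(drop=True)

cleaned = clean(raw, ["Customer Name", "Phone No."], ["city", "state"])
```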


Operations Performed on This DataSet:
1. Importing required libraries
2. General overview of the dataset
3. Visualizing data for better understanding
4. Creating an algorithm for plotting Ratings Data against Conditions
5. Visualizing data using this algorithm
6. Conclusion and summary of our findings
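
The README does not show the plotting algorithm itself. One plausible reading of "plotting ratings data against conditions" is a cross-tabulation of a rating column against a categorical column, sketched below. The function name `ratings_by_condition`, the column names, and the sample rows are assumptions for illustration only.

```python
import pandas as pd

def ratings_by_condition(df, rating_col, condition_col):
    """Count how often each rating occurs within each category of condition_col."""
    return pd.crosstab(df[condition_col], df[rating_col])

# Hypothetical survey rows; not the project's data.
survey = pd.DataFrame({
    "house_type": ["Owned", "Rented", "Owned", "Owned", "Rented"],
    "price_importance": [5, 3, 4, 5, 3],
})

table = ratings_by_condition(survey, "price_importance", "house_type")
# table.plot(kind="bar") would then draw one group of bars per house type,
# with one bar per rating value inside each group.
```

Keeping the counting step separate from the plotting step means the same table can be inspected, filtered by city, or plotted without recomputing it.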


Goal of The Project:
1. Understand the dataset
2. Extract some insights from it
3. Plot graphs for better understanding of the dataset
4. Select cities for the company to expand its business and get maximum profit
5. Conclude the EDA with solid reasons for expanding business in selected cities


Table of Contents:
1. Importing required Libraries and DataSet
2. Description of the DataSet
3. Analysing missing values and then plotting a heatmap for visualization
4. Plotting a Correlational Heatmap for the data
5. Counting the number of projects in each city and plotting graphs for visualization
6. Counting the ratings given by customers on factors that can affect their purchase decisions and plotting visualizations
7. Sorting cities that have a considerable number of projects and a suitable type of customer housing
8. Analysing and visualizing customers based on these categories:
   a. Average Monthly Income
   b. Type of House
   c. Active EMI
   d. Percentage of Roof for Solar Installation
   e. Home Loan
   f. Type of Organization
   g. Expectations on Saving Electricity Bill
   i. Maximum Investment in Solar Technology
   j. Payment Method
9. Creating a custom algorithm for plotting data against required conditions
10. Plotting data using this algorithm and getting valuable insights from it
11. Conclusion, List of cities for expanding business and Summary of findings
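
Steps 3 to 5 of the outline above can be sketched with plain pandas as follows. The column names and values are hypothetical, and the sketch assumes that each row of the dataset represents one project.

```python
import pandas as pd

# Hypothetical rows; each row is assumed to represent one project.
df = pd.DataFrame({
    "city": ["Pune", "Mumbai", "Pune", "Nagpur", "Pune"],
    "monthly_income": [50000, None, 42000, 61000, None],
    "roof_percentage": [40, 60, None, 80, 50],
})

# Step 3: missing values per column. The boolean frame df.isna() is what a
# heatmap such as seaborn.heatmap(df.isna()) would visualise.
missing_per_column = df.isna().sum()

# Step 4: pairwise correlation of the numeric columns, the usual input
# to a correlation heatmap.
correlation = df[["monthly_income", "roof_percentage"]].corr()

# Step 5: number of projects in each city, ready for a bar plot.
projects_per_city = df["city"].value_counts()
```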

Languages

Jupyter Notebook 86.4%, Python 13.6%