Arvindhh931 / Employee_health_expenditure-EDA

Deriving insights for Employee health expenditure planning

Exploratory Data Analysis on the Employee Health Expenditure Dataset

Objectives -

  • Understand the data through descriptive statistics using pandas and NumPy.
  • Derive insights from the descriptive statistics.
  • Visualize the data using the Matplotlib and Seaborn libraries.
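
As a rough illustration of the first two objectives, the sketch below computes a few descriptive statistics with pandas and NumPy. The rows are made-up sample values, not taken from the actual dataset; only the column names follow the data dictionary.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; column names follow the data dictionary,
# but the values are invented for illustration only.
df = pd.DataFrame({
    "Age": [25, 40, 31, 58],
    "BMI": [22.5, 31.0, 27.4, 24.1],
    "Smoker": ["no", "yes", "no", "yes"],
    "Expenditure": [3200.0, 21000.0, 4500.0, 27500.0],
})

# Count, mean, standard deviation, and quartiles for the numeric columns.
numeric_summary = df.describe()

# Frequency of each category in a categorical column.
smoker_counts = df["Smoker"].value_counts()

# Pairwise Pearson correlation between the numeric columns.
correlations = df[["Age", "BMI", "Expenditure"]].corr()

# NumPy operates directly on the underlying column values.
median_expenditure = np.median(df["Expenditure"].to_numpy())

print(numeric_summary)
print(smoker_counts)
print(correlations)
print("Median expenditure:", median_expenditure)
```

On the real data, the same calls would run on the DataFrame loaded from the dataset file instead of the hand-built one.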

Data Dictionary

  • id : Employee unique identification number

  • Age : age of primary beneficiary

  • Sex : insurance contractor gender --> female, male

  • BMI : Body mass index, an objective index of body weight relative to height (weight in kg divided by the square of height in m, kg / m ^ 2) that indicates whether weight is high or low for a given height, ideally 18.5 to 24.9

  • dependent : Number of children covered by health insurance / Number of dependents --> discrete values from 0-4

  • Alcohol : Alcohol categories --> 'daily', 'weekend', 'rarely', 'party', 'no'

  • Smoker : Smoking categories --> 'yes','no'

  • Zone : The beneficiary's residential area in the US --> 'northeast', 'southeast', 'southwest', 'northwest'

  • Expenditure : Individual medical costs billed by health insurance
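
The data dictionary above can be turned into a small validation step. The following is a minimal sketch, assuming the columns are named exactly as listed; the rows are hypothetical, and the `allowed` mapping, `bmi_in_ideal_range`, and `dependents_valid` are names introduced here for illustration.

```python
import pandas as pd

# Hypothetical rows using the column names from the data dictionary.
df = pd.DataFrame({
    "id": [101, 102, 103],
    "Age": [29, 45, 52],
    "Sex": ["female", "male", "female"],
    "BMI": [17.9, 23.4, 31.2],
    "dependent": [0, 2, 4],
    "Alcohol": ["rarely", "weekend", "no"],
    "Smoker": ["no", "yes", "no"],
    "Zone": ["northeast", "southwest", "southeast"],
    "Expenditure": [2100.0, 19800.0, 9400.0],
})

# Allowed values for each categorical column, as listed in the data dictionary.
allowed = {
    "Sex": ["female", "male"],
    "Alcohol": ["daily", "weekend", "rarely", "party", "no"],
    "Smoker": ["yes", "no"],
    "Zone": ["northeast", "southeast", "southwest", "northwest"],
}

# Convert to categorical dtype; any value outside the list becomes NaN,
# which makes unexpected categories easy to spot.
for column, categories in allowed.items():
    df[column] = pd.Categorical(df[column], categories=categories)

unexpected_values = df[list(allowed)].isna().sum()

# Flag rows whose BMI falls inside the ideal range (18.5 to 24.9, inclusive).
df["bmi_in_ideal_range"] = df["BMI"].between(18.5, 24.9)

# Check that the number of dependents stays within 0-4.
dependents_valid = df["dependent"].between(0, 4).all()

print(unexpected_values)
print(df[["BMI", "bmi_in_ideal_range"]])
print("Dependents within 0-4:", dependents_valid)
```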

Outcomes -

  • Overview of the data by slicing across dimensions.
  • Hands-on practice of data wrangling using the pandas library.
  • Hands-on practice with visualization libraries like Matplotlib and Seaborn.
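
One possible way to slice the data across dimensions is sketched below: mean expenditure grouped by Zone and Smoker, followed by a simple bar chart. The rows are hypothetical, the variable names are illustrative, and the chart uses pandas' Matplotlib-backed plotting rather than Seaborn to keep the example short.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical rows; values are invented for illustration only.
df = pd.DataFrame({
    "Zone": ["northeast", "northeast", "southwest", "southwest"],
    "Smoker": ["yes", "no", "yes", "no"],
    "Expenditure": [24000.0, 5000.0, 20000.0, 4000.0],
})

# Slice along one dimension: mean expenditure per smoking category.
mean_by_smoker = df.groupby("Smoker")["Expenditure"].mean()

# Slice along two dimensions: mean expenditure per Zone and Smoker.
mean_by_zone_smoker = df.pivot_table(
    index="Zone", columns="Smoker", values="Expenditure", aggfunc="mean"
)

# Grouped bar chart: one group per zone, one bar per smoking category.
ax = mean_by_zone_smoker.plot(kind="bar")
ax.set_xlabel("Zone")
ax.set_ylabel("Mean expenditure")
ax.set_title("Mean expenditure by zone and smoking status")
plt.tight_layout()

print(mean_by_smoker)
print(mean_by_zone_smoker)
```

In a Jupyter notebook the chart is rendered inline; in a script, `plt.show()` or `plt.savefig(...)` would be needed to view or store it.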

Languages

Language:Jupyter Notebook 100.0%