KMenz / Life_Expectancy_In_US

Depicts Life Expectancy Rates In The US and Possible Variables Affecting It

Analyzing Life Expectancy In The US

By Kevin Menz, Angela Spirou, Lauren Gama, Myke London, and Connor Healy

The goal of this project was to locate data on life expectancy across the United States over time and to analyze it with data science tools in order to draw conclusions about how life expectancy differs across several variables.

The Datasets

  • https://healthinequality.org/data/ - The Health Inequality Project - Set of multiple datasets
  • https://data.hrsa.gov/data/download - Health Resources & Services Administration
  • https://hifld-geoplatform.opendata.arcgis.com/ - Homeland Infrastructure Foundation-Level Data

The Tools

The project was done almost entirely in Python and Jupyter Notebook, using a variety of libraries including:

  • Pandas
  • Matplotlib
  • Seaborn
  • NumPy
  • SciPy

The Process

We split up the work so that each person answered a question about a different variable that could potentially affect life expectancy (LE). The questions were:

  • Are there differences in the national LE by gender?
  • Are there differences in national LE by income?
  • Which states have the highest/lowest LE?
  • Is LE affected by the percent of a state's population that is uninsured, the percent that is African American, or the percent that is Hispanic?
  • Are there differences in state LE by access to quality health care?

Each person then took on a question by cleaning the data, analyzing it, and reporting their findings.

Analysis and Answers

Q1 - Gender

Used the Health Inequality Project dataset. The data was first pulled into Jupyter for cleaning, and the columns were cut down to only those necessary for analysis.

The analysis involved creating individual dataframes for males and females, then computing the national average life expectancy per year for each gender. We then created a line graph to visualize the difference between the averages.

Project 1 Line Graph

We found that the female average LE is 85.54 years while the male average LE is 81.81 years.
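
The per-gender averaging described above could be sketched as follows. This is a minimal illustration, not the project's notebook code: the column names (`year`, `gender`, `le`) and the numbers are invented for the example.

```python
import pandas as pd

# Toy rows standing in for the cleaned dataset.
# Column names and values are assumptions made for this sketch.
df = pd.DataFrame({
    "year":   [2001, 2001, 2002, 2002, 2001, 2001, 2002, 2002],
    "gender": ["F", "M", "F", "M", "F", "M", "F", "M"],
    "le":     [85.0, 81.0, 85.4, 81.6, 86.0, 82.0, 86.2, 82.4],
})

# One dataframe per gender.
females = df[df["gender"] == "F"]
males = df[df["gender"] == "M"]

# National average LE per year for each gender.
female_by_year = females.groupby("year")["le"].mean()
male_by_year = males.groupby("year")["le"].mean()

# Side-by-side table, ready for a line plot.
yearly = pd.DataFrame({"female": female_by_year, "male": male_by_year})
yearly["gap"] = yearly["female"] - yearly["male"]
print(yearly)
```

Calling `yearly[["female", "male"]].plot()` on a table shaped like this would produce one line per gender.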

Q2 - Differences by Income

Used the Health Inequality Project dataset. The data was cleaned by cutting down columns and grouping the income percentiles into quartiles for easier analysis.

Project 1 Gender Cleaning
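
One way to bin income percentiles into quartiles is `pandas.cut`. The sketch below assumes a percentile column running from 1 to 100; the column names (`pctile`, `le`) and the values are hypothetical.

```python
import pandas as pd

# Hypothetical cleaned rows: income percentile (1-100) and LE.
df = pd.DataFrame({
    "pctile": [5, 20, 30, 50, 60, 75, 80, 100],
    "le":     [78.0, 79.0, 81.0, 82.0, 83.5, 84.5, 86.0, 87.0],
})

# Right-closed bins: (0, 25], (25, 50], (50, 75], (75, 100].
df["quartile"] = pd.cut(
    df["pctile"],
    bins=[0, 25, 50, 75, 100],
    labels=["Q1", "Q2", "Q3", "Q4"],
)

# Average LE within each income quartile.
le_by_quartile = df.groupby("quartile", observed=True)["le"].mean()
print(le_by_quartile)

# Difference between the top and bottom quartiles.
q4_minus_q1 = le_by_quartile["Q4"] - le_by_quartile["Q1"]
print(q4_minus_q1)
```

Fixed bin edges keep the quartile boundaries identical across genders and years, which makes the later Q1 vs. Q4 comparison straightforward.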

We then visualized the quartiles using a line chart and a box plot.

Project 1 Income Line 1

Project 1 Income Box

We also wanted to compare how females and males differed within individual quartiles, so we looked at Q1 and Q4 to see the difference.

Project 1 Female LE Income

Project 1 Male LE Income

Q3 - States with High/Low LE

Used the Health Inequality Project dataset. The data was cleaned by cutting down columns; we then decided to use the average LE across the four income quartiles as the measure of LE for each state.

Plotted the top 5 and bottom 5 states for males and for females, showing the opposite gender alongside as a comparison.
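
Ranking states like this can be done with `nlargest` and `nsmallest`. The sketch below uses placeholder state names and invented values, and picks the top and bottom 2 rather than 5 to keep the toy table small.

```python
import pandas as pd

# Placeholder states and invented LE values, one row per state.
state_le = pd.DataFrame({
    "state":     ["State A", "State B", "State C", "State D", "State E", "State F"],
    "le_male":   [80.0, 82.5, 79.0, 81.0, 83.0, 80.5],
    "le_female": [83.5, 85.5, 82.0, 84.5, 86.0, 84.0],
}).set_index("state")

N = 2  # the project used 5

# Rank by male LE; the female column stays alongside for comparison.
top_male = state_le.nlargest(N, "le_male")
bottom_male = state_le.nsmallest(N, "le_male")

# Average female-minus-male gap across states.
avg_gap = (state_le["le_female"] - state_le["le_male"]).mean()

print(top_male)
print(bottom_male)
print(avg_gap)
```

Repeating the ranking with `"le_female"` gives the female top and bottom lists, and `top_male.plot.bar()` would draw both genders per state.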

Project 1 State High Male Project 1 State High Female

This showed that the state with the highest male LE is Montana at 83.08 years and the state with the highest female LE is Vermont at 86.18 years. This is consistent with our analysis earlier that females tend to live longer in general.

Project 1 State Low Male Project 1 State Low Female

This showed us that Nevada has the lowest LE for both males and females, at 79.64 and 86.18 years respectively. Other takeaways from this analysis were that there is an average difference of 3.3 years per state between genders, and that the top and bottom states differ between genders, if only slightly, with a few states in common.

Q4 - LE vs Uninsured, African American population, Hispanic population

Used the Health Inequality Project dataset. The data was cleaned by cutting down columns; we then computed the average of each column of interest, weighted by county population, to get state-level values. With the state averages calculated, we merged this dataframe with the state dataframe from Q3 to include the life expectancies by state.
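
A population-weighted state average followed by a merge might look like the sketch below. The table layout, column names, and numbers are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical county-level rows.
counties = pd.DataFrame({
    "state":         ["X", "X", "Y", "Y"],
    "population":    [100, 300, 200, 200],
    "pct_uninsured": [10.0, 20.0, 5.0, 15.0],
})

# Weighted mean per state = sum(value * population) / sum(population).
counties["weighted"] = counties["pct_uninsured"] * counties["population"]
sums = counties.groupby("state")[["weighted", "population"]].sum()
state_uninsured = (sums["weighted"] / sums["population"]).rename("pct_uninsured")

# Hypothetical state-level LE table, standing in for the Q3 dataframe.
state_le = pd.DataFrame({"state": ["X", "Y"], "le": [80.0, 83.0]})

# Attach the weighted averages to the LE table.
merged = state_le.merge(state_uninsured.reset_index(), on="state")
print(merged)
```

Weighting by population keeps small counties from pulling the state figure as strongly as large ones, which a plain mean over counties would allow.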

Project 1 State Uninsured

By charting the percent uninsured by state and comparing it against Q3's LE bar chart, we can see that several states, including Vermont and Montana, rank among those with both the highest share of insured residents and the greatest LE. Conversely, states such as Nevada have the lowest LE and rank among the most uninsured.

Project 1 Scatter

We created a scatter plot to see whether there is a correlation between state average LE and percent uninsured by state. Because the chart did not appear to show a clear relationship between the variables, we ran a Pearson correlation. It revealed a negative correlation: LE tends to decrease as the percent of a state’s uninsured population increases.
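
For reference, the Pearson coefficient is the covariance of the two variables divided by the product of their standard deviations. The sketch below computes it by hand on invented numbers; `scipy.stats.pearsonr` returns the same coefficient together with a p-value.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products of deviations; the 1/n factors cancel in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Invented state-level values, for illustration only.
pct_uninsured = [8.0, 10.0, 12.0, 15.0, 20.0]
state_le = [84.0, 83.0, 83.5, 82.0, 80.0]

r = pearson_r(pct_uninsured, state_le)
print(r)  # negative: LE falls as percent uninsured rises in this toy data
```

A coefficient near -1 or 1 indicates a strong linear relationship and one near 0 a weak one. The coefficient measures association only and does not by itself show that one variable causes the other.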

We then created scatter plots to see the relationship between State Average LE and Percent of a State’s African-American and Hispanic population.

Project 1 Scatter AA Project 1 Scatter His

Again, the charts did not seem to show a clear relationship, so we ran a Pearson correlation for both. Both correlations were negative: LE tends to decrease as the percent of a state’s African-American or Hispanic population increases.

Q5 - State LE vs. Quality Health Care

Used the HIFLD data for county and state hospital data and the HRSA data for health care quality.

First looked at hospitals vs. Medically Underserved Areas. Used the IMU (Index of Medical Underservice) score as a standard for quality. It is a scale from 0 to 100 that scores a given county on demographic and health care facility measures.

Project 1 Map Blue Project 1 Map Orange

In blue you can see the number of hospitals, and in orange the IMU score. We then chose two states as our samples, Alabama and Minnesota. These have an average LE of about 77 and 82 respectively, while Alabama has 133 health care facilities and Minnesota has 104.

Project 1 Alabama Map Project 1 Minn Map

Despite Alabama having more health care facilities, 18% of its population lives in areas with IMU > 60, while only 3% of Minnesota's population does. This could explain why Alabama's LE is lower.
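
The population share above a given IMU threshold could be computed along these lines. This is a sketch under assumptions: the area-level table, its column names, the placeholder states, and the choice of denominator (total population of the listed areas) are all hypothetical.

```python
import pandas as pd

# Hypothetical area-level rows with a population and an IMU score.
areas = pd.DataFrame({
    "state":      ["S1", "S1", "S1", "S2", "S2"],
    "population": [1000, 3000, 1000, 500, 4500],
    "imu":        [65.0, 40.0, 58.0, 70.0, 55.0],
})

THRESHOLD = 60

# Keep the population where IMU exceeds the threshold, otherwise count 0.
areas["pop_above"] = areas["population"].where(areas["imu"] > THRESHOLD, 0)

# Percent of each state's listed population living above the threshold.
totals = areas.groupby("state")[["pop_above", "population"]].sum()
share_above = totals["pop_above"] / totals["population"] * 100
print(share_above)
```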

Project 1 Alabama IMU Project 1 Minn IMU
