jesschannn / datasci_6_anova

Gain hands-on experience with ANOVA analysis, understanding its assumptions, and applying it to real-world datasets to understand differences among group means.

datasci_6_anova

Assignment #6 for HHA 507


In this assignment, I ran a Shapiro-Wilk test, Levene's test, and a Tukey post-hoc test on a diabetes dataset. I studied three variables: gender, maximum glucose serum level, and time spent in the hospital.

  • Shapiro-Wilk: to test for normality
  • Levene: to test for equal variances across groups
  • Tukey: to determine which groups differ from one another, following the ANOVA results
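As a rough illustration of how these three tests fit together, here is a minimal sketch using `scipy.stats`. It assumes SciPy 1.8 or later (for `tukey_hsd`), and the group labels and length-of-stay values are made up for the example; they are not taken from the diabetes dataset or from the notebook in this repo.

```python
from scipy import stats

# Hypothetical length-of-stay values (days) for three made-up groups.
# These numbers are illustrative only, NOT the real diabetes data.
groups = {
    "None": [2, 3, 3, 4, 4, 5, 5, 6, 7, 3],
    "Norm": [3, 4, 4, 5, 5, 6, 6, 7, 8, 4],
    ">200": [5, 6, 6, 7, 8, 8, 9, 10, 11, 7],
}
samples = list(groups.values())

# Shapiro-Wilk, run per group: H0 = the sample is drawn from a normal distribution.
shapiro_p = {name: stats.shapiro(vals).pvalue for name, vals in groups.items()}

# Levene: H0 = all groups share the same variance.
levene_stat, levene_p = stats.levene(*samples, center="median")

# One-way ANOVA: H0 = all group means are equal.
f_stat, anova_p = stats.f_oneway(*samples)

# Tukey HSD: pairwise comparisons, usually read only after a significant ANOVA.
tukey = stats.tukey_hsd(*samples)

print("Shapiro-Wilk p-values:", shapiro_p)
print("Levene p-value:", levene_p)
print("ANOVA F and p-value:", f_stat, anova_p)
print(tukey)
```

A small p-value from Shapiro-Wilk or Levene suggests the corresponding ANOVA assumption may not hold, while the Tukey output shows which pairs of groups differ.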

Q: Do gender and maximum glucose serum level (max_glu_serum) influence the number of days spent in the hospital (time_in_hospital) among patients with diabetes?

IV 1: gender

I picked gender because certain diseases affect men and women differently, which can influence how they are diagnosed and treated. For diabetes specifically, prevalence and incidence differ between men and women, hormonal and genetic factors may contribute to disparities in risk, and signs and symptoms can differ between men and women.

IV 2: max_glu_serum

I picked maximum glucose serum because it is used as a monitoring tool for diabetes. These levels provide information about glycemic control and whether any intervention is needed.

DV: time_in_hospital

I picked time in hospital as the last variable because an overall goal for hospitals is to reduce the time patients spend there. Advances in diagnostics have helped hospitals shorten patients' stays.


Languages

Language:Jupyter Notebook 100.0%