MachineLearning_AppliedStatistics
Imported the necessary libraries
Read the data as a data frame
Performed basic EDA, printing the insights at every step. The EDA covered:
a. Shape of the data
b. Data type of each attribute
c. Checking the presence of missing values
d. Five-point summary of numerical attributes
e. Distribution of ‘bmi’, ‘age’ and ‘charges’ columns.
f. Measure of skewness of ‘bmi’, ‘age’ and ‘charges’ columns
g. Checking the presence of outliers in ‘bmi’, ‘age’ and ‘charges’ columns
h. Distribution of categorical columns (including ‘children’)
i. Pair plot that includes all the columns of the data frame
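The non-plotting EDA steps above can be sketched with pandas as follows. This is only an illustration, not the notebook's actual code: the data frame is filled with made-up random values, the ‘smoker’ column name is an assumption, and the 1.5 × IQR rule is one common way to flag outliers that the notebook may or may not have used.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the real data (values are made up for illustration).
rng = np.random.default_rng(0)
n = 200
df = pd.DataFrame({
    "age": rng.integers(18, 65, size=n),
    "bmi": rng.normal(30, 6, size=n).round(1),
    "children": rng.integers(0, 4, size=n),
    "smoker": rng.choice(["yes", "no"], size=n, p=[0.2, 0.8]),  # assumed column name
    "charges": rng.lognormal(mean=9, sigma=0.8, size=n).round(2),
})

num_cols = ["bmi", "age", "charges"]

shape = df.shape                # a. (rows, columns)
dtypes = df.dtypes              # b. data type of each attribute
missing = df.isna().sum()       # c. missing values per column

# d. five-point summary: min, quartiles, max
summary = df[num_cols].describe().loc[["min", "25%", "50%", "75%", "max"]]

# f. sample skewness of each numerical column
skewness = df[num_cols].skew()

# g. outliers flagged with the 1.5 * IQR rule (an assumed convention)
q1 = df[num_cols].quantile(0.25)
q3 = df[num_cols].quantile(0.75)
iqr = q3 - q1
outlier_mask = (df[num_cols] < q1 - 1.5 * iqr) | (df[num_cols] > q3 + 1.5 * iqr)
outlier_counts = outlier_mask.sum()

# h. distribution of a categorical column, here 'children'
children_counts = df["children"].value_counts().sort_index()

print(shape, missing.sum())
print(summary)
print(skewness)
print(outlier_counts)
```

The distribution plots and the pair plot (items e and i) are left out here because they need a plotting library; histograms and a pair plot from a library such as seaborn would be the usual choice.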
The notebook also analyzed the following questions, supported by statistical evidence:
a. Do the charges of people who smoke differ significantly from those of people who don't?
b. Does the bmi of males differ significantly from that of females?
c. Does the proportion of smokers differ significantly between genders?
d. Is the distribution of bmi the same across women with no children, one child and two children?
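This summary does not say which tests the notebook applied. One plausible mapping, sketched below with SciPy, is a two-sample t-test for questions a and b, a chi-square test of independence for c, and a one-way ANOVA for d. The data is synthetic and the column names ‘sex’ and ‘smoker’ are assumptions, so the p-values here carry no meaning beyond showing the mechanics.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic stand-in for the real data (values are made up for illustration).
rng = np.random.default_rng(1)
n = 300
df = pd.DataFrame({
    "sex": rng.choice(["male", "female"], size=n),              # assumed column name
    "smoker": rng.choice(["yes", "no"], size=n, p=[0.2, 0.8]),  # assumed column name
    "children": rng.integers(0, 3, size=n),
    "bmi": rng.normal(30, 6, size=n),
    "charges": rng.lognormal(mean=9, sigma=0.8, size=n),
})

# a. Charges of smokers vs. non-smokers: Welch's t-test (unequal variances)
smokers = df.loc[df["smoker"] == "yes", "charges"]
non_smokers = df.loc[df["smoker"] == "no", "charges"]
t_charges, p_charges = stats.ttest_ind(smokers, non_smokers, equal_var=False)

# b. bmi of males vs. females: Welch's t-test
bmi_male = df.loc[df["sex"] == "male", "bmi"]
bmi_female = df.loc[df["sex"] == "female", "bmi"]
t_bmi, p_bmi = stats.ttest_ind(bmi_male, bmi_female, equal_var=False)

# c. Smoker proportion by gender: chi-square test on the 2x2 contingency table
table = pd.crosstab(df["sex"], df["smoker"])
chi2, p_smoker, dof, expected = stats.chi2_contingency(table)

# d. bmi of women with 0, 1 and 2 children: one-way ANOVA across the three groups
women = df[df["sex"] == "female"]
groups = [women.loc[women["children"] == k, "bmi"] for k in (0, 1, 2)]
f_bmi, p_anova = stats.f_oneway(*groups)

alpha = 0.05  # conventional significance level
for name, p in [("a", p_charges), ("b", p_bmi), ("c", p_smoker), ("d", p_anova)]:
    verdict = "reject H0" if p < alpha else "fail to reject H0"
    print(f"question {name}: p = {p:.4f} -> {verdict}")
```

Note that a one-way ANOVA compares group means rather than whole distributions; if the question is read strictly as "same distribution", a non-parametric alternative such as the Kruskal-Wallis test (`stats.kruskal`) may be a better fit.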