I import the data using the pandas built-in function read_csv
-
check for duplicates: no duplicates found
-
check for null values
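
The loading and checking steps above can be sketched roughly as follows. This is only an illustration, not the exact notebook code: the inline sample rows and their column names are made up, and in the real solution read_csv would be given the path to the dataset file instead.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file.
sample_csv = io.StringIO(
    "Beverage,Calories,Caffeine\n"
    "Latte,190,75\n"
    "Mocha,260,\n"
    "Tea,0,varies\n"
)

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(sample_csv)

# Count fully duplicated rows; 0 means no duplicates were found.
n_duplicates = int(df.duplicated().sum())

# Count missing values per column.
null_counts = df.isnull().sum()

print(n_duplicates)
print(null_counts)
```

Note that a text placeholder such as "varies" is not counted as null at this point, which is why it has to be replaced separately.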
-
handling null values:
look at the data types and unique values
replace "varies" in the Caffeine column with null values
and fill all null values with the mean of the column
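
A minimal sketch of this null-handling step, assuming the Caffeine column is read as text because of the "varies" entries. The sample values are made up for illustration, and the real column name in the dataset may differ slightly.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; the real dataset has many more rows and columns.
df = pd.DataFrame({"Caffeine": ["75", "varies", "150", None]})

# Inspect the data type and the unique values to spot non-numeric entries.
print(df["Caffeine"].dtype)
print(df["Caffeine"].unique())

# Replace the text placeholder with NaN, then convert the column to numbers.
df["Caffeine"] = df["Caffeine"].replace("varies", np.nan)
df["Caffeine"] = pd.to_numeric(df["Caffeine"])

# Fill every missing value with the column mean (mean() skips NaN).
df["Caffeine"] = df["Caffeine"].fillna(df["Caffeine"].mean())

print(df)
```

The conversion with to_numeric matters: the mean can only be computed once the column holds numbers rather than strings.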
-
drop unnecessary columns, such as (Vitamin A, Vitamin B,...)
-
check for duplicates again: no duplicates found
-
encode categorical values: (Beverage, Beverage_category, Beverage_prep)
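
The encoding method is not stated here. One common option is label encoding, sketched below with pandas category codes. The choice of method, the sample rows, and the "_encoded" column names are all assumptions made for this example.

```python
import pandas as pd

# Hypothetical sample rows using the three categorical columns.
df = pd.DataFrame({
    "Beverage_category": ["Coffee", "Tea", "Coffee"],
    "Beverage": ["Latte", "Chai", "Mocha"],
    "Beverage_prep": ["Tall", "Tall", "Grande"],
})

categorical_columns = ["Beverage", "Beverage_category", "Beverage_prep"]

for column in categorical_columns:
    # Categories are sorted alphabetically and mapped to 0, 1, 2, ...
    df[column + "_encoded"] = df[column].astype("category").cat.codes

print(df)
```

Writing the codes to new columns keeps the original text available, which is useful if the names are needed later for plot labels.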
-
concatenate all the Beverage columns to make a drink feature (column)
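
A possible way to build the drink column, assuming the three Beverage columns still hold text. The column order, the space separator, and the sample rows are assumptions for illustration.

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "Beverage_category": ["Coffee", "Tea"],
    "Beverage": ["Latte", "Chai"],
    "Beverage_prep": ["Tall", "Grande"],
})

# Join the text columns row by row, separated by a space.
df["drink"] = df["Beverage"].str.cat(
    [df["Beverage_category"], df["Beverage_prep"]], sep=" "
)

print(df["drink"].tolist())
```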
-
sort the data by calories and store the first 7 rows in a new variable
-
plot a horizontal bar (barh) plot with the drink name on the y axis and calories on the x axis
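
The sorting and plotting steps could look something like the sketch below. It assumes "first 7" means the 7 drinks with the most calories (descending sort); the sample drinks, the "drink" and "Calories" column names, and the variable name top7 are made up for illustration.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample with 8 drinks so that one gets cut off.
df = pd.DataFrame({
    "drink": ["A", "B", "C", "D", "E", "F", "G", "H"],
    "Calories": [120, 450, 300, 80, 510, 260, 190, 340],
})

# Sort by calories (highest first) and keep the first 7 rows.
top7 = df.sort_values("Calories", ascending=False).head(7)

fig, ax = plt.subplots(figsize=(8, 4))
# barh takes the y positions (drink names) first, then the bar widths.
ax.barh(top7["drink"], top7["Calories"])
ax.invert_yaxis()  # put the highest-calorie drink at the top
ax.set_xlabel("Calories")
ax.set_ylabel("Drink")
ax.set_title("Top 7 drinks by calories")
```

In a notebook the figure is shown when the cell finishes; in a plain script, plt.show() would be needed at the end. The same code works for sugars by swapping the column name.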
-
do the same for sugars
-
plot a bar plot of the Beverage Category column against the Calories column
-
plot a bar plot of the Beverage Category column against the Sugars column
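
A rough sketch of the two category plots. A bar plot needs one value per category, so this example uses the mean per category; that aggregation, the sample rows, and the exact column names are assumptions, and the notebook may do it differently.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "Beverage_category": ["Coffee", "Coffee", "Tea", "Tea"],
    "Calories": [100, 200, 0, 50],
    "Sugars": [10, 30, 0, 12],
})

# Average calories and sugars for each beverage category.
means = df.groupby("Beverage_category")[["Calories", "Sugars"]].mean()

fig, (ax_cal, ax_sug) = plt.subplots(1, 2, figsize=(10, 4))

ax_cal.bar(means.index, means["Calories"])
ax_cal.set_xlabel("Beverage category")
ax_cal.set_ylabel("Mean calories")

ax_sug.bar(means.index, means["Sugars"])
ax_sug.set_xlabel("Beverage category")
ax_sug.set_ylabel("Mean sugars")

fig.tight_layout()
```
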
There is nothing complicated about running the solution: all you need to do is run all and see the visualizations.
I wrote headings and a few comments in the code to help anyone understand it.