Dalia-Mahmoud-ElSayes / eT3-Task

Data Science task for the final acceptance stage of the eT3 Internship

cleaning-data-in-python data-science data-visualization datascience

eT3-Task

Import Data

I import the data using the pandas read_csv function.
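
As a minimal sketch of this step, the snippet below loads a small inline sample with read_csv. The file name of the real dataset is not given here, so the sample text and its values are hypothetical; the column names are simplified from the ones mentioned later in this README.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file: the actual file name and the
# full column list are not stated in this README.
csv_text = """Beverage_category,Beverage,Beverage_prep,Calories,Sugars,Caffeine
Coffee,Brewed Coffee,Short,3,0,175
Coffee,Brewed Coffee,Tall,4,0,260
Tazo Tea Drinks,Tazo Tea,Short,0,0,varies
"""

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

# Quick look at the size and the inferred column types.
print(df.shape)
print(df.dtypes)
```

With the real data, the io.StringIO object would be replaced by the path to the CSV file.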

Data Cleaning

Check for duplicates: no duplicates found.
Check for null values.
Handle null values:

First:

Inspect the data types and unique values.

Then:

Replace the value "varies" in the Caffeine column with null values.

Fill all null values with the mean of the column.
Drop unnecessary columns, such as Vitamin A, Vitamin B, ...
Check for duplicates again: no duplicates found.
Encode the categorical values (Beverage, Beverage_category, Beverage_prep).
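
The cleaning steps above can be sketched roughly as follows. This is an illustration on a made-up sample, not the notebook's exact code: the sample values are hypothetical, the conversion of Caffeine to a numeric type before taking the mean is an assumption, and the encoding method is not specified in this README, so integer category codes are used here as one possible choice.

```python
import numpy as np
import pandas as pd

# Hypothetical sample; column names follow the README, values are made up.
df = pd.DataFrame({
    "Beverage_category": ["Coffee", "Coffee", "Tazo Tea Drinks", "Smoothies"],
    "Beverage": ["Brewed Coffee", "Caffe Latte", "Tazo Tea", "Banana Smoothie"],
    "Beverage_prep": ["Short", "Tall", "Short", "Grande"],
    "Calories": [3, 100, 0, 290],
    "Sugars": [0, 9, 0, 58],
    "Caffeine": ["175", "75", "varies", None],
    "Vitamin A": ["0%", "10%", "0%", "8%"],
})

# Check for duplicated rows and count null values per column.
n_duplicates = int(df.duplicated().sum())
null_counts = df.isnull().sum()

# Inspect the data type and unique values of the problematic column.
print(df["Caffeine"].dtype, df["Caffeine"].unique())

# Replace "varies" with a null value, then convert to numbers
# (assumption: the column is stored as text because of "varies").
df["Caffeine"] = pd.to_numeric(df["Caffeine"].replace("varies", np.nan))

# Fill the null values with the mean of the column.
df["Caffeine"] = df["Caffeine"].fillna(df["Caffeine"].mean())

# Drop columns that are not needed for the analysis.
df = df.drop(columns=["Vitamin A"])

# Check for duplicates again after the changes.
n_duplicates_after = int(df.duplicated().sum())

# Encode the categorical columns as integer codes (assumed method),
# working on a copy so the readable names stay available for plotting.
encoded = df.copy()
for col in ["Beverage", "Beverage_category", "Beverage_prep"]:
    encoded[col] = encoded[col].astype("category").cat.codes

print(encoded)
```

Keeping the encoded values in a separate copy is a design choice of this sketch; it leaves the original drink and category names intact for the plot labels.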

Data Visualization

By Drink

Concatenate the Beverage columns to make a drink feature (column).
Sort the data by calories and store the first 7 rows in a new variable.
Plot a horizontal bar (barh) plot with the name of the drink on the y axis and calories on the x axis.
Do the same for sugars.
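
A rough sketch of the per-drink plots is shown below, using matplotlib directly. Several details are assumptions: the drink name is built here from Beverage and Beverage_prep, the sort is taken to be descending (highest values first), and the helper names top_n and plot_barh as well as the sample values are hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; values are made up for illustration.
df = pd.DataFrame({
    "Beverage": ["Brewed Coffee", "Caffe Latte", "Caffe Mocha", "Tazo Tea",
                 "Hot Chocolate", "Caramel Macchiato", "Espresso",
                 "White Chocolate Mocha"],
    "Beverage_prep": ["Short", "Tall", "Grande", "Short",
                      "Venti", "Tall", "Solo", "Venti"],
    "Calories": [3, 100, 260, 0, 400, 140, 5, 510],
    "Sugars": [0, 9, 33, 2, 54, 18, 1, 74],
})

# Build the drink feature (assumption: beverage name plus preparation).
df["Drink"] = df["Beverage"] + " " + df["Beverage_prep"]


def top_n(frame, column, n=7):
    """Return the n rows with the highest values in the given column."""
    return frame.sort_values(column, ascending=False).head(n)


def plot_barh(top, column):
    """Horizontal bar plot: drink names on the y axis, values on the x axis."""
    fig, ax = plt.subplots()
    ax.barh(top["Drink"], top[column])
    ax.invert_yaxis()  # put the largest bar at the top
    ax.set_xlabel(column)
    ax.set_ylabel("Drink")
    fig.tight_layout()
    return ax


top_calories = top_n(df, "Calories")
calories_ax = plot_barh(top_calories, "Calories")

# Same steps for sugars.
top_sugars = top_n(df, "Sugars")
sugars_ax = plot_barh(top_sugars, "Sugars")
```

In a notebook the figures appear inline; in a script, plt.show() or fig.savefig would be needed to see them.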

By Category

Plot a bar plot of the Calories column against the Beverage Category column.
Plot a bar plot of the Sugars column against the Beverage Category column.
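
The per-category plots could look something like the sketch below. How the values are aggregated per category is not stated in this README; taking the mean of each category is an assumption of this example, and the sample values are made up.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample; values are made up for illustration.
df = pd.DataFrame({
    "Beverage_category": ["Coffee", "Coffee", "Smoothies", "Smoothies",
                          "Tazo Tea Drinks", "Tazo Tea Drinks"],
    "Calories": [3, 100, 290, 270, 0, 80],
    "Sugars": [0, 9, 58, 54, 0, 16],
})

# Aggregate per category (assumption: mean value per category).
by_category = df.groupby("Beverage_category")[["Calories", "Sugars"]].mean()

# One bar plot per nutrient, side by side.
fig, axes = plt.subplots(1, 2, figsize=(10, 4))
for ax, column in zip(axes, ["Calories", "Sugars"]):
    ax.bar(by_category.index, by_category[column])
    ax.set_xlabel("Beverage category")
    ax.set_ylabel(column)
    ax.tick_params(axis="x", rotation=30)  # keep long category names readable
fig.tight_layout()

print(by_category)
```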

How to run

There is nothing complicated about running the solution: all you need to do is run all the cells and look at the visualizations.

I wrote headings and a few comments in the code to help anyone understand it.

Languages

Language:Jupyter Notebook 100.0%