Dalia-Mahmoud-ElSayes / eT3-Task

Data science task for the final acceptance stage of the eT3 internship

eT3-Task

Import Data

I import the data using the pandas read_csv function.
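A minimal sketch of the import step. The in-memory CSV text below is a hypothetical stand-in for the real data file, which this README does not name. The column names come from the sections below, and the values are made up for illustration.

```python
import io

import pandas as pd

# Hypothetical stand-in for the real CSV file (values are illustrative only).
csv_text = io.StringIO(
    "Beverage_category,Beverage,Beverage_prep,Calories,Sugars,Caffeine\n"
    "Coffee,Brewed Coffee,Short,3,0,175\n"
    "Classic Espresso Drinks,Caffe Latte,2% Milk,190,17,150\n"
)

# With a real file on disk, pass its path to read_csv instead of csv_text.
df = pd.read_csv(csv_text)

# Quick look at what was loaded.
print(df.shape)
print(df.dtypes)
```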

Data Cleaning

  • check for duplicates: no duplicates found

  • check for null values

  • handling null values:

    first:

    inspect the data type and unique values of each column

    then:

    replace the "varies" entries in the Caffeine column with null values

    and fill all null values with the mean of the column

  • drop unnecessary columns, such as Vitamin A, Vitamin B, ...

  • check for duplicates again: no duplicates found

  • encode categorical values: (Beverage, Beverage_category, Beverage_prep)
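The cleaning steps above can be sketched roughly as follows. The small DataFrame is made-up sample data, and the exact column names ("Caffeine", "Vitamin A") are assumptions based on this README. The README does not say which encoding is used, so the category codes here are only one possible choice, written to new hypothetical "_code" columns so the original labels stay available.

```python
import numpy as np
import pandas as pd

# Made-up sample data; the real dataset has more rows and columns.
df = pd.DataFrame({
    "Beverage_category": ["Coffee", "Coffee", "Tea", "Tea"],
    "Beverage": ["Brewed Coffee", "Brewed Coffee", "Tazo Tea", "Chai Latte"],
    "Beverage_prep": ["Short", "Tall", "Short", "Tall"],
    "Calories": [3, 4, 0, 200],
    "Caffeine": ["175", "260", "varies", None],
    "Vitamin A": ["0%", "0%", "0%", "10%"],
})

# Check for duplicates and null values.
n_duplicates = int(df.duplicated().sum())
null_counts = df.isnull().sum()

# Inspect data types and unique values before handling nulls.
print(df.dtypes)
print(df["Caffeine"].unique())

# Replace "varies" with NaN, convert the column to numbers,
# then fill nulls in numeric columns with each column's mean.
df["Caffeine"] = pd.to_numeric(df["Caffeine"].replace("varies", np.nan))
df = df.fillna(df.mean(numeric_only=True))

# Drop columns that are not needed for the analysis.
df = df.drop(columns=["Vitamin A"])

# Check for duplicates again after the changes.
n_duplicates_after = int(df.duplicated().sum())

# Encode the categorical columns as integer codes (assumed encoding).
for col in ["Beverage", "Beverage_category", "Beverage_prep"]:
    df[col + "_code"] = df[col].astype("category").cat.codes
```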

Data Visualization

By Drink

  • concatenate all Beverages to make a drink feature (column)

  • sort the data by calories and store the first 7 rows in a new variable

  • plot a horizontal bar (barh) plot with the drink name on the y axis and calories on the x axis

  • do the same for sugars
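A rough sketch of the per-drink plots, using made-up sample data. It assumes that the drink feature joins the three Beverage columns into one string and that "first 7" means the 7 highest values. The README does not confirm either detail, and the "Drink" column name is hypothetical.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "Beverage_category": ["Coffee", "Coffee", "Espresso", "Espresso",
                          "Espresso", "Frappuccino", "Frappuccino", "Tea"],
    "Beverage": ["Brewed Coffee", "Brewed Coffee", "Caffe Latte", "Caffe Latte",
                 "Caffe Mocha", "Java Chip", "Java Chip", "Chai Latte"],
    "Beverage_prep": ["Short", "Tall", "Short", "Tall",
                      "Tall", "Tall", "Grande", "Tall"],
    "Calories": [3, 4, 70, 100, 220, 300, 420, 200],
    "Sugars": [0, 0, 9, 14, 30, 60, 84, 32],
})

# Build one drink label per row (assumed: join the three Beverage columns).
df["Drink"] = (
    df["Beverage_category"] + " " + df["Beverage"] + " " + df["Beverage_prep"]
)

# Sort and keep the first 7 rows (assumed: highest values first).
top_calories = df.sort_values("Calories", ascending=False).head(7)
top_sugars = df.sort_values("Sugars", ascending=False).head(7)

# Horizontal bars: drink name on the y axis, value on the x axis.
fig, (ax_cal, ax_sug) = plt.subplots(1, 2, figsize=(14, 4))
ax_cal.barh(top_calories["Drink"], top_calories["Calories"])
ax_cal.set_xlabel("Calories")
ax_sug.barh(top_sugars["Drink"], top_sugars["Sugars"])
ax_sug.set_xlabel("Sugars")
fig.tight_layout()
```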

By Category

  • plot a bar plot of Calories against the Beverage Category column

  • plot a bar plot of Sugars against the Beverage Category column
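One possible way to draw the per-category plots, again on made-up sample data. The README does not say how rows within a category are combined, so taking the mean per category is an assumption made for this sketch.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; not needed inside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample data for illustration only.
df = pd.DataFrame({
    "Beverage_category": ["Coffee", "Coffee", "Tea", "Tea"],
    "Calories": [3, 4, 0, 200],
    "Sugars": [0, 0, 0, 32],
})

# Assumed aggregation: mean calories and sugars per beverage category.
mean_by_category = df.groupby("Beverage_category")[["Calories", "Sugars"]].mean()

# One bar plot per measure, with the category on the x axis.
fig, (ax_cal, ax_sug) = plt.subplots(1, 2, figsize=(10, 4))
ax_cal.bar(mean_by_category.index, mean_by_category["Calories"])
ax_cal.set_xlabel("Beverage category")
ax_cal.set_ylabel("Mean calories")
ax_sug.bar(mean_by_category.index, mean_by_category["Sugars"])
ax_sug.set_xlabel("Beverage category")
ax_sug.set_ylabel("Mean sugars")
fig.tight_layout()
```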

How to run

There is nothing complicated about running the solution: all you need to do is run all cells in the notebook and view the visualizations.

I wrote headings and a few comments in the code to help anyone understand it.

Languages

Language:Jupyter Notebook 100.0%