FuadAnalyst / EDA-Exploratory-Data-Analysis

Exploratory Data Analysis (EDA) on a dataset from Kaggle

Exploratory Data Analysis (EDA) of "BigBasket Entire Product List" 🔎

Description:

This project performs an Exploratory Data Analysis (EDA) of the BigBasket Products.csv dataset obtained from Kaggle. The main goals are to understand the dataset's features, discover possible relationships between variables, and identify unusual data points that may need further investigation.

In this project I found and cleaned null values. Using a combination of univariate and bivariate analysis techniques, I analyzed each variable on its own and investigated how the variables interact with each other.
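As a rough illustration of the null-value step, the sketch below counts missing values per column and then applies one possible cleaning strategy. The toy DataFrame, its column names (`product`, `brand`, `sale_price`, `rating`), and the choice of dropping versus filling are all assumptions made for the example; the notebook may handle missing values differently.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the dataset. Column names and values are assumptions
# made for illustration, not taken from the project notebook.
df = pd.DataFrame({
    "product": ["Tea", "Soap", None, "Rice"],
    "brand": ["A", None, "C", "D"],
    "sale_price": [120.0, 35.0, 60.0, 250.0],
    "rating": [4.1, np.nan, 3.9, np.nan],
})

# Count missing values in each column.
null_counts = df.isnull().sum()

# One possible strategy (an assumption): drop rows with no product name,
# fill missing ratings with the median, and label missing brands.
cleaned = df.dropna(subset=["product"]).copy()
cleaned["rating"] = cleaned["rating"].fillna(cleaned["rating"].median())
cleaned["brand"] = cleaned["brand"].fillna("Unknown")

print(null_counts)
print(cleaned)
```

Whether to drop or fill depends on the column: dropping is reasonable for rows that cannot be identified, while filling keeps rows whose other fields are still useful.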

Visualizations are used throughout to present key findings, show patterns, and illustrate how the data is distributed and how the variables relate to each other.

The results of this EDA can serve as a valuable foundation for further statistical modeling, machine learning, or data-driven decision-making.

Key Techniques:

  • Univariate Analysis:

    • Distribution analysis (histograms, boxplots)
    • Central tendency measures (mean, median, mode)
    • Measures of spread (variance, standard deviation)
  • Bivariate Analysis:

    • Scatter plots to visualize relationships between two continuous variables
    • Boxplots and bar plots for categorical vs. continuous variables
    • Correlation coefficient to quantify strength and direction of relationships
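The univariate measures listed above, and a simple categorical vs. continuous comparison, can be sketched with pandas as follows. The `category` and `sale_price` columns and their values are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
import pandas as pd

# Hypothetical sample data; column names and values are assumptions.
df = pd.DataFrame({
    "category": ["Beverages", "Beverages", "Beauty", "Beauty", "Snacks"],
    "sale_price": [100.0, 200.0, 300.0, 300.0, 600.0],
})

price = df["sale_price"]

# Univariate: central tendency and spread of a single numeric column.
summary = {
    "mean": price.mean(),
    "median": price.median(),
    "mode": price.mode().iloc[0],  # mode() returns a Series; take the first value
    "variance": price.var(),       # sample variance (ddof=1 is the pandas default)
    "std": price.std(),            # sample standard deviation
}

# Bivariate (categorical vs. continuous): mean price within each category.
mean_by_category = df.groupby("category")["sale_price"].mean()

print(summary)
print(mean_by_category)
```

The grouped means are the numbers a bar plot of price by category would display, so computing them first is a quick check on what the plot should look like.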

Outlier Detection:

The project uses boxplots, the IQR (interquartile range), and other statistical techniques to identify potential outliers and remove them from the dataset.
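A minimal sketch of IQR-based outlier filtering is shown below. It assumes the conventional 1.5 × IQR fences (the same rule a boxplot uses for its whiskers); the multiplier used in the notebook is not stated, and the price values here are made up.

```python
import pandas as pd

# Made-up prices with one obviously extreme value.
prices = pd.Series([10, 12, 13, 15, 16, 18, 20, 500], name="sale_price")

# Quartiles and interquartile range.
q1 = prices.quantile(0.25)
q3 = prices.quantile(0.75)
iqr = q3 - q1

# Fences at 1.5 * IQR beyond the quartiles (an assumed, conventional choice).
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr

# Flag values outside the fences, then split the data.
is_outlier = (prices < lower) | (prices > upper)
outliers = prices[is_outlier]
filtered = prices[~is_outlier]

print(outliers.tolist())
print(len(filtered))
```

Keeping the boolean mask separate from the filtering step makes it easy to inspect the flagged rows before deciding to drop them.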

Outcomes:

This Exploratory Data Analysis (EDA) gives a clear picture of the data's structure, central tendencies, and variability.

Identified relationships between variables, if any, are showcased through visualizations and correlation coefficients.
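For example, correlation coefficients between numeric columns can be computed as in the sketch below. The column names (`market_price`, `sale_price`, `rating`) and the values are assumptions for illustration, and Pearson correlation is assumed because the source does not name a specific coefficient.

```python
import pandas as pd

# Hypothetical numeric columns; names and values are assumptions.
df = pd.DataFrame({
    "market_price": [100.0, 200.0, 300.0, 400.0, 500.0],
    "sale_price": [90.0, 180.0, 250.0, 390.0, 450.0],
    "rating": [4.5, 3.0, 4.0, 3.5, 4.2],
})

# Pairwise Pearson correlation between all numeric columns.
corr_matrix = df.corr(method="pearson")

# Strength and direction of one specific relationship.
price_corr = corr_matrix.loc["market_price", "sale_price"]

print(corr_matrix.round(2))
print(round(price_corr, 3))
```

A matrix like this is what a Seaborn heatmap would typically visualize, with values near 1 or -1 indicating strong linear relationships.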

Outliers, if present, are investigated and removed from the dataset.

Technologies:

Jupyter Notebook, Python (Pandas, NumPy, Matplotlib, Seaborn)
