hanifabd / PythonEDA

EDA techniques and how to execute them in Python

Python EDA

This is a collection of Python scripts that explore the basics of exploratory data analysis (EDA) in Python, along with ways to make EDA more efficient, such as using Auto EDA libraries.

Python Basic EDA

This script examines the basics of EDA in Python using the familiar Titanic dataset. It covers the following fundamental topics:

  • Understanding the dataset structure
  • Basic statistical summaries
  • Data Cleaning
  • Visualization
  • Correlation analysis
  • Handling categorical data
  • Feature engineering
  • Preliminary Insights
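As a rough illustration of several of the steps above, here is a minimal sketch using pandas. It is not the repository's script: the rows are toy values invented for the example, the column names (`Survived`, `Pclass`, `Sex`, `Age`, `SibSp`, `Parch`) are assumed to follow the commonly distributed Titanic CSV, and the `FamilySize` feature and median imputation are illustrative choices. Plotting is left out to keep the example short.

```python
import pandas as pd

# Toy rows invented for illustration only; column names are an assumption
# based on the commonly used Titanic CSV layout.
df = pd.DataFrame({
    "Survived": [0, 1, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 1, 2, 3],
    "Sex": ["male", "female", "female", "male", "female", "male"],
    "Age": [22.0, 38.0, None, 54.0, 27.0, None],
    "SibSp": [1, 1, 0, 0, 0, 0],
    "Parch": [0, 0, 0, 0, 2, 0],
})

# Dataset structure: dimensions, column types, missing values per column.
shape = df.shape
dtypes = df.dtypes
missing = df.isna().sum()

# Basic statistical summary of the numeric columns.
summary = df.describe()

# Data cleaning: fill missing ages with the median age (one simple option).
df["Age"] = df["Age"].fillna(df["Age"].median())

# Handling categorical data: encode Sex as a 0/1 indicator.
df["Sex_female"] = (df["Sex"] == "female").astype(int)

# Feature engineering: hypothetical FamilySize feature (passenger included).
df["FamilySize"] = df["SibSp"] + df["Parch"] + 1

# Correlation analysis on the numeric columns.
numeric_cols = ["Survived", "Pclass", "Age", "Sex_female", "FamilySize"]
corr = df[numeric_cols].corr()

# Preliminary insight: survival rate per group.
survival_by_sex = df.groupby("Sex")["Survived"].mean()

print(shape)
print(missing)
print(survival_by_sex)
```

On a real dataset, the same calls would typically start from `pd.read_csv(...)` instead of an inline frame, and the imputation strategy would depend on how the missing values are distributed.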

Auto EDA with Python

The AutoEDA script collects several auto EDA libraries for Python and shows how to run each one, along with its strengths and weaknesses.

The decision to use auto EDA depends on:

  • specific needs of the project
  • size and complexity of the dataset
  • expertise of the user

Auto EDA tools offer a quick and efficient way to get a broad overview of the data.
Manual EDA allows for more nuanced and detailed analysis.

Packages include:

Python Basic Stats Tests

This script examines basic statistical tests using the Titanic and Iris datasets. It covers how to decide which test to use based on data type, distribution, and population size. Basic tests include:

  • Chi-Square
  • t-Test
  • ANOVA
  • Kruskal-Wallis
  • Pearson Correlation
  • Linear Regression
  • Logistic Regression
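Several of the tests listed above can be sketched with `scipy.stats`. This is an illustrative example rather than the repository's script: the data are toy values invented for the example, the column names are assumptions, and the pairing of each test with a particular column is only one plausible choice. The two regression methods are omitted for brevity.

```python
import pandas as pd
from scipy import stats

# Toy rows invented for illustration only; column names are assumptions.
df = pd.DataFrame({
    "Sex": ["male", "female", "female", "male",
            "female", "male", "male", "female"],
    "Survived": [0, 1, 1, 0, 1, 0, 1, 0],
    "Pclass": [3, 1, 3, 1, 2, 3, 2, 1],
    "Age": [22.0, 38.0, 26.0, 54.0, 27.0, 40.0, 19.0, 45.0],
    "Fare": [7.25, 71.28, 7.92, 51.86, 11.13, 8.05, 30.0, 13.0],
})

# Chi-square: association between two categorical variables.
table = pd.crosstab(df["Sex"], df["Survived"])
chi2, chi_p, dof, expected = stats.chi2_contingency(table)

# t-test: compare a numeric variable between two groups
# (Welch's variant, which does not assume equal variances).
age_survived = df.loc[df["Survived"] == 1, "Age"]
age_died = df.loc[df["Survived"] == 0, "Age"]
t_stat, t_p = stats.ttest_ind(age_survived, age_died, equal_var=False)

# ANOVA: compare a numeric variable across three or more groups.
groups = [g["Fare"].to_numpy() for _, g in df.groupby("Pclass")]
f_stat, f_p = stats.f_oneway(*groups)

# Kruskal-Wallis: rank-based alternative when normality is doubtful.
h_stat, kw_p = stats.kruskal(*groups)

# Pearson correlation: linear relationship between two numeric variables.
r, r_p = stats.pearsonr(df["Age"], df["Fare"])

print(f"chi-square p={chi_p:.3f}, dof={dof}")
print(f"t-test p={t_p:.3f}")
print(f"ANOVA p={f_p:.3f}, Kruskal-Wallis p={kw_p:.3f}")
print(f"Pearson r={r:.3f}, p={r_p:.3f}")
```

With samples this small the p-values carry little meaning; the point is only to show which call matches which kind of comparison.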

Languages

Language:Jupyter Notebook 100.0%