exploratory-data-analysis exploratory-data-visualizations statistical-analysis statistical-tests data-science machine-learning python3 seaborn scikit-learn pandas tpot automl hyperparameter-tuning pipeline-framework shell-script

MPG

Part 1, Exploratory Data Analysis (EDA):
This part covers summary statistics, but the main focus is on EDA: I use plots to extract information from the data and report the important insights. It leans toward data analysis and business intelligence (BI). You can follow this entire notebook on Kaggle as well.
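
As a rough idea of the kind of summary this part starts from, here is a minimal pandas sketch. The rows are made up for illustration and are not taken from the dataset; the column names follow the attribute list in the Data Description, and the notebook itself may organise this differently.

```python
import pandas as pd

# Hypothetical rows for illustration only (not real dataset values).
df = pd.DataFrame({
    "mpg": [18.0, 15.0, 24.0, 27.0, 31.0, 22.0],
    "cylinders": [8, 8, 4, 4, 4, 6],
    "weight": [3504, 3693, 2372, 2130, 1985, 2833],
    "origin": [1, 1, 3, 3, 2, 1],
})

# Summary statistics (count, mean, std, quartiles) for two numeric columns.
summary = df[["mpg", "weight"]].describe()
print(summary)

# Mean mpg per cylinder count, a typical first grouping in EDA.
mpg_by_cyl = df.groupby("cylinders")["mpg"].mean()
print(mpg_by_cyl)

# Pearson correlation between weight and mpg.
corr = df["mpg"].corr(df["weight"])
print(round(corr, 3))
```

The same frame can then be handed to a plotting call such as `sns.scatterplot(data=df, x="weight", y="mpg")` to look at the relationship visually.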

Part 2, Statistical Analysis:
In this part I run several statistical hypothesis tests, apply estimation statistics, and interpret the results. I also check the results against the findings from Part 1. Both parametric and non-parametric tests are applied. This part is about data science and requires a statistical background. You can follow this entire notebook on Kaggle as well.
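
To make the parametric versus non-parametric distinction concrete, the sketch below compares two groups with Welch's t-test and with the Mann-Whitney U test from SciPy. The two samples are hypothetical mpg values invented for illustration, and the choice of these two tests is an assumption; the notebook may use others.

```python
from statistics import mean

from scipy import stats

# Hypothetical mpg samples for two groups of cars (illustration only).
group_a = [18.0, 15.0, 16.0, 17.0, 14.0, 13.0, 12.0, 11.0]
group_b = [24.0, 27.0, 26.0, 25.0, 31.0, 28.0, 29.0, 30.0]

# Parametric: Welch's t-test (does not assume equal variances).
t_res = stats.ttest_ind(group_a, group_b, equal_var=False)

# Non-parametric: Mann-Whitney U test (rank-based, no normality assumption).
u_res = stats.mannwhitneyu(group_a, group_b, alternative="two-sided")

# Estimation statistic: the difference in sample means.
mean_diff = mean(group_b) - mean(group_a)

print(f"Welch t-test p-value:   {t_res.pvalue:.4g}")
print(f"Mann-Whitney U p-value: {u_res.pvalue:.4g}")
print(f"Mean difference (b - a): {mean_diff:.2f}")
```

When both tests agree, as they should for groups this well separated, the conclusion does not hinge on the normality assumption.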

Part 3, Predictive Modelling:
In this part I predict mpg from the other attributes. This part is all about machine learning. I trained many data pipelines and models, then made predictions with the best pipeline and model found.
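
The idea of searching over pipelines and keeping the best one can be sketched with scikit-learn as follows. This is a stand-in, not the notebook's actual search: the data is synthetic, the two features and the Ridge model are assumptions, and `GridSearchCV` replaces whatever search procedure the notebook really uses.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data for illustration: mpg falls as weight and horsepower rise.
rng = np.random.default_rng(0)
n = 200
weight = rng.uniform(1600, 5000, n)
horsepower = 40 + 0.03 * weight + rng.normal(0, 10, n)
X = np.column_stack([weight, horsepower])
y = 46 - 0.006 * weight - 0.03 * horsepower + rng.normal(0, 1.5, n)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Pipeline: scale the features, then fit a regularised linear model.
pipe = Pipeline([("scale", StandardScaler()), ("model", Ridge())])

# Cross-validated search over the regularisation strength.
search = GridSearchCV(
    pipe, {"model__alpha": [0.1, 1.0, 10.0]}, cv=5, scoring="r2"
)
search.fit(X_train, y_train)

# Score the best pipeline found on held-out data.
test_r2 = search.score(X_test, y_test)
print(search.best_params_, round(test_r2, 3))
```

Keeping preprocessing inside the pipeline means the scaler is refit on each training fold, so the cross-validation scores are not contaminated by the validation data.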

If you like these notebooks then please share with others.

Data Description

The data used here is the Auto MPG dataset, taken from the UCI Machine Learning Repository.

Information regarding data
Title: Auto-Mpg Data
Number of Instances: 398
Number of Attributes: 9 including the class attribute
Attribute Information:

    1. mpg:           continuous
    2. cylinders:     multi-valued discrete
    3. displacement:  continuous
    4. horsepower:    continuous
    5. weight:        continuous
    6. acceleration:  continuous
    7. model year:    multi-valued discrete
    8. origin:        multi-valued discrete
    9. car name:      string (unique for each instance)
    
    All the attributes are self-explanatory.

This data is not complex and is good for analysis as it has a nice blend of both categorical and numerical attributes.
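
A minimal sketch of reading records with these nine attributes, using only the standard library. It assumes the raw file is whitespace-separated, wraps the car name in double quotes, and marks a missing horsepower value with `?`; the two sample lines and the `parse_line` helper are illustrative, not taken from the notebook.

```python
import shlex

COLUMNS = [
    "mpg", "cylinders", "displacement", "horsepower", "weight",
    "acceleration", "model year", "origin", "car name",
]
INT_COLUMNS = {"cylinders", "model year", "origin"}

# Illustrative lines in the assumed raw layout (second one lacks horsepower).
SAMPLE = '''18.0 8 307.0 130.0 3504. 12.0 70 1 "chevrolet chevelle malibu"
25.0 4 98.00 ? 2046. 19.0 71 1 "ford pinto"'''


def parse_line(line):
    """Turn one raw line into a dict keyed by attribute name."""
    fields = shlex.split(line)  # keeps the quoted car name as one field
    record = {}
    for name, value in zip(COLUMNS, fields):
        if name == "car name":
            record[name] = value
        elif value == "?":
            record[name] = None  # assumed missing-value marker
        elif name in INT_COLUMNS:
            record[name] = int(value)
        else:
            record[name] = float(value)
    return record


rows = [parse_line(line) for line in SAMPLE.splitlines()]
for row in rows:
    print(row)
```

With pandas, the same layout could be read in one call by passing the column names, a whitespace separator, and `na_values="?"` to `pd.read_csv`.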

Data source: UCI Machine Learning Repository.

If you like this project then please star it and also share with others.

About

A case study on the MPG dataset.


Languages

Jupyter Notebook 98.6%
Python 1.3%
Shell 0.1%