vamshikallem / HypothesisTesting-EDA_on_Air_Quality_data_from_Beijing

Formulating a few hypotheses from observations of the data and testing them through visualizations built with Seaborn.

HypothesisTesting-EDA_on_Air_Quality_data_from_Beijing

This project formulates a few hypotheses from observations of the data and examines them using visualizations.

Data source: UCI Machine Learning Repository. Link to download the data file: https://archive.ics.uci.edu/ml/machine-learning-databases/00501/ This link leads to the parent directory, which contains the link to the zip file holding the data as multiple CSV files.

- This data set includes hourly air pollutant data from 12 nationally controlled air-quality monitoring sites. The air-quality data are from the Beijing Municipal Environmental Monitoring Center. The meteorological data for each air-quality site are matched with the nearest weather station from the China Meteorological Administration. The time period is from March 1st, 2013 to February 28th, 2017. Missing data are denoted as NA. The attributes are categorized into three types, which are indicated by different symbols in the proposal.

- The zip file contains the data collected from 12 different stations as 12 separate CSV files.
- Each file has 18 columns, about 35,000 rows, and about 2.7 MB of data. With varied characteristics and missing values, the data offers good scope for visualization and data cleaning.
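As a minimal sketch of the loading step described above, the 12 station files could be stacked into one DataFrame and checked for missing values. The function names `load_stations` and `missing_summary` are hypothetical, and the sketch assumes the zip has already been extracted into a single folder of `.csv` files.

```python
from pathlib import Path

import pandas as pd


def load_stations(data_dir):
    """Read every CSV in data_dir and stack them into one DataFrame."""
    files = sorted(Path(data_dir).glob("*.csv"))
    if not files:
        raise FileNotFoundError(f"no CSV files found in {data_dir}")
    # pandas treats the literal string "NA" as a missing value by default,
    # which matches how the data set marks missing data.
    frames = [pd.read_csv(f) for f in files]
    return pd.concat(frames, ignore_index=True)


def missing_summary(df):
    """Return the number of missing values per column, largest first."""
    return df.isna().sum().sort_values(ascending=False)
```

Calling `missing_summary(load_stations("path/to/extracted/folder"))` gives a quick view of which columns need the most cleaning before any plotting.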

Hypothesis-1:
HNull: An increase in the concentration of O3 reduces the concentrations of CO, NO2, and SO2.
HAlt: An increase in the concentration of O3 does not reduce the concentrations of CO, NO2, and SO2.
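One way to look at Hypothesis-1 is a correlation heatmap of the four gases. The sketch below is an illustration, not necessarily the approach used in the notebook; it assumes `df` is a DataFrame with columns named `O3`, `CO`, `NO2`, and `SO2`, and the function names are hypothetical.

```python
import pandas as pd
import seaborn as sns

GASES = ["O3", "CO", "NO2", "SO2"]


def gas_correlations(df):
    """Pearson correlation matrix of the four gases.

    DataFrame.corr skips missing values pair by pair.
    """
    return df[GASES].corr()


def plot_gas_heatmap(df):
    """Draw an annotated heatmap of the gas correlation matrix."""
    return sns.heatmap(
        gas_correlations(df), annot=True, vmin=-1, vmax=1, cmap="coolwarm"
    )
```

Negative coefficients in the `O3` row would be consistent with the other gases falling as O3 rises, although a correlation alone does not show that one change causes the other.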

Hypothesis-2:
HNull: An increase in TEMP increases DEWP.
HAlt: An increase in TEMP does not increase DEWP.
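Hypothesis-2 can be illustrated with a scatter plot and a fitted line. This is a rough sketch assuming `df` has `TEMP` and `DEWP` columns; the function names, the sample size, and the use of a straight-line fit are assumptions made for illustration.

```python
import numpy as np
import pandas as pd
import seaborn as sns


def temp_dewp_slope(df):
    """Least-squares slope of DEWP against TEMP, skipping rows with missing values."""
    clean = df[["TEMP", "DEWP"]].dropna()
    slope, _intercept = np.polyfit(clean["TEMP"], clean["DEWP"], deg=1)
    return slope


def plot_temp_dewp(df, n=5000, seed=0):
    """Scatter plot with a fitted line, drawn on a random sample to keep it fast."""
    clean = df[["TEMP", "DEWP"]].dropna()
    sample = clean.sample(n=min(n, len(clean)), random_state=seed)
    return sns.regplot(
        data=sample, x="TEMP", y="DEWP",
        scatter_kws={"s": 5, "alpha": 0.3},
        line_kws={"color": "red"},
    )
```

A positive slope would be consistent with DEWP rising along with TEMP.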

Hypothesis-3:
HNull: Summers have a lower amount of toxic gases in the atmosphere than other seasons.
HAlt: Summers do not have a lower amount of toxic gases in the atmosphere than other seasons.
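For Hypothesis-3, a per-season average is a simple starting point. The sketch below assumes `df` has a numeric `month` column, takes the toxic gases to be CO, NO2, SO2, and O3, and uses meteorological seasons (December to February as winter, and so on); all three choices are assumptions, as are the names used.

```python
import pandas as pd

GASES = ["CO", "NO2", "SO2", "O3"]

# Assumed mapping: meteorological seasons for the northern hemisphere.
SEASON_BY_MONTH = {
    12: "winter", 1: "winter", 2: "winter",
    3: "spring", 4: "spring", 5: "spring",
    6: "summer", 7: "summer", 8: "summer",
    9: "autumn", 10: "autumn", 11: "autumn",
}


def seasonal_means(df):
    """Mean concentration of each gas per season (missing values are skipped)."""
    season = df["month"].map(SEASON_BY_MONTH).rename("season")
    return df.groupby(season)[GASES].mean()
```

The resulting table has one row per season, so the `summer` row can be compared directly with the others or passed to a bar plot.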

Hypothesis-4:
HNull: Toxic gas concentrations increase over the years.
HAlt: Toxic gas concentrations do not increase over the years.
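For Hypothesis-4, yearly averages can be compared, with one caveat: because the data runs from March 2013 to February 2017, the first and last calendar years are incomplete and their averages would be skewed by season. The sketch below assumes `df` has `year` and `month` columns and takes the toxic gases to be CO, NO2, SO2, and O3; the function name and the choice to keep only complete years are assumptions.

```python
import pandas as pd

GASES = ["CO", "NO2", "SO2", "O3"]


def yearly_means(df, full_years_only=True):
    """Mean concentration of each gas per calendar year.

    With full_years_only=True, years that do not cover all 12 months
    are dropped so that partial years do not distort the comparison.
    """
    if full_years_only:
        months_per_year = df.groupby("year")["month"].nunique()
        full_years = months_per_year[months_per_year == 12].index
        df = df[df["year"].isin(full_years)]
    return df.groupby("year")[GASES].mean()
```

The result can be plotted as one line per gas to see whether the averages rise from year to year.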

Languages

Language:Jupyter Notebook 100.0%