KrishArul26 / Air-Quality-Index-prediction_with_deployment

India is among the countries with the highest levels of air pollution. Air pollution is generally assessed by particulate matter (PM) concentration or by the air quality index (AQI). This project uses the PM 2.5 value to predict air quality for the Bangalore region of India. The data was collected by web scraping with Beautiful Soup.

Air Quality Index Prediction Using PM 2.5 Value

India is one of the countries with the highest air pollution. Generally, air pollution is assessed by the PM value or the air quality index value. For my analysis, I selected the PM 2.5 value as the basis for air quality prediction and the Bangalore region of India as the study area. The data was collected through web scraping with the help of Beautiful Soup.

Demo of the app:

  • To see the app, please click here

Please enter the values and click the Predict button

Technologies Used

1. IDE - PyCharm
2. Linear Regression Model
3. Ridge and Lasso Regression
4. Support vector regressor (SVR)
5. Extra tree regressor
6. Decision tree regressor
7. Google Colab - ML model training
8. Flask - REST API
9. Postman - API Tester
10. Heroku

Data Collection

  • Air quality data was collected from "http://en.tutiempo.net/climate" for the Bangalore region of India. The independent features are average annual temperature (AT), annual average maximum temperature (TM), average annual minimum temperature (Tm), total annual rain or snow precipitation (PP), annual average wind speed (V), number of days with rain (RA), and number of days with snow (SN). The dependent feature, the PM 2.5 value, was collected from "dhewdhjwdhjw".

  • The dataset covers 2013 to 2018 and can be downloaded here.
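As a rough illustration of the scraping step, the sketch below pulls the rows of an HTML table into Python dictionaries. It is not the project's scraper: the project uses Beautiful Soup, while this sketch swaps in the standard library's `html.parser` so that it runs without extra installs. The sample HTML, its column order, and the names `TableParser` and `parse_climate_table` are assumptions made for illustration; the real page layout may differ.

```python
from html.parser import HTMLParser

# Hypothetical sample of a climate table. The real page layout on
# tutiempo.net may differ, so the columns here are an assumption.
SAMPLE_HTML = """
<table>
  <tr><th>AT</th><th>TM</th><th>Tm</th><th>PP</th><th>V</th></tr>
  <tr><td>23.4</td><td>30.1</td><td>17.2</td><td>0</td><td>6.5</td></tr>
  <tr><td>22.8</td><td>29.3</td><td>16.9</td><td>1.02</td><td>7.8</td></tr>
</table>
"""


class TableParser(HTMLParser):
    """Collect the text of every <th>/<td> cell, grouped by table row."""

    def __init__(self):
        super().__init__()
        self.rows = []          # finished rows
        self._row = None        # row being built, or None outside <tr>
        self._in_cell = False   # True while inside a <td> or <th>

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._in_cell = True
            self._row.append("")

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None

    def handle_data(self, data):
        if self._in_cell:
            self._row[-1] += data.strip()


def parse_climate_table(html):
    """Return one dict per data row, keyed by the header cells."""
    parser = TableParser()
    parser.feed(html)
    header, *body = parser.rows
    return [dict(zip(header, map(float, row))) for row in body]


records = parse_climate_table(SAMPLE_HTML)
print(records[0]["TM"])
```

With Beautiful Soup the same idea is usually expressed by finding the table element and iterating over its rows; the parsing logic above only stands in for that step.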

Data Preprocessing

Preprocessing of the raw data was done in Google Colab. For the EDA, visit here.

  1. Remove unnecessary columns
  2. Feature engineering and selection
       * Correlation analysis
       * Handling outliers: capping using the percentile method (winsorization)
       * Feature importances
  3. Find the null values present in the dataset
  4. Handle the NaN values
  5. Replace missing values with the mean
  6. Dimensionality reduction using PCA
  7. Remove columns with a standard deviation of zero
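Three of the preprocessing steps listed above (mean imputation, percentile capping, and dropping zero-variance columns) can be sketched in plain Python. This is a minimal sketch, not the notebook's code, which most likely uses pandas. The 5th and 95th percentile cut-offs, the function names, and the toy values are assumptions for illustration; PCA is left out.

```python
import math
import statistics


def fill_missing_with_mean(values):
    """Replace None/NaN entries with the mean of the observed entries."""
    observed = [v for v in values if v is not None and not math.isnan(v)]
    mean = statistics.fmean(observed)
    return [mean if v is None or math.isnan(v) else v for v in values]


def percentile(sorted_values, q):
    """Linear-interpolation percentile (q in [0, 100]) of a sorted list."""
    pos = (len(sorted_values) - 1) * q / 100
    lo, hi = math.floor(pos), math.ceil(pos)
    frac = pos - lo
    return sorted_values[lo] * (1 - frac) + sorted_values[hi] * frac


def winsorize(values, lower_q=5, upper_q=95):
    """Cap values outside the chosen percentiles (winsorization)."""
    ordered = sorted(values)
    low, high = percentile(ordered, lower_q), percentile(ordered, upper_q)
    return [min(max(v, low), high) for v in values]


def drop_constant_columns(table):
    """Remove columns whose standard deviation is zero."""
    return {name: col for name, col in table.items()
            if statistics.pstdev(col) > 0}


# Toy data: TM has one missing value and one outlier, SN never varies.
raw = {
    "TM": [30.0, 31.0, None, 29.0, 90.0],
    "SN": [0.0, 0.0, 0.0, 0.0, 0.0],
}

filled = {name: fill_missing_with_mean(col) for name, col in raw.items()}
capped = {name: winsorize(col) for name, col in filled.items()}
cleaned = drop_constant_columns(capped)
print(cleaned)
```

In this toy run the missing TM value is replaced by the mean of the observed values, the outlier 90.0 is pulled down to the 95th percentile, and the constant SN column is dropped.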

🔑 Prerequisites

  • All the dependencies and required libraries are included in the file requirements.txt

🚀 Installation

  1. Clone the repo
git clone https://github.com/KrishArul26/End-to-End-Deployment-Air-Quality-Index-prediction.git

  2. Change your directory to the cloned repo
cd End-to-End-Deployment-Air-Quality-Index-prediction

  3. Create a Python virtual environment named 'AQI' and activate it (the activate command below is for Windows; on Linux/macOS use: source AQI/bin/activate)

 pip install virtualenv

 virtualenv AQI

 AQI\Scripts\activate

  4. Now, run the following command in your Terminal/Command Prompt to install the required libraries
pip install -r requirements.txt

💡 Working

  1. Open a terminal, go into the cloned project directory, and run the following command:
python app.py
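Once the app is running, the API can be exercised from Python in much the same way as from Postman. The sketch below only builds the request and does not send it. Everything specific in it is an assumption: the address is Flask's default development address, the `/predict` route and the JSON body are hypothetical, and the feature values are made up. Check app.py for the actual route and input format.

```python
import json
import urllib.request

# Assumptions: Flask's default development address, a hypothetical /predict
# route, and a JSON body keyed by the feature abbreviations used in
# the Data Collection section.
BASE_URL = "http://127.0.0.1:5000"


def build_predict_request(features, base_url=BASE_URL):
    """Build (but do not send) a POST request carrying the features as JSON."""
    body = json.dumps(features).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/predict",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Made-up feature values, for illustration only.
sample = {"AT": 24.1, "TM": 30.2, "Tm": 18.5, "PP": 0.0, "V": 6.7, "RA": 0, "SN": 0}
request = build_predict_request(sample)
print(request.get_method(), request.full_url)

# With the app running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```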

🔑 Results

  • For this project, the support vector regressor (SVR), linear regressor, extra tree regressor, decision tree regressor, and XGBoost regressor were applied. After hyperparameter tuning, each algorithm was evaluated with MAE, MSE, and RMSE. Among them, the extra tree regressor has the lowest MAE, so it was used for further analysis.
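For reference, the three evaluation parameters can be computed as in the sketch below. It uses plain Python rather than the scikit-learn helpers the notebooks probably rely on, and the PM 2.5 values are made up for illustration, not taken from the project.

```python
import math


def mean_absolute_error(y_true, y_pred):
    """MAE: average of the absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mean_squared_error(y_true, y_pred):
    """MSE: average of the squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def root_mean_squared_error(y_true, y_pred):
    """RMSE: square root of the MSE, in the same units as the target."""
    return math.sqrt(mean_squared_error(y_true, y_pred))


# Made-up PM 2.5 values, for illustration only.
y_true = [50.0, 80.0, 120.0]
y_pred = [60.0, 70.0, 150.0]

mae = mean_absolute_error(y_true, y_pred)
mse = mean_squared_error(y_true, y_pred)
rmse = root_mean_squared_error(y_true, y_pred)
print(f"MAE={mae:.3f} MSE={mse:.3f} RMSE={rmse:.3f}")
```

Because MSE squares each error, a single large miss (the 30-unit error here) weighs more heavily on MSE and RMSE than on MAE.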

Linear Regressor: Open In Colab

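Evaluation Metrics

  |------------------------|----------|
  | Evaluation Parameter   | Value    |
  |------------------------|----------|
  | MAE                    | 43.505   |
  | MSE                    | 3335.414 |
  | RMSE                   | 57.753   |
  |------------------------|----------|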

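Support vector regressor (SVR): Open In Colab

Evaluation Metrics

  |------------------------|----------|
  | Evaluation Parameter   | Value    |
  |------------------------|----------|
  | MAE                    | 40.780   |
  | MSE                    | 3277.271 |
  | RMSE                   | 57.247   |
  |------------------------|----------|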

Extra tree regressor: Open In Colab

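Evaluation Metrics

  |------------------------|----------|
  | Evaluation Parameter   | Value    |
  |------------------------|----------|
  | MAE                    | 19.348   |
  | MSE                    | 1185.348 |
  | RMSE                   | 34.429   |
  |------------------------|----------|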

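Decision tree regressor: Open In Colab

Evaluation Metrics

  |------------------------|----------|
  | Evaluation Parameter   | Value    |
  |------------------------|----------|
  | MAE                    | 26.92    |
  | MSE                    | 2440.952 |
  | RMSE                   | 49.406   |
  |------------------------|----------|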

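🔑 Comparison

  |----------------------------------|------------------------|----------|
  |                                  | Evaluation Parameter   | Value    |
  |----------------------------------|------------------------|----------|
  |  Linear Regressor                |       MAE              | 43.505   |
  |                                  |       MSE              | 3335.414 |
  |                                  |       RMSE             | 57.753   |
  |----------------------------------|------------------------|----------|
  |  Support vector regressor (SVR)  |       MAE              | 40.780   |
  |                                  |       MSE              | 3277.271 |
  |                                  |       RMSE             | 57.247   |
  |----------------------------------|------------------------|----------|
  |  Extra tree regressor            |       MAE              | 19.348   |
  |                                  |       MSE              | 1185.348 |
  |                                  |       RMSE             | 34.429   |
  |----------------------------------|------------------------|----------|
  |  Decision tree regressor         |       MAE              | 26.92    |
  |                                  |       MSE              | 2440.952 |
  |                                  |       RMSE             | 49.406   |
  |----------------------------------|------------------------|----------|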
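The model choice can be reproduced from the reported numbers. The short sketch below copies the metrics from the Results section, picks the model with the lowest MAE, and checks that each reported RMSE matches the square root of its MSE. The dictionary layout and variable names are just one convenient way to hold the values.

```python
import math

# Metrics copied from the Results section: (MAE, MSE, RMSE)
reported = {
    "Linear Regressor": (43.505, 3335.414, 57.753),
    "Support vector regressor (SVR)": (40.780, 3277.271, 57.247),
    "Extra tree regressor": (19.348, 1185.348, 34.429),
    "Decision tree regressor": (26.92, 2440.952, 49.406),
}

# Select the model with the lowest MAE (index 0 of each tuple).
best = min(reported, key=lambda name: reported[name][0])
print(best)  # Extra tree regressor

# Sanity check: RMSE should equal sqrt(MSE) up to rounding.
consistent = all(
    math.isclose(math.sqrt(mse), rmse, abs_tol=0.01)
    for _, mse, rmse in reported.values()
)
print(consistent)  # True
```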

And it's done!

Feel free to mail me with any doubts or queries: ✉️ ragavan.arul26@gmail.com
