The goal of this project is to perform an ARIMA analysis and forecast temperature values in Delhi for the next five years.
https://github.com/sidiquegithub/ML-MODEL-TIME-SERIES-ANALYSIS/blob/main/CODE/MODEL.ipynb
The dataset contains the average daily temperature for cities across the world from 1995 to 2020.

Region | Country | State | City | Month | Day | Year | AvgTemperature |
---|---|---|---|---|---|---|---|
Africa | Algeria | | Algiers | 1 | 1 | 1995 | 64.2 |
Africa | Algeria | | Algiers | 1 | 2 | 1995 | 49.4 |
Africa | Algeria | | Algiers | 1 | 3 | 1995 | 48.8 |

- Created a new data frame containing only the Delhi records

Region | Country | State | City | Month | Day | Year | AvgTemperature |
---|---|---|---|---|---|---|---|
Asia | India | | Delhi | 1 | 1 | 1995 | 50.7 |
Asia | India | | Delhi | 1 | 2 | 1995 | 52.1 |
Asia | India | | Delhi | 1 | 3 | 1995 | 53.8 |
Asia | India | | Delhi | 1 | 4 | 1995 | 53.7 |
Asia | India | | Delhi | 1 | 5 | 1995 | 54.5 |
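
The Delhi subset can be built along the following lines. This is a minimal sketch, not the notebook's actual code: the small `df` frame is a stand-in for the full dataset, and the `Date` index is an assumption added here for convenience in later time-series steps.

```python
import pandas as pd

# Tiny stand-in for the full dataset; column names follow the table above.
df = pd.DataFrame({
    "Region": ["Africa", "Asia", "Asia"],
    "Country": ["Algeria", "India", "India"],
    "State": [None, None, None],
    "City": ["Algiers", "Delhi", "Delhi"],
    "Month": [1, 1, 1],
    "Day": [1, 1, 2],
    "Year": [1995, 1995, 1995],
    "AvgTemperature": [64.2, 50.7, 52.1],
})

# Keep only the Delhi rows; copy so the original frame is left untouched.
delhi = df[df["City"] == "Delhi"].copy()

# Assemble a datetime from the Year/Month/Day columns
# (pd.to_datetime accepts a frame with year, month and day columns).
date_parts = delhi[["Year", "Month", "Day"]].rename(columns=str.lower)
delhi["Date"] = pd.to_datetime(date_parts)

# Index by date (an assumption of this sketch) and keep chronological order.
delhi = delhi.set_index("Date").sort_index()

print(delhi["AvgTemperature"])
```

Indexing by date is optional, but it tends to make plotting and forecasting steps easier to write.
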
- Trends
- Box Plot
- Skewness
The dataset exhibited a high degree of skewness, caused by an outlier value of -99.0.
After the outlier was replaced with NaN, the distribution became symmetric. The NaN values were then replaced with the dataset's mean temperature.
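
The cleaning step described above could look roughly like this. It is a sketch on made-up values: the `temps` series is a synthetic stand-in for the Delhi temperatures, and the variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the Delhi temperatures, with two -99.0 entries mixed in.
temps = pd.Series([50.7, 52.1, -99.0, 53.8, 53.7, 54.5, -99.0, 55.0, 51.9, 52.8])

# Skewness before cleaning: the -99.0 entries drag the left tail far out.
skew_before = temps.skew()

# Treat -99.0 as missing rather than as a real reading.
cleaned = temps.replace(-99.0, np.nan)

# Fill the gaps with the mean of the remaining valid readings.
cleaned = cleaned.fillna(cleaned.mean())

skew_after = cleaned.skew()
print(f"skewness before: {skew_before:.2f}, after: {skew_after:.2f}")
```

Replacing with NaN first matters: it keeps the -99.0 entries from contaminating the mean that is later used to fill them.
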
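
Augmented Dickey-Fuller (ADF) test results:

- ADF test statistic: -8.786101777647525
- p-value: 2.3064322634278694e-14
- Lags used: 38
- Number of observations used: 9226

The p-value is far below the usual 0.05 threshold, so H0 (the series has a unit root) is rejected: the series is stationary.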
The Auto ARIMA method was used for model selection, and it identified SARIMAX(3, 0, 1) as the best fit. The order (3, 0, 1) means three autoregressive (AR) terms, no differencing (d = 0), and one moving-average (MA) term. The model is reported under the SARIMAX name, but no seasonal order is given, so only the non-seasonal terms are specified here. The absence of differencing is consistent with the ADF test result, which indicates the series is already stationary.