EDA of Historical Storm Data of North America
Authors: Divya Rajendran, Ethan Violette, Pramod Duvvuri, Wenjuan Sang
R is required to run the code for this project. The project can be run in RStudio, which is available from the RStudio website.
install.packages("knitr")
install.packages("ggplot2")
install.packages("broom")
install.packages("dplyr")
install.packages("ggmap")
install.packages("htmlwidgets")
install.packages("MASS")
install.packages("gridExtra")
install.packages("devtools")
devtools::install_github("dkahle/ggmap")
devtools::install_github("hadley/ggplot2")
File: Project_Code.Rmd
File: EDA_Report.pdf
With hurricanes at the forefront of the news over the past couple of years, the idea of an increase in the number of dangerous storms is fairly frightening. We used past data on tropical and subtropical storms to check whether such an increase is actually occurring and to examine trends in storm frequency for each storm category.
For our data set, we used the Department of Homeland Security’s Storm Tracking data (available here).
We followed these steps in our EDA:
- Plots of univariate, bivariate, and trivariate relationships between attributes such as basin, category, location, and frequency over the year attribute.
- Univariate and bivariate plots of the relationship between pressure and wind for different categories of storms.
- Comparison of residuals and fitted values for a Poisson model.
- Plot of predictions for the expected frequency of storms.
- Heat map of the frequency of storms in different basins.
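The model-related steps above can be sketched roughly as follows. This is a minimal illustration on simulated counts, not the project's code: the data frame `storms_sim`, its column names, the basin and category labels, and the simulation parameters are all assumptions made for the example.

```r
# Simulated yearly storm counts; every name and value here is a placeholder,
# not taken from the project's data.
set.seed(42)
storms_sim <- expand.grid(
  year     = 1950:2015,
  basin    = c("basin_a", "basin_b"),
  category = c("cat_low", "cat_high")
)

# Log-rate used only to generate the fake counts (illustrative values).
log_rate <- 1 + 0.005 * (storms_sim$year - 1950) +
  0.5 * (storms_sim$basin == "basin_b") -
  0.7 * (storms_sim$category == "cat_high")
storms_sim$count <- rpois(nrow(storms_sim), lambda = exp(log_rate))

# Poisson regression of count on year, basin, and category.
fit <- glm(count ~ year + basin + category, family = poisson, data = storms_sim)

# Fitted values and deviance residuals for a residual-vs-fitted check.
diag_df <- data.frame(
  fitted   = fitted(fit),
  residual = residuals(fit, type = "deviance")
)
plot(diag_df$fitted, diag_df$residual,
     xlab = "Fitted count", ylab = "Deviance residual")
abline(h = 0, lty = 2)

# Expected storm frequency for later years, on the count scale.
new_years <- expand.grid(year = 2016:2020, basin = "basin_a", category = "cat_low")
new_years$expected <- predict(fit, newdata = new_years, type = "response")

# Basin-by-year table of counts: the kind of input a heat map would use.
heat_counts <- xtabs(count ~ basin + year, data = storms_sim)
```

Base R is used here to keep the sketch self-contained; the same quantities could be plotted with ggplot2, which the project installs.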

- It appears that the number of storms is indeed increasing at a small rate (0.08%) each year; this can be predicted with some accuracy using the features Basin of Origination, Category of Storm, and (of course) Year.
- Though we had sufficient tools at our disposal to conclude this using a fitted Poisson model, our predictive accuracy is limited by a lack of additional features.
- Given more time, we would have merged this dataset with another that contained information over time about weather, the number of man-made influencers of climate change, and other useful predictors.
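For reference, a yearly percentage change like the one quoted above is typically read off a Poisson model by exponentiating the coefficient on year, assuming the default log link. The sketch below shows only that arithmetic; the coefficient value is a made-up placeholder, not the project's estimate.

```r
# Hypothetical coefficient on year from a log-link Poisson model
# (placeholder value, not the project's fitted estimate).
beta_year <- 0.01

# With a log link, one extra year multiplies the expected count by exp(beta).
rate_ratio <- exp(beta_year)

# Express the multiplicative change as a percentage per year.
pct_change_per_year <- (rate_ratio - 1) * 100
```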