Lupercio421 / STAT_790_Case_Seminar

Project Seminar course for the MA program. More Details on ReadMe below.

STAT_790_Case_Seminar

Hey everyone! During the Spring '22 semester, I enrolled in the Project Seminar course for the B.A./M.A. program. Using data from the NYC Department of Sanitation, I analyzed the waste tonnage collected throughout the five boroughs. I imported the data into R through an API, using the RSocrata package.
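For anyone who wants to try the import step, here is a minimal sketch of how it might look with RSocrata. The endpoint contains a placeholder dataset ID (`xxxx-xxxx`) and the helper name `load_tonnage` is hypothetical; neither is taken from this repo, so swap in the resource URL shown on the dataset's page before running it.

```r
# Sketch only: replace the placeholder ID with the real dataset's resource ID.
tonnage_url <- "https://data.cityofnewyork.us/resource/xxxx-xxxx.json"

# Hypothetical helper that wraps RSocrata::read.socrata().
load_tonnage <- function(url = tonnage_url, app_token = NULL) {
  if (!requireNamespace("RSocrata", quietly = TRUE)) {
    stop("Install RSocrata first: install.packages('RSocrata')")
  }
  # read.socrata() requests the data from the Socrata API and
  # returns the result as a data frame.
  RSocrata::read.socrata(url, app_token = app_token)
}

# Usage (needs network access and a valid URL):
# tonnage <- load_tonnage()
# str(tonnage)
```

An app token is optional here; passing one may help if requests get throttled.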

Completed tasks for this project:

  • The challenge I took on was to build a univariate time series model to analyze the total waste collected in each of the five boroughs
    • Each series was modeled with a seasonal ARIMA model
    • A common pattern across the models was fitting the differenced series with non-seasonal MA terms and seasonal AR terms
  • A preliminary multiple linear regression model was fit, with the total tonnage collected in NYC per month regressed on external variables
    • This model returned an adjusted R-squared of 0.41
  • A dynamic regression model was introduced
    • This model allows the errors from a regression model to contain autocorrelation
    • These models have two error terms: the error from the regression model, denoted η_t, and the error from the ARIMA model, denoted ε_t
    • Only the ARIMA model errors are assumed to be white noise
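
The seasonal ARIMA pattern described in the first task can be sketched with base R alone. This is an illustration on simulated data, not the models from the paper: the series, the ARIMA(0,1,1)(1,1,0)[12] order, and the variable names are all assumptions chosen to show a differenced series with one non-seasonal MA term and one seasonal AR term.

```r
set.seed(790)

# Simulated stand-in for one borough's monthly tonnage, in thousands of tons
# (NOT the DSNY data): 10 years with a trend, a yearly cycle, and noise.
n <- 120
t_idx <- seq_len(n)
tonnage <- ts(
  50 + 0.04 * t_idx + 3 * sin(2 * pi * t_idx / 12) + rnorm(n, sd = 0.8),
  start = c(2012, 1), frequency = 12
)

# Seasonal ARIMA(0,1,1)(1,1,0)[12]: one regular and one seasonal difference,
# a non-seasonal MA(1) term, and a seasonal AR(1) term.
fit <- arima(tonnage, order = c(0, 1, 1),
             seasonal = list(order = c(1, 1, 0), period = 12))
print(fit)

# Ljung-Box test on the residuals; a large p-value is consistent with
# white-noise residuals. fitdf = 2 accounts for the two estimated terms.
lb <- Box.test(residuals(fit), lag = 24, type = "Ljung-Box", fitdf = 2)
print(lb)

# Forecast the next 12 months.
fc <- predict(fit, n.ahead = 12)
print(fc$pred)
```

In practice the order would be chosen per borough, for example by comparing AIC values and inspecting the ACF and PACF of the differenced series.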

Predictors that I considered:

Resources that helped me complete this project:

The final paper can be read here.

The code for the Manhattan time series research is in the MN_ts.Rmd file. The code for the multiple linear regression and the dynamic regression is in the thirteenth_meeting_notes.Rmd and dynamic_regression_attempt.Rmd files, respectively.
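
For a feel of the two regression steps before opening those files, here is a minimal sketch on simulated data. It is not the code from the Rmd files: the predictors `temp` and `precip`, the AR(1) error structure, and every number are hypothetical, picked only to show how a regression with ARIMA errors separates the regression errors η_t from the ARIMA errors ε_t.

```r
set.seed(22)

# Simulated stand-ins (hypothetical, not the DSNY data or the real predictors).
n <- 120
temp   <- 55 + 20 * sin(2 * pi * seq_len(n) / 12) + rnorm(n, sd = 3)
precip <- rgamma(n, shape = 4, scale = 1)

# Regression errors eta_t follow an AR(1) process driven by white noise eps_t.
eps <- rnorm(n, sd = 2)
eta <- as.numeric(stats::filter(eps, filter = 0.6, method = "recursive"))
y <- ts(100 + 0.5 * temp - 1.5 * precip + eta,
        start = c(2012, 1), frequency = 12)

# Step 1: ordinary multiple linear regression, which ignores autocorrelation.
ols <- lm(as.numeric(y) ~ temp + precip)
cat("Adjusted R-squared:", summary(ols)$adj.r.squared, "\n")

# Step 2: dynamic regression, the same regression with ARIMA(1,0,0) errors.
X <- cbind(temp = temp, precip = precip)
dyn <- arima(y, order = c(1, 0, 0), xreg = X)
print(dyn)

# eta_hat: estimated regression errors (allowed to be autocorrelated).
# eps_hat: estimated ARIMA errors (the innovations).
eta_hat <- as.numeric(y) - coef(dyn)["intercept"] -
  drop(X %*% coef(dyn)[c("temp", "precip")])
eps_hat <- as.numeric(residuals(dyn))

# Only eps_hat is expected to look like white noise.
print(Box.test(eta_hat, lag = 12, type = "Ljung-Box"))
print(Box.test(eps_hat, lag = 12, type = "Ljung-Box", fitdf = 1))
```

The Ljung-Box test on `eta_hat` should tend to reject white noise while the test on `eps_hat` should not, which is the behavior the dynamic regression bullets describe.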

Languages

HTML 98.9%, TeX 0.8%, R 0.3%