JestonBlu / Unemployment

Masters Project: Forecasting Unemployment

Modeling

JestonBlu opened this issue · comments

Use this thread to discuss modeling and forecasting

I added a new file to RScripts/ that imports the raw data, centers a few of the variables, and exports an .Rda file to Data/. Whoever is going to be playing with various models should just be able to use the command below and it will import the data set with no missing information (Jan 1993 - Oct 2015). Feel free to do some of your own transformations; I just put it there to get started.

load("Data/Data_Prep.Rda")

Travis proposed model from the presentation discussion

We can basically follow the steps of Example 3.46 in the text.

I took the second difference (d = 2), as Joseph suggested, then the first seasonal difference (D = 1) with s = 12 (this is common for monthly economic data), as the book did. Below is a plot of the transformed series. It looks pretty stationary (not perfect, but adequate), and we can confirm this with the ADF test (it's cited in other time series texts, but I haven't seen it in ours yet).

image
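For anyone who wants to reproduce this, here is a rough sketch of the transformation and the ADF test, assuming the series is stored as a monthly ts object named `unem` (the name used in the model code later in this thread):

```r
# Second regular difference, then one seasonal difference (s = 12)
library(tseries)                   # provides adf.test()

d2 <- diff(unem, differences = 2)  # d = 2
dd <- diff(d2, lag = 12)           # D = 1, S = 12

plot(dd)                           # should look roughly stationary
adf.test(dd)                       # a small p-value supports stationarity
```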

After that, the book suggests that you examine the ACF and PACF plots.

First, the book says to look at the seasonal lags in the ACF and PACF (h = 12, 24, 36, ...). These seem to indicate that the ACF tails off, and the PACF cuts off after one year (h = 12). This suggests that we let P = 1 and Q = 0.

Next, the book says to look at the ACF and PACF within only the first season (h = 1, 2, ..., 12). The PACF declines slowly, but the ACF cuts off after 1, suggesting we let p = 0 and q = 1.

image

image

When we put this all together, we get a SARIMA(0, 2, 1) x (1, 1, 0) model with s = 12. I fit that model, and here are the diagnostic plots:

image

And here are the parameter estimates:

Coefficients:
          ma1     sar1  constant
      -0.8322  -0.4868    0.0348
s.e.   0.0355   0.0536    6.7411

sigma^2 estimated as 0.03703:  log likelihood = 54.87,  aic = -101.75
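These estimates can be reproduced with astsa's sarima() (a sketch, again assuming `unem` holds the unemployment series):

```r
library(astsa)  # sarima()

# SARIMA(0, 2, 1) x (1, 1, 0) with s = 12
fit <- sarima(unem, p = 0, d = 2, q = 1, P = 1, D = 1, Q = 0, S = 12)
fit$AIC; fit$BIC  # fit indices for comparing against other candidates
```

sarima() also draws the standardized-residual, Q-Q, residual-ACF, and Ljung-Box plots automatically, which is where the diagnostics above come from.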

Overall, I think the diagnostics look good. The standardized residual plot isn't great, but it isn't terrible. The normal Q-Q plot and the ACF of residuals look pretty good. The Ljung-Box p-values are not amazing, but they at least stay above the line until H = 20, which I believe the book says is a decent cutoff.

Also, here is the mathematical representation of our model, which we will probably need at some point.

image

Please check everything I did. I am sure I messed up somewhere. I will post my R code and plots so everyone can check. Also, we should come up with a couple of other models to test, so if you interpret the ACF and PACF differently, that is great. I am not intending this to be the final model, just a starting point. Hopefully, some of this can go toward next week's presentation.

I am happy to add things to Overleaf, but I don't know how, and I am only familiar with basic LaTeX in the context of Word's equation editor.

Travis, thank you for building this model, which can be the foundation for our further work.

I built two additional models, and here are the results.

Looking at the diagnostic results from the model proposed by Travis, I feel there might still be some non-stationarity after the second-order difference. I also tried a third-order difference, but it didn't work well since it introduced more variability, so I stuck with the second-order difference as well.

image

I then looked at the ACF and PACF plots to build models. First, we can look at the seasonal pattern. The ACF seems to tail off and the PACF cuts off at either 1 or 3. Together, the plots suggest a seasonal AR(1) or AR(3).

We can then inspect the plots at the within-season lags, h = 1, ..., 11. One perspective is that the ACF cuts off after 1 and the PACF tails off, which indicates MA(1). Another perspective is that the ACF cuts off after 1 and the PACF cuts off after 4. In this situation, the book suggests building a SARMA of orders p = 4 and q = 1. However, our professor points out that this is bad reasoning. It is still tempting for me to try this model, since I lean toward the view that the PACF cuts off after lag 4 rather than tailing off (some subjective feeling goes in here). Right now we don't know how to handle the situation where both the ACF and PACF cut off at a certain lag, other than the approach mentioned in the book, so I tried this model. As you will see, it does better on some diagnostic criteria. Altogether, I propose two additional models and compare them to the one proposed by Travis.

image

model1 <- sarima(unem, p = 0, d = 2, q = 1, P = 1, D = 1, Q = 0, S = 12)
model2 <- sarima(unem, p = 0, d = 2, q = 1, P = 3, D = 1, Q = 0, S = 12)
model3 <- sarima(unem, p = 4, d = 2, q = 1, P = 3, D = 1, Q = 0, S = 12)

model1$AIC; model1$BIC
[1] -2.283108
[1] -3.244063

model2$AIC; model2$BIC
[1] -2.444084
[1] -3.379008

model3$AIC; model3$BIC
[1] -2.44496
[1] -3.327824

Judging by the AIC and BIC values, Models 2 and 3 perform quite similarly, and both show slight evidence of outperforming Model 1.

Then we can compare the three models based on diagnostic plots. The standardized residuals of all models show some evidence of non-white noise. ARMA models do not model changing variability; we will have a few lectures on that topic, so there is not much we can do about this issue for now.

The ACF of the residuals of Model 1 shows a spike at lag 24. The other two models do not show such a spike.

The normal plots from the three models are fairly similar.

The Q-statistic (Ljung-Box statistic):
Models 1 and 2 have similar results. Model 1 seems to perform better at the first few lags, but Model 2 does better after lag 15. Model 3 clearly performs better than the other two on the Q-statistic. Since Model 3 is based on reasoning our professor does not like, we may not present it. However, it at least suggests that models built on the idea that both the ACF and PACF cut off at certain lags might fit our data better. I am not quite sure how to handle this situation. Any thoughts on this would be highly appreciated!

Model 1
image

Model 2
image

Model 3
image

While going through a bunch of models, the following model seems most appropriate, as noted by everyone.

sarima(econ[, 2], 0, 2, 1, 1, 1, 0, 12) with the following diagnostics:

[image: Inline image 2]

The ADF test also suggests stationarity, as follows:

[image: Inline image 3]
Also, I am working on other predictor variables to develop a preliminary regression model.

Regards,
AP


Hi,

I am trying to fit the regression model here and was wondering if I should use the stationary data for my fitting? Any help on this would be appreciated.

Thanks

I believe you would need to use the stationary non-seasonal unemployment rate as your response variable. I do not believe your predictor variables have to be stationary, but it would probably make sense to at least take the seasonality out of them.
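One hedged sketch of how that could be set up, using sarima()'s xreg argument for regression with autocorrelated errors. The variable names here are placeholders, not actual columns from our data:

```r
library(astsa)

# y: stationary unemployment series (e.g., differenced SA rate);
# X: predictors differenced the same way so the dates line up.
# All names below are illustrative only.
y <- diff(unem_sa)
X <- cbind(indprod_d    = diff(indprod),
           houseprice_d = diff(houseprice))

fit <- sarima(y, p = 0, d = 0, q = 1, xreg = X)  # MA(1) errors as an example
fit$AIC; fit$BIC
```

The ARMA orders for the error process would have to be chosen from the residual ACF/PACF, just like for the univariate models.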

Also, can you guys commit your code? If you don't know how, let me know and I will help you through it. Once everyone starts posting code with their models, I will start compiling all of it into a single script so we can easily compare with some graphs.

I can provide my code by tonight or tomorrow morning, if that's okay?

Yeah, no problem.

Checking both residual plots shows a drastic drop that is likely indicative of the 2008 recession. Can we add weighting or something to address this?

In the political dataset I have an indicator variable for recession by month we could try using that.
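If we go that route, one possible sketch (assuming the indicator is a 0/1 monthly vector called `recession`, aligned with the unemployment series):

```r
library(astsa)

# Same SARIMA orders as the proposed model, with the recession dummy
# included as an external regressor. `recession` is assumed to be a
# 0/1 vector by month from the political dataset.
sarima(unem, p = 0, d = 2, q = 1, P = 1, D = 1, Q = 0, S = 12,
       xreg = recession)
```

That would let the recession months shift the level rather than distort the ARMA error structure.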

PFA

Regards,
Akki


I'll go ahead and start putting together what we have so far on Overleaf this evening, if that's OK with everyone.


Sounds good. Let me know if you require the model parameter details and the corresponding AIC values. I did not include those in the attachment, but I provided the code.

Regards,
Akki

I am still unclear about the expectations for this presentation. Does anyone know if we are actually supposed to present a model this week? It looks to me like, unless there is a class Thursday, we won't present until next Monday.


@bopangpsy @pakarshan can you commit the code for your models to the /RScripts folder?

Sure, the code is in the Word doc I attached. I'll put it in the GitHub folder too.

@bopangpsy @pakarshan okay, I'm guessing you dropped it into your local folder, but you didn't actually commit and push your changes. Do you know how to do this?
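In case it helps, the usual sequence from the repo root looks roughly like this (the file name is just an example):

```shell
# Stage the script, record a commit, then push it to GitHub
git add "RScripts/Model Fitting.R"
git commit -m "Add preliminary SARIMA models"
git push origin master
```

After the push, the file should show up for everyone on GitHub.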

Sure, I'll add that soon.

Regarding the expectations for the presentation, the professor has not said anything yet. However, in the last two lectures (14 and 15), he talked a lot about applied examples of model building. I'd assume our presentation will be something similar to what he covered in those two lectures: basically, the model-building process. How do we preprocess our data to obtain a stationary process (difference of order 2 and seasonal difference of order 1, in our case)? How do we identify the model (based on the ACF and PACF)? What is the set of candidate models? How do we choose the best one (AIC, BIC, diagnostics)? I guess we might not need to present a regression model at this stage since he hasn't talked much about it. What do you guys think?


"Best model," as far as I know, is pretty ambiguous right now. With what I have done before, I checked AIC and BIC (not really thinking about using R-squared for the time being). Model identification from the ACF/PACF is outlined in the text: check the tail behavior to see whether it decays asymptotically or cuts off. We should be checking inside the band for "cutoff" behavior.

I actually missed today's live lecture since I had an engineering final to take; I'll relay other questions to him tomorrow.

Sure, it's always hard to call a model "best." I think in the presentations we may present several potential candidate models and compare them from several perspectives. Hopefully, one model will gain relatively more evidence.

I just uploaded my code for these preliminary models I played with.


The code is good enough. Thank you.


I would like to propose an additional model. I have gone through the same exercise as @trlilley12 and @bopangpsy, except that I used the seasonally adjusted unemployment rate. It looks like the performance is definitely comparable to the seasonal models. I used sarima() for the nice diagnostic plot it creates, but I left the seasonal parameters out.

I committed a script here RScripts/seasonally_adjusted.R

I get an AICc = -2.672 and BIC = -3.565

non-seasonal-diagnostic

#### Model Comparison
## 
## Model 1: {AIC: -2.617} {BIC: -3.578} *** Best BIC
## Model 2: {AIC: -2.613} {BIC: -3.495}
## Model 3: {AIC: -2.672} {BIC: -3.565} *** Best AIC
##
#### Model 3 Pvalues
##
##                             Estimate     SE  t.value Pvalue
## ar1                          -0.2176 0.0672  -3.2387   .001 ***
## ma1                          -0.8835 0.0411 -21.4938  <.001 ***
## intercept                     0.0001 0.0009   0.1447   .886
## industrial_production_sa     -0.0500 0.0132  -3.7763  <.001 ***
## manufacturers_new_orders_sa  -0.0005 0.0007  -0.6516   .523
## house_price_sa               -0.0413 0.0122  -3.3765  <.001 ***
## construction_spend_sa         0.0120 0.0067   1.7902   .091 
## retail_sales_sa               0.0027 0.0013   2.1645   .044 ***

Cool, Joseph! This model is simple and performs pretty well in terms of both fit indices and diagnostics.


Thanks. One thing I am wondering about in the preliminary models you guys created is the differencing: what is the impact of doing one or two regular differences and then a lag-12 difference? That may make interpretation a little difficult. Do you have any references or thoughts on differencing that way? Did you try doing the lag difference first? Maybe like this:

diff(diff(unem, lag = 12), differences = 2)
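One reassuring note: the regular and seasonal difference operators commute as polynomials in the backshift operator, so either ordering should produce the same series. A quick check (with `unem` as above):

```r
# (1 - B)^2 (1 - B^12) x  ==  (1 - B^12) (1 - B)^2 x
a <- diff(diff(unem, lag = 12), differences = 2)
b <- diff(diff(unem, differences = 2), lag = 12)
all.equal(a, b)   # the two orderings should agree
```

So the choice mainly affects how we narrate the transformation, not the fitted values.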

I also used the seasonally adjusted unemployment rate when considering the models. Also, I have posted my script on GitHub.

@pakarshan I'm not seeing your code. You have to commit and push for it to show up for the rest of us. If you have done that, can you specify which script file you are referring to?

Model Fitting.R is the name. I put it online using the upload feature.

Okay, I see it. Can you explain the thought behind fitting a seasonal parameter to the seasonally adjusted data? I just switched it to 0, but it looks like it doesn't make a difference in the output. Also, using the additional variables does look like it improves the model slightly. Did you try playing with the lags to see if any of the explanatory variables can be used as leading variables? As a side note, in my last commit I added a recession indicator. If you use load("Data/data_prep.rda"), you shouldn't have to do all of the data prep in your first several steps.

I have created a script, RScripts/All_Final_Models.R, to combine everyone's currently proposed models into a single place. I grouped them by seasonal vs. seasonally adjusted data and created this table to show the model differences and relative performance. I also have the LaTeX equivalent pasted below in case we want to put it into Beamer (hopefully it's compatible). I would still like to see us play with the additional variables a bit and see if we can find the appropriate lags to improve the models further, since sarima allows you to easily include them.

Please take a look and let me know what you think. There are a few plots in the code which we can use for the presentation, but feel free to add more if you think we are missing something. We do probably need a few more.

Data     Model  Order  Seasonal.Order  Xregs  AIC        BIC
Unem     Mdl.1  0,2,1  1,1,0           N      -2.274336  -3.234984
Unem     Mdl.2  0,2,1  3,1,0           N      -2.435558  -3.369972
Unem     Mdl.3  4,2,1  3,1,0           N      -2.437146  -3.319090
Unem.sa  Mdl.4  0,2,1  1,0,0           N      -2.606460  -3.580226
Unem.sa  Mdl.5  1,2,1  NA              N      -2.625197  -3.598962
Unem.sa  Mdl.6  0,2,1  1,0,0           Y      -2.576905  -3.485083
Unem.sa  Mdl.7  1,2,1  NA              Y      -2.595392  -3.490453
% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Thu Jul  7 17:22:39 2016
\begin{table}[ht]
\centering
\begin{tabular}{rlllllrr}
  \hline
 & Data & Model & Order & Seasonal.Order & Xregs & AIC & BIC \\ 
  \hline
1 & Unem & Mdl.1 & 0,2,1 & 1,1,0 & N & -2.27 & -3.23 \\ 
  2 & Unem & Mdl.2 & 0,2,1 & 3,1,0 & N & -2.44 & -3.37 \\ 
  3 & Unem & Mdl.3 & 4,2,1 & 3,1,0 & N & -2.44 & -3.32 \\ 
  4 & Unem.sa & Mdl.4 & 0,2,1 & 1,0,0 & N & -2.61 & -3.58 \\ 
  5 & Unem.sa & Mdl.5 & 1,2,1 &  & N & -2.63 & -3.60 \\ 
  6 & Unem.sa & Mdl.6 & 0,2,1 & 1,0,0 & Y & -2.58 & -3.49 \\ 
  7 & Unem.sa & Mdl.7 & 1,2,1 &  & Y & -2.60 & -3.49 \\ 
   \hline
\end{tabular}
\end{table}
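As a sanity check on the table above, the best-by-criterion models can be picked out programmatically. A small Python sketch (the AIC/BIC values are copied straight from the table):

```python
# AIC/BIC values copied from the model comparison table above
models = {
    "Mdl.1": {"AIC": -2.274336, "BIC": -3.234984},
    "Mdl.2": {"AIC": -2.435558, "BIC": -3.369972},
    "Mdl.3": {"AIC": -2.437146, "BIC": -3.319090},
    "Mdl.4": {"AIC": -2.606460, "BIC": -3.580226},
    "Mdl.5": {"AIC": -2.625197, "BIC": -3.598962},
    "Mdl.6": {"AIC": -2.576905, "BIC": -3.485083},
    "Mdl.7": {"AIC": -2.595392, "BIC": -3.490453},
}

best_aic = min(models, key=lambda m: models[m]["AIC"])
best_bic = min(models, key=lambda m: models[m]["BIC"])
print(best_aic, best_bic)  # Mdl.5 Mdl.5: it wins on both criteria
```

By both criteria the ARIMA(1,2,1) on the seasonally adjusted series (Mdl.5) comes out on top.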

We can probably stop here for now. None of the groups has considered seasonal adjustments yet, and 7 models seems like a bit much at this stage.

I think I'll probably talk about the three best based on AIC, but I want to
be sure that's okay before I write presentation notes for myself.

Looks great guys!

This looks great! Thank you for putting this together!


I put everything done so far into a minimal PDF, but with all 7 models. Please let me know what you want changed.
https://www.overleaf.com/5646560qcxtqg

Here is a newer version; I like this layout better: https://www.overleaf.com/5654811dmsqbs

I have been messing around with the conversations from these discussions and trying to put them into a document. I will put it into an editable format for everyone else as soon as I can (I have somewhere to be this evening, or I would do it now). In the meantime, you can send me things to add to the document in any format: Word, email, text....

I don't mind compiling anything you want included.

main.article.pdf

I have been working a bit more on fitting an ARIMA model with regressors to the seasonally adjusted data. I believe I fixed the issue we were having with the xregs (they needed to be stationary as well). I also lagged the xregs based on the cross-correlation and lag plots, and it looks like the model has improved by the AIC measure. It also looks like a few of the xregs are leading indicators of unemployment. The code is in RScripts/multivariate if you want to play with it. I think I will add this one to the All_Final_Models.r script soon if no one makes improvements on it.

multivariate_sarima

#### Model Comparison
## 
## Model 1: {AIC: -2.617} {BIC: -3.578} *** Best BIC
## Model 2: {AIC: -2.613} {BIC: -3.495}
## Model 3: {AIC: -2.672} {BIC: -3.565} *** Best AIC
##
#### Model 3 Pvalues
##
##                             Estimate     SE  t.value Pvalue
## ar1                          -0.2176 0.0672  -3.2387   .001 ***
## ma1                          -0.8835 0.0411 -21.4938  <.001 ***
## intercept                     0.0001 0.0009   0.1447   .886
## industrial_production_sa     -0.0500 0.0132  -3.7763  <.001 ***
## manufacturers_new_orders_sa  -0.0005 0.0007  -0.6516   .523
## house_price_sa               -0.0413 0.0122  -3.3765  <.001 ***
## construction_spend_sa         0.0120 0.0067   1.7902   .091 
## retail_sales_sa               0.0027 0.0013   2.1645   .044 ***
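The "lag the xregs based on cross-correlation" step described above can be sketched as a search for the lead that maximizes the absolute cross-correlation (a Python toy example here; in the actual scripts this would be R's ccf() and lag plots):

```python
import numpy as np

def best_lead(x, y, max_lag=12):
    # return the lag k (x leading y by k steps) with the largest
    # absolute cross-correlation between x_{t-k} and y_t
    best_k, best_r = 0, 0.0
    for k in range(max_lag + 1):
        r = np.corrcoef(x[:len(x) - k] if k else x, y[k:])[0, 1]
        if abs(r) > abs(best_r):
            best_k, best_r = k, r
    return best_k, best_r

# toy example: y is x shifted forward 3 steps plus a little noise,
# so x should be detected as leading y by 3
rng = np.random.default_rng(1)
x = rng.normal(size=200)
y = np.concatenate([np.zeros(3), x[:-3]]) + 0.1 * rng.normal(size=200)
k, r = best_lead(x, y)
print(k)  # 3
```

The chosen lag is then how far back the corresponding xreg column would be shifted before refitting.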

I am working on citations in the writeup and will upload it soon. Do you
want me to add this into this week's paper or keep it for next time?


If you have time, let's replace the other seasonally adjusted models with the xreg versions; otherwise we can wait.

I can make it happen.


I emailed what I have so far to everyone. I will proofread in the morning. If all is good, I will send this one in and keep revising for the next round.
Group4Presentation2.pdf

For the xregs, I already used the seasonally adjusted stationary xregs.

Right, but what I just proposed was lagging the xregs, which you did not do. Also, we had some differences in our differencing and model parameter choices. Our model diagnostics are different as well; it looks like a lot of the p-values for your Ljung-Box statistic were significant, suggesting error dependence.

Of the models we have discussed so far, I think the ARIMA(1, 2, 1) is best. It had the best diagnostics and the lowest AIC.

I added some predictors to the ARIMA(1, 2, 1), and only retail seemed significant. However, its coefficient is so small that I argue we don't need it.

I then did some forecasting with the ARIMA(1, 2, 1) as well as two ARIMA(1, 2, 1) models with predictors, and compared our predicted values for 2016 unemployment with the actual values:

Jan 2016: actual 5.3 , predicted = 5.0
Feb 2016: actual 5.2 , predicted = 5.0
Mar 2016: actual 5.1 , predicted = 4.9
Apr 2016: actual 4.7 , predicted = 4.9
May 2016: actual 4.5 , predicted = 4.9

Overall, I think the ARIMA(1, 2, 1) is very good.
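To put a number on that comparison, here is a quick error-metric sketch (in Python; the actual and predicted values are copied from the table above):

```python
# actual vs predicted unemployment, Jan-May 2016 (from the comparison above)
actual    = [5.3, 5.2, 5.1, 4.7, 4.5]
predicted = [5.0, 5.0, 4.9, 4.9, 4.9]

errors = [a - p for a, p in zip(actual, predicted)]
mae = sum(abs(e) for e in errors) / len(errors)        # mean absolute error
rmse = (sum(e * e for e in errors) / len(errors)) ** 0.5  # root mean squared error
print(round(mae, 2), round(rmse, 2))  # 0.26 0.27
```

So the out-of-sample forecasts are off by roughly a quarter of a percentage point on average over those five months.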

I uploaded all of my code as "forecasting 7_21_16".

@trlilley12 did you see the model I posted that was also an ARIMA(1,2,1)? I also added some xregs with different lags, and in addition to retail, industrial production and the house price measure came out as significant. The script is in Rscripts/multivariate.R.

Oh, okay, that looks like it lowers the AIC. Did you try the ARIMA(1, 2, 1) with different lags for retail, IPI, and house price (excluding the others)? The AIC might be even lower.

If your model is better, we can definitely go with it.

@trlilley12 yeah, I experimented with different lags, and that seemed to be what worked best... when they were all at the same lag it didn't look as good.

Okay, I prefer the simpler ARIMA(1, 2, 1) with no predictors, since Dr. P prefers simpler models. It had the lowest BIC as well. Can everyone vote on it?

@trlilley12 would you mind adding the AIC/BIC results to your model script as comments? I'm going to add this to the all_final_models.r script, and I want to make sure the diagnostics match up when I run it.

ARIMA(1,2,1) looks good to me. Many people actually prefer BIC over AIC.

By the way, do we need to look around for other candidate models? I plan to do it tomorrow night, since I have another final tomorrow afternoon. Sorry for being late on this issue.


Yes, I think we should look at one or two models outside of the current ARIMA set... maybe VAR or fractional ARIMA. Also, @trlilley12, I am unable to run your code without it erroring out, so I cannot verify your results.

Sounds good. I'll try to develop a few other models.


Would everyone be in favor of ditching the seasonally unadjusted data/models in order to help reduce the number of models we talk about in the writeup? I think it may be a little confusing going through the same steps for the two series. So far, from what everyone has posted, the seasonally adjusted data seems to be performing better than the unadjusted. I vote that we focus our attention on the seasonally adjusted data to try and consolidate all of our iterations. Any thoughts?

I second this motion. We should keep the best models.


I agree. Thank you.


Sounds good to me


I am good too.


I am glad to start doing some forecasting. I did some with the ARIMA(1, 2, 1) seasonally adjusted, no predictors. It's in the RScript "forecasting."

What other potential models are we considering? My only concern is that if we choose a model with predictors, we will have to forecast those predictors before we forecast the unemployment rate.
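For anyone who wants to reproduce the idea quickly, here is a minimal sketch of that kind of forecast using the forecast package's Arima() (the same function used for the plots further down). The series here is a simulated stand-in so the snippet runs on its own; swap in the actual seasonally adjusted unemployment series from Data/.

```r
## Sketch of an ARIMA(1,2,1) forecast, no predictors.
## The ts object below is a simulated stand-in for the seasonally
## adjusted unemployment rate (Jan 1993 - Oct 2015).
library(forecast)

set.seed(1)
unem_rate_sa <- ts(6 + cumsum(rnorm(274, 0, 0.1)),
                   start = c(1993, 1), frequency = 12)

fit <- Arima(unem_rate_sa, order = c(1, 2, 1))  # ARIMA(1,2,1)
fc  <- forecast(fit, h = 24)                    # 24-month-ahead forecast
plot(fc)                                        # forecast shown against past data
```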

I think that we were focusing on the model without predictors, so you are good.


By the way, the comments in the scripts are very helpful.

In case we go with the ARIMA(1, 2, 1) model for the seasonally adjusted data with no predictors, here are some forecast plots. I uploaded them in the Plots folder, too.

The graphs are for the h = 5, 12, and 24 step ahead forecasts. The first three were generated by sarima( ), and the last three by Arima( ). Personally, I think the last three look better. I think it's good to have a picture of the forecast in the context of all the data. I will play around with sarima( ) to see if I can adjust the default axes to accommodate all past data.

sarima h 5
sarima h 12
sarima h 24
arima h 5
arima h 12
arima h 24

And here is a plot of the first five forecasted values (red) along with the actual observed values (black) from 2016.

sarima h 5 predicted and actual values

I looked at the FRED website where we got our data, and it looks like the unemployment rate for June 2016 has been posted at 5.1%. We could compare that to our prediction for June 2016 as well.

Here is a plot from Arima( ) that shows the predicted values through June 2016 (blue) and the observed values (black).

I put all the code for my plots in the RScripts folder and named it "forecasting plots."

arima h 6 predicted and actual

Very nice


For models without predictors, I had also attached a Word document containing the forecasts. I will look into a couple more.


I added Word documents into a new folder with the model output. Please let me know your opinions on the final models so I can get them into the write-up.

Also, there is a lot in the literature about VAR models so I like the idea of comparing the VAR models to the ARIMA ones.

Also, a lot of the literature discusses using new unemployment claims as a predictor for unemployment rate, but I think that may be a little too close to the actual data. What do you think?

What about presidential cycles? Do we want to look at those too, and at how unemployment has risen and dropped across them? The literature also mentions that there tend to be cycles where unemployment rises sharply and then recovers slowly. Visually, that seems to correspond with presidential changes. Do we want to see if we can model that mathematically, or is that too much?

I have built a few VAR models that we can use to compare against the currently favored ARIMA models. I have also cleaned up the All_Final_Models.r script and removed all of the seasonally unadjusted data and models. I will post about that next, but here is what I have found for the VAR models. First, I think it was very fun to play with the vars package. It has a lot of functionality and many different plots that can be called.

I ended up fitting 6 models in total: VAR(1), VAR(2), and VAR(3) with no lags, and then again with all of the "xRegs" lagged at various h (see Multivariate.r for how I determined which lags to use). There is a lot of output that comes with each model, so I am only going to post one so you get the idea. You should be able to run the VAR.r script without incident if the Data folder is a subdirectory of your current R workspace.

I decided to run up to a VAR(3) so that I could try to eliminate as much residual variance as possible. Sometimes in the residual ACF plots you can see significant values at lag 12 even though we are using seasonally adjusted data. You don't see this in the unemployment rate ACF plots, which is good since that's what we are most interested in. You could probably argue that VAR(1) is good enough if you only wanted to look at unemployment.

var_unem_resid

Here is a plot of the unemployment series in the best performing model by AIC: VAR(2) with lagged xregs.

var2_lag_diag

There is also forecasting functionality in the package, which is nice because, unlike the case of an ARIMA model with xregs, you don't have to forecast the xregs yourself. vars will do that for you, since all of them are essentially AR(p) models that only use lagged values to forecast.

var2_fcst
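The core of that workflow can be sketched in a few lines. This uses simulated stand-in data so it runs on its own; the real script pulls the series from Data/ and includes all of the xregs.

```r
## Sketch of the vars workflow described above, on simulated stand-in data
## (two series only; the real models include all of the xregs).
library(vars)

set.seed(1)
dat <- data.frame(unem_rate_sa    = 6 + cumsum(rnorm(120, 0, 0.1)),
                  retail_sales_sa = cumsum(rnorm(120, 0, 0.1)))

VARselect(dat, lag.max = 3, type = "both")   # information criteria for p = 1..3
fit  <- VAR(dat, p = 2, type = "both")       # fit a VAR(2)
acf(residuals(fit)[, "unem_rate_sa"])        # residual ACF for unemployment
pred <- predict(fit, n.ahead = 36)           # every series is forecast jointly
fanchart(pred)                               # forecast fan charts
```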

I also built a few VAR models. By VARselect, BIC suggests VAR(1) and HQ suggests VAR(2). The VAR(1) results show that, besides unem_rate_sa.l1, only retail_sales_sa.l1 and recession_ind.l1 were significant predictors. I checked the correlation among the predictors and found that industrial_production, manufacturers_new_orders, house_price_sa, construction_spend, and retail_sales are highly correlated.
image

It might be reasonable to leave out some of the highly correlated variables. Thus, I then fitted two models with only unem_rate, retail_sales, and recession_ind. Here are the AICs and BICs.

AIC(M1$varresult$unem_rate_sa) # -253.317
AIC(M2$varresult$unem_rate_sa) # -252.6457
AIC(M3$varresult$unem_rate_sa) # -247.1147
AIC(M4$varresult$unem_rate_sa) # -251.6351

BIC(M1$varresult$unem_rate_sa) # -217.1493
BIC(M2$varresult$unem_rate_sa) # -191.2225
BIC(M3$varresult$unem_rate_sa) # -225.414
BIC(M4$varresult$unem_rate_sa) # -219.117

The AICs suggest the original VAR(1) model. The BICs suggest the VAR(1) with only three variables.

image
The CCF plots look pretty reasonable. The ACF plots for retail sales and the recession indicator signal some issues. Due to the ACF problems, the formal test is, as expected, rejected.

I noticed the models that @JestonBlu built did not include the recession index. It is a dummy variable, so @JestonBlu may be right that it is not appropriate to put it into a VAR model. But including the variable improves the model fit quite a bit. Any suggestions on this point?

I put the code I worked with under RScripts/var_2.
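One possible middle ground, if we are uneasy about treating the 0/1 recession index as an endogenous series: vars::VAR() accepts an exogen argument, so the dummy can enter each equation as a regressor without itself being modeled as an AR process. A sketch on simulated stand-in data (variable names are placeholders for the real series):

```r
## Passing the recession dummy as an exogenous regressor rather than
## an endogenous series (simulated stand-in data).
library(vars)

set.seed(1)
endo <- data.frame(unem_rate_sa    = 6 + cumsum(rnorm(120, 0, 0.1)),
                   retail_sales_sa = cumsum(rnorm(120, 0, 0.1)))
recession_ind <- matrix(rbinom(120, 1, 0.15),
                        dimnames = list(NULL, "recession_ind"))

fit <- VAR(endo, p = 1, type = "both", exogen = recession_ind)
AIC(fit$varresult$unem_rate_sa)  # same comparison used above
serial.test(fit)                 # formal test on the residuals
```

Note that forecasting from such a model requires supplying future values of the dummy, which brings back part of the problem, so this is only a sketch of one option.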

Yeah, I am not sure how appropriate it is to include the recession indicator, but it is very interesting that it improved the AIC that much. I will add it to my version as well, since I am probably using different lags for all of the variables... we will see how it shakes out. Either way, I will add what you have done to All_Final_Models.r, and then we can decide as a group which to mention in the write-up. I'm finalizing some tables right now that compare all of the best performing models everyone has submitted... I will post the results for discussion shortly.

One point, though, that I read about... since VARs do not require the data to be stationary, maybe it is okay to include it... has anyone come across anything in the literature that might have looked at this?

Thanks! This issue might need some discussion. By the way, I actually prefer model 3 among the set I proposed. It has the smallest BIC and is really simple (two leading variables and 1 lag).

I also saw some problems in the ACF plots. I tried to make the data stationary by differencing, but that didn't help much and ruined the model fit in terms of AIC and BIC. Any suggestions for further exploring this issue would be appreciated.


Okay, I have compiled all of the models we have considered into the All_Final_Models.r script... so far we have two model types, ARIMA and VAR. I do not think we should actually talk about or show diagnostic plots for all of these models. Maybe just focus on the top 2 in the 3rd table, but I do think we should perhaps show tables of all of the models we considered.

Model Comparisons

Comparing ARIMA Models

| Model | Order | Xregs | Lag.Xregs | AIC | BIC | Best |
|-------|-------|-------|-----------|-----------|-----------|----------|
| Mdl.1 | 1,2,1 |       |           | -212.2957 | -201.4563 | Best BIC |
| Mdl.2 | 2,2,2 |       |           | -211.8094 | -193.7438 |          |
| Mdl.3 | 3,2,3 |       |           | -215.4772 | -190.1853 |          |
| Mdl.4 | 1,2,1 | Y     |           | -211.5564 | -182.6514 |          |
| Mdl.5 | 2,2,2 | Y     |           | -209.8342 | -177.3160 |          |
| Mdl.6 | 3,2,3 | Y     |           | -215.0983 | -171.7408 |          |
| Mdl.7 | 1,2,1 |       | Y         | -222.4520 | -193.6943 | Best AIC |
| Mdl.8 | 2,2,2 |       | Y         | -220.7001 | -188.3477 |          |
| Mdl.9 | 3,2,3 |       | Y         | -217.8920 | -174.7555 |          |

Comparing VAR Models

| Model | P | Lag.Xregs | Recession.Ind | AIC | BIC | Best |
|-------|---|-----------|---------------|-----------|-----------|--------------|
| Mdl.1 | 1 |           |               | -226.3472 | -193.7962 |              |
| Mdl.2 | 2 |           |               | -219.2293 | -165.0324 |              |
| Mdl.3 | 1 |           | Y             | -253.3170 | -217.1493 | Best BIC/AIC |
| Mdl.4 | 1 | Y         |               | -220.6678 | -188.2820 |              |
| Mdl.5 | 2 | Y         |               | -235.4387 | -181.5180 |              |
| Mdl.6 | 1 | Y         | Y             | -243.8699 | -207.8857 |              |

Best Models from both model sets

| Model | Lag.XRegs | Recession | AIC | BIC | Best |
|--------------|-----------|-----------|---------|---------|--------------|
| ARIMA(1,2,1) |           |           | -212.29 | -201.45 |              |
| ARIMA(1,2,1) | Y         |           | -222.45 | -193.69 |              |
| VAR(1)       |           | Y         | -253.31 | -217.15 | Best AIC/BIC |

Forecast Plots

The code for the plots is also saved in the All_Final_Models.r script.

5 Month Forecasts for the 2 best Models

Since we decomposed and adjusted the seasonal data ourselves, it differs slightly from what you would see on the BLS website so I applied the same seasonal adjustment to the first 5 months of unemployment that came with the original data set. Overall the two plots are very similar.

forecast_arima_var

It also looks like the VAR model produced a slightly better forecast over this period; however, the confidence intervals of the two models overlap substantially.

forecast_arima_var_together

The forecasts start to look significantly different at longer horizons. This plot shows a 36-month forecast for the two best models. We can see how the confidence interval of the ARIMA model quickly explodes, perhaps indicating that it is not a good choice for long-term forecasts.

forecast_longterm2

Note on the best VAR
For the best VAR model shown in the 2nd table, all of the variables are present. The inclusion of the recession indicator significantly improves the overall fit as well as the look of the forecast plot. There are a few variables in the VAR model that do not test as significant. When those parameters are taken out, the long-term forecast looks a bit more aggressive, but the AIC and BIC both improve by a couple of points. I can strip them back out depending on what everyone thinks we should do. Here is a plot with the insignificant variables removed.

forecast_longterm

As far as model choice goes, I tend to favor the VAR rather than the ARIMA based on the model fit and forecast plots. The ARIMA(1,2,1) has 2 parameters, and the VAR(1) has 9 parameters (7 if we remove the insignificant variables). The inclusion of the recession indicator really helps the fit. So far I have not seen anything online that says it's inappropriate to use an indicator variable in a VAR model.

Please everyone weigh in on the model selection. If we elect not to use the recession indicator, then in the second table Mdl.1 is the best BIC and Mdl.5 is the best AIC. If we only use the significant variables, then the Mdl.1 VAR(1) becomes the best model, with an AIC of -225 and a BIC of -200, which is right there with the ARIMA(1,2,1), and it would have 6 parameters.

Latex of generated tables

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & Order & Xregs & Lag.Xregs & AIC & BIC & Best \\ 
  \hline
1 & Mdl.1 & 1,2,1 &  &  & -212.30 & -201.46 & Best BIC \\ 
  2 & Mdl.2 & 2,2,2 &  &  & -211.81 & -193.74 &  \\ 
  3 & Mdl.3 & 3,2,3 &  &  & -215.48 & -190.19 &  \\ 
  4 & Mdl.4 & 1,2,1 & Y &  & -211.56 & -182.65 &  \\ 
  5 & Mdl.5 & 2,2,2 & Y &  & -209.83 & -177.32 &  \\ 
  6 & Mdl.6 & 3,2,3 & Y &  & -215.10 & -171.74 &  \\ 
  7 & Mdl.7 & 1,2,1 &  & Y & -222.45 & -193.69 & Best AIC \\ 
  8 & Mdl.8 & 2,2,2 &  & Y & -220.70 & -188.35 &  \\ 
  9 & Mdl.9 & 3,2,3 &  & Y & -217.89 & -174.76 &  \\ 
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & P & Lag.Xregs & Recession.Ind & AIC & BIC & Best \\ 
  \hline
1 & Mdl.1 & 1 &  &  & -226.35 & -193.80 &  \\ 
  2 & Mdl.2 & 2 &  &  & -219.23 & -165.03 &  \\ 
  3 & Mdl.3 & 1 &  & Y & -253.32 & -217.15 & Best BIC/AIC \\ 
  4 & Mdl.4 & 1 & Y &  & -220.67 & -188.28 &  \\ 
  5 & Mdl.5 & 2 & Y &  & -235.44 & -181.52 &  \\ 
  6 & Mdl.6 & 1 & Y & Y & -243.87 & -207.89 &  \\ 
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sat Jul 23 15:13:07 2016
\begin{table}[ht]
\centering
\begin{tabular}{rlllrrl}
  \hline
 & Model & Lag.XRegs & Recession & AIC & BIC & Best \\ 
  \hline
1 & ARIMA(1,2,1) &  &  & -212.29 & -201.45 &  \\ 
  2 & ARIMA(1,2,1) & Y &  & -222.45 & -193.69 &  \\ 
  3 & VAR(1) &  & Y & -253.31 & -217.15 & Best AIC/BIC \\ 
   \hline
\end{tabular}
\end{table}

This is very nice. I like the recession indicator. I think it is consistent with the literature. It is a way of dealing with the fact that we would expect unemployment to increase more rapidly during a recession than at other times.

From Montgomery et al. (1998):

"Evidently the unemployment rate has a strong tendency to move
countercyclically, upward in general business slowdowns and contractions
and downward in speedups and expansions.

...univariate linear models are not able to accurately represent these
asymmetric cycles.

...the contraction phases in the U.S. economy tend to be shorter than the
expansion phases.

It should also be noted that forecasting unemployment is much more
difficult during periods when it is rapidly increasing than during more
stable periods."


Those are good points. I like that you found some supporting references.

Here are the two equations without the insignificant variables. I'm in favor of dropping the insignificant variables even though it changes the long-term forecast picture. If no one has a problem, I'm going to drop them in the code and rerun the tables (IndustrialProduction, ManufacturersNewOrders, HomePrices). Looks to me like the VAR(1) is the way to go.

VAR(1)
Unemployment = .935 + .0041 t + .975 Unemployment_{t-1} + .004 ConstructionSpend_{t-1} - .005 RetailSales_{t-1} + .19 RecessionIndicator_{t-1}+ w_t
AIC: -256, BIC: -231

ARIMA(1,2,1)
Unemployment = -.2021 Unemployment_{t-1} - .8078 w_{t-1} + w_t
AIC: -212, BIC: -201
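For the write-up, it may be worth noting that the ARIMA(1,2,1) equation above is shorthand for a model on the twice-differenced series; with $\nabla^2 x_t = (1 - B)^2 x_t$, the same fitted values read:

```latex
\nabla^2 x_t = \phi_1 \nabla^2 x_{t-1} + w_t + \theta_1 w_{t-1},
\qquad \hat{\phi}_1 = -0.2021, \quad \hat{\theta}_1 = -0.8078
```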

Even though there are more parameters, VAR(1) does seem the best. It
incorporates some of our original ideas and beats everything else in AIC.

On the other hand, RetailSales and ConstructionSpend have small
coefficients; do they really add much to the model?


Yeah. Keep in mind they are in different scales.


I think the VAR(1) is good, too. For our final discussion, do we want to just focus on one model, or were we going to discuss both? I think it might be easier just to stick with one.

I remember now, yes. Now I just need to gather some talking points.


I think we want to present one model ultimately, but I also think that part of the process is how we went about selecting the model we chose. Maybe mention it more in the write-up than in the final presentation. I don't know.


I am working on the final write-up right now - because I think it will be easier to build the final presentation from that. I have questions about some of the data. Where did the recession indicator data come from?

Also, the draft introduction that I have so far is:

Unemployment has been a topic of concern throughout the United States in recent years. The Great Recession of 2007 was accompanied by the worst unemployment crisis seen since the 1930s (Wanberg, 2012). The effects have been enduring: in 2010 the US job deficit was estimated to be over 10 million (Katz, 2010). Graduate and undergraduate college students alike are concerned about their employment prospects, wondering if their degrees will be enough to gain them a job after graduation. These worries are well-founded, as full recovery of college graduate employment rates and earnings is expected to be a slow process (Carnevale and Cheah, 2015). In these times of economic uncertainty, obtaining an income-generating position is not the guarantee it seemed to be in generations past.

Unemployment has far-reaching consequences that extend beyond financial security. Unemployment is linked to psychological difficulties, including depression and suicide, and even physical deterioration (Wanberg, 2012; Kim and von dem Knesebeck, 2015; DeFina and Hannon, 2015). A study of Greek students found a relationship between parental unemployment and PTSD symptoms related to bullying (Kanellopoulos et al., 2014). In Nigeria, unemployment has been linked to insurgency and terrorism (Akanni, 2014). Given the impact that unemployment has on fiscal, mental, and physical health, research into unemployment patterns is an important part of developing policies to improve the welfare of the local, national, and global populace.

1.1 Goal
The purpose of our project is to examine trends in unemployment in the United States. We will focus on the years surrounding the Great Recession of 2007, 1992 to 2015. Our goal is to forecast unemployment into 2016.

I am trying to finalize the data section right now but I thought I'd share this. Am I missing anything important from the beginning or goal?

I like the VAR(1) model too, but we definitely need to talk about all of them in the write-up. I have gathered all the conversations from these discussions into a file and am trying to work the process through into a more logical order.

The indicator came from the National Bureau of Economic Research. Here is the citation link: http://www.nber.org/cycles/sept2010.html


The VAR models in the literature have been significantly outperforming the ARIMA models. Although some of the more recent articles use VAR to model different predictors, I still think it is good justification. For example:

(Barnichon & Garda, 2016)
"Finally, the large improvements in forecasting performances were obtained with simple VAR-based forecasts of the worker flows."

(Meyer & Tasci, 2015)
"So far our results indicate that the VAR model delivers the most accurate forecasts for up to 2 quarters ahead, and the FLOW-UC model presents the most potential for the farther horizons."

Thank you Joseph.

I'll gather a list of points that I may want to mention and post it
tomorrow morning before I return to College Station.


Thank you, that would be helpful. If it helps you, here is a copy of my current draft. I haven't started typing in the model selection information yet. I will keep working on it.
main.pdf

Updated the VAR to not include the insignificant variables I mentioned. The plots in All_Final_Models.r will reflect this... here are the updated tables now that those variables have been dropped. This matches the VAR equation I posted yesterday.

ARIMA Compare (no changes)

Model   Order   Xregs   Lag.Xregs   AIC         BIC         Best
Mdl.1   1,2,1   -       -           -212.2957   -201.4563   Best BIC
Mdl.2   2,2,2   -       -           -211.8094   -193.7438
Mdl.3   3,2,3   -       -           -215.4772   -190.1853
Mdl.4   1,2,1   Y       -           -211.5564   -182.6514
Mdl.5   2,2,2   Y       -           -209.8342   -177.3160
Mdl.6   3,2,3   Y       -           -215.0983   -171.7408
Mdl.7   1,2,1   -       Y           -222.4520   -193.6943   Best AIC
Mdl.8   2,2,2   -       Y           -220.7001   -188.3477
Mdl.9   3,2,3   -       Y           -217.8920   -174.7555

Compare VAR

Model   P   Lag.Xregs   Recession.Ind   AIC         BIC         Best
Mdl.1   1   -           -               -223.6686   -201.9680
Mdl.2   2   -           -               -217.8281   -185.3099
Mdl.3   1   -           Y               -256.7669   -231.4495   Best BIC/AIC
Mdl.4   1   Y           -               -216.6464   -195.0558
Mdl.5   2   Y           -               -212.5259   -180.1735
Mdl.6   1   Y           Y               -245.7239   -220.5349

Compare Best Models

Model          Lag.XRegs   Recession   AIC       BIC       Best
ARIMA(1,2,1)   -           -           -212.29   -201.45
ARIMA(1,2,1)   Y           -           -222.45   -193.69
VAR(1)         -           Y           -256.76   -231.45   Best AIC/BIC

Latex of the tables above

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sun Jul 24 08:35:48 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & Order & Xregs & Lag.Xregs & AIC & BIC & Best \\ 
  \hline
1 & Mdl.1 & 1,2,1 &  &  & -212.30 & -201.46 & Best BIC \\ 
  2 & Mdl.2 & 2,2,2 &  &  & -211.81 & -193.74 &  \\ 
  3 & Mdl.3 & 3,2,3 &  &  & -215.48 & -190.19 &  \\ 
  4 & Mdl.4 & 1,2,1 & Y &  & -211.56 & -182.65 &  \\ 
  5 & Mdl.5 & 2,2,2 & Y &  & -209.83 & -177.32 &  \\ 
  6 & Mdl.6 & 3,2,3 & Y &  & -215.10 & -171.74 &  \\ 
  7 & Mdl.7 & 1,2,1 &  & Y & -222.45 & -193.69 & Best AIC \\ 
  8 & Mdl.8 & 2,2,2 &  & Y & -220.70 & -188.35 &  \\ 
  9 & Mdl.9 & 3,2,3 &  & Y & -217.89 & -174.76 &  \\ 
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sun Jul 24 08:35:48 2016
\begin{table}[ht]
\centering
\begin{tabular}{rllllrrl}
  \hline
 & Model & P & Lag.Xregs & Recession.Ind & AIC & BIC & Best \\ 
  \hline
1 & Mdl.1 & 1 &  &  & -223.67 & -201.97 &  \\ 
  2 & Mdl.2 & 2 &  &  & -217.83 & -185.31 &  \\ 
  3 & Mdl.3 & 1 &  & Y & -256.77 & -231.45 & Best BIC/AIC \\ 
  4 & Mdl.4 & 1 & Y &  & -216.65 & -195.06 &  \\ 
  5 & Mdl.5 & 2 & Y &  & -212.53 & -180.17 &  \\ 
  6 & Mdl.6 & 1 & Y & Y & -245.72 & -220.53 &  \\ 
   \hline
\end{tabular}
\end{table}

% latex table generated in R 3.3.1 by xtable 1.8-2 package
% Sun Jul 24 08:35:48 2016
\begin{table}[ht]
\centering
\begin{tabular}{rlllrrl}
  \hline
 & Model & Lag.XRegs & Recession & AIC & BIC & Best \\ 
  \hline
1 & ARIMA(1,2,1) &  &  & -212.29 & -201.45 &  \\ 
  2 & ARIMA(1,2,1) & Y &  & -222.45 & -193.69 &  \\ 
  3 & VAR(1) &  & Y & -256.76 & -231.45 & Best AIC/BIC \\ 
   \hline
\end{tabular}
\end{table}
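For the write-up, a short loop could regenerate tables of this shape instead of pasting them by hand. A sketch under assumed names (the real code lives in All_Final_Models.r, which I have not copied here):

```r
## Sketch: fit each candidate ARIMA order, collect AIC/BIC, and emit LaTeX
## via xtable (object and column names are assumptions)
library(forecast)
library(xtable)

load("Data/Data_Prep.rds")

orders <- list(c(1, 2, 1), c(2, 2, 2), c(3, 2, 3))
rows <- lapply(orders, function(o) {
  fit <- Arima(data[, "Unemployment"], order = o)
  data.frame(Order = paste(o, collapse = ","),
             AIC   = round(AIC(fit), 2),
             BIC   = round(BIC(fit), 2))
})
tab <- do.call(rbind, rows)

print(xtable(tab))   # LaTeX table like the blocks above
```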

Just want to add a bit.

The professor seems to like the idea of splitting the data into training and validation sets. We didn't split the data, but luckily we have the new 5 months of data as a validation set. From the plots alone it is hard to distinguish the performance of the two models, so I computed the mean squared forecast error of the two best models: 0.01505823 for the ARIMA(1,2,1) and 0.009663836 for the VAR(1). This quantitative measure also supports the VAR(1) model. Hope this helps a bit when we are comparing the two models.
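The MSE comparison is easy to replicate if anyone wants to check it. A sketch, hedged heavily: the object names and the way the 5 held-out months are stored are my assumptions, not Bo's actual script.

```r
## Sketch: mean squared forecast error over the 5 held-out months
library(forecast)

load("Data/Data_Prep.rds")   # training data (through Oct 2015)

fit <- Arima(data[, "Unemployment"], order = c(1, 2, 1))
fc  <- forecast(fit, h = 5)$mean    # 5-month-ahead point forecasts

## `actual` = the 5 new monthly unemployment observations (not in Data_Prep;
## supply them however they were collected)
mse <- mean((actual - fc)^2)
mse
```

The same calculation on the VAR(1) forecasts (via `predict()` on the fitted varest object) gives the second number quoted above.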

I think we are all agreed on the VAR model.

I am working on wording our online discussions and putting it into the writeup.

I am working on taking my notes and the online discussions and putting them together offline. But, here is the version that has all the group discussion notes in the appendix.

The introduction is relatively fleshed out (please let me know if you want me to add anything or if I made any mistakes).

Draft.pdf

Sure, we all like the VAR model. I was just trying to add a side note when we talk about the comparison between the ARIMA and VAR models.

Thank you for putting it all together. It looks good. We just need to refine it.

Still working on it.

And, I appreciate what you said about the two models. I just want to make
sure that we are all on the same page. The side notes are very helpful and
I added what you said into my document.


This is a really good graph.

image

The ARIMA(1, 2, 1) predicts that unemployment will continue to decrease indefinitely, which we know can't be true. The VAR(1) model shows a much more accurate picture in the long run.
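The ARIMA side of that picture is quick to reproduce. A sketch (the column name is an assumption; the d = 2 differencing is what drives the runaway downward trend in the point forecast):

```r
## Sketch: long-horizon ARIMA(1,2,1) forecast, illustrating how the
## point forecast extrapolates the recent decline indefinitely
library(forecast)

load("Data/Data_Prep.rds")

fit <- Arima(data[, "Unemployment"], order = c(1, 2, 1))
plot(forecast(fit, h = 24), main = "ARIMA(1,2,1): 24-month forecast")
```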

What everyone has done so far is great.

I have the write-up fairly complete through the discussion of the VAR models (I have not fixed the tables yet). I am working on writing up the forecasting section and just stopped by here to see if anyone had made any additional comments.

Once I get a complete first draft I will move it to Overleaf in case anyone wants to make tweaks to it there. In the meantime, if you have changes that you want to make yourself, feel free to adjust them directly in the file uploaded to GitHub.

Here is the current rendition of the first draft.
draft2.pdf

I take it the last section of this write-up can be used for further refinements? If so, I feel that one more refinement would be to include an indicator like isElection or something.


It's looking good so far. I remember that the professor sent out a note about needing to know the specifics of what everyone worked on... should we just list that in the appendix or something? Not sure where that fits.

Appendix, maybe.

My only worry about this last presentation is if I'll do the work justice.