JestonBlu / Unemployment

Master's Project: Forecasting Unemployment

Final project changes

sheltonmath opened this issue · comments

Please post any adjustments that need to be made to the project here.

In particular:

  • results of proofreading
  • things that should be added as a result of the presentation
  • changes to be made
  • claims and methods that still need to be backed up by the literature
  • changes to the footnote or bullet point portion of the workload distribution

And of course anything else we need to talk about.

Here is a current draft so you don't have to go searching for it.

main.article.pdf

Travis made a comprehensive document about his contributions. It might be a good idea for each person to do this and then include a document with these files as an addendum. Here is the one Travis did as an example.

What does everyone else think?

stat626.group.project.contributions.docx

Maybe it's just me, but that doc seems like overkill. On the draft, I believe "forcast" is misspelled in a few places. It should be "forecast".

Thank you. I will change that right now.

Here is a new copy with some minor formatting changes, including fixes for grammatical mistakes.

Oops here it is:
main.article.pdf

Here are some more suggestions

Group4.article.July25.docx

Also, I don't know what style we are using, but according to APA format, figure titles go below the figure.

@trlilley12 Thank you, I will incorporate your suggested changes and move the figure titles.

FYI, there were a ton of good questions asked.

When is the final write-up due?

By tomorrow.

How do you feel the preso went?

@sheltonmath here are some more revisions

main.article (1).pdf

@SZRoberson After I go through the presentation, I will make adjustments based on the presentation questions.

@trlilley12 Thank you for all your hard work. Sorry you had to read through the write-up before I caught all of my bleary-eyed typos. There were a lot of spelling errors; I caught a few more after making changes to the ones you noticed.

The formatting of your revision document with tracked changes is helpful. I will post a Word document after I handle the revisions you just sent to make it easier for everyone to make their comments.

About the word "multivariate": I don't want to be confusing, but that is the language used in our textbook and in the literature, distinguishing between multivariate ARIMA models and vector autoregressive models. I am taking a lot of it out, though, because there is no reason to use the word if it creates confusion.

One approach, advocated in the landmark work of Box and Jenkins (1970; see also Box et al., 1994), develops a systematic class of models called autoregressive integrated moving average (ARIMA) models to handle time-correlated modeling and forecasting. The approach includes a provision for treating more than one input series through multivariate ARIMA or through transfer function modeling. The defining feature of these models is that they are multiplicative models, meaning that the observed data are assumed to result from products of factors involving differential or difference equation operators
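
For anyone who wants to see what this looks like in practice, here is a minimal R sketch of the univariate case from our paper; `unemp` is a stand-in name for our monthly unemployment ts object, not a variable from the actual scripts:

    # Fit the ARIMA(1,2,1) from the paper and forecast six months out.
    # `unemp` is a hypothetical monthly ts of unemployment rates.
    fit <- arima(unemp, order = c(1, 2, 1))
    fc  <- predict(fit, n.ahead = 6)
    cbind(lower = fc$pred - 1.96 * fc$se,
          point = fc$pred,
          upper = fc$pred + 1.96 * fc$se)   # 95% prediction bounds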

Pretty good, I think. I needed a confidence boost so I sported a tie.

I made the revisions I had time for. I will come back to it in a couple hours with fresh eyes and finish them up.

@trlilley12 It was actually easier to make the adjustments with the commented PDF than the Word document.

I will provide both by running the PDF through a PDF-to-Word converter.

main.article.revised.pdf

main.article.revised.docx

In the abstract, I think this comment is a little off: "SARIMA models proved to be least useful, because a second nonseasonal difference was enough to stationarize the data." Also, I don't think "stationarize" is a word.
We took the seasonality out of the original data before differencing, so I don't think it's right to say we differenced seasonal data and got better performance using ARIMA vs. SARIMA. I think the real reason they were different is probably that the SARIMA function treats seasonality differently than simply removing seasonality from the decomposed series, which is what we did manually.
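
To make the distinction concrete, here is a rough R sketch of the two pipelines; `unemp` is a placeholder for the monthly series, and the seasonal order in (b) is illustrative, not something anyone actually fit:

    # (a) What we did manually: subtract the decomposed seasonal
    #     component, then fit a nonseasonal ARIMA to the adjusted series.
    dec   <- decompose(unemp)
    adj   <- unemp - dec$seasonal
    fit_a <- arima(adj, order = c(1, 2, 1))

    # (b) SARIMA: leave seasonality in and let seasonal terms absorb it.
    fit_b <- arima(unemp, order = c(1, 2, 1),
                   seasonal = list(order = c(0, 1, 1), period = 12))

The two handle seasonality differently, which would explain the performance gap without any claim about differencing seasonal data.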

@sheltonmath I made comments in the PDF Travis sent out earlier. I didn't notice you had already updated the file according to Travis's comments. If you feel it's better to have my comments in a separate file, just let me know.

main.article.2.pdf

@bopangspy no problem

@JestonBlu Here is the most recent rendition of the abstract.

US unemployment rates follow a complex countercyclical pattern, exhibiting smooth, sharp rises toward the beginning of presidential terms followed by much choppier, slower declines. Multiple ARIMA, SARIMA, and VAR models were developed and compared both to describe temporal unemployment trends from 1993-2015 and ultimately to predict unemployment rates from 2016-2018. The ARIMA(1, 2, 1) models with no exogenous predictors very accurately forecasted unemployment for six months and offered the greatest level of parsimony, but their predictions in the long term were unfeasible and prone to explosively large error bounds. A VAR(1) model with the three most useful exogenous predictors—construction, retail sales, and recession presence—performed comparably to the ARIMA(1, 2, 1) in the short run, and more accurately and precisely predicted unemployment in the long run by accounting for the sharp upswings in unemployment. Overall, our analysis suggests that unemployment trends require a layer of multivariate model complexity in order to be fully described and forecasted.

very nice

I just finished watching the presentation.

@SZRoberson Great job fielding all the questions by yourself. These were the comments I was able to pick out from the question and answer period that may need to be addressed in the final adjustments for the write-up.

  • 2000 dotcom bubble & 2008 housing crash: since that language is used in the presentation, should we discuss it in the paper?
  • Commentary on linearly correlated predictors.
  • Add the scales and index years to the discussions of the variables included.
  • Correlation between construction spending & retail spending.
  • 3rd differences were taken out of the paper; do we need to add them back in?
  • What was our criterion for determining "more realistic"?
    (My take is that the changing rate of unemployment is nonlinear.)
  • VAR forecast has a narrower confidence band (smaller standard error) compared to the ARIMA model.
  • Mathematical criterion: lower variance. Looking for minimum prediction variance.
  • Why the blue confidence band is so wide. (Because variance increases with time in these prediction intervals. Maybe we should write out the prediction interval equations for one month, say November 2016 or June 2016; see the sketch after this list.)
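
On that last bullet: if we do write it out, the standard form (in the textbook's notation, as best I understand it) is

    P^t_{t+h} = \sigma_w^2 \sum_{j=0}^{h-1} \psi_j^2,
    \qquad
    \hat{x}^t_{t+h} \pm 1.96 \sqrt{P^t_{t+h}},

where the \psi_j are the model's MA(\infty) weights. Each extra forecast step adds a nonnegative term to the sum, so the band can only widen as the horizon grows.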

Then the professor talked for a long time and I couldn't hear what he said. Can anyone pick it up on their systems? @SZRoberson I wouldn't expect that you would remember everything he said since you were presenting rather than taking notes - but if you do please chime in.

He said something about VAR(1) prediction intervals. I heard the word unpredictable. And later Sean said something about Los Angeles traffic.

Unrelated rant: Los Angeles traffic is no joke. When I was going to USC, my commute varied anywhere from 30 minutes to 3 hours depending on, well, anything. Not to mention on the 57 north there is always something crazy that gets dropped on the freeway. This year alone I have had to contend with a Jacuzzi in the middle of the freeway, an open patio umbrella, a La-Z-Boy in the middle of the carpool lane, a USPS trailer that had burst into flames, and a mattress and box spring that exploded across all the lanes, among other things. We were able to avoid most of them except the mattress and box spring, which caused my tire to explode and remove itself from the rim. That was this Friday.

We had a discussion about confidence bands. Pretty much it. Like, "how is a prediction interval computed? What goes into the standard error?"

The joke came from something.

So was he asking us to clarify the calculation?

I suppose? I basically said that it was generated by R; we really don't have total control over the prediction interval.

I checked the vars R documentation. It doesn't give details of computing CIs. However, the book gives a formula for the large-sample distribution of the estimated coefficients. vars may use that to compute CIs and do forecasting. We may note that formula in the write-up; it's at the top of p. 311.
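
If we do cite it, the usual large-sample version is (a sketch; I haven't confirmed this is exactly what vars implements):

    \Sigma(h) = \sum_{j=0}^{h-1} \Psi_j \, \Sigma_w \, \Psi_j',
    \qquad
    \hat{x}_{t+h} \pm 1.96 \sqrt{[\Sigma(h)]_{ii}},

where the \Psi_j are the moving-average coefficient matrices of the fitted VAR, \Sigma_w is the innovation covariance, and i indexes the unemployment component.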

Good job on the presentation, one note though. We used the 2nd difference, not the 3rd difference, to model... that's why it's not in the final write-up plots at the moment. Since we did not use them outside of data exploration, I do not see how they would add anything to the paper.

In the vars package there is a predict function which defaults to a 95% CI, which is what I used.
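
For the record, the calls look roughly like this; `dat` is a stand-in for our multivariate ts of unemployment plus the predictors:

    library(vars)
    # Fit a VAR(1) with an intercept and forecast 24 months ahead.
    # predict() for vars model objects defaults to ci = 0.95.
    fit  <- VAR(dat, p = 1, type = "const")
    pred <- predict(fit, n.ahead = 24, ci = 0.95)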

My mistake. I think I forgot to write that down in my talk notes.

Not a big deal. Great job overall.

I agree, great job.

Please note anything you would like me to fix/change in the final draft. I haven't made all of the adjustments that were suggested as of yet, but I will work on it this evening.

Feel free to make changes directly to the LaTeX file if you are comfortable with that. Otherwise, I will just use the recent notes and revisions to write something up.

I would like to post something this evening, then have you all make final changes/suggestions/comments as needed in the morning, and submit it tomorrow afternoon before class begins.

How does that sound?

The presentation was great. Good job.

Write-up Notes:

Last paragraph before Fig 11:
"Models 4 through 6 were ARIMA(1,2,1), ARIMA(2,2,2), and ARIMA(3,2,3) respectively. These predictors had lower AIC and BIC values than their original counterparts without regressors, as shown in Table 3. Since these models had predicted a lagged response variable using data that was potentially nonstationary, we chose to repeat the process using lagged regressors."

Comments: I don't really understand the last sentence; can someone clarify?

Section 4.1.2:
"We started with 6 initial VAR models to compare. Models 1, 2, and 3 use the predictors of construction spending and retail sales, without differencing. Model 1 is a VAR(1), model 2 is a VAR(2), and model 3 is a VAR(1) with the recession indicator included as well. Models 4, 5, and 6 repeat the analysis using the differenced version of the predictors. Table 4 shows the AIC and BIC values for each of these models."

Comments: VAR models 4, 5, and 6 are not using differenced data; they are using lagged variables, so if we are at unemployment_t we are regressing on construction_spending_{t-2}. The idea was to see if the predictor variables lead unemployment.
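
For clarity, the alignment is the standard ts idiom, something like this (series names are placeholders, not the ones in our scripts):

    # Pair unemployment at time t with construction spending at t-2.
    # stats::lag(x, -2) shifts the series so its value at time t is x_{t-2};
    # ts.intersect() keeps only the overlapping dates.
    # `unemp` and `constr` are hypothetical monthly ts objects.
    dat <- ts.intersect(unemp, constr_lag2 = stats::lag(constr, -2))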

Section 4.2:
"In the previous model building process, we retained 3 models for further comparison. ARIMA model 1 is a univariate ARIMA(1,2,1) model without predictors, ARIMA model 7 is an ARIMA(1,2,1) model with exogenous predictors, and the VAR model 3 is a VAR(1) model with no regressors and an indicator variable for recession among its predictors."

Comments: VAR 3 uses regressors and the recession indicator.

Some of the plots are not rendering well on my screen. In fig 4 it's hard to see the labels of the scatterplot even when I zoom in... also in fig 5 I can't really see the correlation coefficients either. Fig 9 and fig 10 are also hard to see... is anyone else having this issue?

Not on my phone. Those figures are as high-res as R makes them.

Looks fine on my phone too.

Okay, great, disregard then.

@JestonBlu thank you for making those changes in the LaTeX code. That was helpful. I agree that some of the plots have small labels; I hope to adjust them if there is time.

I am going to go ahead and address the changes that haven't been done yet. Thank you everyone for your help.

Not sure if we still need detailed contributions, considering that we already list the main responsibilities of each member in the appendix. Just in case we still need one, I wrote mine following Travis's example.

stat626.group.project.contributions-BoPang.docx

@sheltonmath I can look over it one more time this morning if you want. When do you plan to submit it?

@trlilley12 I plan on submitting it around noon Pacific so it is submitted before class starts. Here is the current draft. I am still working on adjusting the tables and figures for clarity. I have a couple of questions, but I will post them in a different comment.

main.article. revised.6.26.1.pdf

In terms of clarifying the model discussion, I have had difficulty with the model numbering and then later referencing the models. Since this is supposed to be an article write-up, we can't assume too much. However, the numbering of the models restarting with the VAR models is confusing to me; that is part of why the sentences are so awkward in that area. But I hesitate to renumber them because it would be a disaster this late if I forgot a reference. What do you think?

Also, I was moving the figure captions to the bottom, and I think it looks funny because I was using the captions as titles. I guess that is incorrect. Should I re-run the figures with titles and use the figure captions as in APA style, or keep it as is?

I would keep it as is, unless it really screws with the formatting.

@SZRoberson Thank you.

I think keeping it as is should be fine. As long as it looks clear, formatting won't be a big deal.

Good news: I was able to upload the project to ShareLaTeX (Overleaf wouldn't accept a project this big), so I am going to continue my revisions there so everyone can see the last-minute changes in real time.

I made it publicly editable for the short term, so you should be able to edit without logging in if you'd like.

The link to the project is at: https://www.sharelatex.com/project/5798c29c6b560bce7c586e52

That link is broken for me.

I'm not going to do a detailed contribution doc, but I'm going to elaborate a little more at the end of the write-up.

@JestonBlu Then I can go back to using the GitHub file. Thank you for the heads up.

I'll try to keep them synced in case anyone is able to use the ShareLaTeX site.

It's probably just because I am at work; you guys can keep using it.

@JestonBlu I attached this because I wanted to make sure you could see it at work. I tried to make the plots more readable.

STAT626FinalProject.Revision.6.26.2.pdf

@sheltonmath those plots look great, nicely done... the only one that doesn't look good on my screen is fig 13, the diagnostics for the VAR. Are you still working on that one?

@JestonBlu I will look at figure 13 right now.

I didn't write up the formula for the prediction interval because I wasn't 100% sure which formula was being used.

In the vars package there is a predict function which defaults to a 95% CI, which is what I used.

Is it using the large-sample distribution of the estimated coefficients?

I checked the vars R documentation. It doesn't give details of computing CIs. However, the book gives a formula for the large-sample distribution of the estimated coefficients. vars may use that to compute CIs and do forecasting. We may note that formula in the write-up; it's at the top of p. 311.

When you dive into the VAR model structure for the unemployment portion of the model, you can see it's actually just a linear combination, so yes, I believe it is constructed from the coefficients.
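
Written out for the unemployment row of a VAR(1) with construction, retail, and the recession indicator (coefficient symbols are placeholders, not our fitted values):

    \hat{u}_{t+1} = \alpha + \phi_{1} u_t + \phi_{2} c_t + \phi_{3} r_t + \phi_{4} I_t,

so the one-step forecast really is a linear combination of the current values, and its standard error follows from the estimated coefficients and the innovation variance.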

Maybe we should leave it alone then. I wouldn't want to write it up incorrectly.

@JestonBlu Here is a version with Figure 13 fixed.
STAT626FinalProject.Revision.6.26.3.pdf

I am going to walk away from this now. I'll come back and check for comments in about an hour. If nobody has any changes I will submit it as is. Otherwise, I will make the changes and submit it for approval.

The current document is the one linked to above, STAT626FinalProject.Revision.6.26.3

Fantastic, thanks... after you submit, will you drop the final write-up PDF in the main folder of the project?

I just fixed a couple of typos... I think it reads great. There is only one thing that is still bugging me: in section 4.1.1, the last paragraph of page 4.

"""Models 4 through 6 were ARIMA(1,2,1), ARIMA(2,2,2), and ARIMA(3,2,3) respectively. These predictors had lower AIC and BIC values than their original counterparts, as shown in Table \ref{tab:arimachoices}. Since these models had predicted a lagged response variable using data that was potentially nonstationary, we chose to repeat the process using lagged regressors. Models 7, 8, and 9 refer to the ARIMA(1,2,1), ARIMA(2,2,2), and ARIMA(3,2,3) models with lagged predictor variables. Of these three new models, model 7 has both the smallest AIC and the smallest BIC values. Of the 9 original models, model 7 has the lowest AIC overall, as shown in Table \ref{tab:arimachoices}.
"""

I don't really get the sentence in bold... since the models had predicted a lagged response that was potentially nonstationary... we have confirmed everything is stationary based on our differencing, so I am not sure what is meant by this phrase.

Thanks @sheltonmath! It looks great!

Thank you everyone. You are a great group to work with. Thank you for all the feedback as well. I am going to submit now and cc each of you.

No problem. I changed the name of the repo. You can download a zip file of the entire project on the main page. There is a green button on the right-hand side near the top of the screen that says "Clone or Download". You can choose to clone it if you want to keep a permanent copy of it on your GitHub account.

https://github.com/JestonBlu/Unemployment