Post on 19-Jan-2016
Leisure & Hospitality Employment in California
Jesus Barragan, Alex Killian
Rasik Cauchon-Desai, Ling Zhu
Jeannette Figg, Caitlin Hunsuck
Background
– We analyze leisure and hospitality employment in the state of California, using the number of people (in thousands) employed in the industry each month over the past twenty years.
– This topic is of great interest because, with the ongoing recession, the industry is particularly susceptible to change: it depends on people’s disposable incomes.
– If disposable incomes decline, we expect the number of employees in the industry to decline as well.
Examine the trace
• It appears highly seasonal and may be a random walk.
Histogram
• Non-normal
[Figure: histogram of CALEIHN]
Series: CALEIHN
Sample 1990:01 2010:03
Observations 243
Mean 1320.300
Median 1320.300
Maximum 1611.700
Minimum 1063.200
Std. Dev. 156.8197
Skewness 0.095839
Kurtosis 1.743337
Jarque-Bera 16.36143
Probability 0.000280
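The Jarque-Bera statistic in the histogram box can be checked by hand from the reported skewness and kurtosis. The sketch below assumes the standard formula JB = n/6 * (S^2 + (K-3)^2/4) and a chi-square distribution with 2 degrees of freedom; the function name is ours, not an EViews command.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic and its chi-square(2) p-value."""
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # With 2 degrees of freedom the chi-square survival function is exp(-x/2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Values copied from the CALEIHN histogram box.
jb, p = jarque_bera(n=243, skewness=0.095839, kurtosis=1.743337)
print(round(jb, 4), round(p, 6))
```

A p-value this small rejects normality, which is what the "Non-normal" bullet states.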
Correlogram
• The correlogram also indicates a random walk (large spike in the PACF and slow decay in the ACF).
Unit root test
• The ADF test fails to reject the null hypothesis of a unit root (statistic -1.42 vs. a 10% critical value of -2.57), so our data is evolutionary.
ADF Test Statistic -1.420320 1% Critical Value* -3.4589
5% Critical Value -2.8736
10% Critical Value -2.5731
*MacKinnon critical values for rejection of hypothesis of a unit root.
Augmented Dickey-Fuller Test Equation
Dependent Variable: D(CALEIHN)
Method: Least Squares
Date: 05/26/10 Time: 12:24
Sample(adjusted): 1990:02 2010:03
Included observations: 242 after adjusting endpoints
Variable Coefficient Std. Error t-Statistic Prob.
CALEIHN(-1) -0.010146 0.007144 -1.420320 0.1568
C 15.02531 9.493989 1.582613 0.1148
R-squared 0.008335 Mean dependent var 1.634711
Adjusted R-squared 0.004203 S.D. dependent var 17.43607
S.E. of regression 17.39938 Akaike info criterion 8.558977
Sum squared resid 72657.25 Schwarz criterion 8.587811
Log likelihood -1033.636 F-statistic 2.017308
Durbin-Watson stat 1.794741 Prob(F-statistic) 0.156812
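The test equation above regresses D(CALEIHN) on a constant and CALEIHN(-1), with no lagged differences. A minimal pure-Python sketch of that regression and of the decision rule follows. The function names are ours, the simulated series is hypothetical, and the critical values must come from MacKinnon's tables (here they are copied from the output above), because the t-statistic does not follow the usual t distribution under the null.

```python
import math
import random

def dickey_fuller_tstat(y):
    """t-statistic on y(-1) in the regression D(y) = c + rho*y(-1) + e."""
    x = y[:-1]                                   # y(-1)
    dy = [b - a for a, b in zip(y[:-1], y[1:])]  # D(y)
    n = len(dy)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    rho = sxy / sxx
    c = mean_dy - rho * mean_x
    ssr = sum((di - c - rho * xi) ** 2 for xi, di in zip(x, dy))
    se_rho = math.sqrt(ssr / (n - 2) / sxx)
    return rho / se_rho

def rejects_unit_root(t_stat, critical_value):
    """Reject the unit-root null only if the statistic is below the critical value."""
    return t_stat < critical_value

# Hypothetical random walk, just to exercise the function.
random.seed(1)
walk = [0.0]
for _ in range(200):
    walk.append(walk[-1] + random.gauss(0.0, 1.0))
print(dickey_fuller_tstat(walk))

# Decision for the reported statistic at the 5% level.
print(rejects_unit_root(-1.420320, -2.8736))
```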
Seasonal Difference
• Generate: SDCALEIHN=CALEIHN-CALEIHN(-12)
[Figure: trace of SDCALEIHN, 1990 to 2010]
[Figure: histogram of SDCALEIHN]
Series: SDCALEIHN
Sample 1991:01 2010:03
Observations 231
Mean 19.89524
Median 27.20000
Maximum 54.30000
Minimum -89.30000
Std. Dev. 28.20975
Skewness -2.088266
Kurtosis 7.692993
Jarque-Bera 379.8756
Probability 0.000000
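The Generate step for the seasonal difference amounts to subtracting the value twelve months earlier. A small sketch on a made-up series (the numbers and names below are hypothetical, not the CALEIHN data):

```python
def difference(series, lag):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical monthly levels: a linear trend plus a repeating 12-month pattern.
pattern = [0, 2, 5, 9, 12, 15, 18, 17, 11, 6, 3, 1]
levels = [1000 + 2 * t + pattern[t % 12] for t in range(36)]

# Analogue of SDCALEIHN=CALEIHN-CALEIHN(-12): the seasonal pattern cancels.
seasonal_diff = difference(levels, 12)
print(seasonal_diff[:3])
```

The first twelve observations are lost, which is why the SDCALEIHN sample starts in 1991:01. Calling the same helper with lag 1 gives an ordinary first difference.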
Unit root test
• The ADF test again fails to reject a unit root (statistic -1.52 vs. a 10% critical value of -2.57), so SDCALEIHN is also evolutionary.
ADF Test Statistic -1.521954 1% Critical Value* -3.4602
5% Critical Value -2.8742
10% Critical Value -2.5734
*MacKinnon critical values for rejection of hypothesis of a unit root.
Augmented Dickey-Fuller Test Equation
Dependent Variable: D(SDCALEIHN)
Method: Least Squares
Date: 05/26/10 Time: 12:37
Sample(adjusted): 1991:02 2010:03
Included observations: 230 after adjusting endpoints
Variable Coefficient Std. Error t-Statistic Prob.
SDCALEIHN(-1) -0.026076 0.017133 -1.521954 0.1294
C 0.359419 0.590504 0.608665 0.5434
R-squared 0.010057 Mean dependent var -0.165652
Adjusted R-squared 0.005715 S.D. dependent var 7.288904
S.E. of regression 7.268044 Akaike info criterion 6.813509
Sum squared resid 12043.98 Schwarz criterion 6.843406
Log likelihood -781.5536 F-statistic 2.316343
Durbin-Watson stat 2.023199 Prob(F-statistic) 0.129406
First-Difference
• Generate: DSDCALEIHN=SDCALEIHN-SDCALEIHN(-1)
• Now the trace looks stationary
• Still non-normal
[Figure: trace of DSDCALEIHN, 1990 to 2010]
[Figure: histogram of DSDCALEIHN]
Series: DSDCALEIHN
Sample 1991:02 2010:03
Observations 230
Mean -0.165652
Median -0.450000
Maximum 28.70000
Minimum -27.80000
Std. Dev. 7.288904
Skewness 0.320093
Kurtosis 5.971872
Jarque-Bera 88.56783
Probability 0.000000
Unit root test
• Yes! Now we can reject the null hypothesis of a unit root: the ADF statistic of -15.54 is far below the 1% critical value of -3.46.
• Our seasonally differenced and first-differenced data is stationary.
ADF Test Statistic -15.53900 1% Critical Value* -3.4604
5% Critical Value -2.8742
10% Critical Value -2.5735
*MacKinnon critical values for rejection of hypothesis of a unit root.
Augmented Dickey-Fuller Test Equation
Dependent Variable: D(DSDCALEIHN)
Method: Least Squares
Date: 05/26/10 Time: 12:48
Sample(adjusted): 1991:03 2010:03
Included observations: 229 after adjusting endpoints
Variable Coefficient Std. Error t-Statistic Prob.
DSDCALEIHN(-1) -1.034761 0.066591 -15.53900 0.0000
C -0.198298 0.483039 -0.410521 0.6818
R-squared 0.515434 Mean dependent var 0.024017
Adjusted R-squared 0.513299 S.D. dependent var 10.47319
S.E. of regression 7.306507 Akaike info criterion 6.824103
Sum squared resid 12118.40 Schwarz criterion 6.854092
Log likelihood -779.3598 F-statistic 241.4604
Durbin-Watson stat 1.982480 Prob(F-statistic) 0.000000
Examine the Correlogram to specify a model
• We may have over-adjusted for seasonality, because there are large spikes at lags 12 and 24.
• There are PACF spikes at lags 5, 12, and 24, and an ACF spike at lag 12.
• Because T=230, the significance band is 2/√T≈2/√225=0.13. The PACF values at lags 5, 12, and 24 are all bigger than 0.13.
• We therefore include AR(5), MA(12), and AR(24) terms in the regression.
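The 2/√T rule of thumb can be sketched as follows. This is an illustration only: it computes sample autocorrelations (the ACF), whereas the bullets above also rely on partial autocorrelations, which need an extra step such as the Durbin-Levinson recursion that is not shown. All function names are ours.

```python
import math

def sample_acf(x, lag):
    """Sample autocorrelation of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom

def significance_band(n_obs):
    """Approximate two-standard-error band for a correlogram."""
    return 2.0 / math.sqrt(n_obs)

def significant_lags(x, max_lag):
    """Lags whose autocorrelation lies outside the band."""
    band = significance_band(len(x))
    return [k for k in range(1, max_lag + 1) if abs(sample_acf(x, k)) > band]

print(round(significance_band(230), 2))   # band used on the slide
```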
Estimate the model
• Because the coefficient on AR(24) is not significant, drop it and run the regression again.
Dependent Variable: DSDCALEIHN
Method: Least Squares
Sample(adjusted): 1993:02 2010:03
Included observations: 206 after adjusting endpoints
Convergence achieved after 11 iterations
Backcast: 1992:02 1993:01
Variable Coefficient Std. Error t-Statistic Prob.
C -0.185134 0.089629 -2.065566 0.0401
AR(5) 0.190326 0.060016 3.171255 0.0018
AR(24) -0.013423 0.071887 -0.186717 0.8521
MA(12) -0.885840 0.000163 -5438.865 0.0000
R-squared 0.442323 Mean dependent var -0.233495
Adjusted R-squared 0.434041 S.D. dependent var 7.269642
S.E. of regression 5.468970 Akaike info criterion 6.255284
Sum squared resid 6041.746 Schwarz criterion 6.319903
Log likelihood -640.2943 F-statistic 53.40567
Durbin-Watson stat 2.054422 Prob(F-statistic) 0.000000
Inverted AR Roots .84 -.10i .84+.10i .77+.30i .77 -.30i
.65+.50i .65 -.50i .49+.67i .49 -.67i
.32+.79i .32 -.79i .13 -.84i .13+.84i
-.10+.82i -.10 -.82i -.32+.76i -.32 -.76i
-.52+.65i -.52 -.65i -.68 -.52i -.68+.52i
-.77+.34i -.77 -.34i -.82 -.12i -.82+.12i
Inverted MA Roots .99 .86 -.49i .86+.49i .49 -.86i
.49+.86i .00+.99i -.00 -.99i -.49+.86i
-.49 -.86i -.86+.49i -.86 -.49i -.99
Re-estimate the model
• All the coefficients are significant except the constant term.
• The Durbin-Watson statistic is approximately 2.
Dependent Variable: DSDCALEIHN
Method: Least Squares
Sample(adjusted): 1991:07 2010:03
Included observations: 225 after adjusting endpoints
Convergence achieved after 14 iterations
Backcast: 1990:07 1991:06
Variable Coefficient Std. Error t-Statistic Prob.
C -0.107654 0.077479 -1.389466 0.1661
AR(5) 0.189503 0.062735 3.020673 0.0028
MA(12) -0.885818 0.000192 -4617.023 0.0000
R-squared 0.429947 Mean dependent var -0.270222
Adjusted R-squared 0.424811 S.D. dependent var 7.250342
S.E. of regression 5.498748 Akaike info criterion 6.260162
Sum squared resid 6712.443 Schwarz criterion 6.305710
Log likelihood -701.2682 F-statistic 83.71857
Durbin-Watson stat 2.058594 Prob(F-statistic) 0.000000
Inverted AR Roots .72 .22 -.68i .22+.68i -.58+.42i
-.58 -.42i
Inverted MA Roots .99 .86+.49i .86 -.49i .49+.86i
.49 -.86i -.00 -.99i -.00+.99i -.49 -.86i
-.49+.86i -.86+.49i -.86 -.49i -.99
Correlogram of Residuals
• The values of PACF are still bigger than 0.13 at lag 2 and lag 4. Thus, add AR(2) and AR(4) into the model.
Re-estimate the model
• All the coefficients are significant except the constant term
Dependent Variable: DSDCALEIHN
Method: Least Squares
Date: 05/26/10 Time: 13:25
Sample(adjusted): 1991:07 2010:03
Included observations: 225 after adjusting endpoints
Convergence achieved after 24 iterations
Backcast: 1990:07 1991:06
Variable Coefficient Std. Error t-Statistic Prob.
C -0.115196 0.114607 -1.005142 0.3159
AR(5) 0.162217 0.061254 2.648249 0.0087
AR(2) 0.140621 0.060669 2.317848 0.0214
AR(4) 0.159408 0.061818 2.578655 0.0106
MA(12) -0.885745 0.000179 -4960.613 0.0000
R-squared 0.460963 Mean dependent var -0.270222
Adjusted R-squared 0.451163 S.D. dependent var 7.250342
S.E. of regression 5.371312 Akaike info criterion 6.221993
Sum squared resid 6347.218 Schwarz criterion 6.297906
Log likelihood -694.9742 F-statistic 47.03385
Durbin-Watson stat 2.080547 Prob(F-statistic) 0.000000
Inverted AR Roots .82 .16+.69i .16 -.69i -.57+.28i
-.57 -.28i
Inverted MA Roots .99 .86+.49i .86 -.49i .49 -.86i
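One way to compare the two re-estimated models (AR(5), MA(12) versus AR(2), AR(4), AR(5), MA(12)) is through their information criteria. The sketch below assumes the per-observation definitions AIC = (-2l + 2k)/T and SC = (-2l + k ln T)/T, which reproduce the figures printed in both outputs; k counts the estimated coefficients including the constant, and the function names are ours.

```python
import math

def akaike(log_likelihood, n_params, n_obs):
    """Akaike info criterion, scaled by the number of observations."""
    return (-2.0 * log_likelihood + 2.0 * n_params) / n_obs

def schwarz(log_likelihood, n_params, n_obs):
    """Schwarz criterion, scaled by the number of observations."""
    return (-2.0 * log_likelihood + n_params * math.log(n_obs)) / n_obs

# Log likelihoods and coefficient counts read off the two outputs (T = 225).
small = {"logl": -701.2682, "k": 3}   # C, AR(5), MA(12)
large = {"logl": -694.9742, "k": 5}   # C, AR(2), AR(4), AR(5), MA(12)

for name, m in (("small", small), ("large", large)):
    print(name,
          round(akaike(m["logl"], m["k"], 225), 6),
          round(schwarz(m["logl"], m["k"], 225), 6))
```

Both criteria are lower for the larger model, which is consistent with keeping AR(2) and AR(4).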
Validate our model
• The correlogram of the residuals is clean. The probabilities for the Q-statistics are all bigger than 0.05.
Serial Correlation
• Because the p-value of F-statistic is bigger than 0.05, there is no evidence of serial correlation.
Breusch-Godfrey Serial Correlation LM Test:
F-statistic 0.908158 Probability 0.404787
Obs*R-squared 1.845227 Probability 0.397479
Test Equation:
Dependent Variable: RESID
Method: Least Squares
Date: 05/26/10 Time: 13:35
Variable Coefficient Std. Error t-Statistic Prob.
C -0.006561 0.143425 -0.045744 0.9636
AR(5) -0.008993 0.066874 -0.134477 0.8931
AR(2) 0.220477 0.274062 0.804479 0.4220
AR(4) -0.038078 0.081569 -0.466814 0.6411
MA(12) -0.027171 0.029215 -0.930047 0.3534
RESID(-1) -0.051084 0.067996 -0.751284 0.4533
RESID(-2) -0.228863 0.281665 -0.812536 0.4174
R-squared 0.008201 Mean dependent var -0.041950
Adjusted R-squared -0.019096 S.D. dependent var 5.322972
S.E. of regression 5.373556 Akaike info criterion 6.231474
Sum squared resid 6294.772 Schwarz criterion 6.337752
Log likelihood -694.0408 F-statistic 0.300434
Durbin-Watson stat 1.997785 Prob(F-statistic) 0.936185
Correlogram of residuals squared
• Big spike at lag 6
• The Q-statistics are significant from lag 6
• There is conditional heteroskedasticity
ARCH LM Test
• The F-statistic is significant.
• We need an ARCH-GARCH model as a remedy.
ARCH Test:
F-statistic 4.068571 Probability 0.000696
Obs*R-squared 22.61355 Probability 0.000937
Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 06/02/10 Time: 00:33
Sample (adjusted): 1992M01 2010M03
Included observations: 219 after adjustments
Variable Coefficient Std. Error t-Statistic Prob.
C 21.59285 5.870280 3.678334 0.0003
RESID^2(-1) 0.006552 0.065251 0.100415 0.9201
RESID^2(-2) -0.021910 0.065331 -0.335366 0.7377
RESID^2(-3) 0.025496 0.068421 0.372636 0.7098
RESID^2(-4) -0.050394 0.068350 -0.737302 0.4618
RESID^2(-5) -0.022153 0.068394 -0.323905 0.7463
RESID^2(-6) 0.326619 0.068441 4.772288 0.0000
R-squared 0.103258 Mean dependent var 28.72062
Adjusted R-squared 0.077879 S.D. dependent var 54.98061
S.E. of regression 52.79631 Akaike info criterion 10.80220
Sum squared resid 590939.5 Schwarz criterion 10.91053
Log likelihood -1175.841 F-statistic 4.068571
Durbin-Watson stat 1.960973 Prob(F-statistic) 0.000696
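The Obs*R-squared line of the ARCH test can be reproduced from the test equation: the LM statistic is the number of observations times R-squared, compared with a chi-square distribution whose degrees of freedom equal the number of lagged squared residuals (6 here). The sketch uses the closed-form chi-square tail for even degrees of freedom so that it needs only the standard library; the function names are ours.

```python
import math

def chi2_sf_even_df(x, df):
    """P(chi-square(df) > x), closed form valid for positive even df only."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("closed form only holds for positive even df")
    half = x / 2.0
    terms = sum(half ** j / math.factorial(j) for j in range(df // 2))
    return math.exp(-half) * terms

def lm_statistic(n_obs, r_squared):
    """Obs*R-squared from an auxiliary regression."""
    return n_obs * r_squared

# Values copied from the ARCH test equation (219 observations, 6 lags).
lm = lm_statistic(219, 0.103258)
p_value = chi2_sf_even_df(lm, df=6)
print(round(lm, 4), round(p_value, 6))
```

The same two functions apply to the Breusch-Godfrey test, where 225 observations, R-squared 0.008201 and 2 lagged residuals give a p-value close to the reported 0.397.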
ARCH-GARCH Model
After trying various combinations of ARCH and GARCH terms, we decided to use a GARCH(2,2) model.
Dependent Variable: DSDCALEIHN
Method: ML - ARCH (Marquardt) - Normal distribution
Sample (adjusted): 1991M07 2010M03
Included observations: 225 after adjustments
Convergence achieved after 42 iterations
MA backcast: 1990M07 1991M06, Variance backcast: ON
GARCH = C(6) + C(7)*RESID(-1)^2 + C(8)*RESID(-2)^2 + C(9)*GARCH(-1) + C(10)*GARCH(-2)
Coefficient Std. Error z-Statistic Prob.
C -0.104236 0.148511 -0.701870 0.4828
AR(2) 0.237301 0.050194 4.727673 0.0000
AR(4) 0.144621 0.053824 2.686915 0.0072
AR(5) 0.110283 0.054069 2.039675 0.0414
MA(12) -0.883960 0.028691 -30.80957 0.0000
Variance Equation
C 90.44445 8.658893 10.44527 0.0000
RESID(-1)^2 0.001449 0.007789 0.185983 0.8525
RESID(-2)^2 -0.025738 0.011630 -2.213144 0.0269
GARCH(-1) -1.008740 0.009896 -101.9370 0.0000
GARCH(-2) -0.969267 0.013175 -73.57121 0.0000
R-squared 0.454480 Mean dependent var -0.270222
Adjusted R-squared 0.431644 S.D. dependent var 7.250342
S.E. of regression 5.465989 Akaike info criterion 6.092879
Sum squared resid 6423.562 Schwarz criterion 6.244706
Log likelihood -675.4489 F-statistic 19.90213
Durbin-Watson stat 2.075892 Prob(F-statistic) 0.000000
Inverted AR Roots .81 .14-.62i .14+.62i -.55-.19i
-.55+.19i
Inverted MA Roots .99 .86+.49i .86-.49i .49+.86i
.49-.86i -.00-.99i -.00+.99i -.49-.86i
-.49+.86i -.86+.49i -.86-.49i -.99
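The variance equation GARCH = C(6) + C(7)*RESID(-1)^2 + C(8)*RESID(-2)^2 + C(9)*GARCH(-1) + C(10)*GARCH(-2) is a simple recursion once the coefficients are known. The sketch below only evaluates that recursion; it does not estimate anything. The parameter values, the residuals, and the choice of one common presample value for both squared residuals and variances are all assumptions made for illustration.

```python
def garch22_variance(resid, omega, alpha, beta, presample):
    """h[t] = omega + a1*e[t-1]^2 + a2*e[t-2]^2 + b1*h[t-1] + b2*h[t-2].

    presample fills the two squared residuals and two variances
    that are needed before the first observation (an assumption).
    """
    sq = [presample, presample] + [e * e for e in resid]
    h = [presample, presample]
    for t in range(2, len(resid) + 2):
        h.append(omega
                 + alpha[0] * sq[t - 1] + alpha[1] * sq[t - 2]
                 + beta[0] * h[t - 1] + beta[1] * h[t - 2])
    return h[2:]  # one conditional variance per residual

# Hypothetical inputs, not the estimates from the output above.
variances = garch22_variance(resid=[1.0, 1.0, 1.0], omega=1.0,
                             alpha=(0.1, 0.1), beta=(0.2, 0.2),
                             presample=2.0)
print(variances)
```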
Model Validation
• Q-statistics are significant from lag 9
• Residuals are clean from lag 5 to lag 8.
ARCH Test for GARCH(2,2)
ARCH Test:
F-statistic 0.769904 Probability 0.594354
Obs*R-squared 4.670189 Probability 0.586754
Test Equation:
Dependent Variable: STD_RESID^2
Method: Least Squares
Date: 06/02/10 Time: 00:45
Sample (adjusted): 1992M01 2010M03
Included observations: 219 after adjustments
Variable Coefficient Std. Error t-Statistic Prob.
C 0.775074 0.185477 4.178819 0.0000
STD_RESID^2(-1) 0.066195 0.068396 0.967827 0.3342
STD_RESID^2(-2) 0.011702 0.068680 0.170380 0.8649
STD_RESID^2(-3) 0.080423 0.076642 1.049326 0.2952
STD_RESID^2(-4) -0.039251 0.076481 -0.513213 0.6083
STD_RESID^2(-5) 0.007833 0.076446 0.102458 0.9185
STD_RESID^2(-6) 0.106570 0.076488 1.393287 0.1650
R-squared 0.021325 Mean dependent var 0.997558
Adjusted R-squared -0.006373 S.D. dependent var 1.523201
S.E. of regression 1.528047 Akaike info criterion 3.717299
Sum squared resid 495.0046 Schwarz criterion 3.825626
Log likelihood -400.0443 F-statistic 0.769904
Durbin-Watson stat 2.010203 Prob(F-statistic) 0.594354
• The F-statistic is now not significant
• Problem of conditional heteroskedasticity is solved
Further Validations
• The Actual, Fitted and Residuals plot looks good
• From the histogram, the standardized residuals look normal (Jarque-Bera probability 0.36), consistent with white noise.
[Figure: Residual, Actual and Fitted series, 1992 to 2008]
[Figure: histogram of standardized residuals]
Series: Standardized Residuals
Sample 1991M07 2010M03
Observations 225
Mean 0.005339
Median 0.054744
Maximum 3.301127
Minimum -3.022454
Std. Dev. 0.990233
Skewness -0.136439
Kurtosis 3.379876
Jarque-Bera 2.050950
Probability 0.358626
Testing our model
• Now that we have estimated a satisfactory model, let’s test it.
• We want to forecast the past 12 months (April 2009 to March 2010) based on the data up to March 2009.
Dependent Variable: DSDCALEIHN
Method: ML - ARCH (Marquardt) - Normal distribution
Sample (adjusted): 1991M07 2009M03
Included observations: 213 after adjustments
Convergence achieved after 37 iterations
MA backcast: 1990M07 1991M06, Variance backcast: ON
GARCH = C(6) + C(7)*RESID(-1)^2 + C(8)*RESID(-2)^2 + C(9)*GARCH(-1) + C(10)*GARCH(-2)
Coefficient Std. Error z-Statistic Prob.
C 0.008413 0.144789 0.058104 0.9537
AR(2) 0.115072 0.070790 1.625547 0.1040
AR(4) 0.137404 0.084550 1.625124 0.1041
AR(5) 0.126070 0.077214 1.632736 0.1025
MA(12) -0.871609 0.030465 -28.61010 0.0000
Variance Equation
C 2.822559 1.924208 1.466868 0.1424
RESID(-1)^2 0.213536 0.101105 2.112032 0.0347
RESID(-2)^2 -0.079787 0.100402 -0.794673 0.4268
GARCH(-1) 0.227943 0.223105 1.021687 0.3069
GARCH(-2) 0.556361 0.183598 3.030321 0.0024
R-squared 0.443972 Mean dependent var -0.441784
Adjusted R-squared 0.419320 S.D. dependent var 7.082465
S.E. of regression 5.397005 Akaike info criterion 6.209403
Sum squared resid 5912.915 Schwarz criterion 6.367210
Log likelihood -651.3014 F-statistic 18.00993
Durbin-Watson stat 1.979207 Prob(F-statistic) 0.000000
Inverted AR Roots .78 .15-.66i .15+.66i -.54-.26i
-.54+.26i
Inverted MA Roots .99 .86-.49i .86+.49i .49-.86i
.49+.86i .00+.99i -.00-.99i -.49-.86i
-.49+.86i -.86+.49i -.86-.49i -.99
Forecast: April 2009 – March 2010
[Figure: forecast DSDCALEIHNF, 2009M04 to 2010M03]
Forecast: DSDCALEIHNF
Actual: DSDCALEIHN
Forecast sample: 2009M04 2010M03
Included observations: 12
Root Mean Squared Error 6.547299
Mean Absolute Error 5.226972
Mean Abs. Percent Error 201.1492
Theil Inequality Coefficient 0.367983
Bias Proportion 0.087344
Variance Proportion 0.155742
Covariance Proportion 0.756914
[Figure: Forecast of Variance, 2009M04 to 2010M03]
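The forecast evaluation statistics can be computed from any pair of actual and forecast series. The sketch below assumes the usual definitions: the Theil inequality coefficient is RMSE divided by the sum of the root mean squares of forecast and actual, and the mean squared error splits into bias, variance and covariance proportions that add up to one. The function name and the tiny example series are hypothetical.

```python
import math

def forecast_metrics(actual, forecast):
    """RMSE, MAE, Theil inequality coefficient and MSE proportions.

    Assumes at least one forecast error is nonzero.
    """
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    theil = rmse / (math.sqrt(sum(f * f for f in forecast) / n)
                    + math.sqrt(sum(a * a for a in actual) / n))
    mean_a = sum(actual) / n
    mean_f = sum(forecast) / n
    sd_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    sd_f = math.sqrt(sum((f - mean_f) ** 2 for f in forecast) / n)
    bias = (mean_f - mean_a) ** 2 / mse
    variance = (sd_f - sd_a) ** 2 / mse
    covariance = 1.0 - bias - variance  # the remainder of the decomposition
    return {"rmse": rmse, "mae": mae, "theil": theil,
            "bias": bias, "variance": variance, "covariance": covariance}

# Hypothetical three-point example.
metrics = forecast_metrics(actual=[1.0, 2.0, 3.0], forecast=[1.0, 2.0, 5.0])
print(metrics)
```

In the output above, the covariance proportion of about 0.76 means most of the forecast error is unsystematic rather than due to bias or a mismatch in variability.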
Compare forecast to actual
[Figure: DSDCALEIHN and forecast FORECASTDSDCA with bands FORECASTDSDCA+2*SEF and FORECASTDSDCA-2*SEF, 2000 to 2010]
Forecast for the rest of 2010
• Use the full sample from 1990.01 to 2010.03
[Figure: forecast DSDCALEIHNF_F, from 2010M04]
[Figure: Forecast of Variance, from 2010M04]
Forecast for the rest of 2010
[Figure: DSDCALEIHN and forecast FORECASTDSDCA_F with bands FORECASTDSDCA_F+2*SEF_F and FORECASTDSDCA_F-2*SEF_F, 2000 to 2010]
Recolor
• Our forecast shows a recovery to the same seasonal pattern.
• However, there could be a permanent downward shift.
[Figure: CALEIHN and forecast CALEIHN_F with bands CALEIHN_F+2*SEF_F and CALEIHN_F-2*SEF_F, 2000 to 2010]
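Recoloring undoes the two differencing steps. Since DSDCALEIHN(t) = [CALEIHN(t) - CALEIHN(t-12)] - [CALEIHN(t-1) - CALEIHN(t-13)], a forecast of the differenced series turns into a level forecast by adding back CALEIHN(t-1) + CALEIHN(t-12) - CALEIHN(t-13). A sketch with a round-trip check on made-up numbers (the function names and data are hypothetical):

```python
def double_difference(levels):
    """DSD[t] = (y[t] - y[t-12]) - (y[t-1] - y[t-13]) for t >= 13."""
    return [levels[t] - levels[t - 12] - levels[t - 1] + levels[t - 13]
            for t in range(13, len(levels))]

def recolor(history, dsd_forecasts):
    """Turn forecasts of the differenced series back into levels."""
    if len(history) < 13:
        raise ValueError("need at least 13 past levels")
    levels = list(history)
    for d in dsd_forecasts:
        # y[t] = DSD[t] + y[t-1] + y[t-12] - y[t-13]
        levels.append(d + levels[-1] + levels[-12] - levels[-13])
    return levels[len(history):]

# Round trip on a hypothetical series: difference it, then rebuild it.
levels = [1000 + t * t for t in range(30)]
dsd = double_difference(levels)
rebuilt = recolor(levels[:13], dsd)
print(rebuilt == levels[13:])
```

Each rebuilt level depends on earlier rebuilt levels, so forecast errors accumulate as the horizon grows.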
Conclusion
• Unlike past recessions, the latest great recession caused a dramatic downward shift.
[Figure: CALEIHN and forecast CALEIHN_F2, 1990 to 2012]
Our forecast until 2012 does not indicate a recovery to pre-2008 levels.