Leisure & Hospitality Employment in California


Jesus Barragan, Alex Killian

Rasik Cauchon-Desai, Ling Zhu

Jeannette Figg, Caitlin Hunsuck

Background

– We chose to analyze the level of leisure and hospitality employment in the state of California by looking at the number of people (in thousands) employed in the industry each month over the past twenty years.

– We believe this topic is of great interest because, with the ongoing recession, this industry is particularly susceptible to change: it depends on people’s disposable incomes.

– If disposable incomes decline, we expect the number of employees in the industry to decline as well.

Examine the trace

• It appears highly seasonal and may be a random walk.

Histogram

• Non-normal

[Figure: histogram of CALEIHN]

Series: CALEIHN
Sample: 1990:01 2010:03
Observations: 243

Mean         1320.300
Median       1320.300
Maximum      1611.700
Minimum      1063.200
Std. Dev.    156.8197
Skewness     0.095839
Kurtosis     1.743337

Jarque-Bera  16.36143
Probability  0.000280
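The Jarque-Bera statistic in the table can be reproduced from the reported skewness and kurtosis. The sketch below assumes the standard formula JB = n/6 * (S^2 + (K-3)^2/4) and a chi-square reference distribution with 2 degrees of freedom; the function name is ours, not part of the EViews output.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Return the Jarque-Bera statistic and its chi-square(2) p-value."""
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # With 2 degrees of freedom the chi-square upper tail is exp(-x/2).
    p_value = math.exp(-jb / 2.0)
    return jb, p_value

# Inputs taken from the CALEIHN histogram statistics.
jb, p = jarque_bera(n=243, skewness=0.095839, kurtosis=1.743337)
print(f"JB = {jb:.5f}, p = {p:.6f}")
```

Most of the statistic comes from the kurtosis term: a kurtosis of 1.74 is far below the normal value of 3, which is why normality is rejected even though skewness is close to zero.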

Correlogram

• The correlogram also indicates a random walk (large spike in the PACF and slow decay in the ACF).

Unit root test

• The unit root test cannot reject the null hypothesis of a unit root (ADF statistic -1.42 against a 5% critical value of -2.87), so the series is evolutionary.

ADF Test Statistic   -1.420320     1% Critical Value*   -3.4589
                                   5% Critical Value    -2.8736
                                  10% Critical Value    -2.5731

*MacKinnon critical values for rejection of hypothesis of a unit root.

Augmented Dickey-Fuller Test Equation

Dependent Variable: D(CALEIHN)

Method: Least Squares

Date: 05/26/10 Time: 12:24

Sample(adjusted): 1990:02 2010:03

Included observations: 242 after adjusting endpoints

Variable Coefficient Std. Error t-Statistic Prob.

CALEIHN(-1) -0.010146 0.007144 -1.420320 0.1568

C 15.02531 9.493989 1.582613 0.1148

R-squared 0.008335 Mean dependent var 1.634711

Adjusted R-squared 0.004203 S.D. dependent var 17.43607

S.E. of regression 17.39938 Akaike info criterion 8.558977

Sum squared resid 72657.25 Schwarz criterion 8.587811

Log likelihood -1033.636 F-statistic 2.017308

Durbin-Watson stat 1.794741 Prob(F-statistic) 0.156812
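The test equation above regresses the first difference on a constant and the lagged level, and the ADF statistic is the t-statistic on the lagged level. The sketch below is a plain-Python illustration of that regression with no lagged differences (matching the output shown), plus the one-sided decision rule. The function names and the simulated series are ours and are assumptions for illustration, not the project's EViews workflow.

```python
import math
import random

def dickey_fuller_t(series):
    """t-statistic on y[t-1] in the regression dy[t] = c + gamma * y[t-1] + e[t]."""
    x = series[:-1]                                          # lagged level y[t-1]
    dy = [b - a for a, b in zip(series[:-1], series[1:])]    # first difference
    n = len(x)
    mean_x = sum(x) / n
    mean_dy = sum(dy) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (di - mean_dy) for xi, di in zip(x, dy))
    gamma = sxy / sxx
    const = mean_dy - gamma * mean_x
    ssr = sum((di - const - gamma * xi) ** 2 for xi, di in zip(x, dy))
    se_gamma = math.sqrt(ssr / (n - 2) / sxx)
    return gamma / se_gamma

def rejects_unit_root(t_stat, critical_value):
    """One-sided test: reject only if the statistic lies below the critical value."""
    return t_stat < critical_value

# Reported statistic for CALEIHN against the 5% critical value.
print(rejects_unit_root(-1.420320, -2.8736))   # False: a unit root cannot be rejected

# Illustrative check on simulated white noise, which has no unit root.
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
noise_t = dickey_fuller_t(noise)
```

The statistic must be compared with the MacKinnon critical values rather than the usual t-table, which is why the reported Prob. of 0.1568 in the regression output is not the p-value of the unit root test.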

Seasonal Difference

• Generate: SDCALEIHN=CALEIHN-CALEIHN(-12)

[Figure: trace of SDCALEIHN, 1990 to 2010]

[Figure: histogram of SDCALEIHN]

Series: SDCALEIHN
Sample: 1991:01 2010:03
Observations: 231

Mean         19.89524
Median       27.20000
Maximum      54.30000
Minimum      -89.30000
Std. Dev.    28.20975
Skewness     -2.088266
Kurtosis     7.692993

Unit root test

• The ADF test again cannot reject a unit root, so SDCALEIHN is also evolutionary.

ADF Test Statistic   -1.521954     1% Critical Value*   -3.4602
                                   5% Critical Value    -2.8742
                                  10% Critical Value    -2.5734

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(SDCALEIHN)
Method: Least Squares
Date: 05/26/10 Time: 12:37
Sample(adjusted): 1991:02 2010:03
Included observations: 230 after adjusting endpoints

Variable         Coefficient   Std. Error   t-Statistic   Prob.
SDCALEIHN(-1)    -0.026076     0.017133     -1.521954     0.1294
C                 0.359419     0.590504      0.608665     0.5434

R-squared            0.010057    Mean dependent var      -0.165652
Adjusted R-squared   0.005715    S.D. dependent var       7.288904
S.E. of regression   7.268044    Akaike info criterion    6.813509
Sum squared resid    12043.98    Schwarz criterion        6.843406
Log likelihood       -781.5536   F-statistic              2.316343
Durbin-Watson stat   2.023199    Prob(F-statistic)        0.129406

First-Difference

• Generate: DSDCALEIHN=SDCALEIHN-SDCALEIHN(-1)

• Now the trace looks stationary

• Still non-normal
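The two Generate steps (seasonal difference at lag 12, then first difference) can be sketched in plain Python as follows. The input series here is placeholder data standing in for the 243 monthly CALEIHN observations, and the function names are ours.

```python
def seasonal_difference(series, period=12):
    """x[t] - x[t-period]; the first `period` observations are lost."""
    return [series[t] - series[t - period] for t in range(period, len(series))]

def first_difference(series):
    """x[t] - x[t-1]; the first observation is lost."""
    return [series[t] - series[t - 1] for t in range(1, len(series))]

# Placeholder data (not the real CALEIHN values):
# a linear trend plus a repeating 12-month pattern.
pattern = [0, 5, 12, 20, 30, 38, 40, 39, 28, 18, 8, 3]
caleihn = [1000.0 + 2.0 * t + pattern[t % 12] for t in range(243)]

sdcaleihn = seasonal_difference(caleihn)     # 231 observations
dsdcaleihn = first_difference(sdcaleihn)     # 230 observations
print(len(sdcaleihn), len(dsdcaleihn))
```

The observation counts match the samples reported on the slides: 243 levels give 231 seasonal differences and 230 values after the additional first difference. In the placeholder series the seasonal difference removes the repeating pattern and leaves a constant, which the first difference then removes.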

[Figure: trace of DSDCALEIHN, 1990 to 2010]

[Figure: histogram of DSDCALEIHN]

Series: DSDCALEIHN
Sample: 1991:02 2010:03
Observations: 230

Mean         -0.165652
Median       -0.450000
Maximum      28.70000
Minimum      -27.80000
Std. Dev.    7.288904
Skewness     0.320093
Kurtosis     5.971872

Unit root test

• Yes! Now we can reject the null hypothesis of a unit root.

• Our seasonally differenced and first differenced data is stationary

ADF Test Statistic   -15.53900     1% Critical Value*   -3.4604
                                   5% Critical Value    -2.8742
                                  10% Critical Value    -2.5735

Augmented Dickey-Fuller Test Equation
Dependent Variable: D(DSDCALEIHN)
Method: Least Squares
Date: 05/26/10 Time: 12:48
Sample(adjusted): 1991:03 2010:03
Included observations: 229 after adjusting endpoints

Variable          Coefficient   Std. Error   t-Statistic   Prob.
DSDCALEIHN(-1)    -1.034761     0.066591     -15.53900     0.0000
C                 -0.198298     0.483039     -0.410521     0.6818

R-squared            0.515434    Mean dependent var       0.024017
Adjusted R-squared   0.513299    S.D. dependent var       10.47319
S.E. of regression   7.306507    Akaike info criterion    6.824103
Sum squared resid    12118.40    Schwarz criterion        6.854092
Log likelihood       -779.3598   F-statistic              241.4604
Durbin-Watson stat   1.982480    Prob(F-statistic)        0.000000

Examine the Correlogram to specify a model

• We may have over-differenced seasonally, because there are large spikes at lags 12 and 24.

• The PACF has spikes at lags 5, 12, and 24, and the ACF has a spike at lag 12.

• Because T=230, the significance band is 2/√T ≈ 2/√225 ≈ 0.13. The PACF values at lags 5, 12, and 24 are all larger than 0.13 in magnitude.

• We therefore run the regression with AR(5), AR(24), and MA(12) terms.
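The 2/√T rule of thumb can be applied mechanically. The sketch below computes sample autocorrelations in plain Python and flags the lags that fall outside the band; it covers the ACF only (the PACF needs an extra recursion that is not shown), and the function names are ours.

```python
import math

def autocorrelations(series, max_lag):
    """Sample autocorrelations r_1 .. r_max_lag of a series."""
    n = len(series)
    mean = sum(series) / n
    dev = [v - mean for v in series]
    denom = sum(d * d for d in dev)
    return [sum(dev[t] * dev[t - k] for t in range(k, n)) / denom
            for k in range(1, max_lag + 1)]

def significant_lags(series, max_lag):
    """Lags whose sample autocorrelation lies outside the +/- 2/sqrt(T) band."""
    band = 2.0 / math.sqrt(len(series))
    acf = autocorrelations(series, max_lag)
    return [k for k, r in enumerate(acf, start=1) if abs(r) > band]

# Band for the 230 observations of DSDCALEIHN.
band = 2.0 / math.sqrt(230)
print(round(band, 3))   # 0.132
```

Using T=230 exactly gives 0.132 rather than the rounded 0.13, which does not change which spikes count as significant here.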

Estimate the model

• Because the coefficient on AR(24) is not significant, drop it and run the regression again.

Dependent Variable: DSDCALEIHN
Method: Least Squares
Sample(adjusted): 1993:02 2010:03
Included observations: 206 after adjusting endpoints
Convergence achieved after 11 iterations
Backcast: 1992:02 1993:01

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -0.185134     0.089629     -2.065566     0.0401
AR(5)       0.190326     0.060016      3.171255     0.0018
AR(24)     -0.013423     0.071887     -0.186717     0.8521
MA(12)     -0.885840     0.000163     -5438.865     0.0000

R-squared            0.442323    Mean dependent var      -0.233495
Adjusted R-squared   0.434041    S.D. dependent var       7.269642
S.E. of regression   5.468970    Akaike info criterion    6.255284
Sum squared resid    6041.746    Schwarz criterion        6.319903
Log likelihood       -640.2943   F-statistic              53.40567
Durbin-Watson stat   2.054422    Prob(F-statistic)        0.000000

Inverted AR Roots   .84-.10i   .84+.10i   .77+.30i   .77-.30i
                    .65+.50i   .65-.50i   .49+.67i   .49-.67i
                    .32+.79i   .32-.79i   .13-.84i   .13+.84i
                   -.10+.82i  -.10-.82i  -.32+.76i  -.32-.76i
                   -.52+.65i  -.52-.65i  -.68-.52i  -.68+.52i
                   -.77+.34i  -.77-.34i  -.82-.12i  -.82+.12i

Inverted MA Roots   .99        .86-.49i   .86+.49i   .49-.86i
                    .49+.86i   .00+.99i  -.00-.99i  -.49+.86i
                   -.49-.86i  -.86+.49i  -.86-.49i  -.99

Re-estimate the model

• All the coefficients are significant except the constant term (p = 0.17).

• The Durbin-Watson statistic is approximately 2.

Dependent Variable: DSDCALEIHN

Method: Least Squares

Convergence achieved after 14 iterations

Backcast: 1990:07 1991:06

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -0.107654     0.077479     -1.389466     0.1661
AR(5)       0.189503     0.062735      3.020673     0.0028
MA(12)     -0.885818     0.000192     -4617.023     0.0000

R-squared            0.429947    Mean dependent var   -0.270222
Adjusted R-squared   0.424811    S.D. dependent var    7.250342

Inverted AR Roots .72 .22 -.68i .22+.68i -.58+.42i

-.58 -.42i

Inverted MA Roots .99 .86+.49i .86 -.49i .49+.86i

.49 -.86i -.00 -.99i -.00+.99i -.49 -.86i

-.49+.86i -.86+.49i -.86 -.49i -.99

Correlogram of Residuals

• The values of PACF are still bigger than 0.13 at lag 2 and lag 4. Thus, add AR(2) and AR(4) into the model.

Re-estimate the model

• All the coefficients are significant except the constant term

Date: 05/26/10 Time: 13:25

Backcast: 1990:07 1991:06

Variable   Coefficient   Std. Error   t-Statistic   Prob.
C          -0.115196     0.114607     -1.005142     0.3159
AR(5)       0.162217     0.061254      2.648249     0.0087
AR(2)       0.140621     0.060669      2.317848     0.0214
AR(4)       0.159408     0.061818      2.578655     0.0106
MA(12)     -0.885745     0.000179     -4960.613     0.0000

R-squared 0.460963 Mean dependent var -0.270222

Inverted AR Roots .82 .16+.69i .16 -.69i -.57+.28i

-.57 -.28i

Inverted MA Roots .99 .86+.49i .86 -.49i .49 -.86i

Validate our model

• The correlogram of the residuals is clean: the p-values of the Q-statistics are all greater than 0.05.

Serial Correlation

• Because the p-value of F-statistic is bigger than 0.05, there is no evidence of serial correlation.

Breusch-Godfrey Serial Correlation LM Test:

F-statistic 0.908158 Probability 0.404787

Obs*R-squared 1.845227 Probability 0.397479

Test Equation:

Dependent Variable: RESID

Date: 05/26/10 Time: 13:35

C -0.006561 0.143425 -0.045744 0.9636

AR(5) -0.008993 0.066874 -0.134477 0.8931

AR(2) 0.220477 0.274062 0.804479 0.4220

AR(4) -0.038078 0.081569 -0.466814 0.6411

MA(12) -0.027171 0.029215 -0.930047 0.3534

RESID(-1) -0.051084 0.067996 -0.751284 0.4533

RESID(-2) -0.228863 0.281665 -0.812536 0.4174

Adjusted R-squared -0.019096 S.D. dependent var 5.322972

Correlogram of residuals squared

• Big spike at lag 6

• The Q-statistics are significant from lag 6

• There is conditional heteroskedasticity

ARCH LM Test

• The F-statistic is significant.

• We need an ARCH-GARCH model as a remedy.

ARCH Test:

F-statistic      4.068571    Probability   0.000696
Obs*R-squared    22.61355    Probability   0.000937

Test Equation:
Dependent Variable: RESID^2
Method: Least Squares
Date: 06/02/10 Time: 00:33
Sample (adjusted): 1992M01 2010M03
Included observations: 219 after adjustments

Variable       Coefficient   Std. Error   t-Statistic   Prob.
C              21.59285      5.870280      3.678334     0.0003
RESID^2(-1)     0.006552     0.065251      0.100415     0.9201
RESID^2(-2)    -0.021910     0.065331     -0.335366     0.7377
RESID^2(-3)     0.025496     0.068421      0.372636     0.7098
RESID^2(-4)    -0.050394     0.068350     -0.737302     0.4618
RESID^2(-5)    -0.022153     0.068394     -0.323905     0.7463
RESID^2(-6)     0.326619     0.068441      4.772288     0.0000

R-squared            0.103258    Mean dependent var       28.72062
Adjusted R-squared   0.077879    S.D. dependent var       54.98061
S.E. of regression   52.79631    Akaike info criterion    10.80220
Sum squared resid    590939.5    Schwarz criterion        10.91053
Log likelihood       -1175.841   F-statistic              4.068571
Durbin-Watson stat   1.960973    Prob(F-statistic)        0.000696
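The Obs*R-squared line of the ARCH test is the number of observations times the R-squared of the test regression, compared with a chi-square distribution whose degrees of freedom equal the number of lagged squared residuals (6 here). The sketch below reproduces the reported values in plain Python; it uses a closed-form chi-square tail that only holds for even degrees of freedom, and the function names are ours.

```python
import math

def chi2_sf_even(x, df):
    """Upper-tail probability of a chi-square variable with an even number of degrees of freedom."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form requires a positive even df")
    half = x / 2.0
    term, total = 1.0, 1.0
    # Accumulate sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total

def arch_lm(n_obs, r_squared, lags):
    """LM statistic (Obs*R-squared) and its chi-square p-value."""
    lm = n_obs * r_squared
    return lm, chi2_sf_even(lm, lags)

# Values taken from the ARCH test output: 219 observations, R-squared 0.103258, 6 lags.
lm, p = arch_lm(n_obs=219, r_squared=0.103258, lags=6)
print(f"LM = {lm:.4f}, p = {p:.6f}")
```

The same tail function with 2 degrees of freedom reproduces the Obs*R-squared probability of the Breusch-Godfrey test reported earlier.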

ARCH-GARCH Model

After trying various combinations of ARCH and GARCH terms, we decided to use a GARCH(2,2) model.

Method: ML - ARCH (Marquardt) - Normal distribution
Sample (adjusted): 1991M07 2010M03
Included observations: 225 after adjustments
MA backcast: 1990M07 1991M06, Variance backcast: ON
GARCH = C(6) + C(7)*RESID(-1)^2 + C(8)*RESID(-2)^2 + C(9)*GARCH(-1) + C(10)*GARCH(-2)

               Coefficient   Std. Error   z-Statistic   Prob.
C              -0.104236     0.148511     -0.701870     0.4828
AR(2)           0.237301     0.050194      4.727673     0.0000
AR(4)           0.144621     0.053824      2.686915     0.0072
AR(5)           0.110283     0.054069      2.039675     0.0414
MA(12)         -0.883960     0.028691     -30.80957     0.0000

Variance Equation
C              90.44445      8.658893      10.44527     0.0000
RESID(-1)^2     0.001449     0.007789      0.185983     0.8525
RESID(-2)^2    -0.025738     0.011630     -2.213144     0.0269
GARCH(-1)      -1.008740     0.009896     -101.9370     0.0000
GARCH(-2)      -0.969267     0.013175     -73.57121     0.0000

Inverted AR Roots   .81        .14-.62i   .14+.62i  -.55-.19i
                   -.55+.19i

Inverted MA Roots   .99        .86+.49i   .86-.49i   .49+.86i
                    .49-.86i  -.00-.99i  -.00+.99i  -.49-.86i
                   -.49+.86i  -.86+.49i  -.86-.49i  -.99

Model Validation

• The Q-statistics are significant from lag 9.

• The residuals are clean from lag 5 to lag 8.

ARCH Test for GARCH(2,2)

ARCH Test:

F-statistic 0.769904 Probability 0.594354

Obs*R-squared 4.670189 Probability 0.586754

Test Equation:

Dependent Variable: STD_RESID^2

Date: 06/02/10 Time: 00:45

C 0.775074 0.185477 4.178819 0.0000

STD_RESID^2(-1) 0.066195 0.068396 0.967827 0.3342

STD_RESID^2(-2) 0.011702 0.068680 0.170380 0.8649

STD_RESID^2(-3) 0.080423 0.076642 1.049326 0.2952

STD_RESID^2(-4) -0.039251 0.076481 -0.513213 0.6083

STD_RESID^2(-5) 0.007833 0.076446 0.102458 0.9185

STD_RESID^2(-6) 0.106570 0.076488 1.393287 0.1650

R-squared 0.021325 Mean dependent var 0.997558

Adjusted R-squared -0.006373 S.D. dependent var 1.523201

• The F-statistic is no longer significant.

• The problem of conditional heteroskedasticity is solved.

Further Validations

• The Actual, Fitted, and Residuals plot looks good.

• From the histogram, the standardized residuals look close to normal, consistent with white noise.

[Figure: Residual, Actual, and Fitted, 1992 to 2008]

[Figure: histogram of standardized residuals]

Series: Standardized Residuals
Sample: 1991M07 2010M03
Observations: 225

Mean         0.005339
Median       0.054744
Maximum      3.301127
Minimum      -3.022454
Std. Dev.    0.990233
Skewness     -0.136439
Kurtosis     3.379876

Testing our model

• Now that we have estimated a satisfactory model, let’s test it.

• We want to forecast the past 12 months based on the data up to March 2009.

Method: ML - ARCH (Marquardt) - Normal distribution
MA backcast: 1990M07 1991M06, Variance backcast: ON
GARCH = C(6) + C(7)*RESID(-1)^2 + C(8)*RESID(-2)^2 + C(9)*GARCH(-1) + C(10)*GARCH(-2)

               Coefficient   Std. Error   z-Statistic   Prob.
C               0.008413     0.144789      0.058104     0.9537
AR(2)           0.115072     0.070790      1.625547     0.1040
AR(4)           0.137404     0.084550      1.625124     0.1041
AR(5)           0.126070     0.077214      1.632736     0.1025
MA(12)         -0.871609     0.030465     -28.61010     0.0000

Variance Equation
C               2.822559     1.924208      1.466868     0.1424
RESID(-1)^2     0.213536     0.101105      2.112032     0.0347
RESID(-2)^2    -0.079787     0.100402     -0.794673     0.4268
GARCH(-1)       0.227943     0.223105      1.021687     0.3069
GARCH(-2)       0.556361     0.183598      3.030321     0.0024

Inverted AR Roots   .78        .15-.66i   .15+.66i  -.54-.26i
                   -.54+.26i

Inverted MA Roots   .99        .86-.49i   .86+.49i   .49-.86i
                    .49+.86i   .00+.99i  -.00-.99i  -.49-.86i
                   -.49+.86i  -.86+.49i  -.86-.49i  -.99
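The GARCH(2,2) variance equation in this output is a recursion in past squared residuals and past conditional variances. The sketch below writes that recursion in plain Python and uses the reported variance-equation estimates (C 2.822559, RESID(-1)^2 0.213536, RESID(-2)^2 -0.079787, GARCH(-1) 0.227943, GARCH(-2) 0.556361) to compute the persistence and the implied long-run variance. The function name and the start-up value `h0` are our assumptions; EViews initializes the variance by backcasting, which is not reproduced here.

```python
def garch22_variance(resid, omega, alpha, beta, h0):
    """h[t] = omega + a1*e[t-1]^2 + a2*e[t-2]^2 + b1*h[t-1] + b2*h[t-2]."""
    h = [h0, h0]                      # assumed start-up values for the first two periods
    for t in range(2, len(resid)):
        h.append(omega
                 + alpha[0] * resid[t - 1] ** 2 + alpha[1] * resid[t - 2] ** 2
                 + beta[0] * h[t - 1] + beta[1] * h[t - 2])
    return h

# Variance-equation estimates from the model re-estimated for the forecast test.
omega = 2.822559
alpha = (0.213536, -0.079787)
beta = (0.227943, 0.556361)

# Persistence below 1 means shocks to the variance die out over time.
persistence = sum(alpha) + sum(beta)
long_run_variance = omega / (1.0 - persistence)
print(f"persistence = {persistence:.6f}, long-run variance = {long_run_variance:.2f}")
```

With these estimates the persistence is below one, so the forecast of the variance settles toward a finite long-run level instead of growing without bound.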

Forecast: April 2009 – March 2010

[Figure: forecast DSDCALEIHNF, 2009M04 to 2010M03]

Forecast: DSDCALEIHNF
Actual: DSDCALEIHN
Forecast sample: 2009M04 2010M03
Included observations: 12

Root Mean Squared Error        6.547299
Mean Absolute Error            5.226972
Mean Abs. Percent Error        201.1492
Theil Inequality Coefficient   0.367983
  Bias Proportion              0.087344
  Variance Proportion          0.155742
  Covariance Proportion        0.756914

[Figure: forecast of variance, 2009M04 to 2010M03]
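The forecast evaluation statistics above can be computed from the actual and forecast values. The sketch below assumes the usual definitions: the Theil coefficient is the RMSE divided by the sum of the root mean squares of the forecast and the actual series, and the mean squared error is split into bias, variance, and covariance proportions that sum to one. The input numbers are made up for illustration and are not the project's data.

```python
import math

def forecast_accuracy(actual, forecast):
    """RMSE, MAE, Theil inequality coefficient, and the MSE decomposition."""
    n = len(actual)
    errors = [f - a for a, f in zip(actual, forecast)]
    mse = sum(e * e for e in errors) / n
    rmse = math.sqrt(mse)
    mae = sum(abs(e) for e in errors) / n
    theil = rmse / (math.sqrt(sum(f * f for f in forecast) / n)
                    + math.sqrt(sum(a * a for a in actual) / n))
    mean_a, mean_f = sum(actual) / n, sum(forecast) / n
    sd_a = math.sqrt(sum((a - mean_a) ** 2 for a in actual) / n)
    sd_f = math.sqrt(sum((f - mean_f) ** 2 for f in forecast) / n)
    cov = sum((a - mean_a) * (f - mean_f) for a, f in zip(actual, forecast)) / n
    return {
        "rmse": rmse, "mae": mae, "theil": theil,
        "bias": (mean_f - mean_a) ** 2 / mse,         # systematic over/under-prediction
        "variance": (sd_f - sd_a) ** 2 / mse,         # mismatch in variability
        "covariance": 2.0 * (sd_a * sd_f - cov) / mse,  # remaining unsystematic error
    }

# Made-up numbers for illustration only.
actual = [3.0, -2.0, 5.0, 1.0, -4.0, 2.0]
forecast = [2.0, -1.0, 3.5, 0.5, -2.0, 1.0]
stats = forecast_accuracy(actual, forecast)
```

In the reported output most of the error falls in the covariance proportion (0.76), which is the desirable case: little of the error comes from a systematic bias or from a mismatch in variability.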

Compare forecast to actual

[Figure: DSDCALEIHN with forecast FORECASTDSDCA and bands FORECASTDSDCA ± 2*SEF, 2000 to 2010]

Forecast for the rest of 2010

• Use the full sample from 1990:01 to 2010:03

[Figure: forecast DSDCALEIHNF_F, 2010M04 to 2010M12]

[Figure: forecast of variance, 2010M04 to 2010M12]

Forecast for the rest of 2010

[Figure: DSDCALEIHN with forecast FORECASTDSDCA_F and bands FORECASTDSDCA_F ± 2*SEF_F, 2000 to 2010]

Recolor

• Our forecast shows a recovery to the same seasonal pattern.

• However, there could be a permanent downward shift.

[Figure: CALEIHN with forecast CALEIHN_F and bands CALEIHN_F ± 2*SEF_F, 2000 to 2010]
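Recoloring reverses the two differencing steps: the forecast changes are first accumulated into seasonal differences, and each seasonal difference is then added to the level observed 12 months earlier. A minimal Python sketch under that reading follows; the function name and the placeholder history are ours and do not use the CALEIHN data.

```python
def recolor(dsd_forecast, sd_history, level_history, period=12):
    """Rebuild level forecasts from forecasts of the seasonally and first-differenced series."""
    sd = list(sd_history)          # seasonal differences observed so far
    levels = list(level_history)   # levels observed so far
    start = len(levels)
    for change in dsd_forecast:
        sd.append(sd[-1] + change)                  # undo the first difference
        levels.append(levels[-period] + sd[-1])     # undo the seasonal difference
    return levels[start:]

# Placeholder history: two years of a trending series with a summer bump.
history = [100.0 + t + (10.0 if t % 12 in (5, 6, 7) else 0.0) for t in range(24)]
sd_history = [history[t] - history[t - 12] for t in range(12, 24)]

# A forecast of zero change in the differenced series carries the last seasonal
# difference forward, so the level repeats last year's pattern shifted by that amount.
path = recolor([0.0, 0.0, 0.0], sd_history, history)
print(path)
```

This is why a forecast of the differenced series that returns to zero shows up in levels as a return to the earlier seasonal pattern, while the level itself stays at whatever the last observed seasonal difference implies.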

Conclusion

• Unlike past recessions, the latest Great Recession caused a dramatic downward shift.

[Figure: CALEIHN and forecast CALEIHN_F2, 1990 to 2012]

• Our forecast through 2012 does not indicate a recovery to pre-2008 levels.
