
Asymptotic Properties of GARCH-X Processes

Heejoon Han¹

Department of Economics and Risk Management Institute, National University of Singapore

August 2010

Abstract

The paper considers the GARCH-X process in which the covariate is generalized as a fractionally integrated process $I(d)$ for $-1/2 < d < 1/2$ or $1/2 < d < 3/2$. We investigate the asymptotic properties of this process and show how it explains stylized facts of financial time series such as the long memory property in volatility, leptokurtosis and IGARCH. If the covariate is a long memory process, whether stationary or nonstationary, the autocorrelation of the squared process of the model generates the long memory property in volatility, following the trend commonly observed in real data. The asymptotic limit of the sample kurtosis of the GARCH-X process is larger than that of the GARCH(1,1) process unless the covariate is antipersistent. We also analyze the effect of omitting a covariate that is nonstationary as well as persistent. Our analysis shows that, if the relevant covariate is omitted and the usual GARCH(1,1) model is fitted, then the model would be estimated approximately as the IGARCH. This may well explain the ubiquitous evidence of the IGARCH in empirical volatility analysis.

JEL classification: C22, C50, G12

Keywords: GARCH, GARCH-X, fractionally integrated process, long memory property, leptokurtosis, IGARCH

1 Introduction

While most ARCH type models have been univariate, relating the volatility of a time series only to the information contained in its own past history, researchers have naturally included additional economic variables as covariates in ARCH type models to model the volatility of economic and financial time series. Since the GARCH(1,1) model has been popular, these works mostly use the GARCH(1,1) model with a covariate as follows:

$$\sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2 + \pi x_{t-1} \quad (\text{or } \pi x_{t-1}^2), \tag{1}$$

¹ This paper is based on parts of a working paper circulated under the title "GARCH process with persistent covariates". Research is supported by the NUS Risk Management Institute. Department of Economics, National University of Singapore, 1 Arts Link, Singapore 117570; phone: (65) 6516-6258; fax: (65) 6775-2646; Email: [email protected]


where $(y_t)$ is a demeaned time series and $\sigma_t^2$ is its variance conditional on the information available at time $t-1$. As the covariate $(x_t)$ in (1), Glosten et al. (1993), Brenner et al. (1996), Gray (1996) and Engle and Patton (2001) used interest rate levels. Likewise, forward-spot spreads and interest rate spreads were used as covariates by Hodrick (1989) and Hagiwara and Herce (1999), respectively. See also Fleming et al. (2008) for more references. Following Brenner et al. (1996), this model is referred to as the GARCH-X model.² Moreover, the GARCH-X model with the restriction of $\beta = 0$ in (1) is also considered by Han and Park (2008), where the yield spread between Aaa and Baa bonds is used as the covariate.

The covariates used in the GARCH-X models were mostly economic variables, but recently various realized measures of volatility constructed from high frequency data have been adopted with the rapid development of the field of realized volatility. The multiplicative error model (MEM) by Engle (2002) first used the realized variance as the covariate in the framework of the GARCH-X model. Barndorff-Nielsen and Shephard (2007) included both the realized variance and the bipower variation. See also Engle and Gallo (2006), Cipollini et al. (2007), Shephard and Sheppard (2010) and Hansen et al. (2010). In particular, the HEAVY model by Shephard and Sheppard (2010) and the Realized GARCH model by Hansen et al. (2010) specify the conditional variance as the GARCH-X model with the restriction of $\alpha = 0$ in (1).

Compared to the GARCH(1,1) process, we expect that the GARCH-X process could exhibit substantially different characteristics, in particular when the covariate is persistent in memory. However, even though the GARCH-X model is widely used in practice, no existing literature investigates the effect of a stochastic covariate in the GARCH-X model on various characteristics of financial time series. We attempt to fill this gap by investigating the time series properties of the GARCH-X process. We show that stylized facts of financial time series can be successfully explained in the framework of the GARCH-X model with suitable conditions for the covariate.

We consider the GARCH-X process in which the covariate $(x_t)$ is generalized to be a fractionally integrated process $I(d)$ for $-1/2 < d < 1/2$ or $1/2 < d < 3/2$. The covariate is allowed to be stationary short memory, stationary long memory, nonstationary long memory or integrated. We model the covariate as a fractionally integrated process with a wide range of orders of integration so that it can represent the various types of time series used as covariates in the GARCH-X models. This seems desirable considering that each covariate used in the GARCH-X models shows a different degree of persistence. While the time series of some covariates are nonstationary and can be modeled as unit root processes, the time series of other covariates clearly reject the unit root hypothesis. However, even if the unit root hypothesis is rejected for those variables, the degree of persistence is mostly high in the economic variables used in the GARCH-X models. Moreover, it is well known that the time series of realized measures are also persistent. For example, Andersen et al. (2003), Andersen et al. (2009) and Hol (2003) emphasized the evidence of long memory in the time series of realized measures.

² The GARCH-X model by Lee (1994) is also a GARCH model with a covariate. However, it is a multivariate model that includes the error correction term from a cointegrating type relationship for the underlying vector process.


The contributions of this paper are as follows. We focus on three commonly observed facts in financial time series: the long memory property in volatility, leptokurtosis and IGARCH. We investigate the statistical properties of the GARCH-X process and examine how this process explains these stylized facts. Additionally, we consider the cases with the restriction of $\beta = 0$ or $\alpha = 0$ in (1), and report how each restriction affects the asymptotic results. We investigate whether there is any qualitative difference in each case.

First, regarding the long memory property in volatility, it is known that the sample autocorrelation function of squared financial return series (in particular high frequency data) decreases fast at first and remains significantly positive for larger lags. The asymptotic behavior of the sample autocorrelation function of the squared process generated by the GARCH-X model describes this trend and generates the long memory property in volatility as long as the covariate is long memory, whether it is stationary or nonstationary. If the covariate is stationary long memory, the sample autocorrelation decays hyperbolically. If the covariate is nonstationary long memory, the autocorrelation decreases exponentially at first and finally converges to a positive random limit that is smaller than unity.

Second, it is also known that the kurtosis implied by the GARCH(1,1) model with normally distributed innovations tends to be far less than the sample kurtosis observed in many financial return series. The asymptotic limit of the sample kurtosis in the GARCH-X process is larger than that of the GARCH(1,1) process unless the covariate is antipersistent. When the covariate is an $I(d)$ process for $0 \le d < 1/2$, the kurtosis of the GARCH-X process has an additional positive nonrandom term compared to that of the GARCH(1,1) process. When the covariate is nonstationary long memory, the kurtosis of the GARCH-X process is randomly larger than that of the GARCH(1,1) process. Hence, the GARCH-X model provides an explanation of the sample kurtosis observed in financial return series without using innovations with fat-tailed distributions.

Finally, it is well known that most empirical applications of the GARCH(1,1) model to financial return series suggest the IGARCH(1,1) process if the sample size is sufficiently large. We conduct a study on misspecification: assuming that the true data generating process is the GARCH-X process with a nonstationary long memory covariate, we investigate the effect of omitting the covariate on the GARCH(1,1) model estimation. Our study shows that the IGARCH could be the result of omitting a relevant covariate that is nonstationary as well as persistent in memory in the GARCH-X model.

The rest of the paper is organized as follows. Section 2 introduces the model with some preliminary concepts and results, which are necessary for the development of our theory in the paper. Sections 3 and 4 investigate how our model explains the stylized facts of financial time series. Section 3 examines the asymptotic behaviors of sample statistics such as the sample autocorrelation of the squared process, as well as the sample variance and kurtosis. Section 4 shows that the commonly observed IGARCH evidence could be due to omitting a nonstationary covariate in the GARCH-X model. Section 5 concludes the paper, and Appendices A and B contain mathematical proofs for the technical results in the paper.

Finally, a word on notation. We denote by $\mathbb{R}_+$ and $\mathbb{R}_{++}$ the sets of real numbers that are nonnegative and positive, respectively. Standard terminologies and notations in probability and measure theory are used throughout the paper. In particular, notations for various modes of convergence such as $\to_{a.s.}$, $\to_p$ and $\to_d$ will frequently appear. All limits are taken as $n \to \infty$, except where otherwise indicated.

2 The Model and Preliminaries

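We consider the volatility model specified as

$$y_t = \sigma_t \varepsilon_t,$$

and let $(\mathcal{F}_t)$ be the filtration representing the information available at time $t$.

Assumption 1 Assume that
(a) $(\varepsilon_t)$ is iid $(0,1)$ and adapted to $(\mathcal{F}_t)$,
(b) $(\sigma_t)$ is given by

$$\sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2 + \pi x_t \tag{2}$$

for parameters $\alpha, \beta \in \mathbb{R}_+$ such that $\alpha + \beta < 1$,
(c) $\omega + \pi x_t > 0$ for all $t$,
(d) $(x_t)$ is adapted to $(\mathcal{F}_{t-1})$.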

Assumption 1 defines $(y_t)$ as the GARCH-X process and introduces parameter conditions. The specific form $\omega + \pi x_t$ in (2) is the one most commonly used in the literature on the GARCH-X model, which adopts a nonnegative covariate $(x_t)$ with parameter conditions $\omega > 0$ and $\pi \ge 0$. Other specifications are also used in the literature. For example, $\pi x_t^2$ instead of $\pi x_t$ is used in a few cases, and the case with $\omega = 0$ is also adopted.³ Considering these practical applications of the model, the condition in part (c) is not restrictive. As in the literature, we can simply use a nonnegative covariate $(x_t)$ with parameter conditions $\omega > 0$ and $\pi \ge 0$. We assume part (d), instead of using $(x_{t-1})$ as in (1), for notational convenience in proofs.
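The recursion in (2) can be illustrated with a short simulation. The following is only a minimal sketch under stated assumptions: the function name `simulate_garch_x`, the Gaussian innovations, and the start-up values (the GARCH(1,1) unconditional level for $\sigma_0^2$ and $y_0 = 0$) are choices made for illustration and are not taken from the paper.

```python
import random


def simulate_garch_x(x, omega, alpha, beta, pi, seed=0):
    """Simulate y_t = sigma_t * eps_t with sigma_t^2 given by equation (2).

    x is the covariate path (x_1, ..., x_n). Returns (y, sigma2) as lists.
    """
    if alpha < 0 or beta < 0 or alpha + beta >= 1:
        raise ValueError("need alpha, beta >= 0 and alpha + beta < 1")
    rng = random.Random(seed)
    # Start-up values are an assumption of this sketch, not part of the model.
    sigma2_prev = omega / (1.0 - alpha - beta)
    y_prev = 0.0
    y, sigma2 = [], []
    for x_t in x:
        # Assumption 1(c): omega + pi * x_t must be strictly positive.
        if omega + pi * x_t <= 0:
            raise ValueError("Assumption 1(c) violated: omega + pi * x_t <= 0")
        s2 = omega + alpha * y_prev ** 2 + beta * sigma2_prev + pi * x_t
        # Gaussian innovations are one admissible iid (0,1) choice.
        y_t = s2 ** 0.5 * rng.gauss(0.0, 1.0)
        y.append(y_t)
        sigma2.append(s2)
        y_prev, sigma2_prev = y_t, s2
    return y, sigma2
```

Because $x_t$ is known at time $t-1$ under part (d), the sketch computes $\sigma_t^2$ before drawing $\varepsilon_t$. With a nonnegative covariate, $\omega > 0$ and $\pi \ge 0$, the positivity check never triggers.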

Han and Park (2008) investigated the time series properties of a related process, where $\sigma_t^2 = \alpha y_{t-1}^2 + f(x_t, \pi)$ for an integrated or near-integrated covariate $(x_t)$ and a positive nonlinear volatility function $f(\cdot)$. They considered various nonlinear transformations of an integrated or near-integrated covariate in the ARCH-X framework, where $f(x_t, \pi) = \omega + \pi_1 |x_t|^{\pi_2}$ is certainly allowed. Since Park and Phillips (1999) established asymptotics for nonlinear transformations of an integrated process, a nonlinear transformation of $(x_t)$ can be allowed in (2) as long as the covariate is modeled as an integrated or near-integrated process. Park (2002) first applied these asymptotics in the conditional variance model. However, when $(x_t)$ is fractionally integrated, no asymptotics are available for such a nonlinear transformation of $(x_t)$.⁴ Therefore, we do not consider nonlinear transformations of the covariate $(x_t)$ in our model, because it is assumed to be a fractionally integrated process as follows.

³ Hwang and Satchell (2005) considered $\sigma_t^2 = \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2 + \pi \sigma_{C,mt-1}^2$, where $\sigma_{C,mt-1}^2$ is the lagged cross-sectional market volatility, in their individual asset volatility model referred to as the GARCHX model. The parameter condition is $\pi > 0$ in their case, which guarantees the positivity of $\pi \sigma_{C,mt-1}^2$ because $\sigma_{C,mt-1}^2$ is always positive.

Assumption 2 Assume that
(a) for $-1/2 < d < 1/2$,

$$(1 - L)^d \xi_t = v_t, \tag{3}$$

(b) $(v_t)$ is a zero mean covariance stationary process that is near-epoch-dependent on a mixing process satisfying Assumption 1 in Davidson and De Jong (2000), and
(c) $(x_t)$ is a fractionally integrated process of order $d_x$, $I(d_x)$, such that

$$x_t = \xi_t \ \text{ for } -1/2 < d_x < 1/2, \quad \text{or} \quad x_t = x_{t-1} + \xi_t \ \text{ for } 1/2 < d_x < 3/2.$$

Assumption 2 precisely defines the covariate $(x_t)$ as a fractionally integrated process $I(d_x)$ driven by a general weakly dependent process for $-1/2 < d_x < 1/2$ or $1/2 < d_x < 3/2$.⁵ Hence it is allowed to be a stationary short memory process, a stationary long memory process, a nonstationary long memory process or an integrated process. The covariate $(x_t)$ is generated by $(\xi_t)$, which is an $I(d)$ process with $-1/2 < d < 1/2$, so that $d_x = d$ in the stationary case and $d_x = d + 1$ in the nonstationary case. Assumption 2(b) implies that the innovation $(v_t)$ has a very general form of weak dependence allowing various forms of nonlinear dynamics (see Davidson (2002)). Davidson and De Jong (2000) derive a functional central limit theorem for the partial sums of fractionally integrated processes and several weak convergence results for stochastic integrals having fractional integrands and weakly dependent integrators. While Sowell (1990) previously derived related results under the assumption that $(v_t)$ is iid, Davidson and De Jong (2000) allowed a very general form of weak dependence of $(v_t)$ by assuming that it is near-epoch-dependent on a mixing process.

For the subsequent development of our theory, it will be convenient to introduce some additional notions and results. Let us consider $(1 - L)^d \xi_t = v_t$ for $-1/2 < d < 1/2$, where

$$(1 - L)^d = \sum_{j=0}^{\infty} \frac{\Gamma(-d + j)}{\Gamma(-d)\,\Gamma(j + 1)} L^j.$$

It can also be written in the MA representation

$$\xi_t = (1 - L)^{-d} v_t = \sum_{j=0}^{\infty} \frac{\Gamma(d + j)}{\Gamma(d)\,\Gamma(j + 1)} v_{t-j}.$$
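The MA weights $\psi_j = \Gamma(d+j)/(\Gamma(d)\Gamma(j+1))$ satisfy $\psi_0 = 1$ and $\psi_j = \psi_{j-1}(j-1+d)/j$, which avoids evaluating Gamma functions directly (and handles $d = 0$, where $\psi_j = 0$ for $j \ge 1$). The sketch below uses hypothetical helper names, and the truncation of the infinite sum at the start of the sample is an assumption made for illustration only.

```python
def frac_ma_weights(d, n_terms):
    """Return psi_0, ..., psi_{n_terms-1} of (1 - L)^{-d} via the ratio recursion."""
    psi = [1.0]
    for j in range(1, n_terms):
        # Gamma(d + j) / Gamma(d + j - 1) = d + j - 1 and Gamma(j + 1) / Gamma(j) = j.
        psi.append(psi[-1] * (j - 1 + d) / j)
    return psi


def frac_integrate(v, d):
    """Approximate xi_t = sum_j psi_j * v_{t-j}, truncating at the first observation.

    The truncation (v_s = 0 for s before the sample) is an assumption of this sketch.
    """
    psi = frac_ma_weights(d, len(v))
    return [sum(psi[j] * v[t - j] for j in range(t + 1)) for t in range(len(v))]
```

For $0 < d < 1/2$ the weights decay slowly, like $j^{d-1}$, which is the source of the hyperbolically decaying autocovariance of $(\xi_t)$.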

⁴ Dittmann and Granger (2002) investigated the properties of nonlinear transformations of fractionally integrated processes, although they did not consider asymptotic limits of partial sums of those series. They prove that the Hermite polynomials of a stationary Gaussian $I(d)$ process show less dependence than the initial process. For example, the square of a Gaussian $I(d)$ process with $d \in (0, 0.25]$ is $I(0)$.

⁵ We exclude the case with $d_x = 1/2$. This is because there is no available asymptotic limit of $\sum_{t=1}^{n} x_t^2$ for $d_x = 1/2$, which is needed for our results in the next section. When $d_x = 1/2$, the asymptotic limit of $\sum_{t=1}^{n} x_t$ is derived by Liu (1998) and Tanaka (1999). However, the asymptotic limit of $\sum_{t=1}^{n} x_t^2$ is not obtained and only its convergence rate is available in Liu (1998).


Let $\gamma_X(\cdot)$ be the autocovariance function of $X$. The autocovariance function of $(\xi_t)$ has the same sign as $d$ for $k \ge 1$, and it satisfies $\gamma_\xi(k) \sim C k^{2d-1}$. Hence, the autocovariance decays hyperbolically for $0 < d < 1/2$. For $d = 0$, $\gamma_\xi(k) = \gamma_v(k)$ and it decays geometrically if $v_t$ is a stationary ARMA process.

Defining $\sigma_{n\xi}^2 = E\left(\sum_{t=1}^{n} \xi_t\right)^2$, it is known that $\sigma_{n\xi}^2 = O_p\left(n^{1+2d}\right)$. Let $[z]$ denote the integer part of $z$. Under suitable conditions, it is known that

$$\sigma_{n\xi}^{-1} \sum_{t=1}^{[nr]} \xi_t \to_d W_d(r)$$

for $r \in [0, 1]$, where $W_d$ is a fractional Brownian motion, defined for $d \in (-1/2, 1/2)$ by

$$W_d(r) = \frac{1}{\Gamma(d+1)\, K_d^{1/2}} \left( \int_0^r (r - s)^d \, dV_0(s) + \int_{-\infty}^0 \left[ (r - s)^d - (-s)^d \right] dV_0(s) \right).$$

Here, $V_0$ is the standard Brownian motion and

$$K_d = \frac{1}{\Gamma(d+1)^2} \left( \frac{1}{2d+1} + \int_0^\infty \left( (1 + \tau)^d - \tau^d \right)^2 d\tau \right).$$

This scale constant $K_d$ is chosen to make $E(W_d(1)^2) = 1$. See Davidson and De Jong (2000) for additional details. Note that $W_d = V_0$ when $d = 0$. If $x_t = x_{t-1} + \xi_t$, it is also known that

$$\frac{1}{n \sigma_{n\xi}} \sum_{t=1}^{n} x_t \to_d \int_0^1 W_d(r) \, dr.$$

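If $(y_t)$ is generated as the GARCH(1,1) process with its conditional variance given by

$$\sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta \sigma_{t-1}^2,$$

then we may easily deduce by recursive substitution that

$$\sigma_t^2 = \omega \left( 1 + \sum_{i=1}^{t-1} \prod_{h=1}^{i} \left( \beta + \alpha \varepsilon_{t-h}^2 \right) \right) + \sigma_0^2 \prod_{h=1}^{t} \left( \beta + \alpha \varepsilon_{t-h}^2 \right)$$

for $t \ge 1$. As Nelson (1990) noted, we can close the system either by defining a probability measure $\mu_0$ for the starting value $\sigma_0^2$ or by assuming that the system extends infinitely far into the past. In the latter case, we have

$$\sigma_t^2 = \omega z_t,$$

where

$$z_t = 1 + \sum_{i=1}^{\infty} \prod_{h=1}^{i} \left( \beta + \alpha \varepsilon_{t-h}^2 \right) \tag{4}$$

for $t \ge 1$. Similarly, the conditional variance of our model can be written as

$$\sigma_t^2 = \sum_{i=0}^{\infty} \left( \omega + \pi x_{t-i} \right) \prod_{h=1}^{i} \left( \beta + \alpha \varepsilon_{t-h}^2 \right)$$

for $t \ge 1$, where $\prod_{h=1}^{0} \left( \beta + \alpha \varepsilon_{t-h}^2 \right) = 1$. We let $K^4 = E\varepsilon_t^4$. This convention will be used throughout the paper.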
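The recursive substitution rests on the identity $\sigma_t^2 = \omega + (\beta + \alpha\varepsilon_{t-1}^2)\sigma_{t-1}^2$, which follows from $y_{t-1}^2 = \sigma_{t-1}^2\varepsilon_{t-1}^2$. The following sketch, with hypothetical function names, evaluates both the one-step recursion and the finite-sample closed form so that the two can be compared numerically.

```python
def garch11_sigma2_recursive(eps, omega, alpha, beta, sigma2_0):
    """Return [sigma_1^2, ..., sigma_n^2], where eps = [eps_0, ..., eps_{n-1}]."""
    out = []
    s2 = sigma2_0
    for e in eps:
        # sigma_t^2 = omega + (beta + alpha * eps_{t-1}^2) * sigma_{t-1}^2
        s2 = omega + (beta + alpha * e * e) * s2
        out.append(s2)
    return out


def garch11_sigma2_closed_form(eps, omega, alpha, beta, sigma2_0, t):
    """Evaluate the recursively substituted expression for sigma_t^2, t >= 1."""
    total = 1.0
    prod = 1.0
    for i in range(1, t):
        # After this step, prod = prod_{h=1}^{i} (beta + alpha * eps_{t-h}^2).
        prod *= beta + alpha * eps[t - i] ** 2
        total += prod
    # Multiply by the h = t factor to obtain the product attached to sigma_0^2.
    prod *= beta + alpha * eps[0] ** 2
    return omega * total + sigma2_0 * prod
```

The two functions should agree up to floating-point rounding for every $t$, which is a quick check on the indexing of the products.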


3 Asymptotic Properties of Sample Statistics

3.1 Sample Autocorrelation and Long Memory Property

The sample autocorrelation function of squared financial return series (in particular high frequency data) is known to have a typical trend that decreases fast at first and remains significantly positive for larger lags. Ding et al. (1993) found that it is possible to characterize the power transformation of stock returns as long memory. However, it is known that the GARCH(1,1) model cannot generate this long memory property in volatility. The asymptotic limit of the sample autocorrelation of the squared process generated by the GARCH(1,1) model is

$$R_k^2 = (\alpha + \beta)^{k-1} \frac{\alpha (1 - \alpha\beta - \beta^2)}{1 - 2\alpha\beta - \beta^2} \tag{5}$$

if $E\left( \beta + \alpha \varepsilon_t^2 \right)^2 < 1$. This shows that the autocorrelation function of the squared GARCH(1,1) process decreases exponentially and quickly converges to zero.

In the literature on ARCH type models, there has been active research to provide an explanation of the long memory property in volatility.⁶ See Baillie et al. (1996) and Ding and Granger (1996) (fractionality of the order of integration), Engle and Lee (1999) (two components), Diebold and Inoue (2001) (switching regime), Mikosch and Starica (2004) (structural change), Granger and Hyung (2004) (occasional break) and Han and Park (2008) (persistent covariate) for the related literature.
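To see how quickly the limit in (5) dies out, it can be evaluated directly. This is a small sketch with a hypothetical function name; the parameter values in the usage note are illustrative and not taken from the paper.

```python
def garch11_sq_acf(alpha, beta, k, k4=3.0):
    """Limit (5) of the lag-k autocorrelation of the squared GARCH(1,1) process.

    k4 is E(eps_t^4); the default 3.0 corresponds to Gaussian innovations.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Fourth-moment condition E(beta + alpha * eps_t^2)^2 < 1.
    if beta ** 2 + 2 * alpha * beta + alpha ** 2 * k4 >= 1:
        raise ValueError("E(beta + alpha * eps^2)^2 < 1 is violated")
    rho1 = alpha * (1 - alpha * beta - beta ** 2) / (1 - 2 * alpha * beta - beta ** 2)
    return (alpha + beta) ** (k - 1) * rho1
```

For instance, with $\alpha = 0.1$ and $\beta = 0.8$ the first-lag value is $0.14$, and each further lag multiplies it by $\alpha + \beta = 0.9$, so the limit is geometric in $k$ and cannot reproduce a slowly decaying sample autocorrelation.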

In order to see how the GARCH-X model explains the long memory property in volatility, we investigate the asymptotic behavior of the sample autocorrelation of the squared process generated by the GARCH-X model. We define the sample autocorrelation of $(y_t^2)$ by

$$R_{nk}^2 = \frac{\sum_{t=k+1}^{n} \left( y_t^2 - \bar{y}_n^2 \right) \left( y_{t-k}^2 - \bar{y}_n^2 \right)}{\sum_{t=1}^{n} \left( y_t^2 - \bar{y}_n^2 \right)^2},$$

where $\bar{y}_n^2$ denotes the sample mean of $(y_t^2)$. To precisely characterize the asymptotic behavior of $R_{nk}^2$ under the GARCH-X model, we make the following additional assumptions.
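The definition of $R_{nk}^2$ translates directly into code. The sketch below uses a hypothetical function name and zero-based indexing, so the sum over $t = k+1, \dots, n$ becomes a loop over indices $k, \dots, n-1$.

```python
def sample_acf_squared(y, k):
    """Sample autocorrelation of (y_t^2) at lag k, following the definition of R^2_{nk}."""
    n = len(y)
    if not 0 <= k < n:
        raise ValueError("lag k must satisfy 0 <= k < len(y)")
    sq = [v * v for v in y]
    mean_sq = sum(sq) / n  # sample mean of the squared series
    den = sum((s - mean_sq) ** 2 for s in sq)
    if den == 0:
        raise ValueError("squared series is constant; autocorrelation undefined")
    num = sum((sq[t] - mean_sq) * (sq[t - k] - mean_sq) for t in range(k, n))
    return num / den
```

Note that the numerator has $n - k$ terms while the denominator has $n$, exactly as in the definition, so the statistic shrinks slightly toward zero at long lags in small samples.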

Assumption 3A Assume
(a) $E|\varepsilon_t|^q < \infty$ for some $q > 8$,
(b) $E\left( \beta + \alpha \varepsilon_t^2 \right)^2 < 1$, and
(c) $(\varepsilon_t)$ is independent of $(v_{t-s})$ for $s \ge 0$.

Assumption 3B Assume parts (a) and (b) of Assumption 3A.

Assumption 3A is for the case of $-1/2 < d_x < 1/2$, and Assumption 3B is for the case of $1/2 < d_x < 3/2$. Part (a) introduces the moment condition for the innovation $(\varepsilon_t)$. Part (b) is a conventional assumption in the GARCH(1,1) model. The condition $E\left( \beta + \alpha \varepsilon_t^2 \right)^2 < 1$ is necessary in the investigation of the statistical properties of the GARCH(1,1) process. If it holds, the sample autocorrelation of the squared process and the sample kurtosis have probability limits in the GARCH(1,1) model. We assume part (c) of Assumption 3A only for the case of $-1/2 < d_x < 1/2$. Note that part (c) is not restrictive. According to Assumption 1, $(\varepsilon_t)$ is adapted to $(\mathcal{F}_t)$ but $(v_t)$ is adapted to $(\mathcal{F}_{t-1})$. Part (c) can in general be allowed because $(\varepsilon_t)$ is iid.

⁶ This is also an important issue in the literature on stochastic volatility models. See Hurvich and Soulier (2009) for stochastic volatility models with the long memory property.

When $1/2 < d_x < 3/2$, we do not impose any condition on the relationship between $(\varepsilon_t)$ and $(v_t)$. Hence, leverage effects are also allowed in the GARCH-X model if $(\varepsilon_t)$ and $(v_t)$ are negatively correlated. Even when $-1/2 < d_x < 1/2$, the model can accommodate leverage effects at least as well as the GJR-GARCH model does if $(\varepsilon_{t-1})$ and $(v_t)$ are negatively correlated. Otherwise, we can simply extend the model following the GJR-GARCH model to accommodate leverage effects.

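Theorem 1 Let Assumptions 1 and 2 hold, and let $k \ge 1$.
(a) Given Assumption 3A, for $-1/2 < d_x < 1/2$,

$$R_{nk}^2 \to_p R_k^2 = \left( A_1 + \sum_{i=0}^{k-1} (\alpha + \beta)^i \sum_{j=0}^{\infty} (\alpha + \beta)^j \left( \omega^2 + \pi^2 \gamma_x(k + j - i) \right) \right) \Big/ A_2,$$

where $\gamma_x(k) = E(x_t x_{t-k})$ and

$$A_1 = (\alpha + \beta)^{k-1} \left( \beta + \alpha K^4 \right) \left( \frac{\omega^2 + \pi^2 E(x_t^2)}{1 - (\alpha^2 K^4 + 2\alpha\beta + \beta^2)} + 2 \sum_{j=0}^{\infty} \left( \alpha^2 K^4 + 2\alpha\beta + \beta^2 \right)^j \sum_{i=1+j}^{\infty} (\alpha + \beta)^{i-j} \left( \omega^2 + \pi^2 \gamma_x(i - j) \right) \right) - \left( \frac{\omega}{1 - (\alpha + \beta)} \right)^2,$$

$$A_2 = 2 \sum_{j=0}^{\infty} K^4 \left( \alpha^2 K^4 + 2\alpha\beta + \beta^2 \right)^j \sum_{i=1+j}^{\infty} (\alpha + \beta)^{i-j} \left( \omega^2 + \pi^2 \gamma_x(i - j) \right) + \frac{K^4 \left( \omega^2 + \pi^2 E(x_t^2) \right)}{1 - (\alpha^2 K^4 + 2\alpha\beta + \beta^2)} - \left( \frac{\omega}{1 - (\alpha + \beta)} \right)^2.$$

(b) Given Assumption 3B, for $1/2 < d_x < 3/2$,

$$R_{nk}^2 \to_d R_k^2 = \left( 1 - (\alpha + \beta)^{k-1} \frac{\alpha (1 - \alpha\beta - \beta^2)}{1 - 2\alpha\beta - \beta^2} \right) B + (\alpha + \beta)^{k-1} \frac{\alpha (1 - \alpha\beta - \beta^2)}{1 - 2\alpha\beta - \beta^2},$$

where

$$B = \frac{\int_0^1 (W_d(r))^2 \, dr - \left( \int_0^1 W_d(r) \, dr \right)^2}{\dfrac{(1 - (\alpha + \beta)^2) K^4}{1 - (\alpha^2 K^4 + 2\alpha\beta + \beta^2)} \int_0^1 (W_d(r))^2 \, dr - \left( \int_0^1 W_d(r) \, dr \right)^2}.$$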

Theorem 1 provides the asymptotic limit of the sample autocorrelation of the squared process generated by the GARCH-X model, which is denoted by $R_k^2$. The behavior of $R_k^2$ as $k \to \infty$ is of interest, since it shows whether the model explains the commonly observed long memory property in volatility.

Remark 1.1 Theorem 1 shows that the GARCH-X process explains the long memory property in volatility as long as the covariate $(x_t)$ is long memory, whether the covariate is stationary or nonstationary. For $-1/2 < d_x < 1/2$, the denominator of $R_k^2$, $A_2$, is bounded and one part of the numerator of $R_k^2$, $A_1$, decreases exponentially. However, the other part of the numerator of $R_k^2$ decreases hyperbolically when $0 < d_x < 1/2$ because $\gamma_x(k) \sim C k^{2d_x - 1}$. $R_k^2$ converges to zero eventually, but the decay rate depends on $d_x$. The hyperbolic decay rate appears only when the covariate is long memory, which implies that the GARCH-X process generates the long memory property in volatility when the covariate is stationary long memory. If the covariate is short memory, $-1/2 < d_x \le 0$, the GARCH-X process does not exhibit the long memory property in volatility.

For $1/2 < d_x < 3/2$, $R_k^2$ has a different pattern. As $k \to \infty$, $R_k^2$ decreases exponentially at first and converges to $B$ introduced in Theorem 1(b). Note that $B$ does not depend on $k$. $B$ is random because it is a function of a fractional Brownian motion, and it is positive and smaller than unity due to the Cauchy-Schwarz inequality. This result implies that the GARCH-X process generates the long memory property in volatility when the covariate is nonstationary long memory. Moreover, the trend of $R_k^2$ in this case is quite similar to the typically observed trend of the sample autocorrelation function of squared financial return series, considering that it decreases fast (exponentially) at first and converges to a positive random value smaller than unity.
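The shape of the limit in Theorem 1(b) can be explored numerically. The sketch below is illustrative only: it approximates the integrals in $B$ by Riemann sums over a discretized path, and it uses a scaled Gaussian random walk as the path, which corresponds to the special case $d = 0$ where $W_d$ is a standard Brownian motion. All function names are hypothetical.

```python
import random


def brownian_path(m, seed=0):
    """Discretized standard Brownian motion on [0, 1] at r = 1/m, ..., 1 (the d = 0 case)."""
    rng = random.Random(seed)
    w, path = 0.0, []
    for _ in range(m):
        w += rng.gauss(0.0, (1.0 / m) ** 0.5)
        path.append(w)
    return path


def limit_B(path, alpha, beta, k4=3.0):
    """Riemann-sum approximation of B in Theorem 1(b) for one path."""
    m = len(path)
    m1 = sum(path) / m                  # approximates the integral of W
    m2 = sum(w * w for w in path) / m   # approximates the integral of W^2
    c = (1 - (alpha + beta) ** 2) * k4 / (1 - (alpha ** 2 * k4 + 2 * alpha * beta + beta ** 2))
    return (m2 - m1 ** 2) / (c * m2 - m1 ** 2)


def limit_acf_nonstationary(alpha, beta, k, B):
    """R_k^2 in Theorem 1(b): a geometric term plus the lag-free level B."""
    g = (alpha + beta) ** (k - 1) * alpha * (1 - alpha * beta - beta ** 2) / (1 - 2 * alpha * beta - beta ** 2)
    return (1 - g) * B + g
```

The numerator of $B$ is nonnegative by the Cauchy-Schwarz inequality, and the denominator exceeds it because the GARCH(1,1) kurtosis factor is larger than one whenever $K^4 > 1$, so the computed value lies strictly between zero and one for any non-constant path. Evaluating `limit_acf_nonstationary` over increasing `k` shows the geometric decline toward `B`.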

Remark 1.2 We consider the GARCH-X process with the restriction of $\beta = 0$. With this restriction, the process reduces to the ARCH-X process. If $\beta = 0$, the behavior of $R_k^2$ is basically similar to that in Theorem 1. It decreases hyperbolically when $0 < d_x < 1/2$ because, for $-1/2 < d_x < 1/2$,

$$R_k^2 = \left( A_1' + \sum_{i=0}^{k-1} \alpha^i \sum_{j=0}^{\infty} \alpha^j \left( \omega^2 + \pi^2 \gamma_x(k + j - i) \right) \right) \Big/ A_2',$$

where

$$A_1' = \alpha^k K^4 \left( \frac{\omega^2 + \pi^2 E(x_t^2)}{1 - \alpha^2 K^4} + 2 \sum_{j=0}^{\infty} \left( \alpha^2 K^4 \right)^j \sum_{i=1+j}^{\infty} \alpha^{i-j} \left( \omega^2 + \pi^2 \gamma_x(k + j - i) \right) \right) - \alpha^k \left( \frac{\omega}{1 - \alpha} \right)^2,$$

$$A_2' = 2 \sum_{j=0}^{\infty} K^4 \left( \alpha^2 K^4 \right)^j \sum_{i=1+j}^{\infty} \alpha^{i-j} \left( \omega^2 + \pi^2 \gamma_x(i - j) \right) + \frac{K^4 \left( \omega^2 + \pi^2 E(x_t^2) \right)}{1 - \alpha^2 K^4} - \left( \frac{\omega}{1 - \alpha} \right)^2.$$

Note that $A_1'$ includes $\gamma_x(k + j - i)$. For $1/2 < d_x < 3/2$,

$$R_k^2 = \left( 1 - \alpha^k \right) B' + \alpha^k,$$

where

$$B' = \frac{\int_0^1 (W_d(r))^2 \, dr - \left( \int_0^1 W_d(r) \, dr \right)^2}{\dfrac{(1 - \alpha^2) K^4}{1 - \alpha^2 K^4} \int_0^1 (W_d(r))^2 \, dr - \left( \int_0^1 W_d(r) \, dr \right)^2},$$

and $R_k^2$ decreases exponentially at first and converges to a positive random value smaller than unity.

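Remark 1.3 Now we consider the restriction of $\alpha = 0$. The behavior of $R_k^2$ is overall similar to that in Theorem 1 when the covariate is stationary. It decreases hyperbolically when $0 < d_x < 1/2$ because, for $-1/2 < d_x < 1/2$,

$$R_k^2 = \left( A_1'' + \sum_{i=0}^{k-1} \beta^i \sum_{j=0}^{\infty} \beta^j \left( \omega^2 + \pi^2 \gamma_x(k + j - i) \right) \right) \Big/ A_2'',$$

where

$$A_1'' = \beta^k \left( \frac{\omega^2 + \pi^2 E(x_t^2)}{1 - \beta^2} + 2 \sum_{j=0}^{\infty} \left( \beta^2 \right)^j \sum_{i=1+j}^{\infty} \beta^{i-j} \left( \omega^2 + \pi^2 \gamma_x(i - j) \right) \right) - \left( \frac{\omega}{1 - \beta} \right)^2,$$

$$A_2'' = 2 \sum_{j=0}^{\infty} K^4 \left( \beta^2 \right)^j \sum_{i=1+j}^{\infty} \beta^{i-j} \left( \omega^2 + \pi^2 \gamma_x(i - j) \right) + \frac{K^4 \left( \omega^2 + \pi^2 E(x_t^2) \right)}{1 - \beta^2} - \left( \frac{\omega}{1 - \beta} \right)^2.$$

However, it shows a different trend when the covariate is nonstationary. For $1/2 < d_x < 3/2$,

$$R_k^2 = \frac{\int_0^1 (W_d(r))^2 \, dr - \left( \int_0^1 W_d(r) \, dr \right)^2}{K^4 \int_0^1 (W_d(r))^2 \, dr - \left( \int_0^1 W_d(r) \, dr \right)^2}.$$

$R_k^2$ does not depend on $k$ and it takes a positive random value between zero and unity for all $k$. This is similar to $R_k^2$ of the IGARCH(1,1) model, where $R_k^2 = 1$ for all $k$.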

If $(x_t)$ is stationary, there is no major difference in the behavior of $R_k^2$ between the case of $\beta = 0$ and the case of $\alpha = 0$. However, when $(x_t)$ is nonstationary, it is interesting to note the difference between the two cases in the behavior of the limit of the autocorrelation function of squared returns. When $\beta = 0$, the trend of $R_k^2$ is still similar to the one observed in real data; it decreases fast (exponentially) at first and converges to a positive random limit. However, if $\alpha = 0$, $R_k^2$ takes the same positive random value for all $k$. This comparison shows that the squared return $(y_{t-1}^2)$ plays an important role in explaining the behavior of the autocorrelation function. The long memory property in volatility is generated by the long memory covariate in the GARCH-X process, but it is hard to explain the trend of the autocorrelation function observed in real data if we restrict $\alpha$, the coefficient of $(y_{t-1}^2)$, to be zero.


3.2 Sample Variance and Kurtosis

We investigate the asymptotic behaviors of the sample variance and kurtosis in this subsection. The sample variance of $(y_t)$ is defined as usual by

$$S_n^2 = \frac{1}{n} \sum_{t=1}^{n} y_t^2.$$

To derive the asymptotic limit of the sample variance, we make the following additional assumptions, which are weaker than Assumptions 3A and 3B.

Assumption 3A′ Assume
(a) $E|\varepsilon_t|^q < \infty$ for some $q > 4$, and
(b) part (c) of Assumption 3A.

Assumption 3B′ Assume part (a) of Assumption 3A′.

Theorem 2 Let Assumptions 1 and 2 hold.
(a) Given Assumption 3A′, for $-1/2 < d_x < 1/2$,

$$S_n^2 \to_p \frac{\omega}{1 - (\alpha + \beta)}.$$

(b) Given Assumption 3B′, for $1/2 < d_x < 3/2$,

$$\sigma_{n\xi}^{-1} S_n^2 \to_d \frac{\pi}{1 - (\alpha + \beta)} \int_0^1 W_d(r) \, dr.$$

Theorem 2 reports the asymptotic limit of the sample variance of the GARCH-X process. For $-1/2 < d_x < 1/2$, the sample variance converges to a nonrandom limit, which does not depend on any characteristics of the covariate. Moreover, it is the same as the asymptotic limit of the sample variance of the GARCH(1,1) process. The stationary covariate does not affect the asymptotic behavior of the sample variance of the GARCH-X process. However, the result is different when the covariate is nonstationary. For $1/2 < d_x < 3/2$, the sample variance diverges as the sample size increases because $\sigma_{n\xi} = O_p\left( n^{1/2 + d} \right)$ for $-1/2 < d < 1/2$. When the covariate is nonstationary, the variance of the GARCH-X process is infinite in the limit, and therefore the process is more comparable to the IGARCH model. This result has an important implication for our study of the IGARCH in the next section (see Remark 4.4).
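Theorem 2(a) can be checked informally by simulation. The sketch below is an illustration under assumptions that are not taken from the paper: the covariate is iid uniform on $(-1, 1)$, a zero mean short memory case with $d_x = 0$, the innovations are Gaussian, and $\omega > |\pi|$ is required so that $\omega + \pi x_t > 0$. The function name is hypothetical.

```python
import random


def garch_x_sample_variance(n, omega, alpha, beta, pi, seed=0):
    """Sample variance S_n^2 of a simulated GARCH-X path with an iid U(-1, 1) covariate."""
    if omega <= abs(pi):
        raise ValueError("need omega > |pi| so that omega + pi * x_t > 0")
    rng = random.Random(seed)
    # Start-up values are an assumption of this sketch.
    sigma2_prev = omega / (1.0 - alpha - beta)
    y_sq_prev = sigma2_prev
    total = 0.0
    for _ in range(n):
        x_t = rng.uniform(-1.0, 1.0)          # zero mean stationary covariate
        sigma2 = omega + alpha * y_sq_prev + beta * sigma2_prev + pi * x_t
        eps = rng.gauss(0.0, 1.0)
        y_sq = sigma2 * eps * eps             # y_t^2 = sigma_t^2 * eps_t^2
        total += y_sq
        y_sq_prev, sigma2_prev = y_sq, sigma2
    return total / n
```

With, say, $\omega = 1$, $\alpha = 0.05$, $\beta = 0.5$ and $\pi = 0.5$, a long simulated path should give a value near $\omega / (1 - (\alpha + \beta)) \approx 2.22$ regardless of $\pi$, in line with the statement that a stationary covariate does not affect the limit.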

Now we investigate how a covariate in the GARCH-X model affects the kurtosis of time series. Many financial time series are known to be leptokurtic. The asymptotic limit of the sample kurtosis of the GARCH(1,1) process is

$$\frac{(1-(\alpha+\beta)^2)K_4}{1-(\alpha^2K_4+2\alpha\beta+\beta^2)} \tag{6}$$

if $E(\beta+\alpha\varepsilon_t^2)^2 < 1$, which shows that the GARCH(1,1) process is also leptokurtic. However, it is well known that the kurtosis implied by the GARCH(1,1) model with normally distributed innovations tends to be far less than the sample kurtosis observed in many financial return series. As a typical way to overcome this problem, some econometricians proposed the use of innovations $(\varepsilon_t)$ with a fat-tailed distribution while maintaining the GARCH(1,1) model. For example, Bollerslev (1987) advocates the use of innovations with the t-distribution, and Bai et al. (2003) consider innovations following a mixture of two normal distributions. We shall refer to these models as the GARCH(1,1) model with fat-tailed innovations.
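As a small illustration of the limit in (6), the following sketch evaluates the expression for given parameter values. The function name and the example values ($\alpha = 0.1$, $\beta = 0.8$, $K_4 = 3$ for Gaussian innovations) are our own choices for illustration and do not come from the paper.

```python
def garch11_kurtosis_limit(alpha, beta, k4):
    """Evaluate expression (6) for the GARCH(1,1) process.

    k4 is the kurtosis of the innovation (3 for a standard normal).
    The limit requires E(beta + alpha*eps^2)^2 < 1.
    """
    # E(beta + alpha*eps^2)^2 = alpha^2*K4 + 2*alpha*beta + beta^2
    second_moment = alpha ** 2 * k4 + 2.0 * alpha * beta + beta ** 2
    if second_moment >= 1.0:
        raise ValueError("E(beta + alpha*eps^2)^2 must be below 1")
    return (1.0 - (alpha + beta) ** 2) * k4 / (1.0 - second_moment)


if __name__ == "__main__":
    # Illustrative values only: Gaussian innovations, alpha=0.1, beta=0.8.
    print(garch11_kurtosis_limit(0.1, 0.8, 3.0))
```

With these values the limit exceeds 3, in line with the statement that the GARCH(1,1) process is leptokurtic even under Gaussian innovations.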

However, the GARCH(1,1) model with fat-tailed innovations has its limitations. Even if it successfully explains the leptokurtosis of financial time series, it cannot provide explanations for other stylized facts. Most of all, the GARCH(1,1) model with fat-tailed innovations cannot explain the long memory property in volatility, as was shown in the previous section. Regardless of the distribution of $(\varepsilon_t)$, the GARCH(1,1) model cannot explain the observed behavior of the autocorrelation function of squared return series.

We define the sample kurtosis of $(y_t)$ by

$$K_{4n} = \frac{1}{n}\sum_{t=1}^{n} y_t^4 \Bigg/ \left(\frac{1}{n}\sum_{t=1}^{n} y_t^2\right)^2 .$$

We report the asymptotic results for the sample kurtosis of $(y_t)$ in Theorem 3.
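The two sample statistics used in this subsection can be computed directly from their definitions. The sketch below is only illustrative; the function names are ours, and, as in the definitions above, no demeaning is applied.

```python
def sample_variance(ys):
    """S_n^2 = (1/n) * sum of y_t^2 (no demeaning, as in the text)."""
    return sum(y * y for y in ys) / len(ys)


def sample_kurtosis(ys):
    """K_4n = [(1/n) sum y_t^4] / [(1/n) sum y_t^2]^2."""
    n = len(ys)
    fourth = sum(y ** 4 for y in ys) / n
    second = sum(y * y for y in ys) / n
    return fourth / (second * second)


if __name__ == "__main__":
    data = [0.0, 2.0, 0.0, -2.0]  # toy series, for illustration only
    print(sample_variance(data), sample_kurtosis(data))
```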

Theorem 3 Let Assumptions 1 and 2 hold.
(a) Given Assumption 3A, for $-1/2 < d_x < 1/2$,

$$K_{4n} \to_p \frac{(1-(\alpha+\beta)^2)K_4}{1-(\alpha^2K_4+2\alpha\beta+\beta^2)} + \frac{C_1}{C_2}$$

where

$$C_1 = \frac{K_4\pi^2E(x_t^2)}{1-(\alpha^2K_4+2\alpha\beta+\beta^2)} + 2\sum_{j=0}^{\infty} K_4(\alpha^2K_4+2\alpha\beta+\beta^2)^j \sum_{i=1+j}^{\infty} (\alpha+\beta)^{i-j}\left(\omega^2+\pi^2\gamma_x(i-j)\right),$$

$$C_2 = \left(\frac{\omega}{1-(\alpha+\beta)}\right)^2 .$$

(b) Given Assumption 3B, for $1/2 < d_x < 3/2$,

$$K_{4n} \to_d \frac{(1-(\alpha+\beta)^2)K_4}{1-(\alpha^2K_4+2\alpha\beta+\beta^2)} \cdot \frac{\int_0^1 (W_d(r))^2\,dr}{\left(\int_0^1 W_d(r)\,dr\right)^2} .$$

Remark 3.1 For $-1/2 < d_x < 1/2$, the limit of the sample kurtosis has two terms. The first term is the same as the asymptotic limit of the sample kurtosis of the GARCH(1,1) process given in (6). The other term $C_1/C_2$ is nonrandom, and its value is positive for $0 \le d_x < 1/2$ because

$$\frac{K_4\left(\pi^2E(x_t^2) + \frac{2\omega^2(\alpha+\beta)}{1-(\alpha+\beta)}\right)}{1-(\alpha^2K_4+2\alpha\beta+\beta^2)} < C_1 < \frac{K_4\left(\pi^2E(x_t^2) + \frac{2(\omega^2+\pi^2)(\alpha+\beta)}{1-(\alpha+\beta)}\right)}{1-(\alpha^2K_4+2\alpha\beta+\beta^2)} .$$

For $-1/2 < d_x < 0$, $C_1$ is positive only if $E(x_t^2) > 2(\alpha+\beta)/(1-(\alpha+\beta))$. Hence the GARCH-X process has a larger kurtosis than the GARCH(1,1) process unless the covariate is antipersistent ($-1/2 < d_x < 0$).

For $1/2 < d_x < 3/2$, the limit of the sample kurtosis is a product of two parts. One part is the same as (6). The other part, $\int_0^1 (W_d(r))^2\,dr \big/ \left(\int_0^1 W_d(r)\,dr\right)^2$, is random because it contains a fractional Brownian motion, and its value is larger than unity due to the Cauchy-Schwarz inequality. This implies that the kurtosis of the GARCH-X process is randomly larger than that of the GARCH(1,1) process.
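The Cauchy-Schwarz argument can be checked numerically on a discretized path. The sketch below is only an illustration under a simplifying assumption: it uses a standard Brownian motion (the case $d = 0$) in place of the fractional Brownian motion $W_d$, and approximates the two integrals by Riemann sums. All function names are hypothetical.

```python
import random


def brownian_path(n_steps, seed):
    """Discretized standard Brownian motion on [0, 1] (assumption: d = 0)."""
    rng = random.Random(seed)
    level, path = 0.0, []
    for _ in range(n_steps):
        level += rng.gauss(0.0, 1.0) / n_steps ** 0.5
        path.append(level)
    return path


def kurtosis_inflation_ratio(path):
    """Riemann-sum version of int W^2 dr / (int W dr)^2."""
    n = len(path)
    integral_of_square = sum(w * w for w in path) / n
    integral = sum(path) / n
    return integral_of_square / (integral * integral)


if __name__ == "__main__":
    for seed in range(3):
        print(seed, kurtosis_inflation_ratio(brownian_path(2000, seed)))
```

Because the mean of squares is never below the square of the mean, each simulated ratio is at least one, and it varies from path to path, which mirrors the randomness of the limit.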

Since $d_x \ge 0$ in most data used in practice, Theorem 3 shows that the GARCH-X process will have a larger kurtosis than the GARCH(1,1) process. Note that there is no additional assumption on the distribution of the innovation $(\varepsilon_t)$. Without using innovations with a fat-tailed distribution, Theorem 3 provides an alternative explanation of the leptokurtosis observed in financial return series using a stochastic covariate.

Remark 3.2 We consider the GARCH-X process with the restriction $\beta = 0$. For $-1/2 < d_x < 1/2$,

$$K_{4n} \to_p \frac{(1-\alpha^2)K_4}{1-\alpha^2K_4} + \frac{C_1'}{C_2'}$$

where

$$C_1' = \frac{K_4\pi^2E(x_t^2)}{1-\alpha^2K_4} + 2\sum_{j=0}^{\infty} K_4(\alpha^2K_4)^j \sum_{i=1+j}^{\infty} \alpha^{i-j}\left(\omega^2+\pi^2\gamma_x(i-j)\right), \qquad C_2' = \left(\frac{\omega}{1-\alpha}\right)^2 .$$

For $1/2 < d_x < 3/2$,

$$K_{4n} \to_d \frac{(1-\alpha^2)K_4}{1-\alpha^2K_4} \cdot \frac{\int_0^1 (W_d(r))^2\,dr}{\left(\int_0^1 W_d(r)\,dr\right)^2} .$$

This result suggests that the ARCH-X process explains the leptokurtosis by having a larger kurtosis than $K_4$, the kurtosis of the innovation $(\varepsilon_t)$. However, it is hard to see whether it is larger than the limit of the sample kurtosis of the GARCH(1,1) process given in (6).


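Remark 3.3 Now we consider the restriction $\alpha = 0$. For $-1/2 < d_x < 1/2$,

$$K_{4n} \to_p K_4 + \frac{C_1''}{C_2''}$$

where

$$C_1'' = \frac{K_4\pi^2E(x_t^2)}{1-\beta^2} + 2\sum_{j=0}^{\infty} K_4(\beta^2)^j \sum_{i=1+j}^{\infty} \beta^{i-j}\left(\omega^2+\pi^2\gamma_x(i-j)\right), \qquad C_2'' = \left(\frac{\omega}{1-\beta}\right)^2 .$$

For $1/2 < d_x < 3/2$,

$$K_{4n} \to_d K_4 \cdot \frac{\int_0^1 (W_d(r))^2\,dr}{\left(\int_0^1 W_d(r)\,dr\right)^2} .$$

If we consider the case when the covariate is nonstationary, the limit of the sample kurtosis when $\alpha = 0$ is smaller than that of the case when $\beta = 0$ because

$$(1-\alpha^2) > (1-\alpha^2K_4).$$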

4 IGARCH Evidence

In many empirical applications of the GARCH model

$$\sigma_t^2 = \omega + \alpha y_{t-1}^2 + \beta\sigma_{t-1}^2 \tag{7}$$

to speculative asset returns, the ARCH effects, i.e., $\alpha+\beta$ in (7), are found to be close to unity. This led Engle and Bollerslev (1986) to introduce the IGARCH model with $\alpha+\beta = 1$.

However, there have been claims that the IGARCH could be spurious and could be due to the behavior of estimators under misspecification. One main reason why the IGARCH is thought to be spurious is that the IGARCH process cannot properly explain the long memory property in volatility. According to the IGARCH(1,1) model, the autocorrelation function of squared return series is unity for all lags, which is not realistic at all. This discordance motivated econometricians to find an explanation of the IGARCH.

Econometricians have proposed several misspecified cases that would generate the IGARCH. One example is a case with neglected structural changes or regime changes. See Diebold (1986), Lamoureux and Lastrapes (1990), Cai (1994) and Hamilton and Susmel (1994). Another example is a case with neglected fractionality of the order of integration. See Baillie et al. (1996). These works made use of either simulations or indirect approaches, but Mikosch and Starica (2004) and Hillebrand (2005) showed theoretically that the IGARCH could be generated, due to the behavior of estimators, by neglecting structural change.[7] Note that the models that are assumed to be the true data generating processes in these works can also generate the long memory property in volatility, as explained in the previous section. Moreover, Jensen and Lange (2010) recently showed that the IGARCH could appear when data is generated by certain types of stochastic volatility models. See also Craioveanu and Hillebrand (2010) for more references.

[7] Mikosch and Starica (2004) showed it in the framework of the Whittle estimation, but the theory of Hillebrand (2005) does not depend on the estimation method and covers the common framework of maximum likelihood estimation.

Therefore, it will be interesting to investigate the effect of misspecifying the GARCH-X model as the GARCH(1,1) model and see if it would yield any evidence for the IGARCH. We will investigate whether the IGARCH can be generated by omitting a relevant covariate in the GARCH-X model. For this, we let

$$m_t = z_t\varepsilon_t^2 , \tag{8}$$

where $(z_t)$ is given in (4). Moreover, we introduce some additional assumptions.

Assumption 4 Assume that
(a) $x_t = x_{t-1} + \xi_t$, where $(1-L)^d\xi_t = v_t$ for $-1/2 < d < 1/2$;
(b) $(v_t)$ is iid and $E|v_t|^p < \infty$ for some $p > 2$;
(c) $E|\varepsilon_t|^q < \infty$ and $E(\beta+\alpha\varepsilon_t^2)^{q/2} < 1$ for some $q > 4$; and
(d) $1/p + 2/q < 1/2 + d$.

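Assumption 5 We let

$$\zeta_{1t}(\beta) = \frac{m_t}{\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}}, \qquad \zeta_{2t}(\beta) = \frac{\sum_{i=1}^{\infty}\beta^{i-1}\sum_{j=i+1}^{\infty}\beta^{j-i-1}m_{t-j}}{\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}},$$

$$\zeta_{3t}(\beta) = \frac{\sum_{i=1}^{\infty}\beta^{i-1}\sum_{j=i+1}^{\infty}\beta^{j-i-1}m_tm_{t-j}}{\left(\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}\right)^2},$$

and assume that $(\zeta_{it}(\beta))$ are strictly stationary and ergodic with $E\zeta_{it}(\beta)$ finite and continuous for all $\beta \in B \subset (0,1)$, $i = 1, 2, 3$.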

Assumption 4 defines the covariate $(x_t)$ more precisely as a fractionally integrated process, and introduces moment conditions for the innovation sequences $(v_t)$ and $(\varepsilon_t)$. The covariate is now allowed to be $I(d_x)$ only for $1/2 < d_x < 3/2$, and the stationary case ($-1/2 < d_x < 1/2$) is excluded. See Remark 4.4 in this section for related comments. Moreover, the innovation $(v_t)$ is assumed to be iid.
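To make the construction in part (a) of Assumption 4 concrete, the following sketch simulates a covariate whose increments are fractionally integrated of order $d$. It is only an illustration under stated assumptions: the increments are built from a truncated moving average with Gaussian $(v_t)$, the truncation length `burn` is an arbitrary choice of ours, and the function names are hypothetical.

```python
import random


def frac_diff_weights(d, n_terms):
    """Moving-average weights of (1 - L)^(-d).

    Uses the recursion phi(0) = 1, phi(k) = phi(k-1) * (k - 1 + d) / k.
    """
    weights = [1.0]
    for k in range(1, n_terms):
        weights.append(weights[-1] * (k - 1 + d) / k)
    return weights


def simulate_covariate(n, d, seed=0, burn=500):
    """Simulate x_t = x_{t-1} + xi_t with (1 - L)^d xi_t = v_t.

    Assumptions for this sketch: v_t is iid N(0, 1) and the infinite
    moving average is truncated after `burn` terms.
    """
    rng = random.Random(seed)
    phi = frac_diff_weights(d, burn)
    v = [rng.gauss(0.0, 1.0) for _ in range(n + burn)]
    level, xs = 0.0, []
    for t in range(burn, n + burn):
        xi = sum(phi[k] * v[t - k] for k in range(burn))
        level += xi  # x_t = x_{t-1} + xi_t
        xs.append(level)
    return xs


if __name__ == "__main__":
    path = simulate_covariate(200, 0.3)
    print(path[:5])
```

With $d = 0$ the weights collapse to $(1, 0, 0, \ldots)$ and the simulated covariate is an ordinary random walk, which corresponds to the $I(1)$ case.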

The conditions in part (c) are the same as Assumption 2(b) in Han and Park (2009). Using Jensen's inequality, it follows from part (c) that $E\left(\ln(\beta+\alpha\varepsilon_t^2)\right) < 0$, which, as shown in Nelson (1990), is the necessary and sufficient condition for the GARCH(1,1) process to be strictly stationary and ergodic. Moreover, it also follows from part (c) that $E(\beta+\alpha\varepsilon_t^2)^2 < 1$ by Jensen's inequality, which is used for analyzing the asymptotic limits of the sample autocorrelation of the squared process and the sample kurtosis in the previous section. See Han and Park (2009, Assumption 2) for detailed explanations. The condition in part (d) becomes the same as Assumption 2(c) in Han and Park (2009) if $d = 0$. When $d = 0$ ($x_t \sim I(1)$), we have $p = \infty$ and $E\varepsilon_t^4 = 3$ under the Gaussianity of the innovation sequences. As Han and Park (2009) explained, in this case we may allow for any $\alpha$ and $\beta$ in the range $0 \le (\beta^2 + 2\alpha\beta + 3\alpha^2) < 1$ by taking $q$ sufficiently close to 4. However, if $d < 0$, the conditions in part (c) should hold for larger $q$ even if $p = \infty$, due to part (d).

The following theorem establishes the probability limits of the MLEs $\tilde\alpha_n$ and $\tilde\beta_n$ for the parameters $\alpha$ and $\beta$, respectively, in the GARCH(1,1) model (7) when the true data generating process is assumed to be the GARCH-X model. The constant term parameter $\omega$ is not identified in our model, precisely for the same reason as in the nonstationary GARCH model studied by Jensen and Rahbek (2004). Therefore, we will not consider its MLE in what follows.
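The two implications of part (c) of Assumption 4 mentioned above, $E\ln(\beta+\alpha\varepsilon_t^2) < 0$ and $E(\beta+\alpha\varepsilon_t^2)^2 < 1$, can be checked by simulation for given parameter values. The sketch below is a rough Monte Carlo illustration assuming standard normal innovations; the function name, the parameter values and the number of draws are our own choices.

```python
import math
import random


def check_moment_conditions(alpha, beta, n_draws=20000, seed=0):
    """Monte Carlo estimates of E ln(beta + alpha*eps^2) and
    E (beta + alpha*eps^2)^2, assuming eps ~ N(0, 1) and beta > 0."""
    rng = random.Random(seed)
    log_sum, square_sum = 0.0, 0.0
    for _ in range(n_draws):
        e = beta + alpha * rng.gauss(0.0, 1.0) ** 2
        log_sum += math.log(e)
        square_sum += e * e
    return log_sum / n_draws, square_sum / n_draws


if __name__ == "__main__":
    log_mean, square_mean = check_moment_conditions(0.1, 0.8)
    # Under Gaussianity the second quantity has the closed form
    # beta^2 + 2*alpha*beta + 3*alpha^2.
    print(log_mean, square_mean, 0.8 ** 2 + 2 * 0.1 * 0.8 + 3 * 0.1 ** 2)
```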

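Theorem 4 Let Assumptions 1, 4 and 5 hold. Then we have

$$\tilde\alpha_n \to_p \alpha_* \quad\text{and}\quad \tilde\beta_n \to_p \beta_*$$

with $\alpha_*$ and $\beta_*$ defined as the solution of the simultaneous equations

$$E\left[\frac{m_t}{\alpha_*\sum_{i=1}^{\infty}\beta_*^{i-1}m_{t-i}} - 1\right] = 0,$$

$$E\left[\left(\frac{m_t}{\alpha_*\sum_{i=1}^{\infty}\beta_*^{i-1}m_{t-i}} - 1\right)\frac{\sum_{i=1}^{\infty}\beta_*^{i-1}\sum_{j=i+1}^{\infty}\beta_*^{j-i-1}m_{t-j}}{\sum_{i=1}^{\infty}\beta_*^{i-1}m_{t-i}}\right] = 0,$$

which we assume to exist.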

Remark 4.1 Theorem 4 shows that the pseudo-true values $\alpha_*$ and $\beta_*$ of the parameters for the misspecified GARCH(1,1) model in (7) are determined solely by the distribution of $(m_t)$, which is completely specified by the true values $\alpha_0$ and $\beta_0$ of the parameters and the distribution of $(\varepsilon_t)$ in the GARCH-X model. In particular, their values do not depend upon any characteristics of the covariate $(x_t)$.

It is indeed quite simple to calculate $\alpha_*$ and $\beta_*$, at least approximately, once the true values $\alpha_0$ and $\beta_0$ of the parameters and the distribution of $(\varepsilon_t)$ are given. We just consider the data generated from the GARCH(1,1) model with $\alpha_0$ and $\beta_0$ (with an arbitrary constant term, which is unimportant), and fit the data by the GARCH(1,1) model without the constant term, i.e.,

$$\sigma_t^2 = \alpha y_{t-1}^2 + \beta\sigma_{t-1}^2 \tag{9}$$

using the maximum likelihood estimation method. Further, we denote the MLEs of the parameters $\alpha$ and $\beta$ in (9) by $\bar\alpha_n$ and $\bar\beta_n$, say. Then we may easily see from the proof of Theorem 4 that

$$\bar\alpha_n \to_p \alpha_* \quad\text{and}\quad \bar\beta_n \to_p \beta_* ,$$

where $\alpha_*$ and $\beta_*$ are defined as in Theorem 4.
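The procedure just described (generate data from a GARCH(1,1) model with a constant term, then fit the model without the constant term) can be sketched as follows. This is a rough illustration and not the simulation design behind the tables: we assume Gaussian innovations and a Gaussian likelihood, initialize the fitted variance at the sample variance, and replace a numerical optimizer with a coarse grid search. All function names and settings are hypothetical.

```python
import math
import random


def simulate_garch11(n, omega, alpha, beta, seed=0):
    """Simulate y_t from a GARCH(1,1) model with Gaussian innovations."""
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # start at the unconditional variance
    ys = []
    for _ in range(n):
        y = math.sqrt(sigma2) * rng.gauss(0.0, 1.0)
        ys.append(y)
        sigma2 = omega + alpha * y * y + beta * sigma2
    return ys


def neg_loglik_no_constant(ys, alpha, beta):
    """Gaussian negative log-likelihood (up to a constant) of model (9),
    sigma_t^2 = alpha*y_{t-1}^2 + beta*sigma_{t-1}^2."""
    sigma2 = sum(y * y for y in ys) / len(ys)  # assumed initial value
    total = 0.0
    for y in ys:
        sigma2 = max(sigma2, 1e-12)  # numerical guard only
        total += math.log(sigma2) + y * y / sigma2
        sigma2 = alpha * y * y + beta * sigma2
    return 0.5 * total


def fit_no_constant(ys, steps=20):
    """Coarse grid search over (alpha, beta) in (0, 1) x (0, 1)."""
    best = None
    for i in range(1, steps):
        for j in range(1, steps):
            a, b = i / steps, j / steps
            value = neg_loglik_no_constant(ys, a, b)
            if best is None or value < best[0]:
                best = (value, a, b)
    return best[1], best[2]


if __name__ == "__main__":
    data = simulate_garch11(1000, omega=0.1, alpha=0.2, beta=0.3, seed=1)
    a_hat, b_hat = fit_no_constant(data)
    print(a_hat, b_hat, a_hat + b_hat)
```

A finer grid or a proper optimizer, together with repeated replications, would be needed to approximate the expectations of the estimators with any accuracy.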

Needless to say, it is always possible to modify the estimators without affecting their consistency so that they are bounded. Therefore, we may assume without loss of generality that $\bar\alpha_n$ and $\bar\beta_n$ are a.s. bounded. Under this convention, we have

$$E\bar\alpha_n \to \alpha_* \quad\text{and}\quad E\bar\beta_n \to \beta_*$$

and

$$E\bar\alpha_n \approx \alpha_* \quad\text{and}\quad E\bar\beta_n \approx \beta_*$$

for large $n$. Table 1 reports the simulated values of $E\bar\alpha_n$ and $E\bar\beta_n$. We consider normal, uniform and $t_5$ distributions[8] for $(\varepsilon_t)$ when $\alpha_0$ and $\beta_0$ take different values: $0.1 \le \alpha_0 \le 0.3$, $\beta_0 \ge 0.1$ and $\alpha_0 + \beta_0 \le 0.6$.

The computed values of $\alpha_*$ and $\beta_*$ differ depending upon both the distribution of $(\varepsilon_t)$ and the values of $\alpha_0$ and $\beta_0$. However, we see that the values of $\alpha_* + \beta_*$ are very close to unity in all cases. Hence, we can expect to observe evidence of the IGARCH if we omit a relevant covariate in the GARCH-X model and fit the GARCH(1,1) model. This is so regardless of the values of $\alpha_0$ and $\beta_0$ and the distribution of $(\varepsilon_t)$. The ubiquitous evidence for the IGARCH may thus be spurious, and it could be generated as a result of omitting a relevant covariate that is nonstationary as well as persistent in memory.

Remark 4.2 If we consider the GARCH-X process with the restriction $\beta_0 = 0$ and conduct the same study, we can apply the same procedure and obtain similar results. If $\beta_0 = 0$, the GARCH(0,1)[9]-X model (ARCH-X) is assumed to be the true data generating process and we have the same results for $\tilde\alpha_n$ and $\tilde\beta_n$ as in Theorem 4, where $(m_t)$ in (8) is now defined by $z_t = 1 + \sum_{i=1}^{\infty}\prod_{h=1}^{i}\alpha_0\varepsilon_{t-h}^2$. Following the same procedure described above, we can calculate $\alpha_*$ and $\beta_*$. We consider the data generated from the GARCH(0,1) model with $\alpha_0$ ranging between 0.1 and 0.4 (with an arbitrary constant term), and fit the data by the GARCH(1,1) model without the constant term as in (9). Table 2 reports the results. Similarly, the computed values of $\alpha_*$ and $\beta_*$ differ depending upon both the distribution of $(\varepsilon_t)$ and the values of $\alpha_0$, but we see that the values of $\alpha_* + \beta_*$ are very close to unity in all cases.

Remark 4.3 Now we consider the restriction $\alpha_0 = 0$ and conduct the same study. If $\alpha_0 = 0$, the GARCH(1,0)-X model is assumed to be the true data generating process. We have the same results for $\tilde\alpha_n$ and $\tilde\beta_n$ as in Theorem 4, where $(m_t)$ in (8) is now defined by $z_t = 1 + \sum_{i=1}^{\infty}\prod_{h=1}^{i}\beta_0 = 1/(1-\beta_0)$. We can calculate $\alpha_*$ and $\beta_*$ following the same procedure described above. We consider the data generated from the GARCH(1,0) model with $\beta_0$ ranging between 0.1 and 0.6 (with an arbitrary constant term), and fit the data by the GARCH(1,1) model without the constant term as in (9). Table 3 reports the results. The values of $\alpha_* + \beta_*$ are very close to unity in all cases. However, there exists a distinct difference from the previous cases. In Remarks 4.1 and 4.2, the computed values of $\alpha_*$ and $\beta_*$ differ depending upon both the distribution of $(\varepsilon_t)$ and the values of the true parameters. But, regardless of the distribution of $(\varepsilon_t)$ and the values of $\beta_0$, the computed values of $\alpha_*$ and $\beta_*$ are zero and unity, respectively, and they are almost constant under the restriction $\alpha_0 = 0$. Considering that the estimate of $\alpha$ is positive and the estimate of $\beta$ is less than unity in most empirical applications of the GARCH(1,1) model, this implies that the restriction $\alpha_0 = 0$ is not as appropriate as the restriction $\beta_0 = 0$.

[8] The results for $t_6$ and $t_{15}$ distributions are also similar.

[9] We follow the definition of the GARCH($p$, $q$) given as $\sigma_t^2 = \omega + \sum_{i=1}^{q}\alpha_iy_{t-i}^2 + \sum_{j=1}^{p}\beta_j\sigma_{t-j}^2$.

Remark 4.4 Our study provides an alternative explanation of the IGARCH. It is important to note that the missing covariate is assumed to be nonstationary ($1/2 < d_x < 3/2$). This is related to the result on the asymptotic limit of the sample variance in the previous section. Theorem 2 shows that the asymptotic limit of the sample variance of the GARCH-X process is infinite when $1/2 < d_x < 3/2$. Considering that the sample variance of the IGARCH process is also infinite in the limit, we expected that the IGARCH would appear when we omit the nonstationary covariate in the GARCH-X model. As expected, our study shows that the IGARCH is generated when the nonstationary covariate is missing from the GARCH-X model. On the other hand, we expect that omitting a stationary covariate in the GARCH-X model would not generate the IGARCH. This is because, when the covariate is stationary ($-1/2 < d_x < 1/2$), the sample variance of the GARCH-X process is finite in the limit, unlike that of the IGARCH process.

5 Conclusion

This paper investigates the asymptotic properties of the GARCH-X model in which the covariate is generalized as a fractionally integrated process. Since we consider as the covariate an $I(d_x)$ process for $-1/2 < d_x < 1/2$ or $1/2 < d_x < 3/2$, the model can represent almost all the GARCH-X models in the literature. This paper provides asymptotic theories, which show how the model explains the stylized facts of financial time series such as the long memory property in volatility, leptokurtosis and IGARCH.

First, the model generates the long memory property in volatility if the covariate is a long memory process, regardless of whether it is stationary or nonstationary. The autocorrelation of the squared process of the model has a trend that is similar to the trend observed in real data; it decreases fast at first and stays positive for larger lags. Second, the asymptotic limit of the sample kurtosis of the GARCH-X process is larger than that of the GARCH(1,1) process unless the covariate is antipersistent. Note that the GARCH-X model provides an explanation of the sample kurtosis observed in real data without using innovations with fat-tailed distributions. Third, our theory of misspecification shows that the ubiquitous evidence for the IGARCH may be spurious, and it could be generated as a result of omitting a relevant covariate that is nonstationary as well as persistent in memory.

Finally, we also consider the model with the restriction $\alpha = 0$ or $\beta = 0$ in (2). The asymptotic properties of the model with the restriction $\beta = 0$ are qualitatively similar to those of the model without restriction. However, this is not the case with the restriction $\alpha = 0$. In particular, if the covariate is nonstationary, the model with the restriction $\alpha = 0$ is less appropriate for explaining the long memory property in volatility and the IGARCH than the model without restriction or the model with the restriction $\beta = 0$.

It is well known that the GARCH(1,1) model is not fully appropriate for explaining the stylized facts in financial time series, even though it is extensively used in practice. This paper shows that the stylized facts in financial time series can all be explained in the framework of the GARCH-X model. There has been active research to provide proper explanations of the stylized facts in financial time series, but each paper mostly focuses on one particular stylized fact. It should be emphasized that this paper provides a unified framework to explain most stylized facts in financial time series.

Appendix A. Useful Lemmas and Their Proofs

The proofs of the theorems in the paper rely on the results from the following lemmas. We assume that $x_t = 0$ for all $t \le 0$ without loss of generality.

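Lemma 1 Let Assumptions 1 and 2 hold.
(a) For $-1/2 < d_x < 1/2$,

$$\frac{1}{n}\sum_{t=1}^{n}(\omega+\pi x_t) \to_p \omega, \qquad \frac{1}{n}\sum_{t=k+1}^{n}(\omega+\pi x_t)^2 \to_p \omega^2+\pi^2E(x_t^2).$$

(b) For $1/2 < d_x < 3/2$,

$$\frac{1}{n\sigma_{n\xi}}\sum_{t=1}^{n}(\omega+\pi x_t) \to_d \pi\int_0^1 W_d(r)\,dr, \qquad \frac{1}{n\sigma_{n\xi}^2}\sum_{t=1}^{n}(\omega+\pi x_t)^2 \to_d \pi^2\int_0^1 (W_d(r))^2\,dr.$$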

Proof of Lemma 1 (a) $(x_t)$ is stationary and ergodic for $-1/2 < d_x < 1/2$, and we have the results due to the ergodic theorem.
(b) Since $n/(n\sigma_{n\xi}) \to 0$, we have

$$\frac{1}{n\sigma_{n\xi}}\sum_{t=1}^{n}(\omega+\pi x_t) = \frac{1}{n}\sum_{t=1}^{n}\pi\left(\sigma_{n\xi}^{-1}x_t\right) + o_p(1) \to_d \pi\int_0^1 W_d(r)\,dr$$

by Davidson and De Jong (2000). Similarly, we have

$$\frac{1}{n\sigma_{n\xi}^2}\sum_{t=1}^{n}(\omega+\pi x_t)^2 = \frac{1}{n\sigma_{n\xi}^2}\sum_{t=1}^{n}\pi^2x_t^2 + o_p(1) = \frac{1}{n}\sum_{t=1}^{n}\pi^2\left(\sigma_{n\xi}^{-1}x_t\right)^2 + o_p(1) \to_d \pi^2\int_0^1 (W_d(r))^2\,dr$$

using the continuous mapping theorem as in the proof of Theorem 3 in Sowell (1990). $\square$

Remark A1 For $1/2 < d_x < 3/2$, Davidson and De Jong (2000) show that, under suitable conditions for a zero mean process $(u_t)$,

$$\left(\sigma_{n\xi}^{-1}x_t,\; n^{-1/2}u_t,\; \sigma_{n\xi}^{-1}n^{-1/2}\sum_{t=1}^{n}x_tu_t\right) \to_d \left(W_d,\; U,\; \int_0^1 W_d\,dU\right) \tag{10}$$

where $U$ is a Brownian motion. In the proof of Theorem 1 in Appendix B, we deal with various zero mean processes that are functions of $(\varepsilon_t)$ and its past values. When such a zero mean process $(u_t)$ satisfies the conditions for (10), we denote it by

$$u_t \in UL2.$$

For example, we can easily show that the following processes belong to $UL2$: $(\varepsilon_t^2-1)$, $(\varepsilon_t^4-K_4)$, $(\varepsilon_t^2\varepsilon_{t-k}^2-1)$, $\varepsilon_t^2(\beta+\alpha\varepsilon_{t-k}^2)-(\alpha+\beta)$, and $\varepsilon_t^4(\beta+\alpha\varepsilon_{t-k}^2)^2-K_4E(\beta+\alpha\varepsilon_t^2)^2$ for a given $k \ge 1$.

As an illustration, we consider one element of $UL2$ that is actually used in the proof of Theorem 1 in Appendix B, and show how it satisfies the conditions needed for (10). Defining

$$u_{1t} = \varepsilon_t^2\prod_{h=1}^{i}(\beta+\alpha\varepsilon_{t-h}^2) - (\alpha+\beta)^i$$

for a given $i \ge 0$, we now show that $u_{1t} \in UL2$ under Assumptions 1-3. Recall that $d = d_x - 1$ for $1/2 < d_x < 3/2$ from Assumption 2.

First, $(u_{1t})$ is $L_2$-NED ($L_2$-near-epoch-dependent) on $(\varepsilon_t)$ because $E_{t-m}^{t+m}u_{1t} = u_{1t}$ for $m \ge i$, where $E_{t-m}^{t+m}(\cdot)$ denotes the expectation conditional on $\sigma(\varepsilon_{t-m},\ldots,\varepsilon_{t+m})$. See Davidson (2002) for the definition of $L_2$-NED. Hence, we can easily see that $(u_{1t})$ satisfies the assumptions of Theorem 4.1 in Davidson and De Jong (2000), which covers the case when $-1/2 < d < 0$. Also, $(u_{1t})$ satisfies the assumptions of Theorem 4.1 in De Jong and Davidson (2000), which covers the case when $d = 0$.

Second, $(u_{1t})$ does not depend on future values of $(\varepsilon_t)$ and also satisfies the condition (3) for Theorem 4.2 in Davidson and De Jong (2000). This is because $E_{t-m}^{t+m}E_tu_{1t+j} = E_tu_{1t+j}$ for $m \ge \max(i,j)$ and $j \ge 0$, where $E_{-\infty}^{t}$ is written as $E_t$ for convenience of notation. Therefore, $(u_{1t})$ satisfies the assumptions of Theorem 4.2 in Davidson and De Jong (2000), which covers the case when $0 < d < 1/2$. These show that $(u_{1t})$ satisfies all conditions for (10) when $-1/2 < d < 1/2$ (or $1/2 < d_x < 3/2$).

Lemma 2 Let Assumptions 1 and 2 hold. Let $(u_t)$ have zero mean, and assume that $(u_t)$ is independent of $(x_t)$ for $-1/2 < d_x < 1/2$. For $1/2 < d_x < 3/2$, we assume that $u_t \in UL2$ as described in Remark A1.
(a) If $v_n^{-1}\sum_{t=1}^{n}(\omega+\pi x_t) = O_p(1)$, then $v_n^{-1}\sum_{t=1}^{n}(\omega+\pi x_t)u_t = o_p(1)$.
(b) If $v_n^{-1}\sum_{t=1}^{n}(\omega+\pi x_t)^2 = O_p(1)$, then $v_n^{-1}\sum_{t=1}^{n}(\omega+\pi x_t)^2u_t = o_p(1)$.

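Proof of Lemma 2 For $-1/2 < d_x < 1/2$, $(x_t)$ is stationary and ergodic, and we have

$$\frac{1}{n}\sum_{t=1}^{n}(\omega+\pi x_t)u_t = o_p(1) \quad\text{and}\quad \frac{1}{n}\sum_{t=k+1}^{n}(\omega+\pi x_t)^2u_t = o_p(1)$$

by the ergodic theorem. For $1/2 < d_x < 3/2$, using (10) and the continuous mapping theorem, we have

$$\frac{1}{\sqrt{n}\,\sigma_{n\xi}}\sum_{t=1}^{n}(\omega+\pi x_t)u_t \to_d \pi\int_0^1 W_d(r)\,dU(r),$$

$$\frac{1}{\sqrt{n}\,\sigma_{n\xi}^2}\sum_{t=1}^{n}(\omega+\pi x_t)^2u_t \to_d \pi^2\int_0^1 (W_d(r))^2\,dU(r),$$

where $U$ is a Brownian motion, which completes the proof. $\square$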

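Lemma 3 Let Assumptions 1 and 2 hold.
(a) For $-1/2 < d_x < 1/2$,

$$\frac{1}{n}\sum_{t=k+1}^{n}(\omega+\pi x_t)(\omega+\pi x_{t-k}) \to_p \omega^2+\pi^2E(x_tx_{t-k}).$$

(b) For $1/2 < d_x < 3/2$,

$$\frac{1}{n\sigma_{n\xi}^2}\sum_{t=k+1}^{n}(\omega+\pi x_t)(\omega+\pi x_{t-k}) \to_d \pi^2\int_0^1 (W_d(r))^2\,dr.$$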

Proof of Lemma 3 (a) For $-1/2 < d_x < 1/2$, $(x_t)$ is stationary and ergodic, and

$$n^{-1}\sum_{t=k+1}^{n}x_tx_{t-k} \to_p E(x_tx_{t-k})$$

by the ergodic theorem.
(b) For $1/2 < d_x < 3/2$, since $x_t = x_{t-k} + \sum_{i=0}^{k-1}\xi_{t-i}$, we have

$$\frac{1}{n\sigma_{n\xi}^2}\sum_{t=k+1}^{n}x_tx_{t-k} = \frac{1}{n\sigma_{n\xi}^2}\sum_{t=k+1}^{n}x_{t-k}^2 + \frac{1}{\sqrt{n}\,\sigma_{n\xi}}\sum_{i=0}^{k-1}\frac{1}{\sqrt{n}\,\sigma_{n\xi}}\sum_{t=k+1}^{n}\xi_{t-i}x_{t-k}$$

$$= \frac{1}{n\sigma_{n\xi}^2}\sum_{t=k+1}^{n}x_{t-k}^2 + o_p(1) \to_d \int_0^1 (W_d(r))^2\,dr.$$

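Remark A2 Given $(u_t)$ defined in Lemma 2, we can deduce from Lemmas 2 and 3 that

$$\frac{1}{v_n}\sum_{t=1}^{n}(\omega+\pi x_t)(\omega+\pi x_{t-k})u_t = o_p(1)$$

if $v_n^{-1}\sum_{t=1}^{n}(\omega+\pi x_t)(\omega+\pi x_{t-k}) = O_p(1)$. For $-1/2 < d_x < 1/2$,

$$\frac{1}{n}\sum_{t=k+1}^{n}(\omega+\pi x_t)(\omega+\pi x_{t-k})u_t \to_p 0$$

by the ergodic theorem because $(u_t)$ is independent of $(x_t)$. For $1/2 < d_x < 3/2$, we have

$$\frac{1}{n\sigma_{n\xi}^2}\sum_{t=k+1}^{n}(\omega+\pi x_t)(\omega+\pi x_{t-k})u_t = \frac{1}{n\sigma_{n\xi}^2}\sum_{t=k+1}^{n}\pi^2x_tx_{t-k}u_t + o_p(1)$$

$$= \frac{1}{n\sigma_{n\xi}^2}\sum_{t=k+1}^{n}\pi^2x_{t-k}^2u_t + \frac{1}{n\sigma_{n\xi}^2}\sum_{t=k+1}^{n}x_{t-k}\left(\sum_{j=0}^{k-1}\xi_{t-j}\right)u_t + o_p(1) = o_p(1).$$

For the last line, note that $\sum_{t=k+1}^{n}x_{t-k}^2u_t = o_p(n\sigma_{n\xi}^2)$ due to Lemma 2, which also implies that $\sum_{t=k+1}^{n}x_{t-k}\xi_tu_t = o_p(n\sigma_{n\xi}^2)$.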

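Lemma 4 Let Assumptions 1 and 2 hold. Suppose that $0 < \beta_c < 1$ and $v_n \to \infty$ monotonically as $n \to \infty$.
(a) If $v_n^{-1}\sum_{t=1}^{n}(\omega+\pi x_t) \to_d Q$, then $v_n^{-1}\sum_{i=0}^{n}\beta_c^i\sum_{t=1}^{n-k}(\omega+\pi x_t) \to_d Q/(1-\beta_c)$.
(b) If $v_n^{-1}\sum_{t=1}^{n}(\omega+\pi x_t)^2 \to_d Q$, then $v_n^{-1}\sum_{i=0}^{n}\beta_c^i\sum_{t=1}^{n-k}(\omega+\pi x_t)^2 \to_d Q/(1-\beta_c)$.

Proof of Lemma 4 See Han and Park (2008, Lemma 4). $\square$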

Lemma 5 Under Assumptions 1 and 4,

$$\sigma_{n\xi}^{-1}\max_{1\le t\le n}\left|\sigma_t^2(\theta_0) - (\omega_0+\pi_0x_t)z_t\right| = o_p(1)$$

for all large $n$.

Proof of Lemma 5 The proof follows the lines of Lemma A of Han and Park (2009, henceforth HP) with some modifications. We let $(\kappa_n)$ be a sequence of numbers such that

$$\kappa_n = n^r$$

with $0 < r < 1/4 + d/2 - 1/(2p) - 1/q$. Note in particular that

$$\kappa_n \to \infty \quad\text{and}\quad \kappa_n^2n^{-1/2-d+1/p+2/q} = n^{2r-1/2-d+1/p+2/q} \to 0. \tag{11}$$

Moreover, $K$ denotes a generic constant whose precise definition varies from one line to another.

By recursive substitution, it can be easily deduced that

$$\sigma_t^2(\theta_0) = \omega_0 + \pi_0x_t + \sum_{i=1}^{\infty}(\omega_0+\pi_0x_{t-i})\prod_{j=1}^{i}(\beta_0+\alpha_0\varepsilon_{t-j}^2).$$

Therefore, we may write

$$\sigma_t^2(\theta_0) = (\omega_0+\pi_0x_t)z_t + \tilde R_t$$

with

$$\tilde R_t = \tilde R_t(A) + \tilde R_t(B) + \tilde R_t(C),$$

where

$$\tilde R_t(A) = \sum_{i=1}^{\kappa_n}\pi_0(x_{t-i}-x_t)\prod_{j=1}^{i}(\beta_0+\alpha_0\varepsilon_{t-j}^2), \tag{12}$$

$$\tilde R_t(B) = \sum_{i=\kappa_n+1}^{\infty}\pi_0x_{t-i}\prod_{j=1}^{i}(\beta_0+\alpha_0\varepsilon_{t-j}^2), \tag{13}$$

$$\tilde R_t(C) = -\sum_{i=\kappa_n+1}^{\infty}\pi_0x_t\prod_{j=1}^{i}(\beta_0+\alpha_0\varepsilon_{t-j}^2). \tag{14}$$

We use

$$\max_{1\le t\le n}\sum_{i=1}^{\kappa_n}\prod_{j=1}^{i}(\beta_0+\alpha_0\varepsilon_{t-j}^2) = O_p(\kappa_nn^{2/q}) \tag{15}$$

and

$$\max_{1\le t\le n}\sum_{i=\kappa_n+1}^{\infty}\prod_{j=1}^{i}(\beta_0+\alpha_0\varepsilon_{t-j}^2) = o_p(1), \tag{16}$$

which are shown in the proof of Lemma A in HP.

First, we consider $\tilde R_t(A)$ introduced in (12). If we show that, for all $i \le \kappa_n$,

$$\sigma_{n\xi}^{-1}\max_{1\le t\le n}|x_t-x_{t-i}| = O_p(\kappa_nn^{-1/2-d+1/p}), \tag{17}$$

it follows from (15) that

$$\sigma_{n\xi}^{-1}\max_{1\le t\le n}|\tilde R_t(A)| \le |\pi_0|\,\sigma_{n\xi}^{-1}\max_{1\le t\le n}\max_{1\le i\le\kappa_n}|x_t-x_{t-i}|\left(\max_{1\le t\le n}\sum_{i=1}^{\kappa_n}\prod_{j=1}^{i}(\beta_0+\alpha_0\varepsilon_{t-j}^2)\right) = O_p(\kappa_n^2n^{-1/2-d+1/p+2/q}) = o_p(1), \tag{18}$$

due in particular to (11), and we may readily conclude that $\tilde R_t(A)$ is negligible. Since

$$\max_{1\le t\le n}\sigma_{n\xi}^{-1}|x_t-x_{t-i}| \le \sigma_{n\xi}^{-1}\max_{1\le t\le n}|\xi_t+\cdots+\xi_{t-i+1}| \le \kappa_n\sigma_{n\xi}^{-1}\max_{1\le t\le n}|\xi_t|$$

for all $i \le \kappa_n$, we just need to show

$$\max_{1\le t\le n}|\xi_t| = O_p(n^{1/p}) \tag{19}$$

to establish (17). Recall that $\xi_t = \sum_{k=0}^{\infty}\varphi(k)v_{t-k}$ where

$$\varphi(k) = \frac{\Gamma(k+d)}{\Gamma(d)\,\Gamma(k+1)}$$

for $|d| < 1/2$. Since $\varphi(k) \sim (1/\Gamma(d))\,k^{d-1}$ for $|d| < 1/2$, we have $\sum_{k=0}^{\infty}|\varphi(k)|^p < \infty$ for $p > 2$ and, therefore,

$$E|\xi_t|^p = E|v_t|^p\sum_{k=0}^{\infty}|\varphi(k)|^p < \infty.$$

Now (19) follows because, for any constant $K > 0$, we have

$$P\left\{\max_{1\le t\le n}n^{-1/p}|\xi_t| > K\right\} \le \sum_{t=1}^{n}P\left\{n^{-1/p}|\xi_t| > K\right\} = nP\left\{n^{-1/p}|\xi_t| > K\right\} \le K^{-p}E|\xi_t|^p.$$

Second, we consider $\tilde R_t(B)$ and $\tilde R_t(C)$ defined in (13) and (14), respectively. Since $\sigma_{n\xi}^{-1}\sum_{t=1}^{n}\xi_t \to_d W_d(1)$,

$$\sigma_{n\xi}^{-1}\max_{1\le t\le n}|x_t| = O_p(1) \tag{20}$$

for all large $n$. Therefore, we have

$$\sigma_{n\xi}^{-1}\max_{1\le t\le n}|\tilde R_t(B)| = \sigma_{n\xi}^{-1}\max_{1\le t\le n}|\tilde R_t(C)| = o_p(1) \tag{21}$$

due to (16) and (20). The stated result follows immediately from (18) and (21). $\square$
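The moving-average coefficients used in the proof above, and the power-law approximation $(1/\Gamma(d))\,k^{d-1}$ that delivers their $p$-summability, can be compared numerically. The sketch below is only a numerical illustration with function names of our own; it evaluates the closed form through log-gamma functions to avoid overflow for large $k$, and it assumes $k \ge 1$ and $0 < |d| < 1/2$.

```python
import math


def phi_closed_form(k, d):
    """Gamma(k + d) / (Gamma(d) * Gamma(k + 1)) for k >= 1, 0 < |d| < 1/2."""
    log_ratio = math.lgamma(k + d) - math.lgamma(k + 1)
    return math.exp(log_ratio) / math.gamma(d)


def phi_asymptote(k, d):
    """Power-law approximation k^(d - 1) / Gamma(d)."""
    return k ** (d - 1.0) / math.gamma(d)


if __name__ == "__main__":
    for d in (0.3, -0.3):
        for k in (1, 10, 100, 1000):
            print(d, k, phi_closed_form(k, d), phi_asymptote(k, d))
```

The ratio of the two quantities approaches one as $k$ grows, and since $(d-1)p < -1$ whenever $p > 2$ and $d < 1/2$, the series $\sum_k |\varphi(k)|^p$ is dominated by a convergent power series.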

Lemma 6 Under Assumptions 1 and 4,

$$\beta^{i-1}\sigma_{n\xi}^{-1}\max_{1\le t\le n}|x_tz_{t-i}-x_{t-i}z_{t-i}| = o_p(1)$$

for all $i \ge 1$ and large $n$.

Proof of Lemma 6 If we let $(\kappa_n)$ be the sequence of numbers given as in (11) in the proof of Lemma 5, it follows from the proof of Lemma B in HP that

$$\max_{1\le t\le n}|z_t| = O_p(\kappa_nn^{2/q}) + o_p(1). \tag{22}$$

For all $i \le \kappa_n$, we have

$$\sigma_{n\xi}^{-1}\max_{1\le t\le n}|x_tz_{t-i}-x_{t-i}z_{t-i}| \le \left(\max_{1\le t\le n}|z_t|\right)\left(\sigma_{n\xi}^{-1}\max_{1\le t\le n}|x_t-x_{t-i}|\right) = O_p(\kappa_n^2n^{-1/2-d+1/p+2/q}) + o_p(\kappa_nn^{-1/2-d+1/p}) = o_p(1)$$

due to (17) and (22). For all $i > \kappa_n + 1$, it follows that

$$\beta^{i-1}\sigma_{n\xi}^{-1}\max_{1\le t\le n}|x_tz_{t-i}-x_{t-i}z_{t-i}| \le \beta^{\kappa_n}\left(\max_{1\le t\le n}|z_t|\right)\left(\sigma_{n\xi}^{-1}\max_{1\le t\le n}|x_t|\right) = \beta^{\kappa_n}O_p(\kappa_nn^{2/q}) + \beta^{\kappa_n}o_p(1) = o_p(1)$$

due to (11), (20), (22) and $0 < \beta < 1$. The proof is therefore complete. $\square$

Appendix B. Proofs of the Main Results

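To save space, we introduce the following notation. We let $w_t = \omega + \pi x_t$ and $e_t = \beta + \alpha\varepsilon_t^2$. Then

$$\sigma_t^2 = \sum_{i=0}^{\infty}w_{t-i}\prod_{h=1}^{i}e_{t-h}$$

for $t \ge 1$, where $\prod_{h=1}^{0}e_{t-h} = 1$. Let $\varpi = Ee_t^2$. Clearly $\varpi = \alpha^2K_4 + 2\alpha\beta + \beta^2$. We let $v_n$ be a generic sequence such that $v_n \to \infty$ monotonically as $n \to \infty$.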
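The expansion of $\sigma_t^2$ above comes from repeatedly substituting the one-step recursion $\sigma_t^2 = w_t + e_{t-1}\sigma_{t-1}^2$ into itself. The sketch below checks this numerically on a finite sample. It is a toy illustration under simplifying assumptions of our own: indices start at 0, the history before the first observation is dropped (so the sum is finite), and the values of $w_t$ and $e_t$ are arbitrary positive random numbers rather than draws from the model.

```python
import random


def sigma2_by_recursion(w, e):
    """sigma2[t] = w[t] + e[t-1] * sigma2[t-1], started at sigma2[0] = w[0]."""
    out = [w[0]]
    for t in range(1, len(w)):
        out.append(w[t] + e[t - 1] * out[-1])
    return out


def sigma2_by_expansion(w, e, t):
    """Sum over i of w[t-i] * e[t-1] * ... * e[t-i], for i = 0, ..., t."""
    total, product = 0.0, 1.0
    for i in range(t + 1):
        total += w[t - i] * product  # product holds e[t-1] * ... * e[t-i]
        if i < t:
            product *= e[t - i - 1]
    return total


if __name__ == "__main__":
    rng = random.Random(0)
    w = [0.1 + rng.random() for _ in range(30)]        # stands in for omega + pi*x_t
    e = [0.5 + 0.3 * rng.random() for _ in range(30)]  # stands in for beta + alpha*eps_t^2
    recursive = sigma2_by_recursion(w, e)
    print(recursive[29], sigma2_by_expansion(w, e, 29))
```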

Proof of Theorem 1 First, we need to obtain asymptotic limits for the following three sample moments:

$$\sum_{t=1}^{n}y_t^2, \qquad \sum_{t=1}^{n}y_t^4, \qquad \sum_{t=1}^{n}y_t^2y_{t-k}^2.$$

The first sample moment is

$$\sum_{t=1}^{n}y_t^2 = \sum_{t=1}^{n}\left(\sum_{i=0}^{\infty}w_{t-i}\prod_{h=1}^{i}e_{t-h}\right)\varepsilon_t^2 = \sum_{i=0}^{\infty}(\alpha+\beta)^i\sum_{t=1}^{n}w_{t-i} + \sum_{i=0}^{\infty}\sum_{t=1}^{n}w_{t-i}\left(\varepsilon_t^2\prod_{h=1}^{i}e_{t-h} - (\alpha+\beta)^i\right).$$

If $v_n^{-1}\sum_{t=1}^{n}w_{t-i} = O_p(1)$,

$$\frac{1}{v_n}\sum_{t=1}^{n}w_{t-i}\left(\varepsilon_t^2\prod_{h=1}^{i}e_{t-h} - (\alpha+\beta)^i\right) = o_p(1)$$

by Lemma 2. This is because $\left(\varepsilon_t^2\prod_{h=1}^{i}e_{t-h} - (\alpha+\beta)^i\right)$ is independent of $w_{t-i}$ and is an element of $UL2$[10] for a given $i \ge 0$. Therefore, we have

$$\frac{1}{v_n}\sum_{t=1}^{n}y_t^2 = \frac{1}{v_n}\sum_{i=0}^{\infty}(\alpha+\beta)^i\sum_{t=1}^{n}w_{t-i} + o_p(1),$$

which leads to, due to Lemmas 1 and 4,

$$\frac{1}{n}\sum_{t=1}^{n}y_t^2 \to_p \frac{\omega}{1-(\alpha+\beta)}$$

for $-1/2 < d_x < 1/2$, and

$$\frac{1}{n\sigma_{n\xi}}\sum_{t=1}^{n}y_t^2 \to_d \frac{\pi}{1-(\alpha+\beta)}\int_0^1 W_d(r)\,dr$$

for $1/2 < d_x < 3/2$.

The second sample moment is

$$\sum_{t=1}^{n}y_t^4 = \sum_{t=1}^{n}\left(\sum_{i=0}^{\infty}w_{t-i}^2\prod_{h=1}^{i}e_{t-h}^2\right)\varepsilon_t^4 + 2\sum_{t=1}^{n}\left(\sum_{j=0}^{\infty}\prod_{l=1}^{j}e_{t-l}^2\sum_{i=1+j}^{\infty}w_{t-j}w_{t-i}\prod_{h=1+j}^{i}e_{t-h}\right)\varepsilon_t^4. \tag{23}$$

If $v_n^{-1}\sum_{t=1}^{n}w_{t-i}^2 = O_p(1)$, we have

$$\frac{1}{v_n}\sum_{t=1}^{n}w_{t-i}^2\left(\varepsilon_t^4\prod_{h=1}^{i}e_{t-h}^2 - K_4\varpi^i\right) = o_p(1)$$

by Lemma 2. This is because $\left(\varepsilon_t^4\prod_{h=1}^{i}e_{t-h}^2 - K_4\varpi^i\right)$ is independent of $w_{t-i}$ and is an element of $UL2$ for a given $i \ge 0$. For the first term in (23), we have, due to Lemmas 1 and 4,

$$\frac{1}{n}\sum_{i=0}^{\infty}K_4\varpi^i\sum_{t=1}^{n}w_{t-i}^2 \to_p \frac{K_4}{1-\varpi}\left(\omega^2+\pi^2E(x_t^2)\right)$$

for $-1/2 < d_x < 1/2$, and

$$\frac{1}{n\sigma_{n\xi}^2}\sum_{i=0}^{\infty}K_4\varpi^i\sum_{t=1}^{n}w_{t-i}^2 \to_d \frac{K_4}{1-\varpi}\,\pi^2\int_0^1 (W_d(r))^2\,dr$$

for $1/2 < d_x < 3/2$.

[10] See Remark A1 in Appendix A for the notation $UL2$.

If $v_n^{-1}\sum_{t=1}^{n}w_{t-j}w_{t-i} = O_p(1)$ for $j \ge 0$ and $i \ge 1+j$, we have

$$\frac{1}{v_n}\sum_{t=1}^{n}w_{t-j}w_{t-i}\left(\varepsilon_t^4\prod_{l=1}^{j}e_{t-l}^2 - K_4\varpi^j\right)\left(\prod_{h=1+j}^{i}e_{t-h} - (\alpha+\beta)^{i-j}\right) = o_p(1). \tag{24}$$

Note that $\left(\varepsilon_t^4\prod_{l=1}^{j}e_{t-l}^2 - K_4\varpi^j\right)$ has zero mean, and is independent of $w_{t-j}w_{t-i}$ as well as $\left(\prod_{h=1+j}^{i}e_{t-h} - (\alpha+\beta)^{i-j}\right)$ for $j \ge 0$ and $i \ge 1+j$. Hence, (24) holds for $-1/2 < d_x < 1/2$ by the ergodic theorem. Moreover, $\left(\varepsilon_t^4\prod_{l=1}^{j}e_{t-l}^2 - K_4\varpi^j\right)\left(\prod_{h=1+j}^{i}e_{t-h} - (\alpha+\beta)^{i-j}\right) \in UL2$ for $j \ge 0$ and $i \ge 1+j$, and (24) therefore holds for $1/2 < d_x < 3/2$, as shown in Remark A2 in Appendix A. For the second term in (23), we have, due to Lemmas 3 and 4,

$$\frac{2}{n}\sum_{j=0}^{\infty}K_4\varpi^j\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\sum_{t=1}^{n}w_{t-j}w_{t-i} \to_p 2\sum_{j=0}^{\infty}K_4\varpi^j\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\left(\omega^2+\pi^2\gamma_x(i-j)\right)$$

for $-1/2 < d_x < 1/2$, and

$$\frac{2}{n\sigma_{n\xi}^2}\sum_{j=0}^{\infty}K_4\varpi^j\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\sum_{t=1}^{n}w_{t-j}w_{t-i} \to_d \frac{K_4}{1-\varpi}\cdot\frac{2(\alpha+\beta)}{1-(\alpha+\beta)}\,\pi^2\int_0^1 (W_d(r))^2\,dr$$

for $1/2 < d_x < 3/2$.

Combining the two terms in (23), we have

$$\frac{1}{n}\sum_{t=1}^{n}y_t^4 \to_p \frac{K_4}{1-\varpi}\left(\omega^2+\pi^2E(x_t^2)\right) + 2\sum_{j=0}^{\infty}K_4\varpi^j\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\left(\omega^2+\pi^2\gamma_x(i-j)\right)$$

for $-1/2 < d_x < 1/2$, and

$$\frac{1}{n\sigma_{n\xi}^2}\sum_{t=1}^{n}y_t^4 \to_d \frac{1+(\alpha+\beta)}{1-(\alpha+\beta)}\cdot\frac{K_4}{1-\varpi}\,\pi^2\int_0^1 (W_d(r))^2\,dr$$

for $1/2 < d_x < 3/2$.

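The third sample moment is

$$\begin{aligned}
\sum_{t=k+1}^{n} y_t^2 y_{t-k}^2
&= \sum_{t=k+1}^{n}\left(\sum_{i=0}^{\infty} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2\left(\sum_{j=0}^{\infty} w_{t-k-j}\prod_{h=1}^{j} e_{t-k-h}\right)\varepsilon_{t-k}^2 \\
&= \sum_{t=k+1}^{n}\left(\sum_{i=0}^{k-1} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2\left(\sum_{j=0}^{\infty} w_{t-k-j}\prod_{h=1}^{j} e_{t-k-h}\right)\varepsilon_{t-k}^2 \\
&\quad + \sum_{t=k+1}^{n}\left(\sum_{i=k}^{\infty} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2\left(\sum_{j=0}^{\infty} w_{t-k-j}\prod_{h=1}^{j} e_{t-k-h}\right)\varepsilon_{t-k}^2.
\end{aligned} \tag{25}$$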

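For the third line, we divided $y_t^2 = \left(\sum_{i=0}^{\infty} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2$ into two parts,

$$\left(\sum_{i=0}^{k-1} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2 \quad\text{and}\quad \left(\sum_{i=k}^{\infty} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2,$$

because each term produces different types of zero mean processes $(u_t)$ when it is multiplied by $y_{t-k}^2$.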

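First, we consider the first term of (25). Note that

$$E\left[\left(\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2\left(\prod_{h=1}^{j} e_{t-k-h}\right)\varepsilon_{t-k}^2\right] = (\alpha+\beta)^{i+j}$$

for $0 \le i \le k-1$ and $j \ge 0$. If $v_n^{-1}\sum_{t=k+1}^{n} w_{t-i}w_{t-k-j} = O_p(1)$ for $0 \le i \le k-1$ and $j \ge 0$, we have

$$\frac{1}{v_n}\sum_{t=k+1}^{n} w_{t-i}w_{t-k-j}\left(\varepsilon_t^2\prod_{h=1}^{i} e_{t-h} - (\alpha+\beta)^{i}\right)\left(\varepsilon_{t-k}^2\prod_{h=1}^{j} e_{t-k-h} - (\alpha+\beta)^{j}\right) = o_p(1)$$

as in (24). Note that $\left(\varepsilon_t^2\prod_{h=1}^{i} e_{t-h} - (\alpha+\beta)^{i}\right)$ has zero mean, and is independent of $w_{t-i}w_{t-k-j}$ as well as $\left(\varepsilon_{t-k}^2\prod_{h=1}^{j} e_{t-k-h} - (\alpha+\beta)^{j}\right)$ for $0 \le i \le k-1$ and $j \ge 0$. Moreover, we can deduce that $\left(\varepsilon_t^2\prod_{h=1}^{i} e_{t-h} - (\alpha+\beta)^{i}\right)\left(\varepsilon_{t-k}^2\prod_{h=1}^{j} e_{t-k-h} - (\alpha+\beta)^{j}\right) \in$ UL2 for $0 \le i \le k-1$ and $j \ge 0$. Hence, we have

$$\begin{aligned}
&\frac{1}{v_n}\sum_{t=k+1}^{n}\left(\sum_{i=0}^{k-1} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2\left(\sum_{j=0}^{\infty} w_{t-k-j}\prod_{h=1}^{j} e_{t-k-h}\right)\varepsilon_{t-k}^2 \\
&\qquad = \frac{1}{v_n}\sum_{i=0}^{k-1}(\alpha+\beta)^{i}\sum_{j=0}^{\infty}(\alpha+\beta)^{j}\sum_{t=k+1}^{n} w_{t-i}w_{t-k-j} + o_p(1).
\end{aligned}$$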
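The expectation used for the first term of (25) can be checked factor by factor. The following is a sketch under assumptions restated from the earlier sections rather than from this proof: $(\varepsilon_t)$ is i.i.d. with $E\varepsilon_t^2 = 1$, and $e_t$ is a function of $\varepsilon_t$ alone with $E e_t = \alpha+\beta$ (which is what the centring term in (24) indicates). For $0 \le i \le k-1$ no time index appears twice, so the expectation factorizes:

```latex
\begin{align*}
E\Bigl[\Bigl(\prod_{h=1}^{i} e_{t-h}\Bigr)\varepsilon_t^2
       \Bigl(\prod_{h=1}^{j} e_{t-k-h}\Bigr)\varepsilon_{t-k}^2\Bigr]
 &= E\varepsilon_t^2 \cdot \prod_{h=1}^{i} E e_{t-h}
    \cdot E\varepsilon_{t-k}^2 \cdot \prod_{h=1}^{j} E e_{t-k-h} \\
 &= 1\cdot(\alpha+\beta)^{i}\cdot 1\cdot(\alpha+\beta)^{j}
  = (\alpha+\beta)^{i+j}.
\end{align*}
```

The indices involved are $t$, then $t-1,\dots,t-i$ with $t-i \ge t-k+1$, then $t-k$, then $t-k-1,\dots,t-k-j$, which are all distinct.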


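This leads to

$$\frac{1}{n}\sum_{i=0}^{k-1}(\alpha+\beta)^{i}\sum_{j=0}^{\infty}(\alpha+\beta)^{j}\sum_{t=k+1}^{n} w_{t-i}w_{t-k-j} + o_p(1) \to_p \sum_{i=0}^{k-1}(\alpha+\beta)^{i}\sum_{j=0}^{\infty}(\alpha+\beta)^{j}\left(\omega^2 + \pi^2\gamma_x(k+j-i)\right) \tag{26}$$

for $-1/2 < d_x < 1/2$, and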

1

n�2n�

k�1Xi=0

(�+ �)i1Xj=0

(�+ �)jnX

t=k+1

wt�iwt�k�j + op(1)

!d1� (�+ �)k

1� (�+ �)�2

1� (�+ �)

Z 1

0(Wd(r))

2 dr (27)

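for $1/2 < d_x < 3/2$. For the second term in (25), note that

$$E\left(\left(\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2\left(\prod_{h=1}^{j} e_{t-k-h}\right)\varepsilon_{t-k}^2\right) = (\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)\varpi^{\min(i-k,\,j)}$$

for $i \ge k$ and $j \ge 0$. If $v_n^{-1}\sum_{t=k+1}^{n} w_{t-i}w_{t-k-j} = O_p(1)$ for $i \ge k$ and $j \ge 0$, we have

$$\frac{1}{v_n}\sum_{t=k+1}^{n} w_{t-i}w_{t-k-j}\left(\varepsilon_t^2\varepsilon_{t-k}^2\left(\prod_{h=1}^{i} e_{t-h}\right)\left(\prod_{h=1}^{j} e_{t-k-h}\right) - (\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)\varpi^{\min(i-k,\,j)}\right) = o_p(1)$$

as shown in Remark A2 in Appendix A. Note that $w_{t-i}w_{t-k-j}$ is independent of $\left(\varepsilon_t^2\varepsilon_{t-k}^2\left(\prod_{h=1}^{i} e_{t-h}\right)\left(\prod_{h=1}^{j} e_{t-k-h}\right) - (\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)\varpi^{\min(i-k,\,j)}\right)$, which is an element of UL2, for $i \ge k$ and $j \ge 0$. Therefore,

$$\begin{aligned}
&\frac{1}{v_n}\sum_{t=k+1}^{n}\left(\left(\sum_{i=k}^{\infty} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2 \times \left(\sum_{j=0}^{\infty} w_{t-k-j}\prod_{h=1}^{j} e_{t-k-h}\right)\varepsilon_{t-k}^2\right) \\
&\qquad = \frac{1}{v_n}(\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)\sum_{j=0}^{\infty}\varpi^{j}\sum_{t=k+1}^{n} w_{t-k-j}^2 \\
&\qquad\quad + \frac{1}{v_n}\,2(\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)\sum_{j=0}^{\infty}\varpi^{j}\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\sum_{t=k+1}^{n} w_{t-k-j}w_{t-k-i} + o_p(1).
\end{aligned}$$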
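The factors in the expectation for the second term of (25) can be traced as follows. This is a sketch that assumes the standard GARCH(1,1) definitions $e_t = \beta + \alpha\varepsilon_t^2$, $\varpi = E e_t^2$ and $K_4 = E\varepsilon_t^4$ with $(\varepsilon_t)$ i.i.d. and $E\varepsilon_t^2 = 1$; these are not restated in this part of the proof, so treat them as assumptions.

```latex
\begin{align*}
\prod_{h=1}^{i} e_{t-h}
  &= \Bigl(\prod_{h=1}^{k-1} e_{t-h}\Bigr)\, e_{t-k}\,
     \Bigl(\prod_{h=k+1}^{i} e_{t-h}\Bigr), \qquad i \ge k, \\
E\bigl[e_{t-k}\,\varepsilon_{t-k}^2\bigr]
  &= E\bigl[(\beta+\alpha\varepsilon_{t-k}^2)\,\varepsilon_{t-k}^2\bigr]
   = \beta + \alpha K_4, \\
E\Bigl[\Bigl(\prod_{h=1}^{i} e_{t-h}\Bigr)\varepsilon_t^2
       \Bigl(\prod_{h=1}^{j} e_{t-k-h}\Bigr)\varepsilon_{t-k}^2\Bigr]
  &= (\alpha+\beta)^{k-1}\,(\beta+\alpha K_4)\,
     \varpi^{\min(i-k,\,j)}\,(\alpha+\beta)^{|i-k-j|}.
\end{align*}
```

Here $\prod_{h=k+1}^{i} e_{t-h}$ and $\prod_{h=1}^{j} e_{t-k-h}$ share $\min(i-k, j)$ factors, each contributing $E e_t^2 = \varpi$, while the $|i-k-j|$ unshared factors each contribute $\alpha+\beta$. On the diagonal $i-k=j$ the last power equals one, which gives the $w_{t-k-j}^2$ sum; off the diagonal it is the power of $(\alpha+\beta)$ that appears in the cross-product sum.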


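This leads to

$$\begin{aligned}
&\frac{1}{n}\sum_{t=k+1}^{n}\left(\left(\sum_{i=k}^{\infty} w_{t-i}\prod_{h=1}^{i} e_{t-h}\right)\varepsilon_t^2 \times \left(\sum_{j=0}^{\infty} w_{t-k-j}\prod_{h=1}^{j} e_{t-k-h}\right)\varepsilon_{t-k}^2\right) \\
&\qquad \to_p \frac{(\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)}{1-\varpi}\left(\omega^2 + \pi^2 E\left(x_t^2\right)\right) \\
&\qquad\quad + 2(\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)\sum_{j=0}^{\infty}\varpi^{j}\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\left(\omega^2 + \pi^2\gamma_x(i-j)\right)
\end{aligned} \tag{28}$$

for $-1/2 < d_x < 1/2$, and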

1

n�2n�

nXt=k+1

24 1Xi=k

wt�i

iYh=1

et�h

!"2t �

0@ 1Xj=0

wt�k�j

jYh=1

et�k�h

1A "2t�k35

!d1 + �+ �

1� (�+ �)(�+ �)k�1

�� + �K4

�1�$ �2

Z 1

0(Wd(r))

2 dr (29)

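for $1/2 < d_x < 3/2$. Combining the equations (26)-(29), we have

$$\begin{aligned}
\frac{1}{n}\sum_{t=k+1}^{n} y_t^2 y_{t-k}^2 \to_p\;
&\sum_{i=0}^{k-1}(\alpha+\beta)^{i}\sum_{j=0}^{\infty}(\alpha+\beta)^{j}\left(\omega^2 + \pi^2\gamma_x(k+j-i)\right) \\
&+ \frac{(\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)}{1-\varpi}\left(\omega^2 + \pi^2 E\left(x_t^2\right)\right) \\
&+ 2(\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)\sum_{j=0}^{\infty}\varpi^{j}\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\left(\omega^2 + \pi^2\gamma_x(i-j)\right)
\end{aligned}$$

for $-1/2 < d_x < 1/2$, and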

1

n�2n�

nXt=k+1

y2t y2t�k

!p

1� (�+ �)k

(1� (�+ �))2+1 + �+ �

1� (�+ �)(�+ �)k�1

�� + �K4

�1�$

!�2Z 1

0(Wd(r))

2 dr

for $1/2 < d_x < 3/2$. In order to prove the stated result, note that

$$\bar{y}^2_n = \frac{1}{n}\sum_{t=1}^{n} y_t^2 \to_p \frac{\omega}{1-(\alpha+\beta)}$$

for $-1/2 < d_x < 1/2$ and

1

�n��y2n =

1

n�n�

nXt=1

y2t !d1

1� (�+ �)�Z 1

0Wd(r)dr


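for $1/2 < d_x < 3/2$, which we already proved in the beginning. One may easily deduce that, for $-1/2 < d_x < 1/2$,

$$\begin{aligned}
\frac{1}{n}\sum_{t=k+1}^{n}\left(y_t^2 - \bar{y}^2_n\right)\left(y_{t-k}^2 - \bar{y}^2_n\right)
&= \frac{1}{n}\sum_{t=k+1}^{n} y_t^2 y_{t-k}^2 - \left(\bar{y}^2_n\right)^2 + o_p(1) \\
&\to_d \sum_{i=0}^{k-1}(\alpha+\beta)^{i}\sum_{j=0}^{\infty}(\alpha+\beta)^{j}\left(\omega^2 + \pi^2\gamma_x(k+j-i)\right) \\
&\quad + \frac{(\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)}{1-\varpi}\left(\omega^2 + \pi^2 E\left(x_t^2\right)\right) \\
&\quad + 2(\alpha+\beta)^{k-1}\left(\beta+\alpha K_4\right)\sum_{j=0}^{\infty}\varpi^{j}\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\left(\omega^2 + \pi^2\gamma_x(i-j)\right) \\
&\quad - \left(\frac{\omega}{1-(\alpha+\beta)}\right)^2
\end{aligned}$$

and

$$\begin{aligned}
\frac{1}{n}\sum_{t=1}^{n}\left(y_t^2 - \bar{y}^2_n\right)^2
&= \frac{1}{n}\sum_{t=1}^{n} y_t^4 - \left(\bar{y}^2_n\right)^2 + o_p(1) \\
&\to_d \frac{K_4}{1-\varpi}\left(\omega^2 + \pi^2 E\left(x_t^2\right)\right) + 2\sum_{j=0}^{\infty} K_4\varpi^{j}\sum_{i=1+j}^{\infty}(\alpha+\beta)^{i-j}\left(\omega^2 + \pi^2\gamma_x(i-j)\right) - \left(\frac{\omega}{1-(\alpha+\beta)}\right)^2.
\end{aligned}$$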

For $1/2 < d_x < 3/2$, we have

1

n�2n�

nXt=k+1

�y2t � �y2n

� �y2t�k � �y2n

�=

1

n�2n�

nXt=k+1

y2t y2t�k �

�1

�n��y2n

�2+ op (1)

!d

1� (�+ �)k

(1� (�+ �))2+1 + (�+ �)

1� (�+ �)(�+ �)k�1

�� + �K4

�1�$

!�2Z 1

0(Wd(r))

2 dr

��

1� (�+ �)

Z 1

0Wd(r)dr

�2and

1

n�2n�

nXt=1

�y2t � �y2n

�2=

1

n�2n�

nXt=1

y4t ��1

�n��y2n

�2+ op (1)

!d1 + �+ �

1� (�+ �)K4

1�$�2

Z 1

0(Wd(r))

2 dr ��

1� (�+ �)

Z 1

0Wd(r)dr

�2:

The stated result follows immediately. □
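The two centred sums above are the numerator and denominator of a sample autocorrelation of the squared process. As a minimal sketch, assuming the statistic in Theorem 1 is the usual ratio of these two sums (the function name `sample_autocorr_squared` is hypothetical and not from the paper):

```python
def sample_autocorr_squared(y, k):
    """Lag-k sample autocorrelation of the squared series y_t^2.

    Numerator:   sum_{t=k+1}^{n} (y_t^2 - mean)(y_{t-k}^2 - mean)
    Denominator: sum_{t=1}^{n}   (y_t^2 - mean)^2
    where mean is the sample mean of y_t^2 over the full sample.
    """
    y2 = [v * v for v in y]
    n = len(y2)
    mean = sum(y2) / n
    num = sum((y2[t] - mean) * (y2[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in y2)
    return num / den
```

Any common normalization applied to both sums cancels in the ratio, so the same function applies to the stationary and the nonstationary case.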


Proof of Theorem 2. The results are proven in the proof of Theorem 1. □

Proof of Theorem 3. The stated result for $-1/2 < d_x < 1/2$ follows from

$$K_{4n} = \frac{\dfrac{1}{n}\sum_{t=1}^{n} y_t^4}{\left(\dfrac{1}{n}\sum_{t=1}^{n} y_t^2\right)^2},$$

and the result for $1/2 < d_x < 3/2$ follows from

K4n =

1

n�2n�

nXt=1

y4t

, 1

n�n�

nXt=1

y2t

!2:
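The sample kurtosis used in this proof is a ratio of the fourth sample moment to the squared second sample moment. A minimal sketch of that statistic (the function name `sample_kurtosis` is illustrative, not the paper's):

```python
def sample_kurtosis(y):
    """Sample kurtosis: (1/n) sum y_t^4 divided by ((1/n) sum y_t^2)^2.

    The series is not demeaned, matching the uncentred moments
    that appear in the proof.
    """
    n = len(y)
    m4 = sum(v ** 4 for v in y) / n
    m2 = sum(v ** 2 for v in y) / n
    return m4 / (m2 ** 2)
```

Rescaling the whole series leaves the ratio unchanged, which is why the two expressions for the statistic in the proof, with and without the extra normalization, describe the same quantity.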

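Proof of Theorem 4. Let $\theta = (\alpha, \beta)'$ and $\theta \in \Theta = A \times B$, where $\alpha \in A$ and $\beta \in B \subset (0,1)$ as in Assumption 5. Since

$$\frac{\partial\sigma_t^2}{\partial\theta} = \left(\frac{\partial\sigma_t^2}{\partial\alpha}, \frac{\partial\sigma_t^2}{\partial\beta}\right)' = \left(\sum_{i=1}^{\infty}\beta^{i-1}y_{t-i}^2,\; \sum_{i=1}^{\infty}\beta^{i-1}\sigma_{t-i}^2(\theta)\right)',$$

the score function is given by

$$s_n(\theta) = \frac{1}{n}\sum_{t=1}^{n}\left(\frac{y_t^2}{\sigma_t^2(\theta)} - 1\right)\left(\frac{\sum_{i=1}^{\infty}\beta^{i-1}y_{t-i}^2}{\sigma_t^2(\theta)},\; \frac{\sum_{i=1}^{\infty}\beta^{i-1}\sigma_{t-i}^2(\theta)}{\sigma_t^2(\theta)}\right)' \tag{30}$$

with

$$\sigma_t^2(\theta) = \omega + \alpha y_{t-1}^2 + \beta\sigma_{t-1}^2 = \frac{\omega}{1-\beta} + \alpha\sum_{i=1}^{\infty}\beta^{i-1}y_{t-i}^2.$$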
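The equality between the recursive and the expanded form of $\sigma_t^2(\theta)$ can be checked numerically. The sketch below is an illustration under finite-sample assumptions that are not in the proof: the recursion is started at $\sigma_0^2 = \omega/(1-\beta)$ and all pre-sample values of $y$ are set to zero, in which case the two forms agree exactly. The function names are hypothetical.

```python
def sigma2_recursive(y, omega, alpha, beta):
    """sigma^2_t = omega + alpha*y_{t-1}^2 + beta*sigma^2_{t-1}, t = 1..n.

    y[0], ..., y[n-1] hold y_0, ..., y_{n-1}; the start value
    sigma^2_0 = omega / (1 - beta) is an assumption of this sketch.
    """
    s = omega / (1.0 - beta)
    out = []
    for t in range(1, len(y) + 1):
        s = omega + alpha * y[t - 1] ** 2 + beta * s
        out.append(s)
    return out


def sigma2_expanded(y, omega, alpha, beta):
    """omega/(1-beta) + alpha * sum_{i=1}^{t} beta^(i-1) * y_{t-i}^2.

    The infinite sum is truncated at i = t because pre-sample
    values of y are taken to be zero.
    """
    out = []
    for t in range(1, len(y) + 1):
        tail = sum(beta ** (i - 1) * y[t - i] ** 2 for i in range(1, t + 1))
        out.append(omega / (1.0 - beta) + alpha * tail)
    return out
```

For example, `sigma2_recursive([0.3, -1.2, 0.8], 0.1, 0.2, 0.5)` and `sigma2_expanded([0.3, -1.2, 0.8], 0.1, 0.2, 0.5)` return the same three values up to floating-point rounding.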

It follows from Lemmas 5 and 6 that

�i�1��1n� y2t�i = �

i�1���1n� (w0 + �0xt)

�mt�i + op(1) (31)

for all i uniformly in t = 1; : : : ; n. Moreover, we have

��1n� �2t (�) =

���1n� (w0 + �0xt)

��

1Xi=1

�i�1mt�i + op(1) (32)

uniformly in $t = 1, \ldots, n$. Note that (32) also holds uniformly in $\theta \in \Theta$, since in particular $B \subset (0,1)$. Therefore, we may deduce from (31) and (32) that

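$$\frac{y_t^2}{\sigma_t^2(\theta)} = \frac{m_t}{\alpha\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} + o_p(1) \tag{33}$$

$$\frac{\sum_{i=1}^{\infty}\beta^{i-1}y_{t-i}^2}{\sigma_t^2(\theta)} = \frac{1}{\alpha} + o_p(1) \tag{34}$$

$$\frac{\sum_{i=1}^{\infty}\beta^{i-1}\sigma_{t-i}^2(\theta)}{\sigma_t^2(\theta)} = \frac{\sum_{i=1}^{\infty}\beta^{i-1}\sum_{j=i+1}^{\infty}\beta^{j-i-1}m_{t-j}}{\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} + o_p(1) \tag{35}$$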


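uniformly in $t = 1, \ldots, n$ and uniformly in $\theta \in \Theta$. Consequently, if we set

$$s_n(\theta) = \left(s_{1n}(\theta), s_{2n}(\theta)\right)',$$

then it follows from (30) together with (33), (34) and (35) that

$$s_{1n}(\theta) = \frac{1}{n}\sum_{t=1}^{n}\left(\frac{m_t}{\alpha\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} - 1\right)\frac{1}{\alpha} + o_p(1) \tag{36}$$

$$s_{2n}(\theta) = \frac{1}{n}\sum_{t=1}^{n}\left(\frac{m_t}{\alpha\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} - 1\right)\frac{\sum_{i=1}^{\infty}\beta^{i-1}\sum_{j=i+1}^{\infty}\beta^{j-i-1}m_{t-j}}{\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} + o_p(1) \tag{37}$$

uniformly in $\theta \in \Theta$. Consider the simultaneous equations system

$$s^*_{1n}(\theta) = \frac{1}{n}\sum_{t=1}^{n}\left(\frac{m_t}{\alpha\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} - 1\right) = 0 \tag{38}$$

$$s^*_{2n}(\theta) = \frac{1}{n}\sum_{t=1}^{n}\left(\frac{m_t}{\alpha\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} - 1\right)\frac{\sum_{i=1}^{\infty}\beta^{i-1}\sum_{j=i+1}^{\infty}\beta^{j-i-1}m_{t-j}}{\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} = 0. \tag{39}$$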

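We may easily see from (36) and (37) that $\tilde{\alpha}_n$ and $\tilde{\beta}_n$ have the same probability limits as the solution of the simultaneous equations system (38) and (39).

Now it follows from Assumption 5 that

$$s^*_{1n}(\theta) \to_{a.s.} E\left(\frac{m_t}{\alpha\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} - 1\right)$$

$$s^*_{2n}(\theta) \to_{a.s.} E\left(\frac{m_t}{\alpha\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}} - 1\right)\frac{\sum_{i=1}^{\infty}\beta^{i-1}\sum_{j=i+1}^{\infty}\beta^{j-i-1}m_{t-j}}{\sum_{i=1}^{\infty}\beta^{i-1}m_{t-i}}$$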

uniformly in � 2 �. To see this, note in particular that (mt) is nonnegative a.s., andtherefore, �it(�) is monotone decreasing a.s. in � 2 B for all i = 1; 2; 3. The stated resulttherefore follows immediately. �


References

Andersen, T.G., T. Bollerslev, F.X. Diebold and P. Labys, 2003, Modeling and forecasting realized volatility, Econometrica, 71, 529-626.

Andersen, T.G., T. Bollerslev and F.X. Diebold, 2009, Parametric and nonparametric measurement of volatility, in Y. Aït-Sahalia and L.P. Hansen, (Eds.), Handbook of Financial Econometrics, 67-138, North Holland.

Bai, X., J.R. Russell and G.C. Tiao, 2003, Kurtosis of GARCH and stochastic volatility models with non-normal innovations, Journal of Econometrics, 114, 349-360.

Baillie, R.T., T. Bollerslev and H.O. Mikkelsen, 1996, Fractionally integrated generalized autoregressive conditional heteroskedasticity, Journal of Econometrics, 74, 3-30.

Barndorff-Nielsen, O.E. and N. Shephard, 2002, Variation, jumps and high frequency data in financial econometrics, in R. Blundell, T. Persson and W.K. Newey, (Eds.), Advances in Economics and Econometrics. Theory and Applications, Ninth World Congress, Econometric Society Monographs, 328-372. Cambridge University Press.

Bollerslev, T., 1987, A conditionally heteroskedastic time series for speculative prices and rates of return, The Review of Economics and Statistics, 69, 542-547.

Brenner, R.J., R.H. Harjes and K.F. Kroner, 1996, Another look at models of the short-term interest rate, Journal of Financial and Quantitative Analysis, 31, 85-107.

Cai, J., 1994, A Markov model of unconditional variance in ARCH, Journal of Business and Economic Statistics, 12, 309-316.

Cipollini, F., R.F. Engle and G. Gallo, 2007, A model for multivariate non-negative valued processes in financial econometrics, mimeograph, Stern School of Business, New York University.

Craioveanu, M. and E. Hillebrand, 2010, Level changes in volatility models, Annals of Finance, forthcoming.

Davidson, J., 2002, Establishing conditions for the fractional central limit theorem in nonlinear and semiparametric time series processes, Journal of Econometrics, 106, 243-269.

Davidson, J. and R.M. De Jong, 2000, The functional central limit theorem and weak convergence to stochastic integrals II: Fractionally integrated processes, Econometric Theory, 16, 643-666.

De Jong, R.M. and J. Davidson, 2000, The functional central limit theorem and weak convergence to stochastic integrals I: Weakly dependent processes, Econometric Theory, 16, 621-642.


Diebold, F.X., 1986, Modelling the persistence of conditional variance: A comment, Econometric Reviews, 5, 51-56.

Diebold, F.X. and A. Inoue, 2001, Long memory and regime switching, Journal of Econometrics, 105, 131-159.

Ding, Z. and C.W.J. Granger, 1996, Modeling volatility persistence of speculative returns: A new approach, Journal of Econometrics, 73, 185-215.

Ding, Z., C.W.J. Granger and R.F. Engle, 1993, A long memory property of stock market returns and a new model, Journal of Empirical Finance, 1, 83-106.

Dittmann, I. and C.W.J. Granger, 2002, Properties of nonlinear transformations of fractionally integrated processes, Journal of Econometrics, 110, 113-133.

Engle, R.F., 2002, New frontiers for ARCH models, Journal of Applied Econometrics, 17, 425-446.

Engle, R.F. and T. Bollerslev, 1986, Modelling the persistence of conditional variance, Econometric Reviews, 5, 1-50, 81-87.

Engle, R.F. and G.M. Gallo, 2006, A multiple indicators model for volatility using intra-daily data, Journal of Econometrics, 131, 3-27.

Engle, R.F. and G.J. Lee, 1999, A long-run and short-run component model of stock return volatility, in Engle R. and H. White, (Eds.), Cointegration, Causality, and Forecasting: A Festschrift in Honour of Clive W.J. Granger, Chapter 10, 475-497, Oxford University Press.

Engle, R.F. and A.J. Patton, 2001, What good is a volatility model?, Quantitative Finance, 1(2), 237-245.

Fleming, J., C. Kirby and B. Ostdiek, 2008, The specification of GARCH models with stochastic covariates, Journal of Futures Markets, 28, 911-934.

Glosten, L.R., R. Jagannathan and D. Runkle, 1993, On the relation between the expected value and the volatility of nominal excess returns on stocks, Journal of Finance, 48, 1779-1801.

Granger, C.W.J. and N. Hyung, 2004, Occasional structural breaks and long memory with an application to the S&P 500 absolute stock returns, Journal of Empirical Finance, 11, 399-421.

Gray, S.F., 1996, Modeling the conditional distribution of interest rates as a regime-switching process, Journal of Financial Economics, 42, 27-62.

Hagiwara, M. and M.A. Herce, 1999, Endogenous exchange rate volatility, trading volume and interest rate differentials in a model of portfolio selection, Review of International Economics, 7(2), 202-218.


Hamilton, J. and R. Susmel, 1994, Autoregressive conditional heteroskedasticity and changes in regime, Journal of Econometrics, 64, 307-333.

Han, H. and J.Y. Park, 2008, Time series properties of ARCH processes with persistent covariates, Journal of Econometrics, 146, 275-292.

Han, H. and J.Y. Park, 2009, ARCH/GARCH with persistent covariate: Asymptotic theory of MLE, mimeograph, Department of Economics, National University of Singapore.

Hansen, P.R., Z. Huang and H.H. Shek, 2010, Realized GARCH: A complete model of returns and realized measures of volatility, mimeograph, Department of Economics, Stanford University.

Hillebrand, E., 2005, Neglecting parameter changes in GARCH models, Journal of Econometrics, 129, 121-138.

Hodrick, R.J., 1989, Risk, uncertainty, and exchange rates, Journal of Monetary Economics, 23, 433-459.

Hol, E.M.J.H., 2003, Empirical studies on volatility in international stock markets. Kluwer Academic, Dordrecht.

Hurvich, C.M. and P. Soulier, 2009, Stochastic volatility models with long memory, in T.G. Andersen, R.A. Davis, J.P. Kreiss and Th. Mikosch, (Eds.), Handbook of Financial Time Series, 157-168, Springer.

Hwang, S. and S.E. Satchell, 2005, GARCH model with cross-sectional volatility: GARCHX models, Applied Financial Economics, 15, 203-216.

Jensen, A.T. and T. Lange, 2010, On convergence of the QMLE for misspecified GARCH models, Journal of Time Series Econometrics, 2, Iss.1, Article 3.

Jensen, S.T. and A. Rahbek, 2004, Asymptotic inference for nonstationary GARCH, Econometric Theory, 20, 1203-1226.

Lamoureux, C.G. and W.D. Lastrapes, 1990, Persistence in variance, structural change and the GARCH model, Journal of Business and Economic Statistics, 8, 225-234.

Lee, T.H., 1994, Spread and volatility in spot and forward exchange rates, Journal of International Money and Finance, 13, 375-382.

Liu, M., 1998, Asymptotics of nonstationary fractional integrated series, Econometric Theory, 14, 641-662.

Mikosch, T. and C. Starica, 2004, Nonstationarities in financial time series, the long-range dependence, and the IGARCH effects, The Review of Economics and Statistics, 86, 378-390.


Nelson, D.B., 1990, Stationarity and persistence in the GARCH(1,1) model, Econometric Theory, 6, 318-334.

Park, J.Y., 2002, Nonstationary nonlinear heteroskedasticity, Journal of Econometrics, 110, 383-415.

Park, J.Y. and P.C.B. Phillips, 1999, Asymptotics for nonlinear transformations of integrated time series, Econometric Theory, 15, 269-298.

Shephard, N. and K. Sheppard, 2010, Realising the future: forecasting with high frequency based volatility (HEAVY) models, Journal of Applied Econometrics, 25, 197-231.

Sowell, F., 1990, The fractional unit root distribution, Econometrica, 58, 495-505.

Tanaka, K., 1999, The nonstationary fractional unit root, Econometric Theory, 15, 549-582.


Table 1. Simulated values of $\alpha^*$ and $\beta^*$ for the GARCH(1,1)-X model

Innovation distributions for the three column groups, left to right: $\varepsilon_t \sim$ iid $N(0,1)$; $\varepsilon_t \sim$ iid $U(-\sqrt{3}, \sqrt{3})$; $\varepsilon_t \sim$ iid $(5/3)^{-1/2}t_5$.

$\alpha_0$  $\beta_0$   $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$   $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$   $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$
0.1   0.1    0.00  1.00  1.00     0.00  1.00  1.00     0.01  0.99  1.00
      0.2    0.00  1.00  1.00     0.00  1.00  1.00     0.01  0.99  1.00
      0.3    0.00  1.00  1.00     0.00  1.00  1.00     0.02  0.99  1.01
      0.4    0.00  1.00  1.00     0.00  1.00  1.00     0.01  0.99  1.00
      0.5    0.01  1.00  1.00     0.01  0.99  1.00     0.01  0.99  1.00
0.2   0.1    0.01  0.99  1.00     0.01  0.99  1.00     0.03  0.98  1.01
      0.2    0.01  0.99  1.00     0.01  0.99  1.00     0.03  0.97  1.01
      0.3    0.03  0.98  1.00     0.02  0.98  1.00     0.05  0.96  1.01
      0.4    0.04  0.96  1.00     0.04  0.96  1.00     0.06  0.95  1.01
0.3   0.1    0.05  0.95  1.00     0.05  0.96  1.00     0.06  0.95  1.01
      0.2    0.07  0.94  1.01     0.08  0.92  1.00     0.08  0.93  1.02
      0.3    0.11  0.90  1.01     0.12  0.88  1.01     0.11  0.91  1.02

Notes: For the results in the Table, the GARCH(1,1) processes with the coefficients $\alpha_0$ and $\beta_0$ are generated and fitted into the GARCH(1,1) model without constant to obtain the MLE $\alpha^*_n$ and $\beta^*_n$ from the samples of size 2,000. The reported values of $\alpha^*$ and $\beta^*$ are then obtained in each case by taking averages of $\alpha^*_n$ and $\beta^*_n$ over 100 iterations. The values of $\alpha^*_n$ and $\beta^*_n$ show very little variation, and are very close respectively to $\alpha^*$ and $\beta^*$ in every iteration. The first 1000 observations are discarded from the initial samples of size 3,000 to remove the start-up effect. The results are invariant with respect to the value of the constant term in the GARCH models.
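The sampling scheme in the notes (3,000 draws, first 1,000 discarded, three unit-variance innovation types) can be sketched as follows. This is only an illustration of the burn-in mechanics: the covariate term is left out because its data-generating process is not described in this part of the paper, the estimation step is not shown, and the function names, seed handling and start value are assumptions of the sketch.

```python
import math
import random


def draw_innovation(kind, rng):
    """Draw one innovation with mean zero and unit variance."""
    if kind == "normal":
        return rng.gauss(0.0, 1.0)
    if kind == "uniform":
        # U(-sqrt(3), sqrt(3)) has variance 1.
        return rng.uniform(-math.sqrt(3.0), math.sqrt(3.0))
    if kind == "t5":
        # Student t with 5 d.o.f. built as Z / sqrt(chi2_5 / 5),
        # then scaled by (5/3)^(-1/2) so the variance is 1.
        z = rng.gauss(0.0, 1.0)
        chi2 = rng.gammavariate(2.5, 2.0)  # chi-square, 5 d.o.f.
        return math.sqrt(3.0 / 5.0) * z / math.sqrt(chi2 / 5.0)
    raise ValueError("unknown innovation type: " + kind)


def simulate_garch11(n_keep, omega, alpha, beta, kind="normal",
                     burn_in=1000, seed=0):
    """Simulate a GARCH(1,1) path and drop the first burn_in values."""
    rng = random.Random(seed)
    sigma2 = omega / (1.0 - alpha - beta)  # assumed start value
    y_prev = 0.0
    path = []
    for _ in range(n_keep + burn_in):
        sigma2 = omega + alpha * y_prev ** 2 + beta * sigma2
        y_prev = math.sqrt(sigma2) * draw_innovation(kind, rng)
        path.append(y_prev)
    return path[burn_in:]
```

Calling `simulate_garch11(2000, 0.05, 0.1, 0.3, kind="t5")` returns 2,000 retained observations out of 3,000 generated, which mirrors the sample sizes stated in the notes.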


Table 2. Simulated values of $\alpha^*$ and $\beta^*$ for the GARCH(0,1)-X model ($\beta_0 = 0$)

Innovation distributions for the three column groups, left to right: $\varepsilon_t \sim$ iid $N(0,1)$; $\varepsilon_t \sim$ iid $U(-\sqrt{3}, \sqrt{3})$; $\varepsilon_t \sim$ iid $(5/3)^{-1/2}t_5$.

$\alpha_0$    $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$   $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$   $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$
0.10   0.00  1.00  1.00     0.00  1.00  1.00     0.00  1.00  1.00
0.12   0.00  1.00  1.00     0.00  1.00  1.00     0.01  0.99  1.00
0.14   0.00  1.00  1.00     0.00  1.00  1.00     0.01  0.99  1.00
0.16   0.00  1.00  1.00     0.00  1.00  1.00     0.01  0.99  1.00
0.18   0.00  1.00  1.00     0.00  1.00  1.00     0.02  0.98  1.00
0.20   0.01  0.99  1.00     0.00  1.00  1.00     0.01  0.99  1.00
0.22   0.01  0.99  1.00     0.01  0.99  1.00     0.03  0.98  1.01
0.24   0.01  0.99  1.00     0.01  0.99  1.00     0.03  0.98  1.00
0.26   0.02  0.98  1.00     0.01  0.99  1.00     0.04  0.97  1.01
0.28   0.03  0.97  1.00     0.02  0.98  1.00     0.06  0.96  1.01
0.30   0.04  0.97  1.00     0.03  0.97  1.00     0.05  0.96  1.01
0.32   0.05  0.95  1.00     0.04  0.96  1.00     0.07  0.95  1.01
0.34   0.06  0.95  1.01     0.05  0.95  1.00     0.08  0.94  1.02
0.36   0.07  0.94  1.01     0.07  0.93  1.00     0.08  0.94  1.01
0.38   0.09  0.92  1.01     0.08  0.92  1.00     0.11  0.92  1.03
0.40   0.10  0.91  1.01     0.11  0.89  1.01     0.10  0.92  1.02

Notes: For the results in the Table, the GARCH(0,1) processes with the coefficients $\alpha_0 = 0.1 + 0.02i$ for $i = 0, 1, 2, \ldots, 20$ are generated and fitted into the GARCH(1,1) model without constant to obtain the MLE $\alpha^*_n$ and $\beta^*_n$ from the samples of size 2,000. The reported values of $\alpha^*$ and $\beta^*$ are then obtained in each case by taking averages of $\alpha^*_n$ and $\beta^*_n$ over 100 iterations. The values of $\alpha^*_n$ and $\beta^*_n$ show very little variation, and are very close respectively to $\alpha^*$ and $\beta^*$ in every iteration. The first 1000 observations are discarded from the initial samples of size 3,000 to remove the start-up effect. The results are invariant with respect to the value of constant term in the GARCH(1,0) models.


Table 3. Simulated values of $\alpha^*$ and $\beta^*$ for the GARCH(1,0)-X model ($\alpha_0 = 0$)

Innovation distributions for the three column groups, left to right: $\varepsilon_t \sim$ iid $N(0,1)$; $\varepsilon_t \sim$ iid $U(-\sqrt{3}, \sqrt{3})$; $\varepsilon_t \sim$ iid $(5/3)^{-1/2}t_5$.

$\beta_0$    $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$   $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$   $\alpha^*$  $\beta^*$  $\alpha^*+\beta^*$
0.1   0.00  1.00  1.00     0.00  1.00  1.00     0.00  1.00  1.00
0.2   0.00  1.00  1.00     0.00  1.00  1.00     0.01  1.00  1.00
0.3   0.00  1.00  1.00     0.00  1.00  1.00     0.00  1.00  1.00
0.4   0.00  1.00  1.00     0.00  1.00  1.00     0.00  1.00  1.00
0.5   0.00  1.00  1.00     0.00  1.00  1.00     0.00  1.00  1.00
0.6   0.00  1.00  1.00     0.00  1.00  1.00     0.00  1.00  1.00

Notes: For the results in the Table, the GARCH(1,0) processes with the coefficients $\beta_0 = 0.1 + 0.1i$ for $i = 0, 1, 2, \ldots, 5$ are generated and fitted into the GARCH(1,1) model without constant to obtain the MLE $\alpha^*_n$ and $\beta^*_n$ from the samples of size 2,000. The reported values of $\alpha^*$ and $\beta^*$ are then obtained in each case by taking averages of $\alpha^*_n$ and $\beta^*_n$ over 100 iterations. The values of $\alpha^*_n$ and $\beta^*_n$ show very little variation, and are very close respectively to $\alpha^*$ and $\beta^*$ in every iteration. The first 1000 observations are discarded from the initial samples of size 3,000 to remove the start-up effect. The results are invariant with respect to the value of constant term in the ARCH models.
