Variation and Efficiency of High-Frequency Betas∗

Congshan Zhang,† Jia Li,‡ Viktor Todorov,§ and George Tauchen¶

April 8, 2019

Abstract

This paper studies the efficient estimation of betas from high-frequency return data on a fixed time interval. Under an assumption of equal diffusive and jump betas, we derive the semiparametric efficiency bound for estimating the common beta and develop an adaptive estimator that attains the efficiency bound. We further propose a Hausman-type test for deciding whether the common beta assumption is true from the high-frequency data. In our empirical analysis we provide examples of stocks and time periods for which a common market beta assumption appears true and ones for which this is not the case. We further quantify empirically the gains from the efficient common beta estimation developed in the paper.

Keywords: adaptive estimation; beta; high-frequency data; jump; semiparametric efficiency; volatility.

JEL Classification: C51, C52, G12.

∗ Todorov's research is partially supported by NSF grant SES-1530748.

† Department of Economics, Duke University, Durham, NC 27708; e-mail: [email protected].

‡ Department of Economics, Duke University, Durham, NC 27708; e-mail: [email protected].

§ Department of Finance, Kellogg School of Management, Northwestern University, Evanston, IL 60208; e-mail: [email protected].

¶ Department of Economics, Duke University, Durham, NC 27708; e-mail: [email protected].


1 Introduction

In this paper we study the problem of efficient estimation of asset betas from high-frequency return data on a fixed time interval. The estimation of factor loadings (betas) is of central importance for practical risk management. It is also a critical step in evaluating the ability of asset pricing models to explain the cross-sectional behavior of asset prices.¹ If betas stay constant over a (short) period of time, then one can use high-frequency records of the factors and the asset prices within the time interval to estimate the betas. The goal of this paper is to derive efficient methods for doing so when betas associated with continuous and discontinuous moves in the factors are restricted to be the same, and to further develop tests for this assumption. More specifically, we are interested in the following continuous-time regression:

$$Y_t = B_t + \beta^c Z^c_t + \beta^d Z^d_t + \varepsilon^c_t + \varepsilon^d_t, \quad t \in [0, T], \tag{1}$$

where $Y$ is the asset price, $B$ is a continuous process of finite variation (i.e., the drift), $Z^c$ and $Z^d$ are the continuous and the discontinuous local martingale components of the discretely observed factor $Z$, and $\varepsilon^c$ and $\varepsilon^d$ are the residual continuous and discontinuous martingale components of $Y$, which are orthogonal to $Z^c$ and $Z^d$ in a martingale sense; that is, their quadratic covariations with the factors are zero (this is formally defined in the main text).

There has been a lot of work on the estimation of betas using high-frequency data. Most of the existing work concerns the estimation of the diffusive beta, $\beta^c$, with jumps either not allowed in the model setup or the inference being robust to their presence; see, for example, Bollerslev and Zhang (2003), Barndorff-Nielsen and Shephard (2004a), Andersen et al. (2006), Mykland and Zhang (2006, 2009), Todorov and Bollerslev (2010), Gobbi and Mancini (2012), Patton and Verardo (2012), Kalnina (2012) and Li et al. (2017a), among many others. In particular, Li et al. (2017a) propose an adaptive estimator that attains the semiparametric efficiency bound for recovering $\beta^c$. There is much less work on the estimation of the jump beta, $\beta^d$. Todorov and Bollerslev (2010) propose estimators based on higher-order power variation, while Li et al. (2017b) consider semiparametrically efficient estimation based on estimates of the jump (co)variation, optimally weighting the information in the detected jumps in order to account for the heteroskedasticity present in the data. In contrast to the adaptive estimation result for the diffusive beta, the efficient estimation of the jump beta is shown to be generally not adaptive with respect to the sizes of jumps in $Z$.

In the existing work to date, the discontinuous (resp. continuous) part of the asset price is typically treated as a nuisance component when estimating the continuous (resp. discontinuous) beta. In this paper, we study a "common beta" restriction $\beta^c \equiv \beta^d$ and explore the efficiency gains such an assumption may provide for making inference for asset betas. This is particularly relevant for the study of how stocks react to "big" moves in the factor (i.e., for the jump beta). By their nature, jumps are rare events and, hence, "borrowing" information from the way the stock and the risk factor co-move for the more frequent returns of "small" size can provide nontrivial efficiency gains.

¹ Indeed, the error in recovering betas can carry over to the second-stage cross-sectional regressions of the Fama-MacBeth procedures for estimating risk premia; see, for example, Shanken (1992), Jagannathan and Wang (1998), Kan and Zhang (1999), Gospodinov et al. (2009) and Kleibergen (2009).


If the common beta restriction is satisfied, then we show in this paper that the beta coefficient can be estimated adaptively with respect to the nonparametric nuisance components in the model. More specifically, we propose a (feasible) estimator that is as efficient as the infeasible estimator that would be obtained if one could counterfactually observe the nonparametric nuisances, including the drift and the latent volatility and jumps of the factor process. Our efficiency result is related to the one in Li et al. (2017a) for estimating the constant diffusive beta. But, unlike that prior work, in the current paper we exploit the information content not only in the diffusive moves but also in the jumps of the assets and the risk factors. To show the adaptiveness of the proposed estimator, we establish the local asymptotic mixed normality (LAMN) property in the aforementioned infeasible model, which then allows us to derive the Cramér–Rao efficiency bound for estimating the common beta by using the convolution theorem. We then show that the bound can be attained by the proposed estimator. Hence, a fortiori, the efficiency bound is sharp and the proposed estimator is adaptive (consequently, semiparametrically efficient). This adaptiveness result in the common beta model is theoretically interesting (and somewhat surprising), particularly because the semiparametrically efficient estimator of the jump beta is generally not adaptive to the sizes of jumps in $Z$. But we show that adaptiveness is "recovered" under the common beta restriction.

The adaptive estimator belongs to a class of weighted estimators for the (common) beta coefficient, and is constructed by optimally choosing the weight functions for both the continuous and jump returns, taking into consideration the heteroskedasticity in the data. Under general regularity conditions, we derive a feasible limit theory for this class of estimators. This class also nests various existing estimators through appropriate choices of the weight functions. For example, the "realized regression" beta estimator of Barndorff-Nielsen and Shephard (2004a) (see also Todorov and Bollerslev (2010) and Gobbi and Mancini (2012)), the block-based estimator of Mykland and Zhang (2009), and the optimally weighted diffusive beta estimator and the optimally weighted jump beta estimator in Li et al. (2017a,b) all fall into this framework as special cases.

Since the efficiency gains of our optimally weighted estimator rely crucially on the common beta assumption, it is important to test whether such a restriction holds. We design a Hausman-type specification test (Hausman (1978)), for which the test statistic is built upon two estimators: one being asymptotically efficient under the null but inconsistent under the alternative, the other being consistent under both the null and the alternative. Our test has power against constant diffusive beta with general forms of nonlinearity in the jump dependence, or constant jump beta with general forms of nonlinearity in the diffusive dependence.

We document satisfactory performance of the inference techniques developed in the paper on simulated data from realistically calibrated models. In an empirical application, we test for a common quarterly market beta and further assess the efficiency gains resulting from exploiting such a restriction, using data for three representative stocks covering the period 2007-2014. For two of the stocks in our study, we find nontrivial evidence for the jump beta being higher than the diffusive one. For the third stock, on the other hand, we find that for 65% of the quarterly periods in our sample a common beta assumption cannot be rejected. Our results further show that when a common beta seems statistically plausible, the efficiency gain from exploiting this assumption is around 6% on average over the efficient diffusive beta estimator and around 50% over that of Barndorff-Nielsen and Shephard (2004a).

The rest of the paper is organized as follows. Section 2 presents the formal setup. Section 3 introduces our inference techniques and establishes the adaptive estimation result for the common beta. In Section 4, we propose a Hausman-type specification test for common beta based on the efficient estimator. Section 5 contains a Monte Carlo study and Section 6 provides an empirical illustration. We conclude in Section 7. All proofs are given in Section 8.

2 Setup

We start by introducing some notation that will be used throughout. We denote by $\xrightarrow{\mathbb{P}}$, $\xrightarrow{\mathcal{L}}$ and $\xrightarrow{\mathcal{L}\text{-}s}$ convergence in probability, convergence in law and stable convergence in law, respectively. All limits are for $n \to \infty$, and the asymptotics are of infill type on a fixed time interval $[0, T]$.

The set of real numbers is $\mathbb{R}$ and $\mathbb{R}^* = \mathbb{R} \setminus \{0\}$. We denote by $\mathcal{M}_d$ the space of all $d \times d$ positive semidefinite matrices. The Euclidean norm on a finite-dimensional vector space is $\|\cdot\|$. The integer part of $x \in \mathbb{R}$ is $[x]$. We write $x \wedge y$ for the smaller of $x$ and $y$. For a matrix $A$, its transpose is denoted by $A^\top$ and its $(j, k)$ element by $A_{jk}$, while $\mathrm{vec}(\cdot)$ is the column vectorization operator. For matrix differentiation, if $f$ is a generic differentiable function defined on $\mathcal{M}_d$, then $\partial_{jk} f(A) \equiv \partial f(A)/\partial A_{jk}$ and $\partial^2_{jk,lm} f(A) \equiv \partial^2 f(A)/\partial A_{jk}\partial A_{lm}$. Finally, we write $a_n \asymp b_n$ if for some constant $C \ge 1$, we have $a_n/C \le b_n \le C a_n$.

2.1 The underlying processes

We start with some regularity conditions for the processes $Z$ and $Y$. These processes are defined on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \ge 0}, \mathbb{P})$. We denote $X = (Z, Y)^\top$ and assume that $X$ is an Itô semimartingale of the form

$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + J_t, \qquad J_t \equiv \sum_{s \le t} \Delta X_s, \tag{2}$$

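where $b_t$ takes values in $\mathbb{R}^2$, the volatility process $\sigma_t$ takes values in $\mathcal{M}_2$ and $W$ is a 2-dimensional standard Brownian motion. The jump of $X$ at time $t$ is denoted by $\Delta X_t \equiv X_t - X_{t-}$, where $X_{t-} \equiv \lim_{s \uparrow t} X_s$. The process $J_t = (J_{Z,t}, J_{Y,t})^\top$ can be alternatively written as $\int_0^t \int_{\mathbb{R}} \delta(\omega, s, u)\,\mu(ds, du)$, where $\delta : \Omega \times \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}^2$ is a predictable function and $\mu$ is a Poisson random measure on $\mathbb{R}_+ \times \mathbb{R}$ with deterministic compensator $\nu(dt, du) = dt \otimes \lambda(du)$ for some $\sigma$-finite measure $\lambda$ on $\mathbb{R}$. The spot covariance matrix of $X$ at time $t$ is defined by

$$c_t \equiv \sigma_t \sigma_t^\top = \begin{pmatrix} c_{ZZ,t} & c_{ZY,t} \\ c_{ZY,t} & c_{YY,t} \end{pmatrix}. \tag{3}$$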


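Assumption 1 (a) The process $b$ is locally bounded; (b) $\nu([0, T] \times \mathbb{R}) < \infty$; (c) $c_t$ is nonsingular for $t \in [0, T]$ and it is an Itô semimartingale of the form

$$\begin{aligned}
\mathrm{vec}(c_t) = \mathrm{vec}(c_0) &+ \int_0^t \tilde b_s\,ds + \int_0^t \tilde\sigma_s\,d\widetilde W_s \\
&+ \int_0^t \int_{\mathbb{R}} \tilde\delta(s, z)\,1_{\{\|\tilde\delta(s,z)\| \le 1\}}\,(\tilde\mu - \tilde\nu)(ds, dz) \\
&+ \int_0^t \int_{\mathbb{R}} \tilde\delta(s, z)\,1_{\{\|\tilde\delta(s,z)\| > 1\}}\,\tilde\mu(ds, dz),
\end{aligned} \tag{4}$$

where the processes $\tilde b$ and $\tilde\sigma$ are locally bounded and take values respectively in $\mathbb{R}^4$ and $\mathbb{R}^{4 \otimes 4}$, $\widetilde W$ is a 4-dimensional Brownian motion that may depend on $W$, $\tilde\delta : \Omega \times \mathbb{R}_+ \times \mathbb{R} \to \mathbb{R}^4$ is a predictable function and $\tilde\mu$ is a Poisson random measure with compensator $\tilde\nu$ of the form $\tilde\nu(dt, du) = dt \otimes \tilde\lambda(du)$ for some $\sigma$-finite measure $\tilde\lambda$. Moreover, there exist a sequence of stopping times $(T_m)_{m \ge 1}$ increasing to infinity and $\tilde\lambda$-integrable functions $(\Gamma_m)_{m \ge 1}$, such that $\|\tilde\delta(\omega, t, u)\|^2 \wedge 1 \le \Gamma_m(u)$ for all $\omega \in \Omega$, $t \le T_m$ and $u \in \mathbb{R}$.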

Assumption 1 is quite standard for the analysis of high-frequency data. We note that part (b) requires the jumps in $X$ to be of finite activity. In the context of the current paper, we extract jump information only through "large" enough jumps in our analysis on efficient estimation; we thus impose this assumption to simplify the exposition, although it may be further relaxed. Assumption 1(c) allows for the so-called "leverage effect," that is, correlation between $W$ and $\widetilde W$. Moreover, we allow for volatility jumps and do not restrict their activity and dependence with the price jumps.

2.2 The common beta model

We now set up the continuous-time regression model. Let $Y$ be decomposed as in equation (1). The orthogonality conditions between the diffusive and jump components in $Y$ can be formally stated as

$$\langle Z^c, \varepsilon^c \rangle_t = 0, \qquad [Z^d, \varepsilon^d]_t = \sum_{s \le t} \Delta Z^d_s\,\Delta \varepsilon^d_s = 0, \quad \text{for } t \in [0, T], \tag{5}$$

where $\langle \cdot, \cdot \rangle$ denotes the quadratic covariation of two continuous local martingales and $[\cdot, \cdot]$ denotes the quadratic covariation in general. Under the "common beta" assumption, namely $\beta^c = \beta^d$, we can write (1) more succinctly as

$$Y_t = B_t + \beta (Z^c_t + Z^d_t) + (\varepsilon^c_t + \varepsilon^d_t), \tag{6}$$

where the common beta $\beta$ is the parameter of interest and the residual $\varepsilon_t = \varepsilon^c_t + \varepsilon^d_t$ is a jump-diffusion process that is orthogonal to $(Z_t)_{t \ge 0}$, namely,

$$[Z, \varepsilon]_t = 0, \quad \text{for } t \in [0, T]. \tag{7}$$

Equivalently, we can write the continuous-time regression model in (6) more explicitly as

$$\begin{aligned}
dZ_t &= b_{Z,t}\,dt + \sqrt{c_{ZZ,t}}\,dW_{Z,t} + dJ_{Z,t}, \\
dY_t &= b_{Y,t}\,dt + \beta \sqrt{c_{ZZ,t}}\,dW_{Z,t} + \sqrt{\varsigma_t}\,dW_{\varepsilon,t} + \beta\,dJ_{Z,t} + dJ_{\varepsilon,t},
\end{aligned} \tag{8}$$


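where $W_Z$ and $W_\varepsilon$ are independent univariate Brownian motions, $J_\varepsilon$ is a pure jump process that contains the $Y$-specific jumps, and $\varsigma$ is the spot idiosyncratic variance of $Y$. It is easy to see that the processes in (2) and (8) are connected in the following way:

$$b_t = \begin{pmatrix} b_{Z,t} \\ b_{Y,t} \end{pmatrix}, \quad
\sigma_t = \begin{pmatrix} \sqrt{c_{ZZ,t}} & 0 \\ \beta \sqrt{c_{ZZ,t}} & \sqrt{\varsigma_t} \end{pmatrix}, \quad
W_t = \begin{pmatrix} W_{Z,t} \\ W_{\varepsilon,t} \end{pmatrix}, \quad
J_t = \begin{pmatrix} J_{Z,t} \\ \beta J_{Z,t} + J_{\varepsilon,t} \end{pmatrix}. \tag{9}$$

Moreover, a simple calculation implies $\beta = c_{ZY,t}/c_{ZZ,t}$, and the spot idiosyncratic variance $\varsigma_t$ is connected with $c_t$ via

$$\varsigma_t \equiv c_{YY,t} - c^2_{ZY,t}/c_{ZZ,t}. \tag{10}$$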

This model can be trivially extended to the setting where $Z$ is multidimensional, at the cost of more cumbersome notation.
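The link between (9) and (10) can be checked numerically. The following is a minimal sketch, not part of the original analysis: the function names and the numerical values for $\beta$, $c_{ZZ}$ and $\varsigma$ are illustrative assumptions. It builds $\sigma_t$ as in (9), forms $c_t = \sigma_t\sigma_t^\top$, and recovers $\beta$ and $\varsigma_t$ from the entries of $c_t$.

```python
import numpy as np

def covariance_from_common_beta(beta, c_zz, varsigma):
    """Build sigma as in (9) and return c = sigma sigma^T as in (3)."""
    sigma = np.array([[np.sqrt(c_zz), 0.0],
                      [beta * np.sqrt(c_zz), np.sqrt(varsigma)]])
    return sigma @ sigma.T

def beta_and_idio_variance(c):
    """Recover beta = c_ZY / c_ZZ and varsigma as in (10)."""
    beta = c[0, 1] / c[0, 0]
    varsigma = c[1, 1] - c[0, 1] ** 2 / c[0, 0]
    return beta, varsigma

# Illustrative (hypothetical) values, not taken from the paper.
c = covariance_from_common_beta(beta=1.2, c_zz=0.04, varsigma=0.01)
beta, varsigma = beta_and_idio_variance(c)
```

The round trip returns the inputs up to floating-point error, which is the "simple calculation" referred to in the text.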

3 Adaptive estimation of the common beta

3.1 Asymptotic properties of weighted estimators

In this subsection, we introduce a class of weighted estimators for the common beta and establish their asymptotic properties. The issue of efficiency and adaptiveness will be addressed in Section 3.2 below. We denote the true value of the common beta by $\beta_0$ and use $\beta$ as a generic reference for it.

Our inference is done on the basis of high-frequency data on the fixed interval $[0, T]$. More specifically, the vector $X$ is discretely observed at times $i\Delta_n$, for $1 \le i \le [T/\Delta_n]$, with a discretization mesh $\Delta_n \equiv 1/n$ going to 0 asymptotically. The increments of $X$ are denoted by

$$\Delta^n_i X \equiv X_{i\Delta_n} - X_{(i-1)\Delta_n}, \quad \text{for } i = 1, 2, \ldots, [T/\Delta_n].$$

Since we are going to exploit the information about $\beta$ from both the diffusive and jump moves, but in distinct ways, we first need to identify high-frequency intervals that contain jumps of $Z$. We collect the jump times of $Z$ in the set $\mathcal{T} \equiv \{\tau_p : p \in \mathcal{P}\}$, where $\mathcal{P} \equiv \{p \ge 1 : \tau_p \le T\}$ denotes the indices of the jumps. Both sets are finite almost surely since $Z$ has finite-activity jumps. To identify jumps from the high-frequency data, we use the (standard) truncation method based on a real sequence $(u_n)_{n \ge 1}$ of truncation thresholds, which satisfies the following assumption:

Assumption 2 For some $\varpi \in (0, 1/2)$, $u_n \asymp \Delta_n^{\varpi}$.

For each $p \in \mathcal{P}$, we denote by $i(p)$ the unique random index $i$ such that $\tau_p \in ((i-1)\Delta_n, i\Delta_n]$. Let $\mathcal{I}_n \equiv \{i : 1 \le i \le [T/\Delta_n],\ |\Delta^n_i Z| > u_n\}$ and $\mathcal{I} \equiv \{i(p) : p \in \mathcal{P}\}$. Then Proposition 1 of Li et al. (2017b) shows that $\mathcal{I}_n = \mathcal{I}$ with probability approaching one. That is, in the limit, we can use $\mathcal{I}_n$ to consistently disentangle jumps from the continuous moves using this truncation methodology.²

² Since this result holds with probability approaching one, the detection error has no effect on the subsequent asymptotic analysis.
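The truncation rule that defines $\mathcal{I}_n$ can be sketched in a few lines. This is an illustrative sketch only: the function name `detect_jumps`, the constant `alpha = 5.0` and the exponent `varpi = 0.49` are assumptions chosen for the example (Assumption 2 only requires $u_n \asymp \Delta_n^{\varpi}$ with $\varpi \in (0, 1/2)$), and the simulated factor has constant unit volatility.

```python
import numpy as np

def detect_jumps(dZ, dt, alpha=5.0, varpi=0.49):
    """Return the 1-based indices i with |Delta_i^n Z| > u_n, and the threshold u_n.

    alpha and varpi are hypothetical tuning choices; the text only requires
    u_n to be of order dt**varpi with varpi in (0, 1/2).
    """
    u_n = alpha * dt ** varpi
    jump_idx = np.flatnonzero(np.abs(dZ) > u_n) + 1
    return jump_idx, u_n

# Simulated factor increments: unit volatility plus one jump of size 0.5
# placed in the 100th increment (1-based).
rng = np.random.default_rng(0)
n = 1000
dt = 1.0 / n
dZ = np.sqrt(dt) * rng.standard_normal(n)
dZ[99] += 0.5
jump_idx, u_n = detect_jumps(dZ, dt)
```

Because the diffusive increments are of order $\Delta_n^{1/2}$ while the threshold shrinks more slowly, the jump increment is flagged and purely diffusive increments are flagged only with small probability.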


For the inference on the common beta, we also need to estimate the path of the spot covariance matrix of $X$. To do this, we pick an integer sequence $(k_n)_{n \ge 1}$ of local window sizes, which eventually goes to infinity, and a sequence of $\mathbb{R}^2$-valued truncation thresholds $(u'_n)_{n \ge 1}$.³ These two sequences satisfy the following condition:

Assumption 3 For some $\rho \in (0, 1)$ and $\varpi \in (0, 1/2)$, $k_n \asymp \Delta_n^{-\rho}$ and $u'_n \asymp \Delta_n^{\varpi}$.

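The spot covariance matrix at time $i\Delta_n$ can then be recovered locally using returns over the time interval $[i\Delta_n, (i + k_n)\Delta_n]$, for $0 \le i \le [T/\Delta_n] - k_n$, through a truncated variation estimator due to Mancini (2001):

$$\hat c_{n,i} \equiv \frac{1}{k_n \Delta_n} \sum_{j=1}^{k_n} \Delta^n_{i+j}X\,(\Delta^n_{i+j}X)^\top\,1_{\{-u'_n \le \Delta^n_{i+j}X \le u'_n\}}. \tag{11}$$

When a jump is detected, we also define a pre-jump covariance estimator in a similar way:

$$\hat c_{n,i-} \equiv \frac{1}{k_n \Delta_n} \sum_{j=0}^{k_n - 1} \Delta^n_{i-k_n+j}X\,(\Delta^n_{i-k_n+j}X)^\top\,1_{\{-u'_n \le \Delta^n_{i-k_n+j}X \le u'_n\}}, \quad \text{for } i \in \mathcal{I}'_n, \tag{12}$$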

where $\mathcal{I}'_n \equiv \{i \in \mathcal{I}_n : k_n + 1 \le i \le [T/\Delta_n] - k_n\}$. We note that $\mathcal{I}'_n$ differs from the index set $\mathcal{I}_n$ only in that it excludes the boundary terms that are needed for the spot estimation. This difference is asymptotically negligible for the subsequent analysis.
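The index bookkeeping in (11) and (12) is easy to get wrong, so a small sketch may help. It is written under stated assumptions: increments are stored in a zero-based array `dX` whose row `m` holds $\Delta^n_{m+1}X$, a single scalar threshold `u` is used for both components, and the function names are hypothetical.

```python
import numpy as np

def truncated_cov(window, dt, u):
    """Truncated realized covariance over a window, normalized by k_n * dt."""
    keep = np.all(np.abs(window) <= u, axis=1)   # componentwise truncation
    kept = window[keep]
    return kept.T @ kept / (len(window) * dt)

def spot_cov_post(dX, i, kn, dt, u):
    """Estimator (11): uses increments i+1, ..., i+kn (1-based)."""
    return truncated_cov(dX[i:i + kn], dt, u)

def spot_cov_pre(dX, i, kn, dt, u):
    """Estimator (12): uses increments i-kn, ..., i-1 (1-based), so i >= kn + 1."""
    return truncated_cov(dX[i - kn - 1:i - 1], dt, u)

# Deterministic toy data: every increment equals (0.1, 0.05), except a
# "jump" of (1.0, 1.0) in increment i = 11 (1-based).
dt, kn, u = 0.01, 5, 0.3
dX = np.tile([0.1, 0.05], (20, 1))
dX[10] = [1.0, 1.0]
pre = spot_cov_pre(dX, 11, kn, dt, u)         # window ends just before the jump
post = spot_cov_post(dX, 11, kn, dt, u)       # window starts just after the jump
straddling = spot_cov_post(dX, 8, kn, dt, u)  # window contains the jump increment
```

Neither the pre-jump nor the post-jump window around index 11 contains the jump increment itself. A window that does contain it drops that increment through the truncation indicator while still dividing by $k_n\Delta_n$.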

We are now ready to construct a class of weighted estimators for the common beta. It turns out that, in order to obtain the (efficient) adaptive estimator, we need to weight diffusive and jump returns in distinct ways. Hence, we consider two weight functions $v : \mathcal{M}_2 \mapsto (0, \infty)$ and $w : \mathcal{M}_2 \times \mathcal{M}_2 \times \mathbb{R} \mapsto (0, \infty)$ for the diffusive and jump parts, respectively, such that the following assumption holds:

Assumption 4 (a) There is a sequence $(T_m)_{m \ge 1}$ of stopping times increasing to infinity and a sequence of convex compact subsets $\mathcal{K}_m \subset \mathcal{M}_2$ such that $c_t \in \mathcal{K}_m$ for $t \le T_m$ and $v$ is three times continuously differentiable on an $\epsilon$-enlargement about $\mathcal{K}_m$ for some $\epsilon > 0$; (b) $w$ is continuous at $(c_-, c, \beta_0)$ for any $c_-, c \in \mathcal{M}_2$.⁴

Assumption 4(a) is used to relax the polynomial growth condition on test functions imposed by Jacod and Rosenbaum (2013) for estimating integrated volatility functionals, and is easy to verify in typical applications. Assumption 4(b) is needed for a continuous mapping argument.

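For any twice continuously differentiable function $g(\cdot)$, we set $S(g) \equiv \int_0^T g(c_s)\,ds$ and define its bias-corrected estimator as

$$\hat S_n(g) \equiv \Delta_n \sum_{i=0}^{[T/\Delta_n] - k_n} \left( g(\hat c_{n,i}) - \frac{1}{k_n} Bg(\hat c_{n,i}) \right), \tag{13}$$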

³ The truncation sequences $u_n$ and $u'_n$ satisfy the same assumptions. Theoretically speaking, one could set the components of $u'_n$ simply to be $u_n$. By distinguishing these two sequences, we allow the truncation used in spot volatility estimation to be different from that for detecting jumps, and (possibly) different across assets.

⁴ The $\epsilon$-enlargement of $\mathcal{K}_m$ is defined as the collection of points whose Euclidean distance from the set $\mathcal{K}_m$ is less than $\epsilon$.


where the function $Bg(\cdot)$ in the bias-correction term is defined as

$$Bg(c) \equiv \frac{1}{2} \sum_{j,k,l,m=1}^{2} \partial^2_{jk,lm} g(c)\,(c_{jl} c_{km} + c_{jm} c_{kl}). \tag{14}$$

This bias-correction term is needed for correcting a nonlinearity bias in the estimation of integrated volatility functionals as in, for example, Jacod and Rosenbaum (2013).
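To make (14) concrete, the sketch below evaluates $Bg(c)$ by central finite differences for two functions that are used later in the paper, $c_{ZZ}/\varsigma$ and $c_{ZY}/\varsigma$ with $\varsigma = c_{YY} - c_{ZY}^2/c_{ZZ}$, and compares the result with $3g(c)$. This is an illustrative numerical check, not the authors' derivation: the function names, the step size `h` and the test matrix are assumptions. The functions read only the $(1,2)$ entry of the matrix for $c_{ZY}$; since the weights in (14) are symmetric in the matrix indices, the value of $Bg$ at a symmetric $c$ does not depend on how $g$ is extended to nonsymmetric arguments.

```python
import numpy as np

def bias_correction(g, c, h=1e-4):
    """Numerically evaluate Bg(c) in (14) with central finite differences."""
    c = np.asarray(c, dtype=float)
    d = c.shape[0]
    entries = [(j, k) for j in range(d) for k in range(d)]
    total = 0.0
    for j, k in entries:
        for l, m in entries:
            e1 = np.zeros_like(c)
            e2 = np.zeros_like(c)
            e1[j, k] = h
            e2[l, m] = h
            # Second partial derivative of g w.r.t. entries (j,k) and (l,m).
            d2 = (g(c + e1 + e2) - g(c + e1 - e2)
                  - g(c - e1 + e2) + g(c - e1 - e2)) / (4.0 * h * h)
            total += d2 * (c[j, l] * c[k, m] + c[j, m] * c[k, l])
    return 0.5 * total

def idio_var(a):
    return a[1, 1] - a[0, 1] ** 2 / a[0, 0]

def v_star(a):          # c_ZZ / varsigma
    return a[0, 0] / idio_var(a)

def v_star_gb(a):       # c_ZY / varsigma
    return a[0, 1] / idio_var(a)

c = np.array([[1.0, 0.5], [0.5, 2.0]])   # hypothetical spot covariance
ratio_v = bias_correction(v_star, c) / v_star(c)
ratio_vg = bias_correction(v_star_gb, c) / v_star_gb(c)
```

Both ratios come out numerically close to 3, so that $g - Bg/k_n = (1 - 3/k_n)\,g$ for these two functions.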

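We consider a class of weighted estimators for the common beta constructed as

$$\hat\beta_n(v, w) \equiv \frac{\hat S_n(v g_b) + \sum_{i \in \mathcal{I}'_n} w(\hat c_{n,i-}, \hat c_{n,i}, \tilde\beta_n)\,\Delta^n_i Z\,\Delta^n_i Y}{\hat S_n(v) + \sum_{i \in \mathcal{I}'_n} w(\hat c_{n,i-}, \hat c_{n,i}, \tilde\beta_n)\,(\Delta^n_i Z)^2}, \tag{15}$$

where the function $g_b(c) \equiv c_{ZY}/c_{ZZ}$ and $\tilde\beta_n \equiv \sum_{i \in \mathcal{I}'_n} \Delta^n_i Z\,\Delta^n_i Y \big/ \sum_{i \in \mathcal{I}'_n} (\Delta^n_i Z)^2$ is a pilot estimator of $\beta$ using only jump returns. The estimator $\hat\beta_n(v, w)$ essentially combines the weighted diffusive beta of Li et al. (2017a)

$$\hat\beta^c_n(v) \equiv \frac{\hat S_n(v g_b)}{\hat S_n(v)}, \tag{16}$$

and the weighted jump beta of Li et al. (2017b)

$$\hat\beta^J_n(w) \equiv \frac{\sum_{i \in \mathcal{I}'_n} w(\hat c_{n,i-}, \hat c_{n,i}, \tilde\beta_n)\,\Delta^n_i Z\,\Delta^n_i Y}{\sum_{i \in \mathcal{I}'_n} w(\hat c_{n,i-}, \hat c_{n,i}, \tilde\beta_n)\,(\Delta^n_i Z)^2}. \tag{17}$$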

While the optimally weighted $\hat\beta^c_n$ and $\hat\beta^J_n$ have been shown in that prior work to be semiparametrically efficient for the diffusive beta and the jump beta, respectively, it is intuitively clear that they cannot be efficient when the common beta assumption holds, in that neither of them exploits the full information from all observed data.⁵

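We now describe precisely the optimally weighted estimator for the common beta, which is actually adaptive as shown in Section 3.2 below. This adaptive estimator is given by

$$\hat\beta^\star_n \equiv \frac{\left(1 - \frac{3}{k_n}\right) \Delta_n \sum_{i=0}^{[T/\Delta_n] - k_n} \frac{\hat c_{ZY,n,i}}{\hat\varsigma_{n,i}} + \sum_{i \in \mathcal{I}'_n} \frac{\Delta^n_i Z\,\Delta^n_i Y}{\hat\varsigma_{n,i}}}{\left(1 - \frac{3}{k_n}\right) \Delta_n \sum_{i=0}^{[T/\Delta_n] - k_n} \frac{\hat c_{ZZ,n,i}}{\hat\varsigma_{n,i}} + \sum_{i \in \mathcal{I}'_n} \frac{(\Delta^n_i Z)^2}{\hat\varsigma_{n,i}}}, \tag{18}$$

where $\hat\varsigma_{n,i} \equiv \hat c_{YY,n,i} - \hat c^2_{ZY,n,i}/\hat c_{ZZ,n,i}$ is the spot estimator for the diffusive idiosyncratic variance $\varsigma_{i\Delta_n}$ of $Y$ (recall the definition (10)). Note that $\hat\beta^\star_n$ is a special case of $\hat\beta_n(v, w)$, corresponding to the weight functions⁶

$$v^\star(c) = \frac{c_{ZZ}}{c_{YY} - c^2_{ZY}/c_{ZZ}}, \qquad w^\star(c_-, c, \beta) = \frac{1}{c_{YY} - c^2_{ZY}/c_{ZZ}}.$$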

⁵ Although the number of jump returns is, asymptotically speaking, "much smaller" than the number of diffusive returns, the information content of the former is not negligible relative to the latter; this is because the jump returns carry a "much higher" signal-to-noise ratio compared with the diffusive ones at high frequency.

⁶ We note that the $3/k_n$ factor in the definition of $\hat\beta^\star_n$ corresponds to the bias-correction term $Bg(\cdot)/k_n$, and is obtained from direct calculation.
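The adaptive estimator in (18) can be assembled from the pieces introduced so far. What follows is a minimal sketch under strong simplifying assumptions, not the authors' implementation: the simulated data have constant volatility, two jumps of fixed size in $Z$ and no idiosyncratic jumps; a single scalar threshold is used both for jump detection and for spot estimation; and the tuning constants (`kn = 30`, threshold $5\Delta_n^{0.49}$) and function names are hypothetical choices.

```python
import numpy as np

def spot_cov(dX, kn, dt, u):
    """Truncated spot covariances as in (11): entry i uses increments i+1..i+kn."""
    keep = np.all(np.abs(dX) <= u, axis=1)                 # componentwise truncation
    outer = dX[:, :, None] * dX[:, None, :] * keep[:, None, None]
    csum = np.concatenate([np.zeros((1, 2, 2)), np.cumsum(outer, axis=0)])
    return (csum[kn:] - csum[:-kn]) / (kn * dt)            # shape (N - kn + 1, 2, 2)

def adaptive_beta(dX, kn, dt, u_jump, u_spot):
    """Return the estimate in (18) and the variance estimate 1 / denominator."""
    N = dX.shape[0]
    c = spot_cov(dX, kn, dt, u_spot)
    c_zz, c_zy, c_yy = c[:, 0, 0], c[:, 0, 1], c[:, 1, 1]
    varsigma = c_yy - c_zy ** 2 / c_zz                     # spot idiosyncratic variance
    corr = 1.0 - 3.0 / kn                                  # bias-correction factor
    num = corr * dt * np.sum(c_zy / varsigma)
    den = corr * dt * np.sum(c_zz / varsigma)
    i = np.flatnonzero(np.abs(dX[:, 0]) > u_jump) + 1      # 1-based detected jump indices
    i = i[(i >= kn + 1) & (i <= N - kn)]                   # drop boundary indices
    dz, dy = dX[i - 1, 0], dX[i - 1, 1]
    num += np.sum(dz * dy / varsigma[i])
    den += np.sum(dz ** 2 / varsigma[i])
    return num / den, 1.0 / den

# Hypothetical design: beta0 = 1, c_ZZ = 1, varsigma = 0.5, two jumps in Z.
rng = np.random.default_rng(0)
N, T, beta0, idio_var = 2000, 1.0, 1.0, 0.5
dt = T / N
dZ = np.sqrt(dt) * rng.standard_normal(N)
dZ[[500, 1400]] += [0.5, -0.4]
dY = beta0 * dZ + np.sqrt(idio_var * dt) * rng.standard_normal(N)
dX = np.column_stack([dZ, dY])
u = 5.0 * dt ** 0.49
beta_hat, v_hat = adaptive_beta(dX, kn=30, dt=dt, u_jump=u, u_spot=u)
std_err = np.sqrt(dt * v_hat)                              # feasible standard error
```

The second return value is the reciprocal of the denominator of (18), so `np.sqrt(dt * v_hat)` gives a feasible standard error on the $\Delta_n^{1/2}$ scale.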


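Theorem 1, below, characterizes the asymptotic properties of the general weighted estimator $\hat\beta_n(v, w)$ and shows the optimality of the adaptive estimator $\hat\beta^\star_n$ within this class. To simplify notation, we set

$$Q_{n,ZZ}(w) \equiv \sum_{i \in \mathcal{I}'_n} w(\hat c_{n,i-}, \hat c_{n,i}, \tilde\beta_n)\,(\Delta^n_i Z)^2, \qquad Q_{ZZ}(w) \equiv \sum_{\tau \in \mathcal{T}} w(c_{\tau-}, c_\tau, \beta_0)\,(\Delta Z_\tau)^2.$$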

We also assume the following.

Assumption 5 ς and Z do not jump at the same times almost surely.

Assumption 5 says that the idiosyncratic variance $\varsigma$ of the individual stock price $Y$ does not co-jump with the factor $Z$. Although the asymptotic distribution of $\hat\beta_n(v, w)$ can be derived without this assumption, we impose it nonetheless so that the limiting distribution of $\hat\beta_n(v, w)$ is $\mathcal{F}$-conditionally mixed Gaussian; otherwise, the limiting distribution would be "doubly" mixed Gaussian, with an extra layer of mixing due to the indeterminacy of the exact jump time within the corresponding sampling interval. We note that this assumption does not rule out co-jumps between the prices and volatilities of $Y$ and $Z$, as it only concerns the idiosyncratic variance of $Y$.

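Theorem 1 Suppose Assumptions 1–5 hold and $k_n \asymp \Delta_n^{-\rho}$ such that $\rho \in (1/3, 1/2)$ and $(1 - \rho)/2 \le \varpi \le 1/2$. Then,

(a) $\Delta_n^{-1/2}(\hat\beta_n(v, w) - \beta_0) \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}(0, \Sigma(v, w))$, where

$$\Sigma(v, w) \equiv \frac{\int_0^T v(c_s)^2\,\frac{\varsigma_s}{c_{ZZ,s}}\,ds + \sum_{\tau \in \mathcal{T}} w(c_{\tau-}, c_\tau, \beta_0)^2\,(\Delta Z_\tau)^2\,\varsigma_\tau}{(S(v) + Q_{ZZ}(w))^2};$$

(b) $\Delta_n^{-1/2}(\hat\beta^\star_n - \beta_0) \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}(0, V^\star)$, where

$$V^\star \equiv \left( \int_0^T \frac{d[Z, Z]_s}{\varsigma_s} \right)^{-1};$$

moreover, $V^\star \le \Sigma(v, w)$ and the equality is attained if and only if $v(c) = \varphi\,c_{ZZ}/\varsigma$ and $w(c_\tau) = \varphi/\varsigma_\tau$ for all $\tau \in \mathcal{T}$ and some constant $\varphi > 0$;

(c) $V^\star$ can be consistently estimated by $\hat V^\star_n \equiv (\hat S_n(v^\star) + Q_{n,ZZ}(w^\star))^{-1}$.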

Comments. (i) Part (a) of Theorem 1 establishes the asymptotic mixed normality of $\hat\beta_n(v, w)$. This result can be regarded as a generalization of both the limit theory for the diffusive beta estimator presented in Proposition 1 of Li et al. (2017a) and that for the jump beta estimator shown in Theorem 2 of Li et al. (2017b).

(ii) More importantly, part (b) shows that $\hat\beta^\star_n$ attains the smallest asymptotic variance among all weighted estimators. In particular, the asymptotic variance $V^\star$ is strictly smaller than both the semiparametric efficiency bound for estimating the constant jump beta in Li et al. (2017b) (even for the adaptive case where the diffusive beta and jump beta are the same at the jump times) and the adaptive bound for estimating the constant diffusive beta in Li et al. (2017a). From here, we see clearly that exploiting both the diffusive and the jump dependence will provide an efficiency gain. In Section 3.2, below, we show that $V^\star$ is indeed the adaptive efficiency bound.

(iii) Part (c) shows that $\hat V^\star_n$ is a consistent estimator for $V^\star$, which can be used to make feasible inference.

(iv) In general, if the common beta assumption does not hold, then the estimator $\hat\beta^\star_n$ converges to a weighted mixture of the diffusive and jump betas; see Lemma 3 in the appendix for details.

3.2 Efficiency bound and adaptive estimation

In this subsection, we show that the optimally weighted estimator $\hat\beta^\star_n$ is indeed adaptive. We proceed as follows. We first derive the efficiency bound for estimating $\beta$ in a parametric submodel of (6) in which the only unknown parameter is $\beta$, whereas the nonparametric nuisance components $(b, c_{ZZ}, \varsigma, J_Z, \varepsilon)$ are observed. In other words, we augment the original data with the observation of these nuisance processes. We prove the LAMN property for this submodel and establish the efficiency bound by invoking the conditional convolution theorem (Jeganathan (1982, 1983)). The resulting efficiency bound coincides with the $\mathcal{F}$-conditional asymptotic variance of $\hat\beta^\star_n$. Since the estimator $\hat\beta^\star_n$ only depends on the original data (instead of the augmented data), we conclude that it attains the adaptive efficiency bound a fortiori.

It is instructive to recall the LAMN property. Compared with the commonly used local asymptotic normality (LAN) property, the LAMN property is more general because it allows the information matrix to be random. In the sequel, we use $P^n_\beta$ to denote the joint distribution of the data sequence $(\Delta^n_i X)_{1 \le i \le [T/\Delta_n]}$ in a parametric model with an unknown parameter $\beta$. We say that the sequence $(P^n_\beta)$ satisfies the LAMN property at $\beta = \beta_0$ if there exist a sequence $\Gamma_n$ of nonnegative random variables and a sequence $\psi_n$ of random variables, such that for any $h \in \mathbb{R}$,

$$\log \frac{dP^n_{\beta_0 + \Delta_n^{1/2} h}}{dP^n_{\beta_0}} = h\,\Gamma_n^{1/2} \psi_n - \frac{1}{2} \Gamma_n h^2 + o_p(1),$$

and

$$(\psi_n, \Gamma_n) \xrightarrow{\mathcal{L}} (\psi, \Gamma),$$

where the information $\Gamma$ is a positive $\mathcal{F}$-measurable random variable and $\psi$ is a standard normal variable independent of $\Gamma$.

To establish the asymptotic behavior of the log likelihood ratio, we maintain the following assumption in this subsection.

Assumption 6 We have Assumption 1 and the processes $(b_t)_{t \ge 0}$, $(\sigma_t)_{t \ge 0}$ and $(J_t)_{t \ge 0}$ are independent of $(W_t)_{t \ge 0}$, and the joint law of $(b, c_{ZZ}, \varsigma, J_Z, \varepsilon)$ does not depend on $\beta$.

Note that Assumption 6 is only needed in this subsection for getting a closed-form expression for the likelihood ratio in the derivation of the efficiency bound for estimating $\beta$. Our estimation, inference and testing methods work under far more general settings. Theorem 2, below, shows that the aforementioned submodel satisfies the LAMN property and characterizes the information for the estimation of $\beta$.


Theorem 2 Under Assumptions 5 and 6, the sequence $(P^n_\beta : \beta \in \mathbb{R})$ satisfies the LAMN property at $\beta = \beta_0$ with information $\int_0^T (1/\varsigma_s)\,d[Z, Z]_s$, where $\beta_0$ is the true value of $\beta$.

Comments. (i) Theorem 2 reveals that the "local" information bound for estimating beta is $(1/\varsigma_s)\,d[Z, Z]_s$, which is exactly the local signal-to-noise ratio between the instantaneous local quadratic variation of $Z$ and the idiosyncratic variance $\varsigma$.

(ii) The information in Theorem 2 can be decomposed as

$$\int_0^T \frac{d[Z, Z]_s}{\varsigma_s} = \int_0^T \frac{d\langle Z^c, Z^c \rangle_s}{\varsigma_s} + \int_0^T \frac{d[Z^d, Z^d]_s}{\varsigma_s} = \int_0^T \frac{c_{ZZ,s}}{\varsigma_s}\,ds + \sum_{\tau \in \mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}.$$

The two terms on the right-hand side of the above display are exactly the information bounds for separately estimating the diffusive beta and the jump beta; see Theorem 1 in Li et al. (2017a) and Theorem 2 in Li et al. (2017b).⁷
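A small numerical illustration of this decomposition follows. It is a stylized sketch under the assumption of constant $c_{ZZ}$ and $\varsigma$ over $[0, T]$; the numerical values and the function name are hypothetical and are not calibrated to any data in the paper.

```python
def information_terms(c_zz, varsigma, horizon, jump_sizes):
    """Diffusive and jump parts of the information, assuming constant c_ZZ and varsigma."""
    diffusive = horizon * c_zz / varsigma
    jump = sum(dz ** 2 for dz in jump_sizes) / varsigma
    return diffusive, jump

# Hypothetical values: c_ZZ = 1, varsigma = 0.5, T = 1, two jumps in Z.
diffusive, jump = information_terms(1.0, 0.5, 1.0, [0.5, -0.4])
v_star = 1.0 / (diffusive + jump)          # bound using both sources of information
v_diffusive_only = 1.0 / diffusive         # bound using diffusive moves only
variance_reduction = 1.0 - v_star / v_diffusive_only
```

In this example the diffusive information is 2 and the jump information is 0.82, so the common-beta bound is about 29% below the diffusive-only bound. The reduction equals the share of the jump term in the total information.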

From Theorem 2 and the conditional convolution theorem (see Jeganathan (1982, 1983)), we deduce that the efficiency bound for estimating $\beta_0$ in the adaptive case is given by the inverse of the information, that is,

$$\left( \int_0^T \frac{d[Z, Z]_s}{\varsigma_s} \right)^{-1}.$$

This efficiency bound coincides exactly with the $\mathcal{F}$-conditional asymptotic variance of $\hat\beta^\star_n$ and, hence, $\hat\beta^\star_n$ is an (efficient) adaptive estimator as claimed above.

4 A Hausman specification test for common beta

The theoretical results developed above rely critically on the common beta restriction of (6). We have shown that, if this parametric restriction holds, exploiting the information in the jump dependence will yield an efficiency gain compared with, for example, the diffusive beta estimator in Li et al. (2017a), and vice versa. However, if the parametric restriction does not hold, our proposed common beta estimator is not even consistent for either the diffusive or the jump beta, since it converges to a weighted average of the two different betas (see Lemma 3). Therefore, it is important to develop a test for the common beta assumption.

Following the insight of Hausman (1978), we design a specification test as a by-product of the efficient estimation result developed above. Hausman (1978) proposes a specification test based on an asymptotic theory for the difference between an asymptotically efficient estimator and some other consistent but asymptotically inefficient estimator. The key feature of the Hausman test is that, as a consequence of efficiency, the asymptotic variance of the difference between the two estimators is the difference of their asymptotic variances. We shall exploit the efficiency of the adaptive estimator $\hat\beta^\star_n$ in the same spirit for the testing purpose.

⁷ Note that if the common beta assumption does not hold, the idiosyncratic variance $\varsigma$ appearing in the information bound of the continuous beta and that of the jump beta would be defined differently using the corresponding betas.

More specifically, we compare the efficient common beta estimator $\hat\beta^\star_n$ and a diffusive beta estimator $\hat\beta^c_n(v)$ with weight function $v$ (recall (16)). If the restriction of a constant diffusive beta is known a priori, then the diffusive beta estimator will always be consistent regardless of the validity of the common beta restriction, but the adaptive common beta estimator will be consistent only when the common beta restriction holds. Formally, we test in which of the following two sets the observed sample path falls:⁸

$$\Omega_0 \equiv \left\{ \omega \in \Omega : Y_t = B_t + \beta_0 (Z^c_t + Z^d_t) + (\varepsilon^c_t + \varepsilon^d_t) \text{ such that (5) holds for some } \beta_0 \right\} \cap \{|\mathcal{T}| \ge 1\},$$

$$\Omega_a \equiv \left\{ \omega \in \Omega : Y_t = B_t + \beta_0 Z^c_t + \beta^d Z^d_t + \varepsilon^c_t + \varepsilon^d_t \text{ such that (5) holds for some } \beta_0 \ne \beta^d \right\} \cap \{|\mathcal{T}| \ge 1\}.$$

Note that the alternative hypothesis $\Omega_a$ maintains that the jump beta is a constant $\beta^d$. We form the alternative hypothesis as such mainly for ease of exposition. As we shall discuss in more detail below, the test is actually consistent against more general alternatives with time-varying jump betas (and we prove the more general result in the appendix).

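We carry out the test by examining whether the difference $\hat\beta^\star_n - \hat\beta^c_n(v)$ is statistically different from zero. Due to the efficiency of $\hat\beta^\star_n$, the asymptotic variance of $\Delta_n^{-1/2}(\hat\beta^\star_n - \hat\beta^c_n(v))$ is simply

$$\Xi \equiv \frac{\int_0^T v(c_s)^2\,\frac{\varsigma_s}{c_{ZZ,s}}\,ds}{S(v)^2} - V^\star,$$

which can be consistently estimated by a sample analogue estimator

$$\hat\Xi_n \equiv \frac{\hat S_n(g_\Xi)}{\hat S_n(v)^2} - \hat V^\star_n, \quad \text{where } g_\Xi(c) = \frac{v(c)^2\,(c_{YY} - c^2_{ZY}/c_{ZZ})}{c_{ZZ}}.$$

We note that $\Xi$ is strictly positive almost surely provided that jumps are present, as implied by Theorem 1. The resulting Wald-type test statistic is given by

$$H_n \equiv \Delta_n^{-1} \left( \hat\beta^\star_n - \hat\beta^c_n(v) \right)^2 \Big/ \hat\Xi_n.$$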

Theorem 3, below, describes the asymptotic behavior of the test. We denote by $\chi^2_1(1 - \alpha)$ the $1 - \alpha$ quantile of the $\chi^2_1$ distribution.

Theorem 3 Under the same conditions as in Theorem 1, the following statements hold:

(a) the sequence $\Delta_n^{-1/2}(\hat\beta^\star_n - \hat\beta^c_n(v))$ converges stably in law to a centered mixed Gaussian variable with $\mathcal{F}$-conditional variance $\Xi$;

(b) in restriction to $\Omega_0$, $H_n$ converges stably in law to a $\chi^2_1$ distribution;

(c) at significance level $\alpha \in (0, 1)$, the test with critical region $C_n \equiv \{H_n > \chi^2_1(1 - \alpha)\}$ has asymptotic size $\alpha$ under the null and asymptotic power one under the alternative, that is, $\mathbb{P}(C_n \mid \Omega_0) \to \alpha$ and $\mathbb{P}(C_n \mid \Omega_a) \to 1$.
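The mechanics of the test can be sketched for the special case in which the diffusive estimator uses the optimal weight $v = v^\star$. In that case $g_\Xi = v^\star$, so the variance estimate reduces to the reciprocal of the bias-corrected estimate of $S(v^\star)$ minus the estimated $V^\star$. The sketch below is an illustration under that assumption, not the general procedure: its four inputs stand for the bias-corrected estimates of $S(v^\star g_b)$ and $S(v^\star)$ and for the $1/\varsigma$-weighted jump sums of $\Delta Z\,\Delta Y$ and $(\Delta Z)^2$. The function name and the numerical inputs are hypothetical. The p-value uses the identity $P(\chi^2_1 > h) = \mathrm{erfc}(\sqrt{h/2})$.

```python
import math

def hausman_common_beta_test(s_zy, s_zz, q_zy, q_zz, dt, alpha=0.05):
    """Hausman-type test of a common beta, assuming optimal diffusive weights.

    s_zy, s_zz: bias-corrected estimates of S(v* g_b) and S(v*).
    q_zy, q_zz: jump sums of dZ*dY / varsigma and dZ**2 / varsigma (q_zz > 0).
    """
    beta_c = s_zy / s_zz                           # diffusive beta, as in (16)
    beta_star = (s_zy + q_zy) / (s_zz + q_zz)      # adaptive common beta, as in (18)
    v_star = 1.0 / (s_zz + q_zz)                   # variance estimate, Theorem 1(c)
    xi = 1.0 / s_zz - v_star                       # variance of the contrast
    h = (beta_star - beta_c) ** 2 / (dt * xi)      # Wald-type statistic
    p_value = math.erfc(math.sqrt(h / 2.0))        # upper tail of chi-square(1)
    return h, p_value, p_value < alpha

dt = 1.0 / 2000
# Hypothetical inputs: jump beta equal to the diffusive beta (both 1.0).
h_null, p_null, reject_null = hausman_common_beta_test(2.0, 2.0, 0.82, 0.82, dt)
# Hypothetical inputs: jump beta of 0.75 against a diffusive beta of 1.0.
h_alt, p_alt, reject_alt = hausman_common_beta_test(2.0, 2.0, 0.615, 0.82, dt)
```

With equal jump and diffusive betas the statistic is zero and the null is retained. In the second example it is far above the 5% critical value of about 3.84, so the null is rejected.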

⁸ We restrict attention to the scenarios with at least one jump in $Z$ (i.e., $|\mathcal{T}| \ge 1$) so as to avoid uninteresting degenerate cases.


Comments. (i) Part (a) of Theorem 3 follows from the fact that the asymptotic covariance between $\hat\beta^\star_n - \hat\beta^c_n(v)$ and $\hat\beta^\star_n$ is zero. This orthogonality condition holds essentially because of the efficiency of $\hat\beta^\star_n$, which is analogous to the classical Hausman test.

(ii) The alternative hypothesis $\Omega_a$ maintains that the jump beta is a constant $\beta^d$, and the test is consistent provided that the jump beta is different from the diffusive beta $\beta_0$. More generally, we show in the proof that the test is consistent even when the jump beta is time-varying, provided that a certain weighted average of spot jump betas (i.e., $\Delta Y_\tau/\Delta Z_\tau$) does not coincide with the diffusive beta $\beta_0$.

Finally, we note that we can also test the common beta assumption by contrasting the consistent jump beta and efficient common beta estimators. More specifically, we can consider a weighted jump beta estimator $\beta^{J}_n(w)$ for some weight function $w(\cdot)$ and then define the test statistic alternatively as

$$H_n \equiv \frac{\Delta_n^{-1}\,(\beta^{\star}_n - \beta^{J}_n(w))^2}{\Xi_n}, \quad \Xi_n \equiv \frac{\sum_{i\in\mathcal{I}'_n} w(c_{n,i-}, c_{n,i}, \beta_n)^2\,(\Delta^n_i Z)^2\,\varsigma_{n,i}}{Q_{n,ZZ}(w)^2} - V^{\star}_n.$$

Theorem 3 still holds in this case, essentially with the same proof.

5 Monte Carlo study

5.1 Setup

We now examine the finite sample performance of our inference theory. Consider $dZ_t = \sigma_t\,dL_t$, where $L$ is a Lévy process with characteristic triplet $(0, 1, \nu_1)$. We look at two alternative specifications for $\nu_1$: $\nu_1(dx) = (1/16)e^{-|x|}dx$, which leads to a $Z$ process with relatively low jump activity (one jump every 8 days) and large expected jump size, and $\nu_1(dx) = (1/4)e^{-2|x|}dx$, which yields a $Z$ process with higher jump activity (one jump every 4 days) but smaller expected jump size. We set $d\varepsilon_t = \sigma_t\,d\tilde{L}_t$, where $\tilde{L}_t$ is another Lévy process, independent of $L_t$, with characteristic triplet $(0, 1/\sqrt{2}, \nu_2)$, where $\nu_2(dx) = (1/90)e^{-|x|}dx$.

That is, the continuous parts in $L_t$ and $\tilde{L}_t$ are Brownian diffusions with constant volatility and the jump parts are compound Poisson processes with jump size following double-exponential distributions. The frequency and size of jumps are calibrated to match roughly those observed in real data. Our model under the null hypothesis is:

$$dY_t = dZ_t + d\varepsilon_t.$$

The specification under the alternative hypothesis is given by:

$$dY_t = dY^c_t + dJ_{Y,t}, \quad dY^c_t = dZ^c_t + d\varepsilon^c_t, \quad dJ_{Y,t} = \beta_d\,dJ_{Z,t} + dJ_{\varepsilon,t},$$

where $\beta_d = 0.75$, and $J_{Y,t}$, $J_{Z,t}$ and $J_{\varepsilon,t}$ are the jump components in $Y$, $Z$ and $\varepsilon$. Finally, following Bollerslev and Todorov (2011), the stochastic volatility follows a two-factor affine diffusion model, that is, $c_t = V_{1,t} + V_{2,t}$, where

$$\begin{aligned}
dV_{1,t} &= 0.0116\,(0.5 - V_{1,t})\,dt + 0.0862\sqrt{V_{1,t}}\,dW_{1,t},\\
dV_{2,t} &= 0.6930\,(0.5 - V_{2,t})\,dt + 0.6660\sqrt{V_{2,t}}\,dW_{2,t},
\end{aligned} \tag{19}$$


and $(W_1, W_2)$ is a two-dimensional Brownian motion that is independent of $L$ and $\tilde{L}$. The parameters chosen in (19) imply that $V_1$ is a highly persistent process with half-life of 60 days, and $V_2$ is a fast mean-reverting process with half-life of 1 day.
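The calibration figures quoted in this setup can be checked with simple arithmetic. The sketch below is illustrative only and uses hypothetical helper names: the jump intensity is the total mass of a Lévy measure of the form $a\,e^{-b|x|}dx$, which equals $2a/b$, and the half-life of a mean-reverting factor with speed $\kappa$ is $\log 2/\kappa$.

```python
import math


def jump_intensity(scale, rate):
    """Total mass of nu(dx) = scale * exp(-rate * |x|) dx over the real line."""
    return 2.0 * scale / rate


def mean_abs_jump_size(rate):
    """Mean absolute jump size for a double-exponential law with this rate."""
    return 1.0 / rate


def half_life(kappa):
    """Half-life (in days) of a mean-reverting factor with speed kappa."""
    return math.log(2.0) / kappa


# Days between jumps of Z under the two specifications of nu_1.
days_between_jumps_a = 1.0 / jump_intensity(1.0 / 16.0, 1.0)  # case A
days_between_jumps_b = 1.0 / jump_intensity(1.0 / 4.0, 2.0)   # case B

# Half-lives of the two volatility factors in (19).
half_life_v1 = half_life(0.0116)  # slightly below 60 days
half_life_v2 = half_life(0.6930)  # very close to 1 day
```

Under these formulas case A gives one jump every 8 days with mean absolute size 1, and case B one jump every 4 days with mean absolute size 0.5, in line with the description above.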

In the inference we consider $T = 60$, which corresponds to approximately a quarter (our unit of time is a business day), with each day consisting of 6.5 trading hours. We examine two regular sampling schemes: $\Delta_n = 1/389$, which corresponds to 1-minute frequency, and $\Delta_n = 1/77$, which corresponds to 5-minute frequency. The local window size is chosen to be $k_n \in \{100, 140\}$ for $\Delta_n = 1/389$ and $k_n \in \{24, 30\}$ for $\Delta_n = 1/77$. The truncation threshold is set to

$$u_n = 3.5 \times \sqrt{\gamma_i\,BV_t} \times \Delta_n^{0.49},$$

where $BV_t$ is the bipower variation (see Barndorff-Nielsen and Shephard (2004b)) measuring the integrated diffusive variation on day $t$ and is given by

$$BV_t \equiv \frac{\pi}{2}\,\frac{n}{n-1} \sum_{i=(t-1)n+2}^{tn} |\Delta^n_{i-1}Z|\,|\Delta^n_i Z|.$$

While there is no deterministic intraday volatility pattern introduced in the data generating process, in order to keep in line with the empirical study in the next section, we also include in the truncation threshold the time-of-day factor $\gamma_i \equiv b_i/((1/n)\sum_{j=1}^{n} b_j)$, with $b_i \equiv T^{-1}\sum_{t=1}^{T} |\Delta^n_{(t-1)n+i-1}Z|\,|\Delta^n_{(t-1)n+i}Z|$ for $i = 2, \ldots, n$ and $b_1 = b_2$, for observation $i$ in day $t$. The number of Monte Carlo trials is set to be 1000.
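The truncation rule above can be sketched as follows, assuming the increments of $Z$ are arranged in a hypothetical `(T, n)` NumPy array with one row per day. The function names are illustrative and not taken from the paper; the formulas follow the definitions of $BV_t$, $b_i$, $\gamma_i$ and $u_n$ given in the text.

```python
import numpy as np


def bipower_variation(dz_day):
    """BV_t for one day, from that day's n intraday increments of Z."""
    abs_dz = np.abs(np.asarray(dz_day, dtype=float))
    n = abs_dz.size
    # Sum of products of adjacent absolute increments (n - 1 terms).
    return (np.pi / 2.0) * (n / (n - 1.0)) * np.sum(abs_dz[:-1] * abs_dz[1:])


def time_of_day_factor(dz):
    """gamma_i for i = 1..n from a (T, n) array of increments."""
    abs_dz = np.abs(np.asarray(dz, dtype=float))
    n = abs_dz.shape[1]
    b = np.empty(n)
    # b_i for i = 2..n: average over days of adjacent products.
    b[1:] = np.mean(abs_dz[:, :-1] * abs_dz[:, 1:], axis=0)
    b[0] = b[1]  # convention b_1 = b_2
    return b / b.mean()


def truncation_threshold(dz, delta_n, scale=3.5, power=0.49):
    """(T, n) array of thresholds u_n, one per day t and observation i."""
    dz = np.asarray(dz, dtype=float)
    bv = np.array([bipower_variation(row) for row in dz])  # shape (T,)
    gamma = time_of_day_factor(dz)                          # shape (n,)
    return scale * np.sqrt(np.outer(bv, gamma)) * delta_n ** power
```

An increment would then be treated as containing a jump when `np.abs(dz) > truncation_threshold(dz, delta_n)`.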

5.2 Results

In Table 1 we report the coverage probability of the confidence interval constructed using the adaptive common beta estimator $\beta^{\star}_n$. Although we see some slight under-coverage, the coverage probabilities are generally close to the corresponding nominal levels, suggesting that the asymptotic inference performs reasonably well across the different scenarios.

We next examine the size and power of our Hausman specification test for discriminating $\Omega_0$ versus $\Omega_a$. The results are reported in Table 2. The test is implemented based on the difference between the efficient adaptive common beta estimator $\beta^{\star}_n$ and the adaptive diffusive beta estimator $\beta^{c}_n(v)$, where the latter is proposed in Li et al. (2017a) but is inefficient under the common beta assumption. As Table 2 shows, for different sampling frequencies and choices of $k_n$, the rejection probabilities are reasonably close to the nominal levels under the null, and are close to one under the alternative. Nevertheless, we do note that there is some over-rejection for the coarser sampling frequency of 5 minutes.

Finally, we measure the relative efficiency gains from optimally weighting observations and pooling jumps and diffusive returns in the estimation. To this end, we consider four different beta estimators: (i) the adaptive common beta estimator $\beta^{\star}_n$ defined in (18), (ii) the “realized regression” estimator of Barndorff-Nielsen and Shephard (2004a), (iii) a truncated version of the “realized regression” beta estimator denoted by $\beta^{c}_{BNS}$, and (iv) the adaptive diffusive beta estimator in Li et al. (2017a), where the latter three estimators are defined as

$$\beta^{all}_{BNS} = \frac{\sum_i \Delta^n_i Y\,\Delta^n_i Z}{\sum_i (\Delta^n_i Z)^2}, \quad \beta^{c}_{BNS} = \frac{\sum_i c_{ZY,n,i}}{\sum_i c_{ZZ,n,i}}, \quad \beta^{c}_{opt} = \frac{\sum_i c_{ZY,n,i}/\varsigma_{n,i}}{\sum_i c_{ZZ,n,i}/\varsigma_{n,i}}.$$
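As a rough illustration of the three formulas above, the following sketch evaluates them on arrays of returns and spot covariance estimates. It assumes those spot estimates (`c_zy`, `c_zz`, `c_yy`) have already been computed by some local-window procedure, which is not shown; all function names are hypothetical. The idiosyncratic variance is taken to be $\varsigma = c_{YY} - c_{ZY}^2/c_{ZZ}$, consistent with the definition of $g_\Xi$ in Section 4.

```python
import numpy as np


def realized_regression_beta(dy, dz):
    """beta^all_BNS: regression of all Y increments on all Z increments."""
    dy, dz = np.asarray(dy, dtype=float), np.asarray(dz, dtype=float)
    return np.sum(dy * dz) / np.sum(dz ** 2)


def idiosyncratic_variance(c_yy, c_zy, c_zz):
    """Spot idiosyncratic variance c_YY - c_ZY^2 / c_ZZ."""
    c_yy, c_zy, c_zz = (np.asarray(a, dtype=float) for a in (c_yy, c_zy, c_zz))
    return c_yy - c_zy ** 2 / c_zz


def flat_weighted_diffusive_beta(c_zy, c_zz):
    """beta^c_BNS: ratio of summed spot covariances to summed spot variances."""
    return np.sum(c_zy) / np.sum(c_zz)


def optimally_weighted_diffusive_beta(c_zy, c_zz, varsigma):
    """beta^c_opt: same ratio, with each block weighted by 1 / varsigma."""
    c_zy, c_zz, varsigma = (np.asarray(a, dtype=float) for a in (c_zy, c_zz, varsigma))
    return np.sum(c_zy / varsigma) / np.sum(c_zz / varsigma)
```

When the spot beta $c_{ZY}/c_{ZZ}$ is constant across blocks, both diffusive estimators return that constant; they differ only in how much weight noisy blocks receive.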


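Table 1: Monte Carlo Coverage Rates (%) of Confidence Intervals of $\beta^{\star}_n$.

| Case | Nominal 90% | Nominal 95% | Nominal 99% |
|---|---|---|---|
| A. $\nu_1(dx) = (e^{-\lvert x\rvert}/16)\,dx$ | | | |
| $\Delta_n = 1/389$, $k_n = 100$ | 87.40 | 92.90 | 98.40 |
| $\Delta_n = 1/389$, $k_n = 140$ | 87.20 | 92.70 | 98.20 |
| $\Delta_n = 1/77$, $k_n = 24$ | 86.70 | 92.50 | 97.70 |
| $\Delta_n = 1/77$, $k_n = 30$ | 87.10 | 92.60 | 97.60 |
| B. $\nu_1(dx) = (e^{-2\lvert x\rvert}/4)\,dx$ | | | |
| $\Delta_n = 1/389$, $k_n = 100$ | 85.30 | 92.30 | 98.40 |
| $\Delta_n = 1/389$, $k_n = 140$ | 85.40 | 92.00 | 98.60 |
| $\Delta_n = 1/77$, $k_n = 24$ | 86.80 | 92.40 | 97.20 |
| $\Delta_n = 1/77$, $k_n = 30$ | 87.30 | 92.60 | 97.50 |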

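Table 2: Monte Carlo Rejection Rates (%) of Common Beta Test.

| Case | Under $\Omega_0$: 10% | Under $\Omega_0$: 5% | Under $\Omega_0$: 1% | Under $\Omega_a$: 10% | Under $\Omega_a$: 5% | Under $\Omega_a$: 1% |
|---|---|---|---|---|---|---|
| A. $\nu_1(dx) = (e^{-\lvert x\rvert}/16)\,dx$ | | | | | | |
| $\Delta_n = 1/389$, $k_n = 100$ | 12.00 | 5.80 | 1.30 | 99.90 | 99.70 | 99.30 |
| $\Delta_n = 1/389$, $k_n = 140$ | 11.40 | 6.00 | 1.20 | 99.90 | 99.80 | 99.50 |
| $\Delta_n = 1/77$, $k_n = 24$ | 13.40 | 7.50 | 2.50 | 98.20 | 97.80 | 96.90 |
| $\Delta_n = 1/77$, $k_n = 30$ | 12.80 | 7.90 | 2.00 | 98.30 | 98.10 | 97.00 |
| B. $\nu_1(dx) = (e^{-2\lvert x\rvert}/4)\,dx$ | | | | | | |
| $\Delta_n = 1/389$, $k_n = 100$ | 11.20 | 6.30 | 1.10 | 99.90 | 99.90 | 99.80 |
| $\Delta_n = 1/389$, $k_n = 140$ | 10.90 | 6.30 | 1.30 | 99.90 | 99.90 | 99.80 |
| $\Delta_n = 1/77$, $k_n = 24$ | 14.20 | 8.00 | 1.70 | 98.50 | 98.00 | 96.10 |
| $\Delta_n = 1/77$, $k_n = 30$ | 12.90 | 7.80 | 1.70 | 98.40 | 98.00 | 96.10 |

The column percentages are the nominal levels of the test.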


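Table 3: Relative Efficiency of Alternative Beta Estimates in Monte Carlo.

| Case | $V^{c}_{opt}/V^{\star}_n$: $Q_{0.25}$ | $Q_{0.50}$ | $Q_{0.75}$ | $V^{all}_{BNS}/V^{\star}_n$: $Q_{0.25}$ | $Q_{0.50}$ | $Q_{0.75}$ | $V^{c}_{BNS}/V^{c}_{opt}$: $Q_{0.25}$ | $Q_{0.50}$ | $Q_{0.75}$ |
|---|---|---|---|---|---|---|---|---|---|
| A. $\nu_1(dx) = (e^{-\lvert x\rvert}/16)\,dx$ | | | | | | | | | |
| $\Delta_n = 1/389$, $k_n = 100$ | 1.10 | 1.20 | 1.36 | 1.11 | 1.18 | 1.27 | 1.11 | 1.18 | 1.26 |
| $\Delta_n = 1/389$, $k_n = 140$ | 1.10 | 1.20 | 1.36 | 1.11 | 1.18 | 1.27 | 1.11 | 1.18 | 1.26 |
| $\Delta_n = 1/77$, $k_n = 24$ | 1.11 | 1.23 | 1.41 | 1.09 | 1.16 | 1.25 | 1.07 | 1.13 | 1.22 |
| $\Delta_n = 1/77$, $k_n = 30$ | 1.11 | 1.22 | 1.40 | 1.10 | 1.16 | 1.25 | 1.08 | 1.14 | 1.22 |
| B. $\nu_1(dx) = (e^{-2\lvert x\rvert}/4)\,dx$ | | | | | | | | | |
| $\Delta_n = 1/389$, $k_n = 100$ | 1.08 | 1.12 | 1.18 | 1.11 | 1.18 | 1.27 | 1.11 | 1.18 | 1.26 |
| $\Delta_n = 1/389$, $k_n = 140$ | 1.08 | 1.12 | 1.18 | 1.11 | 1.18 | 1.27 | 1.11 | 1.18 | 1.26 |
| $\Delta_n = 1/77$, $k_n = 24$ | 1.08 | 1.13 | 1.20 | 1.09 | 1.16 | 1.26 | 1.07 | 1.13 | 1.22 |
| $\Delta_n = 1/77$, $k_n = 30$ | 1.08 | 1.13 | 1.19 | 1.10 | 1.16 | 1.26 | 1.08 | 1.14 | 1.23 |

Note: $V^{c}_{opt}$, $V^{all}_{BNS}$ and $V^{c}_{BNS}$ denote the estimators for the asymptotic variances of $\beta^{c}_{opt}$, $\beta^{all}_{BNS}$ and $\beta^{c}_{BNS}$, respectively. We use $Q_\alpha$ to denote the $\alpha$-quantile across the Monte Carlo replications.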

In Table 3, we report ratios of the estimated asymptotic variances of the different estimators, so as to assess the relative efficiency of these estimators. As seen from the table, the choice of $k_n$ has little effect on the estimated ratios. By weighting observations according to the idiosyncratic variance of the $Y$ process, $\beta^{c}_{opt}$ and $\beta^{\star}_n$ yield, on average, about 15% efficiency gains relative to the “flat weighted” estimators $\beta^{c}_{BNS}$ and $\beta^{all}_{BNS}$. Moreover, exploiting the information in the jumps in addition to the diffusive movements provides nontrivial efficiency benefits, as we can see from the comparison between $\beta^{c}_{opt}$ and $\beta^{\star}_n$ in the left panel of Table 3.

6 Empirical illustration

We next illustrate our inference procedures on real data. Our focus is the sensitivity of asset returns to the market index, i.e., market betas. We use high frequency data from the Trade and Quote (TAQ) database in the analysis for the following three stocks: JPMorgan (JPM), Walmart (WMT) and Johnson & Johnson (JNJ). Our proxy for the market is the S&P 500 ETF (SPY). JPMorgan is a large stock from the financial sector which has large diffusive and jump betas. Walmart and Johnson & Johnson are representatives of the consumer and health care sectors, and they have much less sensitivity towards the market index. We sample at the 5-minute frequency, which results in 78 price observations per asset on each day. Similar to the Monte Carlo, we set our estimation window to $T = 63$ trading days (approximately one quarter), resulting in 31 estimation windows over the entire sample period. The local smoothing window used to estimate spot volatility is chosen to be $k_n = 24$ (results with $k_n = 30$ are very similar) and the truncation threshold is calculated in the same way as in the simulation study.

Figure 1: Box plots of market betas, 2007-2014. Panels: (a) JNJ, (b) WMT, (c) JPM, each showing a box plot for the diffusive beta and one for the jump beta. The diffusive beta and the jump beta are estimated using the optimally weighted $\beta^{c}_n$ and $\beta^{J}_n$ estimators, respectively, for each quarter. Each box plot shows the 25th, 50th and 75th percentiles of the beta estimates across the 31 quarters.

We start our empirical analysis with testing the hypothesis of common diffusive and jump betas. At the 5% significance level, we reject the null hypothesis of common beta in 52% of the estimation windows for both JNJ and WMT and in 35% of the estimation windows for JPM. To further highlight the differences between the diffusive and jump betas, in Figure 1 we display for each stock a box plot for $\beta^{c}_{opt}$ and $\beta^{J}_{opt}$, where the latter is the optimal jump beta estimator proposed in Li et al. (2017b). As seen from these plots, the quantiles of the jump beta are systematically above those of the diffusive beta for JNJ and WMT. On the other hand, the displayed quantiles of the diffusive and jump betas of JPM are very similar. This observation is further confirmed by the time series plots of these beta estimates displayed in Figure 2. Indeed, from the figure we can notice that all rejections of the null hypothesis of common diffusive and jump betas for JNJ occur when the jump beta is above the diffusive one. Similarly, we have only one instance where the common beta hypothesis is rejected and the jump beta is below the diffusive one for WMT. On the other hand, for JPM, of the 11 rejections of the common beta hypothesis, five correspond to a jump beta below the diffusive one. Overall, these results suggest that JNJ and WMT consistently react more to the market jump moves relative to their reaction in “normal” times. For JPM, the levels of diffusive and jump betas appear rather similar and in many periods the difference between the two betas is not even statistically significant.

Figure 2: Quarterly diffusive and jump market betas, 2007-2014. Panels: (a) JNJ, (b) WMT, (c) JPM; horizontal axis: Time; vertical axis: Beta; series: Diffusive Beta, Jump Beta, Rejection. Filled circles correspond to the case where the Hausman test rejects a null of common beta and the empty ones to situations where the null is not rejected.

We continue next with assessing the efficiency gains from “pooling” the diffusive and jump returns for the purposes of beta estimation. In Table 4 we compare the estimated asymptotic variances of the different beta estimates for the estimation periods for which the null hypothesis of common beta is not rejected. The second and third panels (i.e., columns 4-6 and 7-9) of the table reveal that we can generate rather nontrivial efficiency gains by optimally weighting the returns to account for the heteroskedasticity that is present in them. Pooling jumps and diffusive returns together yields additional gains as seen from the first panel (i.e., columns 1-3) of the table. Naturally, the latter gains are smaller due to the “rare” nature of jumps.

Table 4: Relative Efficiency of Alternative Beta Estimates.

| Stock | $V^{c}_{opt}/V^{\star}$: $Q_{0.25}$ | $Q_{0.50}$ | $Q_{0.75}$ | $V^{all}_{BNS}/V^{\star}$: $Q_{0.25}$ | $Q_{0.50}$ | $Q_{0.75}$ | $V^{c}_{BNS}/V^{c}_{opt}$: $Q_{0.25}$ | $Q_{0.50}$ | $Q_{0.75}$ |
|---|---|---|---|---|---|---|---|---|---|
| JNJ | 1.04 | 1.05 | 1.08 | 1.41 | 1.50 | 2.18 | 1.41 | 1.50 | 2.17 |
| WMT | 1.03 | 1.06 | 1.08 | 1.46 | 1.51 | 1.68 | 1.46 | 1.52 | 1.60 |
| JPM | 1.04 | 1.07 | 1.09 | 1.59 | 1.74 | 1.97 | 1.57 | 1.74 | 1.91 |

Note: This table reports the relative efficiency of various beta estimators, measured by the ratio of the estimates of their asymptotic variances. The asymptotic variance estimators $V^{\star}$, $V^{c}_{opt}$, $V^{all}_{BNS}$ and $V^{c}_{BNS}$ correspond to $\beta^{\star}_n$, $\beta^{c}_{opt}$, $\beta^{all}_{BNS}$ and $\beta^{c}_{BNS}$, respectively. We summarize the results by reporting the empirical quantiles (denoted $Q_\alpha$) of these ratios across the 31 quarterly estimation windows.

7 Conclusion

We propose a test for, and analyze the efficiency gains provided by, an assumption of a common diffusive and jump beta for discretely observed asset prices and risk factors. We propose a general adaptive estimator for the common beta that utilizes information in both the diffusive and jump returns and further optimally weights the increments to take into account the nontrivial heteroskedasticity present in financial data. The estimator achieves a semiparametric efficiency bound for recovering the common beta and is in fact adaptive with respect to various nonparametric nuisance components in the model. We further use the efficient estimator to propose a Hausman-type specification test for deciding whether the common beta assumption holds. An empirical application reveals the efficiency gains from incorporating the common beta assumption in the estimation of market betas as well as the empirical plausibility of this assumption for a representative sample of three stocks over the sample period 2007-2014.

8 Appendix: Proofs

We first introduce some notation that will be used in the proofs. Let $(\kappa_p, \xi_{p-}, \xi_{p+})_{p\geq 1}$ be a collection of mutually independent random variables which are also independent of $\mathcal{F}$, such that $\kappa_p$ is uniformly distributed on the unit interval and both $\xi_{p-}$ and $\xi_{p+}$ are bivariate standard normal variables. For each $p \geq 1$, we define a 2-dimensional vector $R_p$ as $R_p \equiv \sqrt{\kappa_p}\,\sigma_{\tau_p-}\xi_{p-} + \sqrt{1-\kappa_p}\,\sigma_{\tau_p}\xi_{p+}$ and $\rho_p \equiv (-\beta_0, 1)R_p$. If Assumption 5 holds, $\rho_p$ has an $\mathcal{F}$-conditionally centered mixed Gaussian distribution with variance $\varsigma_{\tau_p}$.

We also need a preliminary lemma, which is a straightforward extension of Theorem 4 in Li et al. (2017a), except that we consider estimators for integrated volatility functionals using overlapping windows.

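Lemma 1 Suppose (i) Assumptions 1 and 3 hold; (ii) there exist a sequence $(T_m)_{m\geq 1}$ of stopping times increasing to infinity, a sequence of convex compact subsets $K_m \subset \mathcal{M}_2$ such that $c_t \in K_m$ for $t \leq T_m$ and $g(\cdot)$ is a three-times continuously differentiable function on an $\varepsilon$-enlargement of $K_m$ for some $\varepsilon > 0$; (iii)

$$\frac{1}{3} < \rho < \frac{1}{2}, \quad \frac{1-\rho}{2} \leq \varpi < \frac{1}{2}.$$

Then, $\Delta_n^{-1/2}(S_n(g) - S(g)) \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}(0, V(g))$, where

$$V(g) \equiv \sum_{j,k,l,m=1}^{2} \int_0^T \partial_{jk}g(c_s)\,\partial_{lm}g(c_s)^{\top}\,(c_{s,jl}c_{s,km} + c_{s,jm}c_{s,kl})\,ds.$$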

Proof of Lemma 1. By Lemma 2 of Li and Xiu (2016), $\sup_{0\leq i\leq [T/\Delta_n]-k_n} \|c_{n,i} - \bar{c}_{n,i}\| = o_p(1)$, where $\bar{c}_{n,i} \equiv (k_n\Delta_n)^{-1}\int_{i\Delta_n}^{i\Delta_n+k_n\Delta_n} c_s\,ds$. This uniform approximation result allows us to use the spatial localization technique as in the proof of Theorem 2 in Li et al. (2017a). In particular, we can assume that $g(\cdot)$ has bounded derivatives up to the third order without loss of generality. The assertion of the lemma then follows from Theorem 3.2 in Jacod and Rosenbaum (2013). Q.E.D.

8.1 Proof of Theorem 1.

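Proof of Theorem 1. (a) We set $Q_{n,ZY}(w) \equiv \sum_{i\in\mathcal{I}'_n} w(c_{n,i-}, c_{n,i}, \beta_n)\,\Delta^n_i Z\,\Delta^n_i Y$, so that we can rewrite

$$\beta_n(v, w) = \frac{S_n(v g_b) + Q_{n,ZY}(w)}{S_n(v) + Q_{n,ZZ}(w)}.$$

Hence,

$$\Delta_n^{-1/2}(\beta_n(v, w) - \beta_0) = \frac{\Delta_n^{-1/2}(S_n(v g_b) - \beta_0 S_n(v)) + \Delta_n^{-1/2}(Q_{n,ZY}(w) - \beta_0 Q_{n,ZZ}(w))}{S_n(v) + Q_{n,ZZ}(w)}.$$

Define $g(c) \equiv v(c)g_b(c) - \beta_0 v(c)$ and note that $S(g) = 0$. Applying Lemma 1 and simplifying the form of the asymptotic variance with direct calculation, we deduce

$$\Delta_n^{-1/2} S_n(g) \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}\left(0, \int_0^T v(c_s)^2 \frac{\varsigma_s}{c_{ZZ,s}}\,ds\right). \tag{20}$$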


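In addition, since $\mathcal{I}'_n = \mathcal{I}$ with probability approaching one, we have

$$\Delta_n^{-1/2}(Q_{n,ZY}(w) - \beta_0 Q_{n,ZZ}(w)) = \sum_{p\in\mathcal{P}} w(c_{n,i(p)-}, c_{n,i(p)}, \beta_n)\,\Delta^n_{i(p)}Z\,\rho_{n,p},$$

where $\rho_{n,p} \equiv (-\beta_0, 1)R_{n,p}$ and $R_{n,p} \equiv \Delta_n^{-1/2}(\Delta^n_{i(p)}X - \Delta X_{\tau_p})$. Theorem 9.3.2 in Jacod and Protter (2012) implies that $c_{n,i(p)-} = c_{\tau_p-} + o_p(1)$ and $c_{n,i(p)} = c_{\tau_p} + o_p(1)$ for each $p \geq 1$. Proposition 1(b) in Li et al. (2017b) implies $\Delta^n_{i(p)}Z = \Delta Z_{\tau_p} + o_p(1)$. Since $\beta_n = \beta_0 + o_p(1)$, we can apply the continuous mapping theorem to deduce

$$w(c_{n,i(p)-}, c_{n,i(p)}, \beta_n) \xrightarrow{P} w(c_{\tau_p-}, c_{\tau_p}, \beta_0).$$

By Proposition 4.4.10 in Jacod and Protter (2012), $(R_{n,p})_{p\geq 1}$ converges stably in law to $(R_p)_{p\geq 1}$ and, hence, $(\rho_{n,p})_{p\geq 1}$ converges stably in law to $(\rho_p)_{p\geq 1}$. Note that the variables $(\rho_p)_{p\geq 1}$ are $\mathcal{F}$-conditionally independent and, under Assumption 5, are $\mathcal{F}$-conditionally centered mixed Gaussian with variance $\varsigma_{\tau_p}$. Since the set $\mathcal{P}$ is finite almost surely, we have

$$\Delta_n^{-1/2}(Q_{n,ZY}(w) - \beta_0 Q_{n,ZZ}(w)) \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}\left(0, \sum_{p\in\mathcal{P}} w(c_{\tau_p-}, c_{\tau_p}, \beta_0)^2\,(\Delta Z_{\tau_p})^2\,\varsigma_{\tau_p}\right). \tag{21}$$

By a standard argument, we can show that the convergences in (20) and (21) hold jointly with $\mathcal{F}$-conditionally independent limits. Moreover, it is easy to show that $S_n(v) = S(v) + o_p(1)$ and $Q_{n,ZZ}(w) = Q_{ZZ}(w) + o_p(1)$. Hence,

$$\Delta_n^{-1/2}(\beta_n(v, w) - \beta_0) \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}(0, \Sigma(v, w)),$$

where

$$\Sigma(v, w) = \frac{\int_0^T v(c_s)^2 \frac{\varsigma_s}{c_{ZZ,s}}\,ds + \sum_{\tau\in\mathcal{T}} w(c_{\tau-}, c_\tau, \beta_0)^2\,(\Delta Z_\tau)^2\,\varsigma_\tau}{(S(v) + Q_{ZZ}(w))^2}.$$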

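(b) Since $\beta^{\star}_n$ is a special case of $\beta_n(v, w)$, the asserted convergence follows readily from part (a). It remains to solve the minimization of $\Sigma(v, w)$. We consider two generic weight functions $v$ and $w$. Below, we write $v_s = v(c_s)$ and $w_\tau = w(c_{\tau-}, c_\tau, \beta_0)$ for notational simplicity. Observe that

$$\begin{aligned}
\frac{\Sigma(v, w)}{V^{\star}}
&= \frac{\left(\int_0^T v_s^2 \frac{\varsigma_s}{c_{ZZ,s}}\,ds + \sum_{\tau\in\mathcal{T}} w_\tau^2 (\Delta Z_\tau)^2 \varsigma_\tau\right)\left(\int_0^T \frac{c_{ZZ,s}}{\varsigma_s}\,ds + \sum_{\tau\in\mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}\right)}{(S(v) + Q_{ZZ}(w))^2}\\
&\geq \frac{\left(\int_0^T v_s^2 \frac{\varsigma_s}{c_{ZZ,s}}\,ds\right)\left(\int_0^T \frac{c_{ZZ,s}}{\varsigma_s}\,ds\right) + \left(\sum_{\tau\in\mathcal{T}} w_\tau^2 (\Delta Z_\tau)^2 \varsigma_\tau\right)\left(\sum_{\tau\in\mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}\right)}{(S(v) + Q_{ZZ}(w))^2}\\
&\quad + \frac{2\sqrt{\left(\int_0^T v_s^2 \frac{\varsigma_s}{c_{ZZ,s}}\,ds\right)\left(\int_0^T \frac{c_{ZZ,s}}{\varsigma_s}\,ds\right)\left(\sum_{\tau\in\mathcal{T}} w_\tau^2 (\Delta Z_\tau)^2 \varsigma_\tau\right)\left(\sum_{\tau\in\mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}\right)}}{(S(v) + Q_{ZZ}(w))^2}\\
&\geq \frac{\left(\int_0^T v_s\,ds\right)^2 + \left(\sum_{\tau\in\mathcal{T}} w_\tau (\Delta Z_\tau)^2\right)^2 + 2\left(\int_0^T v_s\,ds\right)\left(\sum_{\tau\in\mathcal{T}} w_\tau (\Delta Z_\tau)^2\right)}{(S(v) + Q_{ZZ}(w))^2}\\
&= 1,
\end{aligned}$$

where the first inequality is due to a simple quadratic inequality and the equality is attained if and only if

$$\left(\int_0^T v_s^2 \frac{\varsigma_s}{c_{ZZ,s}}\,ds\right)\left(\sum_{\tau\in\mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}\right) = \left(\int_0^T \frac{c_{ZZ,s}}{\varsigma_s}\,ds\right)\left(\sum_{\tau\in\mathcal{T}} w_\tau^2 (\Delta Z_\tau)^2 \varsigma_\tau\right);$$

the second inequality is due to the Cauchy–Schwarz inequality and the equality holds if and only if $v(c)$ is a multiple of $c_{ZZ}/\varsigma$ and $w(c_\tau)$ is a multiple of $1/\varsigma_\tau$. Taking these estimates together, we deduce that $V^{\star} \leq \Sigma(v, w)$ in general and the equality holds if and only if $v(c) = \varphi c_{ZZ}/\varsigma$ and $w(c_\tau) = \varphi/\varsigma_\tau$ for some constant $\varphi > 0$ (recall that $v$ and $w$ are strictly positive functions).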
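The bound $V^{\star} \leq \Sigma(v, w)$ can also be checked numerically on a discretized example. The sketch below is only an illustration under simplifying assumptions: the time integrals are replaced by Riemann sums on a grid with step `ds`, the paths of $c_{ZZ}$ and $\varsigma$ and the jump sizes are arbitrary positive inputs, and the function names are hypothetical.

```python
import numpy as np


def asymptotic_variance(v, w, c_zz, varsigma, dz_jump, varsigma_jump, ds):
    """Discretized Sigma(v, w) for weights v (on the grid) and w (at jumps)."""
    numerator = (np.sum(v ** 2 * varsigma / c_zz) * ds
                 + np.sum(w ** 2 * dz_jump ** 2 * varsigma_jump))
    denominator = (np.sum(v) * ds + np.sum(w * dz_jump ** 2)) ** 2
    return numerator / denominator


def efficiency_bound(c_zz, varsigma, dz_jump, varsigma_jump, ds):
    """Discretized V* = 1 / (int c_ZZ / varsigma ds + sum dZ^2 / varsigma)."""
    return 1.0 / (np.sum(c_zz / varsigma) * ds
                  + np.sum(dz_jump ** 2 / varsigma_jump))


# Illustrative inputs (assumed, not calibrated to the paper's design).
rng = np.random.default_rng(0)
grid_size, ds = 200, 0.01
c_zz = rng.uniform(0.5, 2.0, grid_size)
varsigma = rng.uniform(0.2, 1.0, grid_size)
dz_jump = rng.normal(size=3)
varsigma_jump = rng.uniform(0.2, 1.0, 3)

# Ratio for an arbitrary positive weighting scheme; it should be at least 1.
v_generic = rng.uniform(0.1, 3.0, grid_size)
w_generic = rng.uniform(0.1, 3.0, 3)
ratio = (asymptotic_variance(v_generic, w_generic, c_zz, varsigma,
                             dz_jump, varsigma_jump, ds)
         / efficiency_bound(c_zz, varsigma, dz_jump, varsigma_jump, ds))
```

Plugging in the weights `c_zz / varsigma` and `1 / varsigma_jump` makes the ratio equal to one, which mirrors the equality case in the proof.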

(c) By Theorem 3 of Li et al. (2017a), $S_n(v^{\star}) = \int_0^T (c_{ZZ,s}/\varsigma_s)\,ds + o_p(1)$. A similar argument to part (a) yields $Q_{n,ZZ}(w^{\star}) = Q_{ZZ}(w^{\star}) + o_p(1)$. The assertion of part (c) readily follows from the convergence results. Q.E.D.


Proof of Theorem 2. We consider a sequence $\Omega_n$ of events defined by

$$\Omega_n = \left\{\text{For every } 1 \leq i \leq [T/\Delta_n],\ ((i-1)\Delta_n, i\Delta_n] \text{ contains at most one jump of } Z\right\}.$$

Under the maintained assumptions, the process $Z$ has finitely active jumps. Hence, $P(\Omega_n) \to 1$ and we can restrict our calculation below on $\Omega_n$ without loss of generality.

We denote the log likelihood ratio by

$$L_n(h) \equiv \log \frac{dP^n_{\beta_0 + \Delta_n^{1/2}h}}{dP^n_{\beta_0}}, \quad h \in \mathbb{R}.$$

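Let $\mathcal{G}$ denote the $\sigma$-field generated by the processes $(b, c_{ZZ}, \varsigma, J_Z, \varepsilon)$. Given the maintained assumptions, we see that, under the law $P^n_\beta$, the observed returns $(\Delta^n_i X)_{i\geq 0}$ are independently normally distributed conditional on $\mathcal{G}$. Using this fact, we can obtain an explicit expression for $L_n(h)$. For notational simplicity, we denote

$$z_{n,i} \equiv \Delta_n^{-1/2}\int_{(i-1)\Delta_n}^{i\Delta_n} \sqrt{c_{ZZ,s}}\,dW_{Z,s}, \quad y_{n,i} \equiv \Delta_n^{-1/2}\int_{(i-1)\Delta_n}^{i\Delta_n} \sqrt{\varsigma_s}\,dW_{\varepsilon,s},$$

$$c_{n,i} \equiv \Delta_n^{-1}\int_{(i-1)\Delta_n}^{i\Delta_n} c_{ZZ,s}\,ds, \quad \varsigma_{n,i} \equiv \Delta_n^{-1}\int_{(i-1)\Delta_n}^{i\Delta_n} \varsigma_s\,ds.$$

Some straightforward (although somewhat cumbersome) algebra yields

$$L_n(h) = h\psi_n - \frac{h^2}{2}\Gamma_n, \tag{22}$$

where

$$\psi_n \equiv \sum_{i=1}^{[T/\Delta_n]} \frac{y_{n,i}\,(\Delta^n_i J_Z + \Delta_n^{1/2} z_{n,i})}{\varsigma_{n,i}}, \quad \Gamma_n \equiv \sum_{i=1}^{[T/\Delta_n]} \frac{(\Delta^n_i J_Z + \Delta_n^{1/2} z_{n,i})^2}{\varsigma_{n,i}}.$$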


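It remains to analyze the asymptotic properties of $\psi_n$ and $\Gamma_n$. We decompose $\psi_n = \psi'_n + \psi''_n$, where $\psi'_n$ and $\psi''_n$ are sums over the subset $\{i : \Delta^n_i J_Z \neq 0\}$ and its complement, respectively. Similarly, we decompose $\Gamma_n = \Gamma'_n + \Gamma''_n$.

We now proceed to derive the joint convergence in law of $(\psi'_n, \psi''_n)$ under the $\mathcal{G}$-conditional probability. We note that $\Delta^n_i J_Z$ and $\varsigma_{n,i}$ are $\mathcal{G}$-measurable, and $(z_{n,i}, y_{n,i})$ are $\mathcal{G}$-conditionally independent with conditional distributions given by

$$z_{n,i}|\mathcal{G} \sim \mathcal{MN}(0, c_{n,i}), \quad y_{n,i}|\mathcal{G} \sim \mathcal{MN}(0, \varsigma_{n,i}).$$

In particular, $\psi'_n$ and $\psi''_n$ are $\mathcal{G}$-conditionally independent, so it is enough to derive the marginal convergence of each sequence. Since the jumps are finitely active, it is easy to see that

$$\psi'_n = \sum_{i:\,\Delta^n_i J_Z \neq 0} \frac{y_{n,i}\,\Delta^n_i J_Z}{\varsigma_{n,i}} + o_p(1), \quad \psi''_n = \Delta_n^{1/2}\sum_{i=1}^{[T/\Delta_n]} \frac{y_{n,i}\,z_{n,i}}{\varsigma_{n,i}} + o_p(1).$$

By applying the Lindeberg–Lévy central limit theorem under the $\mathcal{G}$-conditional probability, we deduce the following $\mathcal{G}$-conditional convergence in law

$$\psi'_n \xrightarrow{\mathcal{L}} \mathcal{MN}\left(0, \sum_{\tau\in\mathcal{T}} \frac{\Delta Z_\tau^2}{\varsigma_\tau}\right), \quad \psi''_n \xrightarrow{\mathcal{L}} \mathcal{MN}\left(0, \int_0^T \frac{c_{ZZ,s}}{\varsigma_s}\,ds\right).$$

From here, we deduce the following convergence under the $\mathcal{G}$-conditional probability,

$$\psi_n \xrightarrow{\mathcal{L}} \psi \sim \mathcal{MN}\left(0, \int_0^T \frac{d[Z,Z]_s}{\varsigma_s}\right). \tag{23}$$

Similarly, we can derive the convergence in probability for $\Gamma_n$:

$$\Gamma_n \xrightarrow{P} \Gamma \equiv \int_0^T \frac{d[Z,Z]_s}{\varsigma_s}. \tag{24}$$

Since $\Gamma_n$ is $\mathcal{G}$-measurable, (23) and (24) imply that $(\psi_n, \Gamma_n)$ converges in law to $(\psi, \Gamma)$. From here, the assertion of the theorem readily follows (recall (22)). Q.E.D.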


The proof of Theorem 3 relies on two auxiliary lemmas. Lemma 2 describes the joint convergence for a generic weighted common beta estimator and the adaptive estimator. In particular, this lemma shows clearly that $\beta_n(v, w) - \beta^{\star}_n$ is asymptotically orthogonal to $\beta^{\star}_n$. Lemma 3 establishes the probability limit of $\beta^{\star}_n$ under the alternative hypothesis (i.e., the common beta assumption does not hold).

Lemma 2 Suppose that the conditions in Theorem 1 hold. Then, in restriction to $\Omega_0$, we have

$$\Delta_n^{-1/2}\left(\beta_n(v, w) - \beta_0,\ \beta^{\star}_n - \beta_0\right) \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}\left(0, \begin{pmatrix} \Sigma(v, w) & V^{\star}\\ V^{\star} & V^{\star} \end{pmatrix}\right). \tag{25}$$


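Proof of Lemma 2. Recall that $S_n(v) = S(v) + o_p(1)$ and $Q_{n,ZZ}(w) = Q_{ZZ}(w) + o_p(1)$. Hence, the left side of (25) has the asymptotic representation $\zeta_{n,1} + \zeta_{n,2} + o_p(1)$, where

$$\zeta_{n,1} \equiv \left(\frac{\Delta_n^{-1/2}S_n(v g_b - \beta_0 v)}{S(v) + Q_{ZZ}(w)},\ \frac{\Delta_n^{-1/2}S_n(v^{\star} g_b - \beta_0 v^{\star})}{S(v^{\star}) + Q_{ZZ}(w^{\star})}\right)^{\top},$$

$$\zeta_{n,2} \equiv \left(\frac{\Delta_n^{-1/2}(Q_{n,ZY}(w) - \beta_0 Q_{n,ZZ}(w))}{S(v) + Q_{ZZ}(w)},\ \frac{\Delta_n^{-1/2}(Q_{n,ZY}(w^{\star}) - \beta_0 Q_{n,ZZ}(w^{\star}))}{S(v^{\star}) + Q_{ZZ}(w^{\star})}\right)^{\top}.$$

Next, we proceed to derive the joint stable convergence in law for $\zeta_{n,1}$ and $\zeta_{n,2}$. Since the central limit theorem for $\zeta_{n,2}$ is driven completely by the Brownian motion in finitely many jump intervals, a usual argument implies that $\zeta_{n,1}$ and $\zeta_{n,2}$ are asymptotically $\mathcal{F}$-conditionally independent. Hence, it suffices to derive the marginal convergence of $\zeta_{n,1}$ and $\zeta_{n,2}$.

We set $g(\cdot) = (v(\cdot)g_b(\cdot) - \beta_0 v(\cdot),\ v^{\star}(\cdot)g_b(\cdot) - \beta_0 v^{\star}(\cdot))^{\top}$. In restriction to $\Omega_0$, $S(g) = 0$. By Lemma 1,

$$\Delta_n^{-1/2}S_n(g) \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}(0, V(g)),$$

where the asymptotic covariance matrix is given by (after some straightforward algebra)

$$V(g) = \begin{pmatrix} \int_0^T v(c_s)^2 \frac{\varsigma_s}{c_{ZZ,s}}\,ds & \int_0^T v(c_s)\,ds\\ \int_0^T v(c_s)\,ds & \int_0^T \frac{c_{ZZ,s}}{\varsigma_s}\,ds \end{pmatrix}.$$

From here, we deduce

$$\zeta_{n,1} \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}(0, \Sigma_1), \tag{26}$$

where

$$\Sigma_1 = \begin{pmatrix} \dfrac{\int_0^T v(c_s)^2 \frac{\varsigma_s}{c_{ZZ,s}}\,ds}{(S(v) + Q_{ZZ}(w))^2} & \dfrac{S(v)}{(S(v^{\star}) + Q_{ZZ}(w^{\star}))(S(v) + Q_{ZZ}(w))}\\ \dfrac{S(v)}{(S(v^{\star}) + Q_{ZZ}(w^{\star}))(S(v) + Q_{ZZ}(w))} & \dfrac{\int_0^T \frac{c_{ZZ,s}}{\varsigma_s}\,ds}{(S(v^{\star}) + Q_{ZZ}(w^{\star}))^2} \end{pmatrix}.$$

Turning to $\zeta_{n,2}$, we note that with probability approaching one,

$$\zeta_{n,2} = \sum_{p\in\mathcal{P}} \begin{pmatrix} \dfrac{w(c_{n,i(p)-}, c_{n,i(p)}, \beta_n)}{S(v) + Q_{ZZ}(w)}\\ \dfrac{1/\varsigma_{n,i(p)}}{S(v^{\star}) + Q_{ZZ}(w^{\star})} \end{pmatrix} \Delta^n_{i(p)}Z\,\rho_{n,p}.$$

Recall from the proof of Theorem 1 that $w(c_{n,i(p)-}, c_{n,i(p)}, \beta_n) \xrightarrow{P} w(c_{\tau_p-}, c_{\tau_p}, \beta_0)$ and $(\rho_{n,p})_{p\geq 1}$ converges stably in law to $(\rho_p)_{p\geq 1}$, where $(\rho_p)_{p\geq 1}$ are $\mathcal{F}$-conditionally independent. Since $\mathcal{P}$ is finite almost surely, we deduce from the property of stable convergence in law that

$$\zeta_{n,2} \xrightarrow{\mathcal{L}\text{-}s} \sum_{p\in\mathcal{P}} \begin{pmatrix} \dfrac{w(c_{\tau_p-}, c_{\tau_p}, \beta_0)}{S(v) + Q_{ZZ}(w)}\\ \dfrac{1/\varsigma_{\tau_p}}{S(v^{\star}) + Q_{ZZ}(w^{\star})} \end{pmatrix} \Delta Z_{\tau_p}\,\rho_p.$$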


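Under Assumption 5, $\rho_p$ is conditionally centered mixed Gaussian with variance $\varsigma_{\tau_p}$ for each $p \in \mathcal{P}$. Hence,

$$\zeta_{n,2} \xrightarrow{\mathcal{L}\text{-}s} \mathcal{MN}(0, \Sigma_2), \tag{27}$$

where

$$\Sigma_2 \equiv \begin{pmatrix} \dfrac{\sum_{p\in\mathcal{P}} w(c_{\tau_p-}, c_{\tau_p}, \beta_0)^2 (\Delta Z_{\tau_p})^2 \varsigma_{\tau_p}}{(S(v) + Q_{ZZ}(w))^2} & \dfrac{Q_{ZZ}(w)}{(S(v^{\star}) + Q_{ZZ}(w^{\star}))(S(v) + Q_{ZZ}(w))}\\ \dfrac{Q_{ZZ}(w)}{(S(v^{\star}) + Q_{ZZ}(w^{\star}))(S(v) + Q_{ZZ}(w))} & \dfrac{\sum_{p\in\mathcal{P}} (\Delta Z_{\tau_p})^2/\varsigma_{\tau_p}}{(S(v^{\star}) + Q_{ZZ}(w^{\star}))^2} \end{pmatrix}.$$

Finally, we note that $V^{\star} = (S(v^{\star}) + Q_{ZZ}(w^{\star}))^{-1}$. The assertion of the lemma readily follows from (26) and (27). Q.E.D.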

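Lemma 3 Suppose that the conditions in Theorem 1 hold. Then,

$$\beta^{\star}_n \xrightarrow{P} \frac{\int_0^T \frac{\beta^c_s}{\varsigma_s}\,d\langle Z^c, Z^c\rangle_s}{\int_0^T \frac{d[Z,Z]_s}{\varsigma_s}} + \frac{\sum_{\tau\in\mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}\beta^J_\tau}{\int_0^T \frac{d[Z,Z]_s}{\varsigma_s}},$$

where $\beta^c_s = c_{ZY,s}/c_{ZZ,s}$ and $\beta^J_\tau \equiv \Delta Y_\tau/\Delta Z_\tau$, $\tau \in \mathcal{T}$.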

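Proof of Lemma 3. For each $p \geq 1$, let $\beta_{n,i(p)} \equiv \Delta^n_{i(p)}Y/\Delta^n_{i(p)}Z$. With probability approaching one, we have

$$\beta^{\star}_n = \frac{S_n(v^{\star}g_b) + \sum_{p\in\mathcal{P}} \beta_{n,i(p)}\,(\Delta^n_{i(p)}Z)^2/\varsigma_{n,i(p)}}{S_n(v^{\star}) + Q_{n,ZZ}(w^{\star})}.$$

Note that the following convergence holds on the entire space

$$S_n(v^{\star}g_b) \xrightarrow{P} S(v^{\star}g_b), \quad S_n(v^{\star}) \xrightarrow{P} S(v^{\star}), \quad \varsigma_{n,i(p)} \xrightarrow{P} \varsigma_{\tau_p},$$

$$Q_{n,ZZ}(w^{\star}) \xrightarrow{P} \sum_{\tau\in\mathcal{T}} (\Delta Z_\tau)^2/\varsigma_\tau, \quad \Delta^n_{i(p)}Z \xrightarrow{P} \Delta Z_{\tau_p}, \quad \beta_{n,i(p)} \xrightarrow{P} \beta^J_{\tau_p}.$$

In addition, note that $\beta^c_s = g_b(c_s)$. The assertion of the lemma readily follows from these convergence results. Q.E.D.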

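Proof of Theorem 3. (a) The assertion is a direct consequence of Lemma 2 since $\beta^{c}_n(v)$ is a special case of $\beta_n(v, w)$ with $w(\cdot) = 0$.

(b) Similar to Theorem 1, we have $\Xi_n \xrightarrow{P} \Xi$. The assertion of part (b) then follows from the convergence in part (a) and the properties of stable convergence in law.

(c) The size property is implied by part (b). By Lemma 3, in restriction to $\{\beta^c_t = \beta_0,\ t \in [0, T]\}$,

$$\beta^{\star}_n \xrightarrow{P} \frac{\int_0^T \frac{d\langle Z^c, Z^c\rangle_s}{\varsigma_s}}{\int_0^T \frac{d[Z,Z]_s}{\varsigma_s}}\,\beta_0 + \frac{\sum_{\tau\in\mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}\beta^J_\tau}{\int_0^T \frac{d[Z,Z]_s}{\varsigma_s}}.$$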


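In addition, we note that $\Xi_n \xrightarrow{P} \Xi$ holds over the whole sample space. Hence, the test statistic $H_n$ diverges in probability to $\infty$ whenever

$$\frac{\sum_{\tau\in\mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}\beta^J_\tau}{\sum_{\tau\in\mathcal{T}} \frac{(\Delta Z_\tau)^2}{\varsigma_\tau}} \neq \beta_0,$$

implying that the rejection probability converges to one. In particular, this condition holds in restriction to $\Omega_a$ and the assertion on power readily follows. Q.E.D.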

References

Andersen, T., T. Bollerslev, F. Diebold, and G. Wu (2006). Realized Beta: Persistence and Predictability. In T. Fomby and D. Terrell (Eds.), Advances in Econometrics: Econometric Analysis of Economic and Financial Time Series, Volume 20. Elsevier Science.

Barndorff-Nielsen, O. and N. Shephard (2004a). Econometric Analysis of Realized Covariation: High Frequency Based Covariance, Regression, and Correlation in Financial Economics. Econometrica 72 (3), 885–925.

Barndorff-Nielsen, O. E. and N. Shephard (2004b). Power and bipower variation with stochastic volatility and jumps. Journal of Financial Econometrics 2 (1), 1–37.

Bollerslev, T. and V. Todorov (2011). Estimation of Jump Tails. Econometrica 79, 1727–1783.

Bollerslev, T. and B. Zhang (2003). Measuring and Modeling Systematic Risk in Factor Pricing Models using High-frequency Data. Journal of Empirical Finance 10, 533–558.

Gobbi, F. and C. Mancini (2012). Identifying the Brownian Covariation from the Co-Jumps given Discrete Observations. Econometric Theory 28, 249–273.

Gospodinov, N., R. Kan, and C. Robotti (2009). Misspecification-Robust Inference in Linear Asset Pricing Models with Irrelevant Risk Factors. Review of Financial Studies 27, 2139–2170.

Hausman, J. A. (1978). Specification tests in econometrics. Econometrica: Journal of the Econometric Society, 1251–1271.

Jacod, J. and P. Protter (2012). Discretization of Processes, Volume 67. Springer, Heidelberg.

Jacod, J. and M. Rosenbaum (2013). Quarticity and other functionals of volatility: efficient estimation. The Annals of Statistics 41 (3), 1462–1484.

Jagannathan, R. and Z. Wang (1998). An Asymptotic Theory for Estimating Beta-Pricing Models using Cross-Sectional Regression. Journal of Finance 53, 1285–1309.

Jeganathan, P. (1982). On the asymptotic theory of estimation when the limit of the log-likelihood ratios is mixed normal. Sankhya: The Indian Journal of Statistics, Series A, 173–212.

Jeganathan, P. (1983). Some asymptotic properties of risk functions when the limit of the experiment is mixed normal. Sankhya: The Indian Journal of Statistics, Series A, 66–87.

Kalnina, I. (2012). Nonparametric Tests of Time Variation in Betas. Technical report, Université de Montréal.

Kan, R. and C. Zhang (1999). Two-Pass Tests of Asset Pricing Models with Useless Factors. Journal of Finance 54, 204–235.

Kleibergen, F. (2009). Tests of Risk Premia in Linear Factor Models. Journal of Econometrics 149, 149–173.

Li, J., V. Todorov, and G. Tauchen (2017a). Adaptive estimation of continuous-time regression models using high-frequency data. Journal of Econometrics 200 (1), 36–47.

Li, J., V. Todorov, and G. Tauchen (2017b). Jump regressions. Econometrica 85 (1), 173–195.

Li, J. and D. Xiu (2016). Generalized method of integrated moments for high-frequency data. Econometrica 84 (4), 1613–1633.

Mancini, C. (2001). Disentangling the jumps of the diffusion in a geometric jumping Brownian motion. Giornale dell’Istituto Italiano degli Attuari 64 (19-47), 44.

Mykland, P. and L. Zhang (2006). ANOVA for Diffusions and Ito Processes. Annals of Statistics 34, 1931–1963.

Mykland, P. and L. Zhang (2009). Inference for Continuous Semimartingales Observed at High Frequency. Econometrica 77, 1403–1445.

Patton, A. and M. Verardo (2012). Does Beta Move with News? Firm-specific Information Flows and Learning about Profitability. Review of Financial Studies 25, 2789–2839.

Shanken, J. (1992). On the Estimation of Beta Pricing Models. Review of Financial Studies 5, 1–33.

Todorov, V. and T. Bollerslev (2010). Jumps and Betas: A New Theoretical Framework for Disentangling and Estimating Systematic Risks. Journal of Econometrics 157, 220–235.
