Optimal Combinations of Realised Volatility Estimators*

Andrew J. Patton and Kevin Sheppard

University of Oxford

12 September 2008

Abstract

Recent advances in financial econometrics have led to the development of new estimators of asset price variability using frequently-sampled price data, known as "realised volatility estimators" or simply "realised measures". These estimators rely on a variety of different assumptions and take many different functional forms. Motivated by the success of combination forecasts over individual forecasts in many forecasting applications, this paper presents a novel approach for combining individual realised measures to form new estimators of price variability, overcoming the obstacle that the quantity of interest is not observable, even ex post. In an application to high frequency IBM price data over the period 1996-2007, we consider 30 different realised measures from 6 distinct classes of estimators. We find that a simple equally-weighted average of these estimators is not significantly out-performed, in terms of accuracy, by any individual estimator. Moreover, we find that none of the individual estimators encompasses the information in all other estimators, providing further support for the use of combination realised measures.

Keywords: realised variance, volatility forecasting, forecast comparison, forecast combination.

J.E.L. codes: C52, C22, C53.

*We thank Asger Lunde, Neil Shephard, and seminar participants at the third ESRC seminar on Nonlinear Economics and Finance at Keele University. Contact address: Department of Economics, and Oxford-Man Institute of Quantitative Finance, University of Oxford, Manor Road, Oxford OX1 3UQ, United Kingdom. Email:

[email protected] and [email protected].

1 Introduction

The development of new estimators of asset price variability has been an active area of econometric research in the past decade. These estimators, known as "realised volatility estimators" or "realised measures", exploit the information in high frequency data on asset prices (e.g., 5-minute prices) to estimate the variability of the price process over a longer period, commonly one day. Older papers in this "realised volatility" literature, such as Merton (1980), French, Schwert and Stambaugh (1987) and Zhou (1996), recognised the benefits in increased accuracy from such an approach, and recent work¹ has built on this to propose estimators that are more efficient, robust to market microstructure effects, and that can estimate the variation due to the continuous part of the price process separately from the variation due to the "jump" part of the price process. See Andersen, et al. (2006) and Barndorff-Nielsen and Shephard (2007) for recent reviews of this rapidly-evolving literature.

This paper seeks to answer the following simple question: do combinations of the above estimators offer gains in average accuracy relative to individual estimators? It has long been known in the forecasting literature that combinations of individual forecasts often out-perform even the best individual forecast, see Bates and Granger (1969), Newbold and Granger (1974) and Stock and Watson (2004), for example, and see Clemen (1989) and Timmermann (2006) for reviews of this field². Timmermann (2006) summarises three explanations for why combination forecasts work well in practice: they combine the information contained in each individual forecast; they average across differences in the way individual forecasts are affected by structural breaks; they are less sensitive to possible mis-specification of individual forecasting models (see also Clements and Hendry, 1998, on forecast model mis-specification). Each of these three points applies equally to the problem of estimating price variability: individual realised measures use different pieces of information from high frequency data, they may be differently affected by structural breaks (caused

by, for example, changes in the market microstructure), and they may be affected to various degrees by mis-specification. Thus there is reason to believe that a combination realised measure may out-perform individual realised measures.

¹ See Andersen and Bollerslev (1998), Andersen, et al. (2001, 2003), Barndorff-Nielsen and Shephard (2002, 2004, 2006), Aït-Sahalia, Mykland and Zhang (2005), Large (2005), Zhang, Mykland and Aït-Sahalia (2005), Bandi and Russell (2006, 2008), Hansen and Lunde (2006a), Oomen (2006), Christensen and Podolskij (2007), Christensen, Oomen and Podolskij (2008), Barndorff-Nielsen, et al. (2008a,b) amongst others.

² See Halperin (1961) and Reid (1968) for interesting early work on combining different estimates of a mean, and different noisy estimates of GDP, as opposed to combining forecasts.

The theoretical contribution of this paper is to propose methods for constructing optimal combinations of realised measures, where optimality is defined formally below. The construction of combination estimators for asset price variability (measured by its quadratic variation, QV) differs in an important way from the usual forecast combination problem: the task is complicated by the fact that QV is not observed, even ex post. This means that measuring the accuracy of a given estimator of QV, or constructing a combination estimator that is as accurate as possible, has to be done using proxies (or, in our case, instruments) for the true latent QV. Our theoretical work extends the data-based method for estimating the relative accuracy of realised measures suggested by Patton (2008) to allow the estimation of optimal combination weights, or optimal combination functional forms more generally. Our methods use the time series aspect of the data (i.e., they are "large T"), which enables us to avoid making strong assumptions about the underlying price process, but at the cost of having to employ some assumptions (such as standard mixing and moment conditions) to ensure that a central limit theorem can be invoked.

The main contribution of this paper is to apply our combination methods to a collection of 30 different realised measures, across 6 distinct classes of estimators, estimated on high frequency data on IBM over the period 1996-2007. We present results on the ranking of the individual estimators and "simple" combination estimators such as the arithmetic mean, the geometric mean and the median, over the full sample period and over three sub-samples (1996-1999, 2000-2003, 2004-2007), using two distance measures, the mean squared error (MSE) and the QLIKE distance measure described below. We use the step-wise hypothesis testing method of Romano and Wolf (2005) to find the estimators that are significantly better (and significantly worse) than simple daily squared returns, the standard 5-minute realised volatility estimator, or a simple equally-weighted average of all estimators. We also use the "model confidence set" (MCS) of Hansen, Lunde and Nason (2005) to find the set of estimators that is not significantly different from the best estimator. We find that the MCS contains 4 to 10 estimators (for QLIKE and MSE respectively), and in both cases includes a simple combination estimator. Furthermore, using the Romano-Wolf test, we find that no realised measure significantly out-performs the simple average in the full sample, or in any sub-sample, while many significantly under-perform the simple average estimator.

We also estimate optimal linear and multiplicative combination estimators, under both MSE and QLIKE, and examine which individual realised measures enter significantly into the optimal combination forecast, or enter with non-zero weight in an optimal constrained forecast. We find that weight is given to a variety of realised measures, including both simple and more sophisticated estimators. Importantly, we find that none of the individual estimators encompasses the information in all other estimators, providing further support for the use of combination realised measures. Overall our results suggest that there are benefits from combining the information contained in the array of different volatility estimators proposed in the literature to date.

2 Combining realised measures

2.1 Notation

The latent target variable, generally the quadratic variation (QV) or integrated variance (IV) of an asset price process, is denoted $\theta_t$. We assume that $\theta_t$ is an $\mathcal{F}_t$-measurable scalar, where $\mathcal{F}_t$ is the information set generated by the complete path of the log-price process. The estimators ("realised measures" or "realised volatility estimators") of $\theta_t$ are denoted $X_{i,t}$, $i = 1, 2, \ldots, n$. In addition to being estimators with different functional forms, these may include the same type of estimator applied to data sampled at different frequencies (e.g., standard RV estimated on 1-minute or 5-minute data). Let $g(\mathbf{X}_t; \mathbf{w})$ denote a parametric combination estimator, where $\mathbf{w}$ is a finite-dimensional vector of parameters to be estimated from the data.

Defining an "optimal" combination estimator requires a measure of accuracy for a given estimator. Two popular measures in the volatility literature are the MSE and QLIKE measures:

$$\text{MSE:} \quad L(\theta, X) = (\theta - X)^2 \tag{1}$$

$$\text{QLIKE:} \quad L(\theta, X) = \frac{\theta}{X} - \log\left(\frac{\theta}{X}\right) - 1 \tag{2}$$

The QLIKE distance measure is a simple modification of the familiar Gaussian log-likelihood, with the modification being such that the minimum distance of zero is obtained when $X = \theta$. Our result below will be shown to hold for a more general class of distance measures, namely the class of "robust" pseudo-distance measures proposed in Patton (2006):

$$L(\theta, X) = \tilde{C}(X) - \tilde{C}(\theta) + C(X)(\theta - X) \tag{3}$$

with $C$ being some function that is decreasing and twice-differentiable on the supports of both $\theta$ and $X$, and where $\tilde{C}$ is the anti-derivative of $C$. In this class each pseudo-distance measure $L$ is completely determined by the choice of $C$, and MSE and QLIKE are obtained (up to location and scale constants) when $C(z) = -z$ and $C(z) = 1/z$ respectively. Finally, it is convenient to introduce the following quantities:

$$L^*(\mathbf{w}) \equiv E\left[L(\theta_t, g(\mathbf{X}_t; \mathbf{w}))\right] \tag{4}$$

$$\tilde{L}^*(\mathbf{w}) \equiv E\left[L(Y_t, g(\mathbf{X}_t; \mathbf{w}))\right] \tag{5}$$

$$\bar{L}_T(\mathbf{w}) \equiv \frac{1}{T}\sum_{t=1}^{T} L(Y_t, g(\mathbf{X}_t; \mathbf{w})) \tag{6}$$

where the dependence of $\bar{L}_T$, $\tilde{L}^*$ and $L^*$ on the function $g$ is suppressed for simplicity. $Y_t$ is an observable proxy for the latent $\theta_t$, and is further discussed in the following section.
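The distance measures of this section can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and the generic class in equation (3) is coded by passing $C$ and its anti-derivative as callables. With $C(z) = -z$ the generic form gives one half of the MSE distance, and with $C(z) = 1/z$ it gives QLIKE exactly, which matches the "up to location and scale constants" statement above.

```python
import math

def robust_loss(theta, x, C, C_tilde):
    """Equation (3): C~(X) - C~(theta) + C(X) * (theta - X)."""
    return C_tilde(x) - C_tilde(theta) + C(x) * (theta - x)

def mse_loss(theta, x):
    """Equation (1)."""
    return (theta - x) ** 2

def qlike_loss(theta, x):
    """Equation (2); requires theta > 0 and x > 0."""
    ratio = theta / x
    return ratio - math.log(ratio) - 1.0

def average_loss(loss, proxy, estimates):
    """Sample average of the loss, as in equation (6), for fixed estimates."""
    return sum(loss(y, x) for y, x in zip(proxy, estimates)) / len(proxy)

# C(z) = -z with anti-derivative -z^2/2 reproduces MSE up to a scale of 1/2.
half_mse = robust_loss(2.0, 1.5, lambda z: -z, lambda z: -z * z / 2.0)
# C(z) = 1/z with anti-derivative log(z) reproduces QLIKE.
qlike = robust_loss(2.0, 1.5, lambda z: 1.0 / z, math.log)
```

Both distances are zero when the estimate equals the target, and QLIKE is only defined for strictly positive arguments.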

2.2 Estimating optimal combinations of realised measures

In this section we provide the theory underlying the estimation of optimal combination estimators, building on Patton (2008), who considered rankings of realised measures. We will denote a generic combination estimator as $g(\mathbf{X}_t; \mathbf{w})$. A concrete example of a combination estimator is the linear combination:

$$g^{L}(\mathbf{X}_t; \mathbf{w}) = \omega_0 + \sum_{i=1}^{n} \omega_i X_{it} \tag{7}$$

In volatility applications multiplicative forecast combinations are also used:

$$g^{M}(\mathbf{X}_t; \mathbf{w}) = \omega_0 \times \prod_{i=1}^{n} X_{it}^{\omega_i} \tag{8}$$
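As a small illustration of equations (7) and (8), the two functional forms can be written as follows. The function names and the convention that `w[0]` holds the intercept (or scale) term are assumptions made for this sketch.

```python
import math

def g_linear(x, w):
    """Equation (7): w[0] + sum_i w[i] * x[i-1]."""
    return w[0] + sum(wi * xi for wi, xi in zip(w[1:], x))

def g_multiplicative(x, w):
    """Equation (8): w[0] * prod_i x[i-1] ** w[i]; x should be positive."""
    return w[0] * math.prod(xi ** wi for wi, xi in zip(w[1:], x))

# Two realised measures for one day (illustrative numbers only).
x_day = [1.0, 4.0]
arithmetic = g_linear(x_day, [0.0, 0.5, 0.5])         # equal weights, no intercept
geometric = g_multiplicative(x_day, [1.0, 0.5, 0.5])  # equal exponents, unit scale
```

With a zero intercept and weights $1/n$ the linear form is the arithmetic mean, and with unit scale and exponents $1/n$ the multiplicative form is the geometric mean, so both simple averages are special cases of these parametric families.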

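We will define the optimal combination parameter, $\mathbf{w}^*$, the feasible optimal combination parameter, $\tilde{\mathbf{w}}^*$, and the estimated combination parameter, $\hat{\mathbf{w}}^*_T$, as follows:

$$\mathbf{w}^* \equiv \arg\min_{\mathbf{w} \in \mathcal{W}} L^*(\mathbf{w}), \qquad \tilde{\mathbf{w}}^* \equiv \arg\min_{\mathbf{w} \in \mathcal{W}} \tilde{L}^*(\mathbf{w}), \qquad \hat{\mathbf{w}}^*_T \equiv \arg\min_{\mathbf{w} \in \mathcal{W}} \bar{L}_T(\mathbf{w}) \tag{9}$$

where $\mathcal{W}$ is the parameter space and $L^*$, $\tilde{L}^*$, and $\bar{L}_T$ are defined in equations (4) to (6).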

The difficulty in estimating optimal combinations of realised measures lies in the fact that quadratic variation is not observable, even ex post. Thus measuring the accuracy of a given estimator, or constructing a combination estimator that is as accurate as possible, is not straightforward. This is related to the problem of volatility forecast evaluation and comparison, where the target variable is also unobservable. Andersen and Bollerslev (1998), Meddahi (2001), Andersen, Bollerslev and Meddahi (2005), Hansen and Lunde (2006b), Patton (2006), and Patton and Sheppard (2008) discuss this problem in volatility forecasting. Unfortunately, the methods developed for volatility forecasting are not directly applicable to the problem of evaluating realised measure accuracy, or the construction of combinations of realised measures, due to a change in the information set that is used: in forecasting applications, the estimate of $\theta_t$ will be based on $\mathcal{F}_{t-1}$, while in realised volatility estimation applications, the estimate of $\theta_t$ will be based on $\mathcal{F}_t$. As discussed in Patton (2008), this subtle change in information sets forces a substantial change in methods to compare and combine realised measures. Ignoring this change leads to combination estimators that are biased and inconsistent. These problems arise because the error in the proxy for $\theta_t$ is correlated with the error in the estimator, a problem that does not arise, under basic assumptions, in volatility forecasting applications.

Following Patton (2008), we will consider the case that an unbiased proxy for $\theta_t$ is known to be available. An example of this is the daily squared return, which can plausibly be assumed to be free from microstructure and other biases, and so is an unbiased, albeit noisy, estimator of QV.

Assumption P1: $\tilde{\theta}_t = \theta_t + \nu_t$, with $E[\nu_t \mid \mathcal{F}_{t-1}, \theta_t] = 0$.

In order to break the problem of correlated measurement errors in $\tilde{\theta}_t$ and in $X_{it}$ identified in Patton (2008), we will use a lead, or combination of leads, of $\tilde{\theta}_t$ in the estimation of the optimal combination weights. Denoting this as $Y_t$, we make the following assumption:

Assumption P2: $Y_t = \sum_{j=1}^{J} \lambda_j \tilde{\theta}_{t+j}$, where $1 \le J < \infty$, $\lambda_j \ge 0 \;\forall\, j$ and $\sum_{j=1}^{J} \lambda_j = 1$.

Finally, we need an assumption about the dynamics of the target variable $\theta_t$. Patton (2008) considers two assumptions here, that $\theta_t$ follows a (possibly heteroskedastic) random walk, or that $\theta_t$ follows a stationary AR(p) process. For the high frequency IBM data studied below, Patton (2008) found that the random walk approximation was satisfactory, and so for simplicity we will focus on that case; the extension to the AR(p) approximation is straightforward.

Assumption T1: $\theta_t = \theta_{t-1} + \eta_t$, with $E[\eta_t \mid \mathcal{F}_{t-1}] = 0$.
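The instrument in Assumption P2 is a convex combination of leads of the noisy proxy. A minimal sketch of its construction is given below; the function name and the list-based data layout are hypothetical, and the last $J$ days are dropped because their leads are not available.

```python
def lead_proxy(theta_tilde, weights):
    """Y_t = sum_{j=1..J} weights[j-1] * theta_tilde[t+j] (Assumption P2).

    theta_tilde: daily values of the unbiased proxy, in time order.
    weights:     non-negative lead weights that sum to one.
    Returns a list of length len(theta_tilde) - J.
    """
    J = len(weights)
    if J < 1 or any(l < 0 for l in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be non-negative and sum to one")
    T = len(theta_tilde) - J
    return [
        sum(l * theta_tilde[t + j] for j, l in enumerate(weights, start=1))
        for t in range(T)
    ]

proxy = [1.0, 2.0, 3.0, 4.0]
one_lead = lead_proxy(proxy, [1.0])        # Y_t = proxy at t+1
two_leads = lead_proxy(proxy, [0.5, 0.5])  # average of proxy at t+1 and t+2
```

Under the random walk approximation T1, the lead remains conditionally unbiased for $\theta_t$, while its measurement error is no longer the same-day error that contaminates $X_{it}$.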

To obtain the asymptotic distribution of the estimated optimal combination weights we require assumptions sufficient for a central limit theorem to hold. Several different sets of assumptions may be employed here; we use high-level assumptions from Gallant and White (1988), and refer the interested reader to that book for more primitive assumptions that may be used for non-linear, dynamic m-estimation problems such as ours. See Davidson and MacKinnon (1993) for a concise and very readable overview of asymptotic normality for m-estimators.

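Assumption A1(a): $\bar{L}_T(\mathbf{w}) - \tilde{L}^*(\mathbf{w}) \xrightarrow{p} 0$ uniformly on $\mathcal{W}$.

Assumption A1(b): $\tilde{L}^*(\mathbf{w})$ has a unique minimiser $\tilde{\mathbf{w}}^*$.

Assumption A1(c): $g$ is twice continuously differentiable with respect to $\mathbf{w}$.

Assumption A1(d): Let $A_T(\mathbf{w}) \equiv \nabla_{\mathbf{ww}} \bar{L}_T(\mathbf{w})$, then $A_T(\mathbf{w}) - A(\mathbf{w}) \xrightarrow{p} 0$ uniformly on $\mathcal{W}$, where $A(\mathbf{w})$ is a finite positive definite matrix of constants for all $\mathbf{w} \in \mathcal{W}$.

Assumption A1(e): $T^{-1/2} \sum_{t=1}^{T} \nabla_{\mathbf{w}} L(Y_t, g(\mathbf{X}_t; \mathbf{w})) \xrightarrow{D} N(0, B(\mathbf{w}))$, where $B(\mathbf{w})$ is finite and positive definite for all $\mathbf{w} \in \mathcal{W}$.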

With the above assumptions in hand, we now present our main theoretical result.

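Proposition 1 If the distance measure $L$ is a member of the class in equation (3) and if $\tilde{\mathbf{w}}^*$ is interior to $\mathcal{W}$, then under assumptions P1, P2, T1 and A1, we have:

$$\hat{V}_T^{-1/2} \sqrt{T} \left(\hat{\mathbf{w}}^*_T - \mathbf{w}^*\right) \xrightarrow{D} N(0, I)$$

where

$$\hat{V}_T \equiv \hat{A}_T^{-1} \hat{B}_T \hat{A}_T^{-1}, \qquad \hat{A}_T \equiv \frac{1}{T} \sum_{t=1}^{T} \nabla_{\mathbf{ww}} L(Y_t, g(\mathbf{X}_t; \hat{\mathbf{w}}^*_T)), \qquad B_T \equiv V\left[\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \nabla_{\mathbf{w}} L(Y_t, g(\mathbf{X}_t; \hat{\mathbf{w}}^*_T))\right]$$

and $\hat{B}_T$ is some symmetric and positive definite estimator of $B_T$ such that $\hat{B}_T - B_T \xrightarrow{p} 0$.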

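Proof. We first show that $\mathbf{w}^* = \tilde{\mathbf{w}}^*$. This part of the proof is a corollary to Proposition 2(a) of Patton (2008). Consider a second-order mean-value expansion of $L(Y_t, g(\mathbf{X}_t; \mathbf{w}))$ around $\theta_t$:

$$L(Y_t, g(\mathbf{X}_t; \mathbf{w})) = L(\theta_t, g(\mathbf{X}_t; \mathbf{w})) + \frac{\partial L(\theta_t, g(\mathbf{X}_t; \mathbf{w}))}{\partial \theta} (Y_t - \theta_t) + \frac{1}{2} \frac{\partial^2 L(\ddot{\theta}_t, g(\mathbf{X}_t; \mathbf{w}))}{\partial \theta^2} (Y_t - \theta_t)^2$$

where $\ddot{\theta}_t = \delta_t Y_t + (1 - \delta_t)\theta_t$ for some $\delta_t \in [0, 1]$. Under assumptions P1, P2 and T1, Patton (2008) shows that the second term in this expansion has mean zero, and so we obtain:

$$E\left[L(Y_t, g(\mathbf{X}_t; \mathbf{w}))\right] = E\left[L(\theta_t, g(\mathbf{X}_t; \mathbf{w}))\right] + \frac{1}{2} E\left[\frac{\partial^2 L(\ddot{\theta}_t, g(\mathbf{X}_t; \mathbf{w}))}{\partial \theta^2} (Y_t - \theta_t)^2\right]$$

Distance measures in the class in equation (3) yield $\partial^2 L(\theta, X)/\partial \theta^2 = -C'(\theta)$, and so

$$E\left[L(Y_t, g(\mathbf{X}_t; \mathbf{w}))\right] = E\left[L(\theta_t, g(\mathbf{X}_t; \mathbf{w}))\right] - \frac{1}{2} E\left[C'(\ddot{\theta}_t)(Y_t - \theta_t)^2\right]$$

Notice that the second term above does not depend on $\mathbf{w}$; thus the parameter that minimises $E[L(Y_t, g(\mathbf{X}_t; \mathbf{w}))]$ is the same as that which minimises $E[L(\theta_t, g(\mathbf{X}_t; \mathbf{w}))]$. Thus $\tilde{\mathbf{w}}^* = \mathbf{w}^*$.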

Next we obtain the asymptotic distribution of $\hat{\mathbf{w}}^*_T$. This part of the proof uses standard results from m-estimation theory, see Gallant and White (1988) for example. Under assumptions A1(a) and A1(b), Theorem 3.3 of Gallant and White (1988) yields $\hat{\mathbf{w}}^*_T - \tilde{\mathbf{w}}^* \xrightarrow{p} 0$. Combining this with the fact that $\tilde{\mathbf{w}}^* = \mathbf{w}^*$ yields consistency of $\hat{\mathbf{w}}^*_T$ for the parameter of interest: $\hat{\mathbf{w}}^*_T - \mathbf{w}^* \xrightarrow{p} 0$. Under Assumption A1, Theorem 5.1 of Gallant and White (1988) yields the asymptotic normality of $\hat{\mathbf{w}}^*_T$, centered around $\tilde{\mathbf{w}}^*$. Combining this with $\tilde{\mathbf{w}}^* = \mathbf{w}^*$ from above yields the desired result.

The above proposition shows that it is possible to consistently estimate the optimal combination weights from the data, by employing a "robust" loss function of the form in equation (3) and using a lead (or combination of leads) of a conditionally unbiased proxy for $\theta_t$. This proposition further shows how to compute standard errors on these estimated optimal weights. The use of a proxy, $Y_t$, for the true quadratic variation, $\theta_t$, means that these standard errors will generally be larger than those that would be obtained if $\theta_t$ were observable, but nevertheless these standard errors can be estimated using the expressions above.
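To make the sandwich form concrete, the following sketch works through the simplest case we can think of, which is an assumption of this example rather than a specification used in the paper: two realised measures combined as $g = w X_1 + (1 - w) X_2$ under MSE. The minimiser of the sample loss then has a closed form, the Hessian term is $A = \frac{2}{T}\sum_t d_t^2$ with $d_t = X_{1t} - X_{2t}$, and the score is $-2 d_t e_t$ with $e_t$ the combination error. The function name is hypothetical, and the score variance is estimated without autocorrelation terms; a HAC estimator would be the natural replacement when the scores are serially correlated.

```python
import math

def two_estimator_weight(y, x1, x2):
    """Estimate w in g = w*x1 + (1-w)*x2 by minimising the sample MSE to y.

    y should be the lead proxy Y_t, aligned day by day with x1 and x2.
    Returns (w_hat, standard_error) using a scalar sandwich variance.
    """
    T = len(y)
    d = [a - b for a, b in zip(x1, x2)]   # d_t = X1_t - X2_t
    u = [yt - b for yt, b in zip(y, x2)]  # Y_t - X2_t
    sdd = sum(di * di for di in d)
    w_hat = sum(di * ui for di, ui in zip(d, u)) / sdd
    resid = [ui - w_hat * di for ui, di in zip(u, d)]
    A = 2.0 * sdd / T                     # second derivative of the average loss
    # Variance of the score -2*d_t*e_t, ignoring serial correlation (assumption).
    B = sum((2.0 * di * ei) ** 2 for di, ei in zip(d, resid)) / T
    V = B / (A * A)                       # sandwich A^{-1} B A^{-1}
    return w_hat, math.sqrt(V / T)

x1 = [1.0, 2.0, 4.0, 3.0]
x2 = [2.0, 1.0, 3.0, 5.0]
y_exact = [0.3 * a + 0.7 * b for a, b in zip(x1, x2)]
w_hat, se = two_estimator_weight(y_exact, x1, x2)
```

In this noise-free illustration the weight of 0.3 is recovered with a standard error of essentially zero; with a noisy proxy the same code returns a larger standard error, in line with the remark above.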

3 Empirical application

3.1 Data description

In this section we consider the problem of estimating the quadratic variation of the open-to-close (9:45am to 4pm) continuously-compounded return on IBM, using a variety of different estimators and sampling frequencies. We use data on NYSE trade prices from the TAQ database over the period from January 1996 to June 2007, yielding a total of 2893 daily observations³. Figure 1 reveals that the sample includes periods of rising prices and moderate-to-high volatility (roughly 1996-1999), of slightly falling prices and relatively high volatility (roughly 2000-2003), and of stable prices and relatively low volatility (2004-2007). In addition to considering the full sample estimates of optimal combination estimators, we will consider the results in each of these three sub-samples.

[ INSERT FIGURE 1 HERE ]

³ We use trade prices from the NYSE only, between 9:45am and 4:00pm, with a g127 code of 0 or 40, a corr code of 0 or 1, positive size, and cond not equal to "O", "Z", "B", "T", "L", "G", "W", "J", or "K". Further, the data were cleaned for outliers and related problems: prices of zero were dropped, as were prices that suggested a single-trade change of more than 10% relative to the opening price. Finally, if more than one price was observed with the same time stamp then we used the median of these prices. See Barndorff-Nielsen, et al. (2008b) for a discussion of cleaning high frequency data.
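The price-level cleaning steps in this footnote can be sketched as below. This is one possible reading, not the authors' code: the function name is hypothetical, the TAQ field filters (g127, corr, cond, size) are not reproduced, and the outlier rule is interpreted as dropping a trade whose change from the previously retained price exceeds 10% of the day's opening price.

```python
from statistics import median

def clean_trades(trades, max_move=0.10):
    """trades: list of (timestamp, price) for one day, in time order."""
    # Step 1: drop prices of zero (or below).
    positive = [(ts, p) for ts, p in trades if p > 0]
    if not positive:
        return []
    open_price = positive[0][1]
    # Step 2: drop single-trade moves larger than max_move * opening price
    # (interpretation of the rule in the footnote; an assumption).
    kept, last = [], open_price
    for ts, p in positive:
        if abs(p - last) > max_move * open_price:
            continue
        kept.append((ts, p))
        last = p
    # Step 3: one price per time stamp, using the median of simultaneous trades.
    by_time = {}
    for ts, p in kept:
        by_time.setdefault(ts, []).append(p)
    return [(ts, median(ps)) for ts, ps in sorted(by_time.items())]

raw = [(1, 100.0), (2, 0.0), (3, 150.0), (4, 101.0), (4, 103.0), (4, 102.0), (5, 102.5)]
cleaned = clean_trades(raw)
```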


3.2 Description of the individual estimators

The motivation for our study of realised measures is that the varied forms of realised measures that have been proposed in the literature to date, and the different pieces of information captured by each, may allow for the construction of combination estimators that out-perform any given individual estimator. With this in mind, we consider a large collection of different realised measures. In the implementation of each of these estimators, we follow the implementation of the authors of the original paper as closely as possible (and in most cases, exactly). We omit detailed definitions and descriptions of each estimator in the interests of space and instead refer the interested reader to the original papers.

We firstly consider standard realised variance, defined as:

$$RV_t^{(m)} = \sum_{j=1}^{m} r_{t,j}^2 \tag{10}$$

where $m$ is the number of intra-daily returns used, and $r_{t,j}$ is the $j$th return on day $t$. The number of intra-daily returns used can vary from 22,500 (if we sample prices every second between 9:45am and 4pm) to 1, if we sample just the open and the close price. To keep the number of estimators tractable, and with the high degree of correlation between RV estimators with similar sampling frequencies in mind, we select six sampling frequencies: 1 second, 5 seconds, 1 minute, 5 minutes, 62.5 minutes (which we will abbreviate as "1 hour"), and 1 trade day (375 minutes). The first set of RV estimators is based on prices sampled in "calendar time" using last-price interpolation, meaning that we construct a grid of times between 9:45am and 4pm with the specified number of minutes between each point, and use the most recent price as the one for a given grid point.
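A minimal sketch of calendar-time sampling with last-price interpolation, followed by the realised variance of equation (10), is given below. The function names, the use of seconds as the time unit, and the fallback to the first observed price when no trade has yet occurred at a grid point are assumptions of this illustration.

```python
import math
from bisect import bisect_right

def calendar_time_prices(times, prices, start, end, step):
    """Last-price interpolation on the grid start, start+step, ..., end.

    times: sorted trade times (seconds); prices: matching trade prices.
    """
    n_steps = int(round((end - start) / step))
    grid_prices = []
    for k in range(n_steps + 1):
        grid_time = start + k * step
        idx = bisect_right(times, grid_time) - 1  # most recent trade at or before grid_time
        if idx < 0:
            idx = 0  # assumption: before the first trade, use the first price
        grid_prices.append(prices[idx])
    return grid_prices

def realised_variance(prices):
    """Equation (10): sum of squared log-price changes."""
    logp = [math.log(p) for p in prices]
    return sum((b - a) ** 2 for a, b in zip(logp, logp[1:]))

trade_times = [0, 30, 70, 130]
trade_prices = [100.0, 101.0, 100.0, 102.0]
grid = calendar_time_prices(trade_times, trade_prices, start=0, end=120, step=60)
rv = realised_variance(grid)
```

For the session considered in the text, `start` and `end` would be 9:45am and 4pm expressed in seconds, so that `step=1` yields the 22,500 one-second returns mentioned above and `step=22500` yields the single open-to-close return.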

Next we consider RV estimators computed using the same formula, but with prices sampled in "tick time" (also known as "event time" or "trade time"). In this sampling scheme a price series for each day is constructed by skipping every x trades: this leads to prices that are evenly spaced in "event time", but generally not in calendar time. If the trade arrival rate is correlated with the level of volatility then tick-time sampling produces high-frequency returns which are approximately homoskedastic. Theory suggests that this should improve the accuracy of RV estimation, see Hansen and Lunde (2006a) and Oomen (2006). We consider average sampling frequencies in tick-time that correspond to those used in calendar time: 1 second, 5 seconds, 1 minute, 5 minutes, 62.5 minutes and 375 minutes. The highest and the lowest of these frequencies lead to estimators that are numerically identical to calendar-time RV, and so we drop these from the analysis.
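Tick-time sampling can be sketched as taking every x-th trade, with x chosen so that the average spacing matches a target calendar frequency. The function names, the rounding rule, and the choice to always retain the last trade of the day are assumptions of this sketch.

```python
def trades_per_sample(n_trades, target_seconds, session_seconds=22500):
    """Number of trades to skip so the average spacing is about target_seconds."""
    return max(1, round(n_trades * target_seconds / session_seconds))

def tick_time_prices(prices, skip):
    """Keep every `skip`-th trade price, always retaining the first and last."""
    sampled = prices[::skip]
    if (len(prices) - 1) % skip != 0:
        sampled.append(prices[-1])  # assumption: close the day on the last trade
    return sampled

skip = trades_per_sample(n_trades=9000, target_seconds=60)  # about 1-minute spacing
every_third = tick_time_prices([1, 2, 3, 4, 5, 6, 7], 3)
uneven = tick_time_prices([1, 2, 3, 4, 5, 6], 4)
```

When `skip` equals the number of trades minus one, only the first and last trade prices remain, which is why the lowest tick-time frequency coincides with the open-to-close return used by calendar-time RV.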


We then draw on the work of Bandi and Russell (2006, 2008), who provide a method to estimate the optimal (calendar-time) sampling frequency, for each day, for realised variance in the presence of market microstructure noise. This formula relies on estimates of the variance and kurtosis of the microstructure noise, as well as preliminary estimates of the integrated variance (IV) and integrated quarticity (IQ) of the price process: we follow Bandi and Russell (2008), who also study IBM stock returns, and estimate the moments of the microstructure noise using 1-second returns, and use 15-minute returns to obtain preliminary estimates of the IV and IQ. Bandi and Russell (2008) also propose a bias-corrected realised variance estimator, which removes the estimated impact of the microstructure noise: we consider the Bandi-Russell RV estimator both with ($RV^{BR,bc}$) and without ($RV^{BR}$) this bias correction.

Our second set of realised measures are the "realised range-based variance" (RRV) of Christensen and Podolskij (2007) and Martens and van Dijk (2007). We follow Christensen and Podolskij's implementation of this estimator and use 5-minute blocks. Rather than estimate the number of prices to use within each block from the number of non-zero return changes, as in Christensen and Podolskij (2007), we simply use 1-minute prices within each block, giving us 5 prices per block, compared with around 7 in their application to General Motors stock returns. We implement RRV using both calendar-time and tick-time sampling.

The third set of realised measures are the "quantile-based realised variance" (QRV) of Christensen, Oomen and Podolskij (2008). For implementation we follow Christensen, et al.'s application to Apple stock returns and use quantiles of 0.85, 0.90, and 0.96, and prices sampled every 1 minute, with the number of subintervals ("n" in their notation) set to one. We implement QRV using both calendar-time and tick-time sampling.

The fourth set of realised measures are the "bi-power variation" (BPV) of Barndorff-Nielsen and Shephard (2006). These authors implemented their estimator using 5-minute calendar-time returns; however, this was presumably partially dictated by their data (indicative quotes for the US dollar/German Deutsche mark and US dollar/Japanese yen exchange rates), for which this was the highest frequency. We thus implement BPV at both 5-minute and 1-minute frequencies, using both calendar-time and tick-time sampling⁴.
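For readers unfamiliar with bi-power variation, a minimal sketch is given below. It uses the standard textbook form, $\frac{\pi}{2}\sum_{j=2}^{m} |r_{t,j}||r_{t,j-1}|$, without any finite-sample scaling; whether the implementation used in this paper applies such a scaling is not stated here, so treat the details as assumptions. The function names and the return values are illustrative only.

```python
import math

def bipower_variation(returns):
    """(pi/2) * sum of products of adjacent absolute intra-daily returns."""
    return (math.pi / 2.0) * sum(abs(a) * abs(b) for a, b in zip(returns, returns[1:]))

def realised_variance_from_returns(returns):
    """Sum of squared intra-daily returns, as in equation (10)."""
    return sum(r * r for r in returns)

# Illustrative day: small returns with one large jump in the middle.
day_returns = [0.001] * 4 + [0.05] + [0.001] * 4
bpv = bipower_variation(day_returns)
rv = realised_variance_from_returns(day_returns)
```

Because each large return is multiplied by a small neighbour rather than by itself, the jump contributes far less to BPV than to RV, which is the sense in which BPV targets the continuous part of quadratic variation.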

The fifth set of estimators are the "realised kernels" (RK) of Barndorff-Nielsen, et al. (2008a), BNHLS henceforth. This is a broad class of estimators and we consider several variations. Firstly, we consider RK with the Bartlett kernel, as this estimator was shown by BNHLS to be asymptotically equivalent to the "two-scale" estimator (2SRV) of Zhang, Mykland and Aït-Sahalia (2005). Second, we consider RK with the "cubic" kernel, which was shown to be asymptotically equivalent to the "multi-scale" estimator (MSRV) of Zhang (2006) (and which in turn was shown by Zhang to be more efficient than 2SRV). For both $RK^{bart}$ and $RK^{cubic}$ we consider both 5-tick sampling and 5-second sampling, and in all cases we use the optimal bandwidth for a given kernel provided in BNHLS⁵ ⁶. Next we consider the "modified Tukey-Hanning$_2$" (TH) kernel, following their empirical application to General Electric stock returns. They suggest using 1-minute tick-time sampling with optimal bandwidth, which we implement here. We also consider the TH kernel with bandwidth fixed to equal 5 ($RK^{TH-5bw}$), and the TH kernel with optimal bandwidth applied to 5-tick returns ($RK^{TH,5tick}$). Finally, we consider the "non-flat-top Parzen" kernel of Barndorff-Nielsen, et al. (2008b), which is designed to guarantee non-negativity of the estimator (which is not guaranteed for the other RK estimators considered above). We implement this with their optimal bandwidth formula, and following their application to Alcoa stock returns, we use 1-minute tick-time sampling.

Our sixth and final realised measure is the "alternation" estimator of Large (2005), ALTRV. This estimator was designed for use on assets whose prices can only change by one tick size (e.g., one cent, one penny, one-sixteenth of a dollar, etc.), and with this restriction in mind, Large proposes a novel estimator of the QV of the price series. Our application to IBM stock returns does not satisfy this assumption, and so formally ALTRV is not applicable here, but we implement a modification of this estimator to see whether it nevertheless provides useful information in the optimal combination estimator. Following Large (2005), we implement this on quote prices rather than trade prices. To handle the fact that non-zero absolute IBM price changes are not always of the same size (i.e. 1 tick), on days when there is more than one size for non-zero absolute price changes we use the modal non-zero price change as the "tick", and implement ALTRV as though this were the tick size. To make ALTRV an estimator of the QV of the log-price (as for the other estimators considered here) rather than the price, we scale ALTRV by the squared opening price for each day (this is a form of first-order Taylor series adjustment term). ALTRV uses every quote, and so we do not have to choose a sampling scheme or frequency. We consider ALTRV applied to the mid-quote ($ALTRV^{mq}$), and the average of the ALTRVs applied separately to the bid and the ask prices ($ALTRV^{combo}$).

⁴ It is worth noting that, unlike the other estimators considered in this paper, BPV and QRV estimate the part of quadratic variation due to the continuous semimartingale (that is, the integrated variance, IV) and so ignore the contribution to QV from jumps. If jumps are unpredictable (or less predictable than IV), as found by Andersen, Bollerslev and Diebold (2006), then there may be benefits to using estimators that focus on IV rather than QV in forecasting applications.

⁵ These optimal bandwidths, like the $RV^{BR}$ sampling frequencies, are estimated for each day in the sample and so can change with market conditions, the level of market microstructure noise, etc.

⁶ These optimal bandwidth formulas require estimates of moments of the noise and preliminary estimates of IV and QV. We use 1-second returns to obtain estimates of the moments of the noise as in Bandi and Russell (2008), 10-minute returns to obtain a preliminary estimate of IV and QV, and 1-minute returns to obtain a preliminary RK estimate used to bias-correct the noise estimate. These are slightly different to the choices in the web appendix to Barndorff-Nielsen, et al. (2008a); see http://www.hha.dk/~alunde/BNHLS/BNHLS.htm for further discussion.

In total we have 30 different realised measures, from 6 different classes of estimators, with a variety of sampling frequencies and sampling schemes. To the best of our knowledge, this is the largest collection of realised measures considered in a single empirical study.

Table 1 presents some summary statistics for these 30 estimators. $ALTRV^{mq}$ has the smallest average value, 2.158, and $RV^{tick,5sec}$ has the largest average at 3.314. More familiar estimators, such as $RV^{1day}$ and $RV^{5min}$, have average values of around 2.3, corresponding to 24.1% annualised standard deviation. Whilst $RV^{1day}$ has a reasonable average value, it performs poorly on the other summary statistics: it has the highest standard deviation, skewness, and kurtosis of all 30 estimators. $BPV^{1min}$ is the estimator with the lowest standard deviation, while QRV has the lowest skewness and kurtosis. $RV^{1day}$ is also the only estimator to generate estimates of QV that are not strictly positive: its minimum value is zero, which it attains on 42 days (around 1.5% of the sample). None of the other estimators has a minimum value that is non-positive, including the RK estimators, which do not ensure non-negativity of the estimates.
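The conversion from an average daily variance of about 2.3 to an annualised standard deviation of 24.1% can be checked with a one-line calculation. The sketch assumes that the realised measures are expressed in squared percentage points per day and that a year has 252 trading days; neither convention is stated explicitly in the text, and the function name is hypothetical.

```python
import math

def annualised_vol_pct(daily_variance_pct2, trading_days=252):
    """Annualised standard deviation (in %) from a daily variance (in %^2)."""
    return math.sqrt(trading_days * daily_variance_pct2)

vol = annualised_vol_pct(2.3)  # sqrt(252 * 2.3), roughly 24.07
```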

Table 2 presents a subset of the correlation matrix of these estimators (the complete correlation matrix is available upon request). We present the correlation of each estimator with two standard estimators in the literature ($RV^{1day}$ and $RV^{5min}$), a naïve choice given high frequency data ($RV^{1sec}$), and three of the estimators that ultimately perform well in our empirical analysis ($RV^{tick,1min}$, $BPV^{tick,1min}$ and $RK^{cubic}$). This table shows that these estimators are generally highly correlated (as one would expect); the average correlation across all elements of their correlation matrix is 0.861. This should be kept in mind when interpreting the estimated optimal combination weights in Section 3.4. The highest correlation between any two estimators is between $RK^{tick,cubic}$ and $RK^{TH,5tick}$, which is 0.9992. The lowest correlation between any two estimators is between $RV^{1day}$ and $ALTRV^{combo}$, at 0.425.

[ INSERT TABLES 1 AND 2 HERE ]

3.3 Results using simple combination estimators

In Tables 3 and 4 we present the first set of empirical results of the paper. These tables present the estimated accuracy of each of the estimators using the ranking methodology in Patton (2008). Our preferred measure of accuracy is the QLIKE distance measure, and we also present results for the familiar MSE distance measure. We use the random walk approximation (assumption T1 above), with a one-period lead of $RV^{5min}$ as the instrument for the latent quadratic variation to obtain these estimates.

The ranking method of Patton (2008) can only estimate the accuracy of an estimator relative to some other estimator, and in Tables 3 and 4 we use $RV^{5min}$ as the base estimator; this choice is purely a normalisation and has no effect on the conclusions. Negative values in the first columns of Tables 3 and 4 indicate that a given estimator has lower average MSE or QLIKE distance to the latent QV (i.e., greater accuracy) than $RV^{5min}$, while positive values indicate higher average MSE or QLIKE distance than $RV^{5min}$.

We consider the 30 individual realised measures discussed in the previous section, as well as three simple combination estimators: the equally-weighted arithmetic mean, the equally-weighted geometric mean, and the median, leading to a total of 33 estimators. Under QLIKE the most accurate estimator of QV is the simple RVtick,1min, which is ranked in the top 4 in all three sub-periods. Under MSE, QRVtick is ranked the highest and does well in the first two sub-periods but is ranked 20th in the last sub-period. The top 5 estimators, averaging across the full-sample ranks under both MSE and QLIKE, are QRVtick (first), RKcubic (second), RVBR and QRV (equal third), RVMean and RVMedian (equal fifth).
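The three simple combination estimators can be sketched in a few lines of Python. This is an illustrative sketch only: the function name and the toy input are hypothetical, and the input is assumed to be a days-by-estimators array of strictly positive values.

```python
import numpy as np


def simple_combinations(measures):
    """Return the mean, geometric-mean and median combinations, one value per day.

    `measures` is a (T, n) array of strictly positive realised measures:
    rows are days, columns are individual estimators.
    """
    measures = np.asarray(measures, dtype=float)
    rv_mean = measures.mean(axis=1)
    # Geometric mean computed in logs for numerical stability.
    rv_geo_mean = np.exp(np.log(measures).mean(axis=1))
    rv_median = np.median(measures, axis=1)
    return rv_mean, rv_geo_mean, rv_median


# Hypothetical example: two days, three estimators.
x = np.array([[1.0, 2.0, 4.0],
              [2.0, 2.0, 2.0]])
rv_mean, rv_geo_mean, rv_median = simple_combinations(x)
```

The geometric mean is undefined in logs when a measure is exactly zero on some day (Table 1 reports a minimum of 0.000, to three decimals, for RV1day), so such days would need separate handling that this sketch does not attempt.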

The discussion of rankings of average accuracy is a useful initial look at the results, but a more formal analysis is desirable. We use two approaches. The first is the "model confidence set" (MCS) of Hansen, Lunde and Nason (2005), which was developed to obtain a set of forecasting models that contains the true "best" model out of the entire set of forecasting models with some specified level of confidence. It allows the researcher to identify the sub-set of models that are "not significantly different" from the unknown true best model. Patton (2008) shows that this methodology may be adapted to the problem of identifying the most accurate realised measures, under the assumptions discussed in Section 2. The last four columns of Tables 3 and 4 show the results of the MCS procedure on the full sample and on three sub-samples⁷. Under QLIKE the full-sample MCS, at the 90% confidence level, contains just 4 estimators: RVtick,1min; QRVtick; RKcubic; and RVMean. Under MSE distance the MCS contains 10 estimators: RV1min; RVtick,1min; RVBR; QRV; QRVtick; BPV1min; BPVtick,1min; RKtick,bart; RKcubic; and RVMedian. Thus under both distance measures a simple combination estimator appears in the MCS. Interestingly, it is a different combination estimator depending on the distance measure: under MSE, which is more sensitive to outliers, the preferred simple combination estimator is the median, which is a more robust combination estimator; see Stock and Watson (2001) for example. Under QLIKE, which is less sensitive to large over-predictions and is invariant to the level of volatility (see Patton and Sheppard (2008)), the preferred combination estimator is the mean.


The second formal analysis of the individual estimators and simple combination estimators uses the stepwise multiple testing method of Romano and Wolf (2005). This method identifies the estimators that are significantly better, or significantly worse, than a given benchmark estimator, while controlling the family-wise error rate of the complete set of hypothesis tests. We consider three choices of benchmark estimator: RV1day, which is the standard estimator in the absence of high frequency data; RV5min, which is based on a rule-of-thumb from earlier papers in the RV literature (see Andersen, et al. (2001b) and Barndorff-Nielsen and Shephard (2002) for example); and RVMean, which is the standard simple combination estimator. The results of these tests are presented in Tables 5 and 6 for MSE and QLIKE distance respectively⁸.

The results of the Romano-Wolf method under QLIKE distance reveal some interesting patterns. Firstly, at the 10% level of significance, every estimator significantly outperforms RV1day, in the full sample and in all three sub-samples. This is clearly a strong signal that high frequency data, when used in any one of the 32 other estimators in this study, yields more precise estimates of QV than daily data can. When RV5min is taken as the benchmark we see more variation in the results: some estimators are significantly better, others significantly worse and some not significantly different. Broadly stated, the estimators that out-perform RV5min include RV sampled at the 1-minute frequency (either in tick time or calendar time), RRV and QRV, RK with the Bartlett, cubic or Tukey-Hanning kernel (when sampled every 5 ticks or 5 seconds) and the three combination estimators. The estimators that under-perform RV5min include RV sampled at the 1 hour or 1 day frequency, RKTH sampled at the 1 minute frequency, and ALTRV (though recall that the latter estimator was not designed for a stock like IBM).

⁷ The MCS is implemented via a bootstrap re-sampling scheme. We use Politis and Romano's (1994) stationary bootstrap with an average block length of 10 days and 1000 bootstrap replications for each test.

⁸ The Romano-Wolf testing method is also implemented using Politis and Romano's (1994) stationary bootstrap with an average block length of 10 days, and we again use 1000 bootstrap replications for each test.

Finally, when RVMean is taken as the benchmark estimator in the Romano-Wolf testing method a very clear conclusion emerges: no estimator significantly out-performs RVMean. Several estimators are not significantly different, but none are significantly better in the full sample or in any of the sub-samples. This is a strong endorsement of using this simple combination estimator in practice⁹.

When using MSE distance there are fewer significant results: all estimators continue to significantly out-perform RV1day, whereas in the comparisons with RV5min or RVMean as the benchmark there are very few rejections of the null hypothesis.
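Both the MCS and the Romano-Wolf procedures used in this section rely on the stationary bootstrap of Politis and Romano (1994), with an average block length of 10 days and 1000 replications. The Python sketch below shows one way to generate such resamples and use them to obtain a bootstrap standard error for an average loss differential. It is an illustration only: the function name and the simulated loss-differential series are hypothetical, and the sketch does not implement the MCS or Romano-Wolf tests themselves.

```python
import numpy as np


def stationary_bootstrap_indices(n_obs, avg_block_length, rng):
    """Draw one stationary-bootstrap resample of the indices 0..n_obs-1.

    Blocks start at a uniformly drawn index and have geometrically
    distributed lengths with mean `avg_block_length`; indices wrap around
    the end of the sample.
    """
    p = 1.0 / avg_block_length              # probability of starting a new block
    indices = np.empty(n_obs, dtype=int)
    indices[0] = rng.integers(n_obs)
    for t in range(1, n_obs):
        if rng.random() < p:
            indices[t] = rng.integers(n_obs)            # start a new block
        else:
            indices[t] = (indices[t - 1] + 1) % n_obs   # continue the current block
    return indices


# Hypothetical loss-differential series standing in for d_t = L_t(A) - L_t(B).
rng = np.random.default_rng(42)
loss_diff = rng.standard_normal(500) + 0.1

n_replications = 1000
boot_means = np.empty(n_replications)
for b in range(n_replications):
    idx = stationary_bootstrap_indices(loss_diff.size, 10, rng)
    boot_means[b] = loss_diff[idx].mean()

# Bootstrap standard error of the average loss differential.
boot_se = boot_means.std(ddof=1)
```

Resampling in blocks of random length preserves short-run dependence in the daily loss differentials while keeping the resampled series stationary, which is the reason this scheme is preferred to an i.i.d. bootstrap for serially correlated data.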


3.4 Results using optimal combination estimators

In this section we present our estimated optimal combination estimators. We consider two standard parametric combination estimators, a linear combination and a multiplicative combination:

$$X_t^{lin} = w_0 + \sum_{i=1}^{n} w_i X_{it}, \quad \text{and} \tag{11}$$

$$X_t^{mult} = \omega_0 \times \prod_{i=1}^{n} X_{it}^{\omega_i}, \tag{12}$$

estimated using Proposition 1 above. We consider both unconstrained combination estimators, which satisfy the condition that the unknown weights all lie in the interior of the parameter space and so permit us to compute standard errors using Proposition 1, and also constrained combination estimators, with the constraint being that all weights must be non-negative. The constrained estimation makes obtaining standard errors difficult (and we do not pursue this here) but provides additional information on the individual estimators that are most useful in a combination estimator.

⁹ In unreported results, we also implemented the Romano-Wolf method with RVGeo-mean and RVMedian as the benchmarks. Under MSE neither of these was ever significantly out-performed, while under QLIKE RVGeo-mean was significantly beaten (amongst others, by RVMean) while RVMedian was never significantly beaten. These results are available upon request.

Table 7 presents the optimal linear combination estimators under MSE distance. Looking first at the constrained combination, we see that many estimators get a weight of zero in the optimal constrained combination. The largest weights are assigned to QRV (in calendar time and tick time), with some weight also given to RVtick,5sec; RKTH; and RKTH,5tick. A small weight (0.0039) is also given to RV1day. In the unconstrained combination the results are broadly similar: some significant weight is attached to QRV (in calendar time and tick time), and to RKTH; RKTH,5tick; and RKtick,cubic. Interestingly, although the ALTRV estimators do not get significant weight in the full sample, they do have significant coefficients in the 1996-1999 sub-sample. This corresponds with the period where the assumption of single-tick price changes is most plausible for IBM: the minimum tick size on the New York Stock Exchange was one-eighth of a dollar until June 24, 1997, and was one-sixteenth of a dollar until January 29, 2001, when decimal pricing began¹⁰.
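To make the unconstrained linear combination under MSE distance concrete, the Python sketch below estimates the weights in a linear combination of the form w0 + sum_i wi Xit by least squares. It rests on an assumption that is not spelled out in this section: that, with the one-period lead of a proxy standing in for the latent QV, minimising the sample average squared distance amounts to regressing the led proxy on a constant and the day-t measures. The function name and the simulated data are hypothetical, and the non-negativity-constrained case (which would need a constrained solver) is not shown.

```python
import numpy as np


def mse_optimal_linear_weights(measures, proxy):
    """Unconstrained linear combination weights under MSE distance.

    Regresses the one-period lead of `proxy` on a constant and the day-t
    realised measures by ordinary least squares. Returns (w0, w), where
    w0 is the intercept and w holds one weight per column of `measures`.
    """
    target = proxy[1:]                                   # proxy led by one day
    regressors = np.column_stack([np.ones(target.size), measures[:-1]])
    coef, *_ = np.linalg.lstsq(regressors, target, rcond=None)
    return coef[0], coef[1:]


# Hypothetical simulated data: one precise and one noisy measure of the same QV.
rng = np.random.default_rng(1)
T = 5000
log_qv = np.zeros(T)
for t in range(1, T):
    log_qv[t] = 0.98 * log_qv[t - 1] + 0.05 * rng.standard_normal()
qv = np.exp(log_qv)
measures = np.column_stack([
    qv * np.exp(0.1 * rng.standard_normal(T)),   # low-noise measure
    qv * np.exp(0.5 * rng.standard_normal(T)),   # high-noise measure
])
proxy = qv * np.exp(0.3 * rng.standard_normal(T))

w0, w = mse_optimal_linear_weights(measures, proxy)
combined = w0 + measures @ w                      # fitted combination estimator
```

Because the equally-weighted average with zero intercept is one point in the parameter space, the fitted combination can never have a larger in-sample average squared distance to the led proxy than the simple mean; whether the improvement is statistically significant is a separate question.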


Table 8 presents the results for optimal linear combinations under QLIKE distance. In the optimal constrained combination estimator we again see substantial weight on QRV, and also on RKNFParzen; and we also see non-zero weights on a collection of simple RVs, with sampling frequencies from 5 seconds up to and including 1 day. As in the optimal MSE combination, the unconstrained combination has few individually significant coefficients, though there are significant coefficients on BPV sampled at the 1-minute frequency, RKTH, and simple RV with sampling frequencies ranging from 1 second to 1 day. Also similar to the optimal MSE combination, we again see that the ALTRV estimators receive significant weight in the first sub-sample but not in the later sub-samples.

Tables 9 and 10 present the estimated combination weights for multiplicative combinations. Perhaps surprisingly, given the change in the functional form of these combination estimators, the results are broadly in line with the optimal linear combination results: the optimal combinations for MSE put substantial weights on QRV; RKTH; RKTH,5tick; and RVtick,5sec, while the optimal combinations for QLIKE again put weight on QRV, RKNFParzen and a collection of simple RVs with sampling frequencies between 5 seconds and 1 day.

¹⁰ Source: New York Stock Exchange web site, http://www.nyse.com/about/history/timeline_chronology_index.html.


Often of interest in the forecasting literature is whether the estimated optimal combination is significantly different from a simple equally-weighted average. If we let $w_i^*$ denote the optimal linear combination weights, and $\omega_i^*$ denote the optimal multiplicative combination weights, the hypotheses of interest are:

$$H_0^{L}: w_0 = 0 \,\cap\, w_1 = w_2 = \dots = w_n = 1/n \tag{13}$$

$$\text{vs. } H_a^{L}: w_0 \neq 0 \,\cup\, w_i \neq 1/n \text{ for some } i = 1, 2, \dots, n$$

$$\text{and } H_0^{M}: \omega_0 = 1 \,\cap\, \omega_1 = \omega_2 = \dots = \omega_n = 1/n \tag{14}$$

$$\text{vs. } H_a^{M}: \omega_0 \neq 1 \,\cup\, \omega_i \neq 1/n \text{ for some } i = 1, 2, \dots, n$$

Using Proposition 1 these hypotheses can be tested using Wald tests, and we find on the full sample of data that the null that the optimal combination is an equally-weighted combination can be rejected with a p-value of less than 0.01 in all four cases (linear and multiplicative combinations, under MSE and QLIKE distance). Thus, while Section 3.3 revealed that the simple mean was not significantly beaten by any individual estimator, it can still be improved: an optimally formed linear combination is significantly more accurate than an equally-weighted average.
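For concreteness, a Wald statistic for the linear null hypothesis in (13) might be written as follows. This is a sketch under assumed notation rather than the paper's own expression: $\hat{\theta}$ stacks the estimated weights, $\theta_0$ stacks their values under the null, $T$ is the sample size, and $\hat{V}$ stands for a consistent estimate of the asymptotic covariance matrix of $\sqrt{T}(\hat{\theta} - \theta)$ of the kind delivered by Proposition 1, which is stated outside this section.

```latex
\hat{\theta} = (\hat{w}_0, \hat{w}_1, \dots, \hat{w}_n)', \qquad
\theta_0 = (0, 1/n, \dots, 1/n)', \\
W_T = T\,(\hat{\theta} - \theta_0)'\,\hat{V}^{-1}\,(\hat{\theta} - \theta_0)
\;\xrightarrow{d}\; \chi^2_{n+1} \quad \text{under } H_0^{L}.
```

The null imposes $n+1$ restrictions, which gives the degrees of freedom; the multiplicative case in (14) is analogous with $\theta_0 = (1, 1/n, \dots, 1/n)'$.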

Finally, we conduct a set of tests related to the idea of forecast encompassing, see Chong and Hendry (1986) and Fair and Shiller (1990). We test the null hypothesis that a single realised measure ($i$) encompasses the information in all other estimators:

$$H_0^{L,i}: w_i = 1 \,\cap\, w_j = 0 \;\; \forall\, j \neq i \tag{15}$$

$$\text{vs. } H_a^{L,i}: w_i \neq 1 \,\cup\, w_j \neq 0 \text{ for some } j \neq i$$

$$\text{and } H_0^{M,i}: \omega_0 = \omega_i = 1 \,\cap\, \omega_j = 0 \;\; \forall\, j \neq 0, i \tag{16}$$

$$\text{vs. } H_a^{M,i}: \omega_i \neq 1 \,\cup\, \omega_0 \neq 1 \,\cup\, \omega_j \neq 0 \text{ for some } j \neq 0, i$$

for $i = 1, 2, \dots, n$. We find that the null hypothesis is rejected for every single estimator, under both MSE and QLIKE distance, for both linear and multiplicative combinations, with all p-values less than 0.01. This is strong evidence that there are gains to considering combination estimators of quadratic variation: no single estimator dominates all others. This result is new to the realised volatility literature, but is probably not surprising to those familiar with forecasting in practice.


4 Summary and conclusion

Recent advances in financial econometrics have led to the development of new estimators of asset price variability using high frequency price data. These estimators are based on a variety of different assumptions and take many different functional forms. Motivated by the success of combination forecasts over individual forecasts in a range of forecasting applications, see Clemen (1989) and Timmermann (2006) for example, this paper sought to answer the following question: do combinations of individual estimators offer accuracy gains relative to individual estimators? The answer is a resounding "yes".

This paper presents a novel method for combining individual realised measures to form new estimators of price variability, overcoming the obstacle that the quantity of interest is not observable, even ex post. We applied this method to a collection of 30 different realised measures, across 6 distinct classes of estimators, estimated on high frequency IBM price data over the period 1996-2007. Using the Romano-Wolf (2005) test, we find that no individual realised measure significantly out-performs, in terms of average accuracy, a simple equally-weighted average. Further, we find that none of the individual estimators encompasses the information in all other estimators, providing further support for the use of combination realised measures. Overall our results suggest that there are indeed benefits from combining the information contained in the array of different volatility estimators proposed in the literature to date.

Several extensions of our analysis are possible. Firstly, it may be interesting, both economically and statistically, to try to understand why certain estimators out-perform, or enter with greater weight into the optimal combination estimator than others. It may be that certain market conditions (e.g., high vs. low volatility or small vs. large bid-ask spreads) favour one class of estimator relative to another, and identifying these may aid in the construction of improved estimators. Secondly, it may be beneficial to consider time-varying combinations of realised measures, along the lines of Elliott and Timmermann (2005), rather than the static combination estimators considered in this paper.


References

[1] Aït-Sahalia, Y., P. Mykland, and L. Zhang, 2005, How Often to Sample a Continuous-Time Process in the Presence of Market Microstructure Noise, Review of Financial Studies, 18, 351-416.

[2] Andersen, T.G., and T. Bollerslev, 1998, Answering the Skeptics: Yes, Standard Volatility Models do Provide Accurate Forecasts, International Economic Review, 39, 885-905.

[3] Andersen, T.G., T. Bollerslev, and N. Meddahi, 2005, Correcting the Errors: Volatility Forecast Evaluation Using High-Frequency Data and Realized Volatilities, Econometrica, 73(1), 279-296.

[4] Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys, 2001a, The Distribution of Realized Exchange Rate Volatility, Journal of the American Statistical Association, 96, 42-55.

[5] Andersen, T.G., T. Bollerslev, F.X. Diebold, and H. Ebens, 2001b, The Distribution of Realized Stock Return Volatility, Journal of Financial Economics, 61, 43-76.

[6] Andersen, T.G., T. Bollerslev, F.X. Diebold, and P. Labys, 2003, Modeling and Forecasting Realized Volatility, Econometrica, 71, 579-626.

[7] Andersen, T.G., T. Bollerslev, and F.X. Diebold, 2007, Roughing It Up: Including Jump Components in the Measurement, Modeling and Forecasting of Return Volatility, Review of Economics and Statistics, 89, 701-720.

[8] Andersen, T.G., T. Bollerslev, P.F. Christoffersen, and F.X. Diebold, 2006, Volatility and Correlation Forecasting, in the Handbook of Economic Forecasting, G. Elliott, C.W.J. Granger and A. Timmermann eds., North Holland Press, Amsterdam.

[9] Bandi, F.M., and J.R. Russell, 2006, Separating Microstructure Noise from Volatility, Journal of Financial Economics, 79, 655-692.

[10] Bandi, F.M., and J.R. Russell, 2008, Microstructure Noise, Realized Variance, and Optimal Sampling, Review of Economic Studies, 75, 339-369.

[11] Barndorff-Nielsen, O.E., and N. Shephard, 2002, Econometric Analysis of Realized Volatility and its use in Estimating Stochastic Volatility Models, Journal of the Royal Statistical Society, Series B, 64, 253-280.

[12] Barndorff-Nielsen, O.E., and N. Shephard, 2004, Econometric Analysis of Realized Covariation: High Frequency Based Covariance, Regression and Correlation in Financial Economics, Econometrica, 72, 885-925.

[13] Barndorff-Nielsen, O.E., and N. Shephard, 2006, Econometrics of Testing for Jumps in Financial Economics using Bipower Variation, Journal of Financial Econometrics, 4, 1-30.

[14] Barndorff-Nielsen, O.E., and N. Shephard, 2007, Variation, jumps, market frictions and high frequency data in financial econometrics, in Advances in Economics and Econometrics. Theory and Applications, Ninth World Congress, R. Blundell, T. Persson and W.K. Newey eds., Econometric Society Monographs, Cambridge University Press, 328-372.


[15] Barndorff-Nielsen, O.E., P.R. Hansen, A. Lunde, and N. Shephard, 2008a, Designing Realised Kernels to Measure the Ex-Post Variation of Equity Prices in the Presence of Noise, Econometrica, forthcoming.

[16] Barndorff-Nielsen, O.E., P.R. Hansen, A. Lunde, and N. Shephard, 2008b, Realized Kernels in Practice, Econometrics Journal, forthcoming.

[17] Bates, J.M., and C.W.J. Granger, 1969, The Combination of Forecasts, Operations Research Quarterly, 20, 451-468.

[18] Chong, Y.Y., and D.F. Hendry, 1986, Econometric Evaluation of Linear Macroeconomic Models, Review of Economic Studies, 53, 671-690.

[19] Christensen, K., and M. Podolskij, 2007, Realized range-based estimation of integrated variance, Journal of Econometrics, 141, 323-349.

[20] Christensen, K., R.C.A. Oomen, and M. Podolskij, 2008, Realised Quantile-Based Estimation of the Integrated Variance, working paper.

[21] Clemen, R.T., 1989, Combining Forecasts: A Review and Annotated Bibliography, International Journal of Forecasting, 5, 559-583.

[22] Clements, M.P., and D.F. Hendry, 1998, Forecasting Economic Time Series, Cambridge University Press, Cambridge.

[23] Davidson, R., and J.G. MacKinnon, 1993, Estimation and Inference in Econometrics, Oxford University Press, Oxford.

[24] Elliott, G., and A. Timmermann, 2005, Optimal Forecast Combination Under Regime Switching, International Economic Review, 1081-1102.

[25] Fair, R.C., and R.J. Shiller, 1990, Comparing Information in Forecasts from Econometric Models, American Economic Review, 80, 375-389.

[26] French, K.R., G.W. Schwert, and R.F. Stambaugh, 1987, Expected Stock Returns and Volatility, Journal of Financial Economics, 19, 3-29.

[27] Gallant, A.R., and H. White, A Unified Theory of Estimation and Inference for Nonlinear Dynamic Models, Basil Blackwell, New York.

[28] Halperin, M., 1961, Almost Linearly-Optimum Combination of Unbiased Estimates, Journal of the American Statistical Association, 56(293), 36-43.

[29] Hansen, P.R., and A. Lunde, 2006a, Realized Variance and Market Microstructure Noise, Journal of Business and Economic Statistics, 24, 127-161.

[30] Hansen, P.R., and A. Lunde, 2006b, Consistent Ranking of Volatility Models, Journal of Econometrics, 131, 97-121.

[31] Hansen, P.R., A. Lunde, and J.M. Nason, 2005, Model Confidence Sets for Forecasting Models, Federal Reserve Bank of Atlanta Working Paper 2005-7.


[32] Large, J., 2005, Estimating Quadratic Variation When Quoted Prices Change by a Constant Increment, working paper, Department of Economics, University of Oxford.

[33] Martens, M., and D. van Dijk, 2007, Measuring volatility with the realized range, Journal of Econometrics, 138, 181-207.

[34] Meddahi, N., 2001, A Theoretical Comparison between Integrated and Realized Volatilities, manuscript, Université de Montréal.

[35] Merton, R.C., 1980, On Estimating the Expected Return on the Market: An Exploratory Investigation, Journal of Financial Economics, 8, 323-361.

[36] Newbold, P., and C.W.J. Granger, 1974, Experience with Forecasting Univariate Time Series and the Combination of Forecasts (with discussion), Journal of the Royal Statistical Society, Series A, 137, 131-149.

[37] Newey, W.K., and K.D. West, 1987, A Simple, Positive Semidefinite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix, Econometrica, 55, 703-708.

[38] Oomen, R.C.A., 2006, Properties of Realized Variance under Alternative Sampling Schemes, Journal of Business and Economic Statistics, 24, 219-237.

[39] Patton, A.J., 2006, Volatility Forecast Comparison using Imperfect Volatility Proxies, Quantitative Finance Research Centre, University of Technology Sydney, Research Paper 175.

[40] Patton, A.J., 2008, Data-Based Ranking of Realised Volatility Estimators, working paper, Oxford-Man Institute of Quantitative Finance, University of Oxford.

[41] Patton, A.J., and K. Sheppard, 2008, Evaluating Volatility Forecasts, in T.G. Andersen, R.A. Davis, J.-P. Kreiss and T. Mikosch (eds.) Handbook of Financial Time Series, Springer Verlag, forthcoming.

[42] Politis, D.N., and J.P. Romano, 1994, The Stationary Bootstrap, Journal of the American Statistical Association, 89, 1303-1313.

[43] Reid, D.J., 1968, Combining Three Estimates of Gross Domestic Product, Economica, 35, 431-444.

[44] Romano, J.P., and M. Wolf, 2005, Stepwise multiple testing as formalized data snooping, Econometrica, 73, 1237-1282.

[45] Stock, J.H., and M.W. Watson, 2001, A Comparison of Linear and Nonlinear Univariate Models for Forecasting Macroeconomic Time Series, in R.F. Engle and H. White eds., Festschrift in Honor of Clive Granger, Cambridge University Press, Cambridge.

[46] Stock, J.H., and M.W. Watson, 2004, Combination Forecasts of Output Growth in a Seven-Country Data Set, Journal of Forecasting, 23, 405-430.

[47] Timmermann, A., 2006, Forecast Combinations, in the Handbook of Economic Forecasting, G. Elliott, C.W.J. Granger and A. Timmermann eds., North Holland Press, Amsterdam.


[48] Zhang, L., 2006, Efficient Estimation of Stochastic Volatility using Noisy Observations: A Multi-Scale Approach, Bernoulli, 12, 1019-1043.

[49] Zhang, L., P.A. Mykland, and Y. Aït-Sahalia, 2005, A Tale of Two Time Scales: Determining Integrated Volatility With Noisy High-Frequency Data, Journal of the American Statistical Association, 100, 1394-1411.

[50] Zhou, B., 1996, High-Frequency Data and Volatility in Foreign-Exchange Rates, Journal of Business and Economic Statistics, 14, 45-52.

[Figure 1 appears here: two panels covering Jan96 to Jan07. Upper panel: Price of IBM (adjusted for splits). Lower panel: Annualised volatility of IBM, using RV5min.]

Figure 1: This figure plots IBM price and volatility over the period January 1996 to June 2007. The price is adjusted for stock splits, and the volatility is computed using realised volatility based on 5-minute calendar-time trade prices, annualised using the formula $\sigma_t = \sqrt{252 \times RV_t}$.


Table 1: Summary statistics of the realised measures

Estimator       Mean    Std. Dev.   Skewness   Kurtosis   Minimum
RV1sec          3.128   2.952       2.710       18.255    0.149
RV5sec          2.985   2.804       2.736       18.866    0.130
RV1min          2.388   2.377       3.731       34.841    0.092
RV5min          2.296   2.937       7.339      120.425    0.081
RV1hr           2.223   3.414       5.010       42.997    0.010
RV1day          2.343   5.864       9.732      163.764    0.000
RVtick,5sec     3.314   3.113       2.583       16.387    0.191
RVtick,1min     2.517   2.592       4.204       41.515    0.114
RVtick,5min     2.461   3.060       6.896      111.351    0.076
RVtick,1hr      2.423   3.972       7.456      104.018    0.012
RVBR            2.385   2.274       3.315       27.760    0.103
RVBR,bc         2.664   2.532       2.832       18.914    0.107
RRV             2.382   2.651       4.562       48.207    0.120
RRVtick         2.378   2.713       5.208       65.155    0.119
QRV             2.509   2.327       2.352       11.026    0.130
QRVtick         2.483   2.295       2.503       12.629    0.108
BPV1min         2.173   2.170       2.651       13.156    0.110
BPV5min         2.278   2.730       4.116       31.550    0.098
BPVtick,1min    2.162   2.268       3.856       35.470    0.095
BPVtick,5min    2.296   2.658       3.929       29.194    0.056
RKbart          2.638   2.447       2.968       21.904    0.138
RKtick,bart     2.480   2.509       4.127       45.574    0.107
RKcubic         2.468   2.296       3.106       23.317    0.111
RKtick,cubic    2.533   3.006       6.796      114.502    0.099
RKTH            2.323   2.899       4.535       39.456    0.026
RKTH-bw5        2.322   3.406       6.824       86.401    0.031
RKTH-5tick      2.535   3.012       6.761      113.514    0.099
RKNFParzen      2.373   2.930       5.013       49.124    0.062
ALTRVmq         2.158   2.562       5.893       65.928    0.113
ALTRVcombo      2.765   3.372       7.014       95.550    0.139
RVMean          2.495   2.663       4.072       36.688    0.112
RVGeo-mean      2.270   2.446       3.914       33.699    0.094
RVMedian        2.404   2.473       3.753       33.029    0.114

Notes: This table presents basic summary statistics on the 30 different realised measures considered in this paper, and 3 simple combination estimators.


Table 2: Correlation between the realised measures

Estimator       RV1sec   RV5min   RV1day   RVtick,1min   BPVtick,1min   RKcubic
RV1sec          1        0.836    0.445    0.922         0.925          0.944
RV5sec          0.999    0.848    0.455    0.934         0.935          0.954
RV1min          0.934    0.941    0.527    0.977         0.978          0.986
RV5min          0.836    1        0.568    0.942         0.924          0.915
RV1hr           0.683    0.806    0.668    0.775         0.776          0.767
RV1day          0.445    0.568    1        0.526         0.524          0.512
RVtick,5sec     0.998    0.832    0.447    0.922         0.926          0.945
RVtick,1min     0.922    0.942    0.526    1             0.978          0.971
RVtick,5min     0.843    0.967    0.588    0.939         0.935          0.925
RVtick,1hr      0.641    0.791    0.695    0.746         0.736          0.730
RVBR            0.948    0.919    0.508    0.972         0.976          0.991
RVBR,bc         0.976    0.889    0.486    0.953         0.956          0.978
RRV             0.892    0.977    0.563    0.974         0.971          0.966
RRVtick         0.888    0.969    0.570    0.976         0.974          0.961
QRV             0.905    0.845    0.484    0.923         0.936          0.954
QRVtick         0.905    0.852    0.507    0.931         0.944          0.953
BPV1min         0.912    0.867    0.493    0.943         0.955          0.960
BPV5min         0.835    0.943    0.545    0.923         0.916          0.918
BPVtick,1min    0.925    0.924    0.524    0.978         1              0.977
BPVtick,5min    0.829    0.895    0.571    0.908         0.913          0.911
RKbart          0.967    0.896    0.497    0.966         0.973          0.992
RKtick,bart     0.925    0.946    0.533    0.981         0.981          0.987
RKcubic         0.944    0.915    0.512    0.971         0.977          1
RKtick,cubic    0.855    0.973    0.571    0.965         0.951          0.941
RKTH            0.801    0.912    0.599    0.896         0.907          0.896
RKTH-bw5        0.726    0.871    0.653    0.827         0.828          0.817
RKTH-5tick      0.853    0.975    0.574    0.965         0.951          0.940
RKNFParzen      0.820    0.931    0.608    0.909         0.913          0.908
ALTRVmq         0.749    0.811    0.472    0.844         0.856          0.865
ALTRVcombo      0.702    0.751    0.425    0.793         0.816          0.818

Notes: This table presents a sub-set of the correlation matrix of the 30 different realised measures considered in this paper. The estimators in the columns correspond to standard choices in the extant literature (RV1day and RV5min), a naïve choice given high frequency data (RV1sec), and three of the estimators that turn out to perform well in our empirical analysis (RVtick,1min; BPVtick,1min; and RKcubic). The complete correlation matrix is available from the authors on request.


Notes to Table 3: The first column of this table presents the average difference in MSE distance of each realised measure, relative to RV5min, with negative (positive) values indicating that the estimator was closer (further) on average to the target variable than RV5min. Columns 2-5 present the rank of each estimator using MSE distance, for the full sample period and for three sub-samples, 1996-1999, 2000-2003 and 2004-2007. The most accurate estimator is ranked 1, and the least accurate estimator is ranked 33. Columns 6-9 present an indicator for whether the estimator was in the "model confidence set" (equal to X if in, - if not) in the full sample and each of the three sub-samples.

Notes to Table 4: The first column of this table presents the average difference in QLIKE distance of each realised measure, relative to RV5min, with negative (positive) values indicating that the estimator was closer (further) on average to the target variable than RV5min. Columns 2-5 present the rank of each estimator using QLIKE distance, for the full sample period and for three sub-samples, 1996-1999, 2000-2003 and 2004-2007. The most accurate estimator is ranked 1, and the least accurate estimator is ranked 33. Columns 6-9 present an indicator for whether the estimator was in the "model confidence set" (equal to X if in, - if not) in the full sample and each of the three sub-samples.


Table 3: Performance of the realised measures under MSE distance

                Avg ΔMSE   MSE rank                       In MCS?
Estimator       Full       Full   96-99   00-03   04-07   Full   96-99   00-03   04-07
RV1sec          0.117      26     28      22      30      -      -       -       -
RV5sec          -0.504     19     25      19      26      -      -       -       -
RV1min          -1.783     7      6       7       9       X      X       X       -
RV5min          0.000      24     19      26      11      -      X       -       -
RV1hr           3.014      30     30      30      28      -      -       -       -
RV1day          24.005     33     33      33      33      -      -       -       -
RVtick,5sec     0.806      28     29      28      29      -      -       -       -
RVtick,1min     -1.292     14     15      14      16      X      X       X       -
RVtick,5min     0.198      27     23      27      15      -      -       -       -
RVtick,1hr      6.096      32     32      31      31      -      -       -       -
RVBR            -1.957     5      1       6       14      X      X       X       -
RVBR,bc         -1.541     11     13      10      22      -      X       -       -
RRV             -1.286     15     12      16      7       -      X       -       -
RRVtick         -1.084     16     14      18      2       -      X       -       X
QRV             -2.100     2      2       2       21      X      X       X       -
QRVtick         -2.186     1      4       1       20      X      X       X       -
BPV1min         -2.033     3      5       3       12      X      X       X       -
BPV5min         -0.762     18     24      17      13      -      -       -       -
BPVtick,1min    -1.878     6      9       5       5       X      X       X       X
BPVtick,5min    -1.023     17     21      15      10      -      -       -       -
RKbart          -1.660     9      11      8       25      -      X       -       -
RKtick,bart     -1.564     10     8       11      19      X      X       X       -
RKcubic         -1.967     4      3       4       23      X      X       X       -
RKtick,cubic    0.045      25     22      24      18      -      -       -       -
RKTH            -0.398     20     27      20      4       -      -       -       X
RKTH-bw5        2.515      29     31      29      24      -      -       -       -
RKTH-5tick      -0.024     23     20      23      17      -      -       -       -
RKNFParzen      -0.243     21     26      21      3       -      -       -       X
ALTRVmq         -0.038     22     18      25      27      -      -       -       -
ALTRVcombo      3.019      31     17      32      32      -      X       -       -
RVMean          -1.300     13     16      13      8       -      X       -       -
RVGeo-mean      -1.528     12     7       12      1       -      X       -       X
RVMedian        -1.715     8      10      9       6       X      X       X       -

Notes: See the Notes to Table 3 for a description of this table.


Table 4: Performance of the realised measures under QLIKE distance

                Avg ΔQLIKE QLIKE rank                     In MCS?
Estimator       Full       Full   96-99   00-03   04-07   Full   96-99   00-03   04-07
RV1sec          -0.008     19     13      26      23      -      -       -       -
RV5sec          -0.018     18     12      24      21      -      -       -       -
RV1min          -0.041     4      5       6       5       -      -       X       X
RV5min          0.000      22     21      22      27      -      -       -       -
RV1hr           0.731      32     32      32      32      -      -       -       -
RV1day          33.671     33     33      33      33      -      -       -       -
RVtick,5sec     0.002      24     15      29      25      -      -       -       -
RVtick,1min     -0.046     1      1       2       4       X      X       X       X
RVtick,5min     -0.022     14     14      18      19      -      -       -       -
RVtick,1hr      0.484      31     31      31      31      -      -       -       -
RVBR            -0.041     6      4       7       10      -      -       -       -
RVBR,bc         -0.034     11     6       17      17      -      -       -       -
RRV             -0.029     13     16      9       12      -      -       X       -
RRVtick         -0.031     12     17      8       3       -      -       X       X
QRV             -0.038     9      9       10      9       -      -       -       X
QRVtick         -0.041     5      8       1       11      X      -       X       -
BPV1min         0.010      25     28      13      6       -      -       -       X
BPV5min         0.018      27     26      23      26      -      -       -       -
BPVtick,1min    -0.003     20     27      12      8       -      -       -       X
BPVtick,5min    0.001      23     24      20      22      -      -       -       -
RKbart          -0.038     8      3       14      16      -      X       -       -
RKtick,bart     -0.036     10     10      11      7       -      -       -       X
RKcubic         -0.042     3      2       5       13      X      X       X       -
RKtick,cubic    -0.020     17     20      15      14      -      -       -       -
RKTH            0.071      29     30      28      28      -      -       -       -
RKTH-bw5        0.148      30     29      30      30      -      -       -       -
RKTH-5tick      -0.021     16     19      16      15      -      -       -       -
RKNFParzen      -0.003     21     22      21      24      -      -       -       -
ALTRVmq         0.015      26     25      25      20      -      -       -       -
ALTRVcombo      0.031      28     23      27      29      -      -       -       -
RVMean          -0.043     2      7       4       1       X      -       X       X
RVGeo-mean      -0.021     15     18      19      18      -      -       -       -
RVMedian        -0.039     7      11      3       2       -      -       X       X



Notes to Table 5: This table presents the results of the Romano-Wolf "stepwise" test for three different choices of benchmark estimator. Columns 1-4 present indicators for when the benchmark is set to RV1day: using MSE distance, the indicator is set to X if the estimator is significantly more accurate than RV1day, to � if the estimator is significantly less accurate than RV1day, and to � if the estimator's accuracy is not significantly different from RV1day. The four columns refer to the full sample period and three sub-samples, 1996-1999, 2000-2003 and 2004-2007. Columns 5-8 present corresponding results when the benchmark estimator is set to RV5min, and columns 9-12 present results when the benchmark estimator is set to RVMean. The benchmark estimator in each column is denoted with an F.

Notes to Table 6: This table presents the results of the Romano-Wolf "stepwise" test for three different choices of benchmark estimator. Columns 1-4 present indicators for when the benchmark is set to RV1day: using QLIKE distance, the indicator is set to X if the estimator is significantly more accurate than RV1day, to � if the estimator is significantly less accurate than RV1day, and to � if the estimator's accuracy is not significantly different from RV1day. The four columns refer to the full sample period and three sub-samples, 1996-1999, 2000-2003 and 2004-2007. Columns 5-8 present corresponding results when the benchmark estimator is set to RV5min, and columns 9-12 present results when the benchmark estimator is set to RVMean. The benchmark estimator in each column is denoted with an F.


Table 5: Romano-Wolf tests on the realised measures, under MSE distance

                Romano-Wolf: RV1day      Romano-Wolf: RV5min      Romano-Wolf: RVMean
Estimator       Full 96-99 00-03 04-07   Full 96-99 00-03 04-07   Full 96-99 00-03 04-07
RV1sec          X    X     X     X       �    �     �     �       �    �     �     �
RV5sec          X    X     X     X       �    �     �     �       �    �     �     �
RV1min          X    X     X     X       �    �     �     �       �    �     �     �
RV5min          X    X     X     X       F    F     F     F       �    �     �     �
RV1hr           X    X     X     X       �    �     �     �       �    �     �     �
RV1day          F    F     F     F       �    �     �     �       �    �     �     �
RVtick,5sec     X    X     X     X       �    �     �     �       �    �     �     �
RVtick,1min     X    X     X     X       �    �     �     �       �    �     �     �
RVtick,5min     X    X     X     X       �    �     �     �       �    �     �     �
RVtick,1hr      X    X     X     X       �    �     �     �       �    �     �     �
RVBR            X    X     X     X       �    �     �     �       �    �     �     �
RVBR,bc         X    X     X     X       �    �     �     �       �    �     �     �
RRV             X    X     X     X       �    �     �     �       �    �     �     �
RRVtick         X    X     X     X       �    �     �     �       �    �     �     �
QRV             X    X     X     X       �    �     �     �       �    �     �     �
QRVtick         X    X     X     X       �    �     �     �       �    �     �     �
BPV1min         X    X     X     X       �    �     �     �       �    �     �     �
BPV5min         X    X     X     X       �    �     �     �       �    �     �     �
BPVtick,1min    X    X     X     X       �    �     �     �       �    �     �     �
BPVtick,5min    X    X     X     X       �    �     �     �       �    �     �     �
RKbart          X    X     X     X       �    �     �     �       �    �     �     �
RKtick,bart     X    X     X     X       �    �     �     �       �    �     �     �
RKcubic         X    X     X     X       �    �     �     �       �    �     �     �
RKtick,cubic    X    X     X     X       �    �     �     �       �    �     �     �
RKTH            X    X     X     X       �    �     �     �       �    �     �     �
RKTH-bw5        X    X     X     X       �    �     �     �       �    �     �     �
RKTH-5tick      X    X     X     X       �    �     �     �       �    �     �     �
RKNFParzen      X    X     X     X       �    �     �     �       �    �     �     �
ALTRVmq         X    X     X     X       �    �     �     �       �    �     �     �
ALTRVcombo      X    X     X     �       �    �     �     �       �    �     �     �
RVMean          X    X     X     X       �    �     �     �       F    F     F     F
RVGeo-mean      X    X     X     X       �    �     �     �       �    �     �     �
RVMedian        X    X     X     X       �    �     �     �       �    �     �     �



Table 6: Romano-Wolf tests on the realised measures, under QLIKE distance

                Romano-Wolf: RV1day       Romano-Wolf: RV5min       Romano-Wolf: RVMean
Estimator       Full  96-99 00-03 04-07   Full  96-99 00-03 04-07   Full  96-99 00-03 04-07
RV1sec          ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RV5sec          ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RV1min          ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       �     �     �     �
RV5min          ✓     ✓     ✓     ✓       ★     ★     ★     ★       �     �     �     �
RV1hr           ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RV1day          ★     ★     ★     ★       �     �     �     �       �     �     �     �
RVtick,5sec     ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RVtick,1min     ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       �     �     �     �
RVtick,5min     ✓     ✓     ✓     ✓       ✓     �     �     ✓       �     �     �     �
RVtick,1hr      ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RVBR            ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       �     �     �     �
RVBR,bc         ✓     ✓     ✓     ✓       ✓     ✓     �     ✓       �     �     �     �
RRV             ✓     ✓     ✓     ✓       ✓     �     ✓     ✓       �     �     �     �
RRVtick         ✓     ✓     ✓     ✓       ✓     �     ✓     ✓       �     �     �     �
QRV             ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       �     �     �     �
QRVtick         ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       �     �     �     �
BPV1min         ✓     ✓     ✓     ✓       �     �     ✓     ✓       �     �     �     �
BPV5min         ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
BPVtick,1min    ✓     ✓     ✓     ✓       �     �     ✓     ✓       �     �     �     �
BPVtick,5min    ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RKbart          ✓     ✓     ✓     ✓       ✓     ✓     ✓     �       �     �     �     �
RKtick,bart     ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       �     �     �     �
RKcubic         ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       �     �     �     �
RKtick,cubic    ✓     ✓     ✓     ✓       ✓     �     ✓     ✓       �     �     �     �
RKTH            ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RKTH-bw5        ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RKTH-5tick      ✓     ✓     ✓     ✓       ✓     �     ✓     ✓       �     �     �     �
RKNFParzen      ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
ALTRVmq         ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
ALTRVcombo      ✓     ✓     ✓     ✓       �     �     �     �       �     �     �     �
RVMean          ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       ★     ★     ★     ★
RVGeo-mean      ✓     ✓     ✓     ✓       ✓     �     �     ✓       �     �     �     �
RVMedian        ✓     ✓     ✓     ✓       ✓     ✓     ✓     ✓       �     �     �     �
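The final three estimators listed in Table 6 are simple combinations of the individual realised measures. As a rough sketch, assuming their names denote the cross-sectional arithmetic mean, geometric mean and median of the individual measures on a given day (the definitions are not restated in this part of the paper, and `combine_measures` is a hypothetical helper), they could be computed as follows.

```python
import statistics


def combine_measures(measures):
    """Simple combinations of one day's realised measures.

    `measures` is a sequence of strictly positive estimates of the same
    day's price variability, one per individual estimator. The dictionary
    keys mirror the row labels in the tables; the mapping from label to
    formula is an assumption made for illustration.
    """
    return {
        "RVMean": statistics.fmean(measures),
        "RVGeo-mean": statistics.geometric_mean(measures),
        "RVMedian": statistics.median(measures),
    }


if __name__ == "__main__":
    # Made-up values for illustration only; these are not data from the paper.
    print(combine_measures([1.0, 2.0, 4.0]))
```

None of these three combinations requires estimated weights, which is what distinguishes them from the optimal linear and multiplicative combinations reported in Tables 7 to 10.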



Notes to Table 7: This table presents the MSE-optimal linear combination weights. Columns 1-4 present these weights for the unconstrained case, with estimates that are significantly different from zero at the 10% level highlighted in bold. (Standard errors were computed using Proposition 1, with Newey-West (1987) estimates of the covariance matrix, B_T, and are not reported here in the interests of space.) The four columns refer to the full sample period and three sub-samples, 1996-1999, 2000-2003 and 2004-2007. Columns 5-8 present the optimal linear combination weights, using MSE distance, imposing the constraint that each weight must be weakly positive. Weights that were on the boundary at zero are reported as "0", while those away from the boundary are reported to two decimal places.
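To make the constrained case concrete, the sketch below fits least-squares weights subject to a non-negativity constraint using cyclic coordinate descent. It is an illustration under stated assumptions, not the paper's procedure: the series `y` stands in for whatever proxy replaces the unobservable target, the constant is left unconstrained, and the numerical method and the name `fit_nonnegative_weights` are choices made here.

```python
def fit_nonnegative_weights(X, y, sweeps=500):
    """Least-squares fit of y on a constant and the columns of X with weights >= 0.

    X is a list of rows (one row per day, one column per realised measure);
    y is the proxy for the latent target (an assumption of this sketch).
    Returns (constant, weights). The constant is unconstrained here.
    """
    n, k = len(X), len(X[0])
    columns = [[row[j] for row in X] for j in range(k)]
    constant, weights = 0.0, [0.0] * k
    resid = list(y)  # residuals y - constant - X @ weights
    for _ in range(sweeps):
        # Unconstrained update of the constant: absorb the mean residual.
        shift = sum(resid) / n
        constant += shift
        resid = [r - shift for r in resid]
        for j, col in enumerate(columns):
            denom = sum(v * v for v in col)
            if denom == 0.0:
                continue
            # Exact minimiser along coordinate j, clipped at the zero boundary.
            step = sum(v * r for v, r in zip(col, resid)) / denom
            new_w = max(0.0, weights[j] + step)
            delta = new_w - weights[j]
            if delta != 0.0:
                resid = [r - delta * v for r, v in zip(resid, col)]
                weights[j] = new_w
    return constant, weights


if __name__ == "__main__":
    # Made-up data: y = 3 + 2*x1 - 1*x2, so the unconstrained weight on x2 is negative.
    X = [[2.0, 2.0], [2.0, 0.0], [0.0, 2.0], [0.0, 0.0]]
    y = [5.0, 7.0, 1.0, 3.0]
    print(fit_nonnegative_weights(X, y))
```

In this toy example the second measure would receive a negative weight without the constraint, so the constrained fit leaves it on the boundary at zero, and the iterates move towards a constant of 2 and a weight of 2 on the first measure. This is the pattern behind the many "0" entries in the constrained columns.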

Notes to Table 8: This table presents the QLIKE-optimal linear combination weights. Columns 1-4 present these weights for the unconstrained case, with estimates that are significantly different from zero at the 10% level highlighted in bold. (Standard errors were computed using Proposition 1, with Newey-West (1987) estimates of the covariance matrix, B_T, and are not reported here in the interests of space.) The four columns refer to the full sample period and three sub-samples, 1996-1999, 2000-2003 and 2004-2007. Columns 5-8 present the optimal linear combination weights, using QLIKE distance, imposing the constraint that each weight must be weakly positive. Weights that were on the boundary at zero are reported as "0", while those away from the boundary are reported to two decimal places.

Notes to Table 9: This table presents the MSE-optimal multiplicative combination weights. Columns 1-4 present these weights for the unconstrained case, with estimates that are significantly different from zero at the 10% level highlighted in bold. (Standard errors were computed using Proposition 1, with Newey-West (1987) estimates of the covariance matrix, B_T, and are not reported here in the interests of space.) The four columns refer to the full sample period and three sub-samples, 1996-1999, 2000-2003 and 2004-2007. Columns 5-8 present the optimal multiplicative combination weights, using MSE distance, imposing the constraint that each weight must be weakly positive. Weights that were on the boundary at zero are reported as "0", while those away from the boundary are reported to two decimal places.

Notes to Table 10: This table presents the QLIKE-optimal multiplicative combination weights. Columns 1-4 present these weights for the unconstrained case, with estimates that are significantly different from zero at the 10% level highlighted in bold. (Standard errors were computed using Proposition 1, with Newey-West (1987) estimates of the covariance matrix, B_T, and are not reported here in the interests of space.) The four columns refer to the full sample period and three sub-samples, 1996-1999, 2000-2003 and 2004-2007. Columns 5-8 present the optimal multiplicative combination weights, using QLIKE distance, imposing the constraint that each weight must be weakly positive. Weights that were on the boundary at zero are reported as "0", while those away from the boundary are reported to two decimal places.
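The difference between the two combination types in these notes can be sketched in a few lines. The functional forms are assumptions of this sketch: the linear combination is taken to be a constant plus a weighted sum of the realised measures, and the multiplicative combination a constant times a product of the measures raised to their weights. The helper `format_weight` mirrors the reporting convention described in the notes; all names are illustrative.

```python
import math


def linear_combination(constant, weights, measures):
    """Assumed linear form: constant + sum_i w_i * x_i."""
    return constant + sum(w * x for w, x in zip(weights, measures))


def multiplicative_combination(constant, weights, measures):
    """Assumed multiplicative form: constant * prod_i x_i ** w_i (x_i > 0)."""
    return constant * math.prod(x ** w for w, x in zip(weights, measures))


def format_weight(weight):
    """Report a boundary weight as "0" and any other weight to two decimals."""
    return "0" if weight == 0.0 else f"{weight:.2f}"


if __name__ == "__main__":
    # Made-up measures and weights for illustration; not values from the tables.
    measures = [4.0, 3.0]
    print(linear_combination(0.5, [0.25, 0.75], measures))
    print(multiplicative_combination(2.0, [0.5, 1.0], measures))
    print([format_weight(w) for w in (0.0, 0.004, 0.17)])
```

Under this convention a small interior weight prints as "0.00" while a weight exactly on the boundary prints as "0", which is why both forms appear in the constrained columns.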


Table 7: Optimal linear combinations under MSE distance

                   Unconstrained                 Constrained
Estimator          Full  96-99  00-03  04-07     Full  96-99  00-03  04-07
Constant           0.21   0.87   0.24   0.28     0.27   1.16   0.28   0.32
RV1sec            -0.70  -0.25   0.58  -0.39        0      0      0      0
RV5sec             0.18  -0.40  -0.20   0.19        0      0      0      0
RV1min            -0.57  -0.58  -1.18  -0.20        0      0      0      0
RV5min             0.31   0.37   0.50  -0.15     0.01      0      0      0
RV1hr              0.00   0.09  -0.07   0.02        0   0.08      0      0
RV1day             0.00   0.03  -0.03  -0.01     0.00   0.03      0      0
RVtick,5sec        0.84   0.61   0.03   0.64     0.11      0   0.17   0.11
RVtick,1min       -0.33   0.23  -0.94  -0.22        0      0      0      0
RVtick,5min       -0.25  -0.11  -0.17  -0.19        0      0      0      0
RVtick,1hr         0.04   0.04   0.07   0.04        0   0.02      0   0.02
RVBR               0.07   0.70  -0.71   0.04        0      0      0      0
RVBR,bc            0.01  -0.24   0.45  -0.13        0      0      0      0
RRV               -0.21   0.33  -0.71   0.07        0      0      0      0
RRVtick            0.62   0.02   0.74   0.12        0      0      0      0
QRV                0.38   0.39   0.43   0.17     0.20   0.35      0   0.07
QRVtick            0.47  -0.08   0.92   0.11     0.40      0   0.62      0
BPV1min            0.03   0.15   0.16   0.07        0      0      0      0
BPV5min           -0.16  -0.51   0.09  -0.02        0      0      0      0
BPVtick,1min      -0.23  -0.54   0.09  -0.06        0      0      0      0
BPVtick,5min      -0.03   0.15  -0.25   0.15        0      0      0      0
RKbart            -0.40  -0.04  -1.19   0.14        0      0      0      0
RKtick,bart       -0.21  -0.21  -0.23  -0.40        0      0      0      0
RKcubic            0.70   0.43   1.89  -0.24        0      0      0      0
RKtick,cubic      -3.31  -3.18   0.28   0.24        0      0      0   0.03
RKTH               0.26   0.05   0.17   0.50     0.03      0      0   0.17
RKTH-bw5          -0.09  -0.08  -0.09  -0.01        0      0      0      0
RKTH-5tick         3.60   3.38   0.47   0.12     0.02   0.04      0      0
RKNFParzen        -0.22  -0.12  -0.34  -0.25        0      0      0      0
ALTRVmq           -0.24  -0.77   0.00   0.13        0      0      0      0
ALTRVcombo         0.14   0.61  -0.01  -0.02        0      0      0   0.03



Table 8: Optimal linear combinations under QLIKE distance


                   Unconstrained                 Constrained
Estimator          Full  96-99  00-03  04-07     Full  96-99  00-03  04-07
Constant           0.14   0.88   0.10   0.26     0.12   0.94   0.12   0.28
RV1sec            -0.50  -0.19   0.32  -0.29        0      0   0.13      0
RV5sec             0.07  -0.36   0.32   0.31        0      0   0.09      0
RV1min             0.29   0.05  -0.29   0.03        0      0      0      0
RV5min            -0.12  -0.10  -0.04  -0.08        0      0      0      0
RV1hr             -0.01   0.01  -0.06  -0.02        0      0      0      0
RV1day             0.04   0.08   0.02   0.01     0.04   0.08   0.01   0.01
RVtick,5sec        0.73   0.62  -0.14   0.34     0.19   0.01   0.01   0.06
RVtick,1min        0.23   0.02   0.35   0.01     0.08      0      0      0
RVtick,5min       -0.10   0.24  -0.03   0.03     0.09   0.09      0      0
RVtick,1hr        -0.01  -0.01  -0.01   0.04        0      0      0      0
RVBR              -0.08   0.08  -0.03  -0.03        0      0      0      0
RVBR,bc           -0.17  -0.21  -0.43  -0.14        0      0      0      0
RRV                0.24   0.55  -0.03  -0.06        0      0      0      0
RRVtick            0.45   0.23  -0.07   0.02        0   0.02      0      0
QRV                0.13   0.15   0.09   0.28     0.05   0.14      0   0.18
QRVtick            0.13   0.01   0.49   0.04     0.22   0.01   0.29      0
BPV1min           -0.41  -0.12  -0.19   0.00        0   0.04      0      0
BPV5min           -0.02  -0.32   0.28   0.01        0      0   0.19      0
BPVtick,1min      -0.53  -0.34  -0.52  -0.13        0      0      0      0
BPVtick,5min      -0.02  -0.17   0.07   0.01        0      0   0.02      0
RKbart             0.03  -0.20   0.10   0.05        0      0      0      0
RKtick,bart       -0.11   0.13  -0.41  -0.45        0   0.05      0      0
RKcubic            0.23   0.20   0.52  -0.08        0   0.02      0      0
RKtick,cubic      -0.36  -1.29   0.15   0.17        0      0      0      0
RKTH               0.15   0.02   0.25   0.44        0   0.01   0.11   0.18
RKTH-bw5          -0.01  -0.01   0.02  -0.01        0      0      0      0
RKTH-5tick         0.32   1.00   0.10   0.10        0      0      0      0
RKNFParzen         0.07   0.12  -0.04  -0.23     0.17   0.08   0.00      0
ALTRVmq            0.05  -0.55   0.13   0.19        0      0      0   0.04
ALTRVcombo        -0.03   0.84  -0.07  -0.05        0   0.07      0   0.02



Table 9: Optimal multiplicative combinations under MSE distance


                   Unconstrained                 Constrained
Estimator          Full  96-99  00-03  04-07     Full  96-99  00-03  04-07
Constant           0.80   1.13   0.73   0.77     1.08   1.47   1.09   0.79
RV1sec            -0.10   0.79   1.04  -0.02        0   0.55      0      0
RV5sec            -0.48  -0.91  -1.61  -0.03        0      0      0      0
RV1min            -0.81  -0.26  -1.84  -0.32        0   0.55      0      0
RV5min             0.71   0.38   1.89  -0.12     0.04   0.60   0.08      0
RV1hr             -0.05   0.00  -0.09  -0.02        0   0.90      0      0
RV1day            -0.01   0.04  -0.01   0.01        0   1.60      0   0.01
RVtick,5sec        0.98   0.13   1.09   0.63     0.17   0.63   0.17   0.10
RVtick,1min       -0.38   0.14  -0.46  -0.32        0   0.11      0      0
RVtick,5min       -0.21  -0.36   0.18  -0.07        0   0.47      0      0
RVtick,1hr         0.02   0.03   0.05   0.05        0   1.07      0   0.01
RVBR              -0.22   0.41  -1.27   0.02        0   0.16      0      0
RVBR,bc            0.28  -0.29   1.59  -0.43        0   0.10      0      0
RRV               -0.25   0.57  -2.51  -0.14        0   0.74      0      0
RRVtick            0.58   0.29   1.06   0.07        0   0.84      0      0
QRV                0.35   0.33   1.01   0.37     0.09   0.43      0   0.17
QRVtick            0.63   0.11   1.12   0.17     0.41   0.40   0.62      0
BPV1min            0.10  -0.05   0.10   0.20        0   0.12      0      0
BPV5min           -0.32  -0.61  -0.15   0.10        0   0.17      0      0
BPVtick,1min      -0.15  -0.45   1.10  -0.08        0      0      0      0
BPVtick,5min      -0.19   0.21  -1.01   0.03        0   0.29      0      0
RKbart            -0.67  -0.01  -2.96   0.22        0   0.04      0      0
RKtick,bart       -0.31  -0.16  -1.24  -0.40        0   0.43      0      0
RKcubic            0.98   0.14   2.92  -0.26        0   0.57      0      0
RKtick,cubic      -2.55  -2.37  -1.04   0.34        0      0      0   0.02
RKTH               0.19   0.00   0.10   0.40     0.10   0.53      0   0.17
RKTH-bw5          -0.07   0.04  -0.22   0.00        0   0.76      0      0
RKTH-5tick         3.19   2.57   2.43   0.23     0.04   2.33      0   0.00
RKNFParzen        -0.18   0.07  -0.27  -0.16        0   0.64      0      0
ALTRVmq           -0.38  -1.07   0.18   0.23        0      0      0   0.03
ALTRVcombo         0.24   0.95  -0.23  -0.11        0   0.92      0   0.03



Table 10: Optimal multiplicative combinations under QLIKE distance


                   Unconstrained                 Constrained
Estimator          Full  96-99  00-03  04-07     Full  96-99  00-03  04-07
Constant           0.92   1.33   0.94   0.83     1.00   1.58   0.99   0.80
RV1sec            -0.62   0.25   0.25   0.04        0      0   0.10      0
RV5sec            -0.09  -0.57   0.35   0.07        0      0   0.11      0
RV1min             0.29   0.28  -0.19   0.02        0   0.02      0      0
RV5min             0.04  -0.01   0.04  -0.03        0      0      0      0
RV1hr             -0.01  -0.03  -0.03  -0.02        0      0      0      0
RV1day             0.01   0.03   0.00   0.00     0.01   0.02      0   0.00
RVtick,5sec        1.16   0.35  -0.01   0.27     0.28      0   0.10   0.06
RVtick,1min        0.13  -0.17   0.50   0.05     0.07      0      0      0
RVtick,5min       -0.17   0.09  -0.08   0.07     0.13   0.08   0.01   0.01
RVtick,1hr         0.00  -0.01  -0.02   0.02        0      0      0   0.00
RVBR              -0.14  -0.10  -0.15  -0.04        0      0      0      0
RVBR,bc           -0.26  -0.18  -0.45  -0.34        0      0      0      0
RRV               -0.12   0.30  -0.08  -0.23        0   0.03      0      0
RRVtick            0.72   0.53  -0.05   0.03        0   0.06      0      0
QRV                0.06   0.10   0.06   0.32        0   0.05      0   0.21
QRVtick            0.14   0.10   0.57   0.10     0.24   0.09   0.32      0
BPV1min           -0.18  -0.14  -0.23   0.11        0   0.00      0   0.01
BPV5min            0.00  -0.25   0.23   0.05        0      0   0.16      0
BPVtick,1min      -0.37  -0.23  -0.54  -0.09        0      0      0      0
BPVtick,5min      -0.03  -0.16   0.07  -0.02        0      0   0.04   0.01
RKbart            -0.03  -0.20   0.20  -0.01        0      0      0      0
RKtick,bart        0.01   0.20  -0.37  -0.29        0   0.01      0      0
RKcubic            0.22   0.36   0.36  -0.06        0      0      0      0
RKtick,cubic      -0.72  -2.18   0.17   0.12        0      0      0      0
RKTH               0.02   0.00   0.08   0.17        0      0      0   0.16
RKTH-bw5          -0.02   0.04  -0.02   0.00        0   0.03      0      0
RKTH-5tick         0.58   1.86   0.09   0.07        0      0      0      0
RKNFParzen         0.27   0.15   0.15   0.03     0.18   0.09   0.11      0
ALTRVmq            0.13  -0.53   0.19   0.34        0      0      0   0.08
ALTRVcombo        -0.14   0.76  -0.15  -0.20        0   0.12      0   0.00

