A Closer Look at High-Frequency Data and Volatility Forecasting in a HAR Framework¹

Derek Song

ECON 201FS

Spring 2009

¹ This report was written in compliance with the Duke Community Standard.


1. Introduction

The volatility of asset returns is an area of vital importance for research in
financial theory: risk management and derivative valuation methods both depend

on being able to accurately measure and forecast volatility. The recent availability of

high-frequency price data has given rise to new models of volatility that have yielded

significant improvements in the accuracy of volatility measurements and forecasting.

The ability to accurately predict future volatility is particularly important in a practical

sense because of its implications for asset management.

Recent literature, such as Andersen, Bollerslev, Diebold, and Labys (2003), shows that,
when using high-frequency data, simple linear autoregressive regression models have better

predictive capabilities than the more sophisticated ARCH/GARCH and stochastic

volatility models. One such model is the heterogeneous autoregressive (HAR) model, a

simple autoregressive model for realized volatility first proposed by Corsi (2003).

When setting up a regression model, researchers are afforded several degrees of

freedom, including methods for calculating RV and regression methodology. In the

current literature, comparisons of different regression models are often made using a

particular choice for the sampling interval. This raises the question of whether or not

those results would hold given different choices for model parameters, since we would

like to ensure some level of consistency when comparing models. In this paper, we seek

to add to the existing literature by empirically examining the sensitivity of several HAR

model forecasts to various sampling and regression methods. In particular, we compare

four models: HAR-RV, HAR-RAV, and each of these two models with implied volatility

added in. The three factors that we will consider are (1) sampling interval, which runs


from 1 minute to 30 minutes, (2) sub-sampling, and (3) the effect of using robust

regressions instead of OLS to control for outliers and leverage points. We compare the

models by measuring both in-sample fit and out-of-sample performance on a synthetic

portfolio constructed to mimic the S&P 100. We find that when the sampling interval is

set at 5 minutes or higher, there is little variation in forecast accuracy for different

intervals, and that any noisiness is eliminated by using sub-sampling. Furthermore, our

results suggest that including implied volatility has a significant impact on forecasting

accuracy.

The rest of the paper proceeds as follows: theoretical and mathematical

background of volatility and regression models (2), research methodologies (3), data

preparation (4), empirical results (5), and a conclusion summarizing the paper and the

most important results (6). All tables and figures are given at the end of the paper.

2. Theoretical Background

2.1 Stochastic Model of Asset Returns

In this paper, we assume a widely used model of asset prices that includes jumps.

We assume that the log-price of a stock, denoted $p_t$, follows the stochastic differential equation given below:

$$dp_t = \mu_t\,dt + \sigma_t\,dW_t + \kappa_t\,dq_t \qquad (2.1.1)$$

Here, $\mu_t$ is a time-varying drift component, $\sigma_t$ represents a time-varying volatility component of the asset price, $W_t$ is a standard Wiener process, $\kappa_t$ is the magnitude of the jump, and $q_t$ represents a counting process, which is commonly assumed to be a Poisson process so that jumps are rare.
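To make the dynamics in (2.1.1) concrete, the following MATLAB sketch simulates one trading day of the process under an Euler discretization with constant drift and volatility and Poisson jump arrivals. All parameter values and variable names are illustrative assumptions, not the paper's calibration.

    % Euler discretization of the jump-diffusion in (2.1.1); illustrative values only
    dt     = 1/390;                          % one-minute steps over a 390-minute trading day
    n      = 390;
    mu     = 0.0;                            % constant drift (assumption)
    sigma  = 0.01;                           % constant diffusive volatility (assumption)
    lambda = 0.1;                            % jump intensity per day (assumption)
    dW     = sqrt(dt) * randn(n, 1);         % Brownian increments
    dq     = poissrnd(lambda * dt, n, 1);    % jump counts per step (almost always 0)
    kappa  = 0.005 * randn(n, 1);            % jump magnitudes (assumption)
    p      = log(100) + cumsum(mu*dt + sigma*dW + kappa .* dq);   % simulated log-price path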


2.2 Market Microstructure Noise

Stock prices are commonly assumed to have a theoretical fundamental price,

calculated as the sum of all discounted future dividend payments. Market microstructure

noise is defined as any short-term deviations of the spot price from the fundamental value

of a stock, and is modeled by

$$p^*_t = p_t + \varepsilon_t \qquad (2.2.1)$$

Here, $p^*_t$ is the logarithm of the observed price, $p_t$ is the logarithm of the fundamental price, and $\varepsilon_t$ is the noise term; because prices are measured in logarithms, the noise is proportional to the observed price level. Market microstructure noise arises due to various

market frictions, including the bid-ask bounce. Because market microstructure noise

distorts price data at high frequencies, it can become problematic for the estimation of

realized volatility. Bandi, Russell (2008) show that in the presence of noise, the RV

estimator will diverge to infinity almost surely. However, Forsberg, Ghysels (2007)

argue that RAV is much more robust to sampling errors and jumps. In this paper, we

consider two ways of circumventing this problem, discussed in Sections 2.3 and 3.

2.3 Models of Volatility in Asset Returns

We let $r_{t,j}$ denote the logarithmic (geometric) return over intra-day interval $j$ of day $t$, given by $r_{t,j} = p_{t,j} - p_{t,j-1}$, where $p_{t,j}$ is the logarithm of the observed price. We will now define two different measures of volatility, Realized Variance (RV) and Realized Absolute Value (RAV). We define RV as the sum of the squared log-returns, and RAV as the (suitably scaled) sum of the absolute log-returns. As such, RV will be measured in variance units, while RAV is measured in standard deviation units. Letting $M$ be the number of times we sample within each day (in our data, we have per-minute returns, yielding a maximum of 384 samples per day), we calculate the daily RV as follows:

$$RV_t = \sum_{j=1}^{M} r_{t,j}^2 \;\xrightarrow{M \to \infty}\; \int_{t-1}^{t} \sigma_s^2\,ds + \sum_{t-1 < s \le t} \kappa_s^2 \qquad (2.3.1)$$

The next measure, RAV, is defined as:

$$RAV_t = \sqrt{\frac{\pi}{2M}} \sum_{j=1}^{M} \left| r_{t,j} \right| \;\xrightarrow{M \to \infty}\; \int_{t-1}^{t} \sigma_s\,ds \qquad (2.3.2)$$

There are several key points to note about these measures. The first is that, because of the way in which they are defined, RV and RAV are not directly comparable. Secondly, as the sampling interval increases, we throw away more and more of the data.
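As a concrete illustration, the following MATLAB sketch computes RV and RAV for a single day from per-minute log-prices. The scaling follows equations (2.3.1) and (2.3.2), but the price series p and the interval k are simulated placeholders, not the paper's data.

    % Daily RV (2.3.1) and RAV (2.3.2) from per-minute log-prices (simulated here)
    p   = log(100) + cumsum(0.0005 * randn(385, 1));  % 385 prices -> 384 one-minute returns
    k   = 5;                                          % sampling interval in minutes
    r   = diff(p(1:k:end));                           % k-minute log-returns
    M   = numel(r);                                   % number of intra-day returns used
    RV  = sum(r.^2);                                  % realized variance
    RAV = sqrt(pi / (2*M)) * sum(abs(r));             % realized absolute value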

Zhang, Mykland, Aït-Sahalia (2005) propose an alternative sampling

methodology known as sub-sampling. Assuming we sample every $k$ minutes, we
calculate our measure from each starting offset $i = 1, 2, \ldots, k$ and average those $k$
calculations, resulting in no discarded data points. Sub-sampling has two distinct

advantages over the traditional sampling method: reduced bias from microstructure noise

and the ability to use all of the available data points regardless of sampling interval.
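The sketch below illustrates the sub-sampling scheme under the same simulated-data assumptions as the previous snippet: RV is recomputed from each of the k possible starting offsets and the k estimates are averaged.

    % Sub-sampled RV: average over all k starting offsets (illustrative data)
    p   = log(100) + cumsum(0.0005 * randn(385, 1));  % per-minute log-prices (simulated)
    k   = 5;                                          % sampling interval in minutes
    RVs = zeros(k, 1);
    for i = 1:k
        ri     = diff(p(i:k:end));                    % k-minute returns starting at offset i
        RVs(i) = sum(ri.^2);
    end
    RV_subsampled = mean(RVs);                        % average of the k offset estimates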

2.4 HAR Regression Models

In this paper, we rely on the Heterogeneous Autoregressive (HAR) models first

introduced by Müller et al. (1997) and Corsi (2003) to forecast volatility. Recent papers

(see: Andersen, Bollerslev, Diebold, Labys (2003) or Andersen, Bollerslev, Huang

(2007)) have shown empirically that simple linear models can often predict future

volatility more accurately than more sophisticated models that can formally capture long

memory processes and persistence. The HAR framework developed by Corsi is


attractive because it is easily estimated using OLS, and is significantly more

parsimonious than the HARCH model of Müller et al (1997). The expected future

variance over an ℎ-day horizon is given by a linear combination of average historical

RV’s over different time scales, which can capture the persistence seen in time series data

without making the sort of restrictive assumptions seen in ARFIMA and GARCH models.

In order to calculate the model, we will define $RV_{t,t+h}$ as the average RV over a given time span $h$. This is mathematically represented as follows:

$$RV_{t,t+h} = \frac{1}{h} \sum_{j=t+1}^{t+h} RV_j \qquad (2.4.1)$$

$RAV_{t,t+h}$ is then defined analogously. In this paper, we want to calculate 22-day-ahead forecasts, which correspond to the number of trading days within a calendar month. Thus, we can set up a HAR-RV regression as:

$$RV_{t,t+22} = \beta_0 + \beta_D RV_{t-1,t} + \beta_W RV_{t-5,t} + \beta_M RV_{t-22,t} + \varepsilon_{t+22} \qquad (2.4.2)$$

The regressors correspond to daily, weekly, and monthly lagged averages of RV,

which were chosen by Corsi in his paper. Also, it should be noted that Andersen,

Bollerslev, Diebold (2007) established that in general, the jump effects embedded in RV

measures are not significant within the context of a HAR regression.

Forsberg, Ghysels (2007) extend the HAR-class models by using historical RAV

to forecast future RV, which they find to be a significantly better predictor of RV than

historical RV. The model is analogous to the one for HAR-RV, and is defined as:

$$RV_{t,t+22} = \beta_0 + \beta_D RAV_{t-1,t} + \beta_W RAV_{t-5,t} + \beta_M RAV_{t-22,t} + \varepsilon_{t+22} \qquad (2.4.3)$$

It should be noted that the physical interpretation of the HAR-RAV model is not identical

to the HAR-RV model, since RAV and RV are in different units.
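A minimal sketch of how the HAR-RV regression (2.4.2) can be estimated is given below. The daily RV series is simulated, the trailing averages use the 1-, 5-, and 22-day horizons from the text, and the fit uses MATLAB's regress; the paper's exact data handling may differ.

    % HAR-RV regression (2.4.2) on a simulated daily RV series (illustrative only)
    RV   = 1e-4 * (0.5 + abs(randn(1500, 1)));        % placeholder daily realized variances
    T    = numel(RV);
    RV_d = RV;                                        % RV_{t-1,t}: the day-t realized variance
    RV_w = filter(ones(5,1)/5,   1, RV);              % RV_{t-5,t}:  5-day trailing average
    RV_m = filter(ones(22,1)/22, 1, RV);              % RV_{t-22,t}: 22-day trailing average
    % Dependent variable RV_{t,t+22}: average RV over the next 22 days
    y    = [RV_m(23:end); NaN(22, 1)];                % y(t) = RV_m(t+22) = mean(RV(t+1:t+22))
    idx  = 22:(T-22);                                 % rows where regressors and target exist
    X    = [ones(numel(idx),1), RV_d(idx), RV_w(idx), RV_m(idx)];
    beta = regress(y(idx), X);                        % [beta_0; beta_D; beta_W; beta_M]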


2.5 Hybrid HAR - Implied Volatility Regressions

There is a large literature on the use of options and model-free implied volatility

to forecast future volatility. Poon, Granger (2005) and a literature review by Blair, Poon,

Taylor (2001) find that implied volatility is a better predictor of volatility than the

commonly used time-series models. Mincer and Zarnowitz (1969) proposed a simple

framework with which to evaluate the efficiency of implied volatility-based forecasting:

$$RV_{t,t+h} = \beta_0 + \beta_{IV} IV_t + \varepsilon_{t,t+h} \qquad (2.5.1)$$

If implied volatility were a perfectly efficient forecast, $\beta_0 = 0$ and $\beta_{IV} = 1$. However, numerous

papers, including Becker, Clements, and White (2006), find that implied volatility is not a

perfectly efficient estimator. Jiang, Tian (2005) showed that model-free implied

volatility is better than options-implied volatility at predicting future volatility and

endorsed the new CBOE VIX methodology for its use of model-free implied volatility.

Fradkin (2007) found evidence that adding implied volatility to HAR models

almost always improved model fit, which suggests that implied volatility contains

information not present in historical realized volatility. We will define hybrid HAR-RV-IV and HAR-RAV-IV models identical to those used by Fradkin:

$$RV_{t,t+22} = \beta_0 + \beta_D RV_{t-1,t} + \beta_W RV_{t-5,t} + \beta_M RV_{t-22,t} + \beta_{IV} IV_t + \varepsilon_{t+22} \qquad (2.5.2)$$

$$RV_{t,t+22} = \beta_0 + \beta_D RAV_{t-1,t} + \beta_W RAV_{t-5,t} + \beta_M RAV_{t-22,t} + \beta_{IV} IV_t + \varepsilon_{t+22} \qquad (2.5.3)$$

3. Data Preparation

The high-frequency stock price data used in this paper were obtained from an

online vendor, price-data.com. For this paper, we follow Law (2007) and select 40 of the

largest market-capitalization stocks from the S&P 100 (OEX) and aggregate those stocks to form a


portfolio that we claim can proxy for the S&P 500 (SPX) for two reasons: the OEX is a

subset of the SPX, and there exists a high degree of correlation between these two indices.

In Figure 1a, we show a scatterplot of daily open-to-close returns for our synthetic proxy

portfolio (SPP) versus daily open-to-close returns of the SPX. Our requirements for

inclusion were that data for the stock be present from Jan. 3, 2000 up through Dec. 31,

2008; we also checked for inconsistencies in the data and adjusted the prices for stock

splits. In creating the portfolio, we kept only the data for those days in which all 40

stocks traded, yielding a total of 2240 days. We use an equal-weighting scheme to

construct our portfolio by “buying” $25 of each stock at the initial price. In Section 2.2,

we discussed the problem of market microstructure noise, and we now claim, citing

Figures 1b, 2a, and 2b, that the process of aggregating stocks has averaged out most if not

all of the microstructure noise.
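The following sketch shows the equal-dollar weighting described above: with $25 invested in each of the 40 stocks at its first observed price, the portfolio starts at $1,000 and is revalued at every timestamp. The price matrix here is a simulated placeholder, and the alignment of the 40 series is assumed to have been done already.

    % Equal-dollar-weighted synthetic proxy portfolio (SPP); prices simulated here
    P      = 50 + cumsum(0.01 * randn(390, 40));   % 390 per-minute prices for 40 stocks (placeholder)
    shares = 25 ./ P(1, :);                        % "buy" $25 of each stock at its initial price
    SPP    = P * shares';                          % portfolio value at each timestamp, starting at $1000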

Our implied volatility data was taken from the CBOE website. We used the VIX,

a model-free implied volatility index which uses options on the SPX to calculate the 1-

month ahead implied volatility for reasons described in Section 2.5. Because intra-day

data was not available, we use only the closing price of the VIX in our regressions. We

transformed the data so that it is measured in the same units as RV. Also, we naturally

include only those days for which the SPP exists.
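The paper does not spell out the exact transformation of the VIX. One plausible version, shown below purely as an assumption, converts the quoted annualized percentage volatility into a monthly variance so that it sits on roughly the same scale as a 22-day RV measure.

    % Assumed VIX-to-RV-units conversion (not necessarily the paper's exact formula)
    vix_close  = 20;                          % closing VIX level, annualized volatility in percent
    iv_monthly = (vix_close / 100)^2 / 12;    % approximate 1-month-ahead variance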

Our in-sample data runs for 7 years, from the beginning of 2000 until the end of

2006, yielding 1743 data points. Our out-of-sample data runs from the beginning of 2007

until the end of 2008, yielding 497 data points. We therefore have 24 independent

month-long periods for the out-of-sample result, which should be sufficient to accurately

gauge out-of-sample performance.


4. Regression Methodologies

4.1 Robust Regressions with Iterative Huber Weighting

Poon and Granger (2005) discuss the common problem that sample outliers pose for

volatility estimation. These leverage points are problematic because they can unduly

influence OLS estimators, especially when the regressions use only historical volatility.

Because manually removing outliers in a data set this large is infeasible, we will deal

with these leverage points by using robust regressions as a comparison for OLS

regressions. We employ an iterative Huber weighting scheme over bisquare weighting

because it converges significantly faster for our regressions. Our regressions are run in

MATLAB, using the regress and robustfit commands to estimate the OLS and robust

coefficients, respectively.
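As a concrete illustration of this comparison, the sketch below fits the same simulated data with regress and with robustfit using Huber weights. Note that robustfit adds its own intercept, so the constant column is passed only to regress; the data and coefficients here are placeholders.

    % OLS vs. iteratively reweighted Huber regression (simulated data, illustrative only)
    n = 200;
    X = [ones(n,1), randn(n, 3)];                     % constant plus three regressors
    y = X * [0; 0.2; 0.4; 0.2] + 0.01 * randn(n, 1);  % placeholder response
    beta_ols = regress(y, X);                         % OLS coefficients
    beta_rob = robustfit(X(:, 2:end), y, 'huber');    % Huber-weighted robust coefficients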

4.2 Evaluating Regression Performance

There are a number of different methods for evaluating forecast accuracy. We

will use Mean Absolute Percentage Error (MAPE) because it is a measure of relative

accuracy, allowing us to compare results when the RV measures we forecast vary due to

sampling interval and sub-sampling. Letting $e_i$ be the residual and $y_i$ be the actual value, we define MAPE as:

$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{e_i}{y_i} \right| \qquad (4.2.1)$$

The main problem with MAPE is that the measure is not upper-bounded, so we must be careful of very small or zero values of $y_i$. As Figure 3 shows, our RV is bounded below and above by values that are within a reasonable range of each other.
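For completeness, the MAPE of equation (4.2.1) reduces to a one-line calculation; the realized values and forecasts below are placeholders.

    % MAPE (4.2.1) with placeholder realized values y and forecasts yhat
    y    = [0.022; 0.025; 0.031];         % realized RV values (illustrative)
    yhat = [0.020; 0.027; 0.028];         % model forecasts (illustrative)
    e    = y - yhat;                      % residuals
    MAPE = mean(abs(e ./ y));             % mean absolute percentage error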


5. Empirical Results

5.1 In-Sample Results

The in-sample surface plots (Figures 4-6) show a marked increase in variation in

fit when the sampling interval for either side of the regression is small (< 5 min). This

effect is significantly more pronounced for OLS regressions than for the robust

regressions. Above this threshold, the surface plot is relatively flat, suggesting that any

choice of large sampling interval (≥ 5 min) has little bearing on fit. Sample fit

increases when the LHS sampling interval decreases in each of the models. For the

HAR-RAV models, fit decreases when the RHS sampling interval decreases. Adding

implied volatility appears to curtail most of that variability, however.

Sub-sampling eliminates noisiness in our regressions, producing a smooth surface

plot; however, it does not improve fit uniformly across all sampling intervals. Therefore,

although using sub-sampling is able to ensure some degree of consistency in our results,

it does not play a major role in fit.

Between models, we see that RAV produces a better fit than RV for OLS

regressions over large sampling intervals. However, the addition of implied volatility

provides the best fit, and there no longer appears to be a significant difference between

RV-IV and RAV-IV. Furthermore, the variability seen at smaller intervals is diminished

greatly by the inclusion of IV. With regards to the robust regressions, the differences

between RV and RAV alone are not significant, and robust regressions have also

decreased the variability at the lower sampling intervals. Adding IV improves fit, but the

magnitude of the improvement is not as large as for OLS. Finally, the robust regressions

appear to offer the best fit for each of the four regression models.


We report OLS coefficients for selected combinations for each model in Table 1

and robust coefficients in Table 2. The standard errors for the OLS coefficients are

Newey-West standard errors with a lag of 44 days. We find that in general, the

coefficients are significant at the $\alpha = 0.05$ level or better. The robust regression

coefficients are, with few exceptions, highly significant ($p < 0.001$); however, this is

very likely because the standard errors are not robust to serial correlation.

5.2 Out-of-sample Results

From Figures 7-9, we see that the variability towards the small intervals is

generally larger than in the in-sample results, particularly when using OLS regressions.

Using robust regressions improves consistency when implied volatility is not included;

however, when IV is included, the variation seen at the small intervals is curtailed.

However, with regards to HAR-RV and HAR-RAV, we see the same general pattern as

in the in-sample data. Along the LHS of the regression, decreasing sampling interval

results in a small improvement in accuracy, while along the RHS, we see a drastic decline

in accuracy as the sampling interval decreases.

For the out-of-sample comparisons, we see many of the same results discussed

above. HAR-RV and HAR-RAV perform very similarly in the out-of-sample period, and

the inclusion of implied volatility helps to improve performance. Robust estimation

procedures appear to provide the most accurate forecasts across all models.

We should note that the out-of-sample period used in this paper encompasses a

period of unusually high volatility due to the recent economic turmoil, as seen in Figure 3.

Fradkin (2007) and Forsberg, Ghysels (2007) both found clear evidence that HAR-RAV


offered the best predictions of future volatility; however, they used 2005 and 2001-2003

as their out-of-sample periods, respectively, which were both periods of relatively low

volatility. This may imply that HAR-RAV offers a significant advantage over HAR-RV

when the overall volatility is low and persistence effects are not as strong. However,

further analysis of this topic is beyond the scope of this paper.

6. Conclusion

In this paper, our goal was to examine the impact that different sampling and

regression methodologies have on volatility forecasting in order to gain a better

understanding of how the choices we make with regards to modeling and estimation can

affect our results. To that end, we examined three factors: sampling interval,

sub-sampling, and robust or OLS regressions. First, we found that forecast performance can

vary greatly when the sampling interval falls below 5 min; for most of the models,

decreasing the sampling interval on the LHS of the regression improved accuracy, but

decreasing the sampling interval on the RHS hurt accuracy. Beyond 5 minutes, there is a

high level of consistency in our results. Secondly, sub-sampling is able to reduce the

noisiness in our regression results, but it does not yield any true improvements in overall

forecast accuracy. Finally, our results show that using robust estimation procedures and

implied volatility both improve forecasting performance over the base HAR-RV and

HAR-RAV models, although the robustly estimated models fared the best out-of-sample.


7. Tables and Figures

Figure 1: SPP Data

Figures 1a and 1b: 1a is a scatterplot of SPX intra-day (open-to-close) returns vs. SPP intra-day returns. 1b plots the price movements of our portfolio SPP within an arbitrarily chosen day.

Figure 2: Volatility Signature Plots

Figure 2: These are volatility signature plots, introduced in Andersen, Bollerslev, Diebold, Labys (1999). The fact that RV and RAV are decreasing as the sampling interval becomes smaller than 5 min suggests that market microstructure noise no longer biases either volatility measure.

Figure 3: 1-month Ahead Mean RV and VIX Plots

Figure 3: A plot showing annualized values for the VIX and the annualized monthly volatility for our synthetic portfolio (SPP).

[Figure 1a panel: "Comparison of Daily Open-to-Close Log-returns" (x-axis: S&P 500 log-returns; y-axis: SPP log-returns).]

[Figure 1b panel: "Intra-day Price Movements of SPP for 1 Day" (x-axis: time of day; y-axis: value of SPP, roughly 993-1000).]

[Figure 2 panels: "RV Volatility Signature" and "RAV Volatility Signature" (x-axis: sampling interval in minutes; y-axis: annualized units; series: no sub-sampling vs. with sub-sampling).]

[Figure 3 panel: "1-Month Ahead Mean RV and VIX" (x-axis: date, 2000-2009; y-axis: annualized volatility in %; series: 1-month ahead RV and VIX).]


Figure 4: OLS In-Sample Surface Plot w/out Sub-sampling

Figures 4-9: These are all surface plots, with the sampling interval (from 1 min up to 30 min) of the left-hand side of the regression on the left axis, the sampling interval (also from 1 – 30 min) of the right-hand side of the regression on the right axis, and the MAPE for each combination on the vertical axis.

Figure 5: OLS In-Sample Surface Plot w/ Sub-sampling

[Figures 4 and 5: each is a 2x2 grid of surface plots (In-Sample HAR-RV, HAR-RAV, HAR-RV-IV, HAR-RAV-IV), plotting MAPE against the LHS and RHS sampling intervals (0-30 min).]


Figure 6: Robust In-Sample Surface Plot w/ Sub-sampling

Figure 7: OLS Out-of-Sample Surface Plot w/out Sub-sampling

[Figures 6 and 7: each is a 2x2 grid of surface plots (Figure 6: In-Sample HAR-RV, HAR-RAV, HAR-RV-IV, HAR-RAV-IV; Figure 7: the Out-of-Sample counterparts), plotting MAPE against the LHS and RHS sampling intervals (0-30 min).]


Figure 8: OLS Out-of-Sample Surface Plot w/ Sub-sampling

Figure 9: Robust Out-of-Sample Surface Plot w/ Sub-sampling

[Figures 8 and 9: each is a 2x2 grid of surface plots (Out-of-Sample HAR-RV, HAR-RAV, HAR-RV-IV, HAR-RAV-IV), plotting MAPE against the LHS and RHS sampling intervals (0-30 min).]


Table 1: Coefficients for Select OLS Regressions (w/ Sub-sampling)

                  HAR-RV             HAR-RAV              HAR-RV-IV           HAR-RAV-IV
                  (1,1)    (1,10)    (1,1)     (1,10)     (1,1)    (1,10)     (1,1)    (1,10)
β_0 (x10^-5)      1.4***   2.4***    -3.4***   -2.2***    1.1***   1.5***     -2.1**   -1.2
β_D               .19***   .06**     .004***   .002***    .11**    .03*       .003**   .001*
β_W               .37***   .15***    .008**    .01***     .32**    .12**      .007**   .004**
β_M               .18      .21***    .002      .003*      .03      .06        -.0001   .001
β_IV              ----     ----      ----      ----       .11**    .14***     .09*     .09

                  (10,1)   (10,10)   (10,1)    (10,10)    (10,1)   (10,10)    (10,1)   (10,10)
β_0 (x10^-5)      1.6**    2.7***    -5.7***   -4.6***    1.0*     1.3**      -3.0*    -3.6**
β_D               .34***   .11**     .01***    .004***    .17*     .06*       .004*    .003**
β_W               .64**    .27***    .01**     .01***     .53**    .22**      .01*     .009**
β_M               .07      .25**     -.0003    .002       -.28     .01        -.005    .0004
β_IV              ----     ----      ----      ----       .24***   .22**      .18**    .10

Table 1 notes: Coefficients reported with significance. (x,y): LHS sampled at x min, RHS sampled at y min. Significance levels: * = p<0.05, ** = p<0.01, *** = p<0.001. P-values obtained from Newey-West standard errors with a lag length of 44.

Table 2: Coefficients for Select Robust Regressions (w/ Sub-sampling)

                  HAR-RV             HAR-RAV              HAR-RV-IV           HAR-RAV-IV
                  (1,1)    (1,10)    (1,1)     (1,10)     (1,1)    (1,10)     (1,1)    (1,10)
β_0 (x10^-5)      1.3***   2.1***    -2.7***   -1.6***    1.1***   1.5***     1.6***   -0.9***
β_D               .18***   .08***    .003***   .002***    .11***   .04***     .002***  .001***
β_W               .30***   .15***    .01***    .004***    .25***   .10***     .005***  .003***
β_M               .21***   .17***    .003***   .003***    .10***   .08***     .002***  .002***
β_IV              ----     ----      ----      ----       .08***   .11***     .08***   .07***

                  (10,1)   (10,10)   (10,1)    (10,10)    (10,1)   (10,10)    (10,1)   (10,10)
β_0 (x10^-5)      1.2***   1.9***    -4.2***   -3.3***    0.9***   1.2***     -1.7***  -2.1***
β_D               .30***   .12***    .005***   .003***    .12***   .06***     .002***  .002***
β_W               .37***   .27***    .01***    .01***     .28***   .18***     .01***   .01***
β_M               .27***   .21***    .004***   .004***    -.01     .05***     .0002    .001***
β_IV              ----     ----      ----      ----       .19***   .17***     .17***   .11***

Table 2 notes: Same significance levels as in Table 1. P-values obtained from heteroskedasticity-robust standard errors.


8. References

1. Andersen, T., T. Bollerslev, and F. Diebold, Roughing It Up: Including Jump Components in the Measurement, Modeling, and Forecasting of Return Volatility. The Review of Economics and Statistics, 2007. 89(4): p. 701-720.

2. Andersen, T., et al., Realized Volatility and Correlation. Working Paper, Northwestern University, 1999.

3. Andersen, T., et al., Modelling and Forecasting Realized Volatility. Econometrica, 2003. 71(2): p. 579-625.

4. Andersen, T., T. Bollerslev, and X. Huang, A Semiparametric Framework for Modelling and Forecasting Jumps and Volatility in Speculative Prices. Working Paper, Duke University, 2007.

5. Bandi, F. and J. Russell, Microstructure Noise, Realized Variance, and Optimal Sampling. Review of Economic Studies, 2008. 75(2): p. 339-369.

6. Becker, R., A. Clements, and S. White, On the Informational Efficiency of S&P500 Implied Volatility. North American Journal of Economics and Finance, 2006. 17(2): p. 139-153.

7. Blair, B., S.-H. Poon, and S. Taylor, Forecasting S&P 100 Volatility: The Incremental Information Content of Implied Volatilities and High-Frequency Index Returns. Journal of Econometrics, 2001. 105(1): p. 5-26.

8. Corsi, F., A Simple Long Memory Model of Realized Volatility. Unpublished Manuscript, University of Lugano, 2003.

9. Forsberg, L. and E. Ghysels, Why Do Absolute Returns Predict Volatility So Well? Journal of Financial Econometrics, 2007. 5(1): p. 31-67.

10. Fradkin, A., The Informational Content of Implied Volatility in Individual Stocks and the Market. Unpublished Manuscript, Duke University, 2007.

11. Jiang, G. and Y. Tian, The Model-Free Implied Volatility and its Informational Content. Review of Financial Studies, 2005. 18(4): p. 1305-1342.

12. Law, T.H., The Elusiveness of Systematic Jumps. Unpublished Manuscript, Duke University, 2007.

13. Mincer, J. and V. Zarnowitz, The Evaluation of Economic Forecasts, in Economic Forecasts and Expectations, J. Mincer, Editor. 1969, NBER: New York.

14. Müller, U., et al., Volatilities of Different Time Resolutions - Analyzing the Dynamics of Market Components. Journal of Empirical Finance, 1997. 4(2-3): p. 213-239.

15. Poon, S.-H. and C. Granger, Practical Issues in Forecasting Volatility. Financial Analysts Journal, 2005. 61(1): p. 45-56.

16. Zhang, L., P. Mykland, and Y. Aït-Sahalia, A Tale of Two Time Scales: Determining Integrated Volatility with Noisy High-Frequency Data. Journal of the American Statistical Association, 2005. 100: p. 1394-1411.

