A Closer Look at High-Frequency Data and Volatility Forecasting in a HAR Framework1
Derek Song
ECON 201FS
Spring 2009
1 This report was written in compliance with the Duke Community Standard
1. Introduction
The volatility of asset returns is an area of vital importance for research in
financial theory – risk management and derivative valuation methods are all dependent
on being able to accurately measure and forecast volatility. The recent availability of
high-frequency price data has given rise to new models of volatility that have yielded
significant improvements in the accuracy of volatility measurements and forecasting.
The ability to accurately predict future volatility is particularly important in a practical
sense because of its implications for asset management.
Recent literature, such as Andersen, Bollerslev, Diebold, Labys (2003), shows that
when high-frequency data are used, simple linear autoregressive models have better
predictive capabilities than the more sophisticated ARCH/GARCH and stochastic
volatility models. One such model is the heterogeneous autoregressive (HAR) model, a
simple autoregressive model for realized volatility first proposed by Corsi (2003).
When setting up a regression model, researchers are afforded several degrees of
freedom, including methods for calculating RV and regression methodology. In the
current literature, comparisons of different regression models are often made using a
particular choice for the sampling interval. This raises the question of whether or not
those results would hold given different choices for model parameters, since we would
like to ensure some level of consistency when comparing models. In this paper, we seek
to add to the existing literature by empirically examining the sensitivity of several HAR
model forecasts to various sampling and regression methods. In particular, we compare
four models, HAR-RV, HAR-RAV, and both of the above models with implied volatility
added in. The three factors that we will consider are (1) sampling interval, which runs
from 1 minute to 30 minutes, (2) sub-sampling, and (3) the effect of using robust
regressions instead of OLS to control for outliers and leverage points. We compare the
models by measuring both in-sample fit and out-of-sample performance on a synthetic
portfolio constructed to mimic the S&P 100. We find that when the sampling interval is
set at 5 minutes or higher, there is little variation in forecast accuracy for different
intervals, and that any noisiness is eliminated by using sub-sampling. Furthermore, our
results suggest that including implied volatility has a significant impact on forecasting
accuracy.
The rest of the paper proceeds as follows: theoretical and mathematical
background of volatility and regression models (2), data preparation (3), regression
methodologies (4), empirical results (5), and a conclusion summarizing the paper and the
most important results (6). All tables and figures are given at the end of the paper.
2. Theoretical Background
2.1 Stochastic Model of Asset Returns
In this paper, we assume a widely used model of asset prices that includes jumps.
We assume that the log-prices of a stock, denoted $p_t$, follow the stochastic differential
equation given below:
$$dp_t = \mu_t\,dt + \sigma_t\,dW_t + \kappa_t\,dq_t \qquad (2.1.1)$$
Here, $\mu_t\,dt$ is a time-varying drift component, $\sigma_t\,dW_t$ represents a time-varying volatility
component of the asset price, $W$ is a standard Wiener process, $\kappa$ is the magnitude of the
jump, and $q_t$ represents a counting process which is commonly assumed to be a Poisson
process so that jumps are rare.
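To make the roles of the drift, diffusion, and jump terms concrete, the following is a minimal simulation sketch using an Euler discretization. It is illustrative only: the function name and parameters are hypothetical, the drift and volatility are held constant (the model above allows them to vary over time), the jump sizes are assumed to be normally distributed, and the Poisson increment is approximated by a Bernoulli draw over each short step.

```python
import math
import random


def simulate_log_prices(n_steps, dt, mu, sigma, jump_intensity, jump_scale,
                        p0=0.0, seed=0):
    """Simulate one log-price path by Euler discretization.

    Assumptions for illustration only: constant drift `mu` and volatility
    `sigma`, normally distributed jump sizes with standard deviation
    `jump_scale`, and at most one jump per step.
    """
    rng = random.Random(seed)
    path = [p0]
    for _ in range(n_steps):
        # Drift plus Brownian increment over a step of length dt.
        diffusion = mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
        # Bernoulli approximation to a Poisson increment: a jump arrives
        # with probability jump_intensity * dt.
        if rng.random() < jump_intensity * dt:
            jump = rng.gauss(0.0, jump_scale)
        else:
            jump = 0.0
        path.append(path[-1] + diffusion + jump)
    return path
```

Setting `jump_intensity` to a small value keeps jumps rare, in line with the Poisson assumption above.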
2.2 Market Microstructure Noise
Stock prices are commonly assumed to have a theoretical fundamental price,
calculated as the sum of all discounted future dividend payments. Market microstructure
noise is defined as any short-term deviations of the spot price from the fundamental value
of a stock, and is modeled by
$$p_t^* = p_t + \epsilon_t \qquad (2.2.1)$$
Note that $p_t$ is the logarithm of the observed price, and therefore the error term $\epsilon_t$ is
proportional to the observed price. Market microstructure noise arises due to various
market frictions, including the bid-ask bounce. Because market microstructure noise
distorts price data at high frequencies, it can become problematic for the estimation of
realized volatility. Bandi, Russell (2008) show that in the presence of noise, the RV
estimator will diverge to infinity almost surely. However, Forsberg, Ghysels (2007)
argue that RAV is much more robust to sampling errors and jumps. In this paper, we
consider two ways of circumventing this problem, which are discussed in Sections 2.3 and 3.
2.3 Models of Volatility in Asset Returns
We let $r_{t,j}$ denote the logarithmic (geometric) return at some intra-day time $j$,
given by $r_{t,j} = p_{t,j} - p_{t,j-1}$, where $p_{t,j}$ is the logarithm of the observed price. We will
now define two different measures of volatility, Realized Variance (RV) and Realized
Absolute Value (RAV). We define RV as the sum of the squared log-returns, and RAV
as the sum of the absolute log-returns. As such, RV will be measured in variance units,
while RAV is measured in standard deviation units. Letting $M$ be the number of times
we sample within each day (in our data, we have per-minute returns, yielding a maximum
of 384 samples per day), we calculate the daily RV as follows:
$$RV_t = \sum_{j=1}^{M} r_{t,j}^2 \;\xrightarrow{M \to \infty}\; \int_{t-1}^{t} \sigma_s^2\,ds + \sum_{t-1 < s \le t} \kappa_s^2 \qquad (2.3.1)$$
The next measure, RAV, is defined as:
$$RAV_t = \sqrt{\frac{\pi}{2M}} \sum_{j=1}^{M} |r_{t,j}| \;\xrightarrow{M \to \infty}\; \int_{t-1}^{t} \sigma_s\,ds \qquad (2.3.2)$$
There are several key points to note about these measures; the first is that because of the
way in which they are defined, RV and RAV are not directly comparable. Secondly, as
the sampling interval increases, we are throwing away more and more of the data.
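As an illustration of the two measures, the sketch below computes RV and RAV for a single day from a list of intra-day log-prices. The function names are hypothetical, and the RAV scaling factor $\sqrt{\pi/(2M)}$ is the convention assumed here; it should be checked against the definition in use before the code is relied upon.

```python
import math


def log_returns(log_prices, step=1):
    """Log-returns from log-prices sampled every `step` observations."""
    sampled = log_prices[::step]
    return [b - a for a, b in zip(sampled, sampled[1:])]


def realized_variance(returns):
    """Sum of squared intra-day log-returns (variance units)."""
    return sum(r * r for r in returns)


def realized_absolute_value(returns):
    """Scaled sum of absolute intra-day log-returns (standard deviation units).

    Assumes the scaling sqrt(pi / (2 * M)), where M is the number of returns.
    """
    m = len(returns)
    return math.sqrt(math.pi / (2 * m)) * sum(abs(r) for r in returns)
```

Calling `log_returns(log_prices, step=5)` on per-minute data gives 5-minute returns, which makes the loss of data at coarser intervals explicit: only every fifth price is used.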
Zhang, Mykland, Aït-Sahalia (2005) propose an alternative sampling
methodology known as sub-sampling. Assuming we sample every $\delta$ minutes, we
calculate our measure from each starting point $i = 1, 2, \ldots, \delta$ and average those
calculations, resulting in no discarded data points. Sub-sampling has two distinct
advantages over the traditional sampling method: reduced bias from microstructure noise
and the ability to use all of the available data points regardless of sampling interval.
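The averaging step can be sketched as follows for RV. This is a simplified illustration with a hypothetical function name: it averages the RV computed on each offset grid and applies no correction for the slightly different spans that the grids cover, which a production implementation might add.

```python
def subsampled_rv(log_prices, interval):
    """Average RV over the `interval` sampling grids with offsets 0..interval-1.

    `log_prices` holds one day of equally spaced log-prices (for example,
    per-minute); `interval` is the sampling interval in observations.
    """
    estimates = []
    for offset in range(interval):
        grid = log_prices[offset::interval]
        returns = [b - a for a, b in zip(grid, grid[1:])]
        estimates.append(sum(r * r for r in returns))
    return sum(estimates) / interval
```

With `interval=1` the function reduces to the ordinary RV on the full grid, so the two sampling schemes coincide at the finest interval.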
2.4 HAR Regression Models
In this paper, we rely on the Heterogeneous Autoregressive (HAR) models first
introduced by Müller et al (1997) and Corsi (2003) to forecast volatility. Recent papers
(see: Andersen, Bollerslev, Diebold, Labys (2003) or Andersen, Bollerslev, Huang
(2007)) have shown empirically that simple linear models can often predict future
volatility more accurately than more sophisticated models that can formally capture long
memory processes and persistence. The HAR framework developed by Corsi is
attractive because it is easily estimated using OLS, and is significantly more
parsimonious than the HARCH model of Müller et al (1997). The expected future
variance over an ℎ-day horizon is given by a linear combination of average historical
RV’s over different time scales, which can capture the persistence seen in time series data
without making the sort of restrictive assumptions seen in ARFIMA and GARCH models.
In order to calculate the model, we will define $RV_{t,t+h}$ as the average RV over a
given time span, $h$. This is mathematically represented as follows:
$$RV_{t,t+h} = \frac{1}{h} \sum_{k=t+1}^{t+h} RV_k \qquad (2.4.1)$$
$RAV_{t,t+h}$ is then defined analogously. In this paper, we want to calculate 22-day ahead
forecasts, which correspond to the number of trading days within a calendar month. Thus,
we can set up a HAR-RV regression as:
$$RV_{t,t+22} = \beta_0 + \beta_D RV_{t-1,t} + \beta_W RV_{t-5,t} + \beta_M RV_{t-22,t} + \epsilon_{t+1} \qquad (2.4.2)$$
The explanatory variables are the daily, weekly, and monthly lagged averages of RV,
which were chosen by Corsi in his paper. Also, it should be noted that Andersen,
Bollerslev, Diebold (2007) established that in general, the jump effects embedded in RV
measures are not significant within the context of a HAR regression.
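A minimal sketch of how the HAR-RV regression could be set up from a series of daily RV values is given below. It uses NumPy rather than MATLAB, and all function names are hypothetical. The regressors are the most recent daily RV and the trailing 5-day and 22-day averages, and the target is the average RV over the following 22 days; rows without a full history or a full forward window are dropped.

```python
import numpy as np


def trailing_mean(x, window):
    """Mean of the last `window` values ending at each index; NaN until a full window exists."""
    x = np.asarray(x, dtype=float)
    out = np.full(len(x), np.nan)
    csum = np.concatenate(([0.0], np.cumsum(x)))
    out[window - 1:] = (csum[window:] - csum[:-window]) / window
    return out


def har_design(rv, horizon=22):
    """Build HAR regressors (constant, daily, 5-day, 22-day) and the forward-average target."""
    rv = np.asarray(rv, dtype=float)
    n = len(rv)
    X = np.column_stack([np.ones(n), rv, trailing_mean(rv, 5), trailing_mean(rv, 22)])
    # target[t] is the mean of rv[t+1], ..., rv[t+horizon].
    target = np.full(n, np.nan)
    target[:n - horizon] = trailing_mean(rv, horizon)[horizon:]
    keep = ~np.isnan(X).any(axis=1) & ~np.isnan(target)
    return X[keep], target[keep]


def fit_ols(X, y):
    """Ordinary least squares via NumPy's least-squares solver."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta
```

The HAR-RAV variant would follow the same pattern with the three regressors built from daily RAV instead of daily RV, while the target remains the forward average of RV.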
Forsberg, Ghysels (2007) extend the HAR-class models by using historical RAV
to forecast future RV, which they find to be a significantly better predictor of RV than
historical RV. The model is analogous to the one for HAR-RV, and is defined as:
$$RV_{t,t+22} = \beta_0 + \beta_D RAV_{t-1,t} + \beta_W RAV_{t-5,t} + \beta_M RAV_{t-22,t} + \epsilon_{t+1} \qquad (2.4.3)$$
It should be noted that the physical interpretation of the HAR-RAV model is not identical
to the HAR-RV model, since RAV and RV are in different units.
2.5 Hybrid HAR - Implied Volatility Regressions
There is a large literature on the use of options and model-free implied volatility
to forecast future volatility. Blair, Poon, Taylor (2001) and a literature review by Poon,
Granger (2005) find that implied volatility is a better predictor of volatility than the
commonly used time-series models. Mincer and Zarnowitz (1969) proposed a simple
framework with which to evaluate the efficiency of implied volatility-based forecasting:
$$RV_{t,t+h} = \beta_0 + \beta_{IV} IV_t + \epsilon_{t,t+h} \qquad (2.5.1)$$
If implied volatility were perfectly efficient, $\beta_0 = 0$ and $\beta_{IV} = 1$. However, numerous
papers, including Becker, Clements, White (2006), find that implied volatility is not a
perfectly efficient estimator. Jiang, Tian (2005) showed that model-free implied
volatility is better than options-implied volatility at predicting future volatility and
endorsed the new CBOE VIX methodology for its use of model-free implied volatility.
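The efficiency regression described above is a simple univariate OLS, so its coefficients have a closed form. The sketch below is illustrative only (the function name is hypothetical); it returns the estimated intercept and slope, which can then be compared with the efficient values of 0 and 1. A formal test of efficiency would additionally require standard errors, which are not computed here.

```python
def mincer_zarnowitz(realized, forecast):
    """Closed-form OLS of realized values on a constant and a forecast.

    Returns (intercept, slope). An efficient forecast would have an
    intercept near 0 and a slope near 1.
    """
    n = len(realized)
    mean_f = sum(forecast) / n
    mean_r = sum(realized) / n
    sxx = sum((f - mean_f) ** 2 for f in forecast)
    sxy = sum((f - mean_f) * (r - mean_r) for f, r in zip(forecast, realized))
    slope = sxy / sxx
    intercept = mean_r - slope * mean_f
    return intercept, slope
```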
Fradkin (2007) found evidence that adding implied volatility to HAR models
almost always improved model fit, which suggests that implied volatility contains
information not present in historical realized volatility. We will define hybrid HAR-RV-IV
and HAR-RAV-IV models identical to those used by Fradkin:
$$RV_{t,t+22} = \beta_0 + \beta_D RV_{t-1,t} + \beta_W RV_{t-5,t} + \beta_M RV_{t-22,t} + \beta_{IV} IV_t + \epsilon_{t+1} \qquad (2.5.2)$$
$$RV_{t,t+22} = \beta_0 + \beta_D RAV_{t-1,t} + \beta_W RAV_{t-5,t} + \beta_M RAV_{t-22,t} + \beta_{IV} IV_t + \epsilon_{t+1} \qquad (2.5.3)$$
3. Data Preparation
The high-frequency stock price data used in this paper were obtained from an
online vendor, price-data.com. For this paper, we follow Law (2007) and select 40 of the
largest market-capitalization stocks from the S&P 100 (OEX) and aggregate those stocks to form a
portfolio that we claim can proxy for the S&P 500 (SPX) for two reasons: the OEX is a
subset of the SPX, and there exists a high degree of correlation between these two indices.
In Figure 1a, we show a scatterplot of daily open-to-close returns for our synthetic proxy
portfolio (SPP) versus daily open-to-close returns of the SPX. Our requirements for
inclusion were that data for the stock be present from Jan. 3, 2000 up through Dec. 31,
2008; we also checked for inconsistencies in the data and adjusted the prices for stock
splits. In creating the portfolio, we kept only the data for those days in which all 40
stocks traded, yielding a total of 2240 days. We use an equal-weighting scheme to
construct our portfolio by “buying” $25 of each stock at the initial price. In Section 2.2,
we discussed the problem of market microstructure noise, and we now claim, citing
Figures 1b, 2a, and 2b, that the process of aggregating stocks has averaged out most if not
all of the microstructure noise.
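The equal-weighting scheme can be sketched as follows. The function name and input layout are hypothetical: each stock is represented by a list of prices on a common time grid, a fixed dollar amount is converted into a share count at the first price, and the portfolio value is the sum of the share holdings at each later time. With 40 stocks at $25 each, the initial portfolio value is $1000.

```python
def portfolio_values(price_paths, dollars_per_stock=25.0):
    """Value of a buy-and-hold portfolio with equal initial dollar weights.

    `price_paths` is a list of per-stock price lists on a common time grid.
    """
    # Number of shares bought of each stock at its initial price.
    shares = [dollars_per_stock / path[0] for path in price_paths]
    n_obs = len(price_paths[0])
    return [
        sum(s * path[t] for s, path in zip(shares, price_paths))
        for t in range(n_obs)
    ]
```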
Our implied volatility data were taken from the CBOE website. For the reasons
described in Section 2.5, we used the VIX, a model-free implied volatility index which
uses options on the SPX to calculate the 1-month ahead implied volatility. Because
intra-day data were not available, we use only the closing value of the VIX in our
regressions. We transformed the data so that it is measured in the same units as RV.
We include only those days for which the SPP exists.
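The exact transformation is not spelled out here. One common convention, shown purely as an assumption, treats the VIX as an annualized volatility quoted in percentage points and converts it to a daily variance by squaring and dividing by the number of trading days in a year. The function name and the 252-day year are illustrative choices, not details taken from this paper.

```python
def vix_to_daily_variance(vix_close, trading_days=252):
    """Convert a VIX quote (annualized volatility, in percent) to daily variance units.

    Assumed convention for illustration: 252 trading days per year.
    """
    annual_vol = vix_close / 100.0
    return annual_vol ** 2 / trading_days
```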
Our in-sample data runs for 7 years, from the beginning of 2000 until the end of
2006, yielding 1743 data points. Our out-of-sample data runs from the beginning of 2007
until the end of 2008, yielding 497 data points. We therefore have 24 independent
month-long periods for the out-of-sample result, which should be sufficient to accurately
gauge out-of-sample performance.
4. Regression Methodologies
4.1 Robust Regressions with Iterative Huber Weighting
Poon, Granger (2005) discuss the common problem that sample outliers pose for
volatility estimation. These leverage points are problematic because they can unduly
influence OLS estimators, especially when the regressions use only historical volatility.
Because manually removing outliers in a data set this large is infeasible, we deal
with these leverage points by using robust regressions as a comparison for OLS
regressions. We employ an iterative Huber weighting scheme rather than bisquare weighting
because it converges significantly faster for our regressions. Our regressions are run in
MATLAB, using the regress and robustfit commands to estimate the OLS and robust
coefficients, respectively.
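To show what iterative Huber weighting does, the following is a minimal iteratively reweighted least squares sketch in NumPy. It is not a reimplementation of MATLAB's robustfit: the function name is hypothetical, the tuning constant 1.345 and the MAD-based scale estimate are conventional choices assumed here, and refinements that library routines may apply (such as leverage adjustments) are omitted.

```python
import numpy as np


def huber_irls(X, y, tuning=1.345, max_iter=50, tol=1e-8):
    """Robust regression by iteratively reweighted least squares with Huber weights."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    # Start from the OLS solution.
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    for _ in range(max_iter):
        resid = y - X @ beta
        # Robust scale estimate: median absolute deviation, rescaled.
        scale = np.median(np.abs(resid - np.median(resid))) / 0.6745
        if scale == 0:
            break
        u = np.abs(resid) / (tuning * scale)
        # Huber weights: 1 for small residuals, shrinking as 1/u for large ones.
        weights = 1.0 / np.maximum(u, 1.0)
        sw = np.sqrt(weights)
        new_beta, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
        converged = np.max(np.abs(new_beta - beta)) < tol
        beta = new_beta
        if converged:
            break
    return beta
```

Observations with large residuals receive weights below one, so a single extreme day pulls the fitted coefficients far less than it would under OLS.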
4.2 Evaluating Regression Performance
There are a number of different methods for evaluating forecast accuracy. We
will use Mean Absolute Percentage Error (MAPE) because it is a measure of relative
accuracy, allowing us to compare results when the RV measures we forecast vary due to
sampling interval and sub-sampling. Letting $e_i$ be the residual and $y_i$ be the actual value,
we define MAPE as:
$$MAPE = \frac{1}{n} \sum_{i=1}^{n} \left| \frac{e_i}{y_i} \right| \qquad (4.2.1)$$
The main problem with MAPE is that the measure is not upper-bounded and so we must
be careful of very small or zero values for $y_i$. As Figure 3 shows, our RV is lower and
upper-bounded by values that are within a reasonable range of each other.
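A direct implementation of the measure is short. The sketch below uses a hypothetical function name and raises an error on zero actual values, which is one way of handling the division problem noted above.

```python
def mape(actual, forecast):
    """Mean absolute percentage error of `forecast` relative to `actual`."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")
    errors = [abs((a - f) / a) for a, f in zip(actual, forecast)]
    return sum(errors) / len(errors)
```

For example, actual values of 2.0 and 4.0 with forecasts of 1.0 and 5.0 give relative errors of 0.5 and 0.25, for a MAPE of 0.375.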
5. Empirical Results
5.1 In-Sample Results
The in-sample surface plots (Figures 4-6) show a marked increase in variation in
fit when the sampling interval for either side of the regression is small (below 5 min). This
effect is significantly more pronounced for OLS regressions than for the robust
regressions. Above this threshold, the surface plot is relatively flat, suggesting that any
choice of large sampling interval (5 min or more) has little bearing on fit. Sample fit
increases when the LHS sampling interval decreases in each of the models. For the
HAR-RAV models, fit decreases when the RHS sampling interval decreases. Adding
implied volatility appears to curtail most of that variability, however.
Sub-sampling eliminates noisiness in our regressions, producing a smooth surface
plot; however, it does not improve fit uniformly across all sampling intervals. Therefore,
although using sub-sampling is able to ensure some degree of consistency in our results,
it does not play a major role in fit.
Between models, we see that RAV produces a better fit than RV for OLS
regressions over large sampling intervals. However, the addition of implied volatility
provides the best fit, and there no longer appears to be a significant difference between
RV-IV and RAV-IV. Furthermore, the variability seen at smaller intervals is diminished
greatly by the inclusion of IV. With regards to the robust regressions, the differences
between RV and RAV alone are not significant, and robust regressions have also
decreased the variability at the lower sampling intervals. Adding IV improves fit, but the
magnitude of the improvement is not as large as for OLS. Finally, the robust regressions
appear to offer the best fit for each of the four regression models.
We report OLS coefficients for selected combinations for each model in Table 1
and robust coefficients in Table 2. The standard errors for the OLS coefficients are
Newey-West standard errors with a lag of 44 days. We find that in general, the
coefficients are significant at the $\alpha = 0.05$ level or better. The robust regression
coefficients are, with few exceptions, highly significant ($p < 0.001$); however, this is
very likely because the standard errors are not robust to serial correlation.
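Because the 22-day averaged targets of consecutive days overlap, the regression errors are serially correlated, which is why the choice of standard errors matters. A minimal sketch of Newey-West standard errors with Bartlett kernel weights is given below. The function name is hypothetical and no small-sample correction is applied, so the output may differ slightly from that of packaged routines.

```python
import numpy as np


def newey_west_se(X, resid, lags):
    """Newey-West (HAC) standard errors for OLS coefficients.

    X is the design matrix, resid the OLS residuals, and lags the
    truncation lag (the text above uses 44).
    """
    X = np.asarray(X, dtype=float)
    resid = np.asarray(resid, dtype=float)
    scores = X * resid[:, None]
    # Lag-0 term: the heteroskedasticity-robust "meat".
    S = scores.T @ scores
    for lag in range(1, lags + 1):
        weight = 1.0 - lag / (lags + 1.0)  # Bartlett kernel weight
        gamma = scores[lag:].T @ scores[:-lag]
        S += weight * (gamma + gamma.T)
    bread = np.linalg.inv(X.T @ X)
    cov = bread @ S @ bread
    return np.sqrt(np.diag(cov))
```

With `lags=0` the estimator reduces to heteroskedasticity-robust standard errors, which ignore serial correlation entirely.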
5.2 Out-of-sample Results
From Figures 7-9, we see that the variability towards the small intervals is
generally larger than in the in-sample results, particularly when using OLS regressions.
Using robust regressions improves consistency when implied volatility is not included;
when IV is included, the variation seen at the small intervals is curtailed.
With regard to HAR-RV and HAR-RAV, however, we see the same general pattern as
in the in-sample data. Along the LHS of the regression, decreasing the sampling interval
results in a small improvement in accuracy, while along the RHS, we see a drastic decline
in accuracy as the sampling interval decreases.
For the out-of-sample comparisons, we see many of the same results discussed
above. HAR-RV and HAR-RAV perform very similarly in the out-of-sample period, and
the inclusion of implied volatility helps to improve performance. Robust estimation
procedures appear to provide the most accurate forecasts across all models.
We should note that the out-of-sample period used in this paper encompasses a
period of unusually high volatility due to the recent economic turmoil, as seen in Figure 3.
Fradkin (2007) and Forsberg, Ghysels (2007) both found clear evidence that HAR-RAV
offered the best predictions of future volatility; however, they used 2005 and 2001-2003
as their out-of-sample periods, respectively, which were both periods of relatively low
volatility. This may imply that HAR-RAV offers a significant advantage over HAR-RV
when the overall volatility is low and persistence effects are not as strong. However,
further analysis of this topic is beyond the scope of this paper.
6. Conclusion
In this paper, our goal was to examine the impact that different sampling and
regression methodologies have on volatility forecasting in order to gain a better
understanding of how the choices we make with regards to modeling and estimation can
affect our results. To that end, we examined three factors: sampling interval, sub-
sampling, and robust or OLS regressions. First, we found that forecast performance can
vary greatly when the sampling interval falls below 5 min; for most of the models,
decreasing the sampling interval on the LHS of the regression improved accuracy, but
decreasing the sampling interval on the RHS hurt accuracy. Beyond 5 minutes, there is a
high level of consistency in our results. Secondly, sub-sampling is able to reduce the
noisiness in our regression results, but it does not yield any true improvements in overall
forecast accuracy. Finally, our results show that using robust estimation procedures and
implied volatility both improve forecasting performance over the base HAR-RV and
HAR-RAV models, although the robustly estimated models fared the best out-of-sample.
7. Tables and Figures
Figure 1: SPP Data
Figures 1a and 1b: 1a shows a scatterplot showing SPX intra-day returns vs. SPP intra-day returns. 1b is a plot of the price movements in our portfolio SPP within an arbitrarily chosen day.
Figure 2: Volatility Signature Plots
Figure 2: These are volatility signature plots, introduced in Andersen, Bollerslev, Diebold, Labys (1999). The fact that RV and RAV are decreasing as the sampling interval becomes smaller than 5 min suggests that market microstructure noise no longer biases either volatility measure.
Figure 3: 1-month Ahead Mean RV and VIX Plots
Figure 3: A plot showing annualized values for the VIX and the annualized monthly volatility for our synthetic portfolio (SPP).
[Figure 1a plot: "Comparison of Daily Open-to-Close Log-returns"; x-axis: S&P 500 Log-returns; y-axis: SPP Log-returns.]
[Figure 1b plot: "Intra-day Price Movements of SPP for 1 Day"; x-axis: Time of Day; y-axis: Value of SPP.]
[Figure 2 plot, RV panel: "RV Volatility Signature"; x-axis: Sampling Interval (Minutes); y-axis: Annualized Units; series: No Sub-sampling, With Sub-sampling.]
[Figure 2 plot, RAV panel: "RAV Volatility Signature"; x-axis: Sampling Interval (Minutes); y-axis: Annualized Units; series: No Sub-sampling, With Sub-sampling.]
[Figure 3 plot: "1-Month Ahead Mean RV and VIX"; x-axis: Date; y-axis: Annualized Volatility (%); series: 1-Month Ahead RV, VIX.]
Figure 4: OLS In-Sample Surface Plot w/out Sub-sampling
Figures 4-9: These are all surface plots, with the sampling interval (from 1 min up to 30 min) of the left-hand side of the regression on the left axis, the sampling interval (also from 1 – 30 min) of the right-hand side of the regression on the right axis, and the MAPE for each combination on the vertical axis.
Figure 5: OLS In-Sample Surface Plot w/ Sub-sampling
[Figures 4 and 5 plots: each figure has four surface-plot panels titled In-Sample HAR-RV, In-Sample HAR-RAV, In-Sample HAR-RV-IV, and In-Sample HAR-RAV-IV; axes: LHS, RHS, MAPE.]
Figure 6: Robust In-Sample Surface Plot w/ Sub-sampling
Figure 7: OLS Out-of-Sample Surface Plot w/out Sub-sampling
[Figure 6 plots: four surface-plot panels titled In-Sample HAR-RV, In-Sample HAR-RAV, In-Sample HAR-RV-IV, and In-Sample HAR-RAV-IV; axes: LHS, RHS, MAPE.]
[Figure 7 plots: four surface-plot panels titled Out-of-Sample HAR-RV, Out-of-Sample HAR-RAV, Out-of-Sample HAR-RV-IV, and Out-of-Sample HAR-RAV-IV; axes: LHS, RHS, MAPE.]
Figure 8: OLS Out-of-Sample Surface Plot w/ Sub-sampling
Figure 9: Robust Out-of-Sample Surface Plot w/ Sub-sampling
[Figures 8 and 9 plots: each figure has four surface-plot panels titled Out-of-Sample HAR-RV, Out-of-Sample HAR-RAV, Out-of-Sample HAR-RV-IV, and Out-of-Sample HAR-RAV-IV; axes: LHS, RHS, MAPE.]
Table 1: Coefficients for Select OLS Regressions (w/ Sub-sampling)

                     HAR-RV               HAR-RAV              HAR-RV-IV            HAR-RAV-IV
                     (1,1)     (1,10)     (1,1)     (1,10)     (1,1)     (1,10)     (1,1)     (1,10)
$\beta_0$ (x10^-5)   1.4***    2.4***     -3.4***   -2.2***    1.1***    1.5***     -2.1**    -1.2
$\beta_D$            .19***    .06**      .004***   .002***    .11**     .03*       .003**    .001*
$\beta_W$            .37***    .15***     .008**    .01***     .32**     .12**      .007**    .004**
$\beta_M$            .18       .21***     .002      .003*      .03       .06        -.0001    .001
$\beta_{IV}$         ----      ----       ----      ----       .11**     .14***     .09*      .09

                     (10,1)    (10,10)    (10,1)    (10,10)    (10,1)    (10,10)    (10,1)    (10,10)
$\beta_0$ (x10^-5)   1.6**     2.7***     -5.7***   -4.6***    1.0*      1.3**      -3.0*     -3.6**
$\beta_D$            .34***    .11**      .01***    .004***    .17*      .06*       .004*     .003**
$\beta_W$            .64**     .27***     .01**     .01***     .53**     .22**      .01*      .009**
$\beta_M$            .07       .25**      -.0003    .002       -.28      .01        -.005     .0004
$\beta_{IV}$         ----      ----       ----      ----       .24***    .22**      .18**     .10

Table 1: Coefficients reported with significance. (x,y): LHS sampled at x min, RHS sampled at y min. Significance levels: * = p<0.05, ** = p<0.01, *** = p<0.001. P-values obtained from Newey-West standard errors with a lag length of 44.
Table 2: Coefficients for Select Robust Regressions (w/ Sub-sampling)

                     HAR-RV               HAR-RAV              HAR-RV-IV            HAR-RAV-IV
                     (1,1)     (1,10)     (1,1)     (1,10)     (1,1)     (1,10)     (1,1)     (1,10)
$\beta_0$ (x10^-5)   1.3***    2.1***     -2.7***   -1.6***    1.1***    1.5***     1.6***    -0.9***
$\beta_D$            .18***    .08***     .003***   .002***    .11***    .04***     .002***   .001***
$\beta_W$            .30***    .15***     .01***    .004***    .25***    .10***     .005***   .003***
$\beta_M$            .21***    .17***     .003***   .003***    .10***    .08***     .002***   .002***
$\beta_{IV}$         ----      ----       ----      ----       .08***    .11***     .08***    .07***

                     (10,1)    (10,10)    (10,1)    (10,10)    (10,1)    (10,10)    (10,1)    (10,10)
$\beta_0$ (x10^-5)   1.2***    1.9***     -4.2***   -3.3***    0.9***    1.2***     -1.7***   -2.1***
$\beta_D$            .30***    .12***     .005***   .003***    .12***    .06***     .002***   .002***
$\beta_W$            .37***    .27***     .01***    .01***     .28***    .18***     .01***    .01***
$\beta_M$            .27***    .21***     .004***   .004***    -.01      .05***     .0002     .001***
$\beta_{IV}$         ----      ----       ----      ----       .19***    .17***     .17***    .11***

Table 2: Same significance levels as in Table 1. P-values obtained from heteroskedasticity-robust standard errors.
8. References
1. Andersen, T., T. Bollerslev, and F. Diebold, Roughing It Up: Including Jump Components in the Measurement, Modeling, and Forecasting of Return Volatility. The Review of Economics and Statistics, 2007. 89(4): p. 701-720.
2. Andersen, T., et al., Realized Volatility and Correlation. Working Paper, Northwestern University, 1999.
3. Andersen, T., et al., Modelling and Forecasting Realized Volatility. Econometrica, 2003. 71(2): p. 579-625.
4. Andersen, T., T. Bollerslev, and X. Huang, A Semiparametric Framework for Modelling and Forecasting Jumps and Volatility in Speculative Prices. Working Paper, Duke University, 2007.
5. Bandi, F. and J. Russell, Microstructure Noise, Realized Variance, and Optimal Sampling. Review of Economic Studies, 2008. 75(2): p. 339-369.
6. Becker, R., A. Clements, and S. White, On the Informational Efficiency of S&P500 Implied Volatility. North American Journal of Economics and Finance, 2006. 17(2): p. 139-153.
7. Blair, B., S.-H. Poon, and S. Taylor, Forecasting S&P 100 Volatility: The Incremental Information Content of Implied Volatilities and High-Frequency Index Returns. Journal of Econometrics, 2001. 105(1): p. 5-26.
8. Corsi, F., A Simple Long Memory Model of Realized Volatility. Unpublished Manuscript, University of Lugano, 2003.
9. Forsberg, L. and E. Ghysels, Why Do Absolute Returns Predict Volatility So Well? Journal of Financial Econometrics, 2007. 5(1): p. 31-67.
10. Fradkin, A., The Informational Content of Implied Volatility in Individual Stocks and the Market. Unpublished Manuscript, Duke University, 2007.
11. Jiang, G. and Y. Tian, The Model-Free Implied Volatility and its Informational Content. Review of Financial Studies, 2005. 18(4): p. 1305-1342.
12. Law, T.H., The Elusiveness of Systematic Jumps. Unpublished Manuscript, Duke University, 2007.
13. Mincer, J. and V. Zarnowitz, The Evaluation of Economic Forecasts, in Economic Forecasts and Expectations, J. Mincer, Editor. 1969, NBER: New York.
14. Müller, U., et al., Volatilities of Different Time Resolutions - Analyzing the Dynamics of Market Components. Journal of Empirical Finance, 1997. 4(2-3): p. 213-239.
15. Poon, S.-H. and C. Granger, Practical Issues in Forecasting Volatility. Financial Analysts Journal, 2005. 61(1): p. 45-56.
16. Zhang, L., P. Mykland, and Y. Aït-Sahalia, A Tale of Two Time Scales: Determining Integrated Volatility with Noisy High-Frequency Data. Journal of the American Statistical Association, 2005. 100: p. 1394-1411.