Type I and Type II Fractional Brownian Motions: A Reconsideration
James Davidson and Nigar Hashimzade
University of Exeter
1
Long Memory: Some Background
Also called strong dependence.
Ideas developed originally by
Harold Hurst (hydrology, weather patterns)
Benoit Mandelbrot (commodity and asset prices).
Consider a stationary zero-mean time series $x_t$, $t = 1, \ldots, T$, with variance $\sigma^2$.
1. Short memory. Autocorrelations are summable:
$$\sum_{t=1}^{T} x_t = O_p(T^{1/2}), \quad\text{and}\quad T^{-1/2}\sum_{t=1}^{[Tr]} x_t \Rightarrow B(r), \quad 0 \le r \le 1$$
where $B$ is Brownian motion (BM).
2. Long memory. Autocorrelations are nonsummable:
$$\sum_{t=1}^{T} x_t = O_p(T^{1/2+d}), \quad 0 < d < \tfrac{1}{2}; \quad\text{and}\quad T^{-1/2-d}\sum_{t=1}^{[Tr]} x_t \Rightarrow B_d(r), \quad 0 \le r \le 1$$
where $B_d$ is fractional Brownian motion (fBM).
These continuous-time ‘limit processes’ are essential modelling tools for large sample inference in time series models.
2
[Figure 7. Annual Nile minima (mean deviations)]
The problem:
There is a single agreed mathematical model of Brownian motion – “Wiener measure”.
However, there are several alternative models of fractional Brownian motion.
There is no consensus among statisticians/econometricians about which model to use.
Also, not a little misunderstanding of the issues!
4
Fractional Brownian motion:
Type I:
$$X(r) = \frac{1}{\Gamma(d+1)}\int_0^r (r-s)^d\, dB(s) + \frac{1}{\Gamma(d+1)}\int_{-\infty}^0 \left[(r-s)^d - (-s)^d\right] dB(s) \qquad (1)$$
Type II:
$$X^*(r) = \frac{1}{\Gamma(d+1)}\int_0^r (r-s)^d\, dB(s) \qquad (2)$$
where $-\tfrac{1}{2} < d < \tfrac{1}{2}$ and $B$ denotes regular Brownian motion.
(Mandelbrot and van Ness 1968, Marinucci and Robinson 1999).
Write $X = X^* + X^{**}$, where $X^{**}(r)$ is defined as the second of the two terms in (1).
The processes X∗ and X∗∗ are Gaussian, and independent of each other.
Variance of (1) exceeds that of (2).
Increments of (1) are stationary, whereas those of (2) are not.
5
Motivation for fBM processes
1. Postulate a realization, of size n, of a discrete long memory process.
2. Consider the weak limit of the normalized partial sum process, as n → ∞.
Thus, define
$$x_t = (1-L)^{-d} u_t \qquad (3)$$
where $\{u_t\}_{-\infty}^{\infty}$ is i.i.d.$(0, \sigma^2)$, and
$$(1-L)^{-d} = \sum_{j=0}^{\infty} b_j L^j, \qquad b_j = \frac{\Gamma(d+j)}{\Gamma(d)\Gamma(1+j)}.$$
Define the partial sum process
$$X_n(r) = \frac{1}{n^{1/2+d}} \sum_{t=1}^{[nr]} x_t, \quad 0 \le r \le 1 \qquad (4)$$
It is known (e.g. Davidson and de Jong 2000) that
$$X_n \xrightarrow{d} X.$$
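As a quick numerical companion (not part of the slides), the weights $b_j$ and a type II draw of (3) can be generated directly; `fi_weights`, `simulate_type2` and `partial_sum_endpoint` are illustrative names.

```python
import random

def fi_weights(d, n):
    # b_j = Gamma(d+j) / (Gamma(d) Gamma(1+j)), computed by the stable
    # recursion b_j = b_{j-1} (j + d - 1) / j with b_0 = 1
    b = [1.0]
    for j in range(1, n):
        b.append(b[-1] * (j + d - 1) / j)
    return b

def simulate_type2(d, n, rng):
    # truncated filter x_t = sum_{j=0}^{t-1} b_j u_{t-j}: shocks before t = 1 are zero
    b = fi_weights(d, n)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    return [sum(b[j] * u[t - j] for j in range(t + 1)) for t in range(n)]

def partial_sum_endpoint(x, d):
    # X_n(1) = n^{-(1/2+d)} sum_{t=1}^n x_t, as in (4)
    return sum(x) / len(x) ** (0.5 + d)

x = simulate_type2(0.4, 500, random.Random(42))
X1 = partial_sum_endpoint(x, 0.4)
```

Repeating the last two lines over many replications simulates the distribution of $X^*(1)$.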
6
On the other hand, define
$$x_t^* = (1-L)^{-d} u_t^* \qquad (5)$$
where $u_t^* = \mathbf{1}\{t > 0\}\, u_t$.
Defining $X_n^*$ like (4) with $x_t^*$ replacing $x_t$, it is known (Marinucci and Robinson 2000) that
$$X_n^* \xrightarrow{d} X^*.$$
Model (5) is used in simulation exercises to generate fractionally integrated processes, as an alternative to setting a fixed, finite truncation of the lag distribution, common to every t.
Model (5) is problematic: nothing about the date when we start to observe a series suggests that we ought to set all shocks preceding it to 0.
Such truncation is common in time series modelling, although usually justified by the assumption that the effect is asymptotically negligible.
However, if (5) is used to generate the artificial data, the limiting distribution simulated will be Type II.
If the observed data ought to be treated as drawn from (3), then the estimated critical values are incorrect even in large samples.
7
Properties of Fractional Brownian Motions
Since
$$X(r_1)X(r_2) = \tfrac{1}{2}\left[X(r_1)^2 + X(r_2)^2 - (X(r_2) - X(r_1))^2\right],$$
a formula for the variance of an increment $X(r_2) - X(r_1)$ is sufficient to determine the complete covariance structure.
To motivate our discussion, consider the case $r_1 = 0$ and $r_2 = r \in (0,1]$.
From Mandelbrot and Van Ness (1968), Davidson and Hashimzade (2007),
$$E X(r)^2 = V(d)\, r^{2d+1}$$
where
$$V(d) = \frac{\Gamma(1-2d)}{(2d+1)\Gamma(1+d)\Gamma(1-d)}.$$
By contrast,
$$E X^*(r)^2 = V^*(d)\, r^{2d+1}$$
where
$$V^*(d) = \frac{1}{(2d+1)\Gamma(d+1)^2}.$$
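The two variance constants are easy to evaluate; a minimal sketch using only the standard library (illustrative names):

```python
import math

def V(d):
    # type I constant: V(d) = Gamma(1-2d) / ((2d+1) Gamma(1+d) Gamma(1-d))
    return math.gamma(1 - 2 * d) / ((2 * d + 1) * math.gamma(1 + d) * math.gamma(1 - d))

def V_star(d):
    # type II constant: V*(d) = 1 / ((2d+1) Gamma(d+1)^2)
    return 1.0 / ((2 * d + 1) * math.gamma(d + 1) ** 2)
```

At d = 0 both constants equal 1 (the BM case), while at d = 0.4 the implied standard deviations are $V(0.4)^{1/2} \approx 1.389$ and $V^*(0.4)^{1/2} \approx 0.840$, the values quoted later in the slides.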
8
[Plots of $V$ (solid line) and $V^*$ (dashed line) over $(-0.5, 0.5)$]
It is easy to see how the distributions of functionals such as $\int_0^1 X\, dr$ and $\int_0^1 X^2\, dr$ will differ correspondingly for these two models.
9
Stochastic Integrals
For type I processes $X_1$ (integrand) and $X_2$ (integrator), Davidson and Hashimzade (2007, Proposition 4.1) give
$$E\int_0^1 X_1\, dX_2 = \frac{\omega_{12}\,\Gamma(1-d_1-d_2)\sin(\pi d_2)}{\pi(d_1+d_2)(1+d_1+d_2)}$$
where $\omega_{12} = E[X_1(1)X_2(1)]$.
On the other hand, for type II processes $X_1^*$ and $X_2^*$, where $\omega_{12}$ is defined analogously:
Proposition 2.1
$$E\int_0^1 X_1^*\, dX_2^* = \frac{\omega_{12}\, d_2}{(1+d_1+d_2)(d_1+d_2)\Gamma(1+d_1)\Gamma(1+d_2)}$$
10
[$E\int_0^1 X_1\, dX_2$ (solid line) and $E\int_0^1 X_1^*\, dX_2^*$ (dashed line) as functions of $d_2$, with $\omega_{12} = 1$, $d_1 = 0.4$]
11
Comment
These large discrepancies pose an important issue: which of these models is the more appropriate for use in econometric inference? Marinucci and Robinson (1999) remark:
“It is of some interest to note that [type II fBM] is taken for granted as the proper definition of fractional Brownian motion in the bulk of the econometric time series literature, whereas the probabilistic literature focuses on [type I fBM]. This dichotomy mirrors differing definitions of nonstationary fractionally integrated processes...”
Two approaches to the fractional model:
1. (“type II”) Continuum of nonstationary models indexed on d ≥ 0. Unified framework for I(1), I(0), I(d).
Must specify a finite start date, with the initial condition given by a different mechanism.
2. (“type I”) Integer integration (cumulation from a fixed start date) conceptually distinct from stationary long memory.
Stationary increments can start at “date” −∞.
12
Comment, cont’d
The distinction is artificial, but arises because our time series models are highly simplified representations of (typically) nonlinear/aggregated processes.
What matters, in the choice of linear representation, is how best to represent the joint distribution of the (finite) observed series.
By representing processes as linear in i.i.d. shocks, stationarity requires “start-up” in the infinite past.
Remember, the shocks are unobserved and strictly fictional - just an artificial modelling device.
Don’t get hung up on what it “means” to remember the infinite past.
13
Working with Type I Processes
Nonstandard asymptotic inference involves distributions that are unavailable in closed form.
Theory becomes operational by simulating the statistic on the computer for large, finite n, using i.i.d. pseudo-Gaussian increments.
Appeal to an invariance principle allows us to use the resulting tables of critical values for observed data.
A simulation strategy is therefore an essential step of the inference procedure.
However, simulating type I processes in some regions of the parameter space is problematic...
14
Simulation Strategies For Type I Processes
1. Using Presample Lags
Apply the formula
$$x_t = \sum_{j=0}^{m+t-1} b_j u_{t-j}$$
Choosing m “large enough” should approximate the type I process to any desired degree of accuracy.
The table shows the standard deviations in 10,000 replications of the terminal points
$$X_n(1) = \frac{1}{n^{1/2+d}}\sum_{t=1}^{n} x_t$$
where d = 0.4 and n = 1000.

m    0      1000   3000   6000   9000
SD   0.843  0.996  1.036  1.108  1.137

Comment: for comparison, note $V^*(0.4)^{1/2} = 0.840$ and $V(0.4)^{1/2} = 1.389$.
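A sketch of this presample-lag experiment at a much smaller scale (illustrative names; small n, m and replication counts so that it runs quickly, so the SDs are rougher than the table’s):

```python
import math
import random

def simulate_presample(d, n, m, rng):
    # x_t = sum_{j=0}^{m+t-1} b_j u_{t-j}: m presample shocks approximate type I
    total = n + m
    b = [1.0]
    for j in range(1, total):
        b.append(b[-1] * (j + d - 1) / j)
    u = [rng.gauss(0.0, 1.0) for _ in range(total)]   # u[0] is the earliest shock
    x = []
    for t in range(n):
        idx = m + t                                   # position of shock u_t
        x.append(sum(b[j] * u[idx - j] for j in range(idx + 1)))
    return x

def endpoint_sd(d, n, m, reps, seed=1):
    # SD of X_n(1) = n^{-(1/2+d)} sum_t x_t over replications
    rng = random.Random(seed)
    z = [sum(simulate_presample(d, n, m, rng)) / n ** (0.5 + d) for _ in range(reps)]
    mean = sum(z) / reps
    return math.sqrt(sum((v - mean) ** 2 for v in z) / reps)

sd_m0 = endpoint_sd(0.4, 30, 0, 400)      # m = 0: the type II case
sd_m240 = endpoint_sd(0.4, 30, 240, 400)  # larger m moves the SD toward 1.389
```

As in the table, the SD creeps up only slowly with m, which is the point of the comment above.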
15
2. Frequency Domain Simulation
When $u_t$ is i.i.d. Gaussian, $x_t$ has the harmonic representation
$$x_t = \frac{\sigma}{\sqrt{2\pi}}\int_{-\pi}^{\pi} e^{i\lambda t} g(\lambda)\, dW(\lambda)$$
where $g(\lambda) = (1 - e^{-i\lambda})^{-d}$, $W(-d\lambda) = \overline{W(d\lambda)}$,
$$E\, dW(\lambda) = 0, \qquad E\, dW(\lambda)\overline{dW(\mu)} = \begin{cases} d\lambda, & \lambda = \mu \\ 0, & \text{otherwise.} \end{cases}$$
The process is stationary by construction.
Davidson and Hashimzade (2006, Theorem 2.2) show the weak limit of the partial sum process is type I fBM.
16
In principle, one could use the FFT to compute the discrete approximation
$$x_t = \frac{\sigma}{2\sqrt{m}}\sum_{k=1-m}^{m-1} e^{i\omega_k t} g(\omega_k) W_k, \quad t = 0, \ldots, m-1, \quad \omega_k = \frac{\pi k}{m}$$
where
$$W_k = \begin{cases} U_k + iV_k, & k \ge 0 \\ U_k - iV_k, & k < 0 \end{cases}, \qquad U_k, V_k \sim NI(0,1)$$
and, for $|k| > 0$,
$$g(\omega_k) = (1 - e^{-i\omega_k})^{-d} = \left(2\sin\frac{\omega_k}{2}\right)^{-d}\left[\cos\frac{(\pi - \omega_k)d}{2} - i\sin\frac{(\pi - \omega_k)d}{2}\right]$$
Problem: singularity of $g$ at 0.
The natural solution is to use the series expansion $g(\lambda) = \sum_{j=0}^{\infty} b_j e^{-ij\lambda}$, truncated at m terms.
Case: d = 0.4, $\sigma^2 = 1$ and n = 1000; SD of $X_n(1)$ in 10,000 replications:

m    1000   5000   10,000  20,000
SD   1.106  1.128  1.166   1.200

True type I SD = 1.389. Comment: not a feasible procedure for routine application.
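The polar form of $g$ can be checked against direct evaluation of the principal complex power; a minimal sketch (illustrative name):

```python
import cmath
import math

def g(omega, d):
    # g(w) = (1 - e^{-iw})^{-d} in polar form:
    # (2 sin(w/2))^{-d} [cos((pi-w)d/2) - i sin((pi-w)d/2)], valid for 0 < w <= pi
    r = (2.0 * math.sin(omega / 2.0)) ** (-d)
    phase = (math.pi - omega) * d / 2.0
    return r * complex(math.cos(phase), -math.sin(phase))
```

This works because $1 - e^{-i\omega} = 2\sin(\omega/2)\, e^{i(\pi-\omega)/2}$ on $(0, \pi]$, so the principal power has modulus $(2\sin(\omega/2))^{-d}$ and argument $-(\pi-\omega)d/2$.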
17
3. Choleski Method, Circulant Embedding and Wavelets
Methods based directly on the autocovariance function
$$\gamma(k) = \sigma^2\, \frac{\Gamma(1-2d)\,\Gamma(k+d)\sin(\pi d)}{\pi\, \Gamma(k+1-d)}$$
do generate a type I process.
The Choleski and CE methods are basically equivalent, but CE is much more numerically efficient.
In practice, wavelet methods use a discrete time ARFIMA process to supply the low-frequency persistence properties of fBM. Wavelets then fill in the high-frequency “details”.
If the ARFIMA is simulated by CE, then the wavelet representation also corresponds to type I.
However, these methods are not very suitable for econometric applications.
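A sketch of the Choleski method under the autocovariance formula above (σ² = 1 by default; `acov` and `simulate_choleski` are illustrative names):

```python
import math
import numpy as np

def acov(d, nlags, sigma2=1.0):
    # gamma(k) = sigma^2 Gamma(1-2d) Gamma(k+d) sin(pi d) / (pi Gamma(k+1-d)),
    # started from gamma(0) = sigma^2 Gamma(1-2d) / Gamma(1-d)^2 and the
    # stable recursion gamma(k) = gamma(k-1) (k - 1 + d) / (k - d)
    g = [sigma2 * math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for k in range(1, nlags):
        g.append(g[-1] * (k - 1 + d) / (k - d))
    return np.array(g)

def simulate_choleski(d, n, rng):
    # type I fractional noise: x = L z, where C = L L' and C is the
    # Toeplitz matrix built from the autocovariances
    g = acov(d, n)
    C = g[np.abs(np.arange(n)[:, None] - np.arange(n)[None, :])]
    L = np.linalg.cholesky(C)
    return L @ rng.standard_normal(n)

x = simulate_choleski(0.4, 200, np.random.default_rng(0))
```

The O(n³) factorization is what makes circulant embedding (an FFT-based O(n log n) factorization of a larger circulant matrix) the preferred variant in practice.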
18
4. Simulation by Aggregation
Granger (1981) aggregation scheme:
Sum a large number of independently generated stable AR(1) processes.
Coefficients are randomly generated in the interval (0,1) as $\sqrt{\beta}$, where $\beta$ is a drawing from the Beta(a,b) distribution.
Granger showed that the resulting aggregate series $x_t$ would possess the attributes of a fractional sequence with d = 1 − b; for example, with d < 1/2 the autocovariances $E x_t x_{t-k}$ will decrease at the rate $k^{2d-1}$.
The ‘long memory’ attribute can be identified with the incidence, in a suitable proportion of the aggregate, of AR roots close to 1.
This procedure will generate a type II process if the micro-series are initialized at date t = 0.
To get a type I process requires ‘remote’ start dates for the micro-series.
Comment: Same problem as before - this method is also infeasible in practice.
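A sketch of the aggregation scheme; note that reading the coefficient rule as $\rho = \sqrt{\beta}$ with $\beta \sim$ Beta(a,b) is an assumption about the garbled slide, and the function name is illustrative:

```python
import math
import random

def aggregate_ar1(n_micro, n_obs, a, b, rng):
    # Sum of independent AR(1) micro-series; each coefficient is
    # rho = sqrt(beta), beta ~ Beta(a, b) -- an assumed reading of the slide
    agg = [0.0] * n_obs
    for _ in range(n_micro):
        rho = math.sqrt(rng.betavariate(a, b))
        y = 0.0                       # zero initial condition at t = 0: type II
        for t in range(n_obs):
            y = rho * y + rng.gauss(0.0, 1.0)
            agg[t] += y
    return [v / math.sqrt(n_micro) for v in agg]

x = aggregate_ar1(500, 200, 1.0, 1.2, random.Random(7))
```

A type I version would require burning in each micro-series from a ‘remote’ start date, which is what makes the method expensive.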
19
An Alternative Simulation Strategy
1. Univariate Case
If $x_t$ is defined by (3), write $x_t = x_t^* + x_t^{**}$ where
$$x_t^* = \sum_{j=0}^{t-1} b_j u_{t-j}, \qquad x_t^{**} = \sum_{j=t}^{\infty} b_j u_{t-j}$$
$X^*$ and $X^{**}$ are:
1. weak limits of the partial sum processes $X_n^*$ and $X_n^{**}$ derived from $x_t^*$ and $x_t^{**}$ respectively;
2. Gaussian, and independent of each other.
Suppose the $u_t$ process is i.i.d. Gaussian. Then $x_t^*$ and $x_t^{**}$ are also independent, and the vector $x^{**} = (x_1^{**}, \ldots, x_n^{**})'$ is Gaussian with a known covariance matrix.
20
A convenient fact: the formula $E x_0 x_{-k} = \gamma(k)$ (see above) has the alternative representation
$$E x_0 x_{-k} = \sigma^2 \sum_{j=0}^{\infty} b_j b_{j+k}.$$
Therefore, for any $t, s > 0$,
$$E x_t^{**} x_s^{**} = \sigma^2 \sum_{j=0}^{\infty} b_{j+t} b_{j+s} = E x_0 x_{-|t-s|} - \sigma^2 \sum_{j=0}^{\min(t,s)-1} b_j b_{j+|t-s|}.$$
The sequence $\{b_j\}$ is easily constructed by the recursion
$$b_j = b_{j-1}\,\frac{j + d - 1}{j} \quad\text{for } j > 0, \text{ with } b_0 = 1.$$
The $n \times n$ covariance matrix
$$C_n = E x^{**} x^{**\prime}$$
can therefore be constructed with minimal computational effort.
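The covariance construction can be sketched as follows (illustrative names; σ² = 1 by default), using the recursion for $b_j$ and the truncation identity above:

```python
import math
import numpy as np

def fi_weights(d, N):
    # b_j = b_{j-1} (j + d - 1) / j, with b_0 = 1
    b = [1.0]
    for j in range(1, N):
        b.append(b[-1] * (j + d - 1) / j)
    return np.array(b)

def build_Cn(d, n, sigma2=1.0):
    # C[t,s] = E x_t** x_s** = gamma(|t-s|) - sigma^2 sum_{j<min(t,s)} b_j b_{j+|t-s|}
    gam = [sigma2 * math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for k in range(1, n):
        gam.append(gam[-1] * (k - 1 + d) / (k - d))   # gamma(k)/gamma(k-1)
    b = fi_weights(d, 2 * n)
    C = np.empty((n, n))
    for t in range(1, n + 1):
        for s in range(t, n + 1):
            k = s - t
            C[t - 1, s - 1] = C[s - 1, t - 1] = gam[k] - sigma2 * float(np.dot(b[:t], b[k:k + t]))
    return C
```

Only the n autocovariances and the first 2n weights are needed, so the cost is trivial next to the simulation itself.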
21
Idea: simulate the distribution of $x^{**}$ by simply making an appropriate collection of Gaussian drawings, independent of $x^*$ and Gaussian with covariance $C_n$.
Theorem 4.1 If
$$X_n^{**}(r) = \frac{1}{n^{1/2+d}}\sum_{t=1}^{[nr]} x_t^{**},$$
then $X_n^{**} \xrightarrow{d} X^{**}$.
Let $x^* = (x_1^*, \ldots, x_n^*)'$ be computed by the usual moving truncation method so that, by standard arguments,
$$X_n^* \xrightarrow{d} X^*.$$
It follows by the continuous mapping theorem that
$$X_n = X_n^* + X_n^{**} \xrightarrow{d} X = \text{Type I fBM!}$$
If $u_t$ is either not Gaussian, or is weakly dependent but not i.i.d., this simulation strategy will be inexact in small samples.
However, it will still be asymptotically valid under the usual conditions for the invariance principle.
22
Points
$C_n$ tends rapidly to singularity as n increases.
Hence, only a few Gaussian drawings are needed to generate the complete sequence.
Suppose n is small enough that $C_n$ can be diagonalized numerically (say, n ≤ 150):
1. Obtain the decomposition
$$C_n = V_n V_n'$$
where $V_n$ is an $n \times s$ matrix, s being the number of numerically positive eigenvalues
(for d = 0.4, s ≈ 6!).
2. Draw an independent standard Gaussian vector z ($s \times 1$), and compute $x^{**} = V_n z$.
NB: in a Monte Carlo experiment, $V_n$ only has to be computed once.
Computational cost is therefore comparable to type II.
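Steps 1-2 can be sketched as follows; the tolerance defining “numerically positive” is an illustrative choice, and the helper that rebuilds $C_n$ repeats the construction described above:

```python
import math
import numpy as np

def build_Cn(d, n):
    # covariance matrix of (x_1**, ..., x_n**), with sigma^2 = 1
    gam = [math.gamma(1 - 2 * d) / math.gamma(1 - d) ** 2]
    for k in range(1, n):
        gam.append(gam[-1] * (k - 1 + d) / (k - d))
    b = [1.0]
    for j in range(1, 2 * n):
        b.append(b[-1] * (j + d - 1) / j)
    b = np.array(b)
    C = np.empty((n, n))
    for t in range(1, n + 1):
        for s in range(t, n + 1):
            k = s - t
            C[t - 1, s - 1] = C[s - 1, t - 1] = gam[k] - float(np.dot(b[:t], b[k:k + t]))
    return C

def low_rank_factor(C, tol=1e-9):
    # C ~ V V', keeping only the numerically positive eigenvalues: s << n
    w, Q = np.linalg.eigh(C)
    keep = w > tol * w.max()
    return Q[:, keep] * np.sqrt(w[keep])

C = build_Cn(0.4, 100)
V = low_rank_factor(C)                    # n x s factor; compute once per design
x_ss = V @ np.random.default_rng(0).standard_normal(V.shape[1])   # one draw of x**
```

Each replication then costs only an s-vector of Gaussian draws and one matrix-vector product.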
23
Large Samples
Where n is too large to perform the diagonalization (n > 150, say), resort to approximation by interpolation:
The tth row of the $V_n$ matrix has squared length $E x_t^{**2}$ (known).
The columns of $V_n$ are orthogonal, and accordingly have a characteristic structure.
Combine these pieces of information as follows:
1. Construct and diagonalize $C_p$, where p is chosen as the largest whole divisor of n not exceeding 150.
2. Given $V_p$, construct $V_n$ as follows:
a) For t = 1, n/p, 2n/p, …, p(n/p), set the tth row of $V_n$ by taking the (pt/n)th row of $V_p$, renormalized to have squared norm equal to $E x_t^{**2}$.
b) Fill in the missing rows by linear interpolation.
c) Renormalize the rows to satisfy $v_{nt}' v_{nt} = E x_t^{**2}$.
24
The figure plots, for the case d = 0.4 and n = 150, the first 4 columns of $V_n$ by exact calculation (solid lines) and by interpolation from p = 50 (dashed lines).
[Columns of $V_n$, n = 150: actual (solid lines); interpolated from p = 50 (dashed lines)]
25
Properties of the Simulation

         Type I                      Type II
d        Theoretical  Monte Carlo    Theoretical  Monte Carlo
0.4      1.389        1.383          0.840        0.842
0.2      0.997        0.993          0.920        0.917
0        1            1.0085         1            1.0085
−0.2     1.176        1.167          1.109        1.104
−0.4     1.877        1.76           1.501        1.41

Theoretical standard deviations of the random variables $X(1)$ and $X^*(1)$, with the same quantities estimated by Monte Carlo from samples of size n = 1000 for comparison.
26
2. The Multivariate Case
If $x_t = \sum_{j=0}^{\infty} B_j u_{t-j}$ ($m \times 1$), where $E u_t u_t' = \Sigma$ and the $B_j$ are $m \times m$, then
$$\Gamma(k) = E x_0 x_{-k}' = \sum_{j=0}^{\infty} B_j \Sigma B_{j+k}'.$$
By the preceding arguments,
$$E x_t^{**} x_s^{**\prime} = \Gamma(s-t) - \sum_{j=0}^{t-1} B_j \Sigma B_{j+s-t}', \quad t \le s.$$
Stacking $x_1^{**}, \ldots, x_m^{**}$ into a vector $x^{**}$ ($mn \times 1$),
$$E x^{**} x^{**\prime} = C_n = \begin{pmatrix} C_{11,n} & \cdots & C_{1m,n} \\ \vdots & & \vdots \\ C_{m1,n} & \cdots & C_{mm,n} \end{pmatrix}$$
Letting $b_{k,j}'$ represent the kth row of $B_j$, note that
$$(C_{kh,n})_{ts} = E x_{kt}^{**} x_{hs}^{**} = \gamma_{kh}(s-t) - \sum_{j=0}^{t-1} b_{k,j}' \Sigma\, b_{h,j+s-t}, \quad s \ge t.$$
27
Fractional noise example:
Theorem 4.2 For $x_{ht}$ and $x_{kt}$ defined by (3) with respect to i.i.d. shock processes $u_{ht}$ and $u_{kt}$ with covariance $\sigma_{hk}$,
$$\gamma_{kh}(s-t) = \sigma_{hk}\,\frac{\sin(\pi d_h)\,\Gamma(1 - d_h - d_k)\,\Gamma(d_h + s - t)}{\pi\,\Gamma(1 - d_k + s - t)}.$$
Procedure for Multivariate Models: decompose $C_n$ as
$$C_n = V_n V_n'$$
where $V_n = (V_{1n}', \ldots, V_{mn}')'$.
Use the formula
$$x_j^{**} = V_{jn} z$$
to generate replications of the jth process.
If mn > 150, modify the method by the interpolation step described above.
28
Distributions of Fractional Brownian Functionals
Autoregression: DF statistic $n\hat{\rho}$.
Without intercept:
$$\frac{n\sum_{t=1}^{n-1} S_t x_{t+1}}{\sum_{t=1}^{n-1} S_t^2} \Rightarrow \frac{\int_0^1 X\, dX}{\int_0^1 X^2\, ds}$$
With intercept:
$$\frac{n\sum_{t=1}^{n-1} (S_t - \bar{S}) x_{t+1}}{\sum_{t=1}^{n-1} (S_t - \bar{S})^2} \Rightarrow \frac{\int_0^1 X\, dX - X(1)\int_0^1 X\, ds}{\int_0^1 X^2\, ds - \left(\int_0^1 X\, ds\right)^2}$$

P ≤                      0.01    0.05    0.1     0.9    0.95   0.99
no intercept    Type I   −0.12   0.04    0.28    2.39   2.88   4.17
                Type II  −0.24   −0.04   0.16    2.71   3.30   4.68
with intercept  Type I   −4.51   −2.75   −2.05   1.97   2.67   4.25
                Type II  −4.02   −2.45   −1.78   2.47   3.15   4.94

Quantiles of the “Dickey-Fuller” statistics
29
Figure 4: Simulation of unit root autoregression: d = 0.4. 1000 observations, 1,000,000 replications.
[Panels, each showing Type I and Type II densities: unit root autoregression, no intercept; unit root autoregression with intercept]
Bivariate Case - Stochastic Integrals
$$\frac{\sum_{t=1}^{n} S_{1t} x_{2t}}{n^{1+d_1+d_2}} \Rightarrow \int_0^1 X_1\, dX_2, \qquad \frac{\sum_{t=1}^{n} (S_{1t} - \bar{S}_1) x_{2t}}{n^{1+d_1+d_2}} \Rightarrow \int_0^1 X_1\, dX_2 - X_2(1)\int_0^1 X_1\, ds.$$
Cointegration t statistics:
$$\frac{n^{1/2-d_2}\sum_{t=1}^{n} S_{1t} x_{2t}}{\sqrt{\sum_{t=1}^{n} S_{1t}^2 \sum_{t=1}^{n} x_{2t}^2 - \left(\sum_{t=1}^{n} S_{1t} x_{2t}\right)^2}} \Rightarrow \frac{\int_0^1 X_1\, dX_2}{\sqrt{\sigma_2^2 \int_0^1 X_1^2\, ds}}$$
$$\frac{n^{1/2-d_2}\sum_{t=1}^{n} (S_{1t} - \bar{S}_1) x_{2t}}{\sqrt{\sum_{t=1}^{n} (S_{1t} - \bar{S}_1)^2 \sum_{t=1}^{n} x_{2t}^2 - \left(\sum_{t=1}^{n} (S_{1t} - \bar{S}_1) x_{2t}\right)^2}} \Rightarrow \frac{\int_0^1 X_1\, dX_2 - X_2(1)\int_0^1 X_1\, ds}{\sqrt{\sigma_2^2\left[\int_0^1 X_1^2\, ds - \left(\int_0^1 X_1\, ds\right)^2\right]}}$$
31
Figure 5: Simulations of a bivariate distribution with correlation 0.5. Integrand has parameter d1, integrator has parameter d2. 1000 observations, 100,000 replications.
[Panels, each showing Type I and Type II densities: Stochastic Integral, d1 = 0, d2 = 0.4; Stochastic Integral, d1 = d2 = 0.4; Stochastic Integral (demeaned integrand), d1 = 0, d2 = 0.4; Stochastic Integral (demeaned integrand), d1 = d2 = 0.4]
Figure 6: Simulations of regression t-value. Processes as for Figure 5. 1000 observations, 100,000 replications.
[Panels, each showing Type I and Type II densities: Regression without intercept, d1 = 0, d2 = 0.4; Regression without intercept, d1 = d2 = 0.4; Regression with intercept, d1 = 0, d2 = 0.4; Regression with intercept, d1 = d2 = 0.4]
P ≤                                0.01    0.05    0.1     0.9    0.95   0.99
no intercept, d1 = 0      Type I   −1.711  −1.135  −0.879  0.977  1.297  1.810
                          Type II  −0.672  −0.322  −0.172  1.125  1.375  1.774
no intercept, d1 = 0.4    Type I   −1.913  −1.353  −1.033  1.206  1.526  2.086
                          Type II  −0.986  −0.643  −0.446  0.977  1.173  1.566
with intercept, d1 = 0    Type I   −0.868  −0.570  −0.437  0.523  0.689  0.954
                          Type II  −0.381  −0.175  −0.056  0.770  0.888  1.124
with intercept, d1 = 0.4  Type I   −0.885  −0.623  −0.460  0.487  0.650  0.912
                          Type II  −0.778  −0.550  −0.387  0.525  0.655  0.916

Quantiles of the cointegrating regression “t statistics”
34
Estimating Type I ARFIMA Processes
Compare the fractional noise model
$$(1-L)^d Y_t = u_t, \quad t = 1, \ldots, n \qquad (1)$$
where $\{u_t\}_{-\infty}^{\infty}$ is i.i.d.$(0, \sigma^2)$, with the feasible counterpart
$$(1-L)^d Y_t^* = u_t^*, \quad t = 1, \ldots, n \qquad (2)$$
where $u_t^* = u_t \mathbf{1}\{t \ge 1\}$, and $Y_t^*$ is defined by the equation.
In other words, if $a_j$ represents the coefficients in the expansion of $(1-L)^d$:
$$Y_1^* = u_1$$
$$Y_2^* = u_2 - a_1 Y_1^*$$
$$\vdots$$
$$Y_n^* = u_n - a_1 Y_{n-1}^* - \cdots - a_{n-1} Y_1^*$$
The asymptotics relevant to models (1) and (2) are those of type I and type II fractional Brownian motion, respectively.
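The feasible recursion (2) can be checked against the MA(∞) weights of $(1-L)^{-d}$; a pure-Python sketch with illustrative names:

```python
def ar_weights(d, N):
    # a_j: coefficients of (1-L)^d, via a_j = a_{j-1} (j - 1 - d) / j, a_0 = 1
    a = [1.0]
    for j in range(1, N):
        a.append(a[-1] * (j - 1 - d) / j)
    return a

def ma_weights(d, N):
    # b_j: coefficients of (1-L)^{-d}
    b = [1.0]
    for j in range(1, N):
        b.append(b[-1] * (j + d - 1) / j)
    return b

def type2_solution(u, d):
    # Y_t* = u_t - a_1 Y_{t-1}* - ... - a_{t-1} Y_1*, as in (2)
    a = ar_weights(d, len(u))
    Y = []
    for t in range(len(u)):
        Y.append(u[t] - sum(a[j] * Y[t - j] for j in range(1, t + 1)))
    return Y
```

Feeding in a unit impulse recovers the $b_j$ weights, reflecting the fact that the two truncated expansions invert each other as lower-triangular filters.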
35
Write
$$\pi_t(L; d) = \sum_{j=0}^{t-1} a_j L^j$$
to represent the truncation of the expansion of $(1-L)^d$ at the tth term, and note that
$$\pi_t(L; -d) = \pi_t(L; d)^{-1}.$$
With this notation, write the solution of (2) as
$$Y_t^* = (1-L)^{-d} u_t^* = \pi_t(L; -d) u_t.$$
However, the solution of (1) has the approximate form
$$Y_t = (1-L)^{-d} u_t \approx \pi_t(L; -d) u_t + v_t(d, \sigma)' z$$
where $v_t(d, \sigma)'$ is row t of the $n \times s$ matrix $V_n$, and z ($s \times 1$) is a standard normal vector.
36
Consider the approximate form of (1):
$$\pi_t(L; d) Y_t = \pi_t(L; d) v_t(d, \sigma)' z + u_t = v_t^*(d, \sigma)' z + u_t$$
The vectors $v_t^*(d, \sigma)$ can be computed given values for d and $\sigma$.
s is finite, so z can be treated as s additional unknown parameters.
Model (1) can be estimated by inserting a set of s regressors into the equation, and fitting their coefficients.
This is asymptotically equivalent to estimating d by fitting (1) as an infinite order autoregression.
37
Straightforward...
... to extend the same technique to estimating the ARFIMA(p,d,q) model, with the form
$$\Phi(L)(1-L)^d (Y_t - \mu) = \Theta(L) u_t, \quad t = 1 + \max(p,q), \ldots, n$$
where $\mu = E Y_t$.
The approximate model in this case is
$$\Phi(L)\pi_t(L; d)(Y_t - \mu) - v_t^*(d, \sigma|\Theta(1)|)' z = \Theta(L) u_t, \quad t = 1 + \max(p,q), \ldots, n$$
The variance of the presample shocks must be calculated as $\sigma^2 \Theta(1)^2$, so $v_t^*$ depends additionally on the moving average parameters.
38
Application: Nile minima series:

                    622–1284 AD                                                784–1284 AD
                    s = 0           s = 1           s = 2           MLE        s = 0           s = 1           s = 2           MLE
ARFIMA d            0.0316/0.4182   0.0315/0.4187   0.0310/0.4185   0.0299/0.3932   0.0383/0.4504   0.0377/0.4398   0.0315/0.4289   0.0336/0.4374
type I Frac., Z1    −               −0.516/−0.465   0.679/−0.908    −          −               0.672/−0.9841   0.554/−0.5301   −
type I Frac., Z2    −               −               −1.771/1.894    −          −               −               1.842/−3.485    −
Shock SD            2.946/70.547    3.004/70.665    3.075/70.865    69.90      3.757/66.981    3.891/66.958    3.825/66.542    65.37
Student t DF        0.245/2.345     0.239/2.314     0.234/2.273     −          0.214/2.1248    0.206/2.088     0.214/2.1248    −
Log-likelihood      −3738           −3737           −3737           −3757      −2786           −2783           −2782           −2806
Residual Q12        7.6426          7.524           7.027           −          5.250           5.686           5.897           −
39