Department of Economics
Working Paper
Block Bootstrap Prediction Intervals for Vector Autoregression
Jing Li
Miami University
2013
Working Paper # 2013-04
Block Bootstrap Prediction Intervals for Vector Autoregression
Jing Li∗
Miami University
Abstract
This paper attempts to answer the question of whether the principle of parsimony can apply to interval forecasting for a multivariate series. Toward that end, this paper proposes block bootstrap prediction intervals based on the parsimonious first-order vector autoregression. The new intervals generalize the standard bootstrap prediction intervals by allowing for serially correlated prediction errors. The unexplained serial correlation is accounted for by the block bootstrap, which resamples two-dimensional arrays of residuals. A Monte Carlo experiment shows that the new intervals outperform the standard bootstrap intervals in most cases.
Keywords: Forecast; Vector Autoregression; Block Bootstrap; Prediction Intervals;
Principle of Parsimony
∗Jing Li, Department of Economics, Miami University, Oxford, OH 45056, USA. Phone: 001.513.529.4393, Fax: 001.513.529.6992, Email: [email protected].
1. Introduction
Chatfield (1993) emphasizes that interval forecasting can convey the uncertainty associated with a forecast, and so is more informative than point forecasting. This paper is concerned with constructing individual prediction intervals for each component of a multivariate time series, based on the vector autoregression (VAR) developed by Sims (1980)1. We focus on bootstrap prediction intervals since they can automatically account for the sampling variability of coefficient estimators and for non-normal prediction errors.

1 There are other types of multivariate forecasting, such as the prediction ellipsoid studied in Kim (1999). This paper concentrates on individual prediction intervals in light of their popularity among practitioners.
For forecasting purposes the VAR has pros and cons. The VAR fully utilizes the across-variable correlation, which is ignored by univariate models. However, the number of unknown coefficients in a dynamically adequate VAR can be large, making the model not parsimonious. This is at odds with the principle of parsimony: it is well known that a parsimonious univariate model may produce superior out-of-sample point forecasts; see Enders (2009) for example. Recently, Li (2013) shows that a parsimonious univariate model may yield superior interval forecasts as well. The main objective of this paper is to investigate whether the principle of parsimony applies to interval forecasting for the components of a multivariate series.
Toward that end this paper proposes the block bootstrap intervals (BBI) using the first-order VAR, the most parsimonious VAR. Because the error terms are likely to be serially correlated, we need to implement the block bootstrap of Kunsch (1989) instead of the standard bootstrap of Efron (1979). More explicitly, the BBI is characterized by resampling two-dimensional arrays of residuals. The space dimension is intended to preserve the correlation between variables, while the time dimension captures the serial correlation.
The proposed BBI adds to the literature by generalizing Thombs and Schucany (1990) in two directions: from univariate to multivariate, and from assuming independent errors to relaxing that assumption. As a comparison, this paper also considers applying the
standard bootstrap to a dynamically complete p-th order VAR, where p is sufficiently large
so that the error becomes serially uncorrelated. This is necessary because the standard
bootstrap assumes independent errors. The resulting VAR with white noise errors can be
complicated. Our conjecture is that the BBI will have better performance since it is based
on the parsimonious model. Later we conduct a Monte Carlo experiment to check this
conjecture.
In addition, this paper considers improving the BBI by correcting the bias in the coefficient estimated by ordinary least squares (OLS). Early works such as Shaman and Stine (1988) and Kilian (1998) illustrate the importance of correcting the autoregressive bias. This paper extends those studies from the perspective of multivariate interval forecasting.
The literature on bootstrap prediction intervals is growing. An incomplete list includes Thombs and Schucany (1990), Grigoletto (1998), Kim (1999), Kim (2001), Clements and Taylor (2001), Kim (2002), Kim (2004) and Li (2011). This paper is distinguished from the existing literature by applying the block bootstrap to VAR forecasting. The remainder of the paper is organized as follows. Section 2 specifies the BBI. Section 3 conducts the Monte Carlo experiment. Section 4 discusses possible extensions of this study and concludes.
2. Bootstrap Prediction Intervals for VAR
Let yt denote an m × 1 vector: yt = (y1,t, . . . , ym,t)′, where yi,t, (i = 1, . . . ,m), (t = 1, . . .)
is a univariate series. The goal is to find the prediction intervals for each component of the
future vectors (yn+1, . . . , yn+h) based on the observed values (y1, . . . , yn). To focus on the
main issue we assume yt has zero mean and is weakly stationary: Eyt = 0, E(yi,tyj,t−k) =
σijk, (i, j = 1, . . . ,m), (k = 0, 1, . . .). In practice yt may represent the differenced, detrended
or demeaned series, depending on the order of integration and the deterministic terms2.
2The literature on forecasting non-stationary series is vast, see Clements and Hendry (2001) for example.
The VAR is a multivariate model that takes into account the across-variable correlation
σijk, which is ignored by univariate models. There are at least two ways to build bootstrap prediction intervals using the VAR. They differ in which bootstrap method is used, which in turn boils down to how the serial correlation is modeled: by a dynamically adequate model with white noise errors, or by a parsimonious model with serially correlated errors.
Standard Bootstrap Intervals
We start with the standard bootstrap intervals (BI for short), which are the multivariate generalization of the intervals proposed by Thombs and Schucany (1990). The BI is based on a
VAR(p) model
yt = α1yt−1 + α2yt−2 + . . .+ αpyt−p + et, (1)
where αi, (i = 1, . . . , p) is an m × m coefficient matrix, and et is an m × 1 vector of error
terms3. Unlike classical Box-Jenkins prediction intervals, the BI does not assume that et follows a normal distribution. However, because the standard bootstrap of Efron (1979) applies only in an independent setting, the BI requires the assumption that et in (1) is independent of et−j, (∀j ≥ 1).
The independence assumption is restrictive in the sense that model (1) must be dynamically adequate; that is, the lag order p should be sufficiently large so that no serial correlation is left in et. The detailed algorithm for constructing the BI is omitted here; it can be found in Thombs and Schucany (1990). For our purpose, it suffices to stress that the algorithm involves resampling with replacement the individual m × 1 residual vectors et after (1) is fitted by OLS, as the sketch below illustrates.
3 The intercept term is dropped in (1) because we assume yt has zero mean. Our simulation indicates that there is no qualitative change if the intercept term is added.
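For concreteness, here is a minimal sketch of this step (our Python/NumPy illustration, not the paper's code; the function names are hypothetical). It fits a VAR(p) by OLS and redraws whole m × 1 residual vectors with replacement, as the standard bootstrap requires:

```python
import numpy as np

def fit_var_ols(y, p):
    """OLS fit of a zero-mean VAR(p); y is (n, m). Returns the stacked
    coefficient matrix (m*p, m) and the (n - p, m) residuals."""
    n, m = y.shape
    # Regressors [y_{t-1}, ..., y_{t-p}] for each t = p, ..., n - 1
    X = np.hstack([y[p - i:n - i] for i in range(1, p + 1)])
    Y = y[p:]
    coef, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ coef
    return coef, resid

def iid_resample(resid, rng):
    """Efron's standard bootstrap: redraw individual m x 1 residual
    vectors (rows) with replacement; valid only if they are independent."""
    idx = rng.integers(0, resid.shape[0], size=resid.shape[0])
    return resid[idx]

# Example usage with placeholder data:
# rng = np.random.default_rng(0)
# coef, resid = fit_var_ols(y, p=2); e_star = iid_resample(resid, rng)
```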
Block Bootstrap Intervals
Basically the BI utilizes the serial correlation through (possibly many) lagged values of y. Notice that there are m²p autoregressive coefficients in (1) to estimate. If p is large, the resulting VAR can be very complicated. In fact, this must be the case when the series follow a vector moving average process, for which any chosen VAR(p) is a finite-order approximation to the underlying VAR(∞). The principle of parsimony is forgone if that happens.
With an eye toward the principle of parsimony, this paper proposes block bootstrap intervals (BBI for short) based on the simplest VAR(1) model
yt = β1yt−1 + vt. (2)
It is obvious that model (2) has no more coefficients to estimate than (1). Moreover, the
chance of multicollinearity is minimized by (2). The BBI is motivated by the intuition that
the simplicity of model (2) may lead to superior out-of-sample forecasts. In particular, the
improvement in the forecasting accuracy is expected to be substantial when the sample size
n is small and when serial correlation is strong.
One issue with model (2) is that the error term vt in general is serially correlated. To see this, suppose the true data generating process is (1) with the independent error et. Then by construction vt ≡ α2yt−2 + . . . + αpyt−p + et, and so vt is correlated with vt−1 and so on. The serial correlation in the new error term is not surprising: the VAR(1) model is somewhat semiparametric, and just one lagged value in general cannot capture all the dynamic structure.
Now we hope to take advantage of the simple model (2) while using the serial correlation in the error term to refine the forecast. The bootstrap method can be helpful in this context4. Nevertheless, Efron's standard bootstrap cannot be applied to the dependent series vt. Instead, we rely upon the multivariate version of the block bootstrap proposed by Kunsch (1989). Let β1 denote the coefficient estimated by OLS, and vt ≡ yt − β1yt−1 the residual vector5. The key idea of the block bootstrap is resampling with replacement random blocks of adjacent residual vectors. A typical block B(v)j from the residual vt looks like

B(v)_j = (v_j, v_{j+1}, \ldots, v_{j+b-1}) = \begin{pmatrix} v_{1,j} & \cdots & v_{1,j+b-1} \\ \vdots & \ddots & \vdots \\ v_{m,j} & \cdots & v_{m,j+b-1} \end{pmatrix}_{m \times b} \qquad (3)

where b is the width (size) of the block, and the index number j is a random draw from the discrete uniform distribution between 1 and n − b + 1.

4 The serial correlation in the error term may suggest methods such as generalized least squares (GLS). We do not consider GLS since that method intends to model the correlation structure in the error term. In other words, GLS is essentially based on a model as complicated as (1).

5 One may re-center and re-scale the residuals using the factor of Stine (1987). We do not use the rescaling factor since it is unclear how to modify the original one for the VAR.
Notice that B(v)j is a two-dimensional array. The vertical dimension, measured by m, can preserve the across-variable contemporaneous correlation σij0. The horizontal dimension, measured by b, can capture the temporal serial correlation σiik. When b = 1 the block bootstrap reduces to the standard bootstrap. The block bootstrap resamples blocks of residuals (vectors in this case), rather than individual residuals, so that the serial correlation in vt can be accounted for. A sketch of this block-drawing step is given below.
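For illustration, here is a minimal sketch of the block-drawing step in Python/NumPy (our code, not the paper's; the helper names are hypothetical):

```python
import numpy as np

def draw_block(resid, b, rng):
    """Draw one m x b block (stored here as b rows of an (n, m) array):
    b adjacent residual vectors starting at a uniformly drawn index j.
    The block keeps the contemporaneous correlation across the m
    components and up to b periods of serial correlation."""
    j = rng.integers(0, resid.shape[0] - b + 1)
    return resid[j:j + b]

def stack_blocks(resid, b, length, rng):
    """Stack random blocks until the resampled series has `length` rows;
    with b = 1 this collapses to the standard bootstrap."""
    rows = [draw_block(resid, b, rng) for _ in range(-(-length // b))]
    return np.vstack(rows)[:length]
```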
In the literature the block bootstrap has mostly been used for purposes other than forecasting; see Politis (2003). One exception is Li (2013), which applies the block bootstrap to univariate interval forecasting by letting m = 1 in (3). This study generalizes that one by allowing for m > 1.
In principle, the block size b should rise as the sample size rises. For a given sample size, there is a tradeoff when choosing the optimal block size. Increasing b captures more serial correlation; but meanwhile, the overlap between blocks increases, which produces less variability. More discussion of this tradeoff is given in Li (2013). We let b = 4 throughout this paper, as our preliminary simulation indicates that the results may not be sensitive to the block size6.

6 Alternatively, the stationary bootstrap of Politis and Romano (1994) can be used, which assumes b follows a geometric distribution. A similar question remains, since one must then choose the parameter of the geometric distribution.
Constructing the BBI takes five steps:
• In step one, model (2) is fitted by OLS, and the residual vt is saved.
• In step two, the backward representation of the VAR(1) model is fitted by OLS:
yt = ϕ1yt+1 + ut, (4)
where the regressor is the first lead, not lag. Denote the estimated coefficient and
residual by ϕ1 and ut. This backward regression is intended to ensure the conditionality
of the bootstrap replicate on the last observed value yn. See Figure 1 of Thombs and
Schucany (1990) for an illustration of the conditionality.
• In step three, use the backward residual ut to obtain the first random block B(u)j1 =
(uj1, uj1+1, . . . , uj1+b−1), the second block B(u)j2 = (uj2, uj2+1, . . . , uj2+b−1), and so on,
by redrawing the index number j1, j2, . . . with replacement from the discrete uniform
distribution between 1 and n − b + 1. Then we stack up these blocks until the length
of the stacked series becomes n, the size of the observed sample. Let u∗t denote the
t-th observation of the stacked series. Next, one bootstrap replicate series (y∗1, . . . , y∗n) is generated in a backward fashion as

y^*_n = y_n, \quad y^*_t = \phi_1 y^*_{t+1} + u^*_t, \quad (t = n-1, \ldots, 1) \qquad (5)
It is instructive to emphasize that we generate the last observation first, then move backward one period at a time. By doing this, all the bootstrap replicate series have the same last observation, which is yn. That is what conditionality on the last observed value means.

Now we use the bootstrap replicate series (y∗1, . . . , y∗n) to refit (2), and get the so-called bootstrap coefficient β∗1. Later this step will be repeated many times. Then the variability of β∗1 can be used to mimic the inherent variability of estimating β1. This is another advantage of the bootstrap intervals over the Box-Jenkins intervals, which fail to account for the sampling variability of the estimated coefficient.
• In step four, in a similar fashion, obtain the random blocks B(v)j1 = (vj1, . . . , vj1+b−1),
B(v)j2 = (vj2, . . . , vj2+b−1), . . . using vt, the residual of the forward VAR(1). But this
time we stack up these blocks until the length of the stacked series is h, the maximum
forecast horizon. Let v∗l denote the l-th observation of the stacked series. Then we
compute recursively the block bootstrap l-th step forecast y∗n+l as
y^*_n = y_n, \quad y^*_{n+l} = \beta^*_1 y^*_{n+l-1} + v^*_l, \quad (l = 1, \ldots, h) \qquad (6)
The above equation shows that the randomness of the forecast value comes from two
sources. The randomness due to estimation is captured by β∗1 , while the randomness
of the future shock is represented by v∗l .
• In step five, we repeat step three and step four D times. In the end, there is a series
of the block bootstrap l-th step forecasts for yi,t, (i = 1, . . . , m):

\{y^*_{i,n+l}(s)\}_{s=1}^{D}, \quad (l = 1, \ldots, h) \qquad (7)
where s is the index for bootstrapping. The l-th step BBI at the γ nominal level for the i-th component yi,t is given by

l\text{-step BBI} = \left[\, y^*_{i,n+l}\!\left(\tfrac{1-\gamma}{2}\right),\ y^*_{i,n+l}\!\left(\tfrac{1+\gamma}{2}\right) \right] \qquad (8)

where y^*_{i,n+l}\!\left(\tfrac{1-\gamma}{2}\right) and y^*_{i,n+l}\!\left(\tfrac{1+\gamma}{2}\right) are the \tfrac{1-\gamma}{2}\cdot 100-th and \tfrac{1+\gamma}{2}\cdot 100-th percentiles of the empirical distribution of \{y^*_{i,n+l}(s)\}_{s=1}^{D}. Throughout this paper we let γ = 0.90. We
use D = 999 to avoid the discreteness problem raised by Booth and Hall (1994).
Here the percentile method of Efron and Tibshirani (1993) is applied to construct the BBI.
Hall (1988) discusses other percentile methods. De Gooijer and Kumar (1992) emphasize
that the percentile method performs well when the conditional distribution of the predicted
values is unimodal. Our preliminary simulation conducts the DIP test of Hartigan and
Hartigan (1985) and finds that the distribution is indeed unimodal.
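Collecting the five steps, the following sketch outlines one way to compute the BBI end to end (hypothetical Python/NumPy code under the paper's setup with b = 4, D = 999 and the percentile method; it reuses the stack_blocks helper from the earlier sketch and is our illustration, not the author's implementation):

```python
import numpy as np

def ols_var1(x, y):
    """OLS of y_t on x_t; returns the m x m matrix B in y_t = B x_t + error."""
    return np.linalg.lstsq(x, y, rcond=None)[0].T

def bbi(y, h, b=4, D=999, gamma=0.90, seed=0):
    """Block bootstrap prediction intervals from a zero-mean VAR(1).
    y is (n, m); returns lower/upper bounds of shape (2, h, m)."""
    rng = np.random.default_rng(seed)
    n, m = y.shape
    # Step one: forward VAR(1) and its residuals v_t
    beta = ols_var1(y[:-1], y[1:])
    v = y[1:] - y[:-1] @ beta.T
    # Step two: backward VAR(1) (regress y_t on its first lead)
    phi = ols_var1(y[1:], y[:-1])
    u = y[:-1] - y[1:] @ phi.T
    fc = np.empty((D, h, m))
    for s in range(D):
        # Step three: backward replicate conditional on y_n, then refit
        u_star = stack_blocks(u, b, n - 1, rng)
        y_star = np.empty((n, m))
        y_star[-1] = y[-1]                # every replicate ends at y_n
        for t in range(n - 2, -1, -1):
            y_star[t] = phi @ y_star[t + 1] + u_star[t]
        beta_star = ols_var1(y_star[:-1], y_star[1:])
        # Step four: forecasts driven by block-resampled forward residuals
        v_star = stack_blocks(v, b, h, rng)
        prev = y[-1]
        for l in range(h):
            prev = beta_star @ prev + v_star[l]
            fc[s, l] = prev
    # Step five: percentile intervals, as in equation (8)
    lo, hi = 50 * (1 - gamma), 50 * (1 + gamma)
    return np.percentile(fc, [lo, hi], axis=0)
```

Under this design every replicate shares the observed last value yn, and the bounds in (8) are read off the D simulated forecasts for each component and horizon.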
Bias-Corrected Block Bootstrap Intervals
It is a well-known fact that the autoregressive coefficient estimated by OLS can be biased; see Shaman and Stine (1988) for example. This suggests that the proposed BBI may be improved by using the method of Efron and Tibshirani (1993) or Kilian (1998). In particular, equation (6) indicates that vt, the residual of the forward VAR(1), has a direct effect on the forecast. Therefore correcting the bias in β1 may be more important than correcting that in the backward coefficient ϕ1. This is confirmed by Li (2013) in the setting of univariate autoregressive forecasting.
It is straightforward to correct the bias in β1. We nest a new bootstrap inside the original one. Suppose there are C series of bootstrap replicates, which can be used to refit the forward regression (2). After obtaining a series of forward bootstrap coefficients \{\beta^*_1(s)\}_{s=1}^{C}, the bias-corrected forward coefficient and the bias-corrected forward residual are computed as

\beta^c_1 = 2\beta_1 - C^{-1}\sum_{s=1}^{C}\beta^*_1(s) \qquad (9)

v^c_t = y_t - \beta^c_1 y_{t-1}. \qquad (10)
Throughout this paper we let C = 100. Next, the bias-corrected block bootstrap l-th step
forecast y^{c*}_{n+l} is generated as

y^{c*}_n = y_n, \quad y^{c*}_{n+l} = \beta^*_1\, y^{c*}_{n+l-1} + v^{c*}_l, \quad (l = 1, \ldots, h) \qquad (11)
where v^{c*}_l is obtained by block bootstrapping v^c_t. The l-th step bias-corrected block bootstrap intervals (BCBBI) at the γ nominal level for the i-th component yi,t are given by

l\text{-step BCBBI} = \left[\, y^{c*}_{i,n+l}\!\left(\tfrac{1-\gamma}{2}\right),\ y^{c*}_{i,n+l}\!\left(\tfrac{1+\gamma}{2}\right) \right] \qquad (12)

where y^{c*}_{i,n+l}\!\left(\tfrac{1-\gamma}{2}\right) and y^{c*}_{i,n+l}\!\left(\tfrac{1+\gamma}{2}\right) are the \tfrac{1-\gamma}{2}\cdot 100-th and \tfrac{1+\gamma}{2}\cdot 100-th percentiles of the empirical distribution of \{y^{c*}_{i,n+l}(s)\}_{s=1}^{D}. The new intervals can be improved further, for instance, by correcting the bias in ϕ1. However, Li (2013) shows that the benefit of bias-correcting ϕ1 may be marginal.
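A compact sketch of this nested bias-correction step follows (again hypothetical Python/NumPy code, reusing ols_var1 and stack_blocks from the sketches above; we assume the C replicates are generated by the same backward scheme as in step three):

```python
import numpy as np

def bias_correct(y, beta, phi, u, b=4, C=100, seed=1):
    """Nested bootstrap bias correction of the forward VAR(1) coefficient.
    Assumes the C nested replicates use the backward scheme of step three."""
    rng = np.random.default_rng(seed)
    n, m = y.shape
    beta_stars = np.empty((C, m, m))
    for s in range(C):
        u_star = stack_blocks(u, b, n - 1, rng)
        y_star = np.empty((n, m))
        y_star[-1] = y[-1]
        for t in range(n - 2, -1, -1):
            y_star[t] = phi @ y_star[t + 1] + u_star[t]
        beta_stars[s] = ols_var1(y_star[:-1], y_star[1:])
    beta_c = 2 * beta - beta_stars.mean(axis=0)   # equation (9)
    v_c = y[1:] - y[:-1] @ beta_c.T               # equation (10)
    return beta_c, v_c
```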
3. Monte Carlo Experiment
This section compares the performance of the BI and BBI using a Monte Carlo experiment.
The criterion of comparison is the average coverage rate (ACR) of the prediction intervals
for each component of a multivariate series:
k^{-1}\sum_{1}^{k} 1(y_{i,n+l} \in PI), \quad (i = 1, \ldots, m;\ l = 1, \ldots, h) \qquad (13)

where k = 10000 is the number of iterations, 1(·) is the indicator function that equals one when the event in the parentheses is true, and PI denotes the prediction interval. yi,n+l is
the i-th component of yn+l = (y1,n+l, . . . , ym,n+l)′. The maximum forecast horizon h is 5. No
qualitative change is found for larger h in preliminary simulation.
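As a minimal illustration of criterion (13), the coverage indicators can be accumulated across Monte Carlo iterations as follows (hypothetical Python/NumPy code; `intervals` has the (2, h, m) shape returned by the bbi sketch above):

```python
import numpy as np

def update_coverage(hits, intervals, y_future):
    """Add one iteration's coverage indicators. hits is an (h, m) running
    count, y_future the (h, m) realized values, intervals the (2, h, m)
    lower/upper bounds. The ACR is hits / k after k iterations."""
    lower, upper = intervals
    hits += (lower <= y_future) & (y_future <= upper)
    return hits
```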
The true data generating process (DGP) is a second-order VAR for a bivariate series yt = (y1,t, y2,t)′:
y1,t = a1y1,t−1 + a2y2,t−1 + a3y1,t−2 + e1,t (14)
y2,t = b1y2,t−1 + b2y2,t−2 + e2,t, (15)
where the coefficients ai and bi are all scalars. The error vector et = (e1,t, e2,t)′ is an independent series generated as

e_t = A u_t, \quad u_t \sim \text{i.i.d.}(0, 1), \quad A = \begin{pmatrix} \sqrt{1-\rho^2} & \rho \\ 0 & 1 \end{pmatrix}. \qquad (16)

It is easy to show the variance-covariance matrix of et is

\Omega \equiv E(e_t e_t') = A A' = \begin{pmatrix} 1 & \rho \\ \rho & 1 \end{pmatrix}. \qquad (17)
Notice that y2,t−1 appears on the right-hand side of (14), but y1,t−1 is absent in (15). That means y2,t first-order Granger causes y1,t, but not vice versa. Using a DGP of this triangular form entails no loss of generality, because we can always transform a non-triangular form into a triangular form by multiplying by an invertible matrix. One advantage of using the triangular form is the clear interpretation of the parameters: b1 and b2 control the stationarity of y2,t; a1 and a3 control the stationarity of y1,t; the across-variable correlation is measured by a2 and ρ. For instance, y2,t is stationary when b1 = 1.4, b2 = −0.48, since the corresponding characteristic roots are λ1 = 0.6 and λ2 = 0.8, both less than one in absolute value. The positive definiteness of Ω requires that |ρ| < 1.
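For concreteness, the DGP (14)-(16) can be simulated as in the following sketch (hypothetical Python/NumPy code; the normal-error case is shown):

```python
import numpy as np

def simulate_dgp(n, h, a, b, rho, rng):
    """Generate n + h observations from (14)-(15) with errors (16).
    a = (a1, a2, a3), b = (b1, b2); the presample values start at zero,
    the unconditional mean, as in the paper's footnote."""
    A = np.array([[np.sqrt(1 - rho**2), rho],
                  [0.0, 1.0]])
    T = n + h
    y = np.zeros((T + 2, 2))                 # two presample zeros
    for t in range(2, T + 2):
        e = A @ rng.standard_normal(2)       # equation (16)
        y1 = a[0]*y[t-1, 0] + a[1]*y[t-1, 1] + a[2]*y[t-2, 0] + e[0]
        y2 = b[0]*y[t-1, 1] + b[1]*y[t-2, 1] + e[1]
        y[t] = (y1, y2)
    return y[2:]                             # (n + h, 2)

rng = np.random.default_rng(42)
y = simulate_dgp(n=50, h=5, a=(1.2, 0.6, -0.35), b=(1.4, -0.48),
                 rho=0.4, rng=rng)
```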
Three bootstrap prediction intervals are considered. The proposed block bootstrap intervals (BBI) (8) are obtained by applying the block bootstrap to the residuals of the parsimonious VAR(1) model. One set of standard bootstrap intervals (denoted by AR1BI) is obtained by applying the standard bootstrap to the residuals of the VAR(1)7; the other (denoted by AR2BI) is obtained by applying the standard bootstrap to the residuals of the VAR(2). The AR2BI is theoretically correct, but not parsimonious. The AR1BI is parsimonious, but fails to account for the serial correlation in the error term. The BBI is parsimonious and fully utilizes the serial correlation.

7 In practice one may check the adequacy of the model (no serial correlation in the error term) by applying a Breusch-Godfrey type test to the residuals. For simplicity we skip those tests in this section and proceed as if the VAR(1) or VAR(2) is adequate.
In total we generate n + h observations8 of yt. The first n observations are used for the
in-sample fitting and constructing the prediction intervals. Then the average coverage rate
(13) is computed for the last h pseudo out-of-sample observations. That is, we evaluate
whether the last h observations are inside the prediction intervals. The best method is the
one that yields the prediction intervals with the average coverage rate closest to the nominal
level γ = 0.90.
Error Distributions
In theory the bootstrap intervals should be robust to non-normal distributions of the prediction errors. Following Thombs and Schucany (1990) we examine three distributions for ut: the bivariate standard normal distribution; the bivariate exponential distribution with mean of two, which is skewed; and the bivariate mixed normal distribution 0.9N(−1, 1) + 0.1N(9, 1), which is bimodal and skewed. All distributions are standardized to have zero mean and unit variance. Then the prediction error et is generated as in (16), where A is the Cholesky decomposition of the variance-covariance matrix Ω.

8 The initial value y1 is set to zero, the unconditional mean. The pseudo random number generator of Matlab R2011a is used.
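The standardization step can be sketched as follows (hypothetical Python/NumPy code; each draw is shifted and scaled to mean zero and unit variance before A is applied):

```python
import numpy as np

def draw_u(dist, size, rng):
    """Draw standardized innovations u_t with zero mean, unit variance."""
    if dist == "normal":
        return rng.standard_normal(size)
    if dist == "exponential":
        u = rng.exponential(scale=2.0, size=size)    # mean 2, sd 2
        return (u - 2.0) / 2.0
    if dist == "mixed":
        # 0.9 N(-1,1) + 0.1 N(9,1): mean 0, variance 0.9*(1+1) + 0.1*(81+1) = 10
        means = np.where(rng.random(size) < 0.9, -1.0, 9.0)
        return (means + rng.standard_normal(size)) / np.sqrt(10.0)
    raise ValueError(dist)
```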
The parameters are set as b1 = 1.4, b2 = −0.48, a1 = 1.2, a2 = 0.6, a3 = −0.35, ρ = 0.4,
and the sample size is n = 50. By construction both y1,t and y2,t are stationary. Figure 1
plots the ACR against the forecast horizon, as the error distribution varies. The three panels
in the first row show the ACR for the first component y1,n+l; the second row shows the ACR
for the second component y2,n+l.
There are several findings from Figure 1. First of all, the BBI (denoted by circles) in most cases has the best performance, with the ACR closest to 0.90. The superiority of the BBI becomes more evident as the forecast horizon rises. In particular, the BBI is shown to outperform the AR2BI (denoted by squares). This fact may serve as evidence supporting the principle of parsimony.
The performance of the AR1BI (denoted by diamonds) is interesting. It has the worst performance (lowest coverage rate) for long-run forecasts, while having the seemingly best performance (highest coverage rate) for the first-step forecast. This finding can be explained by two facts. First, the AR1BI uses the standard bootstrap, which fails to account for the serial correlation in the error term but can add more variability than the block bootstrap. Second, the serial correlation in the prediction error does not matter as much in the short run as in the long run. The performance of the AR1BI also highlights the tradeoff between preserving serial correlation and adding variability. For the first-step forecast, adding variability outweighs keeping serial correlation, so the AR1BI has a first-step coverage rate greater than that of the BBI.
We find that the coverage rates of all three intervals remain largely unchanged as the error distribution varies. This verifies the robustness of the bootstrap intervals to non-normality. Finally, the ACRs of all three intervals decrease as the horizon rises, a finding consistent with Thombs and Schucany (1990).
Across-Variable Correlation: ρ and a2
Next we investigate the effect of varying the across-variable correlation determined by ρ and a2. The same values of b1, b2, a1, a3 and n as in Figure 1 are used. The error follows the bivariate normal distribution, but now we let ρ = −0.4, 0.4, 0.8 and a2 = −0.4, 0.4, 0.8. The results are shown in Figure 2 (where ρ varies and a2 = 0.4) and Figure 3 (where a2 varies and ρ = 0.4), respectively.
We find that varying ρ and a2 has minimal effect on the coverage rate for y2,n+l. This is expected given the triangular form of the DGP, or the fact that y1,t does not Granger cause y2,t. Varying ρ and a2 does affect the coverage rate for y1,n+l, but the BBI still dominates in most cases.
Persistence: b1, b2, a1, a3
The persistence of the series (or the speed at which the autocorrelation decays) is determined
by the parameters b1, b2, a1 and a3. Their effect on the ACR is illustrated by Figures 4, 5 and
6. We let a2 = 0.6, ρ = 0.4, n = 50 and the error follows the bivariate normal distribution.
Table 1 summarizes the values of b1, b2, a1, a3, and the corresponding characteristic roots.
Figure 4 is concerned with real characteristic roots. For example, when b1 = 1.2, b2 = −0.35, the characteristic roots for y2,t are 0.7 and 0.5; when a1 = 0.9, a3 = −0.2, the characteristic roots for y1,t are 0.5 and 0.4. So in this case (DGP1) both series are stationary. y2,t becomes more persistent (and its autocorrelation decays more slowly) when DGP1 changes to DGP2; y1,t becomes more persistent when DGP2 changes to DGP3. Figure 4 shows that the superiority of the BBI is insensitive to these changes in persistence.
The BBI maintains its superiority in Figure 5, where some or all characteristic roots are complex conjugates. In DGP4-6 the series are still stationary because the moduli of the characteristic roots are less than one, but now the autocorrelation functions have a sinusoidal pattern.
Figure 6 relaxes the assumption of stationarity. In DGP7 and DGP8, y2,t has one unit root, and so becomes nonstationary. Moreover, y1,t and y2,t are cointegrated, in the terminology of Engle and Granger (1987). In DGP9, y1,t is nonstationary but y2,t is stationary. As shown by Figure 6, despite the nonstationarity and cointegration, the BBI still delivers the best performance in most cases.
Sample Size: n
Figure 7 demonstrates the effect of the sample size n on the coverage rate. The error follows the bivariate normal distribution, and the parameter values are the same as in Figure 1. As the sample size rises, we find that the coverage rates of all three intervals improve, moving upward toward the nominal level 0.90. Furthermore, the improvement of the BBI seems to be the most evident. When n = 150, the ACR of the BBI becomes almost flat at the level of 0.88.
Bias Correction
Figure 7 implies that the small sample size is one reason why the prediction intervals tend to undercover the true future values. Another reason is the bias in the autoregressive coefficient estimated by OLS, as shown by Figure 8. In that figure the coverage rate of the BBI is compared to that of the bias-corrected block bootstrap intervals (BCBBI, denoted by squares), using the same DGP as in Figure 1. It is clear that correcting the autoregressive bias leads to an increase in the coverage rate.
4. Conclusion
This paper proposes the block bootstrap prediction intervals (BBI) for each component of a multivariate time series based on the parsimonious first-order vector autoregression. One characteristic of the BBI is resampling “big” blocks, or two-dimensional arrays, of residuals. Those big blocks aim to preserve both the across-variable correlation and the serial correlation. By contrast, the standard bootstrap intervals require independent prediction errors, which amounts to a possibly complicated model, and involve redrawing “small” blocks, or one-dimensional arrays, of residuals.
The Monte Carlo experiment indicates that the principle of parsimony can be extended to multivariate interval forecasting. The BBI is shown to outperform the standard bootstrap intervals in most cases. Remarkably, the BBI always dominates for long-run forecasts. The performance of the BBI can be enhanced by larger sample sizes and by correcting the bias of the estimated autoregressive coefficient.
Some future studies are noteworthy. One possibility is to follow Kim (1999) and develop a block bootstrap prediction region (ellipsoid) based on the parsimonious VAR. The prediction region can provide a joint forecast for a multivariate series, but may incur higher computational cost. Another is to consider an improved BBI for cointegrated systems that explicitly factors in the non-stationarity and the error-correction mechanism.
References
Booth, J. G. and Hall, P. (1994). Monte Carlo approximation and the iterated bootstrap.
Biometrika, 81, 331–340.
Chatfield, C. (1993). Calculating interval forecasts. Journal of Business & Economic Statis-
tics, 11, 121–135.
Clements, M. P. and Hendry, D. F. (2001). Forecasting Non-Stationary Economic Time
Series. The MIT Press.
Clements, M. P. and Taylor, N. (2001). Bootstrapping prediction intervals for autoregressive
models. International Journal of Forecasting, 17, 247–267.
De Gooijer, J. G. and Kumar, K. (1992). Some recent developments in non-linear time series modeling, testing, and forecasting. International Journal of Forecasting, 135–156.
Efron, B. (1979). Bootstrap methods: Another look at the jackknife. Annals of Statistics, 7,
1–26.
Efron, B. and Tibshirani, R. J. (1993). An Introduction to the Bootstrap. London: Chapman
and Hall.
Enders, W. (2009). Applied Econometric Time Series. Wiley, 3rd edition.
Engle, R. F. and Granger, C. W. J. (1987). Cointegration and error correction: Representa-
tion, estimation and testing. Econometrica, 55, 251–276.
Grigoletto, M. (1998). Bootstrap prediction intervals for autoregressions: some alternatives.
International Journal of Forecasting, 14, 447–456.
Hall, P. (1988). Theoretical comparison of bootstrap confidence intervals. Annals of Statis-
tics, 16, 927–953.
Hartigan, J. A. and Hartigan, P. M. (1985). The DIP test of unimodality. Annals of Statistics,
13, 70–84.
Kilian, L. (1998). Small sample confidence intervals for impulse response functions. The
Review of Economics and Statistics, 80, 218–230.
Kim, J. (1999). Asymptotic and bootstrap prediction regions for vector autoregression.
International Journal of Forecasting, 15, 393–403.
Kim, J. (2001). Bootstrap-after-bootstrap prediction intervals for autoregressive models.
Journal of Business & Economic Statistics, 19, 117–128.
Kim, J. (2002). Bootstrap prediction intervals for autoregressive models of unknown or
infinite lag order. Journal of Forecasting, 21, 265–280.
Kim, J. (2004). Bias-corrected bootstrap prediction regions for vector autoregression. Journal
of Forecasting, 23, 141–154.
Kunsch, H. R. (1989). The jackknife and the bootstrap for general stationary observations.
Annals of Statistics, 17, 1217–1241.
Li, J. (2011). Bootstrap prediction intervals for SETAR models. International Journal of
Forecasting, 27, 320–332.
Li, J. (2013). Block bootstrap prediction intervals for autoregression. Working Paper.
Politis, D. N. (2003). The impact of bootstrap methods on time series analysis. Statistical
Science, 18, 219–230.
Politis, D. N. and Romano, J. P. (1994). The stationary bootstrap. Journal of the American
Statistical Association, 89, 1303–1313.
Shaman, P. and Stine, R. A. (1988). The bias of autoregressive coefficient estimators. Journal
of the American Statistical Association, 83, 842–848.
Sims, C. (1980). Macroeconomics and reality. Econometrica, 48, 1–48.
Stine, R. A. (1987). Estimating properties of autoregressive forecasts. Journal of the Amer-
ican Statistical Association, 82, 1072–1078.
Thombs, L. A. and Schucany, W. R. (1990). Bootstrap prediction intervals for autoregression.
Journal of the American Statistical Association, 85, 486–492.
Table 1: Parameter Values and Characteristic Roots in Figures 4, 5, 6

            Parameters                 y2,t roots               y1,t roots
        b1     b2     a1     a3      λ1        λ2           λ1        λ2
DGP1   1.2   -0.35   0.9   -0.2     0.5       0.7          0.5       0.4
DGP2   1.4   -0.48   0.9   -0.2     0.6       0.8          0.5       0.4
DGP3   1.4   -0.48   1.3   -0.42    0.6       0.8          0.7       0.6
DGP4   1.2   -0.4    1.3   -0.42    0.6+0.2i  0.6-0.2i     0.7       0.6
DGP5   1.4   -0.48   1.2   -0.4     0.6       0.8          0.6+0.2i  0.6-0.2i
DGP6   1.2   -0.4    1.4   -0.74    0.6+0.2i  0.6-0.2i     0.7+0.5i  0.7-0.5i
DGP7   1.4   -0.4    0.9   -0.2     1         0.4          0.5       0.4
DGP8   1.4   -0.4    1.3   -0.42    1         0.4          0.7       0.6
DGP9   1.4   -0.48   1.4   -0.4     0.8       0.6          1         0.4

Note: the data generating process is (14) and (15) with a2 = 0.6, ρ = 0.4, n = 50, and ut ∼ i.i.d. N(0, 1).
Figure 1: Error Distributions
[Figure: six panels plotting coverage against forecast horizon for the BBI, AR2BI and AR1BI; the top row shows Y1 coverage and the bottom row Y2 coverage, under the normal, exponential and mixed normal error distributions.]
Note: the data generating process is (14) and (15) with b1 = 1.4, b2 = −0.48, a1 = 1.2, a2 = 0.6, a3 = −0.35, ρ = 0.4, n = 50.
Figure 2: Across-Variable Correlation: ρ
[Figure: six panels plotting Y1 and Y2 coverage against forecast horizon for the BBI, AR2BI and AR1BI, with ρ = −0.4, 0.4, 0.8.]
Note: the data generating process is (14) and (15) with b1 = 1.4, b2 = −0.48, a1 = 1.2, a2 = 0.6, a3 = −0.35, n = 50, and ut ∼ i.i.d. N(0, 1).
Figure 3: Across-Variable Correlation: a2
[Figure: six panels plotting Y1 and Y2 coverage against forecast horizon for the BBI, AR2BI and AR1BI, with a2 = −0.4, 0.4, 0.8.]
Note: the data generating process is (14) and (15) with b1 = 1.4, b2 = −0.48, a1 = 1.2, a3 = −0.35, ρ = 0.4, n = 50, and ut ∼ i.i.d. N(0, 1).
Figure 4: Serial Correlation: Real Characteristic Roots
[Figure: six panels plotting Y1 and Y2 coverage against forecast horizon for the BBI, AR2BI and AR1BI under DGP1, DGP2 and DGP3.]
Note: the data generating process is in Table 1.
Figure 5: Serial Correlation: Complex Characteristic Roots
[Figure: six panels plotting Y1 and Y2 coverage against forecast horizon for the BBI, AR2BI and AR1BI under DGP4, DGP5 and DGP6.]
Note: the data generating process is in Table 1.
Figure 6: Serial Correlation: Nonstationarity and Cointegration
[Figure: six panels plotting Y1 and Y2 coverage against forecast horizon for the BBI, AR2BI and AR1BI under DGP7, DGP8 and DGP9.]
Note: the data generating process is in Table 1.
Figure 7: Sample Size
[Figure: six panels plotting Y1 and Y2 coverage against forecast horizon for the BBI, AR2BI and AR1BI, with n = 50, 100, 150.]
Note: the data generating process is (14) and (15) with b1 = 1.4, b2 = −0.48, a1 = 1.2, a2 = 0.6, a3 = −0.35, ρ = 0.4 and ut ∼ i.i.d. N(0, 1).
Figure 8: Bias Correction
[Figure: six panels plotting Y1 and Y2 coverage against forecast horizon for the BBI and BCBBI under the normal, exponential and mixed normal error distributions.]
Note: the data generating process is (14) and (15) with b1 = 1.4, b2 = −0.48, a1 = 1.2, a2 = 0.6, a3 = −0.35, ρ = 0.4, n = 50.