TI 2014-107/III Tinbergen Institute Discussion Paper
Spillover Dynamics for Systemic Risk Measurement using Spatial Financial Time Series Models Francisco Blasques Siem Jan Koopman Andre Lucas Julia Schaumburg
Faculty of Economics and Business Administration, VU University Amsterdam, and Tinbergen Institute.
Spillover Dynamics for Systemic Risk Measurement Using Spatial Financial Time Series Models¹
Francisco Blasques(a), Siem Jan Koopman(a,b), Andre Lucas(a), Julia Schaumburg(a)
(a) VU University Amsterdam and Tinbergen Institute, The Netherlands
(b) CREATES, Aarhus University, Denmark
August 2014
Abstract
We introduce a new model for time-varying spatial dependence. The model extends the well-known static spatial lag model. All parameters can be estimated conveniently by maximum likelihood. We establish the theoretical properties of the model and show that the maximum likelihood estimator for the static parameters is consistent and asymptotically normal. We also study the information theoretic optimality of the updating steps for the time-varying spatial dependence parameter. We adopt the model to empirically investigate the spatial dependence between eight European sovereign CDS spreads over the period 2009–2014, which includes the European sovereign debt crisis. We construct our spatial weight matrix using cross-border lending data and include country-specific and Europe-wide risk factors as controls. We find a high, time-varying degree of spatial spillovers in the sovereign CDS spread data. There is a downturn in spatial dependence after the first half of 2012, which is consistent with policy measures taken by the European Central Bank. The findings are robust to a wide range of alternative model specifications.
Keywords: Spatial correlation, time-varying parameters, systemic risk, European debt crisis, generalized autoregressive score.
¹ We thank conference participants at the 2nd workshop on “Models driven by the score of predictive likelihoods” in Tenerife 2014, the 7th Annual SoFiE Conference in Toronto 2014, the International Association of Applied Econometrics conference in London 2014, the SYRTO workshop at the Bundesbank 2014 and seminar participants at VU University Amsterdam for helpful comments. Lucas and Blasques thank the Dutch Science Foundation (NWO, grant VICI453-09-005) for financial support. Koopman, Lucas, and Schaumburg thank the European Union Seventh Framework Programme (FP7-SSH/2007-2013, grant agreement 320270 - SYRTO) for financial support. Koopman acknowledges support from CREATES, Center for Research in Econometric Analysis of Time Series (DNRF78) at Aarhus University, Denmark, and funded by the Danish National Research Foundation. Research support by the Deutsche Forschungsgemeinschaft through the SFB 649 “Economic Risk” is gratefully acknowledged.
1 Introduction
We propose a new parsimonious model to measure the time-varying cross-sectional dependence
in European sovereign credit spreads. The model builds on the well-known spatial lag model for
panel data. The strength of contemporaneous spillover effects is summarized in a single time-
varying parameter: the spatial dependence parameter. We argue that this parameter may be inter-
preted as a measure of sovereign systemic risk.
It is important to model sovereign default risk in the Euro area using a joint model. Financial
markets in the Euro area are largely integrated, and the financial sectors in the separate member
countries are heavily engaged in cross-border borrowing and lending. The cross-border entanglement of financial firms and sovereigns intensified further during the financial crisis; see, for example, Acharya et al. (2013) and Dieckmann and Plank (2012). Our model accounts for the
possibility that shocks affecting the credit quality of one Eurozone member are likely to be prop-
agated to the other members via the linkages of their financial sectors. Possible feedback loops
that amplify systemic risk are incorporated in this way as well. The transmission channels in our
model are defined explicitly as economic distances in a spatial weights matrix constructed from
cross-border debt data.
The model is adopted to empirically study a time series sample of eight Eurozone sovereign
credit default swaps (CDS) over the period 2009–2014. We find strong evidence for time-varying
spatial dependence. The dependence parameter displays clear but transitory troughs
around the Long Term Refinancing Operations (LTROs) by the European Central Bank at the
end of 2011 and start of 2012, but effectively remains at high levels up to the second
half of 2012. It is only after the announcements and implementation of the Outright Monetary
Transactions (OMT), and the implementation of the European Stability Mechanism (ESM) that
systemic risk settles more permanently at a spatial correlation level that is roughly 30% to 40%
lower than during the crisis. The empirical implications of the model are robust to a range of
model extensions and alternative specifications, including common unobserved factors in the
levels of CDS changes, alternative distributional assumptions, alternative spatial weight matrices,
and time varying volatilities.
Our paper contributes to two strands of literature. First, we contribute to the applied spatial
econometrics literature. Spatial models are widely used in applied geographic and regional sci-
ence studies, and have recently also been applied in empirical finance; see Fernandez (2011) for a
CAPM model augmented by spatial dependencies, Wied (2013), Arnold et al. (2013), and Asgharian et al. (2013) for analyses of spatial dependencies in stock markets, Denbee et al. (2013) for
a network approach to assess interbank liquidity, and Saldias (2013) for a spatial error model to
identify sector risk determinants. The study closest to ours is Keiler and Eder (2013). They model
CDS spreads of financial institutions in a static spatial lag model, additionally accounting for firm-
specific covariates and market risk factors. Their spatial weight matrix is constructed from stock
return correlations.
All of the above models, however, treat the spatial dependence parameter as static. To the best
of our knowledge, explicitly endowing the spatial dependence parameter in the spatial lag model
with time series dynamics is new to the literature. We model the dynamics using the generalized
autoregressive score framework proposed by Creal et al. (2011, 2013) and Harvey (2013). Given
the nonlinear impact of the time-varying parameter on the model, the theoretical properties of
this model and the asymptotic properties of the maximum likelihood estimator (MLE) for the
remaining static parameters are challenging and have not been established so far. We show under
what conditions the filtered spatial dependence parameters are well behaved, such that the model
is invertible. Invertibility is a key property for establishing consistency and asymptotic normality
of the MLE; see for example Wintenberger (2013). We derive new conditions for the asymptotic
properties of the MLE compared to Blasques et al. (2014b), allowing for exogenous regressors to
be part of the specification. We also discuss the information theoretic optimality of the model and
illustrate in a simulation study that the model is able to track a range of different patterns for the
time-varying spatial dependence parameter.
Second, we contribute to the literature that studies the dynamics of financial systemic risk in
the context of a network of sovereigns or financial firms. Since the beginning of the European
sovereign debt crisis in 2009, the sharp increases and comovements of sovereign credit spreads
have been the subject of a growing number of empirical studies. For instance, by employing
an asset pricing model, Ang and Longstaff (2013) investigate the differences between U.S. and
European credit default swap (CDS) spreads as a reflection of systemic risk. Lucas et al. (2014)
and Kalbaska and Gatkowski (2012) use multivariate time series models to model comovements
in European sovereign CDS spreads. De Santis (2012) and Arezki et al. (2011) study credit risk
spillover effects that are induced by rating events, such as downgrades of Greek government bonds.
Leschinski and Bertram (2013) find contagion effects in European sovereign bond spreads using
the simultaneous equations approach of Pesaran and Pick (2007). Caporin et al. (2013), on the
other hand, employ Bayesian quantile regressions, and conclude that comovements in European
credit spreads during the debt crisis are only due to increased volatilities, but not contagion.
Our approach differs from the studies above since we introduce cross-sectional correlation not
only through contemporaneous error correlations, but also through spillovers induced by shocks to
the regressors, such as stock market crashes or interbank lending rates. Furthermore, we explicitly
offer financial sector linkages as the source of sovereign credit risk comovements. This view is
supported by the results of Korte and Steffen (2013), Kallestrup et al. (2013), Gorea and Radev
(2013), and Beetsma et al. (2012), in which cross-border exposures between international financial
sectors are relevant drivers of sovereign credit spreads. By exploiting these debt interconnections
as economic distances between sovereigns in our spatial model, we obtain a scalar time-varying
(spatial) dependence coefficient. We interpret this parameter in the systemic context as the overall
tendency for shock spillovers. As such, it provides a measure of systemic risk that is easy to
monitor over time.
The remainder of the paper is organized as follows. Section 2 introduces our spatial model
with time-varying parameters. We examine its theoretical properties in Section 3. In Section 4, we
provide Monte Carlo evidence of the model’s ability to track different dynamic patterns in spatial
dependence over time. We provide our study of European sovereign CDS spread dynamics in
Section 5. Section 6 concludes.
2 Spatial models with dynamic spatial dependence
2.1 Static spatial lag model for panel data
The spatial lag model for panel data is given by
$$ y_t = \rho W y_t + X_t \beta + e_t, \qquad e_t \sim p_e(e_t, \Sigma; \lambda), \qquad t = 1, \ldots, T, \tag{1} $$
where $y_t$ denotes a vector of $n$ cross-sectional observations at time $t$, $\rho$ is the spatial dependence coefficient, $W$ is an $n \times n$ matrix of exogenous spatial weights, $X_t$ is an $n \times k$ matrix of exogenous regressors, $\beta$ is a $k \times 1$ vector of unknown coefficients, including an intercept, and the $n \times 1$ vector $e_t$ is the disturbance vector with multivariate density $p_e(e_t, \Sigma; \lambda)$ that has mean zero, an unknown $n \times n$ variance (or scale) matrix $\Sigma$ and possibly other parameters collected in the vector $\lambda$. For example, $p_e$ may represent the Student's t distribution, where $\lambda$ is then the degrees of freedom parameter. Model (1) implies that each entry $y_{it}$, for $i = 1, \ldots, n$, of the vector $y_t$ depends on $k$ individual-specific regressors $x_{it}$ as well as on possibly other entries $y_{jt}$ for $j \neq i$. For a
moderately large n, we cannot estimate such a system of contemporaneous dependencies without
imposing further restrictions. The idea of spatial dependence modelling is to specify the spatial weight matrix $W$ as a function of geographic or economic distances, and in this way exogenously define a neighborhood structure between the cross-sectional units. It is a standard option to row-normalize $W$ such that $\sum_{j=1}^{n} w_{ij} = 1$ for $i = 1, \ldots, n$, where $w_{ij}$ is the $(i,j)$th element of $W$. The impact of the (spatially weighted) contemporaneous dependent variables $W y_t$ on $y_t$ is captured by the scalar spatial dependence parameter $\rho$. For shocks to die out over space, we require $\rho \in (1/\omega_{\min}, 1)$, where $\omega_{\min}$ is the smallest eigenvalue of $W$; see Lee (2004).
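The row-normalization and the admissible range for ρ can be illustrated numerically. The sketch below is not part of the empirical analysis: the 3 × 3 exposure matrix `raw` is hypothetical, and the helper names `row_normalize` and `rho_bounds` are our own.

```python
import numpy as np

def row_normalize(raw):
    """Scale each row of a nonnegative weight matrix so that it sums to one."""
    raw = np.asarray(raw, dtype=float)
    row_sums = raw.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0):
        raise ValueError("every unit needs at least one neighbor")
    return raw / row_sums

def rho_bounds(W):
    """Return (1/omega_min, 1/omega_max) based on the real parts of the eigenvalues of W."""
    eig = np.linalg.eigvals(W).real
    return 1.0 / eig.min(), 1.0 / eig.max()

# Hypothetical symmetric exposures between three units (zero diagonal: no self-links).
raw = np.array([[0.0, 2.0, 1.0],
                [2.0, 0.0, 3.0],
                [1.0, 3.0, 0.0]])
W = row_normalize(raw)
lower, upper = rho_bounds(W)
print("row sums:", W.sum(axis=1))
print("admissible range for rho: (%.3f, %.3f)" % (lower, upper))
```

For a row-normalized matrix the largest eigenvalue equals one, so the upper end of the range is one, while the lower end depends on the smallest eigenvalue of the particular W.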
We show that the basic form of the spatial lag model (1) can capture nonlinear feedback effects
across units by rewriting the model as
$$ y_t = Z X_t \beta + Z e_t, \tag{2} $$
where we assume that the inverse matrix $Z = (I_n - \rho W)^{-1}$ exists, with $I_n$ the $n \times n$ identity matrix. Using the infinite power series expansion as in LeSage and Pace (2008), we obtain
$$ y_t = X_t \beta + \rho W X_t \beta + \rho^2 W^2 X_t \beta + \cdots + e_t + \rho W e_t + \rho^2 W^2 e_t + \cdots . \tag{3} $$
Equation (3) reveals that shocks $e_{it}$ or $x_{it}'\beta$ to unit $i$ spill over to other units $j \neq i$ to an extent
that depends on their relative proximity to i via the weight matrix W and the spatial dependence
parameter ρ. At the same time, there are also possible feedback effects back to unit i itself, for
example if wij and wji are both non-zero, such that i and j are mutual neighbors, and i is a
‘second-order neighbor’ to itself.
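The expansion in (3) and the feedback effect can be checked numerically. The following is a minimal sketch under assumed values: the weight matrix `W` and the value `rho = 0.6` are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical row-normalized weights for three units.
W = np.array([[0.0, 0.5, 0.5],
              [1.0, 0.0, 0.0],
              [0.5, 0.5, 0.0]])
rho = 0.6
n = W.shape[0]

# Exact spatial multiplier Z = (I_n - rho W)^{-1} from equation (2).
Z = np.linalg.inv(np.eye(n) - rho * W)

# Truncated power series I + rho W + rho^2 W^2 + ... from equation (3).
approx = np.zeros((n, n))
term = np.eye(n)
for _ in range(200):
    approx += term
    term = rho * (term @ W)

# Response of all units to a unit shock that hits unit 1 only.
shock = np.array([1.0, 0.0, 0.0])
response = Z @ shock
print("max series error:", np.abs(Z - approx).max())
print("response to a unit shock in unit 1:", response)
```

The first entry of `response` exceeds one because part of the shock travels to the neighbors and back to unit 1, which is the second-order feedback described above.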
The simultaneous structure of (1) causes an endogeneity problem and leads to an inconsistency
in the least squares estimation of the coefficients when n becomes large. An alternative solution in
the cross-sectional literature is to estimate the parameters by the method of Maximum Likelihood
(ML) or Quasi-ML (QML) where the latter is typically based on the normal distribution.¹ The
ML Estimator (MLE) for spatial models with static dependence parameter was first studied in
Ord (1975) in the context of cross-sectional data sets. Lee (2004) derives asymptotic properties
of the Quasi MLE (QMLE) for n → ∞, and Hillier and Martellosio (2013) investigate its finite
sample distribution. Large n and large T asymptotics for QMLE of the spatial model with static
dependence parameter are studied in Yu et al. (2008). For further textbook treatments of spatial
econometric models and their estimation, we refer to Anselin (1988) and LeSage and Pace (2008).
For a survey on the panel data spatial lag model and parameter estimation, see Lee and Yu (2010).
¹ Alternatively, we can use GMM as in, for example, Kelejian and Prucha (2010).
2.2 Score dynamics for the spatial dependence parameter
We can interpret the spatial dependence parameter ρ in (1) as a measure of the strength of cross-
sectional spillovers. In many empirical applications involving panel data, it is unrealistic to assume
that ρ is constant over the entire sample period. We therefore propose to introduce a time-varying
ρ in the model, that is
$$ y_t = \rho_t W y_t + X_t \beta + e_t, \qquad e_t \sim p_e(e_t, \Sigma; \lambda), \qquad t = 1, \ldots, T, \tag{4} $$
where ρt = h(ft) is a monotonic transformation of a time-varying parameter ft. We choose the
link function h such that ρt ∈ (−1, 1). We adopt the autoregressive score framework of Creal et al.
(2011, 2013) and Harvey (2013) to introduce a time-varying ft. The score framework for time-
varying parameters has been adopted successfully in a range of different model settings, including
the multivariate volatility model of Creal et al. (2011), the systemic risk model of Oh and Patton
(2013) and Lucas et al. (2014), the credit risk dynamic factor model of Creal et al. (2014), the
location and scale models with fat tails of Harvey and Luati (2014), and many more.²
The score framework is based on the scaled score of the conditional density pe to drive the
time-variation in ft. The updating equation for ft is given by
$$ f_{t+1} = \omega + A s_t + B f_t, \tag{5} $$
where $\omega$ is a scalar coefficient, $A$ is the score coefficient, $s_t = S_t \nabla_t$ is the scaled score function and $B$ is the autoregressive (or mean-reverting) coefficient. All three coefficients are treated as fixed and unknown parameters. The scaled score function is defined as the first derivative of the predictive loglikelihood function at time $t$ with respect to $f_t$, possibly multiplied by some local scaling factor $S_t$. The score function is given by $\nabla_t = \partial \ell_t / \partial f_t$, where
$$ \ell_t = \ln p_e\big(y_t - h(f_t) W y_t - X_t \beta, \Sigma; \lambda\big) + \ln |I_n - h(f_t) W| . \tag{6} $$
Throughout this paper, we use unit scaling, that is $S_t \equiv 1$, such that $s_t = \nabla_t$. Other scaling choices are also feasible; see Creal et al. (2013).³ Equation (6) differs from the likelihood of a simple linear regression model by the term $\ln |I_n - h(f_t) W|$. This term accounts for the nonlinearity of the model in $\rho_t$ as shown in equation (2). We define the vector of static parameters $\theta = (\omega, A, B, \beta, \lambda)'$ and estimate $\theta$ via the numerical maximization of the likelihood function given by
$$ \ell_T = \sum_{t=1}^{T} \ell_t . \tag{7} $$
² See www.gasmodel.com for a full compilation.
³ In a simulation (not reported here) it turned out that different choices of scaling, such as scaling by the inverse information matrix or by its square root, did not have much impact on our empirical results.
We consider two specifications for the error term densities pe, namely the multivariate normal
distribution and the multivariate Student’s t distribution. The latter is particularly relevant for our
empirical study because spread changes in credit default swaps (CDS) may be fat-tailed. Also,
Harvey and Luati (2014) argue that the Student’s t distribution can render the dynamics more
robust to incidental influential observations and outliers. Using the standard expression for the
multivariate normal density, we obtain the time t contribution to the loglikelihood function as
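$$ \ell_t = \ln |I_n - h(f_t) W| - \frac{n}{2} \ln(2\pi) - \frac{1}{2} \ln |\Sigma| - \frac{1}{2} (y_t - h(f_t) W y_t - X_t \beta)' \Sigma^{-1} (y_t - h(f_t) W y_t - X_t \beta), \tag{8} $$
and score
$$ \nabla_t = \Big( y_t' W' \Sigma^{-1} (y_t - h(f_t) W y_t - X_t \beta) - \mathrm{tr}(Z(f_t) W) \Big) \cdot \dot{h}(f_t), \tag{9} $$
where $\mathrm{tr}(\cdot)$ is the trace operator, $Z(f_t) = (I_n - h(f_t) W)^{-1}$, and $\dot{h}(f_t)$ is the first derivative of the transformation function $h$ with respect to $f_t$. For instance, if $h(f_t) = \gamma \tanh(f_t)$ with $\gamma \in (0, 1)$, then $\dot{h}(f_t) = \gamma (1 - \tanh^2(f_t))$. When the density of the disturbance vector $e_t$ is a multivariate Student's t with $\lambda$ degrees of freedom, we have the $t$th likelihood contribution as given by
$$ \ell_t = \ln |Z(f_t)^{-1}| + \ln \left( \frac{\Gamma\left(\frac{\lambda + n}{2}\right)}{|\Sigma|^{1/2} (\lambda \pi)^{n/2}\, \Gamma\left(\frac{\lambda}{2}\right)} \right) - \frac{\lambda + n}{2} \ln \left( 1 + \frac{(y_t - h(f_t) W y_t - X_t \beta)' \Sigma^{-1} (y_t - h(f_t) W y_t - X_t \beta)}{\lambda} \right), $$
with the score function given by
$$ \nabla_t = \Big( w_t \cdot y_t' W' \Sigma^{-1} (y_t - h(f_t) W y_t - X_t \beta) - \mathrm{tr}(Z(f_t) W) \Big) \cdot \dot{h}(f_t), \tag{10} $$
where
$$ w_t = \frac{1 + \lambda^{-1} n}{1 + \lambda^{-1} (y_t - h(f_t) W y_t - X_t \beta)' \Sigma^{-1} (y_t - h(f_t) W y_t - X_t \beta)} . $$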
We can verify that if λ → ∞, we have wt → 1 and the score expression in (10) collapses to
the one in (9). The weight wt is small if the residuals yt − h(ft)Wyt − Xtβ are ‘large’ in a
multivariate sense. The implication of a small weight wt is that the observation has a smaller
impact on the updates of ft. It provides a robustness feature to the dynamics of ft if we assume
a fat-tailed distribution such as the Student’s t; see also the discussion in Creal et al. (2011) and
Harvey (2013). The intuition is straightforward: a large residual may be attributable to the
fat-tailedness of the Student’s t distribution rather than to a recent increase in the spatial correlation
h(ft).
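The behavior of the weight in (10) is easy to tabulate. In the sketch below, the function name `t_weight` is our own, and the quadratic form e′Σ⁻¹e is passed in directly as a hypothetical number rather than computed from data.

```python
def t_weight(quad_form, n, lam):
    """Weight w_t = (1 + n / lam) / (1 + quad_form / lam), with quad_form = e' Sigma^{-1} e."""
    return (1.0 + n / lam) / (1.0 + quad_form / lam)

n = 8  # cross-section size, as in an eight-country panel
for quad_form in (8.0, 80.0, 800.0):
    fat_tailed = t_weight(quad_form, n, lam=5.0)    # Student's t with 5 degrees of freedom
    near_normal = t_weight(quad_form, n, lam=1e9)   # lambda very large: close to the Gaussian case
    print(quad_form, round(fat_tailed, 4), round(near_normal, 4))
```

When the quadratic form equals n the weight is exactly one. For larger residuals the weight falls well below one under λ = 5, whereas for very large λ it stays close to one, in line with the Gaussian score in (9).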
The score expressions in (9) and (10) also depart from the expressions for the standard linear
regression model. In particular, we have the additional correction term −tr(Z(ft)W ). This term
accounts for the simultaneity bias in the standard least squares estimator and follows from the
presence of the term $\ln |I_n - h(f_t) W| = -\ln |Z(f_t)|$ in the likelihood at time $t$. In effect, this term accounts for the
fact that there may be feedback effects from unit i to unit j and then back to unit i. Hence the
spatial autoregressive score model integrates time-varying direct effects and indirect effects; both
are used to determine the appropriate transition dynamics for ρt.
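The Gaussian recursion defined by (5), (8) and (9) can be sketched in a few lines. This is an illustration under stated assumptions and not the estimation code used in the paper: the function names, the link ρ_t = γ tanh(f_t) with γ = 0.99, the simulated data, and all parameter values are hypothetical.

```python
import numpy as np

def loglik_t(f, y_t, X_t, W, beta, Sigma, gamma=0.99):
    """Gaussian log-likelihood contribution (8) with rho_t = gamma * tanh(f)."""
    n = y_t.size
    rho = gamma * np.tanh(f)
    resid = y_t - rho * (W @ y_t) - X_t @ beta
    _, logdet_jac = np.linalg.slogdet(np.eye(n) - rho * W)   # ln |I_n - rho W|
    _, logdet_sigma = np.linalg.slogdet(Sigma)
    quad = resid @ np.linalg.solve(Sigma, resid)
    return logdet_jac - 0.5 * n * np.log(2.0 * np.pi) - 0.5 * logdet_sigma - 0.5 * quad

def score_t(f, y_t, X_t, W, beta, Sigma, gamma=0.99):
    """Unit-scaled score (9): derivative of loglik_t with respect to f."""
    n = y_t.size
    rho = gamma * np.tanh(f)
    h_dot = gamma * (1.0 - np.tanh(f) ** 2)
    Wy = W @ y_t
    resid = y_t - rho * Wy - X_t @ beta
    trace_ZW = np.trace(np.linalg.solve(np.eye(n) - rho * W, W))  # tr(Z W)
    return (Wy @ np.linalg.solve(Sigma, resid) - trace_ZW) * h_dot

def score_filter(y, X, W, beta, Sigma, omega, A, B, f1, gamma=0.99):
    """Iterate update (5); returns f_1, ..., f_{T+1} and rho_1, ..., rho_T."""
    T = y.shape[0]
    f = np.empty(T + 1)
    f[0] = f1
    for t in range(T):
        f[t + 1] = omega + A * score_t(f[t], y[t], X[t], W, beta, Sigma, gamma) + B * f[t]
    return f, gamma * np.tanh(f[:T])

# Simulated illustration: data generated with a constant spatial dependence of 0.5.
rng = np.random.default_rng(0)
T, n = 200, 3
W = np.array([[0.0, 0.5, 0.5], [0.5, 0.0, 0.5], [0.5, 0.5, 0.0]])
X = np.ones((T, n, 1))                     # intercept only
beta = np.array([0.1])
Sigma = np.eye(n)
Z = np.linalg.inv(np.eye(n) - 0.5 * W)
y = np.array([Z @ (X[t] @ beta + rng.standard_normal(n)) for t in range(T)])
f, rho = score_filter(y, X, W, beta, Sigma, omega=0.0, A=0.02, B=0.95, f1=0.5)
print(rho[:5])
```

Keeping `loglik_t` and `score_t` separate makes it possible to verify the analytical score against a finite-difference derivative of the log-likelihood before the filter is embedded in a numerical optimizer.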
3 Statistical properties of the model
In this section, we establish the existence, strong consistency and asymptotic normality of the
MLE of the static parameters θ that define the stochastic properties of the spatial score model
from Section 2. We choose to first present the results in a more general setting than the spatial
score model, thus extending the results in Blasques et al. (2014b) to allow for the presence of
exogenous regressors. We then particularize the results to the MLE for the spatial score model
in Corollary 1. We also discuss how the spatial score update is the optimal observation-driven
parameter update in an information theoretic sense, thus providing a further theoretical backbone
to the use of the score for updating ft. All proofs of the results stated in this section can be found
in the appendix.
3.1 Stochastic properties of the filtered spatial dependence parameter
To establish the consistency and asymptotic normality of the MLE, we first study the stochastic
properties of the filtered parameter ft defined through equations (5), (9), and (10). The filtered ft
directly determine the time-varying spatial parameter ρt = h(ft). Understanding the properties of
the filtered parameters is key to understanding the stochastic properties of the likelihood function
over the parameter space Θ.
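We first need some additional notation. Let the $T$-period sequences $\{y_t(\omega)\}_{t=1}^{T}$ and $\{X_t(\omega)\}_{t=1}^{T}$ be subsets of the realized paths of the $n$- and $k$-variate stochastic sequences $y(\omega) := \{y_t(\omega)\}_{t \in \mathbb{Z}}$ and $X(\omega) := \{X_t(\omega)\}_{t \in \mathbb{Z}}$, for some $\omega$ in the event space $\Omega$. In particular,⁴ we let $y_t(\omega) \in \mathcal{Y} \subseteq \mathbb{R}^n$ $\forall\, (\omega, t) \in \Omega \times \mathbb{Z}$ and $X_t(\omega) \in \mathcal{X} \subseteq \mathbb{R}^k$ $\forall\, (\omega, t) \in \Omega \times \mathbb{Z}$. For every $\omega \in \Omega$, the stochastic sequences $y(\omega)$ and $X(\omega)$ thus live on the spaces $(\mathcal{Y}_\infty, \mathcal{B}(\mathcal{Y}_\infty), P_0^y)$ and $(\mathcal{X}_\infty, \mathcal{B}(\mathcal{X}_\infty), P_0^X)$, where the probability measures $P_0^y$ and $P_0^X$ are defined over the elements of the Borel σ-algebras $\mathcal{B}(\mathcal{Y}_\infty)$ and $\mathcal{B}(\mathcal{X}_\infty)$. We write the filtered time-varying parameter as $\{f_t(y^{1:t-1}, X^{1:t-1}; \theta, f_1)\}_{t \in \mathbb{N}}$, with its arguments made explicit to distinguish it from the true time-varying parameter $f_t$. The filtered parameter depends naturally on the initialization $f_1 \in \mathcal{F} \subseteq \mathbb{R}$, the past data $y^{1:t-1} = \{y_s\}_{s=1}^{t-1}$ and $X^{1:t-1} = \{X_s\}_{s=1}^{t-1}$, and the parameter vector $\theta \in \Theta$. For notational simplicity we often omit the dependence on the data and write $\{f_t(\theta, f_1)\}_{t \in \mathbb{N}}$ instead.
We can now rewrite the score update in (5) as
$$ f_{t+1}(\theta, f_1) = \omega + A\, s\big(f_t(\theta, f_1), y_t, X_t; \beta, \lambda\big) + B f_t(\theta, f_1) \quad \forall\, t \in \mathbb{N}, $$
where $s(f_t(\theta, f_1), y_t, X_t; \beta, \lambda)$ denotes the unit scaled score function. To shorten the notation, we define the random function
$$ \phi_t\big(f_t(\theta, f_1); \theta\big) := \phi\big(f_t(\theta, f_1), y_t, X_t; \theta\big) := \omega + A\, s(f_t(\theta, f_1), y_t, X_t; \beta, \lambda) + B f_t(\theta, f_1), $$
as well as the supremum of its derivative,
$$ \phi_t'(\theta) := \sup_{f \in \mathcal{F}} \left| A \frac{\partial s(f, y_t, X_t; \beta, \lambda)}{\partial f} + B \right| . \tag{11} $$
Note that $\phi_t'(\theta)$ is also a random variable due to its dependence on the data.
The following theorem states sufficient conditions for the stochastic sequence $\{f_t(\theta, f_1)\}_{t \in \mathbb{N}}$ initialized at $f_1 \in \mathcal{F}$ to converge almost surely, uniformly in $\theta \in \Theta$, and exponentially fast to a limit stationary and ergodic (SE) sequence $\{f_t(\theta)\}_{t \in \mathbb{Z}}$ that has $N_f$ bounded moments. We repeatedly make use of this notion of uniform exponentially fast almost sure (e.a.s.) convergence, which means that $\exists\, \gamma > 1$ such that
$$ \sup_{\theta \in \Theta} \gamma^t \big| f_t(y^{1:t-1}, X^{1:t-1}, \theta, f_1) - f_t(y^{t-1}, X^{t-1}, \theta) \big| \xrightarrow{a.s.} 0 \quad \text{as } t \to \infty; $$
see Straumann and Mikosch (2006). Note that the limit sequence starts in the infinite past and hence depends on the infinite past data $y^{t-1} := \{y_s\}_{s=-\infty}^{t-1}$ and $X^{t-1} := \{X_s\}_{s=-\infty}^{t-1}$, i.e., $\{f_t(\theta)\}_{t \in \mathbb{Z}} \equiv \{f_t(y^{t-1}, X^{t-1}; \theta)\}_{t \in \mathbb{Z}}$. We thus establish the convergence of the sequence of random functions $\{f_t(\cdot, f_1)\}_{t \in \mathbb{N}}$ defined on $\Theta$ with random elements taking values in the Banach space $(C(\Theta, \mathcal{F}), \|\cdot\|_\Theta)$ for every $t \in \mathbb{N}$, to an SE limit $\{f_t(\cdot)\}_{t \in \mathbb{Z}}$ with elements taking values in $(C(\Theta), \|\cdot\|_\Theta)$, where $\|\cdot\|_\Theta$ denotes the supremum norm on $\Theta$. We have the following result.
⁴ The random sequences $y$ and $X$ are thus $F/\mathcal{B}(\mathcal{Y}_\infty)$- and $F/\mathcal{B}(\mathcal{X}_\infty)$-measurable mappings $y : \Omega \to \mathcal{Y}_\infty \subseteq \mathbb{R}^n_\infty$ and $X : \Omega \to \mathcal{X}_\infty \subseteq \mathbb{R}^k_\infty$, where $\mathbb{R}^n_\infty := \times_{t=-\infty}^{t=\infty} \mathbb{R}^n$ and $\mathbb{R}^k_\infty := \times_{t=-\infty}^{t=\infty} \mathbb{R}^k$ denote Cartesian products of infinite copies of $\mathbb{R}^n$ and $\mathbb{R}^k$ respectively, and $\mathcal{Y}_\infty = \times_{t=-\infty}^{t=\infty} \mathcal{Y}$ and $\mathcal{X}_\infty = \times_{t=-\infty}^{t=\infty} \mathcal{X}$ with $\mathcal{B}(\mathcal{Y}_\infty) \equiv \mathcal{B}(\mathbb{R}^n_\infty) \cap \mathcal{Y}_\infty$ and $\mathcal{B}(\mathcal{X}_\infty) \equiv \mathcal{B}(\mathbb{R}^k_\infty) \cap \mathcal{X}_\infty$; see Billingsley (1995, p. 159). Here, $\mathcal{B}(\mathbb{R}^n_\infty)$ and $\mathcal{B}(\mathbb{R}^k_\infty)$ denote the Borel σ-algebras generated by the finite dimensional product cylinders of $\mathbb{R}^n_\infty$ and $\mathbb{R}^k_\infty$ respectively, $F$ denotes a σ-field defined on the event space $\Omega$, and together with the probability measure $P_0$ on $F$, the triplet $(\Omega, F, P_0)$ denotes the common underlying complete probability space of interest.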
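THEOREM 1. Let $\mathcal{F}$ be convex, $\Theta$ be compact, $\{y_t\}_{t \in \mathbb{Z}}$ and $\{X_t\}_{t \in \mathbb{Z}}$ be SE, $s \in C(\mathcal{F} \times \mathcal{Y} \times \mathcal{X} \times \mathcal{B} \times \Lambda)$ and assume there exists a non-random $f_1 \in \mathcal{F}$ such that
(i) $E \ln^+ \sup_{(\beta, \lambda) \in \mathcal{B} \times \Lambda} |s(f_1, y_t, X_t; \beta, \lambda)| < \infty$;
(ii) $E \ln \sup_{\theta \in \Theta} \phi_1'(\theta) < 0$.
Then $\{f_t(\theta, f_1)\}_{t \in \mathbb{N}}$ converges e.a.s. to the unique limit SE process $\{f_t(\theta)\}_{t \in \mathbb{Z}}$.
If furthermore $\exists\, N_f \geq 1$ such that
(iii) $E \sup_{(\beta, \lambda) \in \mathcal{B} \times \Lambda} |s(f_1, y_t, X_t; \beta, \lambda)|^{N_f} < \infty$;
and either
(iv) $\sup_{(\beta, \lambda) \in \mathcal{B} \times \Lambda} |s(f, y, X; \beta, \lambda) - s(f', y, X; \beta, \lambda)| < |f - f'|$ $\forall\, (f, f', y, X) \in \mathcal{F} \times \mathcal{F} \times \mathcal{Y} \times \mathcal{X}$;
or
(iv′) $E \sup_{\theta \in \Theta} \phi_1'(\theta)^{N_f} < 1$ and $f_t(\theta, f_1) \perp \phi_t'(\theta)$ $\forall\, (t, f_1) \in \mathbb{N} \times \mathcal{F}$, where $\perp$ denotes independence;
then both $\{f_t(\theta, f_1)\}_{t \in \mathbb{N}}$ and the limit SE process $\{f_t(\theta)\}_{t \in \mathbb{Z}}$ have $N_f$ bounded moments, i.e., $\sup_t E \sup_{\theta \in \Theta} |f_t(\theta, f_1)|^{N_f} < \infty$ and $E \sup_{\theta \in \Theta} |f_t(\theta)|^{N_f} < \infty$.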
The first claim of Theorem 1 makes use of the conditions in Bougerol (1993a). Condition (i)
requires the existence of an arbitrarily small moment for the score, and condition (ii) requires the
spatial score update to be contracting on average. The uniqueness of the SE limit follows from
Straumann and Mikosch (2006). The second claim of Theorem 1 uses stricter moment conditions
and contraction conditions to obtain bounded moments of higher order for the filtered sequence.
This constitutes an extension of Proposition 1 in Blasques et al. (2014b) to the spatial score setting
with exogenous random variables Xt as well as vector and matrix arguments. Remark 1 below
highlights that in the special case where the score is uniformly bounded, then the filter has infinitely
many bounded moments under simpler conditions.
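REMARK 1. Let $|B| < 1$. If $\sup_{(\beta, \lambda, f, y, X) \in \mathcal{B} \times \Lambda \times \mathcal{F} \times \mathcal{Y} \times \mathcal{X}} |s(f, y, X; \beta, \lambda)| < \infty$, then $\sup_t E \sup_{\theta \in \Theta} |f_t(\theta, f_1)|^{N_f} < \infty$ holds for every $N_f \geq 1$.
The proof of this statement follows immediately by noting that $f_{t+1} = \sum_{j=0}^{t-1} B^j (\omega + A s_{t-j}) + B^{t} f_1$, and hence that
$$ |f_{t+1}| \leq \sum_{j=0}^{t-1} |B|^j |\omega + A s_{t-j}| + |B|^{t} |f_1| \leq \sum_{j=0}^{t-1} |B|^j |\omega| + \sum_{j=0}^{t-1} |B|^j |A| \sup |s| + |B|^{t} |f_1| < \infty $$
uniformly in $t$, because $|B| < 1$ and $\sup |s| < \infty$.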
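The unfolded recursion and the resulting bound can be verified numerically. The sketch below makes a simplifying assumption that is not part of the model: the scores are taken as a given bounded sequence (here a sine wave), whereas in the model s_t depends on f_t. All parameter values are hypothetical.

```python
import math

def filter_path(omega, A, B, f1, scores):
    """Iterate f_{t+1} = omega + A * s_t + B * f_t and return [f_1, ..., f_{T+1}]."""
    path = [f1]
    for s in scores:
        path.append(omega + A * s + B * path[-1])
    return path

def unfolded(omega, A, B, f1, scores):
    """Closed form f_{t+1} = sum_{j=0}^{t-1} B^j (omega + A s_{t-j}) + B^t f_1."""
    t = len(scores)
    total = sum(B ** j * (omega + A * scores[t - 1 - j]) for j in range(t))
    return total + B ** t * f1

omega, A, B, f1 = 0.05, 0.3, 0.9, 2.0
scores = [math.sin(0.7 * t) for t in range(1, 301)]   # bounded scores: |s_t| <= 1
path = filter_path(omega, A, B, f1, scores)

# Geometric-series bound implied by the inequality above, with sup|s| = 1.
bound = (abs(omega) + abs(A) * 1.0) / (1.0 - abs(B)) + abs(f1)
print("recursion:", path[-1], "closed form:", unfolded(omega, A, B, f1, scores))
print("largest |f_t|:", max(abs(x) for x in path), "bound:", bound)
```

Because the bound does not depend on t, every moment of the filtered sequence is bounded, which is the content of Remark 1.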
3.2 Asymptotic properties of the maximum likelihood estimator
The observation-driven structure of the time-varying spatial model allows for a simple implemen-
tation of a maximum likelihood (ML) estimation procedure. Following equation (7), we define the
ML estimator (MLE) of the spatial score parameters more precisely as an element of the arg max
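set of the sample log likelihood function $\ell_T(\theta, f_1)$,
$$ \theta_T(f_1) \in \arg\max_{\theta \in \Theta} \ell_T(\theta, f_1), \tag{12} $$
where
$$ \ell_T(\theta, f_1) = \frac{1}{T} \sum_{t=1}^{T} \ell_t(\theta, f_1) = \frac{1}{T} \sum_{t=1}^{T} \Big[ \log p_e\big(y_t - h(f_t(\theta, f_1)) W y_t - X_t \beta; \lambda\big) - \log \big|Z(f_t(\theta, f_1))\big| \Big], $$
with $Z(f_t)$ defined below (9).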
We can now use Theorem 1 to establish existence, consistency and asymptotic normality of
the MLE of the static parameters in the time-varying spatial model. For existence, we make the
following assumptions.
ASSUMPTION 1. (Θ,B(Θ)) is a measurable space and Θ is a compact set. Furthermore, h :
F → F ⊆ R and pe : Rn × Λ→ R are continuously differentiable in their arguments.
In Section 2, we have opted for the unit scaling of the score in our model. We can easily generalize
all results below to the case of a non-constant scaling function S as long as we assume S : F → R
is sufficiently smooth. Theorem 2 below establishes the existence and measurability of the MLE.
THEOREM 2. (Existence) Let Assumption 1 hold. Then there exists a.s. an F/B(Θ)-measurable
map θT (f1) : Ω→ Θ satisfying (12) for all T ∈ N and every initialization f1 ∈ F .
To obtain consistency of the MLE, we impose conditions that ensure that the likelihood function satisfies a uniform law of large numbers for SE processes. We first ensure that the filter $\{f_t(\theta, f_1)\}$ is SE and has $N_f$ bounded moments by application of Theorem 1.
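ASSUMPTION 2. $\exists\, (N_f, f) \in [1, \infty) \times \mathcal{F}$ and a $\Theta \subset \mathbb{R}^{3 + d_\lambda}$ such that
(i) $\sup_{(\beta, \lambda) \in \mathcal{B} \times \Lambda} E |s(f, y_t, X_t; \beta, \lambda)|^{N_f} < \infty$,
and either
(ii) $\sup_{(f, y, X, \beta, \lambda) \in \mathbb{R} \times \mathcal{Y} \times \mathcal{X} \times \mathcal{B} \times \Lambda} |B + A\, \partial s(f, y, X; \beta, \lambda) / \partial f| < 1$,
or
(ii′) $E \sup_{\theta \in \Theta} \phi'_{t, N_f}(\theta) = E \sup_{\theta \in \Theta} |B + A\, \partial s(f, y_t, X_t; \beta, \lambda) / \partial f| < 1$ and $f_t(y^{t-1}, X^{t-1}, \theta, f_1) \perp \phi'_{t+1, N_f}(\theta)$ $\forall\, (t, f_1) \in \mathbb{N} \times \mathcal{F}$.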
Next, we ensure a bounded expectation for the likelihood function. To do this, we use the notion of a ‘moment preserving map’. This allows us to derive the appropriate number of bounded moments of the likelihood function from the moments of its arguments; see Blasques et al. (2014b) for a detailed description of the moment preserving properties of a wide catalogue of functions.
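DEFINITION 1. (Moment Preserving Maps) A function $H : \mathbb{R}^{k_1} \times \Theta \to \mathbb{R}^{k_2}$ is said to be $n/m$-moment preserving, denoted as $H \in \mathbb{M}_\Theta(n, m)$, if and only if $E \sup_{\theta \in \Theta} |x_t(\theta)|^n < \infty$ implies $E \sup_{\theta \in \Theta} |H(x_t(\theta); \theta)|^m < \infty$.⁵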
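As a simple illustration of Definition 1 (our own example, assuming a scalar $x_t(\theta)$ and a compact $\Theta \subset \mathbb{R}$), the map $H(x; \theta) = \theta x^2$ is $n/(n/2)$-moment preserving, because the supremum of a product is bounded by the product of the suprema:

```latex
E \sup_{\theta\in\Theta} \big| H(x_t(\theta);\theta) \big|^{n/2}
  = E \sup_{\theta\in\Theta} |\theta|^{n/2}\, |x_t(\theta)|^{n}
  \le \Big( \sup_{\theta\in\Theta} |\theta| \Big)^{n/2}\,
      E \sup_{\theta\in\Theta} |x_t(\theta)|^{n} < \infty .
```

Squaring thus halves the number of bounded moments, so that $H \in \mathbb{M}_\Theta(n, n/2)$.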
ASSUMPTION 3. $N_\ell = \min\{N_{\log p_e}, N_{\log |Z|}\} \geq 1$, where $\log |Z| \in \mathbb{M}_\Theta(N_f, N_{\log |Z|})$ and $\log p_e \in \mathbb{M}_\Theta(N, N_{\log p_e})$, with $N = \min\{N_y, N_x\}$, where $N_y$ and $N_x$ denote the moments of $y_t$ and $X_t$, respectively.
The moment N` in Assumption 3 corresponds to the number of moments of the likelihood
function. Rather than assuming N` ≥ 1 as a high-level assumption, we define N` as a function of
the score model constituents directly, thus obtaining a set of low-level conditions for strong consistency. The requirements imposed in Assumption 3 follow easily by application of a generalized Hölder inequality to the likelihood expression below (12). Note that $N = \min\{N_y, N_x\}$ follows directly from the fact that the argument $(y_t - h(f_t(\theta, f_1)) W y_t - X_t \beta)$ of $p_e$ is linear in both $y_t$ and $X_t$, and $\sup_{f \in \mathcal{F}} |h(f)| \leq 1$. The current conditions extend those of Blasques et al. (2014b) by
accounting for the presence of exogenous variables Xt in the model.
Theorem 3 now establishes the strong consistency of the MLE for the parameters of our time-
varying spatial model if the data are SE.
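THEOREM 3. (Consistency) Let $\{y_t\}_{t \in \mathbb{Z}}$ and $\{X_t\}_{t \in \mathbb{Z}}$ be SE sequences satisfying $E |y_t|^{N_y} < \infty$ and $E |X_t|^{N_x} < \infty$ for some $N_y > 0$ and $N_x > 0$ and let Assumptions 1, 2, and 3 hold. Furthermore, let $\theta_0 \in \Theta$ be the unique maximizer of $\ell_\infty(\theta)$ on the parameter space $\Theta$. Then the MLE satisfies $\theta_T(f_1) \xrightarrow{a.s.} \theta_0$ as $T \to \infty$ for every $f_1 \in \mathcal{F}$.
⁵ The $(k_1 \times 1)$-vector $x_t$ satisfies $E \sup_{\theta \in \Theta} |x_t(\theta)|^n < \infty$ if its elements $x_{i,t}(\theta)$ satisfy $E \sup_{\theta \in \Theta} |x_{i,t}(\theta)|^n < \infty$, $i = 1, \ldots, k_1$. The same element-wise definition applies when $x_t(\theta)$ is a matrix.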
Remark 2 below highlights that if the score s is uniformly bounded, we can change Assump-
tion 2 in line with Remark 1.
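REMARK 2. We can substitute Assumption 2 in Theorem 3 by
(i) $\sup_{(\beta, \lambda, f, y, X) \in \mathcal{B} \times \Lambda \times \mathcal{F} \times \mathcal{Y} \times \mathcal{X}} |s(f, y, X; \beta, \lambda)| < \infty$;
(ii) $E \ln \sup_{\theta \in \Theta} \phi'_{1,1}(\theta) < 0$ and $|B| < 1$.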
Finally, we establish the asymptotic normality of the MLE. For this, we require the exis-
tence of a sufficient number of bounded moments for the likelihood function and its derivatives.
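For notational simplicity, we define the function $q_t := q(f_t(\theta, f_1), y_t, X_t; \beta, \lambda) := \log p_e(y_t - h(f_t(\theta, f_1)) W y_t - X_t \beta; \lambda)$, as well as the cross-derivatives
$$ s^{(K_1, K_2, K_3)}(f, y, X; \beta, \lambda) := \frac{\partial^{K_1 + K_2 + K_3} s(f, y, X; \beta, \lambda)}{\partial f^{K_1}\, \partial \beta^{K_2}\, \partial \lambda^{K_3}} . $$
The (cross)-derivatives $q^{(K_1, K_2, K_3)}$ and $(\log |Z|)^{(K_1)}$ are defined similarly. Assumption 4 now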
imposes sufficient moment conditions for the asymptotic normality of the MLE.
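ASSUMPTION 4. (i) $s^{(K)} \in \mathbb{M}_\Theta(\mathbf{N}, N_s^{(K)})$, $q^{(K')} \in \mathbb{M}_\Theta(N, N_q^{(K')})$, $\mathbf{N} := (N_f, N_y, N_x)$, with $N$ as defined in Assumption 3;
(ii) $N_{\ell'} \geq 2$, $N_{\ell''} \geq 1$, $N_f^{(1)} > 0$, and $N_f^{(2)} > 0$, with
$$ N_{\ell'} = \min\left\{ N_q^{(0,1,0)},\; N_q^{(0,0,1)},\; \frac{N_{\log |Z|}^{(1)} N_f^{(1)}}{N_{\log |Z|}^{(1)} + N_f^{(1)}},\; \frac{N_q^{(1,0,0)} N_f^{(1)}}{N_q^{(1,0,0)} + N_f^{(1)}} \right\}, $$
$$ N_{\ell''} = \min\left\{ N_q^{(0,2,0)},\; N_q^{(0,0,2)},\; N_q^{(0,1,1)},\; \frac{N_q^{(1,1,0)} N_f^{(1)}}{N_q^{(1,1,0)} + N_f^{(1)}},\; \frac{N_q^{(1,0,1)} N_f^{(1)}}{N_q^{(1,0,1)} + N_f^{(1)}},\; \frac{N_q^{(2,0,0)} N_f^{(1)}}{2 N_q^{(2,0,0)} + N_f^{(1)}},\; \frac{N_q^{(1,0,0)} N_f^{(2)}}{N_q^{(1,0,0)} + N_f^{(2)}},\; \frac{N_{\log |Z|}^{(1)} N_f^{(2)}}{N_{\log |Z|}^{(1)} + N_f^{(2)}},\; \frac{N_{\log |Z|}^{(2)} N_f^{(1)}}{2 N_{\log |Z|}^{(2)} + N_f^{(1)}} \right\}, $$
$$ N_f^{(1)} = \min\left\{ N_f,\; N_s,\; N_s^{(0,1,0)},\; N_s^{(0,0,1)} \right\}, $$
$$ N_f^{(2)} = \min\left\{ N_f^{(1)},\; N_s^{(0,1,0)},\; N_s^{(0,0,1)},\; N_s^{(0,2,0)},\; N_s^{(0,0,2)},\; N_s^{(0,1,1)},\; \frac{N_s^{(1,0,0)} N_f^{(1)}}{N_s^{(1,0,0)} + N_f^{(1)}},\; \frac{N_s^{(2,0,0)} N_f^{(1)}}{2 N_s^{(2,0,0)} + N_f^{(1)}},\; \frac{N_s^{(1,1,0)} N_f^{(1)}}{N_s^{(1,1,0)} + N_f^{(1)}},\; \frac{N_s^{(1,0,1)} N_f^{(1)}}{N_s^{(1,0,1)} + N_f^{(1)}} \right\}. $$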
Again, rather than assuming $N_{\ell'} \ge 2$ and $N_{\ell''} \ge 1$ directly as a high-level condition, we
define $N_{\ell'}$ and $N_{\ell''}$ explicitly in terms of their lower-level constituents. The moment conditions
in Assumption 4 extend those of Blasques et al. (2014b) by allowing for exogenous regressors.
The expressions may seem complicated at first, but we show below that their verification is often
straightforward; see also Blasques et al. (2014b) for the verification of similar moment conditions
in a wide range of observation-driven models.
The quantities $N_f^{(1)}$ and $N_f^{(2)}$ in Assumption 4 correspond to the moments of the first and second derivatives of the filter $f_t(\theta, f_1)$ with respect to the parameter $\theta$. Similarly, $N_{\ell'}$ and $N_{\ell''}$ denote the moments of the first and second derivatives of the likelihood function, respectively.
Theorem 4 now establishes the asymptotic normality of the MLE. Here, int(Θ) denotes the
interior of Θ.
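THEOREM 4. (Asymptotic Normality) Let $\{y_t\}_{t\in\mathbb{Z}}$ and $\{X_t\}_{t\in\mathbb{Z}}$ be SE sequences satisfying $E|y_t|^{N_y} < \infty$ and $E|X_t|^{N_x} < \infty$ for some $N_y > 0$ and $N_x > 0$ and let Assumptions 1–4 hold. Furthermore, let $\theta_0 \in \mathrm{int}(\Theta)$ be the unique maximizer of $\ell_\infty(\theta)$ on $\Theta$. Then,
\[ \sqrt{T}\,\big(\hat\theta_T(f_1) - \theta_0\big) \xrightarrow{d} N\big(0,\ \mathcal{I}^{-1}(\theta_0)\,\mathcal{J}(\theta_0)\,\mathcal{I}^{-1}(\theta_0)\big) \quad \text{as } T \to \infty, \]
where $\mathcal{J}(\theta_0) := E\,\tilde\ell'_t(\theta_0)\,\tilde\ell'_t(\theta_0)^\top$ is the expected outer product of gradients and $\mathcal{I}(\theta_0) := E\,\tilde\ell''_t(\theta_0)$ is the Fisher information matrix.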
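The sandwich form of the limiting covariance in Theorem 4 can be illustrated by replacing the two expectations with sample averages. The following Python sketch is only an illustration under stated assumptions: the function name `sandwich_covariance`, its arguments, and the toy Gaussian example are hypothetical and not part of the paper, and the per-observation scores and the average Hessian are assumed to be available (for instance from numerical differentiation).

```python
import numpy as np

def sandwich_covariance(scores, mean_hessian):
    """Estimate Var(theta_hat) by I^{-1} J I^{-1} / T.

    scores       : (T, k) array; row t is the gradient of the t-th
                   log-likelihood contribution at the estimate.
    mean_hessian : (k, k) array; sample average of the per-observation
                   Hessians (estimate of I(theta_0)).
    """
    scores = np.asarray(scores, dtype=float)
    T = scores.shape[0]
    J_hat = scores.T @ scores / T                 # outer product of gradients
    I_inv = np.linalg.inv(np.asarray(mean_hessian, dtype=float))
    return I_inv @ J_hat @ I_inv / T              # the sign of I cancels out

# Toy example (an assumption for illustration): a unit-variance Gaussian
# location model fitted to data whose true standard deviation is 2.
rng = np.random.default_rng(0)
y = rng.normal(loc=1.0, scale=2.0, size=1000)
mu_hat = y.mean()
scores = (y - mu_hat)[:, None]        # derivative of -(y - mu)^2 / 2 w.r.t. mu
mean_hessian = np.array([[-1.0]])     # second derivative w.r.t. mu
cov = sandwich_covariance(scores, mean_hessian)
```

In this toy case the sandwich estimate equals the sample variance of y divided by T, so it remains valid even though the assumed unit variance is wrong; the inverse-Hessian formula alone would return 1/T.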
Next we apply the theory developed above to consider the properties of the MLE for the time-
varying spatial model. We consider the model in (4) with Student’s t distributed innovations with
λ > 0 degrees of freedom. Consider a transformation function h that is (a.s.) bounded away from
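minus one and one with uniformly bounded derivatives $h^{(i)}$,
\[ -1 < \underline{\rho} \le \rho_t = h(f_t) \le \bar{\rho} < 1 \ \text{a.s.}; \qquad \sup_{f\in\mathcal{F}} |h^{(i)}(f)| < \infty, \quad i = 1, 2. \tag{13} \]
For example, to restrict $\rho_t$ to lie between $\underline{\rho} = -\bar{\rho}$ and $\bar{\rho}$, we can take $h(f_t) = \bar{\rho}\tanh(f_t)$, where $\bar{\rho}$ can be arbitrarily close to one. We have the following corollary.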
COROLLARY 1. Consider the spatial score model with link function (13). If $\{y_t\}_{t\in\mathbb{Z}}$ and $\{X_t\}_{t\in\mathbb{Z}}$ are SE with $E|y_t| < \infty$ and $E|X_t| < \infty$, then there exists a compact parameter space $\Theta$ with $|B| < 1$ for all $\theta \in \Theta$, such that the MLE exists (a.s.) and is strongly consistent for any initialization $f_1 \in \mathcal{F}$. If $E|y_t|^{2+\varepsilon} < \infty$ and $E|X_t|^{2+\varepsilon} < \infty$ for some $\varepsilon > 0$, then the MLE is asymptotically normal with covariance matrix given in Theorem 4.
The corollary is a direct consequence of the previous theorems and is particularly applicable
to the spatial score model that we apply in our empirical section later on. It shows that we can use
the MLE both for estimation and inference.
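The conditions in (13) are easy to check numerically for the tanh link. The sketch below assumes the bound 0.99 for the upper limit of the spatial dependence parameter; the function names `h`, `h_prime`, and `h_second` and the grid are illustrative choices, not taken from the paper.

```python
import numpy as np

def h(f, rho_bar=0.99):
    """Link function rho_t = rho_bar * tanh(f_t)."""
    return rho_bar * np.tanh(f)

def h_prime(f, rho_bar=0.99):
    """First derivative of the link with respect to f."""
    return rho_bar * (1.0 - np.tanh(f) ** 2)

def h_second(f, rho_bar=0.99):
    """Second derivative of the link with respect to f."""
    t = np.tanh(f)
    return -2.0 * rho_bar * t * (1.0 - t ** 2)

# Evaluate on a wide grid: the link stays inside [-rho_bar, rho_bar]
# and both derivatives remain uniformly bounded.
f_grid = np.linspace(-50.0, 50.0, 10001)
max_abs_h = np.max(np.abs(h(f_grid)))
max_abs_h1 = np.max(np.abs(h_prime(f_grid)))
max_abs_h2 = np.max(np.abs(h_second(f_grid)))
```

Because tanh and its derivatives are bounded on the whole real line, the bounds hold for any value of the filtered parameter, which is what the corollary requires of the link function.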
3.3 Optimality of score updating in the time-varying spatial model
We note that the score-driven framework is not only intuitively appealing as a way to update time-
varying parameters. More importantly, score based updates are also optimal in an information
theoretic sense under very mild regularity conditions; see Blasques et al. (2014a).
Let $p_t := p(\,\cdot\,|f_t, X_t)$ denote the true unknown conditional density of $y_t$, which depends on the true unobserved time-varying parameter $\{f_t\}_{t\in\mathbb{Z}}$ and the regressors $X_t$. Similarly, let $\hat p_t := \hat p(\,\cdot\,|\hat f_t, X_t)$ denote the conditional density implied by the score model given the filtered time-varying parameter $\hat f_t$, the regressors $X_t$, and the postulated innovation density $p_e$. To simplify the notation, we have dropped the dependence of $\hat f_t$ on $\theta$ and $f_1$. Ideally, whenever a new observation $y_t$ becomes available, we want the filtered value $\hat f_{t+1}$ to be such that the new conditional density implied by the model, $\hat p_{t+1} := \hat p(\,\cdot\,|\hat f_{t+1}, X_t)$, is as close as possible to the true unknown conditional density $p_t$ from which $y_t$ was drawn.
Following Blasques et al. (2014a), we use the Kullback-Leibler divergence to measure the distance between the two densities,
\[ D_{KL}\big(p_t, \hat p_{t+1}\big) = \int_Y p(y|f_t, X_t)\, \ln \frac{p(y|f_t, X_t)}{\hat p(y|\hat f_{t+1}, X_t; \theta)}\, dy, \tag{14} \]
where $Y \subseteq \mathbb{R}$ is the set over which the divergence is evaluated; see Blasques et al. (2014a) for further details. In particular, we would like an update $\hat f_{t+1}$ for which $D_{KL}\big(p(\,\cdot\,|f_t, X_t), \hat p(\,\cdot\,|\hat f_{t+1}, X_t)\big)$ is smaller than $D_{KL}\big(p(\,\cdot\,|f_t, X_t), \hat p(\,\cdot\,|\hat f_t, X_t)\big)$, such that the update from $\hat f_t$ to $\hat f_{t+1}$ reduces the distance to the true unknown conditional density.
Blasques et al. (2014a) show that only score updates have the property that they locally always
reduce the KL-distance and thus provide a local improvement. Though their original proofs do
not account for the presence of exogenous regressors, it is easy to see from their paper that all
their results continue to hold if Xt is incorporated in the conditional densities as described above.
In particular, the spatial model structure and Student’s t specification are sufficiently smooth for
local optimality results to apply, as well as for non-local Kullback-Leibler improvement regions
from Blasques et al. (2014a) to hold.
4 Monte Carlo study
To show the adequacy of the time-varying spatial model in filtering out dynamic patterns of the
spatial dependence parameters, we conduct a simulation study. In this study, we also investigate
whether the MLE is well-behaved and approximately normally distributed in larger samples. We
set the sample size to realistic values given the empirical application in Section 5. To limit the
complexity of the experiment, we consider a spatial lag model without regressors. The data gen-
erating process is
\[ y_t = Z(f_t)\, e_t, \qquad e_t \overset{i.i.d.}{\sim} \text{Student's } t(0, I_n; 5), \tag{15} \]
where $Z(f_t) = (I_n - \tanh(f_t) W)^{-1}$, $t = 1, \ldots, 500$. The spatial weight matrix W is specified
similarly to the one used in our empirical application. It contains row-normalized cross-border
exposures of the financial sectors of nine European countries. We simulate 250 data sets according
to (15) using five processes with different dynamic patterns for the spatial dependence parameter.
These patterns are similar to the ones in Engle (2002), namely
1. Constant: ρt = 0.9;
2. Sine: ρt = 0.5 + 0.4 cos(2πt/200);
3. Fast sine: ρt = 0.5 + 0.4 cos(2πt/20);
4. Step: ρt = 0.9− 0.5 ∗ I(t > T/2);
5. Ramp: ρt = mod (t/200);
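The five patterns listed above can be generated as follows. This is a minimal sketch: the function name `rho_patterns` is hypothetical, and the ramp is read here as the fractional part of t/200, which is an assumption about the intended meaning of mod(t/200) chosen so that the path stays inside the unit interval.

```python
import numpy as np

def rho_patterns(T=500):
    """Return the five simulated paths of rho_t for t = 1, ..., T."""
    t = np.arange(1, T + 1)
    return {
        "constant": np.full(T, 0.9),
        "sine": 0.5 + 0.4 * np.cos(2 * np.pi * t / 200),
        "fast_sine": 0.5 + 0.4 * np.cos(2 * np.pi * t / 20),
        "step": 0.9 - 0.5 * (t > T / 2),
        # Assumption: mod(t/200) is interpreted as (t mod 200) / 200.
        "ramp": (t % 200) / 200.0,
    }

patterns = rho_patterns(500)
```

All five paths remain strictly inside (-1, 1), so the matrix I_n - ρ_t W is invertible for a row-normalized W at every t.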
Figure 1 shows that the filtered spatial dependence parameters are able to capture the patterns
of the simulated processes quite accurately. The model has some difficulty in tracking down-
turns compared to upturns, but this is intuitively plausible: the signal present in strongly cross-
sectionally correlated data yt is much more apparent than that for weakly correlated data.
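To make the simulation design in (15) concrete, the sketch below draws one data set for a given path of ρ_t. It rests on several assumptions: the weight matrix is a randomly generated, row-normalized placeholder (the paper uses cross-border exposure data), the multivariate Student's t errors are drawn as a normal vector scaled by a single chi-square mixing variable per period, and ρ_t is passed directly rather than through tanh(f_t). The names `random_row_normalized_W` and `simulate_spatial_lag` are illustrative.

```python
import numpy as np

def random_row_normalized_W(n, rng):
    """Hypothetical weight matrix: positive off-diagonals, rows sum to one."""
    W = rng.uniform(0.0, 1.0, size=(n, n))
    np.fill_diagonal(W, 0.0)
    return W / W.sum(axis=1, keepdims=True)

def simulate_spatial_lag(rho, W, nu=5.0, seed=0):
    """Simulate y_t = (I_n - rho_t W)^{-1} e_t with multivariate t errors."""
    rng = np.random.default_rng(seed)
    rho = np.asarray(rho, dtype=float)
    n = W.shape[0]
    I_n = np.eye(n)
    y = np.empty((rho.size, n))
    for t, rho_t in enumerate(rho):
        z = rng.standard_normal(n)
        g = rng.chisquare(nu)                # one mixing variable per period
        e_t = z * np.sqrt(nu / g)            # multivariate t(0, I_n; nu) draw
        y[t] = np.linalg.solve(I_n - rho_t * W, e_t)
    return y

rng = np.random.default_rng(42)
W = random_row_normalized_W(9, rng)
rho = 0.5 + 0.4 * np.cos(2 * np.pi * np.arange(1, 501) / 200)   # "sine" pattern
y = simulate_spatial_lag(rho, W)
```

Solving the linear system instead of forming the inverse explicitly is a numerical convenience and does not change the simulated values.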
In our second simulation study, we again use nine cross-sectional units. We assume that the
errors are normally distributed with common variance σ2, and we include one regressor variable
Xt ∼ N(0, I9). The data-generating process is the Gaussian spatial score model laid out in Section
2. In contrast to our previous experiment, the model is now thus correctly specified. We simulate
500 paths yt using the parameters ω = 0.05, A = 0.05, B = 0.8, β = 1.5, and σ2 = 2. We
plot the kernel density estimates of the distribution of the MLE for three different sample sizes,
T = 500, 1000, 2000, in Figure 2.
The figure clearly shows that for smaller sample sizes of around T = 500, the estimators are still not perfectly normal. For larger sample sizes, however, we see a clear convergence to the limiting result. In particular, for empirically relevant sample sizes of around T = 2,000, all distributions look close to a normal centered around the true parameter values. We therefore apply the MLE and its associated standard errors in our empirical application in the next section.
Figure 1: Simulated true spatial dependence process (black line), median filtered parameter (dashed red line) and 2.5% and 97.5% (green lines) quantiles of the filtered parameters. The figures are based on 250 replications. Panels: Constant, Sine, Fast sine, Step, Ramp; horizontal axis t = 0, ..., 500, vertical axis ρt.
5 The empirics of time-varying European CDS spread dependencies
In our empirical study we evaluate the evolution of sovereign credit risk spreads over a period that
includes the Eurozone sovereign debt crisis. In particular, we investigate the time-varying features
of the spatial dependence structure between the sovereign credit spreads, particularly in relation to
a number of the policy responses by regulators. Our spatial structure is directly linked to the bank
sectors’ cross-exposures to other sovereigns and financial sectors within the European Union.
Figure 2: Kernel density estimates of estimated parameters from 500 simulation replications. Panels: density of estimates for ω (true value 0.05), a (true value 0.05), b (true value 0.8), β (true value 1.5), and σ2 (true value 2); each panel shows the densities for T = 500, T = 1000, and T = 2000.
5.1 Data
Credit default spread data
Since EU countries have been affected by the crisis to different degrees, sovereign credit spreads
in Europe are strongly cross-sectionally dependent. Figure 3 shows the credit default spreads from
February 2, 2009, until May 12, 2014 (1375 daily observations) for the eight euro area countries in
our sample: Belgium, France, Germany, Ireland, Italy, the Netherlands, Portugal, and Spain. We
use relative changes (log returns multiplied by 100) of Euro-denominated sovereign CDS spreads
for each of these countries using data obtained from Bloomberg.
The time series reveal clear common patterns, particularly among the non-stressed Eurozone countries (Germany, France, Netherlands, Belgium, and to a lesser extent Spain and Italy). At the same time, there appear to be temporary dissimilarities: for example, the evolution of the Ireland credit spread appears to be roughly in line with that of the other countries before mid 2010 and after mid 2012, but departs from it during the height of the European sovereign debt crisis. The combination of commonalities with possible temporary changes in commonality warrants the use of the time-varying spatial model proposed in this paper.
Figure 3: Credit default swap spreads of eight European sovereigns, Feb 2, 2009 – May 12, 2014. The different countries are split in two groups. Panels: Belgium, France, Netherlands, Germany; and Portugal, Ireland, Spain, Italy. Vertical axis: spread (bp).
Table 1: List of country-specific stock indices included in the time-varying spatial model as regressor variables.
Belgium       BEL 20 Price Index
France        CAC 40 Price Index
Germany       DAX 30 Price Index
Ireland       ISEQ 20 Price Index
Italy         FTSE MIB Price Index
Netherlands   AEX Price Index
Portugal      PSI 20 Price Index
Spain         IBEX 35 Price Index
Other explanatory variables
Our empirical model contains two regressors that capture the state of the European financial market; see also Caporin et al. (2013). The first variable is the change in the volatility index VStoxx. The VStoxx is measured using the implied volatility of the EuroStoxx 50. It captures changes in the risk appetite of financial markets. Our second variable is the difference between the three-month Euribor and the overnight rate EONIA. This measure captures the stress within the financial sector and the perceived counterparty credit risk between banks.
We also incorporate two country-specific regressors, namely log returns of the respective countries’ leading stock indices and absolute changes in 10-year government bond yields. Local stock market returns, see Table 1, are a measure of the well-being of the economy and the ability of governments to pay off debt in the long run. We expect a negative relation with credit spread changes. The change in 10-year yields mainly reflects the long-term borrowing costs of governments, and we expect a positive relation with sovereign credit default swap spreads.
All variables are included in the model with a lag of one period. The data are obtained from Datastream. We have computed the augmented Dickey-Fuller unit root test statistics, and they indicate that all time series are stationary. Table 2 presents the summary statistics.
Table 2: Data summary. Stock index log returns are calculated from closing prices. All stock indices are quoted in domestic currency (Euro).
                      mean        min.  25% quant.      median  75% quant.        max.
CDS spread changes (log changes*100)
Belgium              -0.08      -19.34        -1.9       -0.07        1.78       17.04
France               -0.03      -19.44       -1.84       -0.07        1.56       19.82
Germany              -0.07      -26.71       -1.89           0        1.56       25.43
Ireland              -0.11      -32.69       -1.57       -0.03        1.32       26.81
Italy                -0.03      -43.73       -2.09        -0.1        1.76       20.27
Netherlands          -0.09       -22.2       -1.66       -0.03        1.39       14.92
Portugal              0.02      -47.38        -1.8           0        1.66       20.54
Spain                -0.04      -37.04       -2.02           0        1.99       25.17
Local stock index returns (log returns*100)
Belgium               0.04       -5.49       -0.59        0.03        0.69        8.96
France                0.03       -5.63       -0.68        0.02         0.8        9.22
Germany               0.06       -5.99       -0.57        0.07        0.75         5.9
Ireland               0.06       -6.79       -0.62        0.02        0.83        6.95
Italy                 0.01       -7.04       -0.88        0.04        1.03       10.68
Netherlands           0.04       -5.34       -0.58        0.04        0.71        7.07
Portugal              0.01       -5.51       -0.69        0.02        0.77        10.2
Spain                 0.02       -6.87       -0.82        0.01        0.87       13.48
Local long-term bond yields (changes)
Belgium              -0.16       -30.2        -2.6        -0.1         2.2        34.4
France               -0.14       -26.2       -2.56       -0.12         2.4        24.2
Germany              -0.13       -25.6        -2.8        -0.1         2.3       18.48
Ireland               -0.2     -102.79       -3.64       -0.25         2.8          75
Italy                -0.11         -78        -3.3       -0.09         3.1        50.9
Netherlands          -0.16       -22.4        -2.8       -0.05        2.14       15.61
Portugal             -0.07     -146.98        -5.1       -0.01        5.13       168.6
Spain                -0.11       -88.3        -3.6           0         3.5        37.3
Eurozone-wide variables
VStoxx change        -0.02      -10.94       -0.86       -0.11        0.67       12.79
term spread           0.35       -0.37        0.14        0.34        0.52           1
Spatial weights matrix
The choice of spatial weight matrix is a key ingredient of the spatial model, as it determines the
structure of the ‘economic neighborhood’ between the sovereign CDS spreads and defines the
channel for cross-sectional spillovers. Recently, domestic banks’ cross-border exposures have
been identified as relevant pricing factors for sovereign credit spreads, see for example Kallestrup
et al. (2013), Korte and Steffen (2013), and Beetsma et al. (2012). A possible reason for this
connection is outlined in Korte and Steffen (2013). They argue that until recently, risk management
rules for banks implied a so-called ‘zero risk weight channel’: European banks were not required
to hold capital buffers against EU member states’ debt. This led to regulatory arbitrage incentives
for banks to hold more government debt; see also Acharya and Steffen (2013). At the same time
and due to the banks’ willingness to take on government debt, governments were able to issue large
amounts of debt, thus creating a problematic feedback loop: if sovereign credit risk materialized,
banks could become stressed, and due to possible bail-outs, governments in turn might become
stressed as well.
To account for this type of possible feedback loop, we use a weight matrix that is constructed
from cross-border debt data provided by the Bank for International Settlements (BIS).6 We average the bilateral raw exposure data from 2007 Q4 to 2008 Q2 and denote the resulting matrix by Wraw. As the consolidated data are published on the BIS homepage with a lag of approximately two quarters, this avoids a possible source of endogeneity for W.
Due to large differences in the sizes of the member countries’ financial sectors, the weights
implied by Wraw vary significantly. To mitigate the size of these differences, we form three
discrete categories of mutual lending (‘low’, ‘medium’, and ‘high’). The entries of the resulting
matrix Wcat are constructed as
\[ W_{cat,ij} = \begin{cases} 1, & \text{if } 0 < W_{raw,ij} \le Q_{0.33}(W_{raw}), \\ 2, & \text{if } Q_{0.33}(W_{raw}) < W_{raw,ij} < Q_{0.67}(W_{raw}), \\ 3, & \text{if } Q_{0.67}(W_{raw}) \le W_{raw,ij}, \end{cases} \]
where $Q_p(W_{raw})$ denotes the p-th quantile of the exposure data contained in Wraw. After constructing Wcat, we row-normalize it to obtain proper weights that sum to one. An advantage of the categorical matrix over the raw matrix is that the categories are almost time-invariant, so that using a constant W can be justified. An alternative would be to normalize the exposure data by the GDP of the country in order to relate the size of mutual debt to the size of the economy. We investigate this and other alternatives for constructing the weight matrix in our robustness checks in Section 5.3.
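The construction of the categorical weight matrix can be sketched in a few lines. The code below is an illustration under assumptions: the function name `categorical_weights` is hypothetical, the quantiles are taken over the strictly positive entries of the raw matrix, the diagonal of the raw matrix is assumed to be zero, and the example input is randomly generated rather than BIS data.

```python
import numpy as np

def categorical_weights(W_raw):
    """Map raw exposures to categories 1/2/3 and row-normalize."""
    W_raw = np.asarray(W_raw, dtype=float)
    exposures = W_raw[W_raw > 0]                    # assumption: positive entries only
    q33, q67 = np.quantile(exposures, [0.33, 0.67])
    W_cat = np.zeros_like(W_raw)
    W_cat[(W_raw > 0) & (W_raw <= q33)] = 1.0       # 'low' mutual lending
    W_cat[(W_raw > q33) & (W_raw < q67)] = 2.0      # 'medium'
    W_cat[W_raw >= q67] = 3.0                       # 'high'
    row_sums = W_cat.sum(axis=1, keepdims=True)
    # Rows without any exposure are left at zero instead of dividing by zero.
    return np.divide(W_cat, row_sums, out=np.zeros_like(W_cat), where=row_sums > 0)

# Hypothetical raw exposure matrix for eight countries (not BIS data).
rng = np.random.default_rng(0)
W_raw = rng.lognormal(mean=8.0, sigma=2.0, size=(8, 8))
np.fill_diagonal(W_raw, 0.0)
W = categorical_weights(W_raw)
```

The categorization compresses the large size differences between financial sectors into three levels, so that after row normalization no single large exposure dominates a row.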
5.2 Results
6 The data can be found at http://www.bis.org/statistics/consstats.htm, Table 9B: International bank claims, consolidated - immediate borrower basis. Last accessed on March 20, 2014.
Table 3: Estimated parameters and their robust (sandwich) standard errors in parentheses, for the static spatial lag model and the time-varying spatial model, based on normally (N) and Student’s t (tλ) distributed errors. The maximized loglikelihood value (logL) and the Akaike information criterion, corrected for finite numbers of observations, (AICc) are also reported. Estimation period is February 2, 2009 – May 12, 2014.
                         Static model          Time-varying model
                           N          tλ           N          tλ
ρ                     0.7249      0.7146
                    (0.0071)    (0.0062)
ω                                             0.0156      0.0181
                                            (0.0074)    (0.0192)
A                                             0.0144      0.0168
                                             (0.003)    (0.0085)
B                                             0.9817      0.9794
                                             (0.009)    (0.0219)
ln(σ2)                1.8131      0.8392      1.8043      0.8426
                    (0.0509)    (0.0444)    (0.0504)    (0.0446)
Vstoxx               -0.0901     -0.0261     -0.0756     -0.0261
                    (0.0473)    (0.0164)    (0.0326)    (0.0158)
term sp               0.0239       0.032      0.1084      0.0818
                    (0.1065)     (0.066)    (0.0998)    (0.0656)
local stocks         -0.2031     -0.1156     -0.1769     -0.1122
                    (0.0426)    (0.0193)     (0.035)    (0.0187)
local 10Y yields      0.0256      0.0184      0.0258      0.0186
                    (0.0041)    (0.0027)    (0.0039)    (0.0027)
const                -0.0137     -0.0341      -0.066     -0.0578
                    (0.0386)    (0.0216)    (0.0393)    (0.0244)
λ                                 2.5202                  2.5649
                                (0.1246)                (0.1288)
logLik              -26396.6    -24574.5    -26244.4    -24506.1
AICc                 52807.3     49165.1     52507.0     49032.4
Table 3 contains the estimation results from both the static and time-varying spatial model for normally and t-distributed error terms. For simplicity, the specifications contain a common, time-invariant variance. This assumption is relaxed in the robustness section, Section 5.3. For the static model, we find strong evidence for spatial dependence in the European sovereign CDS spreads, indicated by the high estimates for ρ together with a small standard error. Given that CDS spread changes are known to exhibit fat tails, it is not surprising to find that the model fit improves substantially for the Student’s t vis-a-vis the normal distribution. The likelihood value increases by more than 1800 points after adding only one parameter to the model. This finding is confirmed by the AICc.
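The AICc values reported for the two static models can be approximately reproduced from the log-likelihood values with the standard small-sample correction. The sketch below relies on assumptions that are not stated explicitly in the text: the number of observations entering the correction is taken to be the 1375 daily observations, and the parameter counts (7 for the static normal model, 8 for the static Student's t model) are obtained by counting the rows of Table 3. The helper name `aicc` is hypothetical.

```python
def aicc(loglik, k, n_obs):
    """Akaike information criterion with finite-sample correction."""
    aic = 2 * k - 2 * loglik
    return aic + 2 * k * (k + 1) / (n_obs - k - 1)

T = 1375  # assumption: number of daily observations used in the correction

# Static normal model: rho, ln(sigma^2), four regressors, constant -> k = 7.
aicc_static_normal = aicc(-26396.6, k=7, n_obs=T)
# Static Student's t model: one extra degrees-of-freedom parameter -> k = 8.
aicc_static_t = aicc(-24574.5, k=8, n_obs=T)
```

Under these assumptions both values agree with the reported 52807.3 and 49165.1 up to rounding of the log-likelihoods, and the difference of roughly 3640 points confirms the strong preference for the fat-tailed specification.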
If we consider the dynamic spatial model based on the normal distribution, we see an increase
of about 150 likelihood points compared to the static Gaussian model at the cost of adding 2
parameters. The dynamics of the spatial dependence parameter are highly persistent with a value of
B close to unity. The unconditional mean of ft equals ω/(1−B) ≈ 0.8524 with tanh(0.8524) ≈
0.6924. Accounting for the fact that the expected value of tanh(ft) is slightly larger than this due
to Jensen’s inequality, we see that the unconditional level for the Gaussian spatial score model
is close to the static estimate of 0.7249. Also the increase from the static Student’s t to its time-
varying counterpart results in a likelihood increase, this time of about 68 points. The unconditional
level of tanh(ft) again lies close to its static counterpart.
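The unconditional levels quoted above follow directly from the estimates in Table 3. The short sketch below repeats the arithmetic for both time-varying models; the helper name `unconditional_level` is hypothetical, and the calculation ignores the Jensen's inequality adjustment mentioned in the text.

```python
import math

def unconditional_level(omega, B):
    """Return the unconditional mean of f_t and the implied tanh level."""
    f_bar = omega / (1.0 - B)
    return f_bar, math.tanh(f_bar)

# Estimates from the time-varying models in Table 3.
f_bar_normal, rho_normal = unconditional_level(omega=0.0156, B=0.9817)
f_bar_t, rho_t = unconditional_level(omega=0.0181, B=0.9794)
```

For the Gaussian model this gives a mean of about 0.8524 and a level of about 0.6924, as stated in the text. For the Student's t model the level is roughly 0.71, close to the static estimate of 0.7146.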
On the basis of the reported AICc values, the data clearly favor time variation in the spatial
dependence parameter ρt using the Student’s t distribution for the estimation as well as for the
transition dynamics of ρt. The estimated degrees of freedom parameter λ for the Student’s t mod-
els is around 2.5. Hence there is a substantial degree of fat-tailedness. A part of the unconditional
fat-tailedness may also be due to the presence of volatility clustering. We discuss these robustness
issues in more detail in Section 5.3.
The coefficients for the included regressors have the same signs throughout the four specifi-
cations. Although the regression estimates vary somewhat, particularly between the normal and
Student’s t based models, the overall picture remains the same. A higher implied volatility on the
European stock market (VStoxx) correlates with lower CDS spreads. This is consistent with the
phenomenon of ‘flight to quality’ when the price of risk increases in financial markets. A higher
term spread on the interbank credit market implies a higher tendency to borrow overnight. This
is correlated with higher CDS spread changes and may be a sign of a perceived bank-sovereign
feedback loop: problems in the functioning of the interbank lending market may induce a fear of
possible future bailouts and subsequent sovereign debt problems. Stock market upturns have a
dampening effect on sovereign credit spreads, while increases in long-term bond yields point to
higher borrowing costs for governments and have a positive relation with sovereign CDS spreads.
Figure 4 presents the evolution of the filtered spatial dependence parameter. We observe that
the path of the spatial coefficient corresponding to the Student’s t spatial score model is more ro-
bust to outliers than its normal counterpart. This phenomenon is a common finding in the volatility
literature; see for example Creal et al. (2013) and Harvey (2013). Comparing the score expres-
sions in equations (9) and (10), it is clear that the time-varying spatial model shares this feature.
While the normal score is unbounded in the dependent variable and the regressors, the Student’s
t score contains a compensating effect in the denominator that leads to a down-weighting of
large positive or negative observations. This leads to a different pattern between the two filtered
spatial dependence series for the two distributions, particularly during mid 2010, the first half of
2012, and late 2013.
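The down-weighting effect can be illustrated numerically. The exact score expressions are equations (9) and (10), which are not reproduced here; the sketch below instead assumes a multivariate Student's t density with scale matrix σ2 In, for which the derivative of the log-density with respect to the residual vector equals the Gaussian derivative multiplied by the weight (λ + n)/(λ + e'e/σ2). This weight, the function name `student_t_weight`, and the chosen numbers are assumptions made for illustration only.

```python
import numpy as np

def student_t_weight(e, sigma2, lam):
    """Weight multiplying the Gaussian residual term under Student's t errors."""
    e = np.asarray(e, dtype=float)
    n = e.size
    return (lam + n) / (lam + e @ e / sigma2)

n = 8
direction = np.ones(n) / np.sqrt(n)          # unit-length residual direction
scales = [0.5, 2.0, 10.0, 50.0]              # residual norms, from small to extreme

weights = [student_t_weight(s * direction, sigma2=1.0, lam=2.5) for s in scales]
# Norm of the weighted residual: equals the scale itself in the Gaussian case.
weighted_norms = [w * s for w, s in zip(weights, scales)]
```

The weight falls monotonically as the residual grows, so the weighted residual first rises and then shrinks towards zero for extreme observations, whereas its Gaussian counterpart grows without bound. For very large λ the weight approaches one and the Gaussian case is recovered.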
Throughout the sample period, systemic risk is high, fluctuating around a value of 0.75 until
the end of 2012. At that time, the level starts to decline towards a lower level of about 0.5 to 0.6.
Figure 4: Filtered spatial dependence parameters obtained by imposing normally (dashed line, normal-GAS model) and Student’s t (solid line, t-GAS model) distributed errors. Horizontal axis: 2009 to 2014; vertical axis: ρt.
The pattern can be related to a number of important policy events during the European sovereign debt crisis.$^7$ Some events have a highly visible impact. For example, the first Long Term Refinancing Operation (LTRO) at the end of 2011 caused a sudden and sharp drop in the spatial dependence parameter. The effect, however, was short-lived and the value of ρt bounced back soon after to similar levels as before. The second LTRO hardly has any visible effect on the spatial dependence parameter. It is not until Mario Draghi’s speech at the Global Investment Conference in London in July$^8$ 2012 and the subsequent announcements and implementation of the Outright Monetary Transactions (OMT) and the European Stability Mechanism (ESM) in the months thereafter that the fear of perceived spillover effects appears to be mitigated on a more permanent basis, which can be seen by the value of ρt coming down to lower levels.
5.3 Extensions
In this section, we extend the time-varying spatial model in different directions to investigate the robustness of our results. First, we allow for sovereign-specific volatility clustering. Second, we add an unobserved mean factor to try to distinguish common effects from spatial effects. Third, we re-estimate the models using different choices of spatial weight matrices.
7 A list of events can be found in Figure B.1 in the appendix. See also Table B.1 with a list of sources.
8 Quote: “Within our mandate, the ECB is ready to do whatever it takes to preserve the euro. And believe me, it will be enough.” Source: see Table B.1.
Unobserved time-varying volatility factors
Given the patterns in the data, it is clearly unrealistic to assume a common, time-invariant variance
for all sovereign CDS changes. We therefore extend the baseline model by adding a time-varying
diagonal covariance matrix Σt for the errors in the spatial model,
\[ y_t = h(f_t) W y_t + X_t\beta + e_t, \qquad e_t \sim p_e(0, \Sigma_t), \tag{16} \]
with
\[ \Sigma_t := \Sigma(f^\sigma_t) = \mathrm{diag}\big(\sigma_1^2(f^\sigma_{1,t}), \ldots, \sigma_n^2(f^\sigma_{n,t})\big) = \mathrm{diag}\big(\exp(f^\sigma_{1,t}), \ldots, \exp(f^\sigma_{n,t})\big), \tag{17} \]
where $f^\sigma_t = (f^\sigma_{1,t}, \ldots, f^\sigma_{n,t})'$ is a vector of sovereign-specific variance factors. As before, we endow the factor $f^\sigma_t$ with score updating. To enforce parsimony, we allow for sovereign-specific intercepts in the score updating equations for $f^\sigma_t$, but impose common score and persistence parameters $A^\sigma$ and $B^\sigma$; see Appendix A. Although the covariance matrix of the error terms $\Sigma_t$ is diagonal, the reduced form covariance of $y_t$ is still a full matrix, $\mathrm{Cov}(y_t) = Z(f_t)\Sigma_t Z(f_t)'$.
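The point that a diagonal error covariance still implies a full reduced-form covariance can be checked with a small numerical example. The sketch below uses a hypothetical three-country weight matrix and made-up variance factors; the function name `reduced_form_cov` is illustrative, and the spatial dependence parameter is passed directly instead of through the link function.

```python
import numpy as np

def reduced_form_cov(rho_t, W, f_sigma_t):
    """Compute Z Sigma_t Z' with Z = (I - rho_t W)^{-1}, Sigma_t = diag(exp(f))."""
    n = W.shape[0]
    Z = np.linalg.inv(np.eye(n) - rho_t * W)
    Sigma_t = np.diag(np.exp(f_sigma_t))
    return Z @ Sigma_t @ Z.T

# Hypothetical row-normalized weights and log-variance factors.
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
f_sigma_t = np.array([0.0, 0.5, 1.0])

cov_spillover = reduced_form_cov(0.7, W, f_sigma_t)   # with spatial dependence
cov_no_spill = reduced_form_cov(0.0, W, f_sigma_t)    # reduces to Sigma_t
```

With a spatial dependence parameter of zero the result is the diagonal matrix of variances, while a positive value fills in all off-diagonal elements: shocks to one sovereign are transmitted to the others through the weight matrix.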
Unobserved time-varying mean factor
To distinguish commonalities from spatial spill-overs, we also extend the model with an additional unobserved time-varying mean factor. This factor is independent of the spatial lag structure,
\[ y_t = h(f_t) W y_t + X_t\beta + Z(f_t)^{-1}\lambda f^\lambda_t + e_t, \qquad e_t \sim t_{\lambda_0}(0, \Sigma_t), \tag{18} \]
where $\lambda_0$ is the degrees of freedom parameter of the Student’s t distribution, $\lambda = (\lambda_1, \ldots, \lambda_n)'$ is an (n×1)-vector of factor loadings, and $f^\lambda_t \in \mathbb{R}$ is an additional time-varying parameter endowed with score updating. Explicit formulas for the dynamics are given in Appendix A. Rewriting equation (18) in reduced form, we obtain
\[ y_t = \lambda f^\lambda_t + Z(f_t) X_t\beta + Z(f_t) e_t, \tag{19} \]
which allows for a direct comparison with a benchmark model without spatial lag structure,
\[ y_t = X_t\beta + \lambda f^\lambda_t + e_t. \tag{20} \]
Table 4: Comparison of goodness of fit of all considered empirical specifications. The largest maximized loglikelihood value (logL) and the smallest Akaike Information Criterion (AICc) amongst the considered models are marked with an asterisk.
             Static spatial                      Time-varying spatial
et ∼         N(0, σ2In)      tλ(0, σ2In)        N(0, σ2In)      tλ(0, σ2In)
logL         -26396.63       -24574.48          -26244.45       -24506.11
AICc          52807.35        49165.06           52507.03        49032.39

             Time-varying spatial-t    Time-varying spatial-t    Benchmark-t
             (+tv. volas)              (+mean f.+tv.volas)       (+mean f.+tv.volas)
logL         -24175.70                 -24156.96*                -26936.15
AICc          48389.97                  48375.30*                 53927.42
Table 4 compares the goodness of fit of the seven empirical model specifications we consider in our analysis. Each extension improves the performance of the model. However, the model without any spatial structure performs worst, despite featuring an unobserved time-varying mean and time-varying volatilities. We therefore conclude that explicitly accounting for dynamic contemporaneous spillovers of shocks, as is done by the time-varying spatial model, is an important feature when analyzing sovereign credit spread data.
The parameter estimates from the full model with spatial score updating, time-varying vari-
ances, and unobserved time-varying mean factor are given in Table 5. In contrast to the spatial
factor, the variance factors and particularly the mean factor are less persistent, which is seen by the
values of Bσ and Bλ, respectively. This is off-set by a larger impact of the scores in the transition
equations; see the values of Aσ and Aλ.
None of the parameters λi, i = 1, . . . , n, corresponding to the mean factor are individually
significantly different from zero. Jointly, however, they improve the model fit, as is indicated by
the AICc in Table 4. Furthermore, the loading estimates have an economic interpretation: the
non-stressed Eurozone countries have a negative coefficient λi, while the most stressed countries
during part of the European sovereign debt crisis (Portugal, Ireland, Spain) have positive loadings.
With regard to dynamic spatial dependence, the qualitative implications of the full model and the basic time-varying spatial t-model are very similar. This is shown in Figure 5. Omitting the additional variance and mean dynamics leads to a slight upward adjustment in the filtered spatial dependence parameter, but the overall pattern does not change.
Table 5: Estimated parameters and their numerically approximated (sandwich-)standard errors in parentheses, for the full model featuring spatial score updating, time-varying sovereign-specific variances, an unobserved mean factor, and t-distributed error terms. The maximized loglikelihood value (logL) and the Akaike information criterion (AICc) are also reported. Estimation period is February 2, 2009 - May 12, 2014.
Mean factor
ωλ                  -0.0012   (0.0252)
Aλ                   0.3494   (0.8937)
Bλ                   0.6891   (0.1065)
λ1 Belgium          -0.2776   (0.2308)
λ2 France           -0.2846   (0.3137)
λ3 Germany          -0.2029   (0.2811)
λ4 Ireland            0.405   (0.6928)
λ5 Italy            -0.1604   (0.2429)
λ6 Netherlands      -0.1891   (0.2519)
λ7 Portugal          0.4614   (0.8334)
λ8 Spain             0.0988   (0.3635)
Variance factors
ωσ1 Belgium          0.0426   (0.0125)
ωσ2 France           0.0448   (0.0142)
ωσ3 Germany          0.0573   (0.0155)
ωσ4 Ireland          0.0301   (0.01)
ωσ5 Italy            0.0471   (0.0136)
ωσ6 Netherlands      0.0443   (0.0132)
ωσ7 Portugal         0.0524   (0.0153)
ωσ8 Spain            0.0591   (0.016)
Aσ                   0.1826   (0.023)
Bσ                   0.9479   (0.0135)
Spatial factor, regressors, and degrees of freedom
ω                    0.0307   (0.0229)
A                     0.019   (0.007)
B                    0.9636   (0.0271)
const.              -0.0621   (0.024)
VStoxx              -0.0257   (0.0157)
term sp.             0.0693   (0.0705)
stocks               -0.102   (0.0183)
yields               0.0173   (0.0026)
λ0                   3.1357   (0.1977)
logLik            -24156.96
AICc                48375.3
Results from standard residual diagnostic tests are given in Table 6. The full model is able to
substantially reduce auto-correlations and ARCH effects for most individual series.9 Furthermore,
cross-correlations are, on average, much lower for the model residuals than for the raw data. To
give a full picture, we also provide the full correlation matrices in Table B.2.
As a further robustness check, we repeat the estimation, employing absolute instead of relative CDS spread changes as the dependent variable. Figure B.2 shows the evolution of the corresponding filtered parameter from the full model. Apart from an overall lower level of spatial dependence, and a more clearly visible impact of the financial crisis at the beginning of the sample, the pattern is similar to the picture obtained by using log changes.
9 Italy is the only country for which ARCH effects are not reduced by our model.
Table 6: Diagnostic tests for the residuals of the full model featuring spatial updating factor, volatilities, and additional mean factor, all driven by the score function, compared to the raw CDS spread changes. LB refers to the Ljung-Box test for residual serial correlation, ARCH LM refers to the test for remaining auto-correlation in the squared residuals. The right panel contains averages of pairwise cross-correlation.
sovereign       LB test stat.          ARCH LM test stat.       average cross-corr.
                raw       residuals    raw        residuals     raw      residuals
Belgium         108.64    15.93        169.91     25.53         0.70      0.07
France           49.48    30.42        160.44     43.32*        0.66     -0.01
Germany          62.61    19.49        142.70     53.78*        0.63     -0.07
Ireland         129.89    17.53        302.23     87.11*        0.64     -0.07
Italy            99.02    42.43*       102.13     150.88*       0.71      0.08
Netherlands      55.69    33.29*       124.41     20.96         0.64     -0.05
Portugal        167.91    32.56*       189.35     56.89*        0.65      0.03
Spain           105.81    48.88*       253.68     154.42*       0.69      0.06
* Remaining effects at 5% level
Figure 5: Filtered spatial dependence parameters obtained from the basic time-varying spatial model with t-distributed errors (green, basic t-GAS model) as well as with sovereign-specific, dynamic variances and an unobserved mean factor (red, full t-GAS model). Horizontal axis: 2009 to 2014; vertical axis: ρt.
Choice of the spatial weight matrix
So far, all results reported have been obtained using the categorical spatial weight matrix Wcat de-
scribed in Section 5.1. In order to robustify our findings, we re-estimate the model using different
Table 7: Comparison of likelihood values for the time-varying spatial model with t-errors, using
different spatial weights matrices.

        Wraw        Wdyn        Wcat        Wgeo
logL    -24745.56   -24679.44   -24506.11   -25556.85
choices of W . Candidate choices include the matrix containing the averaged raw exposure data
(Wraw), a model in which the matrix of exposure data is updated quarterly (Wdyn), and a binary
matrix indicating the geographical neighborhood of the countries in our sample (Wgeo).10 As the
different models all have the same number of parameters, we can simply compare the likelihood
values at the optimum.
Table 7 shows that the goodness of fit differs considerably across specifications. The model with
a categorical weights matrix provides the best fit. However, the parameter estimates are very robust
to the specification of W, and none of the qualitative implications of our model changes.
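The comparison can be reproduced from the values in Table 7 with a few lines of Python. The sketch below is only illustrative: the helper names and the parameter count passed to `aic` are hypothetical placeholders, not quantities taken from the paper. It shows that, when all candidate models share the same number of parameters, ranking by log-likelihood and ranking by an information criterion such as AIC coincide.

```python
# Log-likelihood values copied from Table 7.
LOGLIK = {
    "W_raw": -24745.56,
    "W_dyn": -24679.44,
    "W_cat": -24506.11,
    "W_geo": -25556.85,
}


def rank_by_loglik(values):
    """Return the model labels ordered from best (highest logL) to worst."""
    return sorted(values, key=values.get, reverse=True)


def aic(loglik_value, n_params):
    """Akaike information criterion; n_params is a placeholder here."""
    return 2 * n_params - 2 * loglik_value


if __name__ == "__main__":
    k = 10  # hypothetical common number of parameters, for illustration only
    for name in rank_by_loglik(LOGLIK):
        print(f"{name}: logL = {LOGLIK[name]:.2f}, AIC = {aic(LOGLIK[name], k):.2f}")
```

Because the penalty term `2 * n_params` is identical for all four models, it shifts every criterion value by the same constant and leaves the ordering unchanged.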
It is particularly interesting to see that the weight matrices based on economic distances as
measured through financial cross-exposures (Wraw, Wdyn, and Wcat) provide a much better fit
than a matrix based on geographic distances (Wgeo). Some categorization is needed as well in
order to make the sizes of cross-exposures comparable. However, as mentioned before, scaling
the exposures by the size of the economy (as measured by GDP) did not provide an improvement
in terms of model fit.
6 Conclusion
In this paper, we propose a new model for time-varying spatial dependence in panel data sets.
The model extends the widely used spatial lag model to a time-varying parameter framework by
endowing the spatial dependence parameter with generalized autoregressive score dynamics and
fat tails. Allowing for time-variation is particularly useful if we apply spatial models over longer
time periods, where we can no longer be sure that the spatial dependence parameter is constant.
The fat-tailed feature of our model is useful in a setting where we apply the model to financial
data, which typically exhibit fatter tails than the normal.
We established the theoretical properties of our new model: the dynamics of the model are
optimal in the sense that they locally reduce the Kullback-Leibler distance of the statistical model
to the true unknown conditional density with every score update of the spatial dependence parameter.
Moreover, the maximum likelihood estimator for the model is consistent and asymptotically
normal under mild regularity conditions. Also, we showed under what conditions the model is
invertible, such that the filtered estimate of the time-varying spatial dependence parameter converges
in the limit to a unique stationary and ergodic sequence.
10 We also experimented with a weights matrix in which we weighted the raw exposures of the financial markets by
the countries' respective GDP. However, the fit did not improve.
In our empirical study based on our time-varying spatial model, we showed that European
sovereign CDS spreads exhibit a strong, time-varying degree of spatial dependence. Cross-border
debt linkages appear as a suitable transmission channel for the spatial spillovers. In our final
model, we incorporated a time-varying common mean factor as well as time-varying volatilities
into the specification. Using the filtered time-varying parameters of this final model, we found
evidence for a break in spatial dependence towards the end of 2012. This illustrates that policies
by regulators have at least been partly effective in breaking the high spill-over effects prevalent
during the height of the European sovereign debt crisis.
References
Acharya, V., Drechsler, I., and Schnabl, P. (2013). A pyrrhic victory? bank bailouts and sovereign
credit risk. Journal of Finance, forthcoming.
Acharya, V. V. and Steffen, S. (2013). The "greatest" carry trade ever? Understanding eurozone
bank risks. Working Paper 19039, National Bureau of Economic Research.
Ang, A. and Longstaff, F. A. (2013). Systemic sovereign risk: Lessons from the u.s. and europe.
Journal of Monetary Economics, 60:493–510.
Anselin, L. (1988). Spatial Econometrics: Methods and Models. Springer.
Arezki, R., Candelon, B., and Sy, A. N. R. (2011). Sovereign rating news and financial markets
spillovers: Evidence from the european debt crisis. IMF Working Paper WP/11/68.
Arnold, M., Stahlberg, S., and Wied, D. (2013). Modeling different kinds of spatial dependence
in stock returns. Empirical Economics, 44:761–774.
Asgharian, H., Hess, W., and Liu, L. (2013). A spatial analysis of international stock market
linkages. Journal of Banking and Finance, 37:4738–4754.
Beetsma, R., Giuliodori, M., de Jong, F., and Widijanto, D. (2012). Spread the news: The impact
of news on the european sovereign bond markets during the crisis. Journal of International
Money and Finance, 34:83–101.
Billingsley, P. (1961). The lindeberg-levy theorem for martingales. Proceedings of the American
Mathematical Society, 12(5):788–792.
Billingsley, P. (1995). Probability and Measure. Wiley-Interscience.
Blasques, F., Koopman, S. J., and Lucas, A. (2014a). Information theoretic optimality of observa-
tion driven time series models. Tinbergen Institute Discussion Papers 14-046/III.
Blasques, F., Koopman, S. J., and Lucas, A. (2014b). Maximum likelihood estimation for gener-
alized autoregressive score models. Tinbergen Institute Discussion Papers 14-029/III.
Bougerol, P. (1993a). Kalman Filtering with Random Coefficients and Contractions.
Prepublications de l’Institut Elie Cartan. Univ. de Nancy.
Bougerol, P. (1993b). Kalman filtering with random coefficients and contractions. SIAM J. Control
Optim., 31(4):942–959.
Caporin, M., Pelizzon, L., Ravazzolo, F., and Rigobon, R. (2013). Measuring sovereign contagion
in Europe. NBER Working Paper No. 18741.
Creal, D., Koopman, S. J., and Lucas, A. (2011). A dynamic multivariate heavy-tailed model
for time-varying volatilities and correlations. Journal of Business and Economic Statistics,
29(4):552–563.
Creal, D., Koopman, S. J., and Lucas, A. (2013). Generalized autoregressive score models with
applications. Journal of Applied Econometrics, 28:777–795.
Creal, D., Schwaab, B., Koopman, S. J., and Lucas, A. (2014). Observation driven mixed-
measurement dynamic factor models. Review of Economics and Statistics, forthcoming.
De Santis, R. A. (2012). The euro area sovereign debt crisis - safe haven, credit rating agencies
and the spread of the fever from greece, ireland and portugal. ECB Working Paper No. 1419.
Denbee, E., Julliard, C., Li, Y., and Yuan, K. (2013). Network risk and key players: A structural
analysis of interbank liquidity. Working Paper.
Dieckmann, S. and Plank, T. (2012). Default risk of advanced economies: An empirical analysis
of credit default swaps during the financial crisis. Review of Finance, 16:903–934.
Dudley, R. M. (2002). Real Analysis and Probability. Cambridge Studies in Advanced Mathe-
matics. Cambridge University Press.
Engle, R. (2002). Dynamic conditional correlation. Journal of Business & Economic Statistics,
20:339–350.
Fernandez, V. (2011). Spatial linkages in international financial markets. Quantitative Finance,
11:237–245.
Folland, G. B. (2009). A Guide to Advanced Real Analysis. Dolciani Mathematical Expositions.
Cambridge University Press.
Gallant, R. and White, H. (1988). A Unified Theory of Estimation and Inference for Nonlinear
Dynamic Models. Cambridge University Press.
Gorea, D. and Radev, D. (2013). The euro area sovereign debt crisis: Can contagion spread from
the periphery to the core? Gutenberg School of Management and Economics Discussion Paper
No. 1208.
Harvey, A. C. (2013). Dynamic Models for Volatility and Heavy Tails. Econometric Society
Monographs, Cambridge University Press.
Harvey, A. C. and Luati, A. (2014). Filtering with heavy tails. Journal of the American Statistical
Association, forthcoming.
Hillier, G. and Martellosio, F. (2013). Properties of the maximum likelihood estimator in spatial
autoregressive models. cemmap working paper CWP44/13.
Kalbaska, A. and Gatkowski, M. (2012). Eurozone sovereign contagion: Evidence from the CDS
market (2005–2010). Journal of Economic Behavior & Organization, 83:657–673.
Kallestrup, R., Lando, D., and Murgoci, A. (2013). Financial sector linkages and the
dynamics of bank and sovereign credit spreads. Working paper, available at SSRN:
http://ssrn.com/abstract=2023635.
Keiler, S. and Eder, A. (2013). Cds spreads and systemic risk - a spatial econometric approach.
Deutsche Bundesbank Discussion Paper, No. 1/2013.
Kelejian, H. H. and Prucha, I. R. (2010). Specification and estimation of spatial autoregressive
models with autoregressive and heteroskedastic disturbances. Journal of Econometrics, 157:53–
67.
Korte, J. and Steffen, S. (2013). Zero risk contagion - banks’ sovereign exposure and sovereign
risk spillovers. Mimeo.
Krengel, U. (1985). Ergodic theorems. De Gruyter studies in Mathematics, Berlin.
Lee, L. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial au-
toregressive models. Econometrica, 72:1899–1925.
Lee, L. and Yu, J. (2010). Some recent developments in spatial panel data models. Regional
Science and Urban Economics, 40:255–271.
LeSage, J. P. and Pace, R. K. (2008). Introduction to Spatial Econometrics. CRC Press.
Leschinski, C. and Bertram, P. (2013). Pure contagion dynamics in emu government bond spreads.
Mimeo.
Lucas, A., Schwaab, B., and Zhang, X. (2014). Conditional euro area sovereign default risk.
Journal of Business and Economic Statistics, 32(2):271–284.
Oh, D. H. and Patton, A. (2013). Time-varying systemic risk: Evidence from a dynamic copula
model of cds spreads. Duke University Discussion Paper.
Ord, K. (1975). Estimation methods for models of spatial interaction. Journal of the American
Statistical Association, 70:120–126.
Pesaran, M. H. and Pick, A. (2007). Econometric issues in the analysis of contagion. Journal of
Economic Dynamics & Control, 31:1245–1277.
Ranga Rao, R. (1962). Relations between weak and uniform convergence of measures with appli-
cations. Annals of Mathematical Statistics, 33:659–680.
Saldias, M. (2013). A market-based approach to sector risk determinants and transmission in the
euro area. ECB Working Paper Series, No. 1574.
Straumann, D. and Mikosch, T. (2006). Quasi-maximum-likelihood estimation in conditionally
heteroscedastic time series: A stochastic recurrence equations approach. The Annals of Statistics,
34(5):2449–2495.
van der Vaart, A. W. (2000). Asymptotic Statistics (Cambridge Series in Statistical and Proba-
bilistic Mathematics). Cambridge University Press.
White, H. (1994). Estimation, Inference and Specification Analysis. Cambridge Books. Cambridge
University Press.
Wied, D. (2013). Cusum-type testing for changing parameters in a spatial autoregressive model
for stock returns. Journal of Time Series Analysis, 34:221–229.
Wintenberger, O. (2013). Continuous invertibility and stable QML estimation of the EGARCH(1,
1) model. Scandinavian Journal of Statistics, 40(4):846–867.
Yu, J., de Jong, R., and Lee, L.-f. (2008). Quasi-maximum likelihood estimators for spatial
dynamic panel data with fixed effects when both n and t are large. Journal of Econometrics,
146:118–134.
Appendix A Model extensions
We restrict the model extensions to the case of Student’s t distributed errors. We obtain the equa-
tions for the Gaussian case as a special case by letting λ0 →∞.
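We assume that the vector of variance factors $f^{\sigma}_t$ in (17) follows an n-dimensional score process
as given by
\[
f^{\sigma}_{t+1} = \omega^{\sigma} + A^{\sigma}\nabla^{\sigma}_t + B^{\sigma} f^{\sigma}_t
\]
with $\omega^{\sigma} = (\omega^{\sigma}_1,\ldots,\omega^{\sigma}_n)'$, and $A^{\sigma}, B^{\sigma} \in \mathbb{R}$. We thus allow for sovereign-specific intercepts in the
variance score update, but restrict the dynamic parameters $A^{\sigma}$ and $B^{\sigma}$ to be common across all
countries. This results in a parsimonious, yet flexible model. The score of the spatial dependence
factor $f_t$ is given in (10), with $\Sigma$ replaced by $\Sigma_t$. For the variance factors, the score vector is
\[
\nabla^{\sigma}_t = \frac{\partial \ell_t}{\partial f^{\sigma}_t}
= \frac{1}{2}
\begin{pmatrix}
\dfrac{(1+\lambda^{-1}n)\,\exp(-f^{\sigma}_{1,t})\,\bigl(y_{1,t}-h(f_t)\sum_{j=1}^{n} w_{1j}y_{j,t}-x_{1,t}'\beta\bigr)^{2}}
      {1+\lambda^{-1}\,(y_t-h(f_t)Wy_t-X_t\beta)'\,\Sigma(f^{\sigma}_t)^{-1}\,(y_t-h(f_t)Wy_t-X_t\beta)} - 1 \\
\vdots \\
\dfrac{(1+\lambda^{-1}n)\,\exp(-f^{\sigma}_{n,t})\,\bigl(y_{n,t}-h(f_t)\sum_{j=1}^{n} w_{nj}y_{j,t}-x_{n,t}'\beta\bigr)^{2}}
      {1+\lambda^{-1}\,(y_t-h(f_t)Wy_t-X_t\beta)'\,\Sigma(f^{\sigma}_t)^{-1}\,(y_t-h(f_t)Wy_t-X_t\beta)} - 1
\end{pmatrix},
\]
with $X_t' = (x_{1,t},\ldots,x_{n,t})$, and $x_{i,t} \in \mathbb{R}^{k\times 1}$.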
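The score expressions can be checked numerically. The following Python sketch is a minimal illustration under stated assumptions, not the authors' implementation: it assumes a diagonal $\Sigma_t$ with entries $\exp(f^{\sigma}_{i,t})$, uses `tanh` as a stand-in for the link function $h$, and writes the Student's t log-likelihood only up to constants that do not depend on the factors. The function names are hypothetical. The score with respect to $f_t$ is obtained here by differentiating this log-likelihood kernel directly, so it should be read as our derivation rather than a transcription of equation (10).

```python
import numpy as np


def h(f):
    """Assumed link function mapping f_t into (-1, 1)."""
    return np.tanh(f)


def h_prime(f):
    """Derivative of the assumed link function."""
    return 1.0 - np.tanh(f) ** 2


def loglik_kernel(f, f_sigma, y, W, Xb, lam):
    """Student's t log-likelihood at time t, up to factor-free constants."""
    n = len(y)
    e = y - h(f) * (W @ y) - Xb                    # y_t - h(f_t) W y_t - X_t beta
    q = np.sum(e ** 2 * np.exp(-f_sigma))          # e' Sigma_t^{-1} e
    _, logdet = np.linalg.slogdet(np.eye(n) - h(f) * W)   # equals -log|Z_t|
    return logdet - 0.5 * np.sum(f_sigma) - 0.5 * (lam + n) * np.log1p(q / lam)


def scores(f, f_sigma, y, W, Xb, lam):
    """Analytical scores w.r.t. f_t (scalar) and f_sigma_t (vector)."""
    n = len(y)
    e = y - h(f) * (W @ y) - Xb
    sig_inv = np.exp(-f_sigma)                     # diagonal of Sigma_t^{-1}
    q = np.sum(e ** 2 * sig_inv)
    w = (1.0 + n / lam) / (1.0 + q / lam)          # weight w_t
    Z = np.linalg.inv(np.eye(n) - h(f) * W)        # Z_t = (I - h(f_t) W)^{-1}
    score_f = (w * ((W @ y) @ (sig_inv * e)) - np.trace(Z @ W)) * h_prime(f)
    score_sigma = 0.5 * (w * sig_inv * e ** 2 - 1.0)
    return score_f, score_sigma
```

Comparing `scores` with central finite differences of `loglik_kernel` on random inputs is a quick way to confirm that the weight $(1+\lambda^{-1}n)/(1+\lambda^{-1}e'\Sigma_t^{-1}e)$ and the factor $\tfrac{1}{2}$ in the variance score are placed correctly.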
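In the presence of an additional mean factor $f^{\lambda}_t$ as in (19), the score update for $f_t$ changes
from (10) to
\[
\nabla_t = \Bigl[\, w_t \cdot \bigl(Wy_t - W\lambda f^{\lambda}_t\bigr)'\,\Sigma^{-1}
\bigl(y_t - h(f_t)Wy_t - X_t\beta - Z_t^{-1}\lambda f^{\lambda}_t\bigr) - \operatorname{tr}(Z_tW) \Bigr]\cdot h'(f_t),
\]
\[
w_t = \frac{1+\lambda^{-1}n}
{1+\lambda^{-1}\bigl(y_t-h(f_t)Wy_t-X_t\beta-Z_t^{-1}\lambda f^{\lambda}_t\bigr)'\,\Sigma^{-1}\bigl(y_t-h(f_t)Wy_t-X_t\beta-Z_t^{-1}\lambda f^{\lambda}_t\bigr)}.
\tag{A.1}
\]
The updating equation for $f^{\lambda}_t$ is given by
\[
f^{\lambda}_{t+1} = \omega^{\lambda} + A^{\lambda}\nabla^{\lambda}_t + B^{\lambda} f^{\lambda}_t,
\]
with score
\[
\nabla^{\lambda}_t = w_t \cdot (Z_t^{-1}\lambda)'\,\Sigma^{-1}\bigl(y_t - h(f_t)Wy_t - X_t\beta - Z_t^{-1}\lambda f^{\lambda}_t\bigr).
\tag{A.2}
\]
Finally, in the benchmark model (20), the score expression equals that in (A.2) with $W = 0$ and
$Z_t \equiv I_n$.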
Appendix B Additional tables and figures
Table B.1: Key policy events during the Eurozone crisis

Date            Event                                                                           Source
Oct. 18, 2009   Greece announces doubling of budget deficit                                     The Guardian (1)
Mar. 3, 2010    EU offers financial help to Greece                                              ECB (2)
Dec. 7, 2010    Ireland is bailed out by EU and IMF                                             ECB (2)
Dec. 22, 2011   ECB launches the first Longer-Term Refinancing Operation (LTRO)                 ECB (2)
Mar. 1, 2012    ECB launches the second LTRO                                                    ECB (2)
Jul. 26, 2012   M. Draghi: “[T]he ECB is ready to do whatever it takes to preserve the euro.”   ECB (3)
Oct. 8, 2012    European Stability Mechanism (ESM) is inaugurated                               ESM (4)
Sep. 12, 2013   European Parliament approves new unified bank supervision system                ECB (2)

(1) http://www.theguardian.com/business/2012/mar/09/greek-debt-crisis-timeline
(2) http://www.ecb.europa.eu/ecb/html/crisis.de.html
(3) http://www.ecb.europa.eu/press/key/date/2012/html/sp120726.en.html
(4) http://www.esm.europa.eu/press/releases/20121008 esm-is-inaugurated.htm
All retrieved on June 19, 2014.
Figure B.1: Filtered spatial dependence parameters obtained from the full model, together with
key policy events from Table B.1.
[Figure: filtered spatial dependence parameter over time, annotated with the events "Greece: record deficit", "Help offer to Greece", "Ireland bailed out", "First LTRO", "Second LTRO", "Mario Draghi: 'Whatever it takes'", "ESM inaugurated", and "New supervisory authority".]
Figure B.2: Filtered spatial parameter obtained from the full time-varying spatial model with
time-varying volatilities and unobserved mean factor, using absolute CDS spread changes as dependent
variable.
[Figure: rho_t (vertical axis, ticks from 0.0 to 0.7) plotted over 2009–2014 (horizontal axis).]
Table B.2: Cross-correlation matrices: raw data and full model residuals

Correlation matrix of raw CDS change data
             Belgium  France  Germany  Ireland   Italy  Netherlands  Portugal   Spain
Belgium        1.000   0.724    0.697    0.630   0.738        0.737     0.643   0.707
France                 1.000    0.724    0.581   0.655        0.678     0.581   0.657
Germany                         1.000    0.553   0.609        0.685     0.534   0.577
Ireland                                  1.000   0.718        0.575     0.724   0.685
Italy                                            1.000        0.654     0.740   0.847
Netherlands                                                   1.000     0.566   0.620
Portugal                                                                1.000   0.742
Spain                                                                           1.000

Correlation matrix of residuals
             Belgium  France  Germany  Ireland   Italy  Netherlands  Portugal   Spain
Belgium        1.000   0.113    0.068   -0.114   0.186        0.079     0.03    0.145
France                 1.000    0.234   -0.232  -0.076        0.029    -0.11   -0.015
Germany                         1.000   -0.247  -0.231        0.099    -0.19   -0.212
Ireland                                  1.000   0.038       -0.115     0.205  -0.038
Italy                                            1.000       -0.148     0.248   0.534
Netherlands                                                   1.000    -0.126  -0.172
Portugal                                                                1.000   0.161
Spain                                                                           1.000
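The averages of pairwise cross-correlations reported in the right panel of Table 6 can be recovered from these matrices. The Python sketch below does this for the raw data; the helper names are hypothetical, and the only inputs are the upper-triangular entries printed in Table B.2. For each country it averages the seven off-diagonal correlations with the other sovereigns.

```python
import numpy as np

COUNTRIES = ["Belgium", "France", "Germany", "Ireland",
             "Italy", "Netherlands", "Portugal", "Spain"]

# Upper-triangular rows (including the diagonal) of the raw-data matrix in Table B.2.
UPPER_RAW = [
    [1.000, 0.724, 0.697, 0.630, 0.738, 0.737, 0.643, 0.707],
    [1.000, 0.724, 0.581, 0.655, 0.678, 0.581, 0.657],
    [1.000, 0.553, 0.609, 0.685, 0.534, 0.577],
    [1.000, 0.718, 0.575, 0.724, 0.685],
    [1.000, 0.654, 0.740, 0.847],
    [1.000, 0.566, 0.620],
    [1.000, 0.742],
    [1.000],
]


def symmetric_from_upper(rows):
    """Build the full symmetric matrix from its upper-triangular rows."""
    n = len(rows)
    m = np.zeros((n, n))
    for i, row in enumerate(rows):
        m[i, i:] = row
    return m + np.triu(m, 1).T   # mirror the strict upper triangle


def average_cross_corr(corr):
    """Average off-diagonal correlation of each series with all others."""
    n = corr.shape[0]
    return (corr.sum(axis=1) - np.diag(corr)) / (n - 1)


if __name__ == "__main__":
    avg = average_cross_corr(symmetric_from_upper(UPPER_RAW))
    for name, value in zip(COUNTRIES, avg):
        print(f"{name:12s} {value:.3f}")
```

Up to rounding to two decimals, the resulting averages agree with the "raw" column of the average cross-correlation panel in Table 6.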
Appendix C Technical appendix: proofs
C.1 Proofs of main theorems
The lines of proof adopted here closely follow the original lines of proof in Blasques et al. (2014b), extended to the
case of exogenous variables.
Proof of Theorem 1: Define the norms ‖ · ‖Θ := supθ∈Θ | · | and ‖ · ‖ΘNf := E supθ∈Θ ‖ · ‖Nf .
Following Straumann and Mikosch (2006, Proposition 3.12), we have
\[
\sup_{\theta\in\Theta}\bigl|f_t(y^{1:t-1}, X^{1:t-1},\theta, f_1) - f_t(y^{t-1}, X^{t-1},\theta)\bigr| \xrightarrow{\;e.a.s.\;} 0 .
\]
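This follows directly from Bougerol (1993b, Theorem 3.1) in the context of the random sequence
$\{f_t(y^{1:t-1}, X^{1:t-1}, \cdot, f^{\Theta}_1)\}_{t\in\mathbb{N}}$ with elements $f_t(y^{1:t-1}, X^{1:t-1}, \cdot, f^{\Theta}_1)$ taking values in the separable Banach space
$\mathcal{F}^{\Theta} \subseteq (C(\Theta,\mathcal{F}), \|\cdot\|^{\Theta})$, with initialization $f^{\Theta}_1$ in $C(\Theta,\mathcal{F})$, where $f^{\Theta}_1(\theta) = f_1$ $\forall\,\theta\in\Theta$,11 and
\[
f_{t+1}(y^{1:t}, X^{1:t}, \cdot, f^{\Theta}_1) = \phi_t\bigl(f_t(y^{1:t-1}, X^{1:t-1}, \cdot, f^{\Theta}_1)\bigr)
:= \phi\bigl(f_t(y^{1:t-1}, X^{1:t-1}, \cdot, f^{\Theta}_1),\, y_t,\, X_t;\, \cdot\bigr) \quad \forall\, t\in\mathbb{N},
\]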
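where $\{\phi_t\}_{t\in\mathbb{Z}}$ is a stationary and ergodic (SE) sequence of stochastic recurrence equations $\phi_t : \Xi \times C(\Theta,\mathcal{F}) \to C(\Theta,\mathcal{F})$ $\forall\, t$
as in Straumann and Mikosch (2006, Proposition 3.12). Note that with a slight abuse of notation we use $\phi$
both to denote the functional $\phi : C(\Theta,\mathcal{F}) \times \mathcal{Y} \times \mathcal{X} \to C(\Theta,\mathcal{F})$ as well as the function $\phi : \mathcal{F} \times \mathcal{Y} \times \mathcal{X} \times \Theta \to \mathcal{F}$.
Continuity of $\phi$ follows from $s \in C(\mathcal{F} \times \mathcal{Y} \times \mathcal{X} \times \mathcal{B} \times \Lambda)$, where $\mathcal{B}$ is the domain of the regression parameters $\beta$.
The assumption that $\{y_t\}_{t\in\mathbb{Z}}$ and $\{X_t\}_{t\in\mathbb{Z}}$ are SE and the continuity of $\phi$ together imply that $\{\phi_t\}_{t\in\mathbb{Z}}$ is SE by
Krengel (1985, Proposition 4.3). Condition C1 in Bougerol (1993b, Theorem 3.1) follows from $\mathbb{E}\ln^{+}\|s(f^{\Theta}, y_t, X_t; \cdot, \cdot)\|^{\Theta} < \infty$
since, by norm sub-additivity and positive homogeneity, for any $f^{\Theta} \in C(\Theta,\mathcal{F})$,
\begin{align*}
\mathbb{E}\ln^{+}\|\phi_t(f^{\Theta})\|^{\Theta}
&\le \mathbb{E}\|\phi_t(f^{\Theta})\|^{\Theta} = \mathbb{E}\|\phi(f^{\Theta}, y_t, X_t; \cdot)\|^{\Theta} \\
&= \mathbb{E}\|\omega + A\,s(f^{\Theta}, y_t, X_t; \cdot, \cdot) + B f^{\Theta}\|^{\Theta} \\
&\le \sup_{\theta\in\Theta}|\omega| + \sup_{\theta\in\Theta}|A|\;\mathbb{E}\|s(f^{\Theta}, y_t, X_t; \cdot, \cdot)\|^{\Theta} + \sup_{\theta\in\Theta}|B|\cdot\|f^{\Theta}\|^{\Theta} < \infty,
\end{align*}
because $\sup_{\theta\in\Theta}|\omega| < \infty$, $\sup_{\theta\in\Theta}|A| < \infty$, $\sup_{\theta\in\Theta}|B| < \infty$, and $\sup_{\theta\in\Theta}\|f^{\Theta}\|^{\Theta} < \infty$ hold by compactness of
$\Theta$ and continuity of $f^{\Theta}$, and $\mathbb{E}\|s(f^{\Theta}, y_t, X_t; \cdot, \cdot)\|^{\Theta} < \infty$ holds by assumption. This implies that $f^{\Theta} \in C(\Theta,\mathcal{F})$
satisfies
\begin{align*}
\mathbb{E}\log^{+}\|\phi_0(f^{\Theta}) - f^{\Theta}\|^{\Theta}
&\le \mathbb{E}\|\phi_0(f^{\Theta}) - f^{\Theta}\|^{\Theta} \le \mathbb{E}\|\phi(f^{\Theta}, y_t, X_t; \cdot)\|^{\Theta} + \|f^{\Theta}\|^{\Theta} \\
&= \mathbb{E}\sup_{\theta\in\Theta}|\phi(f^{\Theta}(\theta), y_t, X_t, \theta)| + \sup_{\theta\in\Theta}|f^{\Theta}(\theta)| < \infty .
\end{align*}
By a similar argument $\mathbb{E}\ln^{+}\sup_{(\beta,\lambda)\in\mathcal{B}\times\Lambda}|s(f_1, y_t, X_t; \beta, \lambda)| < \infty$ implies $\mathbb{E}\log^{+}\|\phi_0(f^{\Theta}) - f^{\Theta}\|^{\Theta}_{N_f} < \infty$.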
11 That $(C(\Theta,\mathcal{F}), \|\cdot\|^{\Theta})$ is a separable Banach space under compact $\Theta$ follows from application of the Arzelà–Ascoli
theorem to obtain completeness and the Stone-Weierstrass theorem for separability.
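For any pair $(f^{\Theta}, f'^{\Theta}) \in C(\Theta) \times C(\Theta)$, define
\[
\rho_t = \rho(\phi_t) = \sup_{(f^{\Theta}, f'^{\Theta})\in\mathcal{F}^{\Theta}\times\mathcal{F}^{\Theta}}
\frac{\|\phi_t(f^{\Theta}) - \phi_t(f'^{\Theta})\|^{\Theta}}{\|f^{\Theta} - f'^{\Theta}\|^{\Theta}} .
\]
Condition C2 in Bougerol (1993b, Theorem 3.1) holds if $\mathbb{E}\ln\rho_t < 0$. This is ensured by $\mathbb{E}\ln\|\phi'_t\|^{\Theta} < 0$, with $\phi_t(\theta)$
as defined in (11). To see this, note that
\begin{align*}
\mathbb{E}\ln\rho(\phi_t)
&:= \mathbb{E}\ln \sup_{\|f^{\Theta}-f'^{\Theta}\|>0}
\frac{\|\phi_t(f^{\Theta}) - \phi_t(f'^{\Theta})\|^{\Theta}}{\|f^{\Theta} - f'^{\Theta}\|^{\Theta}} \\
&= \mathbb{E}\ln \sup_{\|f^{\Theta}-f'^{\Theta}\|>0}
\frac{\sup_{\theta\in\Theta}|\phi(f(\theta, f_1), y_t, X_t, \theta) - \phi(f'(\theta, f_1), y_t, X_t, \theta)|}{\sup_{\theta\in\Theta}|f(\theta, f_1) - f'(\theta, f_1)|} \\
&\le \mathbb{E}\ln \sup_{\|f^{\Theta}-f'^{\Theta}\|>0}
\frac{\sup_{\theta\in\Theta}\phi'_t(\theta)\,\sup_{\theta\in\Theta}|f(\theta, f_1) - f'(\theta, f_1)|}{\sup_{\theta\in\Theta}|f(\theta, f_1) - f'(\theta, f_1)|} \\
&= \mathbb{E}\ln\|\phi'_t\|^{\Theta} < 0 .
\end{align*}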
Also note that for the t-period composition of the stochastic recurrence equation, we have
$\mathbb{E}\ln\rho(\phi_t \circ \cdots \circ \phi_1) \le \mathbb{E}\ln\prod_{j=1}^{t}\rho(\phi_j) \le \sum_{j=1}^{t}\mathbb{E}\ln\|\phi'_j\|^{\Theta} < 0$, where $\circ$ denotes composition.
As a result, $\{f_t(\cdot, f_1)\}_{t\in\mathbb{N}}$ converges e.a.s. to an SE solution $\{f_t(\cdot)\}_{t\in\mathbb{Z}}$ in $\|\cdot\|^{\Theta}$-norm.
Uniqueness and e.a.s. convergence is obtained in Straumann and Mikosch (2006, Theorem 2.8).
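The role of the contraction condition can be illustrated with a small simulation. The Python sketch below is a toy example under stated assumptions and does not use the spatial model: it takes a univariate Student's t location model with $\nu = 5$ degrees of freedom, whose score and parameter values are our own illustrative choices, and runs the recursion $f_{t+1} = \omega + a\,s_t + b f_t$ from two different initial values $f_1$. In this toy setting the derivative of the recursion with respect to $f_t$ lies between $b - a(1+1/\nu)$ and $b + a(1+1/\nu)/8$, so for $a = 0.1$ and $b = 0.9$ it is bounded by 0.915 and the effect of the initialization dies out geometrically.

```python
import numpy as np

NU = 5.0  # assumed degrees of freedom of the toy Student's t location model


def score(f, y):
    """Score of the toy t location density with respect to f."""
    u = y - f
    return (1.0 + 1.0 / NU) * u / (1.0 + u * u / NU)


def run_filter(y, omega, a, b, f1):
    """Score-driven recursion f_{t+1} = omega + a * s_t + b * f_t."""
    f = np.empty(len(y) + 1)
    f[0] = f1
    for t, y_t in enumerate(y):
        f[t + 1] = omega + a * score(f[t], y_t) + b * f[t]
    return f


def init_gap(y, omega, a, b, f1_first, f1_second):
    """Absolute distance between two filter paths started at different f_1."""
    first = run_filter(y, omega, a, b, f1_first)
    second = run_filter(y, omega, a, b, f1_second)
    return np.abs(first - second)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    y = rng.standard_t(NU, size=500)
    gap = init_gap(y, omega=0.02, a=0.1, b=0.9, f1_first=-2.0, f1_second=3.0)
    print(gap[[0, 10, 50, 100, 500]])
```

The printed gaps shrink towards zero regardless of the simulated data, which is the toy counterpart of the e.a.s. convergence of $f_t(\cdot, f_1)$ to its stationary limit.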
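Finally, we show that $\sup_t \mathbb{E}\sup_{\theta\in\Theta}|f_t(y^{1:t-1}, X^{1:t-1},\theta, f_1)|^{N_f} < \infty$ and also
$\mathbb{E}\sup_{\theta\in\Theta}|f_t(y^{1:t-1}, X^{1:t-1},\theta)|^{N_f} < \infty$. We have $\sup_t \mathbb{E}\sup_{\theta\in\Theta}|f_t(y^{1:t-1}, X^{1:t-1},\theta, f_1)|^{N_f} < \infty$ if and only
if $\sup_t \bigl(\mathbb{E}\sup_{\theta\in\Theta}|f_t(y^{1:t-1}, X^{1:t-1},\theta, f_1)|^{N_f}\bigr)^{1/N_f} = \sup_t \|f_t(\cdot, f_1)\|^{\Theta}_{N_f} < \infty$. Furthermore, for any
$f^{\Theta} \in C(\Theta,\mathcal{F})$, having $\|f_t(\cdot, f_1) - f^{\Theta}\|^{\Theta}_{N_f} < \infty$ implies $\|f_t(\cdot, f^{\Theta}_1)\|^{\Theta}_{N_f} < \infty$ since continuity on the compact $\Theta$
implies $\sup_{\theta\in\Theta}|f(\theta)| < \infty$. For $f^{\Theta} \in C(\Theta,\mathcal{F})$, we define $f^{\Theta}_{*}$, $y_{*}$, and $X_{*}$ such that $f^{\Theta} = \phi(y, X, f^{\Theta}_{*}, \cdot) \in C(\Theta,\mathcal{F})$.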
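Above we showed that $\exists\, f^{\Theta} \in C(\Theta,\mathcal{F})$ satisfying $\|\phi(f^{\Theta}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} \le \bar{\phi} < \infty$ and
$\|f^{\Theta}_1 - f^{\Theta}\|^{\Theta}_{N_f} = \|f^{\Theta}_1 - \phi(f^{\Theta}_{*}, y_{*}, X_{*}; \cdot)\|^{\Theta}_{N_f} < \infty$. From this, we obtain
\begin{align*}
\sup_t \|f_t(\cdot, f^{\Theta}_1) - f^{\Theta}\|^{\Theta}_{N_f}
&= \sup_t \|\phi(f_t(\cdot, f^{\Theta}_1), y_t, X_t; \cdot) - \phi(f^{\Theta}_{*}, y_{*}, X_{*}; \cdot)\|^{\Theta}_{N_f} \\
&\le \sup_t \|\phi(f_t(\cdot, f^{\Theta}_1), y_t, X_t; \cdot) - \phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} \\
&\quad + \sup_t \|\phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} + \sup_t \|\phi(f^{\Theta}_{*}, y_{*}, X_{*}; \cdot)\|^{\Theta}_{N_f} \\
&\le \sup_t \Bigl( \mathbb{E}\sup_{\theta\in\Theta}|f_t(\theta, f_1) - f^{\Theta}_{*}|^{N_f} \times
\sup_{\theta\in\Theta}\frac{|\phi(f_t(\theta, f^{\Theta}_1), y_t, X_t; \theta) - \phi(f^{\Theta}_{*}(\theta), y_t, X_t; \theta)|^{N_f}}{|f_t(\theta, f^{\Theta}_1) - f^{\Theta}_{*}(\theta)|^{N_f}} \Bigr)^{1/N_f} \\
&\quad + \sup_t \|\phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} + \|f^{\Theta}\|^{\Theta}_{N_f} \\
&\le \sup_t \Bigl( \mathbb{E}\sup_{\theta\in\Theta}|f_t(\theta, f_1) - f^{\Theta}_{*}|^{N_f} \times \sup_{\theta\in\Theta}\phi'_t(\theta)^{N_f} \Bigr)^{1/N_f}
+ \sup_t \|\phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} + \|f^{\Theta}\|^{\Theta}_{N_f} .
\end{align*}
Using the orthogonality condition in (iv′), we can write the expectation of the product as the product of the expectations
and continue
\begin{align*}
&\le \sup_t \|f_t(\cdot, f^{\Theta}_1) - f^{\Theta}_{*}\|^{\Theta}_{N_f} \cdot \|\phi'_t\|^{\Theta}_{N_f}
+ \sup_t \|\phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} + \|f^{\Theta}\|^{\Theta}_{N_f} \\
&\le \|\phi'_t\|^{\Theta}_{N_f} \times \Bigl( \sup_t \|f_t(\cdot, f^{\Theta}_1) - f^{\Theta}_{*}\|^{\Theta}_{N_f} \Bigr) + \bar{\phi} + \bar{f},
\end{align*}
with $c = \|\phi'_t\|^{\Theta}_{N_f} < 1$ by condition (iv′), $\bar{\phi} < \infty$, and $\bar{f} = \|f^{\Theta}\| + c \cdot \|f^{\Theta} - f^{\Theta}_{*}\|^{\Theta}_{N_f} < \infty$. As a result we have the
recursion $\sup_t \|f_t(\cdot, f^{\Theta}_1) - f^{\Theta}\|^{\Theta}_{N_f} \le c \cdot \sup_t \|f_t(\cdot, f^{\Theta}_1) - f^{\Theta}\|^{\Theta}_{N_f} + A$, with $A = \bar{\phi} + \bar{f}$. Hence,
\begin{align*}
\sup_t \|f_t(\cdot, f^{\Theta}_1) - f^{\Theta}\|^{\Theta}_{N_f}
&\le \sum_{j=0}^{t} c^{j}\bigl((c+1)\bar{f} + \bar{\phi}\bigr) + c^{t+1}\sup_t \|f^{\Theta}_1 - f^{\Theta}\|^{\Theta}_{N_f} \\
&\le \frac{(c+1)\bar{f} + \bar{\phi}}{1-c} + \|f^{\Theta}_1 - f^{\Theta}\|^{\Theta}_{N_f} < \infty .
\end{align*}
The same result holds using the uniform contraction in (iv) by taking a further supremum in $y_t$ and $X_t$ instead of the
orthogonality condition.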
Proof of Theorem 2: Assumption 1 implies that $\ell_T(\theta, f_1) = (1/T)\sum_{t=1}^{T}\ell_t(\theta, f_1)$ is a.s. continuous (a.s.c.) in
$\theta \in \Theta$ through continuity (c.) of each
\begin{align*}
\ell_t(\theta, f_1) &= \ell\bigl(y_t, X_t, f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta), \theta\bigr) \\
&= \log p_e\bigl(Z_t(f_t)^{-1}y_t - X_t\beta; \lambda\bigr) - \log|Z_t(f_t)|
\end{align*}
ensured in turn by the differentiability of $S$, $p_e$ and $h$ and the implied a.s.c. of
\[
\nabla\bigl(f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta), y_t, X_t; \beta, \lambda\bigr)
= \frac{\partial \log p_e\bigl(Z_t(f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta))^{-1}y_t - X_t\beta; \lambda\bigr)}{\partial f}
- \frac{\partial \log|Z_t(f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta))|}{\partial f}
\]
in $(f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta); \lambda)$ and the resulting c. of $f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta)$ in $\theta$ as a composition of $t$ c. maps.
Together with the compactness of $\Theta$ this implies by Weierstrass' theorem that the arg max set is non-empty a.s. and
hence that $\hat{\theta}_T$ exists a.s. $\forall\, T \in \mathbb{N}$. Assumption 1 implies also by a similar argument that
\[
\ell_T(\theta, f_1) = \ell\bigl(f_{1:T}(y^{1:t-1}, X^{1:t-1}, f_1, \theta); y^{1:T}, X^{1:T}, \theta\bigr)
\]
is continuous in $(y^{1:T}, X^{1:T})$ $\forall\, \theta \in \Theta$ and hence measurable w.r.t. the product Borel σ-algebra $\mathcal{B}(\mathcal{Y}) \otimes \mathcal{B}(\mathcal{X})$ that
are, in turn, measurable maps w.r.t. $\mathcal{F}$ by Proposition 4.1.7 in Dudley (2002).12 The measurability of $\hat{\theta}_T$ follows from
Folland (2009, p.24) and White (1994, Theorem 2.11) or Gallant and White (1988, Lemma 2.1, Theorem 2.2).13
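12 Dudley's proposition states that the Borel σ-algebra $\mathcal{B}(\mathcal{A} \times \mathcal{B})$ generated by the Tychonoff product topology
$\mathcal{T}_{\mathcal{A}\times\mathcal{B}}$ on the space $\mathcal{A} \times \mathcal{B}$ includes the product σ-algebra $\mathcal{B}(\mathcal{A}) \otimes \mathcal{B}(\mathcal{B})$.
13 The reference to Folland (2009) is used here to establish that a map into a product space is measurable if and only
if its projections are measurable.
Proof of Theorem 3: We obtain $\hat{\theta}_T(f_1) \xrightarrow{a.s.} \theta_0$ from the uniform convergence of the criterion function
\[
\sup_{\theta\in\Theta}|\ell_T(\theta, f_1) - \ell_\infty(\theta)| \xrightarrow{a.s.} 0 \quad \forall\, f_1 \in \mathcal{F} \text{ as } T \to \infty, \tag{C.1}
\]
and the identifiable uniqueness of the maximizer $\theta_0 \in \Theta$ introduced in White (1994),
\[
\sup_{\theta:\|\theta-\theta_0\|>\varepsilon}\ell_\infty(\theta) < \ell_\infty(\theta_0) \quad \forall\, \varepsilon > 0; \tag{C.2}
\]
see for example White (1994, Theorem 3.4) or Theorem 3.3 in Gallant and White (1988) for further details.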
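The uniform convergence is obtained by norm sub-additivity,14
\[
\sup_{\theta\in\Theta}|\ell_T(\theta, f_1) - \ell_\infty(\theta)| \le \sup_{\theta\in\Theta}|\ell_T(\theta, f_1) - \ell_T(\theta)| + \sup_{\theta\in\Theta}|\ell_T(\theta) - \ell_\infty(\theta)|,
\]
and then showing that the initialization effect vanishes asymptotically,
\[
\sup_{\theta\in\Theta}|\ell_T(\theta, f_1) - \ell_T(\theta)| \xrightarrow{a.s.} 0 \text{ as } T \to \infty, \tag{C.3}
\]
and for the second term applying the ergodic theorem for separable Banach spaces in Ranga Rao (1962), as in Straumann
and Mikosch (2006, Theorem 2.7), to the sequence $\{\ell_T(\cdot)\}$ with elements taking values in $C(\Theta,\mathbb{R})$ so that
\[
\sup_{\theta\in\Theta}|\ell_T(\theta) - \ell_\infty(\theta)| \xrightarrow{a.s.} 0 \quad \text{where } \ell_\infty(\theta) = \mathbb{E}\ell_t(\theta) \;\; \forall\, \theta \in \Theta .
\]
The criterion $\ell_T(\theta, f_1)$ satisfies (C.3) if
\[
\sup_{\theta\in\Theta}|\ell_t(\theta, f_1) - \ell_t(\theta)| \xrightarrow{a.s.} 0 \text{ as } t \to \infty .
\]
The continuity of $p_e$ ensures that $\ell_t(\cdot, f_1) = \ell(f_t(y^{t}, X^{t}, \cdot, f_1), y_t, X_t, \cdot)$ is continuous in $(f_t(y^{t}, X^{t}, \cdot, f_1), y_t, X_t)$.
Since all the assumptions of Theorem 1 are satisfied we know that there exists a unique SE sequence $\{f_t(y^{t}, X^{t}, \cdot)\}_{t\in\mathbb{Z}}$
with elements taking values in $C(\Theta,\mathcal{F})$ such that
\[
\sup_{\theta\in\Theta}\bigl|(f_t(y^{t-1}, X^{t-1}, f_1, \theta), y_t, X_t) - (f_t(y^{t-1}, X^{t-1}, \theta), y_t, X_t)\bigr| \xrightarrow{a.s.} 0,
\]
and
\[
\sup_t \mathbb{E}\sup_{\theta\in\Theta}|f_t(y^{t-1}, X^{t-1}, f_1, \theta)|^{N_f} < \infty \quad\text{and}\quad \mathbb{E}\sup_{\theta\in\Theta}|f_t(y^{t-1}, X^{t-1}, \theta)|^{N_f} < \infty,
\]
with $N_f \ge 1$. Hence, (C.3) follows by application of a continuous mapping theorem for $\ell : C(\Theta,\mathcal{F}) \to C(\Theta,\mathcal{F})$.
The ULLN $\sup_{\theta\in\Theta}|\ell_T(\theta) - \mathbb{E}\ell_t(\theta)| \xrightarrow{a.s.} 0$ as $T \to \infty$ follows, under a moment bound $\mathbb{E}\sup_{\theta\in\Theta}|\ell_t(\theta)| < \infty$,
by the SE nature of $\{\ell_T\}_{t\in\mathbb{Z}}$ which is implied by continuity of $\ell$ on the SE sequence $\{(y_t, X_t, f_t(y^{t-1}, X^{t-1}, \cdot))\}_{t\in\mathbb{Z}}$
and Proposition 4.3 in Krengel (1985). The moment bound $\mathbb{E}\sup_{\theta\in\Theta}|\ell_t(\theta)| < \infty$ can be established as follows. First
note that
\begin{align*}
\mathbb{E}\sup_{\theta\in\Theta}|\ell_t(\theta)|
&= \sup_{\theta\in\Theta}\mathbb{E}\bigl|\log p_e\bigl(y_t - h(f_t(y^{t-1}, X^{t-1}, \theta))Wy_{t-1} - X_t\beta\bigr) - \log\det Z(f_t(y^{t-1}, X^{t-1}, \theta))\bigr| \\
&\le \sup_{\theta\in\Theta}\mathbb{E}\bigl|\log p_e\bigl(y_t - h(f_t(y^{t-1}, X^{t-1}, \theta))Wy_{t-1} - X_t\beta\bigr)\bigr|
+ \sup_{\theta\in\Theta}\mathbb{E}\bigl|\log\det Z(f_t(y^{t-1}, X^{t-1}, \theta))\bigr| < \infty,
\end{align*}
then the bounded first moment for the likelihood is implied by having
\[
\mathbb{E}|y_t|^{N_y} < \infty, \quad \mathbb{E}|X_t|^{N_X} < \infty \quad\text{and}\quad \sup_{\theta\in\Theta}\mathbb{E}|f_t(y^{t-1}, X^{t-1}, \theta)|^{N_f} < \infty
\]
since then
\[
\sup_{\theta\in\Theta}\mathbb{E}\bigl|\log\det Z(f_t(y^{t-1}, X^{t-1}, \theta))\bigr| < \infty,
\]
because $\log|Z| \in \mathbb{M}(N_f, N_{\log|Z|})$ with $N_{\log|Z|} \ge 1$ by assumption and
\[
\sup_{\theta\in\Theta}\mathbb{E}\bigl|\log p_e\bigl(y_t - h(f_t(y^{t-1}, X^{t-1}, \theta))Wy_{t-1} - X_t\beta\bigr)\bigr| < \infty,
\]
because $h$ is uniformly bounded and hence the argument of $p_e$ has the same moments as $y_t$ and $X_t$. This ensures the
desired moment since $\log p_e \in \mathbb{M}(N, N_{\log p_e})$ with $N_{\log p_e} \ge 1$ and $N = \min\{N_y, N_x\}$ by assumption.
Finally, the identifiable uniqueness (see e.g. White (1994)) of $\theta_0 \in \Theta$ in (C.2) follows from the assumed uniqueness,
the compactness of $\Theta$, and the continuity of the limit $\mathbb{E}\ell_t(\theta)$ in $\theta \in \Theta$ which is implied by the continuity of $\ell_T$
in $\theta \in \Theta$ $\forall\, T \in \mathbb{N}$ and the uniform convergence in (C.1).
14 $\ell_T(\theta)$ denotes $\ell_T(\theta, f_1)$ with $f(\theta, f_1)$ replaced by its limit for $T \to \infty$, i.e., by $f(\theta)$.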
Proof of Theorem 4: As the likelihood and its derivatives depend on the derivatives of $f(\theta, f_1)$ with respect to $\theta$, we
introduce the notation $f^{(0:m)}$ as the vector containing $f(\theta, f_1)$ and its derivatives up to order $m$, with initial condition
$f^{(0:m)}_1$. We obtain the desired result from: (i) the strong consistency of $\hat{\theta}_T \xrightarrow{a.s.} \theta_0 \in \operatorname{int}(\Theta)$; (ii) the a.s. twice
continuous differentiability of $\ell_T(\theta, f_1)$ in $\theta \in \Theta$; (iii) the asymptotic normality of the score
\[
\sqrt{T}\,\ell'_T(\theta_0, f^{(0:1)}_1) \xrightarrow{d} N(0, \mathcal{J}(\theta_0)), \qquad \mathcal{J}(\theta_0) = \mathbb{E}\bigl(\tilde{\ell}'_t(\theta_0)\,\tilde{\ell}'_t(\theta_0)^{\top}\bigr); \tag{C.4}
\]
(iv) the uniform convergence of the likelihood's second derivative,
\[
\sup_{\theta\in\Theta}\bigl\|\ell''_T(\theta, f^{(0:2)}_1) - \ell''_\infty(\theta)\bigr\| \xrightarrow{a.s.} 0; \tag{C.5}
\]
and finally, (v) the non-singularity of the limit $\ell''_\infty(\theta) = \mathbb{E}\tilde{\ell}''_t(\theta) = \mathcal{I}(\theta)$. See e.g. White (1994, Theorem 6.2) for
further details.
The consistency condition $\hat{\theta}_T \xrightarrow{a.s.} \theta_0 \in \operatorname{int}(\Theta)$ in (i) follows under the maintained assumptions from Theorem 3
and the additional assumption in Theorem 4 that $\theta_0 \in \operatorname{int}(\Theta)$. The smoothness condition in (ii) follows immediately
from Assumption 2 and the likelihood expressions in Appendix C.2.
The asymptotic normality of the score in (C.4) follows by Theorem 18.10[iv] in van der Vaart (2000) by showing
that
\[
\|\ell'_T(\theta_0, f^{(0:1)}_1) - \ell'_T(\theta_0)\| \xrightarrow{e.a.s.} 0 \text{ as } T \to \infty, \tag{C.6}
\]
plus a CLT result for $\ell'_T(\theta_0)$. Note that from (C.6) we obtain that $\sqrt{T}\,\|\ell'_T(\theta_0, f^{(0:1)}_1) - \ell'_T(\theta_0)\| \xrightarrow{a.s.} 0$ as $T \to \infty$.
The desired CLT result follows by an application of the CLT for SE martingales in Billingsley (1961),
\[
\sqrt{T}\,\ell'_T(\theta_0) \xrightarrow{d} N(0, \mathcal{J}(\theta_0)) \text{ as } T \to \infty, \tag{C.7}
\]
where $\mathcal{J}(\theta_0) = \mathbb{E}\bigl(\tilde{\ell}'_t(\theta_0)\,\tilde{\ell}'_t(\theta_0)^{\top}\bigr) < \infty$, where finite (co)variances follow from the assumption $N_{\ell'} \ge 2$ in
Assumption 4 and the expressions for the likelihood in Appendix C.2.
To establish the e.a.s. convergence in (C.6), we use the e.a.s. convergence
\[
|f_t(y^{1:t-1}, X^{1:t-1}, \theta_0, f_1) - f_t(y^{t-1}, X^{t-1}, \theta_0)| \xrightarrow{e.a.s.} 0, \tag{C.8}
\]
and
\[
\|f^{(1)}_t(y^{1:t-1}, X^{1:t-1}, \theta_0, f^{(0:1)}_1) - f^{(1)}_t(y^{1:t-1}, X^{1:t-1}, \theta_0)\| \xrightarrow{e.a.s.} 0. \tag{C.9}
\]
The e.a.s. convergence in (C.8) is obtained directly by application of Theorem 1 under the maintained assumptions. The
e.a.s. convergence in (C.9) is obtained by the same argument as in the proof of Theorem 1 since: (a) the expressions for
the derivative process $f^{(1)}_t$ in Appendix C.2 show that the contraction condition
\[
\mathbb{E}\ln\sup_{\theta\in\Theta}\phi'_{1,1}(\theta) < 0
\]
for the recursion of the filter $f_t$ is the same as the contraction condition for the derivative process $f^{(1)}_t$; and (b) the
expressions in Appendix C.2 also reveal that the counterpart of the moment condition
\[
\mathbb{E}\ln^{+}\sup_{(\beta,\lambda)\in\mathcal{B}\times\Lambda}|s(f_1, y_t, X_t; \beta, \lambda)| < \infty,
\]
used in Theorem 1 for the filtered process $f_t$, is implied by the condition that
\[
\min\bigl\{N_f,\, N_s,\, N^{(0,1,0)}_s,\, N^{(0,0,1)}_s\bigr\} > 0,
\]
as imposed in Assumption 4.
From the differentiability of
\[
\tilde{\ell}'_t(\theta, f^{(0:1)}_1) = \ell'\bigl(\theta, y^{1:t}, X^{1:t}, f^{(0:1)}_t(y^{1:t-1}, X^{1:t-1}, \theta, f^{(0:1)}_1)\bigr)
\]
in $f^{(0:1)}_t(y^{1:t-1}, X^{1:t-1}, \theta, f^{(0:1)}_1)$ and the convexity of $\mathcal{F}$, we use the mean-value theorem to obtain
\[
\|\ell'_T(\theta_0, f^{(0:1)}_1) - \ell'_T(\theta_0)\| \le \sum_{j=1}^{4+d_\lambda}
\Bigl|\frac{\partial \ell'(y^{1:t}, X^{1:t}, f^{(0:1)}_t)}{\partial f_j}\Bigr| \times
\bigl|f^{(0:1)}_{j,t}(y^{1:t-1}, X^{1:t-1}, \theta_0, f^{(0:1)}_1) - f^{(0:1)}_{j,t}(y^{1:t-1}, X^{1:t-1}, \theta_0)\bigr|,
\tag{C.10}
\]
where $d_\lambda$ denotes the dimension of $\lambda$, $f^{(0:1)}_{j,t}$ denotes the j-th element of $f^{(0:1)}_t$, and the point $f^{(0:1)}_t$ at which the
derivative is evaluated lies on the segment connecting $f^{(0:1)}_{j,t}(y^{1:t-1}, X^{1:t-1}, \theta_0, f^{(0:1)}_1)$ and $f^{(0:1)}_{j,t}(y^{1:t-1}, X^{1:t-1}, \theta_0)$.
Note that $f^{(0:1)}_t \in \mathbb{R}^{4+d_\lambda}$ because it contains $f_t \in \mathbb{R}$ (the first element) as well as $f^{(1)}_t \in \mathbb{R}^{3+d_\lambda}$
(the derivatives with respect to $\omega$, $A$, $B$, and $\lambda$). The expressions of the likelihood and its derivatives in Appendix C.2,
together with the moment bounds and the moment preserving properties in Assumption 4, show that
\[
\bigl|\partial \ell'(y^{1:t}, X^{1:t}, f^{(0:1)}_t)/\partial f_j\bigr| = O_p(1) \quad \forall\, j = 1, \ldots, 4 + d_\lambda .
\]
The strong convergence in (C.10) is now ensured by
\[
\|\ell'_T(\theta_0, f^{(0:1)}_1) - \ell'_T(\theta_0)\| = \sum_{i=1}^{4+d_\lambda} O_p(1)\,o_{e.a.s.}(1) = o_{e.a.s.}(1). \tag{C.11}
\]
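The proof of the uniform convergence in (C.5) is similar to that of Theorem 3. We note
\[
\sup_{\theta\in\Theta}\|\ell''_T(\theta, f_1) - \ell''_\infty(\theta)\| \le \sup_{\theta\in\Theta}\|\ell''_T(\theta, f_1) - \ell''_T(\theta)\| + \sup_{\theta\in\Theta}\|\ell''_T(\theta) - \ell''_\infty(\theta)\|. \tag{C.12}
\]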
To prove that the first term vanishes a.s., we show that
\[
\sup_{\theta\in\Theta}\|\tilde{\ell}''_t(\theta, f_1) - \tilde{\ell}''_t(\theta)\| \xrightarrow{a.s.} 0 \text{ as } t \to \infty .
\]
The differentiability of $g$, $g'$, $p$, and $S$ from Assumption 2 ensures that
\[
\tilde{\ell}''_t(\cdot, f_1) = \ell''\bigl(y_t, f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \cdot, f^{(0:2)}_1), \cdot\bigr)
\]
is continuous in $(y_t, f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \cdot, f^{(0:2)}_1))$. Again, we note that the proof of Theorem 1 can be easily adapted
to show that there exists a unique SE sequence $\{f^{(0:2)}_t(y^{t-1}, X^{t-1}, \cdot)\}_{t\in\mathbb{Z}}$ such that
\[
\sup_{\theta\in\Theta}\bigl\|(y_t, f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \theta, f^{(0:2)}_1)) - (y_t, f^{(0:2)}_t(y^{t-1}, X^{t-1}, \theta))\bigr\| \xrightarrow{a.s.} 0,
\]
and satisfying, for $N_f \ge 1$,
\[
\sup_t \mathbb{E}\sup_{\theta\in\Theta}\|f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \theta, f^{(0:2)}_1)\|^{N_f} < \infty,
\]
and also
\[
\mathbb{E}\sup_{\theta\in\Theta}\|f^{(0:2)}_t(y^{t-1}, X^{t-1}, \theta)\|^{N_f} < \infty,
\]
because (a) the expressions for the derivative process $f^{(1)}_t$ in Appendix C.2 show that the contraction condition
\[
\mathbb{E}\ln\sup_{\theta\in\Theta}\phi'_{1,1}(\theta) < 0
\]
for the recursion of the filter $f_t$ is the same as the contraction condition for the second derivative process $f^{(2)}_t$; and
(b) the expressions in Appendix C.2 show also that the counterpart of the moment condition
\[
\mathbb{E}\ln^{+}\sup_{(\beta,\lambda)\in\mathcal{B}\times\Lambda}|s(f_1, y_t, X_t; \beta, \lambda)| < \infty,
\]
used in Theorem 1 for the filtered process $f_t$, is implied by the condition that
\[
\min\Bigl\{ N^{(1)}_f,\; N^{(0,1,0)}_s,\; N^{(0,0,1)}_s,\; N^{(0,2,0)}_s,\; N^{(0,0,2)}_s,\; N^{(0,1,1)}_s,\;
\frac{N^{(1,0,0)}_s N^{(1)}_f}{N^{(1,0,0)}_s + N^{(1)}_f},\;
\frac{N^{(2,0,0)}_s N^{(1)}_f}{2N^{(2,0,0)}_s + N^{(1)}_f},\;
\frac{N^{(1,1,0)}_s N^{(1)}_f}{N^{(1,1,0)}_s + N^{(1)}_f},\;
\frac{N^{(1,0,1)}_s N^{(1)}_f}{N^{(1,0,1)}_s + N^{(1)}_f} \Bigr\} > 0,
\]
imposed in Assumption 4. By application of a continuous mapping theorem for $\ell'' : C(\Theta \times \mathcal{F}^{(0:2)}) \to \mathbb{R}$ we thus
conclude that the first term in (C.12) converges to 0 a.s.
The second term in (C.12) converges under a bound $\mathbb{E}\sup_{\theta\in\Theta}\|\tilde{\ell}''_t(\theta)\| < \infty$ by the SE nature of $\{\ell''_T\}_{t\in\mathbb{Z}}$. The
latter is implied by continuity of $\ell''$ on the SE sequence
\[
\{(y_t, X_t, f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \cdot))\}_{t\in\mathbb{Z}} .
\]
The moment bound $\mathbb{E}\sup_{\theta\in\Theta}\|\tilde{\ell}''_t(\theta)\| < \infty$ follows from $N_{\ell''} \ge 1$ in Assumption 4 and the expressions in Appendix
C.2. Finally, the non-singularity of the limit $\ell''_\infty(\theta) = \mathbb{E}\tilde{\ell}''_t(\theta) = \mathcal{I}(\theta)$ in (v) below equation (C.5) is implied by the
uniqueness of $\theta_0$ as a maximum of $\ell''_\infty(\theta)$ in $\Theta$.
C.2 Derivatives of the likelihood function
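We take first derivatives of the likelihood with respect to all static parameters $\theta = (\omega, a, b, \beta', \sigma^2)'$:
\[
\frac{\partial \ell_t}{\partial \theta} = \Bigl(\frac{\partial \ell_t}{\partial \omega},\, \frac{\partial \ell_t}{\partial a},\, \frac{\partial \ell_t}{\partial b},\, \frac{\partial \ell_t}{\partial \beta},\, \frac{\partial \ell_t}{\partial \sigma^2}\Bigr)' .
\]
Let $\theta_m$ denote the mth element of $\theta$. We can decompose the derivatives of the likelihood with respect to each $\theta_m$
into two parts:
\[
\frac{\partial \ell_t}{\partial \theta_m} = \frac{\partial (p_t + \ln g'_t)}{\partial f_t}\cdot\frac{\partial f_t}{\partial \theta_m} + \frac{\partial p_t}{\partial \theta_m}
= \nabla_t \cdot \frac{\partial f_t}{\partial \theta_m} + \frac{\partial p_t}{\partial \theta_m}, \tag{C.13}
\]
because $g'_t$ does not depend on any of the parameters directly, only through $f_t$. For $\theta_m \in \{\omega, a, b\}$ the second term is
zero, because these parameters enter the likelihood only through $f_t$.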
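All partial derivatives contain the term $\partial f_t/\partial \theta_m$ given by
\begin{align}
\frac{\partial f_t}{\partial \theta_m}
&= \frac{\partial}{\partial \theta_m}\bigl(\omega + a s_{t-1} + b f_{t-1}\bigr) \tag{C.14} \\
&= \frac{\partial \omega}{\partial \theta_m} + \frac{\partial a}{\partial \theta_m}s_{t-1} + a\frac{\partial s_{t-1}}{\partial f_{t-1}}\cdot\frac{\partial f_{t-1}}{\partial \theta_m} + a\frac{\partial s_{t-1}}{\partial \theta_m} + \frac{\partial b}{\partial \theta_m}f_{t-1} + b\frac{\partial f_{t-1}}{\partial \theta_m} \tag{C.15} \\
&= \frac{\partial \omega}{\partial \theta_m} + \frac{\partial a}{\partial \theta_m}\nabla_{t-1} + a\nabla'_{t-1}\cdot\frac{\partial f_{t-1}}{\partial \theta_m} + a\frac{\partial \nabla_{t-1}}{\partial \theta_m} + \frac{\partial b}{\partial \theta_m}f_{t-1} + b\frac{\partial f_{t-1}}{\partial \theta_m} \tag{C.16} \\
&= \frac{\partial \omega}{\partial \theta_m} + \frac{\partial a}{\partial \theta_m}\nabla_{t-1} + a\frac{\partial \nabla_{t-1}}{\partial \theta_m} + \frac{\partial b}{\partial \theta_m}f_{t-1} + (a\nabla'_{t-1} + b)\frac{\partial f_{t-1}}{\partial \theta_m}, \tag{C.17}
\end{align}
where (C.16) uses $s_{t-1} = \nabla_{t-1}$ and $\nabla'_{t-1}$ denotes $\partial \nabla_{t-1}/\partial f_{t-1}$.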
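The recursion in (C.17) can be run alongside the filter itself. The Python sketch below is a minimal illustration under stated assumptions and not the authors' code: it replaces the spatial score by the score of a toy univariate Student's t location model with $\nu = 5$, for which the direct derivative $\partial\nabla_{t-1}/\partial\theta_m$ is zero for $\theta_m \in \{\omega, a, b\}$, and it keeps the initial value $f_1$ fixed so that its derivatives are zero. All function names are hypothetical.

```python
import numpy as np

NU = 5.0  # assumed degrees of freedom of the toy Student's t location model


def score(f, y):
    """Toy stand-in for nabla_t: score of a t location density w.r.t. f."""
    u = y - f
    return (1.0 + 1.0 / NU) * u / (1.0 + u * u / NU)


def score_df(f, y):
    """Toy stand-in for nabla'_t: derivative of the score w.r.t. f."""
    u = y - f
    d = 1.0 + u * u / NU
    return -(1.0 + 1.0 / NU) * (1.0 - u * u / NU) / (d * d)


def filter_with_derivatives(y, omega, a, b, f1):
    """Run f_{t+1} = omega + a * s_t + b * f_t together with d f_t / d(omega, a, b)."""
    T = len(y)
    f = np.empty(T + 1)
    df = np.zeros((T + 1, 3))        # row t: derivatives w.r.t. (omega, a, b)
    f[0] = f1                        # fixed initialization, so df[0] = 0
    for t in range(T):
        s = score(f[t], y[t])
        ds = score_df(f[t], y[t])
        f[t + 1] = omega + a * s + b * f[t]
        # Direct terms of (C.17): d omega/d theta_m, (d a/d theta_m) * nabla, (d b/d theta_m) * f.
        direct = np.array([1.0, s, f[t]])
        # Propagated term of (C.17): (a * nabla' + b) * d f_{t-1} / d theta_m.
        df[t + 1] = direct + (a * ds + b) * df[t]
    return f, df
```

A simple way to validate such a recursion is to compare `df` at the final time point with central finite differences of the filtered value with respect to each of $\omega$, $a$, and $b$.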
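We want the matrix of second derivatives of the likelihood function, i.e. $\partial^2 \ell_t/\partial \theta\,\partial \theta'$.
We take another derivative of (C.13) with respect to $\theta_o$:
\[
\frac{\partial^2 \ell_t}{\partial \theta_m \partial \theta_o}
= \nabla'_t \cdot \frac{\partial f_t}{\partial \theta_o}\cdot\frac{\partial f_t}{\partial \theta_m}
+ \frac{\partial \nabla_t}{\partial \theta_o}\cdot\frac{\partial f_t}{\partial \theta_m}
+ \nabla_t\frac{\partial^2 f_t}{\partial \theta_m \partial \theta_o}
+ \frac{\partial^2 p_t}{\partial \theta_m \partial \theta_o}. \tag{C.18}
\]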
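The second derivative process takes the form
\begin{align*}
\frac{\partial^2 f_t}{\partial \theta_m \partial \theta_o}
&= \frac{\partial a}{\partial \theta_m}\cdot\frac{\partial \nabla_{t-1}}{\partial f_{t-1}}\cdot\frac{\partial f_{t-1}}{\partial \theta_o}
+ \frac{\partial a}{\partial \theta_m}\frac{\partial \nabla_{t-1}}{\partial \theta_o} \\
&\quad + \frac{\partial a}{\partial \theta_o}\frac{\partial \nabla_{t-1}}{\partial f_{t-1}}\frac{\partial f_{t-1}}{\partial \theta_m}
+ a\frac{\partial^2 \nabla_{t-1}}{\partial f_{t-1}^2}\frac{\partial f_{t-1}}{\partial \theta_o}\frac{\partial f_{t-1}}{\partial \theta_m}
+ a\frac{\partial^2 \nabla_{t-1}}{\partial f_{t-1}\partial \theta_o}\frac{\partial f_{t-1}}{\partial \theta_m} \\
&\quad + a\frac{\partial \nabla_{t-1}}{\partial f_{t-1}}\frac{\partial^2 f_{t-1}}{\partial \theta_m \partial \theta_o}
+ \frac{\partial a}{\partial \theta_o}\frac{\partial \nabla_{t-1}}{\partial \theta_m}
+ a\frac{\partial^2 \nabla_{t-1}}{\partial \theta_m \partial \theta_o}
+ a\frac{\partial^2 \nabla_{t-1}}{\partial \theta_m \partial f_{t-1}}\frac{\partial f_{t-1}}{\partial \theta_o} \\
&\quad + \frac{\partial b}{\partial \theta_m}\frac{\partial f_{t-1}}{\partial \theta_o}
+ \frac{\partial b}{\partial \theta_o}\frac{\partial f_{t-1}}{\partial \theta_m}
+ b\frac{\partial^2 f_{t-1}}{\partial \theta_m \partial \theta_o} \\
&= \frac{\partial a}{\partial \theta_m}\cdot\nabla'_{t-1}\cdot\frac{\partial f_{t-1}}{\partial \theta_o}
+ \frac{\partial a}{\partial \theta_m}\frac{\partial \nabla_{t-1}}{\partial \theta_o} \\
&\quad + \frac{\partial a}{\partial \theta_o}\cdot\nabla'_{t-1}\cdot\frac{\partial f_{t-1}}{\partial \theta_m}
+ a\nabla''_{t-1}\cdot\frac{\partial f_{t-1}}{\partial \theta_o}\frac{\partial f_{t-1}}{\partial \theta_m}
+ a\frac{\partial \nabla'_{t-1}}{\partial \theta_o}\frac{\partial f_{t-1}}{\partial \theta_m} \\
&\quad + a\nabla'_{t-1}\frac{\partial^2 f_{t-1}}{\partial \theta_m \partial \theta_o}
+ \frac{\partial a}{\partial \theta_o}\frac{\partial \nabla_{t-1}}{\partial \theta_m}
+ a\frac{\partial^2 \nabla_{t-1}}{\partial \theta_m \partial \theta_o}
+ a\frac{\partial^2 \nabla_{t-1}}{\partial \theta_m \partial f_{t-1}}\frac{\partial f_{t-1}}{\partial \theta_o} \\
&\quad + \frac{\partial b}{\partial \theta_m}\frac{\partial f_{t-1}}{\partial \theta_o}
+ \frac{\partial b}{\partial \theta_o}\frac{\partial f_{t-1}}{\partial \theta_m}
+ b\frac{\partial^2 f_{t-1}}{\partial \theta_m \partial \theta_o}. \tag{C.19}
\end{align*}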