
TI 2014-107/III Tinbergen Institute Discussion Paper

Spillover Dynamics for Systemic Risk Measurement using Spatial Financial Time Series Models

Francisco Blasques, Siem Jan Koopman, Andre Lucas, Julia Schaumburg

Faculty of Economics and Business Administration, VU University Amsterdam, and Tinbergen Institute.


Spillover Dynamics for Systemic Risk Measurement Using Spatial Financial Time Series Models¹

Francisco Blasques(a), Siem Jan Koopman(a,b), Andre Lucas(a), Julia Schaumburg(a)

(a) VU University Amsterdam and Tinbergen Institute, The Netherlands

(b) CREATES, Aarhus University, Denmark

August 2014

Abstract

We introduce a new model for time-varying spatial dependence. The model extends the well-known static spatial lag model. All parameters can be estimated conveniently by maximum likelihood. We establish the theoretical properties of the model and show that the maximum likelihood estimator for the static parameters is consistent and asymptotically normal. We also study the information theoretic optimality of the updating steps for the time-varying spatial dependence parameter. We adopt the model to empirically investigate the spatial dependence between eight European sovereign CDS spreads over the period 2009–2014, which includes the European sovereign debt crisis. We construct our spatial weight matrix using cross-border lending data and include country-specific and Europe-wide risk factors as controls. We find a high, time-varying degree of spatial spillovers in the sovereign CDS spread data. There is a downturn in spatial dependence after the first half of 2012, which is consistent with policy measures taken by the European Central Bank. The findings are robust to a wide range of alternative model specifications.

Keywords: Spatial correlation, time-varying parameters, systemic risk, European debt crisis, generalized autoregressive score.

¹ We thank conference participants at the 2nd workshop on “Models driven by the score of predictive likelihoods” in Tenerife 2014, the 7th Annual SoFiE Conference in Toronto 2014, the International Association of Applied Econometrics conference in London 2014, the SYRTO workshop at the Bundesbank 2014 and seminar participants at VU University Amsterdam for helpful comments. Lucas and Blasques thank the Dutch Science Foundation (NWO, grant VICI453-09-005) for financial support. Koopman, Lucas, and Schaumburg thank the European Union Seventh Framework Programme (FP7-SSH/2007-2013, grant agreement 320270 - SYRTO) for financial support. Koopman acknowledges support from CREATES, Center for Research in Econometric Analysis of Time Series (DNRF78) at Aarhus University, Denmark, and funded by the Danish National Research Foundation. Research support by the Deutsche Forschungsgemeinschaft through the SFB 649 “Economic Risk” is gratefully acknowledged.


1 Introduction

We propose a new parsimonious model to measure the time-varying cross-sectional dependence in European sovereign credit spreads. The model builds on the well-known spatial lag model for panel data. The strength of contemporaneous spillover effects is summarized in a single time-varying parameter: the spatial dependence parameter. We argue that this parameter may be interpreted as a measure of sovereign systemic risk.

It is important to model sovereign default risk in the Euro area using a joint model. Financial markets in the Euro area are largely integrated, and the financial sectors in the separate member countries are heavily engaged in cross-border borrowing and lending. The cross-border entanglement of financial firms and sovereigns intensified further during the financial crisis; see for example Acharya et al. (2013) and Dieckmann and Plank (2012). Our model accounts for the possibility that shocks affecting the credit quality of one Eurozone member are likely to be propagated to the other members via the linkages of their financial sectors. Possible feedback loops that amplify systemic risk are incorporated in this way as well. The transmission channels in our model are defined explicitly as economic distances in a spatial weights matrix constructed from cross-border debt data.

The model is adopted to empirically study a time series sample of eight Eurozone sovereign credit default swaps (CDS) over the period 2009–2014. We find strong evidence for time-varying spatial dependence. The dependence parameter displays clear but transitory troughs around the Long Term Refinancing Operations (LTROs) by the European Central Bank at the end of 2011 and start of 2012, but effectively remains at high levels up to the second half of 2012. It is only after the announcement and implementation of the Outright Monetary Transactions (OMT), and the implementation of the European Stability Mechanism (ESM), that systemic risk settles more permanently at a spatial correlation level that is roughly 30% to 40% lower than during the crisis. The empirical implications of the model are robust to a range of model extensions and alternative specifications, including common unobserved factors in the levels of CDS changes, alternative distributional assumptions, alternative spatial weight matrices, and time-varying volatilities.

Our paper contributes to two strands of literature. First, we contribute to the applied spatial econometrics literature. Spatial models are widely used in applied geographic and regional science studies, and have recently also been applied in empirical finance; see Fernandez (2011) for a CAPM model augmented by spatial dependencies, Wied (2013), Arnold et al. (2013), and Asgharian et al. (2013) for analyses of spatial dependencies in stock markets, Denbee et al. (2013) for a network approach to assess interbank liquidity, and Saldias (2013) for a spatial error model to identify sector risk determinants. The study closest to ours is Keiler and Eder (2013). They model CDS spreads of financial institutions in a static spatial lag model, additionally accounting for firm-specific covariates and market risk factors. Their spatial weight matrix is constructed from stock return correlations.

All of the above models, however, treat the spatial dependence parameter as static. To the best of our knowledge, explicitly endowing the spatial dependence parameter in the spatial lag model with time series dynamics is new to the literature. We model the dynamics using the generalized autoregressive score framework proposed by Creal et al. (2011, 2013) and Harvey (2013). Given the nonlinear impact of the time-varying parameter on the model, the theoretical properties of this model and the asymptotic properties of the maximum likelihood estimator (MLE) for the remaining static parameters are challenging and have not been established so far. We show under what conditions the filtered spatial dependence parameters are well behaved, such that the model is invertible. Invertibility is a key property for establishing consistency and asymptotic normality of the MLE; see for example Wintenberger (2013). We derive new conditions for the asymptotic properties of the MLE compared to Blasques et al. (2014b), allowing for exogenous regressors to be part of the specification. We also discuss the information theoretic optimality of the model and illustrate in a simulation study that the model is able to track a range of different patterns for the time-varying spatial dependence parameter.

Second, we contribute to the literature that studies the dynamics of financial systemic risk in the context of a network of sovereigns or financial firms. Since the beginning of the European sovereign debt crisis in 2009, the sharp increases and comovements of sovereign credit spreads have been the subject of a growing number of empirical studies. For instance, by employing an asset pricing model, Ang and Longstaff (2013) investigate the differences between U.S. and European credit default swap (CDS) spreads as a reflection of systemic risk. Lucas et al. (2014) and Kalbaska and Gatkowski (2012) use multivariate time series models to model comovements in European sovereign CDS spreads. De Santis (2012) and Arezki et al. (2011) study credit risk spillover effects that are induced by rating events, such as downgrades of Greek government bonds. Leschinski and Bertram (2013) find contagion effects in European sovereign bond spreads using the simultaneous equations approach of Pesaran and Pick (2007). Caporin et al. (2013), on the other hand, employ Bayesian quantile regressions, and conclude that comovements in European credit spreads during the debt crisis are only due to increased volatilities, but not contagion.


Our approach differs from the studies above since we introduce cross-sectional correlation not only through contemporaneous error correlations, but also through spillovers induced by shocks to the regressors, such as stock market crashes or interbank lending rates. Furthermore, we explicitly offer financial sector linkages as the source of sovereign credit risk comovements. This view is supported by the results of Korte and Steffen (2013), Kallestrup et al. (2013), Gorea and Radev (2013), and Beetsma et al. (2012), in which cross-border exposures between international financial sectors are relevant drivers of sovereign credit spreads. By exploiting these debt interconnections as economic distances between sovereigns in our spatial model, we obtain a scalar time-varying (spatial) dependence coefficient. We interpret this parameter in the systemic context as the overall tendency for shock spillovers. As such, it provides a measure of systemic risk that is easy to monitor over time.

The remainder of the paper is organized as follows. Section 2 introduces our spatial model with time-varying parameters. We examine its theoretical properties in Section 3. In Section 4, we provide Monte Carlo evidence of the model's ability to track different dynamic patterns in spatial dependence over time. We provide our study of European sovereign CDS spread dynamics in Section 5. Section 6 concludes.

2 Spatial models with dynamic spatial dependence

2.1 Static spatial lag model for panel data

The spatial lag model for panel data is given by

$$ y_t = \rho W y_t + X_t\beta + e_t, \qquad e_t \sim p_e(e_t, \Sigma; \lambda), \qquad t = 1, \ldots, T, \tag{1} $$

where $y_t$ denotes a vector of $n$ cross-sectional observations at time $t$, $\rho$ is the spatial dependence coefficient, $W$ is an $n \times n$ matrix of exogenous spatial weights, $X_t$ is an $n \times k$ matrix of exogenous regressors, $\beta$ is a $k \times 1$ vector of unknown coefficients, including an intercept, and the $n \times 1$ vector $e_t$ is the disturbance vector with multivariate density $p_e(e_t, \Sigma; \lambda)$ that has mean zero, an unknown $n \times n$ variance (or scale) matrix $\Sigma$ and possibly other parameters collected in the vector $\lambda$. For example, $p_e$ may represent the Student's t distribution, where $\lambda$ is then the degrees of freedom parameter. Model (1) implies that each entry $y_{it}$, for $i = 1, \ldots, n$, of the vector $y_t$ depends on $k$ individual-specific regressors $x_{it}$ as well as possibly on other entries $y_{jt}$ for $j \neq i$. For a moderately large $n$, we cannot estimate such a system of contemporaneous dependencies without imposing further restrictions. The idea of spatial dependence modelling is to specify the spatial weight matrix $W$ as a function of geographic or economic distances, and in this way exogenously define a neighborhood structure between the cross-sectional units. It is standard to row-normalize $W$ such that $\sum_{j=1}^{n} w_{ij} = 1$ for $i = 1, \ldots, n$, where $w_{ij}$ is the $(i,j)$th element of $W$. The impact of the (spatially weighted) contemporaneous dependent variables $W y_t$ on $y_t$ is captured by the scalar spatial dependence parameter $\rho$. For shocks to die out over space, we require $\rho \in (1/\omega_{\min}, 1)$, where $\omega_{\min}$ is the smallest eigenvalue of $W$; see Lee (2004).
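As an illustration of the row-normalization and of the admissible range for the spatial dependence parameter, the following minimal sketch builds a small weight matrix and computes the interval from its eigenvalues. The exposure numbers and the helper name `row_normalize` are hypothetical and serve only as an example; they are not taken from the data used in this paper.

```python
import numpy as np

def row_normalize(raw_weights):
    """Scale each row of a non-negative weight matrix so that it sums to one."""
    W = np.asarray(raw_weights, dtype=float)
    row_sums = W.sum(axis=1, keepdims=True)
    if np.any(row_sums == 0.0):
        raise ValueError("every unit needs at least one neighbour")
    return W / row_sums

# Hypothetical symmetric exposures between three units (zero diagonal).
raw = np.array([[0.0, 2.0, 1.0],
                [2.0, 0.0, 3.0],
                [1.0, 3.0, 0.0]])
W = row_normalize(raw)

# Row-normalizing a symmetric matrix keeps the eigenvalues real,
# so taking the real part only discards numerical noise.
eigenvalues = np.linalg.eigvals(W).real
omega_min = eigenvalues.min()

# Admissible interval (1 / omega_min, 1) for the spatial dependence parameter.
rho_lower, rho_upper = 1.0 / omega_min, 1.0
print(f"admissible range for rho: ({rho_lower:.3f}, {rho_upper:.3f})")
```

For a row-normalized matrix with non-negative entries the largest eigenvalue equals one, which is why the upper end of the interval is 1.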

We show that the basic form of the spatial lag model (1) can capture nonlinear feedback effects across units by rewriting the model as

$$ y_t = Z X_t\beta + Z e_t, \tag{2} $$

where we assume that the inverse matrix $Z = (I_n - \rho W)^{-1}$ exists, with $I_n$ the $n \times n$ identity matrix. Using the infinite power series expansion as in LeSage and Pace (2008), we obtain

$$ y_t = X_t\beta + \rho W X_t\beta + \rho^2 W^2 X_t\beta + \cdots + e_t + \rho W e_t + \rho^2 W^2 e_t + \cdots. \tag{3} $$

Equation (3) reveals that shocks $e_{it}$ or $x_{it}'\beta$ to unit $i$ spill over to other units $j \neq i$ to an extent that depends on their relative proximity to $i$ via the weight matrix $W$ and the spatial dependence parameter $\rho$. At the same time, there are also possible feedback effects back to unit $i$ itself, for example if $w_{ij}$ and $w_{ji}$ are both non-zero, such that $i$ and $j$ are mutual neighbors, and $i$ is a 'second-order neighbor' to itself.
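The reduced form and its power series expansion can be checked numerically. The sketch below assumes a hypothetical ring network of four units and arbitrary values for the spatial dependence parameter, the regressors and the coefficients; none of these come from the paper. It verifies that the truncated series approaches the inverse matrix and that the squared weight matrix has a positive diagonal, which is the 'second-order neighbor' feedback described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 4, 2

# Hypothetical ring network: each unit has two neighbours with weight 1/2.
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5

rho = 0.6                      # illustrative value inside (-1, 1)
X = rng.normal(size=(n, k))    # illustrative regressors
beta = np.array([1.0, -0.5])   # illustrative coefficients
e = rng.normal(size=n)

# Reduced form: y = Z X beta + Z e with Z = (I - rho W)^{-1}.
Z = np.linalg.inv(np.eye(n) - rho * W)
y = Z @ (X @ beta + e)

def truncated_series(rho, W, terms):
    """Return sum_{j=0}^{terms-1} rho^j W^j, the truncated power series."""
    total = np.zeros_like(W)
    power = np.eye(W.shape[0])
    for _ in range(terms):
        total += power
        power = rho * (W @ power)
    return total

Z_series = truncated_series(rho, W, terms=200)

# Diagonal of W^2: weight of the path i -> j -> i (feedback to unit i itself).
feedback = (W @ W)[0, 0]
print(np.max(np.abs(Z - Z_series)), feedback)
```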

The simultaneous structure of (1) causes an endogeneity problem and renders the least squares estimator of the coefficients inconsistent when $n$ becomes large. An alternative in the cross-sectional literature is to estimate the parameters by the method of Maximum Likelihood (ML) or Quasi-ML (QML), where the latter is typically based on the normal distribution.¹ The ML estimator (MLE) for spatial models with a static dependence parameter was first studied in Ord (1975) in the context of cross-sectional data sets. Lee (2004) derives asymptotic properties of the Quasi MLE (QMLE) for $n \to \infty$, and Hillier and Martellosio (2013) investigate its finite sample distribution. Large $n$ and large $T$ asymptotics for the QMLE of the spatial model with static dependence parameter are studied in Yu et al. (2008). For further textbook treatments of spatial econometric models and their estimation, we refer to Anselin (1988) and LeSage and Pace (2008). For a survey on the panel data spatial lag model and parameter estimation, see Lee and Yu (2010).

¹ Alternatively, we can use GMM as in, for example, Kelejian and Prucha (2010).


2.2 Score dynamics for the spatial dependence parameter

We can interpret the spatial dependence parameter $\rho$ in (1) as a measure of the strength of cross-sectional spillovers. In many empirical applications involving panel data, it is unrealistic to assume that $\rho$ is constant over the entire sample period. We therefore propose to introduce a time-varying $\rho$ in the model, that is

$$ y_t = \rho_t W y_t + X_t\beta + e_t, \qquad e_t \sim p_e(e_t, \Sigma; \lambda), \qquad t = 1, \ldots, T, \tag{4} $$

where $\rho_t = h(f_t)$ is a monotonic transformation of a time-varying parameter $f_t$. We choose the link function $h$ such that $\rho_t \in (-1, 1)$. We adopt the autoregressive score framework of Creal et al. (2011, 2013) and Harvey (2013) to introduce a time-varying $f_t$. The score framework for time-varying parameters has been adopted successfully in a range of different model settings, including the multivariate volatility model of Creal et al. (2011), the systemic risk model of Oh and Patton (2013) and Lucas et al. (2014), the credit risk dynamic factor model of Creal et al. (2014), the location and scale models with fat tails of Harvey and Luati (2014), and many more.²

The score framework is based on the scaled score of the conditional density $p_e$ to drive the time-variation in $f_t$. The updating equation for $f_t$ is given by

$$ f_{t+1} = \omega + A s_t + B f_t, \tag{5} $$

where $\omega$ is a scalar coefficient, $A$ is the score coefficient, $s_t = S_t \nabla_t$ is the scaled score function and $B$ is the autoregressive (or mean-reverting) coefficient. All three coefficients are treated as fixed and unknown parameters. The scaled score function is defined as the first derivative of the predictive loglikelihood function at time $t$ with respect to $f_t$, possibly multiplied by some local scaling factor $S_t$. The score function is given by $\nabla_t = \partial \ell_t / \partial f_t$, where

$$ \ell_t = \ln p_e\bigl(y_t - h(f_t) W y_t - X_t\beta, \Sigma; \lambda\bigr) + \ln \bigl| I_n - h(f_t) W \bigr|. \tag{6} $$

Throughout this paper, we use unit scaling, that is $S_t \equiv 1$ such that $s_t = \nabla_t$. Other scaling choices are also feasible; see Creal et al. (2013).³ Equation (6) differs from the likelihood of a simple linear regression model by the term $\ln |I_n - h(f_t) W|$. This term accounts for the nonlinearity of the model in $\rho_t$ as shown in equation (2). We define the vector of static parameters $\theta = (\omega, A, B, \beta, \lambda)'$ and estimate $\theta$ via the numerical maximization of the likelihood function given by

$$ \ell_T = \sum_{t=1}^{T} \ell_t. \tag{7} $$

² See www.gasmodel.com for a full compilation.

³ In a simulation (not reported here) it turned out that different choices of scaling, such as scaling by the inverse information matrix or by its square root, did not have much impact on our empirical results.
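The log-determinant term in (6) can be traced to a change of variables from the disturbance vector to the observation vector. The following derivation is a sketch under the assumption that $I_n - h(f_t) W$ is nonsingular, which holds for instance when $W$ is row-normalized with non-negative entries and $\rho_t \in (-1, 1)$; the notation $p(y_t \mid f_t, X_t)$ for the conditional density of the observations is introduced here for illustration only.

```latex
\begin{align*}
e_t &= \bigl(I_n - h(f_t) W\bigr) y_t - X_t\beta
  \quad\Longrightarrow\quad
  \frac{\partial e_t}{\partial y_t'} = I_n - h(f_t) W, \\
p(y_t \mid f_t, X_t)
  &= p_e\bigl(y_t - h(f_t) W y_t - X_t\beta, \Sigma; \lambda\bigr)
     \,\bigl|\det\bigl(I_n - h(f_t) W\bigr)\bigr|, \\
\ell_t = \ln p(y_t \mid f_t, X_t)
  &= \ln p_e\bigl(y_t - h(f_t) W y_t - X_t\beta, \Sigma; \lambda\bigr)
     + \ln\bigl|I_n - h(f_t) W\bigr|.
\end{align*}
```

The Jacobian of the transformation is what distinguishes the spatial lag likelihood from that of a regression in which $W y_t$ would be treated as an exogenous regressor.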

We consider two specifications for the error term density $p_e$, namely the multivariate normal distribution and the multivariate Student's t distribution. The latter is particularly relevant for our empirical study because spread changes in credit default swaps (CDS) may be fat-tailed. Also, Harvey and Luati (2014) argue that the Student's t distribution can render the dynamics more robust to incidental influential observations and outliers. Using the standard expression for the multivariate normal density, we obtain the time $t$ contribution to the loglikelihood function as

$$ \ell_t = \ln |I_n - h(f_t) W| - \frac{n}{2}\ln(2\pi) - \frac{1}{2}\ln|\Sigma| - \frac{1}{2}\bigl(y_t - h(f_t) W y_t - X_t\beta\bigr)'\Sigma^{-1}\bigl(y_t - h(f_t) W y_t - X_t\beta\bigr), \tag{8} $$

and score

$$ \nabla_t = \Bigl( y_t' W' \Sigma^{-1}\bigl(y_t - h(f_t) W y_t - X_t\beta\bigr) - \mathrm{tr}\bigl(Z(f_t) W\bigr) \Bigr) \cdot \dot h(f_t), \tag{9} $$

where $\mathrm{tr}(\cdot)$ is the trace operator, $Z(f_t) = (I_n - h(f_t) W)^{-1}$, and $\dot h(f_t)$ is the first derivative of the transformation function $h$ with respect to $f_t$. For instance, if $h(f_t) = \gamma \tanh(f_t)$ with $\gamma \in (0, 1)$, then $\dot h(f_t) = \gamma (1 - \tanh^2(f_t))$. When the density of the disturbance vector $e_t$ is a multivariate Student's t with $\lambda$ degrees of freedom, the $t$th likelihood contribution is given by

$$ \ell_t = \ln |Z(f_t)^{-1}| + \ln\left( \frac{\Gamma\bigl(\frac{\lambda + n}{2}\bigr)}{|\Sigma|^{1/2} (\lambda\pi)^{n/2}\, \Gamma\bigl(\frac{\lambda}{2}\bigr)} \right) - \frac{\lambda + n}{2} \ln\left( 1 + \frac{\bigl(y_t - h(f_t) W y_t - X_t\beta\bigr)'\Sigma^{-1}\bigl(y_t - h(f_t) W y_t - X_t\beta\bigr)}{\lambda} \right), $$

with the score function given by

$$ \nabla_t = \Bigl( w_t \cdot y_t' W' \Sigma^{-1}\bigl(y_t - h(f_t) W y_t - X_t\beta\bigr) - \mathrm{tr}\bigl(Z(f_t) W\bigr) \Bigr) \cdot \dot h(f_t), \tag{10} $$

where

$$ w_t = \frac{1 + \lambda^{-1} n}{1 + \lambda^{-1}\bigl(y_t - h(f_t) W y_t - X_t\beta\bigr)'\Sigma^{-1}\bigl(y_t - h(f_t) W y_t - X_t\beta\bigr)}. $$

We can verify that if $\lambda \to \infty$, we have $w_t \to 1$ and the score expression in (10) collapses to the one in (9). The weight $w_t$ is small if the residuals $y_t - h(f_t) W y_t - X_t\beta$ are 'large' in a multivariate sense. The implication of a small weight $w_t$ is that the observation has a smaller impact on the update of $f_t$. This provides a robustness feature to the dynamics of $f_t$ if we assume a fat-tailed distribution such as the Student's t; see also the discussion in Creal et al. (2011) and Harvey (2013). The intuition is straightforward: a large residual may be attributable to the fat-tailedness of the Student's t distribution rather than to a recent increase in the spatial correlation $h(f_t)$.
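The score expressions and the limiting behaviour of the weight can be verified numerically. The sketch below codes the Gaussian and Student's t likelihood contributions and their scores, and compares each score with a central finite difference of the corresponding loglikelihood. The link scale `GAMMA`, the ring weight matrix, the scale matrix and all data values are hypothetical choices made for illustration; the function names are likewise not part of the original text.

```python
import numpy as np
from math import lgamma, log, pi

GAMMA = 0.95  # hypothetical scale in the link h(f) = GAMMA * tanh(f)

def h(f):
    return GAMMA * np.tanh(f)

def h_dot(f):
    return GAMMA * (1.0 - np.tanh(f) ** 2)

def loglik(f, y, X, beta, W, Sigma, lam=None):
    """Time-t loglikelihood: Gaussian if lam is None, else Student's t."""
    n = y.size
    resid = y - h(f) * (W @ y) - X @ beta
    q = resid @ np.linalg.solve(Sigma, resid)
    logdet_jac = np.linalg.slogdet(np.eye(n) - h(f) * W)[1]
    logdet_sig = np.linalg.slogdet(Sigma)[1]
    if lam is None:
        return logdet_jac - 0.5 * n * log(2 * pi) - 0.5 * logdet_sig - 0.5 * q
    const = (lgamma(0.5 * (lam + n)) - lgamma(0.5 * lam)
             - 0.5 * n * log(lam * pi) - 0.5 * logdet_sig)
    return logdet_jac + const - 0.5 * (lam + n) * np.log1p(q / lam)

def score(f, y, X, beta, W, Sigma, lam=None):
    """Analytical score with respect to f; w = 1 gives the Gaussian case."""
    n = y.size
    resid = y - h(f) * (W @ y) - X @ beta
    sinv_resid = np.linalg.solve(Sigma, resid)
    w = 1.0 if lam is None else (1.0 + n / lam) / (1.0 + resid @ sinv_resid / lam)
    Z = np.linalg.inv(np.eye(n) - h(f) * W)
    return (w * ((W @ y) @ sinv_resid) - np.trace(Z @ W)) * h_dot(f)

def numerical_score(f, *args, eps=1e-6, **kwargs):
    """Central finite difference of the loglikelihood in f."""
    return (loglik(f + eps, *args, **kwargs) - loglik(f - eps, *args, **kwargs)) / (2 * eps)

# Illustrative data: ring network with four units.
rng = np.random.default_rng(1)
n = 4
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5
X = rng.normal(size=(n, 2))
beta = np.array([0.5, -1.0])
Sigma = np.diag([1.0, 0.5, 2.0, 1.5])
y = rng.normal(size=n)
f = 0.3
args = (y, X, beta, W, Sigma)

gauss_pair = (score(f, *args), numerical_score(f, *args))
t_pair = (score(f, *args, lam=5.0), numerical_score(f, *args, lam=5.0))
print(gauss_pair, t_pair)
```

With a very large degrees of freedom value the Student's t score is numerically indistinguishable from the Gaussian one, in line with the limit discussed above.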

The score expressions in (9) and (10) also depart from the expressions for the standard linear regression model. In particular, we have the additional correction term $-\mathrm{tr}(Z(f_t) W)$. This term accounts for the simultaneity bias in the standard least squares estimator and follows from the presence of the term $\ln |I_n - h(f_t) W| = -\ln |Z(f_t)|$ in the likelihood at time $t$. In effect, this term accounts for the fact that there may be feedback effects from unit $i$ to unit $j$ and then back to unit $i$. Hence the spatial autoregressive score model integrates time-varying direct effects and indirect effects; both are used to determine the appropriate transition dynamics for $\rho_t$.
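To make the recursion concrete, the following minimal sketch runs the score update (5) with the Gaussian score (9) over a simulated panel and accumulates the loglikelihood (7). It is a simplified illustration rather than the estimation code used in this paper: it assumes $\Sigma = \sigma^2 I_n$, an intercept as the only regressor, a hypothetical link scale `GAMMA`, a ring weight matrix, and arbitrary parameter values that are not estimates.

```python
import numpy as np

GAMMA = 0.95  # hypothetical scale in the link h(f) = GAMMA * tanh(f)

def gaussian_spatial_gas_filter(y, X, W, omega, A, B, beta, sigma2, f1):
    """Run the score recursion with Gaussian errors and Sigma = sigma2 * I.

    Returns the filtered path rho_t = h(f_t) and the summed loglikelihood.
    """
    T, n = y.shape
    I_n = np.eye(n)
    f = f1
    rho_path = np.empty(T)
    loglik = 0.0
    for t in range(T):
        rho = GAMMA * np.tanh(f)
        rho_dot = GAMMA * (1.0 - np.tanh(f) ** 2)
        Wy = W @ y[t]
        resid = y[t] - rho * Wy - X[t] @ beta
        jac = I_n - rho * W
        # Gaussian loglikelihood contribution including the log-determinant.
        loglik += (np.linalg.slogdet(jac)[1]
                   - 0.5 * n * np.log(2.0 * np.pi * sigma2)
                   - 0.5 * resid @ resid / sigma2)
        # Unit-scaled score: tr(Z W) is computed as tr(solve(I - rho W, W)).
        score = (Wy @ resid / sigma2 - np.trace(np.linalg.solve(jac, W))) * rho_dot
        rho_path[t] = rho
        f = omega + A * score + B * f
    return rho_path, loglik

# Simulate a panel with a slowly moving true spatial dependence parameter.
rng = np.random.default_rng(4)
T, n = 300, 5
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5
X = np.ones((T, n, 1))            # intercept only
beta = np.array([0.1])
rho_true = 0.5 + 0.3 * np.sin(2.0 * np.pi * np.arange(T) / T)
y = np.empty((T, n))
for t in range(T):
    shock = X[t] @ beta + rng.normal(size=n)
    y[t] = np.linalg.solve(np.eye(n) - rho_true[t] * W, shock)

rho_path, loglik = gaussian_spatial_gas_filter(
    y, X, W, omega=0.05, A=0.05, B=0.9, beta=beta, sigma2=1.0,
    f1=np.arctanh(0.5 / GAMMA))
print(loglik, np.corrcoef(rho_path, rho_true)[0, 1])
```

In an estimation exercise the returned loglikelihood would be maximized numerically over the static parameters; the parameter values above are fixed by hand only to show the mechanics of the update.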

3 Statistical properties of the model

In this section, we establish the existence, strong consistency and asymptotic normality of the

MLE of the static parameters θ that define the stochastic properties of the spatial score model

from Section 2. We choose to first present the results in a more general setting than the spatial

score model, thus extending the results in Blasques et al. (2014b) to allow for the presence of

exogenous regressors. We then particularize the results to the MLE for the spatial score model

in Corollary 1. We also discuss how the spatial score update is the optimal observation-driven

parameter update in an information theoretic sense, thus providing a further theoretical backbone

to the use of the score for updating ft. All proofs of the results stated in this section can be found

in the appendix.

3.1 Stochastic properties of the filtered spatial dependence parameter

To establish the consistency and asymptotic normality of the MLE, we first study the stochastic

properties of the filtered parameter ft defined through equations (5), (9), and (10). The filtered ft

directly determine the time-varying spatial parameter ρt = h(ft). Understanding the properties of

the filtered parameters is key to understanding the stochastic properties of the likelihood function

over the parameter space Θ.

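We first need some additional notation. Let the $T$-period sequences $\{y_t(\omega)\}_{t=1}^{T}$ and $\{X_t(\omega)\}_{t=1}^{T}$ be subsets of the realized path of $n$- and $k$-variate stochastic sequences $y(\omega) := \{y_t(\omega)\}_{t \in \mathbb{Z}}$ and $X(\omega) := \{X_t(\omega)\}_{t \in \mathbb{Z}}$, for some $\omega$ in the event space $\Omega$. In particular,⁴ we let $y_t(\omega) \in \mathcal{Y} \subseteq \mathbb{R}^n$ $\forall\, (\omega, t) \in \Omega \times \mathbb{Z}$ and $X_t(\omega) \in \mathcal{X} \subseteq \mathbb{R}^k$ $\forall\, (\omega, t) \in \Omega \times \mathbb{Z}$. For every $\omega \in \Omega$, the stochastic sequences $y(\omega)$ and $X(\omega)$ thus live on the spaces $(\mathcal{Y}_\infty, \mathcal{B}(\mathcal{Y}_\infty), P_0^y)$ and $(\mathcal{X}_\infty, \mathcal{B}(\mathcal{X}_\infty), P_0^X)$, where the probability measures $P_0^y$ and $P_0^X$ are defined over the elements of the Borel $\sigma$-algebras $\mathcal{B}(\mathcal{Y}_\infty)$ and $\mathcal{B}(\mathcal{X}_\infty)$. We write the filtered time-varying parameter as $\tilde f_t$ to distinguish it from the true time-varying parameter $f_t$. More precisely, we write the filtered time-varying parameter as $\{\tilde f_t(y^{1:t-1}, X^{1:t-1}; \theta, f_1)\}_{t \in \mathbb{N}}$, which depends naturally on the initialization $f_1 \in \mathcal{F} \subseteq \mathbb{R}$, the past data $y^{1:t-1} = \{y_s\}_{s=1}^{t-1}$ and $X^{1:t-1} = \{X_s\}_{s=1}^{t-1}$, and the parameter vector $\theta \in \Theta$. For notational simplicity we often omit the dependence on the data and write $\{\tilde f_t(\theta, f_1)\}_{t \in \mathbb{N}}$ instead.

We can now rewrite the score update in (5) as

$$ \tilde f_{t+1}(\theta, f_1) = \omega + A\, s\bigl(\tilde f_t(\theta, f_1), y_t, X_t; \beta, \lambda\bigr) + B \tilde f_t(\theta, f_1) \quad \forall\, t \in \mathbb{N}, $$

where $s(\tilde f_t(\theta, f_1), y_t, X_t; \beta, \lambda)$ denotes the unit scaled score function. To shorten the notation, we define the random function

$$ \phi_t\bigl(\tilde f_t(\theta, f_1); \theta\bigr) := \phi\bigl(\tilde f_t(\theta, f_1), y_t, X_t; \theta\bigr) := \omega + A\, s\bigl(\tilde f_t(\theta, f_1), y_t, X_t; \beta, \lambda\bigr) + B \tilde f_t(\theta, f_1), $$

as well as the supremum of its derivative,

$$ \phi'_t(\theta) := \sup_{f \in \mathcal{F}} \left| A\, \frac{\partial s(f, y_t, X_t; \beta, \lambda)}{\partial f} + B \right|. \tag{11} $$

Note that $\phi'_t(\theta)$ is also a random variable due to its dependence on the data.

The following theorem states sufficient conditions for the stochastic sequence $\{\tilde f_t(\theta, f_1)\}_{t \in \mathbb{N}}$ initialized at $f_1 \in \mathcal{F}$ to converge almost surely, uniformly in $\theta \in \Theta$, and exponentially fast to a limit stationary and ergodic (SE) sequence $\{\tilde f_t(\theta)\}_{t \in \mathbb{Z}}$ that has $N_f$ bounded moments. We repeatedly make use of this notion of uniform exponentially fast almost sure (e.a.s.) convergence, which means that $\exists\, \gamma > 1$ such that

$$ \sup_{\theta \in \Theta} \gamma^t \left| \tilde f_t\bigl(y^{1:t-1}, X^{1:t-1}, \theta, f_1\bigr) - \tilde f_t\bigl(y^{t-1}, X^{t-1}, \theta\bigr) \right| \xrightarrow{a.s.} 0 \quad \text{as } t \to \infty; $$

see Straumann and Mikosch (2006). Note that the limit sequence starts in the infinite past and hence depends on the infinite past data $y^{t-1} := \{y_s\}_{s=-\infty}^{t-1}$ and $X^{t-1} := \{X_s\}_{s=-\infty}^{t-1}$, i.e., $\{\tilde f_t(\theta)\}_{t \in \mathbb{Z}} \equiv \{\tilde f_t(y^{t-1}, X^{t-1}; \theta)\}_{t \in \mathbb{Z}}$. We thus establish the convergence of the sequence of random functions $\{\tilde f_t(\cdot, f_1)\}_{t \in \mathbb{N}}$ defined on $\Theta$ with random elements taking values in the Banach space $(C(\Theta, \mathcal{F}), \|\cdot\|_\Theta)$ for every $t \in \mathbb{N}$, to an SE limit $\{\tilde f_t(\cdot)\}_{t \in \mathbb{Z}}$ with elements taking values in $(C(\Theta), \|\cdot\|_\Theta)$, where $\|\cdot\|_\Theta$ denotes the supremum norm on $\Theta$. We have the following result.

⁴ The random sequences $y$ and $X$ are thus $F/\mathcal{B}(\mathcal{Y}_\infty)$- and $F/\mathcal{B}(\mathcal{X}_\infty)$-measurable mappings $y : \Omega \to \mathcal{Y}_\infty \subseteq \mathbb{R}^n_\infty$ and $X : \Omega \to \mathcal{X}_\infty \subseteq \mathbb{R}^k_\infty$, where $\mathbb{R}^n_\infty := \times_{t=-\infty}^{t=\infty} \mathbb{R}^n$ and $\mathbb{R}^k_\infty := \times_{t=-\infty}^{t=\infty} \mathbb{R}^k$ denote Cartesian products of infinite copies of $\mathbb{R}^n$ and $\mathbb{R}^k$ respectively, and $\mathcal{Y}_\infty = \times_{t=-\infty}^{t=\infty} \mathcal{Y}$ and $\mathcal{X}_\infty = \times_{t=-\infty}^{t=\infty} \mathcal{X}$ with $\mathcal{B}(\mathcal{Y}_\infty) \equiv \mathcal{B}(\mathbb{R}^n_\infty) \cap \mathcal{Y}_\infty$ and $\mathcal{B}(\mathcal{X}_\infty) \equiv \mathcal{B}(\mathbb{R}^k_\infty) \cap \mathcal{X}_\infty$; see Billingsley (1995, p. 159). Here, $\mathcal{B}(\mathbb{R}^n_\infty)$ and $\mathcal{B}(\mathbb{R}^k_\infty)$ denote the Borel $\sigma$-algebras generated by the finite dimensional product cylinders of $\mathbb{R}^n_\infty$ and $\mathbb{R}^k_\infty$ respectively, $F$ denotes a $\sigma$-field defined on the event space $\Omega$, and together with the probability measure $P_0$ on $F$, the triplet $(\Omega, F, P_0)$ denotes the common underlying complete probability space of interest.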

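THEOREM 1. Let $\mathcal{F}$ be convex, $\Theta$ be compact, $\{y_t\}_{t \in \mathbb{Z}}$ and $\{X_t\}_{t \in \mathbb{Z}}$ be SE, $s \in C(\mathcal{F} \times \mathcal{Y} \times \mathcal{X} \times \mathcal{B} \times \Lambda)$ and assume there exists a non-random $f_1 \in \mathcal{F}$ such that

(i) $E \ln^+ \sup_{(\beta, \lambda) \in \mathcal{B} \times \Lambda} |s(f_1, y_t, X_t; \beta, \lambda)| < \infty$;

(ii) $E \ln \sup_{\theta \in \Theta} \phi'_1(\theta) < 0$.

Then $\{\tilde f_t(\theta, f_1)\}_{t \in \mathbb{N}}$ converges e.a.s. to the unique limit SE process $\{\tilde f_t(\theta)\}_{t \in \mathbb{Z}}$.

If furthermore $\exists\, N_f \geq 1$ such that

(iii) $E \sup_{(\beta, \lambda) \in \mathcal{B} \times \Lambda} |s(f_1, y_t, X_t; \beta, \lambda)|^{N_f} < \infty$;

and either

(iv) $\sup_{(\beta, \lambda) \in \mathcal{B} \times \Lambda} |s(f, y, X; \beta, \lambda) - s(f', y, X; \beta, \lambda)| < |f - f'|$ $\forall\, (f, f', y, X) \in \mathcal{F} \times \mathcal{F} \times \mathcal{Y} \times \mathcal{X}$;

or

(iv′) $E \sup_{\theta \in \Theta} \phi'_1(\theta)^{N_f} < 1$ and $\tilde f_t(\theta, f_1) \perp \phi'_t(\theta)$ $\forall\, (t, f_1) \in \mathbb{N} \times \mathcal{F}$, where $\perp$ denotes independence;

then both $\{\tilde f_t(\theta, f_1)\}_{t \in \mathbb{N}}$ and the limit SE process $\{\tilde f_t(\theta)\}_{t \in \mathbb{Z}}$ have $N_f$ bounded moments, i.e., $\sup_t E \sup_{\theta \in \Theta} |\tilde f_t(\theta, f_1)|^{N_f} < \infty$ and $E \sup_{\theta \in \Theta} |\tilde f_t(\theta)|^{N_f} < \infty$.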
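The practical content of the e.a.s. convergence in Theorem 1 is that the effect of the initialization dies out: two filtered paths started at different values but driven by the same data should merge. The sketch below illustrates this on simulated data. It is a numerical illustration and not a verification of conditions (i) and (ii); it assumes Gaussian errors with identity scale matrix, no regressors, a hypothetical link scale `GAMMA`, a ring weight matrix, and hand-picked parameter values with a small score coefficient.

```python
import numpy as np

GAMMA = 0.9  # hypothetical scale in the link h(f) = GAMMA * tanh(f)

def filter_path(y, W, omega, A, B, f1):
    """Gaussian score filter with no regressors and Sigma = I (a simplification)."""
    n = W.shape[0]
    f = f1
    path = np.empty(len(y))
    for t, y_t in enumerate(y):
        path[t] = f
        rho = GAMMA * np.tanh(f)
        rho_dot = GAMMA * (1.0 - np.tanh(f) ** 2)
        Wy = W @ y_t
        jac = np.eye(n) - rho * W
        score = (Wy @ (y_t - rho * Wy) - np.trace(np.linalg.solve(jac, W))) * rho_dot
        f = omega + A * score + B * f
    return path

# Simulate data from a static spatial lag model with rho = 0.5.
rng = np.random.default_rng(2)
n, T = 4, 400
W = np.zeros((n, n))
for i in range(n):
    W[i, (i - 1) % n] = 0.5
    W[i, (i + 1) % n] = 0.5
Z_true = np.linalg.inv(np.eye(n) - 0.5 * W)
y = rng.normal(size=(T, n)) @ Z_true.T

# Same data and parameters, two different initializations.
path_a = filter_path(y, W, omega=0.1, A=0.01, B=0.8, f1=-1.0)
path_b = filter_path(y, W, omega=0.1, A=0.01, B=0.8, f1=2.0)
gap = np.abs(path_a - path_b)
print(gap[0], gap[50], gap[-1])
```

For parameter values with a larger score coefficient the update need not be contracting, which is what condition (ii) rules out on average.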

The first claim of Theorem 1 makes use of the conditions in Bougerol (1993a). Condition (i)

requires the existence of an arbitrarily small moment for the score, and condition (ii) requires the

spatial score update to be contracting on average. The uniqueness of the SE limit follows from

Straumann and Mikosch (2006). The second claim of Theorem 1 uses stricter moment conditions

and contraction conditions to obtain bounded moments of higher order for the filtered sequence.

This constitutes an extension of Proposition 1 in Blasques et al. (2014b) to the spatial score setting

with exogenous random variables Xt as well as vector and matrix arguments. Remark 1 below

highlights that in the special case where the score is uniformly bounded, the filter has infinitely many bounded moments under simpler conditions.

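REMARK 1. Let $|B| < 1$. If $\sup_{(\beta, \lambda, f, y, X) \in \mathcal{B} \times \Lambda \times \mathcal{F} \times \mathcal{Y} \times \mathcal{X}} |s(f, y, X; \beta, \lambda)| < \infty$, then $\sup_t E \sup_{\theta \in \Theta} |\tilde f_t(\theta, f_1)|^{N_f} < \infty$ holds for every $N_f \geq 1$.

The proof of this statement follows immediately by noting that $f_{t+1} = \sum_{j=0}^{t-1} B^j (\omega + A s_{t-j}) + B^t f_1$, and hence that

$$ |f_{t+1}| \leq \sum_{j=0}^{t-1} |B|^j |\omega + A s_{t-j}| + |B|^t |f_1| \leq \sum_{j=0}^{t-1} |B|^j |\omega| + \sum_{j=0}^{t-1} |B|^j |A| \sup|s| + |B|^t |f_1| < \infty $$

because $\sup|s| < \infty$. Since $|B| < 1$, the geometric sums are bounded by $1/(1 - |B|)$, so the bound holds uniformly in $t$.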
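The unrolled form of the recursion and the resulting bound can be checked with a few lines of code. The sketch below uses an arbitrary bounded sequence in place of the score and hypothetical parameter values; it compares the iterated recursion (5) with its closed form and with the bound $(|\omega| + |A| \sup|s|)/(1 - |B|) + |f_1|$.

```python
import numpy as np

def iterate(omega, A, B, scores, f1):
    """Apply f_{t+1} = omega + A * s_t + B * f_t for t = 1, ..., len(scores)."""
    f = f1
    for s_t in scores:
        f = omega + A * s_t + B * f
    return f

def unrolled(omega, A, B, scores, f1):
    """Closed form: sum_{j=0}^{t-1} B^j (omega + A s_{t-j}) + B^t f_1."""
    t = len(scores)
    # scores[0] holds s_1, so s_{t-j} is stored at index t - 1 - j.
    total = sum(B ** j * (omega + A * scores[t - 1 - j]) for j in range(t))
    return total + B ** t * f1

rng = np.random.default_rng(3)
omega, A, B, f1 = 0.1, 0.3, -0.7, 1.5       # illustrative values with |B| < 1
s_bar = 2.0                                  # uniform bound on the score
scores = rng.uniform(-s_bar, s_bar, size=50)

bound = (abs(omega) + abs(A) * s_bar) / (1.0 - abs(B)) + abs(f1)
values = [iterate(omega, A, B, scores[:t], f1) for t in range(1, 51)]
print(max(abs(v) for v in values), bound)
```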

3.2 Asymptotic properties of the maximum likelihood estimator

The observation-driven structure of the time-varying spatial model allows for a simple implementation of a maximum likelihood (ML) estimation procedure. Following equation (7), we define the ML estimator (MLE) of the spatial score parameters more precisely as an element of the arg max set of the sample log likelihood function $\ell_T(\theta, f_1)$,

$$ \hat\theta_T(f_1) \in \arg\max_{\theta \in \Theta} \ell_T(\theta, f_1), \tag{12} $$

where

$$ \ell_T(\theta, f_1) = \frac{1}{T} \sum_{t=1}^{T} \ell_t(\theta, f_1) = \frac{1}{T} \sum_{t=1}^{T} \Bigl[ \log p_e\bigl(y_t - h(\tilde f_t(\theta, f_1)) W y_t - X_t\beta; \lambda\bigr) - \log \bigl| Z(\tilde f_t(\theta, f_1)) \bigr| \Bigr], $$

with $Z(f_t)$ defined below (9).

We can now use Theorem 1 to establish existence, consistency and asymptotic normality of

the MLE of the static parameters in the time-varying spatial model. For existence, we make the

following assumptions.

ASSUMPTION 1. (Θ,B(Θ)) is a measurable space and Θ is a compact set. Furthermore, h :

F → F ⊆ R and pe : Rn × Λ→ R are continuously differentiable in their arguments.

In Section 2, we have opted for the unit scaling of the score in our model. We can easily generalize

all results below to the case of a non-constant scaling function S as long as we assume S : F → R

is sufficiently smooth. Theorem 2 below establishes the existence and measurability of the MLE.

THEOREM 2. (Existence) Let Assumption 1 hold. Then there exists a.s. an $F/\mathcal{B}(\Theta)$-measurable map $\hat\theta_T(f_1) : \Omega \to \Theta$ satisfying (12) for all $T \in \mathbb{N}$ and every initialization $f_1 \in \mathcal{F}$.

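To obtain consistency of the MLE, we impose conditions that ensure that the likelihood function satisfies a uniform law of large numbers for SE processes. We first ensure that the filter $\tilde f(\theta, f_1)$ is SE and has $N_f$ bounded moments by application of Theorem 1.

ASSUMPTION 2. $\exists\, (N_f, f) \in [1, \infty) \times \mathcal{F}$ and a $\Theta \subset \mathbb{R}^{3 + d_\lambda}$ such that

(i) $\sup_{(\beta, \lambda) \in \mathcal{B} \times \Lambda} E |s(f, y_t, X_t; \beta, \lambda)|^{N_f} < \infty$,

and either

(ii) $\sup_{(f, y, X, \beta, \lambda) \in \mathbb{R} \times \mathcal{Y} \times \mathcal{X} \times \mathcal{B} \times \Lambda} |B + A\, \partial s(f, y, X; \beta, \lambda) / \partial f| < 1$,

or

(ii′) $E \sup_{\theta \in \Theta} \phi'_{t, N_f}(\theta) = E \sup_{\theta \in \Theta} |B + A\, \partial s(f, y_t, X_t; \beta, \lambda) / \partial f| < 1$ and $\tilde f_t(y^{t-1}, X^{t-1}, \theta, f_1) \perp \phi'_{t+1, N_f}(\theta)$ $\forall\, (t, f_1) \in \mathbb{N} \times \mathcal{F}$.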

Next, we ensure a bounded expectation for the likelihood function. To do this, we use the notion of 'moment preserving map'. This allows us to derive the appropriate number of bounded moments of the likelihood function from the moments of its arguments; see Blasques et al. (2014b) for a detailed description of the moment preserving properties of a wide catalogue of functions.

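DEFINITION 1. (Moment Preserving Maps) A function $H : \mathbb{R}^{k_1} \times \Theta \to \mathbb{R}^{k_2}$ is said to be $n/m$-moment preserving, denoted as $H \in \mathbb{M}_\Theta(n, m)$, if and only if $E \sup_{\theta \in \Theta} |x_t(\theta)|^n < \infty$ implies $E \sup_{\theta \in \Theta} |H(x_t(\theta); \theta)|^m < \infty$.⁵

ASSUMPTION 3. $N_\ell = \min\{N_{\log p_e}, N_{\log |Z|}\} \geq 1$, where $\log |Z| \in \mathbb{M}_\Theta(N_f, N_{\log |Z|})$ and $\log p_e \in \mathbb{M}_\Theta(N, N_{\log p_e})$, with $N = \min\{N_y, N_x\}$, where $N_y$ and $N_x$ denote the moments of $y_t$ and $X_t$, respectively.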

The moment $N_\ell$ in Assumption 3 corresponds to the number of moments of the likelihood function. Rather than assuming $N_\ell \geq 1$ as a high-level assumption, we define $N_\ell$ as a function of the score model constituents directly, thus obtaining a set of low-level conditions for strong consistency. The requirements imposed in Assumption 3 follow easily by application of a generalized Hölder inequality to the likelihood expression below (12). Note that $N = \min\{N_y, N_x\}$ follows directly from the fact that the argument $\bigl(y_t - h(\tilde f_t(\theta, f_1)) W y_t - X_t\beta\bigr)$ of $p_e$ is linear in both $y_t$ and $X_t$, and $\sup_{f \in \mathcal{F}} |h(f)| \leq 1$. The current conditions extend those of Blasques et al. (2014b) by accounting for the presence of exogenous variables $X_t$ in the model.

Theorem 3 now establishes the strong consistency of the MLE for the parameters of our time-varying spatial model if the data are SE.

THEOREM 3. (Consistency) Let $\{y_t\}_{t \in \mathbb{Z}}$ and $\{X_t\}_{t \in \mathbb{Z}}$ be SE sequences satisfying $E|y_t|^{N_y} < \infty$ and $E|X_t|^{N_x} < \infty$ for some $N_y > 0$ and $N_x > 0$ and let Assumptions 1, 2, and 3 hold. Furthermore, let $\theta_0 \in \Theta$ be the unique maximizer of $\ell_\infty(\theta)$ on the parameter space $\Theta$. Then the MLE satisfies $\hat\theta_T(f_1) \xrightarrow{a.s.} \theta_0$ as $T \to \infty$ for every $f_1 \in \mathcal{F}$.

⁵ The $(k_1 \times 1)$-vector $x_t$ satisfies $E \sup_{\theta \in \Theta} |x_t(\theta)|^n < \infty$ if its elements $x_{i,t}(\theta)$ satisfy $E \sup_{\theta \in \Theta} |x_{i,t}(\theta)|^n < \infty$, $i = 1, \ldots, k_1$. The same element-wise definition applies when $x_t(\theta)$ is a matrix.

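Remark 2 below highlights that if the score $s$ is uniformly bounded, we can change Assumption 2 in line with Remark 1.

REMARK 2. We can substitute Assumption 2 in Theorem 3 by

(i) $\sup_{(\beta, \lambda, f, y, X) \in \mathcal{B} \times \Lambda \times \mathcal{F} \times \mathcal{Y} \times \mathcal{X}} |s(f, y, X; \beta, \lambda)| < \infty$;

(ii) $E \ln \sup_{\theta \in \Theta} \phi'_{1,1}(\theta) < 0$ and $|B| < 1$.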

Finally, we establish the asymptotic normality of the MLE. For this, we require the existence of a sufficient number of bounded moments for the likelihood function and its derivatives. For notational simplicity, we define the function $q_t := q(\tilde f_t(\theta, f_1), y_t, X_t; \beta, \lambda) := \log p_e\bigl(y_t - h(\tilde f_t(\theta, f_1)) W y_t - X_t\beta; \lambda\bigr)$, as well as the cross-derivatives

$$ s^{(K_1, K_2, K_3)}(f, y, X; \beta, \lambda) := \frac{\partial^{K_1 + K_2 + K_3} s(f, y, X; \beta, \lambda)}{\partial f^{K_1}\, \partial \beta^{K_2}\, \partial \lambda^{K_3}}. $$

The (cross-)derivatives $q^{(K_1, K_2, K_3)}$ and $(\log |Z|)^{(K_1)}$ are defined similarly. Assumption 4 now imposes sufficient moment conditions for the asymptotic normality of the MLE.

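ASSUMPTION 4. (i) $s^{(K)} \in M_\Theta(\mathbf{N}, N_s^{(K)})$, $q^{(K')} \in M_\Theta(\mathbf{N}, N_q^{(K')})$, $\mathbf{N} := (N_f, N_y, N_x)$, with N as defined in Assumption 3;

(ii) $N_{\ell'} \ge 2$, $N_{\ell''} \ge 1$, $N_f^{(1)} > 0$, and $N_f^{(2)} > 0$, with

$$ N_{\ell'} = \min\left\{ N_q^{(0,1,0)},\ N_q^{(0,0,1)},\ \frac{N_{\log|Z|}^{(1)} N_f^{(1)}}{N_{\log|Z|}^{(1)} + N_f^{(1)}},\ \frac{N_q^{(1,0,0)} N_f^{(1)}}{N_q^{(1,0,0)} + N_f^{(1)}} \right\}, $$

$$ N_{\ell''} = \min\left\{ N_q^{(0,2,0)},\ N_q^{(0,0,2)},\ N_q^{(0,1,1)},\ \frac{N_q^{(1,1,0)} N_f^{(1)}}{N_q^{(1,1,0)} + N_f^{(1)}},\ \frac{N_q^{(1,0,1)} N_f^{(1)}}{N_q^{(1,0,1)} + N_f^{(1)}},\ \frac{N_q^{(2,0,0)} N_f^{(1)}}{2N_q^{(2,0,0)} + N_f^{(1)}},\ \frac{N_q^{(1,0,0)} N_f^{(2)}}{N_q^{(1,0,0)} + N_f^{(2)}},\ \frac{N_{\log|Z|}^{(1)} N_f^{(2)}}{N_{\log|Z|}^{(1)} + N_f^{(2)}},\ \frac{N_{\log|Z|}^{(2)} N_f^{(1)}}{2N_{\log|Z|}^{(2)} + N_f^{(1)}} \right\}, $$

$$ N_f^{(1)} = \min\left\{ N_f,\ N_s,\ N_s^{(0,1,0)},\ N_s^{(0,0,1)} \right\}, $$

$$ N_f^{(2)} = \min\left\{ N_f^{(1)},\ N_s^{(0,1,0)},\ N_s^{(0,0,1)},\ N_s^{(0,2,0)},\ N_s^{(0,0,2)},\ N_s^{(0,1,1)},\ \frac{N_s^{(1,0,0)} N_f^{(1)}}{N_s^{(1,0,0)} + N_f^{(1)}},\ \frac{N_s^{(2,0,0)} N_f^{(1)}}{2N_s^{(2,0,0)} + N_f^{(1)}},\ \frac{N_s^{(1,1,0)} N_f^{(1)}}{N_s^{(1,1,0)} + N_f^{(1)}},\ \frac{N_s^{(1,0,1)} N_f^{(1)}}{N_s^{(1,0,1)} + N_f^{(1)}} \right\}. $$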


Again, rather than assuming $N_{\ell'} \ge 2$ and $N_{\ell''} \ge 1$ directly as a high-level condition, we define $N_{\ell'}$ and $N_{\ell''}$ explicitly in terms of their lower-level constituents. The moment conditions

in Assumption 4 extend those of Blasques et al. (2014b) by allowing for exogenous regressors.

The expressions may seem complicated at first, but we show below that their verification is often

straightforward; see also Blasques et al. (2014b) for the verification of similar moment conditions

in a wide range of observation-driven models.

The quantities $N_f^{(1)}$ and $N_f^{(2)}$ in Assumption 4 correspond to the moments of the first and second derivatives of the filter $f_t(\theta, f_1)$ with respect to the parameter $\theta$. Similarly, $N_{\ell'}$ and $N_{\ell''}$ denote the moments of the first and second derivatives of the likelihood function, respectively.

Theorem 4 now establishes the asymptotic normality of the MLE. Here, int(Θ) denotes the

interior of Θ.

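THEOREM 4. (Asymptotic Normality) Let $\{y_t\}_{t\in\mathbb{Z}}$ and $\{X_t\}_{t\in\mathbb{Z}}$ be SE sequences satisfying $E|y_t|^{N_y} < \infty$ and $E|X_t|^{N_x} < \infty$ for some $N_y > 0$ and $N_x > 0$ and let Assumptions 1–4 hold. Furthermore, let $\theta_0 \in \mathrm{int}(\Theta)$ be the unique maximizer of $\ell_\infty(\theta)$ on $\Theta$. Then,

$$ \sqrt{T}\,\big(\hat\theta_T(f_1) - \theta_0\big) \overset{d}{\to} N\big(0,\ I^{-1}(\theta_0)\, J(\theta_0)\, I^{-1}(\theta_0)\big) \quad \text{as } T \to \infty, $$

where $J(\theta_0) := E\,\tilde\ell'_t(\theta_0)\,\tilde\ell'_t(\theta_0)^\top$ is the expected outer product of gradients and $I(\theta_0) := E\,\tilde\ell''_t(\theta_0)$ is the Fisher information matrix.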
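As an illustration of how the sandwich covariance in Theorem 4 is commonly estimated in practice, the sketch below replaces the expectations J(θ0) and I(θ0) by sample averages of per-observation gradients and Hessians evaluated at the MLE. This is a minimal sketch and not the authors' implementation: the function name `sandwich_cov` and its inputs `scores` and `hessians` are hypothetical and would have to be produced by the user's own likelihood code.

```python
import numpy as np

def sandwich_cov(scores, hessians):
    """Estimate the covariance of the MLE as I^{-1} J I^{-1} / T.

    scores   : (T, k) array; row t is the gradient of the log-likelihood
               contribution of observation t, evaluated at the estimate.
    hessians : (T, k, k) array; slice t is the corresponding Hessian.
    """
    scores = np.asarray(scores, dtype=float)
    hessians = np.asarray(hessians, dtype=float)
    T = scores.shape[0]
    J = scores.T @ scores / T        # average outer product of gradients
    I = hessians.mean(axis=0)        # average Hessian; its sign cancels below
    I_inv = np.linalg.inv(I)
    return I_inv @ J @ I_inv / T     # division by T gives Cov(theta_hat)
```

Standard errors would then be the square roots of the diagonal of the returned matrix. If the model is correctly specified, J and minus the average Hessian should be close to each other, so that the sandwich reduces to the usual inverse information matrix.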

Next we apply the theory developed above to consider the properties of the MLE for the time-varying spatial model. We consider the model in (4) with Student’s t distributed innovations with λ > 0 degrees of freedom. Consider a transformation function h that is (a.s.) bounded away from minus one and one with uniformly bounded derivatives $h^{(i)}$,

$$ -1 < \underline{\rho} \le \rho_t = h(f_t) \le \bar{\rho} < 1 \ \text{a.s.}; \qquad \sup_{f\in F} |h^{(i)}(f)| < \infty, \quad i = 1, 2. \tag{13} $$

For example, to keep the spatial dependence parameter between $\underline{\rho} = -\bar{\rho}$ and $\bar{\rho}$, we can take $h(f_t) = \bar{\rho} \tanh(f_t)$, where $\bar{\rho}$ can be arbitrarily close to one. We have the following corollary.

COROLLARY 1. Consider the spatial score model with link function (13). If $\{y_t\}_{t\in\mathbb{Z}}$ and $\{X_t\}_{t\in\mathbb{Z}}$ are SE with $E|y_t| < \infty$ and $E|X_t| < \infty$, then there exists a compact parameter space Θ with |B| < 1 ∀ θ ∈ Θ, such that the MLE exists (a.s.) and is strongly consistent for any initialization f1 ∈ F. If $E|y_t|^{2+\epsilon} < \infty$ and $E|X_t|^{2+\epsilon} < \infty$ for some ε > 0, then the MLE is asymptotically normal with covariance matrix given in Theorem 4.


The corollary is a direct consequence of the previous theorems and is particularly applicable

to the spatial score model that we apply in our empirical section later on. It shows that we can use

the MLE both for estimation and inference.

3.3 Optimality of score updating in the time-varying spatial model

We note that the score-driven framework is not only intuitively appealing as a way to update time-

varying parameters. More importantly, score based updates are also optimal in an information

theoretic sense under very mild regularity conditions; see Blasques et al. (2014a).

Let $p_t := p(\,\cdot\,|f_t, X_t)$ denote the true unknown conditional density of $y_t$, which depends on the true unobserved time-varying parameter $\{f_t\}_{t\in\mathbb{Z}}$ and the regressors $X_t$. Similarly, let $\tilde p_t := \tilde p(\,\cdot\,|\tilde f_t, X_t)$ denote the conditional density implied by the score model given the filtered time-varying parameter $\tilde f_t$, the regressors $X_t$, and the postulated innovation density $p_e$. To simplify the notation, we have dropped the dependence of $\tilde f_t$ on θ and f1. Ideally, whenever a new observation $y_t$ becomes available, we want the filtered value $\tilde f_{t+1}$ to be such that the new conditional density implied by the model, $\tilde p_{t+1} := \tilde p(\,\cdot\,|\tilde f_{t+1}, X_t)$, is as close as possible to the true unknown conditional density $p_t$ from which $y_t$ was drawn.

Following Blasques et al. (2014a), we focus on the notion of Kullback-Leibler divergence to measure the distance between the two densities,

$$ D_{KL}\big(p_t, \tilde p_{t+1}\big) = \int_Y p(y|f_t, X_t) \ln \frac{p(y|f_t, X_t)}{\tilde p(y|\tilde f_{t+1}, X_t; \theta)}\, dy, \tag{14} $$

where Y ⊆ R is the set over which the divergence is evaluated; see Blasques et al. (2014a) for further details. In particular, we would like an update $\tilde f_{t+1}$ for which $D_{KL}\big(p(\,\cdot\,|f_t, X_t),\ \tilde p(\,\cdot\,|\tilde f_{t+1}, X_t)\big)$ is smaller than $D_{KL}\big(p(\,\cdot\,|f_t, X_t),\ \tilde p(\,\cdot\,|\tilde f_t, X_t)\big)$, such that the update from $\tilde f_t$ to $\tilde f_{t+1}$ reduces the distance to the true unknown conditional density.

Blasques et al. (2014a) show that only score updates have the property that they locally always reduce the KL-distance and thus provide a local improvement. Though their original proofs do not account for the presence of exogenous regressors, it is easy to see from their paper that all their results continue to hold if Xt is incorporated in the conditional densities as described above.

In particular, the spatial model structure and Student’s t specification are sufficiently smooth for

local optimality results to apply, as well as for non-local Kullback-Leibler improvement regions

from Blasques et al. (2014a) to hold.


4 Monte Carlo study

To show the adequacy of the time-varying spatial model in filtering out dynamic patterns of the

spatial dependence parameters, we conduct a simulation study. In this study, we also investigate

whether the MLE is well-behaved and approximately normally distributed in larger samples. We

set the sample size to realistic values given the empirical application in Section 5. To limit the

complexity of the experiment, we consider a spatial lag model without regressors. The data gen-

erating process is

$$ y_t = Z(f_t)\, e_t, \qquad e_t \overset{i.i.d.}{\sim} \text{Student's } t(0, I_n; 5), \tag{15} $$

where $Z(f_t) = (I_n - \tanh(f_t) W)^{-1}$, t = 1, ..., 500. The spatial weight matrix W is specified similarly to the one used in our empirical application. It contains row-normalized cross-border exposures of the financial sectors of nine European countries. We simulate 250 data sets according to (15) using five processes with different dynamic patterns for the spatial dependence parameter. These patterns are similar to the ones in Engle (2002), namely

1. Constant: ρt = 0.9;

2. Sine: ρt = 0.5 + 0.4 cos(2πt/200);

3. Fast sine: ρt = 0.5 + 0.4 cos(2πt/20);

4. Step: ρt = 0.9− 0.5 ∗ I(t > T/2);

5. Ramp: ρt = mod (t/200);
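The five patterns and the data generating process (15) can be sketched as follows. This is an illustrative sketch rather than the authors' simulation code. Several elements are assumptions: the weight matrix `W` is a randomly generated, row-normalized placeholder (the exposure-based matrix is not reproduced here), the ramp is read as (t mod 200)/200, and the Student's t draws are not rescaled to unit variance. The function names `rho_path` and `simulate` are hypothetical.

```python
import numpy as np

def rho_path(pattern, T=500):
    """Return rho_t for t = 1, ..., T under one of the five patterns."""
    t = np.arange(1, T + 1)
    if pattern == "constant":
        return np.full(T, 0.9)
    if pattern == "sine":
        return 0.5 + 0.4 * np.cos(2 * np.pi * t / 200)
    if pattern == "fast sine":
        return 0.5 + 0.4 * np.cos(2 * np.pi * t / 20)
    if pattern == "step":
        return 0.9 - 0.5 * (t > T / 2)
    if pattern == "ramp":
        # Assumption: "mod(t/200)" is interpreted as (t mod 200) / 200.
        return (t % 200) / 200.0
    raise ValueError(f"unknown pattern: {pattern}")

def simulate(rho, W, df=5, seed=0):
    """Draw y_t = (I - rho_t W)^{-1} e_t with i.i.d. Student's t errors."""
    rng = np.random.default_rng(seed)
    n = W.shape[0]
    y = np.empty((len(rho), n))
    for i, r in enumerate(rho):
        e = rng.standard_t(df, size=n)
        y[i] = np.linalg.solve(np.eye(n) - r * W, e)
    return y

# Placeholder weight matrix: zero diagonal, positive off-diagonal entries,
# rows normalized to sum to one (NOT the matrix used in the paper).
rng = np.random.default_rng(1)
n = 9
W = rng.uniform(0.1, 1.0, size=(n, n))
np.fill_diagonal(W, 0.0)
W /= W.sum(axis=1, keepdims=True)

y = simulate(rho_path("step"), W)   # one simulated data set of shape (500, 9)
```

Solving the linear system for each t avoids forming the inverse explicitly. Because W is row-normalized and every ρt is strictly below one in absolute value, the matrix I − ρtW is invertible for all patterns.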

Figure 1 shows that the filtered spatial dependence parameters are able to capture the patterns of the simulated processes quite accurately. The model has some difficulty in tracking downturns compared to upturns, but this is intuitively plausible: the signal present in strongly cross-sectionally correlated data yt is much more apparent than that for weakly correlated data.

In our second simulation study, we again use nine cross-sectional units. We assume that the

errors are normally distributed with common variance σ2, and we include one regressor variable

Xt ∼ N(0, I9). The data-generating process is the Gaussian spatial score model laid out in Section

2. In contrast to our previous experiment, the model is now thus correctly specified. We simulate

500 paths yt using the parameters ω = 0.05, A = 0.05, B = 0.8, β = 1.5, and σ2 = 2. We

plot the kernel density estimates of the distribution of the MLE for three different sample sizes,

T = 500, 1000, 2000, in Figure 2.

The figure clearly shows that for smaller sample sizes of around T = 500, the estimators are still not perfectly normal. For larger sample sizes, however, we see a clear convergence to the limiting result. In particular, for empirically relevant sample sizes of around T = 2,000, all distributions look close to a normal centered around the true parameter values. We therefore apply the MLE and its associated standard errors in our empirical application in the next section.

Figure 1: Simulated true spatial dependence process (black line), median filtered parameter (dashed red line) and 2.5% and 97.5% quantiles of the filtered parameters (green lines). The figures are based on 250 replications. [Five panels: Constant, Sine, Fast sine, Step, Ramp; horizontal axis: t from 0 to 500; vertical axis: rho.]

5 The empirics of time-varying European CDS spread dependencies

In our empirical study we evaluate the evolution of sovereign credit risk spreads over a period that includes the Eurozone sovereign debt crisis. In particular, we investigate the time-varying features of the spatial dependence structure between the sovereign credit spreads, particularly in relation to a number of the policy responses by regulators. Our spatial structure is directly linked to the bank sectors’ cross-exposures to other sovereigns and financial sectors within the European Union.

Figure 2: Kernel density estimates of estimated parameters from 500 simulation replications. [Five panels: density of estimates for ω (true value 0.05), A (true value 0.05), B (true value 0.8), β (true value 1.5), and σ2 (true value 2); each panel shows T = 500, T = 1000, and T = 2000; vertical axis: density.]

5.1 Data

Credit default spread data

Since EU countries have been affected by the crisis to different degrees, sovereign credit spreads

in Europe are strongly cross-sectionally dependent. Figure 3 shows the credit default spreads from

February 2, 2009, until May 12, 2014 (1375 daily observations) for the eight euro area countries in

our sample: Belgium, France, Germany, Ireland, Italy, the Netherlands, Portugal, and Spain. We

use relative changes (log returns multiplied by 100) of Euro-denominated sovereign CDS spreads

for each of these countries using data obtained from Bloomberg.

The time series reveal clear common patterns, particularly among the non-stressed Eurozone countries (Germany, France, Netherlands, Belgium, and to a lesser extent Spain and Italy). At the same time, there appear to be temporary dissimilarities: for example, the evolution of the Irish credit spread appears to be roughly in line with that of the other countries before mid 2010 and after mid 2012, but departs from it during the height of the European sovereign debt crisis. The combination of commonalities with possible temporary changes in commonality warrants the use of the time-varying spatial model proposed in this paper.

Figure 3: Credit default swap spreads of eight European sovereigns, Feb 2, 2009 – May 12, 2014. The different countries are split in two groups. [Two panels: Belgium, France, Netherlands, Germany; and Portugal, Ireland, Spain, Italy; horizontal axis: 2009 to 2014; vertical axis: spread (bp).]

Table 1: List of country-specific stock indices included in the time-varying spatial model as regressor variables.

Country        Stock index
Belgium        BEL 20 Price Index
France         CAC 40 Price Index
Germany        DAX 30 Price Index
Ireland        ISEQ 20 Price Index
Italy          FTSE MIB Price Index
Netherlands    AEX Price Index
Portugal       PSI 20 Price Index
Spain          IBEX 35 Price Index

Other explanatory variables

Our empirical model contains two regressors that capture the state of the European financial market; see also Caporin et al. (2013). The first variable is the change in the volatility index VStoxx. The VStoxx is measured using the implied volatility of the EuroStoxx 50. It captures changes in the risk appetite of financial markets. Our second variable is the difference between the three-month Euribor and the overnight rate EONIA. This measure captures the stress within the financial sector and the perceived counterparty credit risk between banks.

We also incorporate two country-specific regressors, namely log returns of the respective countries’ leading stock indices and absolute changes in 10-year government bond yields. Local stock market returns, see Table 1, are a measure of the well-being of the economy and the ability of governments to pay off debt in the long run. We expect a negative relation with credit spread changes. The change in 10-year yields mainly reflects the long-term borrowing costs of governments, and we expect a positive relation with sovereign credit default swap spreads.

All variables are included in the model with a lag of one period. The data are obtained from Datastream. Augmented Dickey-Fuller unit root tests indicate that all time series are stationary. Table 2 presents the summary statistics.


Table 2: Data summary. Stock index log returns are calculated from closing prices. All stock indices are quoted in domestic currency (Euro).

                      mean       min. 25% quant.     median 75% quant.       max.

CDS spread changes (log changes*100)
Belgium              -0.08     -19.34       -1.9      -0.07       1.78      17.04
France               -0.03     -19.44      -1.84      -0.07       1.56      19.82
Germany              -0.07     -26.71      -1.89          0       1.56      25.43
Ireland              -0.11     -32.69      -1.57      -0.03       1.32      26.81
Italy                -0.03     -43.73      -2.09       -0.1       1.76      20.27
Netherlands          -0.09      -22.2      -1.66      -0.03       1.39      14.92
Portugal              0.02     -47.38       -1.8          0       1.66      20.54
Spain                -0.04     -37.04      -2.02          0       1.99      25.17

Local stock index returns (log returns*100)
Belgium               0.04      -5.49      -0.59       0.03       0.69       8.96
France                0.03      -5.63      -0.68       0.02        0.8       9.22
Germany               0.06      -5.99      -0.57       0.07       0.75        5.9
Ireland               0.06      -6.79      -0.62       0.02       0.83       6.95
Italy                 0.01      -7.04      -0.88       0.04       1.03      10.68
Netherlands           0.04      -5.34      -0.58       0.04       0.71       7.07
Portugal              0.01      -5.51      -0.69       0.02       0.77       10.2
Spain                 0.02      -6.87      -0.82       0.01       0.87      13.48

Local long-term bond yields (changes)
Belgium              -0.16      -30.2       -2.6       -0.1        2.2       34.4
France               -0.14      -26.2      -2.56      -0.12        2.4       24.2
Germany              -0.13      -25.6       -2.8       -0.1        2.3      18.48
Ireland               -0.2    -102.79      -3.64      -0.25        2.8         75
Italy                -0.11        -78       -3.3      -0.09        3.1       50.9
Netherlands          -0.16      -22.4       -2.8      -0.05       2.14      15.61
Portugal             -0.07    -146.98       -5.1      -0.01       5.13      168.6
Spain                -0.11      -88.3       -3.6          0        3.5       37.3

Eurozone-wide variables
VStoxx change        -0.02     -10.94      -0.86      -0.11       0.67      12.79
term spread           0.35      -0.37       0.14       0.34       0.52          1

Spatial weights matrix

The choice of spatial weight matrix is a key ingredient of the spatial model, as it determines the

structure of the ‘economic neighborhood’ between the sovereign CDS spreads and defines the

channel for cross-sectional spillovers. Recently, domestic banks’ cross-border exposures have

been identified as relevant pricing factors for sovereign credit spreads, see for example Kallestrup

et al. (2013), Korte and Steffen (2013), and Beetsma et al. (2012). A possible reason for this

connection is outlined in Korte and Steffen (2013). They argue that until recently, risk management

rules for banks implied a so-called ‘zero risk weight channel’: European banks were not required

to hold capital buffers against EU member states’ debt. This led to regulatory arbitrage incentives


for banks to hold more government debt; see also Acharya and Steffen (2013). At the same time

and due to the banks’ willingness to take on government debt, governments were able to issue large

amounts of debt, thus creating a problematic feedback loop: if sovereign credit risk materialized,

banks could become stressed, and due to possible bail-outs, governments in turn might become

stressed as well.

To account for this type of possible feedback loop, we use a weight matrix that is constructed

from cross-border debt data provided by the Bank for International Settlements (BIS).⁶ We average

the bilateral raw exposure data from 2007 Q4 - 2008 Q2. As the consolidated data are published

on the BIS homepage with a lag of approximately two quarters, this avoids a possible source of

endogeneity for W . This matrix is denoted by Wraw.

Due to large differences in the sizes of the member countries’ financial sectors, the weights

implied by Wraw vary significantly. To mitigate the size of these differences, we form three

discrete categories of mutual lending (‘low’, ‘medium’, and ‘high’). The entries of the resulting

matrix Wcat are constructed as

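$$ W_{cat,ij} = \begin{cases} 1, & \text{if } 0 < W_{raw,ij} \le Q_{0.33}(W_{raw}), \\ 2, & \text{if } Q_{0.33}(W_{raw}) \le W_{raw,ij} < Q_{0.67}(W_{raw}), \\ 3, & \text{if } Q_{0.67}(W_{raw}) \le W_{raw,ij}, \end{cases} $$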

where Qp(Wraw) denotes the p-th quantile of the exposure data contained in Wraw. After con-

structing Wcat, we row-normalize it to obtain proper weights that sum to one. An advantage of

the categorical matrix over the raw matrix is that the categories are almost time-invariant, so that

using a constant W can be justified. An alternative would be to normalize the exposure data by the GDP of the country in order to relate the size of mutual debt to the size of the economy. We investigate this and other alternatives for constructing the weight matrix in our robustness checks in Section 5.3.
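The construction of the categorical weight matrix can be sketched in a few lines. This is a hedged illustration, not the authors' code: the function name `categorical_weights` is hypothetical, the quantiles are assumed to be taken over the strictly positive entries of the raw matrix (the text does not say which entries enter the quantiles), values exactly equal to the lower quantile are assigned to the lowest category, and NumPy's default linear interpolation is used for the quantiles. The numbers in the example matrix are made up.

```python
import numpy as np

def categorical_weights(W_raw):
    """Map raw exposures to the categories 1, 2, 3 and row-normalize."""
    W_raw = np.asarray(W_raw, dtype=float)
    positive = W_raw[W_raw > 0]                     # assumed quantile sample
    q33, q67 = np.quantile(positive, [0.33, 0.67])
    W_cat = np.zeros_like(W_raw)
    W_cat[(W_raw > 0) & (W_raw <= q33)] = 1.0       # 'low'
    W_cat[(W_raw > q33) & (W_raw < q67)] = 2.0      # 'medium'
    W_cat[W_raw >= q67] = 3.0                       # 'high'
    row_sums = W_cat.sum(axis=1, keepdims=True)
    # Rows without any exposure stay zero instead of triggering a division by zero.
    return np.divide(W_cat, row_sums, out=np.zeros_like(W_cat), where=row_sums > 0)

# Made-up exposures for three hypothetical countries (zero diagonal).
W_raw = np.array([[0.0, 1.0, 10.0],
                  [2.0, 0.0, 100.0],
                  [50.0, 5.0, 0.0]])
W = categorical_weights(W_raw)
```

In this toy example, the second row maps the exposures 2 and 100 to the categories 1 and 3, so the normalized weights are 0.25 and 0.75. The categorization compresses the hundredfold difference in raw exposures into a threefold difference in weights, which is the size-mitigating effect described in the text.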

5.2 Results

Table 3 contains the estimation results from both the static and time-varying spatial model for normally and t-distributed error terms. For simplicity, the specifications contain a common, time-invariant variance. This assumption is relaxed in the robustness section, Section 5.3. For the static model, we find strong evidence for spatial dependence in the European sovereign CDS spreads, indicated by the high estimates for ρ together with a small standard error. Given that CDS spread changes are known to exhibit fat tails, it is not surprising to find that the model fit improves substantially for the Student’s t vis-a-vis the normal distribution. The likelihood value increases by more than 1800 points after adding only one parameter to the model. This finding is confirmed by the AICc.

⁶ The data can be found at http://www.bis.org/statistics/consstats.htm, Table 9B: International bank claims, consolidated - immediate borrower basis. Last accessed on March 20, 2014.

Table 3: Estimated parameters and their robust (sandwich) standard errors in parentheses, for the static spatial lag model and the time-varying spatial model, based on normally (N) and Student’s t (tλ) distributed errors. The maximized loglikelihood value (logL) and the Akaike information criterion, corrected for finite numbers of observations, (AICc) are also reported. Estimation period is February 2, 2009 – May 12, 2014.

                    Static model                           Time-varying model
                    N                  tλ                  N                  tλ
ρ                   0.7249 (0.0071)    0.7146 (0.0062)
ω                                                          0.0156 (0.0074)    0.0181 (0.0192)
A                                                          0.0144 (0.003)     0.0168 (0.0085)
B                                                          0.9817 (0.009)     0.9794 (0.0219)
ln(σ2)              1.8131 (0.0509)    0.8392 (0.0444)     1.8043 (0.0504)    0.8426 (0.0446)
Vstoxx              -0.0901 (0.0473)   -0.0261 (0.0164)    -0.0756 (0.0326)   -0.0261 (0.0158)
term sp             0.0239 (0.1065)    0.032 (0.066)       0.1084 (0.0998)    0.0818 (0.0656)
local stocks        -0.2031 (0.0426)   -0.1156 (0.0193)    -0.1769 (0.035)    -0.1122 (0.0187)
local 10Y yields    0.0256 (0.0041)    0.0184 (0.0027)     0.0258 (0.0039)    0.0186 (0.0027)
const               -0.0137 (0.0386)   -0.0341 (0.0216)    -0.066 (0.0393)    -0.0578 (0.0244)
λ                                      2.5202 (0.1246)                        2.5649 (0.1288)

logLik              -26396.6           -24574.5            -26244.4           -24506.1
AICc                52807.3            49165.1             52507.0            49032.4
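For readers who want to retrace the model comparison, the following sketch computes the corrected Akaike criterion from a log-likelihood value, the number of parameters k, and the sample size n using the standard formula AICc = −2 logL + 2k + 2k(k+1)/(n−k−1). Two inputs are assumptions rather than statements from the text: the sample size is taken to be the 1375 daily observations, and the parameter counts (7, 8, 9, and 10) are obtained by counting the estimated coefficients per specification in Table 3. The log-likelihood values are those reported for the four baseline models, at two-decimal precision.

```python
def aicc(loglik, k, n):
    """Akaike information criterion with finite-sample correction."""
    return -2.0 * loglik + 2.0 * k + 2.0 * k * (k + 1) / (n - k - 1)

n_obs = 1375  # assumption: number of daily observations used as sample size

# (maximized log-likelihood, assumed number of estimated parameters)
models = {
    "static N":       (-26396.63, 7),
    "static t":       (-24574.48, 8),
    "time-varying N": (-26244.45, 9),
    "time-varying t": (-24506.11, 10),
}

for name, (loglik, k) in models.items():
    print(f"{name:15s} AICc = {aicc(loglik, k, n_obs):.2f}")
```

Under these assumptions the computed values agree with the reported AICc figures up to rounding in the second decimal, and the ranking favors the time-varying Student's t specification.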

If we consider the dynamic spatial model based on the normal distribution, we see an increase of about 150 likelihood points compared to the static Gaussian model at the cost of adding 2 parameters. The dynamics of the spatial dependence parameter are highly persistent with a value of B close to unity. The unconditional mean of ft equals ω/(1−B) ≈ 0.8524 with tanh(0.8524) ≈ 0.6924. Accounting for the fact that the expected value of tanh(ft) is slightly larger than this due to Jensen’s inequality, we see that the unconditional level for the Gaussian spatial score model is close to the static estimate of 0.7249. Moving from the static Student’s t model to its time-varying counterpart also increases the likelihood, this time by about 68 points. The unconditional level of tanh(ft) again lies close to its static counterpart.
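The arithmetic behind these unconditional levels is easy to retrace. The sketch below plugs the estimates of ω and B from Table 3 into ω/(1−B) and applies the tanh link. It relies on the same simplification as the text, namely that the score innovations have mean zero so that the unconditional mean of ft is ω/(1−B); the function name `unconditional_level` is hypothetical.

```python
import math

def unconditional_level(omega, B):
    """Return (omega / (1 - B), tanh of that value)."""
    f_bar = omega / (1.0 - B)       # unconditional mean of f_t
    return f_bar, math.tanh(f_bar)  # implied level of spatial dependence

# Estimates taken from Table 3.
f_gauss, rho_gauss = unconditional_level(0.0156, 0.9817)   # normal errors
f_t, rho_t = unconditional_level(0.0181, 0.9794)           # Student's t errors

print(f"normal:      f = {f_gauss:.4f}, tanh(f) = {rho_gauss:.4f}")
print(f"Student's t: f = {f_t:.4f}, tanh(f) = {rho_t:.4f}")
```

Because B is close to one, the ratio ω/(1−B) is sensitive to small changes in B, so these levels should be read as rough indications rather than precise estimates.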

On the basis of the reported AICc values, the data clearly favor time variation in the spatial dependence parameter ρt, with the Student’s t distribution used for the estimation as well as for the transition dynamics of ρt. The estimated degrees of freedom parameter λ for the Student’s t models is around 2.5. Hence there is a substantial degree of fat-tailedness. A part of the unconditional fat-tailedness may also be due to the presence of volatility clustering. We discuss these robustness issues in more detail in Section 5.3.

The coefficients for the included regressors have the same signs throughout the four specifi-

cations. Although the regression estimates vary somewhat, particularly between the normal and

Student’s t based models, the overall picture remains the same. A higher implied volatility on the

European stock market (VStoxx) correlates with lower CDS spreads. This is consistent with the

phenomenon of ‘flight to quality’ when the price of risk increases in financial markets. A higher

term spread on the interbank credit market implies a higher tendency to borrow overnight. This

is correlated with higher CDS spread changes and may be a sign of a perceived bank-sovereign

feedback loop: problems in the functioning of the interbank lending market may induce a fear of

possible future bailouts and subsequent sovereign debt problems. Stock market upturns have a

dampening effect on sovereign credit spreads, while increases in long-term bond yields point to

higher borrowing costs for governments and have a positive relation with sovereign CDS spreads.

Figure 4 presents the evolution of the filtered spatial dependence parameter. We observe that the path of the spatial coefficient corresponding to the Student’s t spatial score model is more robust to outliers than its normal counterpart. This phenomenon is a common finding in the volatility literature; see for example Creal et al. (2013) and Harvey (2013). Comparing the score expressions in equations (9) and (10), it is clear that the time-varying spatial model shares this feature. While the normal score is unbounded in the dependent variable and the regressors, the Student’s t score contains a compensating effect in the denominator that leads to a down-weighting of large positive or negative observations. This leads to a different pattern between the two filtered spatial dependence series for the two distributions, particularly during mid 2010, the first half of 2012, and late 2013.

Throughout the sample period, systemic risk is high, fluctuating around a value of 0.75 until the end of 2012. At that time, the level starts to decline towards a lower level of about 0.5 to 0.6. The pattern can be related to a number of important policy events during the European sovereign debt crisis.⁷ Some events have a highly visible impact. For example, the first Long Term Refinancing Operation (LTRO) at the end of 2011 caused a sudden and sharp drop in the spatial dependence parameter. The effect, however, was short-lived and the value of ρt bounced back soon after to similar levels as before. The second LTRO hardly has any visible effect on the spatial dependence parameter. It is only after Mario Draghi’s speech at the Global Investment Conference in London in July 2012⁸ and the subsequent announcements and implementation of the Outright Monetary Transactions (OMT) and the European Stability Mechanism (ESM) in the months thereafter that the fear of perceived spillover effects appears to be mitigated on a more permanent basis, which can be seen by the value of ρt coming down to lower levels.

Figure 4: Filtered spatial dependence parameters obtained by imposing normally (dashed line) and Student’s t (solid line) distributed errors. [Horizontal axis: 2009 to 2014; vertical axis: rho_t; legend: t-GAS model, normal-GAS model.]

5.3 Extensions

In this section, we extend the time-varying spatial model in different directions to investigate the robustness of our results. First, we allow for sovereign-specific volatility clustering. Second, we add an unobserved mean factor to try to distinguish common effects from spatial effects. Third, we re-estimate the models using different choices of spatial weight matrices.

⁷ A list of events can be found in Figure B.1 in the appendix. See also Table B.1 with a list of sources.

⁸ Quote: “Within our mandate, the ECB is ready to do whatever it takes to preserve the euro. And believe me, it will be enough.” Source: see Table B.1.

Unobserved time-varying volatility factors

Given the patterns in the data, it is clearly unrealistic to assume a common, time-invariant variance

for all sovereign CDS changes. We therefore extend the baseline model by adding a time-varying

diagonal covariance matrix Σt for the errors in the spatial model,

$$ y_t = h(f_t) W y_t + X_t\beta + e_t, \qquad e_t \sim p_e(0, \Sigma_t), \tag{16} $$

with

$$ \Sigma_t := \Sigma(f^\sigma_t) = \mathrm{diag}\big(\sigma_1^2(f^\sigma_{1t}), \ldots, \sigma_n^2(f^\sigma_{nt})\big) = \mathrm{diag}\big(\exp(f^\sigma_{1t}), \ldots, \exp(f^\sigma_{nt})\big), \tag{17} $$

where $f^\sigma_t = (f^\sigma_{1t}, \ldots, f^\sigma_{nt})'$ is a vector of sovereign-specific variance factors. As before, we endow the factor $f^\sigma_t$ with score updating. To enforce parsimony, we allow for sovereign-specific intercepts in the score updating equations for $f^\sigma_t$, but impose common score and persistence parameters $A^\sigma$ and $B^\sigma$; see Appendix A. Although the covariance matrix of the error terms $\Sigma_t$ is diagonal, the reduced form covariance of $y_t$ is still a full matrix, $\mathrm{Cov}(y_t) = Z(f_t)\,\Sigma_t\, Z(f_t)'$.
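To see how a diagonal error covariance still produces a full reduced-form covariance, the following sketch evaluates Z(ft) Σt Z(ft)′ for a small example. It is only an illustration under stated assumptions: the link function is taken to be h(f) = tanh(f), as in the simulation design, the 3×3 weight matrix and the variance factors are made-up values, and the function name `reduced_form_cov` is hypothetical.

```python
import numpy as np

def reduced_form_cov(f, f_sigma, W):
    """Return Z(f) Sigma Z(f)' with Z(f) = (I - tanh(f) W)^{-1}."""
    n = W.shape[0]
    Z = np.linalg.inv(np.eye(n) - np.tanh(f) * W)
    Sigma = np.diag(np.exp(f_sigma))   # diagonal error covariance, cf. (17)
    return Z @ Sigma @ Z.T

# Made-up inputs: equal row-normalized weights and three log-variance factors.
W = np.array([[0.0, 0.5, 0.5],
              [0.5, 0.0, 0.5],
              [0.5, 0.5, 0.0]])
f_sigma = np.log(np.array([1.0, 2.0, 4.0]))

C = reduced_form_cov(0.85, f_sigma, W)
print(np.round(C, 3))   # off-diagonal entries are non-zero
```

Setting f = 0 switches off the spatial lag, in which case the function returns the diagonal matrix Σt itself. Any non-zero spatial dependence spreads each sovereign-specific shock to all other units through Z(ft).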

Unobserved time-varying mean factor

To distinguish commonalities from spatial spillovers, we also extend the model with an additional unobserved time-varying mean factor. This factor is independent of the spatial lag structure,

$$ y_t = h(f_t) W y_t + X_t\beta + Z(f_t)^{-1}\lambda f^\lambda_t + e_t, \qquad e_t \sim t_{\lambda_0}(0, \Sigma_t), \tag{18} $$

where λ0 is the degrees of freedom parameter of the Student’s t distribution, λ = (λ1, . . . , λn)′ is an (n×1)-vector of factor loadings, and $f^\lambda_t \in \mathbb{R}$ is an additional time-varying parameter endowed with score updating. Explicit formulas for the dynamics are given in Appendix A. Rewriting equation (18) in reduced form, that is, collecting the terms in $y_t$ and premultiplying by $Z(f_t) = (I_n - h(f_t)W)^{-1}$, we obtain

$$ y_t = \lambda f^\lambda_t + Z(f_t) X_t\beta + Z(f_t) e_t, \tag{19} $$

which allows for a direct comparison with a benchmark model without spatial lag structure,

$$ y_t = X_t\beta + \lambda f^\lambda_t + e_t. \tag{20} $$

Table 4: Comparison of goodness of fit of all considered empirical specifications. The largest maximized loglikelihood value (logL) and the smallest Akaike Information Criterion (AICc) amongst the considered models are marked with an asterisk.

            Static spatial                       Time-varying spatial
et ∼        N(0, σ2In)       tλ(0, σ2In)         N(0, σ2In)       tλ(0, σ2In)
logL        -26396.63        -24574.48           -26244.45        -24506.11
AICc        52807.35         49165.06            52507.03         49032.39

            Time-varying spatial-t                                Benchmark-t
            (+tv. volas)     (+mean f.+tv.volas)                  (+mean f.+tv.volas)
logL        -24175.70        -24156.96*                           -26936.15
AICc        48389.97         48375.30*                            53927.42

Table 4 compares the goodness of fit of the seven empirical model specifications we consider in our analysis. Each extension improves the performance of the model. However, the model without any spatial structure performs worst, despite featuring an unobserved time-varying mean and time-varying volatilities. We therefore conclude that explicitly accounting for dynamic contemporaneous spillovers of shocks, as is done by the time-varying spatial model, is an important feature when analyzing sovereign credit spread data.

The parameter estimates from the full model with spatial score updating, time-varying vari-

ances, and unobserved time-varying mean factor are given in Table 5. In contrast to the spatial

factor, the variance factors and particularly the mean factor are less persistent, which is seen by the

values of Bσ and Bλ, respectively. This is off-set by a larger impact of the scores in the transition

equations; see the values of Aσ and Aλ.

None of the parameters λi, i = 1, . . . , n, corresponding to the mean factor are individually

significantly different from zero. Jointly, however, they improve the model fit, as is indicated by

the AICc in Table 4. Furthermore, the loading estimates have an economic interpretation: the

non-stressed Eurozone countries have a negative coefficient λi, while the most stressed countries

during part of the European sovereign debt crisis (Portugal, Ireland, Spain) have positive loadings.

With regard to dynamic spatial dependence, the qualitative implications of the full model and the basic time-varying spatial t-model are very similar. This is shown in Figure 5. Omitting the additional variance and mean dynamics leads to a slight upward adjustment in the filtered spatial dependence parameter, but the overall pattern does not change.

Table 5: Estimated parameters and their numerically approximated (sandwich) standard errors in parentheses, for the full model featuring spatial score updating, time-varying sovereign-specific variances, an unobserved mean factor, and t-distributed error terms. The maximized loglikelihood value (logL) and the Akaike information criterion (AICc) are also reported. Estimation period is February 2, 2009 - May 12, 2014.

Parameter         Estimate (s.e.)      Parameter          Estimate (s.e.)      Parameter    Estimate (s.e.)
ωλ                -0.0012 (0.0252)     ωσ1 Belgium        0.0426 (0.0125)      ω            0.0307 (0.0229)
Aλ                0.3494 (0.8937)      ωσ2 France         0.0448 (0.0142)      A            0.019 (0.007)
Bλ                0.6891 (0.1065)      ωσ3 Germany        0.0573 (0.0155)      B            0.9636 (0.0271)
λ1 Belgium        -0.2776 (0.2308)     ωσ4 Ireland        0.0301 (0.01)        const.       -0.0621 (0.024)
λ2 France         -0.2846 (0.3137)     ωσ5 Italy          0.0471 (0.0136)      VStoxx       -0.0257 (0.0157)
λ3 Germany        -0.2029 (0.2811)     ωσ6 Netherlands    0.0443 (0.0132)      term sp.     0.0693 (0.0705)
λ4 Ireland        0.405 (0.6928)       ωσ7 Portugal       0.0524 (0.0153)      stocks       -0.102 (0.0183)
λ5 Italy          -0.1604 (0.2429)     ωσ8 Spain          0.0591 (0.016)       yields       0.0173 (0.0026)
λ6 Netherlands    -0.1891 (0.2519)     Aσ                 0.1826 (0.023)       λ0           3.1357 (0.1977)
λ7 Portugal       0.4614 (0.8334)      Bσ                 0.9479 (0.0135)
λ8 Spain          0.0988 (0.3635)

logLik            -24156.96
AICc              48375.3

Results from standard residual diagnostic tests are given in Table 6. The full model is able to substantially reduce auto-correlations and ARCH effects for most individual series.⁹ Furthermore, cross-correlations are, on average, much lower for the model residuals than for the raw data. To give a full picture, we also provide the full correlation matrices in Table B.2.

To further check the robustness of our results, we repeat the estimation, employing absolute instead of relative CDS spread changes as dependent variable. Figure B.2 shows the evolution of the corresponding filtered parameter from the full model. Apart from an overall lower level of spatial dependence, and a more clearly visible impact of the financial crisis at the beginning of the sample, the pattern is similar to the picture obtained by using log changes.

⁹ Italy is the only country for which ARCH effects are not reduced by our model.


Table 6: Diagnostic tests for the residuals of the full model featuring spatial updating factor, volatilities, and additional mean factor, all driven by the score function, compared to the raw CDS spread changes. LB refers to the Ljung-Box test for residual serial correlation, ARCH LM refers to the test for remaining auto-correlation in the squared residuals. The right panel contains averages of pairwise cross-correlation.

               LB test stat.           ARCH LM test stat.      average cross-corr.
sovereign      raw        residuals    raw        residuals    raw      residuals
Belgium        108.64     15.93        169.91     25.53        0.70     0.07
France         49.48      30.42        160.44     43.32∗       0.66     -0.01
Germany        62.61      19.49        142.70     53.78∗       0.63     -0.07
Ireland        129.89     17.53        302.23     87.11∗       0.64     -0.07
Italy          99.02      42.43∗       102.13     150.88∗      0.71     0.08
Netherlands    55.69      33.29∗       124.41     20.96        0.64     -0.05
Portugal       167.91     32.56∗       189.35     56.89∗       0.65     0.03
Spain          105.81     48.88∗       253.68     154.42∗      0.69     0.06

∗ Remaining effects at 5% level

Figure 5: Filtered spatial dependence parameters obtained from the basic time-varying spatial model with t-distributed errors (green) as well as with sovereign-specific, dynamic variances and an unobserved mean factor (red).

[Plot: rho_t on the vertical axis (0.4 to 0.9) against time (2009 to 2014); series: basic t-GAS model, full t-GAS model.]

Choice of the spatial weight matrix

So far, all reported results have been obtained using the categorical spatial weight matrix Wcat described in Section 5.1. To check the robustness of our findings, we re-estimate the model using different choices of W. Candidate choices include the matrix containing the averaged raw exposure data (Wraw), a model in which the matrix of exposure data is updated quarterly (Wdyn), and a binary matrix indicating the geographical neighborhood of the countries in our sample (Wgeo).¹⁰ As the different models all have the same number of parameters, we can simply compare the likelihood values at the optimum.

Table 7: Comparison of likelihood values for the time-varying spatial model with t-errors, using different spatial weights matrices.

        Wraw         Wdyn         Wcat         Wgeo
logL    -24745.56    -24679.44    -24506.11    -25556.85

Table 7 shows that the goodness of fit differs considerably across specifications. The model with a categorical weights matrix provides the best fit. However, the parameter estimates are very robust to the specification of W, and none of the qualitative implications of our model changes.
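Because all four specifications have the same number of parameters, any information criterion of the form "penalty minus twice the log-likelihood" ranks them exactly as the log-likelihoods do. The short sketch below only re-orders the values reported in Table 7 and prints each model's shortfall relative to the best fit; the helper name `rank_by_loglik` is ours and purely illustrative.

```python
# Maximized log-likelihoods as reported in Table 7.
loglik = {
    "Wraw": -24745.56,
    "Wdyn": -24679.44,
    "Wcat": -24506.11,
    "Wgeo": -25556.85,
}


def rank_by_loglik(values):
    """Return (name, loglik, shortfall vs. best) tuples, best fit first."""
    best = max(values.values())
    ordered = sorted(values.items(), key=lambda kv: kv[1], reverse=True)
    return [(name, ll, best - ll) for name, ll in ordered]


ranking = rank_by_loglik(loglik)
for name, ll, gap in ranking:
    print(f"{name}: logL = {ll:.2f}, shortfall vs best = {gap:.2f}")
```

The ordering puts the three exposure-based matrices ahead of the geographic one, which is the comparison discussed in the text.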

It is particularly interesting to see that the weight matrices based on economic distances as

measured through financial cross-exposures (Wraw, Wdyn, and Wcat) provide a much better fit

than a matrix based on geographic distances (Wgeo). Some categorization is needed as well in

order to make the sizes of cross-exposures comparable. However, as mentioned before, scaling

the exposures by the size of the economy (as measured by GDP) did not provide an improvement

in terms of model fit.

6 Conclusion

In this paper, we propose a new model for time-varying spatial dependence in panel data sets.

The model extends the widely used spatial lag model to a time-varying parameter framework by

endowing the spatial dependence parameter with generalized autoregressive score dynamics and

fat tails. Allowing for time-variation is particularly useful if we apply spatial models over longer

time periods, where we can no longer be sure that the spatial dependence parameter is constant.

The fat-tailed feature of our model is useful in a setting where we apply the model to financial

data, which typically exhibit fatter tails than the normal.

We established the theoretical properties of our new model: the dynamics of the model are optimal in the sense that they locally reduce the Kullback-Leibler distance of the statistical model to the true unknown conditional density with every score update of the spatial dependence parameter. Moreover, the maximum likelihood estimator for the model is consistent and asymptotically normal under mild regularity conditions. Also, we showed under what conditions the model is invertible, such that the filtered estimates of the time-varying spatial dependence parameter converge in the limit to a unique stationary and ergodic sequence.

¹⁰ We also experimented with a weights matrix in which we weighted the raw exposures of the financial markets by the countries' respective GDP. However, the fit did not improve.

In our empirical study based on our time-varying spatial model, we showed that European

sovereign CDS spreads exhibit a strong, time-varying degree of spatial dependence. Cross-border

debt linkages appear as a suitable transmission channel for the spatial spillovers. In our final

model, we incorporated a time-varying common mean factor as well as time-varying volatilities

into the specification. Using the filtered time-varying parameters of this final model, we found

evidence for a break in spatial dependence towards the end of 2012. This illustrates that policies

by regulators have at least been partly effective in breaking the high spill-over effects prevalent

during the height of the European sovereign debt crisis.

References

Acharya, V., Drechsler, I., and Schnabl, P. (2013). A pyrrhic victory? bank bailouts and sovereign

credit risk. Journal of Finance, forthcoming.

Acharya, V. V. and Steffen, S. (2013). The “greatest” carry trade ever? understanding eurozone

bank risks. Working Paper 19039, National Bureau of Economic Research.

Ang, A. and Longstaff, F. A. (2013). Systemic sovereign risk: Lessons from the u.s. and europe.

Journal of Monetary Economics, 60:493–510.

Anselin, L. (1988). Spatial Econometrics: Methods and Models. Springer.

Arezki, R., Candelon, B., and Sy, A. N. R. (2011). Sovereign rating news and financial markets

spillovers: Evidence from the european debt crisis. IMF Working Paper WP/11/68.

Arnold, M., Stahlberg, S., and Wied, D. (2013). Modeling different kinds of spatial dependence

in stock returns. Empirical Economics, 44:761–774.

Asgharian, H., Hess, W., and Liu, L. (2013). A spatial analysis of international stock market

linkages. Journal of Banking and Finance, 37:4738–4754.

Beetsma, R., Giuliodori, M., de Jong, F., and Widijanto, D. (2012). Spread the news: The impact of news on the european sovereign bond markets during the crisis. Journal of International Money and Finance, 34:83–101.

Billingsley, P. (1961). The lindeberg-levy theorem for martingales. Proceedings of the American

Mathematical Society, 12(5):788–792.

Billingsley, P. (1995). Probability and Measure. Wiley-Interscience.

Blasques, F., Koopman, S. J., and Lucas, A. (2014a). Information theoretic optimality of observation driven time series models. Tinbergen Institute Discussion Papers 14-046/III.

Blasques, F., Koopman, S. J., and Lucas, A. (2014b). Maximum likelihood estimation for generalized autoregressive score models. Tinbergen Institute Discussion Papers 14-029/III.

Bougerol, P. (1993a). Kalman Filtering with Random Coefficients and Contractions.

Prepublications de l’Institut Elie Cartan. Univ. de Nancy.

Bougerol, P. (1993b). Kalman filtering with random coefficients and contractions. SIAM J. Control

Optim., 31(4):942–959.

Caporin, M., Pelizzon, L., Ravazzolo, F., and Rigobon, R. (2013). Measuring sovereign contagion in europe. NBER Working Paper No. 18741.

Creal, D., Koopman, S. J., and Lucas, A. (2011). A dynamic multivariate heavy-tailed model

for time-varying volatilities and correlations. Journal of Business and Economic Statistics,

29(4):552–563.

Creal, D., Koopman, S. J., and Lucas, A. (2013). Generalized autoregressive score models with

applications. Journal of Applied Econometrics, 28:777–795.

Creal, D., Schwaab, B., Koopman, S. J., and Lucas, A. (2014). Observation driven mixed-measurement dynamic factor models. Review of Economics and Statistics, forthcoming.

De Santis, R. A. (2012). The euro area sovereign debt crisis - safe haven, credit rating agencies

and the spread of the fever from greece, ireland and portugal. ECB Working Paper No. 1419.

Denbee, E., Julliard, C., Li, Y., and Yuan, K. (2013). Network risk and key players: A structural

analysis of interbank liquidity. Working Paper.

Dieckmann, S. and Plank, T. (2012). Default risk of advanced economies: An empirical analysis

of credit default swaps during the financial crisis. Review of Finance, 16:903–934.


Dudley, R. M. (2002). Real Analysis and Probability. Cambridge Studies in Advanced Mathematics. Cambridge University Press.

Engle, R. (2002). Dynamic conditional correlation. Journal of Business & Economic Statistics,

20:339–350.

Fernandez, V. (2011). Spatial linkages in international financial markets. Quantitative Finance,

11:237–245.

Foland, G. B. (2009). A Guide to Advanced Real Analysis. Dolciani Mathematical Expositions.

Cambridge University Press.

Gallant, R. and White, H. (1988). A Unified Theory of Estimation and Inference for Nonlinear

Dynamic Models. Cambridge University Press.

Gorea, D. and Radev, D. (2013). The euro area sovereign debt crisis: Can contagion spread from

the periphery to the core? Gutenberg School of Management and Economics Discussion Paper

No. 1208.

Harvey, A. C. (2013). Dynamic Models for Volatility and Heavy Tails. Econometric Society

Monographs, Cambridge University Press.

Harvey, A. C. and Luati, A. (2014). Filtering with heavy tails. Journal of the American Statistical Association, forthcoming.

Hillier, G. and Martellosio, F. (2013). Properties of the maximum likelihood estimator in spatial

autoregressive models. cemmap working paper CWP44/13.

Kalbaska, A. and Gatkowski, M. (2012). Eurozone sovereign contagion: Evidence from the cds market (2005–2010). Journal of Economic Behavior & Organization, 83:657–673.

Kallestrup, R., Lando, D., and Murgoci, A. (2013). Financial sector linkages and the

dynamics of bank and sovereign credit spreads. Working paper, available at SSRN:

http://ssrn.com/abstract=2023635.

Keiler, S. and Eder, A. (2013). Cds spreads and systemic risk - a spatial econometric approach.

Deutsche Bundesbank Discussion Paper, No. 1/2013.

Kelejian, H. H. and Prucha, I. R. (2010). Specification and estimation of spatial autoregressive models with autoregressive and heteroskedastic disturbances. Journal of Econometrics, 157:53–67.

Korte, J. and Steffen, S. (2013). Zero risk contagion - banks’ sovereign exposure and sovereign

risk spillovers. Mimeo.

Krengel, U. (1985). Ergodic theorems. De Gruyter studies in Mathematics, Berlin.

Lee, L. (2004). Asymptotic distributions of quasi-maximum likelihood estimators for spatial autoregressive models. Econometrica, 72:1899–1925.

Lee, L. and Yu, J. (2010). Some recent developments in spatial panel data models. Regional

Science and Urban Economics, 40:255–271.

LeSage, J. P. and Pace, R. K. (2008). Introduction to Spatial Econometrics. CRC Press.

Leschinski, C. and Bertram, P. (2013). Pure contagion dynamics in emu government bond spreads.

Mimeo.

Lucas, A., Schwaab, B., and Zhang, X. (2014). Conditional euro area sovereign default risk. Journal of Business and Economic Statistics, 32(2):271–284.

Oh, D. H. and Patton, A. (2013). Time-varying systemic risk: Evidence from a dynamic copula

model of cds spreads. Duke University Discussion Paper.

Ord, K. (1975). Estimation methods for models of spatial interaction. Journal of the American

Statistical Association, 70:120–126.

Pesaran, M. H. and Pick, A. (2007). Econometric issues in the analysis of contagion. Journal of Economic Dynamics & Control, 31:1245–1277.

Ranga Rao, R. (1962). Relations between weak and uniform convergence of measures with applications. Annals of Mathematical Statistics, 33:659–680.

Saldias, M. (2013). A market-based approach to sector risk determinants and transmission in the

euro area. ECB Working Paper Series, No. 1574.

Straumann, D. and Mikosch, T. (2006). Quasi-maximum-likelihood estimation in conditionally heteroscedastic time series: A stochastic recurrence equations approach. The Annals of Statistics, 34(5):2449–2495.

van der Vaart, A. W. (2000). Asymptotic Statistics (Cambridge Series in Statistical and Probabilistic Mathematics). Cambridge University Press.

White, H. (1994). Estimation, Inference and Specification Analysis. Cambridge Books. Cambridge

University Press.

Wied, D. (2013). Cusum-type testing for changing parameters in a spatial autoregressive model for stock returns. Journal of Time Series Analysis, 34:221–229.

Wintenberger, O. (2013). Continuous invertibility and stable QML estimation of the EGARCH(1,

1) model. Scandinavian Journal of Statistics, 40(4):846–867.

Yu, J., de Jong, R., and Lee, L.-f. (2008). Quasi-maximum likelihood estimators for spatial dynamic panel data with fixed effects when both n and t are large. Journal of Econometrics, 146:118–134.

Appendix A Model extensions

We restrict the model extensions to the case of Student’s t distributed errors. We obtain the equations for the Gaussian case as a special case by letting λ0 → ∞.

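We assume that the vector of variance factors $f^\sigma_t$ in (17) follows an n-dimensional score process as given by

\[
f^\sigma_{t+1} = \omega^\sigma + A^\sigma \nabla^\sigma_t + B^\sigma f^\sigma_t ,
\]

with $\omega^\sigma = (\omega^\sigma_1, \ldots, \omega^\sigma_n)'$, and $A^\sigma, B^\sigma \in \mathbb{R}$. We thus allow for sovereign-specific intercepts in the variance score update, but restrict the dynamic parameters $A^\sigma$ and $B^\sigma$ to be common across all countries. This results in a parsimonious, yet flexible model. The score of the spatial dependence factor $f_t$ is given in (10), with $\Sigma$ replaced by $\Sigma_t$. For the variance factors, the score vector is

\[
\nabla^\sigma_t = \frac{\partial \ell_t}{\partial f^\sigma_t}
= \frac{1}{2}
\begin{pmatrix}
\dfrac{(1+\lambda^{-1}n)\,\exp(-f^\sigma_{1,t})\,\bigl(y_{1,t} - h(f_t)\sum_{j=1}^{n} w_{1j}\,y_{j,t} - x_{1,t}'\beta\bigr)^2}
      {1+\lambda^{-1}\,(y_t - h(f_t)Wy_t - X_t\beta)'\,\Sigma(f^\sigma_t)^{-1}\,(y_t - h(f_t)Wy_t - X_t\beta)} - 1 \\
\vdots \\
\dfrac{(1+\lambda^{-1}n)\,\exp(-f^\sigma_{n,t})\,\bigl(y_{n,t} - h(f_t)\sum_{j=1}^{n} w_{nj}\,y_{j,t} - x_{n,t}'\beta\bigr)^2}
      {1+\lambda^{-1}\,(y_t - h(f_t)Wy_t - X_t\beta)'\,\Sigma(f^\sigma_t)^{-1}\,(y_t - h(f_t)Wy_t - X_t\beta)} - 1
\end{pmatrix},
\]

with $X_t' = (x_{1,t}, \ldots, x_{n,t})$, and $x_{i,t} \in \mathbb{R}^{k\times 1}$.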
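To make the structure of the variance score concrete, the following is a minimal sketch in plain Python. It assumes, as the $\exp(-f^\sigma_{i,t})$ terms suggest but the appendix does not state explicitly, that $\Sigma(f^\sigma_t)$ is diagonal with entries $\exp(f^\sigma_{i,t})$. The function name `variance_score`, the argument `rho` (standing for $h(f_t)$) and the toy inputs are illustrative choices, not the authors' code.

```python
import math


def variance_score(y, W, X, beta, rho, f_sigma, lam):
    """Score of the log-variance factors under Student's t errors.

    Assumption (labelled, not from the paper): Sigma(f_sigma) is diagonal
    with i-th entry exp(f_sigma[i]). `rho` plays the role of h(f_t) and
    `lam` is the degrees-of-freedom parameter.
    """
    n = len(y)
    k = len(beta)
    # Residual e_i = y_i - rho * sum_j w_ij y_j - x_i' beta
    resid = [
        y[i]
        - rho * sum(W[i][j] * y[j] for j in range(n))
        - sum(X[i][m] * beta[m] for m in range(k))
        for i in range(n)
    ]
    inv_var = [math.exp(-f) for f in f_sigma]
    # Quadratic form e' Sigma^{-1} e for diagonal Sigma
    quad = sum(inv_var[i] * resid[i] ** 2 for i in range(n))
    # Common weight that downweights large (outlying) residual vectors
    weight = (1.0 + n / lam) / (1.0 + quad / lam)
    return [0.5 * (weight * inv_var[i] * resid[i] ** 2 - 1.0) for i in range(n)]


# Toy two-country example (all numbers hypothetical).
W = [[0.0, 1.0], [1.0, 0.0]]
X = [[1.0], [1.0]]
y = [0.5, -0.2]
score_t = variance_score(y, W, X, [0.1], 0.3, [0.0, 0.0], 5.0)
score_gauss = variance_score(y, W, X, [0.1], 0.3, [0.0, 0.0], 1e12)
print(score_t, score_gauss)
```

For very large `lam` the weight tends to one and each element reduces to one half of the squared standardized residual minus one, in line with the remark that the Gaussian case is obtained as a limit.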

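In the presence of an additional mean factor $f^\lambda_t$ as in (19), the score update for $f_t$ changes from (10) to

\[
\nabla_t = \Bigl[\, w_t \cdot \bigl(Wy_t - W\lambda f^\lambda_t\bigr)'\,\Sigma^{-1}\bigl(y_t - h(f_t)Wy_t - X_t\beta - Z_t^{-1}\lambda f^\lambda_t\bigr) - \mathrm{tr}(Z_t W) \Bigr] \cdot h(f_t),
\]

\[
w_t = \frac{1+\lambda^{-1}n}
{1+\lambda^{-1}\bigl(y_t - h(f_t)Wy_t - X_t\beta - Z_t^{-1}\lambda f^\lambda_t\bigr)'\,\Sigma^{-1}\bigl(y_t - h(f_t)Wy_t - X_t\beta - Z_t^{-1}\lambda f^\lambda_t\bigr)} .
\tag{A.1}
\]

The updating equation for $f^\lambda_t$ is given by

\[
f^\lambda_{t+1} = \omega^\lambda + A^\lambda \nabla^\lambda_t + B^\lambda f^\lambda_t ,
\]

with score

\[
\nabla^\lambda_t = w_t \cdot (Z_t^{-1}\lambda)'\,\Sigma^{-1}\bigl(y_t - h(f_t)Wy_t - X_t\beta - Z_t^{-1}\lambda f^\lambda_t\bigr). \tag{A.2}
\]

Finally, in the benchmark model (20), the score expression equals that in (A.2) with $W = 0$ and $Z_t \equiv I_n$.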

Appendix B Additional tables and figures

Table B.1: Key policy events during the Eurozone crisis

Date            Event                                                                           Source
Oct. 18, 2009   Greece announces doubling of budget deficit                                     The Guardian¹
Mar. 3, 2010    EU offers financial help to Greece                                              ECB²
Dec. 7, 2010    Ireland is bailed out by EU and IMF                                             ECB²
Dec. 22, 2011   ECB launches the first Longer-Term Refinancing Operation (LTRO)                 ECB²
Mar. 1, 2012    ECB launches the second LTRO                                                    ECB²
Jul. 26, 2012   M. Draghi: “[T]he ECB is ready to do whatever it takes to preserve the euro.”   ECB³
Oct. 8, 2012    European Stability Mechanism (ESM) is inaugurated                               ESM⁴
Sep. 12, 2013   European Parliament approves new unified bank supervision system                ECB²

¹ http://www.theguardian.com/business/2012/mar/09/greek-debt-crisis-timeline
² http://www.ecb.europa.eu/ecb/html/crisis.de.html
³ http://www.ecb.europa.eu/press/key/date/2012/html/sp120726.en.html
⁴ http://www.esm.europa.eu/press/releases/20121008_esm-is-inaugurated.htm
All retrieved on June 19, 2014.

Figure B.1: Filtered spatial dependence parameters obtained from the full model, together with key policy events from Table B.1.

[Plot annotations: Greece: record deficit; Help offer to Greece; Ireland bailed out; First LTRO; Second LTRO; Mario Draghi: “Whatever it takes”; ESM inaugurated; New supervisory authority.]


Figure B.2: Filtered spatial parameter obtained from the full time-varying spatial model with time-varying volatilities and unobserved mean factor, using absolute CDS spread changes as dependent variable.

[Plot: rho_t on the vertical axis (0.0 to 0.7) against time (2009 to 2014).]

Table B.2: Cross-correlation matrices: raw data and full model residuals

Correlation matrix of raw CDS change data

              Belgium   France   Germany   Ireland   Italy    Netherlands   Portugal   Spain
Belgium       1.000     0.724    0.697     0.630     0.738    0.737         0.643      0.707
France                  1.000    0.724     0.581     0.655    0.678         0.581      0.657
Germany                          1.000     0.553     0.609    0.685         0.534      0.577
Ireland                                    1.000     0.718    0.575         0.724      0.685
Italy                                                1.000    0.654         0.740      0.847
Netherlands                                                   1.000         0.566      0.620
Portugal                                                                    1.000      0.742
Spain                                                                                  1.000

Correlation matrix of residuals

              Belgium   France   Germany   Ireland   Italy    Netherlands   Portugal   Spain
Belgium       1.000     0.113    0.068     -0.114    0.186    0.079         0.03       0.145
France                  1.000    0.234     -0.232    -0.076   0.029         -0.11      -0.015
Germany                          1.000     -0.247    -0.231   0.099         -0.19      -0.212
Ireland                                    1.000     0.038    -0.115        0.205      -0.038
Italy                                                1.000    -0.148        0.248      0.534
Netherlands                                                   1.000         -0.126     -0.172
Portugal                                                                    1.000      0.161
Spain                                                                                  1.000

C Technical appendix: proofs

C.1 Proofs of main theorems

The lines of proof adopted here closely follow the original lines of proof in Blasques et al. (2014b), extended to the

case of exogenous variables.

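Proof of Theorem 1: Define the norms $\|\cdot\|^{\Theta} := \sup_{\theta\in\Theta}|\cdot|$ and $\|\cdot\|^{\Theta}_{N_f} := E\sup_{\theta\in\Theta}\|\cdot\|_{N_f}$. Following Straumann and Mikosch (2006, Proposition 3.12), we have

\[
\sup_{\theta\in\Theta}\bigl|f_t(y^{1:t-1}, X^{1:t-1}, \theta, f_1) - f_t(y^{t-1}, X^{t-1}, \theta)\bigr| \xrightarrow{e.a.s.} 0 .
\]

This follows directly from Bougerol (1993b, Theorem 3.1) in the context of the random sequence $\{f_t(y^{1:t-1}, X^{1:t-1}, \cdot, f_1^{\Theta})\}_{t\in\mathbb{N}}$ with elements $f_t(y^{1:t-1}, X^{1:t-1}, \cdot, f_1^{\Theta})$ taking values in the separable Banach space $\mathcal{F}^{\Theta} \subseteq (C(\Theta,\mathcal{F}), \|\cdot\|^{\Theta})$, with initialization $f_1^{\Theta}$ in $C(\Theta,\mathcal{F})$, where $f_1^{\Theta}(\theta) = f_1 \ \forall\, \theta\in\Theta$, and¹¹

\[
f_{t+1}(y^{1:t}, X^{1:t}, \cdot, f_1^{\Theta}) = \phi_t\bigl(f_t(y^{1:t-1}, X^{1:t-1}, \cdot, f_1^{\Theta})\bigr)
:= \phi\bigl(f_t(y^{1:t-1}, X^{1:t-1}, \cdot, f_1^{\Theta}),\, y_t,\, X_t;\, \cdot\bigr) \quad \forall\, t\in\mathbb{N},
\]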

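where $\{\phi_t\}_{t\in\mathbb{Z}}$ is a stationary and ergodic (SE) sequence of stochastic recurrence equations $\phi_t : \Xi \times C(\Theta,\mathcal{F}) \to C(\Theta,\mathcal{F}) \ \forall\, t$ as in Straumann and Mikosch (2006, Proposition 3.12). Note that with a slight abuse of notation we use $\phi$ both to denote the functional $\phi : C(\Theta,\mathcal{F}) \times \mathcal{Y} \times \mathcal{X} \to C(\Theta,\mathcal{F})$ as well as the function $\phi : \mathcal{F} \times \mathcal{Y} \times \mathcal{X} \times \Theta \to \mathcal{F}$. Continuity of $\phi$ follows from $s \in C(\mathcal{F} \times \mathcal{Y} \times \mathcal{X} \times \mathcal{B} \times \Lambda)$, where $\mathcal{B}$ is the domain of the regression parameters $\beta$.

The assumption that $\{y_t\}_{t\in\mathbb{Z}}$ and $\{X_t\}_{t\in\mathbb{Z}}$ are SE and the continuity of $\phi$ together imply that $\{\phi_t\}_{t\in\mathbb{Z}}$ is SE by Krengel (1985, Proposition 4.3). Condition C1 in Bougerol (1993b, Theorem 3.1) follows from $E\ln^{+}\|s(f^{\Theta}, y_t, X_t; \cdot, \cdot)\|^{\Theta} < \infty$ since, by norm sub-additivity and positive homogeneity, for any $f^{\Theta} \in C(\Theta,\mathcal{F})$,

\[
\begin{aligned}
E\ln^{+}\|\phi_t(f^{\Theta})\|^{\Theta} &\le E\|\phi_t(f^{\Theta})\|^{\Theta} = E\|\phi(f^{\Theta}, y_t, X_t; \cdot)\|^{\Theta} \\
&= E\|\omega + A\,s(f^{\Theta}, y_t, X_t; \cdot, \cdot) + B f^{\Theta}\|^{\Theta} \\
&\le \sup_{\theta\in\Theta}|\omega| + \sup_{\theta\in\Theta}|A|\; E\|s(f^{\Theta}, y_t, X_t; \cdot, \cdot)\|^{\Theta} + \sup_{\theta\in\Theta}|B| \cdot \|f^{\Theta}\|^{\Theta} < \infty,
\end{aligned}
\]

because $\sup_{\theta\in\Theta}|\omega| < \infty$, $\sup_{\theta\in\Theta}|A| < \infty$, $\sup_{\theta\in\Theta}|B| < \infty$, and $\sup_{\theta\in\Theta}\|f^{\Theta}\|^{\Theta} < \infty$ hold by compactness of $\Theta$ and continuity of $f^{\Theta}$, and $E\|s(f^{\Theta}, y_t, X_t; \cdot, \cdot)\|^{\Theta} < \infty$ holds by assumption. This implies that $f^{\Theta} \in C(\Theta,\mathcal{F})$ satisfies

\[
\begin{aligned}
E\log^{+}\|\phi_0(f^{\Theta}) - f^{\Theta}\|^{\Theta} &\le E\|\phi_0(f^{\Theta}) - f^{\Theta}\|^{\Theta} \le E\|\phi(f^{\Theta}, y_t, X_t; \cdot)\|^{\Theta} + \|f^{\Theta}\|^{\Theta} \\
&= E\sup_{\theta\in\Theta}|\phi(f^{\Theta}(\theta), y_t, X_t, \theta)| + \sup_{\theta\in\Theta}|f^{\Theta}(\theta)| < \infty .
\end{aligned}
\]

By a similar argument $E\ln^{+}\sup_{(\beta,\lambda)\in\mathcal{B}\times\Lambda}|s(f_1, y_t, X_t; \beta, \lambda)| < \infty$ implies $E\log^{+}\|\phi_0(f^{\Theta}) - f^{\Theta}\|^{\Theta}_{N_f} < \infty$.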

¹¹ That (C(Θ,F), ‖ · ‖Θ) is a separable Banach space under compact Θ follows from application of the Arzelà–Ascoli theorem to obtain completeness and the Stone-Weierstrass theorem for separability.

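For any pair $(f^{\Theta}, f'^{\Theta}) \in C(\Theta) \times C(\Theta)$, define

\[
\rho_t = \rho(\phi_t) = \sup_{(f^{\Theta}, f'^{\Theta}) \in \mathcal{F}^{\Theta}\times\mathcal{F}^{\Theta}} \frac{\|\phi_t(f^{\Theta}) - \phi_t(f'^{\Theta})\|^{\Theta}}{\|f^{\Theta} - f'^{\Theta}\|^{\Theta}} .
\]

Condition C2 in Bougerol (1993b, Theorem 3.1) holds if $E\ln\rho_t < 0$. This is ensured by $E\ln\|\phi'_t\|^{\Theta} < 0$, with $\phi_t(\theta)$ as defined in (11). To see this, note that

\[
\begin{aligned}
E\ln\rho(\phi_t) &:= E\ln \sup_{\|f^{\Theta} - f'^{\Theta}\| > 0} \frac{\|\phi_t(f^{\Theta}) - \phi_t(f'^{\Theta})\|^{\Theta}}{\|f^{\Theta} - f'^{\Theta}\|^{\Theta}} \\
&= E\ln \sup_{\|f^{\Theta} - f'^{\Theta}\| > 0} \frac{\sup_{\theta\in\Theta}|\phi(f(\theta, f_1), y_t, X_t, \theta) - \phi(f'(\theta, f_1), y_t, X_t, \theta)|}{\sup_{\theta\in\Theta}|f(\theta, f_1) - f'(\theta, f_1)|} \\
&\le E\ln \sup_{\|f^{\Theta} - f'^{\Theta}\| > 0} \frac{\sup_{\theta\in\Theta}\phi'_t(\theta)\; \sup_{\theta\in\Theta}|f(\theta, f_1) - f'(\theta, f_1)|}{\sup_{\theta\in\Theta}|f(\theta, f_1) - f'(\theta, f_1)|} \\
&= E\ln\|\phi'_t\|^{\Theta} < 0 .
\end{aligned}
\]

Also note that for the t period composition of the stochastic recurrence equation, we have $E\ln\rho(\phi_t \circ \cdots \circ \phi_1) \le E\ln\prod_{j=1}^{t}\rho(\phi_j) \le \sum_{j=1}^{t}\ln\|\phi'_j\|^{\Theta} < 0$, where $\circ$ denotes composition. As a result, $\{f_t(\cdot, f_1)\}_{t\in\mathbb{N}}$ converges e.a.s. to an SE solution $\{f_t(\cdot)\}_{t\in\mathbb{Z}}$ in $\|\cdot\|^{\Theta}$-norm. Uniqueness and e.a.s. convergence is obtained in Straumann and Mikosch (2006, Theorem 2.8).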
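The contraction argument can be illustrated numerically. The sketch below does not use the paper's score function: it assumes a toy recursion $f_{t+1} = \omega + a\tanh(y_t - f_t) + b f_t$ with $0 < a < b < 1$, whose derivative with respect to $f_t$ lies in $[b-a,\, b]$, so the map is a uniform contraction with modulus $b$. Two filtered paths started from different initial values and driven by the same data should then approach each other at an exponential rate. All names and numbers are hypothetical.

```python
import math


def phi(f, y, omega=0.0, a=0.2, b=0.7):
    """Toy recurrence (an assumption, not the paper's model):
    f_next = omega + a * tanh(y - f) + b * f."""
    return omega + a * math.tanh(y - f) + b * f


def filter_gap(ys, f_init_1, f_init_2):
    """Absolute gap between two filtered paths that share the data `ys`
    but start from different initial values."""
    f1, f2, gaps = f_init_1, f_init_2, []
    for y in ys:
        f1, f2 = phi(f1, y), phi(f2, y)
        gaps.append(abs(f1 - f2))
    return gaps


# Deterministic stand-in for an observed series.
ys = [math.sin(0.7 * t) + 0.5 * math.cos(1.3 * t) for t in range(60)]
gaps = filter_gap(ys, -5.0, 5.0)
print(gaps[0], gaps[9], gaps[-1])
```

In this toy setting the gap after t steps is bounded by the initial gap times $b^t$, which is the deterministic analogue of the exponentially fast almost sure (e.a.s.) convergence used in the proof.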

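Finally, we show that $\sup_t E\sup_{\theta\in\Theta}|f_t(y^{1:t-1}, X^{1:t-1}, \theta, f_1)|^{N_f} < \infty$ and also $E\sup_{\theta\in\Theta}|f_t(y^{1:t-1}, X^{1:t-1}, \theta)|^{N_f} < \infty$. We have $\sup_t E\sup_{\theta\in\Theta}|f_t(y^{1:t-1}, X^{1:t-1}, \theta, f_1)|^{N_f} < \infty$ if and only if $\sup_t \bigl(E\sup_{\theta\in\Theta}|f_t(y^{1:t-1}, X^{1:t-1}, \theta, f_1)|^{N_f}\bigr)^{1/N_f} = \sup_t\|f_t(\cdot, f_1)\|^{\Theta}_{N_f} < \infty$. Furthermore, for any $f^{\Theta} \in C(\Theta,\mathcal{F})$, having $\|f_t(\cdot, f_1) - f^{\Theta}\|^{\Theta}_{N_f} < \infty$ implies $\|f_t(\cdot, f_1^{\Theta})\|^{\Theta}_{N_f} < \infty$ since continuity on the compact $\Theta$ implies $\sup_{\theta\in\Theta}|f(\theta)| < \infty$. For $f^{\Theta} \in C(\Theta,\mathcal{F})$, we define $f^{\Theta}_{*}$, $y_{*}$, and $X_{*}$ such that $f^{\Theta} = \phi(f^{\Theta}_{*}, y_{*}, X_{*}; \cdot) \in C(\Theta,\mathcal{F})$. Above we showed that $\exists\, f^{\Theta} \in C(\Theta,\mathcal{F})$ satisfying $\|\phi(f^{\Theta}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} \le \bar\phi < \infty$ and $\|f_1^{\Theta} - f^{\Theta}\|^{\Theta}_{N_f} = \|f_1^{\Theta} - \phi(f^{\Theta}_{*}, y_{*}, X_{*}; \cdot)\|^{\Theta}_{N_f} < \infty$. From this, we obtain

\[
\begin{aligned}
\sup_t \|f_t(\cdot, f_1^{\Theta}) - f^{\Theta}\|^{\Theta}_{N_f}
&= \sup_t \|\phi(f_t(\cdot, f_1^{\Theta}), y_t, X_t; \cdot) - \phi(f^{\Theta}_{*}, y_{*}, X_{*}; \cdot)\|^{\Theta}_{N_f} \\
&\le \sup_t \|\phi(f_t(\cdot, f_1^{\Theta}), y_t, X_t; \cdot) - \phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} \\
&\quad + \sup_t \|\phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} + \sup_t \|\phi(f^{\Theta}_{*}, y_{*}, X_{*}; \cdot)\|^{\Theta}_{N_f} \\
&\le \sup_t \Bigl( E \sup_{\theta\in\Theta} |f_t(\theta, f_1) - f^{\Theta}_{*}|^{N_f} \times \sup_{\theta\in\Theta} \frac{|\phi(f_t(\theta, f_1^{\Theta}), y_t, X_t; \theta) - \phi(f^{\Theta}_{*}(\theta), y_t, X_t; \theta)|^{N_f}}{|f_t(\theta, f_1^{\Theta}) - f^{\Theta}_{*}(\theta)|^{N_f}} \Bigr)^{1/N_f} \\
&\quad + \sup_t \|\phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} + \|f^{\Theta}\|^{\Theta}_{N_f} \\
&\le \sup_t \Bigl( E \sup_{\theta\in\Theta} |f_t(\theta, f_1) - f^{\Theta}_{*}|^{N_f} \times \sup_{\theta\in\Theta} \phi'_t(\theta)^{N_f} \Bigr)^{1/N_f} \\
&\quad + \sup_t \|\phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} + \|f^{\Theta}\|^{\Theta}_{N_f} .
\end{aligned}
\]

Using the orthogonality condition in (iv′), we can write the expectation of the product as the product of the expectations and continue

\[
\begin{aligned}
&\le \sup_t \|f_t(\cdot, f_1^{\Theta}) - f^{\Theta}_{*}\|^{\Theta}_{N_f} \cdot \|\phi'_t\|^{\Theta}_{N_f} + \sup_t \|\phi(f^{\Theta}_{*}, y_t, X_t; \cdot)\|^{\Theta}_{N_f} + \|f^{\Theta}\|^{\Theta}_{N_f} \\
&\le \|\phi'_t\|^{\Theta}_{N_f} \times \Bigl( \sup_t \|f_t(\cdot, f_1^{\Theta}) - f^{\Theta}_{*}\|^{\Theta}_{N_f} \Bigr) + \bar\phi + \bar f ,
\end{aligned}
\]

with $c = \|\phi'_t\|^{\Theta}_{N_f} < 1$ by condition (iv′), $\bar\phi < \infty$, and $\bar f = \|f^{\Theta}\| + c \cdot \|f^{\Theta} - f^{\Theta}_{*}\|^{\Theta}_{N_f} < \infty$. As a result we have the recursion $\sup_t \|f_t(\cdot, f_1^{\Theta}) - f^{\Theta}\|^{\Theta}_{N_f} \le c \cdot \sup_t \|f_t(\cdot, f_1^{\Theta}) - f^{\Theta}\|^{\Theta}_{N_f} + A$, with $A = \bar\phi + \bar f$. Hence,

\[
\begin{aligned}
\sup_t \|f_t(\cdot, f_1^{\Theta}) - f^{\Theta}\|^{\Theta}_{N_f}
&\le \sum_{j=0}^{t} c^{j}\bigl((c+1)\bar f + \bar\phi\bigr) + c^{t+1} \sup_t \|f_1^{\Theta} - f^{\Theta}\|^{\Theta}_{N_f} \\
&\le \frac{(c+1)\bar f + \bar\phi}{1-c} + \|f_1^{\Theta} - f^{\Theta}\|^{\Theta}_{N_f} < \infty .
\end{aligned}
\]

The same result holds using the uniform contraction in (iv) by taking a further supremum in $y_t$ and $X_t$ instead of the orthogonality condition.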

Proof of Theorem 2: Assumption 1 implies that $\ell_T(\theta, f_1) = (1/T)\sum_{t=1}^{T}\ell_t(\theta, f_1)$ is a.s. continuous (a.s.c.) in $\theta\in\Theta$ through continuity (c.) of each

\[
\begin{aligned}
\ell_t(\theta, f_1) &= \ell\bigl(y_t, X_t, f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta), \theta\bigr) \\
&= \log p_e\bigl(Z_t(f_t)^{-1}y_t - X_t\beta; \lambda\bigr) - \log|Z_t(f_t)|
\end{aligned}
\]

ensured in turn by the differentiability of $S$, $p_e$ and $h$ and the implied a.s.c. of

\[
\nabla\bigl(f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta), y_t, X_t; \beta, \lambda\bigr)
= \frac{\partial \log p_e\bigl(Z_t(f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta))^{-1}y_t - X_t\beta; \lambda\bigr)}{\partial f}
- \frac{\partial \log|Z_t(f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta))|}{\partial f}
\]

in $(f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta); \lambda)$ and the resulting c. of $f_t(y^{1:t-1}, X^{1:t-1}, f_1, \theta)$ in $\theta$ as a composition of $t$ c. maps. Together with the compactness of $\Theta$ this implies by Weierstrass’ theorem that the arg max set is non-empty a.s. and hence that $\hat\theta_T$ exists a.s. $\forall\, T\in\mathbb{N}$. Assumption 1 implies also by a similar argument that

\[
\ell_T(\theta, f_1) = \ell\bigl(f_{1:T}(y^{1:t-1}, X^{1:t-1}, f_1, \theta);\, y^{1:T}, X^{1:T}, \theta\bigr)
\]

is continuous in $(y^{1:T}, X^{1:T}) \ \forall\, \theta\in\Theta$ and hence measurable w.r.t. the product Borel σ-algebra $\mathcal{B}(\mathcal{Y}) \otimes \mathcal{B}(\mathcal{X})$ that are, in turn, measurable maps w.r.t. $\mathcal{F}$ by Proposition 4.1.7 in Dudley (2002).¹² The measurability of $\hat\theta_T$ follows from Foland (2009, p.24) and White (1994, Theorem 2.11) or Gallant and White (1988, Lemma 2.1, Theorem 2.2).¹³

Proof of Theorem 3: We obtain $\hat\theta_T(f_1) \xrightarrow{a.s.} \theta_0$ from the uniform convergence of the criterion function

\[
\sup_{\theta\in\Theta}|\ell_T(\theta, f_1) - \ell_\infty(\theta)| \xrightarrow{a.s.} 0 \quad \forall\, f_1\in\mathcal{F} \ \text{as } T\to\infty, \tag{C.1}
\]

and the identifiable uniqueness of the maximizer $\theta_0\in\Theta$ introduced in White (1994),

\[
\sup_{\theta:\|\theta-\theta_0\|>\varepsilon} \ell_\infty(\theta) < \ell_\infty(\theta_0) \quad \forall\, \varepsilon > 0; \tag{C.2}
\]

see for example White (1994, Theorem 3.4) or Theorem 3.3 in Gallant and White (1988) for further details.

¹² Dudley’s proposition states that the Borel σ-algebra B(A × B) generated by the Tychonoff’s product topology T_{A×B} on the space A × B includes the product σ-algebra B(A) ⊗ B(B).

¹³ The reference of Foland (2009) is used here to establish that a map into a product space is measurable if and only if its projections are measurable.

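The uniform convergence is obtained by norm sub-additivity,¹⁴

\[
\sup_{\theta\in\Theta}|\ell_T(\theta, f_1) - \ell_\infty(\theta)| \le \sup_{\theta\in\Theta}|\ell_T(\theta, f_1) - \ell_T(\theta)| + \sup_{\theta\in\Theta}|\ell_T(\theta) - \ell_\infty(\theta)|,
\]

and then showing that the initialization effect vanishes asymptotically,

\[
\sup_{\theta\in\Theta}|\ell_T(\theta, f_1) - \ell_T(\theta)| \xrightarrow{a.s.} 0 \quad \text{as } T\to\infty, \tag{C.3}
\]

and for the second term applying the ergodic theorem for separable Banach spaces in Ranga Rao (1962), as in Straumann and Mikosch (2006, Theorem 2.7), to the sequence $\{\ell_T(\cdot)\}$ with elements taking values in $C(\Theta,\mathbb{R})$ so that

\[
\sup_{\theta\in\Theta}|\ell_T(\theta) - \ell_\infty(\theta)| \xrightarrow{a.s.} 0 \quad \text{where } \ell_\infty(\theta) = E\ell_t(\theta) \ \forall\, \theta\in\Theta .
\]

The criterion $\ell_T(\theta, f_1)$ satisfies (C.3) if

\[
\sup_{\theta\in\Theta}|\ell_t(\theta, f_1) - \ell_t(\theta)| \xrightarrow{a.s.} 0 \quad \text{as } t\to\infty .
\]

The continuity of $p_e$ ensures that $\ell_t(\cdot, f_1) = \ell(f_t(y^{t}, X^{t}, \cdot, f_1), y_t, X_t, \cdot)$ is continuous in $(f_t(y^{t}, X^{t}, \cdot, f_1), y_t, X_t)$. Since all the assumptions of Theorem 1 are satisfied we know that there exists a unique SE sequence $\{f_t(y^{t}, X^{t}, \cdot)\}_{t\in\mathbb{Z}}$ with elements taking values in $C(\Theta,\mathcal{F})$ such that

\[
\sup_{\theta\in\Theta}\bigl|(f_t(y^{t-1}, X^{t-1}, f_1, \theta), y_t, X_t) - (f_t(y^{t-1}, X^{t-1}, \theta), y_t, X_t)\bigr| \xrightarrow{a.s.} 0,
\]

and

\[
\sup_t E\sup_{\theta\in\Theta}|f_t(y^{t-1}, X^{t-1}, f_1, \theta)|^{N_f} < \infty \quad \text{and} \quad E\sup_{\theta\in\Theta}|f_t(y^{t-1}, X^{t-1}, \theta)|^{N_f} < \infty,
\]

with $N_f \ge 1$. Hence, (C.3) follows by application of a continuous mapping theorem for $\ell : C(\Theta,\mathcal{F}) \to C(\Theta,\mathcal{F})$.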

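The ULLN $\sup_{\theta\in\Theta}|\ell_T(\theta) - E\ell_t(\theta)| \xrightarrow{a.s.} 0$ as $T\to\infty$ follows, under a moment bound $E\sup_{\theta\in\Theta}|\ell_t(\theta)| < \infty$, by the SE nature of $\{\ell_T\}_{t\in\mathbb{Z}}$ which is implied by continuity of $\ell$ on the SE sequence $\{(y_t, X_t, f_t(y^{t-1}, X^{t-1}, \cdot))\}_{t\in\mathbb{Z}}$ and Proposition 4.3 in Krengel (1985). The moment bound $E\sup_{\theta\in\Theta}|\ell_t(\theta)| < \infty$ can be established as follows. First note that

\[
\begin{aligned}
E\sup_{\theta\in\Theta}|\ell_t(\theta)| &= \sup_{\theta\in\Theta} E\bigl|\log p_e\bigl(y_t - h(f_t(y^{t-1}, X^{t-1}, \theta))Wy_{t-1} - X_t\beta\bigr) - \log\det Z(f_t(y^{t-1}, X^{t-1}, \theta))\bigr| \\
&\le \sup_{\theta\in\Theta} E\bigl|\log p_e\bigl(y_t - h(f_t(y^{t-1}, X^{t-1}, \theta))Wy_{t-1} - X_t\beta\bigr)\bigr| - \sup_{\theta\in\Theta} E\bigl|\log\det Z(f_t(y^{t-1}, X^{t-1}, \theta))\bigr| < \infty,
\end{aligned}
\]

then the bounded first moment for the likelihood is implied by having

\[
E|y_t|^{N_y} < \infty, \quad E|X_t|^{N_X} < \infty \quad \text{and} \quad \sup_{\theta\in\Theta} E|f_t(y^{t-1}, X^{t-1}, \theta)|^{N_f} < \infty
\]

since then

\[
\sup_{\theta\in\Theta} E\bigl|\log\det Z(f_t(y^{t-1}, X^{t-1}, \theta))\bigr| < \infty,
\]

because $\log|Z| \in \mathbb{M}(N_f, N_{\log|Z|})$ with $N_{\log|Z|} \ge 1$ by assumption and

\[
\sup_{\theta\in\Theta} E\bigl|\log p_e\bigl(y_t - h(f_t(y^{t-1}, X^{t-1}, \theta))Wy_{t-1} - X_t\beta\bigr)\bigr| < \infty,
\]

because $h$ is uniformly bounded and hence the argument of $p_e$ has the same moments as $y_t$ and $X_t$. This ensures the desired moment since $\log p_e \in \mathbb{M}(N, N_{\log p_e})$ with $N_{\log p_e} \ge 1$ and $N = \min\{N_y, N_x\}$ by assumption.

Finally, the identifiable uniqueness (see e.g. White (1994)) of $\theta_0\in\Theta$ in (C.2) follows from the assumed uniqueness, the compactness of $\Theta$, and the continuity of the limit $E\ell_t(\theta)$ in $\theta\in\Theta$ which is implied by the continuity of $\ell_T$ in $\theta\in\Theta \ \forall\, T\in\mathbb{N}$ and the uniform convergence in (C.1).

¹⁴ $\ell_T(\theta)$ denotes $\ell_T(\theta, f_1)$ with $f(\theta, f_1)$ replaced by its limit for $T\to\infty$, i.e., by $f(\theta)$.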

Proof of Theorem 4: As the likelihood and its derivatives depend on the derivatives of $f(\theta, f_1)$ with respect to $\theta$, we introduce the notation $f^{(0:m)}$ as the vector containing $f(\theta, f_1)$ and its derivatives up to order $m$, with initial condition $f^{(0:m)}_1$. We obtain the desired result from: (i) the strong consistency of $\hat\theta_T \xrightarrow{a.s.} \theta_0 \in \mathrm{int}(\Theta)$; (ii) the a.s. twice continuous differentiability of $\ell_T(\theta, f_1)$ in $\theta\in\Theta$; (iii) the asymptotic normality of the score

\[
\sqrt{T}\,\ell'_T(\theta_0, f^{(0:1)}_1) \xrightarrow{d} N(0, \mathcal{J}(\theta_0)), \qquad \mathcal{J}(\theta_0) = E\bigl(\tilde\ell'_t(\theta_0)\,\tilde\ell'_t(\theta_0)^{\top}\bigr); \tag{C.4}
\]

(iv) the uniform convergence of the likelihood’s second derivative,

\[
\sup_{\theta\in\Theta}\bigl\|\ell''_T(\theta, f^{(0:2)}_1) - \ell''_\infty(\theta)\bigr\| \xrightarrow{a.s.} 0; \tag{C.5}
\]

and finally, (v) the non-singularity of the limit $\ell''_\infty(\theta) = E\tilde\ell''_t(\theta) = \mathcal{I}(\theta)$. See e.g. White (1994, Theorem 6.2) for further details.

The consistency condition $\hat\theta_T \xrightarrow{a.s.} \theta_0 \in \mathrm{int}(\Theta)$ in (i) follows under the maintained assumptions from Theorem 3 and the additional assumption in Theorem 4 that $\theta_0 \in \mathrm{int}(\Theta)$. The smoothness condition in (ii) follows immediately from Assumption 2 and the likelihood expressions in Appendix C.2.

The asymptotic normality of the score in (C.7) follows by Theorem 18.10[iv] in van der Vaart (2000) by showing that

\[
\|\ell'_T(\theta_0, f^{(0:1)}_1) - \ell'_T(\theta_0)\| \xrightarrow{e.a.s.} 0 \quad \text{as } T\to\infty, \tag{C.6}
\]

plus a CLT result for $\ell'_T(\theta_0)$. Note that from (C.6) we obtain that $\sqrt{T}\,\|\ell'_T(\theta_0, f^{(0:1)}_1) - \ell'_T(\theta_0)\| \xrightarrow{a.s.} 0$ as $T\to\infty$. The desired CLT result follows by an application of the CLT for SE martingales in Billingsley (1961),

\[
\sqrt{T}\,\ell'_T(\theta_0) \xrightarrow{d} N(0, \mathcal{J}(\theta_0)) \quad \text{as } T\to\infty, \tag{C.7}
\]

where $\mathcal{J}(\theta_0) = E\bigl(\tilde\ell'_t(\theta_0)\,\tilde\ell'_t(\theta_0)^{\top}\bigr) < \infty$, where finite (co)variances follow from the assumption $N_{\ell'} \ge 2$ in Assumption 4 and the expressions for the likelihood in Appendix C.2.

To establish the e.a.s. convergence in (C.6), we use the e.a.s. convergence

\[
|f_t(y^{1:t-1}, X^{1:t-1}, \theta_0, f_1) - f_t(y^{t-1}, X^{t-1}, \theta_0)| \xrightarrow{e.a.s.} 0, \tag{C.8}
\]

and

\[
\|f^{(1)}_t(y^{1:t-1}, X^{1:t-1}, \theta_0, f^{(0:1)}_1) - f^{(1)}_t(y^{1:t-1}, X^{1:t-1}, \theta_0)\| \xrightarrow{e.a.s.} 0. \tag{C.9}
\]

The e.a.s. convergence in (C.8) is obtained directly by application of Theorem 1 under the maintained assumptions. The e.a.s. convergence in (C.9) is obtained by the same argument as in the proof of Theorem 1 since: (a) the expressions for the derivative process $f^{(1)}_t$ in Appendix C.2 show that the contraction condition

\[
E\ln\sup_{\theta\in\Theta}\phi'_{1,1}(\theta) < 0
\]

for the recursion of the filter $f_t$ is the same as the contraction condition for the derivative process $f^{(1)}_t$; and (b) the expressions in Appendix C.2 also reveal that the counterpart of the moment condition

\[
E\ln^{+}\sup_{(B,\lambda)\in\mathcal{B}\times\Lambda}|s(f_1, y_t, X_t; B, \lambda)| < \infty,
\]

used in Theorem 1 for the filtered process $f_t$, is implied by the condition that

\[
\min\bigl\{N_f,\, N_s,\, N_s^{(0,1,0)},\, N_s^{(0,0,1)}\bigr\} > 0,
\]

as imposed in Assumption 4.

From the differentiability of

\[
\tilde\ell'_t(\theta, f^{(0:1)}_1) = \ell'\bigl(\theta, y^{1:t}, X^{1:t}, f^{(0:1)}_t(y^{1:t-1}, X^{1:t-1}, \theta, f^{(0:1)}_1)\bigr)
\]

in $f^{(0:1)}_t(y^{1:t-1}, X^{1:t-1}, \theta, f^{(0:1)}_1)$ and the convexity of $\mathcal{F}$, we use the mean-value theorem to obtain

\[
\|\ell'_T(\theta_0, f^{(0:1)}_1) - \ell'_T(\theta_0)\| \le \sum_{j=1}^{4+d_\lambda} \Bigl|\frac{\partial \ell'(y^{1:t}, X^{1:t}, \hat f^{(0:1)}_t)}{\partial f_j}\Bigr| \times \bigl|f^{(0:1)}_{j,t}(y^{1:t-1}, X^{1:t-1}, \theta_0, f^{(0:1)}_1) - f^{(0:1)}_{j,t}(y^{1:t-1}, X^{1:t-1}, \theta_0)\bigr|, \tag{C.10}
\]

where $d_\lambda$ denotes the dimension of $\lambda$, $f^{(0:1)}_{j,t}$ denotes the j-th element of $f^{(0:1)}_t$, and $\hat f^{(0:1)}_t$ is on the segment connecting $f^{(0:1)}_{j,t}(y^{1:t-1}, X^{1:t-1}, \theta_0, f^{(0:1)}_1)$ and $f^{(0:1)}_{j,t}(y^{1:t-1}, X^{1:t-1}, \theta_0)$. Note that $f^{(0:1)}_t \in \mathbb{R}^{4+d_\lambda}$ because it contains $f_t \in \mathbb{R}$ (the first element) as well as $f^{(1)}_t \in \mathbb{R}^{3+d_\lambda}$ (the derivatives with respect to $\omega$, $A$, $B$, and $\lambda$). Using the expressions of the likelihood and its derivatives in Appendix C.2, together with the moment bounds and the moment preserving properties in Assumption 4, one can show that

\[
\bigl|\partial \ell'(y^{1:t}, X^{1:t}, \hat f^{(0:1)}_t)/\partial f_j\bigr| = O_p(1) \quad \forall\, j = 1, \ldots, 4+d_\lambda .
\]

The strong convergence in (C.10) is now ensured by

\[
\|\ell'_T(\theta_0, f^{(0:1)}_1) - \ell'_T(\theta_0)\| = \sum_{i=1}^{4+d_\lambda} O_p(1)\, o_{e.a.s.}(1) = o_{e.a.s.}(1). \tag{C.11}
\]

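The proof of the uniform convergence in (C.5) is similar to that of Theorem 2. We note

\[
\sup_{\theta\in\Theta}\|\ell''_T(\theta, f_1) - \ell''_\infty(\theta)\| \le \sup_{\theta\in\Theta}\|\ell''_T(\theta, f_1) - \ell''_T(\theta)\| + \sup_{\theta\in\Theta}\|\ell''_T(\theta) - \ell''_\infty(\theta)\|. \tag{C.12}
\]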

To prove that the first term vanishes a.s., we show that

\[
\sup_{\theta\in\Theta}\|\tilde\ell''_t(\theta, f_1) - \tilde\ell''_t(\theta)\| \xrightarrow{a.s.} 0 \quad \text{as } t\to\infty .
\]

The differentiability of $g$, $g'$, $p$, and $S$ from Assumption 2 ensures that

\[
\tilde\ell''_t(\cdot, f_1) = \ell''\bigl(y_t, f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \cdot, f^{(0:2)}_1), \cdot\bigr)
\]

is continuous in $(y_t, f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \cdot, f^{(0:2)}_1))$. Again, we note that the proof of Theorem 1 can be easily adapted to show that there exists a unique SE sequence $\{f^{(0:2)}_t(y^{t-1}, X^{t-1}, \cdot)\}_{t\in\mathbb{Z}}$ such that

\[
\sup_{\theta\in\Theta}\bigl\|(y_t, f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \theta, f^{(0:2)}_1)) - (y_t, f^{(0:2)}_t(y^{t-1}, X^{t-1}, \theta))\bigr\| \xrightarrow{a.s.} 0,
\]

and satisfying, for $N_f \ge 1$,

\[
\sup_t E\sup_{\theta\in\Theta}\|f^{(0:2)}_t(y^{1:t-1}, X^{1:t-1}, \theta, f^{(0:2)}_1)\|^{N_f} < \infty,
\]

and also

\[
E\sup_{\theta\in\Theta}\|f^{(0:2)}_t(y^{t-1}, X^{t-1}, \theta)\|^{N_f} < \infty,
\]

because (a) the expressions for the derivative process $f^{(1)}_t$ in Appendix C.2 show that the contraction condition

\[
E\ln\sup_{\theta\in\Theta}\phi'_{1,1}(\theta) < 0
\]

for the recursion of the filter $f_t$ is the same as the contraction condition for the second derivative process $f^{(2)}_t$; and (b) the expressions in Appendix C.2 show also that the counterpart of the moment condition

\[
E\ln^{+}\sup_{(B,\lambda)\in\mathcal{B}\times\Lambda}|s(f_1, y_t, X_t; B, \lambda)| < \infty,
\]

used in Theorem 1 for the filtered process $f_t$, is implied by the condition that

\[
\min\Bigl\{ N^{(1)}_f,\; N^{(0,1,0)}_s,\; N^{(0,0,1)}_s,\; N^{(0,2,0)}_s,\; N^{(0,0,2)}_s,\; N^{(0,1,1)}_s,\;
\frac{N^{(1,0,0)}_s N^{(1)}_f}{N^{(1,0,0)}_s + N^{(1)}_f},\;
\frac{N^{(2,0,0)}_s N^{(1)}_f}{2N^{(2,0,0)}_s + N^{(1)}_f},\;
\frac{N^{(1,1,0)}_s N^{(1)}_f}{N^{(1,1,0)}_s + N^{(1)}_f},\;
\frac{N^{(1,0,1)}_s N^{(1)}_f}{N^{(1,0,1)}_s + N^{(1)}_f} \Bigr\} > 0,
\]

imposed in Assumption 4. By application of a continuous mapping theorem for $\ell'' : C(\Theta \times \mathcal{F}^{(0:2)}) \to \mathbb{R}$ we thus conclude that the first term in (C.12) converges to 0 a.s.

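The second term in (C.12) converges under a bound $\mathbb{E} \sup_{\theta \in \Theta} \| \tilde{\ell}''_t(\theta) \| < \infty$ by the SE nature of $\{\ell''_T\}_{t \in \mathbb{Z}}$. The latter is implied by continuity of $\ell''$ on the SE sequence

\[
\{ (y_t, X_t, f^{(0:2)}_t(y_{1:t-1}, X_{1:t-1}, \cdot)) \}_{t \in \mathbb{Z}}.
\]

The moment bound $\mathbb{E} \sup_{\theta \in \Theta} \| \tilde{\ell}''_t(\theta) \| < \infty$ follows from $N_{\ell''} \ge 1$ in Assumption 4 and the expressions in Appendix C.2. Finally, the non-singularity of the limit $\ell''_\infty(\theta) = \mathbb{E}\, \tilde{\ell}''_t(\theta) = I(\theta)$ in (v) below equation (C.5) is implied by the uniqueness of $\theta_0$ as a maximum of $\ell''_\infty(\theta)$ in $\Theta$.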


C.2 Derivatives of the likelihood function

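We take first derivatives of the likelihood with respect to all static parameters $\theta = (\omega, a, b, \beta', \sigma^2)'$:

\[
\frac{\partial \ell_t}{\partial \theta} = \left( \frac{\partial \ell_t}{\partial \omega}, \frac{\partial \ell_t}{\partial a}, \frac{\partial \ell_t}{\partial b}, \frac{\partial \ell_t}{\partial \beta}, \frac{\partial \ell_t}{\partial \sigma^2} \right)'.
\]

Let $\theta_m$ denote the $m$th element of $\theta$. We can decompose the derivatives of the likelihood with respect to each $\theta_m$ into two parts: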

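\[
\frac{\partial \ell_t}{\partial \theta_m} = \frac{\partial (p_t + \ln g'_t)}{\partial f_t} \cdot \frac{\partial f_t}{\partial \theta_m} + \frac{\partial p_t}{\partial \theta_m} = \nabla_t \cdot \frac{\partial f_t}{\partial \theta_m} + \frac{\partial p_t}{\partial \theta_m}, \tag{C.13}
\]

because $g'_t$ does not depend on any of the parameters directly, only through $f_t$. For $\theta_m \in \{\omega, a, b\}$ the second term is zero, because these parameters enter the likelihood only through $f_t$.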

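All partial derivatives contain the term $\partial f_t / \partial \theta_m$ given by

\begin{align}
\frac{\partial f_t}{\partial \theta_m}
&= \frac{\partial}{\partial \theta_m} \left( \omega + a s_{t-1} + b f_{t-1} \right) \tag{C.14} \\
&= \frac{\partial \omega}{\partial \theta_m} + \frac{\partial a}{\partial \theta_m} s_{t-1} + a \frac{\partial s_{t-1}}{\partial f_{t-1}} \cdot \frac{\partial f_{t-1}}{\partial \theta_m} + a \frac{\partial s_{t-1}}{\partial \theta_m} + \frac{\partial b}{\partial \theta_m} f_{t-1} + b \frac{\partial f_{t-1}}{\partial \theta_m} \tag{C.15} \\
&= \frac{\partial \omega}{\partial \theta_m} + \frac{\partial a}{\partial \theta_m} \nabla_{t-1} + a \nabla'_{t-1} \cdot \frac{\partial f_{t-1}}{\partial \theta_m} + a \frac{\partial \nabla_{t-1}}{\partial \theta_m} + \frac{\partial b}{\partial \theta_m} f_{t-1} + b \frac{\partial f_{t-1}}{\partial \theta_m} \tag{C.16} \\
&= \frac{\partial \omega}{\partial \theta_m} + \frac{\partial a}{\partial \theta_m} \nabla_{t-1} + a \frac{\partial \nabla_{t-1}}{\partial \theta_m} + \frac{\partial b}{\partial \theta_m} f_{t-1} + \left( a \nabla'_{t-1} + b \right) \frac{\partial f_{t-1}}{\partial \theta_m}. \tag{C.17}
\end{align}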
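The recursion (C.17) and the decomposition (C.13) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: it replaces the spatial model by a hypothetical toy log-density $\ell_t = -\tfrac{1}{2}(y_t - \tanh f_t)^2$ with unit score scaling, restricts attention to $\theta_m \in \{\omega, a, b\}$ (so that the direct terms $\partial p_t/\partial\theta_m$ and $\partial\nabla_{t-1}/\partial\theta_m$ vanish), and fixes $f_1$ so that its derivatives are zero. The function names `score_terms`, `filter_and_gradient`, and `numerical_gradient` are illustrative and do not come from the paper.

```python
import math

def score_terms(y, f):
    """Score nabla_t and its f-derivative for the ASSUMED toy log-density
    l_t = -0.5 * (y - tanh(f))**2 (not the paper's spatial likelihood)."""
    h = math.tanh(f)
    u = 1.0 - h * h                           # d tanh(f) / df
    grad = (y - h) * u                        # nabla_t
    dgrad = -u * u - 2.0 * h * u * (y - h)    # nabla'_t
    return grad, dgrad

def filter_and_gradient(ys, omega, a, b, f1=0.0):
    """Return the log-likelihood and its gradient w.r.t. (omega, a, b)."""
    f = f1
    df = [0.0, 0.0, 0.0]       # f_1 is fixed, so its derivatives are zero
    loglik = 0.0
    grad_ll = [0.0, 0.0, 0.0]
    for y in ys:
        grad, dgrad = score_terms(y, f)
        loglik += -0.5 * (y - math.tanh(f)) ** 2
        # (C.13): the direct term is zero for omega, a, and b.
        for m in range(3):
            grad_ll[m] += grad * df[m]
        # (C.17): direct parts are (1, nabla_{t-1}, f_{t-1}) for (omega, a, b).
        direct = (1.0, grad, f)
        df = [direct[m] + (a * dgrad + b) * df[m] for m in range(3)]
        f = omega + a * grad + b * f          # filter update with s_t = nabla_t
    return loglik, grad_ll

def numerical_gradient(ys, theta, eps=1e-6):
    """Central finite differences of the log-likelihood, used as a check."""
    out = []
    for m in range(3):
        up, dn = list(theta), list(theta)
        up[m] += eps
        dn[m] -= eps
        diff = filter_and_gradient(ys, *up)[0] - filter_and_gradient(ys, *dn)[0]
        out.append(diff / (2.0 * eps))
    return out

# Deterministic illustrative data and parameter values (both assumptions).
ys = [0.5 * math.sin(0.7 * t) for t in range(50)]
theta = (0.05, 0.1, 0.8)
_, analytic = filter_and_gradient(ys, *theta)
numeric = numerical_gradient(ys, theta)
print(max(abs(x - z) for x, z in zip(analytic, numeric)))
```

The printed value is the largest discrepancy between the analytic gradient and its finite-difference approximation; it should be small if the recursion is implemented correctly. For the full model, the terms $\partial p_t/\partial\theta_m$ and $\partial\nabla_{t-1}/\partial\theta_m$ for $\beta$ and $\sigma^2$ would have to be added.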

We want the matrix of second derivatives of the likelihood function, i.e. $\partial^2 \ell_t / \partial \theta\, \partial \theta'$.

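We take another derivative of (C.13) with respect to $\theta_o$:

\[
\frac{\partial^2 \ell_t}{\partial \theta_m \partial \theta_o} = \nabla'_t \cdot \frac{\partial f_t}{\partial \theta_o} \cdot \frac{\partial f_t}{\partial \theta_m} + \frac{\partial \nabla_t}{\partial \theta_o} \cdot \frac{\partial f_t}{\partial \theta_m} + \nabla_t \frac{\partial^2 f_t}{\partial \theta_m \partial \theta_o} + \frac{\partial^2 p_t}{\partial \theta_m \partial \theta_o}. \tag{C.18}
\]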

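The second derivative process takes the form

\[
\begin{aligned}
\frac{\partial^2 f_t}{\partial \theta_m \partial \theta_o}
={}& \frac{\partial a}{\partial \theta_m} \cdot \frac{\partial \nabla_{t-1}}{\partial f_{t-1}} \cdot \frac{\partial f_{t-1}}{\partial \theta_o}
+ \frac{\partial a}{\partial \theta_m} \frac{\partial \nabla_{t-1}}{\partial \theta_o} \\
&+ \frac{\partial a}{\partial \theta_o} \frac{\partial \nabla_{t-1}}{\partial f_{t-1}} \frac{\partial f_{t-1}}{\partial \theta_m}
+ a \frac{\partial^2 \nabla_{t-1}}{\partial f_{t-1}^2} \frac{\partial f_{t-1}}{\partial \theta_o} \frac{\partial f_{t-1}}{\partial \theta_m}
+ a \frac{\partial^2 \nabla_{t-1}}{\partial f_{t-1} \partial \theta_o} \frac{\partial f_{t-1}}{\partial \theta_m} \\
&+ a \frac{\partial \nabla_{t-1}}{\partial f_{t-1}} \frac{\partial^2 f_{t-1}}{\partial \theta_m \partial \theta_o}
+ \frac{\partial a}{\partial \theta_o} \frac{\partial \nabla_{t-1}}{\partial \theta_m}
+ a \frac{\partial^2 \nabla_{t-1}}{\partial \theta_m \partial \theta_o}
+ a \frac{\partial^2 \nabla_{t-1}}{\partial \theta_m \partial f_{t-1}} \frac{\partial f_{t-1}}{\partial \theta_o} \\
&+ \frac{\partial b}{\partial \theta_m} \frac{\partial f_{t-1}}{\partial \theta_o}
+ \frac{\partial b}{\partial \theta_o} \frac{\partial f_{t-1}}{\partial \theta_m}
+ b \frac{\partial^2 f_{t-1}}{\partial \theta_m \partial \theta_o} \\
={}& \frac{\partial a}{\partial \theta_m} \cdot \nabla'_{t-1} \cdot \frac{\partial f_{t-1}}{\partial \theta_o}
+ \frac{\partial a}{\partial \theta_m} \frac{\partial \nabla_{t-1}}{\partial \theta_o} \\
&+ \frac{\partial a}{\partial \theta_o} \cdot \nabla'_{t-1} \cdot \frac{\partial f_{t-1}}{\partial \theta_m}
+ a \nabla''_{t-1} \cdot \frac{\partial f_{t-1}}{\partial \theta_o} \frac{\partial f_{t-1}}{\partial \theta_m}
+ a \frac{\partial \nabla'_{t-1}}{\partial \theta_o} \frac{\partial f_{t-1}}{\partial \theta_m} \\
&+ a \nabla'_{t-1} \frac{\partial^2 f_{t-1}}{\partial \theta_m \partial \theta_o}
+ \frac{\partial a}{\partial \theta_o} \frac{\partial \nabla_{t-1}}{\partial \theta_m}
+ a \frac{\partial^2 \nabla_{t-1}}{\partial \theta_m \partial \theta_o}
+ a \frac{\partial^2 \nabla_{t-1}}{\partial \theta_m \partial f_{t-1}} \frac{\partial f_{t-1}}{\partial \theta_o} \\
&+ \frac{\partial b}{\partial \theta_m} \frac{\partial f_{t-1}}{\partial \theta_o}
+ \frac{\partial b}{\partial \theta_o} \frac{\partial f_{t-1}}{\partial \theta_m}
+ b \frac{\partial^2 f_{t-1}}{\partial \theta_m \partial \theta_o}.
\end{aligned}
\tag{C.19}
\]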
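The second-derivative recursion (C.19) can be checked in the same spirit. The sketch below is an illustration under assumptions, not the authors' implementation: it again uses a hypothetical toy log-density $\ell_t = -\tfrac{1}{2}(y_t - \tanh f_t)^2$ with $s_t = \nabla_t$, considers only $\theta = (\omega, a, b)$, and fixes $f_1$. In that special case every term in (C.19) that involves a direct derivative of $\nabla_{t-1}$ with respect to $\theta_m$ or $\theta_o$ is zero, so only the terms in $\nabla'_{t-1}$, $\nabla''_{t-1}$, and the lagged derivatives of $f_{t-1}$ remain. The names `score_derivs` and `filter_derivs` are illustrative.

```python
import math

def score_derivs(y, f):
    """nabla, nabla', nabla'' in f for the ASSUMED toy log-density
    l_t = -0.5 * (y - tanh(f))**2 (not the paper's spatial likelihood)."""
    h = math.tanh(f)
    u = 1.0 - h * h
    r = y - h
    g0 = r * u
    g1 = -u * u - 2.0 * h * u * r
    g2 = 6.0 * h * u * u - 2.0 * u * u * r + 4.0 * h * h * u * r
    return g0, g1, g2

def filter_derivs(ys, omega, a, b, f1=0.0):
    """Run f_{t+1} = omega + a * nabla_t + b * f_t over ys and return the final
    f together with its gradient and Hessian w.r.t. (omega, a, b)."""
    eo = (1.0, 0.0, 0.0)   # d omega / d theta_m
    ea = (0.0, 1.0, 0.0)   # d a / d theta_m
    eb = (0.0, 0.0, 1.0)   # d b / d theta_m
    f = f1
    df = [0.0] * 3                          # f_1 fixed: zero first derivatives
    d2f = [[0.0] * 3 for _ in range(3)]     # and zero second derivatives
    for y in ys:
        g0, g1, g2 = score_derivs(y, f)
        # (C.19) with all direct derivatives of nabla w.r.t. theta set to zero.
        new_d2f = [[ea[m] * g1 * df[o] + ea[o] * g1 * df[m]
                    + a * g2 * df[o] * df[m] + a * g1 * d2f[m][o]
                    + eb[m] * df[o] + eb[o] * df[m] + b * d2f[m][o]
                    for o in range(3)] for m in range(3)]
        # (C.17) with d nabla / d theta_m = 0.
        new_df = [eo[m] + ea[m] * g0 + eb[m] * f + (a * g1 + b) * df[m]
                  for m in range(3)]
        f = omega + a * g0 + b * f
        df, d2f = new_df, new_d2f
    return f, df, d2f

# Deterministic illustrative data and parameter values (both assumptions).
ys = [0.5 * math.sin(0.7 * t) for t in range(50)]
f_last, grad_f, hess_f = filter_derivs(ys, 0.05, 0.1, 0.8)
for row in hess_f:
    print(row)
```

A simple way to validate the output is to difference the first-derivative recursion numerically in each parameter and compare with the corresponding column of the Hessian; the matrix should also be symmetric in $(m, o)$ up to rounding.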
