On market efficiency and volatility estimation
DISSERTATION
of the University of St. Gallen,
School of Management,
Economics, Law, Social Sciences
and International Affairs
to obtain the title of
Doctor of Philosophy in Economics and Finance
submitted by
Wale Dare
Approved on the application of
Prof. Dr. Matthias Fengler
and
Prof. Dr. Josef Teichmann
Dissertation no. 4748
Gutenberg AG, Schaan, 2018
The University of St. Gallen, School of Management, Economics, Law, Social Sciences and
International Affairs hereby consents to the printing of the present dissertation, without hereby
expressing any opinion on the views herein expressed.
St. Gallen, December 5, 2017
The President:
Prof. Dr. Thomas Bieger
Contents

1 Global estimation of realized spot volatility in the presence of price jumps
  1.1 Introduction
  1.2 Prices
  1.3 Frames
    1.3.1 Gabor frames
  1.4 Volatility estimation: continuous prices
  1.5 Volatility estimation: discontinuous prices
  1.6 Simulation
    1.6.1 Continuous prices
    1.6.2 Prices with jumps
  1.7 Empirical illustration - Flash Crash of 2010
  1.8 Conclusion
  1.9 Appendix

2 Testing efficiency in small and large financial markets
  2.1 Introduction
  2.2 Efficiency in standard markets
    2.2.1 Statistical inference for market efficiency
  2.3 Market efficiency in large financial markets
    2.3.1 Large financial market payoff space
    2.3.2 Arbitrage pricing in large financial markets
  2.4 Asymptotic market efficiency
    2.4.1 Statistical inference for asymptotic market efficiency
  2.5 Conclusion

3 Statistical arbitrage in the U.S. treasury futures market
  3.1 Introduction
  3.2 Data
    3.2.1 Treasury futures
    3.2.2 Continuous prices
  3.3 Economic framework
    3.3.1 The price of a futures contract
    3.3.2 Factor model of the yield curve
    3.3.3 Factor extraction
    3.3.4 Factor structure implies cointegration
    3.3.5 Cointegration implies a factor structure
  3.4 Methodology
  3.5 Results
    3.5.1 Return calculation
    3.5.2 Excess returns
    3.5.3 Statistical arbitrage
  3.6 Conclusion
Abstract
In Chapter 1, we propose a non-parametric procedure for estimating the
realized spot volatility of a price process described by an Itô semimartingale
with Lévy jumps. The procedure integrates the threshold jump elimination
technique of Mancini (2009) with a frame (Gabor) expansion of the realized
trajectory of spot volatility. We show that the procedure converges in probability in $L^2([0,T])$ for a wide class of spot volatility processes, including those with discontinuous paths. Our analysis assumes the time interval between price observations tends to zero; as a result, the intended application is the analysis of high-frequency financial data.
In Chapter 2, we investigate practical tests of market efficiency that are
not subject to the joint-hypothesis problem inherent in tests that require
the specification of an equilibrium model of asset prices. The methodology
we propose simplifies the testing procedure considerably by reframing the
market efficiency question into one about the existence of a local martingale
measure. As a consequence, the need to directly verify the no dominance
condition is completely avoided. We also investigate market efficiency in the
large financial market setting with the introduction of notions of asymptotic
no dominance and market efficiency that remain consistent with the small
market theory. We obtain a change of numeraire characterization of asymp-
totic market efficiency and suggest empirical tests of inefficiency in large
financial markets.
In Chapter 3, we argue empirically that the U.S. treasury futures market is informationally inefficient. We show that an intraday strategy based on the assumption of cointegrated treasury futures prices earns statistically significant excess returns over the equally weighted portfolio of treasury futures. We also provide empirical backing for the claim that the same strategy, financed by taking a short position in the 2-Year treasury futures contract, gives rise to a statistical arbitrage.
Abstract (German)

In Chapter 1, we propose a non-parametric procedure for estimating the realized spot volatility of a price process described by an Itô semimartingale with Lévy jumps. The procedure integrates the threshold jump-elimination technique of Mancini (2009) with a frame (Gabor) expansion of the realized trajectory of spot volatility. We show that the procedure converges in probability in $L^2([0,T])$ for a wide class of spot volatility processes, including those with discontinuous paths. Our analysis assumes that the time interval between price observations tends to zero, so that the intended application is the analysis of high-frequency financial data.

In Chapter 2, we investigate practical tests of market efficiency that are not subject to the joint-hypothesis problem inherent in tests requiring the specification of an equilibrium model of asset prices. The methodology we propose simplifies the testing procedure considerably by reframing the question of market efficiency into one about the existence of a local martingale measure. As a consequence, the need to directly verify the no-dominance condition is avoided entirely. We also investigate market efficiency in the setting of large financial markets by introducing notions of asymptotic no dominance and market efficiency that remain consistent with the small-market theory. We obtain a change-of-numeraire characterization of asymptotic market efficiency and suggest empirical tests of inefficiency in large financial markets.

In Chapter 3, we argue empirically that the U.S. treasury futures market is informationally inefficient. We show that an intraday strategy based on the assumption of cointegrated treasury futures prices earns statistically significant excess returns over the equally weighted portfolio of treasury futures. We also provide empirical support for the claim that the same strategy, financed by taking a short position in the 2-Year treasury futures contract, gives rise to a statistical arbitrage.
Chapter 1
Global estimation of realized spot
volatility in the presence of price
jumps
1.1 Introduction
Volatility estimation using discretely observed asset prices has received a great deal of attention recently; however, much of that effort has been focused on estimating the integrated volatility and, to a lesser extent, the spot volatility at a given point in time. Notable contributions to the literature
volatility at a given point in time. Notable contributions to the literature
on volatility estimation include the papers by Foster & Nelson (1996), Fan
& Wang (2008), Florens-Zmirou (1993), and Barndorff-Nielsen & Shephard
(2004). In these studies, the object of interest is local in nature: spot volatil-
ity at a given point in time or integrated volatility up to a terminal point in
time. In contrast, estimators which aim to obtain spot volatility estimates
for entire time windows have received much less coverage. These are the
so-called “global” spot volatility estimators. These estimators derive their
name from the fact that the objects of interest are not localized. Typically,
a global estimator would be a random element whose realizations would be
elements of some function space.
There are potential benefits to adopting global estimators of spot volatility. Given a consistent global estimate of the spot volatility $\sigma^2$ over an interval $[0, T]$, the integrated volatility at any point $t$ within $[0, T]$ may be consistently estimated by integrating $\sigma^2$ over the interval $[0, t]$. In fact, by the continuous mapping theorem, consistent estimates of continuous transformations of $\sigma^2$ are immediately available. Hence, integrated powers of spot volatility, $\int_0^t \sigma_s^p\,ds$, $p > 0$; the running maximum of spot volatility, $\sigma_t^* := \sup_{s\le t}|\sigma_s|$; and volatility in excess of a given threshold, $\sigma_t^a := \sigma_t\,\mathbb{I}_{\{|\sigma_t| > a\}}$, $a > 0$; to name just a few, are easily obtained via the obvious transformation of the estimated global spot volatility. This flexibility is one of the more appealing features of this class of estimators.
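The flexibility described above is easy to sketch numerically. In the snippet below, the grid, the toy spot-variance path, and the threshold value are illustrative assumptions, not quantities from the text; the point is only that, given a global estimate of $\sigma^2$ on $[0, T]$, each functional is a one-line transformation.

```python
import numpy as np

# Toy global estimate of sigma^2 on a grid over [0, T] (illustrative assumption)
T = 1.0
t = np.linspace(0.0, T, 1001)
sigma2 = 0.04 + 0.02 * np.sin(2 * np.pi * t) ** 2   # assumed spot-variance path
sigma = np.sqrt(sigma2)
dt = t[1] - t[0]

# Integrated volatility up to each t: cumulative Riemann sum of sigma^2
integrated_vol = np.cumsum(sigma2) * dt

# Integrated p-th power of spot volatility, here p = 3
p = 3.0
integrated_power = np.cumsum(sigma ** p) * dt

# Running maximum of |sigma|
running_max = np.maximum.accumulate(np.abs(sigma))

# Volatility in excess of a threshold a (a = 0.22 is an arbitrary choice)
a = 0.22
sigma_excess = sigma * (np.abs(sigma) > a)
```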
The estimator by Genon-Catalot et al. (1992) is an early contribution to the study of the realized trajectory of spot volatility. Working within the context of continuous asset prices and deterministic spot volatility, the authors described an estimator of the realized trajectory of spot volatility using wavelet projection methods. Their basic framework has been extended by Hoffmann et al. (2012), who proposed an adaptive estimator of spot volatility for continuous asset prices subject to market microstructure noise contamination.
Another important contribution to the global spot volatility estimation literature is the estimator studied by Malliavin & Mancino (2002), which relies on Fourier methods to estimate the realized path of spot volatility for assets with continuous prices. In their procedure, the Fourier coefficients of the realized price path are first estimated and then used to derive expressions for the Fourier coefficients of the realized path of spot volatility.
In the current work, we extend the study of the realized path of spot volatility to situations where the price process or the volatility coefficient itself cannot be assumed to be continuous. That is, we describe a procedure for consistently estimating càdlàg volatility paths in the presence of price jumps. By employing Gabor frames in our analysis, we are able to leverage their excellent time-frequency localization property to obtain a sparse representation of the realized trajectories of spot volatility.
The rest of this paper is organized as follows: in Section 1.2 we introduce notation and give a general description of the dynamics of observed prices. In Section 1.3 we introduce Gabor frames and review the basic theory required for our subsequent analysis. We present our main results in Sections 1.4 and 1.5, where we specify the estimators and give proofs of their consistency. Section 1.6 describes simulation exercises that lend further support to the theoretical analysis. Section 1.8 contains concluding remarks.
1.2 Prices
We fix a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, P)$ and recall the definition of an Itô semimartingale with Lévy jumps.

1.1 Definition An $\mathbb{R}$-valued process $X$ is an Itô semimartingale with Lévy jumps if it admits the representation:
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + J^l_t + J^s_t, \quad t \ge 0, \qquad (1.1)$$
with
$$J^l_t := x\mathbb{I}_{\{|x|>1\}} * \mu_t := \int_0^t\!\int_{\mathbb{R}} x\mathbb{I}_{\{|x|>1\}}\,\mu(dx, ds),$$
$$J^s_t := x\mathbb{I}_{\{|x|\le1\}} * (\mu - \nu)_t := \int_0^t\!\int_{\mathbb{R}} x\mathbb{I}_{\{|x|\le1\}}\,(\mu - \nu)(dx, ds),$$
$$\nu(dt, dx) = F(dx)\,dt,$$
where $W$ is a Brownian motion, $\sigma$ and $b$ are $\mathbb{R}$-valued progressively measurable processes, $\mu$ is the integer-valued measure induced by the jumps of $X$, $\nu$ is its Lévy system, and $F(dx)$ is a deterministic, constant-in-time, $\sigma$-finite measure on $\mathbb{R}$.
1.1 Remark Generally, Itô semimartingales are semimartingales whose characteristic triplet is absolutely continuous with respect to the Lebesgue measure. Here, we further restrict the Lévy system $\nu$ to be deterministic. This assumption ensures the jump measure $\mu$ is a Poisson random measure.
We assume prices are observed in the fixed time interval $[0, 1]$ at discrete, equidistant times $t_i = i\Delta_n$, $i = 0, 1, \dots, n$, where
$$\Delta_n = 1/n = t_{i+1} - t_i, \quad i = 0, \dots, n-1. \qquad (1.2)$$
Given the finite sample $X_{t_i}$, $i = 0, 1, 2, \dots, n$, our aim is to estimate the spot variance $\sigma^2$ on the time interval $[0, 1]$ by nonparametric methods. Note that our objective is not the approximation of a point but rather the approximation of an entire function. Thus an estimator of the spot variance may be viewed as a random element (function), as opposed to a random variable, that must converge in some sense to the spot variance, which is itself a random element. We approach this task by estimating the expansion of the spot variance in collections of Gabor frame elements.
1.3 Frames
Frames generalize the notion of orthonormal bases in Hilbert spaces. If $\{f_k\}_{k\in\mathbb{N}}$ is a frame for a separable Hilbert space $H$, then every vector $f \in H$ may be expressed as a linear combination of the frame elements, i.e.
$$f = \sum_{k\in\mathbb{N}} c_k f_k. \qquad (1.3)$$
This is similar to how elements of a Hilbert space may be expressed in terms of an orthonormal basis; but unlike an orthonormal basis, the representation in (1.3) need not be unique, and the frame elements need not be orthogonal. Loosely
speaking, frames contain redundant elements. The absence of uniqueness in
the frame representation is by no means a shortcoming; on the contrary, we
are afforded a great deal of flexibility and stability as a result. In fact, given
a finite data sample, the estimated basis expansion coefficients are likely to
be imprecise. This lack of precision can create significant distortions when
using an orthonormal basis. These distortions are somewhat mitigated when
using frames because of the built-in redundancy of frame elements.
Furthermore, if $\{f_k\}_{k\in\mathbb{N}}$ is a frame for $H$, then surjective, bounded transformations of $\{f_k\}_{k\in\mathbb{N}}$ also constitute frames for $H$; e.g. $\{f_k + f_{k+1}\}_{k\in\mathbb{N}}$ is a
frame. So, once we have a frame, we can generate an arbitrary number of
them very easily. We may then obtain estimates using each frame and com-
pare results. If our results using the different frames fall within a tight band,
then we are afforded some indication of the robustness of the computations.
Our discussion of frame theory will be rather brief; we only mention
concepts needed for our specification of the volatility estimator. For a more
detailed treatment we refer the reader to Christensen (2008). In the sequel, if $z$ is a complex number, then we shall denote by $\bar z$ and $|z|$ respectively the complex conjugate and the magnitude of $z$. Let $L^2(\mathbb{R})$ denote the space of complex-valued functions defined on the real line with finite norm given by
$$\|f\| := \Big(\int_{\mathbb{R}} f(t)\overline{f(t)}\,dt\Big)^{1/2} < \infty, \quad f \in L^2(\mathbb{R}).$$
Define the inner product of two elements $f$ and $g$ in $L^2(\mathbb{R})$ as $\langle f, g\rangle := \int_{\mathbb{R}} f(t)\overline{g(t)}\,dt$.
Denote by $\ell^2(\mathbb{N})$ the set of complex-valued sequences defined on the set of natural numbers $\mathbb{N}$ with finite norm given by
$$\|c\| := \Big(\sum_{k\in\mathbb{N}} c_k\overline{c_k}\Big)^{1/2} < \infty, \quad c \in \ell^2(\mathbb{N}),$$
where $c_k$ is the $k$-th component of $c$. The inner product of two sequences $c$ and $d$ in $\ell^2(\mathbb{N})$ is $\langle c, d\rangle := \sum_{k\in\mathbb{N}} c_k\overline{d_k}$. Now we may give a definition for frames:
1.2 Definition A sequence $\{f_k\}_{k\in\mathbb{N}} \subset L^2(\mathbb{R})$ is a frame if there exist positive constants $C_1$ and $C_2$ such that
$$C_1\|f\|^2 \le \sum_{k\in\mathbb{N}} |\langle f, f_k\rangle|^2 \le C_2\|f\|^2, \quad f \in L^2(\mathbb{R}).$$
The constants $C_1$ and $C_2$ are called frame bounds. If $C_1 = C_2$, then $\{f_k\}_{k\in\mathbb{N}}$ is said to be tight. Because an orthonormal basis satisfies Parseval's equality, it follows that an orthonormal basis is a tight frame with frame bounds identically equal to 1, i.e. $C_1 = C_2 = 1$. Now if $\{f_k\}$ is a frame, we may associate with it a bounded operator $A$ that maps every function $f$ in $L^2(\mathbb{R})$ to a sequence $c$ in $\ell^2(\mathbb{N})$ in the following way:
$$Af = c \quad \text{where} \quad c_k = \langle f, f_k\rangle, \ k \in \mathbb{N}. \qquad (1.4)$$
Because A takes a function defined on a continuum (R) to a sequence, which
is a function defined on the discrete set $\mathbb{N}$, $A$ is known as the analysis operator associated with the frame $\{f_k\}_{k\in\mathbb{N}}$. The boundedness of the analysis operator follows from the frame bounds in Definition 1.2. Now $A^*$, the adjoint of $A$, is well defined and takes sequences in $\ell^2(\mathbb{N})$ to functions in $L^2(\mathbb{R})$. Using the fact that $A^*$ must satisfy the equality $\langle Af, c\rangle = \langle f, A^*c\rangle$ for all $f \in L^2(\mathbb{R})$ and $c \in \ell^2(\mathbb{N})$, it may be deduced that
$$A^*c = \sum_{k\in\mathbb{N}} c_k f_k, \quad c \in \ell^2(\mathbb{N}),$$
where ck is the k-th component of the sequence c. The adjoint, A∗, may be
thought of as reversing the operation or effect of the analysis operator; for
this reason it is known as the synthesis operator.
Now an application of the operator $(A^*A)^{-1}$ to every frame element $f_k$ yields a sequence $\{\tilde f_k := (A^*A)^{-1}f_k\}_{k\in\mathbb{N}}$, which is yet another frame for $L^2(\mathbb{R})$. The frame $\{\tilde f_k\}_{k\in\mathbb{N}}$ is known as the canonical dual of $\{f_k\}_{k\in\mathbb{N}}$. Denoting the analysis operator associated with the canonical dual by $\tilde A$, it may be shown¹ that
$$\tilde A^*A = A^*\tilde A = I, \qquad (1.5)$$
where $I$ is the identity operator and $\tilde A^*$ is the adjoint of the analysis operator of the canonical dual. Furthermore, Proposition 3.2.3 of Daubechies (1992) shows that $\tilde A$ satisfies
$$\tilde A = A(A^*A)^{-1}, \qquad (1.6)$$
so that the analysis operator of the canonical dual frame is fully characterized by $A$ and its adjoint. It is easily seen that (1.5) yields a representation result, since if $f \in L^2(\mathbb{R})$ then
$$f = A^*\tilde Af = \tilde A^*Af = \sum_{k\in\mathbb{N}} \langle f, \tilde f_k\rangle f_k. \qquad (1.7)$$
Thus, in a manner reminiscent of orthonormal basis representations, every function in $L^2(\mathbb{R})$ is expressible as a linear combination of the frame elements, with the frame coefficients given by $\langle f, \tilde f_k\rangle$, the correlation between the function and the elements of the dual frame. It follows from the first equality in (1.5) and the commutativity of the duality relationship that functions in $L^2(\mathbb{R})$ may also be written as linear combinations of the elements of $\{\tilde f_k\}_{k\in\mathbb{N}}$, with coefficients given by $\langle f, f_k\rangle$, i.e. $f = \sum_{k\in\mathbb{N}} \langle f, f_k\rangle \tilde f_k$.

¹See for example Daubechies (1992, Proposition 3.2.3)
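A finite-dimensional sketch may make the operator algebra concrete. The three vectors below form a redundant frame for $\mathbb{R}^2$ (an assumed toy example, not from the text); the code forms the frame operator $A^*A$, computes the canonical dual, and checks the reconstruction formula (1.7) numerically, as well as the commuted version with the roles of frame and dual exchanged.

```python
import numpy as np

# Rows are the frame elements f_k: a redundant spanning set for R^2
F = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

S = F.T @ F                      # frame operator A* A (here a 2 x 2 matrix)
F_dual = F @ np.linalg.inv(S)    # rows are the canonical duals (A* A)^{-1} f_k

f = np.array([2.0, -3.0])        # arbitrary vector to reconstruct
coeffs = F_dual @ f              # frame coefficients <f, f~_k>
f_rec = coeffs @ F               # synthesis: sum_k <f, f~_k> f_k

# Duality commutes: analyzing against f_k and synthesizing with f~_k also works
f_rec2 = (F @ f) @ F_dual
```

Since the vectors are real, no conjugation is needed; in the complex $L^2(\mathbb{R})$ setting the inner products above would carry conjugates.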
1.3.1 Gabor frames
Next, we specialize the discussion to Gabor frames. The analysis of Gabor
frames involves two operators: the translation operator T and the modulation
operator $M$, defined as follows:
$$T_b f(t) := f(t-b), \quad b \in \mathbb{R},\ f \in L^2(\mathbb{R}), \qquad (1.8)$$
$$M_a f(t) := e^{2\pi i a t} f(t), \quad a \in \mathbb{R},\ f \in L^2(\mathbb{R}), \qquad (1.9)$$
where $i$ is the imaginary unit, i.e. $i = \sqrt{-1}$. Both $T$ and $M$ are shift operators: $T$ is a shift or translation operator on the time axis, whereas $M$ performs shifts on the frequency axis. A Gabor system is constructed by fixing $a, b \in \mathbb{R}$ and performing shifts of a single nontrivial function $g \in L^2(\mathbb{R})$ in time-frequency space. For example, if $a$ and $b$ are real numbers, then the sequence of functions
$$\{M_{ha}T_{kb}\,g\}_{h,k\in\mathbb{Z}}$$
constitutes a Gabor system.
1.3 Definition Let $g \in L^2(\mathbb{R})$, and let $a > 0$, $b > 0$ be positive real numbers. Define for $t \in \mathbb{R}$
$$g_{h,k}(t) := e^{2\pi i h a t} g(t - kb), \quad h, k \in \mathbb{Z}.$$
If the sequence $\{g_{h,k}\}_{h,k\in\mathbb{Z}}$ constitutes a frame for $L^2(\mathbb{R})$, then it is called a Gabor frame.²

²It is also sometimes referred to as a Weyl-Heisenberg frame.
The fixed function $g$ is known as the Gabor frame generator³; $a$ is known as the modulation parameter; and $b$ is known as the translation parameter. In order to obtain sharp asymptotic rates, we require $g$ and its dual $\tilde g$ (see (1.7)) to be continuous and compactly supported. The following result, (Christensen, 2006, Lemma 1.2) and (Zhang, 2008, Proposition 2.4), tells us how to construct such dual pairs.

1.1 Lemma Let $[r, s]$ be a finite interval, let $a > 0$, $b > 0$ be positive constants, and let $g$ be a continuous function. If $g(t) \ne 0$ when $t \in (r, s)$; $g(t) = 0$ when $t \notin (r, s)$; and $a$, $b$ satisfy $a < 1/(s-r)$, $0 < b < s-r$; then $(g, \tilde g)$ is a pair of dual Gabor frame generators, with the dual Gabor generator given by
$$\tilde g(t) := g(t)/G(t), \quad \text{where} \qquad (1.10)$$
$$G(t) := \sum_{k\in\mathbb{Z}} |g(t-kb)|^2 / a. \qquad (1.11)$$
Furthermore,
$$\tilde g_{h,k}(t) := e^{2\pi i h a t}\,\tilde g(t-kb), \quad h, k \in \mathbb{Z}, \qquad (1.12)$$
is compactly supported.
In the sequel, we assume the Gabor frame setup in Lemma 1.1.
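Under the conditions of Lemma 1.1, the dual generator is available in closed form, which makes the construction easy to sketch. The bump function and the parameter values $(r, s, a, b)$ below are illustrative choices satisfying the lemma's hypotheses, not specifications from the text.

```python
import numpy as np

r, s = 0.0, 1.0        # support interval of the generator
a, b = 0.9, 0.5        # a < 1/(s - r) and 0 < b < s - r, as Lemma 1.1 requires

def g(t):
    """Continuous generator, nonzero exactly on (r, s)."""
    t = np.asarray(t, dtype=float)
    return np.where((t > r) & (t < s), np.sin(np.pi * (t - r) / (s - r)) ** 2, 0.0)

def G(t):
    # G(t) = (1/a) sum_k |g(t - k b)|^2; only finitely many shifts are nonzero
    return sum(np.abs(g(t - k * b)) ** 2 for k in range(-4, 5)) / a

def g_dual(t):
    """Dual generator g~ = g / G of (1.10)-(1.11); shares the support of g."""
    t = np.asarray(t, dtype=float)
    Gt = G(t)
    return np.divide(g(t), Gt, out=np.zeros_like(t), where=Gt > 0)

def gabor(h, k, gen):
    """Time-frequency shift as in (1.12): t -> e^{2 pi i h a t} gen(t - k b)."""
    return lambda t: np.exp(2j * np.pi * h * a * np.asarray(t, dtype=float)) * gen(t - k * b)
```

Modulation leaves magnitudes untouched, so every $\tilde g_{h,k}$ inherits the compact support of $\tilde g$, which is the property the lemma is after.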
1.4 Volatility estimation: continuous prices
In this section we specify a consistent estimator of spot volatility within a
framework of continuous prices. That is, we simplify the general setup of
(1.1) to:
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s, \quad t \ge 0. \qquad (1.13)$$
We further restrict the processes b and σ as follows:
3It is referred to elsewhere as the window function.
1.1 Assumption
1. The drift b is progressively measurable, whereas the diffusion coefficient
σ is adapted and càdlàg.
2. There is a sequence of stopping times $\{T_m\}_{m\in\mathbb{N}}$ tending to infinity almost surely such that
$$E\Big(\sup_{0\le s\le T_m} |b_s - b_0|^4\Big) + E\Big(\sup_{0\le s\le T_m} |\sigma_s - \sigma_0|^4\Big) < \infty$$
for all $m$.
1.2 Remark These assumptions are satisfied by a wide range of practi-
cally relevant processes; these include continuous Lévy and additive processes
with càdlàg volatility coefficients. Also included are continuous solutions of
stochastic differential equations; indeed all processes with locally bounded b
and σ satisfy these requirements.
Let $(g, \tilde g)$ be a pair of dual Gabor frame generators constructed as in Lemma 1.1; then $\sigma^2$ admits a Gabor frame expansion given by:
$$\sigma^2(t) = \sum_{h,k\in\mathbb{Z}} c_{h,k}\, g_{h,k}(t), \quad \text{where} \qquad (1.14)$$
$$c_{h,k} = \langle \sigma^2, \tilde g_{h,k}\rangle. \qquad (1.15)$$
Note that both $\sigma^2$ and $g$ have compact support. Indeed, $\sigma^2$ has support in $[0, 1]$, whereas $g$ has support in $[r, s]$. So, $c_{h,k} \ne 0$ only if the supports of $\sigma^2$ and $\tilde g_{h,k}$ overlap. Furthermore, we note from (1.12) that $\tilde g_{h,k+1}$ is simply $\tilde g_{h,k}$ shifted by $b$ units; so, $c_{h,k} = 0$ if $|k| \ge K_0$ with
$$K_0 := \lceil (1 + |s| + |r|)/b \rceil, \qquad (1.16)$$
where $\lceil x\rceil$, $x \in \mathbb{R}$, is the least integer greater than or equal to $x$. Thus $\sigma^2$ admits a representation of the form:
$$\sigma^2(t) = \sum_{\substack{(h,k)\in\mathbb{Z}^2 \\ |k|\le K_0}} c_{h,k}\, g_{h,k}(t),$$
and for a sufficiently large positive integer $H$,
$$\sigma^2(t) \approx \sum_{\substack{|h|\le H \\ |k|\le K_0}} c_{h,k}\, g_{h,k}(t).$$
Now, suppose $n$ observations of the price process are available, and let
$$\Theta_n := \{(h,k) \in \mathbb{Z}^2 : |h| \le H_n \text{ and } |k| \le K_0\}, \qquad (1.17)$$
where $H_n$ is an increasing sequence in $n$. We propose the following estimator of the volatility coefficient:
$$v_n(X, t) := \sum_{(h,k)\in\Theta_n} \hat c_{h,k}\, g_{h,k}(t), \quad t \in [0,1], \quad \text{where} \qquad (1.18)$$
$$\hat c_{h,k} := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)\,(X_{t_{i+1}} - X_{t_i})^2. \qquad (1.19)$$
So $|\Theta_n|$ is the number of frame elements included in the expansion. Specifically, $|\Theta_n| = (2K_0 + 1)(2H_n + 1)$; and since $K_0$ is a finite quantity, it follows that $|\Theta_n| = O(H_n)$, i.e. the number of estimated coefficients is proportional to $H_n$ and therefore grows with the number of observations, $n$. We now show that the estimator converges to $\sigma^2$ on $[0, 1]$ in probability.
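A rough numerical sketch of (1.18)-(1.19) is possible under simplifying assumptions not made in the text: zero drift, a deterministic volatility path, and a particular bump generator with parameters chosen to satisfy Lemma 1.1. The estimated coefficients $\hat c_{h,k}$ can be compared against quadrature values of $c_{h,k} = \int_0^1 \tilde g_{h,k}\,\sigma^2$; the synthesis step then assembles $v_n$ on a display grid.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dual generator pair of Lemma 1.1 with (r, s) = (-0.25, 1.25),
# a = 0.6 < 1/(s - r), b = 0.7 < s - r
r, s, a, b = -0.25, 1.25, 0.6, 0.7

def g(t):
    t = np.asarray(t, dtype=float)
    return np.where((t > r) & (t < s), np.sin(np.pi * (t - r) / (s - r)) ** 2, 0.0)

def g_dual(t):
    t = np.asarray(t, dtype=float)
    G = sum(np.abs(g(t - k * b)) ** 2 for k in range(-4, 5)) / a
    return np.divide(g(t), G, out=np.zeros_like(t), where=G > 0)

def frame_fn(h, k, gen, t):
    # g_{h,k}(t) = e^{2 pi i h a t} gen(t - k b)
    return np.exp(2j * np.pi * h * a * t) * gen(t - k * b)

# Simulated continuous price on [0, 1]: dX = sigma(t) dW, deterministic sigma
n = 20_000
ti = np.arange(n + 1) / n
sigma2 = 0.5 + 0.3 * np.cos(2 * np.pi * ti)
dX = np.sqrt(sigma2[:-1] / n) * rng.standard_normal(n)

def c_hat(h, k):
    # \hat c_{h,k} = sum_i g~_{h,k}(t_i) (X_{t_{i+1}} - X_{t_i})^2, as in (1.19)
    return np.sum(frame_fn(h, k, g_dual, ti[:-1]) * dX ** 2)

def c_true(h, k):
    # c_{h,k} = int_0^1 g~_{h,k}(s) sigma^2(s) ds, by a left Riemann sum
    return np.sum(frame_fn(h, k, g_dual, ti[:-1]) * sigma2[:-1]) / n

# Synthesis over the truncated index set Theta_n, as in (1.18)
H = 6
K0 = int(np.ceil((1 + abs(s) + abs(r)) / b))
t_grid = np.linspace(0.0, 1.0, 400)
v_n = sum(c_hat(h, k) * frame_fn(h, k, g, t_grid)
          for h in range(-H, H + 1) for k in range(-K0, K0 + 1)).real
```

The coefficient-level agreement is the essential point; how faithfully the truncated synthesis tracks $\sigma^2$ near the interval endpoints depends on the truncation level $H$.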
1.1 Proposition Suppose the price process is specified as in (1.13) and satisfies the conditions of Assumption 1.1. Let $(g, \tilde g)$ be a pair of dual Gabor generators satisfying the conditions of Lemma 1.1 with $g$ Lipschitz continuous on the unit interval. If $H_n \uparrow \infty$ satisfies
$$(H_n)^2 \Delta_n^{1/2} = o(1),$$
then $v_n(X, t)$, defined in (1.18), converges in $L^2[0,1]$ to $\sigma^2$ in probability.
Proof. We begin by noting that
$$v_n(X,t) - \sigma^2(t) = \sum_{(h,k)\in\Theta_n} (\hat c_{h,k} - c_{h,k})\, g_{h,k}(t) - \sum_{(h,k)\notin\Theta_n} c_{h,k}\, g_{h,k}(t), \qquad (1.20)$$
where
$$\hat c_{h,k} = \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(X_{t_{i+1}} - X_{t_i})^2 \quad \text{and} \quad c_{h,k} = \int_0^1 \tilde g_{h,k}(s)\,\sigma^2(s)\,ds.$$
We tackle the summands in (1.20) in turn, starting with the first one. But first let
$$M_i := \int_{t_i}^{t_{i+1}} b_s\,ds \quad \text{and} \quad S_i := \int_{t_i}^{t_{i+1}} \sigma_s\,dW_s,$$
and note that since $X_{t_{i+1}} - X_{t_i} = M_i + S_i$, it follows that
$$(X_{t_{i+1}} - X_{t_i})^2 = M_i^2 + 2M_iS_i + S_i^2.$$
So, (1.20) may be written as
$$v_n(X,t) - \sigma^2(t) = B_{1,n}(t) + B_{2,n}(t) + B_{3,n}(t) + B_{4,n}(t),$$
where
$$B_{1,n}(t) := \sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)S_i^2 - c_{h,k}\Big),$$
$$B_{2,n}(t) := 2\sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)S_iM_i\Big),$$
$$B_{3,n}(t) := \sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)M_i^2\Big),$$
$$B_{4,n}(t) := -\sum_{(h,k)\notin\Theta_n} g_{h,k}(t)\,c_{h,k}. \qquad (1.21)$$
We start by recalling the well-known fact that frame expansions converge unconditionally in $L^2[0,1]$; that is, the expansion converges regardless of the order of summation (Christensen, 2008, Theorem 5.1.7). Hence,
$$\|B_{4,n}\|_{L^2[0,1]} = o_{a.s.}(1).$$
We now obtain an estimate for $B_{3,n}(t)$. Suppose without loss of generality that $b_0 = \sigma_0 = 0$ and let $\{T_m\}_{m\in\mathbb{N}}$ be a localizing sequence for $b$ and $\sigma$. Then, by Jensen's inequality,
$$E\Big(\int_{t_i}^{t_{i+1}} b_{s\wedge T_m}\,ds\Big)^2 \le \Delta_n\,E\Big(\int_{t_i}^{t_{i+1}} b^2_{s\wedge T_m}\,ds\Big) \le \Delta_n \int_{t_i}^{t_{i+1}} E(b^2_{s\wedge T_m})\,ds \le \Delta_n \int_{t_i}^{t_{i+1}} E\Big(\sup_{u\le T_m} b_u^4\Big)^{1/2}\,ds \le c\Delta_n^2, \qquad (1.22)$$
where the change in the order of integration is justified by Fubini's theorem, and $c$ denotes a generic constant. In the sequel, in expressions containing more than one inequality, $c$ will denote the maximum or minimum, as the case may be, of the constants appearing in each inequality. Set $M_i^m := \int_{t_i}^{t_{i+1}} b_{s\wedge T_m}\,ds$ and
$$B^m_{3,n}(t) := \sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(M_i^m)^2\Big),$$
and note that, given $\eta > 0$,
$$P\Big(\sup_{t\in[0,1]} |B_{3,n}(t)| > \eta\Big) \le P(T_m \le 1) + P\Big(\sup_{t\in[0,1]} |B^m_{3,n}(t)| > \eta\Big)$$
for any $m \in \mathbb{N}$. Since $T_m \uparrow \infty$ a.s., the first term on the right becomes arbitrarily small as $m$ tends to infinity. Now since $g_{h,k}$ and $\tilde g_{h,k}$ are bounded independently of $h$ and $k$, and $n\Delta_n = 1$, it follows by Markov's inequality and (1.22) that
$$P\Big(\sup_{t\in[0,1]} |B^m_{3,n}(t)| > \eta\Big) \le cH_n\Delta_n.$$
Hence,
$$\sup_{t\in[0,1]} |B_{3,n}(t)| = o_P(1). \qquad (1.23)$$
We now tackle $B_{2,n}(t)$. To that end, denote $S_i^m := \int_{t_i}^{t_{i+1}} \sigma_{s\wedge T_m}\,dW_s$ and note that
$$E((S_i^m)^2) = E\Big(\int_{t_i}^{t_{i+1}} \sigma^2_{s\wedge T_m}\,ds\Big) = \int_{t_i}^{t_{i+1}} E(\sigma^2_{s\wedge T_m})\,ds \le \int_{t_i}^{t_{i+1}} E\Big(\sup_{u\le T_m} \sigma_u^4\Big)^{1/2}\,ds \le c\Delta_n. \qquad (1.24)$$
By Hölder's inequality, (1.22), and (1.24), we have
$$E(M_i^m S_i^m) \le \big(E((M_i^m)^2)\,E((S_i^m)^2)\big)^{1/2} \le c\Delta_n^{3/2}. \qquad (1.25)$$
Next, set
$$B^m_{2,n}(t) := 2\sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)S_i^mM_i^m\Big).$$
Then for each $m$, because $g_{h,k}$ and $\tilde g_{h,k}$ are bounded independently of $h$ and $k$, and $n\Delta_n = 1$, we conclude by an appeal to Markov's inequality that $P(\sup_{t\in[0,1]} |B^m_{2,n}(t)| > \eta) \le cH_n\Delta_n^{1/2}$. By the previously used localization argument,
$$\sup_{t\in[0,1]} |B_{2,n}(t)| = o_P(1). \qquad (1.26)$$
Now we tackle the final piece, $B_{1,n}(t)$. Let
$$A^n := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)S_i^2 - \int_0^1 \sigma^2(s)\,\tilde g_{h,k}(s)\,ds. \qquad (1.27)$$
We will first obtain an upper bound for $A^n$; we proceed by adding and subtracting $\sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} \tilde g_{h,k}(t_i)\sigma^2(s)\,ds$ from $A^n$ to yield:
$$A^n = \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)\Big(S_i^2 - \int_{t_i}^{t_{i+1}} \sigma^2(s)\,ds\Big) + \sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} \sigma^2(s)\big(\tilde g_{h,k}(t_i) - \tilde g_{h,k}(s)\big)\,ds =: A_1^n + A_2^n.$$
We obtain estimates in turn for the summands. By Assumption 1.1, $\sigma$ is càdlàg, so that it is almost surely bounded on $[0, 1]$; by the continuity of $\tilde g_{h,k}$ and Lemma 1.3, we have
$$A_2^n = \sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} \sigma^2(s)\big(\tilde g_{h,k}(t_i) - \tilde g_{h,k}(s)\big)\,ds \le c\,\omega(\tilde g_{h,k}, \Delta_n), \quad a.s.,$$
where $\omega(\tilde g_{h,k}, \Delta_n)$ is the modulus of continuity of $\tilde g_{h,k}$ on an interval of length $\Delta_n$. By the Lipschitz continuity of $g$ we have
$$A_2^n = O_{a.s.}(\omega(g, \Delta_n)) = O_{a.s.}(\Delta_n).$$
Now, we obtain an estimate for $A_1^n$. First, let $D_i^n : \Omega\times[0,1] \to \mathbb{R}$ for $i = 0, \dots, n-1$ be defined as follows:
$$D_i^n(t) := \tilde g_{h,k}(t_i)\Big(\int_{t_i}^{t} \sigma_{u\wedge T_m}\,dW_u\Big)\mathbb{I}_{(t_i,t_{i+1}]}(t), \qquad (1.28)$$
$$D_0^n(0) := 0. \qquad (1.29)$$
So, $D_i^n(t)$ is 0 on $[0, 1]$ except when $t$ is in $(t_i, t_{i+1}]$. Moreover,
$$D_i^n(t)D_j^n(t) = 0, \quad i \ne j,$$
for $t$ in $[0, 1]$. Now, for each $i$, $0 \le i < n$, if $t \in (t_i, t_{i+1}]$, we have
$$E(D_i^n(t)^4) = \tilde g_{h,k}(t_i)^4\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\,E\Big(\int_{t_i}^{t} \sigma_{u\wedge T_m}\,dW_u\Big)^4$$
$$\le c\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\,E\Big(\int_{t_i}^{t} \sigma^2_{u\wedge T_m}\,du\Big)^2 \quad \text{(B.D.G.)}$$
$$\le c\,(t - t_i)\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\,E\Big(\int_{t_i}^{t} \sigma^4_{u\wedge T_m}\,du\Big) \quad \text{(Jensen)}$$
$$\le c\,\Delta_n\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\int_{t_i}^{t_{i+1}} E\big(\sigma^4_{u\wedge T_m}\big)\,du \quad \text{(Fubini)}$$
$$\le c\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\,\Delta_n^2, \qquad (1.30)$$
where the application of Fubini's theorem (Halmos, 1950, Theorem VII.36.B) is justified by the fact that $\sigma^4$ is non-negative and measurable with respect to the product $\sigma$-algebra on $[0,1]\times\Omega$. Now, using Itô's integration by parts formula, we may write
$$E((A_1^n)^2) = E\Big(2\sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} \tilde g_{h,k}(t_i)\Big(\int_{t_i}^{s} \sigma_{u\wedge T_m}\,dW_u\Big)\sigma_{s\wedge T_m}\,dW_s\Big)^2 = 4E\Big(\int_0^1 \sum_{i=0}^{n-1} D_i^n(s)\,\sigma_{s\wedge T_m}\,dW_s\Big)^2$$
$$\le c\int_0^1 \sum_{i=0}^{n-1} E\big(D_i^n(s)^2\sigma^2_{s\wedge T_m}\big)\,ds \le c\int_0^1 \sum_{i=0}^{n-1} \big(E(D_i^n(s)^4)\big)^{1/2}\big(E(\sigma^4_{s\wedge T_m})\big)^{1/2}\,ds \le c\int_0^1 \sum_{i=0}^{n-1} \mathbb{I}_{(t_i,t_{i+1}]}(s)\,\Delta_n\,ds \le c\Delta_n.$$
By Chebyshev's inequality and the previously used stopping time argument, we have $A^n = O_P(\Delta_n)$. By the boundedness of $g_{h,k}$, we have
$$\sup_{t\in[0,1]} |B_{1,n}(t)| = o_P(1).$$
Hence, $B_{j,n}(t)$ for $j = 1, \dots, 4$ tends to zero in $L^2[0,1]$ in probability.
1.5 Volatility estimation: discontinuous prices
In this section we specify a global spot volatility estimator for possibly discontinuous Itô semimartingale price processes. That is, for $t \ge 0$,
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + x\mathbb{I}_{\{|x|>1\}} * \mu_t + x\mathbb{I}_{\{|x|\le1\}} * (\mu-\nu)_t$$
with $\nu(dt,dx) = F(dx)\,dt$ for a deterministic and constant-in-time $\sigma$-finite measure $F$. We assume $\sigma$ and $b$ satisfy the requirements of Assumption 1.1, and we further restrict the Lévy system of $X$ as follows:
1.2 Assumption The Lévy measure $F$ satisfies the following condition:
$$(x^2\mathbb{I}_{\{|x|\le u\}}) * \nu_t = \int_0^t\!\int_{-u}^{u} x^2\,F(dx)\,ds = O(u) \quad \text{as } u \to 0.$$

1.3 Remark The requirement is satisfied if $F$ is absolutely continuous with bounded density $f$, as is the case with the Gaussian distribution; more generally, it is satisfied if $f(x) = O(x^{-2})$ as $x \to 0$. Examples include the Lévy$(\gamma, \delta)$ distribution with density
$$f(x) = (\gamma/2\pi)^{1/2}(x-\delta)^{-3/2}\exp\big(-\gamma/(2(x-\delta))\big), \quad x > \delta,$$
and the Cauchy$(\gamma, \delta)$ distribution with density
$$f(x) = (\gamma/\pi)\big(\gamma^2 + (x-\delta)^2\big)^{-1}, \quad x \in \mathbb{R}.$$
We also remark that for general semimartingales $(x^2\wedge1)*\nu_t$ is increasing and locally integrable. By the Lévy assumption, we simply have that $(x^2\wedge1)*\nu$ is finite. In addition, it is a consequence of the Lévy assumption that the price process has no fixed time of discontinuity (Jacod & Shiryaev, 2003, II.4.3). Hence, by Itô's integration by parts formula,
$$E\big((x^2\wedge1)*\mu_t\big) = t\,(x^2\wedge1)*\nu = O(t), \quad t \ge 0. \qquad (1.31)$$
As in the preceding section, we observe a realization of the price process at $n+1$ equidistant points $t_i$, $i = 0, 1, \dots, n$. The observation interval is normalized to $[0, 1]$ with no loss of generality. The estimator proposed in the previous section, where there is no jump activity, will not do here. It is inconsistent on account of the presence of jumps; its quality deteriorates as a function of how active the jumps of $X$ are. We will counter this phenomenon with a modified spot variance estimator, but first we introduce the following notation. Let $\Delta_iX$ denote $X_{t_{i+1}} - X_{t_i}$ for $i = 0, 1, \dots, n-1$, and let $u_n$ be a positive decreasing sequence such that
$$u_n = O(\Delta_n^\beta), \quad \text{where } 0 < \beta < 1. \qquad (1.32)$$
We specify the jump-robust global estimator of spot volatility as follows:
$$V_n(X, t) := \sum_{(h,k)\in\Theta_n} \hat a_{h,k}\, g_{h,k}(t), \quad t \in [0,1], \quad \text{where} \qquad (1.33)$$
$$\hat a_{h,k} := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)\,(\Delta_iX)^2\,\mathbb{I}_{\{(\Delta_iX)^2\le u_n\}}, \qquad (1.34)$$
where $(g_{h,k}, \tilde g_{h,k})$ is a pair of dual Gabor frames constructed as in Lemma 1.1; $\Theta_n$ retains its meaning from (1.17); and $\mathbb{I}_{\{(\Delta_iX)^2\le u_n\}}$ is one if $(\Delta_iX)^2$ is less than or equal to $u_n$ and zero otherwise.
There are obvious similarities between $v_n(X,t)$, defined at (1.18), and $V_n(X,t)$, the key difference being that $V_n(X,t)$ discards realized squared increments over intervals that likely contain jumps; $u_n$ determines the threshold for what is included in the computation and what is not. This determination becomes more accurate as the observation interval becomes infinitesimally small. Clearly it makes sense to use $v_n(X,t)$ if we have reason to believe that the price process is not subject to jumps; $v_n(X,t)$ will always employ all available data and therefore may be expected to produce more accurate results.
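The effect of the threshold in (1.34) can be sketched in isolation, at the level of realized squared increments. The dynamics below (constant $\sigma$ plus compound Poisson jumps) and the constants in $u_n$ are illustrative assumptions; the truncation discards the large squared increments produced by jumps while keeping essentially all diffusive ones, so the truncated sum approximates the integrated variance $\sigma^2 \cdot 1 = 0.09$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed dynamics: constant sigma plus a compound Poisson jump component
n = 50_000
dt = 1.0 / n
sigma = 0.3
beta = 0.75
u_n = dt ** beta                  # threshold sequence u_n = O(Delta_n^beta), 0 < beta < 1

dX = sigma * np.sqrt(dt) * rng.standard_normal(n)

# Superimpose a Poisson number of jumps with N(0, 0.5^2) sizes
n_jumps = rng.poisson(5)
jump_idx = rng.choice(n, size=n_jumps, replace=False)
dX[jump_idx] += 0.5 * rng.standard_normal(n_jumps)

keep = dX ** 2 <= u_n             # the indicator in (1.34)
rv_plain = np.sum(dX ** 2)        # contaminated by the squared jump sizes
rv_trunc = np.sum(dX[keep] ** 2)  # jump increments discarded
```

With these numbers the diffusive increments have standard deviation $\sigma\sqrt{\Delta_n} \approx 0.0013$, far below the cutoff $\sqrt{u_n} \approx 0.017$, so essentially no diffusive increment is discarded, while a jump of typical size 0.5 lands far above it.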
We now proceed to prove the consistency of the estimator. First we introduce the following notation and prove an intermediate lemma. Let
$$X^c_t := X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s,$$
$$J^l_t := x\mathbb{I}_{\{|x|>1\}} * \mu_t,$$
$$X^f_t := X^c_t + J^l_t. \qquad (1.35)$$
1.2 Lemma Let $X^f$ be specified as in (1.35) with $\sigma$ and $b$ satisfying Assumption 1.1. Let $(g, \tilde g)$ denote a pair of dual Gabor generators satisfying the conditions of Lemma 1.1 with $g$ Lipschitz continuous on the unit interval. Let $H_n$ be an increasing sequence and $u_n$ a decreasing sequence satisfying $u_n = O(\Delta_n^\beta)$ with $0 < \beta < 1$. If
$$u_n^{-1/2}(H_n)^2\Delta_n^{1/2} = o(1),$$
then $V_n(X^f, t)$ as defined in (1.33) converges in $L^2[0,1]$ in probability to $\sigma^2$.
Proof. We have
$$V_n(X^f,t) - \sigma^2(t) = \big(V_n(X^f,t) - V_n(X^c,t)\big) + \big(V_n(X^c,t) - v_n(X^c,t)\big) + \big(v_n(X^c,t) - \sigma^2(t)\big). \qquad (1.36)$$
That the third summand on the right converges to zero in $L^2[0,1]$ in probability is the content of Proposition 1.1. Set $\hat b_{h,k} := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(\Delta_iX^c)^2\mathbb{I}_{\{(\Delta_iX^c)^2\le u_n\}}$ and $\hat d_{h,k} := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(\Delta_iX^c)^2$. Now note that $V_n(X^c,t) - v_n(X^c,t) = \sum_{(h,k)\in\Theta_n} (\hat b_{h,k} - \hat d_{h,k})\,g_{h,k}(t)$ with
$$\hat b_{h,k} - \hat d_{h,k} = \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)\big((\Delta_iX^c)^2\mathbb{I}_{\{(\Delta_iX^c)^2\le u_n\}} - (\Delta_iX^c)^2\big) = -\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(\Delta_iX^c)^2\mathbb{I}_{\{(\Delta_iX^c)^2> u_n\}}.$$
Without loss of generality, suppose $b_0 = \sigma_0 = 0$; let $T_m$ be a localizing sequence for $b$ and $\sigma$. Set $\Delta_i S_m := \int_{t_i}^{t_{i+1}} \sigma_{s\wedge T_m}\, dW_s$, $\Delta_i M_m := \int_{t_i}^{t_{i+1}} b_{s\wedge T_m}\, ds$, and $\Delta_i X^c_m := \Delta_i M_m + \Delta_i S_m$. Define $b^m_{h,k} - d^m_{h,k}$ as above by substituting $\Delta_i X^c_m$ for $\Delta_i X^c$. Now note the following:
$$E(|b^m_{h,k} - d^m_{h,k}|) \le c\, n\, E\big((\Delta_i X^c_m)^2 I_{\{(\Delta_i X^c_m)^2 > u_n\}}\big) \le c\, n\, \big(E((\Delta_i X^c_m)^4)\big)^{1/2} \big(P((\Delta_i X^c_m)^2 > u_n)\big)^{1/2} \le c\, n\, u_n^{-1/2} \big(E((\Delta_i X^c_m)^4)\big)^{1/2} \big(E((\Delta_i X^c_m)^2)\big)^{1/2}.$$
Arguing as in Proposition 1.1, it is easily verified that $E((\Delta_i X^c_m)^4) \le c(\Delta_n^4 + \Delta_n^3 + \Delta_n^2)$ and $E((\Delta_i X^c_m)^2) \le c(\Delta_n^2 + \Delta_n^{3/2} + \Delta_n)$. Hence, $E(|b^m_{h,k} - d^m_{h,k}|) \le c\, n\, u_n^{-1/2} \Delta_n^{3/2} = c\, u_n^{-1/2} \Delta_n^{1/2}$. Because $\tilde g_{h,k}$ is bounded, this allows us to conclude by way of Markov's inequality that, given $\eta > 0$,
$$P\Big(\sup_{t\in[0,1]} |V_n(X^c,t) - v_n(X^c,t)| > \eta\Big) \le P(T_m \le 1) + c\, u_n^{-1/2} H_n \Delta_n^{1/2},$$
which becomes arbitrarily small as $m$ and $n$ tend to infinity simultaneously.
To obtain an estimate for the first summand in (1.36), denote $e_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i X^f)^2 I_{\{(\Delta_i X^f)^2 \le u_n\}}$ and observe that $V_n(X^f,t) - V_n(X^c,t) = \sum_{(h,k)\in\Theta_n}(e_{h,k} - b_{h,k})\,\tilde g_{h,k}(t)$ with
$$e_{h,k} - b_{h,k} = \sum_{i=0}^{n-1} g_{h,k}(t_i)\big[(\Delta_i X^f)^2 I_{\{(\Delta_i X^f)^2 \le u_n\}} - (\Delta_i X^c)^2 I_{\{(\Delta_i X^c)^2 \le u_n\}}\big].$$
By definition, $X^f = X^c + J^l$, where $J^l$ represents the jumps of $X$ in excess of 1. We may write $(\Delta_i X^f)^2 I_{\{(\Delta_i X^f)^2 \le u_n\}} - (\Delta_i X^c)^2 I_{\{(\Delta_i X^c)^2 \le u_n\}} = \gamma^1_i + 2\gamma^2_i + \gamma^3_i$ with
$$\gamma^1_i := (\Delta_i X^c)^2\big(I_{\{(\Delta_i X^f)^2 \le u_n\}} - I_{\{(\Delta_i X^c)^2 \le u_n\}}\big),$$
$$\gamma^2_i := (\Delta_i X^c\, \Delta_i J^l)\, I_{\{(\Delta_i X^f)^2 \le u_n\}},$$
$$\gamma^3_i := (\Delta_i J^l)^2\, I_{\{(\Delta_i X^f)^2 \le u_n\}}. \qquad (1.37)$$
Because $X$ is càdlàg, there are at most finitely many jumps in excess of 1 per outcome in $[0,1]$. For sufficiently large $n$, each interval $(t_i, t_{i+1}]$ contains at most one jump. If the $i$-th interval does not contain a jump, then $\gamma^2_i = \gamma^3_i = 0$ because $\Delta_i J^l = 0$. If the $i$-th interval contains a jump, we have
$$|\Delta_i X^f| = |\Delta_i J^l + \Delta_i X^c| \ge 1 - |\Delta_i X^c|. \qquad (1.38)$$
Now observe that because $X^c$ has continuous paths, it is uniformly continuous on the compact domain $[0,1]$, so that as $n$ tends to infinity, $1 - \sup_{i<n}|\Delta_i X^c| \uparrow 1$; meanwhile, $u_n^{1/2} \downarrow 0$. Hence, for $n$ large enough, we have $|\Delta_i X^f| > u_n^{1/2}$, so that, almost surely, $\gamma^2_i$ and $\gamma^3_i$ are eventually zero, uniformly in $i$.
To pin down $\gamma^1_i$, we introduce the following events:
$$\Omega^1_n := \{\omega : \mu(\omega, (t_i, t_{i+1}] \times \{|x| > 1\}) \le 1, \text{ for all } i < n\}, \quad n \in \mathbb{N},$$
$$\Omega^2_n := \{\omega : |\Delta_i X^c(\omega)| < 1 - u_n^{1/2}, \text{ for all } i < n\}, \quad n \in \mathbb{N},$$
$$\Omega_k := \{\omega : \mu(\omega, [0,1] \times \{|x| > 1\}) \le k\}, \quad k \in \mathbb{N}.$$
Set $\Omega_n := \Omega^1_n \cap \Omega^2_n$. As previously argued (see (1.38)), $P(\Omega^2_n) \to 1$ as $n \to \infty$. Because $X$ is càdlàg, $\mu([0,1] \times \{|x| > 1\})$ is almost surely finite, so that $P(\Omega^1_n) \to 1$ as $n \to \infty$. Hence, $P(\Omega_n) \to 1$ as $n \to \infty$. It is also the case that $P(\Omega_k) \to 1$ as $k \to \infty$, since $X$ is càdlàg and the number of jumps larger than one in any bounded interval must be finite almost surely. Now, recall that $T_m$ is a localizing sequence for $b$ and $\sigma$; set $\Omega(m,n,k) := \Omega_n \cap \Omega_k \cap \{T_m > 1\}$ and note that $P(\Omega(m,n,k)) \to 1$ as $n, m, k \to \infty$. Thus, on $\Omega(m,n,k)$ there are at most $k$ jumps larger than one, with no more than one jump per interval; the increments of $X^c$ are small enough to ensure the increments of $X^f$ exceed $u_n^{1/2}$; and the processes $\sigma^4$ and $b^4$ are integrable.
Set $\gamma^1_i(n,m,k) := \gamma^1_i\, I_{\Omega(m,n,k)}$ and denote $G_i := \{|\Delta_i J^l| > 0\}$. By the triangle inequality, $E(|\gamma^1_i(n,m,k)|) \le E(|\gamma^1_i(n,m,k)|\, I_{G_i}) + E(|\gamma^1_i(n,m,k)|\, I_{G^c_i})$. Clearly, $\gamma^1_i(n,m,k) = 0$ on $G^c_i$, so that
$$\sum_{i=0}^{n-1} g_{h,k}(t_i)\, E(|\gamma^1_i(n,m,k)|) \le \sum_{i=0}^{n-1} g_{h,k}(t_i)\, E\big(|\gamma^1_i(n,m,k)|\, I_{G_i}\big) = \sum_{i=1}^{k} g_{h,k}(t_i)\, E\big((\Delta_i X^c_m)^2 I_{\{(\Delta_i X^c_m)^2 \le u_n\}}\, I_{G_i}\big) \le \sum_{i=1}^{k} g_{h,k}(t_i)\, E\big((\Delta_i X^c_m)^2\big) \le c\, k\, \Delta_n.$$
Hence, given $\eta > 0$,
$$P\Big(\sup_{t\in[0,1]} |V_n(X^f,t) - V_n(X^c,t)| > \eta\Big) \le P(\Omega(m,n,k)^c) + c\, H_n\, k\, \Delta_n.$$
By taking $m, n, k$ large enough, the first term can be made as small as required; for fixed $m, k$, letting $n \to \infty$ makes the second term as small as desired. This completes the proof.
We now prove consistency of the estimator when the price process admits both large and small jumps; that is,
$$X_t = X_0 + X^c_t + J^l_t + J^s_t,$$
where $J^l_t := (x\, I_{\{|x|>1\}}) * \mu_t$ and $J^s_t := (x\, I_{\{|x|\le 1\}}) * (\mu - \nu)_t$. We now give the main result of the paper.

1.2 Proposition Let the price process $X$ be specified as in (1.1). We assume that the requirements of Assumptions 1.1 and 1.2 are met. Let $g, \tilde g$ be a pair of dual Gabor generators satisfying the conditions of Lemma 1.1 with $g$ Lipschitz continuous on the unit interval. Let $H_n$ be an increasing sequence and $u_n$ a decreasing sequence satisfying $u_n = O(\Delta_n^\beta)$ with $0 < \beta < 1$. If
$$u_n^{-1/2} (H_n)^2 \Delta_n^{1/2} = o(1), \qquad (H_n)^2\, u_n^{1/2} = o(1), \qquad (1.39)$$
then $V_n(X,t)$ defined in (1.33) converges in $L^2[0,1]$ in probability to $\sigma^2$.
Proof. We argue along the lines of Theorem 4 of Mancini (2009). First, consider the following decomposition of the process $X$:
$$X = X^f + J^s, \qquad (1.40)$$
$$X^f = X^c + J^l, \qquad (1.41)$$
where $X^c_t = \int_0^t b_s\, ds + \int_0^t \sigma_s\, dW_s$, $J^l_t = (x\, I_{\{|x|>1\}}) * \mu_t$, and $J^s_t = (x\, I_{\{|x|\le 1\}}) * (\mu - \nu)_t$. By localization, it is enough to assume $\sigma^4$ and $b^4$ are integrable. Let $t$ be a point in the unit interval; then
$$V_n(X,t) - \sigma^2_t = \sum_{(h,k)\in\Theta_n} (a_{h,k} - c_{h,k})\,\tilde g_{h,k}(t) - \sum_{(h,k)\notin\Theta_n} c_{h,k}\,\tilde g_{h,k}(t), \qquad (1.42)$$
with $a_{h,k}$ and $c_{h,k}$ defined by (1.33) and (1.15), respectively. The last term tends to zero, almost surely, in $L^2[0,1]$ as $n \to \infty$ because Gabor frame expansions converge unconditionally.
To obtain a bound on the first term on the right of (1.42), we may use (1.40) to write
$$\sum_{(h,k)\in\Theta_n} (a_{h,k} - c_{h,k})\,\tilde g_{h,k}(t) = \sum_{(h,k)\in\Theta_n} (w_{h,k} + x_{h,k} + y_{h,k} + z_{h,k})\,\tilde g_{h,k}(t), \qquad (1.43)$$
where
$$w_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i X^f)^2 I_{\{(\Delta_i X^f)^2 \le 4u_n\}} - \int_0^1 \sigma^2(s)\, g_{h,k}(s)\, ds,$$
$$x_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i X^f)^2 \big(I_{\{(\Delta_i X)^2 \le u_n\}} - I_{\{(\Delta_i X^f)^2 \le 4u_n\}}\big),$$
$$y_{h,k} := 2\sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i X^f\, \Delta_i J^s\, I_{\{(\Delta_i X)^2 \le u_n\}},$$
$$z_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i J^s)^2 I_{\{(\Delta_i X)^2 \le u_n\}}. \qquad (1.44)$$
By Lemma 1.2, if $\delta > 0$ then $P(\sup_{t\in[0,1]} |\sum_{(h,k)\in\Theta_n} w_{h,k}\,\tilde g_{h,k}(t)| > \delta) \to 0$ as $n$ tends to infinity. It remains to show that the last three terms on the right of (1.43) converge to zero in probability. Starting with the second summand, denote $A_i := \{(\Delta_i X)^2 \le u_n\}$, $B_i := \{(\Delta_i X^f)^2 \le 4u_n\}$ and note that $I_{A_i} - I_{B_i} = I_{A_i \cap B^c_i} - I_{A^c_i \cap B_i}$. Hence, we may write
$$\sum_{(h,k)\in\Theta_n} x_{h,k}\,\tilde g_{h,k}(t) = \sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)(x_{i,1} - x_{i,2})\,\tilde g_{h,k}(t),$$
where $x_{i,1} := (\Delta_i X^f)^2 I_{A_i \cap B^c_i}$ and $x_{i,2} := (\Delta_i X^f)^2 I_{A^c_i \cap B_i}$. It is now easily verified using the reverse triangle inequality that $A_i \cap B^c_i \subset \{|\Delta_i J^s| > u_n^{1/2}\}$.
So that
$$(\Delta_i X^f)^2 I_{A_i \cap B^c_i} \le (\Delta_i X^f)^2 I_{\{(\Delta_i J^s)^2 > u_n\}} \qquad (1.45)$$
$$\le 2(\Delta_i X^c)^2 I_{\{(\Delta_i J^s)^2 > u_n\}} + 2(\Delta_i J^l)^2 I_{\{(\Delta_i J^s)^2 > u_n\}} =: v_i + w_i. \qquad (1.46)$$
It thus follows that
$$\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, x_{i,1}\, \tilde g_{h,k}(t) \le \sum_{(h,k)\in\Theta_n} \Big(\sum_{i=0}^{n-1} g_{h,k}(t_i)(v_i + w_i)\Big)\tilde g_{h,k}(t).$$
We proceed by using Hölder's inequality and (1.31) to write
$$E(v_i) \le c\,\big(E((\Delta_i X^c)^4)\big)^{1/2}\, P\big((\Delta_i J^s)^2 > u_n\big)^{1/2} \le c\, u_n^{-1/2}\, \big(E((\Delta_i X^c)^4)\big)^{1/2}\, \big(E((\Delta_i J^s)^2)\big)^{1/2} \le c\, u_n^{-1/2} \Delta_n^{3/2}. \qquad (1.47)$$
Hence, by Markov's inequality and the boundedness of $\tilde g_{h,k}$,
$$\sup_{t\in[0,1]} \sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, v_i\, \tilde g_{h,k}(t) = O_P\big(u_n^{-1/2} H_n \Delta_n^{1/2}\big), \qquad (1.48)$$
which by assumption tends to zero in probability.
As for the term involving $w_i$, recall that because $\mu$ is a Poisson random measure, if $A$ and $B$ are disjoint measurable sets in $\mathbb{R}_+ \times \mathbb{R}$, then $\mu(A)$ is independent of $\mu(B)$. Using this fact, we may write, given $\eta > 0$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, w_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le P\Big(\bigcup_i \big\{\mu((t_i,t_{i+1}] \times \{|x|>1\}) > 0,\ (\Delta_i J^s)^2 > u_n\big\}\Big)$$
$$\le n\, P\big(\mu([0,t_1] \times \{|x|>1\}) > 0\big)\, E\big((J^s_{t_1})^2\big)\, u_n^{-1} \le c\, \Delta_n\, u_n^{-1},$$
which clearly tends to zero in $n$. This concludes the demonstration that $\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, x_{i,1}\, \tilde g_{h,k}(t)$ tends to zero in probability. To tackle the term $\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, x_{i,2}\, \tilde g_{h,k}(t)$, we start with the following definitions: for all $n \in \mathbb{N}$,
$$\Omega^1_n := \{\omega : |\Delta_i X^c(\omega)| < 1 - 2u_n^{1/2}, \text{ for all } i < n\},$$
$$\Omega^2_n := \{\omega : \mu(\omega, (t_i, t_{i+1}] \times \{|x| > 1\}) \le 1, \text{ for all } i < n\}.$$
These sets are clearly measurable. Denote
$$\Omega_n := \Omega^1_n \cap \Omega^2_n. \qquad (1.49)$$
Since there can be at most a finite number of jumps larger than 1 in magnitude on $[0,1]$, and $1 - 2u_n^{1/2} \uparrow 1$ while $\sup_{i<n}|\Delta_i X^c| \downarrow 0$ uniformly on $[0,1]$, it follows that $P(\Omega_n) \to 1$ as $n \to \infty$. Now note that
$$A^c_i \cap B_i \cap \Omega_n \subset \{(\Delta_i X^c + \Delta_i J^s)^2 > u_n\} \subset \{(\Delta_i X^c)^2 > u_n/4\} \cup \{(\Delta_i J^s)^2 > u_n/4\}.$$
Hence, by successive applications of the Hölder and Markov inequalities,
$$E\big((\Delta_i X^f)^2 I_{A^c_i \cap B_i \cap \Omega_n}\big) = E\big((\Delta_i X^c)^2 I_{A^c_i \cap B_i \cap \Omega_n}\big) \le E\big((\Delta_i X^c)^2 I_{\{(\Delta_i X^c)^2 > u_n/4\}}\big) + E\big((\Delta_i X^c)^2 I_{\{(\Delta_i J^s)^2 > u_n/4\}}\big) \le c\, \Delta_n^{3/2} u_n^{-1/2}.$$
Let $\eta$ be a given positive number; it is now clear that
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, x_{i,2}\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le P(\Omega^c_n) + c\, u_n^{-1/2} H_n \Delta_n^{1/2},$$
which tends to zero. This completes the demonstration that
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} x_{h,k}\, \tilde g_{h,k}(t)\Big| > \eta\Big) \to 0. \qquad (1.50)$$
Now we show the third summand in (1.43) tends to zero. First, denote $C_i := \{(\Delta_i J^s)^2 \le 4u_n\}$, $p_{h,k} := 2\sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i X^f\, \Delta_i J^s\, I_{A_i \cap C_i}$, and $q_{h,k} := 2\sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i X^f\, \Delta_i J^s\, I_{A_i \cap C^c_i}$. Clearly,
$$\sum_{(h,k)\in\Theta_n} y_{h,k}\, \tilde g_{h,k}(t) = \sum_{(h,k)\in\Theta_n} (p_{h,k} + q_{h,k})\, \tilde g_{h,k}(t).$$
Treating the term involving $q_{h,k}$ first, note that by the reverse triangle inequality we may write $A_i \cap C^c_i \subset \{u_n^{1/2} < |\Delta_i X^f|\} \subset \{u_n^{1/2}/2 < |\Delta_i X^c|\} \cup \{u_n^{1/2}/2 < |\Delta_i J^l|\} =: G^1_i \cup G^2_i$. So that
$$\Delta_i X^f\, \Delta_i J^s\, I_{A_i \cap C^c_i} \le \Delta_i X^f\, \Delta_i J^s\,\big(I_{G^1_i} + I_{G^2_i}\big) \le \Delta_i X^c\, \Delta_i J^s\,\big(I_{G^1_i} + I_{G^2_i}\big) + \Delta_i J^l\, \Delta_i J^s\,\big(I_{G^1_i} + I_{G^2_i}\big) =: \gamma^1_i + \gamma^2_i + \gamma^3_i + \gamma^4_i.$$
Hence,
$$\sum_{(h,k)\in\Theta_n} q_{h,k}\, \tilde g_{h,k}(t) \le \sum_{(h,k)\in\Theta_n} \Big(\sum_{i=0}^{n-1} g_{h,k}(t_i)\big(\gamma^1_i + \gamma^2_i + \gamma^3_i + \gamma^4_i\big)\Big)\tilde g_{h,k}(t).$$
We show in turn that each summand converges to zero. First, observe that
$$E(\gamma^1_i) \le E\big((\Delta_i X^c\, I_{G^1_i})^2\big)^{1/2} E\big((\Delta_i J^s)^2\big)^{1/2} \le E\big((\Delta_i X^c)^4\big)^{1/4} E\big(I_{G^1_i}\big)^{1/4} E\big((\Delta_i J^s)^2\big)^{1/2} \le c\, \Delta_n^{1/2}\,\big(u_n^{-1/2}\Delta_n^{1/2}\big)\, \Delta_n^{1/2} \le c\, u_n^{-1/2} \Delta_n^{3/2}. \qquad (1.51)$$
Hence, given positive $\eta$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \gamma^1_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le c\, H_n \big(u_n^{-1}\Delta_n\big)^{1/2}. \qquad (1.52)$$
Secondly, we have
$$E(\gamma^2_i) = E\big(\Delta_i X^c\, \Delta_i J^s\, I_{G^2_i}\big) \le E\big((\Delta_i X^c)^2 I_{G^2_i}\big)^{1/2} E\big((\Delta_i J^s)^2\big)^{1/2} \le E\big((\Delta_i X^c)^4\big)^{1/4} P\big(|\Delta_i J^l| > u_n^{1/2}/2\big)^{1/4} E\big((\Delta_i J^s)^2\big)^{1/2} \le c\, u_n^{-1/8} \Delta_n^{5/4}.$$
So that, given positive $\eta$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \gamma^2_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le c\, H_n \big(u_n^{-1/2}\Delta_n\big)^{1/4}. \qquad (1.53)$$
Moreover,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \gamma^3_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le P\Big(\bigcup_i \big\{\mu((t_i,t_{i+1}] \times \{|x|>1\}) > 0,\ (\Delta_i X^c)^2 > u_n/4\big\}\Big) \le c\, \Delta_n u_n^{-1}. \qquad (1.54)$$
Finally,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \gamma^4_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le P\Big(\bigcup_i \big\{\mu((t_i,t_{i+1}] \times \{|x|>1\}) > 0,\ (\Delta_i J^s)^2 > u_n/4\big\}\Big) \le c\, \Delta_n u_n^{-1}. \qquad (1.55)$$
We conclude from the estimates (1.52), (1.53), (1.54), and (1.55) that $\sup_{t\in[0,1]} |\sum_{(h,k)\in\Theta_n} q_{h,k}\, \tilde g_{h,k}(t)|$ tends to zero in probability.
We now show that $\sum_{(h,k)\in\Theta_n} p_{h,k}\, \tilde g_{h,k}(t)$ tends to zero uniformly in probability. To that end, let $\Psi_n := \{\omega : |\Delta_i X^c(\omega)| > u_n^{1/2} \text{ for some } i < n\}$. It now follows by Markov's inequality that
$$P(\Psi_n) \le \sum_{i=0}^{n-1} P\big(|\Delta_i X^c| > u_n^{1/2}\big) \le u_n^{-3/(2(1-\beta))} \sum_{i=0}^{n-1} E\big(|\Delta_i X^c|^{3/(1-\beta)}\big) \le c\, \Delta_n^{1/2}. \qquad (1.56)$$
Hence, $P(\Psi_n) \to 0$. On $A_i \cap C_i \cap \Psi^c_n$, it is easily seen that $|\Delta_i J^l| - |\Delta_i X^c + \Delta_i J^s| \le |\Delta_i X| \le u_n^{1/2}$, so that $|\Delta_i J^l| \le u_n^{1/2} + |\Delta_i X^c| + |\Delta_i J^s|$. It is therefore the case that $|\Delta_i J^l| = O(u_n^{1/2})$. Let $r_{h,k} := 2\sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i X^c\, \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}$ and $s_{h,k} := 2c\, u_n^{1/2} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}$. Then, given $\delta > 0$ and $\varepsilon > 0$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} p_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta\Big) \le P(\Psi_n) + P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} r_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) + P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} s_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big). \qquad (1.57)$$
Now consider that $\sum_{(h,k)\in\Theta_n} r_{h,k}\, \tilde g_{h,k}(t) \le c\, H_n \sum_{i=0}^{n-1} \big|\Delta_i X^c\, \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}\big|$; this implies that
$$P\Big(\Big|\sum_{(h,k)\in\Theta_n} r_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) \le P\Big(c\, H_n \sum_{i=0}^{n-1} \big|\Delta_i X^c\, \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}\big| > \delta/2\Big) \le P\Bigg(\Big(\sum_{i=0}^{n-1} (\Delta_i X^c)^2\Big)^{1/2}\Big(\sum_{i=0}^{n-1} \big(\Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}\big)^2\Big)^{1/2} > \delta\,(2H_n c)^{-1}\Bigg).$$
We now use the well-known fact that the realized quadratic variation $\sum_{t_{i+1} \le t} (\Delta_i X^c)^2$ converges to $\int_0^t \sigma^2(s)\, ds$ in probability uniformly on compact intervals (Protter, 2004, Theorem II.22). That is, there are a sufficiently large $N$ and $C$ such that if $n \ge N$ then $P\big(\big|\big(\sum_{i=0}^{n-1} (\Delta_i X^c)^2\big)^{1/2} - \big(\int_0^1 \sigma^2(s)\, ds\big)^{1/2}\big| > C\big) \le \varepsilon/12$, and because integrated volatility is almost surely finite, there is a sufficiently large $K$ satisfying $K/2 > C$ such that $P\big(\int_0^1 \sigma^2(s)\, ds > K/2\big) \le \varepsilon/12$. Hence, we may write
$$P\Big(\Big|\sum_{(h,k)\in\Theta_n} r_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) \le P\Big(\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \mu_1 > \delta^2 (K H_n c)^{-2}\Big) + \varepsilon/6 \le c\,(H_n)^2\, E\Big(\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \mu_1\Big) + \varepsilon/6 \le c\,(H_n)^2\,\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \nu_1 + \varepsilon/6,$$
which for sufficiently large $n$ is less than $\varepsilon/3$ by Assumption 1.2 and (1.39).
Now it is easily seen that, for sufficiently large $c$,
$$P\Big(\Big|\sum_{(h,k)\in\Theta_n} s_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) \le P\Big(\sum_{i=0}^{n-1} \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n} > (c\, H_n u_n^{1/2})^{-1}\delta\Big) \le c\, u_n (H_n)^2\, E\Big(\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \mu_1\Big) \le c\, u_n (H_n)^2\,\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \nu_1, \qquad (1.58)$$
which, as above, is eventually less than $\varepsilon/3$. Hence, each summand on the right-hand side of (1.57) tends to zero. This concludes the demonstration that
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} y_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta\Big) \to 0.$$
We now tackle the last remaining summand in (1.43). Note that we may write $z_{h,k} = a_{h,k} + b_{h,k}$ with $a_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i J^s)^2 I_{A_i \cap C_i}$ and $b_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i J^s)^2 I_{A_i \cap C^c_i}$ (reusing the symbols $a_{h,k}$ and $b_{h,k}$ locally). Then, for positive $\delta$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} z_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta\Big) \le P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} a_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) + P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} b_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big).$$
Now consider the event $\Omega_n$ from (1.49), and note that $A_i \cap C^c_i \cap \Omega_n \subset \{\mu((t_i, t_{i+1}] \times \{|x| > 1\}) > 0\} \cap C^c_i$. Hence,
$$P\Big(\Big|\sum_{(h,k)\in\Theta_n} b_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) \le P\Big(\bigcup_i \big\{I_{\{|x|>1\}} * \mu((t_i, t_{i+1}] \times \mathbb{R}) > 0,\ (\Delta_i J^s)^2 > 4u_n\big\}\Big) + P(\Omega^c_n) \le n\, P\big(I_{\{|x|>1\}} * \mu([0, t_1] \times \mathbb{R}) > 0\big)\, E\big((J^s_{t_1})^2\big)(4u_n)^{-1} + P(\Omega^c_n) \le c\, \Delta_n u_n^{-1} + P(\Omega^c_n), \qquad (1.59)$$
which can be made as small as desired.
which can be made as small as desired. Now consider
P (|∑
(h,k)∈Θn
ah,kgh,k(t)| > δ/2)
≤ P (n−1∑
i=0
(∆iJs)2I
|∆iJs|≤2u1/2n
> δ(2cHn)−1)
≤ cHnE(
x2I|x|≤1∧2u
1/2n
∗ µ1
)
≤ cHn(x2I|x|≤1∧2u
1/2n
∗ ν)1≤ cHnu1/2
n
which can be made arbitrarily small by the constraints on Hn. This completes
the demonstration that
P ( supt∈[0,1]
|∑
(h,k)∈Θn
zh,kgh,k(t)| > δ) → 0. (1.60)
1.6 Simulation
1.6.1 Continuous prices
In this section, we confirm via simulations the results established analytically.
We first focus on the continuous case to mirror Proposition 1.1. Specifically, we demonstrate that the mean integrated square error (MISE), the square bias, and the variance of the frame-based estimator tend to zero as the number of observations increases. We use prices generated by 4 commonly used models of asset prices, namely, the arithmetic Brownian motion (ABM), the Ornstein-Uhlenbeck process (OU), the geometric Brownian motion (GBM), and the Cox-Ingersoll-Ross (CIR) process.
We simulate prices using the following stochastic differential equations:
$$X_t = 0.8 + 0.5t + 0.2W_t, \qquad \text{(ABM)}$$
$$X_t = 0.8 - \int_0^t 4X_s\, ds + \int_0^t 0.2\, dW_s, \qquad \text{(OU)}$$
$$X_t = 0.8 + \int_0^t 0.5X_s\, ds + \int_0^t 0.2X_s\, dW_s, \qquad \text{(GBM)}$$
$$X_t = 0.8 + \int_0^t (0.1 - 0.5X_s)\, ds + \int_0^t 0.2\sqrt{X_s}\, dW_s, \qquad \text{(CIR)}$$
where Wt is a standard Brownian motion. For convenience, the observation
interval is set to the unit interval $[0,1]$. In all 4 cases, $X_0 = 0.8$. For each price model, we obtain estimates of the MISE, the square bias, and the variance of the estimator when the number of observations is 500, 5,000, and 50,000. In a high-frequency framework, 500 observations for an actively traded stock is likely too small; 5,000 is about right, and 50,000 is not entirely unheard of. At any rate, our objective is not to capture the average number
of trades of any particular security, but rather, to obtain support for our
asymptotic results by showing an inverse relationship between the number of
observations and the MISE, and thereby gain a better understanding of the
finite sample behavior of the estimator.
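The four specifications above can be discretized with a standard Euler–Maruyama scheme. The sketch below is illustrative of such a setup, under our own naming conventions, rather than a transcript of the code behind the reported results.

```python
import numpy as np

def simulate(model, n, x0=0.8, seed=None):
    """Euler-Maruyama path on [0, 1] for the four benchmark models."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    dW = np.sqrt(dt) * rng.standard_normal(n)   # Brownian increments
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        if model == "ABM":
            drift, diff = 0.5, 0.2
        elif model == "OU":
            drift, diff = -4.0 * x[i], 0.2
        elif model == "GBM":
            drift, diff = 0.5 * x[i], 0.2 * x[i]
        elif model == "CIR":
            # clip at zero so the square root stays real under discretization
            drift, diff = 0.1 - 0.5 * x[i], 0.2 * np.sqrt(max(x[i], 0.0))
        else:
            raise ValueError(model)
        x[i + 1] = x[i] + drift * dt + diff * dW[i]
    return x
```

For example, `simulate("CIR", 5000)` produces one CIR path observed at 5,001 equidistant points on the unit interval.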
The starting point for constructing the estimator is to fix a generator
for the Gabor frame. We have denoted the generator and its dual by g and
g, respectively. For our purposes, any continuous and compactly supported
function would work.
Figure 1.1: Estimated vs. actual spot volatility. Four panels plot the estimated against the actual spot variance over time: (a) Geometric Brownian Motion (GBM), (b) Cox-Ingersoll-Ross (CIR), (c) Ornstein-Uhlenbeck (OU), (d) Arithmetic Brownian Motion (ABM). [Figure not reproduced in this extraction.]
Table 1.1: Mean integrated square error (MISE) of $v_n(X, t)$.

                        ABM                                        OU
    n      MISE        Sq. Bias    Var          MISE        Sq. Bias    Var
  500      1.30×10⁻⁴   2.86×10⁻⁶   1.27×10⁻⁴    1.43×10⁻⁴   1.19×10⁻⁵   1.31×10⁻⁴
 5000      1.41×10⁻⁵   1.11×10⁻⁶   1.30×10⁻⁵    1.45×10⁻⁵   1.62×10⁻⁶   1.28×10⁻⁵
50000      2.32×10⁻⁶   1.02×10⁻⁶   1.30×10⁻⁶    2.36×10⁻⁶   1.12×10⁻⁶   1.23×10⁻⁶

                        GBM                                        CIR
    n      MISE        Sq. Bias    Var          MISE        Sq. Bias    Var
  500      2.18×10⁻⁴   4.18×10⁻⁶   2.14×10⁻⁴    6.26×10⁻⁵   8.51×10⁻⁷   6.17×10⁻⁵
 5000      2.33×10⁻⁵   1.58×10⁻⁶   2.17×10⁻⁵    6.82×10⁻⁶   6.00×10⁻⁷   6.22×10⁻⁶
50000      4.66×10⁻⁶   1.02×10⁻⁶   3.64×10⁻⁶    1.46×10⁻⁶   6.06×10⁻⁷   8.52×10⁻⁷

Note: The means of the integrated square errors are obtained by averaging over 100 sample paths generated for each model/number-of-observations pair.
From an implementation perspective, using a B-spline makes the construction of a dual frame generator a trivial matter. This is a consequence of Theorems 2.2 and 2.7 in Christensen (2006), which together specify a very simple rule for constructing dual pairs. Let $a > 0$ and $b > 0$ denote translation and modulation parameters, and let $h$ be a B-spline of order $p$. Define the dilation operator $D_c$ as follows:
$$D_c f(x) = c^{-1/2} f(x/c). \qquad (1.61)$$
If $0 < ab \le 1/(2p-1)$, then $D_a h, D_a \tilde h$, where
$$\tilde h(x) = ab\, h(x) + 2ab \sum_{n=1}^{p-1} h(x+n), \qquad x \in \mathbb{R}, \qquad (1.62)$$
is a pair of dual Gabor frame generators. So if we start with a B-spline $h$, then the dual generator will be a finite linear combination of scaled translates of $h$; consequently, the dual generator will be a spline with similar regularity properties. For our simulation, we used a third-order B-spline, motivated by a desire for a generator whose Fourier transform decays like a quadratic polynomial. Specifically, we set
$$h(x) = \begin{cases} x^2/2, & x \in [0,1), \\ (-2x^2 + 6x - 3)/2, & x \in [1,2), \\ (3-x)^2/2, & x \in [2,3), \\ 0, & x \notin [0,3), \end{cases} \qquad (1.63)$$
with $\tilde h$ computed as in (1.62) above. Our choice of the modulation and translation parameters is rather arbitrary; the only constraint is that $0 < ab \le 1/(2p-1) = 1/5$. In our experimentation with different values, performance was about the same across choices satisfying the inequality; we settled on $a = 1/5$ and $b = 1/3$. Ideally $H_n$, the order of the number of frequency-domain shifts, would be selected optimally to minimize the MISE by balancing integrated variance and integrated square bias; this remains an open research question. For the time being we set $H_n$ naively equal to 50.
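As an illustration of rule (1.62), the following sketch evaluates the third-order B-spline of (1.63) and builds its dual generator; the function names are our own, and with $p = 3$, $a = 1/5$, $b = 1/3$ the condition $ab \le 1/5$ holds.

```python
import numpy as np

def bspline3(x):
    """Third-order B-spline of (1.63), supported on [0, 3]."""
    x = np.asarray(x, dtype=float)
    return np.where((0 <= x) & (x < 1), x**2 / 2,
           np.where((1 <= x) & (x < 2), (-2 * x**2 + 6 * x - 3) / 2,
           np.where((2 <= x) & (x < 3), (3 - x)**2 / 2, 0.0)))

def dual_generator(h, a, b, p):
    """Dual generator via rule (1.62): h~(x) = ab*h(x) + 2ab*sum_{n=1}^{p-1} h(x+n).

    Valid whenever 0 < a*b <= 1/(2p - 1).
    """
    assert 0 < a * b <= 1 / (2 * p - 1)
    def h_dual(x):
        x = np.asarray(x, dtype=float)
        out = a * b * h(x)
        for n in range(1, p):
            out = out + 2 * a * b * h(x + n)
        return out
    return h_dual
```

For instance, `dual_generator(bspline3, 1/5, 1/3, 3)` returns a spline supported on $[-2, 3]$, a finite linear combination of translates of `bspline3`, as the text describes.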
The simulation results indicate that the Gabor frame estimator performs satisfactorily. Figure 1.1 displays, for each of the 4 price models (ABM, OU, GBM, and CIR), simulated spot variance sample paths plotted against the spot variance paths produced by the Gabor frame estimator. A visual inspection shows that the estimator produces a relatively good fit even with the naive selection of $H_n$. This claim is further corroborated by the analysis of the mean integrated square error (MISE), the integrated square bias, and the integrated variance summarized in Table 1.1. We found that the variance, estimated in the foregoing manner, is only approximately the difference between the MISE and the integrated square bias; the reported figures for the variance are in fact the difference between the MISE and the integrated square bias. The discrepancy is rather slight and does not materially change the results. In all 4 models, an inverse relationship between the number of observations and the MISE, square bias, and variance may be read off from the table. As was established mathematically, we expect the MISE to vanish as the number of price observations grows without bound.
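The reported decomposition, with the variance taken as the MISE minus the integrated square bias, can be computed from Monte Carlo output along the following lines; this is a sketch, where `estimates` and `truth` are hypothetical arrays holding, row by row, the estimated and true spot-variance paths on a common grid.

```python
import numpy as np

def mise_decomposition(estimates, truth):
    """Monte Carlo MISE, integrated squared bias, and integrated variance.

    estimates : (paths, grid) array of estimated spot-variance paths
    truth     : (paths, grid) array of the corresponding true paths
    Integrals over [0, 1] are approximated by grid averages.
    """
    err = estimates - truth
    mise = np.mean(np.mean(err**2, axis=1))      # average integrated squared error
    bias2 = np.mean(np.mean(err, axis=0)**2)     # integrated squared mean error
    var = mise - bias2                           # variance reported as MISE - bias^2
    return mise, bias2, var
```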
1.6.2 Prices with jumps
We continue our investigation by simulating prices with jumps.
$$X_t = 0.8 + 0.5t + 0.2W_t + \sum_{i=1}^{N} Y_i, \qquad \text{(ABM + JMP)}$$
$$X_t = 0.8 - \int_0^t 4X_s\, ds + \int_0^t 0.2\, dW_s + \sum_{i=1}^{N} Y_i, \qquad \text{(OU + JMP)}$$
$$X_t = 0.8 + \int_0^t 0.5X_s\, ds + \int_0^t 0.2X_s\, dW_s + \sum_{i=1}^{N} Y_i, \qquad \text{(GBM + JMP)}$$
$$X_t = 0.8 + \int_0^t (0.1 - 0.5X_s)\, ds + \int_0^t 0.2\sqrt{X_s}\, dW_s + \sum_{i=1}^{N} Y_i, \qquad \text{(CIR + JMP)}$$
where $N$ is a Poisson random variable with intensity 5 and $Y_i$, $1 \le i \le N$, are normal random variables with mean zero and standard deviation 0.4.
We construct the dual Gabor frames as in the previous subsection using
the third order B-Spline specified in (1.63). With the introduction of jumps
into the simulation, we found that better results may be obtained by varying the parameters $a$, $b$, and $H_n$. We settled on $a = 1/7$, $b = 1/25$, and $H_n = 50$. The jump threshold is obtained by setting $u_n = n^\alpha$, where $\alpha = -0.9$. The results of the simulations are recorded in Table 1.2. We also graph a single observation (path) for each model in Figure 1.2.
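A compound Poisson term of this form can be superimposed on any simulated diffusion path; the sketch below is illustrative only (the function name and the discretization of the jump times are our own assumptions, not taken from the code behind the reported results).

```python
import numpy as np

def add_jumps(x, intensity=5.0, jump_sd=0.4, seed=None):
    """Superimpose a compound Poisson term sum_{i<=N} Y_i on a path x.

    x is a path observed at t_i = i/n on [0, 1]; each jump arrives at a
    uniform time and is added to every observation from that time onward.
    """
    rng = np.random.default_rng(seed)
    n = len(x) - 1
    N = rng.poisson(intensity)                  # number of jumps on [0, 1]
    times = rng.uniform(0.0, 1.0, size=N)       # jump arrival times
    sizes = rng.normal(0.0, jump_sd, size=N)    # jump sizes Y_i ~ N(0, 0.4^2)
    y = x.copy()
    for t, s in zip(times, sizes):
        y[int(np.ceil(t * n)):] += s            # jump shifts all later levels
    return y
```

The truncation threshold $u_n = n^{-0.9}$ then discards the squared increments straddling these jump times when the mesh is fine enough.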
Figure 1.2: Estimated vs. actual spot volatility. Four panels plot the estimated against the actual spot variance over time: (a) GBM + Jump, (b) CIR + Jump, (c) OU + Jump, (d) ABM + Jump. [Figure not reproduced in this extraction.]
Table 1.2: Mean integrated square error (MISE) of $V_n(X, t)$.

                     ABM + JMP                                 OU + JMP
    n      MISE        Sq. Bias    Var          MISE        Sq. Bias    Var
  500      1.53×10⁻⁴   8.95×10⁻⁶   1.44×10⁻⁴    8.51×10⁻⁴   1.31×10⁻⁴   7.20×10⁻⁴
 5000      2.19×10⁻⁵   2.27×10⁻⁶   1.96×10⁻⁵    5.48×10⁻⁵   9.76×10⁻⁶   4.50×10⁻⁵
50000      2.13×10⁻⁶   9.00×10⁻⁸   2.04×10⁻⁶    6.61×10⁻⁶   2.65×10⁻⁶   3.97×10⁻⁶

                     GBM + JMP                                 CIR + JMP
    n      MISE        Sq. Bias    Var          MISE        Sq. Bias    Var
  500      6.13×10⁻³   8.70×10⁻⁴   5.26×10⁻³    3.74×10⁻⁴   2.32×10⁻⁴   1.43×10⁻⁴
 5000      3.42×10⁻⁴   4.07×10⁻⁵   3.02×10⁻⁴    1.12×10⁻⁵   8.29×10⁻⁶   2.95×10⁻⁶
50000      7.11×10⁻⁵   6.36×10⁻⁶   6.47×10⁻⁵    7.05×10⁻⁶   5.64×10⁻⁶   1.40×10⁻⁶

Note: The means of the integrated square errors are obtained by averaging over 50 sample paths generated for each model/number-of-observations pair.
1.7 Empirical illustration - Flash Crash of 2010
On May 6, 2010, the S&P 500 index lost around 9% of its value in a matter
of minutes; the index rebounded to its pre-crash level a few minutes later.
Between 2:32 p.m. EDT and 3:08 p.m. EDT, the index fluctuated 100 points between 1160 and 1060. The erasure of value in the index was so precipitous that the episode has been dubbed the Flash Crash of 2010.
The relatively quick subsequent rebound of the index to its pre-crash level suggests that the crash in the value of the index was likely not spurred by a fundamental change in the intrinsic value of U.S. equities. The deviation of
realized prices from their fundamental or intrinsic values is the hallmark of a
liquidity crash. In this section, we study the trajectory of the spot volatility
of the S&P 500 index prior to, during, and after the Flash Crash of 2010
using our Gabor frame estimator.
Specifically, we sampled the S&P 500 index every 15 seconds during the
hours of 8:30 CT to 15:00 CT from May 5, 2010 to May 7, 2010. This resulted
in a sample of 4562 observations of the index around the time of the crash.
Table 1.3 provides descriptive statistics of the data. We obtained estimates
of realized volatility using Hn = 50, a = 1/5 and b = 1/7. We vary the
threshold parameter un to obtain four estimates of spot volatility. In the
first instance, un is set equal to infinity so that the estimate obtained in this
instance coincides with vn(X, t). The other estimates correspond to Vn(X, t)
with un set equal to 50, 25, and 12.5, respectively.
The estimates are graphed in Figure 1.3. The x-axis of the graph represents trading hours between 8:30 a.m. and 3:00 p.m. normalized to one time unit, so that values between 0 and 1 represent May 5, 2010, and so on through May 7, 2010. Overall, the graphs of all estimates of realized spot volatility look qualitatively similar. In all instances, spot volatility is seen to ascend toward the start of the crisis and to achieve a pronounced peak as prices bottom out in the afternoon of May 6, 2010. A second, smaller peak in the graph of realized volatility indicates that markets remained agitated throughout the following day.
Table 1.3: Descriptive statistics of S&P 500 data at 15-second resolution between May 5th and 7th of 2010.

Variable   Min.     25% perc.   Median   Mean    75% perc.   Max
X          1066     1119        1153     1143    1166        1176
(∆X)²      0.001    0.010       0.063    0.511   0.260       183.6
Figure 1.3: Realized spot volatility of the S&P 500 Index from May 5, 2010 to May 7, 2010. Realized spot volatility is estimated with $H_n = 50$, $a = 1/7$, $b = 1/5$. Trading hours from 8:30 to 15:00 are normalized to one time unit. Four panels plot the S&P 500 level together with the realized spot volatility: (a) $u_n = \infty$, (b) $u_n = 50$, (c) $u_n = 25$, (d) $u_n = 12.5$. [Figure not reproduced in this extraction.]
1.8 Conclusion
We have investigated estimators of the instantaneous volatility of asset prices over entire time windows based on Gabor frame expansions of the realized trajectory of spot volatility. The main practical advantage of this type of estimator is its versatility: once an estimate is obtained, various functionals of instantaneous volatility, such as the ubiquitous integrated volatility, follow immediately. We derived our estimators of global instantaneous volatility under the assumption that the price process is an Itô semimartingale with Lévy jumps. We have also assumed that the densities of the first and second predictable characteristics belong to the localized class of processes with finite fourth moments.

We proposed a preliminary version of the estimator to be used in situations where the assumption of continuous asset prices holds. Under the assumption that observations of the asset price occur at discrete, equidistant intervals with a mesh tending to zero within a fixed time interval, we showed using standard arguments that the estimator converges in probability in $L^2[0,1]$. In the case of discontinuous prices, we modified the basic estimator so that the computation of the Gabor frame coefficients depends on a threshold. The threshold itself is allowed to shrink to zero at a sufficiently slow rate to ensure consistency of the estimator in $L^2[0,1]$.
1.9 Appendix
1.3 Lemma Let the dual Gabor frame generator $\tilde g$ be constructed as in (1.10). If $\omega(g, \delta)$ denotes the modulus of continuity of $g$, i.e. $\omega(g, \delta) := \sup\{|g(t) - g(t')| : t, t' \in \mathbb{R} \text{ and } |t - t'| < \delta\}$, then
$$\omega(\tilde g_{h,k}, \delta) \le C\, \omega(g, \delta), \qquad h, k \in \mathbb{Z},$$
where $C$ is a positive constant.

Proof. $G$ is bounded away from zero. To see this, note that since $g$ has support in $[r, s]$, the series on the left-hand side of (1.11) has finitely many terms for each $t$. In addition, it is straightforward to verify that $G(t) = G(t + b)$ for all $t$; so $G$ is periodic with period $b$. It is also clear that because $g$ is continuous, so is $G$. It follows that $G$ attains its minimum and maximum on any interval of length $b$. Let $I_b$ denote the interval $[(s + r - b)/2, (s + r + b)/2]$; then
$$\min_{t \in \mathbb{R}} G(t) = \min_{t \in I_b} G(t) \ge a^{-1} \min_{t \in I_b} |g(t)|^2.$$
Because $g$ is continuous and does not vanish in $(r, s)$, we conclude that $G_* := \min_{t \in \mathbb{R}} G(t) > 0$. It is also straightforward that $G^* := \max_{t \in \mathbb{R}} G(t) < \infty$.
Now, let $t, t' \in \mathbb{R}$, $t > t'$, be such that $|t - t'| \le \delta$; then
$$|\tilde g(t) - \tilde g(t')| = \big|(G(t)G(t'))^{-1}\big(g(t)G(t') - g(t')G(t)\big)\big| \le G_*^{-2}\big(|g(t)|\,|G(t) - G(t')| + |G(t)|\,|g(t) - g(t')|\big). \qquad (1.64)$$
For a real number $x$, denote by $\lfloor x \rfloor$ the largest integer less than or equal to $x$ and by $\lceil x \rceil$ the smallest integer greater than or equal to $x$. Now, let $A$ denote the set of integers $i$ such that $r < t - ib < s$. By the definition of $g$, $g(t - jb) = 0$ whenever $j \notin A$. Since $b > 0$, $A$ contains at most $\lceil (1 + |s| + |r|)/b \rceil$ elements. Let $\tau := \min\{t - ib : i \in A\}$, i.e., $\tau$ is the smallest $t - ib$ such that $i \in A$. Because $A$ contains at most a finite number of elements, there exists an integer $k$ such that $\tau = t - kb$. Set $\tau' := t' - kb$. It is straightforward to verify that $|\tau - \tau'| \le \delta$ and
$$a\,|G(t) - G(t')| \le \sum_{j=0}^{\lceil (1+|s|+|r|)/b \rceil} \big|g(\tau + jb)^2 - g(\tau' + jb)^2\big| \le \sum_{j=0}^{\lceil (1+|s|+|r|)/b \rceil} |g(\tau + jb) - g(\tau' + jb)|\,|g(\tau + jb) + g(\tau' + jb)| \le 2\lceil (1 + |s| + |r|)/b \rceil\, g^*\, \omega(g, \delta), \qquad (1.65)$$
where $g^* := \max_{t \in \mathbb{R}} |g(t)|$. Returning to (1.64), we see that
$$|\tilde g(t) - \tilde g(t')| \le C_g\, \omega(g, \delta),$$
where $C_g := G_*^{-2}\big(2a^{-1}\lceil (1 + |s| + |r|)/b \rceil (g^*)^2 + G^*\big)$. Now let $h, k \in \mathbb{Z}$; then
$$|\tilde g_{h,k}(t) - \tilde g_{h,k}(t')| = \big|e^{2\pi i h a t}\big(\tilde g(t - kb) - \tilde g(t' - kb)\big)\big| \le |\tilde g(t - kb) - \tilde g(t' - kb)| \le C_g\, \omega(g, \delta). \qquad (1.66)$$
The last inequality follows because translating a function leaves its modulus of continuity unchanged.
Chapter 2
Testing efficiency in small and
large financial markets
2.1 Introduction
An informationally efficient market is often understood to be one “in which
prices always ‘fully reflect’ available information” (Fama, 1969, Page 383).
While this description may be very helpful to the intuition, it leaves out at
least one important ingredient: the probability measure relative to which
prices are fully reflective of available information. If the probability measure
assumed is the physical or statistical measure then this description may have
more in common with the random walk hypothesis than market efficiency.
Indeed, a slightly more rigorous description of market efficiency would require
risk-adjusted asset prices to behave like martingales in the finite horizon,
complete market setting; of course, the risk-adjustment may be swept up
into an equivalent martingale measure Q, so that an alternative description
of market efficiency, assuming finite horizon and market completeness, would
require prices to evolve like martingales relative to a given information set
and an equivalent martingale measure reflecting agent’s preferences and risk
tolerance. This description of market efficiency echoes Malkiel (1991, Page
211)’s take on the subject:
A market is said to be efficient if it fully and correctly reflects
all relevant information in determining security prices. Formally,
the market is said to be efficient with respect to some information
set, φ, if security prices would be unaffected by revealing that
information to all participants. Moreover, efficiency with respect
to an information set, φ, implies that it is impossible to make
economic profits by trading on the basis of φ.
Hence, in an efficient (complete) market, risk-adjusted prices should be mar-
tingales, and it should be impossible to make economically significant profits
by trading on the basis of available information. Moreover, since prices fully
incorporate all available information at all times, there is no discrepancy be-
tween realized asset prices and prices implied by other non-price information.
In other words, since such discrepancies are non-existent, there cannot exist
trading strategies that perform better than buying and holding individual
traded assets.
This latter intuition informs the rigorous characterization of market ef-
ficiency proposed in (Jarrow & Larsson, 2012, Theorem 3.2). The authors
define a price process S as being efficient relative to a reference information set if an economy E, determined by agents' beliefs, endowments, and preferences, and a consumption good price index may be found such that
S corresponds to equilibrium asset prices in E . From this basic definition,
they obtain characterizations in terms of the existence of equivalent martin-
gale measures and in terms of the joint satisfaction of the no free lunch with
vanishing risk (NFLVR) condition and the no dominance (ND) condition.
NFLVR is an “absence of arbitrage” condition that ensures the existence of
an equivalent local martingale measure for S, whereas ND imposes an opti-
mality condition on asset prices.
One of the benefits of characterizing market efficiency in terms of NFLVR
and ND is that the equivalent (risk-neutral) separating probability measure
is taken out of the definition of market efficiency. From the empirical point of view, this suggests a way of testing efficiency without running into the joint-hypothesis issue, since both NFLVR and ND are expressed entirely in terms of the physical or statistical probability measure.
lem essentially describes the unquantifiability of the misspecification error
incurred by specifying an equilibrium asset pricing model or stochastic dis-
count factor as reference for testing market efficiency.
In the present work, we further this line of research by obtaining ad-
ditional characterizations of market efficiency that have the advantage of
simplifying empirical tests. In principle, the no dominance condition has to
be verified for each asset, so that for a market with a large number of assets
testing the ND requirement may prove to be impractical. Our first insight
into the problem comes from the fact that, under the condition of no un-
bounded profit with bounded risk (NUPBR), the i-th asset Si in a market
with n distinct assets satisfies no dominance if and only if the n-dimensional
vector of asset prices S, expressed in units of (γ + (1 − γ)Si), 0 < γ < 1,
does not violate the no arbitrage (NA) condition. Moreover, not only are
convex portfolios of undominated assets necessarily undominated (Delbaen
& Schachermayer, 1997), but the converse is also true (Corollary 2.1). Combining these insights with the fact that NUPBR remains invariant to a change
of numeraire, we obtain a characterization of market efficiency in terms of
NFLVR for S expressed in units of the market portfolio (Proposition 2.4).
From an empirical standpoint, this characterization obviates the need for
direct verification of the no dominance condition.
This reformulation also allows us to employ existing empirical techniques
for testing market efficiency. The empirical tests devised in (Jarrow et al.,
2012) and (Hogan et al., 2004) were originally intended to test for (statistical)
arbitrage strategies. The absence of arbitrage is not sufficient for market
efficiency. It is in fact possible for a given strategy to not be an arbitrage
while violating the ND condition for one or more assets. But since a violation
of the NA requirement for S expressed in units of the market portfolio is
equivalent to a violation of the ND condition for one or more assets, we are
able to repurpose the statistical arbitrage tests of Hogan et al. (2004) to
perform simultaneous verifications of violations of the ND condition for all
assets. And since the NA condition is equivalent to ND for the zeroth asset,
both the NA and ND conditions can be handled within a single test.
We conclude our study of market efficiency by introducing notions of no
dominance and market efficiency to the large financial market setting of Ka-
banov & Kramkov (1994, 1998); Cuchiero et al. (2015). We refer to these
notions as asymptotic no dominance (AND) and asymptotic market efficiency
(AME), respectively. The tests of market efficiency we propose in the stan-
dard small market setting assume the investment horizon is infinite, R+, so
that they may be most appropriately used to test longer horizon strategies.
The large financial market setting makes it possible to study fixed horizon
strategies under the assumption that the number of assets in the market
tends to infinity, much like in the arbitrage pricing theory (APT) framework
of Ross (1976a,b). We obtain a further change of numeraire characterization
and suggest tests for violations of asymptotic market efficiency.
2.2 Efficiency in standard markets
We take as given a filtered probability basis B := (Ω,F ,F := (Ft)t≥0, P )
satisfying the usual conditions. Defined on B we assume an n-dimensional
semimartingale S := (St)t≥0 representing the price process of n assets. We
will refer to the pair (S,B) as a market.
Let λ > 0; we define λ-admissible strategies in the usual manner, i.e.,
n-dimensional predictable processes H such that the stochastic integral H •S
is well-defined, (H • S) ≥ −λ, limt→∞(H • S)t = (H • S)∞ exists, and H0 = 0.
A strategy is said to be admissible if there exists λ > 0 such that it is λ-
admissible. An arbitrage is an admissible strategy H such that (H •S)∞ ≥ 0
almost surely and (H • S)∞ > 0 holds with positive probability. A market
(S,B) is said to satisfy the no arbitrage condition (NA) if it is devoid of
arbitrage strategies. For admissible strategies, the random variable (H •S)∞
is referred to as the terminal value or payoff of strategy H. The payoff
space of 1-admissible strategies is denoted K1. The market (S,B) is said to
satisfy the no unbounded profit with bounded risk condition (NUPBR) if
K1 is bounded in probability, i.e., bounded in L0(B), the space of finite-valued random variables. Now, if every sequence fn ∈ K1 satisfying ‖fn ∧ 0‖∞ → 0
must also satisfy fn → 0 in probability, then the market (S,B) is said to satisfy the no
free lunch with vanishing risk condition (NFLVR).
The NFLVR condition is a strengthening of the NA condition. As a
matter of fact, NFLVR is necessary and sufficient for both NA and NUPBR to
hold (Kabanov, 1996, Lemma 2.2). In the general unbounded semimartingale
case, the NFLVR condition is equivalent to the existence of a probability Q
equivalent to P such that the components of S are stochastic integrals of a
predictable process with respect to a local martingale. The measure Q is said
to be an equivalent σ-martingale measure for S (Delbaen & Schachermayer,
1999, 1.1 Theorem). In the case of locally bounded S, Q is a local martingale
measure for S (Delbaen & Schachermayer, 1994, Corollary 1.2). This is
also the case for non-negative asset prices, i.e. NFLVR is equivalent to the
existence of a probability measure Q, equivalent to P , such that S is a Q
local martingale if S ≥ 0 (Ansel & Stricker, 1994, Corollary 3.5).
The basic intuition of an efficient market relative to an information set
(Ft)t∈[0,T ] (at least in the complete markets, finite horizon case) is that risk-
adjusted prices evolve over time like Ft-martingales. As a result, current
prices represent the best prediction of the future behavior of risk-adjusted
prices. This is the same as saying that any attempt, in the form of a trading
strategy based on current information, to achieve a better outcome, in the
form of superior risk-adjusted returns, than simply buying and holding the
individual traded assets would ultimately prove to be unsuccessful. Note the
close relationship between the available information set and the set of admis-
sible trading strategies. The available information set uniquely determines
the set of admissible trading strategies and vice versa. Hence, in describing
market efficiency, we may speak of trading strategies rather than information sets. Indeed, provided asset prices exist, an alternative characterization of market efficiency may be stated in terms of the no dominance (ND) condition.
2.1 Definition (Jarrow & Larsson (2012)) Given an n-dimensional vector S representing asset prices, the i-th component Si is undominated on the time horizon [0, T ], T < ∞, if there is no admissible strategy H such that

P ((H • S)T ≥ SiT − Si0) = 1 and P ((H • S)T > SiT − Si0) > 0. (2.1)

The market (S,B) is said to satisfy ND if each Si, 0 ≤ i < n, is undominated.
We will assume in the current setting that the investment horizon is the
positive real line, in contrast to the finite horizon setup analyzed in (Jarrow
& Larsson, 2012). This modeling choice is important, since, in this section,
we are primarily interested in devising tests of market efficiency that hold
asymptotically as the investment horizon approaches infinity. We assume
that prices have been rescaled so that Si0 = 1 for 0 ≤ i < n. We also assume
that H0 = 0 for all admissible strategies. Hence, as a slight modification of
the definition of ND given above, we will say that the i-th asset is undomi-
nated if for all admissible strategies H, P ((H • S)∞ ≥ Si∞ − 1) = 1 implies
P ((H • S)∞ = Si∞ − 1) = 1. We now state the definition of market efficiency
in our setting as follows:
2.2 Definition (Market efficiency) Let (S,B) be a market carried on
the filtered probability basis B = (Ω,F ,F, P ). It is said to be efficient if it
satisfies NFLVR and ND.
The above definition adapts the second characterization of market effi-
ciency in (Jarrow & Larsson, 2012, Theorem 3.2(ii)) to our setting where the
time horizon is infinite. Hence, a market is efficient if both NFLVR and ND
are satisfied. Here, our objective is to derive equivalent characterizations of
market efficiency that may be more suitable for empirical analysis. In the
sequel, we will assume that the vector of prices S is expressed in terms of the
asset occupying the zeroth position, so that S0t = 1 for t ≥ 0, and that all
other asset prices are non-negative, so that Sit ≥ 0 for all t ≥ 0 and 0 ≤ i < n.
Let 0 < γ < 1 and denote by Sγ,i the n + 1 dimensional vector obtained by appending the scalar process Sγ,i := γ + (1 − γ)Si to the end of S, i.e. Sγ,i := (S, Sγ,i). Now set

Zγ,i := Sγ,i(Sγ,i)−1. (2.2)

That is, Zγ,i expresses the price process S in units of the convex portfolio consisting of the zeroth asset and the i-th asset, Sγ,i.
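The effect of the change of numeraire in (2.2) can be illustrated numerically. The sketch below is a minimal illustration under assumed dynamics (a two-asset binomial market with S0 ≡ 1 and a martingale S1; all parameter values are hypothetical): the component Z1 of Zγ,1 is not itself a P-martingale, but its expectation is restored to its initial value under the measure with density Sγ,1∞, in line with the local martingale measures of Proposition 2.1 below.

```python
import random

rng = random.Random(0)

# Hypothetical two-asset market: asset 0 is the constant numeraire (S0 = 1)
# and asset 1 a multiplicative binomial martingale under P (p*u + (1-p)*d = 1).
u, d = 1.1, 0.95
p_up = (1.0 - d) / (u - d)
n_paths, n_steps = 100_000, 20

def terminal_s1():
    s = 1.0
    for _ in range(n_steps):
        s *= u if rng.random() < p_up else d
    return s

gamma = 0.5
ep_Z1 = 0.0  # E_P[Z1_T] under the physical measure P
eq_Z1 = 0.0  # E_Q[Z1_T] under dQ = S_gamma_T dP (note S_gamma_0 = 1)
for _ in range(n_paths):
    s1 = terminal_s1()
    s_g = gamma + (1.0 - gamma) * s1      # numeraire S^{gamma,1}_T
    z1 = s1 / s_g                         # component i = 1 of Z^{gamma,1}_T
    ep_Z1 += z1 / n_paths
    eq_Z1 += s_g * z1 / n_paths           # Radon-Nikodym weight S^{gamma,1}_T

print(f"E_P[Z1_T] = {ep_Z1:.3f}  (below Z1_0 = 1: Z1 is not a P-martingale)")
print(f"E_Q[Z1_T] = {eq_Z1:.3f}  (expectation restored under the new measure)")
```

The weighting by Sγ,1∞ is the elementary mechanism behind the equivalent local martingale measures for Zγ,i that appear throughout this section.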
2.1 Proposition If the market (S,B) is efficient then (Zγ,i,B) admits a
local martingale measure for all 0 ≤ i < n and 0 < γ < 1.
Proof. Recall that a σ-martingale measure, under which Zγ,i may be ex-
pressed as a stochastic integral with respect to a local martingale, coincides
with a local martingale measure for non-negative Zγ,i (Ansel & Stricker, 1994,
Corollary 3.5). Hence, it is only required to demonstrate NFLVR, which in
turn is equivalent to both NA and NUPBR (Delbaen & Schachermayer, 1994,
Corollary 3.8). Suppose (S,B) satisfies NFLVR and ND while (Zγ,i,B) fails
to satisfy NA for some i and 0 < γ < 1, so that there exists an admissible
strategy H for Zγ,i such that (H •Zγ,i)∞ is non-negative and strictly positive
with positive probability. We will argue as in (Delbaen & Schachermayer,
1995, Theorem 11). By rescaling H it may be assumed that H is 1-admissible
for Zγ,i. Consider the gain process
Y := (1− γ)−1(Sγ,i(H • Zγ,i + 1)− 1). (2.3)
Since Sγ,i is strictly positive and H is 1-admissible for Zγ,i, we have that
Yt ≥ −(1 − γ)−1 for t ≥ 0. Also, because H0 = 0 and Sγ,i0 = 1, we have
that Y0 = 0. Since H is an arbitrage for Zγ,i, we have that Y∞ is at least
as great as (1 − γ)−1(Sγ,i∞ − 1) = Si∞ − 1, with the inequality holding strictly
with positive probability. ND will be violated for Si if Y is representable as
a stochastic integral with respect to S. This follows from an application of
Itô’s integration by parts formula. Write H =: (Ha, Hb) where Ha denotes
the first n components of H and Hb its n+ 1-st (last) component. Then
(1− γ)Y + 1 = H • Sγ,i • Zγ,i +H • Zγ,i • Sγ,i +H • [Sγ,i, Zγ,i] + Sγ,i
= H • Sγ,i + Sγ,i
= Ha • S +Hb • Sγ,i + Sγ,i.
So that Y = (1−γ)−1Ha •S+(Hb+1) •Si, which may be expressed as K •S,
where K = (1−γ)−1Ha+ In,b,i and In,b,i is the n dimensional vector which is
zero everywhere except in the i-th position where it is equal to Hb+1. Hence,
Si is dominated by the (1 − γ)−1-admissible strategy K. This contradicts
the efficiency of (S,B).

We note that the result that the NUPBR condition is invariant to a change
of numeraire in the finite horizon setting is proved in (Takaoka & Schweizer,
2014, Proposition 2.7 (ii)) using functional analytic methods. Here, we establish the claim using more elementary arguments. To that end, suppose (S,B) is efficient while (Zγ,i,B) fails to satisfy the NUPBR condition. In that case
there exists a sequence (Hm)m≥1 of 1-admissible strategies for Zγ,i and β > 0
such that given N ∈ N if m is sufficiently large then P ((Hm•Zγ,i)∞ > N) > β.
Denote
Y m := Sγ,i(Hm • Zγ,i + 1)− 1.
It is easily verified that Y m0 = 0 and Y mt ≥ −1 for t ≥ 0. Moreover, Y m∞ ≥ γ((Hm • Zγ,i)∞ + 1) − 1, so that (Y m∞)m≥1 is an unbounded sequence in L0(B). Indeed, for N ∈ N, P (Y m∞ > N) ≥ P ((Hm • Zγ,i)∞ > γ−1(N + 1) − 1), which
for sufficiently large m exceeds β. NUPBR for (S,B) will be violated as soon
as Y m is shown to be representable as a stochastic integral with respect to
S. But this follows, as in the previous paragraph, from Itô’s integration by
parts formula.
We now establish the converse to the previous claim.
2.2 Proposition If (Zγ,i,B) admits a local martingale measure for every
0 < γ < 1 and 0 ≤ i < n then (S,B) is efficient.
Proof. Suppose (Zγ,i,B) admits a local martingale measure for every i and
0 < γ < 1 while (S,B) fails to satisfy NUPBR. Then there is Hm 1-admissible
for S and β > 0 such that for sufficiently large m, P ((Hm • S)∞ > N) > β
for all N ∈ N. Consider
Y m := (Sγ,i)−1(Hm • S + 1)− 1.
Y mt is well-defined because Sγ,it is strictly positive; Y mt ≥ −1 because Hm is 1-admissible for S; and Y m0 = 0 because Hm0 = 0 and Sγ,i0 = 1. We note
that the existence of a local martingale measure for Zγ,i is equivalent to
NFLVR. Under NFLVR (H • Zγ,i)∞ exists and is finite-valued for admissible
strategies (Delbaen & Schachermayer, 1994, Theorem 3.3). In particular Zγ,i∞
and therefore (Sγ,i∞)−1, for all i, is well-defined and finite-valued. Note that (Sγ,i∞)−1 cannot be zero with positive probability, since this would imply that P (Si∞ = ∞) > 0, which would contradict the almost sure finiteness of Zγ,i∞.
Hence, 0 < (Sγ,i∞)−1 ≤ γ−1 almost surely, so that there is c > 0 sufficiently small such that P ((Sγ,i∞)−1 ≤ c) ≤ β/2. Hence,

P (Y m∞ > N) > P ((Hm • S)∞ > c−1(N + 1) − 1) − P ((Sγ,i∞)−1 ≤ c),

which for sufficiently large m is larger than β/2. That is, (Y m∞)m≥1 is unbounded in L0(B).

Now let Km := (Hm, 0) denote the n+1-dimensional predictable process
obtained by appending 0 to Hm. It is easily seen that Km • Sγ,i = Hm • S.
So that Y m may be written alternatively as (Sγ,i)−1(Km • Sγ,i + 1) − 1. By
Itô’s integration by parts formula and the fact that Zγ,i = (Sγ,i)−1Sγ,i, we
have Y m = (Km + In+1,0) • Zγ,i, where In+1,0 denotes the n+ 1 dimensional
vector with zeros everywhere except in the zeroth position where there is a 1.
Thus, (Y m∞ )m≥1 is generated by 1-admissible strategies for Zγ,i and therefore
constitutes a violation of the NUPBR condition for Zγ,i.
We note that the NA condition is simply the ND condition for the zeroth
asset, so that demonstrating ND for 0 ≤ i < n is all that is required. To that
end, suppose there is an i and a c-admissible, c > 0, strategy H for S such
that (H • S)∞ ≥ Si∞ − 1 holds, almost surely, with the inequality holding
strictly on a set of positive probability. Observe that this implies that
(1 − γ)(H • S)∞ ≥ (1 − γ)(Si∞ − 1) = Sγ,i∞ − 1 (2.4)
holds almost surely with the inequality holding strictly on a set with positive
probability. Set K := (H, 0) and note that H •S = K • Sγ,i for any 0 < γ < 1;
fix one such γ and define
Y := (Sγ,i)−1((1− γ)K • Sγ,i + 1)− 1.
It is easily seen that Y0 = 0 and easily verified that Yt ≥ γ−1(1 − γ)(1 − c)
for t ≥ 0. It follows from (2.4) that P (Y∞ ≥ 0) = 1 and P (Y∞ > 0) > 0. It
follows by the stochastic integration by parts formula that Y = ((1− γ)K +
In+1,0) •Zγ,i =: J •Zγ,i. Hence, J constitutes a violation of the NA condition
for Zγ,i. This completes the demonstration.
The previous two Propositions may be summarized as follows:
2.3 Proposition The market (S,B) is efficient if and only if (Zγ,i,B) admits an equivalent local martingale measure for each 0 ≤ i < n and 0 < γ < 1. In particular, under NUPBR, the i-th asset Si is undominated if and only if (Zγ,i,B) satisfies NA for all 0 < γ < 1. Moreover, (S,B) satisfies NUPBR if and only if (Zγ,i,B) satisfies NUPBR.
It is easy to see that if Proposition 2.3 holds for one γ ∈ (0, 1) then it
must hold for all 0 < γ < 1. Hence, we may restate the claim of that Propo-
sition using equally weighted portfolios of the numeraire asset and the i-th
asset. These results make it somewhat easier to test for efficiency by lever-
aging econometric techniques designed for testing arbitrage and unbounded
profit opportunities as opposed to attempting to test for the no dominance
condition directly. Still, a market with n assets would require n + 1 tests to verify efficiency. The U.S. equities market comprises more than five
thousand stocks, so that, in principle, a verification of market efficiency in
the U.S. equities market would require as many as five thousand separate
tests. The following characterization of market efficiency simplifies the task
considerably by reducing the number of tests to just two: NA and NUPBR.
First, we introduce some helpful notation. Let

α = (α0, · · · , αn−1) (2.5)

be an n-dimensional vector of real numbers such that αi > 0 and ∑n−1i=0 αi = 1, so that α is a weight vector. Define Sα := α · S = ∑n−1i=0 αiSi, i.e. Sα is the weighted sum of the n asset prices, and it is interpreted as the value process of the market portfolio computed using the weight vector α. Next, denote by Sα the n + 1 dimensional price vector obtained by appending Sα to S, i.e. Sα := (S, Sα). Denote

Zα := Sα(Sα)−1, (2.6)

so that Zα is a change of numeraire that restates S in units of the market portfolio. We now have the following:
2.4 Proposition The market (S,B) is efficient if and only if (Zα,B) admits an equivalent local martingale measure for all strictly positive weight vectors α.
Proof. Suppose, (Zα,B) admits a local martingale measure while there exists
a c-admissible strategy, c > 0,
H := (H0, · · · , Hn−1)
for S and at least one 0 ≤ k < n such that (H • S)∞ ≥ Sk∞ − 1 holds almost
surely, with the inequality holding strictly with positive probability. Denote
α−k the vector obtained by substituting 0 for the k-th coordinate of α. Set
K := α−k + αkH and observe that (K • S)∞ ≥ Sα∞ − 1, almost surely, with
the inequality holding strictly on a set of positive measure. Set J := (K, 0)
and note that J • Sα = K • S. Now consider
Y = (Sα)−1(J • Sα + 1)− 1.
Because J0 = 0 and Sα0 = 1, we have Y0 = 0. Because H is c-admissible and 0 < (Sαt)−1 ≤ (α0)−1, we have Yt ≥ (α0)−1(1 − α0)(1 − c), and P (Y∞ ≥ 0) = 1 with P (Y∞ > 0) > 0 because H • S dominates Sk. By the stochastic
integration by parts formula, Y = (J+In+1,0) •Zα. That is, Y is an arbitrage
for Zα.
Now suppose (Hm)m≥1 violates NUPBR for S. Then there is β > 0
such that for all N ∈ N, P ((Hm • S)∞ > N) > β for sufficiently large
m. Let Y m = (Sα)−1(Km • Sα + 1) − 1, where Km = (Hm, 0). It is easy to see that Y m0 = 0 and Y mt ≥ −1. Under the assumption of NFLVR, (Sα∞)−1 is well-defined, finite-valued, and contained in (0, (α0)−1] (Delbaen & Schachermayer, 1994, Theorem 3.3). Hence, there is a sufficiently small c > 0 such that P ((Sα∞)−1 > c) > 1 − β/2. Hence, for sufficiently large m,

P (Y m∞ > N) > P ((Km • Sα)∞ > c−1(N + 1) − 1) − P ((Sα∞)−1 ≤ c),
which eventually exceeds β/2. Using Itô’s integration by parts formula, it
may be easily seen that Y m is expressible as a stochastic integral with respect
to Zα.
Now suppose (S,B) is efficient but for some α, (Zα,B) admits an arbi-
trage, so that there is a 1-admissible H such that (H • Zα)∞ ≥ 0 almost surely, with the inequality holding strictly with positive probability. Then it is easily
verified, arguing as in the previous paragraphs, that Y := (Sα)(H •Zα+1)−1
is equal to K • S where K is 1-admissible for S. Because H is an arbitrage
for Zα, we have that Y∞ = (K • S)∞ ≥ Sα∞ − 1, with the inequality hold-
ing strictly with positive probability. Hence, Sα is dominated by K. That
ND fails for at least one asset now follows from (Delbaen & Schachermayer, 1997, Proposition 2.12). Indeed, denote J := α−1n−1(K − ∑n−2i=0 αi) and observe that (J • S)∞ ≥ Sn−1∞ − 1, almost surely, with the inequality holding strictly with positive probability. By the non-negativity of Sn−1, we also have that (J • S)∞ ≥ −1; by Proposition 2.11 of Delbaen & Schachermayer (1997), (J • S)t ≥ −1 on R+, so that J is 1-admissible. This contradicts the no dominance assumption on Sn−1.
Now, if (Hm)m≥1 is a violation of NUPBR for Zα then (Y m)m≥1, where
Y m = (Sα)(Hm • Zα + 1)− 1 violates NUPBR for S.
As a corollary to the previous claim, we now have the following:
2.1 Corollary Under the assumption of NUPBR for S, (S,B) is efficient
if and only if Sα is undominated for every strictly positive weight vector α.
Proof. The necessity of the claim follows as in Proposition 2.4. On the other
hand if Si is dominated by H then K • S, where K = α−i + αiH and α−i is
the portfolio weight α with 0 substituted for αi, dominates Sα.
The next result shows that the choice of weight vector is irrelevant.
2.2 Corollary Let α be a strictly positive weight vector. (Zα,B) satisfies
NFLVR if and only if (Zκ,B) satisfies NFLVR for all strictly positive weight
vectors κ.
Proof. Sufficiency is obvious. Suppose (Zκ,B) fails to satisfy NUPBR for a
strictly positive weight vector κ. Let (Hm)m≥1 denote the sequence yielding
unbounded profits in the market (Zκ,B). Then using familiar arguments, it
is easily verified that Y m := (Sκ)(Hm • Zκ + 1)− 1 constitutes a violation of
NUPBR for (S,B). By Proposition 2.4, (Zα,B) cannot satisfy NFLVR.
Suppose (Zκ,B) satisfies NUPBR but fails to satisfy NA. Then arguing
as in Proposition 2.4, it is easily seen that Sκ is dominated. By Corollary
2.1, (S,B) cannot be efficient. By Proposition 2.4, (Zα,B) cannot satisfy
NFLVR.
The use of the market portfolio Sα as numeraire in Proposition 2.4 is sug-
gestive of the role played by the “market” portfolio in the Stochastic Portfolio
Theory (SPT) of Fernholz (2002). In that setting, there are n assets/stocks
with strictly positive prices; each stock is normalized so that there is only
one stock outstanding; consequently, the price of the i-th stock Xi coincides with its capitalization. The time-t total capitalization of the market is given by Xt := ∑n−1i=0 Xit, and the relative capitalization of each stock at time t is given by µi(t) := Xit(Xt)−1, 0 ≤ i < n. In SPT, market inefficiencies are exploited, i.e. situations in which only NUPBR holds but NA fails for the market relative to the numeraire X. Our work contributes a large financial market and/or infinite time horizon view on this phenomenon.
The next characterization of market efficiency is perhaps the most in-
tuitive. It states that in a complete market setting, a market is efficient if
and only if there exists Q equivalent to P such that all asset prices are uni-
formly integrable Q martingales. Hence, not only must risk-adjusted prices
be unpredictable, they must also have constant unconditional risk-adjusted
expectation across time and at infinity. This result is the infinite horizon
counterpart of (Jarrow & Larsson, 2012, Theorem 3.2 (iii)).
2.5 Proposition The market (S,B) is efficient if and only if there exists
an equivalent local martingale measure Q for S such that S is a uniformly
integrable martingale under Q.
Proof. The claim follows from (Delbaen & Schachermayer, 1995, Theorem
13). Indeed, by Proposition 2.4, efficiency is equivalent to (Zα,B) admitting a
local martingale measure. Let Q′ be a local martingale measure for Zα. Since
0 < (Sα)−1 ≤ (α0)−1 on R+, it follows in particular that (Sα)−1 is a uniformly
integrable martingale under Q′ (Protter, 2004, Theorem 51). Define dQ =
(Sα∞)−1dQ′. That Sα is uniformly integrable follows from the fact that for
all stopping times τ , EQ(Sατ ) = 1. Since 0 ≤ αiSi ≤ Sα, we have that Si is
uniformly integrable as well. Moreover, Sα = Zα(Sα) is a Q local martingale
(He et al., 1992, Theorem 12.12). In particular, S is a Q local martingale.
Hence, Q is an equivalent local martingale measure for S.
Suppose Q is a uniformly integrable martingale measure for S. Then
Q doubles as a local martingale measure for S, so that NFLVR is satisfied
(Delbaen & Schachermayer, 1999, Theorem 1.1). It remains to show prices
are not dominated. Suppose there is K admissible for S such that
P ((K • S)∞ ≥ Si∞ − 1) = 1, (2.7)
with the inequality holding strictly with positive probability for some 0 ≤ i < n. Because S is a local martingale under Q, it follows that K • S is
a σ-martingale, so that by (Ansel & Stricker, 1994, Corollary 3.5) it is a
local martingale. Moreover, since it is bounded below, K • S is a super-
martingale. Hence, under the assumption of uniformly integrable Si, we
have EQ((K • S)∞ − (Si∞ − 1)) = EQ((K • S)∞) ≤ EQ((K • S)0) = 0. Since
(2.7) holds and P ∼ Q, it must be the case that P ((K •S)∞ = Si∞−1) = 1.
According to Proposition 2.5, efficiency requires an equivalent probability measure Q under which prices are not merely martingales but uniformly integrable martingales. Indeed, the following counterexample, taken from (Delbaen & Schachermayer, 1999), shows that the martingale property alone is not sufficient.
2.1 Example Let (εm)m≥1 be an independent and identically distributed Bernoulli sequence taking the values 1 and −1 with equal probability under P . Let c denote a real number satisfying 0 < c < 1 and define the price process (Sm)m≥1 recursively as follows: S0 = 1 and Sm = c if εm = 1, and Sm = 2Sm−1 − c otherwise. Now consider an economy with two assets (1, S). Denote Fm the σ-algebra generated by ε1, . . . , εm and observe that E(Sm|Fm−1) = 1/2(c + 2Sm−1 − c) = Sm−1, so that (1, S) is a martingale for (Fm)m≥1 under P .
Meanwhile, note that S∞ = c < 1 almost surely since the probability of all
occurrences of εm being -1 is zero. Hence, S is strongly dominated by 1. That
is (1, S) fails to satisfy ND and, therefore, market efficiency even though it
is a martingale.
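The behavior in Example 2.1 is straightforward to simulate. The following sketch (with the hypothetical choices c = 0.5 and horizons 10 and 60) illustrates both halves of the argument: at a short horizon the rare never-absorbed paths keep the empirical mean of Sm near 1, reflecting the martingale property, while at a longer horizon essentially every path has been absorbed at c, reflecting S∞ = c and the dominance of S by the constant asset.

```python
import random

rng = random.Random(1)
c = 0.5  # the constant 0 < c < 1 of Example 2.1 (hypothetical choice)

def terminal(n_steps):
    # S_0 = 1; S_m = c if eps_m = +1, else S_m = 2*S_{m-1} - c.
    s = 1.0
    for _ in range(n_steps):
        s = c if rng.random() < 0.5 else 2.0 * s - c
    return s

n_paths = 100_000

# Martingale property: E(S_m) = 1 for every m.  At a short horizon the rare
# never-absorbed paths (value 2**m * (1 - c) + c) still appear in the sample
# and pull the empirical mean back toward 1.
mean_10 = sum(terminal(10) for _ in range(n_paths)) / n_paths

# Absorption: once eps_m = +1 occurs, S stays at c (since 2c - c = c).  At a
# long horizon essentially every path is absorbed, illustrating S_inf = c < 1
# almost surely, i.e. S is dominated by the constant asset 1.
frac_absorbed = sum(terminal(60) == c for _ in range(n_paths)) / n_paths

print(f"empirical mean of S_10:          {mean_10:.3f}")
print(f"fraction of paths with S_60 = c: {frac_absorbed:.5f}")
```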
2.2.1 Statistical inference for market efficiency
In the asset management industry, an arbitrage is often understood, at least
implicitly, as a trading strategy capable of generating positive expected excess
return beyond the level implied by its exposure to a set of risk factors. The
set of risk factors is often the return of a market index such as the S&P
500 together with the size and value factors of Fama & French (1993). This
excess positive return beyond the level prescribed by the benchmark index or
factors is often denoted α and the strategy as a whole is often referred to as
an alpha. The economic appeal of an arbitrage is the possibility of achieving
positive excess returns while incurring a less than commensurate amount of
risk.
In other words, an arbitrage is a free lunch. Clearly, the free lunch inter-
pretation of an arbitrage only makes sense to the extent that the benchmark
factors accurately represent the sources of systematic risk present in the econ-
omy. As a case in point, a strategy based on the “small size effect” (Banz,
1981) produces positive alpha when systematic risk is proxied with the return
on a market index; of course, the positive alpha vanishes in the multi-factor
model of Fama & French. Hence, a true determination of an alpha, at least in
the multi-factor framework, is only possible if the underlying risk factors are
known and measurable with accuracy. Another way to state the same thing
is to consider the fact that in an exponentially affine multi-factor framework,
the logarithm of the Radon-Nikodym derivative of the risk-neutral measure,
is given by

m = a + ∑ki=1 bifi,

where a and bi are constants and fi, 0 < i ≤ k, is a systematic/priced risk factor. Hence, a choice of (fi)0<i≤k may be viewed as expressing an opinion about m, or indeed about the risk-neutral measure Q, since

Q(A) = ∫A exp(m) dP (2.8)

for all events A.
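Equation (2.8) is an exponential tilting of P. As a minimal numerical illustration with a single hypothetical Gaussian factor (k = 1, b1 = 0.5, and a = −b1²/2 chosen so that EP[exp(m)] = 1, making Q a probability measure), a Q-probability can be computed from P-samples by reweighting with exp(m):

```python
import math
import random

rng = random.Random(2)

# One hypothetical priced factor f ~ N(0,1) under P; pricing kernel
# m = a + b*f with a = -b**2/2, so that E_P[exp(m)] = 1 and Q is a probability.
b = 0.5
a = -b * b / 2.0

n = 200_000
q_prob = 0.0  # Monte Carlo estimate of Q(f > 0) = E_P[exp(m) * 1_{f > 0}]
for _ in range(n):
    f = rng.gauss(0.0, 1.0)
    if f > 0:
        q_prob += math.exp(a + b * f) / n

# Under the tilted measure Q the factor is N(b, 1), hence Q(f > 0) = Phi(b).
phi_b = 0.5 * (1.0 + math.erf(b / math.sqrt(2.0)))
print(f"Monte Carlo Q(f > 0) = {q_prob:.3f}; exact Phi(b) = {phi_b:.3f}")
```

The reweighting step makes concrete how a choice of factors (fi) pins down an opinion about Q, and hence how an error in that choice contaminates any test built on it.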
In practice, the pricing kernel m is unobservable so that the choice of risk
factors (fi)0<i≤k is subject to error; indeed the choice of a linear relationship
itself is subject to error. A means by which the misspecification error may be
sidestepped is suggested by the local martingale characterization of market efficiency, i.e. a market is efficient if NA, ND, and NUPBR hold. These conditions are expressed in terms of the physical
measure, so that, by formulating empirical tests based on these concepts the
misspecification error inherent in trying to estimate the pricing kernel m may
be avoided.
By Proposition 2.4, if the aim is to study efficiency in the market (S,B), then it may prove more practical to first perform a change of numeraire, using the market portfolio as the new numeraire, and then to study the market (Zα,B). This approach has the benefit of obviating the need to
perform the ND test for each asset, since each violation of ND translates into
a violation of NA for Zα. To that end, we have the following result.
2.6 Proposition Let (Hm)m≥1 be a sequence of admissible simple strategies for Zα, that is, Hm admits the representation Hm = ∑nmi=1 ζiI⟦τi−1,τi⟧, where nm ↑ ∞, τi is a stopping time, and ζi is Fτi−1-measurable. Further suppose that

EP ((V mτm)2) < ∞,

where V mτm := (Hm • Zα)τm. Suppose there is an admissible strategy H for Zα such that Hm • Zα converges to H • Zα uniformly on compacts in probability (ucp). Then H constitutes a violation of NA for Zα if and only if

limm EP (V mτm) > β for some β > 0, (2.9)
limm P (V mτm < 0) = 0. (2.10)

Moreover, if (Hm)m≥1 denotes a sequence of 1-admissible simple strategies for Zα such that

EP (V mτm) < ∞,

then (Hm)m≥1 constitutes a violation of NUPBR for Zα if and only if

limm EP (V mτm) = ∞. (2.11)
Proof. These statements follow directly from the definitions of NA and NUPBR.
The simplest way to verify (2.9), (2.10), and (2.11) is probably to specify a parametric model for the incremental payoffs of Hm. This is the approach taken in Jarrow et al. (2012) to study statistical arbitrage opportunities. Let
(εi)1≤i≤nm denote a sequence of independent standard normal variables and
define
∆V mτi := V mτi − V mτi−1 = µiθ + σiγεi, (2.12)
where µ, θ, σ, and γ are constants. This specification is the unconstrained mean (UM) model of Hogan et al. (2004); this basic setup may be modified to accommodate more complicated behaviors such as correlated errors and coefficients that change from one small market to the next. Observe that V mτm is normally distributed with mean µ∑nmi=1 iθ and variance ∑nmi=1(σiγ)2. The log-likelihood function is given by

L(Θ) := −2−1 ∑nmi=1 log(σiγ)2 − (2σ2)−1 ∑nmi=1 i−2γ(∆V mτi − µiθ)2,

where Θ := (µ, θ, σ, γ). The parameter vector may be estimated in the usual manner by setting the gradient of L(Θ) to zero and solving a system of four equations in four unknowns to obtain an estimate Θ̂ := (µ̂, θ̂, σ̂, γ̂).
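A minimal sketch of the UM specification (2.12), with hypothetical parameter values: it simulates incremental payoffs and evaluates the log-likelihood L(Θ). For brevity the likelihood is only evaluated at two candidate parameter vectors rather than maximized; in practice Θ̂ would be obtained by numerical optimization (e.g. a quasi-Newton routine) or by solving the first-order conditions as described above.

```python
import math
import random

rng = random.Random(3)

# Simulate incremental payoffs under the UM model (2.12):
#   Delta V_i = mu * i**theta + sigma * i**gamma * eps_i,  eps_i iid N(0,1).
true = dict(mu=0.1, theta=0.3, sigma=0.5, gamma=0.1)  # hypothetical values
n = 5_000
dV = [true["mu"] * i ** true["theta"]
      + true["sigma"] * i ** true["gamma"] * rng.gauss(0.0, 1.0)
      for i in range(1, n + 1)]

def loglik(mu, theta, sigma, gamma):
    # L(Theta) = -(1/2) sum_i log(sigma**2 * i**(2*gamma))
    #            - 1/(2*sigma**2) sum_i i**(-2*gamma) * (dV_i - mu*i**theta)**2
    out = 0.0
    for i, dv in enumerate(dV, start=1):
        var_i = (sigma * i ** gamma) ** 2
        out -= 0.5 * math.log(var_i) + (dv - mu * i ** theta) ** 2 / (2.0 * var_i)
    return out

L_true = loglik(**true)
L_no_drift = loglik(mu=0.0, theta=0.3, sigma=0.5, gamma=0.1)  # impose mu = 0
print(f"log-likelihood at the true parameters: {L_true:.1f}")
print(f"log-likelihood with mu forced to 0:    {L_no_drift:.1f}")
```

With a positive drift in the simulated payoffs, the likelihood at the true Θ clearly exceeds the no-drift restriction; a likelihood-ratio statistic built from this difference is one way of testing µ > 0.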
Now observe that if both µ and θ are positive then (Hm)m≥1 constitutes a
violation of the NUPBR condition for Zα. If µ > 0, γ < 0, and θ is sufficiently
large then (Hm) converges to an arbitrage for Zα and, by Proposition 2.4,
a violation of market efficiency for (S,B). In (Hogan et al., 2004, Theorem
6), it is shown that θ > (γ − 1/2) ∨ −1 is sufficient to ensure convergence
to an arbitrage. The above considerations are summarized in the following
Proposition.
2.7 Proposition Under the assumptions of Proposition 2.6, if the incremental payoffs of Hm satisfy (2.12), then the null hypothesis of market efficiency may be rejected with 1 − α confidence if either one of the joint tests

1. H1 : µ > 0 and H2 : θ > 0, or
2. H′1 : γ < 0, H′2 : µ > 0, H′3 : θ > (γ − 1/2) ∨ −1

achieves a combined p-value of less than α.
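Since the null of market efficiency is rejected only when every sub-hypothesis of a joint test is significant, one conservative way to form the combined p-value is an intersection–union test: compute a one-sided p-value for each sub-hypothesis and take the maximum. The sketch below uses asymptotic z-tests with hypothetical estimates and standard errors; it is an illustrative convention, not the exact procedure of Hogan et al. (2004).

```python
import math

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sided_p(estimate, se, null_value=0.0, alternative="greater"):
    # Asymptotic z-test p-value for H_a: parameter > null_value (or <).
    z = (estimate - null_value) / se
    return 1.0 - phi(z) if alternative == "greater" else phi(z)

# Hypothetical point estimates and standard errors for (mu, gamma, theta).
p_mu = one_sided_p(0.012, 0.004)                        # H'_2: mu > 0
p_gamma = one_sided_p(-0.35, 0.10, alternative="less")  # H'_1: gamma < 0
# H'_3: theta > (gamma - 1/2) v -1; with gamma_hat = -0.35 the bound is -0.85.
p_theta = one_sided_p(0.20, 0.30, null_value=-0.85)

# Intersection-union test: reject efficiency only if every sub-test rejects,
# so the combined p-value is the maximum of the individual p-values.
p_combined = max(p_mu, p_gamma, p_theta)
print(f"p-values: mu {p_mu:.4f}, gamma {p_gamma:.4f}, theta {p_theta:.4f}")
print(f"combined (max) p-value: {p_combined:.4f}")
```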
It is worth noting that since these tests involve the specification of a
model for the incremental payoffs of the target strategies, they are subject
to misspecification errors. Hence, these tests also involve testing a joint-
hypothesis. The advantage of the current tests over traditional tests that
require the specification of a model for the stochastic discount factor is that
the misspecification error incurred in the tests proposed here may be ana-
lyzed and tested; this is so because they only require observable (at least
at discrete times) data: prices and portfolio returns. This is in contrast to the “unmeasurable” misspecification error incurred in traditional tests, which
rely on estimates of unobservable quantities such as the stochastic discount
factor underlying the market.
Moreover, the incremental payoff specification in (2.12) is just one exam-
ple. Another reasonable model that may be analyzed by maximum likelihood
methods would involve modeling incremental payoffs as the sum of an expo-
nential random variable and a Gaussian random variable. The positivity of
the volatility of the Gaussian component can then be tested as m tends to
infinity to verify violations of the no arbitrage condition. Clearly, the model
that is ultimately selected would depend on how well it fits the data being
analyzed.
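For illustration, the exponential-plus-Gaussian specification corresponds to the exponentially modified Gaussian distribution, so a maximum likelihood fit can be sketched with off-the-shelf routines; the payoff sample and parameter values below are simulated and purely hypothetical:

```python
import numpy as np
from scipy.stats import exponnorm

# Simulate incremental payoffs as exponential + Gaussian noise
# (hypothetical parameters: exponential mean 1.0, Gaussian sd 0.5).
rng = np.random.default_rng(0)
payoffs = rng.exponential(1.0, size=5000) + rng.normal(0.0, 0.5, size=5000)

# Maximum likelihood fit of the exponentially modified Gaussian;
# K = (exponential mean) / (Gaussian sd), scale = Gaussian sd.
K, loc, scale = exponnorm.fit(payoffs)

sigma_hat = scale      # estimated Gaussian volatility
tau_hat = K * scale    # estimated exponential mean
print(sigma_hat, tau_hat)
```

A strictly positive estimate of the Gaussian scale is what the volatility test described above would examine; formal inference would, of course, require standard errors for the fitted parameters.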
2.3 Market efficiency in large financial markets
The theory of large financial markets is a modern re-imagining of the ar-
bitrage pricing theory (APT). The APT (Ross, 1976a,b) was devised as an
alternative to the capital asset pricing model (CAPM) of Sharpe (1964) and
Lintner (1965); it aims to obviate the need for accurate measures of the mar-
ket portfolio and to relax some of the assumptions underlying the CAPM.
It assumes that changes in individual asset returns are due to changes in a
fixed number of factors plus an uncorrelated idiosyncratic component. Under
the assumption of no arbitrage (Huberman, 1982), the security-market line
is approximated arbitrarily well, as the number of assets increases without
bound.
The APT is fundamentally a discrete time theory. The theory of large
financial markets was introduced in (Kabanov & Kramkov, 1994, 1998) as
a dynamic continuous-trading extension of the APT. In this modern incar-
nation, the APT employs the tools of mathematical finance pioneered by
Harrison & Pliska (1981). A large financial market is defined as a sequence
of small markets (Sn,Bn, T n), n ∈ N, where 0 < T n ≤ ∞ is the terminal time
in the n-th small market, Sn is a dn-dimensional vector of asset prices and
Bn is a filtered probability basis (Ωn, Fn, Fn, Pn). In the sequel, we adopt
the "one probability space" large financial market setting of Cuchiero et al.
(2015): dn = n, Tn = T < ∞, and Bn = B for all n ∈ N, and (Sn)n≥1
forms a nested sequence of n-dimensional asset prices, so that the i-th price
process in Sn is indistinguishable from the i-th coordinate of Sm whenever
0 ≤ i ≤ n ≤ m.
In the classic small market setup treated in the previous section, market
efficiency is characterized in terms of NFLVR and the no dominance condi-
tion. We introduce a similarly motivated definition of market efficiency in
large financial markets in terms of asymptotic no free lunch with vanishing
risk (ANFLVR), the large financial market counterpart of NFLVR (Cuchiero
et al., 2015), and asymptotic no dominance (AND) defined below (Definition
2.6). We begin with the introduction of large financial market notation and
definitions.
2.3.1 Large financial market payoff space
We adopt the notation of Cuchiero et al. (2015). Given the n-th small
market (Sn,B), where Sn is an n-dimensional semimartingale representing asset
prices, a λ-admissible strategy, λ > 0, is a predictable process H such that
H_0 = 0, the stochastic integral H • Sn is well-defined, and (H • Sn)_t ≥ −λ for
all 0 ≤ t ≤ T. We will call Xn := H • Sn an admissible gain process if H
is λ-admissible for some positive real λ. We will denote by X^n_λ the set of λ-
admissible gain processes and by X^n the collection of all admissible processes
in (Sn,B), i.e.

X^n := ⋃_{λ>0} X^n_λ = ⋃_{λ>0} λX^n_1.
Small market payoff spaces are denoted K^n and K^n_1 and defined as the
terminal values of small market gain processes:

K^n := {X_T : X ∈ X^n}, and K^n_1 := {X_T : X ∈ X^n_1}.

The space of small market dominated payoffs is defined in the classical
manner:

C^n_0 := {f − g : f ∈ K^n and g ∈ L^0_+(B)},
C^n := {f ∈ C^n_0 : f ∈ L^∞(B)}.
Now for an adapted càdlàg process X carried on the basis B, denote
(X)^*_T := sup_{s≤T} |X_s| and define

‖X‖_ucp := E(min((X)^*_T, 1)).

The functional ‖·‖_ucp is a quasi-norm, and it induces a complete metric
d_ucp(X, Y) := ‖X − Y‖_ucp on the space of adapted càdlàg processes. We
employ the notation X^n →_ucp X to denote convergence with respect to this
topology. A predictable process H will be called simple if there exist F-
stopping times 0 = S_0 ≤ · · · ≤ S_{k+1} = T and ξ_i ∈ F_{S_i} with ‖ξ_i‖_∞ < ∞,
0 ≤ i ≤ k, such that

H_t = ξ_0 1_{⟦0⟧}(t) + Σ_{i=1}^{k} ξ_i 1_{⟧S_i, S_{i+1}⟧}(t).
In the sequel, ξ_0 is assumed to be identically zero. We denote by Λ the set of
B-predictable simple processes. Next, for a càdlàg adapted process X, define

‖X‖_S := sup{‖H • X‖_ucp : H ∈ Λ, |H| ≤ 1}.

The functional ‖·‖_S induces a complete metric on the space of semi-
martingales, referred to interchangeably as the Emery or semimartingale
topology. We employ the notation X^n →_S X to denote convergence with respect
to this topology. Now, a process X is said to be a 1-admissible generalized
gain process if there exists a sequence of small market wealth portfolios
X^n ∈ X^n_1 such that

X^n →_S X,

that is, X is a limit point in the semimartingale topology of ⋃_{n≥1} X^n_1. We
denote the set of λ-admissible generalized wealth portfolios by X_λ and the
set consisting of all admissible generalized wealth portfolios by X, i.e.

X := ⋃_{λ>0} X_λ = ⋃_{λ>0} λX_1.
We now define the payoff spaces K and K_1 as the terminal values of general-
ized wealth portfolios:

K := {X_T : X ∈ X}, and K_1 := {X_T : X ∈ X_1}.

Given the above, we define the set of large financial market dominated payoffs
as follows:

C_0 := {f − g : f ∈ K and g ∈ L^0_+(B)},
C := {f ∈ C_0 : f ∈ L^∞(B)}.
2.3.2 Arbitrage pricing in large financial markets
We now recall the fundamental theorem of asset pricing for large financial
markets (Cuchiero et al., 2015, Theorem 1.1), that is, necessary and
sufficient conditions, with acceptable economic interpretations, under which the
existence of a pricing functional (an equivalent separating measure) is assured.
Since zero is contained in C, we would like the pricing functional, or more
specifically the P-equivalent probability measure Q, to satisfy E_Q(f) ≤ 0 for
all f ∈ C. In order to make these statements precise in the large financial
market setting, we require the following definitions and lemmas.
2.1 Lemma If f ∈ C_0 then there exists f^n ∈ C^n_0, n ∈ N, such that f^n →_P f.

Proof. Suppose f ∈ C_0. Then there is X ∈ X and a random variable
g ∈ L^0_+(B) such that f = X_T − g. Since X ∈ X, there is X^n ∈ X^n such that
‖X^n − X‖_S → 0. This in turn implies that X^n →_ucp X, so that X^n_T →_P X_T.
Set f^n := X^n_T − g. Then f^n ∈ C^n_0, and f^n →_P f.
Hence, the dominated payoff of a generalized gain process may be viewed
as the limit of dominated payoffs in small markets. The next lemma shows
that the same can be said for the bounded portion of C_0.

2.2 Lemma If f ∈ C then there exists f^n ∈ C^n such that f^n →_P f.

Proof. Let f ∈ C; then f ∈ C_0 and f ∈ L^∞(B), i.e. there exists a K < ∞
such that f ≤ K almost surely. Because f ∈ C_0, there exists, by Lemma
2.1, g^n ∈ C^n_0 such that g^n →_P f. Set f^n := g^n − (g^n − K) 1_{g^n ≥ K}. Then
f^n ∈ C^n, and f^n →_P f.
2.3 Definition A large financial market (Sn,B)n≥1 is said to possess the
(Asymptotic) No Arbitrage (ANA) property if there do not exist X^n ∈ X^n_1,
n ∈ N, and X ∈ X_1 such that ‖X^n − X‖_S → 0 and

lim sup_n P(X^n_T < 0) = 0,   (2.13)
lim inf_n P(X^n_T > α) > α,   (2.14)

for some α > 0.
It is easily verified that the definition of ANA given here is equivalent to
the more familiar functional analytic definition:

K_1 ∩ L^0_+(Ω, F, P) = {0}.

Because our interests are econometrically motivated, Definition 2.3 is more
natural. The next definition is the large market counterpart of NUPBR.
2.4 Definition A large financial market (Sn,B)n≥1 is said to satisfy the
No Unbounded Profit with Bounded Risk (NUPBR) condition if K1 is bounded
in L0(B).
These two notions of arbitrage are equivalent to our next notion of arbi-
trage (Cuchiero et al., 2015, Proposition 4.4).
2.5 Definition A large financial market (Sn,B)n≥1 is said to possess the
Asymptotic No Free Lunch with Vanishing Risk (ANFLVR) property if

C̄ ∩ L^∞_+(Ω, F, P) = {0},

where L^∞_+(Ω, F, P) denotes the set of essentially bounded nonnegative random
variables on B and C̄ is the norm closure of C in L^∞(Ω, F, P).
It is shown in (Cuchiero et al., 2015, Theorem 1.1) that a version of
the fundamental theorem of asset pricing holds in the large financial market
setting: ANFLVR is necessary and sufficient for the existence of an equivalent
separating measure (ESM), where an ESM is a probability Q equivalent to
P such that EQ(f) ≤ 0 for f ∈ C.
2.4 Asymptotic market efficiency
In the standard small market setting, the simultaneous satisfaction of the
NFLVR condition and the ND property for all assets is equivalent to market
efficiency. In the case of non-negative asset prices, it is also the case that
there exists a Q equivalent to P such that prices are uniformly integrable
martingales (Proposition 2.5). Here, our objective is to extend these notions
to the framework of large financial markets. We start with an adaptation of
the ND condition to the large financial market setting. For each n we assume
that Sn is an n-dimensional semimartingale with the zeroth component
S^{n,0} = 1 on [0, T], so that dn = n. For all 0 ≤ i < n, the i-th asset
price satisfies S^{n,i}_t ≥ 0 for t ∈ [0, T]. We also assume that time zero prices
are deterministic and that the entire price process is normalized so that
S^{n,i}_0 = 1 for 0 ≤ i < n, n ∈ N.
2.6 Definition (Asymptotic No Dominance (AND)) A large financial
market payoff f ∈ K is said to be (asymptotically) undominated if for all
g ∈ K with g ≥ f a.s., it must also be the case that g = f almost surely.

Now let Ak := {a_0, a_1, · · · , a_{k−1}} denote an arbitrary set of k > 0
distinct natural numbers including 0; we adopt the convention a_0 = 0.
Now let

αk := (α_{a_0}, α_{a_1}, · · · , α_{a_{k−1}})

denote a strictly positive weight vector, that is, Σ_{j=0}^{k−1} α_{a_j} = 1 and α_{a_j} > 0
for 0 ≤ j < k. Now, for n ≥ max{a : a ∈ Ak} define

Sαk := Σ_{j=0}^{k−1} α_{a_j} S^{n,a_j}.

We will refer to Sαk as the convex portfolio generated by (Ak, αk). Note
that because (Sn)n≥1 is a nested sequence, Sαk is, up to an evanescent set,
independent of n for n ≥ max{a : a ∈ Ak}.
2.7 Definition (Asymptotic Market Efficiency (AME)) A large fi-
nancial market (Sn,B)n≥1 is said to be asymptotically efficient on [0, T ] if
1. ANFLVR holds for (Sn,B)n≥1, and
2. for all convex portfolios Sαk, the payoff Sαk_T − 1 is asymptotically
undominated.
Now denote by Sn,αk the (n + 1)-dimensional vector obtained by appending
Sαk to Sn, that is, Sn,αk = (Sn, Sαk). Define

Zn,αk := Sn,αk(Sαk)^{−1}.
Hence, Zn,αk expresses Sn in units of Sαk .
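Computationally, the change of numeraire amounts to dividing each price path, together with the convex portfolio itself, by the portfolio's path. The following sketch on simulated paths (all weights and volatilities are hypothetical) illustrates that the appended portfolio component of Zn,αk is identically one:

```python
import numpy as np

rng = np.random.default_rng(1)
n_assets, n_steps, dt = 4, 250, 1.0 / 250

# Simulated price paths S^{n,i}, normalized so S^{n,i}_0 = 1
# (geometric Brownian motion with hypothetical volatilities).
vols = np.array([0.1, 0.15, 0.2, 0.25])
shocks = rng.normal(size=(n_steps, n_assets)) * vols * np.sqrt(dt)
S = np.vstack([np.ones(n_assets),
               np.exp(np.cumsum(shocks - 0.5 * vols**2 * dt, axis=0))])
S[:, 0] = 1.0  # zeroth asset: the constant numeraire, per the convention a_0 = 0

# Convex portfolio S^{alpha_k}: strictly positive weights summing to 1.
alpha = np.array([0.1, 0.3, 0.3, 0.3])
S_alpha = S @ alpha

# Z^{n,alpha_k} = (S, S^alpha) / S^alpha: prices in units of the portfolio.
Z = np.column_stack([S, S_alpha]) / S_alpha[:, None]
print(Z[0])      # all components start at 1
print(Z[:, -1])  # the appended portfolio component is identically 1
```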
2.8 Proposition The large financial market (Sn,B)n≥1 satisfies NUPBR
if and only if (Zn,αk,B)n≥nk, with nk := max{a : a ∈ Ak}, satisfies NUPBR
for all (Ak, αk).

Proof. Suppose (Hn)n≥1 violates NUPBR for (Sn,B)n≥1. Then there exists
β > 0 such that for every N ∈ N and all sufficiently large n, we have
P((Hn • Sn)_T > N) > β. Consider

Y^n := (Sαk)^{−1}(Hn • Sn + 1) − 1.

Because all prices have initial value 1 and H^n_0 = 0, we have Y^n_0 = 0. Because
(Sαk)^{−1}_T is finite-valued, (Y^n_T)n≥1 is unbounded in L^0(B). Because Hn is 1-
admissible, Y^n ≥ −1 on [0, T]. By Itô's integration by parts formula and the
fact that Zn,αk = Sn,αk(Sαk)^{−1}, we have Y^n = K^n • Zn,αk for a predictable
K^n. Hence, (K^n)n≥nk violates NUPBR for (Zn,αk,B)n≥nk.

For the converse, denote by (Hn)n≥nk a violation of NUPBR for (Zn,αk,B)n≥nk
and consider

Y^n := Sαk(Hn • Zn,αk + 1) − 1.

That (Y^n)n≥1, with Y^n = 0 for n < nk, constitutes a violation of NUPBR for
(Sn,B)n≥1 follows by repeating the arguments of the previous paragraph.
2.9 Proposition Suppose n ≥ max{a : a ∈ Ak} =: nk. Then the payoff
Sαk_T − 1 is asymptotically undominated if and only if (Zn,αk,B)n≥nk satisfies
ANA.

Proof. Suppose Sαk_T − 1 is dominated by X_T for some X ∈ X. Since X ∈ X,
there are λ > 0 and X^n ∈ X^n_λ, n ≥ nk, such that ‖X^n − X‖_S → 0. Because
X^n ∈ X^n_λ, we have X^n = Hn • Sn for an Hn that is λ-admissible for Sn.
Define Jn := (Hn, 0)
and observe that Jn • Sn,αk = Hn • Sn. Consider

Y^n := (Sαk)^{−1}(Jn • Sn,αk + 1) − 1.   (2.15)

Then Y^n_0 = 0, and Y^n_t ≥ (α_{a_0})^{−1}(1 − λ) − 1 for t in [0, T]. By Itô's
integration by parts formula and the fact that Zn,αk = (Sαk)^{−1}Sn,αk, there is a
predictable G^n such that Y^n = G^n • Zn,αk. By the foregoing, G^n is admissible
for (Zn,αk,B)n≥1. Because of the stability of convergence in the Emery
topology (Kardaras, 2013, Proposition 2.10), we have Y^n = G^n • Zn,αk →_S
(Sαk)^{−1}(X + 1) − 1 =: Y. Hence, Y is a generalized gain process for
(Zn,αk,B)n≥nk. Because X_T dominates Sαk_T − 1, we see that Y_T constitutes an
arbitrage for (Zn,αk,B)n≥nk.
Now suppose (Zn,αk,B)n≥nk fails to satisfy ANA, so that there is a
1-admissible generalized gain process X for (Zn,αk,B)n≥nk whose terminal
value X_T dominates the zero payoff. Then there is (Hn)n≥nk such that
Hn • Zn,αk =: X^n is a 1-admissible gain process for Zn,αk, and X^n →_S X.
Consider

Y^n := Sαk(Hn • Zn,αk + 1) − 1.

Then Y^n_0 = 0, and Y^n ≥ −1 on [0, T]. By Itô's integration by parts formula,
there is a predictable K^n such that Y^n = K^n • Sn is well-defined. We have by
(Kardaras, 2013, Proposition 2.10) that Y^n →_S Sαk(X + 1) − 1 =: Y. Since
X_T is nonnegative and strictly positive with positive probability, we have
that Y_T dominates Sαk_T − 1.
2.1 Theorem The large financial market (Sn,B)n≥1 is asymptotically
efficient if and only if (Zn,αk,B)n≥nk satisfies ANFLVR for all (Ak, αk).

Proof. This follows from Propositions 2.8 and 2.9 and (Cuchiero et al., 2015,
Proposition 4.4).
2.4.1 Statistical inference for asymptotic market efficiency

The small market tests discussed in the previous section hold under the
assumption that the time horizon tends to infinity while the number of assets
remains fixed. One may draw an analogy with the time series regression
tests of discrete-time empirical asset pricing (Cochrane, 2001, Chapter 12). In
the current large financial market setup, the time horizon is held fixed while
the number of assets is allowed to grow without bound. The empirical tests
we propose in this section may be analogized to the cross-sectional regression
tests of discrete-time empirical asset pricing theory.

Because the time horizon is assumed fixed, these tests may be particularly
well-suited for analyzing strategies with short investment horizons. Also,
since the cross-section is assumed to grow without bound, they may be more
appropriate for studying strategies involving a great number of assets, in
particular strategies that sort a great number of assets according to some
indicator of performance, such as previous-year return. Examples of such
strategies include mean-reversion and momentum strategies.
2.3 Lemma Let (Ak, αk) be given and let (Hn)n≥nk be a sequence of small
market strategies for (Zn,αk,B)n≥nk converging in the semimartingale topol-
ogy to a generalized gain process Y. Suppose

E((V^n_T)^2) < ∞,

where V^n_T := (Hn • Zn,αk)_T. Then Y_T constitutes a violation of ANA for
(Zn,αk,B)n≥nk if and only if

lim_n E(V^n_T) > β for some β > 0,   (2.16)
lim_n P(V^n_T < 0) = 0.   (2.17)

Moreover, if (Hn)n≥nk is a sequence of 1-admissible strategies for Zn,αk such
that

E(V^n_T) < ∞,

then (Hn • Zn,αk)n≥nk violates NUPBR for (Zn,αk,B)n≥nk if and only if

lim_n E(V^n_T) = ∞.   (2.18)

Proof. These statements follow directly from the definitions of ANA and
NUPBR.
The simplest way to determine whether a given strategy verifies the
requirements of either (2.16), (2.17), or (2.18) is to specify a parametric model
of its incremental payoffs. As a simple example, we may suppose that

ΔV^i_T := V^i_T − V^{i−1}_T = µ i^θ + σ i^γ ε_i,   (2.19)

where µ, θ, σ, and γ are constants and (ε_i)_{i≥nk} is an i.i.d. sequence of
standard normal random variables.

Now note that under the assumption of normally distributed ε_i, V^n_T is
normally distributed, with log likelihood given by

L(Θ) := −2^{−1} Σ_{i=1}^{n} log(σ i^γ)^2 − (2σ^2)^{−1} Σ_{i=1}^{n} i^{−2γ}(ΔV^i_T − µ i^θ)^2,
where Θ := (µ, θ, σ, γ). The parameter vector may be estimated in the usual
fashion, by setting the gradient of L(Θ) to zero and solving the resulting
system of four equations in four unknowns to obtain an estimate
Θ̂ := (µ̂, θ̂, σ̂, γ̂).
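For illustration, the likelihood above may also be maximized numerically instead of solving the gradient equations in closed form; the following sketch fits simulated incremental payoffs from (2.19), with purely hypothetical parameter values:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
n = 1500
i = np.arange(1, n + 1)

# Simulate incremental payoffs from (2.19) with hypothetical parameters.
mu, theta, sigma, gamma = 0.5, 0.1, 0.3, -0.2
dV = mu * i**theta + sigma * i**gamma * rng.standard_normal(n)

def neg_loglik(params):
    """Negative of L(Theta); sigma is parameterized by its log to keep it positive."""
    m, t, log_s, g = params
    s = np.exp(log_s)
    resid = dV - m * i**t
    return (np.sum(np.log((s * i**g) ** 2)) / 2
            + np.sum(i**(-2 * g) * resid**2) / (2 * s**2))

theta0 = np.array([0.1, 0.0, 0.0, 0.0])  # crude starting values
res = minimize(neg_loglik, theta0, method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
mu_hat, theta_hat = res.x[0], res.x[1]
sigma_hat, gamma_hat = np.exp(res.x[2]), res.x[3]
print(mu_hat, theta_hat, sigma_hat, gamma_hat)
```

In line with the discussion that follows, positive estimates of µ and θ would point toward a violation of NUPBR; formal tests of course require standard errors for Θ̂.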
Now observe that if both µ and θ are positive then (Hn)n≥nk constitutes
a violation of the NUPBR condition for (Zn,αk,B)n≥nk. If µ > 0, γ < 0,
and θ is sufficiently large then (Hn • Zn,αk)n≥nk converges to an asymptotic
arbitrage for (Zn,αk,B)n≥nk and, by Theorem 2.1, to a violation of market
efficiency for (Sn,B)n≥1. In (Hogan et al., 2004, Theorem 6), it is shown
that θ > (γ − 1/2) ∨ (−1) is sufficient to ensure convergence to an asymptotic
arbitrage. The above considerations are summarized in the following
proposition.
2.10 Proposition Under the assumptions of Lemma 2.3, if the incremental
payoffs of Hn satisfy (2.19) then the null hypothesis of market efficiency
may be rejected with 1 − α confidence if either one of the joint tests

1. H1: µ > 0 and H2: θ > 0, or

2. H′1: γ < 0, H′2: µ > 0, and H′3: θ > (γ − 1/2) ∨ (−1)

achieves a combined p-value of less than α.
2.5 Conclusion
In a finite horizon complete market setting, market efficiency is equivalent to
asset prices admitting a martingale measure. This basic definition motivates
traditional tests of market efficiency. These tests must by necessity postulate
an equilibrium model of asset prices or a stochastic discount factor as a
reference. Naturally, such a procedure is subject to misspecification errors
which cannot be assessed because the stochastic discount factor (SDF)
is unobservable. Hence, traditional tests of market efficiency are in fact joint
tests of the fit of the particular model selected and deviations from market
efficiency. This is the well-known joint hypothesis problem.
We have contributed to the growing literature that aims to devise tests
of market efficiency that do not suffer from the joint-hypothesis problem.
We have obtained further characterizations of market efficiency that in turn
suggest simplifications of empirical tests of market efficiency. These char-
acterizations involve a change of numeraire that boils down to normalizing
asset prices with respect to the market portfolio prior to investigating vi-
olations of market efficiency. Our analysis may be extended to the large
financial market setting. We define the no dominance condition as well as
market efficiency in the large financial market framework. We show that the
no dominance condition can be characterized in terms of the no arbitrage
condition after a change of numeraire. This result suggests empirical tests of
asymptotic market efficiency similar to those proposed in the small market
setting. The practical importance of the large financial market theory is that
for certain strategies, taking limits as the time horizon tends to infinity may
be inappropriate. Provided the number of assets involved in the execution of
the strategy is very large, the large financial market tests we propose may
be more adequate.
Chapter 3
Statistical arbitrage in the U.S.
treasury futures market
3.1 Introduction
Is the U.S. treasury bond futures market informationally efficient? Weak-form
informational efficiency requires all strategies that rely solely on historical
price data to be dominated by the passive strategy of holding single traded
assets or a weighted portfolio of traded assets. The notion of dominance as
it relates to asset pricing was introduced by Merton (1973) to study option
pricing formulas that are consistent with rational investor behavior. More
recently, Jarrow & Larsson (2012) obtained a characterization of informa-
tional efficiency in terms of the no dominance condition (ND) and the No
Free Lunch with Vanishing Risk condition (NFLVR) of Delbaen & Schacher-
mayer (1994). Accordingly, market inefficiency can be asserted as soon as
either the ND or NFLVR fails.
This result simplifies considerably the task of verifying market efficiency;
it belies the long held belief that in order to test for violations of market
efficiency, one must first specify a model of equilibrium prices such as the
CAPM and then test for efficiency in relation to the estimated equilibrium
model. Unfortunately, this two step procedure runs quickly into difficulties,
since it may not be possible to tell apart errors due to model misspecification
and those that are solely due to market inefficiency. This is the well-known
joint-hypothesis problem discussed in (Fama, 1969).
Moreover, the No Dominance condition itself could be dispensed with as
soon as a change of numeraire is performed. Indeed let B := (Ω,F , (Ft)t≥0, P )
denote a probability basis, and let S denote an n-dimensional semimartingale
whose components Si, 0 ≤ i < n, represent the price of n distinct assets,
expressed in units of the zeroth asset. For the sake of convenience, also
assume that at time zero, each asset is priced at one, i.e. Si0 = 1 for 0 ≤ i < n.
Now, let γ denote a positive number between zero and one, i.e. 0 < γ < 1,
and define
Zγ,i := (S, Sγ,i)(Sγ,i)−1,
where Sγ,i = γ + (1 − γ)Si. According to Dare (2017, Proposition 2.1), the
efficiency of (S,B) is equivalent to the existence of a local martingale measure
for the markets (Zγ,i,B), for 0 ≤ i < n and 0 < γ < 1.
In fact, a stronger statement can be made provided prices are expressed
in units of a portfolio constructed on the basis of a strictly positive weight
vector α = (α0, · · · , αn−1), i.e. αi > 0 for 0 ≤ i < n.
Indeed if
Zα := (S, Sα)(Sα)−1,
then according to Dare (2017, Corollary 2.2), the market (S,B) is efficient
if and only if (Zα,B) admits a local martingale measure. The choice of a
market portfolio is irrelevant so long as it assigns positive weight to each
traded asset.
We will argue for a violation of market efficiency using Dare (2017, Propo-
sition 2.1), with Si representing the price of the 2-Year U.S. Treasury futures
contract. Fortunately, since the NFLVR condition is specified in terms of the
physical measure, the joint-hypothesis issue may be avoided by evaluating
trading rules for violations of NFLVR. Using this testing approach, we make
and empirically support the claim that between April 1, 2010 and Decem-
ber 31, 2015, the equally weighted buy-and-hold strategy was out-performed
by a simple cointegration-based trading rule. Moreover, the hypothesis of
the existence of a statistical arbitrage, in the sense of (Hogan et al., 2004),
achieves a p-value less than 2%.
The trading rule we examine takes as starting point the hypothesis that
treasury bond futures are cointegrated and then attempts to profit from
deviations from the cointegrating relationships. The cointegration hypothesis
assumes, among other things, that even though prices of individual contracts
may be non-stationary, there exists at least one linear combination of these
contracts that results in a stationary price process. That is to say, it is
possible to put together a portfolio of long and short positions in individual
contracts such that the resulting market value of the portfolio is stationary.
The hypothesis of cointegrated bond prices has been examined by Bradley &
Lumpkin (1992), Zhang (1993), and many others. In these studies, the data
employed was sampled at low frequency, daily or monthly, and the hypothesis
of cointegrated bond prices could not be rejected. We carry out a similar
analysis and find empirical support for cointegration using data sampled
intraday at one-minute intervals.
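A minimal sketch of the two-step, Engle-Granger style check underlying such an analysis, run on simulated rather than futures data (the cointegrating relation, noise process, and sample size are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000

# Common stochastic trend (random walk) plus a stationary AR(1) deviation:
# the pair (x, y) is cointegrated with vector (1, -2) by construction.
trend = np.cumsum(rng.standard_normal(n))
dev = np.zeros(n)
for t in range(1, n):
    dev[t] = 0.5 * dev[t - 1] + rng.standard_normal()
x = trend
y = 2.0 * trend + dev

# Step 1: estimate the cointegrating coefficient by OLS of y on x.
beta = np.sum(x * y) / np.sum(x * x)
resid = y - beta * x

# Step 2: Dickey-Fuller-type regression on the residual;
# a strongly negative t-statistic on rho indicates a stationary residual.
d_resid = np.diff(resid)
lag = resid[:-1]
rho = np.sum(lag * d_resid) / np.sum(lag * lag)
se = np.sqrt(np.sum((d_resid - rho * lag) ** 2)
             / (len(lag) - 1) / np.sum(lag * lag))
t_stat = rho / se
print(beta, t_stat)
```

In practice the t-statistic would be compared against Engle-Granger critical values rather than standard normal ones.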
We obtain theoretical motivation for the cointegration-based trading rule
by embedding our analysis within the literature devoted to the study of the
term structure of bonds using factor models. Starting with Litterman &
Scheinkman (1991) and later Bouchaud et al. (1999) and many others, it
has been noted that between 96% and 98% of overall variance of the entire
family of treasury securities may be explained by the variance of just three
factors, the so-called level, slope, and curvature factors. The factors are so
named because of how they affect the shape of the yield curve. A shock
emanating from the first factor has nearly the same impact on contracts of
all maturities; the resulting effect is a vertical shift, upward or downward, of
the entire yield curve. The second factor affects bonds of different maturities
in such a manner as to change the steepness or slope of the curve; it does
so by affecting securities at one end of the maturity spectrum more or less
than those at the other end. Finally, the third factor has the effect of making
the yield curve curvier; it does so by having more or less pronounced effects
on medium term bonds than on bonds situated either ends of the maturity
spectrum.
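The factor decomposition alluded to above is typically obtained by principal component analysis of yield changes; the following sketch on simulated three-factor yield curves (loadings, maturities, and noise level are all hypothetical) reproduces the stylized fact that three components account for well over 96% of the variance:

```python
import numpy as np

rng = np.random.default_rng(3)
n_obs = 2500
maturities = np.array([2.0, 5.0, 10.0, 30.0])

# Hypothetical level, slope, and curvature loadings.
level = np.ones_like(maturities)
slope = (maturities - maturities.mean()) / maturities.std()
curve = slope**2 - (slope**2).mean()

# Yield changes driven by three factors plus small idiosyncratic noise.
f = rng.standard_normal((n_obs, 3)) * np.array([1.0, 0.5, 0.2])
dy = f @ np.vstack([level, slope, curve]) + 0.05 * rng.standard_normal((n_obs, 4))

# PCA via the eigenvalues of the covariance matrix of yield changes.
cov = np.cov(dy, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
explained = eigvals[:3].sum() / eigvals.sum()
print(explained)  # share of variance explained by the first three components
```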
We argue that a strategy based on a cointegration hypothesis is natural
within the context of a term structure driven by common stochastic trends or
factors. In fact, the opposite is also true; that is, a common factor structure is
a natural consequence of cointegrated yields. This line of argument provides
support based on economic theory for our strategy and helps explain its
performance. Our results suggest that the futures market may be inefficient.
Market inefficiency is clearly not a desired outcome; it implies the existence
of a free lunch. Put another way, our results point to a possible misallocation
of resources.
The rest of the paper proceeds as follows: in section 2, we provide a
description of the data used. Futures price data usually does not come in
continuous form for extended periods of time, so we had to make certain
choices about how available historical price data is transformed into a form
suitable for our analysis. These choices can be implemented in real-time and
are, therefore, to be considered part of the trading rule. In section 3, we
provide a theoretical foundation for our trading rule. This foundation allows
us to reach beyond our data and assert that the profitability of the trading
rule is very likely not confined to the period for which we have data. Section
4 is devoted to the implementation details of the trading rule. Section 5
summarizes our empirical results, and section 6 concludes.
3.2 Data
3.2.1 Treasury futures
CBOT Treasury futures are standardized forward contracts for selling and
buying US government debt obligations for future delivery or settlement.
They were introduced in the nineteen-seventies at the Chicago Board of Trade
(CBOT), now part of the Chicago Mercantile Exchange (CME), for hedging
short-term risks on U.S. treasury yields. They come in four tenors or
maturities: 2, 5, 10, and 30 years. In reality, each contract type is written
on a basket of U.S. treasury notes and bonds with a range of maturities and
coupon rates. For instance, the 30-Year Treasury Bond Futures contract is
written on a basket of bonds with maturities ranging from 15 to 25 years.
It is, therefore, worth keeping in mind that a study of the dynamics of the
yield curve using futures data reflects influences from a range of maturities.
Every contract listed above has a face value of $100,000, except the 2-Year
T-Note Futures contract, which has a face value of $200,000. That is,
each contract calls for delivery of an underlying treasury note or bond
with a face value of $100,000, or $200,000 in the case of the 2-Year
contract. In practice, the prices of these contracts are quoted as percentages
of their par value. The minimum tick size of the 2-Year T-Note Futures is
1/128%, that of the 5-Year T-Note Futures is 1/128%, that of the 10-Year
T-Note Futures is 1/64%, and that of the 30-Year T-Bond Futures contract is
1/32%. In Dollar terms, this comes to $15.625, $7.8125, $15.625, and $31.25,
respectively, per tick movement.1 These tick sizes are orders of magnitude
larger than those typically encountered in the equity markets.
Even though most futures contracts are settled in cash at the expiration
of the contract, for a small percentage of open interests, delivery of the
underlying bond actually takes place. Given that the futures contract is
written on a basket of notes and bonds, the actual bond or note delivered is
at the discretion of the seller of the contract. In practice, the seller merely
selects the cheapest bond in the basket to deliver. For our purposes, we
shall focus on only the above listed tenors, but it is worth keeping in mind
that there is also a 30-Year Ultra contract traded at the CME.
For our analysis, we use quote data, prices and sizes, from April 1, 2010
through December 31, 2015. Even though we have at our disposal data rich
enough to allow resolution down to the nearest millisecond, we opted, arbi-
trarily, to aggregate the data into one-minute time bars. The representative
quoted price and size for each time bar is the last recorded quote falling within
that interval. Our use of quotes, bids and offers, instead of transaction data
allows the computation of a proxy for the unobserved true price, by means
of the mid-quote, at a higher frequency than transaction prices might have
allowed. Using quotes, we are also able to reflect directly a major portion of
the execution costs associated with any transaction, i.e. the bid-ask spread.
Trading in these markets takes place primarily electronically, via CME
ClearPort Clearing, virtually around the clock between the hours of 18:00
and 17:00 (Chicago Time), Sunday through Friday. But the markets are
at their most active during the daytime trading hours of 7:20 to 14:00
(Chicago Time), Monday through Friday; these are also the opening hours of
the open outcry trading pits. For our analysis, we use exclusively data from
the daytime trading hours. This ensures that the strategy is able to benefit
from the best liquidity these markets can offer, while mitigating the effects
of slippage (orders not getting filled at the stated price) and costs associated
with breaking through the Level 1 bid and ask sizes.

1 We refer the reader to Labuszewski et al. (2014) for more detailed information about the features of each contract.
3.2.2 Continuous prices
Unlike stocks and long bonds, futures contracts tend to be short-lived, with
price histories extending over a few weeks or months. This stems from the
traditional use of futures contracts as short-term hedging instruments against
price/interest rate fluctuations. Treasury futures contracts, in particular,
have a quarterly expiration cycle in March, June, September, and December.
At any given point in time, several contracts written on the same underlying
bond, differentiated only by their expiration dates, may trade side by side.
Usually, the next contract due to expire, the so-called front-month con-
tract, offers the most liquidity. As the front-month approaches expiration,
liquidity is gradually transferred to the next contract in line to expire, the
deferred month contract. At any rate, a given contract is only actively traded
for a few months or weeks before it expires. Hence, holding a long-term po-
sition in a futures contract actually entails actively trading in and out of the
front month contract as it nears its expiration date. The implementation of
this process is known as rolling the front month forward.
For the purpose of evaluating a trading strategy over a historical period
of more than a few months, the roll can be retroactively implemented to
generate a continuous price series. The usual way to go about the roll is to
trade out of the front month a given number of days before it expires. In the
extreme case, the roll takes place on the expiration date of the front month
contract. The downside of this type of approach is that the roll may take
place at a date when liquidity in the deferred month is not yet plentiful. The
result is that a backtest may not necessarily capture the increased trading
cost associated with the lower liquidity level.
Our preferred approach for implementing the roll is to start trading out of
the front month contract at any point during its expiration month as soon as
the open interest in the deferred month contract exceeds the open interest in
the front month contract. The data used in our backtest is spliced together
this way; the procedure is implementable in real-time and must be considered
part of the trading strategy discussed in this paper.
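This open interest crossover rule is straightforward to implement; the sketch below uses hypothetical open interest figures rather than the CME data employed in the paper:

```python
import numpy as np

# Toy daily open interest for the front and deferred contracts during
# the front contract's expiration month (hypothetical numbers).
days = np.arange(1, 11)
front_oi = np.array([900, 850, 800, 700, 600, 450, 300, 200, 120, 60])
deferred_oi = np.array([100, 180, 300, 450, 580, 640, 760, 850, 920, 980])

def roll_day(front, deferred, day_index):
    """First day on which deferred open interest exceeds front open interest."""
    crossed = np.nonzero(deferred > front)[0]
    return int(day_index[crossed[0]]) if crossed.size else None

print(roll_day(front_oi, deferred_oi, days))
```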
Now, while retroactive contract rolling may solve the problem of creating
an unbroken long-term price history, it creates another: splicing prices
together as described above invariably introduces artificial price jumps
into historical prices. To see this, consider a futures contract with price F
into historical prices. To see this, consider a futures contract with price F
written on a bond with price B. Using an arbitrage argument and ignor-
ing accrued interest, the price of a futures contract at any time t may be
expressed as:
F_t = B_t e^{(r−c)d},    (3.1)
where c is the continuously compounded rate of discounted coupon payments
on the underlying bond, d is the number of time units before the futures
contract expires, and r is the repo rate.
Now, assuming the roll takes place in the expiration month, d for the front
month is less than 30 days, whereas for the deferred month contract, d is at
least 90 days. This results in a price differential between the two contracts,
which shows up in the price data as a jump. In reality, and assuming a
self-financing strategy, the price differential would necessitate a change in
the number of contracts held, so that overall, the return on the portfolio is
unaffected by the roll. Hence, in order to avoid fictitious gains and losses,
the price series must be adjusted to remove the roll-induced price jumps.
The methods most often used in practice apply an adjustment to prices
either prior or subsequent to the roll date. When the adjustment is applied
to prices recorded after the contract is rolled forward, the price history is said
to be adjusted forward; if on the other hand, the adjustment is applied to
prices recorded prior to the roll date then the prices are said to be adjusted
backward. The actual price adjustment, in the case of a backward adjust-
ment, is most commonly carried out in one of two ways: in the first instance,
the roll-induced price gap (price right before the roll minus price right after
the roll) is subtracted from all prices recorded prior to the roll date; in the second instance,
all prices preceding the roll date are multiplied by a factor representing the
relative price level before and after the roll. The second approach is remi-
niscent of how stock prices are adjusted after a stock split. We will refer to
the first approach as the backward difference adjustment method and to the
second as the backward ratio adjustment method. Forward ratio adjustment
and forward difference adjustment are implemented similarly with the ad-
justments applied to prices recorded after the roll date. In our analysis, we
will only consider backward adjusted prices, as they appear to be the more
intuitive approach.
Both types of backward price adjustment methods are widely used in
practice, but the ratio adjustment method has the advantage of guaranteeing
that prices, however early in the price series, always remain positive. In the-
ory, the difference adjustment approach may generate negative prices given
enough roll-induced price gaps. We mention these adjustment procedures
because they tend to affect the performance of most strategies, including the
one we study in this paper. The price adjustment procedures cannot be con-
sidered as part of a real-time trading strategy, so we report results using both
the backward ratio adjustment and the backward difference adjustment.
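As a concrete sketch, the two backward adjustment schemes can be implemented as follows. The prices, roll index, and helper names below are illustrative stand-ins, not the code used in the backtest:

```python
# Sketch of backward roll adjustment for a spliced futures price series.
# `prices` holds the front-month contract up to the roll and the deferred-month
# contract afterwards; `roll_idx` is the first index of the deferred contract.

def backward_difference_adjust(prices, roll_idx):
    # Roll-induced gap: last front-month price minus first deferred-month price.
    gap = prices[roll_idx - 1] - prices[roll_idx]
    # Subtract the gap from all prices recorded prior to the roll date.
    return [p - gap for p in prices[:roll_idx]] + prices[roll_idx:]

def backward_ratio_adjust(prices, roll_idx):
    # Relative price level across the roll, as in a stock-split adjustment.
    factor = prices[roll_idx] / prices[roll_idx - 1]
    # Multiply all prices preceding the roll date by the factor.
    return [p * factor for p in prices[:roll_idx]] + prices[roll_idx:]

front = [120.0, 120.5, 121.0]      # hypothetical front-month prices
deferred = [118.0, 118.2, 118.5]   # hypothetical deferred-month prices
spliced = front + deferred

diff_adj = backward_difference_adjust(spliced, len(front))
ratio_adj = backward_ratio_adjust(spliced, len(front))
```

Either variant removes the artificial jump at the splice point, so the adjusted series is continuous across the roll.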
3.3 Economic framework
3.3.1 The price of a futures contract
The traditional way of pricing a futures contract is via an arbitrage argument.
The argument is best illustrated via an example. Suppose an agent, at time
t (today), has a need to purchase a 10-Year Treasury note at time T1. That
is, at time T1 when the forward/futures contract expires, the treasury note
will mature in ten years at time T1 + 10. The agent could go about it by
borrowing money at time t at the repo rate to cover the full price of the
bond. The full price of the bond would include the current spot price of the
bond and accrued interest on the bond since the last coupon payment. The
accrued interest is the portion of the next coupon payment that is due to the
previous owner of the bond. Let’s denote the spot price of the bond by Bt
and the accrued interest by It. So, at time t the agent may borrow Bt + It,
using the bond as collateral against the loan.
At time T_1, the loan used by the agent to fund the purchase would have
accrued interest of its own and would have grown to (B_t + I_t)e^{r(T_1−t)}. Here,
we are assuming a fixed repo rate r. On the other hand, taking possession of
the treasury note endows the agent with the right to receive coupon payments
generated by the note. Coupon rates are usually a fixed percentage of the
par value of the bond. In practice, this is usually around 6% and payable
semiannually; for this illustration, we will imagine that the coupon payments
are paid continuously at the instantaneous rate of c. To recap, at time T_1,
the loan balance grows to (B_t + I_t)e^{r(T_1−t)}, but it is offset by coupon payments
of Ke^{−c(T_1−t)}, where K denotes the par value of the bond. Hence, at time T_1,
for the agent to own the treasury bond outright, she simply needs to repay
the loan, but because of the accrued coupon interest she would only be out of pocket
F := (B_t + I_t)e^{r(T_1−t)} − Ke^{−c(T_1−t)}. Hence, at time t it only makes economic
sense to enter a futures contract if its price equals the cost of replicating it, that is, F.
The above analysis demonstrates that the price of a futures contract may
be written in terms of the price of the underlying bond. In fact, by denoting C
the continuously discounted present value of all coupon payments generated
by the bond, we may write:
F_t = (B_t − C)e^{rd},    (3.2)
where d is the amount of time left before the futures contract expires. It is
worth noting that the foregoing analysis relies on the assumption of a constant
interest rate. It is also to be noted that the price of a futures contract derived by
a no-arbitrage argument may differ from the price of a forward contract in an
environment with stochastic, time-varying interest rates. We refer the reader
to Cox et al. (1981) for a lucid discussion of this point. For the intuition
we wish to develop, the assumption of a constant interest rate is
tolerable.
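Under the constant-rate assumption, equation (3.2) can be evaluated directly. The inputs below are hypothetical, chosen only to illustrate the formula:

```python
import math

def futures_price(bond_price, coupon_pv, repo_rate, time_to_expiry):
    # F_t = (B_t - C) * exp(r * d), as in equation (3.2),
    # under a constant repo rate r and time to expiry d (in years).
    return (bond_price - coupon_pv) * math.exp(repo_rate * time_to_expiry)

# Hypothetical inputs: spot bond price 130, PV of coupons 2.5,
# repo rate 1% per annum, 0.25 years to expiry.
f = futures_price(130.0, 2.5, 0.01, 0.25)
```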
Returning to (3.2), and taking the natural logarithm of both sides of the
equation, and assuming that the face value of the bond is much larger than
the present value of future coupon payments, we may write
f_t ≈ b_t − c + rd,    (3.3)
where b_t := log(B_t) and c := log(C). Note that b_t = −T y_t, where y_t is the
yield to maturity at time t of the bond. The quantity rd − c is referred
to variously in the empirical literature as the carry or the basis. In
actual price data, the carry fluctuates over time, usually around
a long-term mean. The basic idea of a mean-reverting carry is the motivation
behind the so-called carry trade, which is implemented by going short the
futures and long the bond when the carry is high and doing the opposite
when the carry is deemed too low. The strategy reviewed subsequently is
related to this trade only in that it, too, relies on mean reversion to be
profitable.
Returning to (3.3), it is apparent that besides the variation in the carry,
variations in the logarithm of the futures price come about because of vari-
ations in the logarithm of the bond price, which is itself driven by the yield
to maturity of the bond. Usually, the carry does not vary much, and
it is often modeled as a constant, as we have done here, unless, of course, the
object of the analysis is to study the carry itself. Given these considerations,
we may model the logarithm of the price of the futures contract directly and
exclusively in terms of the yield to maturity with no significant loss in rigor.
That is, we may write
f_t = α + β y_t,    (3.4)
where α and β are constant terms and yt is the yield to maturity of the
underlying bond. The constant α is simply the carry and whatever needs
to be added or subtracted in order to make the approximation in (3.3) an
equality. The constant β is in this setting equal to −T, that is, the negative of the
tenor of the underlying bond.
The preceding reformulation of the logarithm of the price of a futures
contract in terms of the yield to maturity of the underlying bond allows us
to use the theoretical machinery developed to study the term structure of
interest rates to motivate the trading system that we discuss subsequently.
3.3.2 Factor model of the yield curve
Factor modeling of the yield curve has a rich history in the financial lit-
erature. The extant models may be broadly classified under three main
headings: statistical, no-arbitrage, and hybrid models. The static Nelson &
Siegel (1987) (NS) model of the yield curve and its modern counterpart, the
Dynamic Nelson-Siegel (DNS) model, proposed by Diebold & Li (2006) are
prototypes of the class of statistical factor models of the interest rate term
structure. They, especially the static Nelson & Siegel, are widely used both
by financial market practitioners and central banks to set interest rates and
forecast yields. Despite their popularity and appealing statistical properties,
they tend to give rise to violations of the no-arbitrage condition.²
The dynamic term structure models (DTSM) studied in (Singleton, 2006,
Chapter 12), of which the yield-factor model of Duffie & Kan (1996) is an
early example, constitute the class of arbitrage-free models. These models
derive a functional form of the yield curve in terms of state variables or
factors, which also govern the market price of risk linking the local martin-
gale measure to the historical measure. They are, therefore, by construction
arbitrage-free. Despite their economic soundness, these models tend to have
sub-par empirical performance. For instance, Dybvig et al. (1996) showed in
the discrete-time setting that in an arbitrage-free model, long forward and
zero-coupon rates can never fall; working in the general setting of continu-
ous trading, Hubalek et al. (2002) arrived at a similar conclusion regarding
the monotonicity of long forward rates under the no-arbitrage assumption.
Clearly, this implication of the no-arbitrage framework is often contradicted
by the empirical evidence that zero-coupon rates do in fact fall. Furthermore,
negative rates and unit roots are ruled out. As (Diebold & Rudebusch, 2013,
p. 13) put it
Economic [no-arbitrage] theory strongly suggests that nominal
bond yields should not have unit roots, because the yields are
bounded below by zero, whereas unit root processes have random
walk components and therefore will eventually cross zero almost
surely.

²See (Filipović, 1999) for such violations in the case of DNS models.
Since the 2008 financial crisis, negative interest rates have been a mainstay of many
developed economies, including Switzerland. Moreover, the task of fitting
arbitrage-free models to interest rate data can be very difficult since they tend
to be over-parametrized and, typically, would generate multiple likelihood
maxima (Diebold & Rudebusch, 2013, p. 55).
Lastly, the Arbitrage-Free Nelson-Siegel (AFNS) model proposed by Chris-
tensen et al. (2011) is a prototype of the hybrid class of models. It main-
tains the parsimonious parametrization of the DNS model while remaining
arbitrage-free. The AFNS differs, at least in the functional form of the yield
curve, from the standard DNS model only by the inclusion of an extra term
known as the “yield adjustment factor”. Intuitively, the AFNS model may
be thought of as the projection of an arbitrage-free affine term structure
model, namely the Duffie & Kan (1996) model, onto the DNS model with
the orthogonal component swept into the yield adjustment factor.
The factor models briefly surveyed above motivate the trading rule adopted
in this paper; it relies on the hypothesis that the term structure of interest
rates can be described by an affine function of a set of state variables, no-
tably the level, slope, and curvature principal components. Moreover, there
is ample empirical evidence suggesting that the term structure is cointe-
grated. In particular, using monthly Treasury bill data from January 1970
until December 1988, Hall et al. (1992) observed that yields to maturity of
Treasury bills are cointegrated and that during periods when the Federal
Reserve specifically targeted short-term interest rates, the spreads between
yields of different maturities defined the cointegrating vector.
In general, given N bonds, where N is not necessarily finite, a factor
model of the yield curve would represent the yield on the i-th bond as:
y_{i,t} = α_t + ∑_{j=1}^{q} β_{i,j} f_{j,t} + ε_{i,t},    (3.5)
where α is deterministic, q is a small number, f_j, for j = 1, · · · , q, are factors,
β_{i,j} is the contribution of the j-th factor to the i-th bond, and ε_i is the
component of the i-th bond that is shared with no other bond. For our
purposes, it does not actually matter whether the factors are macroeconomic or
statistical in nature, but to fix ideas we assume q = 3 and the factors are
the level, slope, and curvature factors of Litterman & Scheinkman (1991).
By substituting the expression in (3.5) into equation (3.4), we obtain the log
futures price in terms of the level, slope, and curvature of the term structure.
That is,
f_{i,t} = μ_t + ∑_{j=1}^{3} γ_{i,j} f_{j,t} + ε_{i,t}.
3.3.3 Factor extraction
Using Principal Component Analysis (PCA), it is possible to transform the
original time series of futures prices into a set of orthogonal time series known
as principal components. Because of the orthogonality property, the origi-
nal time series may be expressed uniquely as a linear combination of the
principal components. This representation motivates the interpretation of
the principal components as the latent risk factors driving observed price
fluctuations.
The analysis starts with n observations from an m-dimensional random
vector, the original time series data. Then, assuming that the original time
series admits a stationary distribution with finite first and second moments,
the covariance matrix is estimated using an unbiased and consistent estima-
tor. In our setting, the assumption of stationarity applied directly to the
logarithm of futures prices is hard to justify. Prices generally trend upward,
and the same may be expected for their log-transformed versions. Using
the Augmented Dickey-Fuller (ADF) statistics with constant drift, we test
the hypothesis that the lag polynomial characterizing the underlying data
generating process has a unit root.
A quick scan of Table 3.1 reveals that for the most part the unit root
assumption cannot be rejected. The only exception seems to be the 2 Year
and the 5 Year futures price data for the year 2015, for which the assumption
of a unit root may be rejected at the 5% significance level. We think this out-
come is a temporary fluke, since for the previous five years the null hypothesis
Table 3.1: Augmented Dickey-Fuller Tests

(a) 2010
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.00   2.17    -0.00   -2.16    4.59            -2.86
5 Yr    0.00   2.01    -0.00   -2.00    4.59            -2.86
10 Yr   0.00   2.05    -0.00   -2.05    4.59            -2.86
30 Yr   0.00   2.03    -0.00   -2.02    4.59            -2.86

(b) 2011
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.00   0.96    -0.00   -0.96    4.59            -2.86
5 Yr    0.00   0.62    -0.00   -0.61    4.59            -2.86
10 Yr   0.00   0.56    -0.00   -0.54    4.59            -2.86
30 Yr   0.00   0.36    -0.00   -0.33    4.59            -2.86

(c) 2012
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.01   2.51    -0.00   -2.51    4.59            -2.86
5 Yr    0.00   1.31    -0.00   -1.31    4.59            -2.86
10 Yr   0.00   1.22    -0.00   -1.21    4.59            -2.86
30 Yr   0.00   1.36    -0.00   -1.36    4.59            -2.86

(d) 2013
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.00   1.45    -0.00   -1.45    4.59            -2.86
5 Yr    0.01   1.85    -0.00   -1.85    4.59            -2.86
10 Yr   0.00   1.65    -0.00   -1.65    4.59            -2.86
30 Yr   0.00   1.24    -0.00   -1.25    4.59            -2.86

(e) 2014
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.00   1.06    -0.00   -1.05    4.59            -2.86
5 Yr    0.00   1.46    -0.00   -1.46    4.59            -2.86
10 Yr   0.00   1.74    -0.00   -1.73    4.59            -2.86
30 Yr   0.00   1.95    -0.00   -1.93    4.59            -2.86

(f) 2015
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.01   2.88    -0.00   -2.88    4.59            -2.86
5 Yr    0.01   3.01    -0.00   -3.01    4.59            -2.86
10 Yr   0.01   2.76    -0.00   -2.76    4.59            -2.86
30 Yr   0.00   1.65    -0.00   -1.65    4.59            -2.86
could not be rejected. We have also looked at different subsamples of the
2015 data, and for the most part the assumption of a unit root could not be
rejected.
Under the circumstances, carrying on with the analysis of the principal
components of the original price series may not be advisable. Without the
stationarity assumption, it is very likely the case that the usual estimator
of the covariance matrix would yield estimates that may be substantially
off the mark. Meanwhile, taking the first difference of the logarithm of the
price series seems to produce time series that display very little persistence
as may be observed from an inspection of Figure 3.1. Hence, the assumption
of stationarity may be appropriate only after differencing the data. We
have substantiated this assumption using the ADF test: the unit root
hypothesis was rejected at the 1% significance level.
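A simple Dickey-Fuller t-statistic (without the augmentation lags of the full ADF test) can be computed from first principles to illustrate the unit-root check; the simulated random walk below stands in for the log price series:

```python
import numpy as np

def df_tstat(y):
    # Dickey-Fuller regression with a constant: dy_t = a + rho * y_{t-1} + e_t.
    # Returns the t-statistic on rho, compared against e.g. the -2.86
    # critical value. No augmentation lags, for simplicity.
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
shocks = rng.normal(0.0, 0.01, size=2000)
log_prices = np.cumsum(shocks)   # simulated unit-root log price series
diffs = np.diff(log_prices)      # first differences: white noise here

t_levels = df_tstat(log_prices)  # typically cannot reject the unit root
t_diffs = df_tstat(diffs)        # strongly rejects the unit root
```

The differenced series produces a t-statistic far below the 5% critical value, while the levels series typically does not, mirroring the pattern in Table 3.1.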
Clearly, taking differences of the log price data entails a loss of informa-
tion. Nevertheless, an analysis of the differenced data could still yield insight
into the factor structure of the original price data, since a factor structure
may be expected to be shared by both the differenced data and the data in levels.
This observation is easily confirmed by means of simple algebraic manipula-
tions. Naturally, the factors that may be extracted from the differenced data
would bear very little resemblance to the factors present in the levels data,
so that there are limits to how much can be inferred about the data in levels
once it has been differenced.
Proceeding with the differenced data, we estimate the covariance matrix
by means of the unbiased estimator
Σ := (n − 1)^{-1} ∑_{i=1}^{n} x_i x_i',
where x is the normalized data series. In the final step we obtain a spectral
decomposition of the covariance matrix. That is
Σ = ∑_{i=1}^{m} λ_i v_i v_i',    (3.6)

where λ_1 ≥ · · · ≥ λ_m are the nonnegative eigenvalues of Σ in descending
order of magnitude, and v_i, i = 1, . . . , m, are the corresponding eigenvectors.

[Figure 3.1: Changes in log prices. Panels: (a) 2 Yr Treasury Note, (b) 5 Yr Treasury Note, (c) 10 Yr Treasury Bond, (d) 30 Yr Treasury Bond.]
The eigenvectors are orthonormal, that is, they have length one and are
mutually orthogonal. Hence, from the representation in (3.6), the contri-
bution of the i-th principal factor to the overall variance of the differenced
log price data is λ_i. The j-th component of the i-th eigenvector is the
factor loading or beta of the j-th security with respect to the i-th principal
component. That is, the components of the eigenvectors summarize expo-
sure levels. For instance, the second element of the third eigenvector is the
exposure of the 5 Year Treasury Note futures contract to the third principal
component or risk factor, the so-called curvature factor.
Recall that the eigenvectors are orthonormal, so that the associated eigen-
values represent the variance contribution of each principal component to
the variance of the differenced log price data. By taking the ratio of indi-
vidual eigenvalues to the sum of all four eigenvalues, we may estimate the
percentage contribution of each principal component to the overall variance.
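The eigen-decomposition and variance shares can be sketched in a few lines. The simulated returns here, with one dominant common factor, stand in for the differenced log prices of the four contracts:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
# Simulated stand-in for the differenced log prices of the four contracts:
# one dominant common factor plus small idiosyncratic noise.
common = rng.normal(size=n)
loadings = np.array([0.5, 0.5, 0.5, 0.5])
x = np.outer(common, loadings) + rng.normal(scale=0.1, size=(n, 4))
x = x - x.mean(axis=0)  # demean before estimating the covariance

sigma = x.T @ x / (n - 1)                 # unbiased covariance estimator
eigvals, eigvecs = np.linalg.eigh(sigma)  # eigenvalues in ascending order
eigvals = eigvals[::-1]                   # reorder descending, as in (3.6)
eigvecs = eigvecs[:, ::-1]

# Percentage contribution of each principal component to total variance.
shares = eigvals / eigvals.sum()
```

With a single strong common factor, the first share dominates, much as the level factor does in the empirical data.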
The output of this analysis using subsamples corresponding to each calendar
year in our data set is recorded in Figure 3.2. What is immediately apparent
from these figures is that an overwhelming majority of the variability is directly
attributable to the first component; this component contributes between 90
and 93% of total variability, followed by the second component, contributing
between 4.8 and 12%. The contributions due to the third and fourth compo-
nents are fairly modest: the third component contributes between 0.5 and
3%, whereas the fourth component accounts for less than 0.5%. This result
is in agreement with previous work, such as Litterman & Scheinkman (1991)
and Bouchaud et al. (1999), studying the term structure using lower-frequency
data. By now, this is a stylized fact of the term structure of interest rates;
our results confirm it for higher-frequency data.
Figure 3.3 reports the loadings associated with each principal component
for a variety of subsamples. The loadings for the first factor are fairly stable across
maturity and time. The weights are uniformly close to 0.5, so that the effects
of shocks emanating from the first factor are felt uniformly across maturities.
The loadings associated with the second and third principal components show
a great deal of variation across time. In 2010, a shock emanating
from the second factor had more impact on long bonds than on short bonds.
[Figure 3.2: Variance contributions, by year: (a) 2010, (b) 2011, (c) 2012, (d) 2013, (e) 2014, (f) 2015.]

[Figure 3.3: Factor loadings by contract, by year: (a) 2010, (b) 2011, (c) 2012, (d) 2013, (e) 2014, (f) 2015.]
The situation was reversed in the following year. A similar reversal may be
observed for the third component, which in 2010 had a greater impact on
medium-term bonds than on both long and short bonds.
This empirical analysis of the factors underlying the data forms the basis
of the strategy we discuss in the sequel.
3.3.4 Factor structure implies cointegration
In this subsection we shall study the link between a factor structure de-
scription of the yield curve and the existence of cointegrating relationships
between contracts of different tenors. An n × 1 vector time series y is coin-
tegrated if each component of y is integrated of order p > 0, but there are k,
strictly less than n, independent linear combinations of the components of y
that result in processes that are integrated of order q, where q is strictly less
than p. For our purposes, we shall assume that p is one and q is zero. Hence,
cointegration in our setting means that y is a unit root process whose compo-
nents can be combined linearly in k independent ways to produce stationary
processes. The cointegrating relationships are usually normalized
and grouped together as the columns of an n × k matrix denoted β. By
definition, β has linearly independent columns; therefore, it has rank k < n.
Consider the following model of the yield on n bonds:
y_t = A f_t + u_t,    (3.7)
where y is an n × 1 random vector of yields of varying maturities, A is an
n×k matrix of factor weights, f is a k×1 random vector of common factors,
and u is an n × 1 stationary random vector. Without loss of generality, we
may assume that each of the k components of f are unit root processes;
otherwise, if only r < k components of f are unit root processes and the
remaining k − r are stationary, then we may simply re-write (3.7) as

y_t = B h_t + v_t,

with v_t = C g_t + u_t, f_t' = [h_t', g_t'], and A' = [B', C'], where A' is the matrix
transpose of A, B and C are, respectively, the n × r and n × (k − r) submatrices
of A, and h and g are, respectively, the r × 1 and (k − r) × 1 subvectors of f.
Returning to equation (3.7), the vector of factors may be assumed to be
a multivariate random walk, i.e.,
f_t = f_{t−1} + φ(L)ε_t,
where φ(L) is a lag polynomial, ε is white noise, and φ(L)ε is stationary.
There is some empirical evidence in our data that this assumption is not un-
reasonable. Using a matrix analysis argument, details of which may be found
in Theorem 3.1 of Escribano & Pena (1993), it is easy to verify that y may
be written as a sum of a stationary process and a unit root process:
y_t = w_t + z_t,

where w ∼ I(1) and z ∼ I(0). Both w and z may be computed explic-
itly given the matrix of factor loadings as follows: w_t = AA'y_t and z_t =
(A⊥)'A⊥ y_t, where A⊥ is the orthogonal complement of A, i.e., (A⊥)'A = 0.
Now, setting β := (A⊥)', it is easily seen that βy_t = βz_t ∼ I(0), so that β is
a matrix of cointegrating vectors. Hence, cointegration of the vector of yields
is a consequence of the factor structure of the yield curve.
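This construction can be checked numerically on a simulated factor model. The loadings, dimensions, and data below are hypothetical, chosen only to illustrate that β := (A⊥)' annihilates the common random-walk trends:

```python
import numpy as np

rng = np.random.default_rng(1)
n_assets, k, T = 4, 2, 3000

A = rng.normal(size=(n_assets, k))      # hypothetical factor loadings
# Orthogonal complement of the column space of A via a full QR decomposition.
Q, _ = np.linalg.qr(A, mode="complete")
A_perp = Q[:, k:]                       # n x (n - k), with (A_perp)'A = 0
beta = A_perp.T                         # candidate cointegrating vectors

f = np.cumsum(rng.normal(size=(T, k)), axis=0)  # k random-walk common factors
u = rng.normal(size=(T, n_assets))              # stationary idiosyncratic part
y = f @ A.T + u                                 # y_t = A f_t + u_t, as in (3.7)

# beta kills the common trends: beta y_t = beta u_t, which is stationary.
spread = y @ beta.T
```

The spread series has bounded dispersion while the yields themselves wander, which is exactly the cointegration property the text derives.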
The analysis in the previous section provides some indirect empirical sup-
port for the existence of orthogonal risk factors underlying the dynamics of
the term structure. Recall that our analysis of the factors employed differ-
enced price data. So, direct measurement of the risk factors is not an option,
but we could at least extract the differenced factors and compute their cu-
mulative sum. While this approach may lack rigor, it nevertheless provides a
glimpse of what the original factors might look like. Using the reconstructed
risk factors, we test the hypothesis that the level, slope, and curvature factors
are unit root processes. The result of this analysis is recorded in Table 3.2.
The results show that the hypothesis of a unit root for the risk factors cannot
be rejected at any reasonable level for any of the six calendar years included in
our data set.
3.3.5 Cointegration implies a factor structure
In the previous section, we argued that cointegration is natural assuming the
underlying data admits a factor structure. In this section, we argue that the
converse is also true. Starting with the assumption that the components of
y are integrated of order one, y may be expressed, using lag polynomials, as:

(1 − L)y_t = Φ(L)ε_t,    (3.8)

where ε is n × 1 iid noise, L is the lag operator, Φ(L) = ∑_{j=0}^{∞} Φ_j L^j, Φ_j is an
n × n matrix, and Φ_0 is the n × n identity matrix. The last condition is an
accommodation for the presence of a deterministic linear trend.
Cointegration entails a restriction on the process Φ(L)ε, and on Φ(1) in
particular. Indeed, writing Φ(L) = Φ(1) + (1 − L)Φ*(L), where Φ*(L) :=
(1 − L)^{-1}(Φ(L) − Φ(1)), equation (3.8) may be expressed as:

(1 − L)y_t = Φ(1)ε_t + (1 − L)Φ*(L)ε_t.    (3.9)

Solving (3.9) by recursive substitution yields

y_t = Φ(1)z_t + Φ*(L)ε_t,    (3.10)
where z_t := ∑_{i=0}^{t−1} ε_{t−i}. Now, cointegration implies the existence of an n × k
matrix β, the matrix of cointegrating vectors, such that β'y_t is integrated of
order 0; but since z_t is a multivariate random walk, it must be the case that
β'Φ(1) = 0. Now, since β has rank k and the columns of Φ(1) lie in the subspace
orthogonal to the column space of β, it must be the case that Φ(1) has rank n − k.
Using the Jordan canonical form, we may write

Φ(1) = A J A^{-1},

where J is an (n − k) × (n − k) diagonal matrix containing the non-zero
eigenvalues of Φ(1), A is the corresponding n × (n − k) matrix of eigenvectors,
and A^{-1} is the right inverse of A. This decomposition is possible because Φ(1)
only has n − k non-zero eigenvalues. Now setting u_t := J A^{-1}ε_t and ν_t :=
Φ*(L)ε_t, and substituting into (3.10) yields

y_t = A f_t + ν_t,    (3.11)

where f_t = f_{t−1} + u_t. The interesting thing about (3.11) is that f is an
(n − k) × 1 unit root process driving y. That is, cointegration implies a
factor structure. This result appears at various levels of generality in Stock
& Watson (1988) and Escribano & Pena (1993).
3.4 Methodology
The basic trading mechanism consists of two main steps. The first step tests
for cointegration between the four futures prices and estimates the parameters
of a stationary portfolio of the four contracts under the hypothesis of cointe-
grated prices. The portfolio weights are the components of the cointegration
vector. We use a month’s worth of daytime (7:30 to 14:00 CT) trading data
sampled at one minute intervals for this step. This period is the so-called for-
mation period. Besides estimating the cointegration vector, we also estimate
the first two central moments of the stationary portfolio.
In the second step, we start monitoring prices immediately after the for-
mation period to identify price configurations that may be too rich or too
cheap according to our estimates of the first two moments from the forma-
tion period. This so-called trading period lasts for about three weeks (100
daytime trading hours) from the end of the formation period. Specifically,
we consider the price configuration to present a buy opportunity if the price
of the stationary portfolio falls below two standard deviations of the sam-
ple mean computed on the basis of the data generated during the formation
period. There is a sell opportunity if the price climbs beyond two standard
deviations of the mean price from the formation period. Hence, a position is
entered into whenever the price of the synthetic asset, constructed from the
cointegration vector, veers outside the two standard deviation band; the po-
sition is long or short according to whether the price configuration is deemed
cheap or rich. Short-sale constraints are almost non-existent in the futures
market, so they do not enter into our analysis.
Table 3.2: Augmented Dickey-Fuller Tests

(a) 2010
                     Level    Slope    Curvature
intercept            0.00     0.00     0.00
t-stat (intercept)   2.32     2.10     1.87
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -2.08    -1.97    -1.88

(b) 2011
                     Level    Slope    Curvature
intercept            0.00     -0.00    -0.00
t-stat (intercept)   1.78     -1.74    -1.52
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -0.46    -0.31    -0.24

(c) 2012
                     Level    Slope    Curvature
intercept            0.00     0.00     0.00
t-stat (intercept)   1.01     1.04     0.85
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -1.30    -1.30    -1.50

(d) 2013
                     Level    Slope    Curvature
intercept            -0.00    -0.00    -0.00
t-stat (intercept)   -1.26    -1.34    -1.33
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -1.51    -1.21    -0.79

(e) 2014
                     Level    Slope    Curvature
intercept            0.00     0.00     -0.00
t-stat (intercept)   2.34     2.62     -2.71
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -1.75    -2.03    -2.31

(f) 2015
                     Level    Slope    Curvature
intercept            0.00     0.00     0.00
t-stat (intercept)   1.89     0.34     1.37
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -2.27    -1.56    -1.42
Positions are opened at any time during the trading period; they are closed
as soon as the price of the synthetic asset experiences a large enough cor-
rection after its excursion away from the sample mean estimated from the
formation period. Specifically, a position is closed as soon as the price falls
within the one standard deviation band. Hence, after each correction, at
least one standard deviation is earned on the round-trip trade. This process
is continued until the end of the trading period at which time all open po-
sitions are liquidated at the quoted price. Generally, this is the only time a
loss can be registered, since a correction might not have taken place prior to
the end of the trading period.
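The entry and exit logic just described can be sketched as follows. The prices and formation-period moments below are hypothetical; the band widths mirror the two- and one-standard-deviation rules above:

```python
# Sketch of the trading rule: enter when the synthetic-asset price leaves the
# two-standard-deviation band around the formation-period mean, exit when it
# falls back within the one-standard-deviation band.

def generate_signals(prices, mean, std):
    position = 0  # +1 long, -1 short, 0 flat
    signals = []
    for p in prices:
        if position == 0:
            if p < mean - 2 * std:
                position = 1    # price configuration deemed cheap: buy
            elif p > mean + 2 * std:
                position = -1   # deemed rich: sell short
        elif abs(p - mean) < std:
            position = 0        # correction: close within the one-sigma band
        signals.append(position)
    return signals

# Hypothetical formation-period moments: mean 100, standard deviation 1.
sigs = generate_signals([100.0, 97.5, 98.5, 99.5, 102.5, 100.5], 100.0, 1.0)
```

Each round trip therefore earns at least one standard deviation, unless the position is force-closed at the end of the trading period.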
The entire process is repeated on a rolling window from the start of the
sample (1 April 2010) to the end of the sample (31 December 2015). In
both steps we use exclusively quote data as opposed to transaction data. An
advantage of using quote data is that it is simply more plentiful and
may more accurately represent the state of the market as perceived by an
agent at any given moment. During the synthetic portfolio formation stage,
the cointegration vectors and the first two moments are estimated using the
midpoint of the best bid and ask prices. During the trading stage, positions
are opened and closed using quoted bid and ask prices: a long position is
entered into at the ask and shorts executed at the bid.
The evaluation of the strategy using quoted prices is imperative given the
short-term nature of the strategy. All positions are opened for at most 100
daytime trading hours. Theoretically, a position could be entered into and
exited the very next minute. For such short investment horizons, the bid-ask
spread looms very large. By using quote data, execution costs arising from
the bid-ask spread are automatically taken into account. Of course, there are
other types of execution costs, but the bid-ask spread is usually the largest
source of execution costs, and the use of quoted prices takes care of it right
away.
3.5 Results
3.5.1 Return calculation
Evaluating the performance of a trading strategy that may involve long and
short positions is not altogether a straightforward matter. In fact, the
literature gives little guidance on how to define the one-period return of a
portfolio consisting of long and short positions. The issue is without com-
plications for a portfolio consisting entirely of long positions; the one-period
return is simply the difference between the starting and ending value of the
portfolio divided by its starting value. Unfortunately, this definition presents
difficulties as soon as portfolios with both long and short positions are con-
sidered. For such portfolios, the initial investment could be arbitrarily small,
zero, or even negative due to the offsetting effects of long and short posi-
tions. In the case of a zero-cost portfolio, the period return is either positive
infinity or negative infinity, regardless of the actual change in the value of
the portfolio.
It is easy to see that the standard definition is problematic for portfolios
with both long and short positions because the value of the portfolio at the
start of the period is always taken as the basis for measuring the performance
of the portfolio over the period. By reconsidering the investment simply in
terms of cash inflows and outflows much of the difficulties of the standard
approach may be overcome. The cash flow perspective assumes that the
entire portfolio is marked to market at the end of each investment period, so
that there is a cash flow at the start and end of each period. Cash inflows
and outflows are defined from the perspective of the investor. A long position
involves an initial cash outflow followed by a cash inflow at the end of the
period. The situation is reversed for short positions: an initial cash inflow
followed by a cash outflow at the end of the period. Given a portfolio of long
and short positions, the one-period return is simply the natural logarithm
of the ratio of the total cash inflows, from both types of positions, to the
total cash outflows, also from both long and short positions. This measure is
approximately equal to the ratio of the difference between cash inflows and
outflows to cash outflows for the period. That is
r_t = log(Inflows_t / Outflows_t) ≈ (Inflows_t − Outflows_t) / Outflows_t.    (3.12)
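The cash-flow return in (3.12) and its simple-return approximation can be checked in a few lines (the cash-flow figures below are made up for illustration):

```python
import math

def cashflow_return(inflows, outflows):
    """One-period return per (3.12): the log of total cash inflows
    over total cash outflows for the period."""
    return math.log(inflows / outflows)

# Hypothetical flows: 100 goes out at the start, 103 comes back in.
r_log = cashflow_return(103.0, 100.0)
r_simple = (103.0 - 100.0) / 100.0  # the approximation on the right of (3.12)
# log(1.03) differs from 0.03 by less than 0.0005
assert abs(r_log - r_simple) < 5e-4
```

The approximation is tight only for returns of moderate size; for large swings the two measures diverge.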
While this definition of period return may seem reasonable for perfor-
mance measurement in the majority of spot/cash markets, it is not without
controversy where futures markets are concerned. Black (1976) observed that
it is, in principle, impossible to define fractional or percentage returns for a
position in futures contracts. This is because, the time t quoted price of a
futures contract is merely the price at which the underlying instrument may
be exchanged at an agreed upon future date; no actual transactions occur
immediately, so that there are no cash outlays at time t. There is only one
transaction, and it occurs at the end of the contract in the form of an out-
flow or inflow but not both. In practice, both the long and the short sides
of a futures contract are required by the trading venue to post collateral to
offset the risk of default. Ordinarily, there is a mandated minimum collateral
required by the brokerage firm used by the investor. This minimal collateral
is otherwise known as the initial margin.
A position in futures contracts is marked to market daily, so that favorable
price moves result in credits and unfavorable price moves in debits to
the margin account. To prevent the margin account from being entirely de-
pleted in the event of a succession of unfavorable price moves, the exchange
may set a maintenance margin, which is a minimum balance that must be
maintained in the margin account at all times after the initial transaction.
Usually, the maintenance margin is the same amount as the initial margin,
but it may sometimes be lower. Margin requirements may differ according to
whether the investor is classified as a member of the exchange or a non-
member speculator. In 2016, the margin requirement for investors without
membership licenses to the Chicago Mercantile Exchange (CME) was 10% higher
than the margin requirement for members of the CME.
Technically, the margin is not to be taken as an initial investment, but it
may be argued that it is the amount of cash required to make the transaction
possible; without it, the position cannot be established. Arguing in this
manner, we may define the return of a long position in a futures contract
at time t to be the change in the price of the contract divided by the initial
margin. That is
r_t = (F_t − F_{t−1}) / M,    (3.13)
where M is the initial margin and Ft is the price at the end of time t of
the futures contract. For a short position, the numerator above is multiplied
by negative one. This basic definition is also plagued by the usual problems
encountered when computing the return generated by a portfolio of both
long and short positions. Reasoning as in (3.12), the return metric defined
in (3.13) based on the timing of cash flows may be modified to only take
into account the direction of cash flows. Hence, given n different futures
contracts, we may define the performance metric
r_t = Σ_{i=1}^n (Inflows_{i,t} − Outflows_{i,t}) / Σ_{i=1}^n (Leverage Ratio_{i,t} × Par Value_{i,t} × Q_{i,t}),    (3.14)
where the leverage ratio is simply the ratio of the initial margin of the i-th
contract to the par value of the underlying bond, and Qi,t is the exposure, in
terms of number of contracts, to the i-th contract at time t. In our setting n
is four and the contracts are distinguished by their tenors. Definition (3.12)
is a special case of the above; it holds when the position is fully funded,
that is, when the leverage ratio is one.
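A minimal sketch of (3.14); the per-contract cash flows below are illustrative, and the code makes explicit that each denominator term, leverage ratio × par value × quantity, reduces to margin × quantity:

```python
def margin_based_return(inflows, outflows, margins, par_values, quantities):
    """Portfolio return per (3.14): net cash flow over total collateral
    posted.  Since the leverage ratio is initial margin / par value, each
    term leverage_ratio * par_value * quantity is just margin * quantity."""
    net = sum(i - o for i, o in zip(inflows, outflows))
    posted = sum((m / p) * p * q
                 for m, p, q in zip(margins, par_values, quantities))
    return net / posted

# Four tenors (2, 5, 10, 30 years), one contract each; the cash flows are
# hypothetical, the margins and notionals follow Table 3.3 (speculators).
r = margin_based_return(
    inflows=[120.0, 260.0, 410.0, 800.0],
    outflows=[100.0, 250.0, 430.0, 750.0],
    margins=[493.0, 900.0, 1456.0, 2912.0],
    par_values=[200000.0, 100000.0, 100000.0, 100000.0],
    quantities=[1, 1, 1, 1],
)
# net cash flow of 60 over 5761 of posted margin
```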
Table 3.3: Time-averaged CME margin requirements between 1 April 2010 and 31 December 2015.

                                     Initial margin
Contract   Notional value   Members   Speculators
2 Yr       200000           448       493
5 Yr       100000           818       900
10 Yr      100000           1323      1456
30 Yr      100000           2647      2912
The initial margins are ordinarily not the same across contracts and,
therefore, must be handled carefully. For instance, in the last quarter of 2016,
the initial margin for the 30-Year Treasury Bond Futures contract was $4000,
whereas the initial margin of the 2-Year Treasury Note Futures contract was
only $550. Besides the differences in initial margins by contract types, there
are also variations over time. For most of 2010, the initial margin requirement
for the 5-Year Treasury Note Futures contract was $800 for investors with
membership licenses and $880 for non-members. Meanwhile, for all of 2014,
the initial margin for the same contract was $900. To simplify our analysis,
we compute a time-weighted average of the initial margin for the time period
between 1 April 2010 and 31 December 2015 for each contract type. The
time-weighted averages for members and non-members of the CME are recorded in
Table 3.3. As may be expected, margin requirements increase with the tenor
of the underlying, because the prices of contracts with longer maturities are
more likely to experience large price swings.
As previously stated, the initial margin is merely the minimum collateral
required to initiate a transaction in one futures contract. An investor may
choose to apply however much collateral he or she desires. If each transaction
is fully funded, i.e., if the exact amount of the exposure to each contract
is always set aside for each transaction, then the appropriate performance
measure would be a slight modification of the formula given in (3.12). That
is
r_t = Σ_{i=1}^n (Inflows_{i,t} − Outflows_{i,t}) / Σ_{i=1}^n Outflows_{i,t}.    (3.15)
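The fully funded metric (3.15) is then simply net cash flow over total outflows; a sketch with hypothetical flows:

```python
def fully_funded_return(inflows, outflows):
    """Return per (3.15): net cash flow over total outflows, the fully
    funded special case of (3.14) where the leverage ratio is one."""
    net = sum(i - o for i, o in zip(inflows, outflows))
    return net / sum(outflows)

# Two positions, hypothetical flows: 4 of net inflow against 150 paid out.
r = fully_funded_return(inflows=[103.0, 51.0], outflows=[100.0, 50.0])
```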
We remark that the use of fully funded accounts in the treasury futures
market is not very common. Consider that the notional value of the 2-Year
treasury futures contract is $200,000 and $100,000 for the others. Hence,
putting together a portfolio consisting of even a small number of contracts
quickly becomes prohibitively capital intensive. Meanwhile, treasury futures,
even those written on long bonds, have relatively stable long-term prices. As
a result, investments in the treasury futures market are most often under-
taken using leverage or a margin account.
We conclude this section with a remark on the distinction between an
investment period and a trading period. Trading periods are fixed: they are
exactly 6000 daytime trading minutes, approximately 14 trading days. An
investment period is simply the time between when a position is opened and
the time when it is closed. Positions are opened when the price configurations
of the four securities indicate a departure from the stable relationship estab-
lished during the preceding formation period. The positions are closed when
the stable relationship is restored. This deviation and restoration towards a
stable relationship may occur several times during a single trading period,
thereby creating multiple opportunities and, hence, investment periods.
The return formulas in (3.14) and (3.15) relate to the return over a single
investment period. For trading periods containing multiple investment periods, we compute
the return over the trading period as the sum of the individual returns gener-
ated from each investment period contained within the trading period. That
is
r_t = Σ_{i=1}^q r_{i,t},

where r_{i,t} is the return, computed via formula (3.14) or (3.15), of the i-th
investment period of the t-th trading period, and q is the total number of
investment periods occurring in the t-th trading period.
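The aggregation of investment-period returns into trading-period returns can be sketched as follows (the pairing of returns with trading-period indices is hypothetical):

```python
from collections import defaultdict

def trading_period_returns(investment_returns):
    """investment_returns: iterable of (trading_period, return) pairs, one
    per investment period.  Returns the sum of the investment-period
    returns within each trading period."""
    totals = defaultdict(float)
    for t, r in investment_returns:
        totals[t] += r
    return dict(totals)

# Trading period 1 contains three investment periods, period 2 just one.
r = trading_period_returns([(1, 0.02), (1, -0.01), (1, 0.015), (2, 0.03)])
# r[1] ≈ 0.025 and r[2] == 0.03
```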
3.5.2 Excess returns
We summarize the distribution of returns generated by backtesting the coin-
tegration strategy described in the previous section in Table 3.4. The back-
test is run over daytime trading hours between April 1, 2010 and December
31, 2015. The entire period is divided into 96 hundred-hour trading periods
lasting approximately three business weeks. The figures shown in the table
are the annualized hundred-hour returns. We show cash flow-based returns
computed on a fully funded account in the first panel of the table; the second
panel displays the distribution of cash flows-based returns computed using
the initial margin as the cost basis. Each panel reports two sets of backtest
results: one for prices adjusted backwards by the application of a propor-
tional factor (Ratio) and the other for prices shifted in levels backward by
the amount of the roll-induced price gap (Difference).
Now, the annualized excess returns over the equally weighted portfolio
of all four contracts, assuming a fully-funded account, are 6.01% and 6.00%
Table 3.4: Annualized (100 trading hours) returns on initial margin and fully funded account.

Panel A: Fully-funded excess return over the equal-weighted portfolio

                                  Ratio       Difference
Average return                    0.0604      0.05995
Standard error (Newey-West)       0.02625     0.02737
t-Statistic                       2.3012      2.18984
Excess return distribution
Median                            0.04749     0.03655
Standard deviation                0.30453     0.35263
Skewness                          2.12072     1.66976
Kurtosis                          14.58188    9.49711
Minimum                           -0.68255    -0.75226
5% Quantile                       -0.29139    -0.35234
95% Quantile                      0.51928     0.61858
Maximum                           1.85812     1.79468
% of negative excess returns      40.625      44.79167

Panel B: Return on margin account

                                  Ratio       Difference
Average return                    15.12327    14.95951
Standard error (Newey-West)       3.87113     4.13448
t-Statistic                       3.90668     3.61823
Excess return distribution
Median                            0.6238      0
Standard deviation                46.01987    51.17213
Skewness                          1.63053     1.29959
Kurtosis                          10.99074    7.87337
Minimum                           -105.68033  -110.64147
5% Quantile                       -51.1802    -70.32086
95% Quantile                      83.25456    96.94193
Maximum                           259.66658   250.56902
% of negative excess returns      14.58333    16.66667
respectively for the ratio and the difference price adjustment procedures. The
Newey-West adjusted t statistics are 2.3 and 2.2, respectively. Given this
result, the hypothesis that the cointegration strategy dominates the equal-weighted
portfolio cannot be rejected. The idea is that one may short as
many units of the equal-weighted portfolio as necessary and use the proceeds to
set up the cointegration strategy without incurring a loss.
Meanwhile, the annualized returns using the initial margin as the cost basis
are, respectively, roughly 1500% and 1490%. These returns are not as preposterous as
they first seem. Consider that the leverage factor implicit in the initial margin
for the 2-Year contract is 446 and that of the 5-Year contract is 122. The
inflated returns are, therefore, merely a consequence of the inflated leverage
factors. The t statistics in both cases are in excess of 2. Also, note that
the out-sized returns that may be achieved by trading on the initial margin
come at the expense of taking significant risks: consider that the standard
deviation of the returns on initial margin is 159.17 times the volatility of the
return on the fully funded account. Clearly, in practice, what an investor ends
up doing would be somewhere between trading a fully-funded account and
posting the minimum required collateral. At any rate, our analysis provides
a starting point for reasoning about how to incorporate leverage in a more
realistic, real-world strategy.
3.5.3 Statistical Arbitrage
Stephen Ross (1976a) gave the first serious treatment of the concept of arbi-
trage. While his treatment might have been of a heuristic nature, it never-
theless conveyed the essence of an arbitrage, which is a trading strategy that
yields a positive payoff with little to no downside risk. The first rigorous
definition appeared in Huberman (1982), where it was defined in the context
of an economy with assets generated by a set of risk factors as a sequence of
portfolios with payoffs φ_n such that E(φ_n) tends to +∞ while Var(φ_n) tends
to 0. The concept has undergone numerous changes in the literature, see for
example Kabanov (1996), but the essential meaning of the term as a low-risk
investment still remains. We focus on a special type of arbitrage known
in the empirical asset pricing literature as a statistical arbitrage, which Hogan
et al. (2004) define as a zero-cost, self-financing strategy whose cumulative
discounted value v(t) satisfies:
1. v_0 = 0,

2. lim_{t→∞} E(v_t) > 0,

3. lim_{t→∞} P(v_t < 0) = 0, and

4. if v_t can become negative with positive probability, then Var(v_t)/t → 0 as t → ∞.
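A small Monte Carlo illustration of conditions 2 and 3, assuming Gaussian increments of the form µ + σ t^λ z_t with µ > 0 and λ < 0 (the parametric form adopted in (3.16) below; the parameter values here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, lam = 0.3, 1.0, -0.6
T, n_paths = 1000, 2000

# Cumulative gains v_t with increments mu + sigma * t**lam * z_t.
t = np.arange(1, T + 1)
z = rng.standard_normal((n_paths, T))
v = (mu + sigma * t ** lam * z).cumsum(axis=1)

# Condition 2: the expected cumulative gain grows without bound.
early_mean, late_mean = v[:, 1].mean(), v[:, -1].mean()
# Condition 3: the probability of a negative cumulative gain vanishes.
early_neg, late_neg = (v[:, 1] < 0).mean(), (v[:, -1] < 0).mean()
assert late_mean > early_mean
assert late_neg < early_neg
```

Early in the path a sizable fraction of the simulated gains are negative, but the decaying volatility drives that probability toward zero while the mean gain accumulates.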
Even though the above notion of arbitrage bears resemblance to the original
definition given by Huberman (1982), it is worth noting that there is a cru-
cial difference between the two concepts. In the first instance, the limit is
taken with respect to the cross-section of the economy, i.e. the sequence of
small economies is assumed to expand without bound, whereas the definition
given above requires the investment horizon to tend to infinity. It is easily
verified that the first three conditions, assuming the existence of the first
moment, correspond to the definition of an arbitrage in the classical sense
of (Delbaen & Schachermayer, 1994). Chapter 2 of Dare (2017) obtains a
series of equivalent characterizations of market efficiency. In particular, Dare
(2017, Proposition 2.1) shows that a necessary condition for market efficiency
is the existence of a local martingale measure after expressing asset prices
in units of any strictly positive convex portfolio of the zeroth asset and any
other asset (possibly itself). To apply this result, we express prices in units
of the 2-Year treasury futures contract and attempt to exhibit a violation of
the no-arbitrage condition.
Following Hogan et al. (2004) we propose to test market efficiency under
the assumption that the change in the discounted cumulative gains of the
strategy satisfies:
v_t − v_{t−1} = µ + σ t^λ z_t,    (3.16)
where t is an integer; σ, λ, and µ are real numbers; and zt is an i.i.d. se-
quence of standard normal random variables. The model allows for determin-
istic time variation in the second moment, but makes the seemingly strong
assumption that there is no serial correlation between returns. Our own
simulations reveal that the effects of serial correlation are slight. In fact,
assuming z_t were generated by an AR(1) process, differences of more than 10%
in average standard errors only start to occur for values of the autoregressive
parameter in excess of 0.9 in absolute value. Since the sample autocorrelation
of the returns of the strategy is only -0.158, serial correlation is likely
a minor issue.
The inference strategy we have adopted is not without weaknesses. A
stylized empirical fact of financial markets is that asset returns generally have
fat-tail distributions. The normality assumption may therefore seem overly
restrictive. Moreover, the adopted parametric model may itself be a source of
misspecification errors. A more sophisticated analysis would perhaps employ
robust tools such as the bootstrap.
While the above criticisms may be valid, note that the test statistics
discussed in Hogan et al. (2004) and Dare (2017, Chapter 2) are very con-
servative because they rely on the Bonferroni criterion, which stipulates that
in compound tests involving a joint hypothesis, the sum of the p-values of
the individual tests is an upper limit for the Type I error of the compound
test (Casella & Berger, 1990, p.11).
The test of efficiency then consists of estimating the parameters of (3.16)
and then testing for
1. µ > 0, and
2. λ < 0.
For, when µ > 0 and λ < 0, the expected cumulative gain increases without
bound while the variance of the incremental gains tends to zero. Under the
assumption of normally distributed z_t, the test can be carried out by maximum
likelihood estimation. Table 3.5 summarizes the results of estimating
model (3.16). The p-value of the joint test using the Bonferroni correction,
which consists of adding up the p-values from each sub-hypothesis, of the
presence of a statistical arbitrage is approximately 0.02 in both the ratio and
difference-adjusted price series.
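The maximum likelihood fit of (3.16) can be sketched on synthetic increments; the optimizer choice (scipy's Nelder-Mead) and the parameter values are our own assumptions, not the dissertation's implementation:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, x):
    """Negative Gaussian log-likelihood of (3.16): the increment at
    time t is distributed N(mu, (sigma * t**lam)**2)."""
    mu, log_sigma, lam = params
    t = np.arange(1, len(x) + 1)
    sd = np.exp(log_sigma) * t ** lam  # log-parametrize sigma to keep it positive
    return 0.5 * np.sum(np.log(2 * np.pi * sd ** 2) + ((x - mu) / sd) ** 2)

# Synthetic increments with a positive drift and decaying volatility.
rng = np.random.default_rng(1)
true_mu, true_sigma, true_lam = 0.33, 1.0, -0.65
t = np.arange(1, 97)  # 96 trading periods, as in the backtest
x = true_mu + true_sigma * t ** true_lam * rng.standard_normal(t.size)

fit = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], args=(x,), method="Nelder-Mead")
mu_hat, lam_hat = fit.x[0], fit.x[2]
# A statistical arbitrage requires mu_hat > 0 and lam_hat < 0.
```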
The p-values for the estimates of σ are relatively large, especially for the
ratio-adjusted price series. This is to be expected since, for large t, the
volatility of incremental payoffs is mostly determined by the term t^λ and
not by σ.

Table 3.5: Test for statistical arbitrage

            Difference                       Ratio
Par.    Estimate   Std Error   p-value   Estimate   Std Error   p-value
µ       0.3351     0.0108      <0.01     0.3735     0.0106      <0.01
σ       1.151      0.5397      0.041     4.98       3.2077      0.1195
λ       -0.6808    0.1288      <0.01     -1.0353    0.1779      <0.01
We conduct a separate test with σ restricted to 1. The output of that test is
recorded in Table 3.6. The estimates and p-values obtained from the restricted
model match the estimates from the unrestricted model. These results provide
a strong indication that the strategy not only earns a positive excess return
over the equal-weighted portfolio but also represents a free lunch.
Table 3.6: Test for statistical arbitrage (σ = 1)

            Difference                       Ratio
Par.    Estimate   Std Error   p-value   Estimate   Std Error   p-value
µ       0.333      0.0082      <0.01     0.3474     0.0104      <0.01
λ       -0.6452    0.0199      <0.01     -0.5885    0.0204      <0.01
3.6 Conclusion
Starting with the assumption that interest rates and, therefore, bond futures
prices admit a factor structure, we evaluate a trading strategy based on the
assumption of cointegrated bond futures prices. We argue that cointegration is
natural if in fact the dynamics of the yield curve are driven by orthogonal risk
factors which together form a jointly unit root process. Direct verification
of this hypothesis is difficult because the price series and its log-transformed
counterpart are likely not stationary. On the other hand, the stationarity
assumption can be made and tested using the change in log prices. Using differenced
price data, we argued empirically that the vast majority of the volatility
experienced by the changes in log prices may arise from three dominant risk
factors. Since the factors are orthogonal and the differenced log prices may
be assumed to be stationary, there is very little doubt that the factors contained
in the differenced log prices are stationary. To test the claim for the
log prices in levels, we estimated the factors in levels by taking cumulative
sums of the factors estimated using differenced log prices. For these proxies
of the factors in levels, the assumption of unit roots could not be rejected at
reasonable significance levels.
With the choice of cointegrating strategy properly motivated, we pro-
ceeded to evaluate a simple trading strategy based on the cointegration hy-
pothesis. The crux of the strategy consists in opening a position as soon as
the price configuration appears to deviate from an estimated stable cointegration
relationship. The strategy is evaluated by computing the ratio of
cash-equivalent inflows to outflows. We also consider a return metric based
on the initial margin required to take either a short or a long position in
one futures contract. Our results reveal that the gains from this strategy
are both economically and statistically significant. This exercise allows us to
argue along the lines initiated by Jarrow & Larsson (2012) that the U.S.
treasury bond futures market, for the period for which we have data, was
not informationally efficient.
Bibliography
Ansel, Jean-Pascal and Stricker, Christophe (1994) “Couverture des actifs
contingents et prix maximum”, Annales de l’Institut Henri Poincaré, Section B,
Vol. 30, No. 2, pp. 303–315.
Banz, Rolf W. (1981) “The relationship between return and the market value
of common stocks”, Journal of Financial Economics, Vol. 9, pp. 3–18.
Barndorff-Nielsen, Ole E. and Shephard, Neil (2004) “Econometric analysis
of realized covariation: High frequency based covariance, regression, and
correlation in financial economics”, Econometrica, Vol. 72, No. 3, pp. 885–
925.
Black, Fischer (1976) “The pricing of commodity contracts”, Journal of Fi-
nancial Economics, Vol. 3, pp. 167–179.
Bouchaud, Jean-Philippe, Sagna, Nicolas, Cont, Rama, El-Karoui, Nicole,
and Potters, Marc (1999) “Phenomenology of the interest rate curve”, Ap-
plied Mathematical Finance, Vol. 6, pp. 209–232.
Bradley, Michael G. and Lumpkin, Stephen A. (1992) “The treasury yield
curve as a cointegrated system”, The Journal of Financial and Quantitative
Analysis, Vol. 27, No. 3, pp. 449–463.
Casella, George and Berger, Roger L. (1990) Statistical inference: Thomson
Learning, 2nd edition.
Christensen, J.H.E., Diebold, F.X., and Rudebusch, G.D. (2011) “The affine
arbitrage-free class of Nelson-Siegel term structure models”, Journal of
Econometrics, Vol. 164, pp. 4–20.
Christensen, Ole (2006) “Pairs of dual Gabor frame generators with com-
pact support and desired frequency localization”, Applied Computational
Harmonic Analysis, Vol. 20, pp. 403–410.
(2008) Frames and bases: An introductory course, Boston:
Birkhauser.
Cochrane, John Howland (2001) Asset pricing: Princeton Univ. Press.
Cox, John C., Jonathan E. Ingersoll, Jr., and Ross, Stephen A. (1981) “The
relation between forward prices and futures prices”, Journal of Financial
Economics, Vol. 9, pp. 321–346.
Cuchiero, Christa, Klein, Irene, and Teichmann, Josef (2015) “A new per-
spective on the fundamental theorem of asset pricing for large financial
markets”, arXiv.
Dare, Wale (2017) “On market efficiency and volatility estimation”, Ph.D.
dissertation, University of St.Gallen, Switzerland.
Daubechies, Ingrid (1992) Ten lectures on wavelets: CBMS-NSF Series in
Applied Mathematics, SIAM.
Delbaen, Freddy and Schachermayer, Walter (1994) “A general version of the
fundamental theorem of asset pricing”, Mathematische Annalen, No. 300,
pp. 463 – 520.
(1997) “The Banach space of workable contingent claims in arbitrage
theory”, Annales de l’institut Henri Poincare (B) Probability and Statistics,
Vol. 33, pp. 113–144.
(1999) “The fundamental theorem of asset pricing for unbounded
stochastic processes”, Report Series SFB, “Adaptive Information Systems
and Modelling in Economics and Management Science” 24, WU Vienna
University of Economics and Business, Vienna.
Delbaen, Freddy Y. and Schachermayer, Walter (1995) “The no-arbitrage
property under the change of numeraire”, Stochastics and Stochastic Re-
ports, Vol. 53, pp. 213–226.
Diebold, Francis X. and Li, Canlin (2006) “Forecasting the term structure of
government bond yield”, Journal of Econometrics, Vol. 130, pp. 337–364.
Diebold, Francis X. and Rudebusch, Glenn D. (2013) Yield Curve Modelling
and Forecasting: Princeton University Press.
Duffie, Darrell and Kan, Rui (1996) “A yield-factor model of interest rates”,
Mathematical Finance, Vol. 6, No. 4, pp. 379–406.
Dybvig, Philip H., Jonathan E. Ingersoll, Jr., and Ross, Stephen A. (1996)
“Long forward and zero-coupon rates can never fall”, The Journal of Busi-
ness, Vol. 69, No. 1, pp. 1–25.
Escribano, Alvaro and Pena, Daniel (1993) “Cointegration and common fac-
tors”, Working paper 93-11, Universidad Carlos III de Madrid.
Fama, Eugene (1969) “Efficient capital markets: A review of theory and
empirical work”, The Journal of Finance, Vol. 25, No. 2, pp. 387–417.
Fama, Eugene F. and French, Kenneth R. (1993) “Common risk factors in
the return on stocks and bonds”, Journal of financial economics, Vol. 33,
pp. 3–56.
Fan, Jianqing and Wang, Yazhen (2008) “Spot volatility estimation for high-
frequency data”, Statistics and its Interface, Vol. 1, pp. 279–288.
Fernholz, E. Robert (2002) Stochastic Portfolio Theory, New York: Springer-
Verlag New York.
Filipović, Damir (1999) “A note on the Nelson-Siegel family”, Mathematical
Finance, Vol. 9, pp. 349–359.
Florens-Zmirou, Danielle (1993) “On estimating the diffusion coefficient
from discrete observations”, Journal of Applied Probability, Vol. 30, No. 4,
pp. 790–804.
Foster, Dean P. and Nelson, Dan B. (1996) “Continuous record asymptotics
for rolling sample variance estimators”, Econometrica, Vol. 64, No. 1, pp.
139–174.
Genon-Catalot, V., Laredo, C., and Picard, D. (1992) “Non-parametric es-
timation of the diffusion coefficient by wavelets methods”, Scandinavian
Journal of Statistics, Vol. 19, No. 4, pp. 317–335.
Hall, Anthony D., Anderson, Heather M., and Granger, Clive W. (1992) “A
cointegration analysis of treasury bill yields”, The Review of Economics
and Statistics, Vol. 74, No. 1, pp. 116–126.
Halmos, Paul R. (1950) Measure Theory: Springer-Verlag New York.
Harrison, J. Michael and Pliska, Stanley R. (1981) “Martingales and stochas-
tic integrals in the theory of continuous trading”, Stochastic Processes and
their Applications, Vol. 11, pp. 215–260.
He, Sheng-wu, Wang, Jia-gang, and Yan, Jia-an (1992) Semimartingale the-
ory and stochastic calculus: CRC Press Inc.
Hoffmann, M., Munk, A., and Schmidt-Hieber (2012) “Adaptive wavelet es-
timation of the diffusion coefficient under additive error measurements”,
Annales de l’institut Henri Poincaré, Probabilités et Statistiques, Vol. 48,
No. 4, pp. 1186–1216.
Hogan, Steve, Jarrow, Robert, Teo, Melvyn, and Warachka, Mitch (2004)
“Testing market efficiency using statistical arbitrage with applications to
momentum and value strategies”, Journal of Financial Economics, Vol. 73,
pp. 525 – 565.
Hubalek, Friedrich, Klein, Irene, and Teichmann, Josef (2002) “A general
proof of the Dybvig-Ingersoll-Ross theorem: Long forward rates can never
fall”, Mathematical Finance, Vol. 12, No. 4, pp. 447–451.
Huberman, Gur (1982) “A simple approach to Arbitrage Pricing Theory”,
Journal of Economic Theory, Vol. 28, No. 1, pp. 183–191.
Jacod, Jean and Shiryaev, Albert N. (2003) Limit theorems for stochastic
processes: Springer-Verlag, 2nd edition.
Jarrow, Robert A. and Larsson, Martin (2012) “The meaning of market effi-
ciency”, Mathematical Finance, Vol. 22, No. 1, pp. 1–30.
Jarrow, Robert, Teo, Melvyn, Tse, Yiu Kuen, and Warachka, Mitch (2012)
“An improved test for statistical arbitrage”, Journal of Financial Markets,
Vol. 15, pp. 47–80.
Kabanov, Y.M. (1996) “On the FTAP of Kreps-Delbaen-Schachermayer”,
Statistics and Control of Stochastic Processes, pp. 191–203.
Kabanov, Y.M. and Kramkov, D.O. (1998) “Asymptotic arbitrage in large
financial markets”, Finance and Stochastics, Vol. 2, pp. 143–172.
Kabanov, Yu. M. and Kramkov, D. O. (1994) “Large financial markets:
asymptotic arbitrage and contiguity”, Probability Theory Application, Vol.
39, No. 1, pp. 222–229.
Kardaras, Constantinos (2013) “On the closure in the Émery topology of
semimartingale wealth-process sets”, Annals of Applied Probability, Vol.
23, No. 4, pp. 1355–1376.
Labuszewski, John W., Kamradt, Michael, and Gibbs, David (2014) “Under-
standing treasury futures”, report, CME Group.
Lintner, John (1965) “The valuation of risk assets and the selection of risky
investments in stock portfolios and capital budgets”, The Review of Eco-
nomics and Statistics, Vol. 47, No. 1, pp. 13–37.
Litterman, Robert B. and Scheinkman, José (1991) “Common factors affect-
ing bond market returns”, The Journal of Fixed Income, Vol. 1, No. 1, pp.
54–61.
Malkiel, Burton G. (1991) The World of Economics, Chap. Efficient market
hypothesis, pp. 211–218, London: Palgrave Mcmillan UK.
Malliavin, Paul and Mancino, Maria Elvira (2002) “Fourier series methods
for measurement of multivariate volatilities”, Finance and Stochastics, Vol.
6, No. 1, pp. 49–61.
Mancini, Cecilia (2009) “Non-parametric threshold estimation for models
with stochastic diffusion coefficient and jumps”, Scandinavian Journal of
Statistics, Vol. 36, pp. 270 – 296.
Merton, Robert C. (1973) “Theory of rational option pricing”, The Bell Jour-
nal of Economics and Management Science, Vol. 4, No. 1, pp. 141–183.
Nelson, Charles R. and Siegel, Andrew F. (1987) “Parsimonious modeling of
yield curves”, The Journal of Business, Vol. 60, No. 4, pp. 473–489.
Protter, Philip E. (2004) Stochastic Integration and Differential Equations:
Springer, 2nd edition.
Ross, Stephen A. (1976a) “The arbitrage theory of capital asset pricing”,
Journal of Economic Theory, Vol. 13, pp. 341–360.
(1976b) Return, risk, and arbitrage, Chap. Risk and return in fi-
nance, Cambridge: Ballinger.
Sharpe, William F. (1964) “Capital asset prices: a theory of market equilib-
rium under conditions of risk”, The Journal of Finance, Vol. 19, No. 3, pp.
425–442.
Singleton, Kenneth J. (2006) Empirical Dynamic Asset Pricing: Model Spec-
ification and Econometric Assessment: Princeton University Press.
Stock, James H. and Watson, Mark W. (1988) “Testing for common trends”,
Journal of the American Statistical Association, Vol. 83, No. 404, pp. 1097–
1107.
Takaoka, Koichiro and Schweizer, Martin (2014) “A note on the condition of
no unbounded profit with bounded risk”, Finance Stochastic, Vol. 18, pp.
393–405.
Zhang, Hua (1993) “Treasury yield curves and cointegration”, Applied Eco-
nomics, Vol. 25, No. 3, pp. 361–367.
Zhang, Zhihua (2008) “Convergence of Weyl-Heisenberg frame series”, Indian
Journal of Pure and Applied Mathematics, Vol. 39, No. 2, pp. 167–175.
Wale Dare
22 Hagedornstrasse
20149 Hamburg
+49 (0) 162 8295
[email protected]

EDUCATION

University of St. Gallen, St. Gallen, CH
Ph.D. in Finance and Economics, 02/2014 – 02/2018
• Research: High-frequency statistical arbitrage in fixed-income futures, arbitrage pricing in large financial markets, and nonparametric realized volatility estimation in the presence of price jumps.
M.A. in Quantitative Economics and Finance, 09/2011 – 10/2013
• Thesis: Ambiguity, long-run risk, and asset prices.

Knox College, Galesburg, IL, US
B.A. in Mathematics, 02/2002 – 04/2006
• Overall GPA: 3.61/4.00, Math GPA: 3.62/4.00, Econ GPA: 3.88/4.00.
• Thesis: Poisson random measures and application

WORK EXPERIENCE

Berenberg, Hamburg, Germany
Portfolio Manager, 10/2017 – Present
• Managed and developed strategies to harvest the volatility risk premium embedded in options markets
• Researched and developed quantitative investment strategies across multiple liquid asset classes
• Implemented derivatives-based investment strategies
• Developed and maintained advanced risk management processes

University of St. Gallen, St. Gallen, CH
Doctoral Assistant, 04/2014 – 09/2017
• Collaborated in research activities in the areas of Mathematics, Financial Economics and Quantitative Finance
• Collaborated in the teaching of mathematics and quantitative methods in Finance and Economics at different teaching levels

Liberty Mutual Group, Boston, MA, US
Actuarial/Pricing Analyst, 07/2008 – 07/2010
• Used actuarial methodology to forecast losses and price insurance products in the commercial automobile and real-estate property lines of business.
• Trained underwriting and support staff in the use of actuarial tools and software

SKILLS
• Awards include the Howard Hughes Institute Summer Research Fellowship
• Proficient in R, SQL, Python, C++, Haskell, and VBA
• Fluent in French