On market efficiency and volatility estimation
DISSERTATION
of the University of St. Gallen,
School of Management,
Economics, Law, Social Sciences
and International Affairs
to obtain the title of
Doctor of Philosophy in Economics and Finance
submitted by
Wale Dare
Approved on the application of
Prof. Dr. Matthias Fengler
and
Prof. Dr. Josef Teichmann
Dissertation no. 4748
Gutenberg AG, Schaan, 2018
The University of St. Gallen, School of Management, Economics, Law, Social Sciences and
International Affairs hereby consents to the printing of the present dissertation, without hereby
expressing any opinion on the views herein expressed.
St. Gallen, December 5, 2017
The President:
Prof. Dr. Thomas Bieger
Contents

1 Global estimation of realized spot volatility in the presence of price jumps
  1.1 Introduction
  1.2 Prices
  1.3 Frames
    1.3.1 Gabor frames
  1.4 Volatility estimation: continuous prices
  1.5 Volatility estimation: discontinuous prices
  1.6 Simulation
    1.6.1 Continuous prices
    1.6.2 Prices with jumps
  1.7 Empirical illustration - Flash Crash of 2010
  1.8 Conclusion
  1.9 Appendix

2 Testing efficiency in small and large financial markets
  2.1 Introduction
  2.2 Efficiency in standard markets
    2.2.1 Statistical inference for market efficiency
  2.3 Market efficiency in large financial markets
    2.3.1 Large financial market payoff space
    2.3.2 Arbitrage pricing in large financial markets
  2.4 Asymptotic market efficiency
    2.4.1 Statistical inference for asymptotic market efficiency
  2.5 Conclusion

3 Statistical arbitrage in the U.S. treasury futures market
  3.1 Introduction
  3.2 Data
    3.2.1 Treasury futures
    3.2.2 Continuous prices
  3.3 Economic framework
    3.3.1 The price of a futures contract
    3.3.2 Factor model of the yield curve
    3.3.3 Factor extraction
    3.3.4 Factor structure implies cointegration
    3.3.5 Cointegration implies a factor structure
  3.4 Methodology
  3.5 Results
    3.5.1 Return calculation
    3.5.2 Excess returns
    3.5.3 Statistical arbitrage
  3.6 Conclusion
Abstract
In Chapter 1, we propose a non-parametric procedure for estimating the
realized spot volatility of a price process described by an Itô semimartingale
with Lévy jumps. The procedure integrates the threshold jump elimination
technique of Mancini (2009) with a frame (Gabor) expansion of the realized
trajectory of spot volatility. We show that the procedure converges in probability in $L^2([0,T])$ for a wide class of spot volatility processes, including those with discontinuous paths. Our analysis assumes the time interval between price observations tends to zero; as a result, the intended application is the analysis of high-frequency financial data.
In Chapter 2, we investigate practical tests of market efficiency that are
not subject to the joint-hypothesis problem inherent in tests that require
the specification of an equilibrium model of asset prices. The methodology
we propose simplifies the testing procedure considerably by reframing the
market efficiency question into one about the existence of a local martingale
measure. As a consequence, the need to directly verify the no dominance
condition is completely avoided. We also investigate market efficiency in the
large financial market setting with the introduction of notions of asymptotic
no dominance and market efficiency that remain consistent with the small
market theory. We obtain a change of numeraire characterization of asymp-
totic market efficiency and suggest empirical tests of inefficiency in large
financial markets.
In Chapter 3, we argue empirically that the U.S. treasury futures market is informationally inefficient. We show that an intraday strategy based on the assumption of cointegrated treasury futures prices earns statistically significant excess returns over the equally weighted portfolio of treasury futures. We also provide empirical backing for the claim that the same strategy, financed by taking a short position in the 2-Year treasury futures contract, gives rise to a statistical arbitrage.
Abstract (German)

In Chapter 1, we propose a non-parametric procedure for estimating the realized spot volatility of a price process described by an Itô semimartingale with Lévy jumps. The procedure integrates the threshold jump-elimination technique of Mancini (2009) with a frame (Gabor) expansion of the realized trajectory of spot volatility. We show that the procedure converges in probability in $L^2([0,T])$ for a wide class of spot volatility processes, including those with discontinuous paths. Our analysis assumes that the time interval between price observations tends to zero, so that the intended application is the analysis of high-frequency financial data.

In Chapter 2, we investigate practical tests of market efficiency that are not subject to the joint-hypothesis problem inherent in tests requiring the specification of an equilibrium model of asset prices. The methodology we propose simplifies the testing procedure considerably by reframing the question of market efficiency into one about the existence of a local martingale measure. As a consequence, the need to directly verify the no-dominance condition is avoided entirely. We also investigate market efficiency in the setting of large financial markets by introducing notions of asymptotic no dominance and market efficiency that remain consistent with the small-market theory. We obtain a change-of-numeraire characterization of asymptotic market efficiency and suggest empirical tests of inefficiency in large financial markets.

In Chapter 3, we argue empirically that the U.S. treasury futures market is informationally inefficient. We show that an intraday strategy based on the assumption of cointegrated treasury futures prices earns statistically significant excess returns over the equally weighted portfolio of treasury futures. We also provide empirical support for the claim that the same strategy, financed by taking a short position in the 2-Year treasury futures contract, gives rise to a statistical arbitrage.
Chapter 1
Global estimation of realized spot
volatility in the presence of price
jumps
1.1 Introduction
Volatility estimation using discretely observed asset prices has received a great deal of attention recently; however, much of that effort has been focused on estimating the integrated volatility and, to a lesser extent, the spot volatility at a given point in time. Notable contributions to the literature
volatility at a given point in time. Notable contributions to the literature
on volatility estimation include the papers by Foster & Nelson (1996), Fan
& Wang (2008), Florens-Zmirou (1993), and Barndorff-Nielsen & Shephard
(2004). In these studies, the object of interest is local in nature: spot volatil-
ity at a given point in time or integrated volatility up to a terminal point in
time. In contrast, estimators which aim to obtain spot volatility estimates
for entire time windows have received much less coverage. These are the
so-called “global” spot volatility estimators. These estimators derive their
name from the fact that the objects of interest are not localized. Typically,
a global estimator would be a random element whose realizations would be
elements of some function space.
There are potential benefits to adopting global estimators of spot volatility. Given a consistent global estimate of the spot volatility $\sigma^2$ over an interval $[0, T]$, the integrated volatility at any point $t$ within $[0, T]$ may be consistently estimated by integrating $\sigma^2$ over the interval $[0, t]$. In fact, by the continuous mapping theorem, consistent estimates of continuous transformations of $\sigma^2$ are immediately available. Hence, integrated powers of spot volatility, $\int_0^t \sigma_s^p\,ds$, $p > 0$; the running maximum of spot volatility, $\sigma_t^* := \sup_{s\le t}|\sigma_s|$; and volatility in excess of a given threshold, $\sigma_t^a := \sigma_t\,\mathbb{I}_{\{|\sigma_t| > a\}}$, $a > 0$; to name just a few, are easily obtained via the obvious transformation of the estimated global spot volatility. This flexibility is one of the more appealing features of this class of estimators.
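The flexibility described above is easy to sketch numerically. In the snippet below, the grid, the toy spot-variance path, and the threshold value are illustrative assumptions, not quantities from the text; the point is only that, given a global estimate of $\sigma^2$ on $[0, T]$, each functional is a one-line transformation.

```python
import numpy as np

# Toy global estimate of sigma^2 on a grid over [0, T] (illustrative assumption)
T = 1.0
t = np.linspace(0.0, T, 1001)
sigma2 = 0.04 + 0.02 * np.sin(2 * np.pi * t) ** 2   # assumed spot-variance path
sigma = np.sqrt(sigma2)
dt = t[1] - t[0]

# Integrated volatility up to each t: cumulative Riemann sum of sigma^2
integrated_vol = np.cumsum(sigma2) * dt

# Integrated p-th power of spot volatility, here p = 3
p = 3.0
integrated_power = np.cumsum(sigma ** p) * dt

# Running maximum of |sigma|
running_max = np.maximum.accumulate(np.abs(sigma))

# Volatility in excess of a threshold a (a = 0.22 is an arbitrary choice)
a = 0.22
sigma_excess = sigma * (np.abs(sigma) > a)
```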
The estimator by Genon-Catalot et al. (1992) is an early contribution to the study of the realized trajectory of spot volatility. Working within the context of continuous asset prices and deterministic spot volatility, the authors described an estimator of the realized trajectory of spot volatility using wavelet projection methods. Their basic framework has been extended by Hoffmann et al. (2012), who proposed an adaptive estimator of spot volatility for continuous asset prices subject to market microstructure noise contamination.
Another important contribution to the global spot volatility estimation literature is the estimator studied by Malliavin & Mancino (2002), which relies on Fourier methods to estimate the realized path of spot volatility for assets with continuous prices. In their procedure, the Fourier coefficients of the realized price path are first estimated and then used to derive expressions for the Fourier coefficients of the realized path of spot volatility.
In the current work, we extend the study of the realized path of spot volatility to situations where the price process or the volatility coefficient itself cannot be assumed to be continuous. That is, we describe a procedure for consistently estimating càdlàg volatility paths in the presence of price jumps. By employing Gabor frames in our analysis, we are able to leverage their excellent time-frequency localization property to obtain a sparse representation of the realized trajectories of spot volatility.
The rest of this paper is organized as follows: in Section 1.2 we introduce notation and give a general description of the dynamics of observed prices. In Section 1.3 we introduce Gabor frames and review the basic theory required for our subsequent analysis. We present our main results in Sections 1.4 and 1.5, where we specify the estimators and give proofs of their consistency. Section 1.6 describes simulation exercises that lend further support to the theoretical analysis. Section 1.8 contains concluding remarks.
1.2 Prices
We fix a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t\ge0}, P)$ and recall the definition of an Itô semimartingale with Lévy jumps.

1.1 Definition An $\mathbb{R}$-valued process $X$ is an Itô semimartingale with Lévy jumps if it admits the representation:
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + J^l_t + J^s_t, \quad t \ge 0, \qquad (1.1)$$
with
$$J^l_t := x\mathbb{I}_{\{|x|>1\}} * \mu_t := \int_0^t\!\int_{\mathbb{R}} x\mathbb{I}_{\{|x|>1\}}\,\mu(dx, ds),$$
$$J^s_t := x\mathbb{I}_{\{|x|\le1\}} * (\mu - \nu)_t := \int_0^t\!\int_{\mathbb{R}} x\mathbb{I}_{\{|x|\le1\}}\,(\mu - \nu)(dx, ds),$$
$$\nu(dt, dx) = F(dx)\,dt,$$
where $W$ is a Brownian motion, $\sigma$ and $b$ are $\mathbb{R}$-valued progressively measurable processes, $\mu$ is the integer-valued measure induced by the jumps of $X$, $\nu$ is its Lévy system, and $F(dx)$ is a deterministic, constant-in-time, $\sigma$-finite measure on $\mathbb{R}$.
1.1 Remark Generally, Itô semimartingales are semimartingales whose characteristic triplet is absolutely continuous with respect to the Lebesgue measure. Here, we further restrict the Lévy system $\nu$ to be deterministic. This assumption ensures the jump measure $\mu$ is a Poisson random measure.
We assume prices are observed in the fixed time interval $[0, 1]$ at discrete, equidistant times $t_i = i\Delta_n$, $i = 0, 1, \dots, n$, where
$$\Delta_n = 1/n = t_{i+1} - t_i, \quad i = 0, \dots, n-1. \qquad (1.2)$$
Given the finite sample $X_{t_i}$, $i = 0, 1, 2, \dots, n$, our aim is to estimate the spot variance $\sigma^2$ on the time interval $[0, 1]$ by nonparametric methods. Note that our objective is not the approximation of a point but rather the approximation of an entire function. Thus an estimator of the spot variance may be viewed as a random element (function), as opposed to a random variable, that must converge in some sense to the spot variance, which is itself a random element. We approach this task by estimating the expansion of the spot variance in collections of Gabor frame elements.
1.3 Frames
Frames generalize the notion of orthonormal bases in Hilbert spaces. If $\{f_k\}_{k\in\mathbb{N}}$ is a frame for a separable Hilbert space $H$, then every vector $f \in H$ may be expressed as a linear combination of the frame elements, i.e.
$$f = \sum_{k\in\mathbb{N}} c_k f_k. \qquad (1.3)$$
This is similar to how elements of a Hilbert space may be expressed in terms of an orthonormal basis; but unlike an orthonormal basis, the representation in (1.3) need not be unique, and the frame elements need not be orthogonal. Loosely
speaking, frames contain redundant elements. The absence of uniqueness in
the frame representation is by no means a shortcoming; on the contrary, we
are afforded a great deal of flexibility and stability as a result. In fact, given
a finite data sample, the estimated basis expansion coefficients are likely to
be imprecise. This lack of precision can create significant distortions when
using an orthonormal basis. These distortions are somewhat mitigated when
using frames because of the built-in redundancy of frame elements.
Furthermore, if $\{f_k\}_{k\in\mathbb{N}}$ is a frame for $H$, then surjective, bounded transformations of $\{f_k\}_{k\in\mathbb{N}}$ also constitute frames for $H$; e.g. $\{f_k + f_{k+1}\}_{k\in\mathbb{N}}$ is a
frame. So, once we have a frame, we can generate an arbitrary number of
them very easily. We may then obtain estimates using each frame and com-
pare results. If our results using the different frames fall within a tight band,
then we are afforded some indication of the robustness of the computations.
Our discussion of frame theory will be rather brief; we only mention
concepts needed for our specification of the volatility estimator. For a more
detailed treatment we refer the reader to Christensen (2008). In the sequel, if $z$ is a complex number, then we shall denote by $\bar z$ and $|z|$ respectively the complex conjugate and the magnitude of $z$. Let $L^2(\mathbb{R})$ denote the space of complex-valued functions defined on the real line with finite norm given by
$$\|f\| := \Big(\int_{\mathbb{R}} f(t)\overline{f(t)}\,dt\Big)^{1/2} < \infty, \quad f \in L^2(\mathbb{R}).$$
Define the inner product of two elements $f$ and $g$ in $L^2(\mathbb{R})$ as $\langle f, g\rangle := \int_{\mathbb{R}} f(t)\overline{g(t)}\,dt$.
Denote by $\ell^2(\mathbb{N})$ the set of complex-valued sequences defined on the set of natural numbers $\mathbb{N}$ with finite norm given by
$$\|c\| := \Big(\sum_{k\in\mathbb{N}} c_k\overline{c_k}\Big)^{1/2} < \infty, \quad c \in \ell^2(\mathbb{N}),$$
where $c_k$ is the $k$-th component of $c$. The inner product of two sequences $c$ and $d$ in $\ell^2(\mathbb{N})$ is $\langle c, d\rangle := \sum_{k\in\mathbb{N}} c_k\overline{d_k}$. Now we may give a definition for frames:
1.2 Definition A sequence $\{f_k\}_{k\in\mathbb{N}} \subset L^2(\mathbb{R})$ is a frame if there exist positive constants $C_1$ and $C_2$ such that
$$C_1\|f\|^2 \le \sum_{k\in\mathbb{N}} |\langle f, f_k\rangle|^2 \le C_2\|f\|^2, \quad f \in L^2(\mathbb{R}).$$
The constants $C_1$ and $C_2$ are called frame bounds. If $C_1 = C_2$, then $\{f_k\}_{k\in\mathbb{N}}$ is said to be tight. Because an orthonormal basis satisfies Parseval's equality, it follows that an orthonormal basis is a tight frame with frame bounds identically equal to 1, i.e. $C_1 = C_2 = 1$. Now if $\{f_k\}$ is a frame, we may associate with it a bounded operator $A$ that maps every function $f$ in $L^2(\mathbb{R})$ to a sequence $c$ in $\ell^2(\mathbb{N})$ in the following way:
$$Af = c \quad \text{where} \quad c_k = \langle f, f_k\rangle, \ k \in \mathbb{N}. \qquad (1.4)$$
Because A takes a function defined on a continuum (R) to a sequence, which
is a function defined on the discrete set $\mathbb{N}$, $A$ is known as the analysis operator associated with the frame $\{f_k\}_{k\in\mathbb{N}}$. The boundedness of the analysis operator follows from the frame bounds in Definition 1.2. Now $A^*$, the adjoint of $A$, is well defined and takes sequences in $\ell^2(\mathbb{N})$ to functions in $L^2(\mathbb{R})$. Using the fact that $A^*$ must satisfy the equality $\langle Af, c\rangle = \langle f, A^*c\rangle$ for all $f \in L^2(\mathbb{R})$ and $c \in \ell^2(\mathbb{N})$, it may be deduced that
$$A^*c = \sum_{k\in\mathbb{N}} c_k f_k, \quad c \in \ell^2(\mathbb{N}),$$
where ck is the k-th component of the sequence c. The adjoint, A∗, may be
thought of as reversing the operation or effect of the analysis operator; for
this reason it is known as the synthesis operator.
Now an application of the operator $(A^*A)^{-1}$ to every frame element $f_k$ yields a sequence $\{\tilde f_k := (A^*A)^{-1}f_k\}_{k\in\mathbb{N}}$, which is yet another frame for $L^2(\mathbb{R})$. The frame $\{\tilde f_k\}_{k\in\mathbb{N}}$ is known as the canonical dual of $\{f_k\}_{k\in\mathbb{N}}$. Denoting the analysis operator associated with the canonical dual by $\tilde A$, it may be shown¹ that
$$\tilde A^*A = A^*\tilde A = I, \qquad (1.5)$$
where $I$ is the identity operator and $\tilde A^*$ is the adjoint of the analysis operator of the canonical dual. Furthermore, Proposition 3.2.3 of Daubechies (1992) shows that $\tilde A$ satisfies
$$\tilde A = A(A^*A)^{-1}, \qquad (1.6)$$
so that the analysis operator of the canonical dual frame is fully characterized by $A$ and its adjoint. It is easily seen that (1.5) yields a representation result, since if $f \in L^2(\mathbb{R})$ then
$$f = A^*\tilde Af = \tilde A^*Af = \sum_{k\in\mathbb{N}} \langle f, \tilde f_k\rangle f_k. \qquad (1.7)$$
Thus, in a manner reminiscent of orthonormal basis representations, every function in $L^2(\mathbb{R})$ is expressible as a linear combination of the frame elements, with the frame coefficients given by $\langle f, \tilde f_k\rangle$, the correlation between the function and the elements of the dual frame. It follows from the first equality in (1.5) and the commutativity of the duality relationship that functions in $L^2(\mathbb{R})$ may also be written as linear combinations of the elements of $\{\tilde f_k\}_{k\in\mathbb{N}}$, with coefficients given by $\langle f, f_k\rangle$, i.e. $f = \sum_{k\in\mathbb{N}} \langle f, f_k\rangle \tilde f_k$.

¹See for example Daubechies (1992, Proposition 3.2.3)
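A finite-dimensional sketch may make the operator algebra concrete. The three vectors below form a redundant frame for $\mathbb{R}^2$ (an assumed toy example, not from the text); the code forms the frame operator $A^*A$, computes the canonical dual, and checks the reconstruction formula (1.7) numerically, as well as the commuted version with the roles of frame and dual exchanged.

```python
import numpy as np

# Rows are the frame elements f_k: a redundant spanning set for R^2
F = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])

S = F.T @ F                      # frame operator A* A (here a 2 x 2 matrix)
F_dual = F @ np.linalg.inv(S)    # rows are the canonical duals (A* A)^{-1} f_k

f = np.array([2.0, -3.0])        # arbitrary vector to reconstruct
coeffs = F_dual @ f              # frame coefficients <f, f~_k>
f_rec = coeffs @ F               # synthesis: sum_k <f, f~_k> f_k

# Duality commutes: analyzing against f_k and synthesizing with f~_k also works
f_rec2 = (F @ f) @ F_dual
```

Since the vectors are real, no conjugation is needed; in the complex $L^2(\mathbb{R})$ setting the inner products above would carry conjugates.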
1.3.1 Gabor frames
Next, we specialize the discussion to Gabor frames. The analysis of Gabor
frames involves two operators: the translation operator T and the modulation
operator $M$, defined as follows:
$$T_b f(t) := f(t-b), \quad b \in \mathbb{R},\ f \in L^2(\mathbb{R}), \qquad (1.8)$$
$$M_a f(t) := e^{2\pi i a t} f(t), \quad a \in \mathbb{R},\ f \in L^2(\mathbb{R}), \qquad (1.9)$$
where $i$ is the imaginary unit, i.e. $i = \sqrt{-1}$. Both $T$ and $M$ are shift operators: $T$ is a shift or translation operator on the time axis, whereas $M$ performs shifts on the frequency axis. A Gabor system is constructed by fixing $a, b \in \mathbb{R}$ and performing shifts of a single nontrivial function $g \in L^2(\mathbb{R})$ in time-frequency space. For example, if $a$ and $b$ are real numbers, then the sequence of functions
$$\{M_{ha}T_{kb}\,g\}_{h,k\in\mathbb{Z}}$$
constitutes a Gabor system.
1.3 Definition Let $g \in L^2(\mathbb{R})$, and let $a > 0$, $b > 0$ be positive real numbers. Define for $t \in \mathbb{R}$
$$g_{h,k}(t) := e^{2\pi i h a t} g(t - kb), \quad h, k \in \mathbb{Z}.$$
If the sequence $\{g_{h,k}\}_{h,k\in\mathbb{Z}}$ constitutes a frame for $L^2(\mathbb{R})$, then it is called a Gabor frame.²

²It is also sometimes referred to as a Weyl-Heisenberg frame.
The fixed function $g$ is known as the Gabor frame generator³; $a$ is known as the modulation parameter; and $b$ is known as the translation parameter. In order to obtain sharp asymptotic rates, we require $g$ and its dual $\tilde g$ (see (1.7)) to be continuous and compactly supported. The following result, (Christensen, 2006, Lemma 1.2) and (Zhang, 2008, Proposition 2.4), tells us how to construct such dual pairs.

1.1 Lemma Let $[r, s]$ be a finite interval, let $a > 0$, $b > 0$ be positive constants, and let $g$ be a continuous function. If $g(t) \ne 0$ when $t \in (r, s)$; $g(t) = 0$ when $t \notin (r, s)$; and $a$, $b$ satisfy $a < 1/(s-r)$, $0 < b < s-r$; then $(g, \tilde g)$ is a pair of dual Gabor frame generators, with the dual Gabor generator given by
$$\tilde g(t) := g(t)/G(t), \quad \text{where} \qquad (1.10)$$
$$G(t) := \sum_{k\in\mathbb{Z}} |g(t-kb)|^2 / a. \qquad (1.11)$$
Furthermore,
$$\tilde g_{h,k}(t) := e^{2\pi i h a t}\,\tilde g(t-kb), \quad h, k \in \mathbb{Z}, \qquad (1.12)$$
is compactly supported.
In the sequel, we assume the Gabor frame setup in Lemma 1.1.
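Under the conditions of Lemma 1.1, the dual generator is available in closed form, which makes the construction easy to sketch. The bump function and the parameter values $(r, s, a, b)$ below are illustrative choices satisfying the lemma's hypotheses, not specifications from the text.

```python
import numpy as np

r, s = 0.0, 1.0        # support interval of the generator
a, b = 0.9, 0.5        # a < 1/(s - r) and 0 < b < s - r, as Lemma 1.1 requires

def g(t):
    """Continuous generator, nonzero exactly on (r, s)."""
    t = np.asarray(t, dtype=float)
    return np.where((t > r) & (t < s), np.sin(np.pi * (t - r) / (s - r)) ** 2, 0.0)

def G(t):
    # G(t) = (1/a) sum_k |g(t - k b)|^2; only finitely many shifts are nonzero
    return sum(np.abs(g(t - k * b)) ** 2 for k in range(-4, 5)) / a

def g_dual(t):
    """Dual generator g~ = g / G of (1.10)-(1.11); shares the support of g."""
    t = np.asarray(t, dtype=float)
    Gt = G(t)
    return np.divide(g(t), Gt, out=np.zeros_like(t), where=Gt > 0)

def gabor(h, k, gen):
    """Time-frequency shift as in (1.12): t -> e^{2 pi i h a t} gen(t - k b)."""
    return lambda t: np.exp(2j * np.pi * h * a * np.asarray(t, dtype=float)) * gen(t - k * b)
```

Modulation leaves magnitudes untouched, so every $\tilde g_{h,k}$ inherits the compact support of $\tilde g$, which is the property the lemma is after.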
1.4 Volatility estimation: continuous prices
In this section we specify a consistent estimator of spot volatility within a
framework of continuous prices. That is, we simplify the general setup of
(1.1) to:
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s, \quad t \ge 0. \qquad (1.13)$$
We further restrict the processes b and σ as follows:
3It is referred to elsewhere as the window function.
1.1 Assumption
1. The drift b is progressively measurable, whereas the diffusion coefficient
σ is adapted and càdlàg.
2. There is a sequence of stopping times $\{T_m\}_{m\in\mathbb{N}}$ tending to infinity almost surely such that
$$E\Big(\sup_{0\le s\le T_m} |b_s - b_0|^4\Big) + E\Big(\sup_{0\le s\le T_m} |\sigma_s - \sigma_0|^4\Big) < \infty$$
for all $m$.
1.2 Remark These assumptions are satisfied by a wide range of practi-
cally relevant processes; these include continuous Lévy and additive processes
with càdlàg volatility coefficients. Also included are continuous solutions of
stochastic differential equations; indeed all processes with locally bounded b
and σ satisfy these requirements.
Let $(g, \tilde g)$ be a pair of dual Gabor frame generators constructed as in Lemma 1.1; then $\sigma^2$ admits a Gabor frame expansion given by:
$$\sigma^2(t) = \sum_{h,k\in\mathbb{Z}} c_{h,k}\, g_{h,k}(t), \quad \text{where} \qquad (1.14)$$
$$c_{h,k} = \langle \sigma^2, \tilde g_{h,k}\rangle. \qquad (1.15)$$
Note that both $\sigma^2$ and $g$ have compact support. Indeed, $\sigma^2$ has support in $[0, 1]$, whereas $g$ has support in $[r, s]$. So, $c_{h,k} \ne 0$ only if the supports of $\sigma^2$ and $\tilde g_{h,k}$ overlap. Furthermore, we note from (1.12) that $\tilde g_{h,k+1}$ is simply $\tilde g_{h,k}$ shifted by $b$ units; so, $c_{h,k} = 0$ if $|k| \ge K_0$ with
$$K_0 := \lceil (1 + |s| + |r|)/b \rceil, \qquad (1.16)$$
where $\lceil x\rceil$, $x \in \mathbb{R}$, is the least integer greater than or equal to $x$. Thus $\sigma^2$ admits a representation of the form:
$$\sigma^2(t) = \sum_{\substack{(h,k)\in\mathbb{Z}^2 \\ |k|\le K_0}} c_{h,k}\, g_{h,k}(t),$$
and for a sufficiently large positive integer $H$,
$$\sigma^2(t) \approx \sum_{\substack{|h|\le H \\ |k|\le K_0}} c_{h,k}\, g_{h,k}(t).$$
Now, suppose $n$ observations of the price process are available, and let
$$\Theta_n := \{(h,k) \in \mathbb{Z}^2 : |h| \le H_n \text{ and } |k| \le K_0\}, \qquad (1.17)$$
where $H_n$ is an increasing sequence in $n$. We propose the following estimator of the volatility coefficient:
$$v_n(X, t) := \sum_{(h,k)\in\Theta_n} \hat c_{h,k}\, g_{h,k}(t), \quad t \in [0,1], \quad \text{where} \qquad (1.18)$$
$$\hat c_{h,k} := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)\,(X_{t_{i+1}} - X_{t_i})^2. \qquad (1.19)$$
So $|\Theta_n|$ is the number of frame elements included in the expansion. Specifically, $|\Theta_n| = (2K_0 + 1)(2H_n + 1)$; and since $K_0$ is a finite quantity, it follows that $|\Theta_n| = O(H_n)$, i.e. the number of estimated coefficients is proportional to $H_n$ and therefore grows with the number of observations, $n$. We now show that the estimator converges to $\sigma^2$ on $[0, 1]$ in probability.
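A rough numerical sketch of (1.18)-(1.19) is possible under simplifying assumptions not made in the text: zero drift, a deterministic volatility path, and a particular bump generator with parameters chosen to satisfy Lemma 1.1. The estimated coefficients $\hat c_{h,k}$ can be compared against quadrature values of $c_{h,k} = \int_0^1 \tilde g_{h,k}\,\sigma^2$; the synthesis step then assembles $v_n$ on a display grid.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed dual generator pair of Lemma 1.1 with (r, s) = (-0.25, 1.25),
# a = 0.6 < 1/(s - r), b = 0.7 < s - r
r, s, a, b = -0.25, 1.25, 0.6, 0.7

def g(t):
    t = np.asarray(t, dtype=float)
    return np.where((t > r) & (t < s), np.sin(np.pi * (t - r) / (s - r)) ** 2, 0.0)

def g_dual(t):
    t = np.asarray(t, dtype=float)
    G = sum(np.abs(g(t - k * b)) ** 2 for k in range(-4, 5)) / a
    return np.divide(g(t), G, out=np.zeros_like(t), where=G > 0)

def frame_fn(h, k, gen, t):
    # g_{h,k}(t) = e^{2 pi i h a t} gen(t - k b)
    return np.exp(2j * np.pi * h * a * t) * gen(t - k * b)

# Simulated continuous price on [0, 1]: dX = sigma(t) dW, deterministic sigma
n = 20_000
ti = np.arange(n + 1) / n
sigma2 = 0.5 + 0.3 * np.cos(2 * np.pi * ti)
dX = np.sqrt(sigma2[:-1] / n) * rng.standard_normal(n)

def c_hat(h, k):
    # \hat c_{h,k} = sum_i g~_{h,k}(t_i) (X_{t_{i+1}} - X_{t_i})^2, as in (1.19)
    return np.sum(frame_fn(h, k, g_dual, ti[:-1]) * dX ** 2)

def c_true(h, k):
    # c_{h,k} = int_0^1 g~_{h,k}(s) sigma^2(s) ds, by a left Riemann sum
    return np.sum(frame_fn(h, k, g_dual, ti[:-1]) * sigma2[:-1]) / n

# Synthesis over the truncated index set Theta_n, as in (1.18)
H = 6
K0 = int(np.ceil((1 + abs(s) + abs(r)) / b))
t_grid = np.linspace(0.0, 1.0, 400)
v_n = sum(c_hat(h, k) * frame_fn(h, k, g, t_grid)
          for h in range(-H, H + 1) for k in range(-K0, K0 + 1)).real
```

The coefficient-level agreement is the essential point; how faithfully the truncated synthesis tracks $\sigma^2$ near the interval endpoints depends on the truncation level $H$.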
1.1 Proposition Suppose the price process is specified as in (1.13) and satisfies the conditions of Assumption 1.1. Let $(g, \tilde g)$ be a pair of dual Gabor generators satisfying the conditions of Lemma 1.1 with $g$ Lipschitz continuous on the unit interval. If $H_n \uparrow \infty$ satisfies
$$(H_n)^2 \Delta_n^{1/2} = o(1),$$
then $v_n(X, t)$, defined in (1.18), converges in $L^2[0,1]$ to $\sigma^2$ in probability.
Proof. We begin by noting that
$$v_n(X,t) - \sigma^2(t) = \sum_{(h,k)\in\Theta_n} (\hat c_{h,k} - c_{h,k})\, g_{h,k}(t) - \sum_{(h,k)\notin\Theta_n} c_{h,k}\, g_{h,k}(t), \qquad (1.20)$$
where
$$\hat c_{h,k} = \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(X_{t_{i+1}} - X_{t_i})^2 \quad \text{and} \quad c_{h,k} = \int_0^1 \tilde g_{h,k}(s)\,\sigma^2(s)\,ds.$$
We tackle the summands in (1.20) in turn, starting with the first one. But first let
$$M_i := \int_{t_i}^{t_{i+1}} b_s\,ds \quad \text{and} \quad S_i := \int_{t_i}^{t_{i+1}} \sigma_s\,dW_s,$$
and note that since $X_{t_{i+1}} - X_{t_i} = M_i + S_i$, it follows that
$$(X_{t_{i+1}} - X_{t_i})^2 = M_i^2 + 2M_iS_i + S_i^2.$$
So, (1.20) may be written as
$$v_n(X,t) - \sigma^2(t) = B_{1,n}(t) + B_{2,n}(t) + B_{3,n}(t) + B_{4,n}(t),$$
where
$$B_{1,n}(t) := \sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)S_i^2 - c_{h,k}\Big),$$
$$B_{2,n}(t) := 2\sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)S_iM_i\Big),$$
$$B_{3,n}(t) := \sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)M_i^2\Big),$$
$$B_{4,n}(t) := -\sum_{(h,k)\notin\Theta_n} g_{h,k}(t)\,c_{h,k}. \qquad (1.21)$$
We start by recalling the well-known fact that frame expansions converge unconditionally in $L^2[0,1]$; that is, the expansion converges regardless of the order of summation (Christensen, 2008, Theorem 5.1.7). Hence,
$$\|B_{4,n}\|_{L^2[0,1]} = o_{a.s.}(1).$$
We now obtain an estimate for $B_{3,n}(t)$. Suppose without loss of generality that $b_0 = \sigma_0 = 0$ and let $\{T_m\}_{m\in\mathbb{N}}$ be a localizing sequence for $b$ and $\sigma$. Then, by Jensen's inequality,
$$E\Big(\int_{t_i}^{t_{i+1}} b_{s\wedge T_m}\,ds\Big)^2 \le \Delta_n\,E\Big(\int_{t_i}^{t_{i+1}} b^2_{s\wedge T_m}\,ds\Big) \le \Delta_n \int_{t_i}^{t_{i+1}} E(b^2_{s\wedge T_m})\,ds \le \Delta_n \int_{t_i}^{t_{i+1}} E\Big(\sup_{u\le T_m} b_u^4\Big)^{1/2}\,ds \le c\Delta_n^2, \qquad (1.22)$$
where the change in the order of integration is justified by Fubini's theorem, and $c$ denotes a generic constant. In the sequel, in expressions containing more than one inequality, $c$ will denote the maximum or minimum, as the case may be, of the constants appearing in each inequality. Set $M_i^m := \int_{t_i}^{t_{i+1}} b_{s\wedge T_m}\,ds$ and
$$B^m_{3,n}(t) := \sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(M_i^m)^2\Big),$$
and note that, given $\eta > 0$,
$$P\Big(\sup_{t\in[0,1]} |B_{3,n}(t)| > \eta\Big) \le P(T_m \le 1) + P\Big(\sup_{t\in[0,1]} |B^m_{3,n}(t)| > \eta\Big)$$
for any $m \in \mathbb{N}$. Since $T_m \uparrow \infty$ a.s., the first term on the right becomes arbitrarily small as $m$ tends to infinity. Now since $g_{h,k}$ and $\tilde g_{h,k}$ are bounded independently of $h$ and $k$, and $n\Delta_n = 1$, it follows by Markov's inequality and (1.22) that
$$P\Big(\sup_{t\in[0,1]} |B^m_{3,n}(t)| > \eta\Big) \le cH_n\Delta_n.$$
Hence,
$$\sup_{t\in[0,1]} |B_{3,n}(t)| = o_P(1). \qquad (1.23)$$
We now tackle $B_{2,n}(t)$. To that end, denote $S_i^m := \int_{t_i}^{t_{i+1}} \sigma_{s\wedge T_m}\,dW_s$ and note that
$$E((S_i^m)^2) = E\Big(\int_{t_i}^{t_{i+1}} \sigma^2_{s\wedge T_m}\,ds\Big) = \int_{t_i}^{t_{i+1}} E(\sigma^2_{s\wedge T_m})\,ds \le \int_{t_i}^{t_{i+1}} E\Big(\sup_{u\le T_m} \sigma_u^4\Big)^{1/2}\,ds \le c\Delta_n. \qquad (1.24)$$
By Hölder's inequality, (1.22), and (1.24), we have
$$E(M_i^m S_i^m) \le \big(E((M_i^m)^2)\,E((S_i^m)^2)\big)^{1/2} \le c\Delta_n^{3/2}. \qquad (1.25)$$
Next, set
$$B^m_{2,n}(t) := 2\sum_{(h,k)\in\Theta_n} g_{h,k}(t)\Big(\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)S_i^mM_i^m\Big).$$
Then for each $m$, because $g_{h,k}$ and $\tilde g_{h,k}$ are bounded independently of $h$ and $k$, and $n\Delta_n = 1$, we conclude by an appeal to Markov's inequality that $P(\sup_{t\in[0,1]} |B^m_{2,n}(t)| > \eta) \le cH_n\Delta_n^{1/2}$. By the previously used localization argument,
$$\sup_{t\in[0,1]} |B_{2,n}(t)| = o_P(1). \qquad (1.26)$$
Now we tackle the final piece, $B_{1,n}(t)$. Let
$$A^n := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)S_i^2 - \int_0^1 \sigma^2(s)\,\tilde g_{h,k}(s)\,ds. \qquad (1.27)$$
We will first obtain an upper bound for $A^n$; we proceed by adding and subtracting $\sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} \tilde g_{h,k}(t_i)\sigma^2(s)\,ds$ from $A^n$ to yield:
$$A^n = \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)\Big(S_i^2 - \int_{t_i}^{t_{i+1}} \sigma^2(s)\,ds\Big) + \sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} \sigma^2(s)\big(\tilde g_{h,k}(t_i) - \tilde g_{h,k}(s)\big)\,ds =: A_1^n + A_2^n.$$
We obtain estimates in turn for the summands. By Assumption 1.1, $\sigma$ is càdlàg, so that it is almost surely bounded on $[0, 1]$; by the continuity of $\tilde g_{h,k}$ and Lemma 1.3, we have
$$A_2^n = \sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} \sigma^2(s)\big(\tilde g_{h,k}(t_i) - \tilde g_{h,k}(s)\big)\,ds \le c\,\omega(\tilde g_{h,k}, \Delta_n), \quad a.s.,$$
where $\omega(\tilde g_{h,k}, \Delta_n)$ is the modulus of continuity of $\tilde g_{h,k}$ on an interval of length $\Delta_n$. By the Lipschitz continuity of $g$ we have
$$A_2^n = O_{a.s.}(\omega(g, \Delta_n)) = O_{a.s.}(\Delta_n).$$
Now, we obtain an estimate for $A_1^n$. First, let $D_i^n : \Omega\times[0,1] \to \mathbb{R}$ for $i = 0, \dots, n-1$ be defined as follows:
$$D_i^n(t) := \tilde g_{h,k}(t_i)\Big(\int_{t_i}^{t} \sigma_{u\wedge T_m}\,dW_u\Big)\mathbb{I}_{(t_i,t_{i+1}]}(t), \qquad (1.28)$$
$$D_0^n(0) := 0. \qquad (1.29)$$
So, $D_i^n(t)$ is 0 on $[0, 1]$ except when $t$ is in $(t_i, t_{i+1}]$. Moreover,
$$D_i^n(t)D_j^n(t) = 0, \quad i \ne j,$$
for $t$ in $[0, 1]$. Now, for each $i$, $0 \le i < n$, if $t \in (t_i, t_{i+1}]$, we have
$$E(D_i^n(t)^4) = \tilde g_{h,k}(t_i)^4\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\,E\Big(\int_{t_i}^{t} \sigma_{u\wedge T_m}\,dW_u\Big)^4$$
$$\le c\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\,E\Big(\int_{t_i}^{t} \sigma^2_{u\wedge T_m}\,du\Big)^2 \quad \text{(B.D.G.)}$$
$$\le c\,(t - t_i)\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\,E\Big(\int_{t_i}^{t} \sigma^4_{u\wedge T_m}\,du\Big) \quad \text{(Jensen)}$$
$$\le c\,\Delta_n\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\int_{t_i}^{t_{i+1}} E\big(\sigma^4_{u\wedge T_m}\big)\,du \quad \text{(Fubini)}$$
$$\le c\,\mathbb{I}_{(t_i,t_{i+1}]}(t)\,\Delta_n^2, \qquad (1.30)$$
where the application of Fubini's theorem (Halmos, 1950, Theorem VII.36.B) is justified by the fact that $\sigma^4$ is non-negative and measurable with respect to the product $\sigma$-algebra on $[0,1]\times\Omega$. Now, using Itô's integration by parts formula, we may write
$$E((A_1^n)^2) = E\Big(2\sum_{i=0}^{n-1} \int_{t_i}^{t_{i+1}} \tilde g_{h,k}(t_i)\Big(\int_{t_i}^{s} \sigma_{u\wedge T_m}\,dW_u\Big)\sigma_{s\wedge T_m}\,dW_s\Big)^2 = 4E\Big(\int_0^1 \sum_{i=0}^{n-1} D_i^n(s)\,\sigma_{s\wedge T_m}\,dW_s\Big)^2$$
$$\le c\int_0^1 \sum_{i=0}^{n-1} E\big(D_i^n(s)^2\sigma^2_{s\wedge T_m}\big)\,ds \le c\int_0^1 \sum_{i=0}^{n-1} \big(E(D_i^n(s)^4)\big)^{1/2}\big(E(\sigma^4_{s\wedge T_m})\big)^{1/2}\,ds \le c\int_0^1 \sum_{i=0}^{n-1} \mathbb{I}_{(t_i,t_{i+1}]}(s)\,\Delta_n\,ds \le c\Delta_n.$$
By Chebyshev's inequality and the previously used stopping time argument, we have $A^n = O_P(\Delta_n)$. By the boundedness of $g_{h,k}$, we have
$$\sup_{t\in[0,1]} |B_{1,n}(t)| = o_P(1).$$
Hence, $B_{j,n}(t)$ for $j = 1, \dots, 4$ tends to zero in $L^2[0,1]$ in probability.
1.5 Volatility estimation: discontinuous prices
In this section we specify a global spot volatility estimator for possibly discontinuous Itô semimartingale price processes. That is, for $t \ge 0$,
$$X_t = X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s + x\mathbb{I}_{\{|x|>1\}} * \mu_t + x\mathbb{I}_{\{|x|\le1\}} * (\mu-\nu)_t$$
with $\nu(dt,dx) = F(dx)\,dt$ for a deterministic and constant-in-time $\sigma$-finite measure $F$. We assume $\sigma$ and $b$ satisfy the requirements of Assumption 1.1, and we further restrict the Lévy system of $X$ as follows:
1.2 Assumption The Lévy measure $F$ satisfies the following condition:
$$(x^2\mathbb{I}_{\{|x|\le u\}}) * \nu_t = \int_0^t\!\int_{-u}^{u} x^2\,F(dx)\,ds = O(u) \quad \text{as } u \to 0.$$

1.3 Remark The requirement is satisfied if $F$ is absolutely continuous with bounded density $f$, as is the case with the Gaussian distribution; more generally, it is satisfied if $f(x) = O(x^{-2})$ as $x \to 0$. Examples include the Lévy$(\gamma, \delta)$ distribution with density
$$f(x) = (\gamma/2\pi)^{1/2}(x-\delta)^{-3/2}\exp\big(-\gamma/(2(x-\delta))\big), \quad x > \delta,$$
and the Cauchy$(\gamma, \delta)$ distribution with density
$$f(x) = (\gamma/\pi)\big(\gamma^2 + (x-\delta)^2\big)^{-1}, \quad x \in \mathbb{R}.$$
We also remark that for general semimartingales $(x^2\wedge1)*\nu_t$ is increasing and locally integrable. By the Lévy assumption, we simply have that $(x^2\wedge1)*\nu$ is finite. In addition, it is a consequence of the Lévy assumption that the price process has no fixed time of discontinuity (Jacod & Shiryaev, 2003, II.4.3). Hence, by Itô's integration by parts formula,
$$E\big((x^2\wedge1)*\mu_t\big) = t\,(x^2\wedge1)*\nu = O(t), \quad t \ge 0. \qquad (1.31)$$
As in the preceding section, we observe a realization of the price process at $n+1$ equidistant points $t_i$, $i = 0, 1, \dots, n$. The observation interval is normalized to $[0, 1]$ with no loss of generality. The estimator proposed in the previous section, where there is no jump activity, will not do here. It is inconsistent on account of the presence of jumps; its quality deteriorates as a function of how active the jumps of $X$ are. We will counter this phenomenon with a modified spot variance estimator, but first we introduce the following notation. Let $\Delta_iX$ denote $X_{t_{i+1}} - X_{t_i}$ for $i = 0, 1, \dots, n-1$, and let $u_n$ be a positive decreasing sequence such that
$$u_n = O(\Delta_n^\beta), \quad \text{where } 0 < \beta < 1. \qquad (1.32)$$
We specify the jump-robust global estimator of spot volatility as follows:
$$V_n(X, t) := \sum_{(h,k)\in\Theta_n} \hat a_{h,k}\, g_{h,k}(t), \quad t \in [0,1], \quad \text{where} \qquad (1.33)$$
$$\hat a_{h,k} := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)\,(\Delta_iX)^2\,\mathbb{I}_{\{(\Delta_iX)^2\le u_n\}}, \qquad (1.34)$$
where $(g_{h,k}, \tilde g_{h,k})$ is a pair of dual Gabor frames constructed as in Lemma 1.1; $\Theta_n$ retains its meaning from (1.17); and $\mathbb{I}_{\{(\Delta_iX)^2\le u_n\}}$ is one if $(\Delta_iX)^2$ is less than or equal to $u_n$ and zero otherwise.
There are obvious similarities between $v_n(X,t)$, defined at (1.18), and $V_n(X,t)$, the key difference being that $V_n(X,t)$ discards realized squared increments over intervals that likely contain jumps; $u_n$ determines the threshold for what is included in the computation and what is not. This determination becomes more accurate as the observation interval becomes infinitesimally small. Clearly it makes sense to use $v_n(X,t)$ if we have reason to believe that the price process is not subject to jumps; $v_n(X,t)$ will always employ all available data and therefore may be expected to produce more accurate results.
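The effect of the threshold in (1.34) can be sketched in isolation, at the level of realized squared increments. The dynamics below (constant $\sigma$ plus compound Poisson jumps) and the constants in $u_n$ are illustrative assumptions; the truncation discards the large squared increments produced by jumps while keeping essentially all diffusive ones, so the truncated sum approximates the integrated variance $\sigma^2 \cdot 1 = 0.09$.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed dynamics: constant sigma plus a compound Poisson jump component
n = 50_000
dt = 1.0 / n
sigma = 0.3
beta = 0.75
u_n = dt ** beta                  # threshold sequence u_n = O(Delta_n^beta), 0 < beta < 1

dX = sigma * np.sqrt(dt) * rng.standard_normal(n)

# Superimpose a Poisson number of jumps with N(0, 0.5^2) sizes
n_jumps = rng.poisson(5)
jump_idx = rng.choice(n, size=n_jumps, replace=False)
dX[jump_idx] += 0.5 * rng.standard_normal(n_jumps)

keep = dX ** 2 <= u_n             # the indicator in (1.34)
rv_plain = np.sum(dX ** 2)        # contaminated by the squared jump sizes
rv_trunc = np.sum(dX[keep] ** 2)  # jump increments discarded
```

With these numbers the diffusive increments have standard deviation $\sigma\sqrt{\Delta_n} \approx 0.0013$, far below the cutoff $\sqrt{u_n} \approx 0.017$, so essentially no diffusive increment is discarded, while a jump of typical size 0.5 lands far above it.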
We now proceed to prove the consistency of the estimator. First we introduce the following notation and prove an intermediate lemma. Let
$$X^c_t := X_0 + \int_0^t b_s\,ds + \int_0^t \sigma_s\,dW_s,$$
$$J^l_t := x\mathbb{I}_{\{|x|>1\}} * \mu_t,$$
$$X^f_t := X^c_t + J^l_t. \qquad (1.35)$$
1.2 Lemma Let $X^f$ be specified as in (1.35) with $\sigma$ and $b$ satisfying Assumption 1.1. Let $(g, \tilde g)$ denote a pair of dual Gabor generators satisfying the conditions of Lemma 1.1 with $g$ Lipschitz continuous on the unit interval. Let $H_n$ be an increasing sequence and $u_n$ a decreasing sequence satisfying $u_n = O(\Delta_n^\beta)$ with $0 < \beta < 1$. If
$$u_n^{-1/2}(H_n)^2\Delta_n^{1/2} = o(1),$$
then $V_n(X^f, t)$ as defined in (1.33) converges in $L^2[0,1]$ in probability to $\sigma^2$.
Proof. We have
$$V_n(X^f,t) - \sigma^2(t) = \big(V_n(X^f,t) - V_n(X^c,t)\big) + \big(V_n(X^c,t) - v_n(X^c,t)\big) + \big(v_n(X^c,t) - \sigma^2(t)\big). \qquad (1.36)$$
That the third summand on the right converges to zero in $L^2[0,1]$ in probability is the content of Proposition 1.1. Set $\hat b_{h,k} := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(\Delta_iX^c)^2\mathbb{I}_{\{(\Delta_iX^c)^2\le u_n\}}$ and $\hat d_{h,k} := \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(\Delta_iX^c)^2$. Now note that $V_n(X^c,t) - v_n(X^c,t) = \sum_{(h,k)\in\Theta_n} (\hat b_{h,k} - \hat d_{h,k})\,g_{h,k}(t)$ with
$$\hat b_{h,k} - \hat d_{h,k} = \sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)\big((\Delta_iX^c)^2\mathbb{I}_{\{(\Delta_iX^c)^2\le u_n\}} - (\Delta_iX^c)^2\big) = -\sum_{i=0}^{n-1} \tilde g_{h,k}(t_i)(\Delta_iX^c)^2\mathbb{I}_{\{(\Delta_iX^c)^2> u_n\}}.$$
Without loss of generality, suppose $b_0 = \sigma_0 = 0$; let $T_m$ be a localizing sequence for $b$ and $\sigma$. Set $\Delta_i S_m := \int_{t_i}^{t_{i+1}} \sigma_{s\wedge T_m}\, dW_s$, $\Delta_i M_m := \int_{t_i}^{t_{i+1}} b_{s\wedge T_m}\, ds$, and $\Delta_i X^c_m := \Delta_i M_m + \Delta_i S_m$. Define $b^m_{h,k} - d^m_{h,k}$ as above by substituting $\Delta_i X^c_m$ for $\Delta_i X^c$. Now note the following:
$$E(|b^m_{h,k} - d^m_{h,k}|) \le c\, n\, E\big((\Delta_i X^c_m)^2 I_{\{(\Delta_i X^c_m)^2 > u_n\}}\big) \le c\, n\, \big(E((\Delta_i X^c_m)^4)\big)^{1/2} \big(P((\Delta_i X^c_m)^2 > u_n)\big)^{1/2} \le c\, n\, u_n^{-1/2} \big(E((\Delta_i X^c_m)^4)\big)^{1/2} \big(E((\Delta_i X^c_m)^2)\big)^{1/2}.$$
Arguing as in Proposition 1.1, it is easily verified that $E((\Delta_i X^c_m)^4) \le c(\Delta_n^4 + \Delta_n^3 + \Delta_n^2)$ and $E((\Delta_i X^c_m)^2) \le c(\Delta_n^2 + \Delta_n^{3/2} + \Delta_n)$. Hence, $E(|b^m_{h,k} - d^m_{h,k}|) \le c\, n\, u_n^{-1/2} \Delta_n^{3/2} = c\, u_n^{-1/2} \Delta_n^{1/2}$. Because $\tilde g_{h,k}$ is bounded, this allows us to conclude by way of Markov's inequality that, given $\eta > 0$,
$$P\Big(\sup_{t\in[0,1]} |V_n(X^c,t) - v_n(X^c,t)| > \eta\Big) \le P(T_m \le 1) + c\, u_n^{-1/2} H_n \Delta_n^{1/2},$$
which becomes arbitrarily small as $m$ and $n$ tend to infinity simultaneously.
To obtain an estimate for the first summand in (1.36), denote $e_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i X^f)^2 I_{\{(\Delta_i X^f)^2 \le u_n\}}$ and observe that $V_n(X^f,t) - V_n(X^c,t) = \sum_{(h,k)\in\Theta_n}(e_{h,k} - b_{h,k})\,\tilde g_{h,k}(t)$ with
$$e_{h,k} - b_{h,k} = \sum_{i=0}^{n-1} g_{h,k}(t_i)\big[(\Delta_i X^f)^2 I_{\{(\Delta_i X^f)^2 \le u_n\}} - (\Delta_i X^c)^2 I_{\{(\Delta_i X^c)^2 \le u_n\}}\big].$$
By definition, $X^f = X^c + J^l$, where $J^l$ represents the jumps of $X$ in excess of 1. We may write $(\Delta_i X^f)^2 I_{\{(\Delta_i X^f)^2 \le u_n\}} - (\Delta_i X^c)^2 I_{\{(\Delta_i X^c)^2 \le u_n\}} = \gamma^1_i + 2\gamma^2_i + \gamma^3_i$ with
$$\gamma^1_i := (\Delta_i X^c)^2\big(I_{\{(\Delta_i X^f)^2 \le u_n\}} - I_{\{(\Delta_i X^c)^2 \le u_n\}}\big),$$
$$\gamma^2_i := (\Delta_i X^c\, \Delta_i J^l)\, I_{\{(\Delta_i X^f)^2 \le u_n\}},$$
$$\gamma^3_i := (\Delta_i J^l)^2\, I_{\{(\Delta_i X^f)^2 \le u_n\}}. \qquad (1.37)$$
Because $X$ is càdlàg, there are at most finitely many jumps in excess of 1 per outcome in $[0,1]$. For sufficiently large $n$, each interval $(t_i, t_{i+1}]$ contains at most one jump. If the $i$-th interval does not contain a jump, then $\gamma^2_i = \gamma^3_i = 0$ because $\Delta_i J^l = 0$. If the $i$-th interval contains a jump, we have
$$|\Delta_i X^f| = |\Delta_i J^l + \Delta_i X^c| \ge 1 - |\Delta_i X^c|. \qquad (1.38)$$
Now observe that because $X^c$ has continuous paths, it is uniformly continuous on the compact domain $[0,1]$, so that as $n$ tends to infinity, $1 - \sup_{i<n}|\Delta_i X^c| \uparrow 1$; meanwhile, $u_n^{1/2} \downarrow 0$. Hence, for $n$ large enough, we have $|\Delta_i X^f| > u_n^{1/2}$, so that, almost surely, $\gamma^2_i$ and $\gamma^3_i$ are eventually zero, uniformly in $i$.
To pin down $\gamma^1_i$, we introduce the following events:
$$\Omega^1_n := \{\omega : \mu(\omega, (t_i, t_{i+1}] \times \{|x| > 1\}) \le 1, \text{ for all } i < n\}, \quad n \in \mathbb{N},$$
$$\Omega^2_n := \{\omega : |\Delta_i X^c(\omega)| < 1 - u_n^{1/2}, \text{ for all } i < n\}, \quad n \in \mathbb{N},$$
$$\Omega_k := \{\omega : \mu(\omega, [0,1] \times \{|x| > 1\}) \le k\}, \quad k \in \mathbb{N}.$$
Set $\Omega_n := \Omega^1_n \cap \Omega^2_n$. As previously argued (see (1.38)), $P(\Omega^2_n) \to 1$ as $n \to \infty$. Because $X$ is càdlàg, $\mu([0,1] \times \{|x| > 1\})$ is almost surely finite, so that $P(\Omega^1_n) \to 1$ as $n \to \infty$. Hence, $P(\Omega_n) \to 1$ as $n \to \infty$. It is also the case that $P(\Omega_k) \to 1$ as $k \to \infty$, since $X$ is càdlàg and the number of jumps larger than one in any bounded interval must be finite almost surely. Now, recall that $T_m$ is a localizing sequence for $b$ and $\sigma$; set $\Omega(m,n,k) := \Omega_n \cap \Omega_k \cap \{T_m > 1\}$ and note that $P(\Omega(m,n,k)) \to 1$ as $n, m, k \to \infty$. Thus, on $\Omega(m,n,k)$ there are at most $k$ jumps larger than one, with no more than one jump per interval; the increments of $X^c$ are small enough to ensure the increments of $X^f$ exceed $u_n^{1/2}$; and the processes $\sigma^4$ and $b^4$ are integrable.
Set $\gamma^1_i(n,m,k) := \gamma^1_i\, I_{\Omega(m,n,k)}$ and denote $G_i := \{|\Delta_i J^l| > 0\}$. By the triangle inequality, $E(|\gamma^1_i(n,m,k)|) \le E(|\gamma^1_i(n,m,k)|\, I_{G_i}) + E(|\gamma^1_i(n,m,k)|\, I_{G^c_i})$. Clearly, $\gamma^1_i(n,m,k) = 0$ on $G^c_i$, so that
$$\sum_{i=0}^{n-1} g_{h,k}(t_i)\, E(|\gamma^1_i(n,m,k)|) \le \sum_{i=0}^{n-1} g_{h,k}(t_i)\, E\big(|\gamma^1_i(n,m,k)|\, I_{G_i}\big) = \sum_{i=1}^{k} g_{h,k}(t_i)\, E\big((\Delta_i X^c_m)^2 I_{\{(\Delta_i X^c_m)^2 \le u_n\}}\, I_{G_i}\big) \le \sum_{i=1}^{k} g_{h,k}(t_i)\, E\big((\Delta_i X^c_m)^2\big) \le c\, k\, \Delta_n.$$
Hence, given $\eta > 0$,
$$P\Big(\sup_{t\in[0,1]} |V_n(X^f,t) - V_n(X^c,t)| > \eta\Big) \le P(\Omega(m,n,k)^c) + c\, H_n\, k\, \Delta_n.$$
By taking $m, n, k$ large enough, the first term can be made as small as required; for fixed $m, k$, letting $n \to \infty$ makes the second term as small as desired. This completes the proof.
We now prove consistency of the estimator when the price process admits both large and small jumps; that is,
$$X_t = X_0 + X^c_t + J^l_t + J^s_t,$$
where $J^l_t := (x\, I_{\{|x|>1\}}) * \mu_t$ and $J^s_t := (x\, I_{\{|x|\le 1\}}) * (\mu - \nu)_t$. We now give the main result of the paper.

1.2 Proposition Let the price process $X$ be specified as in (1.1). We assume that the requirements of Assumptions 1.1 and 1.2 are met. Let $g, \tilde g$ be a pair of dual Gabor generators satisfying the conditions of Lemma 1.1 with $g$ Lipschitz continuous on the unit interval. Let $H_n$ be an increasing sequence and $u_n$ a decreasing sequence satisfying $u_n = O(\Delta_n^\beta)$ with $0 < \beta < 1$. If
$$u_n^{-1/2} (H_n)^2 \Delta_n^{1/2} = o(1), \qquad (H_n)^2\, u_n^{1/2} = o(1), \qquad (1.39)$$
then $V_n(X,t)$ defined in (1.33) converges in $L^2[0,1]$ in probability to $\sigma^2$.
Proof. We argue along the lines of Theorem 4 of Mancini (2009). First, consider the following decomposition of the process $X$:
$$X = X^f + J^s, \qquad (1.40)$$
$$X^f = X^c + J^l, \qquad (1.41)$$
where $X^c_t = \int_0^t b_s\, ds + \int_0^t \sigma_s\, dW_s$, $J^l_t = (x\, I_{\{|x|>1\}}) * \mu_t$, and $J^s_t = (x\, I_{\{|x|\le 1\}}) * (\mu - \nu)_t$. By localization, it is enough to assume $\sigma^4$ and $b^4$ are integrable. Let $t$ be a point in the unit interval; then
$$V_n(X,t) - \sigma^2_t = \sum_{(h,k)\in\Theta_n} (a_{h,k} - c_{h,k})\,\tilde g_{h,k}(t) - \sum_{(h,k)\notin\Theta_n} c_{h,k}\,\tilde g_{h,k}(t), \qquad (1.42)$$
with $a_{h,k}$ and $c_{h,k}$ defined by (1.33) and (1.15), respectively. The last term tends to zero, almost surely, in $L^2[0,1]$ as $n \to \infty$ because Gabor frame expansions converge unconditionally.
To obtain a bound on the first term on the right of (1.42), we may use (1.40) to write
$$\sum_{(h,k)\in\Theta_n} (a_{h,k} - c_{h,k})\,\tilde g_{h,k}(t) = \sum_{(h,k)\in\Theta_n} (w_{h,k} + x_{h,k} + y_{h,k} + z_{h,k})\,\tilde g_{h,k}(t), \qquad (1.43)$$
where
$$w_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i X^f)^2 I_{\{(\Delta_i X^f)^2 \le 4u_n\}} - \int_0^1 \sigma^2(s)\, g_{h,k}(s)\, ds,$$
$$x_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i X^f)^2 \big(I_{\{(\Delta_i X)^2 \le u_n\}} - I_{\{(\Delta_i X^f)^2 \le 4u_n\}}\big),$$
$$y_{h,k} := 2\sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i X^f\, \Delta_i J^s\, I_{\{(\Delta_i X)^2 \le u_n\}},$$
$$z_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i J^s)^2 I_{\{(\Delta_i X)^2 \le u_n\}}. \qquad (1.44)$$
By Lemma 1.2, if $\delta > 0$ then $P(\sup_{t\in[0,1]} |\sum_{(h,k)\in\Theta_n} w_{h,k}\,\tilde g_{h,k}(t)| > \delta) \to 0$ as $n$ tends to infinity. It remains to show that the last three terms on the right of (1.43) converge to zero in probability. Starting with the second summand, denote $A_i := \{(\Delta_i X)^2 \le u_n\}$, $B_i := \{(\Delta_i X^f)^2 \le 4u_n\}$ and note that $I_{A_i} - I_{B_i} = I_{A_i \cap B^c_i} - I_{A^c_i \cap B_i}$. Hence, we may write
$$\sum_{(h,k)\in\Theta_n} x_{h,k}\,\tilde g_{h,k}(t) = \sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)(x_{i,1} - x_{i,2})\,\tilde g_{h,k}(t),$$
where $x_{i,1} := (\Delta_i X^f)^2 I_{A_i \cap B^c_i}$ and $x_{i,2} := (\Delta_i X^f)^2 I_{A^c_i \cap B_i}$. It is now easily verified using the reverse triangle inequality that $A_i \cap B^c_i \subset \{|\Delta_i J^s| > u_n^{1/2}\}$.
So that
$$(\Delta_i X^f)^2 I_{A_i \cap B^c_i} \le (\Delta_i X^f)^2 I_{\{(\Delta_i J^s)^2 > u_n\}} \qquad (1.45)$$
$$\le 2(\Delta_i X^c)^2 I_{\{(\Delta_i J^s)^2 > u_n\}} + 2(\Delta_i J^l)^2 I_{\{(\Delta_i J^s)^2 > u_n\}} =: v_i + w_i. \qquad (1.46)$$
It thus follows that
$$\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, x_{i,1}\, \tilde g_{h,k}(t) \le \sum_{(h,k)\in\Theta_n} \Big(\sum_{i=0}^{n-1} g_{h,k}(t_i)(v_i + w_i)\Big)\tilde g_{h,k}(t).$$
We proceed by using Hölder's inequality and (1.31) to write
$$E(v_i) \le c\,\big(E((\Delta_i X^c)^4)\big)^{1/2}\, P\big((\Delta_i J^s)^2 > u_n\big)^{1/2} \le c\, u_n^{-1/2}\, \big(E((\Delta_i X^c)^4)\big)^{1/2}\, \big(E((\Delta_i J^s)^2)\big)^{1/2} \le c\, u_n^{-1/2} \Delta_n^{3/2}. \qquad (1.47)$$
Hence, by Markov's inequality and the boundedness of $\tilde g_{h,k}$,
$$\sup_{t\in[0,1]} \sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, v_i\, \tilde g_{h,k}(t) = O_P\big(u_n^{-1/2} H_n \Delta_n^{1/2}\big), \qquad (1.48)$$
which by assumption tends to zero in probability.
As for the term involving $w_i$, recall that because $\mu$ is a Poisson random measure, if $A$ and $B$ are disjoint measurable sets in $\mathbb{R}_+ \times \mathbb{R}$, then $\mu(A)$ is independent of $\mu(B)$. Using this fact, we may write, given $\eta > 0$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, w_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le P\Big(\bigcup_i \big\{\mu((t_i,t_{i+1}] \times \{|x|>1\}) > 0,\ (\Delta_i J^s)^2 > u_n\big\}\Big)$$
$$\le n\, P\big(\mu([0,t_1] \times \{|x|>1\}) > 0\big)\, E\big((J^s_{t_1})^2\big)\, u_n^{-1} \le c\, \Delta_n\, u_n^{-1},$$
which clearly tends to zero in $n$. This concludes the demonstration that $\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, x_{i,1}\, \tilde g_{h,k}(t)$ tends to zero in probability. To tackle the term $\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, x_{i,2}\, \tilde g_{h,k}(t)$, we start with the following definitions: for all $n \in \mathbb{N}$,
$$\Omega^1_n := \{\omega : |\Delta_i X^c(\omega)| < 1 - 2u_n^{1/2}, \text{ for all } i < n\},$$
$$\Omega^2_n := \{\omega : \mu(\omega, (t_i, t_{i+1}] \times \{|x| > 1\}) \le 1, \text{ for all } i < n\}.$$
These sets are clearly measurable. Denote
$$\Omega_n := \Omega^1_n \cap \Omega^2_n. \qquad (1.49)$$
Since there can be at most a finite number of jumps larger than 1 in magnitude on $[0,1]$, and $1 - 2u_n^{1/2} \uparrow 1$ while $\sup_{i<n}|\Delta_i X^c| \downarrow 0$ uniformly on $[0,1]$, it follows that $P(\Omega_n) \to 1$ as $n \to \infty$. Now note that
$$A^c_i \cap B_i \cap \Omega_n \subset \{(\Delta_i X^c + \Delta_i J^s)^2 > u_n\} \subset \{(\Delta_i X^c)^2 > u_n/4\} \cup \{(\Delta_i J^s)^2 > u_n/4\}.$$
Hence, by successive applications of the Hölder and Markov inequalities,
$$E\big((\Delta_i X^f)^2 I_{A^c_i \cap B_i \cap \Omega_n}\big) = E\big((\Delta_i X^c)^2 I_{A^c_i \cap B_i \cap \Omega_n}\big) \le E\big((\Delta_i X^c)^2 I_{\{(\Delta_i X^c)^2 > u_n/4\}}\big) + E\big((\Delta_i X^c)^2 I_{\{(\Delta_i J^s)^2 > u_n/4\}}\big) \le c\, \Delta_n^{3/2} u_n^{-1/2}.$$
Let $\eta$ be a given positive number; it is now clear that
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, x_{i,2}\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le P(\Omega^c_n) + c\, u_n^{-1/2} H_n \Delta_n^{1/2},$$
which tends to zero. This completes the demonstration that
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} x_{h,k}\, \tilde g_{h,k}(t)\Big| > \eta\Big) \to 0. \qquad (1.50)$$
Now we show the third summand in (1.43) tends to zero. First, denote $C_i := \{(\Delta_i J^s)^2 \le 4u_n\}$, $p_{h,k} := 2\sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i X^f\, \Delta_i J^s\, I_{A_i \cap C_i}$, and $q_{h,k} := 2\sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i X^f\, \Delta_i J^s\, I_{A_i \cap C^c_i}$. Clearly,
$$\sum_{(h,k)\in\Theta_n} y_{h,k}\, \tilde g_{h,k}(t) = \sum_{(h,k)\in\Theta_n} (p_{h,k} + q_{h,k})\, \tilde g_{h,k}(t).$$
Treating the term involving $q_{h,k}$ first, note that by the reverse triangle inequality we may write $A_i \cap C^c_i \subset \{u_n^{1/2} < |\Delta_i X^f|\} \subset \{u_n^{1/2}/2 < |\Delta_i X^c|\} \cup \{u_n^{1/2}/2 < |\Delta_i J^l|\} =: G^1_i \cup G^2_i$. So that
$$\Delta_i X^f\, \Delta_i J^s\, I_{A_i \cap C^c_i} \le \Delta_i X^f\, \Delta_i J^s\,\big(I_{G^1_i} + I_{G^2_i}\big) \le \Delta_i X^c\, \Delta_i J^s\,\big(I_{G^1_i} + I_{G^2_i}\big) + \Delta_i J^l\, \Delta_i J^s\,\big(I_{G^1_i} + I_{G^2_i}\big) =: \gamma^1_i + \gamma^2_i + \gamma^3_i + \gamma^4_i.$$
Hence,
$$\sum_{(h,k)\in\Theta_n} q_{h,k}\, \tilde g_{h,k}(t) \le \sum_{(h,k)\in\Theta_n} \Big(\sum_{i=0}^{n-1} g_{h,k}(t_i)\big(\gamma^1_i + \gamma^2_i + \gamma^3_i + \gamma^4_i\big)\Big)\tilde g_{h,k}(t).$$
We show in turn that each summand converges to zero. First, observe that
$$E(\gamma^1_i) \le E\big((\Delta_i X^c\, I_{G^1_i})^2\big)^{1/2} E\big((\Delta_i J^s)^2\big)^{1/2} \le E\big((\Delta_i X^c)^4\big)^{1/4} E\big(I_{G^1_i}\big)^{1/4} E\big((\Delta_i J^s)^2\big)^{1/2} \le c\, \Delta_n^{1/2}\,\big(u_n^{-1/2}\Delta_n^{1/2}\big)\, \Delta_n^{1/2} \le c\, u_n^{-1/2} \Delta_n^{3/2}. \qquad (1.51)$$
Hence, given positive $\eta$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \gamma^1_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le c\, H_n \big(u_n^{-1}\Delta_n\big)^{1/2}. \qquad (1.52)$$
Secondly, we have
$$E(\gamma^2_i) = E\big(\Delta_i X^c\, \Delta_i J^s\, I_{G^2_i}\big) \le E\big((\Delta_i X^c)^2 I_{G^2_i}\big)^{1/2} E\big((\Delta_i J^s)^2\big)^{1/2} \le E\big((\Delta_i X^c)^4\big)^{1/4} P\big(|\Delta_i J^l| > u_n^{1/2}/2\big)^{1/4} E\big((\Delta_i J^s)^2\big)^{1/2} \le c\, u_n^{-1/8} \Delta_n^{5/4}.$$
So that, given positive $\eta$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \gamma^2_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le c\, H_n \big(u_n^{-1/2}\Delta_n\big)^{1/4}. \qquad (1.53)$$
Moreover,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \gamma^3_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le P\Big(\bigcup_i \big\{\mu((t_i,t_{i+1}] \times \{|x|>1\}) > 0,\ (\Delta_i X^c)^2 > u_n/4\big\}\Big) \le c\, \Delta_n u_n^{-1}. \qquad (1.54)$$
Finally,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \gamma^4_i\, \tilde g_{h,k}(t)\Big| > \eta\Big) \le P\Big(\bigcup_i \big\{\mu((t_i,t_{i+1}] \times \{|x|>1\}) > 0,\ (\Delta_i J^s)^2 > u_n/4\big\}\Big) \le c\, \Delta_n u_n^{-1}. \qquad (1.55)$$
We conclude from the estimates (1.52), (1.53), (1.54), and (1.55) that $\sup_{t\in[0,1]} |\sum_{(h,k)\in\Theta_n} q_{h,k}\, \tilde g_{h,k}(t)|$ tends to zero in probability.
We now show that $\sum_{(h,k)\in\Theta_n} p_{h,k}\, \tilde g_{h,k}(t)$ tends to zero uniformly in probability. To that end, let $\Psi_n := \{\omega : |\Delta_i X^c(\omega)| > u_n^{1/2} \text{ for some } i < n\}$. It now follows by Markov's inequality that
$$P(\Psi_n) \le \sum_{i=0}^{n-1} P\big(|\Delta_i X^c| > u_n^{1/2}\big) \le u_n^{-3/(2(1-\beta))} \sum_{i=0}^{n-1} E\big(|\Delta_i X^c|^{3/(1-\beta)}\big) \le c\, \Delta_n^{1/2}. \qquad (1.56)$$
Hence, $P(\Psi_n) \to 0$. On $A_i \cap C_i \cap \Psi^c_n$, it is easily seen that $|\Delta_i J^l| - |\Delta_i X^c + \Delta_i J^s| \le |\Delta_i X| \le u_n^{1/2}$, so that $|\Delta_i J^l| \le u_n^{1/2} + |\Delta_i X^c| + |\Delta_i J^s|$. It is therefore the case that $|\Delta_i J^l| = O(u_n^{1/2})$. Let $r_{h,k} := 2\sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i X^c\, \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}$ and $s_{h,k} := 2c\, u_n^{1/2} \sum_{i=0}^{n-1} g_{h,k}(t_i)\, \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}$. Then, given $\delta > 0$ and $\varepsilon > 0$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} p_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta\Big) \le P(\Psi_n) + P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} r_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) + P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} s_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big). \qquad (1.57)$$
Now consider that $\sum_{(h,k)\in\Theta_n} r_{h,k}\, \tilde g_{h,k}(t) \le c\, H_n \sum_{i=0}^{n-1} \big|\Delta_i X^c\, \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}\big|$; this implies that
$$P\Big(\Big|\sum_{(h,k)\in\Theta_n} r_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) \le P\Big(c\, H_n \sum_{i=0}^{n-1} \big|\Delta_i X^c\, \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}\big| > \delta/2\Big) \le P\Bigg(\Big(\sum_{i=0}^{n-1} (\Delta_i X^c)^2\Big)^{1/2}\Big(\sum_{i=0}^{n-1} \big(\Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n}\big)^2\Big)^{1/2} > \delta\,(2H_n c)^{-1}\Bigg).$$
We now use the well-known fact that the realized quadratic variation $\sum_{t_{i+1} \le t} (\Delta_i X^c)^2$ converges to $\int_0^t \sigma^2(s)\, ds$ in probability uniformly on compact intervals (Protter, 2004, Theorem II.22). That is, there are a sufficiently large $N$ and $C$ such that if $n \ge N$ then $P\big(\big|\big(\sum_{i=0}^{n-1} (\Delta_i X^c)^2\big)^{1/2} - \big(\int_0^1 \sigma^2(s)\, ds\big)^{1/2}\big| > C\big) \le \varepsilon/12$, and because integrated volatility is almost surely finite, there is a sufficiently large $K$ satisfying $K/2 > C$ such that $P\big(\int_0^1 \sigma^2(s)\, ds > K/2\big) \le \varepsilon/12$. Hence, we may write
$$P\Big(\Big|\sum_{(h,k)\in\Theta_n} r_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) \le P\Big(\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \mu_1 > \delta^2 (K H_n c)^{-2}\Big) + \varepsilon/6 \le c\,(H_n)^2\, E\Big(\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \mu_1\Big) + \varepsilon/6 \le c\,(H_n)^2\,\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \nu_1 + \varepsilon/6,$$
which for sufficiently large $n$ is less than $\varepsilon/3$ by Assumption 1.2 and (1.39).
Now it is easily seen that, for sufficiently large $c$,
$$P\Big(\Big|\sum_{(h,k)\in\Theta_n} s_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) \le P\Big(\sum_{i=0}^{n-1} \Delta_i J^s\, I_{A_i \cap C_i \cap \Psi^c_n} > (c\, H_n u_n^{1/2})^{-1}\delta\Big) \le c\, u_n (H_n)^2\, E\Big(\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \mu_1\Big) \le c\, u_n (H_n)^2\,\big(x^2 I_{\{|x| \le 1 \wedge 2u_n^{1/2}\}}\big) * \nu_1, \qquad (1.58)$$
which, as above, is eventually less than $\varepsilon/3$. Hence, each summand on the right-hand side of (1.57) tends to zero. This concludes the demonstration that
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} y_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta\Big) \to 0.$$
We now tackle the last remaining summand in (1.43). Note that we may write $z_{h,k} = a_{h,k} + b_{h,k}$ with $a_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i J^s)^2 I_{A_i \cap C_i}$ and $b_{h,k} := \sum_{i=0}^{n-1} g_{h,k}(t_i)(\Delta_i J^s)^2 I_{A_i \cap C^c_i}$ (reusing the symbols $a_{h,k}$ and $b_{h,k}$ locally). Then, for positive $\delta$,
$$P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} z_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta\Big) \le P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} a_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) + P\Big(\sup_{t\in[0,1]} \Big|\sum_{(h,k)\in\Theta_n} b_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big).$$
Now consider the event $\Omega_n$ from (1.49), and note that $A_i \cap C^c_i \cap \Omega_n \subset \{\mu((t_i, t_{i+1}] \times \{|x| > 1\}) > 0\} \cap C^c_i$. Hence,
$$P\Big(\Big|\sum_{(h,k)\in\Theta_n} b_{h,k}\, \tilde g_{h,k}(t)\Big| > \delta/2\Big) \le P\Big(\bigcup_i \big\{I_{\{|x|>1\}} * \mu((t_i, t_{i+1}] \times \mathbb{R}) > 0,\ (\Delta_i J^s)^2 > 4u_n\big\}\Big) + P(\Omega^c_n) \le n\, P\big(I_{\{|x|>1\}} * \mu([0, t_1] \times \mathbb{R}) > 0\big)\, E\big((J^s_{t_1})^2\big)(4u_n)^{-1} + P(\Omega^c_n) \le c\, \Delta_n u_n^{-1} + P(\Omega^c_n), \qquad (1.59)$$
which can be made as small as desired.
which can be made as small as desired. Now consider
P (|∑
(h,k)∈Θn
ah,kgh,k(t)| > δ/2)
≤ P (n−1∑
i=0
(∆iJs)2I
|∆iJs|≤2u1/2n
> δ(2cHn)−1)
≤ cHnE(
x2I|x|≤1∧2u
1/2n
∗ µ1
)
≤ cHn(x2I|x|≤1∧2u
1/2n
∗ ν)1≤ cHnu1/2
n
which can be made arbitrarily small by the constraints on Hn. This completes
the demonstration that
P ( supt∈[0,1]
|∑
(h,k)∈Θn
zh,kgh,k(t)| > δ) → 0. (1.60)
1.6 Simulation
1.6.1 Continuous prices
In this section, we confirm via simulations the results established analytically.
We first focus on the continuous case to mirror Proposition 1.1. Specifically, we demonstrate that the mean integrated square error (MISE), the square bias, and the variance of the frame-based estimator tend to zero as the number of observations increases. We use prices generated by 4 commonly used models of asset prices, namely, the arithmetic Brownian motion (ABM), the Ornstein-Uhlenbeck process (OU), the geometric Brownian motion (GBM), and the Cox-Ingersoll-Ross (CIR) process.
We simulate prices using the following stochastic differential equations:
$$X_t = 0.8 + 0.5t + 0.2W_t, \qquad \text{(ABM)}$$
$$X_t = 0.8 - \int_0^t 4X_s\, ds + \int_0^t 0.2\, dW_s, \qquad \text{(OU)}$$
$$X_t = 0.8 + \int_0^t 0.5X_s\, ds + \int_0^t 0.2X_s\, dW_s, \qquad \text{(GBM)}$$
$$X_t = 0.8 + \int_0^t (0.1 - 0.5X_s)\, ds + \int_0^t 0.2\sqrt{X_s}\, dW_s, \qquad \text{(CIR)}$$
where Wt is a standard Brownian motion. For convenience, the observation
interval is set to the unit interval $[0,1]$. In all 4 cases, $X_0 = 0.8$. For each price model, we obtain estimates of the MISE, the square bias, and the variance of the estimator when the number of observations is 500, 5,000, and 50,000. In a high-frequency framework, 500 observations for an actively traded stock is likely too small; 5,000 is about right, and 50,000 is not entirely unheard of. At any rate, our objective is not to capture the average number
of trades of any particular security, but rather, to obtain support for our
asymptotic results by showing an inverse relationship between the number of
observations and the MISE, and thereby gain a better understanding of the
finite sample behavior of the estimator.
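The four specifications above can be discretized with a standard Euler–Maruyama scheme. The sketch below is illustrative of such a setup, under our own naming conventions, rather than a transcript of the code behind the reported results.

```python
import numpy as np

def simulate(model, n, x0=0.8, seed=None):
    """Euler-Maruyama path on [0, 1] for the four benchmark models."""
    rng = np.random.default_rng(seed)
    dt = 1.0 / n
    dW = np.sqrt(dt) * rng.standard_normal(n)   # Brownian increments
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        if model == "ABM":
            drift, diff = 0.5, 0.2
        elif model == "OU":
            drift, diff = -4.0 * x[i], 0.2
        elif model == "GBM":
            drift, diff = 0.5 * x[i], 0.2 * x[i]
        elif model == "CIR":
            # clip at zero so the square root stays real under discretization
            drift, diff = 0.1 - 0.5 * x[i], 0.2 * np.sqrt(max(x[i], 0.0))
        else:
            raise ValueError(model)
        x[i + 1] = x[i] + drift * dt + diff * dW[i]
    return x
```

For example, `simulate("CIR", 5000)` produces one CIR path observed at 5,001 equidistant points on the unit interval.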
The starting point for constructing the estimator is to fix a generator
for the Gabor frame. We have denoted the generator and its dual by g and
g, respectively. For our purposes, any continuous and compactly supported
function would work.
Figure 1.1: Estimated vs. actual spot volatility. Four panels plot the estimated against the actual spot variance over time: (a) Geometric Brownian Motion (GBM), (b) Cox-Ingersoll-Ross (CIR), (c) Ornstein-Uhlenbeck (OU), (d) Arithmetic Brownian Motion (ABM). [Figure not reproduced in this extraction.]
Table 1.1: Mean integrated square error (MISE) of $v_n(X, t)$.

                        ABM                                        OU
    n      MISE        Sq. Bias    Var          MISE        Sq. Bias    Var
  500      1.30×10⁻⁴   2.86×10⁻⁶   1.27×10⁻⁴    1.43×10⁻⁴   1.19×10⁻⁵   1.31×10⁻⁴
 5000      1.41×10⁻⁵   1.11×10⁻⁶   1.30×10⁻⁵    1.45×10⁻⁵   1.62×10⁻⁶   1.28×10⁻⁵
50000      2.32×10⁻⁶   1.02×10⁻⁶   1.30×10⁻⁶    2.36×10⁻⁶   1.12×10⁻⁶   1.23×10⁻⁶

                        GBM                                        CIR
    n      MISE        Sq. Bias    Var          MISE        Sq. Bias    Var
  500      2.18×10⁻⁴   4.18×10⁻⁶   2.14×10⁻⁴    6.26×10⁻⁵   8.51×10⁻⁷   6.17×10⁻⁵
 5000      2.33×10⁻⁵   1.58×10⁻⁶   2.17×10⁻⁵    6.82×10⁻⁶   6.00×10⁻⁷   6.22×10⁻⁶
50000      4.66×10⁻⁶   1.02×10⁻⁶   3.64×10⁻⁶    1.46×10⁻⁶   6.06×10⁻⁷   8.52×10⁻⁷

Note: The means of the integrated square errors are obtained by averaging over 100 sample paths generated for each model/number-of-observations pair.
From an implementation perspective, using a B-spline makes the construction of a dual frame generator a trivial matter. This is a consequence of Theorems 2.2 and 2.7 in Christensen (2006), which together specify a very simple rule for constructing dual pairs. Let $a > 0$ and $b > 0$ denote translation and modulation parameters, and let $h$ be a B-spline of order $p$. Define the dilation operator $D_c$ as follows:
$$D_c f(x) = c^{-1/2} f(x/c). \qquad (1.61)$$
If $0 < ab \le 1/(2p-1)$, then $D_a h, D_a \tilde h$, where
$$\tilde h(x) = ab\, h(x) + 2ab \sum_{n=1}^{p-1} h(x+n), \qquad x \in \mathbb{R}, \qquad (1.62)$$
is a pair of dual Gabor frame generators. So if we start with a B-spline $h$, then the dual generator will be a finite linear combination of scaled translates of $h$; consequently, the dual generator will be a spline with similar regularity properties. For our simulation, we used a third-order B-spline, motivated by a desire for a generator whose Fourier transform decays like a quadratic polynomial. Specifically, we set
$$h(x) = \begin{cases} x^2/2, & x \in [0,1), \\ (-2x^2 + 6x - 3)/2, & x \in [1,2), \\ (3-x)^2/2, & x \in [2,3), \\ 0, & x \notin [0,3), \end{cases} \qquad (1.63)$$
with $\tilde h$ computed as in (1.62) above. Our choice of the modulation and translation parameters is rather arbitrary; the only constraint is that $0 < ab \le 1/(2p-1) = 1/5$. In our experimentation with different values, performance was about the same across choices satisfying the inequality; we settled on $a = 1/5$ and $b = 1/3$. Ideally $H_n$, the order of the number of frequency-domain shifts, would be selected optimally to minimize the MISE by balancing integrated variance and integrated square bias; this remains an open research question. For the time being we set $H_n$ naively equal to 50.
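As an illustration of rule (1.62), the following sketch evaluates the third-order B-spline of (1.63) and builds its dual generator; the function names are our own, and with $p = 3$, $a = 1/5$, $b = 1/3$ the condition $ab \le 1/5$ holds.

```python
import numpy as np

def bspline3(x):
    """Third-order B-spline of (1.63), supported on [0, 3]."""
    x = np.asarray(x, dtype=float)
    return np.where((0 <= x) & (x < 1), x**2 / 2,
           np.where((1 <= x) & (x < 2), (-2 * x**2 + 6 * x - 3) / 2,
           np.where((2 <= x) & (x < 3), (3 - x)**2 / 2, 0.0)))

def dual_generator(h, a, b, p):
    """Dual generator via rule (1.62): h~(x) = ab*h(x) + 2ab*sum_{n=1}^{p-1} h(x+n).

    Valid whenever 0 < a*b <= 1/(2p - 1).
    """
    assert 0 < a * b <= 1 / (2 * p - 1)
    def h_dual(x):
        x = np.asarray(x, dtype=float)
        out = a * b * h(x)
        for n in range(1, p):
            out = out + 2 * a * b * h(x + n)
        return out
    return h_dual
```

For instance, `dual_generator(bspline3, 1/5, 1/3, 3)` returns a spline supported on $[-2, 3]$, a finite linear combination of translates of `bspline3`, as the text describes.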
The simulation results indicate that the Gabor frame estimator performs satisfactorily. Figure 1.1 displays, for each of the 4 price models (ABM, OU, GBM, and CIR), simulated spot variance sample paths plotted against the spot variance paths produced by the Gabor frame estimator. A visual inspection shows that the estimator produces a relatively good fit even with the naive selection of $H_n$. This claim is further corroborated by the analysis of the mean integrated square error (MISE), the integrated square bias, and the integrated variance summarized in Table 1.1. We found that the variance, estimated in the foregoing manner, is only approximately the difference between the MISE and the integrated square bias; the reported figures for the variance are in fact the difference between the MISE and the integrated square bias. The discrepancy is rather slight and does not materially change the results. In all 4 models, an inverse relationship between the number of observations and the MISE, square bias, and variance may be read off from the table. As was established mathematically, we expect the MISE to vanish as the number of price observations grows without bound.
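The reported decomposition, with the variance taken as the MISE minus the integrated square bias, can be computed from Monte Carlo output along the following lines; this is a sketch, where `estimates` and `truth` are hypothetical arrays holding, row by row, the estimated and true spot-variance paths on a common grid.

```python
import numpy as np

def mise_decomposition(estimates, truth):
    """Monte Carlo MISE, integrated squared bias, and integrated variance.

    estimates : (paths, grid) array of estimated spot-variance paths
    truth     : (paths, grid) array of the corresponding true paths
    Integrals over [0, 1] are approximated by grid averages.
    """
    err = estimates - truth
    mise = np.mean(np.mean(err**2, axis=1))      # average integrated squared error
    bias2 = np.mean(np.mean(err, axis=0)**2)     # integrated squared mean error
    var = mise - bias2                           # variance reported as MISE - bias^2
    return mise, bias2, var
```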
1.6.2 Prices with jumps
We continue our investigation by simulating prices with jumps.
$$X_t = 0.8 + 0.5t + 0.2W_t + \sum_{i=1}^{N} Y_i, \qquad \text{(ABM + JMP)}$$
$$X_t = 0.8 - \int_0^t 4X_s\, ds + \int_0^t 0.2\, dW_s + \sum_{i=1}^{N} Y_i, \qquad \text{(OU + JMP)}$$
$$X_t = 0.8 + \int_0^t 0.5X_s\, ds + \int_0^t 0.2X_s\, dW_s + \sum_{i=1}^{N} Y_i, \qquad \text{(GBM + JMP)}$$
$$X_t = 0.8 + \int_0^t (0.1 - 0.5X_s)\, ds + \int_0^t 0.2\sqrt{X_s}\, dW_s + \sum_{i=1}^{N} Y_i, \qquad \text{(CIR + JMP)}$$
where $N$ is a Poisson random variable with intensity 5 and $Y_i$, $1 \le i \le N$, are normal random variables with mean zero and standard deviation 0.4.
We construct the dual Gabor frames as in the previous subsection using
the third order B-Spline specified in (1.63). With the introduction of jumps
into the simulation, we found that better results may be obtained by varying the parameters $a$, $b$, and $H_n$. We settled on $a = 1/7$, $b = 1/25$, and $H_n = 50$. The jump threshold is obtained by setting $u_n = n^\alpha$, where $\alpha = -0.9$. The results of the simulations are recorded in Table 1.2. We also graph a single observation (path) for each model in Figure 1.2.
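A compound Poisson term of this form can be superimposed on any simulated diffusion path; the sketch below is illustrative only (the function name and the discretization of the jump times are our own assumptions, not taken from the code behind the reported results).

```python
import numpy as np

def add_jumps(x, intensity=5.0, jump_sd=0.4, seed=None):
    """Superimpose a compound Poisson term sum_{i<=N} Y_i on a path x.

    x is a path observed at t_i = i/n on [0, 1]; each jump arrives at a
    uniform time and is added to every observation from that time onward.
    """
    rng = np.random.default_rng(seed)
    n = len(x) - 1
    N = rng.poisson(intensity)                  # number of jumps on [0, 1]
    times = rng.uniform(0.0, 1.0, size=N)       # jump arrival times
    sizes = rng.normal(0.0, jump_sd, size=N)    # jump sizes Y_i ~ N(0, 0.4^2)
    y = x.copy()
    for t, s in zip(times, sizes):
        y[int(np.ceil(t * n)):] += s            # jump shifts all later levels
    return y
```

The truncation threshold $u_n = n^{-0.9}$ then discards the squared increments straddling these jump times when the mesh is fine enough.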
Figure 1.2: Estimated vs. actual spot volatility. Four panels plot the estimated against the actual spot variance over time: (a) GBM + Jump, (b) CIR + Jump, (c) OU + Jump, (d) ABM + Jump. [Figure not reproduced in this extraction.]
Table 1.2: Mean integrated square error (MISE) of $V_n(X, t)$.

                     ABM + JMP                                 OU + JMP
    n      MISE        Sq. Bias    Var          MISE        Sq. Bias    Var
  500      1.53×10⁻⁴   8.95×10⁻⁶   1.44×10⁻⁴    8.51×10⁻⁴   1.31×10⁻⁴   7.20×10⁻⁴
 5000      2.19×10⁻⁵   2.27×10⁻⁶   1.96×10⁻⁵    5.48×10⁻⁵   9.76×10⁻⁶   4.50×10⁻⁵
50000      2.13×10⁻⁶   9.00×10⁻⁸   2.04×10⁻⁶    6.61×10⁻⁶   2.65×10⁻⁶   3.97×10⁻⁶

                     GBM + JMP                                 CIR + JMP
    n      MISE        Sq. Bias    Var          MISE        Sq. Bias    Var
  500      6.13×10⁻³   8.70×10⁻⁴   5.26×10⁻³    3.74×10⁻⁴   2.32×10⁻⁴   1.43×10⁻⁴
 5000      3.42×10⁻⁴   4.07×10⁻⁵   3.02×10⁻⁴    1.12×10⁻⁵   8.29×10⁻⁶   2.95×10⁻⁶
50000      7.11×10⁻⁵   6.36×10⁻⁶   6.47×10⁻⁵    7.05×10⁻⁶   5.64×10⁻⁶   1.40×10⁻⁶

Note: The means of the integrated square errors are obtained by averaging over 50 sample paths generated for each model/number-of-observations pair.
1.7 Empirical illustration - Flash Crash of 2010
On May 6, 2010, the S&P 500 index lost around 9% of its value in a matter
of minutes; the index rebounded to its pre-crash level a few minutes later.
Between 2:32 p.m. EDT and 3:08 p.m. EDT, the index fluctuated 100 points between 1160 and 1060. The erasure of value in the index was so precipitous that the episode has been dubbed the Flash Crash of 2010.
The relatively quick subsequent rebound of the index to its pre-crash level suggests that the crash in the value of the index was likely not spurred by a fundamental change in the intrinsic value of U.S. equities. The deviation of
realized prices from their fundamental or intrinsic values is the hallmark of a
liquidity crash. In this section, we study the trajectory of the spot volatility
of the S&P 500 index prior to, during, and after the Flash Crash of 2010
using our Gabor frame estimator.
Specifically, we sampled the S&P 500 index every 15 seconds during the
hours of 8:30 CT to 15:00 CT from May 5, 2010 to May 7, 2010. This resulted
in a sample of 4562 observations of the index around the time of the crash.
Table 1.3 provides descriptive statistics of the data. We obtained estimates
of realized volatility using Hn = 50, a = 1/5 and b = 1/7. We vary the
threshold parameter un to obtain four estimates of spot volatility. In the
first instance, un is set equal to infinity so that the estimate obtained in this
instance coincides with vn(X, t). The other estimates correspond to Vn(X, t)
with un set equal to 50, 25, and 12.5, respectively.
The estimates are graphed in Figure 1.3. The x-axis of the graph represents trading hours between 8:30 a.m. and 3:00 p.m. normalized to one time unit, so that values between 0 and 1 represent May 5, 2010, and so on through May 7, 2010. Overall, the graphs of all estimates of realized spot volatility look qualitatively similar. In all instances, spot volatility is seen to ascend toward the start of the crisis and to achieve a pronounced peak as prices bottom out in the afternoon of May 6, 2010. A second, smaller peak in the graph of realized volatility indicates that markets remained agitated throughout the following day.
Table 1.3: Descriptive statistics of S&P 500 data at 15-second resolution between May 5th and 7th of 2010.

Variable   Min.     25% perc.   Median   Mean    75% perc.   Max
X          1066     1119        1153     1143    1166        1176
(∆X)²      0.001    0.010       0.063    0.511   0.260       183.6
Figure 1.3: Realized spot volatility of the S&P 500 Index from May 5, 2010 to May 7, 2010. Realized spot volatility is estimated with $H_n = 50$, $a = 1/7$, $b = 1/5$. Trading hours from 8:30 to 15:00 are normalized to one time unit. Four panels plot the S&P 500 level together with the realized spot volatility: (a) $u_n = \infty$, (b) $u_n = 50$, (c) $u_n = 25$, (d) $u_n = 12.5$. [Figure not reproduced in this extraction.]
1.8 Conclusion
We have investigated estimators of the instantaneous volatility of asset prices over entire time windows based on Gabor frame expansions of the realized trajectory of spot volatility. The main practical advantage of this type of estimator is its versatility: once an estimate is obtained, various functionals of instantaneous volatility, such as the ubiquitous integrated volatility, follow immediately. We derived our estimators of global instantaneous volatility under the assumption that the price process is an Itô semimartingale with Lévy jumps. We have also assumed that the densities of the first and second predictable characteristics belong to the localized class of processes with finite fourth moments.

We proposed a preliminary version of the estimator to be used in situations where the assumption of continuous asset prices holds. Under the assumption that observations of the asset price occur at discrete, equidistant intervals with a mesh tending to zero within a fixed time interval, we showed using standard arguments that the estimator converges in probability in $L^2[0,1]$. In the case of discontinuous prices, we modified the basic estimator so that the computation of the Gabor frame coefficients depends on a threshold. The threshold itself is allowed to shrink to zero at a sufficiently slow rate to ensure consistency of the estimator in $L^2[0,1]$.
1.9 Appendix
1.3 Lemma Let the dual Gabor frame generator $\tilde g$ be constructed as in (1.10). If $\omega(g, \delta)$ denotes the modulus of continuity of $g$, i.e. $\omega(g, \delta) := \sup\{|g(t) - g(t')| : t, t' \in \mathbb{R} \text{ and } |t - t'| < \delta\}$, then
$$\omega(\tilde g_{h,k}, \delta) \le C\, \omega(g, \delta), \qquad h, k \in \mathbb{Z},$$
where $C$ is a positive constant.

Proof. $G$ is bounded away from zero. To see this, note that since $g$ has support in $[r, s]$, the series on the left-hand side of (1.11) has finitely many terms for each $t$. In addition, it is straightforward to verify that $G(t) = G(t + b)$ for all $t$; so $G$ is periodic with period $b$. It is also clear that because $g$ is continuous, so is $G$. It follows that $G$ attains its minimum and maximum on any interval of length $b$. Let $I_b$ denote the interval $[(s + r - b)/2, (s + r + b)/2]$; then
$$\min_{t \in \mathbb{R}} G(t) = \min_{t \in I_b} G(t) \ge a^{-1} \min_{t \in I_b} |g(t)|^2.$$
Because $g$ is continuous and does not vanish in $(r, s)$, we conclude that $G_* := \min_{t \in \mathbb{R}} G(t) > 0$. It is also straightforward that $G^* := \max_{t \in \mathbb{R}} G(t) < \infty$.
Now, let $t, t' \in \mathbb{R}$, $t > t'$, be such that $|t - t'| \le \delta$; then
$$|\tilde g(t) - \tilde g(t')| = \big|(G(t)G(t'))^{-1}\big(g(t)G(t') - g(t')G(t)\big)\big| \le G_*^{-2}\big(|g(t)|\,|G(t) - G(t')| + |G(t)|\,|g(t) - g(t')|\big). \qquad (1.64)$$
For a real number $x$, denote by $\lfloor x \rfloor$ the largest integer less than or equal to $x$ and by $\lceil x \rceil$ the smallest integer greater than or equal to $x$. Now, let $A$ denote the set of integers $i$ such that $r < t - ib < s$. By the definition of $g$, $g(t - jb) = 0$ whenever $j \notin A$. Since $b > 0$, $A$ contains at most $\lceil (1 + |s| + |r|)/b \rceil$ elements. Let $\tau := \min\{t - ib : i \in A\}$, i.e., $\tau$ is the smallest $t - ib$ such that $i \in A$. Because $A$ contains at most a finite number of elements, there exists an integer $k$ such that $\tau = t - kb$. Set $\tau' := t' - kb$. It is straightforward to verify that $|\tau - \tau'| \le \delta$ and
$$a\,|G(t) - G(t')| \le \sum_{j=0}^{\lceil (1+|s|+|r|)/b \rceil} \big|g(\tau + jb)^2 - g(\tau' + jb)^2\big| \le \sum_{j=0}^{\lceil (1+|s|+|r|)/b \rceil} |g(\tau + jb) - g(\tau' + jb)|\,|g(\tau + jb) + g(\tau' + jb)| \le 2\lceil (1 + |s| + |r|)/b \rceil\, g^*\, \omega(g, \delta), \qquad (1.65)$$
where $g^* := \max_{t \in \mathbb{R}} |g(t)|$. Returning to (1.64), we see that
$$|\tilde g(t) - \tilde g(t')| \le C_g\, \omega(g, \delta),$$
where $C_g := G_*^{-2}\big(2a^{-1}\lceil (1 + |s| + |r|)/b \rceil (g^*)^2 + G^*\big)$. Now let $h, k \in \mathbb{Z}$; then
$$|\tilde g_{h,k}(t) - \tilde g_{h,k}(t')| = \big|e^{2\pi i h a t}\big(\tilde g(t - kb) - \tilde g(t' - kb)\big)\big| \le |\tilde g(t - kb) - \tilde g(t' - kb)| \le C_g\, \omega(g, \delta). \qquad (1.66)$$
The last inequality follows because translating a function leaves its modulus of continuity unchanged.
Chapter 2
Testing efficiency in small and
large financial markets
2.1 Introduction
An informationally efficient market is often understood to be one “in which
prices always ‘fully reflect’ available information” (Fama, 1969, Page 383).
While this description may be very helpful to the intuition, it leaves out at
least one important ingredient: the probability measure relative to which
prices are fully reflective of available information. If the probability measure
assumed is the physical or statistical measure then this description may have
more in common with the random walk hypothesis than market efficiency.
Indeed, a slightly more rigorous description of market efficiency would require
risk-adjusted asset prices to behave like martingales in the finite horizon,
complete market setting; of course, the risk-adjustment may be swept up
into an equivalent martingale measure Q, so that an alternative description
of market efficiency, assuming finite horizon and market completeness, would
require prices to evolve like martingales relative to a given information set
and an equivalent martingale measure reflecting agent’s preferences and risk
tolerance. This description of market efficiency echoes Malkiel (1991, Page
211)’s take on the subject:
A market is said to be efficient if it fully and correctly reflects
all relevant information in determining security prices. Formally,
the market is said to be efficient with respect to some information
set, φ, if security prices would be unaffected by revealing that
information to all participants. Moreover, efficiency with respect
to an information set, φ, implies that it is impossible to make
economic profits by trading on the basis of φ.
Hence, in an efficient (complete) market, risk-adjusted prices should be mar-
tingales, and it should be impossible to make economically significant profits
by trading on the basis of available information. Moreover, since prices fully
incorporate all available information at all times, there is no discrepancy be-
tween realized asset prices and prices implied by other non-price information.
In other words, since such discrepancies are non-existent, there cannot exist
trading strategies that perform better than buying and holding individual
traded assets.
This latter intuition informs the rigorous characterization of market ef-
ficiency proposed in (Jarrow & Larsson, 2012, Theorem 3.2). The authors
define a price process S as being efficient relative to a reference information set if an economy E, determined by agents' beliefs, endowments, and preferences, and a consumption good price index may be found such that
S corresponds to equilibrium asset prices in E . From this basic definition,
they obtain characterizations in terms of the existence of equivalent martin-
gale measures and in terms of the joint satisfaction of the no free lunch with
vanishing risk (NFLVR) condition and the no dominance (ND) condition.
NFLVR is an “absence of arbitrage” condition that ensures the existence of
an equivalent local martingale measure for S, whereas ND imposes an opti-
mality condition on asset prices.
One of the benefits of characterizing market efficiency in terms of NFLVR
and ND is that the equivalent (risk-neutral) separating probability measure
is taken out of the definition of market efficiency. From the empirical point of view, this suggests a way of testing efficiency without running into the joint-hypothesis issue, since both NFLVR and ND are expressed entirely in terms of the physical or statistical probability measure.
lem essentially describes the unquantifiability of the misspecification error
incurred by specifying an equilibrium asset pricing model or stochastic dis-
count factor as reference for testing market efficiency.
In the present work, we further this line of research by obtaining ad-
ditional characterizations of market efficiency that have the advantage of
simplifying empirical tests. In principle, the no dominance condition has to
be verified for each asset, so that for a market with a large number of assets
testing the ND requirement may prove to be impractical. Our first insight
into the problem comes from the fact that, under the condition of no un-
bounded profit with bounded risk (NUPBR), the i-th asset Si in a market
with n distinct assets satisfies no dominance if and only if the n-dimensional
vector of asset prices S, expressed in units of (γ + (1 − γ)Si), 0 < γ < 1,
does not violate the no arbitrage (NA) condition. Moreover, not only are
convex portfolios of undominated assets necessarily undominated (Delbaen
& Schachermayer, 1997), but the converse is also true (Corollary 2.1). Combining these insights with the fact that NUPBR remains invariant to a change
of numeraire, we obtain a characterization of market efficiency in terms of
NFLVR for S expressed in units of the market portfolio (Proposition 2.4).
From an empirical standpoint, this characterization obviates the need for
direct verification of the no dominance condition.
This reformulation also allows us to employ existing empirical techniques
for testing market efficiency. The empirical tests devised in (Jarrow et al.,
2012) and (Hogan et al., 2004) were originally intended to test for (statistical)
arbitrage strategies. The absence of arbitrage is not sufficient for market
efficiency. It is in fact possible for a given strategy to not be an arbitrage
while violating the ND condition for one or more assets. But since a violation
of the NA requirement for S expressed in units of the market portfolio is
equivalent to a violation of the ND condition for one or more assets, we are
able to repurpose the statistical arbitrage tests of Hogan et al. (2004) to
perform simultaneous verifications of violations of the ND condition for all
assets. And since the NA condition is equivalent to ND for the zeroth asset,
both the NA and ND conditions can be handled within a single test.
We conclude our study of market efficiency by introducing notions of no
dominance and market efficiency to the large financial market setting of Ka-
banov & Kramkov (1994, 1998); Cuchiero et al. (2015). We refer to these
notions as asymptotic no dominance (AND) and asymptotic market efficiency
(AME), respectively. The tests of market efficiency we propose in the stan-
dard small market setting assume the investment horizon is infinite, R+, so
that they may be most appropriately used to test longer horizon strategies.
The large financial market setting makes it possible to study fixed horizon
strategies under the assumption that the number of assets in the market
tends to infinity, much like in the arbitrage pricing theory (APT) framework
of Ross (1976a,b). We obtain a further change of numeraire characterization
and suggest tests for violations of asymptotic market efficiency.
2.2 Efficiency in standard markets
We take as given a filtered probability basis B := (Ω,F ,F := (Ft)t≥0, P )
satisfying the usual conditions. Defined on B we assume an n-dimensional
semimartingale S := (St)t≥0 representing the price process of n assets. We
will refer to the pair (S,B) as a market.
Let λ > 0; we define λ-admissible strategies in the usual manner, i.e.,
n-dimensional predictable processes H such that the stochastic integral H •S
is well-defined, (H • S) ≥ −λ, limt→∞(H • S)t = (H • S)∞ exists, and H0 = 0.
A strategy is said to be admissible if there exists λ > 0 such that it is λ-
admissible. An arbitrage is an admissible strategy H such that (H •S)∞ ≥ 0
almost surely and (H • S)∞ > 0 holds with positive probability. A market
(S,B) is said to satisfy the no arbitrage condition (NA) if it is devoid of
arbitrage strategies. For admissible strategies, the random variable (H •S)∞
is referred to as the terminal value or payoff of strategy H. The payoff
space of 1-admissible strategies is denoted K1. The market (S,B) is said to
satisfy the no unbounded profit with bounded risk condition (NUPBR) if
K1 is bounded in probability, i.e., bounded in L0(B), the space of finite-valued random variables. Now, if every sequence fn ∈ K1 satisfying ‖fn ∧ 0‖∞ → 0
must also satisfy fn → 0 in probability, then the market (S,B) is said to satisfy the no
free lunch with vanishing risk condition (NFLVR).
The NFLVR condition is a strengthening of the NA condition. As a
matter of fact, NFLVR is necessary and sufficient for both NA and NUPBR to
hold (Kabanov, 1996, Lemma 2.2). In the general unbounded semimartingale
case, the NFLVR condition is equivalent to the existence of a probability Q
equivalent to P such that the components of S are stochastic integrals of a
predictable process with respect to a local martingale. The measure Q is said
to be an equivalent σ-martingale measure for S (Delbaen & Schachermayer,
1999, 1.1 Theorem). In the case of locally bounded S, Q is a local martingale
measure for S (Delbaen & Schachermayer, 1994, Corollary 1.2). This is
also the case for non-negative asset prices, i.e. NFLVR is equivalent to the
existence of a probability measure Q, equivalent to P , such that S is a Q
local martingale if S ≥ 0 (Ansel & Stricker, 1994, Corollary 3.5).
The basic intuition of an efficient market relative to an information set
(Ft)t∈[0,T ] (at least in the complete markets, finite horizon case) is that risk-
adjusted prices evolve over time like Ft-martingales. As a result, current
prices represent the best prediction of the future behavior of risk-adjusted
prices. This is the same as saying that any attempt, in the form of a trading
strategy based on current information, to achieve a better outcome, in the
form of superior risk-adjusted returns, than simply buying and holding the
individual traded assets would ultimately prove to be unsuccessful. Note the
close relationship between the available information set and the set of admis-
sible trading strategies. The available information set uniquely determines
the set of admissible trading strategies and vice versa. Hence, in describing
market efficiency, we may speak of trading strategies rather than information sets. Indeed, provided asset prices exist, an alternative characterization of market efficiency may be stated in terms of the no dominance (ND) condition.
2.1 Definition (Jarrow & Larsson (2012)) Given an n-dimensional vector S representing asset prices, the i-th component Si is undominated on the time horizon [0, T ], T < ∞, if there is no admissible strategy H such that

P ((H • S)T ≥ SiT − Si0) = 1 and P ((H • S)T > SiT − Si0) > 0. (2.1)

The market (S,B) is said to satisfy ND if each Si, 0 ≤ i < n, is undominated.
We will assume in the current setting that the investment horizon is the
positive real line, in contrast to the finite horizon setup analyzed in (Jarrow
& Larsson, 2012). This modeling choice is important, since, in this section,
we are primarily interested in devising tests of market efficiency that hold
asymptotically as the investment horizon approaches infinity. We assume
that prices have been rescaled so that Si0 = 1 for 0 ≤ i < n. We also assume
that H0 = 0 for all admissible strategies. Hence, as a slight modification of
the definition of ND given above, we will say that the i-th asset is undomi-
nated if for all admissible strategies H, P ((H • S)∞ ≥ Si∞ − 1) = 1 implies
P ((H • S)∞ = Si∞ − 1) = 1. We now state the definition of market efficiency
in our setting as follows:
2.2 Definition (Market efficiency) Let (S,B) be a market carried on
the filtered probability basis B = (Ω,F ,F, P ). It is said to be efficient if it
satisfies NFLVR and ND.
The above definition adapts the second characterization of market effi-
ciency in (Jarrow & Larsson, 2012, Theorem 3.2(ii)) to our setting where the
time horizon is infinite. Hence, a market is efficient if both NFLVR and ND
are satisfied. Here, our objective is to derive equivalent characterizations of
market efficiency that may be more suitable for empirical analysis. In the
sequel, we will assume that the vector of prices S is expressed in terms of the
asset occupying the zeroth position, so that S0t = 1 for t ≥ 0, and that all
other asset prices are non-negative, so that Sit ≥ 0 for all t ≥ 0 and 0 ≤ i < n.
Let 0 < γ < 1 and denote by Sγ,i the n + 1 dimensional vector obtained by appending the scalar process Sγ,i := γ + (1 − γ)Si to the end of S, i.e. Sγ,i := (S, Sγ,i). Now set

Zγ,i := Sγ,i(Sγ,i)−1. (2.2)

That is, Zγ,i expresses the price process S in units of the convex portfolio consisting of the zeroth asset and the i-th asset, Sγ,i.
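The effect of the change of numeraire in (2.2) can be illustrated numerically. The sketch below is a minimal illustration under assumed dynamics (a two-asset binomial market with S0 ≡ 1 and a martingale S1; all parameter values are hypothetical): the component Z1 of Zγ,1 is not itself a P-martingale, but its expectation is restored to its initial value under the measure with density Sγ,1∞, in line with the local martingale measures of Proposition 2.1 below.

```python
import random

rng = random.Random(0)

# Hypothetical two-asset market: asset 0 is the constant numeraire (S0 = 1)
# and asset 1 a multiplicative binomial martingale under P (p*u + (1-p)*d = 1).
u, d = 1.1, 0.95
p_up = (1.0 - d) / (u - d)
n_paths, n_steps = 100_000, 20

def terminal_s1():
    s = 1.0
    for _ in range(n_steps):
        s *= u if rng.random() < p_up else d
    return s

gamma = 0.5
ep_Z1 = 0.0  # E_P[Z1_T] under the physical measure P
eq_Z1 = 0.0  # E_Q[Z1_T] under dQ = S_gamma_T dP (note S_gamma_0 = 1)
for _ in range(n_paths):
    s1 = terminal_s1()
    s_g = gamma + (1.0 - gamma) * s1      # numeraire S^{gamma,1}_T
    z1 = s1 / s_g                         # component i = 1 of Z^{gamma,1}_T
    ep_Z1 += z1 / n_paths
    eq_Z1 += s_g * z1 / n_paths           # Radon-Nikodym weight S^{gamma,1}_T

print(f"E_P[Z1_T] = {ep_Z1:.3f}  (below Z1_0 = 1: Z1 is not a P-martingale)")
print(f"E_Q[Z1_T] = {eq_Z1:.3f}  (expectation restored under the new measure)")
```

The weighting by Sγ,1∞ is the elementary mechanism behind the equivalent local martingale measures for Zγ,i that appear throughout this section.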
2.1 Proposition If the market (S,B) is efficient then (Zγ,i,B) admits a
local martingale measure for all 0 ≤ i < n and 0 < γ < 1.
Proof. Recall that a σ-martingale measure, under which Zγ,i may be ex-
pressed as a stochastic integral with respect to a local martingale, coincides
with a local martingale measure for non-negative Zγ,i (Ansel & Stricker, 1994,
Corollary 3.5). Hence, it is only required to demonstrate NFLVR, which in
turn is equivalent to both NA and NUPBR (Delbaen & Schachermayer, 1994,
Corollary 3.8). Suppose (S,B) satisfies NFLVR and ND while (Zγ,i,B) fails
to satisfy NA for some i and 0 < γ < 1, so that there exists an admissible
strategy H for Zγ,i such that (H •Zγ,i)∞ is non-negative and strictly positive
with positive probability. We will argue as in (Delbaen & Schachermayer,
1995, Theorem 11). By rescaling H it may be assumed that H is 1-admissible
for Zγ,i. Consider the gain process
Y := (1− γ)−1(Sγ,i(H • Zγ,i + 1)− 1). (2.3)
Since Sγ,i is strictly positive and H is 1-admissible for Zγ,i, we have that
Yt ≥ −(1 − γ)−1 for t ≥ 0. Also, because H0 = 0 and Sγ,i0 = 1, we have
that Y0 = 0. Since H is an arbitrage for Zγ,i, we have that Y∞ is at least
as great as (1 − γ)−1(Sγ,i∞ − 1) = Si∞ − 1, with the inequality holding strictly
with positive probability. ND will be violated for Si if Y is representable as
a stochastic integral with respect to S. This follows from an application of
Itô’s integration by parts formula. Write H =: (Ha, Hb) where Ha denotes
the first n components of H and Hb its n+ 1-st (last) component. Then
(1− γ)Y + 1 = H • Sγ,i • Zγ,i +H • Zγ,i • Sγ,i +H • [Sγ,i, Zγ,i] + Sγ,i
= H • Sγ,i + Sγ,i
= Ha • S +Hb • Sγ,i + Sγ,i.
So that Y = (1−γ)−1Ha •S+(Hb+1) •Si, which may be expressed as K •S,
where K = (1−γ)−1Ha+ In,b,i and In,b,i is the n dimensional vector which is
zero everywhere except in the i-th position where it is equal to Hb+1. Hence,
Si is dominated by the (1 − γ)−1-admissible strategy K. This contradicts
the efficiency of (S,B).

We note that the result that the NUPBR condition is invariant to a change
of numeraire in the finite horizon setting is proved in (Takaoka & Schweizer,
2014, Proposition 2.7 (ii)) using functional analytic methods. Here, we establish the claim using more elementary arguments. To that end, suppose (S,B) is efficient while (Zγ,i,B) fails to satisfy the NUPBR condition. In that case
there exists a sequence (Hm)m≥1 of 1-admissible strategies for Zγ,i and β > 0
such that given N ∈ N if m is sufficiently large then P ((Hm•Zγ,i)∞ > N) > β.
Denote
Y m := Sγ,i(Hm • Zγ,i + 1)− 1.
It is easily verified that Y m0 = 0 and Y mt ≥ −1 for t ≥ 0. Moreover, Y m∞ ≥ γ((Hm • Zγ,i)∞ + 1) − 1, so that (Y m∞)m≥1 is an unbounded sequence in L0(B). Indeed, for N ∈ N, P (Y m∞ > N) ≥ P ((Hm • Zγ,i)∞ > γ−1(N + 1) − 1), which
for sufficiently large m exceeds β. NUPBR for (S,B) will be violated as soon
as Y m is shown to be representable as a stochastic integral with respect to
S. But this follows, as in the previous paragraph, from Itô’s integration by
parts formula.
We now establish the converse to the previous claim.
2.2 Proposition If (Zγ,i,B) admits a local martingale measure for every
0 < γ < 1 and 0 ≤ i < n then (S,B) is efficient.
Proof. Suppose (Zγ,i,B) admits a local martingale measure for every i and
0 < γ < 1 while (S,B) fails to satisfy NUPBR. Then there is Hm 1-admissible
for S and β > 0 such that for sufficiently large m, P ((Hm • S)∞ > N) > β
for all N ∈ N. Consider
Y m := (Sγ,i)−1(Hm • S + 1)− 1.
Y mt is well-defined because Sγ,it is strictly positive; Y mt ≥ −1 because Hm is 1-admissible for S; and Y m0 = 0 because Hm0 = 0 and Sγ,i0 = 1. We note
that the existence of a local martingale measure for Zγ,i is equivalent to
NFLVR. Under NFLVR (H • Zγ,i)∞ exists and is finite-valued for admissible
strategies (Delbaen & Schachermayer, 1994, Theorem 3.3). In particular Zγ,i∞
and therefore (Sγ,i∞)−1, for all i, is well-defined and finite-valued. Note that (Sγ,i∞)−1 cannot be zero with positive probability, since this would imply that P (Si∞ = ∞) > 0, which would contradict the almost sure finiteness of Zγ,i∞.
Hence, 0 < (Sγ,i∞)−1 ≤ γ−1 almost surely, so that there is c > 0 sufficiently small such that P ((Sγ,i∞)−1 ≤ c) ≤ β/2. Hence,

P (Y m∞ > N) > P ((Hm • S)∞ > c−1(N + 1) − 1) − P ((Sγ,i∞)−1 ≤ c),

which for sufficiently large m is larger than β/2. That is, (Y m∞)m≥1 is unbounded in L0(B).

Now let Km := (Hm, 0) denote the n+1-dimensional predictable process
obtained by appending 0 to Hm. It is easily seen that Km • Sγ,i = Hm • S.
So that Y m may be written alternatively as (Sγ,i)−1(Km • Sγ,i + 1) − 1. By
Itô’s integration by parts formula and the fact that Zγ,i = (Sγ,i)−1Sγ,i, we
have Y m = (Km + In+1,0) • Zγ,i, where In+1,0 denotes the n+ 1 dimensional
vector with zeros everywhere except in the zeroth position where there is a 1.
Thus, (Y m∞ )m≥1 is generated by 1-admissible strategies for Zγ,i and therefore
constitutes a violation of the NUPBR condition for Zγ,i.
We note that the NA condition is simply the ND condition for the zeroth
asset, so that demonstrating ND for 0 ≤ i < n is all that is required. To that
end, suppose there is an i and a c-admissible, c > 0, strategy H for S such
that (H • S)∞ ≥ Si∞ − 1 holds, almost surely, with the inequality holding
strictly on a set of positive probability. Observe that this implies that
(1 − γ)(H • S)∞ ≥ (1 − γ)(Si∞ − 1) = Sγ,i∞ − 1 (2.4)
holds almost surely with the inequality holding strictly on a set with positive
probability. Set K := (H, 0) and note that H •S = K • Sγ,i for any 0 < γ < 1;
fix one such γ and define
Y := (Sγ,i)−1((1− γ)K • Sγ,i + 1)− 1.
It is easily seen that Y0 = 0 and easily verified that Yt ≥ γ−1(1 − γ)(1 − c)
for t ≥ 0. It follows from (2.4) that P (Y∞ ≥ 0) = 1 and P (Y∞ > 0) > 0. It
follows by the stochastic integration by parts formula that Y = ((1− γ)K +
In+1,0) •Zγ,i =: J •Zγ,i. Hence, J constitutes a violation of the NA condition
for Zγ,i. This completes the demonstration.
The previous two Propositions may be summarized as follows:
2.3 Proposition The market (S,B) is efficient if and only if (Zγ,i,B) admits an equivalent local martingale measure for each 0 ≤ i < n and 0 < γ < 1. In particular, under NUPBR, the i-th asset Si is undominated if and only if (Zγ,i,B) satisfies NA for all 0 < γ < 1. Moreover, (S,B) satisfies NUPBR if and only if (Zγ,i,B) satisfies NUPBR.
It is easy to see that if Proposition 2.3 holds for one γ ∈ (0, 1) then it
must hold for all 0 < γ < 1. Hence, we may restate the claim of that Propo-
sition using equally weighted portfolios of the numeraire asset and the i-th
asset. These results make it somewhat easier to test for efficiency by lever-
aging econometric techniques designed for testing arbitrage and unbounded
profit opportunities as opposed to attempting to test for the no dominance
condition directly. Still, a market with n assets would require n + 1 tests to verify efficiency. The U.S. equities market comprises more than five
thousand stocks, so that, in principle, a verification of market efficiency in
the U.S. equities market would require as many as five thousand separate
tests. The following characterization of market efficiency simplifies the task
considerably by reducing the number of tests to just two: NA and NUPBR.
First, we introduce some helpful notation. Let

α = (α0, · · · , αn−1) (2.5)

be an n-dimensional vector of real numbers such that αi > 0 and ∑n−1i=0 αi = 1, so that α is a weight vector. Define Sα := α · S = ∑n−1i=0 αiSi, i.e. Sα is the weighted sum of the n asset prices, and it is interpreted as the value process of the market portfolio computed using the weight vector α. Next, denote by Sα the n + 1 dimensional price vector obtained by appending Sα to S, i.e. Sα := (S, Sα). Denote

Zα := Sα(Sα)−1, (2.6)

so that Zα is a change of numeraire that restates S in units of the market portfolio. We now have the following:
2.4 Proposition The market (S,B) is efficient if and only if (Zα,B) admits an equivalent local martingale measure for all strictly positive weight vectors α.
Proof. Suppose, (Zα,B) admits a local martingale measure while there exists
a c-admissible strategy, c > 0,
H := (H0, · · · , Hn−1)
for S and at least one 0 ≤ k < n such that (H • S)∞ ≥ Sk∞ − 1 holds almost
surely, with the inequality holding strictly with positive probability. Denote
α−k the vector obtained by substituting 0 for the k-th coordinate of α. Set
K := α−k + αkH and observe that (K • S)∞ ≥ Sα∞ − 1, almost surely, with
the inequality holding strictly on a set of positive measure. Set J := (K, 0)
and note that J • Sα = K • S. Now consider
Y = (Sα)−1(J • Sα + 1)− 1.
Because J0 = 0 and Sα0 = 1, we have Y0 = 0. Because H is c-admissible and 0 < (Sαt)−1 ≤ (α0)−1, we have Yt ≥ (α0)−1(1 − α0)(1 − c), and P (Y∞ ≥ 0) = 1 with P (Y∞ > 0) > 0 because H • S dominates Sk. By the stochastic
integration by parts formula, Y = (J+In+1,0) •Zα. That is, Y is an arbitrage
for Zα.
Now suppose (Hm)m≥1 violates NUPBR for S. Then there is β > 0
such that for all N ∈ N, P ((Hm • S)∞ > N) > β for sufficiently large
m. Let Y m = (Sα)−1(Km • Sα + 1) − 1, where Km = (Hm, 0). It is easy to see that Y m0 = 0 and Y mt ≥ −1. Under the assumption of NFLVR, (Sα∞)−1 is well-defined, finite-valued, and contained in (0, (α0)−1] (Delbaen & Schachermayer, 1994, Theorem 3.3). Hence, there is a sufficiently small c > 0 such that P ((Sα∞)−1 > c) > 1 − β/2. Hence, for sufficiently large m,

P (Y m∞ > N) > P ((Km • Sα)∞ > c−1(N + 1) − 1) − P ((Sα∞)−1 ≤ c),
which eventually exceeds β/2. Using Itô’s integration by parts formula, it
may be easily seen that Y m is expressible as a stochastic integral with respect
to Zα.
Now suppose (S,B) is efficient but for some α, (Zα,B) admits an arbi-
trage, so that there is a 1-admissible H such that (H • Zα)∞ ≥ 0 almost surely, with the inequality holding strictly with positive probability. Then it is easily
verified, arguing as in the previous paragraphs, that Y := (Sα)(H •Zα+1)−1
is equal to K • S where K is 1-admissible for S. Because H is an arbitrage
for Zα, we have that Y∞ = (K • S)∞ ≥ Sα∞ − 1, with the inequality hold-
ing strictly with positive probability. Hence, Sα is dominated by K. That
ND fails for at least one asset now follows from (Delbaen & Schachermayer, 1997, Proposition 2.12). Indeed, denote J := α−1n−1(K − ∑n−2i=0 αi) and observe that (J • S)∞ ≥ Sn−1∞ − 1, almost surely, with the inequality holding strictly with positive probability. By the non-negativity of Sn−1, we also have that (J • S)∞ ≥ −1; by Proposition 2.11 of Delbaen & Schachermayer (1997), (J • S)t ≥ −1 on R+, so that J is 1-admissible. This contradicts the no dominance assumption on Sn−1.
Now, if (Hm)m≥1 is a violation of NUPBR for Zα then (Y m)m≥1, where
Y m = (Sα)(Hm • Zα + 1)− 1 violates NUPBR for S.
As a corollary to the previous claim, we now have the following:
2.1 Corollary Under the assumption of NUPBR for S, (S,B) is efficient
if and only if Sα is undominated for every strictly positive weight vector α.
Proof. The necessity of the claim follows as in Proposition 2.4. On the other
hand if Si is dominated by H then K • S, where K = α−i + αiH and α−i is
the portfolio weight α with 0 substituted for αi, dominates Sα.
The next result shows that the choice of weight vector is irrelevant.
2.2 Corollary Let α be a strictly positive weight vector. (Zα,B) satisfies
NFLVR if and only if (Zκ,B) satisfies NFLVR for all strictly positive weight
vectors κ.
Proof. Sufficiency is obvious. Suppose (Zκ,B) fails to satisfy NUPBR for a
strictly positive weight vector κ. Let (Hm)m≥1 denote the sequence yielding
unbounded profits in the market (Zκ,B). Then using familiar arguments, it
is easily verified that Y m := (Sκ)(Hm • Zκ + 1)− 1 constitutes a violation of
NUPBR for (S,B). By Proposition 2.4, (Zα,B) cannot satisfy NFLVR.
Suppose (Zκ,B) satisfies NUPBR but fails to satisfy NA. Then arguing
as in Proposition 2.4, it is easily seen that Sκ is dominated. By Corollary
2.1, (S,B) cannot be efficient. By Proposition 2.4, (Zα,B) cannot satisfy
NFLVR.
The use of the market portfolio Sα as numeraire in Proposition 2.4 is sug-
gestive of the role played by the “market” portfolio in the Stochastic Portfolio
Theory (SPT) of Fernholz (2002). In that setting, there are n assets/stocks
with strictly positive prices; each stock is normalized so that there is only
one stock outstanding; consequently, the price of the i-th stock Xi coincides with its capitalization. The time-t total capitalization of the market is given by Xt := ∑n−1i=0 Xit, and the relative capitalization of each stock at time t is given by µi(t) := Xit(Xt)−1, 0 ≤ i < n. In SPT, market inefficiencies are exploited, i.e. situations in which only NUPBR holds but NA fails for the market relative to the numeraire X. Our work contributes a large financial market and/or infinite time horizon view on this phenomenon.
The next characterization of market efficiency is perhaps the most in-
tuitive. It states that in a complete market setting, a market is efficient if
and only if there exists Q equivalent to P such that all asset prices are uni-
formly integrable Q martingales. Hence, not only must risk-adjusted prices
be unpredictable, they must also have constant unconditional risk-adjusted
expectation across time and at infinity. This result is the infinite horizon
counterpart of (Jarrow & Larsson, 2012, Theorem 3.2 (iii)).
2.5 Proposition The market (S,B) is efficient if and only if there exists
an equivalent local martingale measure Q for S such that S is a uniformly
integrable martingale under Q.
Proof. The claim follows from (Delbaen & Schachermayer, 1995, Theorem
13). Indeed, by Proposition 2.4, efficiency is equivalent to (Zα,B) admitting a
local martingale measure. Let Q′ be a local martingale measure for Zα. Since
0 < (Sα)−1 ≤ (α0)−1 on R+, it follows in particular that (Sα)−1 is a uniformly
integrable martingale under Q′ (Protter, 2004, Theorem 51). Define dQ =
(Sα∞)−1dQ′. That Sα is uniformly integrable follows from the fact that for
all stopping times τ , EQ(Sατ ) = 1. Since 0 ≤ αiSi ≤ Sα, we have that Si is
uniformly integrable as well. Moreover, Sα = Zα(Sα) is a Q local martingale
(He et al., 1992, Theorem 12.12). In particular, S is a Q local martingale.
Hence, Q is an equivalent local martingale measure for S.
Suppose Q is a uniformly integrable martingale measure for S. Then
Q doubles as a local martingale measure for S, so that NFLVR is satisfied
(Delbaen & Schachermayer, 1999, Theorem 1.1). It remains to show prices
are not dominated. Suppose there is K admissible for S such that
P ((K • S)∞ ≥ Si∞ − 1) = 1, (2.7)
with the inequality holding strictly with positive probability for some 0 ≤ i < n. Because S is a local martingale under Q, it follows that K • S is
a σ-martingale, so that by (Ansel & Stricker, 1994, Corollary 3.5) it is a
local martingale. Moreover, since it is bounded below, K • S is a super-
martingale. Hence, under the assumption of uniformly integrable Si, we
have EQ((K • S)∞ − (Si∞ − 1)) = EQ((K • S)∞) ≤ EQ((K • S)0) = 0. Since
(2.7) holds and P ∼ Q, it must be the case that P ((K •S)∞ = Si∞−1) = 1.
According to Proposition 2.5, efficiency requires an equivalent probability measure Q under which prices are not merely martingales but uniformly integrable martingales. Indeed, the following counterexample, taken from (Delbaen & Schachermayer, 1999), shows that the martingale property alone is not sufficient.
2.1 Example Let (εm)m≥1 be an independent and identically distributed Bernoulli sequence taking the values 1 and −1 with equal probability under P . Let c denote a real number satisfying 0 < c < 1 and define the price process (Sm)m≥1 recursively as follows: S0 = 1 and Sm = c if εm = 1, and Sm = 2Sm−1 − c otherwise. Now consider an economy with two assets (1, S). Denote Fm the σ-algebra generated by ε1, . . . , εm and observe that E(Sm|Fm−1) = 1/2(c + 2Sm−1 − c) = Sm−1, so that (1, S) is a martingale for (Fm)m≥1 under P .
Meanwhile, note that S∞ = c < 1 almost surely since the probability of all
occurrences of εm being -1 is zero. Hence, S is strongly dominated by 1. That
is (1, S) fails to satisfy ND and, therefore, market efficiency even though it
is a martingale.
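The behavior in Example 2.1 is straightforward to simulate. The following sketch (with the hypothetical choices c = 0.5 and horizons 10 and 60) illustrates both halves of the argument: at a short horizon the rare never-absorbed paths keep the empirical mean of Sm near 1, reflecting the martingale property, while at a longer horizon essentially every path has been absorbed at c, reflecting S∞ = c and the dominance of S by the constant asset.

```python
import random

rng = random.Random(1)
c = 0.5  # the constant 0 < c < 1 of Example 2.1 (hypothetical choice)

def terminal(n_steps):
    # S_0 = 1; S_m = c if eps_m = +1, else S_m = 2*S_{m-1} - c.
    s = 1.0
    for _ in range(n_steps):
        s = c if rng.random() < 0.5 else 2.0 * s - c
    return s

n_paths = 100_000

# Martingale property: E(S_m) = 1 for every m.  At a short horizon the rare
# never-absorbed paths (value 2**m * (1 - c) + c) still appear in the sample
# and pull the empirical mean back toward 1.
mean_10 = sum(terminal(10) for _ in range(n_paths)) / n_paths

# Absorption: once eps_m = +1 occurs, S stays at c (since 2c - c = c).  At a
# long horizon essentially every path is absorbed, illustrating S_inf = c < 1
# almost surely, i.e. S is dominated by the constant asset 1.
frac_absorbed = sum(terminal(60) == c for _ in range(n_paths)) / n_paths

print(f"empirical mean of S_10:          {mean_10:.3f}")
print(f"fraction of paths with S_60 = c: {frac_absorbed:.5f}")
```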
2.2.1 Statistical inference for market efficiency
In the asset management industry, an arbitrage is often understood, at least
implicitly, as a trading strategy capable of generating positive expected excess
return beyond the level implied by its exposure to a set of risk factors. The
set of risk factors is often the return of a market index such as the S&P
500 together with the size and value factors of Fama & French (1993). This
excess positive return beyond the level prescribed by the benchmark index or
factors is often denoted α and the strategy as a whole is often referred to as
an alpha. The economic appeal of an arbitrage is the possibility of achieving
positive excess returns while incurring a less than commensurate amount of
risk.
In other words, an arbitrage is a free lunch. Clearly, the free lunch inter-
pretation of an arbitrage only makes sense to the extent that the benchmark
factors accurately represent the sources of systematic risk present in the econ-
omy. As a case in point, a strategy based on the “small size effect” (Banz,
1981) produces positive alpha when systematic risk is proxied with the return
on a market index; of course, the positive alpha vanishes in the multi-factor
model of Fama & French. Hence, a true determination of an alpha, at least in
the multi-factor framework, is only possible if the underlying risk factors are
known and measurable with accuracy. Another way to state the same thing
is to consider the fact that in an exponentially affine multi-factor framework,
the logarithm of the Radon-Nikodym derivative of the risk-neutral measure,
is given by

m = a + ∑ki=1 bifi,

where a and bi are constants and fi, 0 < i ≤ k, is a systematic/priced risk factor. Hence, a choice of (fi)0<i≤k may be viewed as expressing an opinion about m, or indeed about the risk-neutral measure Q, since

Q(A) = ∫A exp(m) dP (2.8)

for all events A.
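Equation (2.8) is an exponential tilting of P. As a minimal numerical illustration with a single hypothetical Gaussian factor (k = 1, b1 = 0.5, and a = −b1²/2 chosen so that EP[exp(m)] = 1, making Q a probability measure), a Q-probability can be computed from P-samples by reweighting with exp(m):

```python
import math
import random

rng = random.Random(2)

# One hypothetical priced factor f ~ N(0,1) under P; pricing kernel
# m = a + b*f with a = -b**2/2, so that E_P[exp(m)] = 1 and Q is a probability.
b = 0.5
a = -b * b / 2.0

n = 200_000
q_prob = 0.0  # Monte Carlo estimate of Q(f > 0) = E_P[exp(m) * 1_{f > 0}]
for _ in range(n):
    f = rng.gauss(0.0, 1.0)
    if f > 0:
        q_prob += math.exp(a + b * f) / n

# Under the tilted measure Q the factor is N(b, 1), hence Q(f > 0) = Phi(b).
phi_b = 0.5 * (1.0 + math.erf(b / math.sqrt(2.0)))
print(f"Monte Carlo Q(f > 0) = {q_prob:.3f}; exact Phi(b) = {phi_b:.3f}")
```

The reweighting step makes concrete how a choice of factors (fi) pins down an opinion about Q, and hence how an error in that choice contaminates any test built on it.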
In practice, the pricing kernel m is unobservable so that the choice of risk
factors (fi)0<i≤k is subject to error; indeed the choice of a linear relationship
itself is subject to error. A means by which the misspecification error may be
sidestepped is suggested by the local martingale characterization of market efficiency, i.e. a market is efficient if NA, ND, and NUPBR hold. These conditions are expressed in terms of the physical
measure, so that, by formulating empirical tests based on these concepts the
misspecification error inherent in trying to estimate the pricing kernel m may
be avoided.
By Proposition 2.4, if the aim is to study efficiency in the market (S,B), then it may prove more practical to first perform a change of numeraire, using the market portfolio as the new numeraire, and then to study the market (Zα,B). This approach has the benefit of obviating the need to
perform the ND test for each asset, since each violation of ND translates into
a violation of NA for Zα. To that end, we have the following result.
2.6 Proposition Let (Hm)m≥1 be a sequence of admissible simple strategies for Zα, that is, Hm admits the representation Hm = ∑nmi=1 ζiI⟦τi−1,τi⟧, where nm ↑ ∞, τi is a stopping time, and ζi is Fτi−1-measurable. Further suppose that

EP ((V mτm)2) < ∞,

where V mτm := (Hm • Zα)τm. Suppose there is an admissible strategy H for Zα such that Hm • Zα converges to H • Zα uniformly on compacts in probability (ucp). Then H constitutes a violation of NA for Zα if and only if

limm EP (V mτm) > β for some β > 0, (2.9)
limm P (V mτm < 0) = 0. (2.10)

Moreover, if (Hm)m≥1 denotes a sequence of 1-admissible simple strategies for Zα such that

EP (V mτm) < ∞,

then (Hm)m≥1 constitutes a violation of NUPBR for Zα if and only if

limm EP (V mτm) = ∞. (2.11)
Proof. These statements follow directly from the definitions of NA and NUPBR.
The simplest way to verify (2.9), (2.10), and (2.11) is probably to specify a parametric model for the incremental payoffs of Hm. This is the approach taken in Jarrow et al. (2012) to study statistical arbitrage opportunities. Let
(εi)1≤i≤nm denote a sequence of independent standard normal variables and
define
∆V mτi := V mτi − V mτi−1 = µiθ + σiγεi, (2.12)
where µ, θ, σ, and γ are constants. This specification is the unconstrained mean (UM) model of Hogan et al. (2004); this basic setup may be modified to accommodate more complicated behaviors such as correlated errors and coefficients that change from one small market to the next. Observe that V mτm is normally distributed with mean µ∑nmi=1 iθ and variance ∑nmi=1(σiγ)2. The log-likelihood function is given by

L(Θ) := −2−1 ∑nmi=1 log(σiγ)2 − (2σ2)−1 ∑nmi=1 i−2γ(∆V mτi − µiθ)2,

where Θ := (µ, θ, σ, γ). The parameter vector may be estimated in the usual manner by setting the gradient of L(Θ) to zero and solving a system of four equations in four unknowns to obtain an estimate Θ̂ := (µ̂, θ̂, σ̂, γ̂).
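A minimal sketch of the UM specification (2.12), with hypothetical parameter values: it simulates incremental payoffs and evaluates the log-likelihood L(Θ). For brevity the likelihood is only evaluated at two candidate parameter vectors rather than maximized; in practice Θ̂ would be obtained by numerical optimization (e.g. a quasi-Newton routine) or by solving the first-order conditions as described above.

```python
import math
import random

rng = random.Random(3)

# Simulate incremental payoffs under the UM model (2.12):
#   Delta V_i = mu * i**theta + sigma * i**gamma * eps_i,  eps_i iid N(0,1).
true = dict(mu=0.1, theta=0.3, sigma=0.5, gamma=0.1)  # hypothetical values
n = 5_000
dV = [true["mu"] * i ** true["theta"]
      + true["sigma"] * i ** true["gamma"] * rng.gauss(0.0, 1.0)
      for i in range(1, n + 1)]

def loglik(mu, theta, sigma, gamma):
    # L(Theta) = -(1/2) sum_i log(sigma**2 * i**(2*gamma))
    #            - 1/(2*sigma**2) sum_i i**(-2*gamma) * (dV_i - mu*i**theta)**2
    out = 0.0
    for i, dv in enumerate(dV, start=1):
        var_i = (sigma * i ** gamma) ** 2
        out -= 0.5 * math.log(var_i) + (dv - mu * i ** theta) ** 2 / (2.0 * var_i)
    return out

L_true = loglik(**true)
L_no_drift = loglik(mu=0.0, theta=0.3, sigma=0.5, gamma=0.1)  # impose mu = 0
print(f"log-likelihood at the true parameters: {L_true:.1f}")
print(f"log-likelihood with mu forced to 0:    {L_no_drift:.1f}")
```

With a positive drift in the simulated payoffs, the likelihood at the true Θ clearly exceeds the no-drift restriction; a likelihood-ratio statistic built from this difference is one way of testing µ > 0.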
Now observe that if both µ and θ are positive then (Hm)m≥1 constitutes a
violation of the NUPBR condition for Zα. If µ > 0, γ < 0, and θ is sufficiently
large then (Hm) converges to an arbitrage for Zα and, by Proposition 2.4,
a violation of market efficiency for (S,B). In (Hogan et al., 2004, Theorem
6), it is shown that θ > (γ − 1/2) ∨ −1 is sufficient to ensure convergence
to an arbitrage. The above considerations are summarized in the following
Proposition.
2.7 Proposition Under the assumptions of Proposition 2.6, if the incremental payoffs of Hm satisfy (2.12), then the null hypothesis of market efficiency may be rejected with 1 − α confidence if either one of the joint tests

1. H1 : µ > 0 and H2 : θ > 0, or
2. H′1 : γ < 0, H′2 : µ > 0, H′3 : θ > (γ − 1/2) ∨ −1

achieves a combined p-value of less than α.
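Since the null of market efficiency is rejected only when every sub-hypothesis of a joint test is significant, one conservative way to form the combined p-value is an intersection–union test: compute a one-sided p-value for each sub-hypothesis and take the maximum. The sketch below uses asymptotic z-tests with hypothetical estimates and standard errors; it is an illustrative convention, not the exact procedure of Hogan et al. (2004).

```python
import math

def phi(z):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def one_sided_p(estimate, se, null_value=0.0, alternative="greater"):
    # Asymptotic z-test p-value for H_a: parameter > null_value (or <).
    z = (estimate - null_value) / se
    return 1.0 - phi(z) if alternative == "greater" else phi(z)

# Hypothetical point estimates and standard errors for (mu, gamma, theta).
p_mu = one_sided_p(0.012, 0.004)                        # H'_2: mu > 0
p_gamma = one_sided_p(-0.35, 0.10, alternative="less")  # H'_1: gamma < 0
# H'_3: theta > (gamma - 1/2) v -1; with gamma_hat = -0.35 the bound is -0.85.
p_theta = one_sided_p(0.20, 0.30, null_value=-0.85)

# Intersection-union test: reject efficiency only if every sub-test rejects,
# so the combined p-value is the maximum of the individual p-values.
p_combined = max(p_mu, p_gamma, p_theta)
print(f"p-values: mu {p_mu:.4f}, gamma {p_gamma:.4f}, theta {p_theta:.4f}")
print(f"combined (max) p-value: {p_combined:.4f}")
```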
It is worth noting that since these tests involve the specification of a
model for the incremental payoffs of the target strategies, they are subject
to misspecification errors. Hence, these tests also involve testing a joint-
hypothesis. The advantage of the current tests over traditional tests that
require the specification of a model for the stochastic discount factor is that
the misspecification error incurred in the tests proposed here may be ana-
lyzed and tested; this is so because they only require observable (at least
at discrete times) data: prices and portfolio returns. This is in contrast to the “unmeasurable” misspecification error incurred in traditional tests, which
rely on estimates of unobservable quantities such as the stochastic discount
factor underlying the market.
Moreover, the incremental payoff specification in (2.12) is just one exam-
ple. Another reasonable model that may be analyzed by maximum likelihood
methods would involve modeling incremental payoffs as the sum of an expo-
nential random variable and a Gaussian random variable. The positivity of
the volatility of the Gaussian component can then be tested as m tends to
infinity to verify violations of the no arbitrage condition. Clearly, the model
that is ultimately selected would depend on how well it fits the data being
analyzed.
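For illustration, the exponential-plus-Gaussian specification corresponds to the exponentially modified Gaussian distribution, so a maximum likelihood fit can be sketched with off-the-shelf routines; the payoff sample and parameter values below are simulated and purely hypothetical:

```python
import numpy as np
from scipy.stats import exponnorm

# Simulate incremental payoffs as exponential + Gaussian noise
# (hypothetical parameters: exponential mean 1.0, Gaussian sd 0.5).
rng = np.random.default_rng(0)
payoffs = rng.exponential(1.0, size=5000) + rng.normal(0.0, 0.5, size=5000)

# Maximum likelihood fit of the exponentially modified Gaussian;
# K = (exponential mean) / (Gaussian sd), scale = Gaussian sd.
K, loc, scale = exponnorm.fit(payoffs)

sigma_hat = scale      # estimated Gaussian volatility
tau_hat = K * scale    # estimated exponential mean
print(sigma_hat, tau_hat)
```

A strictly positive estimate of the Gaussian scale is what the volatility test described above would examine; formal inference would, of course, require standard errors for the fitted parameters.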
2.3 Market efficiency in large financial markets
The theory of large financial markets is a modern re-imagining of the ar-
bitrage pricing theory (APT). The APT (Ross, 1976a,b) was devised as an
alternative to the capital asset pricing model (CAPM) of Sharpe (1964) and
Lintner (1965); it aims to obviate the need for accurate measures of the mar-
ket portfolio and to relax some of the assumptions underlying the CAPM.
It assumes that changes in individual asset returns are due to changes in a
fixed number of factors plus an uncorrelated idiosyncratic component. Under
the assumption of no arbitrage (Huberman, 1982), the security-market line
is approximated arbitrarily well, as the number of assets increases without
bound.
The APT is fundamentally a discrete time theory. The theory of large
financial markets was introduced in (Kabanov & Kramkov, 1994, 1998) as
a dynamic continuous-trading extension of the APT. In this modern incar-
nation, the APT employs the tools of mathematical finance pioneered by
Harrison & Pliska (1981). A large financial market is defined as a sequence
of small markets (Sn,Bn, T n), n ∈ N, where 0 < T n ≤ ∞ is the terminal time
in the n-th small market, Sn is a dn-dimensional vector of asset prices and
Bn is a filtered probability basis (Ωn, Fn, Fn, Pn). In the sequel, we adopt
the "one probability space" large financial market setting of Cuchiero et al.
(2015): dn = n, Tn = T < ∞, and Bn = B for all n ∈ N, and (Sn)n≥1
forms a nested sequence of n-dimensional asset prices, so that the i-th price
process in Sn is indistinguishable from the i-th coordinate of Sm whenever
0 ≤ i ≤ n ≤ m.
In the classic small market setup treated in the previous section, market
efficiency is characterized in terms of NFLVR and the no dominance condi-
tion. We introduce a similarly motivated definition of market efficiency in
large financial markets in terms of asymptotic no free lunch with vanishing
risk (ANFLVR), the large financial market counterpart of NFLVR (Cuchiero
et al., 2015), and asymptotic no dominance (AND) defined below (Definition
2.6). We begin with the introduction of large financial market notation and
definitions.
2.3.1 Large financial market payoff space
We adopt the notation of Cuchiero et al. (2015). Given the n-th small
market (Sn,B), where Sn is an n-dimensional semimartingale representing asset
prices, a λ-admissible strategy, λ > 0, is a predictable process H such that
H_0 = 0, the stochastic integral H • Sn is well-defined, and (H • Sn)_t ≥ −λ for
all 0 ≤ t ≤ T. We will call Xn := H • Sn an admissible gain process if H
is λ-admissible for some positive real λ. We will denote by X^n_λ the set of λ-
admissible gain processes and by X^n the collection of all admissible processes
in (Sn,B), i.e.

X^n := ⋃_{λ>0} X^n_λ = ⋃_{λ>0} λX^n_1.
Small market payoff spaces are denoted K^n and K^n_1 and defined as the
terminal values of small market gain processes:

K^n := {X_T : X ∈ X^n}, and K^n_1 := {X_T : X ∈ X^n_1}.

The space of small market dominated payoffs is defined in the classical
manner:

C^n_0 := {f − g : f ∈ K^n and g ∈ L^0_+(B)},
C^n := {f ∈ C^n_0 : f ∈ L^∞(B)}.
Now for an adapted càdlàg process X carried on the basis B, denote
(X)^*_T := sup_{s≤T} |X_s| and define

‖X‖_ucp := E(min((X)^*_T, 1)).

The functional ‖·‖_ucp is a quasi-norm, and it induces a complete metric
d_ucp(X, Y) := ‖X − Y‖_ucp on the space of adapted càdlàg processes. We
employ the notation X^n →_ucp X to denote convergence with respect to this
topology. A predictable process H will be called simple if there exist F-
stopping times 0 = S_0 ≤ · · · ≤ S_{k+1} = T and ξ_i ∈ F_{S_i} with ‖ξ_i‖_∞ < ∞,
0 ≤ i ≤ k, such that

H_t = ξ_0 1_{⟦0⟧}(t) + Σ_{i=1}^{k} ξ_i 1_{⟧S_i, S_{i+1}⟧}(t).
In the sequel, ξ_0 is assumed to be identically zero. We denote by Λ the set of
B-predictable simple processes. Next, for a càdlàg adapted process X, define

‖X‖_S := sup{‖H • X‖_ucp : H ∈ Λ, |H| ≤ 1}.

The functional ‖·‖_S induces a complete metric on the space of semi-
martingales, referred to interchangeably as the Emery or semimartingale
topology. We employ the notation X^n →_S X to denote convergence with respect
to this topology. Now, a process X is said to be a 1-admissible generalized
gain process if there exists a sequence of small market wealth portfolios
X^n ∈ X^n_1 such that

X^n →_S X,

that is, X is a limit point in the semimartingale topology of ⋃_{n≥1} X^n_1. We
denote the set of λ-admissible generalized wealth portfolios by X_λ and the
set consisting of all admissible generalized wealth portfolios by X, i.e.

X := ⋃_{λ>0} X_λ = ⋃_{λ>0} λX_1.
We now define the payoff spaces K and K_1 as the terminal values of general-
ized wealth portfolios:

K := {X_T : X ∈ X}, and K_1 := {X_T : X ∈ X_1}.

Given the above, we define the set of large financial market dominated payoffs
as follows:

C_0 := {f − g : f ∈ K and g ∈ L^0_+(B)},
C := {f ∈ C_0 : f ∈ L^∞(B)}.
2.3.2 Arbitrage pricing in large financial markets
We now recall the fundamental theorem of asset pricing for large financial
markets (Cuchiero et al., 2015, Theorem 1.1), that is, necessary and
sufficient conditions, with acceptable economic interpretations, under which the
existence of a pricing functional (an equivalent separating measure) is assured.
Since zero is contained in C, we would like the pricing functional, or more
specifically the P-equivalent probability measure Q, to satisfy E_Q(f) ≤ 0 for
all f ∈ C. In order to make these statements precise in the large financial
market setting, we require the following definitions and lemmas.
2.1 Lemma If f ∈ C_0 then there exists f^n ∈ C^n_0, n ∈ N, such that f^n →_P f.

Proof. Suppose f ∈ C_0. Then there is X ∈ X and a random variable
g ∈ L^0_+(B) such that f = X_T − g. Since X ∈ X, there is X^n ∈ X^n such that
‖X^n − X‖_S → 0. This in turn implies that X^n →_ucp X, so that X^n_T →_P X_T.
Set f^n := X^n_T − g. Then f^n ∈ C^n_0, and f^n →_P f.
Hence, the dominated payoff of a generalized gain process may be viewed
as the limit of dominated payoffs in small markets. The next lemma shows
that the same can be said for the bounded portion of C_0.

2.2 Lemma If f ∈ C then there exists f^n ∈ C^n such that f^n →_P f.

Proof. Let f ∈ C; then f ∈ C_0 and f ∈ L^∞(B), i.e. there exists a K < ∞
such that f ≤ K almost surely. Because f ∈ C_0, there exists, by Lemma
2.1, g^n ∈ C^n_0 such that g^n →_P f. Set f^n := g^n − (g^n − K) 1_{g^n ≥ K}. Then
f^n ∈ C^n, and f^n →_P f.
2.3 Definition A large financial market (Sn,B)n≥1 is said to possess the
(Asymptotic) No Arbitrage (ANA) property if there do not exist X^n ∈ X^n_1,
n ∈ N, and X ∈ X_1 such that ‖X^n − X‖_S → 0 and

lim sup_n P(X^n_T < 0) = 0,   (2.13)
lim inf_n P(X^n_T > α) > α,   (2.14)

for some α > 0.
It is easily verified that the definition of ANA given here is equivalent to
the more familiar functional analytic definition:

K_1 ∩ L^0_+(Ω, F, P) = {0}.

Because our interests are econometrically motivated, Definition 2.3 is more
natural. The next definition is the large market counterpart of NUPBR.
2.4 Definition A large financial market (Sn,B)n≥1 is said to satisfy the
No Unbounded Profit with Bounded Risk (NUPBR) condition if K1 is bounded
in L0(B).
These two notions of arbitrage are equivalent to our next notion of arbi-
trage (Cuchiero et al., 2015, Proposition 4.4).
2.5 Definition A large financial market (Sn,B)n≥1 is said to possess the
Asymptotic No Free Lunch with Vanishing Risk (ANFLVR) property if

C̄ ∩ L^∞_+(Ω, F, P) = {0},

where L^∞_+(Ω, F, P) denotes the set of essentially bounded nonnegative random
variables on B and C̄ is the norm closure of C in L^∞(Ω, F, P).
It is shown in (Cuchiero et al., 2015, Theorem 1.1) that a version of
the fundamental theorem of asset pricing holds in the large financial market
setting: ANFLVR is necessary and sufficient for the existence of an equivalent
separating measure (ESM), where an ESM is a probability Q equivalent to
P such that EQ(f) ≤ 0 for f ∈ C.
2.4 Asymptotic market efficiency
In the standard small market setting, the simultaneous satisfaction of the
NFLVR condition and the ND property for all assets is equivalent to market
efficiency. In the case of non-negative asset prices, it is also the case that
there exists a Q equivalent to P such that prices are uniformly integrable
martingales (Proposition 2.5). Here, our objective is to extend these notions
to the framework of large financial markets. We start with an adaptation of
the ND condition to the large financial market setting. For each n we assume
that Sn is an n-dimensional semimartingale with the zeroth component
S^{n,0} = 1 on [0, T], so that dn = n. For all 0 ≤ i < n, the i-th asset
price satisfies S^{n,i}_t ≥ 0 for t ∈ [0, T]. We also assume that time zero prices
are deterministic and that the entire price process is normalized so that
S^{n,i}_0 = 1 for 0 ≤ i < n, n ∈ N.
2.6 Definition (Asymptotic No Dominance (AND)) A large financial
market payoff f ∈ K is said to be (asymptotically) undominated if for all
g ∈ K with g ≥ f a.s., it must also be the case that g = f almost surely.

Now let Ak := {a_0, a_1, · · · , a_{k−1}} denote an arbitrary set of k > 0
distinct natural numbers including 0; we adopt the convention a_0 = 0.
Now let

αk := (α_{a_0}, α_{a_1}, · · · , α_{a_{k−1}})

denote a strictly positive weight vector, that is, Σ_{j=0}^{k−1} α_{a_j} = 1 and α_{a_j} > 0
for 0 ≤ j < k. Now, for n ≥ max{a : a ∈ Ak} define

Sαk := Σ_{j=0}^{k−1} α_{a_j} S^{n,a_j}.

We will refer to Sαk as the convex portfolio generated by (Ak, αk). Note
that because (Sn)n≥1 is a nested sequence, Sαk is, up to an evanescent set,
independent of n for n ≥ max{a : a ∈ Ak}.
2.7 Definition (Asymptotic Market Efficiency (AME)) A large fi-
nancial market (Sn,B)n≥1 is said to be asymptotically efficient on [0, T ] if
1. ANFLVR holds for (Sn,B)n≥1, and
2. for all convex portfolios Sαk, the payoff Sαk_T − 1 is asymptotically
undominated.
Now denote by Sn,αk the (n + 1)-dimensional vector obtained by appending
Sαk to Sn, that is, Sn,αk = (Sn, Sαk). Define

Zn,αk := Sn,αk(Sαk)^{−1}.
Hence, Zn,αk expresses Sn in units of Sαk .
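Computationally, the change of numeraire amounts to dividing each price path, together with the convex portfolio itself, by the portfolio's path. The following sketch on simulated paths (all weights and volatilities are hypothetical) illustrates that the appended portfolio component of Zn,αk is identically one:

```python
import numpy as np

rng = np.random.default_rng(1)
n_assets, n_steps, dt = 4, 250, 1.0 / 250

# Simulated price paths S^{n,i}, normalized so S^{n,i}_0 = 1
# (geometric Brownian motion with hypothetical volatilities).
vols = np.array([0.1, 0.15, 0.2, 0.25])
shocks = rng.normal(size=(n_steps, n_assets)) * vols * np.sqrt(dt)
S = np.vstack([np.ones(n_assets),
               np.exp(np.cumsum(shocks - 0.5 * vols**2 * dt, axis=0))])
S[:, 0] = 1.0  # zeroth asset: the constant numeraire, per the convention a_0 = 0

# Convex portfolio S^{alpha_k}: strictly positive weights summing to 1.
alpha = np.array([0.1, 0.3, 0.3, 0.3])
S_alpha = S @ alpha

# Z^{n,alpha_k} = (S, S^alpha) / S^alpha: prices in units of the portfolio.
Z = np.column_stack([S, S_alpha]) / S_alpha[:, None]
print(Z[0])      # all components start at 1
print(Z[:, -1])  # the appended portfolio component is identically 1
```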
2.8 Proposition The large financial market (Sn,B)n≥1 satisfies NUPBR
if and only if (Zn,αk,B)n≥nk, with nk := max{a : a ∈ Ak}, satisfies NUPBR
for all (Ak, αk).

Proof. Suppose (Hn)n≥1 violates NUPBR for (Sn,B)n≥1. Then there exists
β > 0 such that for every N ∈ N and all sufficiently large n, we have
P((Hn • Sn)_T > N) > β. Consider

Y^n := (Sαk)^{−1}(Hn • Sn + 1) − 1.

Because all prices have initial value 1 and H^n_0 = 0, we have Y^n_0 = 0. Because
(Sαk)^{−1}_T is finite-valued, (Y^n_T)n≥1 is unbounded in L^0(B). Because Hn is 1-
admissible, Y^n ≥ −1 on [0, T]. By Itô's integration by parts formula and the
fact that Zn,αk = Sn,αk(Sαk)^{−1}, we have Y^n = K^n • Zn,αk for a predictable
K^n. Hence, (K^n)n≥nk violates NUPBR for (Zn,αk,B)n≥nk.

For the converse, denote by (Hn)n≥nk a violation of NUPBR for (Zn,αk,B)n≥nk
and consider

Y^n := Sαk(Hn • Zn,αk + 1) − 1.

That (Y^n)n≥1, with Y^n = 0 for n < nk, constitutes a violation of NUPBR for
(Sn,B)n≥1 follows by repeating the arguments of the previous paragraph.
2.9 Proposition Suppose n ≥ max{a : a ∈ Ak} =: nk. Then the payoff
Sαk_T − 1 is asymptotically undominated if and only if (Zn,αk,B)n≥nk satisfies
ANA.

Proof. Suppose Sαk_T − 1 is dominated by X_T for some X ∈ X. Since X ∈ X,
there are λ > 0 and X^n ∈ X^n_λ, n ≥ nk, such that ‖X^n − X‖_S → 0. Because
X^n ∈ X^n_λ, we have X^n = Hn • Sn for an Hn that is λ-admissible for Sn.
Define Jn := (Hn, 0)
and observe that Jn • Sn,αk = Hn • Sn. Consider

Y^n := (Sαk)^{−1}(Jn • Sn,αk + 1) − 1.   (2.15)

Then Y^n_0 = 0, and Y^n_t ≥ (α_{a_0})^{−1}(1 − λ) − 1 for t in [0, T]. By Itô's
integration by parts formula and the fact that Zn,αk = (Sαk)^{−1}Sn,αk, there is a
predictable G^n such that Y^n = G^n • Zn,αk. By the foregoing, G^n is admissible
for (Zn,αk,B)n≥1. Because of the stability of convergence in the Emery
topology (Kardaras, 2013, Proposition 2.10), we have Y^n = G^n • Zn,αk →_S
(Sαk)^{−1}(X + 1) − 1 =: Y. Hence, Y is a generalized gain process for
(Zn,αk,B)n≥nk. Because X_T dominates Sαk_T − 1, we see that Y_T constitutes an
arbitrage for (Zn,αk,B)n≥nk.
Now suppose (Zn,αk,B)n≥nk fails to satisfy ANA, so that there is a
1-admissible generalized gain process X for (Zn,αk,B)n≥nk whose terminal
value X_T dominates the zero payoff. Then there is (Hn)n≥nk such that
Hn • Zn,αk =: X^n is a 1-admissible gain process for Zn,αk, and X^n →_S X.
Consider

Y^n := Sαk(Hn • Zn,αk + 1) − 1.

Then Y^n_0 = 0, and Y^n ≥ −1 on [0, T]. By Itô's integration by parts formula,
there is a predictable K^n such that Y^n = K^n • Sn is well-defined. We have by
(Kardaras, 2013, Proposition 2.10) that Y^n →_S Sαk(X + 1) − 1 =: Y. Since
X_T is nonnegative and strictly positive with positive probability, we have
that Y_T dominates Sαk_T − 1.
2.1 Theorem The large financial market (Sn,B)n≥1 is asymptotically
efficient if and only if (Zn,αk,B)n≥nk satisfies ANFLVR for all (Ak, αk).

Proof. This follows from Propositions 2.8 and 2.9 and (Cuchiero et al., 2015,
Proposition 4.4).
2.4.1 Statistical inference for asymptotic market efficiency

The small market tests discussed in the previous section hold under the
assumption that the time horizon tends to infinity while the number of assets
remains fixed. One may draw an analogy with the time series regression
tests of discrete-time empirical asset pricing (Cochrane, 2001, Chapter 12). In
the current large financial market setup, the time horizon is held fixed while
the number of assets is allowed to grow without bound. The empirical tests
we propose in this section may be analogized to the cross-sectional regression
tests of discrete-time empirical asset pricing theory.

Because the time horizon is assumed fixed, these tests may be particularly
well-suited for analyzing strategies with short investment horizons. Also,
since the cross-section is assumed to grow without bound, they may be more
appropriate for studying strategies involving a great number of assets, in
particular strategies that sort a great number of assets according to some
indicator of performance, such as previous-year return. Examples of such
strategies include mean-reversion and momentum strategies.
2.3 Lemma Let (Ak, αk) be given and let (Hn)n≥nk be a sequence of small
market strategies for (Zn,αk,B)n≥nk converging in the semimartingale topol-
ogy to a generalized gain process Y. Suppose

E((V^n_T)^2) < ∞,

where V^n_T := (Hn • Zn,αk)_T. Then Y_T constitutes a violation of ANA for
(Zn,αk,B)n≥nk if and only if

lim_n E(V^n_T) > β for some β > 0,   (2.16)
lim_n P(V^n_T < 0) = 0.   (2.17)

Moreover, if (Hn)n≥nk is a sequence of 1-admissible strategies for Zn,αk such
that

E(V^n_T) < ∞,

then (Hn • Zn,αk)n≥nk violates NUPBR for (Zn,αk,B)n≥nk if and only if

lim_n E(V^n_T) = ∞.   (2.18)

Proof. These statements follow directly from the definitions of ANA and
NUPBR.
The simplest way to determine whether a given strategy verifies the
requirements of either (2.16), (2.17), or (2.18) is to specify a parametric model
of its incremental payoffs. As a simple example, we may suppose that

ΔV^i_T := V^i_T − V^{i−1}_T = µ i^θ + σ i^γ ε_i,   (2.19)

where µ, θ, σ, and γ are constants and (ε_i)_{i≥nk} is an i.i.d. sequence of
standard normal random variables.

Now note that under the assumption of normally distributed ε_i, V^n_T is
normally distributed, with log likelihood given by

L(Θ) := −2^{−1} Σ_{i=1}^{n} log(σ i^γ)^2 − (2σ^2)^{−1} Σ_{i=1}^{n} i^{−2γ}(ΔV^i_T − µ i^θ)^2,
where Θ := (µ, θ, σ, γ). The parameter vector may be estimated in the usual
fashion, by setting the gradient of L(Θ) to zero and solving the resulting
system of four equations in four unknowns to obtain an estimate
Θ̂ := (µ̂, θ̂, σ̂, γ̂).
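For illustration, the likelihood above may also be maximized numerically instead of solving the gradient equations in closed form; the following sketch fits simulated incremental payoffs from (2.19), with purely hypothetical parameter values:

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(42)
n = 1500
i = np.arange(1, n + 1)

# Simulate incremental payoffs from (2.19) with hypothetical parameters.
mu, theta, sigma, gamma = 0.5, 0.1, 0.3, -0.2
dV = mu * i**theta + sigma * i**gamma * rng.standard_normal(n)

def neg_loglik(params):
    """Negative of L(Theta); sigma is parameterized by its log to keep it positive."""
    m, t, log_s, g = params
    s = np.exp(log_s)
    resid = dV - m * i**t
    return (np.sum(np.log((s * i**g) ** 2)) / 2
            + np.sum(i**(-2 * g) * resid**2) / (2 * s**2))

theta0 = np.array([0.1, 0.0, 0.0, 0.0])  # crude starting values
res = minimize(neg_loglik, theta0, method="Nelder-Mead",
               options={"maxiter": 5000, "xatol": 1e-8, "fatol": 1e-8})
mu_hat, theta_hat = res.x[0], res.x[1]
sigma_hat, gamma_hat = np.exp(res.x[2]), res.x[3]
print(mu_hat, theta_hat, sigma_hat, gamma_hat)
```

In line with the discussion that follows, positive estimates of µ and θ would point toward a violation of NUPBR; formal tests of course require standard errors for Θ̂.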
Now observe that if both µ and θ are positive then (Hn)n≥nk constitutes
a violation of the NUPBR condition for (Zn,αk,B)n≥nk. If µ > 0, γ < 0,
and θ is sufficiently large then (Hn • Zn,αk)n≥nk converges to an asymptotic
arbitrage for (Zn,αk,B)n≥nk and, by Theorem 2.1, to a violation of market
efficiency for (Sn,B)n≥1. In (Hogan et al., 2004, Theorem 6), it is shown
that θ > (γ − 1/2) ∨ (−1) is sufficient to ensure convergence to an asymptotic
arbitrage. The above considerations are summarized in the following
proposition.
2.10 Proposition Under the assumptions of Lemma 2.3, if the incremental
payoffs of Hn satisfy (2.19) then the null hypothesis of market efficiency
may be rejected with 1 − α confidence if either one of the joint tests

1. H1: µ > 0 and H2: θ > 0, or

2. H′1: γ < 0, H′2: µ > 0, and H′3: θ > (γ − 1/2) ∨ (−1)

achieves a combined p-value of less than α.
2.5 Conclusion
In a finite horizon complete market setting, market efficiency is equivalent to
asset prices admitting a martingale measure. This basic definition motivates
traditional tests of market efficiency. These tests must by necessity postulate
an equilibrium model of asset prices or a stochastic discount factor as a
reference. Naturally, such a procedure is subject to misspecification errors
which cannot be assessed because the stochastic discount factor (SDF)
is unobservable. Hence, traditional tests of market efficiency are in fact joint
tests of the fit of the particular model selected and deviations from market
efficiency. This is the well-known joint hypothesis problem.
We have contributed to the growing literature that aims to devise tests
of market efficiency that do not suffer from the joint-hypothesis problem.
We have obtained further characterizations of market efficiency that in turn
suggest simplifications of empirical tests of market efficiency. These char-
acterizations involve a change of numeraire that boils down to normalizing
asset prices with respect to the market portfolio prior to investigating vi-
olations of market efficiency. Our analysis may be extended to the large
financial market setting. We define the no dominance condition as well as
market efficiency in the large financial market framework. We show that the
no dominance condition can be characterized in terms of the no arbitrage
condition after a change of numeraire. This result suggests empirical tests of
asymptotic market efficiency similar to those proposed in the small market
setting. The practical importance of the large financial market theory is that
for certain strategies, taking limits as the time horizon tends to infinity may
be inappropriate. Provided the number of assets involved in the execution of
the strategy is very large, the large financial market tests we propose may
be more adequate.
Chapter 3
Statistical arbitrage in the U.S.
treasury futures market
3.1 Introduction
Is the U.S. treasury bond futures market informationally efficient? Weak-form
informational efficiency requires all strategies that rely solely on historical
price data to be dominated by the passive strategy of holding single traded
assets or a weighted portfolio of traded assets. The notion of dominance as
it relates to asset pricing was introduced by Merton (1973) to study option
pricing formulas that are consistent with rational investor behavior. More
recently, Jarrow & Larsson (2012) obtained a characterization of informa-
tional efficiency in terms of the no dominance condition (ND) and the No
Free Lunch with Vanishing Risk condition (NFLVR) of Delbaen & Schacher-
mayer (1994). Accordingly, market inefficiency can be asserted as soon as
either the ND or NFLVR fails.
This result simplifies considerably the task of verifying market efficiency;
it belies the long held belief that in order to test for violations of market
efficiency, one must first specify a model of equilibrium prices such as the
CAPM and then test for efficiency in relation to the estimated equilibrium
model. Unfortunately, this two step procedure runs quickly into difficulties,
since it may not be possible to tell apart errors due to model misspecification
and those that are solely due to market inefficiency. This is the well-known
joint-hypothesis problem discussed in (Fama, 1969).
Moreover, the No Dominance condition itself could be dispensed with as
soon as a change of numeraire is performed. Indeed let B := (Ω,F , (Ft)t≥0, P )
denote a probability basis, and let S denote an n-dimensional semimartingale
whose components Si, 0 ≤ i < n, represent the price of n distinct assets,
expressed in units of the zeroth asset. For the sake of convenience, also
assume that at time zero, each asset is priced at one, i.e. Si0 = 1 for 0 ≤ i < n.
Now, let γ denote a positive number between zero and one, i.e. 0 < γ < 1,
and define
Zγ,i := (S, Sγ,i)(Sγ,i)−1,
where Sγ,i = γ + (1 − γ)Si. According to Dare (2017, Proposition 2.1), the
efficiency of (S,B) is equivalent to the existence of a local martingale measure
for the markets (Zγ,i,B), for 0 ≤ i < n and 0 < γ < 1.
In fact, a stronger statement can be made provided prices are expressed
in units of a portfolio constructed on the basis of a strictly positive weight
vector α = (α0, · · · , αn−1), i.e. αi > 0 for 0 ≤ i < n.
Indeed if
Zα := (S, Sα)(Sα)−1,
then according to Dare (2017, Corollary 2.2), the market (S,B) is efficient
if and only if (Zα,B) admits a local martingale measure. The choice of a
market portfolio is irrelevant so long as it assigns positive weight to each
traded asset.
We will argue for a violation of market efficiency using Dare (2017, Propo-
sition 2.1), with Si representing the price of the 2-Year U.S. Treasury futures
contract. Fortunately, since the NFLVR condition is specified in terms of the
physical measure, the joint-hypothesis issue may be avoided by evaluating
trading rules for violations of NFLVR. Using this testing approach, we make
and empirically support the claim that between April 1, 2010 and Decem-
ber 31, 2015, the equally weighted buy-and-hold strategy was out-performed
by a simple cointegration-based trading rule. Moreover, the hypothesis of
the existence of a statistical arbitrage, in the sense of (Hogan et al., 2004),
achieves a p-value less than 2%.
The trading rule we examine takes as starting point the hypothesis that
treasury bond futures are cointegrated and then attempts to profit from
deviations from the cointegrating relationships. The cointegration hypothesis
assumes, among other things, that even though prices of individual contracts
may be non-stationary, there exists at least one linear combination of these
contracts that results in a stationary price process. That is to say, it is
possible to put together a portfolio of long and short positions in individual
contracts such that the resulting market value of the portfolio is stationary.
The hypothesis of cointegrated bond prices has been examined by Bradley &
Lumpkin (1992), Zhang (1993), and many others. In these studies, the data
employed was sampled at low frequency, daily or monthly, and the hypothesis
of cointegrated bond prices could not be rejected. We carry out a similar
analysis and find empirical support for cointegration using data sampled
intraday at one-minute intervals.
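A minimal sketch of the two-step, Engle-Granger style check underlying such an analysis, run on simulated rather than futures data (the cointegrating relation, noise process, and sample size are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(7)
n = 2000

# Common stochastic trend (random walk) plus a stationary AR(1) deviation:
# the pair (x, y) is cointegrated with vector (1, -2) by construction.
trend = np.cumsum(rng.standard_normal(n))
dev = np.zeros(n)
for t in range(1, n):
    dev[t] = 0.5 * dev[t - 1] + rng.standard_normal()
x = trend
y = 2.0 * trend + dev

# Step 1: estimate the cointegrating coefficient by OLS of y on x.
beta = np.sum(x * y) / np.sum(x * x)
resid = y - beta * x

# Step 2: Dickey-Fuller-type regression on the residual;
# a strongly negative t-statistic on rho indicates a stationary residual.
d_resid = np.diff(resid)
lag = resid[:-1]
rho = np.sum(lag * d_resid) / np.sum(lag * lag)
se = np.sqrt(np.sum((d_resid - rho * lag) ** 2)
             / (len(lag) - 1) / np.sum(lag * lag))
t_stat = rho / se
print(beta, t_stat)
```

In practice the t-statistic would be compared against Engle-Granger critical values rather than standard normal ones.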
We obtain theoretical motivation for the cointegration-based trading rule
by embedding our analysis within the literature devoted to the study of the
term structure of bonds using factor models. Starting with Litterman &
Scheinkman (1991) and later Bouchaud et al. (1999) and many others, it
has been noted that between 96% and 98% of overall variance of the entire
family of treasury securities may be explained by the variance of just three
factors, the so-called level, slope, and curvature factors. The factors are so
named because of how they affect the shape of the yield curve. A shock
emanating from the first factor has nearly the same impact on contracts of
all maturities; the resulting effect is a vertical shift, upward or downward, of
the entire yield curve. The second factor affects bonds of different maturities
in such a manner as to change the steepness or slope of the curve; it does
so by affecting securities at one end of the maturity spectrum more or less
than those at the other end. Finally, the third factor has the effect of making
the yield curve curvier; it does so by having more or less pronounced effects
on medium term bonds than on bonds situated either ends of the maturity
spectrum.
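The factor decomposition alluded to above is typically obtained by principal component analysis of yield changes; the following sketch on simulated three-factor yield curves (loadings, maturities, and noise level are all hypothetical) reproduces the stylized fact that three components account for well over 96% of the variance:

```python
import numpy as np

rng = np.random.default_rng(3)
n_obs = 2500
maturities = np.array([2.0, 5.0, 10.0, 30.0])

# Hypothetical level, slope, and curvature loadings.
level = np.ones_like(maturities)
slope = (maturities - maturities.mean()) / maturities.std()
curve = slope**2 - (slope**2).mean()

# Yield changes driven by three factors plus small idiosyncratic noise.
f = rng.standard_normal((n_obs, 3)) * np.array([1.0, 0.5, 0.2])
dy = f @ np.vstack([level, slope, curve]) + 0.05 * rng.standard_normal((n_obs, 4))

# PCA via the eigenvalues of the covariance matrix of yield changes.
cov = np.cov(dy, rowvar=False)
eigvals = np.sort(np.linalg.eigvalsh(cov))[::-1]
explained = eigvals[:3].sum() / eigvals.sum()
print(explained)  # share of variance explained by the first three components
```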
We argue that a strategy based on a cointegration hypothesis is natural
within the context of a term structure driven by common stochastic trends or
factors. In fact, the opposite is also true; that is, a common factor structure is
a natural consequence of cointegrated yields. This line of argument provides
support based on economic theory for our strategy and helps explain its
performance. Our results suggest that the futures market may be inefficient.
Market inefficiency is clearly not a desired outcome; it implies the existence
of a free lunch. Put another way, our results point to a possible misallocation
of resources.
The rest of the paper proceeds as follows: in section 2, we provide a
description of the data used. Futures price data usually does not come in
continuous form for extended periods of time, so we had to make certain
choices about how available historical price data is transformed into a form
suitable for our analysis. These choices can be implemented in real-time and
are, therefore, to be considered part of the trading rule. In section 3, we
provide a theoretical foundation for our trading rule. This foundation allows
us to reach beyond our data and assert that the profitability of the trading
rule is very likely not confined to the period for which we have data. Section
4 is devoted to the implementation details of the trading rule. Section 5
summarizes our empirical results, and section 6 concludes.
3.2 Data
3.2.1 Treasury futures
CBOT Treasury futures are standardized forward contracts for selling and
buying US government debt obligations for future delivery or settlement.
They were introduced in the nineteen-seventies at the Chicago Board of Trade
(CBOT), now part of the Chicago Mercantile Exchange (CME), for hedging
short-term risks on U.S. treasury yields. They come in four tenors or
maturities: 2, 5, 10, and 30 years. In reality, each contract type is written
on a basket of U.S. treasury notes and bonds with a range of maturities and
coupon rates. For instance, the 30-Year Treasury Bond Futures contract is
written on a basket of bonds with maturities ranging from 15 to 25 years.
It is, therefore, worth keeping in mind that a study of the dynamics of the
yield curve using futures data reflects influences from a range of maturities.
Every contract listed above has a face value of $100,000, except the 2-Year
T-Note Futures contract, which has a face value of $200,000. That is,
each contract calls for delivery of an underlying treasury note or bond
with a face value of $100,000, or $200,000 in the case of the 2-Year
contract. In practice, the prices of these contracts are quoted as percentages
of their par value. The minimum tick size of the 2-Year T-Note Futures is
1/128%, that of the 5-Year T-Note Futures is 1/128%, that of the 10-Year
T-Note Futures is 1/64%, and that of the 30-Year T-Bond Futures contract is
1/32%. In Dollar terms, this comes to $15.625, $7.8125, $15.625, and $31.25,
respectively, per tick movement.1 These tick sizes are orders of magnitude
larger than those typically encountered in the equity markets.
Even though most futures contracts are settled in cash at the expiration
of the contract, for a small percentage of open interests, delivery of the
underlying bond actually takes place. Given that the futures contract is
written on a basket of notes and bonds, the actual bond or note delivered is
at the discretion of the seller of the contract. In practice, the seller merely
selects the cheapest bond in the basket to deliver. For our purposes, we
shall focus on only the above listed tenors, but it is worth keeping in mind
that there is also a 30-Year Ultra contract traded at the CME.
For our analysis, we use quote data, prices and sizes, from April 1, 2010
through December 31, 2015. Even though we have at our disposal data rich
enough to allow resolution down to the nearest millisecond, we opted, arbi-
trarily, to aggregate the data into one-minute time bars. The representative
quoted price and size for each time bar is the last recorded quote falling within
that interval. Our use of quotes, bids and offers, instead of transaction data
allows the computation of a proxy for the unobserved true price, by means
of the mid-quote, at a higher frequency than transaction prices might have
allowed. Using quotes, we are also able to reflect directly a major portion of
the execution costs associated with any transaction, i.e. the bid-ask spread.
Trading in these markets takes place primarily electronically, via CME
ClearPort Clearing, virtually around the clock between the hours of 18:00
and 17:00 (Chicago Time), Sunday through Friday. But the markets are
at their most active during the daytime trading hours of 7:20 to 14:00
(Chicago Time), Monday through Friday; these are also the opening hours of
the open outcry trading pits. For our analysis, we use exclusively data from
the daytime trading hours. This ensures that the strategy is able to benefit
from the best liquidity these markets can offer, while mitigating the effects
of slippage (orders not getting filled at the stated price) and costs associated
with breaking through the Level 1 bid and ask sizes.

1 We refer the reader to Labuszewski et al. (2014) for more detailed information about the features of each contract.
3.2.2 Continuous prices
Unlike stocks and long bonds, futures contracts tend to be short-lived, with
price histories extending over a few weeks or months. This stems from the
traditional use of futures contracts as short-term hedging instruments against
price/interest rate fluctuations. Treasury futures contracts, in particular,
have a quarterly expiration cycle in March, June, September, and December.
At any given point in time, several contracts written on the same underlying
bond, differentiated only by their expiration dates, may trade side by side.
Usually, the next contract due to expire, the so-called front-month con-
tract, offers the most liquidity. As the front-month approaches expiration,
liquidity is gradually transferred to the next contract in line to expire, the
deferred month contract. At any rate, a given contract is only actively traded
for a few months or weeks before it expires. Hence, holding a long-term po-
sition in a futures contract actually entails actively trading in and out of the
front month contract as it nears its expiration date. The implementation of
this process is known as rolling the front month forward.
For the purpose of evaluating a trading strategy over a historical period
of more than a few months, the roll can be retroactively implemented to
generate a continuous price series. The usual way to go about the roll is to
trade out of the front month a given number of days before it expires. In the
extreme case, the roll takes place on the expiration date of the front month
contract. The downside of this type of approach is that the roll may take
place at a date when liquidity in the deferred month is not yet plentiful. The
result is that a backtest may not necessarily capture the increased trading
cost associated with the lower liquidity level.
Our preferred approach for implementing the roll is to start trading out of
the front month contract at any point during its expiration month as soon as
the open interest in the deferred month contract exceeds the open interest in
the front month contract. The data used in our backtest is spliced together
this way; the procedure is implementable in real-time and must be considered
part of the trading strategy discussed in this paper.
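This open interest crossover rule is straightforward to implement; the sketch below uses hypothetical open interest figures rather than the CME data employed in the paper:

```python
import numpy as np

# Toy daily open interest for the front and deferred contracts during
# the front contract's expiration month (hypothetical numbers).
days = np.arange(1, 11)
front_oi = np.array([900, 850, 800, 700, 600, 450, 300, 200, 120, 60])
deferred_oi = np.array([100, 180, 300, 450, 580, 640, 760, 850, 920, 980])

def roll_day(front, deferred, day_index):
    """First day on which deferred open interest exceeds front open interest."""
    crossed = np.nonzero(deferred > front)[0]
    return int(day_index[crossed[0]]) if crossed.size else None

print(roll_day(front_oi, deferred_oi, days))
```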
Now, while retroactive contract rolling may solve the problem of creating
an unbroken long-term price history, it creates another: splicing prices
together as described above invariably introduces artificial price jumps
into historical prices. To see this, consider a futures contract with price F
into historical prices. To see this, consider a futures contract with price F
written on a bond with price B. Using an arbitrage argument and ignor-
ing accrued interest, the price of a futures contract at any time t may be
expressed as:
F_t = B_t e^{(r−c)d},    (3.1)
where c is the continuously compounded rate of discounted coupon payments
on the underlying bond, d is the number of time units before the futures
contract expires, and r is the repo rate.
Now, assuming the roll takes place in the expiration month, d for the front
month is less than 30 days, whereas for the deferred month contract, d is at
least 90 days. This results in a price differential between the two contracts,
which shows up in the price data as a jump. In reality, and assuming a
self-financing strategy, the price differential would necessitate a change in
the number of contracts held, so that overall, the return on the portfolio is
unaffected by the roll. Hence, in order to avoid fictitious gains and losses,
the price series must be adjusted to remove the roll-induced price jumps.
The methods most often used in practice apply an adjustment to prices
either prior or subsequent to the roll date. When the adjustment is applied
to prices recorded after the contract is rolled forward, the price history is said
to be adjusted forward; if on the other hand, the adjustment is applied to
prices recorded prior to the roll date then the prices are said to be adjusted
backward. The actual price adjustment, in the case of a backward adjust-
ment, is most commonly carried out in one of two ways: in the first instance,
the roll-induced price gap (price right before the roll minus price right after
the roll) is subtracted from all prices recorded prior to the roll date; in the second instance,
all prices preceding the roll date are multiplied by a factor representing the
relative price level before and after the roll. The second approach is remi-
niscent of how stock prices are adjusted after a stock split. We will refer to
the first approach as the backward difference adjustment method and to the
second as the backward ratio adjustment method. Forward ratio adjustment
and forward difference adjustment are implemented similarly with the ad-
justments applied to prices recorded after the roll date. In our analysis, we
will only consider backward adjusted prices, as they appear to be the more
intuitive approach.
Both types of backward price adjustment methods are widely used in
practice, but the ratio adjustment method has the advantage of guaranteeing
that prices, however early in the price series, always remain positive. In the-
ory, the difference adjustment approach may generate negative prices given
enough roll-induced price gaps. We mention these adjustment procedures
because they tend to affect the performance of most strategies, including the
one we study in this paper. The price adjustment procedures cannot be con-
sidered as part of a real-time trading strategy, so we report results using both
the backward ratio adjustment and the backward difference adjustment.
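As a concrete sketch, the two backward adjustment schemes can be implemented as follows. The prices, roll index, and helper names below are illustrative stand-ins, not the code used in the backtest:

```python
# Sketch of backward roll adjustment for a spliced futures price series.
# `prices` holds the front-month contract up to the roll and the deferred-month
# contract afterwards; `roll_idx` is the first index of the deferred contract.

def backward_difference_adjust(prices, roll_idx):
    # Roll-induced gap: last front-month price minus first deferred-month price.
    gap = prices[roll_idx - 1] - prices[roll_idx]
    # Subtract the gap from all prices recorded prior to the roll date.
    return [p - gap for p in prices[:roll_idx]] + prices[roll_idx:]

def backward_ratio_adjust(prices, roll_idx):
    # Relative price level across the roll, as in a stock-split adjustment.
    factor = prices[roll_idx] / prices[roll_idx - 1]
    # Multiply all prices preceding the roll date by the factor.
    return [p * factor for p in prices[:roll_idx]] + prices[roll_idx:]

front = [120.0, 120.5, 121.0]      # hypothetical front-month prices
deferred = [118.0, 118.2, 118.5]   # hypothetical deferred-month prices
spliced = front + deferred

diff_adj = backward_difference_adjust(spliced, len(front))
ratio_adj = backward_ratio_adjust(spliced, len(front))
```

Either variant removes the artificial jump at the splice point, so the adjusted series is continuous across the roll.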
3.3 Economic framework
3.3.1 The price of a futures contract
The traditional way of pricing a futures contract is via an arbitrage argument.
The argument is best illustrated via an example. Suppose an agent, at time
t (today), has a need to purchase a 10-Year Treasury note at time T1. That
is, at time T1 when the forward/futures contract expires, the treasury note
will mature in ten years at time T1 + 10. The agent could go about it by
borrowing money at time t at the repo rate to cover the full price of the
bond. The full price of the bond would include the current spot price of the
bond and accrued interest on the bond since the last coupon payment. The
accrued interest is the portion of the next coupon payment that is due to the
previous owner of the bond. Let’s denote the spot price of the bond by Bt
and the accrued interest by It. So, at time t the agent may borrow Bt + It,
using the bond as collateral against the loan.
At time T_1, the loan used by the agent to fund the purchase would have
accrued interest of its own and would have grown to (B_t + I_t)e^{r(T_1−t)}. Here,
we are assuming a fixed repo rate r. On the other hand, taking possession of
the treasury note endows the agent with the right to receive coupon payments
generated by the note. Coupon rates are usually a fixed percentage of the
par value of the bond. In practice, this is usually around 6% and payable
semiannually; for this illustration, we will imagine that the coupon payments
are paid continuously at the instantaneous rate of c. To recap, at time T_1,
the loan balance grows to (B_t + I_t)e^{r(T_1−t)}, but it is offset by coupon payments
of Ke^{−c(T_1−t)}, where K denotes the par value of the bond. Hence, at time T_1,
for the agent to own the treasury bond outright, she simply needs to repay
the loan, but because of the accrued coupon interest she would only be out of pocket
F := (B_t + I_t)e^{r(T_1−t)} − Ke^{−c(T_1−t)}. Hence, at time t it only makes economic
sense to enter a futures contract if its price equals the cost of replicating it, that is, F.
The above analysis demonstrates that the price of a futures contract may
be written in terms of the price of the underlying bond. In fact, by denoting C
the continuously discounted present value of all coupon payments generated
by the bond, we may write:
F_t = (B_t − C)e^{rd},    (3.2)
where d is the amount of time left before the futures contract expires. It is
worth noting that the foregoing analysis relies on the assumption of a constant
interest rate. It is also to be noted that the price of a futures contract derived by
a no-arbitrage argument may differ from the price of a forward contract in an
environment with stochastic, time-varying interest rates. We refer the reader
to Cox et al. (1981) for a lucid discussion of this point. For the intuition
we wish to develop, the assumption of a constant interest rate is
tolerable.
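Under the constant-rate assumption, equation (3.2) can be evaluated directly. The inputs below are hypothetical, chosen only to illustrate the formula:

```python
import math

def futures_price(bond_price, coupon_pv, repo_rate, time_to_expiry):
    # F_t = (B_t - C) * exp(r * d), as in equation (3.2),
    # under a constant repo rate r and time to expiry d (in years).
    return (bond_price - coupon_pv) * math.exp(repo_rate * time_to_expiry)

# Hypothetical inputs: spot bond price 130, PV of coupons 2.5,
# repo rate 1% per annum, 0.25 years to expiry.
f = futures_price(130.0, 2.5, 0.01, 0.25)
```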
Returning to (3.2), and taking the natural logarithm of both sides of the
equation, and assuming that the face value of the bond is much larger than
the present value of future coupon payments, we may write
f_t ≈ b_t − c + rd,    (3.3)
where b_t := log(B_t) and c := log(C). Note that b_t = −T y_t, where y_t is the
yield to maturity at time t of the bond. The quantity rd − c is referred
to variously in the empirical literature as the carry or the basis. In
actual price data, the carry fluctuates over time, usually around
a long-term mean. The basic idea of a mean-reverting carry is the motivation
behind the so-called carry trade, which is implemented by going short the
futures and long the bond when the carry is high and doing the opposite
when the carry is deemed too low. The strategy reviewed subsequently is
related to this trade only in that it, too, relies on mean reversion to be
profitable.
Returning to (3.3), it is apparent that besides the variation in the carry,
variations in the logarithm of the futures price come about because of vari-
ations in the logarithm of the bond price, which is itself driven by the yield
to maturity of the bond. Usually, the carry does not vary much, and
it is often modeled as a constant, as we have done here, unless, of course, the
object of the analysis is to study the carry itself. Given these considerations,
we may model the logarithm of the price of the futures contract directly and
exclusively in terms of the yield to maturity with no significant loss in rigor.
That is, we may write
f_t = α + β y_t,    (3.4)
where α and β are constant terms and yt is the yield to maturity of the
underlying bond. The constant α is simply the carry and whatever needs
to be added or subtracted in order to make the approximation in (3.3) an
equality. The constant β is in this setting equal to −T, that is, the negative of the
tenor of the underlying bond.
The preceding reformulation of the logarithm of the price of a futures
contract in terms of the yield to maturity of the underlying bond allows us
to use the theoretical machinery developed to study the term structure of
interest rates to motivate the trading system that we discuss subsequently.
3.3.2 Factor model of the yield curve
Factor modeling of the yield curve has a rich history in the financial lit-
erature. The extant models may be broadly classified under three main
headings: statistical, no-arbitrage, and hybrid models. The static Nelson &
Siegel (1987) (NS) model of the yield curve and its modern counterpart, the
Dynamic Nelson-Siegel (DNS) model, proposed by Diebold & Li (2006) are
prototypes of the class of statistical factor models of the interest rate term
structure. They, especially the static Nelson & Siegel, are widely used both
by financial market practitioners and central banks to set interest rates and
forecast yields. Despite their popularity and appealing statistical properties,
they tend to give rise to violations of the no-arbitrage condition.²
The dynamic term structure models (DTSM) studied in (Singleton, 2006,
Chapter 12), of which the yield-factor model of Duffie & Kan (1996) is an
early example, constitute the class of arbitrage-free models. These models
derive a functional form of the yield curve in terms of state variables or
factors, which also govern the market price of risk linking the local martin-
gale measure to the historical measure. They are, therefore, by construction
arbitrage-free. Despite their economic soundness, these models tend to have
sub-par empirical performance. For instance, Dybvig et al. (1996) showed in
the discrete-time setting that in an arbitrage-free model, long forward and
zero-coupon rates can never fall; working in the general setting of continu-
ous trading, Hubalek et al. (2002) arrived at a similar conclusion regarding
the monotonicity of long forward rates under the no-arbitrage assumption.
Clearly, this implication of the no-arbitrage framework is often contradicted
by the empirical evidence that zero-coupon rates do in fact fall. Furthermore,
negative rates and unit roots are ruled out. As (Diebold & Rudebusch, 2013,
p. 13) put it
Economic [no-arbitrage] theory strongly suggests that nominal
bond yields should not have unit roots, because the yields are
bounded below by zero, whereas unit root processes have random
walk components and therefore will eventually cross zero almost
surely.

²See (Filipović, 1999) for such violations in the case of DNS models.
Since the 2008 financial crisis, negative interest rates have been a mainstay of many
developed economies, including Switzerland. Moreover, the task of fitting
arbitrage-free models to interest rate data can be very difficult since they tend
to be over-parametrized and, typically, would generate multiple likelihood
maxima (Diebold & Rudebusch, 2013, p. 55).
Lastly, the Arbitrage-Free Nelson-Siegel (AFNS) model proposed by Chris-
tensen et al. (2011) is a prototype of the hybrid class of models. It main-
tains the parsimonious parametrization of the DNS model while remaining
arbitrage-free. The AFNS differs, at least in the functional form of the yield
curve, from the standard DNS model only by the inclusion of an extra term
known as the “yield adjustment factor”. Intuitively, the AFNS model may
be thought of as the projection of an arbitrage-free affine term structure
model, namely the Duffie & Kan (1996) model, onto the DNS model with
the orthogonal component swept into the yield adjustment factor.
The factor models briefly surveyed above motivate the trading rule adopted
in this paper; it relies on the hypothesis that the term structure of interest
rates can be described by an affine function of a set of state variables, no-
tably the level, slope, and curvature principal components. Moreover, there
is ample empirical evidence suggesting that the term structure is cointe-
grated. In particular, using monthly Treasury bill data from January 1970
until December 1988, Hall et al. (1992) observed that yields to maturity of
Treasury bills are cointegrated and that during periods when the Federal
Reserve specifically targeted short-term interest rates, the spreads between
yields of different maturities defined the cointegrating vector.
In general, given N bonds, where N is not necessarily finite, a factor
model of the yield curve would represent the yield on the i-th bond as:
y_{i,t} = α_t + ∑_{j=1}^{q} β_{i,j} f_{j,t} + ε_{i,t},    (3.5)
where α is deterministic, q is a small number, f_j, for j = 1, · · · , q, are factors,
β_{i,j} is the contribution of the j-th factor to the i-th bond, and ε_i is the
component of the i-th bond that is shared with no other bond. For our
purposes, it does not actually matter whether the factors are macroeconomic or
statistical in nature, but to fix ideas we assume q = 3 and the factors are
the level, slope, and curvature factors of Litterman & Scheinkman (1991).
By substituting the expression in (3.5) into equation (3.4), we obtain the log
futures price in terms of the level, slope, and curvature of the term structure.
That is,
f_{i,t} = μ_t + ∑_{j=1}^{3} γ_{i,j} f_{j,t} + ε_{i,t}.
3.3.3 Factor extraction
Using Principal Component Analysis (PCA), it is possible to transform the
original time series of futures prices into a set of orthogonal time series known
as principal components. Because of the orthogonality property, the origi-
nal time series may be expressed uniquely as a linear combination of the
principal components. This representation motivates the interpretation of
the principal components as the latent risk factors driving observed price
fluctuations.
The analysis starts with n observations from an m-dimensional random
vector, the original time series data. Then, assuming that the original time
series admits a stationary distribution with finite first and second moments,
the covariance matrix is estimated using an unbiased and consistent estima-
tor. In our setting, the assumption of stationarity applied directly to the
logarithm of futures prices is hard to justify. Prices generally trend upward,
and the same may be expected for their log-transformed versions. Using
the Augmented Dickey-Fuller (ADF) statistics with constant drift, we test
the hypothesis that the lag polynomial characterizing the underlying data
generating process has a unit root.
A quick scan of Table 3.1 reveals that for the most part the unit root
assumption cannot be rejected. The only exception seems to be the 2 Year
and the 5 Year futures price data for the year 2015, for which the assumption
of a unit root may be rejected at the 5% significance level. We think this out-
come is a temporary fluke, since for the previous five years the null hypothesis
Table 3.1: Augmented Dickey-Fuller Tests

(a) 2010
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.00   2.17    -0.00   -2.16    4.59            -2.86
5 Yr    0.00   2.01    -0.00   -2.00    4.59            -2.86
10 Yr   0.00   2.05    -0.00   -2.05    4.59            -2.86
30 Yr   0.00   2.03    -0.00   -2.02    4.59            -2.86

(b) 2011
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.00   0.96    -0.00   -0.96    4.59            -2.86
5 Yr    0.00   0.62    -0.00   -0.61    4.59            -2.86
10 Yr   0.00   0.56    -0.00   -0.54    4.59            -2.86
30 Yr   0.00   0.36    -0.00   -0.33    4.59            -2.86

(c) 2012
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.01   2.51    -0.00   -2.51    4.59            -2.86
5 Yr    0.00   1.31    -0.00   -1.31    4.59            -2.86
10 Yr   0.00   1.22    -0.00   -1.21    4.59            -2.86
30 Yr   0.00   1.36    -0.00   -1.36    4.59            -2.86

(d) 2013
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.00   1.45    -0.00   -1.45    4.59            -2.86
5 Yr    0.01   1.85    -0.00   -1.85    4.59            -2.86
10 Yr   0.00   1.65    -0.00   -1.65    4.59            -2.86
30 Yr   0.00   1.24    -0.00   -1.25    4.59            -2.86

(e) 2014
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.00   1.06    -0.00   -1.05    4.59            -2.86
5 Yr    0.00   1.46    -0.00   -1.46    4.59            -2.86
10 Yr   0.00   1.74    -0.00   -1.73    4.59            -2.86
30 Yr   0.00   1.95    -0.00   -1.93    4.59            -2.86

(f) 2015
        a      t(a)    lag     t(lag)   5% c.value(a)   5% c.value(lag)
2 Yr    0.01   2.88    -0.00   -2.88    4.59            -2.86
5 Yr    0.01   3.01    -0.00   -3.01    4.59            -2.86
10 Yr   0.01   2.76    -0.00   -2.76    4.59            -2.86
30 Yr   0.00   1.65    -0.00   -1.65    4.59            -2.86
could not be rejected. We have also looked at different subsamples of the
2015 data, and for the most part the assumption of a unit root could not be
rejected.
Under the circumstances, carrying on with the analysis of the principal
components of the original price series may not be advisable. Without the
stationarity assumption, it is very likely the case that the usual estimator
of the covariance matrix would yield estimates that may be substantially
off the mark. Meanwhile, taking the first difference of the logarithm of the
price series seems to produce time series that display very little persistence
as may be observed from an inspection of Figure 3.1. Hence, the assumption
of stationarity may be appropriate only after differencing the data. We
have substantiated this assumption using the ADF test: the unit root
hypothesis was rejected at the 1% significance level.
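A simple Dickey-Fuller t-statistic (without the augmentation lags of the full ADF test) can be computed from first principles to illustrate the unit-root check; the simulated random walk below stands in for the log price series:

```python
import numpy as np

def df_tstat(y):
    # Dickey-Fuller regression with a constant: dy_t = a + rho * y_{t-1} + e_t.
    # Returns the t-statistic on rho, compared against e.g. the -2.86
    # critical value. No augmentation lags, for simplicity.
    dy = np.diff(y)
    X = np.column_stack([np.ones(len(dy)), y[:-1]])
    beta, *_ = np.linalg.lstsq(X, dy, rcond=None)
    resid = dy - X @ beta
    s2 = resid @ resid / (len(dy) - 2)
    cov = s2 * np.linalg.inv(X.T @ X)
    return beta[1] / np.sqrt(cov[1, 1])

rng = np.random.default_rng(0)
shocks = rng.normal(0.0, 0.01, size=2000)
log_prices = np.cumsum(shocks)   # simulated unit-root log price series
diffs = np.diff(log_prices)      # first differences: white noise here

t_levels = df_tstat(log_prices)  # typically cannot reject the unit root
t_diffs = df_tstat(diffs)        # strongly rejects the unit root
```

The differenced series produces a t-statistic far below the 5% critical value, while the levels series typically does not, mirroring the pattern in Table 3.1.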
Clearly, taking differences of the log price data entails a loss of informa-
tion. Nevertheless, an analysis of the differenced data could still yield insight
into the factor structure of the original price data, since a factor structure
may be expected to be shared by both the differenced data and the data in levels.
This observation is easily confirmed by means of simple algebraic manipula-
tions. Naturally, the factors that may be extracted from the differenced data
would bear very little resemblance to the factors present in the levels data,
so that there are limits to how much can be inferred about the data in levels
once it has been differenced.
Proceeding with the differenced data, we estimate the covariance matrix
by means of the unbiased estimator
Σ := (n − 1)^{-1} ∑_{i=1}^{n} x_i x_i',
where x is the normalized data series. In the final step we obtain a spectral
decomposition of the covariance matrix. That is
Σ = ∑_{i=1}^{m} λ_i v_i v_i',    (3.6)

where λ_1 ≥ · · · ≥ λ_m are the nonnegative eigenvalues of Σ in descending
order of magnitude, and v_i, i = 1, . . . , m, are the corresponding eigenvectors.

[Figure 3.1: Changes in log prices. Panels: (a) 2 Yr Treasury Note, (b) 5 Yr Treasury Note, (c) 10 Yr Treasury Bond, (d) 30 Yr Treasury Bond.]
The eigenvectors are orthonormal, that is, they have length one and are
mutually orthogonal. Hence, from the representation in (3.6), the contri-
bution of the i-th principal factor to the overall variance of the differenced
log price data is λ_i. The j-th component of the i-th eigenvector is the
factor loading or beta of the j-th security with respect to the i-th principal
component. That is, the components of the eigenvectors summarize expo-
sure levels. For instance, the second element of the third eigenvector is the
exposure of the 5 Year Treasury Note futures contract to the third principal
component or risk factor, the so-called curvature factor.
Recall that the eigenvectors are orthonormal, so that the associated eigen-
values represent the variance contribution of each principal component to
the variance of the differenced log price data. By taking the ratio of indi-
vidual eigenvalues to the sum of all four eigenvalues, we may estimate the
percentage contribution of each principal component to the overall variance.
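The eigen-decomposition and variance shares can be sketched in a few lines. The simulated returns here, with one dominant common factor, stand in for the differenced log prices of the four contracts:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 5000
# Simulated stand-in for the differenced log prices of the four contracts:
# one dominant common factor plus small idiosyncratic noise.
common = rng.normal(size=n)
loadings = np.array([0.5, 0.5, 0.5, 0.5])
x = np.outer(common, loadings) + rng.normal(scale=0.1, size=(n, 4))
x = x - x.mean(axis=0)  # demean before estimating the covariance

sigma = x.T @ x / (n - 1)                 # unbiased covariance estimator
eigvals, eigvecs = np.linalg.eigh(sigma)  # eigenvalues in ascending order
eigvals = eigvals[::-1]                   # reorder descending, as in (3.6)
eigvecs = eigvecs[:, ::-1]

# Percentage contribution of each principal component to total variance.
shares = eigvals / eigvals.sum()
```

With a single strong common factor, the first share dominates, much as the level factor does in the empirical data.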
The output of this analysis using subsamples corresponding to each calendar
year in our data set is recorded in Figure 3.2. What is immediately apparent
from these figures is that an overwhelming majority of the variability is directly
attributable to the first component; this component contributes between 90
and 93% of total variability, followed by the second component, contributing
between 4.8 and 12%. The contributions due to the third and fourth compo-
nents are fairly modest: the third component contributes between 0.5 and
3%, whereas the fourth component accounts for less than 0.5%. This result
is in agreement with previous work, such as Litterman & Scheinkman (1991)
and Bouchaud et al. (1999), studying the term structure using lower-frequency
data. By now, this is a stylized fact of the term structure of interest rates;
our results confirm it for higher-frequency data.
Figure 3.3 reports the loadings associated with each principal component
for a variety of subsamples. The loadings for the first factor are fairly stable across
maturity and time. The weights are uniformly close to 0.5, so that the effects
of shocks emanating from the first factor are felt uniformly across maturities.
The loadings associated with the second and third principal components show
a great deal of variation across time. In 2010, a shock emanating
from the second factor had more impact on long bonds than on short bonds.
[Figure 3.2: Variance contributions, by year: (a) 2010, (b) 2011, (c) 2012, (d) 2013, (e) 2014, (f) 2015.]

[Figure 3.3: Factor loadings by contract, by year: (a) 2010, (b) 2011, (c) 2012, (d) 2013, (e) 2014, (f) 2015.]
The situation was reversed in the following year. A similar reversal may be
observed for the third component, which in 2010 had a greater impact on
medium-term bonds than on both long and short bonds.
This empirical analysis of the factors underlying the data forms the basis
of the strategy we discuss in the sequel.
3.3.4 Factor structure implies cointegration
In this subsection we shall study the link between a factor structure de-
scription of the yield curve and the existence of cointegrating relationships
between contracts of different tenors. An n × 1 vector time series y is coin-
tegrated if each component of y is integrated of order p > 0, but there are k,
strictly less than n, independent linear combinations of the components of y
that result in processes that are integrated of order q, where q is strictly less
than p. For our purposes, we shall assume that p is one and q is zero. Hence,
cointegration in our setting means that y is a unit root process whose compo-
nents can be combined linearly in k independent ways to produce stationary
processes. The cointegrating relationships are usually normalized
and grouped together as the columns of an n × k matrix denoted β. By
definition, β has linearly independent columns; therefore, it has rank k < n.
Consider the following model of the yield on n bonds:
y_t = A f_t + u_t,    (3.7)
where y is an n × 1 random vector of yields of varying maturities, A is an
n×k matrix of factor weights, f is a k×1 random vector of common factors,
and u is an n × 1 stationary random vector. Without loss of generality, we
may assume that each of the k components of f are unit root processes;
otherwise, if only r < k components of f are unit root processes and the
remaining k − r are stationary, then we may simply re-write (3.7) as

y_t = B h_t + v_t,

with v_t = C g_t + u_t, f_t' = [h_t', g_t'], and A' = [B', C'], where A' is the matrix
transpose of A, B and C are, respectively, the n × r and n × (k − r) submatrices
of A, and h and g are, respectively, the r × 1 and (k − r) × 1 subvectors of f.
Returning to equation (3.7), the vector of factors may be assumed to be
a multivariate random walk, i.e.,
f_t = f_{t−1} + φ(L)ε_t,
where φ(L) is a lag polynomial, ε is white noise, and φ(L)ε is stationary.
There is some empirical evidence in our data that this assumption is not un-
reasonable. Using a matrix analysis argument, details of which may be found
in Theorem 3.1 of Escribano & Pena (1993), it is easy to verify that y may
be written as a sum of a stationary process and a unit root process:
y_t = w_t + z_t,

where w ∼ I(1) and z ∼ I(0). Both w and z may be computed explic-
itly given the matrix of factor loadings as follows: w_t = AA'y_t and z_t =
(A⊥)'A⊥ y_t, where A⊥ is the orthogonal complement of A, i.e., (A⊥)'A = 0.
Now, setting β := (A⊥)', it is easily seen that βy_t = βz_t ∼ I(0), so that β is
a matrix of cointegrating vectors. Hence, cointegration of the vector of yields
is a consequence of the factor structure of the yield curve.
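This construction can be checked numerically on a simulated factor model. The loadings, dimensions, and data below are hypothetical, chosen only to illustrate that β := (A⊥)' annihilates the common random-walk trends:

```python
import numpy as np

rng = np.random.default_rng(1)
n_assets, k, T = 4, 2, 3000

A = rng.normal(size=(n_assets, k))      # hypothetical factor loadings
# Orthogonal complement of the column space of A via a full QR decomposition.
Q, _ = np.linalg.qr(A, mode="complete")
A_perp = Q[:, k:]                       # n x (n - k), with (A_perp)'A = 0
beta = A_perp.T                         # candidate cointegrating vectors

f = np.cumsum(rng.normal(size=(T, k)), axis=0)  # k random-walk common factors
u = rng.normal(size=(T, n_assets))              # stationary idiosyncratic part
y = f @ A.T + u                                 # y_t = A f_t + u_t, as in (3.7)

# beta kills the common trends: beta y_t = beta u_t, which is stationary.
spread = y @ beta.T
```

The spread series has bounded dispersion while the yields themselves wander, which is exactly the cointegration property the text derives.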
The analysis in the previous section provides some indirect empirical sup-
port for the existence of orthogonal risk factors underlying the dynamics of
the term structure. Recall that our analysis of the factors employed differ-
enced price data. So, direct measurement of the risk factors is not an option,
but we could at least extract the differenced factors and compute their cu-
mulative sum. While this approach may lack rigor, it nevertheless provides a
glimpse of what the original factors might look like. Using the reconstructed
risk factors, we test the hypothesis that the level, slope, and curvature factors
are unit root processes. The result of this analysis is recorded in Table 3.2.
The results show that the hypothesis of a unit root for the risk factors cannot
be rejected at any reasonable level for any of the six calendar years included in
our data set.
3.3.5 Cointegration implies a factor structure
In the previous section, we argued that cointegration is natural assuming the
underlying data admits a factor structure. In this section, we argue that the
converse is also true. Starting with the assumption that the components of
y are integrated of order one, y may be expressed, using lag polynomials, as:

(1 − L)y_t = Φ(L)ε_t,    (3.8)

where ε is n × 1 iid noise, L is the lag operator, Φ(L) = ∑_{j=0}^{∞} Φ_j L^j, Φ_j is an
n × n matrix, and Φ_0 is the n × n identity matrix. The last condition is an
accommodation for the presence of a deterministic linear trend.
Cointegration entails a restriction on the process Φ(L)ε, and on Φ(1) in
particular. Indeed, writing Φ(L) = Φ(1) + (1 − L)Φ*(L), where Φ*(L) :=
(1 − L)^{-1}(Φ(L) − Φ(1)), equation (3.8) may be expressed as:

(1 − L)y_t = Φ(1)ε_t + (1 − L)Φ*(L)ε_t.    (3.9)

Solving (3.9) by recursive substitution yields

y_t = Φ(1)z_t + Φ*(L)ε_t,    (3.10)
where z_t := ∑_{i=0}^{t−1} ε_{t−i}. Now, cointegration implies the existence of an n × k
matrix β, the matrix of cointegrating vectors, such that β'y_t is integrated of
order 0; but since z_t is a multivariate random walk, it must be the case that
β'Φ(1) = 0. Now, since β has rank k and the columns of Φ(1) lie in the subspace
orthogonal to the column space of β, it must be the case that Φ(1) has rank n − k.
Using the Jordan canonical form, we may write

Φ(1) = A J A^{-1},

where J is an (n − k) × (n − k) diagonal matrix containing the non-zero
eigenvalues of Φ(1), A is the corresponding n × (n − k) matrix of eigenvectors,
and A^{-1} is the right inverse of A. This decomposition is possible because Φ(1)
only has n − k non-zero eigenvalues. Now setting u_t := J A^{-1}ε_t and ν_t :=
Φ*(L)ε_t, and substituting into (3.10) yields

y_t = A f_t + ν_t,    (3.11)

where f_t = f_{t−1} + u_t. The interesting thing about (3.11) is that f is an
(n − k) × 1 unit root process driving y. That is, cointegration implies a
factor structure. This result appears at various levels of generality in Stock
& Watson (1988) and Escribano & Pena (1993).
3.4 Methodology
The basic trading mechanism consists of two main steps. The first step tests
for cointegration between the four futures prices and estimates the parameters
of a stationary portfolio of the four contracts under the hypothesis of cointe-
grated prices. The portfolio weights are the components of the cointegration
vector. We use a month’s worth of daytime (7:30 to 14:00 CT) trading data
sampled at one minute intervals for this step. This period is the so-called for-
mation period. Besides estimating the cointegration vector, we also estimate
the first two central moments of the stationary portfolio.
In the second step, we start monitoring prices immediately after the for-
mation period to identify price configurations that may be too rich or too
cheap according to our estimates of the first two moments from the forma-
tion period. This so-called trading period lasts for about three weeks (100
daytime trading hours) from the end of the formation period. Specifically,
we consider the price configuration to present a buy opportunity if the price
of the stationary portfolio falls below two standard deviations of the sam-
ple mean computed on the basis of the data generated during the formation
period. There is a sell opportunity if the price climbs beyond two standard
deviations of the mean price from the formation period. Hence, a position is
entered into whenever the price of the synthetic asset, constructed from the
cointegration vector, veers outside the two standard deviation band; the po-
sition is long or short according to whether the price configuration is deemed
cheap or rich. Short-sale constraints are almost non-existent in the futures
market, so they do not enter into our analysis.
Table 3.2: Augmented Dickey-Fuller Tests

(a) 2010
                     Level    Slope    Curvature
intercept            0.00     0.00     0.00
t-stat (intercept)   2.32     2.10     1.87
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -2.08    -1.97    -1.88

(b) 2011
                     Level    Slope    Curvature
intercept            0.00     -0.00    -0.00
t-stat (intercept)   1.78     -1.74    -1.52
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -0.46    -0.31    -0.24

(c) 2012
                     Level    Slope    Curvature
intercept            0.00     0.00     0.00
t-stat (intercept)   1.01     1.04     0.85
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -1.30    -1.30    -1.50

(d) 2013
                     Level    Slope    Curvature
intercept            -0.00    -0.00    -0.00
t-stat (intercept)   -1.26    -1.34    -1.33
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -1.51    -1.21    -0.79

(e) 2014
                     Level    Slope    Curvature
intercept            0.00     0.00     -0.00
t-stat (intercept)   2.34     2.62     -2.71
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -1.75    -2.03    -2.31

(f) 2015
                     Level    Slope    Curvature
intercept            0.00     0.00     0.00
t-stat (intercept)   1.89     0.34     1.37
lag                  -0.00    -0.00    -0.00
t-stat (lag)         -2.27    -1.56    -1.42
Positions are opened at any time during the trading period; they are closed
as soon as the price of the synthetic asset experiences a large enough cor-
rection after its excursion away from the sample mean estimated from the
formation period. Specifically, a position is closed as soon as the price falls
within the one standard deviation band. Hence, after each correction, at
least one standard deviation is earned on the round-trip trade. This process
is continued until the end of the trading period at which time all open po-
sitions are liquidated at the quoted price. Generally, this is the only time a
loss can be registered, since a correction might not have taken place prior to
the end of the trading period.
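The entry and exit logic just described can be sketched as follows. The prices and formation-period moments below are hypothetical; the band widths mirror the two- and one-standard-deviation rules above:

```python
# Sketch of the trading rule: enter when the synthetic-asset price leaves the
# two-standard-deviation band around the formation-period mean, exit when it
# falls back within the one-standard-deviation band.

def generate_signals(prices, mean, std):
    position = 0  # +1 long, -1 short, 0 flat
    signals = []
    for p in prices:
        if position == 0:
            if p < mean - 2 * std:
                position = 1    # price configuration deemed cheap: buy
            elif p > mean + 2 * std:
                position = -1   # deemed rich: sell short
        elif abs(p - mean) < std:
            position = 0        # correction: close within the one-sigma band
        signals.append(position)
    return signals

# Hypothetical formation-period moments: mean 100, standard deviation 1.
sigs = generate_signals([100.0, 97.5, 98.5, 99.5, 102.5, 100.5], 100.0, 1.0)
```

Each round trip therefore earns at least one standard deviation, unless the position is force-closed at the end of the trading period.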
The entire process is repeated on a rolling window from the start of the
sample (1 April 2010) to the end of the sample (31 December 2015). In
both steps we use exclusively quote data as opposed to transaction data. An
advantage of using quote data is that it is simply more plentiful and
may more accurately represent the state of the market as perceived by an
agent at any given moment. During the synthetic portfolio formation stage,
the cointegration vectors and the first two moments are estimated using the
midpoint of the best bid and ask prices. During the trading stage, positions
are opened and closed using quoted bid and ask prices: a long position is
entered into at the ask and shorts executed at the bid.
The evaluation of the strategy using quoted prices is imperative given the
short-term nature of the strategy. All positions are opened for at most 100
daytime trading hours. Theoretically, a position could be entered into and
exited the very next minute. For such short investment horizons, the bid-ask
spread looms very large. By using quote data, execution costs arising from
the bid-ask spread are automatically taken into account. Of course, there are
other types of execution costs, but the bid-ask spread is usually the largest
source of execution costs, and the use of quoted prices takes care of it right
away.
3.5 Results
3.5.1 Return calculation
Evaluating the performance of a trading strategy that may involve long and
short positions is not altogether a straightforward matter. In fact, the
literature gives little guidance on how to define the one-period return of a
portfolio consisting of long and short positions. The issue is without com-
plications for a portfolio consisting entirely of long positions; the one-period
return is simply the difference between the starting and ending value of the
portfolio divided by its starting value. Unfortunately, this definition presents
difficulties as soon as portfolios with both long and short positions are con-
sidered. For such portfolios, the initial investment could be arbitrarily small,
zero, or even negative due to the offsetting effects of long and short posi-
tions. In the case of a zero-cost portfolio, the period return is either positive
infinity or negative infinity, regardless of the actual change in the value of
the portfolio.
It is easy to see that the standard definition is problematic for portfolios
with both long and short positions because the value of the portfolio at the
start of the period is always taken as the basis for measuring the performance
of the portfolio over the period. By reconsidering the investment simply in
terms of cash inflows and outflows much of the difficulties of the standard
approach may be overcome. The cash flow perspective assumes that the
entire portfolio is marked to market at the end of each investment period, so
that there is a cash flow at the start and end of each period. Cash inflows
and outflows are defined from the perspective of the investor. A long position
involves an initial cash outflow followed by a cash inflow at the end of the
period. The situation is reversed for short positions: an initial cash inflow
followed by a cash outflow at the end of the period. Given a portfolio of long
and short positions, the one-period return is simply the natural logarithm
of the ratio of the total cash inflows, from both types of positions, to the
total cash outflows, also from both long and short positions. This measure is
approximately equal to the ratio of the difference between cash inflows and
outflows to cash outflows for the period. That is
r_t = log(Inflows_t / Outflows_t) ≈ (Inflows_t − Outflows_t) / Outflows_t.    (3.12)
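The cash-flow return in (3.12) and its simple-return approximation can be checked in a few lines (the cash-flow figures below are made up for illustration):

```python
import math

def cashflow_return(inflows, outflows):
    """One-period return per (3.12): the log of total cash inflows
    over total cash outflows for the period."""
    return math.log(inflows / outflows)

# Hypothetical flows: 100 goes out at the start, 103 comes back in.
r_log = cashflow_return(103.0, 100.0)
r_simple = (103.0 - 100.0) / 100.0  # the approximation on the right of (3.12)
# log(1.03) differs from 0.03 by less than 0.0005
assert abs(r_log - r_simple) < 5e-4
```

The approximation is tight only for returns of moderate size; for large swings the two measures diverge.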
While this definition of period return may seem reasonable for perfor-
mance measurement in the majority of spot/cash markets, it is not without
controversy where futures markets are concerned. Black (1976) observed that
it is, in principle, impossible to define fractional or percentage returns for a
position in futures contracts. This is because, the time t quoted price of a
futures contract is merely the price at which the underlying instrument may
be exchanged at an agreed upon future date; no actual transactions occur
immediately, so that there are no cash outlays at time t. There is only one
transaction, and it occurs at the end of the contract in the form of an out-
flow or inflow but not both. In practice, both the long and the short sides
of a futures contract are required by the trading venue to post collateral to
offset the risk of default. Ordinarily, there is a mandated minimum collateral
required by the brokerage firm used by the investor. This minimal collateral
is otherwise known as the initial margin.
A position in futures contracts is marked to market daily, so that favorable
price moves result in credits and unfavorable price moves in debits to
the margin account. To prevent the margin account from being entirely de-
pleted in the event of a succession of unfavorable price moves, the exchange
may set a maintenance margin, which is a minimum balance that must be
maintained in the margin account at all times after the initial transaction.
Usually, the maintenance margin is the same amount as the initial margin,
but it may sometimes be lower. Margin requirements may differ according to
whether the investor is classified as a member of the exchange or a non-
member speculator. In 2016, the margin requirement for investors without
membership licenses to the Chicago Mercantile Exchange (CME) was 10% higher
than the margin requirement for members of the CME.
Technically, the margin is not to be taken as an initial investment, but it
may be argued that it is the amount of cash required to make the transaction
possible; without it, the position cannot be established. Arguing in this
manner, we may define the return of a long position in a futures contract
at time t to be the change in the price of the contract divided by the initial
margin. That is
r_t = (F_t − F_{t−1}) / M,    (3.13)
where M is the initial margin and Ft is the price at the end of time t of
the futures contract. For a short position, the numerator above is multiplied
by negative one. This basic definition is also plagued by the usual problems
encountered when computing the return generated by a portfolio of both
long and short positions. Reasoning as in (3.12), the return metric defined
in (3.13) based on the timing of cash flows may be modified to only take
into account the direction of cash flows. Hence, given n different futures
contracts, we may define the performance metric
r_t = Σ_{i=1}^n (Inflows_{i,t} − Outflows_{i,t}) / Σ_{i=1}^n (Leverage Ratio_{i,t} × Par Value_{i,t} × Q_{i,t}),    (3.14)
where the leverage ratio is simply the ratio of the initial margin of the i-th
contract to the par value of the underlying bond, and Qi,t is the exposure, in
terms of number of contracts, to the i-th contract at time t. In our setting n
is four and the contracts are distinguished by their tenors. Definition (3.12)
is a special case of the above; it holds when the position is fully funded,
that is, when the leverage ratio is one.
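A minimal sketch of (3.14); the per-contract cash flows below are illustrative, and the code makes explicit that each denominator term, leverage ratio × par value × quantity, reduces to margin × quantity:

```python
def margin_based_return(inflows, outflows, margins, par_values, quantities):
    """Portfolio return per (3.14): net cash flow over total collateral
    posted.  Since the leverage ratio is initial margin / par value, each
    term leverage_ratio * par_value * quantity is just margin * quantity."""
    net = sum(i - o for i, o in zip(inflows, outflows))
    posted = sum((m / p) * p * q
                 for m, p, q in zip(margins, par_values, quantities))
    return net / posted

# Four tenors (2, 5, 10, 30 years), one contract each; the cash flows are
# hypothetical, the margins and notionals follow Table 3.3 (speculators).
r = margin_based_return(
    inflows=[120.0, 260.0, 410.0, 800.0],
    outflows=[100.0, 250.0, 430.0, 750.0],
    margins=[493.0, 900.0, 1456.0, 2912.0],
    par_values=[200000.0, 100000.0, 100000.0, 100000.0],
    quantities=[1, 1, 1, 1],
)
# net cash flow of 60 over 5761 of posted margin
```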
Table 3.3: Time-averaged CME margin requirements between 1 April 2010 and 31 December 2015.

                                     Initial margin
Contract   Notional value   Members   Speculators
2 Yr       200000           448       493
5 Yr       100000           818       900
10 Yr      100000           1323      1456
30 Yr      100000           2647      2912
The initial margins are ordinarily not the same across contracts and,
therefore, must be handled carefully. For instance, in the last quarter of 2016,
the initial margin for the 30-Year Treasury Bond Futures contract was $4000,
whereas the initial margin of the 2-Year Treasury Note Futures contract was
only $550. Besides the differences in initial margins by contract types, there
are also variations over time. For most of 2010, the initial margin requirement
for the 5-Year Treasury Note Futures contract was $800 for investors with
membership licenses and $880 for non-members. Meanwhile, for all of 2014,
the initial margin for the same contract was $900. To simplify our analysis,
we compute a time-weighted average of the initial margin for the time period
between 1 April 2010 and 31 December 2015 for each contract type. The
time-weighted averages for members and non-members of the CME are recorded in
Table 3.3. As may be expected, margin requirements increase with the tenor
of the underlying, because the prices of contracts with longer maturities are
more likely to experience large price swings.
As previously stated, the initial margin is merely the minimum collateral
required to initiate a transaction in one futures contract. An investor may
choose to apply however much collateral he or she desires. If each transaction
is fully funded, i.e., if the exact amount of the exposure to each contract
is always set aside for each transaction, then the appropriate performance
measure would be a slight modification of the formula given in (3.12). That
is
r_t = Σ_{i=1}^n (Inflows_{i,t} − Outflows_{i,t}) / Σ_{i=1}^n Outflows_{i,t}.    (3.15)
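The fully funded metric (3.15) is then simply net cash flow over total outflows; a sketch with hypothetical flows:

```python
def fully_funded_return(inflows, outflows):
    """Return per (3.15): net cash flow over total outflows, the fully
    funded special case of (3.14) where the leverage ratio is one."""
    net = sum(i - o for i, o in zip(inflows, outflows))
    return net / sum(outflows)

# Two positions, hypothetical flows: 4 of net inflow against 150 paid out.
r = fully_funded_return(inflows=[103.0, 51.0], outflows=[100.0, 50.0])
```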
We remark that the use of fully funded accounts in the treasury futures
market is not very common. Consider that the notional value of the 2-Year
treasury futures contract is $200,000 and $100,000 for the others. Hence,
putting together a portfolio consisting of even a small number of contracts
quickly becomes prohibitively capital intensive. Meanwhile, treasury futures,
even those written on long bonds, have relatively stable long-term prices. As
a result, investments in the treasury futures market are most often under-
taken using leverage or a margin account.
We conclude this section with a remark on the distinction between an
investment period and a trading period. Trading periods are fixed: they are
exactly 6000 daytime trading minutes, approximately 14 trading days. An
investment period is simply the time between when a position is opened and
the time when it is closed. Positions are opened when the price configurations
of the four securities indicate a departure from the stable relationship estab-
lished during the preceding formation period. The positions are closed when
the stable relationship is restored. This deviation and restoration towards a
stable relationship may occur several times during a single trading period,
thereby creating multiple opportunities and, hence, investment periods.
The return formulas in (3.14) and (3.15) relate to the return over a single
investment period. For trading periods containing multiple investment periods, we compute
the return over the trading period as the sum of the individual returns gener-
ated from each investment period contained within the trading period. That
is
r_t = Σ_{i=1}^q r_{i,t},

where r_{i,t} is the return, computed via formula (3.14) or (3.15), of the i-th
investment period of the t-th trading period, and q is the total number of
investment periods occurring in the t-th trading period.
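The aggregation of investment-period returns into trading-period returns can be sketched as follows (the pairing of returns with trading-period indices is hypothetical):

```python
from collections import defaultdict

def trading_period_returns(investment_returns):
    """investment_returns: iterable of (trading_period, return) pairs, one
    per investment period.  Returns the sum of the investment-period
    returns within each trading period."""
    totals = defaultdict(float)
    for t, r in investment_returns:
        totals[t] += r
    return dict(totals)

# Trading period 1 contains three investment periods, period 2 just one.
r = trading_period_returns([(1, 0.02), (1, -0.01), (1, 0.015), (2, 0.03)])
# r[1] ≈ 0.025 and r[2] == 0.03
```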
3.5.2 Excess returns
We summarize the distribution of returns generated by backtesting the coin-
tegration strategy described in the previous section in Table 3.4. The back-
test is run over daytime trading hours between April 1, 2010 and December
31, 2015. The entire period is divided into 96 hundred-hour trading periods
lasting approximately three business weeks. The figures shown in the table
are the annualized hundred-hour returns. We show cash flow-based returns
computed on a fully funded account in the first panel of the table; the second
panel displays the distribution of cash flows-based returns computed using
the initial margin as the cost basis. Each panel reports two sets of backtest
results: one for prices adjusted backwards by the application of a propor-
tional factor (Ratio) and the other for prices shifted in levels backward by
the amount of the roll-induced price gap (Difference).
Now, the annualized excess returns over the equally weighted portfolio
of all four contracts, assuming a fully-funded account, are 6.01% and 6.00%
Table 3.4: Annualized (100 trading hours) returns on initial margin and fully funded account.

Panel A: Fully-funded excess return over the equal-weighted portfolio

                                  Ratio       Difference
Average return                    0.0604      0.05995
Standard error (Newey-West)       0.02625     0.02737
t-Statistic                       2.3012      2.18984
Excess return distribution
Median                            0.04749     0.03655
Standard deviation                0.30453     0.35263
Skewness                          2.12072     1.66976
Kurtosis                          14.58188    9.49711
Minimum                           -0.68255    -0.75226
5% Quantile                       -0.29139    -0.35234
95% Quantile                      0.51928     0.61858
Maximum                           1.85812     1.79468
% of negative excess returns      40.625      44.79167

Panel B: Return on margin account

                                  Ratio       Difference
Average return                    15.12327    14.95951
Standard error (Newey-West)       3.87113     4.13448
t-Statistic                       3.90668     3.61823
Excess return distribution
Median                            0.6238      0
Standard deviation                46.01987    51.17213
Skewness                          1.63053     1.29959
Kurtosis                          10.99074    7.87337
Minimum                           -105.68033  -110.64147
5% Quantile                       -51.1802    -70.32086
95% Quantile                      83.25456    96.94193
Maximum                           259.66658   250.56902
% of negative excess returns      14.58333    16.66667
respectively for the ratio and the difference price adjustment procedures. The
Newey-West adjusted t statistics are 2.3 and 2.2, respectively. Given this
result, the hypothesis that the cointegration strategy dominates the equal-weighted
portfolio cannot be rejected. The idea is that one may short as
many units of the equal-weighted portfolio as necessary and use the proceeds to
set up the cointegration strategy without incurring a loss.
Meanwhile, the annualized returns using the initial margin as the cost basis
are, respectively, roughly 1500% and 1490%. These returns are not as preposterous as
they first seem. Consider that the leverage factor implicit in the initial margin
for the 2-Year contract is 446 and that of the 5-Year contract is 122. The
inflated returns are, therefore, merely a consequence of the inflated leverage
factors. The t statistics in both cases are in excess of 2. Also, note that
the out-sized returns that may be achieved by trading on the initial margin
come at the expense of taking significant risks: consider that the standard
deviation of the returns on initial margin is 159.17 times the volatility of the
return on the fully funded account. Clearly, in practice, what an investor ends
up doing would be somewhere between trading a fully-funded account and
posting the minimum required collateral. At any rate, our analysis provides
a starting point for reasoning about how to incorporate leverage in a more
realistic, real-world strategy.
3.5.3 Statistical Arbitrage
Stephen Ross (1976a) gave the first serious treatment of the concept of arbi-
trage. While his treatment might have been of a heuristic nature, it never-
theless conveyed the essence of an arbitrage, which is a trading strategy that
yields a positive payoff with little to no downside risk. The first rigorous
definition appeared in Huberman (1982), where it was defined in the context
of an economy with assets generated by a set of risk factors as a sequence of
portfolios with payoffs φ_n such that E(φ_n) tends to +∞ while Var(φ_n) tends
to 0. The concept has undergone numerous changes in the literature, see for
example Kabanov (1996), but the essential meaning of the term as a low-risk
investment still remains. We focus on a special type of arbitrage known
in the empirical asset pricing literature as a statistical arbitrage, which Hogan
et al. (2004) define as a zero-cost, self-financing strategy whose cumulative
discounted value v(t) satisfies:
1. v_0 = 0,

2. lim_{t→∞} E(v_t) > 0,

3. lim_{t→∞} P(v_t < 0) = 0, and

4. if v_t can become negative with positive probability, then Var(v_t)/t → 0 as t → ∞.
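A small Monte Carlo illustration of conditions 2 and 3, assuming Gaussian increments of the form µ + σ t^λ z_t with µ > 0 and λ < 0 (the parametric form adopted in (3.16) below; the parameter values here are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, lam = 0.3, 1.0, -0.6
T, n_paths = 1000, 2000

# Cumulative gains v_t with increments mu + sigma * t**lam * z_t.
t = np.arange(1, T + 1)
z = rng.standard_normal((n_paths, T))
v = (mu + sigma * t ** lam * z).cumsum(axis=1)

# Condition 2: the expected cumulative gain grows without bound.
early_mean, late_mean = v[:, 1].mean(), v[:, -1].mean()
# Condition 3: the probability of a negative cumulative gain vanishes.
early_neg, late_neg = (v[:, 1] < 0).mean(), (v[:, -1] < 0).mean()
assert late_mean > early_mean
assert late_neg < early_neg
```

Early in the path a sizable fraction of the simulated gains are negative, but the decaying volatility drives that probability toward zero while the mean gain accumulates.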
Even though the above notion of arbitrage bears resemblance to the original
definition given by Huberman (1982), it is worth noting that there is a cru-
cial difference between the two concepts. In the first instance, the limit is
taken with respect to the cross-section of the economy, i.e. the sequence of
small economies is assumed to expand without bound, whereas the definition
given above requires the investment horizon to tend to infinity. It is easily
verified that the first three conditions, assuming the existence of the first
moment, correspond to the definition of an arbitrage in the classical sense
of (Delbaen & Schachermayer, 1994). Chapter 2 of Dare (2017) obtains a
series of equivalent characterizations of market efficiency. In particular, Dare
(2017, Proposition 2.1) shows that a necessary condition for market efficiency
is the existence of a local martingale measure after expressing asset prices
in units of any strictly positive convex portfolio of the zeroth asset and any
other asset (possibly itself). To apply this result, we express prices in units
of the 2-Year treasury futures contract and attempt to exhibit a violation of
the no-arbitrage condition.
Following Hogan et al. (2004) we propose to test market efficiency under
the assumption that the change in the discounted cumulative gains of the
strategy satisfies:
v_t − v_{t−1} = µ + σ t^λ z_t,    (3.16)
where t is an integer; σ, λ, and µ are real numbers; and zt is an i.i.d. se-
quence of standard normal random variables. The model allows for determin-
istic time variation in the second moment, but makes the seemingly strong
assumption that there is no serial correlation between returns. Our own
simulations reveal that the effects of serial correlation are slight. In fact,
assuming z_t were generated by an AR(1) process, differences of more than 10%
in average standard errors only start to occur for values of the autoregressive
parameter in excess of 0.9 in absolute value. Since the sample autocorrelation
of the returns of the strategy is only -0.158, serial correlation is likely
a minor issue.
The inference strategy we have adopted is not without weaknesses. A
stylized empirical fact of financial markets is that asset returns generally have
fat-tail distributions. The normality assumption may therefore seem overly
restrictive. Moreover, the adopted parametric model may itself be a source of
misspecification errors. A more sophisticated analysis would perhaps employ
robust tools such as the bootstrap.
While the above criticisms may be valid, note that the test statistics
discussed in Hogan et al. (2004) and Dare (2017, Chapter 2) are very con-
servative because they rely on the Bonferroni criterion, which stipulates that
in compound tests involving a joint hypothesis, the sum of the p-values of
the individual tests is an upper limit for the Type I error of the compound
test (Casella & Berger, 1990, p.11).
The test of efficiency then consists of estimating the parameters of (3.16)
and then testing for
1. µ > 0, and
2. λ < 0.
For, when µ > 0 and λ < 0, the expected cumulative gain increases without
bound while the variance of the incremental gains tends to zero. Under the
assumption of normally distributed z_t, the test can be carried out by maximum
likelihood estimation. Table 3.5 summarizes the results of estimating
model (3.16). The p-value of the joint test using the Bonferroni correction,
which consists of adding up the p-values from each sub-hypothesis, of the
presence of a statistical arbitrage is approximately 0.02 in both the ratio and
difference-adjusted price series.
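The maximum likelihood fit of (3.16) can be sketched on synthetic increments; the optimizer choice (scipy's Nelder-Mead) and the parameter values are our own assumptions, not the dissertation's implementation:

```python
import numpy as np
from scipy.optimize import minimize

def neg_loglik(params, x):
    """Negative Gaussian log-likelihood of (3.16): the increment at
    time t is distributed N(mu, (sigma * t**lam)**2)."""
    mu, log_sigma, lam = params
    t = np.arange(1, len(x) + 1)
    sd = np.exp(log_sigma) * t ** lam  # log-parametrize sigma to keep it positive
    return 0.5 * np.sum(np.log(2 * np.pi * sd ** 2) + ((x - mu) / sd) ** 2)

# Synthetic increments with a positive drift and decaying volatility.
rng = np.random.default_rng(1)
true_mu, true_sigma, true_lam = 0.33, 1.0, -0.65
t = np.arange(1, 97)  # 96 trading periods, as in the backtest
x = true_mu + true_sigma * t ** true_lam * rng.standard_normal(t.size)

fit = minimize(neg_loglik, x0=[0.0, 0.0, 0.0], args=(x,), method="Nelder-Mead")
mu_hat, lam_hat = fit.x[0], fit.x[2]
# A statistical arbitrage requires mu_hat > 0 and lam_hat < 0.
```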
The p-values for the estimates of σ are relatively large, especially for the
ratio-adjusted price series. This is to be expected since, for large t, the
volatility of incremental payoffs is mostly determined by the term t^λ and
not by σ.

Table 3.5: Test for statistical arbitrage

            Difference                       Ratio
Par.    Estimate   Std Error   p-value   Estimate   Std Error   p-value
µ       0.3351     0.0108      <0.01     0.3735     0.0106      <0.01
σ       1.151      0.5397      0.041     4.98       3.2077      0.1195
λ       -0.6808    0.1288      <0.01     -1.0353    0.1779      <0.01
We conduct a separate test with σ restricted to 1. The output of that test is
recorded in Table 3.6. The estimates and p-values obtained from the restricted
model match the estimates from the unrestricted model. These results provide
a strong indication that the strategy not only earns a positive excess return
over the equal-weighted portfolio but also represents a free lunch.
Table 3.6: Test for statistical arbitrage (σ = 1)

            Difference                       Ratio
Par.    Estimate   Std Error   p-value   Estimate   Std Error   p-value
µ       0.333      0.0082      <0.01     0.3474     0.0104      <0.01
λ       -0.6452    0.0199      <0.01     -0.5885    0.0204      <0.01
3.6 Conclusion
Starting with the assumption that interest rates and, therefore, bond futures
prices admit a factor structure, we evaluate a trading strategy based on the
assumption of cointegrated bond futures prices. We argue that cointegration is
natural if in fact the dynamics of the yield curve are driven by orthogonal risk
factors which together form a jointly unit root process. Direct verification
of this hypothesis is difficult because the price series and its log-transformed
counterpart are likely not stationary. On the other hand, the stationarity
assumption can be made and tested using the change in log prices. Using differenced
price data, we argued empirically that the vast majority of the volatility
experienced by the changes in log prices may arise from three dominant risk
factors. Since the factors are orthogonal and the differenced log prices may
be assumed to be stationary, there is very little doubt that the factors contained
in the differenced log prices are stationary. To test the claim for the
log prices in levels, we estimated the factors in levels by taking cumulative
sums of the factors estimated using differenced log prices. For these proxies
of the factors in levels, the assumption of unit roots could not be rejected at
reasonable significance levels.
With the choice of cointegrating strategy properly motivated, we pro-
ceeded to evaluate a simple trading strategy based on the cointegration hy-
pothesis. The crux of the strategy consists in opening a position as soon as
the price configuration appears to deviate from an estimated stable cointegration
relationship. The strategy is evaluated by computing the ratio of
cash-equivalent inflows to outflows. We also consider a return metric based
on the initial margin required to take either a short or a long position in
one futures contract. Our results reveal that the gains from this strategy
are both economically and statistically significant. This exercise allows us to
argue along the lines initiated by Jarrow & Larsson (2012) that the U.S.
treasury bond futures market, for the period for which we have data, was
not informationally efficient.
Bibliography
Ansel, Jean-Pascal and Stricker, Christophe (1994) “Couverture des actifs
contingents et prix maximum”, Annales de l’Institut Henri Poincaré, Section B,
Vol. 30, No. 2, pp. 303–315.
Banz, Rolf W. (1981) “The relationship between return and the market value
of common stocks”, Journal of Financial Economics, Vol. 9, pp. 3–18.
Barndorff-Nielsen, Ole E. and Shephard, Neil (2004) “Econometric analysis
of realized covariation: High frequency based covariance, regression, and
correlation in financial economics”, Econometrica, Vol. 72, No. 3, pp. 885–
925.
Black, Fischer (1976) “The pricing of commodity contracts”, Journal of Fi-
nancial Economics, Vol. 3, pp. 167–179.
Bouchaud, Jean-Philippe, Sagna, Nicolas, Cont, Rama, El-Karoui, Nicole,
and Potters, Marc (1999) “Phenomenology of the interest rate curve”, Ap-
plied Mathematical Finance, Vol. 6, pp. 209–232.
Bradley, Michael G. and Lumpkin, Stephen A. (1992) “The treasury yield
curve as a cointegrated system”, The Journal of Financial and Quantitative
Analysis, Vol. 27, No. 3, pp. 449–463.
Casella, George and Berger, Roger L. (1990) Statistical inference: Thomson
Learning, 2nd edition.
Christensen, J.H.E., Diebold, F.X., and Rudebusch, G.D. (2011) “The affine
arbitrage-free class of Nelson-Siegel term structure models”, Journal of
Econometrics, Vol. 164, pp. 4–20.
Christensen, Ole (2006) “Pairs of dual Gabor frame generators with com-
pact support and desired frequency localization”, Applied Computational
Harmonic Analysis, Vol. 20, pp. 403–410.
(2008) Frames and bases: An introductory course, Boston:
Birkhauser.
Cochrane, John Howland (2001) Asset pricing: Princeton Univ. Press.
Cox, John C., Jonathan E. Ingersoll, Jr., and Ross, Stephen A. (1981) “The
relation between forward prices and futures prices”, Journal of Financial
Economics, Vol. 9, pp. 321–346.
Cuchiero, Christa, Klein, Irene, and Teichmann, Josef (2015) “A new per-
spective on the fundamental theorem of asset pricing for large financial
markets”, arXiv.
Dare, Wale (2017) “On market efficiency and volatility estimation”, Ph.D.
dissertation, University of St.Gallen, Switzerland.
Daubechies, Ingrid (1992) Ten lectures on wavelets: CBMS-NSF Series in
Applied Mathematics, SIAM.
Delbaen, Freddy and Schachermayer, Walter (1994) “A general version of the
fundamental theorem of asset pricing”, Mathematische Annalen, No. 300,
pp. 463 – 520.
(1997) “The Banach space of workable contingent claims in arbitrage
theory”, Annales de l’institut Henri Poincare (B) Probability and Statistics,
Vol. 33, pp. 113–144.
(1999) “The fundamental theorem of asset pricing for unbounded
stochastic processes”, Report Series SFB, “Adaptive Information Systems
and Modelling in Economics and Management Science” 24, WU Vienna
University of Economics and Business, Vienna.
Delbaen, Freddy Y. and Schachermayer, Walter (1995) “The no-arbitrage
property under the change of numeraire”, Stochastics and Stochastic Re-
ports, Vol. 53, pp. 213–226.
Diebold, Francis X. and Li, Canlin (2006) “Forecasting the term structure of
government bond yield”, Journal of Econometrics, Vol. 130, pp. 337–364.
Diebold, Francis X. and Rudebusch, Glenn D. (2013) Yield Curve Modelling
and Forecasting: Princeton University Press.
Duffie, Darrell and Kan, Rui (1996) “A yield-factor model of interest rates”,
Mathematical Finance, Vol. 6, No. 4, pp. 379–406.
Dybvig, Philip H., Jonathan E. Ingersoll, Jr., and Ross, Stephen A. (1996)
“Long forward and zero-coupon rates can never fall”, The Journal of Busi-
ness, Vol. 69, No. 1, pp. 1–25.
Escribano, Alvaro and Pena, Daniel (1993) “Cointegration and common fac-
tors”, Working paper 93-11, Universidad Carlos III de Madrid.
Fama, Eugene (1969) “Efficient capital markets: A review of theory and
empirical work”, The Journal of Finance, Vol. 25, No. 2, pp. 387–417.
Fama, Eugene F. and French, Kenneth R. (1993) “Common risk factors in
the return on stocks and bonds”, Journal of financial economics, Vol. 33,
pp. 3–56.
Fan, Jianqing and Wang, Yazhen (2008) “Spot volatility estimation for high-
frequency data”, Statistics and its Interface, Vol. 1, pp. 279–288.
Fernholz, E. Robert (2002) Stochastic Portfolio Theory, New York: Springer-
Verlag New York.
Filipović, Damir (1999) “A note on the Nelson-Siegel family”, Mathematical
Finance, Vol. 9, pp. 349–359.
Florens-Zmirou, Danielle (1993) “On estimating the diffusion coefficient
from discrete observations”, Journal of Applied Probability, Vol. 30, No. 4,
pp. 790–804.
Foster, Dean P. and Nelson, Dan B. (1996) “Continuous record asymptotics
for rolling sample variance estimators”, Econometrica, Vol. 64, No. 1, pp.
139–174.
Genon-Catalot, V., Laredo, C., and Picard, D. (1992) “Non-parametric es-
timation of the diffusion coefficient by wavelets methods”, Scandinavian
Journal of Statistics, Vol. 19, No. 4, pp. 317–335.
Hall, Anthony D., Anderson, Heather M., and Granger, Clive W. (1992) “A
cointegration analysis of treasury bill yields”, The Review of Economics
and Statistics, Vol. 74, No. 1, pp. 116–126.
Halmos, Paul R. (1950) Measure Theory: Springer-Verlag New York.
Harrison, J. Michael and Pliska, Stanley R. (1981) “Martingales and stochas-
tic integrals in the theory of continuous trading”, Stochastic Processes and
their Applications, Vol. 11, pp. 215–260.
He, Sheng-wu, Wang, Jia-gang, and Yan, Jia-an (1992) Semimartingale the-
ory and stochastic calculus: CRC Press Inc.
Hoffmann, M., Munk, A., and Schmidt-Hieber (2012) “Adaptive wavelet es-
timation of the diffusion coefficient under additive error measurements”,
Annales de l’institut Henri Poincaré, Probabilités et Statistiques, Vol. 48,
No. 4, pp. 1186–1216.
Hogan, Steve, Jarrow, Robert, Teo, Melvyn, and Warachka, Mitch (2004)
“Testing market efficiency using statistical arbitrage with applications to
momentum and value strategies”, Journal of Financial Economics, Vol. 73,
pp. 525 – 565.
Hubalek, Friedrich, Klein, Irene, and Teichmann, Josef (2002) “A general
proof of the Dybvig-Ingersoll-Ross theorem: Long forward rates can never
fall”, Mathematical Finance, Vol. 12, No. 4, pp. 447–451.
Huberman, Gur (1982) “A simple approach to Arbitrage Pricing Theory”,
Journal of Economic Theory, Vol. 28, No. 1, pp. 183–191.
Jacod, Jean and Shiryaev, Albert N. (2003) Limit theorems for stochastic
processes: Springer-Verlag, 2nd edition.
Jarrow, Robert A. and Larsson, Martin (2012) “The meaning of market effi-
ciency”, Mathematical Finance, Vol. 22, No. 1, pp. 1–30.
Jarrow, Robert, Teo, Melvyn, Tse, Yiu Kuen, and Warachka, Mitch (2012)
“An improved test for statistical arbitrage”, Journal of Financial Markets,
Vol. 15, pp. 47–80.
Kabanov, Y.M. (1996) “On the FTAP of Kreps-Delbaen-Schachermayer”,
Statistics and Control of Stochastic Processes, pp. 191–203.
Kabanov, Y.M. and Kramkov, D.O. (1998) “Asymptotic arbitrage in large
financial markets”, Finance and Stochastics, Vol. 2, pp. 143–172.
Kabanov, Yu. M. and Kramkov, D. O. (1994) “Large financial markets:
asymptotic arbitrage and contiguity”, Probability Theory Application, Vol.
39, No. 1, pp. 222–229.
Kardaras, Constantinos (2013) “On the closure in the Émery topology of
semimartingale wealth-process sets”, Annals of Applied Probability, Vol.
23, No. 4, pp. 1355–1376.
Labuszewski, John W., Kamradt, Michael, and Gibbs, David (2014) “Under-
standing treasury futures”, report, CME Group.
Lintner, John (1965) “The valuation of risk assets and the selection of risky
investments in stock portfolios and capital budgets”, The Review of Eco-
nomics and Statistics, Vol. 47, No. 1, pp. 13–37.
Litterman, Robert B. and Scheinkman, José (1991) “Common factors affect-
ing bond market returns”, The Journal of Fixed Income, Vol. 1, No. 1, pp.
54–61.
Malkiel, Burton G. (1991) The World of Economics, Chap. Efficient market
hypothesis, pp. 211–218, London: Palgrave Mcmillan UK.
Malliavin, Paul and Mancino, Maria Elvira (2002) “Fourier series methods
for measurement of multivariate volatilities”, Finance and Stochastics, Vol.
6, No. 1, pp. 49–61.
Mancini, Cecilia (2009) “Non-parametric threshold estimation for models
with stochastic diffusion coefficient and jumps”, Scandinavian Journal of
Statistics, Vol. 36, pp. 270 – 296.
Merton, Robert C. (1973) “Theory of rational option pricing”, The Bell Jour-
nal of Economics and Management Science, Vol. 4, No. 1, pp. 141–183.
Nelson, Charles R. and Siegel, Andrew F. (1987) “Parsimonious modeling of
yield curves”, The Journal of Business, Vol. 60, No. 4, pp. 473–489.
Protter, Philip E. (2004) Stochastic Integration and Differential Equations:
Springer, 2nd edition.
Ross, Stephen A. (1976a) “The arbitrage theory of capital asset pricing”,
Journal of Economic Theory, Vol. 13, pp. 341–360.
(1976b) Return, risk, and arbitrage, Chap. Risk and return in fi-
nance, Cambridge: Ballinger.
Sharpe, William F. (1964) “Capital asset prices: a theory of market equilib-
rium under conditions of risk”, The Journal of Finance, Vol. 19, No. 3, pp.
425–442.
Singleton, Kenneth J. (2006) Empirical Dynamic Asset Pricing: Model Spec-
ification and Econometric Assessment: Princeton University Press.
Stock, James H. and Watson, Mark W. (1988) “Testing for common trends”,
Journal of the American Statistical Association, Vol. 83, No. 404, pp. 1097–
1107.
Takaoka, Koichiro and Schweizer, Martin (2014) “A note on the condition of
no unbounded profit with bounded risk”, Finance Stochastic, Vol. 18, pp.
393–405.
Zhang, Hua (1993) “Treasury yield curves and cointegration”, Applied Eco-
nomics, Vol. 25, No. 3, pp. 361–367.
Zhang, Zhihua (2008) “Convergence of Weyl-Heisenberg frame series”, Indian
Journal of Pure and Applied Mathematics, Vol. 39, No. 2, pp. 167–175.
Wale Dare
22 Hagedornstrasse
20149 Hamburg
+49 (0) 162 8295
[email protected]

EDUCATION

University of St. Gallen, St. Gallen, CH
Ph.D. in Finance and Economics, 02/2014 – 02/2018
• Research: High-frequency statistical arbitrage in fixed-income futures, arbitrage pricing in large financial markets, and nonparametric realized volatility estimation in the presence of price jumps.
M.A. in Quantitative Economics and Finance, 09/2011 – 10/2013
• Thesis: Ambiguity, long-run risk, and asset prices.

Knox College, Galesburg, IL, US
B.A. in Mathematics, 02/2002 – 04/2006
• Overall GPA: 3.61/4.00, Math GPA: 3.62/4.00, Econ GPA: 3.88/4.00.
• Thesis: Poisson random measures and application

WORK EXPERIENCE

Berenberg, Hamburg, Germany
Portfolio Manager, 10/2017 – Present
• Managed and developed strategies to harvest the volatility risk premium embedded in options markets
• Researched and developed quantitative investment strategies across multiple liquid asset classes
• Implemented derivatives-based investment strategies
• Developed and maintained advanced risk management processes

University of St. Gallen, St. Gallen, CH
Doctoral Assistant, 04/2014 – 09/2017
• Collaborated in research activities in the areas of Mathematics, Financial Economics and Quantitative Finance
• Collaborated in the teaching of mathematics and quantitative methods in Finance and Economics at different teaching levels

Liberty Mutual Group, Boston, MA, US
Actuarial/Pricing Analyst, 07/2008 – 07/2010
• Used actuarial methodology to forecast losses and price insurance products in the commercial automobile and real-estate property lines of business.
• Trained underwriting and support staff in the use of actuarial tools and software

SKILLS
• Awards include the Howard Hughes Institute Summer Research Fellowship
• Proficient in R, SQL, Python, C++, Haskell, and VBA
• Fluent in French