8/9/2019 Presentation 1st article
Statistical Arbitrage and High-Frequency Data
with an Application to Eurostoxx 50 Equities
May 2010
Authors: Christian L. Dunis
Gianluigi Giorgioni
Jason Laws
Jozef Rudy
Corresponding author and presenter:
Jozef Rudy
Liverpool John Moores University
Outline
Motivation
Data used
  Data provider
  2 types of data: HF and daily
  In- and out-of-sample periods
Methodology
  Pair trading system
  Calculation of adaptive parameters
  Entry and exit points, stop-loss
Preliminary out-of-sample results
  Average trading results for all 176 pairs
Further analysis
  Relation between in-sample information ratio, t-stat and out-of-sample information ratio
Final results
  Results for 5 best pairs based on in-sample information ratio, t-stat
  Comparison with benchmarks
Motivation
Recent poor performance (see Gatev et al., 2006) of market-neutral strategies (see Vidyamurthy, 2004 for an introduction)
Technique developed in the 1980s by Wall Street quant Nunzio Tartaglia. Now a well-known technique (Alexander et al., 2002; Burgess, 2003)
The majority of trading ideas are well known across Wall Street. Practical implementation and parameters make every strategy unique (Chan, 2009)
Application of a pair trading strategy to equity HF and daily data and comparison of the results (Nath, 2003)
Data used
Eurostoxx 50 Equities:
Daily data: 3rd Jan 2000 to 17th Nov 2009
Intraday data: 3rd Jul 2009 to 17th Nov 2009
Various intraday intervals: 5, 10, 20, 30 and 60 minutes
Each share from 1 of 10 sectors: Basic Materials, Communications, Consumer Cyclical, Consumer Non-cyclical, Diversified, Energy, Financial, Industrial, Technology and Utilities
In- and Out-of-Sample Periods:
Methodology I
Pair trading model: $z_t = P_{Y,t} - \beta_t P_{X,t}$, with pairs formed only from the same industry
Alternative approaches for beta calculation:
  fixed beta (calculated by OLS)
  moving window beta (calculated by rolling OLS)
  double exponential smoothing prediction model (DESP)
  Kalman filter, with system and observation noise variances constant (Bentz, 2003)
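The beta estimators listed above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes a regression through the origin (to match a spread with no intercept term), a random-walk beta in the Kalman filter, and an observation noise variance normalised to 1. The function names, `noise_ratio`, `beta0` and `var0` are hypothetical. DESP is omitted.

```python
import numpy as np

def ols_beta(p_y, p_x):
    """Fixed beta: OLS slope of P_Y on P_X, regression through the origin (assumption)."""
    p_y, p_x = np.asarray(p_y, float), np.asarray(p_x, float)
    return float(np.dot(p_x, p_y) / np.dot(p_x, p_x))

def rolling_ols_beta(p_y, p_x, window):
    """Moving-window beta: OLS slope over the last `window` observations.
    Entries before the first full window are NaN."""
    betas = np.full(len(p_y), np.nan)
    for t in range(window - 1, len(p_y)):
        lo = t - window + 1
        betas[t] = ols_beta(p_y[lo:t + 1], p_x[lo:t + 1])
    return betas

def kalman_beta(p_y, p_x, noise_ratio, beta0=0.0, var0=1.0):
    """Scalar Kalman filter for P_Y,t = beta_t * P_X,t + e_t with a random-walk beta.
    `noise_ratio` is system noise variance / observation noise variance;
    the observation noise variance is normalised to 1 (assumption)."""
    beta, var = beta0, var0
    betas = np.empty(len(p_y))
    for t in range(len(p_y)):
        var += noise_ratio                        # predict step: beta follows a random walk
        innovation = p_y[t] - beta * p_x[t]       # one-step-ahead observation error
        gain = var * p_x[t] / (p_x[t] ** 2 * var + 1.0)
        beta += gain * innovation                 # update the beta estimate
        var *= 1.0 - gain * p_x[t]                # update the estimate variance
        betas[t] = beta
    return betas

def spread(p_y, p_x, beta):
    """Spread z_t = P_Y,t - beta_t * P_X,t; `beta` may be a scalar or an array."""
    return np.asarray(p_y, float) - beta * np.asarray(p_x, float)
```

With constant `noise_ratio`, a larger value lets beta adapt faster at the price of a noisier estimate, which is why it is a natural parameter to optimize.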
Methodology II
Genetic algorithm used to optimize:
  Rolling OLS: length of the OLS rolling window
  DESP: smoothing parameter and number of look-ahead periods
  Kalman filter: signal/noise ratio (system/observation noise)
Genetic optimization algorithm:
  Objective: maximization of the in-sample information ratio
  Started with 100 generations
  Mutation and crossover allowed
Only 6 randomly chosen pairs optimized and these values used for all the pairs
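The optimization objective can be illustrated with a short sketch. It assumes the information ratio is the annualized mean return divided by the annualized standard deviation of returns; the slides do not spell out the definition, and `periods_per_year` is a hypothetical parameter that depends on the sampling frequency.

```python
import numpy as np

def information_ratio(returns, periods_per_year):
    """Annualized mean return over annualized standard deviation (assumed definition)."""
    returns = np.asarray(returns, dtype=float)
    sd = returns.std(ddof=1)          # sample standard deviation per period
    if sd == 0:
        return float("nan")           # undefined when returns never vary
    return float(returns.mean() / sd * np.sqrt(periods_per_year))
```

An optimizer such as a genetic algorithm would evaluate this function on the in-sample returns produced by each candidate parameter value and keep the candidates with the highest score.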
Methodology III
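Spread generated by the pair trading model: $z_t = P_{Y,t} - \beta_t P_{X,t}$
Normalized: $n_t = (z_t - \mu_z) / \sigma_z$
$\mu_z$ and $\sigma_z$ (mean and standard deviation of the spread) calculated from the entire in-sample period
Entry into the spread: abs($n_t$) > 2
Exit from the spread: abs($n_t$)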
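The entry and exit rule can be sketched as a small state machine. This is an illustration under assumptions: the spread is normalized with the in-sample mean and standard deviation, a position is opened against the direction of the deviation, and the exit threshold is not stated in the text above, so `exit_level=0.5` is purely a placeholder. All function and parameter names are hypothetical.

```python
import numpy as np

def normalize_spread(spread, in_sample_end):
    """Normalize the spread with mean and std taken from the in-sample part only."""
    spread = np.asarray(spread, dtype=float)
    mu = spread[:in_sample_end].mean()
    sigma = spread[:in_sample_end].std(ddof=1)
    return (spread - mu) / sigma

def positions(n, entry=2.0, exit_level=0.5):
    """Position in the spread per period: +1 long, -1 short, 0 flat.
    `exit_level` is a placeholder value, not taken from the slides."""
    pos = np.zeros(len(n), dtype=int)
    current = 0
    for t, value in enumerate(n):
        if current == 0:
            if value > entry:          # spread unusually high: short it
                current = -1
            elif value < -entry:       # spread unusually low: go long
                current = 1
        elif abs(value) < exit_level:  # spread has reverted: close the position
            current = 0
        pos[t] = current
    return pos
```

For example, `positions(np.array([0.0, 2.5, 1.0, 0.3, -2.1, -1.0, 0.0]))` opens a short at 2.5, closes it at 0.3, opens a long at -2.1 and closes it at 0.0.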
Methodology in practice
Figure: Bayer AG and Arcelor Mittal pair sampled at a 20-minute interval. Panels: normalized spread with positions (value of the normalized spread against time) and cumulative equity curve (cumulative return against time).
Costs of trading
Trading costs one-way for both shares (long and short): 0.3%
Transaction costs: 0.2% (0.1% * 2)
Bid-ask spread: 0.1% (0.05% * 2)
Net return calculation: $Ret_t = \ln(P_{X,t} / P_{X,t-1}) - \ln(P_{Y,t} / P_{Y,t-1}) - TC$
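A worked sketch of the cost arithmetic and the net return follows. It assumes the net return is the log return of share X minus the log return of share Y minus trading costs, and that costs are charged only in periods where a position is opened or closed; the slides do not state when costs are applied, and the `traded` flag and function name are hypothetical.

```python
import math

TRANSACTION_COST = 0.001   # 0.1% per share, from the slide
HALF_BID_ASK = 0.0005      # 0.05% per share, from the slide
ONE_WAY_COST = 2 * (TRANSACTION_COST + HALF_BID_ASK)  # both shares: 0.3%

def pair_net_return(px_prev, px_now, py_prev, py_now, traded=False, cost=ONE_WAY_COST):
    """Log return of the pair (X minus Y), less costs when a trade occurred (assumption)."""
    gross = math.log(px_now / px_prev) - math.log(py_now / py_prev)
    return gross - (cost if traded else 0.0)
```

For instance, if X moves from 100 to 101 while Y stays at 50, the gross return is ln(1.01), roughly 0.995%, and a period that includes a trade nets roughly 0.695%.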
Preliminary out-of-sample results
Results for different approaches:
Detailed results for the Kalman filter approach:
Some further analysis
95% confidence bounds for the correlation between in- and out-of-sample information ratio
95% confidence bounds for the correlation between the in-sample t-stats of the ADF test and out-of-sample information ratio
Results after further analysis I
5 best pairs based on the in-sample information ratios:
5 best pairs based on the in-sample t-stat of the ADF test:
Results after further analysis II
5 best pairs based on the in-sample t-stat (calculated from daily data)
5 best pairs based on the in-sample t-stat (calculated from daily frequency data)
Comparison with benchmarks
Comparison of portfolio of 5 best pairs with benchmarks:
Using HF data in the out-of-sample period (10 Sep to 17 Nov 2009)
Using daily data in the out-of-sample period (1 Jan to 17 Nov 2009)
Thank you for your attention