Page 1: Statistical Arbitrage Based on No-Arbitrage Modelsfaculty.baruch.cuny.edu/lwu/papers/StatArbEst.pdf · Statistical Arbitrage Based on No-Arbitrage Models Liuren Wu Zicklin School

Statistical Arbitrage Based on No-Arbitrage Models

Liuren Wu

Zicklin School of Business, Baruch College

July 1, 2011

Wu (Baruch) Statistical Arbitrage July 1, 2011 1 / 41


Review: Valuation and investment in primary securities

The securities have direct claims to future cash flows.

Valuation is based on forecasts of future cash flows and risk:

DCF (discounted cash flow) method: Discount forecasted future cash flows with a discount rate that is commensurate with the forecasted risk.

Investment: Buy if the market price is lower than the model value; sell otherwise.

Both valuation and investment depend crucially on forecasts of future cash flows (growth rates) and risks (beta, credit risk).

Key empirical skills needed: Time-series analysis.


Compare: Derivative securities

Payoffs are linked directly to the price of an “underlying” security.

Valuation is mostly based on replication/hedging arguments.

Find a portfolio that includes the underlying security, and possibly other related derivatives, to replicate the payoff of the target derivative security, or to hedge away the risk in the derivative payoff.

Since the hedged portfolio is riskfree, the payoff of the portfolio can be discounted by the riskfree rate.

No need to worry about the risk premium of a zero-risk portfolio.

Models of this type are called “no-arbitrage” models.

Key: No forecasts are involved. Valuation is based on cross-sectional comparison.

It is not about whether the underlying security price will go up or down (given growth rate or risk forecasts), but about the relative pricing relation between the underlying and the derivatives under all possible scenarios.


Reading behind the technical jargon: P v. Q

P: Actual probabilities that earnings will be high or low.

Estimated based on historical data and other insights about the company (time-series analysis).

Valuation is all about getting the forecasts right and assigning the appropriate price for the forecasted risk.

Q: “Risk-neutral” probabilities that we can use to aggregate expected future payoffs and discount them back with the riskfree rate, regardless of how risky the cash flow is.

It is related to real-time scenarios, but it has nothing to do with real-time probability.

Since the intention is to hedge away risk under all scenarios and discount back with the riskfree rate, we do not really care about the actual probability of each scenario happening.

We just care about what all the possible scenarios are and whether our hedging works under all scenarios.

Q is not about getting close to the actual probability, but about being fair relative to the prices of securities that you use for hedging.


A Mickey Mouse example

Consider a non-dividend paying stock in a world with zero riskfree interest rate. Currently, the market price for the stock is $100. What should be the forward price for the stock with one year maturity?

The forward price is $100.

Standard forward pricing argument says that the forward price should be equal to the cost of buying the stock and carrying it over to maturity.

The buying cost is $100, with no storage or interest cost.

How should you value the forward differently if you have inside information that the company will be bought tomorrow and the stock price is going to double?

Shorting a forward at $100 is still safe for you if you can buy the stock at $100 to hedge.
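
The cost-of-carry argument above can be checked numerically. The following is a minimal sketch, assuming continuous compounding and a hypothetical dividend yield parameter `q` (neither appears on the slide); the function names are illustrative only. It shows that the hedged short-forward position pays the same amount whatever the stock does.

```python
import math

def forward_price(spot, r, T, q=0.0):
    """Cash-and-carry forward price (continuous compounding is an assumption)."""
    return spot * math.exp((r - q) * T)

def hedged_short_forward_pnl(spot, fwd, s_T, r, T):
    """P&L at maturity of a short forward at `fwd` hedged by buying the stock today.

    The stock purchase is assumed to be financed at the riskfree rate r.
    """
    short_forward = fwd - s_T                    # deliver at fwd, stock is worth s_T
    long_stock = s_T - spot * math.exp(r * T)    # stock value minus repaid financing
    return short_forward + long_stock            # s_T cancels out

# Zero-rate example from the slide: the forward price equals the spot price.
print(forward_price(100.0, 0.0, 1.0))
# The hedge works whether the stock doubles or halves.
print(hedged_short_forward_pnl(100.0, 100.0, 200.0, 0.0, 1.0))
print(hedged_short_forward_pnl(100.0, 100.0, 50.0, 0.0, 1.0))
```

Because the terminal stock price drops out of the hedged P&L, the forecast (even a perfect one) does not enter the forward price.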


Investing in derivative securities without insights

If you can really forecast the cash flow (with inside information), you probably do not care much about hedging or no-arbitrage modeling.

You just lift the market and try not to get caught for insider trading. ⇒ Buy stock, long forward, buy calls, sell puts of all strikes and maturities.

But if you do not have insights on cash flows (earnings growth etc.) and still want to invest in derivatives, the focus does not need to be on forecasting, but on cross-sectional consistency.

No-arbitrage pricing models can be useful.

From math to finance: When you write down some stochastic process, it is important to know whether you want to use the process to predict future movements or to use it to link the prices of different securities together.

Different applications involve different considerations and estimation methods.


Estimating statistical dynamics

Construct the likelihood of the Levy return innovation based on Fourier inversion of the characteristic function.

If the model is a Levy process without time change, the maximum likelihood estimation procedure is straightforward.

Given initial guesses on model parameters that control the Levy triplet (µ, σ, π(x)), derive the characteristic function.

Apply FFT to generate the probability density at a fine grid of possible return realizations: choose a large N and a large η to generate a fine grid of density values.

Interpolate to generate density values at the observed return values.

Take logs on the densities and sum them.

Numerically maximize the aggregate likelihood to determine the parameter estimates.

Trick: Do as much pre-calculation and pre-processing as you can to speed up the estimation.

Standardizing the data can also be helpful in reducing numerical issues.

Example: CGMY (Carr, Geman, Madan, and Yor), 2002, The Fine Structure of Asset Returns, Journal of Business, 75(2), 305–332.

Liuren Wu, Dampened Power Law, Journal of Business, 2006, 79(3), 1445–1474.
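
The steps above can be sketched in a few lines. This is an illustrative sketch only, not the estimation code behind the cited papers: it assumes a Merton jump-diffusion as the Levy example, trapezoid weights on the frequency grid, and returns that have been standardized to roughly unit variance. The function names and the defaults `N=4096`, `eta=0.05` are hypothetical choices.

```python
import numpy as np

def mjd_cf(u, mu, sigma, lam, mu_j, sig_j, dt=1.0):
    """Characteristic function of a Merton jump-diffusion return over dt."""
    jump = lam * (np.exp(1j * u * mu_j - 0.5 * sig_j**2 * u**2) - 1.0)
    return np.exp(dt * (1j * u * mu - 0.5 * sigma**2 * u**2 + jump))

def density_grid(cf, N=4096, eta=0.05):
    """Invert a characteristic function by FFT; returns (x, f) on an N-point grid."""
    k = np.arange(N)
    u = eta * k                          # frequency grid u_k = k * eta
    lam_x = 2.0 * np.pi / (N * eta)      # return-grid spacing implied by the FFT
    b = 0.5 * N * lam_x                  # return grid covers [-b, b)
    w = np.full(N, eta)
    w[0] = 0.5 * eta                     # trapezoid weight at u = 0
    f = np.fft.fft(np.exp(1j * u * b) * cf(u) * w).real / np.pi
    x = -b + lam_x * k
    return x, np.maximum(f, 1e-300)      # floor tiny negative values before logs

def log_likelihood(params, returns):
    """Sum of log densities, interpolated at the observed returns."""
    x, f = density_grid(lambda u: mjd_cf(u, *params))
    return np.sum(np.log(np.interp(returns, x, f)))

# With the jump intensity set to zero the density collapses to a normal one.
r = np.array([0.0, 1.0, -1.5])
print(log_likelihood((0.0, 1.0, 0.0, 0.0, 0.1), r))
```

In practice one would pass the negative of `log_likelihood` to a numerical optimizer. The grid spacing here is far too coarse for unstandardized daily returns, which is one reason standardizing the data helps.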


Estimating statistical dynamics

The same MLE method can be extended to cases where only the innovation is driven by a Levy process, while the conditional mean and variance can be predicted by observables:

$$dS_t/S_t = \mu(Z_t)dt + \sigma(Z_t)dX_t$$

where $X_t$ denotes a Levy process, and $Z_t$ denotes a set of observables that can predict the mean and variance.

Perform Euler approximation:

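$$R_{t+\Delta t} = \frac{S_{t+\Delta t} - S_t}{S_t} = \mu(Z_t)\Delta t + \sigma(Z_t)\sqrt{\Delta t}\,(X_{t+\Delta t} - X_t)$$

From the observed return series $R_{t+\Delta t}$, derive a standardized return series,

$$SR_{t+\Delta t} = (X_{t+\Delta t} - X_t) = \frac{R_{t+\Delta t} - \mu(Z_t)\Delta t}{\sigma(Z_t)\sqrt{\Delta t}}$$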

Since $SR_{t+\Delta t}$ is generated by the increment of a pure Levy process, we can build the likelihood just like before.

Given the Euler approximation, the exact forms of µ(Z) and σ(Z) do not matter as much.
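
The standardization step can be sketched as follows. This is a minimal sketch, assuming that `mu` and `sigma` are arrays holding µ(Z_t) and σ(Z_t) already evaluated at each observation date; how they are predicted from the observables Z is left open, and the function name is hypothetical.

```python
import numpy as np

def standardized_returns(prices, mu, sigma, dt):
    """Back out the Levy increments implied by the Euler approximation.

    prices, mu, sigma are equal-length arrays indexed by observation date;
    mu[t] and sigma[t] are the conditional mean and volatility known at t.
    """
    prices = np.asarray(prices, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    R = prices[1:] / prices[:-1] - 1.0           # simple return over (t, t + dt]
    return (R - mu[:-1] * dt) / (sigma[:-1] * np.sqrt(dt))

sr = standardized_returns([100.0, 101.0, 99.99], [0.0] * 3, [0.1] * 3, dt=1.0)
print(sr)
```

The resulting series can then be fed to the same Fourier-inversion likelihood used for the pure Levy case.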


Estimating statistical dynamics

dSt/St = µ(Zt)dt + σ(Zt)dXt

When Z is unobservable (such as stochastic volatility, activity rates), the estimation becomes more difficult.

One normally needs some filtering technique to infer the hidden variables Z from the observables.

Maximum likelihood with partial filtering: Alireza Javaheri, Inside Volatility Arbitrage: The Secrets of Skewness.

MCMC Bayesian estimation: Eraker, Johannes, Polson (2003, JF): The Impact of Jumps in Equity Index Volatility and Returns; Li, Wells, Yu (RFS, forthcoming): A Bayesian Analysis of Return Dynamics with Levy Jumps.

GARCH: Use observables (return) to predict un-observable (volatility).

Constructing variance swap rates from options and realized variance from high-frequency returns to make activity rates more observable.

Wu, Variance Dynamics: Joint Evidence from Options and High-Frequency Returns.


Estimating statistical dynamics

Wu, Variance Dynamics: Joint Evidence from Options and High-Frequency Returns.

Use index options to replicate variance swap rates, VIX.

Under affine specifications,

$$VIX_t^2 = \frac{1}{h} E^{Q}\left[\int_t^{t+h} v_s\,ds\right] = a(h) + b(h)\,v_t,$$

where (a(h), b(h)) are functions of risk-neutral v-dynamics.

Solve for $v_t$ from VIX: $v_t = (VIX_t^2 - a(h))/b(h)$.

Build the likelihood on $v_t$ as an observable:

$$dv_t = \mu(v_t)dt + \sigma(v_t)dX_t$$

Use Euler approximation to solve for the Levy component $X_{t+\Delta t} - X_t$ from $v_t$.

Build the likelihood on the Levy component based on FFT inversion of the characteristic function.

Use high-frequency returns to construct daily realized variance (RV).

Treat RV as noisy estimators of $v_t$: $RV_t = v_t\Delta t + \text{error}$.

Given $v_t$, build a quasi-likelihood function on the realized variance error.
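
To make the inversion from VIX to $v_t$ concrete, here is a small sketch under one specific assumption that the slide does not impose: the risk-neutral variance rate mean-reverts at speed `kappa` to level `theta` (as in a square-root process), so that $b(h) = (1 - e^{-\kappa h})/(\kappa h)$ and $a(h) = \theta(1 - b(h))$. The parameter values and function names are hypothetical.

```python
import math

def affine_coeffs(kappa, theta, h):
    """a(h), b(h) for a variance rate with risk-neutral drift kappa * (theta - v)."""
    b = (1.0 - math.exp(-kappa * h)) / (kappa * h)
    a = theta * (1.0 - b)
    return a, b

def variance_from_vix(vix, kappa, theta, h=30.0 / 365.0):
    """Invert VIX^2 = a(h) + b(h) * v_t; vix is in decimal volatility units."""
    a, b = affine_coeffs(kappa, theta, h)
    return (vix**2 - a) / b

# Round trip with illustrative parameters: start from v_t, build VIX, invert.
a, b = affine_coeffs(2.0, 0.04, 30.0 / 365.0)
vix = math.sqrt(a + b * 0.05)
print(variance_from_vix(vix, 2.0, 0.04))
```

Other affine specifications change only the formulas for a(h) and b(h); the linear inversion step is the same.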

Future research: Incorporate more observables.


Estimating the risk-neutral dynamics

Nonlinear weighted least squares to fit Levy models to option prices.

Daily calibration (Bakshi, Cao, Chen (1997, JF), Carr and Wu (2003, JF)).

The key issue is how to define the pricing error and how to build the weight:

The in-the-money option value is dominated by the intrinsic value, not by the model. At each strike, use the out-of-the-money option: call when K > F and put when K ≤ F.

Pricing errors can either be absolute errors (market minus model), or percentage errors (log(market/model)).

Using absolute errors favors options with higher values (longer maturity, near the money).

Using percentage errors puts more uniform weight across options, but may put too much weight on illiquid options (far out of the money).

Errors can be either in dollar prices or implied volatilities.

My current choice: Use out-of-the-money option prices to define absolute errors, and use the inverse of vega as weights.
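
The weighting choice above can be sketched as an objective function. This is an illustration only: it assumes a Black-type vega computed from the forward with the discount factor set to 1, and `model_price` is a hypothetical callback standing in for whatever Levy model is being calibrated.

```python
import math

def bs_vega(F, K, T, iv):
    """Black (forward) vega; the discount factor is set to 1 as a simplification."""
    d1 = (math.log(F / K) + 0.5 * iv**2 * T) / (iv * math.sqrt(T))
    return F * math.sqrt(T) * math.exp(-0.5 * d1 * d1) / math.sqrt(2.0 * math.pi)

def weighted_sse(quotes, model_price, F):
    """Sum of squared vega-scaled errors on out-of-the-money option prices.

    quotes: iterable of (K, T, market_otm_price, market_implied_vol).
    model_price(kind, K, T): hypothetical model pricer, kind is "call" or "put".
    """
    total = 0.0
    for K, T, mkt, iv in quotes:
        kind = "call" if K > F else "put"        # pick the out-of-the-money side
        err = mkt - model_price(kind, K, T)      # absolute error, market minus model
        total += (err / bs_vega(F, K, T, iv)) ** 2
    return total
```

Dividing a dollar price error by vega roughly converts it into an implied volatility error, which is one way to read the "inverse of vega" weight.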


Estimating the risk-neutral dynamics

Sometimes separate calibration per maturity is needed for a simple Levy model (e.g., VG, MJD).

Levy processes with finite variance imply that non-normality dies away quickly with time aggregation.

Model-generated implied volatility smile/smirk flattens out at long maturities.

Separate calibration is necessary to capture smiles at long maturities.

Adding a persistent stochastic volatility process (time change) helps improve the fitting along the maturity dimension.

Daily calibration: activity rates and model parameters are treated the same, as free parameters.

Dynamically consistent estimation: Parameters are fixed, only activity rates are allowed to vary over time.


Static v. dynamic consistency

Static cross-sectional consistency: Option values across different strikes/maturities are generated from the same model (same parameters) at a point in time.

Dynamic consistency: Option values over time are also generated from the same no-arbitrage model (same parameters).

While most academics and practitioners appreciate the importance of being both cross-sectionally and dynamically consistent, it can be difficult to achieve while generating good pricing performance. So it comes down to compromises.

Market makers:

Achieving static consistency is sufficient.

Matching market prices is important to provide two-sided quotes.

Long-term convergence traders:

Pricing errors represent trading opportunities.

Dynamic consistency is important for long-term convergence trading.

A well-designed model (with several time-changed Levy components) can achieve both dynamic consistency and good performance.


Dynamically consistent estimation

Nested nonlinear least squares (Huang and Wu (2004)): Often has convergence issues.

Cast the model into state-space form and use MLE.

Define the state propagation equation based on the P-dynamics of the activity rates. (Need to specify market price on activity rates, but not on return risks.)

Define the measurement equation based on option prices (out-of-the-money values, weighted by vega, ...).

Use an extended version of the Kalman filter (EKF, UKF, PKF) to predict/filter the distribution of the states and measurements.

Define the likelihood function based on forecasting errors on the measurement equations.

Estimate model parameters by maximizing the likelihood.


The Classic Kalman filter

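The Kalman filter (KF) generates efficient forecasts and updates under the linear-Gaussian state-space setup:

$$\text{State}: \quad X_{t+1} = F X_t + \sqrt{\Sigma_x}\,\varepsilon_{t+1},$$

$$\text{Measurement}: \quad y_t = H X_t + \sqrt{\Sigma_y}\,e_t$$

The ex ante predictions (overlined) are computed from the previous ex post filtered values (hatted) as

$$\overline{X}_t = F\widehat{X}_{t-1}; \quad \overline{V}_{x,t} = F\widehat{V}_{x,t-1}F^\top + \Sigma_x;$$

$$\overline{y}_t = H\overline{X}_t; \quad \overline{V}_{y,t} = H\overline{V}_{x,t}H^\top + \Sigma_y.$$

The ex post filtering updates are,

$$K_t = \overline{V}_{x,t}H^\top\left(\overline{V}_{y,t}\right)^{-1} = \overline{V}_{xy,t}\left(\overline{V}_{y,t}\right)^{-1}, \quad \to \text{Kalman gain}$$

$$\widehat{X}_t = \overline{X}_t + K_t\left(y_t - \overline{y}_t\right),$$

$$\widehat{V}_{x,t} = \overline{V}_{x,t} - K_t\overline{V}_{y,t}K_t^\top = (I - K_tH)\overline{V}_{x,t}$$

The log likelihood is built on the forecasting errors:

$$l_t = -\frac{1}{2}\log\left|\overline{V}_{y,t}\right| - \frac{1}{2}\left((y_t - \overline{y}_t)^\top\left(\overline{V}_{y,t}\right)^{-1}(y_t - \overline{y}_t)\right).$$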
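
One prediction-update cycle of the recursion above can be sketched with NumPy. This is a minimal sketch with hypothetical argument names; `Sx` and `Sy` stand for the covariance matrices Σx and Σy, and the log likelihood omits the constant term, as on the slide.

```python
import numpy as np

def kalman_step(x_prev, V_prev, y, F, H, Sx, Sy):
    """One Kalman filter step: ex ante prediction, ex post update, log likelihood."""
    # Ex ante predictions of state and measurement
    x_bar = F @ x_prev
    Vx_bar = F @ V_prev @ F.T + Sx
    y_bar = H @ x_bar
    Vy_bar = H @ Vx_bar @ H.T + Sy
    # Ex post filtering update
    K = Vx_bar @ H.T @ np.linalg.inv(Vy_bar)     # Kalman gain
    innov = y - y_bar                            # forecasting error
    x_hat = x_bar + K @ innov
    V_hat = Vx_bar - K @ Vy_bar @ K.T
    # Log likelihood contribution from the forecasting error
    _, logdet = np.linalg.slogdet(Vy_bar)
    loglik = -0.5 * logdet - 0.5 * innov @ np.linalg.solve(Vy_bar, innov)
    return x_hat, V_hat, loglik

# Scalar example: prior variance 1, measurement noise variance 1, observation 2.
x, V, ll = kalman_step(np.zeros(1), np.eye(1), np.array([2.0]),
                       np.eye(1), np.eye(1), np.zeros((1, 1)), np.eye(1))
print(x, V, ll)
```

Summing `loglik` over all dates gives the objective that is maximized over the model parameters.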


Numerical twists

Kalman filter with fading memory: Replace $\overline{V}_{x,t}$ with

$$\overline{V}_{x,t} = \alpha^2 F\widehat{V}_{x,t-1}F^\top + \Sigma_x,$$

with α > 1 (e.g., α ≈ 1.01). This slight raise on $\overline{V}_{x,t}$ increases the Kalman gain and hence increases the responsiveness of the states to the new observations (versus old observations). This twist allows the filtering to forget about old observations gradually.

Application: Recursive least squares estimation of the coefficient X: $y_t = H_tX + e_t$. In this case, the coefficient X is the hidden state. We can estimate the coefficient recursively and update the coefficient estimate after each new observation $y_t$:

$$\widehat{X}_t = \widehat{X}_{t-1} + K_t(y_t - H_t\widehat{X}_{t-1}),$$

with

$$\overline{V}_{x,t} = \widehat{V}_{x,t-1}\alpha^2,$$

$$K_t = \overline{V}_{x,t}H_t^\top\left(H_t\overline{V}_{x,t}H_t^\top + \Sigma_y\right)^{-1},$$

$$\widehat{V}_{x,t} = (I - K_tH_t)\overline{V}_{x,t}.$$
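
The recursive least squares application can be sketched as below. This is a minimal sketch assuming a scalar observation each period, so that $H_t$ is a single regressor row; the diffuse prior `v0`, the noise variance `sigma_y`, and the function name are hypothetical choices.

```python
import numpy as np

def rls_fading(H_rows, y, alpha=1.01, sigma_y=1.0, v0=1e4):
    """Recursive least squares with fading memory (alpha > 1 discounts old data).

    H_rows: (T, n) array of regressor rows; y: (T,) array of scalar observations.
    """
    n = H_rows.shape[1]
    x = np.zeros(n)                 # coefficient estimate (the hidden state)
    V = v0 * np.eye(n)              # diffuse prior covariance (an assumption)
    for h, yt in zip(H_rows, y):
        V_bar = alpha**2 * V                     # inflate: forget old observations
        s = h @ V_bar @ h + sigma_y              # forecast error variance
        K = V_bar @ h / s                        # gain for a scalar observation
        x = x + K * (yt - h @ x)
        V = V_bar - np.outer(K, h) @ V_bar       # (I - K H) V_bar
    return x

rng = np.random.default_rng(0)
H = rng.normal(size=(50, 2))
print(rls_fading(H, H @ np.array([2.0, -1.0])))
```

With α = 1 this reduces to ordinary recursive least squares; a larger α makes the estimate track a slowly drifting coefficient more closely at the cost of more noise.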


The Extended Kalman filter: Linearly approximating the measurement equation

If we specify affine-diffusion dynamics for the activity rates, the state dynamics (X) can be regarded as Gaussian linear, but option prices (y) are not linear in the states:

$$\text{State}: \quad X_{t+1} = F X_t + \sqrt{\Sigma_{x,t}}\,\varepsilon_{t+1},$$

$$\text{Measurement}: \quad y_t = h(X_t) + \sqrt{\Sigma_y}\,e_t$$

One way to use the Kalman filter is by linearly approximating the measurement equation,

$$y_t \approx H_tX_t + \sqrt{\Sigma_y}\,e_t, \quad H_t = \left.\frac{\partial h(X_t)}{\partial X_t}\right|_{X_t = \overline{X}_t}$$

It works well when the nonlinearity is small.

Numerical issues (some are well addressed in the engineering literature):

How to compute the gradient?

How to keep the covariance matrix positive definite?
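
One common way to handle both numerical issues is sketched below: a central finite-difference Jacobian for $H_t$, and an eigenvalue floor to repair a covariance matrix that has drifted away from positive definiteness. These are generic devices, not necessarily the ones used in the work behind the slides; the step size `eps` and the floor are hypothetical defaults.

```python
import numpy as np

def numerical_jacobian(h, x, eps=1e-6):
    """Central-difference Jacobian of a measurement function h at the point x."""
    x = np.asarray(x, dtype=float)
    y0 = np.atleast_1d(h(x))
    J = np.empty((y0.size, x.size))
    for j in range(x.size):
        dx = np.zeros_like(x)
        dx[j] = eps * max(1.0, abs(x[j]))        # step scaled to the state size
        J[:, j] = (np.atleast_1d(h(x + dx)) - np.atleast_1d(h(x - dx))) / (2.0 * dx[j])
    return J

def symmetrize_psd(V, floor=1e-12):
    """Symmetrize V and floor its eigenvalues so it stays positive definite."""
    V = 0.5 * (V + V.T)
    w, Q = np.linalg.eigh(V)
    return (Q * np.maximum(w, floor)) @ Q.T

J = numerical_jacobian(lambda x: np.array([x[0] ** 2, x[0] * x[1]]), [2.0, 3.0])
print(J)
```

When the pricing function has an analytical gradient, using it is usually faster and more accurate than finite differences.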


Approximating the distribution

$$\text{Measurement}: \quad y_t = h(X_t) + \sqrt{\Sigma_y}\,e_t$$

The Kalman filter applies Bayesian rules in updating the conditionally normal distributions.

Instead of linearly approximating the measurement equation $h(X_t)$, we directly approximate the distribution and then apply Bayesian rules on the approximate distribution.

There are two ways of approximating the distribution:

Draw a large amount of random numbers, and propagate these random numbers: particle filter (more generic).

Choose “sigma” points deterministically to approximate the distribution (think of a binomial tree approximating a normal distribution): unscented filter (faster, easier to implement, and works reasonably well when X follows pure diffusion dynamics).


The unscented transformation

Let k be the number of states. A set of 2k + 1 sigma vectors $\chi_i$ is generated according to:

$$\chi_{t,0} = \widehat{X}_t, \quad \chi_{t,i} = \widehat{X}_t \pm \sqrt{(k+\delta)(\widehat{V}_{x,t})_j} \qquad (1)$$

with corresponding weights $w_i$ given by

$$w_0^m = \delta/(k+\delta), \quad w_0^c = \delta/(k+\delta) + (1 - \alpha^2 + \beta), \quad w_i = 1/[2(k+\delta)],$$

where $\delta = \alpha^2(k+\kappa) - k$ is a scaling parameter, α (usually between $10^{-4}$ and 1) determines the spread of the sigma points, κ is a secondary scaling parameter usually set to zero, and β is used to incorporate prior knowledge of the distribution of x. It is optimal to set β = 2 if x is Gaussian.

We can regard these sigma vectors as forming a discrete distribution with $w_i$ as the corresponding probabilities.

Think of sigma points as a trinomial tree v. particle filtering as simulation.
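
The sigma points and weights in (1) can be generated as follows. This is a sketch under one interpretive assumption: the square-root term is taken to be the j-th column of the Cholesky factor of $(k+\delta)V$, which is a common convention but is not spelled out on the slide. The defaults and the function name are hypothetical.

```python
import numpy as np

def sigma_points(x, V, alpha=1.0, beta=2.0, kappa=0.0):
    """Return (points, wm, wc): 2k+1 sigma points with mean and covariance weights."""
    k = x.size
    delta = alpha**2 * (k + kappa) - k
    L = np.linalg.cholesky((k + delta) * V)      # columns are the sigma offsets
    pts = np.vstack([x, x + L.T, x - L.T])       # row i is sigma point i
    wm = np.full(2 * k + 1, 1.0 / (2.0 * (k + delta)))
    wc = wm.copy()
    wm[0] = delta / (k + delta)
    wc[0] = delta / (k + delta) + (1.0 - alpha**2 + beta)
    return pts, wm, wc

x = np.array([1.0, -2.0])
V = np.array([[2.0, 0.3], [0.3, 1.0]])
pts, wm, wc = sigma_points(x, V)
d = pts - x
print(wm @ pts)                  # weighted mean of the sigma points
print((wc[:, None] * d).T @ d)   # weighted covariance of the sigma points
```

The weighted mean and covariance of the points reproduce x and V, which is what lets the discrete set stand in for the full distribution.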


The unscented Kalman filter

State prediction:

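$$\chi_{t-1} = \left[\widehat{X}_{t-1},\ \widehat{X}_{t-1} \pm \sqrt{(k+\delta)\widehat{V}_{x,t-1}}\right], \quad \text{(draw sigma points)}$$

$$\chi_{t,i} = F\chi_{t-1,i},$$

$$\overline{X}_t = \sum_{i=0}^{2k} w_i^m \chi_{t,i},$$

$$\overline{V}_{x,t} = \sum_{i=0}^{2k} w_i^c (\chi_{t,i} - \overline{X}_t)(\chi_{t,i} - \overline{X}_t)^\top + \Sigma_x. \qquad (2)$$

Measurement prediction:

$$\chi_t = \left[\overline{X}_t,\ \overline{X}_t \pm \sqrt{(k+\delta)\overline{V}_{x,t}}\right], \quad \text{(re-draw sigma points)}$$

$$\zeta_{t,i} = h(\chi_{t,i}), \quad \overline{y}_t = \sum_{i=0}^{2k} w_i^m \zeta_{t,i}. \qquad (3)$$

Redrawing the sigma points in (3) is to incorporate the effect of process noise $\Sigma_x$.

If the state propagation equation is linear, we can replace the state prediction step in (2) by the Kalman filter and only draw sigma points in (3).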


The unscented Kalman filter

Measurement update:

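$$\overline{V}_{y,t} = \sum_{i=0}^{2k} w_i^c \left[\zeta_{t,i} - \overline{y}_t\right]\left[\zeta_{t,i} - \overline{y}_t\right]^\top + \Sigma_y,$$

$$\overline{V}_{xy,t} = \sum_{i=0}^{2k} w_i^c \left[\chi_{t,i} - \overline{X}_t\right]\left[\zeta_{t,i} - \overline{y}_t\right]^\top,$$

$$K_t = \overline{V}_{xy,t}\left(\overline{V}_{y,t}\right)^{-1},$$

$$\widehat{X}_t = \overline{X}_t + K_t\left(y_t - \overline{y}_t\right),$$

$$\widehat{V}_{x,t} = \overline{V}_{x,t} - K_t\overline{V}_{y,t}K_t^\top.$$

One can also do square-root UKF to increase the numerical precision and to maintain the positive definite property of the covariance matrix.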
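
The measurement prediction and update can be sketched together. This is an illustrative sketch for the case where the state prediction (mean `x_bar`, covariance `Vx_bar`) has already been computed, for example by the linear Kalman prediction; sigma offsets are taken from a Cholesky factor (an assumption), and all names and defaults are hypothetical.

```python
import numpy as np

def ukf_update(x_bar, Vx_bar, y, h, Sy, alpha=1.0, beta=2.0, kappa=0.0):
    """Unscented measurement prediction and update for y = h(X) + noise."""
    k = x_bar.size
    delta = alpha**2 * (k + kappa) - k
    L = np.linalg.cholesky((k + delta) * Vx_bar)
    chi = np.vstack([x_bar, x_bar + L.T, x_bar - L.T])    # re-drawn sigma points
    wm = np.full(2 * k + 1, 0.5 / (k + delta))
    wc = wm.copy()
    wm[0] = delta / (k + delta)
    wc[0] = wm[0] + (1.0 - alpha**2 + beta)
    zeta = np.array([np.atleast_1d(h(c)) for c in chi])   # propagate through h
    y_bar = wm @ zeta
    dz, dx = zeta - y_bar, chi - x_bar
    Vy = (wc[:, None] * dz).T @ dz + Sy
    Vxy = (wc[:, None] * dx).T @ dz
    K = Vxy @ np.linalg.inv(Vy)
    x_hat = x_bar + K @ (y - y_bar)
    V_hat = Vx_bar - K @ Vy @ K.T
    return x_hat, V_hat

# With a linear h the result coincides with the classic Kalman update.
x_hat, V_hat = ukf_update(np.zeros(1), np.eye(1), np.array([2.0]),
                          lambda c: 1.0 * c, np.eye(1))
print(x_hat, V_hat)
```

For option pricing applications, `h` would map the activity rates to the vector of model option values, which is where the nonlinearity enters.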


The Square-root UKF

Let St denote the Cholesky factor of Vt such that Vt = StS>t .

State prediction:

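$$\chi_{t-1} = \left[\widehat{X}_{t-1},\ \widehat{X}_{t-1} \pm \sqrt{k+\delta}\,\widehat{S}_{x,t-1}\right], \quad \text{(draw sigma points)}$$

$$\chi_{t,i} = F\chi_{t-1,i},$$

$$\overline{X}_t = \sum_{i=0}^{2k} w_i^m \chi_{t,i},$$

$$\overline{S}_{x,t} = \mathrm{qr}\left\{\left[\sqrt{w_1^c}\,(\chi_{t,1:2k} - \overline{X}_t) \quad \sqrt{\Sigma_x}\right]\right\},$$

$$\overline{S}_{x,t} = \mathrm{cholupdate}\left\{\overline{S}_{x,t},\ \chi_{t,0} - \overline{X}_t,\ w_0^c\right\}. \qquad (4)$$

Measurement prediction:

$$\chi_t = \left[\overline{X}_t,\ \overline{X}_t \pm \sqrt{k+\delta}\,\overline{S}_{x,t}\right], \quad \text{(re-draw sigma points)}$$

$$\zeta_{t,i} = h(\chi_{t,i}), \quad \overline{y}_t = \sum_{i=0}^{2k} w_i^m \zeta_{t,i}. \qquad (5)$$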


The Square-root UKF

Measurement update:

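$$\overline{S}_{y,t} = \mathrm{qr}\left\{\left[\sqrt{w_1^c}\,(\zeta_{t,1:2k} - \overline{y}_t) \quad \sqrt{\Sigma_y}\right]\right\},$$

$$\overline{S}_{y,t} = \mathrm{cholupdate}\left\{\overline{S}_{y,t},\ \zeta_{t,0} - \overline{y}_t,\ w_0^c\right\},$$

$$\overline{V}_{xy,t} = \sum_{i=0}^{2k} w_i^c \left[\chi_{t,i} - \overline{X}_t\right]\left[\zeta_{t,i} - \overline{y}_t\right]^\top,$$

$$K_t = \left(\overline{V}_{xy,t}/\overline{S}_{y,t}^\top\right)/\overline{S}_{y,t},$$

$$\widehat{X}_t = \overline{X}_t + K_t\left(y_t - \overline{y}_t\right),$$

$$U = K_t\overline{S}_{y,t},$$

$$\widehat{S}_{x,t} = \mathrm{cholupdate}\left\{\overline{S}_{x,t},\ U,\ -1\right\}.$$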


Joint estimation of P and Q dynamics

Pan (2002, JFE): GMM. Choosing moment conditions becomes increasingly difficult with an increasing number of parameters.

Eraker (2004, JF): Bayesian with MCMC. Choose 2-3 options per day. Throw away lots of cross-sectional (Q) information.

Bakshi & Wu (2005, wp), “Investor Irrationality and the Nasdaq Bubble”: MLE with filtering.

Cast activity rate P-dynamics into the state equation, cast option prices into the measurement equation.

Use UKF to filter out the mean and covariance of the states and measurement.

Construct the likelihood function of options based on forecasting errors (from UKF) on the measurement equations.

Given the filtered activity rates, construct the conditional likelihood on the returns by FFT inversion of the conditional characteristic function.

The joint log likelihood equals the sum of the log likelihood of option pricing errors and the conditional log likelihood of stock returns.


Existing issues

When the state dynamics are discontinuous, the performance of UKF can deteriorate. Using a particle filter instead increases the computational burden dramatically.

Inconsistency regarding the current state of the state vector.

When we price options, we derive the option values (Fourier transforms) as a function of the state vector.

In model estimation, the value of the state vector is never known; we only know its distribution (characterized by the sigma points).

The two components (pricing and model estimation) are inherently inconsistent with each other.

Several papers try to use incomplete information to explain the correlation between credit events and their arrival intensity.

Conditional on incomplete information about the state vector (the current state of the state vector is not known in full), how should the option valuation adjust?


Why are we doing this?

Understanding the P/Q dynamics: Researchers are inherently curious.

Understanding human investment behavior: How do investors price different sources of risk differently?

Refine investment decisions.

Understand the sources of risks in each contract and the expected return per unit exposure to each risk source.

Exploit violations of no-arbitrage conditions.


Example: No-arbitrage dynamic term structure models

Basic idea:

Interest rates across different maturities are related.

A dynamic term structure model provides a (smooth) functional form for this relation that excludes arbitrage.

The model usually consists of specifications of risk-neutral factor dynamics (X) and the short rate as a function of the factors, e.g., $r_t = a_r + b_r^\top X_t$.

Nothing about the forecasts: The “risk-neutral dynamics” are estimated to match historical term structure shapes.

A model is well-specified if it can fit most of the term structure shapes reasonably well.


A 3-factor affine model

with adjustments for discrete Fed policy changes:

Pricing errors on USD swap rates in bps

Maturity    Mean    MAE    Std    Auto    Max      R2
2 y         0.80    2.70   3.27   0.76    12.42    99.96
3 y         0.06    1.56   1.94   0.70     7.53    99.98
5 y        -0.09    0.68   0.92   0.49     5.37    99.99
7 y         0.08    0.71   0.93   0.52     7.53    99.99
10 y       -0.14    0.84   1.20   0.46     8.14    99.99
15 y        0.40    2.20   2.84   0.69    16.35    99.90
30 y       -0.37    4.51   5.71   0.81    22.00    99.55

Superb pricing performance: R-squared is greater than 99%.

The maximum pricing error is 22 bps.

Pricing errors are transient compared to swap rates (0.99):

Average half life of the pricing errors is 3 weeks.

The average half life for swap rates is 1.5 years.


Investing in interest rate swaps based on dynamic term structure models

If you can forecast interest rate movements,

Long swap if you think rates will go down.

Forget about the dynamic term structure model: It does not help your interest rate forecasting.

If you cannot forecast interest rate movements (it is hard), use the dynamic term structure model not for forecasting, but as a decomposition tool:

$y_t = f(X_t) + e_t$

What the model captures ($f(X_t)$) is the persistent component, which is difficult to forecast.

What the model misses (the pricing error e) is the more transient and hence more predictable component.

Form swap-rate portfolios that

neutralize their first-order dependence on the persistent factors,

only vary with the transient residual movements.

Result: The portfolios are strongly predictable, even though the individual interest-rate series are not.

Static arbitrage trading based on no-arbitrage dynamic term structure models

For a three-factor model, we can form a 4-swap rate portfolio that has zero exposure to the factors.

The portfolio should have duration close to zero.

No systematic interest rate risk exposure.

The fair value of the portfolio should be relatively flat over time.

The variation of the portfolio’s market value is mainly induced by short-term liquidity shocks...

Long/short the swap portfolio based on its deviation from the fair model value.

Provide liquidity to where the market needs it and receive a premium from doing so.
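
The construction of a four-swap portfolio with zero exposure to three factors amounts to finding a null-space vector of the loading matrix. The sketch below assumes the first-order factor loadings of each swap rate are already available from the estimated model; the matrix `B` shown is made up purely for illustration, and the function name is hypothetical.

```python
import numpy as np

def factor_neutral_weights(B):
    """Weights on 4 swap rates with zero first-order exposure to 3 factors.

    B: (4, 3) array; row i holds swap rate i's loadings on the three factors.
    """
    _, _, Vt = np.linalg.svd(B.T)     # B.T is 3x4, so its null space is non-empty
    w = Vt[-1]                        # right singular vector spanning the null space
    return w / np.abs(w).max()        # scale so the largest position is 1 in size

# Made-up loadings, for illustration only (not estimates from the model).
B = np.array([[1.0, -1.0,  1.0],
              [1.0, -0.5, -0.5],
              [1.0,  0.5, -0.5],
              [1.0,  1.0,  1.0]])
w = factor_neutral_weights(B)
print(w, B.T @ w)
```

The portfolio value then moves only with the pricing errors of the four swaps, which is the transient component the strategy trades on.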


The time-series of 10-year USD swap rates

Hedged (left) v. unhedged (right)

[Figure: two time-series panels covering Jan96 to Jan02. Left panel: hedged interest rate portfolio, in bps. Right panel: 10-year swap rate, in %.]

It is much easier to predict the hedged portfolio (left panel) than the unhedged swap contract (right panel).


Back-testing results from a simple investment strategy

95-00: In sample. Holding each investment for 4 weeks.

[Figure: four panels of cumulative wealth over time from the strategy, one each for US, BP, EU, and JY.]


Caveats

Convergence takes time: We take a 4-week horizon.

Accurate hedging is vital for the success of the strategy. The model needs to be estimated with dynamic consistency:

Parameters are held constant. Only state variables vary.

Appropriate model design is important: parsimony, stability, adjustment for some calendar effects.

Daily fitting of a simpler model (with daily varying parameters) is dangerous.

Spread trading (one factor) generates low Sharpe ratios.

Butterfly trading (2 factors) is also not guaranteed to succeed.


Another example: Trading the linkages between sovereign CDS and currency options

When a sovereign country’s default concern (over its foreign debt) increases, the country’s currency tends to depreciate, and currency volatility tends to rise.

“Money as stock” corporate analogy.

Observation: Sovereign credit default swap spreads tend to move positively with the currency’s

option implied volatilities (ATMV): A measure of the return volatility.

risk reversals (RR): A measure of distributional asymmetry.


Co-movements between CDS and ATMV/RR

[Figure: time series from Dec01 to Mar05 for Mexico and Brazil. One set of panels pairs the CDS spread (%) with the implied volatility factor (%); the other pairs the CDS spread (%) with the risk reversal factor (%).]


A no-arbitrage model that prices both CDS and currency options

Model specification:

At normal times, the currency price (dollar price of a local currency, say peso) follows a diffusive process with stochastic volatility.

When the country defaults on its foreign debt, the currency price jumps by a large amount.

The arrival rate of sovereign default is also stochastic and correlated with the currency return volatility.

Under these model specifications, we can price both CDS and currency options via no-arbitrage arguments. The pricing equations are tractable. Numerical implementation is fast.

Estimate the model with dynamic consistency: Each day, three things vary: (i) currency price (both diffusive moves and jumps), (ii) currency volatility, and (iii) default arrival rate.

All model parameters are fixed over time.
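To make the three moving parts concrete, here is a minimal simulation sketch in Python. It is not the model estimated in the slides: the square-root dynamics for volatility and default intensity, the Euler discretization, the fixed jump size, every parameter value, and the function name `simulate_path` are assumptions chosen for illustration only. Drift terms such as the interest rate differential and the jump compensator are omitted.

```python
import numpy as np

def simulate_path(n_steps=252, dt=1 / 252, seed=0,
                  kappa_v=2.0, theta_v=0.02, sigma_v=0.3,    # variance dynamics (assumed)
                  kappa_l=1.0, theta_l=0.03, sigma_l=0.2,    # default intensity dynamics (assumed)
                  rho=0.7,                                   # corr(variance shock, intensity shock)
                  jump_size=-0.4):                           # log-price jump at default (assumed)
    """Euler simulation of log currency price, variance, and default intensity."""
    rng = np.random.default_rng(seed)
    log_s = np.zeros(n_steps + 1)
    v = np.full(n_steps + 1, theta_v)
    lam = np.full(n_steps + 1, theta_l)
    defaulted = False
    for t in range(n_steps):
        z_s, z_v, z_ind = rng.standard_normal(3)
        # Correlate the intensity shock with the variance shock.
        z_l = rho * z_v + np.sqrt(1.0 - rho ** 2) * z_ind
        sq_v = np.sqrt(max(v[t], 0.0))
        sq_l = np.sqrt(max(lam[t], 0.0))
        # Default arrives over dt with probability 1 - exp(-lam * dt).
        jump = 0.0
        if not defaulted and rng.random() < 1.0 - np.exp(-max(lam[t], 0.0) * dt):
            defaulted = True
            jump = jump_size
        # Diffusive move with stochastic volatility, plus the default jump if any.
        log_s[t + 1] = log_s[t] - 0.5 * sq_v ** 2 * dt + sq_v * np.sqrt(dt) * z_s + jump
        # Square-root dynamics, truncated at zero to keep the states non-negative.
        v[t + 1] = max(v[t] + kappa_v * (theta_v - v[t]) * dt
                       + sigma_v * sq_v * np.sqrt(dt) * z_v, 0.0)
        lam[t + 1] = max(lam[t] + kappa_l * (theta_l - lam[t]) * dt
                         + sigma_l * sq_l * np.sqrt(dt) * z_l, 0.0)
    return log_s, v, lam, defaulted

log_s, v, lam, defaulted = simulate_path()
```

In this sketch the parameters stay fixed along the whole path while the three states (`log_s`, `v`, `lam`) change every step, which mirrors the dynamic-consistency requirement stated above.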


The hedged portfolio of CDS and currency options

Suppose we start with an option contract on the currency. We need four other instruments to hedge the risk exposure of the option position:

1 The underlying currency to hedge infinitesimal movements in the exchange rate.

2 A risk reversal (out-of-the-money option) to hedge the impact of default on the currency value.

3 A straddle (at-the-money option) to hedge the currency volatility movement.

4 A CDS contract to hedge the default arrival rate variation.

The portfolio needs to be rebalanced over time to remain neutral to the risk factors.

The value of the hedged portfolio is much more transient than volatilities or CDS spreads.
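With four risk sources and four hedging instruments, the hedge ratios solve a small linear system. The sketch below assumes the exposures of each instrument to each risk source have already been computed from a pricing model; every number in `A` and `b` is a made-up placeholder, not a value from the slides.

```python
import numpy as np

risks = ["spot move", "default jump", "volatility", "default arrival rate"]
hedges = ["currency", "risk reversal", "straddle", "CDS"]

# Hypothetical exposures: A[i, j] is the exposure of one unit of hedge j to risk i.
# The default-jump row is a finite price impact, not an infinitesimal sensitivity.
A = np.array([
    [1.0, 0.30, 0.05, 0.0],
    [1.0, 0.80, 0.40, 0.9],
    [0.0, 0.10, 1.20, 0.0],
    [0.0, 0.05, 0.02, 1.5],
])

# Hypothetical exposures of the target option position to the same four risks.
b = np.array([0.55, 0.35, 0.60, 0.04])

# Choose hedge units n so that the combined exposure A @ n + b is zero.
n = np.linalg.solve(A, -b)
residual = A @ n + b

for name, units in zip(hedges, n):
    print(f"{name:>14s}: {units:+.4f} units")
```

Because the exposures change as the state variables move, the system has to be re-solved at each rebalancing date.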


Back-testing results


A more generic setup

Let $y_t \in \mathbb{R}^N$ denote a vector of observed derivative prices, and let $h(X_t)$ denote the model-implied value as a function of the state vector $X_t \in \mathbb{R}^k$. Let $H_t = \partial h(X_t)/\partial X_t \in \mathbb{R}^{N \times k}$ denote the gradient matrix at time $t$.

$-e_t = h(X_t) - y_t$ can be regarded as the “alpha” of the asset and $H_t$ its risk exposure.

We can solve the following quadratic program:

$$\max_{w_t} \; -w_t^\top e_t - \frac{1}{2}\gamma\, w_t^\top V_e w_t$$

subject to factor exposure constraints:

$$H_t^\top w_t = c \in \mathbb{R}^k.$$

The program above maximizes the expected return (alpha) of the portfolio subject to factor exposure constraints.

Setting $c = 0$ maintains (first order) factor-neutrality.

One can also take on factor exposures to obtain factor risk premium.

Special care is needed for factor neutrality when the state can jump.
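Because the constraints are linear equalities, the quadratic program can be solved in one step from its first-order (KKT) conditions, provided $V_e$ is positive definite and $H_t$ has full column rank. The sketch below assumes exactly that. The function name `solve_alpha_portfolio` and all inputs (random pricing errors, exposures, and a diagonal $V_e$) are illustrative placeholders, not data from the slides.

```python
import numpy as np

def solve_alpha_portfolio(e, H, V, gamma, c=None):
    """Maximize -w'e - 0.5*gamma*w'Vw subject to H'w = c via the KKT system."""
    e = np.asarray(e, dtype=float)
    H = np.asarray(H, dtype=float)
    V = np.asarray(V, dtype=float)
    n, k = H.shape
    c = np.zeros(k) if c is None else np.asarray(c, dtype=float)
    # Stationarity: gamma*V*w + H*lam = -e ; feasibility: H'w = c.
    kkt = np.block([[gamma * V, H],
                    [H.T, np.zeros((k, k))]])
    rhs = np.concatenate([-e, c])
    solution = np.linalg.solve(kkt, rhs)
    return solution[:n]  # portfolio weights; the remaining entries are multipliers

# Illustrative inputs (all hypothetical).
rng = np.random.default_rng(0)
N, K = 6, 2
H = rng.normal(size=(N, K))                         # factor exposures H_t
e = rng.normal(scale=0.1, size=N)                   # pricing errors e_t = y_t - h(X_t)
V = 0.01 * np.diag(rng.uniform(0.5, 1.5, size=N))   # assumed covariance V_e
gamma = 5.0

w = solve_alpha_portfolio(e, H, V, gamma)           # c = 0: factor-neutral portfolio
print("weights:", np.round(w, 4))
print("factor exposures:", np.round(H.T @ w, 10))
```

Passing a non-zero `c` gives the variant in which the portfolio deliberately carries factor exposure. The neutrality achieved here is first order only, so it does not cover jumps in the state.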


Bottom line

If you have a working crystal ball, others’ risks become your opportunities.

Forget about no-arbitrage models; lift the market.

No-arbitrage type models become useful when

You cannot forecast the future accurately: Risk persists.

Hedge risk exposures, or

Take a controlled exposure to certain risk sources and receive risk premiums accordingly.

Understanding risk premium behavior is useful not just for academic papers.

Perform statistical arbitrage trading on derivative products that profit from short-term market dislocations.

Caveat: When hedging is off, risk can overwhelm profit opportunities.


Research directions

You can only go so far with designing/estimating option pricing models with time-changed Lévy processes: not many people understand the math.

What are needed are economic stories that appeal to a wider audience. Even if they do not understand the math, they can still understand the importance of the story and appreciate the insights.

Example: “Leverage Effect, Volatility Feedback, and Self-Exciting Market Disruptions”, which bridges the gap between “structural” models and “reduced-form” models.

Some innovative explorations that may open new fronts for research

“A New Simple Approach for Constructing Implied Volatility Surfaces”

“Simple Robust Hedging with Nearby Contracts”

“Multifrequency Cascade Interest Rate Dynamics and Dimension-Invariant Term Structures”

“Linearity-Generating Processes, Unspanned Stochastic Volatility, and Interest-Rate Option Pricing”
