
Robust portfolio selection for index tracking


Computers & Operations Research 39 (2012) 829–837

Contents lists available at ScienceDirect

Computers & Operations Research


journal homepage: www.elsevier.com/locate/caor

Robust portfolio selection for index tracking

Chen Chen, Roy H. Kwon *

Department of Mechanical and Industrial Engineering, University of Toronto, 5 King’s College Road, Toronto, ON, Canada M5S 3G8

Article info

Available online 8 September 2010

Keywords:

Index tracking

Passive fund management

Portfolio selection

Robust optimization

0305-0548/$ - see front matter © 2010 Elsevier Ltd. All rights reserved.

doi:10.1016/j.cor.2010.08.019

* Corresponding author. E-mail address: [email protected] (R.H. Kwon).

Abstract

We develop a robust portfolio selection model for tracking a market index using a subset of its assets. The model is a 0–1 integer program that seeks to maximize similarity between selected assets and the assets of the target index. We allow uncertainty in the objective function by using a computationally tractable robust framework that can control the conservativeness of the solution. This protects against worst-case realizations of potential estimation errors and other deviations. Out-of-sample experiments using the S&P 100 demonstrate the advantages of the robust model. Compared to portfolios constructed with the nominal model, moderately conservative robust portfolios are shown to have lower tracking error and risk profiles that are more similar to the target index.

© 2010 Elsevier Ltd. All rights reserved.

1. Introduction

Index tracking is a type of passive management that seeks to replicate the performance of a benchmark set of assets. A simple way to track an index is to hold all its assets in the same relative quantities, a method known as full replication. However, assets with small weightings will incur proportionally high transaction costs, and frequent revisions in the index will result in many transactions overall. Revisions can be especially problematic in times of high volatility; fluctuating prices can cause many assets to be added to and deleted from the index. Since one of the main advantages of passive management is reduced transaction costs, it may be desirable to hold fewer assets than the target index. This approach, however, results in tracking portfolios that do not match the performance of the target index as closely as full replication. One way to mitigate this problem is to minimize the expected tracking error, but this can result in complex models that require potentially sub-optimal heuristic algorithms to solve problems of practical size. Other methods try to match the expected return of the index, either directly using the mean-variance optimization framework established by Markowitz, or indirectly using economic factors. We use a model that maximizes similarity between the assets in the constructed tracking portfolio and those in the index, which avoids many of the disadvantages of the aforementioned methods. This model considers only the initial selection and subsequent investment in assets for the tracking portfolio, and as such we do not address the issue of rebalancing the portfolio. The main purpose of this paper is to study the effects of robustness in our selection model in a one-period setting. Incorporating robustness in a multi-period setting with rebalancing and transaction costs could be handled by combining stochastic programming with recourse and robust optimization, but this is beyond the scope of the paper. Alternatively, turnover constraints could be incorporated to limit transaction costs.

1.1. Literature overview

In finance, tracking error is commonly measured as the variance of the difference between portfolio return and index return. Minimizing this variance in the objective function results in a quadratic program such as in Meade and Salkin [1] and Jansen and Dijk [2]. Gaivoronski et al. [3] discuss in detail different measures of tracking error. Some researchers attempt to deal with the complexity of the resulting model by approximating good but not necessarily optimal solutions with heuristic algorithms. For example, Beasley et al. [4] create an evolutionary heuristic that minimizes a potentially nonlinear tracking error function while considering transaction costs and portfolio rebalancing. Gilli and Kellezi [5] use the threshold accepting heuristic algorithm to minimize tracking error with transaction costs. Building on the model of Jansen and Dijk [2], Coleman et al. [6] minimize quadratic tracking error with a cardinality constraint in the objective function using a graduated non-convexity algorithm. This algorithm first finds the global optimal solution to the unconstrained problem and then gradually approaches the required number of assets through a series of locally optimal solutions, resulting in a near-optimal final solution. Another way to tractably minimize tracking error is to use a linear definition, making it amenable to linear programming. Konno and Wijayanayake [7] minimize mean absolute deviation with a branch-and-bound method, and present an alternative model for mean downside deviation. Rudolf et al. [8] also minimize tracking error with linear programming, and investigate four different linear measures of tracking error.

A different approach for index tracking is to use mean-variance optimization and to define the variance as tracking error relative to a benchmark. Roll [9] minimizes quadratic tracking error using the mean-variance framework and applies an additional constraint on the beta coefficient of the tracking portfolio. Rohweder [10] includes transaction costs in the objective function, and suggests that a factor model produces a covariance matrix that is superior to one composed of historical values. Wang [11] also includes transaction costs, and demonstrates that multiple indices can be tracked by transforming the problem to a standard problem of tracking one index.

There are, however, some well-known problems with implementing mean-variance optimization, such as the covariance matrix not being positive definite due to rounding error, or an ill-conditioned correlation matrix leading to unstable asset allocations. Another significant issue is that the population parameters with respect to returns cannot be known and therefore must be estimated. In particular, the difficulty of accurately forecasting mean returns is considerable. Jagannathan and Ma [12] suggest that the estimation error for the sample mean in returns is too large to be of much use, and advocate a minimum-variance portfolio instead. DeMiguel and Nogales [13] also support this; they create a single-step nonlinear optimization that uses robust estimators of the covariance matrix to produce more stable minimum-variance portfolios. They cite Merton [14] as having demonstrated that estimation error can be reduced for the sample covariance matrix with more frequent sampling of returns, while estimation error for the sample mean can only be reduced with a time series of longer duration, to the point that reasonable mean return estimates can only be had with unreasonably long time series.

Factor models can also be used for index tracking by matching the characteristics of a tracking portfolio with those of the target index. Rudd [15] presents a heuristic for portfolio selection and creates portfolios with a beta of unity relative to the index while minimizing residual risk. Corielli and Marcellino [16] also consider multiple factors, employing a simple selection heuristic to produce a portfolio that avoids long-term factors that could lead to tracking error. Erdogan et al. [17] consider different factors, maximizing Sharpe ratio with proportional transaction costs and with a constraint on beta, using a model based on mean-variance optimization. In order to produce a more accurate forecast of returns, they use the Capital Asset Pricing Model. They account for estimation error of the Sharpe ratio by using the robust optimization framework of Ben-Tal and Nemirovski [18], resulting in a second-order cone program. Canakgoz and Beasley [19] use a different approach, formulating a MIP with an objective function based on linear regression. For basic index tracking they solve their model in three stages, first matching the regression intercept with respect to returns, then the slope of regression, and finally minimizing an approximate measure of transaction costs.

There are also advocates of index tracking based on cointegration, which is a measure of long-term trend in the time series of prices. Alexander and Dimitriu [20] create a method to determine cointegration-optimal weightings for a set of already selected stocks in an index tracking portfolio. They compare cointegration-optimal portfolios with those made with the model in Roll [9] and show a similar performance out-of-sample. In a similar vein, Dunis and Ho [21] determine cointegration-optimal weightings for a tracking portfolio, and do not provide a selection method. They argue that portfolios weighted on the basis of correlation would require more frequent rebalancing. Focardi and Fabozzi [22] propose a method to cluster similar, potentially cointegrated stocks based on a distance function for comparing time series of asset prices. They suggest that these clusters can simplify the portfolio selection process, allowing one to select an asset from each cluster either manually or with optimization.

To the best of our knowledge, there has not been any application of robust optimization to index tracking involving discrete choice (integer decisions). However, robust optimization has been effectively used in various other financial applications. For example, Tutuncu and Koenig [23] create a robust model for asset allocation that uses uncertainty sets, instead of point estimates, for returns and covariance. Bertsimas and Pachamanova [24] develop robust formulations for multi-period portfolio management that can be solved with linear programming. Kawas and Thiele [25] develop log-robust portfolio management, advocating the continuously compounded rate of return as the underlying driver of uncertainty rather than active returns. Pinar and Tutuncu [26] introduce the notion of robust profit opportunity, related to arbitrage. They also develop single and multi-period models to maximize robust profit opportunities. Many of these ideas are also explored in Cornuejols and Tutuncu [27].

1.2. Contributions of the paper

In this paper we adopt an integer program from Cornuejols and Tutuncu [27] that selects a given number of assets for a tracking portfolio over a single period. Instead of explicitly including transaction costs, the model specifies the number of assets to be included in the tracking portfolio. The model avoids the computational difficulties of using quadratic tracking error by instead maximizing pairwise similarities between assets of the tracking portfolio and its target index. As a result, exact solutions can be found using standard optimization software. Furthermore, returns do not have to be forecasted in the model, and many other data issues found in mean-variance optimization are avoided. However, the similarity measure in the model's objective function must still be estimated and is thus vulnerable to misspecification and potential deviations over time, a problem also encountered with the covariance matrix in mean-variance optimization. Therefore, to protect against this risk, we apply the robust discrete optimization framework of Bertsimas and Sim [28], which allows one to specify ranges of uncertainty for the objective coefficients instead of specific statistical distributions. This framework protects against worst-case realizations in the objective function while maintaining computational tractability, allowing the robust portfolio selection model to handle indices of practical size. We exploit the property of the robust framework that allows control over the conservativeness of the solution. Out-of-sample testing with the S&P 100 demonstrates that incorporating robustness in the model can provide improved performance, particularly with moderately conservative settings.

1.3. Structure of the paper

Section 2 presents the nominal model for selecting assets for the portfolio. Section 3 presents the robust formulation of the model as well as methods to solve it. Section 4 presents experimental methods and results. Section 5 contains concluding remarks.

2. Portfolio selection model

As presented by Cornuejols and Tutuncu [27], the model selects stocks for a tracking portfolio. After the model has been solved, the selections are weighted using the market value of the assets in the target index on the basis of similarity. There are numerous measures of similarity between assets; for example, cointegration (see Section 1.1) and covariance. Like Cornuejols and Tutuncu, we used correlation of returns for our experiments, as it is a simple and bounded measure that is amenable to the maximization objective.

2.1. Model variables

Suppose we construct a portfolio of $q$ assets from a target index of $n$ assets. Let $\rho_{ij}$ represent the similarity between asset $i$ and asset $j$.

2.2. Decision variables

Let $y_j$ represent whether asset $j$ is selected to be in the portfolio (1 if true, 0 otherwise). Let $x_{ij}$ represent whether asset $j$ is the representative of stock $i$; $x_{ij}$ is 1 if $j$ is the most similar asset in the portfolio to $i$, 0 otherwise.

2.3. Model

$$Z = \max \sum_{i=1}^{n} \sum_{j=1}^{n} \rho_{ij} x_{ij} \qquad (1)$$

Subject to

$$\sum_{j=1}^{n} y_j = q \quad \text{(portfolio size constraint)} \qquad (2)$$

$$\sum_{j=1}^{n} x_{ij} = 1 \quad \text{for } i = 1, \ldots, n \quad \text{(each stock has exactly one representative in the portfolio)} \qquad (3)$$

$$x_{ij} \le y_j \quad \text{for } i = 1, \ldots, n;\ j = 1, \ldots, n \quad \text{(stock must be in the portfolio to be a representative)} \qquad (4)$$

$$x_{ij}, y_j \in \{0, 1\} \qquad (5)$$

Having solved this model, a weight $w_j$ is calculated for each selected asset $j$ using the sum of the market value, $V_i$, of each stock from the index it is representing:

$$w_j = \sum_{i=1}^{n} V_i x_{ij}$$

The proportion invested into each selected stock can be found by dividing the stock's weight by the sum of all weights in the portfolio. Market value is used because most indices are weighted by market capitalization. If, for example, one were tracking an equal-weighted index, then market value would not be needed at all, and the term $V_i$ would be removed from the previous equation.
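The selection and weighting steps can be illustrated on a very small index. The following is only a sketch, assuming a hypothetical four-asset similarity matrix and hypothetical market values; it enumerates every selection of size q instead of calling an integer programming solver, which is practical only for tiny n. For a fixed selection, constraints (3) and (4) are satisfied best by assigning each asset to its most similar selected asset.

```python
from itertools import combinations


def solve_nominal(rho, q):
    """Brute-force model (1)-(5): try every selection of q assets."""
    n = len(rho)
    best_value, best_selection, best_rep = float("-inf"), None, None
    for selection in combinations(range(n), q):
        # rep[i] is the selected asset j with x_ij = 1.
        rep = [max(selection, key=lambda j: rho[i][j]) for i in range(n)]
        value = sum(rho[i][rep[i]] for i in range(n))
        if value > best_value:
            best_value, best_selection, best_rep = value, selection, rep
    return best_value, best_selection, best_rep


def portfolio_weights(rep, market_value):
    """w_j = sum of V_i over assets i represented by j, normalised to sum to 1."""
    raw = {}
    for i, j in enumerate(rep):
        raw[j] = raw.get(j, 0.0) + market_value[i]
    total = sum(raw.values())
    return {j: w / total for j, w in raw.items()}


# Hypothetical data: assets 0 and 1 move together, as do assets 2 and 3.
rho = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
market_value = [40.0, 20.0, 30.0, 10.0]

value, selection, rep = solve_nominal(rho, q=2)
weights = portfolio_weights(rep, market_value)
print(value, selection, weights)
```

In this toy instance any optimal selection holds one asset from each correlated pair, and the representative of the first pair receives the combined market value of both of its members.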

Market value can be explicitly included in this model by adding $V_i$ to represent the market value of asset $i$ in the target index so that the objective now maximizes the product of similarity and market value. To include uncertainty information in the market value, however, would require determining risk with respect to returns.

An advantage of the model is that it does not explicitly use expected return estimates, and thus avoids issues related to estimation error in returns. Since the model is an integer program, it easily allows additional linear and discrete constraints to be formulated. Furthermore, the act of selection before weighting allows flexibility in terms of matching the target index's weighting method. A drawback is that the pairwise similarity term $\rho_{ij}$ results in a quadratic number of variables and constraints with respect to the number of assets in the target index. Nonetheless, problems of practical size can be solved exactly using standard linear optimization software; computational issues and ways to alleviate them are discussed in Section 4.1.

3. Robust formulation

To implement the nominal portfolio selection model, the similarity measure must be estimated using historical data; however, the actual similarity will vary over time. Therefore, the objective is subject to both uncertain future realizations and potential estimation errors. For this reason we use robust optimization, which provides a way of generating optimal solutions for worst-case realizations given some bounded uncertainty. For guidance in implementing robust optimization, we recommend Bertsimas and Thiele [29].

There are particular advantages with using the robust optimization framework of Bertsimas and Sim [28]. For 0–1 optimization with n decision variables, they show that the robust counterpart can be solved using n+1 instances of the nominal problem. The framework allows uncertainty to be expressed as a range, and thus does not explicitly require statistical distributions, avoiding issues with gathering model data. Furthermore, the degree of conservativeness can be controlled, producing the optimal solution if only a certain number of coefficients realize their lowest bounded values. This last feature is shown to be of significant practical benefit in Section 4.

3.1. Formulation of the robust problem

The standard robust formulation is presented by Bertsimas and Sim [28] as

$$\min_{x \in X} \left\{ c'x + \max_{\{S \mid S \subseteq N,\ |S| \le \Gamma\}} \left\{ \sum_{j \in S} d_j x_j \right\} \right\} \qquad (6)$$

for a nominal problem $\min_{x \in X} c'x$ subject to a set of constraints $X$. In the robust formulation the cost coefficient is a random variable with an expected value $c$ and a maximum value of $c + d$, where $d$ is non-negative. The conservativeness of the solution is set using an arbitrary value $\Gamma$. $\Gamma$ specifies the number of coefficients in the solution allowed to take on their highest bounded values in the worst possible way. $S$ represents some subset of the universe $N$ of assets such that the $\Gamma$ highest deviations are realized. Note that for the least conservative setting ($\Gamma = 0$) the robust problem is equivalent to the nominal problem.

This can be adapted to the portfolio selection model by modifying the nominal objective function (1). Denote the expected value of the cost coefficient as the pairwise similarity $\rho_{ij}$ and assign the minimum value of the cost coefficient to be $\rho_{ij} - d_{ij}$, where $d_{ij}$ is non-negative. The portfolio selection objective is maximization, so the inner maximization of the standard formulation becomes an inner minimization. Thus the robust portfolio selection formulation is

$$Z = \max_{x \in X} \left\{ \sum_{i=1}^{n} \sum_{j=1}^{n} \rho_{ij} x_{ij} + \min_{\{S \mid S \subseteq N,\ |S| \le \Gamma\}} \left\{ \sum_{(i,j) \in S} -d_{ij} x_{ij} \right\} \right\} \qquad (7)$$

This maximin formulation retains all the pertinent features of the standard robust optimization framework as presented in (6).

For the portfolio selection model we specify a range of similarity for all pairs of assets. So we suppose the similarity of assets $i$ and $j$ has an expected value $\rho_{ij}$ and that it will be no less than $\rho_{ij} - d_{ij}$, where $d_{ij}$ is some specified downside deviation. Thus, the $\Gamma$ highest deviations among all selected assets will be subtracted from the objective value, where $\Gamma$ is the conservativeness parameter.
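To make the worst-case objective concrete, the sketch below evaluates it for a fixed assignment of representatives. The similarity and deviation matrices are hypothetical, the conservativeness parameter is assumed to be a non-negative integer, and the function only scores a given assignment; it does not search for the robust-optimal one.

```python
def robust_value(rho, dev, rep, gamma):
    """Worst-case objective for a fixed assignment.

    rep[i] is the selected asset representing asset i, so x_ij = 1 exactly
    when j == rep[i]. The adversary removes the `gamma` largest downside
    deviations among those active pairs.
    """
    nominal = sum(rho[i][j] for i, j in enumerate(rep))
    active = sorted((dev[i][j] for i, j in enumerate(rep)), reverse=True)
    return nominal - sum(active[:gamma])


# Hypothetical expected similarities and downside deviations.
rho = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.2, 0.1],
    [0.1, 0.2, 1.0, 0.8],
    [0.2, 0.1, 0.8, 1.0],
]
dev = [
    [0.0, 0.2, 0.3, 0.3],
    [0.2, 0.0, 0.3, 0.3],
    [0.3, 0.3, 0.0, 0.3],
    [0.3, 0.3, 0.3, 0.0],
]
rep = [0, 0, 2, 2]  # assets 0 and 2 are held; 1 maps to 0 and 3 maps to 2

for gamma in range(3):
    print(gamma, robust_value(rho, dev, rep, gamma))
```

With no protection the value is the nominal objective; each additional unit of protection subtracts the next-largest deviation among the active pairs until none remain.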

3.2. Robust MIP

Adopting the approach of Bertsimas and Sim [28], we introduce decision variables $\theta$ and $v_{ij}$ for the dual of the inner minimization problem in Eq. (7). Then the equivalent MIP for the robust formulation (7) is

$$Z = \max \sum_{i=1}^{n} \sum_{j=1}^{n} \{\rho_{ij} x_{ij} - v_{ij}\} - \Gamma\theta$$

Subject to

$$v_{ij} + \theta \ge d_{ij} x_{ij} \quad \forall i, j \in \{1, \ldots, n\}$$

$$\theta, v_{ij} \ge 0 \quad \forall i, j \in \{1, \ldots, n\}$$

and subject to the constraints of the nominal problem (2)–(5).
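The dualization step can be checked numerically for a fixed selection. The sketch below, using hypothetical deviation values and an integer conservativeness parameter, compares the inner worst case (the sum of the largest active deviations) with the dual expression obtained by minimizing over the dual variable. Because the dual objective is piecewise linear and convex in that variable, only its breakpoints need to be tried.

```python
def worst_case_primal(deviations, gamma):
    """Inner problem: the adversary picks the gamma largest active deviations."""
    return sum(sorted(deviations, reverse=True)[:gamma])


def worst_case_dual(deviations, gamma):
    """Dual: minimise gamma * theta + sum(v_k) with v_k = max(d_k - theta, 0).

    The minimum over theta >= 0 is attained at theta = 0 or at one of the
    deviation values, so only those candidates are evaluated.
    """
    breakpoints = [0.0] + list(deviations)
    return min(
        gamma * theta + sum(max(d - theta, 0.0) for d in deviations)
        for theta in breakpoints
    )


# Hypothetical deviations of the pairs with x_ij = 1.
active_deviations = [0.30, 0.10, 0.25, 0.00, 0.20]
gaps = [
    abs(worst_case_primal(active_deviations, g) - worst_case_dual(active_deviations, g))
    for g in range(len(active_deviations) + 1)
]
print(max(gaps))
```

The two quantities agree for every integer level of protection in this example, which is the property that lets the inner minimization be replaced by linear constraints.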

3.3. Exact algorithm using nominal subproblems

The following algorithm produces an exact solution. It is a simple adaptation of the algorithm in Bertsimas and Sim [28] to accommodate a maximization problem.

Algorithm A. Suppose we order all the $n$ distinct non-zero downside deviations from largest to smallest such that $d_1 > d_2 > \cdots > d_n$ and define $d_{n+1} = 0$. A decision variable $x_l$ corresponding to a deviation $d_l$ is used instead of $x_{ij}$ in this algorithm for notational convenience.

1. For $l = 1, \ldots, n+1$ solve the $n+1$ nominal problems

$$G^l = -\Gamma d_l + \max_{x \in X} \left\{ \rho x - \sum_{k=1}^{l} (d_k - d_l) x_k \right\}$$

2. Let $x^l$ be an optimal solution of the corresponding problem.

3. Let $l^* = \arg\max_{l = 1, \ldots, n+1} G^l$. Then $Z^* = G^{l^*}$ and $x^* = x^{l^*}$.

This algorithm solves the robust MIP presented in Section 3.2 by decomposing the problem into several subproblems that are equivalent to instances of the nominal model. Specifically, the MIP is partitioned into n+1 ranges of values for the dual variable $\theta$, hence the absence of $\theta$ in the corresponding n+1 subproblems. Among these subproblems, it can be shown that the subproblem with the highest corresponding value, as defined in step 1, will have the same solution as the optimal solution to the robust MIP.

A brief outline for the proof of the optimality of Algorithm A is as follows. For an optimal solution $(v^*, x^*, \theta^*)$ to the MIP, it follows that $v^*_{ij} = \max(d_{ij} x^*_{ij} - \theta^*, 0)$ due to the MIP's first constraint together with the non-negativity of $v$. Since $x$ is a 0–1 decision variable, it is also true that $v^*_{ij} = \max(d_{ij} - \theta^*, 0)\, x^*_{ij}$. The decision variable $v$ can thus be replaced in the MIP by substitution. Algorithm A creates subproblems by partitioning the possible values of $\theta$ into the intervals $[0, d_n], [d_n, d_{n-1}], \ldots, [d_2, d_1]$. Since the objective is a linear function of $\theta$ on each interval, $\theta$ must take one of the two extreme values of each interval, resulting in the definition of $G^l$. A complete proof of optimality for Algorithm A is provided in Bertsimas and Sim [28].

This algorithm is particularly useful for generating portfolios for all values of the conservativeness parameter $\Gamma$, since the value of the inner maximization of $G^l$ remains the same regardless of the choice of $\Gamma$. Thus the subproblem solutions obtained for one value of $\Gamma$ can be used to obtain the solution for any value of $\Gamma$. For the portfolio selection model, the maximum number of distinct non-zero deviations is at most the number of unique pairings among the universe of $n$ assets in the index. Thus, the maximum number of subproblems to be solved is

$$\binom{n}{2} = \frac{n^2 - n}{2}$$

where each subproblem is a nominal portfolio selection problem.
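The steps of Algorithm A can be sketched on a generic 0–1 maximization problem. This is an illustration under stated assumptions, not the implementation used in the experiments: the feasible set is a hypothetical cardinality constraint listed explicitly, each nominal subproblem is solved by enumeration rather than by a MIP solver, and the coefficient and deviation values are invented. Tied deviations are tolerated here; they only produce redundant subproblems.

```python
from itertools import combinations


def robust_value(x, rho, dev, gamma):
    """Direct evaluation: nominal value minus the gamma largest active deviations."""
    nominal = sum(r * xi for r, xi in zip(rho, x))
    active = sorted((d * xi for d, xi in zip(dev, x)), reverse=True)
    return nominal - sum(active[:gamma])


def algorithm_a(feasible, rho, dev, gamma):
    """Solve the robust problem through n + 1 nominal subproblems."""
    n = len(rho)
    order = sorted(range(n), key=lambda k: dev[k], reverse=True)
    d = [dev[k] for k in order] + [0.0]  # d_1 >= ... >= d_n and d_{n+1} = 0
    best_value, best_x = float("-inf"), None
    for l in range(n + 1):  # 0-based: d[l] plays the role of d_{l+1}
        def nominal_objective(x, l=l):
            value = sum(rho[k] * x[k] for k in range(n))
            # Only deviations larger than d[l] are penalised, by (d_k - d_l).
            value -= sum((d[pos] - d[l]) * x[order[pos]] for pos in range(l))
            return value

        x_l = max(feasible, key=nominal_objective)  # step 1 (by enumeration)
        g_l = -gamma * d[l] + nominal_objective(x_l)
        if g_l > best_value:  # step 3: keep the best subproblem
            best_value, best_x = g_l, x_l
    return best_value, best_x


# Hypothetical instance: choose exactly two of five items.
rho = [0.9, 0.8, 0.7, 0.6, 0.5]
dev = [0.5, 0.1, 0.3, 0.0, 0.2]
feasible = [
    tuple(1 if k in chosen else 0 for k in range(5))
    for chosen in combinations(range(5), 2)
]

for gamma in range(3):
    print(gamma, algorithm_a(feasible, rho, dev, gamma))
```

Since the inner maximization in each subproblem does not depend on the level of protection, a fuller implementation could cache the subproblem optima once and reuse them for every level, as the text notes.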

4. Experimental results

The S&P 100 was used as a target index to generate tracking portfolios. This is a stock market index, a list of stocks, that is maintained by Standard & Poor's. The index is composed of the leading 100 publicly traded companies in the United States. The stocks in the S&P 100 are selected on the basis of many factors, including market capitalization and balanced representation of all the major economic sectors. Likewise, stocks can be deleted for various reasons, such as mergers, acquisitions, or losing financial viability. Additions to and deletions from the index are made as needed. The value of the index is roughly proportional to the sum of the float-adjusted market capitalizations of its constituents, where float-adjusted market capitalization is the value of all publicly tradable shares.

Daily returns from January 2, 2002 until the investment date of January 3, 2006 were used to generate data for the robust model. Performance was tracked out of sample from the investment date to January 2, 2007, assuming all assets were held after initial investment without rebalancing. Market capitalizations at the end of 2005 were used to calculate the initial asset weightings of the tracking portfolios. All necessary data were obtained from the Center for Research in Security Prices.

4.1. Computational considerations

CPLEX solver engine version 9 was used with default settings to solve the subproblems of Algorithm A as presented in Section 3.3. The experiment resulted in 10,100 decision variables and 10,101 constraints for each nominal problem. In total, 4951 nominal subproblems were solved for Algorithm A in order to generate 101 tracking portfolios for $\Gamma = 0$ to 100, with all portfolios holding the same specified number of assets. Using a computer with a 2.13 GHz dual-core processor and 2046 MB of RAM, computation time was less than 1.5 h for each set of 101 portfolios.

In the nominal formulation, the number of decision variables expands quadratically with respect to the number of assets in the index, $n$. Cornuejols and Tutuncu [27] provide a Lagrangian relaxation to improve branch-and-bound solution time, and they solve the nominal problem using the S&P 500, which contains 500 stocks, as a target index. They also provide a simple heuristic to produce a feasible solution and hence a lower bound for the optimal solution. The number of nominal subproblems to be solved in Algorithm A potentially expands quadratically with respect to $n$. Using the upper bound shown in Section 3.3, tracking the S&P 500 with robustness could potentially require 124,750 subproblems, so each nominal subproblem would have to be solved in less than 1 s on average for the algorithm to finish in one day. However, the number of subproblems can be reduced if there are fewer deviations with unique values. Thus, if we round deviation values to three significant digits, then likely only a few hundred subproblems need to be solved since correlation is bounded between −1 and 1. This is a very reasonable limitation because the impact on the objective function would be minimal, and it is unlikely that deviation can be meaningfully specified with further precision. We conclude that, using Algorithm A and the nominal problem exactly as presented in this paper, an index of 500 assets can be tracked within a practical amount of time.
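The problem sizes quoted in this section follow from simple counting, which the sketch below reproduces. The function names are hypothetical. The variable count is the n² assignment variables plus n selection variables, the constraint count covers (2), (3) and (4), and the subproblem bound is the number of asset pairs; Algorithm A adds one further subproblem for the zero deviation, which is consistent with the 4951 subproblems reported for 100 assets.

```python
from math import comb, floor, log10


def nominal_size(n):
    """Variables and constraints of the nominal model for an index of n assets."""
    variables = n * n + n        # x_ij and y_j
    constraints = 1 + n + n * n  # constraint (2), constraints (3), constraints (4)
    return variables, constraints


def pair_bound(n):
    """Upper bound on distinct non-zero deviations: one per pair of assets."""
    return comb(n, 2)


def round_sig(value, digits=3):
    """Round to a number of significant digits, as suggested for deviations."""
    if value == 0:
        return 0.0
    return round(value, digits - 1 - floor(log10(abs(value))))


print(nominal_size(100))      # sizes for the S&P 100 experiment
print(pair_bound(100) + 1)    # subproblems including the zero deviation
print(pair_bound(500))        # bound for the S&P 500
print(86400 / pair_bound(500))  # seconds available per subproblem in one day
```

Rounding deviations with `round_sig` merges nearly equal values, which is how the number of distinct deviations, and hence the number of subproblems, could be reduced.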

For even larger problems it may be impractical to generate the full spectrum of robust solutions, and so the MIP shown in Section 3.2 can be employed to solve for a specific value of the conservativeness parameter $\Gamma$ instead. However, Bertsimas and Sim [28] suggest that their robust MIP may not scale well to large problems. To deal with this issue, Atamturk [30] provides alternative strong formulations for the 0–1 robust MIP. These formulations have tighter linear program relaxations, which can improve solution times for a branch-and-bound algorithm.

4.2. Model data

Correlation between the daily returns of two assets was chosen as an effective measure of similarity, in the manner of Cornuejols and Tutuncu [27]. The correlation used in this paper is the commonly used product-moment correlation coefficient, which is the covariance in daily returns for a pair of assets divided by the product of their individual standard deviations in daily returns over a given period. Using covariance in the model instead of correlation would present difficulties as maximizing covariance could also encourage the selection of stocks with high variance, particularly with diagonal elements. Furthermore, correlation has a bounded range, which simplifies the process of scaling the downside deviation for each pairwise similarity.

Elton et al. [31] show that the simple moving average of correlations over several periods is an effective forecast of future correlation. They suggest that there may be some further improvement in forecasting by grouping stocks by various factors such as industry and size. However, this is unsuitable for our model, as the model maximizes individual pairwise correlations; thus, each correlation should be differentiated. As such, pairwise correlations were calculated for each of the eight fiscal quarters of 2004 and 2005, and the average of these eight correlations was used for the expected similarity. The deviation value for each pair of assets was set equal to the sample standard deviation of their pairwise correlations over eight periods. This resulted in a reasonable but not excessively conservative estimate of the possible range of realizations; 86% of per-period correlations were above the corresponding lowest bounded value. Robust optimization makes an inherently conservative assumption by using worst-case values, so a large range of uncertainty would encompass a highly improbable event and thus a less meaningful robust optimization.
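The estimation procedure described above can be sketched for one pair of assets as follows. The return series and the period length are hypothetical placeholders (the experiments used eight fiscal quarters of daily returns), and the helper names are not from the paper. Each period yields one product-moment correlation; their mean is taken as the expected similarity and their sample standard deviation as the downside deviation.

```python
from math import sqrt
from statistics import mean, stdev


def pearson(a, b):
    """Product-moment correlation of two equally long return series."""
    mean_a, mean_b = mean(a), mean(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    ss_a = sum((x - mean_a) ** 2 for x in a)
    ss_b = sum((y - mean_b) ** 2 for y in b)
    return cov / sqrt(ss_a * ss_b)  # assumes neither series is constant


def similarity_estimate(returns_a, returns_b, period_length):
    """Return (expected similarity, deviation, per-period correlations)."""
    per_period = [
        pearson(returns_a[s:s + period_length], returns_b[s:s + period_length])
        for s in range(0, len(returns_a), period_length)
    ]
    return mean(per_period), stdev(per_period), per_period


# Hypothetical data: two periods of three observations each.
returns_a = [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
returns_b = [2.0, 4.0, 6.0, 3.0, 2.0, 1.0]
expected, deviation, per_period = similarity_estimate(returns_a, returns_b, 3)
print(expected, deviation, per_period)
```

The lowest bounded similarity used by the robust model would then be the expected value minus the deviation for that pair.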

4.3. Performance metrics

In this section we define the standard metrics that were used to determine the out-of-sample performance of generated tracking portfolios. The returns of the S&P 100 were used to compare the performance of portfolios, and so some of the tracking error arises from revisions that changed the composition of the index during the out-of-sample period.

4.3.1. Beta

The relative beta value at the time of investment for each asset in the tracking portfolio was calculated as

$$\frac{\operatorname{covariance}(r_a, r_i)}{\operatorname{variance}(r_i)}$$

where $r_a$ is the daily return of the asset and $r_i$ is the daily return of the target index from 2004 to 2006. The beta value for a tracking portfolio is the value-weighted average of the beta values of its stocks. Thus the beta value of the target index is 1, and this is the desirable beta value for a tracking portfolio.
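As a rough illustration of the beta calculation, the sketch below computes asset betas from daily returns and then the value-weighted portfolio beta. The helper names and the toy return series are hypothetical, and the weights are assumed to sum to 1.

```python
from statistics import mean, variance

def covariance(x, y):
    """Sample covariance of two equal-length series."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)

def asset_beta(asset_returns, index_returns):
    """Beta of an asset relative to the target index."""
    return covariance(asset_returns, index_returns) / variance(index_returns)

def portfolio_beta(weights, betas):
    """Value-weighted average of asset betas (weights assumed to sum to 1)."""
    return sum(w * b for w, b in zip(weights, betas))

# Hypothetical daily returns for the index and two stocks.
index = [0.010, -0.020, 0.015, 0.005, -0.010]
stock_a = [2 * r for r in index]    # moves twice as much as the index
stock_b = [0.5 * r for r in index]  # moves half as much as the index

betas = [asset_beta(stock_a, index), asset_beta(stock_b, index)]
beta_p = portfolio_beta([1 / 3, 2 / 3], betas)
```

In this toy case the two stocks have betas of 2 and 0.5, and a portfolio holding one third in the first and two thirds in the second has a beta of 1, matching the target index.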

4.3.2. Tracking error

Tracking error was found for each tracking portfolio ex-post, from 2006 to 2008, and was calculated as the sample standard deviation of the difference in daily returns between a portfolio and its target index. Thus tracking error does not differentiate between exceeding and underperforming a target index in returns; ideally tracking error is at its minimum, 0.

4.3.3. Information ratio

We use the definition from Sharpe [32] of the ex-post Sharpe ratio: the average difference in returns between a tracking portfolio and its target divided by the tracking error, as defined earlier in this paper. This ratio is more commonly known as the historical information ratio. Information ratios were calculated using daily returns over the period 2006–2008. These values were annualized by multiplying by a factor of $\sqrt{251}$. The historical information ratio is a measure of the risk/return efficiency of excess return, with greater values being more desirable.
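The tracking error and information ratio definitions translate directly into code. The following is a minimal sketch under the definitions given in the text; the function names and the toy return series are illustrative assumptions.

```python
from math import sqrt
from statistics import mean, stdev

TRADING_DAYS = 251  # annualization basis stated in the text

def return_differences(portfolio, index):
    """Daily difference in returns between a portfolio and its target index."""
    return [p - i for p, i in zip(portfolio, index)]

def tracking_error(portfolio, index):
    """Sample standard deviation of the daily return differences."""
    return stdev(return_differences(portfolio, index))

def information_ratio(portfolio, index, annualize=True):
    """Mean return difference divided by tracking error."""
    diffs = return_differences(portfolio, index)
    ratio = mean(diffs) / stdev(diffs)
    return ratio * sqrt(TRADING_DAYS) if annualize else ratio

# Hypothetical daily returns.
index = [0.010, -0.005, 0.002, 0.007, -0.003]
portfolio = [0.012, -0.006, 0.004, 0.006, -0.001]

te = tracking_error(portfolio, index)
ir_daily = information_ratio(portfolio, index, annualize=False)
ir_annual = information_ratio(portfolio, index)
```

Note that the ratio is undefined when the tracking error is exactly zero, for instance under full replication; a production implementation would need to handle that case explicitly.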

4.3.4. Market ratio

As used by Cornuejols and Tutuncu [27], market ratio was calculated as $(1 + r_a)/(1 + r_i)$, where $r_a$ is the portfolio’s active return after investment and $r_i$ is the target index’s active return after investment. This was calculated for 2007, one year after investment. The ideal market ratio is 1, with a higher value indicating excess return, and a lower value showing underperformance with respect to the target index.
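A small worked example may help with reading the market ratio. The sketch below assumes both returns are simple returns over the same holding period; the function name and the numbers are hypothetical.

```python
def market_ratio(portfolio_return, index_return):
    """Ratio of gross portfolio return to gross index return."""
    return (1 + portfolio_return) / (1 + index_return)

# Hypothetical one-year returns: portfolio 8%, index 5%.
outperforming = market_ratio(0.08, 0.05)
matching = market_ratio(0.05, 0.05)
underperforming = market_ratio(0.02, 0.05)
```

Here the first ratio is above 1, the second is exactly 1, and the third is below 1, corresponding to excess return, perfect tracking, and underperformance respectively.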

4.4. Numerical results

In this section we first discuss the composition of the generated robust and non-robust portfolios, and then analyze the ex-post performance of these portfolios.

4.4.1. Portfolio selection

The nominal objective values of non-robust and most conservative robust portfolios, holding between 1 and 100 assets, are shown in Fig. 1. This figure shows that the cost of robust optimization in the nominal objective function is minimal even for the most conservative setting. Thus robust portfolios can be created that are optimal under worst-case conditions, but also have near-optimal performance under nominal conditions.

For both non-robust and robust portfolios, allocations were fairly stable with respect to size; an optimal portfolio of q + 1 assets would contain many of the same assets as the optimal portfolio of q assets. Allocations tended to be more stable when more stocks were held. For example, the non-robust solution with q = 30 shared 28 of the same stocks as the solution with q = 29, while the solution with q = 10 shared only 6 stocks with the solution with q = 9.

For both robust and non-robust solutions, there were certain instances with multiple optimal solutions. This was related to the similarity measure; in particular, any asset that is representing itself in the tracking index was assigned a correlation of 1 and a deviation of 0. For these cases it would be useful to differentiate assets using another factor, in addition to pairwise correlation and stability of correlation. For example, market capitalization can be directly incorporated into the objective function (see Section 2). Such an objective function would likely result in more unique solutions and stable allocations, since assets would be differentiated further.

C. Chen, R.H. Kwon / Computers & Operations Research 39 (2012) 829–837

Table 1 shows the difference in allocated wealth among portfolios with various levels of conservatism, averaged over q from 5 to 30 (26 portfolios per conservatism setting). For example, if portfolio A invested 90% in X and 10% in Y, and portfolio B invested 80% in X and 20% in Y, then A and B have allocated 10% of their wealth differently. Table 1 shows that robust solutions had significantly different allocations from the non-robust solutions. Furthermore, robust solutions at highly conservative settings (Γ ≥ 50) were quite similar to each other. This is a consequence of the input data; the distribution of deviations in pairwise correlation was such that the lowest 5% of deviations would have relatively small impact on the objective function.
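One way to reproduce the 10% figure in the example is to take half of the sum of absolute weight differences between the two portfolios. This is an assumed reading of the measure that is consistent with the example in the text; the function name and the dictionary representation of portfolios are illustrative.

```python
def allocation_difference(weights_a, weights_b):
    """Fraction of wealth allocated differently between two portfolios.

    Each argument maps asset name to portfolio weight (weights sum to 1).
    Assets missing from a portfolio are treated as having weight 0.
    """
    assets = set(weights_a) | set(weights_b)
    total = sum(abs(weights_a.get(k, 0.0) - weights_b.get(k, 0.0)) for k in assets)
    return 0.5 * total  # each unit of moved wealth is counted twice in the sum

portfolio_a = {"X": 0.90, "Y": 0.10}
portfolio_b = {"X": 0.80, "Y": 0.20}
difference = allocation_difference(portfolio_a, portfolio_b)  # about 0.10
```

Under this definition two identical portfolios differ by 0 and two portfolios with no assets in common differ by 1.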

As expected, the robust portfolios tended to avoid those pairwise correlations with high deviations; their non-robust counterparts were produced with no consideration of such deviations. Furthermore, compared to non-robust solutions, the robust solutions invested relatively less in companies with lower market capitalization and higher beta, and correspondingly invested more in those companies with higher market capitalization and lower beta. Table 2 shows the characteristics of stocks that had on average at least 1% difference in wealth allocation between non-robust and robust solutions for q from 5 to 30 (26 portfolios per comparison). At all levels of conservatism, robust solutions consistently had more invested in companies with higher market capitalization and lower beta than the non-robust solutions. As conservativeness increased, the stocks with reduced investment had lower average market capitalization, but so did the stocks with increased investment. Thus the robust objective function sought more stable correlations, resulting in greater investment in larger companies with lower beta values.

Fig. 1. Expected correlation of least and most conservative portfolios.

Table 1. Average difference in allocation of wealth for portfolios of different conservatism with q from 5 to 30.

| | Non-robust | Γ = 5 | Γ = 15 | Γ = 25 | Γ = 50 | Γ = 75 | Γ = 100 |
|---|---|---|---|---|---|---|---|
| Non-robust | – | 13% | 16% | 19% | 23% | 22% | 23% |
| Γ = 5 | 13% | – | 10% | 13% | 18% | 19% | 18% |
| Γ = 15 | 16% | 10% | – | 8% | 16% | 15% | 15% |
| Γ = 25 | 19% | 13% | 8% | – | 11% | 12% | 11% |
| Γ = 50 | 23% | 18% | 16% | 11% | – | 4% | 5% |
| Γ = 75 | 22% | 19% | 15% | 12% | 4% | – | 2% |
| Γ = 100 | 23% | 18% | 15% | 11% | 5% | 2% | – |

Table 2. Characteristics of stocks allocated differently in robust solutions with q from 5 to 30.

| Change in allocation | Characteristic | Γ = 5 | Γ = 15 | Γ = 25 | Γ = 50 | Γ = 75 | Γ = 100 |
|---|---|---|---|---|---|---|---|
| >1% increase vs. non-robust | Beta | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 | 0.8 |
| >1% increase vs. non-robust | Market cap ($B) | 146 | 118 | 126 | 99 | 106 | 99 |
| >1% decrease vs. non-robust | Beta | 1.1 | 1.1 | 1.1 | 1.1 | 1.1 | 1.1 |
| >1% decrease vs. non-robust | Market cap ($B) | 95 | 80 | 80 | 73 | 73 | 73 |

4.4.2. Portfolio performance

For Figs. 2 and 3, tracking portfolios containing between 1 and 100 stocks were generated using the nominal model. There was generally an improvement in terms of tracking error when a tracking portfolio held more assets, as shown in Fig. 2. The improvement in tracking error diminished as the tracking portfolio size increased, with minimal decrease with more than 60 stocks held. In any case, these large portfolios that use many but not all 100 assets are of questionable practical use, as full replication of the target index would be more desirable and the resulting transaction costs would be comparable.

The average expected pairwise correlation, which is the nominal objective function value divided by the number of assets q, is shown in Fig. 3. The out-of-sample realized correlation one year after investment was lower than the expected correlation for nearly all tracking portfolios, particularly for tracking portfolios with few assets. This was expected, as pairwise correlations are maximized by the model, and so the realized correlations can deviate further downward than upward. These results support the use of the robust framework, which optimizes for conditions where coefficients can deviate downward.

In subsequent experiments we varied Γ to observe the out-of-sample performance of tracking portfolios with different degrees of conservativeness. Portfolios holding between 5 and 30 assets were examined out-of-sample using common financial metrics. Since only pairwise correlations were considered by the selection model, none of these metrics were explicitly considered during the generation of the tracking portfolios. In Figs. 4(1)–7(1) portfolios held 5, 10, or 15 assets; in Figs. 4(2)–7(2) portfolios held 20, 25, or 30 assets.

The beta coefficient relative to the target index is shown for each tracking portfolio in Fig. 4. There was generally a decrease followed by an increase in beta coefficient as Γ increased from the nominal value of 0 to the most conservative robust value of 100. Compared to the nominal portfolios, robust portfolios exhibited more desirable beta values, particularly for intermediate values of Γ between 10 and 30.

In Fig. 5(1 and 2), the market ratio is shown exactly one year after investment. All tracking portfolios exceeded the value of the target index. In Fig. 5(1), Γ values around 20 were superior to both nominal and more conservative settings. With the larger portfolios in Fig. 5(2), Γ around 30 resulted in the best performance.

Fig. 2. Out-of-sample tracking error for non-robust portfolios (2006–2007).

Fig. 3. Expected and realized correlations for non-robust portfolios.

Fig. 4. Beta coefficients of portfolios with different Γ and number of assets held.

Fig. 5. Market ratio of portfolios with different Γ and number of assets held (January 2007).


Fig. 6(1 and 2) shows daily tracking error during the period one year after investment. Fig. 6(1) shows that portfolios with a Γ value around 15 were superior to more conservative settings, and either outperformed or were close to the performance of nominal portfolios. In Fig. 6(2), Γ values around 30 were superior.

Since all tracking portfolios generated excess return compared to the target index, the annualized historical information ratio was used as a measure of the efficiency of the surplus returns with respect to risk. Shown in Fig. 7(1 and 2), the information ratio was calculated relative to the target index. In all cases, the most conservative setting resulted in the highest information ratios. In Fig. 7(1), Γ values around 15 resulted in lower information ratios, and in Fig. 7(2), Γ values around 30 had the lowest information ratios.

Fig. 6. Tracking error of portfolios with different Γ and number of assets (2006–2007).

Fig. 7. Annualized information ratios of portfolios with different Γ and number of assets (2006–2007).

4.5. Discussion

Robust tracking portfolios with intermediate values of Γ were shown to be superior. High levels of conservativeness, however, resulted in poorer performance, sometimes worse than nominal counterparts. Recall that for the robust model used in the experiments, a greater penalty is given to pairwise correlations that vary more over several historical periods; controlling conservativeness via Γ sets a threshold for how many pairwise correlations with the highest deviations would be penalized. Thus, with a moderate level of conservativeness, only the highest deviations could have affected the robust objective function, and the remaining pairwise correlations would have been optimized on the basis of expected value. When conservativeness was high, almost all pairwise correlations would have had a significant penalty for deviation; in the model data, average expected correlation was 0.28 while average deviation was 0.13. As such, tracking portfolios with high conservativeness would seek correlations with low deviation, while portfolios with moderate conservativeness would tend to avoid correlations with high deviation and select the rest by maximizing expected correlation. Selecting assets strictly on the basis of stable correlations does not seem to have relevance to tracking, and the experiments show this with the poor performance of the most conservative portfolios. Furthermore, in theoretical terms, highly conservative portfolios are designed to be optimal only in the unlikely case that nearly all objective coefficients take on their lowest bounded values. It is clear, however, that there is an advantage to optimizing with an aversion to the most volatile pairwise correlations, as is the case with moderate conservativeness; for example, this results in a superior risk profile as shown by the beta coefficients.
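The role of Γ described above can be illustrated by evaluating one fixed candidate portfolio under a budget of uncertainty. This is a sketch only: it assumes an integer Γ and a worst case in which the Γ largest deviations among the portfolio's pairwise correlations are subtracted from the nominal objective, in the spirit of the Bertsimas and Sim framework [28]. The function name and the numbers are hypothetical, and the full selection model of Section 3.3 is not reproduced here.

```python
def worst_case_similarity(expected, deviation, gamma):
    """Nominal similarity minus the gamma largest deviations.

    expected, deviation: values for the pairwise correlations used by one
    candidate portfolio; gamma: integer number of correlations allowed to
    fall to their lowest bounded value at the same time.
    """
    nominal = sum(expected)
    penalty = sum(sorted(deviation, reverse=True)[:gamma])
    return nominal - penalty

# Hypothetical expected correlations and deviations for four selected pairs.
expected = [0.60, 0.45, 0.30, 0.25]
deviation = [0.05, 0.20, 0.10, 0.15]

nominal = worst_case_similarity(expected, deviation, 0)       # no protection
moderate = worst_case_similarity(expected, deviation, 1)      # largest deviation only
conservative = worst_case_similarity(expected, deviation, 4)  # every deviation
```

With a small Γ only the pair with the largest deviation is penalized, so the remaining pairs are still judged on expected correlation; with Γ equal to the number of pairs, every correlation is evaluated at its lowest bounded value.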

5. Conclusion

We have presented a portfolio selection model that maximizes similarity and that is robust to potential uncertainty in each estimate of similarity. The model has attractive features, such as a standard linear programming formulation that can be solved exactly using standard optimization software, and flexibility in terms of asset weighting. We avoided forecasting mean returns in favour of the more predictable expected similarity. Robust optimization was employed to provide protection against estimation error and other changes over time.

We demonstrated the feasibility of the model on a practical problem by tracking the S&P 100. Simple but effective forecasts of pairwise correlations were used as a measure of similarity between stocks. Our out-of-sample results demonstrated the advantages of robust portfolios. In particular, robust portfolios generated with moderate levels of conservativeness had better performance in terms of beta coefficient, market ratio, and tracking error. All these benefits came at minimal cost to the nominal objective; the expected similarities of robust portfolios were close to optimal.

References

[1] Meade N, Salkin GR. Developing and maintaining an equity index fund. Journal of the Operational Research Society 1990;41(7):599–607.

[2] Jansen Roel, van Dijk Ronald. Optimal benchmark tracking with small portfolios. Journal of Portfolio Management 2002;28(2):33–9.

[3] Gaivoronski AA, Krylov S, van der Wijst N. Optimal portfolio selection and dynamic benchmark tracking. European Journal of Operational Research 2005;163(1):115–31.

[4] Beasley JE, Meade N, Chang TJ. An evolutionary heuristic for the index tracking problem. European Journal of Operational Research 2003;148(3):621–43.

[5] Gilli M, Kellezi E. The threshold accepting heuristic for index tracking. In: Pardalos P, Tsitsiringos VK, editors. Financial engineering, e-commerce, and supply chain. Kluwer; 2002. p. 1–18. Kluwer Applied Optimization Series.

[6] Coleman Thomas F, Li Yuying, Henniger Jay. Minimizing tracking error while restricting the number of assets. The Journal of Risk 2006;8(4):33–55.

[7] Konno H, Wijayanayake A. Minimal cost index tracking under nonlinear transaction costs and minimal transaction unit constraints. International Journal of Theoretical and Applied Finance 2001;4(6):939–58.

[8] Rudolf Markus, Wolter Hans-Jurgen, Zimmermann Heinz. A linear model for tracking error minimization. Journal of Banking & Finance 1999;23(1):85–103.

[9] Roll R. A mean/variance analysis of tracking error. Journal of Portfolio Management 1992;18:13–22.

[10] Rohweder Herold C. Implementing stock selection ideas: does tracking error optimization do any good? Journal of Portfolio Management 1998;24(3):49–59.

[11] Wang Ming Yee. Multiple-benchmark and multiple-portfolio optimization. Financial Analysts Journal 1999;55(1):63–72.

[12] Jagannathan R, Ma T. Risk reduction in large portfolios: why imposing the wrong constraints helps. The Journal of Finance 2003;58(4):1651–84.

[13] DeMiguel V, Nogales FJ. Portfolio selection with robust estimation. Operations Research 2008.

[14] Merton RC. On estimating the expected return on the market: an exploratory investigation. Journal of Financial Economics 1980;8(4):323–61.

[15] Rudd A. Optimal selection of passive portfolios. Financial Management 1980;9(1):57–66.

[16] Corielli Francesco, Marcellino Massimiliano. Factor based index tracking. Journal of Banking and Finance 2006;30(8):2215–33.

[17] Erdogan E, Goldfarb D, Iyengar G. Robust portfolio management. CORC Technical Report TR-2004–11, Columbia University; 2004.

[18] Ben-Tal A, Nemirovski A. Robust convex optimization. Mathematics of Operations Research 1998;23:769–805.

[19] Canakgoz NA, Beasley JE. Mixed-integer programming approaches for index tracking and enhanced indexation. European Journal of Operational Research 2008.

[20] Alexander Carol, Dimitriu Anca. Indexing and statistical arbitrage: tracking error or cointegration? Journal of Portfolio Management 2005;31(2):50–63.

[21] Dunis Christian L, Ho Richard. Cointegration portfolios of European equities for index tracking and market neutral strategies. Journal of Asset Management 2005;6(1):33.

[22] Focardi SM, Fabozzi FJ. A methodology for index tracking based on time-series clustering. Quantitative Finance 2004;4(4):417–25.

[23] Tutuncu RH, Koenig M. Robust asset allocation. Annals of Operations Research 2004;132:157–87.

[24] Bertsimas D, Pachamanova D. Robust multiperiod portfolio management in the presence of transaction costs. Computers and Operations Research 2008;35(1):3–17.

[25] Kawas B, Thiele A. Short sales in log-robust portfolio management. Technical Report, Lehigh University; 2008.

[26] Pinar MÇ, Tutuncu RH. Robust profit opportunities in risky financial portfolios. Operations Research Letters 2005;33(4):331–40.

[27] Cornuejols G, Tutuncu R. Optimization methods in finance. Cambridge University Press; 2007.

[28] Bertsimas D, Sim M. Robust discrete optimization and network flows. Mathematical Programming 2003;98(1–3):49–71.

[29] Bertsimas D, Thiele A. Robust and data-driven optimization: modern decision-making under uncertainty. Tutorials in Operations Research 2006;4:95–122.

[30] Atamturk A. Strong formulations of robust mixed 0–1 programming. Mathematical Programming 2007;108(2):235–50.

[31] Elton EJ, Gruber MJ, Spitzer J. Improved estimates of correlation coefficients and their impact on optimum portfolios. European Financial Management 2006;12(3):303–18.

[32] Sharpe WF. The Sharpe ratio. Journal of Portfolio Management 1994;21(1):49–58.

