On the Out-of-Sample Importance of Skewness and Asymmetric Dependence for Asset Allocation

ANDREW J. PATTON

London School of Economics

abstract

Recent studies in the empirical finance literature have reported evidence of two types of asymmetries in the joint distribution of stock returns. The first is skewness in the distribution of individual stock returns. The second is an asymmetry in the dependence between stocks: stock returns appear to be more highly correlated during market downturns than during market upturns. In this article we examine the economic and statistical significance of these asymmetries for asset allocation decisions in an out-of-sample setting. We consider the problem of a constant relative risk aversion (CRRA) investor allocating wealth between the risk-free asset, a small-cap portfolio, and a large-cap portfolio. We use models that can capture time-varying moments up to the fourth order, and we use copula theory to construct models of the time-varying dependence structure that allow for different dependence during bear markets than bull markets. The importance of these two asymmetries for asset allocation is assessed by comparing the performance of a portfolio based on a normal distribution model with a portfolio based on a more flexible distribution model. For investors with no short-sales constraints, we find that knowledge of higher moments and asymmetric dependence leads to gains that are economically significant and statistically significant in some cases. For short sales-constrained investors the gains are limited.

keywords: asymmetry, copulas, density forecasting, forecasting, normality, stock returns

This article is a revision of Chapter IV of my Ph.D. dissertation [Patton (2002)]. I would like to thank Sean

Campbell, Rob Engle, Raffaella Giacomini, Clive Granger, Kris Jacobs, Bruce Lehmann, two anonymous

referees, and seminar participants at the Econometric Society meetings, the 2002 Inquire UK annual seminar

in Bournemouth, and the October 2002 Extremal Events in Finance conference in Montreal for helpful

comments. Special thanks are due to Allan Timmermann for many useful discussions on this topic. All

remaining deficiencies are my responsibility. Many thanks go to Vince Crawford and the UCSD Economics

Experimental and Computational Laboratory for providing the computational resources required for this

project. Financial support from the IAM Programme in Hedge Fund Research at LSE and the UCSD Project

in Econometric Analysis fellowship is gratefully acknowledged. Address correspondence to Andrew J.

Patton, Financial Markets Group, London School of Economics, Houghton Street, London WC2A 2AE, UK,

or e-mail: [email protected].

Journal of Financial Econometrics, Vol. 2, No. 1, pp. 130--168

© 2004 Oxford University Press; all rights reserved. DOI: 10.1093/jjfinec/nbh006

Recent studies in the empirical finance literature have reported evidence of two

types of asymmetries in the joint distribution of stock returns. The first is skewness

or asymmetry in the distribution of individual stock returns, which has been

reported by numerous authors over the last three decades.1 Evidence that stock

returns exhibit some form of asymmetric dependence has been reported by

several authors in recent years [see Erb, Harvey, and Viskanta (1994), Longin

and Solnik (2001), Ang and Bekaert (2002), Ang and Chen (2002), Campbell, Koedijk, and Kofman (2002), and Bae, Karolyi, and Stulz (2003)]. The presence of

either of these asymmetries violates the assumption of elliptically distributed asset

returns, which underlies traditional mean-variance analysis [see Ingersoll (1987)].

In this article we examine the economic and statistical significance of these two

asymmetries for asset allocation decisions in an out-of-sample setting. This article

can thus be viewed as an attempt to address the suggestions of Harvey and

Siddique (1999) and Longin and Solnik (2001), who propose investigating the

impact of conditional skewness (Harvey and Siddique) and asymmetric dependence (Longin and Solnik) on portfolio choices.

Theoretical justification for the importance of distributional asymmetries may

be found in Arrow (1971), who suggests that a desirable property of a utility

function is that it exhibits nonincreasing absolute risk aversion.2 Under nonincreasing absolute risk aversion investors can be shown to have a preference for positively skewed portfolios. The skewness of a portfolio of two assets is a function of the skewness of the individual assets and two “coskewness” terms. Asymmetry in the dependence structure can be shown [see Patton (2002)] to lead to nonzero coskewness and thus impact the skewness of the portfolio return. This suggests that risk-averse investors will have preferences over alternative dependence structures. Ang, Chen, and Xing (2002) report empirical evidence in support of this.

We examine the problem of an investor with constant relative risk aversion

(CRRA) allocating wealth between the risk-free asset, the Center for Research in

Security Prices (CRSP) small cap and large cap indices, comprised of the 1st and

10th decile of U.S. stocks sorted by market capitalization. We use monthly data

from January 1954 to December 1989 to develop the models, and data from January 1990 to December 1999 for out-of-sample forecast evaluation. This problem is representative of that of choosing between a high risk--high return asset and a lower risk--lower return asset, as the annualized mean and standard deviation on these indices over the sample were 9.95% and 21.29% for the small caps,

and 7.97% and 14.29% for the large caps. Our motivation for studying a problem

involving two stocks rather than a stock and a bond, as in numerous previous

studies, is that evidence of asymmetric dependence has so far been reported only

1 See Kraus and Litzenberger (1976), Friend and Westerfield (1980), Singleton and Wingender (1986), Lim (1989), Richardson and Smith (1993), Harvey and Siddique (1999, 2000), and Aït-Sahalia and Brandt (2001), among others. Peiró (1999) finds no such evidence.

2 Utility functions that exhibit nonincreasing absolute risk aversion include the constant absolute risk aversion utility function and the constant relative risk aversion utility function; see Huang and Litzenberger (1988).

for equity returns. The presence or absence of asymmetric dependence between

equity and bond returns is yet to be established.

We use models of the asset returns that can capture the empirically observed

time-varying means and variances of stock returns, and also the presence of

(possibly time-varying) skewness and kurtosis, as in Hansen (1994) and Jondeau

and Rockinger (2003). Further, we employ models of the dependence structure

(or copula) that allow for, but do not impose, different dependence during bear markets than bull markets, and allow for changes in this dependence structure

through time. A thorough introduction to copula theory is presented in Schweizer

and Sklar (1983), Joe (1997), and Nelsen (1999).

The importance of skewness and asymmetric dependence for asset allocation

is measured by comparing the performance of a portfolio based on a bivariate

normal distribution model with a portfolio based on a model developed using

copula theory. We compute the amount that an investor could be charged to make

him/her indifferent between two competing portfolios, as in West, Edison, and Cho (1993), Ang and Bekaert (2002), and others. The significance of the differences in portfolio performance is tested using bootstrap methods. We find evidence in most cases that nonnormalities in the marginal distributions and copula do have important economic implications for asset allocation; however, the statistical significance of the improvement is only moderate. Gains are generally only present

for investors that are not short-sales constrained, such as hedge funds.

This article is essentially trying to test three hypotheses simultaneously:

(1) Are these asymmetries present in this dataset? (2) Are these asymmetries predictable out-of-sample? (3) Can we make better portfolio decisions by using forecasts of these asymmetries than we can by ignoring them? If the answer to any of these questions is “no,” then we would conclude that the out-of-sample importance of these asymmetries for asset allocation is zero. The distinction between in-sample and out-of-sample significance is an important one. Finding that a more

flexible distribution model fits the data better in-sample does not imply that it will

lead to better out-of-sample portfolio decisions than those based on a simpler

model. In fact, a common finding in the point forecasting literature is that more complicated models often provide poorer forecasts than simple misspecified models [see Weigend and Gershenfeld (1994), Swanson and White (1995, 1997), and

Stock and Watson (1999)].

In this article we consider both unconstrained and short sales-constrained

estimates of the optimal portfolio weight. The first reason for doing so is econom-

ically motivated: many market participants face the constraint that they are unable

to short sell stocks or to borrow and invest the proceeds in stocks, while others,

such as hedge funds, actively take both long and short positions. The second reason is statistically motivated: the optimal portfolio weight given a density

forecast is itself only an estimate of the true optimal portfolio weight. By ensuring

that our estimate always lies in the interval [0, 1], we employ a type of ‘‘insanity

filter’’ that prevents the investor from taking an extreme position in the market.

Such constraints have been found to improve the out-of-sample performance of

optimal portfolios based on parameter estimates [see Frost and Savarino (1988)

and Jagannathan and Ma (2002)]. One could also consider an intermediate filter

that allows for some limited amount of short selling, but we do not explore such a

possibility here.

Much of the existing work on asset allocation has focused on special cases where

the combination of utility function and distribution model was such that

an analytical solution for the optimal portfolio decision exists [see Kandel and

Stambaugh (1996) or Campbell and Viceira (1999), among others]. Brandt (1999) and Aït-Sahalia and Brandt (2001) overcome the problem of the appropriate

distributional assumption to combine with a given utility function by using the

method of moments and the first-order conditions of the investor’s optimization

problem to obtain an optimal portfolio decision. Detemple, Garcia and Rindisbacher

(2003) present a sophisticated new method for finding optimal portfolio weights

from empirically relevant models. In this article we combine density models that

are shown to adequately describe the statistical properties of the asset returns with

the CRRA utility function.

One of the costs of using flexible parametric models for the joint distribution

of stock returns is that we are forced by computational constraints to be relatively

unsophisticated in other aspects of the project. First, we ignore the effects of

parameter estimation uncertainty on the investor’s decision problem, though

this was found to be important by Kandel and Stambaugh (1996). Also, we only

consider the investor’s problem for the one-period-ahead investment horizon,

thus ignoring the hedging component of the optimal portfolio weight [see Merton

(1971)]. Empirical evidence on the importance of the hedging component is mixed: Brandt (1999), Campbell and Viceira (1999), and Detemple, Garcia, and Rindisbacher (2003) find it to be important, whereas Aït-Sahalia and Brandt (2001) and Ang and

Bekaert (2002) find only weak evidence.

The remainder of the article is structured as follows. In Section 1 we provide a

brief introduction to copula theory and its use in the density forecasting of stock

returns. In Section 2 we present the investor’s decision problem in detail. Section 3

presents the empirical results on the asset allocation problem for a portfolio of a

small-cap index and a large-cap index: the models employed, comparisons of portfolio weights, and tests for improvements in portfolio performance. We conclude in Section 4. In Appendix A we present some details of the optimization

procedure and in Appendix B we provide the functional forms of the copula

models considered.

1 FLEXIBLE MULTIVARIATE DISTRIBUTION MODELS USING COPULAS

In this article we use copula theory to develop flexible parametric models of the joint

distribution of returns. Suppose we have two (scalar) random variables of interest,

$X_t$ and $Y_t$, and some exogenous variables $W_t$. The variables’ joint conditional distribution is $(X_t, Y_t) \mid \mathcal{F}_{t-1} \sim H_t = C_t(F_t, G_t)$, where $H_t$ is some conditional bivariate distribution function, with conditional univariate distributions of $X_t$ and $Y_t$ being $F_t$ and $G_t$, the conditional copula being $C_t$, and $\mathcal{F}_{t-1}$ is the information set defined as $\mathcal{F}_t \equiv \sigma(Z_t)$, for $Z_t \equiv [X_t, Y_t, W_t', X_{t-1}, Y_{t-1}, W_{t-1}', \ldots, X_{t-j}, Y_{t-j}, W_{t-j}']'$. We

will denote the distribution (cdf ) of a random variable using an uppercase letter

and the corresponding density (pdf ) using a lowercase letter.

A copula is any multivariate distribution function that has Uniform (0,1)

marginal distributions. It links together two (or more) marginal distributions to

form a joint distribution. The marginal distributions that it couples can be of any

type: a normal and an exponential, or a Student’s t and a Uniform, for example. The theory of copulas dates back to Sklar (1959) and since then numerous applications have appeared in the statistics literature and more recently also in the

analysis of economic data.3 The main theorem in copula theory is that of Sklar

(1959), presented below for the conditional case. For an introduction to copula

theory see Joe (1997) and Nelsen (1999).

Theorem 1 (Sklar’s theorem for continuous conditional distributions). Let $F$ be the conditional distribution of $X \mid Z$, $G$ be the conditional distribution of $Y \mid Z$, and $H$ be the joint conditional distribution of $(X, Y) \mid Z$. Assume that $F$ and $G$ are continuous in $x$ and $y$, and let $\mathcal{Z}$ be the support of $Z$. Then there exists a unique conditional copula $C$ such that

$$H(x, y \mid z) = C(F(x \mid z), G(y \mid z) \mid z), \quad \forall (x, y) \in \bar{\mathbb{R}} \times \bar{\mathbb{R}} \text{ and each } z \in \mathcal{Z}. \tag{1}$$

Conversely, if we let $F$ be the conditional distribution of $X \mid Z$, $G$ be the conditional distribution of $Y \mid Z$, and $C$ be a conditional copula, then the function $H$ defined by Equation (1) is a conditional bivariate distribution function with conditional marginal distributions $F$ and $G$.

Sklar’s theorem allows us to decompose a bivariate distribution, $H_t$, into three components: the two marginal distributions, $F_t$ and $G_t$, and the copula, $C_t$. The density function equivalent of Equation (1) is obtained quite easily, provided that $F_t$ and $G_t$ are differentiable, and $H_t$ and $C_t$ are twice differentiable:

$$h_t(x, y \mid z) = f_t(x \mid z) \cdot g_t(y \mid z) \cdot c_t(u, v \mid z), \quad \forall (x, y, z) \in \bar{\mathbb{R}} \times \bar{\mathbb{R}} \times \mathcal{Z}, \tag{2}$$

where $u \equiv F_t(x \mid z)$ and $v \equiv G_t(y \mid z)$. Taking logs of both sides we obtain

$$\log h_t(x, y \mid z) = \log f_t(x \mid z) + \log g_t(y \mid z) + \log c_t(u, v \mid z) \tag{3}$$

and so the joint log-likelihood is equal to the sum of the marginal log-likelihoods

and the copula log-likelihood. For the purposes of multivariate density modeling,

the copula representation allows for great flexibility in the specification: we may

model the individual variables using whichever marginal distributions provide

the best fit and then work on modeling the dependence structure via a model for

the copula. The estimation of multivariate time-series models constructed using

3 In statistics, see Clayton (1978), Cook and Johnson (1981), Oakes (1989) and Genest and Rivest (1993). In

economics and finance, see Li (2000), Embrechts, Hoing, and Juri (2001), Patton (2001a, 2001b), Rockinger

and Jondeau (2001), Chen and Fan (2002a, 2002b), Mashal and Zeevi (2002), Miller and Liu (2002), Junker

and May (2002), Fermanian and Scaillet (2003), and Rosenberg (2003).


copulas is discussed in Patton (2001a) for the parametric case and Fermanian and

Scaillet (2003) for the nonparametric case.
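As an illustration of the log-density decomposition in Equation (3), the sketch below checks numerically that a standard bivariate normal log density equals the sum of two standard normal marginal log densities and a normal-copula log density. The normal copula is chosen only because its density has a simple closed form; the function names and the evaluation point are hypothetical and are not taken from the article.

```python
import math
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal marginal (assumed for illustration)


def normal_copula_logpdf(u: float, v: float, rho: float) -> float:
    """Log density of the bivariate normal copula at (u, v)."""
    a = STD_NORMAL.inv_cdf(u)
    b = STD_NORMAL.inv_cdf(v)
    one_minus_r2 = 1.0 - rho * rho
    quad = (rho * rho * (a * a + b * b) - 2.0 * rho * a * b) / (2.0 * one_minus_r2)
    return -0.5 * math.log(one_minus_r2) - quad


def bivariate_normal_logpdf(x: float, y: float, rho: float) -> float:
    """Log density of a standard bivariate normal with correlation rho."""
    one_minus_r2 = 1.0 - rho * rho
    quad = (x * x - 2.0 * rho * x * y + y * y) / (2.0 * one_minus_r2)
    return -math.log(2.0 * math.pi) - 0.5 * math.log(one_minus_r2) - quad


def joint_logpdf_via_copula(x: float, y: float, rho: float) -> float:
    """Equation (3): log h = log f + log g + log c, with u = F(x), v = G(y)."""
    log_f = math.log(STD_NORMAL.pdf(x))
    log_g = math.log(STD_NORMAL.pdf(y))
    u, v = STD_NORMAL.cdf(x), STD_NORMAL.cdf(y)
    return log_f + log_g + normal_copula_logpdf(u, v, rho)


# Both routes to the joint log density should agree up to rounding error.
gap = joint_logpdf_via_copula(0.3, -1.2, 0.7) - bivariate_normal_logpdf(0.3, -1.2, 0.7)
```

Replacing the two marginal terms or the copula term with another model changes only the corresponding summand, which is the flexibility the copula representation provides.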

2 THE INVESTOR’S OPTIMIZATION PROBLEM

The utility functions we assume for our hypothetical investors are from the class of

CRRA utility functions:

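$$U(\gamma) = \begin{cases} (1 - \gamma)^{-1} \cdot (P_0 \cdot (1 + \omega_x X_t + \omega_y Y_t))^{1-\gamma} & \text{if } \gamma \neq 1 \\ \log(P_0 \cdot (1 + \omega_x X_t + \omega_y Y_t)) & \text{if } \gamma = 1, \end{cases} \tag{4}$$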

where $P_0$ is the initial wealth, $X_t$ and $Y_t$ represent the continuously compounded excess return (over the risk-free rate) on the small cap and large cap indices, respectively, and $\omega_i$ is the proportion of wealth in asset $i$. The degree of relative risk aversion (RRA) is denoted by $\gamma$. For this utility function, the initial wealth does not affect the choice of optimal weight and so we set $P_0 = 1$. We consider five different levels of relative risk aversion: $\gamma = 1, 3, 7, 10,$ and $20$. A similar range

of risk aversion levels was also considered in Campbell and Viceira (1999) and

Aït-Sahalia and Brandt (2001). While there exist other utility functions that place

higher weight on tail events or asymmetries in the distribution of payoffs, we

focus on the CRRA utility because of its prominence in the finance literature. If

gains are found using the CRRA utility function then they may be thought of as a

conservative estimate of the possible gains using other, more sensitive, utility

functions.

The setup of the investor’s problem is as follows. Let the excess returns on the

two risky assets under consideration be denoted $X_t$ and $Y_t$, with some joint distribution, $H_t$, with associated marginal distributions, $F_t$ and $G_t$, and copula, $C_t$. We will develop density forecasts of this joint distribution, $\hat{F}_{t+1}$, $\hat{G}_{t+1}$, and the conditional copula, $\hat{C}_{t+1}$, and use them to compute the optimal weights, $\omega^*_{t+1} \equiv [\omega^*_{x,t+1}, \omega^*_{y,t+1}]$, for the portfolio. The optimal weights are found by maximizing the expected utility of the end-of-period wealth under the estimated probability density:

$$\omega^*_{t+1} \equiv \arg\max_{\omega \in \mathcal{W}} E_{\hat{H}_{t+1}}[U(1 + \omega_x X_{t+1} + \omega_y Y_{t+1})] = \arg\max_{\omega \in \mathcal{W}} \iint U(1 + \omega_x x + \omega_y y) \cdot \hat{f}_{t+1}(x) \cdot \hat{g}_{t+1}(y) \cdot \hat{c}_{t+1}(\hat{F}_{t+1}(x), \hat{G}_{t+1}(y)) \cdot dx \cdot dy, \tag{5}$$

where $\mathcal{W}$ is some compact subset of $\mathbb{R}^2$ for the unconstrained investor and $\mathcal{W} = \{(\omega_x, \omega_y) \in [0, 1]^2 : \omega_x + \omega_y \le 1\}$ for the short sales-constrained investor.

The investor is assumed to estimate the model of the conditional distribution of excess returns using maximum likelihood and then optimize the portfolio’s

weight using the predicted conditional distribution of returns. Work from the

forecasting and estimation literature suggests that the parameter estimation

stage and the forecast evaluation stage should both use the same objective function


[see Granger (1969), Weiss (1996), and Skouras (2001)]. We use maximum-likelihood

estimation for computational tractability.

The double-integral defining the expected utility of wealth does not have a

closed-form solution for our case. We use 10,000 Monte Carlo replications to

estimate the value of this integral, which must be done for each point in the out-

of-sample period. The objective function was found to be well behaved (smooth

and having a unique global optimum) for our choices of utility functions and density models and so we employed the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm to locate the optimum, $\omega^*_{t+1}$, at each point in time. Further details

on this procedure may be found in Appendix A.
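The Monte Carlo step can be sketched as follows for the short sales-constrained case. This is a minimal illustration under several assumptions that are not the article's: the density forecast is replaced by a bivariate normal with hypothetical monthly parameters, the BFGS search is replaced by a coarse grid search over the constraint set, and wealth is simply floored at a small positive number rather than treated with the logistic tail transformation described in the article.

```python
import math
import random


def crra_utility(wealth: float, gamma: float) -> float:
    """CRRA utility from Equation (4) with initial wealth P0 = 1."""
    if gamma == 1.0:
        return math.log(wealth)
    return wealth ** (1.0 - gamma) / (1.0 - gamma)


def simulate_returns(n, mu, sigma, rho, seed=0):
    """Draw n pairs of excess returns from a bivariate normal (illustrative stand-in)."""
    rng = random.Random(seed)
    draws = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        x = mu[0] + sigma[0] * z1
        y = mu[1] + sigma[1] * (rho * z1 + math.sqrt(1.0 - rho * rho) * z2)
        draws.append((x, y))
    return draws


def expected_utility(weights, draws, gamma, floor=1e-6):
    """Monte Carlo estimate of E[U(1 + w_x X + w_y Y)] over the simulated draws."""
    wx, wy = weights
    total = 0.0
    for x, y in draws:
        # Crude guard against nonpositive wealth (an assumption of this sketch).
        wealth = max(1.0 + wx * x + wy * y, floor)
        total += crra_utility(wealth, gamma)
    return total / len(draws)


def best_constrained_weights(draws, gamma, step=0.1):
    """Grid search over {(w_x, w_y) in [0, 1]^2 : w_x + w_y <= 1}."""
    n = round(1.0 / step)
    best, best_value = (0.0, 0.0), -math.inf
    for i in range(n + 1):
        for j in range(n + 1 - i):
            candidate = (i / n, j / n)
            value = expected_utility(candidate, draws, gamma)
            if value > best_value:
                best, best_value = candidate, value
    return best
```

Using the same set of draws for every candidate weight keeps the simulated objective smooth in the weights, which is what makes a gradient-based optimizer such as BFGS practical in the unconstrained case.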

One concern that may arise in this design is the existence of $E_{\hat{H}_{t+1}}[U(1 + \omega_x X_{t+1} + \omega_y Y_{t+1})]$ for certain density models. Given the CRRA utility, any density model that assigns positive probability to the case of bankruptcy would preclude the existence of $E_{\hat{H}_{t+1}}[U]$. All of the above specifications will assign some (extremely small) positive probability to bankruptcy. We deal with this by modifying the left tail of the distribution: we apply a logistic transformation to the lower tail of the portfolio return distribution so that all probability mass assigned to the region $(-\infty, \varepsilon)$ is relocated to the region $(0, \varepsilon)$, where $\varepsilon$ is some extremely small positive number.

3 A PORTFOLIO OF SMALL CAP AND LARGE CAP STOCKS

In this section we consider an investor with constant relative risk aversion facing

the problem of allocating wealth between two assets: a portfolio of low market capitalization stocks (“small caps”) and a portfolio of high market capitalization stocks (“large caps”). These two assets were chosen as being representative of the general problem of balancing a portfolio comprised of a high risk--high return asset and a lower risk--lower return asset.

3.1 Description of the Data

We use monthly data from the CRSP on the top 10% and bottom 10% of stocks sorted by market capitalization to form the “large cap” and “small cap” indices, from January 1954 to December 1999, yielding 552 observations. These

data were also analyzed in a different context by Perez-Quiros and Timmermann

(2001). We reserve the last 120 observations, from January 1990 to December 1999,

for the out-of-sample evaluation of the models. Descriptive statistics on the two

portfolios are presented in Table 1.

The small cap index generally exhibited slightly positive skewness, while the large cap index exhibited negative skewness. Both indices exhibited excess kurtosis. The Jarque-Bera statistic indicates that neither series is unconditionally

normal, and the unconditional correlation coefficient indicates a high degree of

linear dependence, as expected. Table 1 also reveals that the small cap index had a

higher mean and higher volatility than the large cap index over the total sample


and the in-sample period, but not over the out-of-sample period. During the 1990s,

as is well known, large cap stocks performed better than their historical average.

The change in ex post average returns and standard deviations for the small and

large cap indices between the in-sample and out-of-sample period suggests that

allowing for structural breaks in the returns-generating process may improve

portfolio decisions. One promising method of doing so is the reversed ordered Cusum (ROC) procedure of Pesaran and Timmermann (2002). Due to computational constraints, we are forced to ignore the possibility of structural breaks.
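The full-sample Jarque-Bera statistics in Table 1 can be approximately reproduced from the reported skewness and kurtosis. The sketch below assumes the textbook form of the statistic, n/6 · (S² + (K − 3)²/4), with K the raw (not excess) kurtosis and n = 552, and the asymptotic chi-square(2) p-value; the article does not state the formula, so this is a consistency check rather than the author's computation.

```python
import math


def jarque_bera(n_obs: int, skewness: float, kurtosis: float) -> float:
    """Textbook Jarque-Bera statistic: n/6 * (S^2 + (K - 3)^2 / 4)."""
    return n_obs / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)


def jarque_bera_pvalue(stat: float) -> float:
    """Asymptotic p-value: the chi-square(2) survival function is exp(-stat / 2)."""
    return math.exp(-stat / 2.0)


# Full-sample skewness and kurtosis from Table 1 (552 monthly observations).
jb_small = jarque_bera(552, 0.0558, 7.5647)
jb_large = jarque_bera(552, -0.3795, 4.9088)
p_small = jarque_bera_pvalue(jb_small)
p_large = jarque_bera_pvalue(jb_large)
```

Both values agree with the full-sample entries of Table 1 up to the rounding of the reported skewness and kurtosis, and both p-values round to 0.0000. Only the full-sample columns are checked here.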

We use three further variables as explanatory variables in our analysis. The first is the one-month Treasury bill rate, denoted $R^f_t$, which is taken as the risk-free rate. This variable has been used by Fama (1981) and others as a proxy for shocks to expected growth in the real economy. The second variable is the difference between the yield on corporate bonds with Moody’s rating Baa versus those with an Aaa rating, denoted $SPR_t$, which is called the “default spread.” This variable tracks the cyclical variation in the risk premium on stocks; see Perez-Quiros and Timmermann (2001). Finally, we look at the dividend yield, denoted $DIV_t$, which

is measured as the total dividends paid over the previous 12 months divided by

the stock price at the end of the month. This variable acts as a proxy for time-

varying expected returns. For a comprehensive review of the variables that have

been used in previous studies as predictive variables for stock returns, see Aït-Sahalia and Brandt (2001).

To examine the presence of asymmetric dependence between these two assets we use measures presented in Longin and Solnik (2001) and Ang and Chen (2002)

Table 1 Descriptive statistics.

                  Full sample              In-sample             Out-of-sample
              Small caps  Large caps  Small caps  Large caps  Small caps  Large caps
Mean*             9.9549      7.9748      9.7076      6.4315     10.8452     13.5306
Std Dev*         21.2932     14.2888     22.0482     14.4729     18.4016     13.5423
Skewness          0.0558     -0.3795      0.1653     -0.3173     -0.6128     -0.6124
5% VaR            8.7973      6.2306      8.9204      6.3922      6.6928      4.9430
1% VaR           18.9576      9.6657     19.0624      9.7958     16.2402     10.9989
Kurtosis          7.5647      4.9088      7.7197      4.9871      5.2857      4.6428
Min             -29.3153    -20.8934    -29.3153    -20.8934    -21.9607    -14.9491
Max              38.3804     16.8145     38.3804     16.8145     13.9502     11.3488
Jarque-Bera     479.5162     97.0484    401.0592     77.9577     33.0707     20.6444
p-val             0.0000      0.0000      0.0000      0.0000      0.0000      0.0000
Correlation           0.7210                  0.7392                  0.6457

The statistics marked with an asterisk were annualized to ease interpretation. Jarque-Bera refers to the test for normality of the unconditional distribution of returns. The full sample runs from January 1954 to December 1999, the in-sample period from January 1954 to December 1989, and the out-of-sample period from January 1990 to December 1999.


called “exceedence correlations,” $\tilde{\rho}(q)$. We will not use this measure in the asset

allocation problem, but we have found it to be a useful, intuitive way of taking a

preliminary look at our data.

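$$\tilde{\rho}(q) \equiv \begin{cases} \mathrm{corr}[X, Y \mid X \le Q_x(q) \cap Y \le Q_y(q)], & \text{for } q \le 0.5 \\ \mathrm{corr}[X, Y \mid X > Q_x(q) \cap Y > Q_y(q)], & \text{for } q \ge 0.5, \end{cases}$$

where $Q_x(q)$ and $Q_y(q)$ are the $q$th quantiles of $X$ and $Y$, respectively. As Longin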

and Solnik (2001) and Ang and Chen (2002) point out, the shape of the exceedence

correlation plot (as a function of q) depends on the underlying distribution of the

data. Even for the standard bivariate normal distribution, $\tilde{\rho}(q)$ is nonlinear in $q$. In

Figure 1 we plot the empirical exceedence correlations based on the (raw) excess

returns on the two indices, along with what would be obtained if they had the

bivariate normal distribution. In Figure 2 we plot the empirical exceedence correlations based on the transformed standardized residuals of the models for the two indices, along with what would be obtained if these assets had the normal copula and the “rotated Gumbel” copula, which is described below. Figure 1 shows the

degree of asymmetry in the unconditional distribution of the returns on these two

assets; Figure 2 shows the degree of asymmetry in the unconditional copula of

these two assets, having removed all marginal distribution asymmetry. Clearly

both the unconditional joint distribution and the unconditional copula exhibit

substantial asymmetry. This suggests that the assumption of normality, which

implies a symmetric dependence structure, is inappropriate for these assets. Whether capturing this asymmetric dependence leads to substantially better

portfolio decisions is the focus of Section 3.3.
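A minimal sketch of how empirical exceedence correlations of this kind might be computed is given below. The order-statistic quantile rule, the treatment of q = 0.5 as a lower-tail cutoff, and the function names are assumptions of this illustration, not details taken from the article.

```python
import math


def empirical_quantile(values, q):
    """Simple order-statistic quantile (one of several common conventions)."""
    ordered = sorted(values)
    index = min(len(ordered) - 1, max(0, math.ceil(q * len(ordered)) - 1))
    return ordered[index]


def pearson(xs, ys):
    """Sample Pearson correlation; returns nan when it is undefined."""
    n = len(xs)
    if n < 2:
        return math.nan
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0.0 or syy == 0.0:
        return math.nan
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)


def exceedence_correlation(xs, ys, q):
    """Correlation of (X, Y) given both lie below (q <= 0.5) or above (q > 0.5) their q-quantiles."""
    qx, qy = empirical_quantile(xs, q), empirical_quantile(ys, q)
    if q <= 0.5:
        pairs = [(x, y) for x, y in zip(xs, ys) if x <= qx and y <= qy]
    else:
        pairs = [(x, y) for x, y in zip(xs, ys) if x > qx and y > qy]
    return pearson([p[0] for p in pairs], [p[1] for p in pairs])
```

Evaluating the function on a grid of q values between 0 and 1 gives the kind of curve plotted in Figures 1 and 2; a lower-tail segment that sits above the upper-tail segment indicates stronger dependence in downturns.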

Figure 1 Exceedence correlations between excess returns (X and Y) on small caps and large caps. The horizontal axis shows the cutoff quantile, and the vertical axis shows the correlation between the two returns given that both exceed that quantile.


3.2 Analysis of the Different Models

We consider a number of different investment strategies. In this section we

describe the models used to obtain the density forecasts on which some of the

strategies are based.

The first three strategies we consider are simply buy-and-hold strategies

(all small caps, all large caps, or an even mix of both). The fourth strategy is one

based solely on the unconditional distribution of returns. For this portfolio

we assume that the investor optimizes his/her portfolio weights for the period

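$t + 1$ using the empirical unconditional distribution of returns observed up until time $t$:

$$\omega^*_{\mathrm{uncond},t+1} \equiv \arg\max_{\omega \in \mathcal{W}} \hat{E}_t[U(1 + \omega_x X_{t+1} + \omega_y Y_{t+1})] = \arg\max_{\omega \in \mathcal{W}} t^{-1} \sum_{j=1}^{t} U(1 + \omega_x x_j + \omega_y y_j). \tag{6}$$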

This portfolio is based on the assumption that the joint distribution of these two

assets is i.i.d. throughout the sample. A comparison of the performance of this portfolio with those constructed using parametric conditional distribution models may then be interpreted as a measure of the benefits to modeling the conditional distribution of these stock returns.

Figure 2 Exceedence correlations between transformed residuals (U and V) of small caps and large caps. The horizontal axis shows the cutoff quantile, and the vertical axis shows the correlation between the two residuals given that both exceed that quantile.

3.2.1 Marginal distribution models. The benchmark model for our study is the bivariate normal distribution, which is compared with a model constructed using copula theory. Both models have the same forms for the conditional means, $\mu^x_t$ and $\mu^y_t$, and variances, $h^x_t$ and $h^y_t$. In their article on the value of volatility timing for

asset allocation decisions, Fleming, Kirby, and Ostdiek (2001) assumed a constant

conditional mean rather than using some model for expected returns. In their

framework this was shown to lead to a conservative estimate of the value of

volatility timing. In our framework, however, a misspecified conditional mean

will lead to a misspecified skewness model and a misspecified dependence model, and unlike Fleming, Kirby, and Ostdiek, we cannot be sure what impact this will

have on the results; whether it will exaggerate or dampen the differences between

the models under analysis. For this reason, we cannot escape the building of a

model for expected returns.

The conditional means were set to be linear functions of up to 12 lags of the

two asset returns, the risk-free rate, the default spread, and the dividend yield. For

the conditional variance we employed a TARCH(1, 1) specification and allowed

the three lagged exogenous regressors to enter into the conditional variance specification in levels and squares. We used likelihood ratio tests to determine

the best-fitting model over the in-sample period. The selected models for the mean

and variance equations are given below. The full sequence of parameter estimates

for each point in the out-of-sample period is available from the author upon

request.

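$$X_t = \beta_0 + \beta_1 X_{t-1} + \beta_2 R^f_{t-1} + \beta_3 SPR_{t-1} + \beta_4 DIV_{t-1} + \sqrt{h^x_t}\,\varepsilon_t \tag{7}$$

$$h^x_t = \beta_5 + \beta_6 h^x_{t-1} + \beta_7 h^x_{t-1}\varepsilon^2_{t-1}\mathbf{1}\{\varepsilon_{t-1} > 0\} + \beta_8 h^x_{t-1}\varepsilon^2_{t-1}\mathbf{1}\{\varepsilon_{t-1} < 0\} + \beta_9 R^f_{t-1} + \beta_{10} SPR_{t-1}$$

$$Y_t = \gamma_0 + \gamma_1 R^f_{t-1} + \gamma_2 SPR_{t-1} + \gamma_3 DIV_{t-1} + \sqrt{h^y_t}\,\eta_t \tag{8}$$

$$h^y_t = \gamma_4 + \gamma_5 h^y_{t-1} + \gamma_6 h^y_{t-1}\varepsilon^2_{t-1}\mathbf{1}\{\varepsilon_{t-1} > 0\} + \gamma_7 h^y_{t-1}\varepsilon^2_{t-1}\mathbf{1}\{\varepsilon_{t-1} < 0\} + \gamma_8 R^f_{t-1}$$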

Although the models are recursively reestimated throughout the out-of-

sample period, they are ‘‘nonadaptive,’’ in that the model specifications are

determined using the in-sample data and not updated in the out-of-sample

period.
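The asymmetric variance recursion for the small cap index can be sketched as below. The function iterates the conditional variance equation that accompanies Equation (7); the parameter values, the starting variance, and the function name are hypothetical and are not the article's estimates.

```python
def tarch_variance_path(std_resids, rf, spr, params, h0):
    """Iterate the conditional variance recursion that accompanies Equation (7).

    std_resids[t] is the standardized residual eps_t, rf[t] and spr[t] are the
    risk-free rate and default spread, params = (b5, b6, b7, b8, b9, b10), and
    h0 is an assumed starting variance. Returns [h_0, h_1, ..., h_T].
    """
    b5, b6, b7, b8, b9, b10 = params
    path = [h0]
    for eps, r, s in zip(std_resids, rf, spr):
        h_prev = path[-1]
        shock = h_prev * eps * eps          # squared unstandardized residual
        h_next = (b5 + b6 * h_prev
                  + b7 * shock * (eps > 0)  # response to positive shocks
                  + b8 * shock * (eps < 0)  # response to negative shocks
                  + b9 * r + b10 * s)
        path.append(h_next)
    return path
```

With b8 larger than b7, a negative shock raises next-period variance by more than a positive shock of the same size, which is the asymmetry the threshold specification is designed to capture.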

To determine the importance of skewness and asymmetric dependence for

asset allocation we specify distribution models that can capture these features. We found Hansen’s (1994) skewed Student’s t distribution to provide a good fit for

the marginal distributions of both assets. Jondeau and Rockinger (2003) present

some further results on this distribution. In addition to time-varying conditional

means and variances, the skewed t distribution can capture time-varying conditional skewness and kurtosis. Both skewness and kurtosis parameters, denoted $\lambda_t$ and $\nu_t$, were allowed to depend on lags of the exogenous variables and the forecast conditional means and variances. As suggested by Hansen (1994), we use transformations to ensure that the skewness and degrees-of-freedom parameters remained within $(-1, 1)$ and $(2, \infty]$, respectively, at all times by setting $\lambda_t = \Lambda(Z_{t-1}'\beta)$ and $\nu_t = 2.1 + (Z_{t-1}'\beta)^2$, where $Z_{t-1}'\beta$ is the linear function of the regressors and parameters for that variable and $\Lambda(x) = (1 - e^{-x})/(1 + e^{-x})$ is the modified logistic transformation, designed to keep $\lambda_t$ in $(-1, 1)$ at all times.
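The two parameter transformations can be illustrated with a short sketch. It relies on the identity (1 − e^{−x})/(1 + e^{−x}) = tanh(x/2), which avoids overflow in the exponential for large negative x; the function names are hypothetical.

```python
import math


def modified_logistic(x: float) -> float:
    """Modified logistic (1 - exp(-x)) / (1 + exp(-x)), computed as tanh(x / 2)."""
    return math.tanh(x / 2.0)


def skewness_parameter(index: float) -> float:
    """Skewness parameter: the modified logistic of the linear index Z'beta.

    Mathematically inside (-1, 1); in floating point it saturates at +/-1
    only for very large |index|.
    """
    return modified_logistic(index)


def dof_parameter(index: float) -> float:
    """Degrees-of-freedom parameter 2.1 + (Z'beta)^2, which is never below 2.1."""
    return 2.1 + index ** 2
```

Because both maps are smooth in the linear index, the regression coefficients can be estimated without imposing explicit bounds on the skewness or degrees-of-freedom parameters.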

For both assets we found significant in-sample time variation in these

moments, though some variables were dropped as their coefficients were not

significant. The total additional number of parameters in the skewed t distribution

over those in the normal distribution for the small caps (large caps) was 5 (4).

Using likelihood ratio tests, we could reject (with p-values of less than 0.01), for both assets, the assumptions of skewness being constant at zero and kurtosis being

constant at three, both jointly and separately for the in-sample period.4 The

improved in-sample goodness-of-fit of the skewed t distribution is traded off

against possible increased parameter estimation error in an out-of-sample setting.

In addition to testing the significance of each of the possible variables to

include in the model, we conducted goodness-of-fit tests (not reported) for the

final marginal density models. Such tests are critical when constructing multivariate densities using copulas, as a misspecified marginal density implies that any copula model will be misspecified. We used Kolmogorov-Smirnov (KS) tests

for the proposed density and Lagrange multiplier (LM) tests for serial dependence

in the probability integral transforms of the variables, as suggested in Diebold,

Gunther, and Tay (1998). We also employed the multinomial hit test described in

Patton (2001b). We found no evidence against the skewed t models and some

evidence against the normal models. To show the outputs of these models, in

Figure 3 we present the conditional mean, conditional variance, skewness parameter, and kurtosis parameter for the out-of-sample period, estimated at each point in time using only data available as of the previous period.
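As a rough illustration of the density goodness-of-fit idea, the sketch below computes the Kolmogorov-Smirnov distance between a set of probability integral transforms (PITs) and the Uniform(0, 1) distribution, which is the distribution the PITs should follow if the density model is correct. The simulated residuals, the deliberately wrong model, and the use of the asymptotic 5% critical value 1.358/sqrt(n) are assumptions made for the example; the LM test for serial dependence and the hit test are not covered.

```python
import math
import random


def ks_statistic_uniform(pits):
    """Kolmogorov-Smirnov distance between a sample and the Uniform(0, 1) CDF."""
    u = sorted(pits)
    n = len(u)
    d_plus = max((i + 1) / n - x for i, x in enumerate(u))
    d_minus = max(x - i / n for i, x in enumerate(u))
    return max(d_plus, d_minus)


def normal_cdf(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


random.seed(0)
n = 500
# Hypothetical data: standardized residuals that really are standard normal.
resid = [random.gauss(0.0, 1.0) for _ in range(n)]

pit_correct = [normal_cdf(z) for z in resid]      # correctly specified density
pit_wrong = [normal_cdf(z / 2.0) for z in resid]  # density with too much variance

crit_5pct = 1.358 / math.sqrt(n)  # asymptotic 5% critical value (assumption)
print("correct model:", ks_statistic_uniform(pit_correct))
print("wrong model:  ", ks_statistic_uniform(pit_wrong))
print("5% critical:  ", crit_5pct)
```

The asymptotic critical value ignores the effect of estimated parameters, so in an application it should be read as indicative only.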

3.2.2 Copula models For the bivariate normal model, all that remains to be

specified is a model for the conditional correlation. The conditional correlation

was set as a function of the lagged risk-free rate, default spread, dividend yield,

and the forecasts of the conditional means of the two variables. All of these

variables were found to be important in-sample. The bivariate normal model is

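$$
\left( \frac{X_t - \mu_t^x}{\sqrt{h_t^x}}, \; \frac{Y_t - \mu_t^y}{\sqrt{h_t^y}} \right) \sim N\left( 0, \begin{bmatrix} 1 & \rho_t \\ \rho_t & 1 \end{bmatrix} \right) \tag{9}
$$

$$
\rho_t = \Lambda(\alpha_0 + \alpha_1 Rf_{t-1} + \alpha_2 SPR_{t-1} + \alpha_3 DIV_{t-1} + \alpha_4 \mu_t^x + \alpha_5 \mu_t^y), \tag{10}
$$

where $\Lambda(x)$ is the modified logistic transformation.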

For the flexible distribution model, all that remains is to specify the form of the copula used to link the two skewed t marginal distributions. A total of nine different copulas were estimated on the transformed residuals from the skewed t models in the search for the best-fitting copula. The copulas considered were the normal, Student’s t, Clayton, rotated Clayton,5 Joe-Clayton, Plackett, Frank, Gumbel, and rotated Gumbel copulas; contour plots of a few of these copulas are provided in Figure 4 and the functional forms of these copulas are contained in Appendix B. This list includes almost all of the copulas considered in the various applications of copulas in statistics and economics,6 and is significantly more than we found in any single previous applied study.

[Figure 3: plots of one-step-ahead forecasts for small caps and big caps, January 1990 to December 1999, in four panels: conditional mean, conditional volatility, skewness parameter, and kurtosis parameter.]

Figure 3 Plots of the first four conditional moment parameters over the out-of-sample period.

Figure 4 Contour plots of various distributions all with standard normal marginal distributions and linear correlation coefficients of 0.5.

4 The fact that the restrictions on skewness and kurtosis were both rejected means that both higher moments may be important for asset allocation. We could attempt to disentangle the benefits of each by including an additional intermediate model, such as the usual Student’s t, which allows for kurtosis but not skewness. We do not do so here in the interest of parsimony.

5 Let (U, V) be distributed according to the copula C. Then (1 − U, 1 − V) will be said to be distributed according to the “rotated C” copula. The rotation allows us to take a copula that allows for greater dependence in the first (third) quadrant and create one that allows for greater dependence in the third (first) quadrant.

The plots in Figure 4 show the isoprobability contours of bivariate densities

with Normal(0,1) margins and linear correlation coefficient of 0.5. We fixed the

marginals and the correlation coefficient in this figure so that the differences in the

densities could be more clearly identified and attributed to the different copulas. The copula in the upper left panel is the normal copula, making the joint density a
bivariate standard normal and giving us the familiar elliptical contours. Immediately below the normal density is the joint density formed using Clayton’s copula.
We can see that this density’s contours are more tightly clustered around the
diagonal in the third (“negative-negative”) quadrant than in the first quadrant,

indicating stronger dependence between negative observations than between

positive observations. This is qualitatively the type of dependence suggested by

the exceedence correlation plots in Figures 1 and 2. Of these six, the normal, Student’s t, and Plackett all generate symmetric dependence, whereas the

Gumbel, Clayton, and Joe-Clayton all generate asymmetric dependence. For

further details on the properties of these copulas see Joe (1997), Nelsen (1999),

and Patton (2001b).
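To make the asymmetry concrete, the following Python sketch evaluates the Gumbel copula and its rotated version, using the rotation defined in the text, in which (1 − U, 1 − V) replaces (U, V). It relies on the standard Gumbel copula distribution function and on the identity P(1 − U ≤ u, 1 − V ≤ v) = u + v − 1 + C(1 − u, 1 − v); the parameter value 2 and the 5% threshold are arbitrary choices for illustration.

```python
import math


def gumbel_cdf(u, v, delta):
    """Gumbel copula CDF for u, v in (0, 1); delta = 1 gives independence."""
    s = (-math.log(u)) ** delta + (-math.log(v)) ** delta
    return math.exp(-s ** (1.0 / delta))


def rotated_gumbel_cdf(u, v, delta):
    """CDF of (1 - U, 1 - V) when (U, V) follows the Gumbel copula."""
    return u + v - 1.0 + gumbel_cdf(1.0 - u, 1.0 - v, delta)


def lower_tail_ratio(cdf, q, delta):
    """P(U <= q, V <= q) / q: probability of a joint crash given one crash."""
    return cdf(q, q, delta) / q


delta, q = 2.0, 0.05  # illustrative values only
print("Gumbel:        ", round(lower_tail_ratio(gumbel_cdf, q, delta), 4))
print("rotated Gumbel:", round(lower_tail_ratio(rotated_gumbel_cdf, q, delta), 4))
```

With these values the rotated Gumbel copula assigns roughly twice as much probability to a joint lower-tail event as the unrotated Gumbel copula does, which is the kind of stronger dependence in downturns discussed above.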

As in the bivariate normal distribution, we estimated these copulas with

conditional dependence modeled as a function of the lagged risk-free rate, default

spread, and dividend yield, and the forecasts of the conditional means of the two

variables [see Equation (12) below]:

$$
\left( \frac{X_t - \mu_t^x}{\sqrt{h_t^x}}, \; \frac{Y_t - \mu_t^y}{\sqrt{h_t^y}} \right) \sim C\left( \text{Skewed } t(\lambda_t^x, \nu_t^x), \; \text{Skewed } t(\lambda_t^y, \nu_t^y); \; \delta_t \right) \tag{11}
$$

$$
\delta_t = G(\beta_0 + \beta_1 Rf_{t-1} + \beta_2 SPR_{t-1} + \beta_3 DIV_{t-1} + \beta_4 \mu_t^x + \beta_5 \mu_t^y), \tag{12}
$$

where $G(x)$ is a function designed to keep $\delta_t$ in the feasible region for the copula C at all times, and C is one of the nine copulas discussed above.

The maximum log-likelihood and information criteria values for each of the

copulas considered are presented in Table 2, and we can see that the rotated

Gumbel copula attained the greatest log-likelihood value and the lowest value

of both information criteria. We will thus use the rotated Gumbel copula in the flexible distribution specification, which we will call the “Gumbel” model for

simplicity.
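The selection rule can be sketched in Python using the log-likelihoods and parameter counts reported in Table 2. The exact formulas behind the reported criteria are not stated in this section. The sketch assumes AIC = −2L + 2k/T and BIC = −2L + k ln(T)/T with T = 420; both the penalty scaling and the value of T are inferences that happen to reproduce the tabulated values to four decimals, not something the text confirms.

```python
import math

# (copula log-likelihood at the optimum, number of parameters), from Table 2.
TABLE2 = {
    "Normal": (153.5681, 6),
    "Student's t": (158.1329, 7),
    "Plackett": (163.1763, 6),
    "Frank": (158.2502, 6),
    "Clayton": (151.3272, 6),
    "Rotated Clayton": (90.8669, 6),
    "Joe-Clayton": (158.8478, 12),
    "Gumbel": (127.8091, 6),
    "Rotated Gumbel": (166.6628, 6),
}

T = 420  # ASSUMPTION: in-sample size inferred from the table, not stated here


def aic(loglik, k, t=T):
    """Akaike criterion with the penalty scaled by the sample size (assumed form)."""
    return -2.0 * loglik + 2.0 * k / t


def bic(loglik, k, t=T):
    """Schwarz criterion with the penalty scaled by the sample size (assumed form)."""
    return -2.0 * loglik + k * math.log(t) / t


ranked = sorted(TABLE2, key=lambda name: aic(*TABLE2[name]))
for name in ranked:
    loglik, k = TABLE2[name]
    print(f"{name:16s} AIC = {aic(loglik, k):10.4f}  BIC = {bic(loglik, k):10.4f}")
```

Under either criterion the rotated Gumbel copula ranks first, and the ranking is the same as the ranking by log-likelihood because the penalty terms are small relative to the differences in fit.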

6 One copula that was consciously omitted from this list is the Farlie-Gumbel-Morgenstern copula. This copula was excluded due to the limited amount of dependence it is able to consider: rank correlation under this copula is bounded in absolute value by one-third [see Joe (1997, p. 35)].

We specify one final alternative model, called the “NormCop” model, which uses the skewed t marginal distributions along with a normal copula. This specification is included to determine where the benefits, if any, from flexible density modeling lie: in the marginal distribution specifications or in the copula specification. The values of the log-likelihoods at the optimum for the three joint distributions (normal, NormCop, and Gumbel) are −2391.04, −2355.38, and −2342.28, so in terms of in-sample goodness-of-fit we can see that the Gumbel model provides the best fit, and that about 73% of the gains come from the flexible marginal distribution models, though in an out-of-sample setting this ranking and decomposition of gains need not hold.
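The 73% figure can be checked directly from the three log-likelihoods quoted above, taking the gain from the marginal models to be the improvement of the NormCop model over the normal model, relative to the total improvement of the Gumbel model over the normal model:

```latex
\frac{\mathcal{L}_{\text{NormCop}} - \mathcal{L}_{\text{Normal}}}
     {\mathcal{L}_{\text{Gumbel}} - \mathcal{L}_{\text{Normal}}}
= \frac{-2355.38 - (-2391.04)}{-2342.28 - (-2391.04)}
= \frac{35.66}{48.76}
\approx 0.73
```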

We again use likelihood ratio tests to determine if any of the five regressors for

the conditional copula parameter can be dropped. For the bivariate normal dis-

tribution and the NormCop models, all five were significant at the 10% level,

while for the rotated Gumbel copula the risk-free rate and the spread were not

significant and so were removed from the model, reducing the number of parameters for this copula from six to four.

We conducted some specification tests (not reported) on the normal and

rotated Gumbel copulas over the in-sample period to determine their goodness-

of-fit, employing the multinomial hit test described in Patton (2001b). We found

that the normal copula estimated using residuals from the normal marginal

distribution models could be rejected, which is unsurprising since the marginals

of that model were also rejected. Neither the normal copula nor the rotated

Gumbel copula estimated on residuals from the skewed t marginal distribution

models could be rejected at the 5% level.

In Figure 5 we present the conditional correlation parameter from the bivariate normal model and the implied conditional correlation from the skewed t-rotated Gumbel copula model. We use correlation as the measure of dependence here for comparability across models. While all three conditional correlation estimates generally moved in the same direction, their levels are quite different at times.

Table 2 Results from the copula specification search.

Model                     LC   Number of parameters         AIC         BIC

Symmetric copulas
Normal              153.5681                      6   −307.1076   −307.0499
Student’s t         158.1329                      7   −316.2325   −316.1651
Plackett            163.1763                      6   −326.3240   −326.2663
Frank               158.2502                      6   −316.4718   −316.4141

Asymmetric copulas
Clayton             151.3272                      6   −302.6258   −302.5681
Rotated Clayton      90.8669                      6   −181.7052   −181.6475
Joe-Clayton         158.8478                     12   −317.6385   −317.5230
Gumbel              127.8091                      6   −255.5896   −255.5319
Rotated Gumbel      166.6628                      6   −333.2970   −333.2393

Presented here are the nine copula specifications tried for the copula distribution model. The copula likelihood at the optimum is denoted LC. Also presented are the numbers of parameters estimated in the models and the values of the Akaike and Schwarz’s Bayesian information criteria at the optima.

3.3 Performance of the Different Strategies

We now analyze the performance of the different asset allocation decisions made

using the various models. We consider five levels of relative risk aversion (RRA = 1, 3, 7, 10, and 20) and 11 strategies. The 11 strategies are (1) always hold the small
cap index; (2) always hold the large cap index; (3) always hold a 50:50 mix of the
two indices; (4) optimize the portfolio weight using the unconditional empirical distribution of returns; (5) find the optimal portfolio weight for each period using

the bivariate normal model; (6) find the optimal portfolio weight for each period

using the NormCop model; (7) find the optimal portfolio weight for each period

using the Gumbel model. Strategies 8 to 11 are the same as strategies 4 to 7, subject to a

short-sales constraint.

3.3.1 Summary statistics In Table 3 we present six summary statistics on the optimal portfolio return series based on the different models. In addition to the

Figure 5 Conditional correlation implied by the three models (Gumbel, NormCop, and normal) over the out-of-sample period (January 1990 to December 1999). The straight dashed line is the unconditional correlation over this period (0.6457).


Table 3 Realized portfolio return summary statistics.

                                                             Unconstrained                        Short sales constrained
              Small caps Large caps  50:50 mix     Uncond     Normal    NormCop     Gumbel     Uncond     Normal    NormCop     Gumbel

RRA = 1
Mean              0.9038     1.1275     1.0157     2.6768     8.2724     5.2659     6.6488     0.8925     1.2659     1.2630     1.2555
Std Dev           5.3121     3.9093     4.1928    10.3586    45.1292    30.4298    31.4865     5.1452     3.8598     3.8636     3.8626
Sharpe ratio      0.1701     0.2884     0.2422     0.2684     0.1833     0.1731     0.2112     0.1735     0.3280     0.3269     0.3250
Skewness         −0.6128    −0.6124    −1.0307    −1.0854    −0.3708     0.0889     0.4814    −0.6608    −0.2738    −0.2702    −0.2655
5% VaR            6.9628      4.943     5.3338    12.9446    58.0278    47.8849    44.6935     6.6903     4.6733     4.6733     4.6733
5% ES            11.5049     7.9529     9.5332    22.5199   104.5118    64.3796    62.1585    11.3452     7.1708     7.1708     7.1708

RRA = 3
Mean              0.9038     1.1275     1.0157     0.8848     4.1397     0.7284     2.1974     0.8935     1.2075     1.0293     1.2074
Std Dev           5.3121     3.9093     4.1928     3.5501     24.823    10.9912    10.4894     3.5243     3.8060     3.6672     3.6156
Sharpe ratio      0.1701     0.2884     0.2422     0.2492     0.1668     0.0663     0.2095     0.2535     0.3171     0.2807     0.3340
Skewness         −0.6128    −0.6124    −1.0307    −1.2768    −0.2671    −0.0164     0.0492    −1.1645    −0.2455    −0.1251    −0.1861
5% VaR            6.9628      4.943     5.3338     4.4658    33.1535    16.2164     12.731     4.6009     4.6733     4.3646     3.8212
5% ES            11.5049     7.9529     9.5332     8.0024    57.1473    24.7803    20.4849     7.8806     7.0491     6.9646     6.6575

RRA = 7
Mean              0.9038     1.1275     1.0157     0.4119      1.994     0.1168     1.4336     0.4119     0.9904     0.6961     0.8874
Std Dev           5.3121     3.9093     4.1928     1.6026    11.5989     6.4371     6.6022     1.6026     3.4672     2.9854     2.9373
Sharpe ratio      0.1701     0.2884     0.2422     0.2570     0.1719     0.0181     0.2171     0.2570     0.2856     0.2332     0.3021
5% VaR            6.9628      4.943     5.3338     2.0264    14.3828    10.8161     9.2782     2.0264     4.2064     3.4382     3.2623
5% ES            11.5049     7.9529     9.5332     3.4964    27.8403    17.6022    13.7038     3.4964     6.6776     5.9938     5.7425

RRA = 10
Mean              0.9038     1.1275     1.0157      0.289     1.4064     0.0777     1.1397     0.2890     0.8236     0.5530     0.7043
Std Dev           5.3121     3.9093     4.1928     1.1251     8.1985     5.0844      4.584     1.1251     3.2152     2.3789     2.3596
Sharpe ratio      0.1701     0.2884     0.2422     0.2569     0.1715     0.0153     0.2486     0.2569     0.2561     0.2325     0.2985
5% VaR            6.9628      4.943     5.3338     1.4247    10.1103      8.014     6.3274     1.4247     3.9514     2.7253     2.4289
5% ES            11.5049     7.9529     9.5332     2.4449    19.6923    15.1624     8.7192     2.4449     6.4419     4.7754     4.5365

RRA = 20
Mean              0.9038     1.1275     1.0157     0.1455      0.706     0.0117     0.6583     0.1455     0.4832     0.2928     0.3838
Std Dev           5.3121     3.9093     4.1928     0.5641     4.1239     2.8471     2.7152     0.5641     2.0592     1.2541     1.2382
Sharpe ratio      0.1701     0.2884     0.2422     0.2579     0.1712     0.0041     0.2424     0.2580     0.2346     0.2335     0.3100
5% VaR            6.9628      4.943     5.3338     0.7146     5.0671     4.7973     3.3004     0.7146     2.4188     1.5186     1.2693
5% ES            11.5049     7.9529     9.5332     1.2247     9.9177     8.3836     5.1940     1.2242     6.6262     4.3324     4.2081

RRA refers to the coefficient of relative risk aversion.

The first two columns of data report the results on the small cap and large cap indices, the third column reports the results for a constant evenly weighted portfolio, the fourth is based on a weight that is optimized using the empirical unconditional distribution of returns, the fifth is based on the normal distribution model, the sixth on the skewed t-normal copula model, and the seventh is based on the skewed t-rotated Gumbel copula model. The rows present the sample mean, sample standard deviation, Sharpe ratio (mean/standard deviation), sample 5% VaR (fifth percentile) and sample 5% expected shortfall (mean of returns that exceed the 5% VaR).

148 Journal of Financial Econometrics

usual summary statistics we also present two alternative measures of risk, the 5%

value-at-risk (VaR) and the 5% expected shortfall (ES). The 5% VaR is defined as

the negative of the fifth empirical percentile of the realized returns, that is, $\widehat{VaR}(X; 0.05) \equiv -\hat{F}_n^{-1}(0.05)$, where $\hat{F}_n$ is the empirical distribution of returns on portfolio X using the n out-of-sample observations. While VaR has some advantages over traditional measures of risk, it has received criticism for not being a “coherent” measure of risk [see Artzner et al. (1999)]. An alternative to VaR that has gained some attention recently is the “expected shortfall” of a portfolio. The 5% expected shortfall is defined as the negative of the average return on a portfolio given that the loss has exceeded its 5% VaR, that is, $\widehat{ES}(X; 0.05) \equiv -\hat{E}_n[X \mid X \le -\widehat{VaR}(X; 0.05)]$, where $\hat{E}_n$ is the sample average.
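A minimal Python sketch of these two sample risk measures follows. The quantile convention (the smallest observation at which the empirical distribution function reaches the target probability) is an assumption, since the text does not say how the empirical percentile is interpolated, and the return series is made up for the example.

```python
import math


def empirical_quantile(returns, p):
    """Smallest observation x with empirical CDF at x of at least p (assumed convention)."""
    ordered = sorted(returns)
    k = max(math.ceil(p * len(ordered)), 1)
    return ordered[k - 1]


def value_at_risk(returns, p=0.05):
    """Negative of the p-th empirical quantile of the returns."""
    return -empirical_quantile(returns, p)


def expected_shortfall(returns, p=0.05):
    """Negative of the average return among returns at or below minus the VaR."""
    threshold = -value_at_risk(returns, p)
    tail = [r for r in returns if r <= threshold]
    return -sum(tail) / len(tail)


# Hypothetical monthly returns in percent: two bad months and 28 ordinary ones.
sample = [-8.0, -6.0] + [1.0] * 28
print(value_at_risk(sample), expected_shortfall(sample))
```

Both measures are reported as positive numbers when the tail outcomes are losses, matching the sign convention of Table 3, and the expected shortfall is never smaller than the VaR.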

A striking feature of the summary statistics is the much greater mean and

standard deviation of the portfolio returns based on the distribution models

(Normal, NormCop, and Gumbel) than the portfolios with constant weights for

all but the most risk-averse investor. We ignore parameter estimation uncertainty, and so the query may be raised as to whether the investors would so aggressively

invest if they knew that they were using parameter estimates rather than the true

parameters. Kandel and Stambaugh (1996) and Brandt (1999) both find that even

when parameter estimation uncertainty is accounted for, a CRRA investor aggres-

sively seeks the best portfolio. The results for the short sales-constrained investors

reveal a much smaller difference in mean and risk between the distribution

portfolios and the constant weight portfolios.

Also note the skewness coefficients: the Normal and NormCop portfolios generally exhibited negative skewness, while the unconstrained Gumbel portfolio

actually displayed positive skewness, suggesting that modeling both skewness

and asymmetric dependence enables the investor to better avoid negatively

skewed portfolio returns.

3.3.2 Performance statistics The performance measure we consider is the

amount in basis points per year that the investor would pay to switch from the “50:50 mix” portfolio to another portfolio. One interpretation of this amount is as the “management fee” that could be deducted from the monthly return on portfolio i over the out-of-sample period and leave the investor indifferent between the 50:50 portfolio and portfolio i. For example, an investor with risk aversion 1 would be willing to pay up to 25.176 basis points per year to switch from the 50:50 portfolio to the constrained Gumbel portfolio, while he would require compensation of 2.0114 basis points per year to switch from the 50:50 portfolio to the short sales-constrained “unconditional” portfolio. See Table 4 for the complete set of results.
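The indifference calculation behind such a fee can be sketched as follows. This is an illustration under stated assumptions rather than the procedure used for Table 4: utility is taken to be CRRA with risk aversion of at least one, wealth is normalized to one each month, returns are in decimal form, the fee is a constant deducted from every monthly return, and the fee is found by bisection. The function names and the example returns are hypothetical.

```python
import math


def crra_utility(wealth, gamma):
    """CRRA utility of end-of-month wealth; minus infinity if wealth is wiped out."""
    if wealth <= 0.0:
        return -math.inf
    if gamma == 1.0:
        return math.log(wealth)
    try:
        return wealth ** (1.0 - gamma) / (1.0 - gamma)
    except OverflowError:
        return -math.inf  # wealth so close to zero that utility is effectively -inf


def average_utility(returns, gamma, fee=0.0):
    """Average realized utility after deducting a constant fee from each return."""
    return sum(crra_utility(1.0 + r - fee, gamma) for r in returns) / len(returns)


def management_fee(candidate, benchmark, gamma, tol=1e-10):
    """Monthly fee that equates average utility of candidate and benchmark (gamma >= 1)."""
    if min(candidate) <= -1.0:
        return -math.inf  # a bankrupt month gives realized utility of minus infinity
    target = average_utility(benchmark, gamma)
    hi = 1.0 + min(candidate)  # fee that would exhaust wealth in the worst month
    lo = hi - 1.0
    while average_utility(candidate, gamma, lo) < target:
        lo -= 1.0  # widen the bracket until the candidate is at least as good
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if average_utility(candidate, gamma, mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


# Hypothetical returns: the candidate beats the benchmark by 0.5% every month.
benchmark = [0.01, -0.02, 0.03]
candidate = [0.015, -0.015, 0.035]
print(management_fee(candidate, benchmark, gamma=3.0))
```

In this toy case the fee equals the constant 0.5% monthly advantage. Converting a monthly fee to basis points per year, for example by multiplying by 12 and by 10,000, is a further convention that the text does not spell out.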

It should be pointed out that the investors with risk aversion of one and three

using the normal model density forecast would have gone bankrupt in the month

of January 1992. On this date these two investors took the positions $\omega_x = -8.9$, $\omega_y = 21.3$ and $\omega_x = -5.1$, $\omega_y = 11.5$, respectively, and the month finished with returns of 14.0% on the small caps (the largest return on this asset over the out-of-sample period) and −2.6% on the large caps, leading to negative gross returns for these investors.7 For this month the realized utility for these investors is −∞, making the required management fee to switch to this portfolio −∞.
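A back-of-the-envelope check of the bankruptcy result, writing $\omega_x$ and $\omega_y$ for the weights on small and large caps and considering only the contribution of the two risky positions (the treatment of the risk-free leg is not given in this section and is left out as a simplifying assumption):

```latex
\text{RRA} = 1: \quad (-8.9)(14.0\%) + (21.3)(-2.6\%) = -124.6\% - 55.38\% = -179.98\%
\text{RRA} = 3: \quad (-5.1)(14.0\%) + (11.5)(-2.6\%) = -71.4\% - 29.9\% = -101.3\%
```

In both cases the risky positions alone lose more than 100% of wealth in the month, which is consistent with the negative gross returns reported above.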

The performance statistics indicate that substantial gains may be obtained by employing weights obtained from a model of the conditional distribution of stock

returns, particularly when coupled with a short-sales constraint. The uncon-

strained portfolios generally do not perform as well as simply holding an equally

weighted portfolio of the two indices, as the large caps performed particularly

well over the period 1990 to 1999. This result can be interpreted as further

evidence that placing short sales constraints on the optimal portfolio weights

obtained from forecasts improves out-of-sample portfolio performance [see Frost

and Savarino (1988) and Jagannathan and Ma (2002)]. If the short-sales constraint is interpreted as a type of “insanity filter,” preventing the investor from taking an extreme position in the market, then this finding reinforces results previously reported in the forecasting literature [see, e.g., Stock and Watson (1999)], that constrained forecasts often outperform unconstrained forecasts from nonlinear models.

Table 4 Realized portfolio return performance: management fee.

                                                      RRA
                                       1          3          7         10         20

Small caps                       −1.9811    −3.3108    −6.3061    −9.0655   −23.0183
Large caps                        1.4659     1.8658     2.9521     4.2016    13.1512
50:50 mix                              0          0          0          0          0

Unconstrained          Uncond    13.9372    −0.5508     0.1412     3.6384    25.9284
                       Normal         −∞         −∞   −67.0713   −40.6527     4.5104
                       NormCop  −15.2901   −22.4892   −26.3494   −20.5650    −1.4480
                       Gumbel     1.8154    −2.7387    −4.8058     2.6759    23.4016

Short-sales            Uncond    −2.0114    −0.4145     0.1412     3.6384    25.9287
constrained            Normal    25.3021     3.0180     2.8306     3.7620    24.4493
                       NormCop   25.2654     1.0794     0.6189     3.7849    25.9580
                       Gumbel    25.1760     3.2780     3.0071     5.5675    27.0638

RRA refers to the coefficient of relative risk aversion.

Each of the 11 rows of figures refers to a particular portfolio: the first two portfolios are the assets themselves, the third is a constant evenly weighted portfolio, the fourth is based on a weight that is optimized using the empirical unconditional distribution of returns, the fifth is based on the normal distribution model, the sixth is based on the skewed t-normal copula model and the seventh is based on the skewed t-rotated Gumbel copula model. Portfolios 8 to 11 correspond to portfolios 4 to 7 with a short-sales constraint imposed. The performance is measured as the number of basis points per year that the investor would be willing to pay to switch from the 50:50 portfolio; a possible “management fee.”

7 This obviously represents a failure of these investors’ models or optimization methods, as they did not recognize the risk of taking such extreme positions. According to the normal density forecast for that month the probability of a return such as this or more extreme, that is, Pr[X > 14.0 ∩ Y ≤ −2.6], was about 1 in 3.5 million. Our Monte Carlo estimate of the expected utility used only 10,000 draws from the forecast density, so it is not surprising that this outcome was not anticipated by an investor using a normal density forecast. This may be interpreted as a signal that the normal density forecast is misspecified; the model with skewed t margins and rotated Gumbel copula assigned a probability of about 1 in 75,000, which is about 45 times larger than under the normal density forecast.

The management fees that one could charge an investor currently holding the

50:50 portfolio to switch to the constrained Gumbel portfolio range between −5

and 27 basis points per year. Management fees of less than 10 or 20 basis points per year are of questionable economic significance. The largest gains (23 and 27) for

the Gumbel portfolio occur for the most risk-averse investor, though it should be

noted that the gains are not monotonic in the risk-aversion parameter.

Looking now to the gains from modeling higher moments and asymmetric

dependence, we compare the Normal portfolio with the Gumbel portfolio. In 9 of

10 comparisons the Gumbel portfolio outperformed the Normal portfolio, and for

the remaining comparison the difference was −0.1 basis points. Ignoring the

two comparisons where the Normal portfolio went bankrupt (and so the effective management fee difference would be +∞) the average outperformance was

16.2 basis points; 41.5 basis points for the unconstrained investor and only

0.9 basis points for the short sales-constrained investor. We will see in the following

section the reason for the extremely small difference for the short-sales

constrained investor.

The Gumbel model also outperformed the NormCop model in 9 of 10 comparisons. The average outperformance of the Gumbel portfolio over the NormCop portfolio was 21.3 (1.5) basis points for the unconstrained (short sales-constrained) investor. The corresponding figures for the Normal versus the NormCop portfolio
were −18.3 and 0.5, indicating that the Normal portfolio performed worse for the

unconstrained investor, but marginally better for the short sales-constrained

investor. That the Gumbel portfolio outperformed both the Normal and NormCop

portfolios suggests that the copula specification is important for asset allocation.
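The averages quoted in the last two paragraphs can be reproduced from the management fees in Table 4. The Python sketch below copies the relevant rows of the table, skips the two risk-aversion levels at which the unconstrained Normal portfolio went bankrupt, and averages the pairwise differences; the dictionary layout and function names are illustrative.

```python
import math

INF = math.inf
# Management fees (basis points per year) from Table 4 for RRA = 1, 3, 7, 10, 20.
FEES = {
    ("unconstrained", "Normal"): (-INF, -INF, -67.0713, -40.6527, 4.5104),
    ("unconstrained", "NormCop"): (-15.2901, -22.4892, -26.3494, -20.5650, -1.4480),
    ("unconstrained", "Gumbel"): (1.8154, -2.7387, -4.8058, 2.6759, 23.4016),
    ("constrained", "Normal"): (25.3021, 3.0180, 2.8306, 3.7620, 24.4493),
    ("constrained", "NormCop"): (25.2654, 1.0794, 0.6189, 3.7849, 25.9580),
    ("constrained", "Gumbel"): (25.1760, 3.2780, 3.0071, 5.5675, 27.0638),
}


def differences(kind, a, b):
    """Fee of model a minus fee of model b, skipping levels with a bankruptcy."""
    pairs = zip(FEES[(kind, a)], FEES[(kind, b)])
    return [x - y for x, y in pairs if math.isfinite(x) and math.isfinite(y)]


def wins(a, b):
    """Number of the 10 comparisons in which model a has the higher fee."""
    return sum(x > y
               for kind in ("unconstrained", "constrained")
               for x, y in zip(FEES[(kind, a)], FEES[(kind, b)]))


def mean(values):
    return sum(values) / len(values)


gn_unc = differences("unconstrained", "Gumbel", "Normal")
gn_con = differences("constrained", "Gumbel", "Normal")
print("Gumbel vs Normal wins:", wins("Gumbel", "Normal"))
print("Gumbel vs Normal:", mean(gn_unc + gn_con), mean(gn_unc), mean(gn_con))
print("Gumbel vs NormCop:",
      mean(differences("unconstrained", "Gumbel", "NormCop")),
      mean(differences("constrained", "Gumbel", "NormCop")))
print("Normal vs NormCop:",
      mean(differences("unconstrained", "Normal", "NormCop")),
      mean(differences("constrained", "Normal", "NormCop")))
```

Rounded to one decimal, the output matches the figures in the text: 16.2, 41.5 and 0.9 for Gumbel against Normal, 21.3 and 1.5 for Gumbel against NormCop, and −18.3 and 0.5 for Normal against NormCop.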

3.3.3 Analysis of the optimal portfolio weights In this section we look at the

time series of portfolio weights resulting from the portfolio decisions made using

different models and different levels of risk aversion. To consider the impact of

risk aversion, we present Table 5 and Figure 6. Table 5 contains quantiles of the

distribution of portfolio weights obtained using the Gumbel model for risk aver-

sion ranging between 1 and 20, and Figure 6 shows the time series of portfolio

weights for investors with a relative risk aversion of 7 and 20. Both show that increasing the level of relative risk aversion shrinks the portfolio weights toward

zero, as expected. In the limit of infinite risk aversion the investor would put no

wealth in the risky assets and all wealth in the risk-free asset.

In Figure 7 we show the impact of imposing a short-sales constraint. The plot

makes it clear that even for moderately high risk aversion (seven in this case) the

short-sales constraint is binding for at least one asset almost every period. For the

Gumbel model the proportions of times that the short-sales constraint is binding

for risk aversion levels of 1, 3, 7, 10, and 20 are 1, 1, 0.98, 0.95, and 0.94, respectively. Similar figures are obtained for the Normal and NormCop portfolios.

This shows that much of the information content of these models is lost if the investor is short-sales constrained.

One example of this reduced information content relates to comparing the

short sales-constrained portfolios. The proportions of times that the short sales-

constrained Gumbel portfolio took the same portfolio weights as the short sales-

constrained Normal portfolio for risk aversion levels of 1, 3, 7, 10, and 20 are 0.95,

0.80, 0.59, 0.52, and 0.52, respectively. The corresponding figures comparing the

Gumbel with the NormCop portfolio are similar. Thus, of the 120 observations in

the out-of-sample period, the number of observations that enable us to distinguish one short sales-constrained portfolio from another ranges from 58 (when there is
“only” a 52% overlap, for the two most risk-averse investors) to just 5 (for the least

risk-averse investor). This suggests, and is confirmed in the next section, that if

there are gains to capturing and forecasting skewness and asymmetric depen-

dence they are more likely to be present for unconstrained investors than for short

sales-constrained investors.

To compare the portfolio weights obtained using the different models we

present Table 6 and Figures 8 and 9. Table 6 presents quantiles of the distribution of portfolio weights obtained using the different models for RRA = 7, and Figures 8

and 9 compare the time series of portfolio weights over the out-of-sample period.

The results show that the weights from the unconstrained normal model are more

aggressive than those from the NormCop or Gumbel models. For example, the

latter two models yield median positions of being short 1 unit of the small cap

index and long 1 to 1.3 units of the large cap index, while for the normal model the

median position is being short almost 2 units of the small cap index and long 3 units of the large cap index. Figure 8 confirms that the Normal portfolio weights are almost always more extreme than the Gumbel portfolio weights. This is possibly due to the fact that the Gumbel model takes into account the fat tails, skewness, and asymmetric dependence of these assets. Negative skewness, fat tails, and dependence of the rotated Gumbel type will all make a risk-averse investor less aggressive in his/her portfolio decisions, as they all lead, other things being equal, to a higher probability of large negative moves for the portfolio. Figure 9 reveals that the NormCop and Gumbel portfolios have similar portfolio weights. Much of the difference in portfolio weights between the Gumbel and the Normal portfolios, then, was driven by the different marginal distribution assumptions.

Table 5 Optimal portfolio weight summary statistics (Gumbel model).

                      RRA = 1                 RRA = 7                 RRA = 10                RRA = 20
                Small caps  Large caps  Small caps  Large caps  Small caps  Large caps  Small caps  Large caps

Unconstrained
Minimum             −13.85       −1.58       −4.38       −3.68       −2.96       −3.03       −2.20       −2.98
25% quantile         −7.70        7.49       −1.44        0.68       −1.03        0.56       −0.52        0.30
Median               −5.62        9.04       −1.01        1.33       −0.72        0.95       −0.37        0.50
75% quantile         −4.06       10.41       −0.60        1.66       −0.43        1.19       −0.21        0.63
Maximum               4.22       15.53        0.74        2.56        0.95        1.88        1.12        0.99

Short-sales constrained
Minimum               0.00        0.00        0.00        0.00        0.00        0.00        0.00        0.00
25% quantile          0.00        1.00        0.00        0.38        0.00        0.27        0.00        0.14
Median                0.00        1.00        0.00        0.61        0.00        0.47        0.00        0.26
75% quantile          0.00        1.00        0.00        0.89        0.00        0.68        0.00        0.36
Maximum               0.00        1.00        0.58        1.00        0.52        1.00        0.26        0.56

This table presents some summary statistics of the optimal portfolio weights over the out-of-sample period for an investor using the “Gumbel” model, with relative risk aversion of 1, 7, 10, and 20, with and without a short-sales constraint imposed.

Note that we do not test for the significance of the differences in portfolio

weights directly in this article. Differences in portfolio weights are only economic-

ally interesting if they lead to differences in portfolio performance and so it seems

more appropriate to test for differences using the metric of portfolio performance.

We proceed to such tests in the next section.

To try to determine in more detail the causes of the differences in portfolio weights between the Normal and Gumbel portfolios we considered the following simple regression of the difference between the Gumbel and Normal portfolio weights on a constant and the nine parameters of the more flexible model (two expected returns, two volatilities, two skewness parameters, two kurtosis parameters, and one copula parameter):

$$
\omega^{GUM}_{i,t} - \omega^{N}_{i,t} = \beta_0 + \beta_1 \mu_{x,t} + \beta_2 \mu_{y,t} + \beta_3 \sqrt{h_{x,t}} + \beta_4 \sqrt{h_{y,t}} + \beta_5 \lambda_{x,t} + \beta_6 \lambda_{y,t} + \beta_7 \nu_{x,t} + \beta_8 \nu_{y,t} + \beta_9 \kappa_t + \varepsilon_{i,t}, \quad i = X, Y \tag{13}
$$

where $\omega^{GUM}_{i,t}$, $\omega^{N}_{i,t}$ are the optimal Gumbel and Normal portfolio weights for RRA = 7. The optimal portfolio weights are a complicated, nonlinear function of the parameters of the joint density, and so the above regression is almost certainly misspecified.8 Further, for simplicity we ignore the fact that the variables on the right-hand side of Equation (13) are estimated, and so this regression will suffer errors-in-variables bias. Nevertheless, it may help highlight some of the causes of the differences in portfolio weights. To aid interpretation, Table 7 also presents the results of regressions of the individual portfolio weights on the regressors in Equation (13), though we will focus our discussion below solely on the regressions involving the difference in portfolio weights, presented in the last two columns.

Figure 6 Optimal weights for the unconstrained Gumbel portfolio for investors with relative risk aversion of 7 and 20 over the out-of-sample period (January 1990 to December 1999). “Wsml” stands for the weight put in small caps, and “Wbig” stands for the weight put in large caps.

Figure 7 Optimal weights for the unconstrained and short sales-constrained Gumbel portfolios for an investor with relative risk aversion of 7 over the out-of-sample period (January 1990 to December 1999). “Wsml” stands for the weight put in small caps, and “Wbig” stands for the weight put in large caps.

8 We may consider this regression as a first-order approximation of the true function relating the parameters of the forecast density to the optimal portfolio weights.

The signs on the coefficients on expected returns indicate that the Normal

portfolio weights reacted more strongly to changes in forecasted returns than the

Gumbel portfolio weights. For example, an increase in small cap expected returns led to a larger increase in the Normal portfolio small cap weight than in the Gumbel portfolio small cap weight, making the coefficient in the regression of the difference in portfolio weights negative. The significant positive (negative) coefficient on large cap volatility for the large (small) cap regression also reflects the fact that the Normal portfolio weights reacted more strongly than the Gumbel portfolio
weights.

The negative (positive) sign on the degree-of-freedom parameter coefficient in

the small (large) cap regression suggests that as the degree-of-freedom parameter

decreases, reflecting an increase in the fatness of the tails, the weight in the small

(large) caps in the Gumbel portfolio goes toward zero. The Normal portfolio

weights were essentially uncorrelated with this parameter, as expected. That is, an increase in tail fatness led to a less aggressive Gumbel portfolio.

Table 6 Optimal portfolio weight summary statistics (RRA = 7).

                         Uncond           Normal           NormCop          Gumbel
                      Small   Large    Small   Large    Small   Large    Small   Large
                      caps    caps     caps    caps     caps    caps     caps    caps

Unconstrained
Minimum               0.04    0.19    -3.83   -0.36    -4.21   -6.96    -4.38   -3.68
25% quantile          0.13    0.21    -2.32    1.93    -1.33    0.53    -1.44    0.68
Median                0.14    0.21    -1.89    3.03    -0.92    0.99    -1.01    1.33
75% quantile          0.15    0.30    -1.31    3.78    -0.57    1.36    -0.60    1.66
Maximum               0.17    0.43     0.53   10.23     1.70    2.59     0.74    2.56

Short-sales constrained
Minimum               0.04    0.19     0.00    0.00     0.00    0.00     0.00    0.00
25% quantile          0.13    0.21     0.00    0.69     0.00    0.29     0.00    0.38
Median                0.14    0.21     0.00    1.00     0.00    0.65     0.00    0.61
75% quantile          0.15    0.30     0.00    1.00     0.00    0.92     0.00    0.89
Maximum               0.17    0.43     0.49    1.00     0.60    1.00     0.58    1.00

This table presents some summary statistics of the optimal portfolio weights over the out-of-sample period for an investor with relative risk aversion of 7, obtained using four different models, with and without a short-sales constraint imposed.


The dependence parameter was included in this regression to reflect the fact that at higher dependence levels the rotated Gumbel copula diverges more from the Normal copula (at independence these two copulas are identical), becoming more asymmetric. As the assets became more highly correlated, the Gumbel portfolio placed more weight in the small caps and less in the large caps, which, over this sample period, brought both portfolio weights closer to zero. Thus greater dependence led to more conservative portfolio weights, reflecting the fact that stronger dependence in the rotated Gumbel copula leads to more negatively skewed portfolio returns. The Normal portfolio responded in precisely the opposite way to an increase in dependence: an increase in dependence led to a decrease in the weight in small caps and an increase in the weight in large caps, representing a more aggressive strategy.

3.3.4 Tests for superior portfolio performance. In this section we attempt to determine whether the differences in portfolio performances documented in previous sections are statistically significant. We present the results of two tests for superior performance: a bootstrap test of pairwise comparisons, and the reality check of White (2000), as modified by Hansen (2001). In all cases we employ the stationary bootstrap of Politis and Romano (1994).⁹

[Figure 8 plot: "Optimal portfolio weights using the Gumbel and Normal models"; vertical axis: portfolio weight; series: Wsml and Wbig (RRA = 7) for the Gumbel and Normal models.]

Figure 8 Optimal weights for the unconstrained Gumbel and normal portfolios for an investor with relative risk aversion of 7 over the out-of-sample period (January 1990 to December 1999). "Wsml" stands for the weight put in small caps, and "Wbig" stands for the weight put in large caps.

We conduct pairwise comparisons by looking at the bootstrap confidence interval on the difference in the performance measures of two portfolios.¹⁰ Let the performance measure of portfolio i be $m_i$. If the lower bound of the bootstrap 90% confidence interval of $m_i - m_j$ is greater than zero, then we take model i to be significantly better than model j. If the upper bound of the interval is less than zero, then we take model j to be significantly better than model i. If the confidence interval includes zero, then the test is inconclusive and we cannot statistically distinguish models i and j according to that performance measure. The results of these tests are presented in Table 8. In this table we include only the 50:50 portfolio of the three naïve portfolios to save space. The results from the pairwise comparisons involving this portfolio are representative of the results from comparisons involving the other two naïve portfolios.

[Figure 9 plot: "Optimal portfolio weights using the Gumbel and NormCop models"; horizontal axis: Jan90 to Dec99; vertical axis: portfolio weight; series: Wsml and Wbig (RRA = 7) for the Gumbel and NormCop models.]

Figure 9 Optimal weights for the unconstrained Gumbel and NormCop portfolios for an investor with relative risk aversion of 7 over the out-of-sample period (January 1990 to December 1999). "Wsml" stands for the weight put in small caps, and "Wbig" stands for the weight put in large caps.

9 The stationary bootstrap is a block bootstrap with block lengths that are distributed as a geometric(q) random variable. The average block length is 1/q. We choose q by running univariate regressions of each portfolio's returns on 36 lags, in both levels and squares, to capture serial dependence in the conditional mean and variance. We set 1/q equal to the maximum of six and the largest significant lag in the regressions. The results suggested an average block length of between 25 and 34 observations. We investigated whether the results were sensitive to the choice of average block length and found that the results were quite robust for average length choices greater than 20.

10 In this section we bootstrap the average realized utility of a portfolio rather than the "management fee" discussed above. This is simply for computational ease and should not affect the conclusions drawn.
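The pairwise procedure described above can be sketched in a few lines. The following is a minimal illustration, not the author's implementation: the function names, the default of 1,000 bootstrap replications, and the use of a simple percentile interval are all assumptions made for the example, since the text does not say how the confidence interval was constructed. The inputs are the per-period realized utilities of two portfolios.

```python
import random

def stationary_bootstrap_indices(n, avg_block_len, rng):
    """Resampling indices for the stationary bootstrap: blocks of
    geometric length (mean avg_block_len), wrapping around the sample."""
    q = 1.0 / avg_block_len  # probability of starting a new block
    idx = [rng.randrange(n)]
    while len(idx) < n:
        if rng.random() < q:
            idx.append(rng.randrange(n))    # start a new block at a random date
        else:
            idx.append((idx[-1] + 1) % n)   # continue the current block
    return idx

def pairwise_test(util_i, util_j, avg_block_len=25, n_boot=1000,
                  level=0.90, seed=0):
    """Compare mean realized utility of portfolios i and j.

    Returns (winner, (lower, upper)), where winner is "i", "j", or None
    when the interval contains zero. The percentile interval is an
    assumption of this sketch."""
    rng = random.Random(seed)
    n = len(util_i)
    diff = [a - b for a, b in zip(util_i, util_j)]
    boot_means = []
    for _ in range(n_boot):
        idx = stationary_bootstrap_indices(n, avg_block_len, rng)
        boot_means.append(sum(diff[k] for k in idx) / n)
    boot_means.sort()
    alpha = 1.0 - level
    lower = boot_means[int((alpha / 2.0) * n_boot)]
    upper = boot_means[int((1.0 - alpha / 2.0) * n_boot) - 1]
    if lower > 0:
        return "i", (lower, upper)
    if upper < 0:
        return "j", (lower, upper)
    return None, (lower, upper)
```

Resampling the difference series, rather than each utility series separately, keeps the two portfolios aligned in time, so the contemporaneous dependence between them is preserved in every bootstrap sample.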

Table 8 shows that the unconstrained Gumbel portfolio significantly outperformed both the Normal and the NormCop portfolios for all levels of risk aversion. Comparisons of the Gumbel with the 50:50 portfolio and the Uncond portfolio yielded no conclusive results. The Normal portfolio was beaten in every comparison by the 50:50 portfolio and the Uncond portfolio, and the NormCop was beaten by these portfolios in all but two comparisons. This is strong evidence against the Normal and NormCop portfolios for unconstrained investors.

Table 7 Explaining the optimal portfolio weights.

                          Normal weights          Gumbel weights          Gumbel - Normal weights
                          Small       Large       Small       Large       Small        Large
                          caps        caps        caps        caps        caps         caps

Constant                  -1.2653     15.1015     11.7894     13.7499     -10.5241*    28.8514
                          (4.8231)    (9.6253)    (7.3742)    (19.1579)   (6.3642)     (21.3988)
Expected return            0.9045*    -1.0251*     0.4815*    -0.5448*     -0.4230*     0.4802*
  (small caps)            (0.0854)    (0.1059)    (0.0936)    (0.0995)    (0.0850)     (0.1704)
Expected return           -0.9568*     3.7710*     0.9780      0.9951       1.9349*    -4.7661*
  (large caps)            (0.5220)    (0.9774)    (0.7374)    (1.9298)    (0.5828)     (2.1219)
Volatility                 0.2221*     0.2312*     0.1529      0.0899       0.0693     -0.3211
  (small caps)            (0.0924)    (0.1381)    (0.1037)    (0.1810)    (0.0788)     (0.2042)
Volatility                 0.4236*    -1.2942*     0.1727     -0.2416*      0.2509      1.0526*
  (large caps)            (0.1290)    (0.2369)    (0.1148)    (0.1025)    (0.1595)     (0.3066)
Skewness                  -2.0162     -4.9512      5.7788      9.8843       3.7626     14.8355
  (small caps)            (2.5957)    (4.6368)    (3.8478)    (9.3447)    (3.0265)     (10.2489)
Skewness                   0.3105     -0.2501      0.7848     -2.9184       1.0953     -2.6683
  (large caps)            (0.3866)    (0.5357)    (0.8610)    (2.0345)    (0.7840)     (2.3093)
Degrees-of-freedom        -0.0394      0.2721     -0.3921*     0.0307      -0.3528*     0.2414
  (small caps)            (0.1394)    (0.2049)    (0.1826)    (0.4149)    (0.1297)     (0.5142)
Degrees-of-freedom         0.0033     -0.1181*     0.0252      0.2278       0.0219      0.3459*
  (large caps)            (0.0397)    (0.0661)    (0.0424)    (0.1788)    (0.0356)     (0.1772)
Dependence                -0.7823     10.5485*     5.7114      6.9595       6.4937*   -17.5080
  parameter               (2.7165)    (5.3584)    (4.0757)    (10.6587)   (3.5563)     (12.0185)
R2                         0.8727      0.9140      0.4828      0.1572       0.6022      0.5674

This table reports the results of a regression of the optimal portfolio weights and the differences in optimal portfolio weights for the Gumbel and the normal portfolios, for risk aversion of 7, on the 10 listed variables. Newey and West (1987) standard errors computed allowing for up to 12th-order serial dependence are reported in parentheses below the parameter estimates, and estimates that are significantly different from zero at the 10% level are marked with an asterisk.


For the short sales-constrained portfolios we find fewer significant comparisons. For all but one comparison we find that the Gumbel portfolio either significantly outperforms or has performance that is indistinguishable from the competing portfolios. In one comparison (for RRA = 1), the Normal portfolio significantly outperformed the Gumbel portfolio.

From the fact that the Gumbel portfolio generally significantly outperformed both the Normal and the NormCop portfolios, whereas the Normal and NormCop portfolios were generally indistinguishable, we can conclude that it was the capturing of asymmetric dependence rather than skewness that yielded the greatest gains for asset allocation. Note, however, that accurate modeling of the dependence structure relies on accurate modeling of the marginal distributions, and so even though capturing skewness alone does not appear helpful for this dataset, it is required for the gains from copula modeling to be realized.

Table 8 Pairwise comparisons of the unconstrained models' performance.

                                              RRA
                         1            3            7          10         20

Unconstrained
Naïve vs. Uncond         Uncond       -            -          -          -
Naïve vs. Normal         Naïve^a      Naïve^a      Naïve      Naïve      Naïve
Uncond vs. Normal        Uncond^a     Uncond^a     Uncond     Uncond     Uncond
Naïve vs. NormCop        -            -            -          Naïve      -
Uncond vs. NormCop       -            -            Uncond     Uncond     Uncond
Normal vs. NormCop       NormCop^a    NormCop^a    -          -          -
Naïve vs. Gumbel         -            -            -          -          -
Uncond vs. Gumbel        -            -            -          -          -
Normal vs. Gumbel        Gumbel^a     Gumbel^a     Gumbel     Gumbel     Gumbel
NormCop vs. Gumbel       Gumbel       Gumbel       Gumbel     Gumbel     Gumbel

Short-sales constrained
Naïve vs. Uncond         -            -            -          -          -
Naïve vs. Normal         -            -            -          -          -
Uncond vs. Normal        -            -            Normal     -          -
Naïve vs. NormCop        -            -            -          -          -
Uncond vs. NormCop       -            -            -          -          -
Normal vs. NormCop       -            Normal       -          -          -
Naïve vs. Gumbel         -            -            -          -          -
Uncond vs. Gumbel        -            -            Gumbel     -          -
Normal vs. Gumbel        Normal       -            -          Gumbel     Gumbel
NormCop vs. Gumbel       -            Gumbel       Gumbel     Gumbel     Gumbel

This table presents the results of pairwise comparisons of the 50:50 portfolio (denoted "naïve"), the unconditionally optimal portfolio, and the portfolios based on the normal distribution, the skewed t-rotated Gumbel copula, and the skewed t-normal copula models. The tests were conducted at the 10% significance level. A dash indicates the test was inconclusive, and the name of the model was reported if that model significantly outperformed the other. The performance measure used is the sample mean of the realized utility.

a The unconstrained normal portfolio went bankrupt during the out-of-sample period, implying that a CRRA investor would never choose that portfolio. We report that this portfolio was beaten though we could not conduct a formal test.

Although the above results are useful for comparing just two particular models at a time, a more appropriate test would compare all models jointly. With this in mind we now present the results of the reality check test of White (2000). This is a test that a given benchmark portfolio performs as well as the best competing alternative model, where we have possibly many competing alternatives. We present the three estimates of the reality check p-values discussed in Hansen (2001) and focus our attention on the "consistent" p-value estimates. We reject the null hypothesis that the benchmark model performs as well as the best competing alternative model whenever the p-value is less than 0.10. In these tests we separate the models into two sets, unconstrained and constrained, and include the three naïve portfolios in both sets. Table 9 presents the results when the 50:50, Normal, and NormCop portfolios are taken as the benchmarks.
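To make the logic of the joint test concrete, the sketch below computes a bootstrap p-value in the spirit of White's (2000) reality check: the test statistic is the largest scaled mean performance difference between any alternative and the benchmark, and its null distribution is approximated by recentered stationary-bootstrap replications. This is only an illustration under stated assumptions. The function name and defaults are hypothetical, and the lower, consistent, and upper p-value refinements of Hansen (2001) used in the article are not reproduced here.

```python
import random

def reality_check_pvalue(bench, alternatives, avg_block_len=25,
                         n_boot=1000, seed=0):
    """Bootstrap p-value for H0: no alternative has a higher mean
    performance than the benchmark (basic reality-check form).

    bench: per-period performance (e.g. realized utility) of the benchmark.
    alternatives: list of per-period performance series, one per model."""
    rng = random.Random(seed)
    n = len(bench)
    # d[k][t]: performance of alternative k minus the benchmark at date t
    d = [[a - b for a, b in zip(alt, bench)] for alt in alternatives]
    d_bar = [sum(row) / n for row in d]
    stat = max(n ** 0.5 * m for m in d_bar)
    q = 1.0 / avg_block_len
    exceed = 0
    for _ in range(n_boot):
        # One set of stationary-bootstrap indices shared by all models,
        # so dependence across models is preserved in each replication.
        idx = [rng.randrange(n)]
        while len(idx) < n:
            idx.append(rng.randrange(n) if rng.random() < q
                       else (idx[-1] + 1) % n)
        boot_stat = max(
            n ** 0.5 * (sum(row[t] for t in idx) / n - m)
            for row, m in zip(d, d_bar)
        )
        if boot_stat > stat:
            exceed += 1
    return exceed / n_boot
```

A small p-value indicates that at least one alternative beats the benchmark by more than sampling variation would explain, which corresponds to the rejection rule at the 0.10 level described above.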

When comparing the 50:50 portfolio with the unconstrained model-based portfolios we are not able to reject that it performs as well as the best alternative for any level of risk aversion. Comparing the 50:50 portfolio with the short sales-constrained portfolios leads to a single rejection for the most risk-averse investor. From the second panel of Table 9 we see that we are able to reject the unconstrained Normal portfolio using four out of five levels of risk aversion. Table 9 similarly shows that we are able to reject the unconstrained NormCop portfolio for three out of five levels of risk aversion. However, the constrained Normal is only rejected once, and we are unable to reject the constrained NormCop portfolio for any level of risk aversion.

Overall these results support the results of the pairwise comparisons of unconstrained portfolios, that the Normal and NormCop portfolios yielded inferior returns over this period. However, when the investor is short-sales constrained these results suggest that Normal and NormCop portfolios are just as good as the best alternative, a conclusion supported by the economically small differences in the management fee reported in Section 3.3.2, and by the substantial overlap in selected portfolio weights reported in Section 3.3.3. Thus the benefits from modeling skewness and asymmetric dependence are only present for investors that are not short-sales constrained. For such investors, the gains range up to 27 basis points per year, and are generally statistically significant.

4 CONCLUSIONS AND FUTURE WORK

In this article we considered the impact that skewness and asymmetric dependence have on the out-of-sample portfolio decisions of a CRRA investor, with a range of levels of risk aversion. Skewness in the distribution of individual stock returns has been reported by numerous authors in the last three decades. "Asymmetric dependence," of a form where equity returns exhibit greater dependence during market downturns than during market upturns, has been reported by Erb, Harvey, and Viskanta (1994), Longin and Solnik (2001), and Ang and Chen (2002), inter alia, and can be shown to induce negative skewness in the distribution of portfolio returns. It is known that any investor that exhibits nonincreasing absolute risk aversion, a very weak requirement, has a preference for positively skewed assets, ceteris paribus. Thus both of these asymmetries may, in theory, impact portfolio decisions.

We considered the problem of allocating wealth between the risk-free asset and the CRSP small cap and large cap indices, using monthly data from January 1954 to December 1999. We used the data up to December 1989 to develop the models and reserved the last 120 months for an out-of-sample evaluation of the competing methods. We used conditional distribution models that are able to capture time-varying conditional moments of up to order four, and we employed models of the dependence structure of these asset returns that allowed for greater dependence during market downturns than market upturns.

Table 9 Bootstrap reality check p-values.

                       Unconstrained                     Short-sales constrained
              Lower     Consistent   Upper        Lower     Consistent   Upper

"50:50 mix" as benchmark
RRA = 1^a     0.259     0.259        0.462        0.177     0.177        0.266
RRA = 3^a     0.352     0.43         0.736        0.196     0.196        0.291
RRA = 7       0.327     0.451        0.843        0.313     0.313        0.481
RRA = 10      0.28      0.28         0.657        0.175     0.175        0.312
RRA = 20      0.145     0.145        0.373        0.057     0.057        0.251

"Normal" as benchmark
RRA = 1       0.000^b   0.000^b      0.000^b      0.316     0.316        0.866
RRA = 3       0.000^b   0.000^b      0.000^b      0.586     0.667        0.792
RRA = 7       0.042     0.042        0.042        0.746     0.792        0.842
RRA = 10      0.034     0.034        0.034        0.373     0.384        0.593
RRA = 20      0.117     0.185        0.309        0.082     0.082        0.535

"NormCop" as benchmark
RRA = 1^a     0.126     0.126        0.405        0.556     0.556        0.932
RRA = 3^a     0.066     0.066        0.317        0.319     0.368        0.470
RRA = 7       0.067     0.067        0.305        0.349     0.394        0.493
RRA = 10      0.023     0.023        0.224        0.380     0.511        0.579
RRA = 20      0.238     0.380        0.380        0.151     0.161        0.611

This table presents the results of the reality check of White (2000), as modified by Hansen (2001). "Lower," "Consistent," and "Upper" refer to three estimates of the p-value of the test statistic. A p-value of less than 0.10 indicates that we may reject the hypothesis that the benchmark model performs as well as the best alternative model considered. Any rejections are marked in bold face type. The performance measure used is the sample mean of the realized utility.

a For the comparisons of unconstrained portfolios for investors with risk aversion of 1 and 3 using average realized utility we excluded the bivariate normal portfolio, as it went bankrupt during the sample.

b The unconstrained normal portfolio went bankrupt during the out-of-sample period, implying that a CRRA investor would never choose that portfolio. We report that this portfolio was beaten though we could not conduct a formal test.

We found some economic evidence that the model capturing skewness and asymmetric dependence yielded better portfolio decisions than the bivariate normal model. The amount that one could charge a risk-averse investor for use of the most flexible density model rather than the bivariate normal model ranged between approximately 0 and 27 basis points per year. The most economically and statistically significant differences were for the unconstrained portfolios; the short sales-constrained portfolios were generally not substantially different. Our results suggest that both marginal distribution modeling and copula modeling have important implications for out-of-sample portfolio performance.

This article leaves unanswered a number of questions. We considered only two specific indices, and it would be interesting to compare the results obtained for other assets. Further, it would be of interest to extend the problem to that of multiple assets: do the benefits of flexibly modeling the joint distribution of returns increase with the dimension of the distribution, or does parameter estimation error dominate? In this article we ignored the impact of parameter estimation uncertainty on the investor's optimization problem, and it would be interesting to determine how the results would change when this is taken into account. Finally, it would be of great interest to compare the results of the methods presented in this article with other parametric approaches, such as Ang and Bekaert (2002), and with nonparametric approaches, such as those of Brandt (1999) or Aït-Sahalia and Brandt (2001). All of these questions are left for future work.

APPENDIX A: DETAILS OF THE OPTIMIZATION PROCEDURE

In this short appendix we provide further information on the computation of the optimal portfolio weights. We used the in-sample period, January 1954 to December 1989, to determine the best-fitting density specification, and then followed the steps below for each month t in the out-of-sample period, January 1990 to December 1999. See Judd (1998) for some of the issues surrounding the use of Monte Carlo simulations to approximate objective functions containing integrals.

1. Estimate the parameters of the density model using data available up until date t - 1 and store the parameter estimate as $\hat{\theta}_t$. We used $\hat{\theta}_{t-1}$ as a starting value for the estimation procedure.

2. Generate n = 10,000 independent draws from the forecast density, $H(\hat{\theta}_t)$, for month t:

(a) Generate n independent draws from the forecast copula for month t; denote these as $(U_{t,i}, V_{t,i})$, for i = 1, 2, ..., n.

(b) Let $X_{t,i} \equiv F^{-1}(U_{t,i}; \hat{\theta}_t)$ and $Y_{t,i} \equiv G^{-1}(V_{t,i}; \hat{\theta}_t)$, for i = 1, 2, ..., n, to obtain n draws from the forecast joint density, where $F^{-1}$ and $G^{-1}$ are the inverse cdfs of the models for $X_t$ and $Y_t$.


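3. Define the estimated optimal portfolio weights for month t, utility function U, and density forecast $H(\hat{\theta}_t)$ as

$$\omega_t^* \equiv \arg\max_{\omega \in \mathcal{W}} \; n^{-1} \sum_{i=1}^{n} U_t^*\big(R_{t,i}^*(\omega)\big),$$

where

$$R_{t,i}(\omega) = 1 + \omega_x X_{t,i} + \omega_y Y_{t,i},$$

$\varepsilon = 2.2204 \times 10^{-16}$, and

$$R_{t,i}^*(\omega) = \begin{cases} R_{t,i}(\omega) & \text{if } R_{t,i}(\omega) > \varepsilon \\ 2\varepsilon\left[1 - 1/\left(1 + \exp\{R_{t,i}(\omega) - \varepsilon\}\right)\right] & \text{if } R_{t,i}(\omega) \le \varepsilon \end{cases}$$

$$\bar{U}_t = n^{-1} \sum_{i=1}^{n} U\big(R_{t,i}^*(\omega_{t-1}^*)\big)$$

$$U_t^*(r(\omega)) = \frac{100}{|\bar{U}_t|} \, U(r(\omega)).$$

The cut-off $2.2204 \times 10^{-16}$ was chosen as this is "machine epsilon" for Matlab. The function $U^*$ is used instead of U directly, as the numerical maximization routine does not work well with very small or very large numbers. The transformation does not affect the ranking of alternatives; it simply rescales the objective function so that the values it returns are "around" 100 in magnitude.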

4. The maximization was carried out using the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, via function fmincon in Matlab, using $\omega_{t-1}^*$ as a starting value.
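Steps 2 and 3 above can be illustrated with a small, self-contained sketch. It is deliberately simplified and makes several substitutions that are assumptions of the example, not the article's method: a normal copula and normal margins stand in for the rotated Gumbel copula and skewed t margins, a coarse grid search over the short sales-constrained set stands in for BFGS via fmincon, the standard CRRA utility form is assumed, and the rescaling by $100/|\bar{U}_t|$ is omitted because a grid search is unaffected by the scale of the objective. All function names are hypothetical.

```python
import math
import random
from statistics import NormalDist

EPS = 2.2204e-16  # the cut-off epsilon from step 3

def smooth_return(r):
    """R*: equal to r above EPS, otherwise a value in (0, EPS].

    Uses exp(x)/(1+exp(x)), which is algebraically the same as
    1 - 1/(1+exp(x)) but avoids cancellation for very negative r."""
    if r > EPS:
        return r
    e = math.exp(r - EPS)
    return 2.0 * EPS * e / (1.0 + e)

def crra_utility(r, gamma):
    """Standard CRRA utility of gross return r > 0 (assumed form)."""
    if gamma == 1:
        return math.log(r)
    return r ** (1.0 - gamma) / (1.0 - gamma)

def draw_returns(n, rho, margin_x, margin_y, rng):
    """Step 2: draw (U, V) from a normal copula (a stand-in), then map
    each through the inverse cdf of its marginal model."""
    std = NormalDist()
    draws = []
    for _ in range(n):
        z1 = rng.gauss(0.0, 1.0)
        z2 = rho * z1 + math.sqrt(1.0 - rho * rho) * rng.gauss(0.0, 1.0)
        # Clamp away from 0 and 1 so the inverse cdfs are always defined.
        u = min(max(std.cdf(z1), 1e-12), 1.0 - 1e-12)
        v = min(max(std.cdf(z2), 1e-12), 1.0 - 1e-12)
        draws.append((margin_x.inv_cdf(u), margin_y.inv_cdf(v)))
    return draws

def optimal_weights(draws, gamma, step=0.05):
    """Step 3 by grid search over {w_x, w_y >= 0, w_x + w_y <= 1}."""
    best_w, best_val = None, -math.inf
    k = int(round(1.0 / step))
    for a in range(k + 1):
        for b in range(k + 1 - a):
            wx, wy = a * step, b * step
            val = sum(
                crra_utility(smooth_return(1.0 + wx * x + wy * y), gamma)
                for x, y in draws
            ) / len(draws)
            if val > best_val:
                best_w, best_val = (wx, wy), val
    return best_w, best_val
```

For example, `optimal_weights(draw_returns(1000, 0.3, NormalDist(0.02, 0.03), NormalDist(-0.02, 0.03), random.Random(0)), gamma=3.0)` searches the constrained set for a hypothetical pair of assets. A gradient-based optimizer started at the previous month's weights, as in step 4, would replace the grid in practice.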

APPENDIX B: COPULA FUNCTIONAL FORMS

In this appendix we provide the functional forms of the copulas used in thisarticle. The cdf will be denoted C, and the pdf c. For further details on any of

these copulas, or for other copulas, the reader is referred to Joe (1997) and Nelsen

(1999).

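Normal copula

$$C_N(u, v; \rho) = \Phi_\rho\big(\Phi^{-1}(u), \Phi^{-1}(v)\big)$$

$$c_N(u, v; \rho) = \frac{1}{\sqrt{1-\rho^2}} \exp\left\{ -\frac{\Phi^{-1}(u)^2 + \Phi^{-1}(v)^2 - 2\rho\,\Phi^{-1}(u)\Phi^{-1}(v)}{2(1-\rho^2)} + \frac{\Phi^{-1}(u)^2 + \Phi^{-1}(v)^2}{2} \right\}$$

$\rho \in (-1, 1)$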


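Clayton copula [Kimeldorf and Sampson copula in Joe (1997)]

$$C_C(u, v; \theta) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}$$

$$c_C(u, v; \theta) = (1+\theta)(uv)^{-\theta-1}\left(u^{-\theta} + v^{-\theta} - 1\right)^{-2-1/\theta}$$

$\theta \in [-1, \infty) \setminus \{0\}$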

Rotated Clayton copula

$$C_{RC}(u, v; \theta) = u + v - 1 + C_C(1-u, 1-v; \theta)$$

$$c_{RC}(u, v; \theta) = c_C(1-u, 1-v; \theta)$$

$\theta \in [-1, \infty) \setminus \{0\}$
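As an illustration of how draws from a copula such as these can be generated (step 2(a) of Appendix A), the sketch below samples from the Clayton copula by conditional inversion: draw U and an independent uniform W, then solve $\partial C_C(u, v; \theta) / \partial u = w$ for v. This is a standard technique, not a procedure described in the article. The function names are hypothetical, and the sketch assumes $\theta > 0$.

```python
import random

def clayton_conditional_cdf(v, u, theta):
    """P(V <= v | U = u): the partial derivative of the Clayton cdf in u."""
    return (u ** (-theta - 1.0)
            * (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta - 1.0))

def clayton_inverse_conditional(w, u, theta):
    """Solve clayton_conditional_cdf(v, u, theta) = w for v (theta > 0)."""
    return ((w ** (-theta / (1.0 + theta)) - 1.0) * u ** -theta + 1.0) ** (-1.0 / theta)

def clayton_draw(theta, rng):
    """One draw (U, V) from the Clayton copula with theta > 0."""
    u = 1.0 - rng.random()  # uniform on (0, 1], avoids division by zero
    w = 1.0 - rng.random()
    return u, clayton_inverse_conditional(w, u, theta)

def rotated_clayton_draw(theta, rng):
    """One draw from the rotated Clayton copula: reflect both coordinates."""
    u, v = clayton_draw(theta, rng)
    return 1.0 - u, 1.0 - v
```

Reflecting both coordinates, as in `rotated_clayton_draw`, is the sampling counterpart of the rotation $c_{RC}(u, v; \theta) = c_C(1-u, 1-v; \theta)$ given above.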

Student's t copula

$$C_T(u, v; \rho, \nu) = T_{\nu,\rho}\big(T_\nu^{-1}(u), T_\nu^{-1}(v)\big)$$

$$c_T(u, v; \rho, \nu) = \frac{\Gamma\left(\frac{\nu+2}{2}\right) t_\nu\big(T_\nu^{-1}(u)\big)^{-1} t_\nu\big(T_\nu^{-1}(v)\big)^{-1}}{\nu\pi\,\Gamma(\nu/2)\sqrt{1-\rho^2}} \left(1 + \frac{T_\nu^{-1}(u)^2 + T_\nu^{-1}(v)^2 - 2\rho\,T_\nu^{-1}(u)\,T_\nu^{-1}(v)}{\nu(1-\rho^2)}\right)^{-(\nu+2)/2}$$

where $T_\nu^{-1}$ is the inverse cdf of a Student's t, $t_\nu$ is the pdf of a Student's t, and $T_{\nu,\rho}$ is the bivariate Student's t cdf.

$\rho \in (-1, 1)$, $\nu > 2$

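Joe-Clayton copula [family BB7 in Joe (1997)]

$$C_{JC}(u, v | \tau^U, \tau^L) = 1 - \left(1 - \left\{\left[1 - (1-u)^\kappa\right]^{-\gamma} + \left[1 - (1-v)^\kappa\right]^{-\gamma} - 1\right\}^{-1/\gamma}\right)^{1/\kappa}$$

The density $c_{JC}(u, v | \tau^U, \tau^L)$ is very long and is available from the author upon request.

where $\kappa = [\log_2(2 - \tau^U)]^{-1}$ and $\gamma = [-\log_2(\tau^L)]^{-1}$

$\tau^U \in (0, 1)$, $\tau^L \in (0, 1)$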

Plackett copula

$$C_P(u, v; \pi) = \frac{1}{2(\pi - 1)}\left(1 + (\pi - 1)(u + v) - \sqrt{\big(1 + (\pi - 1)(u + v)\big)^2 - 4\pi(\pi - 1)uv}\right)$$

$$c_P(u, v; \pi) = \frac{\pi\big(1 + (\pi - 1)(u + v - 2uv)\big)}{\Big(\big(1 + (\pi - 1)(u + v)\big)^2 - 4\pi(\pi - 1)uv\Big)^{3/2}}$$

$\pi \in [0, \infty) \setminus \{1\}$


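Frank copula

$$C_F(u, v; \lambda) = -\frac{1}{\lambda} \log\left(\frac{(1 - e^{-\lambda}) - (1 - e^{-\lambda u})(1 - e^{-\lambda v})}{1 - e^{-\lambda}}\right)$$

$$c_F(u, v; \lambda) = \frac{\lambda(1 - e^{-\lambda})\,e^{-\lambda(u+v)}}{\big((1 - e^{-\lambda}) - (1 - e^{-\lambda u})(1 - e^{-\lambda v})\big)^2}$$

$\lambda \in (-\infty, \infty) \setminus \{0\}$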

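Gumbel copula

$$C_G(u, v; \delta) = \exp\left\{-\big((-\log u)^\delta + (-\log v)^\delta\big)^{1/\delta}\right\}$$

$$c_G(u, v; \delta) = \frac{C_G(u, v; \delta)\,(\log u \cdot \log v)^{\delta-1}}{uv\,\big((-\log u)^\delta + (-\log v)^\delta\big)^{2-1/\delta}} \left(\big((-\log u)^\delta + (-\log v)^\delta\big)^{1/\delta} + \delta - 1\right)$$

$\delta \in [1, \infty)$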

Rotated Gumbel copula

$$C_{RG}(u, v; \delta) = u + v - 1 + C_G(1-u, 1-v; \delta)$$

$$c_{RG}(u, v; \delta) = c_G(1-u, 1-v; \delta)$$

$\delta \in [1, \infty)$
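The Gumbel and rotated Gumbel forms above translate directly into code, which gives a convenient way to check them numerically. The sketch below is a plain transcription of those four expressions. The function names are hypothetical, and the inputs are assumed to lie strictly inside the unit interval.

```python
import math

def gumbel_cdf(u, v, delta):
    """Gumbel copula cdf C_G(u, v; delta), delta >= 1."""
    x, y = -math.log(u), -math.log(v)
    return math.exp(-((x ** delta + y ** delta) ** (1.0 / delta)))

def gumbel_pdf(u, v, delta):
    """Gumbel copula density c_G(u, v; delta)."""
    x, y = -math.log(u), -math.log(v)
    s = x ** delta + y ** delta
    return (gumbel_cdf(u, v, delta) * (x * y) ** (delta - 1.0)
            / (u * v * s ** (2.0 - 1.0 / delta))
            * (s ** (1.0 / delta) + delta - 1.0))

def rotated_gumbel_cdf(u, v, delta):
    """Rotated Gumbel cdf: u + v - 1 + C_G(1 - u, 1 - v; delta)."""
    return u + v - 1.0 + gumbel_cdf(1.0 - u, 1.0 - v, delta)

def rotated_gumbel_pdf(u, v, delta):
    """Rotated Gumbel density: c_G(1 - u, 1 - v; delta)."""
    return gumbel_pdf(1.0 - u, 1.0 - v, delta)
```

Two properties are easy to confirm with these functions. At $\delta = 1$ both copulas reduce to independence, $C(u, v) = uv$. For $\delta > 1$ the rotated Gumbel assigns more probability to joint lower-tail events than to joint upper-tail events, for example when comparing `rotated_gumbel_cdf(q, q, delta)` with `1 - 2 * (1 - q) + rotated_gumbel_cdf(1 - q, 1 - q, delta)` for a small q. This is the asymmetry, with greater dependence in downturns, that motivates its use in the article.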

Received October 25, 2002; revised April 22, 2003; accepted October 29, 2003

REFERENCES

Aït-Sahalia, Y., and M. W. Brandt. (2001). "Variable Selection for Portfolio Choice." Journal of Finance 56, 1297–1355.

Ang, A., and G. Bekaert. (2002). "International Asset Allocation with Regime Shifts." Review of Financial Studies 15, 1137–1187.

Ang, A., and J. Chen. (2002). "Asymmetric Correlations of Equity Portfolios." Journal of Financial Economics 63, 443–494.

Ang, A., J. Chen, and Y. Xing. (2002). "Downside Correlation and Expected Returns." Working paper, Columbia Business School.

Arrow, K. J. (1971). Essays in the Theory of Risk Bearing. Chicago: Markham Publishing.

Artzner, P., F. Delbaen, J.-M. Eber, and D. Heath. (1999). "Coherent Measures of Risk." Mathematical Finance 9, 203–228.

Bae, K.-H., A. Karolyi, and R. M. Stulz. (2003). "A New Approach to Measuring Financial Contagion." Forthcoming in the Review of Financial Studies.

Brandt, M. W. (1999). "Estimating Portfolio and Consumption Choice: A Conditional Euler Equations Approach." Journal of Finance 54, 1609–1645.

Campbell, J. Y., and L. M. Viceira. (1999). "Consumption and Portfolio Decisions When Expected Returns are Time Varying." Quarterly Journal of Economics 114, 433–495.


Campbell, R., K. Koedijk, and P. Kofman. (2002). "Increased Correlation in Bear Markets: A Downside Risk Perspective." Financial Analysts Journal 58, 87–94.

Chen, X., and Y. Fan. (2002a). "Evaluating Density Forecasts via the Copula Approach." Working paper, New York University.

Chen, X., and Y. Fan. (2002b). "Semiparametric Estimation of Copula-Based Time Series Models." Working paper, New York University.

Clayton, D. G. (1978). "A Model for Association in Bivariate Life Tables and its Application in Epidemiological Studies of Familial Tendency in Chronic Disease Incidence." Biometrika 65, 141–151.

Cook, R. D., and M. E. Johnson. (1981). "A Family of Distributions for Modelling Non-Elliptically Symmetric Multivariate Data." Journal of the Royal Statistical Society Series B 43, 210–218.

Detemple, J. B., R. Garcia, and M. Rindisbacher. (2003). "A Monte Carlo Method for Optimal Portfolios." Journal of Finance 58, 401–446.

Diebold, F. X., T. A. Gunther, and A. S. Tay. (1998). "Evaluating Density Forecasts, with Applications to Financial Risk Management." International Economic Review 39, 863–883.

Embrechts, P., A. Höing, and A. Juri. (2001). "Using Copulae to Bound the Value-at-Risk for Functions of Dependent Risks." Forthcoming in Finance and Stochastics.

Erb, C. B., C. R. Harvey, and T. E. Viskanta. (1994). "Forecasting International Equity Correlations." Financial Analysts Journal 50, 32–45.

Fama, E. F. (1981). "Stock Returns, Real Activity, Inflation, and Money." American Economic Review 71, 545–565.

Fermanian, J.-D., and O. Scaillet. (2003). "Nonparametric Estimation of Copulas for Time Series." Forthcoming in the Journal of Risk.

Fleming, J., C. Kirby, and B. Ostdiek. (2001). "The Economic Value of Volatility Timing." Journal of Finance 56, 329–352.

Friend, I., and R. Westerfield. (1980). "Co-Skewness and Capital Asset Pricing." Journal of Finance 35, 897–913.

Frost, P. A., and J. E. Savarino. (1988). "For Better Performance: Constrain Portfolio Weights." Journal of Portfolio Management 15, 29–34.

Genest, C., and L.-P. Rivest. (1993). "Statistical Inference Procedures for Bivariate Archimedean Copulas." Journal of the American Statistical Association 88, 1034–1043.

Granger, C. W. J. (1969). "Prediction with a Generalized Cost of Error Function." Operational Research Quarterly 20, 199–207.

Hansen, B. E. (1994). "Autoregressive Conditional Density Estimation." International Economic Review 35, 705–730.

Hansen, P. R. (2001). "An Unbiased and Powerful Test for Superior Predictive Ability." Working Paper 01-06, Brown University.

Harvey, C. R., and A. Siddique. (1999). "Autoregressive Conditional Skewness." Journal of Financial and Quantitative Analysis 34, 465–488.

Harvey, C. R., and A. Siddique. (2000). "Conditional Skewness in Asset Pricing Tests." Journal of Finance 55, 1263–1295.

Huang, C.-F., and R. H. Litzenberger. (1988). Foundations for Financial Economics. Englewood Cliffs, NJ: Prentice-Hall.

Ingersoll, J. E. Jr. (1987). Theory of Financial Decision Making. Baltimore: Rowman and Littlefield.

Jagannathan, R., and T. Ma. (2002). "Risk Reduction in Large Portfolios: Why Imposing the Wrong Constraints Helps." Forthcoming in the Journal of Finance.


Joe, H. (1997). Multivariate Models and Dependence Concepts. Monographs on Statistics and Applied Probability 73. London: Chapman and Hall.

Jondeau, E., and M. Rockinger. (2003). "Conditional Volatility, Skewness, and Kurtosis: Existence, Persistence and Comovements." Forthcoming in the Journal of Economic Dynamics and Control.

Judd, K. L. (1998). Numerical Methods in Economics. Cambridge, MA: MIT Press.

Junker, M., and A. May. (2002). "Measurement of Aggregate Risk with Copulas." Working paper, Research Center Caesar, Financial Engineering, Bonn.

Kandel, S., and R. F. Stambaugh. (1996). "On the Predictability of Stock Returns: An Asset Allocation Perspective." Journal of Finance 51, 385–424.

Kraus, A., and R. H. Litzenberger. (1976). "Skewness Preference and the Valuation of Risk Assets." Journal of Finance 31, 1085–1100.

Li, D. X. (2000). "On Default Correlation: A Copula Function Approach." Journal of Fixed Income 9, 43–54.

Lim, K.-G. (1989). "A New Test of the Three-Moment Capital Asset Pricing Model." Journal of Financial and Quantitative Analysis 24, 205–216.

Longin, F., and B. Solnik. (2001). "Extreme Correlation of International Equity Markets." Journal of Finance 56, 649–676.

Mashal, R., and A. Zeevi. (2002). "Beyond Correlation: Extreme Co-movements Between Financial Assets." Working paper, Columbia Graduate School of Business.

Merton, R. C. (1971). "Optimal Consumption and Portfolio Rules in a Continuous-Time Model." Journal of Economic Theory 3, 373–413.

Miller, D. J., and W.-H. Liu. (2002). "On the Recovery of Joint Distributions from Limited Information." Journal of Econometrics 107, 259–274.

Nelsen, R. B. (1999). An Introduction to Copulas. New York: Springer-Verlag.

Newey, W. K., and K. D. West. (1987). "A Simple, Positive Semidefinite, Heteroskedasticity and Autocorrelation Consistent Covariance Matrix." Econometrica 55, 703–708.

Oakes, D. (1989). "Bivariate Survival Models Induced by Frailties." Journal of the American Statistical Association 84, 487–493.

Patton, A. J. (2001a). "Estimation of Copula Models for Time Series of Possibly Different Lengths." Working paper 2001-17, University of California, San Diego.

Patton, A. J. (2001b). "Modelling Time-Varying Exchange Rate Dependence Using the Conditional Copula." Working paper 2001-09, University of California, San Diego.

Patton, A. J. (2002). "Applications of Copula Theory in Financial Econometrics." Unpublished Ph.D. dissertation, University of California, San Diego.

Peiró, A. (1999). "Skewness in Financial Returns." Journal of Banking and Finance 23, 847–862.

Perez-Quiros, G., and A. Timmermann. (2001). "Business Cycle Asymmetries in Stock Returns: Evidence from Higher Order Moments and Conditional Densities." Journal of Econometrics 103, 259–306.

Pesaran, M. H., and A. Timmermann. (2002). "Market Timing and Return Predictability Under Model Instability." Forthcoming in the Journal of Empirical Finance.

Politis, D. N., and J. P. Romano. (1994). "The Stationary Bootstrap." Journal of the American Statistical Association 89, 1303–1313.

Richardson, M., and T. Smith. (1993). "A Test for Multivariate Normality in Stock Returns." Journal of Business 66, 295–321.


Rockinger, M., and E. Jondeau. (2001). "Conditional Dependency of Financial Series: An Application of Copulas." Working paper, HEC School of Management, France.

Rosenberg, J. V. (2003). "Nonparametric Pricing of Multivariate Contingent Claims." Journal of Derivatives 10, 9–26.

Schweizer, B., and A. Sklar. (1983). Probabilistic Metric Spaces. New York: Elsevier Science.

Singleton, J. C., and J. Wingender. (1986). "Skewness Persistence in Common Stock Returns." Journal of Financial and Quantitative Analysis 21, 335–341.

Sklar, A. (1959). "Fonctions de répartition à n dimensions et leurs marges." Publications de l'Institut Statistique de l'Université de Paris 8, 229–231.

Skouras, S. (2001). "Decisionmetrics: A Decision-Based Approach to Econometric Modelling." Working paper 01-11-064, Santa Fe Institute.

Stock, J. H., and M. W. Watson. (1999). "A Comparison of Linear and Nonlinear Univariate Models for Forecasting Macroeconomic Time Series." In R. F. Engle and H. White (eds.), Cointegration, Causality, and Forecasting: A Festschrift in Honor of Clive W. J. Granger. Oxford: Oxford University Press.

Swanson, N. R., and H. White. (1995). "A Model Selection Approach to Assessing the Information in the Term Structure Using Linear Models and Artificial Neural Networks." Journal of Business and Economic Statistics 13, 265–275.

Swanson, N. R., and H. White. (1997). "A Model Selection Approach to Real-Time Macroeconomic Forecasting Using Linear Models and Artificial Neural Networks." Review of Economics and Statistics 79, 540–550.

Weigend, A. S., and N. A. Gershenfeld. (1994). Time Series Prediction: Forecasting the Future and Understanding the Past. Boston: Addison-Wesley.

Weiss, A. A. (1996). "Estimating Time Series Models Using the Relevant Cost Function." Journal of Applied Econometrics 11, 539–560.

West, K. D., H. J. Edison, and D. Cho. (1993). "A Utility-Based Comparison of Some Models of Exchange Rate Volatility." Journal of International Economics 35, 23–45.

White, H. (2000). "A Reality Check for Data Snooping." Econometrica 68, 1097–1126.

