
CONDITIONAL VALUE-AT-RISK

FOR GENERAL LOSS DISTRIBUTIONS

RESEARCH REPORT #2001-5

R. Tyrrell Rockafellar1 and Stanislav Uryasev2

Risk Management and Financial Engineering Lab

Center for Applied Optimization

Dept. of Industrial and Systems Engineering

University of Florida, Gainesville, FL 32611

Date: April 4, 2001

Revised: July 3, 2001

Correspondence should be addressed to: Stanislav Uryasev

Abstract. Fundamental properties of conditional value-at-risk, as a measure of risk with significant advantages over value-at-risk, are derived for loss distributions in finance that can involve discreteness. Such distributions are of particular importance in applications because of the prevalence of models based on scenarios and finite sampling. Conditional value-at-risk is able to quantify dangers beyond value-at-risk, and moreover it is coherent. It provides optimization shortcuts which, through linear programming techniques, make practical many large-scale calculations that could otherwise be out of reach. The numerical efficiency and stability of such calculations, shown in several case studies, are illustrated further with an example of index tracking.

Key Words: Value-at-risk, conditional value-at-risk, mean shortfall, coherent risk measures,

risk sampling, scenarios, hedging, index tracking, portfolio optimization, risk management

1 University of Washington, Department of Mathematics, Box 354350, Seattle, WA 98195-4350; e-mail: [email protected]

2 University of Florida, Department of Industrial and Systems Engineering, PO Box 116595, Gainesville, FL 32611-6595; e-mail: [email protected]; URL: http://www.ise.ufl.edu/uryasev

1 Introduction

Measures of risk have a crucial role in optimization under uncertainty, especially in coping with the losses that might be incurred in finance or the insurance industry. Loss can be envisioned as a function $z = f(x, y)$ of a decision vector $x \in X \subset \mathbb{R}^n$ representing what we may generally call a portfolio, with $X$ expressing decision constraints, and a vector $y \in Y \subset \mathbb{R}^m$ representing the future values of a number of variables like interest rates or weather data. When $y$ is taken to be random with known probability distribution, $z$ comes out as a random variable having its distribution dependent on the choice of $x$. Any optimization problem involving $z$ in terms of the choice of $x$ should then take into account not just expectations, but also the "riskiness" of $x$.

Value-at-risk, or VaR for short, is a popular measure of risk which has achieved the high status of being written into industry regulations. It suffers, however, from being unstable and difficult to work with numerically when losses are not "normally" distributed, which in fact is often the case, because loss distributions tend to exhibit "fat tails" or empirical discreteness. Moreover, VaR fails to be coherent in the sense of Artzner, Delbaen, Eber and Heath [6]. A very serious shortcoming of VaR, in addition, is that it provides no handle on the extent of the losses that might be suffered beyond the amount indicated by this measure. It is incapable of distinguishing between situations where losses that are worse may be deemed only a little bit worse, and those where they could well be overwhelming. Indeed, it merely provides a lowest bound for losses in the tail of the loss distribution and has a bias toward optimism instead of the conservatism that ought to prevail in risk management.

An alternative measure that does quantify the losses that might be encountered in the tail is conditional value-at-risk, or CVaR. As a tool in optimization modeling, CVaR has superior properties in many respects. It maintains consistency with VaR by yielding the same results in the limited settings where VaR computations are tractable, i.e., for normal distributions (or perhaps "elliptical" distributions as in [17]); for portfolios blessed with such simple distributions, working with CVaR, VaR, or minimum variance [29] are equivalent (cf. [39]). Most importantly for applications, however, CVaR can be expressed by a remarkable minimization formula. This formula can readily be incorporated into problems of optimization with respect to $x \in X$ that are designed to minimize risk or shape it within bounds. Significant shortcuts are thereby achieved while preserving crucial problem features like convexity.

Such computational advantages of CVaR over VaR are turning into a major stimulus for the development of CVaR methodology, in view of the fact that efficient algorithms for optimization of VaR in high-dimensional settings are still not available, despite the substantial efforts that have gone into research in that direction [3, 7, 20, 21, 22, 26, 37, 43].

CVaR and its minimization formula were first developed in our paper [39]. There, we demonstrated numerical effectiveness through several case studies, including portfolio optimization and options hedging. In follow-up work in [34], investigations were carried out with the minimization of CVaR subject to a constraint on expected return, the maximization of return subject to a constraint on the CVaR, and the maximization of a utility function that balances CVaR against return. Strategies for investigating the efficient frontier between CVaR and return were considered as well. In [4], the approach was applied to credit risk management of a portfolio of bonds. Extensions in [12] have centered on a closely related notion of CDaR, conditional drawdown-at-risk, in the optimization of portfolios with drawdown constraints.

In these works, with their focus on demonstrating the potential of the new approach, discussion of CVaR in its full generality was postponed. Only continuous loss distributions were treated, and in fact, for the sake of an elementary initial justification of the minimization formula so as to get started with using it, distributions were assumed to have smooth density. In the present paper we drop those limitations and complete the foundations for our methodology. This step is needed of course not just for theory, but because many problems of optimization under uncertainty involve discontinuous loss distributions in which the discrete probabilities come out of scenario models or the finite sampling of random variables. While some consequences of our minimization formula itself have since been explored by Pflug [32] in territory outside of the assumptions we made in [39], an understanding of what the quantity given by the formula then represents in the usual framework of risk measures in finance has been missing.

For continuous loss distributions, the CVaR at a given confidence level is the expected loss given that the loss is greater than the VaR at that level, or for that matter, the expected loss given that the loss is greater than or equal to the VaR. For distributions with possible discontinuities, however, it has a more subtle definition and can differ from either of those quantities, which for convenience in comparison can be designated by CVaR$^+$ and CVaR$^-$, respectively. CVaR$^+$ has sometimes been called "mean shortfall" (cf. [31], although the seemingly identical term "expected shortfall" has been interpreted in other ways in [1] and [2], with the latter paper taking it as a synonym for CVaR itself), while "tail VaR" is a term that has been suggested for CVaR$^-$ (cf. [6]). Here, in order to consolidate ideas and reduce the potential for confusion, we speak of CVaR$^+$ and CVaR$^-$ simply as "upper" and "lower" CVaR. Generally CVaR$^-$ $\le$ CVaR $\le$ CVaR$^+$, with equality holding when the loss distribution function does not have a jump at the VaR threshold; but when a jump does occur, which for scenario models is always the situation, both inequalities can be strict.

On the basis of the general definition of CVaR elucidated below, and with the help of arguments in [32], CVaR is seen to be a coherent measure of risk in the sense of [6], whereas CVaR$^+$ and CVaR$^-$ are not. (A direct alternative proof of this fact has very recently been furnished by Acerbi, Nordio and Sirtori [1].) The lack of coherence of CVaR$^+$ and CVaR$^-$ in the presence of discreteness does not seem to be widely appreciated, although this shortcoming was already noted for CVaR$^-$ by the authors of [6]. They suggested, as a remedy, still another measure of risk which they called "worst conditional expectation" and proved to be coherent. That measure is impractical for applications, however, because it can only be calculated in very narrow circumstances. In contrast, CVaR is not only coherent but eminently practical by virtue of our minimization formula for it. That formula opens the door to computational techniques for dealing with risk far more effectively than before.

Interestingly, CVaR can be viewed as a weighted average of VaR and CVaR$^+$ (with the weights depending, like these values themselves, on the decision $x$). This seems surprising, in the face of neither VaR nor CVaR$^+$ being coherent. The weights arise from the particular way that CVaR "splits the atom" of probability at the VaR value, when one exists.

Besides laying out such implications of the general definition of CVaR and its associated minimization formula, we put effort here into bringing out properties of CVaR that enhance the usefulness of this approach when dealing with fully discrete distributions. For such distributions, we furnish an elementary way of calculating CVaR directly. We show how a suitable specification of the confidence level, depending on the finite, discrete distribution of $y$, can ensure that CVaR = CVaR$^+$ regardless of the choice of $x$. For confidence levels close enough to 1, we prove that CVaR, CVaR$^-$ and VaR coincide with maximum loss, and again this can be ensured independently of $x$.

We go over the optimization shortcuts offered by CVaR and extend them to models where risk is shaped at several confidence levels. As part of this, CVaR is proved to be stable with respect to the choice of the confidence level, although other proposed measures of risk are not. Finally, we illustrate the main facts and ideas with a numerical example of portfolio replication with CVaR constraints. This example demonstrates how the incorporation of such constraints in a financial model may improve both the in-sample and the out-of-sample risk characteristics. The calculations confirm that CVaR methodology offers a management tool for efficiently controlling risks in practice.


Broadly speaking, problems of risk management with VaR and CVaR can be classified as falling under the heading of stochastic optimization. Various other concepts of risk in optimization have earlier been studied in the stochastic programming literature, but not in a context of finance; see [10, 19, 24, 25, 33, 41, 36]. The reader interested in applications of stochastic optimization techniques in the finance area can find relevant papers in [46, 47].

For elucidation of the many statements in this paper that rely on background in convex

optimization, we refer the reader to the book [38] (or [40]).

Additional properties of CVaR, including a powerful result on estimation, are available in the

new paper of Acerbi and Tasche [2].

2 General Concept of CVaR

In everything that follows, we suppose the random vector $y$ is governed by a probability measure $P$ on $Y$ (a Borel measure) that is independent of $x$. (The independence could be relaxed for some purposes, but it is essential for key results about convexity that underlie the use of linear programming reductions in computation.) For each $x$, we denote by $\Psi(x, \cdot)$ on $\mathbb{R}$ the resulting distribution function for the loss $z = f(x, y)$, i.e.,

$$\Psi(x, \zeta) = P\{\, y \mid f(x, y) \le \zeta \,\}, \tag{1}$$

making the technical assumptions that $f(x, y)$ is continuous in $x$ and measurable in $y$, and that $E\{|f(x, y)|\} < \infty$ for each $x \in X$. We denote by $\Psi(x, \zeta^-)$ the left limit of $\Psi(x, \cdot)$ at $\zeta$; thus

$$\Psi(x, \zeta^-) = P\{\, y \mid f(x, y) < \zeta \,\}. \tag{2}$$

When the difference

$$\Psi(x, \zeta) - \Psi(x, \zeta^-) = P\{\, y \mid f(x, y) = \zeta \,\} \tag{3}$$

is positive, so that $\Psi(x, \cdot)$ has a jump at $\zeta$, a probability "atom" is said to be present at $\zeta$.

We consider a confidence level $\alpha \in (0, 1)$, which in applications would be something like $\alpha = .95$ or $\alpha = .99$. At this confidence level, there is a corresponding value-at-risk, defined in the following way.

Definition 1 (VaR). The $\alpha$-VaR of the loss associated with a decision $x$ is the value

$$\zeta_\alpha(x) = \min\{\, \zeta \mid \Psi(x, \zeta) \ge \alpha \,\}. \tag{4}$$


The minimum in (4) is attained because $\Psi(x, \zeta)$ is nondecreasing and right-continuous in $\zeta$. When $\Psi(x, \cdot)$ is continuous and strictly increasing, $\zeta_\alpha(x)$ is simply the unique $\zeta$ satisfying $\Psi(x, \zeta) = \alpha$. Otherwise, this equation can have no solution or a whole range of solutions.

The case of no solution corresponds to a vertical gap in the graph of $\Psi(x, \cdot)$ as in Figure 1, with $\alpha$ lying in an interval of confidence levels that all yield the same VaR. The lower and upper endpoints of that interval are

$$\alpha^-(x) = \Psi(x, \zeta_\alpha(x)^-), \qquad \alpha^+(x) = \Psi(x, \zeta_\alpha(x)). \tag{5}$$

The case of a whole range of solutions corresponds instead to a constant segment of the graph, as shown in Figure 2. The solutions form an interval having $\zeta_\alpha(x)$ as its lower endpoint. The upper endpoint of the interval is the value $\zeta_\alpha^+(x)$ introduced next.

Definition 2 (VaR$^+$). The $\alpha$-VaR$^+$ ("upper" $\alpha$-VaR) of the loss associated with a decision $x$ is the value

$$\zeta_\alpha^+(x) = \inf\{\, \zeta \mid \Psi(x, \zeta) > \alpha \,\}. \tag{6}$$

Obviously $\zeta_\alpha(x) \le \zeta_\alpha^+(x)$ always, and these values are the same except when $\Psi(x, \zeta)$ is constant at level $\alpha$ over a certain $\zeta$-interval. That interval is either $[\zeta_\alpha(x), \zeta_\alpha^+(x))$ or $[\zeta_\alpha(x), \zeta_\alpha^+(x)]$, depending on whether or not $\Psi(x, \cdot)$ has a jump at $\zeta_\alpha^+(x)$.
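For a finite loss distribution, Definitions 1 and 2 can be evaluated by a single pass over the sorted loss values. The following is a minimal sketch under that assumption; the function name `var_and_upper_var` and the four-point example distribution are illustrative and not part of the paper. Exact rational arithmetic is used so that the comparisons with the confidence level are not disturbed by rounding.

```python
from fractions import Fraction as Fr

def var_and_upper_var(losses, probs, alpha):
    """Return (VaR, VaR+) of a finite loss distribution at level alpha.

    VaR  = min{z : Psi(z) >= alpha}   (Definition 1)
    VaR+ = inf{z : Psi(z) >  alpha}   (Definition 2)
    where Psi is the cumulative distribution function of the loss.
    """
    cum = Fr(0)
    var = None
    for z, p in sorted(zip(losses, probs)):
        cum += p
        if var is None and cum >= alpha:
            var = z                      # first point where Psi reaches alpha
        if cum > alpha:
            return var, z                # first point where Psi exceeds alpha
    raise ValueError("alpha must lie in (0, 1) and probs must sum to 1")

losses = [1, 2, 3, 4]
probs = [Fr(1, 4)] * 4

# alpha = 1/2 sits on a flat segment of Psi (the situation of Figure 2)
print(var_and_upper_var(losses, probs, Fr(1, 2)))   # VaR = 2, VaR+ = 3
# alpha = 3/5 falls inside a jump of Psi (the situation of Figure 1)
print(var_and_upper_var(losses, probs, Fr(3, 5)))   # VaR = VaR+ = 3
```

In the first call the two values differ because the distribution function is constant at level 1/2 on the interval from 2 to 3; in the second they agree.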

Both Figure 1 and Figure 2 illustrate phenomena that raise challenges in the treatment of general loss distributions. This is especially true for discrete distributions associated with finite sampling or scenario modeling, since then $\Psi(x, \cdot)$ is a step function (constant between jumps), and there is no getting around these circumstances.

Observe, for instance, that the situation in Figure 2 entails a discontinuity in the behavior of VaR: a jump is sure to occur if a slightly higher confidence level is demanded. This degree of instability is distressing for a measure of risk on which enormous sums of money might be riding. Furthermore, although $x$ is fixed in this picture, examples easily show that the misbehavior in the dependence of VaR on $\alpha$ can affect its dependence on $x$ as well. That makes it hard to cope successfully with VaR-centered problems of optimization in $x$.

These troubles, and many others, motivate the search for a better measure of risk than value-at-risk for practical applications. Such a measure is conditional value-at-risk.

Definition 3 (CVaR). The $\alpha$-CVaR of the loss associated with a decision $x$ is the value

$$\phi_\alpha(x) = \text{mean of the } \alpha\text{-tail distribution of } z = f(x, y), \tag{7}$$

where the distribution in question is the one with distribution function $\Psi_\alpha(x, \cdot)$ defined by

$$\Psi_\alpha(x, \zeta) = \begin{cases} 0 & \text{for } \zeta < \zeta_\alpha(x), \\ [\Psi(x, \zeta) - \alpha]/[1 - \alpha] & \text{for } \zeta \ge \zeta_\alpha(x). \end{cases} \tag{8}$$

Figure 1: Equation $\Psi(x, \zeta) = \alpha$ has no solution in $\zeta$.

Figure 2: Equation $\Psi(x, \zeta) = \alpha$ has many solutions in $\zeta$.

Figure 3: Distribution function $\Psi_\alpha(x, \zeta)$ is obtained by rescaling the function $\Psi(x, \zeta)$ in the interval $[\alpha, 1]$.

Note that $\Psi_\alpha(x, \cdot)$ truly is another distribution function, like $\Psi(x, \cdot)$: it is nondecreasing and right-continuous, with $\Psi_\alpha(x, \zeta) \to 1$ as $\zeta \to \infty$. The $\alpha$-tail distribution referred to in (7) is thus well defined through (8).
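To make (7) and (8) concrete for a finite loss distribution, the sketch below lists the atoms of the $\alpha$-tail distribution and takes their mean. It relies on the observation that (8) can be written as $[\max\{\Psi, \alpha\} - \alpha]/[1 - \alpha]$; the function name `tail_distribution` and the example data are illustrative assumptions, not taken from the paper.

```python
from fractions import Fraction as Fr

def tail_distribution(losses, probs, alpha):
    """Atoms (loss, mass) of the alpha-tail distribution defined through (8)."""
    atoms, cum_prev = [], Fr(0)
    for z, p in sorted(zip(losses, probs)):
        cum = cum_prev + p
        # Psi_alpha equals (max(Psi, alpha) - alpha) / (1 - alpha),
        # so its jump at z is the difference of that expression across z.
        mass = (max(cum, alpha) - max(cum_prev, alpha)) / (1 - alpha)
        if mass > 0:
            atoms.append((z, mass))
        cum_prev = cum
    return atoms

atoms = tail_distribution([1, 2, 3, 4], [Fr(1, 4)] * 4, Fr(3, 5))
print(atoms)                            # the atom at 3 keeps only part of its mass
print(sum(z * m for z, m in atoms))     # mean of the tail distribution: 29/8
```

With four equally likely losses and $\alpha = 3/5$, the atom at the VaR value 3 carries tail mass 3/8 and the atom at 4 carries 5/8, so the tail masses sum to 1 as a distribution should.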

The subtlety of Definition 3 resides in the case where the loss with distribution function $\Psi(x, \cdot)$ has a probability atom at $\zeta_\alpha(x)$. In that case the interval $[\zeta_\alpha(x), \infty)$ has probability greater than $1 - \alpha$, inasmuch as

$$\Psi(x, \zeta_\alpha(x)^-) < \alpha \le \Psi(x, \zeta_\alpha(x)) \quad \text{when} \quad \Psi(x, \zeta_\alpha(x)^-) < \Psi(x, \zeta_\alpha(x)), \tag{9}$$

and the issue comes up of what really should be meant by the $\alpha$-tail distribution, since that term presumably ought to refer to the "upper $1 - \alpha$ part" of the full distribution. This is resolved by specifying the $\alpha$-tail distribution through the distribution function in (8), which is obtained by rescaling the portion of the graph of the original distribution between the horizontal lines at levels $\alpha$ and 1 so that it spans instead between 0 and 1. For the case shown in Figure 1, the result is depicted in Figure 3.

The consequences of this maneuver will be examined in relation to the following variants in which the whole interval $[\zeta_\alpha(x), \infty)$ or its interior $(\zeta_\alpha(x), \infty)$ are the focus.

Definition 4 (CVaR$^+$ and CVaR$^-$). The $\alpha$-CVaR$^+$ ("upper" $\alpha$-CVaR) of the loss associated with a decision $x$ is the value

$$\phi_\alpha^+(x) = E\{\, f(x, y) \mid f(x, y) > \zeta_\alpha(x) \,\}, \tag{10}$$

whereas the $\alpha$-CVaR$^-$ ("lower" $\alpha$-CVaR) of the loss is the value

$$\phi_\alpha^-(x) = E\{\, f(x, y) \mid f(x, y) \ge \zeta_\alpha(x) \,\}. \tag{11}$$

The conditional expectation in (11) is well defined because $P\{\, y \mid f(x, y) \ge \zeta_\alpha(x) \,\} \ge 1 - \alpha > 0$, but the one in (10) only makes sense as long as $P\{\, y \mid f(x, y) > \zeta_\alpha(x) \,\} > 0$, i.e., $\Psi(x, \zeta_\alpha(x)) < 1$, which is not assured merely through our assumption that $\alpha \in (0, 1)$, since there might be a probability atom at $\zeta_\alpha(x)$.

As indicated in the introduction, (10) is sometimes called "mean shortfall". The closely related expression

$$E\{\, f(x, y) - \zeta_\alpha(x) \mid f(x, y) > \zeta_\alpha(x) \,\} = \phi_\alpha^+(x) - \zeta_\alpha(x) \tag{12}$$

goes however by the name of "mean excess loss"; cf. [8], [18]. In ordinary language, a shortfall might be thought the same as an excess loss, so "mean shortfall" for (10) potentially poses a conflict. The conditional expectation in (11) has been dubbed in [6] the "tail VaR" at level $\alpha$, but as revealed in the proof of the next proposition, it is really the mean of the tail distribution for the confidence level $\alpha^-(x)$ in (5) rather than the one appropriate to $\alpha$ itself. The "upper" and "lower" terminology in Definition 4 avoids such difficulties while emphasizing the basic relationships among these values that are described next.

Proposition 5 (basic CVaR relations). If there is no probability atom at $\zeta_\alpha(x)$, one simply has

$$\phi_\alpha^-(x) = \phi_\alpha(x) = \phi_\alpha^+(x). \tag{13}$$

If a probability atom does exist at $\zeta_\alpha(x)$, one has

$$\phi_\alpha^-(x) < \phi_\alpha(x) = \phi_\alpha^+(x) \quad \text{when} \quad \alpha = \Psi(x, \zeta_\alpha(x)), \tag{14}$$

or on the other hand,

$$\phi_\alpha^-(x) = \phi_\alpha(x) \quad \text{when} \quad \Psi(x, \zeta_\alpha(x)) = 1 \tag{15}$$

(with $\phi_\alpha^+(x)$ then being ill defined). But in all the remaining cases, characterized by

$$\Psi(x, \zeta_\alpha(x)^-) < \alpha < \Psi(x, \zeta_\alpha(x)) < 1, \tag{16}$$

one has the strict inequality

$$\phi_\alpha^-(x) < \phi_\alpha(x) < \phi_\alpha^+(x). \tag{17}$$

Proof. In comparison with the definition of $\phi_\alpha(x)$ in (7), the $\phi_\alpha^+(x)$ value in (10) is the mean of the loss distribution associated with

$$\Psi_\alpha^+(x, \zeta) = \begin{cases} 0 & \text{for } \zeta < \zeta_\alpha(x), \\ [\Psi(x, \zeta) - \alpha^+(x)]/[1 - \alpha^+(x)] & \text{for } \zeta \ge \zeta_\alpha(x), \end{cases} \tag{18}$$

whereas the $\phi_\alpha^-(x)$ value in (11) is the mean of the loss distribution associated with

$$\Psi_\alpha^-(x, \zeta) = \begin{cases} 0 & \text{for } \zeta < \zeta_\alpha(x), \\ [\Psi(x, \zeta) - \alpha^-(x)]/[1 - \alpha^-(x)] & \text{for } \zeta \ge \zeta_\alpha(x). \end{cases} \tag{19}$$

Recall that $\alpha^+(x)$ and $\alpha^-(x)$, defined in (5), mark the top and bottom of the vertical gap at $\zeta_\alpha(x)$ for the original distribution function $\Psi(x, \cdot)$ (if a jump occurs there).

The case of there being no probability atom at $\zeta_\alpha(x)$ corresponds to having $\alpha^-(x) = \alpha^+(x) = \alpha \in (0, 1)$. Then (13) obviously holds, because the distribution functions in (8), (18) and (19) are identical. When a probability atom exists, but $\alpha = \alpha^+(x)$, we get $\alpha^-(x) < \alpha^+(x) < 1$ and thus the relations in (14), while if $\alpha^+(x) = 1$ we nevertheless get (15) through (9). Under the alternative of (16), however, it is clear from the definitions of the distribution functions in (8), (18) and (19) that the strict inequalities in (17) prevail.
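The three cases of Proposition 5 can be checked numerically on a small discrete distribution. The sketch below is illustrative only: the helper `cvar_variants` is a hypothetical name, the loss values are assumed to be distinct, and exact fractions are used so that the case $\alpha = \Psi(x, \zeta_\alpha(x))$ is detected without rounding error.

```python
from fractions import Fraction as Fr

def cvar_variants(losses, probs, alpha):
    """Return (CVaR-, CVaR, CVaR+) for a finite loss distribution.

    Assumes distinct loss values. CVaR+ is None when no loss exceeds VaR,
    i.e. when the conditional expectation in (10) is ill defined.
    """
    pairs = sorted(zip(losses, probs))
    cum = Fr(0)
    for z, p in pairs:
        cum += p
        if cum >= alpha:
            var, psi_at_var = z, cum        # VaR and Psi(x, VaR) = alpha^+(x)
            break
    upper = [(z, p) for z, p in pairs if z > var]
    lower = [(z, p) for z, p in pairs if z >= var]

    def mean(part):
        return sum(z * p for z, p in part) / sum(p for _, p in part)

    cvar_minus = mean(lower)                               # (11)
    cvar_plus = mean(upper) if upper else None             # (10)
    # (7)-(8): the atom at VaR contributes only the share psi_at_var - alpha
    cvar = ((psi_at_var - alpha) * var + sum(z * p for z, p in upper)) / (1 - alpha)
    return cvar_minus, cvar, cvar_plus

losses, probs = [1, 2, 3, 4], [Fr(1, 4)] * 4
print(cvar_variants(losses, probs, Fr(3, 5)))   # case (16)-(17): 7/2 < 29/8 < 4
print(cvar_variants(losses, probs, Fr(3, 4)))   # case (14): 7/2 < 4 = 4
print(cvar_variants(losses, probs, Fr(4, 5)))   # case (15): 4 = 4, CVaR+ ill defined
```

Every loss value is an atom in this example, so case (13) does not arise; it would require a distribution function that is continuous at the VaR value.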

For the situation in Figure 1, the distribution functions in (18) and (19) that have $\phi_\alpha^+(x)$ and $\phi_\alpha^-(x)$ as their means are illustrated in Figures 4 and 5. They are the tail distributions for the confidence levels $\alpha^+(x)$ and $\alpha^-(x)$.

Figure 4: Distribution function $\Psi_\alpha^+(x, \zeta)$ is obtained by rescaling the function $\Psi(x, \zeta)$ in the interval $[\alpha^+(x), 1]$.

Figure 5: Distribution function $\Psi_\alpha^-(x, \zeta)$ is obtained by rescaling the function $\Psi(x, \zeta)$ in the interval $[\alpha^-(x), 1]$.

Proposition 5 confirms, in the case in (13), that $\alpha$-CVaR thoroughly reduces for continuous loss distributions (i.e., ones without any probability atoms induced by discreteness) to the more elementary expressions for conditional value-at-risk that we worked with in [39]. An important task ahead will be to demonstrate that the minimization formula we developed in [39], which is vital to the feasibility of practical applications of CVaR in risk management, carries over from that special context to the present one.

The $\alpha$-CVaR and the $\alpha$-CVaR$^+$ of the loss coincide often, but not always, according to Proposition 5. Another perspective on the connection between these two values is developed next.

Proposition 6 (CVaR as a weighted average). Let $\lambda_\alpha(x)$ be the probability assigned to the loss amount $z = \zeta_\alpha(x)$ by the $\alpha$-tail distribution in Definition 3, namely

$$\lambda_\alpha(x) = [\Psi(x, \zeta_\alpha(x)) - \alpha]/[1 - \alpha] \in [0, 1]. \tag{20}$$

If $\Psi(x, \zeta_\alpha(x)) < 1$, so there is a chance of a loss greater than $\zeta_\alpha(x)$, then

$$\phi_\alpha(x) = \lambda_\alpha(x)\,\zeta_\alpha(x) + [1 - \lambda_\alpha(x)]\,\phi_\alpha^+(x) \tag{21}$$

with $\lambda_\alpha(x) < 1$, whereas if $\Psi(x, \zeta_\alpha(x)) = 1$, so $\zeta_\alpha(x)$ is the highest loss that can occur (and thus $\lambda_\alpha(x) = 1$ but $\phi_\alpha^+(x)$ is ill defined), then

$$\phi_\alpha(x) = \zeta_\alpha(x). \tag{22}$$

Proof. These relations are evident from formulas (7) and (8), together with the observation that $\alpha \le \Psi(x, \zeta_\alpha(x))$ always by Definition 1.

Corollary 7 (CVaR over VaR). From its definition, $\alpha$-CVaR dominates $\alpha$-VaR: $\phi_\alpha(x) \ge \zeta_\alpha(x)$. Indeed, $\phi_\alpha(x) > \zeta_\alpha(x)$ unless there is no chance of a loss greater than $\zeta_\alpha(x)$.

Proof. This was more or less clear from the beginning, but now it emerges explicitly from Proposition 6 and the fact, seen through (12), that $\phi_\alpha^+(x) > \zeta_\alpha(x)$.

In representing CVaR as a certain weighted average of VaR and CVaR$^+$, formula (21) seems surprising. Neither VaR nor CVaR$^+$ behaves well as a measure of risk for general loss distributions, and yet CVaR has many advantageous properties, to be seen in what follows.
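The weighted-average representation in (20)-(22) is easy to evaluate for a finite distribution. The sketch below computes the weight, VaR, CVaR$^+$ and the resulting CVaR; the name `weighted_average_cvar` is hypothetical, and the losses are assumed to be listed in increasing order with positive probabilities summing to 1.

```python
from fractions import Fraction as Fr

def weighted_average_cvar(z, p, alpha):
    """Return (lambda, VaR, CVaR+, CVaR) following (20)-(22).

    z: distinct loss values in increasing order; p: their probabilities.
    """
    cum = Fr(0)
    for k, pk in enumerate(p):
        cum += pk
        if cum >= alpha:                 # cum is now Psi(x, VaR)
            break
    var = z[k]
    lam = (cum - alpha) / (1 - alpha)                     # weight in (20)
    upper_mass = 1 - cum                                  # P{loss > VaR}
    if upper_mass == 0:
        return lam, var, None, var                        # (22): CVaR = VaR
    cvar_plus = sum(zj * pj for zj, pj in zip(z[k + 1:], p[k + 1:])) / upper_mass
    cvar = lam * var + (1 - lam) * cvar_plus              # (21)
    return lam, var, cvar_plus, cvar

z, p = [1, 2, 3, 4], [Fr(1, 4)] * 4
print(weighted_average_cvar(z, p, Fr(3, 5)))   # lambda = 3/8, CVaR = 3/8 * 3 + 5/8 * 4 = 29/8
print(weighted_average_cvar(z, p, Fr(4, 5)))   # VaR is the maximum loss, so CVaR = VaR = 4
```

In the first call CVaR lies strictly between VaR and CVaR$^+$, in line with Corollary 7.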

The unusual feature in the definition of CVaR that leads to its power is the way that probability atoms, if present, can be "split". Such splitting is highlighted in formulas (20) and (21) of Proposition 6. In the notation of $\alpha^+(x)$ and $\alpha^-(x)$ in (5) and the circumstances in (16), where $\alpha^-(x) < \alpha < \alpha^+(x)$, an atom at $\zeta_\alpha(x)$ having total probability $\alpha^+(x) - \alpha^-(x)$ is effectively split into two pieces with probabilities $\alpha^+(x) - \alpha$ and $\alpha - \alpha^-(x)$, respectively. In concept, only the first of these pieces is adjoined to the interval $(\zeta_\alpha(x), \infty)$, which itself has probability $1 - \alpha^+(x)$, so as to achieve a probability of $[1 - \alpha^+(x)] + [\alpha^+(x) - \alpha] = 1 - \alpha$, whereas, if the atom could not be split, we would have to choose between the intervals $[\zeta_\alpha(x), \infty)$ and $(\zeta_\alpha(x), \infty)$, neither of which actually has probability $1 - \alpha$.

The splitting of probability atoms in this manner also stabilizes the response of $\alpha$-CVaR to shifts in $\alpha$. This will be shown later in Proposition 13.
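As a small numerical illustration of this stabilizing effect (not a substitute for the proof), the sketch below evaluates VaR and CVaR at confidence levels just below, at, and just above a level where the distribution function is flat. It uses the scenario formulas stated in Proposition 8; the function name `var_cvar`, the example data and the perturbation size are assumptions made for the illustration.

```python
from fractions import Fraction as Fr

def var_cvar(z, p, alpha):
    """(VaR, CVaR) of a finite loss distribution; z increasing, p positive with sum 1."""
    cum = Fr(0)
    for k, pk in enumerate(p):
        cum += pk
        if cum >= alpha:                 # k is the index k_alpha of Proposition 8
            break
    tail = sum(zj * pj for zj, pj in zip(z[k + 1:], p[k + 1:]))
    return z[k], ((cum - alpha) * z[k] + tail) / (1 - alpha)

z, p = [1, 2, 3, 4], [Fr(1, 4)] * 4
eps = Fr(1, 1000)
for alpha in (Fr(1, 2) - eps, Fr(1, 2), Fr(1, 2) + eps):
    var, cvar = var_cvar(z, p, alpha)
    print(float(alpha), var, float(cvar))
```

Here VaR jumps from 2 to 3 as the level passes 1/2, while CVaR moves only from 7/2 to 1747/499, a change of about 0.001.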


Our next result addresses the extreme case where discreteness of the loss distribution rules entirely, as in scenario-based optimization under uncertainty. In scenario models, finitely many elements $y \in Y$ are singled out in some way as representative "scenarios," and all the probability is concentrated in them.

Proposition 8 (CVaR for scenario models). Suppose the probability measure $P$ is concentrated in finitely many points $y_k$ of $Y$, so that for each $x \in X$ the distribution of the loss $z = f(x, y)$ is likewise concentrated in finitely many points, and $\Psi(x, \cdot)$ is a step function with jumps at those points. Fixing $x$, let those corresponding loss points be ordered as $z_1 < z_2 < \cdots < z_N$, with the probability of $z_k$ being $p_k > 0$. Let $k_\alpha$ be the unique index such that

$$\sum_{k=1}^{k_\alpha} p_k \ge \alpha > \sum_{k=1}^{k_\alpha - 1} p_k. \tag{23}$$

The $\alpha$-VaR of the loss is given then by

$$\zeta_\alpha(x) = z_{k_\alpha}, \tag{24}$$

whereas the $\alpha$-CVaR is given by

$$\phi_\alpha(x) = \frac{1}{1 - \alpha} \left[ \Big( \sum_{k=1}^{k_\alpha} p_k - \alpha \Big) z_{k_\alpha} + \sum_{k=k_\alpha+1}^{N} p_k z_k \right]. \tag{25}$$

Furthermore, in this situation

$$\lambda_\alpha(x) = \frac{1}{1 - \alpha} \Big( \sum_{k=1}^{k_\alpha} p_k - \alpha \Big) \le \frac{p_{k_\alpha}}{p_{k_\alpha} + \cdots + p_N}. \tag{26}$$

Proof. According to (23), we have

$$\Psi(x, \zeta_\alpha(x)) = \sum_{k=1}^{k_\alpha} p_k, \qquad \Psi(x, \zeta_\alpha(x)^-) = \sum_{k=1}^{k_\alpha - 1} p_k, \qquad \Psi(x, \zeta_\alpha(x)) - \Psi(x, \zeta_\alpha(x)^-) = p_{k_\alpha}.$$

The assertions then follow from (8) and Proposition 6, except for the upper bound claimed for $\lambda_\alpha(x)$. To understand that, observe that the expression for $\lambda_\alpha(x)$ in (26) decreases with respect to $\alpha$, which belongs to the interval in (23). The upper bound is obtained by substituting the lower endpoint of that interval for $\alpha$ in this expression.
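Formulas (23) through (26) translate almost line by line into code. The following sketch assumes the scenario losses are supplied already sorted and distinct, with positive probabilities summing to 1; the function name `scenario_cvar` and the five-scenario data are made up for illustration.

```python
from fractions import Fraction as Fr

def scenario_cvar(z, p, alpha):
    """Return (VaR, CVaR, lambda, bound on lambda) per Proposition 8.

    z: loss values with z[0] < z[1] < ...; p: their probabilities, summing to 1.
    """
    cum = Fr(0)
    for k, pk in enumerate(p):
        cum += pk
        if cum >= alpha:                 # index k_alpha of (23), zero-based here
            break
    var = z[k]                                                     # (24)
    tail_sum = sum(pj * zj for pj, zj in zip(p[k + 1:], z[k + 1:]))
    cvar = ((cum - alpha) * z[k] + tail_sum) / (1 - alpha)         # (25)
    lam = (cum - alpha) / (1 - alpha)                              # (26), left side
    lam_bound = p[k] / sum(p[k:])                                  # (26), right side
    return var, cvar, lam, lam_bound

z = [-2, 0, 1, 5, 10]
p = [Fr(1, 10), Fr(2, 5), Fr(1, 4), Fr(3, 20), Fr(1, 10)]
var, cvar, lam, lam_bound = scenario_cvar(z, p, Fr(4, 5))
print(var, cvar, lam, lam_bound)    # 5 15/2 1/2 3/5
```

In this example the atom at the VaR value 5 has probability 3/20, of which 1/10 is assigned to the tail, giving the weight 1/2, which respects the bound 3/5 from (26).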

Corollary 9 (highest losses). In the notation of Proposition 8, if the highest point $z_N$ has probability $p_N > 1 - \alpha$, then actually $\phi_\alpha(x) = \zeta_\alpha(x) = z_N$.

Proof. This amounts to having $k_\alpha = N$, and the result then comes from (24) and (25).

Of course, it must be remembered in Proposition 8 and Corollary 9 that not only the loss values $z_k$ and their probabilities $p_k$, but also their ordering can depend on the choice of $x$, and so too then the index $k_\alpha$, even though our notation omits that dependence for the sake of simplicity.

The case in Corollary 9 can very well come up in multistage stochastic programming models over scenario trees, for instance. In such optimization problems, the first stage may have only a few scenarios (see e.g. [19]), and CVaR will coincide then with maximum loss at that stage. Subsequent stages usually are represented with more scenarios and thus need the full force of the expressions in Proposition 8.

3 Minimization Rule and Coherence

We work now towards the goal of showing that the $\alpha$-VaR and $\alpha$-CVaR of the loss $z$ associated with a choice $x$ can be calculated simultaneously by solving an elementary optimization problem of convex type in one dimension. For this purpose we utilize, as we did in our original paper [39] in this subject, the special function

$$F_\alpha(x, \zeta) = \zeta + \frac{1}{1 - \alpha} E\big\{ [f(x, y) - \zeta]^+ \big\}, \quad \text{where } [t]^+ = \max\{0, t\}. \tag{27}$$

The following theorem confirms that the minimization formula we originally developed in [39] under special assumptions on the loss distribution, such as the exclusion of discreteness, persists when the CVaR concept is articulated for general distributions in the manner of Definition 3. In contrast, no such formula holds for CVaR$^+$ or CVaR$^-$.

Theorem 10 (fundamental minimization formula). As a function of $\zeta \in \mathbb{R}$, $F_\alpha(x, \zeta)$ is finite and convex (hence continuous), with

$$\phi_\alpha(x) = \min_\zeta F_\alpha(x, \zeta) \tag{28}$$

and moreover

$$\begin{aligned} \zeta_\alpha(x) &= \text{lower endpoint of } \operatorname*{argmin}_\zeta F_\alpha(x, \zeta), \\ \zeta_\alpha^+(x) &= \text{upper endpoint of } \operatorname*{argmin}_\zeta F_\alpha(x, \zeta), \end{aligned} \tag{29}$$

where the argmin refers to the set of $\zeta$ for which the minimum is attained and in this case has to be a nonempty, closed, bounded interval (perhaps reducing to a single point). In particular, one always has

$$\zeta_\alpha(x) \in \operatorname*{argmin}_\zeta F_\alpha(x, \zeta), \qquad \phi_\alpha(x) = F_\alpha(x, \zeta_\alpha(x)). \tag{30}$$

Proof. The finiteness of $F_\alpha(x, \zeta)$ is a consequence of our assumption that $E\{|f(x, y)|\} < \infty$ for each $x \in X$. Its convexity follows at once from the convexity of $[f(x, y) - \zeta]^+$ with respect to $\zeta$. As a finite convex function, $F_\alpha(x, \zeta)$ has finite right and left derivatives at any $\zeta$ (see [38, Theorems 23.1, 24.1]). Our approach to proving the rest of the assertions in the theorem will rely on first establishing for these one-sided derivatives the formulas

$$\frac{\partial^+ F_\alpha}{\partial \zeta}(x, \zeta) = \frac{\Psi(x, \zeta) - \alpha}{1 - \alpha}, \qquad \frac{\partial^- F_\alpha}{\partial \zeta}(x, \zeta) = \frac{\Psi(x, \zeta^-) - \alpha}{1 - \alpha}. \tag{31}$$

We start by observing that

$$\frac{F_\alpha(x, \zeta') - F_\alpha(x, \zeta)}{\zeta' - \zeta} = 1 + \frac{1}{1 - \alpha} E\left\{ \frac{[f(x, y) - \zeta']^+ - [f(x, y) - \zeta]^+}{\zeta' - \zeta} \right\}. \tag{32}$$

When $\zeta' > \zeta$ we have

$$\frac{[f(x, y) - \zeta']^+ - [f(x, y) - \zeta]^+}{\zeta' - \zeta} \begin{cases} = -1 & \text{if } f(x, y) \ge \zeta', \\ = 0 & \text{if } f(x, y) \le \zeta, \\ \in (-1, 0) & \text{if } \zeta < f(x, y) < \zeta'. \end{cases}$$

Since $P\{\, y \mid f(x, y) > \zeta' \,\} = 1 - \Psi(x, \zeta')$ and $P\{\, y \mid \zeta < f(x, y) \le \zeta' \,\} = \Psi(x, \zeta') - \Psi(x, \zeta)$, with $\Psi(x, \zeta') \searrow \Psi(x, \zeta)$ as $\zeta' \searrow \zeta$ (i.e., as $\zeta' \to \zeta$ with $\zeta' > \zeta$), it follows that

$$\lim_{\zeta' \searrow \zeta} E\left\{ \frac{[f(x, y) - \zeta']^+ - [f(x, y) - \zeta]^+}{\zeta' - \zeta} \right\} = -[1 - \Psi(x, \zeta)].$$

Applying this in (32), we obtain

$$\lim_{\zeta' \searrow \zeta} \frac{F_\alpha(x, \zeta') - F_\alpha(x, \zeta)}{\zeta' - \zeta} = 1 + \frac{1}{1 - \alpha} [\Psi(x, \zeta) - 1] = \frac{\Psi(x, \zeta) - \alpha}{1 - \alpha},$$

thereby verifying the first formula in (31). For the second formula in (31), we argue similarly that when $\zeta' < \zeta$ we have

$$\frac{[f(x, y) - \zeta']^+ - [f(x, y) - \zeta]^+}{\zeta' - \zeta} \begin{cases} = -1 & \text{if } f(x, y) \ge \zeta, \\ = 0 & \text{if } f(x, y) \le \zeta', \\ \in (-1, 0) & \text{if } \zeta' < f(x, y) < \zeta, \end{cases}$$

where $P\{\, y \mid f(x, y) \ge \zeta \,\} = 1 - \Psi(x, \zeta^-)$ and $P\{\, y \mid \zeta' < f(x, y) < \zeta \,\} = \Psi(x, \zeta^-) - \Psi(x, \zeta')$. Since $\Psi(x, \zeta') \nearrow \Psi(x, \zeta^-)$ as $\zeta' \nearrow \zeta$ (i.e., as $\zeta' \to \zeta$ with $\zeta' < \zeta$), we obtain

$$\lim_{\zeta' \nearrow \zeta} E\left\{ \frac{[f(x, y) - \zeta']^+ - [f(x, y) - \zeta]^+}{\zeta' - \zeta} \right\} = -[1 - \Psi(x, \zeta^-)],$$

and then in (32)

$$\lim_{\zeta' \nearrow \zeta} \frac{F_\alpha(x, \zeta') - F_\alpha(x, \zeta)}{\zeta' - \zeta} = 1 + \frac{1}{1 - \alpha} [\Psi(x, \zeta^-) - 1] = \frac{\Psi(x, \zeta^-) - \alpha}{1 - \alpha}.$$

That gives the second formula in (31).
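The one-sided derivative formulas in (31) can be spot-checked on a scenario distribution, where $F_\alpha(x, \cdot)$ is piecewise linear and a difference quotient taken within one linear piece equals the one-sided derivative exactly. The sketch below is only such a check at a single point; the data, the step size and the helper names `F` and `Psi` are illustrative assumptions.

```python
from fractions import Fraction as Fr

z = [1, 2, 3, 4]                 # loss values for a fixed decision x
p = [Fr(1, 4)] * 4               # their probabilities
alpha = Fr(3, 5)

def F(zeta):
    """F_alpha(x, zeta) from (27) for the finite distribution above."""
    return zeta + sum(pk * max(zk - zeta, 0) for zk, pk in zip(z, p)) / (1 - alpha)

def Psi(zeta, left_limit=False):
    """Psi(x, zeta) = P{loss <= zeta}; with left_limit=True, P{loss < zeta}."""
    return sum(pk for zk, pk in zip(z, p) if (zk < zeta if left_limit else zk <= zeta))

zeta, h = 3, Fr(1, 2)            # h keeps zeta - h and zeta + h inside one linear piece
right_quotient = (F(zeta + h) - F(zeta)) / h
left_quotient = (F(zeta - h) - F(zeta)) / (-h)

print(right_quotient, (Psi(zeta) - alpha) / (1 - alpha))                    # 3/8 3/8
print(left_quotient, (Psi(zeta, left_limit=True) - alpha) / (1 - alpha))    # -1/4 -1/4
```

The left derivative is negative and the right derivative positive at this point, so it is the minimizer of $F_\alpha(x, \cdot)$ here, consistent with the VaR value 3 for this distribution at level 3/5.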

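Because of convexity, the one-sided derivatives in (31) are nondecreasing with respect to $\zeta$, with the formulas assuring that

\[ \lim_{\zeta\to\infty} \frac{\partial^+ F_\alpha}{\partial\zeta}(x,\zeta) = \lim_{\zeta\to\infty} \frac{\partial^- F_\alpha}{\partial\zeta}(x,\zeta) = 1 \]

and on the other hand

\[ \lim_{\zeta\to-\infty} \frac{\partial^+ F_\alpha}{\partial\zeta}(x,\zeta) = \lim_{\zeta\to-\infty} \frac{\partial^- F_\alpha}{\partial\zeta}(x,\zeta) = -\frac{\alpha}{1-\alpha}. \]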

On the basis of these limits, we know that the level sets of $F_\alpha(x,\zeta)$ are bounded and therefore that the minimum in (28) is attained, with the argmin set being a closed, bounded interval. The values of $\zeta$ in that set are characterized as the ones such that

\[ \frac{\partial^- F_\alpha}{\partial\zeta}(x,\zeta) \le 0 \le \frac{\partial^+ F_\alpha}{\partial\zeta}(x,\zeta). \]

According to the formulas in (31), they are the values of $\zeta$ satisfying $\Psi(x,\zeta^-) \le \alpha \le \Psi(x,\zeta)$. The lowest such $\zeta$ is $\zeta_\alpha(x)$ by Definition 1, while the highest is $\zeta_\alpha^+(x)$ by Definition 2. Thus, (29) and the first claim in (30) are correct. The truth of the second claim in (30) is immediate then from (28).
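The conclusions of the proof can be checked numerically on a small discrete distribution. The following Python sketch is an illustration only: the function names and the four-point distribution are assumptions introduced here, not part of the original report. It minimizes $F_\alpha(x,\zeta)$ over the atoms of the loss distribution (enough for a discrete loss, since the function is piecewise linear and convex in $\zeta$) and reads off VaR, the upper VaR and CVaR as in (28)-(30). Exact rational arithmetic is used so that ties in the argmin are detected reliably.

```python
from fractions import Fraction


def F(alpha, zeta, losses, probs):
    """F_alpha(zeta) = zeta + E[(loss - zeta)^+] / (1 - alpha) for a discrete loss."""
    tail = sum(p * max(z - zeta, 0) for z, p in zip(losses, probs))
    return zeta + tail / (1 - alpha)


def var_cvar_by_minimization(alpha, losses, probs):
    """Return (VaR, upper VaR, CVaR) from the argmin and minimum of F_alpha.

    For a discrete loss, F_alpha is piecewise linear with breakpoints at the
    atoms, so the argmin interval has atoms as its endpoints.
    """
    values = {z: F(alpha, z, losses, probs) for z in sorted(set(losses))}
    cvar = min(values.values())
    argmin_atoms = [z for z, v in values.items() if v == cvar]
    return min(argmin_atoms), max(argmin_atoms), cvar


# Hypothetical example: four equally likely losses.
losses = [1, 2, 3, 4]
probs = [Fraction(1, 4)] * 4

# alpha = 1/2 hits a flat spot of the distribution function: argmin is [2, 3].
print(var_cvar_by_minimization(Fraction(1, 2), losses, probs))
# alpha = 3/5 falls inside the atom at 3: argmin is the single point 3.
print(var_cvar_by_minimization(Fraction(3, 5), losses, probs))
```

In the first case the minimum value 7/2 is the plain average of the two worst losses; in the second case the value 29/8 is what one gets by splitting the probability atom at the VaR value 3.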

Note: Very recently, and independently of our work, Acerbi and Tasche in [2] have likewise confirmed that our formula in [39] persists for CVaR in general. Their argument omits the details above, relying instead on observations about functions similar to our $F_\alpha$ that can be gleaned from exercises in classical probability texts.

Theorem 10 turns a powerful spotlight on the difference between CVaR and VaR, revealing the fundamental reason why CVaR is much better behaved than VaR when dependence on a choice of $x \in X$ must be handled. The reason is the fact, well known in optimization theory, that the optimal value in a problem of minimization, in this case $\phi_\alpha(x)$, is much more agreeable as a function of parameters than is the optimal solution set, which is here the argmin interval having $\zeta_\alpha(x)$ as its lower endpoint.

The special circumstances in Proposition 8 can be appreciated from the perspective of the minimization formula in Theorem 10 as well. The function $F_\alpha(x,\zeta)$ is in this case piecewise linear with derivative breakpoints at the loss values $z_k$. The argmin has to consist either of a single derivative breakpoint $z_{k_\alpha}$ or an interval $[z_{k_\alpha}, z_{k_\alpha+1}]$ between successive derivative breakpoints.

For the next result, we recall that a function $h(x)$ is sublinear if $h(x+x') \le h(x) + h(x')$ and $h(\lambda x) = \lambda h(x)$ for $\lambda > 0$. The second of these two properties, called positive homogeneity, implies in particular that $h(0) = 0$. Sublinearity is equivalent to the combination of convexity with positive homogeneity; see [38]. Linearity is a special case of sublinearity.

Corollary 11 (convexity of CVaR). If $f(x,y)$ is convex with respect to $x$, then $\phi_\alpha(x)$ is convex with respect to $x$ as well. Indeed, in this case $F_\alpha(x,\zeta)$ is jointly convex in $(x,\zeta)$.

Likewise, if $f(x,y)$ is sublinear with respect to $x$, then $\phi_\alpha(x)$ is sublinear with respect to $x$. Then too, $F_\alpha(x,\zeta)$ is jointly sublinear in $(x,\zeta)$.

Proof. The joint convexity of $F_\alpha(x,\zeta)$ in $(x,\zeta)$ is an elementary consequence of the definition of this function in (27) and the convexity of the function $(x,\zeta) \mapsto [f(x,y)-\zeta]^+$ when $f(x,y)$ is convex in $x$. The convexity of $\phi_\alpha(x)$ in $x$ follows immediately then from the minimization formula (28). (In convex analysis, when a convex function of two vector variables is minimized with respect to one of them, the residual is a convex function of the other; see [38].)

The argument for sublinearity is entirely parallel. Only the additional feature of positive homogeneity needs attention, according to the remark about sublinearity above.

A case especially worth noting where the sublinearity in Corollary 11 is present is the one where $f(x,y)$ is actually linear with respect to $x$, i.e., of the form

\[ f(x,y) = x_1 f_1(y) + \cdots + x_n f_n(y). \tag{33} \]

This case is common to numerous applications.
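As a small illustration of the sublinearity asserted in Corollary 11 for the linear case (33), the Python sketch below evaluates CVaR through the minimization formula (28) and checks subadditivity and positive homogeneity on a toy scenario set. The scenario data, the probability level 0.7 and all function names are hypothetical choices made for this example.

```python
def cvar(alpha, losses, probs):
    """alpha-CVaR of a discrete loss via the minimization formula (28).

    For a discrete loss the minimum over zeta is attained at one of the atoms.
    """
    def F(zeta):
        tail = sum(p * max(z - zeta, 0.0) for z, p in zip(losses, probs))
        return zeta + tail / (1.0 - alpha)
    return min(F(z) for z in losses)


def linear_loss(x, scenarios):
    """f(x, y_k) = x_1 f_1(y_k) + ... + x_n f_n(y_k), one value per scenario."""
    return [sum(xj * fj for xj, fj in zip(x, row)) for row in scenarios]


# Hypothetical scenario table: row k holds (f_1(y_k), f_2(y_k)).
scenarios = [(-1.0, 2.0), (0.5, -1.5), (3.0, 0.0), (-2.0, 1.0), (1.0, 1.0)]
probs = [0.2] * 5
alpha = 0.7

x_a, x_b = (1.0, 2.0), (3.0, -1.0)
x_sum = tuple(a + b for a, b in zip(x_a, x_b))
x_scaled = tuple(2.5 * a for a in x_a)

phi_a = cvar(alpha, linear_loss(x_a, scenarios), probs)
phi_b = cvar(alpha, linear_loss(x_b, scenarios), probs)
phi_sum = cvar(alpha, linear_loss(x_sum, scenarios), probs)
phi_scaled = cvar(alpha, linear_loss(x_scaled, scenarios), probs)

print(phi_sum <= phi_a + phi_b + 1e-12)      # subadditivity
print(abs(phi_scaled - 2.5 * phi_a) < 1e-9)  # positive homogeneity
```

A check of this kind on one data set is of course no proof; it only makes the two defining properties of sublinearity concrete.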

The observation that the minimization formula in Theorem 10 yields the convexity in Corollary 11 was made in our original paper [39]. We did not mention sublinearity there, but Pflug, in his follow-up article [32], noted that it too was a consequence of our formula.

Very close to Corollary 11 is an important fact about the coherence of CVaR as a risk measure, in the sense introduced by Artzner, Delbaen, Eber and Heath [6]. In the framework of those authors, a risk measure is a functional on a linear space of random variables. If we denote such random variables generically by $z$, thinking of them as losses, the axioms in [6] for coherence of a risk measure $\rho$ amount to the requirement that $\rho$ be sublinear,

\[ \rho(z+z') \le \rho(z) + \rho(z'), \qquad \rho(\lambda z) = \lambda\rho(z) \ \text{for } \lambda \ge 0, \tag{34} \]

and in addition satisfy

\[ \rho(z) = c \ \text{when } z \equiv c \ \text{(constant)}, \tag{35} \]

along with

\[ \rho(z) \le \rho(z') \ \text{when } z \le z', \tag{36} \]

where the inequality $z \le z'$ refers to first-order stochastic dominance. (In [6], a stronger-seeming property than (35) is required, that $\rho(z+z') = c + \rho(z')$ when $z \equiv c$, but that follows from (35) and the subadditivity rule in (34).) Here our framework is different, due to the way we are modeling a loss as the joint outcome of a decision $x$ and an underlying random vector $y$, but coherence can nonetheless be captured by viewing it (equivalently) as an assertion about the special case in (33).

Corollary 12 (coherence of CVaR). On the basis of Definition 3, $\alpha$-CVaR is a coherent risk measure: when $f(x,y)$ is linear with respect to $x$, not only is $\phi_\alpha(x)$ sublinear with respect to $x$, but furthermore it satisfies

\[ \phi_\alpha(x) = c \ \text{when } f(x,y) \equiv c \tag{37} \]

(thus accurately reflecting a lack of risk), and it obeys the monotonicity rule that

\[ \phi_\alpha(x) \le \phi_\alpha(x') \ \text{when } f(x,y) \le f(x',y). \tag{38} \]

Proof. In terms of $z = f(x,y)$ and $z' = f(x',y)$ in the context of the linearity in (33), these properties come out as the ones in (34), (35) and (36). The sublinearity of $\phi_\alpha$ in the case of (33) has already been noted as ensured by Corollary 11. Like that, the additional properties (37) and (38) too can be seen as simple consequences of the fundamental minimization formula for $\phi_\alpha$ in Theorem 10.

Of course, the relations on the right sides of (37) and (38) should technically be interpreted as ones between random variables (with respect to $y$), rather than pointwise relations between functions of $y$. According to (38), for instance, a decision $x$ that leads to an outcome at least as good as another decision $x'$, no matter what happens, is deemed no riskier than $x'$.

Pflug, in [32], demonstrated that if a measure of risk were introduced in the framework of Artzner, Delbaen, Eber and Heath in [6] by the general expression derivable from the right side of our minimization formula, namely,

\[ \rho(z) = \min_{\zeta \in \mathbb{R}} \left\{ \zeta + \frac{1}{1-\alpha}\, E\{[z-\zeta]^+\} \right\}, \tag{39} \]

it would be a coherent measure of risk. This conclusion tightly parallels Corollary 12, but here we are asserting that coherence holds for $\alpha$-CVaR as the quantity introduced in Definition 3, not just for the functional defined by (39). For that assertion, the arguments behind Theorem 10, and with them the subtleties of $\alpha$-CVaR as an "adjusted" conditional expectation that splits probability atoms, have a major role. The coherence of $\alpha$-CVaR is a formidable advantage not shared by any other widely applicable measure of risk yet proposed.

Besides the properties already mentioned, Pflug uncovered others for the functional $\rho$ in (39) that would likewise transfer to $\phi_\alpha(x)$. For this, we refer to his paper [32].

We close this section by pointing out still another feature of CVaR that distinguishes it from

other common measures of risk for general loss distributions.

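Proposition 13 (stability of CVaR). The value $\phi_\alpha(x)$ behaves continuously with respect to the choice of $\alpha \in (0,1)$ and even has left and right derivatives, given by

\[ \frac{\partial^-}{\partial\alpha}\phi_\alpha(x) = \frac{1}{(1-\alpha)^2}\, E\{[f(x,y)-\zeta_\alpha(x)]^+\}, \qquad \frac{\partial^+}{\partial\alpha}\phi_\alpha(x) = \frac{1}{(1-\alpha)^2}\, E\{[f(x,y)-\zeta_\alpha^+(x)]^+\}. \]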

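Proof. Fixing $x$, consider for each $\zeta \in \mathbb{R}$ the function of $\gamma \in \mathbb{R}$ defined by

\[ \theta_\zeta(\gamma) = \zeta + \gamma\, E\{[f(x,y)-\zeta]^+\}, \tag{40} \]

and let

\[ \theta(\gamma) = \min_{\zeta \in \mathbb{R}} \theta_\zeta(\gamma). \tag{41} \]

In this way, we have through Theorem 10 that

\[ \phi_\alpha(x) = \theta(\gamma) \ \text{for } \gamma = 1/(1-\alpha), \tag{42} \]

with the minimum in (41) being attained when $\zeta$ belongs to the interval $[\zeta_\alpha(x), \zeta_\alpha^+(x)]$.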

According to (41), $\theta$ is the pointwise minimum of the collection of functions $\theta_\zeta$. Those functions are affine, hence $\theta$ is concave. A finite, concave function on $\mathbb{R}$ is necessarily continuous and has left and right derivatives at every point. Under the pointwise minimization, the right derivative is the lowest of the slopes of the affine functions $\theta_\zeta$ for which the minimum is attained, whereas the left derivative is the highest of such slopes. The slope of $\theta_\zeta$ is given by the expectation in (40), which decreases as $\zeta$ increases. At $\gamma = 1/(1-\alpha)$, we therefore get the highest slope by taking $\zeta = \zeta_\alpha(x)$ and the lowest by taking $\zeta = \zeta_\alpha^+(x)$. Hence, at $\gamma = 1/(1-\alpha)$, the left and right derivatives of $\theta$ are $E\{[f(x,y)-\zeta_\alpha(x)]^+\}$ and $E\{[f(x,y)-\zeta_\alpha^+(x)]^+\}$, respectively.

The result now follows through (42) by considering the function $\alpha \mapsto \phi_\alpha(x)$ as the composition of $\theta$ with $\alpha \mapsto 1/(1-\alpha)$ and invoking the chain rule, the derivative of $\alpha \mapsto 1/(1-\alpha)$ being $1/(1-\alpha)^2$.
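The one-sided derivative formulas of Proposition 13 can be compared with finite differences on a toy example. The sketch below is only an illustration under assumed data (four equally likely losses and the level 0.5, chosen so that the VaR and the upper VaR differ); the helper names are hypothetical.

```python
def cvar(alpha, losses, probs):
    """alpha-CVaR via the minimization formula; the minimum is attained at an atom."""
    def F(zeta):
        tail = sum(p * max(z - zeta, 0.0) for z, p in zip(losses, probs))
        return zeta + tail / (1.0 - alpha)
    return min(F(z) for z in losses)


def var_bounds(alpha, losses, probs):
    """Return (VaR, upper VaR): smallest atoms with cumulative probability >= alpha and > alpha."""
    cum, lower, upper = 0.0, None, None
    for z, p in sorted(zip(losses, probs)):
        cum += p
        if lower is None and cum >= alpha:
            lower = z
        if upper is None and cum > alpha:
            upper = z
    return lower, upper


def one_sided_derivatives(alpha, losses, probs):
    """Left and right derivatives of CVaR in alpha, as stated in Proposition 13."""
    lower, upper = var_bounds(alpha, losses, probs)

    def tail(zeta):
        return sum(p * max(z - zeta, 0.0) for z, p in zip(losses, probs))

    scale = 1.0 / (1.0 - alpha) ** 2
    return scale * tail(lower), scale * tail(upper)


# Hypothetical data: the distribution function is flat at level 0.5 between 2 and 3.
losses, probs, alpha, h = [1.0, 2.0, 3.0, 4.0], [0.25] * 4, 0.5, 1e-6

left, right = one_sided_derivatives(alpha, losses, probs)
fd_left = (cvar(alpha, losses, probs) - cvar(alpha - h, losses, probs)) / h
fd_right = (cvar(alpha + h, losses, probs) - cvar(alpha, losses, probs)) / h

print(left, fd_left)    # both close to 3
print(right, fd_right)  # both close to 1
```

The kink at this level (left slope 3, right slope 1) shows that CVaR is continuous but not necessarily differentiable in the probability level when the argmin interval is nontrivial.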

4 CVaR in Optimization

In problems of optimization under uncertainty, CVaR can enter into the objective or the constraints, or both. A big advantage of CVaR over VaR in that context is the preservation of convexity, seen in Corollary 11. In numerical applications, the joint convexity of $F_\alpha(x,\zeta)$ with respect to both $x$ and $\zeta$ in Corollary 11 is even more valuable than the convexity of $\phi_\alpha(x)$ in $x$. That is because the minimization of $\phi_\alpha(x)$ over $x \in X$, which can be adopted as a basic prototype in the management of risk when measured by $\alpha$-CVaR, can be transformed into a much more tractable problem of minimizing $F_\alpha(x,\zeta)$ in both $x$ and $\zeta$.

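Theorem 14 (optimization shortcut). Minimizing $\phi_\alpha(x)$ with respect to $x \in X$ is equivalent to minimizing $F_\alpha(x,\zeta)$ over all $(x,\zeta) \in X \times \mathbb{R}$, in the sense that

\[ \min_{x \in X} \phi_\alpha(x) = \min_{(x,\zeta) \in X \times \mathbb{R}} F_\alpha(x,\zeta), \tag{43} \]

where moreover

\[ (x^*,\zeta^*) \in \operatorname*{argmin}_{(x,\zeta) \in X \times \mathbb{R}} F_\alpha(x,\zeta) \iff x^* \in \operatorname*{argmin}_{x \in X} \phi_\alpha(x), \quad \zeta^* \in \operatorname*{argmin}_{\zeta \in \mathbb{R}} F_\alpha(x^*,\zeta). \tag{44} \]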

Proof. This rests on the principle in optimization that minimization with respect to $(x,\zeta)$ can be carried out by minimizing with respect to $\zeta$ for each $x$ and then minimizing the residual with respect to $x$. In the situation at hand, we invoke Theorem 10 and in particular, in order to get the equivalence in (44), the fact there that the minimum of $F_\alpha(x,\zeta)$ in $\zeta$ (for fixed $x$) is always attained.

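Corollary 15 (VaR and CVaR calculation as a by-product). If $(x^*,\zeta^*)$ minimizes $F_\alpha$ over $X \times \mathbb{R}$, then not only does $x^*$ minimize $\phi_\alpha$ over $X$, but also

\[ \phi_\alpha(x^*) = F_\alpha(x^*,\zeta^*), \qquad \zeta_\alpha(x^*) \le \zeta^* \le \zeta_\alpha^+(x^*), \tag{45} \]

where actually $\zeta_\alpha(x^*) = \zeta^*$ if $\operatorname{argmin}_\zeta F_\alpha(x^*,\zeta)$ reduces to a single point.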

The fact that the minimization of CVaR does not have to proceed numerically through repeated calculations of $\phi_\alpha(x)$ for various decisions $x$ may at first seem surprising. It is a powerful attraction to working with CVaR, all the more so when compared with attempts to minimize VaR, which can be quite ill behaved and offers no such shortcut.
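To make the shortcut of Theorem 14 and Corollary 15 concrete, the following Python sketch compares two routes on a toy problem: a joint minimization of $F_\alpha$ over pairs $(x,\zeta)$, and a two-stage route that first computes CVaR for each $x$ directly as a tail average (splitting the atom at the VaR value) and then minimizes over $x$. Everything specific here is an assumption for illustration: the loss function, the six scenarios, the level 0.8, and the use of a finite grid in place of the feasible set $X$. A practical implementation would hand the joint problem to a convex or linear programming solver rather than enumerate a grid.

```python
def F(alpha, zeta, losses, probs):
    """F_alpha(x, zeta) for the discrete losses f(x, y_k) already evaluated at x."""
    tail = sum(p * max(z - zeta, 0.0) for z, p in zip(losses, probs))
    return zeta + tail / (1.0 - alpha)


def cvar_tail_average(alpha, losses, probs):
    """CVaR as the mean of the worst (1 - alpha) probability mass, splitting the atom at VaR."""
    remaining, acc = 1.0 - alpha, 0.0
    for z, p in sorted(zip(losses, probs), reverse=True):
        take = min(p, remaining)
        acc += take * z
        remaining -= take
        if remaining <= 0.0:
            break
    return acc / (1.0 - alpha)


def loss(x, y):
    """Hypothetical loss: a mix of two scenario losses with weight x in [0, 1]."""
    return x * y[0] + (1.0 - x) * y[1]


scenarios = [(-0.02, 0.01), (0.03, -0.01), (0.01, 0.02),
             (-0.01, -0.03), (0.05, 0.00), (-0.03, 0.04)]
probs = [1.0 / len(scenarios)] * len(scenarios)
alpha = 0.8
grid = [i / 10 for i in range(11)]  # finite stand-in for the feasible set X

# Joint minimization over (x, zeta); zeta only needs to range over the atoms of f(x, .).
joint_value, x_star, zeta_star = min(
    (F(alpha, z, [loss(x, y) for y in scenarios], probs), x, z)
    for x in grid
    for z in [loss(x, y) for y in scenarios]
)

# Two-stage route: CVaR for each x by tail averaging, then minimization over x.
two_stage_value = min(
    cvar_tail_average(alpha, [loss(x, y) for y in scenarios], probs) for x in grid
)

print(abs(joint_value - two_stage_value) < 1e-12)
print(x_star, zeta_star, joint_value)
```

The joint route never evaluates CVaR as such; the CVaR of the selected decision and a value $\zeta^*$ lying in its argmin interval both come out as by-products, which is the content of (45).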

In the circumstance mentioned at the end of Corollary 15 where $\operatorname{argmin}_\zeta F_\alpha(x^*,\zeta)$ does not consist of just a single point, it is possible to have $\zeta_\alpha(x^*) < \zeta^*$ in (45). Then the joint minimization in Theorem 14, in producing $(x^*,\zeta^*)$, although it yields the $\alpha$-CVaR associated with $x^*$, does not immediately yield the $\alpha$-VaR associated with $x^*$. That could well happen, for instance, in the scenario model of Proposition 8. But then, as noted earlier, $\operatorname{argmin}_\zeta F_\alpha(x^*,\zeta)$ is the interval between two consecutive points $z_k$ in the discrete distribution of losses. In that case, therefore, $\zeta_\alpha(x^*)$ can nonetheless easily be obtained from the joint minimization: it is simply the highest $z_k \le \zeta^*$.

Linear programming techniques can readily be utilized for the double minimization in Theorem 14 in the linear case in (33), as we have already illustrated in the more restricted setting adopted in [39]. This can be done in a manner similar to other linear programming approaches used in portfolio optimization with mean absolute deviation [27], maximum deviation [45], and mean regret [15]. Here, the significance of Theorem 14 and Corollary 15 lies in underscoring that the previous restrictions can be dropped.

The minimization of $\phi_\alpha(x)$ with respect to $x \in X$ is not the only way that CVaR can be utilized in risk management. It can also be brought in to "shape" the risk in an optimization model. For that purpose, several probability thresholds can be handled.

Theorem 16 (risk-shaping with CVaR). For any selection of probability thresholds $\alpha_i$ and loss tolerances $\omega_i$, $i = 1,\ldots,l$, the problem

\[ \text{minimize } g(x) \text{ over } x \in X \text{ satisfying } \phi_{\alpha_i}(x) \le \omega_i \text{ for } i = 1,\ldots,l, \tag{46} \]

where $g$ is any objective function chosen on $X$, is equivalent to the problem

\[ \text{minimize } g(x) \text{ over } (x,\zeta_1,\ldots,\zeta_l) \in X \times \mathbb{R} \times \cdots \times \mathbb{R} \text{ satisfying } F_{\alpha_i}(x,\zeta_i) \le \omega_i \text{ for } i = 1,\ldots,l. \tag{47} \]

Indeed, $(x^*,\zeta_1^*,\ldots,\zeta_l^*)$ solves the second problem if and only if $x^*$ solves the first problem and the inequality $F_{\alpha_i}(x^*,\zeta_i^*) \le \omega_i$ holds for $i = 1,\ldots,l$.

Moreover one then has $\phi_{\alpha_i}(x^*) \le \omega_i$ for every $i$, and actually $\phi_{\alpha_i}(x^*) = \omega_i$ for each $i$ such that $F_{\alpha_i}(x^*,\zeta_i^*) = \omega_i$ (i.e., such that the corresponding CVaR constraint is active).

Proof. Again, this relies on the minimization formula (28) in Theorem 10 and the assured attainment of the minimum there. The argument is very much like that for Theorem 14. Because

\[ \phi_{\alpha_i}(x) = \min_{\zeta_i \in \mathbb{R}} F_{\alpha_i}(x,\zeta_i), \tag{48} \]

we have $\phi_{\alpha_i}(x) \le \omega_i$ if and only if there exists $\zeta_i$ such that $F_{\alpha_i}(x,\zeta_i) \le \omega_i$.

When $X$ and $g$ are convex and $f(x,y)$ is convex in $x$, we know from Corollary 11 that the optimization problems in Theorems 14 and 16 are ones of convex programming and thus especially favorable for computation. In comparison, analogous problems in terms of VaR instead of CVaR could be highly unfavorable. Of course, a combination of the models in Theorems 14 and 16 could likewise be handled in such a manner, by taking $g(x) = \phi_{\alpha_0}(x)$ for some $\alpha_0$.

Linear programming techniques can be used to compute answers in this setting as well. That is most evident when $Y$ is a discrete probability space with elements $y_k$, $k = 1,\ldots,N$, having probabilities $p_k$, $k = 1,\ldots,N$. Then from (27) we have

\[ F_{\alpha_i}(x,\zeta_i) = \zeta_i + \frac{1}{1-\alpha_i} \sum_{k=1}^{N} p_k\,[f(x,y_k)-\zeta_i]^+. \tag{49} \]

The constraint $F_{\alpha_i}(x,\zeta_i) \le \omega_i$ in Theorem 16 can be handled by introducing additional variables $\eta_{ik}$ subject to the conditions

\[ \eta_{ik} \ge 0, \qquad f(x,y_k) - \zeta_i - \eta_{ik} \le 0, \tag{50} \]

and requiring that

\[ \zeta_i + \frac{1}{1-\alpha_i} \sum_{k=1}^{N} p_k\,\eta_{ik} \le \omega_i. \tag{51} \]

The minimization in the expanded problem (47) is converted then into the minimization of $g(x)$ over $x \in X$, the $\zeta_i$'s and all the new $\eta_{ik}$'s, with the constraints $F_{\alpha_i}(x,\zeta_i) \le \omega_i$ being replaced by (50) and (51). When $f$ is linear in $x$ as in (33), these reconstituted constraints are linear.

This conversion is entirely parallel to the one we introduced in [39] for the expanded optimization problem with respect to $x$ and $\zeta$ that appears in Theorem 14.
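The conversion in (50) and (51) can be sketched in code for a single CVaR constraint and a linear loss as in (33). The Python fragment below only assembles the inequality rows and tests one candidate point for feasibility; it does not solve the linear program, and the variable ordering, function names and scenario data are assumptions made for this illustration.

```python
def cvar_constraint_rows(scenarios, probs, alpha, omega):
    """Rows (A, b) of A v <= b for v = (x_1..x_n, zeta, eta_1..eta_N).

    Encodes the second condition in (50) and the inequality (51) for a linear
    loss f(x, y_k) = sum_j x_j * scenarios[k][j]. The sign conditions
    eta_k >= 0 are left to be imposed as variable bounds.
    """
    n, N = len(scenarios[0]), len(scenarios)
    A, b = [], []
    for k, row in enumerate(scenarios):
        eta_part = [0.0] * N
        eta_part[k] = -1.0
        A.append(list(row) + [-1.0] + eta_part)  # f(x, y_k) - zeta - eta_k <= 0
        b.append(0.0)
    A.append([0.0] * n + [1.0] + [p / (1.0 - alpha) for p in probs])  # (51)
    b.append(omega)
    return A, b


def is_feasible(A, b, v, tol=1e-12):
    """Check A v <= b row by row, up to a small tolerance."""
    return all(sum(a * vi for a, vi in zip(row, v)) <= bi + tol
               for row, bi in zip(A, b))


# Hypothetical scenario table: row k holds (f_1(y_k), f_2(y_k)).
scenarios = [(-1.0, 2.0), (0.5, -1.5), (3.0, 0.0), (-2.0, 1.0), (1.0, 1.0)]
probs = [0.2] * 5
alpha = 0.7

x, zeta = [1.0, 2.0], 3.0
losses = [sum(xj * fj for xj, fj in zip(x, row)) for row in scenarios]
eta = [max(z - zeta, 0.0) for z in losses]  # smallest eta allowed by (50)
v = x + [zeta] + eta

A_loose, b_loose = cvar_constraint_rows(scenarios, probs, alpha, omega=3.0)
A_tight, b_tight = cvar_constraint_rows(scenarios, probs, alpha, omega=2.9)
print(is_feasible(A_loose, b_loose, v))  # F_alpha(x, zeta) = 3.0 meets omega = 3.0
print(is_feasible(A_tight, b_tight, v))  # but not omega = 2.9
```

With the smallest admissible $\eta_k = [f(x,y_k)-\zeta]^+$, the left side of (51) coincides with $F_\alpha(x,\zeta)$ in (49), which is why the linear constraints reproduce the original CVaR constraint.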

5 An Example of Portfolio Replication with CVaR Constraints

Putting together a portfolio in order to track a given financial index is a common and important undertaking. It fits in the framework of "portfolio replication" as a form of approximation, but of course the approximation criterion that is adopted must be one that focuses on risks associated with inaccuracies in the tracking. We present an example that demonstrates how CVaR constraints can be used efficiently to control such risks. For other work on portfolio replication, see for instance [5, 9, 11, 13, 14, 16, 28, 42, 44].

Suppose we want to replicate an instrument $I$ (e.g., the S&P100 index) using certain other instruments $S_j$, $j = 1,\ldots,n$. Denote by $I_t$ the price of instrument $I$ at time $t$, for $t = 1,\ldots,T$, and denote by $p_{tj}$ the price of instrument $S_j$ at time $t$. Let $\nu$ be the amount of money to be on hand at the final time $T$. We denote by $\theta = \nu/I_T$ the number of units of the instrument $I$ at time $T$. Let $x_j$, for $j = 1,\ldots,n$, be the number of units of instrument $S_j$ in the proposed replicating portfolio. The value of that portfolio at time $t$ is then $\sum_{j=1}^{n} p_{tj} x_j$. The relative absolute deviation of the portfolio value from the target value $\theta I_t$ is $\bigl|(\theta I_t - \sum_{j=1}^{n} p_{tj} x_j)/(\theta I_t)\bigr|$.

To put this into our earlier framework, we think of the price vectors $p_t = (p_{t1},\ldots,p_{tn})$ for $t = 1,\ldots,T$ as observations of a random element $y \in \mathbb{R}^n$, but now write $p$ instead of $y$ (and have indexing $t = 1,\ldots,T$ instead of $k = 1,\ldots,N$). These observed vectors $p_t$ give a finite distribution of $p$ in which $p = p_t$ has probability $1/T$. We take the loss associated with a decision $x$ to be the relative shortfall

\[ f(x,p_t) = \Bigl(\theta I_t - \sum_{j=1}^{n} p_{tj} x_j\Bigr) \Big/ (\theta I_t), \tag{52} \]

and introduce, as the expression to be minimized, the expectation of $|f(x,p)|$, i.e., the average of the relative absolute deviations $|f(x,p_t)|$ for $t = 1,\ldots,T$. In addition, we impose a constraint on the CVaR amount $\phi_\alpha(x)$ associated with the loss $f(x,p)$ in order to control large deviations of the portfolio value below the target value.

In the pattern of the expanded problem (47) in Theorem 16, but with only one CVaR constraint, our portfolio replication problem comes out then as follows:

\[ \text{minimize } g(x) = \frac{1}{T} \sum_{t=1}^{T} \Bigl| \Bigl(\theta I_t - \sum_{j=1}^{n} p_{tj} x_j\Bigr) \Big/ (\theta I_t) \Bigr| \tag{53} \]

subject to the constraints

\[ \sum_{j=1}^{n} p_{Tj}\, x_j = \nu, \tag{54} \]

\[ 0 \le x_j \le \gamma_j, \quad j = 1,\ldots,n, \tag{55} \]

(which realize in this setting the constraint $x \in X$ in the general discussion earlier) and

\[ \zeta + \frac{1}{(1-\alpha)T} \sum_{t=1}^{T} \Bigl[ \Bigl(\theta I_t - \sum_{j=1}^{n} p_{tj} x_j\Bigr) \Big/ (\theta I_t) - \zeta \Bigr]^+ \le \omega. \tag{56} \]

The minimization takes place with respect to both $x = (x_1,\ldots,x_n)$ and the variable $\zeta$. The expression on the left side of (56) is $F_\alpha(x,\zeta)$; thus, (56) corresponds to requiring $\phi_\alpha(x) \le \omega$.

For any choice of $\alpha$ and $\omega$, this problem can be solved by conversion to linear programming, more or less in the manner already explained above. The performance function $g$ is handled by introducing still more variables $\eta_{t0} \ge 0$ constrained by

\[ \Bigl(\theta I_t - \sum_{j=1}^{n} p_{tj} x_j\Bigr) \Big/ (\theta I_t) - \eta_{t0} \le 0, \qquad \Bigl(\theta I_t - \sum_{j=1}^{n} p_{tj} x_j\Bigr) \Big/ (\theta I_t) + \eta_{t0} \ge 0, \]

and minimizing the expression $(1/T) \sum_{t=1}^{T} \eta_{t0}$.
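For orientation, the quantities entering the replication problem can be evaluated directly for a given portfolio. The Python sketch below computes the relative shortfalls (52), the objective (53) and the left side of the CVaR constraint (56) on a three-day toy data set. The prices, index values and holdings are invented for this example and have nothing to do with the S&P100 experiments; the function names are likewise hypothetical.

```python
def relative_shortfalls(x, prices, index, theta):
    """f(x, p_t) = (theta*I_t - sum_j p_tj x_j) / (theta*I_t), one value per day t."""
    out = []
    for p_t, I_t in zip(prices, index):
        target = theta * I_t
        value = sum(p * xj for p, xj in zip(p_t, x))
        out.append((target - value) / target)
    return out


def tracking_objective(x, prices, index, theta):
    """Average relative absolute deviation, the objective g(x) in (53)."""
    f = relative_shortfalls(x, prices, index, theta)
    return sum(abs(v) for v in f) / len(f)


def cvar_constraint_lhs(alpha, zeta, x, prices, index, theta):
    """Left side of (56), i.e. F_alpha(x, zeta) with probability 1/T per day."""
    f = relative_shortfalls(x, prices, index, theta)
    return zeta + sum(max(v - zeta, 0.0) for v in f) / ((1.0 - alpha) * len(f))


# Invented toy data: T = 3 days, n = 2 instruments.
index = [100.0, 110.0, 120.0]
prices = [[10.0, 21.0], [11.0, 20.0], [12.0, 24.0]]
nu = 1200.0                    # money on hand at the final time T
theta = nu / index[-1]         # units of the index at time T
x = [50.0, 25.0]               # candidate holdings

budget_ok = sum(p * xj for p, xj in zip(prices[-1], x)) == nu  # constraint (54)
g = tracking_objective(x, prices, index, theta)
lhs = cvar_constraint_lhs(0.5, 0.0, x, prices, index, theta)
print(budget_ok, g, lhs)
```

In an actual computation these expressions are not evaluated point by point; they are replaced by the linear constraints described above and passed to a linear programming solver together with (54) and (55).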

Several important issues in the modeling, such as transaction costs and how to select the stocks to be included in the replicating portfolio, are beyond the scope of this paper. However, that does not undermine the basic idea of the CVaR approach, which we proceed to lay out.

Calculations for this example were conducted using the LP solver of the CPLEX package.

In our numerical experiments, we aimed at replicating the S&P100 index using 30 of the stocks that belong to that index (namely, the ones with ticker symbols: GD, UIS, NSM, ORCL, CSCO, HET, BS, TXN, HM, INTC, RAL, NT, MER, KM, BHI, CEN, HAL, BDK, HWP, LTD, BAC, AVP, AXP, AA, BA, AGC, BAX, AIG, AN, AEP). These stocks were the instruments $S_j$. The experiments were conducted in two stages:

Stage 1 (in-sample calculations): the problem (53)–(56) was solved using in-sample historical data on stock prices.

Stage 2 (out-of-sample calculations): replicating properties of the portfolio were verified by using the out-of-sample historical data just after the in-sample replicating period.

For the in-sample calculations, we used the closing prices for 600 days (from 10.21.1996 to 03.08.1999). For the out-of-sample calculations we considered 100 days (from 03.09.1999 to 07.28.1999). The confidence level in the CVaR constraint (56) was taken to be $\alpha = 0.9$, so that the CVaR constraint would control the largest 10% of relative deviations (underperformance of the portfolio compared to the index).

We solved the replication problem (53)–(56) for several values of the risk-tolerance level $\omega$ in the CVaR constraint ($\omega$ was varied from 0.02 to 0.001). To verify the out-of-sample goodness of fit, we calculated the values of the performance function (53) and the CVaR in (56) for the out-of-sample dataset. The results of the calculations are presented in Table 1 and Figures 6–13. The analysis of these results follows.

In-sample calculations. Imposing the CVaR constraint ought to lead to a deterioration in the value of the in-sample objective function (the average absolute relative deviation). Indeed, decreasing the value of $\omega$ causes an increase in the value of the objective function in the in-sample region (Column 2 of Table 1). This is seen in Figure 6 (continuous thick line) and is an evident consequence of the fact that decreasing the value of $\omega$ diminishes the feasible set. At the risk-tolerance level $\omega = 0.02$, the constraint on CVaR in (56) is inactive; at $\omega \le 0.01$ that constraint is active. The dynamics of relative absolute deviation (in-sample) for an instance when the CVaR constraint is active (at $\omega = 0.005$) and an instance when it is inactive (at $\omega = 0.02$) are shown in Fig. 7. This figure reveals that the CVaR constraint has reduced underperformance of the portfolio versus the index in the in-sample region: the dotted curve corresponding to the active CVaR constraint is lower than the solid curve corresponding to the inactive CVaR constraint. The dynamics of portfolio and index values for cases when the CVaR constraint is active (at $\omega = 0.005$) and inactive (at $\omega = 0.02$) are shown in Fig. 8 and Fig. 9, respectively. These figures demonstrate that the portfolio fits the index quite well for both active and inactive CVaR constraints.

Risk level      In-sample (600 days)       Out-of-sample (100 days)    Out-of-sample
$\omega$        objective function, in %   objective function, in %    CVaR, in %
0.02            0.71778                    2.73131                     4.88654
0.01            0.82502                    1.64654                     3.88691
0.005           1.11391                    0.85858                     2.62559
0.003           1.28004                    0.78896                     2.16996
0.001           1.48124                    0.80078                     1.88564

Table 1: Calculation results for various risk levels $\omega$ in the CVaR constraint.

At $\omega = 0.005$ and the optimal portfolio point $x^*$, we got $\zeta^* = 0.001538627671$ and the CVaR value of the left side in (56) equal to 0.005. In this case the probability of the VaR point itself is 14/600, which means that 14 time points have the same deviation 0.001538627671. To verify our optimization result at the optimal portfolio $x^*$, we manually calculated:

VaR = 0.001538627671, CVaR = 0.005,

CVaR$^-$ = 0.004592779726, CVaR$^+$ = 0.005384596925.

We found that $\zeta^* = $ VaR and the left side of the inequality (56) is CVaR $= \omega = 0.005$. In the case under consideration, the losses of 54 scenarios exceed VaR. The probability of exceeding the VaR, i.e., the probability of the interval $(\zeta_\alpha(x^*), \infty)$, was

\[ 1 - \Psi(x^*,\zeta_\alpha(x^*)) = 54/600 < 1 - \alpha, \]

whereas

\[ \lambda_\alpha(x^*) = [\Psi(x^*,\zeta_\alpha(x^*)) - \alpha]/[1-\alpha] = [546/600 - 0.9]/[1 - 0.9] = 0.1. \]

In accordance with formula (20), we got

\[ \text{CVaR} = \lambda_\alpha(x^*)\,\text{VaR} + (1-\lambda_\alpha(x^*))\,\text{CVaR}^+ \]

(CVaR = 0.1 * 0.001538627671 + 0.9 * 0.005384596925 = 0.005).

Also, because $\Psi(x^*,\zeta_\alpha(x^*)) > \alpha$, we have CVaR$^-$ < CVaR < CVaR$^+$.
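The arithmetic in this check is easy to reproduce. The short Python sketch below recomputes the atom-splitting weight, the CVaR from formula (20) and the lower CVaR from the figures reported above (600 scenarios, 54 losses above VaR, 14 losses equal to VaR). The variable names are ours, and the lower CVaR is computed as the average loss over the scenarios at or above VaR.

```python
T = 600                      # number of in-sample scenarios
alpha = 0.9                  # confidence level in the CVaR constraint
var = 0.001538627671         # reported VaR (equal to the optimal zeta)
cvar_plus = 0.005384596925   # reported upper CVaR: mean loss strictly above VaR
n_above = 54                 # scenarios with loss > VaR
n_at = 14                    # scenarios with loss == VaR

psi = (T - n_above) / T                  # Psi(x*, VaR) = 546/600
lam = (psi - alpha) / (1 - alpha)        # weight on VaR from splitting the atom
cvar = lam * var + (1 - lam) * cvar_plus
cvar_minus = (n_at * var + n_above * cvar_plus) / (n_at + n_above)

print(round(lam, 6))         # 0.1
print(round(cvar, 6))        # 0.005
print(round(cvar_minus, 9))  # 0.00459278
```

The recomputed values agree with the reported CVaR of 0.005 and lower CVaR of 0.004592779726 to within rounding, and they exhibit the strict ordering of lower CVaR, CVaR and upper CVaR stated above.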

In several runs we observed that the optimal $\zeta^*$ may overestimate the VaR because of the nonuniqueness of the optimal solution, i.e., instances of a nontrivial argmin interval in (29). (In our case of a discrete distribution, $\zeta^*$ can equal the value of the first loss possibility beyond the VaR.) Also, when the CVaR constraint (56) is not active, the optimal $\zeta^*$ may be quite far from the VaR and the value on the left of (56) may likewise be quite far from the CVaR.

Out-of-sample calculations. Table 1 shows that the CVaR calculated in the out-of-sample region decreases when the value of $\omega$ decreases (Column 4). This means that we improved both the in-sample and the out-of-sample "large deviations" by imposing the constraint (56). The index and optimal portfolio values in the out-of-sample region when the CVaR constraint is active (at $\omega = 0.005$) are shown in Fig. 10, and when it is not active (at $\omega = 0.02$), in Fig. 11. The relative absolute deviations in the out-of-sample region for the active case (at $\omega = 0.005$) and the inactive case (at $\omega = 0.02$) are displayed in Fig. 12.

An improvement of the CVaR for both in-sample and out-of-sample regions was also observed

for other data intervals, for instance, for 600 in-sample days from 11.28.1997 to 04.13.2000 and

100 out-of-sample days from 04.14.2000 to 09.06.2000.

Column 3 of Table 1 demonstrates that imposing the in-sample CVaR constraint brings about an improvement of the objective function in the out-of-sample data region (in contrast to the in-sample increase of the objective function); see Figure 6. However, a decrease in the objective function in the out-of-sample region was not observed for several other datasets.

Acknowledgments. We are grateful to Alexander Golodnikov from the Glushkov Institute in

Kiev, and to Grigori Zrazhevski from Kiev University for conducting numerical experiments for

the portfolio replication problem.

[Plot omitted. Horizontal axis: omega (0.02, 0.01, 0.005, 0.003, 0.001). Vertical axis: Value (%). Series: in-sample objective function, out-of-sample objective function, out-of-sample CVaR.]

Figure 6: In-sample objective function, out-of-sample objective function, out-of-sample CVaR for various risk levels $\omega$ in CVaR constraint.

[Plot omitted. Horizontal axis: Day number: in-sample region. Vertical axis: Discrepancy (%). Series: active, inactive.]

Figure 7: Relative discrepancy in in-sample region, CVaR constraint is active ($\omega$ = 0.005) and inactive ($\omega$ = 0.02).

[Plot omitted. Horizontal axis: Day number: in-sample region. Vertical axis: Portfolio value (USD). Series: portfolio, index.]

Figure 8: Index and optimal portfolio values in in-sample region, CVaR constraint is active ($\omega$ = 0.005).

[Plot omitted. Horizontal axis: Day number: in-sample region. Vertical axis: Portfolio value (USD). Series: portfolio, index.]

Figure 9: Index and optimal portfolio values in in-sample region, CVaR constraint is inactive ($\omega$ = 0.02).

[Plot omitted. Horizontal axis: Day number in out-of-sample region. Vertical axis: Portfolio value (USD). Series: portfolio, index.]

Figure 10: Index and optimal portfolio values in out-of-sample region, CVaR constraint is active ($\omega$ = 0.005).

[Plot omitted. Horizontal axis: Day number in out-of-sample region. Vertical axis: Portfolio value (USD). Series: portfolio, index.]

Figure 11: Index and optimal portfolio values in out-of-sample region, CVaR constraint is inactive ($\omega$ = 0.02).

[Plot omitted. Horizontal axis: Day number in out-of-sample region. Vertical axis: Discrepancy (%). Series: active, inactive.]

Figure 12: Relative discrepancy in out-of-sample region, CVaR constraint is active ($\omega$ = 0.005) and inactive ($\omega$ = 0.02).

References

[1] C. Acerbi, C. Nordio and C. Sirtori, Expected shortfall as a tool for financial risk management. Working paper (2001), can be downloaded from http://www.gloriamundi.org.

[2] C. Acerbi and D. Tasche, On the coherence of expected shortfall. Working paper (2001), can be downloaded from http://www.gloriamundi.org.

[3] J.V. Andersen and D. Sornette, Have Your Cake and Eat It Too: Increasing Returns While Lowering Large Risks. Working Paper, University of Los Angeles (1999), can be downloaded from http://www.gloriamundi.org.

[4] F. Andersson, H. Mausser, D. Rosen and S. Uryasev, Credit risk optimization with conditional value-at-risk, Mathematical Programming, Series B, December 2000; relevant Research Report 99-9 can be downloaded from www.ise.ufl.edu/uryasev/pubs.html#t.

[5] C. Andrews, D. Ford and K. Mallinson, The Design of Index Funds and Alternative Methods of Replication, The Investment Analyst 82 (October 1986), 16–23.

30

Page 31: #2001-5nor CV + b eing coheren t. The w eigh ts arise from the particular w a y that \splits the atom" of probabilit y at the V aR v alue, when one exists. Besides la ying out suc

[6] P. Artzner, F. Delbaen, J.-M. Eber, D. Heath, Coherent measures of risk, Mathe-

matical Finance, 9 (1999), 203{228.

[7] S. Basak and A. Shapiro, Value-at-Risk Based Management: Optimal Policies and Asset

Prices. Working Paper, Wharton School, University of Pennsylvania, (1998), can be down-

loaded from http://www.gloriamundi.org.

[8] F. Bassi, P. Embrechts, M. Kafetzaki, Risk management and quantile estimation, in

Practical Guide to Heavy Tails (R. Adler, F. Feldman, M. Taqqu, eds.), Birkh�auser, Boston,

(1998), 111{130.

[9] J.E. Beasley, N. Meade, T.-J. Chang, Index Tracking,Working Paper, Imperial College

London, (1999).

[10] J.R. Birge and F. Louveaux, Introduction to Stochastic Programming. Springer, New

York, (1997).

[11] I.R.C. Buckley and R. Korn, Optimal Index Tracking under Transaction Costs and

Impulse Control, International Journal of Theoretical and Applied Finance, (1998), 315-330.

[12] A. Checklov, S. Uryasev, M. Zabarankin, Portfolio optimization with drawdown con-

straints, submitted to Applied Mathematical Finance, relevant Research Report 2000-5 can

be downloaded from www.ise.u .edu/uryasev/pubs.html#t.

[13] G. Connor and H. Leland, Cash Management for Index Tracking, Financial Analysts

Journal 51(6) (November/December 1995), 75{80.

[14] H. Dalh, A. Meeraus, and S.A. Zenios, Some Financial Optimization Models: I risk

Management, in Financial optimization, S.A. Zenios ed., Cambridge University Press, (1993),

3-36.

[15] R.S. Dembo and A.J. King, Tracking Models and the Optimal Regret Distribution in

Asset Allocation. Applied Stochastic Models and Data Analysis. Vol. 8, (1992) 151{157.

[16] R. Dembo and D. Rosen, The Practice of Portfolio Replication. A Practical Overview of

Forward and Inverse Problems, Annals of Operations Research 85 (1999), 267–284.

[17] P. Embrechts, A. McNeil, D. Straumann, Correlation and dependency in risk

management: properties and pitfalls, To appear in "Risk Management: Value at Risk and Beyond."

Ed. M. Dempster. Cambridge University Press, Cambridge, (2001).

[18] P. Embrechts, C. Klüppelberg, T. Mikosch, Modelling Extremal Events for Insurance

and Finance, Springer, New York, (1997).

[19] Yu. Ermoliev and R. J-B Wets (Eds.), Numerical Techniques for Stochastic

Optimization, Springer Series in Computational Mathematics, 10 (1988).

[20] A.A. Gaivoronski, and G. Pflug, Value at Risk in portfolio optimization: properties

and computational approach, NTNU, Department of Industrial Economics and Technology

Management, Working paper, (July 2000).

[21] C. Gourieroux, J.P. Laurent, and O. Scaillet, Sensitivity Analysis of

Values-at-Risk, Working paper, Université de Louvain, (January 2000), can be downloaded from

http://www.gloriamundi.org.

[22] H. Grootveld and W.G. Hallerbach, Upgrading VaR from Diagnostic Metric to

Decision Variable: A Wise Thing to Do?, Report 2003 Erasmus Center for Financial Research,

(June 2000).

[23] Ph. Jorion, Value at Risk: A New Benchmark for Measuring Derivatives Risk. Irwin

Professional Pub. (1996).

[24] P. Kall and S.W. Wallace, Stochastic Programming. Wiley, Chichester, (1994).

[25] Y.S. Kan and A.I. Kibzun, Stochastic Programming Problems with Probability and

Quantile Functions, John Wiley & Sons, (1996) 316.

[26] R. Kast, E. Luciano, and L. Peccati, VaR and Optimization, 2nd International

Workshop on Preferences and Decisions, Trento, (July 1998).

[27] H. Konno and H. Yamazaki, Mean Absolute Deviation Portfolio Optimization Model

and Its Application to Tokyo Stock Market, Management Science, 37, (1991) 519-531.

[28] H. Konno and A. Wijayanayake, Minimal Cost Index Tracking under Nonlinear

Transaction Costs and Minimal Transaction Unit Constraints, Tokyo Institute of Technology,

CRAFT Working paper 00-07, (2000).

[29] H.M. Markowitz, Portfolio Selection. Journal of Finance. Vol. 7, No. 1, (1952) 77–91.

[30] H. Mausser and D. Rosen, Beyond VaR: From Measuring Risk to Managing Risk, Algo

Research Quarterly, Vol. 1, No. 2, (1998), 5–20.

[31] H. Mausser and D. Rosen, Efficient Risk/Return Frontiers for Credit Risk, Algo Research

Quarterly, Vol. 2, No. 4, (1999), 35–47.

[32] G. Pflug, Some remarks on the value-at-risk and the conditional value-at-risk, in

"Probabilistic Constrained Optimization: Methodology and Applications" (S. Uryasev ed.), Kluwer

Academic Publishers, 2000.

[33] G.Ch. Pflug, Optimization of Stochastic Models: The Interface Between Simulation and

Optimization, Kluwer Academic Publishers, (1996).

[34] J. Palmquist, S. Uryasev, and P. Krokhmal, Portfolio optimization with conditional

value-at-risk criterion, Journal of Risk, forthcoming (relevant Research Report 99-14 can be

downloaded from www.ise.ufl.edu/uryasev/pal.pdf).

[35] M. Pritsker, Evaluating Value at Risk Methodologies, Journal of Financial Services

Research, 12:2/3, (1997) 201–242.

[36] A. Prekopa, Stochastic Programming, Kluwer Academic Publishers, 1995.

[37] A. Puelz, Value-at-Risk Based Portfolio Optimization. Working paper, Southern Methodist

University, (November 1999).

[38] R. T. Rockafellar, Convex Analysis, Princeton University Press, 1970; available since

1997 in paperback in the series Princeton Landmarks in Mathematics and Physics.

[39] R. T. Rockafellar, S. Uryasev, Optimization of conditional value-at-risk, Journal of

Risk 2 (2000), 21–41.

[40] R. T. Rockafellar and R. J-B Wets, Variational Analysis, Grundlehren der Math.

Wissenschaften 317, Springer Verlag, 1997.

[41] R. Rubinstein and A. Shapiro, Discrete Event Systems: Sensitivity Analysis and

Stochastic Optimization via the Score Function Method, Wiley, Chichester, (1993).

[42] A. Rudd, Optimal Selection of Passive Portfolios, Financial Management, (Spring 1980),

57-66.

[43] D. Tasche, Risk contributions and performance measurement, Working paper, Munich

University of Technology, (July 1999).

[44] W.M. Toy and M.A. Zurack, Tracking the Euro-Pac Index, The Journal of Portfolio

Management, 15(2) (Winter 1989), 55–58.

[45] M.R. Young, A Minimax Portfolio Selection Rule with Linear Programming Solution.

Management Science, Vol. 44, No. 5, (1998) 673–683.

[46] S.A. Zenios (Ed.), Financial Optimization, Cambridge Univ Press, (1993).

[47] T.W. Ziemba and M.J. Mulvey (Eds.), Worldwide Asset and Liability Modeling,

Cambridge Press, Publications of the Newton Institute, (1998).
