
Testing for Concordance Ordering


Ana C. Cebrián

Dpto. Métodos Estadísticos. Ed. Matemáticas
Facultad de Ciencias
Universidad de Zaragoza
CP. Cerbuna, 12
S-Zaragoza 50009, Spain
[email protected]

Michel Denuit

Institut de Statistique
Université Catholique de Louvain
Voie du Roman Pays, 20
B-1348 Louvain-la-Neuve, Belgium
[email protected]

Olivier Scaillet

HEC Genève and FAME
Université de Genève
Bd Carl Vogt, 102
CH-1211 Genève 4, Suisse
[email protected]

February 21, 2002

Abstract

We propose inference tools to analyse the concordance ordering of random vectors. The analysis in the bivariate case relies on tests for upper and lower quadrant dominance of the true distribution by a parametric or semiparametric model, i.e. for a parametric or semiparametric model to give a probability that two variables are simultaneously small or large at least as great as it would be were they left unspecified. Tests for its generalisation in higher dimensions, namely joint lower and upper orthant dominance, are also analysed. The parametric and semiparametric settings are based on the copula representation for multivariate distributions, which allows for disentangling the behaviour of the margins from the dependence structure. We propose two types of testing procedures for each setting. The first procedure is based on a formulation of the dominance concepts in terms of values taken by random variables, while the second procedure is based on a formulation in terms of probability levels. For each formulation a distance test and an intersection-union test for inequality constraints are developed, depending on the definition of the null and alternative hypotheses. An empirical illustration is given for US insurance claim data.

Key words and phrases: Nonparametric, Concordance Ordering, Quadrant Dominance, Orthant Dominance, Copula, Inequality Constraint Tests, Risk Management, Loss Severity Distribution.

JEL Classification: C12, D81, G10, G21, G22.

MSC 2000: 62G10, 62P05.

1 Introduction

Random variables are concordant if they tend to be all large together or all small together. Concordance of random variables conveys the idea of clustering of large and small events. An ordering of concordance was initially considered for two random variables by Yanagimoto and Okamoto (1960), Cambanis, Simons and Stout (1976) and Tchen (1980), and then extended by Joe (1990) to the multivariate case. This ordering corresponds to a natural notion of stochastic dominance between two distribution functions with fixed marginals. Large and small values will tend to be associated more often under the distribution which dominates the other one.

Detection of concordant behaviour is especially important in risk management of large portfolios of insurance contracts or financial assets. In these portfolios the main risk is the occurrence of many joint default events or a simultaneous downside evolution of prices. An accurate knowledge of concordance between claims or financial asset prices helps to assess this risk of loss clustering, and therefore allows appropriate action to be taken to ensure that the risk incurred by the financial institution remains within its stated risk appetite. Clearly the presence of concordance affects risk measures and asset allocations resulting from optimal portfolio selection. Analysis of concordance cannot be neglected, and reveals much of the danger associated with a given position.

Modelling of concordance can be fully parametric or semiparametric. In the first case specific parametric forms are selected for the dependence structure and the margins, while in the second case margins are left unspecified. The dependence structure is expressed by means of a parametric copula function. Once these models are estimated, a natural concern of a risk manager ought to be: does the adopted modelling reflect the dependence structure present in the data safely enough? The aim of this paper is to provide inference tools to answer this question. These tools are tests for the concordance ordering of random variables. In fact we propose testing procedures for the concordance order between the chosen model and the empirical distribution. This allows for checking whether the estimated parametric or semiparametric model gives a safe picture of the association between small and large observed losses.

The paper is organized as follows. In Section 2, we formally define concordance and its ordering. We also recall the definition of copula functions, as well as the classical Sklar's representation theorem for multivariate distributions. Section 3 is devoted to illustrations of the practical relevance of the concordance order. Sections 4 and 5 are devoted to inference. The first one deals with the parametric setting, and the second one with the semiparametric setting. For each setting we develop inference formulated in terms of values taken by the random variables, i.e. loss or return levels, and in terms of probability levels. In Section 6 we develop the testing procedures, and describe the null and alternative hypotheses we are interested in. These procedures are closely related to inference tools for traditional first order and second order stochastic dominance, or for positive quadrant dependence. These tools also rely on distance and intersection-union tests for inequality constraints (see Davidson and Duclos (2000), Denuit and Scaillet (2001) and the references therein). An empirical illustration on US insurance claim data is proposed in Section 7. Section 8 contains some concluding remarks. Proofs are gathered in an appendix.

2 Concordance order

Let $F$ and $G$ denote two $n$-dimensional cdf's, and $\bar F$ and $\bar G$ their corresponding survival functions. Then $G$ is more concordant than $F$, written $F \preceq_c G$, if

$$F(y) \le G(y) \quad\text{and}\quad \bar F(y) \le \bar G(y), \qquad \forall y \in \mathbb{R}^n. \tag{2.1}$$

The first inequality $F(y) \le G(y)$ corresponds to lower orthant dominance, while the second one $\bar F(y) \le \bar G(y)$ corresponds to upper orthant dominance. If both inequalities hold, large and small values will tend to be associated more often under $G$ than under $F$. Condition (2.1) implies that $F$ and $G$ have the same $j$th univariate marginal distribution ($j = 1, \dots, n$), and that all bivariate and higher dimensional marginals of $G$ are more concordant than the corresponding ones for $F$. Hence we see that the ordering of concordance of variables is derived from comparisons of pairs of distributions with identical marginals (Yanagimoto and Okamoto (1960), Cambanis, Simons and Stout (1976), Tchen (1980), Joe (1990)). Besides, the concordance order is equivalent to a partial order among the parameters for elliptically contoured distributions, such as the multivariate normal and Student distributions. Henceforth we will freely use $\preceq_c$ between cdf's or random vectors to indicate that (2.1) holds.

At this stage, it is interesting to stress the very particular nature of the bivariate case (compared to dimension $\ge 3$). Indeed, consider $F$ and $G$ with identical marginals. Then it is easily seen that either of the two inequalities in (2.1) implies the other, since $\bar F(y_1, y_2) = 1 - F_1(y_1) - F_2(y_2) + F(y_1, y_2)$ in dimension 2. It will be seen further in the paper that the bivariate case is really particular, mainly because the concordance order coincides with the supermodular order for random couples (this is not the case for random vectors of larger dimension).

The concordance order can also be characterised in terms of copulas. The marginal pdf and cdf of each element $Y_j$ of $Y$ at point $y_j$, $j = 1, \dots, n$, will be written $f_j(y_j)$ and $F_j(y_j)$, respectively. How the joint distribution $F$ is "coupled" to its univariate margins $F_j$ can be described by a copula. While the joint distribution $F$ provides complete information concerning the behaviour of $Y$, copulas allow for separating dependence and marginal behaviour of the elements constituting $Y = (Y_1, \dots, Y_n)'$. Before formally defining a copula, we would like to refer the reader to Nelsen (1999) and Joe (1997) for more extensive theoretical treatments.

An $n$-dimensional copula $C$ is simply (the restriction to $[0, 1]^n$ of) an $n$-dimensional cdf with unit uniform marginals. The reason why a copula is useful in revealing the link between the joint distribution and its margins transpires from the following theorem.

Theorem 2.1. (Sklar's Theorem) Let $F$ be an $n$-dimensional cdf with margins $F_1, \dots, F_n$. Then there exists an $n$-copula $C$ such that for all $y$ in $\mathbb{R}^n$,

$$F(y) = C\big(F_1(y_1), \dots, F_n(y_n)\big). \tag{2.2}$$

If $F_1, \dots, F_n$ are all continuous, then $C$ is uniquely defined. Otherwise, $C$ is uniquely determined on $\mathrm{range}\,F_1 \times \dots \times \mathrm{range}\,F_n$. Conversely, if $C$ is an $n$-copula and $F_1, \dots, F_n$ are cdf's, then the function $F$ defined by (2.2) is an $n$-dimensional cdf with margins $F_1, \dots, F_n$.

Although copulas constitute a less well-known approach to describing dependence than correlation, they offer the best understanding of the general concept of dependency (see Embrechts, McNeil and Straumann (2000) for implications on risk management). In particular, copulas share the nice property that strictly increasing transformations of the underlying random variables result in the transformed variables having the same copula (which is not true for linear correlation).

As an immediate corollary of Sklar's Theorem, we have

$$C(u) = F\big(F_1^{-1}(u_1), \dots, F_n^{-1}(u_n)\big) \tag{2.3}$$

for any $u \in [0, 1]^n$. From Expression (2.3), we may observe that the dependence structure embodied by the copula can be recovered from the knowledge of the joint cdf $F$ and its margins $F_j$.

Let us now state the definition (2.1) of concordance in terms of copulas. We denote by $\bar C$ the survival copula associated with $C$, and use subscripts $F$ and $G$ to reflect their associated distribution. If $F$ and $G$ share the same univariate margins, $G$ is more concordant than $F$ if

$$C_F(u) \le C_G(u) \quad\text{and}\quad \bar C_F(u) \le \bar C_G(u), \qquad \forall u \in [0, 1]^n. \tag{2.4}$$
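Condition (2.4) can be checked numerically for a given pair of copulas. The following is a minimal sketch, not part of the original paper: it assumes the bivariate Clayton family as an illustrative choice, reads $\bar C(u)$ as the joint survival function $P[U_1 > u_1, U_2 > u_2]$ of the copula, and uses a hypothetical helper `more_concordant_on_grid` that evaluates both inequalities on a finite interior grid. Such a grid check can refute (2.4) but cannot prove it for all $u$.

```python
import numpy as np

def clayton_cdf(u, v, theta):
    """Bivariate Clayton copula C(u, v; theta), theta > 0 (illustrative family)."""
    return (u ** (-theta) + v ** (-theta) - 1.0) ** (-1.0 / theta)

def clayton_survival(u, v, theta):
    """Joint survival function P[U > u, V > v] of the Clayton copula."""
    return 1.0 - u - v + clayton_cdf(u, v, theta)

def more_concordant_on_grid(theta_f, theta_g, n_grid=19, tol=1e-12):
    """Check both inequalities of (2.4) on an interior grid of [0, 1]^2."""
    grid = np.linspace(0.05, 0.95, n_grid)
    u, v = np.meshgrid(grid, grid)
    lower = clayton_cdf(u, v, theta_f) <= clayton_cdf(u, v, theta_g) + tol
    upper = clayton_survival(u, v, theta_f) <= clayton_survival(u, v, theta_g) + tol
    return bool(lower.all() and upper.all())

# The Clayton family is increasing in concordance with theta, so the first
# comparison should hold on the grid and the reversed one should fail.
is_ordered = more_concordant_on_grid(theta_f=0.5, theta_g=2.0)
is_reversed = more_concordant_on_grid(theta_f=2.0, theta_g=0.5)
```

In the bivariate case the two inequalities are equivalent, so checking both is redundant here; the sketch keeps them separate because they differ in dimension 3 and above.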

Note that the density $c$ associated with the copula $C$ is given by:

$$c(u_1, \dots, u_n) = \frac{\partial^n C(u_1, \dots, u_n)}{\partial u_1 \cdots \partial u_n}.$$

The density $f$ of $F$ can be expressed in terms of the copula density $c$ and the product of the univariate marginal densities $f_j$:

$$f(y_1, \dots, y_n) = c\big(F_1(y_1), \dots, F_n(y_n)\big) \prod_{j=1}^{n} f_j(y_j).$$

Obviously the latter equality does not depend on a parametric assumption for the multivariate distribution. In the inference part of this paper we will consider two cases. The first case will put some parametric assumption $f_j(y_j; \beta_j)$ on the margins, while the second will not. The copula will be parameterised according to $c(u_1, \dots, u_n; \theta)$ in both cases.

Let us remark that independence between random variables can be characterised through copulas. Indeed, $n$ random variables are independent if, and only if, their copula is $C(u) = C^{\perp}(u) = \prod_{j=1}^{n} u_j$, for all $u \in [0, 1]^n$. $C^{\perp}$ is further referred to as the independence copula.

Before proceeding further with tests for concordance ordering we illustrate its practical relevance for insurance and finance in the next section.

3 Applications of the concordance order

Stochastic orderings are binary relations defined on classes of probability distributions. They aim to mathematically translate intuitive ideas like "being larger" or "being more variable" for random quantities. They thus extend the classical mean-variance approach to compare riskiness.

Let us define the following utility classes. Let $\mathcal{U}_1$ contain all non-decreasing utilities $u : \mathbb{R} \to \mathbb{R}$. Let $\mathcal{U}_2$ be the restriction of $\mathcal{U}_1$ to its concave elements. In what follows, we assume that decision-makers maximize a von Neumann-Morgenstern expected utility (but we mention that results involving $\mathcal{U}_1$ and $\mathcal{U}_2$ still hold in dual theories for choice under risk, see e.g. Denuit, Dhaene and Van Wouwe (1999) for further information).

Let $Y_1$ and $Y_2$ be two random variables such that $Eu(Y_1) \le Eu(Y_2)$ holds for all $u \in \mathcal{U}_1$ (resp. $u \in \mathcal{U}_2$), provided the expectations exist. Then $Y_1$ is said to be smaller than $Y_2$ in the stochastic dominance (resp. increasing concave) order, denoted as $Y_1 \preceq_d Y_2$ (resp. $Y_1 \preceq_{icv} Y_2$). From the very definitions of $\preceq_d$ and $\preceq_{icv}$, we see that these stochastic orderings express the common preferences of the classes of profit-seeking decision-makers, and of profit-seeking risk-averters, respectively. This provides an intuitive meaning to rankings in the $\preceq_d$- or $\preceq_{icv}$-sense. If $Y_1 \preceq_{icv} Y_2$ and $EY_1 = EY_2$, then we write $Y_1 \preceq_{cv} Y_2$. In this case $Eu(Y_1) \le Eu(Y_2)$ for all the concave utilities $u$, so that $Y_2$ is preferred over $Y_1$ by all risk-averters.

For a more detailed exposition of stochastic orderings, see e.g. the review papers by Kroll and Levy (1980) and Levy (1992), the classified bibliography by Mosler and Scarsini (1993) and the book by Shaked and Shanthikumar (1994).

Statistical inference for $\preceq_d$ and $\preceq_{icv}$ is investigated in great detail in Davidson and Duclos (2000), where connections with economic and social welfare in different populations are explained.

The bivariate version of $\preceq_c$ coincides with the supermodular order. This yields a host of useful results for random couples (which are no longer valid in dimensions $\ge 3$ because concordance and supermodular orders are then strongly distinct, see e.g. Müller (1997) for further details on this issue).

The main interest of the concordance order among random couples comes from the following result, of which we provide a short proof in Appendix A. It is a straightforward adaptation of the result of Dhaene and Goovaerts (1996) established in the convex actuarial setting.

Proposition 3.1. If $X \preceq_c Y$ then $Y_1 + Y_2 \preceq_{cv} X_1 + X_2$.

This means that when $X \preceq_c Y$ holds, every risk-averter agrees that $Y_1 + Y_2$ is less favourable than $X_1 + X_2$. Consequently, most insurance premiums and risk measures will be larger for $Y_1 + Y_2$ than for $X_1 + X_2$ (since the principles used to calculate such quantities are in accordance with the common preferences of risk-averters). For instance, since the function $x \mapsto -(x - \kappa)_+$, with $(\cdot)_+ = \max\{0, \cdot\}$, is concave for any $\kappa \in \mathbb{R}$, the inequality $E(X_1 + X_2 - \kappa)_+ \le E(Y_1 + Y_2 - \kappa)_+$ holds true for all $\kappa$. The quantity $E(Y_1 + Y_2 - \kappa)_+$ is referred to as the stop-loss premium relating to the risk portfolio $Y_1 + Y_2$ in actuarial science ($\kappa$ is called the deductible). In finance, when appropriately discounted, it can be regarded as the price of a basket option with $Y_1$ and $Y_2$ as underlying assets and $\kappa$ as strike price.
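The stop-loss inequality can be illustrated by simulation. The sketch below is an assumption-laden example rather than anything taken from the paper: it uses unit exponential margins, takes $X$ with independent components and $Y$ with comonotonic components (so that $X \preceq_c Y$, the comonotonic couple being the upper Fréchet bound), and fixes an arbitrary deductible `kappa = 2`. For this particular choice the two premiums have closed forms, $4e^{-2}$ and $2e^{-1}$, which the Monte Carlo averages should approximate.

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws = 200_000
kappa = 2.0  # deductible (illustrative value, not from the paper)

# Unit exponential margins obtained by inverting the cdf at uniform draws.
u1 = rng.uniform(size=n_draws)
u2 = rng.uniform(size=n_draws)
x1, x2 = -np.log1p(-u1), -np.log1p(-u2)   # independent couple X
y1, y2 = x1, -np.log1p(-u1)               # comonotonic couple Y, same margins

def stop_loss(total, deductible):
    """Monte Carlo estimate of E(total - deductible)_+."""
    return np.maximum(total - deductible, 0.0).mean()

sl_x = stop_loss(x1 + x2, kappa)   # premium under the less concordant couple
sl_y = stop_loss(y1 + y2, kappa)   # premium under the more concordant couple
```

The ordering `sl_x <= sl_y` is what Proposition 3.1 predicts; the size of the gap depends on the margins and on the deductible chosen here.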

In particular, we see that if $X \preceq_c Y$ then

$$\mathrm{Var}[\alpha_1 X_1 + \alpha_2 X_2] \le \mathrm{Var}[\alpha_1 Y_1 + \alpha_2 Y_2] \quad \text{for all } \alpha_1, \alpha_2 > 0,$$

that is, the variance of a linear combination with positive weights of each coordinate will be lower when computed under $F$ than under $G$. In finance this means that portfolios with short sales constraints will be considered as more efficient in the mean-variance sense under $F$ than under $G$.

For every $X$ with marginals $F_1$ and $F_2$, the stochastic inequalities

$$\big(F_1^{-1}(U), F_2^{-1}(1 - U)\big) \preceq_c X \preceq_c \big(F_1^{-1}(U), F_2^{-1}(U)\big) \tag{3.1}$$

are valid, where $U$ stands for a unit uniform random variable. The random vectors involved in (3.1) are referred to as the Fréchet bounds; they represent perfect negative and perfect positive dependence, respectively.

A powerful closure property of the concordance order is given next (it is easily established by coming back to the definition (2.4) of $\preceq_c$ in terms of copulas). For all non-decreasing functions $\phi$ and $\psi$, the implication

$$(X_1, X_2) \preceq_c (Y_1, Y_2) \;\Rightarrow\; (\phi(X_1), \psi(X_2)) \preceq_c (\phi(Y_1), \psi(Y_2))$$

holds true. So, $\preceq_c$ has a functional invariance property. As a simple illustration of the relevance of this result, suppose that we have a probability model (multivariate distribution) for dependent insurance losses of various kinds. If we decide that our interest now lies in modelling the logarithm of these losses, the $\preceq_c$ ranking will not change. The same holds if we change from a model of percentage returns on several financial assets to a model of logarithmic returns. This also clearly shows that an ordering in the $\preceq_c$ sense only depends on the underlying copula once the marginals have been fixed.
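The invariance under increasing transformations has a simple sample counterpart: the ranks of the observations, which carry all the information about the empirical copula, do not change when losses are replaced by their logarithms. The following sketch illustrates this on simulated data; the helper name `pseudo_observations` and the data-generating choices are assumptions made for the example only.

```python
import numpy as np

def pseudo_observations(sample):
    """Column-wise ranks rescaled to (0, 1]; they determine the empirical copula."""
    ranks = sample.argsort(axis=0).argsort(axis=0) + 1
    return ranks / sample.shape[0]

rng = np.random.default_rng(1)
losses = rng.lognormal(size=(500, 2))     # positive, continuous (no ties expected)
losses[:, 1] += 0.5 * losses[:, 0]        # induce some positive dependence

# Taking logarithms is strictly increasing, so the ranks are unchanged.
same_copula = np.array_equal(pseudo_observations(losses),
                             pseudo_observations(np.log(losses)))
```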

It is also known that the concave order is closely related to the Lorenz order. Let us recall that the Lorenz order is defined by means of pointwise comparison of Lorenz curves. The latter is used in economics to measure the inequality of incomes (see Beach and Davidson (1983), Dardanoni and Forcina (1999) for related inference). More precisely, let $Y$ be a non-negative random variable with cdf $F$. The Lorenz curve $L$ associated with $Y$ is then defined by

$$L(p) = \frac{1}{EY} \int_{0}^{p} F^{-1}(u)\,du, \qquad p \in [0, 1].$$

When $Y$ represents the income of the individuals in some population, $L$ maps $p \in [0, 1]$ to the proportion of the total income of the population which accrues to the poorest $100p\,\%$ of the population.
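For a finite sample the Lorenz curve is piecewise linear and can be computed from cumulative sums of the sorted incomes. The function below is a minimal sketch of this empirical version (the name `lorenz_curve` is an illustrative choice); it replaces $F^{-1}$ by the empirical quantile function.

```python
import numpy as np

def lorenz_curve(incomes, p):
    """Empirical L(p): share of total income held by the poorest 100p %."""
    y = np.sort(np.asarray(incomes, dtype=float))
    # Cumulative income share at population shares 0, 1/T, 2/T, ..., 1.
    cum_share = np.concatenate(([0.0], np.cumsum(y) / y.sum()))
    pop_share = np.linspace(0.0, 1.0, y.size + 1)
    # Linear interpolation is exact for the integral of a step quantile function.
    return np.interp(p, pop_share, cum_share)

equal = lorenz_curve([1, 1, 1, 1], 0.5)      # perfect equality: L(p) = p
unequal = lorenz_curve([1, 2, 3, 4], 0.5)    # poorest half holds (1 + 2) / 10
```

A sample with a uniformly lower curve exhibits more inequality in the Lorenz sense, which is how the pointwise comparison of curves is used in the text that follows.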

Consider two non-negative random variables $Y_1$ and $Y_2$ with finite expectations. Then, $Y_1$ is said to be smaller than $Y_2$ in the Lorenz order, henceforth denoted by $Y_1 \preceq_{Lorenz} Y_2$, when $L_1(p) \ge L_2(p)$ for all $p \in [0, 1]$. When $Y_1 \preceq_{Lorenz} Y_2$ holds, $Y_1$ does not exhibit more inequality in the Lorenz sense than does $Y_2$. A standard reference for $\preceq_{Lorenz}$ is Arnold (1987).

Provided $EY_1 = EY_2$, it can be shown that $Y_1 \preceq_{Lorenz} Y_2 \Leftrightarrow Y_2 \preceq_{cv} Y_1$. Hence, if $X \preceq_c Y$ then $X_1 + X_2 \preceq_{Lorenz} Y_1 + Y_2$ by virtue of Proposition 3.1.

The usefulness of this relation can be illustrated as follows. Let $(X_1, X_2)$ represent the incomes of married couples ($X_1$ for the husband and $X_2$ for his wife) in some population. Assume that for some sociological or legal reason, $(X_1, X_2)$ is replaced with $(Y_1, Y_2)$ such that $(X_1, X_2) \preceq_c (Y_1, Y_2)$ (in words, this means that the incomes of husband and wife become more positively dependent, but the marginal incomes for married men and women remain unchanged). Then, this increases the inequality of incomes at the couples level since $X_1 + X_2 \preceq_{Lorenz} Y_1 + Y_2$ holds.

Let us now provide an illustration in relation to insurance premium calculation principles. Consider an insurance company with initial wealth $w$ and with a utility function $u \in \mathcal{U}_2$. The company covers a collective risk with an aggregate claim amount $S$. It wonders whether it should cover a new risk $X$, and if so, how to set the premium amount for this new risk. Of course, $X$ is correlated with $S$ (at least to some extent). The amount of premium $\pi(X)$ is determined following the adoption of an economic decision principle. We assume here that the insurance company sets its price for coverage $\pi(X)$ as the solution of the equation

$$Eu(w - S + \pi(X) - X) = u(w - S). \tag{3.2}$$

This way of computing premiums is classical in actuarial science; see e.g. Goovaerts et al. (1990) for more details. Condition (3.2) expresses that the premium $\pi(X)$ is fair in terms of utility: the right-hand side of (3.2) represents the utility of not issuing the contract; the left-hand side of (3.2) represents the expected utility of the insurer assuming the random financial loss $X$. Therefore (3.2) means that the expected utility of wealth with the contract is equal to the utility without the contract. This type of pricing principle is sometimes also used in finance to set prices of derivative assets (see e.g. Davis (1997), Karatzas and Kou (1996)).

In such a case, it is possible to show that the premium $\pi(X)$ should increase with the positive dependence existing between $X$ and $S$. Indeed, the implication

$$(S, X) \preceq_c (S, Y) \;\Rightarrow\; \pi(X) \le \pi(Y)$$

holds true, meaning that the safety loading increases with the dependence existing between the new risk and those already written by the company.

4 Inference under parametric specification

Now that the relevant theoretical concepts and applications have been presented, we may turn our attention to inference. We consider a setting made of i.i.d. observations $\{Y_t;\ t = 1, \dots, T\}$ of a random vector $Y$ taking values in $\mathbb{R}^n$. These data may correspond to observed individual losses on $n$ insurance contracts, to amounts of claims reported by a given policyholder on $n$ different guarantees in a multiline product, or to observed returns of $n$ financial assets. We begin with fully parametric specifications, and analyse two cases. The first one is based on a grid of loss or return levels, while the second one is based on a grid of probability levels.

4.1 Inference based on loss levels

Let us start with the parametric family

$$\big\{F(y; \nu) = C(F_1(y_1; \beta_1), \dots, F_n(y_n; \beta_n); \theta),\ \ \nu = (\beta', \theta')' \in \Psi \subset \mathbb{R}^{q+p}\big\}.$$

This parametric family is specified in terms of a parametric copula $C(u; \theta)$ and parametric margins $F_j(y_j; \beta_j)$, $j = 1, \dots, n$. The $q$-dimensional vector $\beta = (\beta_1', \dots, \beta_n')'$ and the $p$-dimensional vector $\theta$ forming $\nu$ are jointly estimated by pseudo maximum likelihood. The estimator $\hat\nu$ is derived from

$$\max_{\beta, \theta}\ \frac{1}{T} \sum_{t=1}^{T} \Big[ \ln c\big(F_1(Y_{1t}; \beta_1), \dots, F_n(Y_{nt}; \beta_n); \theta\big) + \sum_{j=1}^{n} \ln f_j(Y_{jt}; \beta_j) \Big],$$

and its limit, i.e. the pseudo true value, is denoted by $\nu_0 = (\beta_0', \theta_0')'$. We wish to check whether $F(\cdot; \nu_0)$ is more concordant than the true distribution function $F_0(\cdot)$, namely

$$F_0(y) \le F(y; \nu_0) \quad\text{and}\quad \bar F_0(y) \le \bar F(y; \nu_0), \qquad \forall y \in \mathbb{R}^n. \tag{4.1}$$

As in traditional stochastic dominance tests or positive quadrant dependence tests, we use a version of the conditions defining concordance on a predetermined grid, and only consider a fixed number of distinct points, say $d$ points $y_i = (y_{i1}, \dots, y_{in})'$ in $\mathbb{R}^n$. These points will typically span the whole range of possible values. We define $D_1^i = F(y_i; \nu_0) - F_0(y_i)$ and $\bar D_1^i = \bar F(y_i; \nu_0) - \bar F_0(y_i)$, and set $D_1 = (D_1^1, \dots, D_1^d)'$ and $\bar D_1 = (\bar D_1^1, \dots, \bar D_1^d)'$. The testing procedures described in Section 6 will be built from the empirical counterpart $\hat D_1^i$, resp. $\hat{\bar D}_1^i$, of $D_1^i$, resp. $\bar D_1^i$, obtained by substituting estimated and empirical distributions for the unknown parametric and true distributions. The joint empirical distribution is given by

$$\hat F(y_i) = \frac{1}{T} \sum_{t=1}^{T} \prod_{j=1}^{n} I[Y_{jt} \le y_{ij}], \qquad i = 1, \dots, d. \tag{4.2}$$
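Equation (4.2) translates directly into array operations. The sketch below evaluates the empirical cdf at $d$ grid points at once; it also includes the analogous empirical survival function with strict inequalities, which is an assumption about the form of $\hat{\bar F}$ since the text does not spell it out. Function names and the toy data are illustrative.

```python
import numpy as np

def empirical_cdf(sample, points):
    """F_hat(y_i) = (1/T) sum_t prod_j I[Y_jt <= y_ij], as in (4.2).

    sample has shape (T, n), points has shape (d, n); returns shape (d,).
    """
    below = sample[:, None, :] <= points[None, :, :]   # shape (T, d, n)
    return below.all(axis=2).mean(axis=0)

def empirical_survival(sample, points):
    """Assumed counterpart: (1/T) sum_t prod_j I[Y_jt > y_ij]."""
    above = sample[:, None, :] > points[None, :, :]
    return above.all(axis=2).mean(axis=0)

sample = np.array([[1.0, 2.0], [2.0, 1.0], [3.0, 3.0], [0.5, 0.5]])
grid = np.array([[1.0, 2.0], [2.5, 2.5]])
cdf_values = empirical_cdf(sample, grid)             # [0.5, 0.75]
survival_values = empirical_survival(sample, grid)   # [0.25, 0.25]
```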

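The empirical counterparts are thus equal to:

$$\hat D_1^i = C\big(F_1(y_{i1}; \hat\beta_1), \dots, F_n(y_{in}; \hat\beta_n); \hat\theta\big) - \hat F(y_i),$$

and

$$\hat{\bar D}_1^i = \bar C\big(F_1(y_{i1}; \hat\beta_1), \dots, F_n(y_{in}; \hat\beta_n); \hat\theta\big) - \hat{\bar F}(y_i).$$

The following proposition characterizes the joint asymptotic distribution of $\hat D_1$ and $\hat{\bar D}_1$. The subscript in $E_0$ and $\mathrm{Cov}_0$ refers to integration w.r.t. the true distribution $F_0$.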

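Proposition 4.1. The random vector $\sqrt{T}(\hat D_1 - D_1)$, resp. $\sqrt{T}(\hat{\bar D}_1 - \bar D_1)$, converges in distribution to a $d$-dimensional normal random variable with mean zero and covariance matrix $V_1$, resp. $\bar V_1$, whose elements are

$$v_{1,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_1^k, \hat D_1^l\big] = B_{\nu_0}^{k\prime}\,\mathrm{Cov}_0\big[S_{\nu_0}^k, S_{\nu_0}^l\big]\,B_{\nu_0}^l, \qquad k, l = 1, \dots, d,$$

with

$$B_{\nu_0}^i = \big(\nabla_{\nu_0}^i C'\,J_{\nu_0}^{-1},\ -1\big)', \qquad S_{\nu_0}^i = \Big(\frac{\partial}{\partial \nu'} \log f(Y; \nu_0),\ I[Y \le y_i]\Big)',$$

and

$$\bar v_{1,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat{\bar D}_1^k, \hat{\bar D}_1^l\big] = \bar B_{\nu_0}^{k\prime}\,\mathrm{Cov}_0\big[\bar S_{\nu_0}^k, \bar S_{\nu_0}^l\big]\,\bar B_{\nu_0}^l, \qquad k, l = 1, \dots, d,$$

with

$$\bar B_{\nu_0}^i = \big(\nabla_{\nu_0}^i \bar C'\,J_{\nu_0}^{-1},\ -1\big)', \qquad \bar S_{\nu_0}^i = \Big(\frac{\partial}{\partial \nu'} \log f(Y; \nu_0),\ I[Y > y_i]\Big)',$$

where

$$J_{\nu_0} = E_0\Big[-\frac{\partial^2}{\partial \nu\,\partial \nu'} \log f(Y; \nu_0)\Big],$$

while

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_1^k, \hat{\bar D}_1^l\big] = B_{\nu_0}^{k\prime}\,\mathrm{Cov}_0\big[S_{\nu_0}^k, \bar S_{\nu_0}^l\big]\,\bar B_{\nu_0}^l, \qquad k, l = 1, \dots, d.$$

A consistent estimate of each covariance can be obtained by replacing expectations with empirical averages, and unknown parameter values by their estimates.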

4.2 Inference based on probability levels

Let us now proceed with the analogous quantities when we use probability levels instead of loss levels, and take $d$ points $u_i = (u_{i1}, \dots, u_{in})'$, with $u_{ij} \in (0, 1)$, $i = 1, \dots, d$, $j = 1, \dots, n$. We assume hereafter that the cdf $F_{j0}$ is such that the equation $F_{j0}(y) = u_{ij}$ admits a unique solution denoted $\zeta_{ij}$, $i = 1, \dots, d$, $j = 1, \dots, n$, while $f_{j0}(\zeta_{ij}) > 0$ at each quantile $\zeta_{ij}$. We denote the stack of the univariate quantiles $\zeta_{ij}$ by $\zeta_i$.

We may then define $D_2^i = C(u_i; \theta_0) - F_0(\zeta_i)$, and $D_2 = (D_2^1, \dots, D_2^d)'$. The survival quantities will be $\bar D_2^i = \bar C(u_i; \theta_0) - \bar F_0(\zeta_i)$, and $\bar D_2 = (\bar D_2^1, \dots, \bar D_2^d)'$. The empirical counterparts are then $\hat D_2^i = C(u_i; \hat\theta) - \hat F(\hat\zeta_i)$ and $\hat{\bar D}_2^i = \bar C(u_i; \hat\theta) - \hat{\bar F}(\hat\zeta_i)$, where $\hat\zeta_i = (\hat\zeta_{i1}, \dots, \hat\zeta_{in})'$ is made of the empirical univariate quantiles $\hat\zeta_{ij}$. The main difference when compared with the previous case is that the loss levels are no longer given deterministic values, but quantiles estimated on the basis of sample information, and thus random quantities.
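To make the construction of $\hat D_2^i$ concrete, here is a minimal sketch under stated assumptions: the copula is taken to be the $n$-dimensional Clayton copula (an illustrative choice), `theta_hat` is a hypothetical estimate supplied by the user, and the empirical quantile $\hat\zeta_{ij}$ is taken as the smallest order statistic at which the empirical margin reaches $u_{ij}$. None of these names come from the paper.

```python
import numpy as np

def clayton_cdf(u, theta):
    """n-dimensional Clayton copula evaluated at the rows of u (shape (d, n))."""
    return (np.sum(u ** (-theta), axis=-1) - u.shape[-1] + 1.0) ** (-1.0 / theta)

def empirical_quantiles(sample, levels):
    """zeta_hat_ij: smallest order statistic of margin j with F_hat_j >= u_ij."""
    T = sample.shape[0]
    ordered = np.sort(sample, axis=0)                               # (T, n)
    idx = np.clip(np.ceil(T * levels).astype(int) - 1, 0, T - 1)    # (d, n)
    return np.take_along_axis(ordered, idx, axis=0)                 # (d, n)

def d2_hat(sample, levels, theta_hat):
    """D2_hat_i = C(u_i; theta_hat) - F_hat(zeta_hat_i) for each grid point."""
    zeta_hat = empirical_quantiles(sample, levels)
    below = sample[:, None, :] <= zeta_hat[None, :, :]              # (T, d, n)
    f_hat = below.all(axis=2).mean(axis=0)
    return clayton_cdf(levels, theta_hat) - f_hat

# Toy comonotonic sample: the data are more concordant than any Clayton
# copula, so the difference is negative at the median point.
sample = np.array([[1.0, 1.0], [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]])
levels = np.array([[0.5, 0.5]])
d_values = d2_hat(sample, levels, theta_hat=1.0)   # 1/3 - 1/2
```

A negative component such as this one is exactly the situation in which the distance test of Section 6 becomes relevant.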

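Proposition 4.2. The random vector $\sqrt{T}(\hat D_2 - D_2)$, resp. $\sqrt{T}(\hat{\bar D}_2 - \bar D_2)$, converges in distribution to a $d$-dimensional normal random variable with mean zero and covariance matrix $V_2$, resp. $\bar V_2$, whose elements are

$$v_{2,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_2^k, \hat D_2^l\big] = B_{\theta_0}^{k\prime}\,\mathrm{Cov}_0\big[S_{\theta_0}^k, S_{\theta_0}^l\big]\,B_{\theta_0}^l, \qquad k, l = 1, \dots, d,$$

with

$$B_{\theta_0}^i = \Big(\nabla_{\theta_0}^i C'\,J_{\theta_0}^{-1},\ -1,\ \frac{\partial F_0(\zeta_i)/\partial x_1}{f_{10}(\zeta_{i1})},\ \dots,\ \frac{\partial F_0(\zeta_i)/\partial x_n}{f_{n0}(\zeta_{in})}\Big)',$$

$$S_{\theta_0}^i = \Big(\frac{\partial}{\partial \theta'} \log f(Y; \nu_0),\ I[Y \le \zeta_i],\ I[Y_1 \le \zeta_{i1}],\ \dots,\ I[Y_n \le \zeta_{in}]\Big)',$$

and

$$\bar v_{2,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat{\bar D}_2^k, \hat{\bar D}_2^l\big] = \bar B_{\theta_0}^{k\prime}\,\mathrm{Cov}_0\big[\bar S_{\theta_0}^k, \bar S_{\theta_0}^l\big]\,\bar B_{\theta_0}^l, \qquad k, l = 1, \dots, d,$$

with

$$\bar B_{\theta_0}^i = \Big(\nabla_{\theta_0}^i \bar C'\,J_{\theta_0}^{-1},\ -1,\ \frac{\partial \bar F_0(\zeta_i)/\partial x_1}{f_{10}(\zeta_{i1})},\ \dots,\ \frac{\partial \bar F_0(\zeta_i)/\partial x_n}{f_{n0}(\zeta_{in})}\Big)',$$

$$\bar S_{\theta_0}^i = \Big(\frac{\partial}{\partial \theta'} \log f(Y; \nu_0),\ I[Y > \zeta_i],\ I[Y_1 \le \zeta_{i1}],\ \dots,\ I[Y_n \le \zeta_{in}]\Big)',$$

where

$$J_{\theta_0} = E_0\Big[-\frac{\partial^2}{\partial \theta\,\partial \theta'} \log f(Y; \nu_0)\Big],$$

while the elements of the cross covariance matrix $CV_2$ are

$$cv_{2,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_2^k, \hat{\bar D}_2^l\big] = B_{\theta_0}^{k\prime}\,\mathrm{Cov}_0\big[S_{\theta_0}^k, \bar S_{\theta_0}^l\big]\,\bar B_{\theta_0}^l, \qquad k, l = 1, \dots, d.$$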

Some of the asymptotic covariances involve derivatives of $F_0$ and the univariate densities $f_{j0}$. These quantities may be estimated by standard kernel methods (see e.g. Scott (1992)) in order to deliver a consistent covariance estimate. For example we may take a Gaussian kernel and different bandwidth values $h_j$ in each dimension, which leads to:

$$\frac{\partial \hat F(\hat\zeta_i)}{\partial x_j} = (T h_j)^{-1} \sum_{t=1}^{T} \varphi\Big(\frac{Y_{jt} - \hat\zeta_{ij}}{h_j}\Big) \prod_{l \ne j} \Phi\Big(\frac{\hat\zeta_{il} - Y_{lt}}{h_l}\Big),$$

$$\hat f_j(\hat\zeta_{ij}) = (T h_j)^{-1} \sum_{t=1}^{T} \varphi\Big(\frac{Y_{jt} - \hat\zeta_{ij}}{h_j}\Big),$$

where $\varphi$ and $\Phi$ denote the pdf and cdf of a standard Gaussian variable. In the empirical section of the paper, we opt for the standard choice (rule of thumb) for the bandwidths $h_j$, that is $1.05\,T^{-1/5}$ times the estimated standard deviation of $Y_j$.
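The two kernel estimators can be sketched as follows. This is an illustrative implementation, not the authors' code: the function names are invented for the example, the smoothed cdf uses the usual convention $\Phi((\zeta - Y)/h)$ in the coordinates other than $j$, and the standard normal cdf is computed from `math.erf` to avoid any dependency beyond NumPy.

```python
import math
import numpy as np

_erf = np.vectorize(math.erf)

def std_normal_pdf(x):
    return np.exp(-0.5 * x ** 2) / math.sqrt(2.0 * math.pi)

def std_normal_cdf(x):
    return 0.5 * (1.0 + _erf(x / math.sqrt(2.0)))

def rule_of_thumb_bandwidths(sample):
    """h_j = 1.05 * T^(-1/5) * estimated standard deviation of Y_j."""
    T = sample.shape[0]
    return 1.05 * T ** (-0.2) * sample.std(axis=0, ddof=1)

def kernel_marginal_density(sample, zeta, j, h):
    """Gaussian kernel estimate of f_j at zeta[j]."""
    z = (sample[:, j] - zeta[j]) / h[j]
    return std_normal_pdf(z).mean() / h[j]

def kernel_cdf_partial(sample, zeta, j, h):
    """Kernel estimate of dF(zeta)/dx_j: pdf kernel in coordinate j,
    cdf kernels in all other coordinates."""
    z = (zeta[None, :] - sample) / h[None, :]      # shape (T, n)
    weights = std_normal_cdf(z)
    weights[:, j] = std_normal_pdf(z[:, j])
    return weights.prod(axis=1).mean() / h[j]
```

As a usage example, for a large sample of independent standard normal pairs and `zeta = np.zeros(2)`, the two estimates should be close to $\varphi(0)$ and $\varphi(0)\Phi(0)$ respectively, up to smoothing bias and sampling noise; their ratio is the quantity entering the vectors $B^i$ above.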

5 Inference under semiparametric specification

5.1 Inference based on loss levels

The previous section was devoted to the fully parametric specification. If we wish to be less restrictive a priori on the univariate margins, we may leave them unspecified, and use the family

$$\big\{F(y; \theta) = C(F_1(y_1), \dots, F_n(y_n); \theta),\ \ \theta \in \Theta \subset \mathbb{R}^p\big\}.$$

Hence we get a semiparametric setting only parameterised through $C(u; \theta)$. The estimator $\hat\theta$ of $\theta$ is obtained by

$$\max_{\theta}\ \frac{1}{T} \sum_{t=1}^{T} \ln c\big(\hat F_1(Y_{1t}), \dots, \hat F_n(Y_{nt}); \theta\big),$$

where

$$\hat F_j(y) = \frac{1}{T} \sum_{t=1}^{T} I[Y_{jt} \le y], \qquad j = 1, \dots, n. \tag{5.1}$$
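The semiparametric estimator can be sketched for a one-parameter family. The example below makes several assumptions that are not in the paper: it uses the bivariate Clayton copula, rescales the empirical margins (5.1) by $T/(T+1)$ so that pseudo-observations stay strictly inside the unit interval, and maximises the pseudo log-likelihood by a plain grid search rather than a numerical optimiser. The simulated data use the standard gamma frailty construction of the Clayton copula.

```python
import numpy as np

def pseudo_observations(sample):
    """Empirical margins (5.1) evaluated at the data, rescaled by T / (T + 1)."""
    T = sample.shape[0]
    ranks = sample.argsort(axis=0).argsort(axis=0) + 1
    return ranks / (T + 1.0)

def clayton_log_density(u, theta):
    """log c(u1, u2; theta) for the bivariate Clayton copula, theta > 0."""
    s = np.sum(u ** (-theta), axis=1) - 1.0
    return (np.log1p(theta) - (1.0 + theta) * np.log(u).sum(axis=1)
            - (2.0 + 1.0 / theta) * np.log(s))

def fit_clayton(sample, grid=np.arange(0.05, 10.0, 0.05)):
    """Grid-search maximiser of the average pseudo log-likelihood."""
    u = pseudo_observations(sample)
    loglik = [clayton_log_density(u, th).mean() for th in grid]
    return float(grid[int(np.argmax(loglik))])

# Simulated check: Clayton dependence with theta = 2, arbitrary margins.
rng = np.random.default_rng(2)
theta_true, T = 2.0, 2000
frailty = rng.gamma(shape=1.0 / theta_true, scale=1.0, size=(T, 1))
uniforms = (1.0 + rng.exponential(size=(T, 2)) / frailty) ** (-1.0 / theta_true)
losses = np.exp(3.0 * uniforms)      # increasing transform; copula unchanged
theta_hat = fit_clayton(losses)
```

Because only ranks enter the criterion, the estimate is unaffected by the transformation applied to the margins, which is the point of leaving them unspecified.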

Its limit is denoted by $\theta_0^*$, and will correspond to $\theta_0$ (the true value) if both copula and margins are well specified in the parametric case. The asymptotic distribution of $\hat\theta$ under correct specification is given in Genest, Ghoudi and Rivest (1995) and Shih and Louis (1995). Again we wish to check whether $F(\cdot; \theta_0^*)$ is more concordant than the true distribution function $F_0(\cdot)$, namely

$$F_0(y) \le F(y; \theta_0^*) \quad\text{and}\quad \bar F_0(y) \le \bar F(y; \theta_0^*), \qquad \forall y \in \mathbb{R}^n. \tag{5.2}$$

Along the same lines as in the parametric setting we define

$$\hat D_3^i = C\big(\hat F_1(y_{i1}), \dots, \hat F_n(y_{in}); \hat\theta\big) - \hat F(y_i),$$

and

$$\hat{\bar D}_3^i = \bar C\big(\hat F_1(y_{i1}), \dots, \hat F_n(y_{in}); \hat\theta\big) - \hat{\bar F}(y_i),$$

together with their corresponding stacks $\hat D_3$ and $\hat{\bar D}_3$.

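Proposition 5.1. The random vector $\sqrt{T}(\hat D_3 - D_3)$, resp. $\sqrt{T}(\hat{\bar D}_3 - \bar D_3)$, converges in distribution to a $d$-dimensional normal random variable with mean zero and covariance matrix $V_3$, resp. $\bar V_3$, whose elements are

$$v_{3,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_3^k, \hat D_3^l\big] = B_{\theta_0^*}^{k\prime}\,\mathrm{Cov}_0\big[S_{\theta_0^*}^k, S_{\theta_0^*}^l\big]\,B_{\theta_0^*}^l, \qquad k, l = 1, \dots, d,$$

with

$$B_{\theta_0^*}^i = \big(\nabla_{\theta_0^*}^i C'\,J_{\theta_0^*}^{-1},\ \nabla_{u_1}^i C,\ \dots,\ \nabla_{u_n}^i C,\ -1\big)',$$

$$S_{\theta_0^*}^i = \big(U_{\theta_0^*}',\ I[Y_1 \le y_{i1}],\ \dots,\ I[Y_n \le y_{in}],\ I[Y \le y_i]\big)',$$

and

$$\bar v_{3,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat{\bar D}_3^k, \hat{\bar D}_3^l\big] = \bar B_{\theta_0^*}^{k\prime}\,\mathrm{Cov}_0\big[\bar S_{\theta_0^*}^k, \bar S_{\theta_0^*}^l\big]\,\bar B_{\theta_0^*}^l, \qquad k, l = 1, \dots, d,$$

with

$$\bar B_{\theta_0^*}^i = \big(\nabla_{\theta_0^*}^i \bar C'\,J_{\theta_0^*}^{-1},\ \nabla_{u_1}^i \bar C,\ \dots,\ \nabla_{u_n}^i \bar C,\ -1\big)',$$

$$\bar S_{\theta_0^*}^i = \big(U_{\theta_0^*}',\ I[Y_1 \le y_{i1}],\ \dots,\ I[Y_n \le y_{in}],\ I[Y > y_i]\big)',$$

where

$$J_{\theta_0^*} = E_0\Big[-\frac{\partial^2}{\partial \theta\,\partial \theta'} \log c\big(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*\big)\Big]$$

and

$$U_{\theta_0^*} = \frac{\partial}{\partial \theta} \log c\big(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*\big) + \sum_{j=1}^{n} \int_{\mathbb{R}^n} I[Y_j \le z_j]\,\frac{\partial^2}{\partial \theta\,\partial u_j} \log c\big(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*\big)\,dF_0(z_1, \dots, z_n),$$

while

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_3^k, \hat{\bar D}_3^l\big] = B_{\theta_0^*}^{k\prime}\,\mathrm{Cov}_0\big[S_{\theta_0^*}^k, \bar S_{\theta_0^*}^l\big]\,\bar B_{\theta_0^*}^l, \qquad k, l = 1, \dots, d.$$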

5.2 Inference based on probability levels

If we prefer to set probability levels, we will use

$$\hat D_4^i = C(u_i; \hat\theta) - \hat F(\hat\zeta_i),$$

and

$$\hat{\bar D}_4^i = \bar C(u_i; \hat\theta) - \hat{\bar F}(\hat\zeta_i).$$

This leads to the following proposition, which is the counterpart of Proposition 4.2 of the parametric setting.

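Proposition 5.2. The random vector $\sqrt{T}(\hat D_4 - D_4)$, resp. $\sqrt{T}(\hat{\bar D}_4 - \bar D_4)$, converges in distribution to a $d$-dimensional normal random variable with mean zero and covariance matrix $V_4$, resp. $\bar V_4$, whose elements are

$$v_{4,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_4^k, \hat D_4^l\big] = B_{\theta_0^*}^{k\prime}\,\mathrm{Cov}_0\big[S_{\theta_0^*}^k, S_{\theta_0^*}^l\big]\,B_{\theta_0^*}^l, \qquad k, l = 1, \dots, d,$$

with

$$B_{\theta_0^*}^i = \Big(\nabla_{\theta_0^*}^i C'\,J_{\theta_0^*}^{-1},\ -1,\ \frac{\partial F_0(\zeta_i)/\partial x_1}{f_{10}(\zeta_{i1})},\ \dots,\ \frac{\partial F_0(\zeta_i)/\partial x_n}{f_{n0}(\zeta_{in})}\Big)',$$

$$S_{\theta_0^*}^i = \big(U_{\theta_0^*}',\ I[Y \le \zeta_i],\ I[Y_1 \le \zeta_{i1}],\ \dots,\ I[Y_n \le \zeta_{in}]\big)',$$

and

$$\bar v_{4,kl} = \lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat{\bar D}_4^k, \hat{\bar D}_4^l\big] = \bar B_{\theta_0^*}^{k\prime}\,\mathrm{Cov}_0\big[\bar S_{\theta_0^*}^k, \bar S_{\theta_0^*}^l\big]\,\bar B_{\theta_0^*}^l, \qquad k, l = 1, \dots, d,$$

with

$$\bar B_{\theta_0^*}^i = \Big(\nabla_{\theta_0^*}^i \bar C'\,J_{\theta_0^*}^{-1},\ -1,\ \frac{\partial \bar F_0(\zeta_i)/\partial x_1}{f_{10}(\zeta_{i1})},\ \dots,\ \frac{\partial \bar F_0(\zeta_i)/\partial x_n}{f_{n0}(\zeta_{in})}\Big)',$$

$$\bar S_{\theta_0^*}^i = \big(U_{\theta_0^*}',\ I[Y > \zeta_i],\ I[Y_1 \le \zeta_{i1}],\ \dots,\ I[Y_n \le \zeta_{in}]\big)',$$

where

$$J_{\theta_0^*} = E_0\Big[-\frac{\partial^2}{\partial \theta\,\partial \theta'} \log c\big(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*\big)\Big]$$

and

$$U_{\theta_0^*} = \frac{\partial}{\partial \theta} \log c\big(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*\big) + \sum_{j=1}^{n} \int_{\mathbb{R}^n} I[Y_j \le z_j]\,\frac{\partial^2}{\partial \theta\,\partial u_j} \log c\big(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*\big)\,dF_0(z_1, \dots, z_n),$$

while

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_4^k, \hat{\bar D}_4^l\big] = B_{\theta_0^*}^{k\prime}\,\mathrm{Cov}_0\big[S_{\theta_0^*}^k, \bar S_{\theta_0^*}^l\big]\,\bar B_{\theta_0^*}^l, \qquad k, l = 1, \dots, d.$$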

6 Testing procedures

The distributional results of Propositions 4.1-4.2 and 5.1-5.2 are the building blocks of the testing procedures. Let $Z_k$, resp. $\hat Z_k$, be the stack of $D_k$, resp. $\hat D_k$, and $\bar D_k$, resp. $\hat{\bar D}_k$, $k = 1, \dots, 4$.

The null hypothesis of a test for concordance may be written as

$$H_0^k = \{Z_k : Z_k \ge 0\},$$

with alternative hypothesis:

$$H_1^k = \{Z_k : Z_k \text{ unrestricted}\}.$$

To examine these hypotheses we will use the usual distance tests for inequality constraints, initiated in the multivariate one-sided hypothesis literature for positivity of the mean (Bartholomew (1959a,b)). They are relevant when one or several components of $\hat Z_k$ are found to be negative (in such a case one wants to know whether this invalidates concordance).

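Let $\tilde Z_k$ be the solution of the constrained quadratic minimisation problem:

$$\inf_{Z}\ T\,(Z - \hat Z_k)'\,\hat\Sigma_k^{-1}\,(Z - \hat Z_k) \quad \text{s.t. } Z \ge 0, \tag{6.1}$$

where $\hat\Sigma_k$ is a consistent estimate of the asymptotic covariance matrix $\Sigma_k$ of $\sqrt{T} \hat Z_k$, and put

$$\hat\xi_k = T\,(\tilde Z_k - \hat Z_k)'\,\hat\Sigma_k^{-1}\,(\tilde Z_k - \hat Z_k).$$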
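Problem (6.1) is a small quadratic programme. The sketch below solves it without an external solver by enumerating which coordinates are pinned at zero and solving the remaining unconstrained problem in closed form; the feasible candidate with the smallest objective is the constrained minimiser. This enumeration is a swapped-in technique chosen for transparency (it visits $2^m$ candidate sets and is only practical for small dimensions), and the function names are illustrative.

```python
import itertools
import numpy as np

def project_on_orthant(z_hat, sigma):
    """Minimise (z - z_hat)' sigma^{-1} (z - z_hat) subject to z >= 0."""
    z_hat = np.asarray(z_hat, dtype=float)
    w = np.linalg.inv(sigma)
    m = z_hat.size
    best_z, best_val = None, np.inf
    for k in range(m + 1):
        for zero in itertools.combinations(range(m), k):
            free = [i for i in range(m) if i not in zero]
            zero = list(zero)
            z = np.zeros(m)
            if free and zero:
                # Closed-form minimiser over the free coordinates.
                rhs = w[np.ix_(free, zero)] @ z_hat[zero]
                z[free] = z_hat[free] + np.linalg.solve(w[np.ix_(free, free)], rhs)
            elif free:
                z[free] = z_hat[free]
            if np.all(z >= -1e-12):            # keep feasible candidates only
                diff = z - z_hat
                val = float(diff @ w @ diff)
                if val < best_val:
                    best_z, best_val = np.maximum(z, 0.0), val
    return best_z, best_val

def distance_statistic(z_hat, sigma_hat, T):
    """Return the constrained solution and the statistic xi = T * distance."""
    z_tilde, val = project_on_orthant(z_hat, sigma_hat)
    return z_tilde, T * val
```

For instance, with an identity covariance matrix and `z_hat = [1.0, -2.0]`, the constrained solution sets the negative component to zero and the statistic equals $T$ times the squared size of that component. For larger grids a dedicated quadratic programming routine would be the more sensible choice.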

Roughly speaking, $\tilde Z_k$ is the closest point to $\hat Z_k$ under the null in the distance measured in the metric of $\hat\Sigma_k$, and the test statistic $\hat\xi_k$ is the distance between $\hat Z_k$ and $\tilde Z_k$. The idea is to reject $H_0^k$ when this distance becomes too large.

The asymptotic distribution of $\hat\xi_k$ under the null (see e.g. Gourieroux, Holly and Monfort (1982), Kodde and Palm (1986), Wolak (1989a,b)) is such that for any positive $x$:

$$P[\hat\xi_k \ge x] = \sum_{i=1}^{d} P[\chi_i^2 \ge x]\,w(d, d - i, \Sigma_k),$$

where the weight $w(d, d - i, \Sigma_k)$ is the probability that $\tilde Z_k$ has exactly $d - i$ positive elements.

Computation of the solution $\tilde Z_k$ can be performed by a numerical optimisation routine for constrained quadratic programming problems, available in most statistical software. Closed form solutions for the weights are available for $d \le 4$ (Kudo (1963)). For higher dimensions one usually relies on a simple Monte Carlo technique as advocated in Gourieroux, Holly and Monfort (1982) (see also Wolak (1989a)). Indeed it is enough to draw a given large number of realisations of a multivariate normal with mean zero and covariance matrix $\hat\Sigma_k$. Then use these realisations as $\hat Z_k$ in the above minimisation problem (6.1), compute $\tilde Z_k$, and count the number of elements of the vector greater than zero. The proportion of draws such that $\tilde Z_k$ has exactly $d - i$ elements greater than zero gives a Monte Carlo estimate of $w(d, d - i, \hat\Sigma_k)$. If one wishes to avoid this computational burden, the upper and lower bound critical values of Kodde and Palm (1986) can be adopted.
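The Monte Carlo recipe just described can be sketched as follows. The code is illustrative only: `project_on_orthant` is a small enumeration-based solver for (6.1) written for this example (any quadratic programming routine could replace it), `m` denotes the length of the stacked vector, and the simulated statistics are also returned so that a p-value can be read off directly as the share of simulated values exceeding the observed one.

```python
import itertools
import numpy as np

def project_on_orthant(z_hat, sigma):
    """Enumeration-based solver of (6.1) with T = 1 (small dimensions only)."""
    w = np.linalg.inv(sigma)
    m = z_hat.size
    best_z, best_val = None, np.inf
    for k in range(m + 1):
        for zero in itertools.combinations(range(m), k):
            free = [i for i in range(m) if i not in zero]
            zero = list(zero)
            z = np.zeros(m)
            if free and zero:
                rhs = w[np.ix_(free, zero)] @ z_hat[zero]
                z[free] = z_hat[free] + np.linalg.solve(w[np.ix_(free, free)], rhs)
            elif free:
                z[free] = z_hat[free]
            if np.all(z >= -1e-12):
                val = float((z - z_hat) @ w @ (z - z_hat))
                if val < best_val:
                    best_z, best_val = np.maximum(z, 0.0), val
    return best_z, best_val

def simulate_null(sigma, n_draws=2000, seed=0):
    """Monte Carlo weights and simulated statistics under Z = 0."""
    rng = np.random.default_rng(seed)
    m = sigma.shape[0]
    draws = rng.multivariate_normal(np.zeros(m), sigma, size=n_draws)
    n_positive = np.empty(n_draws, dtype=int)
    stats = np.empty(n_draws)
    for r, z in enumerate(draws):
        z_tilde, stats[r] = project_on_orthant(z, sigma)
        n_positive[r] = int(np.sum(z_tilde > 1e-10))
    # weights[j] estimates the probability of exactly j positive elements.
    weights = np.bincount(n_positive, minlength=m + 1) / n_draws
    return weights, stats

weights, stats = simulate_null(np.eye(2))
p_value = float(np.mean(stats >= 4.0))   # for an observed statistic of 4.0
```

With an identity covariance matrix in dimension 2 the weights are known to be 1/4, 1/2 and 1/4, which gives a quick sanity check of the simulation.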


Let us now turn our attention to the second testing procedure, aimed at testing for non-concordance. It is based on the null hypothesis:

$$H_k^0 = \{Z_k : Z_k^l \le 0 \text{ for some } l,\ l = 1, \dots, 2d\},$$

and the alternative hypothesis:

$$H_k^1 = \{Z_k : Z_k^l > 0 \text{ for all } l\}.$$

These hypotheses will be tested through intersection-union tests based on the minimum of a t-statistic. They are used when all components of $\hat Z_k$ are found to be positive. The question is then whether this suffices to ensure concordance.

Let $\hat\gamma_k^l = \sqrt{T}\,\hat Z_k^l / \sqrt{\hat\sigma_{k,l}}$, where $\hat\sigma_{k,l}$ is a consistent estimate of the asymptotic standard deviation of $\sqrt{T}\,\hat Z_k^l$, $l = 1, \dots, 2d$. Then under $H_k^0$, the limit of $P[\inf_l \hat\gamma_k^l > z_{1-\alpha}]$ will be less than or equal to $\alpha$, and exactly equal to $\alpha$ if $Z_k^l = 0$ for a given $l$ and $Z_k^s > 0$ for $s \ne l$, while its limit is one under $H_k^1$. Hence the test consisting of rejecting $H_k^0$ when $\inf_l \hat\gamma_k^l$ is above the $(1-\alpha)$-quantile $z_{1-\alpha}$ of a standard normal distribution has an upper bound $\alpha$ on the asymptotic size and is consistent (see e.g. Howes (1993), Kaur, Prakasa Rao and Singh (1994)).
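The decision rule of the intersection-union test can be illustrated as follows. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and the studentising denominators are passed in directly as `scale`, which is an assumption about how they have been estimated.

```python
import math
from statistics import NormalDist

def intersection_union_test(z_hat, scale, T, alpha=0.05):
    """Reject the null of non-concordance when the smallest studentised
    component exceeds the (1 - alpha) standard normal quantile."""
    t_stats = [math.sqrt(T) * z / s for z, s in zip(z_hat, scale)]
    t_min = min(t_stats)
    critical = NormalDist().inv_cdf(1.0 - alpha)
    return t_min, critical, t_min > critical
```

Because the rule looks only at the smallest statistic, a single component close to zero is enough to prevent rejection, which is the source of the conservativeness discussed in the text.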

Power issues are extensively studied for stochastic dominance and nondominance tests in Dardanoni and Forcina (1999) (see also the comments in Davidson and Duclos (2000)). They carry over to our case. First, approaches based on distance tests exploit the covariance structure, and are thus expected to achieve better power properties relative to approaches, such as ones based on t-statistics, that do not account for it. In a set of Monte Carlo experiments, they find that, indeed, distance tests are worth the extra amount of computational work. Second, it is possible that nonrejection of the null of dominance, here concordance, by distance tests occurs along with the nonrejection of the null of nondominance, here non-concordance, by intersection-union tests. This is due to the highly conservative nature of the latter, and will typically occur in our setting if $\hat Z_k$ is close enough to zero for a number of coordinates. This empirical feature has already been observed on tests for positive quadrant dependence (PQD) in Denuit and Scaillet (2001).

7 An empirical illustration: US Losses and ALAE’s

7.1 Presentation of the data

Often insurance processes involve correlated pairs of variables. A fine example is the LOSS and allocated loss adjustment expenses (ALAE, in short) on a single claim. ALAE's are a type of insurance company expenses that are specifically attributable to the settlement of individual claims, such as lawyers' fees and claims investigation expenses. The joint modelling in parametric settings of those two variables is examined by Frees and Valdez (1998), who choose the Pareto distribution to model the margins and select Gumbel and Frank's copulas. Both models express PQD by their estimated parameter values. Klugman and Parsa (1999) opt for the Inverse Paralogistic for LOSS and for the Inverse Burr for ALAE's and use Frank's copula to model the dependence between them. Denuit and Scaillet (2001) test the existence of PQD for LOSS and ALAE using a nonparametric approach and find that, as both previous models suggest, significant positive quadrant dependence exists.

The database we have considered consists of T = 1,466 uncensored observed values of the random vector (LOSS, ALAE). The estimated values for Pearson's r, Kendall's τ and Spearman's ρ are 0.381, 0.307 and 0.444, respectively; all of them are significantly positive at the 1% level. Summary statistics for (LOSS, ALAE) are provided in Table 7.1.

| | LOSS | ALAE |
|---|---|---|
| Mean | 37,109.58 | 12,017.47 |
| Std Dev. | 92,512.80 | 26,712.35 |
| Skew. | 10.95 | 10.07 |
| Kurt. | 209.62 | 152.39 |
| Min | 10.00 | 15.00 |
| Max | 2,173,595.00 | 501,863.00 |
| 1st Quart. | 3,750.00 | 2,318.25 |
| Median | 11,048.50 | 5,420.50 |
| 3rd Quart. | 32,000.00 | 12,292.00 |

Table 7.1: Summary statistics for variables LOSS and ALAE.

Because some very high values of the variables are contained in the data set, we will work on a logarithmic scale to represent the data. This will not affect testing for concordance ordering since this order enjoys a functional invariance property (cf. Section 3.2). Figure 7.1 shows the kernel estimator of the bivariate pdf of the couple (log(LOSS), log(ALAE)), together with its contour plot. This estimation relies on a product of Gaussian kernels and bandwidth values selected by the standard rule of thumb (Scott (1992)). The graphs obviously suggest strong positive dependence between both variables.
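A product-kernel density estimate of this kind can be sketched as below. The function name is hypothetical, and the bandwidth $h_j = \hat\sigma_j T^{-1/(d+4)}$ is one common form of the rule of thumb, assumed here because the exact constants used for Figure 7.1 are not stated.

```python
import numpy as np

def product_gaussian_kde(data, points):
    """Product-Gaussian kernel density estimate.

    data:   (T, d) array of observations (e.g. log(LOSS), log(ALAE))
    points: (m, d) array of evaluation points
    """
    T, d = data.shape
    # Rule-of-thumb bandwidth per coordinate (assumed form, see lead-in).
    h = data.std(axis=0, ddof=1) * T ** (-1.0 / (d + 4))
    u = (points[:, None, :] - data[None, :, :]) / h      # shape (m, T, d)
    kern = np.exp(-0.5 * u ** 2) / np.sqrt(2.0 * np.pi)  # univariate Gaussian kernels
    return kern.prod(axis=2).sum(axis=1) / (T * h.prod())
```

Evaluating the function on a regular grid of points gives the values needed for a surface or contour plot.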

7.2 Inference under parametric specification

First, the parametric framework suggested by Frees and Valdez (1998) is studied. It relies on a Gumbel copula

$$C(u_1, u_2; \theta) = \exp\left\{-\left[(-\ln u_1)^{\theta} + (-\ln u_2)^{\theta}\right]^{1/\theta}\right\},$$

for the dependence structure and Pareto distributions

$$F_i(x) = 1 - \left(1 + \xi_i \frac{x}{\gamma_i}\right)^{-1/\xi_i}, \quad i = 1, 2,$$

for the marginal behaviours.

Estimated values for the parameter $\nu = (\xi_1, \gamma_1, \xi_2, \gamma_2, \theta)'$ are shown in Table 7.2.

In the testing procedures we opt for inference based on probability levels, and take 81 points built on the grid $\{0.1, 0.2, \dots, 0.9\} \times \{0.1, 0.2, \dots, 0.9\}$. Since 105 of the 162 components of the vector $\hat Z_2$ are negative, with 30 among them less than -0.1, a concordance test is applied. See Figure 7.2 for a representation.
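The building blocks of this comparison can be sketched as follows. The function names are hypothetical, the empirical copula is computed from rescaled ranks (one common convention, assumed here), and only the lower-quadrant differences are shown, with the sign convention "parametric minus empirical" taken from the panel label of Figure 7.2.

```python
import numpy as np

def gumbel_copula(u1, u2, theta):
    """Gumbel copula C(u1, u2; theta), theta >= 1."""
    s = (-np.log(u1)) ** theta + (-np.log(u2)) ** theta
    return np.exp(-s ** (1.0 / theta))

def pareto_cdf(x, xi, gamma):
    """Pareto margin F(x) = 1 - (1 + xi * x / gamma)^(-1/xi)."""
    return 1.0 - (1.0 + xi * x / gamma) ** (-1.0 / xi)

def empirical_copula(sample, u1, u2):
    """Empirical copula at (u1, u2) from a (T, 2) sample without ties."""
    T = sample.shape[0]
    ranks = sample.argsort(axis=0).argsort(axis=0) + 1   # ranks 1..T per column
    pseudo = ranks / T
    return np.mean((pseudo[:, 0] <= u1) & (pseudo[:, 1] <= u2))

def lower_differences(sample, theta, levels):
    """Matrix of C(par) - C(emp) over the grid levels x levels."""
    return np.array([[gumbel_copula(a, b, theta) - empirical_copula(sample, a, b)
                      for b in levels] for a in levels])

levels = np.arange(1, 10) / 10.0   # the grid {0.1, ..., 0.9} used in the text
```

Calling `lower_differences(sample, theta_hat, levels)` on the data would give the 81 lower-quadrant components; the corresponding upper-quadrant components would be built analogously from survival functions.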

[Figure 7.1: surface plot (axes LogLOSS, LogALAE; vertical axis Density function) and contour plot (axes LogLOSS, LogALAE).]

Figure 7.1: Kernel estimation of the bivariate pdf for (log(LOSS), log(ALAE)).

[Figure 7.2: three surface plots over (u1, u2): Gumbel copula, Empirical copula, and C(par)-C(emp).]

Figure 7.2: Estimated copulas: Gumbel (top), empirical (mid) and difference (bottom).

| Component | Model | Estimates |
|---|---|---|
| LOSS | Pareto | ξ1 = 0.760, β1 = 12,816.9 |
| ALAE | Pareto | ξ2 = 0.425, β2 = 6,756.5 |
| Copula | Gumbel | θ = 1.425 |

Table 7.2: Estimated parameter values of the bivariate distribution of (LOSS, ALAE).

To compute the solution $\tilde Z_2$ of the quadratic minimisation problem (6.1), a local minimiser for nonlinear functions subject to boundary constraints is used (specifically, nlminb in Splus). Since $\tilde Z_2$ represents in a way the closest point to $\hat Z_2$ under the null, we take the vector $\max(0, \hat Z_2)$ as starting point for the numerical optimisation routine. This initial value satisfies the boundary restrictions. The minimum value of the function is then found to be $\hat\xi_2 = 2.096$.

According to the bounds given in Kodde and Palm (1986), the null hypothesis of a greater concordance of the fitted distribution cannot be rejected at any level lower than 5%. This indicates that the amount of positive dependence expressed by the parametric framework is at least as large as that suggested by the data. This is particularly appealing to actuaries since it ensures that most actuarial quantities computed in the Gumbel-Pareto model will not be underestimated.

It can also be of interest to test the concordance behaviour only in the upper tails. We consider an 81-point grid formed by the percentiles in $\{0.91, 0.92, \dots, 0.99\} \times \{0.91, 0.92, \dots, 0.99\}$. See Figure 7.3 for a representation. In this case only 10 components of $\hat Z_2$ are found to be negative and the minimum value of the function is $\hat\xi_2 = 0.000035$. Thus, again the null hypothesis cannot be rejected.

7.3 Inference under semiparametric specification

In this section we wish to test the same type of hypothesis as in the previous subsection but using the semiparametric approach. We thus drop the Pareto modelling of the marginals and leave them unspecified. Note however the similarity between the parametric and empirical estimations of both margins in Figure 7.4. This explains why the estimated value $\hat\theta = 1.415$ of the Gumbel copula parameter is not much affected.

Again we perform inference based on probability levels and we use the same grid $\{0.1, 0.2, \dots, 0.9\} \times \{0.1, 0.2, \dots, 0.9\}$. Figure 7.5 displays the differences between the semiparametric and empirical estimations and the semiparametric and parametric estimations of the copula on our data. Note again the small difference between the semiparametric and parametric estimations. We thus expect to get the same conclusion under the semiparametric framework as under the parametric one.

The minimum value of the function is now found to be $\hat\xi_4 = 14.161$. Since this value does not allow us to get a conclusion about its significance using the bounds of Kodde and Palm (1986), we need to rely on the simple Monte Carlo technique described in Section 6. A p-value equal to 0.98 has been obtained, which clearly leads us not to reject the null hypothesis of concordance.

Besides, the results about the concordance behaviour in the upper tails are equivalent to the ones from the parametric approach using the same grid. Differences between the empirical and the semiparametric estimations (left) and between the semiparametric and the parametric estimations (right) are shown in Figure 7.6. Only 12 components of $\hat Z_4$ are negative and the minimum value of the function is $\hat\xi_4 = 0.000036$, which does not allow us to reject the null hypothesis.

[Figure 7.3: three surface plots over (u1, u2) on the upper-tail grid: Gumbel copula, Empirical copula, and C(par)-C(emp).]

Figure 7.3: Estimated copulas in the upper tails: Gumbel (top), empirical (mid) and difference (bottom).

[Figure 7.4: four panels. Top: F(loss) against loss and F(ALAE) against ALAE, each showing the empirical estimation and the GP estimation. Bottom: Femp-Fpareto against loss and against ALAE.]

Figure 7.4: Parametric and empirical estimations of margins (top) and their differences (bottom).

8 Concluding remarks

In this paper we have analysed simple distribution-free inference for concordance ordering. The testing procedures have proven to be empirically relevant to the analysis of dependencies among US insurance claim data. In particular they suggest that the Gumbel copula reflects the dependence structure in the data safely enough. This should reassure actuaries in their use of this copula when computing an insurance premium.

[Figure 7.5: two surface plots: C(semipar)-C(emp) over (u, v), and C(semipar)-C(par) over (u1, u2).]

Figure 7.5: Differences between estimated copulas: semiparametric - empirical (left) and semiparametric - parametric (right).

[Figure 7.6: two surface plots on the upper-tail grid: C(semipar)-C(emp) over (u, v), and C(semipar)-C(par) over (u1, u2).]

Figure 7.6: Differences between estimated copulas in the upper tails: semiparametric - empirical (left) and semiparametric - parametric (right).

APPENDICES

A Proof of Proposition 3.1

Let us recall that given two rv's $Z_1$ and $Z_2$, $Z_1 \preceq_{cv} Z_2$ holds if, and only if, $EZ_1 = EZ_2$ and $E(x - Z_1)_+ \ge E(x - Z_2)_+$ is valid for all $x \in \mathbb{R}$. Now, note that

$$\int_{-\infty}^{x} P[Y_1 + Y_2 \le t]\,dt = \Big[t\,P[Y_1 + Y_2 \le t]\Big]_{-\infty}^{x} - \int_{-\infty}^{x} t\,dP[Y_1 + Y_2 \le t] = E(x - Y_1 - Y_2)_+.$$

So, we want to show that the inequality $E(x - Y_1 - Y_2)_+ \ge E(x - X_1 - X_2)_+$ holds for any real constant $x$ when $X \preceq_c Y$. Now, let us express $E(x - Y_1 - Y_2)_+$ in terms of the joint cdf of $Y$. Note that

$$\int_{-\infty}^{x} I[y_1 \le t,\ y_2 \le x - t]\,dt = \int_{-\infty}^{x} I[y_1 \le t \le x - y_2]\,dt = (x - y_1 - y_2)_+$$

whence it follows that

$$E(x - Y_1 - Y_2)_+ = \int_{-\infty}^{x} P[Y_1 \le t,\ Y_2 \le x - t]\,dt.$$

Finally,

$$E(x - Y_1 - Y_2)_+ - E(x - X_1 - X_2)_+ = \int_{-\infty}^{x} \Big\{P[Y_1 \le t,\ Y_2 \le x - t] - P[X_1 \le t,\ X_2 \le x - t]\Big\}\,dt$$

where the integrand $\{\dots\}$ is non-negative provided $X \preceq_c Y$, which ends the proof.
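The pointwise identity used in the proof can be checked numerically. The sketch below is only a sanity check under an added assumption: the values $y_1, y_2$ are taken non-negative (as for insurance losses), so that the integration range can start just below zero. The function names are hypothetical.

```python
def indicator_integral(y1, y2, x, n_steps=20000):
    """Midpoint Riemann sum of I[y1 <= t, y2 <= x - t] over t in [-1, x],
    assuming y1 >= 0 and y2 >= 0 so nothing is lost below t = -1."""
    lo = -1.0
    step = (x - lo) / n_steps
    total = 0.0
    for i in range(n_steps):
        t = lo + (i + 0.5) * step
        if y1 <= t and y2 <= x - t:
            total += step
    return total

def stop_loss_type_value(y1, y2, x):
    """Closed form (x - y1 - y2)_+ of the same integral."""
    return max(x - y1 - y2, 0.0)
```

Averaging both sides over a sample of pairs $(y_1, y_2)$ gives the empirical counterpart of the expression of $E(x - Y_1 - Y_2)_+$ in terms of the joint cdf.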

B Asymptotic distributions

We first derive the asymptotic distribution of the parametric estimator $\hat\nu = (\hat\beta', \hat\theta')'$ and the semiparametric estimator $\hat\theta$ in a misspecified framework. For the well specified case the results can be found in Genest, Ghoudi and Rivest (1995) and Shih and Louis (1995). Then we proceed with the asymptotic distribution of the various difference vectors $\hat D_k$ and $\hat{\bar D}_k$, $k = 1, \dots, 4$.

B.1 Asymptotic distribution of the parametric estimator

The asymptotic distribution of $\hat\nu$ immediately results from usual pseudo maximum likelihood theory (see e.g. White (1982), Gourieroux, Monfort and Trognon (1984)). Indeed, from a standard Taylor expansion of the first order condition of the maximum likelihood criterion and the law of large numbers, we get:

$$\sqrt{T}(\hat\nu - \nu_0) = J_{\nu_0}^{-1}\,\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{\partial}{\partial\nu} \log f(Y_t; \nu_0) + o_p(1),$$

with

$$J_{\nu_0} = E_0\left[-\frac{\partial^2}{\partial\nu\,\partial\nu'} \log f(Y; \nu_0)\right],$$

where $E_0$ denotes expectation w.r.t. the true distribution $F_0$, and by application of the central limit theorem

$$\sqrt{T}(\hat\nu - \nu_0) \Longrightarrow N\left(0,\ J_{\nu_0}^{-1} I_{\nu_0} J_{\nu_0}^{-1}\right),$$

where

$$I_{\nu_0} = E_0\left[\frac{\partial}{\partial\nu} \log f(Y; \nu_0)\ \frac{\partial}{\partial\nu'} \log f(Y; \nu_0)\right].$$

When the parametric model is well specified, i.e. $F(\cdot, \nu_0) = F_0(\cdot)$, we have $J_{\nu_0} = I_{\nu_0}$.
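In practice the matrices $J_{\nu_0}$ and $I_{\nu_0}$ are replaced by sample averages evaluated at the estimate. A minimal sketch of the resulting "sandwich" covariance is given below; the function name and the input layout (per-observation scores and Hessians supplied by the user) are assumptions made for illustration.

```python
import numpy as np

def sandwich_covariance(scores, hessians):
    """Estimate J^{-1} I J^{-1}.

    scores:   (T, p) array, row t = gradient of log f(Y_t; nu) at the estimate
    hessians: (T, p, p) array, slice t = Hessian of log f(Y_t; nu) at the estimate
    """
    T = scores.shape[0]
    J = -hessians.mean(axis=0)        # sample analogue of J
    I = scores.T @ scores / T         # sample analogue of I (outer product of scores)
    J_inv = np.linalg.inv(J)
    return J_inv @ I @ J_inv
```

As a simple check, fitting the mean of a unit-variance normal model to arbitrary data gives score $y_t - \hat\mu$ and Hessian $-1$, so the sandwich reduces to the sample variance of the data, whatever its true distribution.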

B.2 Asymptotic distribution of the semiparametric estimator

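From a Taylor expansion of the first order condition of the maximum likelihood criterion and the law of large numbers, we get:

$$\sqrt{T}(\hat\theta - \theta_0^*) = J_{\theta_0^*}^{-1}\,\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{\partial}{\partial\theta} \log c(\hat F_1(Y_{1t}), \dots, \hat F_n(Y_{nt}); \theta_0^*) + o_p(1), \tag{B.1}$$

where

$$J_{\theta_0^*} = E_0\left[-\frac{\partial^2}{\partial\theta\,\partial\theta'} \log c(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*)\right].$$

The random part of the right hand side in (B.1) can be written:

$$\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{\partial}{\partial\theta} \log c(\hat F_1(Y_{1t}), \dots, \hat F_n(Y_{nt}); \theta_0^*) = \sqrt{T} \int_{\mathbb{R}^n} \frac{\partial}{\partial\theta} \log c(\hat F_1(z_1), \dots, \hat F_n(z_n); \theta_0^*)\,d\hat F(z_1, \dots, z_n),$$

and decomposed into three terms:

$$\begin{aligned}
&\sqrt{T} \int_{\mathbb{R}^n} \frac{\partial}{\partial\theta} \log c(\hat F_1(z_1), \dots, \hat F_n(z_n); \theta_0^*)\,d\hat F(z_1, \dots, z_n) \\
&\quad = \sqrt{T} \int_{\mathbb{R}^n} \frac{\partial}{\partial\theta} \log c(\hat F_1(z_1), \dots, \hat F_n(z_n); \theta_0^*)\,dF_0(z_1, \dots, z_n) \\
&\qquad + \sqrt{T} \int_{\mathbb{R}^n} \left[\frac{\partial}{\partial\theta} \log c(\hat F_1(z_1), \dots, \hat F_n(z_n); \theta_0^*) - \frac{\partial}{\partial\theta} \log c(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*)\right] d[\hat F(z_1, \dots, z_n) - F_0(z_1, \dots, z_n)] \\
&\qquad + \sqrt{T} \int_{\mathbb{R}^n} \frac{\partial}{\partial\theta} \log c(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*)\,d[\hat F(z_1, \dots, z_n) - F_0(z_1, \dots, z_n)].
\end{aligned}$$

The second term converges to zero, and by the central limit theorem the third term converges to $N(0, I_{\theta_0^*})$, where

$$I_{\theta_0^*} = E_0\left[\frac{\partial}{\partial\theta} \log c(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*)\ \frac{\partial}{\partial\theta'} \log c(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*)\right].$$

Now a Taylor expansion of the first term (see Serfling (1980) for expansion of von Mises differentiable statistical functions) leads to

$$\begin{aligned}
&\sqrt{T} \int_{\mathbb{R}^n} \frac{\partial}{\partial\theta} \log c(\hat F_1(z_1), \dots, \hat F_n(z_n); \theta_0^*)\,dF_0(z_1, \dots, z_n) \\
&\quad = \sqrt{T} \int_{\mathbb{R}^n} \frac{\partial}{\partial\theta} \log c(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*)\,dF_0(z_1, \dots, z_n) \\
&\qquad + \sqrt{T} \sum_{j=1}^{n} \int_{\mathbb{R}^n} \frac{\partial^2}{\partial\theta\,\partial u_j} \log c(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*)\,\big(\hat F_j(z_j) - F_{j0}(z_j)\big)\,dF_0(z_1, \dots, z_n) + o_p(1) \\
&\quad = 0 + \sqrt{T} \sum_{j=1}^{n} \int_{\mathbb{R}} \left[\int_{\mathbb{R}^n} I[w_j \le z_j]\,\frac{\partial^2}{\partial\theta\,\partial u_j} \log c(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*)\,dF_0(z_1, \dots, z_n)\right] d[\hat F_j(w_j) - F_{j0}(w_j)] + o_p(1).
\end{aligned}$$

Hence by the central limit theorem the first term converges to $N(0, M_{\theta_0^*})$, where

$$M_{\theta_0^*} = \mathrm{Var}_0\left[\sum_{j=1}^{n} \int_{\mathbb{R}^n} I[Y_j \le z_j]\,\frac{\partial^2}{\partial\theta\,\partial u_j} \log c(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*)\,dF_0(z_1, \dots, z_n)\right].$$

Since the conditional expectation of $\frac{\partial}{\partial\theta} \log c(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*)$ w.r.t. any $Y_j$ is null, the first term and the third term are uncorrelated, and we finally get:

$$\sqrt{T}(\hat\theta - \theta_0^*) \Longrightarrow N\left(0,\ J_{\theta_0^*}^{-1}\,(I_{\theta_0^*} + M_{\theta_0^*})\,J_{\theta_0^*}^{-1}\right).$$

When the parametric model is well specified, i.e. $F(\cdot; \theta_0^*) = F_0(\cdot)$, we have $J_{\theta_0^*} = I_{\theta_0^*}$.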
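For the bivariate Gumbel copula used in Section 7, the semiparametric estimator can be sketched as a pseudo-likelihood maximisation in which the unknown margins are replaced by rescaled ranks. The sketch below is illustrative only: the function names are hypothetical, the rescaling by $T + 1$ and the golden-section search over $[1, 20]$ are assumptions, and the log-density is obtained by differentiating the Gumbel copula of Section 7.2 twice.

```python
import numpy as np

def pseudo_observations(sample):
    """Rescaled ranks in (0, 1) for a (T, 2) sample without ties."""
    T = sample.shape[0]
    ranks = sample.argsort(axis=0).argsort(axis=0) + 1
    return ranks / (T + 1.0)

def gumbel_log_density(u, v, theta):
    """Log of the Gumbel copula density c(u, v; theta), theta >= 1."""
    x, y = -np.log(u), -np.log(v)
    s = x ** theta + y ** theta
    r = s ** (1.0 / theta)
    return (-r + x + y                                   # log C(u, v) - log(u v)
            + (theta - 1.0) * (np.log(x) + np.log(y))
            + (1.0 / theta - 2.0) * np.log(s)
            + np.log(r + theta - 1.0))

def fit_gumbel_theta(sample, lo=1.0, hi=20.0, tol=1e-6):
    """Maximise the pseudo log-likelihood in theta by golden-section search."""
    u = pseudo_observations(sample)

    def loglik(theta):
        return gumbel_log_density(u[:, 0], u[:, 1], theta).sum()

    g = (np.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    c, d = b - g * (b - a), a + g * (b - a)
    fc, fd = loglik(c), loglik(d)
    while b - a > tol:
        if fc > fd:                  # maximum lies in [a, d]
            b, d, fd = d, c, fc
            c = b - g * (b - a)
            fc = loglik(c)
        else:                        # maximum lies in [c, b]
            a, c, fc = c, d, fd
            d = a + g * (b - a)
            fd = loglik(d)
    return 0.5 * (a + b)
```

The golden-section search presumes a single maximum on the search interval; a general-purpose optimiser could be used instead.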

B.3 Asymptotic distribution of the difference vectors

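B.3.1 Asymptotic distribution of $\hat D_1$ and $\hat{\bar D}_1$

A Taylor expansion of $C(F_1(y_{i1}; \hat\beta_1), \dots, F_n(y_{in}; \hat\beta_n); \hat\theta)$ around $\nu_0 = (\beta_0', \theta_0')'$ gives:

$$\begin{aligned}
&C(F_1(y_{i1}; \hat\beta_1), \dots, F_n(y_{in}; \hat\beta_n); \hat\theta) \\
&\quad = C(F_1(y_{i1}; \beta_{10}), \dots, F_n(y_{in}; \beta_{n0}); \theta_0) + \frac{\partial}{\partial\nu} C(F_1(y_{i1}; \bar\beta_{1}), \dots, F_n(y_{in}; \bar\beta_{n}); \bar\theta)'(\hat\nu - \nu_0) + o_p(T^{-1/2}),
\end{aligned}$$

where the overline stands for some mean values.

Hence we deduce from Subsection B.1:

$$\sqrt{T}(\hat D_1^i - D_1^i) = \nabla^i_{\nu_0} C'\,J_{\nu_0}^{-1}\,\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{\partial}{\partial\nu} \log f(Y_t; \nu_0) + \sqrt{T}\,(\hat F(y_i) - F_0(y_i)) + o_p(1),$$

where

$$\nabla^i_{\nu_0} C = \frac{\partial}{\partial\nu} C(F_1(y_{i1}; \beta_{10}), \dots, F_n(y_{in}; \beta_{n0}); \theta_0).$$

An application of the central limit theorem delivers:

$$\sqrt{T}(\hat D_1^i - D_1^i) \Longrightarrow N\left(0,\ B^{i\prime}_{\nu_0}\,\mathrm{Cov}_0\big[S^i_{\nu_0}, S^i_{\nu_0}\big]\,B^i_{\nu_0}\right),$$

where

$$B^i_{\nu_0} = \left(\nabla^i_{\nu_0} C'\,J_{\nu_0}^{-1},\ 1\right)',$$

and

$$S^i_{\nu_0} = \left(\frac{\partial}{\partial\nu'} \log f(Y; \nu_0),\ I[Y \le y_i]\right)',$$

while

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_1^k, \hat D_1^l\big] = B^{k\prime}_{\nu_0}\,\mathrm{Cov}_0\big[S^k_{\nu_0}, S^l_{\nu_0}\big]\,B^l_{\nu_0}, \quad k, l = 1, \dots, d.$$

Similarly we may deduce for the survival quantity:

$$\sqrt{T}(\hat{\bar D}_1^i - \bar D_1^i) \Longrightarrow N\left(0,\ B^{i\prime}_{\nu_0}\,\mathrm{Cov}_0\big[\bar S^i_{\nu_0}, \bar S^i_{\nu_0}\big]\,B^i_{\nu_0}\right),$$

where

$$B^i_{\nu_0} = \left(\nabla^i_{\nu_0} C'\,J_{\nu_0}^{-1},\ 1\right)',$$

and

$$\bar S^i_{\nu_0} = \left(\frac{\partial}{\partial\nu'} \log f(Y; \nu_0),\ I[Y > y_i]\right)',$$

while

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat{\bar D}_1^k, \hat{\bar D}_1^l\big] = B^{k\prime}_{\nu_0}\,\mathrm{Cov}_0\big[\bar S^k_{\nu_0}, \bar S^l_{\nu_0}\big]\,B^l_{\nu_0}, \quad k, l = 1, \dots, d.$$

Finally we also have:

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_1^k, \hat{\bar D}_1^l\big] = B^{k\prime}_{\nu_0}\,\mathrm{Cov}_0\big[S^k_{\nu_0}, \bar S^l_{\nu_0}\big]\,B^l_{\nu_0}, \quad k, l = 1, \dots, d.$$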

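B.3.2 Asymptotic distribution of $\hat D_2$ and $\hat{\bar D}_2$

A Taylor expansion of $C(u_i; \hat\theta)$ around $\theta_0$ gives:

$$C(u_i; \hat\theta) = C(u_i; \theta_0) + \frac{\partial}{\partial\theta} C(u_i; \bar\theta_0)'(\hat\theta - \theta_0) + o_p(T^{-1/2}),$$

where $\bar\theta_0$ lies between $\hat\theta$ and $\theta_0$.

We get using Subsection B.1:

$$\sqrt{T}\,(C(u_i; \hat\theta) - C(u_i; \theta_0)) = \nabla^i_{\theta_0} C'\,J_{\theta_0}^{-1}\,\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{\partial}{\partial\theta} \log f(Y_t; \nu_0) + o_p(1),$$

where

$$\nabla^i_{\theta_0} C = \frac{\partial}{\partial\theta} C(u_i; \theta_0), \qquad J_{\theta_0} = E_0\left[-\frac{\partial^2}{\partial\theta\,\partial\theta'} \log f(Y; \nu_0)\right].$$

Furthermore let $\mathcal{M} = \{I[\,\cdot \le x_1] \cdots I[\,\cdot \le x_n] : x_j \in \mathbb{R},\ j = 1, \dots, n\}$. Since $\mathcal{M}$ satisfies Pollard's entropy condition for some finite constant taken as envelope, the sequence

$$\left\{\hat F(x) = T^{-1} \sum_{t=1}^{T} \prod_{j=1}^{n} I[Y_{jt} \le x_j] : T \ge 1\right\}$$

is stochastically differentiable at $\zeta_i$ with random derivative $(d \times 1)$-vector $D\hat F(\zeta_i)$ (see e.g. Pollard (1985), Andrews (1994, 1999) for definition, use and check of stochastic differentiability). It means that we have the approximation:

$$\hat F(\hat\zeta_i) = \hat F(\zeta_i) + D\hat F(\bar\zeta_i)'(\hat\zeta_i - \zeta_i) + o_p(T^{-1/2}),$$

where $\bar\zeta_i$ is a mean value located between $\hat\zeta_i$ and $\zeta_i$.

Similarly we get the approximations:

$$\hat F_j(\hat\zeta_{ij}) = \hat F_j(\zeta_{ij}) + D\hat F_j(\bar\zeta_{ij})(\hat\zeta_{ij} - \zeta_{ij}) + o_p(T^{-1/2}).$$

Combining these approximations and using $F_{j0}(\zeta_{ij}) = u_{ij} = \hat F_j(\hat\zeta_{ij})$ leads to

$$\hat F(\hat\zeta_i) = \hat F(\zeta_i) - D\hat F(\bar\zeta_i)'\,\mathrm{diag}\,\hat S_i + o_p(T^{-1/2}),$$

where $\hat S_i$ is the stack of $(\hat F_j(\zeta_{ij}) - u_{ij}) / D\hat F_j(\bar\zeta_{ij})$, $j = 1, \dots, n$, and $\mathrm{diag}\,\hat S_i$ is the diagonal matrix built from this stack.

Hence we get:

$$\sqrt{T}(\hat D_2^i - D_2^i) = \nabla^i_{\theta_0} C'\,J_{\theta_0}^{-1}\,\frac{1}{\sqrt{T}} \sum_{t=1}^{T} \frac{\partial}{\partial\theta} \log f(Y_t; \nu_0) - \sqrt{T}\,(\hat F(\zeta_i) - F_0(\zeta_i)) + D\hat F(\bar\zeta_i)'\,\sqrt{T}\,\mathrm{diag}\,\hat S_i + o_p(1).$$

An application of the central limit theorem delivers:

$$\sqrt{T}(\hat D_2^i - D_2^i) \Longrightarrow N\left(0,\ B^{i\prime}_{\theta_0}\,\mathrm{Cov}_0\big[S^i_{\theta_0}, S^i_{\theta_0}\big]\,B^i_{\theta_0}\right),$$

where

$$B^i_{\theta_0} = \left(\nabla^i_{\theta_0} C'\,J_{\theta_0}^{-1},\ -1,\ \frac{\partial F_0(\zeta_i)/\partial x_1}{f_{10}(\zeta_{i1})},\ \dots,\ \frac{\partial F_0(\zeta_i)/\partial x_n}{f_{n0}(\zeta_{in})}\right)'$$

and

$$S^i_{\theta_0} = \left(\frac{\partial}{\partial\theta'} \log f(Y; \nu_0),\ I[Y \le \zeta_i],\ I[Y_1 \le \zeta_{i1}],\ \dots,\ I[Y_n \le \zeta_{in}]\right)',$$

while

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_2^k, \hat D_2^l\big] = B^{k\prime}_{\theta_0}\,\mathrm{Cov}_0\big[S^k_{\theta_0}, S^l_{\theta_0}\big]\,B^l_{\theta_0}, \quad k, l = 1, \dots, d.$$

We also get:

$$\sqrt{T}(\hat{\bar D}_2^i - \bar D_2^i) \Longrightarrow N\left(0,\ B^{i\prime}_{\theta_0}\,\mathrm{Cov}_0\big[\bar S^i_{\theta_0}, \bar S^i_{\theta_0}\big]\,B^i_{\theta_0}\right),$$

where

$$B^i_{\theta_0} = \left(\nabla^i_{\theta_0} C'\,J_{\theta_0}^{-1},\ -1,\ \frac{\partial F_0(\zeta_i)/\partial x_1}{f_{10}(\zeta_{i1})},\ \dots,\ \frac{\partial F_0(\zeta_i)/\partial x_n}{f_{n0}(\zeta_{in})}\right)'$$

and

$$\bar S^i_{\theta_0} = \left(\frac{\partial}{\partial\theta'} \log f(Y; \nu_0),\ I[Y > \zeta_i],\ I[Y_1 \le \zeta_{i1}],\ \dots,\ I[Y_n \le \zeta_{in}]\right)',$$

while

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat{\bar D}_2^k, \hat{\bar D}_2^l\big] = B^{k\prime}_{\theta_0}\,\mathrm{Cov}_0\big[\bar S^k_{\theta_0}, \bar S^l_{\theta_0}\big]\,B^l_{\theta_0}, \quad k, l = 1, \dots, d,$$

and

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_2^k, \hat{\bar D}_2^l\big] = B^{k\prime}_{\theta_0}\,\mathrm{Cov}_0\big[S^k_{\theta_0}, \bar S^l_{\theta_0}\big]\,B^l_{\theta_0}, \quad k, l = 1, \dots, d.$$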

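B.3.3 Asymptotic distribution of $\hat D_3$ and $\hat{\bar D}_3$

Using a Taylor expansion of $C(\hat F_1(y_{i1}), \dots, \hat F_n(y_{in}); \hat\theta)$ around $\theta_0^*$ and $F_{j0}(y_{ij})$, $j = 1, \dots, n$, we get:

$$\sqrt{T}(\hat D_3^i - D_3^i) = \nabla^i_{\theta_0^*} C'\,\sqrt{T}(\hat\theta - \theta_0^*) + \sum_{j=1}^{n} \nabla^i_{u_j} C\,\sqrt{T}\,(\hat F_j(y_{ij}) - F_{j0}(y_{ij})) - \sqrt{T}\,(\hat F(y_i) - F_0(y_i)) + o_p(1),$$

where

$$\nabla^i_{\theta_0^*} C = \frac{\partial}{\partial\theta} C(F_{10}(y_{i1}), \dots, F_{n0}(y_{in}); \theta_0^*), \qquad \nabla^i_{u_j} C = \frac{\partial}{\partial u_j} C(F_{10}(y_{i1}), \dots, F_{n0}(y_{in}); \theta_0^*).$$

From Subsection B.2 and the central limit theorem, we deduce

$$\sqrt{T}(\hat D_3^i - D_3^i) \Longrightarrow N\left(0,\ B^{i\prime}_{\theta_0^*}\,\mathrm{Cov}_0\big[S^i_{\theta_0^*}, S^i_{\theta_0^*}\big]\,B^i_{\theta_0^*}\right),$$

where

$$B^i_{\theta_0^*} = \left(\nabla^i_{\theta_0^*} C'\,J_{\theta_0^*}^{-1},\ \nabla^i_{u_1} C,\ \dots,\ \nabla^i_{u_n} C,\ -1\right)',$$

and

$$S^i_{\theta_0^*} = \left(U'_{\theta_0^*},\ I[Y_1 \le y_{i1}],\ \dots,\ I[Y_n \le y_{in}],\ I[Y \le y_i]\right)'$$

and

$$U_{\theta_0^*} = \frac{\partial}{\partial\theta} \log c(F_{10}(Y_1), \dots, F_{n0}(Y_n); \theta_0^*) + \sum_{j=1}^{n} \int_{\mathbb{R}^n} I[Y_j \le z_j]\,\frac{\partial^2}{\partial\theta\,\partial u_j} \log c(F_{10}(z_1), \dots, F_{n0}(z_n); \theta_0^*)\,dF_0(z_1, \dots, z_n).$$

We also have:

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_3^k, \hat D_3^l\big] = B^{k\prime}_{\theta_0^*}\,\mathrm{Cov}_0\big[S^k_{\theta_0^*}, S^l_{\theta_0^*}\big]\,B^l_{\theta_0^*}, \quad k, l = 1, \dots, d.$$

Besides,

$$\sqrt{T}(\hat{\bar D}_3^i - \bar D_3^i) \Longrightarrow N\left(0,\ B^{i\prime}_{\theta_0^*}\,\mathrm{Cov}_0\big[\bar S^i_{\theta_0^*}, \bar S^i_{\theta_0^*}\big]\,B^i_{\theta_0^*}\right),$$

where

$$B^i_{\theta_0^*} = \left(\nabla^i_{\theta_0^*} C'\,J_{\theta_0^*}^{-1},\ \nabla^i_{u_1} C,\ \dots,\ \nabla^i_{u_n} C,\ -1\right)',$$

and

$$\bar S^i_{\theta_0^*} = \left(U'_{\theta_0^*},\ I[Y_1 \le y_{i1}],\ \dots,\ I[Y_n \le y_{in}],\ I[Y > y_i]\right)',$$

while

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat{\bar D}_3^k, \hat{\bar D}_3^l\big] = B^{k\prime}_{\theta_0^*}\,\mathrm{Cov}_0\big[\bar S^k_{\theta_0^*}, \bar S^l_{\theta_0^*}\big]\,B^l_{\theta_0^*}, \quad k, l = 1, \dots, d,$$

and

$$\lim_{T \to \infty} T\,\mathrm{Cov}_0\big[\hat D_3^k, \hat{\bar D}_3^l\big] = B^{k\prime}_{\theta_0^*}\,\mathrm{Cov}_0\big[S^k_{\theta_0^*}, \bar S^l_{\theta_0^*}\big]\,B^l_{\theta_0^*}, \quad k, l = 1, \dots, d.$$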

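B.3.4 Asymptotic distribution of $\hat D_4$ and $\hat{\bar D}_4$

The only difference between $\hat D_2^i$ and $\hat D_4^i$ lies in the replacement of the parametric estimator of $\theta$ (Subsection B.1) by the semiparametric estimator of $\theta$ (Subsection B.2). Hence the asymptotic distribution of $\hat D_4^i$ is obtained after substituting $\nabla^i_{\theta_0^*} C'\,J_{\theta_0^*}^{-1}$ for $\nabla^i_{\theta_0} C'\,J_{\theta_0}^{-1}$, and $U_{\theta_0^*}$ for $\frac{\partial}{\partial\theta} \log f(Y; \theta_0)$ in the asymptotic results for $\hat D_2^i$.

Similarly, in order to derive the asymptotic normality of $\hat{\bar D}_4^i$, we only have to replace $\nabla^i_{\theta_0} C'\,J_{\theta_0}^{-1}$ with $\nabla^i_{\theta_0^*} C'\,J_{\theta_0^*}^{-1}$, and $\frac{\partial}{\partial\theta} \log f(Y; \theta_0)$ with $U_{\theta_0^*}$ in the asymptotic results for $\hat{\bar D}_2^i$.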

Acknowledgements

The authors would like to thank Professors Frees and Valdez for kindly providing the Loss-ALAE data, which were collected by the US Insurance Services Office (ISO). Ana Cebrian thanks the Universite catholique de Louvain, Louvain-la-Neuve, Belgium, for financial support through an FSR research grant. Michel Denuit acknowledges financial support of the Belgian Government under “Projet d’Actions de Recherche Concertees” (No. 98/03-217). Olivier Scaillet receives support by the Swiss National Science Foundation through the National Center of Competence: Financial Valuation and Risk Management. Part of this research was done when he was visiting THEMA and IRES.

Downloadable at http://www.hec.unige.ch/professeurs/SCAILLET Olivier/pages web/Home Page of Olivier Scaillet.htm


REFERENCES :

• Andrews, D. (1994): “Empirical Process Methods in Econometrics”, in Handbook of Econometrics, Vol. IV, eds R. Engle and D. McFadden, 2247-2294.

• Andrews, D. (1999): “Estimation when a Parameter is on a Boundary”, Econometrica, 67, 1341-1384.

• Arnold, B. (1987): Majorization and the Lorenz Order: A Brief Introduction, Springer-Verlag, New York, NY.

• Bartholomew, D. (1959a): “A Test of Homogeneity for Ordered Alternatives, I”, Biometrika, 46, 36-48.

• Bartholomew, D. (1959b): “A Test of Homogeneity for Ordered Alternatives, II”, Biometrika, 46, 328-335.

• Beach, C. and R. Davidson (1983): “Distribution-Free Statistical Inference with Lorenz Curves and Income Shares”, Review of Economic Studies, 50, 723-735.

• Cambanis, S., G. Simons and W. Stout (1976): “Inequalities for E k(X, Y) when the Marginals are Fixed”, Z. Wahrsch. Verw. Gebiete, 36, 285-294.

• Dardanoni, V. and A. Forcina (1999): “Inference for Lorenz Curve Orderings”, Econometrics Journal, 2, 48-74.

• Davidson, R. and J.-Y. Duclos (2000): “Statistical Inference for Stochastic Dominance and for the Measurement of Poverty and Inequality”, Econometrica, 68, 1435-1464.

• Davis, M. (1997): “Option Pricing in Incomplete Markets”, Mathematics of Derivative Securities, Publication of the Newton Institute, Dempster M. and Pliska S., Cambridge University Press, 216-227.

• Denuit, M., Dhaene, J., and Van Wouwe, M. (1999): “The Economics of Insurance: a Review and Some Recent Developments”, Bulletin of the Swiss Association of Actuaries, 2, 137-175.

• Denuit, M. and O. Scaillet (2001): “Nonparametric Tests for Positive Quadrant Dependence”, mimeo.

• Dhaene, J. and M. Goovaerts (1996): “Dependency of Risks and Stop-Loss Order”, ASTIN Bulletin, 26, 201-212.

• Embrechts, P., A. McNeil and D. Straumann (2000): “Correlation and Dependency in Risk Management: Properties and Pitfalls”, in Risk Management: Value at Risk and Beyond, eds Dempster M. and Moffatt H., Cambridge University Press, Cambridge.

• Frees, E. and E. Valdez (1998): “Understanding Relationships Using Copulae”, North American Actuarial Journal, 2, 1-25.


• Genest, C., K. Ghoudi and L.-P. Rivest (1995): “A Semiparametric Estimation Procedure of Dependence Parameters in Multivariate Families of Distributions”, Biometrika, 82, 543-552.

• Goovaerts, M., R. Kaas, A. Van Heerwaarden and T. Bauwelinckx (1990): Effective Actuarial Methods, Insurance Series 3, North Holland, Amsterdam.

• Gourieroux, C., A. Holly and A. Monfort (1982): “Likelihood Ratio, Wald Test, and Kuhn-Tucker Test in Linear Models with Inequality Constraints on the Regression Parameters”, Econometrica, 50, 63-80.

• Gourieroux, C., A. Monfort and A. Trognon (1984): “Pseudo Maximum Likelihood Theory”, Econometrica, 52, 681-700.

• Howes, S. (1993): “Inferring Population Rankings from Sample Data”, STICERD discussion paper.

• Joe, H. (1990): “Multivariate concordance”, Journal of Multivariate Analysis, 35, 12-30.

• Joe, H. (1997): Multivariate Models and Dependence Concepts, Chapman & Hall, London.

• Karatzas, A., and S. Kou (1996): “On the Pricing of Contingent Claims under Constraints”, Annals of Applied Probability, 6, 321-369.

• Kaur, A., B. Prakasa Rao and H. Singh (1994): “Testing for Second-Order Dominance of Two Distributions”, Econometric Theory, 10, 849-866.

• Klugman, S. and R. Parsa (1999): “Fitting Bivariate Loss Distributions with Copulas”, Insurance: Mathematics and Economics, 24, 139-148.

• Kodde, D. and F. Palm (1986): “Wald Criteria for Jointly Testing Equality and Inequality Restrictions”, Econometrica, 54, 1243-1248.

• Kroll, Y. and H. Levy (1980): “Stochastic Dominance Criteria: a Review and Some New Evidence”, in Research in Finance (Vol. II), p. 263-227, JAI Press, Greenwich.

• Kudo, A. (1963): “A Multivariate Analogue for the One-Sided Test”, Biometrika, 50, 403-418.

• Levy, H. (1992): “Stochastic Dominance and Expected Utility: Survey and Analysis”, Management Science, 38, 555-593.

• Mosler, K. and M. Scarsini (1993): Stochastic Orders and Applications, a Classified Bibliography, Springer-Verlag, Berlin.

• Muller, A. (1997): “Stop-Loss Order for Portfolios of Dependent Risks”, Insurance: Mathematics and Economics, 21, 219-223.


• Nelsen, R. (1999): An Introduction to Copulas, Lecture Notes in Statistics, Springer-Verlag, New York.

• Pollard, D. (1985): “New Ways to Prove Central Limit Theorems”, Econometric Theory, 1, 295-314.

• Serfling, R. (1980): Approximation Theorems of Mathematical Statistics, Wiley, New York.

• Shih, J., and T. Louis (1995): “Inferences on the Association Parameter in Copula Models for Bivariate Survival Data”, Biometrics, 51, 1384-1399.

• Scott, D. (1992): Multivariate Density Estimation: Theory, Practice and Visualisation, John Wiley & Sons, New York.

• Shaked, M. and J. Shanthikumar (1994): Stochastic Orders and their Applications, Academic Press, New York.

• Sklar, A. (1959): “Fonctions de Repartition a n Dimensions et leurs Marges”, Publ. Inst. Stat. Univ. Paris, 8, 229-231.

• Tchen, A. (1980): “Inequalities for Distributions with Given Marginals”, The Annals of Probability, 8, 814-827.

• White, H. (1982): “Maximum Likelihood Estimation of Misspecified Models”, Econometrica, 50, 1-26.

• Wolak, F. (1989a): “Testing Inequality Constraints in Linear Econometric Models”, Journal of Econometrics, 41, 205-235.

• Wolak, F. (1989b): “Local and Global Testing of Linear and Nonlinear Inequality Constraints in Nonlinear Econometric Models”, Econometric Theory, 5, 1-35.

• Yanagimoto, Y. and M. Okamoto (1969): “Partial orderings of permutations and monotonicity of a rank correlation statistic”, Annals of the Institute of Statistical Mathematics, 21, 489-506.
