
Grasselli_WishartCorrelation

Date post: 14-Apr-2018
  • 7/30/2019 Grasselli_WishartCorrelation

    Electronic copy available at: http://ssrn.com/abstract=1054721

    ISSN 1283-0623

    LEONARD DE VINCI

    Estimating the Wishart Affine Stochastic Correlation Model using the Empirical Characteristic Function

    Jose Da Fonseca Martino Grasselli Florian Ielpo

    11-2007
    ESILV, DER-MIF, No RR-35


    First draft: November 27th, 2007. This draft: March 5th, 2008.

    Abstract

    In this paper, we present and discuss the estimation of the Wishart Affine Stochastic Correlation (WASC) model introduced in Da Fonseca et al. (2006) under the historical measure. We review the main estimation possibilities for this continuous time process and provide elements to show that the use of empirical characteristic function-based estimates is advisable, as this function is exponential affine in the WASC case. We thus propose an estimation strategy close to the ones developed in Chacko and Viceira (2003) and Carrasco et al. (2007). We use a continuum of moment conditions based on the characteristic function obtained when the co-volatility process has been integrated out. We investigate the behavior of the estimates through Monte Carlo simulations. Then, we present the estimation results obtained using a dataset of equity indexes: SP500, FTSE, DAX and CAC. On the basis of these results, we show that the WASC captures many of the known stylized facts associated with financial markets, including the negative correlation between stock returns and volatility. It also helps reveal interesting patterns in the covariances of the studied indexes and in their correlation dynamics.

    Keywords: Wishart Process, Empirical Characteristic Function, Stochastic Correlation, Generalized Method of Moments.

    Acknowledgements: We are particularly indebted to Marine Carrasco for remarkable insights and helpful comments. We are also grateful to Fulvio Pegoraro and Francois-Xavier Vialard for helpful remarks. Any errors remain ours.

    Ecole Supérieure d'Ingénieurs Léonard de Vinci, Département Mathématiques et Ingénierie Financière, 92916 Paris La Défense, France. Email: jose.da [email protected] and Zeliade Systems, 56, Rue Jean-Jacques Rousseau, 75001 Paris.

    Università degli Studi di Padova, Dipartimento di Matematica Pura ed Applicata, Via Trieste 63, Padova, Italy. E-mail: grassell@math.unipd.it and ESILV.

    Centre d'Economie de la Sorbonne, 106 Boulevard de l'Hôpital, 75013 Paris, France. E-mail: [email protected]. Dexia S.A., Passerelle des Reflets, La Défense, France.


    1 Introduction

    The estimation of continuous time processes under the physical measure has attracted a lot of attention over the past few years, and several estimation strategies have been proposed in the literature. When the transition density is known in closed form, it is possible to perform a maximum likelihood estimation of the diffusion parameters, as presented e.g. in Lo (1988). However, the number of models for which the transition density is known in closed form is somewhat limited. Moreover, the existence of unobservable factors such as the volatility process in the Heston (1993) model makes it difficult, if not impossible, to estimate such models using a conditional maximum likelihood approach. A possible solution consists in discretizing and simulating the unobservable process: for example, Duffie and Singleton (1993) used the Simulated Method of Moments to estimate financial Markov processes (methods of this kind are reviewed in Gourieroux and Monfort, 1996). However, as pointed out in Chacko and Viceira (2003), even though these methods are straightforward to apply, it is difficult to measure the numerical errors due to the discretization. What is more, the computational burden of this class of methods precludes its use for multivariate processes.

    For the special class of affine models, another estimation strategy can be used. Affine models have tractable exponentially affine characteristic functions that can in turn be used to estimate the parameters under the historical measure. Singleton (2001) and Singleton (2006) present a list of possible estimation strategies that can be applied to recover these parameters from financial time series, using the characteristic function of the process. Methodologies of this kind have been applied to one-dimensional processes, like the Cox-Ingersoll-Ross process (e.g. Zhou, 2000), the Heston process and a mixture of stochastic volatility and jump processes (e.g. Jiang and Knight, 2002, Rockinger and Semenova, 2005 and Chacko and Viceira, 2003), and affine jump diffusion models (e.g. Yu, 2004), yielding interesting results. Still, this approach involves additional difficulties. First, as remarked in Jiang and Knight (2002) and Rockinger and Semenova (2005), numerically integrating the characteristic function of a vector of the state variable is computationally intensive. In our multivariate case the state variable is already a vector: for this type of methodology, the integral discretization is likely to lead to numerical errors. Second, with the Spectral GMM method presented in Chacko and Viceira (2003), the use of a more limited number of points of the characteristic function settles the numerical problem, but leads to a decrease in the efficiency of the estimates. Carrasco and Florens (2000) and Carrasco et al. (2007) present a method that uses a continuum of moment conditions built from the characteristic function. With this method, the estimates obtained reach the efficiency of the maximum likelihood method, thanks to the instrument used in this strategy. These features make this methodology particularly well-suited for the estimation of affine multivariate continuous time processes.

    Here, we propose to use a Spectral Generalized Method of Moments estimation strategy to estimate the Wishart Affine Stochastic Correlation model, an affine multivariate stochastic volatility and correlation model introduced in Da Fonseca et al. (2006). Based on the previous models of Gourieroux and Sufana (2004) and Buraschi et al. (2006), this affine model can be regarded as a multivariate version of the Heston (1993) model: in fact, the volatility matrix is assumed to evolve according to the Wishart dynamics (mathematically developed by Bru, 1991), the matrix analogue of the mean reverting square root process. Compared with the Heston model, it additionally allows for a stochastic conditional correlation, which makes it a very promising process for financial applications.

    Multivariate stochastic volatility models have recently attracted a great deal of attention. Asai et al. (2006) present a survey of the existing models, along with estimation methodologies. When compared to the previously mentioned processes, several important differences must be underlined. (1) The volatility being a latent factor, the observable state variable (the asset log returns) is not Markov, thus ML efficiency cannot be reached. (2) What is more, as we discuss in the paper, since the process involves latent volatilities and correlations, the instrument must be set equal to one for the usual GMM methodology to be used. This precludes the use of the Double Index instruments procedure presented in Carrasco et al. (2007). (3) Since correlations are also stochastic, there are more latent variables than in the stochastic volatility models, making simulation-based methods useless. (4) The dimensionality of the problem makes the characteristic function difficult to invert. (5) Finally, we show that the DCC-based estimation method proposed in Buraschi et al. (2006) is unreliable in the WASC model whenever we are interested in estimating the correlation structure of the model.

    In view of these difficulties, we propose to estimate the WASC model using its characteristic function, following an approach that is close to the ones presented in Chacko and Viceira (2003) and Carrasco et al. (2007). We present a Monte Carlo investigation of the behavior of the estimates in a small sample and we discuss the empirical results obtained using a real dataset composed of the prices of the SP500, FTSE, DAX and CAC 40. Our estimates systematically reject the particular correlation structure chosen by Gourieroux and Sufana (2004) and Buraschi et al. (2006). Finally, the results show that the WASC model, thanks to its ability to describe dynamic correlation, can encompass most of the desired features of financial markets.

    The paper is organized as follows. First, in Section 2, we present the WASC process, along with the computation of its conditional characteristic function. Then, in Section 3, we review the main estimation strategies and discuss the difficulties linked to discrete time models. In Section 4, we present the estimation methodology used in this paper and briefly review the main theoretical results. Finally, in Section 5, we present the estimation results obtained with both simulated and real datasets and discuss their interpretation.


    2 The model

    In this section, we present the Wishart Affine Stochastic Correlation model introduced in Da Fonseca et al. (2006): we detail the diffusion that drives this multidimensional process and present the conditional characteristic function together with its derivatives.

    2.1 The dynamics

    The Wishart Affine Stochastic Correlation (WASC) model is a new continuous time process that can be considered as a multivariate extension of the Heston (1993) model, with a more accurate correlation structure. The framework of this model was introduced in Gourieroux and Sufana (2004). It relies on the following assumption.

    Assumption 2.1. The evolution of asset returns is conditionally Gaussian while the stochastic variance-covariance matrix follows a Wishart process.

    In formulas, we consider an $n$-dimensional risky asset $S_t$ whose dynamics are given by

    $$dS_t = \mathrm{diag}[S_t]\left(\mu\,dt + \sqrt{\Sigma_t}\,dZ_t\right), \tag{1}$$

    where $\mu$ is the vector of returns and $Z_t \in \mathbb{R}^n$ is a vector Brownian motion. Following Gourieroux and Sufana (2004), we assume that the quadratic variation of the risky assets is a matrix $\Sigma_t$ which is assumed to satisfy the following dynamics:

    $$d\Sigma_t = \left(\Omega\Omega^\top + M\Sigma_t + \Sigma_t M^\top\right)dt + \sqrt{\Sigma_t}\,dW_t\,Q + Q^\top (dW_t)^\top \sqrt{\Sigma_t}, \tag{2}$$

    with $\Omega, M, Q \in M_n$, $\Omega$ invertible, and $W_t \in M_n$ a matrix Brownian motion ($^\top$ denotes transposition). In the present framework we assume that the above dynamics are inferred from observed asset price time series, hence the stochastic differential equation is written under the historical measure.

    Equation (2) characterizes the Wishart process introduced by Bru (1991), and represents the matrix analogue of the square root mean-reverting process. In order to ensure the strict positivity and the typical mean-reverting feature of the volatility, the matrix $M$ is assumed to be negative semi-definite, while $\Omega$ satisfies

    $$\Omega\Omega^\top = \beta Q^\top Q, \qquad \beta > n - 1, \tag{3}$$

    where $\beta$ is a real parameter (see Bru, 1991 p. 747).

    In full analogy with the square-root process, the term $\Omega\Omega^\top$ is related to the expected long-term variance-covariance matrix $\Sigma_\infty$ through the solution to the following linear equation:

    $$-\Omega\Omega^\top = M\Sigma_\infty + \Sigma_\infty M^\top. \tag{4}$$


    Moreover, $Q$ is the volatility of the volatility matrix, and its parameters will be crucial in order to explain some stylized observed effects in equity markets.
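    Equation (4) is a linear (Lyapunov-type) equation, so the long-term matrix can be computed numerically once $M$, $Q$ and $\beta$ are given. The following is only a minimal sketch and not part of the original paper: it assumes NumPy, rewrites the equation with Kronecker products, and the function name `long_run_covariance` and the parameter values in the usage lines are hypothetical.

```python
import numpy as np

def long_run_covariance(M, Q, beta):
    """Solve -beta * Q'Q = M S + S M' for S, i.e. equations (3) and (4)."""
    M = np.asarray(M, dtype=float)
    Q = np.asarray(Q, dtype=float)
    n = M.shape[0]
    omega_omega_t = beta * Q.T @ Q                       # equation (3)
    # With column-major vec: vec(M S) = (I kron M) vec(S), vec(S M') = (M kron I) vec(S)
    lhs = np.kron(np.eye(n), M) + np.kron(M, np.eye(n))
    vec_s = np.linalg.solve(lhs, -omega_omega_t.reshape(-1, order="F"))
    return vec_s.reshape((n, n), order="F")

# Hypothetical two-asset parameters, for illustration only
M_example = np.array([[-2.5, -1.5], [-1.5, -2.5]])
Q_example = np.array([[0.21, -0.14], [0.14, 0.21]])
sigma_inf = long_run_covariance(M_example, Q_example, beta=7.14)
```

    The Kronecker formulation is chosen for transparency; it builds an $n^2 \times n^2$ system, which is harmless for a handful of assets but would be replaced by a dedicated Lyapunov solver in larger dimensions.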

    Last but not least, Da Fonseca et al. (2006) proposed a very special yet tractable correlation structure that is able to accommodate the leverage effects found in financial time series and option prices. Since it is well known that it is possible to approximately reproduce observed negative skewness within the Heston (1993) model by allowing for negative correlation between the noise driving returns and the noise driving variance, they proposed the following assumption:

    Assumption 2.2. The Brownian motions of the asset returns and those driving the covariance matrix are linearly correlated.

    Da Fonseca et al. (2006) proved that Assumption 2.2 leads to the following relation:

    $$dZ_t = dW_t\,\rho + \sqrt{1 - \rho^\top\rho}\;dB_t, \tag{5}$$

    with $dZ_t = (dZ^1_t, dZ^2_t, \dots, dZ^n_t)^\top$, $B$ a vector of independent Brownian motions orthogonal to $W$ (the matrix Brownian motion of equation (2)), and $\rho = (\rho_1, \rho_2, \dots, \rho_n)^\top$.

    With this specification, the model is able to generate negative skewness, given the possibly negative correlation between the noise driving the log returns of the assets and the matrix-sized noise perturbing the covariance matrix. This is easy to show in the special case of two assets ($n = 2$), for which the variance-covariance matrix is given by

    $$\Sigma_t = \begin{pmatrix} \Sigma^{11}_t & \Sigma^{12}_t \\ \Sigma^{12}_t & \Sigma^{22}_t \end{pmatrix}. \tag{6}$$

    The correlations between asset returns and their volatilities admit a closed form expression, highlighting the impact of the parameters on their value and sign:

    $$\mathrm{corr}\left(d\log S^1, d\Sigma^{11}\right) = \frac{\rho_1 Q_{11} + \rho_2 Q_{21}}{\sqrt{Q_{11}^2 + Q_{21}^2}} \tag{7}$$

    $$\mathrm{corr}\left(d\log S^2, d\Sigma^{22}\right) = \frac{\rho_1 Q_{12} + \rho_2 Q_{22}}{\sqrt{Q_{12}^2 + Q_{22}^2}}, \tag{8}$$

    where we recall that $\sqrt{\Sigma^{11}}$ (resp. $\sqrt{\Sigma^{22}}$) represents the volatility of the first (resp. second) asset. Therefore, the sign and magnitude of the skew effects are determined by both the matrix $Q$ and the vector $\rho$. When $Q$ is diagonal, we obtain the following skews for assets 1 and 2:

    $$\mathrm{corr}\left(d\log S^1, d\Sigma^{11}\right) = \rho_1, \qquad \mathrm{corr}\left(d\log S^2, d\Sigma^{22}\right) = \rho_2, \tag{9}$$

    thus allowing a negative skewness for each asset whenever $\rho_i < 0$, $\forall i$ (see Gourieroux and Jasiak (2001) on this point). This correlation structure is similar to the one obtained in a Heston model.
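    As a quick numerical illustration of equations (7) and (8), the return-variance correlations can be evaluated directly from $Q$ and $\rho$. The sketch below is an illustration rather than the authors' code: it assumes NumPy, the function name `leverage_correlations` is hypothetical, and reading both equations as "column $j$ of $Q$ projected on $\rho$" for general $n$ is an assumption that goes beyond the two-asset case written above.

```python
import numpy as np

def leverage_correlations(Q, rho):
    """Return corr(d log S^j, d Sigma^{jj}) for each asset j.

    Entry j is (rho . Q[:, j]) / ||Q[:, j]||, which reproduces equations (7)
    and (8) when n = 2.
    """
    Q = np.asarray(Q, dtype=float)
    rho = np.asarray(rho, dtype=float)
    return (rho @ Q) / np.linalg.norm(Q, axis=0)

# Hypothetical parameter values, for illustration only
skews = leverage_correlations([[0.21, -0.14], [0.14, 0.21]], [-0.3, -0.5])
```

    With a diagonal $Q$ with positive entries, the function returns $\rho$ itself, in line with equation (9).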

    Other less general specifications close to the WASC model, actually nested within the WASC correlation structure, have been proposed in the literature. First, Gourieroux and Sufana (2004) imposed $\rho = 0_n \in \mathbb{R}^n$, a choice that leads to a zero correlation case (see equations (5), (7) and (8)) by analogy with the well known properties of the Heston model. With this specification, the univariate distribution of the log returns is symmetric.

    Second, Buraschi et al. (2006) proposed to impose $\rho = (1, 0)^\top$. Their model is thus able to display negative skewness for asset 1 (resp. asset 2), depending on the sign of $Q_{11}$ (resp. $Q_{12}$). This model is actually close to the WASC and is able to display similar features. Their choice of $\rho$ is less restrictive than it seems, insofar as this parameter is only defined up to a rotation. Thus, their hypothesis reduces to $\|\rho\| = 1$. With these settings, the vector-sized noise in the returns is fully generated by the Brownian motions of the covariances $W$. Since this hypothesis has no a priori justification, the WASC model of Da Fonseca et al. (2006) eliminates it, assuming the more general correlation structure compatible with an exponential affine characteristic function (see Proposition 1 in Da Fonseca et al. (2006) on this point).

    2.2 The Characteristic functions

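    In the WASC model, the characteristic function of $\Sigma_t$ and $Y_t = \log S_t$ is an exponential affine function of the state variables. For the log returns, the characteristic function of $Y_{t+\tau}$ conditional on $Y_t$ and $\Sigma_t$ is denoted:

    $$\Phi_{Y_t,\Sigma_t}(\tau, \omega) = E\left[e^{i\langle\omega, Y_{t+\tau}\rangle} \mid \Sigma_t, Y_t\right], \tag{10}$$

    where $E[\,\cdot \mid \Sigma_t, Y_t]$ denotes the conditional expected value with respect to the historical measure, $\omega \in \mathbb{R}^n$, $i^2 = -1$ and $\langle\cdot,\cdot\rangle$ is the scalar product in $\mathbb{R}^n$.

    Proposition 2.1. (Da Fonseca et al., 2006) The characteristic function of the asset returns in the WASC model is given by

    $$\Phi_{Y_t,\Sigma_t}(\tau, \omega) = E\left[e^{i\langle\omega, Y_{t+\tau}\rangle} \mid \Sigma_t, Y_t\right] \tag{11}$$

    $$= \exp\left\{\mathrm{Tr}\left[A(\tau)\Sigma_t\right] + i\langle\omega, Y_t\rangle + c(\tau)\right\}, \tag{12}$$

    where $\omega = (\omega_1, \dots, \omega_n)^\top \in \mathbb{R}^n$ and the deterministic function $A(\tau) \in M_n$ is as follows:

    $$A(\tau) = A_{22}(\tau)^{-1} A_{21}(\tau), \tag{13}$$

    with

    $$\begin{pmatrix} A_{11}(\tau) & A_{12}(\tau) \\ A_{21}(\tau) & A_{22}(\tau) \end{pmatrix} = \exp \tau \begin{pmatrix} M + Q^\top\rho\, i\omega^\top & -2Q^\top Q \\ \frac{1}{2}\left((i\omega)(i\omega)^\top - \sum_{j=1}^n i\omega_j e_{jj}\right) & -\left(M^\top + i\omega\rho^\top Q\right) \end{pmatrix}. \tag{14}$$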


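    The function $c(\tau)$ can be obtained by direct integration, thus giving:

    $$c(\tau) = -\frac{\beta}{2}\,\mathrm{Tr}\left[\log(A_{22}(\tau)) + \tau M^\top + \tau\, i\omega(\rho^\top Q)\right] + \mathrm{Tr}\left[i\tau\,\omega\mu^\top\right]. \tag{15}$$

    The characteristic function of the Wishart process is defined as:

    $$\Phi_{\Sigma_t}(\tau, \Delta) = E\left[e^{i\,\mathrm{Tr}[\Delta\Sigma_{t+\tau}]} \mid \Sigma_t\right], \tag{16}$$

    where $\Delta \in M_n$.

    Proposition 2.2. (Da Fonseca, 2006) Given a real symmetric matrix $\Delta$, the conditional characteristic function of the Wishart process $\Sigma_t$ is given by:

    $$\Phi_{\Sigma_t}(\tau, \Delta) = E\left[e^{i\,\mathrm{Tr}[\Delta\Sigma_{t+\tau}]} \mid \Sigma_t\right] = \exp\left\{\mathrm{Tr}\left[B(\tau)\Sigma_t\right] + C(\tau)\right\}, \tag{17}$$

    where the deterministic complex-valued functions $B(\tau) \in M_n(\mathbb{C})$, $C(\tau) \in \mathbb{C}$ are given by

    $$B(\tau) = \left(i\Delta B_{12}(\tau) + B_{22}(\tau)\right)^{-1}\left(i\Delta B_{11}(\tau) + B_{21}(\tau)\right) \tag{18}$$

    $$C(\tau) = -\frac{\beta}{2}\,\mathrm{Tr}\left[\log\left(i\Delta B_{12}(\tau) + B_{22}(\tau)\right) + \tau M^\top\right],$$

    with

    $$\begin{pmatrix} B_{11}(\tau) & B_{12}(\tau) \\ B_{21}(\tau) & B_{22}(\tau) \end{pmatrix} = \exp \tau \begin{pmatrix} M & -2Q^\top Q \\ 0 & -M^\top \end{pmatrix}.$$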

    What is more, both these characteristic functions can be differentiated with respect to $\omega$ and the elements of $M$, $Q$ and $\rho$, using the results of Daleckii (1974) on the derivative of a matrix function. We provide detailed calculations of these derivatives in the Appendix.
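    To make Proposition 2.1 concrete, the conditional characteristic function of the log returns can be evaluated with one matrix exponential of size $2n \times 2n$. The code below is a hedged sketch rather than the authors' implementation. It assumes NumPy, a hand-written Taylor-based `expm` helper (an illustrative stand-in for a library routine), the sign conventions of equations (13) to (15) as written above, and the identity $\mathrm{Tr}\log A_{22} = \log\det A_{22}$ on the principal branch, which is only safe for moderate horizons $\tau$. All function names are hypothetical.

```python
import numpy as np

def expm(a, terms=18):
    """Matrix exponential by scaling and squaring with a truncated Taylor series."""
    a = np.asarray(a, dtype=complex)
    norm = np.linalg.norm(a, 1)
    squarings = max(0, int(np.ceil(np.log2(norm))) + 1) if norm > 0 else 0
    a = a / 2.0 ** squarings
    result = np.eye(a.shape[0], dtype=complex)
    term = np.eye(a.shape[0], dtype=complex)
    for k in range(1, terms + 1):
        term = term @ a / k
        result = result + term
    for _ in range(squarings):
        result = result @ result
    return result

def wasc_log_return_cf(omega, tau, y, sigma, M, Q, rho, beta, mu):
    """E[exp(i <omega, Y_{t+tau}>) | Sigma_t = sigma, Y_t = y], equations (12)-(15)."""
    omega = np.asarray(omega, dtype=float)
    M, Q = np.asarray(M, dtype=float), np.asarray(Q, dtype=float)
    n = omega.size
    g = 1j * omega.reshape(n, 1)                 # column vector i * omega
    r = np.asarray(rho, dtype=float).reshape(n, 1)
    top = np.hstack([M + Q.T @ r @ g.T, -2.0 * Q.T @ Q])
    bottom = np.hstack([0.5 * (g @ g.T - np.diag(g.ravel())),
                        -(M.T + g @ r.T @ Q)])
    blocks = expm(tau * np.vstack([top, bottom]))            # equation (14)
    a21, a22 = blocks[n:, :n], blocks[n:, n:]
    A = np.linalg.solve(a22, a21)                            # equation (13)
    # Assumption: Tr log(A22) taken as log det(A22) on the principal branch
    sign, log_abs_det = np.linalg.slogdet(a22)
    tr_log = log_abs_det + 1j * np.angle(sign)
    # Equation (15); Tr[i tau omega mu'] equals i * tau * <omega, mu>
    c = (-0.5 * beta * (tr_log + tau * np.trace(M.T + g @ r.T @ Q))
         + 1j * tau * (omega @ np.asarray(mu, dtype=float)))
    return np.exp(np.trace(A @ sigma) + 1j * (omega @ np.asarray(y, dtype=float)) + c)
```

    Two sanity checks are cheap and worth running on any implementation of this kind: the function should return 1 at $\omega = 0$, and for a very small $\tau$ it should be close to the Gaussian characteristic function with mean $\tau(\mu - \tfrac{1}{2}\mathrm{diag}\,\Sigma_t)$ and covariance $\tau\Sigma_t$.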

    3 Possible estimation strategies

    In this section, we review the possible estimation strategies for the WASC. We discuss how similar processes have been estimated in special correlation cases, using the Dynamic Conditional Correlation model introduced by Engle (2002) to make the time-varying covariances observable. This methodology has been applied by Buraschi et al. (2006) to a particular specification of the WASC model corresponding to the choice $\rho = (1, 0)^\top$ for the correlation structure. We show why this approach cannot be pursued in the WASC correlation specification, both from a theoretical and empirical point of view. Finally, we present a discrete time model whose continuous time limit is the WASC model and explain why such an approach may lead to the kind of estimation errors we want to avoid.


    3.1 Preliminary considerations

    The WASC model belongs to the class of multivariate stochastic volatility models and thus shares the basic features of the univariate models. Univariate stochastic volatility models are built upon an unobservable volatility process, which makes their estimation difficult because the volatility is missing from the information available for conditioning. Different estimation strategies have been proposed in the case of continuous time processes (see the surveys in Ghysels et al., 1996 and Ruiz and Broto, 2004). Only a few of them could be applied to multivariate stochastic volatility models:

    (1) The use of filtering has been proposed to circumvent the unobservable feature of stochastic variance, as in Harvey et al. (1994), where the Gaussianity of the factors made it possible to apply the methodology of the linear Kalman filter.

    (2) The unobserved volatility process can be made observable using an exogenous measure. The realized volatility has been extensively used for financial applications in the literature, see e.g. Barndorff-Nielsen and Shephard (2002).

    (3) When using moment estimators based on the characteristic function, as in the Heston model, the volatility can be integrated out, allowing for a direct focus on the distribution of returns. Following Gourieroux (2006), a spectral moment-based estimator can be constructed. Constructions of this kind can be found in Chacko and Viceira (2003) and Rockinger and Semenova (2005).

    (4) Finally, in the special case of continuous-time stochastic volatility models, a GARCH-like discrete time model that converges towards the continuous time model can be constructed in the spirit of Nelson (1990).

    Were it not for problems of dimension, similar methods could be applied to multivariate stochastic volatility models. Nevertheless, we review each of them and discuss their applicability to the WASC model. Although it has been proposed as an estimation method by Gourieroux (2006), we quickly discard the filtering approach, since in our non-Gaussian framework the non-linearity of the optimal filter leads to a complicated numerical implementation.

    3.2 Estimation when the covariances are observable

    The first possibility is thus to make the covariance matrix observable, using a parametric or non-parametric measure. Non-parametric estimators have been proposed in Barndorff-Nielsen and Shephard (2004). Estimators of this kind have been applied to a pure Wishart process (named Wishart Autoregressive Process or WAR) in Gourieroux et al. (2004) using stock market data, in a moment-method-based approach. A parametric approach has been proposed, using the Dynamic Conditional Correlation model (Engle, 2002 and Engle and Sheppard, 2001) to estimate the daily covariance matrix. This methodology has been applied in Buraschi et al. (2006) to a special case of the WASC, with the aforementioned correlation structure $\rho = (1, 0)^\top$ and with a dataset made of stocks and interest rates.

    Methodologies of this kind, especially the parametric one, are particularly difficult to apply to the WASC model for the following reasons. On the one hand, the observed covariances should now lead to a standard maximum likelihood estimation of the WASC. However, the density is not available in a closed-form expression and the numerical multivariate Fourier transform of the characteristic function is quite computer intensive. On the other hand, any moment-based method will lead to a lack of efficiency, and in particular the DCC method cannot be used for the WASC model in order to estimate the vector parameter $\rho$, as we are going to show.

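    Let us briefly review the DCC model and provide numerical results. The DCC model assumes that the returns of $n$ assets possess a conditionally Normal distribution with zero mean and covariance matrix $H_t$. Thus:

    $$r_t \mid \mathcal{F}_{t-1} \sim N(0, H_t) \tag{19}$$

    and

    $$H_t = D_t R_t D_t, \tag{20}$$

    where $D_t = \mathrm{diag}\left(\sqrt{h^i_t}\right)$ is the diagonal matrix of time-varying standard deviations and $R_t$ is the time-varying correlation matrix. The $\{h^i_t;\ i = 1, \dots, n\}$ follow a univariate GARCH dynamic, with the usual parameter restrictions. The dynamic of the correlation $R_t$ is given by:

    $$Q_t = (1 - \alpha - \beta)\bar{Q} + \alpha\,\epsilon_{t-1}\epsilon_{t-1}^\top + \beta Q_{t-1}, \qquad R_t = Q_t^{*-1} Q_t Q_t^{*-1},$$

    with $Q^*_t = \mathrm{diag}\left(\sqrt{q_{ii}}\right)$ and $\epsilon_t \sim N(0, R_t)$. In this subsection, $Q_t$, $\bar{Q}$, $\alpha$ and $\beta$ are DCC quantities and should not be confused with the WASC parameters $Q$ and $\beta$. Once again suitable constraints should be imposed on $\alpha$ and $\beta$ in order to guarantee a well-behaved $R_t$ matrix.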
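    The DCC correlation recursion above is simple to implement once the standardized residuals are available. The following is a rough sketch, assuming NumPy; the function name `dcc_correlations`, the initialisation $Q_0 = \bar{Q}$ and the argument names `a` and `b` (standing for the DCC scalars $\alpha$ and $\beta$) are assumptions made for this illustration and are not taken from the paper.

```python
import numpy as np

def dcc_correlations(eps, q_bar, a, b):
    """Run Q_t = (1 - a - b) Q_bar + a eps_{t-1} eps_{t-1}' + b Q_{t-1}
    and rescale each Q_t into a correlation matrix R_t.

    eps   : array of shape (T, n) with standardized residuals
    q_bar : (n, n) unconditional quasi-correlation matrix
    """
    eps = np.asarray(eps, dtype=float)
    q_bar = np.asarray(q_bar, dtype=float)
    q = q_bar.copy()                      # Q_0 = Q_bar (assumed initialisation)
    correlations = []
    for e in eps:                         # e plays the role of eps_{t-1}
        q = (1.0 - a - b) * q_bar + a * np.outer(e, e) + b * q
        scale = 1.0 / np.sqrt(np.diag(q))             # diagonal of (Q_t^*)^{-1}
        correlations.append(q * np.outer(scale, scale))
    return np.array(correlations)
```

    With $a, b \geq 0$, $a + b < 1$ and a positive definite $\bar{Q}$, every $Q_t$ stays positive definite, so each $R_t$ has a unit diagonal and off-diagonal entries in $[-1, 1]$.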

    As long as the DCC model provides consistent estimates of the time-varying variance-covariance matrices, it is possible to estimate $M$, $Q$ and $\beta$ consistently, by using the unconstrained first-order method of moments estimator presented in Gourieroux et al. (2004). In our case, such an approach is precluded by the presence of $\rho$: in fact, the DCC does not allow for correlation between the returns vector and the covariance matrix, while this correlation is non-zero in the WASC model and is parametrized by $\rho$. To see why, recall that in the DCC model the variances are assumed to follow a GARCH(1,1) specification. Thus, the dynamic of the $i$th component is:

    $$r^i_t = \sqrt{h^i_t}\,\epsilon^i_t, \tag{21}$$

    $$\epsilon^i_t \sim N(0, 1) \tag{22}$$

    $$h^i_t = \alpha_0 + \alpha_1 h^i_{t-1} + \alpha_2 (r^i_{t-1})^2. \tag{23}$$


    When computing the skew, we obtain:

    $$\mathrm{Cov}_{t-1}\left(r^i_t,\ h^i_{t+1} - h^i_t\right) = \alpha_2 \left(h^i_t\right)^{3/2} E_{t-1}\left[(\epsilon^i_t)^3\right] = 0. \tag{24}$$

    Thus, the variances obtained with a DCC model have no correlation with the returns. A similar computation can be performed to show that the conditional covariance between the log of the asset price and the correlation process in the DCC framework is equal to 0. These two key features preclude any DCC-based estimation of $\rho$ in the WASC. This problem is well-known in financial econometrics, at least since Nelson and Foster (1994). We report in Figure 1 the empirical distribution of the skew and $\rho$ estimators with DCC-based time-varying covariances, when the true (simulated) data-generating process is the WASC, illustrating this correlation-estimation problem. Thus DCC-based estimates cannot be used. One interesting exception is the model presented in Cappiello et al. (2006). However, in this case, as in the DCC case, we fear model-linked measurement errors: in particular, we cannot quantify how well these models yield a consistent estimation of the skew.
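    The zero-skew property in equation (24) can also be checked by simulation. The sketch below is a small Monte Carlo illustration, not the experiment behind Figure 1: it simulates a univariate Gaussian GARCH(1,1) with hypothetical coefficients and computes the sample correlation between the return and the subsequent change in conditional variance, which should be close to zero. It uses only the Python standard library, and the function name is hypothetical.

```python
import math
import random

def garch_return_variance_correlation(alpha0, alpha1, alpha2, n_steps, seed=0):
    """Sample correlation between r_t and h_{t+1} - h_t under equations (21)-(23)."""
    rng = random.Random(seed)
    h = alpha0 / (1.0 - alpha1 - alpha2)      # start at the unconditional variance
    returns, variance_changes = [], []
    for _ in range(n_steps):
        r = math.sqrt(h) * rng.gauss(0.0, 1.0)            # equations (21)-(22)
        h_next = alpha0 + alpha1 * h + alpha2 * r * r     # equation (23)
        returns.append(r)
        variance_changes.append(h_next - h)
        h = h_next
    mean_r = sum(returns) / n_steps
    mean_d = sum(variance_changes) / n_steps
    cov = sum((r - mean_r) * (d - mean_d)
              for r, d in zip(returns, variance_changes)) / n_steps
    var_r = sum((r - mean_r) ** 2 for r in returns) / n_steps
    var_d = sum((d - mean_d) ** 2 for d in variance_changes) / n_steps
    return cov / (math.sqrt(var_r) * math.sqrt(var_d))

# Hypothetical GARCH coefficients, for illustration only
corr = garch_return_variance_correlation(0.05, 0.90, 0.05, n_steps=200_000)
```

    Because the variance update depends on the squared return only, the sign of the return carries no information about the next variance move, which is exactly why a symmetric GARCH filter cannot identify $\rho$.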

    Finally, it is worth mentioning that the DCC cannot be used as a discretization of the continuous time WASC model for estimation purposes. As presented in Nelson and Foster (1994) in the univariate case, the use of a GARCH process to estimate a square root diffusion process leads to an estimator that is expected to have a very low efficiency. The WASC process being a matrix square root process, we would face similar if not worse difficulties by undertaking such a strategy in our multidimensional setting. Coming back to the DCC, this model is usually estimated in a two-stage framework, as volatilities and correlations can be estimated separately. Clearly, the use of such an estimation strategy is dubious for the WASC model, since volatilities and correlations depend on the same set of parameters. We develop this point in the empirical results section as we detail the dynamics of volatility and correlation in the two-dimensional case. On the basis of these arguments, the use of the DCC to estimate the WASC should clearly be discarded as a possible estimation strategy. For these reasons, we present in the next subsection a converging discretization of the WASC process.

    3.3 Approximation of the WASC model by Stochastic Difference Equations

    Another way to estimate continuous time processes consists in building a discrete time model that converges towards the continuous time model when the time lag goes to zero. A methodology of this kind has already been applied to stochastic volatility models by Nelson (1990). In this section, we build a discrete-time model of this kind that converges weakly towards the WASC model.

    Following the seminal paper of Nelson (1990), we build a stochastic difference equations approximation of the WASC. Our approach relies on the fact that a Wishart process with integer Gindikin value can be written as a square of an Ornstein-Uhlenbeck matrix process. On this point, see Bru (1991). This property has been used by Gourieroux et al. (2004) to construct an estimator of a pure Wishart process. Thus, there exists a matrix-sized stochastic process $X_t$ in $M_n$ such that $\Sigma_t = X_t^\top X_t$ and whose dynamics are given by

    $$dX_t = X_t M^\top dt + dN_t\,\tilde{Q}, \tag{25}$$

    with $\{N_t;\ t \geq 0\}$ a matrix Brownian motion in $M_n$ and $\tilde{Q}$ the rotation of $Q$, so that $\tilde{Q}$ is symmetric. What is more, the link between $N$ and $W$ is known from Bru (1991). Thus, once the link between $Z$ and $N$ has been found (see the Appendix), we are able to specify the WASC with the following stochastic differential equation:

    $$\begin{aligned} dY_t &= \left[\mu - \tfrac{1}{2}\mathrm{Vec}\left((X_t^\top X_t)_{ii}\right)\right]dt + \sqrt{X_t^\top X_t}\;dZ_t \\ dX_t &= X_t M^\top dt + \left[\mathrm{diag}(dZ_t)\,\alpha_t + \mathrm{diag}(d\tilde{Z}_t)\,\beta_t + \mathrm{diag}(d\bar{Z}_t)\,\gamma_t + \mathrm{diag}(d\hat{Z}_t)\,\delta_t\right]\tilde{Q}, \end{aligned} \tag{26}$$

    with $Z$, $\tilde{Z}$, $\bar{Z}$ and $\hat{Z}$ independent Brownian motions. $\alpha_t$, $\beta_t$, $\gamma_t$ and $\delta_t$ are functions of the (rotated) parameter $\rho$ and of $X_t$. Using this result and the ones in Nelson (1990), we can provide a discrete time model converging towards the WASC. From the theory of Hermite polynomials we can build from $\{\epsilon_k;\ k \in \mathbb{N}\}$ some sequences of uncorrelated normalized Gaussian variables, due to the fact that $E[H_i(U)H_j(U)] = \delta_{ij} c_i$ with $U \sim N(0, 1)$, where $\delta_{ij}$ is the Kronecker function and $c_i = i!$. By taking the limit, these sequences will converge to independent Brownian motions, thus extending Theorem 3.2 of Nelson (1990) to a dimension greater than two. We illustrate such converging sequences in the following lemma, whose proof is provided in the Appendix.

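    Lemma 3.1. Let $\{t_k;\ k \in \mathbb{N}\}$ be such that $t_i - t_{i-1} = \Delta$. Denote by $(Y_{t_k}, X_{t_{k+1}})$ the solution of the stochastic difference equations

    $$\begin{aligned} Y_{t_k} - Y_{t_{k-1}} &= \left[\mu - \tfrac{1}{2}\mathrm{Vec}\left((X_{t_k}^\top X_{t_k})_{ii}\right)\right]\Delta + \sqrt{X_{t_k}^\top X_{t_k}}\,\sqrt{\Delta}\,\epsilon_{t_k} \\ X_{t_{k+1}} &= X_{t_k} + X_{t_k} M^\top \Delta + \sqrt{\Delta}\,N_{t_k}\tilde{Q}, \end{aligned}$$

    where $\{\epsilon_{t_k};\ k \in \mathbb{N}\}$ are i.i.d. $N(0, I_n)$ and $N_{t_k}$ is given by

    $$N_{t_k} = \begin{pmatrix} H_1(\epsilon^1_{t_k}) & 0 \\ 0 & H_1(\epsilon^2_{t_k}) \end{pmatrix}\frac{\alpha_{t_k}}{\sqrt{c_1}} + \begin{pmatrix} H_2(\epsilon^1_{t_k}) & 0 \\ 0 & H_2(\epsilon^2_{t_k}) \end{pmatrix}\frac{\beta_{t_k}}{\sqrt{c_2}} + \begin{pmatrix} H_3(\epsilon^1_{t_k}) & 0 \\ 0 & H_3(\epsilon^2_{t_k}) \end{pmatrix}\frac{\gamma_{t_k}}{\sqrt{c_3}} + \begin{pmatrix} H_4(\epsilon^1_{t_k}) & 0 \\ 0 & H_4(\epsilon^2_{t_k}) \end{pmatrix}\frac{\delta_{t_k}}{\sqrt{c_4}},$$

    with $c_i = E[H_i(U)^2]$, $U \sim N(0, 1)$ and $H_i(x)$ the $i$th Hermite polynomial.

    The process $(Y_{t_k}, X_{t_{k+1}})$ converges weakly to $(Y_t, X_t)$ as $\Delta \to 0$, with $(Y_t, X_t)$ defined previously.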
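    The construction in Lemma 3.1 rests on the orthogonality relation $E[H_i(U)H_j(U)] = \delta_{ij}\,i!$ for $U \sim N(0, 1)$. The snippet below is a small numerical check of that relation, offered as an illustration only. It assumes NumPy and uses the probabilists' Hermite polynomials from `numpy.polynomial.hermite_e` together with Gauss-Hermite quadrature; the function name `hermite_gram` is hypothetical.

```python
import math
import numpy as np
from numpy.polynomial import hermite_e as He

def hermite_gram(max_degree, quad_points=20):
    """Matrix of E[H_i(U) H_j(U)] for i, j = 1..max_degree and U ~ N(0, 1)."""
    nodes, weights = He.hermegauss(quad_points)       # weight function exp(-x^2 / 2)
    weights = weights / math.sqrt(2.0 * math.pi)      # rescale to the N(0, 1) density
    # hermeval with coefficients (0, ..., 0, 1) evaluates the degree-i polynomial
    basis = [He.hermeval(nodes, [0.0] * i + [1.0]) for i in range(1, max_degree + 1)]
    return np.array([[np.sum(weights * hi * hj) for hj in basis] for hi in basis])

gram = hermite_gram(4)   # expected to be close to diag(1!, 2!, 3!, 4!)
```

    Dividing $H_i(\epsilon)$ by $\sqrt{c_i} = \sqrt{i!}$, as in the definition of $N_{t_k}$, therefore yields uncorrelated unit-variance innovations built from a single Gaussian draw.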

    Even though this approach is attractive, insofar as it makes the conditioning easier, we will not pursue this direction. As already pointed out by Nelson (1990), it remains to prove that the discrete approximation scheme will provide consistent estimates of the continuous time model parameters. The question of consistency and efficiency, when dealing with unobservable components in the dynamics, is of tremendous importance. We presented the previous discretization in order to shed some light on the possibility of estimating a discrete version of the WASC without using the GARCH-like model, because of the need to estimate the vector-sized parameter $\rho$. Nevertheless, we have little control over the impact of the discretization error brought about by such a procedure. Thus, instead of following the approximation procedure, we prefer to turn our attention toward the use of the exponential affine characteristic function of the WASC process, for which we have well-established results related to Generalized Method of Moments-based estimation procedures.

    4 Spectral GMM in the WASC setting

    Recent papers have presented estimation methodologies using the empirical characteristic function as an estimation tool, since this function has a tractable expression for many continuous time processes. In this section, we present how to estimate the WASC in this framework, building on the approaches developed in Chacko and Viceira (2003) and Carrasco et al. (2007).

    The usual way to present the generalized method of moments based on spectral moment conditions unfolds as follows. Let $h_t$ be the conditional moment condition such that

    $$h_t = e^{i\langle\omega,\, Y_{t+\tau} - Y_t\rangle} - X, \tag{27}$$

    with the notations developed earlier. $X$ is a non stochastic function of the process parameters. Let $g(Y_t)$ be the instruments. We need to identify $X$ such that the following relation holds:

    $$E[h_t\, g(Y_t)] = 0 \iff E\left[e^{i\langle\omega,\, Y_{t+\tau} - Y_t\rangle} g(Y_t)\right] - E[X g(Y_t)] = 0. \tag{28}$$

    As can be found in Chacko and Viceira (2003), the usual way to proceed consists in setting

    $$X = E\left[e^{i\langle\omega,\, Y_{t+\tau} - Y_t\rangle} \mid Y_t\right]. \tag{29}$$

    In the WASC case, $X$ is thus set to be equal to:

    $$X = e^{c(\tau)} E\left[e^{\langle A(\tau), \Sigma_t\rangle} \mid Y_t\right] = e^{c(\tau)}\,\Phi_{\Sigma_0}(t, -iA(\tau)), \tag{30}$$

    with $c(\tau)$ defined in equation (15), $A(\tau)$ defined in equation (13) and $\Phi_{\Sigma_t}(\cdot)$ defined in equation (17). Clearly, in such a case the relation in equation (28) has no reason to hold in general. It only holds if $(Y_t)_{t \in \mathbb{N}}$ and $(\Sigma_t)_{t \in \mathbb{N}}$ are independent processes or if $g(Y_t) = 1$. The autonomous behavior of the matrix volatility is not equivalent to having an independent process. Since in the WASC case this independence does not hold, we impose the instrument value to be 1, as in Chacko and Viceira (2003) (see page 272). A spectral GMM approach cannot be undertaken otherwise.¹ This setting stems from the fact that we integrate the volatility out when computing $X$. If we could use $\Sigma_t$ as an instrument, a more general form of instruments could readily be used.
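    With the instrument set to 1, the sample counterpart of the moment condition (27) is simply the empirical characteristic function of the log-return increments minus the model value $X$. The sketch below illustrates this under stated assumptions: it relies on NumPy, the function name `sample_moment` is hypothetical, and `model_cf` stands for any user-supplied function returning $X$ at a given $\omega$ (for the WASC this would be the right-hand side of equation (30), which is not implemented here).

```python
import numpy as np

def sample_moment(omega, log_prices, lag, model_cf):
    """Sample mean of h_t = exp(i <omega, Y_{t+tau} - Y_t>) - X with g(Y_t) = 1.

    omega      : array of shape (n,)
    log_prices : array of shape (T, n) holding Y_t row by row
    lag        : number of observations corresponding to the horizon tau
    model_cf   : callable omega -> X (hypothetical placeholder for equation (30))
    """
    omega = np.asarray(omega, dtype=float)
    log_prices = np.asarray(log_prices, dtype=float)
    increments = log_prices[lag:] - log_prices[:-lag]        # Y_{t+tau} - Y_t
    empirical_cf = np.mean(np.exp(1j * (increments @ omega)))
    return empirical_cf - model_cf(omega)
```

    Evaluating this function on a grid (or continuum) of $\omega$ values yields the vector of moment conditions that the estimators discussed below try to bring close to zero.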

    In order to increase the efficiency of our estimates, we use a continuum of moment conditions, as presented in Carrasco et al. (2007). Note that setting the instruments equal to 1 naturally prevents us from reaching the ML efficiency of the CGMM estimates of Carrasco et al. (2007). In any case, $Y_t$ conditional on its past is no longer a Markov process, since the covariance matrix is unobservable. ML efficiency cannot be reached for a non-Markov process: the special instruments chosen here do not necessarily jeopardize the estimation results.

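    Let now $\hat{h}_T(.)$ be the sample mean of the moment condition, that is, a function from $\mathbb{R}^{2n}$ to $\mathbb{C}$. In a framework with infinitely many conditions, Carrasco et al. (2007) showed that the estimator is obtained as:

    $$\hat{\theta} = \arg\min_{\theta} \| K^{-1/2} \hat{h}_T(\theta) \|, \qquad (31)$$

    where K is the covariance operator, that is, the counterpart of the covariance matrix in finite dimension in the standard GMM approach, and $\|.\|$ is the weighted norm

    $$\|f\|^2 = \int_{\mathbb{R}^n \times \mathbb{R}^n} f(\omega) \overline{f(\omega)} \pi(\omega) \, d\omega,$$

    where $\pi$ denotes any probability measure. As in Carrasco et al. (2007), we chose it to be the normal distribution. Carrasco et al. (2007) showed that the covariance operator K can be written as follows:

    $$K f(\omega_1) = \int k(\omega_1, \omega_2) f(\omega_2) \pi(\omega_2) \, d\omega_2,$$

    where the function k is the so-called kernel of the integral operator K and is defined by:

    $$k(\omega_1, \omega_2) = \sum_{j=-\infty}^{+\infty} E^{\theta_0}\left[ h_t(\omega_1; \theta_0) \overline{h_{t-j}(\omega_2; \theta_0)} \right].$$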

    Since our approach is nested within that of Carrasco et al. (2007), we closely follow their setting. Our approach can also be related to the methodology presented in Rockinger and Semenova (2005). In order to construct an estimator of the covariance operator, Carrasco et al. (2007) proposed a two-step procedure. The first step consists in finding:

    $$\hat{\theta}^1_T = \arg\min_{\theta} \| \hat{h}_T(\theta) \|. \qquad (32)$$

    ¹ We thank Marine Carrasco for pointing out this fact.


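    Since $h_t$ is not a martingale difference series, the second step consists in estimating the kernel k as follows:

    $$\hat{k}_T(s, r, v, w) = \frac{T}{T - q} \sum_{j=-T+1}^{T-1} \omega\left(\frac{j}{S_T}\right) \hat{\Gamma}_T(j), \qquad (33)$$

    with

    $$\hat{\Gamma}_T(j) = \begin{cases} \frac{1}{T} \sum_{t=j+1}^{T} h_t(s, r, \hat{\theta}^1) \overline{h_{t-j}(v, w, \hat{\theta}^1)}, & j \geq 0, \\ \frac{1}{T} \sum_{t=-j+1}^{T} h_{t+j}(s, r, \hat{\theta}^1) \overline{h_t(v, w, \hat{\theta}^1)}, & j < 0, \end{cases} \qquad (34)$$

    where $\omega(.)$ is any kernel satisfying some regularity conditions (see Carrasco et al. (2007), Appendix A.6) and $S_T$ is a bandwidth parameter.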
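    Equations (33) and (34) can be sketched for a single pair of frequencies as follows. The sketch assumes that the second factor in each autocovariance is complex-conjugated (the overline is not legible in the formula) and uses the Bartlett kernel mentioned later in the empirical section; the function names and the arguments `bandwidth` and `n_params` (standing for $S_T$ and q) are hypothetical.

```python
import numpy as np

def bartlett(x):
    """Bartlett kernel: 1 - |x| on [-1, 1], zero elsewhere."""
    return np.maximum(1.0 - np.abs(x), 0.0)

def gamma_hat(h_a, h_b, j):
    """Sample autocovariance at lag j between two complex moment series, as in (34)."""
    T = len(h_a)
    if j >= 0:
        # sum over t = j+1..T of h_t(a) * conj(h_{t-j}(b))
        return np.sum(h_a[j:] * np.conj(h_b[:T - j])) / T
    # sum over t = -j+1..T of h_{t+j}(a) * conj(h_t(b))
    return np.sum(h_a[:T + j] * np.conj(h_b[-j:])) / T

def kernel_hat(h_a, h_b, bandwidth, n_params):
    """Kernel-weighted long-run covariance between two moment series, as in (33)."""
    T = len(h_a)
    lags = np.arange(-T + 1, T)
    weights = bartlett(lags / bandwidth)
    total = sum(w * gamma_hat(h_a, h_b, j) for w, j in zip(weights, lags) if w > 0)
    return T / (T - n_params) * total

# Usage on two simulated complex series (illustrative data only).
rng = np.random.default_rng(1)
a = rng.normal(size=50) + 1j * rng.normal(size=50)
b = rng.normal(size=50) + 1j * rng.normal(size=50)
k_ab = kernel_hat(a, b, bandwidth=5.0, n_params=2)
```

    With a bandwidth of 1 only the lag-zero term survives, which gives a quick sanity check of the implementation.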

    Once the covariance operator is estimated, the minimization in equation (31) requires the computation of the inverse of K. Unfortunately, K typically has a countable infinity of eigenvalues decreasing to zero, so that its inverse is not bounded. We then need to regularize the inverse of K, which can be done by replacing K by a nearby operator that has a bounded inverse, thanks to the presence of a penalizing term. Carrasco et al. (2007) used the Tikhonov approximation of the generalized inverse of K. Let $\alpha$ be a strictly positive parameter; then $K^{-1}$ is replaced by:

    $$(K^{\alpha})^{-1} = (K^2 + \alpha I)^{-1} K. \qquad (35)$$

    As outlined in Carrasco et al. (2007), the choice of $\alpha$ is important but does not jeopardize the consistency of the estimates. Carrasco and Florens (2000) investigated an empirical method to select its value; the optimal value should represent a trade-off between the instability of the generalized inverse (for small values of $\alpha$) and the distance from the true inverse as $\alpha$ increases. Furthermore, we found it much more convenient to compute $(K^{\alpha})^{-1}$ using the Cholesky decomposition rather than the spectral decomposition: it is sufficient for the evaluation of equation (31) and avoids the numerically difficult problem of computing eigenvectors.
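    For a finite-dimensional discretization of K, the regularized inverse of equation (35) can be sketched with a Cholesky factorization. This is only an illustration under the assumption of a real symmetric positive semi-definite matrix; the function name `tikhonov_inverse` is hypothetical and the matrix below is a toy example.

```python
import numpy as np

def tikhonov_inverse(K, alpha):
    """Return (K^2 + alpha I)^{-1} K for a real symmetric positive semi-definite K."""
    K = np.asarray(K, dtype=float)
    A = K @ K + alpha * np.eye(K.shape[0])   # symmetric positive definite when alpha > 0
    L = np.linalg.cholesky(A)                # A = L L^T with L lower triangular
    y = np.linalg.solve(L, K)                # solves L y = K
    return np.linalg.solve(L.T, y)           # solves L^T x = y, so x = A^{-1} K

# Toy covariance matrix and the regularization level alpha = 0.02.
K = np.array([[2.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 2.0]])
K_alpha_inv = tikhonov_inverse(K, 0.02)
```

    In terms of eigenvalues, each $\lambda$ of K is mapped to $\lambda / (\lambda^2 + \alpha)$, which stays bounded as $\lambda$ goes to zero; the Cholesky route obtains the same matrix without computing eigenvectors.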

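    Under mild regularity conditions (conditions A.1 to A.5 in Carrasco et al., 2007), it can be proved that the optimal C-GMM estimator of $\theta$ is obtained by:

    $$\hat{\theta}_T = \arg\min_{\theta} \| (K_T^{\alpha_T})^{-1/2} \hat{h}_T(\theta) \| \qquad (36)$$

    and is asymptotically Normal with

    $$\sqrt{T}(\hat{\theta}_T - \theta_0) \xrightarrow{L} N\left(0, \left(\langle E^{\theta_0}(\nabla_\theta h), E^{\theta_0}(\nabla_\theta h) \rangle_K\right)^{-1}\right), \qquad (37)$$

    as T and $T^a \alpha_T^{5/4}$ go to infinity and $\alpha_T$ goes to zero ($\nabla_\theta h$ denotes the Jacobian matrix of h(.)).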

    Finally, it is important to mention that Carrasco et al. (2007) present a matrix-based version of their estimation method that may be more appealing than the one presented here for a WASC model based on more than two assets or for other models.


    5 Empirical Results

    We now review the empirical results obtained with the aforementioned estimation methodology. First, we provide insight into the model and the interpretation of its parameters. Then we review the results of a Monte Carlo experiment investigating the empirical behavior of the estimation methodology. Finally, we present the estimates obtained using equity indexes and discuss the results obtained.

    5.1 Preliminary considerations

    Unlike the Heston (1993) model, the Wishart Affine Stochastic Correlation model is a new model for which the interpretation of the parameters is not immediate. Such an interpretation is nevertheless essential to the understanding of the model and to its estimation. For the sake of simplicity, we focus on the case where n = 2, i.e. the case for which we observe two assets. $Y_t$ is the vector containing the log of the asset prices, and $\Sigma_t$ is its covariance matrix given by equation (6). $Y^1_t$ being the log return of the first asset, its volatility is given by $\sqrt{\Sigma^{11}_t}$. In the WASC framework, individual parameters can hardly be interpreted on their own: on the contrary, combinations of these parameters have standard financial interpretations, such as the mean-reverting parameter or the volatility of volatility. We now review the computation of these quantities.

    For the first asset, the quadratic variation of the volatility can be computed as follows:

    $$d\langle \Sigma^{11}, \Sigma^{11} \rangle_t = 4 \Sigma^{11}_t (Q_{11}^2 + Q_{21}^2) dt. \qquad (38)$$

    Therefore the first column of Q parametrizes the volatility of volatility of the first asset. Similar results can be obtained for the second asset.

    Then, as presented in Section 2,

    $$corr(dY^1, d\Sigma^{11}) = \frac{Q_{11}\rho_1 + Q_{21}\rho_2}{\sqrt{Q_{11}^2 + Q_{21}^2}}, \qquad (39)$$

    where corr(.) is the correlation coefficient. As already mentioned, the short-term behavior of the smile and the skewness effect heavily depend on the correlation structure given by the vector $\rho$. If Q and $\rho$ are such that this quantity is negative, then the volatility of $S^1$ will rise in response to negative shocks in the returns of this asset. We expect this correlation to be large and negative, in order to account for the large skewness found in financial datasets.
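    Equations (38) and (39) translate directly into two small functions. The sketch below reads the volatility of volatility in the Heston convention, where $d\langle V \rangle_t = \sigma^2 V_t dt$ gives $\sigma = 2\sqrt{Q_{11}^2 + Q_{21}^2}$; the function names are hypothetical, the Q matrix is the one used in the Monte Carlo section, and the negative entries of the correlation vector are an assumption.

```python
import numpy as np

def vol_of_vol(Q, asset):
    """Heston-style volatility of volatility implied by d<Sigma^ii> = 4 Sigma^ii ||Q[:, i]||^2 dt."""
    return 2.0 * np.linalg.norm(Q[:, asset])

def leverage_correlation(Q, rho, asset):
    """Correlation between the return of an asset and its own variance, as in equation (39)."""
    column = Q[:, asset]
    return float(column @ rho) / np.linalg.norm(column)

Q = np.array([[0.1133137, 0.03335871], [0.0, 0.07954368]])
rho = np.array([-0.3, -0.4])   # assumed signs for the correlation vector

sigma_1 = vol_of_vol(Q, 0)
leverage_1 = leverage_correlation(Q, rho, 0)
leverage_2 = leverage_correlation(Q, rho, 1)
```

    Because $Q_{21} = 0$ in this example, the leverage correlation of the first asset reduces to $\rho_1$.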

    The Gindikin coefficient $\beta$ ensures the positiveness of the Wishart process. What is more, an increase in $\beta$ shifts the distribution of the smallest eigenvalue toward positive values. Thus, this parameter can be interpreted as a global variance shift factor. From equations (3) and (4), if $\beta$ is multiplied by a given factor, the long-term covariance matrix is multiplied by the same factor. $\beta$ also impacts the mean reversion and the variance of the correlation process: the higher $\beta$, the lower the persistence and the variance of the correlation process. Thus, there is a trade-off in the WASC model between the volatility of the returns and the volatility of the correlation process.

    The M matrix can be compared to the mean-reverting parameter in the Cox-Ingersoll-Ross model. As for the parameters previously investigated, the elements of this matrix can hardly be interpreted directly. However, we can compute in closed form the drift part of the dynamics of $\Sigma^{ij}$. In the case of the first asset:

    $$d\Sigma^{11}_t = \ldots + \Sigma^{11}_t \left( 2M_{11} + 2M_{12} \sqrt{\frac{\Sigma^{22}_t}{\Sigma^{11}_t}} \rho^{12}_t \right) dt + \ldots, \qquad (40)$$

    where $\rho^{12}_t$ is the instantaneous correlation between the log-returns of the two assets. Thus, the mean-reverting parameter for $\Sigma^{11}$ is a combination of the elements of M. What is more, this drift term is made of two parts: a deterministic part ($2M_{11}$) and a stochastic correction ($2M_{12}\sqrt{\Sigma^{22}_t / \Sigma^{11}_t}\,\rho^{12}_t$), linked to the joint dynamics of both assets. Thus, the drift term of $\Sigma^{11}_t$ is influenced by one of the off-diagonal elements of M. This feature cannot be replicated by most multivariate GARCH-like models. We can perform similar calculations for $\Sigma^{12}_t$ and $\Sigma^{22}_t$. These quantities can then be used to compare the half-lives of the variance and covariance processes and thus evaluate their relative persistence in financial markets.
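    The two parts of the drift coefficient in equation (40) can be evaluated numerically. The sketch below uses a hypothetical function name and illustrative matrices whose signs are assumptions (negative entries for M and a negative covariance); it only shows how the deterministic part and the stochastic correction combine.

```python
import numpy as np

def variance_drift_terms(M, Sigma):
    """Split the coefficient multiplying Sigma^11 in the drift of equation (40)."""
    rho12 = Sigma[0, 1] / np.sqrt(Sigma[0, 0] * Sigma[1, 1])
    deterministic = 2.0 * M[0, 0]
    correction = 2.0 * M[0, 1] * np.sqrt(Sigma[1, 1] / Sigma[0, 0]) * rho12
    return deterministic, correction

# Assumed illustrative values (signs are an assumption of this sketch).
M = np.array([[-5.0, -3.0], [-3.0, -5.0]])
Sigma = np.array([[0.0225, -0.0054], [-0.0054, 0.0144]])

deterministic, correction = variance_drift_terms(M, Sigma)
total = deterministic + correction
```

    The sum of the two parts equals $2(M\Sigma)_{11} / \Sigma^{11}$, the linear drift of a Wishart process restricted to its first diagonal element, which is a convenient consistency check.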

    The instantaneous correlation between assets also has a closed-form expression:

    $$d\rho^{12}_t = \left( A_t (\rho^{12}_t)^2 + B_t \rho^{12}_t + C_t \right) dt + \sqrt{1 - (\rho^{12}_t)^2} \, (\ldots) \, d(\text{Noise}_t), \qquad (41)$$

    with $A_t$, $B_t$, $C_t$ recursive functions of $\Sigma^{11}_t$, $\Sigma^{22}_t$ and the model parameters. We present the drift coefficients and the diffusion term in the Appendix. The drift associated with the correlation is quadratic, and the linear term has a negative coefficient $B_t < 0$, thus giving $\rho^{12}_t$ its typical mean-reverting behavior (at least around zero, where the quadratic part is negligible). The linear part can thus be used to analyze the persistence of the correlation and its mean-reverting characteristics during low-correlation periods. When the absolute value of the correlation is higher, the quadratic part of the drift gets the upper hand and the correlation process loses most of its persistence. By comparing the values of $B_t$ and $A_t$, we can thus compare the correlation behavior during low- and high-correlation cycles. This information has not been documented until now, whereas it is important for understanding the joint behavior of financial assets.

    The WASC model can also be used to investigate potential contagion effects in financial markets. By computing the correlation between the correlation process and the returns, we can discuss under which condition the model is able to display an asymmetric correlation effect². An asymmetric correlation effect leads correlation to go up while returns are going down. As already noticed in Da Fonseca et al. (2006), we have:

    $$d\langle Y^1, \rho^{12} \rangle_t = \sqrt{\frac{\Sigma^{11}_t}{\Sigma^{22}_t}} \left(1 - (\rho^{12}_t)^2\right) \underbrace{(Q_{12}\rho_1 + Q_{22}\rho_2)}_{\text{Sign of asset 2 skew}} dt. \qquad (42)$$

    Thus, the sign of the skews determines that of the covariance between correlation and returns. Were the skew negative, the model would also display increases in the correlation following negative returns. Thus, the WASC model is also able to display an asymmetric correlation effect, whose sign is driven by the skewness associated with the returns. Since the asset returns are negatively correlated with their own volatility (leverage effect), we thus expect volatilities to be positively correlated with correlation: periods of negative returns correspond to periods of both higher correlation and higher volatility. In fact, simple computations given in the Appendix lead to

    $$d\langle \rho^{12}, \Sigma^{11} \rangle_t = \sqrt{\frac{\Sigma^{11}_t}{\Sigma^{22}_t}} \left(1 - (\rho^{12}_t)^2\right) \tilde{Q}_{12} (\tilde{Q}_{11} + \tilde{Q}_{22}) dt,$$

    where $\tilde{Q}$ is the symmetric positive definite matrix associated with the polar decomposition of Q³. A positive value for $\tilde{Q}_{12}$ would mean that the WASC model is able to accommodate stylized effects of the type mentioned earlier. Due to the increase in the drift term of the correlation dynamics, situations of this kind are not expected to last long.
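    The symmetric positive definite factor of the polar decomposition can be obtained as the principal square root of $Q^\top Q$. The following sketch, with the hypothetical function name `polar_decomposition`, applies it to the Q matrix used in the Monte Carlo section and reads off the sign of the off-diagonal element.

```python
import numpy as np

def polar_decomposition(Q):
    """Return (R, S) with Q = R @ S, R orthogonal and S symmetric positive definite."""
    w, V = np.linalg.eigh(Q.T @ Q)     # Q^T Q is symmetric positive definite for invertible Q
    S = (V * np.sqrt(w)) @ V.T         # principal square root of Q^T Q
    R = Q @ np.linalg.inv(S)           # orthogonal factor
    return R, S

Q = np.array([[0.1133137, 0.03335871], [0.0, 0.07954368]])
R, Q_tilde = polar_decomposition(Q)
contagion_sign = np.sign(Q_tilde[0, 1])
```

    For a 2x2 matrix the off-diagonal element of the square root has the same sign as $(Q^\top Q)_{12}$, so the check can also be done without computing the decomposition.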

    We now turn our attention toward a series of Monte Carlo experiments, so as to investigate the empirical performance of the chosen estimation strategy.

    5.2 Monte Carlo study

    Following Carrasco et al. (2007), we present the results of a Monte Carlo study of the CGMM estimation methodology applied to the WASC. We first present the technical details of the simulation and then we review the results obtained.

    ² On asymmetric correlation effects, see Roll (1988) and Ang and Chen (2002).
    ³ Any invertible matrix Q can be uniquely written as the product of a rotation matrix and a symmetric positive definite matrix $\tilde{Q}$, see the Appendix.

    For ease of presentation, we restrict ourselves to the two-asset case. The parameters used in the simulation are the following:

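    $$\Sigma_0 = \begin{pmatrix} 0.0225 & -0.0054 \\ -0.0054 & 0.0144 \end{pmatrix} \qquad (43)$$

    $$M = \begin{pmatrix} -5 & -3 \\ -3 & -5 \end{pmatrix} \qquad (44)$$

    $$\rho = \begin{pmatrix} -0.3 & -0.4 \end{pmatrix}^\top \qquad (45)$$

    $$Q = \begin{pmatrix} 0.1133137 & 0.03335871 \\ 0.0000000 & 0.07954368 \end{pmatrix} \qquad (46)$$

    $$\beta = 15. \qquad (47)$$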

    The Q matrix is obtained by inverting the relation that links Q to M, $\Sigma_0$ and $\beta$:

    $$Q^\top Q = -\frac{1}{\beta} \left( M \Sigma_0 + \Sigma_0 M^\top \right). \qquad (48)$$

    This ensures the stationarity of the correlation process. When Q is selected arbitrarily, and given the mean-reverting property of $\Sigma_t$, the first part of the simulated sample is tainted by the convergence of the process toward its long-term average. Situations of this kind should be discarded. Figure 2 presents a simulated path for both volatilities and the correlation, using the previous parameter values. The figure displays mean-reverting dynamics for each of these moments.
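    Relation (48) can be inverted numerically with a Cholesky factorization. The sketch below assumes that M has entries -5 and -3, that the off-diagonal element of the initial covariance matrix is -0.0054, and that Q is taken upper triangular; under these assumptions the computation reproduces the Q matrix of equation (46). The function name is hypothetical.

```python
import numpy as np

def q_from_stationarity(M, Sigma_inf, beta):
    """Upper-triangular Q such that Q^T Q = -(M Sigma + Sigma M^T) / beta."""
    target = -(M @ Sigma_inf + Sigma_inf @ M.T) / beta
    lower = np.linalg.cholesky(target)     # target = lower @ lower.T
    return lower.T                         # Q = lower.T gives Q.T @ Q = target

# Assumed signs for the simulation parameters (minus signs are an assumption).
Sigma0 = np.array([[0.0225, -0.0054], [-0.0054, 0.0144]])
M = np.array([[-5.0, -3.0], [-3.0, -5.0]])
beta = 15.0

Q = q_from_stationarity(M, Sigma0, beta)
```

    Starting the simulation at the matrix that solves this relation avoids the initial transient described above.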

    Figure 3 shows the characteristic function used in the spectral GMM method of this paper, as presented in equation (29). The grid used for the numerical integration of the objective function ranges on the real line from -300 to 300. We used Gaussian kernels with an appropriate variance parameter to retain as much information as possible. The integral is computed numerically using the trapezoidal rule, which seemed to perform well on the simulated dataset. The objective function is minimized using a simulated annealing method, as described in Belisle (1992).
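    The numerical integration step can be sketched in one dimension as follows. The grid bounds match the range quoted above, while the number of grid points, the scale of the Gaussian weight and the function name are assumptions made for the illustration.

```python
import numpy as np

def weighted_sq_norm(h_values, grid, scale):
    """Approximate the integral of |h(w)|^2 pi(w) dw, with pi the N(0, scale^2) density."""
    density = np.exp(-0.5 * (grid / scale) ** 2) / (scale * np.sqrt(2.0 * np.pi))
    integrand = np.abs(h_values) ** 2 * density
    # Composite trapezoidal rule written out explicitly.
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(grid)))

grid = np.linspace(-300.0, 300.0, 2401)             # assumed grid density
unit_norm = weighted_sq_norm(np.ones_like(grid), grid, scale=50.0)
```

    With a function of modulus one the weighted norm should be close to 1, since the Gaussian weight integrates to one over the grid; this gives a simple check that the grid is wide and fine enough for the chosen scale.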

    We present the results of different Monte Carlo experiments. Each of them is based on 1000 iterations, but they differ in the length of the simulated sample and in the sampling frequency: daily, weekly and monthly. For each sampling frequency, we used two different samples, one containing 500 observations and the other 1500 observations. Table 7.3 presents the Mean Bias and the Root Mean Square Error (RMSE) obtained. We did not report the median bias insofar as it was close to the mean bias, thus indicating that the estimators have a symmetric empirical distribution.
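    The two summary statistics reported for each experiment can be computed as in the following sketch. The function name and the toy estimates are hypothetical; in the actual study each row would hold the parameter estimates of one simulated sample.

```python
import numpy as np

def mean_bias_and_rmse(estimates, true_value):
    """Mean bias and root mean square error across Monte Carlo replications."""
    errors = np.asarray(estimates, dtype=float) - true_value
    bias = errors.mean(axis=0)
    rmse = np.sqrt((errors ** 2).mean(axis=0))
    return bias, rmse

# Toy replications of a single parameter whose true value is 15.
bias, rmse = mean_bias_and_rmse([14.0, 16.0, 15.0, 17.0], 15.0)
```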

    The results can be analyzed as follows. The Monte Carlo results obtained for Q, $\beta$ and $\rho$ show that an increase in the sample depth globally results in a reduction of the variance of the estimates. The biases obtained are small and not significant. For the weekly frequency, $\rho_1$ displays a noticeable difference, as the variance of the estimate grows with the sample size. This feature will have to be considered when analyzing the estimation results based on the real dataset. The M parameter also presents this variance increase feature. However, this behavior is not very surprising: a large number of articles emphasize the difficulties involved in the estimation of the mean-reverting parameter in continuous time diffusions (see e.g. Gourieroux and Monfort, 2007). The Monte Carlo results indicate that this mean-reverting parameter is estimated with less volatility with daily series. What is more, diagonal elements (resp. off-diagonal elements) of M are estimated with a small positive (resp. negative) bias and thus may be underestimated (resp. overestimated) when working with a real-time dataset. Finally, the correlation vector displays a remarkably small bias and a small RMSE for the daily datasets, even in the small sample version. This fact is somewhat constant for each sampling frequency. This point is important for the WASC model, given that we are interested in the analysis of the fine correlation structure implicit in asset dynamics.

    We now detail the empirical results obtained with stock indexes.

    5.3 Estimation on stock indexes

    In this last subsection, we present the empirical results obtained when estimating the WASC using the C-GMM method on a real dataset. We used the following stock indexes: SP500, FTSE, DAX and CAC 40. For each index, the time series starts on January 2nd 1990 and ends on June 30th 2007. This period excludes the 1987 crash and the subprime crisis. It nonetheless includes many episodes of financial turmoil, as pointed out in Rockinger and Semenova (2005). Table 2 presents the descriptive statistics for the sample used in the estimation, at a weekly sampling frequency. We used daily and weekly time series. We discarded monthly series since the sample would be far too small. In many articles devoted to the estimation of continuous time models, a change in the sampling frequency usually leads to an interesting analysis of the subtle dynamics of financial markets (see e.g. Chacko and Viceira, 2003). Since the characteristic-function-based estimators do not suffer from discretization errors, we can actually use any sampling frequency. As in Carrasco et al. (2007), $\alpha = 0.02$ was found to perform well. The integration grid is the same as the one used for the previous simulation exercises. We chose to use the Bartlett kernel for the GMM covariance matrix estimation, following the procedure presented in Newey and West (1994).

    For numerical convenience, we focus again on the two-asset case (n = 2). We estimated the parameters driving the WASC process for the following pairs of indexes: (SP500, FTSE), (SP500, DAX), (SP500, CAC), (DAX, CAC), (DAX, FTSE) and (FTSE, DAX). This way, we are able to compare the characteristics specific to an individual index when it is estimated with each of the others. For example, we are able to compare the volatility of volatility of the SP500 when it is estimated with the DAX, CAC and FTSE as a second asset. This highlights the impact of joint dynamics on idiosyncratic behaviors, which has hardly been documented until now.

    The estimation results are presented in table 3 for the daily results and in table 4 for the weekly results. Most of the estimates are significant at the 5% or 10% level. What is more, in the weekly observation case, the size of the sample is strongly reduced and so is the efficiency of the estimation method. Nevertheless, the estimation results yield interesting information both about the WASC process and the dynamics of the stock indexes.

    As presented in the previous subsection, it is difficult to compare the individual parameters, and we will thus focus on combinations of these parameters, most of which are comparable with those of the Heston (1993) model.

    For the estimated parameters presented in Table 3, the associated volatilities of volatility are presented in Table 5. The estimation of this quantity is essential to test the ability of the model to capture financial market features: as pointed out in Chacko and Viceira (2003), this parameter controls the kurtosis of the underlying process. Several remarks can be made. First, the global results match what is generally expected from stock indexes. Such markets are known to lead to a volatility of volatility ranging from 5% to 25%. Second, the results obtained for the SP500 are remarkably stable when the second asset changes, at least for the daily sample: it ranges from 14.6% to 24.4%, thus matching the results obtained in Eraker et al. (2003). Still, it is below the estimates obtained in Chacko and Viceira (2003) and Rockinger and Semenova (2005): this may be explained by the fact that the model estimated here is multivariate, whereas existing attempts to capture stochastic volatility have been made in a univariate framework. Da Fonseca et al. (2006) showed that the correlation between the volatilities of the two stocks is nonzero insofar as

    $$d\langle \Sigma^{11}, \Sigma^{22} \rangle_t \propto \Sigma^{12}_t dt. \qquad (49)$$

    Hence, whenever $\Sigma^{12}_t$ is positive, the WASC model is able to model volatility transmission phenomena among assets. It is worth noting that these results are globally stable across the datasets and close to the existing results. Third, when the sampling frequency decreases, the volatility of volatility parameter globally increases. The few exceptions to this pattern may be due to the fact that the number of observations in the weekly dataset is far below that of the daily dataset. These results are different from those obtained in Chacko and Viceira (2003). However, this is in line with what is observed for the volatility of the log-returns when reducing the sampling frequency. This divergence may also be explained by the effect of the correlation between variances, which cannot be mimicked in a Heston-like framework.

    Now, let us discuss an important parameter for the specification of the WASC model, that is, the correlation between the returns and the volatility. We already mentioned that this parameter is essential to have a model that is consistent with many stylized facts, such as negative skewness and thus skewed implied volatility surfaces. It can be computed using its expression given in equation (39). Table 6 displays the results obtained.


    This time, we have results that are comparable to those obtained in the existing literature, especially for the SP500. The correlation for this index is reported in the first line of the previous table. In the literature, it ranges from -0.27 (Rockinger and Semenova, 2005) to -0.62 (Chacko and Viceira, 2003), which is close to what is obtained here. The parameter values obtained for the CAC, DAX and FTSE indexes are not surprising either, since their sign is negative. The main problem here lies in the instability of the parameter across the different estimations, when comparing both the sampling frequencies and the pairs of indexes that are estimated. The change in sampling frequency does not lead to a similar behavior across the datasets: depending upon the pairs and the sampling frequency, the skewness in the dataset can differ considerably, underlining the fact that the correlation processes implicit in financial markets are complex. Beyond the remarks made in the previous paragraph on the importance of the dataset depth, we also emphasize that the computation of this parameter relies on the inverse of the square root of a quantity that is small. In this situation, the inverse of something small can be very variable: any error in the estimation of $Q_{11}$ and $Q_{21}$ will have a strong impact on $1/\sqrt{Q_{11}^2 + Q_{21}^2}$. Thus, this skewness quantity must be interpreted cautiously. Last but not least, since the skews are negative, the fitted WASC models also display asymmetric correlation effects: negative returns are likely to be followed by a higher correlation between the two assets.

    We now turn our attention toward the mean-reverting matrix M. For SP500, CAC, DAX and FTSE, we find the same structure for the matrix M. The estimated matrices are negative definite, thus ensuring a mean-reverting behavior for the Wishart process, and have positive off-diagonal terms. As presented in the previous subsection, the drift can be decomposed into two different parts: an idiosyncratic part (denoted $\kappa_1$ in the tables) and a joint part (denoted $\kappa_2$ in the tables). For univariate stochastic volatility models, the estimation results usually lead to an estimate of $\kappa = \kappa_1 + \kappa_2$, that is, the sum of the two preceding elements. Thanks to the richness of the WASC process, we are now able to disentangle and analyse these two different elements. The estimation results are reported in table 8. The $\kappa$ values should be compared with the mean-reverting parameter of the Heston model. We are close to the results of Rockinger and Semenova (2005), who found 6.3352 (see their Table 1) for the S&P500, even though their results are obtained on a different sample and using daily data. However, when analysing $\kappa_1$ and $\kappa_2$, we find that the idiosyncratic mean-reverting component is always higher than the usual Heston parameter. This idiosyncratic element is dampened by the negative joint mean-reverting component: its negativity is to be related to the negative non-diagonal elements of the estimated M matrices. Again, when the sampling frequency changes, each of these values varies, suggesting that the mean-reverting parameter associated with the volatility strongly depends on the sampling frequency, as pointed out in Chacko and Viceira (2003). Globally, the associated half-lives are around one month, which is a realistic value.
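    The link between a mean-reverting parameter and a half-life can be made explicit with a short computation. The sketch assumes an annualized parameter and linear mean reversion, for which the half-life is $\ln 2$ divided by the mean-reversion speed; the function name is hypothetical and the value 6.3352 is the one quoted above from Rockinger and Semenova (2005).

```python
import math

def half_life_in_months(kappa):
    """Half-life, in months, of a linear mean-reverting drift with annualized speed kappa."""
    return 12.0 * math.log(2.0) / kappa

reference_half_life = half_life_in_months(6.3352)
```

    The reference value gives a half-life of about 1.3 months, the same order of magnitude as the one-month figure mentioned for the estimated models.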


    As presented in the previous section, it is possible to perform similar computations for the drift term of the stochastic correlation. This drift is a nonlinear function of $\rho^{12}_t$ and the usual comments have to be adapted. We present in figures 4 and 5 the instantaneous variation of $\rho^{12}_t$ as a function of $\rho^{12}_t$, highlighting the contribution of the quadratic term when the correlation gets very high, that is, during crisis periods. Our results suggest that the correlation process is much more persistent than volatility when the correlation is below its long-term level, since in such a case its mean-reverting parameter can be reduced to $(-B_t)$: in this situation, the half-life is around 2 months, which is again realistic. When correlation is high, the quadratic term gets larger and the persistence goes down, since $A_t$ is added to $B_t$ and both these elements are negative. This is consistent with what is empirically observed during financial market crises: the correlation gets very high over a very short period, and then rapidly goes back to its long-term behavior. The aforementioned figures display reaction functions of this kind, underlining the ability of the WASC model to encompass this standard feature of financial markets.

    Another quantity of importance is $\|\rho\|$, since the WASC can be seen as a generalization of the processes proposed in Gourieroux and Sufana (2004) and Buraschi et al. (2006), with a more complex correlation structure. Since the WASC model is only defined up to a rotation matrix, the model presented in Buraschi et al. (2006) encompasses any correlation structure that satisfies $\|\rho\| = 1$. Testing such an assumption is thus of great importance to judge whether the complexity of the WASC is empirically justified. Table 10 presents the norm of this vector parameter. Each of the estimated values differs strongly from 1, suggesting that the general correlation structure imposed in Da Fonseca et al. (2006) is empirically grounded.

    As presented earlier, a contagion effect can be handled by the WASC through a positive value for $\tilde{Q}_{12}$. In table 11 we report the estimated values for this parameter. We found positive values for all pairs of indexes, as expected. Therefore, the estimated WASC model is able to detect the existence of potential contagion effects in the dataset. As mentioned earlier, these findings may be due to the fact that the dataset includes several financial crises, periods during which dramatic contagion effects are expected.

    6 Conclusion

    In this paper we investigated the estimation of a new continuous time model: the Wishart Affine Stochastic Correlation model, presented in Da Fonseca et al. (2006). After having presented the problems that arise when trying to estimate a discrete version of this model, this paper proposes to estimate the process using its exponential affine characteristic function. The estimation method uses a continuum of spectral moment conditions in a GMM framework. After a preliminary Monte Carlo investigation of the estimation methodology, we show that real-dataset estimation results bring support to the WASC process. First, the empirical results are comparable to those obtained in the literature (when comparable). Second, the general correlation structure of the WASC casts light on not-so-well-documented features of international equities, allowing us to discuss, for example, the persistence of the correlation process, contagion effects or asymmetric correlation effects. Third, the generality of the correlation structure is not rejected by the dataset, bringing empirical support to the model presented in Da Fonseca et al. (2006).

    References

    Ang, A. and Chen, J. (2002). Asymmetric Correlations of Equity Portfolios. Journal of Financial Economics, 63:443-494.

    Asai, M., McAleer, M., and Yu, J. (2006). Multivariate Stochastic Volatility: a Review. Econometric Reviews, 25(2-3):145-175.

    Barndorff-Nielsen, O. and Shephard, N. (2002). Econometric Analysis of Realised Volatility and its Use in Estimating Stochastic Volatility Models. Journal of the Royal Statistical Society, 63:253-280.

    Barndorff-Nielsen, O. E. and Shephard, N. (2004). Econometric analysis of realized covariation: High frequency based covariance, regression, and correlation in financial economics. Econometrica, 72(3):885-925.

    Bathia, R. (2005). Matrix Analysis. Graduate Texts in Mathematics, Springer-Verlag.

    Belisle, C. J. P. (1992). Convergence Theorems for a Class of Simulated Annealing Algorithms on Rd. Journal of Applied Probability, 29:885-895.

    Bru, M. F. (1991). Wishart Processes. Journal of Theoretical Probability, 4:725-743.

    Buraschi, A., Porchia, P., and Trojani, F. (2006). Correlation risk and optimal portfolio choice. Working paper, SSRN-908664.

    Cappiello, L., Engle, R. F., and Sheppard, K. (2006). Asymmetric Dynamics in the Correlations of Global Equity and Bond Returns. Journal of Financial Econometrics, 4(4):537-572.

    Carrasco, M., Chernov, M., Florens, J.-P., and Ghysels, E. (2007). Efficient Estimation of Jump Diffusions and General Dynamic Models with a Continuum of Moment Conditions. Journal of Econometrics, 140:529-573.

    Carrasco, M. and Florens, J. (2000). Generalization of GMM to a Continuum of Moment Conditions. Econometric Theory, 16:797-834.


    Chacko, G. and Viceira, L. M. (2003). Spectral GMM Estimation of Continuous-Time Processes. Journal of Econometrics, 116(1-2):259-292.

    Da Fonseca, J., Grasselli, M., and Tebaldi, C. (2005). Wishart Multi-Dimensional Stochastic Volatility, to appear as: A Multifactor Volatility Heston Model in Quantitative Finance. Mimeo Ecole Superieure d'Ingenieurs Leonard de Vinci, RR-31.

    Da Fonseca, J., Grasselli, M., and Tebaldi, C. (2006). Option Pricing when Correlations are Stochastic: an Analytical Framework. To appear in Review of Derivatives Research.

    Daleckii, J. (1974). Differentiation of Non-Hermitian Matrix Functions Depending on a Parameter. AMS Translations, 47(2):73-87.

    Daleckii, J. and Krein, S. (1974). Integration and Differentiation of Functions of Hermitian Operators and Applications to the Theory of Perturbations. AMS Translations, 47(2):1-30.

    Donoghue, W. J. (1974). Monotone matrix functions and analytic continuation. Springer-Verlag.

    Duffie, D. and Singleton, K. (1993). Simulated Moments Estimation of Markov Models of Asset Prices. Econometrica, 61:929-952.

    Engle, R. (2002). Dynamic Conditional Correlation: A Simple Class of Multivariate Generalized Autoregressive Conditional Heteroskedasticity Models. Journal of Business & Economic Statistics, 20(3):339-50.

    Eraker, B., Johannes, M., and Polson, N. (2003). The Impact of Jumps in Volatility and Returns. The Journal of Finance, 58(3):1269-1300.

    Faraut, J. (2006). Analyse sur les groupes de Lie. Calvage & Mounet.

    Gourieroux, C. (2006). Continuous Time Wishart Process for Stochastic Risk. Econometric Reviews, 25(2-3):177-217.

    Gourieroux, C. and Jasiak, J. (2001). Financial Econometrics. Princeton University Press.

    Gourieroux, C., Jasiak, J., and Sufana, R. (2004). The Wishart Autoregressive of Multivariate Stochastic Volatility. Working paper, University of Toronto, (2004-32).

    Gourieroux, C. and Sufana, R. (2004). Derivative Pricing with Multivariate Stochastic Volatility: Application to Credit Risk. Les Cahiers du CREF 04-09.

    Hall, B. C. (2003). Lie Groups, Lie Algebras, and Representations: An Elementary Introduction. Graduate Texts in Mathematics 222, Springer-Verlag.


    Harvey, A., Ruiz, E., and Shephard, N. (1994). Multivariate stochastic volatility models. Review of Economic Studies, 61:247-264.

    Heston, S. (1993). A Closed-Form Solution for Options with Stochastic Volatility with Applications to Bond and Currency Options. The Review of Financial Studies, 6(2).

    Jiang, G. J. and Knight, J. L. (2002). Estimation of Continuous-Time Processes via the Empirical Characteristic Function. Journal of Business & Economic Statistics, 20(2):198-212.

    Lo, A. W. (1988). Maximum Likelihood Estimation of Generalized Ito Processes with Discretely Sampled Data. Econometric Theory, 4:231-247.

    Nelson, D. B. (1990). ARCH models as diffusion approximations. Journal of Econometrics, 45:7-38.

    Nelson, D. B. and Foster, D. P. (1994). Asymptotic Filtering Theory for Univariate Arch Models. Econometrica, (1):1-41.

    Newey, W. K. and West, K. D. (1994). Automatic lag selection in covariance matrix estimation. Review of Economic Studies, 61(4):631-53.

    Rockinger, M. and Semenova, M. (2005). Estimation of Jump-Diffusion Process via Empirical Characteristic Function. FAME Research Paper Series rp150, International Center for Financial Asset Management and Engineering.

    Roll, R. (1988). The International Crash of October, 1987. Financial Analysts Journal, (September-October):19-35.

    Singleton, K. (2001). Estimation of Affine Pricing Models Using the Empirical Characteristic Function. Journal of Econometrics, 102:111-141.

    Singleton, K. J. (2006). Empirical Dynamic Asset Pricing: Model Specification and Econometric Assessment. Princeton University Press.


    7 Appendix

    7.1 Computing the gradient

    The gradient of the characteristic function is needed to study the asymptotic distribution of the estimates and also in the optimization process underlying the estimation procedure. We therefore turn our attention to the differentiation of a matrix function depending on a parameter. We illustrate the theoretical framework with the characteristic function of the assets' log returns and give, without technical details, the results for the forward characteristic function needed in our empirical study. We mainly rely on the work of Daleckii (1974) for the general case (i.e. the non-Hermitian matrix case) and on Daleckii and Krein (1974), Donoghue (1974) and Bathia (2005) for the Hermitian matrix case.

    Let us first state some basic results on linear algebra. Denote by $\{\lambda_i;\ i = 1..n\}$ the set of eigenvalues of a matrix $X \in M_n$ and by $m_i$ the multiplicity of $\lambda_i$ as a root of the characteristic polynomial of $X$. Define $L_i = \mathrm{Ker}(X - \lambda_i I)$ and $P_i$ the projection operator from $\mathbb{C}^n$ onto $L_i$; then we have $\sum_{i=1}^{n} P_i = I$. Define also $J_i$ such that $(X - \lambda_i I)P_i = J_i$. The Jordan normal form of $X$ is given by the well-known decomposition

    $$X = \sum_{i=1}^{n} (\lambda_i P_i + J_i).$$

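    Let $f$ be a function from $M_n$ into $M_n$: the derivative of $f$ at $X$ in direction $H$, denoted $D_{f,X}(H)$, is by definition such that $f(X + tH) - f(X) - t\,D_{f,X}(H) = o(t)$, and can be computed using the following formula (Daleckii, 1974):

    $$D_{f,X}(H) = \sum_{j_1,j_2}^{n}\ \sum_{r_1=0}^{m_{j_1}-1}\ \sum_{r_2=0}^{m_{j_2}-1} \frac{1}{r_1!\,r_2!}\, \frac{\partial^{\,r_1+r_2}}{\partial\lambda_{j_1}^{r_1}\,\partial\lambda_{j_2}^{r_2}} \left[\frac{f(\lambda_{j_1}) - f(\lambda_{j_2})}{\lambda_{j_1} - \lambda_{j_2}}\right] P_{j_1} J_{j_1}^{r_1}\, H\, P_{j_2} J_{j_2}^{r_2}. \qquad (50)$$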

    Remark 7.1. Whenever $j_1 = j_2$ the term within the bracket should be replaced by $f'(\lambda_{j_1})$.

    Remark 7.2. When $X$ can be diagonalized then $m_j = 1$ for each $j$ and we are led to the very simple form

    $$D_{f,X}(H) = \sum_{j_1,j_2}^{n} \frac{f(\lambda_{j_1}) - f(\lambda_{j_2})}{\lambda_{j_1} - \lambda_{j_2}}\, P_{j_1} H P_{j_2}. \qquad (51)$$

    If $X$ is Hermitian then $P^{-1} = P^{*}$ and we recover the result presented for example in Daleckii and Krein (1974).

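    Simple algebra leads to $D_{f,X}(H) = P\,\big(M_f \circ (P^{-1}HP)\big)\,P^{-1}$, where $P$ is the matrix of the eigenvectors of $X$ [4], $\circ$ is the Schur product [5] and $M_f = (M_f(k,l))_{k=1\ldots n,\ l=1\ldots n}$ is the Pick matrix associated to the function $f$, which is defined by

    $$M_f(\lambda, \mu) = \begin{cases} \dfrac{f(\lambda) - f(\mu)}{\lambda - \mu} & \text{if } \lambda \neq \mu, \\[2mm] f'(\lambda) & \text{if } \lambda = \mu. \end{cases} \qquad (52)$$

    [4] If $p_i$ is the $i$-th eigenvector of $X$ and $q_i$ is the $i$-th row of $P^{-1}$ then the projection operator on $L_i$ is given by $P_i = p_i q_i$.

    [5] Given two matrices of the same size $X_{kl}$ and $Y_{kl}$, $(X \circ Y)_{kl} = X_{kl} Y_{kl}$.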
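The Schur-product form of the derivative can be checked numerically. The following is a minimal sketch, assuming a symmetric 2x2 matrix (so that $P^{-1} = P^\top$ and the eigen-decomposition is available in closed form) and $f = \exp$; the matrices `X` and `H` are illustrative values, not taken from the paper. It compares the Pick-matrix formula with a central finite difference.

```python
import math

def matmul(a, b):
    """Product of two small dense matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def eig_sym2(x):
    """Eigenvalues and orthonormal eigenvectors (columns of p) of a symmetric 2x2 matrix."""
    a, b, d = x[0][0], x[0][1], x[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)  # Jacobi rotation angle
    c, s = math.cos(theta), math.sin(theta)
    lam = [a * c * c + 2.0 * b * c * s + d * s * s,
           a * s * s - 2.0 * b * c * s + d * c * c]
    return lam, [[c, -s], [s, c]]

def matrix_function(x, f):
    """f(X) = P diag(f(lambda)) P^T for a symmetric X."""
    lam, p = eig_sym2(x)
    diag = [[f(lam[0]), 0.0], [0.0, f(lam[1])]]
    return matmul(matmul(p, diag), transpose(p))

def pick_matrix(lam, f, fprime):
    """Divided differences of f, with f' used when two eigenvalues coincide."""
    return [[fprime(lk) if abs(lk - ll) < 1e-12 else (f(lk) - f(ll)) / (lk - ll)
             for ll in lam] for lk in lam]

def directional_derivative(x, h, f, fprime):
    """D_{f,X}(H) = P (M_f o (P^{-1} H P)) P^{-1}, with P^{-1} = P^T here."""
    lam, p = eig_sym2(x)
    pt = transpose(p)
    h_rot = matmul(matmul(pt, h), p)
    m = pick_matrix(lam, f, fprime)
    schur = [[m[k][l] * h_rot[k][l] for l in range(2)] for k in range(2)]
    return matmul(matmul(p, schur), pt)

def finite_difference(x, h, f, t=1e-5):
    """Central difference (f(X + tH) - f(X - tH)) / (2t); H must be symmetric."""
    plus = matrix_function([[x[i][j] + t * h[i][j] for j in range(2)] for i in range(2)], f)
    minus = matrix_function([[x[i][j] - t * h[i][j] for j in range(2)] for i in range(2)], f)
    return [[(plus[i][j] - minus[i][j]) / (2.0 * t) for j in range(2)] for i in range(2)]

X = [[-1.0, 0.3], [0.3, -2.0]]   # illustrative symmetric matrix
H = [[0.5, -0.2], [-0.2, 1.0]]   # illustrative symmetric direction
analytic = directional_derivative(X, H, math.exp, math.exp)
numeric = finite_difference(X, H, math.exp)
max_gap = max(abs(analytic[i][j] - numeric[i][j]) for i in range(2) for j in range(2))
```

The non-symmetric matrix $G$ of the WASC would need a general eigen-solver and complex arithmetic; the symmetric case is used here only because it keeps the check self-contained.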


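    This formulation is well known and can be found for example in Donoghue (1974) p. 79 and Bathia (2005) p. 123-124.

    The gradient of the characteristic function involves $\partial_\theta A(\tau)$ where $A$ is given by (13). In fact for any parameter $\theta$, which may be equal to $M_{kl}$, $Q_{kl}$, $R_{kl}$ or $\beta$, the gradient is given by

    $$\partial_\theta \Phi_{Y_t,\Sigma_t}(\tau, z) = \big(\mathrm{Tr}(\partial_\theta A(\tau)\,\Sigma_t) + \partial_\theta c(\tau)\big)\, \Phi_{Y_t,\Sigma_t}(\tau, z).$$

    From (13) and $\partial_\theta(L^{-1}) = -L^{-1}(\partial_\theta L)L^{-1}$, implied by the derivation of $L^{-1}L = I$, we conclude that

    $$\partial_\theta A(\tau) = -A_{22}(\tau)^{-1}\,(\partial_\theta A_{22}(\tau))\,A(\tau) + A_{22}(\tau)^{-1}\,\partial_\theta A_{21}(\tau).$$

    Therefore we are led to the computation of

    $$\partial_\theta \begin{pmatrix} A_{11}(\tau) & A_{12}(\tau) \\ A_{21}(\tau) & A_{22}(\tau) \end{pmatrix},$$

    that is, the derivative with respect to a parameter of a function of a matrix (in this case the exponential function). In order to use formula (50) we specify in the following table, for each parameter of the WASC model, the choice of the matrices $X$ and $H$. As usual $\{e_l;\ l = 1 \ldots n\}$ resp. $\{e_{kl};\ k, l = 1 \ldots n\}$ stands for the canonical basis of $\mathbb{R}^n$ resp. $M_n(\mathbb{R})$ (the function $f$ being $f(x) = e^x$).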

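    Parameter    X           H

    $M_{kl}$     $\tau G$    $\tau \begin{pmatrix} e_{kl} & 0 \\ 0 & -e_{kl}^\top \end{pmatrix}$

    $Q_{kl}$     $\tau G$    $\tau \begin{pmatrix} (e_{kl})^\top \rho\, i\gamma^\top & -2\,(Q^\top e_{kl} + (e_{kl})^\top Q) \\ 0 & -i\gamma \rho^\top e_{kl} \end{pmatrix}$

    $\rho_l$     $\tau G$    $\tau \begin{pmatrix} Q^\top e_l\, i\gamma^\top & 0 \\ 0 & -i\gamma\, e_l^\top Q \end{pmatrix}$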
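The expression for $\partial_\theta A(\tau)$ used above follows from two short steps, written out here as a sketch. It assumes that (13) has the form $A(\tau) = A_{22}(\tau)^{-1}A_{21}(\tau)$, which is the form implied by the derivative formula itself.

```latex
% Step 1: differentiate L^{-1} L = I with respect to \theta.
\partial_\theta (L^{-1} L)
  = (\partial_\theta L^{-1})\, L + L^{-1}\, \partial_\theta L = 0
\quad\Longrightarrow\quad
\partial_\theta L^{-1} = -L^{-1}\, (\partial_\theta L)\, L^{-1}.

% Step 2: apply the product rule to A = A_{22}^{-1} A_{21} (assumed form of (13)).
\partial_\theta A
  = (\partial_\theta A_{22}^{-1})\, A_{21} + A_{22}^{-1}\, \partial_\theta A_{21}
  = -A_{22}^{-1}\, (\partial_\theta A_{22})\, \underbrace{A_{22}^{-1} A_{21}}_{=\,A}
    + A_{22}^{-1}\, \partial_\theta A_{21}.
```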

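    To complete the analytical computation of the gradient we need the derivative of $c(\tau)$ with respect to a model parameter, and particularly the term $\partial_\theta \log A_{22}(\tau)$:

    $$\partial_\theta c(\tau) = -\frac{\beta}{2}\Big[\mathrm{Tr}\big(\partial_\theta(\log A_{22}(\tau))\big) + \tau\,\mathrm{Tr}\big(\partial_\theta M^\top + i\gamma\,\partial_\theta(\rho^\top Q)\big)\Big]. \qquad (53)$$

    In order to apply (50) we just need to define $P_{22}$ from $M_{2n}$ into $M_n$ such that $P_{22}L = L_{22}$ with

    $$L = \begin{pmatrix} L_{11} & L_{12} \\ L_{21} & L_{22} \end{pmatrix}. \qquad (54)$$

    Then it is easy to see that

    $$\partial_\theta \log A_{22}(\tau) = D_{\log,\,A_{22}(\tau)}\big(P_{22}\, D_{\exp,\,\tau G}(H)\big). \qquad (55)$$

    Once again, using (50) with the log function gives the result.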


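    Remark 7.3. If $f$ is the exponential function then we can also compute the derivative of the exponential of a matrix using the Baker-Hausdorff formula (see e.g. Hall (2003) p. 71 formula (3.10) for the details)

    $$\partial_\theta e^{\tau G} = D\exp_{\tau G}(\tau\,\partial_\theta G), \qquad (56)$$

    where $\mathrm{ad}_X Y = [X, Y] = XY - YX$ is the Lie bracket and $D\exp_X = e^{X}\,\dfrac{I - e^{-\mathrm{ad}_X}}{\mathrm{ad}_X}$.

    The empirical study is based on the forward characteristic function of assets' log returns, defined by

    $$\Phi_{\Sigma_0}(t, iA(\tau))\,e^{c(\tau)} = \exp\{\mathrm{Tr}[B(t)\Sigma_0] + C(t) + c(\tau)\} \qquad (57)$$

    with

    $$B(t) = (A(\tau)B_{12}(t) + B_{22}(t))^{-1}\,(A(\tau)B_{11}(t) + B_{21}(t)),$$

    $$C(t) = -\frac{\beta}{2}\,\mathrm{Tr}\big[\log(A(\tau)B_{12}(t) + B_{22}(t)) + tM^\top\big],$$

    $$c(\tau) = -\frac{\beta}{2}\,\mathrm{Tr}\big[\log(A_{22}(\tau)) + \tau M^\top + i\tau\,\gamma(\rho^\top Q)\big].$$

    As for the characteristic function of assets' log returns, it is straightforward to show that for any given model parameter $\theta$ we have:

    $$\partial_\theta\big(\Phi_{\Sigma_0}(t, iA(\tau))\,e^{c(\tau)}\big) = \Phi_{\Sigma_0}(t, iA(\tau))\,e^{c(\tau)}\,\big(\mathrm{Tr}(\partial_\theta B(t)\,\Sigma_0) + \partial_\theta C(t) + \partial_\theta c(\tau)\big),$$

    where the matrix derivatives are given by

    $$\partial_\theta B(t) = -(A(\tau)B_{12}(t) + B_{22}(t))^{-1}\,\big(\partial_\theta A(\tau)\,B_{12}(t) + A(\tau)\,\partial_\theta B_{12}(t) + \partial_\theta B_{22}(t)\big)\,B(t) + (A(\tau)B_{12}(t) + B_{22}(t))^{-1}\,\partial_\theta\big(A(\tau)B_{11}(t) + B_{21}(t)\big),$$

    $$\partial_\theta C(t) = -\frac{\beta}{2}\,\mathrm{Tr}\Big(D_{\log,\,A(\tau)B_{12}(t)+B_{22}(t)}\big(\partial_\theta A(\tau)\,B_{12}(t) + A(\tau)\,\partial_\theta B_{12}(t) + \partial_\theta B_{22}(t)\big)\Big).$$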

    This completes the analytical computation of the gradient.
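The series form of the formula in Remark 7.3, $D\exp_X(Y) = e^{X}\sum_{k\ge 0}\frac{(-1)^k}{(k+1)!}\,\mathrm{ad}_X^k(Y)$, can be sanity-checked against a finite difference. The sketch below assumes small illustrative 2x2 matrices `X` and `Y` (they are not the WASC matrix $G$) and uses a truncated Taylor series for the matrix exponential, which is adequate only for matrices of small norm.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(len(a[0]))] for i in range(len(a))]

def scale(a, c):
    return [[c * v for v in row] for row in a]

def mat_exp(a, terms=40):
    """Truncated Taylor series of exp(a); fine for small-norm matrices only."""
    result, term = identity(len(a)), identity(len(a))
    for k in range(1, terms):
        term = scale(matmul(term, a), 1.0 / k)
        result = add(result, term)
    return result

def commutator(x, y):
    """ad_X(Y) = XY - YX."""
    return add(matmul(x, y), scale(matmul(y, x), -1.0))

def dexp(x, y, terms=30):
    """e^X * sum_{k>=0} (-1)^k / (k+1)! * ad_X^k(Y)."""
    n = len(x)
    total = [[0.0] * n for _ in range(n)]
    ad_power = y      # ad_X^0(Y)
    coeff = 1.0       # (-1)^0 / 1!
    for k in range(terms):
        total = add(total, scale(ad_power, coeff))
        ad_power = commutator(x, ad_power)
        coeff *= -1.0 / (k + 2)   # moves from (-1)^k/(k+1)! to (-1)^(k+1)/(k+2)!
    return matmul(mat_exp(x), total)

X = [[-0.5, 0.2], [0.1, -0.3]]   # illustrative, non-symmetric
Y = [[0.0, 1.0], [0.0, 0.0]]     # canonical basis element e_12 as the direction
t = 1e-5
plus = mat_exp(add(X, scale(Y, t)))
minus = mat_exp(add(X, scale(Y, -t)))
numeric = scale(add(plus, scale(minus, -1.0)), 1.0 / (2.0 * t))
analytic = dexp(X, Y)
max_gap = max(abs(analytic[i][j] - numeric[i][j]) for i in range(2) for j in range(2))
```

In the setting of (56), `X` would play the role of $\tau G$ and `Y` that of $\tau\,\partial_\theta G$.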

    7.2 WASC approximation

    To prove the convergence of the discrete time process defined in Lemma 3.1 to the WASC model we first make some simplifications.

    Using the fact that $Q \in GL(n,\mathbb{R})$ [6] we know that $Q = K\tilde{Q}$ with $K \in O(n)$ and $\tilde{Q} \in P_n$, where $O(n)$ stands for the orthogonal group [7] and $P_n$ is the set of symmetric positive definite matrices. We refer to Faraut (2006) for basic results on matrix analysis. The matrix Brownian motion being invariant under rotation, the dynamics can be rewritten as:

    $$d\Sigma_t = \big(\Omega\Omega^\top + M\Sigma_t + \Sigma_t M^\top\big)\,dt + \sqrt{\Sigma_t}\,dW_t\,\tilde{Q} + \tilde{Q}^\top dW_t^\top \sqrt{\Sigma_t}. \qquad (58)$$

    [6] $GL(n,\mathbb{R})$ is the linear group, the set of invertible matrices.

    [7] $O(n) = \{g \in GL(n,\mathbb{R})\ |\ g^\top g = I_n\}$.


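    The associated correlation structure is $\tilde{\rho} = K^\top\rho$. Note that $Q^\top Q = \tilde{Q}^\top\tilde{Q}$, thus the WASC models built upon $(\Omega, M, Q, \rho)$ and $(\Omega, M, \tilde{Q}, \tilde{\rho})$ have the same characteristic function. From now on we will work with the $(\Omega, M, \tilde{Q}, \tilde{\rho})$ specification without loss of generality.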
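The factorization of an invertible $Q$ into an orthogonal factor and a symmetric positive definite factor is the polar decomposition. The sketch below computes it for a 2x2 matrix, taking the positive definite factor as $(Q^\top Q)^{1/2}$ and the orthogonal factor as $Q(Q^\top Q)^{-1/2}$. The numerical values of `Q` are those reported in the caption of Table 1; the helper names are hypothetical.

```python
import math

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def eig_sym2(x):
    """Eigenvalues and orthonormal eigenvectors (columns of p) of a symmetric 2x2 matrix."""
    a, b, d = x[0][0], x[0][1], x[1][1]
    theta = 0.5 * math.atan2(2.0 * b, a - d)
    c, s = math.cos(theta), math.sin(theta)
    lam = [a * c * c + 2.0 * b * c * s + d * s * s,
           a * s * s - 2.0 * b * c * s + d * c * c]
    return lam, [[c, -s], [s, c]]

def sqrt_spd2(x):
    """Symmetric square root of a symmetric positive definite 2x2 matrix."""
    lam, p = eig_sym2(x)
    diag = [[math.sqrt(lam[0]), 0.0], [0.0, math.sqrt(lam[1])]]
    return matmul(matmul(p, diag), transpose(p))

def inv2(a):
    det = a[0][0] * a[1][1] - a[0][1] * a[1][0]
    return [[a[1][1] / det, -a[0][1] / det], [-a[1][0] / det, a[0][0] / det]]

def polar(q):
    """Return (K, Q_tilde) with Q = K Q_tilde, K orthogonal, Q_tilde symmetric positive definite."""
    q_tilde = sqrt_spd2(matmul(transpose(q), q))
    k = matmul(q, inv2(q_tilde))
    return k, q_tilde

Q = [[0.1133137, 0.03335871], [0.0, 0.07954368]]
K, Q_tilde = polar(Q)
KtK = matmul(transpose(K), K)          # should be the identity
KQ = matmul(K, Q_tilde)                # should reproduce Q
QtQ = matmul(transpose(Q), Q)
QtQ_tilde = matmul(transpose(Q_tilde), Q_tilde)   # should equal Q^T Q
```

Given a correlation vector `rho`, the rotated vector would then be obtained as `K` transposed times `rho`, which leaves $Q^\top\rho$ unchanged.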

    To build a discrete time approximating scheme for the WASC it is tempting to use an Euler scheme. Unfortunately a major difficulty arises with the positiveness of the Wishart process: one cannot ensure that the discrete approximating scheme remains in $P_n$. One way to solve the problem is to rewrite the Wishart process as a product of Ornstein-Uhlenbeck matrix processes, thus ensuring positiveness, and to approximate the Ornstein-Uhlenbeck process through an Euler scheme. In the WASC the correlation between assets' returns and their volatilities is controlled by the vector $\rho$, therefore we need to reformulate the correlation between the assets and the Ornstein-Uhlenbeck process. We turn to a more formal presentation. It is an extension to our framework of the strategy developed in Gourieroux et al. (2004).

    We make the following hypothesis on the Gindikin coefficient: $\beta \in \mathbb{N}$ and $\beta > n - 1$. Under this hypothesis and following equation (5.7) of Bru (1991) there exists a matrix stochastic process $X_t$ in $M_{\beta \times n}$ such that $\Sigma_t = X_t^\top X_t$ and whose dynamics are given by

    $$dX_t = X_t M^\top dt + dN_t\,\tilde{Q}, \qquad (59)$$

    with $\{N_t;\ t \ge 0\}$ a $\beta \times n$ matrix Brownian motion. The matrix Brownian motion in equation (58) is related to equation (59) through $dW_t = \big(\sqrt{\Sigma_t}\big)^{-1} X_t^\top\,dN_t$. As the correlation in the WASC is defined between $\{Z_t;\ t \ge 0\}$ and $\{W_t;\ t \ge 0\}$, we

    turn our attention toward the problem of specifying it with respect to {Nt; t 0}. Starting from the fact that dZkt dWijt = jkidt for i,j,k = 1, . . . , n8 anddefining {kjt ; k, j = 1, . . . , n} such that dZkt dNijt = kjt ki we conclude that kjt =

    jnl=0

    klt X

    klt

    where

    (ijt )i,j=1..n =

    (Xt

    Xt)

    (ijt )i,j=1..n = ijt

    1

    i,j=1..n.

    From now on we formulate the problem with $n = \beta = 2$ without loss of generality. Using the definition of $\rho^{kj}_t$ we have:

    dN11t =

    11t dZ

    1t +

    12t dZ

    1t +

    1 (11t )2 (12t )2dZ1t

    dN12t = 12t dZ

    1t 11t dZ1t +

    1 (12t )2 (11t )2dZ1t

    dN21t = 21t dZ

    2t +

    22t dZ

    2t +

    1 (21t )2 (22t )2dZ2t

    dN22t = 22t dZ

    2t 21t dZ2t +

    1 (22t )2 (21t )2dZ2t ,

    with {Zt; t 0},{Zt; t 0} and {Zt; t 0} three independent Brownian motions(two dimensional vector). Combining the dynamics of the assets (1), (59), t =

    [8] $\delta_{ij}$ is the Kronecker function.


    Xt Xt and the decomposition of Nt with respect to Zt,Zt,Zt and Zt, we obtainthat (Yt = log(St), Xt) satisfy the stochastic differential equations system

    dYt = 12 Vec((XX)ii) dt + Xt XtdZtdXt = Mdt +

    diag(dZt)t + diag(dZt)t + diag(dZt)t + diag(dZt)t

    Q.

    Finally, in order to build an approximation of the above diffusions in the spirit of Nelson (1990), it is enough to discretize the above SDE, thus leading to the announced stochastic difference equations. Since the process $X_t$ is driven by more noises than $Y_t$, we then use the Hermite polynomials in order to generate independent noises. In the case of $\rho = 0$ we recover the result of Gourieroux et al. (2004).
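The positivity argument behind this construction can be illustrated with a short simulation. The sketch below covers only the covariance part in the uncorrelated case $\rho = 0$: it applies an Euler step to the Ornstein-Uhlenbeck matrix process and forms $\Sigma = X^\top X$, which is positive semi-definite by construction. The drift $X_t M^\top$ is an assumption chosen so that $X^\top X$ has drift $M\Sigma + \Sigma M^\top$; the function name, `M`, the starting point and the step size are illustrative, while `Q` is taken from the caption of Table 1.

```python
import math
import random

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def simulate_sigma(m, q, x0, dt, n_steps, seed=0):
    """Euler scheme for dX = X M^T dt + dN Q; returns the path of Sigma = X^T X."""
    rng = random.Random(seed)
    beta, n = len(x0), len(x0[0])
    x = [row[:] for row in x0]
    path = []
    for _ in range(n_steps):
        # Increments of a beta x n matrix Brownian motion over a step of length dt.
        dn = [[rng.gauss(0.0, math.sqrt(dt)) for _ in range(n)] for _ in range(beta)]
        drift = matmul(x, transpose(m))
        noise = matmul(dn, q)
        x = [[x[i][j] + drift[i][j] * dt + noise[i][j] for j in range(n)]
             for i in range(beta)]
        path.append(matmul(transpose(x), x))
    return path

M = [[-5.0, -3.0], [-3.0, -5.0]]                    # illustrative mean-reversion matrix
Q = [[0.1133137, 0.03335871], [0.0, 0.07954368]]
X0 = [[0.15, 0.0], [0.0, 0.12]]                     # X0^T X0 = diag(0.0225, 0.0144)
sigma_path = simulate_sigma(M, Q, X0, dt=1.0 / 250.0, n_steps=500)
```

A direct Euler step on $\Sigma_t$ itself offers no such guarantee, which is the difficulty the text describes.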

    7.3 Dynamics of the correlation process

    In this Appendix we compute, in the 2-dimensional case, the drift and the diffusion coefficients of the correlation process $\rho^{12}_t$ defined by

    $$\rho^{12}_t = \frac{\Sigma^{12}_t}{\sqrt{\Sigma^{11}_t\,\Sigma^{22}_t}}. \qquad (60)$$

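    The Wishart process $\Sigma_t$ satisfies the following SDE:

    $$d\Sigma_t = d\begin{pmatrix} \Sigma^{11}_t & \Sigma^{12}_t \\ \Sigma^{12}_t & \Sigma^{22}_t \end{pmatrix} = \left[ \begin{pmatrix} \Omega_{11} & \Omega_{12} \\ \Omega_{21} & \Omega_{22} \end{pmatrix} \begin{pmatrix} \Omega_{11} & \Omega_{21} \\ \Omega_{12} & \Omega_{22} \end{pmatrix} + \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \begin{pmatrix} \Sigma^{11}_t & \Sigma^{12}_t \\ \Sigma^{12}_t & \Sigma^{22}_t \end{pmatrix} + \begin{pmatrix} \Sigma^{11}_t & \Sigma^{12}_t \\ \Sigma^{12}_t & \Sigma^{22}_t \end{pmatrix} \begin{pmatrix} M_{11} & M_{21} \\ M_{12} & M_{22} \end{pmatrix} \right] dt$$

    $$\qquad + \begin{pmatrix} \Sigma^{11}_t & \Sigma^{12}_t \\ \Sigma^{12}_t & \Sigma^{22}_t \end{pmatrix}^{1/2} \begin{pmatrix} dW^{11}_t & dW^{12}_t \\ dW^{21}_t & dW^{22}_t \end{pmatrix} \begin{pmatrix} Q_{11} & Q_{12} \\ Q_{21} & Q_{22} \end{pmatrix} + \begin{pmatrix} Q_{11} & Q_{21} \\ Q_{12} & Q_{22} \end{pmatrix} \begin{pmatrix} dW^{11}_t & dW^{21}_t \\ dW^{12}_t & dW^{22}_t \end{pmatrix} \begin{pmatrix} \Sigma^{11}_t & \Sigma^{12}_t \\ \Sigma^{12}_t & \Sigma^{22}_t \end{pmatrix}^{1/2}.$$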

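    Let

    $$\begin{pmatrix} \sigma^{11}_t & \sigma^{12}_t \\ \sigma^{12}_t & \sigma^{22}_t \end{pmatrix} := \begin{pmatrix} \Sigma^{11}_t & \Sigma^{12}_t \\ \Sigma^{12}_t & \Sigma^{22}_t \end{pmatrix}^{1/2},$$

    so that

    $$\sigma_t^2 = \Sigma_t = \begin{pmatrix} (\sigma^{11}_t)^2 + (\sigma^{12}_t)^2 & \sigma^{11}_t\sigma^{12}_t + \sigma^{12}_t\sigma^{22}_t \\ \sigma^{11}_t\sigma^{12}_t + \sigma^{12}_t\sigma^{22}_t & (\sigma^{12}_t)^2 + (\sigma^{22}_t)^2 \end{pmatrix}. \qquad (61)$$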

    We first compute the diffusion coefficient since it follows easily from the computations in Appendix B of Da Fonseca et al. (2005). From the dynamics of the Wishart process we have:

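    $$d\Sigma^{11}_t = (.)dt + 2\sigma^{11}_t\big(Q_{11}dW^{11}_t + Q_{21}dW^{12}_t\big) + 2\sigma^{12}_t\big(Q_{11}dW^{21}_t + Q_{21}dW^{22}_t\big),$$

    $$d\Sigma^{12}_t = (.)dt + \sigma^{11}_t\big(Q_{12}dW^{11}_t + Q_{22}dW^{12}_t\big) + \sigma^{12}_t\big(Q_{12}dW^{21}_t + Q_{22}dW^{22}_t\big) + \sigma^{12}_t\big(Q_{11}dW^{11}_t + Q_{21}dW^{12}_t\big) + \sigma^{22}_t\big(Q_{11}dW^{21}_t + Q_{21}dW^{22}_t\big),$$

    $$d\Sigma^{22}_t = (.)dt + 2\sigma^{12}_t\big(Q_{12}dW^{11}_t + Q_{22}dW^{12}_t\big) + 2\sigma^{22}_t\big(Q_{12}dW^{21}_t + Q_{22}dW^{22}_t\big),$$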

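    while the covariations are given by:

    $$d\langle\Sigma^{11}, \Sigma^{11}\rangle_t = 4\Sigma^{11}_t\,(Q_{11}^2 + Q_{21}^2)\,dt,$$

    $$d\langle\Sigma^{12}, \Sigma^{12}\rangle_t = \big[\Sigma^{11}_t\,(Q_{12}^2 + Q_{22}^2) + 2\Sigma^{12}_t\,(Q_{11}Q_{12} + Q_{21}Q_{22}) + \Sigma^{22}_t\,(Q_{11}^2 + Q_{21}^2)\big]\,dt,$$

    $$d\langle\Sigma^{22}, \Sigma^{22}\rangle_t = 4\Sigma^{22}_t\,(Q_{12}^2 + Q_{22}^2)\,dt,$$

    $$d\langle\Sigma^{11}, \Sigma^{12}\rangle_t = \big[2\Sigma^{11}_t\,(Q_{11}Q_{12} + Q_{21}Q_{22}) + 2\Sigma^{12}_t\,(Q_{11}^2 + Q_{21}^2)\big]\,dt,$$

    $$d\langle\Sigma^{11}, \Sigma^{22}\rangle_t = 4\Sigma^{12}_t\,(Q_{11}Q_{12} + Q_{21}Q_{22})\,dt,$$

    $$d\langle\Sigma^{12}, \Sigma^{22}\rangle_t = 2\big[\Sigma^{12}_t\,(Q_{12}^2 + Q_{22}^2) + \Sigma^{22}_t\,(Q_{11}Q_{12} + Q_{21}Q_{22})\big]\,dt.$$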
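The covariations can be cross-checked mechanically: read off the loadings of each Wishart element on $(dW^{11}, dW^{12}, dW^{21}, dW^{22})$ from the diffusion terms, take dot products, and compare with the closed forms expressed through $\Sigma = \sigma^2$. The sketch below does this for illustrative values of $\sigma$ and $Q$; the function names are hypothetical.

```python
def loadings(sigma, q):
    """Coefficients on (dW11, dW12, dW21, dW22) of dSigma11, dSigma12 and dSigma22."""
    s11, s12, s22 = sigma[0][0], sigma[0][1], sigma[1][1]
    q11, q12, q21, q22 = q[0][0], q[0][1], q[1][0], q[1][1]
    return {
        "11": [2 * s11 * q11, 2 * s11 * q21, 2 * s12 * q11, 2 * s12 * q21],
        "12": [s11 * q12 + s12 * q11, s11 * q22 + s12 * q21,
               s12 * q12 + s22 * q11, s12 * q22 + s22 * q21],
        "22": [2 * s12 * q12, 2 * s12 * q22, 2 * s22 * q12, 2 * s22 * q22],
    }

def closed_form(sigma, q):
    """Covariation rates as stated in the text, with Sigma = sigma^2."""
    s11, s12, s22 = sigma[0][0], sigma[0][1], sigma[1][1]
    S11 = s11 ** 2 + s12 ** 2
    S12 = s12 * (s11 + s22)
    S22 = s12 ** 2 + s22 ** 2
    a = q[0][0] ** 2 + q[1][0] ** 2                 # Q11^2 + Q21^2
    b = q[0][0] * q[0][1] + q[1][0] * q[1][1]       # Q11 Q12 + Q21 Q22
    c = q[0][1] ** 2 + q[1][1] ** 2                 # Q12^2 + Q22^2
    return {
        ("11", "11"): 4 * S11 * a,
        ("12", "12"): S11 * c + 2 * S12 * b + S22 * a,
        ("22", "22"): 4 * S22 * c,
        ("11", "12"): 2 * S11 * b + 2 * S12 * a,
        ("11", "22"): 4 * S12 * b,
        ("12", "22"): 2 * S12 * c + 2 * S22 * b,
    }

sigma = [[0.14, 0.02], [0.02, 0.11]]                # illustrative symmetric square root
Q = [[0.1133137, 0.03335871], [0.0, 0.07954368]]    # values from the caption of Table 1
load = loadings(sigma, Q)
gaps = {
    pair: abs(sum(u * v for u, v in zip(load[pair[0]], load[pair[1]])) - value)
    for pair, value in closed_form(sigma, Q).items()
}
```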

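    Now we differentiate both sides of the equality $(\rho^{12}_t)^2 = \dfrac{(\Sigma^{12}_t)^2}{\Sigma^{11}_t\,\Sigma^{22}_t}$, thus obtaining

    $$2\rho^{12}_t\,d\rho^{12}_t = \frac{2\Sigma^{12}_t}{\Sigma^{11}_t\Sigma^{22}_t}\,d\Sigma^{12}_t + (\Sigma^{12}_t)^2\left(\frac{1}{\Sigma^{22}_t}\,d\Big(\frac{1}{\Sigma^{11}_t}\Big) + \frac{1}{\Sigma^{11}_t}\,d\Big(\frac{1}{\Sigma^{22}_t}\Big)\right) + (.)dt,$$

    so that

    $$d\rho^{12}_t = \frac{1}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\,d\Sigma^{12}_t - \frac{\rho^{12}_t}{2\Sigma^{11}_t}\,d\Sigma^{11}_t - \frac{\rho^{12}_t}{2\Sigma^{22}_t}\,d\Sigma^{22}_t + (.)dt.$$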

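    By using the covariations among the Wishart elements we have

    $$d\langle\rho^{12}\rangle_t = \Bigg[\frac{1}{\Sigma^{11}_t\Sigma^{22}_t}\Big(\Sigma^{11}_t(Q_{12}^2 + Q_{22}^2) + 2\Sigma^{12}_t(Q_{11}Q_{12} + Q_{21}Q_{22}) + \Sigma^{22}_t(Q_{11}^2 + Q_{21}^2)\Big)$$

    $$\qquad + (\rho^{12}_t)^2\left(\frac{Q_{11}^2 + Q_{21}^2}{\Sigma^{11}_t} + \frac{Q_{12}^2 + Q_{22}^2}{\Sigma^{22}_t} + 2\,\frac{\Sigma^{12}_t}{\Sigma^{11}_t\Sigma^{22}_t}\,(Q_{11}Q_{12} + Q_{21}Q_{22})\right)$$

    $$\qquad - \frac{2\rho^{12}_t}{\Sigma^{11}_t\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\Big(\Sigma^{11}_t(Q_{11}Q_{12} + Q_{21}Q_{22}) + \Sigma^{12}_t(Q_{11}^2 + Q_{21}^2)\Big)$$

    $$\qquad - \frac{2\rho^{12}_t}{\Sigma^{22}_t\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\Big(\Sigma^{12}_t(Q_{12}^2 + Q_{22}^2) + \Sigma^{22}_t(Q_{11}Q_{12} + Q_{21}Q_{22})\Big)\Bigg]\,dt,$$

    which leads to:

    $$d\langle\rho^{12}\rangle_t = \big(1 - (\rho^{12}_t)^2\big)\left(\frac{Q_{12}^2 + Q_{22}^2}{\Sigma^{22}_t} + \frac{Q_{11}^2 + Q_{21}^2}{\Sigma^{11}_t} - 2\rho^{12}_t\,\frac{Q_{11}Q_{12} + Q_{21}Q_{22}}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\right)dt.$$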
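The simplification from the four-term expression to the compact form is pure algebra and can be confirmed numerically. The sketch below evaluates both sides at a few illustrative points; `a`, `b` and `c` stand for $Q_{11}^2 + Q_{21}^2$, $Q_{11}Q_{12} + Q_{21}Q_{22}$ and $Q_{12}^2 + Q_{22}^2$, and the function names are hypothetical.

```python
import math

def quad_var_expanded(S11, S12, S22, a, b, c):
    """Four-term expression for the rate of d<rho12>_t."""
    root = math.sqrt(S11 * S22)
    rho = S12 / root
    t1 = (S11 * c + 2.0 * S12 * b + S22 * a) / (S11 * S22)
    t2 = rho ** 2 * (a / S11 + c / S22 + 2.0 * S12 * b / (S11 * S22))
    t3 = -2.0 * rho / (S11 * root) * (S11 * b + S12 * a)
    t4 = -2.0 * rho / (S22 * root) * (S12 * c + S22 * b)
    return t1 + t2 + t3 + t4

def quad_var_compact(S11, S12, S22, a, b, c):
    """Compact form: (1 - rho^2) (c/S22 + a/S11 - 2 rho b / sqrt(S11 S22))."""
    root = math.sqrt(S11 * S22)
    rho = S12 / root
    return (1.0 - rho ** 2) * (c / S22 + a / S11 - 2.0 * rho * b / root)

# Illustrative test points: (Sigma11, Sigma12, Sigma22, a, b, c).
points = [
    (0.0225, 0.0054, 0.0144, 0.0128, 0.0038, 0.0074),
    (0.04, -0.01, 0.09, 0.02, -0.005, 0.01),
    (1.0, 0.3, 2.0, 0.5, 0.1, 0.7),
]
gaps = [abs(quad_var_expanded(*p) - quad_var_compact(*p)) for p in points]
```

The factor $1 - (\rho^{12}_t)^2$ makes the diffusion of the correlation vanish at the boundaries $\pm 1$.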


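    Now let us compute the drift of the process $\rho^{12}_t$.

    We differentiate both sides of the equality $\rho^{12}_t = \dfrac{\Sigma^{12}_t}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}$ and we consider the finite variation terms:

    $$d\rho^{12}_t = \frac{1}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\,d\Sigma^{12}_t + \Sigma^{12}_t\,d\Big(\frac{1}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\Big) + d\Big\langle\Sigma^{12}, \frac{1}{\sqrt{\Sigma^{11}\Sigma^{22}}}\Big\rangle_t$$

    $$= \frac{1}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\Big(\Omega_{11}\Omega_{21} + \Omega_{12}\Omega_{22} + M_{21}\Sigma^{11}_t + M_{12}\Sigma^{22}_t + (M_{11} + M_{22})\Sigma^{12}_t\Big)\,dt$$

    $$\quad + \Sigma^{12}_t\Bigg[\frac{1}{\sqrt{\Sigma^{22}_t}}\Big(-\frac{1}{2\sqrt{(\Sigma^{11}_t)^3}}\Big)\Big(\Omega_{11}^2 + \Omega_{12}^2 + 2M_{11}\Sigma^{11}_t + 2M_{12}\Sigma^{12}_t\Big)\,dt + \frac{1}{\sqrt{\Sigma^{11}_t}}\Big(-\frac{1}{2\sqrt{(\Sigma^{22}_t)^3}}\Big)\Big(\Omega_{21}^2 + \Omega_{22}^2 + 2M_{21}\Sigma^{12}_t + 2M_{22}\Sigma^{22}_t\Big)\,dt$$

    $$\qquad + \frac{3}{8\sqrt{\Sigma^{11}_t\Sigma^{22}_t}\,(\Sigma^{11}_t)^2}\,d\langle\Sigma^{11}\rangle_t + \frac{3}{8\sqrt{\Sigma^{11}_t\Sigma^{22}_t}\,(\Sigma^{22}_t)^2}\,d\langle\Sigma^{22}\rangle_t + \frac{1}{4\sqrt{(\Sigma^{11}_t\Sigma^{22}_t)^3}}\,d\langle\Sigma^{11}, \Sigma^{22}\rangle_t\Bigg]$$

    $$\quad + \frac{1}{\sqrt{\Sigma^{22}_t}}\Big(-\frac{1}{2\sqrt{(\Sigma^{11}_t)^3}}\Big)\,d\langle\Sigma^{11}, \Sigma^{12}\rangle_t + \frac{1}{\sqrt{\Sigma^{11}_t}}\Big(-\frac{1}{2\sqrt{(\Sigma^{22}_t)^3}}\Big)\,d\langle\Sigma^{12}, \Sigma^{22}\rangle_t + \text{Diffusions}.$$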

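    Now we use the formulas of the covariations of the Wishart elements and we arrive at an expression which can be written as follows:

    $$d\rho^{12}_t = \Big(A_t\,(\rho^{12}_t)^2 + B_t\,\rho^{12}_t + C_t\Big)\,dt + \text{Diffusions},$$

    where [9]:

    $$A_t = \frac{1}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\,(Q_{11}Q_{12} + Q_{21}Q_{22}) - \sqrt{\frac{\Sigma^{22}_t}{\Sigma^{11}_t}}\,M_{12} - \sqrt{\frac{\Sigma^{11}_t}{\Sigma^{22}_t}}\,M_{21},$$

    $$B_t = -\frac{\Omega_{11}^2 + \Omega_{12}^2}{2\Sigma^{11}_t} - \frac{\Omega_{21}^2 + \Omega_{22}^2}{2\Sigma^{22}_t} + \frac{Q_{11}^2 + Q_{21}^2}{2\Sigma^{11}_t} + \frac{Q_{12}^2 + Q_{22}^2}{2\Sigma^{22}_t},$$

    $$C_t = \frac{1}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\Big(\Omega_{11}\Omega_{21} + \Omega_{12}\Omega_{22} - 2\,(Q_{11}Q_{12} + Q_{21}Q_{22})\Big) + \sqrt{\frac{\Sigma^{22}_t}{\Sigma^{11}_t}}\,M_{12} + \sqrt{\frac{\Sigma^{11}_t}{\Sigma^{22}_t}}\,M_{21}.$$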
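The coefficients $A_t$, $B_t$ and $C_t$ can be checked against a direct term-by-term application of Itô's formula to $\rho^{12} = \Sigma^{12}/\sqrt{\Sigma^{11}\Sigma^{22}}$. The sketch below does so at one illustrative state. It takes the matrices $\Omega\Omega^\top$ and $Q^\top Q$ as inputs (named `OO` and `QQ`, both hypothetical names), sets $\Omega\Omega^\top = \beta Q^\top Q$ with $\beta = 15$ as in the Monte Carlo design, and uses an illustrative, deliberately non-symmetric `M`.

```python
import math

def drift_direct(S, M, OO, QQ):
    """Ito drift of rho = S12 / sqrt(S11 * S22), assembled term by term."""
    S11, S12, S22 = S[0][0], S[0][1], S[1][1]
    a, b, c = QQ[0][0], QQ[0][1], QQ[1][1]
    # Drifts of the Wishart elements: (Omega Omega^T + M S + S M^T)_{ij}.
    mu11 = OO[0][0] + 2.0 * M[0][0] * S11 + 2.0 * M[0][1] * S12
    mu12 = OO[0][1] + M[1][0] * S11 + M[0][1] * S22 + (M[0][0] + M[1][1]) * S12
    mu22 = OO[1][1] + 2.0 * M[1][0] * S12 + 2.0 * M[1][1] * S22
    # Covariation rates of the Wishart elements.
    c1111, c2222, c1122 = 4.0 * S11 * a, 4.0 * S22 * c, 4.0 * S12 * b
    c1112 = 2.0 * S11 * b + 2.0 * S12 * a
    c1222 = 2.0 * S12 * c + 2.0 * S22 * b
    # Derivatives of g(x, y) = (x y)^(-1/2) at (S11, S22).
    g = 1.0 / math.sqrt(S11 * S22)
    g1, g2 = -0.5 * g / S11, -0.5 * g / S22
    half_g11, half_g22 = 0.375 * g / S11 ** 2, 0.375 * g / S22 ** 2
    g12 = 0.25 * g / (S11 * S22)
    return (g * mu12 + S12 * (g1 * mu11 + g2 * mu22)
            + S12 * (half_g11 * c1111 + half_g22 * c2222 + g12 * c1122)
            + g1 * c1112 + g2 * c1222)

def drift_abc(S, M, OO, QQ):
    """Return (A, B, C, A rho^2 + B rho + C) using the closed-form coefficients."""
    S11, S12, S22 = S[0][0], S[0][1], S[1][1]
    a, b, c = QQ[0][0], QQ[0][1], QQ[1][1]
    root = math.sqrt(S11 * S22)
    rho = S12 / root
    up, down = math.sqrt(S22 / S11), math.sqrt(S11 / S22)
    A = b / root - up * M[0][1] - down * M[1][0]
    B = (-OO[0][0] / (2.0 * S11) - OO[1][1] / (2.0 * S22)
         + a / (2.0 * S11) + c / (2.0 * S22))
    C = (OO[0][1] - 2.0 * b) / root + up * M[0][1] + down * M[1][0]
    return A, B, C, A * rho ** 2 + B * rho + C

Q = [[0.1133137, 0.03335871], [0.0, 0.07954368]]
QQ = [[sum(Q[k][i] * Q[k][j] for k in range(2)) for j in range(2)] for i in range(2)]  # Q^T Q
beta = 15.0
OO = [[beta * v for v in row] for row in QQ]        # Omega Omega^T = beta Q^T Q
S = [[0.0225, 0.0054], [0.0054, 0.0144]]
M = [[-5.0, -3.0], [-2.0, -4.0]]                    # illustrative values
A_t, B_t, C_t, drift_from_abc = drift_abc(S, M, OO, QQ)
gap = abs(drift_direct(S, M, OO, QQ) - drift_from_abc)
```

With $\Omega\Omega^\top = \beta Q^\top Q$ the coefficient reduces to $B_t = -\frac{\beta - 1}{2}\Big(\frac{Q_{11}^2 + Q_{21}^2}{\Sigma^{11}_t} + \frac{Q_{12}^2 + Q_{22}^2}{\Sigma^{22}_t}\Big)$, which is negative whenever $\beta > 1$.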

    [9] Notice that the diffusion term and both the expressions for $B_t$ and $C_t$ are different from the ones obtained by Buraschi et al. (2006).


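    From the definition $\Omega\Omega^\top = \beta Q^\top Q$ and the Gindikin condition we deduce that $B_t$ is negative. As a by-product, we easily deduce the instantaneous covariation between the Wishart element $\Sigma^{11}_t$ and the correlation process:

    $$d\langle\rho^{12}, \Sigma^{11}\rangle_t = \frac{1}{\sqrt{\Sigma^{11}_t\Sigma^{22}_t}}\,d\langle\Sigma^{11}, \Sigma^{12}\rangle_t - \frac{\rho^{12}_t}{2\Sigma^{11}_t}\,d\langle\Sigma^{11}, \Sigma^{11}\rangle_t - \frac{\rho^{12}_t}{2\Sigma^{22}_t}\,d\langle\Sigma^{11}, \Sigma^{22}\rangle_t$$

    $$= 2\sqrt{\frac{\Sigma^{11}_t}{\Sigma^{22}_t}}\,\big(1 - (\rho^{12}_t)^2\big)\,(Q_{11}Q_{12} + Q_{21}Q_{22})\,dt.$$

    Using the fact that $Q \in GL(n,\mathbb{R})$ [10], there exists a unique couple $(K, \tilde{Q}) \in O(n) \times P_n$ [11] such that $Q = K\tilde{Q}$. We refer to Faraut (2006) for basic results on matrix analysis. The law of $\Sigma_t$ being invariant by rotation of $Q$, we rewrite this covariation as

    $$d\langle\rho^{12}, \Sigma^{11}\rangle_t = 2\sqrt{\frac{\Sigma^{11}_t}{\Sigma^{22}_t}}\,\big(1 - (\rho^{12}_t)^2\big)\,\tilde{Q}_{12}\,\big(\tilde{Q}_{11} + \tilde{Q}_{22}\big)\,dt.$$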

    [10] $GL(n,\mathbb{R})$ is the linear group, the set of invertible matrices.

    [11] $O(n)$ stands for the orthogonal group, i.e. $O(n) = \{g \in GL(n,\mathbb{R})\ |\ g^\top g = I_n\}$, and $P_n$ is the set of symmetric positive definite matrices.


    [Figure 1: four histograms, with panels "Rho1", "Rho2", "Skew for asset 1" and "Skew for asset 2"; x-axis: Support, y-axis: Frequency.]

    Figure 1: Estimated $\rho$ (top) and skew (bottom) from DCC estimation of the WASC. Real figures are in red and DCC-based estimated ones are in blue.

    The figure displays the estimated $\rho$ parameter (top) and skew (bottom) in a two-asset case. We first simulated a WASC model in which $\rho = (.5, .5)$. We then estimated the DCC-based time varying covariance matrices. Finally the skew is estimated by using equations (7) and (8). Denoting the skew vector by $\rho_0 = (\rho_0^1, \rho_0^2)$, the vector $\rho$ is estimated by $\hat{\rho} = (Q^\top)^{-1}\,\mathrm{diag}\big(\sqrt{Q^\top Q}\big)\,\rho_0$. The top figures present in red the value of $\rho$ chosen for the simulation and in blue the histogram of the estimates obtained. The bottom figures present in red the empirical distribution of the skew estimated directly on the simulated WASC samples and in blue the empirical distribution of the DCC-based skew estimator.


    [Figure 2: two time-series panels, "Time Varying Volatilities" (y-axis: Volatility) and "Time Varying Correlation" (y-axis: Correlation); x-axis: Index, from 0 to 1000.]

    Figure 2: Time varying (simulated) volatilities (top) and correlations (bottom).

    This figure displays simulated volatilities and correlation in the two-dimensional case ($n = 2$). The simulation has been produced using the parameters used in the Monte Carlo experiments. Given $\Sigma_t$ the dynamic covariance matrix, the volatilities are $\sqrt{\Sigma^{11}_t}$ and $\sqrt{\Sigma^{22}_t}$. The correlation is obtained by computing $\Sigma^{12}_t / \sqrt{\Sigma^{11}_t\,\Sigma^{22}_t}$.


    [Figure 3: surface plot; horizontal axes: w and w, each from -300 to 300; vertical axis: Objective.]

    Figure 3: Integrand of the C-GMM estimation criterion.

    The figure displays the characteristic function with the integrated volatility presented in equation (29). The parameters used to compute this characteristic function are those used in the Monte Carlo experiments.


                            Daily                 Weekly
    Number of obs.      500       1500        500       1500

    M11      Bias       0.0133    0.0382      0.227     0.0706
             RMSE       1.5126    1.5213      1.529     1.6011
    M12      Bias      -0.04     -0.014      -0.048    -0.036
             RMSE       1.0136    1.0341      1.098     1.0896
    M21      Bias      -0.056    -0.04       -0.085    -0.036
             RMSE       0.981     1.0205      1.104     1.1101
    M22      Bias       0.1667    0.1784      0.070     0.0657
             RMSE       1.5378    1.5502      1.511     1.5778
    Q11      Bias       0.1077    0.1069      0.132     0.1166
             RMSE       0.088     0.0872      0.476     0.3568
    Q12      Bias       0.0309    0.0287      0.022     0.0208
             RMSE       0.0664    0.0675      0.554     0.4467
    Q21      Bias      -3E-04    -6E-04      -0.059    -0.008
             RMSE       0.0876    0.0862      0.553     0.371
    Q22      Bias       0.0792    0.0772      0.057     0.0574
             RMSE       0.0678    0.0661      0.455     0.4424
    β        Bias       0.0632   -0.066       0.044    -0.107
             RMSE       4.335     4.3472      4.546     4.4364
    ρ1       Bias       0.0055   -0.007       0.035     0.0268
             RMSE       0.4617    0.4913      0.641     0.8031
    ρ2       Bias      -0.004     0.0265      0.063     0.0445
             RMSE       0.5231    0.4734      0.653     0.6405

    Table 1: Results of the Monte Carlo experiments

    This table displays the results for the Monte Carlo simulations performed using the following parameters:

    0 =

    0.0225 0.0054

    0.0054 0.0144.M =

    5 33 5

    , =

    0.3 0.4

    , (62)

    Q =

    0.1133137 0.033358710.0000000 0.07954368

    , = 15. (63)

    Two different types of simulations are presented: one sample includes 500 daily observations and a second one uses 1500 daily observations, as in Carrasco et al. (2007).

                  SP500        FTSE         DAX          CAC 40
    Min.         -0.128129    -0.141420    -0.197775    -0.149149
    1st Qu.      -0.009940    -0.010858    -0.014283    -0.015173
    Median        0.002491     0.002101     0.003727     0.002117
    Mean          0.001535     0.001014     0.001523     0.001095
    3rd Qu.       0.013506     0.013424     0.019416     0.018025
    Max.          0.123746     0.135879     0.171546     0.166252

    Table 2: Descriptive statistics for the real dataset.

    The table summarizes the descriptive statistics for the available dataset. This dataset is made of the SP500, FTSE, DAX and CAC time series, on a weekly sampling frequency. The dataset starts on January 2nd 1990 and ends on June 30th 2007.


                  SP500/DAX   SP500/CAC   SP500/FTSE   DAX/CAC    FTSE/DAX   FTSE/CAC

    M11           -3.44       -2.94       -2.65        -3.22      -2.62      -2.61
    Std. Dev.     (0.239)     (0.217)     (0.309)      (0.15)     (0.158)    (0.181)
    M12            1.17        0.33        0.74         1.51       0.04       0.5
    Std. Dev.     (0.211)     (0.267)     (0.519)      (0.266)    (0.358)    (0.236)
    M21            0.22        0.96        1.79         1.51       0.74       0.76
    Std. Dev.     (0.469)     (0.372)     (0.491)      (0.168)    (0.153)    (0.162)
    M22           -2.87       -3.41       -3.04        -3.49      -3.53      -2.48
    Std. Dev.     (0.21)      (0.261)     (0.384)      (0.193)    (0.215)    (0.165)
    Q11           -0.05       -0.07       -0.01         0.06       0.06       0.12
    Std. Dev.     (0.01)      (0.005)     (0.008)      (0.004)    (0.005)    (0.003)
    Q12           -0.1         0.01       -0.08        -0.04      -0.04       0.04
    Std. Dev.     (0.012)     (0.007)     (0.007)      (0.004)    (0.004)    (0.003)
    Q21            0.07        0.02       -0.08        -0.08       0.09       0.01
    Std. Dev.     (0.007)     (0.005)     (0.007)      (0.007)    (0.009)    (0.004)
    Q22            0.07        0.12       -0.03        -0.1        0.08       0.06
    Std. Dev.     (0.014)     (0.005)     (0.01)       (0.004)    (0.005)    (0.003)
    β             10.65       12.25       12.7         11.34      10.74      10.87
    Std. Dev.     (0.733)     (0.776)     (1.236)      (0.664)    (0.738)    (0.638)
    ρ1             0.3         0.6         0.4          0.16      -0.2       -0.3
    Std. Dev.     (0.067)     (0.103)     (0.098)      (0.076)    (0.081)    (0.069)
    ρ2            -0.2        -0.4         0.28         0.4       -0.26      -0.81
    Std. Dev.