


Statistics, Vol. 44, No. 6, December 2010, 541–556

Multivariate measurement error models based on scale mixtures of the skew–normal distribution

V.H. Lachos^a*, F.V. Labra^a, H. Bolfarine^b and Pulak Ghosh^c

^a Departamento de Estatística, IMECC, Universidade Estadual de Campinas, Caixa Postal 6065, CEP 13083-859, Campinas, São Paulo, Brazil; ^b Departamento de Estatística, Universidade de São Paulo, São Paulo, Brazil; ^c Department of Quantitative Methods and Information Sciences, Faculty Block B-003, Indian Institute of Management, Bannerghatta Road, Bangalore, 560076, India

(Received 6 September 2007; final version received 7 July 2009)

Scale mixtures of the skew-normal (SMSN) distribution form a class of asymmetric thick-tailed distributions that includes the skew-normal (SN) distribution as a special case. The main advantage of this class of distributions is that its members are easy to simulate and have a convenient hierarchical representation, which facilitates easy implementation of the expectation-maximization algorithm for maximum-likelihood estimation. In this paper, we assume an SMSN distribution for the unobserved value of the covariates and a symmetric scale mixture of the normal distribution for the error term of the model. This provides a robust alternative to parameter estimation in multivariate measurement error models. Specific distributions examined include univariate and multivariate versions of the SN, skew-t, skew-slash and skew-contaminated normal distributions. The results and methods are applied to a real data set.

Keywords: EM algorithm; scale mixtures of the skew-normal distribution; Mahalanobis distance; measurement error models

1. Introduction

Linear regression is one of the most widely used statistical tools and has been subject to extensive applications in the literature for over a century. In many such applications, the explanatory variable (x) is not directly observed and thus is measured with error. There is a considerable amount of work on the problem of parameter estimation when the explanatory variable is measured with error; see, for instance, Cheng and Van Ness [1] and references therein. As in Cheng and Van Ness [1], instead of observing x, one observes X = x + u, where u is the measurement error. In this paper, we consider a classical approach to multivariate measurement error models (MEMs), where the response variable is an r-dimensional random vector (r > 1), with a single predictor subject to random measurement error. The MEM is a useful concept in many disciplines, including linear and nonlinear errors-in-variables regression models, factor analysis models, latent structural models, and simultaneous equations models.

*Corresponding author. Email: [email protected]

ISSN 0233-1888 print / ISSN 1029-4910 online. © 2010 Taylor & Francis. DOI: 10.1080/02331880903236926. http://www.informaworld.com


The MEM has also been extensively used in the problem of comparing measurement devices [2–4]. Barnett [2] used the MEM for the comparison of four combinations of two instruments and two operators for measuring vital capacity. Several other examples of the MEM in the medical area are reported in the literature, especially in [4,5]. Examples in agriculture are considered in Cheng and Van Ness [1], and examples in psychology and education were considered by Dunn [6]. All the aforementioned works were based on the assumption that the distributions of the random errors and the unobserved covariate are Gaussian. However, the normality assumption can be too restrictive and may suffer from a lack of robustness against departures from the normal distribution, which can have an important effect on the inferences [3,7]. Recently, Arellano-Valle et al. [8] have shown the advantage of using the skew-normal (SN) distribution in the context of the MEM.

In this paper, we extend the recent works by Bolfarine and Galea-Rojas [3] and Arellano-Valle et al. [8]. In particular, the unobserved value of the covariate (x) is assumed to follow a scale mixture of the skew-normal (SMSN) distribution [9], while the random errors are assumed to follow a scale mixture of the normal distribution (SMN) [10]. Combined, these extensions result in a flexible class of MEMs. Moreover, the SMSN class is a rich family of distributions that contains as special cases the SN, skew-t (ST), skew-slash (SSL) and skew-contaminated normal (SCN) distributions. All these distributions have heavier tails than the SN distribution and thus can be used for robust inference in many types of model.

The paper is organized as follows. In Section 2, we briefly discuss the SMSN distributions and some of their properties. In Section 3, we present the SMSN-MEM. In Section 4, we develop the expectation-maximization (EM) algorithm for maximum-likelihood (ML) estimation in the SMSN-MEM. The observed information matrix is analytically derived in Section 5. The proposed methodology is illustrated in Section 6 by analysing a real data set, and some concluding remarks are presented in Section 7.

2. Scale mixtures of the SN distribution

The idea of the SMSN distribution originated from an early work by Branco and Dey [9] (see also [11]), which included the SN distribution as a special case. In order to motivate our proposed methodology, we start with the definition of the SN distribution. A p × 1 random vector Y follows an SN distribution with p × 1 location vector μ, p × p positive-definite dispersion matrix Σ and p × 1 skewness parameter vector λ, denoted by Y ∼ SNp(μ, Σ, λ), if its probability density function (pdf) is given by

f(y) = 2 φp(y | μ, Σ) Φ(λ^⊤ Σ^{-1/2}(y − μ)),    (1)

where φp(· | μ, Σ) denotes the pdf of the p-variate normal distribution with mean vector μ and covariance matrix Σ, Φ(·) represents the cumulative distribution function (cdf) of the standard univariate normal distribution and Σ^{-1/2} is such that Σ^{-1/2}Σ^{-1/2} = Σ^{-1}. A similar model has been introduced by Azzalini and Dalla-Valle [12], and was extensively studied by Azzalini and Capitanio [13].
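As a quick numerical companion to Equation (1), the following is a minimal R sketch of the SN density using only base R. The function names (dmvnorm0, dmsn) are ours and not from the paper; Σ^{-1/2} is obtained from a spectral decomposition, as required by the definition above.

```r
## Minimal sketch of the SN density in Equation (1), base R only (illustrative names).
dmvnorm0 <- function(y, mu, Sigma) {            # p-variate normal pdf
  p <- length(mu)
  Q <- drop(t(y - mu) %*% solve(Sigma) %*% (y - mu))
  exp(-0.5 * Q) / sqrt((2 * pi)^p * det(Sigma))
}
dmsn <- function(y, mu, Sigma, lambda) {        # skew-normal pdf, Eq. (1)
  e <- eigen(Sigma, symmetric = TRUE)
  Sigma.inv.sqrt <- e$vectors %*% diag(1 / sqrt(e$values), length(e$values)) %*% t(e$vectors)
  2 * dmvnorm0(y, mu, Sigma) * pnorm(drop(t(lambda) %*% Sigma.inv.sqrt %*% (y - mu)))
}
## Example: a bivariate SN density evaluated at one point
dmsn(c(0.5, 1), mu = c(0, 0), Sigma = diag(2), lambda = c(3, -1))
```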

Let Z = Y − μ. Since aZ ∼ SNp(0, a²Σ, λ) for all a > 0, the SMSN class of distributions can be defined through the p-dimensional random vector

Y = μ + κ^{1/2}(U) Z,    (2)

where μ is a location vector, Z ∼ SNp(0, Σ, λ), κ(·) is a weight function and U is a positive mixing random variable with cdf H(u; ν) and pdf h(u; ν), independent of Z, where ν is a parameter indexing the distribution of U.


From Equation (2), it can be shown that, given U = u, Y follows a multivariate SN distribution, SNp(μ, κ(u)Σ, λ). Hence, integrating out u, the pdf of Y is given by

f(y) = 2 ∫₀^∞ φp(y | μ, κ(u)Σ) Φ(κ^{-1/2}(u) λ^⊤ Σ^{-1/2}(y − μ)) dH(u).    (3)

For a p-dimensional random vector Y with pdf as in Equation (3), we write Y ∼ SMSNp(μ, Σ, λ; H). One particular case of this distribution is the SN distribution, which is obtained when H is degenerate, with κ(u) = 1 and u > 0. Also, when λ = 0, the SMSN distribution reduces to the SMN class, i.e. the class of SMN distributions represented by the pdf

f0(y) = ∫₀^∞ φp(y | μ, κ(u)Σ) dH(u).

We use the notation Y ∼ SMNp(μ, Σ; H) when Y has a distribution in the symmetric class of SMN distributions. Notice that when κ(u) = u^{-1} in Equation (2), the distribution of Y reduces to the SN/independent class discussed in Lachos et al. [14], and if λ = 0, the distribution of Y reduces to the normal/independent class discussed in Lange and Sinsheimer [15]. In the next subsections, we present some additional special cases of the SMSN distributions. We also compute the conditional moments defined by ur = E[κ^{-r}(U) | y] and ηr = E[κ^{-r/2}(U) WΦ(κ^{-1/2}(U)A) | y], where A = λ^⊤Σ^{-1/2}(y − μ) and WΦ(x) = φ1(x)/Φ(x), with x ∈ R. These moments will be useful in the implementation of the EM algorithm and in the estimation of the latent variable x. Other members of the SMSN class can be found in Branco and Dey [9].
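The mixing representation (2) also makes the SMSN class easy to simulate, as claimed in the abstract. The sketch below, with names of our own choosing, draws from the skew-t member (κ(u) = 1/u, U ∼ Gamma(ν/2, ν/2)); the SN vector Z is generated from the usual convolution form Z = Σ^{1/2}(δ|T0| + (I − δδ^⊤)^{1/2} T1), δ = λ/(1 + λ^⊤λ)^{1/2}, which is an assumption taken from the standard skew-normal literature rather than stated at this point of the paper.

```r
## Sketch: draws from the SMSN class via Equation (2), skew-t case (illustrative only).
msqrt <- function(M) {                         # symmetric square root of a PSD matrix
  e <- eigen(M, symmetric = TRUE)
  e$vectors %*% diag(sqrt(pmax(e$values, 0)), nrow(M)) %*% t(e$vectors)
}
rsmsn.st <- function(n, mu, Sigma, lambda, nu) {
  p <- length(mu)
  delta <- lambda / sqrt(1 + sum(lambda^2))
  A <- msqrt(Sigma)
  B <- msqrt(diag(p) - delta %*% t(delta))
  Y <- matrix(NA_real_, n, p)
  for (i in 1:n) {
    u <- rgamma(1, shape = nu / 2, rate = nu / 2)   # mixing variable U
    z <- A %*% (delta * abs(rnorm(1)) + B %*% rnorm(p))   # Z ~ SN_p(0, Sigma, lambda)
    Y[i, ] <- mu + z / sqrt(u)                      # kappa(u)^{1/2} = u^{-1/2}
  }
  Y
}
set.seed(1); Y <- rsmsn.st(500, mu = c(0, 0), Sigma = diag(2), lambda = c(4, 0), nu = 5)
```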

2.1. Multivariate ST distribution

The multivariate ST distribution with ν degrees of freedom is denoted by STp(μ, Σ, λ; ν) and can be derived from Equation (3) when κ(u) = 1/u, with U distributed as Gamma(ν/2, ν/2), u > 0 and ν > 0. The pdf of Y has the following form:

f(y) = 2 tp(y | μ, Σ; ν) T( ((p + ν)/(d + ν))^{1/2} A; ν + p ),   y ∈ R^p,

where d = (y − μ)^⊤Σ^{-1}(y − μ) is the Mahalanobis distance, and tp(· | μ, Σ; ν) and T(·; ν) denote, respectively, the pdf of the p-variate Student-t distribution and the cdf of the standard univariate t-distribution. A particular case of the ST distribution is the skew-Cauchy distribution, obtained when ν = 1. Also, when ν → ∞, one obtains the SN distribution. From Proposition 1 in Lachos et al. [14], it follows that the conditional moments are given by

ur = (f0(y)/f(y)) [2^{r+1} Γ((ν + p + 2r)/2)(ν + d)^{-r} / Γ((ν + p)/2)] T( ((ν + p + 2r)/(ν + d))^{1/2} A; ν + p + 2r )

and

ηr = (f0(y)/f(y)) [2^{(r+1)/2} Γ((ν + p + r)/2) / (π^{1/2} Γ((ν + p)/2))] (ν + d)^{(ν+p)/2} / (ν + d + A²)^{(ν+p+r)/2}.

Recent applications of the ST distribution to robust estimation can be found in Azzalini and Genton [16].
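For concreteness, the ST density above can be evaluated directly with base R, writing out the p-variate Student-t pdf from its standard formula and using pt() for the cdf T(·; ν + p). The helper name dmst below is ours and the code is a sketch, not the authors' implementation.

```r
## Sketch of the multivariate ST density of Section 2.1 (base R only, illustrative).
dmst <- function(y, mu, Sigma, lambda, nu) {
  p <- length(mu)
  d <- drop(t(y - mu) %*% solve(Sigma) %*% (y - mu))      # Mahalanobis distance
  e <- eigen(Sigma, symmetric = TRUE)
  Sig.inv.sqrt <- e$vectors %*% diag(1 / sqrt(e$values), p) %*% t(e$vectors)
  A <- drop(t(lambda) %*% Sig.inv.sqrt %*% (y - mu))
  tp <- gamma((nu + p) / 2) / (gamma(nu / 2) * (nu * pi)^(p / 2) * sqrt(det(Sigma))) *
        (1 + d / nu)^(-(nu + p) / 2)                      # p-variate Student-t pdf
  2 * tp * pt(sqrt((p + nu) / (d + nu)) * A, df = nu + p)
}
dmst(c(1, 2), mu = c(0, 0), Sigma = diag(2), lambda = c(2, 1), nu = 4)
```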


2.2. Multivariate SSL distribution

Another SMSN distribution, termed the multivariate SSL distribution, is denoted by SSLp(μ, Σ, λ; ν). A multivariate SSL distribution is obtained when κ(u) = 1/u and the distribution of U is Beta(ν, 1), with 0 < u < 1 and ν > 0. The pdf is given by

f(y) = 2ν ∫₀¹ u^{ν−1} φp(y | μ, u^{-1}Σ) Φ(u^{1/2}A) du,   y ∈ R^p.

The SSL distribution reduces to the SN distribution when ν → ∞. The conditional moments ur and ηr for the SSL distribution are

ur = (f0(y)/f(y)) [2 Γ((2ν + p + 2r)/2) / Γ((2ν + p)/2)] (2/d)^r [P1((2ν + p + 2r)/2, d/2) / P1((2ν + p)/2, d/2)] E{Φ(S^{1/2}A)}

and

ηr = (f0(y)/f(y)) [2^{(r+1)/2} Γ((2ν + p + r)/2) / (Γ((2ν + p)/2) π^{1/2})] [d^{(2ν+p)/2} / (d + A²)^{(2ν+p+r)/2}] [P1((2ν + p + r)/2, (d + A²)/2) / P1((2ν + p)/2, d/2)],

where Px(a, b) denotes the cdf of the Gamma(a, b) distribution evaluated at x and S ∼ Gamma((2ν + p + 2r)/2, d/2) I(0,1). Applications of the SSL distribution can be found in Wang and Genton [17].
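Unlike the ST and SCN cases, the SSL density involves a one-dimensional mixing integral with no closed form; the sketch below (our own helper name dmssl, illustrative parameter values) evaluates it numerically with integrate().

```r
## Sketch of the multivariate SSL density of Section 2.2, via numerical integration.
dmssl <- function(y, mu, Sigma, lambda, nu) {
  p <- length(mu)
  d <- drop(t(y - mu) %*% solve(Sigma) %*% (y - mu))
  e <- eigen(Sigma, symmetric = TRUE)
  A <- drop(t(lambda) %*% (e$vectors %*% diag(1 / sqrt(e$values), p) %*% t(e$vectors)) %*% (y - mu))
  cte <- (2 * pi)^(-p / 2) / sqrt(det(Sigma))
  integrand <- function(u)            # u^(nu-1) * phi_p(y | mu, Sigma/u) * Phi(sqrt(u) A)
    u^(nu - 1) * cte * u^(p / 2) * exp(-u * d / 2) * pnorm(sqrt(u) * A)
  2 * nu * integrate(integrand, lower = 0, upper = 1)$value
}
dmssl(c(1, 2), mu = c(0, 0), Sigma = diag(2), lambda = c(2, 1), nu = 2)
```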

2.3. Multivariate SCN distribution

The multivariate SCN distribution is denoted by SCNp(μ, Σ, λ; ν, γ), with 0 ≤ ν ≤ 1 and 0 < γ ≤ 1. Here, κ(u) = 1/u and U is a discrete random variable taking one of two states. The pdf of U, given the parameter vector ν = (ν, γ)^⊤, is

h(u; ν) = ν I(u=γ) + (1 − ν) I(u=1),   0 < ν < 1, 0 < γ ≤ 1.

It follows that

f(y) = 2{ν φp(y | μ, γ^{-1}Σ) Φ(γ^{1/2}A) + (1 − ν) φp(y | μ, Σ) Φ(A)}.

The parameter ν can be interpreted as the proportion of outliers, while γ may be interpreted as a scale factor. The SCN distribution reduces to the SN distribution when γ = 1. In this case, we have

ur = (2/f(y)) {ν γ^r φp(y | μ, γ^{-1}Σ) Φ(γ^{1/2}A) + (1 − ν) φp(y | μ, Σ) Φ(A)}

and

ηr = (2/f(y)) {ν γ^{r/2} φp(y | μ, γ^{-1}Σ) φ1(γ^{1/2}A) + (1 − ν) φp(y | μ, Σ) φ1(A)}.
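Since the SCN density is a finite two-component mixture, it is straightforward to evaluate in closed form; the following base-R sketch (our helper name dmscn, illustrative arguments) mirrors the expression for f(y) above.

```r
## Sketch of the multivariate SCN density of Section 2.3 (illustrative).
dmscn <- function(y, mu, Sigma, lambda, nu, gama) {
  p <- length(mu)
  e <- eigen(Sigma, symmetric = TRUE)
  Sig.inv.sqrt <- e$vectors %*% diag(1 / sqrt(e$values), p) %*% t(e$vectors)
  A <- drop(t(lambda) %*% Sig.inv.sqrt %*% (y - mu))
  d <- drop(t(y - mu) %*% solve(Sigma) %*% (y - mu))
  phi.p <- function(dd, detS) exp(-dd / 2) / sqrt((2 * pi)^p * detS)  # normal pdf via Mahalanobis
  2 * (nu * phi.p(gama * d, det(Sigma / gama)) * pnorm(sqrt(gama) * A) +
       (1 - nu) * phi.p(d, det(Sigma)) * pnorm(A))
}
dmscn(c(1, 2), mu = c(0, 0), Sigma = diag(2), lambda = c(2, 1), nu = 0.3, gama = 0.3)
```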

3. The model

Let Xi be the observed value of the covariate for unit i, yij be the jth observed response in unit i and xi be the unobserved (true) covariate value for unit i (for i = 1, . . . , n and j = 1, . . . , r).


Following Barnett [2], we write the multivariate MEM as

Xi = xi + ui    (4)

and

Yi = α + βxi + ei,    (5)

where Yi = (yi1, . . . , yir)^⊤ is the vector of responses for the ith experimental unit, ei = (ei1, . . . , eir)^⊤ is a random vector of measurement errors of dimension r, and α = (α1, . . . , αr)^⊤ and β = (β1, . . . , βr)^⊤ are parameter vectors of dimension r. Let εi = (ui, ei^⊤)^⊤ and Zi = (Xi, Yi^⊤)^⊤; then the model defined by Equations (4) and (5) can be written as

Zi = a + bxi + εi = a + Bri,   i = 1, . . . , n,    (6)

where a = (0, α^⊤)^⊤ and b = (1, β^⊤)^⊤ are p × 1 vectors, with p = r + 1, B = [b; Ip] is a p × (p + 1) matrix and ri = (xi, εi^⊤)^⊤. Thus, from Equation (6), the distribution of Zi becomes specified once the distribution of ri is specified. In order to obtain a robust estimation of the parameters, we assume that

ri = (xi, εi^⊤)^⊤  iid∼  SMSN_{p+1}( (μx, 0^⊤)^⊤, D(φx, φ), (λx, 0^⊤)^⊤; H ),   i = 1, . . . , n,    (7)

where D(φx, φ) = diag(φx, φ1, . . . , φp), with φ = (φ1, . . . , φp)^⊤. The above model will be called the structural SMSN-MEM. From Equation (2), this formulation implies that

(xi, εi^⊤)^⊤ | Ui = ui ∼ SN_{p+1}( (μx, 0^⊤)^⊤, κ(ui)D(φx, φ), (λx, 0^⊤)^⊤ ),    (8)

Ui ∼ H(ui; ν),   i = 1, . . . , n.    (9)

Since the scale matrix of the SN distribution defined in Equation (8) is diagonal and the skewness parameter has all but its first element equal to zero, it follows from Proposition 6 of Azzalini and Capitanio [13] that, conditional on Ui, εi and xi are independent, with

εi | Ui = ui  ind∼  Np(0, κ(ui)D(φ))   and   xi | Ui = ui  ind∼  SN1(μx, κ(ui)φx, λx).

Also, marginally, εi ∼ SMNp(0, D(φ); H) and xi ∼ SMSN1(μx, φx, λx; H). Since, for each i, εi and xi are indexed by the same scale mixing factor Ui, they are not independent in general. Independence corresponds to the case where H is degenerate, with κ(Ui) = 1, so that the SMSN-MEM reduces to the SN-MEM described in Arellano-Valle et al. [8]. However, as stated in Equations (8) and (9), conditional on Ui, εi and xi are independent for each i = 1, . . . , n, which implies that εi and xi are uncorrelated, since Cov(εi, xi) = E[E(εi xi | Ui)] = 0.
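To make the structural model concrete, the sketch below generates data from Equations (4)–(7) using the hierarchical form (8)–(9), for the skew-t member (κ(u) = 1/u, Ui ∼ Gamma(ν/2, ν/2)). The function name and the parameter values are illustrative choices of ours, not the authors' code; the mixing variable is called w to avoid a clash with the measurement error ui in Equation (4).

```r
## Sketch: simulated data from the structural SMSN-MEM (skew-t version), illustrative only.
sim.smsn.mem <- function(n, alpha, beta, phi, mu.x, phi.x, lambda.x, nu) {
  r <- length(beta); p <- r + 1
  delta.x <- lambda.x / sqrt(1 + lambda.x^2)
  Z <- matrix(NA_real_, n, p)
  for (i in 1:n) {
    w  <- rgamma(1, nu / 2, rate = nu / 2)          # mixing variable U_i
    k  <- 1 / w                                     # kappa(U_i)
    xi <- mu.x + sqrt(k * phi.x) *
          (delta.x * abs(rnorm(1)) + sqrt(1 - delta.x^2) * rnorm(1))   # x_i ~ SN_1 given U_i
    ei <- rnorm(p, sd = sqrt(k * phi))              # (u_i, e_i1, ..., e_ir) given U_i
    Z[i, ] <- c(xi + ei[1], alpha + beta * xi + ei[-1])   # (X_i, Y_i^T)
  }
  Z
}
set.seed(10)
Z <- sim.smsn.mem(n = 100, alpha = c(0, 0, 0, 1.5), beta = c(0.9, 1, 1.1, 1.1),
                  phi = rep(1, 5), mu.x = 4, phi.x = 36, lambda.x = 4, nu = 6)
```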

The errors εi are related to measurement errors and so are expected to be symmetrically distributed. The skewness parameter λx incorporates asymmetry in the latent variable xi and consequently in the observed quantities Zi, which will be shown to have, marginally, a multivariate SMSN distribution. If λx = 0, then the asymmetric model reduces to the symmetric MEM based on SMN distributions. Classical inference for the parameter vector θ = (α^⊤, β^⊤, φ^⊤, μx, φx, λx)^⊤ of the MEM is usually based on the marginal distribution of Zi [18], which is given in the following proposition.


Proposition 1  Under the structural SMSN-MEM defined in Equations (6)–(7), the marginal distribution of Zi is given by

f(zi | θ) = 2 ∫₀^∞ φp(zi | μ, κ(ui)Σ) Φ(κ^{-1/2}(ui) λ̄x^⊤ Σ^{-1/2}(zi − μ)) dH(ui),    (10)

i.e. Zi iid∼ SMSNp(μ, Σ, λ̄x; H), for i = 1, . . . , n, where

μ = a + bμx,   Σ = φx bb^⊤ + D(φ)   and   λ̄x = λx φx Σ^{-1/2} b / (φx + λx² Λx)^{1/2},

with Λx = φx/c and c = 1 + φx b^⊤ D^{-1}(φ) b.

Proof  The proof follows from Equations (6) and (7) by using Proposition 5.4 of Branco and Dey [9].  □

It follows that the log-likelihood function for θ, given the observed sample z = (z1^⊤, . . . , zn^⊤)^⊤, is

ℓ(θ) = Σ_{i=1}^{n} ℓi(θ),    (11)

where ℓi(θ) = log 2 − (p/2) log 2π − (1/2) log |Σ| + log Ki, with

Ki = Ki(θ) = ∫₀^∞ κ^{-p/2}(ui) exp( −(1/2) κ^{-1}(ui) di ) Φ( κ^{-1/2}(ui) Ai ) dH(ui),

μ, Σ and λ̄x as in Proposition 1, di = (zi − μ)^⊤ Σ^{-1}(zi − μ) and Ai = λ̄x^⊤ Σ^{-1/2}(zi − μ) = Ax ai, with

Ax = λx Λx / (φx + λx² Λx)^{1/2}   and   ai = (zi − μ)^⊤ D^{-1}(φ) b.

The result presented in Proposition 1 facilitates easy implementation of the model with standard optimization routines and existing statistical software. The asymptotic covariance matrix of the ML estimators can be estimated by using the Hessian matrix, which can be computed numerically using, for instance, the optim routine in the R platform. A disadvantage of direct maximization of the log-likelihood function is that it may not converge unless good starting values are used [8, p. 278]. Thus, we use the EM algorithm [19] for parameter estimation, which is quite insensitive to the starting values and is a powerful computational tool. The EM algorithm is stable and straightforward to implement, since its iterations converge monotonically and no second derivatives are required. In this paper, we use the EM algorithm for parameter estimation via a simple modification called the ECM algorithm [20]. A key feature of this EM-type algorithm (ECM) is that it preserves the stability of the EM algorithm with its monotone convergence property.
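To illustrate the direct-maximization route mentioned above, the sketch below codes the marginal log-likelihood for the SN special case of Proposition 1 (H degenerate, κ(u) = 1, so Ki = exp(−di/2)Φ(Ai)) and passes it to optim(). The function name, parameter layout and starting values are our own illustrative choices, not the authors' implementation; variance components are optimized on the log scale to keep them positive.

```r
## Sketch: direct ML for the SN-MEM marginal likelihood of Proposition 1 (illustrative).
sn.mem.loglik <- function(par, Z) {
  p <- ncol(Z); r <- p - 1; n <- nrow(Z)
  alpha <- par[1:r]; beta <- par[r + 1:r]
  phi   <- exp(par[2 * r + 1:p])                 # D(phi) = diag(phi_1, ..., phi_p)
  mu.x  <- par[2 * r + p + 1]; phi.x <- exp(par[2 * r + p + 2]); lambda.x <- par[2 * r + p + 3]
  a <- c(0, alpha); b <- c(1, beta)
  mu    <- a + b * mu.x
  Sigma <- phi.x * tcrossprod(b) + diag(phi)
  cc    <- 1 + phi.x * sum(b^2 / phi)            # c = 1 + phi_x b' D(phi)^{-1} b
  Lx    <- phi.x / cc                            # Lambda_x
  Ax    <- lambda.x * Lx / sqrt(phi.x + lambda.x^2 * Lx)
  Sinv  <- solve(Sigma)
  res   <- sweep(Z, 2, mu)                       # z_i - mu
  d     <- rowSums((res %*% Sinv) * res)         # Mahalanobis distances d_i
  A     <- Ax * drop(res %*% (b / phi))          # A_i = A_x a_i
  sum(log(2) - (p / 2) * log(2 * pi) - 0.5 * log(det(Sigma)) - d / 2 + pnorm(A, log.p = TRUE))
}
## start <- c(rep(0, r), rep(1, r), rep(0, p), mean(Z[, 1]), log(var(Z[, 1])), 0)
## fit <- optim(start, sn.mem.loglik, Z = Z, method = "BFGS", hessian = TRUE,
##              control = list(fnscale = -1))    # fnscale = -1: maximize; Hessian gives SEs
```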

4. The EM algorithm

In this section, we develop the ECM algorithm for ML estimation of the SMSN-MEM. A key feature of this model is that it can be formulated in a flexible hierarchical representation that is useful for theoretical derivations.


From Equation (2) and the marginal stochastic representation of an SN random vector given in Arellano-Valle et al. [21], it follows that

Zi | xi, Ui = ui  ind∼  Np(a + bxi, κ(ui)D(φ)),    (12)

xi | Ti = ti, Ui = ui  ind∼  N1(μx + τx ti, κ(ui)νx²),    (13)

Ti | Ui = ui  iid∼  HN1(0, κ(ui)),    (14)

Ui  iid∼  H(ui; ν),   i = 1, . . . , n,    (15)

where HN1(0, σ²) denotes the half-N1(0, σ²) distribution, νx² = φx(1 − δx²) and τx = φx^{1/2} δx, with δx = λx/(1 + λx²)^{1/2}. Following the suggestions of Lange et al. [22] and Lucas [23], we take the value of ν to be known.

Let z = (z1^⊤, . . . , zn^⊤)^⊤, x = (x1, . . . , xn)^⊤, u = (u1, . . . , un)^⊤ and t = (t1, . . . , tn)^⊤, and let θ^(k) = (α^(k)⊤, β^(k)⊤, φ^(k)⊤, μx^(k), φx^(k), λx^(k))^⊤ denote the estimate of θ at the kth iteration. It follows from Equations (12)–(15) that the complete-data log-likelihood function given zc = (z, x, t, u) is of the form

ℓc(θ | zc) = −(n/2) log |D(φ)| − (1/2) Σ_{i=1}^{n} κ^{-1}(ui)(zi − a − bxi)^⊤ D^{-1}(φ)(zi − a − bxi)
             − (n/2) log νx² − (1/(2νx²)) Σ_{i=1}^{n} κ^{-1}(ui)(xi − μx − τx ti)² + C,

where C is a constant independent of the parameter vector θ. Given the current estimate θ = θ^(k), the E-step calculates Q(θ | θ^(k)) = E[ℓc(θ | zc) | θ^(k), z] = Σ_{i=1}^{n} Qi(θ | θ^(k)), which involves

ûi^(k) = E[κ^{-1}(Ui) | θ^(k), zi],   ûti^(k) = E[κ^{-1}(Ui) ti | θ^(k), zi],   ût2i^(k) = E[κ^{-1}(Ui) ti² | θ^(k), zi],
ûxi^(k) = E[κ^{-1}(Ui) xi | θ^(k), zi],   ûx2i^(k) = E[κ^{-1}(Ui) xi² | θ^(k), zi]   and   ûtxi^(k) = E[κ^{-1}(Ui) ti xi | θ^(k), zi].

All these expressions can be readily evaluated as

ûti^(k) = ûi^(k) μ_{Ti}^(k) + M_T^(k) η̂i^(k),
ût2i^(k) = ûi^(k) μ_{Ti}^{2(k)} + M_T^{2(k)} + M_T^(k) μ_{Ti}^(k) η̂i^(k),
ûtxi^(k) = ri^(k) ûti^(k) + s^(k) ût2i^(k),
ûxi^(k) = ri^(k) ûi^(k) + s^(k) ûti^(k),
ûx2i^(k) = T_{x2}^(k) + ri^{2(k)} ûi^(k) + 2 ri^(k) s^(k) ûti^(k) + s^{2(k)} ût2i^(k),

for i = 1, . . . , n. Omitting the superscript (k), we have

M_T² = [1 + τx² b^⊤(D(φ) + νx² bb^⊤)^{-1} b]^{-1},
μ_{Ti} = τx M_T² b^⊤(D(φ) + νx² bb^⊤)^{-1}(zi − a − bμx),
T_{x2} = νx² [1 + νx² b^⊤ D^{-1}(φ) b]^{-1},
ri = μx + T_{x2} b^⊤ D^{-1}(φ)(zi − a − bμx)   and   s = τx(1 − T_{x2} b^⊤ D^{-1}(φ) b),

and, as defined in Section 2,

η̂_{1i} = η̂i = E[ κ^{-1/2}(Ui) WΦ( κ^{-1/2}(Ui) μ_{Ti}/M_T ) | θ, zi ].


In each step, the conditional expectations ûi and η̂i can be easily derived from the results given in Section 2. For the ST and SCN distributions, we have computationally attractive expressions, and thus these can be easily implemented. However, for the SSL distribution, a Monte Carlo (MC) integration may be employed, which yields the so-called MC-EM algorithm, since at the kth iteration the conditional moments ûi^(k) and η̂i^(k) need to be approximated by MC integration.
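One way of carrying out the MC step for the SSL case is sketched below: the factor E{Φ(S^{1/2}A)} appearing in the SSL conditional moments of Section 2.2, with S a Gamma variable truncated to (0, 1), is approximated by inverse-cdf sampling. The helper name E.phi.S and the number of draws are our own assumptions.

```r
## Sketch of the Monte Carlo approximation needed for the SSL conditional moments.
E.phi.S <- function(A, d, nu, p, r, M = 5000) {
  shape <- (2 * nu + p + 2 * r) / 2
  u <- runif(M) * pgamma(1, shape = shape, rate = d / 2)   # uniform on (0, F(1))
  s <- qgamma(u, shape = shape, rate = d / 2)              # Gamma draws truncated to (0, 1)
  mean(pnorm(sqrt(s) * A))                                 # MC estimate of E{Phi(S^{1/2} A)}
}
set.seed(2); E.phi.S(A = 1.2, d = 3.5, nu = 2, p = 5, r = 1)
```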

The conditional maximization (CM) steps then conditionally maximize Q(θ | θ^(k)) with respect to θ and obtain a new estimate θ^(k+1), as described below:

α^(k+1) = z̄u^(k) − x̄u^(k) β^(k),   β^(k+1) = Σ_{i=1}^{n} ûxi^(k)(zi − z̄u^(k)) / ( Σ_{i=1}^{n} ûx2i^(k) − n ū^(k) x̄u^{2(k)} ),

φ1^(k+1) = (1/n) Σ_{i=1}^{n} ( ûi^(k) Xi² − 2 ûxi^(k) Xi + ûx2i^(k) ),

φ_{j+1}^(k+1) = (1/n) Σ_{i=1}^{n} ( ûi^(k) yij² + ûi^(k) αj^{2(k)} + βj^{2(k)} ûx2i^(k) − 2 ûi^(k) αj^(k) yij − 2 yij βj^(k) ûxi^(k) + 2 αj^(k) βj^(k) ûxi^(k) ),   j = 1, . . . , r,

μx^(k+1) = x̄u^(k) − τx^(k) t̄u^(k),   νx^{2(k+1)} = (1/n) Σ_{i=1}^{n} ( ûx2i^(k) − μx^(k) ûxi^(k) ) − τx^(k) (1/n) Σ_{i=1}^{n} ûtxi^(k),

τx^(k+1) = Σ_{i=1}^{n} ( ûtxi^(k) − x̄u^(k) ûti^(k) ) / Σ_{i=1}^{n} ( ût2i^(k) − t̄u^(k) ûti^(k) ),

λx^(k+1) = τx^(k+1) / νx^(k+1)   and   φx^(k+1) = τx^{2(k+1)} + νx^{2(k+1)},

where

z̄u^(k) = Σ_{i=1}^{n} ûi^(k) zi / Σ_{i=1}^{n} ûi^(k),   x̄u^(k) = Σ_{i=1}^{n} ûxi^(k) / Σ_{i=1}^{n} ûi^(k),   t̄u^(k) = Σ_{i=1}^{n} ûti^(k) / Σ_{i=1}^{n} ûi^(k)   and   ū^(k) = (1/n) Σ_{i=1}^{n} ûi^(k).

The above algorithm is repeated until a suitable convergence rule is satisfied, e.g. until ‖θ^(k+1) − θ^(k)‖ is sufficiently small. An EM algorithm is often criticized because it tends to get stuck at local modes. A convenient way to circumvent this problem is to run the EM iterations from a variety of starting values. If several modes exist, one can find the global mode by comparing their relative masses and log-likelihood values. Note that when κ(Ui) = 1 for i = 1, . . . , n (a degenerate random variable), the equations in the M-step reduce to the equations obtained under the SN distribution, and when λx = 0 (or τx = 0) the M-step reduces to the equations obtained by Bolfarine and Galea-Rojas [18]. Moreover, when κ(u) = 1/u, Ui ∼ Gamma(ν/2, ν/2) and λx = 0, the M-step reduces to the equations obtained by Bolfarine and Galea-Rojas [3].
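The iteration scheme and stopping rule just described can be organized as in the generic skeleton below; cm.step stands for one complete E-step plus CM-step cycle and is a placeholder of ours, not the authors' routine, and the tolerance and the multi-start strategy are illustrative.

```r
## Skeleton of the ECM iteration with the convergence rule ||theta^(k+1) - theta^(k)||.
run.ecm <- function(theta0, cm.step, tol = 1e-6, max.iter = 5000) {
  theta <- theta0
  for (k in 1:max.iter) {
    theta.new <- cm.step(theta)                        # one E-step + CM-step cycle
    if (sqrt(sum((theta.new - theta)^2)) < tol) break  # stop when updates stabilize
    theta <- theta.new
  }
  theta.new
}
## Guard against local modes: run from several starting values and keep the fit
## with the largest log-likelihood, e.g.
## fits <- lapply(list(start1, start2, start3), run.ecm, cm.step = cm.step)
```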

Following Lin and Lee [24], we now consider an empirical Bayes estimation of the latent variables that is useful for estimating the xi. From model (12)–(15), the conditional distribution of xi given (zi, ui) belongs to the extended SN family of distributions, and its pdf is given by

f(xi | zi, ui) = φ1(xi | μx + Λx ai, κ(ui) Λx) Φ( κ^{-1/2}(ui) λx (xi − μx)/φx^{1/2} ) / Φ( κ^{-1/2}(ui) Ai ),


where Λx is given as in Proposition 1 and ai, Ai are given in Equation (11). It follows from [25, p. 374] that

E[xi | zi, ui] = μx + Λx ai + κ^{1/2}(ui) [Λx λx / (1 + λx² Λx)^{1/2}] WΦ(κ^{-1/2}(ui) Ai),   i = 1, . . . , n,

and since E[x | z] = E_U[ E[x | z, U] | z ], the minimum mean-square error estimator of xi is the conditional mean of xi given zi, which is given by

x̂i = E[xi | zi] = μx + Λx ai + [Λx λx / (1 + λx² Λx)^{1/2}] η̂_{−1 i},   i = 1, . . . , n.    (16)

Note that, if zi has an ST or an SCN distribution, we can obtain closed-form expressions for the expected values in Equation (16) from the results given in Section 2. In practice, the Bayes estimator of xi, denoted by x̂i, is obtained by substituting the ML estimate of θ into Equation (16).

5. The observed information matrix

From Equation (11), Proposition 1 and after some algebraic manipulations, we find that the log-likelihood function can be written as

ℓ(θ) = Σ_{i=1}^{n} ℓi(θ),    (17)

where ℓi(θ) = log 2 − (p/2) log 2π − (1/2) log |Σ| + log Ki. Thus, the matrix of second derivatives with respect to θ is given by

L = Σ_{i=1}^{n} ∂²ℓi(θ)/∂θ ∂θ^⊤ = −(n/2) ∂² log |Σ|/∂θ ∂θ^⊤ − Σ_{i=1}^{n} (1/Ki²) (∂Ki/∂θ)(∂Ki/∂θ^⊤) + Σ_{i=1}^{n} (1/Ki) ∂²Ki/∂θ ∂θ^⊤,

where

∂Ki/∂θ = Ii^φ((p + 1)/2) ∂Ai/∂θ − (1/2) Ii^Φ((p + 2)/2) ∂di/∂θ

and

∂²Ki/∂θ ∂θ^⊤ = (1/4) Ii^Φ((p + 4)/2) (∂di/∂θ)(∂di/∂θ^⊤) − (1/2) Ii^Φ((p + 2)/2) ∂²di/∂θ ∂θ^⊤
 − (1/2) Ii^φ((p + 3)/2) [ (∂Ai/∂θ)(∂di/∂θ^⊤) + (∂di/∂θ)(∂Ai/∂θ^⊤) ] − Ii^φ((p + 3)/2) Ai (∂Ai/∂θ)(∂Ai/∂θ^⊤)
 + Ii^φ((p + 1)/2) ∂²Ai/∂θ ∂θ^⊤,

with

Ii^Φ(w) = ∫₀^∞ κ^{-w}(ui) exp( −(1/2) κ^{-1}(ui) di ) Φ( κ^{-1/2}(ui) Ai ) dH(ui)

and

Ii^φ(w) = (1/(2π)^{1/2}) ∫₀^∞ κ^{-w}(ui) exp( −(1/2) κ^{-1}(ui)(di + Ai²) ) dH(ui).

Note that we can also write Ki = Ii^Φ(p/2). Direct substitution of H in the integrals above yields the following results for each distribution considered, viz.,


• ST

Ii^Φ(w) = [2^w ν^{ν/2} Γ(w + ν/2) / (Γ(ν/2)(ν + di)^{ν/2+w})] T( ((ν + 2w)/(di + ν))^{1/2} Ai; ν + 2w )

and

Ii^φ(w) = 2^w ν^{ν/2} Γ((ν + 2w)/2) / ( (2π)^{1/2} Γ(ν/2) (di + Ai² + ν)^{(ν+2w)/2} ).

• SSL

Ii^Φ(w) = [ν 2^{ν+w} Γ(ν + w) / di^{ν+w}] P1(ν + w, di/2) E{Φ(Si^{1/2} Ai)}

and

Ii^φ(w) = [ν 2^{ν+w} Γ(ν + w) / ( (2π)^{1/2} (di + Ai²)^{ν+w} )] P1(ν + w, (di + Ai²)/2),

where Si ∼ Gamma(ν + w, di/2) I(0,1).

• SCN

Ii^Φ(w) = (2π)^{1/2} { ν γ^{w−1/2} φ1(√di | 0, 1/γ) Φ(γ^{1/2} Ai) + (1 − ν) φ1(√di | 0, 1) Φ(Ai) }

and

Ii^φ(w) = ν γ^{w−1/2} φ1( (di + Ai²)^{1/2} | 0, 1/γ ) + (1 − ν) φ1( (di + Ai²)^{1/2} | 0, 1 ).

The derivatives of log |Σ|, di and Ai involve tedious but not complicated algebraic manipulations and are given in the appendix. Asymptotic confidence intervals and tests on the ML estimators can be obtained using this matrix.

6. Application

We illustrate the developed method with data from Chipkevitch et al. [4]. The data measure the testicular volume of 42 adolescents, in a fixed sequence, using five different techniques: ultrasound (US), a graphical method proposed by the authors (I), dimensional measurement (II), Prader orchidometer (III) and ring orchidometer (IV). The ultrasound approach is assumed to be the reference measurement device. Galea-Rojas et al. [26] analysed the same data set by fitting a normal-MEM (N-MEM) and recommended the use of a cube root transformation to better approximate normality. We revisit this data set with the aim of providing a better fit by using the SMSN distributions. In order to verify the existence of skewness in the latent variable, we first fit an ordinary N-MEM (normal) to the data. Figure 1 displays the histogram and the Q–Q plot of the empirical Bayes estimates of xi, computed using the result given in Equation (16) with λx = 0. These figures show that the latent variable is positively skewed, so a normal model does not fit the data well. Moreover, the Q–Q plot clearly supports the use of thick-tailed distributions.

We now re-analyse the data using the SMSN-MEM. We compare the four submodels of the SMSN class, viz. the SN-MEM, the ST-MEM, the SCN-MEM and the SSL-MEM. We choose the value of ν by maximizing the likelihood function, as illustrated in Figure 2(c). For the ST model, we found ν = 6; for the SSL, ν = 3; and for the SCN, ν = 0.3 and γ = 0.3. Table 1 contains the ML estimates of the parameters from the four models, together with their corresponding standard errors calculated via the observed information matrix given in the appendix.


Figure 1. Histogram and normal Q–Q plot of the empirical Bayes estimates of xi when normality is assumed for the Chipkevitch data.

Figure 2. Chipkevitch data set. Q–Q plots and simulated envelopes: (a) SN and (b) ST models; (c) plot of the profile log-likelihood for fitting an SCN model; (d) histogram of the reference device measurement with superimposed fitted SMSN densities.


Table 1. ML estimation results for fitting various mixture models on the Chipkevitch data.

Parameter         SN-MEM              ST-MEM              SCN-MEM             SSL-MEM
                  Estimate    SE      Estimate    SE      Estimate    SE      Estimate    SE
α1                  0.1022  0.5655      0.0426  0.4940      0.1559  0.5017      0.1226  0.5317
α2                 −0.0096  0.6216     −0.2472  0.5261     −0.1527  0.5227     −0.1716  0.5777
α3                  0.0483  0.6277      0.1110  0.5674      0.1464  0.5643      0.1027  0.5999
α4                  1.5391  0.6337      1.5444  0.5605      1.6404  0.5729      1.5784  0.5978
β1                  0.8838  0.0509      0.8990  0.0513      0.8911  0.0511      0.8887  0.0517
β2                  0.9495  0.0559      0.9866  0.0565      0.9782  0.0547      0.9754  0.0577
β3                  1.1419  0.0565      1.1537  0.0586      1.1540  0.0574      1.1466  0.0583
β4                  1.0826  0.0570      1.0957  0.0579      1.0885  0.0584      1.0864  0.0578
φ1                  1.3384  0.3714      0.9291  0.2893      0.7467  0.2240      0.8345  0.2508
φ2                  1.3284  0.3480      0.9538  0.2841      0.7972  0.2225      0.8405  0.2369
φ3                  1.6736  0.4322      0.9028  0.2960      0.7294  0.2105      0.8938  0.2849
φ4                  1.1578  0.3710      0.9481  0.3160      0.7845  0.2537      0.7998  0.2581
φ5                  1.4105  0.3994      1.0196  0.3385      0.9119  0.2761      0.9080  0.2783
μx                  3.9952  1.3958      4.1688  1.0890      4.2784  1.3844      4.1830  1.3096
σ²x                59.2857 21.5487     38.1512 14.3057     30.9914 13.6063     35.0036 13.3506
λx                  4.7842  4.7925      3.4300  2.4361      3.2404  2.8759      3.7562  3.2021
ν                        –       –           6       –         0.3       –           3       –
γ                        –       –           –       –         0.3       –           –       –
log-likelihood   −422.1628           −416.1776           −415.9791           −419.3461
AIC               876.3256            866.3552            867.9582            872.6922
BIC               874.227             864.1254            865.5972            870.4624

Note: SEs are the estimated asymptotic standard errors based on the observed information matrix given in the appendix.

The AIC and BIC model selection criteria [21] indicate that the heavy-tailed members of the SMSN class provide a better fit than the SN-MEM. In particular, the ST distribution fits the data better than the other three distributions. Although the intercept and slope estimates are similar in all four fitted models (Table 1), the standard errors under the heavy-tailed SMSN-MEMs are smaller than those under the SN model. This suggests that the three models with tails heavier than the SN seem to produce more accurate ML estimates. The estimates of the variance components are not comparable since they are on different scales. The Q–Q plots and envelopes shown in Figure 2(a) and (b) are based on the distribution of the Mahalanobis distance of Zi, which is χ²(5) for the SN distribution and 5F(5, ν) for the ST distribution, respectively. The lines in these figures represent the 5th percentile, the mean and the 95th percentile of 100 simulated points for each observation. These figures clearly show that the ST distribution provides a better fit to the data set than the SN (and the normal) distribution.

Figure 3. Relative changes in the ML estimates of β1 and φ1 from the SN-MEM (solid line) and the ST-MEM (dashed line) for different contaminations Δ.


The robustness of the ST model can be assessed by considering the influence of a single outlying observation on the ML estimates. In particular, we can assess how much the ML estimates of θ are influenced by a change of Δ units in a single observation Yik. We replace a single observation Yi1 by the contaminated value Yik(Δ) = Yik + Δ, re-estimate the parameters and record the relative change in the estimates, (θ̂(Δ) − θ̂)/θ̂, where θ̂ denotes the original estimate and θ̂(Δ) the estimate for the contaminated data. In this example, we contaminate a typical value as follows: for the first observation on subject i = 38, Δ is varied between −10 and 10. In Figure 3, we present the relative changes in the estimates of β1 and φ1, for different values of Δ, under the SN-MEM and the ST-MEM, respectively. As expected, the estimates from the ST-MEM are less affected by variations of Δ than those from the SN-MEM.

7. Final conclusion

In this work, we have developed a new class of multivariate SMSN-MEMs, with the SN-MEM and the elliptical class of scale-mixture-of-normal MEMs as special cases. A closed-form expression is derived for the likelihood function of the observed measurements, which can be maximized using existing statistical software. An EM-type algorithm is developed by exploiting the statistical properties of the distributions in the SMSN class. The observed information matrix is analytically derived, which allows the direct implementation of inference for this class of models. We showed that the SMSN-MEM with heavy tails seems to be more appropriate for fitting the Chipkevitch et al. [4] data set than the SN-MEM. MATLAB and R programs are available from the authors upon request. A number of directions for future work can be mentioned. Further research could address whether the differences among the postulated models are significant; as pointed out by one referee, the methodology discussed in Vuong [27] could be helpful for this purpose, and another very general model selection criterion that does not require alternative models to be nested can be developed based on Raftery [28]. However, a deeper investigation of these model selection criteria is beyond the scope of the present paper. Although the methodology can be conceptually extended to the case where ν is unknown, we faced some methodological and computational hurdles. When ν is unknown, a more efficient ECME algorithm [29] should be used, in which the usual M-step is replaced by a step that maximizes the restricted actual log-likelihood function. Unfortunately, for the SSL-MEM we encountered some difficulties (very slow convergence), since it involves an integral in the marginal likelihood (M-step), which complicates the practical use of the proposal. Another strategy that may speed up the convergence rate is to use the PX-EM algorithm [30]. However, the application of this algorithm is not straightforward for the SMSN-MEM class and thus needs further exploration.

Acknowledgements

The authors are grateful to two anonymous referees for their valuable comments and suggestions that led to a significant improvement of the paper. The first author acknowledges partial financial support from Fundação de Amparo à Pesquisa do Estado de São Paulo.

References

[1] C.L. Cheng and J.W. Van Ness, Statistical Regression with Measurement Error, Arnold, London, 1999.
[2] V.D. Barnett, Simultaneous pairwise linear structural relationships, Biometrics 25 (1969), pp. 129–142.


[3] H. Bolfarine and M. Galea-Rojas, On structural comparative calibration under a t-model, Comput. Stat. 11 (1996), pp. 63–85.
[4] E. Chipkevitch, R. Nishimura, D. Tu, and M. Galea-Rojas, Clinical measurement of testicular volume in adolescents: comparison of the reliability of 5 methods, J. Urol. 156 (1996), pp. 2050–2053.
[5] G. Kelly, The influence function in the errors in variables problem, Ann. Statist. 12 (1984), pp. 87–100.
[6] G. Dunn, Design and Analysis of Reliability: The Statistical Evaluation of Measurement Errors, Edward Arnold, New York, 1992.
[7] M. Galea-Rojas, H. Bolfarine, and F.V. Labra, Local influence in comparative calibration models under elliptical t-distribution, Biom. J. 47 (2005), pp. 691–706.
[8] R.B. Arellano-Valle, S. Ozan, H. Bolfarine, and V.H. Lachos, Skew normal measurement error models, J. Multivar. Anal. 98 (2005), pp. 265–281.
[9] M. Branco and D. Dey, A general class of multivariate skew-elliptical distribution, J. Multivar. Anal. 79 (2001), pp. 93–113.
[10] D.F. Andrews and C.L. Mallows, Scale mixtures of normal distributions, J. Roy. Statist. Soc. B 36 (1974), pp. 99–102.
[11] R.B. Arellano-Valle, M.D. Branco, and M.G. Genton, A unified view on skewed distributions arising from selections, Can. J. Stat. 34 (2006), pp. 581–601.
[12] A. Azzalini and A. Dalla-Valle, The multivariate skew-normal distribution, Biometrika 83 (1996), pp. 715–726.
[13] A. Azzalini and A. Capitanio, Statistical applications of the multivariate skew-normal distributions, J. Roy. Statist. Soc. B 61 (1999), pp. 579–602.
[14] V.H. Lachos, P. Ghosh, and R.B. Arellano-Valle, Likelihood based inference for skew-normal/independent linear mixed models, Statist. Sinica 20 (2010), in press.
[15] K.L. Lange and J.S. Sinsheimer, Normal/independent distributions and their applications in robust regression, J. Comput. Graph. Stat. 2 (1993), pp. 175–198.
[16] A. Azzalini and M.G. Genton, Robust likelihood methods based on the skew-t and related distributions, Int. Statist. Rev. 76 (2008), pp. 106–129.
[17] J. Wang and M. Genton, The multivariate skew-slash distribution, J. Statist. Plann. Inference 136 (2006), pp. 209–220.
[18] H. Bolfarine and M. Galea-Rojas, Maximum likelihood estimation of simultaneous pairwise linear structural relationships, Biom. J. 37 (1995), pp. 673–689.
[19] A.P. Dempster, N.M. Laird, and D.B. Rubin, Maximum likelihood from incomplete data via the EM algorithm, J. Roy. Statist. Soc. B 39 (1977), pp. 1–22.
[20] X.L. Meng and D.B. Rubin, Maximum likelihood estimation via the ECM algorithm: a general framework, Biometrika 80 (1993), pp. 267–278.
[21] R.B. Arellano-Valle, H. Bolfarine, and V.H. Lachos, Skew-normal linear mixed models, J. Data Sci. 3 (2005), pp. 415–438.
[22] K.L. Lange, J.A. Little, and M.G.J. Taylor, Robust statistical modeling using the t distribution, J. Amer. Statist. Assoc. 84 (1989), pp. 881–896.
[23] A. Lucas, Robustness of the Student t based M estimator, Commun. Stat. Theory Methods 26 (1997), pp. 1165–1182.
[24] T.I. Lin and J.C. Lee, Estimation and prediction in linear mixed models with skew-normal random effects for longitudinal data, Stat. Med. 27 (2008), pp. 1490–1507.
[25] N. Loperfido, Modeling maxima of longitudinal contralateral observations, Test 17 (2008), pp. 370–380.
[26] M. Galea-Rojas, H. Bolfarine, and M. de Castro, Local influence in comparative calibration models, Biom. J. 44 (2002), pp. 59–81.
[27] Q. Vuong, Likelihood ratio tests for model selection and non-nested hypotheses, Econometrica 57 (1989), pp. 307–333.
[28] A.E. Raftery, Bayesian model selection in social research (with discussion), Sociol. Methodol. 25 (1995), pp. 111–196.
[29] X.L. Meng and D. Van Dyk, The EM algorithm: an old folk-song sung to a fast new tune (with discussion), J. Roy. Statist. Soc. B 59 (1997), pp. 511–567.
[30] C.H. Liu, D.B. Rubin, and Y. Wu, Parameter expansion to accelerate EM: the PX-EM algorithm, Biometrika 85 (1998), pp. 755–770.

Appendix 1. The observed information matrix of SMSN-MEM

In this appendix, the first and second derivatives of log |Σ|, Ai and di are obtained.

• Ai. From Equation (11), it follows that

∂Ai/∂γ = Ax ∂ai/∂γ + ai ∂Ax/∂γ,

∂²Ai/∂γ ∂τ^⊤ = (∂Ax/∂γ)(∂ai/∂τ^⊤) + Ax ∂²ai/∂γ ∂τ^⊤ + (∂ai/∂γ)(∂Ax/∂τ^⊤) + ai ∂²Ax/∂γ ∂τ^⊤,


with γ = μx, α, β, φx, φ, λx, Ax = λx Λx/(φx + λx² Λx)^{1/2}, Λx = φx/c, ai = Xi^⊤ D^{-1}(φ) b, Xi = Zi − a − bμx, c = 1 + φx b^⊤ D^{-1}(φ) b, i = 1, . . . , n. Using standard results on vector derivatives, it follows that

∂Ax/∂γ = 0, γ = μx, α,   ∂Ax/∂β = −[(2c + λx²)/λx²] Ax³ D^{-1}(ψ)β,   ∂Ax/∂λx = [φx/(Λx² λx³)] Ax³,

∂Ax/∂φ = [(2c + λx²)/(2λx²)] Ax³ D(b) D^{-2}(φ) b,   ∂Ax/∂φx = [(2c + λx² − c²)/(2φx² λx²)] Ax³,

∂ai/∂μx = −b^⊤ D^{-1}(φ) b,   ∂ai/∂α = −D^{-1}(ψ)β,   ∂ai/∂β = D^{-1}(ψ) W2i − μx D^{-1}(ψ)β,

∂ai/∂φx = 0,   ∂ai/∂φ = −D(b) D^{-2}(φ) Xi,   ∂ai/∂λx = 0,

∂²Ax/∂β ∂β^⊤ = −[ (4φx/λx²) Ax³ − (3(2c + λx²)²/λx⁴) Ax⁵ ] M1 − [(2c + λx²)/λx²] Ax³ D^{-1}(ψ),

∂²Ax/∂β ∂φx = −[ (2(c − 1)/(λx² φx)) Ax³ + (3(2c + λx²)(2c + λx² − c²)/(2λx⁴ φx²)) Ax⁵ ] D^{-1}(ψ)β,

∂²Ax/∂β ∂φ^⊤ = [ (2φx/λx²) Ax³ − (3(2c + λx²)²/(2λx⁴)) Ax⁵ ] D^{-1}(ψ)β b^⊤ D(b) D^{-2}(φ) + [(2c + λx²)/λx²] Ax³ I(p) D(b) D^{-2}(φ),

∂²Ax/∂β ∂λx = [φx Ax³/(λx⁵ Λx²)] ( −3Ax²(2c + λx²) + 4λx² Λx ) D^{-1}(ψ)β,

∂²Ax/∂φx ∂φx = −[(λx² + 1)/(λx² φx³)] Ax³ + [3(2c + λx² − c²)²/(4λx⁴ φx⁴)] Ax⁵,

∂²Ax/∂φx ∂φ^⊤ = [ ((c − 1)/(λx² φx)) Ax³ + (3(2c + λx²)(2c + λx² − c²)/(4λx⁴ φx²)) Ax⁵ ] b^⊤ D(b) D^{-2}(φ),

∂²Ax/∂φx ∂λx = [(c − 2)/(λx³ Λx φx)] Ax³ + [3(2c + λx² − c²)/(2λx⁵ Λx² φx)] Ax⁵,

∂²Ax/∂λx ∂λx = −[3φx/(λx⁴ Λx²)] Ax³ + [3φx²/(λx⁶ Λx⁴)] Ax⁵,

∂²Ax/∂φ ∂φ^⊤ = [ −(φx/λx²) Ax³ + (3(2c + λx²)²/(4λx⁴)) Ax⁵ ] D(b) D^{-1}(φ) M D^{-1}(φ) D(b) − [(2c + λx²)/λx²] Ax³ D²(b) D^{-3}(φ),

∂²Ax/∂φ ∂λx = [φx Ax³/(2λx⁵ Λx²)] [ 3Ax²(2c + λx²) − 4λx² Λx ] D(b) D^{-2}(φ) b,

∂²ai/∂γ ∂τ^⊤ = 0, γ = μx, α, φx, λx, τ = μx, α, φx, λx;   ∂²ai/∂γ ∂τ^⊤ = 0, γ = β, φ, τ = φx, λx,

∂²ai/∂μx ∂φ^⊤ = b^⊤ D(b) D^{-2}(φ),   ∂²ai/∂α ∂β^⊤ = −D^{-1}(ψ),   ∂²ai/∂α ∂φ^⊤ = I(p) D(b) D^{-2}(φ),

∂²ai/∂μx ∂β^⊤ = −2β^⊤ D^{-1}(ψ),   ∂²ai/∂β ∂β^⊤ = −2μx D^{-1}(ψ),

∂²ai/∂β ∂φ^⊤ = −I(p) D(Zi − a − 2μx b) D^{-2}(φ),   ∂²ai/∂φ ∂φ^⊤ = 2 D(Xi) D(b) D^{-3}(φ).

• di. For di = Xi^⊤ Σ^{-1} Xi, writing diγ = ∂di/∂γ and diγτ^⊤ = ∂²di/∂γ ∂τ^⊤, it follows that

diμx = −2b^⊤ Σ^{-1} Xi,   diφx = −c^{-2} ai²,   diλx = 0,   diβφx = −2c^{-2} ai Āi^⊤,

diα = −2 I(p) Σ^{-1} Xi,   diβ = −2 qi D^{-1}(ψ) W2i + 2c^{-1} φx ai qi D^{-1}(ψ)β,

diφ = −D^{-2}(φ) D(Xi) Xi + 2c^{-1} φx ai D^{-2}(φ) D(b) Xi − c^{-2} φx² ai² D^{-2}(φ) D(b) b,

diμxμx = 2b^⊤ Σ^{-1} b,   diμxα^⊤ = 2b^⊤ Σ^{-1} I(p)^⊤,   diμxβ^⊤ = −2c^{-1} Āi,

diμxφx = [2(c − 1)/(c² φx)] ai,   diμxφ^⊤ = 2c^{-1} Xi^⊤ Σ^{-1} D^{-1}(φ) D(b),   diαα^⊤ = 2 I(p) Σ^{-1} I(p)^⊤,

diαβ^⊤ = 2 qi [ D^{-1}(ψ) − 2c^{-1} φx M1 ] + 2c^{-1} φx D^{-1}(ψ) β (Yi − α)^⊤ D^{-1}(ψ),


diαφx = 2c^{-2} ai D^{-1}(ψ)β,   diαφ^⊤ = 2 I(p) Σ^{-1} D^{-1}(φ) [ D(Xi) − c^{-1} φx ai D(b) ],

diββ^⊤ = (4φx²/c²) ai [ D^{-1}(ψ)(Yi − α − 2βμx) β^⊤ D^{-1}(ψ) + D^{-1}(ψ) β (Yi − α − 2βμx)^⊤ D^{-1}(ψ) ]
 − 2c^{-1} φx D^{-1}(ψ)(Yi − α − 2βμx)(Yi − α − 2βμx)^⊤ D^{-1}(ψ)
 + 2μx ( qi + c^{-1} φx ai ) D^{-1}(ψ) + 2(φx²/c²) ai² [ D^{-1}(ψ) − 4(φx/c) M1 ],

diφxφx = [2c^{-3}/φx](c − 1) ai²,   diφxφ^⊤ = ( −2c^{-3} φx ai² D(b) D^{-2}(φ) b + 2c^{-2} ai D(b) D^{-2}(φ) Xi )^⊤,

diβφ^⊤ = 2 [ qi I(p) D(Zi − a − qi b) + c^{-1} φx Āi^⊤ (Zi − a − b qi)^⊤ D(b) ] D^{-2}(φ),

diφφ^⊤ = 2 D^{-3}(φ) D²(Xi) − 4c^{-1} φx ai D^{-3}(φ) D(b) D(Xi) − 2c^{-1} φx D^{-2}(φ) D(b) Xi Xi^⊤ D(b) D^{-2}(φ)
 + 2c^{-2} φx² D^{-2}(φ) D(b) Xi Xi^⊤ M D^{-1}(φ) D(b) + 2c^{-2} φx² D^{-1}(φ) D(b) M Xi Xi^⊤ D(b) D^{-2}(φ)
 + 2c^{-2} φx² ai² D^{-3}(φ) D²(b) − 2c^{-3} φx³ ai² D^{-1}(φ) D(b) M D(b) D^{-1}(φ).

• log |Σ|.

∂² log |Σ|/∂τ ∂γ^⊤ = 0, τ = μx, α, λx; γ = μx, α, β, φx, φ, λx,   ∂² log |Σ|/∂β ∂φx = 2c^{-2} D^{-1}(ψ)β,

∂² log |Σ|/∂φx ∂φx = −(c − 1)²/(c² φx²),   ∂² log |Σ|/∂β ∂φ^⊤ = −2c^{-1} φx [ D1(β) − c^{-1} φx D^{-1}(ψ) β b^⊤ D(b) ] D^{-2}(φ),

∂² log |Σ|/∂β ∂β^⊤ = 2c^{-1} φx [ D^{-1}(ψ) − 2c^{-1} φx M1 ],   ∂² log |Σ|/∂φx ∂φ^⊤ = −c^{-2} b^⊤ D(b) D^{-2}(φ),

∂² log |Σ|/∂φ ∂φ^⊤ = −D^{-2}(φ) − c^{-2} φx² D(b) D^{-1}(φ) M D^{-1}(φ) D(b) + 2c^{-1} φx D²(b) D^{-3}(φ),

where Āi = (Yi − α − 2qiβ)^⊤ D^{-1}(ψ), M = D^{-1}(φ) bb^⊤ D^{-1}(φ), M1 = D^{-1}(ψ) ββ^⊤ D^{-1}(ψ), ψ = (φ2, . . . , φp)^⊤, qi = μx + c^{-1} φx ai, W2i = Yi − α − βμx and I(p) = [0, I_{p−1}] is a (p − 1) × p matrix.
