LATTICE MODELS FOR CONDITIONAL INDEPENDENCE IN A

MULTIVARIATE NORMAL DISTRIBUTION

by

Steen Arne Andersson and Michael D. Perlman

TECHNICAL REPORT No. 155 (Revised)

August 1991

Department of Statistics, GN-22

University of Washington

Seattle, Washington 98195 USA

LATTICE MODELS FOR CONDITIONAL INDEPENDENCE IN A

MULTIVARIATE NORMAL DISTRIBUTION$^{1,2}$

BY

STEEN ARNE ANDERSSON

DEPARTMENT OF MATHEMATICS

UNIVERSITY OF INDIANA

AND

MICHAEL D. PERLMAN

DEPARTMENT OF STATISTICS

UNIVERSITY OF WASHINGTON

$^{1}$This research was supported in part by the Danish Research Council and by U.S. National Science Foundation Grant Nos. DMS 86-03489 and 89-02211.

$^{2}$[...] 1991. [...] was [...] out at the Institute of Mathematical Statistics.

Summary

The lattice conditional independence model $N(\mathcal{K})$ is defined to be the set of all normal distributions on $\mathbb{R}^I$ such that for every pair $L, M \in \mathcal{K}$, $x_L$ and $x_M$ are conditionally independent given $x_{L \cap M}$. Here $\mathcal{K}$ is a lattice of subsets of the finite index set $I$ and, for $K \in \mathcal{K}$, $x_K$ is the coordinate projection of $x \in \mathbb{R}^I$ to $\mathbb{R}^K$. Statistical properties of $N(\mathcal{K})$ are studied, e.g., maximum likelihood inference, invariance, and the problem of testing $H_0: N(\mathcal{K})$ vs. $H: N(\mathcal{M})$ when $\mathcal{M}$ is a sublattice of $\mathcal{K}$. The set $J(\mathcal{K})$ of join-irreducible elements of $\mathcal{K}$ plays a central role in the analysis of $N(\mathcal{K})$. This class of statistical models is relevant to the analysis of non-nested multivariate missing data patterns.

AMS 1980 subject classification: Primary 62H12, 62H15; Secondary 62H20, 62H25.

Key words and phrases: Distributive lattice, join-irreducible elements, pairwise conditional independence, multivariate normal distribution, generalized block-triangular matrices, maximum likelihood [...], quotient spaces, [...]

CONTENTS

§1. INTRODUCTION

§2. THE CLASS $P_{\mathcal{K}}(I)$ OF COVARIANCE MATRICES $\Sigma$ DETERMINED BY PAIRWISE CONDITIONAL INDEPENDENCE WITH RESPECT TO A FINITE DISTRIBUTIVE LATTICE $\mathcal{K}$

2.1. The poset $J(\mathcal{K})$ of join-irreducible elements

2.2. The $\mathcal{K}$-parameters of $\Sigma$

2.3. Characterization of conditional independence in terms of $\Sigma^{-1}$

2.4. The $\mathcal{K}$-preserving matrices: generalized block-triangular matrices with lattice structure

2.5. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$

2.6. Transitive action of the group of $\mathcal{K}$-preserving matrices

2.7. Reconstruction of $\Sigma$ from its $\mathcal{K}$-parameters

2.8. Examples

§3. LIKELIHOOD INFERENCE FOR A NORMAL MODEL DETERMINED BY PAIRWISE CONDITIONAL INDEPENDENCE

3.1. Factorization of the likelihood function; the MLE of $\Sigma$

3.2. Examples of pairwise conditional independence models

3.3. Invariance of the model

§4. TESTING ONE PAIRWISE CONDITIONAL INDEPENDENCE MODEL AGAINST ANOTHER

4.1. The likelihood ratio statistic

4.2. Central distribution and Box approximation

4.3. Examples of testing problems

§5. INVARIANT FORMULATION OF THE CI MODEL AND TESTING PROBLEM

5.1. The lattice structure of quotient spaces

5.2. Invariant formulation of the pairwise CI model

5.3. Reduction of the CI model to canonical coordinate-wise form

5.4. [...] of the testing problem

§6. [...]

[...]

APPENDIX

A.1. Decomposition Theorem and existence [...]

[...]

§1. INTRODUCTION.

Because conditional independence (CI) plays an increasingly important role in statistical model building, it is of interest to study classes of CI models with tractable statistical properties and to develop methods for testing one CI model against another. In this paper we define and study a class of CI models determined by finite distributive lattices. For multivariate normal distributions, the parameter space and the likelihood function (LF) for such a lattice CI model can be factored into a product of parameter spaces and conditional LF's, respectively, corresponding to ordinary multivariate normal linear regression models. This in turn yields explicit maximum likelihood estimators (MLE) and likelihood ratio tests (LRT) by means of standard techniques from multivariate analysis.

These lattice CI models arise in a natural way in the analysis of multivariate missing data sets with non-monotone missing data patterns. The factorizations mentioned above can be readily applied to obtain explicit MLE's and LRT's by standard linear methods (cf. [AP] (1991)$^{1}$).

We introduce this class of lattice CI models by means of the following simple and familiar model. Let $(x_1, x_2, x_3)^t$ denote a random observation from the trivariate normal distribution $N(\Sigma)$ with mean vector 0 and unknown covariance matrix $\Sigma$.$^{2}$ Consider the model that specifies that $x_2$ and $x_3$ are conditionally independent given $x_1$, which we express in the familiar notation

(1.1) $x_2 \perp\!\!\!\perp x_3 \mid x_1$.

In terms of the covariance matrix $\Sigma$, (1.1) is equivalent to the condition

(1.2) $(\Sigma^{-1})_{23} = (\Sigma^{-1})_{32} = 0$.

$^{1}$References to Andersson are abbreviated by [A], Andersson and Perlman by [AP], etc.

$^{2}$In this paper we assume [...] $\Sigma$ is [...] and mean vector [...]; the [...] assumption is easily [...] (1991).
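The equivalence of (1.1) and (1.2) can be checked numerically. The following is a minimal sketch in exact rational arithmetic and is not part of the original report; the helper names `inverse_3x3` and `partial_cov_23_given_1` and both example matrices are illustrative assumptions. It relies on the standard fact that, for a normal vector, $x_2$ and $x_3$ are conditionally independent given $x_1$ exactly when the partial covariance $\sigma_{23} - \sigma_{21}\sigma_{11}^{-1}\sigma_{13}$ vanishes.

```python
from fractions import Fraction as F


def inverse_3x3(s):
    """Exact inverse of a 3x3 matrix (list of rows) via the adjugate formula."""
    def minor(i, j):
        rows = [r for r in range(3) if r != i]
        cols = [c for c in range(3) if c != j]
        a, b = s[rows[0]][cols[0]], s[rows[0]][cols[1]]
        c, d = s[rows[1]][cols[0]], s[rows[1]][cols[1]]
        return a * d - b * c

    det = sum((-1) ** j * s[0][j] * minor(0, j) for j in range(3))
    # inverse[i][j] = cofactor(j, i) / det
    return [[(-1) ** (i + j) * minor(j, i) / det for j in range(3)]
            for i in range(3)]


def partial_cov_23_given_1(s):
    """sigma_23 - sigma_21 * sigma_13 / sigma_11 (indices are 0-based in code)."""
    return s[1][2] - s[1][0] * s[0][2] / s[0][0]


# Hypothetical covariance matrix with sigma_23 = sigma_21 * sigma_31 / sigma_11.
sigma_ci = [[F(2), F(1), F(1)],
            [F(1), F(3), F(1, 2)],
            [F(1), F(1, 2), F(4)]]

# Same matrix with sigma_23 changed, so that (1.1) fails.
sigma_free = [[F(2), F(1), F(1)],
              [F(1), F(3), F(1)],
              [F(1), F(1), F(4)]]

for name, s in (("sigma_ci", sigma_ci), ("sigma_free", sigma_free)):
    print(name, partial_cov_23_given_1(s), inverse_3x3(s)[1][2])
```

For each matrix the two printed quantities are either both zero or both nonzero, which is the content of the equivalence.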

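In order to express this as a lattice CI model, let $I = \{1,2,3\}$ denote the index set and consider

(1.3) $\mathcal{K} = \{\emptyset, \{1\}, \{1,2\}, \{1,3\}, I\}$,

a subring of the ring $\mathcal{D}(I)$ of all subsets of $I$. Clearly $\mathcal{K}$ is a finite distributive lattice under the usual set operations $\cup$ and $\cap$. Define the class $P_{\mathcal{K}}(I)$ of real positive definite $I \times I$ matrices as follows:

(1.4) $\Sigma \in P_{\mathcal{K}}(I) \iff x_L \perp\!\!\!\perp x_M \mid x_{L \cap M} \quad \forall\, L, M \in \mathcal{K}$,

where $x \sim N(\Sigma)$ and $x_T$ denotes the $T$-subvector of $x$ when $T \subseteq I$. It is readily verified that (1.1) and (1.4) are equivalent conditions.$^{3}$

In this example the factorizations of the parameter space and LF mentioned above are represented as follows:

(1.5) $\Sigma \leftrightarrow (\sigma_{11},\ \sigma_{21}\sigma_{11}^{-1},\ \sigma_{22\cdot 1},\ \sigma_{31}\sigma_{11}^{-1},\ \sigma_{33\cdot 1})$,

(1.6) $f(x_1, x_2, x_3) = f(x_1)\, f(x_2 \mid x_1)\, f(x_3 \mid x_1)$.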

The five parameters on the right-hand side of (1.5) represent ordinary unconditional and conditional variances and regression coefficients. Whereas the range of the positive definite matrix $\Sigma$ in (1.5) is constrained by (1.2), the ranges of these five parameters are unconstrained (except for the trivial requirement that $\sigma_{11}$, $\sigma_{22\cdot 1}$, and $\sigma_{33\cdot 1}$ are positive). Thus the MLE's of these five parameters, called the $\mathcal{K}$-parameters of the CI model, are easily obtained from (1.6), and the MLE of $\Sigma$ may be reconstructed from these estimates.
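The passage from a covariance matrix to its five parameters in (1.5), and back, can be sketched as follows. This is an illustrative sketch only, not the general reconstruction algorithm of Section 2.7; the function names `k_parameters` and `reconstruct` and the example matrix are assumptions. The one step that uses the CI restriction is the formula for $\sigma_{32}$, which is forced to equal $\sigma_{31}\sigma_{11}^{-1}\sigma_{12}$.

```python
from fractions import Fraction as F


def k_parameters(s):
    """Return (sigma11, sigma21/sigma11, sigma22.1, sigma31/sigma11, sigma33.1)."""
    s11 = s[0][0]
    b2 = s[1][0] / s11                      # regression coefficient of x2 on x1
    b3 = s[2][0] / s11                      # regression coefficient of x3 on x1
    v2 = s[1][1] - s[1][0] ** 2 / s11       # conditional variance of x2 given x1
    v3 = s[2][2] - s[2][0] ** 2 / s11       # conditional variance of x3 given x1
    return s11, b2, v2, b3, v3


def reconstruct(s11, b2, v2, b3, v3):
    """Rebuild the covariance matrix, assuming x2 and x3 are CI given x1."""
    s21, s31 = b2 * s11, b3 * s11
    s22 = v2 + b2 * s21
    s33 = v3 + b3 * s31
    s32 = b2 * s11 * b3                     # forced by the CI restriction
    return [[s11, s21, s31],
            [s21, s22, s32],
            [s31, s32, s33]]


# Hypothetical covariance matrix satisfying (1.2).
sigma = [[F(2), F(1), F(1)],
         [F(1), F(3), F(1, 2)],
         [F(1), F(1, 2), F(4)]]

params = k_parameters(sigma)
print(params)
print(reconstruct(*params) == sigma)
```

Any choice of the five parameters with positive variances yields a matrix of this form, which illustrates why their ranges are unconstrained.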

A subset $K \in \mathcal{K}$ is called join-irreducible if $K$ is not the join (= union) of two or more proper subsets of $K$ belonging to $\mathcal{K}$ (cf. Section 2.1). The collection of all join-irreducible elements in $\mathcal{K}$ is denoted by $J(\mathcal{K})$. Thus when $\mathcal{K}$ is given by (1.3),

(1.7) $J(\mathcal{K}) = \{\{1\}, \{1,2\}, \{1,3\}\}$.

It will be seen that the basic factorizations (1.5) and (1.6), as well as their extensions to the general lattice CI model $N(\mathcal{K})$ defined next, always are indexed by the members of $J(\mathcal{K})$.

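The definition (1.4) immediately extends to define the general lattice CI model. Let $I$ be an arbitrary finite index set and let $\mathcal{K}$ be an arbitrary subring of $\mathcal{D}(I)$, so again $\mathcal{K}$ is a finite distributive lattice.$^{4}$ Then (1.4) defines the class $P_{\mathcal{K}}(I)$ of covariance matrices determined by pairwise CI restrictions with respect to the lattice $\mathcal{K}$: $\Sigma \in P_{\mathcal{K}}(I)$ if and only if $x_L$ and $x_M$ are conditionally independent given $x_{L \cap M}$ for every $L, M \in \mathcal{K}$. If $N(\Sigma)$ denotes the normal distribution on $\mathbb{R}^I$ with mean vector 0 and unknown covariance matrix $\Sigma$, the normal statistical model

(1.8) $N(\mathcal{K}) := (N(\Sigma) \mid \Sigma \in P_{\mathcal{K}}(I))$

is the lattice conditional independence (CI) model determined by $\mathcal{K}$.

$^{4}$[...] finite [distributive] lattice can be [represented as] a [ring of subsets of] some finite set $I$.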

In this paper we study the structure of $P_{\mathcal{K}}(I)$ and the statistical properties of the model $N(\mathcal{K})$. In Section 2.3 (Theorem 2.1) we generalize (1.2) by characterizing $\Sigma \in P_{\mathcal{K}}(I)$ in terms of the precision matrix $\Sigma^{-1}$. In Section 2.5 (Theorem 2.2) we generalize (1.5) by showing that each $\Sigma \in P_{\mathcal{K}}(I)$ can be uniquely represented in terms of its $\mathcal{K}$-parameters, whose ranges are unconstrained, so that the parameter space $P_{\mathcal{K}}(I)$ again factors into a product of parameter spaces for ordinary linear regression models. In Section 2.7 we present a general algorithm for reconstructing $\Sigma \in P_{\mathcal{K}}(I)$ from its $\mathcal{K}$-parameters. A series of examples in Section 2.8 illustrates these results.

The factorization (1.6) of the LF as a product of conditional densities involving only the $\mathcal{K}$-parameters of $\Sigma$ is extended to the general lattice CI model $N(\mathcal{K})$ in Section 3.1 (Theorem 3.1). The MLE's of the $\mathcal{K}$-parameters of $\Sigma$ are obtained from the general factorization, and the MLE of $\Sigma$ can be reconstructed by the algorithm given in Section 2.7. This estimation procedure is illustrated by examples in Section 3.2. In Remark 3.5 it is noted [...] the model is [...] Kiiveri, Speed, and Carlin [...] with additional lattice structure.

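In Section 4 we treat the problem of testing one lattice CI model against another, i.e., testing

(1.9) $H_0: N(\mathcal{K})$ vs. $H: N(\mathcal{M})$

when $\mathcal{M}$ is a sublattice of $\mathcal{K}$.$^{5}$ For example, in the trivial case considered above with $I = \{1,2,3\}$, suppose that $\mathcal{K} = \{\emptyset, \{1\}, \{1,2\}, \{1,3\}, I\}$ (cf. (1.3)) and $\mathcal{M} = \{\emptyset, I\}$. Then $N(\mathcal{M})$ is simply the normal model with no restriction on $\Sigma$ and (1.9) becomes the problem of testing $x_2 \perp\!\!\!\perp x_3 \mid x_1$ (equivalently, (1.2)) against the unrestricted alternative, which can be stated equivalently as the problem of testing

(1.10) $H_0: \sigma_{23\cdot 1} = 0$ vs. $H: \sigma_{23\cdot 1} \neq 0$,

where $\Sigma = (\sigma_{ij} \mid i,j = 1,2,3)$ and $\sigma_{23\cdot 1} := \sigma_{23} - \sigma_{21}\sigma_{11}^{-1}\sigma_{13}$. If, however,

(1.11) $\mathcal{K} = \{\emptyset, \{1\}, \{3\}, \{1,2\}, \{1,3\}, I\}$$^{6}$

while $\mathcal{M} = \{\emptyset, \{1\}, \{1,2\}, \{1,3\}, I\}$, then (1.9) becomes the problem of testing $(x_1, x_2) \perp\!\!\!\perp x_3$ against $x_2 \perp\!\!\!\perp x_3 \mid x_1$, which is equivalent to the problem of testing

(1.12) $H_0: \sigma_{31} = 0,\ \sigma_{32} = 0$ vs. $H: \sigma_{32\cdot 1} = 0$.

$^{5}$Note that $\mathcal{M} \subset \mathcal{K} \implies N(\mathcal{K}) \subseteq N(\mathcal{M})$.

$^{6}$[...] (1.11) and $\mathcal{K}'$ = [...] lattices [...]. Thus two different lattices [...] the same CI [...].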

The LRT statistic $\Lambda$ for the general testing problem (1.9) is derived in Section 4.1 and is readily expressible in terms of the MLE's of the $\mathcal{K}$-parameters and $\mathcal{M}$-parameters of $\Sigma$. In Section 4.2 the central distribution of $\Lambda$ is derived in terms of its moments by means of the invariance of the testing problem. Specific examples of this testing problem are considered in Section 4.3.

These and associated results are greatly facilitated by the fact that the model $N(\mathcal{K})$ is invariant under a group $G \equiv GL_{\mathcal{K}}(I)$ and that $G$ acts transitively on $P_{\mathcal{K}}(I)$. This group $G$ is a subgroup of a group of nonsingular block-triangular $I \times I$ matrices. To illustrate this, return to the trivariate lattice CI model considered above with $\mathcal{K}$ given by (1.3). It can be seen that the CI model given by (1.1) $\equiv$ (1.2) is invariant under all nonsingular linear transformations of the form

(1.13) $A = \begin{pmatrix} a_{11} & 0 & 0 \\ a_{21} & a_{22} & 0 \\ a_{31} & 0 & a_{33} \end{pmatrix}$

and that any nonsingular linear transformation $A$ that leaves this CI model invariant must be of the form (1.13). The collection of all such matrices $A$ forms a subgroup of the group of all $3 \times 3$ nonsingular lower triangular matrices. It is also true, but not so easy to see, that $G$ acts transitively on the class $P_{\mathcal{K}}(I)$ of all covariance matrices $\Sigma$ that satisfy (1.1) $\equiv$ (1.2), i.e., for any such $\Sigma$ there exists $A \in G$ such that $\Sigma = AA^t$.

$^{7}$[...] (1.13) [...] is a special [...] of [...].
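The existence of a factor $A \in G$ with $\Sigma = AA^t$ can be illustrated in the trivariate case. The sketch below is not taken from the report: it assumes that the matrices of the form (1.13) are the lower triangular $3 \times 3$ matrices whose $(3,2)$ entry is zero, and the function names and example matrices are hypothetical. The factor is built from $\sigma_{11}$, the two regression coefficients, and the two conditional variances.

```python
import math


def factor_with_zero_32(s):
    """Lower triangular A with A[2][1] = 0, built from the first column of s
    and the two conditional variances (0-based indices in code)."""
    a11 = math.sqrt(s[0][0])
    a21 = s[1][0] / a11
    a31 = s[2][0] / a11
    a22 = math.sqrt(s[1][1] - a21 ** 2)     # sqrt of sigma_{22.1}
    a33 = math.sqrt(s[2][2] - a31 ** 2)     # sqrt of sigma_{33.1}
    return [[a11, 0.0, 0.0],
            [a21, a22, 0.0],
            [a31, 0.0, a33]]


def times_transpose(a):
    """Return A A^t."""
    n = len(a)
    return [[sum(a[i][k] * a[j][k] for k in range(n)) for j in range(n)]
            for i in range(n)]


def max_abs_diff(p, q):
    return max(abs(p[i][j] - q[i][j]) for i in range(3) for j in range(3))


# Hypothetical covariance matrix satisfying (1.2), and one that does not.
sigma_ci = [[2.0, 1.0, 1.0], [1.0, 3.0, 0.5], [1.0, 0.5, 4.0]]
sigma_free = [[2.0, 1.0, 1.0], [1.0, 3.0, 1.0], [1.0, 1.0, 4.0]]

for s in (sigma_ci, sigma_free):
    a = factor_with_zero_32(s)
    print(max_abs_diff(times_transpose(a), s))
```

For the matrix satisfying (1.2) the product $AA^t$ reproduces $\Sigma$ up to rounding; for the other matrix the $(2,3)$ entry cannot be matched by a factor of this restricted form.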

These facts, some of which were used by Das Gupta (1977), Giri (1979), Banerjee and Giri (1980), and Marden (1981) to study the distribution and optimality of invariant tests for problems such as (1.10) and (1.12), will be extended in the present paper to the general lattice CI model $N(\mathcal{K})$. In Section 2.4 it will be shown how $\mathcal{K}$ determines the invariance group $GL_{\mathcal{K}}(I)$, a group of generalized block-triangular $I \times I$ matrices with lattice structure, while the transitive action of $GL_{\mathcal{K}}(I)$ on $P_{\mathcal{K}}(I)$ is demonstrated in Section 2.6 (Theorem 2.3), generalizing the well-known Choleski decomposition of an arbitrary positive definite matrix. The transitivity yields a factorization (Lemma 2.5) of the determinant of $\Sigma \in P_{\mathcal{K}}(I)$, a generalization of the well-known Schur formula $\det(\Sigma) = \det(\Sigma_{11}) \det(\Sigma_{22\cdot 1})$.

As already seen for the trivariate example above, all statistical properties of the general lattice CI model $N(\mathcal{K})$, including the definition of the $\mathcal{K}$-parameters of $\Sigma$, the factorizations of its parameter space and LF as products of those for linear regression models, the form of the MLE, the form of the LRT statistic and its central distribution, and the partitioning and location of zeroes in the invariance matrix $A \in GL_{\mathcal{K}}(I)$, are determined by the fundamental structure of the lattice $\mathcal{K}$, in particular by its associated poset $J(\mathcal{K})$ of join-irreducible elements (cf. Section 2.1). As in the case of a balanced ANOVA design, where the poset of join-irreducible elements of the lattice of subspaces determines [...] [A] [...] lattice CI [...] the model.

[...] in non-monotone missing data models. Under the assumption of multivariate normality it is well known that a monotone missing data model with unrestricted covariance matrix $\Sigma$ admits a complete and explicit likelihood analysis, remaining invariant under the appropriate group of block-triangular matrices (in the usual sense), which acts transitively on the unrestricted set of covariance matrices (cf. Eaton and Kariya (1983), [AMP] (1990)). If the missing data pattern is non-monotone, however, then explicit analysis is not possible in general.

The relationship between lattice CI models and non-monotone missing data patterns is developed fully in [AP] (1991) but can be illustrated in terms of the trivariate example considered above. Suppose that one attempts to observe a random sample from the trivariate normal distribution $N(\Sigma)$, where $\Sigma$ is unknown and initially unrestricted, but that some of the observations are incomplete. For example, suppose that we have several complete vector observations of the form $(x_1, x_2, x_3)^t$ and also several incomplete observations of the forms $(x_1, x_2)^t$ and $(x_1, x_3)^t$. Then the missing data pattern (actually, the pattern of the observed data) is the set

(1.14) $\mathcal{S} := \{\{1,2\}, \{1,3\}, \{1,2,3\}\}$,

i.e., the collection of subsets of $I \equiv \{1,2,3\}$ corresponding to the subvectors actually observed. Because the missing data pattern $\mathcal{S}$ is non-monotone, i.e., is not totally ordered by inclusion, the LF cannot be factored into a product of LF's of linear regression models and the MLE of $\Sigma$ cannot be obtained explicitly.$^{8}$ Instead, iterative estimation methods such as the EM algorithm must be used, possibly accompanied by difficulties with convergence or uniqueness of the estimates (cf. Little and Rubin (1987)).

An alternate approach, suggested by Rubin (1987) and developed in [AP] (1991), is to restrict $\Sigma$ by imposing the CI conditions of the lattice CI model $N(\mathcal{K})$, where $\mathcal{K} = \mathcal{K}(\mathcal{S})$ is the lattice generated by $\mathcal{S}$. With $\mathcal{S}$ given by (1.14) it is easy to see that $\mathcal{K}$ is given by (1.3), so the corresponding CI condition is given by (1.1). Under this condition the densities for the complete and incomplete observations factor as

(1.15)

$f(x_1, x_2, x_3) = f(x_1)\, f(x_2 \mid x_1)\, f(x_3 \mid x_1)$,

$f(x_1, x_2) = f(x_1)\, f(x_2 \mid x_1)$,

$f(x_1, x_3) = f(x_1)\, f(x_3 \mid x_1)$,

so the overall LF is a product of LF's of only the three types $f(x_1)$, $f(x_2 \mid x_1)$, and $f(x_3 \mid x_1)$, the latter two corresponding to simple linear regression models. Also, the overall parameter space is the product of the parameter spaces for these three LF's. Therefore the similar terms may be combined and the MLE of $\Sigma$ may be obtained by maximizing these three LF's separately, which involves only elementary calculations.
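The elementary calculations referred to above can be sketched for the trivariate pattern (1.14). This is an illustrative sketch under the mean-zero assumption of the text, not the estimation procedure of [AP] (1991); the function names and the toy data are hypothetical. Each of the three factors is maximized on the observations that contribute to it: $f(x_1)$ on all observations, $f(x_2 \mid x_1)$ on those containing $x_2$, and $f(x_3 \mid x_1)$ on those containing $x_3$.

```python
from fractions import Fraction as F


def regress_through_origin(pairs):
    """MLE of (slope, residual variance) for y | x ~ N(slope * x, variance)."""
    sxx = sum(x * x for x, _ in pairs)
    sxy = sum(x * y for x, y in pairs)
    slope = F(sxy, sxx)
    rss = sum((y - slope * x) ** 2 for x, y in pairs)
    return slope, rss / len(pairs)


def mle_trivariate(complete, only12, only13):
    """MLE of Sigma under x2 CI x3 given x1, from integer-valued observations
    (x1, x2, x3), (x1, x2) and (x1, x3)."""
    x1_all = [obs[0] for obs in complete + only12 + only13]
    s11 = F(sum(v * v for v in x1_all), len(x1_all))
    b2, v2 = regress_through_origin(
        [(o[0], o[1]) for o in complete] + list(only12))
    b3, v3 = regress_through_origin(
        [(o[0], o[2]) for o in complete] + list(only13))
    s21, s31 = b2 * s11, b3 * s11
    sigma = [[s11, s21, s31],
             [s21, v2 + b2 * s21, b2 * s31],   # sigma_23 forced by the CI restriction
             [s31, b2 * s31, v3 + b3 * s31]]
    return {"sigma11": s11, "b2": b2, "v2": v2, "b3": b3, "v3": v3,
            "sigma": sigma}


# Hypothetical toy data: two complete and two incomplete observations.
est = mle_trivariate(complete=[(1, 1, 2), (2, 1, 0)],
                     only12=[(1, 2)],
                     only13=[(2, 3)])
print(est["sigma"])
```

Exact fractions are used only to make the arithmetic transparent; floating-point arithmetic would serve equally well in practice.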

Furthermore, under the CI restriction $\Sigma \in P_{\mathcal{K}}(I)$, this non-monotone missing data model remains invariant under the group $GL_{\mathcal{K}}(I)$ of triangular matrices $A$ in (1.13) and $GL_{\mathcal{K}}(I)$ acts transitively on $P_{\mathcal{K}}(I)$. Finally, the CI assumption may be tested by means of the LRT for (1.10) as discussed above.

Whereas the determination of the appropriate CI conditions and the factorization (1.15) is transparent in this simple example, a general missing data pattern requires the lattice-theoretic approach developed in the present paper; see [AP] (1991) for complete details. Thus, the results in the present paper open the possibility of applying classical multivariate techniques to a class of missing data models much larger than the monotone class.

$^{8}$[...] fact [...] some non-monotone [...] observations, [...] $\Sigma$ may not [...].

In Section 5 the CI models and results already described are recast in an invariant (= coordinate-free) formulation, rather than in the matrix (coordinate-wise) formulation just given. This is done for the following reason: a model which, when presented in matrix formulation, may not appear to be a lattice CI model according to the non-invariant definition given above, may in fact belong to this class after an appropriate linear transformation.$^{9}$

This is readily illustrated in terms of the trivariate missing data example given in the paragraph containing (1.14). Rather than the missing data pattern described by (1.14), consider a missing data array that includes incomplete observations involving not only the coordinates of $x$ but also one or more linear combinations of these coordinates. For example, suppose that we have several complete observations of the form $(x_1, x_2, x_3)^t$ and also several incomplete observations of the forms $(x_1, x_2)^t$ and $(x_1 + x_2, x_3)^t$. Although this does not directly fit into the framework of the coordinate-wise missing data models discussed above and in [AP] (1991), it is easy to transform it to such a framework by means of a nonsingular linear transformation $(y_1, y_2, y_3) = (x_1 + x_2, x_2, x_3)$. In terms of $y_1, y_2, y_3$ the missing data pattern is now given precisely by (1.14), hence as before the associated lattice CI model imposes the assumption that $y_2 \perp\!\!\!\perp y_3 \mid y_1$, i.e., $x_2 \perp\!\!\!\perp x_3 \mid x_1 + x_2$ (equivalently, $x_1 \perp\!\!\!\perp x_3 \mid x_1 + x_2$).

$^{9}$Of course [...] is [...] no means unique to the lattice CI [...]. For [...] must be described in [...] terms rather than [...] values [...] certain coordinates [...] the mean vector.

The existence and form of an appropriate linear transformation from x

to y (or equivalently, of an appropriate vector basis for the observation

space) may not be so apparent in more complex missing data schemes with

linear combinations present. The invariant formulation of a general

lattice CI model, presented in Section 5, allows one to recognize and

treat, without a preliminary transformation, a set of CI conditions such as $x_2 \perp\!\!\!\perp x_3 \mid x_1 + x_2$ in the same manner as the coordinate-wise lattice CI conditions in (1.4).

The invariant formulation is stated in terms of a lattice $\mathcal{Q}$ of quotient spaces $Q$ of a real finite-dimensional vector space $V$.$^{10}$ (See Section 5.1 for definitions, where it is noted that if $\mathcal{Q}$ is distributive then it is finite.) For each $Q \in \mathcal{Q}$ let $p_Q: V \to Q$ denote the projection onto $Q$. Then the general lattice CI model $N_V(\mathcal{Q})$ is defined in Section 5.2 to be the [...] for every $R, T \in \mathcal{Q}$. In Theorem 5.1 it is noted that $N_V(\mathcal{Q})$ is [...] if and only if $\mathcal{Q}$ is distributive.

$^{10}$[...] vector spaces [...] over the field [...] matrices in [...] paper are [...].

To express our original coordinate-wise formulation of the lattice CI models in this invariant framework, set $V = \mathbb{R}^I$, identify each subset $K \subseteq I$ with the quotient space $\mathbb{R}^K$, and let $p_K: \mathbb{R}^I \to \mathbb{R}^K$ denote the usual coordinate projection mapping. Then the definition of the general lattice CI model in the preceding paragraph reduces to (1.4).

The basic decomposition theorem for a distributive lattice $\mathcal{Q}$ of quotient spaces (cf. Appendix A.1) states that the observation space $V$ can be represented as a product of vector spaces indexed by the poset $J(\mathcal{Q})$ of join-irreducible elements in $\mathcal{Q}$ in such a way that for each $Q \in \mathcal{Q}$, the projection $p_Q: V \to Q$ becomes simply a canonical projection. By means of this representation we may choose a $\mathcal{Q}$-adapted basis for $V$ (cf. Proposition 5.1). In Section 5.3 it is shown that in terms of the coordinate system determined by this basis, the CI model $N_V(\mathcal{Q})$ can be expressed in the canonical coordinate-wise form (1.4) and the statistical analysis of the model may then proceed according to the coordinate-wise formulation.

The general problem of testing one lattice CI model against another is formulated invariantly as follows: test $H_0: N_V(\mathcal{Q})$ vs. $H: N_V(\mathcal{R})$, where $\mathcal{Q}$ and $\mathcal{R}$ are distributive lattices of quotient spaces of $V$ such that $\mathcal{R} \subset \mathcal{Q}$. In Section 5.4 it is noted that one can choose a basis for $V$ that is both $\mathcal{Q}$-adapted and $\mathcal{R}$-adapted, by means of which this testing problem can be reduced to the canonical coordinate-wise form (1.9).

Several [...] extensions [...] class of lattice [...] models are discussed briefly in Section 6. Three important but technical [...] are [...] in [...].

[...]tions, directed or [...] increasing attention. Prominent references for normal distributions include Dempster (1972), Frydenberg (1990), Frydenberg and Lauritzen (1989), Kiiveri, Speed, and Carlin (1984), Lauritzen (1985, 1989), Lauritzen, Dawid, Larsen and Leimer (1990), Lauritzen and Wermuth (1984, 1989), Porteous (1985), Speed and Kiiveri (1986), and Wermuth (1976, 1980, 1985, 1988); see Whittaker (1990) for a readable introduction to this area. In many of these studies the CI assumptions are equivalent to the occurrence of patterns of zeroes in the precision matrix $\Sigma^{-1}$ of a multivariate normal distribution, hence the models are linear in $\Sigma^{-1}$. It will be seen from Examples 2.6 - 2.8, however, that unlike the special case (1.2), in general the lattice CI models introduced here are linear in neither $\Sigma^{-1}$ nor $\Sigma$. Furthermore, the statistical interpretation and analysis of a lattice CI model appear to differ from those of a model defined by graphical conditions. Although it is of interest to determine the relation between these two types of CI models and compare their properties, our attempts to interpret either class in the framework of the other have not been illuminating thus far.

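§2. THE CLASS $P_{\mathcal{K}}(I)$ OF COVARIANCE MATRICES $\Sigma$ DETERMINED BY PAIRWISE CONDITIONAL INDEPENDENCE WITH RESPECT TO A FINITE DISTRIBUTIVE LATTICE $\mathcal{K}$.

Let $I$ be a finite index set and let $\mathcal{D}(I)$ denote the ring of all subsets of $I$. [...] i.e., $\mathcal{K}$ [...]. We shall [...] $\emptyset \in \mathcal{K}$. [...] a [...] lattice with $\cup$ and $\cap$ as the join and meet operations. For $T, U \in \mathcal{D}(I)$ write $T \subset U$ to indicate that $T \subsetneq U$. Let [...] denote the number of elements in a set $T$.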

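Let $N(\Sigma)$ denote the normal distribution on $\mathbb{R}^I$ with mean $0 \in \mathbb{R}^I$ and covariance matrix $\Sigma \in P(I)$, where $P(I)$ denotes the set of all positive definite $I \times I$ matrices. For any $T \subseteq I$ and column vector $x = (x_i \mid i \in I) \in \mathbb{R}^I$ define $x_T := (x_i \mid i \in T)$, the $T$-subcolumn of $x$. Note that $x_I = x$ and define $x_\emptyset := \{0\}$.

Definition 2.1. The class $P_{\mathcal{K}}(I) \subseteq P(I)$ is defined as follows (cf. (1.4)):

(2.1) $\Sigma \in P_{\mathcal{K}}(I) \iff x_L \perp\!\!\!\perp x_M \mid x_{L \cap M} \quad \forall\, L, M \in \mathcal{K}$ when $x \sim N(\Sigma)$.

I.e., $x_L$ and $x_M$ are conditionally independent (CI) given $x_{L \cap M}$ $\forall\, L, M \in \mathcal{K}$. $\square$

If $L \cap M = \emptyset$ then (2.1) reduces to $x_L \perp\!\!\!\perp x_M$, that is, $x_L$ and $x_M$ are independent. Note that the right-hand side of (2.1) is ordinarily written in the form

(2.2) $x_{L \setminus (L \cap M)} \perp\!\!\!\perp x_{M \setminus (L \cap M)} \mid x_{L \cap M} \quad \forall\, L, M \in \mathcal{K}$.

Some of these pairwise CI conditions are trivially satisfied, e.g., whenever $L \subseteq M$ (or $M \subseteq L$) (also see Remark 3.2). In particular, if $\mathcal{K}$ is a chain then $P_{\mathcal{K}}(I) = P(I)$, i.e., $\Sigma$ is unrestricted (cf. Examples 2.1 and 2.2).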

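2.1. The poset $J(\mathcal{K})$ of join-irreducible elements.

[...] structure of $\Sigma \in P_{\mathcal{K}}(I)$ [...] $\mathcal{K}$, which we now define. For $K \in \mathcal{K}$ define

$\langle K \rangle := \cup(K' \in \mathcal{K} \mid K' \subset K)$,

$[K] := K \setminus \langle K \rangle$,

so that

(2.3) $K = \langle K \rangle \,\dot\cup\, [K]$,

where $\dot\cup$ indicates that the union is disjoint. Then define

$J(\mathcal{K}) := \{K \in \mathcal{K} \mid K \neq \emptyset,\ \langle K \rangle \subset K\}$

$= \{K \in \mathcal{K} \mid K \neq \emptyset,\ [K] \neq \emptyset\}$

$= \{K \in \mathcal{K} \mid K \neq \emptyset,\ \forall\, L, M \in \mathcal{K}: K = L \cup M \implies K = L \text{ or } K = M\}$.

If $K \in J(\mathcal{K})$ we say that $K$ is join-irreducible. (See Grätzer (1978), Chapter II, or Davey and Priestley (1990), Chapter 8, for properties of $J(\mathcal{K})$; in particular, $\mathcal{K}$ is uniquely determined by $J(\mathcal{K})$.)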

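For $L \in \mathcal{K}$ define $\mathcal{K}_L := \{K \in \mathcal{K} \mid K \subseteq L\}$, a sublattice of $\mathcal{K}$ ($\mathcal{K}_I = \mathcal{K}$). The following relations are elementary:

(2.4) $L = \cup(K \in J(\mathcal{K}_L))$

(2.5) $J(\mathcal{K}_L) = J(\mathcal{K}) \cap \mathcal{K}_L$

(2.6) $J(\mathcal{K}_{L \cap M}) = J(\mathcal{K}_L) \cap J(\mathcal{K}_M)$

(2.7) $J(\mathcal{K}_{L \cup M}) = J(\mathcal{K}_L) \cup J(\mathcal{K}_M)$.

Proposition 2.1. Every $L \in \mathcal{K}$ can be decomposed according to the members of $J(\mathcal{K})$ as follows:

(2.8) $L = \dot\cup([K] \mid K \in J(\mathcal{K}_L))$.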

Proof. Let $K, H \in J(\mathcal{K})$ with $K \neq H$, so that $K \cap H \subset K$ or $K \cap H \subset H$. Suppose that $K \cap H \subset H$. Then $K \cap H \subseteq \langle H \rangle$ and it follows that $[K] \cap [H] = K \cap \langle K \rangle^c \cap H \cap \langle H \rangle^c = \emptyset$, hence $([K] \mid K \in J(\mathcal{K}))$ is a disjoint family. The inclusion $\supseteq$ in (2.8) is trivial. To establish $\subseteq$ consider $i \in L$. Define $K_i := \cap(L' \in \mathcal{K} \mid i \in L')$, the smallest set in $\mathcal{K}$ containing $i$. Then $K_i \in J(\mathcal{K})$, as seen from the following indirect argument. Suppose that $K_i \notin J(\mathcal{K})$ and thus that $K_i = L_1 \cup L_2$ where $L_1, L_2 \in \mathcal{K}$, $L_1 \subset K_i$, and $L_2 \subset K_i$. Then $i \in L_1$ or $i \in L_2$, contradicting the minimality of $K_i$. Finally, if $i \in \langle K_i \rangle$ ($\subset K_i$) the minimality of $K_i$ again would be contradicted, hence $i \in [K_i]$. Since $K_i \subseteq L$ and therefore $K_i \in J(\mathcal{K}_L)$, this establishes the inclusion $\subseteq$ in (2.8). $\square$

In particular, set $L = I$ in (2.8) to obtain

(2.9) $I = \dot\cup([K] \mid K \in J(\mathcal{K}))$.

For example, suppose that $I = \{1,2,3\}$ and $\mathcal{K}$ is given by (1.3). Then $J(\mathcal{K})$ is given by (1.7) and we find that $[\{1\}] = \{1\}$, $[\{1,2\}] = \{2\}$, and $[\{1,3\}] = \{3\}$, so (2.9) is evident.
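The definitions of Section 2.1 translate directly into set operations. The following sketch is illustrative only; the function names `angle`, `bracket` and `join_irreducibles` are assumptions, and the lattice is represented as a list of `frozenset` objects. It computes the proper-part union, the remainder $[K]$, and the join-irreducible elements for the lattice (1.3), and then checks the disjoint decomposition (2.9).

```python
def angle(K, lattice):
    """Union of all lattice members strictly contained in K."""
    out = frozenset()
    for other in lattice:
        if other < K:                 # strict inclusion
            out |= other
    return out


def bracket(K, lattice):
    """[K] := K minus the union of its proper lattice subsets."""
    return K - angle(K, lattice)


def join_irreducibles(lattice):
    """Nonempty members K of the lattice with [K] nonempty."""
    return [K for K in lattice if K and bracket(K, lattice)]


# The lattice (1.3) on the index set I = {1, 2, 3}.
I = frozenset({1, 2, 3})
lattice = [frozenset(), frozenset({1}), frozenset({1, 2}),
           frozenset({1, 3}), I]

J = join_irreducibles(lattice)
brackets = [bracket(K, lattice) for K in J]
print([sorted(K) for K in J])
print([sorted(b) for b in brackets])

# Decomposition (2.9): the sets [K], K in J, are disjoint with union I.
print(sum(len(b) for b in brackets) == len(I)
      and frozenset().union(*brackets) == I)
```

The same functions apply unchanged to any ring of subsets, for instance the lattice in (1.11).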

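2.2. The $\mathcal{K}$-parameters of $\Sigma$.

For any finite index sets $T$ and $U$ let $M(T \times U)$ denote the vector space of all [...] $P(T)$ [...] positive [...] $T \times T$ [...], $M(T) \equiv M(T \times T)$ [...] algebra of [...] and $GL(T)$ the group of nonsingular $T \times T$ matrices. For every $\Sigma \in P(I)$ and every subset $T \subseteq I$, let $\Sigma_T \in P(T)$ denote the $T \times T$ submatrix of $\Sigma$ and [...]. For $K \in \mathcal{K}$, partition $\Sigma_K$ according to (2.3) as

(2.10) $\Sigma_K = \begin{pmatrix} \Sigma_{\langle K \rangle} & \Sigma_{\langle K]} \\ \Sigma_{[K \rangle} & \Sigma_{[K]} \end{pmatrix}$,

so $\Sigma_{\langle K \rangle} \in P(\langle K \rangle)$, $\Sigma_{[K]} \in P([K])$, $\Sigma_{[K \rangle} \in M([K] \times \langle K \rangle)$, and $\Sigma_{\langle K]} = \Sigma_{[K \rangle}^t$. Furthermore, define

(2.11) $\Sigma_{[K]\cdot} := \Sigma_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} \Sigma_{\langle K]}$

and let $\Sigma_{[K]\cdot}^{-1}$ denote $(\Sigma_{[K]\cdot})^{-1}$. Then for every $x \in \mathbb{R}^I$,

(2.12) $\mathrm{tr}(\Sigma_K^{-1} x_K x_K^t) = \mathrm{tr}(\Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle} x_{\langle K \rangle}^t) + \mathrm{tr}(\Sigma_{[K]\cdot}^{-1} (x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})(x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})^t)$.

Definition 2.2. For $\Sigma \in P(I)$, the family of matrices

(2.13) $((\Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1},\ \Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))$

is called the family of $\mathcal{K}$-parameters of $\Sigma$. $\square$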

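2.3. Characterization of conditional independence in terms of $\Sigma^{-1}$.

Theorem 2.1 presents an algebraic characterization of the set $P_{\mathcal{K}}(I)$ of covariance matrices $\Sigma$ defined in terms of pairwise conditional independence (2.1). The following description of pairwise CI is useful.

Lemma 2.1. Let $x \sim N(\Sigma)$, $\Sigma \in P(I)$. Then for any $L, M \subseteq I$, $x_L \perp\!\!\!\perp x_M \mid x_{L \cap M}$ if and only if $\forall\, x \in \mathbb{R}^I$:

(2.14) $\mathrm{tr}(\Sigma_{L \cup M}^{-1} x_{L \cup M} x_{L \cup M}^t) = \mathrm{tr}(\Sigma_L^{-1} x_L x_L^t) + \mathrm{tr}(\Sigma_M^{-1} x_M x_M^t) - \mathrm{tr}(\Sigma_{L \cap M}^{-1} x_{L \cap M} x_{L \cap M}^t)$.

Proof. The difference

$\mathrm{tr}(\Sigma_{L \cup M}^{-1} x_{L \cup M} x_{L \cup M}^t) - \mathrm{tr}(\Sigma_{L \cap M}^{-1} x_{L \cap M} x_{L \cap M}^t)$

appears in the exponential term of the conditional density of $x_{(L \cup M) \setminus (L \cap M)}$ given $x_{L \cap M}$. Therefore $x_L \perp\!\!\!\perp x_M \mid x_{L \cap M}$ if and only if this difference is the sum of the differences appearing in the exponential terms of the conditional densities of $x_{L \setminus (L \cap M)}$ given $x_{L \cap M}$ and $x_{M \setminus (L \cap M)}$ given $x_{L \cap M}$. This sum is

$[\mathrm{tr}(\Sigma_L^{-1} x_L x_L^t) - \mathrm{tr}(\Sigma_{L \cap M}^{-1} x_{L \cap M} x_{L \cap M}^t)] + [\mathrm{tr}(\Sigma_M^{-1} x_M x_M^t) - \mathrm{tr}(\Sigma_{L \cap M}^{-1} x_{L \cap M} x_{L \cap M}^t)]$

and the lemma follows. $\square$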

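Theorem 2.1. (Characterization of $P_{\mathcal{K}}(I)$.) For $\Sigma \in P(I)$ the following conditions are equivalent:

(i) $\Sigma \in P_{\mathcal{K}}(I)$;

(ii) $\forall\, x \in \mathbb{R}^I$:

$\mathrm{tr}(\Sigma^{-1} x x^t) = \sum(\mathrm{tr}(\Sigma_{[K]\cdot}^{-1} (x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})(x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})^t) \mid K \in J(\mathcal{K}))$;

(iii) $\forall\, x \in \mathbb{R}^I$, $\forall\, L \in \mathcal{K}$:

$\mathrm{tr}(\Sigma_L^{-1} x_L x_L^t) = \sum(\mathrm{tr}(\Sigma_{[K]\cdot}^{-1} (x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})(x_{[K]} - \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} x_{\langle K \rangle})^t) \mid K \in J(\mathcal{K}_L))$. $\square$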

Proof. Trivially (iii) $\implies$ (ii). On the other hand, (iii) follows from (ii) if we replace $I$ and $\mathcal{K}$ by $L$ and $\mathcal{K}_L$, respectively, in (ii).

To show (i) $\implies$ (ii), use induction on $|J(\mathcal{K})| =: q$. If $q = 1$ then by (2.4), $\mathcal{K} = \{\emptyset, I\}$ and (ii) is trivial. Next, assume that (ii) is true whenever $q \le k-1$ and suppose that $q = k$. If $I \in J(\mathcal{K})$ then $J(\mathcal{K}) = J(\mathcal{K}_{\langle I \rangle}) \,\dot\cup\, \{I\}$, hence $|J(\mathcal{K}_{\langle I \rangle})| = k-1$ and (iii) is true with $L$ replaced by $\langle I \rangle$, so (ii) follows from (2.12) with $K$ replaced by $I$. If, on the other hand, $I \notin J(\mathcal{K})$, then $I = L \cup M$ where $L \subset I$ and $M \subset I$. It follows from (2.4) that $|J(\mathcal{K}_L)| < k$ and $|J(\mathcal{K}_M)| < k$, so by the induction assumption, (iii) is valid with $L$ replaced by $L$, $M$, and $L \cap M$. Then (ii) follows from (2.6), (2.7), and Lemma 2.1.

To show (iii) $\implies$ (i), consider any pair $L, M \in \mathcal{K}$. Apply condition (iii) four times, with $L$ replaced by $L \cup M$, $L$, $M$, and $L \cap M$, and then apply (2.6) and (2.7) to obtain (2.14). By Lemma 2.1, therefore, (i) is satisfied. $\square$
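For the trivariate lattice (1.3), condition (ii) of Theorem 2.1 can be checked in exact arithmetic. The sketch below is illustrative and rests on an assumed reading of (ii) for this lattice, namely $x^t \Sigma^{-1} x = x_1^2/\sigma_{11} + (x_2 - \sigma_{21}\sigma_{11}^{-1}x_1)^2/\sigma_{22\cdot 1} + (x_3 - \sigma_{31}\sigma_{11}^{-1}x_1)^2/\sigma_{33\cdot 1}$, with one summand for each of $\{1\}$, $\{1,2\}$, $\{1,3\}$. All function names and example matrices are hypothetical.

```python
from fractions import Fraction as F


def solve(a, b):
    """Solve a z = b exactly by Gauss-Jordan elimination."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        piv = next(r for r in range(col, n) if m[r][col] != 0)
        m[col], m[piv] = m[piv], m[col]
        m[col] = [v / m[col][col] for v in m[col]]
        for r in range(n):
            if r != col and m[r][col] != 0:
                factor = m[r][col]
                m[r] = [vr - factor * vc for vr, vc in zip(m[r], m[col])]
    return [m[r][n] for r in range(n)]


def quadratic_form(sigma, x):
    """x^t Sigma^{-1} x, which equals tr(Sigma^{-1} x x^t)."""
    return sum(xi * zi for xi, zi in zip(x, solve(sigma, x)))


def sum_over_join_irreducibles(sigma, x):
    """Right-hand side of the assumed trivariate form of (ii)."""
    s11 = sigma[0][0]
    total = x[0] ** 2 / s11                       # K = {1}
    for j in (1, 2):                              # K = {1,2} and K = {1,3}
        slope = sigma[j][0] / s11
        cond_var = sigma[j][j] - sigma[j][0] ** 2 / s11
        total += (x[j] - slope * x[0]) ** 2 / cond_var
    return total


sigma_ci = [[F(2), F(1), F(1)], [F(1), F(3), F(1, 2)], [F(1), F(1, 2), F(4)]]
sigma_free = [[F(2), F(1), F(1)], [F(1), F(3), F(1)], [F(1), F(1), F(4)]]
x = [F(0), F(1), F(1)]

for s in (sigma_ci, sigma_free):
    print(quadratic_form(s, x), sum_over_join_irreducibles(s, x))
```

The two sides agree for the matrix satisfying (1.2) and differ for the unrestricted one, in line with the equivalence of (i) and (ii).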

2.4. The $\mathcal{K}$-preserving matrices: generalized block-triangular matrices with lattice structure.

We now introduce a group $GL_{\mathcal{K}}(I)$ of nonsingular matrices $A$ that will be seen in Section 2.6 to act transitively on $P_{\mathcal{K}}(I)$. In the present section $GL_{\mathcal{K}}(I)$ is shown to be a group of block-triangular matrices with lattice structure determined by $\mathcal{K}$.

For any $A \in M(I)$ and any two subsets $L, M \in J(\mathcal{K})$ let $A_{[LM]}$ denote the $[L] \times [M]$ submatrix of $A$.

Proposition 2.2. Let $A \in M(I)$. The following three conditions on $A$ are equivalent:

(i) $\forall\, x \in \mathbb{R}^I$, $\forall\, L \in \mathcal{K}$: $x_L = 0 \implies (Ax)_L = 0$;

(ii) $\forall\, x \in \mathbb{R}^I$, $\forall\, L \in \mathcal{K}$: $(Ax)_L = A_L x_L$;

(iii) $\forall\, L, M \in J(\mathcal{K})$: $M \not\subseteq L \implies A_{[LM]} = 0$.

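Proof. (ii) $\implies$ (i) is trivial.

(iii) $\implies$ (ii): By the usual formula for matrix multiplication by blocks,

$(Ax)_L = (\sum(A_{[KM]} x_{[M]} \mid M \in J(\mathcal{K})) \mid K \in J(\mathcal{K}_L))$

$= (\sum(A_{[KM]} x_{[M]} \mid M \in J(\mathcal{K}_L)) \mid K \in J(\mathcal{K}_L))$

$= A_L x_L$.

The first equality uses (2.8) and (2.9), the second uses condition (iii), while the third uses (2.8) twice.

(i) $\implies$ (iii): Suppose $L, M \in J(\mathcal{K})$ with $M \not\subseteq L$. Let $c$ denote any column vector in $\mathbb{R}^I$ satisfying $c_{[K]} = 0$ for $K \in J(\mathcal{K})$, $K \neq M$. Then $c_L = 0$ by (2.8), and

$(Ac)_{[L]} = A_{[LM]} c_{[M]}$.

But $(Ac)_L = 0$ by (i), hence $(Ac)_{[L]} = 0$. Since $c_{[M]}$ is arbitrary this implies $A_{[LM]} = 0$ as required. $\square$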
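The zero pattern described by condition (iii) can be generated mechanically from the lattice and compared with condition (i). The sketch below is illustrative; it is not from the report, the function names are hypothetical, and matrices are stored as dictionaries keyed by index pairs. It uses the observation from the proof of Proposition 2.1 that each index $i$ lies in $[K_i]$, where $K_i$ is the smallest lattice member containing $i$, so entry $(i,j)$ may be nonzero only if $K_j \subseteq K_i$.

```python
def smallest_containing(i, lattice):
    """K_i: the intersection of all lattice members that contain index i."""
    return frozenset.intersection(*[L for L in lattice if i in L])


def zero_pattern(index_set, lattice):
    """Map (i, j) to True when condition (iii) allows a nonzero entry there."""
    k = {i: smallest_containing(i, lattice) for i in index_set}
    return {(i, j): k[j] <= k[i] for i in index_set for j in index_set}


def satisfies_condition_i(a, index_set, lattice):
    """Condition (i), checked on the basis vectors e_j with j outside L."""
    for L in lattice:
        for j in index_set - L:
            if any(a[(i, j)] != 0 for i in L):
                return False
    return True


I = frozenset({1, 2, 3})
lattice = [frozenset(), frozenset({1}), frozenset({1, 2}),
           frozenset({1, 3}), I]

pattern = zero_pattern(I, lattice)
print(sorted(pair for pair, allowed in pattern.items() if allowed))

# A hypothetical matrix respecting the pattern, and one violating it.
good = {pair: 0 for pair in pattern}
good.update({(1, 1): 2, (2, 1): 1, (2, 2): 3, (3, 1): -1, (3, 3): 1})
bad = dict(good)
bad[(3, 2)] = 1

print(satisfies_condition_i(good, I, lattice),
      satisfies_condition_i(bad, I, lattice))
```

For the lattice (1.3) the allowed positions are those of a lower triangular matrix with a vanishing $(3,2)$ entry.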

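Let $M_{\mathcal{K}}(I)$ denote the set of all $A \in M(I)$ that satisfy the equivalent conditions (i), (ii), (iii) in Proposition 2.2 and let $GL_{\mathcal{K}}(I)$ denote the set of all nonsingular matrices in $M_{\mathcal{K}}(I)$. It follows from (i) that $M_{\mathcal{K}}(I)$ is a matrix algebra and $GL_{\mathcal{K}}(I)$ is a matrix group. By (i), $M_{\mathcal{K}}(I)$ is the set of all $I \times I$ matrices that, for each $L \in \mathcal{K}$, preserve the kernel of the projection $\mathbb{R}^I \to \mathbb{R}^L$ given by $x \to x_L$. Note that when $\mathcal{K} = \{\emptyset, I\}$, $M_{\mathcal{K}}(I) = M(I)$ and $GL_{\mathcal{K}}(I) = GL(I)$.

Definition 2.3. The algebra $M_{\mathcal{K}}(I)$ is called the algebra of $\mathcal{K}$-preserving matrices and $GL_{\mathcal{K}}(I)$ the group of $\mathcal{K}$-preserving matrices. $\square$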

Remark 2.1. When $\mathcal{K}$ is a chain then $J(\mathcal{K}) = \mathcal{K} \setminus \{\emptyset\}$ is also a chain, so it follows from Proposition 2.2(iii) that $M_{\mathcal{K}}(I)$ is an algebra of block-triangular matrices in the usual sense. For a general $\mathcal{K}$ let $q := |J(\mathcal{K})|$ and let $K_1, K_2, \ldots, K_q$ be a never-decreasing listing of the members of the poset $J(\mathcal{K})$, i.e., $i < j \implies K_j \not\subseteq K_i$. If every $A \in M(I)$ is partitioned according to the ordered decomposition

(2.15) $I = [K_1] \,\dot\cup\, \cdots \,\dot\cup\, [K_q]$,

then it is seen from Proposition 2.2(iii) that $M_{\mathcal{K}}(I)$ can be represented as a subalgebra of the algebra of lower block-triangular matrices. That is, $A \in M_{\mathcal{K}}(I)$ is lower block-triangular with additional blocks of zeroes below the main diagonal; see (1.13) and also Section 2.8 for further examples. $\square$

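Remark 2.2. For $K \in \mathcal{K}$ and $A \in M(I)$ let $A_K$ denote the $K \times K$ submatrix of $A$ and partition $A_K$ according to (2.3) and (2.10) as follows:

(2.16) $A_K = \begin{pmatrix} A_{\langle K \rangle} & A_{\langle K]} \\ A_{[K \rangle} & A_{[K]} \end{pmatrix}$;

note that $A_{[KK]} = A_{[K]}$ when $K \in J(\mathcal{K})$. By Proposition 2.2(ii), if $A \in M_{\mathcal{K}}(I)$ then for every $K \in J(\mathcal{K})$ and $x \in \mathbb{R}^I$,

(2.17) $A_{\langle K]} = 0$

(2.18) $(Ax)_{[K]} = A_{[K]} x_{[K]} + A_{[K \rangle} x_{\langle K \rangle}$.

Furthermore, the linear mapping

(2.19) $M_{\mathcal{K}}(I) \to \times(M([K] \times \langle K \rangle) \times M([K]) \mid K \in J(\mathcal{K}))$

$A \to ((A_{[K \rangle}, A_{[K]}) \mid K \in J(\mathcal{K}))$

is bijective. This holds because, by Proposition 2.2(iii), $A \in M_{\mathcal{K}}(I)$ if and only if the $[K] \times (I \setminus K)$-submatrix of $A$ is 0 for every $K \in J(\mathcal{K})$. Under the correspondence (2.19) the subset $GL_{\mathcal{K}}(I)$ corresponds to the subset

(2.20) $\times(M([K] \times \langle K \rangle) \times GL([K]) \mid K \in J(\mathcal{K}))$. $\square$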

Lemma 2.2. For $A \in GL_{\mathcal{K}}(I)$, $L \in \mathcal{K}$, and $K \in J(\mathcal{K})$,

(2.21)

(2.22)

(2.23)

Proof. From Proposition 2.2(ii), $(AC)_L = A_L C_L$ for every $A, C \in M_{\mathcal{K}}(I)$, $L \in \mathcal{K}$. [...] from (2.23) [...]

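Lemma 2.3. The mapping

(2.24) $P(I) \to \times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K}))$

$\Sigma \to ((\Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1},\ \Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))$

from $\Sigma$ to its $\mathcal{K}$-parameters commutes with the actions of $GL_{\mathcal{K}}(I)$ on $P(I)$ and on $\times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K}))$ given by

(2.25) $GL_{\mathcal{K}}(I) \times P(I) \to P(I)$

$(A, \Sigma) \to A \Sigma A^t$

and

(2.26) $GL_{\mathcal{K}}(I) \times (\times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K}))) \to \times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K}))$

$(A, ((R_{[K \rangle}, \Lambda_{[K]}) \mid K \in J(\mathcal{K}))) \to ((A_{[K]} R_{[K \rangle} A_{\langle K \rangle}^{-1} + A_{[K \rangle} A_{\langle K \rangle}^{-1},\ A_{[K]} \Lambda_{[K]} A_{[K]}^t) \mid K \in J(\mathcal{K}))$,

respectively.

Proof. It is straightforward to verify that (2.26) is a group action. We must show that for every $A \in GL_{\mathcal{K}}(I)$, $\Sigma \in P(I)$, and $K \in J(\mathcal{K})$,

(2.27) $(A \Sigma A^t)_{[K \rangle} (A \Sigma A^t)_{\langle K \rangle}^{-1} = A_{[K]} \Sigma_{[K \rangle} \Sigma_{\langle K \rangle}^{-1} A_{\langle K \rangle}^{-1} + A_{[K \rangle} A_{\langle K \rangle}^{-1}$

and

(2.28) $(A \Sigma A^t)_{[K]\cdot} = A_{[K]} \Sigma_{[K]\cdot} A_{[K]}^t$.

[...] $\square$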

Proposition 2.3. If $\Sigma \in P_{\mathcal{K}}(I)$ and $A \in GL_{\mathcal{K}}(I)$, then $A \Sigma A^t \in P_{\mathcal{K}}(I)$.

Proof. We shall show that condition (ii) of Theorem 2.1 is valid with $\Sigma$ replaced by $A \Sigma A^t$. Since $\Sigma \in P_{\mathcal{K}}(I)$, (ii) holds for $\Sigma$. Now replace $x$ by $A^{-1}x$ in (ii) and let $B = A^{-1}$. The left-hand side of (ii) becomes $\mathrm{tr}((A \Sigma A^t)^{-1} x x^t)$ while the summands on the right-hand side become

[...]

The first equality uses (2.18) and Proposition 2.2(ii), the third uses (2.22) and (2.23), and the fourth uses (2.27) and (2.28). Therefore condition (ii) of Theorem 2.1 holds for $A \Sigma A^t$. $\square$
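Proposition 2.3 can be illustrated for the trivariate lattice (1.3). The sketch below is not from the report; it assumes that the lattice-preserving matrices for (1.3) are the nonsingular lower triangular matrices with zero $(3,2)$ entry, and it tests membership in the restricted class through the vanishing of the partial covariance of $x_2$ and $x_3$ given $x_1$. All names and numerical values are hypothetical.

```python
from fractions import Fraction as F


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def congruence(a, sigma):
    """Return A Sigma A^t, the covariance matrix of Ax when x ~ N(Sigma)."""
    return matmul(matmul(a, sigma), transpose(a))


def satisfies_ci(sigma):
    """True when sigma_23 - sigma_21 * sigma_13 / sigma_11 vanishes."""
    return sigma[1][2] - sigma[1][0] * sigma[0][2] / sigma[0][0] == 0


sigma = [[F(2), F(1), F(1)], [F(1), F(3), F(1, 2)], [F(1), F(1, 2), F(4)]]

a_preserving = [[2, 0, 0], [1, 3, 0], [-1, 0, 1]]    # zero in position (3,2)
a_triangular = [[2, 0, 0], [1, 3, 0], [-1, 1, 1]]    # nonzero in position (3,2)

print(satisfies_ci(sigma))
print(satisfies_ci(congruence(a_preserving, sigma)))
print(satisfies_ci(congruence(a_triangular, sigma)))
```

The transformed matrix stays in the restricted class for the first choice of $A$ but not for the general lower triangular matrix, which shows why the extra zero block is needed.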

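2.5. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$.

Theorem 2.2 below establishes the one-to-one correspondence between $\Sigma$ and its $\mathcal{K}$-parameters. Together with Theorem 2.1(ii) and Lemma 2.5, this decomposition of the parameter space $P_{\mathcal{K}}(I)$ yields the fundamental factorization of the likelihood function for the CI model $N(\mathcal{K})$ (cf. Theorem 3.1).

Lemma 2.4. For any family

$((R_{[K \rangle}, \Lambda_{[K]}) \mid K \in J(\mathcal{K})) \in \times(M([K] \times \langle K \rangle) \times P([K]) \mid K \in J(\mathcal{K}))$

there exists a matrix $A \in GL_{\mathcal{K}}(I)$ such that for every $K \in J(\mathcal{K})$,

(2.29) $A_{[K \rangle} A_{\langle K \rangle}^{-1} = R_{[K \rangle}$

(2.30) $A_{[K]} A_{[K]}^t = \Lambda_{[K]}$.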

Proof. First choose matrices $A_{[K]} \in GL([K])$, $K \in J(\mathcal{K})$, that satisfy (2.30). As in Remark 2.1 let $K_1, \ldots, K_q$ be a never-decreasing listing of the elements in $J(\mathcal{K})$. For notational convenience abbreviate $K_k$ by $k$, $\langle K_k\rangle$ by $\langle k\rangle$, $[K_k\rangle$ by $[k\rangle$, and $[K_k]$ by $[k]$ whenever they appear as subscripts. If $K_1 \subset K_2$ then $\langle K_2\rangle = [K_1]$, so $A_{\langle 2\rangle} = A_{[1]}$ and $A_{[2\rangle}$ is uniquely determined by (2.29); if $K_1 \not\subset K_2$ then $\langle K_2\rangle = \emptyset$, so (2.29) is vacuous. Now suppose that we have determined $A_{[2\rangle}, \ldots, A_{[k-1\rangle}$ satisfying (2.29). These $k-2$ matrices (some of which may be vacuous), together with $A_{[1]}, \ldots, A_{[k-1]}$, completely determine $A_{\langle k\rangle}$. This follows from the decomposition (cf. (2.8))

(2.31)

and the fact that $K_i \subseteq \langle K_k\rangle \Rightarrow i < k$ for a never-decreasing listing. Now $A_{[k\rangle}$ is uniquely determined by (2.29) and, after induction on $k$,

is the of

(2. A € CL:1l(I).

) is


(2.32)

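Proof. By Theorem 2.1(ii), (2.32) is injective. To show that (2.32) is surjective, consider

$((R_{[K\rangle}, \Lambda_{[K]}) \mid K \in J(\mathcal{K})) \in \times(M([K] \times \langle K\rangle) \times P([K]) \mid K \in J(\mathcal{K}))$.

By Lemma 2.4 there exists a matrix $A \in GL_{\mathcal{K}}(I)$ satisfying (2.29) and (2.30). Define $\Sigma := AA^t$; then $\Sigma \in P_{\mathcal{K}}(I)$ by Proposition 2.3 (with $\Sigma = 1_I$). The $\mathcal{K}$-parameters of $\Sigma$ are given by $\Sigma_{[K\rangle}\Sigma_{\langle K\rangle}^{-1} = A_{[K\rangle}A_{\langle K\rangle}^{-1} = R_{[K\rangle}$ and $\Sigma_{[K]\cdot} = A_{[K]}A_{[K]}^t = \Lambda_{[K]}$, $K \in J(\mathcal{K})$ (set $\Sigma = 1_I$ in (2.27) and (2.28)). []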

2.6. Transitive action of the group of $\mathcal{K}$-preserving matrices.

Theorem 2.3. The action

(2.33)  $GL_{\mathcal{K}}(I) \times P_{\mathcal{K}}(I) \to P_{\mathcal{K}}(I)$
        $(A, \Sigma) \mapsto A\Sigma A^t$

is well-defined, transitive, continuous, and proper.

Proof. That (2.33) is well-defined follows from Proposition 2.3. By Lemma

and

Lemma so it

is transitive. That (2.33) is continuous is

) .and c!~lSs1CfL! action or I) on I} is proper i

the action is proper. u


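Remark 2.3. The action

(2.34)  $GL_{\mathcal{K}}(I) \times P_{\mathcal{K}}(I)^{-1} \to P_{\mathcal{K}}(I)^{-1}$
        $(A, \Delta) \mapsto (A^{-1})^t \Delta A^{-1}$

induced on $P_{\mathcal{K}}(I)^{-1}$ by (2.33) is also well-defined, transitive, continuous, and proper. []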

Remark 2.4. Since both $P_{\mathcal{K}}(I)$ and $P_{\mathcal{K}}(I)^{-1}$ contain the $I \times I$ identity matrix $1_I$, it follows from the transitivity of the actions (2.33) and (2.34) that

(2.35)  $P_{\mathcal{K}}(I) = \{AA^t \in P(I) \mid A \in GL_{\mathcal{K}}(I)\}$,

(2.36)  $P_{\mathcal{K}}(I)^{-1} = \{A^tA \in P(I) \mid A \in GL_{\mathcal{K}}(I)\}$. []

If $\mathcal{K} = \{\emptyset, I\}$ then $P_{\mathcal{K}}(I)^{-1} = P_{\mathcal{K}}(I) = P(I)$, so both actions (2.33) and (2.34) reduce to the well-known transitive actions of $GL(I)$ on $P(I)$. If $\mathcal{K}$ is a chain as in Examples 2.1 and 2.2 in Section 2.8 then again $P_{\mathcal{K}}(I) = P_{\mathcal{K}}(I)^{-1} = P(I)$, but now $GL_{\mathcal{K}}(I)$ is a group of nonsingular lower block-triangular matrices in the usual sense; the actions (2.33) and (2.34) are the well-known transitive actions of $GL_{\mathcal{K}}(I)$ on $P(I)$.

The following lemma generalizes the Schur decomposition for


Lemma 2.5. For $\Sigma \in P_{\mathcal{K}}(I)$,

(2.37)  $\det(\Sigma) = \prod(\det(\Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))$.

Proof. By Theorem 2.3 there exists $A \in GL_{\mathcal{K}}(I)$ such that $\Sigma = AA^t$. Thus

$\det(\Sigma) = \det(AA^t)$
$\qquad = \prod(\det(A_{[K]}A_{[K]}^t) \mid K \in J(\mathcal{K}))$
$\qquad = \prod(\det(\Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))$.

The second equality holds since $A$ can be represented as a lower block-triangular matrix (cf. Remark 2.1), while the third equality follows from (2.28).
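A small numerical illustration of (2.37) may help. This is a sketch under the assumption that $\mathcal{K}$ is the lattice of Example 2.5 with one-dimensional blocks (indices 0, 1, 2 for $L \cap M$, $[L]$, $[M]$); the matrices `B`, `Sigma`, and `Sigma_bad` are made-up examples.

```python
import numpy as np

# Sigma = B B^t with B generalized block-triangular, so Sigma lies in P_K(I) by (2.35).
B = np.array([[2.0, 0, 0], [1.0, 3.0, 0], [-1.0, 0, 1.5]])
Sigma = B @ B.T

def det_product(S):
    """Product of det(S_[K].) over K = L∩M, L, M (all blocks are 1 x 1 here)."""
    lam_LM = S[0, 0]                          # <L∩M> is empty
    lam_L = S[1, 1] - S[1, 0] ** 2 / S[0, 0]  # S_[L]. with <L> = L∩M
    lam_M = S[2, 2] - S[2, 0] ** 2 / S[0, 0]  # S_[M]. with <M> = L∩M
    return lam_LM * lam_L * lam_M

det_direct = np.linalg.det(Sigma)

# A positive definite matrix that violates the CI constraint (2.62):
Sigma_bad = Sigma.copy()
Sigma_bad[1, 2] = Sigma_bad[2, 1] = 0.0
```

For `Sigma` the two determinants agree, whereas for `Sigma_bad` they differ, which shows that the factorization genuinely depends on membership in $P_{\mathcal{K}}(I)$.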

2.7. Reconstruction of $\Sigma$ from its $\mathcal{K}$-parameters.

By Theorem 2.2, $\Sigma \in P_{\mathcal{K}}(I)$ is uniquely determined by its $\mathcal{K}$-parameters

[J

(2.38)

It is important to find an explicit method for reconstructing $\Sigma \in P_{\mathcal{K}}(I)$ from its $\mathcal{K}$-parameters.

=


which is just a re...e:xpression of Theorem 2.1(ii). where Ar(K) is the IxI

matrix whose KxK submatrix is

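(2.40)  $\begin{pmatrix} R_{[K\rangle}^t\Lambda_{[K]}^{-1}R_{[K\rangle} & -R_{[K\rangle}^t\Lambda_{[K]}^{-1} \\ -\Lambda_{[K]}^{-1}R_{[K\rangle} & \Lambda_{[K]}^{-1} \end{pmatrix}$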

and whose remaining entries are 0. In general, however, it is not a simple task to determine $\Sigma$ from (2.39) by matrix inversion. We now present a step-wise algorithm for reconstructing $\Sigma$ directly from its $\mathcal{K}$-parameters.
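Before turning to the algorithm, the block form (2.40) can be illustrated numerically: summing the zero-padded blocks over $K \in J(\mathcal{K})$ should reproduce $\Sigma^{-1}$. The sketch below assumes the lattice of Example 2.5 with one-dimensional blocks; the function name `padded_block` and the hard-coded list `J_split` of $(\langle K\rangle, [K])$ index pairs are illustrative assumptions.

```python
import numpy as np

def padded_block(R, Lam, angle, brack, n):
    """n x n matrix whose K x K submatrix is (2.40) and whose other entries are 0."""
    Li = np.linalg.inv(Lam)
    out = np.zeros((n, n))
    out[np.ix_(angle, angle)] = R.T @ Li @ R
    out[np.ix_(angle, brack)] = -R.T @ Li
    out[np.ix_(brack, angle)] = -Li @ R
    out[np.ix_(brack, brack)] = Li
    return out

# (<K>, [K]) for K = L∩M, L, M in the lattice of Example 2.5 (1 x 1 blocks assumed).
J_split = [([], [0]), ([0], [1]), ([0], [2])]

B = np.array([[2.0, 0, 0], [1.0, 3.0, 0], [-1.0, 0, 1.5]])
Sigma = B @ B.T   # an element of P_K(I)

precision = np.zeros((3, 3))
for angle, brack in J_split:
    S_ba = Sigma[np.ix_(brack, angle)]
    if angle:
        R = S_ba @ np.linalg.inv(Sigma[np.ix_(angle, angle)])
    else:
        R = np.zeros((len(brack), 0))   # empty <K>: R is a |[K]| x 0 matrix
    Lam = Sigma[np.ix_(brack, brack)] - R @ S_ba.T
    precision += padded_block(R, Lam, angle, brack, 3)
```

The accumulated `precision` matches `np.linalg.inv(Sigma)` and has a zero in the $[L] \times [M]$ position, in line with the form (2.65).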

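Let $K_1, \ldots, K_q$ be a never-decreasing listing of the members of the poset $J(\mathcal{K})$ (cf. Remark 2.1 and the proof of Lemma 2.4), partition $\Sigma$ according to (2.9), and list the $\mathcal{K}$-parameters in the corresponding order:

(2.41)  $(\Lambda_{[1]}, (R_{[2\rangle}, \Lambda_{[2]}), \ldots, (R_{[q\rangle}, \Lambda_{[q]})) \in P([K_1]) \times M([K_2] \times \langle K_2\rangle) \times P([K_2]) \times \cdots \times M([K_q] \times \langle K_q\rangle) \times P([K_q])$.

(Recall that whenever they appear as subscripts, $K_k$, $\langle K_k\rangle$, $[K_k\rangle$, and $[K_k]$ are abbreviated by $k$, $\langle k\rangle$, $[k\rangle$, and $[k]$, respectively.) The reconstruction algorithm proceeds step-wise as follows. At step $k$ the relations in (2.38) are inverted to determine $\Sigma_{[k\rangle}$ and $\Sigma_{[k]}$ from the submatrix $\Sigma_{1\cup\cdots\cup(k-1)}$ determined in step $k-1$. The remaining entries in $\Sigma_{1\cup\cdots\cup k}$ are determined by CI conditions.

Step 1:  $\Sigma_1 = \Lambda_{[1]}$.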

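Step 2:  $\Sigma_{[2\rangle} = R_{[2\rangle}\Sigma_{\langle 2\rangle}$,
         $\Sigma_{[2]} = \Lambda_{[2]} + R_{[2\rangle}\Sigma_{\langle 2]}$.

At this point the submatrix $\Sigma_{1\cup 2}$ is completely determined: if $K_1 \subset K_2$ then $\Sigma_{1\cup 2} = \Sigma_2$, while if $K_1 \not\subset K_2$ then $K_1 \cap K_2 = \emptyset$, so the $[K_1] \times [K_2]$ submatrix of $\Sigma$ is 0 by (2.2). (Recall that $1\cup 2$ abbreviates $K_1 \cup K_2$ when appearing as a subscript.) By (2.42), $\langle K_3\rangle \subseteq K_1 \cup K_2$, so $\Sigma_{\langle 3\rangle}$ is a submatrix of $\Sigma_{1\cup 2}$, hence the next step may be carried out.

Step 3a:  $\Sigma_{[3\rangle} = R_{[3\rangle}\Sigma_{\langle 3\rangle}$,
          $\Sigma_{[3]} = \Lambda_{[3]} + R_{[3\rangle}\Sigma_{\langle 3]}$.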

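It is important to note that after Steps 1, 2, and 3a, the three submatrices $\Sigma_1$, $\Sigma_2$, $\Sigma_3$ are now determined but the complete submatrix $\Sigma_{1\cup 2\cup 3}$ may not yet be fully determined. The remaining $[K_3] \times ((K_1 \cup K_2 \cup K_3) \setminus K_3)$-submatrix of $\Sigma_{1\cup 2\cup 3}$, which we denote by $\Sigma_{[3\}}$, is determined from $\Sigma_{1\cup 2}$ by means of the pairwise CI requirements imposed by $\mathcal{K}$ [cf. (2.44)]:

Step 3b:  $\Sigma_{[3\}} = R_{[3\rangle}\Sigma_{\langle 3\}}$,

where $\Sigma_{\langle 3\}}$ is the $\langle K_3\rangle \times ((K_1 \cup K_2 \cup K_3) \setminus K_3)$-submatrix of $\Sigma_{1\cup 2\cup 3}$. By (2.42) and (2.43), however, $\Sigma_{\langle 3\}}$ is in fact a submatrix of $\Sigma_{1\cup 2}$, hence may be used in Step 3b.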

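After $k-1$ such steps the submatrix $\Sigma_{1\cup\cdots\cup(k-1)}$ is fully determined and in turn may be used to obtain $\Sigma_{1\cup\cdots\cup k}$ as follows. First note that the never-decreasing nature of $K_1, \ldots, K_q$ implies that

$K_1 \cup \cdots \cup K_k = \dot\cup([K_j] \mid j = 1, \ldots, k)$,

$K_k = \dot\cup([K_j] \mid j = 1, \ldots, k,\ K_j \subseteq K_k)$.

From these relations and (2.3) it may be deduced that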

(2.42)

(2.43)

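Thus, if we denote the $[K_k] \times ((K_1 \cup \cdots \cup K_k) \setminus K_k)$-submatrix of $\Sigma_{1\cup\cdots\cup k}$ by $\Sigma_{[k\}}$ and the $\langle K_k\rangle \times ((K_1 \cup \cdots \cup K_k) \setminus K_k)$-submatrix by $\Sigma_{\langle k\}}$, it follows from (2.42) and (2.43) that both $\Sigma_{\langle k\rangle}$ and $\Sigma_{\langle k\}}$ are in fact submatrices of $\Sigma_{1\cup\cdots\cup(k-1)}$, so the next step may be carried out:

Step k:  $\Sigma_{[k\rangle} = R_{[k\rangle}\Sigma_{\langle k\rangle}$,
         $\Sigma_{[k]} = \Lambda_{[k]} + R_{[k\rangle}\Sigma_{\langle k]}$,
(2.44)   $\Sigma_{[k\}} = R_{[k\rangle}\Sigma_{\langle k\}}$.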

relation in i::>lnlCe ~ (:: L) and

is eqtlival.~nt

J-subnntJrlX of ••Uk)-1 is a zero matrix, which

··Ukis deitermbled after after q

• == I is ful

[In carrying out this algorithm one must use the convention that if $C \neq \emptyset$ and $D \neq \emptyset$, then the product of a $C \times \emptyset$ matrix with an $\emptyset \times D$ matrix is the $C \times D$ zero matrix.]
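The step-wise algorithm of this section can be sketched in a few lines of NumPy. This is only an illustration under stated assumptions: $J(\mathcal{K})$ is passed as a never-decreasing list of index lists, $\langle K\rangle$ is computed as the union of the join-irreducible elements properly contained in $K$, and the function names `split`, `k_parameters`, and `reconstruct` are hypothetical. NumPy's product of a $c \times 0$ matrix with a $0 \times d$ matrix is the $c \times d$ zero matrix, which matches the convention stated above.

```python
import numpy as np

def split(J):
    """Return (<K>, [K]) as sorted index lists for each K in the listing J."""
    pairs = []
    for K in map(set, J):
        # <K>: union of the join-irreducible elements properly contained in K.
        angle = set().union(*(set(Kp) for Kp in J if set(Kp) < K))
        pairs.append((sorted(angle), sorted(K - angle)))
    return pairs

def k_parameters(Sigma, J):
    """Ordered K-parameters (R_[K>, Lambda_[K]) of Sigma."""
    params = []
    for angle, brack in split(J):
        S_ba = Sigma[np.ix_(brack, angle)]
        if angle:
            R = np.linalg.solve(Sigma[np.ix_(angle, angle)], S_ba.T).T
        else:
            R = np.zeros((len(brack), 0))   # empty <K>: R is a |[K]| x 0 matrix
        params.append((R, Sigma[np.ix_(brack, brack)] - R @ S_ba.T))
    return params

def reconstruct(params, J, n):
    """Rebuild Sigma from its ordered K-parameters; J must be never-decreasing."""
    Sigma = np.zeros((n, n))
    done = []                                        # K_1 ∪ ... ∪ K_{k-1}
    for (angle, brack), (R, Lam) in zip(split(J), params):
        rest = [i for i in done if i not in angle]   # (K_1 ∪ ... ∪ K_k) \ K_k
        Sigma[np.ix_(brack, angle)] = R @ Sigma[np.ix_(angle, angle)]
        Sigma[np.ix_(angle, brack)] = Sigma[np.ix_(brack, angle)].T
        Sigma[np.ix_(brack, brack)] = Lam + R @ Sigma[np.ix_(angle, brack)]
        Sigma[np.ix_(brack, rest)] = R @ Sigma[np.ix_(angle, rest)]   # CI step
        Sigma[np.ix_(rest, brack)] = Sigma[np.ix_(brack, rest)].T
        done = sorted(set(done) | set(brack))
    return Sigma

# Lattice of Example 2.5 with 1 x 1 blocks: K_1 = L∩M, K_2 = L, K_3 = M.
J = [[0], [0, 1], [0, 2]]
B = np.array([[2.0, 0, 0], [1.0, 3.0, 0], [-1.0, 0, 1.5]])
Sigma = B @ B.T
rebuilt = reconstruct(k_parameters(Sigma, J), J, 3)
```

In this example `rebuilt` agrees with `Sigma`; the listing passed as `J` must be never-decreasing, since each step reads entries filled in by earlier steps.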

2.8. Examples.

A series of nine Examples will illustrate the following basic aspects of a lattice CI model $N(\mathcal{K})$: (a) the distributive lattice $\mathcal{K}$ of subsets of $I$ and the poset $J(\mathcal{K})$ of join-irreducible elements; (b) the $\mathcal{K}$-parametrization (2.32) of $P_{\mathcal{K}}(I)$ and the associated decomposition of $\mathrm{tr}(\Sigma^{-1}xx^t)$ given in Theorem 2.1(ii); (c) the choice of a never-decreasing listing of the members of $J(\mathcal{K})$ and the reconstruction of the covariance matrix $\Sigma \in P_{\mathcal{K}}(I)$ from its ordered $\mathcal{K}$-parameters (cf. (2.38)) by means of the step-wise algorithm in Section 2.7, as well as the form of the precision matrix $\Delta = \Sigma^{-1} \in P_{\mathcal{K}}(I)^{-1}$; (d) the form of the $\mathcal{K}$-preserving matrices, i.e., the group $GL_{\mathcal{K}}(I)$ of matrices $A$, partitioned according to the ordered decomposition (2.15), that acts transitively on $P_{\mathcal{K}}(I)$ (cf. Remarks 2.1 and 2.2). The reader should verify directly that (2.35) and (2.36) hold for $P_{\mathcal{K}}(I)$, $P_{\mathcal{K}}(I)^{-1}$, and $GL_{\mathcal{K}}(I)$ in these nine Examples.

In each Example the lattice diagram of $\mathcal{K}$ appears in an accompanying Figure, in which the members of $J(\mathcal{K})$ are indicated by open circles and the remaining members of $\mathcal{K}$ by solid dots. The minimal

I

These Examples will be continued in Section 3.2. where the .MLE i is

of and in .;,eC;:l:lOn 4.3 to 'n"",,",U1!rle>

tional CXlHll'UClS appear in (1991).


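Example 2.1. First consider the simple case where $\mathcal{K} = \{\emptyset, L, I\}$ (see Figure 2.1).

[Figure 2.1: lattice diagram of the chain $\emptyset \subset L \subset I$]

Since $\mathcal{K}$ is a chain, $P_{\mathcal{K}}(I) = P(I)$. Note that $J(\mathcal{K}) = \{L, I\}$ and $\langle L\rangle = \emptyset$, $\langle I\rangle = [L] = L$. Thus the $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ becomes

(2.45)  $P(I) \leftrightarrow P(L) \times M([I] \times L) \times P([I])$

and

(2.46)

The algorithm for reconstructing $\Sigma$ from its ordered $\mathcal{K}$-parameters $\Lambda_{[L]}$, $R_{[I\rangle}$, $\Lambda_{[I]}$ takes the following form:

Step 1:  $\Sigma_L = \Lambda_{[L]}$
Step 2:  $\Sigma_{[I\rangle} = R_{[I\rangle}\Sigma_L$
         $\Sigma_{[I]} = \Lambda_{[I]} + R_{[I\rangle}\Sigma_{\langle I]}$

The group $GL_{\mathcal{K}}(I)$ is a lower block-triangular matrix group in the ordinary sense: $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I \times I$ matrices of the form

$A = \begin{pmatrix} A_L & 0 \\ A_{[I\rangle} & A_{[I]} \end{pmatrix}$. []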


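Example 2.2. If $\mathcal{K} = \{\emptyset \equiv K_0, K_1, \ldots, K_{q-1}, K_q \equiv I\}$ is an ascending chain, i.e., $\emptyset \subset K_1 \subset \cdots \subset K_{q-1} \subset I$, then a well-known generalization of the preceding example is obtained (see Figure 2.2).

[Figure 2.2: lattice diagram of the chain $\emptyset \subset K_1 \subset \cdots \subset K_{q-1} \subset I$]

Again $P_{\mathcal{K}}(I) = P(I)$, but the $\mathcal{K}$-parametrization is changed. Note that $J(\mathcal{K}) = \{K_1, \ldots, K_q\}$ and $\langle K_1\rangle = \emptyset$, $\langle K_k\rangle = K_{k-1}$, $k = 2, \ldots, q$. Then the $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ becomes

(2.48)  $P(I) \leftrightarrow P(K_1) \times M([K_2] \times K_1) \times P([K_2]) \times \cdots \times M([K_q] \times K_{q-1}) \times P([K_q])$
        $\Sigma \leftrightarrow (\Sigma_1, \Sigma_{[2\rangle}\Sigma_1^{-1}, \Sigma_{[2]\cdot}, \ldots, \Sigma_{[q\rangle}\Sigma_{q-1}^{-1}, \Sigma_{[q]\cdot})$

and

(2.49)

where $K_1, K_2, \ldots, K_q$ are abbreviated $1, 2, \ldots, q$ whenever they occur as subscripts. Then $\Sigma$ is reconstructed from its ordered $\mathcal{K}$-parameters $\Lambda_{[1]}$, $R_{[2\rangle}$, $\Lambda_{[2]}$, $\ldots$, $R_{[q\rangle}$, $\Lambda_{[q]}$ as follows:

Step 1:  $\Sigma_1 = \Lambda_{[1]}$
Step 2:  $\Sigma_{[2\rangle} = R_{[2\rangle}\Sigma_1$
         $\Sigma_{[2]} = \Lambda_{[2]} + R_{[2\rangle}\Sigma_{\langle 2]}$
$\cdots$
Step q:  $\Sigma_{[q\rangle} = R_{[q\rangle}\Sigma_{q-1}$
         $\Sigma_{[q]} = \Lambda_{[q]} + R_{[q\rangle}\Sigma_{\langle q]}$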


The group $GL_{\mathcal{K}}(I)$ is again a group of lower block-triangular matrices in the usual sense. For example, when $q = 4$, $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I \times I$ matrices of the form

(2.50)  $A = \begin{pmatrix} A_1 & 0 & 0 & 0 \\ A_{[2\rangle} & A_{[2]} & 0 & 0 \\ \multicolumn{2}{c}{A_{[3\rangle}} & A_{[3]} & 0 \\ \multicolumn{3}{c}{A_{[4\rangle}} & A_{[4]} \end{pmatrix}$. []

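Example 2.3. Consider the lattice $\mathcal{K} = \{\emptyset \equiv L \cap M, L, M, L \cup M \equiv I\}$ (see Figure 2.3).

[Figure 2.3: lattice diagram of $\mathcal{K}$]

Here the CI requirement determined by $\mathcal{K}$ is nontrivial, so $P_{\mathcal{K}}(I) \subset P(I)$. Now $J(\mathcal{K}) = \{L, M\}$ and $\langle L\rangle = \langle M\rangle = \emptyset$. The $\mathcal{K}$-parametrization takes the form

(2.51)  $P_{\mathcal{K}}(I) \leftrightarrow P(L) \times P(M)$
        $\Sigma \leftrightarrow (\Sigma_L, \Sigma_M)$

and

(2.52)

$\Sigma$ may be reconstructed as follows:

Step 1:  $\Sigma_L = \Lambda_{[L]}$
Step 2:  $\Sigma_M = \Lambda_{[M]}$
         $\Sigma_{[M\}} = 0$.

Thus $P_{\mathcal{K}}(I)$ consists of all block-diagonal matrices $\Sigma$ of the form

(2.53)  $\Sigma = \begin{pmatrix} \Sigma_L & 0 \\ 0 & \Sigma_M \end{pmatrix}$,

where $\Sigma$ is partitioned according to the ordered decomposition

(2.54)  $I = L \,\dot\cup\, M$.

In this Example, as in Examples 2.1 and 2.2, $P_{\mathcal{K}}(I) = P_{\mathcal{K}}(I)^{-1}$ and both are linear, i.e., closed under (nonnegative) linear combinations. The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I \times I$ matrices of the form

(2.55)  $A = \begin{pmatrix} A_L & 0 \\ 0 & A_M \end{pmatrix}$. []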

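Example 2.4. If $\mathcal{K} = \{\emptyset \equiv L \cap M, L, M, L \cup M, I\}$ (see Figure 2.4)

[Figure 2.4: lattice diagram of $\mathcal{K}$]

then $J(\mathcal{K}) = \{L, M, I\}$, $\langle L\rangle = \langle M\rangle = \emptyset$, and $\langle I\rangle = L \cup M$. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ assumes the form

(2.56)  $P_{\mathcal{K}}(I) \leftrightarrow P(L) \times P(M) \times M([I] \times (L \cup M)) \times P([I])$
        $\Sigma \leftrightarrow (\Sigma_L, \Sigma_M, \Sigma_{[I\rangle}\Sigma_{L\cup M}^{-1}, \Sigma_{[I]\cdot})$

and

$\mathrm{tr}(\Sigma^{-1}xx^t) = \mathrm{tr}(\Sigma_L^{-1}x_Lx_L^t) + \mathrm{tr}(\Sigma_M^{-1}x_Mx_M^t) + \mathrm{tr}(\Sigma_{[I]\cdot}^{-1}(x_{[I]} - \Sigma_{[I\rangle}\Sigma_{L\cup M}^{-1}x_{L\cup M})(\cdots)^t)$.

Now $L$, $M$, $I$ is a never-decreasing listing of $J(\mathcal{K})$, so $\Sigma$ may be reconstructed from its ordered nontrivial $\mathcal{K}$-parameters $\Lambda_{[L]}$, $\Lambda_{[M]}$, $R_{[I\rangle}$, $\Lambda_{[I]}$ as follows:

Steps 1,2:  Repeat Steps 1,2 in Example 2.3.
Step 3:  $\Sigma_{[I\rangle} = R_{[I\rangle}\mathrm{Diag}(\Sigma_L, \Sigma_M)$
         $\Sigma_{[I]} = \Lambda_{[I]} + R_{[I\rangle}\Sigma_{\langle I]}$.

Thus $P_{\mathcal{K}}(I)$ consists of all $\Sigma$ of the form

(2.57)  $\Sigma = \begin{pmatrix} \begin{matrix} \Sigma_L & 0 \\ 0 & \Sigma_M \end{matrix} & \Sigma_{\langle I]} \\ \Sigma_{[I\rangle} & \Sigma_{[I]} \end{pmatrix}$,

where $\Sigma$ is partitioned according to the ordered decomposition

(2.58)  $I = L \,\dot\cup\, M \,\dot\cup\, [I]$.

Note that $P_{\mathcal{K}}(I)$ is linear but $P_{\mathcal{K}}(I)^{-1}$ is not. The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I \times I$ matrices of the form

(2.59)  $A = \begin{pmatrix} A_L & 0 & 0 \\ 0 & A_M & 0 \\ \multicolumn{2}{c}{A_{[I\rangle}} & A_{[I]} \end{pmatrix}$. []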

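Example 2.5. Suppose that $\mathcal{K} = \{\emptyset, L \cap M, L, M, L \cup M \equiv I\}$ (see Figure 2.5). (Note that (1.3) is a special case.)

[Figure 2.5: lattice diagram of $\mathcal{K}$]

Now $J(\mathcal{K}) = \{L \cap M, L, M\}$, and $\langle L \cap M\rangle = \emptyset$, $\langle L\rangle = \langle M\rangle = L \cap M$. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ is given by

(2.60)  $P_{\mathcal{K}}(I) \leftrightarrow P(L \cap M) \times M([L] \times (L \cap M)) \times P([L]) \times M([M] \times (L \cap M)) \times P([M])$
        $\Sigma \leftrightarrow (\Sigma_{L\cap M}, \Sigma_{[L\rangle}\Sigma_{L\cap M}^{-1}, \Sigma_{[L]\cdot}, \Sigma_{[M\rangle}\Sigma_{L\cap M}^{-1}, \Sigma_{[M]\cdot})$,

and

(2.61)

Since $L \cap M$, $L$, $M$ is a never-decreasing listing of $J(\mathcal{K})$, $\Sigma$ can be reconstructed from its ordered $\mathcal{K}$-parameters $\Lambda_{[L\cap M]}$, $R_{[L\rangle}$, $\Lambda_{[L]}$, $R_{[M\rangle}$, $\Lambda_{[M]}$ as follows:

Step 1:  $\Sigma_{L\cap M} = \Lambda_{[L\cap M]}$
Step 2:  $\Sigma_{[L\rangle} = R_{[L\rangle}\Sigma_{L\cap M}$
         $\Sigma_{[L]} = \Lambda_{[L]} + R_{[L\rangle}\Sigma_{\langle L]}$
Step 3:  $\Sigma_{[M\rangle} = R_{[M\rangle}\Sigma_{L\cap M}$
         $\Sigma_{[M]} = \Lambda_{[M]} + R_{[M\rangle}\Sigma_{\langle M]}$
(2.62)   $\Sigma_{[M\}} = R_{[M\rangle}\Sigma_{\langle L]}$  $(= \Sigma_{[M\rangle}\Sigma_{L\cap M}^{-1}\Sigma_{\langle L]})$.

(Note that $\Sigma_{\langle M\}} = \Sigma_{\langle L]}$.) Thus $P_{\mathcal{K}}(I)$ consists of all $\Sigma \in P(I)$ of the form

(2.63)  $\Sigma = \begin{pmatrix} \Sigma_{L\cap M} & \Sigma_{\langle L]} & \Sigma_{\langle M]} \\ \Sigma_{[L\rangle} & \Sigma_{[L]} & \Sigma_{[M\}}^t \\ \Sigma_{[M\rangle} & \Sigma_{[M\}} & \Sigma_{[M]} \end{pmatrix}$

such that $\Sigma_{[M\}}$ satisfies (2.62) and where $\Sigma$ is partitioned according to the ordered decomposition

(2.64)  $I = (L \cap M) \,\dot\cup\, [L] \,\dot\cup\, [M]$.

Then $P_{\mathcal{K}}(I)^{-1}$ consists of all $\Delta \in P(I)$ having the simple form

(2.65)  $\Delta = \begin{pmatrix} \Delta_{L\cap M} & \Delta_{\langle L]} & \Delta_{\langle M]} \\ \Delta_{[L\rangle} & \Delta_{[L]} & 0 \\ \Delta_{[M\rangle} & 0 & \Delta_{[M]} \end{pmatrix}$.

Thus, in this example $P_{\mathcal{K}}(I)^{-1}$ is linear while $P_{\mathcal{K}}(I)$ is not. The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I \times I$ matrices of the form

(2.66)  $A = \begin{pmatrix} A_{L\cap M} & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 \\ A_{[M\rangle} & 0 & A_{[M]} \end{pmatrix}$. []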


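Example 2.6. Consider the lattice $\mathcal{K} = \{\emptyset, L \cap M, L, M, L \cup M, I\}$ (see Figure 2.6).

[Figure 2.6: lattice diagram of $\mathcal{K}$]

Note that $J(\mathcal{K}) = \{L \cap M, L, M, I\}$ and $\langle L \cap M\rangle = \emptyset$, $\langle L\rangle = \langle M\rangle = L \cap M$, $\langle I\rangle = L \cup M$. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ is given by

(2.67)  $P_{\mathcal{K}}(I) \leftrightarrow P(L \cap M) \times M([L] \times (L \cap M)) \times P([L]) \times M([M] \times (L \cap M)) \times P([M]) \times M([I] \times (L \cup M)) \times P([I])$

and

(2.68)

Since $L \cap M$, $L$, $M$, $I$ is a never-decreasing listing of $J(\mathcal{K})$, $\Sigma$ can be reconstructed from its ordered $\mathcal{K}$-parameters as follows:

Steps 1,2,3:  Repeat Steps 1,2 and 3 in Example 2.5 to obtain $\Sigma_{L\cup M}$.
Step 4:  $\Sigma_{[I\rangle} = R_{[I\rangle}\Sigma_{L\cup M}$
         $\Sigma_{[I]} = \Lambda_{[I]} + R_{[I\rangle}\Sigma_{\langle I]}$.

Thus $P_{\mathcal{K}}(I)$ consists of all $\Sigma$ of the form

(2.69)  $\Sigma = \begin{pmatrix} \Sigma_{L\cup M} & \Sigma_{\langle I]} \\ \Sigma_{[I\rangle} & \Sigma_{[I]} \end{pmatrix}$,

partitioned according to $I = (L \cup M) \,\dot\cup\, [I]$, where $\Sigma_{L\cup M}$ is given by (2.63), (2.64) and (2.62). The precision matrix $\Delta \equiv \Sigma^{-1} \in P_{\mathcal{K}}(I)^{-1}$ is characterized by the condition that $\Sigma_{L\cup M}^{-1}$ have the form (2.65). Thus neither $P_{\mathcal{K}}(I)$ nor $P_{\mathcal{K}}(I)^{-1}$ is linear. The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I \times I$ matrices of the form

(2.70)  $A = \begin{pmatrix} A_{L\cap M} & 0 & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 & 0 \\ A_{[M\rangle} & 0 & A_{[M]} & 0 \\ \multicolumn{3}{c}{A_{[I\rangle}} & A_{[I]} \end{pmatrix}$. []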

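Example 2.7. Let $\mathcal{K}$ be the lattice in Figure 2.7:

[Figure 2.7: lattice diagram of $\mathcal{K}$]

Here $J(\mathcal{K}) = \{L \cap M, L, M, L', M'\}$ and $\langle L \cap M\rangle = \emptyset$, $\langle L\rangle = \langle M\rangle = L \cap M$, $\langle L'\rangle = \langle M'\rangle = L \cup M \equiv L' \cap M'$. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ is given by

(2.71)  $P_{\mathcal{K}}(I) \leftrightarrow P(L \cap M) \times M([L] \times (L \cap M)) \times P([L]) \times M([M] \times (L \cap M)) \times P([M]) \times M([L'] \times (L \cup M)) \times P([L']) \times M([M'] \times (L \cup M)) \times P([M'])$
        $\Sigma \leftrightarrow (\Sigma_{L\cap M}, \Sigma_{[L\rangle}\Sigma_{L\cap M}^{-1}, \Sigma_{[L]\cdot}, \Sigma_{[M\rangle}\Sigma_{L\cap M}^{-1}, \Sigma_{[M]\cdot}, \Sigma_{[L'\rangle}\Sigma_{L\cup M}^{-1}, \Sigma_{[L']\cdot}, \Sigma_{[M'\rangle}\Sigma_{L\cup M}^{-1}, \Sigma_{[M']\cdot})$,

from which the decomposition of $\mathrm{tr}(\Sigma^{-1}xx^t)$ is directly obtained. The matrix $\Sigma$ can be reconstructed from its ordered $\mathcal{K}$-parameters $\Lambda_{[L\cap M]}$, $R_{[L\rangle}$, $\Lambda_{[L]}$, $R_{[M\rangle}$, $\Lambda_{[M]}$, $R_{[L'\rangle}$, $\Lambda_{[L']}$, $R_{[M'\rangle}$, $\Lambda_{[M']}$ as follows:

Steps 1,2,3:  Repeat Steps 1,2,3 in Example 2.5 to obtain $\Sigma_{L\cup M} = \Sigma_{L'\cap M'}$.
Steps 4,5:  Repeat Steps 2,3 in Example 2.5 with $L$, $M$ replaced by $L'$, $M'$.

Thus $P_{\mathcal{K}}(I)$ consists of all $\Sigma$ of the form (2.63) with $L$, $M$ replaced by $L'$, $M'$, partitioned according to the ordered decomposition

(2.72)  $I = (L \cup M) \,\dot\cup\, [L'] \,\dot\cup\, [M']$,

and where $\Sigma_{L'\cap M'} = \Sigma_{L\cup M}$ is given by (2.63). The precision matrix $\Delta = \Sigma^{-1}$ has the form (2.65) with $L$, $M$ replaced by $L'$, $M'$ and satisfies the condition that $\Sigma_{L\cup M}^{-1}$ has the form (2.65). Again, neither $P_{\mathcal{K}}(I)$ nor $P_{\mathcal{K}}(I)^{-1}$ is linear. The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I \times I$ matrices of the form

(2.73)  $A = \begin{pmatrix} A_{L\cap M} & 0 & 0 & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 & 0 & 0 \\ A_{[M\rangle} & 0 & A_{[M]} & 0 & 0 \\ \multicolumn{3}{c}{A_{[L'\rangle}} & A_{[L']} & 0 \\ \multicolumn{3}{c}{A_{[M'\rangle}} & 0 & A_{[M']} \end{pmatrix}$. []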

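Example 2.8. Let $\mathcal{K}$ be the lattice in Figure 2.8:

[Figure 2.8: lattice diagram of $\mathcal{K}$]

Here $J(\mathcal{K}) = \{L \cap M, L, M, L'', M'\}$ and $\langle L \cap M\rangle = \emptyset$, $\langle L\rangle = \langle M\rangle = L \cap M$, $\langle L''\rangle = L$, $\langle M'\rangle = L \cup M = L' \cap M'$. The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ is given by:

(2.74)  $P_{\mathcal{K}}(I) \leftrightarrow P(L \cap M) \times M([L] \times (L \cap M)) \times P([L]) \times M([M] \times (L \cap M)) \times P([M]) \times M([L''] \times L) \times P([L'']) \times M([M'] \times (L \cup M)) \times P([M'])$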

I[M] .• '

I[M' ].)

from which the decomposition

can be reconstructed from its ordered :It-parameters A[lJ)M]' R[L> , A[L]'

, R[M')' A[M'] as follows:

I =

t:l0:neo. accordlna: to

twhere l{LUl =l(L"t'

l[LU) =,R.[Lu>~

l(LU] = A(L"] + R(LU)l(LU]

ThusP:t(I) consists of a.ll I of the form

twherel{M]:::::l(M.t; thus weobta.in

Step 5: l(M'> = R(M')~

l(M']= A(M'] + R(M·)l(M·]

~:E.!~~~Repeat Steps 1.2.3 in Example 2.5. to obta1n

(2.76)

(2.77)

(2.75)

. • • • •]I = U U U U .


where $\Sigma_{[M\}}$, $\Sigma_{[L''\}}$, $\Sigma_{[M'\}}$ satisfy (2.62), (2.75), (2.76), respectively. The precision matrix $\Delta \equiv \Sigma^{-1}$ satisfies the following three conditions: its $[M'] \times [L'']$- and $[L''] \times [M']$-submatrices are 0, the $[L''] \times [M]$- and $[M] \times [L'']$-submatrices of $\Sigma_{L'}^{-1}$ are 0, and $\Sigma_{L\cup M}^{-1}$ has the form (2.65). Neither $P_{\mathcal{K}}(I)$ nor $P_{\mathcal{K}}(I)^{-1}$ is linear. The group $GL_{\mathcal{K}}(I)$ consists of all nonsingular $I \times I$ matrices of the form

(2.79)  $A = \begin{pmatrix} A_{L\cap M} & 0 & 0 & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 & 0 & 0 \\ A_{[M\rangle} & 0 & A_{[M]} & 0 & 0 \\ \multicolumn{2}{c}{A_{[L''\rangle}} & 0 & A_{[L'']} & 0 \\ \multicolumn{3}{c}{A_{[M'\rangle}} & 0 & A_{[M']} \end{pmatrix}$. []

Example 2.9. Finally consider the lattice $\mathcal{K}$ in Figure 2.9a:

[Figure 2.9a: the lattice $\mathcal{K}$]

Although this lattice properly contains the lattices in Examples 2.7 and 2.8 as sublattices, the set $P_{\mathcal{K}}(I)$ that it determines is much simpler than those in Examples 2.7 and 2.8. The reader may verify that $P_{\mathcal{K}}(I)$ is identical to $P_{\mathcal{M}}(I)$, where $\mathcal{M}$ is the sublattice in Figure 2.9b:

(2.80)  $A = \begin{pmatrix} A_{L\cap M} & 0 & 0 & 0 & 0 \\ A_{[L\rangle} & A_{[L]} & 0 & 0 & 0 \\ A_{[M\rangle} & 0 & A_{[M]} & 0 & 0 \\ \multicolumn{2}{c}{A_{[L''\rangle}} & 0 & A_{[L'']} & 0 \\ A_{[M''(L\cap M)]} & 0 & A_{[M''M]} & 0 & A_{[M'']} \end{pmatrix}$

(Note that $(A_{[M''(L\cap M)]}\ A_{[M''M]}) = A_{[M''\rangle}$ in (2.80).) []

Remark 2.5. For any subset $K \subseteq I$ define $K' := I \setminus K$. It is an elementary exercise to verify that for $L, M \subseteq I$, $x_L \perp x_M \mid x_{L\cap M}$ under $N(\Sigma)$ if and only if $x_{L'} \perp x_{M'} \mid x_{L'\cap M'}$ under $N(\Sigma^{-1})$. From this it follows that $P_{\mathcal{K}}(I) = P_{\mathcal{K}'}(I)^{-1}$, where $\mathcal{K}' := \{K' \mid K \in \mathcal{K}\}$ is the dual lattice of $\mathcal{K}$. For example, if $\mathcal{K}$ is the lattice in Figure 2.4, then $\mathcal{K}'$ has the same form as the lattice in Figure 2.5; the relation $P_{\mathcal{K}}(I) = P_{\mathcal{K}'}(I)^{-1}$ may be verified by comparing (2.57) and (2.65).
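The duality in Remark 2.5 can be illustrated with a small numerical check. The sketch assumes $I = \{0, 1, 2\}$, $L = \{0\}$, $M = \{1\}$ (so $L \cap M = \emptyset$, $L' = \{1, 2\}$, $M' = \{0, 2\}$, $L' \cap M' = \{2\}$); the matrix `Sigma` is a made-up example of the form (2.57).

```python
import numpy as np

# Sigma has a zero L x M entry, i.e. x_L and x_M are independent under N(Sigma).
Sigma = np.array([[2.0, 0.0, 1.0],
                  [0.0, 3.0, 1.0],
                  [1.0, 1.0, 4.0]])
Delta = np.linalg.inv(Sigma)   # covariance matrix of the dual model N(Sigma^{-1})

# Under N(Delta), x_{L'} and x_{M'} are CI given x_{L'∩M'} exactly when the
# [L'] x [M'] covariance equals the regression term through L'∩M' = {2}.
lhs = Delta[1, 0]
rhs = Delta[1, 2] * Delta[2, 0] / Delta[2, 2]
```

Here `lhs` and `rhs` coincide, so the marginal independence of $x_L$ and $x_M$ under $N(\Sigma)$ turns into the stated conditional independence under $N(\Sigma^{-1})$.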

§3. LIKELIHOOD INFERENCE FOR A NORMAL MODEL DETERMINED BY PAIRWISE CONDITIONAL INDEPENDENCE.

3.1. Factorization of the likelihood function; the MLE of $\Sigma$.

Consider $n$ independent, identically distributed (i.i.d.) observations

[]

from lattice CI model JI(~) defined by (1.8) and (1.4). and

of y.

L£~ del"ot:e the submatrix of y,

K£ tion YK aCl~or,d hill: to (2.3) as


The fundamental factorization of the LF for the model $N(\mathcal{K})$ is an immediate consequence of Theorem 2.1(ii), Lemma 2.5, and Theorem 2.2.

Theorem 3.1. (Factorization Theorem.) The likelihood function based on $n$ i.i.d. observations from the statistical model $N(\mathcal{K})$ has the following factorization:

(3.2)  $P_{\mathcal{K}}(I) \times M(I \times N) \to\ ]0, \infty[$
       $(\Sigma, y) \mapsto (\det(\Sigma))^{-n/2}\exp(-\mathrm{tr}(\Sigma^{-1}yy^t)/2)$
       $\qquad = \prod((\det(\Sigma_{[K]\cdot}))^{-n/2} \times \exp(-\mathrm{tr}(\Sigma_{[K]\cdot}^{-1}(y_{[K]} - \Sigma_{[K\rangle}\Sigma_{\langle K\rangle}^{-1}y_{\langle K\rangle})(\cdots)^t)/2) \mid K \in J(\mathcal{K}))$.

The parameter space $P_{\mathcal{K}}(I)$ has the factorization given by (2.32). []

Note that the factor corresponding to $K \in J(\mathcal{K})$ is the density for the conditional distribution of $y_{[K]}$ given $y_{\langle K\rangle}$.

It follows readily from Theorem 3.1 and well-known results for the multivariate normal linear regression model that the MLE $\hat\Sigma(y)$ of $\Sigma \in P_{\mathcal{K}}(I)$ is unique if it exists, and it exists for a.e. $y \in M(I \times N)$ if and only if

(3.3)  $n \geq \max\{|\langle K\rangle| + |[K]| \mid K \in J(\mathcal{K})\} \equiv \max\{|K| \mid K \in J(\mathcal{K})\}$.

In this case the $\mathcal{K}$-parameters of $\hat\Sigma$ are given by the usual regression estimators:

(3.4)


$K \in J(\mathcal{K})$,

where $S(y) = yy^t$ is the empirical covariance matrix. The explicit expression for $\hat\Sigma$ itself may be obtained from its $\mathcal{K}$-parameters in (3.4) by means of the reconstruction algorithm given in Section 2.7.
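As a concrete illustration of obtaining $\hat\Sigma$ through the reconstruction algorithm, consider the lattice of Example 2.5. The sketch below rests on several assumptions that are not taken from the text: block sizes $|L \cap M| = 2$, $|[L]| = 1$, $|[M]| = 2$, simulated data, and the normalization $S = yy^t/n$. For this lattice, Steps 1-3 of Example 2.5 leave the $L \times L$ and $M \times M$ blocks of $S$ unchanged and refill only the $[M] \times [L]$ block through (2.62).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 12                                  # sample size; (3.3) requires n >= max(|L|, |M|) = 4
LM, bL, bM = [0, 1], [2], [3, 4]        # index sets for L∩M, [L], [M] (assumed sizes)

y = rng.standard_normal((5, n))         # simulated I x N data matrix
S = y @ y.T / n                         # assumption: empirical covariance normalized by n

# Steps 1-3 of Example 2.5 applied to the K-parameters of S:
Sigma_hat = S.copy()
cross = S[np.ix_(bM, LM)] @ np.linalg.solve(S[np.ix_(LM, LM)], S[np.ix_(LM, bL)])
Sigma_hat[np.ix_(bM, bL)] = cross       # (2.62): S_[M> S_{L∩M}^{-1} S_<L]
Sigma_hat[np.ix_(bL, bM)] = cross.T

Delta_hat = np.linalg.inv(Sigma_hat)    # estimated precision matrix
```

The resulting `Delta_hat` has a vanishing $[L] \times [M]$ block, as required by the form (2.65), while `Sigma_hat` still agrees with `S` on the $L \times L$ and $M \times M$ blocks.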

If $I \in J(\mathcal{K})$ then the condition (3.3) reduces to $n \geq |I|$, so in this case $S$ is positive definite a.e., hence a fortiori $\hat\Sigma_{[K]\cdot}$ exists a.e. for every $K \in J(\mathcal{K})$. If, on the other hand, $I \notin J(\mathcal{K})$, then condition (3.3) does not guarantee that $S$ is positive definite, but it still guarantees that $\hat\Sigma_{[K]\cdot}$ (and hence $\hat\Sigma$) exists a.e.

By Lemma 2.5, when (3.3) is satisfied the maximum value of the LF (3.2) is given by

(3.5)

where $c = n^{n|I|/2} \times \exp(-n|I|/2)$. This fact is used in Section 4 to express the likelihood ratio statistic for testing one model against another.

Remark 3.1. The statistical model $N(\mathcal{K})$ is a curved exponential family; it is linear if and only if $P_{\mathcal{K}}(I)^{-1}$ is a linear set, i.e., closed under positive linear combinations. In the linear case the MLE $\hat\Sigma$ based on $n$

1. 1.d. a minimal ~ is

not nec::es:salri in the general case. [J

associated

If $\mathcal{K}$ is a chain as in Examples 2.1 and 2.2, then $P_{\mathcal{K}}(I) = P(I)$ and $N(\mathcal{K})$ is the unrestricted covariance model regardless of the length of the chain. (The $\mathcal{K}$-parametrization of $P_{\mathcal{K}}(I)$ does depend on this length, however.) Condition (3.3) for existence of the MLE $\hat\Sigma$ reduces to the familiar condition $n \geq |I|$, while (3.4) reduces to $\hat\Sigma = S$.

For the lattice $\mathcal{K}$ in Example 2.3, partition the observation $x \in \mathbb{R}^I$ according to (2.54) as $x = (x_L^t, x_M^t)^t$. The model $N(\mathcal{K})$ states simply that $x_L \perp x_M$. According to (3.3), the MLE $\hat\Sigma$ exists if and only if $n \geq \max\{|L|, |M|\}$ (whereas $S$ is positive definite if and only if $n \geq |I|$) and is given by $\hat\Sigma = \mathrm{Diag}(S_L, S_M)$.

For the lattice $\mathcal{K}$ in Example 2.4, partition $x \in \mathbb{R}^I$ according to (2.58) as $x = (x_L^t, x_M^t, x_{[I]}^t)^t$. Then the model $N(\mathcal{K})$ again states that $x_L \perp x_M$. Condition (3.3) for the existence of the MLE takes the form $n \geq |I|$. We reconstruct $\hat\Sigma$ from its $\mathcal{K}$-parameters, given by (3.4), by following Steps 1-3 in Example 2.4 to obtain

$\hat\Sigma_{L\cup M} = \mathrm{Diag}(S_L, S_M)$

$\hat\Sigma_{[I\rangle} = S_{[I\rangle}S_{L\cup M}^{-1}\mathrm{Diag}(S_L, S_M)$

xJ[l] = S[I]. + S[I>(J)iag(SL'~1)-1S<I] (¢ S[I])'

In Example 2.5. x is partitioned according to (2.64) as

The model H(~) states

tion bec:omE~S n i }. while

(3.6b)

(3.6c)

= SuR

=s(L)SuR'

=s(M)SuR'

~(L)- =S(L)_

~(M) - = S(M]- .

By Steps 1-3 in Example 2.5, $\hat\Sigma$ is given by (3.6a) and

(3.7a)

(3.7b)

(3.7c)

In Example 2.6, $x$ is partitioned as $(x_{L\cap M}^t, x_{[L]}^t, x_{[M]}^t, x_{[I]}^t)^t$ and the model $N(\mathcal{K})$ states that $x_{[L]} \perp x_{[M]} \mid x_{L\cap M}$. Condition (3.3) reduces to $n \geq |I|$, while (3.4) is given by (3.6a,b,c) and

From Steps 1-4 in Example 2.6, $\hat\Sigma$ is given by (3.6a), (3.7a,b,c), and

from

(3.8) ~=

l"uw

IC"uw

I("uw ,x[L],x[M]}'

(il) ~[X] JL

(iii) x[L"J JJ. x[M'J

b,c}.

tion

(Note that XL 'flM' =~ = ("uw ,x[LrX[Xl)'} Condition (3.3) becomes n

~ max{ IL '1.Ix' n. while (3.4) is given by (3.6a.b,c) and (3.6b.c) with

(3.9a)

(3.9b)

(3.9c)


X[L] JL x[X] IXJ.nr.I and that x[L'] JJ. xEN'] I(XJ.nr.I ,xELrxEX1) .

where~ is given by (3.8).

I T:'.___ 1 2 8' .. d (t t t t t)t In ~lIpe .• , X IS partltloneas "uw 'X[LrX[XrX[L"rX[X'l . t

may be seen fI'()J1l the form (2.11}o{ :I€P::tt(I) that the IIlOdel N(::tt} is

determined by the following three cOIlditions:

. . t t t tt tIn Example 2.1. x Isparti tioned as (XJ.nr.I ,x[LrxEXrx[L'l,xEX'l)

and the>model N(:I} states that

L,M replac:ed by L',X' (note that SL 'flM' = SLUM) • From Steps 1-5 in

Example 2.1, ~ is giyen)by(3.&3,).(3.1a,b.c}, and


~[L"]. =S[L"].

~[M·]. =S[M·].·

From Steps 1-5 in Example 2.8. ~ is given by (3.98.) and (3.7a.b.c). by

~[L") =S[L")' ~[L"] =S[L"]

by (3. 9a. b). and by

Finally, for the lattice $\mathcal{K}$ in Example 2.9, $x$ is partitioned as $(x_{L\cap M}^t, x_{[L]}^t, x_{[M]}^t, x_{[L'']}^t, x_{[M'']}^t)^t$. It is readily seen (cf. Remark 3.2) that the model $N(\mathcal{K})$ is determined by the single condition that

This reflects the fact that this model is of the same form as that in Example 2.5 (see the discussion in Example 2.9).

Remark 3.2. Recall the definition of the normal model $N(\mathcal{K})$ for a distributive lattice $\mathcal{K}$: $x_L \perp x_M \mid x_{L\cap M}$ for every $L, M \in \mathcal{K}$. It may be seen from the examples that many of these conditions are redundant and may be omitted. For example, whenever $L \subseteq L'$, $M \subseteq M'$, and $L \cap M = L' \cap M'$, then $x_{L'} \perp x_{M'} \mid x_{L'\cap M'} \Rightarrow x_L \perp x_M \mid x_{L\cap M}$, hence the latter condition may be omitted. The question of characterizing a minimal set of CI conditions that determines $N(\mathcal{K})$ is currently under investigation. For a given lattice $\mathcal{K}$, however, such minimal determining sets are not unique. In Example 2.8, the following four sets of CI conditions are (equivalent) minimal determining sets for $N(\mathcal{K})$:

(i) $x_L \perp x_M \mid x_{L\cap M}$;  (ii) $x_{L\cup M} \perp x_{L''} \mid x_L$;  (iii) $x_{L'} \perp x_{M'} \mid x_{L\cup M}$;

(i) $x_L \perp x_M \mid x_{L\cap M}$;  (ii) $x_{L''} \perp x_{M'} \mid x_L$;

(i) $x_M \perp x_{L''} \mid x_{L\cap M}$;  (ii) $x_{L'} \perp x_{M'} \mid x_{L\cup M}$;

(i) $x_M \perp x_{L''} \mid x_{L\cap M}$;  (ii) $x_{L''} \perp x_{M'} \mid x_L$. []

Remark 3.3. For $I = \{1, 2, 3, 4\}$, consider the statistical model consisting of all normal distributions on $\mathbb{R}^I$ such that $x_1$ is independent of $x_2$ and $x_3$ is independent of $x_4$. It is readily seen that this model is not of the form $N(\mathcal{K})$ for any $\mathcal{K}$. The same is true for the normal model determined by the two conditions that $x_1$ and $x_2$ are CI given $(x_3, x_4)$ and $x_3$ and $x_4$ are CI given $(x_1, x_2)$. []

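Remark 3.4. The general model $N(\mathcal{K})$ is defined by the pairwise CI requirement (1.4) for every pair $L, M \in \mathcal{K}$. This requirement does not necessarily imply, however, that for every subset $\mathcal{G} \subseteq \mathcal{K}$, the variates $(x_K \mid K \in \mathcal{G})$ are CI given $x_{\cap(K \mid K \in \mathcal{G})}$. For the lattice $\mathcal{K}$ in Example 2.9 this may be seen by considering the subset $\mathcal{G} = \{L'', L \cup M, M''\}$. []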

An alternative statistical interpretation of the model $N(\mathcal{K})$ may be obtained from (2.35): $x \in \mathbb{R}^I$ is an observation from the normal model $N(\mathcal{K})$ if and only if $x$ can be represented in the form $x = Az$ for some (generalized block-triangular) matrix $A \in GL_{\mathcal{K}}(I)$, where $z \equiv (z_{[K]} \mid K \in J(\mathcal{K})) \in \mathbb{R}^I$ is an (unobservable) stochastic variate such that $z \sim N(1_I)$. From Proposition 2.2(iii), this representation is equivalent to the system of equations

(3.10)  $x_{[L]} = \sum(A_{[LM]}z_{[M]} \mid M \in H(L))$,  $L \in J(\mathcal{K})$,

where $H(L) := \{M \in J(\mathcal{K}) \mid M \subseteq L\}$ $(\subseteq J(\mathcal{K}))$. This shows that the CI model $N(\mathcal{K})$ can be interpreted as a multivariate linear recursive model (cf. Wermuth (1980), Kiiveri, Speed, and Carlin (1984)) with lattice constraints.

Conversely, suppose that $J$ is a finite index set and let $(H(t) \mid t \in J)$ be a family of subsets of $J$ that satisfies the following two conditions:

(i)  $t \in H(t)$,

(ii)  $m \in H(t) \Rightarrow H(m) \subseteq H(t)$.

For each t € J let D1 and E1 be finite index sets such that IDtl ~ IEtl

and let I = U(D1It e J}. I I =U(£1 It € J}. Consi.der the normal

statistical model defined by the system of equations

(3.lt)

Dt Emwhere x(t] € IR is observable. z(m] € IR is unobservable.

€ '" N(1I} on • Atm € K(Dt xEm}. and t) =Z==

I. Let 'it

be the subsets of J lZeTler:atEld and for H € 'it

= It € ~ := { e is a of

of $I$ and the model determined by the system (3.11) has the form (3.10), i.e., it is the model $N(\mathcal{K})$. []

3.3. Invariance of the model.

It follows from the well-known transformation property of the multivariate normal distribution that the i.i.d. model determined by $N(\mathcal{K})$ is invariant under the transitive action (2.33) of $GL_{\mathcal{K}}(I)$ on the parameter space $P_{\mathcal{K}}(I)$ and the action

(3.12)  $GL_{\mathcal{K}}(I) \times M(I \times N) \to M(I \times N)$
        $(A, y) \mapsto Ay$

of $GL_{\mathcal{K}}(I)$ on the observation space $M(I \times N)$. The MLE is thus equivariant.

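§4. TESTING ONE PAIRWISE CONDITIONAL INDEPENDENCE MODEL AGAINST ANOTHER.

Let $\mathcal{K}$ and $\mathcal{M}$ be two sublattices of the lattice of all subsets of $I$ such that $\mathcal{M} \subset \mathcal{K}$. Then $P_{\mathcal{K}}(I) \subseteq P_{\mathcal{M}}(I)$ and one can consider the following general testing problem: based on $n$ i.i.d. observations $x_1, \ldots, x_n \in \mathbb{R}^I$ from the model $N(\mathcal{M})$, test

(4.1)  $H_0: N(\mathcal{K})$  vs.  $H: N(\mathcal{M})$.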

problem. A, expressed in of its

moments, is derived by means of the invariance of this f"""~1"''''HT problem

under on obselrV!:ltion space space.

WhllCh establ ishes the mutual

invariant statistic 11 's


2:[K]'" K € J(:1t). Examples of the general testing problem are presented in

Section 4.3.

A warning about the notation is needed here. Since J{:f) ~ J(.M).

quantities such as $\langle K\rangle$, $[K]$, $\Sigma_{[K]}$, $\Sigma_{[K\rangle}$, $\Sigma_{[K]\cdot}$ depend not only on the subset $K$ of $I$ but also on the lattice of which $K$ is considered a member. Thus, for example, $\langle K\rangle_{\mathcal{K}}$ and $\langle K\rangle_{\mathcal{M}}$ need not be the same. To alleviate this difficulty without introducing $\mathcal{K}$ and $\mathcal{M}$ as subscripts, the letter $K$ shall denote a subset of $I$ that is to be considered a member of $\mathcal{K}$, while $M$ shall denote a subset of $I$ that is to be considered a member of $\mathcal{M}$.

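4.1. The likelihood ratio statistic.

Denote the MLE's of $\Sigma$ under $N(\mathcal{K})$ and $N(\mathcal{M})$ by $\hat\Sigma_{\mathcal{K}} \equiv \hat\Sigma$ and $\hat\Sigma_{\mathcal{M}} \equiv \tilde\Sigma$, respectively.

Theorem 4.1. Suppose that $n \geq \max\{|M| \mid M \in J(\mathcal{M})\}$. Then for every $\Sigma \in P_{\mathcal{M}}(I)$, $\hat\Sigma$ and $\tilde\Sigma$ exist a.e. The LR statistic $\lambda$ for testing $H_0$ against $H$ is given by

(4.2)  $\lambda^{2/n} = \dfrac{\det(\tilde\Sigma)}{\det(\hat\Sigma)} = \dfrac{\prod(\det(\tilde\Sigma_{[M]\cdot}) \mid M \in J(\mathcal{M}))}{\prod(\det(\hat\Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K}))} = \dfrac{\prod(\det(S_{[M]\cdot}) \mid M \in J(\mathcal{M}))}{\prod(\det(S_{[K]\cdot}) \mid K \in J(\mathcal{K}))}$.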

Proof. The first assertion follows from (3.3) and the inequality $\max\{|M| \mid M \in J(\mathcal{M})\} \geq \max\{|K| \mid K \in J(\mathcal{K})\}$. To establish this inequality

mapI>ing 'Ii: J(:1t) ~ J(.M) by 'Ii(K) := By an

simi to that in Pro})Osition 3. ii) of ( it may be

" is3. i) of (


implies that $\psi$ is surjective, hence

$\max\{|M| \mid M \in J(\mathcal{M})\} = \max\{|\psi(K)| \mid K \in J(\mathcal{K})\} \geq \max\{|K| \mid K \in J(\mathcal{K})\}$.

The second assertion of the Theorem now follows from (3.5) and (3.4). []
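The last expression in (4.2) is straightforward to evaluate from $S$. The sketch below is illustrative only: it assumes $\mathcal{K}$ is the lattice of Example 2.5 with block sizes 2, 1, 2, takes $\mathcal{M} = \{\emptyset, I\}$, uses simulated data, and introduces the hypothetical helpers `schur` and `lr_stat`, each lattice being passed as a list of $(\langle K\rangle, [K])$ index pairs. Any common scaling of $S$ cancels in the ratio because of (4.6).

```python
import numpy as np

def schur(S, rows, given):
    """S_rows - S_{rows,given} S_given^{-1} S_{given,rows}; S_rows itself if given is empty."""
    if not given:
        return S[np.ix_(rows, rows)]
    S_rg = S[np.ix_(rows, given)]
    return S[np.ix_(rows, rows)] - S_rg @ np.linalg.solve(S[np.ix_(given, given)], S_rg.T)

def lr_stat(S, split_K, split_M):
    """lambda^{2/n} as in (4.2); each split is a list of (<K>, [K]) index-list pairs."""
    num = np.prod([np.linalg.det(schur(S, b, a)) for a, b in split_M])
    den = np.prod([np.linalg.det(schur(S, b, a)) for a, b in split_K])
    return num / den

rng = np.random.default_rng(1)
y = rng.standard_normal((5, 12))        # simulated I x N data, n = 12
S = y @ y.T

LM, bL, bM = [0, 1], [2], [3, 4]        # L∩M, [L], [M] (assumed sizes)
split_K = [([], LM), (LM, bL), (LM, bM)]            # lattice of Example 2.5
split_M = [([], LM + bL + bM)]                      # M = {∅, I}: unrestricted model

lam = lr_stat(S, split_K, split_M)

# Equivalent "conditional" form: det(S_{I.(L∩M)}) / (det(S_{L.(L∩M)}) det(S_{M.(L∩M)}))
lam_alt = np.linalg.det(schur(S, bL + bM, LM)) / (
    np.linalg.det(schur(S, bL, LM)) * np.linalg.det(schur(S, bM, LM)))
```

The two values `lam` and `lam_alt` agree, and both lie in $(0, 1]$, as expected for a likelihood ratio statistic testing conditional independence of $x_{[L]}$ and $x_{[M]}$ given $x_{L\cap M}$.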

For computational purposes. note that

(4.3)

$K \in J(\mathcal{K})$, where $S(y) = yy^t$ (cf. (3.1)), with an analogous formula for

4.2. Central distribution and Box approximation.

The testing problem (4.1) is invariant under the action (3.12) of the group $GL_{\mathcal{K}}(I)$ on the sample space $M(I \times N)$ and the action

(4.4)  $GL_{\mathcal{K}}(I) \times P_{\mathcal{M}}(I) \to P_{\mathcal{M}}(I)$
       $(A, \Sigma) \mapsto A\Sigma A^t$

on the parameter space. Let

(4.5)  $\pi: M(I \times N) \to M(I \times N)/GL_{\mathcal{K}}(I)$

be the orbit projection onto the orbit space under the action (3.12). Since the LR statistic is invariant under (3.12), $\lambda$ depends on $y \in M(I \times N)$ only through $\pi(y)$. The central distribution of $\lambda$ is readily derived from this fact and Theorem 4.2, whose proof is deferred to Appendix A.3. Since the restriction of (4.4) to $P_{\mathcal{K}}(I)$ is transitive (cf. Theorem 2.3), under $H_0$ the distribution of $\lambda$ does not depend on $\Sigma \in P_{\mathcal{K}}(I)$.

Theorem 4.2. Under $H_0$, the statistics $\pi$ and $\hat\Sigma_{[K]\cdot}$, $K \in J(\mathcal{K})$, are mutually independent. The statistic $\hat\Sigma_{[K]\cdot}$ has the Wishart distribution on $P([K])$ with $n - |\langle K\rangle|$ degrees of freedom and expected value $\Sigma_{[K]\cdot}$. []

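It follows from Theorem 4.2 that $\lambda$ and $\hat\Sigma_{[K]\cdot}$, $K \in J(\mathcal{K})$, are mutually independent. Therefore for every $\Sigma \in P_{\mathcal{K}}(I)$ $(\subseteq P_{\mathcal{M}}(I))$ and $\alpha > 0$, from (2.37) and (4.2),

$E_\Sigma(\prod(\det(\tilde\Sigma_{[M]\cdot})^\alpha \mid M \in J(\mathcal{M}))) = E(\lambda^{2\alpha/n}) \, E_\Sigma(\prod(\det(\hat\Sigma_{[K]\cdot})^\alpha \mid K \in J(\mathcal{K})))$.

However, it follows from the Wishart distribution of $\hat\Sigma_{[K]\cdot}$ that $E_\Sigma(\det(\hat\Sigma_{[K]\cdot})^\alpha)$ equals $\det(\Sigma_{[K]\cdot})^\alpha$ times a factor depending only on $n$, $\alpha$, $|\langle K\rangle|$, and $|[K]|$, for $K \in J(\mathcal{K})$, with an analogous formula for $E_\Sigma(\det(\tilde\Sigma_{[M]\cdot})^\alpha)$, $M \in J(\mathcal{M})$. Since

(4.6)  $\sum(|[K]| \mid K \in J(\mathcal{K})) = |I| = \sum(|[M]| \mid M \in J(\mathcal{M}))$

and

(4.7)  $\prod(\det(\Sigma_{[K]\cdot}) \mid K \in J(\mathcal{K})) = \det(\Sigma) = \prod(\det(\Sigma_{[M]\cdot}) \mid M \in J(\mathcal{M}))$

for $\Sigma \in P_{\mathcal{K}}(I)$, one obtains that

$E(\lambda^{2\alpha/n}) = \dfrac{\prod\left(\dfrac{\Gamma((n - |\langle M\rangle| - i + 1)/2 + \alpha)}{\Gamma((n - |\langle M\rangle| - i + 1)/2)} \,\Big|\, i = 1, \ldots, |[M]|,\ M \in J(\mathcal{M})\right)}{\prod\left(\dfrac{\Gamma((n - |\langle K\rangle| - j + 1)/2 + \alpha)}{\Gamma((n - |\langle K\rangle| - j + 1)/2)} \,\Big|\, j = 1, \ldots, |[K]|,\ K \in J(\mathcal{K})\right)}$.

The Box approximation for the central distribution of $-2\log\lambda$ may be obtained as in Anderson (1984), pp. 311-316. In Anderson's notation we have $a = b = |I|$ and

(4.8)  $f = -2\sum(\sum((-|\langle M\rangle| - i + 1)/2 \mid i = 1, \ldots, |[M]|) \mid M \in J(\mathcal{M})) + 2\sum(\sum((-|\langle K\rangle| - j + 1)/2 \mid j = 1, \ldots, |[K]|) \mid K \in J(\mathcal{K}))$
       $\quad = \sum(|[M]| \times |\langle M\rangle| + |[M]|(|[M]| - 1)/2 \mid M \in J(\mathcal{M})) - \sum(|[K]| \times |\langle K\rangle| + |[K]|(|[K]| - 1)/2 \mid K \in J(\mathcal{K}))$
       $\quad = \sum(|[M]| \times |\langle M\rangle| + |[M]|(|[M]| + 1)/2 \mid M \in J(\mathcal{M})) - \sum(|[K]| \times |\langle K\rangle| + |[K]|(|[K]| + 1)/2 \mid K \in J(\mathcal{K}))$,

where the final equality is obtained using (4.6). One recognizes $f$ to be the difference between the numbers of free parameters in $P_{\mathcal{M}}(I)$ and in $P_{\mathcal{K}}(I)$.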

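The parameter $f$ in (4.8) depends only on the block sizes $|\langle K\rangle|$ and $|[K]|$, so it is easy to tabulate. The following sketch assumes each lattice is described by a list of $(|\langle K\rangle|, |[K]|)$ pairs, one per join-irreducible element; the function name `f_param` and the block sizes used in the examples are illustrative.

```python
def f_param(sizes_K, sizes_M):
    """f as in (4.8); each argument is a list of (|<K>|, |[K]|) size pairs."""
    def total(sizes):
        # sum of |[K]| * |<K>| + |[K]| (|[K]| - 1) / 2 over the join-irreducibles
        return sum(b * a + b * (b - 1) // 2 for a, b in sizes)
    return total(sizes_M) - total(sizes_K)

# Example 2.3 lattice with |L| = 2, |M| = 3, tested against M = {∅, I} with |I| = 5.
f3 = f_param([(0, 2), (0, 3)], [(0, 5)])

# Example 2.5 lattice with |L∩M| = 2, |[L]| = 1, |[M]| = 2, tested against M = {∅, I}.
f5 = f_param([(0, 2), (2, 1), (2, 2)], [(0, 5)])
```

In these two cases the function returns $|L| \times |M| = 6$ and $|[L]| \times |[M]| = 2$, respectively, i.e. the number of covariance entries constrained by the CI condition being tested.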

4.3. Examples of testing problems.

Let $\mathcal{K}_1, \ldots, \mathcal{K}_8, \mathcal{K}_9, \mathcal{K}_{10}, \mathcal{K}_{11}$ denote the lattices appearing in Figures 2.1, $\ldots$, 2.8, 2.9a, 2.10a, 2.11a, respectively. In this subsection we consider examples of the testing problem (4.1) with $(\mathcal{K}, \mathcal{M}) = (\mathcal{K}_i, \mathcal{K}_j)$ for various pairs $(i, j)$. In each example the LR statistic $\lambda$ in (4.2) and the parameter $f$ in (4.8) are rewritten in forms that reflect the statistical interpretation of the testing problem, i.e., that reflect the conditional independence (CI) condition being tested.

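For this purpose we must introduce the following notation: for any $\Sigma \in P(I)$ and any subsets $K, L \subseteq I$ such that $L \subseteq K$, let

$\Sigma_K = \begin{pmatrix} \Sigma_L & \Sigma_{L,K\setminus L} \\ \Sigma_{K\setminus L,L} & \Sigma_{K\setminus L} \end{pmatrix}$

denote the partitioning of $\Sigma_K$ according to the decomposition

$K = L \,\dot\cup\, (K \setminus L)$

and define

$\Sigma_{K\cdot L} = \Sigma_{K\setminus L} - \Sigma_{K\setminus L,L}\Sigma_L^{-1}\Sigma_{L,K\setminus L} \in P(K \setminus L)$.

(When $K \in J(\mathcal{K})$ and $M \in J(\mathcal{M})$, $\Sigma_{K\cdot\langle K\rangle} = \Sigma_{[K]\cdot}$ and $\Sigma_{M\cdot\langle M\rangle} = \Sigma_{[M]\cdot}$.) The well-known formula

$\det(\Sigma_K) = \det(\Sigma_L)\det(\Sigma_{K\cdot L})$

may be applied in (4.2) to obtain the expressions for $\lambda^{2/n}$ that appear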


First, set $\mathcal{M} = \{\emptyset, I\}$ in (4.1) and consider the testing problems of the form

(4.9)

for $\mathcal{K} = \mathcal{K}_3, \ldots, \mathcal{K}_8$. For $i = 3, \ldots, 8$, the following forms of the LR statistics $\lambda_i$ directly reflect the statistical interpretations of the models $N(\mathcal{K}_i)$ given in Section 3.2:

2/n _ det{S)~ - det(SL)det{~)'

2/n _ . det{Sl_{UlMl)

~ - det(SL. (lfll() )det{~.(IJ1M»'

= Ix I[M] I;

:It = ~:


~2/n =

f 7 = I[Lllx I[M] I + I[L 'II xl[M ' ] I ;

x deteSL' .L) x det(SI.(WM»

det(S(WM) .L)det(SLn.L) det(SL' .(WM)Jdet(~,• (WM»

_ •. det(SV_(LnM» . x • det(SI.(WM»- det(SM.(lflMJ)det(SLn• (lflM» det(~, .(WM»det(~,. (WM»

$$f_8 = |[L]| \times |[M]| + |[M]| \times |[L']| + |[L']| \times |[M']|$$

$$= |[L]| \times |[M]| + |[L']|\,(|[M]| + |[M']|)$$

$$= |[M]|\,(|[L]| + |[L']|) + |[L']| \times |[M']|;$$
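As a worked illustration (not part of the original text), the determinant formula $\det(S_K) = \det(S_N)\det(S_{K\cdot N})$ for $N \subseteq K$ can be used to rewrite a statistic of the conditional form in terms of ordinary determinants. The sketch below assumes the statistic for testing CI of $X_L$ and $X_M$ given $X_{L\cap M}$, with $L \cup M = I$, is a ratio of determinants of the matrices $S_{K\cdot(L\cap M)}$; the symbol $N$ is introduced here only for brevity.

```latex
\begin{align*}
\frac{\det(S_{I\cdot N})}{\det(S_{L\cdot N})\,\det(S_{M\cdot N})}
  &= \frac{\det(S)/\det(S_N)}
          {\bigl(\det(S_L)/\det(S_N)\bigr)\bigl(\det(S_M)/\det(S_N)\bigr)} \\
  &= \frac{\det(S)\,\det(S_N)}{\det(S_L)\,\det(S_M)},
  \qquad N := L \cap M .
\end{align*}
```

When $N = \emptyset$ (with the convention $\det(S_\emptyset) = 1$) this reduces to the familiar statistic $\det(S)/(\det(S_L)\det(S_M))$ for testing independence of two blocks.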

Remark 4.1. The three equivalent expressions for $\lambda_8^{2/n}$ given above correspond to three of the determining sets of CI conditions for $N(\mathcal{K}_8)$

given in Remark 3.2. The expression for $\lambda_8^{2/n}$ suggested by the

fourth set is

[]

is in some

::;:

2/n . .2/n det(SI·(LUM»A:i ,6 ::;: (A.(A6) ::;: det'Sr". (l1JM) )det(~ '. (l1JM) ) ,

2/nequal to AS • Thus the fourth determining

Next we consider five testing problems of the form (4.1) with $(\mathcal{K}, \mathcal{M}) = (\mathcal{K}_i, \mathcal{K}_j)$. From (4.2) and (4.8) one may obtain the following expressions:

sense·unsatisfactory for describing .N(:'ItS) '

but this is


These five testing problems involve the five adjacent pairs of lattices in the diagram

The LR statistic $\lambda$ and the parameter $f$ for non-adjacent pairs may be obtained from those for adjacent pairs in the usual way, for example:

Remark 4.2. It is thus seen that in each example, the LR statistic can be represented as a product of LR statistics for testing CI of two blocks of variates. We conjecture that this is true in general, i.e., that the LR statistic $\lambda$ in (4.2) for the general testing problem (4.1) may be written as such a product, and that furthermore, the factors are mutually independent under $H_0$. Of course it must be realized that the above examples involve only very simple lattices. More complex distributive lattices, e.g. non-planar lattices, may lead to statistical models and tests with more complex structure. □
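A familiar instance of such a product representation, given here only as an illustration and not taken from the examples above, is the statistic for mutual independence of three blocks $A$, $B$, $C$ with $A \cup B \cup C = I$. The block labels are assumptions made for this sketch.

```latex
\[
\frac{\det(S)}{\det(S_A)\,\det(S_B)\,\det(S_C)}
  \;=\;
\frac{\det(S)}{\det(S_{A\cup B})\,\det(S_C)}
  \;\times\;
\frac{\det(S_{A\cup B})}{\det(S_A)\,\det(S_B)} .
\]
```

The first factor has the form of a statistic for independence of $X_{A\cup B}$ and $X_C$, and the second has the form of a statistic for independence of $X_A$ and $X_B$, so each factor tests independence of two blocks of variates.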

Let $V$ be a finite-dimensional real vector space. A quotient space (or simply a quotient) of $V$ is defined to be a pair $(Q, p_Q)$ consisting of a vector space $Q$ and a surjective linear mapping $p_Q: V \to Q$. For ease of notation, $(Q, p_Q)$ is abbreviated to $Q$.


Let $R$ and $T$ be two quotients of $V$. If there exists a linear mapping $p_{RT}: T \to R$ such that $p_R = p_{RT} \circ p_T$ then $p_{RT}$ is necessarily surjective and unique, hence $(R, p_{RT})$ is a quotient of $T$. In this situation we write $(R, p_R) \le (T, p_T)$, or simply $R \le T$. This relation is equivalent to the condition that $p_R^{-1}(0) \supseteq p_T^{-1}(0)$. The relation $\le$ on the set of all quotients of $V$ is not antisymmetric, hence one defines an equivalence relation $\sim$ on this set by $R \sim T$ if $p_R^{-1}(0) = p_T^{-1}(0)$. The collection of equivalence classes is denoted by $\mathcal{Q}(V)$. Equipped with the relation induced by $\le$ (also denoted by $\le$), $\mathcal{Q}(V)$ becomes a partially ordered set (= poset).

We identify a quotient $(Q, p_Q)$ of $V$ with its equivalence class in $\mathcal{Q}(V)$. A convenient representative for this equivalence class is the canonical quotient space $(V/p_Q^{-1}(0), p)$, where $p: V \to V/p_Q^{-1}(0)$ is the canonical quotient mapping given by $p(x) = x + p_Q^{-1}(0)$, $x \in V$.

The poset $\mathcal{Q}(V)$ is in fact a lattice: if $R, T \in \mathcal{Q}(V)$ then their minimum and maximum exist and are given by

$$R \wedge T := V/(p_R^{-1}(0) + p_T^{-1}(0)),$$

$$R \vee T := V/(p_R^{-1}(0) \cap p_T^{-1}(0)),$$

respectively. The minimal and maximal elements exist and are given by $\{0\}$ and $V$, respectively. If $\dim(V) \ge 2$ then $\mathcal{Q}(V)$ is not distributive and $|\mathcal{Q}(V)| = \infty$. Since $V$ is finite dimensional, the lattice $\mathcal{Q}(V)$ has finite length, hence so does any sublattice $\mathcal{Q} \subseteq \mathcal{Q}(V)$. Therefore, if $\mathcal{Q}$ is a distributive sublattice

of tl(V) it must be . The is

to Se.~t:t.on 3 ( lattices
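The failure of distributivity when $\dim(V) \ge 2$ can be seen in a small worked example. The following sketch is an illustration added here, not part of the original argument: it takes $V = \mathbb{R}^2$ and three quotients $Q_i = V/N_i$ whose kernels $N_i$ are three distinct lines, and uses the rule that $\wedge$ corresponds to the sum of kernels and $\vee$ to their intersection.

```latex
\begin{align*}
N_1 &= \operatorname{span}\{e_1\}, \quad
N_2 = \operatorname{span}\{e_2\}, \quad
N_3 = \operatorname{span}\{e_1 + e_2\}, \\[4pt]
Q_1 \wedge (Q_2 \vee Q_3)
  &= V/\bigl(N_1 + (N_2 \cap N_3)\bigr) = V/N_1 = Q_1, \\
(Q_1 \wedge Q_2) \vee (Q_1 \wedge Q_3)
  &= V/\bigl((N_1 + N_2) \cap (N_1 + N_3)\bigr) = V/V = \{0\}.
\end{align*}
```

Since $Q_1 \ne \{0\}$, the distributive law fails for this triple.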


5.2. Invariant formulation of the pairwise CI model.

For $\sigma \in P(V)$ := the cone of all positive definite forms on the dual vector space $V^*$ of $V$, let $N(\sigma)$ denote the normal distribution on $V$ with mean vector $0 \in V$ and covariance $\sigma$ (cf. [A] (1975), Section 5). Let $\mathcal{Q} \subseteq \mathcal{Q}(V)$ be a sublattice such that $\{0\}, V \in \mathcal{Q}$.

Definition 5.1. The class $P_{\mathcal{Q}}(V) \subseteq P(V)$ is defined as follows:

(5.1) $\sigma \in P_{\mathcal{Q}}(V) \iff p_R(x) \perp\!\!\!\perp p_T(x) \mid p_{R\wedge T}(x) \ \ \forall R, T \in \mathcal{Q}$ when $x \sim N(\sigma)$,

i.e., $p_R$ and $p_T$ are conditionally independent (CI) given $p_{R\wedge T}$ (compare to Definition 2.1). □

Theorem 5.1. The class $P_{\mathcal{Q}}(V)$ is nonempty if and only if the lattice $\mathcal{Q}$ is distributive.

Proof. See Appendix A.2. □

The normal statistical model $N_V(\mathcal{Q})$ defined by the requirement (5.1) of pairwise conditional independence wrt $\mathcal{Q}$ is then given by

(5.2) $N_V(\mathcal{Q}) := \{N(\sigma) \mid \sigma \in P_{\mathcal{Q}}(V)\}$

(compare to (1.8)). By Theorem 5.1, $N_V(\mathcal{Q}) \ne \emptyset$ if and only if $\mathcal{Q}$ is distributive.

te index set.

I)I

Q(~) l;; Q(R ) as fo1


each $K \in \mathcal{K}$ define the coordinate projection $p_K: \mathbb{R}^I \to \mathbb{R}^K$ by $p_K((x_i \mid i \in I)) = (x_i \mid i \in K)$. Since $\mathcal{K}$ is a ring, it follows that $\mathcal{Q}(\mathcal{K}) := \{(\mathbb{R}^K, p_K) \mid K \in \mathcal{K}\}$ is a distributive lattice of quotients of $\mathbb{R}^I$. If $\emptyset, I \in \mathcal{K}$, then $\{0\}, \mathbb{R}^I \in \mathcal{Q}(\mathcal{K})$. Thus each canonical coordinate-wise CI model $N(\mathcal{K})$ given by (1.8) is a special case of the general CI model $N_V(\mathcal{Q})$ given by (5.2). □

Conversely, by Proposition 5.1 below every distributive sublattice $\mathcal{Q} \subseteq \mathcal{Q}(V)$ can be represented in the form $\mathcal{Q} = \mathcal{Q}(\mathcal{K})$ for some ring of subsets $\mathcal{K}$ and every CI model $N_V(\mathcal{Q})$ can be represented as a canonical model $N(\mathcal{K})$.
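To make the coordinate-wise construction concrete, here is a minimal sketch that checks whether a family of subsets is a ring (closed under union and intersection) and applies a coordinate projection. The function names `is_ring` and `project`, the index set, and the two example families are all illustrative assumptions rather than anything defined in the report.

```python
def is_ring(family):
    """True if the family of frozensets is closed under union and intersection."""
    fam = set(family)
    return all((a | b) in fam and (a & b) in fam for a in fam for b in fam)

def project(x, K):
    """Coordinate projection p_K: keep only the coordinates indexed by K."""
    return {i: x[i] for i in sorted(K)}

I = frozenset({1, 2, 3})

# A ring of subsets of I containing the empty set and I.
ring = {frozenset(), frozenset({2}), frozenset({1, 2}), frozenset({2, 3}), I}

# A family that is a lattice under inclusion but not a ring:
# the union {1} | {2} = {1, 2} is not a member.
diamond = {frozenset(), frozenset({1}), frozenset({2}), frozenset({3}), I}

x = {1: 0.5, 2: -1.0, 3: 2.0}
L, M = frozenset({1, 2}), frozenset({2, 3})

print(is_ring(ring))      # True
print(is_ring(diamond))   # False
print(project(x, L & M))  # {2: -1.0}, the coordinates shared by L and M
```

In the ring, the projection onto $L \cap M$ plays the role of the conditioning variable $x_{L\cap M}$ in the pairwise CI requirement.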

5.3. Reduction of the CI model to canonical coordinate-wise form.

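Proposition 5.1. Let $\mathcal{Q} \subseteq \mathcal{Q}(V)$ be a distributive lattice of quotients. Then there exists a set $I$, a ring $\mathcal{K}$ of subsets of $I$ with the property $\emptyset, I \in \mathcal{K}$, a lattice isomorphism $Q \to K(Q)$ of $\mathcal{Q} \to \mathcal{K}$, and a basis $(e_i \mid i \in I)$ for $V$ such that the quotients $(Q, p_Q) \in \mathcal{Q}$ can be represented as follows:

(5.3) $Q = \operatorname{span}\{e_i \mid i \in K(Q)\}$,

(5.4) $p_Q(e_i) = \begin{cases} e_i & \text{for } i \in K(Q), \\ 0 & \text{for } i \in I \setminus K(Q). \end{cases}$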

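Proof. See Appendix A.1. □

We say that a basis $(e_i \mid i \in I)$ for $V$ satisfying the conditions in Proposition 5.1 is adapted to $\mathcal{Q}$. When $V$ is identified with $\mathbb{R}^I$ through a $\mathcal{Q}$-adapted basis $(e_i \mid i \in I)$, the lattice $\mathcal{Q} \subseteq \mathcal{Q}(V)$ is identified with the ring $\mathcal{K} \equiv \mathcal{K}(\mathcal{Q})$ of subsets of $I$ and the quotients $p_Q$, $Q \in \mathcal{Q}$, are identified with the coordinate projections $p_K: \mathbb{R}^I \to \mathbb{R}^K$, $K \in \mathcal{K}(\mathcal{Q})$ (cf. Example 5.1). Furthermore, $P(V)$ is identified with $P(I)$ through the correspondence $\sigma \leftrightarrow \Sigma$, where $\Sigma$ is the matrix of $\sigma$ wrt the dual basis $(e_i^* \mid i \in I)$ for $V^*$. The condition (5.1) is then transformed into the condition (2.1), hence $P_{\mathcal{Q}}(V)$ is identified with $P_{\mathcal{K}(\mathcal{Q})}(I)$ and the model $N_V(\mathcal{Q})$ is transformed into the canonical form $N(\mathcal{K}(\mathcal{Q}))$.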

Remark 5.1. Since the identity matrix $1_I \in P_{\mathcal{K}(\mathcal{Q})}(I)$, the model $N_V(\mathcal{Q})$ is nonempty when $\mathcal{Q}$ is distributive. □

5.4. Invariant formulation of the testing problem.

Let $\mathcal{Q}$ and $\mathcal{T}$ be two distributive sublattices of $\mathcal{Q}(V)$ such that $\mathcal{T} \subset \mathcal{Q}$. Then $P_{\mathcal{Q}}(V) \subseteq P_{\mathcal{T}}(V)$ and one may consider the general problem of testing $N_V(\mathcal{Q})$ against the (possibly) larger model $N_V(\mathcal{T})$ on the basis of $n$ i.i.d. observations from $V$, i.e., testing

(5.5)

By Proposition 5.1 we may choose a $\mathcal{Q}$-adapted basis $(e_i \mid i \in I)$ for $V$; clearly this basis is also adapted to $\mathcal{T}$. It follows immediately that the testing problem is transformed into the canonical form (4.1) by the choice of a $\mathcal{Q}$-adapted basis.

§6.

remain open COlllc«~nling the structure of

is QUlesl~1(Jtn under


minimal determining sets of CI conditions for $N(\mathcal{K})$ (cf. Remarks 3.2 and 4.1). A second question is whether every testing problem of the general form (4.1) can be decomposed into a product of simpler testing problems (cf. Remark 4.2). The answer to this question will be of use for a decision-theoretic study of the LR test and other invariant tests for the problem (4.1).

The normal statistical models $N(\mathcal{K})$ may be generalized in several ways. One natural and possibly fruitful extension is suggested by an examination of the $\mathcal{K}$-parametrization (2.32) of $P_{\mathcal{K}}(I)$. A large class of "second-order" submodels of $N(\mathcal{K})$ may be obtained by replacing each $P([K])$ in (2.32) by $P_{\mathcal{K}'}([K])$, where each $\mathcal{K}' \equiv \mathcal{K}'(K)$ is a subring of $\mathcal{D}([K])$. Third-order and higher-order submodels may be obtained by iterating this process. This construction yields a rich class of normal conditional models and associated testing problems which, despite their apparent complexity, admit a relatively standard explicit likelihood analysis.

Alternatively, one might replace each term $M([K] \times \langle K\rangle) \times P([K])$ in the $\mathcal{K}$-parametrization (2.32) by a suitable covariance selection model requirement (cf. Dempster (1972), Wermuth (1976, 1980)), thus generalizing the multivariate graphical chain models of Lauritzen and Wermuth (1989) to "multivariate graphical lattice models".

Another interesting question the rela.tion of the lattice CI models

(1985, 1989),decomposable Q'rl~ntls

CI models determined by

and Wermuth

toexteIls:l.on.s just rlp!Qt'":l~iH(::Jl) (and

(1989), etc.). It appears that the class of decomposable graphical CI

contains nor is co:nulirled in the of lattice

ttl'"',....""~. "" contailnEid here.


APPENDIX.

In Appendices A.1 and A.2, the notation and terminology of Section 5 are followed.

A.1. The Decomposition Theorem and existence of a $\mathcal{Q}$-adapted basis.

Lemma A.1. For $R \in \mathcal{Q}(V)$, the set $\mathcal{Q}(V)_R := \{Q \in \mathcal{Q}(V) \mid Q \le R\}$ is a sublattice of $\mathcal{Q}(V)$ isomorphic to the lattice $\mathcal{Q}(R)$ of quotients of $R$ through the lattice isomorphism

$$\mathcal{Q}(R) \to \mathcal{Q}(V)_R, \qquad (Q, p_{QR}) \to (Q, p_{QR} \circ p_R).$$

Proof. Straightforward. □

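Lemma A.2. Let $R, T \in \mathcal{Q}(V)$ with $R \vee T = V$ and let $r_R: R \to p_{R\wedge T,R}^{-1}(0)$ and $r_T: T \to p_{R\wedge T,T}^{-1}(0)$ be surjective linear mappings. Then the linear mapping

(A.1) $\varphi: V \to (R \wedge T) \times p_{R\wedge T,R}^{-1}(0) \times p_{R\wedge T,T}^{-1}(0)$, $\quad x \to (p_{R\wedge T}(x),\, r_R(p_R(x)),\, r_T(p_T(x)))$

is bijective.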

I1Mr(x) =0 and we obtain that l1l(x) €

r R is Similarly PT{x) = O.

Suppose that cp(x) = O.

fact PR(x) = 0

-1 -1hence x € I1l (0) n Pr (0) = {O}. The linear mapping cp is thus injective.

-1 -1O:ln,ce dim(V} = dim{(RAT) x I1MT.R{O) xI1MT. T{O». cp is also

Le.•

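As in Section 5.3, let $\mathcal{Q}$ be a distributive sublattice of $\mathcal{Q}(V)$ such that $\{0\}, V \in \mathcal{Q}$. For $Q \in \mathcal{Q}$, $Q \ne \{0\}$, define

$$\langle Q\rangle := \vee(Q' \in \mathcal{Q} \mid Q' < Q)$$

and let $J(\mathcal{Q})$ denote the poset of all join-irreducible elements in $\mathcal{Q}$:

$$J(\mathcal{Q}) := \{Q \in \mathcal{Q} \mid Q \ne \{0\},\ \langle Q\rangle < Q\}$$

$$= \{Q \in \mathcal{Q} \mid Q \ne \{0\},\ \forall R, T \in \mathcal{Q}: Q = R \vee T \Rightarrow Q = R \text{ or } Q = T\}.$$

In the following theorem the space $V$ is represented as a product of vector spaces indexed by $J(\mathcal{Q})$ such that the space with index $Q \in J(\mathcal{Q})$ has dimension $\dim(Q) - \dim(\langle Q\rangle)$.

Theorem A.1 (Decomposition Theorem). For each $Q \in J(\mathcal{Q})$, let $r_Q: Q \to p_{\langle Q\rangle,Q}^{-1}(0)$ be any surjective linear mapping. Then the linear mapping

(A.2) $\psi_V: V \to \times(p_{\langle Q\rangle,Q}^{-1}(0) \mid Q \in J(\mathcal{Q}))$, $\quad x \to (r_Q(p_Q(x)) \mid Q \in J(\mathcal{Q}))$

is bijective.

Proof. For $R \in \mathcal{Q}$ define $\mathcal{Q}_R := \{Q \in \mathcal{Q} \mid Q \le R\}$, a sublattice of $\mathcal{Q}$ ($\mathcal{Q}_V = \mathcal{Q}$).

(A.3) $J(\mathcal{Q}_R) = J(\mathcal{Q}) \cap \mathcal{Q}_R$

(A.4) $\forall Q \in J(\mathcal{Q}_R)$

(A.5)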


(cf. (2.4) - (2.7)). The proof proceeds by induction on $|J(\mathcal{Q})| =: q$. If $q = 1$, then $\mathcal{Q} = \{\{0\}, V\}$ and the result is trivial. Next, assume that the result is true whenever $q \le k-1$ and suppose that $q = k$. If $V \in J(\mathcal{Q})$ then $|J(\mathcal{Q}_{\langle V\rangle})| = k-1$, hence the mapping

is bijective by the induction assumption and Lemma A.1. Since the linear mapping

$$V \to \langle V\rangle \times p_{\langle V\rangle}^{-1}(0), \qquad x \to (p_{\langle V\rangle}(x),\, r_V(x))$$

is bijective and $p_{Q,\langle V\rangle} \circ p_{\langle V\rangle} = p_Q$ for every $Q \in J(\mathcal{Q}_{\langle V\rangle})$, the mapping (A.2) is bijective in this case.

If, on the other hand, $V \notin J(\mathcal{Q})$, then $V = R \vee T$ where $R < V$ and $T < V$. It follows from (A.3) that $|J(\mathcal{Q}_R)| < k$ and $|J(\mathcal{Q}_T)| < k$, so by the induction assumption and Lemma A.1, the mapping

$$V \to \times(p_{\langle Q\rangle,Q}^{-1}(0) \mid Q \in J(\mathcal{Q}_R)), \qquad x \to (r_Q(p_Q(x)) \mid Q \in J(\mathcal{Q}_R))$$

is (equivalent to) the quotient mapping $p_R: V \to R$. Similarly, the quotient mappings $p_T$ and $p_{R\wedge T}$ can be represented in an analogous way, hence


Thus, by (A.5) and (A.6),

Lemma A.2 now implies that $\psi_V$ is bijective. □

Remark A.1. The representation (A.2) shows that $V$ can be identified with a product of vector spaces indexed by $J(\mathcal{Q})$; similarly, each $R \in \mathcal{Q}$ can be identified with the product $\times(p_{\langle Q\rangle,Q}^{-1}(0) \mid Q \in J(\mathcal{Q}_R))$ through the bijective linear mapping $\psi_R$ defined by $\psi_R(x) = (r_Q(p_Q(x)) \mid Q \in J(\mathcal{Q}_R))$, $x \in R$; under these identifications, each mapping $p_{RT}$, $R \le T \le V$, is simply a canonical projection mapping.
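In the canonical coordinate-wise setting, where the lattice is a ring of subsets of $I$, the decomposition can be checked by direct computation. The sketch below is an illustration under that assumption: it finds the join-irreducible members $K$, the sets $\langle K\rangle$, and the blocks $[K] := K \setminus \langle K\rangle$, whose sizes play the role of $\dim(Q) - \dim(\langle Q\rangle)$. The function name `join_irreducibles` and the example ring are hypothetical.

```python
def join_irreducibles(ring):
    """Map each join-irreducible K to <K>, the union of the ring members strictly below K."""
    result = {}
    for K in ring:
        below = frozenset().union(*(L for L in ring if L < K))
        if K and below != K:  # nonempty and not the union of smaller members
            result[K] = below
    return result

I = frozenset({1, 2, 3})
ring = {frozenset(), frozenset({2}), frozenset({1, 2}), frozenset({2, 3}), I}

J = join_irreducibles(ring)
blocks = {K: K - below for K, below in J.items()}  # [K] := K \ <K>

for K in sorted(J, key=sorted):
    print(sorted(K), sorted(J[K]), sorted(blocks[K]))

# The blocks [K] are disjoint and cover I, mirroring the product decomposition of V.
print(sum(len(b) for b in blocks.values()) == len(I))
```

Here $I$ itself is not join-irreducible because it is the union of $\{1,2\}$ and $\{2,3\}$, so the product has three factors, one for each of the blocks $\{2\}$, $\{1\}$, $\{3\}$.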

Proof of Proposition 5.1. For each Q € J(Q). let [K(Q)] be a set with

For R € Q. define

n

(A.7)

and let $I := K(V)$. From (A.5) and (A.6) it follows that $\mathcal{K} \equiv \mathcal{K}(\mathcal{Q}) := \{K(R) \mid R \in \mathcal{Q}\}$ is a subring of $\mathcal{D}(I)$ and the mapping $R \to K(R)$ is a lattice isomorphism between $\mathcal{Q}$ and $\mathcal{K}$. Now Remark A.1 implies that there exists a basis $(e_i \mid i \in I)$ for $V$ such that the elements $(R, p_R)$ in $\mathcal{Q}$ can be


A.2. Proof of Theorem 5.1.

Lemma A.3. Suppose that $x \sim N(\sigma)$, $\sigma \in P(V)$. Then for any $R, T \in \mathcal{Q}(V)$, $p_R(x) \perp\!\!\!\perp p_T(x) \mid p_{R\wedge T}(x) \iff p_R^{-1}(0)$ and $p_T^{-1}(0)$ are geometrically orthogonal (g.o.) wrt the inner product $\delta := \sigma^{-1}$ on $V$ (cf. [A] (1990), Definition 4.1, for the definition of g.o.).
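A small coordinate example may help in reading Lemma A.3. Take $V = \mathbb{R}^3$ with $R$ and $T$ the coordinate quotients onto $\{0,1\}$ and $\{1,2\}$, so the kernels are $\operatorname{span}(e_2)$ and $\operatorname{span}(e_0)$. Assuming, as a reading of the definition in [A] (1990) that is not reproduced here, that two subspaces with trivial intersection are g.o. exactly when they are orthogonal, the lemma says that $X_0$ and $X_2$ are CI given $X_1$ if and only if the $(0,2)$ entry of $\Sigma^{-1}$ vanishes. The sketch below checks this on two hypothetical covariance matrices; the function names are illustrative.

```python
from fractions import Fraction

def det3(s):
    """Determinant of a 3x3 matrix."""
    (a, b, c), (d, e, f), (g, h, i) = s
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def precision_02(s):
    """(0,2) entry of the inverse of a symmetric 3x3 matrix, via its cofactor."""
    return Fraction(s[0][1] * s[1][2] - s[0][2] * s[1][1], det3(s))

def partial_cov_02(s):
    """Covariance of X_0 and X_2 after adjusting both for X_1."""
    return Fraction(s[0][2]) - Fraction(s[0][1] * s[1][2], s[1][1])

# Hypothetical covariance matrices (both positive definite).
sigma_ci = [[4, 2, 1], [2, 4, 2], [1, 2, 4]]      # X_0 and X_2 are CI given X_1
sigma_not_ci = [[4, 2, 3], [2, 4, 2], [3, 2, 4]]  # CI fails

for s in (sigma_ci, sigma_not_ci):
    print(partial_cov_02(s), precision_02(s))
```

Both quantities are multiples of $\Sigma_{01}\Sigma_{12} - \Sigma_{02}\Sigma_{11}$, so they vanish together, which is the coordinate form of the equivalence.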

Proof. Let $p_R^{-1}(0)^\perp$, $p_T^{-1}(0)^\perp$, and $p_{R\wedge T}^{-1}(0)^\perp$ denote the orthogonal complements of $p_R^{-1}(0)$, $p_T^{-1}(0)$, and $p_{R\wedge T}^{-1}(0)$, respectively, wrt $\delta$. Furthermore, let $q_R$, $q_T$, and $q_{R\wedge T}$ be the orthogonal projections of $V$ onto $p_R^{-1}(0)^\perp$, $p_T^{-1}(0)^\perp$, and $p_{R\wedge T}^{-1}(0)^\perp$. Then $(p_R^{-1}(0)^\perp, q_R)$, $(p_T^{-1}(0)^\perp, q_T)$ and $(p_{R\wedge T}^{-1}(0)^\perp, q_{R\wedge T})$ represent the quotients $(R, p_R)$, $(T, p_T)$, and $(R\wedge T, p_{R\wedge T})$, respectively. Therefore

onto

orthogonal

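$$p_R(x) \perp\!\!\!\perp p_T(x) \mid p_{R\wedge T}(x)$$

$$\iff q_R(x) \perp\!\!\!\perp q_T(x) \mid q_{R\wedge T}(x)$$

$$\iff (q_R(x) - q_{R\wedge T}(x)) \perp\!\!\!\perp (q_T(x) - q_{R\wedge T}(x)) \mid q_{R\wedge T}(x)$$

$$\iff (q_R(x) - q_{R\wedge T}(x)) \perp\!\!\!\perp (q_T(x) - q_{R\wedge T}(x))$$

$$\iff (p_R^{-1}(0)^\perp \cap p_{R\wedge T}^{-1}(0)) \perp (p_T^{-1}(0)^\perp \cap p_{R\wedge T}^{-1}(0))$$

$$\iff p_R^{-1}(0)^\perp \text{ and } p_T^{-1}(0)^\perp \text{ are g.o.}$$

$$\iff p_R^{-1}(0) \text{ and } p_T^{-1}(0) \text{ are g.o.}$$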

and this direct sum is

fourth (=) follows from (*). the

fifth and sixth <=)'s are elementary properties of geometric


Proof of Theorem 5.1. Since the correspondence $Q \leftrightarrow p_Q^{-1}(0)$ between $\mathcal{Q}(V)$ and the lattice $\mathcal{L}(V)$ of all subspaces of $V$ (cf. [A] (1990), Section 4.1) is a lattice anti-isomorphism it follows that $\mathcal{L} := \{p_Q^{-1}(0) \mid Q \in \mathcal{Q}\} \subseteq \mathcal{L}(V)$ is a lattice and is anti-isomorphic to $\mathcal{Q}$. If $\sigma \in P_{\mathcal{Q}}(V) \ne \emptyset$, then by Lemma A.3, $\mathcal{L}$ is g.o. wrt $\delta := \sigma^{-1}$. Thus by Proposition 4.1 of [A] (1990) $\mathcal{L}$ is distributive, hence so is $\mathcal{Q}$. Conversely, if $\mathcal{Q}$ is distributive, then $P_{\mathcal{Q}}(V) \ne \emptyset$ by Remark 5.1.

A.3. Proof of Theorem 4.2.

Let $\Omega \subset M(I \times N)$ be the open subset

(A.8) $\Omega := \{y \in M(I \times N) \mid \operatorname{rank}(y) = \min\{|I|, n\}\}$.

Since $M(I \times N) \setminus \Omega$ is a Lebesgue-null set, we may replace the sample space $M(I \times N)$ by $\Omega$. Also, since $\operatorname{rank}(Ay) = \operatorname{rank}(y)$ for $A \in GL_{\mathcal{K}}(I)$ and $y \in M(I \times N)$, it follows that $GL_{\mathcal{K}}(I)$ acts on $\Omega$ by restriction of (3.12). Furthermore, since $\Omega$ is locally compact, Lemma A.5 at the end of this subsection implies that this restriction is a proper action (whereas (3.12) itself is not proper). Thus, in order to prove Theorem 4.2 we may apply the method of [A] (1982) to study the transformation of the normal distributions in the model $H_0$ under the mapping

(A.9) $\Omega \to \Omega/GL_{\mathcal{K}}(I) \times (\times(P([K]) \mid K \in J(\mathcal{K})))$, $\quad y \to (\pi(y),\, (\hat\Sigma_{[K]\cdot}(y) \mid K \in J(\mathcal{K})))$.

sA and

group is

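$$\mathcal{A} = \{A \in GL_{\mathcal{K}}(I) \mid A_{[K\rangle} = 0,\ K \in J(\mathcal{K})\}$$

$$\mathcal{T} = \{T \in GL_{\mathcal{K}}(I) \mid T_{[K]} = 1_{[K]},\ K \in J(\mathcal{K})\}.$$

Therefore we may apply the method of [A] (1982), Section 5, with $K = GL_{\mathcal{K}}(I)$, $H = \mathcal{A}$, $G = \mathcal{T}$, and $X = \Omega$ to see that $\pi$ can be represented as $\pi = \pi_{\mathcal{A}} \circ \pi_{\mathcal{T}}$, where $\pi_{\mathcal{T}}: \Omega \to \Omega/\mathcal{T}$ and $\pi_{\mathcal{A}}: \Omega/\mathcal{T} \to (\Omega/\mathcal{T})/\mathcal{A} \cong \Omega/GL_{\mathcal{K}}(I)$. (The action of $\mathcal{T}$ on $\Omega$ is the restriction of (3.12) to $\mathcal{T} \times \Omega$, and the induced action of $\mathcal{A}$ on $\Omega/\mathcal{T}$ is defined as in equation (21) of [A] (1982).)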

Since the mapping (A.9) is invariant under the action of $\mathcal{T}$ on $\Omega$ (cf. (2.26)), it has a unique factorization through $\pi_{\mathcal{T}}$. Therefore we may first transform the normal distributions in the model $H_0$ from $\Omega$ to $\Omega/\mathcal{T}$ by $\pi_{\mathcal{T}}$. To do this, we need the following explicit representation:

Lemma A.4. A representation of $\pi_{\mathcal{T}}: \Omega \to \Omega/\mathcal{T}$ is given by

(A.11)

Proof: To show that $\Omega/\mathcal{T}$ in (A.10) is a cross-section of $\Omega$ and that $\pi_{\mathcal{T}}$ in (A.11) is a maximal invariant function, it suffices to show that for each $y \in \Omega$,

(A.12)

show k in . suppose

(A. Pr~tnnlqition 2.2{ii). {2.


for each K € J{~}. hence

{A.13}

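$K \in J(\mathcal{K})$, i.e., $Ty = \pi_{\mathcal{T}}(y)$. To show the opposite inclusion $\supseteq$, it is easy to verify that $\pi_{\mathcal{T}}(y) \in \Omega/\mathcal{T}$ for every $y \in \Omega$; to show that $\pi_{\mathcal{T}}(y) \in \{Ty \mid T \in \mathcal{T}\}$, simply note that $\pi_{\mathcal{T}}(y) = Ty$ where

$K \in J(\mathcal{K})$. Finally, the mapping $\pi_{\mathcal{T}}: \Omega \to \Omega/\mathcal{T}$ defined in (A.10) and (A.11) is clearly continuous, so this representation is also topological and the result follows.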

We may now apply formula (16) of [A] (1982) to transform the normal distributions in the model $H_0$ by the mapping $\pi_{\mathcal{T}}$ given by (A.11). In the notation of Section 4 of [A] (1982), $G = \mathcal{T}$, $X = \Omega$, $\lambda$ is the restriction of Lebesgue measure on $M(I \times N)$ to the open subset $\Omega$, $\pi = \pi_{\mathcal{T}}$, $\beta$ is a Haar measure on $\mathcal{T}$, $\Delta_G = \Delta_{\mathcal{T}} \equiv 1$, and $P = p \cdot \lambda$, i.e., $P$ is the normal distribution with density $p$ given by

[]

= Y €


with respect to $\lambda$. For $\Sigma \in P_{\mathcal{K}}(I)$, the density $q$ of $\pi_{\mathcal{T}}(P)$ wrt the quotient measure $\lambda/\beta$ on $\Omega/\mathcal{T}$ is thus given by

(A.14) $q(\pi_{\mathcal{T}}(y)) = (\det(\Sigma))^{-n/2} \int_{\mathcal{T}} \exp\{-\operatorname{tr}(\Sigma^{-1}(Ty)(Ty)^t)/2\}\, d\beta(T)$

where

$K \in J(\mathcal{K})$, $T \in \mathcal{T}$.

Since $d\beta(T) = \otimes(d\nu_K(T_{[K\rangle}) \mid K \in J(\mathcal{K}))$, where $\nu_K$ is the Lebesgue measure on $M([K] \times \langle K\rangle)$ (cf. (2.19)), the last integral in (A.14) can be calculated using Fubini's Theorem and the translation invariance of $\nu_K$, $K \in J(\mathcal{K})$. The order of integration should be determined by a never-increasing listing $K_1, K_2, \dots, K_{|J(\mathcal{K})|}$ of the elements in $J(\mathcal{K})$ (cf. Remark 2.1). After some calculation we obtain

(A.I5)

xU( E: }


-(n-I<K>IJ/2 - 1 I= Ilf{det{2f Kl_l)

exp{-tr{2f Kl_S[Kj_{yl)/2}

•• K € J(:1ll)

1f,.{y) € on.

twhere Sly) = yy .

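By Lemma A.4, where $\Omega/\mathcal{T}$ is represented as a subset of $\Omega$, the induced action of the subgroup $\mathcal{A}$ on $\Omega/\mathcal{T}$ is simply the restriction of the action (3.12) to $\mathcal{A} \times (\Omega/\mathcal{T})$. The next step is to represent the transformed measure $\pi_{\mathcal{T}}(P) = q \cdot (\lambda/\beta)$ as $\pi_{\mathcal{T}}(P) = q_1 \cdot \nu$, where $\nu$ is an invariant measure under this action of $\mathcal{A}$ on $\Omega/\mathcal{T}$.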

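It follows from the statement following the proof of Proposition 2 on p. 961 of [A] (1982) that the quotient measure $\lambda/\beta$ is relatively invariant under the action of $\mathcal{A}$ on $\Omega/\mathcal{T}$ with multiplier $\chi$ given by $\chi(A) = (\operatorname{mod} \alpha_A)^{-1} \chi_0(A)$, $A \in \mathcal{A}$, where $\chi_0$ is the multiplier for $\lambda$ as a relatively invariant measure under the action of $GL_{\mathcal{K}}(I)$ on $\Omega$ and where the automorphisms $\alpha_A: \mathcal{T} \to \mathcal{T}$ are defined by $\alpha_A(T) = ATA^{-1}$, $T \in \mathcal{T}$. Since $A = \operatorname{Diag}(A_{[K]} \mid K \in J(\mathcal{K}))$ it is clear that

$\forall K \in J(\mathcal{K})$,

hence

$$\operatorname{mod} \alpha_A = \Pi(|\det(A_{[K]})|^{|\langle K\rangle|} \cdot |\det(A_{\langle K\rangle})|^{-|[K]|} \mid K \in J(\mathcal{K})).$$

However,

$$\chi_0(A) = |\det(A)|^n = \Pi(|\det(A_{[K]})|^n \mid K \in J(\mathcal{K})),$$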

so

= Il( ) € }.


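If we define $m: \Omega/\mathcal{T} \to\ ]0, \infty[$ by

it follows that $m(Az) = \chi(A)m(z)$, $z \in \Omega/\mathcal{T}$, $A \in \mathcal{A}$ (compare to (17) in [A] 1982). Thus the measure $\nu := m^{-1} \cdot (\lambda/\beta)$ is invariant under the action of $\mathcal{A}$ on $\Omega/\mathcal{T}$. From (A.15), the density $q_1 := mq$ of $\pi_{\mathcal{T}}(P)$ with respect to $\nu$ is therefore given by

$$\Pi\left(\left[\frac{\det(\hat\Sigma_{[K]\cdot}(y))}{\det(\Sigma_{[K]\cdot})}\right]^{(n-|\langle K\rangle|)/2} \times \exp\{-n\operatorname{tr}(\Sigma_{[K]\cdot}^{-1}\hat\Sigma_{[K]\cdot}(y))/2\} \;\Big|\; K \in J(\mathcal{K})\right),$$

where it should be recalled that $\Sigma \in P_{\mathcal{K}}(I)$.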

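The final step in the proof of Theorem 4.2 is to obtain the transformation of the measure $\pi_{\mathcal{T}}(P) = q_1 \cdot \nu$ under the mapping

(A.16) $\Omega/\mathcal{T} \to (\Omega/\mathcal{T})/\mathcal{A} \times (\times(P([K]) \mid K \in J(\mathcal{K})))$, $\quad \pi_{\mathcal{T}}(y) \to (\pi_{\mathcal{A}}(\pi_{\mathcal{T}}(y)),\, (\hat\Sigma_{[K]\cdot}(y) \mid K \in J(\mathcal{K})))$.

Since the action of $\mathcal{A}$ on $\Omega/\mathcal{T}$ is the restriction to the closed subset $\mathcal{A} \times (\Omega/\mathcal{T})$ of the proper action of $GL_{\mathcal{K}}(I)$ on $\Omega$, it is a proper action. Thus we may apply Lemma 3 of Andersson, Brøns and Jensen (1983) to see that there exists a unique measure $\kappa$ on $(\Omega/\mathcal{T})/\mathcal{A}$ such that the invariant measure $\nu$ is transformed into the product measure $\kappa \otimes \nu_0$ under the mapping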

is an invariant measure on X{P{ ) € the

proper transitive action

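(A.17) $\mathcal{A} \times (\times(P([K]) \mid K \in J(\mathcal{K}))) \to \times(P([K]) \mid K \in J(\mathcal{K}))$, $\quad (A,\, (\Lambda_{[K]} \mid K \in J(\mathcal{K}))) \to (A_{[K]}\Lambda_{[K]}A_{[K]}^t \mid K \in J(\mathcal{K}))$.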

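(Lemma 3 of Andersson, Brøns, and Jensen (1983) is applied with $G = \mathcal{A}$, $X = \Omega/\mathcal{T}$, $Y = \times(P([K]) \mid K \in J(\mathcal{K}))$, $t = (\pi_{\mathcal{T}}(y) \to (\hat\Sigma_{[K]\cdot}(y) \mid K \in J(\mathcal{K})))$, $\pi = \pi_{\mathcal{A}}$, and $\nu = \nu$.)

Since $q_1(z)$ depends on $z := \pi_{\mathcal{T}}(y)$ only through $(\hat\Sigma_{[K]\cdot}(y) \mid K \in J(\mathcal{K}))$, the probability measure $q_1 \cdot \nu$ is therefore transformed under (A.16) into the probability measure $r \cdot (\kappa \otimes \nu_0)$, where

$$r: (\Omega/\mathcal{T})/\mathcal{A} \times (\times(P([K]) \mid K \in J(\mathcal{K}))) \to \mathbb{R}_+,$$

(A.18) $(w,\, (\Lambda_{[K]} \mid K \in J(\mathcal{K}))) \to \Pi\left(\left[\dfrac{\det(\Lambda_{[K]})}{\det(\Sigma_{[K]\cdot})}\right]^{(n-|\langle K\rangle|)/2} \times \exp\{-n\operatorname{tr}(\Sigma_{[K]\cdot}^{-1}\Lambda_{[K]})/2\} \;\Big|\; K \in J(\mathcal{K})\right)$.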

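Because $r$ does not depend on $w$, under $H_0$ it follows that $\pi \equiv \pi_{\mathcal{A}} \circ \pi_{\mathcal{T}}$ is independent of $(\hat\Sigma_{[K]\cdot} \mid K \in J(\mathcal{K}))$, $\pi$ has distribution $\kappa$, and $(\hat\Sigma_{[K]\cdot} \mid K \in J(\mathcal{K}))$ has distribution $s \cdot \nu_0$, where $s((\Lambda_{[K]} \mid K \in J(\mathcal{K})))$ is given by the product (A.18). Furthermore, since $\nu_0 = \otimes(\nu_K \mid K \in J(\mathcal{K}))$ where $\nu_K$ is an invariant measure on $P([K])$ under the usual action of $GL([K])$, it follows that under $H_0$, $\hat\Sigma_{[K]\cdot}$, $K \in J(\mathcal{K})$, are mutually independent and $\hat\Sigma_{[K]\cdot}$ has the Wishart distribution on $P([K])$ with $n - |\langle K\rangle|$ degrees of freedom and expected value $\Sigma_{[K]\cdot}$. This ends the proof of Theorem 4.2.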

following lemma. which was cited at the beginning of this

interest in its own group

actions in statistics.

Prooos t t.Ioa 5 (ti».

( ) . tre . 14.


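Lemma A.5. Suppose that $G$ and $G'$ are locally compact groups that act continuously on the locally compact spaces $X$ and $X'$, respectively. Let $\varphi: G \to G'$ be a continuous group homomorphism and $\psi: X \to X'$ be a continuous mapping such that $\psi(gx) = \varphi(g)\psi(x)$, $x \in X$, $g \in G$. If $\varphi$ is proper and if the action of $G'$ on $X'$ is proper, then the action of $G$ on $X$ is also proper.

Proof: Consider the diagram

$$\begin{array}{ccc} G \times X & \xrightarrow{\ \theta\ } & X \times X \\ {\scriptstyle \varphi\times\psi}\downarrow & & \downarrow{\scriptstyle \psi\times\psi} \\ G' \times X' & \xrightarrow{\ \theta'\ } & X' \times X', \end{array}$$

where $\theta(g,x) = (gx,x)$ and $\theta'(g',x') = (g'x',x')$. We must show that $\theta^{-1}(C)$ is compact whenever $C \subseteq X \times X$ is compact. Let $p_{G'}$ denote the projection of $G' \times X'$ onto $G'$. Since the diagram commutes, i.e., $\theta' \circ (\varphi\times\psi) = (\psi\times\psi) \circ \theta$, it follows that

$$\theta^{-1}(C) \subseteq \theta^{-1}((\psi\times\psi)^{-1}((\psi\times\psi)(C))) = (\varphi\times\psi)^{-1}(\theta'^{-1}((\psi\times\psi)(C)))$$

$$\subseteq (\varphi\times\psi)^{-1}(p_{G'}(\theta'^{-1}((\psi\times\psi)(C))) \times X') = \varphi^{-1}(C') \times X,$$

where $C' = p_{G'}(\theta'^{-1}((\psi\times\psi)(C)))$. Since trivially $\theta^{-1}(C) \subseteq G \times p_2(C)$, where $p_2$ denotes the projection of $X \times X$ on the second component, we have that

$$\theta^{-1}(C) \subseteq \varphi^{-1}(C') \times p_2(C).$$

The set $C'$ is compact since $\theta'$ is proper, and therefore $\varphi^{-1}(C')$ is compact because $\varphi$ is proper. Thus $\theta^{-1}(C)$ is a closed subset of a compact subset of $G \times X$, hence is compact.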

With the identifications $G = G' = GL_{\mathcal{K}}(I)$, $X = \Omega$, $X' = P_{\mathcal{K}}(I)$, $\varphi$ = the identity mapping on $GL_{\mathcal{K}}(I)$, and $\psi = \hat\Sigma$, Lemma A.5 may be applied as indicated at the beginning of this subsection. □


REFERENCES

Anderson, T.W. (1984). An Introduction to Multivariate Statistical Analysis (2nd ed.). John Wiley and Sons, New York.

Andersson, S.A. (1975). Invariant normal models. Ann. Statist. 3, 132-154.

Andersson, S.A. (1982). Distributions of maximal invariants using quotient measures. Ann. Statist. 10, 955-961.

Andersson, S.A. (1990). The lattice structure of orthogonal linear models and orthogonal variance component models. Scand. J. Statist. 17, 287-319.

Andersson, S.A., Brøns, H.K., and Jensen, S.T. (1983). Distribution of eigenvalues in multivariate statistical analysis. Ann. Statist. 11, 392-415.

Andersson, S.A., Marden, J.I., and Perlman, M.D. (1990). Totally ordered multivariate linear models with applications to monotone missing data problems. In preparation.

Andersson. S.A.• M.D. (1991).

independence models for missing data. To appear in Statist.


Banerjee, P.K., and Giri, N. (1980). On D-, E-, $D_A$-, and $D_M$-optimality properties of test procedures of hypotheses concerning the covariance matrix of a normal distribution. In Multivariate Statistical Analysis (R.P. Gupta, ed.) 11-19. North Holland Pub. Co., New York.

Bourbaki, N. (1971). Éléments de Mathématique. Topologie générale, Chap. 1 à 4. Hermann, Paris.

Das Gupta, S. (1977). Tests on multiple correlation coefficient and multiple partial correlation coefficient. J. Multivariate Anal. 7, 82-88.

Davey, B.A. and Priestley, H.A. (1990). Introduction to Lattices and Order. Cambridge University Press, Cambridge.

Dempster, A. (1972). Covariance selection models. Biometrics 28, 157-175.

Eaton, M.L. (1983). Multivariate Statistics: A Vector Space Approach. John Wiley and Sons, New York.

Eaton, M.L. and Kariya, T. (1983). Multivariate tests with incomplete data. Ann. Statist. 11, 654-665.

Frydenberg, M. (1990). The chain graph Markov property. Scand. J. Statist. 17, 333-353.


Frydenberg, M. and Lauritzen, S.L. (1989). Decomposition of maximum likelihood in mixed graphical interaction models. Biometrika 76, 539-555.

Giri, N. (1979). Locally minimax test for multiple correlations. Canad. J. Statist. 7, 53-60.

Grätzer, G. (1978). General Lattice Theory. Birkhäuser, Basel.

Kiiveri, H., Speed, T.P., and Carlin, J.B. (1984). Recursive causal models. J. Austral. Math. Soc. (Ser. A) 36, 30-52.

Lauritzen, S.L. (1985). Test of hypotheses in decomposable mixed interaction models. Bull. Int. Statist. Inst. 4, 24.3(1)-24.3(6).

Lauritzen, S.L. (1989). Mixed graphical association models. Scand. J. Statist. 16, 273-306.

Lauritzen, S.L., Dawid, A.P., Larsen, B.N. and Leimer, H.G. (1990). Independence properties of directed Markov fields. To appear in Networks.

Lauritzen, S.L. and Wermuth, N. (1984). Mixed interaction models. Res. rep. R-84-8, Inst. of Electr. Systems, Aalborg Univ.

Lauritzen, S.L. and Wermuth, N. (1989). Graphical models for association between variables, some of which are qualitative and some quantitative. Ann. Statist. 17, 31-57.

Little, R.J.A. and Rubin, D.B. (1987). Statistical Analysis with Missing Data. John Wiley and Sons, New York.

Marden, J.I. (1981). Invariant tests on covariance matrices. Ann. Statist. 9, 1258-1266.

Porteous, B.T. (1985). Properties of log-linear and covariance selection models. Doctoral thesis, University of Cambridge.

Rubin, D.B. (1987). Multiple Imputation for Nonresponse in Sample Surveys. John Wiley and Sons, New York.

Speed, T.P. and Kiiveri, H. (1986). Gaussian Markov distributions over finite graphs. Ann. Statist. 14, 138-150.

Wermuth, N. (1976). Analogies between multiplicative models in contingency tables and covariance selection. Biometrics 32, 95-108.

Wermuth, N. (1980). Linear recursive equations, covariance selection, and path analysis. J. Amer. Statist. Assoc. 75, 963-972.

(

structures.


Wermuth, N. (1988). On block-recursive linear regression equations. Manuscript, Psychologisches Institut, Universität Mainz.

Whittaker, J. (1990). Graphical Models in Applied Multivariate Statistics. Wiley, New York.
