
Aust. J. Statist., 16 (1), 1974, 11-19

SOME TESTS FOR INDEPENDENCE¹

DAVID THOMAS AND NANCY FERGUSON

Oregon State University and California Department of Fish and Game

For testing the null hypothesis of independence of random variables X and Y, asymptotically optimal C(α) tests and locally most powerful rank tests (l.m.p.r.t.'s) are considered for Plackett bivariate (BV) distributions (Plackett, 1965; Steck, 1968; Mardia, 1970). We also consider Morgenstern (Farlie, 1960; Plackett, 1965) and Moran (Moran, 1969) BV distributions. The general class of C(α) tests was developed by Neyman (1959), who used locally root-n consistent estimators for nuisance parameters. We compare the forms of these two kinds of test statistics and their asymptotic relative efficiency. Asymptotic relative efficiency (A.R.E.) comparisons are also made with other tests for independence.

1. Description of Plackett BV Distribution

Let ψ be an arbitrary positive number. Plackett (1965) constructed the joint c.d.f. H(x,y) as the root of the quadratic equation

(1)  H(1 − F(x) − G(y) + H) = ψ(F(x) − H)(G(y) − H).

Mardia (1967) showed that only one of the two roots of equation (1) reaches the upper and lower bounds of the Fréchet inequality

(2)  max (F(x) + G(y) − 1, 0) ≤ H(x,y) ≤ min (F(x), G(y)).

The c.d.f. H(x,y) and p.d.f. h(x,y) are given as follows:

(3)  H(x,y) = [S − {S² − 4ψ(ψ − 1)F(x)G(y)}^{1/2}] / [2(ψ − 1)],  ψ ≠ 1,
     h(x,y) = ψ f(x)g(y)[1 + (ψ − 1)(F(x) + G(y) − 2F(x)G(y))] / [S² − 4ψ(ψ − 1)F(x)G(y)]^{3/2},

where f(x) and g(y) are absolutely continuous p.d.f.'s, F(x) and G(y) are the corresponding c.d.f.'s of f and g, S = 1 + (F(x) + G(y))(ψ − 1), and 0 < ψ < ∞.
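A minimal numerical sketch of equation (3) (an illustration, not part of the original derivation; function names are ours, and ψ = 1 is handled as the independence limit H = FG):

```python
import math

def plackett_cdf(F, G, psi):
    # H(x,y): the root of H(1-F-G+H) = psi*(F-H)*(G-H) that
    # satisfies the Frechet bounds of equation (2).
    if psi == 1.0:                 # independence limit: H = F*G
        return F * G
    S = 1.0 + (F + G) * (psi - 1.0)
    disc = S * S - 4.0 * psi * (psi - 1.0) * F * G
    return (S - math.sqrt(disc)) / (2.0 * (psi - 1.0))

def plackett_pdf(f, g, F, G, psi):
    # h(x,y) from equation (3); f, g are the marginal densities at (x,y).
    S = 1.0 + (F + G) * (psi - 1.0)
    disc = S * S - 4.0 * psi * (psi - 1.0) * F * G
    num = psi * f * g * (1.0 + (psi - 1.0) * (F + G - 2.0 * F * G))
    return num / disc ** 1.5
```

With uniform marginals (f = g = 1, F and G the identity on [0,1]) this gives the copula form of the family; ψ > 1 corresponds to positive association, so H(x,y) exceeds F(x)G(y).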

2. Locally Asymptotically Optimal C(α) Tests

First, the general development of optimal C(α) tests is summarized. Suppose a sample (X_1, X_2, . . ., X_n) is taken from a population with p.d.f. f(x; ξ, θ), where θ = (θ_1, . . ., θ_r). The observations X_i may be vector valued. For testing the null hypothesis H_0 : ξ = ξ_0 against the alternatives H_1 : ξ > ξ_0 (or ξ < ξ_0) or H_2 : ξ ≠ ξ_0 with unknown nuisance parameters θ, Neyman (1959) considered tests based on the statistics

(4)  Z_n = (nσ²)^{−1/2} Σ_{i=1}^{n} φ(X_i; ξ_0, θ̂),

¹ Manuscript received March 26, 1973; revised June 14, 1973.


(5)  φ(x; ξ, θ) = φ_ξ(x; ξ, θ) − Σ_{j=1}^{r} a_j φ_{θ_j}(x; ξ, θ),

where φ_ξ = ∂ ln f/∂ξ, φ_{θ_j} = ∂ ln f/∂θ_j, and the a_j's are chosen so as to minimize the variance of φ under H_0. The resulting minimum variance is denoted by σ². The estimators θ̂ = (θ̂_1, . . ., θ̂_r) are assumed to be either locally root-n consistent or root-n consistent for θ. Neyman (1959) defines an estimator θ̂_{jn} to be locally root-n consistent for θ_j if there exists a number λ_j ≠ 0 such that, as n → ∞, the product |θ̂_{jn} − θ_j − λ_j(ξ − ξ_0)|√n remains bounded in probability for all ξ and θ. If |θ̂_{jn} − θ_j|√n remains bounded in probability as n → ∞, θ̂_{jn} is said to be root-n consistent for θ_j. The p.d.f. f(x; ξ, θ) is also assumed to satisfy Cramér-type regularity conditions (Neyman, 1959, p. 216).

The statistic Z_n of equation (4) has also been shown to have a limiting unit normal distribution under H_0. The optimal symmetric C(α) test based on equation (4) for the two-sided alternative H_2 : ξ ≠ ξ_0 has critical region

R_2(α) = {Z_n : |Z_n| > z_{α/2}},

where z_{α/2} is such that Φ(z_{α/2}) = 1 − α/2, Φ being the unit normal c.d.f. For the optimal one-tailed C(α) test of H_0 : ξ = ξ_0 against H_1 : ξ > ξ_0, the critical region is R_1(α) = {Z_n : Z_n > z_α}.

Moran (1970) indicates that Z_n is asymptotically equivalent to tests using the maximum likelihood estimator (MLE) ξ̂ of ξ, and to the likelihood ratio test based on the statistic

λ_n = Π_{i=1}^{n} f(X_i; ξ_0, θ̂_0) / Π_{i=1}^{n} f(X_i; ξ̂, θ̂),

where ξ̂ and θ̂ are MLE's and θ̂_0 are the MLE's under the null hypothesis. An advantage of the C(α) test over the other two types of tests is that the C(α) tests are frequently easier to compute.

Now we consider optimal C(α) tests for independence. Suppose (X, Y) follows a certain BV distribution with p.d.f. h(x, y; ξ, θ), where ξ is a parameter indicating the association between X and Y. That is, h(x, y; ξ, θ) = f(x; θ)g(y; θ) if, and only if, ξ = ξ_0 for some number ξ_0. For the Plackett distributions, the null hypothesis of independence is given by ψ = 1 in equation (3). In this case the scores of equation (5) are

(6)  φ_ξ = [∂ ln h(x, y; ξ, θ)/∂ξ]_{ξ=ξ_0},  φ_{θ_j} = [∂ ln h(x, y; ξ, θ)/∂θ_j]_{ξ=ξ_0},  j = 1, . . ., r.

We now show that, for any BV p.d.f. for which φ_ξ (equation (6)) can be written as K(F(x))K(G(y)) for some function K(u), 0 ≤ u ≤ 1,


then φ_ξ and φ_{θ_j}, j = 1, . . ., r, are uncorrelated under the null hypothesis. As a result, C(α) test statistics of independence will be functions of φ_ξ only. Given the relation under H_0 : ξ = ξ_0

(7)  φ_ξ = K(F(X)) K(G(Y)),

the fact that

Eφ_ξ = E(K(F(X)) K(G(Y))) = E(K(F(X))) E(K(G(Y))) = {E[K(F(X))]}² = 0

implies that

(8)  E(K(F(X))) = E(K(G(Y))) = 0.

From equation (6), we obtain

(9)  φ_{θ_j} = ∂ ln f/∂θ_j + ∂ ln g/∂θ_j,  j = 1, . . ., r.

Using relation (8) and the fact that Eφ_{θ_j} = 0, we have, for j = 1, . . ., r,

Cov (φ_ξ, φ_{θ_j}) = E(φ_ξ φ_{θ_j}) = 0.

Hence, the asymptotically optimal C(α) test for independence, H_0 : ξ = ξ_0, is based on

(10)  Z_n = (nσ_0²)^{−1/2} Σ_{i=1}^{n} φ_ξ(X_i, Y_i; ξ_0, θ̂),

where θ̂ are any locally root-n consistent or simply root-n consistent estimators of θ and σ_0² is the variance of φ_ξ(x, y; ξ_0, θ̂).

Now consider estimation of the nuisance parameters (α_1, α_2, β_1, β_2) of both the gamma and Weibull distributions, for example. For the case where there are no linear restrictions on the parameters, any estimators α̂_1(X), α̂_2(Y), β̂_1(X), β̂_2(Y), which are functions of only one of the variates X or Y, and are root-n consistent with respect to the corresponding marginal distributions, such as MLE's and moment estimators, will be root-n consistent with respect to the bivariate distributions.

The optimal C(α) test statistics of equation (10) are easily evaluated for the Plackett distributions and are given below. Using the density of equation (3) in equation (6) gives

(11)  K(u) = 2u − 1

in equation (7) for the Plackett alternatives. Using the additional result σ_0² = 1/9 in equation (10) then gives

(12)  Z_n = 3n^{−1/2} Σ_{i=1}^{n} (2F(X_i; θ̂) − 1)(2G(Y_i; θ̂) − 1).
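Equation (12) is immediate to compute once the marginal c.d.f.'s (or root-n consistent estimates of them) are available; a minimal sketch (illustrative, names ours):

```python
import math

def c_alpha_plackett(x, y, F, G):
    # Z_n of equation (12): 3 * n^{-1/2} * sum (2F(x_i)-1)(2G(y_i)-1),
    # where F and G are the marginal c.d.f.'s (or consistent estimates).
    n = len(x)
    s = sum((2.0 * F(xi) - 1.0) * (2.0 * G(yi) - 1.0)
            for xi, yi in zip(x, y))
    return 3.0 * s / math.sqrt(n)
```

Under H_0 the statistic is asymptotically unit normal, so |Z_n| > z_{α/2} rejects independence.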


3. Locally Most Powerful Rank Tests

In this section, we consider l.m.p.r.t.'s for the null hypothesis of independence in the Plackett family of distributions. The l.m.p.r.t.'s for independence are derived in general as follows.

For a random sample of size n from any BV distribution with p.d.f. h(x,y), (X_{(1)}, Y_{(R_1)}), . . ., (X_{(n)}, Y_{(R_n)}) is a sufficient reduction of the data, where X_{(1)} < X_{(2)} < . . . < X_{(n)} and R_1, R_2, . . ., R_n are the ranks of the corresponding Y's among Y_1, . . ., Y_n. The joint p.d.f. for this sufficient statistic (X_{(1)}, Y_{(R_1)}), . . ., (X_{(n)}, Y_{(R_n)}) is

n! Π_{i=1}^{n} h(x_{(i)}, y_{(r_i)}),  x_{(1)} < . . . < x_{(n)}.

The problem of testing H_0 : H(x,y) = F(x)G(y) against

H_1 : H(x,y) ≠ F(x)G(y)

for some (x,y) in the Plackett distributions is invariant under the group of function transformations G with elements

(13)  g_{ω_1,ω_2}((X_{(1)}, Y_{(R_1)}), . . ., (X_{(n)}, Y_{(R_n)})) = ((ω_1(X_{(1)}), ω_2(Y_{(R_1)})), . . ., (ω_1(X_{(n)}), ω_2(Y_{(R_n)}))),

where ω_1 and ω_2 are any continuous increasing functions from the real line onto the real line. The induced group of transformations on the parameter space has elements

(14)  ḡ_{ω_1,ω_2}(F, G, ξ) = (F ω_1^{−1}, G ω_2^{−1}, ξ),

where ξ is the parameter associated with independence of X and Y. In particular, ξ = ψ for the Plackett distributions.

A maximal invariant statistic for the group of transformations is (R_1, . . ., R_n), and the probability function for (R_1, . . ., R_n) is given by

(15)  P_ξ(r_1, . . ., r_n) = n! ∫_S Π_{i=1}^{n} h_ξ(x_i, y_i) dx_i dy_i,

where S = {(x_1, y_1), . . ., (x_n, y_n) : x_1 < . . . < x_n and r_i is the rank of the y-observation corresponding to x_i among the y-sample}. The l.m.p.r.t. of H_0 : ξ = ξ_0 against H_1 : ξ > ξ_0 rejects H_0 for

(16)  [∂P_ξ(r_1, . . ., r_n)/∂ξ]_{ξ=ξ_0} > K.

The tests which reject H_0 : ξ = ξ_0 for

(17)  |[∂P_ξ(r_1, . . ., r_n)/∂ξ]_{ξ=ξ_0}| > K

may be used for the two-sided alternatives H_2 : ξ ≠ ξ_0. Using the relation

(18)  Π_{i=1}^{n} h_ξ(x_i, y_i) = exp [Σ_{i=1}^{n} ln h_ξ(x_i, y_i)]

and equation (6) for the Plackett distributions, we obtain

[∂/∂ξ Π_{i=1}^{n} h_ξ(x_i, y_i)]_{ξ=ξ_0} = Σ_{i=1}^{n} φ_ξ(x_i, y_i) Π_{i=1}^{n} h_{ξ_0}(x_i, y_i).


From equations (15) and (16), we then evaluate the l.m.p.r.t. criterion as

n! [∂P_ξ(r_1, . . ., r_n)/∂ξ]_{ξ=ξ_0} = Σ_{i=1}^{n} E φ_ξ(X_{(i)}, Y_{(r_i)}) = T*,

where P_{ξ_0}(r_1, . . ., r_n) = 1/n! under H_0, the expectations are taken under H_0, and

(19)  T* = Σ_{i=1}^{n} E φ_ξ(X_{(i)}, Y_{(r_i)}),

so that rejecting for large values of the derivative is equivalent to rejecting for large T*.

For the Plackett distribution, using equations (7) and (11) in equation (19), we have

(20)  T* = Σ_{i=1}^{n} E(2F(X_{(i)}) − 1) E(2G(Y_{(r_i)}) − 1) = Σ_{i=1}^{n} (2i/(n+1) − 1)(2r_i/(n+1) − 1) = [n(n − 1)/3(n + 1)] y_s,

where

y_s = [12/n(n² − 1)] Σ_{i=1}^{n} (i − (n + 1)/2)(r_i − (n + 1)/2)

is Spearman's rank correlation coefficient. Hence, the test which rejects H_0 for y_s > 3(n + 1)K/n(n − 1) is equivalent to rejection of H_0 for T* > K. Therefore, the one-tailed test based on y_s is a l.m.p.r.t. for the Plackett distributions. Kendall (1962, p. 76) shows the asymptotic normality of y_s and has tables of the probability function of y_s.
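Computing y_s directly from the definition above is straightforward; a minimal sketch (illustrative, names ours; assumes no ties among the observations):

```python
def ranks(v):
    # ranks 1..n of the entries of v (assumes no ties)
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rk, i in enumerate(order, start=1):
        r[i] = rk
    return r

def spearman(x, y):
    # Spearman's rank correlation coefficient y_s, the l.m.p.r.t.
    # statistic for the Plackett (and Morgenstern) alternatives.
    n = len(x)
    rx, ry = ranks(x), ranks(y)
    num = sum((rx[i] - (n + 1) / 2) * (ry[i] - (n + 1) / 2)
              for i in range(n))
    return 12.0 * num / (n * (n * n - 1))
```

Perfectly concordant samples give y_s = 1 and perfectly discordant samples give y_s = −1.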

4. The Locally Optimal C(α) Tests and Locally Most Powerful Rank Tests for Morgenstern and Moran BV Distributions

In this section we introduce two other types of BV distributions. They both satisfy the condition given in equation (7). Using the same techniques as for the Plackett BV distributions, we obtain the locally optimal C(α) tests and l.m.p.r.t.'s of independence for the Morgenstern and Moran BV distributions.

The p.d.f. and c.d.f. for Morgenstern BV distributions (Farlie, 1960; Plackett, 1965) are

(21)  h_γ(x,y) = f(x)g(y)(1 + γ(2F(x) − 1)(2G(y) − 1))

and

(22)  H_γ(x,y) = F(x)G(y)(1 + γ(1 − F(x))(1 − G(y))),

where −1 ≤ γ ≤ 1.
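Because the Morgenstern copula density 1 + γ(2u − 1)(2v − 1) has a closed-form conditional c.d.f., it is easy to simulate; the sketch below (ours, for illustration) inverts P(V ≤ v | U = u) = v + γ(2u − 1)(v² − v):

```python
import math
import random

def sample_morgenstern(n, gamma, seed=0):
    # Draw pairs (U, V) from the Morgenstern copula density
    # 1 + gamma*(2u-1)*(2v-1), |gamma| <= 1, by inverting the
    # conditional c.d.f. P(V <= v | U = u) = v + gamma*(2u-1)*(v^2 - v).
    rng = random.Random(seed)
    out = []
    for _ in range(n):
        u, w = rng.random(), rng.random()
        a = gamma * (2.0 * u - 1.0)
        if abs(a) < 1e-12:
            v = w
        else:
            # root in [0,1] of a*v^2 + (1-a)*v - w = 0
            v = (math.sqrt((1.0 - a) ** 2 + 4.0 * a * w) - (1.0 - a)) / (2.0 * a)
        out.append((u, v))
    return out
```

With uniform marginals this samples (F(X), G(Y)); applying marginal quantile functions yields a Morgenstern pair with arbitrary marginals. The Pearson correlation of (U, V) is γ/3.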


The p.d.f. for Moran BV distributions is given in Moran (1969) as

(23)  h_ρ(x,y) = f(x)g(y)(1 − ρ²)^{−1/2} exp {−[ρ²(u² + v²) − 2ρuv]/[2(1 − ρ²)]},

where

u = Φ^{−1}(F(x)),  v = Φ^{−1}(G(y)),

Φ is the unit normal c.d.f., and F(x) and G(y) are absolutely continuous c.d.f.'s.

For Morgenstern BV distributions, it is interesting to know that both locally optimal C(α) test statistics and l.m.p.r.t.'s are of the same form as those for the Plackett BV distributions, i.e., equations (12) and (20) respectively. This results from the fact that the K function in equation (7) is identical for these two BV distributions.

As to the Moran BV distributions, the locally optimal C(α) test criterion is found to be

(24)  Z_n = n^{−1/2} Σ_{i=1}^{n} Φ^{−1}(F(X_i; θ̂)) Φ^{−1}(G(Y_i; θ̂)),

and the l.m.p.r.t. is based on

(25)  T* = Σ_{i=1}^{n} E Φ^{−1}(F(X_{(i)})) E Φ^{−1}(G(Y_{(r_i)})) = Σ_{i=1}^{n} E(Z_{(i)}) E(Z_{(r_i)}),

where Z_{(i)} is the ith order statistic from the unit normal distribution. Notice that equation (25) divided by Σ_{i=1}^{n} [E(Z_{(i)})]² is the Fisher-Yates correlation coefficient, i.e.,

(26)  r_F = [Σ_{i=1}^{n} E(Z_{(i)}) E(Z_{(r_i)})] / [Σ_{i=1}^{n} (E(Z_{(i)}))²].

The l.m.p.r.t.'s for Moran alternatives can also be obtained from the previous work of Hájek and Šidák (1967, p. 112), who showed that l.m.p.r.t.'s of independence against the BV normal alternatives ρ > 0 can be based on the Fisher-Yates rank correlation coefficient. Since rank tests are distributed independently of F(x) and G(y), any l.m.p.r.t. of independence against BV normal alternatives will also be a l.m.p.r.t. against Moran alternatives. Hence the Fisher-Yates rank correlation coefficient gives a l.m.p.r.t. of independence against the Moran distribution with ρ > 0.

Using the approximate score Φ^{−1}(i/(n + 1)) in place of E(Z_{(i)}) in equation (26) gives the Van der Waerden correlation coefficient

(27)  r_W = [Σ_{i=1}^{n} Φ^{−1}(i/(n + 1)) Φ^{−1}(r_i/(n + 1))] / [Σ_{i=1}^{n} (Φ^{−1}(i/(n + 1)))²].

Hájek and Šidák (1967, p. 112) have shown the asymptotic equivalence of test statistics (26) and (27).
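A direct computation of equation (27) (our sketch; `NormalDist.inv_cdf` from the standard library supplies Φ^{−1}, and the rank vectors are assumed tie-free):

```python
from statistics import NormalDist

def van_der_waerden(rx, ry):
    # r_W of equation (27): normal-scores correlation with the
    # approximate scores phi^{-1}(i/(n+1)).
    n = len(rx)
    inv = NormalDist().inv_cdf
    a = [inv(i / (n + 1)) for i in range(1, n + 1)]
    num = sum(a[rx[i] - 1] * a[ry[i] - 1] for i in range(n))
    den = sum(s * s for s in a)
    return num / den
```

Since the scores are antisymmetric about the median rank, identical rank vectors give r_W = 1 and reversed rank vectors give r_W = −1.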

It is interesting to note a relation between the optimal C(α) tests of independence, i.e., equations (12) and (24), and the corresponding l.m.p.r.t.'s. Replacing F(X_i; θ̂) and G(Y_i; θ̂) by F(X_i) = i/(n + 1) and G(Y_i) = r_i/(n + 1), respectively, in equations (12) and (24) yields statistics which are proportional to the corresponding Spearman and Van der Waerden correlation coefficients given in equations (20) and (27).

In the next section we will show that l.m.p.r.t.'s are asymptotically efficient with respect to the corresponding C(α) tests. Unless F and G have simple forms, the rank tests may be much easier to compute than the corresponding C(α) tests.

5. Asymptotic Relative Efficiency of Non-parametric Tests with Respect to Optimal C(α) Tests

In this section we compute the A.R.E. of tests based on the Spearman correlation coefficient and the Fisher-Yates (normal score) correlation coefficient with respect to the asymptotically optimal C(α) tests of equation (10) for the three distributions.

Let μ_ξ(t_n) and σ²_ξ(t_n) denote the mean and variance of a test statistic t_n. The efficacy of t_n is defined as

(28)  e(t_n) = [∂μ_ξ(t_n)/∂ξ]²_{ξ=ξ_0} / [n σ²_{ξ_0}(t_n)].

The A.R.E. of t_n to s_n is the ratio of their efficacies, that is,

(29)  e(t_n, s_n) = lim_{n→∞} e(t_n) / lim_{n→∞} e(s_n).

For the locally optimal C(α) test, under the assumption that equation (7) is true, we have

[∂μ_ξ(Z_n)/∂ξ]_{ξ=ξ_0} = n^{1/2} σ_0  and  σ²_{ξ_0}(Z_n) = 1,

so that lim_{n→∞} e(Z_n) = σ_0². For the Plackett and Morgenstern BV distributions, we have

lim_{n→∞} e(Z_n) = 1/9,

and for Moran BV distributions, lim_{n→∞} e(Z_n) = 1.

The efficacies of Spearman's correlation coefficient (y_s) for the three BV distributions are calculated as follows. From equation (20), we write y_s as

(30)  y_s = [12n/(n² − 1)] S_n,

where

S_n = Σ_{i=1}^{n} [(i − (n + 1)/2)/n][(r_i − (n + 1)/2)/n].


Computing the efficacy of S_n is sufficient to compute that of y_s. The asymptotic mean of S_n is given as

μ_ξ(S_n) ≈ n ∫∫ (F(x) − 1/2)(G(y) − 1/2) h_ξ(x,y) dx dy,

and the variance under the null hypothesis, given in Hájek and Šidák (1967, p. 114), is

σ²_{ξ_0}(S_n) = (n + 1)²(n − 1)/144n².

For the Plackett and Morgenstern BV distributions, we have

(31)  lim_{n→∞} e(y_s) = lim_{n→∞} e(S_n) = 1/9,

and for the Moran BV distributions we have

(32)  lim_{n→∞} e(y_s) = lim_{n→∞} e(S_n) = 9/π² = 0.91189.

The efficacies of the Van der Waerden correlation coefficient for the three BV distributions are evaluated in the following way. Let

(33)  W_n = (1/n) Σ_{i=1}^{n} Φ^{−1}(i/(n + 1)) Φ^{−1}(r_i/(n + 1)),

which is 1/n times the numerator of the Van der Waerden correlation coefficient and is asymptotically equivalent to the Fisher-Yates correlation coefficient. The asymptotic mean of W_n is found to be

lim_{n→∞} μ_ξ(W_n) = ∫∫ Φ^{−1}(F(x)) Φ^{−1}(G(y)) h_ξ(x,y) dx dy,

and from Hájek and Šidák (1967) we have the null variance

σ²_{ξ_0}(W_n) = [Σ_{i=1}^{n} (Φ^{−1}(i/(n + 1)))²]² / [n²(n − 1)].

The limiting efficacies, combined through equation (29), give the A.R.E.'s in the following table.

A.R.E.'s of the rank tests relative to the optimal C(α) tests

    Statistic                                               Plackett and Morgenstern    Moran

    Spearman correlation coefficient                                   1                 9/π²
    Fisher-Yates (Van der Waerden) correlation coefficient            9/π²                1


6. A.R.E.'s of Two Other Tests of Independence Relative to the C(α) Test for the Plackett Distribution

Plackett (1965) considered the consistent estimator ψ*_1 = ad/bc, where a, b, c and d are defined as the frequencies of pairs (X_i, Y_i) respectively in the quadrants (X < h, Y < k), (X < h, Y > k), (X > h, Y < k) and (X > h, Y > k), where −∞ < h < ∞, −∞ < k < ∞.

Mardia (1967) considered the consistent estimator ψ*_2, which is the solution of the equation r = ρ_0(ψ), where

ρ_0(ψ) = corr (F(X), G(Y)) = (ψ² − 1 − 2ψ ln ψ)/(ψ − 1)²,

and r is the Pearson product moment correlation of (F̂_i, Ĝ_i) = (F(X_i; θ̂), G(Y_i; θ̂)), i = 1, . . ., n.

Mardia also evaluated the efficacies, lim_{n→∞} e(ψ*_1) = 1/16 and lim_{n→∞} e(ψ*_2) = 1/9. Hence, from equation (29), the A.R.E.'s of ψ*_1 and ψ*_2 relative to the asymptotically optimal C(α) tests are respectively 9/16 and 1.
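Mardia's estimator has no closed form, but ρ_0 is monotone increasing in ψ, so ψ*_2 can be found by bisection; a sketch (ours, for illustration; the series value (ψ − 1)/3 is used near ψ = 1 to avoid 0/0):

```python
import math

def rho0(psi):
    # rho0(psi) = (psi^2 - 1 - 2*psi*ln(psi)) / (psi - 1)^2
    if abs(psi - 1.0) < 1e-8:
        return (psi - 1.0) / 3.0   # Taylor expansion about psi = 1
    return (psi * psi - 1.0 - 2.0 * psi * math.log(psi)) / (psi - 1.0) ** 2

def mardia_psi(r, lo=1e-8, hi=1e8):
    # solve rho0(psi) = r for psi > 0 by bisection on the log scale,
    # using the monotonicity of rho0
    for _ in range(200):
        mid = math.sqrt(lo * hi)
        if rho0(mid) < r:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)
```

Plugging the sample grade correlation r into `mardia_psi` gives ψ*_2; r = 0 recovers independence, ψ = 1.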

References

Farlie, D. J. G. (1960). "The performance of some correlation coefficients for a general bivariate distribution." Biometrika, 47, 307-323.

Hájek, Jaroslav, and Šidák, Zbyněk (1967). Theory of Rank Tests. Academic Press.

Kendall, Maurice George (1962). Rank Correlation Methods, 2nd ed. Hafner.

Mardia, K. V. (1967). "Some contributions to contingency-type bivariate distributions." Biometrika, 54, 235-249.

Mardia, K. V. (1970). "A translation family of bivariate distributions and Fréchet's bounds." Sankhyā, Ser. A, 32, 119-122.

Mardia, K. V. (1970). Families of Bivariate Distributions. Hafner Publishing Company.

Moran, P. A. P. (1969). "Statistical inference with bivariate gamma distributions." Biometrika, 56, 627-634.

Moran, P. A. P. (1970). "On asymptotically optimal tests of composite hypotheses." Biometrika, 57, 47-55.

Neyman, Jerzy (1959). "Optimal asymptotic tests of composite statistical hypotheses." In Probability and Statistics, ed. by Ulf Grenander. Wiley.

Plackett, R. L. (1965). "A class of bivariate distributions." J. Amer. Statist. Assoc., 60, 516-522.

Steck, G. P. (1968). "A note on contingency-type bivariate distribution." Biometrika, 55, 262-264.

