
7/28/2019 5 Comparison of Two Groups

Introduction to Bivariate Statistical Analysis:

Comparison of Two Groups


Aims

Difference Hypotheses

Nominal Scales: Difference of Proportions

Ordinal Scales: Wilcoxon Test

NPAR Tests

2010 Poch Bunnak 2



Recall the link between research purposes and statistics


Difference Hypotheses

Examples of difference hypotheses:

There is a difference between men and women with regard to earnings.

Men are more likely than women to migrate.

Migration propensity tends to be higher among men than among women.

Students should learn to develop difference hypotheses: the IV must be binary or categorical.

As practice, replace the variables in the examples above.


Start-up

Statistical tests are much more commonly applied to two-sample comparisons than to one-sample problems.

The two samples compared can be:

Independent samples: whether a member is chosen for inclusion in one sample does not depend on which members are selected in the other.

Groups to be compared can be derived from dividing a larger sample into subgroups (men vs. women, rural vs. urban); such subgroups constitute independent random samples.

Dependent samples: they occur when members of one sample are matched with members of the other (husbands-wives, the same sample at time 1 and time 2).

Know your binary IV and the DV measurement.


Hypothesis Tests for Categorical Data

Categorical Data

Tests of Proportions:

1 Population: Z Test (parametric test) and Binomial Sign Test (see NPAR below)

2 Large Ind. Samples: Z Test

2 Small Ind. Samples: χ² Test (any tables)

2 Small Ind. Samples: Fisher (fe < 5; 2x2)

2 Dep. Samples: McNemar (PxP tables)

Test of Independence: χ² Test



Nominal Scales: Difference of Proportions

For a 2x2 table. Use the data in Table 7.1 as an example.

Ha: Any change in support for the president's performance (measured by % approval) after 2 months in office?

The parameter to be tested is π2 − π1, where π2 is the proportion of approval in Feb. and π1 is the proportion of approval in Jan. Thus, Ha: π2 ≠ π1 and H0: π2 = π1.

Based on the data, the estimate of π2 − π1 is .04. Now, we need to examine if the change is statistically significant.

Use the formula on page 169 to compute the 99% CI and 95% CI.

Use the z formula on page 170 and the formula on page 169 to perform a statistical test and check against the critical z value at α = .01 and at α = .05.


Nominal Scales: Difference of Proportions cont.

99% CI for π2 − π1: .04 ± .044 = (−.004, .084). The interval contains 0, so we cannot be sure that π2 is greater than π1; there is insufficient evidence to conclude that the difference is statistically significant at the α = .01 level.

95% CI for π2 − π1: .04 ± 1.96(se). Is the change statistically significant at the α = .05 level?

z = 2.35. Table A: for z = 2.35, p = .0094 one-sided; p = 2(.0094) = .0188 two-sided.

Thus, there is a statistically significant change (at α = .05) in approval of the president's performance, though not enough evidence to reject H0 at the α = .01 level.

Requirements for interval estimation of the difference between two proportions: n1p1, n1(1−p1), n2p2, and n2(1−p2) ≥ 5
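A minimal sketch of the two-proportion z test and confidence interval described on this slide. The counts below are hypothetical (they are not the Table 7.1 data), the function name `two_proportion_z` is made up for illustration, and the sketch assumes the usual textbook formulas: an unpooled standard error for the CI and a pooled one for the test of H0.

```python
import math
from statistics import NormalDist

def two_proportion_z(x1, n1, x2, n2, conf=0.95):
    """CI and z test for p2 - p1 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    diff = p2 - p1
    # Confidence interval: unpooled standard error
    se_ci = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    ci = (diff - z_crit * se_ci, diff + z_crit * se_ci)
    # Test of H0: p1 = p2 uses the pooled proportion
    pooled = (x1 + x2) / (n1 + n2)
    se0 = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = diff / se0
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, ci, z, p_two_sided

# Hypothetical counts: 580/1000 approve in month 1, 620/1000 in month 2
diff, ci, z, p = two_proportion_z(x1=580, n1=1000, x2=620, n2=1000)

# Two-sided p-value for the z = 2.35 reported on the slide
p_from_slide_z = 2 * (1 - NormalDist().cdf(2.35))
```

The last line reproduces the slide's two-sided p-value from z = 2.35 without needing the underlying table.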


Other Nominal Tests

1. Chi-square Test (See HD 6 for more)

It generalizes the 2-sample z test to use with more than 2 proportions, but is equivalent to the z test when comparing 2 proportions.

Assumptions (Page 209):

2 nominal variables

Random or stratified random sample

2x2 table: fe ≥ 5 in all cells

RxC: fe ≥ 5 in at least 75% of cells and fe ≥ 1 in the remaining cells

χ² statistic = Σ (fo − fe)² / fe

H0: the 2 nominal variables are independent

All χ² test values (p) are one-sided; that is, for tests with sig. level α, compare the statistic to the χ² critical value at (df, 1−α), where df = (r−1)(c−1)

Conclusion: Reject H0 at the α level if p < α
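The statistic can be computed by hand from any RxC table of observed counts. The sketch below uses a hypothetical 2x2 table; the helper name `chi_square` is an illustration, and the p-value shortcut via `math.erfc` is valid only for df = 1 (for other df a χ² table or statistical package is needed).

```python
import math

def chi_square(table):
    """Return (statistic, df, expected) for a table of observed counts."""
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    n = sum(row_tot)
    stat = 0.0
    expected = []
    for i, row in enumerate(table):
        exp_row = []
        for j, fo in enumerate(row):
            fe = row_tot[i] * col_tot[j] / n   # expected count under independence
            exp_row.append(fe)
            stat += (fo - fe) ** 2 / fe
        expected.append(exp_row)
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

# Hypothetical 2x2 table of observed counts
table = [[30, 20],
         [20, 30]]
stat, df, expected = chi_square(table)

# For df = 1 only: P(chi-square > stat) equals P(|Z| > sqrt(stat))
p_value = math.erfc(math.sqrt(stat / 2))
```

Checking `expected` against the fe ≥ 5 rule before trusting the p-value mirrors the assumptions listed on the slide.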


Other Nominal Tests cont.

2. Fisher's Exact Test

Assumptions:

Does not rely on the normality assumption, but uses the exact distribution of the data.

For 2x2 tables only, when fe < 5 (use in place of the χ² test)

H0: there is no relationship between the two variables.

The p-value can then be computed by calculating the number of possible arrangements of observations that produce tables at least as extreme as the observed one and then dividing this by the total number of possible arrangements of the observations.

See SPSS output for understanding

3. McNemar's Test

For comparing proportions from paired data that have PxP (r = c) tables
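To make the "count the arrangements" idea for Fisher's exact test concrete, here is a small sketch for a 2x2 table with cells a, b (first row) and c, d (second row). The table values are hypothetical, and the two-sided rule used here (sum the probabilities of all tables no more likely than the observed one) is one common convention; whether it matches the SPSS output exactly is an assumption.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    r1, r2 = a + b, c + d          # row totals (fixed)
    c1 = a + c                     # first column total (fixed)
    n = r1 + r2

    def prob(x):
        # Hypergeometric probability that the top-left cell equals x
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    # Sum over all tables that are at least as extreme (no more probable)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical small table with expected counts below 5
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

With margins fixed at 4 and 4, there are 70 possible arrangements, of which 34 give tables no more likely than the observed one.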


Interval Scales: Difference of Means, Large Samples

2 studies in:          1965    1985

Mean:                  x̄       ȳ

Standard deviation:    $300    $700

N:                     25      50

Ha: μ1 ≠ μ2 (change in the mean health expenses)

H0: μ1 = μ2 (no change)

z test for large samples (n1 and n2 ≥ 20); t test for small samples (n1 and n2 < 20)

The samples must be independent

The DV is normally distributed

The variances of the DV in the two populations are equal.

Formula on page 172 to compute the CI; formula on page 173 to compute z for testing the hypothesis
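A minimal sketch of the large-sample z test for two independent means, using the standard deviations and sample sizes shown on the slide. The two means are hypothetical placeholders because their values did not survive in this copy, and the unpooled standard error used here is an assumption about the formula on pages 172-173.

```python
import math
from statistics import NormalDist

def two_mean_z(mean1, sd1, n1, mean2, sd2, n2, conf=0.95):
    """Large-sample z test and CI for mean2 - mean1 (independent samples)."""
    diff = mean2 - mean1
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)   # unpooled standard error
    z = diff / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    z_crit = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    ci = (diff - z_crit * se, diff + z_crit * se)
    return diff, se, z, p_two_sided, ci

# SDs and Ns come from the slide; the means 1000 and 1300 are HYPOTHETICAL
diff, se, z, p, ci = two_mean_z(mean1=1000, sd1=300, n1=25,
                                mean2=1300, sd2=700, n2=50)
```

Only `se` depends on slide values alone; `z`, `p`, and `ci` change with whatever means are substituted.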


Interval Scales: Difference of Means for Large Samples (cont.)

See the computation in the Excel file (compare 2 groups)

Verify with SPSS:

Difference of proportions cannot be obtained with SPSS

Difference of means can be obtained using Analyze → Compare Means → Independent-Samples T Test

Note that SPSS treats the pop. parameter as unknown, thus only the t test is available.


Paired differences for dependent samples

This is the case when cases in sample 1 are matched with cases in sample 2.

When comparing the means of the 2 samples (μ2 and μ1), inference about H0 is based on the single sample of the d (difference) distribution.

The data must be restructured or have a match ID var.

H0: μ2 = μ1; Ha: μ2 ≠ μ1. See example 7.5

Use the paired-samples .sav data file: Analyze → Compare Means → Paired-Samples T Test


SPSS T-Test of Paired Differences

Descript. stats for each var.

This is not part of the paired-sample test

T-Test stats for paired samples. Report those encircled.


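How to restructure your data?

Long format (one row per ID and therapy):

ID   Therapy   Score
1    1         60
2    1         70
3    1         80
1    2         80
2    2         95
3    2         95

Wide format (one row per ID):

ID   Score1   Score2
1    60       80
2    70       95
3    80       95

Long to wide: Data → Restructure → Restructure selected cases into variables. ID = identifier and Therapy = index.

Wide to long: Data → Restructure → Restructure selected variables into cases. Select one group. Score1, Score2 = vars to be transposed and pair = fixed var.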
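The same long-to-wide restructuring, followed by the paired-differences t statistic, can be sketched outside SPSS. The rows below are the three ID/Score1/Score2 pairs shown on the slide, written in long form; the variable names are illustrative, and the p-value is left to a t table with the stated df.

```python
import math
from statistics import mean, stdev

# Long format: (ID, Therapy, Score)
long_rows = [
    (1, 1, 60), (2, 1, 70), (3, 1, 80),
    (1, 2, 80), (2, 2, 95), (3, 2, 95),
]

# Restructure cases into variables: one record per ID
wide = {}
for case_id, therapy, score in long_rows:
    wide.setdefault(case_id, {})[f"Score{therapy}"] = score

# Paired differences d = Score2 - Score1, one per matched ID
diffs = [row["Score2"] - row["Score1"] for row in wide.values()]

d_bar = mean(diffs)                            # mean difference
se = stdev(diffs) / math.sqrt(len(diffs))      # standard error of d_bar
t = d_bar / se                                 # paired-samples t statistic
df = len(diffs) - 1
```

Because inference rests on the single sample of differences, the test reduces to a one-sample t test of H0: mean difference = 0.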

Other Implications of Paired-Sample Tests

Checking Reliability:

The paired t-test can be used to check reliability, especially test-retest reliability

Suppose that the result above is based on test-retest data (therapy 1 and 2). You should report:

Are the two means different? Large or small difference?

Report the paired-samples correlation coefficient r: large (≈ 1) or small (≈ 0). The larger the r, the stronger the association and the more reliable is the therapy.

The paired-samples test statistics are of lesser importance.


Ordinal Scales

Independent Samples:

Mann-Whitney U Test (nonparametric, equivalent

to t test): Tests whether two independent samples are from the

same population.

Requires an ordinal level of measurement.

U is the number of times a value in the first group

precedes a value in the second group, when values are

sorted in ascending order.

Is more powerful than the median test since it uses the ranks of the cases.


Ordinal Scales

Dependent Samples:

Sign Test (See NPAR tests below):

Note about the two sign tests:

One is for one population with dichotomous data, and the test is based on the binomial distribution. SPSS: NPAR tests → Binomial.

The other is for paired ordinal data. SPSS: NPAR tests → 2 related samples → check the Sign box. This is called the SIGN TEST.

Wilcoxon Signed-Rank Test (See NPAR tests

below)


Other Ordinal-Data Tests (Not covered)

Kolmogorov-Smirnov Z:

A test of whether two groups come from the same distribution. It is based on the maximum absolute difference between the two cumulative distributions.

Moses Test:

A nonparametric test designed to test hypotheses in which it is expected that the experimental variable will affect some subjects in one direction and other subjects in the opposite direction.

Tests for extreme responses compared to the control group. Requires an ordinal scale of measurement. This test focuses on the span of the control group, and is a measure of how much extreme values in the experimental group influence the span when combined with the control group.

Wald-Wolfowitz runs:

A nonparametric test of the hypothesis that two samples come from the same population.

Requires at least an ordinal scale of measurement. The values of the observations from both samples are combined and ranked from smallest to largest.

Runs are sequences of values from the same group. If the samples are from the same population, the two groups should be randomly scattered throughout the ranking.


Nonparametric Tests (NPT)

NPT can be used with nominal data, ordinal data, or interval/ratio data when no assumption can be made about the pop. prob. distribution. Described below are some NPT for at least numerically ordinal data.

1. Sign Test (One Population = Binomial Test)

The sign test is used to test if there is a difference in preference between two options.

Ex. In a study of rural development, n villagers were asked if they prefer raising pigs or raising fish.

H0: p = .5. The proportion preferring raising pigs is .5 (the same as the proportion preferring raising fish)

Ha: p ≠ .5

If H0 (p = .5) cannot be rejected, then there is no evidence indicating that a difference in preference exists.

Tests of H0: small sample (n ≤ 20) and large sample (n > 20)



Nonparametric Tests (NPT), cont.

1.A. Sign Test for n ≤ 20:

Data: 12 Rs (respondents). 4 prefer raising fish, 8 raising pigs

Steps in conducting the sign test:

a. H0: P(fish) = .5; Ha: P(fish) ≠ .5 (2-tailed).

b. Assign a + sign to those preferring raising fish and a − sign to those preferring the alternative (raising pigs). The number of + signs is used in the calculations to determine if H0 is rejected.

c. Under H0, the number of + signs has a binomial probability distribution.

d. For n = 12, H0 is rejected at α = .05 if n of + signs < 3 or n of + signs > 9


Binomial Probability Distribution (n = 12)

At α = .05 (two-tailed):

Lower end: p < .025; n of +s should be ≤ 2 because the sum of the prob. of 0, 1, and 2 is .0002 + .0029 + .0161 = .0192 < .025

Higher end: n of +s should be ≥ 10 because the sum of the prob. of 10, 11, and 12 is .0161 + .0029 + .0002 = .0192 < .025

[Bar chart: binomial probabilities by number of + signs (0 to 12). Labeled bars: .0002, .0029, .0161, .0537, .1208, .1934, symmetric about 6.]

Thus, H0 is rejected at α = .05 if n of + signs < 3 or n of + signs > 9
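The rejection region for n = 12 can be checked directly from the binomial probabilities. This is a sketch using only the slide's setup (n = 12, p = .5, 4 respondents preferring fish); the doubling rule for the two-sided p-value is an assumption about convention. Note that the exact tail sum is about .0193; the slide's .0192 comes from adding the rounded terms.

```python
from math import comb

def binom_pmf(k, n, p=0.5):
    """P(X = k) for a binomial(n, p) count of + signs."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n = 12
pmf = [binom_pmf(k, n) for k in range(n + 1)]

lower_tail = sum(pmf[:3])     # P(X <= 2)
upper_tail = sum(pmf[10:])    # P(X >= 10)
too_wide = sum(pmf[:4])       # P(X <= 3), exceeds .025 so 3 is not in the region

# Two-sided p-value for the observed 4 + signs (fish): double the smaller tail
observed = 4
p_value = min(1.0, 2 * min(sum(pmf[:observed + 1]), sum(pmf[observed:])))
```

Since 4 lies between 3 and 9, the observed count falls outside the rejection region, and the p-value is well above .05.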




1.B. Sign Test for n > 20:

Use the z distribution, with mean μ = .5n and standard deviation σ = sqrt(.25n)

Suppose the data contain n = 30 responses, split between fish and pigs:

μ = .5*n = .5*30 = 15; σ = sqrt(.25n) = 2.74

The distribution of z is normal, so we reject H0 if z < −1.96 or z > 1.96 at α = .05

Therefore, H0 is not rejected, meaning that there is no evidence of a difference in preference between raising fish and raising pigs.

Use the binomial .sav data: NPAR Tests → Binomial
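The large-sample version is a one-line z computation. In the sketch below, n = 30 matches the slide's arithmetic, but the split of 18 preferring fish is a hypothetical value chosen only to illustrate the calculation.

```python
import math

n = 30
n_plus = 18                      # HYPOTHETICAL number of + signs (prefer fish)

mu = 0.5 * n                     # mean of the + count under H0
sigma = math.sqrt(0.25 * n)      # standard deviation under H0
z = (n_plus - mu) / sigma

reject_h0 = abs(z) > 1.96        # two-sided test at alpha = .05
```

With this hypothetical split z is about 1.10, inside (−1.96, 1.96), so H0 would not be rejected.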

Nonparametric Tests (NPT), cont.

2. Sign Test for Paired Ordinal Data:

It makes very few assumptions about the data, but it is also not very powerful

Used to test the hypothesis that two variables have the same distribution. H0: the median difference is zero.

Only the signs (+ or −) of the differences are needed to evaluate this null hypothesis. The differences between the two variables for all cases are computed and classified as either + or − (ties excluded).

If H0 is true, about half of the differences are positive (> 0) and half are negative (< 0). For n = 8 non-tied pairs, the two-sided p-value is P(X ≤ 1) + P(X ≥ 7), where X = number of positive differences.

Use the binomial distribution table for n = 8, p = 0.5: P-value = P(X=0) + P(X=1) + P(X=7) + P(X=8)

= .0039 + .0313 + .0313 + .0039

= .0704

H0 is not rejected: the median difference of the 2 populations is zero, i.e., the two populations have the same distribution.

SPSS: use binomial sign test paired data.sav → Analyze → Nonparametric Tests → 2 Related Samples → Sign.
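The p-value for the paired sign test with n = 8 can be reproduced from the binomial distribution. The sketch assumes 7 positive and 1 negative difference (the split implied by summing P(0), P(1), P(7), P(8)); the function name is illustrative. The exact value is 18/256 ≈ .0703; .0704 results from adding rounded table entries.

```python
from math import comb

def sign_test_p(n_pos, n_neg):
    """Two-sided sign-test p-value; ties must already be excluded."""
    n = n_pos + n_neg
    k = min(n_pos, n_neg)
    # One tail: P(X <= k) under binomial(n, 0.5)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# Assumed split of the 8 non-tied pairs: 7 positive, 1 negative
p_value = sign_test_p(7, 1)
```

Because the p-value exceeds .05, H0 is not rejected at that level.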



3. Mann-Whitney U Test

Other names: Wilcoxon rank-sum test, Mann-

Whitney-Wilcoxon, Mann-Whitney test

It is used to test if there is a difference between two populations. H0: the 2 pops are identical

Assumptions:

2 independent samples

equal variances


SPSS Result for Mann-Whitney Test

Report the Ranks Table, Z, and Sig.

- Describe the results in the Ranks Table: Therapy 1 gives poorer results than Therapy 2 (mean rank 2.33 vs. 4.67)

- The difference is non-significant at the α = .05 level (z = −1.52, p > .05)




3.A. Mann-Whitney Test for small samples (N ≤ 10): Steps:

i. Rank the combined data from both groups (lowest = 1, highest = n, ties = average)

ii. Split the ranked data by groups and compute the sum of ranks for each group, symbolized by T

iii. Find the possible values of T for one group (H0 group, ex. BT).

For BT of n = 4, min T = 1+2+3+4 = 10 and max T = 6+7+8+9 = 30. Thus, the possible T for BT is (10, 30)

If the 2 pops are identical, the value of T for BT would be near the average of (10+30)/2 = 20.

iv. Use the table of critical values for the Mann-Whitney-Wilcoxon Test to find TL and compute:

TU = n1(n1+n2+1) − TL (TL = 12 for n1 = 4, n2 = 5, and α = .05)

Reject H0 if T < TL or if T > TU
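The ranking and rank-sum steps can be sketched as follows. The scores are hypothetical; only TL = 12 (for n1 = 4, n2 = 5, α = .05) is taken from the slide, and the helper names are illustrative.

```python
def rank_with_ties(values):
    """Average ranks, 1 = smallest; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1          # mean of positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def rank_sum(group1, group2):
    """T = sum of the ranks of group1 within the combined data."""
    ranks = rank_with_ties(list(group1) + list(group2))
    return sum(ranks[:len(group1)])

# HYPOTHETICAL scores for two independent groups
g1 = [12, 15, 11, 18]          # n1 = 4
g2 = [20, 17, 25, 22, 19]      # n2 = 5

n1, n2 = len(g1), len(g2)
T = rank_sum(g1, g2)
T_L = 12                        # table value quoted on the slide
T_U = n1 * (n1 + n2 + 1) - T_L
reject_h0 = T < T_L or T > T_U
```

Here T falls just below TL, so these made-up data would lead to rejecting H0 at α = .05.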


3.B. Mann-Whitney Test for large samples (N > 10):

Follow Steps i-iii above

iv. Since n is large, the sampling distribution of T is normal. Compute:

Mean: μT = (1/2)[n1(n1+n2+1)]

Standard deviation: σT = sqrt[(1/12) n1 n2 (n1+n2+1)]

z = (T − μT) / σT

Reject H0 at α = .05 if z < −1.96 or z > 1.96.

SPSS: Analyze → Nonparametric Tests → 2 Independent Samples
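A short sketch of the normal approximation for the rank sum T, using the standard mean n1(n1+n2+1)/2 and standard deviation sqrt(n1 n2 (n1+n2+1)/12). The sample sizes and T below are hypothetical, and no tie correction is applied, which is an assumption.

```python
import math

def rank_sum_z(T, n1, n2):
    """z score of the rank sum T for the group of size n1 (no tie correction)."""
    mu_T = n1 * (n1 + n2 + 1) / 2
    sigma_T = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return (T - mu_T) / sigma_T

# HYPOTHETICAL large-sample example
n1, n2, T = 12, 15, 130
z = rank_sum_z(T, n1, n2)
reject_h0 = abs(z) > 1.96       # two-sided test at alpha = .05
```

For these values μT = 168 and σT = sqrt(420), giving z of about −1.85, which is not beyond ±1.96.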



4. Wilcoxon signed-rank test (= Wilcoxon Test in SPSS)

Assumptions:

2 dependent samples as in the sign test, but it is better than the sign test because it compares the signs and the rank magnitude of the differences.

DV (the pairs' scores) can be ordinal or interval

The difference between pairs of scores is ordinally scaled (difficult to test)

The signed-rank test compares the sum of the ranks of the positive differences (R1) to that of the negative differences (R2).

H0: the median difference is zero (these rank sums are about equal).

Compute the p-value to reject or not reject H0:

If there are fewer than 16 non-zero differences, use the Rosner Table

If there are 16 or more non-zero differences, use the z (normal) score. This is what you get when running Wilcoxon signed-rank data in SPSS.

Use Wilcoxon signed test for paired data.sav: Analyze → Nonparametric Tests → Two-Related-Samples Test → select the pair variables → ensure that Wilcoxon is checked

In the output, note the +, −, = ranks and test statistics. Those should be reported and interpreted, in addition to the mean rank for each paired rank.
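The two rank sums that the signed-rank test compares can be computed as in this sketch. The paired scores are hypothetical and the function names are illustrative; zero differences are dropped and tied absolute differences receive average ranks, which follows the usual convention.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); ties get the average of their ranks."""
    return [sum(v < x for v in values) + (sum(v == x for v in values) + 1) / 2
            for x in values]

def signed_rank_sums(before, after):
    """Return (R_pos, R_neg, n_nonzero) for paired scores."""
    diffs = [a - b for b, a in zip(before, after) if a != b]   # drop zeros
    ranks = average_ranks([abs(d) for d in diffs])             # rank |d|
    r_pos = sum(r for r, d in zip(ranks, diffs) if d > 0)
    r_neg = sum(r for r, d in zip(ranks, diffs) if d < 0)
    return r_pos, r_neg, len(diffs)

# HYPOTHETICAL paired scores (same subjects measured twice)
before = [60, 70, 80, 75, 65, 90]
after = [80, 95, 95, 70, 65, 100]
r_pos, r_neg, n_nonzero = signed_rank_sums(before, after)
```

Under H0 the two sums would be about equal; here the positive ranks dominate, and with fewer than 16 non-zero differences the p-value would come from the small-sample table rather than the z score.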


Summary

Independent samples

Normality assumptions

No Normality assumption

Categorical DV

Ordinal DV: Mann-Whitney U

Interval DV


Writing the SPSS Outputs


Writing the SPSS Outputs: Your Table
