Research Methodology: Tools - schwarz & partners 11_EN_201… · Wilcoxon rank-sum test...

Research Methodology: Tools

Applied Data Analysis (with SPSS)

Lecture 11: Nonparametric Methods

May 2014

Prof. Dr. Jürg Schwarz

Lic. phil. Heidi Bruderer Enzler

MSc Business Administration

Slide 2

Contents

Aims of the Lecture (Slide 3)

Typical Syntax (Slide 4)

Introduction (Slide 5)
◦ Example (Slide 5)

Parametric vs. Nonparametric Tests (Slide 8)
◦ Basic Ideas (Slide 8)
◦ Unrelated and Related Samples (Slide 10)
◦ Decision Tree of Nonparametric Tests (a Selection) (Slide 12)

Nonparametric Tests with SPSS (Slide 13)
◦ Comparison of Two Unrelated Groups (Slide 14)
◦ Comparison of Two Related Groups (Slide 19)
◦ Comparison of Several Unrelated Groups (Slide 24)
◦ Comparison of Several Related Groups (Slide 29)

Appendix (Slide 34)

Slide 3

Aims of the Lecture

You will understand the concept of ranking data.

You will understand different concepts of nonparametric tests.

You will understand the key steps in conducting the following nonparametric tests:

◦ Wilcoxon rank-sum test

◦ Kruskal-Wallis test

◦ Wilcoxon signed-rank test

◦ Friedman test

You can apply nonparametric methods with SPSS.

In particular, you will know how to ...

◦ interpret the output

◦ describe the output

Slide 4

Typical Syntax

Wilcoxon rank-sum test (Mann-Whitney U test)

NPAR TESTS
/M-W= values BY sample(1 2)
/STATISTICS=DESCRIPTIVES
/MISSING ANALYSIS.

◦ values: dependent variable
◦ sample: group variable
◦ /STATISTICS=DESCRIPTIVES: descriptive statistics

Wilcoxon signed-rank test

NPAR TESTS
/WILCOXON=measure1 WITH measure2 (PAIRED)
/STATISTICS DESCRIPTIVES
/MISSING ANALYSIS.

◦ measure1, measure2: dependent variables
◦ /STATISTICS DESCRIPTIVES: descriptive statistics

Slide 5

Introduction

Example

Medical research: correlation analysis of babies' weight at birth and the increase from day 70 to day 100, n = 20

[Figure: scatter plot of increase in weight [gram] (y-axis, 0 to 4'000) against weight at birth [gram] (x-axis, 2'000 to 4'000), with the annotation "leverage effect"]

Problem: Leverage effect due to premature babies with weight at birth of less than 3'000 grams

Average normal weight at birth is 3'400 grams

Correlation coefficient r = -.76 (Variation of r: -1 ≤ r ≤ 1)

The higher the weight at birth, the lower the increase from day 70 to day 100.

Slide 6

Solution

Take into account ranks

Baby   Weight at birth [gram]   Rank   Increase in weight [gram]   Rank
1      2'740                     5     2'550                       17
2      3'180                     8     1'790                        9
3      3'150                     7     1'870                       10
4      3'030                     6     2'040                       13
5      3'370                    12     1'470                        6
6      2'610                     4     2'130                       14
7      3'570                    16     2'150                       15
8      2'270                     1     3'350                       19
9      2'300                     2     3'400                       20
10     2'380                     3     3'230                       18
11     3'260                     9       820                        2
12     3'350                    11     1'190                        3
13     3'630                    19     1'360                        4
14     3'640                    20     1'420                        5
15     3'490                    14     1'960                       11
16     3'290                    10     1'670                        7
17     3'540                    15       770                        1
18     3'570                    17     1'700                        8
19     3'460                    13     2'010                       12
20     3'570                    18     2'490                       16

"How-to" in SPSS

Scales

Variables: ordinal or higher measurement level

SPSS

Analyze → Correlate → Bivariate... → "Spearman" (Spearman rank correlation coefficient)

Lowest weight at birth = 2'270 → Rank 1

Following weight at birth = 2'300 → Rank 2

etc.

Lowest increase in weight = 770 → Rank 1

Following increase in weight = 820 → Rank 2

etc.

(SPSS will allocate the ranks for you!)

Slide 7

Results

[Figure: scatter plot of the rank of increase in weight (y-axis, 0 to 20) against the rank of weight at birth (x-axis, 0 to 20)]

Leverage effect due to premature babies has been reduced.

Spearman rank correlation coefficient r = -.56 (compare with r = -.76 from above)
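The coefficient r = -.56 can be checked by hand from the two rank columns of the table on Slide 6. The following is a minimal Python sketch (Python is not part of the lecture, and the variable names are illustrative). It assumes the classical formula r_s = 1 – 6 ⋅ Σd² / (n ⋅ (n² – 1)), which is exact when all ranks within a column are distinct, as they are in the table.

```python
# Ranks copied from the table on Slide 6 (baby 1 to baby 20).
rank_birth_weight = [5, 8, 7, 6, 12, 4, 16, 1, 2, 3,
                     9, 11, 19, 20, 14, 10, 15, 17, 13, 18]
rank_weight_gain = [17, 9, 10, 13, 6, 14, 15, 19, 20, 18,
                    2, 3, 4, 5, 11, 7, 1, 8, 12, 16]

n = len(rank_birth_weight)

# Sum of squared rank differences d = rank(weight at birth) - rank(increase).
sum_d_squared = sum((a - b) ** 2
                    for a, b in zip(rank_birth_weight, rank_weight_gain))

# Spearman rank correlation via the squared-difference formula.
r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

print(sum_d_squared, round(r_s, 2))  # 2076 -0.56
```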

Slide 8

Parametric vs. Nonparametric Tests

Basic Ideas

All of the tests previously used in this course are based upon specific assumptions.

Especially assumptions about the distribution of variables in the population:

◦ normality (variable or error term is normally distributed)

◦ homogeneity of variance (variance of different groups/units is the same)

These tests are referred to as "parametric tests" because the shapes of the population distributions are described with known distributions and their parameters.

Example: Variable X is normally distributed → X ~ N(µ, σ²) with parameters µ and σ².

Tests that do not make such assumptions are referred to as nonparametric tests.

Nonparametric tests are sometimes known as distribution-free tests because they make no assumptions about the specific shape of the population distribution.

Nonparametric tests are used when the parametric assumptions are invalid.

Slide 9

Methods for nonparametric tests

There are 2 main methods for nonparametric tests:

◦ Resampling
   - Permutation (Example: Fisher exact test)
   - Simulation (Example: Bootstrapping)

◦ Ranking

All nonparametric tests presented in this lecture work on the principle of ranking the data:

→ High scores are represented by high ranks, low scores by low ranks.

The analysis is then carried out on the ranks rather than on the original data.

Advantages

Nonparametric tests can be used

◦ when nothing is known about the distribution in the population or when parametric assumptions are invalid

◦ with outliers

◦ with small samples

Disadvantage

By ranking the data, information about the magnitude of differences between scores is lost:

→ Nonparametric tests have less power than the corresponding parametric tests at the same sample size, provided the parametric assumptions hold.

They are therefore more likely to miss an effect that actually exists (β error).
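The ranking principle, together with its advantage (robustness against outliers) and its disadvantage (loss of magnitude information), can be illustrated with a short Python sketch. Python is not part of the lecture; the helper `average_ranks` is a hypothetical name, and the two outlier series are made-up values used only for illustration. Tied values are assumed to share the average of their ranks.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value equals the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


# Reaction times of proband 3 from the sleeping pill example (Slide 29).
proband_3 = [28, 28, 14, 30]
print(average_ranks(proband_3))  # [2.5, 2.5, 1.0, 4.0]

# Hypothetical values: an extreme outlier does not change the ranks,
# but the size of the difference to the other scores is lost.
without_outlier = [11, 12, 15, 18]
with_outlier = [11, 12, 15, 480]
same_ranks = average_ranks(without_outlier) == average_ranks(with_outlier)
print(same_ranks)  # True
```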

Slide 10

Unrelated and Related Samples

Unrelated (independent) samples

The measurement values of a person in sample 1 and of a person in sample 2 are not influenced by one another.

Related (paired, dependent) samples

Each measurement value in sample 1 is influenced by a particular measurement value in sample 2 (and vice versa).

When multiple measurements are applied to the same subject

◦ to examine a development over time (repeated measures)

◦ to compare different treatments

When different persons are tested

◦ who belong together ("natural pairings", e.g. couples)

◦ who are matched to reduce the effects of a confounding variable (e.g. matching persons with a comparable level of empathy)

[Figure: schematic examples of the designs. Labels in the figure: Husband, Wife, Item 1; Anger management training (A, B, C; matched by level of intelligence); Treatment (A, B, C); Clinic 1, Clinic 2; Stamina; Day 1, Day 14, Day 70; Placebo, Treatment (A, B, C); Creativity]

Slide 11

SPSS examples for unrelated and for related samples

Unrelated: Bone density (Slide 14)

◦ Sample A: women up to 50
◦ Sample B: women over 50

Related: Sleeping pills (Slide 29)

◦ Sample: reaction time of proband 3 after taking four different sleeping pills

Slide 12

Decision Tree of Nonparametric Tests (a Selection)

Samples                   Distribution of dependent variable   Test
one group                 normal                               t-Test
one group                 any                                  Sign test
two groups, related       normal                               t-Test
two groups, related       any                                  WSR test
two groups, unrelated     normal                               t-Test
two groups, unrelated     any                                  WRS test
many groups, related      normal                               Repeated ANOVA
many groups, related      any                                  Friedman
many groups, unrelated    normal                               ANOVA
many groups, unrelated    any                                  Kruskal-Wallis

WRS: Wilcoxon rank-sum test   WSR: Wilcoxon signed-rank test

Slide 13

Nonparametric Tests with SPSS

SPSS: Analyze → Nonparametric Tests → Legacy Dialogs →

"2 Independent Samples..." → WRS: Wilcoxon rank-sum test (Mann-Whitney U test)

"K Independent Samples..." → Kruskal-Wallis test

"2 Related Samples..." → WSR: Wilcoxon signed-rank test

"K Related Samples..." → Friedman test

Slide 14

Comparison of Two Unrelated Groups

Wilcoxon rank-sum test (Mann-Whitney U test)

Given

Two independent (unrelated) samples with sample sizes n1, n2 (in general n1 ≠ n2)

Small sample size

Not normally distributed

Question

Do the central tendencies µA and µB of a characteristic differ between two unpaired samples?

H0: Central tendencies are equal µA = µB

HA: Central tendencies are not equal µA ≠ µB

Example

Medical research about osteoporosis: bone density in g/cm³

◦ Sample A: women up to and including age 50, n = 13

◦ Sample B: women above age 50, n = 11

sample A 163 152 202 105 134 134 139 110 122 146 149 94 158

sample B 125 121 133 95 148 96 117 112 100 84 98

Slide 15

Procedure

1. Sort the values of the two samples by size

values

sample A 94 105 110 122 134 134 139 146 149 152 158 163 202

sample B 84 95 96 98 100 112 117 121 125 133 148

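2. Allocate ranks to the values of both samples combined (tied values share the average rank: 134 occurs twice and receives rank 15.5 each time)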

ranks

ranks A 2 7 8 12 15.5 15.5 17 18 20 21 22 23 24

ranks B 1 3 4 5 6 9 10 11 13 14 19

3. Calculate the sum of ranks

rank sum

rank sum A 205

rank sum B 95

4. Calculate the test statistic based on the sample with the smaller rank sum

Test statistic: U = Rs – ns ⋅ (ns + 1) / 2

Rs = rank sum of the sample s with the smaller rank sum, ns = sample size of that sample

Here sample B has the smaller rank sum:

U = rank sumB – nB ⋅ (nB + 1) / 2 = 95 – 11 ⋅ (11 + 1) / 2 = 29

Slide 16

5. Determine the critical value in the table below

Distribution of Wilcoxon rank-sum test (Mann-Whitney U test) – two sided, α = 5%

n1 \ n2    9   10   11   12   13   14   15   16   17   18   19   20
8         15   17   19   22   24   26   29   31   34   36   38   41
9         17   20   23   26   28   31   34   37   39   42   45   48
10        20   23   26   29   33   36   39   42   45   48   52   55
11        23   26   30   33   37   40   44   47   51   55   58   62
12        26   29   33   37   41   45   49   53   57   61   65   69

→ The critical value is 37.

6. Compare the test statistic with the critical value

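The test statistic U = 29 is smaller than the critical value 37 and therefore lies in the rejection region of H0:

The bone densities of samples A and B are significantly different.

[Figure: axis with the critical values -37 and 37; the region between them is marked "rejection of H0", the regions outside are marked "non-rejection of H0"; the test statistic 29 lies inside the rejection region]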
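Steps 1 to 6 of the procedure can be retraced with a minimal Python sketch. Python is not part of the lecture; `average_ranks` is a hypothetical helper, and the decision rule "reject H0 if U is at most the critical value" is an assumption that is consistent with the comparison of 29 and 37 on this slide.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


# Bone density data from Slide 14.
sample_a = [163, 152, 202, 105, 134, 134, 139, 110, 122, 146, 149, 94, 158]
sample_b = [125, 121, 133, 95, 148, 96, 117, 112, 100, 84, 98]

# Steps 1-3: rank the pooled values and sum the ranks per sample.
ranks = average_ranks(sample_a + sample_b)
rank_sum_a = sum(ranks[:len(sample_a)])
rank_sum_b = sum(ranks[len(sample_a):])

# Step 4: U is based on the sample with the smaller rank sum.
if rank_sum_a <= rank_sum_b:
    rank_sum_s, n_s = rank_sum_a, len(sample_a)
else:
    rank_sum_s, n_s = rank_sum_b, len(sample_b)
u = rank_sum_s - n_s * (n_s + 1) / 2

# Steps 5-6: critical value read from the table (sample sizes 11 and 13).
critical_value = 37
reject_h0 = u <= critical_value  # assumed decision rule

print(rank_sum_a, rank_sum_b, u, reject_h0)  # 205.0 95.0 29.0 True
```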

Slide 17

Wilcoxon rank-sum test (Mann-Whitney U test) with SPSS

SPSS: Analyze → Nonparametric Tests → Legacy Dialogs → 2 Independent Samples...

Values of the grouping variable: Consult value labels or a frequency table

NPAR TESTS

/M-W= values BY sample(1 2)

/STATISTICS=DESCRIPTIVES

/MISSING ANALYSIS.

Slide 18

Here sample size n ≤ 30, therefore:

The bone densities of samples A and B are significantly different (Exact Wilcoxon rank-sum test: U = 29, p = .013).

If sample size n > 30:

The bone densities of samples A and B are significantly different (Asymptotic Wilcoxon rank-sum test: Z = -2.463, p = .014).

"Asymp. Sig. (2-tailed)" is based on an approximation to a normal distribution.

For samples with n > 30 use "Asymp. Sig."

Annotations to the SPSS output:

◦ Test statistic U
◦ Rank sum of smaller sample
◦ Z-value calculated for the asymptotic test
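The asymptotic values Z = -2.463 and p = .014 can be reproduced by hand. The following is a minimal Python sketch (not part of the lecture; variable names are illustrative). It assumes the usual normal approximation of U with the standard correction for ties; that SPSS applies exactly this correction is an assumption, although the result agrees with the reported output.

```python
import math

n_a, n_b = 13, 11   # sample sizes from Slide 14
u = 29              # test statistic from the hand calculation
tie_sizes = [2]     # the value 134 occurs twice in the pooled data

n_total = n_a + n_b

# Mean and tie-corrected variance of U under H0.
mean_u = n_a * n_b / 2
tie_term = sum(t ** 3 - t for t in tie_sizes) / (n_total * (n_total - 1))
var_u = n_a * n_b / 12 * ((n_total + 1) - tie_term)

# z-standardization and two-sided p-value from the standard normal distribution.
z = (u - mean_u) / math.sqrt(var_u)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 3), round(p_two_sided, 3))  # -2.463 0.014
```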

Slide 19

Comparison of Two Related Groups

Wilcoxon signed-rank test

Given

Two related samples (sometimes called paired samples)

Small sample sizes


Question

Is there a difference between the central tendencies µA and µB of a characteristic in two related samples?

H0: central tendencies are equal µA = µB

HA: central tendencies are not equal µA ≠ µB

Example

Medical research about osteoporosis: bone density in g/cm³

Sample: 10 women

◦ Measure 1: bone density before exercise therapy

◦ Measure 2: bone density after exercise therapy

Slide 20

Procedure

1. Calculate differences in values of the two related data points

|difference| = |measure 2 – measure 1|

2. Write down the sign of the difference

Sign of "measure 2 – measure 1"

3. Assign ranks to the absolute differences

Differences with a value of 0 will not be considered.

4. Sum up positive ranks and negative ranks

woman   measure 1   measure 2   |difference|   sign   rank   positive ranks   negative ranks
1       202         133         69             -       9                      9
2       163         125         38             -       7                      7
3        94         128         34             +       6     6
4       152         121         31             -       5                      5
5       134         148         14             +       2     2
6       139         117         22             -       3.5                    3.5
7       110         112          2             +       1     1
8       122         100         22             -       3.5                    3.5
9       158          85         73             -      10                      10
10      146          84         62             -       8                      8
sum                                                   55     9                46
step                            1.             2.     3.     4.               4.

Slide 21

5. Calculate the test statistic

Test statistic W = |positive rank sum – negative rank sum | = |9 – 46| = 37


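6. Determine the critical value in the table below

Distribution of Wilcoxon signed-rank test:

Number of differences not equal to 0   one sided T0.95   two sided T0.975
5                                      15                na
6                                      17                21
7                                      22                24
8                                      26                30
9                                      29                35
10                                     35                39
11                                     40                46
12                                     44                52

→ With 10 differences not equal to 0, the two-sided critical value is 39.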

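7. Compare the value of the test statistic with the critical value

The test statistic W = 37 is smaller than the critical value 39 and therefore lies inside the non-rejection region of H0.

Test (narrowly) not significant → A change in bone density through exercise therapy cannot be demonstrated.

[Figure: axis starting at 0 with the critical value 39; values below 39 are marked "non-rejection of H0", values above are marked "rejection of H0"; the test statistic 37 lies in the non-rejection region]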
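The seven steps can be retraced with a minimal Python sketch. Python is not part of the lecture; `average_ranks` is a hypothetical helper, and the decision rule "reject H0 if W reaches the critical value" is an assumption that matches the figure on this slide.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


# Bone density before (measure 1) and after (measure 2) exercise therapy.
measure_1 = [202, 163, 94, 152, 134, 139, 110, 122, 158, 146]
measure_2 = [133, 125, 128, 121, 148, 117, 112, 100, 85, 84]

# Steps 1-2: signed differences "measure 2 - measure 1"; zeros are dropped.
differences = [after - before for before, after in zip(measure_1, measure_2)]
differences = [d for d in differences if d != 0]

# Step 3: rank the absolute differences.
ranks = average_ranks([abs(d) for d in differences])

# Step 4: sum the ranks of positive and of negative differences.
positive_rank_sum = sum(r for r, d in zip(ranks, differences) if d > 0)
negative_rank_sum = sum(r for r, d in zip(ranks, differences) if d < 0)

# Step 5: test statistic as defined on this slide.
w = abs(positive_rank_sum - negative_rank_sum)

# Steps 6-7: two-sided critical value for 10 non-zero differences.
critical_value = 39
reject_h0 = w >= critical_value  # assumed decision rule

print(positive_rank_sum, negative_rank_sum, w, reject_h0)  # 9.0 46.0 37.0 False
```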

Slide 22

Wilcoxon signed-rank test with SPSS

SPSS: Analyze → Nonparametric Tests → Legacy Dialogs → 2 Related Samples...

NPAR TESTS

/WILCOXON=measure1 WITH measure2 (PAIRED)

/STATISTICS DESCRIPTIVES

/MISSING ANALYSIS.

Slide 23

Test (narrowly) not significant.

→ A change in bone density through therapy cannot be demonstrated (Wilcoxon signed-rank test: Z = -1.887, p = .059).

As the rank sums approximately follow a normal distribution, the smaller rank sum is z-standardized.
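The z-standardization of the smaller rank sum can be sketched in a few lines of Python (not part of the lecture; variable names are illustrative). The sketch assumes the usual normal approximation of the signed-rank sum with the standard correction for ties; that SPSS computes Z in exactly this way is an assumption, although the result agrees with the reported values.

```python
import math

n = 10                 # number of differences not equal to 0
smaller_rank_sum = 9   # sum of positive ranks from Slide 20
tie_sizes = [2]        # the absolute difference 22 occurs twice

# Mean and tie-corrected variance of the rank sum under H0.
mean_t = n * (n + 1) / 4
var_t = n * (n + 1) * (2 * n + 1) / 24 - sum(t ** 3 - t for t in tie_sizes) / 48

# z-standardization and two-sided p-value from the standard normal distribution.
z = (smaller_rank_sum - mean_t) / math.sqrt(var_t)
p_two_sided = math.erfc(abs(z) / math.sqrt(2))

print(round(z, 3), round(p_two_sided, 3))  # -1.887 0.059
```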

Slide 24

Comparison of Several Unrelated Groups

Kruskal-Wallis test

Given

Many independent (unrelated) samples with sample sizes n1, ... nk (in general ni ≠ nj, for i ≠ j)

Small sample sizes


Question

Is there a difference between the central tendencies?

H0: central tendencies are equal

HA: at least two of the central tendencies are not equal

Example

Test of 3 different sleeping pills (drug1, drug2, drug3) on sleep duration (measured in hours).

The sleeping pills are used in three random samples (n1 = 3, n2 = 4, n3 = 5)

Sleep duration [hours]

drug1   6.2 6.9 5.1
drug2   7.1 6.2 6.2 7.9
drug3   8.4 8.8 8.6 8.2 7.2

Slide 25

Procedure

1. All values of the sample are sorted according to size

sleep duration

drug 1 5.1 6.2 6.9

drug 2 6.2 6.2 7.1 7.9

drug 3 7.2 8.2 8.4 8.6 8.8

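2. Assign ranks to the values of all samples combined and calculate the rank sum for every sample (the value 6.2 occurs three times and receives the average rank 3 each time)

drug 1: ranks 1, 3, 5; rank sum 9

drug 2: ranks 3, 3, 6, 8; rank sum 20

drug 3: ranks 7, 9, 10, 11, 12; rank sum 49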

3. Calculate the test statistic K

\[
K = \frac{12}{N(N+1)} \sum_{j=1}^{k} \frac{R_j^2}{n_j} - 3(N+1)
  = \frac{12}{12(12+1)} \left( \frac{9^2}{3} + \frac{20^2}{4} + \frac{49^2}{5} \right) - 3(12+1) = 7.71
\]

Constant = 12, applies to all numbers of samples and for all sample sizes

N = n_1 + n_2 + ... + n_k

R_j^2 = squared rank sum of sample j

k = number of samples

n_j = sample size of sample j

Slide 26


The test statistic K approximately follows a χ²-distribution with degrees of freedom ν = k – 1 = 3 – 1 = 2

df \ 1 – α 0.700 0.750 0.800 0.850 0.900 0.950 0.975 0.990 0.995

1 1.07 1.32 1.64 2.07 2.71 3.84 5.02 6.63 7.88

2 2.41 2.77 3.22 3.79 4.61 5.99 7.38 9.21 10.60

3 3.66 4.11 4.64 5.32 6.25 7.81 9.35 11.34 12.84

4 4.88 5.39 5.99 6.74 7.78 9.49 11.14 13.28 14.86

5 6.06 6.63 7.29 8.12 9.24 11.07 12.83 15.09 16.75

6 7.23 7.84 8.56 9.45 10.64 12.59 14.45 16.81 18.55

7 8.38 9.04 9.80 10.75 12.02 14.07 16.01 18.48 20.28

8 9.52 10.22 11.03 12.03 13.36 15.51 17.53 20.09 21.95

9 10.66 11.39 12.24 13.29 14.68 16.92 19.02 21.67 23.59

10 11.78 12.55 13.44 14.53 15.99 18.31 20.48 23.21 25.19

11 12.90 13.70 14.63 15.77 17.28 19.68 21.92 24.73 26.76

12 14.01 14.85 15.81 16.99 18.55 21.03 23.34 26.22 28.30

13 15.12 15.98 16.98 18.20 19.81 22.36 24.74 27.69 29.82

14 16.22 17.12 18.15 19.41 21.06 23.68 26.12 29.14 31.32

15 17.32 18.25 19.31 20.60 22.31 25.00 27.49 30.58 32.80

Critical value for α = 5%: χ²95% = 5.99

(χ²95% = 5.99) < (K = 7.71)

The null hypothesis is rejected. The rank sums differ significantly.

The sleeping pills have different effects on sleep duration.
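The calculation of K and the comparison with the critical value can be retraced with a minimal Python sketch. Python is not part of the lecture; `average_ranks` is a hypothetical helper. The sketch follows the formula on Slide 25 and therefore applies no correction for ties.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


# Sleep duration in hours from Slide 24.
groups = {
    "drug1": [6.2, 6.9, 5.1],
    "drug2": [7.1, 6.2, 6.2, 7.9],
    "drug3": [8.4, 8.8, 8.6, 8.2, 7.2],
}

# Rank all values together, then sum the ranks within each sample.
pooled = [value for values in groups.values() for value in values]
ranks = average_ranks(pooled)
n_total = len(pooled)

rank_sums = {}
start = 0
for name, values in groups.items():
    rank_sums[name] = sum(ranks[start:start + len(values)])
    start += len(values)

# Test statistic K as defined on Slide 25 (without tie correction).
k_stat = (12 / (n_total * (n_total + 1))
          * sum(rank_sums[name] ** 2 / len(groups[name]) for name in groups)
          - 3 * (n_total + 1))

critical_value = 5.99  # chi-square table, df = 2, alpha = 5%
reject_h0 = k_stat > critical_value

print(rank_sums, round(k_stat, 2), reject_h0)
```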

Slide 27

Kruskal-Wallis test with SPSS

SPSS: Analyze → Nonparametric Tests → Legacy Dialogs → K Independent Samples...

NPAR TESTS

/K-W=duration BY drug(1 3)


/MISSING ANALYSIS.

Slide 28

The sleeping pills have significantly different effects on sleep duration

(Kruskal-Wallis test: χ² = 7.708, df = 2, p = .021).

Compare "Chi-Square" with test statistic K = 7.71
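The reported p-value can be checked without a table in this special case. For df = 2 (and only for df = 2) the upper tail probability of the χ²-distribution has the closed form exp(–x / 2). A minimal Python sketch, not part of the lecture:

```python
import math

chi_square = 7.708  # "Chi-Square" value from the SPSS output

# Upper tail probability of the chi-square distribution with df = 2.
p_value = math.exp(-chi_square / 2)

print(round(p_value, 3))  # 0.021
```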

Slide 29

Comparison of Several Related Groups

Friedman test

Given

Related samples (repeated measures design)

Small sample size


Question

Is there a difference between the central tendencies?

H0: central tendencies are equal

HA: at least two of the central tendencies are not equal

Example

The effect on reaction time of 4 different sleeping pills (drug1, ...) is measured (in milliseconds)

proband drug1 drug2 drug3 drug4

1 30 28 16 34

2 14 18 10 22

3 28 28 14 30

4 24 20 18 30

5 38 34 20 44

Example: Reaction time of proband 3 dependent on type of sleeping pill (drug1, ... drug4)

Slide 30

Procedure

1. Within each person, assign ranks to the treatments


proband   drug1   drug2   drug3   drug4
1         3       2       1       4
2         2       3       1       4
3         2.5     2.5     1       4
4         3       2       1       4
5         3       2       1       4

If several values of a person are tied, each of them receives the average of the ranks concerned (proband 3: drug1 and drug2 share rank 2.5).

2. Calculate the rank sum Rj of each column (= of each treatment)


proband   drug1   drug2   drug3   drug4
1         3       2       1       4
2         2       3       1       4
3         2.5     2.5     1       4
4         3       2       1       4
5         3       2       1       4
Rj        13.5    11.5    5.0     20.0

3. Calculate the test statistic V

\[
V = \frac{12}{n k (k+1)} \sum_{j=1}^{k} R_j^2 - 3n(k+1)
  = \frac{12}{5 \cdot 4 (4+1)} \left( 13.5^2 + 11.5^2 + 5.0^2 + 20.0^2 \right) - 3 \cdot 5 (4+1) = 13.74
\]

Constant = 12

k = number of treatment levels = 4 treatment levels

n = number of probands = 5 probands

Example: ranked reaction time (ranks 1 to 4) of proband 3

Slide 31


The test statistic V approximately follows a χ²-distribution with degrees of freedom ν = k – 1 = 4 – 1 = 3

df \ 1 – α 0.700 0.750 0.800 0.850 0.900 0.950 0.975 0.990 0.995

1 1.07 1.32 1.64 2.07 2.71 3.84 5.02 6.63 7.88

2 2.41 2.77 3.22 3.79 4.61 5.99 7.38 9.21 10.60

3 3.66 4.11 4.64 5.32 6.25 7.81 9.35 11.34 12.84

4 4.88 5.39 5.99 6.74 7.78 9.49 11.14 13.28 14.86

5 6.06 6.63 7.29 8.12 9.24 11.07 12.83 15.09 16.75

6 7.23 7.84 8.56 9.45 10.64 12.59 14.45 16.81 18.55

7 8.38 9.04 9.80 10.75 12.02 14.07 16.01 18.48 20.28

8 9.52 10.22 11.03 12.03 13.36 15.51 17.53 20.09 21.95

9 10.66 11.39 12.24 13.29 14.68 16.92 19.02 21.67 23.59

10 11.78 12.55 13.44 14.53 15.99 18.31 20.48 23.21 25.19

11 12.90 13.70 14.63 15.77 17.28 19.68 21.92 24.73 26.76

12 14.01 14.85 15.81 16.99 18.55 21.03 23.34 26.22 28.30

13 15.12 15.98 16.98 18.20 19.81 22.36 24.74 27.69 29.82

14 16.22 17.12 18.15 19.41 21.06 23.68 26.12 29.14 31.32

15 17.32 18.25 19.31 20.60 22.31 25.00 27.49 30.58 32.80

Critical value for α = 5%: χ²95% = 7.81

(χ²95% = 7.81) < (V = 13.74)

The null hypothesis is rejected. The rank sums differ significantly.

The sleeping pills cause significantly different reaction times.
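The calculation of V and the comparison with the critical value can be retraced with a minimal Python sketch. Python is not part of the lecture; `average_ranks` is a hypothetical helper. The sketch follows the formula on Slide 30 and applies no correction for ties.

```python
def average_ranks(values):
    """Rank values from 1 (smallest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1
        for pos in range(start, end + 1):
            ranks[order[pos]] = mean_rank
        start = end + 1
    return ranks


# Reaction times in milliseconds; one row per proband, columns drug1 to drug4.
reaction_times = [
    [30, 28, 16, 34],
    [14, 18, 10, 22],
    [28, 28, 14, 30],
    [24, 20, 18, 30],
    [38, 34, 20, 44],
]

n = len(reaction_times)     # number of probands
k = len(reaction_times[0])  # number of treatment levels

# Step 1: rank the treatments within each proband.
rank_rows = [average_ranks(row) for row in reaction_times]

# Step 2: rank sum R_j of each column (treatment).
rank_sums = [sum(column) for column in zip(*rank_rows)]

# Step 3: test statistic V as defined on Slide 30.
v = 12 / (n * k * (k + 1)) * sum(r ** 2 for r in rank_sums) - 3 * n * (k + 1)

critical_value = 7.81  # chi-square table, df = 3, alpha = 5%
reject_h0 = v > critical_value

print(rank_sums, round(v, 2), reject_h0)  # [13.5, 11.5, 5.0, 20.0] 13.74 True
```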

Slide 32

Friedman test with SPSS

SPSS: Analyze → Nonparametric Tests → Legacy Dialogs → K Related Samples...

NPAR TESTS

/FRIEDMAN=drug1 drug2 drug3 drug4


/MISSING LISTWISE.

Slide 33

The sleeping pills have different impact on reaction times

(Friedman test: χ² = 14.020, df = 3, p = .003).

Compare "Chi-Square" with test statistic V = 13.74

SPSS uses a slightly different algorithm
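One plausible explanation for the difference between 13.74 and 14.020 is a correction for ties. The following minimal Python sketch (not part of the lecture; variable names are illustrative) applies the standard tie correction for the Friedman statistic to the hand-calculated value. That SPSS uses exactly this correction is an assumption; the sketch only shows that the numbers are consistent with it.

```python
n, k = 5, 4      # probands and treatment levels
v = 13.74        # hand-calculated test statistic from Slide 30
tie_sizes = [2]  # proband 3: drug1 and drug2 share rank 2.5

# Standard correction factor for ties within persons.
correction = 1 - sum(t ** 3 - t for t in tie_sizes) / (n * k * (k ** 2 - 1))
v_corrected = v / correction

print(round(correction, 2), round(v_corrected, 2))  # 0.98 14.02
```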

Slide 34

Appendix

Which Groups Differ? Post hoc Tests after a Kruskal-Wallis Test

Post hoc tests can be accessed through the output generated by the newer SPSS dialogs:

SPSS: Analyze → Nonparametric Tests → Independent Samples...

Alternatively, separate Wilcoxon rank-sum tests for each of the combinations of two drugs could be conducted (using an alpha level adjustment, e.g. a Bonferroni correction).

NPTESTS

/INDEPENDENT TEST (duration) GROUP (drug)

/MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE

/CRITERIA ALPHA=0.05 CILEVEL=95.

Note: This dialog does not allow defining which groups are compared. All groups coded in drug are being compared.
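For the alternative with separate pairwise tests, the number of comparisons and the Bonferroni-adjusted alpha level can be listed with a minimal Python sketch. Python is not part of the lecture, and `bonferroni_plan` is a hypothetical helper name. The sketch only plans the comparisons; the pairwise tests themselves would still be run in SPSS.

```python
from itertools import combinations


def bonferroni_plan(groups, alpha=0.05):
    """Return all pairs of groups and the Bonferroni-adjusted alpha per test."""
    pairs = list(combinations(groups, 2))
    adjusted_alpha = alpha / len(pairs)
    return pairs, adjusted_alpha


# Three drugs (Kruskal-Wallis example): 3 pairwise comparisons.
pairs_3, alpha_3 = bonferroni_plan(["drug1", "drug2", "drug3"])
print(pairs_3, round(alpha_3, 4))

# Four drugs (Friedman example): 6 pairwise comparisons.
pairs_4, alpha_4 = bonferroni_plan(["drug1", "drug2", "drug3", "drug4"])
print(len(pairs_4), round(alpha_4, 4))
```

Each pairwise p-value is then compared with the adjusted alpha level instead of 0.05.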

Slide 35

Double-click table in the output

Select "Pairwise comparisons"

Values needed for reporting the Kruskal-Wallis test

Copy output

Slide 36

=> Only drug=1 and drug=3 differ significantly from one another (p = .029).

SPSS does not offer any choice between different alpha level adjustments. By default, a Bonferroni adjustment is carried out. Other adjustments need to be carried out manually.

Copy output

Slide 37

Which Groups Differ? Post hoc Tests after a Friedman Test

Post hoc tests can be accessed through the output generated by the newer SPSS dialogs:

SPSS: Analyze → Nonparametric Tests → Related Samples...

Alternatively, separate Wilcoxon signed-rank tests for each of the combinations of two drugs could be conducted (using an alpha level adjustment, e.g. a Bonferroni correction).

NPTESTS

/RELATED TEST(drug1 drug2 drug3 drug4)

/MISSING SCOPE=ANALYSIS USERMISSING=EXCLUDE

/CRITERIA ALPHA=0.05 CILEVEL=95.

Slide 38

Double-click table in the output

Select "Pairwise comparisons"

Values needed for reporting Friedman test

Copy output

Slide 39

=> Only drug3 and drug4 differ significantly from one another (p = .001).

SPSS does not offer any choice between different alpha level adjustments. By default, a Bonferroni adjustment is carried out. Other adjustments need to be carried out manually.

Copy output
