MEDIAN: NON-PARAMETRIC TESTS
Business Statistics
Hypotheses on the median
The sign test
The Wilcoxon signed ranks test
Old exam question
Further study
CONTENTS
▪ The median is a central value that may be more suitable for
strongly asymmetric distributions
  ▪ and for distributions with fat tails
▪ Can we test a population median?
  ▪ e.g., 𝐻0: 𝑀 = 400
▪ Note:
  ▪ for a more or less symmetric distribution, 𝑀 ≈ 𝜇, so a 𝑡-test of
  the mean is appropriate (if 𝑛 ≥ 15)
  ▪ although the 𝑡-test is perhaps more sensitive to large positive or
  negative outliers in the sample
HYPOTHESES ON THE MEDIAN
𝑀 is here the population median. Think of it as a
Greek letter ...
▪ What is the median of a sample?
  ▪ it is the middle value of the ordered sample, i.e. roughly the
  (𝑛/2)-th ordered value
▪ So, if 𝐻0: 𝑀 = 400 were true, approximately half of
the data in the sample would be lower, and half would be
higher
▪ Therefore, if we count the number of data points that are
lower and compare it to the number of observations, we
can develop a test statistic
▪ Two varieties of such non-parametric tests today:
  ▪ sign test
  ▪ Wilcoxon signed rank test
HYPOTHESES ON THE MEDIAN
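As a small illustration of the counting idea, the following Python sketch uses the battery-life sample from the example later in these slides. The variable names (`data`, `m0`, `below`, `above`) are chosen here for illustration and are not part of the course material.

```python
from statistics import median

# Battery-life sample (hours) from the example later in these slides
data = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]
m0 = 400  # hypothesised population median under H0

sample_median = median(data)        # middle value of the sorted sample
below = sum(x < m0 for x in data)   # observations lower than m0
above = sum(x > m0 for x in data)   # observations higher than m0

print(sample_median, below, above)
```

If 𝐻0 were true we would expect `below` and `above` to be roughly equal; the tests that follow quantify how unequal they may be before we reject.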
The sign test
▪ involves simply counting the number of positive or negative
signs in a sequence of 𝑛 signs
▪ is based on the binomial distribution
▪ can be applied without requirements on the population
distribution
THE SIGN TEST
Computational steps:
▪ for each data point 𝑥𝑖 compute the difference with the
median (𝑀) of the null hypothesis (𝐻0): 𝑑𝑖 = 𝑥𝑖 − 𝑀
▪ omit zero differences (𝑑𝑖 = 0); the effective sample size is 𝑛′
▪ assign +1 to positive differences (𝑑𝑖 > 0) and −1 to
negative differences (𝑑𝑖 < 0)
▪ test statistic 𝑋 is the number of positive signs (= number
of observations above 𝑀)
THE SIGN TEST
Example:
Context: battery life until failure (in hours)
▪ 𝐻0: 𝑀 = 400; 𝐻1: 𝑀 ≠ 400
▪ use 𝛼 = 0.05
▪ sample of 𝑛 = 13 observations (𝑥1, … , 𝑥13)
▪ reject for large and for small numbers of positive signs
THE SIGN TEST
Example (𝐻0: 𝑀 = 400):
▪ data: 𝑥𝑖 (𝑖 = 1,… , 13)
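▪ difference with 𝑀: 𝑑𝑖 = 𝑥𝑖 − 400
▪ no cases where 𝑑𝑖 = 0, so 𝑛′ = 𝑛
▪ $s_i = \begin{cases} 1 & \text{if } d_i > 0 \\ -1 & \text{if } d_i < 0 \end{cases}$
▪ $s_i^+ = \begin{cases} 1 & \text{if } d_i > 0 \\ 0 & \text{if } d_i < 0 \end{cases}$
▪ $x = \sum_{i=1}^{n'} s_i^+ = 8$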
THE SIGN TEST
   xi   xi-400    si   si(+)
  342     -58     -1     0
  426      26      1     1
  317     -83     -1     0
  545     145      1     1
  264    -136     -1     0
  451      51      1     1
 1049     649      1     1
  631     231      1     1
  512     112      1     1
  266    -134     -1     0
  492      92      1     1
  562     162      1     1
  298    -102     -1     0
Example (continued):
▪ 𝑥 = 8
▪ under 𝐻0: 𝑋 ~ bin(13, 0.5)
▪ $P_{\mathrm{bin}(13,\,0.5)}(X \ge 8) = 0.2905 \approx 0.291$
  ▪ why ≥ 8?
  ▪ if we rejected for 8, we would also reject for 9, 10, ...
▪ 𝑝-value: 2 × 0.2905 = 0.581
  ▪ why 2 ×?
  ▪ because the alternative hypothesis is two-sided
▪ there is no reason to reject 𝐻0
THE SIGN TEST
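The sign test steps can be sketched in a few lines of Python. This is a minimal sketch, not the course's official procedure: the function name `sign_test` is hypothetical, and the two-sided 𝑝-value is taken as twice the smaller binomial tail (capped at 1), which matches the "2 ×" rule used in the example.

```python
from math import comb

def sign_test(data, m0):
    """Two-sided sign test of H0: median = m0.

    Returns (x, n_eff, p_value), where x is the number of positive signs
    and n_eff is the effective sample size after dropping zero differences.
    """
    diffs = [x - m0 for x in data if x != m0]   # omit zero differences
    n_eff = len(diffs)
    x = sum(d > 0 for d in diffs)               # number of positive signs
    # Tail probabilities under bin(n_eff, 0.5)
    upper = sum(comb(n_eff, k) for k in range(x, n_eff + 1)) / 2**n_eff
    lower = sum(comb(n_eff, k) for k in range(0, x + 1)) / 2**n_eff
    p_value = min(1.0, 2 * min(upper, lower))
    return x, n_eff, p_value

# Battery-life data from the example
data = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]
x, n_eff, p_value = sign_test(data, 400)
print(x, n_eff, round(p_value, 3))
```

For the battery data this reproduces 𝑥 = 8 with 𝑛′ = 13 and a 𝑝-value of about 0.581.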
Suppose we have more observations (𝑛 = 130) and find
𝑥 = 80. Can you look up $P_{\mathrm{bin}(130,\,0.5)}(X \ge 80)$?
EXERCISE 1
In the sign test, we replace the numerical values by signs (+ or −)
Advantage:
▪ we don’t need any assumption on normality, symmetry, etc.
▪ that’s why we say it’s non-parametric: we don’t have to assume a certain distribution with parameters
Disadvantage:
▪ we discard much information, so that the test is not very
sensitive (has low “power”; see later)
Are there other non-parametric tests that are more powerful?
▪ is there a compromise between value and sign that still needs
some assumptions, but not too many assumptions?
Yes: replacing the data by their ranks
THE SIGN TEST
Wilcoxon signed rank test
▪ involves comparing the sum of ranks of the values larger
than the test value with the sum of ranks of the values smaller than the test value
Computational steps:
▪ for each data point 𝑥𝑖 compute the difference with
the median (𝑀) of the null hypothesis, 𝑑𝑖 = 𝑥𝑖 − 𝑀, and its absolute value |𝑑𝑖|
▪ omit zero differences (𝑑𝑖 = 0); effective sample size is 𝑛′
▪ assign ranks (1, … , 𝑛′) to the |𝑑𝑖|, from smallest to largest
▪ reassign + and − (the sign of 𝑑𝑖) to the ranks
▪ test statistic (𝑊) is the sum of the positive ranks
THE WILCOXON SIGNED RANK TEST
Example (𝐻0: 𝑀 = 400):
▪ data: 𝑥𝑖 (𝑖 = 1,… , 13)
▪ difference with 𝑀: 𝑑𝑖 = 𝑥𝑖 − 400
▪ no cases where 𝑑𝑖 = 0, so 𝑛′ = 𝑛
▪ $w = \sum_{i=1}^{n'} r_i^+ = 61$
▪ under 𝐻0: 𝑊 ~ ? (use table)
▪ $P_{H_0}(W \ge 61) = {?}$
THE WILCOXON SIGNED RANK TEST
   xi   xi-400   |xi-400|    ri   ri(+)
  342     -58        58      -3
  426      26        26       1      1
  317     -83        83      -4
  545     145       145      10     10
  264    -136       136      -9
  451      51        51       2      2
 1049     649       649      13     13
  631     231       231      12     12
  512     112       112       7      7
  266    -134       134      -8
  492      92        92       5      5
  562     162       162      11     11
  298    -102       102      -6
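The signed ranks in the table can be reproduced with a short Python sketch. It assumes there are no ties among the absolute differences (true for this sample; tied values need extra handling that these slides do not cover), and the function name `signed_ranks` is hypothetical.

```python
def signed_ranks(data, m0):
    """Signed ranks of the non-zero differences x - m0 (assumes no ties)."""
    diffs = [x - m0 for x in data if x != m0]          # omit zero differences
    # Indices of the differences, ordered by increasing absolute value
    order = sorted(range(len(diffs)), key=lambda i: abs(diffs[i]))
    ranks = [0] * len(diffs)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank if diffs[i] > 0 else -rank     # reattach the sign
    return ranks

# Battery-life data from the example
data = [342, 426, 317, 545, 264, 451, 1049, 631, 512, 266, 492, 562, 298]
ranks = signed_ranks(data, 400)
w = sum(r for r in ranks if r > 0)   # sum of the positive ranks
print(ranks)
print(w)
```

The printed ranks match the 𝑟𝑖 column above, and their positive part sums to 𝑤 = 61.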
Testing the median using the Wilcoxon 𝑊 statistic
▪ small samples: using a table of critical values
  ▪ included in tables at exam
▪ large samples: using a normal approximation of 𝑊
  ▪ valid when 𝑛 ≥ 20
▪ The test is only valid for symmetrically distributed
populations
  ▪ if not, use sign test
THE WILCOXON SIGNED RANK TEST
Small samples: critical values of Wilcoxon statistic
▪ two-sided, 𝛼 = 0.05, 𝑛 = 13: $w_{\text{lower}} = 17$ and $w_{\text{upper}} = 74$
▪ $R_{\text{crit}} = [0, 17] \cup [74, 91]$
▪ $w_{\text{calc}} = 61$, so do not reject 𝐻0 at 𝛼 = 0.05
THE WILCOXON SIGNED RANK TEST
Lower and Upper Critical Values W of Wilcoxon Signed-Ranks Test
(lower , upper)

one-tail:   α = 0.05    α = 0.025   α = 0.01    α = 0.005
two-tail:   α = 0.10    α = 0.05    α = 0.02    α = 0.01
 n
 5           0 , 15     --- , ---   --- , ---   --- , ---
 6           2 , 19       0 , 21    --- , ---   --- , ---
 7           3 , 25       2 , 26      0 , 28    --- , ---
 8           5 , 31       3 , 33      1 , 35      0 , 36
 9           8 , 37       5 , 40      3 , 42      1 , 44
10          10 , 45       8 , 47      5 , 50      3 , 52
11          13 , 53      10 , 56      7 , 59      5 , 61
12          17 , 61      13 , 65     10 , 68      7 , 71
13          21 , 70      17 , 74     12 , 79     10 , 81
Table is available at the exam (and on
the course website)
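Critical values like these can be checked against the exact distribution of 𝑊 under 𝐻0. The sketch below assumes the standard null model for a symmetric population with no ties: each rank 1, … , 𝑛 is positive or negative with probability 1/2, independently, so all 2ⁿ sign patterns are equally likely. The function names are hypothetical and this is an illustration, not how the exam table was produced.

```python
def wilcoxon_null_counts(n):
    """counts[w] = number of sign patterns on ranks 1..n with positive-rank sum w."""
    max_w = n * (n + 1) // 2
    counts = [1] + [0] * max_w          # with no ranks, only W = 0 is possible
    for rank in range(1, n + 1):
        # Go downwards so each rank is used at most once per pattern
        for w in range(max_w, rank - 1, -1):
            counts[w] += counts[w - rank]
    return counts

def lower_tail(n, w):
    """Exact P(W <= w) under H0 for sample size n."""
    counts = wilcoxon_null_counts(n)
    return sum(counts[: w + 1]) / 2**n

print(lower_tail(13, 17))   # lower tail at the tabulated critical value
print(lower_tail(13, 18))   # one step further into the acceptance region
```

For 𝑛 = 13, `lower_tail(13, 17)` is about 0.024 (below 0.025) while `lower_tail(13, 18)` is about 0.029, which is consistent with the tabulated lower critical value of 17 for a two-tailed 𝛼 = 0.05. By symmetry the same probability lies at or above 74.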
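Large samples: under 𝐻0, it can be shown that
▪ $E(W) = \dfrac{n(n+1)}{4}$
▪ $\operatorname{var}(W) = \dfrac{n(n+1)(2n+1)}{24}$
Further, for 𝑛 ≥ 20, approximately:
▪ $\dfrac{W - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}} \sim N(0,1)$
▪ so you can compute $z_{\text{calc}} = \dfrac{w_{\text{calc}} - \frac{n(n+1)}{4}}{\sqrt{\frac{n(n+1)(2n+1)}{24}}}$
▪ and compare it to $z_{\text{crit}}$ (e.g., ±1.96)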
THE WILCOXON SIGNED RANK TEST
Example, continued:
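▪ $w = \sum_{i=1}^{n'} r_i^+ = 61$
▪ under 𝐻0: $W \sim N\big(E(W), \operatorname{var}(W)\big)$
▪ so, under 𝐻0: $\dfrac{W - E(W)}{\sqrt{\operatorname{var}(W)}} \sim N(0,1)$
▪ $P_N(W \ge 61) = P\left(\dfrac{W - E(W)}{\sqrt{\operatorname{var}(W)}} \ge \dfrac{61 - 45.5}{14.31}\right) = P(Z \ge 1.08) = 0.1401$
▪ 𝑝-value: 2 × 0.1401 = 0.2802
▪ there is no reason to reject 𝐻0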
THE WILCOXON SIGNED RANK TEST
In fact, this is not a good idea because 𝑛 = 13 < 20. We do it just to show how it works ...
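The normal approximation can be sketched in Python as follows. The function name `wilcoxon_normal_approx` is hypothetical; the sketch applies no continuity correction (as in the slides) and does not round 𝑧 before looking up the tail probability, so the 𝑝-value is close to, but not identical to, the 0.2802 obtained from the 𝑧-table with 𝑧 = 1.08.

```python
from math import sqrt, erfc

def wilcoxon_normal_approx(w, n):
    """Normal approximation for the Wilcoxon W statistic under H0.

    Returns (mean, sd, z, two-sided p-value). No continuity correction.
    """
    mean = n * (n + 1) / 4
    sd = sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (w - mean) / sd
    upper = 0.5 * erfc(z / sqrt(2))              # P(Z >= z) for standard normal Z
    p_two_sided = min(1.0, 2 * min(upper, 1 - upper))
    return mean, sd, z, p_two_sided

mean, sd, z, p = wilcoxon_normal_approx(61, 13)
print(mean, round(sd, 2), round(z, 2), round(p, 3))
```

With 𝑤 = 61 and 𝑛 = 13 this gives 𝐸(𝑊) = 45.5, a standard deviation of about 14.31 and 𝑧 ≈ 1.08, in line with the worked example.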
23 March 2015, Q1l-m
OLD EXAM QUESTION
Doane & Seward 5/E 16.1-16.3
Tutorial exercises week 3
Wilcoxon signed rank test, sign test
FURTHER STUDY