
NONPARAMETRIK

Date post: 02-Jan-2016
Upload: bianca-carrillo
Page 1: NONPARAMETRIK

NONPARAMETRIK

Page 2: NONPARAMETRIK

SWN SCIENCE DEPARTMENT

NON PARAMETRIC TEST

The majority of hypothesis tests discussed so far have made inferences about population parameters, such as the mean and the proportion. These parametric tests have used the parametric statistics of samples that came from the population being tested.

To formulate these tests, we made restrictive assumptions about the populations from which we drew our samples. For example, we assumed that our samples either were large or came from normally distributed populations. But populations are not always normal.

Page 3: NONPARAMETRIK

And even if a goodness-of-fit test indicates that a population is approximately normal, we cannot always be sure we are right, because the test is not 100 percent reliable.

Fortunately, in recent times statisticians have developed useful techniques that do not make restrictive assumptions about the shape of the population distribution.

These are known as distribution-free or, more commonly, nonparametric tests.

In such situations, nonparametric statistical procedures can be used in preference to their parametric counterparts.

The hypotheses of a nonparametric test are concerned with something other than the value of a population parameter.

A large number of these tests exist, but this section will examine only a few of the better known and more widely used ones:

Page 4: NONPARAMETRIK


NON PARAMETRIC TESTS

SIGN TEST

WILCOXON SIGNED RANK TEST

MANN – WHITNEY TEST(WILCOXON RANK SUM TEST)

RUN TEST

KRUSKAL – WALLIS TEST

KOLMOGOROV – SMIRNOV TEST

LILLIEFORS TEST

Page 5: NONPARAMETRIK

THE SIGN TEST

The sign test is used to test hypotheses about the median $\tilde{\mu}$ of a continuous distribution. The median of a distribution is a value of the random variable X such that the probability is 0,5 that an observed value of X is less than or equal to the median, and the probability is 0,5 that an observed value of X is greater than or equal to the median. That is,

\[ P(X \le \tilde{\mu}) = P(X \ge \tilde{\mu}) = 0{,}5 \]

Since the normal distribution is symmetric, the mean of a normal distribution equals the median. Therefore, the sign test can be used to test hypotheses about the mean of a normal distribution.

Page 6: NONPARAMETRIK

Let X denote a continuous random variable with median $\tilde{\mu}$ and let $X_1, X_2, \dots, X_n$ denote a random sample of size n from the population of interest. If $\tilde{\mu}_0$ denotes the hypothesized value of the population median, then the usual forms of the hypothesis to be tested can be stated as follows:

$H_0: \tilde{\mu} = \tilde{\mu}_0$

VERSUS

$H_1: \tilde{\mu} > \tilde{\mu}_0$ (right-tailed test)

$H_1: \tilde{\mu} < \tilde{\mu}_0$ (left-tailed test)

$H_1: \tilde{\mu} \ne \tilde{\mu}_0$ (two-tailed test)

Page 7: NONPARAMETRIK

Form the differences: $X_i - \tilde{\mu}_0$, $i = 1, 2, \dots, n$. Now if the null hypothesis $H_0: \tilde{\mu} = \tilde{\mu}_0$ is true, any difference $X_i - \tilde{\mu}_0$ is equally likely to be positive or negative. An appropriate test statistic is the number of these differences that are positive, say $R^+$. Therefore, to test the null hypothesis we are really testing that the number of plus signs is a value of a binomial random variable that has the parameter p = 0,5.

A p-value for the observed number of plus signs $r^+$ can be calculated directly from the binomial distribution. Thus, for the alternative $H_1: \tilde{\mu} < \tilde{\mu}_0$, if the computed p-value

\[ P = P(R^+ \le r^+ \text{ when } p = 0{,}5) \]

is less than or equal to some preselected significance level α, we will reject $H_0$ and conclude that $H_1$ is true.

Page 8: NONPARAMETRIK

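To test the other one-sided hypothesis,

$H_0: \tilde{\mu} = \tilde{\mu}_0$ vs $H_1: \tilde{\mu} > \tilde{\mu}_0$,

if the p-value

\[ P = P(R^+ \ge r^+ \text{ when } p = 0{,}5) \]

is less than or equal to α, we will reject $H_0$. The two-sided alternative may also be tested. If the hypotheses are:

$H_0: \tilde{\mu} = \tilde{\mu}_0$ vs $H_1: \tilde{\mu} \ne \tilde{\mu}_0$,

the p-value is:

\[ P = 2P(R^+ \le r^+ \text{ when } p = 0{,}5) \text{ if } r^+ < n/2, \qquad P = 2P(R^+ \ge r^+ \text{ when } p = 0{,}5) \text{ if } r^+ > n/2 \]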

Page 9: NONPARAMETRIK

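It is also possible to construct a table of critical values for the sign test. As before, let $R^+$ denote the number of the differences that are positive and let $R^-$ denote the number of the differences that are negative. Let $R = \min(R^+, R^-)$. A table of critical values $r^*_\alpha$ for the sign test ensures that

\[ P(\text{type I error}) = P(\text{reject } H_0 \text{ when } H_0 \text{ is true}) = \alpha \]

for the two-sided alternative. If the observed value of the test statistic satisfies $r \le r^*_\alpha$, then the null hypothesis $H_0: \tilde{\mu} = \tilde{\mu}_0$ should be rejected and $H_1: \tilde{\mu} \ne \tilde{\mu}_0$ accepted.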

Page 10: NONPARAMETRIK

If the alternative is $H_1: \tilde{\mu} > \tilde{\mu}_0$, then reject $H_0$ if $r^- \le r^*_\alpha$. If the alternative is $H_1: \tilde{\mu} < \tilde{\mu}_0$, then reject $H_0$ if $r^+ \le r^*_\alpha$. The level of significance of a one-sided test is one-half the value for a two-sided test.

Page 11: NONPARAMETRIK

TIES in the SIGN TEST

Since the underlying population is assumed to be continuous, there is theoretically zero probability that we will find a "tie", that is, a value of $X_i$ exactly equal to $\tilde{\mu}_0$. In practice ties do occur because measurements are rounded. When ties occur, they should be set aside and the sign test applied to the remaining data.

Page 12: NONPARAMETRIK

THE NORMAL APPROXIMATION

When p = 0,5, the binomial distribution is well approximated by a normal distribution when n is at least 10. Thus, since the mean of the binomial is $np$ and the variance is $np(1-p)$, the distribution of $R^+$ is approximately normal with mean 0,5n and variance 0,25n whenever n is moderately large. Therefore, in these cases the null hypothesis can be tested using the statistic:

\[ Z_0 = \frac{R^+ - 0{,}5\,n}{0{,}5\sqrt{n}} \]

Page 13: NONPARAMETRIK

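CRITICAL/REJECTION REGIONS FOR $H_0: \tilde{\mu} = \tilde{\mu}_0$

Critical regions (rejection regions) for α-level tests of $H_0: \tilde{\mu} = \tilde{\mu}_0$ versus each alternative are given in this table:

Alternative                              CR/RR
$H_1: \tilde{\mu} \ne \tilde{\mu}_0$     $|z_0| > z_{\alpha/2}$
$H_1: \tilde{\mu} > \tilde{\mu}_0$       $z_0 > z_\alpha$
$H_1: \tilde{\mu} < \tilde{\mu}_0$       $z_0 < -z_\alpha$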

Page 14: NONPARAMETRIK

THE WILCOXON SIGNED-RANK TEST

The sign test makes use only of the plus and minus signs of the differences between the observations and the median (or the plus and minus signs of the differences between the observations in the paired case). Frank Wilcoxon devised a test procedure that uses both direction (sign) and magnitude. This procedure is now called the Wilcoxon signed-rank test. The Wilcoxon signed-rank test applies to the case of symmetric continuous distributions. Under these assumptions, the mean equals the median.

Page 15: NONPARAMETRIK

Description of the test: We are interested in testing

$H_0: \mu = \mu_0$ versus $H_1: \mu \ne \mu_0$

Page 16: NONPARAMETRIK

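Assume that $X_1, X_2, \dots, X_n$ is a random sample from a continuous and symmetric distribution with mean/median $\mu$.

Compute the differences $X_i - \mu_0$, i = 1, 2, …, n. Rank the absolute differences $|X_i - \mu_0|$ in ascending order, and then give the ranks the signs of their corresponding differences. Let $W^+$ be the sum of the positive ranks, and $W^-$ be the absolute value of the sum of the negative ranks, and let $W = \min(W^+, W^-)$.

Critical values of W, say $w^*_\alpha$, are tabulated.

1. If $H_1: \mu \ne \mu_0$, then reject $H_0$ if the observed value of the statistic satisfies $w \le w^*_\alpha$.
2. If $H_1: \mu > \mu_0$, reject $H_0$ if $w^- \le w^*_\alpha$.
3. If $H_1: \mu < \mu_0$, reject $H_0$ if $w^+ \le w^*_\alpha$.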

Page 17: NONPARAMETRIK

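LARGE SAMPLE APPROXIMATION

If the sample size is moderately large (n > 20), then it can be shown that $W^+$ or $W^-$ has approximately a normal distribution with mean

\[ \mu_{W^+} = \frac{n(n+1)}{4} \]

and variance

\[ \sigma^2_{W^+} = \frac{n(n+1)(2n+1)}{24} \]

Therefore, a test of $H_0: \mu = \mu_0$ can be based on the statistic

\[ Z_0 = \frac{W^+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}} \]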

Page 18: NONPARAMETRIK

Wilcoxon Signed-Rank Test

Test statistic: $W^+$, the sum of the positive ranks.

Theorem: The probability distribution of $W^+$ when $H_0$ is true, which is based on a random sample of size n, satisfies:

\[ E(W^+) = \frac{n(n+1)}{4}, \qquad Var(W^+) = \frac{n(n+1)(2n+1)}{24} \]

Page 19: NONPARAMETRIK

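Proof:

Let $U_i = i$ if the difference whose absolute value has rank i is positive, and $U_i = 0$ otherwise; then

\[ W^+ = \sum_{i=1}^{n} U_i \]

where the $U_i$ are independent.

For a given rank i, the discrepancy has a 50 : 50 chance of being "+" or "-" when $H_0$ is true. Hence $E(U_i) = i/2$ and $Var(U_i) = i^2/2 - i^2/4 = i^2/4$, so that

\[ E(W^+) = \sum_{i=1}^{n} \frac{i}{2} = \frac{n(n+1)}{4}, \qquad Var(W^+) = \sum_{i=1}^{n} \frac{i^2}{4} = \frac{n(n+1)(2n+1)}{24} \]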


Page 22: NONPARAMETRIK

PAIRED OBSERVATIONS

The Wilcoxon signed-rank test can be applied to paired data. Let $(X_{1j}, X_{2j})$, j = 1, 2, …, n be a collection of paired observations from two continuous distributions that differ only with respect to their means. The distribution of the differences $D_j = X_{1j} - X_{2j}$ is continuous and symmetric. The null hypothesis is $H_0: \mu_1 = \mu_2$, which is equivalent to $H_0: \mu_D = 0$.

To use the Wilcoxon signed-rank test, the differences are first ranked in ascending order of their absolute values, and then the ranks are given the signs of the differences.

Page 23: NONPARAMETRIK

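Let $W^+$ be the sum of the positive ranks and $W^-$ be the absolute value of the sum of the negative ranks, and $W = \min(W^+, W^-)$.

If the observed value $w \le w^*_\alpha$, then $H_0: \mu_D = 0$ is rejected and $H_1: \mu_D \ne 0$ accepted.
If $H_1: \mu_D > 0$, then reject $H_0$ if $w^- \le w^*_\alpha$.
If $H_1: \mu_D < 0$, reject $H_0$ if $w^+ \le w^*_\alpha$.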

Page 24: NONPARAMETRIK

EXAMPLE

Eleven students were randomly selected from a large statistics class, and their numerical grades on two successive examinations were recorded.

Use the Wilcoxon signed-rank test to determine whether the second test was more difficult than the first. Use α = 0,1.

Student   Test 1   Test 2   Difference   Rank   Signed rank
1         94       85         9           8       8
2         78       65        13          10      10
3         89       92        -3           4      -4
4         62       56         6           7       7
5         49       52        -3           4      -4
6         78       74         4           6       6
7         80       79         1           1       1
8         82       84        -2           2      -2
9         62       48        14          11      11
10        83       71        12           9       9
11        79       82        -3           4      -4

Page 25: NONPARAMETRIK

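Solution:

Sum of the positive ranks: $W^+ = 8 + 10 + 7 + 6 + 1 + 11 + 9 = 52$.

With n = 11, the normal approximation has mean $n(n+1)/4 = 33$ and variance $n(n+1)(2n+1)/24 = 126{,}5$, so

\[ Z = \frac{52 - 33}{\sqrt{126{,}5}} = 1{,}69 \]

Since 1,69 > $z_{0,1}$ = 1,28, REJECT H0.

[Graph: standard normal curve with the critical value 1,28 and the computed value 1,69 marked to the right of 0.]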
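The calculation for the eleven students can be checked with a short script. This is a minimal sketch rather than the slides' own procedure: the helper name `average_ranks` is hypothetical, and the large-sample formulas are applied even though n = 11 is below the n > 20 guideline, because that is what the worked solution does.

```python
from math import sqrt

def average_ranks(values):
    """Ascending ranks; tied values share the mean of the ranks they occupy."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)          # first sorted position of this value
        count[v] = count.get(v, 0) + 1    # how many times the value occurs
    return [first[v] + (count[v] - 1) / 2 for v in values]

test1 = [94, 78, 89, 62, 49, 78, 80, 82, 62, 83, 79]
test2 = [85, 65, 92, 56, 52, 74, 79, 84, 48, 71, 82]

diffs = [a - b for a, b in zip(test1, test2)]   # no zero differences here
ranks = average_ranks([abs(d) for d in diffs])

w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)

n = len(diffs)
mean_w = n * (n + 1) / 4
var_w = n * (n + 1) * (2 * n + 1) / 24
z = (w_plus - mean_w) / sqrt(var_w)

print(w_plus, w_minus, round(z, 2))
```

The script reproduces the rank column of the table (the three differences of absolute value 3 share rank 4) and the standardized statistic that is compared with 1,28.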

Page 26: NONPARAMETRIK

EXAMPLE

Ten newly married couples were randomly selected, and each husband and wife were independently asked the question of how many children they would like to have. The following information was obtained.

COUPLE      1  2  3  4  5  6  7  8  9  10
WIFE X      3  2  1  0  0  1  2  2  2  0
HUSBAND Y   2  3  2  2  0  2  1  3  1  2

Using the sign test, is there reason to believe that wives want fewer children than husbands? Assume a maximum size of type I error of 0,05.

Page 27: NONPARAMETRIK

SOLUTION

First state H0 and H1:

H0 : p = 0,5

vs H1 : p < 0,5

Couple 5 is a tie (X = Y) and is set aside, leaving n = 9 pairs. The signs of X - Y are:

Pair   1  2  3  4  6  7  8  9  10
Sign   +  -  -  -  -  +  -  +  -

There are three + signs. Under H0, S ~ BIN(9, 1/2).

P(S ≤ 3) = 0,2539. At level α = 0,05, since 0,2539 > 0,05, do not reject H0.
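The binomial p-value of this example can be reproduced directly. The following is a minimal sketch using only the standard library; the function name `sign_test_lower_p` is hypothetical and it covers only the left-tailed case used here.

```python
from math import comb

def sign_test_lower_p(x, y):
    """Left-tailed sign test: returns (n, s, P(S <= s)) under p = 1/2."""
    diffs = [a - b for a, b in zip(x, y) if a != b]   # ties are set aside
    n = len(diffs)
    s = sum(d > 0 for d in diffs)                     # number of plus signs
    p_value = sum(comb(n, k) for k in range(s + 1)) / 2 ** n
    return n, s, p_value

wife = [3, 2, 1, 0, 0, 1, 2, 2, 2, 0]
husband = [2, 3, 2, 2, 0, 2, 1, 3, 1, 2]

n, s, p_value = sign_test_lower_p(wife, husband)
print(n, s, round(p_value, 4))
```

For a right-tailed or two-tailed test the same binomial sum would be taken over the upper tail, or doubled, following the p-value rules stated earlier for the sign test.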

Page 28: NONPARAMETRIK

THE WILCOXON RANK-SUM TEST

Suppose that we have two independent continuous populations X1 and X2 with means µ1 and µ2. Assume that the distributions of X1 and X2 have the same shape and spread, and differ only (possibly) in their means. The Wilcoxon rank-sum test can be used to test the hypothesis H0 : µ1 = µ2. This procedure is sometimes called the Mann-Whitney test or Mann-Whitney U test.

Page 29: NONPARAMETRIK

Description of the Test

Let $X_{11}, X_{12}, \dots, X_{1n_1}$ and $X_{21}, X_{22}, \dots, X_{2n_2}$ be two independent random samples of sizes $n_1 \le n_2$ from the continuous populations X1 and X2. We wish to test the hypotheses:

H0 : µ1 = µ2

versus H1 : µ1 ≠ µ2

The test procedure is as follows. Arrange all n1 + n2 observations in ascending order of magnitude and assign ranks to them. If two or more observations are tied, then use the mean of the ranks that would have been assigned if the observations differed.

Page 30: NONPARAMETRIK

Let W1 be the sum of the ranks in the smaller sample (1), and define W2 to be the sum of the ranks in the other sample. Then,

\[ W_2 = \frac{(n_1 + n_2)(n_1 + n_2 + 1)}{2} - W_1 \]

Now if the sample means do not differ, we will expect the sum of the ranks to be nearly equal for both samples after adjusting for the difference in sample size. Consequently, if the sums of the ranks differ greatly, we will conclude that the means are not equal. The critical value wα is obtained from a table, using the appropriate sample sizes n1 and n2.

Page 31: NONPARAMETRIK

H0 : µ1 = µ2 is rejected in favor of H1 : µ1 ≠ µ2 if either of the observed values w1 or w2 is less than or equal to wα.

If H1 : µ1 < µ2, then reject H0 if w1 ≤ wα.

For H1 : µ1 > µ2, reject H0 if w2 ≤ wα.

Page 32: NONPARAMETRIK

LARGE-SAMPLE APPROXIMATION

When both n1 and n2 are moderately large, say, greater than 8, the distribution of W1 can be well approximated by the normal distribution with mean:

\[ \mu_{W_1} = \frac{n_1(n_1 + n_2 + 1)}{2} \]

and variance:

\[ \sigma^2_{W_1} = \frac{n_1 n_2 (n_1 + n_2 + 1)}{12} \]

Page 33: NONPARAMETRIK

Therefore, for n1 and n2 > 8, we could use:

\[ Z_0 = \frac{W_1 - \mu_{W_1}}{\sigma_{W_1}} \]

as a statistic, and the critical region is:

$|z_0| > z_{\alpha/2}$   (two-tailed test)

$z_0 > z_\alpha$   (upper-tail test)

$z_0 < -z_\alpha$   (lower-tail test)

Page 34: NONPARAMETRIK

EXAMPLE

A large corporation is suspected of sex discrimination in the salaries of its employees. From employees with similar responsibilities and work experience, 12 male and 12 female employees were randomly selected; their annual salaries in thousands of dollars are as follows:

Females   22,5  19,8  20,6  24,7  23,2  19,2  18,7  20,9  21,6  23,5  20,7  21,6
Males     21,9  21,6  22,4  24,0  24,1  23,4  21,2  23,9  20,5  24,5  22,3  23,6

Is there reason to believe that these random samples come from populations with different distributions? Use α = 0,05.

Page 35: NONPARAMETRIK

SOLUTION

H0 : f1(x) = f2(x). WHAT DOES THIS MEAN? The random samples come from populations with the same distribution.

H1 : f1(x) ≠ f2(x)

Combine the two samples and rank the salaries:

SEX   SALARY   RANK
F     18,7      1
F     19,2      2
F     19,8      3
M     20,5      4
F     20,6      5
F     20,7      6
F     20,9      7
M     21,2      8
M     21,6     10
F     21,6     10
F     21,6     10

Page 36: NONPARAMETRIK

SEX   SALARY   RANK   (continued)
M     21,9     12
M     22,3     13
M     22,4     14
F     22,5     15
F     23,2     16
M     23,4     17
F     23,5     18
M     23,6     19
M     23,9     20
M     24,0     21
M     24,1     22
M     24,5     23
F     24,7     24

Page 37: NONPARAMETRIK

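Suppose we choose the female sample; its rank sum is R1 = RF = 117.

Statistic:

\[ U = n_1 n_2 + \frac{n_1(n_1 + 1)}{2} - R_1 \]

The value of the statistic U is

\[ U = (12)(12) + \frac{12(13)}{2} - 117 = 105 \]

and, standardizing with mean $n_1 n_2/2 = 72$ and variance $n_1 n_2 (n_1 + n_2 + 1)/12 = 300$,

\[ Z = \frac{105 - 72}{\sqrt{300}} = 1{,}91 \]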

Page 38: NONPARAMETRIK

Graph

[Graph: standard normal curve with critical values -1,96 and 1,96.]

α = 0,05, Zcalc = 1,91

Since -1,96 < 1,91 < 1,96, do not reject H0.

WHAT DOES THIS MEAN?
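The ranking and the standardized statistic for the salary example can be verified with the sketch below. It assumes the U form of the statistic, $U = n_1 n_2 + n_1(n_1+1)/2 - R_1$, which is consistent with the reported value 1,91 but is not spelled out in the surviving slide text; the helper `average_ranks` is a hypothetical name.

```python
from math import sqrt

def average_ranks(values):
    """Ascending ranks; tied values share the mean of the ranks they occupy."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]

females = [22.5, 19.8, 20.6, 24.7, 23.2, 19.2, 18.7, 20.9, 21.6, 23.5, 20.7, 21.6]
males = [21.9, 21.6, 22.4, 24.0, 24.1, 23.4, 21.2, 23.9, 20.5, 24.5, 22.3, 23.6]

ranks = average_ranks(females + males)      # pooled ranking of all 24 salaries
n1, n2 = len(females), len(males)
r1 = sum(ranks[:n1])                        # rank sum of the female sample

u = n1 * n2 + n1 * (n1 + 1) / 2 - r1        # assumed U form of the statistic
z = (u - n1 * n2 / 2) / sqrt(n1 * n2 * (n1 + n2 + 1) / 12)

print(r1, u, round(z, 2))
```

The three salaries of 21,6 share rank 10, as in the table, and the resulting |z| is just below 1,96.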

Page 39: NONPARAMETRIK

KOLMOGOROV-SMIRNOV TEST

The Kolmogorov-Smirnov (K-S) test is conducted by comparing the hypothesized and sample cumulative distribution functions. A cumulative distribution function is defined as $F(x) = P(X \le x)$, and the sample cumulative distribution function, S(x), is defined as the proportion of sample values that are less than or equal to x. The K-S test should be used instead of the $\chi^2$ goodness-of-fit test to determine if a sample is from a specified continuous distribution.

To illustrate how S(x) is computed, suppose we have the following 10 observations:

110, 89, 102, 80, 93, 121, 108, 97, 105, 103.

Page 40: NONPARAMETRIK

We begin by placing the values of x in ascending order, as follows:

80, 89, 93, 97, 102, 103, 105, 108, 110, 121.

Because x = 80 is the smallest of the 10 values, the proportion of values of x that are less than or equal to 80 is: S(80) = 0,1.

x      S(x) = P(X ≤ x)
80     0,1
89     0,2
93     0,3
97     0,4
102    0,5
103    0,6
105    0,7
108    0,8
110    0,9
121    1,0

Page 41: NONPARAMETRIK

The test statistic D is the maximum absolute difference between the two cdf's over all observed values. The range of D is 0 ≤ D ≤ 1, and the formula is:

\[ D = \max_x |F(x) - S(x)| \]

where
x = each observed value
S(x) = observed cdf at x
F(x) = hypothesized cdf at x

Page 42: NONPARAMETRIK

Let X(1), X(2), …, X(n) denote the ordered observations of a random sample of size n, and define the sample cdf as:

\[ S_n(x) = \begin{cases} 0, & x < X_{(1)} \\ k/n, & X_{(k)} \le x < X_{(k+1)} \\ 1, & x \ge X_{(n)} \end{cases} \]

$S_n(x)$ is the proportion of sample values less than or equal to x.

Page 43: NONPARAMETRIK

The Kolmogorov-Smirnov statistic, $D_n$, is defined to be:

\[ D_n = \max_x |F(x) - S_n(x)| \]

For the size α of type I error, the critical region is of the form:

\[ D_n \ge D_{n,\alpha}, \quad \text{where } P(D_n \ge D_{n,\alpha}) = \alpha \]

Page 44: NONPARAMETRIK

EXAMPLE 1

A state vehicle inspection station has been designed so that inspection time follows a uniform distribution with limits of 10 and 15 minutes. A sample of 10 duration times during low and peak traffic conditions was taken. Use the K-S test with α = 0,05 to determine if the sample is from this uniform distribution. The times are:

11,3  10,4  9,8  12,6  14,8
13,0  14,3  13,3  11,5  13,6

Page 45: NONPARAMETRIK

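SOLUTION

1. H0 : the sample comes from the Uniform (10,15) distribution
versus H1 : the sample does not come from the Uniform (10,15) distribution

2. The sample cumulative distribution function S(x) is computed from the ordered observations, and the hypothesized cdf is

\[ F(x) = \begin{cases} 0, & x < 10 \\ (x - 10)/5, & 10 \le x \le 15 \\ 1, & x > 15 \end{cases} \]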

Page 46: NONPARAMETRIK

K-S calculation results

Observed time x   S(x)    F(x)    |F(x) - S(x)|
 9,8              0,10    0,00    0,10
10,4              0,20    0,08    0,12
11,3              0,30    0,26    0,04
11,5              0,40    0,30    0,10
12,6              0,50    0,52    0,02
13,0              0,60    0,60    0,00
13,3              0,70    0,66    0,04
13,6              0,80    0,72    0,08
14,3              0,90    0,86    0,04
14,8              1,00    0,96    0,04

Page 47: NONPARAMETRIK

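$D = \max |F(x) - S(x)| = 0{,}12$, attained at x = 10,4.

From the table, for n = 10 and α = 0,05: $D_{10;\,0{,}05} = 0{,}41$.

[Figure: density f(D) with the upper-tail area α = P(D ≥ D0) to the right of the critical value D0.]

Since 0,12 < 0,41, do not reject H0.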
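The table for Example 1 can be regenerated with a few lines of code. This is a sketch that follows the slides' definition, taking the difference only at the observed values; it does not evaluate S(x) just before each jump as some textbook versions of the statistic do. The function names are hypothetical.

```python
def uniform_cdf(x, low=10.0, high=15.0):
    """Hypothesized Uniform(low, high) cdf, clipped to [0, 1]."""
    return min(1.0, max(0.0, (x - low) / (high - low)))

def ks_rows(sample, cdf):
    """Rows (x, S(x), F(x), |F(x) - S(x)|) at each observed value."""
    xs = sorted(sample)
    n = len(xs)
    rows = []
    for x in xs:
        s_x = sum(v <= x for v in xs) / n   # proportion of values <= x
        f_x = cdf(x)
        rows.append((x, s_x, f_x, abs(f_x - s_x)))
    return rows

times = [11.3, 10.4, 9.8, 12.6, 14.8, 13.0, 14.3, 13.3, 11.5, 13.6]

rows = ks_rows(times, uniform_cdf)
worst_x, _, _, d = max(rows, key=lambda r: r[3])

for x, s_x, f_x, diff in rows:
    print(f"{x:5.1f}  {s_x:.2f}  {f_x:.2f}  {diff:.2f}")
print("D =", round(d, 2), "at x =", worst_x)
```

The critical value 0,41 still has to be read from a K-S table; the script only computes the statistic.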

Page 48: NONPARAMETRIK

EXAMPLE 2

Suppose we wish to test whether the following ten observations
110, 89, 102, 80, 93, 121, 108, 97, 105, 103
were drawn from a normal distribution with mean µ = 100 and standard deviation σ = 10.

Our hypotheses for this test are
H0 : Data were drawn from a normal distribution, with µ = 100 and σ = 10.
versus
H1 : Data were not drawn from a normal distribution, with µ = 100 and σ = 10.

Page 49: NONPARAMETRIK

SOLUTION

x      F(x) = P(X ≤ x)
80     P(X ≤ 80) = P(Z ≤ -2) = 0,0228
89     P(X ≤ 89) = P(Z ≤ -1,1) = 0,1357
93     P(X ≤ 93) = P(Z ≤ -0,7) = 0,2420
97     P(X ≤ 97) = P(Z ≤ -0,3) = 0,3821
102    P(X ≤ 102) = P(Z ≤ 0,2) = 0,5793
103    P(X ≤ 103) = P(Z ≤ 0,3) = 0,6179
105    P(X ≤ 105) = P(Z ≤ 0,5) = 0,6915
108    P(X ≤ 108) = P(Z ≤ 0,8) = 0,7881
110    P(X ≤ 110) = P(Z ≤ 1,0) = 0,8413
121    P(X ≤ 121) = P(Z ≤ 2,1) = 0,9821

Page 50: NONPARAMETRIK

x      F(x)      S(x)    |F(x) - S(x)|
80     0,0228    0,1     0,0772
89     0,1357    0,2     0,0643
93     0,2420    0,3     0,0580
97     0,3821    0,4     0,0179
102    0,5793    0,5     0,0793 = D
103    0,6179    0,6     0,0179
105    0,6915    0,7     0,0085
108    0,7881    0,8     0,0119
110    0,8413    0,9     0,0587
121    0,9821    1,0     0,0179

Page 51: NONPARAMETRIK

If α = 0,05, the critical value for n = 10 obtained from the table is 0,409.

The decision rule is: reject H0 if D > 0,409.

Since D = 0,0793 < 0,409, do not reject H0.

That is, the data are consistent with having been drawn from a normal distribution with µ = 100 and σ = 10.

Page 52: NONPARAMETRIK

LILLIEFORS TEST

In most applications where we want to test for normality, the population mean and the population variance are unknown. In order to perform the K-S test, however, we must assume that those parameters are known. The Lilliefors test is quite similar to the K-S test. The major difference between the two tests is that, with the Lilliefors test, the sample mean $\bar{x}$ and the sample standard deviation s are used instead of µ and σ to calculate F(x).

Page 53: NONPARAMETRIK

EXAMPLE

A manufacturer of automobile seats has a production line that produces an average of 100 seats per day. Because of new government regulations, a new safety device has been installed, which the manufacturer believes will reduce average daily output. A random sample of 15 days' output after the installation of the safety device is shown:

93, 103, 95, 101, 91, 105, 96, 94, 101, 88, 98, 94, 101, 92, 95

The daily production was assumed to be normally distributed. Use the Lilliefors test to examine that assumption, with α = 0,01.

Page 54: NONPARAMETRIK

SOLUTION

As in the K-S test, to compute S(x), sort the data as follows:

x      S(x)
88     1/15 = 0,067
91     2/15 = 0,133
92     3/15 = 0,200
93     4/15 = 0,267
94     6/15 = 0,400
95     8/15 = 0,533
96     9/15 = 0,600
98     10/15 = 0,667
101    13/15 = 0,867
103    14/15 = 0,933
105    15/15 = 1,000

Page 55: NONPARAMETRIK

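From the data above, we obtain $\bar{x} = 96{,}47$ and s = 4,85.

Next, F(x) is computed as follows:

x      F(x)
88     P(Z ≤ (88 - 96,47)/4,85) = P(Z ≤ -1,75) = 0,0401
91     P(Z ≤ (91 - 96,47)/4,85) = P(Z ≤ -1,13) = 0,1292
92     P(Z ≤ (92 - 96,47)/4,85) = P(Z ≤ -0,92) = 0,1788
...
101    P(Z ≤ (101 - 96,47)/4,85) = P(Z ≤ 0,93) = 0,8238
103    P(Z ≤ (103 - 96,47)/4,85) = P(Z ≤ 1,35) = 0,9115
105    P(Z ≤ (105 - 96,47)/4,85) = P(Z ≤ 1,76) = 0,9608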

Page 56: NONPARAMETRIK

Finally, summarize the results as follows:

x      F(x)      S(x)     |F(x) - S(x)|
88     0,0401    0,067    0,0269
91     0,1292    0,133    0,0038
92     0,1788    0,200    0,0212
93     0,2358    0,267    0,0312
94     0,3050    0,400    0,0950
95     0,3821    0,533    0,1509 = D
96     0,4602    0,600    0,1398
98     0,6255    0,667    0,0415
101    0,8238    0,867    0,0432
103    0,9115    0,933    0,0215
105    0,9608    1,000    0,0392

From the table of critical values for the Lilliefors test, with α = 0,01 and n = 15: Dtab = 0,257.

Since D = 0,1509 < 0,257, do not reject H0.

Page 57: NONPARAMETRIK

TEST BASED ON RUNS

Usually a sample that is taken from a population should be random. The runs test evaluates the null hypothesis

H0 : the order of the sample data is random

The alternative hypothesis is simply the negation of H0. There is no comparable parametric test to evaluate this null hypothesis. The order in which the data are collected must be retained so that the runs may be developed.

Page 58: NONPARAMETRIK

DEFINITIONS:

1. A run is defined as a sequence of the same symbols. Two symbols are defined, and each sequence must contain a symbol at least once.

2. A run of length j is defined as a sequence of j observations, all belonging to the same group, that is preceded or followed by observations belonging to a different group.

For illustration, the ordered sequence by the sex of the employee is as follows:

F F F M F F F M M F F M M M F F M F M M M M M F

For the sex of the employee the ordered sequence exhibits runs of F's and M's.

Page 59: NONPARAMETRIK

The sequence begins with a run of length three, followed by a run of length one, followed by another run of length three, and so on. The total number of runs in this sequence is 11.

Let R be the total number of runs observed in an ordered sequence of n1 + n2 observations, where n1 and n2 are the respective sample sizes. The possible values of R are 2, 3, 4, …, (n1 + n2).

The only question to ask prior to performing the test is: is the sample size small or large? We will use the guideline that a small sample has n1 and n2 less than or equal to 15.

For small samples, the table gives the lower (rL) and upper (rU) critical values of the distribution f(r) with α/2 = 0,025 in each tail.

Page 60: NONPARAMETRIK

If n1 or n2 exceeds 15, the sample is considered large, in which case a normal approximation to f(r) is used to test H0 versus H1.

[Figure: distribution f(r) of the number of runs r, with the acceptance region (AR) between rL and rU.]

Page 61: NONPARAMETRIK

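Normal approximation

The mean and variance of R are determined to be

\[ \mu_R = \frac{2 n_1 n_2}{n_1 + n_2} + 1, \qquad \sigma_R^2 = \frac{2 n_1 n_2 (2 n_1 n_2 - n_1 - n_2)}{(n_1 + n_2)^2 (n_1 + n_2 - 1)} \]

so that the test can be based on

\[ Z = \frac{R - \mu_R}{\sigma_R} \]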
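Counting runs and evaluating the standard mean and variance formulas is easy to automate. The sketch below applies them to the F/M sequence from the definitions; with n1 = n2 = 12 that sequence counts as a small sample under the guideline above, so the z value is shown only to illustrate the arithmetic. The function name `runs_summary` is hypothetical.

```python
from math import sqrt

def runs_summary(sequence):
    """Return (R, n1, n2, mean, variance, z) for a two-symbol sequence."""
    symbols = sorted(set(sequence))
    if len(symbols) != 2:
        raise ValueError("the runs test needs exactly two distinct symbols")
    n1 = sum(s == symbols[0] for s in sequence)
    n2 = len(sequence) - n1
    # A new run starts wherever a symbol differs from its predecessor.
    runs = 1 + sum(a != b for a, b in zip(sequence, sequence[1:]))
    mean_r = 2 * n1 * n2 / (n1 + n2) + 1
    var_r = (2 * n1 * n2 * (2 * n1 * n2 - n1 - n2)
             / ((n1 + n2) ** 2 * (n1 + n2 - 1)))
    z = (runs - mean_r) / sqrt(var_r)
    return runs, n1, n2, mean_r, var_r, z

sequence = "F F F M F F F M M F F M M M F F M F M M M M M F".split()
runs, n1, n2, mean_r, var_r, z = runs_summary(sequence)
print(runs, n1, n2, mean_r, round(var_r, 4), round(z, 3))
```

The run count agrees with the 11 runs stated for this sequence.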

Page 62: NONPARAMETRIK

THE KRUSKAL-WALLIS H TEST

The Kruskal-Wallis H test is the nonparametric equivalent of the analysis of variance F test. It tests the null hypothesis that all k populations possess the same probability distribution against the alternative hypothesis that the distributions differ in location, that is, one or more of the distributions are shifted to the right or left of each other.

The advantage of the Kruskal-Wallis H test over the F test is that we need make no assumptions about the nature of the sampled populations.

A completely randomized design specifies that we select independent random samples of n1, n2, …, nk observations from the k populations.

Page 63: NONPARAMETRIK

To conduct the test, we first rank all n = n1 + n2 + n3 + … + nk observations and compute the rank sums R1, R2, …, Rk for the k samples.

The ranks of tied observations are averaged in the same manner as for the Wilcoxon rank-sum test. Then, if H0 is true, and if the sample sizes n1, n2, …, nk each equal 5 or more, the test statistic defined by

\[ H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1) \]

will have a sampling distribution that can be approximated by a chi-square distribution with (k-1) degrees of freedom. Large values of H imply rejection of H0.

Page 64: NONPARAMETRIK

Therefore, the rejection region for the test is $H > \chi^2_\alpha$, where $\chi^2_\alpha$ is the value that locates α in the upper tail of the chi-square distribution.

The test is summarized in the following:

Page 65: NONPARAMETRIK

KRUSKAL-WALLIS H TEST FOR COMPARING k POPULATION PROBABILITY DISTRIBUTIONS

H0 : The k population probability distributions are identical

H1 : At least two of the k population probability distributions differ in location

Test statistic:

\[ H = \frac{12}{n(n+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(n+1) \]

where
ni = Number of measurements in sample i
Ri = Rank sum for sample i, where the rank of each measurement is computed according to its relative magnitude in the totality of data for the k samples.

Page 66: NONPARAMETRIK

n = Total sample size = n1 + n2 + … + nk

Rejection region: $H > \chi^2_\alpha$ with (k-1) degrees of freedom

Assumptions:

1. The k samples are random and independent
2. There are 5 or more measurements in each sample
3. The observations can be ranked

No assumptions have to be made about the shape of the population probability distributions.

Page 67: NONPARAMETRIK

Example

Independent random samples of three different brands of magnetron tubes (the key components in microwave ovens) were subjected to stress testing, and the number of hours each operated without repair was recorded. Although these times do not represent typical life lengths, they do indicate how well the tubes can withstand extreme stress. The data are shown in the table below. Experience has shown that the distributions of life lengths for manufactured products are often non-normal, thus violating the assumptions required for the proper use of an ANOVA F test.

Use the Kruskal-Wallis H test to determine whether evidence exists to conclude that the brands of magnetron tubes tend to differ in length of life under stress. Test using α = 0,05.

Page 68: NONPARAMETRIK

BRAND
A      B      C
36     49     71
48     33     31
5      60     140
67     2      59
53     55     42

Page 69: NONPARAMETRIK

Solution

H0 : the population probability distributions of length of life under stress are identical for the three brands of magnetron tubes.

versus
H1 : at least two of the population probability distributions differ in location

Rank all the observations together and sum the ranks of each of the 3 samples.

A     rank    B     rank    C      rank
36     5      49     8      71     14
48     7      33     4      31      3
5      2      60    12      140    15
67    13      2      1      59     11
53     9      55    10      42      6
R1 = 36       R2 = 35       R3 = 49

Page 70: NONPARAMETRIK

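Test statistic:

\[ H = \frac{12}{15(16)} \left( \frac{36^2}{5} + \frac{35^2}{5} + \frac{49^2}{5} \right) - 3(16) = 1{,}22 \]

[Figure: distribution f(H), with the computed value H = 1,22 and the critical value $\chi^2_{0{,}05}$ = 5,99 for 2 degrees of freedom.]

Decision on H0: since 1,22 < 5,99, H0 is not rejected.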
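The rank sums and the H statistic for the magnetron data can be reproduced as follows. This is a minimal sketch under the formula $H = \frac{12}{n(n+1)}\sum R_i^2/n_i - 3(n+1)$ with no tie correction (the data contain no ties); the helper names are hypothetical.

```python
def average_ranks(values):
    """Ascending ranks; tied values share the mean of the ranks they occupy."""
    first, count = {}, {}
    for pos, v in enumerate(sorted(values), start=1):
        first.setdefault(v, pos)
        count[v] = count.get(v, 0) + 1
    return [first[v] + (count[v] - 1) / 2 for v in values]

def kruskal_wallis_h(samples):
    """Return (rank sums per sample, H) for a list of samples."""
    pooled = [v for sample in samples for v in sample]
    ranks = average_ranks(pooled)
    n = len(pooled)
    rank_sums, start = [], 0
    for sample in samples:
        rank_sums.append(sum(ranks[start:start + len(sample)]))
        start += len(sample)
    h = (12 / (n * (n + 1))
         * sum(r * r / len(s) for r, s in zip(rank_sums, samples))
         - 3 * (n + 1))
    return rank_sums, h

brand_a = [36, 48, 5, 67, 53]
brand_b = [49, 33, 60, 2, 55]
brand_c = [71, 31, 140, 59, 42]

rank_sums, h = kruskal_wallis_h([brand_a, brand_b, brand_c])
print(rank_sums, round(h, 2))
```

The value of H is then compared with the chi-square critical value 5,99 for k - 1 = 2 degrees of freedom, which is taken from a table rather than computed here.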

Page 71: NONPARAMETRIK

COMPARISON OF POPULATION PROPORTIONS

Given X1 ~ BIN(n1, p1) and X2 ~ BIN(n2, p2), the statistics

\[ \hat{p}_1 = \frac{X_1}{n_1}; \qquad \hat{p}_2 = \frac{X_2}{n_2} \]

are defined to be the sample proportions.

Assume that X1 and X2 are independent; then

\[ E(\hat{p}_1 - \hat{p}_2) = E(\hat{p}_1) - E(\hat{p}_2) = p_1 - p_2 \]

\[ Var(\hat{p}_1 - \hat{p}_2) = Var(\hat{p}_1) + Var(\hat{p}_2) = \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2} \]

Page 72: NONPARAMETRIK

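For sufficiently large n1 and n2, the standardized statistic:

\[ Z = \frac{(\hat{p}_1 - \hat{p}_2) - (p_1 - p_2)}{\sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}} \]

is approximately standard normal.

The (1-α)100% CI:

\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} \]

As p1 and p2 are UNKNOWN, the approximate (1-α)100% CI for (p1 - p2) is:

\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}} \]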

Page 73: NONPARAMETRIK

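In the testing situation,

H0 : p1 = p2 = p (p unknown)

versus, for an α-level test (α = level of significance):

$H_1$                 RR
$p_1 > p_2$           $Z > z_\alpha$
$p_1 < p_2$           $Z < -z_\alpha$
$p_1 \ne p_2$         $|Z| > z_{\alpha/2}$

Test statistic:

\[ Z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\dfrac{p(1 - p)}{n_1} + \dfrac{p(1 - p)}{n_2}}} \]

The unknown common value of p is estimated by:

\[ \hat{p} = \frac{X_1 + X_2}{n_1 + n_2} \]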

Page 74: NONPARAMETRIK

EXAMPLE

Members of the Department of statistics at Iowa State Union collected the following data on grades in an introductory business statistics course and an introductory engineering statistics course.

Course    #Students   #A grades
B.Stat    571         82
E.Stat    156         25

H0 : p1 = p2; the proportion of A grades in the two courses is equal.

vs H1 : p1 ≠ p2

\[ \hat{p}_1 = \frac{82}{571} = 0{,}1436; \qquad \hat{p}_2 = \frac{25}{156} = 0{,}1603 \]

Page 75: NONPARAMETRIK

\[ \hat{p} = \frac{82 + 25}{571 + 156} = 0{,}1472 \]

\[ Z = \frac{0{,}1436 - 0{,}1603}{\sqrt{0{,}1472\,(0{,}8528)\left(\dfrac{1}{571} + \dfrac{1}{156}\right)}} = -0{,}52 \]

The p-value is 2P(Z ≤ -0,52) = 0,6030.

Since α = 5% < p-value, H0 would not be rejected.

The proportion of A's does not differ significantly between the two courses.
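The pooled two-proportion z test for the grade data can be checked with the sketch below. The normal cdf is built from `math.erf`, so no statistical table is needed; the function names are hypothetical, and small differences from the slide values come from rounding the intermediate proportions.

```python
from math import erf, sqrt

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

def two_proportion_z(x1, n1, x2, n2):
    """Pooled z statistic and two-sided p-value for H0: p1 = p2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)          # estimate of the common p
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * normal_cdf(-abs(z))

z, p_value = two_proportion_z(82, 571, 25, 156)
print(round(z, 2), round(p_value, 3))
```

For a one-sided alternative the p-value would be the single tail, `normal_cdf(-z)` for H1: p1 > p2, instead of the doubled tail.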

Page 76: NONPARAMETRIK

EXERCISE

An insurance company is thinking about offering a discount on its life insurance policies to non-smokers. As part of its analysis, it randomly selects 200 men who are 50 years old and asks them if they smoke at least one pack of cigarettes per day and if they have ever suffered from heart disease. The results indicate that 20 out of 80 smokers and 15 out of 120 non-smokers suffer from heart disease. Can we conclude at the 5% level of significance (los) that smokers have a higher incidence of heart disease than non-smokers?

DATA

50-year-old smokers suffering from HEART disease, parameter: p1

50-year-old non-smokers suffering from HEART disease, parameter: p2

Solution:

Page 77: NONPARAMETRIK

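Clearly the data are qualitative.

\[ H_0: p_1 - p_2 = 0 \quad \text{vs} \quad H_1: p_1 - p_2 > 0 \]

Test statistic:

\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]

\[ RR: z > z_{tab} = z_{0{,}05} = 1{,}645 \]

Sample proportions:

\[ \hat{p}_1 = \frac{20}{80} = 0{,}25; \qquad \hat{p}_2 = \frac{15}{120} = 0{,}125 \]

Pooled proportion estimate:

\[ \hat{p} = \frac{20 + 15}{80 + 120} = \frac{35}{200} = 0{,}175 \]

Value of the test statistic:

\[ z_{cal} = \frac{0{,}25 - 0{,}125}{\sqrt{0{,}175\,(0{,}825)\left(\dfrac{1}{80} + \dfrac{1}{120}\right)}} \]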

Page 78: NONPARAMETRIK

$z_{cal} = 2{,}28 > z_{tab} = 1{,}645$, so reject H0.

The test statistic is normally distributed, so we can calculate the p-value:

p-value = P(z > 2,28) = 0,0113 = 1,13% < 5%. Reject H0.

Page 79: NONPARAMETRIK

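PROBLEMS

1. The pmf of the random variable X is given as follows:

x       0    1    2    3
p(x)    0    k    k    3k²

Determine k so that the properties of a pmf are satisfied!

Solution: A pmf has two properties, namely:

\[ p(x) \ge 0 \ \ \forall x, \qquad \sum_x p(x) = 1 \]

\[ \sum_x p(x) = 0 + k + k + 3k^2 = 1 \]

\[ 3k^2 + 2k - 1 = 0 \]

\[ (3k - 1)(k + 1) = 0 \ \Rightarrow\ k = \tfrac{1}{3},\ k = -1 \]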

Page 80: NONPARAMETRIK

For $k = -1$:

\[ p(1) = -1 < 0, \qquad p(2) = -1 < 0 \]

Hence $k = -1$ does not satisfy the properties. Next, for $k = \tfrac{1}{3}$ it can be checked that the properties of a pmf are satisfied. So the value is $k = \tfrac{1}{3}$.

Page 81: NONPARAMETRIK

In a public opinion survey, 60 out of a sample of 100 high-income voters and 40 out of a sample of 75 low-income voters supported a decrease in sales tax.

(a) Can we conclude at the 5% level of significance (los) that the proportion of voters favoring a sales tax decrease differs between high- and low-income voters?
(b) What is the p-value of this test?
(c) Estimate the difference in proportions, with 99% confidence!

Solution:

\[ H_0: (p_1 - p_2) = 0 \quad \text{vs} \quad H_1: (p_1 - p_2) \ne 0 \]

Test statistic:

\[ z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \]

\[ RR: |z| > 1{,}96 \]

Page 82: NONPARAMETRIK

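\[ \hat{p}_1 = \frac{60}{100} = 0{,}6; \qquad \hat{p}_2 = \frac{40}{75} = 0{,}53 \]

\[ \hat{p} = \frac{60 + 40}{100 + 75} = \frac{100}{175} = 0{,}571; \qquad \hat{q} = 1 - \hat{p} = 0{,}429 \]

\[ z_{cal} = \frac{0{,}60 - 0{,}53}{\sqrt{0{,}571\,(0{,}429)\left(\dfrac{1}{100} + \dfrac{1}{75}\right)}} = 0{,}93 \]

[Graph: standard normal curve with critical values -1,96 and 1,96.]

(a) Conclusion: do not reject H0.

(b) p-value = 2P(z > 0,93) = 2(0,1762) = 0,3524.

(c)

\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \sqrt{\frac{\hat{p}_1\hat{q}_1}{n_1} + \frac{\hat{p}_2\hat{q}_2}{n_2}} = (0{,}60 - 0{,}53) \pm 2{,}575 \sqrt{\frac{(0{,}6)(0{,}4)}{100} + \frac{(0{,}53)(0{,}47)}{75}} = 0{,}07 \pm 0{,}195 \]

The difference between the two proportions is estimated to lie between -0,125 and 0,265.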

Page 83: NONPARAMETRIK

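TEST on MEANS WHEN THE OBSERVATIONS ARE PAIRED

TESTING THE PAIRED DIFFERENCES

Let (X1, Y1), (X2, Y2), …, (Xn, Yn) be the n pairs, where (Xi, Yi) denotes, for example, the systolic blood pressure of the i-th subject before and after the drug. It is assumed that the differences D1, D2, …, Dn constitute independent normally distributed random variables such that:

\[ E(D_i) = \mu_D \quad \text{and} \quad Var(D_i) = \sigma_D^2 \]

\[ H_0: \mu_D = \mu_0 \quad \text{vs} \quad H_1: \mu_D \ne \mu_0 \]

TEST STATISTIC:

\[ T = \frac{\bar{D} - \mu_0}{S_D / \sqrt{n}} \]

where

\[ \bar{D} = \frac{\sum D_i}{n} \quad \text{and} \quad S_D^2 = \frac{1}{n - 1} \sum (D_i - \bar{D})^2 \]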

Page 84: NONPARAMETRIK

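Rejection criteria for testing hypotheses on means when the observations are paired

Null hypothesis: $H_0: \mu_D = \mu_0$

Value of the test statistic under H0:

\[ t = \frac{\bar{d} - \mu_0}{s_d / \sqrt{n}} \]

Here $t_{\gamma,\,n-1}$ denotes the γ-quantile of the t distribution with n - 1 degrees of freedom.

Alternative hypothesis        Rejection criteria
$H_1: \mu_D \ne \mu_0$        Reject H0 when $t \le t_{\alpha/2,\,n-1}$ or when $t \ge t_{1-\alpha/2,\,n-1}$
$H_1: \mu_D > \mu_0$          Reject H0 when $t \ge t_{1-\alpha,\,n-1}$
$H_1: \mu_D < \mu_0$          Reject H0 when $t \le t_{\alpha,\,n-1}$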

Page 85: NONPARAMETRIK

A paired difference experiment is conducted to compare the starting salaries of male and female college graduates who find jobs. Pairs are formed by choosing a male and a female with the same major and similar grade point average. Suppose a random sample of ten pairs is formed in this manner and the starting annual salary of each person is recorded. The results are shown in the table. Test to see whether there is evidence that the mean starting salary, μ1, for males exceeds the mean starting salary, μ2, for females. Use α = 0,05.

Page 86: NONPARAMETRIK

Pair   Male        Female      Difference (male - female)
1      $ 14.300    $ 13.800    $ 500
2      16.500      16.600      -100
3      15.400      14.800      600
4      13.500      13.500      0
5      18.500      17.600      900
6      12.800      13.000      -200
7      14.500      14.200      300
8      16.200      15.100      1.100
9      13.400      13.200      200
10     14.200      13.500      700

Page 87: NONPARAMETRIK

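Solution:

$$H_o: \mu_1 - \mu_2 = 0 \ (\mu_D = 0) \quad\text{vs}\quad H_1: \mu_1 - \mu_2 > 0 \ (\mu_D > 0)$$

Test statistic:

$$t = \frac{\bar x_D - \delta_o}{s_D/\sqrt{n}}; \qquad \bar x_D = \bar d$$

where $\delta_o = 0$ is the hypothesized mean difference.

RR: reject Ho if $t > t_{\alpha}$; $t_{0.05,\,9} = 1.833$

$$\bar x_D = \bar d = \frac{\sum D_i}{n} = 400$$

$$S_D^2 = 188{,}888.89 \ \Rightarrow\ S_D = 434.61$$

$$t = \frac{400}{434.61/\sqrt{10}} = 2.91$$

[Figure: t distribution with 9 degrees of freedom; the rejection region lies to the right of t = 1.833.]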

Page 88: NONPARAMETRIK

t_cal = 2.91 falls in the rejection region (t > 1.833).

Reject Ho at the 0.05 level of significance.

There is evidence that the mean starting salary for males exceeds the mean starting salary for females.

Page 89: NONPARAMETRIK

Consider a classroom where the students are given a test before they are taught the subject matter covered by the test. The students' scores on this pre test are recorded as the first data set. Next, the subject matter is presented to the class. After the instruction is completed, the students are retested on the same material. The scores on the second test, the post test, compose the second data set. It is reasonable to expect that a student who scored high on the pre test will also score high on the post test (and vice versa). Inherently, a strong dependency exists between the members of a pair of scores generated by each individual.

Suppose that the scores in the table have been generated by 15 students under the conditions just described. How would you decide whether the instruction had been effective?

Page 90: NONPARAMETRIK

Student   Pre test   Post test   D
1         54         66          12
2         79         85          6
3         91         83          -8
4         75         88          13
5         68         93          25
6         43         40          -3
7         33         78          45
8         85         91          6
9         22         44          22
10        56         82          26
11        73         59          -14
12        63         81          18
13        29         64          35
14        75         83          8
15        87         81          -6

A data set with paired scores
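One way to answer the question is a paired-difference t test on D = post test - pre test. The slides do not work this example, so the following is a minimal sketch under stated assumptions: an upper-tailed alternative (mean difference greater than 0), α = 0.05, and the critical value t(0.05, 14) = 1.761 taken from a standard t table. The helper name `paired_t` is hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(differences, delta0=0.0):
    """Return (d_bar, s_d, t) for the paired-difference t statistic."""
    n = len(differences)
    d_bar = mean(differences)
    s_d = stdev(differences)                # sample standard deviation (n - 1)
    t = (d_bar - delta0) / (s_d / sqrt(n))
    return d_bar, s_d, t


pre = [54, 79, 91, 75, 68, 43, 33, 85, 22, 56, 73, 63, 29, 75, 87]
post = [66, 85, 83, 88, 93, 40, 78, 91, 44, 82, 59, 81, 64, 83, 81]
d = [b - a for a, b in zip(pre, post)]      # D = post test - pre test

d_bar, s_d, t = paired_t(d)
T_CRIT = 1.761                              # assumed: t(0.05, 14), one-tailed
print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.2f}, t = {t:.2f}")
print("reject Ho" if t > T_CRIT else "do not reject Ho")
```

The same function applied to the ten salary differences from the earlier example gives the statistic used there, which is a convenient cross-check.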

Page 91: NONPARAMETRIK

EX: Use the T statistic for the hypotheses $H_o: \mu = 5$ versus $H_1: \mu = 6$, in which σ = 1, to compute:

a) β, if α = 0.05 and n = 16
b) α, if β = 0.025 and n = 16
c) n, if α = 0.05 and β = 0.025

Solution:

$$H_o: \mu = \mu_o = 5 \quad\text{vs}\quad H_1: \mu = \mu_1 = 6, \qquad \mu_1 > \mu_o$$

Test statistic:

$$T = \frac{\bar X - \mu}{\sigma/\sqrt{n}}$$

$$RR = \{\bar X > c\}$$

(a)

$$P(\bar X > c \mid \mu = 5) = 0.05$$

$$P\left(\frac{\bar X - 5}{1/\sqrt{16}} > \frac{c - 5}{1/\sqrt{16}}\right) = 0.05$$

Page 92: NONPARAMETRIK

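$$P(T > 4(c - 5)) = 0.05$$

$$P(T > t_{\alpha}) = 0.05$$

$t_{\alpha} = t_{0.05,\,15} = 1.753$, which means

$$4(c - 5) = 1.753 \ \Rightarrow\ c = 5.438$$

$$\beta = P(\text{accept } H_o \mid H_1 \text{ true}) = P(\bar X \le c \mid \mu = 6)$$

$$= P(T \le 4(c - 6)) = P(T \le -2.248) = P(T \ge 2.248)$$

The value 2.248 is not in the t table, so interpolation is needed. Linear interpolation is generally used:

$$f(x) \approx a + bx; \qquad x_1 \le x \le x_2$$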

Page 93: NONPARAMETRIK

$$x_1 < x_o < x_2$$

$$f(x_o) \approx a + b x_o = f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1}(x_o - x_1)$$

t TABLE

One-tail α   0.10    0.05    0.025   0.01    0.005   0.001
Two-tail α   0.20    0.10    0.05    0.02    0.01    0.002
ν = 15       1.341   1.753   2.131   2.602

The value $x_o = 2.248$ lies between 2.131 and 2.602 in the row for ν = 15.

Page 94: NONPARAMETRIK

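$$f(x_o) \approx f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1}(x_o - x_1)$$

$$f(x_o) \approx 0.025 + \frac{0.010 - 0.025}{0.471}(0.117)$$

$$f(x_o) \approx 0.021$$

$$\beta = P(T \ge 2.248) \approx 0.021$$

(b) β = 0.025; n = 16; α = ?

$$P(\bar X \le c \mid \mu = 6) = 0.025$$

$$P(T \le 4(c - 6)) = 0.025; \qquad P(T \le -t_{0.025,\,15}) = 0.025$$

$$t_{0.025,\,15} = 2.131$$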

Page 95: NONPARAMETRIK

So: $4(c - 6) = -2.131 \ \Rightarrow\ c = 5.467$

$$\alpha = P(\text{reject } H_o \mid H_o \text{ true}) = P(\bar X > c \mid \mu = 5)$$

$$= P(T > 4(5.467 - 5)) = P(T > 1.868) \approx 0.042$$

Page 96: NONPARAMETRIK

TABLE INTERPOLATION

Suppose that it is desired to evaluate a function f(x) at a point xo, and that a table of values of f(x) is available for some, but not all, values of x. In particular, the table may not give the value f(xo) but may give values for f(x1) and f(x2), where x1 < xo < x2. We can use the known values of f(x) at x = x1 and x = x2 to approximate the value of f(xo). This process is known as INTERPOLATION.

Perhaps the most commonly used interpolation method is linear interpolation. If f(x) is sufficiently smooth and not too curvilinear between x = x1 and x = x2, calculus tells us that f(x) can be regarded as being nearly linear over the interval [x1, x2].

Page 97: NONPARAMETRIK

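That is,

$$f(x) \approx a + bx; \qquad x_1 \le x \le x_2$$

Solving the equations:

$$f(x_1) = a + b x_1; \qquad f(x_2) = a + b x_2$$

for a and b yields:

$$b = \frac{f(x_2) - f(x_1)}{x_2 - x_1}$$

$$a = f(x_1) - \frac{f(x_2) - f(x_1)}{x_2 - x_1}\,x_1$$

Hence:

$$f(x_o) \approx a + b x_o = f(x_1) + \frac{f(x_2) - f(x_1)}{x_2 - x_1}(x_o - x_1)$$

[Figure: the curve f(x) and the line a + bx between x1 and x2, with f(x1), f(xo) and f(x2) marked at x1, xo and x2.]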
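The interpolation formula can be sketched as a small Python helper. This is an illustrative sketch only: the function name `interpolate` is hypothetical, and the two calls reuse table values that appear elsewhere in these slides (the t table row for 15 degrees of freedom and the standard normal table near 1.33).

```python
def interpolate(x0, x1, f1, x2, f2):
    """Linear interpolation of f(x0) from (x1, f1) and (x2, f2), x1 < x0 < x2."""
    if not x1 < x0 < x2:
        raise ValueError("x0 must lie strictly between x1 and x2")
    slope = (f2 - f1) / (x2 - x1)           # b in f(x) ~ a + b x
    return f1 + slope * (x0 - x1)


# Upper-tail area of t with 15 dof at 2.248, between 2.131 (0.025) and 2.602 (0.010)
tail = interpolate(2.248, 2.131, 0.025, 2.602, 0.010)
# Standard normal CDF at 1.333, between 1.330 (0.9082) and 1.340 (0.9099)
phi = interpolate(1.333, 1.330, 0.9082, 1.340, 0.9099)
print(f"{tail:.3f} {phi:.5f}")
```

The helper refuses points outside [x1, x2], since the linear approximation is only justified between the two tabulated values.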

Page 98: NONPARAMETRIK

EXERCISE

1. Let (X1, X2, …, Xn) be a random sample of a normal RV X with mean μ and variance 100. Let:

$$H_o: \mu = 50 \quad\text{vs}\quad H_1: \mu = 55$$

As a decision test, we use the rule to accept Ho if $\bar x \le 53$, where $\bar x$ is the value of the sample mean.

a) Find RR.
b) Find α and β for n = 16.

2. Let (X1, X2, …, Xn) be a random sample of a Bernoulli RV X with pmf:

$$p_X(x; p) = p^x (1 - p)^{1 - x}; \qquad x = 0, 1$$

where it is known that $0 < p \le \frac{1}{2}$. Let:

$$H_o: p = \tfrac{1}{2} \quad\text{vs}\quad H_1: p = p_1 \ \left(< \tfrac{1}{2}\right)$$

and n = 20. As a decision test, we use the rule to reject Ho if

$$\sum_{i=1}^{n} x_i \le 6$$

Page 99: NONPARAMETRIK

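(a) Find the power function γ(p) of the test.
(b) Find α.
(c) Find β: (i) if $p_1 = \frac{1}{4}$ and (ii) if $p_1 = \frac{1}{10}$.

Solutions:

2.

$$H_o: p = \tfrac{1}{2} \quad\text{vs}\quad H_1: p = p_1 \ \left(< \tfrac{1}{2}\right)$$

$$X \sim BER(p); \qquad p_X(x) = p^x (1 - p)^{1 - x}; \quad x = 0, 1$$

a)

$$\gamma(p) = P(\text{reject } H_o \mid p) = \sum_{k=0}^{6} \binom{20}{k} p^k (1 - p)^{20 - k}; \qquad 0 < p \le \tfrac{1}{2}$$

b)

$$\alpha = \gamma\left(\tfrac{1}{2}\right) = P\left(\text{reject } H_o \mid p = \tfrac{1}{2}\right) = \sum_{k=0}^{6} \binom{20}{k} \left(\tfrac{1}{2}\right)^k \left(\tfrac{1}{2}\right)^{20 - k}$$

From the table, α = 0.058.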

Page 100: NONPARAMETRIK

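c)

$$\beta(p_1) = P(\text{accept } H_o \mid H_1 \text{ is true}) = 1 - P(\text{reject } H_o \mid p_1) = 1 - \gamma(p_1)$$

$$\beta\left(\tfrac{1}{4}\right) = 1 - \sum_{k=0}^{6} \binom{20}{k} \left(\tfrac{1}{4}\right)^k \left(\tfrac{3}{4}\right)^{20 - k} = 0.2142$$

$$\beta\left(\tfrac{1}{10}\right) = 1 - \sum_{k=0}^{6} \binom{20}{k} \left(\tfrac{1}{10}\right)^k \left(\tfrac{9}{10}\right)^{20 - k} = 0.0024$$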
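The binomial sums in this solution can be evaluated directly instead of read from a table. The following is a minimal sketch; the function name `power` and its default arguments (n = 20, cutoff 6) are illustrative choices that mirror the decision rule of the exercise.

```python
from math import comb


def power(p, n=20, cutoff=6):
    """gamma(p) = P(sum of X_i <= cutoff | p) for n Bernoulli(p) trials."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(cutoff + 1))


alpha = power(0.5)                 # size of the test at Ho: p = 1/2
beta_quarter = 1 - power(0.25)     # type II error when p1 = 1/4
beta_tenth = 1 - power(0.10)       # type II error when p1 = 1/10
print(f"{alpha:.4f} {beta_quarter:.4f} {beta_tenth:.4f}")
```

Evaluating `power` on a grid of p values between 0 and 1/2 would trace the whole power function γ(p), not just the three points used here.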

Page 101: NONPARAMETRIK

Let (X1, X2, …, Xn) be a random sample of a normal RV X with mean μ and variance 100. Let:

$$H_o: \mu = 50 \quad\text{vs}\quad H_1: \mu = 55$$

As a decision test, we use the rule to accept Ho if $\bar x \le c$. Find the value of c and the sample size n such that α = 0.025 and β = 0.05.

Solution:

$$R_1 = \{(x_1, x_2, \ldots, x_n) : \bar x > c\}$$

$$\alpha = P(\text{reject } H_o \mid H_o \text{ true}) = P(\bar X > c \mid \mu = 50)$$

$$P\left(Z > \frac{c - 50}{10/\sqrt{n}}\right) = 0.025$$

$$P(Z > z_{\alpha}) = 0.025$$

Answer: n = 52, c = 52.718

Page 102: NONPARAMETRIK

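$$P(Z \le z_{\alpha}) = 0.975 \ \Rightarrow\ z_{\alpha} = 1.96$$

$$\Phi\left(\frac{c - 50}{10/\sqrt{n}}\right) = 0.975$$

$$\frac{c - 50}{10/\sqrt{n}} = 1.96 \ \Rightarrow\ (c - 50)\sqrt{n} = 19.60$$

$$\beta = P(\text{accept } H_o \mid H_1 \text{ true}) = P(\bar X \le c \mid \mu = 55)$$

$$P\left(Z \le \frac{c - 55}{10/\sqrt{n}}\right) = 0.05 \ \Rightarrow\ \Phi\left(\frac{c - 55}{10/\sqrt{n}}\right) = 0.05$$

$$\frac{(c - 55)\sqrt{n}}{10} = -1.645 \ \Rightarrow\ (c - 55)\sqrt{n} = -16.45$$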

Page 103: NONPARAMETRIK

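Dividing $(c - 50)\sqrt{n} = 19.60$ by $(c - 55)\sqrt{n} = -16.45$ eliminates $\sqrt{n}$; after dividing both constants by 5:

$$\frac{c - 50}{3.92} = -\frac{c - 55}{3.29} \ \Rightarrow\ 3.29(c - 50) = -3.92(c - 55)$$

$$3.29c - 164.50 = 215.60 - 3.92c$$

$$7.21c = 380.10$$

$$c = \frac{38010}{721} = 52.7184466$$

$$c \approx 52.718$$

$$(c - 50)\sqrt{n} = 19.60 \ \Rightarrow\ \sqrt{n} = \frac{19.60}{2.718} = \frac{19600}{2718} = 7.211$$

$$n = 51.998 \approx 52$$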
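The two conditions on c and n can also be solved in closed form: subtracting the β equation from the α equation gives $\sqrt{n} = (z_{\alpha} + z_{\beta})\,\sigma / (\mu_1 - \mu_o)$. The sketch below follows that route; the function name `design_test` is hypothetical, and it uses unrounded normal quantiles from Python's standard library, so n comes out near 51.98 rather than 51.998 before rounding up.

```python
from math import ceil
from statistics import NormalDist


def design_test(mu0, mu1, sigma, alpha, beta):
    """Cutoff c and sample size n for the rule 'reject Ho if x_bar > c'."""
    z_alpha = NormalDist().inv_cdf(1 - alpha)   # about 1.96 for alpha = 0.025
    z_beta = NormalDist().inv_cdf(1 - beta)     # about 1.645 for beta = 0.05
    root_n = (z_alpha + z_beta) * sigma / (mu1 - mu0)
    c = mu0 + z_alpha * sigma / root_n
    return c, root_n**2


c, n_exact = design_test(50, 55, 10, alpha=0.025, beta=0.05)
print(f"c = {c:.3f}, n = {n_exact:.2f}, rounded up to {ceil(n_exact)}")
```

Rounding n up rather than to the nearest integer keeps both error probabilities at or below their targets.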

Page 104: NONPARAMETRIK

Let (X1, X2, …, Xn) be a random sample of a normal RV X with mean μ and variance 36. Let:

$$H_o: \mu = 50 \quad\text{vs}\quad H_1: \mu = 55$$

As a decision test, we use the rule to accept Ho if $\bar x \le 53$, where $\bar x$ is the value of the sample mean.

a) Find the expression for the critical region (rejection region) R1.
b) Find α and β for n = 16.

Solution:

a)

$$R_1 = \{(x_1, x_2, \ldots, x_n) : \bar x > 53\}, \quad\text{where}\quad \bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$$

b)

$$\alpha = P(\bar X > 53 \mid \mu = 50) = P(Z > 2) = 1 - \Phi(2) = 1 - 0.9772 = 0.0228$$

Page 105: NONPARAMETRIK

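$$\beta = P(\text{accept } H_o \mid H_1 \text{ true}) = P(\bar X \le 53 \mid \mu = 55)$$

$$= P(Z \le -1.333) = \Phi(-1.333) = 1 - \Phi(1.333)$$

The value Φ(1.333) is interpolated from the normal table, with x1 < xo < x2:

        x1       xo       x2
x       1.330    1.333    1.340
Φ(x)    0.9082   ?        0.9099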

Page 106: NONPARAMETRIK

$$f(1.333) \approx 0.9082 + \frac{0.9099 - 0.9082}{1.340 - 1.330}(1.333 - 1.330)$$

$$= 0.9082 + \frac{0.0017}{0.0100}(0.003)$$

$$f(1.333) \approx 0.9082 + 0.00051 = 0.90871$$

$$\beta = 1 - \Phi(1.333) \approx 1 - 0.9087 = 0.0913$$
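Both error probabilities in this exercise can be checked without a table. This is a minimal sketch using `statistics.NormalDist` from Python's standard library; the function name `error_rates` is hypothetical. The exact normal CDF gives β of about 0.0912, which agrees with the interpolated 0.0913 to three decimal places.

```python
from math import sqrt
from statistics import NormalDist


def error_rates(mu0, mu1, sigma, n, c):
    """alpha and beta for the rule 'accept Ho if x_bar <= c' with known sigma."""
    se = sigma / sqrt(n)                     # standard error of the sample mean
    alpha = 1 - NormalDist(mu0, se).cdf(c)   # P(X_bar > c | mu = mu0)
    beta = NormalDist(mu1, se).cdf(c)        # P(X_bar <= c | mu = mu1)
    return alpha, beta


alpha, beta = error_rates(50, 55, sigma=6, n=16, c=53)
print(f"alpha = {alpha:.4f}, beta = {beta:.4f}")
```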

