Kruskal-Wallis Non-Parametric AOV (webspace.ship.edu/pgmarr/Geo441/Lectures/Lec 9 - Non-Parametric...)

Date post: 14-Nov-2018
Upload: vodang
Kruskal-Wallis Non-Parametric AOV

Non-parametric AOV, as with other non-parametric tests, uses ranked data.

The non-parametric form of AOV is called the Kruskal-Wallis test, and the test statistic is:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$

where R_i is the sum of the ranks in category i, n_i is the number of observations in category i, k is the number of categories, and N is the total number of pooled observations.

The procedure is simple:

• Rank the pooled set of N observations from lowest to highest, with the lowest rank being 1.

• Sum the ranks in each category.

• Plug the numbers into the equation.
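The three steps above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function name `kruskal_wallis_h` and the toy numbers are made up for the example, and the sketch assumes there are no tied values (ties need average ranks and a correction factor).

```python
def kruskal_wallis_h(groups):
    """Kruskal-Wallis H for a list of groups, assuming no tied values."""
    # Step 1: rank the pooled observations from lowest (rank 1) to highest.
    pooled = sorted(value for group in groups for value in group)
    if len(set(pooled)) != len(pooled):
        raise ValueError("tied values present; average ranks are needed")
    rank_of = {value: rank for rank, value in enumerate(pooled, start=1)}

    # Step 2: sum the ranks in each category.
    rank_sums = [sum(rank_of[v] for v in group) for group in groups]

    # Step 3: plug the numbers into the equation.
    n_total = len(pooled)
    weighted = sum(r * r / len(g) for r, g in zip(rank_sums, groups))
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Hypothetical toy data: three groups of three, already in rank order.
toy = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0], [7.0, 8.0, 9.0]]
h_toy = kruskal_wallis_h(toy)
print(round(h_toy, 2))
```

For the toy data the rank sums are 6, 15 and 24, so H = (12/90)(12 + 75 + 192) - 30 = 7.2.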

Country   Region  Interest07  Country     Region  Interest07  Country      Region  Interest07
Cameroon  AF      0.90        Argentina   LA      6.55        Bangladesh   SE      2.11
Chad      AF      1.34        Bolivia     LA      2.71        Bhutan       SE      9.69
Ethiopia  AF      1.25        Brazil      LA      7.25        Cambodia     SE      1.54
Gabon     AF      6.75        Colombia    LA      7.88        Lao PDR      SE      1.50
Kenya     AF      1.06        Costa Rica  LA      5.58        Malaysia     SE      6.77
Mali      AF      1.14        Ecuador     LA      5.36        New Guinea   SE      0.75
Namibia   AF      1.37        Guatemala   LA      5.85        Philippines  SE      5.20
Nigeria   AF      0.78        Honduras    LA      1.81        Thailand     SE      2.00
Tanzania  AF      0.77        Jamaica     LA      7.21        Vietnam      SE      2.99
Uganda    AF      0.79        Nicaragua   LA      1.82
Zambia    AF      1.32        Paraguay    LA      5.30
                              Peru        LA      6.92
                              Venezuela   LA      6.58

Interest on International Debt, 2007

Country   Region  Rank        Country     Region  Rank        Country      Region  Rank
Cameroon  AF      5           Argentina   LA      25          Bangladesh   SE      17
Chad      AF      10          Bolivia     LA      18          Bhutan       SE      33
Ethiopia  AF      8           Brazil      LA      31          Cambodia     SE      13
Gabon     AF      27          Colombia    LA      32          Lao PDR      SE      12
Kenya     AF      6           Costa Rica  LA      23          Malaysia     SE      28
Mali      AF      7           Ecuador     LA      22          New Guinea   SE      1
Namibia   AF      11          Guatemala   LA      24          Philippines  SE      20
Nigeria   AF      3           Honduras    LA      14          Thailand     SE      16
Tanzania  AF      2           Jamaica     LA      30          Vietnam      SE      19
Uganda    AF      4           Nicaragua   LA      15
Zambia    AF      9           Paraguay    LA      21
                              Peru        LA      29
                              Venezuela   LA      26

Rank Sums: AF = 92, LA = 310, SE = 159

Ranked Interest on International Debt, 2007

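Substituting the rank sums into the test statistic, with N = 33:

$$H = \frac{12}{N(N+1)} \sum_{i=1}^{k} \frac{R_i^2}{n_i} - 3(N+1)$$

$$H = \frac{12}{33(33+1)} \left( \frac{92^2}{11} + \frac{310^2}{13} + \frac{159^2}{9} \right) - 3(33+1)$$

$$H = (0.010695 \times 10970.8) - 102$$

$$H = 15.33$$

Keep full precision in the coefficient 12/(33 × 34): rounding it to 0.011 inflates the result to 18.68.

The Kruskal-Wallis H table is VERY limited in terms of the sample size displayed. It is only really useful for very small sample sizes.

The critical value of H for larger sample sizes or where k > 5 is approximated by the χ2 table with k – 1 degrees of freedom, where k is the number of groups.

Therefore:

H = 15.33

Critical value: χ2critical = 5.991 (df = 2, α = 0.05)

Since 15.33 > 5.991, reject H0.

There is a significant difference in interest rates in 2007 among the regions (H = 15.33, p < 0.001).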
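As a cross-check, the worked example can be reproduced with the Python standard library. This is only a sketch: the variable names are illustrative, the values are copied from the interest table, and the p-value line relies on the fact that for exactly 2 degrees of freedom the χ2 upper-tail probability is exp(-x/2), which does not generalise to other df. Carried at full precision the statistic comes out at about 15.33; a hand calculation that rounds the coefficient 12/(N(N+1)) to 0.011 gives about 18.68 instead.

```python
import math

# Interest on international debt, 2007, copied from the table above.
interest = {
    "AF": [0.90, 1.34, 1.25, 6.75, 1.06, 1.14, 1.37, 0.78, 0.77, 0.79, 1.32],
    "LA": [6.55, 2.71, 7.25, 7.88, 5.58, 5.36, 5.85, 1.81, 7.21, 1.82,
           5.30, 6.92, 6.58],
    "SE": [2.11, 9.69, 1.54, 1.50, 6.77, 0.75, 5.20, 2.00, 2.99],
}

# Rank the pooled observations from lowest (rank 1) to highest; no ties here.
pooled = sorted(v for values in interest.values() for v in values)
rank_of = {v: r for r, v in enumerate(pooled, start=1)}

n_total = len(pooled)
rank_sums = {g: sum(rank_of[v] for v in vals) for g, vals in interest.items()}

h = (12.0 / (n_total * (n_total + 1))
     * sum(rank_sums[g] ** 2 / len(vals) for g, vals in interest.items())
     - 3 * (n_total + 1))

df = len(interest) - 1            # k - 1 = 2
alpha = 0.05
# Valid for df = 2 only: P(chi2 > x) = exp(-x / 2).
critical = -2.0 * math.log(alpha)
p_value = math.exp(-h / 2.0)

print(rank_sums)
print(round(h, 2), round(critical, 3), p_value < 0.001)
```

The rank sums match the table (92, 310, 159), the critical value matches the tabled 5.991, and the p-value falls below 0.001.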

Occasionally we find that we have tied ranks. There are two additional procedures that must be performed:

1. Give the tied ranks the average rank.
2. Apply the following adjustment to H: divide H by the correction factor C.

$$C = 1 - \frac{\sum_i (t_i^3 - t_i)}{N^3 - N}$$

where t_i is the number of observations tied at a given rank, and the sum runs over all sets of tied ranks.
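Both tie procedures can be sketched in Python. The helper names `average_ranks` and `tie_correction` and the toy values are assumptions made for illustration; the sketch ranks from lowest to highest and is not tied to any particular statistics package.

```python
from collections import Counter


def average_ranks(values):
    """Rank values from lowest (1) to highest; ties share their average rank."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    mean_rank = {v: sum(p) / len(p) for v, p in positions.items()}
    return [mean_rank[v] for v in values]


def tie_correction(values):
    """C = 1 - sum(t^3 - t) / (N^3 - N), with t the size of each tied set."""
    n_total = len(values)
    tie_sizes = [t for t in Counter(values).values() if t > 1]
    return 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)


toy = [10, 20, 20, 30]        # made-up values with one tied set of size 2
ranks = average_ranks(toy)    # the two 20s share ranks 2 and 3
c = tie_correction(toy)       # 1 - (2**3 - 2) / (4**3 - 4)
print(ranks, c)
```

In the toy case the two tied values each receive rank 2.5, and C = 1 - 6/60 = 0.9.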

Chilean Nitrate Processing Facilities

North         Prod    Middle               Prod    South        Prod
Jazpampa      40000   Agua Santa           150000  Buen Retiro  53000
La Patria     45000   Amelia               50000   Cala Cala    25000
Paccha        100000  Aurora               50000   Humberstone  80000
San Patricio  40000   Democracia           50000   Mercedes     40000
Santa Rita    45000   Primitiva            300000  Paposo       35000
Union         35000   Puntunchara          60000   Pena Chica   40000
                      Rosario de Huara     180000  San Donato   75000
                      San Jorge            100000  Sebastopol   70000
                      Santa Rosa de Huara  70000
                      Slavia               40000

North         Rank    Middle               Rank    South        Rank
Jazpampa      19      Agua Santa           3       Buen Retiro  11
La Patria     15.5    Amelia               13      Cala Cala    24
Paccha        4.5     Aurora               13      Humberstone  6
San Patricio  19      Democracia           13      Mercedes     19
Santa Rita    15.5    Primitiva            1       Paposo       22.5
Union         22.5    Puntunchara          10      Pena Chica   19
                      Rosario de Huara     2       San Donato   7
                      San Jorge            4.5     Sebastopol   8.5
                      Santa Rosa de Huara  8.5
                      Slavia               19

These data were ranked from largest production to smallest.
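The direction of ranking does not change H. As a short check, suppose every rank r is replaced by r' = N + 1 - r, so that each rank sum becomes R_i' = n_i(N+1) - R_i. Using the identities that the n_i sum to N and the R_i sum to N(N+1)/2:

$$\sum_{i=1}^{k} \frac{R_i'^2}{n_i} = (N+1)^2 \sum_{i=1}^{k} n_i - 2(N+1) \sum_{i=1}^{k} R_i + \sum_{i=1}^{k} \frac{R_i^2}{n_i} = N(N+1)^2 - N(N+1)^2 + \sum_{i=1}^{k} \frac{R_i^2}{n_i} = \sum_{i=1}^{k} \frac{R_i^2}{n_i}$$

The only data-dependent term in H is therefore unchanged, so ranking from largest to smallest gives the same statistic as ranking from smallest to largest.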

            North   Middle   South
Rank Sum    96      87       117
n           6       10       8

N = 24

$$H = \frac{12}{24(24+1)} \left( \frac{96^2}{6} + \frac{87^2}{10} + \frac{117^2}{8} \right) - 3(24+1)$$

$$H = (0.02 \times (1536 + 756.9 + 1711.1)) - 75$$

$$H = (0.02 \times 4004) - 75$$

$$H = 5.08$$

Facility             Rank
Primitiva            1
Rosario de Huara     2
Agua Santa           3
Paccha               4.5   t
San Jorge            4.5   t
Humberstone          6
San Donato           7
Santa Rosa de Huara  8.5   t
Sebastopol           8.5   t
Puntunchara          10
Buen Retiro          11
Amelia               13    t
Aurora               13    t
Democracia           13    t
La Patria            15.5  t
Santa Rita           15.5  t
Jazpampa             19    t
San Patricio         19    t
Slavia               19    t
Mercedes             19    t
Pena Chica           19    t
Union                22.5  t
Paposo               22.5  t
Cala Cala            24

We have many tied ranks (marked t):
• 4 sets of 2 tied ranks
• 1 set of 3 tied ranks
• 1 set of 5 tied ranks

$$C = 1 - \frac{(2^3-2)+(2^3-2)+(2^3-2)+(2^3-2)+(3^3-3)+(5^3-5)}{24^3-24}$$

$$C = 1 - \frac{6+6+6+6+24+120}{13800}$$

$$C = 1 - \frac{168}{13800}$$

$$C = 0.988$$

So the correction is:

$$H = \frac{5.08}{0.988} = 5.14$$

df = k – 1 = 3 – 1 = 2

Thus we get:

Critical χ2 value = 5.991. Since 5.14 < 5.991, we fail to reject H0.

There is no significant difference in nitrate production among the three oficina groups (Kruskal-Wallis χ2 = 5.14, 0.05 < p < 0.10).
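The nitrate example can be checked end to end with a short Python sketch. It starts from the ranks already assigned in the ranked table rather than re-ranking the production figures; the variable names are illustrative, and the p-value uses the closed form exp(-x/2), which holds only for 2 degrees of freedom.

```python
import math
from collections import Counter

# Ranks copied from the ranked table (largest production = rank 1).
ranks = {
    "North": [19, 15.5, 4.5, 19, 15.5, 22.5],
    "Middle": [3, 13, 13, 13, 1, 10, 2, 4.5, 8.5, 19],
    "South": [11, 24, 6, 19, 22.5, 19, 7, 8.5],
}

n_total = sum(len(r) for r in ranks.values())
rank_sums = {g: sum(r) for g, r in ranks.items()}

h = (12.0 / (n_total * (n_total + 1))
     * sum(rank_sums[g] ** 2 / len(r) for g, r in ranks.items())
     - 3 * (n_total + 1))

# Each rank value shared by t > 1 observations is one tied set of size t.
pooled = [r for group in ranks.values() for r in group]
tie_sizes = sorted(t for t in Counter(pooled).values() if t > 1)
c = 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)

h_corrected = h / c
p_value = math.exp(-h_corrected / 2.0)  # valid for df = 2 only

print(rank_sums, tie_sizes)
print(round(h, 2), round(c, 3), round(h_corrected, 2), round(p_value, 3))
```

The sketch recovers the rank sums 96, 87 and 117, the tied sets (four of size 2, one of size 3, one of size 5), H = 5.08, C = 0.988 and a corrected H of 5.14, with a p-value between 0.05 and 0.10.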

SPSS confirms our results.

In terms of correcting for ties:

• Ties in the data make the H value somewhat less than it should be, so the correction increases the size of H.

• Small ties (where 2 observations are tied) do not influence the results very much unless there are a VERY large number of them.

• Situations where there are multiple large ties (where 4 or 5 observations are tied) and where few of the ranks are not tied will have an influence on the results.
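The points above can be illustrated by evaluating the correction factor for a few tie patterns. This is a sketch under stated assumptions: N = 24 is borrowed from the nitrate example, but the three tie patterns are hypothetical and chosen only to contrast small ties with large ones.

```python
def tie_correction(tie_sizes, n_total):
    """C = 1 - sum(t^3 - t) / (N^3 - N) for the given tied-set sizes."""
    return 1.0 - sum(t ** 3 - t for t in tie_sizes) / (n_total ** 3 - n_total)


n_total = 24  # same N as the nitrate example; the tie patterns are hypothetical

scenarios = {
    "four pairs": [2, 2, 2, 2],
    "every value in a pair": [2] * 12,
    "four sets of 5, two pairs": [5, 5, 5, 5, 2, 2],
}
for label, sizes in scenarios.items():
    c = tie_correction(sizes, n_total)
    # H is divided by C, so the relative increase in H is 1/C - 1.
    print(f"{label}: C = {c:.4f}, H increases by {100 * (1 / c - 1):.2f}%")
```

Because each tied set contributes t³ - t, pairs add only 6 apiece while a set of 5 adds 120. Even when all 24 observations fall into pairs, H rises by roughly half a percent, whereas four sets of 5 raise it by several percent.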

Thoughts on tied ranks:

If your data has a very large number of ties then it lacks variation.

A lack of variation in the data makes it difficult to say anything meaningful about any differences you may happen to find.

