15 - 1
Copyright © 2004 McGraw-Hill Ryerson Limited. All rights reserved.
When you have completed this chapter, you will be able to:
Understand the nature and role of the chi-square distribution
Identify a wide variety of uses of the chi-square distribution
Conduct a test of hypothesis comparing an observed frequency distribution to an expected frequency distribution
Conduct a hypothesis test to determine whether two attributes are independent
Conduct a test of hypothesis for normality using the chi-square distribution
Characteristics of the Chi-Square Distribution
… it is positively skewed
… it is non-negative
… it is based on degrees of freedom
… when the degrees of freedom change, a new distribution is created, e.g. df = 3, df = 5, df = 10
[Figure: chi-square distribution curves for df = 3, df = 5, and df = 10]
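The characteristics listed above can be checked numerically. The following is a minimal sketch, not part of the original slides; the helper names chi2_pdf and chi2_skewness are illustrative. It relies on standard properties of the chi-square distribution: the mean equals df, the mode is df - 2 (for df of at least 2), and the skewness is sqrt(8/df), which stays positive but shrinks as df grows.

```python
import math

def chi2_pdf(x, df):
    """Chi-square density at x; there is no probability below zero."""
    if x <= 0:
        return 0.0  # the density at exactly 0 is also 0 whenever df > 2
    half = df / 2
    log_density = (half - 1) * math.log(x) - x / 2 - half * math.log(2) - math.lgamma(half)
    return math.exp(log_density)

def chi2_skewness(df):
    """Skewness of the chi-square distribution: always positive, smaller for larger df."""
    return math.sqrt(8 / df)

# Each df gives a different curve: the peak moves right and the skew decreases.
for df in (3, 5, 10):
    mode = df - 2
    print(f"df={df:2d}  mean={df}  mode={mode}  "
          f"skewness={chi2_skewness(df):.3f}  density at mode={chi2_pdf(mode, df):.4f}")
```

Changing the tuple (3, 5, 10) to other values shows how each choice of degrees of freedom produces a different distribution.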
Goodness-of-Fit Test: Equal Expected Frequencies
Let fo and fe be the observed and expected frequencies, respectively
H0: There is no difference between the observed and expected frequencies
H1: There is a difference between the observed and the expected frequencies
… the test statistic is:
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
… the critical value is a chi-square value with (k - 1) degrees of freedom, where k is the number of categories
The following information shows the number of employees absent by day of the week at a large manufacturing plant.

Day          Frequency
Monday       120
Tuesday       45
Wednesday     60
Thursday      90
Friday       130
Total        445

At the .05 level of significance, is there a difference in the absence rate by day of the week?
Hypothesis Test
Step 1: H0: There is no difference in absence rate by day of the week. H1: Absence rates by day are not all equal.
Step 2: α = 0.05
Step 3: Use the chi-square test
Step 4: Degrees of freedom = (5 - 1) = 4. Reject H0 if χ² > 9.488 (see Appendix I).
Expected frequency for each day: (120 + 45 + 60 + 90 + 130)/5 = 89
Using the Table…
Right-tail area: α = 0.05
Degrees of freedom: 5 - 1 = 4
Reject H0 if χ² > 9.488
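The table lookup can also be reproduced numerically. The sketch below is an illustration added for checking, not the method used in the slides. It assumes integer degrees of freedom, uses the closed-form right-tail area of the chi-square distribution, and finds the critical value by bisection. The names chi2_sf and chi2_critical are hypothetical helpers.

```python
import math

def chi2_sf(x, df):
    """Right-tail area P(X > x) for a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over r = 0 .. df/2 - 1 of (x/2)^r / r!
        term, total = 1.0, 1.0
        for r in range(1, df // 2):
            term *= (x / 2) / r
            total += term
        return math.exp(-x / 2) * total
    # Odd df: two-sided normal tail plus a finite correction series.
    chi = math.sqrt(x)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + math.sqrt(2 / math.pi) * math.exp(-x / 2) * total

def chi2_critical(alpha, df):
    """Value whose right-tail area equals alpha, located by bisection."""
    lo, hi = 0.0, 1.0
    while chi2_sf(hi, df) > alpha:   # widen the bracket until the tail is small enough
        hi *= 2
    for _ in range(100):
        mid = (lo + hi) / 2
        if chi2_sf(mid, df) > alpha:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(f"alpha = 0.05, df = 4: critical value = {chi2_critical(0.05, 4):.3f}")
```

The same helper can be called with other degrees of freedom to check the remaining critical values quoted in this chapter.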
Step 5: Test statistic
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]

Day          Frequency (fo)   Expected (fe)   (fo - fe)²/fe
Monday       120              89              10.80
Tuesday       45              89              21.75
Wednesday     60              89               9.45
Thursday      90              89               0.01
Friday       130              89              18.89
Total        445              445             60.90

For example, for Monday: (120 - 89)²/89 = 10.80
χ² = 60.90. Reject H0 if χ² > 9.488.
Since 60.90 > 9.488, reject the null hypothesis. Absentee rates are not the same for each day of the week.
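The absenteeism calculation can be verified with a few lines of code. This is a minimal sketch using the frequencies from the example; the variable names are illustrative and the critical value 9.488 is taken from the table rather than computed.

```python
# Observed absences by day of the week (from the example).
observed = {"Monday": 120, "Tuesday": 45, "Wednesday": 60, "Thursday": 90, "Friday": 130}

k = len(observed)                      # number of categories
total = sum(observed.values())         # 445 absences in all
expected = total / k                   # equal expected frequencies: 445 / 5 = 89

# Chi-square statistic: sum of (fo - fe)^2 / fe over all categories.
chi_square = sum((fo - expected) ** 2 / expected for fo in observed.values())

critical_value = 9.488                 # table value for alpha = 0.05, df = k - 1 = 4

print(f"expected per day = {expected}")
print(f"chi-square = {chi_square:.2f}")
print("reject H0" if chi_square > critical_value else "do not reject H0")
```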
The U.S. Bureau of the Census indicated that…

Status                       U.S. population
Married                      63.9%
Widowed                       7.7%
Divorced (not re-married)     6.9%
Single (never married)       21.5%

A sample of 500 adults from the Philadelphia area showed:

Married   Widowed   Divorced   Single
310       40        30         120

At the .05 significance level, can we conclude that the Philadelphia area is different from the U.S. as a whole?
… continued

Status      fo     Expected (fe)   (fo - fe)²/fe
Married     310    319.5 *          .2825 **
Widowed      40     38.5            .0584
Divorced     30     34.5            .5870
Single      120    107.5           1.4535
Total       500                    2.3814

* Census figures would predict: .639 × 500 = 319.5
** Our sample: (310 - 319.5)²/319.5 = .2825
χ² = 2.3814
… continued
Step 1: H0: The distribution has not changed. H1: The distribution has changed.
Step 2: α = 0.05
Step 3: H0 is rejected if χ² > 7.815, df = 3
Step 4: χ² = 2.3814
Since 2.3814 < 7.815, the null hypothesis is not rejected. There is insufficient evidence that the distribution of marital status in Philadelphia is different from the rest of the United States.
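When the expected frequencies are unequal, each one is the hypothesized proportion times the sample size. The sketch below reruns the marital-status arithmetic from the example under that rule; the dictionary and variable names are illustrative, and the critical value 7.815 is the table value for α = 0.05 with 3 degrees of freedom.

```python
# Census proportions (hypothesized) and the Philadelphia sample counts (observed).
census_share = {"Married": 0.639, "Widowed": 0.077, "Divorced": 0.069, "Single": 0.215}
sample = {"Married": 310, "Widowed": 40, "Divorced": 30, "Single": 120}

n = sum(sample.values())                                   # 500 adults
expected = {status: share * n for status, share in census_share.items()}

# Chi-square statistic: sum of (fo - fe)^2 / fe over the four categories.
chi_square = sum((sample[s] - expected[s]) ** 2 / expected[s] for s in sample)

critical_value = 7.815                                     # alpha = 0.05, df = 4 - 1 = 3

for status in sample:
    print(f"{status:9s} fo={sample[status]:4d}  fe={expected[status]:6.1f}")
print(f"chi-square = {chi_square:.4f}")
print("reject H0" if chi_square > critical_value else "do not reject H0")
```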
Goodness-of-Fit Test: Normality
… the test investigates if the observed frequencies in a frequency distribution match the theoretical normal distribution
- Determine the mean and standard deviation of the frequency distribution
- Compute the z-value for the lower class limit and the upper class limit for each class
- Determine fe for each category
- Use the chi-square goodness-of-fit test to determine if fo coincides with fe
A sample of 500 donations to the Arthritis Foundation is reported in the following frequency distribution.
Is it reasonable to conclude that the donations are normally distributed with a mean of $10 and a standard deviation of $2?
Use the .05 significance level
… continued

Amount Spent      fo     Area    fe     (fo - fe)²/fe
<$6                20
$6 up to $8        60
$8 up to $10      140
$10 up to $12     120
$12 up to $14      90
>$14               70
Total             500
… continued
To compute fe for the first class, first determine the z-value:
\[ z = \frac{X - \mu}{\sigma} = \frac{6 - 10}{2} = -2.00 \]
Now… find the probability of a z-value less than -2.00:
\[ P(z < -2.00) = 0.5000 - 0.4772 = 0.0228 \]
… continued

Amount Spent      fo     Area    fe     (fo - fe)²/fe
<$6                20    .02
$6 up to $8        60    .14
$8 up to $10      140    .34
$10 up to $12     120    .34
$12 up to $14      90    .14
>$14               70    .02
Total             500
… continued
The expected frequency is the probability of a z-value less than -2.00 times the sample size:
\[ f_e = (0.0228)(500) = 11.40 \]
The other expected frequencies are computed similarly
Amount Spent      fo     Area    fe        (fo - fe)²/fe
<$6                20    .02      11.40      6.49
$6 up to $8        60    .14      67.95       .93
$8 up to $10      140    .34     170.65      5.50
$10 up to $12     120    .34     170.65     15.03
$12 up to $14      90    .14      67.95      7.16
>$14               70    .02      11.40    301.22
Total             500            500       336.33
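The expected frequencies in the table can be regenerated from the normal curve. The sketch below is an illustration, assuming the hypothesized mean of $10 and standard deviation of $2 and using the error function for the normal cumulative probability; normal_cdf is a hypothetical helper name. Because it uses unrounded areas rather than four-decimal table values, its results differ slightly from the table (for example, about 11.38 instead of 11.40 for the first class).

```python
import math

def normal_cdf(x, mean, sd):
    """Cumulative probability P(X < x) for a normal distribution."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))

mean, sd = 10.0, 2.0                                # hypothesized parameters
upper_limits = [6, 8, 10, 12, 14, math.inf]         # upper class limits; last class is open-ended
observed = [20, 60, 140, 120, 90, 70]
n = sum(observed)                                   # 500 donations

# Expected frequency of a class = (area between its limits) * sample size.
expected = []
previous_cdf = 0.0
for upper in upper_limits:
    cdf = 1.0 if math.isinf(upper) else normal_cdf(upper, mean, sd)
    expected.append((cdf - previous_cdf) * n)
    previous_cdf = cdf

chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

for fo, fe in zip(observed, expected):
    print(f"fo={fo:4d}  fe={fe:7.2f}  contribution={(fo - fe) ** 2 / fe:7.2f}")
print(f"chi-square = {chi_square:.2f}")
```

The last class dominates the statistic: 70 donations above $14 were observed where only about 11 would be expected under the hypothesized normal distribution.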
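… continued
Step 1: H0: The observations follow the normal distribution. H1: The observations do NOT follow the normal distribution.
Step 2: α = 0.05
Step 3: H0 is rejected if χ² > 7.815, df = 3 (6 classes, less 1, less 2 for the mean and standard deviation). With df = k - 1 = 5 the critical value would be 11.070; the decision is the same either way.
Step 4: χ² = 336.33
H0 is rejected. The observations do NOT follow the normal distribution.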
A contingency table is used to investigate whether two traits or characteristics are related
… each observation is classified according to two criteria
… the usual hypothesis testing procedure is used
… the degrees of freedom are equal to: (number of rows - 1)(number of columns - 1)
… the expected frequency is computed as: Expected Frequency = (row total)(column total)/grand total
Is there a relationship between the location of an accident and the gender of the person involved in the accident?
A sample of 150 accidents reported to the police was classified by location and gender.
At the .05 level of significance, can we conclude that gender and the location of the accident are related?
… continued

              Location
Sex       Work   Home   Other   Total
Male        60     20      10      90
Female      20     30      10      60
Total       80     50      20     150

The expected frequency for the work-male intersection is computed as (90)(80)/150 = 48
Similarly, you can compute the expected frequencies for the other cells
… continued
Step 1: H0: Gender and location are NOT related. H1: Gender and location are related.
Step 2: α = 0.05
Step 3: H0 is rejected if χ² > 5.991, df = 2 (… there are (2 - 1)(3 - 1) = 2 degrees of freedom)
Step 4: Find the value of χ²:
\[ \chi^2 = \frac{(60 - 48)^2}{48} + \dots + \frac{(10 - 8)^2}{8} = 16.667 \]
Since 16.667 > 5.991, H0 is rejected. Gender and location are related!
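The full contingency-table calculation, including the cells abbreviated by the dots in the formula, can be sketched as follows. The code uses the accident counts from the example; the variable names are illustrative and the critical value 5.991 is the table value for α = 0.05 with 2 degrees of freedom.

```python
# Observed counts: rows are sex, columns are accident location.
observed = [
    [60, 20, 10],   # Male:   Work, Home, Other
    [20, 30, 10],   # Female: Work, Home, Other
]

row_totals = [sum(row) for row in observed]          # [90, 60]
col_totals = [sum(col) for col in zip(*observed)]    # [80, 50, 20]
grand_total = sum(row_totals)                        # 150

# Expected frequency for each cell = (row total)(column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-square statistic summed over all six cells.
chi_square = sum(
    (observed[i][j] - expected[i][j]) ** 2 / expected[i][j]
    for i in range(len(observed))
    for j in range(len(observed[0]))
)

df = (len(observed) - 1) * (len(observed[0]) - 1)    # (rows - 1)(columns - 1)
critical_value = 5.991                               # alpha = 0.05, df = 2

print(f"expected = {expected}")
print(f"chi-square = {chi_square:.3f}, df = {df}")
print("reject H0" if chi_square > critical_value else "do not reject H0")
```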
Test your learning…
www.mcgrawhill.ca/college/lind
Click on… Online Learning Centre for quizzes, extra content, data sets, searchable glossary, access to Statistics Canada’s E-Stat data… and much more!
This completes Chapter 15