Date post: | 04-Jan-2016 |
Upload: | shannon-boyd |
Chapter-8 Chi-square test
Ⅰ The mathematical properties of the chi-square distribution
Types of chi-square tests
Chi-square test
Chi-square distribution
1. Tests of goodness-of-fit
Test whether the observed frequencies of one variable differ significantly from the expected frequencies of the same variable.
E.g. occurrences of heads and tails while flipping a coin.
2. Chi-square tests of independence (or relationship)
Test whether two variables are associated with, or independent of, each other.
E.g. association between smoking and lung cancer.
The chi-square test of independence is probably the most frequently used hypothesis test in medicine.
In this chapter, we will use the chi-square test to evaluate differences among populations when the test variable is nominal, dichotomous, ordinal, or grouped interval.
Chi-square test
Independence Defined
Two variables are independent if, for all cases, the
classification of a case into a particular category of one
variable (the group variable) has no effect on the
probability that the case will fall into any particular
category of the second variable (the test variable).
When two variables are independent, there is no relationship between them. We would expect the frequency breakdown of the test variable to be similar for all groups.
Independence Demonstrated
Suppose we are interested in the relationship between gender and attending college.
If there is no relationship between gender and attending college and 40% of our total sample attend college, we would expect 40% of the males in our sample to attend college and 40% of the females to attend college.
If there is a relationship between gender and attending college, we would expect a higher proportion of one group to attend college than the other group, e.g. 60% to 20%.
Displaying Independent and Dependent Relationships
[Figure: bar chart "Independent Relationship between Gender and College". Vertical axis: proportion attending college (0% to 100%). Bars: Males 40%, Females 40%, Total 40%.]
[Figure: bar chart "Dependent Relationship between Gender and College". Vertical axis: proportion attending college (0% to 100%). Bars: Males 60%, Females 20%, Total 40%.]
When the variables are independent, the proportion in each group is close to the proportion for the total sample.
When group membership makes a difference, the dependent relationship shows up as one group having a higher proportion, and the other a lower proportion, than the total sample.
Independent and Dependent Variables
The two variables in a chi-square test of independence each play a specific role. The group variable is also known as the independent
variable because it has an influence on the test variable.
The test variable is also known as the dependent variable because its value is believed to be dependent on the value of the group variable.
The chi-square test of independence is a test of the influence or impact that a subject’s value on one variable has on the same subject’s value for a second variable.
Chi square distribution
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
where O is the observed frequency and E is the expected frequency.
Expected frequencies are computed as if there is no difference between the groups, i.e. both groups have the same proportion.
This formula measures how much the pattern of observed frequencies differs from the pattern of expected frequencies.
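As a minimal sketch, the chi-square formula can be evaluated directly from paired observed and expected frequencies. The function name `chi_square_statistic` and the coin-flip counts (58 heads and 42 tails in 100 flips) are hypothetical and serve only as an illustration; they do not come from the slides.

```python
def chi_square_statistic(observed, expected):
    """Return the sum of (O - E)^2 / E over all cells."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical data: 100 flips of a coin assumed fair, so E = 50 per outcome.
observed = [58, 42]
expected = [50, 50]
stat = chi_square_statistic(observed, expected)
print(round(stat, 2))
```

The statistic is zero when every observed count equals its expected count and grows as the counts move apart.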
1. The chi-square distribution is a nonsymmetrical distribution.
2. The shape of a chi-square distribution is determined by its degrees of freedom.
Chi square test statistic
Cannot be negative, because all discrepancies are squared.
Will be zero only in the unusual event that each observed frequency exactly equals the corresponding expected frequency.
The larger the discrepancy between the expected frequencies and their corresponding observed frequencies, the larger the observed value of chi-square.
Table 2.1 Partial Table of Critical Values of Chi-Square
Probability for chi square test statistic can be obtained
from the chi-square probability distribution.
[Figure: chi-square distribution curve with the upper-tail rejection region of area 0.05 marked.]
The decision rule
The quantity
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
will be small if the observed and expected frequencies are close together and will be large if the differences are large.
The computed value of χ² is compared with the tabulated value of χ² with k − 1 degrees of freedom. The decision rule, then, is: reject H0 if the computed χ² is greater than or equal to the tabulated χ² for the chosen value of α.
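The decision rule can also be sketched in code. The block below is an illustration rather than part of the original slides: `chi2_sf` and `reject_h0` are hypothetical helper names. `chi2_sf` evaluates the upper-tail probability of a chi-square distribution with integer degrees of freedom from the standard closed-form series, and `reject_h0` compares that probability with α, which amounts to comparing the statistic with the exact critical value instead of a rounded table entry.

```python
import math


def chi2_sf(x, df):
    """Upper-tail probability P(X >= x) for chi-square with integer df >= 1."""
    if df < 1 or int(df) != df:
        raise ValueError("df must be a positive integer")
    if x <= 0:
        return 1.0
    if df % 2 == 0:
        # Even df: exp(-x/2) * sum over i < df/2 of (x/2)^i / i!
        term, total = 1.0, 1.0
        for i in range(1, df // 2):
            term *= (x / 2) / i
            total += term
        return math.exp(-x / 2) * total
    # Odd df: erfc(sqrt(x/2)) + 2*Z(chi) * sum of chi^(2r-1) / (1*3*...*(2r-1)),
    # where chi = sqrt(x) and Z is the standard normal density.
    chi = math.sqrt(x)
    z = math.exp(-x / 2) / math.sqrt(2 * math.pi)
    term, total = chi, 0.0
    for r in range(1, (df - 1) // 2 + 1):
        if r > 1:
            term *= x / (2 * r - 1)
        total += term
    return math.erfc(chi / math.sqrt(2)) + 2 * z * total


def reject_h0(statistic, df, alpha=0.05):
    """Reject H0 when the upper-tail probability is at most alpha."""
    return chi2_sf(statistic, df) <= alpha


# Tabulated 0.05 critical values for df = 1 and df = 2.
print(round(chi2_sf(3.84, 1), 3))
print(round(chi2_sf(5.99, 2), 3))
```

Because table entries such as 3.84 are rounded, a statistic that sits exactly on a tabulated value can fall on either side of the exact cut-off.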
Ⅱ Chi-Square test (tests of goodness-of-fit)
Model assumption: no cell has an expected frequency less than 5.
$$\chi^2 = \sum \frac{(O - E)^2}{E}$$
If at least one cell has an expected frequency less than 5, use the corrected form:
$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E}$$
Degrees of freedom: k − 1, where k is the number of outcomes.
Example 1
As personnel director, you want to test the perception of
fairness of three methods of performance evaluation. Of 180
employees, 63 rated Method 1 as fair. 45 rated Method 2 as
fair. 72 rated Method 3 as fair. At the 0.05 level, is there a
difference in perceptions?
H0: p1 = p2 = p3 = 1/3
H1: At least one proportion is different
α = 0.05
$$E_1 = E_2 = E_3 = 180 \times \frac{1}{3} = 60$$
$$\chi^2 = \sum_{\text{all cells}} \frac{(O_i - E_i)^2}{E_i} = \frac{(63 - 60)^2}{60} + \frac{(45 - 60)^2}{60} + \frac{(72 - 60)^2}{60} = 6.3$$
With df = k − 1 = 2, the tabulated value is χ²(0.05, 2) = 5.99, and 6.3 > 5.99.
Reject H0 at α = 0.05. There is evidence of a difference in proportions.
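The arithmetic of Example 1 can be checked with a short script. This is a sketch for verification only. The variable names are illustrative, and the critical value 5.99 is the tabulated χ² for α = 0.05 with 2 degrees of freedom.

```python
observed = [63, 45, 72]      # employees rating Methods 1, 2, 3 as fair
n = sum(observed)            # 180 employees
k = len(observed)            # 3 outcomes
expected = [n / k] * k       # under H0 each method gets 1/3, so E = 60

chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = k - 1
critical_value = 5.99        # tabulated value for alpha = 0.05, df = 2

print(round(chi_square, 2), df, chi_square >= critical_value)
```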
Exercise 1
Ask 100 people (n) which of 3 candidates (k) they will vote for. At the 0.05 level, is there a difference in support among the candidates?
Candidate
Tom Bill Mary Total
35 20 45 100
Ⅲ Chi-Square test (tests of independence or relationship)
1. Hypothesis test for a 2×2 table
Choice of method:
n ≥ 40 and E ≥ 5: Pearson chi-square
n ≥ 40 and 1 ≤ E < 5: continuity correction of chi-square
n < 40 or E < 1: Fisher's exact test
Pearson chi-square (n ≥ 40 and E ≥ 5)
$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(ad - bc)^2 n}{(a+b)(c+d)(a+c)(b+d)}$$
Continuity correction of chi-square (n ≥ 40 and 1 ≤ E < 5)
$$\chi^2 = \sum \frac{(|O - E| - 0.5)^2}{E} = \frac{(|ad - bc| - n/2)^2 n}{(a+b)(c+d)(a+c)(b+d)}$$
Fisher's exact test (n < 40 or E < 1)
$$P = \frac{(a+b)!\,(c+d)!\,(a+c)!\,(b+d)!}{a!\,b!\,c!\,d!\,n!}$$
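The three cases for a 2×2 table can be sketched as small functions. All function names here are hypothetical, and the block assumes the cell layout a, b (first row) and c, d (second row). Note that the factorial formula gives the probability of one particular table with fixed margins; a complete Fisher's exact test sums such probabilities over all tables at least as extreme as the observed one, which this sketch does not do.

```python
import math


def choose_method(n, min_expected):
    """Pick the 2x2 method from sample size n and the smallest expected frequency."""
    if n < 40 or min_expected < 1:
        return "fisher"
    if min_expected < 5:
        return "continuity-corrected"
    return "pearson"


def pearson_2x2(a, b, c, d):
    """Pearson chi-square via the (ad - bc)^2 n shortcut."""
    n = a + b + c + d
    return (a * d - b * c) ** 2 * n / ((a + b) * (c + d) * (a + c) * (b + d))


def corrected_2x2(a, b, c, d):
    """Continuity-corrected chi-square via the (|ad - bc| - n/2)^2 n shortcut."""
    n = a + b + c + d
    return (abs(a * d - b * c) - n / 2) ** 2 * n / (
        (a + b) * (c + d) * (a + c) * (b + d)
    )


def fisher_table_probability(a, b, c, d):
    """Probability of this exact table given its margins (single table only)."""
    f = math.factorial
    n = a + b + c + d
    numerator = f(a + b) * f(c + d) * f(a + c) * f(b + d)
    return numerator / (f(a) * f(b) * f(c) * f(d) * f(n))


# Hypothetical small table (n = 8), used only to exercise the Fisher formula.
print(choose_method(8, 2.0))
print(round(fisher_table_probability(3, 1, 1, 3), 4))
```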
Example 2
A sample of 200 college students participated in
a study designed to evaluate the level of college
students’ knowledge of a certain group of
common diseases. The following table shows
the students classified by major field of study
and level of knowledge of the group of diseases:
major good poor total
premedical 16 24 40
other 20 140 160
total 36 164 200
Do these data suggest that there is a relationship between
knowledge of the group of diseases and major field of study
of the college students from which the present sample was
drawn? Let α=0.05.
major good poor total
premedical a b R1
other c d R2
total C1 C2 n
The four cells a, b, c, and d make up the four-fold (2×2) table.
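Observed cells
16 24
20 140
Expected cells
7.2 32.8
28.8 131.2
$$E_{11} = \frac{40 \times 36}{200} = 7.2; \quad E_{12} = \frac{40 \times 164}{200} = 32.8$$
$$E_{21} = \frac{160 \times 36}{200} = 28.8; \quad E_{22} = \frac{160 \times 164}{200} = 131.2$$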
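The expected cells follow from the row and column totals alone: each one is (row total × column total) / n. A minimal sketch using the observed counts of Example 2 (variable names are illustrative):

```python
observed = [[16, 24],     # premedical: good, poor
            [20, 140]]    # other: good, poor

row_totals = [sum(row) for row in observed]          # 40 and 160
col_totals = [sum(col) for col in zip(*observed)]    # 36 and 164
n = sum(row_totals)                                  # 200

# Expected frequency under independence: row total * column total / n
expected = [[r * c / n for c in col_totals] for r in row_totals]
for row in expected:
    print([round(e, 1) for e in row])
```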
H0: there is no relationship between knowledge and major field (the two are independent)
H1: there is a relationship between knowledge and major field
α = 0.05
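$$\chi^2 = \sum \frac{(O - E)^2}{E} = \frac{(16 - 7.2)^2}{7.2} + \frac{(24 - 32.8)^2}{32.8} + \frac{(20 - 28.8)^2}{28.8} + \frac{(140 - 131.2)^2}{131.2} = 16.396$$
or, equivalently,
$$\chi^2 = \frac{(ad - bc)^2 n}{(a+b)(c+d)(a+c)(b+d)} = \frac{(16 \times 140 - 24 \times 20)^2 \times 200}{40 \times 160 \times 36 \times 164} = 16.396$$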
df = (R − 1)(C − 1) = 1
$$\chi^2_{0.05,1} = 3.84$$
Reject H0 at α = 0.05, since 16.396 > 3.84.
There is a relationship between knowledge of the group of diseases and major field of study of the college students. Premedical students have a higher rate of good knowledge of the diseases (16/40 = 40%) than other students (20/160 = 12.5%).
Exercise 2
A study was conducted to determine whether the antibody status of wives is related to the antibody status of their husbands. A total of 45 couples were examined; the data on the incidence of anti-sperm antibodies are as follows:
Ab of wife Ab of husband
- + Total
- 8 10 18
+ 4 23 27
total 12 33 45
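Question: Is the antibody status of wives related to the antibody status of their husbands?
H0: the antibody status of wives is not related to the antibody status of their husbands
H1: the antibody status of wives is related to the antibody status of their husbands
α = 0.05
The smallest expected frequency is 18 × 12 / 45 = 4.8, so n ≥ 40 and 1 ≤ E < 5, and the continuity-corrected chi-square applies:
$$\chi^2 = \frac{(|ad - bc| - n/2)^2 n}{(a+b)(c+d)(a+c)(b+d)} = 3.452$$
Since 3.452 < χ²(0.05, 1) = 3.84, do not reject H0. We cannot conclude that the antibody status of wives is related to the antibody status of their husbands.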
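The figures for Exercise 2 can be reproduced from the table. This is a sketch for checking the arithmetic. The cell layout (a, b in the first row, c, d in the second) and the variable names are assumptions of this block, and 3.84 is the tabulated χ² for α = 0.05 with 1 degree of freedom.

```python
a, b, c, d = 8, 10, 4, 23          # cells of the wife-by-husband antibody table
n = a + b + c + d                  # total number of couples in the table
rows = (a + b, c + d)              # row totals: 18 and 27
cols = (a + c, b + d)              # column totals: 12 and 33

# Smallest expected frequency decides which 2x2 formula applies.
min_expected = min(r * col / n for r in rows for col in cols)

# Continuity-corrected chi-square (used because 1 <= min_expected < 5 and n >= 40).
chi_square = (abs(a * d - b * c) - n / 2) ** 2 * n / (
    rows[0] * rows[1] * cols[0] * cols[1]
)
critical_value = 3.84              # tabulated value for alpha = 0.05, df = 1

print(n, round(min_expected, 1), round(chi_square, 3), chi_square >= critical_value)
```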
2. Hypothesis test for an R×C table
Pearson chi-square for an R×C table
$$\chi^2 = \sum \frac{(O - E)^2}{E} = n \left( \sum \frac{O^2}{n_R n_C} - 1 \right)$$
where n_R and n_C are the row total and column total for each cell.
Model assumptions:
The expected frequency should be at least 5 in at least 4/5 of the cells;
The expected frequency in every cell should be at least 1.
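A short sketch can confirm that the two forms of the R×C statistic agree and can check the model assumptions. The function names are hypothetical, and the thresholds are read here as "at least 5" and "at least 1", which is an interpretation of the assumptions as stated. The 2×2 table from Example 2 is reused as the simplest R×C case.

```python
def rxc_chi_square(table):
    """Return (definitional statistic, shortcut statistic, df, expected table)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    # Definitional form: sum of (O - E)^2 / E over all cells.
    definitional = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    # Shortcut form: n * (sum of O^2 / (row total * column total) - 1).
    shortcut = n * (
        sum(
            o ** 2 / (r * c)
            for obs_row, r in zip(table, row_totals)
            for o, c in zip(obs_row, col_totals)
        )
        - 1
    )
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return definitional, shortcut, df, expected


def assumptions_hold(expected):
    """At most 1/5 of cells with E < 5, and no cell with E < 1 (assumed reading)."""
    cells = [e for row in expected for e in row]
    small = sum(1 for e in cells if e < 5)
    return min(cells) >= 1 and small <= len(cells) / 5


definitional, shortcut, df, expected = rxc_chi_square([[16, 24], [20, 140]])
print(round(definitional, 3), round(shortcut, 3), df, assumptions_hold(expected))
```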
Example 3 To study menstrual dysfunction in distance runners, investigators carried out an observational study of three groups of women. The first two groups were volunteers who regularly engaged in some form of running, and the third, a control group, consisted of women who did not run but were otherwise similar to the other two groups. The runners were divided into joggers, who jog "slow and easy" 5 to 30 miles per week, and runners, who run more than 30 miles per week and combine long, slow distance with speed work. The investigators used a survey to show that the three groups were similar in the amount of physical activity (aside from running), distribution of ages, heights, occupations, and type of birth control methods being used.
Are these data consistent with the hypothesis that running does not increase the likelihood that a woman will consult her physician for a menstrual problem?
Table 5-6 shows these expected frequencies, together with the expected frequencies of women who did not consult their physicians.
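H0: π1 = π2 = π3
H1: At least one proportion differs from the others
α = 0.05
$$E_{11} = \frac{69 \times 54}{165} = 22.58$$
$$\chi^2 = \frac{(14 - 22.58)^2}{22.58} + \cdots + \frac{(40 - 31.42)^2}{31.42} = 9.627$$
df = (r − 1)(c − 1) = (3 − 1)(2 − 1) = 2
$$\chi^2_{0.05,2} = 5.99$$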
Reject H0 at the 0.05 level, since 9.627 > 5.99. The data suggest that running increases the likelihood that a woman will consult her physician for a menstrual problem.
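For 2 degrees of freedom the chi-square upper-tail probability has the simple closed form exp(−x/2), so the tabulated value 5.99 and an approximate p-value for this example can be checked without a table. This is a sketch that takes the computed statistic 9.627 as given; the variable names are illustrative.

```python
import math

chi_square = 9.627    # computed statistic from Example 3
df = 2                # (3 - 1)(2 - 1)
alpha = 0.05

# With df = 2, P(X >= x) = exp(-x / 2); this shortcut does not hold for other df.
p_value = math.exp(-chi_square / 2)
critical_value = -2 * math.log(alpha)   # solves exp(-x / 2) = alpha

print(round(p_value, 4), round(critical_value, 2), p_value < alpha)
```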