Date post: | 29-Dec-2015 |
Upload: | arnold-booth |
Chapter 13 – 1
Chapter 9: Testing Hypotheses
• Overview
• Research and null hypotheses
• One- and two-tailed tests
• Errors
• Testing the difference between two means
• t tests
Overview
Type of analysis, by level of measurement of the dependent and independent variables:
• Nominal dependent, nominal independent: considers the distribution of one variable across the categories of another variable. You already know how to deal with two nominal variables.
• Interval dependent, nominal independent: considers the difference between the mean of one group on a variable and that of another group.
• Nominal dependent, interval independent: considers how a change in a variable affects a discrete outcome.
• Interval dependent, interval independent: considers the degree to which a change in one variable results in a change in another.
Overview
Type of analysis, by level of measurement of the dependent and independent variables:
• Nominal dependent, nominal independent: Lambda. You already know how to deal with two nominal variables.
• Interval dependent, nominal independent: considers the difference between the mean of one group on a variable and that of another group. TODAY! Testing the differences between groups.
• Nominal dependent, interval independent: considers how a change in a variable affects a discrete outcome.
• Interval dependent, interval independent: considers the degree to which a change in one variable results in a change in another.
Overview
Type of analysis, by level of measurement of the dependent and independent variables:
• Nominal dependent, nominal independent: Lambda. You already know how to deal with two nominal variables.
• Interval dependent, nominal independent: confidence intervals and the t-test. TODAY! Testing the differences between groups.
• Nominal dependent, interval independent: considers how a change in a variable affects a discrete outcome.
• Interval dependent, interval independent: considers the degree to which a change in one variable results in a change in another.
Example
• Draw a random sample of 100 African Americans from the GSS 1998.
• Calculate the mean earnings: $24,100.
• Based on census information, the mean earnings for Americans is $28,985.
• Is the observed gap ($28,985 - $24,100) large enough to convince us that the sample we drew is not representative of the population?
Example
Two explanations are possible:
• The average earnings of African Americans are indeed lower than the national average.
• The average earnings of African Americans are about the same as the national average, and this sample happens to show a particularly low mean.
General Examples
[Diagram, shown three times: dependent variable (interval or nominal) by independent variable (nominal or interval)]
Is one group scoring significantly higher on average than another group?
Is a group statistically different from another on a particular dimension?
Is Group A’s mean higher than Group B’s?
Specific Examples
[Diagram, shown three times: dependent variable (interval or nominal) by independent variable (nominal or interval)]
Do people living in rural communities live longer than those in urban or suburban areas?
Do students from private high schools perform better in college than those from public high schools?
Is the average number of years with an employer lower or higher for large firms (over 100 employees) compared to those with fewer than 100 employees?
Testing Hypotheses
• Statistical hypothesis testing – A procedure that allows us to evaluate hypotheses about population parameters based on sample statistics.
• Research hypothesis (H1) – A statement reflecting the substantive hypothesis. It is always expressed in terms of population parameters, but its specific form varies from test to test.
• Null hypothesis (H0) – A statement of “no difference,” which contradicts the research hypothesis and is always expressed in terms of population parameters.
Research and Null Hypotheses
One Tail: specifies the hypothesized direction
• Research Hypothesis: $H_1: \mu_2 > \mu_1$, or $\mu_2 - \mu_1 > 0$
• Null Hypothesis: $H_0: \mu_2 = \mu_1$, or $\mu_2 - \mu_1 = 0$
Two Tail: direction is not specified (more common)
• Research Hypothesis: $H_1: \mu_2 \neq \mu_1$, or $\mu_2 - \mu_1 \neq 0$
• Null Hypothesis: $H_0: \mu_2 = \mu_1$, or $\mu_2 - \mu_1 = 0$
One-Tailed Tests
• One-tailed hypothesis test – A hypothesis test in which the alternative is stated in such a way that the probability of making a Type I error is entirely in one tail of a sampling distribution.
• Right-tailed test – A one-tailed test in which the sample outcome is hypothesized to be at the right tail of the sampling distribution.
• Left-tailed test – A one-tailed test in which the sample outcome is hypothesized to be at the left tail of the sampling distribution.
Two-Tailed Tests
• Two-tailed hypothesis test – A hypothesis test in which the region of rejection falls equally within both tails of the sampling distribution.
Probability Values
• Z statistic (obtained) – The test statistic computed by converting a sample statistic (such as the mean) to a Z score. The formula for obtaining Z varies from test to test.
• P value – The probability of obtaining a test statistic at least as extreme as the obtained value of Z if the null hypothesis is true.
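As a rough illustration of how an obtained Z is turned into a P value, the sketch below uses the standard normal distribution from Python's standard library. The function name `p_value_from_z` and the `tails` argument are illustrative choices, not part of the slides.

```python
from statistics import NormalDist

def p_value_from_z(z, tails=2):
    """Probability of a Z at least as extreme as the obtained one, under H0.

    tails=2 doubles the upper-tail area (two-tailed test);
    tails=1 returns the area in a single tail.
    """
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return tails * upper_tail

# Familiar cut-offs: both are approximately .05
print(round(p_value_from_z(1.96), 3))            # two-tailed
print(round(p_value_from_z(1.645, tails=1), 3))  # one-tailed
```

A smaller P value means the sample outcome would be less likely if the null hypothesis were true.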
• Alpha (α) – The level of probability at which the null hypothesis is rejected. It is customary to set alpha at the .05, .01, or .001 level.
Five Steps to Hypothesis Testing
(1) Making assumptions
(2) Stating the research and null hypotheses and selecting alpha
(3) Selecting the sampling distribution and specifying the test statistic
(4) Computing the test statistic
(5) Making a decision and interpreting the results
Type I and Type II Errors
• Type I error (false rejection error) – The probability (equal to α) associated with rejecting a true null hypothesis.
• Type II error (false acceptance error) – The probability associated with failing to reject a false null hypothesis.
Decision based on sample results | Reject H0 | Do not reject H0
H0 is true in the population | Type I error (α) | Correct decision
H0 is false in the population | Correct decision | Type II error
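The link between alpha and the Type I error can be checked with a small simulation. The sketch below is only an illustration under assumed settings (a normal population, samples of 30, 4,000 repetitions, a fixed random seed); none of these values come from the slides. It draws repeated samples from a population where the null hypothesis is true and counts how often a two-tailed z test at the .05 level rejects it anyway.

```python
import random
from math import sqrt

def type_i_error_rate(trials=4000, n=30, mu=0.0, sigma=1.0,
                      z_critical=1.96, seed=1):
    """Share of samples that wrongly reject a TRUE null hypothesis."""
    rng = random.Random(seed)
    rejections = 0
    for _ in range(trials):
        # The sample really does come from a population with mean mu
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        sample_mean = sum(sample) / n
        z = (sample_mean - mu) / (sigma / sqrt(n))
        if abs(z) >= z_critical:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(rate)  # should land close to alpha = .05
```

The rejection rate hovers around .05 because that is how the critical value 1.96 was chosen.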
One-Sample z Test
• When we know the population parameters μ and σ, how likely is it that we could draw a random sample whose mean ($\bar{Y}$) differs from μ by as much as the one we observed?
• Null Hypothesis
The mean $\mu_Y$ of the population from which the sample was drawn equals the population mean μ.
One-Sample z Test
• Test statistic
$$H_0: \mu_Y = \mu$$
$$H_1: \mu_Y \neq \mu$$
$$z = \frac{\bar{Y} - \mu}{\sigma / \sqrt{n}}$$
One-Sample z Test
• Compare the z we calculate to the critical value: $|z|$ vs. $z_{critical}$
• Make a decision
If $|z| \geq z_{critical}$: reject $H_0$
If $|z| < z_{critical}$: do not reject $H_0$
Example
How likely is it that we could draw a random sample whose mean differs this much from μ?
$\mu = 3.55$, $\bar{Y} = 3.41$, $\sigma = .21$
$$z = \frac{\bar{Y} - \mu}{\sigma / \sqrt{n}}$$
id GPA
7 3.6
1 3.2
1 3.2
1 3.2
4 3.4
5 3.5
7 3.6
6 3.5
7 3.6
3 3.3
Example
• Is the observed gap ($28,985 - $24,100) large enough to convince us that the sample we drew is not representative of the population?
Five-step Testing Hypothesis-1
Making Assumptions:
• A random sample is selected.
• Because N>50, the assumption of normal population is not required.
• The level of measurement of the dependent variable is interval-ratio.
Five-step Testing Hypothesis-2
Stating the Research and the Null Hypotheses
• The research hypothesis is $H_1: \mu_Y \neq \$28,985$
• The null hypothesis is $H_0: \mu_Y = \$28,985$
Five-step Testing Hypothesis-3
Selecting the Sampling Distribution and Specifying the Test Statistic
• We use the z distribution and the z statistic to test the null hypothesis.
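Five-step Testing Hypothesis-4
Computing the z Test Statistic
$\bar{Y} = \$24,100$, $N = 100$
$\mu_Y = \$28,985$, $\sigma_Y = 23,335$
$$z = \frac{\bar{Y} - \mu_Y}{\sigma_Y / \sqrt{N}} = \frac{24,100 - 28,985}{23,335 / \sqrt{100}} = -2.09$$
$|z| = 2.09 > z_{critical} = 1.96$ if $\alpha = .05$
$P = .0183$ (the one-tailed area beyond $z = -2.09$)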
Five-step Testing Hypothesis-5
Making a Decision and Interpreting the Results
• Our obtained |z| statistic of 2.09 is greater than the critical value of 1.96; equivalently, the probability of obtaining a z statistic this extreme is less than .05. This P value is below the .05 alpha level.
• The probability of obtaining a difference of $4,885 ($28,985 - $24,100) between the income of African Americans and the national average for all, if the null hypothesis were true, is low (less than .05).
Five-step Testing Hypothesis-5
• We have sufficient evidence to reject the null hypothesis and conclude that the average earnings of African Americans are significantly different from the average earnings of all Americans. The difference is significant at the .05 level.
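The five steps for the earnings example can be retraced in a few lines. This is a minimal sketch using the figures from the slides (sample mean $24,100, N = 100, population mean $28,985, population standard deviation 23,335); the variable names are illustrative. It reports the two-tailed P value, which is about twice the one-tailed area of .0183.

```python
from math import sqrt
from statistics import NormalDist

# Figures taken from the example
sample_mean = 24_100
pop_mean = 28_985
pop_sd = 23_335
n = 100
alpha = 0.05

# Step 4: compute the z statistic
standard_error = pop_sd / sqrt(n)
z = (sample_mean - pop_mean) / standard_error

# Step 5: compare with the two-tailed critical value and find the P value
z_critical = NormalDist().inv_cdf(1 - alpha / 2)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
reject_null = abs(z) >= z_critical

print(f"z = {z:.2f}, critical value = {z_critical:.2f}")
print(f"two-tailed P = {p_two_tailed:.4f}, reject H0: {reject_null}")
```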
t Test
• t statistic (obtained) – The test statistic computed to test the null hypothesis about a population mean when the population standard deviation is unknown and is estimated using the sample standard deviation.
• t distribution – A family of curves, each determined by its degrees of freedom (df). It is used when the population standard deviation is unknown and the standard error is estimated from the sample standard deviation.
• Degrees of freedom (df) – The number of scores that are free to vary in calculating a statistic.
One-Sample t Test
• t test
A test of significance similar to the z test but used when the population’s standard deviation is unknown.
$$t = \frac{\bar{Y} - \mu}{S / \sqrt{n}}, \quad df = n - 1$$
where
$$S = \sqrt{\frac{\sum (Y - \bar{Y})^2}{n - 1}}$$
[Figure: t distribution]
[Table: t distribution table]
Example
How likely is it that we could draw a random sample whose mean ($\bar{Y}$) differs this much from μ?
$\mu = 3.55$, $\bar{Y} = 3.41$, $df = 9$
$$t = \frac{\bar{Y} - \mu}{S_Y / \sqrt{n}}$$
id GPA
7 3.6
1 3.2
1 3.2
1 3.2
4 3.4
5 3.5
7 3.6
6 3.5
7 3.6
3 3.3
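The t statistic for these ten GPA scores can be worked out directly from the data. The sketch below assumes that the 3.55 on the slide is the hypothesized population mean μ, and it takes the two-tailed .05 critical value for df = 9 (2.262) from a standard t table; both are labelled in the code.

```python
from math import sqrt
from statistics import mean, stdev

gpa = [3.6, 3.2, 3.2, 3.2, 3.4, 3.5, 3.6, 3.5, 3.6, 3.3]
mu = 3.55            # assumed to be the hypothesized population mean
t_critical = 2.262   # two-tailed, alpha = .05, df = 9 (from a t table)

n = len(gpa)
df = n - 1
y_bar = mean(gpa)
s = stdev(gpa)       # sample standard deviation (n - 1 in the denominator)
t = (y_bar - mu) / (s / sqrt(n))

print(f"mean = {y_bar:.2f}, S = {s:.3f}, df = {df}, t = {t:.2f}")
print("reject H0" if abs(t) >= t_critical else "do not reject H0")
```

Because σ is estimated by S here, the result is compared with the t distribution rather than with the z distribution.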
The Earnings of White Women
• We drew a sample of white females (N=371) from the GSS 2002.
• The mean earnings are $28,889 with a standard deviation of $21,071.
• In 2002, the national average earnings for all women were $24,146.
Five-step Testing Hypothesis-1
Making Assumptions:
• A random sample is selected.
• The sample size is large.
• The level of measurement of the dependent variable is interval-ratio.
Five-step Testing Hypothesis-2
Stating the Research and the Null Hypotheses
• The research hypothesis is $H_1: \mu_Y \neq \$24,146$
• The null hypothesis is $H_0: \mu_Y = \$24,146$
Five-step Testing Hypothesis-3
Selecting the Sampling Distribution and Specifying the Test Statistic
• We use the t distribution and the t statistic to test the null hypothesis.
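Five-step Testing Hypothesis-4
Computing the Test Statistic
• First, calculate the degrees of freedom associated with the test:
$$df = N - 1 = 371 - 1 = 370$$
• Then compute the t statistic:
$$t = \frac{\bar{Y} - \mu_Y}{S_Y / \sqrt{N}} = \frac{28,889 - 24,146}{21,071 / \sqrt{371}} = 4.33$$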
Five-step Testing Hypothesis-5
Making a Decision and Interpreting the Results
• Our obtained t statistic of 4.33 is greater than the critical value of 1.980; equivalently, the probability of obtaining a t statistic this extreme is less than .05. This P value is below the .05 alpha level.
• The probability of obtaining a difference of $4,743 ($28,889 - $24,146) between the income of white women and the national average for all women, if the null hypothesis were true, is extremely low.
Five-step Testing Hypothesis-5
• We have sufficient evidence to reject the null hypothesis and conclude that the average earnings of white women are significantly different from the average earnings of all women. The difference is significant at the .05 level.
Exercise
• Can you do a one-tailed test to see if the mean earnings of white women are significantly higher than the average for all women?
Two-Sample t Tests
• The t-test assesses whether the means of two populations statistically differ from each other.
• The two-independent-samples t-test is used when the two groups being compared are independent of each other.
t-test for difference between two means
Is the value of $\mu_1 - \mu_2$ significantly different from 0? This test gives you the answer:
$$t_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{S_{\bar{Y}_1 - \bar{Y}_2}}$$
The numerator is the difference between the two means; the denominator is the estimated standard error of the difference.
If the absolute value of t is greater than 1.96, the difference between the means is significantly different from zero at an alpha of .05 (or a 95% confidence level).
The critical value of t will be higher than 1.96 if the total N is less than 122. See Appendix C for exact critical values when N < 122.
Test Statistic
• Equal population variance assumed
$$t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
$$df = n_1 + n_2 - 2$$
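Test Statistic
• Unequal population variance assumed
$$t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
$$df = \frac{\left(\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}\right)^2}{\frac{(s_1^2 / n_1)^2}{n_1 - 1} + \frac{(s_2^2 / n_2)^2}{n_2 - 1}}$$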
t-test and Confidence Intervals
$$t_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{S_{\bar{Y}_1 - \bar{Y}_2}}$$
The t-test is essentially creating a confidence interval around the difference score. Rearranging the above formula, we can calculate the confidence interval around the difference between two means:
$$(\bar{Y}_1 - \bar{Y}_2) \pm t \, S_{\bar{Y}_1 - \bar{Y}_2}$$
If this confidence interval includes zero, then we cannot conclude that there is a difference between the means of the two populations.
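To make the rearrangement concrete, the sketch below builds the interval from a mean difference, its estimated standard error, and a critical t value. The function name `difference_ci` is an illustrative choice. The input values (difference 1.167, standard error .553) are borrowed from the software output shown at the end of this deck, with the critical t of 1.997 for df = 66.

```python
def difference_ci(mean_diff, se_diff, t_critical):
    """Confidence interval around the difference between two means."""
    margin = t_critical * se_diff
    return mean_diff - margin, mean_diff + margin

# Values taken from the output table at the end of the deck (df = 66)
lower, upper = difference_ci(mean_diff=1.167, se_diff=0.553, t_critical=1.997)
includes_zero = lower <= 0 <= upper

print(f"95% CI: ({lower:.3f}, {upper:.3f}), includes zero: {includes_zero}")
```

An interval that excludes zero corresponds to rejecting the null hypothesis in a two-tailed test at the same alpha.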
Why a t score and not a Z score?
• Use of the Z distribution assumes the population standard error of the difference is known. In practice, we have to estimate it, so we use a t score.
• When N gets larger than 50, the t distribution converges toward the Z distribution, so the results are nearly identical regardless of whether you use t or Z.
• In most sociological studies, you will not need to worry about the distinction between Z and t.
$$(\bar{Y}_1 - \bar{Y}_2) \pm t \, S_{\bar{Y}_1 - \bar{Y}_2}$$
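t-Test Example 1
Mean pay according to gender:
Group | N | Mean Pay | S.D.
Women | 46 | $10.29 | .8766
Men | 54 | $10.06 | .9051
Equal population variances assumed
$$df = 54 + 46 - 2 = 98$$
$$T = \frac{10.06 - 10.29}{\sqrt{\frac{(54 - 1)(.9051)^2 + (46 - 1)(.8766)^2}{54 + 46 - 2}} \sqrt{\frac{1}{54} + \frac{1}{46}}}$$
The slide reports $T = -.23 / .2671 = -.8609$. Evaluating the expression above with the tabled values gives a standard error of about .179 and $T \approx -1.28$. Either way, |T| is below the two-tailed .05 critical value (about 1.98 for df = 98).
What can we conclude about the difference in wages?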
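t-Test Example 2
Mean pay according to gender:
Group | N | Mean Pay | S.D.
Women | 57 | $9.68 | 1.055
Men | 51 | $10.32 | .9461
Equal population variances assumed
$$df = 51 + 57 - 2 = 106$$
$$T = \frac{9.68 - 10.32}{\sqrt{\frac{(51 - 1)(.9461)^2 + (57 - 1)(1.055)^2}{51 + 57 - 2}} \sqrt{\frac{1}{51} + \frac{1}{57}}}$$
The slide reports $T = -.64 / .8552 = -.7484$. Evaluating the expression above with the tabled values gives a standard error of about .194 and $T \approx -3.30$, whose absolute value exceeds the two-tailed .05 critical value (about 1.98 for df = 106), whereas the reported value does not.
What can we conclude about the difference in wages?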
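In-Class Exercise
Using these GSS income data, calculate a t-test statistic to determine if the difference between the two group means is statistically significant.
Group | Mean | Standard Deviation | N
Men | $22,052.51 | $17,734.92 | 434
Women | $14,331.21 | $12,165.89 | 448
Unequal population variances assumed
$$T_{(N_1 + N_2 - 2)} = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{s_1^2}{N_1} + \frac{s_2^2}{N_2}}}$$
$$T_{(880)} = \frac{22,052.51 - 14,221.21}{\sqrt{\frac{17,734.92^2}{434} + \frac{12,165.89^2}{448}}} = \frac{7,831.30}{1,027.18} = 7.62$$
The computation uses $14,221.21 for the women's mean, while the table lists $14,331.21. With the tabled value the difference is $7,721.30 and $T \approx 7.52$; the difference is statistically significant either way.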
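The unequal-variance calculation can be checked with a short function. This is a sketch rather than the course's own procedure: the function name `welch_t` is illustrative, the inputs are the tabled GSS values, and the degrees of freedom use the unequal-variance formula given earlier in the deck instead of N1 + N2 - 2.

```python
from math import sqrt

def welch_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic, standard error and df without assuming equal variances."""
    v1 = sd1 ** 2 / n1   # squared standard error contributed by group 1
    v2 = sd2 ** 2 / n2   # squared standard error contributed by group 2
    se = sqrt(v1 + v2)
    t = (mean1 - mean2) / se
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    return t, se, df

# Men versus women, using the means, standard deviations and Ns in the table
t, se, df = welch_t(22052.51, 17734.92, 434, 14331.21, 12165.89, 448)
print(f"SE = {se:.2f}, t = {t:.2f}, df = {df:.0f}")
```

With samples this large the critical value is close to 1.96, so the conclusion does not depend on which df formula is used.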
Steps
(1) Making assumptions
(2) Stating the research and null hypotheses and selecting alpha
(3) Selecting the sampling distribution and specifying the test statistic
(4) Computing the test statistic
(5) Making a decision and interpreting the results
Example
• Suppose we have obtained the number of years of education from one random sample of 38 police officers from City A and the number of years of education from a second random sample of 30 police officers from City B.
• The average years of education for the sample from City A is 15 with a standard deviation of 2.
• The average years of education for the sample from City B is 14 with a standard deviation of 2.5.
• Is there a statistically significant difference between the education levels of police officers in City A and City B?
1. Making Assumptions
• Two random samples are selected.
• The sample sizes are large. Because N>50, the assumption of normal population is not required.
• The level of measurement of the dependent variable is interval-ratio.
• Population variances are assumed to be equal.
2. State Hypotheses
• H0: There is no difference between the mean education level of police officers working in City A and the mean education level of police officers working in City B ($\mu_A = \mu_B$).
For a 2-tailed hypothesis test
• H1: There is a difference between the mean education level of police officers working in City A and the mean education level of police officers working in City B ($\mu_A \neq \mu_B$).
For a 1-tailed hypothesis test
• H1: The mean education level of police officers working in City A is greater than the mean education level of police officers working in City B ($\mu_A > \mu_B$).
2. Set the Rejection Criteria
• Determine the degrees of freedom: df = (n1+n2)-2 = 38+30-2 = 66
• Determine the level of significance, alpha (1- or 2-tailed test)
• Use the t-distribution table to determine the critical value
If using a 2-tailed test, alpha = .05, tcv = 1.997
If using a 1-tailed test, alpha = .05, tcv = 1.668
3. Specifying the test statistic
• Because the population variances are unknown, the t distribution should be used.
• The test statistic is the t statistic.
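4. Compute Test Statistic
$$t = \frac{\bar{Y}_1 - \bar{Y}_2}{\sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{15 - 14}{\sqrt{\frac{(38 - 1)(4) + (30 - 1)(6.25)}{38 + 30 - 2}} \sqrt{\frac{1}{38} + \frac{1}{30}}} = \frac{1}{.545} = 1.835$$
$$df = 38 + 30 - 2 = 66$$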
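The pooled-variance computation for the police example can be reproduced as follows. The function name `pooled_t` is an illustrative choice; the inputs are the means, standard deviations and sample sizes given for City A and City B. Carried at full precision the statistic comes out near 1.833, which matches the slide's 1.835 once the standard error is rounded to .545.

```python
from math import sqrt

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic assuming equal population variances."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = sqrt(pooled_var) * sqrt(1 / n1 + 1 / n2)
    df = n1 + n2 - 2
    return (mean1 - mean2) / se, se, df

# City A: mean 15, SD 2, n 38; City B: mean 14, SD 2.5, n 30
t, se, df = pooled_t(15, 2, 38, 14, 2.5, 30)
print(f"SE = {se:.3f}, t = {t:.3f}, df = {df}")

# Critical values from the t table for df = 66, alpha = .05
print("two-tailed: reject H0" if abs(t) >= 1.997 else "two-tailed: do not reject H0")
print("one-tailed: reject H0" if t >= 1.668 else "one-tailed: do not reject H0")
```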
4. Compare the t-cal with t-cri
If two-tailed test: $t_{calculated} = 1.835$ and $t_{critical} = 1.997$, so $t_{calculated} < t_{critical}$
If one-tailed test: $t_{calculated} = 1.835$ and $t_{critical} = 1.668$, so $t_{calculated} > t_{critical}$
5. Make a decision
If using 2-tailed test
• The test statistic 1.835 does not meet or exceed the critical value of 1.997 for a 2-tailed test.
• There is no statistically significant difference between the mean years of education for police officers in City A and mean years of education for police officers in City B.
If using 1-tailed test
• the test statistic 1.835 does exceed the critical value of 1.668 for a 1-tailed test.
• Police officers in City A have significantly more years of education than police officers in City B.
Another Example
• http://www.gallup.com/poll/111703/Final-Presidential-Estimate-Obama-55-McCain-44.aspx
Test for two sample proportions
$$z = \frac{p_1 - p_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}$$
where
$$\hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$$
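A short sketch of this test is given below. The function name `two_proportion_z` is illustrative, and the proportions and sample sizes in the usage lines are hypothetical numbers chosen for the illustration; they are not taken from the poll linked above.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z(p1, n1, p2, n2):
    """z statistic for the difference between two sample proportions."""
    p_pooled = (n1 * p1 + n2 * p2) / (n1 + n2)   # pooled proportion
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / n1 + 1 / n2))
    return (p1 - p2) / se

# Hypothetical data: 55% of 1,000 respondents versus 50% of another 1,000
z = two_proportion_z(0.55, 1000, 0.50, 1000)
p_two_tailed = 2 * (1 - NormalDist().cdf(abs(z)))
print(f"z = {z:.2f}, two-tailed P = {p_two_tailed:.3f}")
```

As with the z test for means, |z| is compared with 1.96 for a two-tailed test at the .05 level.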
Interpreting a t test
Group Statistics (EDU)
CITY | N | Mean | Std. Deviation | Std. Error Mean
a | 38 | 15.00 | 2.027 | .329
b | 30 | 13.83 | 2.534 | .463
Independent Samples Test (EDU)
Levene's Test for Equality of Variances: F = .807, Sig. = .372
t-test for Equality of Means
Assumption | t | df | Sig. (2-tailed) | Mean Difference | Std. Error Difference | 95% Confidence Interval of the Difference: Lower | Upper
Equal variances assumed | 2.110 | 66 | .039 | 1.167 | .553 | .063 | 2.270
Equal variances not assumed | 2.056 | 54.751 | .045 | 1.167 | .568 | .029 | 2.304