Date posted: 17-Jan-2016
Uploaded by: rodney-blake
Chapter Nine: Three Tests of Significance
Winston Jackson and Norine Verberg
Methods: Doing Social Research, 4e
2 © 2007 Pearson Education Canada
Inferential Statistics
Inferential statistics do two things:
1. Allow us to judge the accuracy of generalizing from a limited sample to the larger population
   e.g., we can be 95% confident that the population value lies within plus or minus 4 percentage points of the sample estimate of 30%
2. Conduct hypothesis testing
   Indicate whether a study outcome is a fluke or reflects a true difference in the population, i.e., say whether the findings are statistically significant
What Does Statistically Significant Mean?
A test of significance reports the probability that an observed difference or association of this size would arise from sampling fluctuations alone, that is, if there were no “true” difference in the population from which the sample was selected
Three tests of statistical significance are introduced in Chapter 9: the chi-square test, the t-test, and the F-test
Preliminary Considerations
1. Research and null hypothesis
2. The sampling distribution (standard error of the mean)
3. One- and two-tailed tests of significance
1. Research & Null Hypothesis
Tests of significance are used to test hypotheses, which are set up in the form of a “research hypothesis” and a “null hypothesis”
Research Hypothesis (or Alternative Hypothesis): states one’s prediction of the relationship between the variables
Null Hypothesis: states the prediction that there is no relation between the variables
Research & Null Hypothesis (cont’d)
Research hypothesis 1: The greater the participation, the greater the self-esteem
Null hypothesis 1: There is no relation between levels of participation and self-esteem
Research hypothesis 2: Male university faculty members earn more money than do female faculty members, after controlling for qualifications, achievements and experience
Null hypothesis 2: There is no relation between gender and earnings of faculty members, after controls
Research & Null Hypothesis (cont’d)
It is the null hypothesis that is tested; the test leads us to accept or reject the null hypothesis
If the null hypothesis is accepted: conclude that the association or difference may simply be the result of sampling fluctuations and may not reflect an association or difference in the population being studied
The research hypothesis is therefore deemed to be false
Research & Null Hypothesis (cont’d)
If the null hypothesis is rejected: argue that there is an association between the variables in the population, and that this association is of a magnitude that probably has not occurred because of chance fluctuations in sampling
We would then examine the data to see whether the association is in the predicted direction, i.e., consistent with the prediction (it could differ from what was predicted)
Findings and probability
When the results of a study lead to the rejection of the null hypothesis, this means only that there is probably a relationship between the variables under examination
It is one piece of evidence that the relationship exists
Other researchers will test it again, and either confirm or disconfirm the past findings
Research conclusions are therefore treated as tentative, always open to disconfirmation
Did they fail if they accept the null?
Some researchers believe they have failed if they accept the null (i.e., find no relation between the variables rather than support for the predicted relationship)
Not so: it is just as important to show that two variables are not associated as it is to find out that they are associated
2. The Sampling Distribution
Tests of significance report whether an observed relationship could be the result of sampling fluctuations or reflects a “real” difference in the population from which the sample has been taken
Sampling fluctuation is the idea that each time we select a sample we will get somewhat different results
If we draw 1,000 samples of 50 cases, each will be slightly different from the first sample
2. The Sampling Distribution (cont’d)
If the means of the same variable for each of the samples were plotted, a normal curve would result, but it would be peaked (or leptokurtic)
Example: the mean weights of respondents are plotted
The mean weights range from 70 to 80 kg, but the majority of samples would cluster around the true mean weight of 75 kg
Note: we are plotting the mean weight of the respondents in each of the 1,000 samples drawn
2. The Sampling Distribution (cont’d)
The distribution is quite peaked because we are plotting the mean weights for each sample
To measure the dispersion of the means of the samples, we use a statistic called the standard error of the mean
\[ \text{standard error of the mean} = \frac{SD_{\text{population}}}{\sqrt{N}} \]
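The standard error formula can be checked with a small simulation. The sketch below is illustrative only: the population SD of 10 kg and the random seed are assumptions that do not come from the slides, while the true mean of 75 kg and the 1,000 samples of 50 cases follow the example above.

```python
import math
import random
import statistics


def standard_error_of_mean(population_sd, n):
    """Standard error of the mean: population SD divided by the square root of N."""
    return population_sd / math.sqrt(n)


# Assumed values for illustration: the slides give the 75 kg mean but no SD.
TRUE_MEAN, POPULATION_SD = 75.0, 10.0
SAMPLE_SIZE, NUM_SAMPLES = 50, 1000

rng = random.Random(9)  # fixed seed so the run is repeatable

# Draw 1,000 samples of 50 cases and record the mean weight of each sample.
sample_means = [
    statistics.fmean([rng.gauss(TRUE_MEAN, POPULATION_SD) for _ in range(SAMPLE_SIZE)])
    for _ in range(NUM_SAMPLES)
]

predicted_se = standard_error_of_mean(POPULATION_SD, SAMPLE_SIZE)
observed_sd_of_means = statistics.stdev(sample_means)

print(f"predicted standard error: {predicted_se:.2f} kg")
print(f"SD of the sample means:   {observed_sd_of_means:.2f} kg")
```

The two printed values should be close: the spread of the sample means is what the standard error of the mean estimates.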
2. The Sampling Distribution (cont’d)
Relevance to hypothesis testing? In doing tests of significance, we are assessing whether the results of one sample fall within the null hypothesis acceptance zone (usually 95% of the distribution) or outside the zone, in which case we reject the null hypothesis
Four key points can be made about probability sampling procedures where repeated samples are taken
Four key points: repeated samples
1. Plotting the means of repeated samples will produce a normal distribution: it will be more peaked than when raw data are plotted (as shown in Figure 9.1)
[Figure 9.1: Distribution of Raw Data versus Means of Samples, p. 254]
Four key points: repeated samples
2. The larger the sample sizes, the more peaked the distribution and the closer the means of the samples to the population mean (shown in Figure 9.2)
[Figure 9.2: Sample Size and the Normal Distribution, p. 254]
Four key points: repeated samples
3. The greater the variability in the population, the greater the variation in the samples
4. When sample sizes are above 100, even if a variable in the population is not normally distributed, the means will be normally distributed when repeated samples are plotted
E.g., the weights of a population of males and females will be bimodal, but if we drew repeated samples, the mean weights would be normally distributed
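Point 4 can be illustrated with a rough simulation. Everything numeric below is an assumption made for the sketch (the two group centres, their spread, the sample size of 120, the seed); the slides only state that the population of weights is bimodal. The check used here is that about 95% of sample means should fall within 1.96 standard errors of the population mean if the means are approximately normal.

```python
import math
import random
import statistics

rng = random.Random(4)  # fixed seed so the run is repeatable

# Hypothetical bimodal population of weights (kg): two groups with different centres.
population = (
    [rng.gauss(60.0, 6.0) for _ in range(5000)]
    + [rng.gauss(85.0, 6.0) for _ in range(5000)]
)
pop_mean = statistics.fmean(population)
pop_sd = statistics.pstdev(population)

SAMPLE_SIZE, NUM_SAMPLES = 120, 2000  # sample size above 100, as in point 4
standard_error = pop_sd / math.sqrt(SAMPLE_SIZE)

# Mean weight of each repeated sample (cases drawn with replacement).
sample_means = [
    statistics.fmean(rng.choices(population, k=SAMPLE_SIZE))
    for _ in range(NUM_SAMPLES)
]

# Share of sample means lying within 1.96 standard errors of the population mean.
within = sum(abs(m - pop_mean) <= 1.96 * standard_error for m in sample_means)
share_within = within / NUM_SAMPLES
print(f"share of sample means within 1.96 SE: {share_within:.3f}")
```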
3. One- and Two-Tailed Tests
If the direction of a relationship is predicted, the appropriate test will be one-tailed
If the direction of the relationship is not predicted, conduct a two-tailed test
Example:
One-tailed: Females are less approving of violence than are males
Two-tailed: There is a gender difference in the acceptance of violence [Note: no prediction about which gender is more approving]
3. One- and Two-Tailed Tests (cont’d)
Figure 9.3 (next slide) shows two normal distribution curves
The first one has the 5% rejection area split between the two tails; this would be a two-tailed test
The second one has the 5% rejection area all in one tail, indicating a one-tailed test
The same principle applies to the 1% level
[Figure 9.3: Five Percent Probability Rejection Area: One- and Two-Tailed Tests, p. 255]
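To see how the placement of the rejection area changes the cut-off, the sketch below computes critical values on a standard normal curve. Using z-values is an assumption made for illustration; the chapter's own tests read their critical values from the chi-square, t and F tables. The helper name `critical_z` is hypothetical.

```python
from statistics import NormalDist


def critical_z(alpha, tails):
    """Cut-off on the standard normal curve for a rejection area of size alpha.

    tails=2 splits alpha between both tails; tails=1 puts it all in one tail.
    """
    if tails not in (1, 2):
        raise ValueError("tails must be 1 or 2")
    return NormalDist().inv_cdf(1 - alpha / tails)


two_tailed_5 = critical_z(0.05, tails=2)  # about 1.96
one_tailed_5 = critical_z(0.05, tails=1)  # about 1.645
two_tailed_1 = critical_z(0.01, tails=2)  # about 2.58
one_tailed_1 = critical_z(0.01, tails=1)  # about 2.33

print(f"5% level: two-tailed {two_tailed_5:.3f}, one-tailed {one_tailed_5:.3f}")
print(f"1% level: two-tailed {two_tailed_1:.3f}, one-tailed {one_tailed_1:.3f}")
```

At the same significance level the one-tailed cut-off is smaller, because the whole rejection area sits in the predicted tail.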
Chi-Square: Red & White Balls
The chi-square test (χ²) is used primarily in contingency table analysis, where the dependent variable is measured at the nominal level
The formula is:
\[ \chi^2 = \sum \frac{(f_o - f_e)^2}{f_e} \]
where f_o is the observed frequency and f_e is the expected frequency in each category
One-Sample Chi-Square Test
Suppose the following incomes:
INCOME               STUDENT SAMPLE (N)   STUDENT SAMPLE (%)   GENERAL POPULATION (%)
Over $100,000                30                  15.0                   7.8
$40,000 - $99,999           160                  80.0                  68.9
Under $40,000                10                   5.0                  23.3
TOTAL                       200                 100.0                 100.0
The Computation
The chi-square test compares the expected frequencies (assuming the null hypothesis is correct) to the observed frequencies.
To calculate the expected frequencies, simply multiply the proportion in each category of the general population by the total number of cases (e.g., 200 students).
Why do you do this?
Why?
If the student sample is drawn equally from all segments of society, then the students should have the same income distribution as the general population (this assumes the null hypothesis is correct).
So what are the expected frequencies in this case?
Expected Frequencies (fe)
Observed Frequency   Expected Frequency
 30                   15.6 (200 x .078)
160                  137.8 (200 x .689)
 10                   46.6 (200 x .233)
Degrees of Freedom = 2
Decision:
Look up the critical value: Table 9.2, p. 260
Need to know:
2 degrees of freedom
.05 level of significance
1-tailed test (i.e., column one)
Find the Critical Value = 4.61
Compare to the calculated Chi-Square = 45.61
Decision: the calculated value exceeds the critical value, so reject the null hypothesis
Inspect the data and conclude that university students come from higher SES backgrounds
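The one-sample computation can be reproduced in a few lines. This is a minimal sketch using the income figures from the slides; the variable names are illustrative, and the critical value of 4.61 is simply the number the slide reads from Table 9.2 rather than something computed here.

```python
# Observed student counts and general-population proportions, from the slides.
observed = [30, 160, 10]                  # over $100,000; $40,000-$99,999; under $40,000
population_share = [0.078, 0.689, 0.233]

total_cases = sum(observed)               # 200 students

# Expected frequency = population proportion times the total number of cases.
expected = [share * total_cases for share in population_share]

# Chi-square = sum over categories of (fo - fe)^2 / fe.
chi_square = sum((fo - fe) ** 2 / fe for fo, fe in zip(observed, expected))

degrees_of_freedom = len(observed) - 1    # 3 categories - 1 = 2

CRITICAL_VALUE = 4.61                     # as quoted on the slide from Table 9.2
reject_null = chi_square >= CRITICAL_VALUE

print(f"expected frequencies: {[round(fe, 1) for fe in expected]}")
print(f"chi-square = {chi_square:.2f}, df = {degrees_of_freedom}, reject null: {reject_null}")
```

The calculated value matches the 45.61 reported on the slide, far above the critical value.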
Standard Chi-Square Test
Drug use by Gender (Box 9.4)
3 categories of drug use (no experience, once or twice, three or more times)
Row marginal times column marginal divided by total N of cases yields the expected frequencies
Degrees of freedom = (rows - 1)(columns - 1) = 2
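The marginal-based rule for expected frequencies can be sketched as follows. The counts in `counts` are hypothetical and are NOT the Box 9.4 data, which these slides do not reproduce; the function names are likewise illustrative.

```python
def expected_frequencies(table):
    """Expected count per cell: row marginal * column marginal / total N."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def chi_square(table):
    """Sum of (fo - fe)^2 / fe over every cell of the contingency table."""
    expected = expected_frequencies(table)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(table, expected)
        for fo, fe in zip(obs_row, exp_row)
    )


def degrees_of_freedom(table):
    """(rows - 1) * (columns - 1)."""
    return (len(table) - 1) * (len(table[0]) - 1)


# Hypothetical counts (not from Box 9.4): rows are the three drug-use
# categories, columns are the two genders.
counts = [
    [40, 50],  # no experience
    [30, 30],  # once or twice
    [30, 20],  # three or more times
]

expected = expected_frequencies(counts)
statistic = chi_square(counts)
df = degrees_of_freedom(counts)
print(expected, round(statistic, 2), df)
```

With three rows and two columns the degrees of freedom come out as 2, as on the slide.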
Decision
With 2 degrees of freedom, a 2-tailed test, and the .05 level of significance, the Critical Value is 5.99
The calculated Chi-Square is 5.69
It does not equal or exceed the Critical Value
So, your decision is what?
Accept the null hypothesis
The t Distribution: t-Test Groups & Pairs
Used often for experimental data
t-test used when:
Sample size is small (e.g., < 30)
Dependent variable is measured at the ratio level
Subjects are randomly assigned to treatment/control groups
Treatment has two levels only
Population is normally distributed
The t Distribution: t-test groups, t-test pairs
The t-test represents the ratio between the difference in means between two groups and the standard error of the difference. Thus:
\[ t = \frac{\text{difference between the means}}{\text{standard error of the difference}} \]
Two T-Tests: Between- and Within-Subject Designs
Between-Subjects T-Test: used in an experimental design, with an experimental and a control group, where the groups have been independently established.
Within-Subjects T-Test: in these designs the same person is subjected to different treatments and a comparison is made between the two treatments.
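Both designs use the same ratio, a difference in means over the standard error of that difference; they differ in how the standard error is obtained. The sketch below makes two assumptions the slides do not state: the between-subjects version uses the pooled-variance (equal variances) standard error, and the within-subjects version works on the per-person difference scores. The scores are made up for illustration.

```python
import math
import statistics


def t_between(group_a, group_b):
    """Between-subjects t, assuming equal variances (pooled standard error)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * statistics.variance(group_a)
        + (n_b - 1) * statistics.variance(group_b)
    ) / (n_a + n_b - 2)
    se_difference = math.sqrt(pooled_variance * (1 / n_a + 1 / n_b))
    return (statistics.fmean(group_a) - statistics.fmean(group_b)) / se_difference


def t_within(first_treatment, second_treatment):
    """Within-subjects t: mean of the paired differences over its standard error."""
    differences = [b - a for a, b in zip(first_treatment, second_treatment)]
    se_difference = statistics.stdev(differences) / math.sqrt(len(differences))
    return statistics.fmean(differences) / se_difference


# Hypothetical scores, for illustration only.
experimental = [5, 6, 7, 8, 9]
control = [3, 4, 5, 6, 7]
t_groups = t_between(experimental, control)

treatment_one = [10, 12, 11, 13]   # same four people measured twice
treatment_two = [12, 13, 14, 15]
t_pairs = t_within(treatment_one, treatment_two)

print(f"t (groups) = {t_groups:.2f}, t (pairs) = {t_pairs:.2f}")
```

The resulting t would then be compared with the critical value for the appropriate degrees of freedom (n_a + n_b - 2 for groups, number of pairs - 1 for pairs).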
The F Distribution: Means, ANOVA
Box 9.7 provides an illustration of one-way Analysis of Variance (Egalitarianism by Country, p. 269)
The concern is with how much variation there is within columns compared to the variation between columns
The F represents the ratio of between-column variation to within-column variation
Probabilities are looked up in Table 9.4, p. 272
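A minimal sketch of the F ratio for a one-way analysis of variance follows. It assumes the usual mean-square form (each sum of squares divided by its degrees of freedom), which the slide does not spell out, and it uses made-up scores rather than the Box 9.7 data.

```python
import statistics


def one_way_f(groups):
    """F = between-group mean square divided by within-group mean square."""
    all_values = [x for group in groups for x in group]
    grand_mean = statistics.fmean(all_values)

    # Variation of the group (column) means around the grand mean.
    ss_between = sum(
        len(group) * (statistics.fmean(group) - grand_mean) ** 2 for group in groups
    )
    # Variation of individual scores around their own group mean.
    ss_within = sum(
        (x - statistics.fmean(group)) ** 2 for group in groups for x in group
    )

    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    return (ss_between / df_between) / (ss_within / df_within)


# Hypothetical scores for three groups (not the Box 9.7 data).
scores = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_ratio = one_way_f(scores)
print(f"F = {f_ratio:.2f}")
```

A large F means the group means differ by more than the spread within groups would lead us to expect by chance.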
When Are Tests of Significance Not Appropriate?
Total populations studied
Non-probability sampling procedures used
High non-participation rates
Non-experimental research tests for intervening variables
Research is not guided by formal hypotheses