Chapter 23 1
Chapter 23
Use and Abuse of Statistical Inference
Thought Question 1
When presenting the results of a study, would it be sufficient to only report the P-value? Why would it be a good idea to also give a confidence interval based on the results?
Thought Question 2
Suppose a new study found that there was no difference in lung function, measured by average volume of air expired, for smokers and nonsmokers. What may have led to this finding? Do you think the lung function was exactly the same for both groups in the study?
Thought Question 3
The results of a CNN/USA Today/Gallup public opinion poll in August of 2005 showed that a majority of Americans were pro-choice on the abortion issue. Would it be fair to claim that “significantly more than 50% of Americans were pro-choice”? Explain.
Thought Question 3: Answer
• n = 1003
• 542 stated that they were pro-choice
• p̂ = 542/1003 = 0.540
• 95% C.I.: 0.509 to 0.571
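The interval above can be checked with a short calculation. This is a minimal sketch, assuming the usual large-sample formula for a proportion (estimate plus or minus a multiplier times the standard error) and a multiplier of 2, a common rounding of 1.96; the slide does not say which multiplier was used, so the last digit may differ slightly from the interval quoted above.

```python
import math

n = 1003          # sample size from the poll
successes = 542   # respondents who said they were pro-choice

p_hat = successes / n
# Standard error of a sample proportion (large-sample approximation).
se = math.sqrt(p_hat * (1 - p_hat) / n)

# Assumption: multiplier of 2 for an approximate 95% interval.
multiplier = 2.0
lower = p_hat - multiplier * se
upper = p_hat + multiplier * se

print(f"p_hat = {p_hat:.3f}")
print(f"approx. 95% C.I.: {lower:.3f} to {upper:.3f}")
print("interval lies entirely above 0.50:", lower > 0.5)
```

Because the whole interval sits above 0.50, the claim of "significantly more than 50%" is defensible in the statistical sense, even though the lower end is only barely above one half.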
Warnings about Reports on Hypothesis Tests: Data Origins
For any statistical analysis to be valid, the data must come from proper samples. Complex formulas and techniques cannot fix bad (biased) data. In addition, be sure to use an analysis that is appropriate for the type of data collected.
Warnings about Reports on Hypothesis Tests: P-value or C.I.?
P-values provide information as to whether findings are more than just good luck, but P-values alone may be misleading or leave out valuable information (as seen later in this chapter). Confidence intervals provide both the estimated values of important parameters and how uncertain the estimates are.
Warnings about Reports on Hypothesis Tests: Significance
If the word significant is used to try to convince you that there is an important effect or relationship, determine if the word is being used in the usual sense or in the statistical sense only.
Case Study: Patient Satisfaction
Bertakis, Klea D., et al., “The influence of gender on physician practice style”, Medical Care, Vol. 33, No. 4, 1995, pp. 407-416.
“Women Doctors Fare Better in Patient Survey”
reported in Sacramento Bee, April 26, 1995
Case Study: Patient Satisfaction
Alternative (Research) Hypothesis: The mean satisfaction rating by patients who first saw a female physician is different from the mean satisfaction rating by patients who first saw a male physician.
Null Hypothesis: There is no difference in the mean satisfaction rating by patients who first saw a female physician and the mean satisfaction rating by patients who first saw a male physician.
Case Study: Patient Satisfaction
The alternative hypothesis is two-sided.
The study was double-blinded (neither patients nor physicians were told the purpose of the survey).
The survey was completed by 250 patients at the University of California at Davis Medical Center, who rated medical residents on a scale of 1 to 5 (very dissatisfied to very satisfied).
Case Study: Patient Satisfaction
Bee: “The female physicians received an average score of 4.27. The men – a respectable, yet significantly lower score of 4.05.”
The average difference was 0.22.
Medical Care: the difference was “small but statistically significant (P-value=0.02).”
Medical Care: “This difference is both statistically and clinically significant.”
Warnings about Reports on Hypothesis Tests: Large Sample
If a study is based on a very large sample size, relationships found to be statistically significant may not have much practical importance.
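To see why, it helps to hold a small difference fixed and let only the sample size grow. The sketch below uses a two-sample z-test with equal group sizes; the difference of 0.02 points, the standard deviation of 1.0, and the sample sizes are all hypothetical values chosen for illustration, not figures from any study in this chapter.

```python
from math import sqrt
from statistics import NormalDist

def two_sided_p(diff, sd, n_per_group):
    """Two-sided P-value for a difference in means (z-test, equal group sizes)."""
    se = sd * sqrt(2 / n_per_group)   # standard error of the difference
    z = diff / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical: a 0.02-point difference on a rating scale with SD 1.0.
diff, sd = 0.02, 1.0
for n in (100, 10_000, 100_000):
    print(f"n per group = {n:>7}: P-value = {two_sided_p(diff, sd, n):.4f}")
```

The observed difference never changes, yet the P-value drops below 0.05 once the groups are large enough. Whether a 0.02-point difference matters in practice is a separate question that the P-value cannot answer.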
Warnings about Reports on Hypothesis Tests: Small Sample
If you read “no difference” or “no relationship” has been found in a study, try to determine the sample size used. Unless the sample size was large, remember that it could be that there is indeed an important relationship in the population, but that not enough data were collected to detect it. In other words, the test could have had very low power.
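Power can be approximated directly for a simple case. The following is a rough sketch, assuming a two-sided z-test at the 0.05 level with equal group sizes and a known standard deviation; the true difference of half a standard deviation and the sample sizes are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

std_normal = NormalDist()

def power_two_sided(true_diff, sd, n_per_group, alpha=0.05):
    """Approximate power of a two-sided, two-sample z-test."""
    se = sd * sqrt(2 / n_per_group)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = true_diff / se
    # Probability that the test statistic lands in either rejection region.
    return (1 - std_normal.cdf(z_crit - shift)) + std_normal.cdf(-z_crit - shift)

# Hypothetical: a real difference of half a standard deviation.
for n in (10, 30, 100):
    print(f"n per group = {n:>3}: power = {power_two_sided(0.5, 1.0, n):.2f}")
```

With only 10 subjects per group, a real and fairly large difference would usually go undetected, so a report of "no difference" from such a study says little about the population.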
Warnings about Reports on Hypothesis Tests: 1 or 2 Sided
Try to determine whether the test was one-sided or two-sided. If a test is one-sided, and details are not reported, you could be misled into thinking there was no difference, when in fact there was one in the direction opposite to that hypothesized.
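A small numerical illustration, assuming a z-test and a hypothetical test statistic of -2.5, where the one-sided alternative predicted a positive effect:

```python
from statistics import NormalDist

std_normal = NormalDist()

# Hypothetical test statistic: the effect is in the direction
# OPPOSITE to the one-sided alternative (which predicted z > 0).
z = -2.5

# One-sided P-value for the alternative "effect is greater than zero".
one_sided_p = 1 - std_normal.cdf(z)

# Two-sided P-value for the alternative "effect is different from zero".
two_sided_p = 2 * (1 - std_normal.cdf(abs(z)))

print(f"one-sided P-value: {one_sided_p:.4f}")
print(f"two-sided P-value: {two_sided_p:.4f}")
```

The one-sided test reports a P-value near 1, which a reader might take as "no difference", while the two-sided test on the same data flags a difference in the unexpected direction.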
Case Study: Seen a UFO?
Seen a UFO? You May Be Healthier Than Your Friends
Roper Organization. Unusual Personal Experiences: An Analysis of the Data from Three National Surveys, Las Vegas: Bigelow Holding Corp., 1992.
Case Study: Seen a UFO?
Research Hypothesis (Alternative): People who claim to have seen a UFO are on average more psychologically disturbed than those who make no such claim.
Null Hypothesis: People who claim to have seen a UFO are on average no more or less psychologically disturbed than those who make no such claim.
Case Study: Seen a UFO?
49 subjects were recruited through a newspaper.
– 18 were UFO nonintense
– 31 were UFO intense (could explain details of encounter)
127 control subjects were recruited.
– 74 students of a psychology class (receiving credit for participation)
– 53 community members recruited through a newspaper
Case Study: Seen a UFO?
New York Times (1993): “Study Finds No Abnormality in Those Reporting UFOs.”
Results: UFO groups actually scored significantly better (statistically) on many of the psychological measures.
The stated one-sided alternative hypothesis was not supported. Does this mean the null hypothesis is true?
Warnings about Reports on Hypothesis Tests: Only Significant are Reported?
Sometimes researchers will perform a multitude of tests, and the reports will focus on those that achieved statistical significance. Remember that if nothing interesting is happening and all of the null hypotheses tested are true, then [about] 1 in 20 (.05) tests should achieve statistical significance just by chance. Beware of reports where it is evident that many tests were conducted, but where results of only one or two are presented as “significant.”
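The "1 in 20" figure can be checked by simulation. The sketch below assumes every null hypothesis is true, that the tests are independent, and that each test is a two-sided z-test at the 0.05 level; the choice of 20 tests per study and 1,000 simulated studies is arbitrary.

```python
import random
from statistics import NormalDist

ALPHA = 0.05
TESTS_PER_STUDY = 20   # hypothetical number of tests in one report
N_STUDIES = 1000       # number of simulated studies

rng = random.Random(0)  # fixed seed so the run is repeatable
std_normal = NormalDist()

def null_p_value():
    """Two-sided P-value for a z statistic drawn when the null is true."""
    z = rng.gauss(0.0, 1.0)
    return 2 * (1 - std_normal.cdf(abs(z)))

significant = 0
studies_with_hit = 0
for _ in range(N_STUDIES):
    hits = sum(null_p_value() < ALPHA for _ in range(TESTS_PER_STUDY))
    significant += hits
    studies_with_hit += hits > 0

false_positive_rate = significant / (N_STUDIES * TESTS_PER_STUDY)
share_with_hit = studies_with_hit / N_STUDIES
expected_share = 1 - (1 - ALPHA) ** TESTS_PER_STUDY

print(f"share of all tests that were 'significant': {false_positive_rate:.3f}")
print(f"share of studies with at least one 'significant' test: {share_with_hit:.3f}")
print(f"theoretical share, 1 - 0.95^20: {expected_share:.3f}")
```

Under these assumptions, roughly two out of three studies that run 20 tests would have at least one "significant" result to report, even though nothing real is going on.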
Case Study: Spinach is Good?
So You Thought Spinach Was Good for You?
Nowak, R. “Beta-carotene: Helpful or harmful?” Science, Vol. 264, April 22, 1994, pp. 500-501.
Case Study: Spinach is Good?
Startling finding: Supplements of the antioxidant beta-carotene markedly increased the incidence of lung cancer among heavy smokers in Finland.
This is the result of a large, randomized clinical trial with 29,000 participants.
But… there were multiple tests conducted.
Key Concepts
Difference between a statistically significant effect and a practically important one
Large Samples and Statistical Significance
Small Samples and Statistical Significance
Multiple Tests and Statistical Significance