CLINICAL RESEARCH MANAGEMENT 512
Leslie McIntosh
lmcintosh at path.wustl.edu
LECTURE 7
Homework: Complete problems from slides; review demo
Part I: Presentations
Part II: p-values and statistical significance
Part III: Hypothesis testing; CI & statistical significance
HOMEWORK
Homework
Demonstration
DISTRIBUTION OF WEIGHT
Mean weight (kg) = 51.3
Median weight (kg) = 49.5
Minimum weight (kg) = 25.7
Maximum weight (kg) = 98.1
SAMPLES FROM DISTRIBUTION
A       B       C
45.5    39.0    61.7
39.2    41.4    53.1
54.7    39.4    49.0
34.5    34.5    60.7
62.0    36.0    50.0
59.7    37.7    39.3
69.1    40.9    36.4
43.2    56.3    51.8
55.2    37.8    79.0
52.9    49.0    70.9
STATISTICS FROM SAMPLES
                              A       B       C
Sample Mean of Weight (kg)
Standard Deviation
Sample Size
95% Confidence Interval
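The blank cells in the table above can be filled in with a few lines of Python. The following is a minimal sketch, not the course's official solution: the function name `summarize` is hypothetical, and the interval assumes a t-based 95% CI for the mean with 9 degrees of freedom, using a critical value of 2.262 read from a standard t-table.

```python
from math import sqrt
from statistics import mean, stdev

# Weights (kg) copied from the SAMPLES FROM DISTRIBUTION slide.
SAMPLES = {
    "A": [45.5, 39.2, 54.7, 34.5, 62.0, 59.7, 69.1, 43.2, 55.2, 52.9],
    "B": [39.0, 41.4, 39.4, 34.5, 36.0, 37.7, 40.9, 56.3, 37.8, 49.0],
    "C": [61.7, 53.1, 49.0, 60.7, 50.0, 39.3, 36.4, 51.8, 79.0, 70.9],
}

# Assumption: two-sided 95% critical value of Student's t with
# n - 1 = 9 degrees of freedom, taken from a standard t-table.
T_CRIT_DF9 = 2.262


def summarize(weights, t_crit=T_CRIT_DF9):
    """Return mean, sample SD, size, and a t-based CI for the mean."""
    n = len(weights)
    m = mean(weights)
    sd = stdev(weights)             # sample SD (n - 1 in the denominator)
    margin = t_crit * sd / sqrt(n)  # critical value times standard error
    return {"mean": m, "sd": sd, "n": n, "ci": (m - margin, m + margin)}


if __name__ == "__main__":
    for name, weights in SAMPLES.items():
        s = summarize(weights)
        low, high = s["ci"]
        print(f"Sample {name}: mean={s['mean']:.1f}  sd={s['sd']:.1f}  "
              f"n={s['n']}  95% CI=({low:.1f}, {high:.1f})")
```

Under these assumptions the intervals for Samples A and C cover the population mean of 51.3 kg, while the interval for Sample B lies entirely below it.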
CONFIDENCE INTERVALS FROM SAMPLES
[Figure: confidence intervals for Sample A, Sample B, and Sample C, shown against the population mean of 51.3]
DEMONSTRATIONS
http://www.amstat.org/publications/jse/v16n3/pvalueapplet.html
Schulz, Eric. "Decisions Based on P-Values and Significance Levels" from the Wolfram Demonstrations Project. http://demonstrations.wolfram.com/DecisionsBasedOnPValuesAndSignificanceLevels/
PART I
Presentations
PART II
p-values and statistical significance
STATISTICAL SIGNIFICANCE
Statistical significance (the p-value) is the probability that an observed relationship (e.g., between variables) or difference (e.g., between means) at least as strong as the one in the sample would occur by pure chance ("luck of the draw") if, in the population from which the sample was drawn, no such relationship or difference existed.
The statistical significance of a result tells us something about the degree to which the result is "true" (in the sense of being "representative of the population").
P-VALUES
The p-value is a decreasing index of the reliability of a result (see Brownlee, 1960).
The higher the p-value, the less we can believe that the observed relation between variables in the sample is a reliable indicator of the relation between the respective variables in the population.
Informally, the p-value is read as the probability of error involved in accepting our observed result as valid, that is, as "representative of the population." Strictly, it is calculated under the assumption that no relation exists in the population.
P-VALUES (EXAMPLE)
A p-value of .05 (i.e., 1/20) is often described as a 5% probability that the relation between the variables found in our sample is a "fluke."
Meaning: assuming that in the population there was no relation between those variables whatsoever, and we were repeating experiments like ours one after another, we could approximately expect that in every 20 replications of the experiment there would be 1 in which the relation between the variables in question would be equal or stronger than in ours.
Note that this is not the same as saying that, given that there IS a relationship between the variables, we can expect to replicate the results 5% of the time or 95% of the time.
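The "1 in 20 replications" reading can be checked with a small simulation. This is a sketch under stated assumptions, not part of the original lecture: it repeatedly draws samples from a population in which the null hypothesis is true and counts how often a two-sided z-test (known population standard deviation) gives p < .05. The function names, the sample size of 10, the seed, and the choice of a z-test are all illustrative.

```python
import random
from math import sqrt
from statistics import NormalDist, mean


def two_sided_p(sample, mu0, sigma):
    """Two-sided z-test p-value for H0: population mean == mu0 (sigma known)."""
    z = (mean(sample) - mu0) / (sigma / sqrt(len(sample)))
    return 2 * (1 - NormalDist().cdf(abs(z)))


def fraction_significant(reps=4000, n=10, mu0=0.0, sigma=1.0,
                         alpha=0.05, seed=512):
    """Share of replications with p < alpha when H0 is actually true."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    hits = 0
    for _ in range(reps):
        # Every sample comes from the null population itself.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        if two_sided_p(sample, mu0, sigma) < alpha:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print(f"Fraction with p < .05 under H0: {fraction_significant():.3f}")
```

The printed fraction should land close to .05, which is roughly 1 replication in 20, even though no relation exists in the simulated population.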
CONCLUSIONS FROM P-VALUES
If the p-value is less than α:
The difference between samples is statistically significant.
Reject the null hypothesis (H0).
If the p-value is greater than α:
The difference between samples is not statistically significant.
Do not reject the null hypothesis (H0).
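The decision rule on this slide is simple enough to write as a function. A minimal sketch follows; the name `decide` and the default α of .05 are illustrative, and because the slide does not say what to do when the p-value equals α exactly, this sketch assumes that case is treated as "do not reject."

```python
def decide(p_value, alpha=0.05):
    """Apply the slide's decision rule to a p-value and significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must be between 0 and 1")
    if p_value < alpha:
        return "statistically significant: reject H0"
    # Assumption: p == alpha falls on this side (the slide leaves it open).
    return "not statistically significant: do not reject H0"


if __name__ == "__main__":
    for p in (0.003, 0.049, 0.20):
        print(p, "->", decide(p))
```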
PROS OF SAYING “STATISTICALLY SIGNIFICANT”
It is sometimes necessary to give a concise answer.
An exact p-value is not always obtainable.
Sounds less ambiguous than saying, “Random sampling would create a difference this big or bigger in 5% of experiments if the null hypothesis were true.”
PART III
Hypothesis Testing
CI & Statistical Significance
HYPOTHESIS
What are you trying to answer?
Do you have a secondary question of interest?
What variables will you need to answer your question?
ERROR TYPES
                Decision
                Reject H0        Do not reject H0
H0 True         Type I Error     No Error
H0 False        No Error         Type II Error

H0 = Null Hypothesis
ERROR TYPES
Type I: False Positive
Occurs when the null hypothesis is rejected, but it should not have been rejected
Type II: False Negative
Occurs when the null hypothesis is not rejected and it should have been rejected
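The four cells of the error table can be expressed as a lookup on two yes/no facts: whether H0 is actually true, and whether the test rejected it. A minimal sketch, with the hypothetical function name `classify_outcome`:

```python
def classify_outcome(h0_is_true, rejected_h0):
    """Name the outcome of a test given the truth of H0 and the decision."""
    if h0_is_true and rejected_h0:
        return "Type I error (false positive)"
    if not h0_is_true and not rejected_h0:
        return "Type II error (false negative)"
    return "No error"


if __name__ == "__main__":
    for truth in (True, False):
        for decision in (True, False):
            print(f"H0 true={truth}, rejected={decision}: "
                  f"{classify_outcome(truth, decision)}")
```

In practice the truth of H0 is unknown, so this function describes what kind of mistake is possible for a given decision rather than something that can be computed from data.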
ANALOGIES FOR HYPOTHESIS TEST
Defendant is innocent               Null hypothesis
Defendant is guilty                 Alternative hypothesis
Gathering of evidence               Gathering of data
Summary of evidence                 Calculation of test statistic
Jury deliberation and decision      Application of the decision rule
ANALOGIES FOR HYPOTHESIS TEST
Verdict                             Decision
Verdict is to acquit                Failure to reject the null hypothesis
Verdict is to convict               Rejection of the null hypothesis
Presumption of innocence            Assumption that the null hypothesis is true
ANALOGIES FOR HYPOTHESIS TEST
Conviction of an innocent person                  Type I error (false positive)
Acquittal of a guilty person                      Type II error (false negative)
Beyond reasonable doubt                           Fixed (small) probability of Type I error
High probability of convicting a guilty person    High power
RELATIONSHIP BETWEEN: CONFIDENCE INTERVAL & STATISTICAL SIGNIFICANCE
When the null hypothesis specifies a value…
If a 95% CI does not contain the value specified by H0, then the result must be statistically significant with p < 0.05.
If a 95% CI does contain the value specified by H0, then the result must not be statistically significant (p > 0.05).
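The link between the two statements above can be seen by computing a CI and a p-value from the same estimate and standard error. The sketch below is illustrative only: it assumes a normal (z) approximation rather than a t distribution, the function name `ci_and_p` is hypothetical, and the estimates and standard errors in the usage lines are made-up numbers, not results from the lecture.

```python
from statistics import NormalDist


def ci_and_p(estimate, se, null_value, level=0.95):
    """Return a z-based CI for the estimate and the two-sided p-value for H0."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    ci = (estimate - z_crit * se, estimate + z_crit * se)
    z = (estimate - null_value) / se
    p = 2 * (1 - std_normal.cdf(abs(z)))
    return ci, p


if __name__ == "__main__":
    # Hypothetical estimates and standard errors, tested against H0: mean = 51.3.
    for estimate, se in ((41.2, 2.1), (51.6, 3.4)):
        (low, high), p = ci_and_p(estimate, se, null_value=51.3)
        inside = low <= 51.3 <= high
        print(f"estimate={estimate}: CI=({low:.1f}, {high:.1f}), "
              f"H0 value inside CI={inside}, p={p:.4f}")
```

Both the interval and the p-value are built from the same critical value, which is why the interval excludes the H0 value exactly when p falls below 1 minus the confidence level.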
RELATIONSHIP BETWEEN: CONFIDENCE INTERVAL & STATISTICAL SIGNIFICANCE
When the null hypothesis specifies a value…
If a 90% CI does not contain the value specified by H0, then the result must be statistically significant with p < _____.
If a 90% CI does contain the value specified by H0, then the result must not be statistically significant (p > ____).