Statistics 1 1
Hypothesis Testing
Discover the relationships that exist between events/things
Accomplished by:
Asking questions
Getting answers
In accord with certain rules ... the scientific method.
Question is a hypothesis
Answer is obtained by testing the hypothesis
Which gives the general model………
With some IMPORTANT restrictions about:
– How the hypothesis is formed
– How the hypothesis is tested
Forming hypotheses is an "everyday-everybody" activity
I will do better on examinations if I relax the night before
Is a "hypothesis" ... a statement of a relationship
OK, BUT NOT a scientific hypothesis
A scientific hypothesis must meet certain criteria
A scientific hypothesis must be:
Specific
Empirically testable
Strictly related to some experimental procedure
Moreover, a scientific hypothesis actually consists of two separate mutually exclusive hypotheses
A null hypothesis
An alternative hypothesis
Null Hypothesis
A statement reflecting the possibility that there are no differences between the objects and/or events that are being observed
In formal terms:
Ho: µ1 = µ2
Where: µ1 and µ2 are the mean or average of several observations
Alternative Hypothesis
A statement reflecting the possibility that there are differences between the objects and/or events that are being observed
In formal terms:
H1: µ1 ≠ µ2 or H1: µ1 < µ2 or H1: µ1 > µ2
Testing between the null and alternative hypothesis
Accomplished through collection of data
Data must be scientifically acceptable, i.e.
– Observable
– Public
– Replicable
The test concentrates on the null hypothesis, which you either
– Reject
– Fail to reject
If there are no differences between your observations you
– Fail to reject the null hypothesis and
– Disregard the alternative hypothesis
If there are differences between your observations you
– Reject the null hypothesis and
– Accept the alternative hypothesis
Some things to note about hypothesis testing
Failing to reject the null hypothesis
– Does not mean that the null hypothesis is TRUE
– The null hypothesis can never be proven
– You can only fail to reject it
Rejecting the null hypothesis
– Means you accept the alternative hypothesis
– It does not establish the validity of a relationship
Validity is a function of experimental design
When testing a hypothesis
Two possible outcomes re: null hypothesis
Two possible states of real world
Thus four possible decisions – Two are incorrect ... i.e. errors
                    The Real World
Your Decision       Ho: true            Ho: false
Reject Ho           Type I error        Correct (Power)
                    alpha (p level)     1 - beta
Do not reject Ho    Correct             Type II error
                    1 - alpha           beta
Some things to file for future reference
Type I error
– You can directly "set" this
– It is the chance (the probability of making the error) you are willing to accept when you test your hypothesis.
Type II error
– You cannot directly "set" this
– You can attempt to control it through good experimental design.
Alpha Level or the level of significance is a probability value that is used to define the very unlikely sample outcomes if the null hypothesis is true.
Critical region is composed of extreme sample values that are very unlikely to be obtained if the null hypothesis is true. The boundaries for the critical region are determined by the alpha level. If sample data fall in the critical region, the null hypothesis is rejected.
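As a rough illustration of how the alpha level fixes the boundaries of the critical region, the sketch below assumes the test statistic is a z-score from a standard normal distribution and that alpha = .05. The function name `in_critical_region` is made up for this example; one-tailed versus two-tailed tests are covered later in these notes.

```python
from statistics import NormalDist

alpha = 0.05                      # assumed level of significance
std_normal = NormalDist(0.0, 1.0)

# One-tailed: all of alpha sits in the upper tail.
z_crit_one_tailed = std_normal.inv_cdf(1 - alpha)

# Two-tailed: alpha is split evenly between the two tails.
z_crit_two_tailed = std_normal.inv_cdf(1 - alpha / 2)


def in_critical_region(z, two_tailed=True):
    """Return True if z falls in the critical (rejection) region."""
    if two_tailed:
        return abs(z) > z_crit_two_tailed
    return z > z_crit_one_tailed


print(round(z_crit_one_tailed, 3), round(z_crit_two_tailed, 3))
```

A sample z that lands beyond the boundary is "very unlikely if the null hypothesis is true", so the null hypothesis is rejected.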
Estimating Population Parameters from Samples
Sample mean
Unlikely to be exactly equal to population mean
BUT
Not more likely to be greater
Not more likely to be less
So sample mean is an unbiased estimate of population mean
Sample standard deviation
Unlikely to be exactly equal to population standard deviation
BUT
More likely to be less
Is usually an underestimate of the population parameter
So sample standard deviation is a biased estimate of the population standard deviation
To understand why this is so you must understand the nature and concept of a sampling distribution
What it all means
When you take a sample
Sample mean is an unbiased estimate of the population mean
– There is no reason to suspect it is higher or lower than the population mean
Sample standard deviation is a biased estimate of the population standard deviation
– It is more likely to be smaller than the population standard deviation (and likewise for the variance)
And so you must correct any estimate of the population variance to increase it (i.e. use "n-1" rather than "n" when calculating the estimate)
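The effect of the "n-1" correction can be seen in a small simulation. This is only a sketch: the population (normal, mean 50, standard deviation 8), the sample size, and the number of trials are arbitrary choices made for illustration.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable

POP_MEAN, POP_SD = 50.0, 8.0   # hypothetical normal population
TRUE_VARIANCE = POP_SD ** 2    # 64.0
n = 5                          # small samples make the bias easy to see
trials = 20000

sum_means = 0.0
sum_var_n = 0.0            # running total of SS / n
sum_var_n_minus_1 = 0.0    # running total of SS / (n - 1)

for _ in range(trials):
    sample = [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]
    m = sum(sample) / n
    ss = sum((x - m) ** 2 for x in sample)  # sum of squared deviations
    sum_means += m
    sum_var_n += ss / n
    sum_var_n_minus_1 += ss / (n - 1)

avg_mean = sum_means / trials
avg_var_n = sum_var_n / trials
avg_var_n_minus_1 = sum_var_n_minus_1 / trials

print("average sample mean:       ", round(avg_mean, 2))
print("average variance using n:  ", round(avg_var_n, 2))
print("average variance using n-1:", round(avg_var_n_minus_1, 2))
```

Over many samples the average sample mean sits close to 50, the variance computed with n averages well below 64, and the variance computed with n-1 averages close to 64.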
Parametric Tests
Tests that do make assumptions and test hypotheses about population parameters.
• z & t
• ANOVA
• F test
Involves an assessment of whether your observed data is related to your independent variable
Or is simply what might be expected by chance from random sampling, i.e. no relation between
• Independent variable
• Dependent measure
Requires knowing or estimating population parameters
Mean: (μ)
Standard deviation: (σ)
Assumption of normality
For example: consider
• Pat (individual score) = 64
• Population mean (μ) = 50
• Population standard deviation (σ) = 8
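And if the population is normal, then you know
~68% (68.26%) of data points fall within ±1 standard deviation of the mean
~95% (95.44%) of data points fall within ±2 standard deviations of the mean
~99% (99.74%) of data points fall within ±3 standard deviations of the mean
And remember: these are percentages, not absolute values
[Figure: Areas under the normal curve]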
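The areas under the normal curve can be recomputed directly instead of read from a table. The sketch below uses the error function from the Python standard library; the helper name `proportion_within` is made up for this example.

```python
import math


def proportion_within(k):
    """Proportion of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2, 3):
    print(f"within ±{k} SD: {100 * proportion_within(k):.2f}%")
```

The computed values are 68.27%, 95.45% and 99.73%. Printed tables can differ from these in the last digit because of rounding.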
Remember the z-distribution?
Provides areas (proportions of scores) under a normal distribution according to the z-score
And if we convert Pat's raw score to a z-score
z = (X – μ) ∕ σ = (64 – 50) ∕ 8 = 1.75
And then look up Pat's z-score in the Z-table
Meaning that ~96% (1.00 - .0401 = .9599) of scores in the distribution are below Pat (page 699, G&W).
OR PUT OTHERWISE
– If we were to randomly select a score from Pat's distribution
– The probability that the score would be greater than Pat's would only be 4 in 100
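Pat's comparison can be checked without a printed table. The sketch below uses `statistics.NormalDist` from the Python standard library in place of the Z-table; the variable names are chosen for this example only.

```python
from statistics import NormalDist

pat_score = 64
pop_mean = 50   # μ
pop_sd = 8      # σ

# Convert the raw score to a z-score.
z = (pat_score - pop_mean) / pop_sd

# Area below and above z under the standard normal curve.
proportion_below = NormalDist().cdf(z)
proportion_above = 1 - proportion_below

print("z =", z)
print("proportion below Pat:", round(proportion_below, 4))
print("proportion above Pat:", round(proportion_above, 4))
```

The z-score is 1.75, with about .9599 of scores below Pat and about .0401 above, matching the table lookup.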
And we have done what we set out to do
Accomplished a statistical test
i.e. comparing
– Observed data and
– What would be expected by chance
And thus,
Pat's score is significant at p < .05
P-value
The P-value is the probability, when Ho is true, of a test statistic value at least as contradictory to Ho as the value actually observed. The smaller the P-value, the more strongly the data contradict Ho. The P-value is denoted by P.
The P-value summarizes the evidence in the data about the null hypothesis. A moderate to large P-value means that the data are consistent with Ho.
E.g., a P-value of .26 or .83 indicates that the observed data would not be unusual if Ho were true. However, a P-value such as .001 means that such data would be very unlikely if Ho were true.
The P-value is the primary reported result of a significance test.
If the P-value is sufficiently small, one rejects Ho and accepts H1.
Standard Error of the Mean
When comparing an individual to a population, we needed to know two things about the population
Mean: μ
Standard deviation: σ
• Only slightly different when comparing a sample to a population
Since you are not concerned with a single individual but with a sample of individuals
The "population" of interest is not
– A population of individuals but rather
– A population of samples, i.e. a SAMPLING DISTRIBUTION
And the measure of "variability" is not
– The standard deviation of a population but rather
– The standard deviation of a sampling distribution, i.e. the STANDARD ERROR OF THE MEAN
The calculation details
The standard error of the mean
σm = σ ∕ √n
where: n = sample size
The z comparison
Z = (M – μ) ∕ σm
Which is not really different than what we did when comparing an individual to a population
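To see that the standard error really is the standard deviation of a population of sample means, the following sketch draws many samples and compares the spread of their means with σ ∕ √n. The population values (normal, μ = 50, σ = 8), the sample size, and the number of samples are assumptions chosen for illustration.

```python
import math
import random
import statistics

random.seed(2)  # fixed seed so the run is repeatable

mu, sigma = 50.0, 8.0   # hypothetical normal population
n = 10                  # size of each sample
num_samples = 20000     # how many samples make up the sampling distribution

# Build the sampling distribution: one mean per random sample.
sample_means = []
for _ in range(num_samples):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    sample_means.append(sum(sample) / n)

simulated_se = statistics.pstdev(sample_means)  # SD of the sample means
formula_se = sigma / math.sqrt(n)               # σm = σ ∕ √n

print("simulated standard error:", round(simulated_se, 3))
print("formula standard error:  ", round(formula_se, 3))
```

The two values should agree closely, and both are much smaller than σ itself, because means of samples vary less than individual scores.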
Suppose: Herd of 10 cows (n=10)
Mean milk production is 1.8 gal/cow
Question: How unique is this herd?
μ = 1.5, σ = .55, n=10, M=1.8
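The calculation for this herd can be sketched as follows, using the values given above and `statistics.NormalDist` in place of the z-table. The variable names are chosen for this example only.

```python
import math
from statistics import NormalDist

mu = 1.5       # population mean
sigma = 0.55   # population standard deviation
n = 10         # herd (sample) size
M = 1.8        # herd (sample) mean

sigma_m = sigma / math.sqrt(n)        # standard error of the mean
z = (M - mu) / sigma_m                # z comparison
p_greater = 1 - NormalDist().cdf(z)   # chance a random herd of 10 does better

print("standard error:", round(sigma_m, 3))
print("z:", round(z, 2))
print("p(greater):", round(p_greater, 3))
```

The standard error comes out near 0.174 and z near 1.72, which leaves a little over 4% of the sampling distribution above this herd's mean.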
Thus the herd is pretty unique, since the likelihood that any random sample of 10 cows from the population would produce more milk is less than 5 times in 100
Or in "statistics"
– The probability of selecting a herd of greater milk producers is p < 0.05
Problem: compare a sample to a population
Method:
1. Use population parameters to calculate the standard error of the mean of a sampling distribution.
σm = σ ∕ √n
2. Use the standard error of the mean to compare sample mean with population mean by calculating a z-score
Z = (M – μ) ∕ σm
3. Use z-table to determine the probability that a random sample would yield a mean greater than the mean of the sample
A word on the logic and requirements of the statistic
• The "uniqueness" of your sample is the probability that another random sample of the same size would have a mean at least as extreme as the mean of your sample.
• Or put otherwise: is your sample mean what would be expected by chance from a random selection?
• The more unique your sample, the more likely it reflects a relationship between:
Two requirements
The population is normally distributed
You know the population
• Mean
• Standard deviation
The t statistic: an alternative to z
MUST know the population mean
But can estimate population standard deviation from sample data
A sample standard deviation is given by (as you know)
s = √( Σ(X – M)² ∕ (n – 1) )
And so an estimated standard error of the mean is:
sm = s ∕ √n
And to use the estimated standard error of the mean to compare your sample to the population, you must make one adjustment
– The adjustment is necessary to account for the fact that you are estimating
Comparisons When Estimating Population Parameters
• The adjustment part
• Estimating the standard deviation requires a different sampling distribution
• Sampling distribution is the t-distribution
The t-distribution
• Symmetric and bell-shaped, like the normal distribution
• Flatter and more spread out than the z-distribution (approaches it as degrees of freedom increase)
• Tails more elevated
And comparison becomes
t = (M – μ) ∕ sm
Thus:
Because you use an estimate of the population standard deviation to estimate the standard error of the mean
You must use the t-distribution to get the probability of randomly selecting a sample with a mean at least as extreme as the mean of your sample
But not conceptually different -- just an adjustment
For Example…
Suppose: Milk production of your herd of 10 cows is: 1.8, 1.7, 2.4, 2.3, 1.1, 1.7, 1.5, 2.4, 1.9, 1.2 gals (Mean = 1.8 gal/cow)
Question: How unique is your herd?
μ = 1.5 gal/day (given)
σ = unknown
σm = unknown, because we do not know the population standard deviation
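Since σ is unknown, the sample itself supplies the estimate. A minimal sketch of the t calculation for these ten cows follows; it uses `statistics.stdev`, which divides by n-1, and the variable names are chosen for this example only.

```python
import math
import statistics

milk = [1.8, 1.7, 2.4, 2.3, 1.1, 1.7, 1.5, 2.4, 1.9, 1.2]  # gallons per cow
mu = 1.5                                                   # given population mean

n = len(milk)
M = statistics.mean(milk)     # sample mean
s = statistics.stdev(milk)    # sample standard deviation (n-1 in the denominator)
s_m = s / math.sqrt(n)        # estimated standard error of the mean
t = (M - mu) / s_m            # t comparison
df = n - 1                    # degrees of freedom

print("M =", round(M, 2))
print("s =", round(s, 3))
print("sm =", round(s_m, 3))
print("t =", round(t, 2), "with df =", df)
```

This gives s of about 0.464, an estimated standard error of about 0.147, and an observed t of about 2.04 with 9 degrees of freedom.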
QUESTION now is what does that "t-value" mean
• i.e. What is the probability of a random sample of 10 cows being like your cows
• To find out, consult a t-table
The t-Distribution
• Z-table gives exact probabilities
• t-table gives ranges of probabilities
Enter table with degrees of freedom of your sample
– Degrees of freedom
• Number of values in a calculation that are free to vary
– That is:
– The degrees of freedom for a mean of 10 values is 9 ... because
– If the mean of 10 numbers is, for example, 5
– Nine of the numbers are "free" to be any value, but when these are established, the 10th number is determined if the mean is to be 5
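The "free to vary" idea can be shown with a few lines of arithmetic. In this sketch the nine free values are arbitrary numbers picked for illustration; whatever they are, the tenth is forced.

```python
target_mean = 5
n = 10

# Nine values chosen freely (arbitrary, hypothetical numbers).
free_values = [2, 9, 4, 7, 5, 1, 8, 6, 3]

# The 10th value is not free: all ten must sum to n * target_mean.
last_value = n * target_mean - sum(free_values)

values = free_values + [last_value]
degrees_of_freedom = n - 1

print("forced 10th value:", last_value)
print("mean of all ten:", sum(values) / n)
print("degrees of freedom:", degrees_of_freedom)
```

Changing any of the nine free values changes the forced tenth value, but the mean stays at 5.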
Determine the probability of your t-observed by the tabled t-values that it falls between,
For example:
• With 10 data points there are 9 degrees of freedom (df=9)
• If the t-observed statistic is 2.04
The t-table gives a probability of that occurring by chance between 0.10 and 0.05 two-tailed (between 10 and 5 times in 100), because 2.04 falls between the tabled values 1.833 and 2.262 for df=9
And for your cows
The observed t-value of 2.04, df=9 gives a tabled probability of p > 0.05
Which is traditionally not sufficient to reject the Null hypothesis
– Any event that has a probability of occurring 5 times or more in 100 is considered by most psychologists an indication of a chance event
And thus your cows are "just average old cows"
Well maybe NOT……….
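A t-table only brackets the probability. For an actual value, the area under the t-distribution can be integrated numerically. The sketch below is one way to do that with the standard library alone (Simpson's rule on the t density); the function names are made up for this example, and a statistics package would normally be used instead.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def upper_tail_p(t, df, steps=2000):
    """One-tailed P(T > t) for t >= 0, via Simpson's rule on [0, t] (steps must be even)."""
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    return 0.5 - area_0_to_t   # half the distribution lies above 0


t_observed, df = 2.04, 9
p_one_tailed = upper_tail_p(t_observed, df)
p_two_tailed = 2 * p_one_tailed

print("one-tailed p:", round(p_one_tailed, 3))
print("two-tailed p:", round(p_two_tailed, 3))
```

For t = 2.04 with df = 9 the two-tailed probability lands above .05, while the one-tailed probability lands below .05.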
Directionality of Statistical Tests
Statistical tests have a property called "directionality"
• Nondirectional, called "two-tailed" tests
• Directional, called "one-tailed" tests
Directionality is determined BEFORE you run your experiment
Based upon prior knowledge or data
You predict the outcome of your observations, the effect of your independent variable
– Your independent variable "will improve performance"
– Your independent variable "will interfere with learning"
– etc
Your ability to predict an outcome means that you are better able to determine whether an event is a chance occurrence
More likely to reject Null hypothesis when the outcome is in the predicted direction
In statistical terms the region of the sampling distribution indicating that an event is something different than what would be expected by chance is larger in the predicted tail, because all of alpha is placed in that one tail
And cows again
If you had valid reasons to predict that your cows produced more milk
You would use a directional test, i.e. one-tailed test
And you would reject, at p < .05, the Null hypothesis that your herd was not different from what you would expect from another random sample of cows
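The difference between the two kinds of test shows up in the critical values. The sketch below hard-codes the standard t-table critical values for df = 9 at alpha = .05 and applies them to the herd's observed t of 2.04. The function name `reject_null` is made up for this example, and the one-tailed branch assumes the prediction was "more milk".

```python
# Standard t-table critical values for df = 9 at alpha = .05.
T_CRIT_DF9 = {"one-tailed": 1.833, "two-tailed": 2.262}

t_observed = 2.04


def reject_null(t, tails):
    """Decide whether t falls in the rejection region for the chosen test."""
    crit = T_CRIT_DF9[tails]
    if tails == "two-tailed":
        return abs(t) > crit   # either direction counts
    return t > crit            # only the predicted (upper) direction counts


print("two-tailed:", reject_null(t_observed, "two-tailed"))
print("one-tailed:", reject_null(t_observed, "one-tailed"))
```

The same observed t fails the two-tailed test but passes the one-tailed test, which is why the direction must be chosen before the data are collected. A result in the opposite direction (for example t = -2.04) would not be significant under this one-tailed test.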
• Z-test: a statistical test used to decide whether a sample mean does or does not come from a specified population, when the standard deviation of the population is known.
• t-test: performed instead when the standard deviation of the population is unknown.
• Hypothesis testing: the goal is to decide whether to reject the null hypothesis.
• Alpha level: traditionally set at .05; it determines the acceptance and rejection regions.
• Critical value: the absolute value of the test statistic that defines the boundary of the rejection region(s).
• Non-directional (two-tailed): the rejection region lies both above and below the hypothesized population mean.
• Directional (one-tailed): the rejection region lies on one side only, in a direction determined prior to experimentation.
Comparing a sample to a population:
When population parameters are known
Sample to population: assume a normal population and known standard deviation
Z = (M – μ) ∕ σm
When the population standard deviation is unknown
Sample to population: assume a normal population and unknown standard deviation
t = (M – μ) ∕ sm