Introduction to Statistics
From the Data at Hand
to the World at Large
Part II
Siana Halim
Topics
• Testing Hypotheses About Proportions
• More About Tests
• Comparing Two Proportions
References:
• De Veaux, Velleman, Bock, Stats: Data and Models, Pearson Addison Wesley, International Edition, 2005
• John A. Rice, Mathematical Statistics and Data Analysis, Duxbury Press, 1995
Testing Hypotheses
• The goal of testing hypotheses is to determine whether a conjecture about some feature of a population is strongly supported by information obtained from the sample data.
• Prompted by the intent of the investigation, this conjecture typically involves an assertion about the value of a population parameter.
Example: On average, people in Surabaya spend Rp. 300.000,- on their cell phones. (Population mean)
Example: The proportion of people in Surabaya who have more than one cell phone is 0.70. (Population proportion)
Testing Hypothesis
Claim: The average expense for a cell phone in the population is Rp. 300.000,-
Null hypothesis (H0): μ = 300000
Sample: randomly choose some people from the population. Let the sample mean be X̄ = 450000.
Question: Can X̄ = 450000 happen if μ = 300000?
Decision: If the probability is too small, then reject the null hypothesis.
Example
Ingots are huge pieces of metal, often weighing in excess of 20,000 pounds, made in a giant mold. They must be cast in one large piece for use in fabricating large structural parts for cars and planes. If they crack while being made, the crack may propagate into the zone required for the part, compromising its integrity.
In one plant, only about 80% of the ingots have been free of cracks.
New methods were applied to 400 ingots, and only 17% of them cracked.
Should management declare victory?
Hypotheses
In statistics, a hypothesis proposes a model for the world. Then we look at the data.
► If the data are consistent with that model, we have no reason to disbelieve that hypothesis. BUT
► If the facts are inconsistent with the model, what then ?
If the data are only slightly out of step with the model, we might stick with the model.
If the data dramatically contradict the model, though, that’s strong evidence that the model is incorrect.
Testing Hypothesis
H0, the null hypothesis, specifies a population model parameter of interest and proposes a value for that parameter.
Which value to take is often obvious from the context of the problem itself, from the Who and What of the data.
Here we can write H0: p = 0.20
What would convince you that the cracking rate had actually gone down?
How big must the difference be before we are convinced that the cracking rate has changed?
Whenever we ask how big a difference is, we think of the standard deviation:
\[ SD(\hat{p}) = \sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.20)(0.80)}{400}} = 0.02 \]
Now we know both parameters of the Normal sampling distribution model. So we can find out how likely it would be to see the observed value of \hat{p} = 17%:
\[ z = \frac{0.17 - 0.20}{0.02} = -1.5 \]
[Figure: Normal sampling distribution model, z-axis from -3 to 3]
Pr(z < -1.5) = 0.067
This is the probability of observing a cracking rate of 17% or less if the null model is true.
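The arithmetic for the ingot example can be checked with a few lines of Python. This is only a sketch using the standard library; the helper name normal_cdf is an illustrative choice, not something from the slides.

```python
import math

def normal_cdf(z: float) -> float:
    """Standard Normal cumulative probability, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

p0 = 0.20      # hypothesized cracking rate under H0
n = 400        # ingots cast with the new method
p_hat = 0.17   # observed cracking rate

sd = math.sqrt(p0 * (1 - p0) / n)   # SD of p-hat under the null model
z = (p_hat - p0) / sd               # how many SDs below p0 the sample fell
p_value = normal_cdf(z)             # lower-tail probability Pr(Z < z)

print(f"SD = {sd:.3f}, z = {z:.2f}, P-value = {p_value:.3f}")
```

The lower tail is used because the question is whether the cracking rate has gone down.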
What to Do with an “Innocent” Defendant
H0: the defendant is innocent.
If the evidence is too unlikely given this assumption, the jury rejects the null hypothesis and finds the defendant guilty.
BUT – if there is insufficient evidence to convict the defendant, the jury does not decide that H0 is true and declare him innocent. Juries can only fail to reject the null hypothesis and declare the defendant “not guilty”.
The Reasoning of Hypothesis Testing
1. Hypotheses
• To perform a hypothesis test, we must specify an alternative hypothesis.
• Remember that we can never prove a null hypothesis, only reject it or fail to reject it. If we reject it, we then accept the alternative.
• In statistical hypothesis testing, hypotheses are almost always about the model parameters.
• The null hypothesis specifies a parameter value to use in our model.
H0 : parameter = hypothesized value
• The alternative hypothesis usually gives a range of other possible values.
H1 contains the values of the parameter we accept if we reject the null.
Problem
Suppose we want to see if people prefer Coke or Pepsi. How do we translate this into a null hypothesis we can test?
We need to specify a parameter and a value for it. Let p be the proportion of people who prefer Coke to Pepsi.
H0 : p = 0.50
What’s the alternative? We would be interested in learning that either cola was preferred.
H1 : p ≠ 0.50
If the data convince us that we should reject the null hypothesis, we would accept the alternative.
Problem
Experience has shown that the cure rate for a given disease using standard medication is 60%. The cure rate of a new drug is anticipated to be better than the standard medication. Suppose that the new drug is to be tried on a sample of 20 patients and that the number cured, X, in the 20 is to be recorded. How should the experimental data be used to answer the question: “Is there substantial evidence that the new drug has a higher cure rate than the standard medication?”
The hypotheses are:
H0: The new drug is not better than the standard medication: p = 0.6
H1: The new drug is better than the standard medication: p > 0.6
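One way to use the data is to ask how many cures out of 20 would be surprising if p were still 0.6. The sketch below tests H0: p = 0.6 against H1: p > 0.6 with the exact Binomial(20, 0.6) model and finds the smallest number of cures whose upper-tail probability is at most α. The value α = 0.05 is an assumption made for illustration; the slides do not fix a level here.

```python
from math import comb

def upper_tail(n: int, p: float, c: int) -> float:
    """P(X >= c) for X ~ Binomial(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(c, n + 1))

n, p0 = 20, 0.6
alpha = 0.05  # assumed significance level, not given in the problem

# Smallest c such that observing X >= c has probability <= alpha under H0.
critical = next(c for c in range(n + 1) if upper_tail(n, p0, c) <= alpha)

print(f"Reject H0 when X >= {critical}")
print(f"P(X >= {critical} | p = 0.6) = {upper_tail(n, p0, critical):.4f}")
```

With only 20 patients the Normal model is a rough fit, which is why this sketch uses exact binomial probabilities instead of a z-statistic.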
The Reasoning of Hypothesis Testing
2. Plan
To plan a statistical hypothesis test, specify the model you will use to test the null hypothesis and the parameter of interest.
\[ \hat{p} \sim N\left(p_0, SD(\hat{p})\right), \qquad SD(\hat{p}) = \sqrt{\frac{p_0 q_0}{n}} \]
We call this test about the value of a proportion a one-proportion z-test.
The Reasoning of Hypothesis Testing
3. Mechanics
Under “mechanics” we place the actual calculation of a test statistic from the data.
The ultimate goal of the calculation is to obtain a P-value – the probability that the observed statistic value (or an even more extreme value) could occur if the null model were correct.
If the P-value is small enough, we’ll reject the null hypothesis.
The Reasoning of Hypothesis Testing
4. Conclusion
The conclusion in a hypothesis test is always a statement about the null hypothesis. The conclusion must state either that we reject or that we fail to reject the null hypothesis. And, as always, the conclusion should be stated in context!
Your conclusion about the null hypothesis should never be the end of a testing procedure. Often there are actions to take or policies to change.
The size of the effect is always a concern when we test hypotheses. A good way to look at the effect size is to examine a confidence interval.
One-Proportion z-Test
The conditions for the one-proportion z-test are the same as for the one-proportion z-interval. We test the hypothesis H0: p = p0 using the statistic
\[ z = \frac{\hat{p} - p_0}{SD(\hat{p})}, \qquad SD(\hat{p}) = \sqrt{\frac{p_0 q_0}{n}} \]
When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.
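The one-proportion z-test can be wrapped in a small reusable function. The sketch below is one possible layout, not a reference implementation: the function name and the alternative labels ("less", "greater", "two-sided") are illustrative choices. It uses NormalDist from Python's standard statistics module for the standard Normal model.

```python
from math import sqrt
from statistics import NormalDist

def one_proportion_z_test(successes: int, n: int, p0: float,
                          alternative: str = "two-sided"):
    """Return (z, P-value) for H0: p = p0 against the chosen alternative."""
    p_hat = successes / n
    sd = sqrt(p0 * (1 - p0) / n)       # SD of p-hat under the null model
    z = (p_hat - p0) / sd
    lower = NormalDist().cdf(z)        # Pr(Z < z)
    if alternative == "less":          # H1: p < p0
        p_value = lower
    elif alternative == "greater":     # H1: p > p0
        p_value = 1 - lower
    elif alternative == "two-sided":   # H1: p != p0
        p_value = 2 * min(lower, 1 - lower)
    else:
        raise ValueError("alternative must be 'less', 'greater' or 'two-sided'")
    return z, p_value

# Ingot example: 17% of 400 ingots cracked, i.e. 68 ingots.
z, p_less = one_proportion_z_test(68, 400, 0.20, "less")
_, p_two = one_proportion_z_test(68, 400, 0.20, "two-sided")
print(f"z = {z:.2f}, lower-tail P = {p_less:.3f}, two-sided P = {p_two:.3f}")
```

For the same data the two-sided P-value is twice the one-sided value, so the choice of alternative should be made from the context before looking at the data.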
Level of significance = α. The rejection region is shaded; its boundary represents the critical value.
• Two-tail test: H0: p = p0 vs. H1: p ≠ p0, with α/2 in each tail
• Upper-tail test: H0: p = p0 vs. H1: p > p0, with α in the upper tail
• Lower-tail test: H0: p = p0 vs. H1: p < p0, with α in the lower tail
P-Values and Decisions: What to Tell about a Hypothesis Test
How small should the P-value be in order for you to reject the null hypothesis?
A jury needs enough evidence to show the defendant guilty “beyond a reasonable doubt”.
How does that translate to a P-value?
The answer is that it’s highly context dependent.
Another factor in choosing a P-value is the importance of the issue being tested.
Example
A renowned musicologist claims that she can distinguish between the works of Mozart and Haydn simply by hearing a randomly selected 20 seconds of music from any work by either composer. What’s the null hypothesis?
H0 : p = 0.5 vs. H1 : p > 0.5
Now, we present her with 10 pieces of Mozart or Haydn chosen at random. She gets 9 out of 10 correct. It turns out that the P-value associated with that result is 0.011. What would you conclude?
Most people would probably reject the null hypothesis and be convinced that she has some ability to do as she claims. Why?
Because the P-value is small, and we don’t have any particular reason to doubt the alternative.
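The P-value of 0.011 can be reproduced with an exact binomial calculation: under H0 each answer is a fair guess, so the number correct out of 10 follows a Binomial(10, 0.5) model. A minimal sketch:

```python
from math import comb

n, p0, correct = 10, 0.5, 9

# P(X >= 9) under H0: getting 9 or 10 right by pure guessing.
p_value = sum(comb(n, k) * p0**k * (1 - p0)**(n - k)
              for k in range(correct, n + 1))

print(round(p_value, 3))  # 0.011
```

The sum has only two terms, 10/1024 for exactly 9 correct and 1/1024 for all 10 correct.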
Example
Imagine a student who bets that he can make a flipped coin land the way he wants just by thinking hard. To test him, we flip a fair coin 10 times. Suppose he gets 9 out of 10 right. This also has a P-value of 0.011.
Are you willing now to reject this null hypothesis?
Are you convinced that he’s not just lucky?
What amount of evidence would convince you?
We require more evidence when rejecting the null hypothesis would contradict longstanding beliefs or other scientific results.
Of course, with sufficient evidence we would revise our opinions.
Zero in on the Null
Null hypotheses have special requirements. In order to perform a statistical test of the hypothesis, the null must be a statement about the value of a parameter from a model. Think about why!
How do we choose the null hypothesis?
• The appropriate null arises directly from the context of the problem.
• It is not dictated by the data, but instead by the situation.
How to Think about P-Values
A P-value is a conditional probability. It is the probability of the observed statistic value (or one more extreme) given that the null hypothesis is true.
It is a probability about the data.
Alpha Levels
When the P-value is small, it tells us that our data are rare, given the null hypothesis.
But how rare is “rare”?
We can define a “rare event” arbitrarily by setting a threshold for our P-value.
If our P-value falls below that point, we’ll reject the null hypothesis.
We call such results statistically significant. The threshold is called an alpha level.
Alpha Levels
What if the P-value does not fall below α?
When the evidence does not meet the standard you have established, you should say that “The data have failed to provide sufficient evidence to reject the null hypothesis.”
We have failed to reject the null hypothesis.
Critical Value
Upper-tail test: there is only one critical value, since the rejection area (α) is in only one tail.
[Figure: sampling distribution centered at μ (Z = 0); the critical value Zα separates the “Do not reject H0” region from the “Reject H0” region in the upper tail]
Critical Value
Lower-tail test: there is only one critical value, since the rejection area (α) is in only one tail.
[Figure: sampling distribution centered at μ (Z = 0); the critical value -Zα separates the “Reject H0” region in the lower tail from the “Do not reject H0” region]
Critical Value
Two-tail test: there are two cutoff values (critical values), defining the regions of rejection, with α/2 in each tail.
[Figure: sampling distribution centered at μ (Z = 0); “Reject H0” below the lower critical value -Z and above the upper critical value +Z, “Do not reject H0” in between]
α        1-sided   2-sided
0.05     1.645     1.96
0.01     2.33      2.576
0.001    3.09      3.29
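Critical values like those in the table come from the inverse of the standard Normal cumulative distribution. The sketch below recomputes them with NormalDist.inv_cdf from Python's standard library; the function name critical_values is an illustrative choice.

```python
from statistics import NormalDist

def critical_values(alpha: float):
    """Return (one-sided, two-sided) standard Normal critical values."""
    std_normal = NormalDist()
    one_sided = std_normal.inv_cdf(1 - alpha)      # all of alpha in one tail
    two_sided = std_normal.inv_cdf(1 - alpha / 2)  # alpha/2 in each tail
    return one_sided, two_sided

print("alpha   1-sided  2-sided")
for alpha in (0.05, 0.01, 0.001):
    one, two = critical_values(alpha)
    print(f"{alpha:<7} {one:.3f}    {two:.3f}")
```

For a lower-tail test the critical value is the negative of the one-sided value, since the Normal model is symmetric.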
Confidence Intervals and Hypothesis Tests
• Confidence intervals and hypothesis tests are built from the same calculations.
• They have the same assumptions and conditions.
• Because confidence intervals are naturally two-sided, they correspond to two-sided tests.
Any value outside the confidence interval would make a null hypothesis that we would reject.
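As an illustration, the sketch below builds a confidence interval for the ingot cracking rate (17% of 400) and checks whether the hypothesized value p0 = 0.20 lies inside it. The 95% confidence level is an assumption chosen for the example. Note that the interval uses the standard error based on the observed proportion, while the test uses the standard deviation based on p0, so the correspondence between the two is close but not exact.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat, p0 = 400, 0.17, 0.20
conf_level = 0.95  # assumed confidence level

z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)  # about 1.96
se = sqrt(p_hat * (1 - p_hat) / n)                       # SE from the sample
lower, upper = p_hat - z_star * se, p_hat + z_star * se

p0_inside = lower <= p0 <= upper
print(f"{conf_level:.0%} interval: ({lower:.3f}, {upper:.3f})")
print("p0 inside interval:", p0_inside)
```

Here 0.20 falls inside the interval, which agrees with a two-sided test at the 5% level failing to reject H0: p = 0.20.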
Making Errors
When we perform a hypothesis test, we can make mistakes in two ways:
I. The null hypothesis is true, but we mistakenly reject it (false positive). This is a Type I error.
II. The null hypothesis is false, but we fail to reject it (false negative). This is a Type II error.
How often will a Type I error occur?
It happens when the null hypothesis is true but we’ve had the bad luck to draw an unusual sample.
Neyman-Pearson Paradigm
                      The Truth
My Decision           H0 True         H0 False
Reject H0             Type I Error    Power
Fail to Reject H0     OK              Type II Error
Neyman-Pearson Paradigm
To reject H0, the P-value must fall below α. When H0 is true, that happens exactly with probability α.
The probability of a Type I error is α.
When H0 is false and we reject it, we have done the right thing. Our ability to detect a false hypothesis is called the power of the test.
When H0 is false and we fail to reject it, we have made a Type II error. We assign the letter β to the probability of this mistake.
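Power and β can be computed once a specific true value of p is supposed, since power = 1 − β. The sketch below does this for a lower-tail one-proportion z-test on the ingot setting (p0 = 0.20, n = 400). The true rate of 0.17 and α = 0.05 are assumptions chosen purely for illustration, and the function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def power_lower_tail(p0: float, p_true: float, n: int, alpha: float) -> float:
    """Power of the lower-tail one-proportion z-test when the truth is p_true."""
    std_normal = NormalDist()
    # Cutoff p*: reject H0 when the sample proportion falls below this value.
    cutoff = p0 + std_normal.inv_cdf(alpha) * sqrt(p0 * (1 - p0) / n)
    # Sampling distribution of p-hat when the true proportion is p_true.
    sd_true = sqrt(p_true * (1 - p_true) / n)
    return std_normal.cdf((cutoff - p_true) / sd_true)

power = power_lower_tail(p0=0.20, p_true=0.17, n=400, alpha=0.05)
beta = 1 - power  # probability of a Type II error
print(f"power = {power:.3f}, beta = {beta:.3f}")
```

When the supposed true value equals p0, the same function returns α, which matches the statement that the probability of a Type I error is α.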
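The Power of Test
[Figure: two sampling distributions, one under the null hypothesis centered at p0 and one under the truth centered at p. The critical value p* separates “Reject H0” from “Fail to reject H0”; the area of the true distribution in the rejection region is the power.]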
Comparing Two Proportions
Who are typically more intelligent, men or women?
Gallup Poll: 520 women; 506 men
When asked about intelligence, 28% of the men thought that men were generally more intelligent, but only 14% of the women agreed.
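The Standard Deviation of the Difference between Two Proportions
\[ SD(X - Y) = \sqrt{Var(X) + Var(Y)} \]
\[ SD(\hat{p}_1) = \sqrt{\frac{p_1 q_1}{n_1}}; \qquad SD(\hat{p}_2) = \sqrt{\frac{p_2 q_2}{n_2}} \]
\[ Var(\hat{p}_1 - \hat{p}_2) = \left(\sqrt{\frac{p_1 q_1}{n_1}}\right)^2 + \left(\sqrt{\frac{p_2 q_2}{n_2}}\right)^2 = \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2} \]
\[ SD(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} \]
\[ SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} \]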
Assumptions and Conditions
1. Independence Assumption
• Randomization condition
• 10% condition
• Independent samples condition
2. Sample Size Condition
• Success/failure condition for each sample
The Sampling Distribution
A two-proportion z-interval
When the conditions are met, we are ready to find the confidence interval for the difference of two proportions p1 – p2. The confidence interval is
\[ (\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \times SE(\hat{p}_1 - \hat{p}_2) \]
where the standard error of the difference is
\[ SE(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1 \hat{q}_1}{n_1} + \frac{\hat{p}_2 \hat{q}_2}{n_2}} \]
The critical value z_{α/2} depends on the particular confidence level, 1 – α, that you specify.
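As a worked illustration, the sketch below applies the two-proportion z-interval to the Gallup poll figures quoted earlier (28% of 506 men, 14% of 520 women). The 95% confidence level and the function name are assumptions made for the example.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_interval(p1_hat: float, n1: int,
                              p2_hat: float, n2: int,
                              conf_level: float = 0.95):
    """Confidence interval for p1 - p2 using the unpooled standard error."""
    z_star = NormalDist().inv_cdf(1 - (1 - conf_level) / 2)
    se = sqrt(p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2)
    diff = p1_hat - p2_hat
    return diff - z_star * se, diff + z_star * se

# Men: 28% of 506; women: 14% of 520.
low, high = two_proportion_z_interval(0.28, 506, 0.14, 520)
print(f"95% interval for p_men - p_women: ({low:.3f}, {high:.3f})")
```

The whole interval lies above zero, so at this confidence level the data suggest a genuine difference between the two groups' answers.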
Will I snore when I am 64?
The National Sleep Foundation asked a random sample of 1010 US adults questions about their sleep habits. One of the questions asked about snoring.
Of the 995 respondents, 37% of adults reported that they snored at least a few nights a week during the past year.
Would you expect that percentage to be the same for all age groups?
Split into two age categories, 26% of the 184 people under 30 snored, compared with 39% of the 811 in the older group.
Is this difference of 13% real, or due only to natural fluctuations in the sample we’ve chosen?
Two-Proportion z-Test
The conditions for the two-proportion z-test are the same as for the two-proportion z-interval. We are testing the hypothesis H0: p1 = p2.
Because we hypothesize that the proportions are equal, we pool them to find
\[ \hat{p}_{pooled} = \frac{Success_1 + Success_2}{n_1 + n_2} \]
and use that pooled value to estimate the standard error
\[ SE_{pooled}(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_1} + \frac{\hat{p}_{pooled}\,\hat{q}_{pooled}}{n_2}} \]
Now we find the test statistic
\[ z = \frac{\hat{p}_1 - \hat{p}_2}{SE_{pooled}(\hat{p}_1 - \hat{p}_2)} \]
When the conditions are met and the null hypothesis is true, this statistic follows the standard Normal model, so we can use that model to obtain a P-value.
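The pooled test can be sketched in Python and tried on the snoring data. The success counts below are assumptions: the slides give only percentages, so 26% of 184 is rounded to 48 snorers and 39% of 811 to 316. The function name is illustrative, and a two-sided alternative is assumed.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1: int, n1: int, x2: int, n2: int):
    """Return (z, two-sided P-value) for H0: p1 = p2 using the pooled SE."""
    p1_hat, p2_hat = x1 / n1, x2 / n2
    p_pooled = (x1 + x2) / (n1 + n2)
    q_pooled = 1 - p_pooled
    se_pooled = sqrt(p_pooled * q_pooled / n1 + p_pooled * q_pooled / n2)
    z = (p1_hat - p2_hat) / se_pooled
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value

# Under 30: about 48 of 184 snore; 30 and over: about 316 of 811 (rounded counts).
z, p_value = two_proportion_z_test(48, 184, 316, 811)
print(f"z = {z:.2f}, two-sided P-value = {p_value:.4f}")
```

A P-value this small would be hard to attribute to sampling fluctuation alone, although the exact figure depends on how the percentages are rounded to counts.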