Date post: | 25-Dec-2015 |
Category: |
Documents |
Upload: | marlene-stephens |
Experimental Statistics - week 2
Review Continued
• Sampling Distributions
  – Chi-square
  – F
• Statistical Inference
  – Confidence Intervals
  – Hypothesis Tests
Chi-Square Distribution (distribution of the sample variance)
IF:
• Data are Normally Distributed
• Observations are Independent
Then:
$$\frac{(n-1)S^2}{\sigma^2} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})^2}{\sigma^2}$$
has a Chi-Square distribution with n - 1 degrees of freedom
Chi-square Distribution, Figure 7.10, page 357
F-Distribution
IF:
• $S_1^2$ and $S_2^2$ are sample variances from 2 samples
• samples independent
• populations are both normal
Then:
$$F = \frac{S_1^2/\sigma_1^2}{S_2^2/\sigma_2^2}$$
has an F-distribution with $n_1 - 1$ and $n_2 - 1$ df
F-distribution, Figure 7.10, page 357
(1-$\alpha$)x100% Confidence Intervals for $\mu$
Setting:
• Data are Normally Distributed
• Observations are Independent
Case 1: $\sigma$ known
$$\bar{X} - z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + z_{\alpha/2}\frac{\sigma}{\sqrt{n}}$$
Case 2: $\sigma$ unknown
$$\bar{X} - t_{\alpha/2}\frac{S}{\sqrt{n}} \le \mu \le \bar{X} + t_{\alpha/2}\frac{S}{\sqrt{n}}$$
(df = n - 1)
CI Example
An insurance company is concerned about the number and magnitude of hail damage claims it received this year. A random sample of 20 of the thousands of claims it received this year resulted in an average claim amount of $6,500 and a standard deviation of $1,500. What is a 95% confidence interval on the mean claim damage amount?
Suppose that company actuaries believe the company does not need to increase insurance rates for hail damage if the mean claim damage amount is no greater than $7,000. Use the above information to make a recommendation regarding whether rates should be raised.
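A minimal sketch of the interval calculation for this example, assuming the unknown-standard-deviation (t) interval applies because only a sample standard deviation is given. The critical value 2.093 is the tabled t value for 19 df and an upper-tail area of 0.025; the variable names are illustrative only.

```python
import math

# Summary statistics from the hail damage example.
n = 20            # sample size
xbar = 6500.0     # sample mean claim amount, dollars
s = 1500.0        # sample standard deviation, dollars

# Tabled critical value t_{0.025} with n - 1 = 19 df (taken from a t table).
T_CRIT_19_DF = 2.093

margin = T_CRIT_19_DF * s / math.sqrt(n)
lower, upper = xbar - margin, xbar + margin

print(f"95% CI for the mean claim: (${lower:,.2f}, ${upper:,.2f})")
print("Interval contains $7,000:", lower <= 7000 <= upper)
```

The interval runs from roughly $5,798 to $7,202. Because it extends above $7,000, this sample alone does not rule out a mean claim above the actuaries' threshold.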
Interpretation of 95% Confidence Interval
100 different 95% CIs plotted for the case in which the true mean is 80
i.e. about 95% of these confidence intervals should “cover” the true mean
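The coverage idea can be checked with a small simulation. This is only a sketch: the true mean of 80 and the count of 100 intervals come from the slide, while the population standard deviation (10), the sample size (25), and the random seed are hypothetical choices made for illustration. It uses the known-standard-deviation (z) interval.

```python
import math
import random

random.seed(2)        # fixed seed so the run is repeatable (arbitrary choice)

TRUE_MEAN = 80.0      # true mean, as on the slide
SIGMA = 10.0          # hypothetical known population standard deviation
N = 25                # hypothetical sample size
Z_CRIT = 1.96         # z_{0.025} for a 95% interval
NUM_INTERVALS = 100   # number of intervals, as on the slide

covered = 0
for _ in range(NUM_INTERVALS):
    sample = [random.gauss(TRUE_MEAN, SIGMA) for _ in range(N)]
    xbar = sum(sample) / N
    margin = Z_CRIT * SIGMA / math.sqrt(N)
    if xbar - margin <= TRUE_MEAN <= xbar + margin:
        covered += 1

print(f"{covered} of {NUM_INTERVALS} intervals cover the true mean")
```

The count varies from run to run (or seed to seed) but should stay near 95, which is what the 95% confidence level describes.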
Concern has been mounting that SAT scores are falling.
• 3 years ago -- National AVG = 955
• Random Sample of 200 graduating high school students this year (sample average = 935) (each year the standard deviation is about 100)
Question: Have SAT scores dropped ?
Procedure: Determine how “extreme” or “rare” our sample AVG of 935 is if population AVG really is 955.
We must decide:
• The sample came from population with population AVG = 955 and just by chance the sample AVG is “small.”
OR
• We are not willing to believe that the pop. AVG this year is really 955. (Conclude SAT scores have fallen.)
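One way to quantify how "rare" a sample AVG of 935 would be is sketched below. It assumes the standard deviation of 100 can be treated as a known population value, so that a z statistic applies; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu0 = 955.0      # national average 3 years ago (value under the null)
xbar = 935.0     # this year's sample average
sigma = 100.0    # standard deviation, treated as known
n = 200          # sample size

z = (xbar - mu0) / (sigma / math.sqrt(n))
p_lower = NormalDist().cdf(z)   # P(Z <= z) when the population AVG is 955

print(f"z = {z:.2f}, P(sample AVG <= 935 | pop. AVG = 955) = {p_lower:.4f}")
```

The sample average sits about 2.8 standard errors below 955, and a value that low would occur well under 1% of the time by chance if the population AVG were still 955.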
Hypothesis Testing Terminology
Statistical Hypothesis
- statement about the parameters of one or more populations
Null Hypothesis ($H_0$)
- hypothesis to be “tested” (standard, traditional, claimed, etc.)
- hypothesis of no change, effect, or difference (usually what the investigator wants to disprove)
Alternative Hypothesis ($H_a$)
- null is not correct (usually the hypothesis the investigator suspects or wants to show)
Basic Hypothesis Testing Question:
Do the Data provide sufficient evidence to refute the Null Hypothesis?
Hypothesis Testing (cont.)
Critical Region (Rejection Region)
- region of test statistic that leads to rejection of null (i.e. t > c, etc.)
Critical Value
- endpoint of critical region
Significance Level ($\alpha$)
- probability that the test statistic will be in the critical region if null is true
- probability of rejecting $H_0$ when it is true
Types of Hypotheses
One-Sided Tests
$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_a: \mu > \mu_0$$
$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_a: \mu < \mu_0$$
Two-sided Tests
$$H_0: \mu = \mu_0 \quad \text{vs.} \quad H_a: \mu \neq \mu_0$$
Rejection Regions for One- and Two-Sided Alternatives
$H_0: \mu = \mu_0$ vs. $H_a: \mu > \mu_0$: Reject $H_0$ if $t > t_\alpha$
$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$: Reject $H_0$ if $t < -t_\alpha$ (critical value $-t_\alpha$)
$H_0: \mu = \mu_0$ vs. $H_a: \mu \neq \mu_0$: Reject $H_0$ if $|t| > t_{\alpha/2}$
A Standard Hypothesis Test Write-up
1. State the null and alternative
2. Give significance level, test statistic, and the rejection region
3. Show calculations
4. State the conclusion
   - statistical decision
   - give conclusion in language of the problem
Hypothesis Testing Example 1
A solar cell requires a special crystal. If properly manufactured, the mean weight of these crystals is .4g. Suppose that 25 crystals are selected at random from a batch of crystals and it is calculated that for these crystals, the average is .41g with a standard deviation of .02g. At the $\alpha$ = .01 level of significance, can we conclude that the batch is bad?
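A sketch of the calculation step for Example 1. It assumes that "bad" means the mean weight differs from .4g in either direction, so the test is two-sided; that reading is an assumption, not something the example states. The critical value 2.797 is the tabled t value for 24 df and an upper-tail area of 0.005.

```python
import math

mu0 = 0.40     # mean weight (g) when properly manufactured
n = 25         # number of crystals sampled
xbar = 0.41    # sample average (g)
s = 0.02       # sample standard deviation (g)

# Assumption: two-sided alternative (mean differs from .4g), alpha = .01.
# Tabled t_{0.005} with n - 1 = 24 df.
T_CRIT_24_DF = 2.797

t = (xbar - mu0) / (s / math.sqrt(n))
reject = abs(t) > T_CRIT_24_DF

print(f"t = {t:.3f}, reject H0 at alpha = .01: {reject}")
```

Under these assumptions t is 2.5, which falls short of 2.797, so the null is not rejected at the .01 level.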
Hypothesis Testing Example 2
A box of detergent is designed to weigh on the average 3.25 lbs per box. A random sample of 18 boxes taken from the production line on a single day has a sample average of 3.238 lbs and a standard deviation of 0.037 lbs. Test whether the boxes seem to be underfilled.
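A sketch of the calculation step for Example 2, using a lower-tailed test because the question is about underfilling. The example gives no significance level, so $\alpha$ = .05 is assumed here purely for illustration; 1.740 is the tabled t value for 17 df and an upper-tail area of 0.05.

```python
import math

mu0 = 3.25      # design mean weight, lbs
n = 18          # number of boxes sampled
xbar = 3.238    # sample average, lbs
s = 0.037       # sample standard deviation, lbs

# Assumption: alpha = .05 (the example does not state a level).
# Tabled t_{0.05} with n - 1 = 17 df, for the lower-tailed test Ha: mu < 3.25.
T_CRIT_17_DF = 1.740

t = (xbar - mu0) / (s / math.sqrt(n))
reject = t < -T_CRIT_17_DF

print(f"t = {t:.3f}, reject H0 at alpha = .05: {reject}")
```

With t about -1.38, the statistic does not fall below -1.740, so at the assumed level the data do not show that the boxes are underfilled.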
Errors in Hypothesis Testing
Conclusion | Actual Situation: Null is True | Actual Situation: Null is False
Do Not Reject Ho | Correct Decision ($1 - \alpha$) | Type II Error ($\beta$)
Reject Ho | Type I Error ($\alpha$) | Correct Decision ($1 - \beta$) (Power)
p-Value
- the probability of an observation as extreme or more extreme than the one observed when the null is true
$H_0: \mu = \mu_0$ vs. $H_a: \mu < \mu_0$: Reject $H_0$ if $t < -t_\alpha$
Suppose t = -2.39 is observed from data for test above
Note: “Large negative values” of t make us believe alternative is true
[Figure: t curve with the observed value of t, -2.39, marked; the p-value is the area to its left]
Note:
-- if p-value is less than or equal to $\alpha$ then we reject null at the $\alpha$ significance level
-- the p-value is the smallest level of significance at which the null hypothesis would be rejected
Find the p-values for Examples 1 and 2
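A sketch of how these p-values could be computed without a t table. The helper functions `t_pdf` and `t_upper_tail` are hypothetical names; they integrate the Student's t density numerically with Simpson's rule and are not taken from any library. Example 1 is treated as two-sided and Example 2 as lower-tailed, both of which are assumptions about how the examples are meant to be read.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail(t, df, steps=2000):
    """P(T > |t|), using Simpson's rule on the density over [0, |t|]."""
    b = abs(t)
    h = b / steps                      # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 - total * h / 3

# Example 1: 24 df; the two-sided alternative is an assumption.
t1 = (0.41 - 0.40) / (0.02 / math.sqrt(25))
p1 = 2 * t_upper_tail(t1, 24)

# Example 2: lower-tailed test with 17 df; t2 is negative, so by symmetry
# P(T < t2) equals P(T > |t2|).
t2 = (3.238 - 3.25) / (0.037 / math.sqrt(18))
p2 = t_upper_tail(t2, 17)

print(f"Example 1: t = {t1:.3f}, two-sided p-value = {p1:.4f}")
print(f"Example 2: t = {t2:.3f}, lower-tail p-value = {p2:.4f}")
```

Under these assumptions the p-value for Example 1 comes out just under .02 and the p-value for Example 2 is a little over .09, so neither null would be rejected at $\alpha$ = .01, and Example 2 would not be rejected at $\alpha$ = .05 either.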
Two Independent Samples
• Assumptions: Measurements from Each Population are
  – Mutually Independent
      Independent within Each Sample
      Independent Between Samples
  – Normally Distributed (or the Central Limit Theorem can be Invoked)
• Analysis Differs Based on Whether the Two Populations Have the Same Standard Deviation
Two Types of Independent Samples
• Population Standard Deviations Equal
  – Can Obtain a Better Estimate of the Common Standard Deviation by Combining or “Pooling” Individual Estimates
• Population Standard Deviations Different
  – Must Estimate Each Standard Deviation
  – Very Good Approximate Tests are Available
If Unsure, Do Not Assume Equal Standard Deviations
Equal Population Standard Deviations
Test Statistic
$$t = \frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
df = n1 + n2 - 2
where
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}, \qquad s_p = \sqrt{s_p^2}$$
Behrens-Fisher Problem
If $\sigma_1 \neq \sigma_2$:
$$\frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
is not distributed as t
Satterthwaite’s Approximate t Statistic
If $\sigma_1 \neq \sigma_2$:
$$\frac{(\bar{y}_1 - \bar{y}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}}$$
is approximately distributed as t (i.e. approximate t)
$$df = \frac{(a + b)^2}{\frac{a^2}{n_1 - 1} + \frac{b^2}{n_2 - 1}}, \qquad a = \frac{s_1^2}{n_1}, \quad b = \frac{s_2^2}{n_2}$$
(Approximate t df)
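The statistic and its approximate degrees of freedom can be sketched as a small function. The function name `satterthwaite_t`, its parameter names, and the summary statistics in the call are all hypothetical, chosen only to illustrate the formulas.

```python
import math

def satterthwaite_t(ybar1, s1, n1, ybar2, s2, n2, delta0=0.0):
    """Return (t, df) for the unequal-standard-deviation two-sample statistic.

    delta0 is the hypothesized value of mu1 - mu2.
    """
    a = s1 ** 2 / n1
    b = s2 ** 2 / n2
    t = ((ybar1 - ybar2) - delta0) / math.sqrt(a + b)
    df = (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))
    return t, df

# Hypothetical summary statistics, chosen only to illustrate the call.
t_stat, df = satterthwaite_t(ybar1=52.0, s1=4.0, n1=16, ybar2=48.0, s2=9.0, n2=9)
print(f"t = {t_stat:.4f}, approximate df = {df:.2f}")
```

The approximate df is usually not an integer. In this illustration it is about 9.8, well below the pooled value n1 + n2 - 2 = 23, which reflects the very different sample standard deviations.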
Often-Recommended Strategy for Tests on Means
Test Whether $\sigma_1 = \sigma_2$ (F-test)
– If the hypothesis is not rejected, use the 2-sample t statistic, assuming equal standard deviations
– If the hypothesis is rejected, use Satterthwaite’s approximate t statistic
NOTE: This is Not a Wise Strategy
– the F-test is highly susceptible to non-normality
Recommended Strategy:
– If uncertain about whether the standard deviations are equal, use Satterthwaite’s approximate t statistic
Example 3: Comparing the Mean Breaking Strengths of 2 Plastics
Plastic A: $n_A = 35$, $\bar{y}_A = 28.3$, $s_A = 3.3$
Plastic B: $n_B = 40$, $\bar{y}_B = 26.7$, $s_B = 4.9$
Assumptions:
• Mutually independent measurements
• Normal distributions for measurements from each type of plastic
• Equal population standard deviations
Question: Is there a difference between the 2 plastics in terms of mean breaking strength?
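A sketch of the pooled two-sample calculation for this example, following the stated assumption of equal population standard deviations. The summary statistics are read as n = 35, mean 28.3, standard deviation 3.3 for Plastic A and n = 40, mean 26.7, standard deviation 4.9 for Plastic B; treat that reading as an assumption and substitute the values from your copy of the slide if they differ.

```python
import math

# Summary statistics as read from the slide (an assumption; see the note above).
n_a, ybar_a, s_a = 35, 28.3, 3.3   # Plastic A
n_b, ybar_b, s_b = 40, 26.7, 4.9   # Plastic B

df = n_a + n_b - 2
sp2 = ((n_a - 1) * s_a ** 2 + (n_b - 1) * s_b ** 2) / df      # pooled variance
sp = math.sqrt(sp2)                                           # pooled std. deviation
t = (ybar_a - ybar_b) / (sp * math.sqrt(1 / n_a + 1 / n_b))   # H0: mu_A = mu_B

print(f"s_p = {sp:.3f}, t = {t:.3f}, df = {df}")
```

With these values t is about 1.63 on 73 df. The example states no significance level; at an assumed two-sided $\alpha$ = .05 the critical value is about 1.99, so the difference in mean breaking strength would not be declared significant.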
New diet -- Is it effective?
Design:
50 people: randomly assign 25 to go on diet and 25 to eat normally for next month.
Assess results by comparing weights at end of 1 month.
Diet: $\bar{X}_D$, $S_D$
No Diet: $\bar{X}_{ND}$, $S_{ND}$
Run 2-sample t-test using guidelines we have discussed.
Is this a good design?
Better Design:
Randomly select subjects and measure them before and after 1 month on the diet.
Subject   Before   After   Difference
1         150      147     3
2         210      195     15
:         :        :       :
n         187      190     -3
Procedure: Calculate differences, and analyze differences using a 1-sample test
“Paired t-Test”
Example 4: International Gymnastics Judging
Data:
Contestant      1    2    3    4    5    6    7    8    9    10   11   12
Native Judge    6.8  4.5  8.0  7.2  8.7  4.5  6.6  5.8  6.0  8.8  8.7  4.4
Foreign Judges  6.7  4.3  8.1  7.2  8.3  4.6  5.4  5.9  6.1  9.1  8.7  4.3
Question: Do judges from a contestant’s country rate their own contestant higher than do foreign judges?
i.e. test $H_0: \mu_N = \mu_F$ vs. $H_a: \mu_N > \mu_F$
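Because each contestant is scored by both kinds of judge, the paired procedure applies: take the native-minus-foreign difference for each contestant and run a 1-sample t test on the differences. The sketch below uses the scores from the example. The significance level $\alpha$ = .05 is an assumption, since none is stated, and 1.796 is the tabled t value for 11 df and an upper-tail area of 0.05.

```python
import math
from statistics import mean, stdev

native = [6.8, 4.5, 8.0, 7.2, 8.7, 4.5, 6.6, 5.8, 6.0, 8.8, 8.7, 4.4]
foreign = [6.7, 4.3, 8.1, 7.2, 8.3, 4.6, 5.4, 5.9, 6.1, 9.1, 8.7, 4.3]

# Paired differences, native minus foreign, one per contestant.
diffs = [nat - fgn for nat, fgn in zip(native, foreign)]
n = len(diffs)
d_bar = mean(diffs)    # mean difference
s_d = stdev(diffs)     # sample standard deviation of the differences
t = d_bar / (s_d / math.sqrt(n))

# Assumption: alpha = .05, upper-tailed test; tabled t_{0.05} with 11 df.
T_CRIT_11_DF = 1.796
reject = t > T_CRIT_11_DF

print(f"mean diff = {d_bar:.4f}, s_d = {s_d:.4f}, t = {t:.3f}, reject H0: {reject}")
```

The mean difference is about 0.11 with t close to 0.97 on 11 df, so at the assumed level these data do not show that native judges score their own contestants higher.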