8/9/2019 12 SSGB Amity BSI Hypothesis
Module-12
Hypothesis Testing
Objectives of this module
To introduce a range of hypothesis tests suitable for testing for differences between averages.
Specifically:
1 Sample t-test
2 Sample t-test
Paired t-test
ANOVA
Tools for Verifying the Causes
Verify the Causes (in order of increasing complexity of relationship):
Plots (Scatter, Matrix, Boxplot, Dotplot): identify possible relationships
Correlation, Hypothesis Tests: determine strength of relationships
ANOVA, Regression: quantify relationships
Design of Experiments: identify and quantify complex relationships
Hypothesis Tests for Average
Before a hypothesis test for averages can be selected, there are two preliminary questions:
1) Are the samples Normally distributed?
Each of the samples that are to be included in the test must contain Normally distributed data in order for the results of the test to be statistically valid.
2) How many samples do you want to compare?
The number of samples refers to the number of different samples (subgroups) that are to be compared, not the amount of data that is in each of those samples (that's sample size).
To compare the Averages of samples of data
[Flowchart]
Are the samples normally distributed?
YES: Determine the number of samples
1 sample: 1 Sample t-Test
2 samples: 2 Sample t-Test or Paired t-Test
3+ samples: One-way ANOVA
NO: Transform the data using Box-Cox
Still NO: Compare Median values
To compare the Medians of samples of data
[Flowchart]
Are the samples normally distributed?
YES: Determine the number of samples
NO: Transform the data using Box-Cox
Still NO: Compare Median values
Does the data have outliers?
YES: Use Mood's Median Test
NO: Use the Kruskal-Wallis Test
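The two flowcharts above can be read as one small decision function. The sketch below is only an illustration in Python: the function name and its parameters are hypothetical, and the `paired` flag is an assumption used to separate the 2 Sample t-test from the Paired t-test (the flowchart lists both under "2").

```python
def choose_test(normal, n_samples, paired=False,
                normal_after_boxcox=False, outliers=False):
    """Suggest a test by following the two flowcharts (illustrative only)."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    # Averages route: data is Normal, or becomes Normal after a Box-Cox transform
    if normal or normal_after_boxcox:
        if n_samples == 1:
            return "1 Sample t-test"
        if n_samples == 2:
            # Assumption: matched data goes to the paired test
            return "Paired t-test" if paired else "2 Sample t-test"
        return "One-way ANOVA"
    # Medians route: the choice depends on whether the data has outliers
    return "Mood's Median Test" if outliers else "Kruskal-Wallis Test"


print(choose_test(normal=True, n_samples=2))
print(choose_test(normal=False, n_samples=3, outliers=True))
```

The function only encodes the routing; checking Normality and outliers still has to be done on the data first.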
An Expensive Situation
You manage a warranty claims department. A customer claims loss of earnings of $1,245 for an item for which claims are usually about $1,000.
You examine 250 previous claims for the same item to make a comparison and find the average is indeed close to that figure: $997, with a standard deviation of $88.
You want to know whether the customer is over-claiming or whether the claim is reasonable.
Innocent or Guilty?
If the customer is not over-claiming (it is a legitimate claim) then we would expect the claim to fit with the pattern of data represented by the previous 250 claims.
If the customer is over-claiming then we would expect the claim not to fit the pattern of data from the previous 250 claims.
Does the Claim fit the Pattern?
You remember from the previous module that if the data is normally distributed you can apply normal theory.
[Figure: normal curve of the previous claims with s = 88; axis marked at 997, 1085, 1173 and 1261, with the claim of 1245 marked in the upper tail]
By using standard Normal tables we can calculate the probability of a claim of $1245 based on the historical data. If the probability is low, then we can assume an illegitimate claim.
Z = (X - X̄) / s
Z = (1245 - 997) / 88 = 2.82
From Standard Normal tables the P-value, the probability of being equal to or greater than 1245, is 0.0024 or 0.24%. In other words we would expect such a claim to happen 1 in 417 claims.
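The table lookup above can also be checked numerically. This is a minimal sketch in Python using only the standard library; the function name is hypothetical, and the inputs (1245, 997, 88) are the figures from the slide.

```python
import math


def upper_tail_probability(x, mean, sd):
    """Return (Z, P) where P is the Normal probability of a value >= x."""
    z = (x - mean) / sd
    # Upper tail area of the standard Normal: 1 - Phi(z)
    p = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p


z, p = upper_tail_probability(1245, 997, 88)
print(f"Z = {z:.2f}, P-value = {p:.4f}")
```

Because the code keeps Z unrounded, the "1 in N" figure it implies can differ slightly from the 1 in 417 obtained from the rounded table value.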
P-values are Probabilities of Interest
P-value
Tail area
Area under the curve beyond the value of interest
Probability of being at the value of interest or beyond
[Figure: three standard Normal curves on axes from -4 to 4, each with the value of interest marked]
Making the Decision
The Normal theory is telling us that, based on the previous 250 claims, we should expect a claim of $1245 or greater once in every 417 claims.
Hence, either:
This claim is legitimate: it is that 1 in 417
This claim is not legitimate: it does not fit the previous data
Experience shows that a small p-value (0 to 0.05) means the probability is small that the value of interest comes from that distribution by chance; therefore something else is going on.
Since our p-value of 0.0024 < 0.05 we can conclude that the claim is not legitimate.
What Have we done?
In the previous example we have used the properties of the Normal distribution to test whether the occurrence of an event could have happened by chance (the data fits the expected pattern) or there is a real difference (the data does not fit the expected pattern).
This type of situation occurs frequently during Six Sigma improvement projects, either in the
Analyze phase, when we are looking for differences to identify potential root causes
Improve and Control phases, when we are aiming to demonstrate that a real change has been made: we have made a difference
Analyze: Is there a Difference?
A common question during the Analyze phase is: is there really a difference?
For example: We suspect the output of a process depends upon the supplier.
[Figure: Supplier A feeds Process YA and Supplier B feeds Process YB; each process shows variation in process output, summarised by a sample (yA and sA, yB and sB)]
Is there a difference?
Improve: Have We Made a Difference?
A common question having improved a process is: have we really made a difference?
[Figure: a CHANGE turns Old Process Yold into New Process Ynew; each process shows variation in process output, summarised by a sample (yold and sold, ynew and snew)]
Is there a difference?
Is there a difference?
[Figure: histograms of Yold (axis 10 to 20) and Ynew (axis 10 to 19), with Frequency from 0 to 20]
Looking at the histograms may give us the idea that there is a difference.
But can we be sure?
A Difference?
It is possible that we have taken two samples from the same distribution.
[Figure: a single distribution with the Old sample and the New sample both drawn from it]
While on face value they look different, statistically they are not! The difference is due to chance (sampling).
Hypothesis Testing
In the first example we calculated probabilities to help us decide. This situation is common, where we are interested in making a decision about differences between two or more sets of data that relate to real world situations.
Hypothesis Testing allows us to decide whether an observed difference is real or has happened by chance.
In simple terms we expect there to be a difference between expected and observed outcomes due to random (chance/common cause variation) factors. Hypothesis testing is concerned with whether these differences are so large that they cannot be explained by random (chance) effects.
Hypothesis tests are often based on sample data, in which case we use sampling distributions rather than the distribution of individual values.
Hypothesis Testing Concept
In order to decide whether an apparent difference is real or due to chance we start by assuming that there is no difference (Null Hypothesis, H0).
If the observed difference is within that expected by chance then we cannot reject the Null Hypothesis of no difference.
If the observed difference is greater than that expected by chance then we reject the Null Hypothesis of no difference.
Greater Than Expected!
[Figure: distribution of sample means centred on the mean x̄, with the axis marked at ±1σx̄, ±2σx̄ and ±3σx̄]
±1σx̄ covers 68.26%
±2σx̄ covers 95.45%
±3σx̄ covers 99.73%
Theoretically the limits are -∞ to +∞ (100%)
We cannot be 100% certain that an actual sample mean falls outside the distribution. There is always an element of risk.
95% Confidence
[Figure: distribution of sample means centred on the mean x̄, with 95% of the area between -1.96σx̄ and +1.96σx̄]
Experience has shown that setting the decision boundaries at 95% provides a good balance between being able to make a decision and the risk of making the wrong decision.
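The coverage percentages for ±1, ±2 and ±3 standard deviations, and the ±1.96 boundary for 95%, can be reproduced with Python's standard library. This is a small sketch for checking the numbers; the helper name `coverage` is hypothetical.

```python
from statistics import NormalDist

std_normal = NormalDist()  # mean 0, standard deviation 1


def coverage(k):
    """Fraction of a Normal distribution lying within +/- k standard deviations."""
    return std_normal.cdf(k) - std_normal.cdf(-k)


for k in (1, 2, 3, 1.96):
    print(f"+/- {k} sigma covers {coverage(k):.2%}")

# Boundary that leaves 2.5% in each tail, i.e. 95% in the middle
boundary = std_normal.inv_cdf(0.975)
print(f"95% boundary: {boundary:.2f}")
```

The exact ±1 sigma value rounds to 68.27%; published tables often quote 68.26%.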
Risk
[Figure: distribution of sample means centred on the mean x̄, with the central 95% region marked]
A point here (outside the 95% region) could be due to chance or there is a difference.
If we decide they are different when they are not we have made a Type I error.
The risk of making this error is called the α-risk.
A point here (inside the 95% region) could be due to chance or there is a difference.
If we decide they are the same when they are not we have made a Type II error.
The risk of making this error is called the β-risk.
Type I and II errors
Type I - Deciding they are different when they are not
Type II - Not detecting a difference when there is one
P-value = the probability of seeing a difference at least this large by chance when there is really no difference; it is compared with the α-risk, the probability of a Type I error we are prepared to accept
The probability of a Type II error can be calculated given an assumed true difference
Both are important
Guarding too heavily against one increases the risk of the other
Increasing the sample size
Reduces Type II errors
Allows you to detect smaller differences
More later
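A short simulation can make the Type I error concrete. The sketch below repeatedly draws an "old" and a "new" sample from the same Normal distribution and counts how often a two-sided test at the 5% level declares a difference. All the numbers (mean 15, standard deviation 2, sample size 30, 2000 trials) are illustrative assumptions, and for simplicity the test is a z-test with the standard deviation treated as known.

```python
import random
from statistics import NormalDist, mean


def false_alarm_rate(trials=2000, n=30, mu=15.0, sigma=2.0, alpha=0.05, seed=1):
    """Share of trials in which two samples from ONE process look 'different'."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # 1.96 for alpha = 0.05
    se = sigma * (2 / n) ** 0.5                    # std. error of the difference
    rejections = 0
    for _ in range(trials):
        old = [rng.gauss(mu, sigma) for _ in range(n)]
        new = [rng.gauss(mu, sigma) for _ in range(n)]
        z = (mean(new) - mean(old)) / se
        if abs(z) > z_crit:
            rejections += 1                        # a Type I error
    return rejections / trials


rate = false_alarm_rate()
print(f"False alarm rate: {rate:.3f}")
```

The rate should land close to the chosen α-risk of 5%, which is what "setting the risk level" means in practice.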
Hypothesis Tests
Hypothesis tests can be used in a number of situations:
A sample is taken and found to have a mean x̄; is this value equal to a target value T?
Samples taken before and after a change, the two sample means are different - but are they?
Samples from a process for different stratification factors, the two sample means are different - but are they?
Samples taken before and after a change, the standard deviations are different - but are they?
Samples from a process for different stratification factors, the two sample standard deviations are different - but are they?
The proportion defective appears to vary with different factors - but does it?
Steps in Hypothesis Testing
Hypothesis tests follow the pattern
1. determine the null hypothesis H0
2. determine the alternative hypothesis H1 or Ha
3. determine statistical test
4. state level of risk
5. establish sample size
6. collect data
7. analyse data
8. accept or reject the null hypothesis
9. determine conclusion
1. The Null Hypothesis
The null hypothesis is always stated so as to
specify that there is no (null) difference
When comparing means or variances the null
hypothesis is:
H0: μ = Target
H0: μ1 = μ2
H0: σ1 = σ2
2. The Alternative Hypothesis
The alternative hypothesis opposes the null hypothesis and can therefore take several forms.
For example:
H1: μ1 < μ2
or
H1: μ1 > μ2
or
H1: μ1 ≠ μ2
The choice depends upon the problem. Minitab gives the option to select these.
3. Determine Tests
Hypothesis Test | Can be used to
t-test | Compare a group average against a target. Compare two group averages.
Paired t-test | Compare two group averages when data is matched.
ANOVA (Analysis Of Variance) | Compare two or more group averages.
Test for equal variances (F-test, Bartlett's test, Levene's test) | Compare two or more group variances.
Chi-square test | Compare two or more group proportions.
3. Determine Test
Hypothesis tests are used when the input or process (X) variable is discrete.
If the X data are continuous, use Regression analysis to judge whether they are related to the output (Y) variable.
Y \ X | X Discrete (Groups) | X Continuous
Y Continuous | t-test, Paired t-test, ANOVA | Regression
Y Discrete (Proportions) | Chi-Square | Logistic regression
4. State level of Risk
It is very important to specify, before conducting the test, the level of risk that the decision will be based on.
The risk means that there is the chance that we will make the wrong decision.
In assessing the risk level we need to consider the situation and the business: it is a question of significance vs. importance.
Significant means there is a real difference as shown by a hypothesis test.
Important means that the difference observed is important to the business.
Typically the α-risk is set at 5%, which provides 95% confidence.
"Important" vs. "Significant" Differences,
Important but not significant differences
Sometimes you cannot claim a difference is statistically significant, yet the observed difference is of importance to the business.
Example: Production volumes
An increase of 1000 items produced per day is observed during the pilot.
An increase of 1000 is important to the business.
However, the difference is not statistically significant.
Either the observed difference is due to random variation and no true difference exists, or the variation is too large (or the sample size too small) to detect the difference.
You (and your business leaders) need to decide if it is worth the risk to go ahead and implement the new process.
5. Establish Sample Size
The ability to detect differences is related to the sample size.
A small sample means we cannot detect small differences, but the cost of data collection is low.
A large sample size will enable the detection of small differences, but the cost of data collection could be high.
Selecting the right sample size depends upon the Power = (1 - β)
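To illustrate how Power = (1 - β) grows with sample size, the sketch below uses the textbook power formula for a one-sided z-test with a known standard deviation. This is a simplification of the t-tests used later in the module, and the shift of 0.5, standard deviation of 2.0 and sample sizes are illustrative assumptions, not values from the slides.

```python
from statistics import NormalDist


def power_one_sided_z(true_shift, sigma, n, alpha=0.05):
    """Power (1 - beta) of a one-sided z-test to detect a shift of `true_shift`."""
    z = NormalDist()
    z_crit = z.inv_cdf(1 - alpha)  # decision boundary under H0
    # Probability that the test statistic exceeds the boundary when the shift is real
    return z.cdf(true_shift * n ** 0.5 / sigma - z_crit)


for n in (10, 50, 200):
    print(n, round(power_one_sided_z(true_shift=0.5, sigma=2.0, n=n), 3))
```

With no true shift the "power" equals the α-risk, which is a quick way to sanity-check the formula.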
6. Collect Data
Remember all the rules and guidelines about collecting data
Sample size
Good operational definition
Clear measurement procedure
Un-biased samples
Good Gauge R&R
7. Analyse Data
All hypothesis tests can be conducted by hand
calculation BUT
Minitab Rules!
8. Acceptance or Rejection
Method 1
if the p-value is > or = to α - fail to reject H0
if the p-value is < α - reject H0 and accept H1
Method 2
if μ0 (the hypothesised value) falls within the confidence interval - fail to reject H0
if μ0 falls outside the confidence interval - reject H0 and accept H1
In practice we use both methods, from information provided by Minitab
t-tests
t-tests: a bit of history
Mr Gosset (1876-1937) worked for the Guinness company and published his work under the pen name of "Student", hence the Student's t-test.
The t-test measures the significance of a shift in the average.
1 Sample t-test (1)
1 Sample t-tests are used to compare the average
of one sample of data against a known average.
The known average may be a historical average
or an industry benchmark.
For the purpose of the test, the known average is
assumed to be exact.
Open data file: One-Sample-t.MPJ
1 Sample t-test (2)
A project team has implemented a new invoicing process, and wants to compare it against the historical average of 16.5 days.
[Figure: Individual Value Plot of New Process; axis from 10 to 22, with a reference line at the historical average of 16.5]
Hypothesis:
The Individual Value Plot suggests that the average invoice time for the new process is lower (quicker) than the historical average of 16.5.
1 Sample t-test (3)
A 1 Sample t-test can be used to test if the difference between the averages is statistically significant.
The hypotheses for the test would be:
Ho: There is no difference between the average invoice time of the new process and 16.5
Ha: The average invoice time of the new process is lower than 16.5
Because our theory is that there is a difference, we expect to reject the Null hypothesis.
1 Sample t-test (4)
MINITAB: Stat > Basic Statistics > 1-Sample t
There are two options for entering data.
1st Option: Enter the column that contains the data.
2nd Option: If you know the sample size, mean and standard deviation of the data, they can be entered instead.
In both cases, enter the known average that the data is to be compared against.
1 Sample t-tests (5)
MINITAB: Stat > Basic Statistics > 1-Sample t
Select the graph most appropriate for the sample sizes involved. If in doubt, check all three.
Leave the confidence level at 95%
Alternative: less than
1 Sample t-tests (6)
The p-value for the 1 sample t-test is found in the session window output. In this case, the p-value is 0.040.
Session Window Output for 1 Sample t-test:
One-Sample T: New Process
Test of mu = 16.5 vs < 16.5
Variable     N   Mean     StDev   SE Mean  95% Upper Bound  T      P
New Process  15  14.9818  3.1218  0.8060   16.4015          -1.88  0.040
Since the p-value for this test is less than 0.05, we can be 95% confident that there is a difference between the average invoicing time of the new process versus the historical average. We reject the Null Hypothesis.
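The SE Mean, T and P figures in the session output can be reproduced from the summary statistics alone. The sketch below is a hand calculation in Python, not Minitab's own method: the helper `t_cdf` is hypothetical and obtains the Student's t probability by numerically integrating the density (Simpson's rule), so the last digits may differ slightly from a statistics package.

```python
import math


def t_cdf(t, df, steps=2000):
    """CDF of Student's t, integrating the density from 0 to |t| (Simpson's rule)."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(x):
        return c * (1 + x * x / df) ** (-(df + 1) / 2)

    h = abs(t) / steps                      # steps must be even for Simpson's rule
    area = pdf(0) + pdf(abs(t))
    for i in range(1, steps):
        area += (4 if i % 2 else 2) * pdf(i * h)
    area *= h / 3
    return 0.5 + math.copysign(area, t)


# Summary statistics taken from the session output above
n, sample_mean, sd, mu0 = 15, 14.9818, 3.1218, 16.5

se_mean = sd / math.sqrt(n)
t_stat = (sample_mean - mu0) / se_mean
p_value = t_cdf(t_stat, df=n - 1)           # Alternative: less than

print(f"SE Mean = {se_mean:.4f}, T = {t_stat:.2f}, P = {p_value:.3f}")
print("Reject H0" if p_value < 0.05 else "Fail to reject H0")
```

The degrees of freedom are n - 1 = 14 because a single sample mean is being estimated.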
1 Sample t-tests (7)
[Figure: Individual Value Plot of New Process (with Ho and 95% t-confidence interval for the mean); axis from 10 to 22, with X̄ and Ho marked]
[Figure: Histogram of New Process (with Ho and 95% t-confidence interval for the mean); axis from 10 to 22, Frequency from 0 to 5, with X̄ and Ho marked]
MINITAB adds a blue line to the graphs that represents the confidence interval of the sample average, and also indicates the position of Ho (the known average).
In this case, Ho (the known average) is outside the confidence interval of the sample average and therefore we can conclude that there is a difference between the two.
1 Sample t-test: Whiskey or water?
Problem: A cocktail bar suspects that one of its suppliers of whiskey is adding water to increase profits. The freezing temperature of whiskey has a normal distribution with a mean of μ = -0.545 °C. Adding water raises the freezing temperature. The cocktail bar wants to know if its suspicions are correct. Is the supplier adding water to its whiskey?
Procedure: A sample from each of ten bottles is obtained from this supplier.
Experimental Unit: A bottle.
Measurement: Freezing temperature of the sample.
Significance Level: α = 0.05
Data Set: Whiskey.MPJ
The concept of Two Sample t-tests
You take two samples of data from the same process, but at different times. Has the process mean moved?
Would you think the process mean has moved if the results had been..
What did you look at to make your decisions?
2 Sample t-tests (1)
2 Sample t-tests are used to compare the averages of two samples of data. The two samples may represent:
two different suppliers
two different processes
two different teams
two different products etc.
The two samples can be of different sample sizes, since the 2 sample t-test takes account of both sample sizes.
The two samples can have different levels of variation (standard deviation) since the 2 sample t-test takes account of this as well.
Open data file: Two-Sample-T-Appeal.MPJ
2 Sample t-tests (2)
A project team has implemented a new appeals process, and the box plot shows the cycle time of the new and old processes.
[Figure: Boxplot of Time (days) vs Process; Process (New, Old) on the horizontal axis, Time (days) from 25.0 to 42.5 on the vertical axis]
Hypothesis:
The Box Plot suggests that the average cycle time for the new appeals process is lower (quicker) than the old process.
2 Sample t-tests (3)
A 2 Sample t-test can be used to test if the difference between the average cycle times is statistically significant.
The hypotheses for the test would be:
Ho: There is no difference between the average cycle times for the new and old processes.
Ha: The average cycle time for the new process is lower than the average cycle time for the old process.
Because our theory is that the new is lower, we expect to reject the Null hypothesis.
2 Sample t-tests (4)
MINITAB: Stat > Basic Statistics > 2-Sample t
There are several options for entering data.
1st Option: If the data is in a single column, with a second column containing subgroup data.
2nd Option: If the two data samples are in two separate columns.
3rd Option: If you know the sample size, mean and standard deviation of both samples, this information can be entered directly.
2 Sample t-tests (5)
MINITAB: Stat > Basic Statistics > 2-Sample t
Set the options as follows:
a confidence of 95%
a test difference of 0.0
Alternative: less than
2 Sample t-tests (6)
Two-Sample T-Test and CI: Time (days), Process
Two-sample T for Time (days)
Process  N   Mean   StDev  SE Mean
New      20  32.12  3.46   0.77
Old      25  34.35  3.12   0.62
Difference = mu (New) - mu (Old)
Estimate for difference: -2.23700
95% upper bound for difference: -0.56083
T-Test of difference = 0 (vs
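The test statistic behind this output can be sketched from the summary table. The calculation below assumes the unpooled (Welch) form of the 2 sample t-test, which fits the earlier remark that the two samples may have different standard deviations; that choice is an assumption, and because the inputs are the rounded table values the results are approximate rather than a reproduction of the Minitab output.

```python
import math

# Summary statistics from the table above (rounded values)
new = {"n": 20, "mean": 32.12, "sd": 3.46}
old = {"n": 25, "mean": 34.35, "sd": 3.12}

# Variance of each sample mean
v_new = new["sd"] ** 2 / new["n"]
v_old = old["sd"] ** 2 / old["n"]

diff = new["mean"] - old["mean"]          # mu (New) - mu (Old)
se_diff = math.sqrt(v_new + v_old)        # unpooled standard error (assumption)
t_stat = diff / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom
df = (v_new + v_old) ** 2 / (
    v_new ** 2 / (new["n"] - 1) + v_old ** 2 / (old["n"] - 1)
)

print(f"difference = {diff:.2f}, SE = {se_diff:.3f}, t = {t_stat:.2f}, df = {df:.1f}")
```

A negative t value is what the "less than" alternative looks for: the new process mean sits below the old one by more than two standard errors.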
2 Sample t-tests (7)
[Figure: Boxplot of Time (days) by Process; Process (New, Old) on the horizontal axis, Time (days) from 25.0 to 42.5 on the vertical axis]
[Figure: Individual Value Plot of Time (days) vs Process; Process (New, Old) on the horizontal axis, Time (days) from 25.0 to 42.5 on the vertical axis]
The graphs provided by the 2 sample t-test are the same as those provided by using the graph menu.
By default, MINITAB adds a symbol to show the average of each subgroup.
2 Sample t-tests (8)
Exercise: Using the data file: Two-Sample-T-Appeal.MPJ
2) Repeat the test using the "samples in different columns" option.
3) Repeat the test using the "summarised data" option.