8/11/2019 Six Sigma Interview Helper
http://slidepdf.com/reader/full/six-sigma-interview-helper 1/5
Measure
DATA COLLECTION PLAN
Define a clear strategy for collecting reliable data efficiently.
Ensure resources are used effectively to collect only data that is critical to the success of the project.
Ensure that the collected data accurately represents the true nature of your process.
Data collection is costly; therefore, ensure the data collection plan and the measurement system are sound.
Sampling
When to sample: collecting all the data is impractical or too costly; data collection can be a destructive process; when measuring a high-volume process
When not to sample: A subset of data may not accurately depict the process, leading to a wrong conclusion (every unit is unique – e.g., structured deals)
The major reason sampling is done is efficiency: it is often too costly or time-consuming to measure all of the data. Sampling provides a good alternative for collecting data in an effective and efficient manner. If the circumstances surrounding the data collection plan do not justify sampling, then sampling should not be done. This is often the case in low-volume processes.
Sampling methods: Random, Stratified, Clustered, Systematic
Sampling biases: Convenience bias, Environment bias, Measurement bias, Non-response bias
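The four sampling methods named above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `calls` data set, the `shift` field, and all function names are hypothetical and do not come from the source.

```python
import random
from collections import defaultdict

def simple_random_sample(units, n, rng):
    """Random: every unit has the same chance of being picked."""
    return rng.sample(units, n)

def stratified_sample(units, key, fraction, rng):
    """Stratified: sample the same fraction from every stratum (group)."""
    strata = defaultdict(list)
    for unit in units:
        strata[key(unit)].append(unit)
    picked = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        picked.extend(rng.sample(members, k))
    return picked

def systematic_sample(units, step, rng):
    """Systematic: random start, then every step-th unit."""
    start = rng.randrange(step)
    return units[start::step]

def cluster_sample(units, key, n_clusters, rng):
    """Clustered: pick whole groups at random and keep all their units."""
    clusters = defaultdict(list)
    for unit in units:
        clusters[key(unit)].append(unit)
    chosen = rng.sample(sorted(clusters), n_clusters)
    return [unit for c in chosen for unit in clusters[c]]

# Hypothetical population: 60 calls, 40 on the day shift and 20 at night.
rng = random.Random(42)
calls = [{"id": i, "shift": "day" if i % 3 else "night"} for i in range(60)]

random_pick = simple_random_sample(calls, 6, rng)
stratified_pick = stratified_sample(calls, lambda c: c["shift"], 0.1, rng)
systematic_pick = systematic_sample(calls, 10, rng)
cluster_pick = cluster_sample(calls, lambda c: c["shift"], 1, rng)
```

Stratified sampling keeps the day/night proportions of the population in the sample, which a simple random sample of six calls does not guarantee.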
Measurement System Error
Accuracy: Bias, Stability, Linearity
Precision: Repeatability, Reproducibility
Accuracy: The average of multiple measurements of an event is equal to the true value
Stability: The measurement system maintains its performance over time
Linearity: The measurement system maintains its performance over a range of events
Precision: There is little variation in repeated measurements of the same event
Two-Way ANOVA Table With Interaction
Source                      DF       SS       MS        F      P
Call Number                  2  592.667  296.333  205.154  0.000
Operator Num                 2    6.889    3.444    2.385  0.208
Call Number * Operator Num   4    5.778    1.444    7.800  0.001
Repeatability               18    3.333    0.185
Total                       26  608.667
Gage R&R
                                        %Contribution
Source                        VarComp   (of VarComp)
Total Gage R&R                 0.8272           2.46
  Repeatability                0.1852           0.55
  Reproducibility              0.6420           1.91
    Operator Num               0.2222           0.66
    Operator Num*Call Number   0.4198           1.25
Part-To-Part                  32.7654          97.54
Total Variation               33.5926         100.00

                                         Study Var  %Study Var
Source                       StdDev (SD)  (6 * SD)       (%SV)
Total Gage R&R                   0.90948    5.4569       15.69
  Repeatability                  0.43033    2.5820        7.42
  Reproducibility                0.80123    4.8074       13.82
    Operator Num                 0.47140    2.8284        8.13
    Operator Num*Call Number     0.64788    3.8873       11.18
Part-To-Part                     5.72411   34.3447       98.76
Total Variation                  5.79591   34.7755      100.00

Number of Distinct Categories = 8
The Total Gage R&R value under %Contribution shows the measurement system error for this test. In this case, 2.46% of the variation in the data comes from the measurement system. We like it to be less than 10%, so this measurement system is acceptable.
The remaining variation comes from Part-to-Part variation. In this study, 97.54% of the variation was due to variation in performance of the actual call cycle time, which is true process variation.
Gage R and R ANOVA Method: Example
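As a rough check, the variance components in the Gage R&R output can be reproduced from the mean squares in the ANOVA table using the standard expected-mean-square formulas for a crossed study. The sketch below assumes 3 calls (parts), 3 operators and 3 replicates, which is inferred from the degrees of freedom (2, 2 and 26 total) rather than stated in the source. The 1.41 factor for distinct categories is the usual convention.

```python
import math

# Mean squares copied from the two-way ANOVA table.
ms_part, ms_oper, ms_inter, ms_error = 296.333, 3.444, 1.444, 0.185

# Study layout: an assumption inferred from the DF column (2, 2, 26 total).
n_parts, n_opers, n_reps = 3, 3, 3

# Variance components; negative estimates are truncated to zero.
repeatability = ms_error
interaction = max(0.0, (ms_inter - ms_error) / n_reps)
operator = max(0.0, (ms_oper - ms_inter) / (n_parts * n_reps))
part_to_part = max(0.0, (ms_part - ms_inter) / (n_opers * n_reps))

reproducibility = operator + interaction
gage_rr = repeatability + reproducibility
total_variation = gage_rr + part_to_part

# %Contribution compares variances; %Study Var compares standard deviations.
pct_contribution = 100 * gage_rr / total_variation
pct_study_var = 100 * math.sqrt(gage_rr / total_variation)

# Number of distinct categories, truncated to an integer.
distinct_categories = int(1.41 * math.sqrt(part_to_part / gage_rr))

print(f"Total Gage R&R VarComp: {gage_rr:.4f}")
print(f"%Contribution: {pct_contribution:.2f}")
print(f"%Study Var: {pct_study_var:.2f}")
print(f"Distinct categories: {distinct_categories}")
```

Because the mean squares in the table are rounded to three decimals, the computed components agree with the printed output only to about three decimal places.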
X Bar R Method
– Typically used in automobile industry
– Extreme values affect the method
– Short & Long Method
  Short method does not measure operator and equipment variability separately
  Long method measures operator and equipment variability separately
ANOVA Method
– Measures operator & equipment variability separately as well as combined effect of operator & parts
– More effective when extreme values are present
– Most tedious to perform manual calculations
Statistical Test Decision Tree
START: Is Y Continuous or Discrete?
  Y Discrete: Is X Continuous or Discrete?
    X Discrete: Chi Square, Binomial
    X Continuous: Logistic Regression
  Y Continuous: Is X Continuous or Discrete?
    X Continuous: Regression, Scatter Plot
    X Discrete: Variation or Centering?
      Variation: Normal or non-Normal?
        Normal: Homogeneity of Variance (Bartlett), F-Test
        Non-Normal: Homogeneity of Variance (Levene)
      Centering: Normal or non-Normal?
        Normal: Comparing Relative to a Target?
          Yes: One Sample T-Test
          No: Comparing Only Two Samples?
            Yes: Two Sample T-Test
            No: ANOVA
        Non-Normal: Non-Parametric Tests (Mann-Whitney, Mood's Median)
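One reading of the decision tree can be written as a small lookup function. The function name, parameter names and returned labels are hypothetical, and the branch order is an interpretation of the flowchart, not a definitive rule set.

```python
def choose_test(y, x, focus=None, normal=None, vs_target=False, two_samples=False):
    """Suggest a statistical test from the types of Y (output) and X (input).

    y, x: "continuous" or "discrete".
    focus: "variation" or "centering" (only for continuous Y, discrete X).
    normal: True if the data look normal (same case only).
    vs_target, two_samples: answers to the two centering questions.
    """
    if y == "discrete":
        return "Chi Square / Binomial" if x == "discrete" else "Logistic Regression"
    if x == "continuous":
        return "Regression / Scatter Plot"
    if focus not in ("variation", "centering"):
        raise ValueError("focus must be 'variation' or 'centering'")
    if focus == "variation":
        # Homogeneity-of-variance tests.
        return "Bartlett / F-Test" if normal else "Levene"
    if not normal:
        return "Non-parametric: Mann-Whitney / Mood's Median"
    if vs_target:
        return "One Sample T-Test"
    if two_samples:
        return "Two Sample T-Test"
    return "ANOVA"

# Example: cycle time (continuous Y) compared across three teams (discrete X),
# normal data, question about centering, more than two samples.
suggestion = choose_test("continuous", "discrete", focus="centering", normal=True)
print(suggestion)
```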
Control charts by data type
Discrete data:
  Constant sample: C chart (defects), NP chart (defectives)
  Variable sample: U chart (defects), P chart (defectives)
Continuous data:
  Sample size > 6: X-bar and S
  Sample size < 6: X-bar and R
  Sample size = 1: IMR & EWMA
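The chart selection rules can be sketched as a function. The names are hypothetical. The source does not say which chart applies when the subgroup size is exactly 6, so the sketch raises an error for that case instead of guessing.

```python
def choose_control_chart(data_type, constant_sample=None, subgroup_size=None):
    """Suggest a control chart from the data type and the sampling scheme."""
    if data_type == "discrete":
        # C and NP charts need a constant sample size; U and P charts do not.
        if constant_sample:
            return "C chart or NP chart"
        return "U chart or P chart"
    if data_type != "continuous":
        raise ValueError("data_type must be 'discrete' or 'continuous'")
    if subgroup_size is None or subgroup_size < 1:
        raise ValueError("continuous data needs a subgroup_size of at least 1")
    if subgroup_size == 1:
        return "IMR or EWMA"
    if subgroup_size < 6:
        return "X-bar and R"
    if subgroup_size > 6:
        return "X-bar and S"
    # Exactly 6 is not covered by the rules in the source.
    raise ValueError("subgroup size of exactly 6 is not covered by these rules")

chart = choose_control_chart("continuous", subgroup_size=5)
print(chart)
```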
LEAN
1. Specify value from the standpoint of the end customer by product family.
2. Identify all the steps in the value stream for each product family, eliminating whenever possible those steps that do not create value.
3. Make the value-creating steps occur in tight sequence so the product will flow smoothly toward the customer.
4. As flow is introduced, let customers pull value from the next upstream activity.
5. As value is specified, value streams are identified, wasted steps are removed, and flow and pull are introduced, begin the process again and continue it until a state of perfection is reached in which perfect value is created with no waste.
Ppk is an index of process performance that tells how well a system is meeting specifications. Ppk calculations use actual sigma (the sigma of the individuals) and show how the system is actually running when compared to the specifications. This index also takes into account how well the process is centered within the specification limits.
If Ppk is 1.0: the system is producing at least 99.73% of its output within specifications, assuming normally distributed output. The larger the Ppk, the less the variation between process output and specifications.
If Ppk is between 0 and 1.0: not all process output meets specifications.
To check whether the system is centered on its target value: Ppk should be used in conjunction with the Pp index. If the system is centered on its target value, Ppk and Pp will be equal. If they are not equal, the smaller the difference between these indices, the more centered the process is.
Cpk is a capability index that tells how well a system can meet specification limits. Cpk calculations use estimated sigma and therefore show the system's "potential" to meet specifications. Since it takes the location of the process average into account, the process does not need to be centered on the target value for this index to be useful.
If Cpk is 1.0: the system is producing at least 99.73% of its output within specifications, assuming normally distributed output. The larger the Cpk, the less variation you will find between the process output and specifications.
If Cpk is between 0 and 1.0: not all process output meets specifications.
To check whether the system is centered on its target value: Cpk should be used in conjunction with the Cp index. Cpk and Cp will be equal when the process is centered on its target value. If they are not equal, the smaller the difference between these indices, the more centered the process is.
Differences Between Cpk and Ppk
“Cpk is for short term, Ppk is for long term.”
“Ppk produces an index number (like 1.33) for the process variation. Cpk references the variation to your specification limits. If you just want to know how much variation the process exhibits, a Ppk measurement is fine. If you want to know how that variation will affect the ability of your process to meet customer requirements (CTQ’s), you should use Cpk.” Michael Whaley
“It could be argued that the use of Ppk and Cpk (with sufficient sample size) are far more valid estimates of long and short term capability of processes since the 1.5 sigma shift has a shaky statistical foundation.” Eoin
“Cpk tells you what the process is CAPABLE of doing in future, assuming it remains in a state of statistical control. Ppk tells you how the process has performed in the past. You cannot use it to predict the future, like with Cpk, because the process is not in a state of control. The values for Cpk and Ppk will converge to almost the same value when the process is in statistical control. That is because sigma and the sample standard deviation will be identical (at least as can be distinguished by an F-test). When out of control, the values will be distinctly different, perhaps by a very wide margin.” Jim Parnella
“Cp and Cpk are for computing the index with respect to the subgrouping of your data (different shifts, machines, operators, etc.), while Pp and Ppk are for the whole process (no subgrouping). For both Ppk and Cpk the ‘k’ stands for ‘centralizing facteur’ – it assumes the index takes into consideration the fact that your data is maybe not centered (and hence, your index shall be smaller). It is more realistic to use Pp and Ppk than Cp or Cpk as the process variation cannot be tampered with by inappropriate subgrouping. However, Cp and Cpk can be very useful in order to know if, under the best conditions, the process is capable of fitting into the specs or not. It basically gives you the best case scenario for the existing process.”
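The difference between the two indices comes down to which sigma is used, as a short sketch shows. It assumes Ppk uses the overall sample standard deviation and Cpk uses a pooled within-subgroup standard deviation with no unbiasing constant. Statistical packages may estimate the within-subgroup sigma differently (for example from R-bar/d2). The data, specification limits and function names are made up for illustration.

```python
import math
import statistics

def pp(values, lsl, usl):
    """Pp: specification width over six overall standard deviations."""
    return (usl - lsl) / (6 * statistics.stdev(values))

def ppk(values, lsl, usl):
    """Ppk: distance from the mean to the nearest limit, overall sigma."""
    mean = statistics.mean(values)
    sigma_overall = statistics.stdev(values)
    return min(usl - mean, mean - lsl) / (3 * sigma_overall)

def cpk(subgroups, lsl, usl):
    """Cpk: same distance, but using the within-subgroup (pooled) sigma.

    Assumes equal-sized subgroups so the pooled variance is a plain average.
    """
    all_values = [v for group in subgroups for v in group]
    mean = statistics.mean(all_values)
    pooled_var = statistics.mean([statistics.variance(g) for g in subgroups])
    sigma_within = math.sqrt(pooled_var)
    return min(usl - mean, mean - lsl) / (3 * sigma_within)

# Hypothetical data: three subgroups (e.g. shifts) that are tight internally
# but shifted relative to each other, so the process is not in control.
subgroups = [[9.9, 10.0, 10.1], [10.4, 10.5, 10.6], [9.4, 9.5, 9.6]]
values = [v for group in subgroups for v in group]
lsl, usl = 9.0, 11.0

cpk_value = cpk(subgroups, lsl, usl)
ppk_value = ppk(values, lsl, usl)
pp_value = pp(values, lsl, usl)
print(f"Cpk = {cpk_value:.2f}, Ppk = {ppk_value:.2f}, Pp = {pp_value:.2f}")
```

Here the shifts between subgroups inflate the overall sigma, so Ppk comes out well below Cpk. Ppk and Pp match because the overall mean sits midway between the specification limits.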