
SMA 6304 / MIT 2.853 / MIT 2.854 Manufacturing Systems

Lecture 10: Data and Regression Analysis

Lecturer: Prof. Duane S. Boning

Copyright 2003 © Duane S. Boning.

Agenda

1. Comparison of Treatments (One Variable)
   • Analysis of Variance (ANOVA)
2. Multivariate Analysis of Variance
   • Model forms
3. Regression Modeling
   • Regression fundamentals
   • Significance of model terms
   • Confidence intervals

Is Process B Better Than Process A?

Two Means with Internal Estimate of Variance

Comparison of Treatments

• Consider multiple conditions (treatments, settings for some variable)
  – There is an overall mean μ and real “effects” or deltas between conditions τ.
  – We observe samples at each condition of interest
• Key question: are the observed differences in mean “significant”?
  - Typical assumption (should be checked): the underlying variances are all the same – usually an unknown value (σ₀²)

Steps/Issues in Analysis of Variance

1. Within group variation
   – Estimates underlying population variance
2. Between group variation
   – Estimate group to group variance
3. Compare the two estimates of variance
   – If there is a difference between the different treatments, then the between group variation estimate will be inflated compared to the within group estimate
   – We will be able to establish confidence in whether or not observed differences between treatments are significant

Hint: we’ll be using F tests to look at ratios of variances

(1) Within Group Variation

• Assume that each group is normally distributed and shares a common variance

• SS_t = sum of squared deviations within the t-th group (there are k groups)

• Estimate of within group variance in the t-th group (just the variance formula)

• Pool these (across different conditions) to get an estimate of the common within group variance:

• This is the within group “mean square” (variance estimate)
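
The pooling step can be sketched in a few lines of Python. This is only an illustration: the group values and the function name are hypothetical, and it assumes the usual pooled form (sum of the per-group sums of squares divided by the sum of the per-group degrees of freedom).

```python
# Hypothetical measurements for k = 3 treatments (illustrative values only).
groups = {
    "A": [10.0, 12.0, 14.0],
    "B": [15.0, 17.0, 19.0],
    "C": [11.0, 13.0, 18.0],
}


def within_group_mean_square(groups):
    """Pool the per-group sums of squares into one variance estimate."""
    ss_within = 0.0
    dof_within = 0
    for values in groups.values():
        mean = sum(values) / len(values)
        ss_within += sum((y - mean) ** 2 for y in values)  # SS_t for this group
        dof_within += len(values) - 1                      # n_t - 1
    return ss_within, dof_within, ss_within / dof_within


ss_r, nu_r, ms_r = within_group_mean_square(groups)
print(ss_r, nu_r, ms_r)
```

Each group contributes in proportion to its own degrees of freedom, so larger groups carry more weight in the pooled estimate.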

(2) Between Group Variation

• We will be testing the hypothesis μ₁ = μ₂ = … = μ_k

• If all the means are in fact equal, then a second estimate of σ₀² could be formed based on the observed differences between group means:

• If the treatments in fact have different means, then the between group estimate estimates something larger:
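
A minimal sketch of the between group estimate, assuming the standard form (group sizes times squared deviations of group means from the grand mean, divided by k - 1). The data and names are hypothetical.

```python
# Hypothetical measurements for k = 3 treatments (illustrative values only).
groups = [
    [10.0, 12.0, 14.0],
    [15.0, 17.0, 19.0],
    [11.0, 13.0, 18.0],
]


def between_group_mean_square(groups):
    """Variance estimate built from the spread of the group means."""
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups
    )
    dof_between = len(groups) - 1  # k - 1
    return ss_between, dof_between, ss_between / dof_between


ss_t, nu_t, ms_t = between_group_mean_square(groups)
print(ss_t, nu_t, ms_t)
```

If the true treatment means differ, this quantity picks up the treatment offsets in addition to the noise, which is why it is compared against the within group estimate.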

(3) Compare Variance Estimates

• We now have two different possibilities for the between group variance estimate, depending on whether the observed sample mean differences are “real” or are just occurring by chance (by sampling)

• Use the F statistic to see if the ratio of these variances is likely to have occurred by chance!

• Formal test for significance:

(4) Compute Significance Level

• Calculate the observed F ratio (with appropriate degrees of freedom in numerator and denominator)
• Use the F distribution to find how likely a ratio this large is to have occurred by chance alone
  - This is our “significance level”
  - If this probability is less than α, then we say that the mean differences or treatment effects are significant to (1-α)100% confidence or better
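
The significance calculation can be sketched with the standard library alone. The mean squares below are hypothetical inputs, and the upper tail of the F distribution is evaluated through the regularized incomplete beta function (a textbook continued-fraction method, swapped in here because the slides do not say how the probability is obtained).

```python
import math


def _beta_cf(a, b, x, max_iter=200, eps=1e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        for num in (
            m * (b - m) * x / ((a + 2 * m - 1.0) * (a + 2 * m)),
            -(a + m) * (a + b + m) * x / ((a + 2 * m) * (a + 2 * m + 1.0)),
        ):
            d = 1.0 + num * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + num / c
            c = c if abs(c) > tiny else tiny
            h *= d * c
        if abs(d * c - 1.0) < eps:
            break
    return h


def regularized_incomplete_beta(a, b, x):
    """I_x(a, b) for 0 <= x <= 1."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log(1.0 - x)
    )
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_test(ms_between, dof_between, ms_within, dof_within):
    """Return the observed F ratio and the probability of exceeding it by chance."""
    f_ratio = ms_between / ms_within
    x = dof_within / (dof_within + dof_between * f_ratio)
    p_value = regularized_incomplete_beta(dof_within / 2.0, dof_between / 2.0, x)
    return f_ratio, p_value


# Hypothetical mean squares: between = 19.0 (2 dof), within = 7.0 (6 dof).
f_ratio, p_value = f_test(19.0, 2, 7.0, 6)
alpha = 0.05
print(f_ratio, p_value, p_value < alpha)
```

With these illustrative numbers the probability is larger than α = 0.05, so the differences would not be called significant at 95% confidence.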

(5) Variance Due to Treatment Effects

• We also want to estimate the sum of squared deviations from the grand mean among all samples:
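
As a quick check of how this total relates to the two earlier pieces, the sketch below (hypothetical data and variable names) computes the total sum of squares about the grand mean and confirms that it equals the between group plus within group sums of squares.

```python
# Hypothetical measurements for k = 3 treatments (illustrative values only).
groups = [
    [10.0, 12.0, 14.0],
    [15.0, 17.0, 19.0],
    [11.0, 13.0, 18.0],
]

all_values = [y for g in groups for y in g]
grand_mean = sum(all_values) / len(all_values)

# Total: every sample against the grand mean.
ss_total = sum((y - grand_mean) ** 2 for y in all_values)

# Within: every sample against its own group mean.
ss_within = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)

# Between: every group mean against the grand mean, weighted by group size.
ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)

print(ss_total, ss_between + ss_within)
```

The degrees of freedom add up the same way: (N - 1) = (k - 1) + (N - k), where N is the total number of samples.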

(6) Results: The ANOVA Table

Source of variation              Sum of squares   Degrees of freedom   Mean square
Between treatments
Within treatments
Total about the grand average

Example: ANOVA

ANOVA – Implied Model

• The ANOVA approach assumes a simple mathematical model: y_ti = μ_t + ε_ti = μ + τ_t + ε_ti
• Where μ_t is the treatment mean (for treatment type t)
• And τ_t is the treatment effect
• With ε_ti being zero mean normal residuals ~ N(0, σ₀²)
• Checks
  - Plot residuals against time order
  - Examine distribution of residuals: should be IID, Normal
  - Plot residuals vs. estimates
  - Plot residuals vs. other variables of interest
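
To run the checks listed above, the residuals first have to be computed. A minimal sketch, with hypothetical data and a hypothetical function name, that returns (estimate, residual) pairs ready for plotting:

```python
# Hypothetical measurements for three treatments (illustrative values only).
groups = {
    "A": [10.0, 12.0, 14.0],
    "B": [15.0, 17.0, 19.0],
    "C": [11.0, 13.0, 18.0],
}


def anova_residuals(groups):
    """Return (fitted value, residual) pairs for every observation."""
    pairs = []
    for values in groups.values():
        mean = sum(values) / len(values)  # estimate of the treatment mean
        pairs.extend((mean, y - mean) for y in values)
    return pairs


pairs = anova_residuals(groups)
for fitted, residual in pairs:
    print(fitted, residual)
```

Plotting the second element of each pair against the first gives the "residuals vs. estimates" check; plotting it against run order gives the time-order check, provided the data are stored in the order they were collected.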

MANOVA – Two Dependencies

• Can extend to two (or more) variables of interest. MANOVA assumes a mathematical model, again simply capturing the means (or treatment offsets) for each discrete variable level:

• Assumes that the effects from the two variables are additive

Example: Two Factor MANOVA

• Two LPCVD deposition tube types, three gas suppliers. Does supplier matter in average particle counts on wafers? Experiment: 3 lots on each tube, for each gas; report average # particles added

MANOVA – Two Factors with Interactions

• May be interaction: not simply additive – effects may depend synergistically on both factors:

An effect that depends on both t & i factors simultaneously

t = first factor = 1, 2, … k (k = # levels of first factor)
i = second factor = 1, 2, … n (n = # levels of second factor)
j = replication = 1, 2, … m (m = # replications at the (t, i)th combination of factor levels)

• Can split out the model more explicitly…

Estimate by:
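
One common way to form these estimates in a balanced design is from the cell, row, column, and grand averages. The sketch below assumes that approach; the data are hypothetical, and the names `tau`, `beta`, and `omega` for the first-factor effect, second-factor effect, and interaction are labels chosen here, since the slide's own symbols did not survive extraction.

```python
# Hypothetical data: y[t][i] is the list of replicates for level t of the
# first factor and level i of the second factor (balanced design assumed).
y = [
    [[7.0, 9.0], [11.0, 13.0]],
    [[10.0, 12.0], [20.0, 22.0]],
]


def mean(values):
    return sum(values) / len(values)


k, n = len(y), len(y[0])

# Cell averages and the grand average (valid as a mean of cell means
# because every cell has the same number of replicates).
cell = [[mean(y[t][i]) for i in range(n)] for t in range(k)]
grand = mean([cell[t][i] for t in range(k) for i in range(n)])

# Main effects: row and column averages relative to the grand average.
tau = [mean(cell[t]) - grand for t in range(k)]
beta = [mean([cell[t][i] for t in range(k)]) - grand for i in range(n)]

# Interaction: what is left of each cell average after the additive part.
omega = [
    [cell[t][i] - grand - tau[t] - beta[i] for i in range(n)]
    for t in range(k)
]

print(grand, tau, beta, omega)
```

If every entry of `omega` is close to zero relative to the replicate noise, the additive model is adequate.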

(6) Results: The ANOVA Table

Source of variation              Sum of squares   Degrees of freedom   Mean square
Between levels of factor 1 (T)
Between levels of factor 2 (B)
Interaction
Within groups (error)
Total about the grand average

Measures of Model Goodness – R²

• Goodness of fit – R²
  - Question considered: how much better does the model do than just using the grand average?
  - Think of this as the fraction of squared deviations (from the grand average) in the data which is captured by the model

• Adjusted R²
  - For “fair” comparison between models with different numbers of coefficients, an alternative is often used
  - Think of this as (1 – variance remaining in the residual). Recall
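
Both measures can be written as short functions. This sketch assumes the standard definitions: R² is one minus the ratio of residual to total sum of squares, and adjusted R² is one minus the ratio of residual to total mean square. The function names and input numbers are hypothetical.

```python
def r_squared(ss_resid, ss_total):
    """Fraction of squared deviation about the grand average captured by the model."""
    return 1.0 - ss_resid / ss_total


def adjusted_r_squared(ss_resid, dof_resid, ss_total, dof_total):
    """Same idea, but comparing variances (mean squares) instead of sums of squares."""
    return 1.0 - (ss_resid / dof_resid) / (ss_total / dof_total)


# Hypothetical sums of squares: residual 42 on 6 dof, total 80 on 8 dof.
r2 = r_squared(42.0, 80.0)
r2_adj = adjusted_r_squared(42.0, 6, 80.0, 8)
print(r2, r2_adj)
```

Adding a coefficient always lowers the residual sum of squares, so R² can only go up; the adjusted form also lowers the residual degrees of freedom, so it rises only when the new term earns its keep.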

Regression Fundamentals

• Use least square error as measure of goodness to estimate coefficients in a model

• One parameter model:
  – Model form
  – Squared error
  – Estimation using normal equations
  – Estimate of experimental error
  – Precision of estimate: variance in b
  – Confidence interval for β
  – Analysis of variance: significance of b
  – Lack of fit vs. pure error

• Polynomial regression

Regression Fundamentals

• We use least-squares to estimate coefficients in typical regression models

• One-Parameter Model: y = βx + ε

• Goal is to estimate β with “best” b

• How define “best”?
  - That b which minimizes sum of squared error between prediction and data
  - The residual sum of squares (for the best estimate) is

Regression Fundamentals

• Least squares estimation via normal equations
  - For linear problems, we need not calculate SS(β); rather, direct solution for b is possible
  - Recognize that vector of residuals will be normal to vector of x values at the least squares estimate

• Estimate of experimental error
  - Assuming model structure is adequate, estimate s² of σ² can be obtained:
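
For the one-parameter model through the origin, the normal equation has a closed-form solution. The sketch below uses hypothetical data and assumes the usual results: b is the sum of x·y over the sum of x², and s² is the residual sum of squares divided by (n - 1), since one parameter is estimated.

```python
# Hypothetical data (illustrative values only).
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.1, 3.9, 6.2, 7.8]

# Normal equation for y = b*x: sum(x * (y - b*x)) = 0, solved for b.
sxx = sum(x * x for x in xs)
sxy = sum(x * y for x, y in zip(xs, ys))
b = sxy / sxx

# Residuals at the least squares estimate are orthogonal to the x vector.
residuals = [y - b * x for x, y in zip(xs, ys)]
orthogonality = sum(x * r for x, r in zip(xs, residuals))

# Estimate of experimental error variance: one parameter estimated.
ss_resid = sum(r * r for r in residuals)
s2 = ss_resid / (len(xs) - 1)

print(b, orthogonality, s2)
```

The printed orthogonality value is zero up to rounding, which is the geometric statement on the slide that the residual vector is normal to the x vector.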

Precision of Estimate: Variance in b

• We can calculate the variance in our estimate of the slope, b:

• Why?
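
A short derivation that answers the "why" for the one-parameter model through the origin, assuming the x values are fixed and the errors are independent with common variance σ² (standard assumptions, not stated on the slide):

```latex
b = \frac{\sum_i x_i y_i}{\sum_i x_i^2}
\quad\Longrightarrow\quad
\operatorname{Var}(b)
  = \frac{\sum_i x_i^2 \,\operatorname{Var}(y_i)}{\left(\sum_i x_i^2\right)^2}
  = \frac{\sigma^2}{\sum_i x_i^2},
\qquad
\widehat{\operatorname{Var}}(b) = \frac{s^2}{\sum_i x_i^2}
```

Because b is a weighted sum of the y values, its variance is the sum of the squared weights times σ². Data spread over a wide range of x gives a large denominator and hence a more precise slope.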

Confidence Interval for β

• Once we have the standard error in b, we can calculate confidence intervals to some desired (1-α)100% level of confidence

• Analysis of variance

  - Test hypothesis:

  - If confidence interval for β includes 0, then β not significant

  - Degrees of freedom (need in order to use t distribution)

    p = # parameters estimated by least squares

Example Regression

• Note that this simple model assumes an intercept of zero – model must go through origin

• We will relax this requirement soon

Age   Income
8     6.16
22    9.88
35    14.35
40    24.06
57    30.34
73    32.17
78    42.18
87    43.23
98    48.76
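
The table above is enough to reproduce the through-origin fit and a confidence interval for β. This sketch assumes the standard formulas (b = Σxy/Σx², s² = residual SS/(n - p), standard error of b = sqrt(s²/Σx²)); the two-sided 95% t value for 8 degrees of freedom, 2.306, is taken from standard tables rather than computed.

```python
import math

# Age / income data from the table above.
age = [8, 22, 35, 40, 57, 73, 78, 87, 98]
income = [6.16, 9.88, 14.35, 24.06, 30.34, 32.17, 42.18, 43.23, 48.76]

# One-parameter least squares estimate (model through the origin).
sxx = sum(x * x for x in age)
sxy = sum(x * y for x, y in zip(age, income))
b = sxy / sxx

# Experimental error estimate with n - p degrees of freedom (p = 1).
dof = len(age) - 1
ss_resid = sum((y - b * x) ** 2 for x, y in zip(age, income))
s2 = ss_resid / dof

# Standard error of b and the 95% confidence interval.
se_b = math.sqrt(s2 / sxx)
t_crit = 2.306  # tabulated t value for 8 dof, alpha = 0.05 two-sided
lower, upper = b - t_crit * se_b, b + t_crit * se_b

print(b, se_b, (lower, upper))
```

The interval does not include zero, so under this model the slope is significant at the 95% level.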

Lack of Fit Error vs. Pure Error

• Sometimes we have replicated data

  - E.g. multiple runs at same x values in a designed experiment

• We can decompose the residual error contributions:

  residual sum of squares error = lack of fit squared error + pure replicate error

• This allows us to TEST for lack of fit
  - By “lack of fit” we mean evidence that the linear model form is inadequate
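
A minimal sketch of the decomposition, using hypothetical replicated data and the one-parameter model through the origin. Pure error compares each replicate with the average at its own x value; lack of fit compares those averages with the model prediction.

```python
# Hypothetical replicated data: x value -> list of replicate y values.
data = {
    1.0: [1.0, 1.2],
    2.0: [2.9, 3.1],
    3.0: [6.0, 6.2],
}

# Fit y = b*x by least squares over all observations.
sxy = sum(x * y for x, ys in data.items() for y in ys)
sxx = sum(x * x for x, ys in data.items() for _ in ys)
b = sxy / sxx

# Residual sum of squares: every observation against the model.
ss_resid = sum((y - b * x) ** 2 for x, ys in data.items() for y in ys)

# Pure error: every observation against the average at its own x.
ss_pure = sum(
    (y - sum(ys) / len(ys)) ** 2 for ys in data.values() for y in ys
)

# Lack of fit: average at each x against the model prediction there.
ss_lof = sum(
    len(ys) * (sum(ys) / len(ys) - b * x) ** 2 for x, ys in data.items()
)

n_obs = sum(len(ys) for ys in data.values())
dof_pure = n_obs - len(data)  # replicates minus distinct x levels
dof_lof = len(data) - 1       # distinct x levels minus p = 1 parameter

# Ratio of mean squares, to be compared against an F distribution.
f_lof = (ss_lof / dof_lof) / (ss_pure / dof_pure)
print(ss_resid, ss_lof + ss_pure, f_lof)
```

A large ratio says the averages sit further from the fitted line than replicate noise alone can explain, which is the evidence of an inadequate model form.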

Regression: Mean Centered Models

Model form

Model form

Regression: Mean Centered Models

•Confidence Intervals

• Our confidence interval on y widens as we get further from the center of our data!
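
The widening can be seen directly from the standard error of the fitted value. This sketch assumes the mean centered straight-line form y = a + b(x - x̄), for which the variance of the fitted value at x0 is s²(1/n + (x0 - x̄)²/Σ(x - x̄)²); the function name and data are hypothetical.

```python
import math


def fitted_value_std_error(xs, s, x0):
    """Standard error of the fitted line at x0 for y = a + b*(x - x_bar)."""
    n = len(xs)
    x_bar = sum(xs) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    return s * math.sqrt(1.0 / n + (x0 - x_bar) ** 2 / sxx)


# Hypothetical x locations and residual standard deviation.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
s = 1.0

for x0 in (3.0, 5.0, 8.0):
    print(x0, fitted_value_std_error(xs, s, x0))
```

The first term comes from uncertainty in the mean and the second from uncertainty in the slope, which is multiplied by the distance from the center of the data.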

Polynomial Regression

• We may believe that a higher order model structure applies. Polynomial forms are also linear in the coefficients and can be fit with least squares

•Example: Growth rate data
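
Because a polynomial is linear in its coefficients, fitting it only requires extra input columns (x, x², ...) and the same normal equations. A minimal standard-library sketch; the data are synthetic (an exact quadratic, not the growth rate data), and the helper names are hypothetical.

```python
def solve_linear_system(a, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_polynomial(xs, ys, degree):
    """Least squares coefficients [b0, b1, ..., b_degree] via normal equations."""
    # One input column per power of x, including the constant column.
    rows = [[x ** p for p in range(degree + 1)] for x in xs]
    size = degree + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(size)] for i in range(size)]
    xty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(size)]
    return solve_linear_system(xtx, xty)


# Synthetic data generated from y = 1 + 2x + 3x^2 (no noise).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0 + 2.0 * x + 3.0 * x * x for x in xs]
coeffs = fit_polynomial(xs, ys, 2)
print(coeffs)
```

Normal equations become poorly conditioned for high degrees or wide x ranges, so a production fit would normally center or scale x or use a dedicated least squares routine.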

Regression Example: Growth Rate Data

Replicated data provides opportunity to check for lack of fit

Growth Rate - First Order Model

•Mean significant, but linear term not

•Clear evidence of lack of fit

Growth Rate - Second Order Model

•No evidence of lack of fit

•Quadratic term significant

Polynomial Regression In Excel

•Create additional input columns for each input

•Use "Data Analysis" and "Regression" tool

Polynomial Regression

• Generated using JMP package

Summary

• Comparison of Treatments – ANOVA

• Multivariate Analysis of Variance

• Regression Modeling

Next Time

• Time Series Models

• Forecasting

