Date post: 21-Jan-2016
Upload: poppy-patrick
CHAPTER 4: INTRODUCTORY LINEAR REGRESSION

Chapter Outline

4.1 Simple Linear Regression
• Scatter Plot/Diagram
• Simple Linear Regression Model
4.2 Curve Fitting
4.3 Inferences About Estimated Parameters
4.4 Adequacy of the Model: Coefficient of Determination
4.5 Pearson Product Moment Correlation Coefficient
4.6 Test for Linearity of Regression
4.7 ANOVA Approach for Testing Linearity of Regression

INTRODUCTION TO LINEAR REGRESSION

Regression is a statistical procedure for establishing the relationship between two or more variables.

This is done by fitting a linear equation to the observed data.

The regression line is used by the researcher to see the trend and to predict values for the data.

There are 2 types of relationship:
• Simple (2 variables)
• Multiple (more than 2 variables)

Many problems in science and engineering involve exploring the relationship between two or more variables.

Two statistical techniques:
(1) Regression Analysis
(2) Computing the Correlation Coefficient (r)

Linear regression is the study of the linear relationship between two or more variables. This is done by fitting a linear equation to the observed data. The linear equation is then used to predict values for the data.

In simple linear regression only two variables are involved:

i. X is the independent variable.

ii. Y is the dependent variable.

The correlation coefficient (r) tells us how strongly two variables are related.

Example 4.1:

1) A nutritionist studying weight loss programs might want to find out if reducing carbohydrate intake can help a person reduce weight.
a) X is the carbohydrate intake (independent variable).
b) Y is the weight (dependent variable).

2) An entrepreneur might want to know whether increasing the cost of packaging his new product will have an effect on the sales volume.
a) X is the cost.
b) Y is the sales volume.

4.1 SIMPLE LINEAR REGRESSION MODEL

A linear regression model is a model that expresses the linear relationship between two variables.

The simple linear regression model is written as:

$$Y = \beta_0 + \beta_1 X + \varepsilon$$

where:

$\beta_0$ = intercept of the line with the Y-axis

$\beta_1$ = slope of the line

$\varepsilon$ = random error

The random error is the difference between a data point and its deterministic value.

This regression line is estimated from the collected data by fitting a straight line to the data set and taking the equation of that straight line:

$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$$

4.2 CURVE FITTING (SCATTER PLOT)

SCATTER PLOT

Scatter plots show the relationship between two variables by displaying data points on a two-dimensional graph.

The variable that might be considered as an explanatory variable is plotted on the x-axis, and the response variable is plotted on the y-axis.

Scatter plots are especially useful when there are a large number of data points.


They provide the following information about the relationship between two variables:

(1) Strength

(2) Shape - linear, curved, etc.

(3) Direction - positive or negative

(4) Presence of outliers

EXAMPLES: [scatter plot figures not reproduced]

PLOTTING LINEAR REGRESSION MODEL

A linear regression line can be developed by a freehand plot of the data.

Example 4.2:

The given table contains values for 2 variables, X and Y. Plot the given data and draw a freehand estimated regression line.

X   -3   -2   -1    0    1    2    3

Y    1    2    3    5    8   11   12


4.3 INFERENCES ABOUT ESTIMATED PARAMETERS

LEAST SQUARES METHOD

The least squares method is the method most commonly used for estimating the regression coefficients $\beta_0$ and $\beta_1$.

The straight line fitted to the data set is the line:

$$\hat{Y} = \hat{\beta}_0 + \hat{\beta}_1 X$$

where $\hat{Y}$ is the estimated value of Y for a given value of X.

i) y-Intercept for the Estimated Regression Equation,

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x}$$

where $\bar{x}$ and $\bar{y}$ are the means of $x$ and $y$ respectively.

ii) Slope for the Estimated Regression Equation,

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}}$$

where

$$S_{xy} = \sum_{i=1}^{n} x_i y_i - \frac{\left(\sum_{i=1}^{n} x_i\right)\left(\sum_{i=1}^{n} y_i\right)}{n}$$

$$S_{yy} = \sum_{i=1}^{n} y_i^2 - \frac{\left(\sum_{i=1}^{n} y_i\right)^2}{n}$$

$$S_{xx} = \sum_{i=1}^{n} x_i^2 - \frac{\left(\sum_{i=1}^{n} x_i\right)^2}{n}$$
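The intercept and slope formulas above can be sketched in a few lines of plain Python. This is only an illustrative sketch: the function names `sums_of_squares` and `least_squares_fit` are made up for this example, and the data are the X and Y values from Example 4.2.

```python
def sums_of_squares(x, y):
    """Return (S_xx, S_yy, S_xy) using the computational formulas."""
    n = len(x)
    sum_x, sum_y = sum(x), sum(y)
    s_xx = sum(xi * xi for xi in x) - sum_x ** 2 / n
    s_yy = sum(yi * yi for yi in y) - sum_y ** 2 / n
    s_xy = sum(xi * yi for xi, yi in zip(x, y)) - sum_x * sum_y / n
    return s_xx, s_yy, s_xy


def least_squares_fit(x, y):
    """Return (beta0_hat, beta1_hat) for y_hat = beta0_hat + beta1_hat * x."""
    n = len(x)
    s_xx, _, s_xy = sums_of_squares(x, y)
    beta1_hat = s_xy / s_xx                           # slope = S_xy / S_xx
    beta0_hat = sum(y) / n - beta1_hat * sum(x) / n   # intercept = y_bar - slope * x_bar
    return beta0_hat, beta1_hat


# Data from Example 4.2
x = [-3, -2, -1, 0, 1, 2, 3]
y = [1, 2, 3, 5, 8, 11, 12]

beta0_hat, beta1_hat = least_squares_fit(x, y)
print(beta0_hat, beta1_hat)  # 6.0 2.0
```

For this data set the fitted line is $\hat{Y} = 6 + 2X$, which can be compared against a freehand line drawn through the scatter plot.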

EXAMPLE 4.3: STUDENTS' SCORES IN HISTORY

The data below represent scores obtained by ten primary school students before and after they were taken on a tour to the museum (which is supposed to increase their interest in history).

Before, x   65   63   76   46   68   72   68   57   36   96

After, y    68   66   86   48   65   66   71   57   42   87

a) Develop a linear regression model with "before" as the independent variable and "after" as the dependent variable.

b) Predict the score a student would obtain "after" if he scored 60 marks "before".

Solution

$n = 10$, $\sum xy = 44435$

$\sum x = 647$, $\sum x^2 = 44279$, $\bar{x} = 64.7$

$\sum y = 656$, $\sum y^2 = 44884$, $\bar{y} = 65.6$

$$S_{xy} = 44435 - \frac{(647)(656)}{10} = 1991.8$$

$$S_{xx} = 44279 - \frac{(647)^2}{10} = 2418.1$$

$$S_{yy} = 44884 - \frac{(656)^2}{10} = 1850.4$$

a)

$$\hat{\beta}_1 = \frac{S_{xy}}{S_{xx}} = \frac{1991.8}{2418.1} = 0.8237$$

$$\hat{\beta}_0 = \bar{y} - \hat{\beta}_1 \bar{x} = 65.6 - 0.8237(64.7) = 12.3063$$

$$\hat{Y} = 12.3063 + 0.8237X$$

b) For $X = 60$:

$$\hat{Y} = 12.3063 + 0.8237(60) = 61.7283$$
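The worked solution can be checked numerically. The sketch below recomputes the estimates from the raw scores; the helper name `predict` is hypothetical. It keeps full floating-point precision, so the prediction differs from the hand calculation in the last decimal places (the hand calculation uses the slope rounded to 0.8237).

```python
# Scores from Example 4.3
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]   # x
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]    # y
n = len(before)

# Computational formulas for S_xy and S_xx
s_xy = sum(x * y for x, y in zip(before, after)) - sum(before) * sum(after) / n
s_xx = sum(x * x for x in before) - sum(before) ** 2 / n

beta1_hat = s_xy / s_xx
beta0_hat = sum(after) / n - beta1_hat * sum(before) / n


def predict(x_new):
    """Predicted 'after' score for a given 'before' score."""
    return beta0_hat + beta1_hat * x_new


print(round(s_xy, 1), round(s_xx, 1))            # 1991.8 2418.1
print(round(beta1_hat, 4), round(beta0_hat, 4))  # 0.8237 12.3063
print(predict(60))                               # close to 61.73
```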

EXERCISE 4.1:

INCOME, x    FOOD EXPENDITURE, y

55           14

83           24

38           13

61           16

33           9

49           15

67           17

a) Fit a linear regression model with income as the independent variable and food expenditure as the dependent variable.

b) Predict the food expenditure if income is 50.

Answer:

a) $\hat{Y} = 1.505 + 0.2525X$

b) $\hat{Y} = 14.13$

EXERCISE 4.2:

4.4 ADEQUACY OF THE MODEL: COEFFICIENT OF DETERMINATION ($R^2$)

The coefficient of determination is a measure of the variation of the dependent variable (Y) that is explained by the regression line and the independent variable (X).

The symbol for the coefficient of determination is $r^2$ or $R^2$.

If $r = 0.90$, then $r^2 = 0.81$. It means that 81% of the variation in the dependent variable (Y) is accounted for by the variation in the independent variable (X).

The rest of the variation, 0.19 or 19%, is unexplained and is called the coefficient of non-determination.

The formula for the coefficient of non-determination is $1.00 - r^2$.

Relationship Among SST, SSR, SSE

$$SST = SSR + SSE$$

$$\sum (y_i - \bar{y})^2 = \sum (\hat{y}_i - \bar{y})^2 + \sum (y_i - \hat{y}_i)^2$$

where:
SST = total sum of squares
SSR = sum of squares due to regression
SSE = sum of squares due to error

The coefficient of determination is:

$$r^2 = \frac{SSR}{SST} = \frac{S_{xy}^2}{S_{xx} S_{yy}}$$

where:
SSR = sum of squares due to regression
SST = total sum of squares
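As an illustration, the decomposition SST = SSR + SSE and the two expressions for $r^2$ can be checked numerically on the scores from Example 4.3. The sketch below is not part of the original slides; it uses the mean-centred form of $S_{xx}$, $S_{yy}$ and $S_{xy}$, which is algebraically the same as the computational formulas, and all variable names are chosen for this example only.

```python
# Scores from Example 4.3
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]   # x
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]    # y
n = len(before)
x_bar = sum(before) / n
y_bar = sum(after) / n

# Mean-centred sums of squares and cross products
s_xx = sum((x - x_bar) ** 2 for x in before)
s_yy = sum((y - y_bar) ** 2 for y in after)
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(before, after))

# Fitted values from the least squares line
beta1_hat = s_xy / s_xx
beta0_hat = y_bar - beta1_hat * x_bar
fitted = [beta0_hat + beta1_hat * x for x in before]

sst = sum((y - y_bar) ** 2 for y in after)               # total variation
ssr = sum((f - y_bar) ** 2 for f in fitted)              # explained by the line
sse = sum((y - f) ** 2 for y, f in zip(after, fitted))   # unexplained

r_squared = ssr / sst
print(round(sst, 1), round(ssr + sse, 1))   # both 1850.4
print(round(r_squared, 4))                  # about 0.8866
print(round(1 - r_squared, 4))              # coefficient of non-determination
```

Roughly 89% of the variation in the "after" scores is accounted for by the "before" scores in this data set.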

4.5 PEARSON PRODUCT MOMENT CORRELATION COEFFICIENT (r)

Correlation measures the strength of a linear relationship between the two variables.

It is also known as Pearson's product moment coefficient of correlation.

The symbol for the sample coefficient of correlation is (r).

Formula:

$$r = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}}$$

or

$$r = (\text{sign of } \hat{\beta}_1)\sqrt{r^2}$$

Properties of (r):

$$-1 \le r \le 1$$

Values of r close to 1 imply that there is a strong positive linear relationship between x and y.

Values of r close to -1 imply that there is a strong negative linear relationship between x and y.

Values of r close to 0 imply little or no linear relationship between x and y.

ASSUMPTIONS ABOUT THE ERROR TERM $\varepsilon$

1. The error $\varepsilon$ is a random variable with mean of zero.

2. The variance of $\varepsilon$, denoted by $\sigma^2$, is the same for all values of the independent variable.

3. The values of $\varepsilon$ are independent.

4. The error $\varepsilon$ is a normally distributed random variable.

EXAMPLE 4.4: REFER TO PREVIOUS EXAMPLE 4.3, STUDENTS' SCORES IN HISTORY

Calculate the value of r and interpret its meaning.

SOLUTION:

$$r = \frac{S_{xy}}{\sqrt{S_{xx} \cdot S_{yy}}} = \frac{1991.8}{\sqrt{(2418.1)(1850.4)}} = 0.9416$$

Thus, there is a strong positive linear relationship between the scores obtained before (x) and after (y).
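A short sketch of the same calculation in Python, using only the standard library. The function name `pearson_r` is made up for this illustration; the second call negates the y values to show that the sign of r follows the direction of the relationship while its magnitude is unchanged.

```python
import math


def pearson_r(x, y):
    """Sample correlation coefficient r = S_xy / sqrt(S_xx * S_yy)."""
    n = len(x)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n
    return s_xy / math.sqrt(s_xx * s_yy)


# Scores from Example 4.3
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]

r = pearson_r(before, after)
r_flipped = pearson_r(before, [-v for v in after])   # reversed direction

print(round(r, 4))          # 0.9416, as in the worked solution
print(round(r_flipped, 4))  # -0.9416
```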

EXERCISE 4.3:

Refer to the previous Exercise 4.1 and Exercise 4.2. Calculate the correlation coefficient and interpret the results.

4.6 TEST FOR LINEARITY OF REGRESSION

To test the existence of a linear relationship between two variables x and y, we proceed with testing the hypothesis.

Two tests are commonly used:

(i) t-Test

(ii) F-Test

(i) t-Test

1. Determine the hypotheses.

$H_0: \beta_1 = 0$ (no linear relationship)
$H_1: \beta_1 \neq 0$ (a linear relationship exists)

2. Compute the critical value at the level of significance $\alpha$: $t_{\alpha/2,\,n-2}$ (or use the p-value).

3. Compute the test statistic.

$$t = \frac{\hat{\beta}_1}{\sqrt{Var(\hat{\beta}_1)}}, \qquad Var(\hat{\beta}_1) = \frac{S_{yy} - \hat{\beta}_1 S_{xy}}{(n-2)\,S_{xx}}$$

4. Determine the Rejection Rule.

Reject $H_0$ if $t > t_{\alpha/2,\,n-2}$ or $t < -t_{\alpha/2,\,n-2}$, or if p-value < $\alpha$.

5. Conclusion.

When $H_0$ is rejected, there is a significant relationship between variables X and Y.

EXAMPLE 4.5: REFER TO PREVIOUS EXAMPLE 4.3, STUDENTS' SCORES IN HISTORY

Test to determine if their scores before and after the trip are related. Use $\alpha = 0.05$.

SOLUTION:

1) $H_0: \beta_1 = 0$ (no linear relationship)
$H_1: \beta_1 \neq 0$ (a linear relationship exists)

2) $\alpha = 0.05$, so $t_{0.05/2,\,8} = t_{0.025,\,8} = 2.306$

3)

$$Var(\hat{\beta}_1) = \frac{S_{yy} - \hat{\beta}_1 S_{xy}}{(n-2)\,S_{xx}} = \frac{1850.4 - (0.8237)(1991.8)}{8\,(2418.1)} = 0.0108$$

$$t_{test} = \frac{\hat{\beta}_1}{\sqrt{Var(\hat{\beta}_1)}} = \frac{0.8237}{\sqrt{0.0108}} = 7.926$$

4) Rejection Rule:

$$t_{test} = 7.926 > t_{0.025,\,8} = 2.306$$

5) Conclusion: Thus, we reject $H_0$. The score before (x) has a linear relationship with the score after (y) the trip.
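The five steps of the t-test can be sketched as follows. This is a minimal illustration that starts from the sums already computed in Example 4.3 and takes the critical value 2.306 from the solution above rather than computing it. Because it does not round the variance to 0.0108 before taking the square root, the statistic comes out near 7.91 instead of 7.926; the conclusion is unchanged.

```python
import math

# Summary quantities from Example 4.3
s_xx, s_yy, s_xy, n = 2418.1, 1850.4, 1991.8, 10

beta1_hat = s_xy / s_xx

# Estimated variance of the slope: (S_yy - beta1_hat * S_xy) / ((n - 2) * S_xx)
var_beta1 = (s_yy - beta1_hat * s_xy) / ((n - 2) * s_xx)

# Test statistic for H0: beta1 = 0
t_test = beta1_hat / math.sqrt(var_beta1)

# Critical value t_{0.025, 8}, copied from the worked solution (t-table)
t_critical = 2.306

# Two-sided rejection rule
reject_h0 = abs(t_test) > t_critical

print(round(var_beta1, 4))   # 0.0108
print(round(t_test, 2))      # about 7.91
print(reject_h0)             # True
```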

EXERCISE 4.4:

EXERCISE 4.5:

(ii) F-Test

1. Determine the hypotheses.

$H_0: \beta_1 = 0$ (no linear relationship)
$H_1: \beta_1 \neq 0$ (a linear relationship exists)

2. Specify the level of significance: $F_{\alpha,\,1,\,n-2}$ (or use the p-value).

3. Compute the test statistic.

F = MSR/MSE (this value can be obtained from the ANOVA table)

4. Determine the Rejection Rule.

Reject $H_0$ if $F_{test} > F_{\alpha,\,1,\,n-2}$ or if p-value < $\alpha$.

5. Conclusion.

When $H_0$ is rejected, there is a significant relationship between variables X and Y.


4.7 ANOVA APPROACH FOR TESTING LINEARITY OF REGRESSION

The analysis of variance (ANOVA) method is an approach to test the significance of the regression.

We can arrange the test procedure using this approach in an ANOVA table. [ANOVA table figure not reproduced]
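As a rough sketch of how the entries of such a table are usually obtained in simple linear regression, the code below builds the rows for the scores from Example 4.3. It assumes the conventional layout (degrees of freedom 1, n - 2 and n - 1; MS = SS/DF; F = MSR/MSE) and uses SST = $S_{yy}$ and SSR = $S_{xy}^2 / S_{xx}$, which follows from $r^2 = SSR/SST$. The function name `anova_rows` is hypothetical.

```python
def anova_rows(x, y):
    """Return ANOVA rows (source, DF, SS, MS, F) for simple linear regression."""
    n = len(x)
    s_xy = sum(a * b for a, b in zip(x, y)) - sum(x) * sum(y) / n
    s_xx = sum(a * a for a in x) - sum(x) ** 2 / n
    s_yy = sum(b * b for b in y) - sum(y) ** 2 / n

    sst = s_yy                 # total sum of squares
    ssr = s_xy ** 2 / s_xx     # sum of squares due to regression
    sse = sst - ssr            # sum of squares due to error
    msr = ssr / 1
    mse = sse / (n - 2)
    f_stat = msr / mse
    return [
        ("Regression", 1, ssr, msr, f_stat),
        ("Error", n - 2, sse, mse, None),
        ("Total", n - 1, sst, None, None),
    ]


# Scores from Example 4.3
before = [65, 63, 76, 46, 68, 72, 68, 57, 36, 96]
after = [68, 66, 86, 48, 65, 66, 71, 57, 42, 87]

rows = anova_rows(before, after)
print(f"{'Source':<10} {'DF':>3} {'SS':>10} {'MS':>10} {'F':>8}")
for source, df, ss, ms, f in rows:
    ms_text = f"{ms:10.3f}" if ms is not None else " " * 10
    f_text = f"{f:8.3f}" if f is not None else ""
    print(f"{source:<10} {df:>3} {ss:10.3f} {ms_text} {f_text}")
```

For a single predictor the F statistic equals the square of the t statistic for the slope, so both tests lead to the same decision.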

EXAMPLE 4.6:

The manufacturer of Cardio Glide exercise equipment wants to study the relationship between the number of months since the glide was purchased and the length of time (hours) the equipment was used last week.

At $\alpha = 0.01$, test whether there is a linear relationship between the variables.

Solution:

1. Hypothesis:

$H_0: \beta_1 = 0$
$H_1: \beta_1 \neq 0$

2. Test Statistic:

F = MSR/MSE = 17.303

or using the p-value approach:

significant value = 0.003

3. F-distribution table:

$F_{0.01,\,1,\,8} = 11.26$

4. Rejection region: (draw picture)

Since F statistic > F table (17.303 > 11.2586), we reject $H_0$; or, since p-value < $\alpha$ (0.003 < 0.01), we reject $H_0$.

5. Thus, there is a linear relationship between the variables (month X and hours Y).


EXERCISE 4.6:

An agricultural scientist planted alfalfa on several plots of land, identical except for the soil pH. Following are the dry matter yields (in pounds per acre) for each plot.

pH Yield

4.6 1056

4.8 1833

5.2 1629

5.4 1852

5.6 1783

5.8 2647

6.0 2131

a) Construct a scatter plot of yield (y) versus pH (x). Verify that a linear model is appropriate.

b) Compute the estimated regression line for predicting yield from pH.

c) If the pH is increased by 0.1, by how much would you predict the yield to increase or decrease?

d) For what pH would you predict a yield of 1500 pounds per acre?

e) Calculate the correlation coefficient and interpret the results.

Answer:

b) $\hat{y} = -2090.9 + 737.1x$

c) The predicted yield $\hat{y}$ increases by 73.71

d) pH = 4.872

EXERCISE 4.7

A regression analysis relating the current market value in dollars to the size in square feet of homes in Greeny County, Tennessee, follows. A portion of the regression software output is shown below:

Predictor   Coef         SE Coef      T      P
Constant    12.726       8.115        1.57   0.134
Size        0.00011386   0.00002896   3.93   0.001

Analysis of Variance
Source       DF   SS      MS      F       P
Regression   1    10354   10354   15.46   0.001
Error        18   12054   670
Total        19   22408

a) Determine how many homes are in the sample.

b) Determine the regression equation.

c) Can you conclude that there is a linear relationship between the variables at $\alpha = 0.05$?

