Multiple Linear Regression uses 2 or more predictors

Date post: 14-Dec-2015
Upload: jalynn-apley
Transcript
Page 1

Multiple Linear Regression uses 2 or more predictors

General form:

z_y = b1 z_x1 + b2 z_x2 + b3 z_x3 + ... + bn z_xn

Let us take the simplest multiple regression case--two predictors:

z_y = b1 z_x1 + b2 z_x2

Here, the b's are not simply cor(y,x1) and cor(y,x2), unless x1 and x2 have zero correlation with one another. Any correlation between x1 and x2 makes determining the b's less simple. The b's are related to the partial correlation, in which the value of the other predictor(s) is held constant. Holding other predictors constant eliminates the part of the correlation due to the other predictors and not just to the predictor at hand.

Notation: the partial correlation of y with x1, with x2 held constant, is written cor(y,x1.x2)

Page 2

z_y = b1 z_x1 + b2 z_x2

For 2 (or any n) predictors, there are 2 (or any n) equations in 2 (or any n) unknowns to be solved simultaneously. When n > 3 or so, determinant operations are necessary. For the case of 2 predictors, and using z values (variables standardized by subtracting their mean and then dividing by the standard deviation) for simplicity, the solution can be done by hand. The two equations to be solved simultaneously are:

b1.2 + b2.1 cor(x1,x2) = cor(y,x1)

b1.2 cor(x1,x2) + b2.1 = cor(y,x2)

The goal is to find the two b coefficients, b1.2 and b2.1.

Page 3

b1.2 + b2.1 cor(x1,x2) = cor(y,x1)

b1.2 cor(x1,x2) + b2.1 = cor(y,x2)

Example of a multiple regression problem with two predictors

The number of Atlantic hurricanes between June and November is slightly predictable 6 months in advance (in early December) using several precursor atmospheric and oceanic variables. Two variables used are (1) 500 millibar geopotential height in November in the polar north Atlantic (67.5°N-85°N latitude, 10°E-50°W longitude); and (2) sea level pressure in November in the North tropical Pacific (7.5°N-22.5°N latitude, 125°W-175°W longitude).

Page 4

[Figure: map (http://www.cdc.noaa.gov/map/images/sst/sst.anom.month.gif) showing the location of the two long-lead Atlantic hurricane predictor regions, labeled "500mb" (polar north Atlantic) and "SLP" (north tropical Pacific)]

Page 5

Physical reasoning behind the two predictors: (1) 500 millibar geopotential height in November in the polar north Atlantic. High heights are associated with a negative North Atlantic Oscillation (NAO) pattern, tending to associate with a stronger thermohaline circulation, and also tending to be followed by weaker upper atmospheric westerlies and weaker low-level trade winds in the tropical Atlantic the following hurricane season. All of these favor hurricane activity. (2) Sea level pressure in November in the North tropical Pacific. High pressure in this region in winter tends to be followed by La Niña conditions in the coming summer and fall, which favors easterly Atlantic wind anomalies aloft, and hurricane activity.

First step: Find "regular" correlations among all the variables (x1, x2, y): cor(x1,y), cor(x2,y), cor(x1,x2)

Page 6

X1: Polar north Atlantic 500 millibar height
X2: North tropical Pacific sea level pressure

cor(hurricanes, Atlantic 500 mb) = 0.20 (x1,y)
cor(hurricanes, Pacific SLP) = 0.40 (x2,y)
cor(Atlantic 500 mb, Pacific SLP) = 0.30 (x1,x2) -- one predictor vs. the other

Simultaneous equations to be solved:

b1.2 + (0.30)b2.1 = 0.20
(0.30)b1.2 + b2.1 = 0.40

Solution: Multiply the 1st equation by 3.333, then subtract the second equation from the first. This gives (3.033)b1.2 + 0 = 0.267, so b1.2 = 0.088; use this to find that b2.1 = 0.374.
Regression equation: z_y = (0.088)z_x1 + (0.374)z_x2
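The elimination above can be checked numerically. Here is a minimal sketch in Python (the function name and the substitution approach are mine, not from the slides):

```python
def solve_standardized_betas(r_yx1, r_yx2, r_x1x2):
    """Solve the two normal equations in z-score form:
         b1 + b2 * r_x1x2 = r_yx1
         r_x1x2 * b1 + b2 = r_yx2
    This is the same elimination done by hand on the slide."""
    denom = 1.0 - r_x1x2 ** 2   # zero only if the predictors are perfectly correlated
    b1 = (r_yx1 - r_yx2 * r_x1x2) / denom
    b2 = (r_yx2 - r_yx1 * r_x1x2) / denom
    return b1, b2

# Hurricane example: cor(y,x1) = 0.20, cor(y,x2) = 0.40, cor(x1,x2) = 0.30
b1, b2 = solve_standardized_betas(0.20, 0.40, 0.30)
print(round(b1, 3), round(b2, 3))  # 0.088 0.374
```

The closed-form division by (1 - r_x1x2^2) is exactly what the multiply-and-subtract elimination produces for two predictors.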

Page 7

Multiple correlation coefficient = R = correlation between predicted y and actual y using multiple regression.

R = sqrt( b1.2 cor(y,x1) + b2.1 cor(y,x2) )

In the example above, R = sqrt( (0.088)(0.20) + (0.373)(0.40) ) = 0.408

Note this is only very slightly better than using the second predictor alone in simple regression. This is not surprising, since the first predictor's total correlation with y is only 0.2, and it is correlated 0.3 with the second predictor, so that the second predictor already accounts for some of what the first predictor has to offer. A decision would probably be made concerning whether it is worth the effort to include the first predictor for such a small gain. Note: the multiple correlation can never decrease when more predictors are added.
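As a quick numeric check of the R formula, using the rounded coefficients from the hurricane example (a sketch; the function name is mine):

```python
import math

def multiple_R(b1, b2, r_yx1, r_yx2):
    """Multiple correlation from the standardized coefficients and the
    simple correlations of each predictor with y."""
    return math.sqrt(b1 * r_yx1 + b2 * r_yx2)

# b1.2 = 0.088, b2.1 = 0.374, cor(y,x1) = 0.20, cor(y,x2) = 0.40
R = multiple_R(0.088, 0.374, 0.20, 0.40)
print(round(R, 2))  # 0.41
```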

Page 8

Multiple R is usually inflated somewhat compared with the true relationship, since additional predictors fit the accidental variations found in the test sample.

Adjustment (decrease) of R for the existence of multiple predictors gives a less biased estimate of R:

Adjusted R = sqrt( 1 - (1 - R^2)(n - 1) / (n - k - 1) )

where n = sample size and k = number of predictors.
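Assuming the adjustment is the standard Wherry-style formula (the slide's equation survives only as fragments, so treat this as a reconstruction), a sketch:

```python
import math

def adjusted_R(R, n, k):
    """Adjusted multiple correlation: shrink R for sample size n and
    number of predictors k (standard Wherry-style adjustment, assumed)."""
    r2_adj = 1.0 - (1.0 - R ** 2) * (n - 1) / (n - k - 1)
    return math.sqrt(max(0.0, r2_adj))  # clip: a small R with many predictors can go negative

# Illustrative values, borrowed from the Sahel example later in these notes
print(round(adjusted_R(0.535, 50, 2), 3))  # 0.506
```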

Page 9

Sampling variability of a simple (x, y) correlation coefficient around zero, when the population correlation is zero, is approximately

StError(zero correl) = 1 / sqrt(n - 1)

In multiple regression the same approximate relationship holds, except that n must be further decreased by the number of predictors additional to the first one. If the number of predictors (x's) is denoted by k, then the sampling variability of R around zero, when there is no true relationship with any of the predictors, is given by

StError(zero correl) = 1 / sqrt(n - k)

It is easier to get a given multiple correlation by chance as the number of predictors increases.
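A small sketch of this rule (the exact denominator n - k is my reading of the fragmentary formula, so treat it as an assumption):

```python
import math

def stderr_zero_correl(n, k=1):
    """Approximate sampling variability of a correlation around zero when
    no true relationship exists; k is the number of predictors
    (k = 1 recovers the simple-correlation case, 1/sqrt(n-1))."""
    return 1.0 / math.sqrt(n - k)

# More predictors -> larger chance correlations, for the same sample size
print(round(stderr_zero_correl(30), 3))     # one predictor
print(round(stderr_zero_correl(30, 5), 3))  # five predictors
```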

Page 10

Partial Correlation is the correlation between y and x1, where a variable x2 is not allowed to vary. Example: in an elementary school, reading ability (y) is highly correlated with the child's weight (x1). But both y and x1 are really caused by something else: the child's age (call it x2). What would the correlation be between weight and reading ability if age were held constant? (Would it drop down to zero?)

r(y,x1.x2) = [ r(y,x1) - r(y,x2) r(x1,x2) ] / sqrt[ (1 - r(y,x2)^2)(1 - r(x1,x2)^2) ]

b1.2 = r(y,x1.x2) StErrorEst(y.x2) / StErrorEst(x1.x2)

A similar set of equations exists for the second predictor.

Page 11

Suppose the three correlations are:

reading vs. weight: r(y,x1) = 0.66
reading vs. age: r(y,x2) = 0.82
weight vs. age: r(x1,x2) = 0.71

The two partial correlations come out to be:

r(y,x1.x2) = 0.193
r(y,x2.x1) = 0.664

Finally, the two regression weights turn out to be:

b1 = 0.157    b2 = 0.709    R = 0.827

Weight is seen to be a minor factor compared with age.
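The reading/weight/age numbers can be reproduced from the partial-correlation formula on the previous page; a sketch in Python (names are mine):

```python
import math

def partial_corr(r_y1, r_y2, r_12):
    """Partial correlation of y with x1, holding x2 constant."""
    return (r_y1 - r_y2 * r_12) / math.sqrt((1 - r_y2 ** 2) * (1 - r_12 ** 2))

r_y1, r_y2, r_12 = 0.66, 0.82, 0.71   # weight, age, weight-vs-age correlations

p1 = partial_corr(r_y1, r_y2, r_12)   # reading vs. weight, age held constant
p2 = partial_corr(r_y2, r_y1, r_12)   # reading vs. age, weight held constant

# Standardized regression weights and multiple R from the same correlations
denom = 1 - r_12 ** 2
b1 = (r_y1 - r_y2 * r_12) / denom
b2 = (r_y2 - r_y1 * r_12) / denom
R = math.sqrt(b1 * r_y1 + b2 * r_y2)

print(round(p1, 3), round(p2, 3))               # 0.193 0.664
print(round(b1, 3), round(b2, 3), round(R, 3))  # 0.157 0.709 0.827
```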

Page 12

Another Example -- Sahel Drying Trend

Suppose 50 years of climate data suggest that the drying of the Sahel in northern Africa in July to September may be related both to warming in the tropical Atlantic and Indian oceans (x1) and to local changes in land use in the Sahel itself (x2). x1 is expressed as SST, and x2 is expressed as percentage vegetation decrease (expressed as a positive percentage) from the vegetation found at the beginning of the 50-year period. While both factors appear related to the downward trend in rainfall, the two predictors are somewhat correlated with one another. Suppose the correlations come out as follows:

Cor(y,x1)= -0.52 Cor(y,x2)= -0.37 Cor(x1,x2)= 0.50

What would be the multiple regression equation in "unit-free" standard deviation (z) units?

Page 13

Cor(x1,y)= -0.52 Cor(x2,y)= -0.37 Cor(x1,x2)=0.50

First we set up the two equations to be solved simultaneously:

b1.2 + b2.1 cor(x1,x2) = cor(y,x1)
b1.2 cor(x1,x2) + b2.1 = cor(y,x2)

b1.2 + (0.50)b2.1 = -0.52
(0.50)b1.2 + b2.1 = -0.37

We want to eliminate (or cancel) b1.2 or b2.1. To eliminate b2.1, multiply the first equation by 2 and subtract the second one from it:

1.5 b1.2 = -0.67, so b1.2 = -0.447 and b2.1 = -0.147

Regression equation: z_y = -0.447 z_x1 - 0.147 z_x2

Page 14

Regression equation: z_y = -0.447 z_x1 - 0.147 z_x2

If we want to express the above equation in physical units, then we must know the means and standard deviations of y, x1 and x2, and make substitutions to replace the z's.

When we substitute and simplify, y, x1 and x2 terms will appear instead of z terms. There will generally also be a constant term that is not found in the z expression, because the original variables probably do not have means of 0 the way z's always do.

z_y = (y - mean_y) / SD_y        or equivalently    y = mean_y + z_y SD_y
z_x1 = (x1 - mean_x1) / SD_x1    or equivalently    x1 = mean_x1 + z_x1 SD_x1
z_x2 = (x2 - mean_x2) / SD_x2    or equivalently    x2 = mean_x2 + z_x2 SD_x2

Page 15

The means and the standard deviations of the three data sets are:

y: Jul-Aug-Sep Sahel rainfall (mm): mean 230 mm, SD 88 mm
x1: Tropical Atlantic/Indian ocean SST: mean 28.3 deg C, SD 1.7 deg C
x2: Deforestation (percent of initial): mean 34%, SD 22%

z_y = -0.447 z_x1 - 0.147 z_x2

Replacing each z by (value - mean)/SD:

(y - 230)/88 = -0.447 (x1 - 28.3)/1.7 - 0.147 (x2 - 34)/22

After simplification, the final form will be: y = b1 x1 + b2 x2 + constant (here, both coefficients < 0).
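Carrying out the substitution numerically gives the physical-unit coefficients; a sketch (the resulting coefficient values are computed here, not stated on the slides):

```python
# Means/SDs from the slide: rainfall 230 mm (SD 88), SST 28.3 deg C (SD 1.7),
# deforestation 34% (SD 22); standardized coefficients -0.447 and -0.147.
mean_y, sd_y = 230.0, 88.0
mean_x1, sd_x1 = 28.3, 1.7
mean_x2, sd_x2 = 34.0, 22.0
bz1, bz2 = -0.447, -0.147

# Expand y = mean_y + sd_y * (bz1*(x1 - mean_x1)/sd_x1 + bz2*(x2 - mean_x2)/sd_x2)
b1 = bz1 * sd_y / sd_x1                        # mm of rainfall per deg C of SST
b2 = bz2 * sd_y / sd_x2                        # mm of rainfall per % deforestation
const = mean_y - b1 * mean_x1 - b2 * mean_x2   # constant absorbs all the means

print(round(b1, 2), round(b2, 3), round(const, 1))
```

As the slide anticipates, both physical-unit coefficients come out negative and a nonzero constant appears.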

Page 16

We now compute the multiple correlation R, and the standard error of estimate for the multiple regression, using the two individual correlations and the b terms:

Cor(x1,y) = -0.52    Cor(x2,y) = -0.37    Cor(x1,x2) = 0.50
Regression equation: z_y = -0.447 z_x1 - 0.147 z_x2

R = sqrt( b1.2 cor(y,x1) + b2.1 cor(y,x2) )

R = sqrt( (-0.447)(-0.52) + (-0.147)(-0.37) ) = 0.535

The deforestation factor helps the prediction accuracy only slightly. If there were less correlation between the two predictors, then the second predictor would be more valuable.

Standard Error of Estimate = sqrt( 1 - R^2 ) = 0.845

In physical units it is (0.845)(88 mm) = 74.3 mm
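Both numbers follow directly from the correlations; a sketch that recomputes the b's exactly (avoiding the rounding of the hand solution) and then R and the standard error:

```python
import math

r_y1, r_y2, r_12 = -0.52, -0.37, 0.50   # Sahel correlations

# Exact standardized coefficients
denom = 1 - r_12 ** 2
b1 = (r_y1 - r_y2 * r_12) / denom       # about -0.447
b2 = (r_y2 - r_y1 * r_12) / denom       # about -0.147

R = math.sqrt(b1 * r_y1 + b2 * r_y2)    # multiple correlation
se_z = math.sqrt(1 - R ** 2)            # standard error of estimate, z units
se_mm = se_z * 88                       # physical units (rainfall SD = 88 mm)

print(round(R, 3), round(se_z, 3), round(se_mm, 1))  # 0.535 0.845 74.3
```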

Page 17

Let us evaluate the significance of the multiple correlation of 0.535. How likely could it have arisen by chance alone? First we find the standard error of samples of 50 drawn from a population having no correlations at all, using 2 predictors:

StError(zero correl) = 1 / sqrt(n - k)

For n = 50 and k = 2 we get 1 / sqrt(50 - 2) = 0.144

For a 2-sided z test at the 0.05 level, we need 1.96 (0.144) = 0.28. This is easily exceeded, suggesting that the combination of the two predictors (SST and deforestation) does have an impact on Sahel summer rainfall. (Using SST alone in simple regression, with cor = 0.52, would have given nearly the same level of significance.)
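The test can be sketched in a few lines (the denominator n - k is my reconstruction of the fragmentary formula, so treat it as an assumption):

```python
import math

n, k, R = 50, 2, 0.535                 # Sahel example
se_zero = 1.0 / math.sqrt(n - k)       # sampling SD of R around zero, no true relationship
threshold = 1.96 * se_zero             # two-sided test at the 0.05 level

print(round(se_zero, 3), round(threshold, 2), R > threshold)  # 0.144 0.28 True
```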

Page 18

Example problem using this regression equation:

Suppose that a climate change model predicts that in year 2050, the SST in the tropical Atlantic and Indian oceans will be 2.4 standard deviations above the means given for the 50-year period of the preceding problem. (It is now about 1.6 standard deviations above that mean.) Assume that land use practices (percentage deforestation) will be the same as they are now, which is 1.3 standard deviations above the mean. Under this scenario, using the multiple regression relationship above, how many standard deviations away from the mean will Jul-Aug-Sep Sahel rainfall be, and what seasonal total rainfall does that correspond to?

Page 19

The problem can be solved either in physical units or in standard deviation units, and the answer can then be expressed in either (or both) kinds of units.

If solved in physical units, the values of the two predictors in SD units (2.4 and 1.3) can be converted to raw units using the means and standard deviations of the variables provided previously, and the raw-units form of the regression equation would be used.

If solved in SD units, the simpler equation can be used: z_y = -0.447 z_x1 - 0.147 z_x2. The z's of the two predictors, according to the scenario given, will be 2.4 and 1.3, respectively.

Then z_y = -0.447(2.4) - 0.147(1.3) = -1.264. This is how many SDs away from the mean the rainfall would be. Since the rainfall mean and SD are 230 and 88 mm, respectively, the actual amount predicted is 230 - 1.264(88) = 230 - 111.2 = 118.8 mm.
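The SD-units solution is one line of arithmetic; as a sketch:

```python
# Scenario: SST at +2.4 SDs, deforestation at +1.3 SDs.
z_x1, z_x2 = 2.4, 1.3
z_y = -0.447 * z_x1 - 0.147 * z_x2   # standardized regression prediction
rain_mm = 230 + z_y * 88             # back to physical units (mean 230 mm, SD 88 mm)

print(round(z_y, 3), round(rain_mm, 1))  # -1.264 118.8
```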

Page 20

Collinearity

When the predictors are highly correlated with one another in multiple regression, a condition of collinearity exists. When this happens, the coefficients of two highly correlated predictors may have opposing signs, even when each of them has the same sign of simple correlation with the predictand. (Such opposing-signed coefficients minimize squared errors.) Issues and problems with this are (1) it is counterintuitive, and (2) the coefficients are very unstable, such that if one more sample is added to the data, they may change drastically.

When collinearity exists, the multiple regression formula will often still provide useful and accurate predictions. To eliminate collinearity, predictors that are highly correlated can be combined into a single predictor.

Page 21

Overfitting

When too many predictors are included in a multiple regression equation, random correlations between the variations of y (the predictand) and one of the predictors are "explained" by the equation. Then when the equation is used on independent (e.g. future) predictions, the results are worse than expected.

Overfitting and collinearity are two different issues. Overfitting is more serious, since it is "deceptive".

To reduce the effects of overfitting, cross-validation can be used:
-- withhold one or more cases when forming the equation, then predict those cases; rotate the cases withheld
-- withhold part of the period when forming the equation, then predict that part of the period.
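A minimal leave-one-out cross-validation sketch, using simple (one-predictor) regression and made-up data so it stays self-contained (neither the data nor the helper names come from the slides):

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for one predictor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

def loo_cv_mse(xs, ys):
    """Withhold each case in turn, fit on the rest, predict the withheld case."""
    total = 0.0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        slope, intercept = fit_line(train_x, train_y)
        total += (ys[i] - (intercept + slope * xs[i])) ** 2
    return total / len(xs)

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [1.1, 1.9, 3.2, 3.9, 5.1, 5.8]
print(loo_cv_mse(xs, ys))  # out-of-sample error estimate
```

Because every prediction is made for a case the equation never saw, this error estimate is far less flattered by overfitting than the in-sample fit.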

