multiple regression 2564

Date post: 25-Nov-2021
Page 1: multiple regression 2564

Multiple Regression Analysis

Asst. Prof. Nikom Thanomsieng

Department of Epidemiology and Biostatistics

Faculty of Public Health, Khon Kaen University

Email: [email protected] Web: https://home.kku.ac.th/nikom

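[Title-slide formulas]

$$F = \frac{S_1^2}{S_2^2}, \qquad t = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}, \qquad t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$$

$$100(1-\alpha)\%\ \text{CI}: \quad \bar{x} \pm t_{\alpha/2,\,df}\,(s/\sqrt{n})$$

p-value judged against 0.05.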

Multiple regression analysis

Multiple regression analyzes the relationship between one dependent variable and two or more independent variables.

Independent variables are also called explanatory variables.

The dependent variable is also called the response variable or the outcome variable.

Multiple regression:

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_p x_p + e_i$$

Simple regression:

$$\hat{y} = a + bx + e_i \quad \text{or} \quad \hat{y} = \beta_0 + \beta_1 x + e_i$$

Objectives of multiple regression analysis

Measure the linear relationship between the independent variables, which may be (a) continuous or (b) categorical (entered as dummy variables), and the dependent variable, which must be continuous.

Prediction.

[Diagram: CHOL, TRI, AGE, ... as predictors of systolic BP]

idno  sysbp  chol  age  tri      idno  sysbp  chol  age  tri
   1    155   375   66  230        11    132   304   40  140
   2    136   290   49  161        12    164   428   51  175
   3    133   267   47  187        13    136   282   56  159
   4    166   340   55  178        14     73   165   36   44
   5    111   282   42  112        15    153   395   51  181
   6    150   352   71  125        16    135   324   54  164
   7    131   285   39  149        17    149   426   51  205
   8    167   383   59  208        18    149   337   57  189
   9    166   363   60  208        19    142   347   45  152
  10    126   283   48  138        20    148   349   55  194

Example: a study of the relationship of age, cholesterol level, and triglyceride level with systolic blood pressure.

Data: the vector of sysbp values and the matrix of independent variables (chol, age, tri).

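$$\mathbf{y} = \begin{bmatrix} 155 \\ 136 \\ \vdots \\ 148 \end{bmatrix}, \qquad \mathbf{X} = \begin{bmatrix} 1 & 375 & 66 & 230 \\ 1 & 290 & 49 & 161 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 349 & 55 & 194 \end{bmatrix}$$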

Computing the multiple regression coefficients

The coefficients are estimated by the least squares method, using a matrix approach.

$$\hat{y} = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3 + e_i$$

$$\mathbf{b}_{(p \times 1)} = (\mathbf{X}'\mathbf{X})^{-1}_{(p \times p)}\,\mathbf{X}'\mathbf{y}_{(p \times 1)}$$
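The matrix formula b = (X'X)^-1 X'y can be checked by hand on the 20 observations in the example table. The following is a minimal sketch in plain Python, not the routine Stata uses internally; the helper names `solve` and `ols` are illustrative assumptions. It builds X with a leading column of ones, forms the normal equations, and solves them by Gauss-Jordan elimination.

```python
# Data from the example table: (sysbp, chol, age, tri) for the 20 subjects.
ROWS = [
    (155, 375, 66, 230), (136, 290, 49, 161), (133, 267, 47, 187),
    (166, 340, 55, 178), (111, 282, 42, 112), (150, 352, 71, 125),
    (131, 285, 39, 149), (167, 383, 59, 208), (166, 363, 60, 208),
    (126, 283, 48, 138), (132, 304, 40, 140), (164, 428, 51, 175),
    (136, 282, 56, 159), (73, 165, 36, 44), (153, 395, 51, 181),
    (135, 324, 54, 164), (149, 426, 51, 205), (149, 337, 57, 189),
    (142, 347, 45, 152), (148, 349, 55, 194),
]


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination
    with partial pivoting (hypothetical helper, small systems only)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def ols(y, x_rows):
    """Return b = (X'X)^-1 X'y, where X gets a leading column of ones."""
    X = [[1.0] + [float(v) for v in row] for row in x_rows]
    p = len(X[0])
    xtx = [[sum(r[i] * r[j] for r in X) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(X, y)) for i in range(p)]
    return solve(xtx, xty)


y = [row[0] for row in ROWS]
b = ols(y, [row[1:] for row in ROWS])
# Order of the estimates: _cons, chol, age, tri.
print([round(v, 4) for v in b])
```

Up to rounding, the printed values should agree with the Coef. column of the `regress sysbp chol age tri` output. Solving the normal equations directly is fine for a small teaching example; statistical packages generally use numerically more stable decompositions.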

Multiple regression analysis using Stata

. regress sysbp chol age tri

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  3,    16) =   35.56
   Model |  7942.70165     3  2647.56722               Prob > F      =  0.0000
Residual |  1191.09835    16  74.4436471               R-squared     =  0.8696
---------+------------------------------               Adj R-squared =  0.8451
   Total |     9133.80    19  480.726316               Root MSE      =  8.6281

------------------------------------------------------------------------------
   sysbp |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    chol |   .1654515   .0496455     3.333   0.004       .0602077    .2706953
     age |   .5122311   .2802612     1.828   0.086      -.0818961    1.106358
     tri |   .2006968   .0745745     2.691   0.016        .042606    .3587876
   _cons |   27.15522   12.80998     2.120   0.050      -.0007308    54.31117
------------------------------------------------------------------------------

Page 2: multiple regression 2564

(A) Linear relationship between the dependent variable and the set of independent variables

The aim is to conclude whether at least one independent variable has a linear relationship with Y.

Hypothesis testing

The overall conclusion uses the analysis of variance (ANOVA) table for the regression to compute the F statistic.

$$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0 \quad (\text{that is, } \beta_i = 0 \text{ for every } i = 1, \dots, k)$$

$$H_A: \beta_i \neq 0 \text{ for at least one } i$$

ANOVA table for regression analysis (the F-test is read from this table; MSR is the Model mean square)

$$\hat{Y}_i = 27.16 + 0.17\,(chol) + 0.51\,(age) + 0.20\,(tri)$$

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  3,    16) =   35.56
   Model |  7942.70165     3  2647.56722               Prob > F      =  0.0000
Residual |  1191.09835    16  74.4436471               R-squared     =  0.8696
---------+------------------------------               Adj R-squared =  0.8451
   Total |     9133.80    19  480.726316               Root MSE      =  8.6281

Hypotheses for testing the significance of the overall regression equation

H0: the k independent variables cannot explain the variation in Y,

or

$$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$$

that is, none of the independent variables has a linear relationship with the dependent variable.

HA: the k independent variables can explain the variation in Y,

or

$$H_A: \beta_i \neq 0 \text{ for at least one } i$$

that is, at least one independent variable has a linear relationship with the dependent variable.

The test uses the F statistic:

$$F = \frac{\text{mean square regression (or model)}}{\text{mean square residual (or error)}} = \frac{MSR}{MSE}$$

Alternatively, F can be computed from R²:

$$F = \frac{R^2 / k}{(1 - R^2)/(n - k - 1)}$$

where R² is computed as

$$R^2 = \frac{SSY - SSE}{SSY} = \frac{SSR}{SSY}$$

and

n = sample size

k = number of independent variables

R² = coefficient of determination
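The two routes to F described above (MSR/MSE from the ANOVA table, and the R² formula) can be compared numerically. This is a small sketch using the sums of squares and degrees of freedom copied from the `regress sysbp chol age tri` output; the variable names are illustrative.

```python
# ANOVA values copied from the regress output in this document.
ss_model, df_model = 7942.70165, 3    # SSR and its degrees of freedom (k)
ss_resid, df_resid = 1191.09835, 16   # SSE and its degrees of freedom (n - k - 1)

# Route 1: F = MSR / MSE.
msr = ss_model / df_model             # mean square regression (model)
mse = ss_resid / df_resid             # mean square residual (error)
f_anova = msr / mse

# Route 2: F from the coefficient of determination.
ss_total = ss_model + ss_resid        # SSY
r2 = ss_model / ss_total              # R^2 = SSR / SSY
n, k = 20, 3
f_from_r2 = (r2 / k) / ((1 - r2) / (n - k - 1))

print(round(f_anova, 2), round(f_from_r2, 2), round(r2, 4))
```

Both routes give the same F because (1 - R²) equals SSE/SSY, so the R² form is just the ANOVA ratio with SSY cancelled out.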

The analysis found that at least one independent variable has a statistically significant linear relationship with the dependent variable (F = 35.56; p < .0001) [namely chol and tri].

. regress sysbp chol age tri

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  3,    16) =   35.56
   Model |  7942.70165     3  2647.56722               Prob > F      =  0.0000
Residual |  1191.09835    16  74.4436471               R-squared     =  0.8696
---------+------------------------------               Adj R-squared =  0.8451
   Total |     9133.80    19  480.726316               Root MSE      =  8.6281

------------------------------------------------------------------------------
   sysbp |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    chol |   .1654515   .0496455     3.333   0.004       .0602077    .2706953
     age |   .5122311   .2802612     1.828   0.086      -.0818961    1.106358
     tri |   .2006968   .0745745     2.691   0.016        .042606    .3587876
   _cons |   27.15522   12.80998     2.120   0.050      -.0007308    54.31117
------------------------------------------------------------------------------

Hypotheses for testing the significance of the overall regression equation

H0: the k independent variables cannot explain the variation in Y ($H_0: \beta_i = 0$ for every i).

HA: the k independent variables can explain the variation in Y ($H_A: \beta_i \neq 0$ for at least one i).

The test uses the F statistic.

. xi: reg sysbp i.occ i.gender
i.occ             _Iocc_1-3           (naturally coded; _Iocc_1 omitted)
i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)

      Source |       SS       df       MS              Number of obs =      20
-------------+----------------------------------       F(3, 16)      =    0.36
       Model |  570.586942     3  190.195647           Prob > F      =  0.7859
    Residual |  8563.21306    16  535.200816           R-squared     =  0.0625
-------------+----------------------------------       Adj R-squared = -0.1133
       Total |      9133.8    19  480.726316           Root MSE      =  23.134

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     _Iocc_2 |   4.137457   14.54323     0.28   0.780    -26.69281    34.96772
     _Iocc_3 |   12.50172   12.72194     0.98   0.340    -14.46758    39.47102
  _Igender_2 |  -.6082474   11.50742    -0.05   0.959     -25.0029     23.7864
       _cons |   135.1014   9.637349    14.02   0.000     114.6711    155.5316
------------------------------------------------------------------------------

Page 3: multiple regression 2564

Hypotheses for testing the significance of the overall regression equation

H0: the k independent variables cannot explain the variation in Y ($H_0: \beta_i = 0$ for every i).

HA: the k independent variables can explain the variation in Y ($H_A: \beta_i \neq 0$ for at least one i).

The test uses the F statistic.

. xi: reg sysbp i.occ i.gender chol
i.occ             _Iocc_1-3           (naturally coded; _Iocc_1 omitted)
i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)

      Source |       SS       df       MS              Number of obs =      20
-------------+----------------------------------       F(4, 15)      =   13.43
       Model |  7139.70141     4  1784.92535           Prob > F      =  0.0001
    Residual |  1994.09859    15  132.939906           R-squared     =  0.7817
-------------+----------------------------------       Adj R-squared =  0.7235
       Total |      9133.8    19  480.726316           Root MSE      =   11.53

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     _Iocc_2 |   .9696195   7.262195     0.13   0.896    -14.50938    16.44862
     _Iocc_3 |  -2.080515   6.671207    -0.31   0.759    -16.29986    12.13883
  _Igender_2 |  -7.837831    5.82667    -1.35   0.199    -20.25708    4.581421
        chol |   .3255662   .0463141     7.03   0.000       .22685    .4242825
       _cons |     37.714   14.66305     2.57   0.021     6.460443    68.96757
------------------------------------------------------------------------------

(B) Linear relationship analysis: whether each independent variable is related to the dependent variable

Hypothesis test: $H_0: \beta_i = 0$; $H_A: \beta_i \neq 0$

$$t = \frac{\hat{\beta}_i}{S_{\hat{\beta}_i}}$$

where $\hat{\beta}_i$ is the estimated coefficient and $S_{\hat{\beta}_i}$ is its standard error.
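As a quick check of the t formula, each coefficient in the `regress sysbp chol age tri` output can be divided by its standard error. The sketch below copies the Coef. and Std. Err. columns from that output; the dictionary names are illustrative.

```python
# (coefficient, standard error) pairs copied from the regress output.
estimates = {
    "chol": (0.1654515, 0.0496455),
    "age": (0.5122311, 0.2802612),
    "tri": (0.2006968, 0.0745745),
    "_cons": (27.15522, 12.80998),
}

# t = estimated coefficient / its standard error, tested on n - k - 1 = 16 df.
t_stats = {name: coef / se for name, (coef, se) in estimates.items()}

for name, t in t_stats.items():
    print(f"{name:>6}: t = {t:.2f}")
```

The ratios reproduce the t column of the Stata table to two decimals; the p-values in the next column come from comparing each t with the t distribution on 16 degrees of freedom.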

. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007309    54.31117
------------------------------------------------------------------------------

Interpretation: read the sign of each coefficient.

Consider the relationship between each independent variable and the dependent variable.

chol and tri are positively related to sysbp, and the relationships are statistically significant.

age is positively related to sysbp, but the relationship is not statistically significant (p = 0.086).

. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007309    54.31117
------------------------------------------------------------------------------

Assessing how strongly each independent variable is related to the dependent variable

Judge this from an equation in which the independent variables have been rescaled to a common unit.

Convert each X_i to a standard score (Z-score):

$$z = \frac{x_i - \bar{x}}{sd} \quad \text{or} \quad \beta_i^{*} = \hat{\beta}_i \, \frac{S_{x_i}}{S_y}$$

where $S_{x_i}$ and $S_y$ are the standard deviations of $x_i$ and $y$.

. regress sysbp chol age tri, beta

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004                 .4663705
         age |   .5122311   .2802612     1.83   0.086                 .2076355
         tri |   .2006968   .0745745     2.69   0.016                 .3805016
       _cons |   27.15522   12.80998     2.12   0.050                        .
------------------------------------------------------------------------------

. di .16545147*(61.802976/21.925472)

.46637049

. di .51223109*(8.8876022/21.925472)

.20763549

. di .20069683*(41.568555/21.925472)

.3805016
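The `di` calculations above multiply each coefficient by the ratio of standard deviations. The same arithmetic can be sketched in Python from the raw data, assuming the sample standard deviation (n - 1 divisor), which is what `statistics.stdev` computes; the coefficients are copied from the regress output and the variable names are illustrative.

```python
from statistics import stdev

# Data from the example table: (sysbp, chol, age, tri) for the 20 subjects.
ROWS = [
    (155, 375, 66, 230), (136, 290, 49, 161), (133, 267, 47, 187),
    (166, 340, 55, 178), (111, 282, 42, 112), (150, 352, 71, 125),
    (131, 285, 39, 149), (167, 383, 59, 208), (166, 363, 60, 208),
    (126, 283, 48, 138), (132, 304, 40, 140), (164, 428, 51, 175),
    (136, 282, 56, 159), (73, 165, 36, 44), (153, 395, 51, 181),
    (135, 324, 54, 164), (149, 426, 51, 205), (149, 337, 57, 189),
    (142, 347, 45, 152), (148, 349, 55, 194),
]
sysbp, chol, age, tri = (list(column) for column in zip(*ROWS))

# Unstandardized coefficients copied from the regress output.
coef = {"chol": 0.1654515, "age": 0.5122311, "tri": 0.2006968}
predictors = {"chol": chol, "age": age, "tri": tri}

# Standardized coefficient: b * sd(x) / sd(y).
sd_y = stdev(sysbp)
beta = {name: coef[name] * stdev(x) / sd_y for name, x in predictors.items()}

for name, value in beta.items():
    print(f"{name}: {value:.4f}")
```

Because the standardized coefficients are unit-free, they can be ranked directly: chol has the largest value, then tri, then age.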

When the variables are first converted to standard scores (Z-scores),

$$z = \frac{x_i - \bar{x}}{sd}$$

the constant becomes very small (approximately 0) and the fitted coefficients equal the standardized coefficients

$$\beta_i^{*} = \hat{\beta}_i \, \frac{S_{x_i}}{S_y}$$

. zscore sysbp chol age tri

. regress z_sysbp z_chol z_age z_tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  16.5222943     3  5.50743142           Prob > F      =  0.0000
    Residual |  2.47770574    16  .154856609           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |       19.00    19        1.00           Root MSE      =  .39352

------------------------------------------------------------------------------
     z_sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      z_chol |   .4663705   .1399396     3.33   0.004     .1697118    .7630292
       z_age |   .2076355   .1136053     1.83   0.086     -.033197     .448468
       z_tri |   .3805016   .1413859     2.69   0.016     .0807768    .6802263
       _cons |   3.62e-16   .0879934     0.00   1.000    -.1865376    .1865376
------------------------------------------------------------------------------

$$R^2_{y \mid x_1, x_2, \dots, x_k} = \frac{SSR}{SSY} = \frac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} = \frac{7942.70165}{9133.80} = 0.869594$$

Example: computing the coefficient of determination from the sample data.

chol, age, and triglyceride together explain 86.96 percent of the variation (variance) in systolic blood pressure.

Prediction equation and evaluation of the multiple regression equation

The regression equation is evaluated by its coefficient of determination.

Prediction equation:

$$\hat{y}_i = 27.16 + 0.17\,(chol) + 0.51\,(age) + 0.20\,(tri)$$
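To illustrate how the prediction equation is used, the sketch below plugs the values of subject idno 1 from the example table (chol = 375, age = 66, tri = 230, observed sysbp = 155) into the fitted equation. It uses the unrounded coefficients from the regress output; the function name `predict_sysbp` is an illustrative assumption.

```python
# Coefficients copied from the regress output (unrounded).
COEF = {"_cons": 27.15522, "chol": 0.1654515, "age": 0.5122311, "tri": 0.2006968}


def predict_sysbp(chol, age, tri):
    """Fitted systolic blood pressure from the prediction equation."""
    return (COEF["_cons"] + COEF["chol"] * chol
            + COEF["age"] * age + COEF["tri"] * tri)


fitted = predict_sysbp(chol=375, age=66, tri=230)  # subject idno 1
residual = 155 - fitted                            # observed minus fitted
print(round(fitted, 2), round(residual, 2))
```

For this subject the fitted value is above the observed one, so the residual is negative; Root MSE = 8.6281 in the output gives the typical size of such residuals.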

. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

Page 4: multiple regression 2564

When building a model, R² rises as more independent variables are added, so R² should be adjusted. The adjusted value is called the "Adjusted coefficient of determination":

$$R^2_{adj} = 1 - \frac{n - 1}{n - p}\left(1 - \frac{SSR}{SSY}\right)$$

where n − p is the residual degrees of freedom (16 in the example below).

Adjusted coefficient of determination

. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281
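As a worked check of the Adj R-squared value in the output above, assume that p counts all estimated coefficients including the constant, so that n − p = 20 − 4 = 16 matches the residual degrees of freedom. With SSR = 7942.70165 (Model) and SSY = 9133.80 (Total):

```latex
\begin{aligned}
R^2_{adj} &= 1 - \frac{n - 1}{n - p}\left(1 - \frac{SSR}{SSY}\right) \\
          &= 1 - \frac{20 - 1}{20 - 4}\left(1 - \frac{7942.70165}{9133.80}\right) \\
          &= 1 - 1.1875 \times 0.130406 \\
          &\approx 0.8451
\end{aligned}
```

This agrees with Adj R-squared = 0.8451 reported by Stata, which supports reading n − p as the residual degrees of freedom.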

Report Regression Table (Publication Manual of the American Psychological Association, sixth edition, 2010; p. 144)

Variable       b      s.e.   Wald (t)  P-value  95% CI         Beta  R2   R2 change
Cholesterol    .17    .05    3.33      0.004    .06 to .27     .47   .75  .75
Age            .51    .28    1.83      0.086    -.08 to 1.11   .21   .81  .06
Triglyceride   .20    .07    2.69      0.016    .04 to .36     .38   .87  .06
Constant       27.16  12.81  2.12      0.050    .00 to 54.31

R2 = 0.87, Adjusted R2 = .85, F = 35.56, p-value < .0001, n = 20

Recommended report (Lang et al. (1997). How to Report Statistics in Medicine, p. 115)

. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007309    54.31117
------------------------------------------------------------------------------

. do "M:\516701_2555\report_mreg.do"

. use "M:\516701_2555\multiple_reg_data.dta", clear

. regress sysbp chol
...
    Residual |  2267.92107    17  133.407122           R-squared     =  0.7516
...

. regress sysbp chol age
    Residual |  1729.02942    16  108.064339           R-squared     =  0.8106
...

. regress sysbp chol age tri
    Residual |  1191.02416    15  79.4016106           R-squared     =  0.8696
...
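The R2 and R2 change columns of a report table can be derived from the R-squared values of the three nested models in the do-file log above. A minimal sketch, with the R-squared values copied from that log and illustrative variable names:

```python
# R-squared of each nested model, copied from the do-file log.
r2_by_step = [
    ("chol", 0.7516),
    ("chol age", 0.8106),
    ("chol age tri", 0.8696),
]

previous = 0.0
changes = []
for model, r2 in r2_by_step:
    change = r2 - previous            # gain in R-squared from the added variable
    changes.append(round(change, 2))
    print(f"{model:<14} R2 = {r2:.2f}   R2 change = {change:.2f}")
    previous = r2
```

The change attributed to a variable depends on the order of entry: here age and tri are credited only with what they add beyond the variables already in the model.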

. regress sysbp chol age tri, beta

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |      9133.8    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004                 .4663705
         age |   .5122311   .2802612     1.83   0.086                 .2076355
         tri |   .2006968   .0745745     2.69   0.016                 .3805016
       _cons |   27.15522   12.80998     2.12   0.050                        .
------------------------------------------------------------------------------

Multiple regression analysis when an independent variable is categorical: variables such as sex or occupation must be converted into dummy variables.

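$$\hat{y} = \beta_0 + \sum_{j=1}^{k-1} \beta_j D_j + \sum_{l} \beta_l x_l$$

where the $D_j$ are the dummy variables and the $x_l$ are the other independent variables.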

An independent variable with k levels yields k − 1 dummy variables (k = number of levels or groups).

Variable      Dummy variables
              D1   D2
code = 1       0    0
code = 2       1    0
code = 3       0    1

$$\hat{y} = \beta_0 + \beta_1(chol) + \beta_2(age) + \beta_3(tri) + \beta_4(occ_{comm}) + \beta_5(occ_{office}) + \beta_6(gender)$$

Example: occupation (agriculture, trade, civil service) is a categorical variable. Convert it into k − 1 = 3 − 1 = 2 dummy variables as follows.

Stata command: xi: regress sysbp chol age tri i.occ i.gender

Occupation           Dummy variables
                     D1   D2
Agriculture = 1       0    0
Trade = 2             1    0
Civil service = 3     0    1

*** With two groups, for example sex coded 0 and 1, the variable can be analyzed in Stata directly. If it is coded 1 and 2, define it as a dummy variable.
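The coding table above can be expressed as a small function. This is a sketch of reference-level dummy coding in Python, not of what Stata's `xi` does internally; `dummy_code` and `OCC_LEVELS` are illustrative names, and the first level listed is treated as the reference group.

```python
def dummy_code(value, levels):
    """Return k-1 indicator values for `value`.

    `levels` lists the k category codes; the first one is the
    reference group and is coded as all zeros.
    """
    if value not in levels:
        raise ValueError(f"unknown level: {value!r}")
    return tuple(int(value == level) for level in levels[1:])


OCC_LEVELS = [1, 2, 3]  # 1 = agriculture, 2 = trade, 3 = civil service

for code in OCC_LEVELS:
    d1, d2 = dummy_code(code, OCC_LEVELS)
    print(f"occ = {code}: D1 = {d1}, D2 = {d2}")

# A two-group variable coded 1/2 becomes a single 0/1 indicator.
print(dummy_code(2, [1, 2]))
```

Each dummy coefficient is then read as the difference in mean sysbp between that group and the reference group, holding the other variables fixed.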

Page 5: multiple regression 2564

. xi: regress sysbp chol age tri i.occ i.gender
i.occ             _Iocc_1-3           (naturally coded; _Iocc_1 omitted)
i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  6,    13) =   16.99
       Model |  8101.00425     6  1350.16737           Prob > F      =  0.0000
    Residual |  1032.79575    13  79.4458272           R-squared     =  0.8869
-------------+------------------------------           Adj R-squared =  0.8347
       Total |      9133.8    19  480.726316           Root MSE      =  8.9132

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1745477   .0564986     3.09   0.009     .0524899    .2966054
         age |    .504353   .3139673     1.61   0.132     -.173932    1.182638
         tri |   .2081322   .0796249     2.61   0.021      .036113    .3801514
     _Iocc_2 |   5.242509    5.77858     0.91   0.381    -7.241355    17.72637
     _Iocc_3 |   -1.13821   5.298263    -0.21   0.833    -12.58441    10.30799
  _Igender_2 |  -4.495496    4.72941    -0.95   0.359    -14.71276    5.721772
       _cons |   24.02471   13.96057     1.72   0.109    -6.135272    54.18469
------------------------------------------------------------------------------

. list

     +-------------------------------------------------------------------------------+
     | idno   sysbp   chol   age   tri   occ   gender   _Iocc_2   _Iocc_3   _Igend~2 |
     |-------------------------------------------------------------------------------|
  1. |    1     155    375    66   230     3        2         0         1          1 |
  2. |    2     136    290    49   161     1        1         0         0          0 |
  3. |    3     133    267    47   187     1        1         0         0          0 |
  4. |    4     166    340    55   178     2        1         1         0          0 |
  5. |    5     111    282    42   112     2        2         1         0          1 |
     |-------------------------------------------------------------------------------|
  6. |    6     150    352    71   125     3        1         0         1          0 |
  7. |    7     131    285    39   149     2        2         1         0          1 |
  8. |    8     167    383    59   208     3        1         0         1          0 |
  9. |    9     166    363    60   208     1        1         0         0          0 |
 10. |   10     126    283    48   138     2        2         1         0          1 |
     |-------------------------------------------------------------------------------|
 11. |   11     132    304    40   140     3        1         0         1          0 |
 12. |   12     164    428    51   175     2        2         1         0          1 |
 13. |   13     136    282    56   159     3        1         0         1          0 |
 14. |   14      73    165    36    44     1        1         0         0          0 |
 15. |   15     153    395    51   181     1        2         0         0          1 |
     |-------------------------------------------------------------------------------|
 16. |   16     135    324    54   164     2        1         1         0          0 |
 17. |   17     149    426    51   205     3        1         0         1          0 |
 18. |   18     149    337    57   189     1        1         0         0          0 |
 19. |   19     142    347    45   152     3        2         0         1          1 |
 20. |   20     148    349    55   194     3        2         0         1          1 |
     +-------------------------------------------------------------------------------+

Selecting independent variables for the equation in multiple regression analysis

1. Forward selection procedure

Independent variables are considered for entry one at a time.

2. Backward elimination procedure

Independent variables are considered for removal one at a time.

3. The stepwise regression procedure

Uses both the forward and the backward methods.

* Independent variables are judged by their p-values (from the t statistic), by setting

(a) Probability to Entry (Pe) -> Forward

(b) Probability to Remove (Pr) -> Backward

(c) Pe and Pr -> Stepwise

Backward elimination procedure

Step 1: Fit a regression equation containing all independent variables.

SYSBP = 27.16 + 0.17(chol) + 0.51(age) + 0.20(tri)

Step 2: Compute the partial t statistic and p-value for every independent variable in the model.

Step 3: Identify the variable with the largest p-value (from its t statistic).

. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007309    54.31117
------------------------------------------------------------------------------

Step 4: Compare that p-value with the chosen significance level (for example, Pr = 0.05).

If the p-value exceeds the chosen significance level (p-value > Pr), remove that variable from the equation.

Here age is removed (p-value = .086 > 0.05).

. regress sysbp chol age tri

…

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007309    54.31117
------------------------------------------------------------------------------

Loop: repeat steps 1 to 4 with the remaining variables until no variable has a p-value above the chosen significance level (for example, Pr = 0.05).
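The four steps and the loop can be sketched as a short routine. To keep the example self-contained, the p-values are not recomputed: they are copied from the Stata outputs in this document (the full model, and the model with chol and tri only) into a lookup table. A real analysis would refit the regression at every pass. The names `REPORTED_P`, `reported_p_values`, and `backward_eliminate` are illustrative assumptions.

```python
# p-values copied from the regress outputs in this document, keyed by the
# set of predictors in the fitted model (a stand-in for refitting).
REPORTED_P = {
    frozenset({"chol", "age", "tri"}): {"chol": 0.004, "age": 0.086, "tri": 0.016},
    frozenset({"chol", "tri"}): {"chol": 0.002, "tri": 0.006},
}


def reported_p_values(variables):
    """Look up the p-values reported for a model with these predictors."""
    return REPORTED_P[frozenset(variables)]


def backward_eliminate(variables, p_values_for, pr=0.05):
    """Repeatedly drop the predictor with the largest p-value while it exceeds pr."""
    kept = list(variables)
    while kept:
        p_values = p_values_for(kept)                     # steps 1 and 2
        worst = max(kept, key=lambda name: p_values[name])  # step 3
        if p_values[worst] <= pr:                         # step 4: nothing to remove
            break
        kept.remove(worst)                                # remove and loop again
    return kept


selected = backward_eliminate(["chol", "age", "tri"], reported_p_values)
print(selected)
```

With Pr = 0.05 the routine removes age on the first pass and stops on the second, because both remaining p-values are below the threshold.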

. regress sysbp chol tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   45.42
       Model |  7694.02578     2  3847.01289           Prob > F      =  0.0000
    Residual |  1439.77422    17  84.6926011           R-squared     =  0.8424
-------------+------------------------------           Adj R-squared =  0.8238
       Total |     9133.80    19  480.726316           Root MSE      =  9.2029

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10278
------------------------------------------------------------------------------

Page 6: multiple regression 2564

Backward elimination procedure

. sw regress sysbp chol age tri, pr(.05)
                      begin with full model
p = 0.0863 >= 0.0500  removing age

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  2,    17) =   45.42
   Model |  7694.02578     2  3847.01289               Prob > F      =  0.0000
Residual |  1439.77422    17  84.6926011               R-squared     =  0.8424
---------+------------------------------               Adj R-squared =  0.8238
   Total |     9133.80    19  480.726316               Root MSE      =  9.2029

------------------------------------------------------------------------------
   sysbp |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    chol |   .1875776   .0513543     3.653   0.002       .0792295    .2959258
     tri |    .238911   .0763522     3.129   0.006       .0778219          .4
   _cons |   40.00673   11.42093     3.503   0.003       15.91068    64.10278
------------------------------------------------------------------------------

. stepwise, pr(.05) : regress sysbp chol age tri
                      begin with full model
p = 0.0863 >= 0.0500  removing age
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279

Forward selection procedure

Step 1: Select the first independent variable to enter the equation

Specify Pe (e.g., Pe = 0.05)

(a) Run a simple regression of the dependent variable on each independent variable, one at a time.

Select the independent variable that is linearly related to the dependent variable with the smallest p-value, provided that p-value < the significance level (Pe).

***** If p-value > Pe, stop the analysis.

. regress sysbp chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .3075584   .0416769     7.38   0.000     .2199986    .3951183
       _cons |   39.95941   13.93348     2.87   0.010     10.68625    69.23256
------------------------------------------------------------------------------

. regress sysbp age
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.695629   .4223477     4.01   0.001     .8083094    2.582949
       _cons |   53.60554   22.09811     2.43   0.026     7.179137     100.032
------------------------------------------------------------------------------

. regress sysbp tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         tri |   .4471421   .0659423     6.78   0.000     .3086025    .5856817
       _cons |   67.34391    11.2005     6.01   0.000     43.81254    90.87528
------------------------------------------------------------------------------

Simple regression of the dependent variable on each independent variable

Therefore chol is the first independent variable entered into the equation.

Alternatively: forward selection procedure

Compute the correlations between the dependent variable and the independent variables, then select the independent variable most strongly correlated with the dependent variable, provided its p-value < Pe.

In the example, the correlations are:

r(sysbp, chol) = 0.8669

r(sysbp, age) = 0.6873

r(sysbp, tri) = 0.8477

Therefore chol is the first independent variable entered into the equation.

. pwcorr sysbp chol age tri, sig

             |    sysbp     chol      age      tri
-------------+------------------------------------
       sysbp |   1.0000
             |
        chol |   0.8669   1.0000
             |   0.0000
             |
         age |   0.6873   0.5609   1.0000
             |   0.0008   0.0101
             |
         tri |   0.8477   0.7467   0.5732   1.0000
             |   0.0000   0.0002   0.0082
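Ranking candidates by correlation and ranking them by the simple-regression t-test give the same order, because the slope t statistic is a function of r alone: t = r*sqrt(n-2)/sqrt(1-r^2) with n-2 degrees of freedom. The sketch below checks this in Stata using the correlations printed by pwcorr. The scalar names are illustrative.

```stata
* Slope t statistic of a simple regression, computed from Pearson's r
scalar n_obs  = 20
scalar r_chol = 0.8669
scalar r_age  = 0.6873
scalar r_tri  = 0.8477
scalar t_chol = r_chol*sqrt(n_obs - 2)/sqrt(1 - r_chol^2)
scalar t_age  = r_age*sqrt(n_obs - 2)/sqrt(1 - r_age^2)
scalar t_tri  = r_tri*sqrt(n_obs - 2)/sqrt(1 - r_tri^2)
scalar p_chol = 2*ttail(n_obs - 2, t_chol)
scalar p_age  = 2*ttail(n_obs - 2, t_age)
scalar p_tri  = 2*ttail(n_obs - 2, t_tri)
display "chol: t = " %5.2f t_chol "   p = " %6.4f p_chol
display "age:  t = " %5.2f t_age  "   p = " %6.4f p_age
display "tri:  t = " %5.2f t_tri  "   p = " %6.4f p_tri
```

Up to rounding of r, these reproduce the t values 7.38, 4.01 and 6.78 from the three simple regressions, so chol enters first under either criterion.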

The variable tri has the smaller p-value (t = 3.13, p-value = .006)

. regress sysbp tri chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------

Step 2: Consider the second variable to enter the model

(a) Fit a model of the dependent variable on the independent variable entered in step 1 plus each remaining independent variable, one at a time, and compare the p-values (from the t-test):

(a) p-value < Pe: enter the independent variable into the model

(b) p-value > Pe: stop entering independent variables into the model

. regress sysbp age chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7236989   .3145605     2.30   0.034     .0600341    1.387364
        chol |   .2491839   .0452355     5.51   0.000     .1537453    .3446225
       _cons |   21.81301   14.79754     1.47   0.159    -9.407062    53.03308
------------------------------------------------------------------------------

$$ r_{YX|Z} = \frac{r_{YX} - r_{YZ}\, r_{XZ}}{\sqrt{(1 - r_{YZ}^2)(1 - r_{XZ}^2)}} $$

*The partial correlation method can also be used and gives the same result.

$$ r_{sysbp,tri|chol} = \frac{r_{sysbp,tri} - r_{sysbp,chol}\, r_{tri,chol}}{\sqrt{(1 - r_{sysbp,chol}^2)(1 - r_{tri,chol}^2)}} $$

Alternatively, step 2: consider the second variable to enter the model

$$ r_{sysbp,age|chol} = \frac{r_{sysbp,age} - r_{sysbp,chol}\, r_{age,chol}}{\sqrt{(1 - r_{sysbp,chol}^2)(1 - r_{age,chol}^2)}} $$

. pcorr sysbp tri chol
(obs=20)

Partial correlation of sysbp with

    Variable |    Corr.     Sig.
-------------+------------------
         tri |   0.6045    0.006
        chol |   0.6631    0.002

. pcorr sysbp age chol
(obs=20)

Partial correlation of sysbp with

    Variable |    Corr.     Sig.
-------------+------------------
         age |   0.4873    0.034
        chol |   0.8006    0.000
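The partial correlations reported by pcorr can be checked by hand from the pairwise correlations in the pwcorr output, using the formula for r(YX|Z). The sketch below does this in Stata for tri and for age, each controlling for chol. The scalar names are illustrative, and the inputs are the four-decimal values printed earlier, so agreement is only to about three decimals.

```stata
* Partial correlation of sysbp with tri, and with age, controlling for chol
* r_sc = sysbp-chol, r_st = sysbp-tri, r_sa = sysbp-age
* r_tc = tri-chol,   r_ac = age-chol
scalar r_sc = 0.8669
scalar r_st = 0.8477
scalar r_sa = 0.6873
scalar r_tc = 0.7467
scalar r_ac = 0.5609
scalar pr_tri = (r_st - r_sc*r_tc)/sqrt((1 - r_sc^2)*(1 - r_tc^2))
scalar pr_age = (r_sa - r_sc*r_ac)/sqrt((1 - r_sc^2)*(1 - r_ac^2))
display "sysbp, tri | chol: " %6.4f pr_tri
display "sysbp, age | chol: " %6.4f pr_age
```

Both values should land within about 0.001 of the pcorr results (0.6045 for tri, 0.4873 for age). The larger partial correlation belongs to tri, which is why tri is the second variable entered.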


Step 3: For the independent variable being entered, obtain its t and p-value in a model that also contains the independent variables entered earlier.

If

(a) p-value < the specified statistical significance level (Pe): enter that independent variable into the regression model

(b) p-value > the specified statistical significance level (Pe): stop entering independent variables and keep the model from step 1

Step 3: Repeat step 2 with the remaining variables until every independent variable has been considered. When no variable has p-value < the specified Pe, stop entering.

or (2) use partial correlation, which gives the same t and p-value

. regress sysbp age chol tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007311    54.31117
------------------------------------------------------------------------------

. pcorr sysbp age chol tri
(obs=20)

Partial correlation of sysbp with

    Variable |    Corr.     Sig.
-------------+------------------
         age |   0.4156    0.086
        chol |   0.6401    0.004
         tri |   0.5582    0.016

The p-value of 0.086 (for t = 1.83) exceeds the specified statistical significance level (Pe = 0.05): stop entering.

. regress sysbp age chol tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |      9133.8    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007311    54.31117
------------------------------------------------------------------------------

p-value = 0.086 > the specified statistical significance level (Pe = 0.05)

Stop entering; use the model from step 2.

. regress sysbp chol tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   45.42
       Model |  7694.02578     2  3847.01289           Prob > F      =  0.0000
    Residual |  1439.77422    17  84.6926011           R-squared     =  0.8424
-------------+------------------------------           Adj R-squared =  0.8238
       Total |     9133.80    19  480.726316           Root MSE      =  9.2029

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10278
------------------------------------------------------------------------------
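The t-test used to decide whether age enters is equivalent to a partial F-test comparing the three-variable model with the two-variable model. The sketch below computes that F in Stata from the two ANOVA tables. The scalar names are illustrative and not part of the original session.

```stata
* Partial F-test for adding age to the model that already holds chol and tri
* Model SS with age, chol, tri
scalar ss_full  = 7942.70165
* Model SS with chol, tri only
scalar ss_red   = 7694.02578
* Residual mean square of the full model (16 df)
scalar mse_full = 74.4436471
scalar f_age    = ((ss_full - ss_red)/1)/mse_full
scalar t_equiv  = sqrt(f_age)
scalar p_f_age  = Ftail(1, 16, f_age)
display "partial F(1,16) = " %5.2f f_age
display "sqrt(F)         = " %5.2f t_equiv
display "p-value         = " %6.4f p_f_age
```

F comes out near 3.34, and its square root near 1.83 is the t reported for age. The p-value of about 0.086 exceeds Pe = 0.05, so age is not entered.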

Forward selection procedure

. sw regress sysbp chol age tri, pe(.05)
                      begin with empty model
p = 0.0000 <  0.0500  adding   chol
p = 0.0061 <  0.0500  adding   tri

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  2,    17) =   45.42
   Model |  7694.02578     2  3847.01289               Prob > F      =  0.0000
Residual |  1439.77422    17  84.6926011               R-squared     =  0.8424
---------+------------------------------               Adj R-squared =  0.8238
   Total |     9133.80    19  480.726316               Root MSE      =  9.2029

------------------------------------------------------------------------------
   sysbp |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    chol |   .1875776   .0513543      3.653   0.002       .0792295    .2959258
     tri |    .238911   .0763522      3.129   0.006       .0778219          .4
   _cons |   40.00673   11.42093      3.503   0.003       15.91068    64.10278
------------------------------------------------------------------------------

. stepwise, pe(.05) : regress sysbp chol age tri
                      begin with empty model
p = 0.0000 <  0.0500  adding   chol
p = 0.0061 <  0.0500  adding   tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------

Stepwise regression procedure

. sw regress sysbp chol age tri, pr(0.1) pe(.05) forward
                      begin with empty model
p = 0.0000 <  0.0500  adding   chol
p = 0.0061 <  0.0500  adding   tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   45.42
       Model |  7694.02578     2  3847.01289           Prob > F      =  0.0000
    Residual |  1439.77422    17  84.6926011           R-squared     =  0.8424
-------------+------------------------------           Adj R-squared =  0.8238
       Total |      9133.8    19  480.726316           Root MSE      =  9.2029

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------

. stepwise, pr(.10) pe(.05) forward: regress sysbp chol age tri
                      begin with empty model
p = 0.0000 <  0.0500  adding   chol
p = 0.0061 <  0.0500  adding   tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------


Stepwise regression procedure

A method that selects independent variables by combining backward elimination and forward selection.

In this document, forward selection is applied before backward elimination.

Step 1: (a) Run a simple regression of the dependent variable on each independent variable, one at a time.

Select the independent variable with the strongest linear relationship with the dependent variable, judged by the larger t or the smaller p-value, and

p-value < the specified significance level (Pe): enter the independent variable

p-value > Pe: stop the analysis

Run simple regressions one variable at a time; select first the independent variable most strongly related to the dependent variable (smaller p-value), provided p-value < Pe.

. regress sysbp chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .3075584   .0416769     7.38   0.000     .2199986    .3951183
       _cons |   39.95941   13.93348     2.87   0.010     10.68625    69.23256
------------------------------------------------------------------------------

. regress sysbp age
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.695629   .4223477     4.01   0.001     .8083094    2.582949
       _cons |   53.60554   22.09811     2.43   0.026     7.179137     100.032
------------------------------------------------------------------------------

. regress sysbp tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         tri |   .4471421   .0659423     6.78   0.000     .3086025    .5856817
       _cons |   67.34391    11.2005     6.01   0.000     43.81254    90.87528
------------------------------------------------------------------------------

The p-value of chol is < Pe, so chol is the first independent variable entered into the model.

. regress sysbp tri chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------

Step 2: Consider the second variable to enter the model

(a) Fit a model of the dependent variable on the independent variable entered in step 1 plus each remaining independent variable, one at a time, and choose the one with the smaller p-value (from the t-test). If that p-value > Pe, stop entering.

. regress sysbp age chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7236989   .3145605     2.30   0.034     .0600341    1.387364
        chol |   .2491839   .0452355     5.51   0.000     .1537453    .3446225
       _cons |   21.81301   14.79754     1.47   0.159    -9.407062    53.03308
------------------------------------------------------------------------------

In the example data, tri (triglyceride) has the largest t and the smaller p-value, and p-value < Pe, so tri is entered into the multiple regression model.

Step 3: Consider removing variables from the model. Fit the multiple regression and examine the t and p-value of each independent variable in the equation, comparing each p-value (from t) with Pr:

p-value > Probability to Remove (Pr): remove that variable from the equation

p-value < Probability to Remove (Pr): keep the independent variable in the equation

The p-values of cholesterol and triglyceride are < Pr (the specified significance level, Pr = 0.20), so both independent variables stay in the equation.

. regress sysbp chol tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------

Step 4: Repeat steps 2-3 with the remaining variables, here age

Consider entry (step 2)

(a) Examine the p-value (from t) of the remaining independent variable in the multiple regression equation

(b) If p-value > Probability to Enter (Pe), stop entering

(c) If the independent variable can be entered, repeat steps 2-3 until every variable has been considered. When no independent variable has p-value < Pe, stop entering.

The variable age has p-value = 0.086 > Pe (0.05): stop entering.

. regress sysbp age chol tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007311    54.31117
------------------------------------------------------------------------------

Model selected by stepwise regression

Predicted sysbp:

$$ \hat{y}_i = 40.01 + 0.19(x_1) + 0.24(x_2) $$

where x_1 is chol and x_2 is tri.

. regress sysbp chol tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   45.42
       Model |  7694.02578     2  3847.01289           Prob > F      =  0.0000
    Residual |  1439.77422    17  84.6926011           R-squared     =  0.8424
-------------+------------------------------           Adj R-squared =  0.8238
       Total |     9133.80    19  480.726316           Root MSE      =  9.2029

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10278
------------------------------------------------------------------------------
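Once the model is chosen, a predicted value comes from plugging covariate values into the fitted equation. The sketch below does this in Stata with the unrounded coefficients from the table. The covariate values chol0 = 200 and tri0 = 150 are hypothetical, chosen only for illustration, and do not come from the slides.

```stata
* Predicted sysbp from the selected model: yhat = b0 + b1*chol + b2*tri
* chol0 and tri0 are hypothetical covariate values
scalar b0    = 40.00673
scalar b1    = .1875776
scalar b2    = .238911
scalar chol0 = 200
scalar tri0  = 150
scalar yhat  = b0 + b1*chol0 + b2*tri0
display "predicted sysbp = " %7.2f yhat
```

With these inputs the prediction is about 113.4. If the estimation results are still in memory, margins or predict would give the same point prediction together with a standard error.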


Forcing an independent variable to stay in the model during selection

lockterm1 = keep the first term

*Dummy variables should be treated as a group of variables, using parentheses, e.g., (age i.occ)

. xi: stepwise, forward lockterm1 pr(.10) pe(.05): regress sysbp (age i.occ) chol tri

. stepwise, forward lockterm1 pr(.10) pe(.05): regress sysbp (age) chol tri
                      begin with term 1 model
p = 0.0000 <  0.0500  adding   chol
p = 0.0161 <  0.0500  adding   tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |      9133.8    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007311    54.31117
------------------------------------------------------------------------------

Reporting a regression table (Publication Manual of the American Psychological Association, 6th ed., 2010, p. 145)

Stepwise logistic regression: choosing the p-values

Hosmer & Lemeshow (2000) recommend setting

p-value for entry (Pe) at 0.15-0.25, and p-value for removal (Pr) > Pe

Setting the p-value for entry too low or too high:

Using a more traditional level (0.05) can fail to identify variables known to be important.

A higher level has the disadvantage of including variables that are of questionable importance at the model-building stage.

(Original: Mickey & Greenland, 1989: p125-137;

Reference: Hosmer & Lemeshow (2000): p95)

Assumptions of multiple regression analysis

Assessed from the residuals ($e_i$, or $y_i - \hat{y}_i$):

The model is linear (the regression function is linear)

The residuals (e_i) are normally distributed

The residuals (e_i) have constant variance (homoscedasticity)

The residuals (e_i) are uncorrelated with one another (no autocorrelation, no serial correlation)

Assessed from the independent variables:

No multicollinearity

Aptness of the model

The model is linear (the regression function is linear)

Check: plot the residuals (e_i) against the fitted values.

The model is linear when the plotted points scatter around the horizontal line at residual (e_i) = 0, as in figure a.

The model is not linear when the plotted points rise and fall systematically, as in figure b.

[Figure a and figure b: residuals e_i plotted against fitted values ŷ_i]


. regress sysbp chol tri

. rvfplot, yline(0)

[Plot: residuals e_i versus fitted values ŷ_i]

The residuals (e_i) are normally distributed

Normal probability plot, box-and-whisker plot, stem-and-leaf plot, etc.

Shapiro-Wilk test

. quietly regress sysbp chol age tri
. predict e, residual
. swilk e

                   Shapiro-Wilk W test for normal data

    Variable |    Obs       W           V         z       Prob>z
-------------+--------------------------------------------------
           e |     20    0.95467      1.073     0.142    0.44361

. pnorm e

The residuals (e_i) have constant variance (homoscedasticity)

Plot the residuals (e_i) against the fitted values ŷ_i

Test: Cook-Weisberg test for heteroscedasticity

Stata
estat hettest     tests for heteroskedasticity
estat imtest      information matrix test
estat ovtest      Ramsey regression specification-error test for omitted variables
estat szroeter    Szroeter's rank test for heteroskedasticity
rvfplot           residual-versus-fitted plot

. rvfplot, yline(0)

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of sysbp

         chi2(1)      =     1.38
         Prob > chi2  =   0.2409

. estat imtest

Cameron & Trivedi's decomposition of IM-test

---------------------------------------------------
              Source |       chi2     df      p
---------------------+-----------------------------
  Heteroskedasticity |       3.28      5    0.6563
            Skewness |       2.14      2    0.3438
            Kurtosis |       1.21      1    0.2720
---------------------+-----------------------------
               Total |       6.63      8    0.5775
---------------------------------------------------

. estat szroeter, rhs mtest(holm)

Szroeter's test for homoskedasticity

         Ho: variance constant
         Ha: variance monotonic in variable

---------------------------------------
    Variable |      chi2   df      p
-------------+-------------------------
        chol |      2.37    1   0.2481 #
         tri |      0.67    1   0.4134 #
---------------------------------------
  # Holm-adjusted p-values

. hettest

Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         chi2(1)      =     7.44
         Prob > chi2  =   0.0064

. rvfplot, border yline(0)

. hettest

Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         chi2(1)      =     0.00
         Prob > chi2  =   1.0000

. rvfplot, border yline(0)


The residuals (e_i) are uncorrelated with one another (no autocorrelation, no serial correlation)
**Applies only to time-series data

Correlation between successive observations of the same variable: the error of observation i versus the error of observation i-1

Computing the Durbin-Watson statistic (d)
d ranges from 0 to 4
d < 2 indicates positive autocorrelation
d > 2 indicates negative autocorrelation
As a rough rule of thumb, Durbin-Watson values of 1.5 to 2.5 are relatively normal.

$$ d = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^2}{\sum_{i=1}^{n} e_i^2} $$

 id   age   time   expose    lt
  1    42     15        1    54
  2    46     14        2   7.3
  3    43      8        4     3
  4    25      3        3     2
  5    26     13        4   5.4
  6    55     12        4     5
  7    23     10        4   3.7
  8    24     11        4     5
  9    38      7        3   2.8
 10    24      4        4   2.2
 11    28      6        4   2.5
 12    38      9        4   3.1
 13    26      5        4   2.5
 14    28      1        4    .8
 15    26      2        2   1.2

Example: a study of beryllium exposure among coal miners, relating age and exposure to a higher rate of blastogenic lymphocyte transformation (lt ratio)

$$ H_0: \rho = 0; \quad H_A: \rho \neq 0 $$

. tsset time
        time variable:  time, 1 to 15
                delta:  1 unit

. qui regress lt age expose

. estat dwatson

Durbin-Watson d-statistic(  3,    15) =  1.98835

. estat durbinalt

Durbin's alternative test for autocorrelation
---------------------------------------------------------------------------
    lags(p)  |          chi2               df                 Prob > chi2
-------------+-------------------------------------------------------------
       1     |          1.843               1                   0.1746
---------------------------------------------------------------------------
                        H0: no serial correlation
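The Durbin-Watson statistic can also be built step by step from the residuals, which makes the formula for d concrete. The sketch below does this in Stata on the beryllium listing and then calls estat dwatson for comparison. The variable names num and den and the scalar names are illustrative. The data are entered exactly as printed in the listing; the first lt value (54) may be a misprint, so the result need not reproduce the slide's d = 1.98835.

```stata
* Durbin-Watson d by hand: sum of squared successive differences of the
* residuals divided by the sum of squared residuals.
* Data are copied as printed in the listing above.
clear
input id age time expose lt
 1 42 15 1 54
 2 46 14 2 7.3
 3 43  8 4 3
 4 25  3 3 2
 5 26 13 4 5.4
 6 55 12 4 5
 7 23 10 4 3.7
 8 24 11 4 5
 9 38  7 3 2.8
10 24  4 4 2.2
11 28  6 4 2.5
12 38  9 4 3.1
13 26  5 4 2.5
14 28  1 4 .8
15 26  2 2 1.2
end
tsset time
quietly regress lt age expose
predict double e, residuals
* L.e is the residual of the previous time point (missing for the first one)
generate double num = (e - L.e)^2
generate double den = e^2
quietly summarize num
scalar s_num = r(sum)
quietly summarize den
scalar s_den = r(sum)
scalar d_hand = s_num/s_den
display "Durbin-Watson d (by hand) = " %7.5f d_hand
* Built-in version for comparison; the two values should agree
estat dwatson
```

The residuals must be ordered by the time variable before differencing, which tsset takes care of here.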

Collinearity* (Thai term: ภาวะร่วมเส้นตรง)

The independent variables are highly correlated with one another

(r2 > 0.90; r > 0.95; Kleinbaum, Muller & Nizam, 1998, p. 241)

Also called "multicollinearity" (ภาวะร่วมเส้นตรงพหุ)

Removing or adding a variable in the model changes the coefficients in magnitude and/or sign.

R2 is high, yet the statistical tests of the coefficients are not significant.

It inflates the standard errors, which lowers test statistics such as t and z and widens the confidence intervals of the coefficients.

*Term from the Dictionary of Mathematical Terms, Royal Institute of Thailand edition, B.E. 2552 (2009)

āļāļēāļĢāļ•āļĢāļ§āļˆāļŠāļ­āļš Collinearity āļŦāļĢāļ­ Multicollinearity

Pearson Correlation (informal method)

āļ•āļĢāļ§āļˆāļŠāļ­āļšāļ„āļ§āļēāļĄāļŠāļĄāļžāļ™āļ˜āļ—āļāļ•āļ§āđāļ›āļĢ āđ‚āļ”āļĒāđƒāļŠāļŠāļ–āļ• Pearson correlation

āļžāļˆāļēāļĢāļ“āļēāļ•āļ§āđāļ›āļĢāļ—āļĄāļ„āļ§āļēāļĄāļŠāļĄāļžāļ™āļ˜āļāļšāļ•āļ§āđāļ›āļĢāļ­āļ™āđ† āļŠāļ‡

. corr chol age age tri
(obs=20)

             |     chol      age      age      tri
-------------+------------------------------------
        chol |   1.0000
         age |   0.5609   1.0000
         age |   0.5609   1.0000   1.0000
         tri |   0.7467   0.5732   0.5732   1.0000
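The informal screening step can be sketched in a few lines of Python: compute Pearson's r for every pair of predictors and list the pairs above a cutoff. The cutoff of 0.95 is the r threshold quoted earlier from Kleinbaum et al.; the data and the names `pearson` and `high_pairs` are hypothetical and exist only for this illustration.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def high_pairs(data, cutoff=0.95):
    """Return (name_a, name_b, r) for every pair with |r| above the cutoff."""
    flagged = []
    for a, b in combinations(data, 2):
        r = pearson(data[a], data[b])
        if abs(r) > cutoff:
            flagged.append((a, b, round(r, 4)))
    return flagged


# Hypothetical predictors (not the chol/age/tri data from the slides)
data = {
    "x1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "x2": [2.1, 3.9, 6.2, 8.0, 9.8],  # roughly 2 * x1
    "x3": [5.0, 1.0, 4.0, 2.0, 3.0],
}
print(high_pairs(data))
```

In the correlation matrix above, no pair of predictors comes close to 0.95 (the largest is 0.5732 between age and tri), so this informal check would flag nothing there.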

Detecting multicollinearity with variance inflation factors (VIF)*

VIF > 10 indicates multicollinearity.

The mean VIF provides information about the severity of the multicollinearity: a mean VIF considerably larger than 1 is indicative of serious multicollinearity problems.

(*Neter, Wasserman & Kutner, 1987; Marquardt, 1970; Belsley, Kuh & Welsch, 1980)

Tolerance (O'Brien, 2007)

Tolerance < 0.20 (VIF > 5) or tolerance ≤ 0.10 (VIF ≥ 10)

Stata

collin [varlist…]

estat vif: variance inflation factors for the independent variables

$$\mathrm{tolerance}_i = 1 - R_i^2$$


Variance Inflation Factors (VIF: formal method)

A VIF > 10, together with a mean VIF considerably greater than 1, indicates a multicollinearity problem.

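$$\mathrm{VIF}_i = \frac{1}{1 - R_i^2} = (1 - R_i^2)^{-1}$$

$$\overline{\mathrm{VIF}} = \frac{\sum_{k=1}^{p-1} \mathrm{VIF}_k}{p - 1}$$

Here $R_i^2$ is the R² obtained by regressing the i-th independent variable on the remaining independent variables, and $p - 1$ is the number of independent variables.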

. collin age chol tri

Collinearity Diagnostics

                        SQRT                   R-
  Variable      VIF     VIF    Tolerance    Squared
----------------------------------------------------
       age      1.58    1.26    0.6315      0.3685
      chol      2.40    1.55    0.4162      0.5838
       tri      2.45    1.57    0.4077      0.5923
----------------------------------------------------
  Mean VIF      2.15

                           Cond
        Eigenval          Index
---------------------------------
    1     3.9477          1.0000
    2     0.0303         11.4206
    3     0.0126         17.7302
    4     0.0094         20.4609
---------------------------------
 Condition Number        20.4609
 Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
 Det(correlation matrix)    0.2794

. quietly regress sysbp chol age tri

. vif

    Variable |       VIF       1/VIF
-------------+----------------------
         tri |      2.45    0.407722
        chol |      2.40    0.416193
         age |      1.58    0.631508
-------------+----------------------
    Mean VIF |      2.15

or

. estat vif

    Variable |       VIF       1/VIF
-------------+----------------------
         tri |      2.45    0.407722
        chol |      2.40    0.416193
         age |      1.58    0.631508
-------------+----------------------
    Mean VIF |      2.15

. di (2.45+2.40+1.58)/3
2.1433333
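As a cross-check outside Stata, the VIF, its square root, the tolerance and the mean VIF can all be recomputed from the R-Squared column of the `collin` output, using VIF = 1 / (1 - R²) and tolerance = 1 - R². This is a minimal Python sketch; the function names are assumptions of the sketch, and only the three R² values come from the output above.

```python
from math import sqrt

# R-squared of each predictor regressed on the others (from the collin output)
r_squared = {"age": 0.3685, "chol": 0.5838, "tri": 0.5923}


def tolerance(r2):
    """Tolerance is the share of a predictor's variance not explained by the others."""
    return 1.0 - r2


def vif(r2):
    """Variance inflation factor, the reciprocal of the tolerance."""
    return 1.0 / (1.0 - r2)


for name, r2 in r_squared.items():
    print(
        f"{name:5s} VIF={vif(r2):.2f} "
        f"sqrtVIF={sqrt(vif(r2)):.2f} tol={tolerance(r2):.4f}"
    )

# Mean VIF over the p - 1 = 3 independent variables
mean_vif = sum(vif(r2) for r2 in r_squared.values()) / len(r_squared)
print(f"Mean VIF = {mean_vif:.2f}")
```

The mean computed from unrounded VIFs rounds to 2.15, matching the Stata tables; the hand calculation with `di` uses the VIFs already rounded to two decimals, which is why it shows 2.1433333.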

Relationship between VIF and the correlation r

    r      r2     vif
   .1    0.01    1.01
   .2    0.04    1.04
   .3    0.09    1.10
   .4    0.16    1.19
   .5    0.25    1.33
   .6    0.36    1.56
   .7    0.49    1.96
   .8    0.64    2.78
   .9    0.81    5.26
  .91    0.83    5.82
  .92    0.85    6.51
  .93    0.86    7.40
  .94    0.88    8.59
  .95    0.90   10.26
  .96    0.92   12.76
  .97    0.94   16.92
  .98    0.96   25.25
  .99    0.98   50.25
    1    1.00       .

The VIF first exceeds 10 at r = .95.
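One row of the table can be worked by hand. The calculation below assumes a model with only two independent variables, so that each $R_i^2$ is simply the squared Pearson correlation $r^2$; the slide does not state this assumption, but it is the case in which the r2 and vif columns line up as shown.

```latex
\[
r = 0.95 \;\Rightarrow\; r^2 = 0.9025 \;\Rightarrow\;
\mathrm{VIF} = \frac{1}{1 - 0.9025} = \frac{1}{0.0975} \approx 10.26
\]
\[
\mathrm{VIF} > 10 \iff r^2 > 0.90 \iff |r| > \sqrt{0.90} \approx 0.949
\]
```

Under that assumption the VIF > 10 rule and the r² > 0.90 (r > 0.95) rule quoted earlier describe the same boundary.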

Condition Index & Variance Decomposition Proportion

The condition index (CI) and the variance decomposition proportion (VDP) are computed from the eigenvalues obtained by analyzing the correlation matrix of the independent variables. The condition index for the k-th eigenvalue is

$$\mathrm{CI}_k = \sqrt{\frac{\lambda_{\max}}{\lambda_k}}$$

and the largest value, $\sqrt{\lambda_{\max} / \lambda_{\min}}$, is the condition number.

A condition index of 10 to 30 indicates that collinearity is present, a condition index > 30 indicates a collinearity problem, and a condition index > 100 indicates very severe collinearity (Belsley, 1991a).

If the condition index is between 10 and 30, there is moderate to strong multicollinearity, and if it exceeds 30 there is severe multicollinearity (Gujarati, 2002).
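The condition indexes can be recomputed from the eigenvalues printed by `collin` earlier. The sketch below is a rough Python check, assuming the four-decimal eigenvalues from that output; because they are rounded, the results agree with Stata's 11.4206, 17.7302 and 20.4609 only approximately. The labels follow the Belsley (1991a) cutoffs quoted above, and the function names are hypothetical.

```python
from math import sqrt

# Eigenvalues as printed (rounded to 4 decimals) in the collin output
eigenvalues = [3.9477, 0.0303, 0.0126, 0.0094]


def condition_indexes(eigs):
    """Condition index of each dimension: sqrt(largest eigenvalue / eigenvalue)."""
    largest = max(eigs)
    return [sqrt(largest / e) for e in eigs]


def belsley_label(ci):
    """Verbal reading of a condition index using the cutoffs quoted in the text."""
    if ci > 100:
        return "very severe collinearity"
    if ci > 30:
        return "collinearity problem"
    if ci >= 10:
        return "collinearity present"
    return "no indication"


indexes = condition_indexes(eigenvalues)
for lam, ci in zip(eigenvalues, indexes):
    print(f"{lam:.4f}  {ci:6.2f}  {belsley_label(ci)}")

# The condition number is the largest condition index
condition_number = max(indexes)
print(f"Condition number = {condition_number:.2f}")
```

With a condition number of about 20, this example falls in the 10 to 30 band rather than above 30.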

Condition Index & Variance Decomposition Proportion (continued)

The variance decomposition proportion was proposed by Belsley et al. (1980) and Belsley (1991a).

Look for VDP values greater than 0.5.

The method computes the proportion of variance of each variable that is associated with each principal component, which amounts to a decomposition of the coefficient variance for each dimension:

$$p_{jk} = \frac{V_{jk}^2 / \lambda_k}{\mathrm{VIF}_j}$$

(Fox, 1984)

Here $V_{jk}$ is the j-th element of the k-th eigenvector and $\lambda_k$ is the k-th eigenvalue.

. coldiag2 tri chol age, force w(5)

Condition number using scaled variables = 20.46

Condition Indexes and Variance-Decomposition Proportions

     condition
         index   _cons    tri   chol    age
  1       1.00    0.00   0.00   0.00   0.00
  2      11.42    0.32   0.38   0.01   0.03
  3      17.73    0.32   0.00   0.14   0.95
  4      20.46    0.36   0.61   0.85   0.02

. prnt_cx, force w(5)

Condition Indexes and Variance-Decomposition Proportions

     condition
         index   _cons    tri   chol    age
  1       1.00     .      .      .      .
  2      11.42    0.32   0.38    .      .
  3      17.73    0.32    .      .     0.95
  4      20.46    0.36   0.61   0.85    .

Variance-Decomposition Proportions less than .3 have been printed as "."
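Reading the variance-decomposition table amounts to scanning each high-condition-index row for proportions above 0.5. The Python sketch below does that for the `coldiag2` figures; the cutoffs (condition index of at least 10, VDP above 0.5) are the ones quoted in these slides, while the function name `flag_dimensions` and the data layout are assumptions of this sketch. Belsley's rule is usually stated as requiring two or more coefficients with a high proportion on the same dimension, which is worth keeping in mind when reading the result.

```python
# (condition index, variance-decomposition proportions) from the coldiag2 output
rows = [
    (1.00,  {"_cons": 0.00, "tri": 0.00, "chol": 0.00, "age": 0.00}),
    (11.42, {"_cons": 0.32, "tri": 0.38, "chol": 0.01, "age": 0.03}),
    (17.73, {"_cons": 0.32, "tri": 0.00, "chol": 0.14, "age": 0.95}),
    (20.46, {"_cons": 0.36, "tri": 0.61, "chol": 0.85, "age": 0.02}),
]


def flag_dimensions(rows, ci_cutoff=10.0, vdp_cutoff=0.5):
    """List the terms with VDP above the cutoff on each high-index dimension."""
    flagged = []
    for ci, props in rows:
        high = [name for name, p in props.items() if p > vdp_cutoff]
        if ci >= ci_cutoff and high:
            flagged.append((ci, high))
    return flagged


print(flag_dimensions(rows))
# [(17.73, ['age']), (20.46, ['tri', 'chol'])]

# Sanity check: each column of proportions should sum to about 1
for name in rows[0][1]:
    total = sum(props[name] for _, props in rows)
    print(name, round(total, 2))
```

Only the last dimension (condition index 20.46) has two terms above 0.5, tri and chol, so those are the two predictors sharing that near-dependency.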

