Multiple Regression Analysis
Asst. Prof. Nikom Thanomsieng
Department of Epidemiology and Biostatistics
Faculty of Public Health, Khon Kaen University
Email: [email protected] Web: https://home.kku.ac.th/nikom
Multiple regression analysis
Analyzes the relationship between one dependent variable and two or more independent variables.
Independent variables are also called explanatory variables.
The dependent variable is also called the response variable or the outcome variable.
Multiple regression: $y_i = \beta_0 + \beta_1 x_{1i} + \beta_2 x_{2i} + \dots + \beta_p x_{pi} + e_i$
Simple regression: $y_i = a + b x_i + e_i$ or $y_i = \beta_0 + \beta_1 x_i + e_i$
Objectives of multiple regression analysis
Measure the linear relationship between the independent variables, which may be
(a) continuous variables or (b) categorical variables (coded as dummy variables),
and the dependent variable, which must be continuous.
Prediction.
[Diagram: CHOL, TRI, AGE, ... -> Systolic BP]
idno sysbp chol age tri idno sysbp chol age tri
1 155 375 66 230 11 132 304 40 140
2 136 290 49 161 12 164 428 51 175
3 133 267 47 187 13 136 282 56 159
4 166 340 55 178 14 73 165 36 44
5 111 282 42 112 15 153 395 51 181
6 150 352 71 125 16 135 324 54 164
7 131 285 39 149 17 149 426 51 205
8 167 383 59 208 18 149 337 57 189
9 166 363 60 208 19 142 347 45 152
10 126 283 48 138 20 148 349 55 194
Example: a study of the relationship of age, cholesterol level and triglyceride level with systolic blood pressure.
Data: the vector of the variable sysbp and the matrix of independent variables (chol, age, tri)
$y = \begin{bmatrix} 155 \\ 136 \\ \vdots \\ 148 \end{bmatrix}, \quad X = \begin{bmatrix} 1 & 375 & 66 & 230 \\ 1 & 290 & 49 & 161 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & 349 & 55 & 194 \end{bmatrix}$
Computing the coefficients of the multiple regression
The least squares method is used, with a matrix approach.
$\hat{y}_i = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \beta_3 x_3$
$b_{p \times 1} = (X'X)^{-1}_{p \times p} \, (X'y)_{p \times 1}$
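The matrix approach can be sketched in a few lines of plain Python. This is a minimal illustration, not the routine Stata uses: it builds X'X and X'y from the 20 example rows and solves the normal equations by Gaussian elimination instead of forming the inverse explicitly. The helper name `solve` is hypothetical. The resulting coefficients should agree with the `regress sysbp chol age tri` output to the printed precision.

```python
# Data from the example table: (sysbp, chol, age, tri) for the 20 subjects.
rows = [
    (155, 375, 66, 230), (136, 290, 49, 161), (133, 267, 47, 187),
    (166, 340, 55, 178), (111, 282, 42, 112), (150, 352, 71, 125),
    (131, 285, 39, 149), (167, 383, 59, 208), (166, 363, 60, 208),
    (126, 283, 48, 138), (132, 304, 40, 140), (164, 428, 51, 175),
    (136, 282, 56, 159), (73, 165, 36, 44), (153, 395, 51, 181),
    (135, 324, 54, 164), (149, 426, 51, 205), (149, 337, 57, 189),
    (142, 347, 45, 152), (148, 349, 55, 194),
]

y = [float(r[0]) for r in rows]
# Leading 1.0 in each row is the column for the constant term.
X = [[1.0, float(r[1]), float(r[2]), float(r[3])] for r in rows]


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


p = len(X[0])
xtx = [[sum(row[i] * row[j] for row in X) for j in range(p)] for i in range(p)]  # X'X
xty = [sum(row[i] * yi for row, yi in zip(X, y)) for i in range(p)]              # X'y
b = solve(xtx, xty)  # same solution as (X'X)^{-1} X'y

for name, coef in zip(["_cons", "chol", "age", "tri"], b):
    print(f"{name:6s} {coef:.4f}")
```

Solving the linear system rather than inverting X'X is a design choice made here for numerical stability; the estimates are the same.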
Multiple regression analysis with Stata

. regress sysbp chol age tri

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  3,    16) =   35.56
   Model |  7942.70165     3  2647.56722               Prob > F      =  0.0000
Residual |  1191.09835    16  74.4436471               R-squared     =  0.8696
---------+------------------------------               Adj R-squared =  0.8451
   Total |     9133.80    19  480.726316               Root MSE      =  8.6281

------------------------------------------------------------------------------
   sysbp |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    chol |   .1654515   .0496455     3.333   0.004       .0602077    .2706953
     age |   .5122311   .2802612     1.828   0.086      -.0818961    1.106358
     tri |   .2006968   .0745745     2.691   0.016        .042606    .3587876
   _cons |   27.15522   12.80998     2.120   0.050      -.0007308    54.31117
------------------------------------------------------------------------------
(A) Linear relationship between the dependent variable and the set of independent variables
Aim: to conclude whether at least one independent variable is linearly related to Y.
Hypothesis test
The overall summary uses the analysis of variance (ANOVA) table for the regression, from which the F-test is computed.
$H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$, or $H_0: \beta_i = 0$ for every $i$
$H_A: \beta_i \neq 0$ for at least one $i$
$F = MSR / MSE$
ANOVA table for the regression analysis
$\hat{Y}_i = 27.16 + 0.17(\text{chol}) + 0.51(\text{age}) + 0.20(\text{tri})$
F-test

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  3,    16) =   35.56
   Model |  7942.70165     3  2647.56722               Prob > F      =  0.0000
Residual |  1191.09835    16  74.4436471               R-squared     =  0.8696
---------+------------------------------               Adj R-squared =  0.8451
   Total |     9133.80    19  480.726316               Root MSE      =  8.6281
Hypotheses for testing the significance of the overall regression equation
H0: the k independent variables cannot explain the variation in Y, or $H_0: \beta_1 = \beta_2 = \dots = \beta_k = 0$
HA: the k independent variables can explain the variation in Y, or $H_A: \beta_i \neq 0$ for at least one $i$
The test uses the F statistic:
$F = \dfrac{MSR}{MSE} = \dfrac{\text{mean square regression (or model)}}{\text{mean square residual (or error)}}$
H0 ($\beta_i = 0$ for every $i$): no independent variable is linearly related to the dependent variable.
HA ($\beta_i \neq 0$ for at least one $i$): at least one independent variable is linearly related to the dependent variable.
Alternatively, F can be computed from $R^2$:
$F = \dfrac{R^2 / k}{(1 - R^2) / (n - k - 1)}$, with $R^2 = \dfrac{SSR}{SSY} = \dfrac{SSY - SSE}{SSY}$
where
n = sample size
k = number of independent variables
R2 = coefficient of determination
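As a quick check of the two routes to F, the sketch below plugs the sums of squares from the ANOVA table of `regress sysbp chol age tri` into both formulas. The variable names are illustrative only.

```python
# Values copied from the ANOVA table of `regress sysbp chol age tri`.
ssr = 7942.70165   # model (regression) sum of squares
sse = 1191.09835   # residual (error) sum of squares
n, k = 20, 3       # sample size, number of independent variables

ssy = ssr + sse                 # total sum of squares
msr = ssr / k                   # mean square regression
mse = sse / (n - k - 1)         # mean square error
f_from_ms = msr / mse           # F = MSR / MSE

r2 = ssr / ssy                  # coefficient of determination
f_from_r2 = (r2 / k) / ((1 - r2) / (n - k - 1))

print(round(r2, 4), round(f_from_ms, 2), round(f_from_r2, 2))
```

Both routes give the same F because the second formula is the first one with numerator and denominator divided by SSY.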
The analysis shows that at least one independent variable is linearly related to the dependent variable, with statistical significance (F-test = 35.56; p < .0001) [the independent variables are chol and tri].
Hypotheses for testing the significance of the overall regression equation
H0: the k independent variables cannot explain the variation in Y ($H_0: \beta_i = 0$ for every $i$)
HA: the k independent variables can explain the variation in Y ($H_A: \beta_i \neq 0$ for at least one $i$)
The test uses the F statistic.

. xi: reg sysbp i.occ i.gender
i.occ             _Iocc_1-3           (naturally coded; _Iocc_1 omitted)
i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)

      Source |       SS       df       MS              Number of obs =      20
-------------+----------------------------------       F(3, 16)      =    0.36
       Model |  570.586942     3  190.195647           Prob > F      =  0.7859
    Residual |  8563.21306    16  535.200816           R-squared     =  0.0625
-------------+----------------------------------       Adj R-squared = -0.1133
       Total |      9133.8    19  480.726316           Root MSE      =  23.134

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     _Iocc_2 |   4.137457   14.54323     0.28   0.780    -26.69281    34.96772
     _Iocc_3 |   12.50172   12.72194     0.98   0.340    -14.46758    39.47102
  _Igender_2 |  -.6082474   11.50742    -0.05   0.959     -25.0029     23.7864
       _cons |   135.1014   9.637349    14.02   0.000     114.6711    155.5316
------------------------------------------------------------------------------
Hypotheses for testing the significance of the overall regression equation
H0: the k independent variables cannot explain the variation in Y ($H_0: \beta_i = 0$ for every $i$)
HA: the k independent variables can explain the variation in Y ($H_A: \beta_i \neq 0$ for at least one $i$)
The test uses the F statistic.

. xi: reg sysbp i.occ i.gender chol
i.occ             _Iocc_1-3           (naturally coded; _Iocc_1 omitted)
i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)

      Source |       SS       df       MS              Number of obs =      20
-------------+----------------------------------       F(4, 15)      =   13.43
       Model |  7139.70141     4  1784.92535           Prob > F      =  0.0001
    Residual |  1994.09859    15  132.939906           R-squared     =  0.7817
-------------+----------------------------------       Adj R-squared =  0.7235
       Total |      9133.8    19  480.726316           Root MSE      =   11.53

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
     _Iocc_2 |   .9696195   7.262195     0.13   0.896    -14.50938    16.44862
     _Iocc_3 |  -2.080515   6.671207    -0.31   0.759    -16.29986    12.13883
  _Igender_2 |  -7.837831    5.82667    -1.35   0.199    -20.25708    4.581421
        chol |   .3255662   .0463141     7.03   0.000       .22685    .4242825
       _cons |     37.714   14.66305     2.57   0.021     6.460443    68.96757
------------------------------------------------------------------------------
(B) Linear relationship of each independent variable with the dependent variable
Hypothesis test: $H_0: \beta_i = 0$; $H_A: \beta_i \neq 0$
$t = \dfrac{\hat{\beta}_i}{S_{\hat{\beta}_i}}$
where $\hat{\beta}_i$ is the estimated coefficient and $S_{\hat{\beta}_i}$ is its standard error.
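The t statistics and confidence intervals in the Stata output can be reproduced by hand from the coefficients and standard errors. The sketch below assumes the tabulated critical value t(0.975, df = 16) = 2.1199, where 16 = n - k - 1 residual degrees of freedom; that constant is typed in rather than computed, because the Python standard library has no t quantile function.

```python
# Coefficients and standard errors copied from `regress sysbp chol age tri`.
estimates = {
    "chol": (0.1654515, 0.0496455),
    "age": (0.5122311, 0.2802612),
    "tri": (0.2006968, 0.0745745),
}
T_CRIT = 2.1199  # assumed table value of t(0.975, df = 16)

results = {}
for name, (b, se) in estimates.items():
    t = b / se                                   # t = coefficient / standard error
    ci = (b - T_CRIT * se, b + T_CRIT * se)      # 95% confidence interval
    results[name] = (t, ci)
    print(f"{name:5s} t = {t:.2f}  95% CI = ({ci[0]:.4f}, {ci[1]:.4f})")
```

An interval that contains 0 (as for age) corresponds to a t-test that is not significant at the 0.05 level.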
. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007309    54.31117
------------------------------------------------------------------------------
Interpretation: read the sign of each coefficient
Consider the relationship between each independent variable and the dependent variable.
chol and tri are positively related to sysbp, and the relationships are statistically significant.
age is not significantly related to sysbp (a relationship is estimated, but it is not statistically significant).
How strongly is each independent variable related to the dependent variable?
Compare coefficients from an equation in which the independent variables have been put on the same scale, by converting each $X_i$ to a standard score (Z-score):
$z = \dfrac{x - \bar{x}}{sd}$; or $\beta_i^{*} = \hat{\beta}_i \, \dfrac{S_{x_i}}{S_y}$, where $S_{x_i}$ and $S_y$ are the standard deviations of $x_i$ and $y$.
. regress sysbp chol age tri, beta

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|                     Beta
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004                 .4663705
         age |   .5122311   .2802612     1.83   0.086                 .2076355
         tri |   .2006968   .0745745     2.69   0.016                 .3805016
       _cons |   27.15522   12.80998     2.12   0.050                        .
------------------------------------------------------------------------------
. di .16545147*(61.802976/21.925472)
.46637049
. di .51223109*(8.8876022/21.925472)
.20763549
. di .20069683*(41.568555/21.925472)
.3805016
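The same standardized coefficients can be derived from the raw data. The sketch below computes the sample standard deviations of the example columns with Python's `statistics.stdev` and rescales the unstandardized coefficients copied from the Stata output; it mirrors the three `di` commands above and is an illustration, not how Stata computes `beta`.

```python
from statistics import stdev

# Columns of the example data (20 subjects).
sysbp = [155, 136, 133, 166, 111, 150, 131, 167, 166, 126,
         132, 164, 136, 73, 153, 135, 149, 149, 142, 148]
chol = [375, 290, 267, 340, 282, 352, 285, 383, 363, 283,
        304, 428, 282, 165, 395, 324, 426, 337, 347, 349]
age = [66, 49, 47, 55, 42, 71, 39, 59, 60, 48,
       40, 51, 56, 36, 51, 54, 51, 57, 45, 55]
tri = [230, 161, 187, 178, 112, 125, 149, 208, 208, 138,
       140, 175, 159, 44, 181, 164, 205, 189, 152, 194]

# Unstandardized coefficients copied from the Stata output.
coef = {"chol": 0.16545147, "age": 0.51223109, "tri": 0.20069683}
x = {"chol": chol, "age": age, "tri": tri}

s_y = stdev(sysbp)  # sample standard deviation (n - 1 denominator)
# Standardized coefficient: b * (sd of x) / (sd of y)
beta = {name: coef[name] * stdev(x[name]) / s_y for name in coef}

for name, value in beta.items():
    print(f"{name:5s} {value:.4f}")
```

Because the standardized coefficients are unit-free, they can be compared directly: chol has the largest one, then tri, then age.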
Converting the values to standard scores (Z-scores): $z_i = \dfrac{x_i - \bar{x}}{sd}$
The constant becomes very small (approximately 0).
$\beta^{*} = \hat{\beta} \, \dfrac{S_x}{S_y}$
. zscore sysbp chol age tri
. regress z_sysbp z_chol z_age z_tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  16.5222943     3  5.50743142           Prob > F      =  0.0000
    Residual |  2.47770574    16  .154856609           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |       19.00    19        1.00           Root MSE      =  .39352

------------------------------------------------------------------------------
     z_sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      z_chol |   .4663705   .1399396     3.33   0.004     .1697118    .7630292
       z_age |   .2076355   .1136053     1.83   0.086     -.033197     .448468
       z_tri |   .3805016   .1413859     2.69   0.016     .0807768    .6802263
       _cons |   3.62e-16   .0879934     0.00   1.000    -.1865376    .1865376
------------------------------------------------------------------------------
Example: computing the coefficient of determination from the example data
$R^2_{y|x_1,x_2,\dots,x_k} = \dfrac{SSR}{SSY} = \dfrac{\sum_{i=1}^{n} (\hat{Y}_i - \bar{Y})^2}{\sum_{i=1}^{n} (Y_i - \bar{Y})^2} = \dfrac{7942.70165}{9133.80} = 0.869594$
The variables chol, age and triglyceride explain 86.96% of the variation (variance) in systolic blood pressure.
Prediction equation and evaluation of the multiple regression equation
Prediction equation: $\hat{y}_i = 27.16 + 0.17(\text{chol}) + 0.51(\text{age}) + 0.20(\text{tri})$
Evaluation of the multiple regression equation: based on the coefficient of determination.

. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281
In model building generally, R2 rises as the number of independent variables grows, so R2 should be adjusted. The adjusted value is called the "Adjusted coefficient of determination":
$R^2_{adj} = 1 - \dfrac{n - 1}{n - p} \left( 1 - \dfrac{SSR}{SSY} \right)$
where p is the number of estimated coefficients including the constant (p = 4 in the example, which gives 0.8451).
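A short numeric check of the adjustment, using the sums of squares from the example output. Here the residual degrees of freedom are written as n - k - 1 with k = 3 independent variables, which is 16 for the example; the variable names are illustrative.

```python
ssr, ssy = 7942.70165, 9133.80  # model and total sums of squares from the output
n, k = 20, 3                    # sample size, number of independent variables

r2 = ssr / ssy
# Penalize R2 for the number of predictors: scale (1 - R2) by (n - 1) / (n - k - 1).
adj_r2 = 1 - (1 - r2) * (n - 1) / (n - k - 1)
print(round(r2, 4), round(adj_r2, 4))
```

The adjusted value is always at most R2, and the gap widens as predictors are added without a matching gain in explained variation.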
Report Regression Table (Publication Manual of the American Psychological Association, sixth edition, 2010; p. 144)
Variable       b      s.e.    Wald (t)  P-value  95% CI         Beta  R2    R2 change
Cholesterol    .17    .05     3.33      0.004    .06 to .27     .47   .75   .75
Age            .51    .28     1.83      0.086    -.08 to 1.11   .21   .81   .06
Triglyceride   .20    .07     2.69      0.016    .04 to .36     .38   .87   .06
Constant       27.16  12.81   2.12      0.050    .00 to 54.31
R2 = .87, Adjusted R2 = .85, F = 35.56, p-value < .0001, n = 20
Recommended report (Lang et al. (1997). How to Report Statistics in Medicine, p. 115)
. do "M:\516701_2555\report_mreg.do"
. use "M:\516701_2555\multiple_reg_data.dta", clear
. regress sysbp chol
...
Residual | 2267.92107 17 133.407122        R-squared = 0.7516
...
. regress sysbp chol age
Residual | 1729.02942 16 108.064339        R-squared = 0.8106
...
. regress sysbp chol age tri
Residual | 1191.02416 15 79.4016106        R-squared = 0.8696
...
Multiple regression analysis when an independent variable is categorical (for example sex or occupation): the variable must be converted to dummy variables.
$\hat{y} = \beta_0 + \beta_1 x_1 + \sum_{j=1}^{k-1} \beta_{lj} D_{lj} + \dots + \beta_p x_p$
An independent variable with k levels yields k-1 dummy variables (k = number of levels or groups).
Variable     Dummy variable
             D1   D2
code = 1     0    0
code = 2     1    0
code = 3     0    1
$\hat{y} = \beta_0 + \beta_1(\text{chol}) + \beta_2(\text{age}) + \beta_3(\text{tri}) + \beta_4(\text{occ}_{comm}) + \beta_5(\text{occ}_{office}) + \beta_6(\text{gender})$
Example: occupation (agriculture, commerce, civil service) is a categorical variable; create k-1 = 3-1 = 2 dummy variables as follows.
Stata command: xi: regress sysbp age tri i.occ i.gender
Occupation           Dummy variable
                     D1   D2
Agriculture = 1      0    0
Commerce = 2         1    0
Civil service = 3    0    1
*** With two groups, for example sex coded 0 and 1, the variable can be analyzed in Stata directly; if it is coded 1 and 2, define it as a dummy variable.
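The coding scheme in the table can be sketched as a small function. `make_dummies` is a hypothetical helper written for this illustration (it is not a Stata or library routine); it omits the first level as the reference category, which is what `xi` reports with "_Iocc_1 omitted". The occupation codes are those of the first ten subjects in the example data.

```python
def make_dummies(codes, levels):
    """Return k-1 indicator columns, omitting the first level as the reference."""
    others = levels[1:]  # levels[0] is the reference category and gets no column
    return {f"D{i}": [1 if c == lvl else 0 for c in codes]
            for i, lvl in enumerate(others, start=1)}


occ = [3, 1, 1, 2, 2, 3, 2, 3, 1, 2]  # occupation codes of subjects 1 to 10
dummies = make_dummies(occ, levels=[1, 2, 3])

for code, d1, d2 in zip(occ, dummies["D1"], dummies["D2"]):
    print(code, d1, d2)
```

With this coding, each dummy coefficient is the difference in mean sysbp between that occupation and the reference group (agriculture), holding the other variables fixed.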
. xi: regress sysbp chol age tri i.occ i.gender
i.occ             _Iocc_1-3           (naturally coded; _Iocc_1 omitted)
i.gender          _Igender_1-2        (naturally coded; _Igender_1 omitted)

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  6,    13) =   16.99
       Model |  8101.00425     6  1350.16737           Prob > F      =  0.0000
    Residual |  1032.79575    13  79.4458272           R-squared     =  0.8869
-------------+------------------------------           Adj R-squared =  0.8347
       Total |      9133.8    19  480.726316           Root MSE      =  8.9132

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1745477   .0564986     3.09   0.009     .0524899    .2966054
         age |    .504353   .3139673     1.61   0.132     -.173932    1.182638
         tri |   .2081322   .0796249     2.61   0.021      .036113    .3801514
     _Iocc_2 |   5.242509    5.77858     0.91   0.381    -7.241355    17.72637
     _Iocc_3 |   -1.13821   5.298263    -0.21   0.833    -12.58441    10.30799
  _Igender_2 |  -4.495496    4.72941    -0.95   0.359    -14.71276    5.721772
       _cons |   24.02471   13.96057     1.72   0.109    -6.135272    54.18469
------------------------------------------------------------------------------
. list

     +-------------------------------------------------------------------------------+
     | idno   sysbp   chol   age   tri   occ   gender   _Iocc_2   _Iocc_3   _Igend~2 |
     |-------------------------------------------------------------------------------|
  1. |    1     155    375    66   230     3        2         0         1          1 |
  2. |    2     136    290    49   161     1        1         0         0          0 |
  3. |    3     133    267    47   187     1        1         0         0          0 |
  4. |    4     166    340    55   178     2        1         1         0          0 |
  5. |    5     111    282    42   112     2        2         1         0          1 |
     |-------------------------------------------------------------------------------|
  6. |    6     150    352    71   125     3        1         0         1          0 |
  7. |    7     131    285    39   149     2        2         1         0          1 |
  8. |    8     167    383    59   208     3        1         0         1          0 |
  9. |    9     166    363    60   208     1        1         0         0          0 |
 10. |   10     126    283    48   138     2        2         1         0          1 |
     |-------------------------------------------------------------------------------|
 11. |   11     132    304    40   140     3        1         0         1          0 |
 12. |   12     164    428    51   175     2        2         1         0          1 |
 13. |   13     136    282    56   159     3        1         0         1          0 |
 14. |   14      73    165    36    44     1        1         0         0          0 |
 15. |   15     153    395    51   181     1        2         0         0          1 |
     |-------------------------------------------------------------------------------|
 16. |   16     135    324    54   164     2        1         1         0          0 |
 17. |   17     149    426    51   205     3        1         0         1          0 |
 18. |   18     149    337    57   189     1        1         0         0          0 |
 19. |   19     142    347    45   152     3        2         0         1          1 |
 20. |   20     148    349    55   194     3        2         0         1          1 |
     +-------------------------------------------------------------------------------+
Selecting independent variables for the equation in multiple regression analysis
1. Forward selection procedure
   Consider entering independent variables one at a time.
2. Backward elimination procedure
   Consider removing independent variables one at a time.
3. The stepwise regression procedure
   Uses both the forward and backward methods.
* Independent variables are judged by their P-value (from the t statistic), after specifying
  (a) Probability to Entry (Pe) -> Forward
  (b) Probability to Remove (Pr) -> Backward
  (c) Pe and Pr -> Stepwise
Backward elimination procedure
Step 1: fit the regression equation containing every independent variable.
SYSBP = 27.16 + 0.17(chol) + 0.51(age) + 0.20(tri)
Step 2: compute the (partial) t statistic and p-value of every independent variable in the model.
Step 3: identify the variable with the largest p-value (from its t value).

. regress sysbp chol age tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |     9133.80    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007309    54.31117
------------------------------------------------------------------------------

Step 4: compare that p-value with the chosen significance level (for example Pr = 0.05). If p-value > Pr, drop the variable from the equation.
Here age is dropped (p-value = .086 > 0.05).
Loop: repeat steps 1 to 4 on the remaining variables until no variable has a p-value above the chosen significance level (for example Pr = 0.05).
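The decision rule of steps 3 and 4 can be sketched as follows. This is only the removal rule, not a full implementation: it does not refit any model. The p-values are typed in from the Stata fits in this section (the full model, and the `regress sysbp chol tri` model obtained after dropping age), and `backward_step` is a hypothetical helper name.

```python
def backward_step(p_values, pr=0.05):
    """Return the variable to remove (largest p-value above pr), or None to stop."""
    worst = max(p_values, key=p_values.get)
    return worst if p_values[worst] > pr else None


# p-values of the t-tests, copied from the Stata output for each model.
full_model = {"chol": 0.004, "age": 0.086, "tri": 0.016}
reduced_model = {"chol": 0.002, "tri": 0.006}

print(backward_step(full_model))     # variable dropped in the first pass
print(backward_step(reduced_model))  # None means the loop stops
```

In practice the model has to be refitted after every removal, because the remaining p-values change once a variable leaves the equation.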
. regress sysbp chol tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   45.42
       Model |  7694.02578     2  3847.01289           Prob > F      =  0.0000
    Residual |  1439.77422    17  84.6926011           R-squared     =  0.8424
-------------+------------------------------           Adj R-squared =  0.8238
       Total |     9133.80    19  480.726316           Root MSE      =  9.2029

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10278
------------------------------------------------------------------------------
Backward elimination procedure in Stata

. sw regress sysbp chol age tri, pr(.05)
                      begin with full model
p = 0.0863 >= 0.0500  removing age

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  2,    17) =   45.42
   Model |  7694.02578     2  3847.01289               Prob > F      =  0.0000
Residual |  1439.77422    17  84.6926011               R-squared     =  0.8424
---------+------------------------------               Adj R-squared =  0.8238
   Total |     9133.80    19  480.726316               Root MSE      =  9.2029

------------------------------------------------------------------------------
   sysbp |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    chol |   .1875776   .0513543     3.653   0.002       .0792295    .2959258
     tri |    .238911   .0763522     3.129   0.006       .0778219          .4
   _cons |   40.00673   11.42093     3.503   0.003       15.91068    64.10278
------------------------------------------------------------------------------

. stepwise, pr(.05) : regress sysbp chol age tri
                      begin with full model
p = 0.0863 >= 0.0500  removing age
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
Forward selection procedure
Step 1: choose the first independent variable to enter the equation.
Specify Pe (for example Pe = 0.05).
(a) Fit a simple regression of the dependent variable on each independent variable in turn.
Choose the independent variable that is linearly related to the dependent variable with the smallest p-value, provided it is < the significance level (Pe).
***** If p-value > Pe, stop the analysis.

. regress sysbp chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .3075584   .0416769     7.38   0.000     .2199986    .3951183
       _cons |   39.95941   13.93348     2.87   0.010     10.68625    69.23256
------------------------------------------------------------------------------

. regress sysbp age
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.695629   .4223477     4.01   0.001     .8083094    2.582949
       _cons |   53.60554   22.09811     2.43   0.026     7.179137     100.032
------------------------------------------------------------------------------

. regress sysbp tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         tri |   .4471421   .0659423     6.78   0.000     .3086025    .5856817
       _cons |   67.34391    11.2005     6.01   0.000     43.81254    90.87528
------------------------------------------------------------------------------

From the simple regressions of the dependent variable on each independent variable, chol is selected as the first independent variable to enter the equation.
Alternatively, in the forward selection procedure:
Compute the correlation between the dependent variable and each independent variable, and choose the independent variable most strongly correlated with the dependent variable, provided P-value < Pe.
In the example the correlations are:
r(SYSBP, CHOL) = 0.8669
r(SYSBP, AGE) = 0.6873
r(SYSBP, TRI) = 0.8477
chol is selected as the first independent variable to enter the equation.

. pwcorr sysbp chol age tri, sig

             |    sysbp     chol      age      tri
-------------+------------------------------------
       sysbp |   1.0000
             |
        chol |   0.8669   1.0000
             |   0.0000
             |
         age |   0.6873   0.5609   1.0000
             |   0.0008   0.0101
             |
         tri |   0.8477   0.7467   0.5732   1.0000
             |   0.0000   0.0002   0.0082
Step 2: consider a second variable for the model.
(a) Fit a model of the dependent variable on the variable entered in step 1 plus each remaining independent variable in turn, and pick the smallest p-value (from the t-test).
(b) If P-value < Pe, enter that independent variable into the model.
(c) If P-value > Pe, stop entering independent variables.

. regress sysbp tri chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279

. regress sysbp age chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7236989   .3145605     2.30   0.034     .0600341    1.387364
        chol |   .2491839   .0452355     5.51   0.000     .1537453    .3446225
       _cons |   21.81301   14.79754     1.47   0.159    -9.407062    53.03308
------------------------------------------------------------------------------

tri has the smaller p-value (t = 3.13, p-value = .006).
* The partial correlation method gives the same result:
$r_{YX|Z} = \dfrac{r_{YX} - r_{YZ} \, r_{XZ}}{\sqrt{(1 - r_{YZ}^2)(1 - r_{XZ}^2)}}$
Alternatively, for step 2 (considering a second variable for the model):
$r_{sysbp,tri|chol} = \dfrac{r_{sysbp,tri} - r_{sysbp,chol} \, r_{tri,chol}}{\sqrt{(1 - r_{sysbp,chol}^2)(1 - r_{tri,chol}^2)}}$
$r_{sysbp,age|chol} = \dfrac{r_{sysbp,age} - r_{sysbp,chol} \, r_{age,chol}}{\sqrt{(1 - r_{sysbp,chol}^2)(1 - r_{age,chol}^2)}}$
. pcorr sysbp tri chol
(obs=20)

Partial correlation of sysbp with

    Variable |    Corr.     Sig.
-------------+------------------
         tri |   0.6045    0.006
        chol |   0.6631    0.002

. pcorr sysbp age chol
(obs=20)

Partial correlation of sysbp with

    Variable |    Corr.     Sig.
-------------+------------------
         age |   0.4873    0.034
        chol |   0.8006    0.000
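The agreement between `pcorr` and `regress` can be checked by hand. The sketch below computes the partial correlation of sysbp with tri given chol in two ways: from the regression t statistic, using r = t / sqrt(t² + residual df), and from the pairwise correlations. The function names are illustrative. The sysbp-chol correlation is not copied from a table here; it is recovered from the simple-regression t of 7.38 on 18 df, so the results match the Stata output only up to rounding.

```python
import math


def r_from_t(t, df_resid):
    """Correlation (simple or partial) implied by a regression t statistic."""
    return t / math.sqrt(t * t + df_resid)


def partial_r(r_yx, r_yz, r_xz):
    """First-order partial correlation of Y and X, controlling for Z."""
    return (r_yx - r_yz * r_xz) / math.sqrt((1 - r_yz ** 2) * (1 - r_xz ** 2))


# Route 1: t statistics from "regress sysbp tri chol" (20 - 2 - 1 = 17 df).
r_tri_given_chol = r_from_t(3.13, 17)
r_chol_given_tri = r_from_t(3.65, 17)

# Route 2: pairwise correlations (0.8477 and 0.7467 are from the correlation matrix).
r_sysbp_chol = r_from_t(7.38, 18)  # recovered from "regress sysbp chol"
r_formula = partial_r(r_yx=0.8477, r_yz=r_sysbp_chol, r_xz=0.7467)

print(round(r_tri_given_chol, 3), round(r_chol_given_tri, 3), round(r_formula, 3))
```

Because the partial correlation is a monotone function of t at fixed degrees of freedom, ranking candidates by partial correlation or by t (or p-value) selects the same variable.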
Step 3: For each independent variable being considered for entry, obtain its t and p-value in a model that also contains the independent variables entered earlier.
If
(a) p-value < the specified significance level (Pe): enter that independent variable into the regression model.
(b) p-value > the specified significance level (Pe): stop entering independent variables and use the model from the preceding step.
Step 3 repeats step 2 with the remaining variables until every independent variable has been considered, or until no variable has p-value < the specified Pe; then stop entering variables.
Alternatively, (2) use the partial correlation, which gives the same t and p-value.
. regress sysbp age chol tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007311    54.31117
------------------------------------------------------------------------------

. pcorr sysbp age chol tri
(obs=20)

Partial correlation of sysbp with

    Variable |    Corr.     Sig.
-------------+------------------
         age |   0.4156    0.086
        chol |   0.6401    0.004
         tri |   0.5582    0.016

For age, p-value = 0.086 (t = 1.83) > the specified significance level (Pe = 0.05), so stop entering variables.
. regress sysbp age chol tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |      9133.8    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007311    54.31117
------------------------------------------------------------------------------
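The summary statistics in the header of this output are all functions of the sums of squares and their degrees of freedom. As a quick cross-check (a sketch, with variable names chosen here for illustration), the block below recomputes F, R-squared, adjusted R-squared, and Root MSE from the Model and Residual rows.

```python
import math

# Sums of squares and degrees of freedom from "regress sysbp age chol tri".
ss_model, df_model = 7942.70165, 3
ss_resid, df_resid = 1191.09835, 16
ss_total = ss_model + ss_resid
df_total = df_model + df_resid  # n - 1 = 19

ms_model = ss_model / df_model          # mean square for the model
ms_resid = ss_resid / df_resid          # mean square error
f_stat = ms_model / ms_resid            # overall F test
r2 = ss_model / ss_total                # share of variation explained
adj_r2 = 1 - (1 - r2) * df_total / df_resid  # penalised for number of predictors
root_mse = math.sqrt(ms_resid)          # residual standard deviation

print(round(f_stat, 2), round(r2, 4), round(adj_r2, 4), round(root_mse, 4))
```

Adjusted R-squared is the figure to compare across the two- and three-variable models, since plain R-squared can only increase when a variable is added.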
The p-value of 0.086 > the specified significance level (Pe = 0.05), so stop entering variables and use the model from step 2.

. regress sysbp chol tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   45.42
       Model |  7694.02578     2  3847.01289           Prob > F      =  0.0000
    Residual |  1439.77422    17  84.6926011           R-squared     =  0.8424
-------------+------------------------------           Adj R-squared =  0.8238
       Total |     9133.80    19  480.726316           Root MSE      =  9.2029

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10278
------------------------------------------------------------------------------
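The manual forward-entry steps can be summarised as a small decision loop. The sketch below does not fit any regressions: it replays the t statistics and p-values already printed in the Stata outputs of this section, and the lookup table `CANDIDATES` and the function `forward_select` are hypothetical names used only for illustration.

```python
# (t, p) for each candidate, keyed by the variables already in the model.
# All numbers are copied from the regress outputs in this section.
CANDIDATES = {
    (): {"chol": (7.38, 0.000), "age": (4.01, 0.001), "tri": (6.78, 0.000)},
    ("chol",): {"tri": (3.13, 0.006), "age": (2.30, 0.034)},
    ("chol", "tri"): {"age": (1.83, 0.086)},
}


def forward_select(candidates, pe=0.05):
    """Enter the strongest candidate at each step while its p-value is below pe."""
    model = ()
    while model in candidates and candidates[model]:
        # Within one step every candidate model has the same residual df,
        # so the largest |t| is also the smallest p-value.
        name, (t, p) = max(candidates[model].items(), key=lambda kv: abs(kv[1][0]))
        if p >= pe:
            break  # no remaining variable qualifies for entry
        model = model + (name,)
    return model


selected = forward_select(CANDIDATES, pe=0.05)
print(selected)
```

Raising the entry threshold changes the result: with `pe=0.10` the same loop would also admit age, whose p-value is 0.086.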
Forward selection procedure

. sw regress sysbp chol age tri, pe(.05)
                      begin with empty model
p = 0.0000 <  0.0500  adding  chol
p = 0.0061 <  0.0500  adding  tri

  Source |       SS       df       MS                  Number of obs =      20
---------+------------------------------               F(  2,    17) =   45.42
   Model |  7694.02578     2  3847.01289               Prob > F      =  0.0000
Residual |  1439.77422    17  84.6926011               R-squared     =  0.8424
---------+------------------------------               Adj R-squared =  0.8238
   Total |     9133.80    19  480.726316               Root MSE      =  9.2029

------------------------------------------------------------------------------
   sysbp |      Coef.   Std. Err.       t     P>|t|       [95% Conf. Interval]
---------+--------------------------------------------------------------------
    chol |   .1875776   .0513543      3.653   0.002       .0792295    .2959258
     tri |    .238911   .0763522      3.129   0.006       .0778219          .4
   _cons |   40.00673   11.42093      3.503   0.003       15.91068    64.10278
------------------------------------------------------------------------------

. stepwise, pe(.05) : regress sysbp chol age tri
                      begin with empty model
p = 0.0000 <  0.0500  adding  chol
p = 0.0061 <  0.0500  adding  tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------
Stepwise regression procedure

. sw regress sysbp chol age tri, pr(0.1) pe(.05) forward
                      begin with empty model
p = 0.0000 <  0.0500  adding  chol
p = 0.0061 <  0.0500  adding  tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   45.42
       Model |  7694.02578     2  3847.01289           Prob > F      =  0.0000
    Residual |  1439.77422    17  84.6926011           R-squared     =  0.8424
-------------+------------------------------           Adj R-squared =  0.8238
       Total |      9133.8    19  480.726316           Root MSE      =  9.2029

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------

. stepwise, pr(.10) pe(.05) forward: regress sysbp chol age tri
                      begin with empty model
p = 0.0000 <  0.0500  adding  chol
p = 0.0061 <  0.0500  adding  tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------
Stepwise regression procedure
This method selects independent variables by combining backward elimination with forward entry.
This handout applies forward entry first, then elimination.

Step 1: (a) Run a simple regression of the dependent variable on each independent variable, one at a time.
Select the independent variable with the strongest linear relationship with the dependent variable, judged by the largest t value or the smallest p-value.
If p-value < the specified significance level (Pe), enter that independent variable.
If p-value > Pe, stop the analysis.

Run the simple regressions one variable at a time, then select the independent variable most strongly related to the dependent variable (smallest p-value), provided its p-value < Pe.

. regress sysbp chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .3075584   .0416769     7.38   0.000     .2199986    .3951183
       _cons |   39.95941   13.93348     2.87   0.010     10.68625    69.23256
------------------------------------------------------------------------------

. regress sysbp age
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   1.695629   .4223477     4.01   0.001     .8083094    2.582949
       _cons |   53.60554   22.09811     2.43   0.026     7.179137     100.032
------------------------------------------------------------------------------

. regress sysbp tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         tri |   .4471421   .0659423     6.78   0.000     .3086025    .5856817
       _cons |   67.34391    11.2005     6.01   0.000     43.81254    90.87528
------------------------------------------------------------------------------

The p-value of chol < Pe, so chol is the first independent variable entered into the model.
. regress sysbp tri chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279

Step 2: Consider the second variable to enter the model.
(a) Fit a model of the dependent variable on the independent variable entered in step 1 plus each remaining independent variable, one at a time, and choose the variable with the smaller p-value (from the t-test). If that p-value > Pe, stop entering variables.

. regress sysbp age chol
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .7236989   .3145605     2.30   0.034     .0600341    1.387364
        chol |   .2491839   .0452355     5.51   0.000     .1537453    .3446225
       _cons |   21.81301   14.79754     1.47   0.159    -9.407062    53.03308
------------------------------------------------------------------------------

In the example data, tri (triglyceride) has the larger t value and the smaller p-value of the two candidates, and its p-value < Pe, so tri is entered into the multiple regression model.
Step 3: Consider removing variables from the model. Fit the multiple regression, then examine the t and p-value of each independent variable in the equation and compare the p-value (from the t value) with Pr.
If p-value > Probability to Remove (Pr), remove that variable from the equation.
If p-value < Probability to Remove (Pr), keep the independent variable in the equation.

The p-values of cholesterol and triglyceride are both < Pr (the specified significance level, Pr = 0.20), so both independent variables stay in the equation.

. regress sysbp chol tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10279
------------------------------------------------------------------------------
Step 4: Repeat steps 2-3 with the remaining variables, here age. Consider it for entry (step 2):
(a) Examine the p-value (from the t value) of the remaining independent variable in the multiple regression equation.
(b) If p-value > Probability to Entry (Pe), stop entering variables.
(c) If an independent variable can be entered, repeat steps 2-3 until every variable has been considered or no independent variable has p-value < Pe; then stop entering variables.

The variable age has p-value = 0.086 > Pe (0.05), so stop entering variables.

. regress sysbp age chol tri
...
------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007311    54.31117
------------------------------------------------------------------------------
Model selected by stepwise regression

Predicted sysbp:

$$\hat{y}_i = 40.01 + 0.19(x_1) + 0.24(x_2)$$

where $x_1$ is chol and $x_2$ is tri.
. regress sysbp chol tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  2,    17) =   45.42
       Model |  7694.02578     2  3847.01289           Prob > F      =  0.0000
    Residual |  1439.77422    17  84.6926011           R-squared     =  0.8424
-------------+------------------------------           Adj R-squared =  0.8238
       Total |     9133.80    19  480.726316           Root MSE      =  9.2029

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
        chol |   .1875776   .0513543     3.65   0.002     .0792295    .2959258
         tri |    .238911   .0763522     3.13   0.006     .0778219          .4
       _cons |   40.00673   11.42093     3.50   0.003     15.91068    64.10278
------------------------------------------------------------------------------
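To show how the selected model is used for prediction, the sketch below plugs the unrounded coefficients from this output into the fitted equation. The function name `predict_sysbp` is illustrative; the test case is the first subject in the example data set (sysbp 155, chol 375, tri 230).

```python
def predict_sysbp(chol, tri, b0=40.00673, b_chol=0.1875776, b_tri=0.238911):
    """Fitted systolic BP from the two-variable model (coefficients from Stata)."""
    return b0 + b_chol * chol + b_tri * tri


# First subject of the example data: observed sysbp = 155.
fitted = predict_sysbp(chol=375, tri=230)
residual = 155 - fitted  # observed minus fitted

print(round(fitted, 2), round(residual, 2))
```

Using the full-precision coefficients instead of the rounded 40.01, 0.19 and 0.24 matters here: with chol and tri in the hundreds, rounding the slopes shifts the prediction by several mmHg.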
Forcing an independent variable to stay in the model during selection
lockterm1 = keep the first term
*For dummy variables, treat the set as one group of variables by enclosing the term in parentheses, e.g. (age i.occ).

. xi: stepwise, forward lockterm1 pr(.10) pe(.05): regress sysbp (age i.occ) chol tri

. stepwise, forward lockterm1 pr(.10) pe(.05): regress sysbp (age) chol tri
                      begin with term 1 model
p = 0.0000 <  0.0500  adding  chol
p = 0.0161 <  0.0500  adding  tri

      Source |       SS       df       MS              Number of obs =      20
-------------+------------------------------           F(  3,    16) =   35.56
       Model |  7942.70165     3  2647.56722           Prob > F      =  0.0000
    Residual |  1191.09835    16  74.4436471           R-squared     =  0.8696
-------------+------------------------------           Adj R-squared =  0.8451
       Total |      9133.8    19  480.726316           Root MSE      =  8.6281

------------------------------------------------------------------------------
       sysbp |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         age |   .5122311   .2802612     1.83   0.086    -.0818961    1.106358
        chol |   .1654515   .0496455     3.33   0.004     .0602076    .2706953
         tri |   .2006968   .0745745     2.69   0.016      .042606    .3587876
       _cons |   27.15522   12.80998     2.12   0.050    -.0007311    54.31117
------------------------------------------------------------------------------
Reporting a regression table (Publication Manual of the American Psychological Association, sixth edition, 2010, p. 145)
Stepwise logistic regression: choosing the p-value thresholds
Hosmer & Lemeshow (2000) recommend setting
the p-value for entry (Pe) at 0.15-0.25, with the p-value for removal (Pr) > Pe.
Setting the p-value for entry too low or too high:
Using the more traditional level (0.05) often fails to identify variables known to be important.
A higher level has the disadvantage of including variables that are of questionable importance at the model-building stage.
(Original: Mickey & Greenland, 1977: p125-137;
Reference: Hosmer & Lemeshow (2000): p95)
Assumptions of multiple regression analysis
Checked from the residuals ($e_i = y_i - \hat{y}_i$):
The model is linear (the regression function is linear).
The residuals (e_i) are normally distributed.
The residuals (e_i) have constant variance (homoscedasticity).
The residuals (e_i) are uncorrelated with one another (no auto-correlation, no serial correlation).
Checked from the independent variables:
No multicollinearity.
Aptness of the model
The model is linear (the regression function is linear)
Check by plotting the residuals (e_i) against the fitted values ($\hat{y}_i$).
The model is linear when the plotted points scatter around the horizontal line at residual (e_i) = 0, as in figure a.
The model is not linear when the plotted points rise and fall systematically, as in figure b.
[Figures a and b: residuals (e_i) plotted against fitted values ($\hat{y}_i$)]

. regress sysbp chol tri
. rvfplot, yline(0)
[Figure: e_i vs. $\hat{y}_i$]
The residuals (e_i) are normally distributed
Normal probability plot, box-and-whisker plot, stem-and-leaf plot, etc.
Shapiro-Wilk test

. quietly regress sysbp chol age tri
. predict e, residual
. swilk e

                   Shapiro-Wilk W test for normal data
    Variable |    Obs       W           V         z       Prob>z
-------------+-------------------------------------------------
           e |     20    0.95467      1.073     0.142    0.44361

. pnorm e
The residuals (e_i) have constant variance (homoscedasticity)
Plot the residuals (e_i) against the fitted values ($\hat{y}_i$).
Test with the Cook-Weisberg test for heteroscedasticity.

Stata
estat hettest     tests for heteroskedasticity
estat imtest      information matrix test
estat ovtest      Ramsey regression specification-error test for omitted variables
estat szroeter    Szroeter's rank test for heteroskedasticity
rvfplot           residual-versus-fitted plot
The residuals (e_i) have constant variance (homoscedasticity): plot the residuals (e_i) against the fitted values ($\hat{y}_i$) and test with the Cook-Weisberg test for heteroscedasticity.

. rvfplot, ylin(0)

. estat hettest

Breusch-Pagan / Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         Variables: fitted values of sysbp

         chi2(1)      =     1.38
         Prob > chi2  =   0.2409
. estat imtest

Cameron & Trivedi's decomposition of IM-test

---------------------------------------------------
              Source |       chi2     df      p
---------------------+-----------------------------
  Heteroskedasticity |       3.28      5    0.6563
            Skewness |       2.14      2    0.3438
            Kurtosis |       1.21      1    0.2720
---------------------+-----------------------------
               Total |       6.63      8    0.5775
---------------------------------------------------

. estat szroeter, rhs mtest(holm)

Szroeter's test for homoskedasticity

         Ho: variance constant
         Ha: variance monotonic in variable

---------------------------------------
    Variable |      chi2   df      p
-------------+-------------------------
        chol |      2.37    1   0.2481 #
         tri |      0.67    1   0.4134 #
---------------------------------------
# Holm-adjusted p-values
. hettest

Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         chi2(1)      =     7.44
         Prob > chi2  =   0.0064

. rvfplot, border yline(0)

. hettest

Cook-Weisberg test for heteroskedasticity
         Ho: Constant variance
         chi2(1)      =     0.00
         Prob > chi2  =   1.0000

. rvfplot, border yline(0)
The residuals (e_i) are uncorrelated with one another (no auto-correlation, no serial correlation)
**Applies only to time-series data.
This is the correlation between successive observations of the same variable: the error of observation i vs. the error of observation i-1.
Compute the Durbin-Watson statistic (d). d ranges from 0 to 4; d < 2 indicates positive autocorrelation and d > 2 indicates negative autocorrelation. As a rough rule of thumb, Durbin-Watson values of 1.5 to 2.5 are relatively normal.

$$d = \frac{\sum_{i=2}^{n} (e_i - e_{i-1})^{2}}{\sum_{i=1}^{n} e_i^{2}}$$
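The Durbin-Watson formula is simple enough to compute directly from a residual series. The sketch below is a plain-Python illustration (the function name and the two toy residual series are made up for the example, not taken from the beryllium data).

```python
def durbin_watson(residuals):
    """Sum of squared successive differences divided by the sum of squared residuals."""
    numerator = sum(
        (residuals[i] - residuals[i - 1]) ** 2 for i in range(1, len(residuals))
    )
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator


# Toy series: perfectly alternating signs (strong negative autocorrelation)
# and a constant series (strong positive autocorrelation).
d_alternating = durbin_watson([1, -1, 1, -1])
d_constant = durbin_watson([1, 1, 1, 1])

print(d_alternating, d_constant)
```

Since d is approximately 2(1 - r), where r is the lag-1 autocorrelation of the residuals, values near 2 correspond to r near 0, which is why 2 is the reference point.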
 id   age   time   expose    lt
  1    42     15        1    54
  2    46     14        2   7.3
  3    43      8        4     3
  4    25      3        3     2
  5    26     13        4   5.4
  6    55     12        4     5
  7    23     10        4   3.7
  8    24     11        4     5
  9    38      7        3   2.8
 10    24      4        4   2.2
 11    28      6        4   2.5
 12    38      9        4   3.1
 13    26      5        4   2.5
 14    28      1        4    .8
 15    26      2        2   1.2

Example: a study of beryllium exposure among workers at a mine, relating age and exposure to a higher rate of blastogenic lymphocyte transformation (lt ratio).

$$H_0: \rho = 0; \quad H_A: \rho \neq 0$$
. tsset time
        time variable:  time, 1 to 15
                delta:  1 unit

. qui regress lt age expose

. estat dwatson

Durbin-Watson d-statistic(  3,    15) =  1.98835

. estat durbinalt

Durbin's alternative test for autocorrelation
--------------------------------------------------------------------
    lags(p)  |          chi2               df          Prob > chi2
-------------+------------------------------------------------------
       1     |          1.843               1             0.1746
--------------------------------------------------------------------
                        H0: no serial correlation
Collinearity*
A high correlation among the independent variables themselves
(r2 > 0.90; r > 0.95; Kleinbaum, Muller, Nizam; 1998, 241)
gives rise to "multicollinearity".
Adding or dropping a variable in the model changes the coefficients in magnitude and/or sign.
R2 is high, yet the statistical tests of the coefficients are not significant.
The standard errors are inflated, which lowers test statistics such as t and z and widens the confidence intervals of the coefficients.
*Dictionary of Mathematical Terms, Royal Institute edition, B.E. 2552
Checking for collinearity or multicollinearity
Pearson correlation (informal method)
Check the relationships among the variables using the Pearson correlation, looking for variables that are highly correlated with other variables.

. corr chol age age tri
(obs=20)

             |     chol      age      age      tri
-------------+------------------------------------
        chol |   1.0000
         age |   0.5609   1.0000
         age |   0.5609   1.0000   1.0000
         tri |   0.7467   0.5732   0.5732   1.0000
Indication of multicollinearity using variance inflation factors*
VIF > 10 is an indication of multicollinearity.
The mean VIF provides information about the severity of the multicollinearity: a mean VIF considerably greater than 1 is indicative of serious multicollinearity problems.
(*Neter, Wasserman & Kutner, 1987; Marquardt, 1970; Belsley, Kuh & Welsch, 1980)
Tolerance (O'Brien, 2007): tolerance < 0.20 (VIF > 5) or tolerance < 0.10 (VIF > 10)

Stata
collin [varlist...]
estat vif         variance inflation factors for the independent variables
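$$\text{tolerance}_i = 1 - R_i^{2}$$

Variance Inflation Factors (VIF: formal method)
A VIF > 10, together with a mean VIF greater than 1, indicates a multicollinearity problem.

$$VIF_i = \frac{1}{1 - R_i^{2}} = (1 - R_i^{2})^{-1}$$

$$\overline{VIF} = \frac{\sum_{i=1}^{p-1} VIF_i}{p - 1}$$

Here $R_i^{2}$ is the R-squared from regressing independent variable i on the other independent variables, and p - 1 is the number of independent variables.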
. collin age chol tri

  Collinearity Diagnostics

                        SQRT                   R-
  Variable      VIF     VIF    Tolerance    Squared
----------------------------------------------------
       age      1.58    1.26    0.6315      0.3685
      chol      2.40    1.55    0.4162      0.5838
       tri      2.45    1.57    0.4077      0.5923
----------------------------------------------------
  Mean VIF      2.15

                           Cond
        Eigenval          Index
---------------------------------
    1     3.9477          1.0000
    2     0.0303         11.4206
    3     0.0126         17.7302
    4     0.0094         20.4609
---------------------------------
 Condition Number        20.4609
 Eigenvalues & Cond Index computed from scaled raw sscp (w/ intercept)
 Det(correlation matrix)    0.2794
. quietly regress sysbp chol age tri

. vif

    Variable |       VIF       1/VIF
-------------+----------------------
         tri |      2.45    0.407722
        chol |      2.40    0.416193
         age |      1.58    0.631508
-------------+----------------------
    Mean VIF |      2.15

or

. estat vif

    Variable |       VIF       1/VIF
-------------+----------------------
         tri |      2.45    0.407722
        chol |      2.40    0.416193
         age |      1.58    0.631508
-------------+----------------------
    Mean VIF |      2.15

. di (2.45+2.40+1.58)/3
2.1433333
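The VIF and tolerance columns can be reproduced from the R-squared of each independent variable regressed on the others. The sketch below uses the R-Squared column of the `collin` output as input; the dictionary and function names are illustrative.

```python
# R-squared of each predictor regressed on the other two (collin output).
R2_AUX = {"age": 0.3685, "chol": 0.5838, "tri": 0.5923}


def tolerance(r2):
    """Tolerance is the share of a predictor's variance not explained by the others."""
    return 1 - r2


def vif(r2):
    """Variance inflation factor, the reciprocal of the tolerance."""
    return 1 / (1 - r2)


vifs = {name: vif(r2) for name, r2 in R2_AUX.items()}
mean_vif = sum(vifs.values()) / len(vifs)

# With only two predictors, R-squared is the squared pairwise correlation,
# so r = 0.95 corresponds to a VIF just above 10.
vif_r95 = vif(0.95 ** 2)

print({k: round(v, 2) for k, v in vifs.items()}, round(mean_vif, 2), round(vif_r95, 2))
```

The mean computed from unrounded VIFs differs slightly from the hand calculation `(2.45+2.40+1.58)/3`, which averages the rounded values; both round to 2.15.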
Relationship between VIF and the correlation coefficient

   r      r2      vif
  .1     0.01     1.01
  .2     0.04     1.04
  .3     0.09     1.10
  .4     0.16     1.19
  .5     0.25     1.33
  .6     0.36     1.56
  .7     0.49     1.96
  .8     0.64     2.78
  .9     0.81     5.26
  .91    0.83     5.82
  .92    0.85     6.51
  .93    0.86     7.40
  .94    0.88     8.59
  .95    0.90    10.26
  .96    0.92    12.76
  .97    0.94    16.92
  .98    0.96    25.25
  .99    0.98    50.25
  1      1.00      .

VIF first exceeds 10 at r = .95.
Condition Index & Variance Decomposition Proportion
The condition index (CI) and the variance decomposition proportion (VDP) are computed from the eigenvalues obtained by analysing the correlation matrix of the independent variables.
A condition index of 10-30 indicates multicollinearity; a condition index > 30 indicates a multicollinearity problem; a condition index > 100 indicates very severe multicollinearity (Belsley, 1991a).
If the condition index is between 10 and 30, there is moderate to strong multicollinearity, and if it exceeds 30 there is severe multicollinearity (Gujarati, 2002).
The condition index is computed as:

$$k = \frac{\text{Max eigenvalue}}{\text{Min eigenvalue}}, \qquad CI = \sqrt{k}$$
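As a rough check on the condition indexes, the sketch below takes the four eigenvalues printed by `collin` earlier in this section and computes one index per eigenvalue as the square root of the largest eigenvalue divided by that eigenvalue. The largest index is the condition number, the square root of the max-to-min ratio. The function name is illustrative, and because the printed eigenvalues are rounded to four decimals the results agree with Stata only to about two significant decimals.

```python
import math

# Eigenvalues as printed by "collin age chol tri" (rounded to 4 decimals).
EIGENVALUES = [3.9477, 0.0303, 0.0126, 0.0094]


def condition_indexes(eigenvalues):
    """One index per eigenvalue: sqrt(largest eigenvalue / this eigenvalue)."""
    largest = max(eigenvalues)
    return [math.sqrt(largest / ev) for ev in eigenvalues]


indexes = condition_indexes(EIGENVALUES)
condition_number = max(indexes)  # equals sqrt(max eigenvalue / min eigenvalue)

print([round(ci, 2) for ci in indexes], round(condition_number, 2))
```

With a condition number of about 20, the example falls in the 10-30 band quoted above, even though every VIF is below 3, so the two diagnostics need not agree.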
Condition Index & Variance Decomposition Proportion
The variance decomposition proportion was recommended by Belsley et al. (1980) and Belsley (1991a).
Look for VDP values greater than 0.5.
It is the proportion of the variance (proposed calculation of the proportions of variance) of each variable's coefficient that is associated with each principal component, that is, a decomposition of the coefficient variance for each dimension.

$$p_{jk} = \frac{V_{jk}^{2} / \lambda_k}{VIF_j}$$

(Fox, 1984)
. coldiag2 tri chol age, force w(5)

Condition number using scaled variables =        20.46

Condition Indexes and Variance-Decomposition Proportions

     condition
         index   _cons     tri    chol     age
  1       1.00    0.00    0.00    0.00    0.00
  2      11.42    0.32    0.38    0.01    0.03
  3      17.73    0.32    0.00    0.14    0.95
  4      20.46    0.36    0.61    0.85    0.02

. prnt_cx, force w(5)

Condition Indexes and Variance-Decomposition Proportions

     condition
         index   _cons     tri    chol     age
  1       1.00     .       .       .       .
  2      11.42    0.32    0.38     .       .
  3      17.73    0.32     .       .      0.95
  4      20.46    0.36    0.61    0.85     .

Variance-Decomposition Proportions less than .3 have been printed as "."