Lecture 4, part 1: Linear Regression Analysis: Two Advanced Topics
July 14, 2011
Karen Bandeen-Roche, PhD
Department of Biostatistics
Johns Hopkins University
Introduction to Statistical Measurement and Modeling
Data examples: Boxing and neurological injury
Scientific question: Does amateur boxing lead to decline in neurological performance?
Some related statistical questions:
Is there a dose-response increase in the rate of cognitive decline with increased boxing exposure?
Is boxing-associated decline independent of initial cognition and age?
Is there a threshold of boxing that initiates harm?
Boxing data: Lowess smoother (bandwidth = 0.8) of diff (change in cognitive performance) versus bouts (number of boxing bouts, 0-400).
Outline
Topic #1: Confounding
Handling this is crucial if we are to draw correct conclusions about risk factors
Topic #2: Signal / noise decomposition
Signal: regression model predictions
Noise: residual variation
Another way of approaching inference and the precision of prediction
Topic # 1: Confounding
Confound means to “confuse”
Confounding arises when the comparison is between groups that are otherwise not similar in ways that affect the outcome
Related terms: lurking variables, …
Confounding example: Drowning and eating ice cream. Scatterplot of drowning rate versus ice cream eaten, showing an apparent positive association.
Confounding
Epidemiology definition: A characteristic "C" is a confounder if it is associated (related) with both the outcome (Y: drowning) and the risk factor (X: ice cream) and is not causally in between
Diagram: Ice cream consumption → Drowning rate (association in question).
Confounding
Statistical definition: A characteristic "C" is a confounder if the strength of relationship between the outcome (Y: drowning) and the risk factor (X: ice cream) differs with, versus without, adjustment for C
Diagram: Outdoor temperature is associated with both ice cream eaten and the drowning rate.
Confounding example: Drowning and eating ice cream. The same scatterplot of drowning rate versus ice cream eaten, with points stratified by cool versus warm temperature (a simulated sketch of the adjustment idea follows).
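To make the statistical definition concrete, here is a minimal simulated sketch in Python (not the lecture's data; the variable names, effect sizes, and use of statsmodels are assumptions for illustration). Temperature drives both ice cream consumption and drowning, so the crude slope for ice cream is sizable but shrinks toward zero once temperature is adjusted for.

```python
# Minimal confounding simulation: temperature (C) drives both X and Y.
# All names and effect sizes are invented for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
temp = rng.normal(25, 5, n)                    # outdoor temperature (confounder C)
ice_cream = 0.3 * temp + rng.normal(0, 1, n)   # X: driven by temperature
drowning = 0.2 * temp + rng.normal(0, 1, n)    # Y: driven by temperature, NOT by ice cream

# Unadjusted model: Y ~ X (shows a spurious association)
crude = sm.OLS(drowning, sm.add_constant(ice_cream)).fit()

# Adjusted model: Y ~ X + C (the ice cream slope shrinks toward zero)
adjusted = sm.OLS(drowning, sm.add_constant(np.column_stack([ice_cream, temp]))).fit()

print("crude slope for ice cream:   ", crude.params[1])
print("adjusted slope for ice cream:", adjusted.params[1])
```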
Effect modification
A characteristic "E" is an effect modifier if the strength of relationship between the outcome (Y: drowning) and the risk factor (X: ice cream) differs within levels of E
Diagram: Ice cream consumption → Drowning rate, with the relationship modified by outdoor temperature.
Effect modification example: Drowning and eating ice cream. Scatterplot of drowning rate versus ice cream eaten, stratified by cool versus warm temperature; the slope differs between the two temperature strata (a simulated interaction-model sketch follows).
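A minimal simulated sketch of effect modification (illustrative data and names only, not the lecture's example): fitting an interaction between ice cream and a warm/cool indicator lets the slope differ by temperature stratum.

```python
# Effect modification sketch: the slope of Y on X differs by stratum of E.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
warm = rng.integers(0, 2, n)                              # E: 1 = warm day, 0 = cool day
ice_cream = rng.normal(2 + warm, 1, n)                    # X: a bit higher on warm days
drowning = 0.5 * ice_cream * warm + rng.normal(0, 1, n)   # slope 0 when cool, 0.5 when warm

df = pd.DataFrame({"drowning": drowning, "ice_cream": ice_cream, "warm": warm})
fit = smf.ols("drowning ~ ice_cream * warm", data=df).fit()
print(fit.params)  # the ice_cream:warm coefficient estimates the difference in slopes
```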
Topic #2: Signal/Noise Decomposition
Lovely due to geometry of least squares
Facilitates testing involving multiple parameters at once
Provides insight into R-squared
Signal/Noise Decomposition
First step: decomposition of variance
"Regression" part: variance of the fitted values Ŷ
"Error" or "Residual" part: variance of the residuals e
Together these determine the "total" variance of the Ys
We work with "sums of squares" (SS) rather than variance per se:
Regression SS (SSR): Σ (Ŷi − Ȳ)²
Error SS (SSE): Σ (Yi − Ŷi)²
Total SS (SST): Σ (Yi − Ȳ)²
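A small numerical sketch of the decomposition (simulated data; the statsmodels call is an assumption, any OLS fit would do): compute SSR, SSE, and SST from a fitted model and check the identities summarized on the next slide.

```python
# Compute SSR, SSE, SST for a fitted linear model on illustrative data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(X)).fit()
yhat = fit.fittedvalues

SST = np.sum((y - y.mean()) ** 2)      # total SS
SSR = np.sum((yhat - y.mean()) ** 2)   # regression SS
SSE = np.sum((y - yhat) ** 2)          # error SS

print(SST, SSR + SSE)                  # SST = SSR + SSE (up to rounding)
print(SSR / SST, fit.rsquared)         # R-squared = SSR / SST
```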
Signal/Noise Decomposition
Properties
SST = SSR + SSE
SSR/SST = “proportion of variance explained” by regression = R-squared
Follows from geometry
SSR and SSE are independent (assuming A1-A5) and have easily characterized probability distributions
Provides convenient testing methods
Follows from geometry plus assumptions
Signal/Noise Decomposition
SSR and SSE are independent
Define M = span(X) and take Y as centered at Ȳ
It is possible to orthogonally rotate the coordinate axes so that the first p axes ∈ M and the remaining n−p−1 axes ∈ M⊥
(Gram-Schmidt orthogonalization)
Doing this transforms Y into TY := Z, for some orthogonal matrix T built from an orthonormal basis {e1, …, en−1}
Distribution of Z: N(T·E[Y|X], σ²I)
Signal/Noise Decomposition
SSR and SSE are independent (continued)
TY = Z, so Y = T′Z
SSE = squared length of (Zp+1, …, Zn−1) = Σ Zj² (sum over j = p+1, …, n−1)
SSR = squared length of (Z1, …, Zp) = Σ Zj² (sum over j = 1, …, p)
The claim now follows: SSR and SSE are independent because (Z1, …, Zp) and (Zp+1, …, Zn−1) are independent
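A sketch of the rotation argument using a QR decomposition (one convenient way to build the orthonormal basis; data simulated for illustration): the coordinates of the centered response along the columns spanning M carry SSR, and the leftover squared length is SSE.

```python
# QR-based sketch of the orthogonal decomposition: columns of Q span M = span(Xc).
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=n)

Xc = X - X.mean(axis=0)          # centered design (intercept handled by centering)
yc = y - y.mean()                # centered response

Q, _ = np.linalg.qr(Xc)          # orthonormal basis for M
z = Q.T @ yc                     # coordinates of yc along the first p rotated axes

SSR = np.sum(z ** 2)                          # squared length within M
SSE = np.sum(yc ** 2) - SSR                   # squared length within M-perp
print(SSR, SSE, SSR + SSE, np.sum(yc ** 2))   # SSR + SSE = SST
```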
Signal/Noise Decomposition
Under A1-A5, SSE, SSR, and their scaled ratio have convenient distributions
Under A1-A2: E[Y|X] ∈ M, so E[Zj|X] = 0 for all j > p
Recall {Z1, …, Zn−1} are mutually independent normal with variance σ²
Thus SSE = Σ Zj² (sum over j = p+1, …, n−1) ~ σ²·χ²(n−p−1) under A1-A5
(a sum of k independent squared N(0,1) variables is χ²(k))
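A quick Monte Carlo check of this claim (simulated design and an arbitrary σ; not part of the lecture): repeatedly refit the model and compare SSE/σ² to the χ²(n−p−1) distribution.

```python
# Monte Carlo check that SSE / sigma^2 ~ chi-square with n - p - 1 df under A1-A5.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, p, sigma = 50, 3, 2.0
X = rng.normal(size=(n, p))
Xd = np.column_stack([np.ones(n), X])     # design with intercept
beta = np.array([2.0, 1.0, 0.0, -1.0])    # intercept plus slopes (arbitrary)

sse = np.empty(5000)
for i in range(sse.size):
    y = Xd @ beta + rng.normal(0, sigma, n)
    yhat = Xd @ np.linalg.lstsq(Xd, y, rcond=None)[0]
    sse[i] = np.sum((y - yhat) ** 2)

print(np.mean(sse) / sigma**2, n - p - 1)                     # chi-square mean = its df
print(stats.kstest(sse / sigma**2, "chi2", args=(n - p - 1,)))  # distributional check
```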
Signal/Noise Decomposition
Under A1-A5 SSE, SSR and their scaled ratio have convenient distributions
For j ≤ p, E[Zj|X] ≠ 0 in general
Exception: H0: β1 = … = βp = 0
Then SSR = Σ Zj² (sum over j = 1, …, p) ~ σ²·χ²(p) under A1-A5
and
F = (SSR/p) / (SSE/(n−p−1)) ~ F(p, n−p−1), the distribution of (χ²(p)/p) / (χ²(n−p−1)/(n−p−1))
with numerator and denominator independent.
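A sketch of computing this global F statistic by hand and comparing it to the fitted model's reported test (simulated data; the scipy/statsmodels calls are illustrative, not the lecture's software).

```python
# Global F test: compare the hand-computed scaled ratio to statsmodels' output.
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, p = 80, 2
X = rng.normal(size=(n, p))
y = 1.0 + 0.8 * X[:, 0] + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(X)).fit()
yhat = fit.fittedvalues
SSR = np.sum((yhat - y.mean()) ** 2)
SSE = np.sum((y - yhat) ** 2)

F = (SSR / p) / (SSE / (n - p - 1))     # scaled ratio from the slide
p_value = stats.f.sf(F, p, n - p - 1)   # upper tail of F(p, n-p-1)
print(F, p_value)
print(fit.fvalue, fit.f_pvalue)         # statsmodels' global F test agrees
```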
Signal/Noise Decomposition
An organizational tool: The analysis of variance (ANOVA) table
SOURCE       Sum of Squares (SS)   Degrees of freedom (df)   Mean square (SS/df)
Regression   SSR                   p                         MSR = SSR/p
Error        SSE                   n−p−1                     MSE = SSE/(n−p−1) = σ̂²
Total        SST = SSR + SSE       n−1

F = MSR/MSE
“Global” hypothesis tests
These involve sets of parameters
Hypotheses of the form
H0: βj = 0 for all j in a defined subset of {j=1,...,p} vs. H1: βj ≠ 0 for at least one of the j
Example 1: H0: βLATITUDE = 0 and βLONGITUDE = 0
Example 2: H0: all polynomial or spline coefficients involving a given variable = 0.
Example 3: H0: all coefficients involving a variable = 0.
“Global” hypothesis tests
Testing method: sequential decomposition of sums of squares
Hypothesis to be tested: H0: βj1 = … = βjk = 0 in the full model
Fit the model excluding xj1, …, xjk; save its error SS as SSE_S
Fit the “full” (larger) model adding xj1, …, xjk back to the smaller model; save its error SS as SSE_L (often the overall SSE)
Test statistic: S = [(SSE_S − SSE_L)/k] / [SSE_L/(n−p−1)]
Distribution under the null: F(k, n−p−1)
Define rejection region based on this distribution
Compute S
Reject or not as S is in rejection region or not
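A sketch of this recipe using statsmodels' nested-model ANOVA (illustrative data and variable names, not the lecture's example): fit the smaller and larger models and let anova_lm form the F statistic from SSE_S and SSE_L.

```python
# Sequential (nested-model) F test via statsmodels' anova_lm.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(6)
n = 120
df = pd.DataFrame({"x1": rng.normal(size=n),
                   "x2": rng.normal(size=n),
                   "x3": rng.normal(size=n)})
df["y"] = 1 + 0.5 * df["x1"] + 0.8 * df["x2"] + rng.normal(size=n)

small = smf.ols("y ~ x1", data=df).fit()           # excludes x2, x3 -> SSE_S
full = smf.ols("y ~ x1 + x2 + x3", data=df).fit()  # adds them back  -> SSE_L

# F = [(SSE_S - SSE_L)/k] / [SSE_L/(n - p - 1)] with k = 2 added coefficients
print(anova_lm(small, full))
```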
Signal/Noise Decomposition
An augmented version for global testing
SOURCE       Sum of Squares (SS)   Degrees of freedom (df)   Mean square (SS/df)
Regression   SSR                   p                         SSR/p
  X1         SST − SSE_S           p1                        (SST − SSE_S)/p1
  X2 | X1    SSE_S − SSE_L         p2                        (SSE_S − SSE_L)/p2
Error        SSE_L                 n−p−1                     MSE = SSE_L/(n−p−1)
Total        SST = SSR + SSE       n−1

F = MSR(2|1)/MSE
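The same augmented table can be sketched numerically with a Type I (sequential) ANOVA on a single fitted model (illustrative data and names): the x1 row is the SS for x1 alone and the x2 row is the extra SS for x2 given x1.

```python
# Sequential (Type I) ANOVA table for one fitted model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(8)
n = 150
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 0.7 * df["x1"] + 0.3 * df["x2"] + rng.normal(size=n)

fit = smf.ols("y ~ x1 + x2", data=df).fit()
# Rows: SS for x1 alone, extra SS for x2 given x1 (X2 | X1), and the residual (SSE_L)
print(anova_lm(fit, typ=1))
```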
R-squared – Another view
From last lecture: R² = squared correlation of Y and Ŷ, i.e., [Corr(Y, Ŷ)]²
More conventional: R² = SSR/SST
Geometry justifies why they are the same:
Cov(Y, Ŷ) = Cov(Y − Ŷ + Ŷ, Ŷ) = Cov(e, Ŷ) + Var(Ŷ)
Covariance = inner product, so the first term = 0
A measure of the precision with which the regression model describes individual responses
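A small numerical check that the two views agree (simulated data; the statsmodels call is an assumption): the squared correlation of Y with Ŷ, the ratio SSR/SST, and the reported R² all coincide.

```python
# Check: Corr(Y, Yhat)^2 == SSR/SST == reported R-squared.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 60
X = rng.normal(size=(n, 2))
y = 0.5 + X @ np.array([1.0, -1.0]) + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(X)).fit()
yhat = fit.fittedvalues

r2_corr = np.corrcoef(y, yhat)[0, 1] ** 2
r2_ss = np.sum((yhat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
print(r2_corr, r2_ss, fit.rsquared)   # all three agree up to rounding
```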
Outline: A few more topics
Collinearity
Overfitting
Influence
Mediation
Multiple comparisons
Main points
Confounding occurs when an apparent association between a predictor and an outcome reflects the association of each with a third variable
A primary goal of regression is to "adjust" for confounding
The least squares decomposition of Y into fit and residual provides an appealing statistical testing framework
An association of an outcome with predictors is evidenced if the SS due to regression is large relative to the SSE
Geometry: the orthogonal decomposition provides a convenient sampling distribution, a view of R², and the ANOVA table