Lecture 4, part 1: Linear Regression Analysis: Two Advanced Topics
July 14, 2011
Karen Bandeen-Roche, PhD
Department of Biostatistics
Johns Hopkins University
Introduction to Statistical Measurement and Modeling
Data examples: Boxing and neurological injury
Scientific question: Does amateur boxing lead to decline in neurological performance?
Some related statistical questions:
Is there a dose-response increase in the rate of cognitive decline with increased boxing exposure?
Is boxing-associated decline independent of initial cognition and age?
Is there a threshold of boxing that initiates harm?
Boxing data: Lowess smoother (bandwidth = 0.8) of diff (change in cognitive performance) versus bouts (number of boxing bouts, 0-400).
Outline
Topic #1: Confounding
Handling this is crucial if we are to draw correct conclusions about risk factors
Topic #2: Signal / noise decomposition
Signal: regression model predictions
Noise: residual variation
Another way of approaching inference and the precision of prediction
Topic # 1: Confounding
Confound means to “confuse”
Confounding arises when the comparison is between groups that are otherwise not similar in ways that affect the outcome
Related terms: lurking variables, …
Confounding example: Drowning and eating ice cream. Scatterplot of drowning rate versus ice cream eaten, showing an apparent positive association.
Confounding
Epidemiology definition: A characteristic "C" is a confounder if it is associated (related) with both the outcome (Y: drowning) and the risk factor (X: ice cream) and is not causally in between
Diagram: Ice cream consumption → Drowning rate (association in question).
Confounding
Statistical definition: A characteristic "C" is a confounder if the strength of relationship between the outcome (Y: drowning) and the risk factor (X: ice cream) differs with, versus without, adjustment for C
Diagram: Outdoor temperature is associated with both ice cream eaten and the drowning rate.
Confounding example: Drowning and eating ice cream. The same scatterplot of drowning rate versus ice cream eaten, with points stratified by cool versus warm temperature (a simulated sketch of the adjustment idea follows).
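To make the statistical definition concrete, here is a minimal simulated sketch in Python (not the lecture's data; the variable names, effect sizes, and use of statsmodels are assumptions for illustration). Temperature drives both ice cream consumption and drowning, so the crude slope for ice cream is sizable but shrinks toward zero once temperature is adjusted for.

```python
# Minimal confounding simulation: temperature (C) drives both X and Y.
# All names and effect sizes are invented for illustration only.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 500
temp = rng.normal(25, 5, n)                    # outdoor temperature (confounder C)
ice_cream = 0.3 * temp + rng.normal(0, 1, n)   # X: driven by temperature
drowning = 0.2 * temp + rng.normal(0, 1, n)    # Y: driven by temperature, NOT by ice cream

# Unadjusted model: Y ~ X (shows a spurious association)
crude = sm.OLS(drowning, sm.add_constant(ice_cream)).fit()

# Adjusted model: Y ~ X + C (the ice cream slope shrinks toward zero)
adjusted = sm.OLS(drowning, sm.add_constant(np.column_stack([ice_cream, temp]))).fit()

print("crude slope for ice cream:   ", crude.params[1])
print("adjusted slope for ice cream:", adjusted.params[1])
```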
Effect modification
A characteristic "E" is an effect modifier if the strength of relationship between the outcome (Y: drowning) and the risk factor (X: ice cream) differs within levels of E
Diagram: Ice cream consumption → Drowning rate, with the relationship modified by outdoor temperature.
Effect modification example: Drowning and eating ice cream. Scatterplot of drowning rate versus ice cream eaten, stratified by cool versus warm temperature; the slope differs between the two temperature strata (a simulated interaction-model sketch follows).
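A minimal simulated sketch of effect modification (illustrative data and names only, not the lecture's example): fitting an interaction between ice cream and a warm/cool indicator lets the slope differ by temperature stratum.

```python
# Effect modification sketch: the slope of Y on X differs by stratum of E.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
n = 400
warm = rng.integers(0, 2, n)                              # E: 1 = warm day, 0 = cool day
ice_cream = rng.normal(2 + warm, 1, n)                    # X: a bit higher on warm days
drowning = 0.5 * ice_cream * warm + rng.normal(0, 1, n)   # slope 0 when cool, 0.5 when warm

df = pd.DataFrame({"drowning": drowning, "ice_cream": ice_cream, "warm": warm})
fit = smf.ols("drowning ~ ice_cream * warm", data=df).fit()
print(fit.params)  # the ice_cream:warm coefficient estimates the difference in slopes
```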
Topic #2: Signal/Noise Decomposition
Lovely due to geometry of least squares
Facilitates testing involving multiple parameters at once
Provides insight into R-squared
Signal/Noise Decomposition
First step: decomposition of variance
"Regression" part: variance of the fitted values Ŷ
"Error" or "Residual" part: variance of the residuals e
Together these determine the "total" variance of the Ys
We work with "sums of squares" (SS) rather than variance per se:
Regression SS (SSR): Σ (Ŷi − Ȳ)²
Error SS (SSE): Σ (Yi − Ŷi)²
Total SS (SST): Σ (Yi − Ȳ)²
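A small numerical sketch of the decomposition (simulated data; the statsmodels call is an assumption, any OLS fit would do): compute SSR, SSE, and SST from a fitted model and check the identities summarized on the next slide.

```python
# Compute SSR, SSE, SST for a fitted linear model on illustrative data.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(2)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(X)).fit()
yhat = fit.fittedvalues

SST = np.sum((y - y.mean()) ** 2)      # total SS
SSR = np.sum((yhat - y.mean()) ** 2)   # regression SS
SSE = np.sum((y - yhat) ** 2)          # error SS

print(SST, SSR + SSE)                  # SST = SSR + SSE (up to rounding)
print(SSR / SST, fit.rsquared)         # R-squared = SSR / SST
```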
Signal/Noise Decomposition
Properties
SST = SSR + SSE
SSR/SST = “proportion of variance explained” by regression = R-squared
Follows from geometry
SSR and SSE are independent (assuming A1-A5) and have easily characterized probability distributions
Provides convenient testing methods
Follows from geometry plus assumptions
Signal/Noise Decomposition
SSR and SSE are independent
Define M = span(X) and take Y as centered at Ȳ
It is possible to orthogonally rotate the coordinate axes so that the first p axes ∈ M and the remaining n−p−1 axes ∈ M⊥
(Gram-Schmidt orthogonalization)
Doing this transforms Y into TY := Z, for some orthogonal matrix T built from an orthonormal basis {e1, …, en−1}
Distribution of Z: N(T·E[Y|X], σ²I)
Signal/Noise Decomposition
SSR and SSE are independent (continued)
TY = Z, so Y = T′Z
SSE = squared length of (Zp+1, …, Zn−1) = Σ Zj² (sum over j = p+1, …, n−1)
SSR = squared length of (Z1, …, Zp) = Σ Zj² (sum over j = 1, …, p)
The claim now follows: SSR and SSE are independent because (Z1, …, Zp) and (Zp+1, …, Zn−1) are independent
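A sketch of the rotation argument using a QR decomposition (one convenient way to build the orthonormal basis; data simulated for illustration): the coordinates of the centered response along the columns spanning M carry SSR, and the leftover squared length is SSE.

```python
# QR-based sketch of the orthogonal decomposition: columns of Q span M = span(Xc).
import numpy as np

rng = np.random.default_rng(3)
n, p = 100, 3
X = rng.normal(size=(n, p))
y = X @ np.array([1.0, -0.5, 2.0]) + rng.normal(size=n)

Xc = X - X.mean(axis=0)          # centered design (intercept handled by centering)
yc = y - y.mean()                # centered response

Q, _ = np.linalg.qr(Xc)          # orthonormal basis for M
z = Q.T @ yc                     # coordinates of yc along the first p rotated axes

SSR = np.sum(z ** 2)                          # squared length within M
SSE = np.sum(yc ** 2) - SSR                   # squared length within M-perp
print(SSR, SSE, SSR + SSE, np.sum(yc ** 2))   # SSR + SSE = SST
```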
Signal/Noise Decomposition
Under A1-A5, SSE, SSR, and their scaled ratio have convenient distributions
Under A1-A2: E[Y|X] ∈ M, so E[Zj|X] = 0 for all j > p
Recall {Z1, …, Zn−1} are mutually independent normal with variance σ²
Thus SSE = Σ Zj² (sum over j = p+1, …, n−1) ~ σ²·χ²(n−p−1) under A1-A5
(a sum of k independent squared N(0,1) variables is χ²(k))
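A quick Monte Carlo check of this claim (simulated design and an arbitrary σ; not part of the lecture): repeatedly refit the model and compare SSE/σ² to the χ²(n−p−1) distribution.

```python
# Monte Carlo check that SSE / sigma^2 ~ chi-square with n - p - 1 df under A1-A5.
import numpy as np
from scipy import stats

rng = np.random.default_rng(4)
n, p, sigma = 50, 3, 2.0
X = rng.normal(size=(n, p))
Xd = np.column_stack([np.ones(n), X])     # design with intercept
beta = np.array([2.0, 1.0, 0.0, -1.0])    # intercept plus slopes (arbitrary)

sse = np.empty(5000)
for i in range(sse.size):
    y = Xd @ beta + rng.normal(0, sigma, n)
    yhat = Xd @ np.linalg.lstsq(Xd, y, rcond=None)[0]
    sse[i] = np.sum((y - yhat) ** 2)

print(np.mean(sse) / sigma**2, n - p - 1)                     # chi-square mean = its df
print(stats.kstest(sse / sigma**2, "chi2", args=(n - p - 1,)))  # distributional check
```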
Signal/Noise Decomposition
Under A1-A5 SSE, SSR and their scaled ratio have convenient distributions
For j ≤ p, E[Zj|X] ≠ 0 in general
Exception: H0: β1 = … = βp = 0
Then SSR = Σ Zj² (sum over j = 1, …, p) ~ σ²·χ²(p) under A1-A5
and
F = (SSR/p) / (SSE/(n−p−1)) ~ F(p, n−p−1), the distribution of (χ²(p)/p) / (χ²(n−p−1)/(n−p−1))
with numerator and denominator independent.
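A sketch of computing this global F statistic by hand and comparing it to the fitted model's reported test (simulated data; the scipy/statsmodels calls are illustrative, not the lecture's software).

```python
# Global F test: compare the hand-computed scaled ratio to statsmodels' output.
import numpy as np
from scipy import stats
import statsmodels.api as sm

rng = np.random.default_rng(5)
n, p = 80, 2
X = rng.normal(size=(n, p))
y = 1.0 + 0.8 * X[:, 0] + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(X)).fit()
yhat = fit.fittedvalues
SSR = np.sum((yhat - y.mean()) ** 2)
SSE = np.sum((y - yhat) ** 2)

F = (SSR / p) / (SSE / (n - p - 1))     # scaled ratio from the slide
p_value = stats.f.sf(F, p, n - p - 1)   # upper tail of F(p, n-p-1)
print(F, p_value)
print(fit.fvalue, fit.f_pvalue)         # statsmodels' global F test agrees
```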
Signal/Noise Decomposition
An organizational tool: The analysis of variance (ANOVA) table
SOURCE       Sum of Squares (SS)   Degrees of freedom (df)   Mean square (SS/df)
Regression   SSR                   p                         MSR = SSR/p
Error        SSE                   n−p−1                     MSE = SSE/(n−p−1) = σ̂²
Total        SST = SSR + SSE       n−1

F = MSR/MSE
“Global” hypothesis tests
These involve sets of parameters
Hypotheses of the form
H0: βj = 0 for all j in a defined subset of {j=1,...,p} vs. H1: βj ≠ 0 for at least one of the j
Example 1: H0: βLATITUDE = 0 and βLONGITUDE = 0
Example 2: H0: all polynomial or spline coefficients involving a given variable = 0.
Example 3: H0: all coefficients involving a variable = 0.
“Global” hypothesis tests
Testing method: sequential decomposition of sums of squares
Hypothesis to be tested: H0: βj1 = … = βjk = 0 in the full model
Fit the model excluding xj1, …, xjk; save its error SS as SSE_S
Fit the “full” (larger) model adding xj1, …, xjk back to the smaller model; save its error SS as SSE_L (often the overall SSE)
Test statistic: S = [(SSE_S − SSE_L)/k] / [SSE_L/(n−p−1)]
Distribution under the null: F(k, n−p−1)
Define rejection region based on this distribution
Compute S
Reject or not as S is in rejection region or not
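A sketch of this recipe using statsmodels' nested-model ANOVA (illustrative data and variable names, not the lecture's example): fit the smaller and larger models and let anova_lm form the F statistic from SSE_S and SSE_L.

```python
# Sequential (nested-model) F test via statsmodels' anova_lm.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(6)
n = 120
df = pd.DataFrame({"x1": rng.normal(size=n),
                   "x2": rng.normal(size=n),
                   "x3": rng.normal(size=n)})
df["y"] = 1 + 0.5 * df["x1"] + 0.8 * df["x2"] + rng.normal(size=n)

small = smf.ols("y ~ x1", data=df).fit()           # excludes x2, x3 -> SSE_S
full = smf.ols("y ~ x1 + x2 + x3", data=df).fit()  # adds them back  -> SSE_L

# F = [(SSE_S - SSE_L)/k] / [SSE_L/(n - p - 1)] with k = 2 added coefficients
print(anova_lm(small, full))
```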
Signal/Noise Decomposition
An augmented version for global testing
SOURCE       Sum of Squares (SS)   Degrees of freedom (df)   Mean square (SS/df)
Regression   SSR                   p                         SSR/p
  X1         SST − SSE_S           p1                        (SST − SSE_S)/p1
  X2 | X1    SSE_S − SSE_L         p2                        (SSE_S − SSE_L)/p2
Error        SSE_L                 n−p−1                     MSE = SSE_L/(n−p−1)
Total        SST = SSR + SSE       n−1

F = MSR(2|1)/MSE
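The same augmented table can be sketched numerically with a Type I (sequential) ANOVA on a single fitted model (illustrative data and names): the x1 row is the SS for x1 alone and the x2 row is the extra SS for x2 given x1.

```python
# Sequential (Type I) ANOVA table for one fitted model.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(8)
n = 150
df = pd.DataFrame({"x1": rng.normal(size=n), "x2": rng.normal(size=n)})
df["y"] = 0.7 * df["x1"] + 0.3 * df["x2"] + rng.normal(size=n)

fit = smf.ols("y ~ x1 + x2", data=df).fit()
# Rows: SS for x1 alone, extra SS for x2 given x1 (X2 | X1), and the residual (SSE_L)
print(anova_lm(fit, typ=1))
```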
R-squared – Another view
From last lecture: R² = squared correlation of Y and Ŷ, i.e., [Corr(Y, Ŷ)]²
More conventional: R² = SSR/SST
Geometry justifies why they are the same:
Cov(Y, Ŷ) = Cov(Y − Ŷ + Ŷ, Ŷ) = Cov(e, Ŷ) + Var(Ŷ)
Covariance = inner product, so the first term = 0
A measure of the precision with which the regression model describes individual responses
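A small numerical check that the two views agree (simulated data; the statsmodels call is an assumption): the squared correlation of Y with Ŷ, the ratio SSR/SST, and the reported R² all coincide.

```python
# Check: Corr(Y, Yhat)^2 == SSR/SST == reported R-squared.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(7)
n = 60
X = rng.normal(size=(n, 2))
y = 0.5 + X @ np.array([1.0, -1.0]) + rng.normal(size=n)

fit = sm.OLS(y, sm.add_constant(X)).fit()
yhat = fit.fittedvalues

r2_corr = np.corrcoef(y, yhat)[0, 1] ** 2
r2_ss = np.sum((yhat - y.mean()) ** 2) / np.sum((y - y.mean()) ** 2)
print(r2_corr, r2_ss, fit.rsquared)   # all three agree up to rounding
```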
Outline: A few more topics
Collinearity
Overfitting
Influence
Mediation
Multiple comparisons
Main points
Confounding occurs when an apparent association between a predictor and an outcome reflects the association of each with a third variable
A primary goal of regression is to "adjust" for confounding
The least squares decomposition of Y into fit and residual provides an appealing statistical testing framework
An association of an outcome with predictors is evidenced if the SS due to regression is large relative to the SSE
Geometry: the orthogonal decomposition provides a convenient sampling distribution, a view of R², and the ANOVA table