8/20/2019 Correlation lecture
1/79
Linear correlation and linear regression
Outcome variable: continuous (means), e.g., pain scale, cognitive function

Are the observations independent or correlated?

Independent:
- T-test: compares means between two independent groups
- ANOVA: compares means between more than two independent groups
- Pearson's correlation coefficient (linear correlation): shows linear correlation between two continuous variables
- Linear regression: multivariate regression technique used when the outcome is continuous

Correlated:
- Paired t-test: compares means between two related groups (e.g., the same subjects before and after)
- Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
- Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups

Alternatives if the normality assumption is violated (and small sample size): non-parametric statistics
- Wilcoxon sign-rank test: non-parametric alternative to the paired t-test
- Wilcoxon sum-rank test (= Mann-Whitney U test): non-parametric alternative to the t-test
- Kruskal-Wallis test: non-parametric alternative to ANOVA
- Spearman rank correlation coefficient: non-parametric alternative to Pearson's correlation coefficient
Recall: Covariance

cov(x, y) = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / (n − 1)
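A quick numeric check of this formula (a sketch; the two small arrays are made-up illustration data, not anything from the lecture):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# Sample covariance: sum of cross-deviations divided by n - 1
cov_xy = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)

# np.cov uses the same n - 1 denominator by default
assert np.isclose(cov_xy, np.cov(x, y)[0, 1])
print(cov_xy)  # 1.5
```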
8/20/2019 Correlation lecture
4/79
cov(0!1) 2 3 0 and 1 are positively correlated
cov(0!1) 4 3 0 and 1 are inversely correlated
cov(0!1) 3 0 and 1 are independent
5nterpreting Covariance
Correlation coefficient

Pearson's correlation coefficient is standardized covariance (unitless):

r = cov(x, y) / √(var x · var y)
Correlation

- Measures the relative strength of the linear relationship between two variables
- Unit-less
- Ranges between −1 and +1
- The closer to −1, the stronger the negative linear relationship
- The closer to +1, the stronger the positive linear relationship
- The closer to 0, the weaker any linear relationship
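These properties can be checked numerically (a sketch; the arrays are made-up illustration data):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

# r = cov(x, y) / sqrt(var(x) * var(y)); ddof=1 gives the sample (n - 1) versions
r = np.cov(x, y, ddof=1)[0, 1] / np.sqrt(np.var(x, ddof=1) * np.var(y, ddof=1))

# Matches NumPy's built-in correlation matrix, and stays inside [-1, +1]
assert np.isclose(r, np.corrcoef(x, y)[0, 1])
assert -1.0 <= r <= 1.0
print(round(r, 4))  # 0.7746
```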
Scatter plots of data with various correlation coefficients (panels: r = −1, r = −.6, r = 0, r = +.3, r = +1, r = 0)

Slide from: Statistics for Managers Using Microsoft® Excel, 4th Edition, 2004, Prentice-Hall
Linear correlation (panels: linear relationships vs. curvilinear relationships)

Slide from: Statistics for Managers Using Microsoft® Excel, 4th Edition, 2004, Prentice-Hall
Linear correlation (panels: strong relationships vs. weak relationships)

Slide from: Statistics for Managers Using Microsoft® Excel, 4th Edition, 2004, Prentice-Hall
Linear correlation (panels: no relationship)

Slide from: Statistics for Managers Using Microsoft® Excel, 4th Edition, 2004, Prentice-Hall
Calculating by hand…

r̂ = cov(x, y) / √(var x · var y)
  = [ Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / (n − 1) ] / √( [ Σᵢ₌₁ⁿ (xᵢ − x̄)² / (n − 1) ] · [ Σᵢ₌₁ⁿ (yᵢ − ȳ)² / (n − 1) ] )
Simpler calculation formula…

r̂ = Σᵢ₌₁ⁿ (xᵢ − x̄)(yᵢ − ȳ) / √( Σᵢ₌₁ⁿ (xᵢ − x̄)² · Σᵢ₌₁ⁿ (yᵢ − ȳ)² ) = SSxy / √(SSxx · SSyy)

(The 1/(n − 1) factors in the numerator and denominator cancel.)

The numerator of the covariance becomes SSxy; the numerators of the variances become SSxx and SSyy:

r̂ = SSxy / √(SSxx SSyy)
Distribution of the correlation coefficient:

SE(r̂) = √( (1 − r²) / (n − 2) )

(Note: like a proportion, the variance of the correlation coefficient depends on the correlation coefficient itself; substitute in the estimated r.)

The sample correlation coefficient follows a T-distribution with n − 2 degrees of freedom (since you have to estimate the standard error).
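To make the test concrete, here is a sketch of the resulting t statistic (the r and n values are made-up, carried over from the earlier illustration data):

```python
import math

r, n = 0.7746, 5                      # hypothetical sample correlation and size
se = math.sqrt((1 - r**2) / (n - 2))  # SE of r, from the formula above
t = r / se                            # compare to a T distribution with n - 2 df
print(round(t, 2))  # 2.12
```

With only 5 observations this t of about 2.12 on 3 degrees of freedom would not reach the usual .05 threshold, which illustrates why small samples rarely yield "significant" correlations.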
Linear regression

In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (= predictor) variable (X) and the other the dependent (= outcome) variable (Y).
What is "linear"? Remember this: Y = mX + B
m = slope; B = intercept
What's slope?

A slope of 2 means that every 1-unit change in X yields a 2-unit change in Y.
Prediction

If you know something about X, this knowledge helps you predict something about Y. (Sound familiar? Sounds like conditional probabilities?)
Regression equation:

E(yᵢ | xᵢ) = α + β xᵢ

Expected value of y at a given level of x.
Predicted value for an individual:

yᵢ = α + β xᵢ + random errorᵢ

The random error follows a normal distribution; the α + β xᵢ part is fixed, exactly on the line.
Assumptions (or the fine print)

Linear regression assumes that:
1. The relationship between X and Y is linear
2. Y is distributed normally at each value of X
3. The variance of Y at every value of X is the same (homogeneity of variances)
4. The observations are independent
The standard error of Y given X (sy/x) is the average variability around the regression line at any given value of X. It is assumed to be equal at all values of X.
Regression picture

Least squares estimation gave us the line (β̂) that minimized the total squared error, Σ C², where C is each observation's vertical distance from the line:

ŷᵢ = α̂ + β̂ xᵢ

Σᵢ₌₁ⁿ (yᵢ − ȳ)² = Σᵢ₌₁ⁿ (ŷᵢ − ȳ)² + Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)²

SStotal = SSreg + SSresidual

SStotal: total squared distance of observations from the naive mean of y (total variation)
SSreg: squared distance of the fitted values from the naive mean, i.e., the variability explained by x
SSresidual: variability around the regression line that is not explained by x; this is what the least squares method minimizes

R² = SSreg / SStotal
Recall example: cognitive function and vitamin D

Hypothetical data loosely based on a cross-sectional study of 100 middle-aged and older European men.¹ Cognitive function is measured by the Digit Symbol Substitution Test (DSST).

1. Lee DM, Tajar A, Ulubaev A, et al. Association between 25-hydroxyvitamin D levels and cognitive performance in middle-aged and older European men. J Neurol Neurosurg Psychiatry 2009 Jul;80(7):722-9.
Distribution of vitamin D

Mean = 63 nmol/L
Standard deviation = 33 nmol/L
Distribution of DSST

Normally distributed
Mean = 28 points
Standard deviation = 10 points
Four hypothetical datasets

I generated four hypothetical datasets, with increasing TRUE slopes (between vit D and DSST):
- 0.0 points per 10 nmol/L
- 0.5 points per 10 nmol/L
- 1.0 points per 10 nmol/L
- 1.5 points per 10 nmol/L
Dataset 1: no relationship
Dataset 2: weak relationship
Dataset 3: weak to moderate relationship
Dataset 4: moderate relationship
(scatter plots)
The "best fit" line (dataset 1)
Regression equation: E(Yᵢ) = 28 + 0.0 × vitDᵢ (in 10 nmol/L)

The "best fit" line (dataset 2)
Regression equation: E(Yᵢ) = 26 + 0.5 × vitDᵢ (in 10 nmol/L)
Note how the line is a little deceptive; it draws your eye, making the relationship appear stronger than it really is!

The "best fit" line (dataset 3)
Regression equation: E(Yᵢ) = 22 + 1.0 × vitDᵢ (in 10 nmol/L)

The "best fit" line (dataset 4)
Regression equation: E(Yᵢ) = 20 + 1.5 × vitDᵢ (in 10 nmol/L)

Note: all the lines go through the point (63, 28)!
Estimating the intercept and slope: least squares estimation

A little calculus… What are we trying to estimate? β, the slope. What's the constraint? We are trying to minimize the squared distance (hence the "least squares") between the observations themselves and the predicted values, ŷ (the differences are also called the "residuals," or left-over unexplained variability).
Resulting formulas:

Slope (beta coefficient): β̂ = Cov(x, y) / Var(x)

Intercept: calculate α̂ = ȳ − β̂ x̄

The regression line always goes through the point (x̄, ȳ).
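These formulas are easy to verify numerically (a sketch; the five-point arrays are made-up illustration data, and `np.polyfit` serves as the built-in cross-check):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)  # slope = Cov(x,y)/Var(x)
alpha = y.mean() - beta * x.mean()                      # line passes through (x-bar, y-bar)

# Same answer as NumPy's least-squares fit
b_fit, a_fit = np.polyfit(x, y, 1)
assert np.isclose(beta, b_fit) and np.isclose(alpha, a_fit)
print(round(beta, 4), round(alpha, 4))  # 0.6 2.2
```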
Relationship with correlation

r̂ = β̂ · (SDx / SDy)

In correlation, the two variables are treated as equals. In regression, one variable is considered the independent (= predictor) variable (X) and the other the dependent (= outcome) variable (Y).
Example: dataset 4

β̂ = SSxy / SSxx

SDx = 33 nmol/L
SDy = 10 points
Cov(x, y) = 163 points·nmol/L
Beta = 163 / 33² = 0.15 points per nmol/L = 1.5 points per 10 nmol/L
r = 163 / (10 × 33) = 0.49
Or: r = 0.15 × (33 / 10) = 0.49
Significance testing: slope

Distribution of the slope: β̂ ~ T(n−2)(β, s.e.(β̂))

H0: β1 = 0 (no linear relationship)
H1: β1 ≠ 0 (linear relationship does exist)

Test statistic: (β̂ − 0) / s.e.(β̂) ~ T(n−2)
Formula for the standard error of beta (you will not have to calculate by hand!):

ŷᵢ = α̂ + β̂ xᵢ  and  SSx = Σᵢ₌₁ⁿ (xᵢ − x̄)²

s²(y/x) = Σᵢ₌₁ⁿ (yᵢ − ŷᵢ)² / (n − 2)

s.e.(β̂) = √( s²(y/x) / SSx )
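A numeric sketch of these formulas (same made-up five-point dataset as in the earlier illustrations, not the lecture's vitamin D data):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])
n = len(x)

beta = np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)
alpha = y.mean() - beta * x.mean()
y_hat = alpha + beta * x

s2 = np.sum((y - y_hat) ** 2) / (n - 2)  # residual variance s^2(y/x)
ss_x = np.sum((x - x.mean()) ** 2)       # SSx
se_beta = np.sqrt(s2 / ss_x)             # standard error of the slope

t = beta / se_beta
print(round(se_beta, 3), round(t, 2))  # 0.283 2.12
```

Note that this t statistic (2.12 on n − 2 = 3 df) is identical to the one obtained earlier from testing r: in simple regression, testing the slope and testing the correlation are the same test.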
Example: dataset 4

Standard error (beta) = 0.03
T98 = 0.15 / 0.03 = 5, p < .0001
95% confidence interval = 0.09 to 0.21
Residual analysis: check assumptions

The residual for observation i, eᵢ, is the difference between its observed and predicted value:

eᵢ = yᵢ − ŷᵢ

Check the assumptions of regression by examining the residuals:
- Examine for the linearity assumption
- Examine for constant variance at all levels of X (homoscedasticity)
- Evaluate the normal distribution assumption
- Evaluate the independence assumption

Graphical analysis of residuals
Predicted values:

For vitamin D (dataset 4): ŷᵢ = 20 + 1.5 xᵢ (x in 10 nmol/L)
Residual = observed − predicted

êᵢ = yᵢ − ŷᵢ

(Worked on the scatter plot for a single observation: the residual is the vertical distance between the point and the fitted line.)
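Residuals can be computed in one line once the fit is in hand (a sketch on the same made-up five-point data; one useful sanity check is that least-squares residuals sum to zero whenever the model has an intercept):

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.0, 4.0, 5.0, 4.0, 5.0])

slope, intercept = np.polyfit(x, y, 1)
y_hat = intercept + slope * x   # predicted values
residuals = y - y_hat           # observed minus predicted

# With an intercept in the model, least-squares residuals always sum to zero
assert np.isclose(residuals.sum(), 0.0)
print(np.round(residuals, 2))
```

Plotting `residuals` against `x` is exactly the residual-analysis plot the next slides describe.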
Residual analysis for linearity (panels: not linear, with a curved pattern in the residuals, vs. linear, with randomly scattered residuals)

Slide from: Statistics for Managers Using Microsoft® Excel, 4th Edition, 2004, Prentice-Hall
Residual analysis for homoscedasticity (panels: non-constant variance vs. constant variance)

Slide from: Statistics for Managers Using Microsoft® Excel, 4th Edition, 2004, Prentice-Hall
Residual analysis for independence (panels: not independent vs. independent)

Slide from: Statistics for Managers Using Microsoft® Excel, 4th Edition, 2004, Prentice-Hall
Residual plot, dataset 4
Multiple linear regression:

What if age is a confounder here?
- Older men have lower vitamin D
- Older men have poorer cognition

"Adjust" for age by putting age in the model:

DSST score = intercept + slope1 × vitamin D + slope2 × age
A different 3D view of the same data:
Fit a plane rather than a line…

On the plane, the slope for vitamin D is the same at every age; thus, the slope for vitamin D represents the effect of vitamin D when age is held constant.
Equation of the "best fit" plane:

DSST score = 53 + 0.003 × vitamin D (in 10 nmol/L) − 0.46 × age (in years)

P-value for vitamin D: >> .05
P-value for age: < .0001

Thus, the relationship with vitamin D was due to confounding by age!
Multiple linear regression

More than one predictor:

E(y) = α + β1·X + β2·W + β3·Z …

Each regression coefficient is the amount of change in the outcome variable that would be expected per one-unit change of the predictor, if all other variables in the model were held constant.
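The confounding story above can be sketched in simulation (all numbers are hypothetical and chosen only for illustration: cognition is made to depend on age alone, while vitamin D also falls with age):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
age = rng.uniform(40, 80, n)
vitd = 120 - age + rng.normal(0, 10, n)      # older men have lower vitamin D
dsst = 60 - 0.5 * age + rng.normal(0, 5, n)  # cognition depends on age only

# Unadjusted: vitamin D appears protective (positive slope)
b_unadj = np.polyfit(vitd, dsst, 1)[0]

# Adjusted: regress DSST on vitamin D AND age together
X = np.column_stack([np.ones(n), vitd, age])
coef, *_ = np.linalg.lstsq(X, dsst, rcond=None)

print(round(b_unadj, 2), round(coef[1], 2), round(coef[2], 2))
```

Once age is in the model, the vitamin D coefficient collapses toward zero while the age coefficient stays near its true value, mirroring the slide's point that the crude association was confounding by age.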
Functions of multivariate analysis:
- Control for confounders
- Test for interactions between predictors (effect modification)
- Improve predictions
A t-test is linear regression!

Divide vitamin D into two groups:
- Insufficient vitamin D (< 50 nmol/L)
- Sufficient vitamin D (≥ 50 nmol/L), reference group

We can evaluate these data with a t-test or a linear regression:

T98 = (32.54 − 40.07) / 2.17 = −3.46, p = .0008
8/20/2019 Correlation lecture
58/79
As a linear regression<
Parameter ````````````````Standard Variable Estimate Error t Value Pr > |t|
Intercept 40.07407 1.47511 27.17 .0001
insu!! "7.5#0$0 2.174%# "#.4$ 0.000&
#nterceptrepresents themean value inthe sufficientgroup.
%lope representsthe difference inmeans between thegroups. 0ifferenceis significant.
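That equivalence is easy to demonstrate (a sketch with made-up DSST scores, not the lecture's data; a 0/1 dummy variable encodes group membership):

```python
import numpy as np

suff = np.array([38.0, 42.0, 40.0, 41.0, 39.0])    # sufficient (reference) group
insuff = np.array([31.0, 34.0, 32.0, 33.0, 35.0])  # insufficient group

y = np.concatenate([suff, insuff])
dummy = np.array([0.0] * 5 + [1.0] * 5)            # 1 = insufficient

X = np.column_stack([np.ones(10), dummy])
(b0, b1), *_ = np.linalg.lstsq(X, y, rcond=None)

# Intercept = mean of the reference group; slope = difference in means
assert np.isclose(b0, suff.mean())
assert np.isclose(b1, insuff.mean() - suff.mean())
print(round(b0, 2), round(b1, 2))  # 40.0 -7.0
```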
ANOVA is linear regression!

Divide vitamin D into three groups:
- Deficient (< 25 nmol/L)
- Insufficient (≥ 25 and < 50 nmol/L)
- Sufficient (≥ 50 nmol/L), reference group

DSST = α (value for sufficient) + βinsufficient × (1 if insufficient) + βdeficient × (1 if deficient)

This is called "dummy coding," where multiple binary variables are created to represent being in each category (or not) of a categorical variable.
The picture: sufficient vs. insufficient, and sufficient vs. deficient (figure)
Results:

Parameter Estimates
                       Parameter    Standard
Variable         DF    Estimate     Error      t Value   Pr > |t|
Intercept        1     40.07407     1.47817    27.11     <.0001
deficient        1     -9.87407     3.73950    -2.64     0.0096
insufficient     1     -6.87963     2.33719    -2.94     0.0041

Interpretation:
The deficient group has a mean DSST 9.87 points lower than the reference (sufficient) group.
The insufficient group has a mean DSST 6.88 points lower than the reference (sufficient) group.
Other types of multivariate regression

- Multiple linear regression is for normally distributed outcomes
- Logistic regression is for binary outcomes
- Cox proportional hazards regression is used when time-to-event is the outcome
Common multivariate regression models.

Continuous outcome (example: blood pressure): linear regression.
blood pressure (mmHg) = α + βsalt × salt consumption (tsp/day) + βage × age (years) + βsmoker × ever smoker (yes = 1 / no = 0)
The slopes tell you how much the outcome variable increases for every 1-unit increase in each predictor.

Binary outcome (example: high blood pressure, yes/no): logistic regression.
ln(odds of high blood pressure) = α + βsalt × salt consumption (tsp/day) + βage × age (years) + βsmoker × ever smoker (yes = 1 / no = 0)
The odds ratios tell you how much the odds of the outcome increase for every 1-unit increase in each predictor.

Time-to-event outcome (example: time-to-death): Cox regression.
ln(rate of death) = α + βsalt × salt consumption (tsp/day) + βage × age (years) + βsmoker × ever smoker (yes = 1 / no = 0)
The hazard ratios tell you how much the rate of the outcome increases for every 1-unit increase in each predictor.
Multivariate regression pitfalls:
- Multi-collinearity
- Residual confounding
- Overfitting
Multicollinearity

Multicollinearity arises when two variables that measure the same thing or similar things (e.g., weight and BMI) are both included in a multiple regression model; they will, in effect, cancel each other out and generally destroy your model.

Model building and diagnostics are tricky business!
Residual confounding

You cannot completely wipe out confounding simply by adjusting for variables in multiple regression unless variables are measured with zero error (which is usually impossible).

Example: meat eating and mortality.
Men who eat a lot of meat are unhealthier for many reasons!

Sinha R, Cross AJ, Graubard BI, Leitzmann MF, Schatzkin A. Meat intake and mortality: a prospective study of over half a million people. Arch Intern Med 2009;169(6):562-71.
Mortality risks (figure from: Sinha R, Cross AJ, Graubard BI, Leitzmann MF, Schatzkin A. Meat intake and mortality: a prospective study of over half a million people. Arch Intern Med 2009;169(6):562-71.)
Overfitting

In multivariate modeling, you can get highly significant but meaningless results if you put too many predictors in the model. The model is fit perfectly to the quirks of your particular sample, but has no predictive ability in a new sample.
Overfitting: class data example

I asked SAS to automatically find predictors of optimism in our class dataset. Here's the resulting linear regression model:

                 Parameter   Standard
Variable         Estimate    Error      Type II SS   F Value   Pr > F
Intercept        11.80175    2.98       11.96067     15.65     0.0019
exercise         -0.29106    0.09798    6.74569      8.82      0.0117
sleep            -1.91592    0.39494    17.98818     23.53     0.0004
obama            1.73993     0.24352    39.01944     51.05     <.0001
If something seems too good to be true…

Clinton, univariate: the single predictor "Clinton" also comes out as a significant predictor of optimism on its own.

More univariate models:

Obama, univariate, and mathLove, univariate: each of these single predictors likewise comes out significant on its own.
Overfitting

Pure noise variables still produce good R² values if the model is overfitted. The figure shows the distribution of R² values from a series of simulated regression models containing only noise variables. (Figure 1 from: Babyak MA. What You See May Not Be What You Get: A Brief, Nontechnical Introduction to Overfitting in Regression-Type Models. Psychosomatic Medicine 2004;66:411-421.)

Rule of thumb: you need at least 10 subjects for each additional predictor variable in the multivariate regression model.
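The noise-variable phenomenon is easy to reproduce (a sketch; 15 "subjects" and 12 pure-noise predictors, all values simulated, deliberately violating the 10-subjects-per-predictor rule of thumb):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p = 15, 12                  # 15 subjects, 12 pure-noise predictors
y = rng.normal(size=n)         # outcome is pure noise too
X = np.column_stack([np.ones(n), rng.normal(size=(n, p))])

beta, *_ = np.linalg.lstsq(X, y, rcond=None)
y_hat = X @ beta
r2 = 1 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

# Noise predictors alone still yield a large R-squared when n/p is this small
print(round(r2, 2))
```

Nothing here is related to anything, yet the in-sample R² is large; that is exactly the overfitting the Babyak figure illustrates.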
Review of statistical tests

The following table gives the appropriate choice of a statistical test or measure of association for various types of data (outcome variables and predictor variables) by study design, e.g., blood pressure = pounds + age + treatment (1/0).

Types of variables to be analyzed (predictor variable(s) / outcome variable) → statistical procedure or measure of association

Cross-sectional / case-control studies:
- Binary / Continuous → T-test
- Categorical (>2 groups) / Continuous → ANOVA
- Continuous / Continuous → Simple linear regression
- Multivariate (categorical and continuous) / Continuous → Multiple linear regression
- Categorical / Categorical → Chi-square test (or Fisher's exact)
- Binary / Binary → Odds ratio, risk ratio
- Multivariate / Binary → Logistic regression

Cohort studies / clinical trials:
- Binary / Binary → Risk ratio
- Categorical / Time-to-event → Kaplan-Meier, log-rank test
- Multivariate / Time-to-event → Cox proportional hazards regression, hazard ratio
- Binary (two groups) / Continuous → T-test
- Binary / Ranks or ordinal → Wilcoxon rank-sum test
- Categorical / Continuous → Repeated measures ANOVA
- Multivariate / Continuous → Mixed models; GEE modeling
8/20/2019 Correlation lecture
76/79
Alternative summary:statistics for various types ofoutcome data
OutcomeVariable
Are the observations independentor correlated?
Assumptions
independent correlated
Continuous
(e g pain scale!cognitive function)
8testA&OVALinear correlationLinear regression
,aired ttest/epeated'measuresA&OVA
iEed modelsPWIImodeling
Outcome isnormally distributed
(important for smallsamples)Outcome andpredictor have alinear relationship
Rinary orcategorical(e g fracture
;i erence inproportions/elative ris9sChi's#uare test
c&emar-s testConditional logisticregressionWII modeling
Chi's#uare testassumes su.cientnumbers in each
cell (2 J)
Continuous outcome (means)

Outcome variable: continuous (e.g., pain scale, cognitive function)

Are the observations independent or correlated?

Independent:
- T-test: compares means between two independent groups
- ANOVA: compares means between more than two independent groups
- Pearson's correlation coefficient (linear correlation): shows linear correlation between two continuous variables
- Linear regression: multivariate regression technique used when the outcome is continuous

Correlated:
- Paired t-test: compares means between two related groups (e.g., the same subjects before and after)
- Repeated-measures ANOVA: compares changes over time in the means of two or more groups (repeated measurements)
- Mixed models/GEE modeling: multivariate regression techniques to compare changes over time between two or more groups

Alternatives if the normality assumption is violated (and small sample size): non-parametric statistics
- Wilcoxon sign-rank test: non-parametric alternative to the paired t-test
- Wilcoxon sum-rank test (= Mann-Whitney U test): non-parametric alternative to the t-test
- Kruskal-Wallis test: non-parametric alternative to ANOVA
- Spearman rank correlation coefficient: non-parametric alternative to Pearson's correlation coefficient
Binary or categorical outcomes (proportions)

Outcome variable: binary or categorical (e.g., fracture, yes/no)

Are the observations correlated?

Independent:
- Chi-square test: compares proportions between two or more groups
- Relative risks: odds ratios or risk ratios
- Logistic regression: multivariate technique used when outcome is binary; gives multivariate-adjusted odds ratios

Correlated:
- McNemar's chi-square test: compares binary outcome between correlated groups (e.g., before and after)
- Conditional logistic regression: multivariate regression technique for a binary outcome when groups are correlated (e.g., matched data)
- GEE modeling: multivariate regression technique for a binary outcome when groups are correlated (e.g., repeated measures)

Alternative to the chi-square test if sparse cells:
- Fisher's exact test: compares proportions between independent groups when there are sparse data (some cells < 5)
- McNemar's exact test: compares proportions between correlated groups when there are sparse data (some cells < 5)
Time-to-event outcome (survival data)

Outcome variable: time-to-event (e.g., time to fracture)

Are the observation groups independent or correlated?

Independent:
- Kaplan-Meier statistics: estimates survival functions for each group (usually displayed graphically); compares survival functions with the log-rank test
- Cox regression: multivariate technique for time-to-event data; gives multivariate-adjusted hazard ratios

Correlated: n/a (already over time)

Modifications to Cox regression if the proportional-hazards assumption is violated:
- Time-dependent predictors or time-dependent hazard ratios (tricky!)