Statistical Consulting Topics
Generalized Estimating Equations (GEEs)
• GEEs are generally used as a method to handle correlated data under the generalized linear model framework.
– Longitudinal logistic regression
– Longitudinal Poisson regression
• GEEs utilize a quasi-likelihood rather than a formal likelihood approach.
Quasi-likelihood
A quasi-likelihood does not fully specify a distribution (unlike common exponential families such as the normal or binomial, which have a known distributional ‘shape’).
For independent data, a quasi-likelihood models the mean as a function of the covariates and assumes the variance is a function of the mean (akin to the idea of first and second moments).
Quasi-likelihoods are often associated with the phrase overdispersion. For example, take the Poisson distribution, where the variance and mean are equal (vi = µi). In a quasi-Poisson, the variance need only be proportional to the mean (vi = φµi).
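To make the quasi-Poisson idea concrete, here is a small numeric sketch (Python, not part of the original notes): a scaled Poisson variable 2X, with X ~ Poisson(µ/2), has mean µ but variance 2µ, so a Pearson-style estimate of the dispersion φ comes out near 2 instead of 1.

```python
import numpy as np

rng = np.random.default_rng(0)
n, mu = 200_000, 5.0

y_pois = rng.poisson(mu, size=n)          # Poisson: var = mu, so phi = 1
y_over = 2 * rng.poisson(mu / 2, size=n)  # mean mu, var 2*mu, so phi = 2

def phi_hat(y, mu):
    """Pearson-style dispersion estimate: average of (y - mu)^2 / mu."""
    return np.mean((y - mu) ** 2 / mu)

print(round(phi_hat(y_pois, mu), 2))  # close to 1
print(round(phi_hat(y_over, mu), 2))  # close to 2
```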
For correlated data, the same quasi-likelihood framework is employed, but we also specify a “working” correlation matrix for the repeated observations on each subject.
• Why use GEEs? Even in the binary case where likelihood analysis is possible, computation is difficult (Stiratelli et al., 1984).
Generalized Linear Models (short intro)
• Generalized linear models (GLMs) for exponential families were introduced by Nelder and Wedderburn (1972) and given a comprehensive treatment by McCullagh and Nelder (1989).
• GLMs have a wide range of uses, but one common use is to model a response variable that is dichotomous (Bernoulli or binomial) or non-negative discrete (Poisson) with a regression model.
• GLMs connect the response variable to the predictors through a link function, often denoted g. The mean µi depends on the independent variables in the following manner...
g(µi) = g(E[Yi | xi]) = x′iβ
– Poisson regression with log link
ln (λi) = β0 + β1x1i + · · · + βkxki
– Logistic regression with a logit link
ln( P(Y = 1) / P(Y = 0) ) = β0 + β1x1i + · · · + βkxki
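As a quick numeric check of the link-function idea (a Python sketch; the function names are mine), the link maps the mean to the linear predictor and the inverse link maps back:

```python
import numpy as np

def logit(mu):
    # logit link: g(mu) = ln(mu / (1 - mu)), used for Bernoulli/binomial means
    return np.log(mu / (1 - mu))

def expit(eta):
    # inverse logit: mu = exp(eta) / (1 + exp(eta))
    return np.exp(eta) / (1 + np.exp(eta))

eta = 0.3 + 1.2 * 0.5          # linear predictor b0 + b1*x for one observation
mu = expit(eta)                # mean on the probability scale
print(np.isclose(logit(mu), eta))    # link/inverse-link round trip: True

# log link for Poisson regression: lambda = exp(eta), so ln(lambda) = eta
lam = np.exp(eta)
print(np.isclose(np.log(lam), eta))  # True
```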
• The method of GEEs was developed to extend the GLM framework to accommodate correlation between observations using a quasi-likelihood approach.
• Why? Because when the responses are not approximately multivariate normal (e.g. binary with correlation), likelihood-based methods are less tractable.
• GEEs assume that the mean and variance are characterized as in the GLM (i.e. the same as if there were independent observations), but the covariance between observations is also modeled.
• GEEs don’t assume a specific exponential family for the dependent variable (because the method utilizes quasi-likelihood).
• Seminal paper:
Zeger and Liang (Biometrics, 1986).
Their equations are extensions of those used in quasi-likelihood methods (Wedderburn, 1974, Biometrika).
Zeger and Liang write:
“We specify that a known function of the marginal expectation of the dependent variate is a linear function of the covariates, and assume that the variance is a known function of the mean.”
• GEEs are meant to characterize marginal expectations of the response as a function of independent variables.
“The average response for observations sharing the same covariates.”
• GEEs are used as a ‘means to an end’. We’re still trying to get the ‘best’ parameter estimates because it is these parameters that connect our covariates to our mean structure.
• The method of GEEs is robust to misspecification of the ‘working’ correlation matrix and provides consistent and asymptotically normal parameter estimates.
• Most software that I’m familiar with for fitting a GLM with correlation uses GEEs to estimate parameters in a meaningful way.
– PROC GENMOD in SAS with the REPEATED statement invokes GEEs.
– The geepack package in R.
• Example: Binary longitudinal data
Let Yij be the observation for the ith subject at the jth timepoint, j = 1, . . . , t (longitudinal)...
g(E[Yij | xij]) = x′ijβ
ln( E[Yij | xij] / (1 − E[Yij | xij]) ) = x′ijβ
which implies
E[Yij | xij] = µij = exp(x′ijβ) / (1 + exp(x′ijβ))
and if we assume the binomial distribution...
var[Yij | xij] = vij = exp(x′ijβ) / (1 + exp(x′ijβ))²
(the mean and variance are tied together)
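A one-line check (Python sketch) that the variance expression above is just µij(1 − µij) in disguise:

```python
import numpy as np

xb = np.linspace(-3, 3, 13)                      # a grid of linear-predictor values
mu = np.exp(xb) / (1 + np.exp(xb))               # logistic mean
v_formula = np.exp(xb) / (1 + np.exp(xb)) ** 2   # variance as written above
print(np.allclose(v_formula, mu * (1 - mu)))     # True: variance equals mu*(1-mu)
```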
In addition to specifying the mean and variance, we specify the covariance of the t observations on a subject.
Letting Vi represent the t × t variance-covariance matrix for subject i, GEEs use the following structure...
Vi = φ Ai^(1/2) Ri(α) Ai^(1/2)
where Ai is a diagonal matrix of variance functions v(µij), Ri(α) is the working correlation matrix (not covariance matrix) indexed by a vector of parameters α, and φ is a dispersion parameter.
Ri(α) can take on a common form, such as AR(1), unstructured, or even a diagonal matrix suggesting independence, if so chosen.
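This construction can be sketched directly (Python; the subject size, means, and parameter values below are invented for illustration):

```python
import numpy as np

t, phi, rho = 4, 1.5, 0.6
mu = np.array([0.2, 0.4, 0.5, 0.7])       # hypothetical logistic means over t visits

v = mu * (1 - mu)                         # binomial variance function v(mu_ij)
A_half = np.diag(np.sqrt(v))              # A_i^(1/2): sqrt of the variance functions

# AR(1) working correlation: entry (j, k) is rho^|j-k|
j = np.arange(t)
R = rho ** np.abs(j[:, None] - j[None, :])

V = phi * A_half @ R @ A_half             # V_i = phi * A^(1/2) R(alpha) A^(1/2)
print(np.allclose(np.diag(V), phi * v))   # diagonal recovers phi * v(mu_ij): True
```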
• Consider 4 observations over time; one option for the correlation is AR(1)...
Ri(ρ) =
[ 1    ρ    ρ²   ρ³ ]
[ ρ    1    ρ    ρ² ]
[ ρ²   ρ    1    ρ  ]
[ ρ³   ρ²   ρ    1  ]
In this case, if there were constant variance, v(µij) = σ², and φ = 1, then
Vi = σ² ×
[ 1    ρ    ρ²   ρ³ ]
[ ρ    1    ρ    ρ² ]
[ ρ²   ρ    1    ρ  ]
[ ρ³   ρ²   ρ    1  ]
and the ‘overall’ variance-covariance structure would be block diagonal (with the blocks Vi down the diagonal), assuming independence between subjects.
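A small sketch (Python, toy values) of that block-diagonal structure, with zeros between observations from different subjects:

```python
import numpy as np

def ar1_corr(t, rho):
    """AR(1) correlation matrix with entries rho^|j-k|."""
    j = np.arange(t)
    return rho ** np.abs(j[:, None] - j[None, :])

t, rho, sigma2 = 4, 0.5, 2.0
Vi = sigma2 * ar1_corr(t, rho)            # constant-variance, phi = 1 case above

n_subjects = 3
V_all = np.zeros((n_subjects * t, n_subjects * t))
for i in range(n_subjects):
    # independence between subjects: off-diagonal blocks stay zero
    V_all[i*t:(i+1)*t, i*t:(i+1)*t] = Vi

# covariance between observations from different subjects is zero
print(V_all[0, t])   # 0.0
```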
• Zeger and Liang refer to Ri(α) as the “working” correlation matrix because it is not required to be correctly specified to get consistent estimators of the regression parameters (and, via the robust variance estimator, consistent standard errors).
If your working correlation matrix is closer to the truth, you’ll have more efficient estimators.
• A set of estimating equations is solved through an iterative process to get β̂ (as is also the case for GLMs with exponential families, which use iteratively reweighted least squares (IRLS) to find the MLEs).
– Get initial estimates of β (perhaps via GLM)
– Compute working correlations Ri(α)
– Compute estimate of covariance matrix Vi
– Update β
– Iterate until convergence
• An empirical estimator called the “sandwich” or “robust” estimator is used to estimate the covariance matrix of β̂, var(β̂).
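Putting the pieces together, here is a simplified Python sketch of the iteration and the sandwich estimator, using an independence working correlation and φ = 1 on simulated clustered binary data. This is an illustration under those assumptions, not the full Zeger and Liang algorithm (in particular, no working-correlation parameters α are estimated), and all names are mine.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sub, t = 200, 4                         # subjects and timepoints
beta_true = np.array([-0.5, 1.0])

# design matrix: intercept + one covariate per observation
X = np.column_stack([np.ones(n_sub * t), rng.normal(size=n_sub * t)])
groups = np.repeat(np.arange(n_sub), t)

# correlated binary responses via a shared subject effect (data generation only)
u = np.repeat(rng.normal(scale=0.7, size=n_sub), t)
p = 1 / (1 + np.exp(-(X @ beta_true + u)))
y = rng.binomial(1, p)

beta = np.zeros(2)
for _ in range(25):                       # Fisher scoring on the estimating equations
    mu = 1 / (1 + np.exp(-(X @ beta)))
    W = mu * (1 - mu)                     # v(mu); independence working correlation
    score = X.T @ (y - mu)                # sum_i D_i' V_i^{-1} (y_i - mu_i)
    info = X.T @ (W[:, None] * X)         # sum_i D_i' V_i^{-1} D_i ("bread")
    step = np.linalg.solve(info, score)
    beta = beta + step
    if np.max(np.abs(step)) < 1e-10:
        break

# sandwich (robust) covariance: bread^{-1} @ meat @ bread^{-1},
# with the "meat" built from per-subject score contributions
mu = 1 / (1 + np.exp(-(X @ beta)))
meat = np.zeros((2, 2))
for i in range(n_sub):
    idx = groups == i
    s_i = X[idx].T @ (y[idx] - mu[idx])   # subject i's estimating-function value
    meat += np.outer(s_i, s_i)
bread_inv = np.linalg.inv(X.T @ ((mu * (1 - mu))[:, None] * X))
cov_robust = bread_inv @ meat @ bread_inv

print(beta.round(2), np.sqrt(np.diag(cov_robust)).round(3))
```

With a non-diagonal working correlation, the score and information would be assembled per subject from D′i V⁻¹i blocks instead of the pooled X′W forms used here.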
Fitting models using SAS:
[Graphic found in Cerrito (2006).]
Example: Depression scores (high/low) over time for two groups of women (placebo/treatment).
Placebo group (0, n=27)
Treatment group (1, n=34)
Observations taken every month for 6 months.
Pre-dichotomized data:
[Scatterplot matrix of the pre-dichotomized depression scores dep1 through dep6.]
> cor(dp, use="complete")
dep1 dep2 dep3 dep4 dep5 dep6
dep1 1.0000 0.4982 0.5258 0.3933 0.3674 0.2795
dep2 0.4982 1.0000 0.8672 0.7357 0.7500 0.6900
dep3 0.5258 0.8672 1.0000 0.7831 0.8520 0.7967
dep4 0.3933 0.7357 0.7831 1.0000 0.8449 0.7894
dep5 0.3674 0.7500 0.8520 0.8449 1.0000 0.9014
dep6 0.2795 0.6900 0.7967 0.7894 0.9014 1.0000
proc genmod data=dp descending;
  class Group Subject Visit;
  model DepCat = Group Visit / d=binomial covb;
  repeated subject=Subject / corrw type=AR(1);
  lsmeans Group / pdiff;
run;
The SAS System
The GENMOD Procedure
Model Information
Data Set WORK.DP
Distribution Binomial
Link Function Logit
Dependent Variable DepCat
Number of Observations Read 366
Number of Observations Used 295
Number of Events 157
Number of Trials 295
Missing Values 71
Criteria For Assessing Goodness Of Fit
Criterion DF Value Value/DF
Deviance 288 377.0009 1.3090
Scaled Deviance 288 377.0009 1.3090
Pearson Chi-Square 288 295.1153 1.0247
Scaled Pearson X2 288 295.1153 1.0247
Log Likelihood -188.5004
Working Correlation Matrix
Col1 Col2 Col3 Col4 Col5 Col6
Row1 1.0000 0.5975 0.3570 0.2133 0.1274 0.0761
Row2 0.5975 1.0000 0.5975 0.3570 0.2133 0.1274
Row3 0.3570 0.5975 1.0000 0.5975 0.3570 0.2133
Row4 0.2133 0.3570 0.5975 1.0000 0.5975 0.3570
Row5 0.1274 0.2133 0.3570 0.5975 1.0000 0.5975
Row6 0.0761 0.1274 0.2133 0.3570 0.5975 1.0000
Analysis Of GEE Parameter Estimates
Empirical Standard Error Estimates
Standard 95% Confidence
Parameter Estimate Error Limits Z Pr > |Z|
Intercept -0.1373 0.3003 -0.7259 0.4513 -0.46 0.6476
Group 0 1.2970 0.4175 0.4787 2.1153 3.11 0.0019
Group 1 0.0000 0.0000 0.0000 0.0000 . .
Visit 1 0.2439 0.3107 -0.3651 0.8530 0.78 0.4325
Visit 2 -0.2590 0.3113 -0.8691 0.3511 -0.83 0.4054
Visit 3 -0.2630 0.2737 -0.7995 0.2735 -0.96 0.3366
Visit 4 -0.0382 0.3448 -0.7140 0.6377 -0.11 0.9119
Visit 5 -0.1409 0.2933 -0.7158 0.4340 -0.48 0.6310
Visit 6 0.0000 0.0000 0.0000 0.0000 . .
Least Squares Means
Standard Chi-
Effect Group Estimate Error DF Square Pr > ChiSq
Group 0 1.0835 0.3411 1 10.09 0.0015
Group 1 -0.2135 0.2348 1 0.83 0.3633
Differences of Least Squares Means
Standard Chi-
Effect Group _Group Estimate Error DF Square Pr > ChiSq
Group 0 1 1.2970 0.4175 1 9.65 0.0019
• References:
Zeger, S.L., and K.-Y. Liang (1986). Longitudinal data analysis for discrete and continuous outcomes. Biometrics, 42:121-130.
Cerrito, P.B. (2006). From GLM to GLIMMIX - which model to choose? Paper SP10 in Proceedings of the SAS Users Group (PharmaSUG).
Stiratelli, R., Laird, N., and Ware, J.H. (1984). Random-effects models for serial observations with binary response. Biometrics, 40:961-971.