
Statistical Analysis of Longitudinal Data

Ziad Taib

Biostatistics, AZ

April 2011


Outline of lecture 1

1. An introduction

2. Two examples

3. Principles of Inference

4. Modelling continuous longitudinal data


Part 1: An introduction


Why longitudinal data?

Longitudinal data are very useful in their own right. They also give us the possibility of understanding what mixed models are about in a relatively simple yet sufficiently rich context.

___________________________________

A good reference is the book "Designing Experiments and Analyzing Data" by Maxwell & Delaney (2004).


Longitudinal Data

Repeated measures are obtained when a response is measured repeatedly on a set of units.

• Units:

• Subjects, patients, participants, . . .

• Individuals, plants, . . .

• Clusters: nests, families, towns, . . .

• . . .

• Special case: longitudinal data

Note: it is possible to handle several levels.


A motivating example

Consider a randomized clinical trial with two treatment groups and repeated measurements at baseline and 3 and 6 months later. As it turned out, some of the data were missing. Moreover, patients did not always comply with the time requirements. Our first reaction is to try to compensate for the missing values by some kind of imputation, or to use list-wise deletion.

Since both "methods" have their shortcomings, wouldn't it be nice to be able to use something else? There is in fact an alternative: mixed models.

With mixed models,

1. we can use all our data, with the attitude that "what is missing is missing";

2. we can even account for the dependencies resulting from measurements made on the same individuals at different times;

3. we don't need to be consistent about time.

[Figure: treatment groups A and B measured at baseline, 3 months and 6 months]
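To make the contrast concrete, here is a minimal sketch on a small invented data set stored in long format, one row per available measurement. The subject IDs, visit times and response values are all hypothetical. The code only counts how much data each strategy retains; it does not fit a mixed model.

```python
# Hypothetical long-format records:
# (subject, treatment, months_since_baseline, response). All values are invented.
records = [
    ("s1", "A", 0.0, 10.1), ("s1", "A", 3.0, 9.4), ("s1", "A", 6.0, 8.9),
    ("s2", "A", 0.0, 11.0), ("s2", "A", 3.5, 10.2),           # 6-month visit missing
    ("s3", "B", 0.0, 10.5), ("s3", "B", 2.8, 10.4), ("s3", "B", 6.3, 10.6),
    ("s4", "B", 0.0, 9.8),                                      # dropped out after baseline
]
PLANNED_VISITS = 3


def by_subject(rows):
    """Group (time, response) pairs by subject."""
    grouped = {}
    for subject, _treatment, month, value in rows:
        grouped.setdefault(subject, []).append((month, value))
    return grouped


def listwise_deletion(rows):
    """Keep only rows from subjects who attended every planned visit."""
    grouped = by_subject(rows)
    complete = {s for s, obs in grouped.items() if len(obs) == PLANNED_VISITS}
    return [row for row in rows if row[0] in complete]


kept = listwise_deletion(records)
n_complete_case = len(kept)     # rows left after list-wise deletion
n_all_available = len(records)  # rows a long-format (mixed model) analysis can use

print("rows used after list-wise deletion:", n_complete_case)
print("rows available in long format:", n_all_available)
```

In this layout the irregular visit times (3.5, 2.8, 6.3 months) are simply stored as they occurred, which is the sense in which the analysis does not need to be consistent about time.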


Mixed effects models

Ordinary fixed-effects linear models usually assume:

1) independent observations with the same variance,

2) normally distributed errors,

3) constant parameters.

If we modify assumptions 1) and 3), the problem becomes more complicated, and in general we need a large number of parameters just to describe the covariance structure of the observations. Mixed effects models deal with this type of problem.

In general, this type of model allows us to tackle problems such as clustered data, repeated measures and hierarchical data.

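$$Y = X\beta + \varepsilon, \qquad \varepsilon \text{ is } N(0, \sigma^2 I), \qquad \beta \text{ constant},$$

$$\begin{pmatrix} Y_1 \\ \vdots \\ Y_n \end{pmatrix} = \begin{pmatrix} 1 & x_1 \\ \vdots & \vdots \\ 1 & x_n \end{pmatrix} \begin{pmatrix} \beta_0 \\ \beta_1 \end{pmatrix} + \begin{pmatrix} \varepsilon_1 \\ \vdots \\ \varepsilon_n \end{pmatrix}$$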


Various forms of models and relation between them

LM (linear model). Assumptions:

1. independence,

2. normality,

3. constant parameters

GLM (generalised linear model): assumption 2) is replaced by an exponential family distribution

LMM (linear mixed model): assumptions 1) and 3) are modified

GLMM (generalised linear mixed model): assumption 2) is replaced by an exponential family distribution, and assumptions 1) and 3) are modified

Repeated measures (including longitudinal data): assumptions 1) and 3) are modified

Non-linear models

Approaches to inference: maximum likelihood; classical statistics (observations are random, parameters are unknown constants); Bayesian statistics


Part 2: Two examples

• Rat data

• Prostate data


Example 1: Rat Data (Verbeke et al.)

Research question: How does craniofacial growth in the Wistar rat depend on testosterone production?


Simplified (univariate) response


• Randomized experiment in which 50 male Wistar rats are randomized to:

• Control (15 rats)

• Low dose of Decapeptyl (18 rats)

• High dose of Decapeptyl (17 rats)

Decapeptyl prevents the production of testosterone.

• Treatment starts at the age of 45 days.

• Measurements are taken every 10 days, from day 50 on.

• The responses are distances (in pixels) between two well-defined points on x-ray pictures of the skull of each rat. Here, we consider only one response, reflecting the height of the skull.

[Timeline: age in days, with treatment starting at day 45 and measurements at days 50, 60, 70, 80, ...]


Individual profiles:

1. Connected profiles are better than scatter plots.

2. Growth is expected, but is it linear?

3. Of interest: change over time (i.e. the relationship between response and age).


Complication: many dropouts due to anaesthesia, which imply less power but no bias.

Without dropouts the problem would be easier, because the data would be balanced.


Remarks:

• Much variability between rats

• Much less variability within rats

• Fixed number of measurements scheduled per subject, but not all measurements available due to dropout, for a known reason

• Measurements taken at fixed time points

Research question: How does craniofacial growth in the Wistar rat depend on testosterone production?


Example 2: The BLSA Prostate Data


Example 2: The BLSA Prostate Data (Pearson et al., Statistics in Medicine, 1994).

• Prostate disease is one of the most common and most costly medical problems in the world.

• It is important to look for biomarkers that can detect the disease at an early stage.

• Prostate-Specific Antigen (PSA) is an enzyme produced by both normal and cancerous prostate cells. It is believed that the PSA level is related to the volume of prostate tissue.

• Problem: patients with Benign Prostatic Hyperplasia (BPH) also have an increased PSA level.

• The overlap in the PSA distributions for cancer and BPH cases seriously complicates the detection of prostate cancer.


Research question: Can longitudinal PSA profiles be used to detect prostate cancer at an early stage?

A retrospective case-control study based on frozen serum samples:

• 16 control patients

• 20 BPH cases

• 14 local cancer cases

• 4 metastatic cancer cases


Individual profiles:


Remarks:

• Much variability between subjects

• Little variability within subjects

• Highly unbalanced data

Research question: Can longitudinal PSA profiles be used to detect prostate cancer at an early stage?


Part 3: Principles of Inference


Fisher's likelihood: inference for observable $y$ and fixed parameter $\theta$

Data generation: given a stochastic model $f_\theta(y)$, generate data $y$ from $f_\theta(y)$.

Parameter estimation: given the data $y$, make inference about $\theta$ by using the likelihood $L(\theta \mid y)$.

Connection between the two processes:

$$L(\theta \mid y) \equiv f_\theta(y)$$
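As a small illustration of reading the same function in two directions (as a density in the data for a fixed parameter, and as a likelihood in the parameter for fixed data), the sketch below assumes a normal model with known standard deviation 1 and an unknown mean. The data values and the search grid are hypothetical.

```python
import math


def normal_log_density(y, mean, sd=1.0):
    """Log of the normal density: a function of y for a fixed mean."""
    return -0.5 * math.log(2.0 * math.pi * sd * sd) - (y - mean) ** 2 / (2.0 * sd * sd)


def log_likelihood(mean, data):
    """Same expression read as a function of the mean, with the data held fixed."""
    return sum(normal_log_density(y, mean) for y in data)


data = [4.2, 5.1, 3.8, 4.9]                 # hypothetical observations
grid = [i / 100.0 for i in range(0, 1001)]  # candidate means 0.00, 0.01, ..., 10.00

# Maximum likelihood estimate restricted to the grid.
mle_on_grid = max(grid, key=lambda m: log_likelihood(m, data))
sample_mean = sum(data) / len(data)

print("grid maximiser of the likelihood:", mle_on_grid)
print("sample mean:", sample_mean)
```

For this model the likelihood is maximised at the sample mean, so the two printed values agree up to the grid resolution.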


(Classical) Likelihood Principle

Birnbaum (1962): All the evidence or information about the parameters in the data is in the likelihood.

Conditionality principle & sufficiency principle ⇒ likelihood principle


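Bayesian inference for observable $y$ and unobservable $\nu$

Data generation: generate data according to

1. $\nu$, from the prior $f(\nu)$;

2. for $\nu$ fixed, generate $y$ from $f(y \mid \nu)$.

Combine into $f(\nu)\, f(y \mid \nu)$.

Parameter estimation: given the data $y$, make inference about $\nu$ by using the posterior $f(\nu \mid y)$.

The connection between the two processes:

$$f(\nu)\, f(y \mid \nu) \equiv f(y)\, f(\nu \mid y)$$

Compare with $L(\theta \mid y)$.

Both sides equal the joint density:

$$f(\nu \mid y) = \frac{f(y, \nu)}{f(y)}, \qquad f(y, \nu) = f(\nu)\, f(y \mid \nu) = f(y)\, f(\nu \mid y)$$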
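The identity "prior times conditional density equals marginal times posterior" can be checked numerically. The sketch below assumes, purely for illustration, that the unobservable is a success probability with a Beta(2, 2) prior and that the observable is a binomial count; the numbers are hypothetical. It normalises prior times likelihood on a grid and compares the result with the known conjugate posterior Beta(a + k, b + n - k).

```python
import math


def beta_pdf(p, a, b):
    """Density of a Beta(a, b) distribution at p, for 0 < p < 1."""
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return math.exp(log_norm + (a - 1) * math.log(p) + (b - 1) * math.log(1 - p))


def binom_pmf(k, n, p):
    """Probability of k successes in n trials with success probability p."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)


a, b = 2.0, 2.0   # hypothetical prior parameters
n, k = 10, 7      # hypothetical data: 7 successes in 10 trials

# Midpoints of 1000 equal cells on (0, 1), used for a midpoint-rule integral.
grid = [(i + 0.5) / 1000 for i in range(1000)]

joint = [beta_pdf(p, a, b) * binom_pmf(k, n, p) for p in grid]  # prior * likelihood
marginal = sum(joint) / len(grid)                               # approximates f(y)
posterior_grid = [j / marginal for j in joint]                  # joint / marginal

posterior_exact = [beta_pdf(p, a + k, b + n - k) for p in grid]
max_abs_diff = max(abs(u - v) for u, v in zip(posterior_grid, posterior_exact))

print("largest gap between grid and conjugate posterior:", max_abs_diff)
```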


Extended likelihood inference (Lee and Nelder) for observable $y$, fixed parameter $\theta$ and unobservable $\nu$


Parameter estimation: $L(\theta \mid y) \equiv f_\theta(y)$


Extended Likelihood Principle

Björnstad (1996): All information in the data about the unobservables and the parameters is in the "likelihood".

Conditionality principle & sufficiency principle ⇒ likelihood principle


Prediction: predict the number of seizures during the next week


Bayesian Predictive Inference

Given $\nu$, the observations $y$ are assumed to be independent. How do we predict the next value, $Y$, of the observable? In a Bayesian setting we may determine the posterior $f(\nu \mid y)$ and define the predictive density of $Y$ given $y$ as:

$$f_Y(x \mid y) = \int f_Y(x \mid \nu)\, f(\nu \mid y)\, d\nu$$

Note: Jeffreys' priors
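For a count outcome such as the weekly number of seizures, a predictive density can be written in closed form under a conjugate model. The sketch below is an assumption made for illustration, not the model used in the lecture: weekly counts are Poisson given an unobserved rate, and the rate has a Gamma(shape a, rate b) prior. The observed counts and prior values are hypothetical. Integrating the Poisson density against the Gamma posterior gives a negative binomial predictive distribution.

```python
import math


def posterior_params(counts, a, b):
    """Gamma posterior (shape, rate) after observing Poisson counts."""
    return a + sum(counts), b + len(counts)


def predictive_pmf(x, a_post, b_post):
    """Posterior predictive probability of a count x in the next period.

    Integrating Poisson(x | rate) against Gamma(rate | a_post, b_post)
    gives a negative binomial distribution.
    """
    log_p = (
        math.lgamma(a_post + x) - math.lgamma(a_post) - math.lgamma(x + 1)
        + a_post * math.log(b_post / (b_post + 1.0))
        + x * math.log(1.0 / (b_post + 1.0))
    )
    return math.exp(log_p)


weekly_seizures = [3, 5, 2, 4]  # hypothetical counts from four past weeks
a_post, b_post = posterior_params(weekly_seizures, a=1.0, b=1.0)

pmf = [predictive_pmf(x, a_post, b_post) for x in range(200)]
total_mass = sum(pmf)
predictive_mean = sum(x * p for x, p in enumerate(pmf))

print("posterior shape and rate:", a_post, b_post)
print("predictive mean for next week:", predictive_mean)
```

The predictive mean equals the posterior mean of the rate, a_post / b_post, while the predictive variance is larger than that of a Poisson with the same mean because uncertainty about the rate is carried into the prediction.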


Bayesian inference (Pearson, 1920)


Nelder and Lee (1996)

?


Part 4: A Model for Longitudinal Data


Introduction

In practice: often unbalanced data due to (i) unequal number of measurements per subject (ii) measurements not taken at fixed time points.

Therefore, ordinary multivariate regression techniques are often not applicable.

Often, subject-specific longitudinal profiles can be well approximated by linear regression functions. This leads to a 2-stage model formulation:

Stage 1: A linear (e.g. regression) model for each subject separately

Stage 2: Explain variability in the subject-specific (regression) coefficients using known covariates
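The two stages can be sketched naively as follows: fit a straight line to each subject separately, then summarise the fitted slopes by a known covariate (here, treatment group). The subject IDs, groups, times and responses are hypothetical, and the data are exactly linear so the output is easy to verify. Note that the subjects have different numbers of measurements at different times.

```python
def fit_line(times, values):
    """Stage 1: ordinary least squares intercept and slope for one subject."""
    n = len(times)
    t_bar = sum(times) / n
    y_bar = sum(values) / n
    sxx = sum((t - t_bar) ** 2 for t in times)
    sxy = sum((t - t_bar) * (y - y_bar) for t, y in zip(times, values))
    slope = sxy / sxx
    return y_bar - slope * t_bar, slope


def mean_slope_by_group(fits):
    """Stage 2: average the subject-specific slopes within each group."""
    totals = {}
    for group, (_intercept, slope) in fits:
        total, count = totals.get(group, (0.0, 0))
        totals[group] = (total + slope, count + 1)
    return {g: total / count for g, (total, count) in totals.items()}


# Hypothetical unbalanced data: subject -> (group, times, responses).
subjects = {
    "r1": ("control", [0.0, 1.0, 2.0], [1.0, 3.0, 5.0]),
    "r2": ("control", [0.0, 1.0, 3.0], [2.0, 5.0, 11.0]),
    "r3": ("treated", [0.0, 2.0], [1.0, 5.0]),
    "r4": ("treated", [0.0, 1.0, 2.0, 4.0], [0.0, 1.0, 2.0, 4.0]),
}

fits = [(group, fit_line(times, values)) for group, times, values in subjects.values()]
group_slopes = mean_slope_by_group(fits)
print(group_slopes)
```

This naive version gives every subject's estimate the same weight regardless of how many measurements it is based on, which is one reason to combine the two stages into a single model.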


A 2-stage Model Formulation: Stage 1

Response $Y_{ij}$ for the $i$th subject, measured at time $t_{ij}$, $i = 1, \ldots, N$, $j = 1, \ldots, n_i$ (possibly after some convenient transformation).

Response vector $Y_i$ for the $i$th subject:

$$Y_i = (Y_{i1}, Y_{i2}, \ldots, Y_{in_i})'$$

$$Y_i = Z_i \beta_i + \varepsilon_i, \qquad \varepsilon_i \sim N(0, \Sigma_i), \quad \text{often } \Sigma_i = \sigma^2 I_{n_i}$$

$Z_i$ is an $(n_i \times q)$ matrix of known covariates and $\beta_i$ is a $q$-dimensional vector of subject-specific regression coefficients.

Note that the above model describes the observed variability within subjects.


Stage 2

Between-subject variability can now be studied by relating the parameters $\beta_i$ to known covariates:

$$\beta_i = K_i \beta + b_i$$

$K_i$ is a $(q \times p)$ matrix of known covariates and $\beta$ is a $p$-dimensional vector of unknown regression parameters.

Finally,

$$b_i \sim N(0, D)$$


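The General Linear Mixed-effects Model

The two stages of the 2-stage approach can now be combined into one model:

$$Y_i = Z_i K_i \beta + Z_i b_i + \varepsilon_i = X_i \beta + Z_i b_i + \varepsilon_i, \qquad X_i = Z_i K_i$$

Here $X_i \beta$ is the average evolution and $Z_i b_i$ is the subject-specific part.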


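The general mixed effects model can be summarized by:

$$Y_i = X_i \beta + Z_i b_i + \varepsilon_i, \qquad b_i \sim N(0, D), \qquad \varepsilon_i \sim N(0, \Sigma_i),$$

with $b_1, \ldots, b_N, \varepsilon_1, \ldots, \varepsilon_N$ independent.

This is convenient with the multivariate normal distribution, and very difficult with other distributions.

Terminology:

• Fixed effects: $\beta$

• Random effects: $b_i$

• Variance components: elements in $D$ and $\Sigma_i$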


Remarks

1. It is occasionally unclear whether we should treat an effect as fixed or random. For example, in clinical trials with treatment and clinic as "factors", should we consider clinics as random?

2. Considering the general form of a mixed effects model

$$Y_i = X_i \beta + Z_i b_i + \varepsilon_i,$$

notice that the fixed effects are involved only in the mean values (just like in ordinary linear models), while the random effects modify the covariance matrix of the observations.


Example: The Rat Data


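Transformation of the time scale to linearize the profiles:

$$Age_{ij} \longrightarrow t_{ij} = \ln\left[1 + \frac{Age_{ij} - 45}{10}\right]$$

Note that $t = 0$ corresponds to the start of the treatment (moment of randomization).

• Stage 1 model:

$$Y_{ij} = \beta_{1i} + \beta_{2i} t_{ij} + \varepsilon_{ij}, \qquad j = 1, \ldots, n_i$$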
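The time transformation t = ln(1 + (Age - 45) / 10) is easy to tabulate. In the sketch below the function name and the list of ages are illustrative; the ages simply follow the stated design of one measurement every 10 days from day 50 on.

```python
import math


def transformed_time(age_days, start_age=45.0, scale=10.0):
    """Transformed time t = ln(1 + (age - start_age) / scale)."""
    return math.log(1.0 + (age_days - start_age) / scale)


# Illustrative measurement ages in days (every 10 days from day 50 on).
ages = [50, 60, 70, 80, 90, 100, 110]
times = [transformed_time(age) for age in ages]

for age, t in zip(ages, times):
    print(f"age {age:3d} days -> t = {t:.3f}")
```

The transformed time is 0 at age 45 days, the start of treatment, and it compresses later ages, so equal steps of 10 days correspond to progressively smaller steps in t.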


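Stage 1 in matrix notation:

$$Y_i = Z_i \beta_i + \varepsilon_i, \qquad Z_i = \begin{pmatrix} 1 & t_{i1} \\ 1 & t_{i2} \\ \vdots & \vdots \\ 1 & t_{in_i} \end{pmatrix}, \qquad \beta_i = \begin{pmatrix} \beta_{1i} \\ \beta_{2i} \end{pmatrix}$$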


Stage 2 model:

In the second stage, the subject-specific intercepts and time effects are related to the treatment of the rats
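One way to write this stage is sketched below, assuming the parameterization of Verbeke and Molenberghs (2000). The indicators $L_i$, $H_i$ and $C_i$ (low dose, high dose, control) are notation assumed here for illustration. The form is consistent with the comments made later in the lecture: equal average intercepts and different average slopes.

```latex
\begin{align*}
\beta_{1i} &= \beta_0 + b_{1i}, \\
\beta_{2i} &= \beta_1 L_i + \beta_2 H_i + \beta_3 C_i + b_{2i}.
\end{align*}
```

Under this sketch $\beta_0$ is the common average intercept, $\beta_1$, $\beta_2$ and $\beta_3$ are the average slopes in the three groups, and $b_{1i}$, $b_{2i}$ are the rat-specific deviations.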


The hierarchical versus the marginal Model

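The general mixed model is given by

$$Y_i = X_i \beta + Z_i b_i + \varepsilon_i, \qquad b_i \sim N(0, D), \qquad \varepsilon_i \sim N(0, \Sigma_i).$$

It can be written as

$$Y_i \mid b_i \sim N(X_i \beta + Z_i b_i, \Sigma_i), \qquad b_i \sim N(0, D).$$

It is therefore also called a hierarchical model.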


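The hierarchical model specifies the conditional density $f(y_i \mid b_i)$ and the random-effects density $f(b_i)$. The marginal density is

$$f(y_i) = \int f(y_i \mid b_i)\, f(b_i)\, db_i.$$

Marginally we have that $Y_i$ is distributed as

$$Y_i \sim N(X_i \beta,\; Z_i D Z_i' + \Sigma_i).$$

Hence the random effects enter the marginal model only through the covariance matrix $V_i = Z_i D Z_i' + \Sigma_i$.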
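The step from the hierarchical to the marginal formulation can be checked by simulation. The sketch below assumes a random intercept and random slope model with independent measurement errors of variance sigma2; the values of the fixed effects, the random-effects covariance matrix D and sigma2 are hypothetical. Profiles are generated hierarchically (first the random effects, then the responses), and the empirical covariance is compared with the marginal covariance d11 + d12 (tj + tk) + d22 tj tk, plus sigma2 on the diagonal.

```python
import math
import random


def simulate_profile(rng, times, beta, chol_d, sigma):
    """Draw (b1, b2) with covariance D = chol_d chol_d', then one response per time."""
    z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
    b1 = chol_d[0][0] * z1
    b2 = chol_d[1][0] * z1 + chol_d[1][1] * z2
    return [beta[0] + b1 + (beta[1] + b2) * t + rng.gauss(0.0, sigma) for t in times]


def sample_cov(profiles, j, k):
    """Empirical covariance between the responses at time index j and k."""
    n = len(profiles)
    mean_j = sum(p[j] for p in profiles) / n
    mean_k = sum(p[k] for p in profiles) / n
    return sum((p[j] - mean_j) * (p[k] - mean_k) for p in profiles) / (n - 1)


def theoretical_cov(times, d, sigma2, j, k):
    """Entry (j, k) of Z D Z' + sigma2 * I for design rows (1, t)."""
    value = d[0][0] + d[0][1] * (times[j] + times[k]) + d[1][1] * times[j] * times[k]
    return value + (sigma2 if j == k else 0.0)


rng = random.Random(2011)
times = [0.0, 1.0, 2.0]
beta = [10.0, 1.5]                              # hypothetical fixed effects
d = [[4.0, 1.0], [1.0, 2.0]]                    # hypothetical D
chol_d = [[2.0, 0.0], [0.5, math.sqrt(1.75)]]   # lower-triangular factor of D
sigma2 = 1.0

profiles = [
    simulate_profile(rng, times, beta, chol_d, math.sqrt(sigma2))
    for _ in range(20000)
]
empirical_var_t2 = sample_cov(profiles, 2, 2)
empirical_cov_t0_t2 = sample_cov(profiles, 0, 2)

print("variance at t = 2: empirical", empirical_var_t2,
      "theoretical", theoretical_cov(times, d, sigma2, 2, 2))
print("covariance between t = 0 and t = 2: empirical", empirical_cov_t0_t2,
      "theoretical", theoretical_cov(times, d, sigma2, 0, 2))
```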


Example: The Rat Data

Linear model where each rat has its own intercept and its own slope.

The subject-specific deviations can be negative or positive, reflecting individual deviation from the average.


Comments:

• Linear average evolution in each group

• Equal average intercepts

• Different average slopes

Notice that the model assumes that the variance function is quadratic over time.

Moreover, taking two measurements on the same rat at times $t_1$ and $t_2$, their covariance is:


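$$\begin{aligned}
\operatorname{Cov}(Y_i(t_1), Y_i(t_2))
&= \begin{pmatrix} 1 & t_1 \end{pmatrix} \operatorname{cov}(b_i) \begin{pmatrix} 1 \\ t_2 \end{pmatrix} + \operatorname{cov}(\varepsilon_{i1}, \varepsilon_{i2}) \\
&= \begin{pmatrix} 1 & t_1 \end{pmatrix} \begin{pmatrix} d_{11} & d_{12} \\ d_{12} & d_{22} \end{pmatrix} \begin{pmatrix} 1 \\ t_2 \end{pmatrix} + \operatorname{cov}(\varepsilon_{i1}, \varepsilon_{i2}) \\
&= \begin{pmatrix} d_{11} + d_{12} t_1 & d_{12} + d_{22} t_1 \end{pmatrix} \begin{pmatrix} 1 \\ t_2 \end{pmatrix} + \operatorname{cov}(\varepsilon_{i1}, \varepsilon_{i2}) \\
&= d_{22} t_1 t_2 + d_{12} t_1 + d_{12} t_2 + d_{11} + \operatorname{cov}(\varepsilon_{i1}, \varepsilon_{i2}) \\
&= d_{22} t_1 t_2 + d_{12}(t_1 + t_2) + d_{11} + \operatorname{cov}(\varepsilon_{i1}, \varepsilon_{i2})
\end{aligned}$$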
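Setting both time points equal to t in a random intercept and slope model gives the variance function d22 t^2 + 2 d12 t + d11 + sigma2, where d11, d12, d22 are the elements of the random-effects covariance matrix and sigma2 is the measurement error variance (assuming independent errors). The sketch below uses hypothetical values for these quantities and checks that the second differences over equally spaced times are constant, which is the signature of a quadratic function.

```python
def variance_function(t, d11, d12, d22, sigma2):
    """Marginal variance at time t under a random intercept and slope model."""
    return d22 * t * t + 2.0 * d12 * t + d11 + sigma2


# Hypothetical variance components.
d11, d12, d22, sigma2 = 4.0, 1.0, 2.0, 1.0

times = [0.0, 1.0, 2.0, 3.0, 4.0]
variances = [variance_function(t, d11, d12, d22, sigma2) for t in times]

# Second differences of a quadratic on a unit grid are constant and equal 2 * d22.
second_diffs = [
    variances[i + 2] - 2.0 * variances[i + 1] + variances[i]
    for i in range(len(variances) - 2)
]

print("variances:", variances)
print("second differences:", second_diffs)
```

Because d22 is a variance and therefore non-negative, the curvature is non-negative: a model with random slopes cannot describe a variance that first rises and then falls over time.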


The prostate data

A model for the prostate cancer data: Stage 1

$$Y_{ij} = \ln(PSA_{ij} + 1) = \beta_{1i} + \beta_{2i} t_{ij} + \beta_{3i} t_{ij}^2 + \varepsilon_{ij}, \qquad j = 1, \ldots, n_i$$


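The prostate data

A model for the prostate cancer data: Stage 2

$$\begin{aligned}
\beta_{1i} &= \beta_1 Age_i + \beta_2 C_i + \beta_3 B_i + \beta_4 L_i + \beta_5 M_i + b_{1i} \\
\beta_{2i} &= \beta_6 Age_i + \beta_7 C_i + \beta_8 B_i + \beta_9 L_i + \beta_{10} M_i + b_{2i} \\
\beta_{3i} &= \beta_{11} Age_i + \beta_{12} C_i + \beta_{13} B_i + \beta_{14} L_i + \beta_{15} M_i + b_{3i}
\end{aligned}$$

Age could not be matched.

$C_i$, $B_i$, $L_i$, $M_i$ are indicators of the classes: control, BPH, local or metastatic cancer. $Age_i$ is the subject's age at diagnosis. The parameters in the first row are the average intercepts for the different classes.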
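The role of the stage 2 covariate matrix can be made concrete. The sketch below assumes a stage 1 model that is quadratic in time (design rows 1, t, t^2) and a stage 2 model in which each of the three subject-specific coefficients is regressed on age and the four class indicators, giving 15 fixed effects. The function names, the age, the class and the measurement times are hypothetical. It builds the (3 x 15) stage 2 matrix and multiplies it by the stage 1 design to obtain the fixed-effects design matrix of the combined model.

```python
def stage2_design(age, group):
    """Stage 2 matrix (3 x 15): one block (Age, C, B, L, M) per stage 1 coefficient."""
    groups = ["control", "BPH", "local", "metastatic"]
    indicators = [1.0 if group == g else 0.0 for g in groups]
    block = [age] + indicators
    k = [[0.0] * 15 for _ in range(3)]
    for row in range(3):
        k[row][5 * row:5 * row + 5] = block
    return k


def stage1_design(times):
    """Stage 1 matrix (n_i x 3) for a quadratic in time."""
    return [[1.0, t, t * t] for t in times]


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [
        [sum(a[i][m] * b[m][j] for m in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


k_i = stage2_design(age=65.0, group="BPH")   # hypothetical subject
z_i = stage1_design([0.0, 2.0, 5.0])         # hypothetical measurement times
x_i = matmul(z_i, k_i)                       # fixed-effects design, 3 x 15

for row in x_i:
    print(row)
```

Each row of the product contains the age and class indicators multiplied by 1, t and t^2 in turn, so the 15 fixed effects act as class-specific and age-specific intercepts, linear time effects and quadratic time effects.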


The prostate data

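This gives the following model:

$$\begin{aligned}
Y_{ij} ={}& \beta_1 Age_i + \beta_2 C_i + \beta_3 B_i + \beta_4 L_i + \beta_5 M_i + b_{1i} \\
&+ (\beta_6 Age_i + \beta_7 C_i + \beta_8 B_i + \beta_9 L_i + \beta_{10} M_i + b_{2i})\, t_{ij} \\
&+ (\beta_{11} Age_i + \beta_{12} C_i + \beta_{13} B_i + \beta_{14} L_i + \beta_{15} M_i + b_{3i})\, t_{ij}^2 + \varepsilon_{ij}
\end{aligned}$$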


Stochastic components in general linear mixed model

[Figure: response versus time, showing the average evolution and the profiles of Subject 1 and Subject 2]


References

Aerts, M., Geys, H., Molenberghs, G., and Ryan, L.M. (2002). Topics in Modelling of Clustered Data. London: Chapman and Hall.

Brown, H. and Prescott, R. (1999). Applied Mixed Models in Medicine. New York: John Wiley & Sons.

Crowder, M.J. and Hand, D.J. (1990). Analysis of Repeated Measures. London: Chapman and Hall.

Davidian, M. and Giltinan, D.M. (1995). Nonlinear Models for Repeated Measurement Data. London: Chapman and Hall.

Davis, C.S. (2002). Statistical Methods for the Analysis of Repeated Measurements. New York: Springer-Verlag.

Diggle, P.J., Heagerty, P.J., Liang, K.Y. and Zeger, S.L. (2002). Analysis of Longitudinal Data (2nd edition). Oxford: Oxford University Press.


Fahrmeir, L. and Tutz, G. (2002). Multivariate Statistical Modelling Based on Generalized Linear Models (2nd edition). Springer Series in Statistics. New York: Springer-Verlag.

Goldstein, H. (1979). The Design and Analysis of Longitudinal Studies. London: Academic Press.

Goldstein, H. (1995). Multilevel Statistical Models. London: Edward Arnold.

Hand, D.J. and Crowder, M.J. (1995). Practical Longitudinal Data Analysis. London: Chapman and Hall.

Jones, B. and Kenward, M.G. (1989). Design and Analysis of Crossover Trials. London: Chapman and Hall.

Kshirsagar, A.M. and Smith, W.B. (1995). Growth Curves. New York: Marcel Dekker.

Lindsey, J.K. (1993). Models for Repeated Measurements. Oxford: Oxford University Press.

Longford, N.T. (1993). Random Coefficient Models. Oxford: Oxford University Press.


Pinheiro, J.C. and Bates, D.M. (2000). Mixed Effects Models in S and S-Plus. Springer Series in Statistics and Computing. New York: Springer-Verlag.

Searle, S.R., Casella, G., and McCulloch, C.E. (1992). Variance Components. New York: Wiley.

Senn, S.J. (1993). Cross-over Trials in Clinical Research. Chichester: Wiley.

Verbeke, G. and Molenberghs, G. (1997). Linear Mixed Models in Practice: A SAS Oriented Approach. Lecture Notes in Statistics 126. New York: Springer-Verlag.

Verbeke, G. and Molenberghs, G. (2000). Linear Mixed Models for Longitudinal Data. Springer Series in Statistics. New York: Springer-Verlag.

Vonesh, E.F. and Chinchilli, V.M. (1997). Linear and Non-linear Models for the Analysis of Repeated Measurements. Basel: Marcel Dekker.


Any Questions?

