Sample Size Calculations for the Rate of Changes in Repeated Measures Designs Chul Ahn, Ph.D. UT...


Sample Size Calculations for the Rate of Changes in Repeated Measures Designs

Chul Ahn, Ph.D.

UT Southwestern Medical Center at Dallas

(Joint work with Sinho Jung at Duke)

Normal outcomes

1. Univariate summary statistics

Kirby et al. (1994), Overall and Doyle (1994)

2. Univariate split-plot ANOVA

Bloch (1986), Lui and Cumberland (1992)

3. Hotelling's $T^2$

Vonesch and Schork (1986), Rochon (1991)

4. Multivariate ANOVA

Muller and Barton (1989), Muller et al. (1992)

Binary Outcomes

1. Extension of univariate split-plot model

Lui (1999)

2. Weighted least squares

Rochon (1989)

Lipsitz and Fitzmaurice (1994)

Liu and Liang (1997): Score test, no closed form formula except for some special cases

Rochon (1998): Wald test {Pan (2001): Z-test, a special case of the Wald test; SAS and S-Plus use the Wald test}

Jung and Ahn (2003, 2004, 2005)

Ahn and Jung (2003, 2005): Z-test

Dahmen et al. (2004)

Kim et al. (2005)

Other Approaches

Hedeker et al. (1999)

Yi and Panzarella (2002)

Gastanaga et al. (2006)

Tu et al. (2004, 2007)

Problem formulation

Diggle et al. (2002): “Correlation between repeated observations affects the sample size estimates in a different way depending on the problem.”

Leon (2004) and Rochon (1998): As correlation increases, the required sample size increases when comparing group averages.

Jung and Ahn (2003, 2005) and Ahn and Jung (2005) show that this may not be the case when comparing the rates of change over time within subjects.

GEE (Jung and Ahn, 2003, 2005)

A closed form formula for sample size and power for comparing the rate of changes between two groups

Sample size can be computed using a scientific calculator

GEE for continuous outcomes

Let $y_{ij}$ be a continuous variable at measurement time $t_{ij}$ ($j = 1, \ldots, K_i$) for subject $i$.

Let $r_i = 0$ for the control group and $r_i = 1$ for the experimental group.

We assume the data are missing completely at random (MCAR)

$\beta_4$ is the parameter of interest
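A model consistent with these definitions is the linear model with a group-by-time interaction, sketched below. The exact form is an assumption, since the model equation is not given in the text above; the coefficients $\beta_1, \ldots, \beta_4$ and the error term $\varepsilon_{ij}$ match the symbols used in the rest of the talk.

```latex
y_{ij} = \beta_1 + \beta_2 r_i + \beta_3 t_{ij} + \beta_4 r_i t_{ij} + \varepsilon_{ij}
```

Under this form, $\beta_3$ is the slope in the control group and $\beta_4$ is the difference in slopes between the experimental and control groups, so testing $\beta_4 = 0$ compares the rates of change.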

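The GEE estimator $b$ solves the estimating equation $S_n(b) = 0$

$\sqrt{n}(b - \beta)$ is approximately normal with mean 0 and variance $\Sigma_n = A_n^{-1} V_n A_n^{-1}$

Reject $H_0: \beta_4 = 0$ if the absolute value of $\sqrt{n}\, b_4 / \hat{\sigma}_4$ is larger than $z_{1-\alpha/2}$,

where $\hat{\sigma}_4^2$ is the (4,4)-component of $\Sigma_n$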

Sample size estimation

Sample size estimate to detect $H_1: \beta_4 = \beta_{40}$ with a two-sided $\alpha$ test and power $1-\gamma$

Assume that the visits are either made at scheduled times or missing, and the missing probability depends on measurement time only.

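Let $A$ and $V$ denote the limits of $A_n$ and $V_n$. Then $\Sigma_n$ converges to $\Sigma = A^{-1} V A^{-1}$

Let $\sigma_4^2$ denote the (4,4) component of $\Sigma$

Then, the required sample size is $n = \sigma_4^2 (z_{1-\alpha/2} + z_{1-\gamma})^2 / \beta_{40}^2$

We need to derive the expressions of $A$ and $V$ for $\sigma_4^2$ to calculate the sample size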

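Let $\delta_{ij} = 0$ for a missing observation, and $\delta_{ij} = 1$ otherwise

Under MCAR, $(\delta_{i1}, \ldots, \delta_{iK})$ is independent of $(y_{i1}, \ldots, y_{iK})$

Let the visit times be fixed: $(t_1, \ldots, t_K)$

Let $\sigma^2 = \mathrm{var}(\varepsilon_{ij})$, $\rho_{jj'} = \mathrm{corr}(\varepsilon_{ij}, \varepsilon_{ij'})$,

$p_j = E(\delta_{ij}) = P(\text{observation at } t_j)$,

$p_{jj'} = E(\delta_{ij}\delta_{ij'}) = P(\text{observation at both } t_j \text{ and } t_{j'})$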

Sample Size Formula

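$\sigma_4^2$ is the (4,4) component of $\Sigma = A^{-1} V A^{-1}$

$\sigma_4^2 = \sigma^2 s_t^2 / (\mu_0^2 \sigma_r^2 \sigma_t^4)$,

where $\sigma_t^2 = \mu_2 - \mu_1^2$

The required sample size is given by $n = \sigma_4^2 (z_{1-\alpha/2} + z_{1-\gamma})^2 / \beta_{40}^2$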

Note that we do not have to specify the true values for β1, β2, and β3 in sample size calculation for testing β4

Calculation of $\sigma_4^2$ requires projection of the missing probabilities and the true correlation structure

As special cases, we consider two missing patterns: independent missing ($p_{jj'} = p_j p_{j'}$) and monotone missing ($p_{jj'} = p_{j'}$ for $j < j'$)

Any correlation structure can be used. The commonly used structures are AR(1), with $\rho_{jj'} = \rho^{|j-j'|}$, and compound symmetry (CS, exchangeable), with $\rho_{jj'} = \rho$ for $j \neq j'$
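The two missing patterns and the two correlation structures can be written out as small matrix builders. This is a minimal sketch: the function names are hypothetical, indices are 0-based, and the monotone case assumes the marginal probabilities $p_j$ are non-increasing over visits.

```python
def joint_observation_probs(p, pattern):
    """Return the K x K matrix of p_jj' given marginal observation probabilities p."""
    K = len(p)
    if pattern == "independent":
        # p_jj' = p_j * p_j' off the diagonal; p_jj = p_j because delta is 0/1
        return [[p[j] if j == k else p[j] * p[k] for k in range(K)]
                for j in range(K)]
    if pattern == "monotone":
        # p_jj' = p_j' for j < j': a subject seen at the later visit was seen earlier too
        return [[p[max(j, k)] for k in range(K)] for j in range(K)]
    raise ValueError("pattern must be 'independent' or 'monotone'")


def correlation_matrix(K, rho, structure):
    """Return the K x K matrix of rho_jj' under AR(1) or compound symmetry."""
    if structure == "AR1":
        return [[rho ** abs(j - k) for k in range(K)] for j in range(K)]
    if structure == "CS":
        return [[1.0 if j == k else rho for k in range(K)] for j in range(K)]
    raise ValueError("structure must be 'AR1' or 'CS'")


P_mono = joint_observation_probs([1.0, 0.9, 0.8], "monotone")
P_ind = joint_observation_probs([1.0, 0.9, 0.8], "independent")
R_ar1 = correlation_matrix(3, 0.5, "AR1")
R_cs = correlation_matrix(3, 0.5, "CS")
```

Both builders return plain nested lists, so the entries can be plugged directly into a double sum over $j$ and $j'$.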

The sample size calculation can be done easily with a scientific calculator

Example

Davis (1991, SIM)

83 women in labor were randomized to receive a pain medication (43 women) or placebo (40 women). The amount of pain was self-reported (0 = no pain, 100 = extreme pain)

$K = 6$, maximum number of measurements

Monotone missing pattern

Sample size calculation

From the data, we got $\sigma^2 = 815.84$

$H_1: \beta_{40} = 5.71$ in a new study

Assign equal numbers of subjects to each group: $\sigma_r^2 = 0.25$ ($= r(1-r)$)

Proportion of observed measurements: $(p_1, \ldots, p_6) = (1, 0.9, 0.78, 0.67, 0.54, 0.41)$

From these, we get $\mu_0 = 4.31$, $\mu_1 = 2.02$, $\mu_2 = 6.73$, $\sigma_t^2 = 2.65$

Under CS, we get $\rho = 0.64$ and $s_t^2 = 8.30$ from the data

We need $n = 67$ to detect $\beta_{40} = 5.71$ with $\alpha = 0.05$ and 90% power

Under AR(1), we get $\rho = 0.80$ and $s_t^2 = 13.73$ from the data

We need $n = 111$ to detect $\beta_{40} = 5.71$ with $\alpha = 0.05$ and 90% power
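The calculation above can be retraced with a short script. This is a sketch under stated assumptions, not the authors' code: the slides do not define $\mu_0$, $\mu_1$, $\mu_2$ or $s_t^2$, so the script assumes $\mu_0 = \sum_j p_j$, $\mu_1 = \sum_j p_j t_j / \mu_0$, $\mu_2 = \sum_j p_j t_j^2 / \mu_0$ and $s_t^2 = \sum_j \sum_{j'} p_{jj'} \rho_{jj'} (t_j - \mu_1)(t_{j'} - \mu_1)$, and it assumes visit times $t = 0, 1, \ldots, 5$. These choices were made because they match the reported summary values up to rounding.

```python
from math import ceil
from statistics import NormalDist


def slope_sample_size(t, p, rho, structure, sigma2, beta40,
                      r=0.5, alpha=0.05, power=0.90):
    """Sample size for testing beta_4 under monotone missingness (sketch)."""
    K = len(t)
    mu0 = sum(p)                                        # assumed: sum of p_j
    mu1 = sum(pj * tj for pj, tj in zip(p, t)) / mu0    # assumed: weighted mean time
    mu2 = sum(pj * tj ** 2 for pj, tj in zip(p, t)) / mu0
    sigma_t2 = mu2 - mu1 ** 2

    def corr(j, k):
        if j == k:
            return 1.0
        return rho ** abs(j - k) if structure == "AR1" else rho  # otherwise CS

    # Monotone missing: p_jj' is the observation probability at the later visit.
    s_t2 = sum(p[max(j, k)] * corr(j, k) * (t[j] - mu1) * (t[k] - mu1)
               for j in range(K) for k in range(K))

    sigma_r2 = r * (1 - r)
    sigma4_2 = sigma2 * s_t2 / (mu0 ** 2 * sigma_r2 * sigma_t2 ** 2)
    z = NormalDist().inv_cdf
    n = sigma4_2 * (z(1 - alpha / 2) + z(power)) ** 2 / beta40 ** 2
    return s_t2, ceil(n)


t = [0, 1, 2, 3, 4, 5]                  # assumed visit times, not given in the slides
p = [1, 0.9, 0.78, 0.67, 0.54, 0.41]    # proportions of observed measurements
st2_cs, n_cs = slope_sample_size(t, p, 0.64, "CS", 815.84, 5.71)
st2_ar, n_ar = slope_sample_size(t, p, 0.80, "AR1", 815.84, 5.71)
print(n_cs, n_ar)
```

With these inputs the sketch gives $n = 67$ under CS and $n = 111$ under AR(1), with $s_t^2$ values within rounding of the reported 8.30 and 13.73.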

Simulation study

With the same ρ value, sample size under AR(1) is larger than that under CS for testing the rates of changes between two groups

A conservative approach is to use AR(1)

With the same ρ value, sample size under CS is larger than that under AR(1) when comparing marginal means between two groups

A conservative approach is to use CS (Rochon, 1998)

K group comparisons

Jung and Ahn (2004)

Two group comparisons can be extended to K (K≥3) group comparisons

Use of non-central chi-square distribution

Increase n or m?

Ahn and Jung (2004)

Efficiency of the slope estimator in repeated measurements

Relative benefit of adding subjects ($n$) versus adding measurements ($m$) on a specified fixed study period $[0, T]$

Both $n$ and $m$ affect the standard error of the $\beta_4$ estimate

Given $m$, let $g(m) = n^{1/2}\, \mathrm{se}(\hat{\beta})$

The effect on $\mathrm{se}(\hat{\beta})$ of an increase from $m$ to $m+1$ is the same as that of an increase from $n$ to $n'$, where $n'$ satisfies

$g(m+1)/n^{1/2} = g(m)/(n')^{1/2}$

That is, $n' = n\{g(m)/g(m+1)\}^2$

True correlation, CS

Under no missing data, $p_j = p_{jj'} = 1$,

$\sigma_m^2 = 12\sigma^2(1-\rho)m / \{(m+1)(m+2)T^2\}$

$\sigma_{m+1}/\sigma_m$ does not depend on ρ in the complete data case, while it depends on ρ in the missing data case

Adding one more measurement in $[0, T]$ is equivalent to adding $n(m-1)/(m+1)^2$ more subjects in the complete data case.

That is, we can reduce the sample size by $n(m-1)/\{m(m+3)\}$ patients by adding one more assessment to achieve the same precision in the complete data case
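The two equivalences follow from the ratio of $\sigma_m^2$ to $\sigma_{m+1}^2$, which can be checked with exact fractions. This is a small sketch that assumes $g(m)^2 = \sigma_m^2$, so that $n' = n\,\sigma_m^2/\sigma_{m+1}^2$; the function names and the value $n = 100$ are hypothetical.

```python
from fractions import Fraction


def variance_factor(m):
    """sigma_m^2 divided by the common factor 12 * sigma^2 * (1 - rho) / T^2."""
    return Fraction(m, (m + 1) * (m + 2))


def equivalent_subjects(n, m):
    """n' subjects at m measurements give the same precision as n subjects at m + 1."""
    return n * variance_factor(m) / variance_factor(m + 1)


def subjects_saved(n, m):
    """Subjects that can be dropped when moving from m to m + 1 measurements."""
    return n - n * variance_factor(m + 1) / variance_factor(m)


n = 100  # hypothetical starting sample size
for m in range(2, 7):
    extra = equivalent_subjects(n, m) - n
    saved = subjects_saved(n, m)
    print(m, float(extra), float(saved))
```

Because the common factor cancels in the ratio, neither quantity depends on $\sigma^2$, $\rho$ or $T$ in the complete data case, and the gain from one more measurement shrinks as $m$ grows.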

When the number of measurements increases from $m$ to $m+1$, the relative reduction in the standard error of the slope is

$\{\mathrm{se}(\hat{\beta}_m) - \mathrm{se}(\hat{\beta}_{m+1})\}/\mathrm{se}(\hat{\beta}_m)$

Effect of dropout on sample size estimate

Monotone missing

Let $N$ be the estimated total sample size under no missing data, and $q$ be the proportion of dropout at the end of the study

Can we estimate the sample size using $N/(1-q)$?

Dropout patterns

Binary Repeated Measurements

Jung and Ahn (2005, SIM)

$g(p_{kij}) = a_k + b_k t_{kij}$, where $g(p) = \log\{p/(1-p)\}$

$p_{kij}(a_k, b_k) = g^{-1}(a_k + b_k t_{kij}) = \exp(a_k + b_k t_{kij}) / \{1 + \exp(a_k + b_k t_{kij})\}$

Closed-form sample size formula can be derived in a similar way as we did for continuous outcomes

Sample size to test $H_1: |b_1 - b_2| = d$: $n = (z_{1-\alpha/2} + z_{1-\beta})^2 (v_1/r_1 + v_2/r_2)/d^2$

Steps for sample size calculation

1. Choose type I error $\alpha$ and power $1-\beta$

2. Schedule measurement times $(t_1, \ldots, t_m)$

3. Choose allocation proportions $r_1$ and $r_2$

4. Given $p_{k1}$ and $p_{km}$, calculate $(a_k, b_k)$ and $p_{kj}$. Set $d = b_2 - b_1$

5. Specify non-missing proportions $(\delta_1, \ldots, \delta_m)$, and a missing pattern for $\delta_{jj'}$

6. Specify the true correlation structure and the associated correlation parameter $\rho$

7. Calculate the variance $v_k$ and the sample size $n$

Example

75% of scleroderma patients do not have pulmonary fibrosis at baseline in the ongoing GENOSIS trial

A new clinical trial will examine the effect of a new drug in preventing the occurrence of pulmonary fibrosis

Presence or absence of pulmonary fibrosis will be assessed at baseline, and at months 6, 12, 18, 24 and 30.

Compare the occurrence of pulmonary fibrosis from baseline to 30 months for placebo versus a new drug

Within-group correlation structure: AR(1) with $\rho = 0.8$, $\rho_{jj'} = 0.8^{|j-j'|}$

Assign equal numbers of patients to each group, $r_1 = r_2 = 0.5$

We project that the proportion of subjects without pulmonary fibrosis is $p_{11} = 0.75$ at baseline and $p_{16} = 0.5$ at 30 months in the placebo group

We assume that the new therapy will prevent further occurrence of pulmonary fibrosis

That is, $p_{21} = p_{26} = 0.75$

$b_1 = \{g(0.5) - g(0.75)\}/(6-1) = -0.220$

$a_1 = g(0.75) = 1.099$

Similarly, we obtain $(a_2, b_2) = (1.099, 0)$

So, $d = 0 - (-0.220) = 0.220$

The probabilities of no pulmonary fibrosis can be estimated from the logistic regression equation:

(0.750, 0.707, 0.659, 0.608, 0.555, 0.500) for the placebo group

(0.750, 0.750, 0.750, 0.750, 0.750, 0.750) for the treatment group

The proportions of observed measurements are expected to be $(\delta_1, \ldots, \delta_6) = (1.0, 0.95, 0.90, 0.85, 0.80, 0.75)$

Suppose that we expect independent missing

Now, we have all the parameter values needed to compute the sample size $n$

From the parameters, we obtain $v_1 = 0.305$ and $v_2 = 0.353$

Finally, $n = (1.96 + 0.84)^2 (0.305/0.5 + 0.353/0.5)/0.220^2 = 214$
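The arithmetic of this example can be retraced as follows. This is a sketch only: it assumes the six visits are coded $t = 0, 1, \ldots, 5$ (which gives $a_1 = g(0.75)$ and a divisor of $6 - 1$), it takes $v_1$ and $v_2$ from the values quoted above rather than recomputing them, and the helper names are hypothetical.

```python
from math import ceil, exp, log
from statistics import NormalDist


def logit(p):
    return log(p / (1 - p))


def expit(x):
    return exp(x) / (1 + exp(x))


def line_through(p_first, p_last, t):
    """Intercept a and slope b of logit(p) = a + b * t from the first and last visits."""
    b = (logit(p_last) - logit(p_first)) / (t[-1] - t[0])
    a = logit(p_first) - b * t[0]
    return a, b


t = [0, 1, 2, 3, 4, 5]                 # assumed coding of baseline and months 6 to 30
a1, b1 = line_through(0.75, 0.50, t)   # placebo group
a2, b2 = line_through(0.75, 0.75, t)   # new drug group
d = b2 - b1
placebo = [round(expit(a1 + b1 * tj), 3) for tj in t]

v1, v2 = 0.305, 0.353                  # quoted variances, not recomputed here
r1 = r2 = 0.5
z = NormalDist().inv_cdf
# alpha = 0.05 two-sided, power = 80%
n = ceil((z(0.975) + z(0.80)) ** 2 * (v1 / r1 + v2 / r2) / d ** 2)
print(round(a1, 3), round(b1, 3), placebo, n)
```

Using unrounded normal quantiles and slope, the sketch still arrives at $n = 214$, and the fitted placebo probabilities agree with the listed values to three decimals.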

Software for sample size estimate

GEESIZE version 3.1

http://www.imbs.uni-luebeck.de/pub/Geesize/

“GEESIZE computes the minimum sample size in studies with correlated response data based on GEE. These correlated response data arise e.g. in repeated measurement designs, family studies or studies involving paired organs.”

RMASS2: Repeated measures with attrition: sample size for 2 groups

http://tigger.uic.edu/~hedeker/ml.html
