Post on 25-May-2015
Clinical Study Design
Henrik Ekberg, MD, PhD
Malmö, Sweden
Associate Editor: American Journal of Transplantation 2003-
Editorial Board Member:
Transplantation 2004-
Transplant International 2004-
Clinical Transplantation 2008-
Journal of Transplantation 2008-
Guangzhou October 9, 2010
Rejection of submitted manuscript - various reasons
• Rejected on priority grounds: maybe a good study, but not a topic of interest, or done before
• Rejected, not allowed resubmission: a bad study; design problems that cannot be rewritten in a good way
• Rejected but allowed resubmission: no serious design problems, interesting topic, but needs to be rewritten for language, discussion, figures, tables, etc.
Study design alternatives
• Retrospective studies = using existing data from medical charts
  • Uncontrolled
  • Case-controlled
  • Hypothesis generating
• Prospective studies = protocol directives for Rx and F/u
  • Uncontrolled, one-arm, pilot
  • Randomized Controlled Trial (RCT)
  • Hypothesis testing
Clinical study design phases
• Phase 1
  • Drug action, metabolism, PK, PD, safety
• Phase 2
  • Limited (un)controlled study for efficacy and safety
• Phase 3
  • Large randomized multicenter study
  • Determine efficacy and safety for FDA and EMEA
• Phase 4
  • After drug release: new uses of the drug
  • Marketing
Key Elements of Trial Quality
• Hypothesis
• Appropriate population
• Clinically relevant achievement
• Adequately-powered
• End points
• Comparison group (placebo)
• Randomized
• Double-blind
• Intent-to-treat analysis
• Protocol
• Analysis plan
Key Elements of Trial Quality
Experimental hypothesis
May be based on a pilot or retrospective study, or on hopes for a new drug
Drug A > drug B (or placebo) with regard to …
Null hypothesis (H0): A = B (no difference), or A < B (non-inferiority)
Key Elements of Trial Quality
Appropriate population
Include: normal-risk kidney transplant recipients from living or deceased donors
Exclude: high-risk patients, such as
• PRA > 20% (50%?)
• Retransplants (all?)
• High donor age?
• Expanded donor criteria?
• Cold ischemia time?
• HLA-DR mismatch?
The Success
One-year graft survival > 90% and acute rejection rates < 20%

The Problem
With one-year graft survival > 90% and acute rejection rates < 20%, we have a high level of success, and further improvement is difficult to achieve and demonstrate.
We need very large studies!
End Points and Sample Size
Primary end point: the parameter on which
1. the hypothesis is based, to be verified or rejected
2. the sample size is calculated
Secondary end points: additional parameters which may
1. describe the patients, events and results
2. be used for formulation of new hypotheses
End Points and Sample Size
1. Select the primary end point.
2. Define the clinically relevant achievement regarding the end point = the difference between control and experimental groups, e.g.:
   GFR increased by 10 ml/min
   AR rate reduced by 10%
3. Determine the number of patients needed in each group to verify that the difference between the groups is most likely true (<5% risk of mistake).
4. Do this for a chosen power and p-value (significance level).
End Points and Sample Size
“We chose to study 313 patients in each group in order to have 80% power of detecting a 33% reduction in AR rate from a baseline rate of 30% with a significance level of 0.05”.
End point: acute rejection
Clinically relevant achievement: 33% reduction (from 30% to 20%)
Power: 80%
Significance level: 5%
Therefore, number of patients in each group: 313
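The step from "30% to 20%, 80% power, 5% significance" to "313 patients in each group" can be sketched in a few lines. The slides do not say which formula was used. The version below assumes a two-sided comparison of two proportions with the normal approximation and a continuity correction, which happens to reproduce the quoted figure. The function name and its parameters are illustrative only.

```python
from math import ceil, sqrt
from statistics import NormalDist


def sample_size_two_proportions(p_control, p_treatment, alpha=0.05, power=0.80):
    """Patients per group for a two-sided comparison of two proportions.

    Assumption: normal approximation with a continuity correction; the
    slides do not state which formula produced their numbers.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance level
    z_beta = NormalDist().inv_cdf(power)
    delta = abs(p_control - p_treatment)
    p_bar = (p_control + p_treatment) / 2

    # Uncorrected sample size per group.
    n = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p_control * (1 - p_control) + p_treatment * (1 - p_treatment))
    ) ** 2 / delta ** 2

    # Continuity correction, then round up to whole patients.
    n_corrected = n / 4 * (1 + sqrt(1 + 4 / (n * delta))) ** 2
    return ceil(n_corrected)


n_per_group = sample_size_two_proportions(0.30, 0.20)
print(n_per_group)  # 313
```

Other formulas (for example without the continuity correction, or one-sided) give somewhat different numbers, so the result should be read as an illustration of the calculation rather than a reconstruction of the original one.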
End Points and Sample Size
α = p-value = 0.05: a 5% risk of obtaining a group difference by chance (although there is no true difference).
β = 0.20 means a power of 80%: an 80% chance of obtaining a group difference and a 20% risk of missing it (when there is a true difference).
P1 = 0.30 and ∆ = 0.10 (33% of P1)
End Points and Sample Size
Question: If there is a true difference between the groups and we do 100 studies with 313 patients in each group, how many studies will result in a group difference, that is, at least a 33% reduction of AR?
1. 5 studies
2. 20 studies
3. 80 studies
4. 95 studies
End Points and Sample Size
Answer: 80 studies will show a significant difference and 20 studies will not.
Comment: a 20% risk of not seeing a true difference is quite high.
End Points and Sample Size
Question: If there is not a true difference between the groups and we do 100 studies with 313 patients in each group, how many studies will result in a group difference?
1. 5 studies
2. 20 studies
3. 80 studies
4. 95 studies
End Points and Sample Size
Answer: 5 studies will show a group difference although there is no true difference.
End Points and Sample Size
P-value = 5%: the risk of seeing a difference which is not true
Power = 80%: the chance of seeing a difference which is true
P1 = 0.30 and ∆ = 0.10 (33% of P1)
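The two "100 studies" questions can also be explored by simulation. The sketch below repeats a hypothetical two-arm study with 313 patients per group many times, once with a true difference (30% vs 20% AR) and once without (30% vs 30%), and counts how often a two-sided z-test reports p < 0.05. The choice of test, the number of simulated studies and the seed are assumptions made for illustration.

```python
import random
from math import sqrt
from statistics import NormalDist


def significant(events_a, events_b, n, alpha=0.05):
    """Two-sided pooled z-test for two proportions with n patients per group."""
    p_pooled = (events_a + events_b) / (2 * n)
    se = sqrt(2 * p_pooled * (1 - p_pooled) / n)
    if se == 0:
        return False
    z = (events_a - events_b) / n / se
    return abs(z) > NormalDist().inv_cdf(1 - alpha / 2)


def fraction_significant(p_control, p_treatment, n=313, studies=1000, seed=1):
    """Share of simulated studies that report a significant group difference."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(studies):
        control = sum(rng.random() < p_control for _ in range(n))
        treated = sum(rng.random() < p_treatment for _ in range(n))
        hits += significant(control, treated, n)
    return hits / studies


power_estimate = fraction_significant(0.30, 0.20)  # a true difference exists
false_positive = fraction_significant(0.30, 0.30)  # no true difference
print(f"significant with a true difference: {power_estimate:.0%}")
print(f"significant with no true difference: {false_positive:.0%}")
```

The first share should land near the 80% power and the second near the 5% significance level, with some scatter because the simulation is itself random.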
Sample Size for Acute Rejection (p = 0.05)

AR, treatment   AR, control (P1)   Power   Sample size
0.20            0.30               80%     313
0.15            0.20               80%     700
0.15            0.20               90%     954
We need to do large multicenter studies !!!
End Points and Sample Size
Question: The primary end point (PEP) and 10 secondary end points (SEP) were analysed, each SEP in two ways. The PEP was NS; one of the SEP was statistically significant (P<0.05).
Why is the analysis more reliable for the PEP than for the SEP?
Is this significant result of the SEP reliable?
10 x 2 = 20 tests. What is the probability of a “significant finding” by chance?
The trap of multiple tests

No. of independent tests                                    2       5       10      20       50
Probability of one or more p < 0.05 by chance               10%     23%     40%     64%      92%
To keep α = 0.05, accept as significant only p less than    0.025   0.010   0.005   0.0025   0.001

Use p = 0.05 / no. of tests
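Both rows of the multiple-testing table follow from simple arithmetic, sketched below. For k independent tests at level 0.05, the chance of at least one false "significant" result is 1 - 0.95^k, and the rule "0.05 / no. of tests" is the Bonferroni correction. The function names are illustrative.

```python
def chance_of_false_positive(n_tests, alpha=0.05):
    """Probability of one or more p < alpha by chance in independent tests."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test threshold that keeps the overall risk at about alpha."""
    return alpha / n_tests


for k in (2, 5, 10, 20, 50):
    print(f"{k:>2} tests: {chance_of_false_positive(k):.0%} chance, "
          f"accept only p < {bonferroni_threshold(k):.4f}")
```

For the 20 tests in the question (10 secondary end points analysed in two ways), this gives about a 64% chance of at least one spurious finding.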
Clinical End Points
We want to achieve improvement in patient survival and graft survival
These are the clinical end points
Uncontrolled Trial: Patient Survival (n=5)
Five cadaver kidney transplant recipients received azathioprine
One patient survived 365 days, becoming the first successful cadaveric transplant
Murray, et al. New Engl J Med 1963; 268:1315
RCT: Graft & Patient Survival (n=232)
1-year graft survival: CsA 72%, Aza 52% (p=0.001)
1-year patient survival: CsA 94%, Aza 92% (NS)
European Multicentre Trial Group. Lancet 1983; 2:986
Acute rejection is associated with graft survival
Acute rejection became the surrogate end point for graft survival
Where Did We Go From Here?
RCT: Acute Rejection (n=1493)
Acute rejection at 6 mo.: MMF 2g 20%, MMF 3g 17%, Pla/Aza 41% (p<0.01)
1-year graft survival: MMF 2g 90%, MMF 3g 89%, Pla/Aza 88% (NS)
Halloran, et al. Transplantation 1996; 63:39
Conclusion of MMF trials:
“Acute rejection was reduced but graft survival was not improved”
Was this true - or a question of insufficient power of the study?
What difference in graft survival should have been expected?
Where Did We Go From Here?
Sample size and power to verify true differences in graft survival of 4% or 5%

Graft survival in treatment groups   Difference in graft survival   Sample size at 80% power   Power at sample size 150
86% vs 90%                           4%                             1,037                      19%
75% vs 80%                           5%                             1,091                      18%

Ekberg H. Transpl Rev 2003; 17: 187
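The low power at a sample size of 150 can be checked with a short calculation. This is a sketch under stated assumptions: 150 is taken to mean patients per group, the test is two-sided at 0.05, and the power is computed with an unpooled normal approximation. None of this is confirmed by the slide, but it gives values close to the 19% and 18% in the table.

```python
from math import sqrt
from statistics import NormalDist


def power_two_proportions(p1, p2, n_per_group, alpha=0.05):
    """Approximate power of a two-sided test comparing two proportions.

    Assumption: unpooled normal approximation, ignoring the negligible
    chance of a significant result in the wrong direction.
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    se = sqrt((p1 * (1 - p1) + p2 * (1 - p2)) / n_per_group)
    return NormalDist().cdf(abs(p1 - p2) / se - z_alpha)


for low, high in ((0.86, 0.90), (0.75, 0.80)):
    power = power_two_proportions(low, high, 150)
    print(f"{low:.0%} vs {high:.0%}: power {power:.0%} with 150 per group")
```

With power this low, a trial of that size would miss a true 4% or 5% difference in graft survival roughly four times out of five.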
Surrogate Endpoint Definitions
Clinical endpoint: A characteristic or variable that reflects how a patient feels, functions or survives.
Surrogate endpoint: A biomarker that is intended to substitute for a clinical endpoint, and predict clinical benefit …
Biomarkers Definitions Working Group. Clin Pharmacol Ther 2001; 69:89
Risk factors and potential End points
Risk factors after transplantation
• Acute rejection
• Graft function
• New onset of diabetes mellitus
• Cholesterol levels
• Treatment failure (drug toxicity)
• Malignancy
Do they predict graft or patient survival?
Possible Surrogate Endpoints
• Acute rejection
• Acute rejection + 1/Cr return to baseline
• 1-year graft function
• Composite end point
Association or Prediction?
Acute Rejection with 1/Cr return to baseline, transplants 1995–2002
[Figure: graft survival (%) versus time posttransplantation (0 to 72 mo) for three groups: no acute rejection, AR with 1/SCr within 5% of baseline, and AR with 1/SCr worse than 5% from baseline. End-of-follow-up graft survival: 73.4% and 73.1% in the first two groups, 49.4% in the group with 1/SCr worse than 5% from baseline.]
Log-rank P value for equality of strata ≤0.0001.
Meier-Kriesche et al. ATC 2003.
Predictive Quality for Graft Loss: AR vs. AR Without Return to Baseline

Positive Predictive Value
Follow-up   Acute Rejection, No Return to Baseline   Acute Rejection
2 years     15.8                                     9.2
6 years     38.5                                     27.6

Meier-Kriesche et al. ATC 2003.
Conclusion: AR and AR with return to baseline are associated with, but not predictive of, graft survival
“Post-transplant Renal Function at 1 Year Predicts Long-Term Kidney Transplant Survival”
[Figure: graft survival (%) versus months posttransplantation (0 to 60), stratified by 1-year creatinine: <1.0, 1.0-1.5, 1.6-2.0, 2.1-2.5, 2.6-3.0 and >3; N = 61,157.]
Hariharan S et al. Kidney Int. 2002; 62: 311.
ROC Plot for 7-Year Overall Graft Loss From 1-Year Creatinine Baseline Level
AUC = 0.624
[Figure: sensitivity versus 1 - specificity, with points labeled by creatinine cutoff from 1.0 to 3.0.]
ROC = receiver operating characteristic.
H-U Meier-Kriesche
Prediction Diagnostics for Seven-Year Overall Graft Loss from One-Year Creatinine Level
Patient population: adult first transplant recipients from the USRDS database after 1988 with a minimum of seven years of follow-up

Creatinine cutoff level   Sensitivity   Specificity   PPV   NPV
1.6                       62%           55%           53%   64%
1.8                       48%           71%           58%   62%
2.0                       36%           82%           63%   61%

H-U Meier-Kriesche
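The link between the four columns can be sketched as follows. PPV and NPV depend on sensitivity, specificity and on how common graft loss is in the population. The slides do not state that rate; the 45% used below is an assumption, chosen only because it roughly reproduces the PPV and NPV columns.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given outcome prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical seven-year graft-loss rate; not given in the slides.
ASSUMED_GRAFT_LOSS_RATE = 0.45

for cutoff, sens, spec in ((1.6, 0.62, 0.55), (1.8, 0.48, 0.71), (2.0, 0.36, 0.82)):
    ppv, npv = predictive_values(sens, spec, ASSUMED_GRAFT_LOSS_RATE)
    print(f"cutoff {cutoff}: PPV {ppv:.0%}, NPV {npv:.0%}")
```

Whatever the cutoff, PPV and NPV stay in the 50-65% range, only modestly better than a coin flip, which fits the modest AUC of the ROC plot.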
Possible Surrogate Endpoints
Acute rejection
Acute rejection + 1/Cr return to baseline
1-year graft function
Composite end point
Composite end point (CEP)
1,389 KTx at Univ of Minnesota 1985-1997
Creatinine at 1 year (Cr12)
Cr12 from <1.0 to >3.0 -> 10-yr GS from 75% to 25%
Suggested composite end point:
• Graft loss < 12 mo. or Cr12 > 2.0
• Reduction of CEP incidence by 33%
• 626 patients in total needed in such a study
Paraskevas et al Transplantation 2003; 75: 1256
Composite end point (CEP)
CEP definition: occurrence of at least one of acute rejection, graft loss, death or S-Creat > 1.5
UNOS database 1995-2000: 59,000 patients
61.2% met the CEP
- Margin for improvement
- Fewer patients needed
Siddiqi et al ATC 2003; #1160
Hariharan et al AJT 2003; 3: 933
Composite end point (CEP)
CEP:
• Not a surrogate end point: no prediction
• Not a clinical end point: incl. ‘surrogate’ factors
Weighted score:
Death          1.0 x proportion
Graft loss     0.5 x proportion
Acute rej      0.25 x proportion
S-crea > 1.5   0.25 x proportion
Hariharan et al AJT 2003; 3: 933
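A minimal sketch of how such a weighted score could be computed for one treatment group. The weights are those listed on the slide. Summing the weighted proportions into a single number is an assumption, and the example proportions are hypothetical, not data from the cited paper.

```python
# Weights from the slide.
WEIGHTS = {
    "death": 1.0,
    "graft_loss": 0.5,
    "acute_rejection": 0.25,
    "creatinine_gt_1_5": 0.25,
}


def weighted_cep_score(proportions):
    """Sum of event proportions, each multiplied by its weight (assumed rule)."""
    return sum(WEIGHTS[event] * proportions[event] for event in WEIGHTS)


# Hypothetical event proportions in one treatment group.
example_group = {
    "death": 0.05,
    "graft_loss": 0.08,
    "acute_rejection": 0.20,
    "creatinine_gt_1_5": 0.40,
}
score = weighted_cep_score(example_group)
print(f"weighted CEP score: {score:.2f}")  # 0.24
```

The weighting makes a death count four times as much as an acute rejection episode, so two groups with the same unweighted CEP incidence can still differ in score.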
Clinical end point (short term only)
Alternatively: a clinical end point (“how the patient functions …”) without prediction of long-term patient or graft survival
e.g. GFR (Cockcroft-Gault formula) at 12 mo. (Symphony study)
e.g. New Onset of Diabetes After Transplantation (NODAT) according to American Diabetes Association (ADA) definitions
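For reference, the Cockcroft-Gault formula estimates creatinine clearance from age, body weight, serum creatinine and sex. The sketch below uses the standard published form with creatinine in mg/dl. How the Symphony study applied it is not described in these slides, and the function name and example patient are illustrative.

```python
def cockcroft_gault(age_years, weight_kg, serum_creatinine_mg_dl, female=False):
    """Estimated creatinine clearance (ml/min) by the Cockcroft-Gault formula."""
    clearance = (140 - age_years) * weight_kg / (72 * serum_creatinine_mg_dl)
    return clearance * 0.85 if female else clearance


# Hypothetical patient: 50 years old, 72 kg, serum creatinine 1.0 mg/dl.
print(cockcroft_gault(50, 72, 1.0))               # 90.0
print(cockcroft_gault(50, 72, 1.0, female=True))  # 76.5
```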
Conclusions on End Points
What are the best end points?
Acute rejection
Acute rejection + 1/Cr return to baseline
1-year graft function
Composite end point
NODAT
Key Elements of Trial Quality
• Hypothesis
• Appropriate population
• Clinically relevant achievement
• Adequately-powered
• End points
• Comparison group
• Randomized
• Placebo-controlled
• Double-blind
• Intent-to-treat analysis
• Protocol
• Analysis plan
The Comparison Group
Question: We are designing a study on CNI nephrotoxicity and are discussing the treatment of the control group. It was decided to give them CsA with trough levels of 200-400 ng/ml for the first 2 months and then 100-200 ng/ml in months 3-12.
OK?
The benefits, risks, burdens and effectiveness of a new method should be tested against those of the best current prophylactic, diagnostic, and therapeutic methods.
World Medical Association Declaration of Helsinki
The Comparison Group
The Study Group
The new drug or method should hypothetically and potentially be better than the best known current treatment (= standard of care), but not yet proven to be so
Controlled trial
Placebo vs study drug: study drug in addition to the best current regimen, e.g. placebo vs daclizumab
Old drug vs new drug: either drug in addition to the best current regimen, e.g. Aza vs MMF
Randomized Controlled Trial
Random assignment of treatment
Parameters associated with outcome should be similarly distributed between study and comparison groups
Methods, for example:
• Computerized and via telephone
• 1:1 or 2:1
• Stratification (per center or LD/DD)
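One common way to combine an allocation ratio with stratification is permuted-block randomization within each stratum. The slides do not say which method is meant, so the sketch below is an illustration of that swapped-in technique, with hypothetical arm names and stratum labels.

```python
import random


def block_randomizer(ratio=(1, 1), arms=("study", "control"), seed=None):
    """Return an assign(stratum) function using permuted blocks per stratum."""
    rng = random.Random(seed)
    pending = {}  # stratum -> assignments still unused in the current block

    def assign(stratum):
        if not pending.get(stratum):
            # Start a new block holding each arm as often as the ratio says.
            block = [arm for arm, count in zip(arms, ratio) for _ in range(count)]
            rng.shuffle(block)
            pending[stratum] = block
        return pending[stratum].pop()

    return assign


# 2:1 allocation; the stratum combines center and donor type (LD/DD).
assign = block_randomizer(ratio=(2, 1), seed=42)
allocations = [assign("center A / LD") for _ in range(6)]
print(allocations)
```

After every complete block, each stratum holds exactly the intended 2:1 ratio. Real trials usually use larger or varying block sizes so that the next assignment cannot be guessed.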
The Blind Treating The Blind
Double blind
• Physician not knowing which treatment
• Patient not knowing
Problems:
• Drug administration
• Drug monitoring
Labs and visits the same in both groups
Sometimes extra blood sampling in controls (ethics?)
Intention-to-Treat Analysis
ITT analysis, the standard method
= All participating patients are included
Does not exclude treatment failures
Conclusion: “With this intention, we had the results ...”
Limitation of the ITT analysis
• In a long-term study (e.g. 3 yrs), many patients would have switched therapy or been withdrawn
• Physicians regard the fate of the patient as more important than the study
-> Reduced differences between treatment groups
Per Protocol Analysis
Per protocol (PP) analysis
= On-treatment analysis
Emphasis on the positive results of treatment
Excludes premature withdrawals (“failures”)
Limitation of the PP analysis
Conclusion: “Only in successful cases, we had these results...” “Only patients who could follow this protocol, …”
-> Seriously biased results when excluding failures
Synopsis and Protocol
Synopsis
• A short summary of the study protocol
• Used to invite investigators to participate
Protocol
• A detailed description of all relevant aspects of the study
• Used to make sure all centers perform the study correctly
• Used for approval by the Ethical Committee and Health Authorities
Patient Information and Consent
Analysis Plan
A detailed description of all planned analyses: statistical methods, outlines of tables and graphs
Including:
• Primary end point, to verify or reject the null hypothesis
• Secondary end points, to further describe the data and formulate new hypotheses
Secondary analyses (ad hoc, made after viewing the results and not part of the analysis plan) should be avoided
Interim analyses are reported confidentially to the Data Safety Monitoring Board (DSMB). Interim results should not be reported publicly during the study!
Further Reading
A Uniform Clinical Trial Registration Policy for Journals of Kidney Disease, Dialysis and Transplantation
Couser WG, AJT 2005; 5: 643
www.clinicaltrials.gov
Design and Analysis of Clinical Trials in Transplantation
Schold JD, AJT 2008; 8: 1779