Within-Subject Clinical Trials: Introduction to New Methods and Statistical Models
June 22, 2017
2
To RCT or not to RCT: That is the Question
Donald E. Stull, PhD
Head, Data Analytics and Design Strategy
RTI Health Solutions
3
• Background: Are RCTs the only acceptable/respectable approach for establishing treatment efficacy or cause-effect?
– RCTs = multi-country, multi-center, randomized, double-blind, placebo-controlled clinical trial
• Brief (Read: NOT comprehensive) presentation of some “issues” with RCTs:
– The Good,
– The Bad, and
– The Juggly
• Some alternative approaches:
– (Bayesian) adaptive trials
– Within-Subject Clinical Trials (WSCTs)
• Brief discussion of analytic approaches/software for dealing with intensive data
Agenda
4
• “The Gold Standard” (?)
• Internal validity
• Randomization
• Blinding
• Control over comparisons
• Manipulation of key variables
RCTs: The Good
5
• Ethics
• External validity
• Cost
• Covariate imbalance
• Investigator discretion
RCTs: The Bad
6
• Juggling the Good and the Bad:
– RCTs are often a balance between cost, external/internal validity, accepting (choosing?) no direct head-to-head comparison, etc.
RCTs: The Juggly
7
“Many new drugs are expensive, and in some countries drug budgets are growing faster than other health care sectors…The key questions are: how much better are the new drugs than the old ones, how much more does it cost to obtain the additional benefits, and does the extra cost represent value for the money.” (Henry and Hill, BMJ, 1995)
• Does answering these key questions always require RCTs?
Alternative Approaches to Understanding Change and Treatment Effects
Henry D, Hill S. Comparing treatments. BMJ. 1995 May 20;310(6990):1279.
8
“Because every study design may have problems in particular applications, studies should be evaluated by appropriate criteria, and not primarily according to the simplistic RCT/non-RCT dichotomy promoted by some prominent advocates of the evidence-based medicine movement and by the research evaluation guidelines based on its principles.” (Grossman & Mackenzie, 2005)
The Randomized Controlled Trial: gold standard, or merely standard?
Grossman J, Mackenzie FJ. The randomized controlled trial: gold standard, or merely standard? Perspect Biol Med. 2005 Autumn;48(4):516-34.
9
• (Bayesian) Adaptive trials
• Mixture models for heterogeneous data
• What if you want to test a treatment for an ultra-rare disease?
• What if you need a Go/No-Go decision?
• Are there study designs that can handle these challenges without undertaking an RCT?
We will focus on within-subject clinical trials as an approach to address many of these challenges
Alternative Approaches to Understanding Change and Treatment Effects
10
Within-Subject Clinical Trials: Complementary Alternatives to RCTs
Ty A. Ridenour, PhD, MPE
Developmental Behavior Epidemiologist
Behavioral Health Epidemiology
RTI International
11
Objective of RCTs
Weissberg-Benchell, Antisdel-Lomaglio, et al. Insulin pump therapy: a meta-analysis. Diabetes Care 2003; 26: 1079-1087.
Meta-efficacy:
[Figure 1: Effect sizes for parallel design studies. Axis labels: “Insulin Pump Better” and “Conventional MDI Better”.] Studies are presented in increasing order of chronology from the bottom, with primary authors’ names along the left side of the graph. *Mean effect size. Bars denote the 95% CIs of the mean. Mean effect size for the 11 studies was d = 0.95.
12
Objective of RCTs
[Figure 1: Effect sizes for parallel design studies. Axis labels: “Insulin Pump Better” and “Conventional MDI Better”.] Studies are presented in increasing order of chronology from the bottom, with primary authors’ names along the left side of the graph. *Mean effect size. Bars denote the 95% CIs of the mean. Mean effect size for the 11 studies was d = 0.95.
Range:
Weissberg-Benchell, Antisdel-Lomaglio, et al. Insulin pump therapy: a meta-analysis. Diabetes Care 2003; 26: 1079-1087.
13
• Must base patient’s treatment on population
• Ecological fallacy (Robinson, 1950)
• Ergodicity theorem (Birkhoff, 1931)
• Simpson’s paradox (Simpson, 1951)
Clinician’s Dilemma
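The gap between population-level and patient-level conclusions can be made concrete with a small numeric sketch. The data below are invented purely for illustration (three hypothetical patients, a daily dose and a symptom score); they are not from any study cited here. Within every patient the association is negative, yet pooling all observations yields a strongly positive association, which is the pattern behind Simpson's paradox and the ecological fallacy.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation from first principles (stdlib only)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Hypothetical data: (daily dose, symptom score) series for three patients.
patients = {
    "P1": ([1, 2, 3, 4], [10, 9, 8, 7]),
    "P2": ([5, 6, 7, 8], [20, 19, 18, 17]),
    "P3": ([9, 10, 11, 12], [30, 29, 28, 27]),
}

# Within-person association: computed separately for each patient.
within = {pid: pearson(x, y) for pid, (x, y) in patients.items()}

# Pooled association: all observations treated as one sample.
pooled_x = [v for x, _ in patients.values() for v in x]
pooled_y = [v for _, y in patients.values() for v in y]
pooled = pearson(pooled_x, pooled_y)

for pid, r in within.items():
    print(f"{pid}: within-person r = {r:+.2f}")
print(f"Pooled r = {pooled:+.2f}")
```

In this toy data set a clinician relying on the pooled estimate would draw the opposite conclusion from the one that holds for every individual patient.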
14
• Small population or sample
– Pilot studies
– Rare or newly discovered diseases
– Genetic microtrials
• In-the-field research required
• Little funding
• Patients have study exclusion criteria
• Intervention mechanisms / processes
Needs for Within-Subject Clinical Trials
15
Overall Goal of Designs: Eliminate alternative explanations
• Multiple Baseline Design
Part 1: Within-Subject Experimental Designs
AllPsych; //allpsych.com/researchmethods/multiplebaselines/#.Vd30PvlVhBe; Kazdin, Single-case research designs. Oxford U Press. 2011.
[Figure panels: “Results support Treatment” and “Results don’t support Treatment”]
16
Real life data are messy
Problem with Visual Inspection
Ridenour, Pineo et al. Toward idiographic research in prevention science: Demonstration of three techniques for rigorous small sample research. Prevent Sci 2013;14: 267-278.
[Figure: Patient D time series; vertical axis: Glucose (mg/dL)]
18
• Level 1: time series observations within-person
• Level 2: aggregates for individuals / sample
Hierarchical linear model
Part 2: Hierarchical Modeling
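$$Y_{it} = \beta_0 + u_{0i} + \beta_1(\text{Time}) + u_{1i}(\text{Time}) + \beta_2\,\text{Intx}_{it} + \beta_3(\text{Intx} \times \text{Time}) + e_{it}$$
• Intercept terms: $\beta_0$, $u_{0i}$
• Slope terms: $\beta_1(\text{Time})$, $u_{1i}(\text{Time})$
• Differences between phases (control, treatment 1, treatment 2): $\beta_2\,\text{Intx}_{it}$, $\beta_3(\text{Intx} \times \text{Time})$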
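To show how the two levels of the hierarchical model fit together, here is a minimal sketch using only the Python standard library. It is not the estimation method used in the cited studies: it swaps in a simple two-stage approximation (an ordinary least squares fit per patient at Level 1, then averaging the per-patient coefficients at Level 2) in place of a full mixed-model fit. All numbers (`BETA`, the staggered start times, the noise levels) are hypothetical, and `ols` is a helper written for this example.

```python
import random
from statistics import mean

def ols(X, y):
    """Least squares via the normal equations (X'X) b = X'y, Gauss-Jordan solve."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y]
    A = [[sum(r[i] * r[j] for r in X) for j in range(k)]
         + [sum(r[i] * v for r, v in zip(X, y))]
         for i in range(k)]
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))  # partial pivoting
        A[c], A[p] = A[p], A[c]
        pivot = A[c][c]
        A[c] = [v / pivot for v in A[c]]
        for r in range(k):
            if r != c:
                factor = A[r][c]
                A[r] = [a - factor * b for a, b in zip(A[r], A[c])]
    return [A[i][k] for i in range(k)]

random.seed(1)
BETA = [180.0, 0.0, -40.0, -0.5]             # hypothetical beta_0..beta_3
n_times = 30                                 # observations per patient
start = {"A": 8, "B": 12, "C": 16, "D": 20}  # hypothetical staggered phase changes

# Level 1: one time-series regression per patient.
fits = {}
for pid, s in start.items():
    u0, u1 = random.gauss(0, 10), random.gauss(0, 0.2)  # person-level deviations
    X, y = [], []
    for t in range(n_times):
        intx = 1.0 if t >= s else 0.0        # 0 = control phase, 1 = treatment phase
        X.append([1.0, float(t), intx, intx * t])
        y.append(BETA[0] + u0 + (BETA[1] + u1) * t
                 + BETA[2] * intx + BETA[3] * intx * t + random.gauss(0, 5))
    fits[pid] = ols(X, y)

# Level 2: aggregate across patients (a crude stand-in for the fixed effects).
fixed = [mean(f[j] for f in fits.values()) for j in range(4)]
# Person-level intercept deviations, a rough analogue of u_0i.
dev0 = {pid: f[0] - fixed[0] for pid, f in fits.items()}

print("Average coefficients:", [round(b, 2) for b in fixed])
print("Intercept deviations:", {k: round(v, 2) for k, v in dev0.items()})
```

A dedicated mixed-model routine would estimate both levels jointly and weight patients by precision; the two-stage version is shown only because it makes the intercept, slope, and phase terms easy to see.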
19
Levels of WSCT Models
[Figure: four time-series panels (Patient A, Patient B, Patient C, Patient D); vertical axis: Glucose (mg/dL)]
Ridenour, Pineo et al. Toward idiographic research in prevention science: Demonstration of three techniques for rigorous small sample research. Prevent Sci 2013;14: 267-278.
20
Hierarchical Model Components
[Figure: four time-series panels (Patient A, Patient B, Patient C, Patient D); vertical axis: Glucose (mg/dL); annotations: “Intercepts” and “Slopes”]
21
• WSCTs can help tease out potential “period effects” that may confound our understanding of the effects of an intervention
– i.e., something occurs that affects the responses of all participants at a particular time
• With standard analytic methods (e.g., HLM/MLM), we can examine responses across many assessment points and identify the “step functions” indicating when an intervention had an effect
• Small numbers of participants are offset by many observations per participant, providing confidence in results
Implications for Pharmaceuticals and Medical Devices
22
Illustration 1: Small Sample Pilot Study
[Figure: phase schedule by week for each patient, with phase codes “ss” and “GG” as printed on the slide.
Weeks of 8/1/05 through 12/26/05: Patient A, “ss” ×2 then “GG”; Patient B, “ss” ×7 then “GG”.
Weeks of 10/2/06 through 4/30/07: Patient C, “ss” ×6 then “GG”; Patient D, “ss” ×6, “sG” ×1, then “GG”.]
Ridenour, Pineo et al. Toward idiographic research in prevention science: Demonstration of three techniques for rigorous small sample research. Prevent Sci 2013;14: 267-278.
23
Detailed Heterogeneity in Outcomes
Row | Aggregated | 7:30am | 11:30am | 4:30pm | 8:30pm
Entire Sample | -49.4 (9.2) | -35.9 (9.8) | -43.3* (194.2) | -59.4 (9.7) | -59.1* (277.9)
Patient A | -40.9 (10.7) | 0.2* (11.1) | 1.8* (24.4) | -50.4 (20.2) | -104.2 (19.4)
Patient B | -107.9 (11.8) | -32.2 (8.8) | -117.3 (23.0) | -156.3 (19.3) | -122.2 (17.0)
Patient C | -22.6* (15.3) | 11.5* (27.5) | -66.6 (26.8) | -35.5* (25.4) | 3.0* (27.7)
Patient D | -24.6 (10.1) | -112.1 (16.0) | 26.3* (17.6) | 43.5 (17.7) | -57.3 (24.3)
Note: * Change in glucose was NS (p>.01). Parenthetical values are 95% confidence intervals. The orange cell presents preliminary efficacy. Green cells present “impact” of treatment per patient.
Ridenour, Pineo et al. Toward idiographic research in prevention science: Demonstration of three techniques for rigorous small sample research. Prevent Sci 2013;14: 267-278.
24
• RCTs compare differences in mean effects between treatment arms
• Variability around mean effects within treatment arms (heterogeneity of treatment effects) can “wash out” differences between treatment arms or obscure when treatment is best administered (as in this example)
• Examining what is occurring within patients can lend important insights into these effects and may be informative about individual responses
• Examining this individual variability fits practically and philosophically with personalized medicine
Implications for Pharmaceuticals and Medical Devices
25
• Immunosuppressive drugs prevent organ rejection
• 40-60% of patients lapse from treatment regimen
• 15-25% of noncompliance is due to high cost
• Inclusion criteria: age > 18; 6 months post-transplant for liver or 3 months for kidney; 3+ trough concentrations before and after switch (stable dosing)
• N=103 (48 liver, 55 kidney); observations = 746 trough concentrations (ng/mL)
• No organ rejections, no appreciable changes in liver / kidney function
Momper, Ridenour, et al. The impact of conversion from Prograf to generic tacrolimus in liver and kidney transplant recipients with stable graft function. Am J Transplant 2011; 11: 1861-1867.
Illustration 2: Rigorous Testing in Small Population
26
Heterogeneity in Outcomes
Figure 2: Percent change in the mean whole blood tacrolimus trough concentrations following generic substitution in liver (top) and kidney (bottom) transplant recipients when the dosing regimen remained constant.
27
Statistical Sophistication & Power
Table 3: Summary of covariate effects on tacrolimus trough concentrations in liver transplant recipients
Covariate | Bivariate B | Bivariate p | Bivariate 95% CI Lower | Bivariate 95% CI Upper | Multivariate B | Multivariate p* | Multivariate 95% CI Lower | Multivariate 95% CI Upper
Tacrolimus dose (per mg/70 kg) | 0.45 | <0.005 | 0.03 | 0.87 | 0.57 | <0.005 | 0.29 | 0.85
Patient age (per year) | −0.08 | <0.01 | −0.14 | −0.02 | – | NS | – | –
Female (gender) | 0.50 | NS | −0.98 | 1.98 | – | – | – | –
Time posttransplant (per year) | −0.14 | <0.025 | −0.25 | −0.03 | – | NS | – | –
Albumin (per g/dL) | −1.29 | <0.005 | −2.44 | −0.14 | −0.77 | <0.005 | −1.38 | −0.16
Total bilirubin (per mg/dL) | 0.09 | <0.025 | −0.03 | 0.21 | 0.18 | <0.01 | 0.13 | 0.23
Creatinine (per mg/dL) | −0.38 | <0.005 | −0.89 | 0.14 | – | NS | – | –
Use of generic tacrolimus | −1.511 | <0.005 | −2.29 | −0.73 | −1.98 | <0.005 | −3.05 | −0.92
Multivariate = multivariate backward stepwise elimination. Dependent variable: Tacrolimus whole blood trough concentrations in liver transplant recipients; *p-value derived from the difference in –2 log likelihood of (a) model with all remaining predictors and (b) model with the predictor in the row omitted; B, unstandardized (raw) coefficient; CI, confidence interval; NS, not significant.
Table 4: Summary of covariate effects on tacrolimus trough concentrations in kidney transplant recipients
Covariate | Bivariate B | Bivariate p | Bivariate 95% CI Lower | Bivariate 95% CI Upper | Multivariate B | Multivariate p* | Multivariate 95% CI Lower | Multivariate 95% CI Upper
Tacrolimus dose (per mg/70 kg) | 0.22 | <0.005 | 0.03 | 0.41 | 0.26 | <0.005 | 0.04 | 0.48
Patient age (per year) | 0.01 | NS | −0.03 | 0.05 | – | – | – | –
Female (gender) | −0.032 | NS | −1.282 | 1.218 | – | – | – | –
Time posttransplant (per year) | −0.075 | NS | −0.263 | 0.113 | – | – | – | –
Albumin (per g/dL) | 0.01 | NS | −0.14 | 0.16 | – | – | – | –
Total bilirubin (per mg/dL) | 2.39 | <0.005 | 0.07 | 4.72 | 2.35 | <0.005 | 0.07 | 4.62
Creatinine (per mg/dL) | 0.70 | <0.005 | −1.04 | 2.44 | – | NS | – | –
Use of generic tacrolimus | −0.94 | <0.005 | −1.54 | −0.35 | −0.87 | <0.005 | −1.47 | −0.27
Multivariate = multivariate backward stepwise elimination. Dependent variable: Tacrolimus whole blood trough concentrations in kidney transplant recipients; *p-value derived from the difference in –2 log likelihood of (a) model with all remaining predictors and (b) model with the predictor in the row omitted; B, unstandardized (raw) coefficient; CI, confidence interval; NS, not significant.
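The table footnotes describe p-values derived from the difference in –2 log likelihood between a model with and without a given predictor. A minimal sketch of that likelihood ratio calculation follows, assuming a single predictor is dropped so the difference is referred to a chi-square distribution with 1 degree of freedom. The function name `lrt_p_value` and the –2 log likelihood values are hypothetical; they are not taken from the tables.

```python
import math

def lrt_p_value(neg2ll_reduced, neg2ll_full):
    """P-value for dropping ONE predictor (chi-square test, 1 df).

    For 1 degree of freedom the chi-square survival function has the
    closed form erfc(sqrt(x / 2)), so no statistics library is needed.
    """
    x = neg2ll_reduced - neg2ll_full
    if x < 0:
        raise ValueError("the reduced model cannot fit better than the full model")
    return math.erfc(math.sqrt(x / 2.0))

# Hypothetical -2 log likelihood values (invented for illustration).
p = lrt_p_value(neg2ll_reduced=2012.4, neg2ll_full=2003.1)
print(f"LRT p-value: {p:.4f}")
```

Dropping more than one parameter at once would require the chi-square survival function for the corresponding degrees of freedom, which this closed form does not cover.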
28
• WSCTs:
– Can be used in applied/clinical settings
– Could be applicable for small dose-response studies
– Can give clues to sources of heterogeneity of responses
Implications for Pharmaceuticals and Medical Devices
29
Ding, Cooper et al. Usage of tilt-in-space, recline, and elevation seating functions in natural environment of wheelchair users. J Rehab Res Dev 2008; 45; 973-983.
Illustration 3: Proof-of-Concept with Medical Device
[Diagram of the device system. Labeled components: Sensing (Sensor 1, Sensor 2, ..., Sensor n); Elements of a User’s Context; Clinical Recommendation; Actual PSF Use; Compliance; Decision Making (Single Board Computer); Coaching Strategy; Coaching Messages; Presentation (Touch Screen); User Interface; Presentation (Web-based Application); Clinician Interface]
30
Outcomes Among Phases
Outcome | Mean | Standard Deviation | Cohen’s d Compared to Baseline
BASELINE (244 observations)
General Discomfort | 41.9 | 12.39 | n/a
Discomfort Intensity | 19.2 | 9.52 | n/a
Frequency of Use | 2.1 | 2.36 | n/a
Duration of Use in Mod/Max 2 | 50.8 | 44.78 | n/a
INSTRUCTION (561 observations)
General Discomfort | 42.6 | 13.01 | - -
Discomfort Intensity | 19.9 | 9.36 | - -
Frequency of Use [B] | 1.5 | 2.09 | 0.28
Duration of Use in Mod/Max 2 [B] | 37.6 | 46.02 | 0.29
VIRTUAL COACH (262 observations)
General Discomfort | 42.3 | 10.81 | - -
Discomfort Intensity [B,I] | 10.7 | 5.52 | 1.10
Frequency of Use [B,I] | 3.3 | 3.02 | 0.44
Duration of Use in Mod/Max 2 [B,I] | 67.4 | 45.73 | 0.37
Note: [B] Differs from Baseline phase (p<.001). [I] Differs from Instruction phase (p<.001).
Ridenour, Chen et al. The clinical trials mosaic: Toward a range of clinical trials designs to optimize evidence-based treatment. J Person Oriented Res. In press.
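As a worked check on the effect sizes, the sketch below computes Cohen's d from the reported means, standard deviations, and observation counts. It assumes the common pooled-standard-deviation formula and treats the observation counts as the group sizes; the slide does not state which formula the authors used, so this is an assumption. Under that assumption the Virtual Coach values round to the d values shown in the table.

```python
import math

def cohens_d(m1, sd1, n1, m2, sd2, n2):
    """Absolute standardized mean difference using the pooled SD (assumed formula)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    return abs(m1 - m2) / math.sqrt(pooled_var)

# (mean, SD) pairs and observation counts as reported on the slide.
baseline = {"n": 244, "intensity": (19.2, 9.52), "frequency": (2.1, 2.36)}
coach = {"n": 262, "intensity": (10.7, 5.52), "frequency": (3.3, 3.02)}

d_intensity = cohens_d(*baseline["intensity"], baseline["n"],
                       *coach["intensity"], coach["n"])
d_frequency = cohens_d(*baseline["frequency"], baseline["n"],
                       *coach["frequency"], coach["n"])

print(f"Discomfort Intensity, Virtual Coach vs Baseline: d = {d_intensity:.2f}")  # 1.10
print(f"Frequency of Use, Virtual Coach vs Baseline: d = {d_frequency:.2f}")      # 0.44
```

Because repeated observations from the same person are not independent, a d computed this way describes the size of the shift between phases rather than supporting inference on its own.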
31
Competing Mediation Models of Day-to-Day Change
[Path diagrams of three competing models: (1) Autocorrelation Only Model; (2) Cooper & Liu Same-day Model; (3) Zheng et al. Generic Model. Each diagram relates PSF Usage, Discomfort Intensity, and General Discomfort to PSF Usage 2, Discomfort Intensity 2, and General Discomfort 2, with error terms G2err, F2err, and D2err.]
32
Intervention Process: Moderating the Mediation of Outcomes
Frequency of Power Seat Use
Association | Baseline | Instruction | Virtual Coach
1. Discomfort Intensity with Use Frequency on the same day | -.04 | .22 | .56
2. Discomfort Intensity with Use Frequency on the next day | -.08 | -.05 | .39
3. Usage Autocorrelation | .63 | .49 | .41
[Path diagram: PSF Usage, Discomfort Intensity, and General Discomfort linked to PSF Usage 2, Discomfort Intensity 2, and General Discomfort 2. Labeled paths: I1 with U1, G1 with U1, G1 with I1, G1 to G2, G1 to U2, U1 to U2, I1 to U2, I1 to I2, I2 to U2, G2 to U2.]
Ridenour, Chen et al. The clinical trials mosaic: Toward a range of clinical trials designs to optimize evidence-based treatment. J Person Oriented Res. In press.
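The three kinds of association in the table (same-day, next-day, and autocorrelation) can be illustrated with plain lagged correlations. This is only a rough analogue: the values on the slide come from the mediation models, whereas the sketch below computes simple Pearson correlations on an invented daily diary for one hypothetical user. The helper names `pearson` and `lagged` are made up for this example.

```python
from statistics import mean

def pearson(x, y):
    """Pearson correlation from first principles (stdlib only)."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def lagged(x, y, lag=1):
    """Correlate x on day t with y on day t + lag (lag=0 means same day)."""
    if lag == 0:
        return pearson(x, y)
    return pearson(x[:-lag], y[lag:])

# Hypothetical daily diary for one wheelchair user (values invented).
discomfort = [3, 5, 4, 6, 7, 5, 4, 6, 8, 7]
usage = [1, 2, 2, 3, 3, 3, 2, 3, 4, 4]

same_day = lagged(discomfort, usage, 0)  # discomfort with use on the same day
next_day = lagged(discomfort, usage, 1)  # discomfort today with use tomorrow
autocorr = lagged(usage, usage, 1)       # use today with use tomorrow

print(f"same day: {same_day:.2f}, next day: {next_day:.2f}, autocorrelation: {autocorr:.2f}")
```

Computing these three quantities separately for each study phase would give a first look at whether the intervention changes how discomfort and seat use relate over time, before fitting a full path model.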
33
• Proof-of-concept studies to explore viability of assets for further development
• Studying assets in orphan diseases/very small populations
• Overcoming variability in treatment effect
• Understanding complex, time-ordered relationships within treatment data
Overall Implications for Pharmaceuticals and Medical Devices
34
Sample of Recent or Ongoing WSCT Studies
Field | Outcomes | Intervention
Behavior Medicine | Blood-glucose test usage | MI, CM, internet-aided adherence
Family Therapy | Satisfaction, Depression | Emotion Focused Therapy
Geriatric Medicine | Blood sugar level | “Manual Pancreas”
Pharmacy | Pain, Patient satisfaction | ICU Sedatives
Surgery | Transplanted liver/kidney function | Prograf vs generic transplant drug
Rehabilitation | Pain, Adherence | Virtual Coach Power Seat
 | Cardiac arrest recovery | Exercise outside physical therapy
Addiction Treatment | Smoking cessation | Pharmacist-aided use of patch
Clinical Psychology | Psychopathy | Contingency management
Policing | Electrodermal activity | Etiology: stressful confrontations
Partner Violence | His & her violence perpetration | Etiology: violence precursors
Family Therapy | Satisfaction, Depression | Emotion focused therapy
Speech Therapy | Verbal- & e-communication | Speech therapist laptop facilitator
 | Enunciation, Slurring | AAC for stroke victims
36
Generating knowledge and providing greater understanding so that you—and those who regulate, pay for, prescribe, and use your products—can make better decisions.
rtihs.org
37
Our Experts
Donald E. Stull, PhD
Head, Data Analytics and Design Strategy
RTI Health Solutions
[email protected]
Ty A. Ridenour, PhD, MPE
Developmental Behavior Epidemiologist
Behavioral Health Epidemiology
RTI International
[email protected]