Distributed analysis in multi-center studies
Sharing of individual-level data across health plans or healthcare delivery systems continues to be challenging due to concerns about loss of patient privacy, unauthorized uses of transferred data, inaccurate analysis or interpretation of data, or contractual or legal restrictions. Although these challenges can be addressed in part by proper governance and appropriate updates to existing regulations, newer privacy-protecting analytic and data-sharing methods offer another potential solution. This presentation will describe the use of privacy-protecting analytic methods that allow robust and flexible statistical analysis using aggregate-level information, without centralized pooling of individual-level datasets across data sources. We will present several comparative safety and effectiveness studies of medical treatments that employ these methods to generate actionable real-world evidence.
1. Toh S, Gagne JJ, Rassen JA, Fireman BH, Kulldorff M, Brown JS. Confounding adjustment in comparative effectiveness research conducted within distributed research networks. Med Care 2013;51(8 Suppl 3):S4-S10
2. Toh S, Hampp C, Reichman ME, Graham DJ, Balakrishnan S, Pucino F, Hamilton J, Lendle S, Iyer A, Rucker M, Pimentel M, Nathwani N, Griffin MR, Brown NJ, Fireman BH. Risk of hospitalized heart failure among new users of saxagliptin, sitagliptin, and other antihyperglycemic drugs: A retrospective cohort study. Ann Intern Med 2016;164(11):705-714 (PMC5178978)
3. Toh S, Reichman ME, Graham DJ, Hampp C, Zhang R, Butler MG, Iyer A, Rucker M, Pimentel M, Hamilton J, Lendle S, Fireman BH; for the Mini-Sentinel AMI-Saxagliptin Surveillance Writing Group. Prospective post-marketing surveillance of acute myocardial infarction in new users of saxagliptin: A population-based study. Diabetes Care 2018;41(1):39-48
Distributed analysis in multi‐center studies
Darren Toh, ScD
Department of Population Medicine
Harvard Medical School & Harvard Pilgrim Health Care Institute
Boston, MA
November 18, 2018
Disclosures
Research support
• Patient‐Centered Outcomes Research Institute (ME‐1403‐11305)
• Office of the Assistant Secretary for Planning and Evaluation & Food and Drug Administration (HHSF223200910006I)
• National Institutes of Health (U01EB023683)
• Agency for Healthcare Research and Quality (R01HS026214)
Board of Directors, International Society for Pharmacoepidemiology
My spouse is an employee of Biogen
Overview
Evolution of multi‐center studies
Analytic methods in multi‐center studies
Select examples
Discussion
Multi‐center studies
Many studies are now done in multi‐center settings
Why do multi‐center studies?
Larger sample sizes
• Allow studies of rare treatments or rare outcomes
• Allow studies in specific subpopulations
• Allow studies to be done more quickly
More diverse populations
• Allow more generalizable findings
• Allow assessment of treatment effect heterogeneity
Multi‐center studies v1.0
Analysis center
Pooling study‐specific individual‐level datasets
Typical datasets shared in multi‐center studies v1.0
PatID Exposure Outcome Time Age Sex DM HTN CVD …
001 1 0 312 33 1 0 1 1 …
002 0 0 40 45 1 0 1 0 …
003 0 0 365 76 0 0 0 0 …
004 0 0 200 56 0 1 0 0 …
005 0 1 2 21 0 0 1 0 …
006 1 1 15 80 1 0 0 1 …
007 1 0 4 65 1 1 0 1 …
008 1 0 145 77 0 1 0 0 …
009 0 1 33 48 1 0 0 0 …
010 0 0 98 52 1 0 0 0 …
011 0 0 34 32 0 0 0 0 …
… … … … … … … … … …
Each row represents an individual
Each column represents a covariate
Typical datasets shared in multi‐center studies v1.0
Data Partner 1
PatID Exposure Outcome Time Age Sex DM HTN CVD …
001 1 0 312 33 1 0 1 1 …
002 1 0 40 45 1 0 1 0 …
003 1 0 365 76 0 0 0 0 …
004 1 0 200 56 0 1 0 0 …
005 0 1 2 21 0 0 1 0 …
006 0 1 15 80 1 0 0 1 …
007 0 0 4 65 1 1 0 1 …
008 0 0 145 77 0 1 0 0 …
… … … … … … … … … …
Data Partner 2
PatID Exposure Outcome Time Age Sex DM HTN CVD …
001 0 1 35 44 0 1 3 0 …
002 0 1 213 54 0 1 1 1 …
003 0 1 453 78 0 0 4 1 …
004 0 0 58 87 1 0 3 1 …
005 1 0 31 22 1 0 3 0 …
006 1 0 56 46 0 1 2 0 …
007 1 0 123 53 0 1 1 1 …
008 1 0 546 35 0 0 3 0 …
… … … … … … … … … …
Pooled individual‐level dataset
Site PatID Exposure Outcome Time Age Sex DM HTN CVD …
1 001 1 0 312 33 1 0 1 1 …
1 002 1 0 40 45 1 0 1 0 …
1 003 1 0 365 76 0 0 0 0 …
1 004 1 0 200 56 0 1 0 0 …
1 005 0 1 2 21 0 0 1 0 …
1 006 0 1 15 80 1 0 0 1 …
1 007 0 0 4 65 1 1 0 1 …
1 008 0 0 145 77 0 1 0 0 …
… … … … … … … … … … …
2 001 0 1 35 44 0 1 3 0 …
2 002 0 1 213 54 0 1 1 1 …
2 003 0 1 453 78 0 0 4 1 …
2 004 0 0 58 87 1 0 3 1 …
2 005 1 0 31 22 1 0 3 0 …
2 006 1 0 56 46 0 1 2 0 …
2 007 1 0 123 53 0 1 1 1 …
2 008 1 0 546 35 0 0 3 0 …
… … … … … … … … … … …
Multi‐center studies v2.0
[Figure: individual data partners (Sites 1 to 4) standardize their data to a common data model. The standardized data at each site are accessible to research projects through programs written against the common data model, with a data quality improvement feedback loop.]
Adapted from: http://www.hcsrn.org/asset/b9efb268‐eb86‐400e‐8c74‐2d42ac57fa4F/VDW.Infographic031511.jpg
Data standardization – Common data model
Distributed analysis in networks with common data model
[Figure: an Analysis Center connects through a Secure Network Portal to Data Partner 1 and Data Partner 2. Each data partner holds Enrollment, Demographics, Utilization, Pharmacy, etc. tables and performs "Review & Run Query" and "Review & Return Results".]
1. User creates and submits query
2. Data partners retrieve query
3. Data partners review and run query against their local data
4. Data partners review results
5. Data partners return results via secure network
6. Results are aggregated and reported
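The six query steps can be sketched as a small in-memory simulation. This is a minimal sketch under stated assumptions, not the network's actual software: the names `DataPartner`, `run_query`, `count_events`, and `aggregate` are hypothetical, the rows are toy data, and the "review" step is reduced to a simple check that only counts leave the site.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Row = Dict[str, int]
Query = Callable[[List[Row]], Dict[str, int]]


@dataclass
class DataPartner:
    """Hypothetical data partner holding local, individual-level rows."""
    name: str
    rows: List[Row]

    def run_query(self, query: Query) -> Dict[str, int]:
        # Steps 2-4: retrieve the query, run it locally, review the result.
        result = query(self.rows)
        # Illustrative review rule (an assumption): only integer counts leave the site.
        if not all(isinstance(v, int) for v in result.values()):
            raise ValueError(f"{self.name}: result is not aggregate counts")
        return result


def count_events(rows: List[Row]) -> Dict[str, int]:
    """Step 1: the query submitted by the user (returns aggregate counts only)."""
    out = {"exposed": 0, "unexposed": 0, "events_exposed": 0, "events_unexposed": 0}
    for r in rows:
        arm = "exposed" if r["Exposure"] == 1 else "unexposed"
        out[arm] += 1
        out["events_" + arm] += r["Outcome"]
    return out


def aggregate(results: List[Dict[str, int]]) -> Dict[str, int]:
    """Step 6: the analysis center sums the returned site-level counts."""
    total: Dict[str, int] = {}
    for res in results:
        for key, value in res.items():
            total[key] = total.get(key, 0) + value
    return total


# Toy local datasets (made up for illustration).
site1 = DataPartner("Data Partner 1", [
    {"Exposure": 1, "Outcome": 0}, {"Exposure": 1, "Outcome": 0},
    {"Exposure": 0, "Outcome": 1}, {"Exposure": 0, "Outcome": 0},
])
site2 = DataPartner("Data Partner 2", [
    {"Exposure": 0, "Outcome": 1}, {"Exposure": 1, "Outcome": 0},
])

# Step 5: each partner returns only its aggregate result.
returned = [p.run_query(count_events) for p in (site1, site2)]
network_total = aggregate(returned)
```

The design point is that the query travels to the data, and the individual-level rows never leave `DataPartner.rows`.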
Typical datasets shared in multi‐center studies v2.0
Data Partner 1
PatID Exposure Outcome Time Age Sex DM HTN CVD …
001 1 0 312 33 1 0 1 1 …
002 1 0 40 45 1 0 1 0 …
003 1 0 365 76 0 0 0 0 …
004 1 0 200 56 0 1 0 0 …
005 0 1 2 21 0 0 1 0 …
006 0 1 15 80 1 0 0 1 …
007 0 0 4 65 1 1 0 1 …
008 0 0 145 77 0 1 0 0 …
… … … … … … … … … …
Data Partner 2
PatID Exposure Outcome Time Age Sex DM HTN CVD …
001 0 1 35 44 0 1 3 0 …
002 0 1 213 54 0 1 1 1 …
003 0 1 453 78 0 0 4 1 …
004 0 0 58 87 1 0 3 1 …
005 1 0 31 22 1 0 3 0 …
006 1 0 56 46 0 1 2 0 …
007 1 0 123 53 0 1 1 1 …
008 1 0 546 35 0 0 3 0 …
… … … … … … … … … …
Pooled individual‐level dataset
Site PatID Exposure Outcome Time Age Sex DM HTN CVD …
1 001 1 0 312 33 1 0 1 1 …
1 002 1 0 40 45 1 0 1 0 …
1 003 1 0 365 76 0 0 0 0 …
1 004 1 0 200 56 0 1 0 0 …
1 005 0 1 2 21 0 0 1 0 …
1 006 0 1 15 80 1 0 0 1 …
1 007 0 0 4 65 1 1 0 1 …
1 008 0 0 145 77 0 1 0 0 …
… … … … … … … … … … …
2 001 0 1 35 44 0 1 3 0 …
2 002 0 1 213 54 0 1 1 1 …
2 003 0 1 453 78 0 0 4 1 …
2 004 0 0 58 87 1 0 3 1 …
2 005 1 0 31 22 1 0 3 0 …
2 006 1 0 56 46 0 1 2 0 …
2 007 1 0 123 53 0 1 1 1 …
2 008 1 0 546 35 0 0 3 0 …
… … … … … … … … … … …
Concerns about data sharing in multi‐center studies v1 & v2
Loss of patient privacy
Unauthorized uses of data
Inaccurate analysis or interpretation of data
Disclosures of sensitive institutional or corporate information
Contractual restrictions
Data sharing – A balancing act
Granularity or identifiability of information
Analytic flexibility
Multi‐center studies v3.0
Analysis Center
Pooling study‐specific summary‐level datasets
Privacy‐protecting methods for multi‐center studies v3.0
Summary score‐based methods
Meta‐analysis of database‐specific effect estimates
Distributed regression
Summary scores
PS: Propensity scores; DRS: Disease risk scores
[Figure: diagram relating Treatment, Outcome, and Confounders, with the PS and the DRS shown as summary scores of the confounders.]
Individual‐level dataset with individual covariates
PatID Exposure Outcome Time Age Sex DM HTN CVD …
001 1 0 312 33 1 0 1 1 …
002 0 0 40 45 1 0 1 0 …
003 0 0 365 76 0 0 0 0 …
004 0 0 200 56 0 1 0 0 …
005 0 1 2 21 0 0 1 0 …
006 1 1 15 80 1 0 0 1 …
007 1 0 4 65 1 1 0 1 …
008 1 0 145 77 0 1 0 0 …
009 0 1 33 48 1 0 0 0 …
010 0 0 98 52 1 0 0 0 …
011 0 0 34 32 0 0 0 0 …
… … … … … … … … … …
Individual‐level dataset with summary scores
PatID Exposure Outcome Time PS
001 1 0 312 0.33
002 0 0 40 0.21
003 0 0 365 0.56
004 0 0 200 0.11
005 0 1 2 0.97
006 1 1 15 0.56
007 1 0 4 0.40
008 1 0 145 0.22
009 0 1 33 0.43
010 0 0 98 0.78
011 0 0 34 0.38
… … … … …
Summary score‐based method #1 – Matching
PatID Exposure Outcome Time PS
001 1 0 312 0.33
002 0 0 40 0.21
003 0 0 365 0.56
004 0 0 200 0.11
005 0 1 2 0.97
006 1 1 15 0.56
007 1 0 4 0.40
008 1 0 145 0.22
009 0 1 33 0.43
010 0 0 98 0.78
011 0 0 34 0.38
… … … … …
Persons in exposed   Persons in unexposed   Events in exposed   Events in unexposed
500                  500                    80                  75
Summary score‐based method #1 – Matching
Data Partner 1
Persons in exposed   Persons in unexposed   Events in exposed   Events in unexposed
500                  500                    87                  85
Data Partner 2
Persons in exposed   Persons in unexposed   Events in exposed   Events in unexposed
400                  400                    68                  65
Combined at the analysis center
Site   Persons in exposed   Persons in unexposed   Events in exposed   Events in unexposed
1      500                  500                    87                  85
2      400                  400                    68                  65
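The site-level matched-cohort counts can be combined with a few lines of arithmetic. A minimal sketch, assuming the two site tables shown on the slide and a crude risk ratio with a Wald interval on the log scale; the function names are hypothetical, and treating the matched cohorts as two independent groups (ignoring site and the matching) is a simplifying assumption rather than the method used in the talk.

```python
import math

# Site-level summary tables as on the slide:
# (persons exposed, persons unexposed, events exposed, events unexposed)
site_tables = {
    1: (500, 500, 87, 85),
    2: (400, 400, 68, 65),
}


def pool_counts(tables):
    """Sum each column of the summary tables across sites."""
    return tuple(sum(column) for column in zip(*tables.values()))


def risk_ratio_with_ci(n1, n0, a, b, z=1.96):
    """Crude risk ratio with a Wald confidence interval on the log scale."""
    rr = (a / n1) / (b / n0)
    se_log = math.sqrt(1 / a - 1 / n1 + 1 / b - 1 / n0)
    return rr, rr * math.exp(-z * se_log), rr * math.exp(z * se_log)


pooled = pool_counts(site_tables)
rr, lower, upper = risk_ratio_with_ci(*pooled)
```

Only four numbers per site cross institutional boundaries here, which is the privacy argument for this data-sharing approach.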
Summary score‐based method #2 – Stratification
PatID Exposure Outcome Time PS
001 1 0 312 0.33
002 0 0 40 0.21
003 0 0 365 0.56
004 0 0 200 0.11
005 0 1 2 0.97
006 1 1 15 0.56
007 1 0 4 0.40
008 1 0 145 0.22
009 0 1 33 0.43
010 0 0 98 0.78
011 0 0 34 0.38
… … … … …
PS or DRS stratum   Persons in exposed   Persons in unexposed   Events in exposed   Events in unexposed
1                   200                  150                    30                  35
2                   150                  100                    20                  40
3                   200                  180                    21                  21
4                   150                  200                    26                  18
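A stratum-level table like the one above is enough to compute a stratified effect estimate. A minimal sketch, assuming the four strata shown on the slide and a Mantel-Haenszel risk ratio as the estimator; the choice of estimator and the function names are assumptions for illustration, not necessarily what the cited studies used.

```python
# Stratum-level summary table as on the slide:
# (stratum, persons exposed, persons unexposed, events exposed, events unexposed)
strata = [
    (1, 200, 150, 30, 35),
    (2, 150, 100, 20, 40),
    (3, 200, 180, 21, 21),
    (4, 150, 200, 26, 18),
]


def mantel_haenszel_risk_ratio(rows):
    """Mantel-Haenszel risk ratio pooled across PS or DRS strata."""
    numerator = 0.0
    denominator = 0.0
    for _stratum, n1, n0, a, b in rows:
        total = n1 + n0
        numerator += a * n0 / total
        denominator += b * n1 / total
    return numerator / denominator


def crude_risk_ratio(rows):
    """Risk ratio that ignores the strata, for comparison."""
    n1 = sum(r[1] for r in rows)
    n0 = sum(r[2] for r in rows)
    a = sum(r[3] for r in rows)
    b = sum(r[4] for r in rows)
    return (a / n1) / (b / n0)


rr_mh = mantel_haenszel_risk_ratio(strata)
rr_crude = crude_risk_ratio(strata)
```

With several data partners, each site would return one such table and the analysis center would treat each site-by-stratum combination as its own stratum.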
Summary score‐based method #3 – Risk set analysis
PatID Exposure Outcome Time PS
001 1 0 312 0.33
002 0 0 40 0.21
003 0 0 365 0.56
004 0 0 200 0.11
005 0 1 2 0.97
006 1 1 15 0.56
007 1 0 4 0.40
008 1 0 145 0.22
009 0 1 33 0.43
010 0 0 98 0.78
011 0 0 34 0.38
… … … … …
Event   Event time   Event exposed   Risk set exposed   Risk set unexposed
1       8            0               300                299
2       12           1               296                295
3       20           1               290                288
4       21           0               286                283
…       …            …               …                  …
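Risk-set data of this shape carry what a Cox model with a single binary exposure needs. A minimal sketch, assuming one event per risk set (no ties), no other covariates in the model, and the four rows shown on the slide; the function names are hypothetical. For each event with exposure indicator d and risk-set sizes n1 (exposed) and n0 (unexposed), the log partial likelihood contribution is d·β − log(n1·e^β + n0), which is maximized below by Newton-Raphson.

```python
import math

# Rows as on the slide:
# (event time, event exposed?, exposed persons at risk, unexposed persons at risk)
risk_sets = [
    (8, 0, 300, 299),
    (12, 1, 296, 295),
    (20, 1, 290, 288),
    (21, 0, 286, 283),
]


def score_and_information(beta, rows):
    """Score and observed information of the Cox partial likelihood at beta."""
    score, info = 0.0, 0.0
    for _time, d, n1, n0 in rows:
        # Probability that the person with the event is exposed, given the risk set.
        p = n1 * math.exp(beta) / (n1 * math.exp(beta) + n0)
        score += d - p
        info += p * (1.0 - p)
    return score, info


def fit_log_hazard_ratio(rows, tol=1e-10, max_iter=50):
    """Newton-Raphson on the partial likelihood; returns (beta, standard error)."""
    beta = 0.0
    info = 1.0
    for _ in range(max_iter):
        score, info = score_and_information(beta, rows)
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    # Standard error from the information evaluated at the last iterate.
    return beta, 1.0 / math.sqrt(info)


beta_hat, se = fit_log_hazard_ratio(risk_sets)
hazard_ratio = math.exp(beta_hat)
```

Because a site-stratified partial likelihood is the product of the site-specific partial likelihoods, stacking the risk-set rows returned by several data partners and fitting them together corresponds to a Cox model stratified by site.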
Meta‐analysis of database‐specific effect estimates
PatID Exposure Outcome Time Age Sex DM HTN CVD …
001 1 0 312 33 1 0 1 1 …
002 0 0 40 45 1 0 1 0 …
003 0 0 365 76 0 0 0 0 …
004 0 0 200 56 0 1 0 0 …
005 0 1 2 21 0 0 1 0 …
006 1 1 15 80 1 0 0 1 …
007 1 0 4 65 1 1 0 1 …
008 1 0 145 77 0 1 0 0 …
009 0 1 33 48 1 0 0 0 …
010 0 0 98 52 1 0 0 0 …
011 0 0 34 32 0 0 0 0 …
… … … … … … … … … …
Hazard ratio Lower 95% CI Upper 95% CI
2.97 1.95 4.52
Distributed regression
Analyst inputs individual‐level dataset into statistical software (“regular” regression shares this)
ID E X1 X2 Y
A001 0 13.89 3.42 28.70
A002 1 18.10 1.29 27.90
A003 0 6.41 4.86 33.10
A004 1 16.30 1.45 17.20
A005 1 17.57 2.51 21.70
… … … … …
A100 0 5.78 2.53 23.76
Statistical software produces intermediate statistics as part of computing process (distributed regression shares this)
Type Name Intercept E X1 X2 Y
SSCP Intercept 100.0 52.0 1157.1 405.9 2235.5
SSCP E 52.0 52.0 813.2 138.1 1060.9
SSCP X1 1157.1 813.2 17751.3 3458.7 23815.8
SSCP X2 405.9 138.1 3458.7 2240.8 9572.3
SSCP Y 2235.5 1060.9 23815.8 9572.3 56911.9
MEAN 1.0 0.5 11.6 4.1 22.4
STD 0.0 0.5 6.6 2.5 8.4
N 100 100 100 100 100
Statistical software produces final results
Variable Parameter estimate Standard error
Intercept 25.4540 3.7959
E ‐0.4323 1.7865
X1 ‐0.5643 0.1432
X2 ‐0.6564 0.4532
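For linear regression, the intermediate statistics are sums of squares and cross products (SSCP), and these add up across sites. A minimal sketch, assuming ordinary least squares with an intercept, the six data rows visible on the slide split arbitrarily into two hypothetical sites, and hand-written helper names (`site_sscp`, `combine`, `solve`); it is an illustration of the idea, not the software used in the talk.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def site_sscp(rows: Sequence[Sequence[float]]) -> Tuple[Matrix, List[float]]:
    """Run locally at a site: X'X and X'y for rows of (covariates..., y)."""
    k = len(rows[0])  # intercept plus (k - 1) covariates
    xtx = [[0.0] * k for _ in range(k)]
    xty = [0.0] * k
    for *covariates, y in rows:
        x = [1.0, *covariates]
        for i in range(k):
            xty[i] += x[i] * y
            for j in range(k):
                xtx[i][j] += x[i] * x[j]
    return xtx, xty


def combine(parts: Sequence[Tuple[Matrix, List[float]]]) -> Tuple[Matrix, List[float]]:
    """Run at the analysis center: add the site-level matrices element-wise."""
    k = len(parts[0][1])
    xtx = [[sum(p[0][i][j] for p in parts) for j in range(k)] for i in range(k)]
    xty = [sum(p[1][i] for p in parts) for i in range(k)]
    return xtx, xty


def solve(a: Matrix, b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


# Rows (E, X1, X2, Y) visible on the slide; the site split is an assumption.
site_a = [(0, 13.89, 3.42, 28.70), (1, 18.10, 1.29, 27.90), (0, 6.41, 4.86, 33.10)]
site_b = [(1, 16.30, 1.45, 17.20), (1, 17.57, 2.51, 21.70), (0, 5.78, 2.53, 23.76)]

# Distributed: each site shares only its X'X and X'y.
beta_distributed = solve(*combine([site_sscp(site_a), site_sscp(site_b)]))
# Benchmark: the same model fitted on the pooled individual-level rows.
beta_pooled = solve(*site_sscp(site_a + site_b))
```

Standard errors would additionally need y'y and n from each site. Logistic and Cox models have no closed form, so their distributed versions exchange intermediate statistics over several iterations rather than once.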
Example 1
http://www.hopkinsmedicine.org/healthlibrary/test_procedures/gastroenterology/laparoscopic_adjustable_gastric_banding_135,63/
http://www.hopkinsmedicine.org/healthlibrary/test_procedures/gastroenterology/roux‐en‐y_gastric_bypass_weight‐loss_surgery_135,65/
Study design
Eligibility
• ≥21 years at time of bariatric surgery
• ≥1 BMI of 35 kg/m2 or greater
• Continuous enrollment w/ benefits
• No prior bariatric surgery
• No prior diagnosis of study outcome
Timeline
• Study period: 1/1/2005 to 12/31/2010
• Baseline: 365 days before the index bariatric hospitalization
• Start of follow up: discharge date
• Contributing person‐times until: re‐hospitalization, death, health plan disenrollment, 12/31/2010, or 730 days of follow‐up
Toh et al, Med Care, 2014;52:664‐668
Confounders
• Age
• Sex
• Race/ethnicity
• Diabetes*
• Baseline BMI*
• Year of procedure
• Charlson comorbidity score*
• Atrial fibrillation*
• GERD*
• Hypertension*
• Sleep apnea*
• Asthma*
• Deep vein thrombosis*
• Pulmonary embolism*
• Congestive heart failure*
• Hyperlipidemia*
• Coronary artery disease*
• Oxygen use*
• Assistive walking device*
• Smoking status*
• Blood pressure*
• Length of stay assoc. with procedure
*Identified during the 365‐day baseline period prior to the index bariatric hospitalization
Toh et al, Med Care, 2014;52:664‐668
Statistical analysis
Propensity score stratification
Analysis
• Pooled patient‐level data analysis (benchmark)
• Risk set‐based analysis
• PS‐stratified analysis (by quintile)
• Meta‐analysis of site‐specific effect estimates
Toh et al, Med Care, 2014;52:664‐668
Select baseline patient characteristics
Characteristics                Adjustable gastric band (n=1,550)   Roux‐en‐Y gastric bypass (n=5,792)
                               N         %*                        N         %*
Mean age (SD)                  46.7      11.2                      45.7      10.7
Age > 65 years                 76        4.9                       141       2.4
Female sex                     1,266     81.7                      4,823     83.3
Race/ethnicity
  Black or African American    137       8.8                       522       9.0
  White                        1,130     72.9                      3,840     66.3
  Hispanic                     142       9.2                       769       13.3
  Other                        62        4.0                       280       4.8
  Unknown                      79        5.1                       381       6.6
Baseline BMI
  30‐34.9                      96        6.2                       174       3.0
  35‐39.9                      480       31.0                      1,410     24.3
  40‐49.9                      813       52.4                      3,126     54.0
  ≥50                          161       10.4                      1,082     18.7
Toh et al, Med Care, 2014;52:664‐668
Individual‐level data analysis, by site
Site     Adjusted HR   95% CI
Site 1   0.68          0.45, 1.02
Site 2   0.65          0.37, 1.15
Site 3   0.52          0.26, 1.04
Site 4   0.72          0.35, 1.50
Site 5   0.82          0.46, 1.48
Site 6   0.32          0.13, 0.75
Site 7   0.79          0.62, 1.01
Toh et al, Med Care, 2014;52:664‐668
Results, by method
Method              Adjusted HR   95% CI
Individual‐level    0.71          0.59, 0.84
Risk set            0.71          0.59, 0.84
PS stratification   0.70          0.59, 0.83
Meta‐analysis       0.71          0.60, 0.84
Toh et al, Med Care, 2014;52:664‐668
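The meta-analysis row can be approximated from the seven site-specific estimates alone. A minimal sketch, assuming fixed-effect inverse-variance pooling on the log hazard ratio scale, with each standard error recovered from the rounded 95% CI reported by site; the slides do not state which meta-analytic model was used, so the model choice and the function name are assumptions. Under these assumptions the pooled value comes out close to the meta-analysis row above.

```python
import math

# Site-specific adjusted HRs and 95% CIs as reported in the by-site table.
site_estimates = [
    (0.68, 0.45, 1.02),
    (0.65, 0.37, 1.15),
    (0.52, 0.26, 1.04),
    (0.72, 0.35, 1.50),
    (0.82, 0.46, 1.48),
    (0.32, 0.13, 0.75),
    (0.79, 0.62, 1.01),
]


def fixed_effect_pool(estimates, z=1.96):
    """Inverse-variance fixed-effect pooling of hazard ratios on the log scale."""
    weighted_sum = 0.0
    weight_total = 0.0
    for hr, lower, upper in estimates:
        # Standard error of log(HR), recovered from the width of the 95% CI.
        se = (math.log(upper) - math.log(lower)) / (2 * z)
        weight = 1.0 / se ** 2
        weighted_sum += weight * math.log(hr)
        weight_total += weight
    pooled_log = weighted_sum / weight_total
    pooled_se = math.sqrt(1.0 / weight_total)
    return (
        math.exp(pooled_log),
        math.exp(pooled_log - z * pooled_se),
        math.exp(pooled_log + z * pooled_se),
    )


pooled_hr, pooled_lower, pooled_upper = fixed_effect_pool(site_estimates)
```

Each site shares three numbers, which is the least granular data-sharing approach of the four compared, at the cost of requiring every site to fit its own adjusted model.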
Example 2 – Distributed regression
Columns in each table: Covariate; distributed regression parameter estimate; distributed regression standard error; pooled patient‐level parameter estimate; pooled patient‐level standard error; difference in parameter estimates; difference in standard errors
Distributed Regression vs. Pooled Patient‐Level Regression – LINEAR
Intercept 35.50548 1.57690 35.50548 1.57690 ‐8.38E‐13 2.26E‐14
Variable 1 ‐0.27283 0.04401 ‐0.27283 0.04401 4.44E‐16 9.92E‐16
Variable 2 ‐1.01582 0.23259 ‐1.01582 0.23259 1.09E‐13 3.22E‐15
Variable 3 ‐0.73017 0.07229 ‐0.73017 0.07229 3.54E‐14 1.32E‐15
Distributed Regression vs. Pooled Patient‐Level Regression – LOGISTIC
Intercept 2.49660 0.49057 2.49660 0.49060 1.33E‐15 9.99E‐16
Variable 1 ‐0.14465 0.03686 ‐0.14460 0.03690 2.04E‐13 ‐2.97E‐14
Variable 2 ‐0.14105 0.06976 ‐0.14100 0.06980 1.38E‐14 ‐2.22E‐16
Variable 3 ‐0.13889 0.02376 ‐0.13890 0.02380 ‐2.42E‐14 ‐2.19E‐16
Distributed Regression vs. Pooled Patient‐Level Regression – COX
Variable 1 ‐0.06692 0.02084 ‐0.06692 0.02084 ‐1.39E‐16 2.78E‐17
Variable 2 ‐0.34644 0.19024 ‐0.34644 0.19024 2.22E‐16 ‐2.78E‐17
Variable 3 0.09653 0.02724 0.09653 0.02724 ‐1.80E‐16 1.73E‐17
Example 3 – PCORnet Bariatric Study
Use of bariatric surgery has expanded considerably
Evidence on the comparative effectiveness and safety of these procedures is limited
Study design
Comparisons
• Main analysis: RYGB vs. SG; RYGB vs. AGB; AGB vs. SG
• Aggregate analysis: RYGB vs. SG
Outcomes
• Main analysis: Weight change 1, 3, and 5 yrs post‐surgery; Diabetes remission and relapse; Major adverse events
• Aggregate analysis: Weight change 1 yr post surgery
Analysis
• Main analysis: One model that combines all data; Additional data‐driven approaches to select covariates
• Aggregate analysis: Site‐specific PS model; Fixed set of covariates
Combining propensity scores with distributed regression
Columns: Variable; parameter estimate from pooled individual‐level data analysis; parameter estimate from distributed regression; standard error from pooled individual‐level data analysis; standard error from distributed regression
RYGB vs. SG ‐0.05470 ‐0.05470 0.00113 0.00113
PS stratum 1 Reference Reference Reference Reference
PS stratum 2 ‐0.00754 ‐0.00754 0.00209 0.00209
PS stratum 3 ‐0.00671 ‐0.00671 0.00210 0.00210
PS stratum 4 ‐0.00717 ‐0.00717 0.00211 0.00211
PS stratum 5 0.00034218 0.00034218 0.00212 0.00212
PS stratum 6 ‐0.00583 ‐0.00583 0.00213 0.00213
PS stratum 7 ‐0.00135 ‐0.00135 0.00214 0.00214
PS stratum 8 ‐0.00435 ‐0.00435 0.00216 0.00216
PS stratum 9 ‐0.00523 ‐0.00523 0.00218 0.00218
PS stratum 10 ‐0.00812 ‐0.00812 0.00222 0.00222
Example 4: Prospective surveillance of saxagliptin
www.sentinelinitiative.org/sites/default/files/Drugs/Assessments/Mini‐Sentinel_AMI‐and‐Anti‐Diabetic‐Agents_Protocol_0.pdf
http://www.fda.gov/downloads/Drugs/GuidanceComplianceRegulatoryInformation/Guidances/ucm071627.pdf
SAVOR‐TIMI 53 Trial
Prospective surveillance of saxagliptin
Saxagliptin vs. sitagliptin
Saxagliptin vs. pioglitazone
Saxagliptin vs. sulfonylureas
Saxagliptin vs. long‐acting insulin
Comparisons with SAVOR‐TIMI 53 trial
Characteristics SAVOR‐TIMI 53 Trial Mini‐Sentinel surveillance*
Comparator Placebo Select anti‐hyperglycemics
No. saxagliptin users 8,280 82,264
No. comparator users 8,212 146,045 to 452,969
No. AMI in saxagliptin 265 94 to 171
No. AMI in comparator 278 75 to 1,085
Length of follow‐up 2.1 years (median) 4 to 8 months (mean)
Statistical analysis Intention‐to‐treat As‐treated
Hazard ratio for AMI 0.95 (95% CI: 0.80, 1.12) 0.54 to 1.17
* From end‐of‐surveillance analysis that included all patients
Interim results from the first 5 sequential analyses were made available to FDA prior to the publication of SAVOR‐TIMI 53 findings
Analytical flexibility vs. granularity of information
[Figure: data‐sharing approaches arranged by analytic flexibility and privacy protection: individual‐level data with individual covariates, individual‐level data with summary scores, risk‐set data, summary‐table data, effect‐estimate data, and intermediate statistics.]
Analytic methods in multi‐center studies
What to share? Covariate summarization technique
• Individual covariates*
• Propensity scores
• Disease risk scores
• Summary scores + individual covariates
• A hybrid of above
How to share? Data sharing approach
• Individual‐level data
• Summary‐table data
• Risk‐set data
• Effect‐estimate data
• Intermediate statistics
What can we do? Covariate adjustment technique
• Matching
• Stratification
• Restriction
• Weighting
• Modeling
What outcome? Outcome type
• Continuous
• Binary
• Count
• Survival
Conclusion
A suite of analytic methods is available for multi‐center studies
There are often trade‐offs between analytic flexibility and identifiability of information shared
Some newer methods offer excellent analytic flexibility and good privacy protection