Real World Evidence (RWE): Experiments on the potential value of advanced analytics National Academy of Medicine September 2017 Ameet Nathwani
1
SOURCE: McKinsey
Evidence required
Submission
Pricing Review
New Competition
Competitor goes generic
New Indication / Formulation
Understand standard of care
Post-marketing commitments (safety etc.) Budget impact
Adherence Unmet need/ disease burden
Patient recruitment
Trial design
Utilization/prescribing patterns
Long-term safety, clinical outcomes
Head to head safety, comparative effectiveness
Target populations
Differentiation in sub-populations
Effects of switching on outcomes
Usage difference
Differentiate with or vs. protected formulation
Development Launch
Development Growth Phase Mature Phase
As an industry, the continuous evidence requirements are increasing
Legend: Now / Past
CONTEXT
2
RWE can change the way we approach traditional, expensive RCTs
1 – p-value reported for composite score
SOURCE: Pubmed.gov, web research; B Carstensen et al. Cancer occurrence in Danish diabetic patients: duration and insulin effects. 2011; National Diabetes Statistics, 2011; AZ website
Case example – AZ's CVD-REAL RWE study
• More than 300,000 patients
• Medical records from across 6 countries
• Unique insights on treatment with SGLT-2s
– Reduced all-cause mortality by 51%
– Reduced risk of hospitalization for HF by 39% compared to other T2D medicines
– p<0.001¹
• Less than 6 months
• Cost < USD 1 million
3
RWE is defined as Health Care data that are not included in randomized clinical trials (RCT)
SOURCE : JAMA, 22 May 2014
RCT
RWE
4
We believe that increased knowledge of the determinants of health provides an opportunity for personalized treatment; the sheer number of variables demands a complex analytics approach
SOURCE: Adapted from https://pct.mdanderson.org/#/
5
We conducted a series of RWE experiments with external partners (from Dec 2016 to May 2017)
Advanced Analytics QUINTEN
Foundational RWE
Predictive analytics
New platforms
Data integration: Palantir/QB
Visualization tools QB/ IBM WATSON
RWE sources of data
RCT DRE
Optum Humedica
IHME Explorys/Truven
Launched RWE experiments
A
B
C
Sanofi RCT data
6
The platform and analytics enabled interactive visualization of treatment pathways to Insulin Long Acting (ILA), creating a real-time dashboard of physician practice
Insulin Long Acting +/- OADs
Last before ILA*
*Includes any combination with OADs
SR (🎯🎯): success rate of target regimen
HbA1c (⇢): average HbA1c before and during treatment regimen
Elapsed (⇥): time from 1st treatment to start of current treatment
Duration (↦): time between start of current regimen and start of target treatment
Naïve: patients whose first treatment regimen was ILA, or any combination with ILA
2,163 2.3y (4.1 reg.)
1,482 0y (0 reg.)
649 2.3y (4.3 reg.)
158 1.4y (3.5 reg.)
259 2.8y (5.0 reg.)
614 2.1y (4.7 reg.)
373 2.0y (4.4 reg.)
352 2.1y (3.1 reg.)
165 2.2y (5.6 reg.)
3rd Regimen 1st Regimen 2nd Regimen 1st ILA* Rx
Source: Explorys database
6.2K
2.2K
0.5K
53
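A pathway view of this kind can be derived from a per-patient regimen history. The following is a minimal sketch under stated assumptions: the record layout, the `is_ila` rule and the sample records are hypothetical and are not taken from the Explorys database.

```python
from collections import Counter, defaultdict

# Hypothetical regimen history: (patient_id, start_time_in_years, regimen).
# The records are invented for illustration; they are not Explorys data.
records = [
    ("p1", 0.0, "Metformin"), ("p1", 1.5, "Metformin+SU"), ("p1", 3.0, "ILA"),
    ("p2", 0.0, "ILA"),
    ("p3", 0.0, "Metformin"), ("p3", 2.0, "ILA+Metformin"),
]

def is_ila(regimen):
    # ILA alone or in any combination with OADs counts as the target regimen.
    return "ILA" in regimen.split("+")

def pathways_to_ila(records):
    """Summarise, per patient, the path taken up to the first ILA regimen."""
    by_patient = defaultdict(list)
    for pid, start, regimen in sorted(records):
        by_patient[pid].append((start, regimen))
    summary = {}
    for pid, history in by_patient.items():
        for i, (start, regimen) in enumerate(history):
            if is_ila(regimen):
                summary[pid] = {
                    # "Naive" marks patients whose first regimen was ILA.
                    "last_before_ila": history[i - 1][1] if i > 0 else "Naive",
                    "elapsed_years": start - history[0][0],
                    "regimens_before_ila": i,
                }
                break
    return summary

summary = pathways_to_ila(records)
last_before_counts = Counter(s["last_before_ila"] for s in summary.values())
```

Aggregating `summary` by `last_before_ila` gives the patient counts, elapsed time and number of prior regimens shown per pathway node.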
7
We were able to gain quick insights, for example by applying machine learning models to personalized medicine approaches
An actual patient who successfully intensified from Metformin to Free Combo was run through the machine learning models for success, persistence, hypoglycemia and associated pathway costs
8
Conditional Probability of Reaching Control
After 12 months on ILA, the probability of achieving control over the next 3 months is minimal.
We were able to demonstrate that the probability of incremental control after 12 months on ILA is minimal, and could then drill down into the characteristics and predictors of specific cohorts
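The conditional probability described here can be sketched as follows: among patients not yet controlled after t months on ILA, the share who reach control within the next 3 months. The input list is hypothetical, and the sketch assumes every patient is followed for the full horizon (no censoring), which real EMR data would not guarantee.

```python
# Hypothetical input: month on ILA at which each patient first reached
# glycemic control, or None if control was never reached during follow-up.
months_to_control = [1, 2, 2, 3, 3, 4, 5, 6, 8, 11,
                     None, None, None, None, None, None]

def conditional_probability(months_to_control, t, window=3):
    """P(control reached in (t, t + window] | not controlled by month t)."""
    at_risk = [m for m in months_to_control if m is None or m > t]
    if not at_risk:
        return None  # nobody left uncontrolled at month t
    reached = [m for m in at_risk if m is not None and m <= t + window]
    return len(reached) / len(at_risk)

# Probability of reaching control in the next 3 months, by time on ILA.
curve = {t: conditional_probability(months_to_control, t)
         for t in (0, 3, 6, 9, 12)}
```

With these invented values the probability falls from about 0.31 at treatment start to 0 after 12 months, the shape the slide describes.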
9
We tested the utility of applying NLP to doctors’ notes, followed by machine learning
1 Humedica - new and rich RWE data source
▪ First-time use of Humedica, a rich database with 4.4 Mn diabetes patients
▪ Over 10 years of longitudinal data, 1000s of variables
2 Novel use of “Natural Language Processing”, capturing more hypos
▪ Most comprehensive definition of “hypo” and “severe hypo” events to date
▪ Novel use of “Natural Language Processing” (NLP) on doctors’ notes, capturing twice as many hypos
▪ Rigorous exclusion criteria at patient, treatment and hypo event level
3 Use of actual hypo cost incurred by the payor
▪ Integration of Electronic Health Records (EHR) and claims data, making it possible to estimate the actual “cost” incurred by the payor in association with hypo events
4 Unsupervised machine learning and modelling at treatment level on the full EMR dataset
▪ Advanced unsupervised clustering to identify performance drivers of hypo rate/“cost”
▪ Machine learning techniques automatically adjusting for hundreds of biases when comparing outcomes across cohorts
▪ Dynamic modeling at “treatment level” (vs. conventional “patient level”), capturing more insights from the data (“switchers” can contribute to the study more than once)
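Modelling at treatment level rather than patient level can be pictured as a reshaping step: one row per patient-treatment, so a switcher contributes one row per treatment period. A minimal sketch, with a hypothetical record layout and invented values:

```python
# Hypothetical patient-level records: an ordered list of
# (treatment, start_day, end_day) plus the days on which hypos occurred.
patients = {
    "p1": {"treatments": [("A", 0, 200), ("B", 200, 565)],
           "hypo_days": [50, 300, 400]},
    "p2": {"treatments": [("B", 0, 365)], "hypo_days": []},
}

def to_treatment_level(patients):
    """Return one row per patient-treatment with exposure and hypo rate."""
    rows = []
    for pid, rec in patients.items():
        for treatment, start, end in rec["treatments"]:
            # Attribute each hypo to the treatment period it falls in.
            events = sum(1 for d in rec["hypo_days"] if start <= d < end)
            years = (end - start) / 365
            rows.append({"patient": pid, "treatment": treatment,
                         "years": years, "hypos": events,
                         "hypos_per_year": events / years})
    return rows

rows = to_treatment_level(patients)
```

Patient `p1` switches from A to B and therefore appears twice, once under each treatment.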
10 All content Copyright © 2017 QuantumBlack Visual Analytics Ltd.
Illustration: Use of NLP on doctors’ notes makes it possible to identify ~90% more hypos and ~20% more severe hypos within Type 2 BI-treated patients
53
59
61
53
101
109
83
172
13
37
235
191 (81%)
44 (19%)
Hypos uniquely identified through ICD codes
Additional hypos uniquely identified through NLP
Additional hypos uniquely identified through "IM Glucagon"
Total hypo events identified
Additional hypos uniquely identified through lab results
Severe
Non-severe
1 Codes regarded as severe by default: ICD-9 249.30; 250.30; 250.31; 251.0; ICD-10 E08.641; E09.641; E10.641; E11.641; E13.641; E15. Additionally, ICD codes coinciding with discharge or admission diagnosis or hypo event date on the same day as ED visit or inpatient admission.
2 Plasma glucose <54 mg/dL
3 Presence of severe NLP term (e.g. seizure) or presence of severe NLP descriptor (e.g. serious, major) or presence of ED/inpatient visit on the day of the hypo event
~90% more hypos (~20% more severe hypos) identified through NLP within BI-treated patients
Breakdown of distinct hypo events experienced by T2D basal insulin treated patients (2009-16 period, 000’s hypos)
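The footnoted identification rules can be sketched as a small classifier that records which source flags an event (ICD codes, lab results, NLP) and whether it counts as severe. The severe ICD list is copied from footnote 1. Everything else is an assumption made for illustration: the non-severe ICD codes, the NLP term patterns, treating the <54 mg/dL threshold as a detection rule only, and the function name `classify_event`. The "IM Glucagon" source is not modelled.

```python
import re

# Severe-by-default ICD codes, copied from footnote 1.
SEVERE_ICD = {"249.30", "250.30", "250.31", "251.0",
              "E08.641", "E09.641", "E10.641", "E11.641", "E13.641", "E15"}
# Assumption: example non-severe hypo codes; the study's full list is not given.
OTHER_HYPO_ICD = {"E10.649", "E11.649"}
# Assumption: toy term lists; the study's actual NLP vocabulary is not given.
HYPO_TERMS = re.compile(r"\bhypoglyca?emi[ac]\b|\blow blood sugar\b", re.I)
SEVERE_TERMS = re.compile(r"\b(seizure|serious|major)\b", re.I)

def classify_event(icd_codes=(), glucose_mg_dl=None, note="",
                   ed_or_inpatient=False):
    """Return (sources, severe) for one candidate hypo event."""
    sources = []
    if any(c in SEVERE_ICD or c in OTHER_HYPO_ICD for c in icd_codes):
        sources.append("ICD")
    if glucose_mg_dl is not None and glucose_mg_dl < 54:
        sources.append("lab")
    if HYPO_TERMS.search(note):
        sources.append("NLP")
    # Severe only if some source flagged a hypo and a severity rule applies.
    severe = bool(sources) and (
        any(c in SEVERE_ICD for c in icd_codes)
        or bool(SEVERE_TERMS.search(note))
        or ed_or_inpatient
    )
    return sources, severe
```

For example, `classify_event(note="Pt reports low blood sugar with seizure at home")` is flagged by NLP alone and marked severe, which is the kind of event coded data would miss.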
11
Identified segments with differential severe hypo rate on A vs. B
B population used as baseline; only segments with >10% of B patient-treatments are shown
Looking at advanced modelling of outcomes – Overview of segments with differential severe hypo rate on different insulins
A underperforming B
A outperforming B
T over performing its own average
A underperforming its own average
Patients w hypertension (58%; -$259 | -0.044)
African American patients (13%; -$181 | -0.030)
Obese patients (18%; -$179 | -0.025)
Patients w neuropathy (12%; -$151 | -0.025)
Patients w mean HbA1C > 9.7 (25%; $0 | -0.005)
Patients w mean HbA1C < 7.9 (25%; $0 | -0.005)
Patients w previous noncompliance (11%; +$94 | +0.028)
Patients w/o hypertension (42%; +$218 | +0.045)
SEVERE HYPOS
PREDICTIVE MODELING
Patients older than 69 (26%; -$2 | -0.001)
Patients w cardiovascular disorders (31%; -$5 | -0.001)
Patients prescribed DppIV (29%; -$170 | -0.029)
Δ incurred hypo cost
Segment name (A%; $B | C)
A. % of B patients
B. Δ incurred severe hypo cost on A vs. B (pppy)
C. Δ severe hypo rate on A vs. B (pppy)
A vs. B
A
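The segment legend can be read as a simple computation on patient-treatment rows: the segment's share of the B population, and the difference in severe hypo rate between A and B within the segment. The sketch below uses raw, unadjusted rates on invented data; the analysis in the deck adjusts for confounders with machine learning, which is not reproduced here. Field and function names are hypothetical.

```python
# Hypothetical patient-treatment rows: drug, segment flag, exposure in years
# and number of severe hypos. Values are invented for illustration.
rows = [
    {"drug": "A", "hypertension": True, "years": 2.0, "severe_hypos": 1},
    {"drug": "A", "hypertension": True, "years": 2.0, "severe_hypos": 0},
    {"drug": "A", "hypertension": False, "years": 1.0, "severe_hypos": 1},
    {"drug": "B", "hypertension": True, "years": 2.0, "severe_hypos": 2},
    {"drug": "B", "hypertension": True, "years": 2.0, "severe_hypos": 1},
    {"drug": "B", "hypertension": False, "years": 4.0, "severe_hypos": 0},
]

def segment_delta(rows, segment, baseline="B", comparator="A", min_share=0.10):
    """Unadjusted delta in severe hypo rate (events per year) within a segment."""
    def rate(drug):
        sel = [r for r in rows if r["drug"] == drug and r[segment]]
        years = sum(r["years"] for r in sel)
        return sum(r["severe_hypos"] for r in sel) / years if years else None

    base = [r for r in rows if r["drug"] == baseline]
    if not base:
        return None
    share = sum(1 for r in base if r[segment]) / len(base)
    if share <= min_share:
        return None  # segment too small relative to the baseline population
    rate_a, rate_b = rate(comparator), rate(baseline)
    if rate_a is None or rate_b is None:
        return None
    return {"share_of_baseline": share, "delta_rate": rate_a - rate_b}

result = segment_delta(rows, "hypertension")
```

A negative `delta_rate` corresponds to A outperforming B in that segment.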
12
Cloud hosting
Data engineering
Data model
Analytics tools
Visualization tools
Portal
Custom analytics
Infrastructure
Custom analytics
Custom analytics
RWE & other external sources (e.g., EMR, Claims, Payers, Labs, Devices, social)
Sanofi internal data (e.g., RCT, phase IV, Research, Digital, Commercial)
Single point of entry to “analytics cockpit”
Reporting & data visualization
Analytical outputs from Sanofi & external partners
Suite of advanced analytics tools to run analyses
Data sources accessed from 1 common “Data lake”
Semantic linkages within and across data sources
Platform components
Highly scalable modern technology backbone
How we work: We are establishing an integrated platform that consolidates all data and models and visualizes insights
13
What we do: Methodologies we use and some of the partners we work with
Methodologies Partnerships
Machine learning
Unsupervised clustering
Model bootstrapping
Predictive modelling
Natural Language Processing (NLP)
Propensity-score matching
Meta-analysis
Traditional RWE statistics
Rule extraction
OMOP data modelling
Traditional analytics
Advanced analytics
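Of the methodologies listed, bootstrapping is the simplest to illustrate. The sketch below computes a percentile bootstrap confidence interval for a difference in means between two cohorts. It is a generic illustration, not the specific "model bootstrapping" procedure used in these experiments, and the sample values are invented.

```python
import random
import statistics

def bootstrap_ci(sample_a, sample_b, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(sample_a) - mean(sample_b)."""
    rng = random.Random(seed)  # fixed seed so the result is reproducible
    diffs = []
    for _ in range(n_resamples):
        # Resample each cohort with replacement, keeping its original size.
        resample_a = rng.choices(sample_a, k=len(sample_a))
        resample_b = rng.choices(sample_b, k=len(sample_b))
        diffs.append(statistics.fmean(resample_a) - statistics.fmean(resample_b))
    diffs.sort()
    lower = diffs[int((alpha / 2) * n_resamples)]
    upper = diffs[int((1 - alpha / 2) * n_resamples) - 1]
    return lower, upper

# Hypothetical HbA1c changes (percentage points) in two cohorts.
change_a = [-1.2, -0.8, -1.5, -0.3, -0.9, -1.1, -0.6, -1.0]
change_b = [-0.5, -0.7, -0.2, -0.9, -0.4, -0.1, -0.6, -0.3]
lower, upper = bootstrap_ci(change_a, change_b)
```

An interval that excludes zero suggests the difference is unlikely to be a resampling artefact, although it says nothing about confounding.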
14
Advanced analytics methodology: 6 step process overview
1 Select initial datasets; build target variables; test and refine variables; perform descriptive analyses to understand the underlying data set
2 Identify and build independent variables, e.g., comorbidities, coprescriptions, habits (e.g., smoking); build select proxy variables, e.g., disease progression, disease activity
3 Include key variables defined by experts (e.g., insurance type); increase signal-to-noise ratio by grouping co-occurring variables (clustering): diagnoses, procedures, drug treatments
4 Run linear and non-linear ML algorithms to identify drivers of target variables; create new IVs as drivers are identified
5 Identify segments with differential outcomes between drugs of interest; validate causation
6 Develop clinical interpretation of outcomes in overall population or patient subsegments; propose actions: inform clinical and outcomes research, refine view of health economics, publish findings
C
B
A
…
Initial datasets, 3-5 target variables
100s of independent variables
1000s of independent variables
10-50 performance drivers
Predictions
Machine learning models
1 2 3 4 5 6 Segmentation
Scenario construction and impact assessment
Decision support
Activities
Critical stages
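The activity of grouping co-occurring variables (clustering) to raise the signal-to-noise ratio can be sketched with a simple similarity rule: variables whose patient sets overlap strongly are merged into one group. The Jaccard threshold, the union-find merging and the variable names are assumptions chosen for illustration; the clustering method actually used is not specified in the deck.

```python
from itertools import combinations

# Hypothetical binary variables: variable name -> set of patient ids with it.
variables = {
    "dx_hypertension": {1, 2, 3, 4, 5},
    "rx_ace_inhibitor": {1, 2, 3, 4},
    "dx_neuropathy": {6, 7, 8},
    "rx_gabapentin": {6, 7, 8, 9},
    "proc_eye_exam": {2, 7, 10},
}

def jaccard(a, b):
    """Share of patients having both variables among those having either."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def group_cooccurring(variables, threshold=0.6):
    """Merge variables whose Jaccard similarity reaches the threshold."""
    parent = {name: name for name in variables}

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for u, v in combinations(variables, 2):
        if jaccard(variables[u], variables[v]) >= threshold:
            parent[find(u)] = find(v)
    groups = {}
    for name in variables:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())

groups = group_cooccurring(variables)
```

Each resulting group can then be fed to the driver-identification models as a single feature, which is one way thousands of raw variables shrink to a more manageable set.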
15
EXAMPLE SCENARIO: Predictive analytics is developed on the entire dataset and allows extrapolation to non-observed data
Approach Simulation Description
Descriptive analysis
▪ A single descriptive statistic across the entire population sample is calculated (typically mean or median)
Propensity score matching
▪ Descriptive analysis is performed only on the overlapping (matched) subsample of the population
▪ For non-overlapping populations, analysis is typically not performed or is difficult to interpret
Predictive analytics
▪ Predictive model is developed on the entire available dataset
▪ Predictive analytics is typically used to extrapolate to non-observed data
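The restriction of matching to the overlapping subsample can be shown with a toy example. As a simplification, the sketch matches directly on a single covariate within a caliper, standing in for matching on an estimated propensity score; the cohorts and the caliper value are invented.

```python
# Hypothetical baseline HbA1c values for two cohorts with partial overlap.
cohort_a = [6.5, 7.0, 7.5, 8.0, 8.5, 9.0]
cohort_b = [8.4, 8.9, 9.5, 10.0, 10.5]

def caliper_match(treated, control, caliper=0.2):
    """Greedy 1:1 nearest-neighbour matching without replacement."""
    available = list(control)
    pairs = []
    for t in sorted(treated):
        if not available:
            break
        nearest = min(available, key=lambda c: abs(c - t))
        if abs(nearest - t) <= caliper:
            pairs.append((t, nearest))
            available.remove(nearest)  # each control is used at most once
    return pairs

pairs = caliper_match(cohort_a, cohort_b)
unmatched_a = [t for t in cohort_a if t not in {p[0] for p in pairs}]
```

Only two of the six cohort A patients find a match here; the other four, and three cohort B patients, drop out of the comparison.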
16
EXAMPLE SCENARIO: Predictive analytics can capture patterns across the entire dataset available for modelling
Approach Simulation Description
Descriptive analysis
▪ A single estimate is provided reflecting the entire available dataset (e.g. mean, median)
Propensity score matching
▪ Analysis performed on matched samples does not capture patterns in the non-overlapping subsample (e.g. downward sloping pattern in the red sample)
Predictive analytics
▪ Predictive model can capture patterns across the entire dataset available for modelling
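A one-variable least-squares fit illustrates the contrast: a single mean hides a downward slope that a model fitted on all observations recovers, and the fitted model can be evaluated at covariate values that were never observed. The data are invented, and a straight line is used purely as the simplest possible predictive model.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope

# Hypothetical cohort in which the outcome falls as the covariate rises.
x_red = [1.0, 2.0, 3.0, 4.0, 5.0]
y_red = [10.0, 8.0, 6.0, 4.0, 2.0]

overall_mean = sum(y_red) / len(y_red)     # descriptive: one number
intercept, slope = fit_line(x_red, y_red)  # predictive: captures the trend
predicted_at_6 = intercept + slope * 6.0   # extrapolation beyond observed x
```

The extrapolated value is only as trustworthy as the model assumptions outside the observed range, which is the price paid for covering non-overlapping populations.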
17
Literature that supports a predictive modeling approach for RWE
Study / Takeaway
Cepeda et al. Comparison of Logistic Regression versus Propensity Score When the Number of Events Is Low and There Are Multiple Confounders: https://www.ncbi.nlm.nih.gov/pubmed/12882951
▪ "Logistic regression is the technique of choice when there are at least eight events per confounder."
▪ Given the size of our data (tens of thousands of patients), this is almost certainly the case
G. King et al. Why Propensity Scores Should Not Be Used for Matching: https://gking.harvard.edu/publications/why-propensity-scores-should-not-be-used-formatching
▪ PSM often results in “increasing imbalance, inefficiency, model dependence, and bias”
Stefan Wager, Susan Athey - Estimation and Inference of Heterogeneous Treatment Effects using Random Forests: https://arxiv.org/abs/1510.04342
▪ Read 'Treatment Effect' as delta, and 'Heterogeneous' as 'different per segment, and even per patient'
▪ On classical methods such as matching: "These methods perform well in applications with a small number of covariates, but quickly break down as the number of covariates increases"
▪ Paper offers several citations of people using predictive methodology as opposed to PSM
▪ Paper explores an alternative approach, the 'Causal Forest', which is almost a hybrid of predictive and PSM
Shah BR, Laupacis A, Hux JE, Austin PC. Propensity score methods gave similar results to traditional regression modeling in observational studies: a systematic review: https://www.ncbi.nlm.nih.gov/pubmed/15878468
▪ "Observational studies had similar results whether using traditional regression or propensity scores to adjust for confounding. Propensity scores gave slightly weaker associations; "
▪ Basically, even for those studies where it is feasible to use PSM, regression can be a better option
Klungel OH et al. Methods to assess intended effects of drug treatment in observational studies are reviewed: https://www.ncbi.nlm.nih.gov/pubmed/15617947
▪ An interesting survey of the area
Shmueli G. To Explain or Predict: https://www.stat.berkeley.edu/~aldous/157/Papers/shmueli.pdf
▪ An in-depth discussion of predictive vs. explanatory analysis, especially with respect to causality.
Angrist JD and Pischke JS Mostly Harmless Econometrics: An Empiricist’s Companion: http://www.mostlyharmlesseconometrics.com
▪ Discussion of methods to deal with causality and confounding: Heckman’s correction, Probit and Tobit, Propensity scores (weighting and matching).
▪ Contains many links, references for further reading.
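The "eight events per confounder" rule quoted from the first study in the list reduces to a one-line check. Note that it counts outcome events, not patients, so a large cohort does not guarantee the threshold is met when events are rare. The function names and example numbers are hypothetical, and the label returned below the threshold is an assumption based on the comparison in the paper's title.

```python
def events_per_confounder(n_events, n_confounders):
    """Ratio of observed outcome events to confounders adjusted for."""
    if n_confounders <= 0:
        raise ValueError("need at least one confounder")
    return n_events / n_confounders

def suggested_method(n_events, n_confounders, threshold=8):
    """Apply the 'at least eight events per confounder' rule of thumb."""
    epc = events_per_confounder(n_events, n_confounders)
    # Assumption: below the threshold, propensity scores are the alternative.
    return "logistic regression" if epc >= threshold else "propensity score"

# Hypothetical study: 30,000 patients, 400 severe hypo events, 50 confounders.
method = suggested_method(n_events=400, n_confounders=50)
```

With 400 events and 50 confounders the ratio is exactly 8, so the rule is only just satisfied despite the 30,000 patients.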