
Real World Evidence (RWE): Experiments on the potential value of advanced analytics

National Academy of Medicine, September 2017

Ameet Nathwani

1

SOURCE: McKinsey

Evidence required

Submission

Pricing Review

New Competition

Competitor goes generic

New Indication / Formulation

Understand standard of care

Post-marketing commitments (safety etc.) Budget impact

Adherence Unmet need/ disease burden

Patient recruitment

Trial design

Utilization/prescribing patterns

Long-term safety, clinical outcomes

Head to head safety, comparative effectiveness

Target populations

Differentiation in sub-populations

Effects of switching on outcomes

Usage difference

Differentiate with or vs. protected formulation

Development Launch

Development Growth Phase Mature Phase

As an industry, the continuous evidence requirements are increasing

Now / Past

CONTEXT

2

RWE can change the way we approach traditional, expensive RCTs

1 – p-value reported for composite score

SOURCE: PubMed.gov, web research; B. Carstensen et al., Cancer occurrence in Danish diabetic patients: duration and insulin effects, 2011; National Diabetes Statistics, 2011; AZ website

Case example – AZ's CVD-REAL RWE study

• More than 300,000 patients

• Medical records from across 6 countries

• Unique insights on treatment with SGLT-2s

– Reduced all-cause mortality by 51%

– Reduced risk of hospitalization for HF by 39% compared to other T2D medicines

– p<0.001¹

• Less than 6 months

• Cost < USD 1 million

3

RWE is defined as Health Care data that are not included in randomized clinical trials (RCT)

SOURCE : JAMA, 22 May 2014

RCT

RWE

4

We believe that increased knowledge of the determinants of health provides an opportunity for personalized treatment; the sheer number of variables demands a complex analytics approach

SOURCE: Adapted from https://pct.mdanderson.org/#/

5

We conducted a series of RWE experiments with external partners (December 2016 to May 2017)

Advanced Analytics QUINTEN

Foundational RWE

Predictive analytics

New platforms

Data integration: Palantir/QB

Visualization tools QB/ IBM WATSON

RWE sources of data

RCT DRE

Optum Humedica

IHME Explorys/Truven

Launched RWE experiments

A

B

C

Sanofi RCT data

6

The platform and analytics allowed interactive visualization of treatment pathways to Insulin Long Acting (ILA), creating a real-time dashboard of physician practice

Insulin Long Acting +/- OADs

Last before ILA*

*Includes any combination with OADs

SR (🎯🎯): success rate of target regimen

HbA1c (⇢): average HbA1c before and during treatment regimen

Elapsed (⇥): time from 1st treatment to start of current treatment

Naïve: patients whose first treatment regimen was ILA, or any combination with ILA

Duration (↦): time between start of current regimen and start of target treatment

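[Figure: treatment pathway diagram across 1st Regimen, 2nd Regimen, 3rd Regimen and 1st ILA* Rx. Node names are not recoverable from the extracted text.]

Node labels as extracted: 2,163 | 2.3y (4.1 reg.); 1,482 | 0y (0 reg.); 649 | 2.3y (4.3 reg.); 158 | 1.4y (3.5 reg.); 259 | 2.8y (5.0 reg.); 614 | 2.1y (4.7 reg.); 373 | 2.0y (4.4 reg.); 352 | 2.1y (3.1 reg.); 165 | 2.2y (5.6 reg.)

Other values in the figure: 6.2K; 2.2K; 0.5K; 53

Source: Explorys database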
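The pathway quantities named in the legend (last regimen before ILA, elapsed time since first treatment, naïve patients) could be derived from regimen-level records roughly as follows. This is a minimal sketch on made-up records; the record layout, the regimen names and the `is_ila` rule are assumptions for illustration and do not reflect the Explorys schema or the platform's actual logic.

```python
from collections import Counter
from datetime import date

# Hypothetical toy records: (patient_id, regimen name, regimen start date).
records = [
    ("p1", "Metformin", date(2012, 1, 1)),
    ("p1", "Metformin+SU", date(2013, 1, 1)),
    ("p1", "ILA", date(2014, 1, 1)),
    ("p2", "ILA", date(2013, 6, 1)),
    ("p3", "Metformin", date(2011, 3, 1)),
    ("p3", "ILA+Metformin", date(2012, 3, 1)),
]


def is_ila(regimen):
    # Assumption: a regimen counts as ILA if "ILA" is one of its "+"-separated parts.
    return "ILA" in regimen.split("+")


def pathways_to_first_ila(records):
    """Summarise, per patient, the path taken up to the first ILA regimen."""
    by_patient = {}
    for pid, regimen, start in sorted(records, key=lambda r: (r[0], r[2])):
        by_patient.setdefault(pid, []).append((regimen, start))

    summary = {}
    for pid, regimens in by_patient.items():
        for idx, (regimen, start) in enumerate(regimens):
            if is_ila(regimen):
                first_start = regimens[0][1]
                summary[pid] = {
                    # "Naive" = first treatment regimen already contained ILA.
                    "last_before_ila": regimens[idx - 1][0] if idx > 0 else "Naive",
                    "prior_regimens": idx,
                    "elapsed_years": (start - first_start).days / 365.25,
                }
                break
    return summary


summary = pathways_to_first_ila(records)
# Number of patients per "last regimen before ILA" bucket.
counts = Counter(s["last_before_ila"] for s in summary.values())
```

Aggregating `counts` per bucket gives the kind of node sizes a pathway diagram displays; averaging `elapsed_years` and `prior_regimens` per bucket gives the accompanying labels.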

7

We were able to gain quick insights: for example the application of Machine Learning Models for Personalized Medicine approaches

Example: an actual patient who successfully intensified from Metformin to Free Combo, run through the machine learning models for success, persistence, hypoglycemia and associated pathway costs

8

Conditional Probability of Reaching Control

After 12 months on ILA, the probability of achieving control over the next 3 months is minimal.

We were able to demonstrate that the probability of incremental control after 12 months on ILA is minimal, and can then drill down into the characteristics and predictors of specific cohorts
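One way to read "conditional probability of reaching control" is: among patients still uncontrolled after a given number of months on ILA, the share who reach control within the next 3 months. The sketch below implements that reading on invented data. The function name, the input format and the decision to ignore censoring (patients lost to follow-up) are assumptions, not the method used in the study.

```python
def conditional_control_probability(months_to_control, start, window=3):
    """P(control reached in (start, start + window] | not controlled by `start`).

    months_to_control: one entry per patient, the month in which control was
    first reached, or None if control was never observed.
    """
    # Patients still uncontrolled at `start` form the at-risk set.
    at_risk = [m for m in months_to_control if m is None or m > start]
    if not at_risk:
        return None
    reached = sum(1 for m in at_risk if m is not None and m <= start + window)
    return reached / len(at_risk)


# Hypothetical cohort of 20 patients (values are made up for illustration).
months_to_control = [1, 2, 2, 3, 3, 4, 5, 6, 7, 9, 11, 14] + [None] * 8

p_first_quarter = conditional_control_probability(months_to_control, start=0)
p_after_12_months = conditional_control_probability(months_to_control, start=12)
```

Evaluating the function at successive values of `start` traces out a curve of this kind; a proper analysis would also need to handle censoring, for example with survival-analysis methods.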

9

We tested the utility of applying NLP on Doctors’ notes followed by Machine learning

1 Humedica - new and rich RWE data source

▪ First-time use of Humedica, a rich database with 4.4 Mn diabetes patients

▪ Over 10 years of longitudinal data, 1000s of variables

2 Novel use of “Natural Language Processing” to capture more hypos

▪ Most comprehensive definition of “hypo” and “severe hypo” events to date

▪ Novel use of “Natural Language Processing” (NLP) on doctors’ notes, capturing about twice as many hypos

▪ Rigorous exclusion criteria at patient, treatment and hypo event level

3 Use of actual hypo cost incurred by the payor

▪ Integration of Electronic Health Records (EHR) and claims data, making it possible to estimate the actual “cost” incurred by the payor in association with hypo events

4 Unsupervised machine learning and modelling at treatment level on the full EMR dataset

▪ Advanced unsupervised clustering to identify performance drivers of hypo rate/“cost”

▪ Machine learning techniques that automatically adjust for hundreds of biases when comparing outcomes across cohorts

▪ Dynamic modelling at “treatment level” (vs. conventional “patient level”), capturing more insights from the data (“switchers” can contribute to the study more than once)
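To make the idea of finding hypo events in doctors' notes concrete, here is a deliberately simple keyword-and-negation sketch. It is not the NLP pipeline used in the experiment, which is not described in the slides; the term list, the negation cues and the function name are all assumptions chosen for illustration.

```python
import re

# Assumed vocabulary for hypoglycemia mentions (illustrative, not exhaustive).
HYPO_PATTERN = re.compile(
    r"\b(?:hypoglyca?emi[ac]|low blood sugar|hypo)\b", re.IGNORECASE
)
# Assumed negation cues appearing earlier in the same sentence.
NEGATION_PATTERN = re.compile(
    r"\b(?:no|denies|denied|without|negative for)\b[^.;]*$", re.IGNORECASE
)


def mentions_hypo(note):
    """Return True if the note contains a non-negated hypoglycemia mention."""
    for match in HYPO_PATTERN.finditer(note):
        preceding = note[:match.start()]
        # Only look back to the start of the current sentence or clause.
        sentence_start = max(preceding.rfind("."), preceding.rfind(";")) + 1
        if not NEGATION_PATTERN.search(preceding[sentence_start:]):
            return True
    return False


notes = [
    "Patient reports an episode of hypoglycemia last week.",
    "Denies hypoglycemia or dizziness.",
    "No chest pain. Had low blood sugar overnight.",
    "Hypotension noted on exam.",
]
flags = [mentions_hypo(n) for n in notes]
```

Events flagged this way would still need the de-duplication and exclusion criteria mentioned in the list above before being counted alongside ICD-coded events.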

10 All content Copyright © 2017 QuantumBlack Visual Analytics Ltd.

Illustration: Use of NLP on doctors’ notes identifies ~90% more hypos and ~20% more severe hypos within Type 2 BI-treated patients

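[Figure: bar chart of distinct hypo events by identification source, split into severe and non-severe. The mapping of values to bars is not recoverable from the extracted text.]

Series: Hypos uniquely identified through ICD codes; Additional hypos uniquely identified through NLP; Additional hypos uniquely identified through "IM Glucagon"; Additional hypos uniquely identified through lab results; Total hypo events identified

Split: Severe; Non-severe

Values as extracted: 53, 59, 61, 53, 101, 109, 83, 172, 13, 37, 235, 191 (81%), 44 (19%)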

1 Codes regarded as severe by default: ICD-9 249.30; 250.30; 250.31; 251.0; ICD-10 E08.641; E09.641; E10.641; E11.641; E13.641; E15. Additionally, ICD codes coinciding with a discharge or admission diagnosis, or a hypo event date on the same day as an ED visit or inpatient admission.

2 Plasma glucose <54 mg/dL

3 Presence of a severe NLP term (e.g. seizure), a severe NLP descriptor (e.g. serious, major), or an ED/inpatient visit on the day of the hypo event
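The three footnotes can be read as a set of rules for labelling a hypo event as severe. The sketch below encodes one such reading. The event dictionary layout and the choice to apply the same-day ED/inpatient rule regardless of how the event was identified are assumptions; the code lists, example terms and glucose threshold are taken from the footnotes.

```python
# ICD codes regarded as severe by default (footnote 1).
SEVERE_ICD_CODES = {
    "249.30", "250.30", "250.31", "251.0",                      # ICD-9
    "E08.641", "E09.641", "E10.641", "E11.641", "E13.641", "E15",  # ICD-10
}
# Example severe term and descriptors given in footnote 3 (not a full lexicon).
SEVERE_NLP_TERMS = {"seizure"}
SEVERE_NLP_DESCRIPTORS = {"serious", "major"}
# Plasma glucose threshold in mg/dL (footnote 2).
SEVERE_GLUCOSE_MG_DL = 54


def is_severe(event):
    """Classify a hypo event as severe.

    `event` is a hypothetical dict with optional keys:
    icd_code, glucose_mg_dl, note_terms, acute_visit_same_day.
    """
    # ED visit or inpatient admission on the day of the event.
    if event.get("acute_visit_same_day"):
        return True
    if event.get("icd_code") in SEVERE_ICD_CODES:
        return True
    glucose = event.get("glucose_mg_dl")
    if glucose is not None and glucose < SEVERE_GLUCOSE_MG_DL:
        return True
    terms = {t.lower() for t in event.get("note_terms", ())}
    return bool(terms & (SEVERE_NLP_TERMS | SEVERE_NLP_DESCRIPTORS))


examples = [
    {"icd_code": "E11.641"},
    {"icd_code": "E11.649"},
    {"glucose_mg_dl": 48},
    {"glucose_mg_dl": 62},
    {"note_terms": ["Seizure"]},
    {"icd_code": "E11.649", "acute_visit_same_day": True},
]
labels = [is_severe(e) for e in examples]
```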

~90% more hypos (~20% more severe hypos) identified through NLP within BI-treated patients

Breakdown of distinct hypo events experienced by T2D basal insulin treated patients (2009-16 period, thousands of hypos)

11

Identified segments with differential severe hypo rate on A vs. B

B population used as baseline; only segments with >10% of B patient-treatments are shown

Looking at advanced modelling of outcomes – Overview of segments with differential severe hypo rate on different insulins

A underperforming B

A outperforming B

T outperforming its own average

A underperforming its own average

Patients w hypertension (58%; -$259 | -0.044)

African American patients (13%; -$181 | -0.030)

Obese patients (18%; -$179 | -0.025)

Patients w neuropathy (12%; -$151 | -0.025)

Patients w mean HbA1C > 9.7 (25%; $0 | -0.005)

Patients w mean HbA1C < 7.9 (25%; $0 | -0.005)

Patients w previous noncompliance (11%; +$94 | +0.028)

Patients w/o hypertension (42%; +$218 | +0.045)

SEVERE HYPOS

PREDICTIVE MODELING

Patients older than 69 (26%; -$2 | -0.001)

Patients w cardiovascular disorders (31%; -$5 | -0.001)

Patients prescribed DppIV (29%; -$170 | -0.029)

Δ incurred hypo cost, A vs. B

Legend: Segment name (A%; $B | C)

A. % of B patients

B. Δ incurred severe hypo cost on A vs. B (pppy)

C. Δ severe hypo rate on A vs. B (pppy)
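The quantities in the legend can be illustrated with a small, unadjusted calculation: for each segment, the share of B patient-treatments it covers and the difference in severe hypo rate between A and B, with "pppy" read here as "per patient per year". The row layout and toy numbers below are assumptions. The figures on the slide come from predictive models that adjust for confounders, which this sketch does not attempt.

```python
def rate_pppy(rows):
    """Severe hypo events per patient-year of exposure."""
    years = sum(r["years"] for r in rows)
    return sum(r["severe_hypos"] for r in rows) / years if years else None


def segment_deltas(rows, segments, min_share_of_b=0.10):
    """Delta in severe hypo rate (A minus B) per segment, with B as baseline."""
    b_total = sum(1 for r in rows if r["drug"] == "B")
    result = {}
    for name, predicate in segments.items():
        in_segment = [r for r in rows if predicate(r)]
        a_rows = [r for r in in_segment if r["drug"] == "A"]
        b_rows = [r for r in in_segment if r["drug"] == "B"]
        share_of_b = len(b_rows) / b_total if b_total else 0.0
        # Keep only segments covering more than 10% of B patient-treatments.
        if share_of_b <= min_share_of_b:
            continue
        rate_a, rate_b = rate_pppy(a_rows), rate_pppy(b_rows)
        if rate_a is None or rate_b is None:
            continue
        result[name] = {"share_of_b": share_of_b, "delta_rate": rate_a - rate_b}
    return result


# Hypothetical patient-treatment rows (one row per patient per treatment).
rows = [
    {"drug": "A", "hypertension": True, "years": 2.0, "severe_hypos": 0},
    {"drug": "A", "hypertension": True, "years": 2.0, "severe_hypos": 1},
    {"drug": "B", "hypertension": True, "years": 1.0, "severe_hypos": 1},
    {"drug": "B", "hypertension": True, "years": 3.0, "severe_hypos": 1},
    {"drug": "A", "hypertension": False, "years": 1.0, "severe_hypos": 1},
    {"drug": "B", "hypertension": False, "years": 4.0, "severe_hypos": 1},
]
segments = {
    "with hypertension": lambda r: r["hypertension"],
    "without hypertension": lambda r: not r["hypertension"],
}
deltas = segment_deltas(rows, segments)
```

A negative `delta_rate` corresponds to A outperforming B in that segment, matching the sign convention of the C values in the segment list.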

12

How we work: We are establishing an integrated platform consolidating all data and models and visualizing insights

Platform components: Portal; Visualization tools; Custom analytics; Analytics tools; Data model; Data engineering; Cloud hosting; Infrastructure

▪ Single point of entry to “analytics cockpit”

▪ Reporting & data visualization

▪ Analytical outputs from Sanofi & external partners

▪ Suite of advanced analytics tools to run analyses

▪ Data sources accessed from 1 common “Data lake”

▪ Semantic linkages within and across data sources

▪ Highly scalable modern technology backbone

▪ RWE & other external sources (e.g., EMR, Claims, Payers, Labs, Devices, social)

▪ Sanofi internal data (e.g., RCT, phase IV, Research, Digital, Commercial)

13

What we do: Methodologies we use and some of the partners we work with

Methodologies (ranging from traditional analytics to advanced analytics)

▪ Machine learning

▪ Unsupervised clustering

▪ Model bootstrapping

▪ Predictive modelling

▪ Natural Language Processing (NLP)

▪ Propensity-score matching

▪ Meta-analysis

▪ Traditional RWE statistics

▪ Rule extraction

▪ OMOP data modelling

Partnerships

[Partner names are not recoverable from the extracted text.]

14

Advanced analytics methodology: 6 step process overview

Activities [step assignment inferred from the order of the extracted text]

Step 1

▪ Select initial datasets

▪ Build target variables

▪ Test and refine variables

▪ Perform descriptive analyses to understand underlying data set

Step 2

▪ Identify and build independent variables, e.g., comorbidities, coprescriptions, habits (e.g., smoking)

▪ Build select proxy variables, e.g., disease progression, disease activity

Step 3

▪ Include key variables defined by experts (e.g., insurance type)

▪ Increase signal-to-noise ratio by grouping co-occurring variables (clustering): diagnoses, procedures, drug treatments

Step 4

▪ Run linear and non-linear ML algorithms to identify drivers of target variables

▪ Create new IVs as drivers are identified

Step 5

▪ Identify segments with differential outcomes between drugs of interest

▪ Validate causation

Step 6

▪ Develop clinical interpretation of outcomes in overall population or patient subsegments

▪ Propose actions

– Inform clinical and outcomes research

– Refine view of health economics

– Publish findings

Critical stages (labels as extracted from the figure): Initial datasets, 3-5 target variables; 100s of independent variables; 1000s of independent variables; 10-50 performance drivers; Machine learning models; Predictions; Segmentation; Scenario construction and impact assessment; Decision support

Other figure labels: A, B, C, …
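The activity "grouping co-occurring variables (clustering)" can be illustrated with a very small example. The sketch below groups binary variables whose patient sets overlap strongly, using Jaccard similarity and single-linkage merging via union-find. This is a stand-in technique chosen for brevity, not the clustering method used in the project, and the variable names, patient IDs and threshold are invented.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two sets of patient IDs."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_cooccurring(patients_by_variable, threshold=0.5):
    """Group variables whose patient sets overlap at or above `threshold`."""
    parent = {v: v for v in patients_by_variable}

    def find(v):
        # Union-find lookup with path halving.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in combinations(patients_by_variable, 2):
        if jaccard(patients_by_variable[u], patients_by_variable[v]) >= threshold:
            parent[find(u)] = find(v)

    groups = {}
    for v in patients_by_variable:
        groups.setdefault(find(v), set()).add(v)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical binary variables mapped to the patients in which they occur.
patients_by_variable = {
    "dx_hypertension": {1, 2, 3, 4},
    "rx_ace_inhibitor": {1, 2, 3},
    "dx_neuropathy": {5, 6},
    "proc_foot_exam": {5, 6, 7},
    "dx_obesity": {8},
}
groups = group_cooccurring(patients_by_variable)
```

Each resulting group could then be collapsed into a single derived variable, which is one way thousands of raw variables might be reduced to hundreds before modelling.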

15

EXAMPLE SCENARIO: Predictive analytics is developed on the entire dataset and allows extrapolation to non-observed data

Approach / Simulation / Description

Descriptive analysis

▪ A single descriptive statistic (typically mean or median) is calculated across the entire population sample

Propensity score matching

▪ Descriptive analysis is performed only on the overlapping (matched) subsample of the population

▪ For non-overlapping populations, analysis is typically not performed or is difficult to interpret

Predictive analytics

▪ A predictive model is developed on the entire available dataset

▪ Predictive analytics is typically used to extrapolate to non-observed data

[Simulation plots are not recoverable from the extracted text.]
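The contrast drawn on this slide can be reproduced on a toy dataset. In the sketch below, two groups overlap only partially on a single covariate `x`. Matching keeps only the overlapping region, while a model fitted per group on all data can also produce an estimate outside it. Matching directly on one covariate stands in for propensity-score matching here (a simplification), and the data, caliper and function names are invented for illustration.

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    return slope, mean_y - slope * mean_x


def caliper_match(treated, control, caliper):
    """Greedy 1:1 nearest-neighbour matching on x, without replacement."""
    available = list(control)
    pairs = []
    for t in sorted(treated):
        if not available:
            break
        best = min(available, key=lambda c: abs(c[0] - t[0]))
        if abs(best[0] - t[0]) <= caliper:
            pairs.append((t, best))
            available.remove(best)
    return pairs


# Hypothetical (x, outcome) samples: treated cover x = 0..6, control x = 4..10.
treated = [(x, 2.0 + 0.5 * x) for x in range(0, 7)]
control = [(x, 1.0 + 0.5 * x) for x in range(4, 11)]

# Matched analysis: only the overlapping region x = 4..6 contributes.
pairs = caliper_match(treated, control, caliper=0.5)
matched_effect = sum(t[1] - c[1] for t, c in pairs) / len(pairs)

# Predictive analysis: one model per group, fitted on all available data.
slope_t, intercept_t = fit_line(*zip(*treated))
slope_c, intercept_c = fit_line(*zip(*control))


def predicted_effect(x):
    """Model-based treated-minus-control difference at covariate value x."""
    return (intercept_t + slope_t * x) - (intercept_c + slope_c * x)


effect_outside_overlap = predicted_effect(9)  # no treated patients observed here
```

The extrapolated estimate is only as trustworthy as the assumed model form, which is the trade-off against restricting the analysis to matched samples.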

16

EXAMPLE SCENARIO: Predictive analytics can capture patterns across the entire dataset available for modelling

Approach / Simulation / Description

Descriptive analysis

▪ A single estimate is provided reflecting the entire available dataset (e.g. mean, median)

Propensity score matching

▪ Analysis performed on matched samples does not capture patterns in the non-overlapping subsample (e.g. downward sloping pattern in the red sample)

Predictive analytics

▪ Predictive model can capture patterns across the entire dataset available for modelling

[Simulation plots are not recoverable from the extracted text.]

17

Literature that supports a predictive modeling approach for RWE

Study and takeaway

1 Cepeda et al. Comparison of Logistic Regression versus Propensity Score When the Number of Events Is Low and There Are Multiple Confounders: https://www.ncbi.nlm.nih.gov/pubmed/12882951

▪ "Logistic regression is the technique of choice when there are at least eight events per confounder."

▪ Given the size of our data (tens of thousands of patients), this is almost certainly the case

2 G. King et al. Why Propensity Scores Should Not Be Used for Matching: https://gking.harvard.edu/publications/why-propensity-scores-should-not-be-used-formatching

▪ PSM often results in “increasing imbalance, inefficiency, model dependence, and bias”

3 Stefan Wager, Susan Athey. Estimation and Inference of Heterogeneous Treatment Effects using Random Forests: https://arxiv.org/abs/1510.04342

▪ Read 'Treatment Effect' as delta, and 'Heterogeneous' as 'different per segment, and even per patient'

▪ On classical methods such as matching: "These methods perform well in applications with a small number of covariates, but quickly break down as the number of covariates increases"

▪ Paper offers several citations of people using predictive methodology as opposed to PSM

▪ Paper explores an alternative approach, the 'Causal Forest', which is almost a hybrid of predictive and PSM

4 Shah BR, Laupacis A, Hux JE, Austin PC. Propensity score methods gave similar results to traditional regression modeling in observational studies: a systematic review: https://www.ncbi.nlm.nih.gov/pubmed/15878468

▪ "Observational studies had similar results whether using traditional regression or propensity scores to adjust for confounding. Propensity scores gave slightly weaker associations; ..."

▪ Basically, even for those studies where it is feasible to use PSM, regression can be a better option

5 Klungel OH et al. Methods to assess intended effects of drug treatment in observational studies are reviewed: https://www.ncbi.nlm.nih.gov/pubmed/15617947

▪ An interesting survey of the area

6 Shmueli G. To Explain or Predict: https://www.stat.berkeley.edu/~aldous/157/Papers/shmueli.pdf

▪ An in-depth discussion of predictive vs. explanatory analysis, especially with respect to causality.

7 Angrist JD and Pischke JS. Mostly Harmless Econometrics: An Empiricist’s Companion: http://www.mostlyharmlesseconometrics.com

▪ Discussion of methods to deal with causality and confounding: Heckman’s correction, Probit and Tobit, Propensity scores (weighting and matching).

▪ Contains many links and references for further reading.
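The rule quoted from the first study (at least eight events per confounder) lends itself to a quick arithmetic check. The sketch below uses made-up cohort numbers; the patient count, event rate and number of confounders are assumptions for illustration and are not taken from the Sanofi datasets.

```python
MIN_EVENTS_PER_CONFOUNDER = 8  # threshold quoted in the takeaway above


def events_per_confounder(n_events, n_confounders):
    """Average number of outcome events available per confounder."""
    return n_events / n_confounders


def max_confounders(n_events, minimum=MIN_EVENTS_PER_CONFOUNDER):
    """Largest number of confounders that still satisfies the rule."""
    return n_events // minimum


# Hypothetical cohort: 20,000 patients, 5% of whom experience the event.
n_patients = 20_000
n_events = n_patients * 5 // 100          # 1,000 events
n_confounders = 100

epc = events_per_confounder(n_events, n_confounders)
rule_satisfied = epc >= MIN_EVENTS_PER_CONFOUNDER
limit = max_confounders(n_events)
```

Note that the rule counts events rather than patients, so a large cohort with a rare outcome can still fall below the threshold.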
