Date post: 01-Oct-2020

Using indication embeddings to represent patient health for drug safety studies

Rachel Melamed
Biomedical Data Science @ University of Chicago & Biology @ UMass Lowell
[email protected] | @RDMelamed

Goal: high-throughput drug safety studies

Drug safety: does taking this drug change your risk of some health outcome?
[Figure labels: Exposure, Cancer]

Randomized trials: low-throughput but unbiased.
1) Enroll cohort
2) Randomize treatment
3) Compare experimental groups

Cohort studies: reuse health data to emulate a randomized trial.
1) From data, select people taking the treatment drug or a comparator drug
2) Match to emulate randomization

Health data
[Figure: patient timeline with events such as X-ray, asthma, statin, arthritis]
5,000 prescriptions
10,000 diagnosis codes
20,000 procedure codes

High-throughput cohort studies
Currently, cohort study design relies on domain experts:
Expert Task 1: Find a suitable comparator drug
Expert Task 2: Design matching (identify confounders)

Can we match without expert design?
Panels: Creating indication embeddings; Evaluating embeddings; Embeddings identify comparator drugs; Match with embeddings.

The challenge: confounding
[Figure labels: age, aspirin, diabetes, HEALTH; aspirin use from never to often against age 25 to 75]

The solution: matching
Match on confounders: P(treated | age, aspirin, …)
Propensity score match: P(treated | age, ..)
Evaluate with the propensity score: compare P(treated | …) in the treated cohort with P(treated | …) in the comparator cohort.
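The evaluation idea above compares propensity scores between the treated and comparator cohorts. As a minimal sketch of one way to summarize that comparison, assuming the scores P(treated | …) have already been estimated by some model, the block below computes a standardized mean difference. The poster does not say which statistic it uses; this is a commonly used balance diagnostic swapped in for illustration, and the function name and all score values are hypothetical.

```python
from statistics import mean, variance


def standardized_mean_difference(treated_scores, comparator_scores):
    """Difference in mean propensity score, in pooled standard deviation units."""
    pooled_var = (variance(treated_scores) + variance(comparator_scores)) / 2
    return (mean(treated_scores) - mean(comparator_scores)) / pooled_var ** 0.5


# Hypothetical P(treated | ...) values, one per person.
treated = [0.80, 0.70, 0.90, 0.60]
comparator_before = [0.20, 0.30, 0.10, 0.40]  # before matching
comparator_after = [0.75, 0.70, 0.85, 0.65]   # after matching

smd_before = standardized_mean_difference(treated, comparator_before)
smd_after = standardized_mean_difference(treated, comparator_after)
print(f"before matching: {smd_before:.2f}, after matching: {smd_after:.2f}")
```

A value near zero after matching suggests the two cohorts look alike with respect to the modeled confounders; a large value suggests the matching has not emulated randomization well.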

Creating indication embeddings

How to match on 30,000+ dimensional, sparse, uninformative vectors?
[Figure: patient vector with one dimension per code: insulin resistance, insulin, Type 2 diabetes, amoxicillin, Xanax, … other Rx, Dx, Px]
Instead, map them to small, meaningful embeddings.

Training task: predict drug
Simple neural network, 150 million patient histories
[Figure: network with an indication embedding and a drug embedding]
[Figure: patient history (blood lab, gout, statin, diabetes) followed by a new Rx, metformin]
Create training examples: (history event, new Rx) pairs

Embeddings relate codes to health needs
Drug embedding = drugs given in most similar health contexts
Indication embedding = health context for prescription of a new drug
Map each event to a 50-dimensional vector.
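The training examples pair a history event with a newly prescribed drug. A minimal sketch of how such pairs might be built from one patient's timeline follows. The input format, the function name, and the choice to pair every earlier event (rather than, say, a recent window) with each first-time prescription are assumptions for illustration, not details given on the poster.

```python
def make_training_examples(history, drug_codes):
    """Pair every earlier event with each drug the first time it is prescribed.

    history: event codes in chronological order (hypothetical format).
    drug_codes: set of codes that count as prescriptions.
    """
    examples = []
    seen_drugs = set()
    for i, event in enumerate(history):
        if event in drug_codes and event not in seen_drugs:
            seen_drugs.add(event)
            # Every event before the new prescription is a context for it.
            examples.extend((prior, event) for prior in history[:i])
    return examples


history = ["blood_lab", "gout", "statin", "diabetes", "metformin"]
pairs = make_training_examples(history, drug_codes={"statin", "metformin"})
for history_event, new_rx in pairs:
    print(history_event, "->", new_rx)
```

Each pair then serves as one (input, target) example for the drug-prediction task.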

Evaluating embeddings

For each drug, performance of embedding distance to predict indications: overall ROC AUC = 0.82.
[Figure: per-drug AUC]
[Figure: dot products between antidepressants and selected closest diagnoses]
[Figure labels: tricyclic, SSRI, SNRI, anticonvulsants, antipsychotics]
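To make the per-drug evaluation concrete, here is a minimal sketch that scores each diagnosis by its dot product with a drug's embedding and computes the ROC AUC against that drug's known indications. The use of the dot product as the score, the diagnosis names, the 2-dimensional vectors, and the set of known indications are all made up for illustration.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def roc_auc(scores, labels):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy 2-dimensional embeddings (the poster uses 50 dimensions).
drug_embedding = [1.0, 0.2]
diagnosis_embeddings = {
    "dx_a": [0.9, 0.1],
    "dx_b": [0.8, 0.3],
    "dx_c": [-0.5, 0.4],
    "dx_d": [0.1, -0.9],
}
known_indications = {"dx_a", "dx_b"}  # hypothetical ground truth

names = list(diagnosis_embeddings)
scores = [dot(drug_embedding, diagnosis_embeddings[n]) for n in names]
labels = [n in known_indications for n in names]
auc = roc_auc(scores, labels)
print(auc)
```

Repeating this for every drug and pooling the results would give an overall figure comparable in kind to the one reported above.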

Embeddings identify comparator drugs (Expert Task 1)

Drugs with the closest embedding dot product are more comparable, as measured by AUC.
[Figure: ROC AUC by therapeutic class]

Drugs with the same therapeutic use as carbamazepine (primarily an anticonvulsant, used off-label for bipolar disorder):
aripiprazole, asenapine, carbamazepine, chlorpromazine, clozapine, divalproex_sodium, fluphenazine, gabapentin, haloperidol, haloperidol_lactate, lacosamide, lamotrigine, levetiracetam, lithium_carbonate, lurasidone, olanzapine, oxcarbazepine, paliperidone, perphenazine, primidone, prochlorperazine, prochlorperazine_maleate, quetiapine_fumarate, risperidone, thioridazine, thiothixene, tiagabine, topiramate, trifluoperazine, valproic_acid, ziprasidone, zonisamide
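Choosing comparator candidates by embedding dot product can be sketched as a simple ranking. The drug names below appear on the poster, but the 2-dimensional vectors are invented and the function name is hypothetical.

```python
def rank_comparators(target, drug_embeddings):
    """Return other drugs sorted by dot product with the target drug, highest first."""
    target_vec = drug_embeddings[target]
    scored = [
        (sum(a * b for a, b in zip(target_vec, vec)), name)
        for name, vec in drug_embeddings.items()
        if name != target
    ]
    return [name for score, name in sorted(scored, reverse=True)]


# Made-up vectors for illustration only.
drug_embeddings = {
    "carbamazepine": [0.9, 0.4],
    "oxcarbazepine": [0.8, 0.5],
    "lamotrigine": [0.7, 0.3],
    "amoxicillin": [-0.6, 0.2],
}
ranking = rank_comparators("carbamazepine", drug_embeddings)
print(ranking)
```

The top of the ranking gives candidate comparators; drugs prescribed in unrelated health contexts fall to the bottom.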

Match with embeddings

Step 1: Coarsened exact matching by age, gender, year, number of Rx
[Figure: bins over age (25 to 75) and year (2003 to 2013)]
Step 2: Encode histories → small dense vectors
Step 3: Mahalanobis match on health summaries within bins
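Step 1 can be sketched as grouping people into bins by coarsened covariates and keeping only bins that contain both treated and comparator patients. The bin widths, record fields, and function names below are illustrative assumptions; the poster does not give the coarsening it uses.

```python
from collections import defaultdict


def coarsen(person, age_width=10, rx_width=5):
    """Map a person to a bin key; the bin widths are illustrative assumptions."""
    return (
        person["age"] // age_width,
        person["gender"],
        person["year"],
        person["n_rx"] // rx_width,
    )


def bin_cohorts(treated, comparator):
    """Group both cohorts by bin and keep only bins that contain both groups."""
    bins = defaultdict(lambda: {"treated": [], "comparator": []})
    for person in treated:
        bins[coarsen(person)]["treated"].append(person)
    for person in comparator:
        bins[coarsen(person)]["comparator"].append(person)
    return {k: v for k, v in bins.items() if v["treated"] and v["comparator"]}


treated = [
    {"id": "t1", "age": 42, "gender": "F", "year": 2010, "n_rx": 3},
    {"id": "t2", "age": 71, "gender": "M", "year": 2005, "n_rx": 12},
]
comparator = [
    {"id": "c1", "age": 47, "gender": "F", "year": 2010, "n_rx": 4},
    {"id": "c2", "age": 30, "gender": "M", "year": 2005, "n_rx": 12},
]
matched_bins = bin_cohorts(treated, comparator)
print(list(matched_bins))
```

People in bins without a counterpart from the other cohort (here t2 and c2) are dropped before the finer-grained match.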

Indication embedding (R^(V x E))
Weighted average (upweight recent history)
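A minimal sketch of the recency-weighted average follows, assuming each history event comes with the number of days before the index date. The exponential half-life weighting, the parameter `half_life_days`, and the function name are assumptions; the poster says only that recent history is upweighted.

```python
def health_summary(events, embedding, half_life_days=180.0):
    """Recency-weighted average of event embeddings.

    events: list of (code, days_before_index_date) pairs.
    embedding: dict mapping code -> vector (one row of the V x E matrix).
    half_life_days: assumed decay rate; not taken from the poster.
    """
    dim = len(next(iter(embedding.values())))
    total = [0.0] * dim
    weight_sum = 0.0
    for code, days_ago in events:
        weight = 0.5 ** (days_ago / half_life_days)  # recent events weigh more
        weight_sum += weight
        for j, value in enumerate(embedding[code]):
            total[j] += weight * value
    return [t / weight_sum for t in total]


# Made-up 2-dimensional embedding rows (the poster uses 50 dimensions).
embedding = {"gout": [1.0, 0.0], "diabetes": [0.0, 1.0]}
summary = health_summary([("diabetes", 0), ("gout", 180)], embedding)
print(summary)
```

In this toy case the diabetes event from today gets weight 1 and the gout event from 180 days ago gets weight 0.5, so the summary leans toward the diabetes direction.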

Health summary vectors
[Figure: two sparse binary history vectors (0 0 1 0 0 0 0 0 1 0 1 0 0 0 0 and 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0) shown with two small dense vectors (0 .2 1 and 0 .1 .8)]
Then do simple nearest-neighbor matching.

Expert Task 2: Embedding can match for key confounders
Matching people on bupropion to people on trazodone is complicated by the alternate indication of bupropion for smoking cessation.
[Figure: two panels, propensity score matching and embedding matching; each point is one person on bupropion, trazodone, or varenicline]
Embedding better matches nonsmokers to nonsmokers.
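As a minimal sketch of nearest-neighbor matching on health summary vectors with Mahalanobis distance, the block below matches treated to comparator patients inside a single bin. The greedy one-to-one strategy without replacement, the pooled covariance estimate, the helper names, and the 2-dimensional vectors are all assumptions for illustration, written with the standard library only.

```python
def covariance(rows):
    """Sample covariance matrix of a list of equal-length vectors."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
         for j in range(d)]
        for i in range(d)
    ]


def invert(matrix):
    """Gauss-Jordan inverse with partial pivoting; assumes the matrix is invertible."""
    d = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(d)] for i, row in enumerate(matrix)]
    for col in range(d):
        pivot = max(range(col, d), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [x / scale for x in aug[col]]
        for r in range(d):
            if r != col:
                factor = aug[r][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [row[d:] for row in aug]


def mahalanobis_sq(u, v, inv_cov):
    """Squared Mahalanobis distance between u and v."""
    diff = [a - b for a, b in zip(u, v)]
    return sum(diff[i] * inv_cov[i][j] * diff[j]
               for i in range(len(diff)) for j in range(len(diff)))


def match_within_bin(treated, comparator):
    """Greedy 1:1 nearest-neighbor match without replacement (an assumed variant)."""
    inv_cov = invert(covariance(list(treated.values()) + list(comparator.values())))
    available = dict(comparator)
    pairs = {}
    for t_id, t_vec in treated.items():
        if not available:
            break
        best = min(available, key=lambda c: mahalanobis_sq(t_vec, available[c], inv_cov))
        pairs[t_id] = best
        del available[best]
    return pairs


# Made-up 2-dimensional health summary vectors for people in one bin.
treated = {"t1": [0.0, 0.0], "t2": [1.0, 1.0]}
comparator = {"c1": [1.0, 0.8], "c2": [0.0, 0.2], "c3": [0.0, 1.0], "c4": [1.0, 0.0]}
pairs = match_within_bin(treated, comparator)
print(pairs)
```

Mahalanobis distance rescales each direction by the spread of the data, so strongly correlated dimensions can change which neighbor counts as nearest compared with plain Euclidean distance.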
