Predictive Analytics for OpenFDA & Other Sources

October 6, 2014

Data Fusion to Know an Individual

OpenFDA Queries

End Point: https://api.fda.gov/drug/event.json?

search=patient.drug.openfda.pharm_class_epc:"nonsteroidal+anti-inflammatory+drug"

Search for records where openfda.pharm_class_epc (pharmacologic class) contains nonsteroidal anti-inflammatory drug.

&count=patient.reaction.reactionmeddrapt.exact

Count the field patient.reaction.reactionmeddrapt (patient reactions).

https://api.fda.gov/drug/event.json?search=patient.drug.openfda.pharm_class_epc:%22nonsteroidal+anti-inflammatory+drug%22&count=patient.reaction.reactionmeddrapt.exact
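The query above can be assembled programmatically. The following is a minimal sketch, not an official client: the helper name build_count_query is hypothetical, and it only builds the URL string without sending a request.

```python
from urllib.parse import quote_plus

BASE_URL = "https://api.fda.gov/drug/event.json"


def build_count_query(search_field, phrase, count_field):
    """Build an openFDA drug event URL that searches one field and counts another.

    build_count_query is a hypothetical helper written for illustration.
    """
    # Wrap the phrase in double quotes so its words stay together;
    # quote_plus turns the quotes into %22 and the spaces into +.
    encoded_phrase = quote_plus(f'"{phrase}"')
    return f"{BASE_URL}?search={search_field}:{encoded_phrase}&count={count_field}"


url = build_count_query(
    "patient.drug.openfda.pharm_class_epc",
    "nonsteroidal anti-inflammatory drug",
    "patient.reaction.reactionmeddrapt.exact",
)
print(url)
```

The printed URL matches the encoded query shown above and can be pasted into a browser or passed to any HTTP client.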

Important OpenFDA data types

What the drug is supposed to fix: Pharmacologic Class (EPC) - pharm_class_epc

How the drug works: Mechanism of Action (MOA) - pharm_class_moa

What the drug affects: Physiologic Effect (PE) - pharm_class_pe

What is in the drug: Chemical Structure (CS) - pharm_class_cs

https://api.fda.gov/drug/event.json?search=patient.drug.openfda.pharm_class_epc:%22Serotonin+and+Norepinephrine+Reuptake+Inhibitor%22

Safety Report ID

Biographical Data

Adverse Reactions

Drug Information
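The four parts of a safety report named above can be pulled out of a single result record. This is a sketch under assumptions: summarize_report is a hypothetical helper, and sample_report is an illustrative, made-up record that only mimics the shape of an openFDA drug event result.

```python
def summarize_report(report):
    """Collect the report ID, biographical data, reactions and drugs from one record."""
    patient = report.get("patient", {})
    return {
        "safety_report_id": report.get("safetyreportid"),
        "biographical": {
            "age": patient.get("patientonsetage"),
            "sex": patient.get("patientsex"),
        },
        "reactions": [r.get("reactionmeddrapt") for r in patient.get("reaction", [])],
        "drugs": [d.get("medicinalproduct") for d in patient.get("drug", [])],
    }


# Made-up record for illustration only; not real FDA data.
sample_report = {
    "safetyreportid": "0000000-0",
    "patient": {
        "patientonsetage": "52",
        "patientsex": "2",
        "reaction": [
            {"reactionmeddrapt": "NAUSEA"},
            {"reactionmeddrapt": "HEADACHE"},
        ],
        "drug": [{"medicinalproduct": "EXAMPLE DRUG"}],
    },
}

summary = summarize_report(sample_report)
print(summary)
```

Using .get with defaults keeps the helper from failing on records where a section is missing.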

More OpenFDA data types

How serious is the reaction: serious (1 for Yes, 2 for No)

• "serious": "1",

• "seriousnesscongenitalanomali": "1",

• "seriousnessdeath": "1",

• "seriousnessdisabling": "1",

• "seriousnesshospitalization": "1",

• "seriousnesslifethreatening": "1",

• "seriousnessother": "1"

What is the drug indicated for: drugindication

Circumstances for taking drug: patient.drug.drugadditional
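The seriousness fields listed above lend themselves to a small decoding step. A minimal sketch, assuming the flags sit at the top level of a report record as string values; describe_seriousness and the readable labels are hypothetical, and the sample records are made up.

```python
# Field names as listed on the slide; the readable labels are our own wording.
SERIOUSNESS_FLAGS = {
    "seriousnesscongenitalanomali": "congenital anomaly",
    "seriousnessdeath": "death",
    "seriousnessdisabling": "disabling",
    "seriousnesshospitalization": "hospitalization",
    "seriousnesslifethreatening": "life threatening",
    "seriousnessother": "other",
}


def describe_seriousness(report):
    """Return (is_serious, reasons) for one report record."""
    is_serious = report.get("serious") == "1"  # 1 for Yes, 2 for No
    reasons = [
        label
        for field, label in SERIOUSNESS_FLAGS.items()
        if report.get(field) == "1"
    ]
    return is_serious, reasons


# Made-up records for illustration only.
serious_case = {"serious": "1", "seriousnessdeath": "1", "seriousnesshospitalization": "1"}
mild_case = {"serious": "2"}

print(describe_seriousness(serious_case))
print(describe_seriousness(mild_case))
```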

Predictions on OpenFDA Data

Hierarchical Clustering (“unsupervised learning”) on Manufacturers by Drug Class and Adverse Events

Generates insights and further questions to explore, such as: Do some adverse events dominate all others? What is the role of retail distributors, as opposed to manufacturers? Is it an artifact of the data, or something that happens between them and the patient?
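To make the clustering idea concrete, here is a small pure-Python sketch of agglomerative (bottom-up hierarchical) clustering. The slides do not state the distance measure or linkage rule, so Euclidean distance and average linkage are assumptions, and the maker names and counts are invented for illustration.

```python
from itertools import combinations
from math import dist


def average_linkage(points, a, b):
    """Mean Euclidean distance between all members of cluster a and cluster b."""
    return sum(dist(points[x], points[y]) for x in a for y in b) / (len(a) * len(b))


def agglomerate(points, n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [{name} for name in points]
    while len(clusters) > n_clusters:
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda pair: average_linkage(points, clusters[pair[0]], clusters[pair[1]]),
        )
        clusters[i] |= clusters[j]  # i < j, so deleting j leaves i in place
        del clusters[j]
    return clusters


# Invented adverse-event counts per maker for two drug classes (not real data).
event_counts = {
    "maker_a": (900, 850),
    "maker_b": (880, 870),
    "maker_c": (40, 55),
    "maker_d": (35, 60),
    "maker_e": (300, 310),
}

groups = agglomerate(event_counts, 3)
print(groups)
```

With these invented counts the two high-volume makers end up together, the two low-volume makers end up together, and the mid-volume maker stays alone, which mirrors the kind of grouping the slides describe.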

Manufacturers by All Drug Classes

Group distinguished by an abnormally large number of adverse events for the products they make, including Mylan and Teva

Group with a troublingly large number of adverse events for the products they make, including AbbVie and Pfizer

Group above average in the number of product adverse events, including private-label companies CVS, Kroger, Wal-Mart, and Publix

Other manufacturers, with no troubling number of adverse events

Manufacturers by All Adverse Events

Other manufacturers, with no troubling number of adverse events

Group of 1 (Mylan), highly distinguished by an abnormally large number of adverse events for the products it makes

Group with a troublingly large number of adverse events for the products they make, including Teva and the grocery chain Kroger

Group above average in the number of product adverse events, including big pharma maker Merck

Conditional Probability Models (Bayes) Very Helpful for Predictions

Model Type                % Correct on Age    % Correct on Gender
Random Forest             48%                 55%
Support Vector Machine    48%                 55%
Decision Trees            14%                 9%
Naïve Bayes               64%                 78%

Why is Bayes So Much Better?

Works on Conditional Probability

Utilizes Much More of What We Already Know

$$P(\text{Age 18to34} \mid \text{drug}) = \frac{P(\text{Age 18to34} \cap \text{drug})}{P(\text{drug})}$$

Bayes is Conditional Probability

Intuition is “What are the chances of X, given that I know Y?”

This will always be better than flipping a coin – as in the case of gender prediction

The probability of Female (F) for any given Drug (T) is the same as the probability of the Drug given Female, times the probability of being Female, divided by the probability of the Drug.
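In symbols, the statement above reads P(F | T) = P(T | F) × P(F) / P(T). The sketch below works through it on a small table of counts. The counts and the names drug_t and other are invented for illustration, and posterior_female_given_drug is a hypothetical helper.

```python
def posterior_female_given_drug(counts, drug):
    """Apply Bayes' rule to (gender, drug) counts to get P(F | drug)."""
    total = sum(counts.values())
    n_female = sum(n for (gender, _), n in counts.items() if gender == "F")
    n_drug = sum(n for (_, d), n in counts.items() if d == drug)

    p_f = n_female / total                                # P(F)
    p_t = n_drug / total                                  # P(T)
    p_t_given_f = counts.get(("F", drug), 0) / n_female   # P(T | F)
    return p_t_given_f * p_f / p_t


# Invented report counts by (gender, drug); not real data.
counts = {
    ("F", "drug_t"): 30,
    ("M", "drug_t"): 10,
    ("F", "other"): 20,
    ("M", "other"): 40,
}

posterior = posterior_female_given_drug(counts, "drug_t")
direct = counts[("F", "drug_t")] / (counts[("F", "drug_t")] + counts[("M", "drug_t")])
print(posterior, direct)
```

The two printed values agree up to floating-point rounding: going through Bayes' rule gives the same answer as counting females among the reports for that drug directly.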

Bayes Results for Single Person Households

                            **** ACCURACY ****                      WEIGHTED ACCURACY
Genre                       Gender   Age      Size         Weight   Gender   Age
ADVENTURE                   75.4%    62.0%    16,565       1.001    75.5%    62.1%
AUDIENCE PARTICIPATION      84.1%    78.8%    46,283       1.003    84.4%    79.0%
AWARD CEREMONIES            60.4%    42.6%    655          1.000    60.4%    42.6%
CHILD - LIVE                78.6%    67.7%    4,868        1.000    78.6%    67.7%
CHILD DAY - ANIMATION       74.7%    59.3%    3,487        1.000    74.7%    59.4%
CHILD MULTI-WEEKLY          81.6%    73.2%    1,916,697    1.144    93.3%    83.8%
CHILDREN'S NEWS             76.0%    33.3%    300          1.000    76.0%    33.3%
COMEDY VARIETY              76.7%    68.9%    326,770      1.025    78.6%    70.6%
CONCERT MUSIC               67.8%    54.6%    2,822        1.000    67.9%    54.6%
CONVERSATIONS, COLLOQUIES   76.8%    63.3%    113,290      1.009    77.5%    63.9%
DAYTIME DRAMA               81.1%    62.5%    20,478       1.002    81.2%    62.6%
DEVOTIONAL                  64.0%    47.8%    1,344        1.000    64.0%    47.8%
EVENING ANIMATION           80.7%    76.7%    481,722      1.036    83.6%    79.5%
FEATURE FILM                74.5%    62.7%    449,549      1.034    77.0%    64.8%
FORMAT VARIES               76.6%    56.0%    1,127        1.000    76.6%    56.0%
GENERAL DOCUMENTARY         74.6%    63.9%    2,004,256    1.150    85.8%    73.5%
GENERAL DRAMA               75.0%    63.6%    1,949,243    1.146    86.0%    72.9%
GENERAL VARIETY             73.4%    62.1%    377,859      1.028    75.5%    63.8%
INSTRUCTION, ADVICE         79.1%    67.2%    1,000,586    1.075    85.0%    72.2%
NEWS                        77.8%    65.4%    971,951      1.073    83.5%    70.1%
NEWS DOCUMENTARY            77.5%    63.2%    100,634      1.008    78.1%    63.7%
OFFICIAL POLICE             46.6%    29.2%    1,009        1.000    46.6%    29.2%
PARTICIPATION VARIETY       75.3%    62.3%    174,900      1.013    76.3%    63.1%
POPULAR MUSIC               77.0%    67.5%    458,606      1.034    79.6%    69.8%
POPULAR MUSIC STANDARD      69.0%    50.5%    2,335        1.000    69.0%    50.5%
PRIVATE DETECTIVE           71.5%    71.5%    20,522       1.002    71.6%    71.7%
QUIZ GIVE AWAY              79.1%    68.7%    76,822       1.006    79.5%    69.1%
QUIZ PANEL                  79.8%    63.4%    1,700        1.000    79.8%    63.4%
SCIENCE FICTION             76.1%    65.3%    24,219       1.002    76.2%    65.4%
SITUATION COMEDY            75.4%    61.3%    1,124,687    1.084    81.8%    66.5%
SPORTS ANTHOLOGY            83.8%    64.8%    52,166       1.004    84.1%    65.0%
SPORTS COMMENTARY           79.0%    68.7%    993,734      1.075    84.9%    73.9%
SPORTS EVENT                75.0%    62.2%    204,127      1.015    76.2%    63.1%
SPORTS NEWS                 81.1%    68.3%    15,275       1.001    81.2%    68.4%
SUSPENSE/MYSTERY            81.3%    70.9%    342,405      1.026    83.4%    72.7%
UNCLASSIFIED                77.8%    62.8%    38,060       1.003    78.0%    63.0%
WESTERN DRAMA               75.6%    63.8%    4,300        1.000    75.7%    63.9%
AVERAGE                     75.4%    62.1%    13,325,353            77.5%    63.9%
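The slides do not say how the Weight and Weighted Accuracy columns were computed. The numbers appear consistent with a weight of 1 + (genre size / total size), multiplied into the raw accuracy. The sketch below checks that guess on the CHILD MULTI-WEEKLY row; the formula is an inference from the table, not a documented method, and weighted_accuracy is a hypothetical helper.

```python
TOTAL_SIZE = 13_325_353  # size shown in the AVERAGE row


def weighted_accuracy(accuracy_pct, size, total=TOTAL_SIZE):
    """Return (weight, weighted accuracy) under the assumed rule weight = 1 + size / total."""
    weight = 1 + size / total
    return weight, accuracy_pct * weight


# CHILD MULTI-WEEKLY row: gender accuracy 81.6%, size 1,916,697.
weight, weighted = weighted_accuracy(81.6, 1_916_697)
print(round(weight, 3), round(weighted, 1))
```

For this row the assumed rule reproduces the table's weight of 1.144 and weighted gender accuracy of 93.3%. A few other rows differ by 0.1 point, which would be expected if the table was computed from unrounded accuracies.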

Simplifying the Problem Set

Single Households

Multi-Person Households

Same Gender & Same Age Class: nothing to predict

Same Gender & Diff. Age Class: predict age

Diff. Gender & Same Age Class: predict gender

Diff. Gender & Diff. Age Class: predict both

Household counts shown on the slide: 123K, 21K, 44K, 303K, 133K, 500K

Age / Gender models by Drug
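The split of multi-person households into four cases can be written as a small decision function. A minimal sketch, assuming each household is given as a list of (gender, age class) pairs for its known members; prediction_task and the age class labels are hypothetical.

```python
def prediction_task(members):
    """Decide what remains to be predicted for a multi-person household.

    members: list of (gender, age_class) tuples for the known household members.
    """
    genders = {gender for gender, _ in members}
    age_classes = {age_class for _, age_class in members}
    same_gender = len(genders) == 1
    same_age = len(age_classes) == 1

    if same_gender and same_age:
        return "nothing to predict"  # every member looks the same to the model
    if same_gender:
        return "predict age"
    if same_age:
        return "predict gender"
    return "predict both"


print(prediction_task([("F", "18to34"), ("F", "18to34")]))
print(prediction_task([("F", "18to34"), ("F", "55plus")]))
print(prediction_task([("F", "35to54"), ("M", "35to54")]))
print(prediction_task([("F", "18to34"), ("M", "55plus")]))
```

Routing households this way means a model only has to be applied where the answer is actually in doubt.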

2 Stage Models

Same Gender & Diff. Age Class: predict age

Diff. Gender & Same Age Class: predict gender

Diff. Gender & Diff. Age Class: predict both

Diagram labels: stage 1 and stage 2; Single Households; Age / Gender Models by Drug; Age / Gender Conditional Probability; Age Conditional Probabilities

Full Bayes Model

Using all the independent variables –

Where MAX is the prediction of Age or Gender classification given all the conditional probabilities known.

NOTE: The MAX prediction for Age is constrained by ID. Each ID has only 2 possible Age classes, since these are known, so if the model predicts an Age class outside the boundaries of an ID, pick the class with the next highest MAX probability for Age.
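The constrained MAX step in the note can be sketched as a naive Bayes score followed by a filtered argmax. This is an illustration under assumptions, not the presenters' implementation: predict_constrained is a hypothetical helper, the features are treated as conditionally independent, and all probabilities below are invented.

```python
from math import log


def predict_constrained(priors, likelihoods, observed, allowed):
    """Return the highest-scoring class that is allowed for this ID.

    priors:      {class: P(class)}
    likelihoods: {class: {feature: P(feature | class)}}
    observed:    features seen for this ID (for example, drugs taken)
    allowed:     classes that are possible for this ID
    """
    def score(cls):
        # Log of prior times the product of conditional probabilities.
        return log(priors[cls]) + sum(log(likelihoods[cls][f]) for f in observed)

    # Walk the classes from best to worst and take the first allowed one,
    # which is the "next highest MAX" when the top class is out of bounds.
    for cls in sorted(priors, key=score, reverse=True):
        if cls in allowed:
            return cls
    raise ValueError("no allowed class has a score")


# Invented probabilities for illustration only.
priors = {"18to34": 0.3, "35to54": 0.4, "55plus": 0.3}
likelihoods = {
    "18to34": {"drug_x": 0.5, "drug_y": 0.4},
    "35to54": {"drug_x": 0.2, "drug_y": 0.3},
    "55plus": {"drug_x": 0.1, "drug_y": 0.1},
}
observed = ["drug_x", "drug_y"]

unconstrained = predict_constrained(priors, likelihoods, observed, set(priors))
constrained = predict_constrained(priors, likelihoods, observed, {"35to54", "55plus"})
print(unconstrained, constrained)
```

Here the unconstrained pick is 18to34, but for an ID whose known members are only 35to54 and 55plus the function falls back to 35to54, the next highest class inside the ID's boundaries. Working in log space avoids underflow when many conditional probabilities are multiplied.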

