Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Date post: 18-Nov-2014
Page 1: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Practice-based Evidence in Medicine: Where Information Retrieval Meets Data Mining

Karin M. Verspoor

Department of Computing and Information Systems

Health and Biomedical Informatics Centre

The University of Melbourne

[email protected]

08 July 2014

Page 2: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Myriad problems, Myriad solutions

genomics

clinical decision support

furthering biological knowledge

empowering patients

novel treatments

Page 3: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Evidence-based medicine

“Best available clinical evidence” = randomized clinical trials

Page 4: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

How physicians are taught EBM

Critically appraise the key article(s)

Select key article(s)

Make a decision

Integrate decision into practice

Evaluate impact of decision

Formulate a question

Practice Other triggers

From EBM curriculum, U. of Ottawa http://www.med.uottawa.ca/sim/data/EBM_Intro_slides.ppt

Select key article(s)

Formulate a question

Page 5: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

That sounds familiar, right?

Select key article(s)

Formulate a question

Page 6: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

How to use the evidence base

• Critical appraisal
  – What are the study results?
  – Are the study results valid?

• Will the results help in caring for my patient?
  – Were all clinically important outcomes reported?
  – Are likely treatment benefits greater than potential harms?
  – Can the results help my patient?

Page 7: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

IR in Evidence-based medicine

• Identify articles relevant to a clinical question.

• Identify clinical elements of the literature.
  – PICO: Population/Patient, Intervention/Indicator, Comparator/Control, Outcome

• Support a systematic review of the clinical literature.

Lots of opportunities for IR here. But I won’t say much more about literature mining.

Page 8: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Limitations of EBM

• Clinical variability
• Biological variability
• Randomized controlled trials
  – Undertaken under controlled conditions
  – Applicability to patient not always clear

• Clinical judgement about how the evidence fits the patient

Page 9: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

EVIDENCE FITTING PATIENT?

Page 10: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

EBM + Clinical Judgement

• Do the results of the study apply to my patient?

• What if my patient mostly but not completely satisfies the inclusion criteria?

http://commons.wikimedia.org/wiki/File%3APregnant_woman.jpg

Page 11: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Natural Experiments

• Randomized Clinical Trials are limited
  – By design (specific inclusion, exclusion criteria)
  – By resources (limited patient cohorts recruited)

• How do the results generalize?

http://commons.wikimedia.org/wiki/File%3ABig_Day_Out_(8392285402).jpg

Page 12: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Natural Experiments

• What we would really like to do is to study large populations of people
  – to identify side-effects or interactions that appear when a treatment is provided to 1000s or 10,000s of people rather than 100s
  – to explore what characteristics of individuals are ultimately responsible for a positive/negative outcome

Evidence deriving from clinical practice “in the wild” rather than controlled studies

practice-based evidence

Page 13: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Whence the Evidence?

Mining electronic health records: towards better research applications and clinical care. Peter B. Jensen, Lars J. Jensen & Søren Brunak, Nature Reviews Genetics 13, 395-405 (2012). doi:10.1038/nrg3208

Page 14: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Electronic Health Records

Page 15: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

EHRs facilitate better care

• More complete picture of the patient and ongoing patient history
  – understand trends in vital signs
  – track allergies
• Integrate billing, pharmacy, radiology, laboratory
• Streamlined clinical workflow: Share lab results, imaging, specialist assessments, etc. directly
• Better tracking of prescription/test orders
• Fewer medical errors / possibility for error checking
• Clinical decision support

Page 16: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

EHRs enable analysis

• Analyse trends in the effectiveness of treatments
• Investigate the efficacy of medications in patients with co-morbidities
• Are we seeing evidence of an emerging epidemic?
• Outcomes research:
  – why are patients in one geographic region having higher rates of cancer recurrence than in another region?
  – do patients with diabetes experience higher rates of hearing loss?

Page 17: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Clinical Text

About 80% of clinical information is in textual form

– ED triage notes
– Clinical progress notes
– Radiology and Pathology reports
– GP and specialist letters
– Discharge summaries
– Medicare claims
– Registry data
– Literature: Medical articles

Page 18: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

WORKING WITH EHR

Page 19: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

“This system is designed for physicians to point and click their way through an entire exam quickly and effortlessly.” (EMR product review)

Electronic Health Records

Page 20: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Variations in unstructured text

1 Tablet(s) PO Daily
1 tab by mouth or orally daily
1 tab orally every 24 hours.
1 tab(s) PO (oral) qDay
1 tab(s) orally once a day.
1 tabs QD
1.0 tab po qd
ONE TABLET; ORAL QD
One orally daily
One tablet po daily
TAKE 1 TABLET DAILY
TAKE ONE PO QD
Take 1 Tab by mouth daily.
Take 1 tab daily daily orally
Take 1 tab daily orally
Take 1 tab po qday
Take 1 tab qd po
Take 1 tab qday PO
Take 1 tab(s) daily orally
Take 1 tablet by mouth daily.
Take 1 tablet orally Daily
Take 1 tablet orally every day
Take one orally daily
Take one orally daily as discussede
Take one tablet by mouth daily
Take one tablet by mouth every day
Take one tablet daily
Take one tablet once per day orally
Take one tablet po qd
by mouth one po qd
one orally once a day
one orally per day
one tablet by mouth daily
one tablet daily
one tablet once a day
take 1 tab po daily
take 1 tab po qd
take one orally each day

Page 21: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Structuring knowledge

• Information Extraction
  – Of Entities
  – Of Concepts
  – Of Relations

Extract structured content from unstructured text

Take ONE to TWO tablets a day when required

Rx_Dosage: AmtMin: 1 AmtMax: 2 AmtUnit: tablet
Rx_Frequency: PerWdwDays: 1 DosesPerWdw: 1
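The slide's example can be approximated with a small pattern-based extractor. This is a minimal sketch, not the system behind the slide: the regular expression, the `NUMBER_WORDS` table and the function names are illustrative assumptions, and only the "... a day" phrasing is handled.

```python
import re

# Hypothetical word-to-number table; a real system would cover many more forms.
NUMBER_WORDS = {"one": 1, "two": 2, "three": 3, "four": 4}

# Matches e.g. "Take ONE to TWO tablets a day" or "take 1 tab a day".
DOSE_PATTERN = re.compile(
    r"take\s+(?P<min>\w+)(?:\s+to\s+(?P<max>\w+))?\s+tab(?:let)?s?\s+a\s+day",
    re.IGNORECASE,
)


def to_number(token):
    """Convert '2' or 'TWO' to an int; return None for unknown tokens."""
    token = token.lower()
    if token.isdigit():
        return int(token)
    return NUMBER_WORDS.get(token)


def extract_dosage(text):
    """Return the slide's Rx_Dosage / Rx_Frequency structure, or None."""
    match = DOSE_PATTERN.search(text)
    if match is None:
        return None
    amt_min = to_number(match.group("min"))
    amt_max = to_number(match.group("max")) if match.group("max") else amt_min
    if amt_min is None or amt_max is None:
        return None
    return {
        "Rx_Dosage": {"AmtMin": amt_min, "AmtMax": amt_max, "AmtUnit": "tablet"},
        # "a day" is read as a one-day window with one dosing event, as on the slide.
        "Rx_Frequency": {"PerWdwDays": 1, "DosesPerWdw": 1},
    }


print(extract_dosage("Take ONE to TWO tablets a day when required"))
```

A single pattern like this covers only a small share of the variants listed on the previous slide, which is the motivation for more general information extraction.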

Page 22: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Structured vocabularies

• Play a strategic role in providing access to computerized health information because clinicians use a variety of terms for the same concept.
  – either “leukopenia” or “low white cell count” might be written in a patient record; usually these are synonyms.
  – Without a structured vocabulary, an automated system will not recognize these terms as being equivalent.
• Encode data for exchange, comparison, aggregation
• SNOMED CT (Systematized Nomenclature Of Medicine Clinical Terms): core general terminology for patient data
• ICD (International Classification of Diseases): used for diagnosis and procedure data

Page 23: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

One EHR fits all?

• EHRs are used in complex clinical environments.

• Features and interfaces appropriate for one medical specialty (such as pediatrics) may be frustratingly unusable in another (such as the intensive care unit).

• The data presented, the format, the level of detail, the order of presentation may need to be different.

• “Clinical IT projects are complex social endeavors in unforgiving clinical settings that happen to involve computers, as opposed to IT projects that happen to involve doctors.” -- Scot M. Silverstein, MD, Drexel University

Page 24: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Remember the user

“Clinical IT projects are complex social endeavors in unforgiving clinical settings that happen to involve computers, as opposed to IT projects that happen to involve doctors.” -- Scot M. Silverstein, MD, Drexel University

Page 25: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

The Clinical Narrative

“...In years past, a well-written history and physical, or progress note, would unfold like a story, giving a vivid description of the patient’s symptoms and physical exam at the point of the encounter, as well as the synthesis of the data and the plan of care.”

“EMRs: Finding a balance between billing efficiency and patient care”, Henry F. Smith, Jr., MD, Commentary, The Times Leader, Wilkes-Barre, PA, June 12, 2011.

Page 26: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

A typical clinical narrative

April 14, 2007 CHIEF COMPLAINT: Shortness of breath. HISTORY OF PRESENT ILLNESS: This 68-year-old female presents to the emergency department with shortness of breath that has gone on for 4-5 days, progressively getting worse. It comes on with any kind of activity whatsoever. She has had a nonproductive cough. She has not had any chest pain. She has had chills but no fever. EMERGENCY DEPARTMENT COURSE: The patient was admitted. She has had intermittent episodes of severe dyspnea. Lungs were clear. These would mildly respond to breathing treatments and morphine. Her D‐dimer was positive. We cannot scan her chest; therefore, a nuclear V/Q scan has been ordered. However, after consultation with Dr. C, it is felt that she is potentially too unstable to go for this. Given the positive D‐dimer and her severe dyspnea, we have weighed the risks and benefits of anticoagulation with her heme-positive stools. She states that she has been constipated lately and doing a lot of straining. Given the possibility of a PE, it was felt like anticoagulation was very important at this time period; therefore, she was anticoagulated. The patient will be admitted to the hospital under Dr. C.

Page 27: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Identifying clinical terms

April 14, 2007 CHIEF COMPLAINT: Shortness of breath. HISTORY OF PRESENT ILLNESS: This 68-year-old female presents to the emergency department with shortness of breath that has gone on for 4-5 days, progressively getting worse. It comes on with any kind of activity whatsoever. She has had a nonproductive cough. She has not had any chest pain. She has had chills but no fever. EMERGENCY DEPARTMENT COURSE: The patient was admitted. She has had intermittent episodes of severe dyspnea. Lungs were clear. These would mildly respond to breathing treatments and morphine. Her D‐dimer was positive. We cannot scan her chest; therefore, a nuclear V/Q scan has been ordered. However, after consultation with Dr. C, it is felt that she is potentially too unstable to go for this. Given the positive D‐dimer and her severe dyspnea, we have weighed the risks and benefits of anticoagulation with her heme-positive stools. She states that she has been constipated lately and doing a lot of straining. Given the possibility of a PE, it was felt like anticoagulation was very important at this time period; therefore, she was anticoagulated. The patient will be admitted to the hospital under Dr. C.

Page 28: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

details

April 14, 2007 CHIEF COMPLAINT: Shortness of breath. HISTORY OF PRESENT ILLNESS: This 68-year-old female presents to the emergency department with shortness of breath that has gone on for 4-5 days, progressively getting worse. It comes on with any kind of activity whatsoever. She has had a nonproductive cough. She has not had any chest pain. She has had chills but no fever. EMERGENCY DEPARTMENT COURSE: The patient was admitted. She has had intermittent episodes of severe dyspnea. Lungs were clear. These would mildly respond to breathing treatments and morphine. Her D‐dimer was positive. We cannot scan her chest; therefore, a nuclear V/Q scan has been ordered. However, after consultation with Dr. C, it is felt that she is potentially too unstable to go for this. Given the positive D‐dimer and her severe dyspnea, we have weighed the risks and benefits of anticoagulation with her heme-positive stools. She states that she has been constipated lately and doing a lot of straining. Given the possibility of a PE, it was felt like anticoagulation was very important at this time period; therefore, she was anticoagulated. The patient will be admitted to the hospital under Dr. C.

Page 29: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

reasoning

EMERGENCY DEPARTMENT COURSE: The patient was admitted and nontoxic in appearance. Blood pressure was brought down aggressively. With this combined with BiPAP, she has reversed her respiratory distress promptly. She has improved significantly. She will not require intubation at this time period. Her family has elected to go back to M, Dr. W. I did discuss this case with Dr. G who is on call for L Cardiology. She has accepted him in transfer; however, there are no PCU or ICU beds at this time period. Will admit here for a brief period until a bed is available at M. I discussed this case with Dr. R who will admit.

• Clinicians were trying to determine whether the shortness of breath was due exclusively to her failing heart, or whether she has pneumonia.

• Prompt response indicates that pneumonia is not the issue.

Page 30: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Unlocking information in the text

• (Semantic) Information retrieval
  – Finding relevant documents, paragraphs related to specified concepts
• Entity recognition
  – Identifying relevant and important entities
• Relationship identification
  – Understanding underlying language to determine relationships between entities of interest

unexpected associations

new insights

new knowledge

Page 31: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)
Page 32: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

MEDICAL CONCEPT RECOGNITION

Page 33: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

MetaMap: UMLS concept annotation

http://www.cvast.tuwien.ac.at/projects/iUMLS

Page 34: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Abstracting linguistic variation

• Terminology mapping tools generalise language variation

• e.g. UMLS Concept C0027497
  • nausea
  • nauseated
  • feels sick
  • feeling sick
  • queasy
  • felt sick
  • nauseous
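The mapping on this slide can be illustrated with a simple dictionary lookup. This is a sketch only: it is a stand-in for a terminology mapping tool such as MetaMap, not its API, and the `annotate` function and the lookup table are assumptions built from the surface forms listed above.

```python
import re

# Surface forms from the slide, all mapped to the same concept identifier.
VARIANT_TO_CUI = {
    "nausea": "C0027497",
    "nauseated": "C0027497",
    "feels sick": "C0027497",
    "feeling sick": "C0027497",
    "queasy": "C0027497",
    "felt sick": "C0027497",
    "nauseous": "C0027497",
}

# Longest variants first so that multi-word forms are preferred.
PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(v) for v in sorted(VARIANT_TO_CUI, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def annotate(text):
    """Return (surface form, concept id) pairs in order of appearance."""
    return [
        (m.group(1), VARIANT_TO_CUI[m.group(1).lower()])
        for m in PATTERN.finditer(text)
    ]


print(annotate("Pt feels sick, was nauseated overnight"))
```

Once every variant carries the same identifier, downstream retrieval or classification can treat them as one feature.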

Page 35: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

ICD coding

http://www.zydoc.com/zydoc-extracts-icd-10-codes-from-unstructured-text-with-nlp-driven-cac/

Page 36: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

NegEx: identify negated concepts

http://healthinformatics.wikispaces.com/NegEx+Algorithm
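The idea behind negation detection can be sketched as follows. This is a heavily simplified illustration in the spirit of NegEx, not the published algorithm or its trigger lists: the trigger words, terminator words and the five-token window are assumptions chosen for the example.

```python
import re

NEGATION_TRIGGERS = ("no", "not", "denies", "without")  # tiny illustrative subset
TERMINATORS = ("but", "however")  # words that end a negation scope
WINDOW = 5  # number of tokens after a trigger that fall within its scope


def negated_concepts(sentence, concepts):
    """Map each concept found in the sentence to True if it is negated."""
    tokens = re.findall(r"[a-z]+", sentence.lower())

    # Mark token positions that fall inside the scope of a negation trigger.
    negated_idx = set()
    for i, tok in enumerate(tokens):
        if tok in NEGATION_TRIGGERS:
            for j in range(i + 1, min(i + 1 + WINDOW, len(tokens))):
                if tokens[j] in TERMINATORS:
                    break
                negated_idx.add(j)

    # Look up each concept and check whether all of its tokens are in scope.
    result = {}
    for concept in concepts:
        words = concept.lower().split()
        n = len(words)
        for start in range(len(tokens) - n + 1):
            if tokens[start:start + n] == words:
                result[concept] = all(
                    k in negated_idx for k in range(start, start + n)
                )
                break
    return result


print(negated_concepts("She has had chills but no fever.", ["chills", "fever"]))
print(negated_concepts("She has not had any chest pain.", ["chest pain"]))
```

The two sentences come from the clinical narrative shown earlier; concepts that do not occur in the sentence are simply left out of the result.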

Page 37: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Classification framework

Training set: Notes + labels for classes of interest (e.g. ICD-10 codes)

Machine learning algorithm

Words, Phrases, Linguistic categories; names of entities; Domain concepts; Document features

Biomedical knowledge sources: UMLS (SnomedCT, ICD)

Language processing

Model: Relating features of the text to classes of interest

Page 38: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

SYNDROMIC SURVEILLANCE FROM CLINICAL NOTES

Page 39: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

SynSurv

• SynSurv
  – Victorian Department of Health pilot syndromic surveillance program
  – Detection of outbreaks based on ICD-10 diagnostic codes and presenting complaints as captured in free text notes

Our focus: Extracting information from unstructured free text to enable subsequent analysis and monitoring

Page 40: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Objectives of our project

• To enable surveillance directly from notes; integration into natural workflow of ED

• To support higher sensitivity and higher precision than keyword-based methods

Page 41: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Emergency Department triage notes

• Free text notes
  – written by triage nurse upon assessment in the Emergency Department
  – captures presenting symptoms and complaints of a patient

CENTRAL CHEST DISCOMFORT WHILE EATING, RADIATING TO ARMS. PPM INSERTED 2/52 AGO. PAIN FREE O/A. HR72, BP160

FEBRILE ILLNESS FLU LIKE SYMPTOMS NAUSEA

L BASAL GANGLIAN BLEED POST COLLAPSE, NON VERBAL, EYES SPON OPENED, HYPERTENSIVE, P 70REG, PEARL, PMX CEREBRAL BLEED

Page 42: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

SynSurv data characteristics

• 918,330 records
• 730,054 records with ICD-10 diagnosis
• 456,213 records with note text
• 316,362 records with ICD-10 diagnosis and note text

Page 43: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Two sets of Experiments

• Given a free text note,
  – Predict the ICD-10 code(s) for the note
  – Predict a syndromic group, based on pre-defined sets of ICD-10 codes of interest

Page 44: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Predicting ICD-10 codes

• Approaches
  – Baseline strategy
    • direct detection of ICD-10 terms in triage notes
  – Augmented baseline
    • direct detection of SNOMED-CT terms in notes
    • map to ICD-10 codes via reference mapping
  – Machine learning
    • Build a set of binary classifiers; one yes/no classifier per ICD-10 code
    • Experiment with different features and different learning algorithms
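The "one yes/no classifier per code" setup can be sketched with a toy learner. Everything here is an assumption for illustration: the perceptron is a stand-in (the slide does not name the algorithms tried), the notes are invented, and `CODE_CHEST` / `CODE_FLU` are hypothetical placeholders rather than real ICD-10 codes.

```python
from collections import defaultdict

BIAS = "__bias__"


def features(note):
    """Bag-of-words features: the set of lower-cased tokens plus a bias term."""
    return set(note.lower().split()) | {BIAS}


def train_binary(notes, labels, epochs=10):
    """Train one perceptron (a weight per word) for a single yes/no decision."""
    weights = defaultdict(float)
    for _ in range(epochs):
        for note, is_positive in zip(notes, labels):
            feats = features(note)
            predicted = sum(weights[f] for f in feats) > 0
            if predicted != is_positive:
                step = 1.0 if is_positive else -1.0
                for f in feats:
                    weights[f] += step
    return weights


def train_one_vs_rest(notes, code_sets):
    """Build one binary classifier per code seen in the training labels."""
    codes = sorted(set().union(*code_sets))
    return {
        code: train_binary(notes, [code in cs for cs in code_sets])
        for code in codes
    }


def predict_codes(models, note):
    """Return every code whose classifier says 'yes' for this note."""
    feats = features(note)
    return sorted(
        code
        for code, w in models.items()
        if sum(w.get(f, 0.0) for f in feats) > 0
    )


# Invented toy training data with placeholder labels.
notes = [
    "chest pain radiating to arm",
    "fever cough flu like symptoms",
    "chest discomfort pain free",
    "febrile illness nausea flu",
]
code_sets = [{"CODE_CHEST"}, {"CODE_FLU"}, {"CODE_CHEST"}, {"CODE_FLU"}]

models = train_one_vs_rest(notes, code_sets)
print(predict_codes(models, "flu like symptoms with fever"))
```

Because each code gets its own classifier, a note can receive zero, one or several codes, which matches the "code(s)" wording of the task.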

Page 45: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Predicting ICD-10 codes (Results)

• Direct term matching strategy outperformed by machine learning
  – Performance difference between micro-average and macro-average indicates that some ICD-10 codes are underrepresented in the data, and cannot be modeled well
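Why a micro/macro gap points to underrepresented codes can be seen in a small calculation. The counts below are invented for illustration and are not results from the study; `CODE_FREQUENT` and `CODE_RARE` are placeholder names.

```python
def f1(tp, fp, fn):
    """F1 from true positives, false positives and false negatives."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(counts):
    """counts maps each code to a (tp, fp, fn) tuple."""
    # Macro: average the per-code F1, so every code weighs the same.
    macro = sum(f1(*c) for c in counts.values()) / len(counts)
    # Micro: pool the counts first, so frequent codes dominate.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    return f1(tp, fp, fn), macro


# Invented counts: one frequent, well-modelled code and one rare, poorly modelled code.
counts = {"CODE_FREQUENT": (90, 10, 10), "CODE_RARE": (1, 1, 4)}
micro, macro = micro_macro_f1(counts)
print(round(micro, 3), round(macro, 3))
```

The pooled (micro) score stays high because the frequent code dominates, while the macro score is pulled down by the rare code.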

Page 46: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Predicting Syndromic Groups

• Task
  – Syndromic groups are defined by sets of ICD-10 codes, e.g. Flu like group

Page 47: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Syndrome distribution

• Data
  – 6 groups with a reasonable number of examples
  – Large imbalance between yes/no classes

Page 48: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Predicting Syndromic Groups (Approach)

– Machine learning
  • Build a set of binary classifiers; one yes/no classifier per Syndromic Group
  • Experiment with different features and different learning algorithms
  • Incorporate ICD-10 and SNOMED term recognition in pre-processing (to generalise over linguistic variation)

Page 49: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Predicting Syndromic Groups (Detailed Results)

Page 50: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Syndromic Group Expansion Results

• Improve syndromic group definitions by adding related ICD-10 codes to the provided definitions

• Done using a data-driven strategy
  – Look for ICD-10 codes with similar records
  – Compare groups of records based on cosine similarity
  – Select ICD-10 codes from the most similar ones with relevant records
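The comparison step can be sketched with term-frequency vectors and cosine similarity. This is a minimal illustration under stated assumptions: the vector representation (raw word counts), the function names and the toy notes are not from the study, and `CODE_X` / `CODE_Y` are placeholder codes.

```python
import math
from collections import Counter


def term_vector(notes):
    """Aggregate a group of notes into one term-frequency vector."""
    counts = Counter()
    for note in notes:
        counts.update(note.lower().split())
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_candidate_codes(group_notes, notes_by_code):
    """Rank candidate codes by similarity of their notes to the group's notes."""
    group_vec = term_vector(group_notes)
    scores = {
        code: cosine(group_vec, term_vector(notes))
        for code, notes in notes_by_code.items()
    }
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Invented toy data.
group_notes = ["fever cough", "flu like symptoms fever"]
candidates = {
    "CODE_X": ["fever cough sore throat"],
    "CODE_Y": ["fall injury left wrist"],
}
ranked = rank_candidate_codes(group_notes, candidates)
print(ranked)
```

The highest-ranked codes would then be reviewed as candidates for extending the group definition.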

Page 51: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Syndromic Group Expansion Results (Aggregate results)

• Results for SynSurv_Acute_respiratory, SynSurv_Diarrhoea and SynSurv_Flu_like_illness

Page 52: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Issues for low performance

• Inconsistency in ICD-10 annotation
  – ? FISH BONE IN THROAT J03
  – ? FISH BONE IN THROAT T18
  – ? FISH BONE IN THROAT T18
  – ? FISH BONE IN THROAT S10.9
  – ? FISH BONE IN THROAT J02.0

• Notes not related to the patient's visit
  – DIRECT ADMISSION FROM BAIRNSDALE TO 3S BED 25

• Typos in the notes text
  – ? FIH BONE IN THROAT

Page 53: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Integrating with Syndromic Surveillance framework

• Input to the BioSurv system
  – Trained machine learning models used as input to BioSurv (e.g., C2 algorithm)
  – Prediction probability > 0.5

Model: Predicted Classification (label)

Yes (flu-like illness): BioSurv Count +1

No
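The hand-off from classifier to surveillance counts can be sketched as below. Only the 0.5 threshold comes from the slide; the input format (date, probability pairs), the function name and the example values are assumptions.

```python
from collections import Counter

THRESHOLD = 0.5  # from the slide: prediction probability > 0.5


def daily_counts(predictions, threshold=THRESHOLD):
    """Count positive predictions per day.

    predictions: iterable of (date_string, probability) pairs, one per note.
    """
    counts = Counter()
    for day, probability in predictions:
        if probability > threshold:
            counts[day] += 1  # "Yes" branch: add one to that day's count
    return counts


# Invented example predictions for a flu-like illness classifier.
predictions = [
    ("2014-07-01", 0.90),
    ("2014-07-01", 0.50),  # not strictly above the threshold, so not counted
    ("2014-07-01", 0.51),
    ("2014-07-02", 0.20),
]
print(daily_counts(predictions))
```

The resulting daily series is what an outbreak detection algorithm would consume.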

Page 54: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Example: Flu like syndrome NLP notes annotation

• Records with no ICD-10 codes in the database are now available to SynSurv

• 730,054 out of 918,330 records with ICD-10 codes

Page 55: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

C2 algorithm: ICD-10 vs NLP

• Earlier alert time using NLP methods

ICD-10 NLP
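For orientation, a C2-style detector can be sketched as follows. This follows the commonly described form of the EARS C2 statistic (a 7-day baseline separated from the current day by a 2-day lag, with an alert when the standardised count exceeds 3); those parameters and the handling of a zero standard deviation are assumptions here, not the settings used in BioSurv.

```python
import statistics


def c2_statistic(counts, t, baseline=7, lag=2):
    """Standardised deviation of day t from a lagged baseline window."""
    start = t - lag - baseline
    if start < 0:
        return None  # not enough history yet
    window = counts[start:t - lag]
    mean = statistics.mean(window)
    sd = statistics.stdev(window)
    if sd == 0:
        return None  # flat baseline; skipped in this sketch
    return (counts[t] - mean) / sd


def c2_alerts(counts, threshold=3.0):
    """Return the day indices whose statistic exceeds the threshold."""
    alerts = []
    for t in range(len(counts)):
        stat = c2_statistic(counts, t)
        if stat is not None and stat > threshold:
            alerts.append(t)
    return alerts


# Invented daily counts with a jump on the last day.
daily = [5, 6, 4, 5, 6, 5, 4, 5, 5, 20]
print(c2_alerts(daily))
```

Feeding the detector with counts derived from note text rather than from assigned ICD-10 codes changes only the input series, not the algorithm.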

Page 56: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

RETRIEVE DISEASE-RELEVANT CLINICAL RECORDS

Page 57: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Disease Recognition from Clinical Reports

• Task: classify records according to specified disease
  – Enables retrieval of specific cases
  – Detect patterns of disease occurrence
  – Support creation of patient cohorts
  – Prelude to automated ICD-encoding

• Disease: Lung Cancer
  – Identified by ICD-10 code
    • C34: Malignant neoplasm of bronchus and lung

Text mining for lung cancer cases over large patient admission data (2014) Martinez D, Cavedon L, Alam Z, Bain C, Verspoor K. HISA Big Data 2014, CEUR vol. 1149. http://ceur-ws.org/Vol-1149/bd2014_cavedon.pdf

Page 58: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Alfred Health (Melbourne)

Page 59: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Method

• Data: radiology reports for 2 (financial) years (2011–2013) extracted from REASON platform
  – 756,502 reports, plus associated metadata
    • Each report linked to an admission record
    • Metadata: ICD-10 (manually assigned) used as ground truth; demographics, reason for admission, etc.
  – Data pre-processed to remove ICD-10 codes and extract features

• Challenge: real distribution highly skewed
  – only 0.8% of data are positive for lung cancer

Page 60: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Method

• Features:
  – Bags-of-Words from report text
  – Bags-of-Phrases identified by MetaMap
  – Negative context identified by NegEx
  – Metadata from admission record
    • Name, Dob, Sex, MaritalStatus, Religion
    • AdmissionReason, AdmissionUnit, AdmissionType
    • Allergies, DrugCode, DrugDesc
    • ...

Page 61: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Method

• Machine learning algorithms
  – Support Vector Machines
  – Correlation-based feature selection filter

• Baseline: keyword-based approach
  “lung cancer”, “lung malignancy”, “lung malignant”, “lung neoplasm”, “lung tumour”, “lung carcinoma”
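The keyword baseline is simple enough to sketch directly. The six phrases are those on the slide; the matching details (case-insensitive substring match after whitespace normalisation) and the function name are assumptions.

```python
# The six keyword phrases listed on the slide.
KEYWORDS = (
    "lung cancer",
    "lung malignancy",
    "lung malignant",
    "lung neoplasm",
    "lung tumour",
    "lung carcinoma",
)


def keyword_baseline(report_text):
    """Flag a report as positive if it contains any of the keyword phrases."""
    # Lower-case and collapse runs of whitespace before matching.
    text = " ".join(report_text.lower().split())
    return any(keyword in text for keyword in KEYWORDS)


print(keyword_baseline("Known LUNG   carcinoma, right upper lobe."))
print(keyword_baseline("No evidence of lung cancer."))
```

Note that the second call also returns True: plain keyword matching ignores negation, which is one plausible source of false positives for a baseline of this kind.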

Page 62: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Results

Evaluation: stratified 10-fold cross-validation

Classifier | Precision | Recall | F-score
Text features only | 0.855 | 0.800 | 0.825
Full feature set (including metadata) | 0.871 | 0.820 | 0.843
Term-matching baseline | 0.643 | 0.742 | 0.689

* Results not using feature selection, which reduced performance
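As a reading aid, the F-score column can be related to the other two columns with the usual harmonic-mean formula, F = 2PR / (P + R). The snippet below recomputes it from the rounded table values; it is a sketch for orientation, and the explanation for the small differences is an assumption.

```python
def f_score(precision, recall):
    """Harmonic mean of precision and recall (F1)."""
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# (precision, recall, reported F-score) as given in the results table.
rows = {
    "Text features only": (0.855, 0.800, 0.825),
    "Full feature set (including metadata)": (0.871, 0.820, 0.843),
    "Term-matching baseline": (0.643, 0.742, 0.689),
}

for name, (p, r, reported) in rows.items():
    print(f"{name}: recomputed {f_score(p, r):.3f}, reported {reported:.3f}")
```

The recomputed values agree with the table to within about 0.002. Differences of that size would be expected if the reported scores were averaged over the ten folds rather than computed from the averaged precision and recall, though the slides do not say which was done.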

Page 63: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Temporal variation

Page 64: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Related Applications

Fungal infection surveillance by classifying CT scan reports

Extracting information from pathology reports

Work by Lawrence Cavedon, David Martinez, and others at NICTA in recent years

Martinez et al (2014) Cross-hospital portability of information extraction of cancer staging information. AI in Medicine.

Page 65: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

DATA MINING

Page 66: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Making Sense of clinical data

Page 67: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Data analysis

Once the data is structured, anything is possible
• Association rule mining
• Clustering
• Machine learning
• Hypothesis testing
• Statistical analysis
• Etc.

Page 68: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Clustering patients

• Patients represented as sets of features

• Features could include any aspect of their profile
  – demographic
  – clinical
  – treatment
    • drugs
    • devices
  – genomic
  – environmental
  – nutritional
  – etc.

Roque et al. (2011) Using Electronic Patient Records to Discover Disease Correlations and Stratify Patient Cohorts. PLoS Comput Biol 7(8): e1002141. doi:10.1371/journal.pcbi.1002141

Page 69: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Aggregating resources

Electronic Patient-Reported Data Capture as a Foundation of Rapid Learning Cancer Care

Abernethy, Amy P.; Ahmad, Asif; Zafar, S. Yousuf; Wheeler, Jane L.; Reese, Jennifer Barsky; Lyerly, H. Kim. Medical Care. 48(6):S32-S38, June 2010. doi: 10.1097/MLR.0b013e3181db53a4

Page 70: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Pharmacovigilance

• Mining of clinical records to identify adverse drug events
  – Estimated >90% of adverse events do not appear in coded data
  – Transform patient records into patient-feature matrix encoded using clinical terminologies

LePendu et al. (2013) “Pharmacovigilance Using Clinical Notes” Clinical Pharmacology & Therapeutics 93(6), 547–555; doi: 10.1038/clpt.2013.47
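A patient-feature matrix of the kind mentioned above can be sketched as a binary table with one row per patient and one column per coded concept. The input format, function name and the concept labels (`DRUG_A`, `EVENT_X`, and so on) are illustrative assumptions, not the representation used in the cited work.

```python
def patient_feature_matrix(patient_concepts):
    """Build a binary patient-by-feature matrix.

    patient_concepts: dict mapping patient id -> set of concept codes
    extracted from that patient's records.
    Returns (patient ids, feature names, rows of 0/1 values).
    """
    features = sorted(set().union(*patient_concepts.values()))
    patients = sorted(patient_concepts)
    matrix = [
        [1 if feature in patient_concepts[patient] else 0 for feature in features]
        for patient in patients
    ]
    return patients, features, matrix


# Invented example with placeholder concept labels.
records = {
    "p1": {"DRUG_A", "EVENT_X"},
    "p2": {"DRUG_A"},
    "p3": {"EVENT_X", "DRUG_B"},
}
patients, features, matrix = patient_feature_matrix(records)
print(features)
for patient, row in zip(patients, matrix):
    print(patient, row)
```

With records in this form, drug and event columns can be cross-tabulated to look for associations that never appear in coded data.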

Page 71: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

A “Phenotypic code” for complex disease

• Simple and complex diseases appear to share a genetic architecture

• Mining of co-morbidities of complex diseases and Mendelian diseases with known genetic cause identifies a ‘code’ for each complex disease in terms of Mendelian genetic loci.

• Evidence of epistasis among the Mendelian variants (superlinear complex disease risk)

Blair et al. Cell (2013); 155 (1); 70-80. http://dx.doi.org/10.1016/j.cell.2013.08.030

Page 72: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Personalised Medicine

• Look for evidence in text for:
  – (classification) Which patients have not responded or had a toxicity event?
  – (prediction) Which patients are likely to respond to the drug?
  – (interpretation) Why did some patients respond well?

Page 73: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Rapid Learning Healthcare

Electronic Patient-Reported Data Capture as a Foundation of Rapid Learning Cancer Care

Abernethy, Amy P.; Ahmad, Asif; Zafar, S. Yousuf; Wheeler, Jane L.; Reese, Jennifer Barsky; Lyerly, H. Kim. Medical Care. 48(6):S32-S38, June 2010. doi: 10.1097/MLR.0b013e3181db53a4

Page 74: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Conclusions

• We are at the beginning of a transition from evidence-based medicine to practice-based evidence

Prediction: factors in disease and effective treatment
Detection: observables indicating disease
Prevention: what factors circumvent those related to prediction

• enabled by increasing roll-out of EHR and HI systems
• Linked hospital data allows multiple sources to be leveraged for complex analytic tasks
• Text is a major and important part of the clinical record

Many data structuring and mining problems in the clinical context can be treated as retrieval problems.

Page 75: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

Impact of Informatics for Biomedicine

Advancing the science of medicine

Improving the effectiveness of healthcare

Page 76: Medical Information Retrieval Workshop Keynote (MedIR@SIGIR2014)

© Copyright The University of Melbourne 2014

