Improving Access to Clinical Data Locked in Narrative Reports:
An Informatics Approach
Wendy W. Chapman, PhD
Division of Biomedical Informatics, University of California, San Diego
Overview
• The promise of natural language processing (NLP)
• Challenges of developing NLP in the clinical domain
• Challenges in applying NLP in the clinical domain
• Improving access to text through NLP resources
The promise of NLP
• Vast & growing amounts of clinical text
• Rich in information
– Patient care
– Evaluation/QC
– Comparative effectiveness research
– Epidemiology
• Locked in free text
• Natural language processing can help unlock that information
• Encouraging NLP success stories
The promise of NLP
Murff (2011), JAMA
NLP captures:
• Renal failure
• Pulmonary embolism
• Deep vein thrombosis
• Sepsis
• Pneumonia
• Myocardial infarction
Results: “... higher sensitivity and lower specificity compared with patient safety indicators based on discharge coding.”
“The promise of natural language processing ... may be closer than ever.”
Other promising NLP accomplishments ...
• Smoking status (Savova, Hazlehurst)
• Peripheral arterial disease (Pathak)
• Medication extraction (Uzuner)
• Pneumonia (Chapman)
• Colonoscopy quality metrics (Harkema)
• Breast cancer recurrence (Carrell)
• Colorectal cancer screening behavior (Denny)
• Rheumatoid arthritis (Zeng)
Challenges of developing NLP in the clinical domain
NLP Success
“Fresh off its butt-kicking performance on Jeopardy!, IBM’s supercomputer "Watson" has enrolled in medical school at Columbia University,” New York Daily News, February 18th 2011
“IBM's computer could very well herald a whole new era in medicine.” ComputerWorld, February 17, 2011
Dr. Watson??
Clinical NLP Since the 1960s
Why has clinical NLP had little impact on clinical care?
Barriers to Development
• Sharing clinical data difficult
– Have not had shared datasets for development and evaluation
– Modules trained on general English not sufficient
• Insufficient common conventions and standards for annotations
– Data sets are unique to a lab
– Not easily interchangeable
• Limited collaboration
– Clinical NLP applications silos and black boxes
– Have not had open source applications
• Reproducibility is formidable
– Open source release not always sufficient
– Software engineering quality not always great
– Mechanisms for reproducing results are sparse
Challenges in applying NLP in the clinical domain
Security & Privacy Concerns
• Clinical texts have many patient identifiers
– 18 HIPAA identifiers (e.g., names, addresses)
• Items not regulated by HIPAA
– Tight end for the Steelers
• Unique cases
– Woman in her 50s who is pregnant
• Sensitive information
– HIV status
Institutions are reluctant to share data
Lack of user-centered development and scalability
– Perceived cost of applying NLP outweighs the perceived benefit (Len D’Avolio)
Improving access to text through NLP resources
Access to Resources for Developing NLP Algorithms
Resources for NLP Developers
Knowledge Bases
Clinical Data
Annotations
Annotation Environment
Evaluation
Domain Schema Ontology
Modifier Ontology
Modifiers of clinical elements
Linguistic representation of clinical elements
Disease: colon cancer
Experiencer: family
Negation: no
Historical: yes
“Patient denies a family history of colon cancer”
Melissa Tharp
Schema Ontology: Elements
Schema Ontology: Relationships
Modifier Ontology
Modifiers are important for interpreting text
– Chest radiograph confirms pneumonia
– Family history of pneumonia
– No evidence of pneumonia
• Affirmation/negation
• Uncertainty
• Experiencer
• Historical/Recent
• Severity
Allowable modifiers for each clinical element
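The effect of modifiers on the three pneumonia examples above can be sketched in Python. This is a minimal illustration only: the trigger list, the default values, and the names `TRIGGERS`, `DEFAULTS` and `assign_modifiers` are assumptions made for this sketch, not the modifier ontology or the pyConText implementation.

```python
import re

# Hypothetical trigger phrases mapped to (modifier, value); illustrative only.
TRIGGERS = {
    r"\bno evidence of\b": ("negation", "negated"),
    r"\bfamily history of\b": ("experiencer", "family"),
    r"\bhistory of\b": ("historical", "yes"),
    r"\bpossible\b": ("uncertainty", "uncertain"),
}

# Assumed default values when no trigger is found.
DEFAULTS = {
    "negation": "affirmed",
    "experiencer": "patient",
    "historical": "no",
    "uncertainty": "certain",
}


def assign_modifiers(sentence, concept):
    """Return modifier values for `concept`, or None if it is not mentioned.

    A trigger applies only when it occurs before the concept in the sentence.
    """
    text = sentence.lower()
    position = text.find(concept.lower())
    if position < 0:
        return None
    modifiers = dict(DEFAULTS)
    left_context = text[:position]
    for pattern, (name, value) in TRIGGERS.items():
        if re.search(pattern, left_context):
            modifiers[name] = value
    return modifiers


for sentence in (
    "Chest radiograph confirms pneumonia",
    "Family history of pneumonia",
    "No evidence of pneumonia",
):
    print(sentence, "->", assign_modifiers(sentence, "pneumonia"))
```

The same concept string yields three different interpretations, which is why a schema needs to state which modifiers are allowed for each clinical element.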
Modifier Ontology
Types of modifiers
Linguistic expressions
Actions
Translations
Schema Ontology Imports Modifier Ontology
Medications
– Type
– Dose
– Frequency
– Route
Diagnosis
– Negation
– Uncertainty
– Severity
– History
– Experiencer
Consistent with other models: clinical element models, cTAKES type system, common model
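One way to picture a schema that lists allowable modifiers per clinical element is a small validated record type. The sketch below copies the two modifier lists from the slide; the class name `ClinicalElement`, the table `ALLOWED_MODIFIERS` and the validation behaviour are hypothetical and do not reproduce the actual ontology format.

```python
from dataclasses import dataclass, field

# Allowable modifiers per element type, taken from the slide text.
ALLOWED_MODIFIERS = {
    "Medication": {"Type", "Dose", "Frequency", "Route"},
    "Diagnosis": {"Negation", "Uncertainty", "Severity", "History", "Experiencer"},
}


@dataclass
class ClinicalElement:
    """A clinical element plus its modifiers (hypothetical structure)."""

    element_type: str
    text: str
    modifiers: dict = field(default_factory=dict)

    def __post_init__(self):
        allowed = ALLOWED_MODIFIERS.get(self.element_type)
        if allowed is None:
            raise ValueError(f"unknown element type: {self.element_type}")
        unexpected = set(self.modifiers) - allowed
        if unexpected:
            raise ValueError(
                f"modifiers not allowed for {self.element_type}: {sorted(unexpected)}"
            )


# A medication may carry a route, but not a negation modifier.
ibuprofen = ClinicalElement("Medication", "ibuprofen", {"Route": "p.o."})
print(ibuprofen)
```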
Domain Ontology for NLP
• Instance of schema ontology
• Clinical elements from a particular domain
Synonyms
Misspellings
Regular expressions
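A domain ontology entry that bundles synonyms, misspellings and regular expressions could be compiled into a single matcher roughly as follows. The entry `PNEUMONIA`, its variants, and the helper names are invented for illustration; they are not drawn from a real ontology.

```python
import re

# Hypothetical domain-ontology entry; the variants are illustrative only.
PNEUMONIA = {
    "preferred": "pneumonia",
    "synonyms": ["pneumonia", "lung infection"],
    "misspellings": ["pnuemonia", "neumonia"],
    "regexes": [r"pneumon(?:ia|ias|ic)"],
}


def compile_concept(entry):
    """Build one case-insensitive pattern from all lexical variants."""
    literals = [re.escape(term) for term in entry["synonyms"] + entry["misspellings"]]
    alternatives = literals + entry["regexes"]
    # Try longer alternatives first so they are preferred over shorter ones.
    alternatives.sort(key=len, reverse=True)
    return re.compile(r"\b(?:" + "|".join(alternatives) + r")\b", re.IGNORECASE)


def find_mentions(text, entry):
    """Return (start, end, matched_text) for every mention of the concept."""
    pattern = compile_concept(entry)
    return [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]


print(find_mentions("Pnuemonia vs. lung infection; pneumonic process", PNEUMONIA))
```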
Resources for NLP Experts
Lack of shareable data is a barrier
• University of Pittsburgh Repository
– 111,045 reports of 9 types
– 600 users
– No longer available
• MT Samples
– 2,300 reports from MTSamples.com
– De-identified
Schemas
Clinical Data
Annotations
Annotation Environment
Evaluation
Resources for NLP Experts
AMIA NLP Working Group
ShARe - Sharing Annotated Resources
5R01GM090187: Chapman, Savova, Elhadad
• 600 clinical notes from MIMIC II repository
• Annotate disorders and modifiers
– Anatomic location
• Map to SNOMED codes
• CLEF Shared Task 2013 and 2014
– https://sites.google.com/site/shareclefehealth/
B South, D Mowery, S Velupillai, L Christensen, S Meystre
Resources for NLP Experts
Distributed annotation in secure environment
Annotation Admin
eHOST
Web application
iDASH cloud
Client app
VA, SHARP, and NIGMS: S Duvall, B South, B Adams, G Savova, N Elhadad, H Hochheiser
Annotator Registry
Annotators
• Enlist for annotation
• Certify for annotation tasks
– Personal health information
– Part-of-speech tagging
– UMLS mapping
• Set pay rate
NLP Admins
• Search for annotators
http://nlp-ecosystem.ucsd.edu/annotators
1. Assign annotators to a task
2. Create a Schema
3. Assign users and set time expectations
4. Keep track of progress
Resources for NLP Experts
Evaluation
• Compare output of NLP annotators
• NLP system vs human annotation
• View annotations
• Calculate outcome measures
• Drill down to all levels of annotation
• Perform error analysis
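The comparison of NLP output against human annotation can be sketched as a set comparison that yields both outcome measures and the error lists needed for drill-down. This is a minimal sketch assuming exact-match scoring on `(document_id, start, end, label)` tuples; the function name `outcome_measures` and the tuple layout are assumptions, not the behaviour of the evaluation tool itself.

```python
def outcome_measures(reference, system):
    """Compare system annotations with reference (human) annotations.

    Each annotation is a (document_id, start, end, label) tuple and counts
    as correct only on an exact match (an assumption of this sketch).
    """
    reference, system = set(reference), set(system)
    true_positives = reference & system
    false_positives = system - reference
    false_negatives = reference - system

    tp, fp, fn = len(true_positives), len(false_positives), len(false_negatives)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

    return {
        "precision": precision,
        "recall": recall,
        "f1": f1,
        # Sorted error lists support error analysis at the annotation level.
        "false_positives": sorted(false_positives),
        "false_negatives": sorted(false_negatives),
    }


human = {("d1", 0, 9, "pneumonia"), ("d1", 20, 25, "fever"), ("d2", 5, 10, "cough")}
nlp = {("d1", 0, 9, "pneumonia"), ("d2", 5, 10, "cough"), ("d2", 30, 37, "dyspnea")}
print(outcome_measures(human, nlp))
```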
Document & annotations
Outcome Measures for Selected Annotations
Select Classifications to View
Report List
Attributes for Selected Annotation
Relationships for Selected Annotation
VA and ONC SHARP: Christensen, Murphy, Frabetti, Rodriguez, Savova
Access to Information in Text
User’s Concepts
• Cough
• Dyspnea
• Infiltrate on CXR
• Wheezing
• Fever
• Cervical Lymphadenopathy
Controlled Vocabs
• Dry cough
• Productive cough
• Cough
• Hacking cough
• Bloody cough
Which concepts?
Attribute-values
• Temp 38.0C
• Low-grade temperature
What values?
Efficient Access to Information in the Patient Chart
Knowledge Author
Chart Review Interface
Schema Builder
Disease: colon cancer
Experiencer: family
Negation: no
Historical: yes
“Family history of colon cancer”
NLP Schema Domain Ontology
Knowledge Author
• Front end interface for users
• Back end
– Schema ontology
– Modifier ontology
• Output
– Domain ontology
– Schema for NLP system
B Scuba, F Fana, Liqin Wang, Mingyuan Zhang, Y Liu, M Kong, F Drews
Ibuprofen
Ibuprofen p.o.
No family history of colon cancer
Linguistic modifiers
Calls Voogo synonym tool
Access Information in Patient Chart
Knowledge Author
• Navigate patient data more efficiently
• Point chart reviewer to ambiguous and contradictory information
– Reduce bias
Chart Review Interfaces
Access Information in Patient Chart
Knowledge Author
Chart Review Interfaces
NLP VizEMR
Subjects, Diagnoses, Findings, Anatomical Locations
Population, Patient, Document, Expression
User Identifies Patients Meeting Criteria
Feedback – improve models
Interactive Search and Review of Clinical Records with Multi-layered Semantic Annotation NLM 1R01LM010964-01. Chapman, Wiebe, Hwa.
Population View
Patient View
Access to NLP Tools and Interfaces
Access to NLP Tools
Classifier Workbench
NLP Workbench
Visualization Workbench
KB
NLP Platform
Annotations
User
Edit
Mix & Match
Correct
v3NLP (Zeng, Divita)
pyConText (Chapman)
RapTat (Matheny, Gobbell)
• Interact
• Customize
TextVect
Feature Selection
Algorithms
Training Set
User
Yes 1 0 0 0 1 1 1
No 0 0 1 1 0 0 0
No 0 0 0 1 0 1 0
Select NLP Features
X N-grams
X UMLS Concepts
Part-of-speech tags
X Negation
Select Representation
Binary
X Count
tf-idf
NLP Tools
A Kumar, C Elkan, S Abdelrahman
https://github.com/abhishek-kumar/TextVect
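The three representation choices (binary, count, tf-idf) over n-gram features can be illustrated with the standard library alone. This sketch is not TextVect's code or interface: the function names are hypothetical, tokenization is plain whitespace splitting, and the tf-idf weight is assumed to be raw term count times log(N / document frequency), which is only one of several common variants.

```python
import math
from collections import Counter


def ngrams(tokens, n_max=2):
    """Return all word n-grams of length 1..n_max as strings."""
    return [
        " ".join(tokens[i:i + n])
        for n in range(1, n_max + 1)
        for i in range(len(tokens) - n + 1)
    ]


def vectorize(documents, representation="count", n_max=2):
    """Turn documents into feature vectors over a shared n-gram vocabulary."""
    counts = [Counter(ngrams(doc.lower().split(), n_max)) for doc in documents]
    vocabulary = sorted(set().union(*counts))
    n_docs = len(documents)
    doc_freq = {term: sum(1 for c in counts if term in c) for term in vocabulary}

    rows = []
    for c in counts:
        if representation == "binary":
            row = [1 if c[term] else 0 for term in vocabulary]
        elif representation == "count":
            row = [c[term] for term in vocabulary]
        elif representation == "tf-idf":
            # Assumed weighting: term count * log(N / document frequency).
            row = [c[term] * math.log(n_docs / doc_freq[term]) for term in vocabulary]
        else:
            raise ValueError(f"unknown representation: {representation}")
        rows.append(row)
    return vocabulary, rows


reports = ["no cough no fever", "cough and fever"]
for choice in ("binary", "count", "tf-idf"):
    print(choice, vectorize(reports, choice, n_max=1))
```

Terms that occur in every document, such as "cough" in this toy example, receive a tf-idf weight of zero, while the count representation keeps them.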
Evaluation of TextVect
Micro-F-Measure   CMC dataset   i2b2 dataset
Baseline          -             0.71
Average           0.77          0.91
Best              0.89          0.97
TextVect          0.82          0.95
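For reference, a micro-averaged F-measure pools true positives, false positives and false negatives over all documents and labels before computing a single score, F = 2TP / (2TP + FP + FN). The sketch below assumes multi-label output represented as one set of labels per document; the function name and the toy labels are hypothetical and unrelated to the datasets in the table.

```python
def micro_f_measure(gold, predicted):
    """Micro-averaged F-measure over lists of label sets, one per document."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    tp = fp = fn = 0
    for gold_labels, predicted_labels in zip(gold, predicted):
        gold_labels, predicted_labels = set(gold_labels), set(predicted_labels)
        tp += len(gold_labels & predicted_labels)
        fp += len(predicted_labels - gold_labels)
        fn += len(gold_labels - predicted_labels)
    denominator = 2 * tp + fp + fn
    return 2 * tp / denominator if denominator else 0.0


gold = [{"A"}, {"A", "B"}, {"C"}]
predicted = [{"A"}, {"B"}, {"C", "A"}]
print(micro_f_measure(gold, predicted))
```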
Access to Visualizations of NLP Output
NLP System
Annotations
Visualization workbench
Timeline View
Jianlin Shi, T Wang, E Shenvi, R El-Kareh, M Tharp, R Reeves
Access to Understanding
Access to Understanding Clinical Notes
Chief Complaint:
Hypoxic respiratory failure
Major Surgical or Invasive Procedure:
Intubation.
History of Present Illness:
81 yo man w/ho CAD, COP, PVD, AAA xfered from OSH for mngmt resp failure. Pt was found @ home by EMS followign c/o [**05-29**] "crushing", nonradiating SSCP. Pt diaphoretic during transport. Sat 84-->94% on NRB. Given ASA, NT, nebs en route to OSH where started on BIPAP and eventually intubated. BP on arrival 240/140 so started on NTG drip titrated up until BP fell to 90/58 resulting in IVF, dopamine. Given 80 IV lasix. First set enzymes negative and BNP 1700. Pt xferred for further management.
• Definitions
• Medical terms
• Acronyms/abbreviations
• Pictures
• Internet sites
• Biomedical literature
• Normal range checking
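As an illustration of the acronym/abbreviation support listed above, a glossary lookup can annotate a note in place. This is a rough sketch: the glossary and the function name `annotate_abbreviations` are assumptions, and the expansions shown are only the readings that fit this note. Clinical abbreviations are often ambiguous, so a real tool would need context-sensitive disambiguation.

```python
import re

# Small illustrative glossary (assumed expansions for this note's context).
GLOSSARY = {
    "CAD": "coronary artery disease",
    "PVD": "peripheral vascular disease",
    "AAA": "abdominal aortic aneurysm",
    "OSH": "outside hospital",
    "EMS": "emergency medical services",
    "ASA": "aspirin",
    "BNP": "B-type natriuretic peptide",
    "IVF": "intravenous fluids",
}


def annotate_abbreviations(text, glossary=GLOSSARY):
    """Append the glossary expansion in brackets after each known abbreviation.

    Matching is case-sensitive and whole-word, so unknown tokens are left as-is.
    """
    keys = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(r"\b(" + "|".join(map(re.escape, keys)) + r")\b")
    return pattern.sub(lambda m: f"{m.group(1)} [{glossary[m.group(1)]}]", text)


print(annotate_abbreviations("81 yo man w/ho CAD, COP, PVD, AAA xfered from OSH"))
```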
Conclusion
• Collaborations for NLP improve ability to
– Create potentially useful resources and tools
• Provide access to
– Resources for NLP development
– Information in reports
– NLP and visualization tools
• Major challenge is applying NLP
• Future need
– More integration with other tools
– More coordination
Acknowledgments
• Lee Christensen
• Melissa Tharp
• Mike Conway
• Danielle Mowery
• Bill Scuba
• Milan Kovacevich
• Dieter Hillert
• Samir Abdelrahman
• Leah Willis
• Bob Angell
• Harry Hochheiser
• Jan Wiebe
• Rebecca Hwa
• Guergana Savova
• Noemie Elhadad
• Michael Matheny
• Rob El-Kareh
• Ruth Reeves
• Qing Zeng
• Guy Divita
• Frank Drews
BLU Lab Collaborators
• Sumithra Velupillai
• Maria Kvist
• Maria Skeppstedt
• Aron Henriksson
• Brian Chapman
• David Carrell
• Sascha Dublin
• Zia Agha
• Stephane Meystre
• Scott DuVall
• Jianlin Shi