MWRI WIP February 2014, Harry Hochheiser, [email protected]
User tools for Biomedical Informatics: the Human Side of the Fundamental Theorem
Harry Hochheiser, University of Pittsburgh School of Medicine, Department of Biomedical Informatics, [email protected]
NLM Training Conference June 2014, Harry Hochheiser, [email protected]
• Human + Computer > Human iff
• Value(Computer) > Cost(Computer)
• all too often, this does not hold
Hochheiser's perspective on biomedical informatics
• Informatics tools must
• Support researchers’ tasks and goals.
• Take care of the “stupid” work
GRADS: Genomic Research In Alpha-1 Antitrypsin Deficiency Syndrome and Sarcoidosis
• Alpha-1 antitrypsin deficiency
• “genetic predisposition to early onset pulmonary emphysema and airway obstructions” (GRADS MOP)
• Mutation in SERPINA1 gene - codes for alpha 1-antitrypsin
• Genotypes: PiMM (normal), PiMS (80% serum level), PiSS/PiMZ (60%), PiSZ (40%), PiZZ (20%)
• Sarcoidosis
• “systemic disease characterized by the formation of granulomatous lesions, especially in the lungs, liver, skin, and lymph nodes, with a heterogeneous set of clinical manifestations and a variable course” (GRADS MOP)
• No specific genetic cause
• Infection may play a role.
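The alpha-1 antitrypsin genotype list above can be read as a simple lookup from genotype to approximate serum level. A minimal sketch follows, assuming PiMM ("normal") corresponds to a 100% baseline; the function name is hypothetical and the percentages are only the ones listed on the slide.

```python
# Approximate serum alpha-1 antitrypsin level, as a percentage of normal,
# for each genotype listed on the slide.
# Assumption: PiMM ("normal") is treated as the 100% baseline.
SERUM_LEVEL_PCT = {
    "PiMM": 100,
    "PiMS": 80,
    "PiSS": 60,
    "PiMZ": 60,
    "PiSZ": 40,
    "PiZZ": 20,
}


def genotypes_at_or_below(threshold_pct):
    """Return genotypes whose listed level is <= threshold_pct, lowest level first."""
    hits = [g for g, pct in SERUM_LEVEL_PCT.items() if pct <= threshold_pct]
    return sorted(hits, key=lambda g: (SERUM_LEVEL_PCT[g], g))


print(genotypes_at_or_below(40))
```

A table like this is the kind of genotypic grouping that participants could be distributed across when cohorts are defined.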
GRADS Goals
Use ‘omics data to characterize phenotypes
gene expression
miRNA expression
microbiome
~ 600 patients (400 sarc., 200 A1AT, distributed across phenotypic/genotypic groups), 7 centers
detailed clinical data
lung CT
‘omics, etc.
GRADS Data Sharing Goals
• Integrative exploration of clinical and ‘omic data
• Identify cohorts suitable for analysis
• Are there enough participants to ask my questions?
• Which genes/miRNAs/microbes might be “interesting”?
• How do clinical data relate to ‘omic data?
• Web-based interactive filters and exploration
• Coordinated histogram widgets as both input and output
• Initially, GRADS clinical centers
• eventually, broader community
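The cohort questions above ("are there enough participants?") come down to filtering records and recounting histograms after each filter. The sketch below illustrates that loop under stated assumptions: the participant records, field names, and both helper functions are invented for illustration and are not the GRADS data model or its web interface.

```python
from collections import Counter

# Hypothetical participant records; fields and values are invented.
PARTICIPANTS = [
    {"id": 1, "disease": "sarcoidosis", "age": 45, "center": "A"},
    {"id": 2, "disease": "sarcoidosis", "age": 62, "center": "B"},
    {"id": 3, "disease": "A1AT", "age": 51, "center": "A"},
    {"id": 4, "disease": "A1AT", "age": 38, "center": "C"},
    {"id": 5, "disease": "sarcoidosis", "age": 57, "center": "A"},
]


def select_cohort(records, **criteria):
    """Keep records matching every criterion.

    A criterion is either an exact value or an inclusive (low, high) range.
    """
    def matches(rec):
        for field, wanted in criteria.items():
            value = rec[field]
            if isinstance(wanted, tuple):
                low, high = wanted
                if not (low <= value <= high):
                    return False
            elif value != wanted:
                return False
        return True

    return [r for r in records if matches(r)]


def histogram(records, field, bin_width=None):
    """Count records per category, or per numeric bin when bin_width is given."""
    if bin_width is None:
        return Counter(r[field] for r in records)
    return Counter((r[field] // bin_width) * bin_width for r in records)


# A filter acts as input; the recomputed histograms act as output.
cohort = select_cohort(PARTICIPANTS, disease="sarcoidosis", age=(40, 60))
print(len(cohort), "participants match")
print(histogram(cohort, "center"))
print(histogram(PARTICIPANTS, "age", bin_width=10))
```

In a coordinated-histogram interface, selecting a bar in one histogram would add a criterion like these, and every other histogram would be recounted over the filtered cohort.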
Research Challenges
• Algorithmic enhancements
• Data retrieval and management
• Calculation of “interesting” genes
• GPU-based calculation
• Additional user facilities?
• statistical comparison of subgroups?
Interactive Search and Review of Clinical Records with Multi-Layered Semantic Annotations
• Challenge: retrospective chart review for clinical research
• Quality assessment
• measuring guideline adherence for colonoscopy
• Cohort identification
• patients who may have had adverse reactions
• Use Natural Language Processing to extract relevant variables
• But… researchers need to review findings and correct mistakes.
• Ultimate goal: bridge gap between NLP and clinical research
Word Tree Visualization (Wattenberg and Viégas, 2008); implementation from https://github.com/silverasm/wordtree
Patterns in the text can help facilitate review of NLP results.
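To make the word tree idea concrete, here is a minimal sketch of the underlying data structure: a prefix tree of the words that follow a chosen root word, with a count at each node. It is not the code from the linked repository; the function names and the example sentences are invented, and tokenization is plain whitespace splitting.

```python
def build_word_tree(sentences, root):
    """Nest the words that follow each occurrence of `root` into a prefix tree.

    Each node has the shape {"count": int, "children": {word: node}}.
    """
    tree = {"count": 0, "children": {}}
    for sentence in sentences:
        words = sentence.lower().split()
        for i, word in enumerate(words):
            if word != root:
                continue
            tree["count"] += 1
            node = tree
            for follower in words[i + 1:]:
                child = node["children"].setdefault(
                    follower, {"count": 0, "children": {}}
                )
                child["count"] += 1
                node = child
    return tree


def render(node, label, depth=0):
    """Return indented lines, most frequent branches first."""
    lines = ["  " * depth + f"{label} ({node['count']})"]
    ordered = sorted(
        node["children"].items(), key=lambda kv: (-kv[1]["count"], kv[0])
    )
    for word, child in ordered:
        lines.extend(render(child, word, depth + 1))
    return lines


# Invented example phrases, not taken from real clinical records.
notes = [
    "colonoscopy was complete to the cecum",
    "colonoscopy was incomplete",
    "colonoscopy was complete",
]
tree = build_word_tree(notes, "colonoscopy")
print("\n".join(render(tree, "colonoscopy")))
```

Branches with high counts show the common phrasings at a glance, which is what lets a reviewer check many NLP results together instead of reading one record at a time.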
Next Steps
Interfaces for handling suggested revisions to NLP models:
Selecting spans
Changing variable assignments
Submitting changes
Reviewing modified variable assignments
Assessments
Usability studies
Empirical studies
How much training is needed to “seed” expert review?
Monarch Initiative: Using cross-species phenotypes to explore disease (some slides courtesy of M. Haendel)
Problem: Clinical and model phenotypes are described differently
OWLSim: Phenotype similarity across patients or organisms
https://code.google.com/p/owltools/wiki/OwlSim
Statistical details available on demand
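The slides do not spell out OWLSim's statistics, so the sketch below swaps in a deliberately simple stand-in: Jaccard similarity over phenotype sets after each term is expanded with its ontology ancestors. The toy hierarchy, term names, and function names are all assumptions made for illustration; this is not OWLSim's API or its actual scoring method.

```python
# Toy is-a hierarchy (child -> parents); invented for illustration only.
PARENTS = {
    "ventricular septal defect": ["heart defect"],
    "atrial septal defect": ["heart defect"],
    "heart defect": ["abnormal morphology"],
    "kidney cysts": ["kidney abnormality"],
    "kidney abnormality": ["abnormal morphology"],
    "abnormal morphology": [],
}


def with_ancestors(terms):
    """Return the terms plus every ancestor reachable through PARENTS."""
    closed = set()
    stack = list(terms)
    while stack:
        term = stack.pop()
        if term not in closed:
            closed.add(term)
            stack.extend(PARENTS.get(term, []))
    return closed


def jaccard_similarity(profile_a, profile_b):
    """Jaccard index of two phenotype profiles after ancestor expansion."""
    a = with_ancestors(profile_a)
    b = with_ancestors(profile_b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


patient = {"ventricular septal defect", "kidney cysts"}
model = {"atrial septal defect", "kidney cysts"}
print(round(jaccard_similarity(patient, model), 3))
```

Expanding to ancestors is what lets two profiles match partially when they use different but related terms, which is the core difficulty when clinical and model-organism phenotypes are described differently.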
Scaling up: multiple candidates
b2b1035Clo (aka Blue Meanie): Duplex kidney, Cleft palate, Prenatal growth retardation, Tricuspid valve atresia, Persistent truncus arteriosus, Double outlet right ventricle, Anophthalmia, Microphthalmia, Kidney cysts, Pulmonary valve atresia, Polycystic kidney, Ventricular septal defect, Common atrium, Atrioventricular septal defect, Complete atrioventricular septal defect, …
b2b012Clo (aka Heart Under Glass): Cleft palate, Abnormal sternum morphology, Double outlet right ventricle, Polydactyly, Pulmonary hypoplasia, Kidney cysts, Duplex kidney, Right aortic arch, Common atrium, Complete atrioventricular septal defect, Pulmonary artery atresia
Fgfr2
Fuz
b2b1273Clo (aka octomouse)
Visualization Challenges:
How to explain the inferences driven by ontological calculations?
How to integrate multiple data types to aid interpretation?
Pathways
Gene expression
protein-protein interaction
…..
How to compare across phenotype profiles?
Undiagnosed Disease Program: Comparing Phenotype Profiles
Phenotype Profile - Model Views
Other challenges
Process support - search and interpretation as an ongoing activity
Reducing bias - how do we avoid cherry-picking and ensure thorough investigation?
Navigating semantic chains
phenotypes -> networks -> genes -> model
Closing thoughts…
• The hard problems are not technical
• Collaboration required.
Acknowledgments
GRADS:
U. Pittsburgh: Steve Wisniewski, Mike Becich, Scott O’Neal, Bill Shirey, Becky Boes, Sahawut Wesaratchakit
Yale: Naftali Kaminski
Support: NHLBI U01HL112707
Monarch:
U. Pittsburgh: Chuck Borromeo, Becky Boes, Jeremy Espino
OHSU: Melissa Haendel, Nicole Vasilevky, Matt Brush
NIH-UDP: Murat Sincan, David Adams, Neal Boerkel, Amanda Links, Bill Gahl
LBNL: Nicole Washington, Suzanna Lewis, Chris Mungall
+ colleagues at Sanger, Charite, Toronto, and JAX
UCSD: Anita Bandrowski, Amarnath Gupta, Jeff Grethe, Maryann Martone, Trish Whetzel
Support: NIH Office of Director: 1R24OD011883, NIH-UDP: HHSN2682013
Interactive Search and Review of Clinical Records with Multi-Layered Semantic Annotations:
U. Pittsburgh: Janyce Wiebe, Rebecca Hwa, Alex Conrad, Phuong Pham, Lanfei Shi, Gaurav Trivedi
U. Utah: Wendy Chapman, Danielle Mowery
Support: NLM 7R01LM010964
Other Support:
Addressing Gaps in Clinically Useful Evidence on Drug-Drug Interactions (R. Boyce, NLM: 1R01LM011838)
Cancer Deep Phenotype Extraction from Electronic Medical Records (R. Crowley & G. Savova, NCI: 1U24CA184407)
Quantifying Electronic Medical Record Usability to Improve Clinical Workflow (Z. Agha, AHRQ: 5R01HS021290)