March 17, 2009: I. Sim Translational eScienceEpi – 206 Medical Informatics
Translational e-Science
Ida Sim, MD, PhD
March 17, 2009
Division of General Internal Medicine, and Graduate Group in Biological and Medical Informatics
UCSF
Copyright Ida Sim, 2009. All federal and state rights reserved for all original material presented in this course through any medium, including lecture or print.

Some Observations
• We reinvent the wheel with every study
• We don’t repurpose data efficiently
• Research and care are separate, unintegrated
• We use computers for data processing, not concept processing
• It’s logistically hard to work with collaborators
• ...will increasingly limit C&T research we want and need to do
  – “The ‘clinical research grid’ is failing.” (Crowley, et al, JAMA 2004; 291:1120-1126), Institute of Medicine

Outline
• Translational biomedical informatics
• Collaborative Knowledge Work
  – Web 2.0 principles
• Class Summary

Personalized Medicine
• Geno-pheno correlations are the crux of personalized medicine
  – need genomic and phenotype data in computable form for large-scale correlations
• Genomic data will be a commodity
  – SNPs, whole genome analysis
• Phenotype is the bottleneck
  – what is “phenotype”?
  – how to represent it? standardize it?
  – where does phenotype data come from?

Phenotype Definition
• Molecular/biochemical phenotype
  – e.g., expression profiles, proteomics, metabolomics
• Clinical phenotype
  – clinically observable manifestations of a person’s genetic make-up and environment
• Sources of clinical phenotype
  – clinical care data
  – clinical research data, i.e., human studies

Human Studies Used For
• ... biomedical research
  – what works? what doesn’t? what do the results tell me about mechanism, biology?
  – what’s been studied?
• ... patient care
  – will this work in this patient? how well will it work? is it better than other alternatives?
• “Human studyome” is the scientific foundation for understanding human health and disease and for advancing human health

Lifecycle of Human Studies
[Figure: lifecycle diagram. Phases: Human Studies Performance, Human Studies Interpretation, Human Studies Application. Nodes: Study Design, Study Registration, Study Execution, Regulatory Reporting, Scientific Reporting, Journals/Trial Registers/etc., Systematic Reviews, Decision Models, Guidelines, Electronic Patient Record, Feedback to Study Design.]

Systems Interoperation Needed
[Figure: the human studies lifecycle diagram (Performance, Interpretation, Application phases, from Study Design through Journals/Trial Registers to Systematic Reviews, Decision Models, Guidelines, Electronic Patient Record, and Feedback to Study Design), annotated with the standards in use at different points: CDISC-PR, BRIDG, SDTM, HL7-CTR, GEM/GLIF/SAGE, CCD/CCR, HL7 RIM.]

Human Studyome is Central
[Figure: the same lifecycle diagram and standards (CDISC-PR, BRIDG, SDTM, HL7-CTR, GEM/GLIF/SAGE, CCD/CCR, HL7 RIM), now with “Human Studyome” in place of “Journals, Trial Registers, etc.”]

Computerizing the Studyome
• Human studyome: totality of human studies worldwide
• Computerize for large-scale discovery, reanalysis, reuse
• More complex than for genome, proteome, etc.
  – raw results have very different meaning within different study designs
    • e.g., interventional vs. observational study
  – need to standardize study design descriptions
    • to make sense of raw results
    • to combine results across multiple studies

Sharing Raw Results
46.4 (39.2-51.2)   45.1 (39.9-50.5)
0.83 (0.79-0.99)   0.91 (0.93-1.04)
2.2 (1.7-3.4)      2.7 (1.1-4.1)
110 (87-134)       121 (99-129)

Need Standardized Metadata
• Variable names are metadata
• MeSH, ICD, SNOMED, etc. are standard clinical vocabularies
  – ionized calcium: UMLS code C0373561
Age           46.4 (39.2-51.2)   45.1 (39.9-50.5)
ICa           0.83 (0.79-0.99)   0.91 (0.93-1.04)
Creatinine    2.2 (1.7-3.4)      2.7 (1.1-4.1)
Weight (lbs)  110 (87-134)       121 (99-129)
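One way to picture “variable names are metadata” is to attach a vocabulary code to each row label. The following is a minimal sketch, assuming a hypothetical `Variable` record that is not part of any standard; only the ionized calcium UMLS code comes from the slide, and the other variables are deliberately left uncoded rather than guessed.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variable:
    """Hypothetical record tying a local row label to a standard concept."""
    label: str                        # local name as it appears in the results table
    vocabulary: Optional[str] = None  # e.g., "UMLS"; None if not yet coded
    code: Optional[str] = None        # concept identifier within that vocabulary
    units: Optional[str] = None

    def is_standardized(self) -> bool:
        # A variable counts as standardized only when both parts are present.
        return self.vocabulary is not None and self.code is not None


variables = [
    Variable("Age"),
    Variable("ICa", vocabulary="UMLS", code="C0373561"),  # code given on the slide
    Variable("Creatinine"),
    Variable("Weight", units="lbs"),
]

# Labels that still need a standard code before results can be pooled.
uncoded = [v.label for v in variables if not v.is_standardized()]
print(uncoded)
```

Until every label resolves to a shared code, a machine cannot tell that “ICa” in one study and “ionized calcium” in another are the same variable.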

Need Metadata About the Study
              Garlic             Chocolate
Age           46.4 (39.2-51.2)   45.1 (39.9-50.5)
ICa           0.83 (0.79-0.99)   0.91 (0.93-1.04)
Creatinine    2.2 (1.7-3.4)      2.7 (1.1-4.1)
Weight (lbs)  110 (87-134)       121 (99-129)
• Study results = “study data”
• Variable names = “study results metadata”
• Data about study design = “study metadata”

Need Study Design Metadata
              Garlic             Chocolate
Age           46.4 (39.2-51.2)   45.1 (39.9-50.5)
ICa           0.83 (0.79-0.99)   0.91 (0.93-1.04)
Creatinine    2.2 (1.7-3.4)      2.7 (1.1-4.1)
Weight (lbs)  110 (87-134)       121 (99-129)
• Randomized trial of garlic vs. chocolate for weight loss? Observational study of ionized calcium levels?
• i.e., need data standardized in an ontology of human studies research
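The three layers (study data, study results metadata, study metadata) can be sketched as nested records. This is an illustrative sketch only: `StudyMetadata`, `StudyRecord`, and `role_of` are hypothetical names, not OCRe or any published schema, and the two designs are the two readings the slide asks about.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# (point estimate, lower bound, upper bound) as printed for one column
Summary = Tuple[float, float, float]


@dataclass
class StudyMetadata:
    """Data about the study design (hypothetical fields for illustration)."""
    design: str           # e.g., "randomized trial" or "observational study"
    groups: List[str]     # column headings of the results table
    primary_outcome: str  # the variable the study is actually about


@dataclass
class StudyRecord:
    study_metadata: StudyMetadata
    # Keys are variable names ("study results metadata");
    # values are per-group summaries ("study data").
    results: Dict[str, Dict[str, Summary]]


def role_of(record: StudyRecord, variable: str) -> str:
    """Report how a variable should be read, given the study metadata."""
    if variable not in record.results:
        raise KeyError(variable)
    if variable == record.study_metadata.primary_outcome:
        return "primary outcome"
    return "other variable"


# Two rows of the table from the slide.
results = {
    "ICa": {"Garlic": (0.83, 0.79, 0.99), "Chocolate": (0.91, 0.93, 1.04)},
    "Weight (lbs)": {"Garlic": (110.0, 87.0, 134.0), "Chocolate": (121.0, 99.0, 129.0)},
}
groups = ["Garlic", "Chocolate"]

# Same numbers, two different stories depending on the study metadata.
as_trial = StudyRecord(StudyMetadata("randomized trial", groups, "Weight (lbs)"), results)
as_observational = StudyRecord(StudyMetadata("observational study", groups, "ICa"), results)

print(role_of(as_trial, "Weight (lbs)"))
print(role_of(as_observational, "Weight (lbs)"))
```

The numbers are identical in both records; only the study metadata changes what they mean.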

OCRe
• Ontology of Clinical Research [Sim, et al]
  – in Web Ontology Language (OWL)
• Scope/Domain
  – all human studies, all clinical domains, any intent
  – all variable types
    • quantitative, qualitative, imaging, genomics, etc.
• Importing subsets of concepts and terms where appropriate
    • e.g., BRIDG, Ontology for Biomedical Investigations (OBI)

Who’s Studied? Phenotype
• Clinical phenotype definitions in human studies
  – eligibility rules
    • e.g., “No other malignancy within the past 5 years except curatively treated basal cell or squamous cell skin cancer or carcinoma in situ of the cervix or breast”
  – outcome definitions
• No computable language yet exists for expressing such complex logical rules
  – negation: what exactly does “no” mean? no other *known* malignancy?
  – temporal representation: 5 years from when?
  – “curatively treated”?

Specifying Eligibility Rules
• ASPIRE approach (HL7, CDISC) [Niland, et al]
  – standard demographic “rules” (e.g., smoking yes/no; gender M/F; reproductive status)
  – domain-specific (NCI doing this too)
    • e.g., breast cancer: ER/PR status, Stage
• ERGO generic rule expression language [Sim, et al]
  – set of templates and grammars with 3 statement types
    • person has_property X
    • person has_intervention Y
    • person has_behavior Z
  – X, Y, Z should be CDEs or standard vocabulary terms
  – X, Y, Z can be negated, modified, ANDed/ORed, etc.
  – theoretically, can say all that can be said about clinical phenotype
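To make the idea of statements that can be negated and ANDed/ORed concrete, here is a small sketch of a rule tree with an evaluator. It is not ERGO’s actual grammar or syntax: the class names, the `evaluate` function, and the plain-string terms are all assumptions for illustration (real rules would use CDEs or standard vocabulary codes).

```python
from dataclasses import dataclass
from typing import FrozenSet, Tuple, Union


@dataclass(frozen=True)
class Statement:
    """One atomic statement: person <relation> <term>."""
    relation: str  # "has_property", "has_intervention", or "has_behavior"
    term: str      # placeholder string; ideally a CDE or vocabulary term


@dataclass(frozen=True)
class Not:
    operand: "Rule"


@dataclass(frozen=True)
class And:
    operands: Tuple["Rule", ...]


@dataclass(frozen=True)
class Or:
    operands: Tuple["Rule", ...]


Rule = Union[Statement, Not, And, Or]

# A person is modeled as the set of statements KNOWN to be true of them.
Person = FrozenSet[Statement]


def evaluate(rule: Rule, person: Person) -> bool:
    """Recursively evaluate a rule tree against what is known about a person."""
    if isinstance(rule, Statement):
        return rule in person
    if isinstance(rule, Not):
        return not evaluate(rule.operand, person)
    if isinstance(rule, And):
        return all(evaluate(r, person) for r in rule.operands)
    if isinstance(rule, Or):
        return any(evaluate(r, person) for r in rule.operands)
    raise TypeError(f"unsupported rule node: {rule!r}")


breast_cancer = Statement("has_property", "breast cancer")
smoking = Statement("has_behavior", "smoking")

# Hypothetical criterion: has breast cancer AND does not smoke.
eligibility = And((breast_cancer, Not(smoking)))

non_smoker = frozenset({breast_cancer})
smoker = frozenset({breast_cancer, smoking})

print(evaluate(eligibility, non_smoker))
print(evaluate(eligibility, smoker))
```

Note the design choice hiding in `Not`: anything absent from the record is treated as false, which is exactly the “no other *known* malignancy?” ambiguity. Temporal qualifiers such as “within the past 5 years” are not modeled here at all.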

“Phenotype Informatics”
[Figure: translational pipeline. Basic Discovery (genomics, proteomics, pharmacogenomics, metabolomics, etc.) → T1 Translation → Clinical Research (clinical trials, epidemiology, molecular epi) → T2 Translation → Clinical Care (evidence-based practice, patient safety, quality of care).]
• Need rich computable clinical phenotype statements
  – who, what is being studied
• Need representations of clinical study design
  – to put data in context
• In order to
  – guide T1 researchers on selecting clinical cohorts
  – estimate potential cohort sizes in EHRs
  – facilitate eligibility determination of individual patients
  – facilitate retrieval of studies studying “patients like this one”

Outline
• Translational biomedical informatics
• Collaborative Knowledge Work
  – Web 2.0 principles
• Class Summary

Extreme Scale Discovery
• Research based on “cyberinfrastructure” is the single most important challenge confronting the nation’s science laboratories (NSF)
• The challenge is based on a “grand convergence” of
  – maturation of the Internet as connective data technology
  – ubiquity of microchips in computers, appliances, and sensors
  – an explosion of data from the research enterprise
• Ability to do large-scale multi-disciplinary data analysis, visualization, etc. is frontier of research
http://www.nsf.gov/news/special_reports/cyber/index.jsp

Why Collaborative Knowledge?
• Collaborative
  – sense-making is a group activity
  – multi-disciplinary, asynchronous, distributed
• Knowledge
  – beyond data & information
  – certainly not transactions!
[Figure: health informatics layers diagram. Patient Care / Wellness side: Transactions, Raw data, Virtual Patient, Medical knowledge, Decision support, Medical logic. Research side: Clinical research transactions, Raw research data. Underlying both: Workflow modeling and support, usability, cognitive support, computer-supported cooperative work (CSCW), etc. Annotations: EHRs, CTMSs, “Where clinicians want to stay”.]

Collab Knowledge in Care
• Beyond the EHR (i.e., beyond record-keeping)
• Supporting collaborative care
  – messaging, task management, shared conceptualization of problem/education, group decision making, secure distributed permissioned access
• “Upskilling” all participants
  – 40% of Americans have a chronic condition
    • chronic diseases account for >75% of total medical costs
  – not enough primary care or specialists for chronic disease management
  – must increase knowledge of entire care team (e.g., families)

Electronic Health “X”
• EHX systems “upskill” all participants for outcomes-based, coordinated care
  – transactions fall out of meeting objectives
  – documentation falls out of interactions and transactions
[Figure: health informatics layers diagram. Patient Care / Wellness side: Transactions, Raw data, Virtual Patient, Medical knowledge, Decision support, Medical logic. Research side: Clinical research transactions, Raw research data. Underlying both: Workflow modeling and support, usability, cognitive support, computer-supported cooperative work (CSCW), etc. Annotation: EHX.]

NIH Challenge Grants: EHX
• 06-LM-102* Self-documenting encounters. Develop technologies, tools, and processes to achieve rapid and comprehensive electronic documentation of encounters with patients/research subjects.
• 10-LM-102* Advanced decision support for complex clinical decisions. Use artificial intelligence techniques to provide practical support for complex decision making in health care and clinical research contexts.
• 06-LM-101* Intelligent Search Tool for Answering Clinical Questions. Develop new computational approaches to information retrieval that would allow a clinician or clinical researcher to pose a single query that would result in search of multiple data sources to produce a coherent response that highlights key relevant information which may signal new insights for clinical research or patient care.
• 06-OD(OBSSR)-101* Using new technologies to improve or measure adherence. New and innovative technologies to improve and/or measure patient adherence to prescribed medical regimens and utilization of adherence-enhancing strategies in clinical practice would greatly enhance the health impact of efficacious treatments and preventive regimens.
• 05-LM-104* Value of “Virtual Reality” Interaction in Improving Compliance with Diabetic Regimen. Interactions between avatars in virtual reality environments such as Second Life are known to influence behavior. Studies should explore the effectiveness of periodic physician/nurse interaction with diabetic patients via a virtual reality environment in improving diabetic control, as compared to standard practice.

Collab Knowledge in Research
• Beyond data storage, security, and access
  – knowledge retrieval and reasoning
• Supporting collaborative sense-making
  – visualization, pattern matching and testing, combining multi-disciplinary worldviews
• Continuous learning by all participants
  – teachable moments for new methods, findings, hypotheses
  – tighter coupling of front-line clinical evidence needs to research questions
[Figure: “Big Picture of Health Informatics” layers diagram. Patient Care / Wellness side: Transactions, Raw data, Virtual Patient, Medical knowledge, Decision support, Medical logic. Research side: Clinical research transactions, Raw research data. Underlying both: Workflow modeling and support, usability, cognitive support, computer-supported cooperative work (CSCW), etc. Annotations: EHX, Collab Research Systems.]

NIH Challenge Grants: CRI
• 06-RR-101* Virtual environments for multidisciplinary and translational research. Virtual networking environments like Science Commons, Facebook, and Second Life create platforms that can eliminate many barriers in scientific collaborations.
• 10-CA-101* Cyber-Infrastructure for Health: Building Technologies to Support Data Coordination and Computational Thinking.
• 06-RR-102* Infrastructure for biomedical knowledge discovery. Biomedical research depends on heterogeneous data of varying reliability that are increasingly multimedia and high-dimensional.
• 10-RR-101* Information Technology Demonstration Projects Facilitating Secondary Use of Healthcare Data for Research. Analysis of enormous amounts of aggregate, anonymous healthcare data has potential to provide evidence for best practices and to identify promising areas for additional research.
• 07-NS-101* Developing technology to increase efficiency and decrease cost of clinical trials. Clinical trials are becoming increasingly expensive, and many US patients are unwilling to enroll, which has led to delays in trial completion and further cost increases.
• 10-EB-102 User-friendly computing infrastructures for biomedical researchers and clinicians. Openly available computing infrastructures that link to shared research and clinical databases as well as robust analysis and visualization tools need to be available to users who do not have prior computing expertise.

A Knowledge Commons
• Science Commons: all open data, on semantic web
• Health Commons virtual labs vision
  – “buy” scientific services like you shop at Amazon
    • high-throughput genotyping, array analysis, trial recruitment, survey design
  – assemble your team as needed
  – IP, material transfer agreements, etc. all handled by Health Commons framework (like e-commerce)
• Predicated on large-scale, open data

Outline
• Translational biomedical informatics
• Collaborative Knowledge Work
  – Web 2.0 principles
• Class Summary

Web 2.0
• Vague-ish term on alternative/emerging web future
• Several principles
  – user-generated content
  – harness power/wisdom of crowds
  – openness
  – architecture of participation
  – niche markets
(P. Anderson, What is Web 2.0? JISC Tech and Standards Watch, Feb 2007)

User-Generated Content
• Anyone anywhere is a source of content
  – YouTube, Flickr, Wikipedia
  – citizen journalism, blogs
• Time magazine’s 2006 Person of the Year
  – “You”
• Exists in parallel with (trumps?) Old Media, hierarchical information sources (e.g., journals)

Power/Wisdom of Crowds
• Tapping into distributed intelligence of people
  – “stock market” for 2004 election outcome
  – Wikipedia
• Use distributed resources
  – SETI@home project uses your PC to analyze signals for signs of intelligence from outer space (setiathome.berkeley.edu)

Openness
• Dimensions of openness
  – open source: computer code open to all for wisdom of crowds to improve
  – open access: no restrictions on use or distribution of content
  – open participation: everyone can participate
    • communal management, flat hierarchies
    • consensus, emergent decision-making
• Allows “mash-ups” of freed data
  – http://web.mac.com/jburg/iWeb/GoogleLit/GoogleLit%20Trips.html for Aeneid, Grapes of Wrath, user-generated road trips...

Architecture of Participation
• “the service automatically gets better the more people use it”, e.g.,
  – Google search
    • the more “link paths” people tread, the richer the data for the Google search algorithm
  – Amazon book ratings, Netflix ratings
• Network externalities concept
  – fax machines, cell phones...the more the better

Niche Markets
• “The web” is unlimited resource
  – can service even extremely small market niches
• Shape of the web: the “long tail”
[Figure: long-tail curve. Vertical axis: # people. Horizontal axis: market niche/things being done. Annotations: “where traditional focus is”; “with infinitely long tail, majority of action is here”.]

Web 2.0 for Health/Research?
• What health/research can Web 2.0 transform today?

Content Production
• Anyone can produce “content” (researchers, clinicians, patients, etc.)
  – clinicians: e.g., www.ganfyd.org, a medical wiki for MDs, www.sermo.com, etc.
  – patients: tens of thousands of web sites...
  – social tagging/social bookmarking (e.g., del.icio.us)
    • (content, your-bookmark-tag, your-name) <==> (content, same-bookmark-tag, potential-collaborator)
• All content is open
  – e.g., Consolidated Appropriations Act of 2007 requires open online access to NIH funded research
  – NIH Data Sharing initiative, PubMed Central, etc.
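The social tagging match above, where someone who put the same tag on the same content is a potential collaborator, can be sketched in a few lines. The function name, the tuple layout, and the sample bookmarks are all made up for illustration and do not reflect any bookmarking service’s API.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

Bookmark = Tuple[str, str, str]  # (content, tag, user)


def potential_collaborators(bookmarks: Iterable[Bookmark], user: str) -> Set[str]:
    """Return other users who applied the same tag to the same content."""
    taggers: Dict[Tuple[str, str], Set[str]] = defaultdict(set)
    for content, tag, who in bookmarks:
        taggers[(content, tag)].add(who)

    matches: Set[str] = set()
    for users in taggers.values():
        if user in users:
            matches |= users
    matches.discard(user)  # you are not your own collaborator
    return matches


# Hypothetical bookmarks; names and content identifiers are invented.
bookmarks = [
    ("paper-A", "eligibility", "ana"),
    ("paper-A", "eligibility", "ben"),   # same content, same tag as ana
    ("paper-A", "ontology", "cy"),       # same content, different tag
    ("paper-B", "eligibility", "dee"),   # same tag, different content
]

print(potential_collaborators(bookmarks, "ana"))
```

Matching on the (content, tag) pair rather than on the tag alone is what keeps the suggestion specific: only `ben` is returned for `ana`.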

Publication
• “Publication” is self-controlled
  – self-archiving, self-publishing in institutional repositories and/or eScience communities
  – e.g., PLoS One, Nature portals
    • papers published into PLoS platform
    • scientists self-aggregate into (niche) communities
    • reader ratings & comments “direct” papers to relevant communities
    • evaluation is by # of views, # of comments/citations, ratings, link outs, blog mentions, etc.
Disclosure: I’m on PLoS One Advisory Board

Web 2.0 for Health/Research?
• To support discovery, not just participation, need more than just Web 2.0
• Need “semantic web”

What’s the Semantic Web?
• Semantic
  – “of or relating to meaning in language” (Merriam-Webster)
  – “relating to signification or meaning” (OED)
• Current web is non-semantic
  – “the web” does not “understand” the meaning of
    • content of web pages, or
    • data that is sent over the network (e.g., Netflix movie names, or movie content)

Semantic Web
• All content on or sent over the web is expressed using OWL ontologies
  – Web Ontology Language, for describing everything, like “SNOMED for everything”
    • see OntoWiki, National Center for Biomedical Ontology
• “Intelligent agents” can roam the web doing smart things for you
  – e.g., booking your summer vacation, making appointment with the best cardiothoracic surgeon, re-balancing your retirement portfolio
  – learning from your actions, acting on your behalf
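A toy illustration of what “understanding meaning” buys an agent: facts stored as subject-predicate-object triples, plus one inference rule over a class hierarchy. This is plain Python rather than real OWL/RDF tooling, and the predicate names (`subclass_of`, `is_a`) and the individual `dr_example` are invented for the sketch.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object)

# A tiny hand-written "ontology" plus one fact about an individual.
triples: Set[Triple] = {
    ("CardiothoracicSurgeon", "subclass_of", "Surgeon"),
    ("Surgeon", "subclass_of", "Physician"),
    ("dr_example", "is_a", "CardiothoracicSurgeon"),
}


def superclasses(cls: str, facts: Set[Triple]) -> Set[str]:
    """Follow subclass_of links transitively from cls."""
    found: Set[str] = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subject, predicate, obj in facts:
            if subject == current and predicate == "subclass_of" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def instances_of(cls: str, facts: Set[Triple]) -> Set[str]:
    """Individuals asserted to be cls or any of its subclasses."""
    result: Set[str] = set()
    for subject, predicate, obj in facts:
        if predicate == "is_a" and (obj == cls or cls in superclasses(obj, facts)):
            result.add(subject)
    return result


print(instances_of("Physician", triples))
```

A keyword search for “Physician” would miss `dr_example`, because that word never appears next to the name; the class hierarchy is what lets the query succeed.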

“Killer App” for SW/Web 2.0
• Combination of semantic web and web 2.0 applied to science
• Open data/open science on epic scale
  – everyone produces content
  – automated data mining and knowledge discovery across all of biomedicine
  – collaborative, flat, fluid, emergent, open participation
  – even very esoteric communities can be supported

Open Knowledge
• Public and Other Data Repositories
  – GenBank, UK BioBank, deCODE
  – Gene Expression Omnibus (GEO): gene expression and genomic hybridization experiments, http://www.ncbi.nlm.nih.gov/geo
  – PharmGKB: pharmacogenomics, http://pharmgkb.org
  – ClinicalTrials.gov
• Knowledge repositories
  – Morningside repository of computable guidelines with computable parameters (e.g., inpatient? location?) and workflow

Large-Scale Knowledge Discovery
• Text mining, data mining, model building across ALL data on web
  – within and outside biomedicine
  – supervised (e.g., neural net) and unsupervised (e.g., clustering) learning
• e.g., www.freebase.com
  – free + database = absolutely everything in structured, computable form
  – indexed to OWL ontologies
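For readers new to the supervised/unsupervised distinction, a deliberately tiny one-dimensional sketch follows. It substitutes a nearest-centroid classifier for the neural net mentioned on the slide (simpler, same idea of learning from labeled examples) and uses two-cluster k-means for the clustering case. All numbers and labels are invented.

```python
from statistics import mean
from typing import Dict, List, Tuple


def fit_centroids(labeled: List[Tuple[float, str]]) -> Dict[str, float]:
    """Supervised: labels are given; learn one centroid per label."""
    groups: Dict[str, List[float]] = {}
    for value, label in labeled:
        groups.setdefault(label, []).append(value)
    return {label: mean(values) for label, values in groups.items()}


def predict(centroids: Dict[str, float], value: float) -> str:
    """Assign the label whose centroid is closest to value."""
    return min(centroids, key=lambda label: abs(centroids[label] - value))


def two_means(values: List[float], iterations: int = 10) -> Tuple[List[float], List[float]]:
    """Unsupervised: no labels; split values into two clusters (1-D k-means, k=2)."""
    low, high = min(values), max(values)  # initial cluster centers
    a: List[float] = list(values)
    b: List[float] = []
    for _ in range(iterations):
        a = [v for v in values if abs(v - low) <= abs(v - high)]
        b = [v for v in values if abs(v - low) > abs(v - high)]
        if not b:  # all values identical; nothing to split
            break
        low, high = mean(a), mean(b)
    return a, b


# Supervised: a few labeled measurements, then classify a new one.
centroids = fit_centroids([(1.0, "low"), (1.2, "low"), (5.0, "high"), (5.3, "high")])
print(predict(centroids, 4.0))

# Unsupervised: the same kind of values, but with no labels at all.
cluster_a, cluster_b = two_means([1.0, 1.2, 0.9, 5.0, 5.3, 4.8])
print(sorted(cluster_a), sorted(cluster_b))
```

The supervised routine needs someone to have labeled the training data; the unsupervised one finds the two groups on its own but cannot name them.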

eCare and eScience
[Figure: layered architecture across three columns (Administrative, Clinical Care, Research). Shared layers: Physical Networking; Standard Communications Protocols (e.g., HL-7). Systems: Practice Management Systems; EHR; Execution, Analysis. Data models: Medical Business Data Model; Clinical Care Data Model; Clinical Study Data Models. Additional layers: Open de-identified repositories; OWL Ontologies of Everything.]

Outline
• Translational biomedical informatics
• Collaborative Knowledge Work
  – Web 2.0 principles
• Class Summary

Summary
• The more “computable” the information, the more the computer can do for us
• ...not just us individually, but together as a community of science
  – syntactic interoperation: a common grammar for machines talking to each other in biomedicine
    • e.g., HL7
  – semantic interoperation: reliable exchange of common meaning among humans and machines
    • requires standard vocabularies and standard data models
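The difference between the two kinds of interoperation can be shown with two hypothetical sites that both send JSON. JSON stands in here for the common grammar (it is not HL7), and the site code maps are invented. The one real identifier is C0373561, the UMLS code for ionized calcium cited earlier in the lecture.

```python
import json
from typing import Dict

# Hypothetical per-site maps from local labels to a shared concept code.
SITE_A_CODES = {"ICa": "C0373561"}
SITE_B_CODES = {"ionized_calcium": "C0373561"}


def parse(message: str) -> Dict[str, float]:
    """Syntactic interoperation: both sites agree on JSON as the grammar."""
    return json.loads(message)


def to_shared(record: Dict[str, float], code_map: Dict[str, str]) -> Dict[str, float]:
    """Semantic interoperation: re-key by shared concept code.

    Labels with no mapping are dropped rather than guessed.
    """
    return {code_map[label]: value for label, value in record.items() if label in code_map}


msg_a = '{"ICa": 0.83}'
msg_b = '{"ionized_calcium": 0.91}'

# Both messages parse, yet the raw keys have nothing in common.
raw_overlap = set(parse(msg_a)) & set(parse(msg_b))

# After mapping to the standard vocabulary, the shared variable appears.
shared_overlap = set(to_shared(parse(msg_a), SITE_A_CODES)) & set(
    to_shared(parse(msg_b), SITE_B_CODES)
)

print(raw_overlap, shared_overlap)
```

Parsing succeeds on both messages without any shared meaning; only the mapping step, which depends on a standard vocabulary, lets a machine recognize that both sites measured the same thing.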

Summary for Clinical Care
• EHR adoption slowly increasing
  – CCHIT certification helping
  – barriers include finances, lack of organizational change expertise, fragmentation of health care system, misaligned incentives
• EHR and data warehouses can but don’t always help research
• Limited success of decision support systems
• Fundamental tradeoff of coding effort vs. “smartness” of system limits both EHR and CDSS return on investment

Standardization
• Standardization of terms absolutely critical but not a solved problem
  – SNOMED most comprehensive but use is unproven
• Standardization of how we put terms together for specific uses is also important
  – “Common Data Elements” for use in research
  – a standard EHR data model so all EHRs “look” alike
    • e.g., HL7 CDA version 2
  – a standard protocol model for each “experiment type”, etc. in biomedical research
    • e.g., clinical trials, microarrays

Take-Home Message
• Informatics necessary to do better knowledge management in care and research
• Much can be done today, major barriers are policy and workflow related
  – lack of easy-to-use, robust vocabulary and data model standards is contributory
• Disruptive change to eScience quite possible if we can get from data processing to concept processing
[Figure: “Big Picture of Health Informatics” layers diagram. Patient Care / Wellness side: Transactions, Raw data, Virtual Patient, Medical knowledge, Decision support, Medical logic. Research side: Clinical research transactions, Raw research data. Underlying both: Workflow modeling and support, usability, cognitive support, computer-supported cooperative work (CSCW), etc. Annotations: EHX, Collab Research Systems.]