Clinical and Translational Informatics Capabilities for the University of Kansas Medical Center
Russ Waitman, PhD
Associate Professor, Director Medical Informatics, Department of Biostatistics
September 29, 2011
Outline
• What is Biomedical Informatics?
• What are the Clinical Translational Science Awards?
• Informatics Aims: focus on storing and getting
• Tools for storing information: CRIS and REDCap
• Tool for viewing/getting information: HERON/i2b2
• Oversight Process
• Information Architecture
• Observations
• Milestones
• Questions
Background: Charles Friedman
The Fundamental Theorem of Biomedical Informatics: A person working with an information resource is better than that same person unassisted.
NOT!! (the theorem does not claim that the information resource alone is better than the person)
Charles P. Friedman: http://www.jamia.org/cgi/reprint/16/2/169.pdf
Background: William Stead The Individual Expert
William Stead: http://courses.mbl.edu/mi/2009/presentations_fall/SteadV1.ppt
[Figure: diagram labeled Evidence, Patient Record, Synthesis & Decision, and Clinician]
The demise of expert-based practice is inevitable
[Figure: Facts per Decision (axis ticks 10, 100, 1000) plotted against year (1990, 2000, 2010, 2020), with Human Cognitive Capacity marked at 5. Labeled series: Decisions by Clinical Phenotype; Structural Genetics (e.g. SNPs, haplotypes); Functional Genetics (gene expression profiles); Proteomics and other effector molecules.]
William Stead: http://courses.mbl.edu/mi/2009/presentations_fall/SteadV1.ppt
Background: Edward Shortliffe Biomedical Informatics Applications
[Figure: Basic Research (Biomedical Informatics Methods, Techniques, and Theories) and Applied Research. Application areas: Bioinformatics, Imaging Informatics, Clinical Informatics, Public Health Informatics. Levels of study: Molecular and Cellular Processes, Tissues and Organs, Individuals (Patients), Populations and Society.]
Edward Shortliffe: http://www.dentalinformatics.com/conference/conference_presentations/shortliffe.ppt
Background: Edward Shortliffe Biomedical Informatics Research Areas
Edward Shortliffe: http://www.dentalinformatics.com/conference/conference_presentations/shortliffe.ppt
[Figure: Biomedical Knowledge and Biomedical Data, with a Knowledge Base, Inferencing System, and Data Base at the center. Research areas labeled: Knowledge Acquisition; Data Acquisition; Biomedical Research Planning & Data Analysis; Model Development; Image Generation; Information Retrieval; Diagnosis; Treatment Planning; Human Interface; Teaching. Method labels: real-time acquisition, imaging, speech/language/text, specialized input devices; machine learning, text interpretation, knowledge engineering.]
“It is the responsibility of those of us involved in today’s biomedical research enterprise to translate the remarkable scientific innovations we are witnessing into health gains for the nation.”
Clinical and Translational Science Awards: An NIH Roadmap Initiative
Background: NIH Goal to Reduce Barriers to Research
• Administrative bottlenecks
• Poor integration of translational resources
• Delay in the completion of clinical studies
• Difficulties in human subject recruitment
• Little investment in methodologic research
• Insufficient bi-directional information flow
• Increasingly complex resources needed
• Inadequate models of human disease
• Reduced financial margins
• Difficulty recruiting, training, mentoring scientists
CTSA Objectives: The purpose of this initiative is to assist institutions to forge a uniquely transformative, novel, and integrative academic home for Clinical and Translational Science that has the consolidated resources to:
1) captivate, advance, and nurture a cadre of well-trained multi- and inter-disciplinary investigators and research teams;
2) create an incubator for innovative research tools and information technologies; and
3) synergize multi-disciplinary and inter-disciplinary clinical and translational research and researchers to catalyze the application of new knowledge and techniques to clinical practice at the front lines of patient care.
NIH CTSAs: Home for Clinical and Translational Science
[Figure: the CTSA HOME, with components Trial Design; Advanced Degree-Granting Programs; Participant & Community Involvement; Regulatory Support; Biostatistics; Clinical Resources; Biomedical Informatics; Clinical Research Ethics. Connected to NIH, Other Institutions, and Industry.]
Dan Masys: http://courses.mbl.edu/mi/2009/presentations_fall/masys.ppt
Reengineering Clinical Research
[Figure: the Bench, Bedside, Practice continuum with a "Gap!" marked. Labels: Building Blocks and Pathways; Molecular Libraries; Bioinformatics; Computational Biology; Nanomedicine; Translational Research Initiatives; Integrated Research Networks; Clinical Research Informatics; NIH Clinical Research Associates; Clinical outcomes; Harmonization; Training; Interdisciplinary Research; Innovator Award; Public-Private Partnerships (IAMI).]
Dan Masys: http://courses.mbl.edu/mi/2009/presentations_fall/masys.ppt
KUMC CTSA Specific Aims
1. Provide a HICTR portal for investigators to access clinical and translational research resources, track usage and outcomes, and provide informatics consultative services.
2. Create a platform, HERON (Healthcare Enterprise Repository for Ontological Narration), to integrate clinical and biomedical data for translational research.
3. Advance medical innovation by linking biological tissues to clinical phenotype and the pharmacokinetic and pharmacodynamic data generated by research cores in phase I and II clinical trials (addressing T1 translational research).
4. Leverage an active, engaged statewide telemedicine and Health Information Exchange (HIE) effort to enable community based translational research (addressing T2 translational research).
Supporting Aim 1: Clinical Research Information Systems
KUMC has purchased Velos eResearch for a Clinical Trial Management System (CTMS) and an Electronic Data Capture (EDC) System.
CTMS functions
• Define Studies, Assign Patients to Studies
• Capture Adverse Events, Reports
• Budgeting, financial planning for studies and invoicing
• Sample management, regulatory tracking
EDC functions
• Design and Capture data on electronic Case Report Forms (CRFs), ideally in real time.
• “Patient portal” for surveys and EDC by subjects
• Export Data for analysis.
CRIS Intro Screen
CRIS: sample e Case Report Form
CRIS: Document Adverse Events
REDCap: Research Electronic Data Capture
https://redcap.kumc.edu
• It uses the same username and password as your KUMC email.
• Check out the training materials under videos.
• Case Report Forms and Surveys
• For consultation and to move a project to production: register your project with us so we can track it and not drop the ball. http://biostatistics.kumc.edu/projectReg.aspx
• After you register your project, a CRIS team member, likely Kahlia Ford, will get in touch with you.
• Check out other institutions using REDCap and possibly borrow from the master library. http://www.project-redcap.org/
REDCap Disclaimer
• For clinical trials, CRIS/Velos may be a better fit
  • Multiple years of experience
  • CRIS team builds for you with biostats review
  • Budget for CRIS team and biostats explicitly
• “Investigator driven” REDCap works if PI takes responsibility for data
  • Scalability: informatics provides consultation and responsibility for technical integrity; not your dictionary.
  • Underwritten by CTSA right now
• Or middle model where informatics can build for you in REDCap. Again, you budget for our team’s time
REDCap Case Report Form Example
REDCap Survey Example
Aim #2: Create a data “fishing” platform
Develop business agreements, policies, data use agreements and oversight.
Implement open source NIH funded (i.e. i2b2) initiatives for accessing data.
Transform data into information using the NLM UMLS Metathesaurus as our vocabulary source.
Link clinical data sources to enhance their research utility.
Develop business agreements, policies, data use agreements and oversight.
• In September 2010 the hospital, clinics and university signed a master data sharing agreement to create the repository.
• Executive Committee: decides organization/systems expansion
• Data Request Oversight Committee: guides implementation and approves/monitors use.
Use Cases:
• After signing a system access agreement, cohort identification queries and view-only access are allowed but logged and audited.
• Requests for de-identified patient data, while not deemed human subjects research, are reviewed.
• Identified data requests require approval by the Institutional Review Board prior to data request review.
• Requests for contact information from the Frontiers Participant Registry have their study request and contact letters reviewed by the Participant and Clinical Interactions Resources Program.
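The tiered use cases above can be read as a small lookup from request type to the reviews it must pass. The sketch below is illustrative only: the enum names and the review labels are hypothetical wording for this example, not identifiers from the HERON software.

```python
from enum import Enum


class RequestType(Enum):
    # Hypothetical labels for the four use cases listed on the slide.
    COHORT_QUERY = "cohort identification query / view-only access"
    DEIDENTIFIED_DATA = "de-identified patient data"
    IDENTIFIED_DATA = "identified patient data"
    PARTICIPANT_CONTACT = "Frontiers Participant Registry contact information"


# Ordered review steps per request type, paraphrased from the slide.
REVIEW_STEPS = {
    RequestType.COHORT_QUERY: [
        "signed system access agreement",
        "usage logged and audited",
    ],
    RequestType.DEIDENTIFIED_DATA: [
        "data request review",
    ],
    RequestType.IDENTIFIED_DATA: [
        "Institutional Review Board approval",
        "data request review",
    ],
    RequestType.PARTICIPANT_CONTACT: [
        "Participant and Clinical Interactions Resources Program review "
        "of study request and contact letters",
    ],
}


def required_reviews(request_type: RequestType) -> list[str]:
    """Return the review steps, in order, for one kind of request."""
    return list(REVIEW_STEPS[request_type])


if __name__ == "__main__":
    for step in required_reviews(RequestType.IDENTIFIED_DATA):
        print(step)
```

Keeping the policy in a single table makes it easy to audit against the signed agreements when a committee changes a rule.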
Current Functionality
• Single sign-on (CAS) integration with HERON portal linked off Frontiers home page (Aim 1)
• Real-time check for current human subjects training (LDAP Chalk)
• System Access Agreements, Data Use Agreements and Review Processes implemented in HERON with web pages for monitoring system use
• Demonstration
  • i2b2 and HERON tools
  • if time, we’ll do this in real time at the end of the talk
Implement NIH funded (i.e. i2b2) initiatives for accessing data.
i2b2: Count Cohorts
i2b2: Patient Count in Lower Left
i2b2: Ask for Patient Sets
i2b2: Analyze Demographics Plugin
i2b2: Demographics Plugin Result
i2b2: View Timeline
i2b2: Timeline Results
Constructing a Research Repository: Ethical and Regulatory Concerns
• Who “owns” the data? Doctor, Clinic/Hospital, Insurer, State, Researcher… perhaps the Patient?
  • Perception/reality is often that the organization that paid for the system owns the data.
  • My opinion: we are custodians of the data; each role has rights and responsibilities.
• Regulatory Sources:
  • Health Insurance Portability and Accountability Act (HIPAA)
  • Human Subjects Research
• Research depends on Trust, which depends on Ethical Behavior and Competence
• Goals: Protect Patient Privacy (preserve Anonymity)
• Growing Topic: Quantifying Re-identification risk.
Re-identification Risk Example
• Will the released columns in combination with publicly available data re-identify individuals?
• What if the released columns were combined with other items which “may be known”?
• Sensitive columns, diagnoses or very unique individuals?
• New measures to quantify re-identification risk.
Reference: Benitez K, Malin B. Evaluating re-identification risks with respect to the HIPAA privacy rule. J Am Med Inform Assoc. 2010 Mar-Apr;17(2):169-77.
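One simple way to make the "released columns in combination" question concrete is to count how many released rows are unique on a chosen set of columns. The sketch below is a plain uniqueness count on made-up rows; it is not the risk measure from the Benitez and Malin paper, and the column names are assumptions for illustration.

```python
from collections import Counter


def reidentification_summary(rows, quasi_identifiers):
    """Summarize how distinguishable rows are on the given columns.

    rows: list of dicts, one per released record.
    quasi_identifiers: column names an outsider might already know.
    """
    keys = [tuple(row[col] for col in quasi_identifiers) for row in rows]
    group_sizes = Counter(keys)
    unique_rows = sum(1 for key in keys if group_sizes[key] == 1)
    return {
        "smallest_group": min(group_sizes.values()),
        "fraction_unique": unique_rows / len(rows),
    }


# Made-up example records (hypothetical column names and values).
released = [
    {"zip3": "661", "birth_year": 1950, "sex": "F"},
    {"zip3": "661", "birth_year": 1950, "sex": "F"},
    {"zip3": "661", "birth_year": 1950, "sex": "M"},
    {"zip3": "660", "birth_year": 1982, "sex": "F"},
]

all_columns = reidentification_summary(released, ["zip3", "birth_year", "sex"])
zip_only = reidentification_summary(released, ["zip3"])

if __name__ == "__main__":
    print(all_columns)
    print(zip_only)
```

Adding columns can only split groups further, so the fraction of unique rows never decreases as more columns are released.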
Constructing a Repository: Understanding Source Systems, Example CPOE
[Figure: CPOE source-system architecture. Components: WizOrder Client; WizOrder Server; Generic Interface Engine (GIE); Laboratory System; Pharmacy System; Mainframe DB2; Rx DB; Lab DB; Temporary Data Queue (TDQ); Print SubSystem (document); Knowledge Base, Files; Orderables, Orderset DB; Drug DB. Other labels: Repackages and Routes; connections marked Internal Format, HL7, and SQL.]
Most Clinical Systems focus on transaction processing for workflow automation
Constructing a Repository: Understanding Differing Data Models used by Systems
• Hierarchical databases (MUMPS), still very common in clinical systems (VA VISTA, Epic, Meditech)
• Relational databases (Oracle, Access), dominant in business and clinical systems (Cerner, McKesson)
• Star Schemas: Data Warehouses
Sources: http://www.cs.pitt.edu/~chang/156/14hier.html, http://www.ibm.com/developerworks/library/x-matters8/index.html
Murphy SN, Weber G, Mendis M, Gainer V, Chueh HC, Churchill S, Kohane I. Serving the enterprise and beyond with informatics for integrating biology and the bedside (i2b2). J Am Med Inform Assoc. 2010 Mar-Apr;17(2):124-30.
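To show what a star schema looks like in practice, here is a heavily reduced sketch using SQLite: one central fact table joined to small dimension tables. The table and column names follow the i2b2 naming (observation_fact, patient_dimension, concept_dimension), but most columns are omitted, the concept paths use forward slashes for readability, and all rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE patient_dimension (patient_num INTEGER PRIMARY KEY, sex_cd TEXT);
CREATE TABLE concept_dimension (concept_path TEXT, concept_cd TEXT, name_char TEXT);
CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT, start_date TEXT);
""")

conn.executemany("INSERT INTO patient_dimension VALUES (?, ?)",
                 [(1, "F"), (2, "F"), (3, "M")])
conn.executemany("INSERT INTO concept_dimension VALUES (?, ?, ?)", [
    ("/Diagnoses/Neoplasms/Breast/", "ICD9:174.9", "Breast cancer"),
    ("/Diagnoses/Endocrine/Diabetes/", "ICD9:250.00", "Diabetes mellitus"),
])
conn.executemany("INSERT INTO observation_fact VALUES (?, ?, ?)", [
    (1, "ICD9:174.9", "2010-01-05"),
    (1, "ICD9:250.00", "2010-02-11"),
    (2, "ICD9:174.9", "2010-03-20"),
    (2, "ICD9:174.9", "2010-04-02"),  # repeat observation, same patient
    (3, "ICD9:250.00", "2010-05-09"),
])


def count_patients(connection, path_prefix):
    """Count distinct patients with any fact under a concept-path prefix."""
    row = connection.execute(
        """SELECT COUNT(DISTINCT f.patient_num)
           FROM observation_fact f
           JOIN concept_dimension c ON c.concept_cd = f.concept_cd
           WHERE c.concept_path LIKE ?""",
        (path_prefix + "%",),
    ).fetchone()
    return row[0]


if __name__ == "__main__":
    print(count_patients(conn, "/Diagnoses/Neoplasms/"))
    print(count_patients(conn, "/Diagnoses/"))
```

Because every observation lands in the same narrow fact table, adding a new data source means adding rows and concepts rather than redesigning tables.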
HERON: Repository Architecture
Extracting, Loading, Transforming Data
• Goal: stable monthly process, minimal downtime
  • Complete rebuild of the repository, not HL7 messaging.
  • Two databases: create new DB while old DB is in use.
  • When the new DB is ready, switch over i2b2 to serve it.
• Initial Files from Clinical Organizations
  • Export KUH Epic Clarity relational database instead of Cache/MUMPS.
  • Monthly file from UKP clinic billing system (GE IDX).
  • Demographics, services, diagnoses, procedures, and Frontiers research participant flag.
• ELT processes largely SQL (some Oracle PL/SQL)
  • Wrapped in Python scripts.
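The "two databases, then switch over" idea can be sketched as follows. This is a toy version under stated assumptions: SQLite files stand in for the Oracle databases, and a small pointer file (a hypothetical mechanism, not how HERON or i2b2 is actually configured) records which database is currently being served.

```python
import os
import sqlite3
import tempfile


def build_release(db_path, rows):
    """Rebuild one database from scratch (a full rebuild, not incremental)."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits when the block succeeds
        conn.execute("DROP TABLE IF EXISTS observation_fact")
        conn.execute(
            "CREATE TABLE observation_fact (patient_num INTEGER, concept_cd TEXT)")
        conn.executemany("INSERT INTO observation_fact VALUES (?, ?)", rows)
    conn.close()


def switch_over(pointer_path, db_path):
    """Point readers at db_path; os.replace swaps the pointer in one step."""
    tmp_path = pointer_path + ".tmp"
    with open(tmp_path, "w") as fh:
        fh.write(db_path)
    os.replace(tmp_path, pointer_path)


def fact_count(pointer_path):
    """Count facts in whichever database is currently being served."""
    with open(pointer_path) as fh:
        db_path = fh.read()
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute("SELECT COUNT(*) FROM observation_fact").fetchone()[0]
    finally:
        conn.close()


workdir = tempfile.mkdtemp()
db_a = os.path.join(workdir, "release_a.db")
db_b = os.path.join(workdir, "release_b.db")
pointer = os.path.join(workdir, "current.txt")

build_release(db_a, [(1, "ICD9:174.9")])
switch_over(pointer, db_a)

# Next month's rebuild happens in the other database while db_a is still served.
build_release(db_b, [(1, "ICD9:174.9"), (2, "ICD9:250.00")])
count_before_switch = fact_count(pointer)
switch_over(pointer, db_b)
count_after_switch = fact_count(pointer)
```

Users keep querying the old release for the whole duration of the rebuild, so downtime is limited to the moment of the switch.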
HERON De-identification Decisions
• HIPAA Safe Harbor De-identification
  • Remove 18 identifiers and date shifting by 365 days back
  • Resulting in non-human subjects research data but treated as a limited data set from a system access perspective.
  • System users and data recipients agree to treat as a limited data set (acknowledging re-identification risk)
• To be addressed:
  • For now, we won’t add free text such as progress notes with text scrubbers (DeID, MITRE Identification Scrubber toolkit)
  • Currently have “obfuscation” turned on: no sets < 10 and sets randomly perturbed ± 3 patients
  • While de-identified, access to timeline functionality provides individualized patient “signatures”
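A minimal sketch of the two mechanics mentioned here, date shifting and count obfuscation. Two assumptions are made loudly: that each patient gets one random backward offset of 1 to 365 days (the slide only says "365 days back"), and that counts are perturbed by up to 3 patients in either direction. Neither function is taken from the HERON or i2b2 code.

```python
import random
from datetime import date, timedelta


def patient_shift_days(rng):
    """ASSUMPTION: one random backward offset per patient, 1 to 365 days."""
    return rng.randint(1, 365)


def shift_date(original, shift_days):
    """Move a date back by the patient's offset."""
    return original - timedelta(days=shift_days)


def obfuscated_count(true_count, rng, minimum=10, jitter=3):
    """Suppress small sets and perturb the rest.

    Returns None when the set is smaller than `minimum`; otherwise the
    count plus a random integer in [-jitter, +jitter] (ASSUMPTION).
    """
    if true_count < minimum:
        return None
    return true_count + rng.randint(-jitter, jitter)


rng = random.Random(2011)
offset = patient_shift_days(rng)
admit = shift_date(date(2011, 3, 1), offset)
discharge = shift_date(date(2011, 3, 15), offset)

# Using the same offset for all of a patient's dates preserves intervals.
stay_length = (discharge - admit).days

small_set = obfuscated_count(7, rng)
large_set = obfuscated_count(500, rng)
```

Applying one offset per patient, rather than per event, is what keeps lengths of stay and time between visits intact after shifting.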
Transform data into information using standard vocabularies and ontologies
Source terminology | Completed/planned | Notes
Demographics: i2b2 | April 2010 | Using i2b2 hierarchy. Restricted search criteria to geographic regions (> 20,000 persons) instead of individual zipcodes
Diagnoses: ICD9 | April 2010 | Using i2b2 hierarchy
Procedures: CPT | June 2010 | UMLS extract scripts developed with UTHSC at Houston
Lab terms: LOINC | November 2010 | Plan to use i2b2 hierarchy
Medication ontologies: NDF-RT | December 2010 | Physiologic effect, mechanism of action, pharmacokinetics, and related diseases.
Nursing Observations | July 2010- | NDNQI pressure ulcers mapped to SNOMED CT to evaluate automated extraction of self reported activity. (Drs. Dunton and Warren.)
Pathology: SNOMED CT | February 2011 | Providing coded pathology results and patient diagnosis is a critical objective for defining cancer study cohorts in Aim 3.
Clinical narrative | 2012 | As hospital restructures clinical narrative documentation to use EPIC’s SmartData (CUI) concepts, will determine appropriate standard.
National Center for Biological Ontology | 2013 | In support of Aim 3 focus on bridging clinical and bioinformatics to advance novel methods.
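The demographics note about regions of more than 20,000 persons mirrors the HIPAA Safe Harbor rule for ZIP codes: keep only the first three digits, and only when that three-digit area holds more than 20,000 people. A small sketch of that rule follows; the population figures are invented for illustration and the function name is hypothetical.

```python
def generalize_zip(zip_code, zip3_population, threshold=20000):
    """Reduce a 5-digit ZIP code to a region that is safe to search on.

    zip3_population: dict mapping 3-digit prefixes to population counts.
    Prefixes at or below the threshold (or unknown) collapse to "000".
    """
    zip3 = zip_code[:3]
    if zip3_population.get(zip3, 0) > threshold:
        return zip3
    return "000"


# Invented population counts, for illustration only.
example_population = {"661": 350000, "677": 12000}

large_region = generalize_zip("66160", example_population)
small_region = generalize_zip("67701", example_population)
unknown_region = generalize_zip("99999", example_population)
```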
Other Key HERON decision
• “Lazy” Load supports alternative views of reality
• Load with the local terminology first. Map concepts to standards secondarily in the concept space.
• Allows multiple ontologies for observations and works around mapping challenges with contributing organizations
Further technical details described at: http://informatics.kumc.edu/work/wiki/HERON
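A toy illustration of the "lazy" load idea: facts keep their local codes, and standard terminologies are layered on later as extra concept paths that point at those same codes. All codes and paths below are hypothetical; the HERON wiki describes the real implementation.

```python
# Facts are loaded once, keyed by the contributing system's local codes.
facts = [
    (1, "KUH:LAB:1234"),
    (2, "UKP:DX:25000"),
    (3, "KUH:DX:250.00"),
]

# Concept space: (path, local code). Local paths are loaded first; standard
# paths are added later and may map many local codes into one hierarchy.
concept_paths = [
    ("/Local/KUH/Labs/1234/", "KUH:LAB:1234"),
    ("/Local/KUH/Diagnoses/250.00/", "KUH:DX:250.00"),
    ("/Local/UKP/Diagnoses/25000/", "UKP:DX:25000"),
    ("/ICD9/250/250.00/", "KUH:DX:250.00"),
    ("/ICD9/250/250.00/", "UKP:DX:25000"),
]


def patients_under(path_prefix):
    """Patients with any fact whose code sits under the given path."""
    codes = {code for path, code in concept_paths if path.startswith(path_prefix)}
    return sorted({patient for patient, code in facts if code in codes})


standard_view = patients_under("/ICD9/250/")
local_view = patients_under("/Local/KUH/")
```

Adding or correcting a mapping only touches the concept space, so the facts never need to be reloaded when a terminology changes.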
Linking Clinical Data: FY2012 Sources Supporting Cancer Center Initiative
HERON Executive Committee approval June 2011 for incorporating:
• University Biospecimen Repository (Aim 3, Cancer Center)
• Hospital Tumor registry (Aim 3, Cancer Center)
• University REDCap and Velos Registries and Clinical Trials systems (Aim 3, Cancer Center)
• Hospital billing ICD9, MS-DRG, Insurance Status
• Social Security Death Master File (Aim 4, Cancer Center)
• Cerner CoPath pathology system (Aim 3, Cancer Center)
Also continue to extract and refine data from Epic EMR
Developing a Rich Description of our Population: Existing and Planned Data Sources for HERON. Existing sources shown in bold underlined text and planned in plain text
An i2b2 query against HERON for currently supported cancer centric data sources
• Any neoplasm ICD9 diagnosis (106,000 patients) and a WBC count (121,000) -> 44,000 distinct patients
• require height (123,000) and weight (154,000) -> 35,000 patients
• require Wong-Baker pain scale (84,000) -> 14,468 patients
• Body Temperature (158,000) -> 14,463 patients
• Surgical Pathology Procedures CPT (85,000) -> 12,446 patients
• Finally, selective serotonin 5-HT3 antagonist antiemetics -> 8,517 patients
With our improved hardware (Fusion-io memory cards), the cohort size is returned in 15 seconds for this 8 group query.
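The shrinking counts above follow from how a grouped query combines criteria: concepts inside one group are OR-ed, and the groups are AND-ed, so each added group can only narrow the cohort. A minimal sketch with made-up patient sets (the concept names are placeholders, not HERON concept codes):

```python
def run_query(groups, patients_by_concept):
    """Intersect groups; within a group, union the concepts' patient sets."""
    result = None
    for group in groups:
        matched = set()
        for concept in group:
            matched |= patients_by_concept.get(concept, set())
        result = matched if result is None else result & matched
    return result if result is not None else set()


# Made-up patient numbers for illustration.
patients_by_concept = {
    "neoplasm_dx": {1, 2, 3, 4},
    "wbc_count": {2, 3, 4, 5},
    "height": {2, 3, 4},
    "weight": {3, 4, 6},
    "ondansetron": {3},
    "granisetron": {4, 7},
}

two_groups = run_query([["neoplasm_dx"], ["wbc_count"]], patients_by_concept)
five_groups = run_query(
    [["neoplasm_dx"], ["wbc_count"], ["height"], ["weight"],
     ["ondansetron", "granisetron"]],  # either antiemetic satisfies the group
    patients_by_concept,
)
```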
CTSA Aim #3: Link biological tissues to clinical phenotype and our research cores’ results
Support Cancer Center, IAMI, and bridge to Lawrence Research
First focus: Incorporate clinical pathology and biological tissue repositories with HERON and CRIS to improve cohort identification, clinical trial accrual, and clinical trial characterization
• Aligned with existing enterprise objectives to improve biological tissue repository information systems
• Clinical trial accrual identified by many as a weak point institutionally
• Target both biological research specimens and routine clinical pathology
Biospecimen Shared Resource Integration
KUH Tumor Registry
• Validated Outcomes and Observations
• Tumors, Nodes, Metastasis (TNM) on complete cases
• Untapped investment: 7 cancer registrars (Tim Metcalf)
• ~65,000 cases, data since 1950s
• North American Association of Central Cancer Registries (NAACCR) file format
• Will build on work at other NCI designated i2b2 users (Group Health Cooperative in Seattle, Kimmel Cancer Center in Philadelphia have shared their code/metadata with us)
• John Keighley providing invaluable expertise
• Later, supplement with additional treatment information not in NAACCR file
Adding Social Security Death Master File
• Have Death status on approximately 90 million people.
• Contains Social Security Number, Name, Date of Birth, Date of Death, Place of Death
• Monthly update file from NTIS; will sync with releases
• Released Friday, September 23, 2011
• Matching on SSN plus DOB
• 177,706 of our 1.8 million people noted as deceased according to Social Security Administration versus 23,850 from hospital systems.
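Matching on SSN plus date of birth amounts to an exact join on the pair, which avoids marking someone deceased when only the SSN agrees. A small sketch with fabricated records; the field names are assumptions for this example, not the Death Master File layout.

```python
def match_deaths(patients, death_records):
    """Return {patient_num: date_of_death} for exact SSN + DOB matches."""
    death_index = {(rec["ssn"], rec["dob"]): rec["dod"] for rec in death_records}
    matches = {}
    for patient in patients:
        key = (patient.get("ssn"), patient.get("dob"))
        if key[0] and key in death_index:
            matches[patient["patient_num"]] = death_index[key]
    return matches


# Fabricated records; these SSNs are not valid issued numbers.
patients = [
    {"patient_num": 1, "ssn": "000-00-0001", "dob": "1931-04-12"},
    {"patient_num": 2, "ssn": "000-00-0002", "dob": "1940-08-30"},  # DOB differs
    {"patient_num": 3, "ssn": None, "dob": "1955-01-01"},           # no SSN on file
]
death_records = [
    {"ssn": "000-00-0001", "dob": "1931-04-12", "dod": "2009-05-02"},
    {"ssn": "000-00-0002", "dob": "1940-08-31", "dod": "2010-11-17"},
]

deceased = match_deaths(patients, death_records)
```

Requiring both fields trades some missed matches (typos in either field) for fewer false death flags.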
Future Functionality: IRB and i2b2 1.6
• Moving beyond counting to line item data review
• In August, Karen Blackwell and Privacy officials agreed to allow timeline access under current system access agreement
• Released Friday, September 23, 2011
i2b2 version 1.5: DataMart Request Form to facilitate our Data Use Agreement
i2b2 version 1.6: Visit enabled queries
i2b2 Modifiers with i2b2 version 1.6 Will have to redo ELT to take advantage
Example: Prostate Cancer and PSA tests
Data Mart Request Form
Murphy SN et al, https://www.i2b2.org/events/slides/i2b2_OpeningTalk_20110628_Murphy.pdf
What do Visits and Modifiers Offer?
• Visits: I want to know the patient had the lab and the medication in the same episode of care.
  • Conceptually, i2b2 has had a table for the visit dimension but the software never exploited the data
• Modifiers: Is it a billing diagnosis or from the problem list? Is it a primary or secondary?
  • How do I represent all parts of a medication order (dose, route, frequency)?
Constrain observations to the same visit
Murphy SN et al, https://www.i2b2.org/events/slides/i2b2_OpeningTalk_20110628_Murphy.pdf
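The difference a same-visit constraint makes can be shown with a few made-up facts. The sketch below is not i2b2 code; it only contrasts "both concepts anywhere in the record" with "both concepts in one encounter", using hypothetical concept codes.

```python
from collections import defaultdict


def patients_with_all(facts, concepts, same_visit=False):
    """Patients having every concept, optionally within a single visit.

    facts: iterable of (patient_num, encounter_num, concept_cd).
    """
    wanted = set(concepts)
    found = defaultdict(set)
    for patient_num, encounter_num, concept_cd in facts:
        if concept_cd in wanted:
            key = (patient_num, encounter_num if same_visit else None)
            found[key].add(concept_cd)
    return {patient for (patient, _), seen in found.items() if seen == wanted}


# Made-up facts: patient 1 has both in one visit, patient 2 in separate visits.
facts = [
    (1, 100, "LAB:PSA"),
    (1, 100, "MED:EXAMPLE"),
    (2, 200, "LAB:PSA"),
    (2, 201, "MED:EXAMPLE"),
    (3, 300, "LAB:PSA"),
]

any_visit = patients_with_all(facts, ["LAB:PSA", "MED:EXAMPLE"])
one_visit = patients_with_all(facts, ["LAB:PSA", "MED:EXAMPLE"], same_visit=True)
```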
i2b2 Modifiers in the User Interface
Murphy SN et al, https://www.i2b2.org/events/slides/i2b2_OpeningTalk_20110628_Murphy.pdf
HERON <-> REDCap Integration
• i2b2: excels at data warehousing, knowledge management, hypothesis exploration
• REDCap: pretty solid tool for storing and collecting research data and it’s very user friendly.
• Goal: if we can integrate the best of both, we will inherit the advancements in each project.
• Use cases:
  • Breast Cancer Registry in REDCap integrated with HERON which holds the biospecimens (REDCap -> HERON)
  • Fulfilling Data Request for Participant Contact Information (HERON -> REDCap)
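For the REDCap -> HERON direction, one plausible shape of the work is turning exported registry records into fact rows keyed by the repository's patient number. The sketch below assumes records arrive as a list of dicts with string values (as in a flat JSON export) and that a record-to-patient mapping already exists; the field names, concept prefixes, and mapping are all hypothetical.

```python
def redcap_to_facts(records, patient_map, field_to_concept):
    """Convert exported records into (patient_num, concept_cd) rows.

    records: list of dicts, one per registry record.
    patient_map: record_id -> repository patient number (ASSUMED to exist).
    field_to_concept: form field name -> concept code prefix.
    Blank answers and unmapped records are skipped.
    """
    facts = []
    for record in records:
        patient_num = patient_map.get(record["record_id"])
        if patient_num is None:
            continue
        for field, prefix in field_to_concept.items():
            value = record.get(field, "")
            if value != "":
                facts.append((patient_num, f"{prefix}:{value}"))
    return facts


# Hypothetical registry export and mappings.
records = [
    {"record_id": "1", "er_status": "1", "her2_status": ""},
    {"record_id": "2", "er_status": "0", "her2_status": "1"},
    {"record_id": "9", "er_status": "1", "her2_status": "1"},  # not in patient_map
]
patient_map = {"1": 1001, "2": 1002}
field_to_concept = {
    "er_status": "REDCAP:BCR:er_status",
    "her2_status": "REDCAP:BCR:her2_status",
}

registry_facts = redcap_to_facts(records, patient_map, field_to_concept)
```

Skipping unmapped records rather than guessing a patient number keeps registry data from attaching to the wrong person.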
Breast Cancer Registry
HERON
Similar to Tumor Registry: BSR Personnel create forms and enter data to improve annotation for fields that are difficult to automatically extract from Epic and other clinical systems.
Providing and Auditing Use of Participant Contact Information
Questions, HERON, REDCap demo