Integrating Genomic and Clinical Data in Electronic Health Records and
Biomedical Repositories: Challenges, Solutions and Opportunities
American College of Medical Informatics (ACMI)
What is ACMI?
The American College of Medical Informatics is a college of elected fellows from the United States and abroad who have made significant and sustained contributions to the field of medical informatics. Initially incorporated in 1984, the organization later dissolved its separate corporate status to merge with the American Association for Medical Systems and Informatics (AAMSI) and the Symposium on Computer Applications in Medical Care (SCAMC) when the American Medical Informatics Association was formed in 1989. The College now exists as an elected body of fellows within AMIA, with its own bylaws and regulations that guide the organization, its activities, and its relationship with the parent organization.
Integration of Genomic and Clinical Data
The integration of genomic data into the more traditional phenomic databases, such as electronic health records and biomedical data warehouses, offers great potential for the advancement of biomedical research and patient care. However, there are a number of challenges to accomplishing this integration in a seamless manner, including the consistent, standardized representation and coding of the data, coping with the sheer volume of information, and proper indexing of important genomic features to facilitate retrieval. The panelists, all Fellows of the American College of Medical Informatics, each a leader in their own institution and in the field of biomedical informatics, will describe their work on addressing the challenges of genomic-phenomic integration, with working solutions and examples of how such integration can be brought to bear on tasks such as helping clinicians understand their patients’ genetic data, using genetic data to support clinical decision making, and advancing biomedical research. The panel will also discuss implications for national standards on representation and data sharing. The session will include time for audience participants to share the solutions from their own institutions.
Integration of Genomic and Clinical Data
• Potential for biomedical research and patient care
• Consistent standardized representation
• Consistent standardized coding
• Coping with the volume
• Indexing of important genomic features
• Helping clinicians understand patients’ genetic data
• Using genetic data to support clinical decision making
• Advancing biomedical research
• Implications for national standards for representation
• Implications for national standards for data sharing
Presenters
• Shawn N. Murphy, FACMI (representation)
– Massachusetts General Hospital
– Harvard Medical School
– Partners HealthCare
• Henry Lowe, FACMI (linking genome & phenome)
– Stanford University
• Elmer V. Bernstam, FACMI (reuse for research)
– University of Texas at Houston
• Riccardo Bellazzi, FACMI (supporting research)
– Università di Pavia
• Lucila Ohno-Machado, FACMI (iDASH Center)
– University of California at San Diego
• Peter Tarczy-Hornoch, FACMI (decision support)
– University of Washington
Expression of Genomic Variants in a Clinical Research Database
Shawn Murphy MD, Ph.D.
Lori Phillips MS
Brian Wilson
De-identified
Data Warehouse
1) Queries for aggregate patient numbers
00000042185793......
00000042185793......
2) Returns identified patient data
Z731984XZ74902XX......
Real identifiers
Query construction in web tool
Encrypted identifiers
OR
- Start with list of specific patients, usually from (1)
- Authorized use by IRB Protocol
- Returns contact and PCP information, demographics, providers, visits, diagnoses, medications, procedures, laboratories, microbiology, reports (discharge, LMR, operative, radiology, pathology, cardiology, pulmonary, endoscopy), and images into a Microsoft Access database and text files.
- Warehouse of in & outpatient clinical data
- 5.0 million Partners Healthcare patients
- 1.3 billion diagnoses, medications, procedures, laboratories, & physical findings coupled to demographic & visit data
- Authorized use by faculty status
- Clinicians can construct complex queries
- Queries cannot identify individuals, internally can produce identifiers for (2)
Research Patient Data Registry exists at Partners Healthcare to find patient cohorts for clinical research
Query items Person who is using tool
Query construction
Results - broken down by number of distinct patients
HGVS Variant Notation
Variant
Wildtype Sequence
Footprint
Set of patients is selected through RPDR and data is gathered into a data mart
RPDR
Selected patients
Data directly from RPDR
Data from other hospital sources
Data collected specifically for project
Daily Automated Queries search for Patients and add Data
Project Specific
Phenotypic Data
Data is available through a specialized Workbench
Requirements of Genomic Variant Notation
Ability to organize the variants for ease of navigation
Ability to query for the variant in the workbench
Implication is that the identifier (basecode) for the variant does not change over time or is maintainable.
Ability to explore or annotate the variant within the workbench
Implication is that we know enough about the variant so that it can be located in existing external genome browsers, analytical tools, etc.
Challenges of Genomic Variant Notation
• Balancing the capabilities of multiple providers
– Genomic labs may report data differently
• Maintainability
– Define the variant so it may be reliably identified over time
• Balancing the needs of multiple consumers
– Needs may differ for geneticists vs physicians vs research scientists
Proposed Strategy for Clinical Data feeds
Gather SNP data from reference data
Gather SNP data from genomic lab reporting system
Weighing the data provided by the lab source
• Gene location MYH7
• Flanking sequences
– 5’ AGGCGCTAGAGAAGTCCGAGGCTC
– 3’ CCGCAAGGAGCTGGAGGAGAAGAT
• Positional information c.2606
• Nucleotide substitution G>A
• Functional information p.Arg869His
Proposed Strategy for Research Data feeds
1. Store Summarized Genomic Annotation Information within the current fact table of the star schema (EAV table)
2. Store Detailed Genomic Annotation Information within an Object-Oriented Database
3. Store Genomic Datasets (BAM, PED, etc.) within a Secure File System, indexed within the i2b2 Data Mart
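A minimal sketch of item 1, assuming a simplified entity-attribute-value fact table. The table name `variant_fact`, its three columns, the `HGVS:` code prefix, and the sample rows are illustrative assumptions, not the actual i2b2 schema or project data.

```python
import sqlite3

# In-memory database standing in for the data mart (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE variant_fact ("
    "patient_num INTEGER, concept_cd TEXT, tval_char TEXT)"
)

# One row per observation: the variant is stored as a coded concept,
# next to ordinary clinical facts such as a diagnosis code.
rows = [
    (1, "HGVS:NM_005228.3:c.2155G>T", "heterozygous"),  # hypothetical value
    (2, "HGVS:NM_005228.3:c.2155G>T", "homozygous"),    # hypothetical value
    (3, "ICD9:250.00", None),
]
conn.executemany("INSERT INTO variant_fact VALUES (?, ?, ?)", rows)


def count_patients(concept_cd):
    """Return the number of distinct patients carrying a coded concept."""
    cur = conn.execute(
        "SELECT COUNT(DISTINCT patient_num) FROM variant_fact "
        "WHERE concept_cd = ?",
        (concept_cd,),
    )
    return cur.fetchone()[0]


print(count_patients("HGVS:NM_005228.3:c.2155G>T"))  # 2
```

Storing the variant as a stable code is what lets the same aggregate-count query pattern used for diagnoses also serve genomic facts.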
MongoDB – Data Persistence
GenomicFeatures
GridFS - BAM Files
Data Analysis Meta Data
Experiment Meta Data
Interface
I2b2 Web Service API Genomic Report API
Report Engine
Galaxy
- Raw Data Storage
- ‘Canned’ Workflow Reports
I2b2-Galaxy Adaptor
Other Resources
Domain Experts
Report Request Broker
Genomic Data Importer (PED, GFF3 ...)
Genomic Data Exporter (PED, GFF3, BED, WIG ...)
Export Gene Level Results to CRC
I2b2 Hive Domain Power Users
PM-Cell(Authentication)
CRC-Cell(Summary Annotations)
i2b2 Hive Core
R Perl cURL
Flow Diagram
How do we make Invariant Variants…that are palatable for human use in queries?
• RS number
• Gene name + flanking sequences
• HGVS name
RS number
• Uniquely identifies a variant over time ….but….
• Novel variants may not have rs number
– User may not want to submit to dbSNP
Gene name + flanking sequences
• Not guaranteed if gene has several isoforms
– EGFR
HGVS Name
• Uniquely identifies variant within a referenced and versioned accession and details the nucleotide substitution.
NM_005228.3:c.2155G>T
NM_005228.3 : RefSeq accession (with version)
c. : Coding DNA
2155 : Position
G>T : Nucleotide substitution
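The parts of the HGVS name above can be pulled apart with a regular expression. This is a minimal sketch that handles only simple single-nucleotide substitutions on a versioned accession; the class and function names are illustrative, and full HGVS notation covers many more cases than this pattern.

```python
import re
from typing import NamedTuple


class HgvsSubstitution(NamedTuple):
    accession: str   # e.g. "NM_005228"
    version: int     # e.g. 3
    coord_type: str  # "c" (coding DNA) or "g" (genomic)
    position: int    # e.g. 2155
    ref: str         # reference nucleotide
    alt: str         # variant nucleotide


# Simple substitutions only; deletions, insertions, intronic offsets,
# and other HGVS forms are deliberately out of scope for this sketch.
_PATTERN = re.compile(
    r"^(?P<accession>[A-Z]{2}_\d+)\.(?P<version>\d+):"
    r"(?P<coord>[cg])\.(?P<pos>\d+)(?P<ref>[ACGT])>(?P<alt>[ACGT])$"
)


def parse_hgvs_substitution(name: str) -> HgvsSubstitution:
    """Split a simple HGVS substitution name into its components."""
    m = _PATTERN.match(name)
    if m is None:
        raise ValueError(f"not a simple HGVS substitution: {name!r}")
    return HgvsSubstitution(
        m["accession"], int(m["version"]), m["coord"],
        int(m["pos"]), m["ref"], m["alt"],
    )


v = parse_hgvs_substitution("NM_005228.3:c.2155G>T")
print(v.accession, v.version, v.position, v.ref, v.alt)
```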
Is there a common denominator in all of this?
• Yes … all ultimately describe variant location on a chromosome.
• Nucleotide substitution defines the physical manifestation of the variant.
WE PROPOSE:
– HGVS name (n/t subst, positional info)
– Flanking sequences (a way to verify positional info)
AS A WAY TO UNEQUIVOCALLY EQUATE TWO VARIANTS
– ACROSS DOMAINS
– ACROSS VERSIONS
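One way to read this proposal in code, as a sketch under stated assumptions: two records are treated as the same variant when the nucleotide substitution and both flanking sequences agree, even if the accession version differs. The `VariantKey` class, the matching rule, and the second accession version used below are illustrative assumptions, not the workbench's actual logic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VariantKey:
    hgvs_name: str  # e.g. "NM_005228.3:c.2155G>T"
    flank_5: str    # 5' flanking sequence
    flank_3: str    # 3' flanking sequence


def substitution(hgvs_name: str) -> str:
    """Return the substitution part of a simple HGVS name, e.g. 'G>T'."""
    tail = hgvs_name.rsplit(".", 1)[-1]  # "2155G>T"
    return tail.lstrip("0123456789")     # "G>T"


def same_variant(a: VariantKey, b: VariantKey) -> bool:
    """Assumed rule: same substitution and identical flanks on both sides."""
    return (
        substitution(a.hgvs_name) == substitution(b.hgvs_name)
        and a.flank_5.upper() == b.flank_5.upper()
        and a.flank_3.upper() == b.flank_3.upper()
    )


FLANK_5 = "GAATTCAAAAAGATCAAAGTGCTG"
FLANK_3 = "GCTCCGGTGCGTTCGGCACGGTGT"
v1 = VariantKey("NM_005228.3:c.2155G>T", FLANK_5, FLANK_3)
# Hypothetical later accession version reporting the same change.
v2 = VariantKey("NM_005228.4:c.2155G>T", FLANK_5.lower(), FLANK_3.lower())
v3 = VariantKey("NM_005228.3:c.2155G>A", FLANK_5, FLANK_3)

print(same_variant(v1, v2))  # True
print(same_variant(v1, v3))  # False
```

The flanks carry the positional check, so a coordinate shift between accession versions does not break the comparison.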
GenomicMetadata record
GenomicMetadata
  Version: 1.0
  ReferenceGenomeVersion: hg18
SequenceVariant
  HGVSName: NM_005228.3:c.2155G>T
  SystematicName: c.2155G>T
  SystematicNameProtein: p.Gly719Cys
  AaChange: missense
  DnaChange: substitution
SequenceVariantLocation
  GeneName: EGFR
  FlankingSeq_5: GAATTCAAAAAGATCAAAGTGCTG
  FlankingSeq_3: GCTCCGGTGCGTTCGGCACGGTGT
  RegionType: exon
  RegionName: Exon 18
Accessions
  Accession: Name NM_005228, Type mrna (NCBI)
  Accession: Name NP_005219, Type protein (NCBI)
  Accession: Name NT_004487, Type contig (NCBI)
ChromosomeLocation
  Chromosome: chr7
  Region: 7p12
  Orientation: +
Combining equivalent terms
Linking to external services
• Genome Browser
– Requires chromosome location; reference genome
• PolyPhen (predicted functional effects)
– Requires chromosome location; reference genome
– RS number
– Or HGVS name
VISTA Services
• Flankmap (location service)
Converts several formats to a chromosome location on a reference genome
– Gene/flanking sequence
– Full HGVS notation
– dbSNP rs number
• Conservation plots
– Based on location
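The gene/flanking-sequence input to a location service can be illustrated with a toy exact-match search. This is only a sketch: the function name and the short reference string are made up, and a real service would presumably align against a full reference genome rather than do a plain string search.

```python
def locate_by_flanks(reference: str, flank_5: str, flank_3: str) -> list:
    """Return 0-based positions where flank_5, one base, then flank_3 occur.

    The single base between the two flanks is the variant position.
    """
    reference = reference.upper()
    flank_5 = flank_5.upper()
    flank_3 = flank_3.upper()
    hits = []
    start = reference.find(flank_5)
    while start != -1:
        var_pos = start + len(flank_5)
        # The 3' flank must begin right after the variant base.
        if reference.startswith(flank_3, var_pos + 1):
            hits.append(var_pos)
        start = reference.find(flank_5, start + 1)
    return hits


# Toy reference: TTT + ACGT (5' flank) + G (variant base) + CCAA (3' flank) + TT
toy_reference = "TTTACGTGCCAATT"
positions = locate_by_flanks(toy_reference, "ACGT", "CCAA")
print(positions)                    # [7]
print(toy_reference[positions[0]])  # G
```

More than one hit would signal that the flanks are too short to be unique, which is the isoform caveat raised earlier for gene name plus flanking sequences.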
VISTA workbench tools
Embedded VISTA browser
References
• Kimball, R. The Data Warehouse Toolkit. New York: John Wiley, 1997.
• Murphy, S.N., Gainer, V.S., Chueh, H. A Visual Interface Designed for Novice Users to find Research Patient Cohorts in a Large Biomedical Database. AMIA, Fall Symp. 2003: 489-493.
• Murphy, S.N., Weber, G., Mendis, M., Gainer, V.S., Churchill, S., Kohane, I.S. Serving the Enterprise and Beyond with Informatics for Integrating Biology and the Bedside (i2b2). Journal of the American Medical Informatics Association, 2010 March 1; 17(2): 124-130.
• den Dunnen JT, Antonarakis SE: Mutation nomenclature extensions and suggestions to describe complex mutations: A discussion. Hum Mutat 2000, 15:7-12.
• Dalgleish R, et al.: Locus Reference Genomic sequences: an improved basis for describing human DNA variants. Genome Medicine; 2010, 2:24.
• http://www.hgvs.org/mutnomen/recs.html
Integrating Clinical and Genomic Data:
Opportunities, Challenges and a Proposal
Henry Lowe MD
Stanford Center for Clinical Informatics
And The Division of Systems Medicine
Stanford University School of Medicine
• Electronic Health Record Deployment Increasing
• Creation of Clinical Data Warehouses Increasing
• Support for Research Access to Clinical Data
• Optimized for use of Aggregate Data
• Cohort Searching, Data Review & Analysis
• Clinical Data (Including Text) Mining Tools
• Aggregation of Clinical Data across sites
Opportunities – Clinical Data Warehouses
• Linkage of Clinical and Biospecimen Data
• Characterizing Biospecimens using Clinical Data
• Identifying Biospecimen Cohorts
• Linkage of Genomic Data to Clinical Data
• Integration of Genomic Data back into the EHR
Opportunities – Biospecimen Linkage
• Clinical Data is not the Entire Phenotype
• Missing Data (e.g. Occupational History)
• May be Spread across many eHealth Systems
• Clinical Data is not Perfect
• Diagnoses may be coded only in ICD9
• Important Data may be missing
• Clinical Text may be challenging to parse
Challenges – Clinical Data
• Creating validated algorithms to define phenotype from EMRs is complex
• Diagnostic Codes Alone may not be sufficient (eMERGE)
• Extracting phenotypic data from clinical text can be difficult
• Phenotype data use goes beyond genomic studies, e.g. Research Cohort Identification
Challenges – Identifying Phenotype
• Create a Web-based, searchable directory of validated high level phenotype algorithms
• Encourage contributions from multiple sites
• Algorithms would be freely available for use
• Would use a standard set of metadata elements
• Would use a standard description formalism
• Provide APIs to support application/system level access to the phenotype algorithm directory
Proposal – A National Phenotype Catalog
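A rough sketch of what a catalog entry and a search call might look like. The proposal does not specify the metadata elements or the API, so every field name, the sample entries, and the site names here are hypothetical placeholders.

```python
from dataclasses import dataclass, field


@dataclass
class PhenotypeAlgorithm:
    # Hypothetical metadata elements; the proposal does not enumerate them.
    name: str
    contributing_site: str
    validated: bool
    keywords: list = field(default_factory=list)


def search(catalog, term):
    """Return validated entries whose name or keywords contain the term."""
    term = term.lower()
    return [
        entry for entry in catalog
        if entry.validated and (
            term in entry.name.lower()
            or any(term in k.lower() for k in entry.keywords)
        )
    ]


# Made-up entries for illustration only.
catalog = [
    PhenotypeAlgorithm("Type 2 diabetes", "Site A", True, ["ICD9 250", "HbA1c"]),
    PhenotypeAlgorithm("Diabetic nephropathy", "Site B", False, ["albumin"]),
    PhenotypeAlgorithm("Migraine", "Site C", True, ["headache"]),
]

print([entry.name for entry in search(catalog, "diabet")])  # ['Type 2 diabetes']
```

An application-level API would expose the same search over the network; a shared metadata set is what makes entries from different sites comparable.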
Challenges in Leveraging Clinical Data
Elmer Bernstam
Professor
Biomedical Informatics and Internal Medicine
Director, Biomedical Informatics Component
Center for Clinical and Translational Research
The University of Texas Health Science Center at Houston
Main points
• To leverage genomic data, need (matching) clinical data
– Research data
• Expensive and scarce
• Relatively easy to compute
• May not accurately reflect clinical reality
– Routine clinical data
• Plentiful and “cheap” (though may not match)
• Very hard to compute
• Necessary
• Challenges inherent in routine clinical data
Traditional view
CPR
Clinical data (CDW)
Genetic data
Why do we need routine data?
• Clinical research moving abroad
– Glickman SW, McHutchison JG, Peterson ED, et al. Ethical and scientific implications of the globalization of clinical research. N Engl J Med. 2009;360(8):816–823, PMID:19228627.
• If we are to compete…
– Need to make use of routine data
– Only some routine clinical care can be outsourced
Problems
• Little overlap between clinical and research data
– Genetic data on study subjects
– Clinical data on patients who are not study subjects
• Clinical data not like research data
– Measurement error
– Missing data
– Biased data
– …
Attempts to leverage clinical data
• [Quality of care]
• [(Non-representative) Cohort selection]
• Reproducing large RCTs
– Extremely large sample sizes
– Example: Tannen RL, Weiner MG, Xie D. Use of primary care electronic medical record database in drug efficacy research on cardiovascular outcomes: comparison of database and randomised controlled trial findings. BMJ. 2009. 338:b81.
– 8M patients (5.7% of the population of the UK)
• Reproducing prediction rules
– Example: Hripcsak G, Knirsch C, Zhou L, Wilcox A, Melton GB. Using discordance to improve classification in narrative clinical databases: An application to community-acquired pneumonia. Comp Biol Med, 37 (2007) 296-304.
– Often doesn’t work
• Solution 1: eliminate problematic data (10% of sample)
– Bias
• Solution 2: account for the confounds via statistical model
– Requires knowing the answer
Required enabling technologies
• Infrastructure
– Collect, store, protect, analyze, update
• NLP, NLP, NLP
– Structured (billing) data misleading
• (UTH data) 20% endometrial cancer, 50% breast cancer
• Statistics
– Requires unusual degree of collaboration with statistical colleagues
For the present
• Critically important research area
• Careful to maintain enthusiasm without over-promising
– AI Winter(s)
Integrating genomic and clinical data: some challenges from EU and Italian projects
Riccardo Bellazzi
University of Pavia, Italy
Biomedical Informatics Labs ‘Mario Stefanelli’
Cardiology
Oncology
IRCCS Fondazione S. Maugeri
IRCCS Fondazione C. Mondino
Headache
IRCCS Policlinico S. Matteo
The EU Inheritance project
Collaborations BMI labs and Pavia hospitals
Clinical Bioinformatics – the Italbionet / i2b2 Pavia project
DW / clinical research chart
Intelligent query / data mining
Knowledge repositories
Reasoning systems
EMR
Research data-bases
Discharge letters
HIV
Biobanks
Projects
Genetics of arrhythmogenic diseases
Support to oncology research
IRCCS Fondazione S. Maugeri
TRIAD and i2b2
TRIAD: Transatlantic Registry of Inherited Arrhythmogenic Diseases
i2b2
ETL - KETTLE
TRIAD and i2b2
Adding statistical functionalities
BIOINFORMATICS METHODOLOGY AND TECHNOLOGY TO INTEGRATE CLINICAL AND BIOLOGICAL KNOWLEDGE SUPPORTING ONCOLOGY TRANSLATIONAL RESEARCH (ONCO-I2B2)
Inheritance: dilated cardiomyopathies
IRCCS Policlinico S. Matteo
The EU Inheritance project
Projects
Dilated cardiomyopathy
Centre for Inherited Cardiovascular Diseases - IRCCS Policlinico San Matteo - Pavia
“DCM”
Dystrophinopathies
Laminopathies
Desminopathies
Mitochondriopathies
Epicardinopathies
Actinopathies
Zaspopathies
Desmosomopathies
From DCM to…
Clinically oriented genetic investigation
Centre for Inherited Cardiovascular Diseases - IRCCS Policlinico San Matteo - Pavia
ECG: rest, effort, Holter
Pedigree, family screening
Symptoms, duration
Physical evaluation
Non-familial; familial: AD, AR, X-LR, MT
Cardiac, extracardiac, recent onset, long term
Muscle, skin, eyes, kidney, liver, lung
LAB
Imaging: echo, MRI
RV Cath
AVB, PR, WPW, etc.
CPK, leukocytes, enzymes, metab., etc.
LVNC, DE
EMB
Family screening, clinical markers
Diagnostic hypothesis: before genetic testing
Increasing the number of genotyped CMP
One gene ---> one disease
Inheritance architecture
I2b2 environment
Web interface
Data analysis plugin
Text mining and literature search engines
Reasoning module
Wiki-based collaborative system
Annotation tools
KB/Red flags
Data warehouse
Cardioregister
Projects
IRCCS Fondazione C. Mondino
Headache
Populating the datawarehouse
CRC
Research Clinical Data
Ontology Mapped Clinical Data
Domain Ontology
Documents
Legacy Databases
NLP System
ICHD Diagnosis
ICHD Code System
Task 1. Computational methods and tools to perform data mining and knowledge integration
Web-based data analytics
Web-based annotation
Automated Literature search
Mining annotations and literature
Efficient management of MS-data
• Several projects where the same architecture can be applied
• Main adaptation needs:
– Specific domain ontologies
– Representation of genetic information
– Representation of phenotypic information
– Importing data from EHR
• Interesting research directions related to building –omics enabled decision support and knowledge management tools
In summary
Integrating Genomic and Clinical Data for EHR and Biomedical Repositories
Lucila Ohno-Machado, MD, PhD
Division of Biomedical Informatics UCSD
TBI-CRI Bridge Day Panel
03/8/11
EHR and Genomics at
Division of Biomedical Informatics overview
Research and Applications
• Clinical Data Warehouse
– NLP, privacy technology, preference management
• integrating Data for Analysis, Anonymization, and Sharing
• Personalized risk assessment
– How ‘personalized’ is it?
• 550,000 outpatient visits/year
• 180,000 hospital admissions
• 17 million orders
• 2 million patients
Clinical Data Warehouse
UCLA (Epic)
Data matching function: Map D onto data dictionaries
Clinician/Researcher wants data
Return data D
Request about individual
Request for data D
UC Irvine (Eclipsys)
UC Davis (Epic)
UCSF (GE)
Community Partners
UCSD (Epic)
Sharing Data
– Today
• Public repositories (mostly non-clinical)
• Limited data use agreements
– Tomorrow
• Annotated public databases
• Informed consent management system
• Certified trust network
Sharing Computational Resources
– Today
• Computer scientists looking for data, biomedical and behavioral scientists looking for analytics
• Duplication of pre-processing efforts
• Massive storage and high performance computing limited to a few institutions
– Tomorrow
• Processed, de-identified, ‘anonymized’, shared data
• Secure biomedical/behavioral cloud
integrating Data for Analysis, Anonymization and Sharing
Analysis
• Compression
• Query language
• NLP
• Study design (2nd generation seq)
• Pattern recognition (computing with streams, rare events)
• High performance computing
(Courtesy Bafna and Varghese)
Anonymization
Informed Consent Management System
Do I wish to disclose data D to P for Reason R?
Information Exchange Registry
Provider P requests Data D on individual I for Reason R
Does the law or regulation require D to be sent?
Yes No
Yes
No
Individual preferences
Preferences
Inspection
Focus Groups, Surveys
• Identity Management
• Trust Management
Home
Trusted Broker(s)
Patient I
Community
Respecting Privacy and Getting the Job Done
Security Entity
Healthcare Entity
Preference Registry
I can check who or which entity looked (wanted to look) at the data for what reasons
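The request flow on this slide can be sketched as a small decision function with an audit trail. This is a sketch under assumptions: the names `Request` and `decide`, the way preferences are keyed, the default of denying when no preference is recorded, and the sample data are all illustrative, not the actual consent management system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    provider: str    # P
    data: str        # D
    individual: str  # I
    reason: str      # R


def decide(request, legally_required, preferences, audit_log):
    """Return True if data D may be sent, and record the attempt.

    legally_required: set of data types that law or regulation requires
                      to be sent.
    preferences:      dict mapping (individual, data, reason) to True/False.
    audit_log:        list that receives every request and its outcome, so
                      the individual can later inspect who asked and why.
    """
    if request.data in legally_required:
        allowed = True
    else:
        key = (request.individual, request.data, request.reason)
        # Assumption: no recorded preference means "do not disclose".
        allowed = preferences.get(key, False)
    audit_log.append((request, allowed))
    return allowed


log = []
required = {"reportable infection"}  # hypothetical example
prefs = {("patient-1", "genotype", "research"): True}

r1 = Request("provider-A", "genotype", "patient-1", "research")
r2 = Request("provider-B", "genotype", "patient-1", "marketing")
r3 = Request("provider-C", "reportable infection", "patient-1", "public health")

print(decide(r1, required, prefs, log))  # True
print(decide(r2, required, prefs, log))  # False
print(decide(r3, required, prefs, log))  # True
print(len(log))                          # 3
```

Logging denied requests as well as granted ones matches the slide's "looked (wanted to look)" wording.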
Personalized Medicine
If the rule of thumb for building predictive models is 10 cases per variable:
How many individual genotypes are needed?
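The arithmetic behind the question, as a minimal sketch. The 500,000-variable figure is a hypothetical array size chosen for illustration, and the event rate of 5 per 100 is borrowed from the risk statement quoted later in this talk; neither is a number given for this rule of thumb.

```python
def min_cases(n_variables, cases_per_variable=10):
    """Minimum number of cases (events) under the 10-per-variable rule."""
    return n_variables * cases_per_variable


def min_cohort(n_variables, events_per_100, cases_per_variable=10):
    """People needed so the expected number of events satisfies the rule.

    Integer ceiling division avoids floating-point rounding surprises.
    """
    needed = min_cases(n_variables, cases_per_variable) * 100
    return -(-needed // events_per_100)


# Hypothetical genotyping array with 500,000 variables.
print(min_cases(500_000))  # 5000000

# A 20-variable clinical model where 5 of 100 people have the event.
print(min_cohort(20, 5))   # 4000
```

The required sample grows linearly with the number of variables, which is why genotype-scale models quickly outrun the cohorts any single institution can assemble.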
22%
16%
“this program shows the estimated health risks of people with your same age, gender, and risk factor levels”
Your Risk
p=1
x
“this means that 5 of 100 people with this level of risk will have a heart attack or die”
Input space
“people with your same age, gender, and risk factor levels”
People “like you”
Output space
“people with this level of risk”
[Figure: “People like me” and “Patients like you” shown in an input space with axes height and gender, and an output axis risk]
Assessing Quality of Individual Predictions
• Hybrid model construction
– Non-parametric and parametric regression
– Kernel-based models
• Evaluation of calibration
– Graphical tools based on calibration error
– Input-based assessment
• Calibration methods
– Smooth isotonic regression (1:30 Cyril Magnin II)
– Doubly-penalized SVM
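To make "calibration error" concrete, here is a minimal sketch of a binned check that compares mean predicted risk with the observed event rate in each bin. It is a generic textbook approach, not the smooth isotonic regression or doubly-penalized SVM named above; the function names and the toy data are assumptions.

```python
def calibration_table(predicted, observed, n_bins=5):
    """Group predictions into equal-width bins.

    Returns one (mean predicted risk, observed event rate, count) tuple
    per non-empty bin.
    """
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(predicted, observed):
        idx = min(int(p * n_bins), n_bins - 1)  # p == 1.0 goes in last bin
        bins[idx].append((p, y))
    table = []
    for members in bins:
        if members:
            n = len(members)
            mean_p = sum(p for p, _ in members) / n
            rate = sum(y for _, y in members) / n
            table.append((mean_p, rate, n))
    return table


def calibration_error(table):
    """Count-weighted mean absolute gap between predicted and observed."""
    total = sum(n for _, _, n in table)
    return sum(n * abs(mean_p - rate) for mean_p, rate, n in table) / total


# Toy data: two low-risk people without the event, two high-risk with it.
predicted = [0.1, 0.1, 0.9, 0.9]
observed = [0, 0, 1, 1]
table = calibration_table(predicted, observed, n_bins=2)
print(table)
print(round(calibration_error(table), 3))  # 0.1
```

Plotting mean predicted risk against observed rate per bin gives the usual graphical calibration view; a well-calibrated model sits on the diagonal.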
Summary
• We need to aggregate as much information as we can from experiments and clinical data to create reasonable predictive models
• Objective models are being used in a variety of medical domains, but few users know their limitations
• We need better methods to assess the quality of the models
Genome-Phenome Integration @ UCSD
Funding from NLM, NHLBI, NHGRI, NIBIB, NCRR, NIGMS, AHRQ, Fogarty, VAMRF, Komen Foundation, UCSD Medical Center
Integrating Genomic and Clinical Data in EHRs and Biomedical Repositories: Challenges, Solutions and Opportunities
Peter Tarczy-Hornoch MD
Director, Biomedical Informatics Core, ITHS
Director, Research and Data Integration, ITS
Head and Professor, Biomedical and Health Informatics
Adjunct Professor, Computer Science and Engineering
Professor, Neonatology
March 9, 2011
AMIA TBI-19/CRI-01 ACMI Panel
Electronic Medical
Record/Clinical Data
Electronic Case Report Form Data
Biodata(Instruments)
Biospecimens
Researcher
Honest broker
IRB approved protocol
Solutions for generating new genomic knowledge require integrating diverse phenotypic and genomic data
The University of Washington data repository (Amalga) integrates phenotypic data from 30+ interfaces (10/2010)
Scope of Repository
• 3.5M patients, 42M visits, 220M+ lab results, 180M+ diagnoses & procedures over 18 years
• 14 data systems populating Amalga via 30+ real-time or batch interfaces
• 2.7 Terabytes of data
• 4M new messages/day
• Use IRB/HIPAA compliant
Amalga can identify patients with a given phenotype and help investigators augment phenotypic information
• Eligibility criteria (IRB approved study)
Patients whose age >=18 years and are not deceased
AND
Had ICD-9 codes of 648.* OR 250.* OR 648 OR 250
AND
Had lab test results (Albumin >= 30 and <= 400) OR (Albumin/Creatinine Ratio >= 30 and <= 400) within the last 2 years.
AND
Had ANY encounter in the service centers for Internal Medicine OR Diabetes Care Center OR Family Medical Center in the last 2 years
AND
Have not had a diagnosis of 592.* OR 592 OR 585.6 OR V42.0
AND
Have not had lab test (Calcium > 10.5) OR (GFR < 60) OR (Hemoglobin A1C HPLC > 9.5) OR (Hemoglobin A1C Rapid > 9.5).
• Nightly updates to candidate list, automated notification, & custom study input screen
Link demographic, diagnoses, labs, & visit history data
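The eligibility criteria above can be expressed as a filter over patient records. This is a sketch only: the record layout (a dict with `age`, `deceased`, `icd9`, `labs`, `encounters`), the two-year window of 730 days, and the sample patients are assumptions, and the real study ran as queries inside the repository.

```python
from datetime import date, timedelta

INCLUDE_DX = ("648", "250")
EXCLUDE_DX = ("592", "585.6", "V42.0")
INCLUDE_LABS = ("Albumin", "Albumin/Creatinine Ratio")
SERVICES = {"Internal Medicine", "Diabetes Care Center", "Family Medical Center"}
EXCLUDE_LABS = {
    "Calcium": lambda v: v > 10.5,
    "GFR": lambda v: v < 60,
    "Hemoglobin A1C HPLC": lambda v: v > 9.5,
    "Hemoglobin A1C Rapid": lambda v: v > 9.5,
}


def has_code(codes, roots):
    """True if any code equals a root or is a sub-code of it (root.*)."""
    return any(c == r or c.startswith(r + ".") for c in codes for r in roots)


def is_eligible(patient, today):
    """Apply the inclusion criteria, then the exclusion criteria."""
    cutoff = today - timedelta(days=2 * 365)  # "last 2 years", approximated
    if patient["age"] < 18 or patient["deceased"]:
        return False
    if not has_code(patient["icd9"], INCLUDE_DX):
        return False
    if not any(name in INCLUDE_LABS and 30 <= value <= 400 and when >= cutoff
               for name, value, when in patient["labs"]):
        return False
    if not any(service in SERVICES and when >= cutoff
               for service, when in patient["encounters"]):
        return False
    if has_code(patient["icd9"], EXCLUDE_DX):
        return False
    if any(name in EXCLUDE_LABS and EXCLUDE_LABS[name](value)
           for name, value, _ in patient["labs"]):
        return False
    return True


today = date(2011, 3, 9)
# Made-up patients for illustration.
candidate = {
    "age": 54, "deceased": False, "icd9": ["250.00"],
    "labs": [("Albumin/Creatinine Ratio", 120, date(2010, 11, 2))],
    "encounters": [("Diabetes Care Center", date(2010, 11, 2))],
}
excluded = dict(candidate, labs=candidate["labs"] + [("GFR", 45, date(2010, 6, 1))])

print(is_eligible(candidate, today))  # True
print(is_eligible(excluded, today))   # False
```

Rerunning such a filter each night against fresh data is one plausible reading of the nightly candidate-list update.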
Some phenotypes more challenging to capture
Capurro, Tarczy-Hornoch
TBI 2011 (TBI-10)
Semantic alignment in data repositories pulling data from disparate systems is a challenge
- 2 medication lists
- 2 systems
- Pharmacy
- Medical record
- Single dictionary (A)
- 1 medication list
- 1 system
- Medical record
- Single dictionary (B)
- n medication lists
- n systems (members)
- hospitals
- clinics
- pharmacies
- mail-order
- NO dictionary
Ongoing research and research opportunities: ontologies, semantic alignment
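A small sketch of what semantic alignment means for medication lists: local terms from each source system are mapped to a shared concept, and anything without a mapping is set aside for review. The dictionary contents, system names, and the `align` function are made-up illustrations, not any site's actual dictionary.

```python
def align(entries, dictionary):
    """Map (source_system, local_term) pairs onto shared concepts.

    Returns (set of shared concepts, list of unmapped entries).
    """
    concepts, unmapped = set(), []
    for source, term in entries:
        key = (source, term.strip().lower())
        if key in dictionary:
            concepts.add(dictionary[key])
        else:
            unmapped.append((source, term))
    return concepts, unmapped


# Hypothetical mapping from two systems' local terms to shared concepts.
dictionary = {
    ("pharmacy", "asa 81mg tab"): "aspirin",
    ("medical_record", "aspirin 81 mg"): "aspirin",
    ("medical_record", "metformin 500 mg"): "metformin",
}

entries = [
    ("pharmacy", "ASA 81MG TAB"),
    ("medical_record", "Aspirin 81 mg"),
    ("medical_record", "Metformin 500 mg"),
    ("mail_order", "glucophage"),  # source without any dictionary entry
]

concepts, unmapped = align(entries, dictionary)
print(sorted(concepts))  # ['aspirin', 'metformin']
print(unmapped)          # [('mail_order', 'glucophage')]
```

With n systems and no dictionary, every entry lands in the unmapped list, which is the situation the third case on the slide describes.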
EHR computable phenotypes may not be granular enough thus text mining is a key opportunity
CONFIDENTIAL – UNPUBLISHED DATA (Black, Capurro et al)
Systems to bring genomic knowledge to the point of care need to integrate with both genomic and phenotypic data
• “Pharmacogenomics (PGx) is the study of the genetic basis of variability among individuals in response to drugs” (Pharmacogenomics & Personalized Medicine, Cohen N. 2008)
Overby, Tarczy-Hornoch et al BMC Bioinformatics 2010
Motivation
Example: Tamoxifen and time to recurrence
Increased monitoring for poor metabolizers recommended
Note: Given limited evidence, as of 2009, ASCO does NOT recommend testing for CYP2D6
Methods
Genotype scoring system
Raw from Sheffield et al. Clin Bio Rev. 2009
Overby et al IDAMAP 2010
Pharmacogenomic decision support requires reasoning across assertions with different levels of evidence
Prototype system built on Amalga integrates Illumina SNP data and clinical data and basic genomic knowledge
• Potential applications: discovery of associations, validation of associations, clinical alerts/reminders
• Research opportunities: genomics, data modeling, data mining, text mining/NLP, decision support
Data is from simulated patients
* Overby: decision support for pharmacogenomics, * Yetisgen-Yildiz: phenotype extraction
Collaboration is key to realizing the opportunity to use biomedical data to advance genomic research & practice
13 Cores including Biomedical Informatics (and Regulatory/Bioethics)
Academics & Clinical
- Lab Medicine
- Pathology
- Genome Sciences
- Northwest Institute of Genetic Medicine
Faculty (19+39)
- Research, Service
Students (39)
- MS, PhD, Postdoc
Medical Records
Billing Systems
Data Repositories
Biomedical Data & Biospecimens
Nursing
Public Health
Computer Science
Acknowledgements (incomplete)
• Funding: NCRR UL1 RR 025014, NLM T15 LM07442, NIH, NSF, AHRQ, UW Medicine
• Faculty: Nick Anderson, Jim Brinkley, Alon Halevy, Ira Kalet, Kari Stephens, Dan Suciu, Peter Tarczy-Hornoch
• PhD Students: Eithon Cadag, Daniel Capurro, Paul Fearn, Alicia Guidry, Ping Lin, Brent Louie, Peter Mork, Casey Overby, Rupa Patel, Terry Shen
• ITHS BMI Core Staff: Bill Barker, Tony Black, Joshua Franklin, Gene Hart, Greg Hather, Xenia Hertzenberg, Brent Louie, May Lim, Paul Oldenkamp, Roy Pardee, Jim Piper, Jaime Prosser, Justin Prosser, Ron Shaker, Richard Veino
• ITS Staff: Joe Frost, Jim Hoath, Mike Kuffel, Soohee Lee, Dave Rankin, Dan Sullivan, Paul Tittel, Tanya Tobin, and more