

Date post: 24-Dec-2015
Upload: jeffrey-osborne
The Cancer Biomedical Informatics Grid From Village to City Peter A. Covitz, Ph.D. Director, Core Infrastructure National Cancer Institute Center for Bioinformatics
Transcript

The Cancer Biomedical Informatics Grid
From Village to City

Peter A. Covitz, Ph.D.
Director, Core Infrastructure
National Cancer Institute Center for Bioinformatics


National Cancer Institute 2015 Goal

Relieve suffering and death due to cancer by the year 2015


Origins of caBIG

Need: Enable investigators and research teams to broadly combine and leverage their findings and expertise in order to meet the NCI 2015 Goal.

Strategy: Create a scalable, actively managed organization that will connect members of the NCI-supported cancer enterprise by building a biomedical informatics network.


Scenario from Strategic Plan

A researcher involved in a phase II clinical trial of a new molecularly targeted therapeutic for brain tumors observes that cancers derived from one specific tissue progenitor appear to be strongly affected.

The trial has been generating proteomic and microarray data. The researcher would like to identify potential biochemical and signaling pathways that might be different between this cell type and other potential progenitors in cancer, deduce whether anything similar has been observed in other clinical trials involving agents known to affect these specific pathways, and identify any studies in model organisms involving tissues with similar pathway activity.


From Village to City


caBIG Principles

Open Source
– Publicly-funded development must yield openly distributable products.

Open Development
– Community-driven development aligns needs with development priorities.

Open Access
– Data has value beyond original purpose for collection.
– Scientific method demands verification by peers.
– Obligation to share publicly-funded data products.

Federated
– Local control of deployments.
– No central “Ministry of Information.”
– Scalable.


Community Priorities

[Bar chart: Number of Needs Reported (axis 0 to 35) per category. Legend: Clinical Trial Management Systems, Tissue Banks & Pathology, Integrative Cancer Research.]

Categories:
Clinical Data Management Tools & Databases
Staff Resources
Distributed General Data Sharing & Analysis Tools
Translational Research Tools
Access to Data
Tissue & Pathology Tools
Center Integration & Management
Common Data Elements (CDE) & Architecture
Meta-Project
Vocabulary & Ontology Tools & Databases
Statistical Data Analysis Tools
Visualization & Front-End Tools
Remote/Bandwidth
Proteomics
Microarray & Gene Expression Tools
Meeting
Laboratory Information Management Systems (LIMS)
Licensing Issues
Pathways
High Performance Computing
Integration
Imaging Tools & Databases
Database & Datasets


caBIG Organization Structure

[Organization chart]

caBIG Oversight

General Contractor

Strategic Working Groups

Working Groups:
– Architecture
– Vocabularies & Common Data Elements
– Clinical Trial Mgmt
– Integrative Cancer Research
– Tissue Banks & Pathology Tools

Legend: = Project


Interoperability

in·ter·op·er·a·bil·i·ty
– ability of a system...to use the parts or equipment of another system
Source: Merriam-Webster web site

interoperability
– ability of two or more systems or components to exchange information and to use the information that has been exchanged.
Source: IEEE Standard Computer Dictionary: A Compilation of IEEE Standard Computer Glossaries, IEEE, 1990

[Diagram labels: Syntactic interoperability; Semantic interoperability. Courtesy: Charlie Mead]


caBIG Compatibility Guidelines

[Figure labels: SYNTACTIC (once), SEMANTIC (three times)]


Model-Driven Architecture


MDA Approach

Analyze the problem space and develop the artifacts for each scenario
– Use Cases

Use Unified Modeling Language (UML) to standardize model representations and artifacts.

Design the system by developing artifacts based on the use cases
– Class Diagram – Information Model
– Sequence Diagram – Temporal Behavior

Use meta-model tools to generate the code
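The last step, generating code from a model, can be illustrated with a very small sketch. It is not the caCORE code generator: the `ClassSpec` and `AttributeSpec` types and the `generate_class_source` function are hypothetical names invented here, and the "model" is a plain Python structure standing in for a UML class diagram.

```python
from dataclasses import dataclass, field


@dataclass
class AttributeSpec:
    """One attribute of a modeled class (hypothetical stand-in for a UML attribute)."""
    name: str


@dataclass
class ClassSpec:
    """One class of the information model (hypothetical stand-in for a UML class)."""
    name: str
    attributes: list = field(default_factory=list)


def generate_class_source(spec: ClassSpec) -> str:
    """Render Python source text for a plain class with one field per modeled attribute."""
    params = ", ".join(attr.name for attr in spec.attributes)
    signature = f"self, {params}" if params else "self"
    lines = [f"class {spec.name}:", f"    def __init__({signature}):"]
    if not spec.attributes:
        lines.append("        pass")
    for attr in spec.attributes:
        lines.append(f"        self.{attr.name} = {attr.name}")
    return "\n".join(lines) + "\n"


# Model for the Agent class used later in the talk.
agent_model = ClassSpec("Agent", [AttributeSpec("name"), AttributeSpec("nSCNumber")])
source = generate_class_source(agent_model)

# Load the generated source and instantiate the generated class.
namespace = {}
exec(source, namespace)
agent = namespace["Agent"](name="Taxol", nSCNumber="007")
```

The point of the sketch is only that the model is the single source of truth: changing `agent_model` and regenerating changes the class, with no hand-edited code in between.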


Limitations of MDA

Limited expressivity for semantics

No facility for runtime semantic metadata management


caCORE

MDA plus a whole lot more!


caCORE

Bioinformatics Objects

Enterprise Vocabulary

Common Data Elements

SECURITY


Use Cases

Description

Actors

Basic Course

Alternative Course


Bioinformatics Objects


Common Data Elements

What do all those data classes and attributes actually mean, anyway?

Data descriptors or “semantic metadata” required

Computable, commonly structured, reusable units of metadata are “Common Data Elements” or CDEs.

NCI uses the ISO/IEC 11179 standard for metadata structure and registration


Semantic metadata example: Agent

<Agent>
  <name>Taxol</name>
  <nSCNumber>007</nSCNumber>
</Agent>


Why do you need metadata?

| Class/Attribute | NCI Metadata | CIA Metadata | Example Value |
| Agent | Chemical compound administered to a human being to treat a disease or condition, or prevent the onset of a disease or condition | A sworn intelligence agent; a spy | |
| Agent.nSCNumber | Identifier given to chemical compound by the US Food and Drug Administration (FDA) Nomenclature Standards Committee (NSC) | Identifier given to an intelligence agent by the National Security Council | 007 |
| Agent.name | Common name of chemical compound used as an agent | CIA code name given to intelligence agents | Taxol |
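The comparison above can be made concrete with a short sketch: the same Agent XML is parsed once and then described with two different metadata dictionaries. The dictionaries reuse the definitions from the slide; the `describe` function and the dotted `Agent.name` key format are assumptions made for this illustration, not part of any caBIG tool.

```python
import xml.etree.ElementTree as ET

AGENT_XML = "<Agent><name>Taxol</name><nSCNumber>007</nSCNumber></Agent>"

# Definitions copied from the slide; the dotted keys are an illustrative convention.
NCI_METADATA = {
    "Agent.name": "Common name of chemical compound used as an agent",
    "Agent.nSCNumber": (
        "Identifier given to chemical compound by the US Food and Drug "
        "Administration (FDA) Nomenclature Standards Committee (NSC)"
    ),
}
CIA_METADATA = {
    "Agent.name": "CIA code name given to intelligence agents",
    "Agent.nSCNumber": (
        "Identifier given to an intelligence agent by the National Security Council"
    ),
}


def describe(xml_text, metadata):
    """Pair each attribute value in the XML with its definition from a metadata dictionary."""
    root = ET.fromstring(xml_text)
    described = {}
    for child in root:
        key = f"{root.tag}.{child.tag}"
        described[key] = (child.text, metadata[key])
    return described


as_nci = describe(AGENT_XML, NCI_METADATA)
as_cia = describe(AGENT_XML, CIA_METADATA)
```

Both calls return the same values ("Taxol", "007") but different meanings, which is why the data alone, without semantic metadata, is not enough for interoperability.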


Cancer Data Standards Repository

ISO/IEC 11179 Registry for Common Data Elements – units of semantic metadata

Precise definitions of Classes, Attributes, Data Types, Permissible Values: Strong typing of data objects.

Tools:
– UML Loader: automatically register UML models as metadata components
– CDE Curation: Fine tune metadata and constrain permissible values with data standards
– Form Builder: Create standards-based data collection forms
– CDE Browser: search and export metadata components

Client for Enterprise Vocabulary: metadata constructed from ontology terms and concepts.
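A rough sketch of what "strong typing of data objects" with permissible values could look like in code follows. It loosely mirrors the ISO/IEC 11179 idea that a data element pairs a concept (object class plus property) with a value domain. The class names, the `validate` method, and the example "administration route" element with its two values are all hypothetical; they are not taken from the caDSR.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ValueDomain:
    """Data type plus an optional enumerated set of permissible values."""
    datatype: type
    permissible_values: Optional[frozenset] = None  # None: any value of the data type

    def accepts(self, value) -> bool:
        if not isinstance(value, self.datatype):
            return False
        return self.permissible_values is None or value in self.permissible_values


@dataclass(frozen=True)
class DataElement:
    """A common data element: object class + property + definition + value domain."""
    object_class: str
    property_name: str
    definition: str
    value_domain: ValueDomain

    def validate(self, value) -> bool:
        return self.value_domain.accepts(value)


# Hypothetical element and values, for illustration only.
route = DataElement(
    object_class="Agent",
    property_name="administrationRoute",
    definition="Route by which the agent is administered (illustrative definition)",
    value_domain=ValueDomain(str, frozenset({"oral", "intravenous"})),
)

# Free-text element: any string is permitted.
agent_name = DataElement(
    object_class="Agent",
    property_name="name",
    definition="Common name of chemical compound used as an agent",
    value_domain=ValueDomain(str),
)
```

With this shape, a form or loader can reject `route.validate("by mouth")` while accepting `route.validate("oral")`, which is the kind of constraint the CDE Curation tool is described as adding.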


Enterprise Vocabulary: Description Logic Ontologies

Preferred Name

Synonyms

Definition

Relationships

Concept Code


Tying it all together: The caCORE semantic management framework

[Diagram: Bioinformatics Objects, Common Data Elements, and Enterprise Vocabulary linked through the table below]

| Metadata ID | Ontology Concept Codes |
| 2223333 | C1708 |
| 2223866 | C1708:C41243 |
| 2223869 | C1708:C25393 |
| 2223870 | C1708:C25683 |
| 2223871 | C1708:C42614 |


Computable Interoperability

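My model: Agent (concept code C1708)
– name
– nSCNumber
– FDAIndID
– CTEPName
– IUPACName

Your model: Drug (concept code C1708)
– id
– NDCCode
– approver
– approvalDate
– fdaCode

Shared concept codes: C1708 on both classes; C1708:C41243 on an attribute in each model.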
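One way to read "computable interoperability" is that a program can line up two independently designed models by comparing concept codes rather than names. The sketch below assumes this reading. The slide does not say which attributes carry C1708:C41243, so the assignment to `nSCNumber` and `fdaCode` is purely illustrative, and `shared_concepts` is a hypothetical helper.

```python
# Class-level codes come from the slide; the attribute-level assignment is assumed.
MY_MODEL = {
    "class": "Agent",
    "concept": "C1708",
    "attributes": {
        "name": None,
        "nSCNumber": "C1708:C41243",  # assumed placement of the code
        "FDAIndID": None,
        "CTEPName": None,
        "IUPACName": None,
    },
}
YOUR_MODEL = {
    "class": "Drug",
    "concept": "C1708",
    "attributes": {
        "id": None,
        "NDCCode": None,
        "approver": None,
        "approvalDate": None,
        "fdaCode": "C1708:C41243",  # assumed placement of the code
    },
}


def shared_concepts(model_a, model_b):
    """Return (attr_a, attr_b, code) for attributes annotated with the same concept code."""
    by_code = {}
    for attr, code in model_b["attributes"].items():
        if code is not None:
            by_code.setdefault(code, []).append(attr)
    matches = []
    for attr, code in model_a["attributes"].items():
        for other in by_code.get(code, []):
            matches.append((attr, other, code))
    return matches


same_class_concept = MY_MODEL["concept"] == YOUR_MODEL["concept"]
attribute_matches = shared_concepts(MY_MODEL, YOUR_MODEL)
```

Here `Agent` and `Drug` are recognized as the same concept even though no class or attribute name is shared between the two models.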


caCORE Software Development Kit


caCORE SDK Components

UML Modeling Tool (we use Enterprise Architect)
– Information domain model defines data classes, attributes and relationships

Semantic Connector (included in download)
– Annotates UML model with ontology concepts: bridges the world of databases to that of structured semantics

UML Loader (run by NCICB staff for now)
– Loads model into the caDSR metadata registry
– Model and associated semantics are available as metadata at runtime

Code Generator (included in download)
– UML model used as input into code generator
– Produces object-oriented middleware that instantiates model
– Object-relational mappings tie middleware to databases and other storage/retrieval systems.
– Programming interfaces provide access to system for application developers (Java APIs currently implemented; Web Services in upcoming release)
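The Semantic Connector step, annotating model elements with ontology concepts, might be pictured roughly as a lookup of each model name against preferred names and synonyms in a vocabulary. This is a sketch under that assumption, not the actual tool: `VOCABULARY`, `build_index`, and `annotate` are hypothetical, and the single vocabulary entry (including "Drug" as a synonym) is illustrative content built around the concept code C1708 shown in the slides.

```python
# Illustrative vocabulary: one concept with a preferred name and a synonym.
VOCABULARY = {
    "C1708": {"preferred_name": "Agent", "synonyms": {"Drug"}},
}


def build_index(vocabulary):
    """Map every lower-cased preferred name and synonym to its concept code."""
    index = {}
    for code, concept in vocabulary.items():
        terms = {concept["preferred_name"], *concept["synonyms"]}
        for term in terms:
            index[term.lower()] = code
    return index


def annotate(class_names, vocabulary):
    """Return {class name: concept code or None} for each class in a model."""
    index = build_index(vocabulary)
    return {name: index.get(name.lower()) for name in class_names}


annotations = annotate(["Agent", "Drug", "Gene"], VOCABULARY)
```

Names with no match (here `Gene`) come back as `None`, which in practice would be the cases left for a human curator to resolve.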


caCORE Architecture

[Diagram labels, by layer]
Data: Biomedical Data, Enterprise Vocabulary, Common Data Elements
Middleware (Web Application Server): Data Access Objects; Domain Objects [Gene, Disease, Agent, etc.]; APIs
Interfaces: Java, SOAP, XML
Clients: Java Applications, HTTP Clients, SOAP Clients, Perl Clients


[Diagram: caGrid at the center, connected to Cancer Centers, NCI, other caBIG service providers, and other toolkits]


caGrid Service-Oriented Architecture

OGSA Compliant - Service Oriented Architecture

[Diagram labels]
Stack: Transport, Grid Communication Protocol, Service Description, Service, Workflow
Components: Service Registry, Security, Semantic Service, Resource Management, ID Resolution, Functions, Quality of Service
Technologies: Globus Toolkit, Globus GRAM, OGSA-DAI, GSI, CAS, myProxy, Mobius, caCORE


caBIG Compatible Software and Data Resources

caArray - Cancer microarray data management system
C3D - Clinical Trials data capture application
C3PR - Clinical trial participant registry tool
caWorkbench - Microarray analysis suite
caTIES - Automated free-text pathology data extraction tool
caTISSUE - Biospecimen database and tracking system
RProteomics - MALDI-TOF proteomics analysis tool
Gene Ontology Miner (GOMiner) - Tool for aggregate analysis of gene sets
HapMap - caBIG accessible map of haplotypes in human genome
Promoter Database
UniProt-PIR - Protein sequence and annotation database
Curated Cancer Pathways Data - Data sets generated from NCI 60 cell lines
Human-Mouse Anatomy Ontology
Nutritional Compound Ontology

*Note: Examples of upcoming 2006 Products and Data Sets

Distance Weighted Discrimination - Microarray data analysis integrator
Cancer Molecular Pages Prototype - Cancer gene annotation with web-based visualization
Magellan - Tool for the analysis of heterogeneous data types (e.g., microarray)
Visual and Statistical Data Analyzer (VISDA) - Multivariate statistical visualization tool for the analysis of complex data
FunctionExpress - Tool for integrated analysis and visualization of Microarray data
Quantitative Pathway Analysis in Cancer (QPACA) - Pathway modeling and analysis tool
TrAPSS - Disease gene mutation discovery and analysis tool
Proteomics Laboratory Information Management System Prototype
SEED - Peer-to-Peer genome annotation tool
Pathways Tool Project - Pathway visualization tools
LexGrid - Ontology hosting software


NCI: Andrew von Eschenbach, Anna Barker, Wendy Patterson, OC, DCTD, DCB, DCP, DCEG, DCCPS, CCR

Industry Partners: SAIC, BAH, Oracle, ScenPro, Ekagra, Apelon, Terrapin Systems, Panther Informatics

NCICB: Ken Buetow, Sue Dubman, Leslie Derr, Frank Hartel, George Komatsoulis, Avinash Shanbhag, Denise Warzel, Sherri De Coronado, Dianne Reeves, Gilberto Fragoso, Jill Hadfield


caBIG Participant Community

9Star Research
Albert Einstein
Ardais
Argonne National Laboratory
Burnham Institute
California Institute of Technology-JPL
City of Hope
Clinical Trial Information Service (CTIS)
Cold Spring Harbor
Columbia University-Herbert Irving
Consumer Advocates in Research and Related Activities (CARRA)
Dartmouth-Norris Cotton
Data Works Development
Department of Veterans Affairs
Drexel University
Duke University
EMMES Corporation
First Genetic Trust
Food and Drug Administration
Fox Chase
Fred Hutchinson
GE Global Research Center
Georgetown University-Lombardi
IBM
Indiana University
Internet 2
Jackson Laboratory
Johns Hopkins-Sidney Kimmel
Lawrence Berkeley National Laboratory
Massachusetts Institute of Technology
Mayo Clinic
Memorial Sloan Kettering
Meyer L. Prentis-Karmanos
New York University
Northwestern University-Robert H. Lurie
Ohio State University-Arthur G. James/Richard Solove
Oregon Health and Science University
Roswell Park Cancer Institute
St Jude Children's Research Hospital
Thomas Jefferson University-Kimmel
Translational Genomics Research Institute
Tulane University School of Medicine
University of Alabama at Birmingham
University of Arizona
University of California Irvine-Chao Family
University of California, San Francisco
University of California-Davis
University of Chicago
University of Colorado
University of Hawaii
University of Iowa-Holden
University of Michigan
University of Minnesota
University of Nebraska
University of North Carolina-Lineberger
University of Pennsylvania-Abramson
University of Pittsburgh
University of South Florida-H. Lee Moffitt
University of Southern California-Norris
University of Vermont
University of Wisconsin
Vanderbilt University-Ingram
Velos
Virginia Commonwealth University-Massey
Virginia Tech
Wake Forest University
Washington University-Siteman
Wistar
Yale University

