FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Post on 19-Mar-2016

FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot. Brand Niemann (US EPA), Chair, Semantic Interoperability Community of Practice (SICoP) Best Practices Committee (BPC), CIO Council. December 28, 2005. http://web-services.gov/ and http://colab.cim3.net/cgi-bin/wiki.pl?SICoP

1

FHA Data Architecture Working Group: SICoP DRM 2.0 Pilot

Brand Niemann (US EPA), Chair, Semantic Interoperability Community of Practice (SICoP) Best Practices Committee (BPC), CIO Council

December 28, 2005

http://web-services.gov/ and http://colab.cim3.net/cgi-bin/wiki.pl?SICoP

http://colab.cim3.net/cgi-bin/wiki.pl?DRMImplementationThroughIterationandTestingPilotProjects

2

Preface

• Conceptual Data Model – a model to guide data architecture and not a model to guide database development.

• But an ontology provides both and the pilot is both a CDM and an executable application based on DRM 2.0!

• So Data Architecture can be implemented in ontology-driven information systems.
– See next slide.

3

Preface

• Ontology-Driven Information Systems:
– Methodology Side – the adoption of a highly interdisciplinary approach:
• Analyze the structure at a high level of generality.
• Formulate a clear and rigorous vocabulary.
– Architectural Side – the central role in the main components of an information system:
• Information resources.
• User interfaces.
• Application programs.

See for example: Nicola Guarino, Formal Ontology and Information Systems, Proceedings of FOIS ’98, Trento, Italy, 6-8 June 1998.

4

Preface

Layer | FHA/DAWG | SICoP DRM 2.0 | Pilot
1 | CDM | Ontology | FHA Health Domains & Data Element Concepts
(1-2) | Wiring diagram | Semantic Relationships | Semantic links & associations
2 | DB Systems | Metadata & Data | Health US 2005

5

Preface

• Health Information Technology CoP’s Health Information Technology Ontology Project (HITOP)* Major Objectives and Examples:
– 1. Ontology-graph assisted search of medical literature: SemanTx Life Sciences Pilot.
– 2. Ontology in major health standards development: Barry Smith – HL7 RIM.
– 3. Ontology in the FHA Data Architecture Work Group: Brand Niemann – DRM 2.0 Pilot.
– 4. Ontologies in bioinformatics: Ken Baclawski – Book and Keynote Presentation.
– 5. Ontologies in operational clinical systems: Mark Musen – Stanford Medical Informatics.
– 6. Ontologies in large-scale medical research systems: Connor Skankey – Visual Knowledge BioCAD.

* Marc Wine, GSA Office of Intergovernmental Solutions, CoP Lead

6

Preface

• Question: Should the FHA DAWG be overly focused on metadata?
– Metadata and data are integrated together in DRM 2.0 and the pilot.
• Question: Should FHA DAWG work with unstructured or semi-structured data or defer this task to partners/agencies?
– All three types of data are integrated together in DRM 2.0 and the pilot.
• Question: Should FHA DAWG also add physical data modeling to methodology?
– The DRM ITIT Pilot shows how both conceptual and physical data modeling are done together with ontologies.
• Question: Should educational material on metadata and data modeling be present in the Data Strategy?
– DRM 2.0 put educational material in the DRM Reference Model and ITIT Wiki Pages, not the Reference Model Document itself.

7

Preface

• Question: Should we align more closely to FEA DRM?
– Aligning with DRM 2.0 adds credibility to the work, and the pilot specifically demonstrates the three components of DRM 2.0.
• Question: How detailed a level of analysis can be performed by the FHA DAWG?
– This depends on the level of detailed data and information that the FHA partners are willing to expose, e.g., the pilot uses summary data that is in the public domain.
• Question: Does the FHA DAWG analyze only (discover) or does it prescribe a solution (recommendation) like semantic harmonization scenarios?
– SICoP and DRM ITIT are concerned with achieving semantic harmonization and interoperability. E.g., the suggestion to include the CHI vocabularies in the pilot should be implemented.

8

Overview

• 1. The New Data Reference Model 2.0
• 2. Health, United States, 2005
• 3. Data Architecture Working Group
• 4. Pilot Project
• Appendix. Other Related Work

9

1. The New Data Reference Model 2.0

• The FEA framework and its five supporting reference models (Performance, Business, Service, Technical and Data) are now used by departments and agencies in developing their budgets and setting strategic goals. With the recent release of the Data Reference Model (DRM), the FEA will be the “common language” for diverse agencies to use while communicating with each other and with state and local governments seeking to collaborate on common solutions and sharing information for improved services.

Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

10

1. The New Data Reference Model 2.0

• The following chart illustrates the potential uses of the newly released DRM Version 2.0:
– The FEA mechanism for identifying what data the Federal government has and how it can be shared in response to a business/mission requirement.
– The frame of reference to facilitate Communities of Interest (which will be aligned with the Lines of Business) toward common ground and common language to facilitate improved information sharing.
– Guidance for implementing repeatable processes for sharing data Government-wide.

Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

11

1. The New Data Reference Model 2.0

Source: Expanding E-Government, Improved Service Delivery for the American People Using Information Technology, December 2005, pages 2-3. http://www.whitehouse.gov/omb/budintegration/expanding_egov_2005.pdf

12

1. The New Data Reference Model 2.0

• FEA Reference Model Taxonomies

• FEA “Common Language”

• DRM 1.0 by committee
– Implementation after development.

• FEA Reference Model Ontology

• FEA Semantic Model

• DRM 2.0 by open, collaborative process
– Implementation through iteration and testing during development.

Paradigm Shifts

13

1. The New Data Reference Model 2.0

• Original FEA Lines of Business (6):
– Data and Statistics:
• Opted out because of FedStats, Federal Committee on Statistical Methodology, etc. (it had its act together for statistical data management)
• Now it’s back with:
– The new Data Reference Model 2.0 because statistical programs generally have the best data and metadata and data management practices.
– The National Infrastructure for Community Statistics Community of Practice (NICS CoP)
– The Federal Health Architecture Data Architecture Working Group because FHA agencies are statistical agencies:
» See for example Health, United States, 2005 from the National Center for Health Statistics!

14

1. The New Data Reference Model 2.0

• Metamodel: Precise definitions of constructs and rules needed for abstraction, generalization, and semantic models.

• Model: Relationships between the data and its metadata.

• Metadata: Data about the data.

• Data: Facts or figures from which conclusions can be inferred.

Relationships and associations

The purpose of this schematic is to show that we need to describe information model relationships and associations in a way that can be accessed and searched.

Source: Professor Andreas Tolk, August 16, 2005
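
The four layers on this slide can be sketched in a few lines of Python. This is only an illustrative sketch, not the pilot's implementation: the class names, the predicate names in `ALLOWED_PREDICATES`, and all values are hypothetical placeholders. The metamodel is reduced to a rule about which relationship types are legal, the model is a searchable list of relationships, and each dataset carries its metadata with it.

```python
from dataclasses import dataclass, field

# Metamodel: the rules every model must obey. These predicate names are
# hypothetical, chosen only for this sketch.
ALLOWED_PREDICATES = {"describes", "derived_from"}


@dataclass
class Dataset:
    """Data (facts or figures) bundled with its metadata (data about the data)."""
    name: str
    values: dict      # placeholder figures, not real statistics
    metadata: dict    # e.g. title, source, notes


@dataclass
class Model:
    """Model: relationships between datasets, stored so they can be searched."""
    relationships: list = field(default_factory=list)

    def relate(self, subject: str, predicate: str, obj: str) -> None:
        # The metamodel decides which constructs are legal.
        if predicate not in ALLOWED_PREDICATES:
            raise ValueError(f"predicate not allowed by metamodel: {predicate}")
        self.relationships.append((subject, predicate, obj))

    def search(self, term: str) -> list:
        """Return every relationship that mentions the term in any position."""
        term = term.lower()
        return [r for r in self.relationships
                if any(term in part.lower() for part in r)]


table = Dataset(
    name="Example table",
    values={"row A": 1, "row B": 2},
    metadata={"title": "Example table", "source": "Example source"},
)
model = Model()
model.relate(table.name, "describes", "Example concept")
print(model.search("concept"))
```

Keeping the relationships as plain tuples is a design choice for brevity; it is what makes them "accessed and searched" with a single list comprehension.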

15

1. The New Data Reference Model 2.0

The point of this graph is that Increasing Metadata (from glossaries to ontologies) is highly correlated with Increasing Search Capability (from discovery to reasoning).

16

1. The New Data Reference Model 2.0

• Five Key Activities Over the Next Year:
– 1. Education and Training in DRM Version 2.0 and use in FEA – DRM-based Information Sharing Pilots (started June 13th).
– 2. Testing of XML Schemas and OWL Ontologies by NIST and the National Center for Ontological Research, respectively, among others (began October 27th).
– 3. Inventory/Repository of Semantic Interoperability Assets and Development of a Common Semantic Model (COSMO) by the new Ontology and Taxonomy Coordinating Work Group (ONTACWG) (started October 5th).
– 4. Continued early implementation of DRM 2.0 concepts and artifacts by industry in “open collaboration with open standards” pilot projects and workshops (started July 19th). E.g., FHA/DAWG.
– 5. Fostering champions of DRM Best Practices to improve (1) agency data architectures within agencies and (2) cross-agency data sharing across agencies in funded projects (started June 13th).

17

1. The New Data Reference Model 2.0

Scale of Activity / Metadata Function | Agency | CoP/LoB | Cross-CoP/LoB
Discovery | e.g., EAAF 2.0 | e.g., FHA/DAWG | e.g., Indicators
Integration
Reasoning

Where is SICoP DRM Implementation Going?
Super Pilot: Address as Many Boxes as Possible!

CoP: Community of Practice
LoB: Line of Business
EAAF: OMB Enterprise Architecture Assessment Framework 2.0
FHA/DAWG: Federal Health Architecture – Data Architecture Working Group

? ? Yes

18

1. The New Data Reference Model 2.0

• December 5-7, 2005, Knowledge Management Collaboration & Knowledge Sharing Conference, Orlando, Florida:
– Using CoPs To Simplify Processes and Unify Work Across Agencies: Cross-Industry Applications:
• Semantically Enabled Content (Wiki Purple Numbers, Ontology Modeling Before Content is Created, e.g., SiberLogic, Repurposed Content, etc.)
• December 13, 2005, Invited Presentation to the Federal Metadata Management Consortium (FMMC):
– SICoP and DRM Implementation Through Iteration and Testing Work: Making It Real:
• Semantic Knowledge Modeling and a Knowledge Reference Model for Implementing the Semantic Web in the Federal Government.

19

2. Health, United States, 2005

• 156 tables in Excel plus 37 tables in Excel for figures
• Metadata (multiple levels and types)
– For tables
– Sources of data
– Data stories
• Definitions - 194
• Repurpose this excellent content and model and map it to the DRM 2.0.

20

2. Health, United States, 2005

http://www.cdc.gov/nchs/hus.htm

21

3. Data Architecture Working Group

Source: FHA Data Strategy, DRAFT V1.0, December 28, 2005.

CDM: Conceptual Data Model.

22

3. Data Architecture Working Group

• The scope of the Data Architecture Working Group is to help partner agencies to ensure that the FHA and its partners have a comprehensive and accurate view of the data needs of the FHA and to collect, store, and access the metadata in a consistent way. This charter extends to all Federal Departments whose mission is to provide and/or support the delivery of health care services that have been recommended and accepted. The Data Architecture Work Group also focuses on health metadata collection, analysis, and planning activities that are supported by the FHA Partner Council. The DAWG, as it pursues its data architecture objectives, will coordinate these activities with the other established workgroups of the FHA.

Source: FHA Data Architecture Working Group Initial Kickoff Meeting, December 13, 2005.

23

4. Pilot Project

• DRM 2.0:
– Description (slides 24 and 27):
• Metadata (Title, Data, Notes, and Sources)
• Data Story
• Definitions and Methods
– Context:
• Taxonomy and Search (slides 25-26)
– Sharing:
• Separation of Presentation and Data (slides 28-29)

24

4. Pilot Project

See http://web-services.gov and Dynamic Knowledge Repositories

This Data Architecture Provides the Three S’s: Structure, Searchability, and Semantics.

25

4. Pilot Project

Query of HUS 2005 Taxonomy Nodes

Federated Search of All FHA Taxonomy Nodes

See next slide for explanation.

26

4. Pilot Project

• Query of HUS 2005 Taxonomy Nodes:
– This is the Expert Search Form interface in the Web browser. The (1) left pane shows the hierarchical table-of-contents structure, where the document(s) and their subsections are selected for search. The (2) right pane has the boxes for the search query terms (“IDC codes”), the desired number of context words around the highlighted search terms (none), the search execution button, and the query syntax explanation.
• Federated Search of All FHA Taxonomy Nodes:
– This is the same as the query above, except that a different set of boxes is checked in the (1) left pane (the entire FHA Node), and a different query (“data architecture”) and number of context words around the highlighted search terms (five) are used in the (2) right pane.
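
The two searches described on this slide differ only in which taxonomy nodes are selected, the query phrase, and the number of context words shown around each hit. A minimal Python sketch of that behavior follows. It is not the pilot's search engine: the node paths, the node text, and the `search` function are hypothetical placeholders used only to show how scoping and context windows interact.

```python
import re

# Hypothetical taxonomy: node path -> document text (placeholder text only).
NODES = {
    "FHA/HUS 2005/Definitions": "Placeholder text about health statistics tables.",
    "FHA/DAWG/Data Strategy": "Placeholder text: the data architecture guides partner agencies.",
}


def search(nodes, selected_prefix, query, context_words=0):
    """Find a phrase in the selected part of the taxonomy.

    Returns (node path, snippet) pairs. Each snippet holds the matched phrase
    plus up to `context_words` words on either side of it.
    """
    phrase = query.lower().split()
    hits = []
    for path, text in nodes.items():
        # The "left pane": only nodes under the selected prefix are searched.
        if not path.startswith(selected_prefix):
            continue
        words = text.split()
        # Compare on lowercased words with punctuation stripped.
        normalized = [re.sub(r"\W+", "", w).lower() for w in words]
        for i in range(len(words) - len(phrase) + 1):
            if normalized[i:i + len(phrase)] == phrase:
                start = max(0, i - context_words)
                end = min(len(words), i + len(phrase) + context_words)
                hits.append((path, " ".join(words[start:end])))
    return hits


# Federated search over the whole (hypothetical) FHA node, one context word.
print(search(NODES, "FHA", "data architecture", context_words=1))
```

Selecting a narrower prefix such as `"FHA/HUS 2005"` restricts the same query to that subtree, which mirrors checking a different set of boxes in the left pane.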

27

4. Pilot Project

Note: Can Highlight Table and Copy and Paste to Spreadsheet Because of XML Markup.

Metamodel

Model

Metadata

Data

Data Story

Recall Slide 8

28

4. Pilot Project

Data & Metadata (see next slide)

http://web-services.gov/statabs2003no1.htm

Separation of the Data Presentation from the Data & Metadata.

Data Presentation/Visualization

29

4. Pilot Project

Data & Metadata in XML

http://web-services.gov/statabs2003no1.htm

The Data & Metadata Travel Together in XML Format!
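
As a rough illustration of data and metadata traveling together while presentation stays separate, the Python sketch below parses one XML document and renders it two ways. The element names (`table`, `metadata`, `row`) and all values are hypothetical placeholders; the pilot's actual XML markup is not shown in the slides.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup with placeholder values; not the pilot's real schema.
XML_DOC = """
<table>
  <metadata>
    <title>Example table</title>
    <source>Example source</source>
  </metadata>
  <data>
    <row label="A" value="1"/>
    <row label="B" value="2"/>
  </data>
</table>
"""


def load(xml_text):
    """Parse one document into (metadata dict, list of (label, value) rows)."""
    root = ET.fromstring(xml_text.strip())
    metadata = {child.tag: child.text for child in root.find("metadata")}
    rows = [(r.get("label"), float(r.get("value"))) for r in root.find("data")]
    return metadata, rows


def as_text(metadata, rows):
    """One presentation: a titled, tab-separated listing."""
    lines = [metadata["title"], f"Source: {metadata['source']}"]
    lines += [f"{label}\t{value:g}" for label, value in rows]
    return "\n".join(lines)


def as_csv(metadata, rows):
    """Another presentation of the same data: CSV for a spreadsheet."""
    lines = ["label,value"]
    lines += [f"{label},{value:g}" for label, value in rows]
    return "\n".join(lines)


metadata, rows = load(XML_DOC)
print(as_text(metadata, rows))
print(as_csv(metadata, rows))
```

Both renderers read the same parsed structure, so adding a new presentation does not require touching the XML that carries the data and metadata.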

30

4. Pilot Project

• Federal Health Architecture Data Architecture Ontology Metamodel:
– Use Lines of Business/Business Reference Model to Define the “Upper Ontology”:
• See slide 31.
– Use Data Elements to Define the Domain-Specific Ontology:
• See slide 32.

31

4. Pilot Project

FHA Health Domains (FEA Health LoB Sub-Functions) | Definition | Instances
1. Access to Care | Focuses on the access to appropriate care | Access to Care
2. Population Health and Consumer Safety | Assesses health indicators and consumer products as a means to protect and promote the health of the general population | Healthy People Chartbook Indicators; Supporting disease registries, e.g., cancer, hepatitis C, SCI, and immunology
3. Health Care Administration | Assures that federal health care resources are expended effectively to ensure quality, safety, and efficiency | Health Care Expenditures and Payors
4. Health Care Delivery Services | Provides and supports the delivery of health care to its beneficiaries | Health Care Utilization and Health Care Resources
5. Health Care Research and Practitioner Education | Fosters advancements in health discovery and knowledge | HUS, 2005, Special Feature: Adults 55–64 Years of Age

32

4. Pilot Project

Concept | Definitions and Methods | Instance
Population | Population | Data Table for Figure 1
Age | Age | Data Table for Figure 2
Race | Race | Data Table for Figure 3
Poverty | Poverty level | Data Table for Figure 4
Income | Family income | Data Table for Figure 5
Health insurance coverage | Health insurance coverage | Data Table for Figure 6

SEE ONLINE VERSION
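
One way to picture the two ontology layers from slides 31 and 32 is as a small set of subject-predicate-object triples. The Python sketch below is only an illustration under stated assumptions: the domain and concept names come from the slides, but the predicate names (`isA`, `definedBy`, `hasInstance`), the class labels, and the helper functions are hypothetical and do not represent the pilot's actual OWL ontology.

```python
# Upper ontology: FHA health domains (names taken from slide 31).
HEALTH_DOMAINS = [
    "Access to Care",
    "Population Health and Consumer Safety",
    "Health Care Administration",
    "Health Care Delivery Services",
    "Health Care Research and Practitioner Education",
]

# Domain-specific ontology: (concept, definitions-and-methods entry, instance)
# rows taken from slide 32.
CONCEPTS = [
    ("Population", "Population", "Data Table for Figure 1"),
    ("Age", "Age", "Data Table for Figure 2"),
    ("Race", "Race", "Data Table for Figure 3"),
    ("Poverty", "Poverty level", "Data Table for Figure 4"),
    ("Income", "Family income", "Data Table for Figure 5"),
    ("Health insurance coverage", "Health insurance coverage",
     "Data Table for Figure 6"),
]


def build_triples(domains, concepts):
    """Flatten both layers into (subject, predicate, object) triples.

    The predicate and class names are hypothetical labels for this sketch.
    """
    triples = [(d, "isA", "HealthDomain") for d in domains]
    for concept, definition, instance in concepts:
        triples.append((concept, "isA", "DataElementConcept"))
        triples.append((concept, "definedBy", definition))
        triples.append((concept, "hasInstance", instance))
    return triples


def objects(triples, subject, predicate):
    """Return every object linked to the subject by the predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


triples = build_triples(HEALTH_DOMAINS, CONCEPTS)
print(objects(triples, "Poverty", "definedBy"))
print(objects(triples, "Income", "hasInstance"))
```

The slides do not state which health domain each data element concept belongs to, so the sketch deliberately leaves the two layers unlinked.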

33

Appendix. Other Related Work

• Building an Ontology of the National Health Information Network (NHIN): Status Report:
– http://web-services.gov/nhinrfiontology04052005.ppt
– http://ontolog.cim3.net/cgi-bin/wiki.pl?NhinRfi
– http://ontolog.cim3.net/cgi-bin/wiki.pl?HealthOntologyMapping
• Collaborative Expedition Workshops (examples):
– December 9, 2004, Standard Vocabularies in Health Care, Kathy Lesh, Kevric.
– July 19, 2005, Building a Hospital Incident Reporting Ontology (HIRO) in the Web Ontology Language (OWL) Using the JCAHO Patient Safety Event Taxonomy (PSET), Liju Fan, Kevric, et al.
– December 6, 2005, Introduction to the Semantic Web for Bioinformatics, Ken Baclawski, Northeastern University.
– December 6, 2005, Boston Children's Hospital "smart search" and Semantic UMLS Ontology-based Professional Language Processing PubMed Search, Michael Belanger, President, SemanTx Life Sciences.
• See http://colab.cim3.net/cgi-bin/wiki.pl?ExpeditionWorkshop

34

Appendix. Other Related Work

• Health Information Technology Ontology Project (HITOP):
– New CoP Led by Marc Wine, GSA Office of Intergovernmental Solutions:
• Develop a roadmap on the state-of-the-art use of ontology tools to achieve semantic interoperability for high priority health IT applications involving clinical decision support systems (DSS) and electronic health records (EHRs).
– See http://colab.cim3.net/cgi-bin/wiki.pl?HealthInformationTechnologyCommunityofPractice
– Fourth Semantic Interoperability for E-Government Conference, February 9-10, 2006, Work Group Reports:
• Featured Presentation: Barry Smith, HL7 RIM
– See http://colab.cim3.net/cgi-bin/wiki.pl?FourthSemanticInteroperabilityforEGovernmentConference_2006_2_0910