Why a CMR? What to include in a CMR? Architecting ... - UNECE… · 6 Factors for determining CMR...

Date post: 16-Jun-2018
Architecting a Corporate Metadata Repository at the U.S. Bureau of Census

Gail Wright
CMR Program Manager
Technical Director
Oracle Corporation
[email protected]

Agenda

• Why a CMR?
• What to include in a CMR?
• Architecting a CMR
• Leveraging a CMR

Why a Corporate Metadata Repository (CMR)?

Metadata Technology Continuum

[Figure: a continuum running from low integration, low share/reuse, few open standards, and low interoperability to high integration, high share/reuse, many open standards, and high interoperability. Stages labeled on the figure: Buried, Inaccessible Metadata; Defined Application Models; Autonomous Repositories; Integrated Vertical/Inter-Dept Metadata; Tool-based Data Dictionaries; Integrated Global Enterprise Metadata; Integrated Corporate Enterprise Metadata. Markers: EMR, CMR, FedStats.]

BOC Current Business Process
Does not include an Integrated Metadata Business Process

[Figure: the business process phases Design, Collect, Process, and Share for Census 2000, the American Community Survey, Demographic Surveys, the Econ Census, and Econ Surveys. Systems and tools labeled on the figure, by group:
• internally developed systems; customized commercial systems; CASES; variety of programming languages; GIDS; individual tool of choice
• CATI, CAPI, Mail, PAPI, OCS, ICM; CADE, CSAQ, OCR, TDE; PFIRS
• internally developed systems; SAS; DEVSURV; COBOL, FORTRAN, DECForms; StEPS; ECON DW; individual tool of choice
• DADS/AFF; CENSAS; FERRET; Econ DW; CD-ROM; Internet; ISS (future)]

What are the problems with the current Business Process?

• Difficult to:
    • meet customer demands for quick turnaround of surveys, and customized products
    • re-use and share metadata within the BOC
    • maintain consistent standards
    • compile and format metadata needed by dissemination systems
    • share metadata with external agencies, participate in Virtual Statistical Agencies, etc.
    • meet new metadata requirements like FGDC’s CSDGM content standard
    • perform time series or cross dataset comparisons
• Metadata integrity and quality can be compromised

BOC Goal: An Integrated Metadata Process

[Figure: Census and Survey Design, Data Collection, Data Processing, and Data Dissemination all sit on a shared Corporate Metadata Repository. Under each of the four stages, a 1998 Annual Survey entry is copied to a 1999 Annual Survey entry.]

What to include in a Corporate Metadata Repository?

What is Metadata?

• “Data about data”
    • Information about “raw” data that gives it meaning, context or enhances understanding
    • Data about the Content, Quality, Condition, and other characteristics about data
• Every informational asset that’s not data
    • Requirements, Data Models, Business Models, Screen Layouts
    • Data Mappings and transformations
    • Hierarchies, Aggregation rules, Formulas
    • Rules for comparison of data sets and historical meaning
    • Security access controls, operational schedules, code, ...

What is a Repository?

Content                          Data        Data       Data      Data          Data
                                 Dictionary  Directory  Registry  Encyclopedia  Repository
Name, Definition, Format         x           x          x         x             x
Source, Destination, Legacy                  x          x         x             x
Owner, Authority, Standard                              x         x             x
Application, System, Model                                        x             x
Everything else                                                                 x

Factors for determining CMR content

• Strategic to BOC Enterprise
• Opportunity for sharing and reuse of:
    • Metadata
    • Meta-Model
• Generic vs. Application specific

CMR Meta-Models

• Data Element Registry (ISO/IEC 11179 Standard): Data Elements, Value Domains, Valid Values, Data Element Concepts, …
• Data Set Registry (Supports FGDC CSDGM Geospatial Metadata Standard): A Data Set is a collection of Data Elements.
• Product Registry (Supports FGDC CSDGM Geospatial Metadata Standard & Dublin Core): A Data Product may be a file/document, website/URL, or physical object.
• Data Store (OMG CWM Standard): Metadata for the physical data store. (Supports Relational, Multi-Dimensional, and Flat File stores)
• Business Rule Registry
• Workflow Framework
• Security Framework
• Configuration Mgmt Framework
• Classification Schemes (ISO/IEC 11179 Standard): Taxonomies, Keywords
• Survey Registry: Surveys, Survey Instances, Universes, Frames, Sample, Questionnaires, Questions, …
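
As a rough illustration of how the Data Element Registry and Data Set Registry relate, the sketch below models a data element as a concept paired with a value domain, and a data set as a collection of data elements. The class and field names are hypothetical and are not taken from the CMR's actual schema; the SEX codes are borrowed from the PUMS example later in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class ValueDomain:
    """Permissible values for a data element, stored as code -> meaning."""
    name: str
    valid_values: dict

    def is_valid(self, code):
        return code in self.valid_values


@dataclass
class DataElement:
    """A registry entry pairing a Data Element Concept with a Value Domain."""
    name: str
    concept: str  # hypothetical field for the Data Element Concept
    value_domain: ValueDomain


@dataclass
class DataSet:
    """A Data Set is a collection of Data Elements."""
    name: str
    elements: list = field(default_factory=list)

    def element(self, name):
        # Linear lookup is enough for a sketch; a real registry would index.
        for candidate in self.elements:
            if candidate.name == name:
                return candidate
        raise KeyError(name)


sex = DataElement("SEX", "Person Gender",
                  ValueDomain("Sex codes", {0: "male", 1: "female"}))
pums = DataSet("1990 Public Use Microdata Sample (PUMS)", [sex])
```

Keeping the value domain as its own object means several data elements could share one domain, which is one way to support the reuse the slides call for.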

Basic CMR Meta-Model Relationships

[Figure: relationships among Survey, Survey Instance, Questionnaire, Question, Product, Data Set, Data Element, and Data Store.]

Definitions

• Administered Component
    • An object requiring naming, identification, configuration, security, and optionally, registration
    • Has one or more designations (names)
    • Has one or more definitions
• Classified Component
    • An object that may be classified as a part of a classification scheme
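
The two definitions can be read as reusable base types. The following is a minimal sketch under that assumption; the class layout, the `classify` method, and the example identifier are all hypothetical, and the deck does not say whether the CMR combines the two types through inheritance.

```python
class AdministeredComponent:
    """An object with an identifier, one or more names, and one or more definitions."""

    def __init__(self, identifier, designations, definitions, registered=False):
        if not designations:
            raise ValueError("at least one designation (name) is required")
        if not definitions:
            raise ValueError("at least one definition is required")
        self.identifier = identifier
        self.designations = list(designations)
        self.definitions = list(definitions)
        self.registered = registered  # registration is optional


class ClassifiedComponent:
    """An object that may be placed in one or more classification schemes."""

    def __init__(self):
        self.classifications = set()

    def classify(self, scheme, node):
        self.classifications.add((scheme, node))


class DataElement(AdministeredComponent, ClassifiedComponent):
    """Assumption for illustration: a data element is both administered and classified."""

    def __init__(self, identifier, designations, definitions):
        AdministeredComponent.__init__(self, identifier, designations, definitions)
        ClassifiedComponent.__init__(self)


age = DataElement("DE-0001", ["AGE", "Person Age in Years"],
                  ["Age of the person in completed years."])
age.classify("Census Bureau Taxonomy", "Basic Demographic")
```

Enforcing "one or more" in the constructor keeps an entry without a name or definition from ever entering the registry.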

CMR Meta-Model High Level

[Figure: the Basic CMR Meta-Model Relationships entities (Survey, Survey Instance, Questionnaire, Question, Product, Data Set, Data Element, Data Store) shown together with Administered Component and Classified Component.]

Generating a Census Bureau Taxonomy

+ Census Bureau Information
    + Demographic
        + Census
            - 1990 Census
            + 2000 Census
                - Questionnaires
                - Products
                + Datasets
                    - Public Use Microdata Sample
                    - 100% Edited Detail File
                    + Sample Edited Detail File
                        - Data Elements
                        - Related Information
        - Survey
    - Economic
    - Geographic
    + Data Elements
        + Basic Demographic
            - Relationship
            + Sex
                - Alternative Designations
                - Alternative Definitions
                - Data Element Concept
                - Conceptual Domain
                - Value Domain
                - Related Data Elements
                - Related Information
            - Age
            - Race
            - Marital Status
        - Occupation/Employment
        - Housing
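
A taxonomy view like the one above could be produced by walking classification metadata and printing one line per node. The sketch below is only an illustration: the `Node` type and `render` function are hypothetical, and it assumes that "+" marks a node whose children are shown and "-" marks a node whose children are hidden or absent.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    label: str
    children: list = field(default_factory=list)
    expanded: bool = False


def render(node, depth=0):
    """Return the tree as indented lines, descending only into expanded nodes."""
    marker = "+" if node.expanded else "-"
    lines = ["  " * depth + marker + " " + node.label]
    if node.expanded:
        for child in node.children:
            lines.extend(render(child, depth + 1))
    return lines


tree = Node("Census Bureau Information", expanded=True, children=[
    Node("Demographic", expanded=True, children=[
        Node("Census", children=[Node("1990 Census"), Node("2000 Census")]),
        Node("Survey"),
    ]),
    Node("Economic"),
    Node("Geographic"),
])

print("\n".join(render(tree)))
```

Because "Census" is left collapsed here, its two children stay in the model but are not printed.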

Architecting a CMR

CMR Component Based Architecture

[Figure: components labeled Browser User Interface, External Systems, Admin Tools, Browsing Tools, Metadata Interchange Load/Unload, Security Framework, Object Layer, COTS Integrated Products, and Metadata Repository Physical Storage Layer.]

• Flexible, functional, open, standards-based, component-based architecture
• Reuse Components
• Swap Components
• Minimize change impacts

Proposed Technical/Software Architecture
Four Ways an Application Can Use CMR Metadata

Listed from tightly coupled with the CMR to loosely coupled with the CMR:

1. Application written against CMR - uses it directly for metadata access and maintenance.

2. Application uses same CMR core physical model - can replicate metadata from CMR.

3. Application communicates with CMR through an API to exchange metadata.

4. Application communicates with CMR using a standard XML-based metadata interchange.
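
To make the fourth, most loosely coupled option concrete, here is a minimal sketch of an XML round trip for one data element using Python's standard `xml.etree.ElementTree`. The element and attribute names (`dataElement`, `valueDomain`, `validValue`, `code`) are invented for illustration; the deck does not show the CMR's actual interchange format.

```python
import xml.etree.ElementTree as ET


def export_data_element(name, definition, valid_values):
    """Serialize one data element to an XML string (hypothetical layout)."""
    root = ET.Element("dataElement", {"name": name})
    ET.SubElement(root, "definition").text = definition
    domain = ET.SubElement(root, "valueDomain")
    for code, meaning in valid_values.items():
        ET.SubElement(domain, "validValue", {"code": code}).text = meaning
    return ET.tostring(root, encoding="unicode")


def import_data_element(xml_text):
    """Parse the XML produced above back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    return {
        "name": root.get("name"),
        "definition": root.findtext("definition"),
        "valid_values": {v.get("code"): v.text for v in root.iter("validValue")},
    }


xml_text = export_data_element("SEX", "Person Gender",
                               {"0": "male", "1": "female"})
received = import_data_element(xml_text)
```

The receiving application needs only the agreed XML layout, not the repository's physical model or API, which is what makes this option the loosest coupling of the four.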

CMR Tools

[Figure: the Corporate Metadata Repository with its CMR Core Meta-Models, surrounded by Web-enabled Administration Tools, Open Java API, Web-enabled Browsing Tools, Open XML Interchange, and Integrated Portal Web Site Builder.]

CMR Extensibility

[Figure: the CMR Tools diagram (Corporate Metadata Repository, CMR Core Meta-Models, Web-enabled Administration Tools, Open Java API, Web-enabled Browsing Tools, Open XML Interchange, Integrated Portal Web Site Builder) extended with a CMR Extended Meta-Model and CMR Extended Tools, API, Interchange.]

S/W Requirements

• Scalable
• Provides for open API and Interchange
• Implements Standards
    • ISO/IEC 11179
    • FGDC CSDGM
    • Dublin Core
• COTS preferred, if meets requirements
• High productivity development tools
• Self-documenting, easy to maintain app

CMR S/W for Deployment & Development

Software                                      Used for
Oracle8i EE V8.1.6                            CMR Physical Repository
interMedia                                    Structured and Full-text Metadata
OAS V4.0.8.1 (upgrading to iAS)               CMR Web Server
WebDB V2.2 (upgrading to Oracle 9i Portal)    CMR Web Portal
Designer6i                                    CMR Server Modeling. CMR Web Application Generation plus some PL/SQL coding.
JDeveloper V3.1                               CMR Java API and XML application development (BC4Js & JSPs)
Oracle XDK & MS Notepad                       CMR XML generation, parsing, processing, & upload/download from database tables
Rational Rose 2000                            CMR UML Modeling

Designer Generated CMR Tools

[Figure: columns labeled Logical Models, Physical Models, Server Tier Deployment, Middle Tier Deployment, and Client Tier Deployment. Components: Functional Requirements; Use Cases; UML Object Model; Server Model; Web Modules; CMR Repository; TAPI (PL/SQL); View Layer; PL/SQL generating HTML & JS Application Code; OAS Environment w/ PL/SQL Cartridge & HTTP Listeners; Web Browser HTML Application. Connections: Net8, HTTP. Legend: Created/Generated using Oracle Designer; Hand coded; Created/Generated using Rational Rose.]

JDeveloper Generated Java API

[Figure: columns labeled Logical Models, Physical Models, Server Tier Deployment, Middle Tier Deployment, and Client Tier Deployment. Components: Functional Requirements; Use Cases; UML Object Model; Server Model; CMR Repository; CMR View Layer; Java Object Layer (BC4J); CMR Open API; OAS; BOC Java Applet or Application; BOC JavaServer Pages; DER XML Application. Connections: JDBC, HTTP. Annotations: Rational Rose; Designer Generated; JDeveloper Generated; Rational Rose Generated, Oracle Designer Maintained.]

Leveraging a CMR

Metadata for Dissemination

Data

ID  WGT  SEX  AGE  MARITAL
1   5    0    45   2
2   7    1    5    0
3   2    1    90   5
4   2    0    0    0
...
5   7    1    23   1
6   3    0    37   4
7   4    0    14   0
8   2    0    75   2

Metadata

Survey/Census: 1990 Decennial Census
Source: Bureau of the Census
Dataset: 1990 Public Use Microdata Sample (PUMS)
Description: The PUMS dataset has basic demographic information about persons and housing in the U.S. This information comes from the 1990 Decennial Census long form which is randomly sent to 1 in every 7 households. This dataset is for public use and does not compromise the confidentiality of individuals.
Data Elements:
    ID - Record Identifier - A unique id for a record. Each record identifies 1 or more persons having the same demographic characteristics. (See WGT)
    WGT - Person Weight - A weight given to a record to represent the 1 or more persons with the same demographic characteristics. Valid values: 1..9
    SEX - Person Gender - Valid values (0: male, 1: female)
    AGE - Person Age in Years - Valid values (0-90). Persons over 90 years of age are top-coded with an age of 90 for confidentiality reasons.
    MARITAL - Person Marital Status - Valid values (0: not applicable, 1: single, 2: married, 3: separated, 4: divorced, 5: widowed). Universe: Persons over 15 years of age. Those 15 and under are given a value of 0.
For more information: Related Datasets and Publications, Sampling Errors and Techniques, etc.
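
The point of this slide is that the numeric records are unreadable without the metadata. A small sketch of that idea follows: it applies the SEX and MARITAL valid values quoted on the slide to raw rows. The function names and the dictionary layout are hypothetical, and the three sample rows are the first records as read from the slide.

```python
COLUMNS = ["ID", "WGT", "SEX", "AGE", "MARITAL"]

# Valid values copied from the slide's metadata.
VALUE_DOMAINS = {
    "SEX": {0: "male", 1: "female"},
    "MARITAL": {
        0: "not applicable", 1: "single", 2: "married",
        3: "separated", 4: "divorced", 5: "widowed",
    },
}


def decode(row):
    """Turn one raw record into a labeled record using the value domains."""
    record = dict(zip(COLUMNS, row))
    for name, domain in VALUE_DOMAINS.items():
        record[name] = domain[record[name]]
    return record


def persons_represented(rows):
    """Sum the WGT column: each record stands for WGT persons."""
    return sum(row[COLUMNS.index("WGT")] for row in rows)


rows = [(1, 5, 0, 45, 2), (2, 7, 1, 5, 0), (3, 2, 1, 90, 5)]
first = decode(rows[0])
# first == {"ID": 1, "WGT": 5, "SEX": "male", "AGE": 45, "MARITAL": "married"}
```

The same lookup tables could drive validation as well, since a code missing from a domain raises a `KeyError` here.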

CMR Support for American FactFinder

[Figure: metadata exchange between AFF Metadata Providers, the CMR, and AFF using ASCII AFF Files and an XML CMR File. Metadata covered: Data Elements, Data Sets, Data Products.]

AFF Metadata-Driven Architecture

[Figure: CMR/AFF Business & Technical Metadata and AFF Application Code, linked by arrows labeled Produces and Run Time Calls, yielding the AFF Metadata-Driven, Dynamic Application.]

• Add metadata and data for new dataset -> AFF can automatically search and query the new dataset
• Geography Trees, Datasets, Subjects, Report topics, etc. are all generated at runtime, by accessing the metadata
• Business metadata is linked to technical metadata such that user selections are used to generate SQL statements to query the data
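
The last bullet, generating SQL from user selections, can be sketched as a lookup from business names to technical names followed by query assembly. Everything below is an assumption made for illustration: the mapping, the table name `pums_1990`, and `build_query` are hypothetical and do not describe AFF's actual implementation. SQLite stands in for the database so the sketch is runnable.

```python
import sqlite3

# Hypothetical link between business metadata and technical metadata.
TECHNICAL_METADATA = {
    "dataset": {"1990 PUMS": "pums_1990"},
    "element": {
        "Person Age in Years": "age",
        "Person Gender": "sex",
        "Person Weight": "wgt",
    },
}


def build_query(dataset, select, filters):
    """Build a parameterized SELECT from business-level selections.

    Table and column names come only from the metadata lookup, never from
    raw user text; filter values are passed as bound parameters.
    """
    table = TECHNICAL_METADATA["dataset"][dataset]
    columns = [TECHNICAL_METADATA["element"][name] for name in select]
    where = [TECHNICAL_METADATA["element"][name] + " = ?" for name in filters]
    sql = "SELECT " + ", ".join(columns) + " FROM " + table
    if where:
        sql += " WHERE " + " AND ".join(where)
    return sql, list(filters.values())


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pums_1990 "
             "(id INTEGER, wgt INTEGER, sex INTEGER, age INTEGER, marital INTEGER)")
conn.executemany("INSERT INTO pums_1990 VALUES (?, ?, ?, ?, ?)",
                 [(1, 5, 0, 45, 2), (2, 7, 1, 5, 0), (3, 2, 1, 90, 5)])

sql, params = build_query("1990 PUMS",
                          ["Person Age in Years", "Person Weight"],
                          {"Person Gender": 1})
rows = conn.execute(sql, params).fetchall()
```

Under this design, supporting a new dataset means adding entries to the metadata rather than changing application code, which matches the first bullet.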

CMR Support for Econ 2002 Census

[Figure: metadata exchange among Econ Metadata Providers, the CMR, the EMR, AFF, and GIDS. Files labeled on the figure: ASCII AFF File, XML CMR File, Econ ACSD File, FGDC File, XML Survey File. Also labeled: 450 Econ Questionnaires.]

Activating the CMR

[Figure: the CMR Meta-Models diagram (Data Element Registry, Data Set Registry, Product Registry, Data Store, Business Rule Registry, Workflow Framework, Security Framework, Configuration Mgmt Framework, Classification Schemes, Survey Registry) together with the activities Data Quality Inspection, Survey Instrument Generation, Product Generation, Data Set Query Generation, and Taxonomy Tree Generation.]

Metadata: A core enabling component of any Information technology

• Data Warehousing & Decision Support
• Legacy Migration
• Data Query and Search
• Data Integration
• Application/Tool Integration
• Enterprise Information Portal
• Digital Libraries
• e-Business
• ERP
• Knowledge Mgmt & Business Intelligence

Leveraging the CMR Data Element Registry

Government Vision

[Figure: a Data Element Registry holding Global Standardized Data Elements, Agency Standardized Data Elements, and Non-Standardized Data Elements, with an Integration Layer over agency data sources: BOC Demographic, Economic, Geographic Data; BLS Economic Data; USGS Geographic Data; HUD Housing Data; EPA Environmental Data; CDC Health Data; FAA Air Safety Data; NASA Aircraft Data; NCI Health Data; HCFA Health Data.]

DER Integration Technology

[Figure: the DER and Metadata Repository supporting Legacy Migration, DW and Analytics, and Web Deployment. Labels on the figure: Legacy Data; Source Flat Files; External FF Data Sources; OLTP DB; Extract, Transform, Quality Check, Load; Staging DB; Data Warehouse; Data Marts; Multi-Dimensional Cubes; Exports; Information Portals; E-Commerce Apps.]

Questions

