
The Challenge of Environmental Data Interoperability on the Global Information Grid (GIG)

Briefing by: Virginia T. Dobey
SAIC/SETA Support to DMSO
Environmental Representation Domain Lead
(703) 824-3411 or (703) 963-851
[email protected]

DoD M&S Master Plan Task:Environmental Representations

Provide consistent, comprehensive environmental representations that include the natural environment, as well as representations of anthropogenic impacts, flora, and fauna, to DoD M&S users before FY 2014 when and where needed.

• Provide, before FY 2008, environmental data sets, algorithms, models, tools, and documentation to environmental resource repositories.

• Establish, before FY 2009, a capability to provide authoritative and dynamic representations of the natural environment.

• Establish and publish, before FY 2010, authoritative data sources, data dictionaries, data structure, attribution scheme, symbology, and metadata for each natural environment domain; and provide a common interchange mechanism for both static and dynamic environmental representations.

• Provide, before FY 2012, tools to ensure that natural environmental representations dynamically interact with other representations.

Explanation

Technologies provide the means to represent environmental data (terrain, ocean, air, and space) and promote the unambiguous, lossless, and non-proprietary interchange of that data.

[Figure: interoperability Levels I–V]

Impact of platforms, weapons, sensors, and their actions on space, atmosphere, terrain, and ocean conditions

Space conditions, atmospheric conditions, terrain conditions, ocean conditions

Effects of space, atmosphere, terrain, and ocean conditions on platforms, weapons, and sensors

In M&S, a complete and accurate environmental representation must include not only the environmental conditions but also their effects on system C&P, as well as feedback of system activity on the environment. This, in turn, requires environmental data that can be FUSED with other data sources.

The Emerging GIG Data Environment (Task, Post, Process and Use - TPPU)

[Diagram: a ubiquitous global network connecting metadata catalogs, metadata registries, a shared data space, enterprise & community web sites, application services (e.g., Web), and security services (e.g., PKI, SAML)]

Developer: posts to and uses metadata registries to structure data and document formats for reuse and interoperability

Producer: describes content using metadata; posts metadata in catalogs and data in shared space

Consumer: searches metadata catalogs to find data; analyzes metadata to determine context of data found; pulls selected data based on understanding of metadata

• Data standards posted in metadata registries
• Producer tags and posts data
• Location of data posted in metadata catalogs
• Actual data posted to shared data spaces
• Consumer can find and pull data based on metadata tags
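The post/find/pull flow above can be sketched as a toy in-memory catalog and shared space. All class, method, and field names here are illustrative assumptions, not actual GIG service interfaces.

```python
# Toy sketch of the TPPU flow: a producer tags and posts data,
# a consumer discovers it via metadata and pulls it.

class SharedSpace:
    """Stands in for the GIG metadata catalog plus shared data space."""
    def __init__(self):
        self.catalog = {}      # metadata records, keyed by data ID
        self.data_space = {}   # actual data, keyed by data ID

    def post(self, data_id, metadata, data):
        """Producer: describe content with metadata, post both."""
        self.catalog[data_id] = metadata
        self.data_space[data_id] = data

    def find(self, **tags):
        """Consumer: search the metadata catalog by tag values."""
        return [data_id for data_id, md in self.catalog.items()
                if all(md.get(k) == v for k, v in tags.items())]

    def pull(self, data_id):
        """Consumer: pull selected data based on the metadata found."""
        return self.data_space[data_id]

space = SharedSpace()
space.post("obs-001",
           {"domain": "ocean", "format": "gridded"},
           [[3.1, 3.4], [3.0, 3.2]])   # e.g. a small temperature grid

hits = space.find(domain="ocean")
print(hits)                 # ['obs-001']
print(space.pull(hits[0]))  # [[3.1, 3.4], [3.0, 3.2]]
```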

What are Warfighter Issues? Shifting Paradigms

• The adoption of a Net-Centric Data Enterprise

– It’s not just a producer / user world anymore… (now EVERYONE’s a producer!)

– Consumers want access to data / information / knowledge immediately

– Consumers want to input how the data is manipulated/filtered

• Moving from a …Collector / Product focus: Task, Process, Exploit and Disseminate

• To a ... Analyst / Data focus: Task, Post, Process and Use (share)

From:
• Reliance on the "Factory"
• Resource-intensive data download
• One (producer) to many (consumers)
• Bandwidth utilization / availability not a consideration

To:
• A "many-to-many" topology
• Smart "data ordering" agents
• Sharing of information
• Immediate access to Through-the-Sensor data
• Bandwidth critical to warfighters

GIG: Increasing the Interoperability Challenge

• Everyone is a potential producer

• Multiple legacy environmental data sources and user systems exist
– Significant investment in existing production and user hardware and software
– Data in multiple (often system-specific) formats need updating

• Few data resources are reliably compatible, even those produced by the Government (example: the Navy's Master Library (OAML) uses product-specific formats)

• "Power to the Edge" concept empowers the user to identify other sources of required data
– No requirement for common data syntax/semantics
– Increases the challenge of data fusion
– Places responsibility for VV&A squarely on the shoulders of the user

GIG: Assumptions in Assessing Environmental Data Interoperability

• Traditional data producers will continue to provide data in producer-specific and product-specific formats following existing production guidelines, since those products and formats meet the general needs of most customers (users). Formats will continue to leverage producer standards such as the Joint METOC Conceptual Data Model and the Feature and Attribute Coding Catalog. Tailoring data to user requirements will remain a user responsibility.

• Users will need a data mediation capability that can access not only these traditional data sources but also non-traditional and often unknown data sources, such as commercial products (sometimes having proprietary formats) and streaming data from in-situ sensors (anticipated development using future technology), that can be identified and obtained over the GIG.

Barriers to Data Interoperability

• Data sources, models, and operational systems developed independently of each other

• Simulations not traditionally designed to interface with operational systems (and sometimes with each other!)

• Tailored (both in format and in content) datasets that are optimized for a specific system support only specific uses

Result: syntactically and semantically different forms of data representation are in use

Developing Interoperable Data

"A data model is an abstract, self-contained, logical definition of the objects, operators, and so forth, that together constitute the abstract machine with which users interact. The objects allow us to model the structure of data… An implementation of a given data model is a physical realization on a real machine of the components of the abstract machine that together constitute that model… the familiar distinction between logical and physical…" [emphasis in the original] C.J. Date [1]

"Logical Data Model: A model of data that represents the inherent structure of that data and is independent of the individual applications of the data and also of the software or hardware mechanisms which are employed in representing and using the data." DoD 8320.1-M [2]

"Normalization leads to an exact definition of entities and data attributes, with a clear identification of homonyms (the same name used to represent different data) and synonyms (different names used to represent the same data). It promotes a clearer understanding and knowledge of what each data entity and data attribute means." C. Finkelstein [3]

[1] Colleague of E.F. Codd, originator and developer of relational database theory
[2] DoD authority on information engineering
[3] "Originator and main architect of the Information Engineering methodology"
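Finkelstein's point about synonyms can be shown in miniature: normalization maps producer-specific attribute names onto canonical data elements, so that different names for the same concept become visibly identical. The attribute names and the mapping table below are invented examples, not any producer's actual vocabulary.

```python
# Canonical name registry: maps producer-specific attribute names
# (two hypothetical producers) to shared atomic data elements.
CANONICAL = {
    "sst": "sea_surface_temperature",        # producer A's name
    "surf_temp": "sea_surface_temperature",  # producer B's synonym
    "wspd": "wind_speed",
    "wind_spd": "wind_speed",
}

def normalize(record):
    """Rename producer-specific attributes to canonical data elements."""
    return {CANONICAL.get(k, k): v for k, v in record.items()}

a = normalize({"sst": 15.2, "wspd": 7.0})        # producer A's record
b = normalize({"surf_temp": 15.2, "wind_spd": 7.0})  # producer B's record
print(a == b)  # True: synonyms collapse to the same data elements
```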

Achieving Data Interoperability: The Three-Schema Architecture

[Diagram: user application views (external schema) connected through a conceptual schema to the internal schema]

• Converting user-specific data requirements into conceptual "building blocks" for data integration
• Logical data model building blocks are the basis for application data structures
• Normalized logical data model serves as conceptual design "bridge" from the external schema to and from the internal schema
• Also facilitates ingest of other source data
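The three-schema idea can be sketched in a few lines: physical storage (internal schema), a normalized model built from it (conceptual schema), and per-application views (external schemas). All names and the terrain example are illustrative assumptions.

```python
# Internal schema: physical storage, here a flat list of rows.
internal_rows = [
    ("cell-7", "terrain", "slope_deg", 12.5),
    ("cell-7", "terrain", "soil", "clay"),
]

# Conceptual schema: normalized entities assembled from the internal rows.
def conceptual_view(rows):
    entities = {}
    for key, domain, attr, value in rows:
        entities.setdefault(key, {"domain": domain})[attr] = value
    return entities

# External schema: one application's view over the conceptual model.
def trafficability_view(entities):
    """A mobility application only cares about slope and soil."""
    return {k: (e["slope_deg"], e["soil"])
            for k, e in entities.items() if e["domain"] == "terrain"}

model = conceptual_view(internal_rows)
print(trafficability_view(model))  # {'cell-7': (12.5, 'clay')}
```

Because each external view reads only from the conceptual model, the internal storage can change without touching the applications, which is the "bridge" role the slide describes.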

The Three-Schema Architecture Applied to Environmental Data

[Diagram: producer product formats (Prod 1 … Prod N) mapped through a normalized logical data model to user application views (User 1 … User N)]

• Producer product formats: METOC producer-specific formats, NGA product formats, JMCDM, FACC, …
• User (production) applications: CBRN, weather effects, terrain trafficability, …
• Normalized logical data model serves as conceptual "bridge"
• Allows for ingest of other source data
• Fusion of normalized data internal to the system
• Implementation-independent "middle layer" can be placed at the producer interface, the user interface, or somewhere in between

Creating a Reusable Implementation-Independent Middle Layer

Such an architectural layer must be:
• Independent of source products
• Independent of optimized system implementation
• Able to provide for the FULL SPECIFICATION of all source product data as well as all system data requirements
• Developed as an implementation-independent (LOGICAL) relational data model, as required by the DoDAF OV-7 product view

A Reusable Middle Layer for Environmental Data

• Requires standardized terms in all environmental domains
– Leverage existing International/DoD standards

• Requires a concise, well-organized, non-redundant data structure
– Must extend from a normalized logical data model

• Requires highly granular, independent data elements: 'atomic' level concepts
– To support the many formats required by users (precise rendering of translations to and from the hub)

A Complete Representation: All Environmental Domains

A Concise Non-Redundant Data Structure

• Must address format as well as content
– Format: must handle the large number of required data representation formats while preserving consistency of data (the "fair fight" across the federation)
– Content: must be based on atomic data elements from a normalized logical data model (support for data fusion)

Challenge: The Many Formats of M&S Data

[Diagram: many data formats surrounding a central Interchange Hub]
• Controlled Image Base (raster)
• Foundation Feature Data (vector), e.g., lake, trees
• Vector topology
• Geometry
• DTED (gridded)
• 1-, 2-, and 3-D point observation data
• Nested, gridded data
• Tabular data, e.g., surface backscatter strength as a function of angle of incidence and EM band:

  Angle of incidence (degrees):  15   30   45   60   75   90
  microwave                     300  290  240  207  198  170
  L-Band                        160  230  180  167  158  130
  S-Band                        165  152   78   22    8  1.5
  X-Band                        179  122   45   11    6    1
  V-Band                        200   90   40    9    4  0.1

The Final Additions to the set of M&S Formats

• Compact Terrain Data Base (proprietary)
• Digital Terrain Elevation Data (product)
• E&S GDF (proprietary)
• E&S S1000 (proprietary)
• GeoTIFF
• Gridded raster (product)
• MultiGen (proprietary)
• Shapefile (proprietary)
• Terrex DART, Terra Vista (proprietary)
• Vector Product Format (product)

“Atomic” Level Concepts

To facilitate precise rendering of translations to and from the hub

Producers use their own coding systems, each of which captures specific desired information. Some of that information may be captured by others, and some may be unique; almost always, each producer carries information not available from other sources. Extracting information "embedded" in definitions through explicit statement of atomic attributes makes it possible to add attributes without overwriting the object.

The Value of Atomic-Level Attributes: An Example

Entity: Bridge over river
Entity: Suspension bridge
Entity: Bridge for two-way traffic

Decomposed:
Bridge + located over water body = river
Bridge + bridge type = suspension
Bridge + traffic carried = vehicular + number of traffic directions = 2

Results in:
Bridge + located over water body = river
       + bridge type = suspension
       + traffic carried = vehicular
       + number of traffic directions = 2

(each of these attributes can be changed/updated as new information is acquired)
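The bridge example above can be run as code: each source definition decomposes into a set of atomic attributes, the sets merge into one entity record, and any single attribute can later be updated without overwriting the object. The attribute names are illustrative renderings of the slide's phrases.

```python
# Three source definitions of the same bridge, already decomposed
# into atomic attributes (names are illustrative).
sources = [
    {"entity": "Bridge", "located_over_water_body": "river"},  # "bridge over river"
    {"entity": "Bridge", "bridge_type": "suspension"},         # "suspension bridge"
    {"entity": "Bridge", "traffic_carried": "vehicular",       # "two-way traffic"
     "number_of_traffic_directions": 2},
]

def fuse(records):
    """Merge the atomic attribute sets into one entity record."""
    fused = {}
    for rec in records:
        fused.update(rec)
    return fused

bridge = fuse(sources)
print(bridge["bridge_type"])                   # suspension
print(bridge["number_of_traffic_directions"])  # 2

# New information updates one attribute without touching the rest:
bridge["number_of_traffic_directions"] = 1
```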

“Complete and Accurate”—Does That Mean Data Fusion?

• Is the COP affected by METOC conditions? If so, can those effects be reflected in actual changes to the COP on the user system? This can be handled internally to the system without requiring data fusion capability.

• Does the user need to derive useful or critical information from the interaction of METOC/terrain data and information in the COP and provide it to other systems? The answer to this question determines whether data fusion is required by the user.

• Will the warfighter integrate environmental data into operational problems, or will he use it as map or other overlays? The answer to this question determines whether data fusion is required by the user and allowed by the producer.

• Does the user need to have the ability to update METOC conditions and effects as reported by data from other (e.g., intel, foreign forces, etc.) battlefield sources? The answer to this question determines whether data fusion is required by the user.

• What is the total set of requirements?

• There are many processes and products involved (some of which, as in ArcInfo/ArcView terrain products, may be proprietary)—but the exchange mechanism must be independent of these. While we may know all of the currently available sources, will there ever be new ones available to the warfighter?

• Different views of the environment

– Air, land, sea, space

– Spatial location and orientation (coordinate system and datum)

• Lack of underlying environmental framework

– No integrated reference model available

• Representation (how the concept will be depicted on the user’s system—a visual object? 2D or 3D? A data point? Background data for algorithm use?)

• Naming/semantics

– Existing data models are conceptual or future models that are non-integrated and do not address current data repositories and data interchange requirements

[Figure groups the issues above into Business and Technical categories]

Summary: The Challenge of Data Fusion

THE TRADITIONAL SOLUTION: Direct Mapping

[Diagram: Data Producers 1 through n each mapped directly to Data Consumer Applications A through Z]

RESULT: A BIAS AGAINST TRANSLATION SOFTWARE

A GIG-Oriented Solution: The Interoperable "Middle Layer"

[Diagram: Data Producers 1 through n connected through a COMMON INTERCHANGE HUB to Data Consumer Applications A through Z]

What works for one system…creates unusual behaviors in another…

The Result of Improper Data Fusion

Tools to Facilitate Environmental Data Fusion on the GIG

• SEDRIS interchange technologies, www.sedris.org
• DoD VV&A Recommended Practices Guide, http://vva.dmso.mil/, including the Special Topics Paper "Foundations for V&V of the Natural Environment in a Simulation"

SEDRIS: How it works

1. Identify representation structure of original data object (point, vector, raster, etc.—geometry, topology, grid, pixel, etc.) (this is the data format)

2. Separate attribution of the object (what it is, characteristics of what it is) from its representation (this is the data content)

3. Determine georeferencing of the object (this is the location of each object in its original spatial reference frame—UTM, MGRS, WGS-84, any local inertial or celestial reference datum, etc.)

4. Overlay representation on SEDRIS Data Representation Model, convert attribution to EDCS codes, and decompose georeferencing using Spatial Reference Model

5. Reassemble objects from multiple sources using the SEDRIS Transmittal Format to integrate/fuse data (more than just the simple overlay that is used in C4I, M&S systems now)
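The five steps above can be viewed as a pipeline that separates format, content, and georeferencing, then recombines objects from multiple sources. The classes below are schematic stand-ins, NOT the actual SEDRIS API; they only illustrate the separation of concerns.

```python
from dataclasses import dataclass

@dataclass
class SourceObject:
    representation: str   # step 1: format (point, vector, raster, grid, ...)
    attribution: dict     # step 2: content (what it is, its characteristics)
    srf: str              # step 3: spatial reference frame (UTM, WGS-84, ...)
    location: tuple

def to_hub(obj):
    """Step 4 (schematic): map onto a common representation model,
    coded attribution, and a single spatial reference model."""
    return {
        "drm_class": obj.representation,
        "coded_attribution": sorted(obj.attribution.items()),
        "srm_location": (obj.srf, obj.location),  # would be converted to a common SRF
    }

def fuse(objects):
    """Step 5 (schematic): reassemble objects from multiple sources
    into one transmittal."""
    return {"objects": [to_hub(o) for o in objects]}

sources = [
    SourceObject("raster", {"surface": "water"}, "UTM", (18, 585000, 4511000)),
    SourceObject("vector", {"feature": "bridge"}, "WGS-84", (40.7, -74.0)),
]
print(len(fuse(sources)["objects"]))  # 2
```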

Points of Contact

• Air and Space Natural Environment Modeling & Simulation Executive Agent Liaison to DMSO
  Dr. William H. Campbell, (703) 824-3455, [email protected]

• DMSO Verification, Validation, and Accreditation Program Manager
  Ms. Simone M. Youngblood, (703) 824-3427, [email protected]

• DMSO Environmental Representation Program Manager
  Ms. Virginia T. Dobey, (703) 824-3411, [email protected]

BACK-UP SLIDES

GIG Policy: The TPPU Paradigm

(diagram obtained from: http://ges.dod.mil/about/tppu.htm)

Normalization Challenges

• Users are familiar with non-normalized physical data elements. Tendency is to call these “logical” and stop there.

• In any large data model, normalization is difficult. It is often ignored (benign neglect).

• Complete data models incorporate business rules (how the entities relate to each other).

• May not be needed for an implementation-independent model used to develop a data dictionary (of interoperable concepts), but…

