Data Model & DDWG Update
Management Council Face-to-Face
Flagstaff, Arizona
August 22-23, 2011
Topics
• Design Process
• Builds
• Calendar
• Build 1b Review Issues
Data Standards Design Process
• What exactly has to happen?
"Build"
• Freeze the Information Model
• Finalize the System
• Generate Schema
• Freeze the Document Set
  • Introduction
  • Concepts Document
  • Glossary
  • Jump Start
  • Data Provider's Handbook
  • Standards Reference
  • Dictionary Tutorial
  • Data Dictionary
  • Example Set
[Slide annotations alongside the document set: Generated, Reasonably Stable, Human Intervention]
• What this translates to is "lead time".
• Right now we're looking at two to three weeks lead time from "freeze the model" to "flip the switch" on the build.
• Let's look at a calendar.
Objects on the Calendar Are Closer Than They Appear
[Timeline figure: December 2010 through December 2011, by month, with rows for Builds, Meetings, and Reviews]
• Builds: 1b, 1c, 1d, 2
• Meetings: "Mini", DDWG, MC, Tech (DDWG and MC appear more than once)
• Reviews: Internal, IPDA, System, External, M. Rose, ORR
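A rough sketch of the lead-time arithmetic behind the calendar: given a target build date, the model freeze has to land two to three weeks earlier. The function name `freeze_window` and the example build date are hypothetical, chosen only for illustration; the two- and three-week figures are the lead-time estimate quoted earlier.

```python
from datetime import date, timedelta


def freeze_window(build_date, min_weeks=2, max_weeks=3):
    """Return (earliest, latest) model-freeze dates for a target build date.

    Assumes the lead time from "freeze the model" to "flip the switch"
    is between min_weeks and max_weeks, as estimated in the slides.
    """
    earliest = build_date - timedelta(weeks=max_weeks)
    latest = build_date - timedelta(weeks=min_weeks)
    return earliest, latest


# Hypothetical target date, not taken from the calendar figure.
example_build = date(2011, 10, 3)
earliest, latest = freeze_window(example_build)
print(earliest.isoformat(), latest.isoformat())  # 2011-09-12 2011-09-19
```

Under these assumptions, any model change arriving after the later date pushes the build out.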
Internal Review Issues
• The 1b review produced more than 200 separate issues/comments.
• Issues fell into two broad categories:
  • Documentation issues: clarity, consistency, completeness, integration.
  • Concerns about the model contents & implementation.
• The status of each review issue falls into one of two categories:
  • Open
  • Closed
Internal Review Issues
Open
• Still working for Build 2.
• Will address after Build 2.
• Have not decided whether or not to implement.
Closed
• We have implemented.
• Model-related issue arose from misunderstanding some aspect of PDS4.
• We disagree:
  • Incompatible with PDS4 requirements.
  • Incompatible with the model approach we're using.
  • Not possible to implement within our time & budget constraints.
Internal Review: Some Closed Issues
Implemented
• Document set integration.
• Need analogs for PDS3 spreadsheet & container.
Misunderstanding
• New Structures don't support qubes.
• Volatile metadata in a static archive (redelivery issue).
Disagree
• Labels that describe multiple data objects don't really work.
• Do away with character tables.
• Other space science archives: Consider using VOTABLE, CDM & OPeNDAP approach, class="variable" & named "dimension".
Internal Review: Some Open Issues
• Documentation issues – still working many of them.
• Need robust, global metadata.
• New Structures don't support some EDRs, Telemetry, DSN data.
• Use a standard bundle entry (bundle index.html).
• Consider a nomenclature review.
• There is a proposed alternate XML implementation.
  • Starts with XML Schema 1.0 or 1.1?
• Perceived complexity.
• Too many subclasses.
Open Issue: Too many Subclasses (1)
• Going back to the original reviews, the issue concerns the number of variations expanded from the four base structural types. The underlying concerns are overhead and confusion.
• There have been a lot of changes since Build 1b. As we look at this issue now, we have to ask three questions:
  • What do we count?
  • Are there too many?
  • If the numbers are reasonable, do we have the right ones?
Open Issue: Too many Subclasses (2–4)
• What do we count?
  • Count what the data providers and end users see.
  • Schemas – specifically the Product_* schemas.
  • We have 40 Product schemas. Wait for it …
Open Issue: Too many Subclasses (5–10)
• 40 Product schemas, by function:
  • Aggregations – 2 (probably will be 3)
  • Observational Data – 10 (probably will add 1 or 2)
  • Observational Support – 10 (e.g., browse, document)
  • Context – 5
  • Operations – 13 (includes 5 PDS3 Context)
• Providers see 27, end users see 22.
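As a quick check on the arithmetic, the per-function counts can be tallied in a few lines. The category counts are taken from the slide; the derivation of the 27 and 22 figures is an assumption, not something the slides state: it is simply one grouping that reproduces the numbers (providers see everything except Operations, end users additionally do not see Context).

```python
# Product schema counts by function, as listed on the slide.
product_schemas = {
    "Aggregations": 2,
    "Observational Data": 10,
    "Observational Support": 10,
    "Context": 5,
    "Operations": 13,
}

total = sum(product_schemas.values())

# ASSUMPTION (not stated in the slides): one grouping consistent with
# "Providers see 27, end users see 22".
provider_view = total - product_schemas["Operations"]
end_user_view = provider_view - product_schemas["Context"]

print(total, provider_view, end_user_view)  # 40 27 22
```

If the probable additions noted on the slide go ahead (one more aggregation, one or two more observational data products), the total would rise to 42 or 43.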
Open Issue: Too many Subclasses (11)
• Are there too many?
• Comparing to PDS3 tends to be an apples-and-oranges situation, but the number of:
  • PDS4 observational data products is roughly equivalent to the corresponding subset of PDS3 Data Objects.
  • PDS4 context products is roughly equivalent to the corresponding subset of PDS3 Catalog Objects.
  • PDS4 observational data support products is substantially greater than the corresponding subset of PDS3 Data Objects.
Open Issue: Too many Subclasses (12)
• Do we have the correct set?
• We're close, but will probably add and subtract a few.
• May be significantly affected by the potential change in the XML Schema implementation.
Questions?
Backups
Acknowledgements*
Ed Bell, Richard Chen, Dan Crichton, Amy Culver, Patty Garcia, Ed Grayzeck, Ed Guinness, Mitch Gordon, Sean Hardman, Lyle Huber, Steve Hughes, Chris Isbell, Steve Joy
Ronald Joyner, Debra Kazden, Todd King, Joe Mafi, Mike Martin, Thomas Morgan, Lynn Neakrase, Paul Ramirez, Anne Raugh, Mark Rose, Elizabeth Rye, Boris Semenov, Dick Simpson, Susie Slavney
Peter Allan, David Heather, Michel Gangloff, Santa Martinez, Thomas Roatsch, Alain Sarkissian
* Anyone who sat through a DDWG 2-hour telecon or provided useful input.
PDS4 Documents and their Relationships
[Diagram: document boxes connected by labeled arrows]
• Concepts Document: Big Picture
• Standards Reference: Requirements, User Friendly
• XML Schemas: Blueprints
• PDS4 Product Labels: Deliverables
• Data Dictionary: Definitions
• PDS4 Information Model Specification: Requirements, Engineering Specification
• Data Provider's Handbook: Cookbook
• Registry Configuration File: Object Descriptions
• Registry: Product Tracking and Cataloging
• Introduction to PDS4 Documentation
• Jumpstart
• Glossary
• Data Dictionary Tutorial
Other label in the diagram: Informative
Arrow labels: derive, generates, references, creates / validates, instruct, configures
Legend: Complete, Some TBD
PDS4 Information Model and Generated Documents
[Diagram: elements shown]
• Requirements & Domain Knowledge
• Information Modeling Tool
• PDS4 Information Model
• Filter and Translator
• Query Models
• Information Model Specification
• XML Schema (Generic)
• XML Schema (Specific)
• XML Document (Label)
• PDS4 Data Dictionary (Doc and DB)
• PDS4 Data Dictionary (ISO/IEC 11179)
• XMI/UML
• Registry Configuration Parameters