DC2001 Conference - Tokyo
METAXPath
Curtis Dyreson
E.E. and Computer Science, Washington State University, USA
Michael Böhlen and Christian S. Jensen
Computer Science, Aalborg University, Denmark
Nykredit Center for Database Research, Aalborg University, Denmark
Outline
• Data
  Data model
    XML
  Query language
    XPath
• Metadata
  METAXPath
• Future work
An XML Database Architecture
Figure: Client (HTTP browser) <-> HTTP server <-> Database (XML data and metadata)
Database Data Model Evolution
• 60s - Hierarchical data model
• 70s - Network data model
• 80s - Relational data model
• 90s - Object-oriented data model
• 00s - Unstructured/semistructured/XML
Innovators
  Unstructured data models (UPenn)
  UnQL/Strudel (AT&T)
  OEM and Lore (Stanford)
  XML (W3C)
Object Exchange Model (OEM)
• Heterogeneous OODBs
  Exchange objects
  Text description
Figure: objects (object 1, object 2) are exchanged between "my database" and "your database" as text (XML)
Object Representation in XML
• Use names and values
• Ignore types
• &X denotes object X
// A person class
class Person {
  String name; int age;
  Person(String name, int age) { this.name = name; this.age = age; }
}
// A person object
Person joe = new Person("Joe Doe", 25);
<!ELEMENT person (name, age)>
<!ATTLIST person id ID #REQUIRED>
<person id="&1" name="Joe Doe" age="25" />
<person id="&1"> <name>Joe Doe</name> <age>25</age> </person>
XML (XPath) Data Model
• Each element or attribute is a node
• Edges indicate nesting
• Nodes contain information
• Tree is ordered
XML
<person id="&1" name="Joe"> <age>25</age> </person>
XPath
Figure: root -> person element (attributes id="&1", name="Joe") -> text "\n", age element (text "25"), text "\n"
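Semistructured Data Model
• Each element or attribute is a node
• Edges indicate nesting
• Edges are labeled
XML
<person id="&1" name="Joe"> <age>25</age> </person>
Semistructured
Figure: an edge labeled person leads to node &1; from &1, an edge labeled name leads to Joe and an edge labeled age leads to 25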
Data Models Compared
Semistructured
• Insensitive to text order, whitespace, attributes vs. elements
• Directed graph (many roots, can contain cycles)
XPath
• Captures text order, whitespace, attributes and elements
• A tree (single root, no cycles)
Figure: the XPath tree and the semistructured graph for the person example, side by side
Outline
• Data
  Data model
    XML
  Query language
    XPath
• Metadata
  XML - METAXPath
• Future work
XPath
• W3C Recommendation – 1999
  Used in XQuery, XSLT, and XPointer
  Language for selecting locations in an XML document
• Query
  Sequence of location steps separated by '/'
  Location step
    axis::node_test [predicate1]…[predicateN]
  Evaluated with respect to a context node
  Results in a node-set (actually a list of nodes!)
  Step continues from nodes reached in previous step
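The step-by-step evaluation described above can be sketched in a few lines of Python. This is only an illustration, not the authors' implementation: it handles the child, descendant, and self axes, ignores predicates, and works on element nodes only. The sample document, the synthetic `#root` wrapper, and the helper names `step` and `evaluate` are assumptions made for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, loosely modeled on the person example.
DOC = (
    '<person SSN="99">'
    "<name><first>Susan</first><last>Douglas</last></name>"
    "<dateOfBirth><month>January</month><year>1981</year></dateOfBirth>"
    "</person>"
)

# ElementTree has no separate root node, so wrap the document element
# in a synthetic element that plays the role of the XPath root.
ROOT = ET.Element("#root")
ROOT.append(ET.fromstring(DOC))


def step(context, axis, node_test):
    """Apply one location step (no predicates) to a list of context nodes."""
    result = []
    for node in context:
        if axis == "child":
            candidates = list(node)
        elif axis == "descendant":
            candidates = list(node.iter())[1:]  # iter() yields node itself first
        elif axis == "self":
            candidates = [node]
        else:
            raise ValueError(f"unsupported axis: {axis}")
        for candidate in candidates:
            passes = node_test == "*" or candidate.tag == node_test
            if passes and candidate not in result:
                result.append(candidate)
    return result


def evaluate(root, path):
    """Evaluate an absolute path; each step starts from the previous result."""
    context = [root]
    for location_step in path.strip("/").split("/"):
        axis, node_test = location_step.split("::")
        context = step(context, axis, node_test)
    return context


print([e.tag for e in evaluate(ROOT, "/child::person/child::*/child::last")])
```

Each call to `step` consumes the node list produced by the previous step, which is the "step continues from nodes reached in previous step" rule.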
Descendant Axis Example
Figure: example tree. root -> person element (attribute SSN="99…") -> name element (children first, last), comment "This…", dateOfBirth element (children month, year); text nodes Susan, Douglas, January, 1981; attribute initial="S"
Axes that Partition a Tree
• Ancestor, descendant, following, preceding, and self partition a tree.
Figure: a tree divided into the regions ancestor, preceding, self, following, and descendant
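A small check of the partition claim, as a sketch under stated assumptions: the document and the helper names `ancestors` and `axes` are made up for illustration, and only element nodes are considered (in XPath 1.0 attribute and namespace nodes fall outside these five axes).

```python
import xml.etree.ElementTree as ET

# Hypothetical document; element names are arbitrary.
doc = ET.fromstring("<a><b><c/><d><e/></d></b><f><g/></f></a>")

order = list(doc.iter())                          # elements in document order
parent = {child: p for p in order for child in p}  # child -> parent map


def ancestors(node):
    out = []
    while node in parent:
        node = parent[node]
        out.append(node)
    return out


def axes(node):
    """Return the five axes of `node` as lists of elements."""
    anc = ancestors(node)
    desc = list(node.iter())[1:]
    i = order.index(node)
    return {
        "ancestor": anc,
        "descendant": desc,
        # preceding: earlier in document order, excluding ancestors
        "preceding": [m for m in order[:i] if m not in anc],
        # following: later in document order, excluding descendants
        "following": [m for m in order[i + 1:] if m not in desc],
        "self": [node],
    }


parts = axes(doc.find("b/d"))
for axis, nodes in parts.items():
    print(axis, [n.tag for n in nodes])
```

Every element of the document lands in exactly one of the five lists, which is what "partition" means here.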
XPath Node Test and Predicates
• Each node in result-set must pass node test
  Is this an element node named person?
    person
  Is this an element node?
    *
• Predicates are further tests (about other nodes)
  Does node have a ssn attribute?
    [attribute::ssn]
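Node tests and predicates can be tried out with Python's standard `xml.etree.ElementTree`, which supports a subset of XPath in abbreviated form only, so `[@ssn]` stands in for `[attribute::ssn]`. The sample document below is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical data: two person elements (one with an ssn) and one pet.
people = ET.fromstring(
    "<people>"
    '<person ssn="1"><name>Susan</name></person>'
    "<person><name>Joe</name></person>"
    "<pet><name>Rex</name></pet>"
    "</people>"
)

named_person = people.findall("person")     # node test: element named person
any_element = people.findall("*")           # node test: any element
with_ssn = people.findall("person[@ssn]")   # predicate: has an ssn attribute

print(len(named_person), len(any_element), len(with_ssn))
```

The node test narrows by kind and name; the predicate then filters the surviving nodes by looking at other nodes, here the attribute.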
Example
/child::person/child::*/child::last
Figure: step-by-step evaluation on the example tree. root -> person element -> name and dateOfBirth elements (the comment "This…" does not pass the * node test) -> last element
XPath Examples
• The dateOfBirth children of person nodes
/descendant::person/child::dateOfBirth
• The last text node
/descendant::text()[position()=last()]
Abbreviated Syntax
• Think of file path specifications in Unix
• Year child of dateOfBirth
  child::dateOfBirth/child::year
  dateOfBirth/year
• name siblings
  parent::node()/child::name
  ../name
• All year nodes
  /descendant-or-self::node()/child::year
  //year
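The abbreviated forms can be exercised with Python's `xml.etree.ElementTree`, which accepts only the abbreviated syntax and only relative paths. In this sketch the document is hypothetical, `.//year` is the relative counterpart of `//year`, and `..` is reached through `dateOfBirth/..` because ElementTree cannot step above the element a search starts from.

```python
import xml.etree.ElementTree as ET

# Hypothetical document modeled on the person example.
person = ET.fromstring(
    "<person>"
    "<name><last>Douglas</last></name>"
    "<dateOfBirth><month>January</month><year>1981</year></dateOfBirth>"
    "</person>"
)

# child::dateOfBirth/child::year
year = person.findall("dateOfBirth/year")

# From dateOfBirth, go to the parent and then to its name children.
name_siblings = person.findall("dateOfBirth/../name")

# All year elements below the starting element.
all_years = person.findall(".//year")

print(year[0].text, name_siblings[0].tag, len(all_years))
```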
Outline
• Data
  Data model
    XML
  Query language
    XPath
• Metadata
  XML - METAXPath
• Future work
Metadata
• Database metadata
  Schema, security, transaction time (versions)
• Web metadata
  Author, language, subject, privacy
• Web metadata recommendations
  RDF, RDD, P3P
• Features
  Descriptive, but also exclusionary
  Irregular
  Multiple
  Ad-hoc
A Movie Database
• Movie data
  Bruce Willis stars in Colour of Night.
  Colour of Night premiered 1/Jul/1995.
• Publication meta-data
  language: English
  URL: http://www.auc.dk
  publication date: 2/Apr/1997
  privacy/security: 'over 18'
  publication history: v1.2, modified 31/Jul/1998
  subject: Film, Suspense, Thriller
  namespace: http://www.auc.dk/movieDataDTD.xml
Movie Database Queries
• Metadata only
  Retrieve information published at Danish web sites.
• Metadata compared to data
  Find reviews published in the first week of the movie's release.
• Metadata and data, but independent
  Get suspense films starring Bruce Willis.
Properties of a Metadata Data Model
• Goal: Same query language for data and metadata
  User learns "one" language
  Compiler/optimization reuse
• Challenges:
  Data and metadata in different dataspaces
  Query on data should not accidentally query metadata
  Meta-metadata (metadata for metadata)
  Metadata has semantics
  Data with/without metadata
METAXPath Data Model
• Data model
  Reuse XPath data model
  Meta attribute points to metadata tree
  "Right angle" data model
• Features
  Minimal extension of XPath
  Backwards-compatible
Example
• Data
<?xml version="1.0"?>
<person ssn="234">
  <name>Ichiro</name>
</person>
• URL metadata
<source URL="www.wsu.edu/p.htm"/>
• Language metadata of person element
<language>English</language>
• Author meta-metadata - language metadata author
<author name="Suzuki"/>
Figure (built up over several steps): the data tree has a root whose children are a text node (\n) and the person element (attribute ssn); person contains a text node (\n\t) and the name element, which contains the text Ichiro. The root's Meta property points to a metadata tree holding the source element (URL www.wsu.edu/p.htm). The person element's Meta property points to a metadata tree holding the language element (text English). The root of the language tree has its own Meta property, which points to a meta-metadata tree holding the author element (name Suzuki).
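One way to picture the data model is an XPath-style node that carries one extra pointer. The sketch below is an illustration only: the `Node` class and its field names are assumptions, whitespace text nodes are left out, and the ssn value follows the XML on the slide.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    """An XPath-style node plus a pointer to the root of its metadata tree."""
    type: str                                   # "root", "element", or "text"
    value: Optional[str] = None                 # element name or text content
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None               # the "right angle" into metadata


# Data tree for <person ssn="234"><name>Ichiro</name></person>
name = Node("element", "name", children=[Node("text", "Ichiro")])
person = Node("element", "person", {"ssn": "234"}, [name])
data_root = Node("root", children=[person])

# URL metadata, attached to the data root.
data_root.meta = Node(
    "root", children=[Node("element", "source", {"URL": "www.wsu.edu/p.htm"})]
)

# Language metadata of the person element.
language_root = Node(
    "root",
    children=[Node("element", "language", children=[Node("text", "English")])],
)
person.meta = language_root

# Author meta-metadata: metadata about the language metadata.
language_root.meta = Node(
    "root", children=[Node("element", "author", {"name": "Suzuki"})]
)

print(person.meta.children[0].value)                        # language
print(person.meta.meta.children[0].attributes["name"])      # Suzuki
```

Because a metadata tree is built from the same `Node` type, metadata can itself carry metadata, and a node without a `meta` pointer behaves like a plain XPath node.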
Sharing and Excluding Metadata
• Meta property points to metadata for a node
  Shared pointers ==> shared metadata
• To share with child
  Copy pointer
• To exclude from child
  Duplicate excluded portion
  Copy remaining shared pointers
Figure (share metadata with descendants): every node of the data tree (root, person, name, and the text nodes) has a Meta property; shared pointers lead to the same source, language, and author metadata trees.
Figure (exclude from child): the Ichiro text is not authored by Suzuki, so its Meta property points to a duplicated portion of the metadata that carries no pointer to the author meta-metadata.
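The sharing and exclusion rules can be sketched with plain object references. This is one possible reading, not the authors' code: the `Node` class and the helpers `share_with_descendants` and `exclude_meta_metadata` are hypothetical, and exclusion is modeled by duplicating only the metadata root while its children stay shared.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str
    value: Optional[str] = None
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None


def share_with_descendants(node):
    """To share with a child, copy the pointer (not the metadata)."""
    for child in node.children:
        child.meta = node.meta
        share_with_descendants(child)


def exclude_meta_metadata(node):
    """Duplicate the metadata root without its own Meta pointer;
    the remaining pointers (its children) are copied, so they stay shared."""
    shared = node.meta
    node.meta = Node(
        shared.type,
        shared.value,
        dict(shared.attributes),
        list(shared.children),
        meta=None,
    )


author_tree = Node("root", children=[Node("element", "author", {"name": "Suzuki"})])
language_tree = Node(
    "root",
    children=[Node("element", "language", children=[Node("text", "English")])],
    meta=author_tree,
)
ichiro = Node("text", "Ichiro")
name = Node("element", "name", children=[ichiro])
person = Node("element", "person", {"ssn": "234"}, [name], meta=language_tree)

share_with_descendants(person)   # name and the Ichiro text now share the metadata
exclude_meta_metadata(ichiro)    # the Ichiro text keeps language, drops author

print(name.meta is language_tree, ichiro.meta.meta is None)
```

After the two calls the Ichiro text still reaches the same language element as person, but no longer reaches the author meta-metadata.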
METAXPath Queries
• XPath plus level shift operation
  meta axis
  ^ in abbreviated syntax
• Example - Locate data nodes with URL metadata of p.htm
  /descendant-or-self::*[meta::*/child::source[attribute::URL="p.htm"]]
  In abbreviated syntax
  //*[^source[@URL="p.htm"]]
• Example - Locate the URL metadata
  //*^source/@URL
• Example - Locate data that has metadata authored by Suzuki (meta-metadata)
  //*[^//*^author[@name="Suzuki"]]
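To make the level shift concrete, the three example queries are hand-translated below into Python over a toy node model. Everything here is an assumption for illustration: the `Node` class, the helper names, the sample data, and in particular the reading of `^name` as "elements called name at the top of the node's metadata tree". The slides do not define the meta axis at this level of detail, and no METAXPath parser is involved.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(eq=False)
class Node:
    type: str
    value: Optional[str] = None
    attributes: dict = field(default_factory=dict)
    children: list = field(default_factory=list)
    meta: Optional["Node"] = None


def descendant_or_self(node):
    yield node
    for child in node.children:
        yield from descendant_or_self(child)


def elements(nodes):
    return [n for n in nodes if n.type == "element"]


def meta_step(node, name):
    """Assumed meaning of ^name: shift one level, then test top-level elements."""
    if node.meta is None:
        return []
    return [m for m in elements(node.meta.children) if m.value == name]


def with_source_url(root, url):
    """Roughly //*[^source[@URL=url]]"""
    return [n for n in elements(descendant_or_self(root))
            if any(s.attributes.get("URL") == url for s in meta_step(n, "source"))]


def source_urls(root):
    """Roughly //*^source/@URL, with each source node counted once."""
    seen, urls = set(), []
    for n in elements(descendant_or_self(root)):
        for s in meta_step(n, "source"):
            if id(s) not in seen and "URL" in s.attributes:
                seen.add(id(s))
                urls.append(s.attributes["URL"])
    return urls


def authored_by(root, author):
    """Roughly //*[^//*^author[@name=author]]: two level shifts."""
    hits = []
    for n in elements(descendant_or_self(root)):
        if n.meta is None:
            continue
        for m in elements(descendant_or_self(n.meta)):
            if any(a.attributes.get("name") == author for a in meta_step(m, "author")):
                hits.append(n)
                break
    return hits


# Hypothetical sample: person carries source and language metadata;
# the language element carries author meta-metadata.
author_tree = Node("root", children=[Node("element", "author", {"name": "Suzuki"})])
language = Node("element", "language", children=[Node("text", "English")], meta=author_tree)
source = Node("element", "source", {"URL": "www.wsu.edu/p.htm"})
metadata = Node("root", children=[source, language])
name = Node("element", "name", children=[Node("text", "Ichiro")])
person = Node("element", "person", {"ssn": "234"}, [name], meta=metadata)
data_root = Node("root", children=[person])

print([n.value for n in with_source_url(data_root, "www.wsu.edu/p.htm")])
print(source_urls(data_root))
print([n.value for n in authored_by(data_root, "Suzuki")])
```

A plain XPath-style traversal (`descendant_or_self`) never follows `meta`, so a query on data cannot reach metadata unless it shifts level explicitly.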
Outline
• Data
  Data model
    XML
  Query language
    XPath
• Metadata
  XML - METAXPath
• Future work
Metadata Semantics
• Transaction time example
Figure: graph over nodes &1, &2, and &3 with leaves Color of Night and Colour of Night. Edge properties: (name: reviewed, trans. time: [1/Sep/1999 - uc]); (name: movie); (name: title, trans. time: [2/Apr/1997 - 31/Jul/1998]); (name: title, trans. time: [1/Aug/1998 - uc]). One sequence of edges is marked "Not a path!".
AUCQL Collapse Example
• Property collapse: for name it is concatenation, for trans. time it is temporal intersection.
Figure: the graph over &1, &2, and &3 with edges (name: reviewed, trans. time: [1/Sep/1999 - uc]), (name: movie), (name: title, trans. time: [2/Apr/1997 - 31/Jul/1998]), and (name: title, trans. time: [1/Aug/1998 - uc]), leading to Color of Night and Colour of Night. Collapsed results: (name: reviewed.movie.title, trans. time: [1/Sep/1999 - uc]) and (name: reviewed.movie.title, trans. time: undefined).
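The collapse rule can be worked through in Python as a sketch. The function names, the dictionary layout of an edge, and the treatment of an edge without a transaction time as unconstrained are assumptions; `None` as an interval end stands for "uc", and a `None` result stands for "undefined".

```python
from datetime import date

# Assumption: an edge that states no trans. time does not constrain the result.
ALWAYS = (date.min, None)


def intersect(a, b):
    """Temporal intersection of two (start, end) intervals; None if empty."""
    if a is None or b is None:
        return None
    start = max(a[0], b[0])
    ends = [e for e in (a[1], b[1]) if e is not None]
    end = min(ends) if ends else None      # both open-ended -> still "uc"
    if end is not None and start > end:
        return None                        # no common time: undefined
    return (start, end)


def collapse(path):
    """Collapse the properties along a path of edges."""
    time = ALWAYS
    for edge in path:
        time = intersect(time, edge.get("trans_time", ALWAYS))
    return {"name": ".".join(edge["name"] for edge in path), "trans_time": time}


reviewed = {"name": "reviewed", "trans_time": (date(1999, 9, 1), None)}
movie = {"name": "movie"}
new_title = {"name": "title", "trans_time": (date(1998, 8, 1), None)}
old_title = {"name": "title", "trans_time": (date(1997, 4, 2), date(1998, 7, 31))}

print(collapse([reviewed, movie, new_title]))
print(collapse([reviewed, movie, old_title]))
```

The first path collapses to reviewed.movie.title with [1/Sep/1999 - uc]; the second has no common transaction time, so its trans. time is undefined, matching the two results on the slide.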
AUCQL Additional Operations
• Coalesce - compute a distributed property value
Figure: two edges between &1 and &2. First edge: name: review, security! developer, trans. time: [1/Jul/1999 - 15/Jul/1999]. Second edge: name: review, security! subscriber, trans. time: [16/Jul/1999 - uc]. Coalesced trans. time: [1/Jul/1999 - uc].
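For transaction time, coalescing can be read as merging intervals that overlap or meet. The sketch below assumes day granularity and uses `None` for "uc"; the function name and the interval representation are illustrative and not taken from AUCQL.

```python
from datetime import date, timedelta


def coalesce(intervals):
    """Merge (start, end) intervals that overlap or are adjacent by one day."""
    merged = []
    for start, end in sorted(intervals, key=lambda iv: iv[0]):
        if merged:
            last_start, last_end = merged[-1]
            if last_end is None:
                continue                   # an open-ended interval absorbs the rest
            if start <= last_end + timedelta(days=1):
                new_end = None if end is None else max(last_end, end)
                merged[-1] = (last_start, new_end)
                continue
        merged.append((start, end))
    return merged


developer = (date(1999, 7, 1), date(1999, 7, 15))
subscriber = (date(1999, 7, 16), None)

print(coalesce([developer, subscriber]))
```

The two review edges meet on 15/16 July, so the coalesced transaction time is the single interval [1/Jul/1999 - uc].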
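Thin Layer Implementation
Figure: a METAXPath query goes to the METAXPath Compiler, which produces an XPath query for the XPath Compiler; the query runs against the DB, which holds the metadata encoding, and yields the result.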
Prototype Implementation
Figure components: METAXPath query, METAXPath Compiler, Evaluation Tree, Query Evaluation Engine, Database API, DBM, XML Parser, Indexing, XML, RDF, result. Two components are labeled Perl.
Summary
• METAXPath website
  http://www.eecs.wsu.edu/~cdyreson/pub/MetaXPath
• AUCQL website
  http://www.eecs.wsu.edu/~cdyreson/pub/AUCQL
  VLDB '99
  Implemented research prototype
  Free, downloadable, Unix environment
  Interactive query engine
  Tutorials