Date post: 22-Dec-2015
Full-Fidelity Flexible Object-Oriented XML Access

James F. Terwilliger, Philip A. Bernstein, Sergey Melnik

Persistent XML via your favorite programming language

Declarative mappings

Access to the whole document, able to reconstruct the original

LRX: LINQ over Relations and XML

Classes, Tables, XML

Object-oriented access to stored data

SQL, XQuery

Native programming language, object-based queries and updates, static type checking

ORMs do not handle XML!

?

Language-Integrated Query

Problem

• Currently common: use an ORM, bring XML data to the client as a string, and process it on the client
– Load into XML-like objects and do XPath through an API

from o in DB.JobCandidates
where o.Resume.Skills.Contains("production")
select o.Resume.Name.Name_Last;

WITH XMLNAMESPACES ('http://.../adventure-works/Resume' AS r)
SELECT [Extent1].[Resume].value(N'/*[1]/r:Name/r:Name.Last', N'nvarchar(max)') AS [C1]
FROM [HumanResources].[JobCandidate] AS [Extent1]
WHERE [Extent1].[Resume].exist(N'/*[1]/r:Skills[contains(., "production")]') = CAST(1 AS bit)

Inspiration: O-R Mapping
Entity Framework (Melnik et al. 2007)

Client-side (Objects): Classes
Store side (Relations): Tables

Q1 = Q1’, Q2 = Q2’, Q3 = Q3’ (select-project only)

Query view V_Q
Update view V_U
Merge view V_M

Object Queries (LINQ)

Object Updates

Mapping specified at schema level

Mapping compiled to views

Preserve fidelity of the source data

EF Example

Client-side (Classes):
Person: id, name, title

Store side (Relations):
Person1(id integer PRIMARY KEY, name varchar(50))
Person2(id integer PRIMARY KEY, title varchar(50), details varchar(2000))

$\pi_{id, name}(Person) = \pi_{id, name}(Person1)$

$Person = \pi_{id, name, title}(Person1 \bowtie Person2)$

$\pi_{id, title}(Person) = \pi_{id, title}(Person2)$
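
To make these equalities concrete, here is a minimal Python sketch (Python rather than the deck's C# and SQL, only so that it runs standalone). It treats Person1 and Person2 as lists of dicts, derives Person with a join, and projects Person back onto the two tables. The names `query_view` and `update_views` and the sample rows are illustrative assumptions, not Entity Framework APIs; carrying the unmapped `details` column over is one plausible reading of the fidelity goal, not the compiler's actual output.

```python
# Store side: two tables that together hold the Person entity (sample rows are made up).
person1 = [{"id": 1, "name": "Sue"}, {"id": 2, "name": "Bob"}]
person2 = [
    {"id": 1, "title": "Dr.", "details": "on leave"},
    {"id": 2, "title": "Mr.", "details": ""},
]


def query_view(p1_rows, p2_rows):
    """Person = project[id, name, title](Person1 join Person2)."""
    titles = {row["id"]: row["title"] for row in p2_rows}
    return [
        {"id": r["id"], "name": r["name"], "title": titles[r["id"]]}
        for r in p1_rows
        if r["id"] in titles
    ]


def update_views(persons, old_p2_rows):
    """Project Person back onto each table; the unmapped 'details' column is carried over."""
    details = {row["id"]: row["details"] for row in old_p2_rows}
    new_p1 = [{"id": p["id"], "name": p["name"]} for p in persons]
    new_p2 = [
        {"id": p["id"], "title": p["title"], "details": details.get(p["id"], "")}
        for p in persons
    ]
    return new_p1, new_p2


persons = query_view(person1, person2)
round_trip = update_views(persons, person2)
```

Reading Person through the query view and writing it back unchanged reproduces both tables, including the column that the class never sees.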

Extending EF for XML: Design Requirements

Classes, Tables, XML

• Map classes to XML using similar mechanism
• Schema-level mapping language
• Compile into query and update procedures
• In-place updates to maintain fidelity of source

• BONUS: Full-Fidelity object representation

Challenges and Related Work

1. Express O-X mappings declaratively
– Some existing tools are canonical (not flexible)
– E.g., LINQ-to-XSD

2. Translate mappings into bidirectional procedures
– Some existing tools are unidirectional
– E.g., Clio

3. Translate client queries and updates into server analogs
– Some existing tools are state-based
– E.g., Lenses, Bidirectional XQuery

Outline

• Introduction
• Mappings
• Mapping compilation and query translation
• Full Fidelity and update translation
• Performance
• Conclusion

Running Example

Store Side (XML Schema):

type Contact:
  Address?
  sequence (1..5)
    phone_type
    number*
  Address*
  xsd:choice
    Name
      @Prefix
      First_Name
      Last_Name
    BusinessInfo
      Business_Name

Example Document:

<contact>
  <Address> … </Address>
  <phone_type>Home</phone_type>
  <number>555-5123</number>
  <phone_type>Cell</phone_type>
  <phone_type>Work</phone_type>
  <number>555-5234</number>
  <number>555-5345</number>
  <Address> … </Address>
  <Address> … </Address>
  <Name Prefix="Ms.">
    <First_Name>Sue</First_Name>
    <Last_Name>Wall</Last_Name>
  </Name>
</contact>

Component Designator (CD)

Store Side (XML Schema):

type Contact:
  Address?
  sequence (1..5)
    phone_type
    number*
  Address*
  Name
    @Prefix
    First_Name
    Last_Name
  BusinessInfo
    Business_Name

/type::ns:Contact/model::sequence/schemaElement::ns:Address[1]

/type::ns:Contact/model::sequence/model::sequence[1]

/type::ns:Contact/model::sequence/schemaElement::ns:Address[2]

schemaElement::ns:Name/type::0/model::sequence[1]/schemaElement::First_Name[1]

Name/First_Name

Mappings and Flexibility: Intuition

Client Side (Objects):

Contact: Address, Phone1 … Phone5, ContactType (P/B), PersonName, BusinessInfo
PersonInfo: Prefix, First_Name, Last_Name
PhoneInfo: type, Numbers

Store Side (XML Schema):

type Contact:
  Address?
  sequence (1..5)
    phone_type
    number*
  Address*
  Name
    @Prefix
    First_Name
    Last_Name
  BusinessInfo
    Business_Name

Diagram labels:
= P
Address[1], 1
model::sequence, 1
model::sequence, 5

Alternative Representation

Client Side (Objects):

Contact: Address, Phone[5]
Person: Prefix, First_Name, Last_Name

Store Side (XML Schema):

type Contact:
  Address?
  sequence (1..5)
    phone_type
    number*
  Address*
  Name
    @Prefix
    First_Name
    Last_Name
  BusinessInfo
    Business_Name

Diagram labels:
model::sequence, all
Name/@Prefix, 1
Name/Last_Name, 1
EXISTS

Mappings

• A type mapping:
– Associates one client-side class with one XML type
– Assigns to each class member a CD expression
  • “Mapping Fragment”
  • Essentially maps to a schema element
  • Might include a position reference if mapping into a list
– Allows conditions on either side
  • Client-side can have equality conditions on values
  • XML-side can have equality conditions on values, tag names, or existence of elements
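
As a rough illustration of what a type mapping holds, the Python sketch below models one as plain data. The class and field names (`TypeMapping`, `MappingFragment`, `cd_for`) are hypothetical and are not LRX's actual mapping language. The example values reuse the Person/Contact labels from the alternative representation, on the assumption that the trailing "1" in those labels is a position reference.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MappingFragment:
    member: str                      # client-side class member
    cd_expression: str               # CD expression on the XML side
    position: Optional[str] = None   # position reference when mapping into a list


@dataclass(frozen=True)
class TypeMapping:
    client_class: str                # exactly one client-side class ...
    xml_type: str                    # ... associated with exactly one XML type
    fragments: Tuple[MappingFragment, ...] = ()
    client_conditions: tuple = ()    # equality conditions on member values
    xml_conditions: tuple = ()       # value, tag-name, or existence conditions

    def cd_for(self, member: str) -> str:
        """Return the CD expression assigned to a class member."""
        for fragment in self.fragments:
            if fragment.member == member:
                return fragment.cd_expression
        raise KeyError(member)


# Hypothetical mapping: a Contact is a Person when a Name element exists.
person_mapping = TypeMapping(
    client_class="Person",
    xml_type="Contact",
    fragments=(
        MappingFragment("Prefix", "Name/@Prefix", position="1"),
        MappingFragment("Last_Name", "Name/Last_Name", position="1"),
    ),
    xml_conditions=(("EXISTS", "Name"),),
)
```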

Compiling CD Expressions: UPA

Unique Particle Attribution

Store Side (XML Schema):

type Contact:
  Address?
  sequence (1..5)
    phone_type
    number*
  Address*
  Name
    @Prefix
    First_Name
    Last_Name
  BusinessInfo
    Business_Name

Example Document:

<contact>
  <Address> … </Address>
  <phone_type>Home</phone_type>
  <number>555-5123</number>
  <phone_type>Cell</phone_type>
  <phone_type>Work</phone_type>
  <number>555-5234</number>
  <number>555-5345</number>
  <Address> … </Address>
  <Address> … </Address>
  <Name Prefix="Ms.">
    <First_Name>Sue</First_Name>
    <Last_Name>Wall</Last_Name>
  </Name>
</contact>

Compiling CD Expressions

Store Side (XML Schema):

type Contact:
  Address?
  sequence (1..5)
    phone_type
    number*
  Address*
  Name
    @Prefix
    First_Name
    Last_Name
  BusinessInfo
    Business_Name

A CD expression compiles to a query that retrieves all elements matching the schema element:

/type::ns:T/model::sequence/schemaElement::ns:Address[1]
compiles to
/Address[. << ../phone_type[1]]

Name/First_Name
compiles to
Name/First_Name
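
The compiled query for Address[1] relies on the XPath 2.0 node-order operator `<<`, which Python's standard `xml.etree.ElementTree` does not support. The sketch below therefore emulates it by walking the children in document order. It is an illustration of what the compiled query selects, not the LRX compiler; the sample document and the function names are assumptions, and the rule used for the second Address particle is my own guess at the complementary case, which the slide does not show.

```python
import xml.etree.ElementTree as ET

# Sample document shaped like the running example (address contents are made up).
DOC = """<contact>
  <Address>a0</Address>
  <phone_type>Home</phone_type><number>555-5123</number>
  <phone_type>Cell</phone_type>
  <Address>a1</Address>
  <Address>a2</Address>
  <Name Prefix="Ms."><First_Name>Sue</First_Name><Last_Name>Wall</Last_Name></Name>
</contact>"""


def first_phone_index(children):
    """Index of the first phone_type child, or None when there is none."""
    for index, child in enumerate(children):
        if child.tag == "phone_type":
            return index
    return None


def first_address_particle(contact):
    """Emulates Address[. << ../phone_type[1]]: Address nodes before the first phone_type."""
    children = list(contact)  # iteration follows document order
    split = first_phone_index(children)
    if split is None:
        return []  # with no phone_type to compare against, nothing matches
    return [c for c in children[:split] if c.tag == "Address"]


def second_address_particle(contact):
    """Assumed complement: Address nodes after the first phone_type."""
    children = list(contact)
    split = first_phone_index(children)
    if split is None:
        return []
    return [c for c in children[split + 1:] if c.tag == "Address"]


contact = ET.fromstring(DOC)
leading = [a.text for a in first_address_particle(contact)]
trailing = [a.text for a in second_address_particle(contact)]
```

Because the schema satisfies Unique Particle Attribution, each Address element in a valid document belongs to exactly one of the two particles, which is what lets a position test like this stand in for the schema component.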

Queries and Query Translation: Intuition

from p in ObjectContext.People
from e in p.resume.employers
where e.address.city.Contains("Port")
select new {pname = p.name, ename = e.name}

from p in ObjectContext.People
from e in SEQUENCE(p.resume, "/resume/employers")
where TEST(e, "/resume/address/city[contains(., 'Port')]")
select new {pname = p.name, ename = VALUE(e, "/name", string)}

Queries and Query Translation: Basics

Foo.bar1.bar2.bar3

PH(Q)

Type T

Q’: Compiled query for CD expression of T.bar3

PH’(Q/Q’)

PH and PH’ in {VALUE, QUERY, SEQUENCE, TEST}:
• VALUE: Run query, cast result as primitive type
• QUERY: Run query
• SEQUENCE: Run query, iterate over results
• TEST: Run query, return boolean indicating if result is non-empty
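
The four placeholder functions can be sketched in Python over `xml.etree.ElementTree`, which supports only a limited XPath subset. This is a client-side illustration of their described behaviour, not the server-side functions LRX generates. The `compose` helper is a hypothetical stand-in for the Q/Q’ step, and taking the first match in `VALUE` is an assumption.

```python
import xml.etree.ElementTree as ET


def QUERY(node, path):
    """Run the query and return the matching nodes."""
    return node.findall(path)


def SEQUENCE(node, path):
    """Run the query and iterate over the results."""
    yield from node.findall(path)


def TEST(node, path):
    """Run the query and report whether the result is non-empty."""
    return len(node.findall(path)) > 0


def VALUE(node, path, cast):
    """Run the query and cast the first result to a primitive type (assumption: first match)."""
    match = node.find(path)
    if match is None:
        raise LookupError(f"no node matches {path!r}")
    return cast(match.text)


def compose(q, q_prime):
    """Q/Q': extend the path for a prefix with the compiled path of the next member."""
    return f"{q.rstrip('/')}/{q_prime.lstrip('/')}"


contact = ET.fromstring(
    "<contact><phone_type>Home</phone_type><number>5123</number>"
    "<Name Prefix='Ms.'><First_Name>Sue</First_Name></Name></contact>"
)
first_name = VALUE(contact, compose("./Name", "First_Name"), str)
has_business = TEST(contact, "./BusinessInfo")
numbers = [int(n.text) for n in SEQUENCE(contact, "./number")]
```

Which placeholder wraps a path depends on how the client expression uses the member: a scalar read becomes VALUE, an iteration becomes SEQUENCE, and a condition becomes TEST.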

Type Translation

Client Side (Objects):

Contact: Address, Phone[5]
Person: Prefix, First_Name, Last_Name

Store Side (XML Schema):

type Contact:
  Address?
  sequence (1..5)
    phone_type
    number*
  Address*
  Name
    @Prefix
    First_Name
    Last_Name
  BusinessInfo
    Business_Name

Diagram label: EXISTS

where obj is Person
where TEST(obj, "./Name")

Full Fidelity

<contact>
  <!-- added by Tom -->
  <Address source="corporate">
    …
  </Address>
  …
  <!-- Need to review addresses -->
</contact>

class Contact {
  …
  AddressType Address;
  …
}

Not part of the schema for the document

Delta Representation

contactObject
  Address = new Address (…)
  Phone[1] = …

contactObject
  BeforeEnd: "Need to review addresses" (Comment)
  Address = new Address (…)
    Before: "added by Tom" (Comment)
    Start: source="corporate" (Attribute)
  Phone[1] = …

<contact>
  <!-- added by Tom -->
  <Address source="corporate">
    …
  </Address>
  …
  <!-- Need to review addresses -->
</contact>

• Each mappable location (anchor) is a key into the delta
• Unmapped data becomes associated with an anchor with a relative position reference
• Anchors stored in document order
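
A minimal Python sketch of building such a delta is shown below, assuming the delta can be modelled as a dict from anchor to a list of (position, kind, content) entries. The function names, the use of the tag name as the anchor key, and the treatment of every attribute as unmapped unless listed are simplifying assumptions, not the LRX data structure.

```python
import xml.etree.ElementTree as ET

DOC = """<contact>
  <!-- added by Tom -->
  <Address source="corporate">12 Main St</Address>
  <!-- Need to review addresses -->
</contact>"""


def parse_keeping_comments(text):
    # The default parser drops comments; insert_comments=True needs Python 3.8+.
    parser = ET.XMLParser(target=ET.TreeBuilder(insert_comments=True))
    return ET.fromstring(text, parser=parser)


def build_delta(root, mapped_attributes=frozenset()):
    """Collect unmapped comments and attributes, keyed by anchor."""
    delta = {root.tag: []}   # dicts keep insertion order, so anchors stay ordered
    pending = []             # comments seen since the last mapped element
    for child in root:
        if child.tag is ET.Comment:
            pending.append(child.text.strip())
            continue
        entries = delta.setdefault(child.tag, [])
        entries.extend(("Before", "Comment", text) for text in pending)
        pending = []
        for name, value in child.attrib.items():
            if name not in mapped_attributes:
                entries.append(("Start", "Attribute", f'{name}="{value}"'))
    # Comments after the last element hang off the parent, just before its end tag.
    delta[root.tag].extend(("BeforeEnd", "Comment", text) for text in pending)
    return delta


delta = build_delta(parse_keeping_comments(DOC))
```

Keying on the tag name merges repeated elements into one anchor, so a fuller version would key on the mapped member and its position instead.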

Updates
AKA: What Does Full Fidelity Get Us

• Re-serialization is always an option
– Repackage the entire XML document and overwrite

• In-place updates may be substantially faster
– Oracle, SQL Server, DB2 support in-place updates

• In-place updates based on XPath/XQuery
– Insert new node, replace existing node, delete node
– Inserts are relative to an existing node in tree
– After, before, as first, as last

Relative Location

Client Side (Objects):

Contact: Address, Phone[5]
Person: Prefix, First_Name, Last_Name

Store Side (XML Schema):

type Contact:
  Address?
  sequence (1..5)
    phone_type
    number*
  Address*
  Name
    @Prefix
    First_Name
    Last_Name
  BusinessInfo
    Business_Name

First_Name = Bob
Phone[1] = new PhoneType (…)

• After? Before? As first?
• Correct location depends on pre-existing data
• Schema is insufficient
• Use delta representation to determine correct placement
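
One way to picture the placement decision is the Python sketch below: given the anchors that already exist in the stored document (as a delta would record them) and the schema order of anchors, it picks an insert position for a new node. The anchor names, `SCHEMA_ORDER`, and the `placement` function are hypothetical; the real system may choose among after, before, as first, and as last differently.

```python
# Schema order of the anchors inside Contact (hypothetical anchor names).
SCHEMA_ORDER = ["Address[1]", "Phone[1]", "Phone[2]", "Address[2]", "Name"]


def placement(existing_anchors, new_anchor, schema_order=SCHEMA_ORDER):
    """Pick an insert position for new_anchor relative to nodes already present."""
    rank = {anchor: index for index, anchor in enumerate(schema_order)}
    preceding = [a for a in existing_anchors if rank[a] < rank[new_anchor]]
    if preceding:
        # Insert after the closest existing predecessor in schema order.
        return ("after", max(preceding, key=rank.__getitem__))
    # Nothing that should precede the new node exists yet.
    return ("as first", None)


# The same update lands in different places depending on the stored data.
with_address = placement(["Address[1]", "Name"], "Phone[1]")
without_address = placement(["Name"], "Phone[1]")
```

The two calls show why the schema alone is insufficient: the optional leading Address changes the answer from "as first" to "after Address[1]".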

Performance

• XMark queries over partially shredded data (4GB)
– Q1: Simple paths
– Q5: Aggregation and filtering
– Q9: Joins
– QN: Variant of Query Q6, Descendant axis

• Query 6 needed to be re-written because the interesting part of Q6 had been shredded

• LRX versus bringing data to client first
– Currently, only other option is manual XQuery

Example: Q5

• With LRX

var c = (from o in db.ClosedAuctions
         where o.closed_auction.price >= 40
         select o.auto_id).Count();

• Without LRX

var q = from o in db.ClosedAuctions
        select o.closed_auction;                   // data is pulled to client
int i = 0;
foreach (var o in q)
    if ((decimal)o.Element("price") >= 40) i++;    // filter and count on client

Takeaway: Benefit from Pushing Operations to Server

• Results are in seconds per 100 runs
• Blue bars are LRX, green bars are without
• Tried with cold (C) and hot (H) page buffers

Conclusion and Future Directions

• Query optimization
– Pushing operations to either relational or XML

• Keyrefs Object pointers
• Queries/updates directly to delta
• LRX versus Lorax

Attention, my VLDB attendees!
Our system is LRX, it speaks for the trees
The XML trees overlooked by the tools
That follow the object-relational rules

Of course, one can always resort to XQuery
But FLWOR’s the deed that makes optimists dreary
We leave all relational portions pristine
But add new components for XML seen

LRX takes fragments on schema expressed
And compiles them to queries whose structures suggest
How to draw the right data from trees intertwined
And pack into objects of custom design

But what of the stuff ’twixt the elements fall?
The comments, the whitespace, the order of xsd:all?
Our LRX tucks all of that data away
In a structure called “delta”, an indexed array

We draw from the keys in the delta in case
We must locate the space to do updates in place
When queries or updates on clients arrive
Native XQuery does LRX contrive

Inspection of query performance has shown
That LRX is faster than client alone
This is how we make objects of stored XML
My talk is now done, so I bid you farewell

Thanks!

var q = from c in db.ClosedAuctions
        where c.closed_auction.price > 20
        from t in c.closed_auction.annotation.description.text.Descendants("emph")
        select t;

WITH XMLNAMESPACES('http://.../Auction' AS a)
SELECT T
FROM (SELECT T
      FROM ClosedAuctions C,
           SEQUENCE(C.closed_auction, 'a:auction/a:annotation/a:description/a:text//emph') AS T)
WHERE VALUE(C.closed_auction, 'a:auction/a:price', int) > 20

Supplemental Slides

Escape Hatch: LINQ-to-XML

• xsd:anyType, mixed content
– Or a preference for the XPath model

• Map to class XElement
• XPath-like interface to XML-like data
– Each method invocation translated into XPath on server

from c in db.ClosedAuctions
where c.closed_auction.price > 20
from t in c.closed_auction.annotation.description.text   // object is of type XElement
           .Descendants("emph")                          // pushed to server as corresponding XPath axis
select t;

Queries and Query Translation: Conditions

where e.foo == "bar"
– If conditions cover a client-side mapping fragment condition, translate into the corresponding store-side XML conditions

where e.foo.Contains("bar")
– If method has an XML analog, translate into the analogous XQuery function

where e.foo is barType
– Find fragment for barType, then translate into type or element conditions on XML
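
The three cases can be sketched as a small Python function that turns a client-side condition into an XPath predicate string. The condition kinds, the function names, and the output shapes are illustrative assumptions; the slides do not give the exact text LRX emits.

```python
def xpath_literal(value):
    """Quote a string for XPath 1.0, which has no escape character inside literals."""
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    raise ValueError("value contains both quote characters")


def translate_condition(kind, path, argument):
    """Translate one client-side condition into an XPath step with a predicate."""
    if kind == "equals":        # where e.foo == "bar"
        return f"{path}[. = {xpath_literal(argument)}]"
    if kind == "contains":      # where e.foo.Contains("bar")
        return f"{path}[contains(., {xpath_literal(argument)})]"
    if kind == "is_type":       # where e.foo is barType
        # argument: the store-side existence condition taken from barType's fragment
        return f"{path}[{argument}]"
    raise ValueError(f"unsupported condition: {kind}")


city_filter = translate_condition("contains", "address/city", "Port")
```

A method with no XML analog would raise here; the slides do not say how LRX handles that case.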

[Architecture diagram; labels recovered from the figure]

• Client-side object space: object queries and updates via LINQ
• Relation-mapped classes and XML-mapped classes
• O-R mappings and O-X mappings
• Translate XML-mapped references to placeholder (PH) functions
• Package queries and updates into abstract trees, then transform by applying mappings
• Translation to vendor-specific SQL syntax
• Providers: DB2 Provider, SS Provider, Ora Provider, each labeled "PH XQuery"
• Stores: DB2, SQL Server, Oracle
• Build objects from query results
• Shred XML into objects according to object type and mappings

