Date post: 21-Dec-2015
The Complexity of XPath Evaluation. Paper By: Georg Gottlob, Christoph Koch, Reinhard Pichler. Presented By: Royi Ronen
Transcript
Page 1

The Complexity of XPath Evaluation

Paper By: Georg Gottlob, Christoph Koch, Reinhard Pichler

Presented By: Royi Ronen

Page 2

Introduction

• All major XPath evaluation algorithms in use at the time of the paper run in exponential time in the worst case.

• Paper’s main goals:
– Prove that the “XPath problem” is P-complete.
– Prove that other related problems are LOGCFL-complete.

Page 3

XPath – Quick Reminder

• XPath is a query language for XML documents.

• Navigating through a document:

/descendant::a/child::b selects nodes named “b” whose parent is named “a”.

• Testing nodes:

/descendant::a/child::b[@c=3] requires that b’s attribute c equals 3.
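
The two queries above can be tried out with a minimal sketch in Python. It assumes the standard-library `xml.etree.ElementTree` module, which supports only a limited XPath subset and abbreviated syntax, so `.//a/b` stands in for `/descendant::a/child::b`. The sample document is made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document, not taken from the slides.
doc = ET.fromstring(
    "<root>"
    "<a><b c='3'>first</b><b c='4'>second</b></a>"
    "<x><b c='3'>not under a</b></x>"
    "</root>"
)

# /descendant::a/child::b : every b whose parent is an a.
all_b = doc.findall(".//a/b")

# /descendant::a/child::b[@c=3] : additionally require attribute c to be 3.
# ElementTree compares attribute values as strings, hence the quotes.
b_with_c3 = doc.findall(".//a/b[@c='3']")

print([b.text for b in all_b])
print([b.text for b in b_with_c3])
```

The b element under x is not selected by either query, because its parent is not named a.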

Page 4

Sketch: How P-Completeness is proven

• In order to prove P-Completeness of a problem, we have to prove:
– Membership in P;
– P-Hardness.

[Figure: Venn diagram of P and P-Hard; P-Complete is their intersection.]

Page 5

XPath is P-Complete

• Sketch:
1. Membership of XPath in P is already proven (by the same authors).
2. P-Hardness of XPath will be proven by reduction from the monotone circuit problem (which is known to be P-Complete) to Core XPath (a subset of XPath with its main features).

• Why is it enough? Every Core XPath query is also an XPath query, so a reduction to Core XPath is a reduction to XPath as well.

Page 6

Monotone Boolean Circuit Problem

• A Monotone Boolean circuit is a circuit with many inputs and one output that uses the following Boolean gates only:
– AND
– OR
– DUMMY

• Given a circuit and its inputs, the problem is to determine the output.

• The problem is P-Complete.
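
A minimal sketch of the circuit value problem, assuming a hypothetical encoding that is not from the slides: inputs are a list of booleans, and each gate is a pair (kind, sources) whose sources are 0-based indices of earlier wires. Evaluating gates in that order solves the problem in one sequential pass.

```python
def eval_monotone_circuit(inputs, gates):
    """Return the value of the last gate.

    inputs: list of bools (the input wires).
    gates:  list of (kind, sources); sources index earlier wires, 0-based.
    """
    values = list(inputs)
    for kind, sources in gates:
        operands = [values[s] for s in sources]
        if kind == "AND":
            result = all(operands)
        elif kind == "OR":
            result = any(operands)
        elif kind == "DUMMY":
            (result,) = operands  # a dummy gate forwards its single input
        else:
            raise ValueError(f"not a monotone gate: {kind}")
        values.append(result)
    return values[-1]


# Hypothetical circuit: DUMMY(OR(x1, AND(x0, x2)))
example_gates = [("AND", [0, 2]), ("OR", [1, 3]), ("DUMMY", [4])]
print(eval_monotone_circuit([True, False, True], example_gates))
print(eval_monotone_circuit([True, False, False], example_gates))
```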

Page 7

A Monotone Boolean Circuit

• Item 3 in the handout:

Page 8

Core XPath - Definition

XPath has many features, which makes it inconvenient for theoretical treatment. Therefore Core XPath, a subset of XPath with its main features, is defined by the following grammar (Item 1 in the handout):

locpath ::= ‘/’ locpath | locpath ‘/’ locpath | locpath ‘|’ locpath | locstep.

locstep ::= axis ‘::’ ntst ‘[’ bexpr ‘]’ . . . ‘[’ bexpr ‘]’.

bexpr ::= bexpr ‘and’ bexpr | bexpr ‘or’ bexpr | ‘not(’ bexpr ‘)’ | locpath.

axis ::= ‘self’ | ‘child’ | ‘parent’ | ‘descendant’ | ‘descendant-or-self’ | ‘ancestor’ | ‘ancestor-or-self’ | ‘following’ | ‘following-sibling’ | ‘preceding’ | ‘preceding-sibling’.
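
One possible way to hold such queries in memory is sketched below. The class names (`Step`, `Path`, `Not`, `BinOp`) and the `render` helper are hypothetical and chosen for illustration; they only mirror the grammar's productions and are not taken from the paper.

```python
from dataclasses import dataclass, field

AXES = {
    "self", "child", "parent", "descendant", "descendant-or-self",
    "ancestor", "ancestor-or-self", "following", "following-sibling",
    "preceding", "preceding-sibling",
}


@dataclass
class Step:  # locstep ::= axis '::' ntst '[' bexpr ']' ...
    axis: str
    ntst: str
    preds: list = field(default_factory=list)

    def __post_init__(self):
        if self.axis not in AXES:
            raise ValueError(f"unknown axis: {self.axis}")


@dataclass
class Path:  # locpath '/' locpath, optionally with a leading '/'
    steps: list
    absolute: bool = False


@dataclass
class Not:  # 'not(' bexpr ')'
    arg: object


@dataclass
class BinOp:  # 'and' / 'or' between bexprs, '|' between locpaths
    op: str
    left: object
    right: object


def render(e):
    """Turn the tree back into query text."""
    if isinstance(e, Step):
        return f"{e.axis}::{e.ntst}" + "".join(f"[{render(p)}]" for p in e.preds)
    if isinstance(e, Path):
        return ("/" if e.absolute else "") + "/".join(render(s) for s in e.steps)
    if isinstance(e, Not):
        return f"not({render(e.arg)})"
    if isinstance(e, BinOp):
        return f"{render(e.left)} {e.op} {render(e.right)}"
    raise TypeError(f"not a Core XPath node: {e!r}")


query = Path(
    [Step("descendant", "a"), Step("child", "b", [Not(Step("child", "c"))])],
    absolute=True,
)
print(render(query))
```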

Page 9

The Corresponding Languages

• The paper shows direct reductions between the problems.

• We will show the same reduction, but between the corresponding languages, since it is the methodology used in the Technion Computability course.

• The proofs are equivalent.

Page 10

The Corresponding Languages

• L-Core XPath:

{(Q,D) | Q is a Core XPath query, D is a valid document and Q yields a non-empty result when run on D}

• L-Monotone Circuit:

{(C,I) | C is a monotone circuit, I is a set of inputs to C and C evaluates to 1 when run on I}

Page 11

The Reduction

• Reduction is our tool to prove that one language is at least as hard as another.

• Here we will show that L-Monotone Circuit is reducible to L-Core XPath. This proves that L-Core XPath is at least as hard as L-Monotone Circuit, and is therefore P-Hard.

• Given (C,I), we have to build (Q,D) such that Q yields a nonempty result on D iff C evaluates to 1 on I.

Page 12

The circuit layered

• An equivalent monotone circuit, in which only one non-dummy gate exists in every layer (Item 4 in the handout).

• The gates are ordered; data can flow from lower-indexed to higher-indexed gates only.
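
A sketch of how such a numbering could be obtained, assuming the circuit is given as a hypothetical dictionary from each gate name to the wires feeding it. It uses `graphlib.TopologicalSorter` from the Python standard library (Python 3.9 or later) and only produces the ordering; it does not insert the dummy gates that the layering needs.

```python
from graphlib import TopologicalSorter


def number_wires(inputs, preds):
    """Number inputs 1..M first, then gates so that every gate comes after its inputs.

    inputs: list of input names.
    preds:  dict mapping each gate name to the names of the wires feeding it.
    """
    topo = TopologicalSorter(preds).static_order()  # predecessors come first
    gates = [w for w in topo if w in preds]         # drop the input wires
    return {name: i for i, name in enumerate(list(inputs) + gates, start=1)}


# Hypothetical circuit: out = OR(g1, x3), g1 = AND(x1, x2)
example_preds = {"out": ["g1", "x3"], "g1": ["x1", "x2"]}
index = number_wires(["x1", "x2", "x3"], example_preds)
print(index)
```

With this numbering every wire feeding a gate has a smaller index than the gate itself, which is the property the reduction relies on.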

Page 13

Q and D

• D is built as follows:

M inputs (here M=4) and N non-input gates (here N=5).

Total of 2(M+N)+1 nodes.

Nodes are tagged from the alphabet {0, 1, R, G, Ii, Oi}, where i is from {1,2,…,N}.

Page 14

Tagging Rules

• V1-VM are each tagged with their input value, i.e. 0 or 1.

• V_{M+N} is tagged R; every Vi (including V_{M+N}) is tagged G.

• If gate Gi is an input to gate G_{M+k} (i<M+k), Ik is added to Vi and Ok is added to V_{M+k}.

• V’1..V’M are tagged Ii and Oi, where i is in {1,..,N}.

• V’_{M+i} are tagged Ik and Ok, where k is in {i,..,N}.

These tags will be used by the query.
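
The tagging rules can be sketched as a small function. The input encoding is an assumption made for illustration: `input_values` lists the 0/1 values of V1..VM, and `gate_inputs[k-1]` lists the 1-based indices i of the wires feeding gate G_{M+k}. The function returns the tag sets only and says nothing about the shape of the tree.

```python
def build_tags(input_values, gate_inputs):
    """Return (v, v_prime): tag sets for V1..V(M+N) and V'1..V'(M+N), keyed by index."""
    M, N = len(input_values), len(gate_inputs)

    v = {i: {"G"} for i in range(1, M + N + 1)}  # every Vi is tagged G
    for i, bit in enumerate(input_values, start=1):
        v[i].add(str(bit))                        # inputs carry their value
    v[M + N].add("R")                             # the output gate is tagged R

    for k, sources in enumerate(gate_inputs, start=1):
        v[M + k].add(f"O{k}")
        for i in sources:
            if not 1 <= i < M + k:
                raise ValueError("gates may only read lower-indexed wires")
            v[i].add(f"I{k}")

    v_prime = {}
    for i in range(1, M + N + 1):
        first = 1 if i <= M else i - M            # V'(M+i) starts at k = i
        v_prime[i] = {f"{t}{k}" for k in range(first, N + 1) for t in ("I", "O")}
    return v, v_prime


# Two inputs (1 and 0) feeding a single gate.
v, v_prime = build_tags([1, 0], [[1, 2]])
print(v)
print(v_prime)
```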

Page 15

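A Simple Example

[Figure: a circuit C with two inputs (1 and 0) feeding a single gate G1 with output O1, next to the document D built from it. D consists of V0, V1, V2, V3, V’1, V’2 and V’3. V1 and V2 are tagged G and I1 plus their input values 1 and 0; V3 is tagged G, R and O1; V’1, V’2 and V’3 are each tagged I1 and O1.]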

Page 16

The Query

• The query in the output of the reduction is:

Q := /descendant-or-self::*[T(R) and φ_N]

φ_k := descendant-or-self::*[T(Ok) and parent::*[ψ_k]]

ψ_k := not(child::*[T(Ik) and not(π_k)]) if G_{M+k} is an AND gate

ψ_k := child::*[T(Ik) and π_k] if G_{M+k} is an OR/DUMMY gate

π_k := ancestor-or-self::*[T(G) and φ_{k-1}]

φ_0 := T(1) (end of recursion)

• ψ_k evaluates gate k: it selects V0 iff all (one of) the gate’s inputs are (is) 1 and the gate is “AND” (“OR”).

• π_k pushes results down.

• The reduction can be achieved in logarithmic space.
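
A sketch of how the query text could be generated from the list of gate types. The function name `build_query` is hypothetical, and the label test T(...) is left as an abstract placeholder, as on the slide. The names phi, psi and pi refer to the three recursive sub-queries; each level embeds exactly one copy of the previous level, so the query grows linearly with the number of gates.

```python
def build_query(gate_kinds):
    """gate_kinds[k-1] is "AND", "OR" or "DUMMY" for gate G_{M+k}."""
    phi = "T(1)"  # end of recursion
    for k, kind in enumerate(gate_kinds, start=1):
        # Push the results of the previous level down.
        pi = f"ancestor-or-self::*[T(G) and {phi}]"
        if kind == "AND":
            # No input of gate k may be false.
            psi = f"not(child::*[T(I{k}) and not({pi})])"
        elif kind in ("OR", "DUMMY"):
            # Some input of gate k must be true.
            psi = f"child::*[T(I{k}) and {pi}]"
        else:
            raise ValueError(f"not a monotone gate: {kind}")
        phi = f"descendant-or-self::*[T(O{k}) and parent::*[{psi}]]"
    return f"/descendant-or-self::*[T(R) and {phi}]"


print(build_query(["AND"]))
print(len(build_query(["AND"] * 10)), len(build_query(["AND"] * 20)))
```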

Page 17

Sub-queries Meaning

π_k := ancestor-or-self::*[T(G) and φ_{k-1}]
Returns the nodes of the previous iteration and their tagged children, i.e. pushes results “down” by including the children.

ψ_k := not(child::*[T(Ik) and not(π_k)])
Returns the root iff all the inputs to gate k are true, in an AND gate.

ψ_k := child::*[T(Ik) and π_k]
Returns the root iff at least one of the inputs to gate k is true, in an OR gate. In both cases, returns the nodes that represent gates that were previously evaluated to true.

φ_k := descendant-or-self::*[T(Ok) and parent::*[ψ_k]]
Includes Vk iff the root was returned by the previous sub-query.

Q := /descendant-or-self::*[T(R) and φ_N]
Returns the rightmost node iff the output gate is evaluated to true. (No other gate is tagged R.)
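
The meaning of the sub-queries can be checked on the two-input, one-gate example with a sketch that computes each sub-query as a set of nodes. Two things here are assumptions rather than statements from the slides: the names phi, psi and pi for the sub-queries, and the tree shape (V0 is the root, V1..V3 are its children, each Vi has the single child V'i), which is inferred from the descriptions above.

```python
def make_doc(x1, x2):
    """Document for a circuit with inputs x1, x2 feeding one gate (assumed tree shape)."""
    parent = {"V1": "V0", "V2": "V0", "V3": "V0",
              "V'1": "V1", "V'2": "V2", "V'3": "V3"}
    tags = {
        "V0": set(),
        "V1": {"G", "I1", str(x1)},
        "V2": {"G", "I1", str(x2)},
        "V3": {"G", "O1", "R"},
        "V'1": {"I1", "O1"}, "V'2": {"I1", "O1"}, "V'3": {"I1", "O1"},
    }
    return parent, tags


def evaluate(parent, tags, gate_kinds):
    """Return the set of nodes selected by the reduction's query."""
    nodes = list(tags)
    children = {n: [c for c, p in parent.items() if p == n] for n in nodes}

    def anc_or_self(n):
        while n is not None:
            yield n
            n = parent.get(n)

    phi = {n for n in nodes if "1" in tags[n]}  # phi_0 := T(1)
    for k, kind in enumerate(gate_kinds, start=1):
        # pi_k: nodes with an ancestor-or-self tagged G that satisfies phi_{k-1}
        pi = {n for n in nodes
              if any("G" in tags[a] and a in phi for a in anc_or_self(n))}
        tagged = {n: [c for c in children[n] if f"I{k}" in tags[c]] for n in nodes}
        if kind == "AND":
            psi = {n for n in nodes if all(c in pi for c in tagged[n])}
        else:  # OR / DUMMY
            psi = {n for n in nodes if any(c in pi for c in tagged[n])}
        # phi_k: nodes with a descendant-or-self tagged Ok whose parent satisfies psi_k
        hits = [n for n in nodes if f"O{k}" in tags[n] and parent.get(n) in psi]
        phi = {a for w in hits for a in anc_or_self(w)}
    return {n for n in nodes if "R" in tags[n] and n in phi}


print(evaluate(*make_doc(1, 0), ["AND"]))
print(evaluate(*make_doc(1, 1), ["AND"]))
print(evaluate(*make_doc(1, 0), ["OR"]))
```

Under these assumptions the query result is non-empty exactly when the gate outputs 1, which is the property the reduction needs.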

Page 18

The Query - Example

[Figure: the document D of the simple example (V0, V1, V2, V3, V’1, V’2, V’3 with their tags), with V1 and V2 carrying the input values 1 and 0.]

Q := /descendant-or-self::*[T(R) and φ_1]

φ_1 := descendant-or-self::*[T(O1) and parent::*[ψ_1]]

ψ_1 := not(child::*[T(I1) and not(π_1)])

π_1 := ancestor-or-self::*[T(G) and T(1)]

Page 19

Discussion

It is enough to show that:

Vi ∈ [[φ_k]] iff Gi evaluates to true

Reason: T(R) is true for the rightmost node only.

If the last gate evaluates to 1, then the result of the query consists of that node, and (Q,D) is in L-Core XPath.

Otherwise, the result is empty, and (Q,D) is not in L-Core XPath.

Page 20

Tagged Tree Example

For C in the handout (gate types, in order: and, and, and, and, or).

Node: tags
V1: 1, G, I2, I3
V2: 1, G, I2, I4
V3: 0, G, I1
V4: 0, G, I1
V5: G, O1, I3, I4
V6: G, O2, I5
V7: G, O3, I5
V8: G, O4, I5
V9: G, O5, R
V’1 to V’5: I1-I5, O1-O5
V’6: I2-I5, O2-O5
V’7: I3-I5, O3-O5
V’8: I4-I5, O4-O5
V’9: I5, O5

Page 21

Discussion

• [[φ_k]] consists of the values of the k nodes in layer k of the circuit.

• It can also be viewed as the situation at the k-th tick of a clock in a synchronous system.

• Proof:

Vi ∈ [[φ_k]] iff Gi evaluates to true

Page 22

Despite P-Completeness

• Problems that are P-Complete are believed to be inherently sequential (unless P = NC), and are thus not expected to benefit much from parallelization.

• However, for real-world use, it may be very useful to find subsets of the problem and classify them into lower complexity classes (easier problems).

• Does anyone recall a well known problem that can benefit from such manipulation?

• The paper continues by looking for restricted versions of the problem that fall into lower complexity classes.

Page 23

First Modification Trial

• Only the axes child, parent and descendant-or-self are allowed.

• The modification doesn’t yield lower complexity. The same reduction will work after changing:

ancestor-or-self::*

to

descendant-or-self::*/parent::*
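
A small sanity check of this substitution on the two-input, one-gate example document. The tree shape is an assumption (V0 is the root, V1..V3 are its children, each Vi has the single child V'i), and the comparison is restricted to nodes tagged G, since in the reduction's query the rewritten step is always followed by the test T(G).

```python
# Assumed tree shape of the example document (not stated on the slides).
parent = {"V1": "V0", "V2": "V0", "V3": "V0",
          "V'1": "V1", "V'2": "V2", "V'3": "V3"}
nodes = ["V0", *parent]
tagged_g = {"V1", "V2", "V3"}


def ancestor_or_self(x):
    out = set()
    while x is not None:
        out.add(x)
        x = parent.get(x)
    return out


def descendant_or_self(x):
    return {n for n in nodes if x in ancestor_or_self(n)}


def via_parent(x):
    """Nodes reached by descendant-or-self::*/parent::* from x."""
    return {parent[d] for d in descendant_or_self(x) if d in parent}


# The sub-query containing this step is only ever evaluated at non-root nodes.
agree = all(
    ancestor_or_self(x) & tagged_g == via_parent(x) & tagged_g
    for x in parent
)
print(agree)
```

From the root the two expressions would differ, but the sub-query containing this step sits inside a child::* step, so it is never evaluated at the root.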

Page 24

Second Modification Trial

• Let Positive Core-XPath be Core-XPath without the queries that use negation.

• This problem is a member of LOGCFL.

• LOGCFL problems can be reduced in logarithmic space to a context-free language.

• Membership in LOGCFL implies that the problem can be parallelized efficiently (LOGCFL is contained in NC²). Segments do not depend on each other.

• The hardness reduction is very similar. It starts from the problem of evaluating semi-unbounded circuits.

Page 25

WF and Positive WF

• WF is a subset of XPath that allows Core-XPath, arithmetic operations, and conditions using position(), last() and constants.

• Where is WF?

• Positive WF is LOGCFL-Complete. The proof of hardness resembles the proof we have just seen.

Page 26

The Global Picture

Page 27

BACKUP

• BACKUP

Page 28

PF is NL-Complete

• PF is the problem of navigating through an XML document, with no conditions allowed.

• NL is the class of problems decidable by a nondeterministic Turing machine that uses logarithmic space.

• Proof: PF is NL-Complete.
– Membership in NL (by nondeterministic guessing)
– NL-Hardness
