Wolfgang Mulzer, Institut für Informatik

Data Structures on Event Graphs

Bernard Chazelle (Princeton University) and Wolfgang Mulzer (FU Berlin)

Bernard Chazelle and Wolfgang Mulzer – Data Structures on Event Graphs

It's the data

Data can be

- huge
- corrupted
- low-entropy
- expensive
- …

Rethink classical algorithms from a data-oriented perspective.


It's the data

We study a model that represents temporal locality of the data.


A concrete problem – successor search

Given: An ordered universe U of n elements

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10

Goal: maintain a subset S of U supporting successor queries

Operations: Insert(xi), Delete(xi), Successor(xi)

Also known as the Union-Split-Find problem.
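As a point of reference, the three operations can be sketched with a sorted Python list. This is only a baseline under stated assumptions: the class name SuccessorSet is hypothetical, elements are represented by integers, and Successor(x) is taken to mean the smallest element of S that is at least x (the slide does not say whether the inequality is strict). Updates cost O(n) here, so this does not achieve the bounds discussed in the talk.

```python
import bisect


class SuccessorSet:
    """Baseline dynamic subset S of an ordered universe (hypothetical helper)."""

    def __init__(self):
        self._items = []  # sorted list of the elements currently in S

    def insert(self, x):
        i = bisect.bisect_left(self._items, x)
        if i == len(self._items) or self._items[i] != x:
            self._items.insert(i, x)  # keep the list sorted, no duplicates

    def delete(self, x):
        i = bisect.bisect_left(self._items, x)
        if i < len(self._items) and self._items[i] == x:
            self._items.pop(i)

    def successor(self, x):
        """Smallest element of S that is >= x, or None (assumed convention)."""
        i = bisect.bisect_left(self._items, x)
        return self._items[i] if i < len(self._items) else None
```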


A concrete problem – successor search

Can be solved in O(log log n) time on a pointer machine.

[van Emde Boas, Kaas, Zijlstra 77]

This is optimal.

[Mehlhorn, Näher, Alt 88], [Pătraşcu, Thorup 06]


Event graphs

Given: An ordered universe U of n elements and a labeled, connected, undirected graph G

[Figure: an event graph G with nodes labeled Ix0, Ix7, Dx9, Dx2, Sx7, Ix9, Sx2, Ix5]

Each node of G is labeled with an operation Ixi (insert), Dxi (delete), or Sxi (successor)

G is known in advance

G can be preprocessed

Adversary walks on G to perform ops

Similar to Markov chains
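A minimal sketch of the model, assuming a representation that the slides do not prescribe: nodes are integers, each node carries a label (op, element) with op in {"I", "D", "S"}, and edges are stored as adjacency sets. The names EventGraph and run_walk are hypothetical, a plain Python set stands in for the data structure, the operation at the first node of the walk is assumed to be executed as well, and a successor query is assumed to return the smallest element of S that is at least the queried one.

```python
class EventGraph:
    """Undirected graph whose nodes are labeled with operations (hypothetical)."""

    def __init__(self, labels, edges):
        self.labels = dict(labels)  # node -> (op, element)
        self.adj = {v: set() for v in self.labels}
        for u, v in edges:
            self.adj[u].add(v)
            self.adj[v].add(u)


def run_walk(graph, walk):
    """Execute the operations along a walk; return the successor answers."""
    current = set()  # the set S, initially empty
    answers = []
    for i, v in enumerate(walk):
        if i > 0 and v not in graph.adj[walk[i - 1]]:
            raise ValueError("the walk must follow edges of G")
        op, x = graph.labels[v]
        if op == "I":
            current.add(x)
        elif op == "D":
            current.discard(x)
        else:  # "S": naive successor query over the current set
            answers.append(min((y for y in current if y >= x), default=None))
    return answers


# Hypothetical example: node 1 (a query for 2) is adjacent to all other nodes.
g = EventGraph({0: ("I", 9), 1: ("S", 2), 2: ("D", 9), 3: ("I", 5)},
               [(0, 1), (1, 2), (1, 3)])
print(run_walk(g, [0, 1, 3, 1, 2, 1]))  # prints [9, 5, 5]
```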



Decorated graphs

The walk of the adversary induces a walk on a much bigger graph.

Decorated graph dec(G): directed graph with vertex set V(G) × Pow(U). Represents the current node of G together with the current set S.

[Figure: the event graph G with nodes labeled Ix0, Ix7, Dx2, Sx7, Ix9, Sx2, Ix5, Dx9, together with sample nodes of dec(G): (Dx2, ∅), (Sx2, ∅), (Ix9, {x9}), (Ix5, {x5, x9})]


Decorated graphs



If dec(G) is available, we can perform all operations in constant time.

But: The size of dec(G) is exponential.
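To make the blow-up concrete, here is a small sketch that enumerates the nodes of dec(G) reachable from a start node by breadth-first search. It assumes that an edge goes from (v, S) to (w, S') whenever w is a neighbor of v in G and S' results from applying the operation at w to S; this edge rule, the function names, and the star-shaped example graph are assumptions made for illustration.

```python
from collections import deque


def apply_op(label, s):
    """Return the set obtained from frozenset s by the labeled operation."""
    op, x = label
    if op == "I":
        return s | {x}
    if op == "D":
        return s - {x}
    return s  # a successor query leaves S unchanged


def reachable_decorated_nodes(labels, adj, start):
    """All pairs (node, set) reachable when the walk starts at `start`."""
    first = (start, apply_op(labels[start], frozenset()))
    seen = {first}
    queue = deque([first])
    while queue:
        v, s = queue.popleft()
        for w in adj[v]:
            nxt = (w, apply_op(labels[w], s))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def star_graph(k):
    """Hub 0 is a query; for each element 1..k there is an I leaf and a D leaf."""
    labels = {0: ("S", 1)}
    adj = {0: set()}
    for x in range(1, k + 1):
        for op in ("I", "D"):
            leaf = len(labels)
            labels[leaf] = (op, x)
            adj[leaf] = {0}
            adj[0].add(leaf)
    return labels, adj
```

For this star with k = 3 elements, G has 7 nodes while 32 nodes of dec(G) are reachable from the hub; in general the count for this example is (k + 1) · 2^k, since every subset can occur at the hub.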


Decorated graphs


Questions:

- What can we say about the structure of dec(G)?

- What can we deduce about dec(G), given G?

- In which cases can dec(G) be compressed efficiently?


The structure of decorated graphs

dec(G) contains a unique strongly connected component that has no exit and is reachable from every other node.

This component is called the unique sink.

[Figure: strongly connected components C1, C2, C3, C4 of dec(G)]
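The statement about the sink can be checked by brute force on a tiny instance. The sketch below builds all of dec(G) for a hypothetical three-node path over a two-element universe and collects the nodes x such that every node reachable from x can reach x back; these are exactly the nodes lying in components without an exit. It assumes an edge from (v, S) to (w, S') whenever w is adjacent to v in G and S' is S after the operation at w. All names are illustrative, and the quadratic brute-force search is not the linear-time method of the talk.

```python
from itertools import combinations


def apply_op(label, s):
    """Return the set obtained from frozenset s by the labeled operation."""
    op, x = label
    if op == "I":
        return s | {x}
    if op == "D":
        return s - {x}
    return s  # a successor query leaves S unchanged


def decorated_graph(labels, adj, universe):
    """Successor map of dec(G) on the full vertex set V(G) x Pow(U)."""
    subsets = [frozenset(c) for r in range(len(universe) + 1)
               for c in combinations(universe, r)]
    return {(v, s): {(w, apply_op(labels[w], s)) for w in adj[v]}
            for v in labels for s in subsets}


def reach(succ, start):
    """Set of nodes reachable from `start` (including `start`)."""
    seen = {start}
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in succ[node]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def sink_nodes(succ):
    """Nodes x such that every node reachable from x can reach x again."""
    reachable = {x: reach(succ, x) for x in succ}
    return {x for x in succ if all(x in reachable[y] for y in reachable[x])}


# Hypothetical path 0 - 1 - 2 with labels I1, D2, I2 over the universe {1, 2}.
labels = {0: ("I", 1), 1: ("D", 2), 2: ("I", 2)}
adj = {0: {1}, 1: {0, 2}, 2: {1}}
dec = decorated_graph(labels, adj, [1, 2])
sink = sink_nodes(dec)
```

In this example dec(G) has 12 nodes and the sink consists of the three nodes (0, {1}), (1, {1}), and (2, {1, 2}): once the walk has visited node 0, element 1 stays in S forever.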


The structure of decorated graphs

Theorem: Given a node v ∈ V(G) and a set S ⊆ U, we can decide in time O(|V(G)| + |E(G)|) whether (v, S) lies in the unique sink.

Proof idea: We show that for every node in the unique sink there exists a unique certificate in G (a certifying walk).

A modified graph search in G can be used to find a certifying walk for (v,S), if it exists.


Can the decorated graph be compressed?

Consider the case that G is a path.

Theorem: If G is a path, the successor problem can be solved in O(1) time per operation with O(n^(1+ε)) space on a word RAM, where n = |V|.

[Figure: a path G with nodes labeled Ix0, Ix7, Dx2, Sx7, Ix9, Sx2, Ix5, Dx9, shown in three animation steps]

Proof: Maintain S in a doubly linked list. Each node of G has a pointer to its predecessor or successor in S. Use this pointer to answer the queries. Only those pointers that will be relevant next need to be maintained. Use lookup tables to update them.
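The linked-list part of the proof can be sketched as follows. This is a hedged illustration, not the authors' implementation: the class name LinkedSet and its methods are hypothetical, elements are the integers 1..n, and the caller supplies the predecessor pointer that, in the proof, each node of G would store. The lookup tables that keep those pointers up to date along the path are not modeled here.

```python
class LinkedSet:
    """S as a doubly linked list over {1..n} with sentinels 0 and n + 1."""

    def __init__(self, n):
        self.n = n
        self.next = {0: n + 1}  # successor links, starting with an empty S
        self.prev = {n + 1: 0}  # predecessor links

    def insert_after(self, pred, x):
        """Insert x in O(1), given its predecessor pred in S (0 if none)."""
        succ = self.next[pred]
        if succ == x:
            return  # x is already in S
        if not pred < x < succ:
            raise ValueError("pred is not the predecessor of x in S")
        self.next[pred], self.prev[x] = x, pred
        self.next[x], self.prev[succ] = succ, x

    def delete(self, x):
        """Remove x (which must be in S) in O(1) by relinking its neighbors."""
        pred, succ = self.prev.pop(x), self.next.pop(x)
        self.next[pred] = succ
        self.prev[succ] = pred

    def successor_from(self, pred):
        """Answer a successor query in O(1) from the stored predecessor pointer."""
        succ = self.next[pred]
        return succ if succ <= self.n else None
```

With the pointer in hand every operation touches a constant number of links; the difficulty addressed by the theorem is keeping the right pointer available at each node of the path.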


Example

[Figure: a path with nodes labeled Dx3, Dx1, Dx2, Sx5, Sx8, Ix7, Dx9, Ix2, shown together with the elements x1, x3, x5, x7, x10]


Reducing the space requirement

A naïve implementation uses two lookup tables per node to update the pointers → O(n^2) space usage.

Can be improved to O(n^(1+ε)) space.

Approach: Use spatial decomposition and bootstrapping to compress the lookup tables (cf. [Crochemore et al., 2008]).


What about randomization?

We assumed an adversary.

But: What if the walk on the path is random?

Theorem: If the requests are generated by a random walk on a path, the successor problem can be solved in O(1) expected time per operation with O(n) space on a word RAM, where n=|V|.


Proof (sketch): Subdivide the path into segments of √n nodes. The random walk requires Ω(n) steps to leave a segment. Build the quadratic data structure once the walk enters the next segment. Use overlapping segments and deamortization techniques to make it work.
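The step count in the proof sketch can be sanity-checked with an exact computation, taking the segments to have about √n nodes (which matches the O(n) space bound for a quadratic-space structure per segment). Assume a simple symmetric random walk, where each neighbor is chosen with probability 1/2; the slides do not state this. On a segment with boundary positions 0 and L, the expected number of steps E[k] to reach a boundary from position k satisfies E[k] = 1 + (E[k-1] + E[k+1]) / 2 with E[0] = E[L] = 0. The hypothetical helper below solves this recurrence exactly.

```python
from fractions import Fraction


def expected_exit_steps(length):
    """Expected steps to reach position 0 or `length`, for every start 0..length.

    Writes E[k] = a[k] * t + b[k] with the unknown t = E[1], unrolls the
    recurrence E[k+1] = 2 * E[k] - E[k-1] - 2, and fixes t via E[length] = 0.
    """
    a = [Fraction(0), Fraction(1)]
    b = [Fraction(0), Fraction(0)]
    for k in range(1, length):
        a.append(2 * a[k] - a[k - 1])
        b.append(2 * b[k] - b[k - 1] - 2)
    t = -b[length] / a[length]
    return [a[k] * t + b[k] for k in range(length + 1)]
```

The solution is E[k] = k(L - k). For n = 100 and L = 10, a walk starting in the middle of a segment needs 25 = n/4 steps in expectation to leave it, which is why overlapping segments (so that the walk enters a new segment away from its boundary) leave enough time to build the next structure.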


What about more complicated graphs?

What if G is a tree, a grid, or something more complicated?

The path approach does not work any more.

[Figure: example event graphs that are not paths, with nodes labeled by operations such as Ix0, Ix7, Sx7, Sx2, Dx2, Ix9, Dx9, Ix5]

We conjecture that in this case the O(log log n) bound of van Emde Boas trees is optimal, but this remains open.


Conclusion and open problems

A new way to model request sequences to a data structure.

Can be applied to any data structuring problem.

More algorithmic questions on decorated graphs, e.g., can we estimate the size of the unique sink efficiently?

Can we prove lower bounds for the successor problem on general event graphs?


Thank you!
