Data Structures on Event Graphs

Bernard Chazelle (Princeton University) and Wolfgang Mulzer (FU Berlin, Institut für Informatik)


Post on 23-Feb-2016



TRANSCRIPT

Page 1: Data Structures on Event Graphs

Data Structures on Event Graphs

Bernard Chazelle (Princeton University)
Wolfgang Mulzer (FU Berlin, Institut für Informatik)

Page 2: Data Structures on Event Graphs

Bernard Chazelle and Wolfgang Mulzer – Data Structures on Event Graphs

It's the data

Data can be:
- huge
- corrupted
- low-entropy
- expensive

Rethink classical algorithms from a data-oriented perspective.

Page 3: Data Structures on Event Graphs


It's the data

Data can be:
- huge
- corrupted
- low-entropy
- expensive

We study a model that represents temporal locality of the data.

Page 4: Data Structures on Event Graphs


A concrete problem – successor search

Given: An ordered universe U of n elements

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10

Goal: maintain a subset S of U supporting successor queries.

Operations:
- Insert(xi)
- Delete(xi)
- Successor(xi)

Also known as the union-split-find problem.
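As a point of reference, the three operations can be sketched with a plain sorted list. This is only a baseline under stated assumptions, not the structure discussed in the talk: the class name `SortedSuccessor` is hypothetical, elements xi are represented by their index i, and Successor(xi) is assumed to return the smallest element of S that is at least xi.

```python
import bisect


class SortedSuccessor:
    """Baseline successor structure: S is kept as a sorted list of indices."""

    def __init__(self):
        self._items = []

    def insert(self, i):
        # Insert i unless it is already present.
        pos = bisect.bisect_left(self._items, i)
        if pos == len(self._items) or self._items[pos] != i:
            self._items.insert(pos, i)

    def delete(self, i):
        # Remove i if it is present.
        pos = bisect.bisect_left(self._items, i)
        if pos < len(self._items) and self._items[pos] == i:
            self._items.pop(pos)

    def successor(self, i):
        # Assumed convention: smallest element >= i, or None if there is none.
        pos = bisect.bisect_left(self._items, i)
        return self._items[pos] if pos < len(self._items) else None
```

Queries take O(log n) time here and updates can take linear time, so this only fixes the interface that the faster structures have to match.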

Page 5: Data Structures on Event Graphs


A concrete problem – successor search

Given: An ordered universe U of n elements

x1 x2 x3 x4 x5 x6 x7 x8 x9 x10

Can be solved in O(log log n) time on a pointer machine [van Emde Boas, Kaas, Zijlstra 77].

This is optimal [Mehlhorn, Näher, Alt 88], [Pătraşcu, Thorup 06].
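As a brief reminder of where the O(log log n) bound comes from: a van Emde Boas structure over a universe of size u does constant work and then recurses on a universe of size √u. The symbols T and u are notation introduced here for illustration, not taken from the slides.

```latex
T(u) = T(\sqrt{u}) + O(1)
\quad\Longrightarrow\quad
T(u) = O(\log \log u),
\qquad \text{since } u^{1/2^{j}} \le 2 \text{ once } j \ge \log_2 \log_2 u .
```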

Page 6: Data Structures on Event Graphs


Event graphs

Given: An ordered universe U of n elements and a labeled, connected, undirected graph G

[Figure: example event graph G with nodes labeled Ix0, Ix7, Dx9, Dx2, Sx7, Ix9, Sx2, Ix5]

G is labeled with operations Ixi, Dxi, Sxi

- G is known in advance
- G can be preprocessed
- Adversary walks on G to perform ops
- Similar to Markov chains
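The model can be sketched in a few lines. The encoding below is an assumption made for illustration: node names "a" to "d", the `labels` and `edges` dictionaries, and the function `run_walk` are all hypothetical, and a query Sxi is assumed to ask for the smallest element of S that is at least xi. The successor search itself is done naively, since the point is only to show how a walk on G turns into a request sequence.

```python
# Each node carries one operation ("I", "D" or "S") and the index i of x_i.
labels = {
    "a": ("I", 7),
    "b": ("S", 8),
    "c": ("I", 9),
    "d": ("D", 9),
}
# Undirected, connected graph as adjacency lists: the path a - b - c - d.
edges = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c"]}


def run_walk(labels, edges, walk):
    """Replay a walk on the event graph and return the answers to its queries."""
    current = set()
    answers = []
    for prev, node in zip([None] + walk, walk):
        # The adversary may only move along edges of G.
        if prev is not None and node not in edges[prev]:
            raise ValueError(f"{prev} -> {node} is not an edge of G")
        op, i = labels[node]
        if op == "I":
            current.add(i)
        elif op == "D":
            current.discard(i)
        else:
            # Naive successor query: smallest element >= i, or None.
            candidates = [j for j in current if j >= i]
            answers.append(min(candidates) if candidates else None)
    return answers
```

For the walk a, b, c, d, c, b the first query at b finds nothing at or above x8, while the second one returns x9, so `run_walk(labels, edges, ["a", "b", "c", "d", "c", "b"])` gives `[None, 9]`.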

Page 7: Data Structures on Event Graphs


Event graphs

G is labeled with operations Ixi, Dxi, Sxi

- G is known in advance
- G can be preprocessed
- Adversary walks on G to perform ops

[Figure: the universe x1, ..., x10 and the event graph G with nodes labeled Ix0, Ix7, Dx2, Sx7, Ix9, Sx2, Ix5, Dx9]

Page 8: Data Structures on Event Graphs


Decorated graphs

The walk of the adversary induces a walk on a much bigger graph.

Decorated graph dec(G): directed graph with vertex set V(G) × Pow(U). Represents current node of G + current set S.

[Figure: event graph G with nodes labeled Ix0, Ix7, Dx2, Sx7, Ix9, Sx2, Ix5, Dx9, and the decorated nodes (Dx2, ∅), (Sx2, ∅), (Ix9, {x9}), (Ix5, {x5, x9})]

Page 9: Data Structures on Event Graphs


Decorated graphs

The walk of the adversary induces a walk on a much bigger graph.

Decorated graph dec(G): directed graph with vertex set V(G) × Pow(U). Represents current node of G + current set S.

If dec(G) is available, we can perform all operations in constant time.

But: The size of dec(G) is exponential.
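The blow-up can be observed on a small example. The sketch below assumes that dec(G) has an edge from (v, S) to (w, S') whenever w is a neighbor of v in G and S' is the result of applying the operation at w to S; this rule is inferred from the example walk in the figure, not stated on the slides. The star-shaped graph built by `star_graph` and all function names are hypothetical.

```python
from collections import deque


def apply_op(label, current):
    """Apply the operation of a node to the frozenset `current`."""
    op, i = label
    if op == "I":
        return current | {i}
    if op == "D":
        return current - {i}
    return current  # "S" queries do not change the set


def reachable_decorated(labels, edges, start):
    """Breadth-first search over decorated nodes (v, S) reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        v, current = queue.popleft()
        for w in edges[v]:
            nxt = (w, frozenset(apply_op(labels[w], current)))
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen


def star_graph(k):
    """Query node "q" joined to an insert and a delete node for x_1..x_k."""
    labels = {"q": ("S", 1)}
    edges = {"q": []}
    for i in range(1, k + 1):
        for op in ("I", "D"):
            name = f"{op}{i}"
            labels[name] = (op, i)
            edges[name] = ["q"]
            edges["q"].append(name)
    return labels, edges
```

On this star graph with 2k + 1 nodes, every subset of {x1, ..., xk} can occur at q, and the search from ("q", frozenset()) finds (k + 1) · 2^k decorated nodes: 4, 12, 32, 80 for k = 1, 2, 3, 4.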

Page 10: Data Structures on Event Graphs


Decorated graphs

The walk of the adversary induces a walk on a much bigger graph.

Decorated graph dec(G): directed graph with vertex set V(G) × Pow(U). Represents current node of G + current set S.

Questions:
- What can we say about the structure of dec(G)?
- What can we deduce about dec(G), given G?
- In which cases can dec(G) be compressed efficiently?

Page 11: Data Structures on Event Graphs


The structure of decorated graphs

dec(G) contains a unique strongly connected component that has no exit and is reachable from every other node.

This component is called the unique sink.

[Figure: strongly connected components C1, C2, C3, C4 of dec(G)]
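For a tiny event graph, the unique sink can be found by brute force. The sketch below is an illustration under assumptions, not the method from the talk: it builds all of V(G) × Pow(U), uses the edge rule "move to a neighbor w and apply w's operation" (inferred from the decorated-graph example), and takes the sink to be the set of decorated nodes reachable from every decorated node. All names are hypothetical.

```python
from itertools import combinations


def all_subsets(universe):
    """All subsets of the universe, as frozensets."""
    items = sorted(universe)
    return [frozenset(c)
            for r in range(len(items) + 1)
            for c in combinations(items, r)]


def successors(labels, edges, node):
    """Out-neighbors of a decorated node (v, S) in dec(G)."""
    v, current = node
    out = []
    for w in edges[v]:
        op, i = labels[w]
        if op == "I":
            out.append((w, current | {i}))
        elif op == "D":
            out.append((w, current - {i}))
        else:
            out.append((w, current))
    return out


def reach(labels, edges, start):
    """All decorated nodes reachable from start (including start)."""
    seen, stack = {start}, [start]
    while stack:
        for nxt in successors(labels, edges, stack.pop()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def unique_sink(labels, edges, universe):
    """Decorated nodes that are reachable from every decorated node."""
    nodes = [(v, s) for v in labels for s in all_subsets(universe)]
    sink = set(nodes)
    for node in nodes:
        sink &= reach(labels, edges, node)
    return sink
```

On the path with labels Ix1, Sx1, Dx1 and U = {x1}, four of the six decorated nodes form the sink; (Ix1, ∅) and (Dx1, {x1}) are left out because no step of the walk can produce them. This brute force is exponential in |U| and is only meant to make the definition concrete.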

Page 12: Data Structures on Event Graphs


The structure of decorated graphs

Theorem: Given a node v ∈ V(G) and a set S ⊆ U, we can decide in time O(|V(G)| + |E(G)|) whether (v, S) lies in the unique sink.

Proof idea: We show that for every node in the unique sink there exists a unique certificate in G (a certifying walk).

A modified graph search in G can be used to find a certifying walk for (v,S), if it exists.

Page 13: Data Structures on Event Graphs


Can the decorated graph be compressed?

Consider the case that G is a path.

Theorem: If G is a path, the successor problem can be solved in O(1) time per operation with O(n^(1+ε)) space on a word RAM, where n=|V|.

[Figure: path G with nodes labeled Ix0, Ix7, Dx2, Sx7, Ix9, Sx2, Ix5, Dx9]


Page 15: Data Structures on Event Graphs


Can the decorated graph be compressed?

Theorem: If G is a path, the successor problem can be solved in O(1) time per operation with O(n^(1+ε)) space on a word RAM, where n=|V|.


Proof: Maintain S in a doubly linked list. Each node in G has a pointer to its predecessor or successor in S. Use this pointer to answer the queries. Only the pointers that will be relevant next need to be maintained; use a lookup table to update them.
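The linked-list part of this proof can be sketched as follows. The sketch covers only what happens once a pointer to the predecessor in S is at hand; how the lookup tables supply that pointer in constant time is not shown. The class `LinkedSubset`, its method names, and the sentinel positions 0 and n + 1 are assumptions made for illustration.

```python
class LinkedSubset:
    """S as a sorted doubly linked list over indices 1..n, with sentinels."""

    def __init__(self, n):
        self.n = n
        # Sentinel 0 sits before every element, sentinel n + 1 after.
        self.next = {0: n + 1}
        self.prev = {n + 1: 0}

    def insert_after(self, pred, i):
        """Splice i in after pred, which must be i's predecessor in S."""
        succ = self.next[pred]
        self.next[pred], self.prev[i] = i, pred
        self.next[i], self.prev[succ] = succ, i

    def delete(self, i):
        """Unlink i from the list; i must currently be in S."""
        pred, succ = self.prev.pop(i), self.next.pop(i)
        self.next[pred], self.prev[succ] = succ, pred

    def successor_via(self, pred):
        """Answer a query from a pointer to the predecessor in S."""
        succ = self.next[pred]
        return succ if succ <= self.n else None
```

Each operation touches a constant number of links, so the whole cost of an operation lies in obtaining the right pointer for the node the walk has just reached.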

Page 16: Data Structures on Event Graphs


Example

[Figure: part of a path G with nodes labeled Dx3, Dx1, Dx2, Sx5, Sx8, Ix7, Dx9, Ix2, and the current set S = x1, x3, x5, x7, x10]

Page 17: Data Structures on Event Graphs


Reducing the space requirement

A naïve implementation uses two lookup tables per node to update the pointers → O(n^2) space usage.

Can be improved to O(n^(1+ε)) space.

Approach: Use spatial decomposition and bootstrapping to compress the lookup tables (cf. [Crochemore et al., 2008]).

Page 18: Data Structures on Event Graphs


What about randomization?

We assumed an adversary.

But: What if the walk on the path is random?

Theorem: If the requests are generated by a random walk on a path, the successor problem can be solved in O(1) expected time per operation with O(n) space on a word RAM, where n=|V|.

Page 19: Data Structures on Event Graphs


What about randomization?

Theorem: If the requests are generated by a random walk on a path, the successor problem can be solved in O(1) expected time per operation with O(n) space on a word RAM, where n=|V|.

Proof (sketch):
- Subdivide the path into segments of √n nodes.
- The random walk requires Ω(n) steps to leave a segment.
- Build the quadratic data structure once the walk enters the next segment.
- Use overlapping segments and deamortization techniques to make it work.
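The time a walk spends inside a segment can be checked numerically. The sketch below uses the standard fact that a simple random walk started at position a inside a segment with endpoints 0 and k needs a(k − a) steps in expectation to reach an endpoint; with k = √n and a start near the middle this is about n/4. The function names are hypothetical, and the walk is assumed to step left or right with probability 1/2 each.

```python
import math
import random


def expected_exit_time(k, a):
    """Closed form for the expected number of steps to hit 0 or k from a."""
    return a * (k - a)


def check_recurrence(k):
    """Verify h(a) = 1 + (h(a-1) + h(a+1)) / 2 with h(0) = h(k) = 0."""
    h = [expected_exit_time(k, a) for a in range(k + 1)]
    return h[0] == 0 and h[k] == 0 and all(
        2 * h[a] == 2 + h[a - 1] + h[a + 1] for a in range(1, k)
    )


def simulate_exit(k, a, rng):
    """Run one random walk from a and count the steps until it hits 0 or k."""
    steps = 0
    while 0 < a < k:
        a += rng.choice((-1, 1))
        steps += 1
    return steps


n = 10_000
k = math.isqrt(n)  # segment length sqrt(n) = 100
mid = k // 2       # start in the middle of the segment
```

With these values `expected_exit_time(k, mid)` is 2500, that is n/4, which matches the intuition that there is time to build a structure of size quadratic in the segment length before the walk leaves. Averaging `simulate_exit(k, mid, random.Random(seed))` over many seeds gives an empirical estimate of the same quantity.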

Page 20: Data Structures on Event Graphs


What about more complicated graphs?

What if G is a tree, a grid, or something more complicated?

The path approach does not work any more.

[Figure: example event graphs that are not paths, with nodes labeled Ix0, Ix7, Sx7, Sx2, Dx2, Ix9, Sx2, Dx9 and Ix0, Ix7, Dx2, Sx7, Ix9, Sx2, Ix5, Dx9, Ix7]

We conjecture that in this case the O(log log n) bound from van Emde Boas trees is optimal (but we do not know).

Page 21: Data Structures on Event Graphs


Conclusion and open problems

A new way to model request sequences to a data structure.

Can be applied to any data structuring problem.

More algorithmic questions on decorated graphs, e.g., can we estimate the size of the unique sink efficiently?

Can we prove lower bounds for the successor problem on general event graphs?

Page 22: Data Structures on Event Graphs


Thank you!