presented by yuval shimron course 236801 1.12.2010


Color-Coding

Presented by Yuval Shimron, Course 236801, 1.12.2010

2

The Uses of Color-Coding

Find solutions to sub-cases of the Subgraph Isomorphism Problem in polynomial time.

Find more efficient solutions to some sub-cases that already had polynomial-time solutions.

Find simple paths and cycles of a specified length k. This was the initial goal of the authors…

3

Three Main New Results

(1) For a fixed k, if G=(V,E) contains a cycle of length k, it can be found in $O(V^{\omega})$ expected time or $O(V^{\omega} \log V)$ worst-case time, where $\omega < 2.376$ is the exponent of matrix multiplication.

(2) For a fixed k, if a planar graph G=(V,E) contains a cycle of length k, it can be found in O(V) expected time or O(V log V) worst-case time. This also applies to any non-trivial minor-closed family of graphs.

4

Three Main New Results (continued)

(3) If G=(V,E) contains a subgraph isomorphic to a bounded tree-width graph $H=(V_H,E_H)$ where $|V_H| = O(\log V)$, then such a subgraph can be found in polynomial time.
▪ This was not previously known even if H were just a simple path of length O(log V).
▪ It shows that the LOG PATH problem is in NC (and not just in P).

5

1. The Method

Randomized method: the vertices are randomly colored using $k = |V_H|$ colors.

If $|V_H| = O(\log V)$, then with a small (but only polynomially small) probability all the vertices of the subgraph isomorphic to H are colored with distinct colors.

This makes the task of finding this "color-coded" subgraph much easier.
▪ Be patient…

6

1. The Method

De-randomized algorithm? It needs a family of colorings of G such that every subset of k vertices of G is assigned distinct colors by at least one of these colorings.
▪ In other words, a family of perfect hash functions from {1, 2, …, |V|} to {1, 2, …, k}.

Only a "small" loss of efficiency.

7

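2. Random Orientations 2.1: Simple Path of Length k

If the graph is directed and acyclic, a simple algorithm finds a path of length k in O(E) time.

So eliminate cycles: choose a random permutation $\pi : V \to \{1, \dots, |V|\}$ and use it to build the directed graph $G' = (V, E')$.
▪ Direct the edges: for every $(u,v) \in E$,
$\pi(u) < \pi(v) \Rightarrow (u,v) \in E'$,
$\pi(v) < \pi(u) \Rightarrow (v,u) \in E'$.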

8

2. Random Orientations 2.1: Simple Path of Length k

Every directed path of length k in G' is a simple path of length k in G.

Every simple path of length k in G has a 2/(k+1)! chance of becoming a directed path in G'.

So if no path of length k was found in G', repeat the process.

The expected number of times this process is repeated is at most (k+1)!/2.
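The random-orientation search above can be sketched in Python as follows. This is a minimal illustration, not the presenter's code: the function name, the vertex labelling 0..n-1, and the `max_rounds` cut-off are assumptions made for the example, and the graph is assumed to have no self-loops.

```python
import random

def find_path_random_orientation(n, edges, k, max_rounds=10_000, rng=None):
    """Look for a simple path with k edges in an undirected graph.

    n: number of vertices (labelled 0..n-1); edges: list of (u, v) pairs.
    Returns a list of k+1 vertices, or None if nothing was found within
    max_rounds random orientations (which does not prove absence).
    """
    rng = rng or random.Random()
    for _ in range(max_rounds):
        # Random permutation pi: rank[v] is the position of v.
        order = list(range(n))
        rng.shuffle(order)
        rank = [0] * n
        for pos, v in enumerate(order):
            rank[v] = pos
        # Orient each edge from lower rank to higher rank, so G' is acyclic.
        out = [[] for _ in range(n)]
        for u, v in edges:
            if rank[u] < rank[v]:
                out[u].append(v)
            else:
                out[v].append(u)
        # Longest directed path starting at each vertex; the permutation is
        # a topological order, so process it from the last vertex backwards.
        longest = [0] * n
        nxt = [None] * n
        for v in reversed(order):
            for w in out[v]:
                if longest[w] + 1 > longest[v]:
                    longest[v] = longest[w] + 1
                    nxt[v] = w
        start = max(range(n), key=lambda v: longest[v])
        if longest[start] >= k:
            path = [start]
            while len(path) < k + 1:
                path.append(nxt[path[-1]])
            return path
    return None
```

A returned path is always a genuine simple path of G; only a `None` answer is uncertain.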

9

2. Random Orientations 2.1: Simple Path of Length k

So we get O(E(k+1)!) expected time complexity. This is also the result for the directed case.
▪ There, delete the edges that do not agree with $\pi$.

Use the following fact plus DFS to reduce it to O(V(k+1)!) for the undirected case: every graph with |V| vertices and at least k|V| edges contains a path of length k.

So first run a DFS on the original graph. Apply the above algorithm only if no vertex of depth k was found, which the DFS answers in O(k|V|) time. In that case the graph has fewer than k|V| edges.

10

2. Random Orientations 2.2: Simple Cycle of Length k

Choose a random acyclic orientation G' of G.

Raise the adjacency matrix of G' to the power k-1 using O(log k) matrix multiplications. This gives all the pairs of vertices connected by a directed path of length k-1.

Check whether any of these pairs is also connected by an edge. If so, that edge closes a cycle of length k. If not, repeat the process.
▪ The expected number of repetitions is at most k!/2.

Complexity: $O(k! (\log k) V^{\omega}) = O(V^{\omega})$ for a fixed k.
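A small Python sketch of this cycle test, under stated assumptions: Boolean matrices are stored as lists of row bitmasks and multiplied naively (not with a fast matrix-multiplication routine, so the $V^{\omega}$ bound is not achieved), and the names `bool_mat_mult`, `bool_mat_power` and `has_cycle_of_length` are invented for the illustration.

```python
import random

def bool_mat_mult(a, b):
    """Boolean product of two n x n matrices given as lists of row bitmasks."""
    result = []
    for i in range(len(a)):
        row = 0
        for j in range(len(a)):
            if (a[i] >> j) & 1:
                row |= b[j]
        result.append(row)
    return result

def bool_mat_power(m, p):
    """m to the power p >= 1 by repeated squaring (O(log p) products)."""
    result, base = None, m
    while p:
        if p & 1:
            result = base if result is None else bool_mat_mult(result, base)
        base = bool_mat_mult(base, base)
        p >>= 1
    return result

def has_cycle_of_length(n, edges, k, rounds=2000, rng=None):
    """Monte Carlo test for a simple cycle on k >= 3 vertices.

    True is always correct; False only means no cycle was found."""
    rng = rng or random.Random()
    adj = [0] * n
    for u, v in edges:
        adj[u] |= 1 << v
        adj[v] |= 1 << u
    for _ in range(rounds):
        order = list(range(n))
        rng.shuffle(order)
        rank = [0] * n
        for pos, v in enumerate(order):
            rank[v] = pos
        oriented = [0] * n  # adjacency matrix of the acyclic orientation G'
        for u, v in edges:
            if rank[u] < rank[v]:
                oriented[u] |= 1 << v
            else:
                oriented[v] |= 1 << u
        reach = bool_mat_power(oriented, k - 1)
        # Walks in an acyclic graph are simple paths, so an edge of G between
        # the endpoints of a (k-1)-edge directed path closes a k-cycle.
        if any(reach[u] & adj[u] for u in range(n)):
            return True
    return False
```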

11

3. Color-Coding

To find a path of length k-1 in a graph G, we can choose a random coloring of the vertices of G with k colors.

Every simple path of length k-1 in G has a chance of $k!/k^k > e^{-k}$ to become colorful, meaning that each of its vertices is colored with a different color.

We can then find it using Lemma 3.1.
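The bound on the probability of a path becoming colorful follows from the power series of $e^k$. A path with k vertices has $k^k$ possible colorings, of which $k!$ are colorful, and a single term of the series is smaller than the whole sum:

```latex
\[
e^{k} \;=\; \sum_{i \ge 0} \frac{k^{i}}{i!} \;>\; \frac{k^{k}}{k!}
\quad\Longrightarrow\quad
\Pr[\text{a fixed path on } k \text{ vertices is colorful}]
\;=\; \frac{k!}{k^{k}} \;>\; e^{-k}.
\]
```

So fewer than $e^k$ random colorings are needed in expectation before a given path becomes colorful.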

12

3. Color-Coding 3.1: Colorful Path of Length k-1

Use Color-Coding to find a colorful path of length k-1, if one exists, in $2^{O(k)} E$ worst-case time.

Actually, it finds a colorful path that starts at a specific vertex s.
▪ But we can always add a new vertex s to G (with a new color, adjacent to every vertex) and look for a path of length k that starts at s.

The algorithm uses a given (random) coloring $c : V \to \{1, 2, \dots, k\}$.

The algorithm uses a dynamic programming approach.

13

3. Color-Coding 3.1: Colorful Path of Length k-1

Suppose we have found, for each vertex v, the sets of colors that appear on colorful paths of length i connecting s and v. This is a collection of at most $\binom{k}{i}$ color sets.

For that we only need to record the color sets appearing on paths of length i, not the paths themselves…

We inspect every color set C of that collection.

14

3. Color-Coding 3.1: Colorful Path of Length k-1

We also inspect every edge (v,u) in E.

If $c(u) \notin C$, we add $C \cup \{c(u)\}$ to the collection of u that corresponds to colorful paths of length i+1.

The graph G contains a colorful path of length k-1 iff the final collection, corresponding to paths of length k-1, of at least one vertex is non-empty.
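The dynamic program described on these slides can be sketched as below. The graph representation (dictionaries `adj` and `color`) and the function name are assumptions for the illustration. Only color sets are stored, never the paths themselves.

```python
def colorful_path_exists(adj, color, k, s):
    """Decide whether a colorful path on k vertices (k-1 edges) starts at s.

    adj: dict vertex -> iterable of neighbours;
    color: dict vertex -> colour in 1..k.
    """
    # sets_at[v]: colour sets of colorful paths with i edges from s to v.
    sets_at = {s: {frozenset([color[s]])}}
    for _ in range(k - 1):
        nxt = {}
        for v, collection in sets_at.items():
            for u in adj[v]:              # inspect every edge (v, u)
                cu = color[u]
                for C in collection:      # inspect every colour set C
                    if cu not in C:
                        nxt.setdefault(u, set()).add(C | {cu})
        sets_at = nxt
    # Non-empty final collection at some vertex <=> a colorful path exists.
    return any(sets_at.values())
```

To look for a colorful path with an arbitrary start, one simple option (instead of adding an extra vertex s) is to call the function once per start vertex, at the cost of an extra factor of |V|.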

15

3. Color-Coding 3.1: Colorful Path of Length k-1

The number of operations is at most

$O\left(\sum_{i=0}^{k} i \binom{k}{i} E\right) = O(k 2^k E)$.

The proof holds for both directed and undirected graphs.

16

3.2: Pairs of Vertices Connected by Colorful Paths of Length k-1

We can find all pairs of vertices connected by colorful paths of length k-1 in $2^{O(k)} VE$ or $2^{O(k)} V^{\omega}$ worst-case time.

To get $2^{O(k)} VE$ time, simply run the algorithm of 3.1 |V| times, once from each vertex of G=(V,E).

Use a recursive approach to get $2^{O(k)} V^{\omega}$ time.

17

3.2: Pairs of Vertices Connected by Colorful Paths of Length k-1

Enumerate all partitions of {1,2,…,k} into two subsets C1, C2 of size k/2 each. There are $\binom{k}{k/2} < 2^k$ such partitions.

For each partition, split G into two graphs G1, G2, induced by the vertex sets V1, V2 colored with colors from C1 and C2 respectively.

Recursively find the pairs of vertices connected by colorful paths of length k/2-1 in each of them.

Store the results in Boolean matrices A1, A2.

18

3.2: Pairs of Vertices Connected by Colorful Paths of Length k-1

Define B to be the Boolean matrix of adjacency relations between the vertices of V1 and V2 (the vertex sets colored by C1 and C2).

Compute A1BA2. You get all pairs connected by colorful paths of length k-1 in which:
▪ The first k/2 vertices are colored with colors from C1.
▪ The last k/2 vertices are colored with colors from C2.

By OR-ing the matrices obtained from all the partitions you get your answer.

Time complexity?
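The recursion above can be illustrated with sets of pairs in place of Boolean matrices. This is a sketch under assumptions: the function name and dictionary representation are invented, colours are assumed to be sortable (for example integers), the colour set must be non-empty, and no fast matrix multiplication is used, so it shows the structure of the recursion rather than the $2^{O(k)} V^{\omega}$ running time.

```python
from itertools import combinations

def colorful_pairs(adj, color, colors):
    """Pairs (u, v) joined by a path that uses every colour in `colors` once.

    adj: dict vertex -> set of neighbours (undirected graph);
    color: dict vertex -> colour; colors: non-empty set of colours.
    """
    colors = frozenset(colors)
    if len(colors) == 1:
        (c,) = colors
        return {(v, v) for v in adj if color[v] == c}
    pairs = set()
    half = len(colors) // 2
    for first in combinations(sorted(colors), half):
        c1 = frozenset(first)
        c2 = colors - c1
        a1 = colorful_pairs(adj, color, c1)  # role of the matrix A1
        a2 = colorful_pairs(adj, color, c2)  # role of the matrix A2
        ends_from = {}
        for x, y in a2:
            ends_from.setdefault(x, set()).add(y)
        # The product A1 * B * A2: a C1-path u..x, an edge (x, y) of G,
        # then a C2-path y..v. Disjoint colours imply disjoint vertices.
        for u, x in a1:
            for y in adj[x]:
                for v in ends_from.get(y, ()):
                    pairs.add((u, v))  # OR over all partitions
    return pairs
```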

19

3. Color-Coding 3.3, 3.4: Two Interesting Results

A simple path of length k-1 in a directed or undirected graph G=(V,E) can be found, if one exists, in:
▪ $2^{O(k)} V$ expected time for an undirected graph (DFS…).
▪ $2^{O(k)} E$ expected time for a directed graph.

A simple cycle of size k in a directed or undirected graph G=(V,E) can be found, if one exists, in either $2^{O(k)} VE$ or $2^{O(k)} V^{\omega}$ expected time. Simply use Lemma 3.2.

20

4. Derandomized Orientations and Colorings

The previous randomized algorithms can be derandomized with a small loss of efficiency: an extra log V factor in the complexity.

What we need is a family of k-perfect hash functions from {1, 2, …, |V|} to {1, 2, …, k}.

If we use these hash functions as colorings, we know that for every subset of k vertices there exists a coloring that gives each vertex in it a distinct color.
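To make the k-perfect property concrete, here is a brute-force checker. It is only an illustration of the definition: the function name and the 0-based domain are assumptions, and the check takes time exponential in k, so it says nothing about how such families are constructed.

```python
from itertools import combinations

def is_k_perfect(family, n, k):
    """True if every k-subset of {0, ..., n-1} receives k distinct values
    from at least one function in `family` (a list of callables)."""
    tables = [[h(x) for x in range(n)] for h in family]
    for subset in combinations(range(n), k):
        if not any(len({t[x] for x in subset}) == k for t in tables):
            return False
    return True
```

For example, the two bit-extraction functions `x % 2` and `(x // 2) % 2` together separate every pair of elements of {0, 1, 2, 3}, while either one alone does not.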

21

4. Derandomized Orientations and Colorings

There exists an algorithm that constructs a k-perfect family of hash functions from {1, 2, ..., n} to {1, 2, ..., k}. But its size is $2^{O(k)} \log^2 n$.

There also exists an algorithm that constructs a k-perfect family of hash functions from {1, 2, ..., n} to {1, 2, ..., $k^2$} whose size is $k^{O(1)} \log n$.

22

4. Derandomized Orientations and Colorings

So we use 2-level hashing:
▪ Map {1, 2, ..., n} to {1, 2, ..., $k^2$} using the second algorithm.
▪ Map {1, 2, ..., $k^2$} to {1, 2, ..., k} using the first algorithm.

The composed family has size $k^{O(1)} \log n \cdot 2^{O(k)} \log^2 (k^2) = 2^{O(k)} \log n$, so we get just the promised extra O(log V) factor.

The value of each function on each element can be evaluated in O(1) time.

23

4. Derandomizing: Derandomize Orientations

Use k-perfect hash coloring functions.

Choose a random coloring (and not a permutation) $c : V \to \{1, 2, \dots, k\}$.

Remove every edge whose endpoints are not colored with consecutive colors. Direct each remaining edge (u,v), with $c(v) = c(u) + 1$, from u to v.

Again G', the obtained graph, is acyclic.

A simple path of length k in G has a probability of $2k^{-k}$ to become a directed path in G'.

Different from the Color-Coding method.

24

5. Cycles in Minor-Closed Families of Graphs

An undirected graph G is d-degenerate if every subgraph of it has a vertex of degree at most d.

The smallest such d is called the degeneracy or the max-min degree of G, denoted d(G).
▪ It is the maximum, over all subgraphs of G, of the minimum degree.

If G is d-degenerate then clearly $|E| \le d \cdot |V|$.

25

5. Cycles in Minor-Closed Graphs 5.1: Finding an Orientation

Let G be a connected undirected graph.

An acyclic orientation of G=(V,E) such that for every v we have $d_{out}(v) \le d(G)$ can be found in O(E) time.
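One standard way to obtain such an orientation is to repeatedly remove a minimum-degree vertex and direct every edge from the vertex removed earlier to the one removed later. The sketch below uses that approach; it is an assumption that the slides have this procedure in mind. For brevity it scans for the minimum in O(V) per step, so it runs in O(V²) rather than the O(E) achievable with bucket queues. The function name and graph representation are illustrative.

```python
def degeneracy_orientation(adj):
    """adj: dict vertex -> set of neighbours (undirected, no self-loops).

    Returns (d, out): d is the degeneracy of the graph and out[v] lists the
    out-neighbours of v in an acyclic orientation with out-degree <= d.
    """
    degree = {v: len(ns) for v, ns in adj.items()}
    remaining = set(adj)
    position = {}
    d = 0
    while remaining:
        # Remove a vertex of minimum degree in the remaining subgraph.
        v = min(remaining, key=degree.__getitem__)
        d = max(d, degree[v])
        position[v] = len(position)
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    # Direct each edge towards the endpoint that was removed later.
    out = {v: [w for w in adj[v] if position[w] > position[v]] for v in adj}
    return d, out
```

The out-degree of v equals its degree at the moment it was removed, which is at most d, and all edges point forward in the removal order, so the orientation is acyclic.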

26

5. Minor-Closed Families of Graphs

A graph H is a minor of an undirected graph G if it can be obtained from G by the removal and the contraction of edges.

A family C of graphs is minor-closed if every minor of a graph in C is also a member of C.

If such a C is non-trivial (it does not contain all graphs), then all graphs in C are of bounded degeneracy: there exists $d_C$ such that $\forall G \in C : d(G) \le d_C$.

27

5. Minor-Closed Families of Graphs

Consider the family of planar graphs $C_{planar}$.

It is minor-closed.

Each planar graph has a vertex whose degree is at most 5, so $d_{C_{planar}} \le 5$.

28

5. Minor-Closed Families of Graphs 5.2: Simple Cycle of Size k

Let C be a non-trivial minor-closed family of graphs and let $k \ge 3$ be a fixed parameter. There exists a randomized algorithm that, given an undirected graph in C, finds a $C_k$ (a cycle of size k) in it, if one exists, in O(V) expected time.

Proof: Let G = (V,E) be a graph in C that contains a $C_k$.

Choose a random coloring c : V -> {1, 2, 3, …, k}.

The $C_k$ is considered well-colored if its vertices are consecutively colored by the colors 1, 2, …, k.

29

5. Minor-Closed Families of Graphs 5.2: Simple Cycle of Size k

The $C_k$ in G has a chance of $2/k^{k-1}$ to be well-colored.

Can we find it efficiently? Yes, but with some probability…

Assume that the degeneracy of C is d = O(1).

We describe a randomized algorithm that, given a coloring c, finds a well-colored $C_k$ with a probability of $1/(2d)^k$.

Combining both gives a probability of at least $\frac{2}{(2d)^k k^{k-1}}$, so the expected time is $O((2dk)^k V)$.

30

5. Minor-Closed Families of Graphs 5.2: Simple Cycle of Size k

We can assume all edges of G connect vertices that are colored by consecutive colors (mod k). Edges that don't may be safely removed.

We orient the graph so that the out-degree of every vertex is at most d. This takes only O(V) time.

The algorithm tries to find the edge that connects the vertices of the $C_k$ colored by k and k-1: $v_k$, $v_{k-1}$. It "flips coins" to guess its orientation and its index among the outgoing edges: 2d possible combinations.

31

5. Minor-Closed Families of Graphs 5.2: Simple Cycle of Size k

For each guess of such an index i:
▪ If the guessed orientation is from $v_{k-1}$ to $v_k$: all edges that leave vertices colored k-1 toward vertices colored k, but whose index is not i, are removed.
▪ Otherwise do the opposite (for edges that leave vertices colored k).

The result is a graph G' that still contains the $C_k$ with a probability of at least 1/(2d).

In G', the edges between vertices colored k-1 and k form a forest of rooted stars.

32

5. Minor-Closed Families of Graphs 5.2: Simple Cycle of Size k

Each such star is contracted into a single vertex and assigned the color k-1.

The obtained graph is denoted by G''.

G'' contains a well-colored $C_{k-1}$ iff G' contains a well-colored $C_k$, since each edge of G', and therefore of G'', connects consecutively colored vertices.

G'' is also a graph in the minor-closed family C. So we recursively look for a $C_{k-1}$.

33

5. Minor-Closed Families of Graphs 5.2: Simple Cycle of Size k

It will take us O((k-1)V) expected time, and yields a $C_{k-1}$ with a probability of at least $1/(2d)^{k-1}$.

Obviously it is easy to reconstruct the $C_k$ from the $C_{k-1}$.

We can stop the recursion when k=3 and use an existing algorithm for finding triangles in a general graph in $O(E \cdot d(G))$ time.
▪ Any triangle in a three-colored graph is well-colored.
▪ $O(E \cdot d(G))$ is O(V) in our case.

34

5.3: Simple Cycle of Size k, Derandomized Algorithm

There exists a deterministic algorithm that, given a graph in C, finds a $C_k$, if one exists, in O(V log V) worst-case time.

Proof:

Instead of using a random coloring, we exhaust a list of $k^{O(k)} \log V$ colorings that has this property:
▪ Every sequence of k vertices is consecutively colored by 1,2,…,k by at least one coloring of the list.

Instead of guessing the direction and index of each edge in the $C_k$, we exhaust for each coloring all the $(2d)^k$ possible choices.
▪ If G contains a $C_k$ then at least one $C_k$ will be found this way.

35

6. Finding Bounded Tree-Width Sub-Graphs

A graph G1 is said to be isomorphic to a graph G2 if there exists a bijection f : V(G1) -> V(G2) such that any two vertices u and v of G1 are adjacent in G1 iff f(u) and f(v) are adjacent in G2.

36

6.1: Finding Sub-Graph Isomorphic to a Forest

Let F be a directed or undirected forest on k vertices. Let G be a directed or undirected graph.

A subgraph of G isomorphic to F can be found, if one exists, in:
▪ $2^{O(k)} E$ expected time in the directed case.
▪ $2^{O(k)} V$ expected time in the undirected case.

37

6.1: Finding Sub-Graph Isomorphic to a Forest

Proof:

Start as usual, by choosing a random coloring c : V -> {1, …, k} of G.

With a probability of at least $e^{-k}$ the copy of F in G becomes colorful.
▪ Meaning, each vertex is assigned a different color.

Suppose that F is composed of l (directed) trees T1, T2, …, Tl with k1, k2, …, kl vertices respectively.

Let Fi be the (directed) forest composed of T1, T2, …, Ti.

38

6.1: Finding Sub-Graph Isomorphic to a Forest

For each $1 \le i \le l$ we find the color sets that appear on colorful copies of Ti in G.
▪ Note that copies of Ti, Tj with disjoint color sets are necessarily disjoint.

Then, in $2^{O(k)}$ time, we find the color sets that appear on colorful copies of Fi for $1 \le i \le l$.

If the collection corresponding to F = Fl is not empty, then G contains a colorful copy of F. How do we find it…?

39

6.1: Finding Sub-Graph Isomorphic to a Forest

How do we find the color sets that appear on colorful copies of Ti in G?

Let r be an arbitrary vertex in Ti = T.

For each vertex v in G we find the color sets that appear on colorful copies of T in which v plays the role of r.

If T is a single vertex, this is easily done…

Otherwise let e = (r,r') be a (directed) edge in T.
▪ Removing e breaks T into two (directed) sub-trees T' and T'', containing r and r' respectively.

40

6.1: Finding Sub-Graph Isomorphic to a Forest

We recursively find, for each vertex v in G, the color sets that appear on colorful copies of T' in which v plays the role of r, and on colorful copies of T'' in which v plays the role of r'.

For every (directed) edge (u,v) of G, we add to u's collection for T the union of each color set of u (for T') with each color set of v (for T''), provided the two sets are disjoint.

The complexity of this recursive algorithm is $2^{O(k_i)} E$, as required.

For the undirected case we use the fact that a graph with at least k|V| edges contains as a subgraph any forest on k vertices.
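The recursion for a single tree T can be sketched as follows, for the undirected case. The representation (trees and graphs as dictionaries of neighbour sets) and the function name are assumptions made for the illustration; the sketch records color sets only and does not recover the copy itself.

```python
def colorful_tree_sets(tree, r, adj, color):
    """Colour sets of colorful copies of a tree T in G, per image of the root.

    tree: dict node -> set of neighbouring nodes of the undirected tree T;
    r: the node of T chosen as root; adj: dict vertex -> set of neighbours
    in G; color: dict vertex -> colour.
    Returns dict v -> set of frozensets (only vertices with a copy appear).
    """
    if not tree[r]:                       # T is a single vertex
        return {v: {frozenset([color[v]])} for v in adj}
    r2 = next(iter(tree[r]))              # the edge e = (r, r')
    # Removing e: T'' is the component of r', T' is the rest (contains r).
    part2, stack = {r2}, [r2]
    while stack:
        x = stack.pop()
        for y in tree[x]:
            if y not in part2 and not (x == r2 and y == r):
                part2.add(y)
                stack.append(y)
    t1 = {x: {y for y in ns if y not in part2}
          for x, ns in tree.items() if x not in part2}
    t2 = {x: {y for y in tree[x] if y in part2} for x in part2}
    s1 = colorful_tree_sets(t1, r, adj, color)    # v plays the role of r
    s2 = colorful_tree_sets(t2, r2, adj, color)   # v plays the role of r'
    result = {}
    for u in adj:
        for v in adj[u]:                  # every edge (u, v) of G
            for c1 in s1.get(u, ()):
                for c2 in s2.get(v, ()):
                    if c1.isdisjoint(c2):
                        result.setdefault(u, set()).add(c1 | c2)
    return result
```

G contains a colorful copy of T exactly when the returned dictionary is non-empty.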

41

6.2: Tree Decomposition

Remember the tree-width of a graph G? It is the minimum width over all possible tree decompositions (X,T) of G, where the width of a decomposition is $\max_{i \in I} |X_i| - 1$.

T = (I, F) is a tree.

$X = \{X_i : i \in I\}$ is a family of subsets of V such that:
▪ The union of all $X_i$ equals V.
▪ For every edge (u,v) of G there exists an i such that u and v are both in $X_i$.
▪ If $i, j, k \in I$ and j is on the path from i to k in T, then $X_i \cap X_k \subseteq X_j$.
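The three conditions can be checked mechanically. The sketch below is an illustration with assumed names and input format; it takes for granted that `tree_edges` really forms a tree on the bag indices, and it tests the third condition in its equivalent form: for every vertex x, the bags containing x are connected in T.

```python
def is_tree_decomposition(vertices, edges, bags, tree_edges):
    """Validate a tree decomposition of the graph (vertices, edges).

    bags: dict i -> set X_i; tree_edges: edges of T on the bag indices.
    Returns the width (max bag size minus one) if valid, otherwise None.
    """
    # Condition 1: the union of all bags equals V.
    if set().union(*bags.values()) != set(vertices):
        return None
    # Condition 2: every edge of G lies inside some bag.
    if not all(any(u in b and v in b for b in bags.values()) for u, v in edges):
        return None
    # Condition 3: the bags containing any fixed vertex are connected in T.
    nbrs = {i: set() for i in bags}
    for i, j in tree_edges:
        nbrs[i].add(j)
        nbrs[j].add(i)
    for x in vertices:
        holding = {i for i, b in bags.items() if x in b}
        start = next(iter(holding))
        seen, stack = {start}, [start]
        while stack:
            i = stack.pop()
            for j in nbrs[i] & holding:
                if j not in seen:
                    seen.add(j)
                    stack.append(j)
        if seen != holding:
            return None
    return max(len(b) for b in bags.values()) - 1
```

For a path on three vertices, the bags {0,1} and {1,2} joined by one tree edge form a decomposition of width 1.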

42

6.3: Finding Sub-Graph Isomorphic to a Bounded TW Graph

Let H be a directed or undirected graph on k vertices with tree-width t. Let G be a directed or undirected graph.

A subgraph of G isomorphic to H, if one exists, can be found in $2^{O(k)} V^{t+1}$ expected time and in $2^{O(k)} V^{t+1} \log V$ worst-case time.

The proof is similar to that of Theorem 6.1, so we will skip it...

43

6.3: Finding Sub-Graph Isomorphic to a Bounded TW Graph

In [RS86b] it is shown that if C is a minor-closed family of graphs that excludes at least one planar graph G', then there exists a (huge) constant $c_{G'}$ such that every graph in C has a tree-width of at most $c_{G'}$.

So we can use 6.3 whenever $|V_H| = O(\log V)$ and H belongs to a minor-closed family that excludes at least one planar graph, and decide in polynomial time whether G contains a subgraph isomorphic to H.

44

Using 6.3: LOG PATH Problem

As a very special case of Theorem 6.3 we get that the LOG PATH problem is in P.
▪ A path of log V vertices is a tree.

In addition, all the algorithms we described are easily parallelizable.
▪ So we get that the LOG PATH problem and other problems are in NC.

45

Conclusions

The Color-Coding method efficiently finds k-vertex simple paths, k-vertex cycles, and other small subgraphs within a given graph using probabilistic algorithms.

The Color-Coding method is a good demonstration of de-randomization techniques.

The algorithms presented can be easily parallelized, yielding efficient NC algorithms.