Combinatorial algorithms: parametric pruning
Metric k-center
• Given a complete undirected graph G = (V, E) with nonnegative edge costs satisfying the triangle inequality, and a positive integer k. For any set S ⊆ V and vertex v, define connect(v, S) = min{cost(u, v) | u ∈ S} (the cost of the cheapest edge from v to a vertex in S).
• Find a set S ⊆ V, with |S| = k, so as to minimize max_v {connect(v, S)}.
• The metric k-center problem is NP-hard.
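The objective can be evaluated directly from the definitions above. A minimal sketch in Python, assuming the costs are given as a symmetric matrix `cost` indexed by vertex number; the function names and the sample points on a line are illustrative and not taken from the slides:

```python
def connect(v, centers, cost):
    """Cost of the cheapest edge from v to a vertex in centers."""
    return min(cost[u][v] for u in centers)


def k_center_cost(centers, cost):
    """Objective value max_v connect(v, centers) of a candidate solution."""
    return max(connect(v, centers, cost) for v in range(len(cost)))


# Illustrative metric: six points on a line, cost = distance.
points = [0, 1, 2, 10, 11, 12]
cost = [[abs(p - q) for q in points] for p in points]

print(k_center_cost([1, 4], cost))  # centers at positions 1 and 11
print(k_center_cost([0, 3], cost))  # centers at positions 0 and 10
```

A vertex in the chosen set has connect value 0, since cost(v, v) = 0 in this representation.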
Parametric pruning (1)
• If we know the cost of an optimal solution, we may be able to prune away irrelevant parts of the input and thereby simplify the search for a good solution.
• However computing the cost of an optimal solution is precisely the difficult core of NP-hard NP-optimization problems.
• The technique of parametric pruning gets around this difficulty as follows. A parameter t is chosen, which can be viewed as a “guess” on the cost of an optimal solution. For each value of t, the given instance I is pruned by removing the parts that cannot be used in any solution of cost at most t. Denote the pruned instance by I(t).
Parametric pruning (2)
The algorithm consists of two steps.
• In the first step, the family of instances I(t) is used for computing a lower bound on OPT, say t∗.
• In the second step, a solution is found in instance I(α t∗), for a suitable choice of α.
Parametric pruning
• Sort the edges of G in nondecreasing order of cost, i.e. cost(e1) ≤ cost(e2) ≤ … ≤ cost(em).
• Let Gi = (V, Ei), where Ei = {e1, e2, …, ei}.
• For each Gi, we have to check whether there exists a subset S ⊆ V with |S| ≤ k such that every vertex in V – S is adjacent to a vertex in S.
Dominating Set
• A dominating set in an undirected graph G = (V, E) is a subset S ⊆ V such that every vertex in V – S is adjacent to a vertex in S.
• Let dom(G) denote the size of a minimum cardinality dominating set in G.
• Computing dom(G) is NP-hard.
k-Center
• The k-center problem is equivalent to finding the smallest index i such that Gi has a dominating set of size at most k.
• A dominating set of size at most k in Gi corresponds to at most k stars (K1,p) in Gi spanning all vertices.
[Figure: the star K1,7]
G^2
• An independent set (stable set) in G = (V, E) is a subset I ⊆ V of pairwise non-adjacent vertices.
• Define the square of a graph G = (V, E) to be the graph G^2 = (V, E′), containing an edge (u, v) ∈ E′ whenever G has a path of length at most 2 between u and v, u ≠ v.
[Figure: G = K1,4 and its square G^2 = K5]
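The square of a graph can be computed straight from the definition. A small sketch, assuming the graph is given as a vertex list and an edge list; the name `square` and this representation are illustrative choices:

```python
from itertools import combinations


def square(vertices, edges):
    """Edge set of G^2: pairs joined by a path of length at most 2 in G."""
    adj = {v: set() for v in vertices}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    squared = set()
    for u, v in combinations(vertices, 2):
        # Adjacent in G, or sharing a common neighbor in G.
        if v in adj[u] or adj[u] & adj[v]:
            squared.add(frozenset((u, v)))
    return squared


# The star K1,4: center 0 with leaves 1..4.
star_vertices = [0, 1, 2, 3, 4]
star_edges = [(0, leaf) for leaf in range(1, 5)]
star_squared = square(star_vertices, star_edges)
print(len(star_squared))  # number of edges in the square of the star
```

Every pair of leaves shares the center as a common neighbor, so the square of the star is the complete graph on its five vertices.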
Hochbaum-Shmoys Algorithm (1986)
Input (G, cost: E → Q+, k)
1) Construct G1^2, G2^2, …, Gm^2.
2) Compute a maximal independent set, Ir, in each graph Gr^2.
3) Find the smallest index r such that |Ir| ≤ k, say j.
Output (Ij)
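The three steps above can be sketched as follows. This is an illustrative Python version, assuming a symmetric cost matrix, a greedy scan for the maximal independent set, and squares rebuilt from scratch for every index (simple rather than efficient). The names `greedy_mis` and `hochbaum_shmoys` are hypothetical.

```python
from itertools import combinations


def greedy_mis(n, neighbors):
    """Maximal independent set: keep a vertex if no chosen vertex is its neighbor."""
    chosen = []
    for v in range(n):
        if all(u not in neighbors[v] for u in chosen):
            chosen.append(v)
    return chosen


def hochbaum_shmoys(cost, k):
    """Return (I_j, cost(e_j)) for the smallest j with |I_j| <= k."""
    n = len(cost)
    edges = sorted(combinations(range(n), 2), key=lambda e: cost[e[0]][e[1]])
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)  # adj now describes G_i
        # Neighbors in G_i^2: everything within two hops (may include x itself).
        two_hop = [adj[x].union(*(adj[y] for y in adj[x])) for x in range(n)]
        independent = greedy_mis(n, two_hop)
        if len(independent) <= k:
            return independent, cost[u][v]
    return list(range(n)), 0  # fewer than two vertices: no edges to add


# Illustrative metric: six points on a line, cost = distance.
points = [0, 1, 2, 10, 11, 12]
cost = [[abs(p - q) for q in points] for p in points]
centers, threshold = hochbaum_shmoys(cost, 2)
radius = max(min(cost[v][c] for c in centers) for v in range(len(points)))
print(centers, threshold, radius)
```

On this instance the optimum is 1 (centers at positions 1 and 11), and the returned radius stays within twice the threshold cost(ej), as the analysis on the following slides predicts.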
Approximation ratio of Hochbaum-Shmoys Algorithm-1
Theorem 4.2
Hochbaum-Shmoys Algorithm achieves an approximation factor of 2 for the metric k-center problem.
Main Lemma
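• Lemma 4.3
For j as defined in the algorithm, cost(ej) ≤ OPT.
Proof.
• For every r < j we have |Ir| > k.
• Now by Lemma 4.1 (an independent set of Gr^2 has size at most dom(Gr)), dom(Gr) ≥ |Ir| > k.
• So r* > r, where r* is the smallest index such that Gr* has a dominating set of size at most k, i.e. OPT = cost(er*). Hence r* ≥ j.
• Therefore cost(ej) ≤ cost(er*) = OPT.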
Proof of Theorem 4.2
• A maximal independent set Ij in the graph Gj^2 is also a dominating set.
• Thus there exist stars in Gj^2 centered on the vertices of Ij, covering all vertices.
• By the triangle inequality, each edge used in constructing these stars has cost at most 2 cost(ej).
• Lemma 4.3 implies 2 cost(ej) ≤ 2 OPT.
Metric weighted k-center
• Given a complete undirected graph G = (V, E) with nonnegative edge costs satisfying the triangle inequality, a weight function on vertices, w: V → R+, and a bound W ∈ R+. For any set S ⊆ V and vertex v, define connect(v, S) = min{cost(u, v) | u ∈ S}.
• Find a set S ⊆ V of total weight at most W, so as to minimize max_v {connect(v, S)}.
Weighted dominating set
• Let wdom(G) denote the weight of a minimum weight dominating set in G.
• Calculating wdom(G) is NP-hard.
Parametric pruning
• Sort the edges of G in nondecreasing order of cost, i.e. cost(e1) ≤ cost(e2) ≤ … ≤ cost(em).
• Let Gi = (V, Ei), where Ei = {e1, e2, …, ei}.
• We need to find the smallest index i such that wdom(Gi) ≤ W. If i* is this index, then the cost of the optimal solution is OPT = cost(ei*).
Lightest neighbors
• Given a vertex weighted graph G = (V, E), let I be an independent set in G^2.
• For each u ∈ I, let s(u) denote a lightest neighbor of u in G, where u is also considered a neighbor of itself.
• Let S = {s(u) | u ∈ I}.
Lower Bound
• Lemma 4.4 Given a graph H, let I be an independent set in H^2 and let S = {s(u) | u ∈ I}. Then w(S) ≤ wdom(H).
Proof.
• Let D be a minimum weight dominating set of H.
• Then there exists a set of disjoint stars in H, centered on the vertices of D and covering all the vertices.
• Since each of these stars becomes a clique in H^2, the set I can pick at most one vertex from each of them.
• Thus each vertex in I has the center of the corresponding star available as a neighbor in H. Hence, w(S) ≤ wdom(H).
Hochbaum-Shmoys Algorithm-2
Input (G, cost: E → Q+, w: V → R+, W)
1) Construct G1^2, G2^2, …, Gm^2.
2) Compute a maximal independent set, Ir, in each graph Gr^2.
3) Compute Sr = {sr(u) | u ∈ Ir}.
4) Find the minimum index r such that w(Sr) ≤ W, say j.
Output (Sj)
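A sketch of the weighted variant in Python, under the same illustrative assumptions as before: a symmetric cost matrix, a greedy maximal independent set, and squares rebuilt for every index. The function name `weighted_k_center` and the sample weights are hypothetical.

```python
from itertools import combinations


def weighted_k_center(cost, weight, budget):
    """Return (S_j, cost(e_j)) for the smallest j with w(S_j) <= budget, or None."""
    n = len(cost)
    edges = sorted(combinations(range(n), 2), key=lambda e: cost[e[0]][e[1]])
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)  # adj now describes G_r
        # Neighbors in G_r^2: everything within two hops.
        two_hop = [adj[x].union(*(adj[y] for y in adj[x])) for x in range(n)]
        independent = []  # greedy maximal independent set I_r of G_r^2
        for x in range(n):
            if all(c not in two_hop[x] for c in independent):
                independent.append(x)
        # s_r(x): lightest neighbor of x in G_r, where x counts as its own neighbor.
        chosen = {min(adj[x] | {x}, key=lambda y: weight[y]) for x in independent}
        if sum(weight[c] for c in chosen) <= budget:
            return sorted(chosen), cost[u][v]
    return None  # no index r satisfies the weight bound


# Illustrative instance: six points on a line with distinct vertex weights.
points = [0, 1, 2, 10, 11, 12]
cost = [[abs(p - q) for q in points] for p in points]
weight = [5, 1, 6, 7, 2, 8]
centers, threshold = weighted_k_center(cost, weight, 3)
radius = max(min(cost[v][c] for c in centers) for v in range(len(points)))
print(centers, threshold, radius)
```

Here the independent set found at the final index is moved to the lightest neighbors, which fit the budget W = 3.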
Approximation ratio of Hochbaum-Shmoys Algorithm-2
Theorem 4.5
Hochbaum-Shmoys Algorithm-2 achieves an approximation factor of 3 for the metric weighted k-center problem.
Proof
• By Lemma 4.4, cost(ej) is a lower bound on OPT; the argument is identical to that in Lemma 4.3. Since Ij is a dominating set in Gj^2, we can cover V with stars of Gj^2 centered on vertices of Ij. By the triangle inequality these stars use edges of cost at most 2 cost(ej).
• Each star center is adjacent to a vertex in Sj, using an edge of cost at most cost(ej). Move each of the centers to the adjacent vertex in Sj and redefine the stars. Again, by the triangle inequality, the largest edge cost used in constructing the final stars is at most 3 cost(ej) ≤ 3 OPT.
Shortest superstring
• Given a finite alphabet Σ, and a set of n strings S = {s1, …, sn} ⊆ Σ+.
• Find a shortest string s that contains each si as a substring.
• Without loss of generality, we may assume that no string si is a substring of another string sj, i ≠ j.
Overlap, prefix
• We begin by developing a good lower bound on OPT.
• Let us assume that s1, s2, …, sn are numbered in order of leftmost occurrence in the shortest superstring, s.
• Let overlap(si, sj) denote the maximum overlap between si and sj, i.e., the longest suffix of si that is a prefix of sj.
• Let prefix(si, sj) be the prefix of si obtained by removing its overlap with sj.
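The two definitions translate directly into string operations. A minimal Python sketch; it assumes the overlap is a proper one (shorter than both strings), which holds automatically once no string is a substring of another, and the sample strings are illustrative:

```python
def overlap(a, b):
    """Longest suffix of a that is a prefix of b (shorter than both strings)."""
    for length in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:length]):
            return b[:length]
    return ""


def prefix(a, b):
    """Prefix of a obtained by removing its overlap with b."""
    return a[:len(a) - len(overlap(a, b))]


print(overlap("abcd", "cdef"))  # suffix "cd" of the first string starts the second
print(prefix("abcd", "cdef"))   # what remains of "abcd" in front of the overlap
```

By construction, a = prefix(a, b) ○ overlap(a, b), so |prefix(a, b)| + |overlap(a, b)| = |a|.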
Prefix
[Figure: the shortest superstring s laid out as pref(s1, s2), …, pref(sn–1, sn), pref(sn, s1), followed by over(sn, s1)]
OPT = |prefix(s1, s2)| + |prefix(s2, s3)| + … + |prefix(sn, s1)| + |overlap(sn, s1)|.
• Define the prefix graph of S as the directed graph Gpref on vertex set V = {1, …, n} that contains an edge i → j of weight |prefix(si, sj)| for each i, j.
• |prefix(s1, s2)| + |prefix(s2, s3)| + … + |prefix(sn, s1)| represents the weight of the tour 1 → 2 → … → n → 1.
• Hence the minimum weight of a travelling salesman tour of the prefix graph gives a lower bound on OPT.
• Unfortunately, this lower bound is not very useful, since computing a minimum weight TSP tour is itself NP-hard.
Lower Bound
• We will use the minimum weight of a cycle cover of the prefix graph.
• A cycle cover is a collection of disjoint cycles covering all vertices.
• A Hamiltonian cycle is a cycle cover.
• We get that the minimum weight of a cycle cover lower-bounds OPT.
• Unlike minimum TSP, a minimum weight cycle cover can be computed in polynomial time.
Cycle → prefix
• If c = (i_1 → i_2 → … → i_l → i_1) is a cycle in the prefix graph, let α(c) = prefix(s_{i_1}, s_{i_2}) ○ … ○ prefix(s_{i_{l-1}}, s_{i_l}) ○ prefix(s_{i_l}, s_{i_1}).
• Let w(c) be the weight of c, w(c) = |α(c)|.
• Notice that each string s_{i_1}, s_{i_2}, …, s_{i_l} is a substring of (α(c))^∞.
• Next, let σ(c) = α(c) ○ s_{i_1}.
• Then σ(c) is a superstring of s_{i_1}, s_{i_2}, …, s_{i_l}.
• In the above construction, we “opened” cycle c at an arbitrary string s_{i_1}. For the rest of the algorithm, we will call s_{i_1} the representative string for c.
Example
Strings of the cycle c, in order, returning to the first:
abcdeabcdeabcde
bcdeabcdeabcdea
cdeabcdeabcdeabc
deabcdeabcdeabcd
abcdeabcdeabcde
α(c) = abcde, |α(c)| = 5, (α(c))^2 = abcdeabcde, and bcdeabcdeabcdea is a substring of (α(c))^4.
σ(c) = α(c) ○ s_{i_1} = abcdeabcdeabcdeabcde
Algorithm Superstring
Input (S = {s1, …, sn})
1) Construct the prefix graph Gpref corresponding to strings in S.
2) Find a minimum weight cycle cover of Gpref, C = {c1, …, ck}.
Output (σ(c1) ○ … ○ σ(ck)).
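The algorithm can be sketched in Python as follows. One loud assumption: the minimum weight cycle cover is found here by brute force over all permutations, which is exponential and only suitable for tiny inputs. It stands in for the polynomial-time computation mentioned earlier (a cycle cover is a permutation of the vertices, so an assignment solver would do). All function names are illustrative.

```python
from itertools import permutations


def overlap(a, b):
    """Longest suffix of a that is a prefix of b (shorter than both strings)."""
    for length in range(min(len(a), len(b)) - 1, 0, -1):
        if a.endswith(b[:length]):
            return b[:length]
    return ""


def prefix(a, b):
    """Prefix of a obtained by removing its overlap with b."""
    return a[:len(a) - len(overlap(a, b))]


def min_cycle_cover(strings):
    """Successor map of a minimum weight cycle cover (brute force stand-in)."""
    n = len(strings)
    w = [[len(prefix(a, b)) for b in strings] for a in strings]
    return min(permutations(range(n)),
               key=lambda succ: sum(w[i][succ[i]] for i in range(n)))


def superstring(strings):
    """Concatenate sigma(c) = alpha(c) + representative over all cycles c."""
    succ = min_cycle_cover(strings)
    seen, pieces = set(), []
    for start in range(len(strings)):
        if start in seen:
            continue
        cycle, nxt = [start], succ[start]
        seen.add(start)
        while nxt != start:  # walk the cycle opened at its representative
            cycle.append(nxt)
            seen.add(nxt)
            nxt = succ[nxt]
        alpha = "".join(prefix(strings[i], strings[succ[i]]) for i in cycle)
        pieces.append(alpha + strings[start])
    return "".join(pieces)


strings = ["abcdeabcdeabcde", "bcdeabcdeabcdea",
           "cdeabcdeabcdeabc", "deabcdeabcdeabcd"]
result = superstring(strings)
print(result, len(result))
```

On these four strings the cover is a single cycle with α(c) = abcde, and the output is α(c) followed by the representative abcdeabcdeabcde.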
Remark
• Clearly, the output σ(c1)○…○ σ(ck) is a superstring of the strings in S.
• Notice that if in each of the cycles we can find a representative string of length at most the weight of the cycle, then the output string has length at most 2 OPT.
• Thus, the hard case is when all strings of some cycle c are long.
Example
The same cycle, with the copies of α(c) separated by bars:
abcde|abcde|abcde
bcde|abcde|abcde|a
cde|abcde|abcde|abc
de|abcde|abcde|abcd
abcde|abcde|abcde
α(c) = abcde, |α(c)| = 5, (α(c))^2 = abcdeabcde, and bcdeabcdeabcdea is a substring of (α(c))^4.
σ(c) = α(c) ○ s_{i_1} = abcde|abcde|abcde|abcde
New lower bound
• Lemma 4.6 If each string in S′ ⊆ S is a substring of t^∞ for a string t, then there is a cycle of weight at most |t| in the prefix graph covering all the vertices corresponding to strings in S′.
Proof of Lemma 4.6
• For each string in S′, locate the starting point of its first occurrence in t^∞.
• All these starting points will be distinct and will lie in the first copy of t.
• Consider the cycle in the prefix graph visiting the corresponding vertices in this order.
• Clearly, the weight of this cycle is at most |t|.
Upper bound on overlap
• Lemma 4.7 Let c and c′ be two cycles in C (a minimum weight cycle cover), and let r, r′ be representative strings from these cycles. Then |overlap(r, r′)| < w(c) + w(c′).
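Suppose, for contradiction, that |overlap(r, r′)| ≥ w(c) + w(c′)
[Figure: r and r′ aligned on overlap(r, r′); the overlap is covered by copies of α from r and by copies of α′ from r′, so α ○ α′ = α′ ○ α]
• Write α = α(c) and α′ = α(c′). Then α is a prefix of length w(c) of overlap(r, r′).
• α′ is a prefix of length w(c′) of overlap(r, r′).
• Since |overlap(r, r′)| ≥ w(c) + w(c′), it follows that α and α′ commute: α ○ α′ = α′ ○ α.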
• Hence (α)^∞ = (α′)^∞: for any N > 0, the prefix of length N of (α)^∞ is the same as that of (α′)^∞.
Proof of Lemma 4.7
• Now, by Lemma 4.6 (with t = α), there is a cycle of weight at most w(c) in the prefix graph covering all strings in c and c′, contradicting the fact that C is a minimum weight cycle cover.
• So, we have |overlap(r, r′)| < w(c) + w(c′).
Approximation ratio of Algorithm Superstring
Theorem 4.8
Algorithm Superstring achieves an approximation factor of 4 for the shortest superstring problem.
Proof
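• The weight of the cycle cover is a lower bound on OPT:
$$w(C) = \sum_{i=1}^{k} w(c_i) \le \mathrm{OPT}$$
• The length A of the output is
$$A = \sum_{i=1}^{k} |\sigma(c_i)| = w(C) + \sum_{i=1}^{k} |r_i|$$
where ri is a representative string for ci.
• Let s* be a shortest superstring, with the representatives numbered in order of their leftmost occurrence: s* = …, r1, …, r2, …, rk, …
$$\mathrm{OPT} \ge \sum_{i=1}^{k} |r_i| - \sum_{i=1}^{k-1} |\mathrm{overlap}(r_i, r_{i+1})| \overset{\text{L. 4.7}}{\ge} \sum_{i=1}^{k} |r_i| - 2\sum_{i=1}^{k} w(c_i)$$
• Hence
$$\sum_{i=1}^{k} |r_i| \le \mathrm{OPT} + 2\,w(C) \le 3\,\mathrm{OPT}$$
• Therefore
$$A \le 4\,\mathrm{OPT}$$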
Exercise 4.1
• Show that the metric k-center problem cannot be approximated within factor < 2, unless P=NP.
• Hint: show that such an algorithm can solve the dominating set problem in polynomial time.
Dominating set
• Given an undirected graph G = (V, E) and a number k ∈ N, is there a dominating set X ⊆ V(G) with |X| ≤ k?