
Page 1: Algorithms for Massive Data Sets

Randomized Search Heuristics Introduction

Algorithms for Massive Data Sets

02283, F 2011

Carsten Witt

DTU Informatik, Danmarks Tekniske Universitet

Spring 2011


Page 2: Algorithms for Massive Data Sets

Randomized Search Heuristics Introduction

Summary of Last Lecture

• Motivation and definition of randomized search heuristics

• Example problems

• Probabilistic background

• Upper bounds
  • OneMax
  • Fitness-based partitions; BinVal

• Lower bounds
  • General lower bound
  • Trap function


Page 3: Algorithms for Massive Data Sets

Randomized Search Heuristics Introduction

Definition of Randomized Search Heuristics

Focus so far: analyzed (1+1) EA and RLS on example problems.

Aim: maximize f : {0,1}^n → R

(1+1) Evolutionary Algorithm ((1+1) EA)

1 t := 0. Choose x = (x_1, …, x_n) ∈ {0,1}^n uniformly at random.

2 y := x

3 Flip each bit in y independently with probability 1/n (mutation).

4 If f(y) ≥ f(x) Then x := y (selection).

5 t := t+ 1. Continue at line 2.

Approach: derive upper and lower bounds on the expected optimization time (and study its distribution).
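The listing runs forever; the analysis asks when the optimum is first hit. A minimal Python sketch of the (1+1) EA (not from the slides; the fixed step budget is an assumption added only to make it runnable):

    import random

    def one_plus_one_ea(f, n, steps=10_000):
        """Maximize f over {0,1}^n with the (1+1) EA (standard bit mutation)."""
        x = [random.randint(0, 1) for _ in range(n)]
        fx = f(x)
        for _ in range(steps):
            # mutation: flip each bit of a copy independently with prob. 1/n
            y = [bit ^ (random.random() < 1.0 / n) for bit in x]
            fy = f(y)
            if fy >= fx:        # selection: keep y if at least as good
                x, fx = y, fy
        return x, fx

    # Example run on OneMax, f(x) = number of ones:
    best_x, best_f = one_plus_one_ea(lambda x: sum(x), n=20)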


Page 4: Algorithms for Massive Data Sets

Randomized Search Heuristics Introduction

Definition of Randomized Search Heuristics

Focus so far: analyzed (1+1) EA and RLS on example problems.

Aim: maximize f : {0,1}^n → R

Randomized Local Search (RLS)

1 t := 0. Choose x = (x_1, …, x_n) ∈ {0,1}^n uniformly at random.

2 y := x

3 Choose one bit in y uniformly at random and flip it (mutation).

4 If f(y) ≥ f(x) Then x := y (selection).

5 t := t+ 1. Continue at line 2.

Approach: derive upper and lower bounds on the expected optimization time (and study its distribution).


Page 5: Algorithms for Massive Data Sets

Randomized Search Heuristics Introduction

Combinatorial Optimization

Most of the problems you know from algorithms classes are combinatorial optimization problems.

• shortest path problems,
• matching problems,
• spanning tree problems,
• covering problems,
• …

Aims: understand randomized search heuristics on such problems,

• deriving bounds on the optimization time,
• not hoping to be better than the best problem-specific algorithm!

Today: two problems


Page 6: Algorithms for Massive Data Sets

Randomized Search Heuristics MST

Minimum Spanning Trees (MSTs)

Problem

Given: Undirected connected graph G = (V, E) with n vertices and m edges and positive integer weights w : E → N.
Find: Edge set E′ ⊆ E with minimal weight connecting all vertices.

Solvable in time O(n log n+m) by textbook algorithms.


Page 7: Algorithms for Massive Data Sets

Randomized Search Heuristics MST

The Fitness Function

Model:

• Given an instance to the MST problem,

• reserve for each edge a bit: fitness function f : {0,1}^m → R,

• f(x) = weight of the selection if x describes a tree.

To decide: What “fitness” for unconnected graphs?

Problematic: if all unconnected graphs get the same bad fitness
⇒ search heuristic likely to miss connected graphs

Instead: small amount of problem knowledge, e. g.

• start with all edges selected,

• or let the fitness function lead to trees.


Page 8: Algorithms for Massive Data Sets

Randomized Search Heuristics MST

The Fitness Function (2/2)

Choice:

f(x) :=
  w_1·x_1 + ··· + w_m·x_m                 if x encodes a tree,
  M²·(c(x) − 1) + M·(e(x) − (n − 1))      otherwise,

where

• M is a huge number (e.g., M = n·(w_1 + ··· + w_m)),

• c(x) = number of connected components in the graph encoded by x,

• e(x) = x_1 + ··· + x_m (number of chosen edges).

Aim: minimize f (replace ≥-selection by ≤)

Note: unconnected graphs are always worse than connected ones, and connected non-trees are always worse than trees.
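A direct transcription into Python (a sketch, not from the slides; it assumes edges as (u, v, weight) triples over vertices 0..n−1 and x as a 0/1 list, with M as defined above):

    def mst_fitness(x, edges, n, M):
        """Penalty-based MST fitness from the slide (to be minimized)."""
        parent = list(range(n))
        def find(a):                          # union-find with path halving
            while parent[a] != a:
                parent[a] = parent[parent[a]]
                a = parent[a]
            return a
        chosen = [e for bit, e in zip(x, edges) if bit]
        for u, v, _ in chosen:
            parent[find(u)] = find(v)
        c = len({find(v) for v in range(n)})  # c(x): connected components
        e = len(chosen)                       # e(x): number of chosen edges
        if c == 1 and e == n - 1:             # x encodes a spanning tree
            return sum(w for _, _, w in chosen)
        return M * M * (c - 1) + M * (e - (n - 1))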


Page 9: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Upper Bound

Upper Bound

Theorem

The expected time until (1+1) EA and RLS≤2 construct a minimum spanning tree is bounded by O(m²(log n + log w_max)) = O(n⁴(log n + log w_max)), where w_max := max{w_1, …, w_m}.

RLS≤2 flips in its mutation step either one or two uniformly chosen bits (each of these happening with probability 1/2):

1 t := 0. Choose x = (x_1, …, x_n) ∈ {0,1}^n uniformly at random.

2 y := x

3 Choose b ∈ {1, 2} uniformly at random. Flip exactly b bits in y, chosen uniformly at random (mutation).

4 If f(y) ≤ f(x) Then x := y (selection).

5 t := t + 1. Continue at line 2.
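One iteration of this listing as a Python sketch (not from the slides; minimization, matching the ≤-selection above):

    import random

    def rls_le2_step(x, fx, f):
        """One RLS<=2 iteration: lines 2-4 of the listing."""
        b = random.choice((1, 2))                  # 1- or 2-bit flip, prob. 1/2 each
        y = list(x)
        for i in random.sample(range(len(x)), b):  # b distinct uniform positions
            y[i] ^= 1                              # mutation
        fy = f(y)
        return (y, fy) if fy <= fx else (x, fx)    # <=-selection (minimization)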


Proof outline: three phases

1 From unconnected to connected graphs,

2 from connected graphs to spanning trees,

3 from spanning trees to minimum spanning trees



Page 12: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Upper Bound

Phase 1

Lemma

The expected time until a connected graph has been reached is O(m log n).

Proof using fitness-based partitions

• L_i := {x | (n − i)·M² > f(x) ≥ (n − i − 1)·M²}, i.e., x has n − i connected components (CCs), 0 ≤ i < n.

• There must be ≥ n − i − 1 edges running between the n − i CCs (otherwise G itself would not be connected).

• Success probability s_i = Ω((n − i − 1)/m) (both for (1+1) EA and RLS≤2).

• Time bound O(Σ_{i=0}^{n−2} m/(n − i − 1)) = O(m log n).


Page 13: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Upper Bound

Phase 2

Lemma

The expected time until a spanning tree has been reached is O(m log n).

Proof using fitness-based partitions

• L_i := {x | (m − i)·M > f(x) ≥ (m − i − 1)·M}, i.e., x has m − i edges, 0 ≤ i < m − (n − 1).

• At least m − i − (n − 1) of these can be flipped out without losing connectedness.

• Success probability s_i = Ω((m − i − n + 1)/m) (both for (1+1) EA and RLS≤2).

• Time bound O(Σ_{i=0}^{m−n} m/(m − i − n + 1)) = O(m log m) = O(m log n), since m ≤ n².


Page 14: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Upper Bound

Phase 3

Have reached a tree: How to make progress now?

Combinatorial argument (Mayr/Plaxton, 1992): from an arbitrary spanning tree T to the MST T∗

[Figure: tree T with the non-tree edges e_1, e_2, e_3 of T∗ and their exchange partners α(e_1), α(e_2), α(e_3) in T.]

• k := |E(T∗) \ E(T)|

• Bijection α : E(T∗) \ E(T) → E(T) \ E(T∗)

• α(e_i) lies on the cycle in E(T) ∪ {e_i}

• w(e_i) ≤ w(α(e_i))


Page 15: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Upper Bound

Phase 3: Analyzing 2-bit Flips

Essence of the combinatorial argument: there exist k ≤ n − 1 improving 2-bit flips; performing all of them one after another leads from the current tree x to the optimum x_opt.

• Problem: some 2-bit flips might only yield tiny progress

• But: each of the k possibilities improves on average by (f(x) − f(x_opt))/k.

• A uniformly random 2-bit flip ((m choose 2) = m(m−1)/2 choices) improves in expectation by (f(x) − f(x_opt))/(m(m−1)/2) (as worsenings are never accepted).

• A 2-bit flip happens with prob. ≥ 1/(4e) (with RLS≤2: 1/2).

Hence: if the current value is f(x), the next f-value is in expectation ≤ f(x) − (f(x) − f(x_opt))/(2em²), i.e., the distance to the optimum (f(x) − f(x_opt)) decreases in expectation by a factor ≤ 1 − 1/(2em²).

Now: formalize this expected multiplicative distance decrease.


Page 16: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Upper Bound

Decrease by Constant Factor: Exercise

You start with 1024 DKK and spend half your money every day. How many days does it take until you have at most 1 DKK left?

10 days.

You start with n DKK and spend a (1 − c)-fraction (i.e., (1 − c)·100 percent) of your money every day. How many days does it take until you have at most 1 DKK left?

Solve n·c^d ≤ 1

⇒ d = −ln(n)/ln(c) = −log_c(n) = log_{1/c}(n).

If you spend a (1 − c)-fraction only in expectation, things are similar.
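A quick plain-Python check of the exercise (not from the slides):

    import math

    def days_until_broke(n, c):
        """Days until at most 1 DKK remains if a c-fraction is kept each day."""
        days, money = 0, float(n)
        while money > 1.0:
            money *= c
            days += 1
        return days

    print(days_until_broke(1024, 0.5))         # -> 10, as computed above
    print(math.log(1024) / math.log(1 / 0.5))  # closed form log_{1/c}(n) ≈ 10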



Page 18: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Upper Bound

Phase 3: Formalizing Progress

Proof method: expected multiplicative distance decrease

Scenario: a process with random distances D_t, t ≥ 0, from the optimum.

If the D_t are nonnegative integers and E(D_{t+1} | D_t = d) ≤ c·d for some c < 1,

then the expected time to reach distance 0 is bounded from above by 2⌈−log_c(2D_0)⌉ = 2⌈ln(2D_0)/ln(1/c)⌉.

Proof:

• Observe E(D_t) ≤ c^t·D_0 (induction).

• Consider D_t* for t* = ⌈−log_c(2D_0)⌉.

• E(D_t*) ≤ D_0 · c^(−log_c(2D_0)) = D_0 · 1/(2D_0) = 1/2.

• Markov's inequality: D_t* < 1 with probability ≥ 1/2.

• D_t* < 1 ⇒ D_t* = 0 due to integrality (!)

• Repeat the argument an expected ≤ 2 times.
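The bound is easy to probe empirically. A small simulation sketch (not from the slides; the per-unit survival model below is just one process satisfying the drift condition, here with equality):

    import math
    import random

    def hitting_time(d0, c):
        """Steps until D_t = 0 for a toy process with E(D_{t+1} | D_t = d) = c*d:
        each of the d distance units survives independently with probability c."""
        d, t = d0, 0
        while d > 0:
            d = sum(random.random() < c for _ in range(d))
            t += 1
        return t

    d0, c = 1000, 0.9
    avg = sum(hitting_time(d0, c) for _ in range(200)) / 200
    bound = 2 * math.ceil(math.log(2 * d0) / math.log(1 / c))
    print(avg, "<=", bound)   # empirical mean vs. the slide's upper bound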

Page 19: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Upper Bound

Phase 3: Applying the Distance-Decrease Method

Back to MST problem:

• Distance decreases in expectation by the factor c = 1 − 1/(2em²),

• initial distance at most w_1 + ··· + w_m ≤ m·w_max,

• expected length of Phase 3:
O(ln(2m·w_max) / ln(1/(1 − 1/(2em²)))) = O(m²(log m + log w_max)) = O(m²(log n + log w_max)),
since ln(1 + x) ≈ x for x near 0 (more precisely: ln(1 − x) ≥ −2x for 0 ≤ x ≤ 1/2).

All phases together: still expected time O(m²(log n + log w_max)).


Page 20: Algorithms for Massive Data Sets

Randomized Search Heuristics MST: Lower Bound

Lower Bound: Idea

∃ instance where (1+1) EA and RLS≤2 take Ω(n⁴ log n) steps.

[Figure: worst-case instance built from Ω(n) triangles with edge weights 2n², 2n², 3n², attached to a clique K_{n/2} whose edges all have weight 1.]

• Ω(n) triangles, each with 2 wrong and 1 right configurations: right = the two edges of weight 2n², wrong = the 3n²-edge plus one of the 2n²-edges.

• The clique on n/2 nodes makes the graph have m = Ω(n²) edges.

• Two edges of a triangle are flipped together with prob. Θ(1/m²) = Θ(1/n⁴).

• Coupon collector argument → time Ω(n⁴ log n).
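Heuristic sketch of the coupon-collector step (the slides state only the result): if i triangles are still wrong, some one of them is corrected with probability Θ(i/n⁴) per step, and weight-increasing flips are never accepted, so

    \[
      E(T) \;=\; \sum_{i=1}^{\Theta(n)} \Theta\!\left(\frac{n^4}{i}\right)
             \;=\; \Theta\!\left(n^4 \log n\right).
    \]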


Page 21: Algorithms for Massive Data Sets

Randomized Search Heuristics Maximum Matchings

Maximum Matching

A matching in an undirected graph is a subset of pairwise disjoint edges; aim: find a maximum matching (solvable in polynomial time).

Easy example: path of odd length

[Figures: a suboptimal matching on the path, and the maximum matching, which contains more than half of the edges.]

Concept: augmenting path

• Alternating between edges inside and outside the matching
• Starting and ending at “free” nodes not incident to the matching
• Flipping the choices along the path improves the matching

Example: whole graph is augmenting path

Interesting: how RSHs find augmenting paths



Page 23: Algorithms for Massive Data Sets

Randomized Search Heuristics MM: Upper Bound

Maximum Matching: Upper Bound

Study again (1+1) EA and RLS≤2

Fitness function:

• for legal matchings: the size of the matching
• otherwise: leading toward the empty matching

First example: path with n + 1 nodes and n edges; a bit string from {0,1}^n selects edges.
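The slides do not spell out the penalty for illegal selections; a Python sketch under the assumption that node collisions are penalized (one common concretization, labeled as such) could be:

    def matching_fitness(x, edges):
        """Sketch: size for legal matchings; otherwise a collision penalty
        that leads the search back toward the empty matching (assumption)."""
        chosen = [e for bit, e in zip(x, edges) if bit]
        seen, collisions = set(), 0
        for u, v in chosen:
            collisions += (u in seen) + (v in seen)
            seen.update((u, v))
        if collisions == 0:
            return len(chosen)   # legal matching: its size
        return -collisions       # illegal: fewer node collisions is better

    # Example: path with 4 nodes / 3 edges; x = (1, 1, 0) is illegal.
    print(matching_fitness([1, 1, 0], [(0, 1), (1, 2), (2, 3)]))  # -> -1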

Theorem

The expected time until (1+1) EA and RLS≤2 find a maximum matching on a path of n edges is O(n⁴).


Page 24: Algorithms for Massive Data Sets

Randomized Search Heuristics MM: Upper Bound

Proof Technique: Fair Random Walk

Scenario: Fair random walk

• Initially, Players A and B both have n/2 DKK.

• Repeat: Flip a coin.

• If heads: A pays 1 DKK to B; if tails: the other way round.

• Until one of the players is ruined.

How long does the game take in expectation?

Theorem: A fair random walk on {0, …, n} takes in expectation O(n²) steps.
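A quick simulation sketch (not from the slides) showing the quadratic scaling; for the fair walk started at the midpoint, the exact expected duration is (n/2)²:

    import random

    def ruin_time(n):
        """Fair random walk on {0, ..., n} started at n/2; steps until 0 or n."""
        pos, t = n // 2, 0
        while 0 < pos < n:
            pos += random.choice((-1, 1))
            t += 1
        return t

    for n in (10, 20, 40):
        avg = sum(ruin_time(n) for _ in range(2000)) / 2000
        print(n, avg, "exact:", (n // 2) ** 2)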


Page 25: Algorithms for Massive Data Sets

Randomized Search Heuristics MM: Upper Bound

Fair Random Walk: Proof Idea

N trials with success probability 1/2

• E(no. of successes) = N/2.

• √(Var(no. of successes)) = √N/2.

[Figure: distribution of the number of successes, centered at N/2, with the standard deviation marked at N/2 + √N/2.]

• In N steps of the fair random walk, expect a deviation of √N/2.

• Hence, deviation Ω(n) in n² steps.


Page 26: Algorithms for Massive Data Sets

Randomized Search Heuristics MM: Upper Bound

Maximum Matching: Upper Bound (Ctnd.)

Proof idea for the O(n⁴) bound

• Consider the level of second-best matchings.

• Fitness function does not change (walk on plateau).

• If a “free” edge exists: chance to flip one bit! → probability Θ(1/n).

• Else 2-bit flips → probability Θ(1/n²).

• Shorten or lengthen the augmenting path.

• At length 1, flip the free edge!

• Length changes according to a fair random walk
→ expected runtime O(n²) · O(n²) = O(n⁴).


Page 27: Algorithms for Massive Data Sets

Randomized Search Heuristics MM: Lower Bound

Maximum Matching: Lower Bound

Worst-case graph G_{h,ℓ} (Sasaki/Hajek, 1988)

[Figure: the worst-case graph G_{h,ℓ} with parameters h ≥ 3 and ℓ = 2ℓ′ + 1.]

Augmenting path can get shorter but is more likely to get longer (unfair random walk).

Theorem

For h ≥ 3, (1+1) EA and RLS≤2 have exponential expected runtime 2^Ω(ℓ) on G_{h,ℓ}.

Proof requires advanced drift analysis (not shown here).

Page 28: Algorithms for Massive Data Sets

Randomized Search Heuristics MM: Approximations

Maximum Matching: Approximation (1/3)

Insight: do not hope for exact solutions but for approximations

For maximization problems: a solution with value a is called a (1 + ε)-approximation if OPT/a ≤ 1 + ε, where OPT is the optimal value.

Theorem

For ε > 0, (1+1) EA and RLS≤2 find a (1 + ε)-approximation of a maximum matching in expected time O(m^(2/ε+2)) (m = number of edges).


Page 29: Algorithms for Massive Data Sets

Randomized Search Heuristics MM: Approximations

Maximum Matching: Approximation (2/3)

Lemma (Structural lemma)

Given a current matching of size S, there exists an augmenting path of length at most 2·S/(OPT − S) + 1.

Proof idea

• An augmenting path improves the matching size by 1.

• There are at least OPT − S node-disjoint augmenting paths.

• All of these contain edges from the current matching.

• Pigeonhole principle: some augmenting path contains ≤ S/(OPT − S) such edges, hence has length at most 2·S/(OPT − S) + 1.


Page 30: Algorithms for Massive Data Sets

Randomized Search Heuristics MM: Approximations

Maximum Matching: Approximation (3/3)


Use of the lemma:

• Solution worse than (1 + ε)-approximate → S(1 + ε) ≤ OPT.

• OPT − S ≥ S(1 + ε − 1) = εS ⇒ S/(OPT − S) ≤ 1/ε.

• Wait for (1+1) EA to flip the whole path of length ≤ 2/ε + 1 → expected time O(m^(2/ε+1)) (similar argument for RLS≤2).

• Altogether at most m improvements → bound O(m^(2/ε+2)).
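A worked instance (not on the slides), for ε = 1, i.e., a 2-approximation:

    \[
      \frac{S}{\mathrm{OPT}-S} \le \frac{1}{\varepsilon} = 1
      \;\Rightarrow\; \text{path length} \le 2 \cdot 1 + 1 = 3
      \;\Rightarrow\; \text{expected time } O(m^{3}) \text{ per improvement},
      \text{ overall } O(m^{4}) = O(m^{2/\varepsilon + 2}).
    \]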


Page 31: Algorithms for Massive Data Sets

Randomized Search Heuristics Summary

Summary

• Studied (1+1) EA and RLS on two combinatorial optimization problems

• Polynomial bound O(n⁴ log n) for minimum spanning trees (unless weights are very large)

• Polynomial bound for maximum matchings (MM) on paths

• No general polynomial bound for MM

• but good approximation in polynomial time

• Proof techniques:
  • expected multiplicative distance decrease
  • fair random walk


Page 32: Algorithms for Massive Data Sets

Randomized Search Heuristics Summary

Literature

• F. Neumann and I. Wegener: Randomized local search, evolutionary algorithms, and the minimum spanning tree problem. Theoretical Computer Science, vol. 378, 32–40, 2007.

• O. Giel and I. Wegener: Evolutionary algorithms and the maximum matching problem. STACS 2003, 415–426, Springer, 2003.

• E. W. Mayr and C. G. Plaxton: On the spanning trees of weighted graphs. Combinatorica, vol. 12(4), 433–447, 1992.

• G. Sasaki and B. Hajek: The time complexity of maximum matching by simulated annealing. Journal of the ACM, vol. 35, 387–403, 1988.
