are p and q connected? - sjtubasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long...

57
Are p and q connected?

Upload: others

Post on 14-Aug-2020

0 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Are p and q connected?

Page 2: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Network connectivity

Yes, they are connected!

Page 3: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Network connectivity

I Problem:Given a set of nodes Nand a set of links between pairs of nodes L.Find connectivity for node p and node q.(p∈N,q∈N)

Page 4: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Real World Application

Page 5: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Kruskal

Page 6: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

The union-find data structure

Zhengtian Xu Xiaoqing Geng Lihua QianRuxuan Zhang Chen Feng

Department of Computer Science and EngineeringShanghai Jiao Tong University

8th December 2016

Page 7: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Outline

Brief Introduction for Union-Find Data Structure

Improvement

Time Complexity Analysis

Page 8: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union-Find data structure type

Goal.Support three operations on a set of elements:

I MAKE-SET(x).Create a new set containing only element x .

I FIND(x). Return a canonical element in the set containing x .

I UNION(x , y). Merge the sets containing x and y .

Page 9: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union-Find example

6

a

2,7,9,3

b

4,5,8

c

Page 10: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union-Find example

6

a

2,7,9,3

b

4,5,8

c

FIND(9) = 2

Page 11: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union-Find example

6

a

2,7,9,3

b

4,5,8

c

FIND(9) = 2MAKE-SET(1)

Page 12: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union-Find example

6

a

2,7,9,3

b

4,5,8

c

1

d

FIND(9) = 2MAKE-SET(1)

Page 13: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union-Find example

6

a

2,7,9,3

b

4,5,8

c

1

d

FIND(9) = 2MAKE-SET(1)

UNION(2,4)

Page 14: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union-Find example

6

a

2,7,9,3,4,5,8

e

1

d

FIND(9) = 2MAKE-SET(1)

UNION(2,4)

Page 15: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union-Find data structure

Representation

Represent each set as a tree of elements.

I Each element has a parent pointer in the tree.

I The root serves as the canonical element.

I FIND(x). Find the root of the tree containing x .

I UNION(x , y). Make the root of one tree point to root ofother tree.

d

a c

e b

f

root

parent of e is c

Page 16: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Find operation

Representation

FIND(x). Find the root of the tree containing x .

d

a c

e b

g

f

Page 17: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Find operation

Representation

FIND(x). Find the root of the tree containing x .

d

a c

e b

g

f FIND(g)

Page 18: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Find operation

Representation

FIND(x). Find the root of the tree containing x .

d

a c

e b

g

f FIND(g)

Page 19: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Find operation

Representation

FIND(x). Find the root of the tree containing x .

d

a c

e b

g

f FIND(g)

Page 20: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Find operation

Representation

FIND(x). Find the root of the tree containing x .

d

a c

e b

g

f FIND(g)

Page 21: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Find operation

Representation

FIND(x). Find the root of the tree containing x .

d

a c

e b

g

f FIND(g)

Page 22: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Find operation

Representation

FIND(x). Find the root of the tree containing x .

d

a c

e b

g

f FIND(g)

FIND(d)

Page 23: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Find operation

Representation

FIND(x). Find the root of the tree containing x .

d

a c

e b

g

f FIND(g)

FIND(d)

Page 24: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union operation

I Maintain an integer rank for each node, initially 0.

I Link root of smaller rank to root of larger rank; if tie, increaserank of new root by 1.

Note. For now, rank = height.

Page 25: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union operation

I Maintain an integer rank for each node, initially 0.

I Link root of smaller rank to root of larger rank; if tie, increaserank of new root by 1.

union(d, g)

d

c i j

g

k b

a h

e

rank = 2rank = 1

Note. For now, rank = height.

Page 26: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank

I Maintain an integer rank for each node, initially 0.

I Link root of smaller rank to root of larger rank; if tie, increaserank of new root by 1.

union(d, g)

d

c i j

g

k b

a h

e

rank = 2

Page 27: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank

I Maintain an integer rank for each node, initially 0.

I Link root of smaller rank to root of larger rank; if tie, increaserank of new root by 1.

union(d, g)

d

c

l

i j

g

k b

a h

e

rank = 2rank = 2

Page 28: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rankI Maintain an integer rank for each node, initially 0.I Link root of smaller rank to root of larger rank; if tie, increase

rank of new root by 1.

union(d, g)

d

c

l

i j

g

k b

a h

e

rank = 3

Page 29: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank: analysis

Lemma 1. Using union by rank, for every root node r

size(r) ≥ 2rank(r)

Proof.[ by induction on number of links ]

I Base case: singleton tree has size 1 and rank 0.

I Inductive hypothesis: assume true after first i links.

Page 30: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank: analysis

Proof.

I Case 1. [ rank(r) > rank(s) ] or [ rank(r) < rank(s) ]

size ′(r) ≥ size(r) ≥ 2rank(r) = 2rank ′(r)

s

r

size = 8

size = 3

(rank = 2)

(rank = 1)

Page 31: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank: analysisProof.

I Case 2. [ rank(r) = rank(s) ]

size ′(r) = size(r) + size(s)

≥ 2× size(r)

≥ 2× 2rank(r)

= 2rank(r)+1

= 2rank ′(r)

s

r

size = 6

size = 3

(rank = 2)

(rank = 1)

Page 32: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank:analysis

Lemma 2. There are at most n2k elements of rank k .

Page 33: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank:analysis

Lemma 2. There are at most n2k elements of rank k .

Proof. According to Lemma 1, for node has rank k, its sizes areat least 2k . If the size of all elements is n. Obviously we can getLemma 2.

Page 34: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank:analysis

Theorem. Using Union by rank, any FIND operations takesO(log2n) time in the worst case, where n is the number ofelements; any UNION operations take constant time.

Page 35: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Union by rank:analysis

Theorem. Using Union by rank, any FIND operations takesO(log2n) time in the worst case, where n is the number ofelements; any UNION operations take constant time.

Proof.

I The running time of each operation is bounded by the treeheight.

I We can know that the height ≤ blog2nc

Page 36: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Outline

Brief Introduction for Union-Find Data Structure

Improvement

Time Complexity Analysis

Page 37: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Improvement

Observation

I It is the height of the tree that affects the running time.

I When we’re trying to find the root of the tree containing agiven node, we’re touching all the nodes on the path fromthat node to the root.

So...

I Why not make each of those just point to the root?

I That’s the idea of path compression!

Page 38: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression

I Just after computing the root of the target node, set theparent of each examined node to point to that root.

a

b

d

g

i j

h

e f

c

height=4

find(j)

Page 39: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression

I Just after computing the root of the target node, set theparent of each examined node to point to that root.

a

b

d

g

i

h

e f

cj

Page 40: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression

I Just after computing the root of the target node, set theparent of each examined node to point to that root.

a

b

d

h

e f

cj g

i

Page 41: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression

I Just after computing the root of the target node, set theparent of each examined node to point to that root.

a

b

e f

cj g

i

d

h

height=2

Page 42: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression: benefits

I The resulting tree is much flatter.I If the target node is very deep, path compression may

dramatically decrease the height of the tree.

I Speed up future operations on all the nodes on the path andon those referencing them, directly or indirectly.

Page 43: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression: rank vs. height

I The rank of a tree does not change during path compression.

I ...but the height of a tree may decrease.

I Now it is possible that rank 6= height!

Page 44: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression: rank vs. height

I Example: Apply the following operations on the forest below.I Union(a,g)I Find(f)I Find(j)

a

b

d

f

e

c

g

h

j

i

3

2 0

1 0

0

2

1 0

0

Page 45: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression: rank vs. height

I Union(a,g)

I Find(f)

I Find(j)

a

b

d

f

e

c g

h

j

i

3

2 0

1 0

0

2

1 0

0

Page 46: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression: rank vs. height

I Union(a,g)

I Find(f)

I Find(j)

a

b

e

d f c g

h

j

i

3

2 01

0

0 2

1 0

0

Page 47: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression: rank vs. height

I Union(a,g)

I Find(f)

I Find(j)

a

b

e

d f c g

i

h j

3

2 01

0

0 2 1

0

0

Page 48: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Path compression: rank vs. height

I Union(a,g)

I Find(f)

I Find(j)

a

b

e

d f c g

i

h j

3

2 01

0

0 2 1

0

0

height(a)=26=rank(a)

Page 49: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Outline

Brief Introduction for Union-Find Data Structure

Improvement

Time Complexity Analysis

Page 50: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Time Complexity Analysis

I Without path compression : O(log n) per find instruction.

I With path compression : O(log log n) per find instruction.

Page 51: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Time Complexity Analysis

Lemma.

I There are at most n2k nodes with rank k .

I rank(parent(x)) > rank(x).

I rank(root) ≤ log2 n.

Page 52: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Time Complexity Analysis

Definition.

I f (t) = 2× t

a

b rank=2

rank=6

rank(a) > f (rank(b))

happy node, long edge

a

b rank=2

rank=4

rank(a) ≤ f (rank(b))

sad node, short edge

Page 53: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Time Complexity Analysis

Observation. In one find operation, at most log log n long edgesare traversed.

Proof.

r b · · · c d

rank ≤ log n rank ≥ 1

Observation. x is sad for at most rank(x) find operations.Proof.

I rank(parent(x)) > rank(x)

I rank(parent(x)) increases per find operation.

I After rank(x) find ops,rank(parent(x)) > rank(x) + rank(x) = f (rank(x))

Page 54: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Time Complexity Analysis

time of all find ops

∑find ops #long edges

log log n ×m

∑find ops #short edges

∑e(#find operations when e is short)

∑x rank(x)

2n

Page 55: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Time Complexity Analysis

cost of all find operations =∑

find ops

(#long edges + #short edges)

∑find ops

#long edges ≤ log log n ×#find ops

∑find ops

#short edges =∑

e

(#find operations when e is short)

≤∑

x

rank(x)

≤log n∑k=0

(k ×#rank − k − nodes)

≤∞∑

k=0

k × n

2k

≤ 2n

Page 56: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Time Complexity Analysis

Conclusion. With path compression, time for one findop(#find operation ≥ n)is:

O(log log n + 2) = O(log log n)

Improvement.

I f (t) = 2t2 → log ∗n

I Ackerman Function→ α(n)

Page 57: Are p and q connected? - SJTUbasics.sjtu.edu.cn/~dominik/teaching/2016-cs214/... · nd ops #long edges log log n # nd ops X nd ops #short edges = X e (# nd operations when e is short)

Thanks!