data structure & algorithm
DESCRIPTION
Data Structure & Algorithm. 11 – Minimal Spanning Tree JJCAO. Steal some from Prof. Yoram Moses & Princeton COS 226 . Weighted Graphs. G =(V,E), wt wt : E → R wt (G) = . Sub-Graphs. Note: G' is not a spanning sub-graph of G. Minimum Spanning Tree. A Subgraph A tree Spans G - PowerPoint PPT PresentationTRANSCRIPT
![Page 1: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/1.jpg)
Data Structure & Algorithm
11 – Minimal Spanning Tree
JJCAO
Steal some from Prof. Yoram Moses & Princeton COS 226
![Page 2: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/2.jpg)
2
Weighted GraphsG =(V,E),wtwt: E → R
wt(G) =
![Page 3: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/3.jpg)
3
Sub-Graphs
Note: G' is not a spanning sub-graph of G
![Page 4: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/4.jpg)
4
Minimum Spanning Tree• A Subgraph• A tree• Spans G• Of minimal weight
![Page 5: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/5.jpg)
5
MST OriginOtakar Boruvka (1926).• Electrical Power Company of Western Moravia in
Brno.• Most economical construction of electrical power
network.• Concrete engineering problem is now a
cornerstone problem in combinatorial optimization.
![Page 6: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/6.jpg)
6
MST describes arrangement of nuclei in the epithelium for cancer research
http://www.bccrc.ca/ci/ta01_archlevel.html
![Page 7: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/7.jpg)
7
Normal Consistency
[Hoppe et al. 1992] • Based on angles between unsigned normals• May produce errors on close-by surface sheets
![Page 8: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/8.jpg)
8
MST is fundamental problem with diverse applications
• Network design.– telephone, electrical, hydraulic, TV cable, computer, road
• Approximation algorithms for NP-hard problems.– traveling salesperson problem, Steiner tree
• Indirect applications.– max bottleneck paths– LDPC codes for error correction– image registration with Renyi entropy– learning salient features for real-time face verification– reducing data storage in sequencing amino acids in a protein– model locality of particle interactions in turbulent fluid flows– autoconfig protocol for Ethernet bridging to avoid cycles in a network
• Cluster analysis.
![Page 9: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/9.jpg)
9
Minimum Spanning Tree on Surface of Sphere 5000 Vertices
![Page 10: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/10.jpg)
10
Minimum Spanning TreeInput: a connected, undirected graph - G, with a weight function on the edges – wt
Goal: find a Minimum-weight Spanning Tree for G
Fact:If all edge weights are distinct, the MST is unique
Brute force: Try all possible spanning trees• problem 1: not so easy to implement• problem 2: far too many of them
Ex: [Cayley, 1889]: V^{V-2} spanning trees on the complete graph on V vertices.
![Page 11: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/11.jpg)
11
Main algorithms of MST1. Kruskal’s algorithm 2. Prim’s algorithm
Both O(ElgV) using ordinary binary heapsBoth greedy algorithms => Global solution
3. …
![Page 12: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/12.jpg)
12
Two Greedy Algorithms• Kruskal's algorithm. Consider edges in ascending
order of cost. Add the next edge to T unless doing so would create a cycle.
• Prim's algorithm. Start with any vertex s and greedily grow a tree T from s. At each step, add the cheapest edge to T that has exactly one endpoint in T.
Greed is good. Greed is right. Greed works. Greedclarifies, cuts through, and captures the essence of theevolutionary spirit."
- Gordon Gecko
![Page 13: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/13.jpg)
13
Cycle Property• Let T be a minimum spanning tree
of a weighted graph G• Let e be an edge of G that is not in
T and C be the cycle formed by e with T
• For every edge f of C, weight(f) ≤ weight(e)
Proof:• By contradiction• If weight(f) > weight(e) we can
get a spanning tree of smaller weight by replacing e with f
![Page 14: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/14.jpg)
14
Edges cross the cut
![Page 15: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/15.jpg)
15
Cut (/Partition) PropertyLemma:Let G =(V,E) and X ⊂ V.If e = a lightest edge connecting X and V-Xthen e appears in some MST of G.
Proof:• Let T be an MST of G• If T does not contain e, consider the cycle C
formed by e with T and let f be an edge of C across the partition
• By the cycle property, weight(f) ≤ weight(e)
• Thus, weight(f) = weight(e)• We obtain another MST by replacing f with e
locally optimal choice(of lightest edges)
globally optimal solution(MST)
![Page 16: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/16.jpg)
16
Disjoint Set ADT
![Page 17: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/17.jpg)
17
An application of disjoint-set data structures
![Page 18: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/18.jpg)
18
Linked List Implementation
![Page 19: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/19.jpg)
19
Unionin Linked List Implementation
![Page 20: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/20.jpg)
20
![Page 21: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/21.jpg)
21
Worst-Case Example• n: the number of MAKE-SET operations,• m: the total number of MAKE-SET, UNION, and FIND-
SET operations• we can easily construct a sequence of m operations
on n objects that requires (n^2) time
![Page 22: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/22.jpg)
22
Weighted Union Heuristic• Each set id includes the length of the list• In Union - append shorter list at end of
longer
Theorem: Performing m > n operations takes O(m + nlgn) time
![Page 23: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/23.jpg)
23
Simple Forest Implementation
Find-Set(x) -follow pointersfrom x up to root
Union(c,f) - make c a child of f and return f
∪
![Page 24: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/24.jpg)
24
Worst-Case Examplen
…
3
2
1
![Page 25: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/25.jpg)
25
Weighted Union Heuristic• Each node includes a weight fieldweight = # elements in sub-tree rooted at node
• Find-Set(x) - as before O(depth(x))
• Union(x,y) - always attach smaller tree below the root of larger tree O(1)
![Page 26: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/26.jpg)
26
Weighted UnionTheorem:Any k-node tree created using theweighted-union heuristic, has height ≤ lg(k)
Proof: By induction on k
Find-Set Running Time: O(lg n)
![Page 27: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/27.jpg)
27
2nd heuristic: Path Compression
![Page 28: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/28.jpg)
28
The function lg nlg n = the number of times we have to take the log2 n repeatedly to reach root node
Lg 2 = 1Lg 2^2 = 2Lg 2^16 = lg 65536 = 16
=> Lg n < 16 for all practical values of n
![Page 29: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/29.jpg)
29
Theorem(Tarjan): IfS = a sequence of O(n) Unions and Find-SetsThe worst-case time for S with
– Weighted Unions, and– Path Compressions
is O(nlgn)
The average time is O(lgn) per operation
in Linked List Implementation
![Page 30: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/30.jpg)
30
Theorem(Tarjan): LetS = a sequence of O(n) Unions and Find-SetsThe worst-case time for S with
– Weighted Unions, and– Path Compressions
is O(nα(n))
The average time is O(α(n)) per operation, α(n) < 5 in practice
![Page 31: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/31.jpg)
31
Connected Components usingUnion-Find
Reminder:
• Every node v is connected to itself• if u and v are in the same connected
component then v is connected to u and u is connected to v
• Connected components form a partition of the nodes and so are disjoint:
![Page 32: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/32.jpg)
32
MST-Kruskal
Kruskal's algorithm for minimum spanning tree works by inserting edges in order of increasing cost, adding as edges to the tree those which connect two previously disjoint components.
Kruskal's algorithm on a graph of distances between 128 North American cities
The minimum spanning tree describes the cheapest network to connect all of a given set of vertices
![Page 33: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/33.jpg)
33
Example
![Page 34: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/34.jpg)
34
![Page 35: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/35.jpg)
35
MST-Kruskal
![Page 36: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/36.jpg)
36
MST-Kruskal
![Page 37: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/37.jpg)
37
MST-KruskalRunning Time:
![Page 38: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/38.jpg)
38
MST-Prim-Jarnik
![Page 39: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/39.jpg)
39
Example
![Page 40: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/40.jpg)
40
MST-Prim-Jarnik
![Page 41: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/41.jpg)
41
MST-Prim
![Page 42: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/42.jpg)
42
MST-Prim
![Page 43: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/43.jpg)
43
MST-Prim
![Page 44: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/44.jpg)
44
Decrease_key(v,x)We use a min-Heap to hold the edges in G-THow can we implement Decrease key(v,x)?
Simple solution:• Change value for v• Follow strategy for Heap_insert from v
upwards
• Cost: O(lgV)
![Page 45: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/45.jpg)
45
MST-PrimRunning Time:
![Page 46: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/46.jpg)
46
Does a linear-time MST algorithm exist?
![Page 47: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/47.jpg)
47
Euclidean MSTGiven N points in the plane, find MST connecting them, where the distances between point pairs are their Euclidean distances.
Brute force. Compute ~ /2 distances and run Prim's algorithm.Ingenuity. Exploit geometry and do it in ~ c N lg N.
![Page 48: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/48.jpg)
48
Scientific application: clusteringk-clustering. Divide a set of objects classify into k coherent groups.Distance function. Numeric value specifying "closeness" of two objects.
Goal. Divide into clusters so that objects in different clusters are far apart.
Applications.• Routing in mobile ad hoc networks.• Document categorization for web search.• Similarity searching in medical image databases.• Skycat: cluster 109 sky objects into stars, quasars, galaxies.
outbreak of cholera deaths in London in 1850s (Nina Mishra)
![Page 49: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/49.jpg)
49
Single-link clustering
k-clustering. Divide a set of objects classify into k coherent groups.Distance function. Numeric value specifying "closeness" of two objects.
Goal. Divide into clusters so that objects in different clusters are far apart.
Single link. Distance between two clusters equals the distance between the two closest objects (one in each cluster).
Single-link clustering. Given an integer k, find a k-clustering that maximizes the distance between two closest clusters.
![Page 50: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/50.jpg)
50
Single-link clustering algorithm
“Well-known” algorithm for single-link clustering:• Form V clusters of one object each.• Find the closest pair of objects such that each object is
in a different cluster, and merge the two clusters.• Repeat until there are exactly k clusters.
Observation. This is Kruskal's algorithm (stop when k connected components).
Alternate solution. Run Prim's algorithm and delete k-1 max weight edges.
![Page 51: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/51.jpg)
51
DendrogramTree diagram that illustrates arrangement of clusters.
![Page 52: Data Structure & Algorithm](https://reader036.vdocument.in/reader036/viewer/2022062323/56816707550346895ddb6e6c/html5/thumbnails/52.jpg)
52
Dendrogram of cancers in humanTumors in similar tissues cluster together