graph algorithms basics graph: g = {v, e} v ={v 1, v 2, …, v n }, e = {e 1, e 2, … e m }, e ij...

GRAPH ALGORITHMSBASICS

• Graph: G = {V, E}

• V ={v1, v2, …, vn}, E = {e1, e2, … em}, eij=(vi, vj, lij)

• A set of n nodes/vertices, and m edges/arcs as pairs of nodes: Problem sizes: n, m

• Where lij, if present, is a label or the weight of an edge.

04/19/23 (C) Debasis Mitra

v1

v2v3

v4v5

V ={v1, v2, v3, v4, v5}E = {e1=(v1, v2), (v3, v1), (v3, v4), (v5, v2), e5=(v4, v5)}

degree(v1)=2, degree(v2)=2, … :number of arcs

For directed graph: indegree, and outdegree



v1

v2v3

v4v5

Weighted graph G:V ={v1, v2, v3, v4, v5}E = {(v1, v2, 10), (v3, v1, 20), (v3, v4 , 30), (v5, v2 , 25), (v4, v5 , 50)}

2010

25

50

30

Adjacency List Representation:v1: (v2,10), (v3, 20)v2: (v1,10), (v5, 25)v3: (v1,20), (v4, 30)v4: (v3,30), (v5, 50)v5: (v2,25), (v4, 50)

Matrix representation: v1 v2 v3 v4 v5

v10 10 20 Inf inf

v210 0 Inf Inf 25

v320 inf 0 30 inf

v4Inf inf 30 0 50

v5inf 30 Inf 50 0

Matrix representationfor unweighted Graph?

Matrix representationfor directed Graph?

• Directed graph: edges are ordered pairs of nodes.

• Weighted graph: each edge (directed/undirected) has a weight.

• Path between a pair of nodes vi, vk: sequence of edges with vi, vk at the two ends.

• Simple path: covers no node in it twice.

• Loop: a path with the same start and end node.

• Path length: number of edges in it.

• Path weight: total wt of all edges in it.

• Connected graph: there exists a path between every pair of nodes, no node is disconnected.

• Complete graph: edge between every pair of nodes [NUMBER OF EDGES?].

• Acyclic graph: a graph with no cycles.

• Etc.

• Graphs are one of the most used models of real-life problems for computer-solutions.



• Algorithm 1:

• For each node v in V do

• -Steps- // (|V|)

•

• Algorithm 2:


• For each edge in E do

• -Steps- // (|V|*|E|)

•

• Algorithm 3:


• For each edge e adjacent to v do

• -Steps- // ( |E| ) with adjacency list,

• // but (|V|* |V| ) for matrix representation

•

• Algorithm 4:


• -steps-

• For each edge e of v do

• -Steps- // ( |V|+|E| ) or ( max{|V|, |E| })



v2v3

v4v5

2010

25

50

30

v1

TOPOLOGICAL SORTProblem 1

• Input: directed acyclic graph• Output: sequentially order the nodes without violating any arc

ordering.

• Note: you may have to check for cycles - depending on the problem definition (input: directed graph).

• An important data: indegree of each node - number of arcs coming in (#courses pre-requisite to "this" course).



• Input: directed acyclic graph• Output: sequentially order the nodes without violating any arc

ordering.

• A first-pass Algorithm: – Assign indegree values to each node

– In each iteration: Eliminate one of the vertices with indegree=0 and its associated (outgoing) arcs (thus, reducing indegrees of the adjacent nodes) - after assigning an ordering-number (as output of topological-sort ordering) to these nodes,

– keep doing the last step for each of the n number-of-nodes.

– If at any stage before finishing the loop, there does not exist any node with indegree=0, then a cycle exists! [WHY?]

• A naïve way of finding a vertex with indegree 0 is to scan the nodes that are still left in the graph.

• As the loop runs for n times this leads to (n2) algorithm.



• Input: A directed graph

• Output: A sorting on the node without violating directions, or Failure

• Algorithm naïve-topological-sort

• 1 For each node v ϵ V calculate indegree(v); // (N)

•

• 2 counter = 0;

• 3 While V≠ empty do // (n)

• 4 find a node v ϵ V such that indegree(v)= =0;

• // have to scan all nodes here: (n)

• // total: (n) x (while loop’s n) = O(n2)

• 5 if there is no such v then return(Failure)

• else

• 6 ordering-id(v) = ++counter;

• 7 V = V – v;

• 8 For each edge (v, w) ϵ E do //nodes adjacent to v

• //adjacent nodes: (|E|), total for all nodes O(n or |E|)

• 9 decrement indegree(w);

• 10 E = E – (v, w);

• end if;

• end while;

• End algorithm.

• Complexity: dominating term O(n2)




v1

v2v3

v4v5

Indegree(v1)=0Indegree(v2)=1Indegree(v3)=1Indegree(v4)=2Indegree(v5)=1

Ordering-id(v1) =1Eliminate node v1, and edges (v1,v2), (v1,v3) from GUpdate indegrees of nodes v2 and v3



v2v3

v4v5

Indegree(v2)=0Indegree(v3)=0Indegree(v4)=2Indegree(v5)=1

Ordering-id(v1) =1, Ordering-id(v2)=2Eliminate node v2, and edges (v2,v5) from GUpdate indegrees of nodes v5



v3

v4v5

Indegree(v3)=0Indegree(v4)=2Indegree(v5)=0

Ordering-id(v1) =1, Ordering-id(v2)=2, O-id(v3)=3Eliminate node v3, and edges (v3,v4) from GUpdate indegrees of nodes v4



v4v5

Indegree(v4)=1Indegree(v5)=0

Ordering-id(v1) =1, Ordering-id(v2)=2, O-id(v3)=3, (v5)=4Eliminate node v5, and edges (v5,v4) from GUpdate indegrees of nodes v4



v4

Indegree(v4)=0

Ordering-id(v1) =1, Ordering-id(v2)=2, O-id(v3)=3, (v5)=4, (v4)=5Eliminate node v4 from G, and V is empty nowResulting sort: v1, v2, v3, v5, v4



v1

v2v3

v4v5

However, now,Indegree(v1)=0Indegree(v2)=2Indegree(v3)=1Indegree(v4)=2Indegree(v5)=1

Ordering-id(v1) =1Eliminate node v1, and edges (v1,v2), (v1,v3) from GUpdate indegrees of nodes v2 and v3



v2v3

v4v5

Indegree(v2)=1Indegree(v3)=0Indegree(v4)=2Indegree(v5)=1

Ordering-id(v1) =1, Ordering-id(v3)=2Eliminate node v3, and edges (v3,v4) from GUpdate indegrees of nodes v4



v4v5

Indegree(v2)=1Indegree(v4)=1Indegree(v5)=1

No node to choose before all nodes are ordered: Fail

v2



• A smarter way: notice indegrees become zero for only a subset of nodes in a pass, but all nodes are being checked in the above naïve algorithm.

• Store those nodes (who have become “roots” of the truncated graph) in a box and pass the box to the next iteration.

• An implementation of this idea could be done by using a queue: • Push those nodes whose indegrees have become 0’s (when you update those values), at the back of a queue;• Algorithm terminates on empty queue,• but having empty Q before covering all nodes:

• we have got a cycle!



Algorithm Q-based-topological-sort1 For each vertex v in G do // initialization2 calculate indegree(v); // (|E|) with adj. list3 if indegree(v) = =0 then push(Q, v);

end for; 4 counter = 0;5 While Q is not empty do

// indegree becomes 0 once & only once for a node// so, each node goes to Q only once: (N)

6 v = pop(Q);7 ordering-id(v) = ++counter;8 V = V – v; 9 For each edge (v, w) ϵ E do //nodes adjacent to v

// two loops together (max{N, |E|})// each edge scanned once and only once

10 E = E – (v, w);11 decrement indegree(w);12 if now indegree(w) = =0 then push(Q, w);

end for;end while;

13 if counter != N then return(Failure);// not all nodes are numbered yet, but Q is empty

// cycle detected;

End algorithm.

Complexity: the body of the inner for-loop is executed at most once per edge, even considering the outer while loop, if adjacency list is used.

The maximum queue length is n.

Complexity is (|E| + n).


SOURCE TO ALL NODES SHORTEST PATH-LENGTHProblem 2

Path length = number of edges on a path. Compute shortest path-length from a given source node to all nodes on the graph. Strategy: starting with the source as the "current-nodes," in each of the iteration expand children (adjacent nodes) of "current-nodes." Assign the iteration# as the shortest path length to each expanded node, if the value is not already assigned. Also, assign to each child, its parent node-id in this expansion tree (to retract the shortest path if necessary). Input: Undirected graph, and source nodeOutput: Shortest path-length to each node from source

This is a breadth-first traversal, nodes are expanded as increasing distances from the source: 0, then 1, then 2, etc.



Once again, a better idea is to have the children being pushed at the back of a queue. Input: Undirected graph, and source nodeOutput: Shortest path-length to each node from sourceAlgorithm q-based-shortest-path1 d(s) = 0; // source2 enqueue only s in Q; // (1), no loop, constant-time 3 while (Q is not empty) do

// each node goes to Q only once: (N)4 v = dequeue Q;5 for each vertex w adjacent to v do // (|E|)6 if d(w) is yet unassigned then7 d(w) = d(v) + 1;8 last_on_pathpath((w)w) = v;9 enqueue w in Q;

end if; end for;end while;

End algorithm.

Complexity: (|E| + n), by a similar analysis as that of the previous queue-based algorithm.


v2v3

v4v5

Source: v1


: v1, length=0

v2, length=1 v3, length=1

Queue: v1

Queue: v2,v3 , v5 v5, length=1

v5, already assigned v4, length=2v1, no v1, nov1, nov1, no

Q: v3 ,v5

Q: empty

Q: v4

v3, no


v2v3

v4v5

2010

25

50

30

v1

SOURCE TO ALL NODES SHORTEST PATH-WEIGHT

Input: Positive-weighted graph, a “source” node s Output: Shortest distance to all nodes from the source s

70

Breadth-first search does not work any more!Say, source is v1Shortest-path-length(v5) = 35, not 70

However, the concept works: expanding distance front from s


SINGLE SOURCE SHORTEST PATH

Input: Weighted graph Dsw , a “source” node s [can not handle negative cycle]Output: Shortest distance to all nodes from the source sAlgorithm Dijkstra1 s.distance = 0; // shortest distance from s2 mark s as “finished”; //shortest distance to s from s found3 For each node w do 4 if (s, w) ϵ E then w.distance= Dsw else w.distance = inf;

// O(n)

5 While there exists any node not marked as finished do // (n), each node is picked once & only once

6 v = node from unfinished list with the smallest v.distance; // loop on unfinished nodes O(n): total O(n2)7 mark v as “finished”; // WHY? 8 For each edge (v, w) ϵ E do //adjacent to v: (|E|)9 if (w is not “finished” & (v.distance + Dvw < w.distance) ) then 10 w.distance = v.distance + Dvw;11 w.previous = v;

// v is parent of w on the shortest path end if;

end for;end while;

End algorithm


v2 ( )

2010

25

50

30

v1 (0)

70

DJIKSTRA’S ALG: SINGLE SOURCE SHORTEST PATH

v2 (10)

v4 ( )v5 (70)

2010

25

50

30

v1 (0)

70v3 (20)

v4 ( )v5 (35)

2010

25

50

30

70

v1 (0)

v2 (10)v3 ( )

v4 ( )v5 ( )

v3 (20)

v3 (20)

v4 (50) v5 (35)

2010

25

50

30

70

v1 (0)

v2 (10)v3 (30)

v4 (50) v5 (35)

2010

25

50

30

70

v1 (0)

v2 (10)


DJIKSTRA’S ALG: COMPLEXITY

• Lines 2-4, initialization: O(n);• Line 5: while loop runs n-1 times: [(n)], • Line 6: find-minimum-distance-node v runs another (n) within it,

thus the complexity is (N2); • Can you reduce this complexity by a Queue?

• Line 8: for-loop within while, for adjacency list graph data structure, runs for |E| times including the outside while loop.

• Grand total: (|E| + n2) = (n2), as |E| is always n2.


DJIKSTRA’S ALG: COMPLEXITY

• Djikstra complexity: (|E| + n2) = (n2), as |E| is always n2. • Using a heap data-structure find-minimum-distance-node (line 6) may be log_n,inside while loop (line 5): total O(n log_n)

• but as the distance values get changed the heap needs to be kept reorganized, thus, increasing the for-loop's complexity to (|E|log N).

• Grand total: (|E|log_n + n log_n) = (|E|log_n), as |E| is always n in a connected graph.


DJIKSTRA’S ALG: PROOF SKETCH

Induction base: Shortest distance for the source s itself must have been found

Induction Hypothesis: Suppose, it works up to the k-th iteration. This means, for k nodes the corresponding shortest paths from s have been already identified (say, set F, and V-F is U, the set of unfinished nodes).

Induction Step: Let, node p has shortest distance in U, in the k+1 –th iteration.

(1) Each node in F has its chance to update p. (2) Each other node in U has a distance >= that of p, so, cannot improve p any more (assuming edge weights are positive or non-negative)(3) Hence, p is ‘finished” and so, moved from U to F

(4) Each node in F, except p, has its chance to update nodes in U. (5) Each other node in U has a distance >= that of p, so, cannot improve distances of nodes in U optimally (assuming edge weights are positive or non-negative)(6) Hence, p is the only candidate to improve distances of nodes in U and so, updates only nodes in U in the k+1 -th iteration


DJIKSTRA’S ALG:WITH NEGATIVE EDGE

• Cost may reduce by adding a new arc

• Hence, the ‘finished’ marker does not work

• The proof of above Djikstra depends on positive edge weights

• Assumption: a path’s cost is non-decreasing as new arc is added to it

• This assumption is not true for negative-weighted edges



Algorithm Dijkstra-negativeInput: Weighted (undirected) graph G, and a node s in VOutput: For each node v in V, shortest distance w.distance from s1 Initialize a queue Q with the s; 2 While Q not empty do3 v = pop(Q);4 For each w such that (v, w) ϵ E do5 If (v.distance + Dvw < w.distance) then6 w.distance = v.distance +Dvw;7 w.parent = v;8 If w is not already in Q then push(Q, w); // if it is already there in Q do not duplicate

end if end forend while

End Algorithm.



•Negative cost on an edge or a path is all right, but negative cycle is a problem

•A path may repeat itself on a loop, thus reducing negative cost on it

•That will go on infinitely

•No limit to minimizing the cost between a source to any node on a negative cycle

•Shortest-path is not defined when there is a negative cycle.



• Cost reduces on looping around negative cycles

• Q-Djikstra does not prevent that: it may not terminate

• But how many looping should we allow?

• Can we predict that we are on an infinite loop here?

• How many times a node may appear in the queue?

• What is the “diameter” of a graph: longest path length?


DJIKSTRA’S ALG:WITH NEGATIVE CYCLE

• “Diameter” of a graph = n

• No node should get a chance to update others more than n+1 times!

• Because, there cannot be a simple (loop free) path of length more than n

• Line 8 of Q-Djikstra is guaranteed to happen once for any node, or, any node will be pushed to Q at least once

• But not more than n times, unless it is on a negative loop

• So, Q-Dijkstra should be terminated if a node v gets n+1 times in to the queue

• If that happens: the graph has a negative-cost cycle with the node v on it• Otherwise, Q-Djikstra may go into an infinite loop (over the cycle with negative path-weight)

Complexity of Q-Djikstra: O(n*|E|), because each node may be updated by each edge at most one time (if there is no negative cycle).


Application in Project managementDjikstra-like algorithm on Directed Acyclic Graph

An Activity Graph with duration for each activity

CRITICAL PATH FINDING ALGORITHM



Corresponding Event Graph: Activities are edges



Event Graph:

Find Earliest Completion Time (EC(v)) for each node v

EC(v1) = 0;N = Topological sort of nodes on DAC G;For each node w (other than v1) in the sorted list N for each edge (v,w) in E EC(w) = max (EC(v) + d(v,w); // d is the distances on arcs on G



Find Latest Completion Time (LC(v)) for each node v

LC(vn) = EC(vn); // last node becomes source nowFor each node v (other than vn) in reverse order of the sorted list N for each edge (v,w) in E LC(v) = min(LC(w) - d(v,w); // d is the distances on arcs on G



Find Slack Time for each edge (v,w)

For each edge (v,w) in E Slack(v,w) = LC(w) – EC(v) - d(v,w); // the task on (v,w) has this much slack time

Critical path: A path from v1 to vn with slack on each edge as 0Any delay on critical path will delay the whole project!

EC: above each node

LC: below each node

Edges: Task id / duration / slack


Djikstra’s Algorithm is a Greedy Algorithm: pick up the best node from U in each cycle

Find all pairs shortest distance (Problem 4):For Each node v as source

Run Djikstra(v, G)Complexity: O(n.n2), or O(n.n.|E|) for negative edges in G

A dynamic programming algorithm (Floyd’s alg) does better

Problem4: All Pairs Shortest Path

• A variation of Djikstra’s algorithm. Called Floyd-Warshal’s algorithm. Good for dense graph.

Algorithm Floyd

1. Copy the distance matrix in d[1..n][1..n];

2. for k=1 through n do //consider each vertex as updating candidate

1. for i=1 through n do

1. for j=1 through n do

2. if (d[i][k] + d[k][j] < d[i][j]) then

3. d[i][j] = d[i][k] + d[k][j];

4. path[i][j] = k; // last updated via k

End algorithm.• O(n3), for 3 loops.

04/19/23 (C) Debasis Mitra Source: Cormen et al.

D1(4,5) = min{D0(4,5) , D0(4,1) + D0(1,5)}= min{inf, 2+3}

Problem 5: Max-flow / Min-cut (Optimization)

• Input: Weighted directed graph (weight = maximum permitted flow on an edge),

a source node s without any incoming arc,

a sink node n without any outgoing arc.

• Objective function: schedule a flow graph s.t. no node holds any amount (influx=outflux), except for s and n, and no arc violets its allowed amount

• Output: maximize the objective function (maximize the amount flowing from s to n)

G =>

<= output max-Gflow

Max-flow / Min-cut Greedy Strategy

• Input: Weighted directed graph (weight = maximum permitted flow on an edge), a source node s without any incoming arc, a sink node n without any outgoing arc.

• Output: maximize the objective function (maximize the amount flowing from s to n)

• Greedy Strategy:– Initialize a flow graph Gf (each edge 0) Gr, and a residual graph as G

– In each iteration, find a “max path” p from s to n in Gr

– Max path is a path allowing maximum flow from s->n, subject to the constraints

– Add p to Gf, and subtract p from Gr, any edge having 0 is removed from Gr

– Continue iterations until there exists no such path in Gr

– Return Gf


Input G Gflow Gresidual

Greedy Alg, initialization

Source: Weiss


Greedy Alg, iteration 1

Terminates

Gflow

G => <= optimum output

Gresidual

Max-flow / Min-cut Greedy Strategy

• Complexity:– How many paths may exist in a graph? |E| !

– How many times any edge may be updated until it gets to 0? f.|E| for max-flow f

– Each iteration, looks for max-path: |E| log |V|

– Total: O(f. |E|2 log |V|)


Returned Result, Flow = 3

Optimum Result, Flow = 5

Capacities Unutilized

1

1

=>


Correct Alg, with oracle, not-greedy, iteration 1Start with not-max-path s->b->d->t (2 each edge)

Iteration 2


Iteration 2

Iteration 3

Terminates with optimum flow=5

Max-flow Backtracking StrategyFord-Fulkerson’s Algorithm

• Greedy Strategy is sub-optimal

• A Backtracking optimal algorithm: – Initialize a flow graph Gf (each edge 0) Gr, and a residual graph as G

– In each iteration, find a path p from s to n in Gr

– Add p to Gf, and Gr is updated with reverse flow opposite to p, and leaving residual amounts in forward direction => allows backtracking if necessary

– Continue iterations until either all edges from s are incoming, or all edges to n are outgoing

– Return Gf

• Guaranteed to terminate for rational numbers input• Complexity: If f is the max flow in returned Gf,

– Then, number of times any edge may get updated is f , even though each arc never gets to 0

– O(f. |E|)

• Random choice for a path p may run into bad input


Ford-Fulkerson, using backflow residual, Iteration 1

Iteration 2

Terminates, no outflow left from source, or inflow to sink


Problem input for Ford-Fulkerson:May keep backtracking if starts with random path, say, s->a->b->t

Complexity: If f is the max flow in returned Gf, O(f. |E|), or O(100000*|E|)

Why bother?Use greedy!

Max-flow Backtracking with DjikstraEdmonds-Karp Algorithm

• Another Backtracking optimal algorithm (may avoid problem of Ford-Fulkerson):

– Initialize a flow graph Gf (each edge 0) Gr, and a residual graph as G

– In each iteration, find a “max path” p from s to n in Gr using Djikstra

– Add p to Gf, and Gr is updated with reverse flow opposite to p, and leaving residual amounts in forward direction => allows backtracking if necessary

– Continue iterations until either all edges from s are incoming, or all edges to n are outgoing

– Return Gf

• Complexity: O(cmax|E|2 log |V|), first term is max capacity of any edge [actually logCmax in complexity!]

Min-cut = Minimum total capacity from one partition (of G) S ϵ source to another T ϵ sinkWhich edges are to cut from G to isolate source from sink?Min-cut = Max-flow


Problem 6: Minimum Spanning Tree (Optimization)Prim’s Algorithm

• Input: Weighted un-directed graph • Objective function: Total weight of a spanning tree over G• Output: minimize the objective function (minimum-wt spanning tree -

MST)

• Prim’s Algorithm: Djikstra’s strategy• Separate two sets of nodes F (in MST now) and U (not in MST yet), starting

with any arbitrary node s ϵ F• In each iteration find a ‘closest’ node v ϵ U from any node pϵF; move v

from U to F; add Epv to MST: {For all xϵU, For all yϵF, |Evy| ≤ |Exy| }

• Continue until U is empty• Return MST • How many nodes are on a spanning tree?

• How many arcs?

• How many spanning trees exist on a graph?

• Complexity of Prim’s Alg:?


• How many nodes are on a spanning tree?

• How many arcs?

• How many spanning trees exist on a graph?

• Complexity of Prim’s Alg:?

Prim’s Algorithm for MST:Complexity

Exactly same as Djikstra: O(|E| log n)


Prim’s Algorithm for MST

a

b

d

c

e

1

23

10

97

8

Prim’s Alg starting with a in F

b

d

c

e

1

23

10

97

8

a

b

d

c

e

1

23

10

97

8

a

Closest to a is b with 1 on Eab

b

d

c

e

1

23

10

97

8

a

Closest to a or b is e with Ebe=1

b

d

c

e

1

23

10

97

8

a

Closest to a, b or e is c with Eac=3

b

d

c

e

1

23

10

97

8

a

Closest to a, b, e or c is d with Ebd=8

Done! All nodes in F. Eab+ Ebc + Ebd+ Eac= 14, is MST weight


Proof sketch:Prims’s Algorithm

Induction base: ???

Induction hypothesis: on k-th iteration (1<=k<n), say, MST is correct for the subgraph of G with k nodes in F

Induction Step: Epv is a shortest arc between all arcs from F to U After adding Epv to MST, it is still correct for the new subgraph with k+1 nodes


Problem 6: MSTKruskal’s Algorithm

• Input: Weighted un-directed graph

• Objective function: Total weight of a spanning tree over G

• Output: minimize the objective function (minimum-wt spanning tree - MST)

• Kruskal’s Algorithm: Greedy strategy

• 1)Sort all arcs ϵE in low to high weights; 2)Initialize an empty MST

• In each iteration 3)find the shortest arc eϵE; 4)add it to MST unless it causes a cycle (MST may be a forest until iterations end); 5)E = E\e

• Continue iterations until E is empty // will run for |E| times

• Complexity of Kruskal’s Alg?

• log|E| sorting

• (Cycle checking in MST needs special datastructure) x |E| iterations

• O(|E| log |E|), which is O(|E| log n)


Kruskal’s Algorithm for MST

a

b

d

c

e

1

23

10

97

8

Kruskal’s Alg must start with Ebc in MST

b

d

c

e

1

23

10

97

8

a

b

d

c

e

1

23

10

97

8

a

b

d

c

e

1

23

10

97

8

a

Add Eac

b

d

c

e

1

23

10

97

8

a

Eae creates cycle, fails to be added

b

d

c

e

1

23

10

97

8

a

Add Ebd

Done! All n-1 arcs in MST. Eab+ Ebc + Ebd+ Eac= 14, is MST weight


Kruskal’s Algorithm for MST

a

b

d

c

e

1

32

10

97

8

Kruskal’s Alg must start with Ebc in MST

b

d

c

e

1

32

10

97

8

a

b

d

c

e

1

2

10

97

8

a

b

d

c

e

1

32

10

97

8

a

Add Eac

b

d

c

e

1

22

10

97

8

a

Eae creates cycle, fails to be added

b

d

c

e

1

33

10

97

8

a

Add Ebd

Done! All n-1 arcs in MST. Eab+ Ebc + Ebd+ Eac= 14, is MST weight

3


Proof sketch:Kruskal’s Algorithm

Induction does not work: growing forest is not necessarily MST until finished

Contradiction: Suppose, returned T is not MST, but a different ST P is: {||T|| > ||P||}, double bar indicates weight

Transform T to P, by adding arc Exy to T -> form a cycle -> remove Elm from the cycle already in T to create a new ST -> chain through this process -> P

|| Elm || must be <= || Exy ||, as Elm was picked up for T before Exy,

from pre-sorted list

Elm is in T but not Exy, Exy is in P but not Elm So, ||T|| <= ||P||. Contradiction!

Depth First Traversal of Graphs


• Creates a Spanning Forest on a finite graph

• For connected graph, it will be a spanning tree

• For dynamically developing graph (e.g. search tree), where finiteness is not guranteed, DFS may not terminate

• Space requirement is limited by the depth of the treeWhat is space complexity on BFS traversal?

• Time: depends on branching factor b – have to come back to the same node b number of times

• Stack is the typical data-structure for DFS

• Many graph algorithms need DFS traversal

Recursive Depth First Traversal of Graphs


Input: graph (V, E)Output: a DFS spanning forestAlgorithm dfs(v)1. mark node v as visited;2. operate on v (e.g., print);3. for each node w adjacent to node v do4. if w is not marked as visited then5. dfs(w); // an iterative version of this //will maintain its own stack

end for;End algorithm. Starter algorithm

on a given graph G call dfs(any/given node in G);End starter.

Problem with above algorithm?

Recursive Depth First Traversal of Graphs


Problem with above algorithm:Will terminate prematurely if the graph is disconnected.

repeat until all nodes are markedpick up next unmarked node s;dfs(s);

end repeat;

This will generate the required DFS-spanning forest


Problem 7: Finding Articulation Points in a Graph

A B

C D

G E F

Input: Undirected, unweighted graph GArticulation Point: a node, removal of which from G will disconnect the graphOutput: Articulation point(s) of G, if there exists any


Finding Articulation Points in a GraphBy ordered DFS traversal

Strategy: •Draw a dfs-spanning tree T for the given graph starting from any of its nodes.•Number them according to the pre-order of their traversal (num(v)).

•Def. of “low”: For each vertex v, find the lowest vertex-num where one can go to, from v by the directed edges on T & possibly by a single left-out edge of G (back-edge on T):

•Do post-order DFS-traversal on T now, & assign low of each node v: •{low(v) =min{num(v), for all children w, low(w), num(w) for each node w with an arc from v in G}

•If for any vertex v there is a child w such that num(v) low(w), then v is an articulation point, unless v is the root in the T (this condition is always true for the root!).

•If v is the root of the T, and has two or more children in this T, then v is an articulation point.


B 1/1

A 2/1

C 3/1

D 4/1

G 7/7

E 5/4F 6/4

A B

C D

G E F

Finding Articulation Points in a GraphBy ordered DFS traversal

Complexity: O(|E| + n)


Problem 7: Finding Strongly Connected Component (SCC) in a Directed Graph

Source: Weiss, Fig 9.74


Input: Directed Graph GOutput: Subgraphs SCC of G, each subgraph in SCC is a strong connected component of G

Strategy:1. Traverse G using DFS to create spanning forest of the graph T,

numbering the nodes in the post-order traversal;2. Create Gr by reversing the directions of arcs in G;3. Traverse using DFS the nodes of Gr in a decreasing order of the numbers,

4. DFS spanning-forest from this second traversal creates the set of SCC’s for G.

SCC-detection Algorithm


SCC-detection Algorithm:First Pass DFS


SCC-detection Algorithm:Second Pass DFS on Gr

Start DFS, stuck at G:G is a SCC

Start DFS again: H-I-J is a SCC

Start DFS again: B-A-C-F is a SCC

Start DFS again, stuck at D: D is a SCC

Start DFS again, stuck at E: E is a SCC


Problem 8: Finding Euler Circuit or Euler Path/Tour in a Undirected Graph

Exists or not? O(E) time


Finding Euler Circuit by Repeated use of DFS

DFS starting from 5: 5-4-10-5



5-4-10-5Left out graph: Still even degrees, but may not be connectedRe-start from 4 now: 4-1-3-7-4 -11-10-7-9—3-4



Done: 5-4-1-3-7-4-11-10-7-9-3-4-10-5Start from, say, 3 now: 3-2-8-9-6-3

Done: 5-4-1-3-2-8-9-6-3-7-4-11-10-7-9-3-4-10-5Start with, say, 9 now: 9-10-12-9You know what to do next!

graph algorithms basics graph: g = {v, e} v ={v 1, v 2, …, v n }, e = {e 1, e 2, … e m }, e ij...

Documents