Introduction to Graphs and Breadth First Search
Graphs: what are they?
• Representations of pairwise relationships
• Collections of objects under some specified relationship
Graphs: what are they mathematically?
• A graph G is a pair (V,E)
  • V is a set of vertices (nodes)
  • E is a set of pairs (a,b), with a,b ∈ V
• V is the set of relatable objects
• E is the set of relationships
A Visual Example
[Figure: undirected graph on vertices 1, 2, 3, 4, 5]
G = ( {1,2,3,4,5}, {(1,2), (1,4), (2,3), (2,4), (1,5)} )
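As a minimal sketch, the pair (V,E) from this example can be written down directly with Python sets. The names V, E and the helper is_valid_graph are illustrative choices, not something the slides define.

```python
# Vertices and edges of the example graph G = (V, E).
V = {1, 2, 3, 4, 5}
E = {(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)}


def is_valid_graph(vertices, edges):
    """Return True if every edge (a, b) joins two members of `vertices`."""
    return all(a in vertices and b in vertices for a, b in edges)


print(is_valid_graph(V, E))             # every endpoint is in V
print(is_valid_graph(V, E | {(1, 6)}))  # 6 is not a vertex, so this fails
```

The check mirrors the definition: E only makes sense as a set of pairs drawn from V.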
Directed Graphs
• In a directed graph
  • (a,b) ∈ E does not imply (b,a) ∈ E
• Undirected graphs are a subset
  • (a,b) ∈ E if and only if (b,a) ∈ E
• Visually, directed graphs are drawn with arrows
Directed Graph Example
[Figure: directed graph on vertices 1, 2, 3, 4, 5, drawn with arrows]
G = ( {1,2,3,4,5}, { (1,5), (2,1), (2,3), (2,4), (3,2), (4,1) } )
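The "if and only if" condition that separates undirected from directed graphs can be tested mechanically. The sketch below assumes edges are stored as a set of ordered pairs; is_symmetric and the variable names are hypothetical.

```python
def is_symmetric(edges):
    """Return True if (b, a) is present whenever (a, b) is."""
    return all((b, a) in edges for a, b in edges)


# Edge set of the directed example above.
directed_E = {(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)}

# Adding the reverse of every edge gives the corresponding undirected graph.
undirected_E = directed_E | {(b, a) for a, b in directed_E}

print(is_symmetric(directed_E))    # (1,5) is present but (5,1) is not
print(is_symmetric(undirected_E))  # every edge now has its reverse
```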
Weighted Graphs
• Have weights associated with edges
• Can be directed or undirected
• In a directed graph, both (a,b) and (b,a) can be present, and the weight of (a,b) need not have any relationship to the weight of (b,a)
Weighted Graph Example
[Figure: directed graph on vertices 1, 2, 3, 4, 5, with each edge labeled by its weight]
G = ( {1,2,3,4,5}, { (1,5,-5), (2,1,3.2), (2,3,42), (3,2,π), (2,4,777), (4,1,666) } )
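One way to hold the weighted example in code is a dictionary from each directed edge to its weight. This is only a sketch; the name weights is an assumption, and other layouts (for example a list of triples, as in the set notation above) work equally well.

```python
import math

# Each directed edge (a, b) maps to its weight.
weights = {
    (1, 5): -5,
    (2, 1): 3.2,
    (2, 3): 42,
    (3, 2): math.pi,
    (2, 4): 777,
    (4, 1): 666,
}

# (2, 3) and (3, 2) are both present, with unrelated weights.
print(weights[(2, 3)], weights[(3, 2)])
```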
Graph Representation
• How to represent in memory?
• Two common ways:
  • Adjacency Lists
  • Adjacency Matrix
Adjacency Lists
• Compact usage in sparse graphs where |E| << |V|^2
• Stores graph as array of |V| lists
  • Each vertex v has a list of the vertices adjacent to v in G
Undirected Graph Example
G = ( {1,2,3,4,5}, {(1,2), (1,4), (2,3), (2,4), (1,5)} )
vertex: adjacent vertices
1: 2 4 5
2: 1 3 4
3: 2
4: 1 2
5: 1
Directed Graph Example
G = ( {1,2,3,4,5}, { (1,5), (2,1), (2,3), (2,4), (3,2), (4,1) } )
vertex: adjacent vertices
1: 5
2: 1 3 4
3: 2
4: 1
5: (empty)
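The two adjacency-list examples can be reproduced with a short sketch. The slides describe an array of |V| lists; a Python dict keyed by vertex is used here instead, purely as an assumption for readability. The function name adjacency_lists is hypothetical.

```python
def adjacency_lists(vertices, edges, directed=False):
    """Map each vertex to a sorted list of its adjacent vertices."""
    adj = {v: [] for v in vertices}
    for a, b in edges:
        adj[a].append(b)
        if not directed:
            # An undirected edge appears in the lists of both endpoints.
            adj[b].append(a)
    return {v: sorted(neighbors) for v, neighbors in adj.items()}


V = {1, 2, 3, 4, 5}
undirected_E = {(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)}
directed_E = {(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)}

undirected_adj = adjacency_lists(V, undirected_E)
directed_adj = adjacency_lists(V, directed_E, directed=True)
print(undirected_adj)
print(directed_adj)
```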
Adjacency Lists Wrap Up
• Sum of list lengths for undirected
  • 2|E| (each edge appears in the lists of both of its endpoints)
  • For some applications, could be optimized to |E|
• Sum of list lengths for directed
  • |E|
• Weighted graphs: left as exercise
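One possible answer to the weighted-graph exercise, offered as a sketch rather than the intended solution: store (neighbor, weight) pairs in each list. The same block also counts list entries to confirm the |E| total for a directed graph. All names here are assumptions.

```python
import math

# Weighted directed edges as (a, b, weight) triples, from the earlier example.
weighted_E = [
    (1, 5, -5), (2, 1, 3.2), (2, 3, 42),
    (3, 2, math.pi), (2, 4, 777), (4, 1, 666),
]


def weighted_adjacency_lists(vertices, weighted_edges):
    """Map each vertex to a list of (neighbor, weight) pairs."""
    adj = {v: [] for v in vertices}
    for a, b, w in weighted_edges:
        adj[a].append((b, w))
    return adj


adj = weighted_adjacency_lists(range(1, 6), weighted_E)

# For a directed graph, the list lengths sum to |E|.
total_entries = sum(len(neighbors) for neighbors in adj.values())
print(total_entries, len(weighted_E))
```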
Adjacency Matrix
• Often less memory for dense graphs
• Faster check for edge existence
• Mathematically:
  • M is a |V| x |V| matrix
  • Rows and columns are indexed by vertices
  • M(i,j) = 1 if (i,j) ∈ E, 0 otherwise
Undirected Graph Example
G = ( {1,2,3,4,5}, {(1,2), (1,4), (2,3), (2,4), (1,5)} )
    1 2 3 4 5
1   0 1 0 1 1
2   1 0 1 1 0
3   0 1 0 0 0
4   1 1 0 0 0
5   1 0 0 0 0
Directed Graph Example
G = ( {1,2,3,4,5}, { (1,5), (2,1), (2,3), (2,4), (3,2), (4,1) } )
    1 2 3 4 5
1   0 0 0 0 1
2   1 0 1 1 0
3   0 1 0 0 0
4   1 0 0 0 0
5   0 0 0 0 0
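Both matrices can be generated from the edge sets with a few lines. This is a sketch assuming vertices are numbered 1..n, so vertex i lives at row and column index i-1; adjacency_matrix is a hypothetical name.

```python
def adjacency_matrix(n, edges, directed=False):
    """Return an n x n matrix of 0/1 entries for vertices numbered 1..n."""
    M = [[0] * n for _ in range(n)]
    for a, b in edges:
        M[a - 1][b - 1] = 1
        if not directed:
            # Undirected graphs give a symmetric matrix.
            M[b - 1][a - 1] = 1
    return M


undirected_M = adjacency_matrix(5, {(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)})
directed_M = adjacency_matrix(
    5, {(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)}, directed=True
)

for row in directed_M:
    print(row)
```

Checking whether an edge (i,j) exists is then a single lookup, M[i-1][j-1], which is the "faster check" the matrix representation offers.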
Adjacency Matrix Wrap Up
• Size is always |V|^2
• If |E| is close to |V|^2, can be more efficient because an edge takes 1 bit instead of 4 bytes for a pointer
• Weighted graphs: use the weight instead of 0’s and 1’s
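Storing weights in the matrix raises one question the slides leave open: what marks a missing edge, given that 0 can itself be a valid weight. The sketch below assumes math.inf as the "no edge" marker; that sentinel, and the name weighted_matrix, are assumptions rather than anything the slides prescribe.

```python
import math


def weighted_matrix(n, weighted_edges, no_edge=math.inf):
    """Return an n x n matrix of weights for vertices numbered 1..n."""
    M = [[no_edge] * n for _ in range(n)]
    for a, b, w in weighted_edges:
        M[a - 1][b - 1] = w
    return M


weighted_E = [
    (1, 5, -5), (2, 1, 3.2), (2, 3, 42),
    (3, 2, math.pi), (2, 4, 777), (4, 1, 666),
]
W = weighted_matrix(5, weighted_E)

print(W[1][0])  # weight of edge (2, 1)
print(W[0][1])  # no edge (1, 2), so the sentinel is returned
```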
Breadth-first Search
• Problem: For a given graph G and a specified source vertex s in the graph, find all vertices v that are reachable from s and determine a shortest path in G from s to v.
How BFS works
• Constructs a breadth first tree
  • Root is s
  • Path from s to v in the tree is a shortest path from s to v in G
The BFS algorithm
• Assigns a color to each node
  • white = vertex has not been reached
  • gray = vertex is in the BFS frontier
  • black = vertex and ALL of its neighbors have been processed
BFS algorithm (cont.)
• Computes d[v] for each v
  • Shortest distance from s to v in G
• Computes p[v] for each v
  • Predecessor of v in the breadth-first tree
BFS Pseudo Code (initialization)
    for each vertex v in V
        color[v] = white
        d[v] = INFINITY
        p[v] = NULL
    color[s] = gray
    d[s] = 0
    Queue.clear()
    Queue.put(s)
BFS Pseudo Code (tree construction)
    while (!Queue.empty())
        u = Queue.get()
        for each v adjacent to u
            if (color[v] == white)
                color[v] = gray
                d[v] = d[u] + 1
                p[v] = u
                Queue.put(v)
        color[u] = black
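The two pseudocode slides translate almost line for line into Python. The following is a sketch under a few assumptions: the graph is given as a dict of adjacency lists named adj, collections.deque stands in for Queue, math.inf for INFINITY, and None for NULL. The function name bfs is illustrative.

```python
from collections import deque
import math

WHITE, GRAY, BLACK = "white", "gray", "black"


def bfs(adj, s):
    """Return (d, p): distances from s and breadth-first tree predecessors."""
    # Initialization: every vertex is unreached.
    color = {v: WHITE for v in adj}
    d = {v: math.inf for v in adj}
    p = {v: None for v in adj}
    color[s] = GRAY
    d[s] = 0
    queue = deque([s])

    # Tree construction: process the frontier in first-in, first-out order.
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if color[v] == WHITE:
                color[v] = GRAY
                d[v] = d[u] + 1
                p[v] = u
                queue.append(v)
        color[u] = BLACK
    return d, p


# Adjacency lists of the undirected example graph.
adj = {1: [2, 4, 5], 2: [1, 3, 4], 3: [2], 4: [1, 2], 5: [1]}
d, p = bfs(adj, 1)
print(d)
print(p)
```

Starting from s = 1, vertices 2, 4 and 5 are discovered at distance 1, and vertex 3 is discovered from 2 at distance 2.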
Correctness of BFS
• Definition 1. b(s,v) is the minimum number of edges in any path from s to v. If there is no path from s to v then b(s,v) = INFINITY. b(s,v) is the shortest-path distance from s to v.
• Lemma 1. Let G=(V,E) and s in V. For any edge (u,v) in E,
  b(s,v) <= b(s,u) + 1
Proof of Lemma 1
• If u is reachable from s, then so is v. A shortest path from s to v cannot be longer than a shortest path from s to u followed by the edge (u,v), so the inequality holds.
• If u is not reachable from s, then b(s,u) = INFINITY, so the inequality holds.
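Lemma 1 can be spot-checked on the directed example graph. This is a sanity check, not a proof: the sketch computes b(s,v) by repeated edge relaxation (a deliberately different method from BFS) and then tests the inequality on every edge. The function names are hypothetical.

```python
import math


def shortest_distances(vertices, edges, s):
    """Edge-count distances b(s, v), found by repeated edge relaxation."""
    b = {v: math.inf for v in vertices}
    b[s] = 0
    # |V| - 1 passes are enough, since a shortest path has at most |V| - 1 edges.
    for _ in range(len(vertices) - 1):
        for u, v in edges:
            if b[u] + 1 < b[v]:
                b[v] = b[u] + 1
    return b


def lemma1_holds(vertices, edges, s):
    """Check b(s, v) <= b(s, u) + 1 for every edge (u, v)."""
    b = shortest_distances(vertices, edges, s)
    return all(b[v] <= b[u] + 1 for u, v in edges)


V = {1, 2, 3, 4, 5}
E = {(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)}
print(all(lemma1_holds(V, E, s) for s in V))
```

The unreachable case is covered too: in Python, math.inf + 1 is still math.inf, which matches the INFINITY argument in the proof.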
Lemma 2
• Upon termination, the BFS algorithm has computed d[v] for every vertex v, and d[v] >= b(s,v)
Proof of Lemma 2
• By induction on the number i of enqueue operations.
• For i = 1 (s is enqueued),
  • d[s] = 0 = b(s,s)
  • d[v] = INFINITY >= b(s,v) for all v != s
• For i = n, consider a white vertex v discovered from u. By induction, d[u] >= b(s,u). Then d[v] = d[u]+1 >= b(s,u)+1 >= b(s,v), where the last step is Lemma 1. Since v is then colored gray, d[v] never changes again.
Lemma 3
• At all times during execution of BFS
  • the queue contains vertices (v1, v2, …, vr), with v1 at the head, such that
    • d[v1] <= d[v2] <= … <= d[vr]
    • d[vr] <= d[v1] + 1
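The queue invariant of Lemma 3 can be observed directly by asserting it after every queue operation. The sketch below is a stripped-down BFS that tracks only d, treating "d[v] is still infinite" as equivalent to "v is white"; that simplification and all names are assumptions.

```python
from collections import deque
import math


def check_invariant(queue, d):
    """Assert both conditions of Lemma 3 for a non-empty queue."""
    values = [d[v] for v in queue]
    assert values == sorted(values)         # d[v1] <= d[v2] <= ... <= d[vr]
    assert values[-1] <= values[0] + 1      # d[vr] <= d[v1] + 1


def bfs_checking_queue(adj, s):
    """Run BFS from s, checking the queue invariant as it goes."""
    d = {v: math.inf for v in adj}
    d[s] = 0
    queue = deque([s])
    while queue:
        check_invariant(queue, d)
        u = queue.popleft()
        for v in adj[u]:
            if d[v] == math.inf:            # v has not been reached yet
                d[v] = d[u] + 1
                queue.append(v)
                check_invariant(queue, d)
    return d


adj = {1: [2, 4, 5], 2: [1, 3, 4], 3: [2], 4: [1, 2], 5: [1]}
distances = bfs_checking_queue(adj, 1)
print(distances)
```

If the invariant were ever violated, one of the assertions would raise before the function returned.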
Proof of Lemma 3
• By induction on the number i of queue operations.
• For i = 1, the queue only has s, so the hypothesis holds.
• For i = n
  • After dequeueing v1:
    • d[vr] <= d[v1]+1 and d[v1] <= d[v2], so d[vr] <= d[v2]+1, and the hypothesis holds
  • After enqueueing v_{r+1} (discovered while scanning the just-dequeued v1):
    • d[v_{r+1}] = d[v1]+1 >= d[vr]
    • d[v_{r+1}] = d[v1]+1 <= d[v2]+1, since d[v1] <= d[v2]
    • Since v2 is the new head of the queue, the hypothesis holds
Corollary (4) to Lemma 3
• If vertices u and v are enqueued during execution of BFS and u is enqueued before v, then d[u] <= d[v]
Theorem 5
• Given G=(V,E) and s
  • BFS discovers every v reachable from s
  • Upon termination, d[v] = b(s,v)
• Moreover, for v != s reachable from s
  • One of the shortest paths from s to v is a shortest path from s to p[v], followed by the edge (p[v],v).
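The last part of Theorem 5 is what makes path recovery possible: following p back from v until s is reached yields a shortest path in reverse. A minimal sketch, assuming p is a dict of predecessors with None for vertices that have no predecessor; path_to is a hypothetical helper.

```python
def path_to(p, s, v):
    """Return the vertices on the tree path from s to v, or None if unreachable."""
    path = [v]
    while path[-1] != s:
        parent = p[path[-1]]
        if parent is None:
            # Ran out of predecessors before reaching s.
            return None
        path.append(parent)
    path.reverse()
    return path


# Predecessors for the undirected example graph with s = 1, plus an
# extra isolated vertex 6 added here only to show the unreachable case.
p = {1: None, 2: 1, 3: 2, 4: 1, 5: 1, 6: None}

print(path_to(p, 1, 3))
print(path_to(p, 1, 6))
```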
Proof of Theorem 5
• By contradiction.
• Assume some v is assigned d[v] != b(s,v); among all such vertices, take v with the smallest b(s,v). By Lemma 2, d[v] >= b(s,v), so d[v] > b(s,v).
• v must be reachable, else b(s,v) = INFINITY >= d[v]
• Let u be the predecessor of v on a shortest path from s to v
  • b(s,v) = b(s,u)+1 = d[u]+1, since b(s,u) < b(s,v) and so d[u] = b(s,u) by the choice of v
  • This would mean d[v] > d[u]+1
Proof completion
• d[v] > d[u]+1 cannot happen!
• Look at when BFS dequeues u
• v is either white, black, or gray
  • If v is black, it was already removed from the queue, and by Corollary 4, d[v] <= d[u]
  • If v is gray, it was made gray when some other vertex w was dequeued before u,
    • so d[v] = d[w]+1 <= d[u]+1 (by Corollary 4)
  • If v is white, then the code sets d[v]
    • d[v] = d[u] + 1
BFS Wrap Up
• So, d[v] = b(s,v) for all v in V
• All reachable vertices are discovered; for every other vertex, d = INFINITY
• If p[v] = u, then d[v] = d[u]+1, so one of the shortest paths from s to v takes a shortest path from s to u and then the edge (u,v)
Applications for Graphs
• Link structure of a website
• Problems in travel, biology, etc.
• Network representation
• Solution space:
  • EXAMPLE: Sudoku