introduction to graphs and breadth first search. graphs: what are they? representations of pairwise...

Introduction to Graphs and Breadth First Search

Graphs: what are they?

• Representations of pairwise relationships

• Collections of objects under some specified relationship

Graphs: what are they mathematically?

• A graph G is a pair (V,E)
  • V is a set of vertices (nodes)
  • E is a set of pairs (a,b), with a,b ∈ V

• V is the set of relatable objects
• E is the set of relationships

A Visual Example

[Figure: undirected graph drawn on vertices 1, 2, 3, 4, 5]

G = ( {1,2,3,4,5}, {(1,2), (1,4), (2,3), (2,4), (1,5)} )
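The pair (V,E) above can be written down almost literally in code. The following is a minimal sketch, assuming Python sets for V and E; storing each undirected edge as a frozenset, and the helper name has_edge, are illustrative choices and not part of the slides.

```python
# Vertices and edges of the undirected example graph, as Python sets.
V = {1, 2, 3, 4, 5}

# Assumption: each undirected edge is stored as a frozenset,
# so that the pair {a, b} is the same edge as {b, a}.
E = {frozenset(pair) for pair in [(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)]}

G = (V, E)


def has_edge(graph, a, b):
    """Return True if vertices a and b are related in the graph."""
    _, edges = graph
    return frozenset((a, b)) in edges
```

With this encoding, has_edge(G, 1, 2) and has_edge(G, 2, 1) give the same answer, which matches the undirected reading of the example.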

Directed Graphs

• In a directed graph
  • (a,b) ∈ E does not imply (b,a) ∈ E

• Undirected graphs are the special case where
  • (a,b) ∈ E if and only if (b,a) ∈ E

• Visually, directed graphs are drawn with arrows

Directed Graph Example

[Figure: directed graph drawn with arrows on vertices 1, 2, 3, 4, 5]

G = ( {1,2,3,4,5}, { (1,5), (2,1), (2,3), (2,4), (3,2), (4,1) } )
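The difference between directed and undirected edge sets can be checked mechanically. This is a small sketch, assuming edges are stored as ordered tuples in a Python set; the function names is_symmetric and symmetric_closure are hypothetical helpers introduced here for illustration.

```python
# Directed example graph: each edge is an ordered pair (a, b).
V = {1, 2, 3, 4, 5}
E = {(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)}


def is_symmetric(edges):
    """True if (a, b) in edges always implies (b, a) in edges."""
    return all((b, a) in edges for (a, b) in edges)


def symmetric_closure(edges):
    """Add the reverse of every edge, giving an undirected edge set."""
    return edges | {(b, a) for (a, b) in edges}
```

Here (2,3) and (3,2) are both present, but (1,5) has no reverse edge, so the example edge set is not symmetric.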

Weighted Graphs

• Have weights associated with edges

• Can be directed or undirected
• In a directed graph, the weight of (a,b) need not have any relationship to the weight of (b,a)

Weighted Graph Example

[Figure: directed graph on vertices 1, 2, 3, 4, 5 with a weight drawn on each edge]

G = ( {1,2,3,4,5}, { (1,5,-5), (2,1,3.2), (2,3,42), (3,2,π), (2,4,777), (4,1,666) } )
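One way to hold the weighted example in code is a mapping from edge to weight. This is a sketch under the assumption that a Python dict keyed by the ordered pair (a, b) is acceptable; the name weight and the choice of None for a missing edge are illustrative, not taken from the slides.

```python
import math

# Weighted, directed example: key is the edge (a, b), value is its weight.
W = {
    (1, 5): -5,
    (2, 1): 3.2,
    (2, 3): 42,
    (3, 2): math.pi,
    (2, 4): 777,
    (4, 1): 666,
}


def weight(weights, a, b):
    """Return the weight of edge (a, b), or None if there is no such edge."""
    return weights.get((a, b))
```

Note that weight(W, 2, 3) and weight(W, 3, 2) are unrelated values, as the slide on weighted graphs allows.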

Graph Representation

• How to represent in memory?
• Two common ways:
  • Adjacency Lists
  • Adjacency Matrix

Adjacency Lists

• Compact usage in sparse graphs where |E| << |V|^2

• Stores graph as array of |V| lists
  • Each v has a list of the vertices adjacent to v in G

Undirected Graph Example


G = ( {1,2,3,4,5}, {(1,2), (1,4), (2,3), (2,4), (1,5)} )

1: 2 4 5
2: 1 3 4
3: 2
4: 1 2
5: 1

Directed Graph Example

G = ( {1,2,3,4,5}, { (1,5), (2,1), (2,3), (2,4), (3,2), (4,1) } )

1: 5
2: 1 3 4
3: 2
4: 1
5: (empty)
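The two adjacency-list examples can be reproduced with a short routine. This is a minimal sketch, assuming a Python dict keyed by vertex stands in for the array of |V| lists; the function name adjacency_lists and the directed flag are hypothetical.

```python
def adjacency_lists(vertices, edges, directed=False):
    """Build {vertex: sorted list of adjacent vertices} from an edge list."""
    adj = {v: [] for v in vertices}
    for a, b in edges:
        adj[a].append(b)
        if not directed:
            # An undirected edge appears in the lists of both endpoints.
            adj[b].append(a)
    for v in adj:
        adj[v].sort()
    return adj


vertices = [1, 2, 3, 4, 5]
undirected = adjacency_lists(vertices, [(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)])
directed = adjacency_lists(
    vertices, [(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)], directed=True
)
```

Sorting each list is only there to make the output match the order shown in the examples; BFS itself does not require it.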

Adjacency Lists Wrap Up

• Sum of list lengths for undirected
  • 2|E|
  • For some apps, could optimize to |E|

• Sum of list lengths for directed
  • |E|

• Weighted graphs: left as exercise
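The list-length counts can be confirmed on the two example graphs, and one possible answer to the weighted exercise is sketched alongside. The (neighbor, weight) tuple layout is an assumption made here for illustration; other layouts would work equally well.

```python
import math

# Adjacency lists of the undirected and directed examples (5 and 6 edges).
undirected = {1: [2, 4, 5], 2: [1, 3, 4], 3: [2], 4: [1, 2], 5: [1]}
directed = {1: [5], 2: [1, 3, 4], 3: [2], 4: [1], 5: []}


def total_length(adj):
    """Sum of the lengths of all adjacency lists."""
    return sum(len(neighbors) for neighbors in adj.values())


# One possible weighted variant (an assumption, not from the slides):
# each list entry is a (neighbor, weight) pair.
weighted = {
    1: [(5, -5)],
    2: [(1, 3.2), (3, 42), (4, 777)],
    3: [(2, math.pi)],
    4: [(1, 666)],
    5: [],
}
```

For the undirected example the total is 10 = 2|E|, and for the directed example it is 6 = |E|.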

Adjacency Matrix

• Often less memory for dense graphs

• Faster check for edge existence
• Mathematically:
  • M is a |V| x |V| matrix
  • Dimensions represent vertices
  • M(i,j) = 1 if (i,j) ∈ E, 0 otherwise



Undirected Graph Example

G = ( {1,2,3,4,5}, {(1,2), (1,4), (2,3), (2,4), (1,5)} )

    1 2 3 4 5
1   0 1 0 1 1
2   1 0 1 1 0
3   0 1 0 0 0
4   1 1 0 0 0
5   1 0 0 0 0

Directed Graph Example

G = ( {1,2,3,4,5}, { (1,5), (2,1), (2,3), (2,4), (3,2), (4,1) } )

    1 2 3 4 5
1   0 0 0 0 1
2   1 0 1 1 0
3   0 1 0 0 0
4   1 0 0 0 0
5   0 0 0 0 0
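Both matrices can be generated from the edge sets. This is a minimal sketch, assuming vertices are numbered 1..n and the matrix is a list of lists with 0-based indices; the function name adjacency_matrix is hypothetical.

```python
def adjacency_matrix(n, edges, directed=False):
    """Return an n x n 0/1 matrix M with M[i-1][j-1] = 1 iff (i, j) is an edge."""
    M = [[0] * n for _ in range(n)]
    for a, b in edges:
        M[a - 1][b - 1] = 1
        if not directed:
            # Undirected graphs give a symmetric matrix.
            M[b - 1][a - 1] = 1
    return M


undirected_edges = [(1, 2), (1, 4), (2, 3), (2, 4), (1, 5)]
directed_edges = [(1, 5), (2, 1), (2, 3), (2, 4), (3, 2), (4, 1)]

M_undirected = adjacency_matrix(5, undirected_edges)
M_directed = adjacency_matrix(5, directed_edges, directed=True)
```

Checking whether an edge exists is then a single index lookup, which is the "faster check" the adjacency-matrix slide refers to.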

Adjacency Matrix Wrap Up

• Size is always |V|^2

• If |E| is close to |V|^2, can be more efficient because an edge is 1 bit instead of 4 bytes for a pointer

• Weighted graphs: use the weight instead of 0’s and 1’s
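A weighted matrix can be sketched along the same lines. The slides say only to store the weight in place of 0 and 1; using None to mark a missing edge is an assumption made here, chosen because a weight of 0 (or a negative weight such as -5) could otherwise be confused with "no edge". The function name weighted_matrix is hypothetical.

```python
import math


def weighted_matrix(n, weighted_edges):
    """Return an n x n matrix holding edge weights; None marks 'no edge'."""
    M = [[None] * n for _ in range(n)]
    for a, b, w in weighted_edges:
        M[a - 1][b - 1] = w
    return M


# The weighted, directed example graph as (from, to, weight) triples.
triples = [
    (1, 5, -5), (2, 1, 3.2), (2, 3, 42),
    (3, 2, math.pi), (2, 4, 777), (4, 1, 666),
]
M_weighted = weighted_matrix(5, triples)
```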

Breadth-first Search

• Problem: For a given graph G and a specified source vertex s in G, find all vertices v that are reachable from s and determine a shortest path in G from s to each such v.

How BFS works

• Constructs a breadth-first tree
  • Root is s
  • Path from s to v in the tree is a shortest path from s to v in G

The BFS algorithm

• Assigns a color to each node
  • white = vertex has not been reached
  • gray = vertex is in the BFS frontier
  • black = vertex and ALL of its neighbors have been processed

BFS algorithm (cont.)

• Computes d[v] for each v
  • Shortest distance from s to v in G

• Computes p[v] for each v
  • Predecessor of v in the breadth-first tree

BFS Pseudo Code (initialization)

for each vertex v in V
    color[v] = white
    d[v] = INFINITY
    p[v] = NULL
color[s] = gray
d[s] = 0
Queue.clear()
Queue.put(s)

BFS Pseudo Code (tree construction)

while (!Queue.empty())
    u = Queue.get()
    for each v adjacent to u
        if (color[v] == white)
            color[v] = gray
            d[v] = d[u] + 1
            p[v] = u
            Queue.put(v)
    color[u] = black
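The initialization and tree-construction pseudocode translates fairly directly into runnable code. The following is a sketch, assuming the graph is given as a dict of adjacency lists and using collections.deque as the FIFO queue; the function name bfs and the string color constants are illustrative choices.

```python
from collections import deque
import math

WHITE, GRAY, BLACK = "white", "gray", "black"


def bfs(adj, s):
    """Breadth-first search from s; returns (d, p) as dicts keyed by vertex."""
    # Initialization: every vertex starts unreached.
    color = {v: WHITE for v in adj}
    d = {v: math.inf for v in adj}
    p = {v: None for v in adj}
    color[s] = GRAY
    d[s] = 0
    queue = deque([s])

    # Tree construction: process the frontier in FIFO order.
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if color[v] == WHITE:
                color[v] = GRAY
                d[v] = d[u] + 1
                p[v] = u
                queue.append(v)
        color[u] = BLACK
    return d, p


# Undirected example graph from the adjacency-list slides.
adj = {1: [2, 4, 5], 2: [1, 3, 4], 3: [2], 4: [1, 2], 5: [1]}
d, p = bfs(adj, 1)
```

On the undirected example with s = 1, vertices 2, 4 and 5 end up at distance 1 and vertex 3 at distance 2, with p[3] = 2.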

Correctness of BFS

• Definition 1. b(s,v) is the minimum number of edges in any path from s to v. If there is no path from s to v, then b(s,v) = INFINITY. b(s,v) is the shortest-path distance from s to v.

• Lemma 1. Let G = (V,E) and s in V. For any edge (u,v) in E,

  b(s,v) <= b(s,u) + 1

Proof of Lemma 1

• If u is reachable from s, so is v. A shortest path from s to v cannot be longer than a shortest path from s to u followed by the edge (u,v), thus the inequality holds. If u is not reachable from s, then b(s,u) = INFINITY, so the inequality holds.

Lemma 2

• Upon termination, the BFS algorithm has computed d[v] for every vertex v, and d[v] >= b(s,v)

Proof of Lemma 2

• By induction on the number i of enqueue operations.
• For i = 1 (s is enqueued),
  • d[s] = 0 = b(s,s)
  • d[v] = INFINITY >= b(s,v) for all v != s

• For i = n, consider a white v discovered from u. By induction, d[u] >= b(s,u). Then d[v] = d[u]+1 >= b(s,u)+1 >= b(s,v), where the last step is Lemma 1. Since v is now gray, d[v] never changes again.

Lemma 3

• At all times during execution of BFS
  • the queue contains vertices (v1, v2, …, vr), with v1 at the head, such that
    • d[v1] <= d[v2] <= … <= d[vr]
    • d[vr] <= d[v1] + 1
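The queue invariant of Lemma 3 can be checked empirically by instrumenting BFS. This is a sketch only, and a test on one graph is of course not a proof; it assumes a dict of adjacency lists, and it uses membership in d in place of the white color test. The name bfs_checking_queue is hypothetical.

```python
from collections import deque


def bfs_checking_queue(adj, s):
    """Run BFS from s, asserting the Lemma 3 invariant after every queue op."""
    d = {s: 0}
    queue = deque([s])

    def check():
        ds = [d[v] for v in queue]
        # d values are non-decreasing from head to tail ...
        assert all(x <= y for x, y in zip(ds, ds[1:]))
        # ... and the tail is at most one level deeper than the head.
        assert not ds or ds[-1] <= ds[0] + 1

    check()
    while queue:
        u = queue.popleft()
        check()
        for v in adj[u]:
            if v not in d:  # plays the role of "color[v] == white"
                d[v] = d[u] + 1
                queue.append(v)
                check()
    return d


adj = {1: [2, 4, 5], 2: [1, 3, 4], 3: [2], 4: [1, 2], 5: [1]}
distances = bfs_checking_queue(adj, 1)
```

If the invariant were ever violated, one of the assertions inside check would fail before the function returned.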

Proof of Lemma 3

• By induction on the number i of queue operations.
• For i = 1, the queue only has s, so the hypothesis holds
• For i = n
  • After dequeueing v1:
    • d[vr] <= d[v1]+1 and d[v1] <= d[v2], then d[vr] <= d[v2]+1, so the hypothesis holds
  • After enqueueing v(r+1), discovered from the just-dequeued v1:
    • d[v(r+1)] = d[v1]+1 >= d[vr]
    • d[v(r+1)] = d[v1]+1 <= d[v2]+1, since d[v1] <= d[v2]
    • Since v2 is the new head of the queue, the hypothesis holds

Corollary (4) to Lemma 3

• If vertices u and v are enqueued during execution of BFS and u is enqueued before v, then d[u] <= d[v]

Theorem 5

• Given G = (V,E) and s
  • BFS discovers every v reachable from s
  • Upon termination, d[v] = b(s,v)
• Moreover, for v != s reachable from s
  • One of the shortest paths from s to v is a shortest path from s to p[v], followed by the edge (p[v],v).

Proof of Theorem 5

• By contradiction.
• Assume some v is assigned d[v] != b(s,v); among all such vertices, pick v with the smallest b(s,v). By Lemma 2, d[v] >= b(s,v), so d[v] > b(s,v).

• v must be reachable from s, else b(s,v) = INFINITY >= d[v]

• Let u be the vertex just before v on a shortest path from s to v
  • b(s,v) = b(s,u)+1 = d[u]+1, where d[u] = b(s,u) because b(s,u) < b(s,v) and v was chosen with the smallest b(s,v)
  • This would mean d[v] > d[u]+1

Proof completion

• d[v] > d[u]+1 cannot happen!
• Look at when BFS dequeues u
• v is either white, black, or gray
  • If v is black, it was already removed from the queue, and by Corollary 4, d[v] <= d[u]
  • If v is gray, it was made gray when some other vertex w was dequeued before u,
    • so d[v] = d[w]+1 <= d[u]+1 (by Corollary 4)
  • If v is white, then the code sets d[v]
    • d[v] = d[u] + 1
• In every case d[v] <= d[u]+1, a contradiction

BFS Wrap Up

• So, d[v] = b(s,v) for all v in V
• All reachable vertices are discovered, else d = INFINITY
• If p[v] = u, then d[v] = d[u]+1, so one of the shortest paths from s to v takes a shortest path from s to u, then the edge (u,v)
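Because each p[v] points one step back toward s, a shortest path can be recovered by following predecessors from v. This is a minimal sketch, assuming p is a dict as produced by a BFS from s, with None for s and for unreached vertices; the function name shortest_path is hypothetical.

```python
def shortest_path(p, s, v):
    """Follow predecessors from v back to s; return None if v is unreachable."""
    path = [v]
    while path[-1] != s:
        pred = p.get(path[-1])
        if pred is None:
            # No predecessor recorded: BFS never reached this vertex.
            return None
        path.append(pred)
    path.reverse()
    return path


# Predecessors computed by BFS from s = 1 on the undirected example graph.
p = {1: None, 2: 1, 3: 2, 4: 1, 5: 1}
route = shortest_path(p, 1, 3)
```

For the undirected example, the route from 1 to 3 goes through vertex 2, which has length d[3] = 2.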

Applications for Graphs

• Link structure of a website
• Problems in travel, biology, etc.
• Network representation
• Solution space:
  • EXAMPLE: Sudoku
