Intro to Computation & AI

Dr. Jill Fain Lehman

School of Computer Science

Lecture 4: November 13, 1997

Graph Basics

• In general, a graph consists of a set of nodes/vertices V and a set of edges E.

• Note: a tree is a special type of graph.

• Either v ∈ V or e ∈ E may be a complex structure with additional information associated with it.

[Figure: example graph with vertices v1 to v6 and edges e1 to e8.]

Examples

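[Figure: two example graphs. The first connects the cities pgh, nyc, la, sf, no and bos with numerically weighted edges. The second has nodes labeled with parts of speech (article, adj, noun, verb, ’s).]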

Graph Formalism

• G = (V, E) where G is a graph, V a set of vertices and E a set of edges, such that e ∈ E only if e = (v1, v2) with v1, v2 ∈ V.

• If G is undirected, then e = (v1, v2) implies e = (v2, v1), i.e. vertices are unordered.

• If G is directed (a digraph), then (v1, v2) is an ordered pair: v1 is the origin, v2 is the terminus or destination.

[Figure: an undirected edge and a directed edge between v1 and v2.]
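The formalism above can be sketched in a few lines of Python. This is only an illustration: the helper name `make_graph` and the choice to store an undirected edge as two ordered pairs are assumptions, not part of the lecture.

```python
def make_graph(vertices, edges, directed=True):
    """Return (V, E) as sets; an undirected edge is stored in both orders."""
    V = set(vertices)
    E = set()
    for v1, v2 in edges:
        if v1 not in V or v2 not in V:
            raise ValueError(f"edge ({v1}, {v2}) uses a vertex outside V")
        E.add((v1, v2))
        if not directed:
            E.add((v2, v1))  # undirected: (v1, v2) implies (v2, v1)
    return V, E

# The same single edge, read as undirected and as directed.
V, E = make_graph(["v1", "v2"], [("v1", "v2")], directed=False)
D_V, D_E = make_graph(["v1", "v2"], [("v1", "v2")], directed=True)
```

In the directed case only the pair with origin v1 and terminus v2 is in the edge set.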

Paths, Adjacency, Cycles

• Two vertices vi and vj are adjacent if there exists an edge e ∈ E such that e = (vi, vj).

• A path p is a sequence of vertices of V of the form p = v1 v2 ... vn (n >= 2) in which each vertex vi is adjacent to vi+1 (for 1<= i <= n-1).

• A cycle is a path p = v1 v2 ... vn such that v1 = vn

[Figure: example graph on vertices A, B, C, D.]
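The path and cycle definitions translate almost directly into code. The sketch below assumes E is a set of ordered pairs; the function names and the small edge set are hypothetical (the edge set is not the graph from the slide).

```python
def is_path(p, E):
    """True if p = v1 v2 ... vn with n >= 2 and each (vi, vi+1) is an edge in E."""
    return len(p) >= 2 and all((p[i], p[i + 1]) in E for i in range(len(p) - 1))

def is_cycle(p, E):
    """A cycle is a path whose first and last vertices coincide."""
    return is_path(p, E) and p[0] == p[-1]

# Hypothetical directed edge set on A, B, C, D.
E = {("A", "B"), ("B", "C"), ("C", "A"), ("C", "D")}
```

Here `A B C D` is a path, `A B C A` is a cycle, and `A C` is not a path because (A, C) is not an edge.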

Connectivity

• If x ∈ V and y ∈ V, x ≠ y, then x and y are connected if there exists a path p = v1…vn such that x = v1 and y = vn.

• For G undirected, a subset S of V is a connected component if for any two distinct vertices x ∈ S, y ∈ S, x is connected to y (and S is maximal: no vertex outside S is connected to a vertex in S).

• For G directed, a subset S of V is strongly connected if for each pair of distinct vertices vi, vj ∈ S, vi is connected to vj and vj is connected to vi. S is weakly connected if either vi is connected to vj or vj is connected to vi.

Connectivity Examples

[Figure: a strongly connected digraph and a weakly connected digraph.]
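As a rough check of these definitions, the sketch below tests strong and weak connectivity by computing which vertices are reachable from each vertex. It follows the lecture's wording of "weakly connected" (a path in at least one direction for each pair). The digraphs `cycle` and `chain` and all function names are made up for illustration.

```python
def reachable_from(x, succ):
    """Vertices reachable from x by a path of one or more edges."""
    seen, stack = set(), [x]
    while stack:
        v = stack.pop()
        for w in succ.get(v, ()):
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return seen

def strongly_connected(S, succ):
    """Every ordered pair of distinct vertices in S is joined by a path."""
    return all(y in reachable_from(x, succ) for x in S for y in S if x != y)

def weakly_connected(S, succ):
    """Lecture's definition: each pair has a path in at least one direction."""
    return all(
        y in reachable_from(x, succ) or x in reachable_from(y, succ)
        for x in S for y in S if x != y
    )

cycle = {"a": ["b"], "b": ["c"], "c": ["a"]}  # a -> b -> c -> a
chain = {"a": ["b"], "b": ["c"], "c": []}     # a -> b -> c
```

The three-vertex cycle is strongly connected; the chain is only weakly connected, since nothing leads back to a.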

Adjacency Sets and Degrees

• Let an adjacency set Vx = {y | (x, y) ∈ E}. Then G = (V, A) where A = {Vx | x ∈ V}.

• For G undirected, the degree of a vertex x is the number of edges e in which x is one of the endpoints of e.

[Figure: undirected graph with 2 components; its vertices are labeled with degrees d=4, d=4, d=1, d=3, d=2 and d=0.]

Degrees for Directed Graphs

• If x is a vertex in a digraph G, we can define two sets Pred(x) and Succ(x), the predecessors and successors of x respectively.

• Pred(x) = {y | y ∈ V and (y, x) ∈ E}; the size of Pred(x) is called the in-degree of x.

• Succ(x) = {y | y ∈ V and (x, y) ∈ E}; the size of Succ(x) is called the out-degree of x.

[Figure: digraph with three vertices labeled in=2, out=0; in=0, out=2; in=1, out=1.]
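The degree definitions can be checked with a small sketch. The three-edge digraph below is an assumption, chosen so that its in- and out-degrees match the three labels listed above; `degree` assumes each undirected edge is listed once and that there are no self-loops.

```python
def degree(x, edges):
    """Undirected degree: number of edges that have x as an endpoint."""
    return sum(1 for (u, v) in edges if x in (u, v))

def pred(x, E):
    """Predecessors of x: all y with an edge (y, x)."""
    return {y for (y, z) in E if z == x}

def succ(x, E):
    """Successors of x: all y with an edge (x, y)."""
    return {y for (z, y) in E if z == x}

# Hypothetical digraph: a -> b, a -> c, b -> c
E = {("a", "b"), ("a", "c"), ("b", "c")}
in_out = {x: (len(pred(x, E)), len(succ(x, E))) for x in "abc"}
```

With these edges, c has in=2, out=0; a has in=0, out=2; b has in=1, out=1.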

Graph Representations: The Adjacency Matrix

• Given G = (V, E), V = v1…vn. Let T[i,j] be a table with n rows and n columns such that row i corresponds to vi and column j to vj (1 <= i,j <= n). Then T[i,j] = 1 iff there exists e ∈ E such that e = (vi, vj), and T[i,j] = 0 iff there exists no e ∈ E such that e = (vi, vj).

[Figure: digraph G on vertices 1, 2, 3, 4.]

Adjacency matrix for G:

      1  2  3  4
  1   0  1  0  0
  2   0  0  1  1
  3   1  0  0  1
  4   1  0  0  0
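A minimal sketch of building the table T, assuming vertices are numbered 1 to n and edges are given as ordered pairs. The edge list is read off the matrix shown for G; the function name is hypothetical.

```python
def adjacency_matrix(n, edges):
    """Build an n x n table T with T[i][j] = 1 iff (v_i, v_j) is an edge."""
    T = [[0] * n for _ in range(n)]
    for i, j in edges:
        T[i - 1][j - 1] = 1  # vertices are 1-based, Python lists are 0-based
    return T

# Edges read off the adjacency matrix for G.
edges_G = [(1, 2), (2, 3), (2, 4), (3, 1), (3, 4), (4, 1)]
T = adjacency_matrix(4, edges_G)
```

The table always takes n * n cells, no matter how few edges the graph has.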

Graph Representations: Edge lists

[Figure: digraph G on vertices 1, 2, 3, 4.]

Vector of linked adjacency lists for G:

  1: 2
  2: 3 → 4
  3: 1 → 4
  4: 1

List of linked adjacency lists for G (basic graphnode := name, nextv, edgelist):

  G → 1: 2
      ↓
      2: 3 → 4
      ↓
      3: 1 → 4
      ↓
      4: 1
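The adjacency-list representation can be sketched with a dictionary of Python lists standing in for the vector of linked lists. The function name is an assumption; the edges are those of G.

```python
def adjacency_lists(n, edges):
    """Map each vertex 1..n to the list of its successors, without duplicates."""
    adj = {v: [] for v in range(1, n + 1)}
    for x, y in edges:
        if y not in adj[x]:
            adj[x].append(y)
    return adj

adj = adjacency_lists(4, [(1, 2), (2, 3), (2, 4), (3, 1), (3, 4), (4, 1)])
```

Unlike the matrix, this structure stores one entry per edge plus one list per vertex, so it is smaller for sparse graphs.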

Graph Insertion

Given
  G: a list of graphnodes
  v: a graphnode
  edge: a pair of graphnodes, x and y
And assume listinsert inserts only if not there.

InsertEdge(edge, graph)
  listinsert(edge.x, graph)
  listinsert(edge.y, graph)
  listinsert(edge.y.name, edge.x.edgelist)
  return graph

Complexity???

Complexity of Simple Graph Insertion

InsertEdge(edge, graph)
  listinsert(edge.x, graph)                    O(|V|)
  listinsert(edge.y, graph)                    O(|V|)
  listinsert(edge.y.name, edge.x.edgelist)     O(|V|)
  return graph

Complexity: O(|V|) on each call.
How many calls? At most |V|^2 (one per possible edge).

So, O(|V|^3) to build a graph.

Example

Main() {
  G := null
  For e in ((ny pgh) (ny bos) (bos pgh)) do
    G := InsertEdge(e, G)
}

[Figure: the resulting list G of graphnodes ny, pgh and bos, with edgelists ny: pgh, bos; bos: pgh; pgh: empty.]

Graph Search

• Basic idea: to search a graph G, we want to visit all G’s vertices in a systematic order (we’ll use the adjacency list).

• Will need to designate some v ∈ V as the start vertex.

• Will need to mark each vertex we’ve visited as seen in order to detect cycles; so we add the field visited (boolean) to the basic graphnode definition.

Recursive DFS

ExhaustiveDFS(v) {
  v.visited := true
  for w in v.edgelist do
    if w.visited = false then ExhaustiveDFS(w)
}

main() {
  ExhaustiveDFS(v0)
}

What if G has multiple components, or G has one component but is weakly connected?
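One possible answer to the question above is to restart the search from every vertex that has not been reached yet. The sketch below assumes the graph is a dictionary of adjacency lists and uses a `visited` set in place of the `visited` field; the names and the small digraph are illustrative only.

```python
def exhaustive_dfs(v, adj, visited, order):
    """Recursive DFS from v, recording vertices in the order they are first seen."""
    visited.add(v)
    order.append(v)
    for w in adj[v]:
        if w not in visited:
            exhaustive_dfs(w, adj, visited, order)

def dfs_all(adj):
    """Restart the search from every vertex not reached so far."""
    visited, order = set(), []
    for v in adj:
        if v not in visited:
            exhaustive_dfs(v, adj, visited, order)
    return order

# Hypothetical digraph: A -> B, C -> A, and an isolated vertex D.
adj = {"A": ["B"], "B": [], "C": ["A"], "D": []}

from_a_only = []
exhaustive_dfs("A", adj, set(), from_a_only)
everything = dfs_all(adj)
```

A single call from A never reaches C or D; the outer loop in `dfs_all` picks them up.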

Example

[Figure: graph on vertices A, B, C, D.]

EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)

Example

[Figure: graph on vertices A, B, C, D.]

EDFS(B)
  B.visited := true
  for unvisited w in (A C D) do
    EDFS(C)

EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do

Example

[Figure: graph on vertices A, B, C, D.]

EDFS(C)
  C.visited := true
  for unvisited w in (A B D) do
    EDFS(D)

EDFS(B)
  B.visited := true
  for unvisited w in (A C D) do

EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do

Example

[Figure: graph on vertices A, B, C, D.]

EDFS(D)
  D.visited := true
  No unvisited w in (B C A) so function returns

EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)
      B.visited := true
      for unvisited w in (A C D) do
        EDFS(C)
          C.visited := true
          for unvisited w in (A B D) do

Example

[Figure: graph on vertices A, B, C, D.]

EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)
      B.visited := true
      for unvisited w in (A C D) do
        EDFS(C)
          C.visited := true
          No unvisited w so return

Example

[Figure: graph on vertices A, B, C, D.]

EDFS(A)
  A.visited := true
  for unvisited w in (B C D) do
    EDFS(B)
      B.visited := true
      No unvisited w in (D) so return

Example

[Figure: graph on vertices A, B, C, D.]

EDFS(A)
  A.visited := true
  No unvisited w in (C D) so return

How would you change EDFS to visit nodes “breadth first”?

ExhaustiveDFS(v) {
  v.visited := true
  for w in v.edgelist do
    if w.visited = false then ExhaustiveDFS(w)
}
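One common way to answer this (offered as a sketch, not as the lecture's own solution) is to replace the recursion, which behaves like a stack, with an explicit first-in first-out queue. The function name and the small graph are assumptions.

```python
from collections import deque

def exhaustive_bfs(start, adj):
    """Visit vertices in order of increasing distance (in edges) from start."""
    visited = {start}
    order = []
    queue = deque([start])
    while queue:
        v = queue.popleft()       # oldest discovered vertex first
        order.append(v)
        for w in adj[v]:
            if w not in visited:  # mark when enqueued so no vertex is queued twice
                visited.add(w)
                queue.append(w)
    return order

# Hypothetical digraph: A -> B, A -> C, B -> D, C -> D
adj = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}
order = exhaustive_bfs("A", adj)
```

On this graph a depth-first search from A would reach D before C, while the breadth-first order handles both neighbors of A before D.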

Shortest Path

• For many problems the best representation is a directed graph with weighted edges representing, e.g., distance, time, cost.

• Dijkstra’s shortest path algorithm finds the lowest cost path in O(n^2).

[Figure: directed graph with weighted edges over the tasks Read assignment, Go to TA hours, Write pseudocode, Simulate by hand and Write/debug Java; the edge weights shown are 1, 3, 6, 1, 24, 16 and 2.]
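A minimal sketch of the O(n^2) form of Dijkstra's algorithm, which picks the closest unfinished vertex with a linear scan. It assumes non-negative weights and a dictionary-of-dictionaries graph in which every vertex appears as a key; the example graph is hypothetical and is not the one in the figure.

```python
import math

def dijkstra(weights, source):
    """Lowest path cost from source to every vertex; O(n^2) with linear scans."""
    dist = {v: math.inf for v in weights}
    dist[source] = 0
    done = set()
    while len(done) < len(weights):
        # Linear scan for the closest vertex not yet finalized.
        u = min((v for v in weights if v not in done), key=lambda v: dist[v])
        if dist[u] == math.inf:
            break  # everything left is unreachable
        done.add(u)
        for w, cost in weights[u].items():
            if dist[u] + cost < dist[w]:
                dist[w] = dist[u] + cost
    return dist

# Hypothetical weighted digraph.
weights = {
    "s": {"a": 1, "b": 4},
    "a": {"b": 2, "t": 6},
    "b": {"t": 3},
    "t": {},
}
dist = dijkstra(weights, "s")
```

The cheapest route to t goes s, a, b, t with cost 1 + 2 + 3 = 6, beating the more direct s, a, t at cost 7.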

PERT/CPM

• Program Evaluation and Review Technique (PERT) charts use a graph to encode:
  – tasks as vertices
  – dependencies among tasks as edges
  – duration of task as weight on edge

• A critical path on a PERT chart is a path p from a start vertex to an end vertex such that if the completion time of any task along p slips by T then the project also slips by T.

• PERT/CPM uses a DAG and topological ordering.
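One way to find a critical path is to compute the longest path in the DAG, processing vertices in topological order. The sketch below assumes durations sit on edges, as in the encoding above; the project data and the function name are made up for illustration.

```python
def critical_path(weights):
    """Longest path in a DAG given as {u: {v: duration}}; returns (length, path)."""
    order, seen = [], set()

    def visit(u):  # depth-first search; post-order reversed is a topological order
        seen.add(u)
        for v in weights[u]:
            if v not in seen:
                visit(v)
        order.append(u)

    for u in weights:
        if u not in seen:
            visit(u)
    order.reverse()

    best = {u: 0 for u in weights}     # longest distance found so far to u
    prev = {u: None for u in weights}  # predecessor on that longest path
    for u in order:
        for v, d in weights[u].items():
            if best[u] + d > best[v]:
                best[v], prev[v] = best[u] + d, u

    end = max(best, key=best.get)
    path = [end]
    while prev[path[-1]] is not None:
        path.append(prev[path[-1]])
    return best[end], path[::-1]

# Hypothetical project.
dag = {
    "start": {"design": 2, "order parts": 1},
    "design": {"build": 4},
    "order parts": {"build": 3},
    "build": {"end": 1},
    "end": {},
}
length, path = critical_path(dag)
```

Any slip along start, design, build, end delays the whole project, while "order parts" has 2 units of slack.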

The Travelling Salesman Problem (TSP)

• Given G, a directed graph with weighted edges, where vertices represent cities, and weights on edges connecting cities give the distance/cost of traveling between those cities.

• Problem: Find the minimum cost cycle that visits all the cities in the graph exactly once before returning to the starting point.

• The number of possible paths is exponential; can we do better than exhaustively trying all paths?
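To make the cost of exhaustive search concrete, here is a brute-force sketch that tries every ordering of the cities after a fixed start, i.e. (n-1)! candidate tours. The distance table and function name are hypothetical.

```python
from itertools import permutations
import math

def tsp_brute_force(weights, start):
    """Return (cost, tour) of a cheapest cycle through all cities, by enumeration."""
    others = [c for c in weights if c != start]
    best_cost, best_tour = math.inf, None
    for perm in permutations(others):
        tour = [start, *perm, start]
        cost = 0
        for a, b in zip(tour, tour[1:]):
            if b not in weights[a]:
                cost = math.inf  # missing edge: this ordering is not a valid tour
                break
            cost += weights[a][b]
        if cost < best_cost:
            best_cost, best_tour = cost, tour
    return best_cost, best_tour

# Hypothetical symmetric distances between four cities.
weights = {
    "a": {"b": 1, "c": 4, "d": 2},
    "b": {"a": 1, "c": 2, "d": 5},
    "c": {"a": 4, "b": 2, "d": 1},
    "d": {"a": 2, "b": 5, "c": 1},
}
cost, tour = tsp_brute_force(weights, "a")
```

With 4 cities there are only 6 orderings to try; with 20 cities there are already 19! of them, more than 10^17.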

The Class P

• P is the class of all problems that can be solved in polynomial time on a deterministic computer.

• Polynomial means O(n^k) for some integer k given a problem of size n.

• A deterministic computer makes exactly one choice at any choice point.

• All single processor machines and machines with fixed parallelism are deterministic.

The Class NP

• NP is the class of all problems that can be solved in polynomial time on a nondeterministic computer.

• A nondeterministic computer always makes the correct choice at a choice point (one choice but never backs up).

• Alternatively: a nondeterministic computer makes k copies of itself to run in parallel at a k-wise choice point, for all values of k.

• Alternatively: a nondeterministic computer can explore a tree of depth d in O(d) time.

Instant Ph.D.

Just answer the question:

Does P = NP?

(Nobody knows)

NP-Completeness

• An NP-complete problem is one that can be solved in O(n^k) on a nondeterministic machine, and for which it can be shown that every problem in NP can be reduced to the NP-complete problem using a polynomial time transformation.

• Such proofs rely on the definition of Turing Machine.

• Concept of NP-completeness is important because:
  – Showing a polynomial deterministic solution for any NP-complete problem means P = NP.
  – Proving something is NP-complete (or NP-hard) means you’re not likely to find a polynomial algorithm.

Proving a Problem is NP-Complete

• Another way to show your problem is NP-complete is to show that it is in NP and that a known NP-complete problem can be reduced to it in polynomial time.

• E.g. The Hamiltonian Circuit problem is known to be NP-complete (find a cycle in a directed graph of n vertices that travels through each vertex exactly once and returns to the start).

• Let’s prove (very informally) that TSP is NP-complete.

Step 1: Show TSP is in NP

• Show TSP is in NP by giving a nondeterministic solution:
  – Nondeterministically guess all possible orderings of the |V| vertices and choose the one with minimum cost.

Step 2: Reduce Hamiltonian Circuit to TSP

• Given a graph G = (V, E) we turn it into G_TSP by adding a weight of 1 to each edge.

• Run our nondeterministic TSP algorithm seeking a path of cost |V|.

• G_TSP has a solution iff G has a Hamiltonian Circuit.

[Figure: graph G on vertices a, b, c, d, and G_TSP, the same graph with each edge labeled with weight 1.]
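The reduction itself is a one-line transformation; the sketch below pairs it with an exhaustive search standing in for the nondeterministic TSP solver (so this stand-in takes exponential time). All function names and the two small digraphs are assumptions for illustration.

```python
from itertools import permutations

def to_tsp(E):
    """The reduction: give every edge of G weight 1."""
    return {(u, v): 1 for (u, v) in E}

def has_tour_of_cost(V, weighted, target):
    """Exhaustive stand-in for the nondeterministic TSP solver."""
    V = list(V)
    start, rest = V[0], V[1:]
    for perm in permutations(rest):
        tour = [start, *perm, start]
        legs = list(zip(tour, tour[1:]))
        if all(leg in weighted for leg in legs) and \
                sum(weighted[leg] for leg in legs) == target:
            return True
    return False

def has_hamiltonian_circuit(V, E):
    """G has a Hamiltonian circuit iff the weighted graph has a tour of cost |V|."""
    return has_tour_of_cost(V, to_tsp(E), len(V))

V = ["a", "b", "c", "d"]
ring = {("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")}  # directed 4-cycle
no_return = {("a", "b"), ("b", "c"), ("c", "d")}         # no edge back to a
```

A tour that visits all |V| vertices and returns uses exactly |V| edges, so with unit weights its cost is |V|; that is why the target cost identifies Hamiltonian circuits.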

Step 3: Proof by Contradiction

• Now assume TSP is not NP-complete; informally, assume it can be solved in polynomial time.

• Then we can solve any instance of HC in polynomial time (just convert to TSP, run and read off answer). So HC is in P.

• But we know that HC is NP-complete, so it has no polynomial time solution unless P = NP (contradiction).

• Thus our assumption must be wrong and TSP is NP-complete.

Who Cares?

• Just because you can only think of an exponential solution to a problem doesn’t mean that there isn’t a polynomial time solution (remember the mutilated checkerboard?).

• If a problem is in P it is also in NP by definition (similarly, if it’s O(n^2) it’s also O(n^3), etc.)

• Reducing a known NP-complete problem to your problem guarantees that yours has no polytime solution unless P = NP.

What do you do with an NP-complete problem?

• Don’t bother looking for a polynomial time solution; go directly to heuristic search….
