Artificial Intelligence - Lecture 4 5
TRANSCRIPT
-
8/6/2019 Artificial Intelligence - Lecture 4 5
1/86
-
Last Time
We talked about building goal-based agents and utility-based agents using search strategies.
But the strategies we discussed were uninformed, because the agent is given nothing more than the problem definition.
Today we'll present informed search strategies.
-
Searching in Large Problems
The search space of a problem is generally described in terms of the number of possible states in the search:
Water Jug: 12 states
Tic-Tac-Toe: 3^9 states
Rubik's Cube: 10^19 states
100-variable SAT: 10^30 states
Chess: 10^120 states
-
Searching in Large Problems
Some problems' search spaces are too large to search efficiently using uninformed methods.
Sometimes we have additional domain knowledge about the problem that we can use to inform the agent that's searching.
To do this, we use heuristics (informed guesses)
Heuristic means "serving to aid discovery"
-
Heuristic Searching
We define a heuristic function h(n):
It is computed from the state at node n
It uses domain-specific information in some way
Heuristics can estimate the goodness of a particular node (or state) n:
How close is n to a goal node?
What might be the minimal-cost path from n to a goal node?
-
Heuristic Searching
We will formalize a heuristic h(n) as follows:
h(n) >= 0 for all nodes n
h(n) = 0 implies that n is a goal node
h(n) = ∞ implies that n is a dead end from which a goal cannot be reached
-
Best-First Search
Best-first search is a generic informed search strategy that uses an evaluation function f(n), incorporating some kind of domain knowledge
f(n) is used to sort states in the open list using a priority queue (like UCS)
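The open-list bookkeeping can be sketched in Python (a minimal sketch, not from the slides; the small graph and heuristic values at the bottom are hypothetical):

```python
import heapq

def best_first_search(start, goal, successors, f):
    """Generic best-first search: the open list is a priority queue
    ordered by the evaluation function f(n)."""
    open_list = [(f(start), start)]
    closed = set()
    parent = {start: None}
    while open_list:
        _, node = heapq.heappop(open_list)
        if node == goal:
            path = []                     # reconstruct path via parent links
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        if node in closed:
            continue
        closed.add(node)
        for succ in successors(node):
            if succ not in closed:
                parent.setdefault(succ, node)   # keep first-found parent
                heapq.heappush(open_list, (f(succ), succ))
    return None

# Greedy variant (f = h) on a small hypothetical graph
graph = {'S': ['A', 'B', 'C'], 'A': [], 'B': ['G'], 'C': ['G'], 'G': []}
h = {'S': 8, 'A': 8, 'B': 4, 'C': 3, 'G': 0}
path = best_first_search('S', 'G', graph.__getitem__, h.__getitem__)
```

Plugging in different f(n) definitions gives the strategies that follow (greedy search, A search).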
-
Greedy Search
Greedy search is a best-first search strategy with the simple evaluation function f(n) = h(n)
It relies only on the heuristic to select what is currently believed to be closest to the goal state
-
Greedy Search Example
f(n) = h(n)
# of states tested: 3, expanded: 2
State  Open            Closed
--     S:8             --
S      C:3, B:4, A:8   S
C      G:0, B:4, A:8   S,C
G      Goal
Path: S-C-G
Cost: 13
-
Greedy Search Issues
Greedy search is generally faster than the uninformed methods
It has more knowledge about the problem domain!
It resembles DFS in that it tends to follow a path that is initially good, and thus it is:
Not complete (could chase an infinite path, or get caught in cycles if no open/closed lists are kept)
Not optimal (a better solution could exist through an expensive node)
-
Algorithm A Search
To try to solve the problems of greedy search, we can conduct an A search by defining our evaluation function f(n) = g(n) + h(n)
g(n) is the minimal-cost path from the initial node to the current node
This adds a UCS-like component to the search:
g(n) is the cost to reach n
h(n) is the estimated cost from n to a goal
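A minimal sketch of the idea in Python (the weighted graph and the all-zero heuristic are hypothetical; with h = 0 the search behaves like UCS):

```python
import heapq

def a_star(start, goal, neighbors, h):
    """A search with f(n) = g(n) + h(n): g is the cheapest known
    cost from the start, h the heuristic estimate to the goal."""
    g = {start: 0}
    open_list = [(h(start), start)]
    closed = set()
    while open_list:
        _, node = heapq.heappop(open_list)
        if node == goal:
            return g[node]                # cost of the path found
        if node in closed:
            continue
        closed.add(node)
        for succ, step_cost in neighbors(node):
            new_g = g[node] + step_cost
            if new_g < g.get(succ, float('inf')):
                g[succ] = new_g
                heapq.heappush(open_list, (new_g + h(succ), succ))
    return None

# Tiny hypothetical weighted graph: the direct S-G edge costs 10,
# but the detour through A costs only 2 in total.
edges = {'S': [('A', 1), ('G', 10)], 'A': [('G', 1)], 'G': []}
cost = a_star('S', 'G', edges.__getitem__, lambda n: 0)
```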
-
Algorithm A Search Issues
A search is an informed strategy that incorporates both the real costs and the heuristic function to find better solutions
However, if our heuristic makes certain errors (e.g. overestimating costs along the path to a goal):
Still not complete
Still not optimal
-
Admissible Heuristics
Heuristic functions are good for helping us find good solutions quickly
But it's hard to design accurate heuristics!
They can be expensive to compute
They can make errors estimating the costs
These problems keep informed searches like the A search from being complete and optimal
-
Admissible Heuristics
There is hope! We can add a constraint on our heuristic function that, for all nodes n in the search space, h(n) <= h*(n)
Where h*(n) is the true minimal cost from n to a goal
When h(n) <= h*(n), we say that h(n) is admissible
Admissible heuristics are inherently optimistic (i.e. they never overestimate the cost to a goal)
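A standard example of an admissible heuristic (not from these slides): on a grid with unit-cost 4-directional moves, the Manhattan distance to the goal can never exceed the true path cost:

```python
def manhattan(p, q):
    """Manhattan distance between two grid cells (row, col).
    With unit-cost 4-directional moves, any real path must take at
    least this many steps, so it never overestimates: admissible."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])
```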
-
A* Search Example
f(n) = g(n) + h(n)
# of states tested: 4, expanded: 3
State  Open                           Closed
--     S:8                            --
S      A:9, B:9, C:11                 S
A      B:9, G:10, C:11, D:∞, E:∞     S,A
B      G:9, C:11, D:∞, E:∞           S,A,B
G      Goal
Path: S-B-G
Cost: 9
-
Proof ofA* Optimality
Let:
G1 be the optimal goal
G2 be some other goal
f* be the cost of the optimal path (to G1)
n be some node on the optimal path, but not on a path to G2
Assume that G2 is found using A* search, where f(n) = g(n) + h(n) and h(n) is admissible
i.e. A* finds a sub-optimal path, which it shouldn't
-
Proof ofA* Optimality
g(G2) > f*
  by definition, G2 is sub-optimal
f(n) <= f*
  by admissibility: since f(n) never overestimates the cost to the goal, it is at most the cost of the optimal path
f(G2) <= f(n)
  G2 must be chosen over n, by our assumption
f(G2) <= f*
  by transitivity of the <= operator
-
Proof ofA* Optimality
f(G2) <= f* (from previous slide)
g(G2) + h(G2) <= f*
  by substituting the definition of f(n)
g(G2) <= f*
  since G2 is a goal node, h(G2) = 0
This contradicts the assumption that G2 is suboptimal (g(G2) > f*), thus A* is optimal in terms of path cost
A* never finds a sub-optimal goal
-
Devising Heuristics
Often done by "relaxing" the problem
See AI: A Modern Approach for more details
The goal of admissible heuristics is to get as close as possible to the actual cost without going over
Trade-off: a really good h(n) might be expensive to compute
Could we find the solution faster with a simpler one?
-
Devising Heuristics
If h(n) = h*(n) for all n:
Only nodes on the optimal solution path are searched
No unnecessary work
We know the actual cost from n to the goal
If h(n) = 0 for all n:
The heuristic is still admissible
A* is identical to UCS
The closer h is to h*, the fewer nodes will need to be expanded, and the more accurate the search will be
-
Devising Heuristics
If h1(n) <= h2(n) <= h*(n) for each non-goal node n:
We say h2 dominates h1
h2 is closer to the actual cost, but is still admissible
A* using h1 (i.e. A1*) expands at least as many nodes as A2*, if not more
A2* is said to be better informed than A1*
-
Devising Heuristics
Let's revisit our Madison-to-El-Paso example, but instead of optimizing dollar cost, we're trying to reduce travel distance
What's a good heuristic to use for this problem?
Is it admissible?
-
Optimal Searching
Optimality isn't always required (i.e. you want some solution, not the best solution)
h(n) need not (necessarily) be admissible
Greedy search will often suffice
This is all problem-dependent, of course
Can result in fewer nodes being expanded, and a solution can be found faster
-
Partial Searching
So far we've discussed algorithms that try to find a path from the initial state to a goal state
These are called partial search strategies because they build up partial solutions, which could enumerate the entire search space to find solutions
This is OK for small toy-world problems
This is not OK for NP-Complete problems or those with exponential search spaces
-
Next Time
We will discuss complete search strategies, in
which each state/node represents a complete,
possible solution to the problem
We will also discuss optimization search,
where we try to find such complete solutions
that are optimal (the best)
-
Next Time
Optimization!!
Still in Chapter 4
-
Searching: So Far
We've discussed how to build goal-based and utility-based agents that search to solve problems
We've also presented both uninformed (or blind) and informed (or heuristic) approaches for search
What we've covered so far are called partial search strategies because they build up partial solutions, which could enumerate the entire state space before finding a solution
-
Optimization
Problems where we search through complete solutions to find the best solution are often referred to as optimization problems
Most optimization tasks belong to a class of computational problems called NP: Non-deterministic Polynomial time solvable
Computationally very hard problems
For NP problems, state spaces are usually exponential, so partial search methods aren't time or space efficient
-
Polynomial time
In computational complexity theory, polynomial time refers to the computation time of a problem where the run time, m(n), is no greater than a polynomial function of the problem size, n.
Written mathematically using big-O notation, this states that m(n) = O(n^k), where k is some constant that may depend on the problem.
For example, the Quick Sort sorting algorithm on n integers performs at most A·n^2 operations for some constant A. Thus it runs in O(n^2) time and is a polynomial-time algorithm.
-
Optimization Problems
The k-Queens Problem
Of course this isn't real chess
-
Optimization Problems
Traveling Salesman Problem (TSP)
Perhaps the most famous optimization problem
-
Optimization Problems
As it turns out, many real-world problems that we might want an agent to solve are similarly hard optimization problems:
Bin-packing
Logistics planning
VLSI layout/circuit design
Theorem-proving
Navigation/routing
Production scheduling, supply/demand
Learning the parameters for a neural network
-
Optimization Problems
For optimization (also sometimes called
constraint-satisfaction) problems, there is a
well-defined objective function that we are
trying to optimize
-
Satisfiability (SAT)
Given:
Some logical formula
An array of binary variables in the formula
Do:
Find a truth assignment for all variables such that the formula is satisfied (true)
-
Satisfiability (SAT)
For example, given the following formula with 8 clauses and 10 variables:
(x1 ∨ x2 ∨ x3) ∧ (x2 ∨ x10) ∧ (x2) ∧ (x4 ∨ x10) ∧ (x3 ∨ x5) ∧ (x4 ∨ x2 ∨ x5) ∧ (x1 ∨ x6 ∨ x7) ∧ (x8 ∨ x10)
We need to find a 10-bit array that makes the formula logically true:
There are 2^10 = 1024 possible binary arrays
Only 32 of them (~3%) are solutions to this formula
-
Greedy Search for SAT
A state is a 10-bit array x, e.g. x = 0101010101
For this array, x1 = 0, x2 = 1, etc.
Our actions are to toggle any single bit in the array to generate a new one
Our heuristic (or objective function) will be to minimize the number of clauses in the formula that are unsatisfied by the candidate string
We are trying to satisfy them all
-
Greedy Search for SAT
Greedy search does the right thing here, in that it does find a solution, and quickly
However, it only expanded 4 out of the 35 nodes that are generated in the search (i.e. placed in the open list)
You may work it all out yourself if you wish
It also found a direct route, and we don't need to remember the path, so storing all those extra states pretty much wasted space!
-
Local Search
Local search is a type of greedy, complete search that focuses on a specific (or local) part of the search space, rather than trying to branch out into all of it
We only consider the neighborhood of the current state, rather than the entire state space seen so far (so as not to waste time/space)
-
Beam Search
One type of local search is beam search, which uses f(n), as in other informed searches, but uses a "beam" with a width w to restrict the possible search directions
Only keep the w best nodes in the open list, and throw the rest away
More space-efficient than best-first search, but can throw away nodes on a solution path
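A minimal sketch, assuming lower f is better and a successor function supplied by the caller:

```python
import heapq

def beam_search(start, is_goal, successors, f, width):
    """Beam search: expand the current beam, then keep only the
    `width` best successors by f -- the rest are thrown away,
    so a node on the solution path can be lost."""
    beam = [start]
    while beam:
        for node in beam:
            if is_goal(node):
                return node
        candidates = [s for n in beam for s in successors(n)]
        beam = heapq.nsmallest(width, candidates, key=f)
    return None
```

With width w = 1 this degenerates to pure greedy descent; larger w trades memory for a better chance of keeping a solution path in the beam.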
-
Beam Search Example
-
Hill-Climbing (HC)
CURRENT = initialState              // initialize the search
loop forever {                      // loop until optimum is found
    NEXT = highest-valued successor of CURRENT
    if score(CURRENT) better than score(NEXT) then
        return CURRENT
    else
        CURRENT = NEXT
}
// we can modify this algorithm to test whether or
// not CURRENT is a goal state, if optimality isn't
// important, as with k-Queens or SAT
-
Hill-Climbing for SAT
How do we represent the problem?
States, actions, and objective function
We've seen this before
States: a binary array that corresponds to variable truth assignments (e.g. 1010101010 for 10 variables)
Actions: toggle a single bit on/off
Objective function: minimize the number of unsatisfied clauses
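The hill-climbing loop above and this SAT representation can be combined into a runnable sketch; the 3-variable formula below is a made-up toy instance, not the lecture's 10-variable one:

```python
def hill_climb(state, score, successors):
    """Basic hill-climbing (lower score is better): move to the
    best successor until no successor improves on CURRENT."""
    while True:
        best = min(successors(state), key=score)
        if score(best) >= score(state):
            return state                  # local (possibly global) optimum
        state = best

# Hypothetical 3-variable CNF: (x1 or x2) and (not-x1 or x3) and (x3)
clauses = [[1, 2], [-1, 3], [3]]

def unsat(a):
    """Number of unsatisfied clauses under assignment a."""
    return sum(1 for c in clauses
               if not any((a[abs(l) - 1] == 1) == (l > 0) for l in c))

def flips(a):
    """All single-bit toggles of a."""
    return [a[:i] + [1 - a[i]] + a[i + 1:] for i in range(len(a))]

solution = hill_climb([0, 0, 0], unsat, flips)
```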
-
Hill-Climbing for k-Queens
How do we represent the problem?
States, actions, and objective function
This is a little tricky
States: a k×k chess board with k queens
Actions: move a single queen to any of its legal positions up, down, or diagonally
Objective function: minimize the number of conflicts between queens
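The objective function can be sketched as a pair-counting loop; the one-queen-per-column encoding below is an assumed simplification of the board state, not the slides' representation:

```python
def conflicts(rows):
    """Objective for k-Queens: number of attacking pairs.
    rows[i] is the row of the queen in column i (one queen per
    column -- an assumed simplification of the board state)."""
    n = len(rows)
    return sum(1
               for i in range(n)
               for j in range(i + 1, n)
               if rows[i] == rows[j]                  # same row
               or abs(rows[i] - rows[j]) == j - i)    # same diagonal
```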
-
Hill-Climbing for TSP
How do we represent the problem?
States, actions, and objective function
This is even trickier
States: an n-city tour (e.g. 1-4-6-2-5-3 is a 6-city tour)
Actions: swap any two cities in the tour (so the tour is still valid given the problem definition)
Objective function: minimize the total cost or length of the entire tour
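A sketch of the objective and action set, assuming a symmetric distance matrix:

```python
import itertools

def tour_length(tour, dist):
    """Objective for TSP: total length of the closed tour, where
    dist[a][b] is the distance between cities a and b."""
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]]
               for i in range(len(tour)))

def swaps(tour):
    """Actions: every tour reachable by swapping two cities."""
    for i, j in itertools.combinations(range(len(tour)), 2):
        succ = tour[:]
        succ[i], succ[j] = succ[j], succ[i]
        yield succ
```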
-
Hill-Climbing Issues
The solution found by HC is totally determined by the initial state
How should it be initialized?
Should it be fixed or random?
Maybe we just want to get started somewhere in the search space
HC can frequently get stuck in local optima or plateaux, without finding the global optimum
-
Objective Surfaces
The objective surface is a plot of the objective function's landscape
The various levels of optimality can be seen on the objective surface
Getting stuck in local optima can be a major problem!
-
Example Objective Surface
-
Escaping Local Optima
Searching with HC is like scaling Mount Everest in a thick fog with one arm and amnesia
Local optima are OK, but sometimes we want to find the absolute best solution
Ch. 5 & 6 of How to Solve It: Modern Heuristics have a better discussion of techniques for escaping local optima than AI: A Modern Approach
-
Escaping Local Optima
There are several ways we can try to avoid
local optima and find more globally optimal
solutions:
Random Restarting
Simulated Annealing
Tabu Search
-
Random Restarting
"If at first you don't succeed, try, try again!"
The idea here is to run the standard HC search algorithm several times, each with a different, randomized initial state
Of course, depending on the state space, this can be a difficult task in and of itself: not all states that can be generated are legal for some problems
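The restart bookkeeping can be sketched generically (`climb` stands for any hill-climbing routine, `initial_states` for any source of randomized start states; both names are my own):

```python
def random_restart(climb, initial_states, score):
    """Run a hill-climbing routine from several (randomized)
    initial states; keep the best local optimum found overall
    (lower score is better)."""
    best = None
    for state in initial_states:
        result = climb(state)
        if best is None or score(result) < score(best):
            best = result
    return best
```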
-
Random Restarting
Since HC is a local search strategy, trying multiple initial states allows us to locally explore a wider range of the search space
If we pick lucky initial states, we can find the global optimum!
-
Random Restarting
It turns out that, if each HC run has a probability p of success, the number of restarts needed is approximately 1/p
For example, with 8-Queens, there is a probability of success p ≈ 0.14 ≈ 1/7
So, on average, we would need only 7 randomly initialized trials of the basic HC search to find a solution
-
Random Restarting
Random-restart approaches are built into many state-of-the-art constraint-satisfaction algorithms
They've been shown to be especially useful in systems geared toward solving hard SAT problems:
GSAT
Davis-Putnam (DPLL with restarts)
-
Simulated Annealing (SA)
We don't always want to take the best local move; sometimes we might want to:
Try taking uphill moves that aren't the best
Actually go downhill to escape local optima
We can alter HC to allow for these possibilities:
Modify how successor states are selected
Change the criteria for accepting a successor
-
Simulated Annealing (SA)
With standard Hill-Climbing:
We explore all of the current state's actions/successors
We accept the best one
Perhaps we can modify this to account for the other kinds of moves we'd like to make:
Choose one action/successor at random
If it is better, accept it; otherwise accept it with some probability p
-
Simulated Annealing (SA)
These changes allow us to take a variety of new moves, but they have problems:
The chance of taking a bad move is the same at the beginning of the search as at the end
The magnitude of a move's effect is ignored
We can replace p with a temperature T which decreases over time
Since T "cools off" over the course of the search, we call this approach simulated annealing
-
Simulated Annealing (SA)
Concepts behind the SA analogy:
-
Simulated Annealing (SA)
Let ΔE = score(NEXT) − score(CURRENT)
p = e^(ΔE/T) (Boltzmann equation)
As ΔE → −∞, p → 0:
The worse a move is, the probability of taking it decreases exponentially
As time → ∞, T → 0:
As time increases, the temperature decreases, in accordance with a cooling schedule
As T → 0, p → 0:
As the temperature decreases, the probability of taking a bad move also decreases
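The acceptance rule follows directly from the Boltzmann equation (a minimal sketch; the sign convention matches the slides, where a worse move has negative ΔE):

```python
import math

def accept_probability(delta_e, temperature):
    """Probability of accepting a move under simulated annealing,
    where delta_e = score(NEXT) - score(CURRENT) (negative = worse)
    and p = e^(delta_e / T) per the Boltzmann equation."""
    if delta_e > 0:
        return 1.0                    # always take improving moves
    return math.exp(delta_e / temperature)
```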
-
Simulated Annealing (SA)
CURRENT = initialState              // initialize the search
for TIME = 1 to ∞ do {
    T = schedule(TIME)              // elapsed time affects schedule
    if T = 0 then                   // T has totally cooled
        return CURRENT
    NEXT = random successor of CURRENT
    ΔE = score(NEXT) − score(CURRENT)
    if ΔE > 0 then
        CURRENT = NEXT              // take all good moves
    else
        CURRENT = NEXT with probability e^(ΔE/T)
}
-
Simulated Annealing (SA)
According to thermodynamics, to grow a crystal:
Start by heating the raw materials into a molten state
The crystal melt is then cooled until it is frozen
If the temperature is reduced too quickly, irregularities occur and the crystal does not reach its ground state (e.g. more energy is trapped in the structure)
By analogy, SA relies on a good cooling schedule, which maps the current time to a temperature T, to find the optimal solution
Usually exponential
Can be very difficult to devise
-
Simulated Annealing (SA)
SA was first used to solve layout problems for VLSI (very large-scale integration) computer architectures in the 1980s
Optimally fitting hundreds of thousands of transistors into a single compact microchip
It has also proven useful for the TSP, and is used in many factory-scheduling software systems
-
Tabu Search
Tabu search is a way to add memory to a local search strategy, and force it to explore new areas of the search space
We've seen state-based memory before with the closed list, but this memory:
Tracks actions taken rather than states expanded
Is designed to be a limited (short-term) memory
Moves that have been seen or taken too recently or too often become "tabu" (or taboo)
-
Tabu Search
We maintain an array M which tracks time-stamps of the actions we've taken
We store in location M[i] the most recent time action i was taken in the search
The key parameter of tabu search is the horizon: how long should a certain move remain tabu?
If we set this too small, we may default to normal HC and stay stuck in local optima
If we set it too large, we may run out of legal moves!
Usually problem-dependent
-
Tabu Search
CURRENT = initialState              // initialize search
BEST = CURRENT                      // retain best solution so far
for TIME = 1 to MAX_TIME do {
    NEXT = best legal successor of CURRENT
    ACTION = action that generated NEXT
    M[ACTION] = tabu info based on horizon & TIME
    CURRENT = NEXT                  // take next move regardless
    if score(CURRENT) better than score(BEST) then
        BEST = CURRENT
}
return BEST
-
Tabu Search
Instead of an array, memory can also be stored in a queue, or "tabu list":
As a move is made, place it in the queue
When the queue becomes full, the oldest move is removed and becomes legal again
The size of the queue is the horizon
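The queue variant maps naturally onto a bounded deque (a minimal sketch; the class name and API are my own, not from the slides):

```python
from collections import deque

class TabuList:
    """Short-term memory of recent moves: a fixed-size queue whose
    length is the tabu horizon; when it fills, the oldest move is
    evicted automatically and becomes legal again."""
    def __init__(self, horizon):
        self._queue = deque(maxlen=horizon)

    def add(self, move):
        self._queue.append(move)

    def is_tabu(self, move):
        return move in self._queue
```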
-
Tabu Search
Since we take the best non-tabu move at each step, we are allowed to take backward steps or just-OK moves, as with simulated annealing
Tabu search can also be faster than standard HC, as it doesn't have to evaluate all actions/successors, just those that are legal
-
Tabu Search
Breaking the rules:
Since we retain the best solution so far, we sometimes might want to make tabu moves anyway, if they are better than anything we've previously seen
This is called the aspiration criterion: a tabu move is allowed if it "aspires" to be better than all previously seen solutions
-
Summary
There are several effective ways of escaping local optima in local search, which exploit different properties:
Random restarting tries several times from different parts of the search space
Simulated annealing allows for a variety of moves by searching stochastically
Tabu search is deterministic, but incorporates memory to force exploration of the state space