trading optimality for speed…

Trading optimality for speed… The admissibility condition guarantees that an optimal path is found. In path planning, a near-optimal path can be satisfactory. Try to minimise search instead of minimising cost, i.e. find a near-optimal path (quickly).

Upload: watson

Post on 06-Jan-2016


TRANSCRIPT

Page 1: Trading optimality for speed…

Trading optimality for speed…

The admissibility condition guarantees that an optimal path is found.
In path planning, a near-optimal path can be satisfactory.
Try to minimise search instead of minimising cost, i.e. find a near-optimal path (quickly).

Page 2: Trading optimality for speed…

CSC344: AI for Games

Lecture 6: Online and local search

Patrick Olivier


Page 3: Trading optimality for speed…

Weighting…

$f_w(n) = (1 - w)\,g(n) + w\,h(n)$

w = 0.0 (breadth-first: f depends on g alone)
w = 0.5 (A*: f is proportional to g + h, so nodes are ordered as in A*)
w = 1.0 (best-first, with f = h)

Trading safety/optimality for speed: weight towards h when confident in the estimate of h.
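The weighting above can be sketched as a single best-first search whose priority is f_w(n). This is a minimal illustration, not the lecture's own code: the names `weighted_search` and `grid_neighbours`, the open 10 × 10 grid, and the Manhattan heuristic are all assumptions chosen for the example.

```python
import heapq


def weighted_search(start, goal, neighbours, h, w=0.5):
    """Best-first search ordered by f_w(n) = (1 - w) * g(n) + w * h(n).

    neighbours(n) yields (successor, step_cost) pairs; h(n) estimates the
    remaining cost. Returns (path, cost, nodes_expanded).
    """
    frontier = [(w * h(start), 0, start)]  # entries are (f, g, node)
    best_g = {start: 0}
    parent = {start: None}
    expanded = 0
    while frontier:
        _, g, node = heapq.heappop(frontier)
        if g > best_g[node]:
            continue  # stale entry: a cheaper route to this node was found later
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1], g, expanded
        expanded += 1
        for nxt, cost in neighbours(node):
            g2 = g + cost
            if g2 < best_g.get(nxt, float("inf")):
                best_g[nxt] = g2
                parent[nxt] = node
                heapq.heappush(frontier, ((1 - w) * g2 + w * h(nxt), g2, nxt))
    return None, float("inf"), expanded


def grid_neighbours(size):
    """Hypothetical test world: 4-connected moves on an open grid, step cost 1."""
    def neighbours(node):
        x, y = node
        for dx, dy in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nx, ny = x + dx, y + dy
            if 0 <= nx < size and 0 <= ny < size:
                yield (nx, ny), 1
    return neighbours


if __name__ == "__main__":
    goal = (9, 9)
    manhattan = lambda n: abs(goal[0] - n[0]) + abs(goal[1] - n[1])
    for w in (0.0, 0.5, 1.0):
        path, cost, expanded = weighted_search(
            (0, 0), goal, grid_neighbours(10), manhattan, w)
        print(f"w={w}: cost={cost}, nodes expanded={expanded}")
```

On this obstacle-free grid every weight returns a shortest path, but w = 1.0 expands far fewer nodes than w = 0.0. With obstacles, w = 1.0 may return a longer path, which is the optimality being traded away.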

Page 4: Trading optimality for speed…

Local search algorithms

In many optimisation problems, paths are irrelevant; the goal state is the solution.
State space = set of "complete" configurations.
Find a configuration satisfying constraints, e.g. n-queens: n queens on an n × n board with no two queens on the same row, column, or diagonal.
Use local search algorithms, which keep a single "current" state and try to improve it.

Page 5: Trading optimality for speed…

Hill-climbing search

"Climbing Everest in thick fog with amnesia"
We can set up an objective function to be "best" when large (perform hill climbing)
…or we can use the previous formulation of the heuristic and minimise the objective function (perform gradient descent)
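As a rough sketch of the minimising formulation, the code below runs steepest descent on n-queens, taking H(n) to be the number of attacking queen pairs. The representation (one queen per row, `cols[r]` giving its column) and the names `attacking_pairs` and `hill_climb` are assumptions made for illustration.

```python
def attacking_pairs(cols):
    """H(n): number of queen pairs that attack each other.

    cols[r] is the column of the queen in row r, so rows never clash.
    """
    n = len(cols)
    return sum(
        1
        for a in range(n)
        for b in range(a + 1, n)
        if cols[a] == cols[b] or abs(cols[a] - cols[b]) == b - a
    )


def hill_climb(cols):
    """Move one queen within its row to the best neighbour until none improves H."""
    cols = list(cols)
    current = attacking_pairs(cols)
    while current > 0:
        best_move, best_h = None, current
        for row in range(len(cols)):
            original = cols[row]
            for col in range(len(cols)):
                if col == original:
                    continue
                cols[row] = col  # try the move, score it, then undo it below
                h = attacking_pairs(cols)
                if h < best_h:
                    best_move, best_h = (row, col), h
            cols[row] = original
        if best_move is None:
            break  # no strictly better neighbour: local minimum or plateau
        cols[best_move[0]] = best_move[1]
        current = best_h
    return cols, current
```

A call such as `hill_climb([0, 1, 2, 3, 4, 5, 6, 7])` returns the final board and its H value. A non-zero H means the search stopped where no single move improves the state. To maximise instead, the same loop can climb on 1/(1+H(n)).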

Page 6: Trading optimality for speed…

Local maxima/minima

Problem: depending on the initial state, the search can get stuck in local maxima/minima.

1/(1+H(n)) = 1/17

1/(1+H(n)) = 1/2

Local minima

Page 7: Trading optimality for speed…

Local beam search

Keep track of k states rather than just one.
Start with k randomly generated states.
At each iteration, all the successors of all k states are generated.
If any one is a goal state, stop; else select the k best successors from the complete list and repeat.
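The four steps above map almost line for line onto code. The sketch below is one possible reading: the function name, its callback parameters, and the toy problem (integers 0 to 100 with a hypothetical target of 42) are assumptions, not part of the slides.

```python
import random


def local_beam_search(k, random_state, successors, value, is_goal, max_iters=1000):
    """Keep the k best states; higher value(state) means better."""
    states = [random_state() for _ in range(k)]  # k randomly generated states
    for _ in range(max_iters):
        for s in states:
            if is_goal(s):
                return s
        # Generate all successors of all k states into one complete list.
        pool = [t for s in states for t in successors(s)]
        if not pool:
            break
        for t in pool:
            if is_goal(t):
                return t
        # Select the k best successors from the complete list and repeat.
        states = sorted(pool, key=value, reverse=True)[:k]
    return max(states, key=value)


if __name__ == "__main__":
    rng = random.Random(0)
    target = 42  # hypothetical goal for the toy problem
    found = local_beam_search(
        k=3,
        random_state=lambda: rng.randrange(101),
        successors=lambda x: [y for y in (x - 1, x + 1) if 0 <= y <= 100],
        value=lambda x: -abs(x - target),
        is_goal=lambda x: x == target,
    )
    print(found)
```

Because the k survivors are chosen from the pooled successor list, the beam can collapse onto the descendants of a single promising state, which is what distinguishes this from k independent hill climbs.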

Page 8: Trading optimality for speed…

Simulated annealing search

Idea: escape local maxima by allowing some "bad" moves, but gradually decrease their frequency and range.
(Used in VLSI layout, scheduling)
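A minimal sketch of the idea, under stated assumptions: worse moves are accepted with probability exp(delta / T), and the temperature T shrinks geometrically. The slide does not specify an acceptance rule or cooling schedule, so both are assumptions here, as are the parameter names `t0`, `cooling`, and `steps`.

```python
import math
import random


def simulated_annealing(start, neighbour, value, t0=1.0, cooling=0.995,
                        steps=5000, rng=None):
    """Maximise value(state), occasionally accepting worse neighbours.

    neighbour(state, rng) returns a random nearby state. The geometric
    cooling schedule (t *= cooling) is an assumed choice, not the only one.
    """
    rng = rng or random.Random()
    current = best = start
    t = t0
    for _ in range(steps):
        candidate = neighbour(current, rng)
        delta = value(candidate) - value(current)
        # Always accept improvements; accept a "bad" move with a probability
        # that falls as the move gets worse and as the temperature drops.
        if delta >= 0 or rng.random() < math.exp(delta / t):
            current = candidate
            if value(current) > value(best):
                best = current
        t *= cooling
    return best


if __name__ == "__main__":
    # Hypothetical bumpy objective on the integers 0..100.
    value = lambda x: -abs(x - 70) + (5 if x % 7 == 0 else 0)
    step = lambda x, rng: min(100, max(0, x + rng.choice((-1, 1))))
    print(simulated_annealing(10, step, value, rng=random.Random(0)))
```

Early on, with T near `t0`, small downhill steps are accepted often; late in the run the search behaves like plain hill climbing.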

Page 9: Trading optimality for speed…

Simulated annealing example

Point feature labelling

Page 10: Trading optimality for speed…

Genetic algorithm search

A successor state is generated by combining two parent states.
Start with k randomly generated states (population).
A state is represented as a string over a finite alphabet (often a string of 0s and 1s).
Evaluation function (fitness function): higher values for better states.
Produce the next generation of states by selection, crossover, and mutation.
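The loop of selection, crossover, and mutation might look like the sketch below. The specific operators (fitness-proportional selection, single-point crossover, per-bit flip mutation), the parameter defaults, and the "count the 1s" fitness in the demo are assumptions picked to keep the example short.

```python
import random


def genetic_algorithm(fitness, length=20, k=30, generations=100,
                      mutation_rate=0.02, seed=0):
    """Evolve bit strings; fitness(s) must be non-negative, higher is better."""
    rng = random.Random(seed)
    # Start with k randomly generated states (the population).
    population = ["".join(rng.choice("01") for _ in range(length))
                  for _ in range(k)]
    best = max(population, key=fitness)
    for _ in range(generations):
        # Selection: parents are drawn with probability proportional to fitness.
        weights = [fitness(s) + 1e-9 for s in population]  # avoid all-zero weights
        next_gen = []
        for _ in range(k):
            mum, dad = rng.choices(population, weights=weights, k=2)
            # Crossover: splice the parents at a random cut point.
            cut = rng.randrange(1, length)
            child = mum[:cut] + dad[cut:]
            # Mutation: flip each bit with a small probability.
            child = "".join(
                ("1" if bit == "0" else "0") if rng.random() < mutation_rate else bit
                for bit in child
            )
            next_gen.append(child)
        population = next_gen
        best = max(population + [best], key=fitness)
    return best


if __name__ == "__main__":
    print(genetic_algorithm(lambda s: s.count("1")))
```

Each generation costs k fitness evaluations plus the selection overhead, so the run time grows with both population size and the number of generations.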

Page 11: Trading optimality for speed…

Genetic algorithms in games

Computationally expensive, so primarily an offline form of learning.
Cloak, Dagger & DNA (Oidian Systems)
  4 DNA strands defining opponent behaviour
  between battles, opponents play each other
Creatures (Millennium Interactive)
  Genetic algorithms are used to learn the weights in a neural network that defines behaviour

Page 12: Trading optimality for speed…

“Real-time” search concepts

In A* the whole path is computed off-line, before the agent walks through the path.
This solution is only valid for static worlds.
If the world changes in the meantime, the initial path is no longer valid:
  new obstacles appear
  position of goal changes (e.g. moving target)

Page 13: Trading optimality for speed…

“Real-time” definitions

Off-line (non real-time): the solution is computed in a given amount of time before being executed.
Real-time: one move is computed at a time, and that move is executed before computing the next.
Anytime: the algorithm constantly improves its solution over time and can provide a “current best” at any time.

Page 14: Trading optimality for speed…

Agent-based (online) search

For example:
  mobile robot
  NPC without perfect knowledge
  agent that must act now with limited information
Planning and execution are interleaved.
Could apply standard search techniques:
  Best-first (but we know it is poor)
  Depth-first (has to physically back-track)
  A* (but nodes in the fringe are not accessible)

Page 15: Trading optimality for speed…

LRTA*: Learning Real-time A*

Augment hill-climbing with memory
Store “current best estimate”
Follow path based on neighbours’ estimates
Update estimates based on experience
Experience Learning
Flatten out local maxima…

Page 16: Trading optimality for speed…

LRTA*: example

8 9 2 2 4 11 1 1 1 1

8 9 3 2 4 11 1 1 1 1

8 9 3 4 4 11 1 1 1 1

8 9 5 4 4 11 1 1 1 1

8 9 5 5 4 11 1 1 1 1
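The rows above can be reproduced with a short simulation. It rests on an interpretation of the flattened figure: the first five numbers of each row are read as the stored cost estimates of five states in a line, the trailing 1s as unit step costs, and the agent is assumed to start on the third state. The function name `lrta_star_chain` is hypothetical.

```python
def lrta_star_chain(h, start, steps, step_cost=1):
    """Run LRTA*-style updates on a one-dimensional chain of states.

    h holds the stored estimate for each state. At each step the agent
    scores every neighbour as step_cost + estimate, stores the best score
    as the new estimate for the state it is leaving, and moves there.
    """
    h = list(h)
    pos = start
    history = [tuple(h)]
    for _ in range(steps):
        options = [
            (step_cost + h[nxt], nxt)
            for nxt in (pos - 1, pos + 1)
            if 0 <= nxt < len(h)
        ]
        best_cost, best_next = min(options)
        h[pos] = best_cost  # update the estimate based on experience
        pos = best_next     # follow the neighbour with the lowest estimate
        history.append(tuple(h))
    return history, pos


if __name__ == "__main__":
    rows, final_pos = lrta_star_chain([8, 9, 2, 2, 4], start=2, steps=4)
    for row in rows:
        print(*row)
```

Under these assumptions the agent moves right, left, right, right. Each time it leaves a state it raises that state's estimate, so the dip around the two 2s is gradually filled in until the agent is no longer drawn back into it.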

Page 17: Trading optimality for speed…

Learning real-time A*