G52APT AI Programming Techniques

Lecture 14: Regression planning

Brian Logan School of Computer Science

[email protected]

© Brian Logan 2012 G52APT Lecture 14: Regression planning 2

Outline of this lecture

•  recap: simple forward planning

• problems of forward planning

•  regression planning

• goal stack planning

•  clobbering

Recap: planning

•  like search, planning involves choosing a sequence of actions which, if executed, will achieve a given state of the world

• however there are important differences:

– relies on factored representations for states and goals (sets of properties, usually expressed in logic)

– operator applicability and effects are defined in terms of these properties, and are limited to parts of the state

– may use reasoning to infer the consequences of possible actions

Recap: divide and conquer

•  in search, there is no easy way to break a large problem down into sub-problems

•  all a search algorithm has to work with is a list of applicable operators (and possibly a heuristic evaluation of the successor states)

•  in contrast, planners exploit localised representations to:

–  break the problem down into sub-problems, e.g., by considering each of the properties (goals) defining the goal state in turn

– select appropriate operators to solve each sub-problem

•  as in local search, some planners also relax the requirement that solutions are constructed sequentially

Recap: STRIPS

•  states are represented as conjunctions of fluents (function-free ground literals)

• goals are represented by conjunctions of literals, possibly containing existentially quantified variables

•  actions are represented by operators which specify:

– the name of the action

– the precondition—a conjunction of positive literals specifying what must be true for the action to be applicable (PDDL is less restrictive)

– the postcondition—a conjunction of literals specifying how the situation changes when the operator is applied

STRIPS continued

•  for example, an operator which stacks one block on top of another in the blocks world could be specified as

[ Block(x), Block(y), x ≠ y, Clear(x), Clear(y), On(x, z) ] MOVE(x, y) [ On(x, y), Clear(z), ¬Clear(y), ¬On(x, z) ]

•  the precondition describes the state(s) in which the action can be performed, and the postcondition describes the state that results from performing the action

•  in the successor state, all the positive literals in the postcondition hold, as do all the literals that held before the action (not only those in the precondition), except for those that are negated in the postcondition

Example: blocks world operators

[ Block(x), Block(y), x ≠ y, Clear(x), Clear(y), On(x, z) ] MOVE(x, y) [ On(x, y), Clear(z), ¬Clear(y), ¬On(x, z) ]

[Block(x), Block(y), Clear(x), On(x, y) ] MOVE-TO-TABLE(x) [ On(x, table), Clear(y), ¬On(x, y) ]

Example: blocks world problem

•  initial state:

On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)

• goal state: On(a,b) ^ On(b,c) ^ On(c,table)

Representing states in Prolog

•  in STRIPS planning, states are sets of fluents

•  e.g., in the blocks world we have:

–  blocks a, b, c, …, which can be represented by the atoms a, b, c, …

– properties of blocks, e.g., Clear(a), and relationships between blocks, e.g., On(a,b), which can be represented by the Prolog terms clear(a), on(a,b), etc.

•  states (e.g., initial state, goal state) can then be represented by lists of terms, e.g., [clear(a), on(a,b), ...], representing a conjunction of fluents
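As a rough illustration of this representation, the sketch below writes the initial state of the blocks world problem as a Prolog list and tests whether a conjunction of goals holds in it. The predicate names `initial_state/1` and `satisfied/2` are made up for this example and are not part of the lecture code; SWI-Prolog is assumed.

```prolog
:- use_module(library(lists)).

% initial_state(-State): hypothetical name; the initial state of the
% blocks world problem as a list of fluents.
initial_state([on(c,a), on(a,table), on(b,table),
               clear(b), clear(c),
               block(a), block(b), block(c)]).

% satisfied(+Goals, +State): every goal in Goals unifies with some
% fluent in State (hypothetical helper for this sketch).
satisfied([], _).
satisfied([G|Gs], State) :-
    member(G, State),
    satisfied(Gs, State).
```

For instance, after `initial_state(S)`, the query `satisfied([on(c,a), clear(b)], S)` succeeds, `satisfied([on(a,b)], S)` fails, and `satisfied([on(a,X)], S)` succeeds with `X = table`.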

Representing operators

•  for example, the operator

[ Block(x), Block(y), x ≠ y, Clear(x), Clear(y), On(x, z) ] MOVE(x, y) [ On(x, y), Clear(z), ¬Clear(y), ¬On(x, z) ]

can be represented as

pop(move(X,Y),                                             % name
    [block(X),block(Y),X \= Y,clear(X),clear(Y),on(X,Z)],  % precondition
    [on(X,Y),clear(Z)],                                    % add list
    [clear(Y),on(X,Z)]).                                   % delete list
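To make the roles of the three lists concrete, here is a minimal sketch that applies one instance of the operator to a state. It assumes SWI-Prolog, writes the type predicate as `block/1` so that it matches the `block(a)` fluents used in the state descriptions, and introduces two hypothetical helpers, `applicable/2` and `result/3`, that are not part of the lecture code.

```prolog
:- use_module(library(lists)).

% The MOVE operator, written with block/1 so that it matches the
% block(a), block(b), ... fluents used in the state descriptions.
pop(move(X,Y),
    [block(X), block(Y), X \= Y, clear(X), clear(Y), on(X,Z)],
    [on(X,Y), clear(Z)],
    [clear(Y), on(X,Z)]).

% applicable(+Pre, +State): hypothetical helper. Inequalities are
% evaluated directly; every other literal must occur in State.
applicable([], _).
applicable([X \= Y|Ps], State) :- !,
    X \= Y,
    applicable(Ps, State).
applicable([P|Ps], State) :-
    member(P, State),
    applicable(Ps, State).

% result(+State, +Action, -NewState): hypothetical helper that applies
% one operator instance using its add and delete lists.
result(State, Action, NewState) :-
    pop(Action, Pre, Add, Delete),
    applicable(Pre, State),
    subtract(State, Delete, Rest),
    append(Add, Rest, NewState).
```

Applying `move(c,b)` in the state `[on(c,a), on(a,table), on(b,table), clear(b), clear(c), block(a), block(b), block(c)]` binds `Z` to `a`, removes `clear(b)` and `on(c,a)`, and adds `on(c,b)` and `clear(a)`. The call fails for `move(a,b)` in the same state because `clear(a)` does not hold.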

Simple forward planning in Prolog

:- use_module([library(lists),library(sets)]).

% fp(+State,+Goals,-Plan): Plan is a sequence of
% operators that applied in State achieves Goals

fp(S,G,P) :-
    fp(S,G,[],R),
    reverse(R,P).

fp(S,G,P,P) :-
    holds(G,S).
fp(S,G,Os,P) :-
    pop(O,Pre,A,D),
    holds(Pre,S),
    \+ member(O,Os),
    apply(S,A,D,S1),
    fp(S1,G,[O|Os],P).

Simple forward planning in Prolog

% holds(+Goals,+State): the goals (or preconditions) Goals hold in State

holds([],_).
holds([X \= Y|Ps],S) :-      % inequality tests are evaluated, not looked up in State
    X \= Y,
    holds(Ps,S).
holds([Pre|Ps],S) :-
    select(Pre,S,S1),
    holds(Ps,S1).

% apply(+State,+AddList,+DeleteList,-NewState): NewState is the result
% of applying the operator with add and delete lists AddList and
% DeleteList in State

apply(S,A,D,S1) :-
    subtract(S,D,S2),
    append(A,S2,S1).

Simple forward planning in Prolog

% Move a block on top of another block

pop(move(X,Y),
    [block(X),block(Y),X \= Y,clear(X),clear(Y),on(X,Z)],
    [on(X,Y),clear(Z)],
    [clear(Y),on(X,Z)]).

% Move a block onto the table

pop(move_to_table(X),
    [block(X),block(Y),clear(X),on(X,Y)],
    [on(X,table),clear(Y)],
    [on(X,Y)]).

Example: blocks world problem 2

•  initial state:

On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)

• goal state: On(a,b)

[Figure: block diagrams of the initial state and the goal state]

Example: blocks world problem 2

•  initial state:

[on(c,a), on(a,table), on(b,table), clear(b), clear(c), block(a), block(b), block(c)]

• goal state: [on(a,b)]

•  for this example problem, the forward planner returns the plan

[move(b,c), move_to_table(b), move(c,b), move(c,a), move_to_table(c), move(a,b)]

• which contains four redundant steps

Problems with forward planning

•  if a plan is found it is guaranteed to achieve all the goals

• goals may be existentially quantified, e.g., [on(a,X)]

• however there are problems:

– plans may contain redundant steps

– planning may not terminate for some problems (due to depth first search)

Forward planner search

•  forward planner performs blind depth-first search of the state space

• order in which plans are returned is determined by

– the order in which operators are tried, and

– the order in which literals appear in the initial state description

Regression planning

•  a better way to solve STRIPS problems is to search backwards from the goal in world (situation) space

– goals are matched against operator postconditions

– operator preconditions become subgoals

– we stop when the operator preconditions (subgoals) are satisfied in the current state

•  resulting plan is a series of instantiated operators which, if applied in the initial state, result in the goal state

•  searching backwards often reduces the branching factor

Goal regression

• given a ground goal G and a ground operator (action) O, the regression from G over O is given by:

G′ = (G − Add-List(O)) + Precondition(O)

•  i.e., we remove from the goals any effects of the action and add the preconditions of the action as new goals

•  regression moves one or more goals from the set of unachieved goals to the set of achieved goals, and possibly adds some new goals to the set of unachieved goals
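The regression formula can be sketched directly with list-as-set operations. This is only an illustration under stated assumptions: SWI-Prolog is assumed (where `subtract/3` and `union/3` live in `library(lists)`), the predicate name `regress/4` is hypothetical, and goals and operators are taken to be ground as the slide requires.

```prolog
:- use_module(library(lists)).

% regress(+Goals, +AddList, +Precondition, -NewGoals):
% NewGoals = (Goals - AddList) + Precondition, with lists used as sets.
% Hypothetical helper, not part of the lecture code.
regress(Goals, Add, Pre, NewGoals) :-
    subtract(Goals, Add, Remaining),
    union(Remaining, Pre, NewGoals).

% Example: regress the Sussman goals over MOVE(a, b) with z = table.
% The inequality a \= b is left out because it is trivially true here.
example_regression(NewGoals) :-
    Goals = [on(a,b), on(b,c), on(c,table)],
    Add   = [on(a,b), clear(table)],
    Pre   = [block(a), block(b), clear(a), clear(b), on(a,table)],
    regress(Goals, Add, Pre, NewGoals).
```

In the example, `on(a,b)` is removed from the goals because the action adds it, and the five preconditions of the action join `on(b,c)` and `on(c,table)` as goals still to be achieved.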

Relevant operators

• only regress over relevant operators

•  an operator is relevant if one of the literals added by the operator unifies with an unachieved goal
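A relevance test along these lines might look as follows. The sketch assumes SWI-Prolog and the `pop/4` operator representation, with `block/1` as the type predicate; the predicate name `relevant/2` is hypothetical.

```prolog
:- use_module(library(lists)).

pop(move(X,Y),
    [block(X), block(Y), X \= Y, clear(X), clear(Y), on(X,Z)],
    [on(X,Y), clear(Z)],
    [clear(Y), on(X,Z)]).
pop(move_to_table(X),
    [block(X), block(Y), clear(X), on(X,Y)],
    [on(X,table), clear(Y)],
    [on(X,Y)]).

% relevant(+Goals, -Op): Op is an operator whose add list contains a
% literal that unifies with one of the (unachieved) Goals.
% Unification instantiates Op as far as the goal determines it.
relevant(Goals, Op) :-
    pop(Op, _Pre, Add, _Delete),
    member(G, Goals),
    member(G, Add).
```

With these two operators, `relevant([on(a,b)], Op)` has the single answer `Op = move(a,b)`, while `relevant([clear(a)], Op)` yields a partially instantiated `move` and a partially instantiated `move_to_table`.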

Blocks world problem again

•  initial state:

On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)

• goal state: On(a,b) ^ On(b,c) ^ On(c,table)

Regression planning in the blocks world

•  the blocks world problem can be decomposed into three subgoals:

On(a, b), On(b, c) and On(c,table)

• we try to achieve each subgoal in turn

•  the first subgoal, On(a, b), is false in the initial situation, so we look for an operator which makes it true, i.e., which has On(a, b) as a postcondition

•  in this case, there is only one, MOVE(a, b), which has preconditions:

Block(a), Block(b), a ≠ b, Clear(a), Clear(b), On(a, z)

Regression planning in the blocks world

• Block(a), Block(b), a ≠ b, Clear(b), and On(a, z) (with z = table) are true in the current state, but Clear(a) is false

•  each unachieved precondition of MOVE(a, b) (in this case Clear(a)) becomes a new subgoal, and we look for an operator to make it true

•  in this case there are two operators we can choose: MOVE-TO-TABLE and MOVE

•  and so on …
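The step of working out which preconditions become new subgoals can be sketched as below. SWI-Prolog is assumed, `unachieved/3` is a hypothetical helper, and the inequality a ≠ b is omitted from the precondition list because it is not a fluent of the state.

```prolog
:- use_module(library(lists)).

% unachieved(+Pre, +State, -Missing): Missing is the list of
% preconditions in Pre that do not hold in State. A precondition with
% a variable, such as on(a,Z), is bound by the first matching fluent.
unachieved([], _, []).
unachieved([P|Ps], State, Missing) :-
    (   memberchk(P, State)
    ->  Missing = Rest
    ;   Missing = [P|Rest]
    ),
    unachieved(Ps, State, Rest).

% Preconditions of MOVE(a, b) checked against the initial state.
example_unachieved(Missing, Z) :-
    State = [on(c,a), on(a,table), on(b,table), clear(b), clear(c),
             block(a), block(b), block(c)],
    Pre   = [block(a), block(b), clear(a), clear(b), on(a,Z)],
    unachieved(Pre, State, Missing).
```

For the initial state of the example this leaves `clear(a)` as the only unachieved precondition and binds `Z` to `table`.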

Goal stack planning

•  the simplest form of regression planning is goal stack planning

• goal stack planning was the approach used by the original STRIPS planner

• performs depth first search backwards from each goal in turn, until it finds a sequence of operators that achieve the goal from the current state

• operators are then applied to give a new current state s′, and planning continues to find a sequence of operators that achieve the next goal, starting from s′

Goal stack planning

•  push the original conjunction of goals on the stack

•  repeat until the stack is empty:

– if the top of the stack is an achieved goal pop it from the stack

– if the top of the stack is a conjunctive goal, push its unachieved subgoals on the stack

– if the top of the stack is a single unachieved goal, replace it by an action that achieves the goal and push the action’s precondition on the stack

– if the top of the stack is an action, pop it from the stack, execute it and update the state using the action’s add and delete lists

Goal stack planning in Prolog

:- use_module([library(lists),library(sets)]).

% gsp(+State,+Goals,-Plan): Plan is a sequence of

% operators that applied in State achieves Goals

gsp(S,G,P) :- gsp(S,G,P,_).

Goal stack planning in Prolog

% gsp(+State,+Goals,-Plan,-NewState): Plan is a

% sequence of operators that applied in State

% achieves Goals and results in NewState

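gsp(S,G,[],S) :-
    holds(G,S).
gsp(S,[G|Gs],P,S1) :-      % G already holds in S: go on to the remaining goals
    holds([G],S),
    gsp(S,Gs,P,S1).
gsp(S,[G|Gs],P,S3) :-
    pop(O,Pre,A,D),        % choose an operator ...
    member(G,A),           % ... that adds the goal G
    gsp(S,Pre,P1,S1),      % plan to achieve its preconditions
    apply(S1,A,D,S2),      % apply the operator
    gsp(S2,Gs,P2,S3),      % plan for the remaining goals
    append(P1,[O|P2],P).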

Goal stack planning in Prolog

% holds(+Goals,+State): the goals (or preconditions) Goals hold in State

holds([],_).
holds([X \= Y|Ps],S) :-      % inequality tests are evaluated, not looked up in State
    X \= Y,
    holds(Ps,S).
holds([Pre|Ps],S) :-
    select(Pre,S,S1),
    holds(Ps,S1).

% apply(+State,+AddList,+DeleteList,-NewState): NewState is the result
% of applying the operator with add and delete lists AddList and
% DeleteList in State

apply(S,A,D,S1) :-
    union(S,A,S2),
    subtract(S2,D,S1).

Goal stack planning in Prolog

% Move a block onto the table

pop(move_to_table(X),
    [block(X),block(Y),clear(X),on(X,Y)],
    [on(X,table),clear(Y)],
    [on(X,Y)]).

% Move a block on top of another block

pop(move(X,Y),
    [block(X),block(Y),X \= Y,clear(X),clear(Y),on(X,Z)],
    [on(X,Y),clear(Z)],
    [clear(Y),on(X,Z)]).

Example: blocks world problem

•  initial state:

On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)

• goal state: On(a,b)

•  for this example problem, the goal stack planner returns the plan

[move_to_table(c),move(a,b)]

• which is optimal

Problems with goal stack planning

• working backwards from unachieved goals in goal stack planning focuses search and often leads to shorter plans

•  however there are still problems:

– plans may still contain redundant steps

– plans may not achieve the goal

– planning may not terminate for some problems (due to depth first search)

• more difficult to handle existentially quantified goals (general issue with regression planning)

Example: Sussman anomaly

•  initial state:

On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)

• goal state: On(a,b) ^ On(b,c) ^ On(c,table)

Example: Sussman anomaly plan

•  for this example problem, the goal stack planner returns the plan

[move_to_table(c),move(a,b),

move_to_table(a),move(b,c)]

•  this achieves each goal individually, but not at the same time

•  this example illustrates a general problem with goal stack planning called ‘clobbering’
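One way to see the problem is to execute the plan step by step and inspect the final state. The sketch below does this with hand-instantiated add and delete lists; the `step/3` table and the predicates `execute/3` and `sussman_final/1` are hypothetical names for this illustration, preconditions are not checked, and SWI-Prolog is assumed.

```prolog
:- use_module(library(lists)).

% step(Action, AddList, DeleteList): ground instances of the two
% operators for the four actions in the plan (written out by hand).
step(move_to_table(c), [on(c,table), clear(a)], [on(c,a)]).
step(move(a,b),        [on(a,b), clear(table)], [clear(b), on(a,table)]).
step(move_to_table(a), [on(a,table), clear(b)], [on(a,b)]).
step(move(b,c),        [on(b,c), clear(table)], [clear(c), on(b,table)]).

% execute(+Plan, +State, -FinalState): apply each action in turn,
% removing its delete list and then adding its add list.
execute([], State, State).
execute([Action|Rest], State, Final) :-
    step(Action, Add, Delete),
    subtract(State, Delete, S1),
    union(Add, S1, S2),
    execute(Rest, S2, Final).

% sussman_final(-Final): the state reached by the plan above.
sussman_final(Final) :-
    Init = [on(c,a), on(a,table), on(b,table), clear(b), clear(c),
            block(a), block(b), block(c)],
    execute([move_to_table(c), move(a,b), move_to_table(a), move(b,c)],
            Init, Final).
```

The final state contains `on(b,c)` and `on(c,table)` but not `on(a,b)`: the third step, `move_to_table(a)`, deletes the goal that the second step achieved.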

Clobbering

•  with conjunctive goals it can be hard to ensure that steps in the plan don’t interfere

•  e.g., when planning to achieve G1 ^ G2 the postcondition of an action to achieve G2 may make the (already achieved) goal G1 false

Example: clobbering

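•  for example, the state after achieving the first subgoal, On(a,b), could be

On(a,b) ^ On(b,table) ^ On(c,table) ^ Clear(a) ^ Clear(c)

with On(a, b) achieved

•  after achieving the second and third subgoals, On(b, c) and On(c,table), the state could be

On(b,c) ^ On(c,table) ^ On(a,table) ^ Clear(a) ^ Clear(b)

with On(b, c) and On(c,table) achieved, but On(a, b) no longer true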

Protecting achieved goals

•  to avoid this problem we must ensure that achieved (sub)goals are not ‘unachieved’ by subsequent steps in the plan

•  regression planning with protected goals is similar to goal stack planning, except that:

–  the planner maintains lists of unachieved and achieved goals, and

– a new action is only added to a partial plan if it achieves an unachieved goal and does not clobber any achieved goal
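The second condition can be sketched as a filter on operator choice. This is an illustrative assumption rather than the planner developed in the lectures: `achieves_safely/3` is a hypothetical predicate, SWI-Prolog is assumed, and the check is conservative, rejecting an operator if any literal on its delete list could unify with a protected goal, even when the operator is only partially instantiated.

```prolog
:- use_module(library(lists)).

pop(move(X,Y),
    [block(X), block(Y), X \= Y, clear(X), clear(Y), on(X,Z)],
    [on(X,Y), clear(Z)],
    [clear(Y), on(X,Z)]).
pop(move_to_table(X),
    [block(X), block(Y), clear(X), on(X,Y)],
    [on(X,table), clear(Y)],
    [on(X,Y)]).

% achieves_safely(+Goal, +Protected, -Op): Op adds Goal and its delete
% list cannot unify with any goal in the list Protected.
achieves_safely(Goal, Protected, Op) :-
    pop(Op, _Pre, Add, Delete),
    member(Goal, Add),
    \+ ( member(P, Protected), member(P, Delete) ).
```

With `on(a,b)` protected, `achieves_safely(on(b,c), [on(a,b)], Op)` gives `Op = move(b,c)`, but `achieves_safely(clear(b), [on(a,b)], Op)` fails, since the only ways to clear b involve moving a block off b. A planner using such a test has to backtrack and reorder its goals instead of undoing `on(a,b)`.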

The next lecture

Further regression

Suggested reading:

• Bratko (2001) chapter 17