G52APT AI Programming Techniques
Lecture 14: Regression planning
Brian Logan School of Computer Science
© Brian Logan 2012 G52APT Lecture 14: Regression planning 2
Outline of this lecture
• recap: simple forward planning
• problems of forward planning
• regression planning
• goal stack planning
• clobbering
Recap: planning
• like search, planning involves choosing a sequence of actions which, if executed, will achieve a given state of the world
• however there are important differences:
– relies on factored representations for states and goals (sets of properties, usually expressed in logic)
– operator applicability and effects are defined in terms of these properties, and are limited to parts of the state
– may use reasoning to infer the consequences of possible actions
Recap: divide and conquer
• in search, there is no easy way to break a large problem down into sub-problems
• all a search algorithm has to work with is a list of applicable operators (and possibly a heuristic evaluation of the successor states)
• in contrast, planners exploit localised representations to:
– break the problem down into sub-problems, e.g., by considering each of the properties (goals) defining the goal state in turn
– select appropriate operators to solve each sub-problem
• as in local search, some planners also relax the requirement that solutions are constructed sequentially
Recap: STRIPS
• states are represented as conjunctions of fluents (function-free ground literals)
• goals are represented by conjunctions of literals, possibly containing existentially quantified variables
• actions are represented by operators which specify:
– the name of the action
– the precondition—a conjunction of positive literals specifying what must be true for the action to be applicable (PDDL is less restrictive)
– the postcondition—a conjunction of literals specifying how the situation changes when the operator is applied
STRIPS continued
• for example, an operator which stacks one block on top of another in the blocks world could be specified as
[ Block(x), Block(y), x ≠ y, Clear(x), Clear(y), On(x, z) ] MOVE(x, y) [ On(x, y), Clear(z), ¬Clear(y), ¬On(x, z) ]
• the precondition describes the state(s) in which the action can be performed, and the postcondition describes the state that results from performing the action
• in the successor state, all the positive literals in the postcondition hold, as do all the literals that held in the previous state except for those that appear negated in the postcondition
Example: blocks world operators
[ Block(x), Block(y), x ≠ y, Clear(x), Clear(y), On(x, z) ] MOVE(x, y) [ On(x, y), Clear(z), ¬Clear(y), ¬On(x, z) ]
[Block(x), Block(y), Clear(x), On(x, y) ] MOVE-TO-TABLE(x) [ On(x, table), Clear(y), ¬On(x, y) ]
Example: blocks world problem
• initial state:
On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)
• goal state: On(a,b) ^ On(b,c) ^ On(c,table)
Representing states in Prolog
• in STRIPS planning, states are sets of fluents
• e.g., in the blocks world we have:
– blocks, a, b, c, … etc. which can be represented by atoms a, b, c, …
– properties of blocks, e.g., Clear(a), and relationships between blocks, e.g., On(a,b), which can be represented by Prolog terms clear(a), on(a,b), etc.
• states (e.g., initial state, goal state) can then be represented by lists of terms, e.g., [clear(a), on(a,b), ...], where a list represents a conjunction of fluents
Representing operators
• for example, the operator
[ Block(x), Block(y), x ≠ y, Clear(x), Clear(y), On(x, z) ] MOVE(x, y) [ On(x, y), Clear(z), ¬Clear(y), ¬On(x, z) ]
can be represented as
pop(move(X,Y),                       % name
    [block(X),block(Y),X \= Y,
     clear(X),clear(Y),on(X,Z)],     % precondition
    [on(X,Y),clear(Z)],              % add list
    [clear(Y),on(X,Z)]).             % delete list
Simple forward planning in Prolog
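:- use_module([library(lists),library(sets)]).

% fp(+State,+Goals,-Plan): Plan is a sequence of
% operators that applied in State achieves Goals
fp(S,G,P) :-
    fp(S,G,[],R),
    reverse(R,P).

% fp(+State,+Goals,+OpsSoFar,-Ops): OpsSoFar holds the operators
% chosen so far, most recent first
fp(S,G,P,P) :-
    holds(G,S).
fp(S,G,Os,P) :-
    pop(O,Pre,A,D),
    holds(Pre,S),          % O is applicable in S
    \+ member(O,Os),       % never use the same operator instance twice
    apply(S,A,D,S1),
    fp(S1,G,[O|Os],P).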
Simple forward planning in Prolog
% holds(+Goals,+State): the goals (or preconditions) Goals hold in State
holds([],_).
holds([X \= Y|Ps],S) :-    % an inequality is tested directly, not looked up in the state
    !, X \= Y,
    holds(Ps,S).
holds([Pre|Ps],S) :-
    select(Pre,S,S1),
    holds(Ps,S1).

% apply(+State,+AddList,+DeleteList,-NewState): NewState is the result of applying the operator with add and delete lists AddList and DeleteList in State
apply(S,A,D,S1) :-
    subtract(S,D,S2),
    append(A,S2,S1).
Simple forward planning in Prolog
% Move a block on top of another block
pop(move(X,Y),
    [block(X),block(Y),X \= Y,
     clear(X),clear(Y),on(X,Z)],
    [on(X,Y),clear(Z)],
    [clear(Y),on(X,Z)]).

% Move a block onto the table
pop(move_to_table(X),
    [block(X),block(Y),clear(X),on(X,Y)],
    [on(X,table),clear(Y)],
    [on(X,Y)]).
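As a quick check of these definitions, the sketch below lists the operator instances that are applicable in the initial state of the blocks world problem and applies one of them. It assumes SWI-Prolog, and it treats X \= Y as a test rather than a fluent to be looked up in the state; that treatment, and the helper names initial/1, applicable/1 and after_move_to_table_c/1, are assumptions of this sketch rather than part of the lecture code.

```prolog
:- use_module(library(lists)).

% Operators as in the listing above; block/1 is used so that they
% match the state description below.
pop(move(X,Y),
    [block(X),block(Y),X \= Y,clear(X),clear(Y),on(X,Z)],
    [on(X,Y),clear(Z)],
    [clear(Y),on(X,Z)]).
pop(move_to_table(X),
    [block(X),block(Y),clear(X),on(X,Y)],
    [on(X,table),clear(Y)],
    [on(X,Y)]).

% holds(+Goals,+State): an inequality is tested directly (assumption of
% this sketch); every other literal must occur in State.
holds([],_).
holds([X \= Y|Ps],S) :- !, X \= Y, holds(Ps,S).
holds([Pre|Ps],S) :- select(Pre,S,S1), holds(Ps,S1).

% apply(+State,+AddList,+DeleteList,-NewState)
apply(S,A,D,S1) :- subtract(S,D,S2), append(A,S2,S1).

% initial(-State): the initial state of the example problem
initial([on(c,a),on(a,table),on(b,table),clear(b),clear(c),
         block(a),block(b),block(c)]).

% applicable(-Os): all operator instances whose precondition holds initially
applicable(Os) :-
    initial(S),
    findall(O, (pop(O,Pre,_,_), holds(Pre,S)), Os).

% after_move_to_table_c(-S1): the state reached by moving c to the table
after_move_to_table_c(S1) :-
    initial(S),
    pop(move_to_table(c),Pre,A,D),
    holds(Pre,S),
    apply(S,A,D,S1).
```

The query ?- applicable(Os). gives Os = [move(b,c), move(c,b), move_to_table(c)], which is the order in which the forward planner would try these steps in the initial state.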
Example: blocks world problem 2
• initial state:
On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)
• goal state: On(a,b)
[Figure: block diagrams of the initial state and the goal state]
Example: blocks world problem 2
• initial state:
[on(c,a), on(a,table), on(b,table), clear(b), clear(c), block(a), block(b), block(c)]
• goal state: [on(a,b)]
• for this example problem, the forward planner returns the plan
[move(b,c), move_to_table(b), move(c,b), move(c,a), move_to_table(c), move(a,b)]
• which contains four redundant steps
Problems with forward planning
• if a plan is found it is guaranteed to achieve all the goals
• goals may be existentially quantified, e.g., [on(a,X)]
• however there are problems:
– plans may contain redundant steps
– planning may not terminate for some problems (due to depth first search)
Forward planner search
• forward planner performs blind depth-first search of the state space
• order in which plans are returned is determined by
– the order in which operators are tried, and
– the order in which literals appear in the initial state description
Regression planning
• a better way to solve STRIPS problems is to search backwards from the goal in world (situation) space
– goals are matched against operator postconditions
– operator preconditions become subgoals
– we stop when the operator preconditions (subgoals) are satisfied in the current state
• resulting plan is a series of instantiated operators which, if applied in the initial state, result in the goal state
• searching backwards often reduces the branching factor
Goal regression
• given a ground goal G and a ground operator (action) O, the regression from G over O is given by:
G′ = (G − Add-List(O)) + Precondition(O)
• i.e., we remove from the goals any effects of the action and add the preconditions of the action as new goals
• regression moves one or more goals from the set of unachieved goals to the set of achieved goals, and possibly adds some new goals to the set of unachieved goals
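The regression formula can be sketched directly with list operations. This is a minimal illustration assuming SWI-Prolog and a single ground instance of MOVE(a, b) with z = table; the predicate names op/3 and regress/3 are hypothetical, and the inequality a \= b is left out of the precondition list because it is already satisfied.

```prolog
:- use_module(library(lists)).

% op(Name,Precondition,AddList): one ground operator instance,
% MOVE(a,b) with z = table (illustrative only)
op(move(a,b),
   [block(a),block(b),clear(a),clear(b),on(a,table)],
   [on(a,b),clear(table)]).

% regress(+Goals,+Op,-SubGoals): SubGoals = (Goals - AddList) + Precondition
regress(G,O,G1) :-
    op(O,Pre,Add),
    subtract(G,Add,Rest),   % remove the goals that O achieves
    union(Pre,Rest,G1).     % add O's precondition as new goals
```

For example, ?- regress([on(a,b),on(b,c)], move(a,b), G1). gives G1 = [block(a),block(b),clear(a),clear(b),on(a,table),on(b,c)] in SWI-Prolog: On(a,b) has been replaced by the precondition of MOVE(a, b), while On(b,c) is carried over unchanged.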
Relevant operators
• only regress over relevant operators
• an operator is relevant if one of the literals added by the operator unifies with an unachieved goal
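With operators stored as pop/4 facts, the relevance test is a single unification against the add list. The sketch below assumes SWI-Prolog and uses block/1 in the preconditions; relevant/2 is a hypothetical helper name.

```prolog
:- use_module(library(lists)).

% pop(Name,Precondition,AddList,DeleteList)
pop(move(X,Y),
    [block(X),block(Y),X \= Y,clear(X),clear(Y),on(X,Z)],
    [on(X,Y),clear(Z)],
    [clear(Y),on(X,Z)]).
pop(move_to_table(X),
    [block(X),block(Y),clear(X),on(X,Y)],
    [on(X,table),clear(Y)],
    [on(X,Y)]).

% relevant(+Goal,-Op): Op adds a literal that unifies with Goal
relevant(G,O) :-
    pop(O,_,A,_),
    member(G,A).
```

For the goal on(a,b) the only relevant operator is move(a,b). For clear(a) both operators are relevant, but they are returned only partially instantiated (e.g., move_to_table(X)), since unifying with the add list does not say which block has to be moved off a.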
Blocks world problem again
• initial state:
On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)
• goal state: On(a,b) ^ On(b,c) ^ On(c,table)
Regression planning in the blocks world
• the blocks world problem can be decomposed into three subgoals:
On(a, b), On(b, c) and On(c,table)
• we try to achieve each subgoal in turn
• the first subgoal, On(a, b), is false in the initial situation, so we look for an operator which makes it true, i.e., which has On(a, b) as a postcondition
• in this case, there is only one, MOVE(a, b), which has preconditions:
Block(a), Block(b), a ≠ b, Clear(a), Clear(b), On(a, z)
Regression planning in the blocks world
• Block(a), Block(b), a ≠ b, Clear(b), and On(a, z) (with z = table) are true in the current state, but Clear(a) is false
• each unachieved precondition of MOVE(a, b) (in this case Clear(a)) becomes a new subgoal, and we look for an operator to make it true
• in this case there are two operators we can choose: MOVE-TO-TABLE and MOVE
• and so on …
Goal stack planning
• the simplest form of regression planning is goal stack planning
• goal stack planning was the approach used by the original STRIPS planner
• performs depth first search backwards from each goal in turn, until it finds a sequence of operators that achieve the goal from the current state
• operators are then applied to give a new current state s′, and planning continues to find a sequence of operators that achieve the next goal, starting from s′
Goal stack planning
• push the original conjunction of goals on the stack
• repeat until the stack is empty:
– if the top of the stack is an achieved goal pop it from the stack
– if the top of the stack is a conjunctive goal, push its unachieved subgoals on the stack
– if the top of the stack is a single unachieved goal, replace it by an action that achieves the goal and push the action’s precondition on the stack
– if the top of the stack is an action, pop it from the stack, execute it and update the state using the action’s add and delete lists
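The steps above can be read as a loop over an explicit stack. The following is one possible sketch, assuming SWI-Prolog; it is not the lecture's implementation. Stack entries are a single goal literal, conj(Goals) for a conjunctive goal, or act(Op,Add,Delete) for an action, and gs/3, gs_plan/3 and achieved/2 are hypothetical names. The sketch assumes that a conjunctive goal stays on the stack beneath its pushed subgoals so that it is rechecked once they have been dealt with, and that X \= Y is tested directly rather than looked up in the state.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

pop(move_to_table(X),
    [block(X),block(Y),clear(X),on(X,Y)],
    [on(X,table),clear(Y)],
    [on(X,Y)]).
pop(move(X,Y),
    [block(X),block(Y),X \= Y,clear(X),clear(Y),on(X,Z)],
    [on(X,Y),clear(Z)],
    [clear(Y),on(X,Z)]).

% holds(+Goals,+State): inequalities are tested, other literals looked up
holds([],_).
holds([X \= Y|Ps],S) :- !, X \= Y, holds(Ps,S).
holds([G|Gs],S) :- select(G,S,S1), holds(Gs,S1).

% apply(+State,+AddList,+DeleteList,-NewState)
apply(S,A,D,S1) :- union(S,A,S2), subtract(S2,D,S1).

% achieved(+State,+Goal): Goal holds in State; no variables are bound
achieved(S,G) :- \+ \+ holds([G],S).

% gs(+Stack,+State,-Plan): process the stack until it is empty
gs([],_,[]).
gs([act(O,A,D)|St],S,[O|P]) :- !,        % action: execute it
    apply(S,A,D,S1),
    gs(St,S1,P).
gs([conj(Gs)|St],S,P) :- !,              % conjunctive goal
    (   holds(Gs,S)
    ->  gs(St,S,P)                       % achieved: pop it
    ;   exclude(achieved(S),Gs,Un),      % otherwise push unachieved subgoals
        Un = [_|_],
        append(Un,[conj(Gs)|St],St1),
        gs(St1,S,P)
    ).
gs([G|St],S,P) :-                        % single goal
    (   holds([G],S)
    ->  gs(St,S,P)                       % achieved: pop it
    ;   pop(O,Pre,A,D),                  % replace it by an action adding G
        member(G,A),
        gs([conj(Pre),act(O,A,D)|St],S,P)
    ).

% gs_plan(+State,+Goals,-Plan): start with the goal conjunction on the stack
gs_plan(S,Goals,P) :- gs([conj(Goals)],S,P).
```

With the initial state [on(c,a),on(a,table),on(b,table),clear(b),clear(c),block(a),block(b),block(c)] and goals [on(a,b)], the first answer to gs_plan/3 is [move_to_table(c),move(a,b)]. Wrapping the call in once/1 is advisable, since backtracking into a depth-first search of this kind is not guaranteed to terminate.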
Goal stack planning in Prolog
:- use_module([library(lists),library(sets)]).
% gsp(+State,+Goals,-Plan): Plan is a sequence of
% operators that applied in State achieves Goals
gsp(S,G,P) :- gsp(S,G,P,_).
Goal stack planning in Prolog
% gsp(+State,+Goals,-Plan,-NewState): Plan is a
% sequence of operators that applied in State
% achieves Goals and results in NewState
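gsp(S,G,[],S) :-
    holds(G,S).
gsp(S,[G|Gs],P,S1) :-        % G already holds: move on to the remaining goals
    holds([G],S),
    gsp(S,Gs,P,S1).
gsp(S,[G|Gs],P,S3) :-
    pop(O,Pre,A,D),
    member(G,A),             % O adds G
    gsp(S,Pre,P1,S1),        % plan to achieve O's precondition
    apply(S1,A,D,S2),        % apply O
    gsp(S2,Gs,P2,S3),        % plan for the remaining goals
    append(P1,[O|P2],P).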
Goal stack planning in Prolog
% holds(+Goals,+State): the goals (or preconditions) Goals hold in State
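holds([],_).
holds([X \= Y|Ps],S) :-      % an inequality is tested directly, not looked up in the state
    !, X \= Y,
    holds(Ps,S).
holds([Pre|Ps],S) :-
    select(Pre,S,S1),
    holds(Ps,S1).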
% apply(+State,+AddList,+DeleteList,-NewState): NewState is the result of applying the operator with add and delete lists AddList and DeleteList in State
apply(S,A,D,S1) :-
union(S,A,S2), subtract(S2,D,S1).
Goal stack planning in Prolog
% Move a block onto the table
pop(move_to_table(X),
    [block(X),block(Y),clear(X),on(X,Y)],
    [on(X,table),clear(Y)],
    [on(X,Y)]).

% Move a block on top of another block
pop(move(X,Y),
    [block(X),block(Y),X \= Y,clear(X),clear(Y),on(X,Z)],
    [on(X,Y),clear(Z)],
    [clear(Y),on(X,Z)]).
Example: blocks world problem
• initial state:
On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)
• goal state: On(a,b)
• for this example problem, the goal stack planner returns the plan
[move_to_table(c),move(a,b)]
• which is optimal
Problems with goal stack planning
• working backwards from unachieved goals in goal stack planning focuses search and often leads to shorter plans
• however there are still problems:
– plans may still contain redundant steps
– plans may not achieve the goal
– planning may not terminate for some problems (due to depth first search)
• more difficult to handle existentially quantified goals (general issue with regression planning)
Example: Sussman anomaly
• initial state:
On(c,a) ^ On(a,table) ^ On(b,table) ^ Clear(b) ^ Clear(c) ^ Block(a) ^ Block(b) ^ Block(c)
• goal state: On(a,b) ^ On(b,c) ^ On(c,table)
Example: Sussman anomaly plan
• for this example problem, the goal stack planner returns the plan
[move_to_table(c),move(a,b),
move_to_table(a),move(b,c)]
• this achieves each goal individually, but not at the same time
• this example illustrates a general problem with goal stack planning called ‘clobbering’
Clobbering
• with conjunctive goals it can be hard to ensure that steps in the plan don’t interfere
• e.g., when planning to achieve G1 ^ G2 the postcondition of an action to achieve G2 may make the (already achieved) goal G1 false
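Clobbering can be detected after the fact by executing a plan and checking which goals hold in the final state. The sketch below assumes SWI-Prolog and the list-based state representation used earlier; run/3, unachieved/3 and in_state/2 are hypothetical helper names, and X \= Y is tested directly rather than looked up in the state.

```prolog
:- use_module(library(lists)).
:- use_module(library(apply)).

pop(move(X,Y),
    [block(X),block(Y),X \= Y,clear(X),clear(Y),on(X,Z)],
    [on(X,Y),clear(Z)],
    [clear(Y),on(X,Z)]).
pop(move_to_table(X),
    [block(X),block(Y),clear(X),on(X,Y)],
    [on(X,table),clear(Y)],
    [on(X,Y)]).

% holds(+Goals,+State): inequalities are tested, other literals looked up
holds([],_).
holds([X \= Y|Ps],S) :- !, X \= Y, holds(Ps,S).
holds([G|Gs],S) :- select(G,S,S1), holds(Gs,S1).

% apply(+State,+AddList,+DeleteList,-NewState)
apply(S,A,D,S1) :- subtract(S,D,S2), append(A,S2,S1).

% run(+Plan,+State,-Final): execute the steps in order; fails if the
% precondition of some step does not hold when it is reached
run([],S,S).
run([O|Os],S,S2) :-
    pop(O,Pre,A,D),
    holds(Pre,S), !,
    apply(S,A,D,S1),
    run(Os,S1,S2).

% in_state(+State,+Goal): the ground literal Goal occurs in State
in_state(S,G) :- memberchk(G,S).

% unachieved(+Goals,+State,-Un): Un are the goals that are false in State
unachieved(Goals,S,Un) :- exclude(in_state(S),Goals,Un).
```

Starting from the state [on(c,a),on(a,table),on(b,table),clear(b),clear(c),block(a),block(b),block(c)], running the plan [move_to_table(c),move(a,b),move_to_table(a),move(b,c)] and checking the goals [on(a,b),on(b,c),on(c,table)] leaves [on(a,b)] unachieved: the step that clears b for the second goal deletes the first. The plan [move_to_table(c),move(b,c),move(a,b)] leaves no goal unachieved.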
Example: clobbering
• for example, the state after achieving the first subgoal, On(a,b), could be: a on b, with b and c on the table, so On(a, b) is achieved
• after achieving the second and third subgoals, On(b, c) and On(c,table), the state could be: b on c, with a and c on the table, so On(b, c) and On(c,table) are achieved but On(a, b) has been clobbered
Protecting achieved goals
• to avoid this problem we must ensure that achieved (sub)goals are not ‘unachieved’ by subsequent steps in the plan
• regression planning with protected goals is similar to goal stack planning, except that:
– the planner maintains lists of unachieved and achieved goals, and
– a new action is only added to a partial plan if it achieves an unachieved goal and does not clobber any achieved goal
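The test for admitting a new action can be sketched as a check of its delete list against the achieved goals. This is a minimal illustration assuming SWI-Prolog and ground operator instances; op/3, clobbers/2 and admissible/3 are hypothetical names, and the three instances listed are those relevant to a state in which a is on b and c is on the table.

```prolog
:- use_module(library(lists)).

% op(Name,AddList,DeleteList): ground operator instances (illustrative)
op(move_to_table(a), [on(a,table),clear(b)], [on(a,b)]).
op(move(a,c),        [on(a,c),clear(b)],     [clear(c),on(a,b)]).
op(move(c,a),        [on(c,a),clear(table)], [clear(a),on(c,table)]).

% clobbers(+DeleteList,+Achieved): some deleted literal is an achieved goal
clobbers(D,Achieved) :-
    member(L,D),
    memberchk(L,Achieved).

% admissible(+Goal,+Achieved,-Op): Op adds Goal and does not delete
% any goal in Achieved
admissible(G,Achieved,O) :-
    op(O,A,D),
    memberchk(G,A),
    \+ clobbers(D,Achieved).
```

With on(a,b) protected, ?- admissible(clear(b), [on(a,b)], O). fails, because both actions that add clear(b) delete on(a,b); the planner then has to backtrack and achieve its goals in a different order. With no protected goals, both move_to_table(a) and move(a,c) are admissible for clear(b).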