Post on 15-Dec-2015

Effective Approaches for Partial Satisfaction (Over-subscription) Planning

Romeo Sanchez *, Menkes van den Briel **, Subbarao Kambhampati *

* Department of Computer Science and Engineering
** Department of Industrial Engineering
Arizona State University, Tempe, Arizona

Outline

Background
Example
Approaches
  Optiplan
  AltAltps
  Sapaps
Planning graph heuristics
Results

For all your demands, you could’ve bought me a better flash memory stick at least!

In one day achieve the following 100 goals: RockData at WP 1, high-res pics at WP 2 & 3,

…., SoilData at WP 100

No way I can achieve that many goals in one day

It’s hard but here is the best I can do:

Goal1, Goal5, Goal99

Given: actions with costs and goals with utilities, find a plan that has the highest net benefit {utility – cost}

Previous approaches:
  Highest utility goal first
  Estimating the set of most beneficial goals

Background


Complete satisfaction (traditional) planning
  Goal state G is a conjunction of fluents: $G = g_1 \wedge g_2 \wedge \dots \wedge g_n$
  A plan that achieves n – 1 goal fluents is as good as a plan that achieves 0 goal fluents

Partial satisfaction planning (PSP)
  Goal state G is a set of fluents: $G = \{g_1, g_2, \dots, g_n\}$
  Goal fluents might have utilities and actions might have costs, so a plan that achieves only part of G might be more beneficial than the “null” plan

Achieving all goal fluents might be impossible…
  The goal state G may contain logically conflicting fluents
  There might not be enough resources to achieve all fluents in G

(:goal (and (pointing satellite1 moon) (pointing satellite1 mars) ))

(:goal (and (have_rock rover1 waypoint1) (have_rock rover1 waypoint2) ))

PSP problems

PSP Net benefit: Given a planning problem P = (F, A, I, G), for each action a a “cost” $c_a \ge 0$, for each goal fluent $f \in G$ a “utility” $u_f \ge 0$, and a positive number k: is there a finite sequence of actions $\Delta = (a_1, a_2, \dots, a_n)$ that, starting from I, leads to a state S with net benefit $\sum_{f \in (S \cap G)} u_f - \sum_{a \in \Delta} c_a \ge k$?
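The net benefit definition above can be checked on a toy problem. The following is a minimal sketch, assuming STRIPS-style actions given as (preconditions, add effects, delete effects, cost); the rover action names, costs, and utilities are made up for illustration and do not come from the slides.

```python
def net_benefit(plan, initial_state, actions, utility):
    """Utility of the goal fluents true in the final state minus total action cost."""
    state = set(initial_state)
    total_cost = 0
    for name in plan:
        pre, add, delete, cost = actions[name]
        if not pre <= state:
            raise ValueError(f"action {name!r} is not applicable")
        state = (state - delete) | add      # apply the action
        total_cost += cost
    achieved = set(utility) & state         # S intersected with G
    return sum(utility[f] for f in achieved) - total_cost


# Hypothetical rover problem: name -> (pre, add, delete, cost)
actions = {
    "navigate_base_wp1": ({"at_base"}, {"at_wp1"}, {"at_base"}, 2),
    "sample_rock_wp1": ({"at_wp1"}, {"have_rock_wp1"}, set(), 3),
}
utility = {"have_rock_wp1": 10, "have_image_wp2": 4}

plan = ["navigate_base_wp1", "sample_rock_wp1"]
benefit = net_benefit(plan, {"at_base"}, actions, utility)   # 10 - (2 + 3)
meets_threshold = benefit >= 4                                # the decision question for k = 4
```

The plan leaves one goal unachieved and is still a valid answer to the PSP Net benefit question, because only the difference between gained utility and spent cost is compared with k.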

[Figure: hierarchy of planning problems, with nodes PLAN EXISTENCE, PLAN LENGTH, PSP GOAL, PSP GOAL LENGTH, PLAN COST, PSP UTILITY, PSP UTILITY COST, PSP NET BENEFIT]

Example

Getting from Las Vegas (LV) to San Jose (SJ)

C: action cost

U(G): utility of goal G

G1,G2,G3,G4: goals

P = {travel(LV,DL), travel(DL,SJ), travel(SJ,SF)} achieves G1, G2, G3

Approaches

Optiplan
  Integer programming based STRIPS planner
  Solves the PSP problem by encoding it as an integer program

AltAltps
  Heuristic regression planner
  Solves the PSP problem through a goal selection heuristic

Sapaps
  Heuristic forward state space planner
  Solves the PSP problem using an anytime A* algorithm

Optiplan

Optiplan planning system:
  Combines Graphplan (Blum & Furst, 1995) with the State Change Encoding (Vossen et al., 1999)
  As in the Blackbox planning system, Graphplan reduces the encoding size generated by Optiplan
  Computes optimal plans for a given parallel length

Objective: $\sum_{f \in G} U_f \, (x^{add}_{f,n} + x^{preadd}_{f,n} + x^{maintain}_{f,n}) - \sum_{l \in L} \sum_{a \in A} C_a \, y_{a,l}$

Sum of goal utilities – Sum of action costs
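To make the objective concrete, the sketch below only evaluates it for a given 0/1 assignment; it does not build or solve the integer program. The dictionary layout, goal names, and numbers are assumptions chosen for illustration.

```python
def objective_value(utility, final_vars, cost, y):
    """Goal utilities gained at the last level n minus the cost of every executed action.

    final_vars[f] holds the 0/1 values of x_add, x_preadd, x_maintain for fluent f at level n.
    y[(a, l)] is 1 when action a is executed at level l.
    """
    gained = sum(u * sum(final_vars[f].values()) for f, u in utility.items())
    spent = sum(cost[a] * used for (a, _level), used in y.items())
    return gained - spent


# Hypothetical assignment: g1 is added at the last level, g2 is never achieved.
utility = {"g1": 10, "g2": 6}
final_vars = {
    "g1": {"add": 1, "preadd": 0, "maintain": 0},
    "g2": {"add": 0, "preadd": 0, "maintain": 0},
}
cost = {"a1": 3, "a2": 4, "a3": 5}
y = {("a1", 1): 1, ("a2", 2): 1, ("a3", 2): 0}

value = objective_value(utility, final_vars, cost, y)   # 10 - (3 + 4)
```

In the state change encoding at most one of the three fluent variables is 1 for a given fluent and level, so the inner sum acts as an indicator that the goal holds at the end of the plan.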

Optiplan and partial satisfaction

Total satisfaction planning: goal satisfaction is treated as a hard constraint
  Objective: 0 / Minimize #actions
  Constraints:
    Fluent changes
    Satisfy initial state
    Satisfy goal
    Fluent implications
    Action implications

Partial satisfaction planning: goal satisfaction is treated as a soft constraint
  Objective: Maximize net benefit (goal utility – action cost)
  Constraints:
    Fluent changes
    Satisfy initial state
    Fluent implications
    Action implications

Graphplan based cost propagation

AltAltps

AltAlt planning system
  Heuristic state-space search planner (Nguyen, Kambhampati & Sanchez, 2002)
  Combines Graphplan (Blum & Furst, 1995) with heuristic state-space search techniques (Bonet, Loerincs & Geffner, 1997; Bonet & Geffner, 1999; McDermott, 1999)

AltAltps planning system
  Total enumeration of the $2^n$ goal subsets is too costly
  Selects a promising subset of the top-level goals upfront
  Searches for a plan using a regression state space search combined with cost-sensitive planning graph heuristics

AltAltps cost propagation

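Using a planning graph structure
  Propositions in the initial state come for free (they have zero cost)
  Other propositions have costs computed as follows:

$h_l(p)$ = cost of proposition p at level l

$$h_l(p) = \begin{cases} 0 & \text{if } p \in I \\ \min\{h_{l-1}(p),\ cost(a) + C_l(a)\} & \text{if } l > 0 \\ \infty & \text{otherwise} \end{cases}$$

where a is an action that adds p.

Propagation procedures
  Max-propagation: $C_l(a) = \max\{h_{l-1}(q) : q \in prec(a)\}$
  Sum-propagation: $C_l(a) = \sum_{q \in prec(a)} h_{l-1}(q)$

[Figure: planning graph annotated with propagated proposition and action costs at levels l = 0, l = 1, l = 2]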
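The propagation rules can be sketched in a few lines. This is an illustrative version under simplifying assumptions: delete effects and mutexes are ignored, actions are given as (preconditions, add effects, cost), and the proposition and action names are hypothetical.

```python
import math


def propagate_costs(initial, actions, levels, mode="max"):
    """Return h_l(p) after `levels` propagation steps over a relaxed planning graph."""
    combine = max if mode == "max" else sum       # max- or sum-propagation
    h = {p: 0 for p in initial}                   # initial propositions are free
    for _ in range(levels):
        new_h = dict(h)                           # start from h_{l-1}(p)
        for pre, add, cost in actions.values():
            if all(q in h for q in pre):          # action reachable at this level
                c_a = combine([h[q] for q in pre]) if pre else 0
                for p in add:
                    new_h[p] = min(new_h.get(p, math.inf), cost + c_a)
        h = new_h
    return h                                      # missing propositions have infinite cost


# Hypothetical problem: name -> (pre, add, cost)
actions = {
    "a1": ({"p"}, {"r"}, 4),
    "a2": ({"q"}, {"s"}, 5),
    "a3": ({"r", "s"}, {"g"}, 3),
}
h_max = propagate_costs({"p", "q"}, actions, levels=2, mode="max")
h_sum = propagate_costs({"p", "q"}, actions, levels=2, mode="sum")
```

Here max-propagation gives g a cost of 3 + max(4, 5) = 8, while sum-propagation gives 3 + 4 + 5 = 12. Max-propagation underestimates when preconditions need separate work; sum-propagation overestimates when they share actions.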

AltAltps goal set selection

Main idea
  Start with the original goal set G and an empty goal set G’
  Iteratively add goals to G’ as long as the estimated NET BENEFIT increases
  The cost of adding another goal g to G’ depends on the goals that are already in G’

[Figure: going from G’ to G’ ∪ {g}. The cost for achieving G’ comes from the relaxed plan for G’ (R’p); the residual cost for g comes from Rp for G’ ∪ {g}, which is biased to re-use actions in R’p]
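The greedy selection loop can be sketched as follows. This is a simplified illustration, not the AltAltps implementation: it assumes a relaxed plan for each single goal is already available as a set of actions (the hypothetical `relaxed_plan_for` table), and it models the re-use bias by charging only for actions not yet in R’p.

```python
def select_goals(utility, relaxed_plan_for, cost):
    """Greedily grow G' while the estimated net benefit keeps increasing."""
    selected, plan = set(), set()          # G' and its relaxed plan R'p
    remaining = set(utility)

    def gain(g):
        # Residual cost: only actions that R'p does not already contain.
        residual = sum(cost[a] for a in relaxed_plan_for[g] - plan)
        return utility[g] - residual

    while remaining:
        best = max(sorted(remaining), key=gain)
        if gain(best) <= 0:                # no goal improves the estimate
            break
        selected.add(best)
        plan |= relaxed_plan_for[best]
        remaining.discard(best)
    return selected, plan


# Hypothetical data: g2 is not worth pursuing alone, but it is once a1 is paid for by g1.
utility = {"g1": 10, "g2": 6, "g3": 3}
relaxed_plan_for = {"g1": {"a1", "a2"}, "g2": {"a1", "a3"}, "g3": {"a4"}}
cost = {"a1": 4, "a2": 3, "a3": 4, "a4": 5}

goals, plan = select_goals(utility, relaxed_plan_for, cost)
```

The example shows why the cost of adding g depends on what is already in G’: g2 alone costs 8 for a utility of 6, but its residual cost drops to 4 after g1 is selected.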

AltAltps cost-sensitive relaxed plan heuristic

General procedure
  States are ranked during search using the relaxed plan heuristic and the propagated costs
  The idea is to compute the cost of a relaxed plan Rp in terms of the costs of the actions composing it
  Heuristic value for S: $h(S) = \sum_{a \in R_p} cost(a)$

1. Given a state S, remove the (sub)goal g from S that has the highest $h_l(g)$
2. Select the action a that supports g with the lowest cost ($cost(a) + C_l(a)$)
3. Regress S over a to get $S' = (S \cup prec(a)) \setminus eff(a)$
4. Stop when each proposition $q \in S$ is present in the initial state
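The four steps above can be sketched as a short extraction loop. This is a minimal sketch that assumes the propagated costs h are already computed (here with max-propagation), that every open proposition has at least one supporter, and that action costs are positive; the names and numbers are hypothetical.

```python
def relaxed_plan_cost(state, initial, actions, h):
    """Extract a relaxed plan for `state` and return (h(S), actions in Rp)."""
    initial = set(initial)

    def support_cost(name):
        pre, _add, cost = actions[name]
        return cost + max((h[q] for q in pre), default=0)   # cost(a) + C_l(a), max-propagation

    open_props = set(state) - initial
    plan = set()
    while open_props:                                        # step 4: stop when S is within I
        g = max(sorted(open_props), key=lambda p: h[p])      # step 1: costliest subgoal
        supporters = sorted(
            n for n, (pre, add, _c) in actions.items()
            if g in add and all(q in h for q in pre)
        )
        a = min(supporters, key=support_cost)                # step 2: cheapest supporter
        plan.add(a)
        pre, add, _c = actions[a]
        open_props = ((open_props | pre) - add) - initial    # step 3: regress over a
    return sum(actions[a][2] for a in plan), plan


# Hypothetical problem: name -> (pre, add, cost), with max-propagated costs h
actions = {
    "a1": ({"p"}, {"r"}, 4),
    "a2": ({"q"}, {"s"}, 5),
    "a3": ({"r", "s"}, {"g"}, 3),
}
h = {"p": 0, "q": 0, "r": 4, "s": 5, "g": 8}

value, plan = relaxed_plan_cost({"g"}, {"p", "q"}, actions, h)
```

Because the plan is a set, an action that supports several subgoals is counted once, which is what makes the summed cost less pessimistic than sum-propagation.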

Sapaps

Node evaluation:
  g(S) = U(S) – C(S)
  h(S) = U(RP(S)) – C(RP(S))

Beneficial node: g(S) > 0, or equivalently U(S) > C(S)

Termination node: $\forall S'$: g(S) > f(S’)

A*: f(S) = g(S) + h(S)

A1: Navigate(X,Y)

A2: SampleSoil(Y)

A3: TakePicture

A4: Navigate(Y,Z)

A5: SampleRock

g(S) = Util(HasSoilData) – Cost(A1,A2)

h(S) = Util(Apply(A3,S)) – Cost(A3)

Anytime A* algorithm: search through best beneficial nodes
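A forward anytime search over net benefit can be sketched as below. This is an illustration only, not the Sapaps implementation: the heuristic is passed in as a function, and the example uses a crude optimistic bound (the utility of all unachieved goals, ignoring cost) instead of a relaxed plan. Action names, costs, and utilities are made up.

```python
import heapq
from itertools import count


def anytime_search(initial, actions, utility, h):
    """Best-first search on f = g + h that keeps the best beneficial node found so far."""
    def g_value(state, path_cost):
        return sum(u for f, u in utility.items() if f in state) - path_cost

    start = frozenset(initial)
    best_g, best_plan = g_value(start, 0), []       # the empty plan is the baseline
    tie = count()                                    # tie-breaker so states are never compared
    open_list = [(-(best_g + h(start)), next(tie), start, 0, [])]
    cheapest = {start: 0}
    while open_list:
        neg_f, _, state, path_cost, plan = heapq.heappop(open_list)
        if -neg_f <= best_g:                         # no open node can beat the incumbent
            break
        for name, (pre, add, delete, cost) in actions.items():
            if not pre <= state:
                continue
            nxt = frozenset((state - delete) | add)
            new_cost = path_cost + cost
            if cheapest.get(nxt, float("inf")) <= new_cost:
                continue                             # state already reached at least as cheaply
            cheapest[nxt] = new_cost
            g = g_value(nxt, new_cost)
            if g > best_g:                           # anytime: better solution available now
                best_g, best_plan = g, plan + [name]
            heapq.heappush(open_list, (-(g + h(nxt)), next(tie), nxt, new_cost, plan + [name]))
    return best_g, best_plan


# Hypothetical rover problem: name -> (pre, add, delete, cost)
actions = {
    "navigate_x_y": ({"at_x"}, {"at_y"}, {"at_x"}, 2),
    "sample_soil_y": ({"at_y"}, {"has_soil_data"}, set(), 3),
    "take_picture_y": ({"at_y"}, {"has_picture"}, set(), 6),
}
utility = {"has_soil_data": 10, "has_picture": 4}


def optimistic(state):
    return sum(u for f, u in utility.items() if f not in state)


best, plan = anytime_search({"at_x"}, actions, utility, optimistic)
```

The stopping test is only sound when h never underestimates the remaining achievable benefit. With an inadmissible heuristic the search still returns a beneficial plan, but it may stop before finding the best one.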

SAPAPS: a forward A* approach for PSP

Heuristic: variation of SAPA’s approach
  Heuristically extract the least cost relaxed plan using the cost function
  Remove “unbeneficial” goals and related actions

[Figure: relaxed plan with goals G1, G2, G3 and actions A1, A2, A3, A4, reduced to goals G1, G2 and actions A1, A3; annotation: C(A1) + C(A2) > U(G3)]
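Removing unbeneficial goals from a relaxed plan can be sketched as follows. This is a simplified illustration: it assumes each action in the relaxed plan is tagged with the set of goals it contributes to (the hypothetical `supports` table), and it drops a goal when the actions that serve only that goal cost more than its utility. The support sets and numbers are made up and do not reproduce the slide's figure.

```python
def prune_unbeneficial(utility, supports, cost):
    """Drop goals whose exclusive actions cost more than the goal's utility."""
    goals = set(utility)
    supports = {a: set(gs) for a, gs in supports.items()}   # work on a copy
    changed = True
    while changed:
        changed = False
        for g in sorted(goals):
            exclusive = [a for a, gs in supports.items() if gs == {g}]
            if sum(cost[a] for a in exclusive) > utility[g]:
                goals.discard(g)
                for a in exclusive:
                    del supports[a]                          # related actions go too
                for gs in supports.values():
                    gs.discard(g)                            # shared actions stay for other goals
                changed = True
                break                                        # re-check the remaining goals
    return goals, set(supports)


# Hypothetical relaxed plan: action -> goals it contributes to
supports = {"A1": {"G1", "G2"}, "A2": {"G3"}, "A3": {"G2"}, "A4": {"G3"}}
cost = {"A1": 3, "A2": 3, "A3": 2, "A4": 2}
utility = {"G1": 10, "G2": 8, "G3": 4}

kept_goals, kept_actions = prune_unbeneficial(utility, supports, cost)
```

The loop restarts after each removal because an action shared with a dropped goal may become exclusive to another goal and tip that goal below break-even.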

SAPAPS: heuristic

Empirical results


Future work
