planning graph-based heuristics for cost-sensitive temporal planning

Planning Graph-based Heuristics for Cost-sensitive Temporal Planning

Minh B. Do & Subbarao Kambhampati
CSE Department, Arizona State University

{binhminh,rao}@asu.edu

Motivation

Multi-dimensional nature of plan quality in metric temporal planning:
– Temporal quality (e.g. makespan, slack)
– Plan cost (e.g. cumulative action cost, resource consumption)

Necessitates multi-objective optimization:
– Modeling objective functions
– Tracking different quality metrics and heuristic estimation

Challenge: there may be inter-dependent relations between different quality metrics.

Example

Option 1: Tempe → Phoenix (Bus) → Los Angeles (Airplane)
– Less time: 3 hours; more expensive: $200

Option 2: Tempe → Los Angeles (Car)
– More time: 12 hours; less expensive: $50

Given a deadline constraint (6 hours): only Option 1 is viable.
Given a money constraint ($100): only Option 2 is viable.

[Figure: map of Tempe, Phoenix, and Los Angeles]

General Problem

Problem specification + Objective function → Planner → Good quality solution

How to design the objective function? (We do not investigate this.)
– User-defined
– Learning the user's utility model

Given an objective function that involves both time and cost, how do we find heuristics that are sensitive to the cost function? (We investigate this.)

Our approach

Use the Temporal Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
– Estimate the earliest time (makespan) to achieve all goals.
– Estimate the lowest cost to achieve the goals.
– Estimate the cost to achieve the goals given a specific makespan value.

Use this information to calculate the heuristic value for an objective function involving both time and cost.

Outline

Action representation and Temporal Planning Graph

Time-sensitive cost functions:
– Cost propagation using the temporal planning graph
– Termination criteria for the cost propagation process

Deriving heuristic values from cost functions:
– Direct calculation
– Heuristic by relaxed plan extraction

Empirical evaluation

Conclusion and future work

Action Representation

Similar to PDDL2.1 Level 3:
– Actions have non-uniform durations and may consume resources.
– Preconditions are true at the start point or hold true for the action's duration.
– Effects occur at the start or end point.

[Figure: the action Load(package,truck,place), with conditions and effects over At(package,place), At(truck,place), and In(package,truck)]

The (Relaxed) Temporal PG

[Figure: relaxed temporal planning graph for travel between Tempe, Phoenix, and Los Angeles, with actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), and Airplane(P,LA) at time points t = 0, 0.5, 1, 1.5, and 10]

Time-sensitive Cost Function

The standard (temporal) planning graph (TPG) shows time-related estimates, e.g. the earliest time to achieve a fact or to execute an action.

The TPG does not show cost estimates for achieving facts or executing actions.

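[Figure: map of Tempe, Phoenix, and L.A.]

Actions:
– Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
– Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
– Car(Tempe,LA): Cost: $100; Time: 10 hours
– Airplane(Phx,LA): Cost: $200; Time: 1.0 hour

[Figure: cost of reaching L.A. as a function of time ($300 at t = 1.5, $220 at t = 2, $100 at t = 10), shown with the relaxed temporal planning graph containing Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), and Airplane(P,LA) at t = 0, 0.5, 1, 1.5, and 10]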

Estimating the Cost Function

[Figure: step-by-step estimation of the cost function Cost(At(LA)) on the relaxed temporal planning graph, using the same Shuttle, Helicopter, Car, and Airplane actions. The cost of executing the flight follows the cost of reaching Phoenix: Cost(At(Phx)) = Cost(Flight(Phx,LA)). The time axis is marked at 0, 1.5, 2, and 10, with cost levels $300, $220, and $100.]
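
The construction in the figure can be sketched in a few lines of Python. This is a minimal illustration, not the planner's implementation: it assumes every action has a single precondition, and that the execution cost of an action is added to the cost of its precondition (an assumption read off the example's numbers, e.g. $100 + $200 = $300). The names `ACTIONS`, `cost_at`, and `propagate` are hypothetical.

```python
import math

# (name, precondition, effect, cost in $, duration in hours) from the travel example
ACTIONS = [
    ("Shuttle(Tempe,Phx)", "At(Tempe)", "At(Phx)", 20, 1.0),
    ("Helicopter(Tempe,Phx)", "At(Tempe)", "At(Phx)", 100, 0.5),
    ("Car(Tempe,LA)", "At(Tempe)", "At(LA)", 100, 10.0),
    ("Airplane(Phx,LA)", "At(Phx)", "At(LA)", 200, 1.0),
]

def cost_at(steps, fact, t):
    """Cheapest known cost of achieving `fact` no later than time t."""
    return min((c for (time, c) in steps.get(fact, []) if time <= t),
               default=math.inf)

def propagate(initial_fact, actions):
    """Return {fact: [(time, cost), ...]}: the steps of each fact's cost function."""
    steps = {initial_fact: [(0.0, 0)]}
    changed = True
    while changed:  # repeat until no cost can be improved
        changed = False
        for _name, pre, eff, cost, dur in actions:
            for t, c in list(steps.get(pre, [])):
                new_t, new_c = t + dur, c + cost
                # Record a new step only if it is cheaper than anything known by new_t.
                if new_c < cost_at(steps, eff, new_t):
                    steps.setdefault(eff, []).append((new_t, new_c))
                    steps[eff].sort()
                    changed = True
    return steps

steps = propagate("At(Tempe)", ACTIONS)
print(steps["At(LA)"])  # [(1.5, 300), (2.0, 220), (10.0, 100)]
```

The three steps match the cost levels in the figure: $300 from t = 1.5 (helicopter then airplane), $220 from t = 2 (shuttle then airplane), and $100 from t = 10 (car).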

Cost Propagation

Issues:
– At a given time point, each fact may be supported by multiple actions.
– Each action may have more than one precondition.

Propagation rules:
– Cost(f,t) = min {Cost(A,t) : f ∈ Effect(A)}
– Cost(A,t) = Aggregate(Cost(f,t) : f ∈ Pre(A))
  • Sum-propagation: Σ Cost(f,t)
  • Max-propagation: Max {Cost(f,t)}
  • Combination: 0.5 Σ Cost(f,t) + 0.5 Max {Cost(f,t)}
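
As a rough sketch of the two propagation rules, assuming the precondition costs at a given time point are already known: the function names `fact_cost` and `action_cost` and the sample numbers are illustrative, not taken from the slides.

```python
def fact_cost(supporter_costs):
    """Cost(f,t): the cheapest among the actions that have f as an effect."""
    return min(supporter_costs)

def action_cost(precondition_costs, rule="sum"):
    """Cost(A,t): aggregate the costs of A's preconditions under one of three rules."""
    costs = list(precondition_costs)
    total = sum(costs)
    largest = max(costs, default=0)  # an action with no preconditions costs nothing
    if rule == "sum":
        return total
    if rule == "max":
        return largest
    if rule == "combination":
        return 0.5 * total + 0.5 * largest
    raise ValueError("unknown aggregation rule: " + rule)

# Hypothetical action with two preconditions costing 20 and 100 at time t.
print(action_cost([20, 100], "sum"))          # 120
print(action_cost([20, 100], "max"))          # 100
print(action_cost([20, 100], "combination"))  # 110.0
```

Sum-propagation counts a shared sub-plan once per precondition and so may overestimate, while max-propagation ignores all but the most expensive precondition and so may underestimate; the combination rule sits between the two.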

Termination Criteria

Deadline termination: terminate at time point t if:
– ∀ goal G: Deadline(G) ≤ t, or
– ∃ goal G: (Deadline(G) < t) ∧ (Cost(G,t) = ∞)

Fix-point termination: terminate at the time point t where the cost of no proposition can be improved any further.

K-lookahead approximation: at the time point t where Cost(g,t) < ∞, repeat k times the process of applying the (set of) actions that can improve the cost functions.
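
The deadline criterion can be written as a small predicate. This sketch assumes the two conditions read as follows: every goal's deadline is at or before t, or some goal whose deadline has already passed is still unreachable (infinite cost). The function name and the dictionaries passed to it are hypothetical.

```python
import math

def deadline_termination(t, deadlines, goal_costs):
    """Return True if cost propagation should stop at time point t.

    deadlines:  {goal: deadline}
    goal_costs: {goal: Cost(goal, t)}, with math.inf for goals not yet achieved
    """
    # Condition 1: all deadlines have been reached.
    all_due = all(deadline <= t for deadline in deadlines.values())
    # Condition 2: some goal missed its deadline and is still unachieved.
    missed = any(
        deadline < t and goal_costs.get(goal, math.inf) == math.inf
        for goal, deadline in deadlines.items()
    )
    return all_due or missed

print(deadline_termination(6, {"At(LA)": 6}, {"At(LA)": 220}))                  # True
print(deadline_termination(3, {"g1": 2, "g2": 6}, {"g1": math.inf, "g2": 100}))  # True
print(deadline_termination(1, {"At(LA)": 6}, {"At(LA)": math.inf}))              # False
```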

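[Figure: cost function for reaching L.A. over the relaxed temporal planning graph (Drive-car(Tempe,LA), H(T,P), Shuttle(T,P), Plane(P,LA); t = 0, 0.5, 1, 1.5, 10), with cost levels $300, $220, and $100 at times 1.5, 2, and 10, marking the earliest time point and the cheapest cost]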

Heuristic estimation using the cost functions

If the objective function is to minimize time: h = t0

If the objective function is to minimize cost: h = CostAggregate(G, t∞)

If the objective function is a function of both time and cost, O = f(time,cost), then:
h = min f(t, Cost(G,t)) s.t. t0 ≤ t ≤ t∞

E.g., if f(time,cost) = 100·makespan + Cost, then h = 100×2 + 220 = 420, attained at t = 2 (t0 ≤ 2 ≤ t∞).

[Figure: Cost(At(LA)) plotted against time, with cost levels $300, $220, and $100 and time points 0, t0 = 1.5, 2, and t∞ = 10]

Earliest achievement time: t0 = 1.5
Lowest-cost time: t∞ = 10

The cost functions carry enough information to track both the temporal and the cost metrics of the plan, as well as their inter-dependent relations.
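
The direct calculation can be checked with a short sketch. It assumes the cost function for At(LA) is the step function from the travel example (steps at t = 1.5, 2, and 10) and that f does not decrease as time grows, so the minimum over the interval from the earliest achievement time to the lowest-cost time is always attained at one of the steps. The names `COST_STEPS` and `heuristic` are hypothetical.

```python
# (time, cost) steps of Cost(At(LA), t), read off the travel example
COST_STEPS = [(1.5, 300), (2.0, 220), (10.0, 100)]

def heuristic(cost_steps, f):
    """h = min f(t, Cost(G,t)) over the steps of the cost function."""
    return min(f(t, cost) for t, cost in cost_steps)

print(heuristic(COST_STEPS, lambda t, c: t))            # minimize time -> 1.5
print(heuristic(COST_STEPS, lambda t, c: c))            # minimize cost -> 100
print(heuristic(COST_STEPS, lambda t, c: 100 * t + c))  # 100*makespan + cost -> 420.0
```

For the combined objective the three candidates evaluate to 450 (t = 1.5), 420 (t = 2), and 1100 (t = 10), so the shuttle-plus-airplane option at t = 2 gives the heuristic value.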

Heuristic estimation by extracting the relaxed plan

A relaxed plan (Hoffmann) satisfies all the goals while ignoring negative interactions:
– It takes positive interactions into account.
– It provides a base set of actions that can be adjusted according to the neglected (relaxed) information (e.g. negative interactions, resource usage).

We need to find a good relaxed plan (among multiple candidates) according to the objective function.

Heuristic estimation by extracting the relaxed plan (cont.)

General algorithm: traverse backward, searching for actions that support all the goals. When an action A is added to the relaxed plan RP:
– Supported facts: SF = SF ∪ Effects(A)
– Goals: G = (G ∪ Precond(A)) \ SF

Temporal planning with cost: if the objective function is f(time,cost), then A is selected such that
f(t(RP+A), C(RP+A)) + f(t(Gnew), C(Gnew))
is minimal, where Gnew = (G ∪ Precond(A)) \ Effects.

Finally, use mutex information to order A with respect to the actions in RP so that fewer causal constraints are violated.
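
A simplified sketch of the backward extraction loop follows. It is not the planner's algorithm: it assumes the goal update reads G = (G ∪ Precond(A)) \ SF, it scores a candidate action by the objective value of its own duration and cost plus a precomputed estimate for each precondition that is still unsupported, and it omits the final mutex-based ordering step. The data layout, the `estimate` table, and all function names are hypothetical.

```python
def extract_relaxed_plan(goals, initial, actions, objective, estimate):
    """Greedy backward extraction of a relaxed plan (a list of action names)."""
    supported = set(initial)             # SF
    open_goals = set(goals) - supported  # G
    plan = []
    while open_goals:
        goal = min(open_goals)           # deterministic choice of the next goal
        candidates = [a for a in actions if goal in a["eff"]]
        if not candidates:
            raise ValueError("no action supports " + goal)

        def score(action):
            new_goals = action["pre"] - supported
            return (objective(action["dur"], action["cost"])
                    + sum(estimate[p] for p in new_goals))

        best = min(candidates, key=score)
        plan.append(best["name"])
        supported |= best["eff"]                            # SF = SF ∪ Effects(A)
        open_goals = (open_goals | best["pre"]) - supported  # G = (G ∪ Precond(A)) \ SF
    return list(reversed(plan))

ACTIONS = [
    {"name": "Shuttle(Tempe,Phx)", "pre": {"At(Tempe)"}, "eff": {"At(Phx)"}, "cost": 20, "dur": 1.0},
    {"name": "Helicopter(Tempe,Phx)", "pre": {"At(Tempe)"}, "eff": {"At(Phx)"}, "cost": 100, "dur": 0.5},
    {"name": "Car(Tempe,LA)", "pre": {"At(Tempe)"}, "eff": {"At(LA)"}, "cost": 100, "dur": 10.0},
    {"name": "Airplane(Phx,LA)", "pre": {"At(Phx)"}, "eff": {"At(LA)"}, "cost": 200, "dur": 1.0},
]

def objective(time, cost):
    return 100 * time + cost

# Estimated objective value for reaching each fact (120 = shuttle: 100*1 + 20).
ESTIMATE = {"At(Tempe)": 0, "At(Phx)": 120}

plan = extract_relaxed_plan({"At(LA)"}, {"At(Tempe)"}, ACTIONS, objective, ESTIMATE)
print(plan)  # ['Shuttle(Tempe,Phx)', 'Airplane(Phx,LA)']
```

With f = 100·makespan + Cost, the airplane scores 300 + 120 = 420 against 1100 for the car, and the shuttle scores 120 against 150 for the helicopter, so the extracted plan is the shuttle followed by the airplane.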

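[Figure: relaxed plan extraction on the Tempe, Phoenix, L.A. example, using the cost function with levels $300, $220, and $100 at times t0 = 1.5, 2, and t∞ = 10, and the objective f(t,c) = 100·makespan + Cost]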

Empirical evaluation

Objective:
– Demonstrate that a metric temporal planner armed with our approach is able to produce plans that satisfy a variety of cost/makespan tradeoffs.

Test problems: randomly generated logistics problems from TP4 (Haslum & Geffner), with the following actions:
– Load/unload(package,location): Cost = 1; Duration = 1
– Drive-inter-city(location1,location2): Cost = 4.0; Duration = 12.0
– Flight(airport1,airport2): Cost = 15.0; Duration = 3.0
– Drive-intra-city(location1,location2,city): Cost = 2.0; Duration = 2.0

Empirical Results

[Figure: two plots against Alpha: "Cost variation" (y-axis: Total Cost) and "Makespan Variation" (y-axis: Makespan)]

Results over 20 randomly generated temporal logistics problems that involve moving 4 packages between different locations in 3 cities, with the objective function:

O = f(time,cost) = α·Makespan + (1 − α)·TotalCost
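
To illustrate how the weight α shifts the tradeoff, the sketch below applies the weighted objective to the three (time, cost) options of the earlier travel example. It is only an illustration of the formula and does not reproduce the logistics experiments; the names `OPTIONS`, `weighted_objective`, and `best_option` are hypothetical.

```python
# (makespan in hours, total cost in $) options from the Tempe -> L.A. travel example
OPTIONS = [(1.5, 300), (2.0, 220), (10.0, 100)]

def weighted_objective(alpha, makespan, total_cost):
    """O = alpha * Makespan + (1 - alpha) * TotalCost"""
    return alpha * makespan + (1 - alpha) * total_cost

def best_option(alpha, options):
    """Return the (makespan, cost) pair with the lowest objective value."""
    return min(options, key=lambda o: weighted_objective(alpha, o[0], o[1]))

for alpha in (0.1, 0.5, 0.95, 1.0):
    print(alpha, best_option(alpha, OPTIONS))
```

A small α favors the cheap but slow option (10.0, 100), while α = 1 ignores cost entirely and selects the fastest option (1.5, 300).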

Empirical Results (cont.)

A higher look-ahead option generally produces better results in terms of solving time and quality.

The relaxed plan heuristic is generally more informative than the direct plan heuristic.

Related Work

TGP and TP4 aim at makespan optimization (they do not consider cost).

MO-GRT does multi-criteria search, but does not exploit the inter-dependent relations between the criteria.

ASPEN (JPL) uses an iterative repair technique to improve multi-dimensional plan quality.

Conclusion

Introduced time-sensitive cost functions to guide heuristic search according to objective functions involving both time (makespan) and monetary action cost:
– Propagating the cost functions while building the temporal planning graph
– Extracting heuristic values from the cost functions
– Preliminary experimental results with Sapa showing the utility of the time-sensitive cost functions

Future Work

Experiments with domains and problems from the planning competition

Improving the cost functions through better propagation rules and mutex information when building the temporal planning graph (TGP approach)

Heuristics for tracking other types of plan quality, such as execution flexibility

Multi-objective search involving non-combinable criteria
