Planning Graph-based Heuristics for Cost-sensitive Temporal Planning
Minh B. Do & Subbarao Kambhampati
CSE Department, Arizona State University
{binhminh,rao}@asu.edu
Motivation
Multi-dimensional nature of plan quality in metric temporal planning:
– Temporal quality (e.g. makespan, slack)
– Plan cost (e.g. cumulative action cost, resource consumption)
Necessitates multi-objective optimization:
– Modeling objective functions
– Tracking different quality metrics and heuristic estimation
Challenge: There may be inter-dependent relations between different quality metrics
Example
Option 1: Tempe → Phoenix (Bus) → Los Angeles (Airplane)
– Less time: 3 hours; more expensive: $200
Option 2: Tempe → Los Angeles (Car)
– More time: 12 hours; less expensive: $50
Given a deadline constraint (6 hours), only option 1 is viable.
Given a money constraint ($100), only option 2 is viable.
[Figure: routes between Tempe, Phoenix, and Los Angeles]
General Problem
Problem specification + Objective function → Planner → Good quality solution
How to design the objective function? (We do not investigate.)
– User-defined
– Learning the user's utility model
Given an objective function that involves both time and cost quality, find heuristics that are sensitive to the cost function. (We investigate.)
Our approach
Using the Temporal Planning Graph (Smith & Weld) structure to track the time-sensitive cost function:
– Estimation of the earliest time (makespan) to achieve all goals
– Estimation of the lowest cost to achieve goals
– Estimation of the cost to achieve goals given a specific makespan value
Using this information to calculate the heuristic value for an objective function involving both time and cost
Outline
Action representation and Temporal Planning Graph
Time-sensitive cost functions:
– Cost propagation using the temporal planning graph
– Termination criteria for the cost propagation process
Deriving heuristic values from cost functions:
– Direct calculation
– Heuristic by relaxed plan extraction
Empirical evaluation
Conclusion and future work
Action Representation
Similar to PDDL2.1 Level 3:
– Actions have non-uniform durations and may consume resources
– Preconditions are true at the start point or hold true for the action duration
– Effects occur at the start or end points
[Figure: the action Load(package,truck,place), annotated with At(package,place), At(truck,place), and In(package,truck)]
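The action model above can be sketched as a small data structure. This is a minimal illustration, not Sapa's actual representation: the class name `DurativeAction`, its field names, and the `"not ..."` string convention for delete effects are all hypothetical. The slide does not say which Load conditions hold at start versus over the whole duration, or which effects occur at start versus end, so the split below is an assumption, as are the duration and cost values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DurativeAction:
    """A durative action with conditions and effects tied to time points."""
    name: str
    duration: float
    cost: float = 0.0
    start_conditions: frozenset = frozenset()    # must be true when the action starts
    overall_conditions: frozenset = frozenset()  # must hold for the whole duration
    start_effects: frozenset = frozenset()       # take effect at the start point
    end_effects: frozenset = frozenset()         # take effect at the end point

    def effect_times(self, start):
        """Map each effect to the absolute time at which it occurs."""
        times = {effect: start for effect in self.start_effects}
        times.update({effect: start + self.duration for effect in self.end_effects})
        return times


# Assumed assignment of conditions and effects for the Load action.
load = DurativeAction(
    name="Load(package,truck,place)",
    duration=1.0,  # placeholder value
    cost=1.0,      # placeholder value
    start_conditions=frozenset({"At(package,place)"}),
    overall_conditions=frozenset({"At(truck,place)"}),
    start_effects=frozenset({"not At(package,place)"}),
    end_effects=frozenset({"In(package,truck)"}),
)
```

Starting `load` at time 2.0 would then place the delete effect at 2.0 and `In(package,truck)` at 3.0.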
The (Relaxed) Temporal PG
[Figure: relaxed temporal planning graph over Tempe, Phoenix, and Los Angeles with actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), and Airplane(P,LA) at t = 0, 0.5, 1, 1.5, 10]
Time-sensitive Cost Function
The standard (temporal) planning graph (TPG) shows time-related estimates, e.g. the earliest time to achieve a fact or to execute an action.
The TPG does not show cost estimates for achieving facts or executing actions.
Shuttle(Tempe,Phx): Cost: $20; Time: 1.0 hour
Helicopter(Tempe,Phx): Cost: $100; Time: 0.5 hour
Car(Tempe,LA): Cost: $100; Time: 10 hours
Airplane(Phx,LA): Cost: $200; Time: 1.0 hour
[Figure: temporal planning graph over Tempe, Phoenix, and L.A. with actions Drive-car(Tempe,LA), Heli(T,P), Shuttle(T,P), and Airplane(P,LA) at t = 0, 0.5, 1, 1.5, 10, next to a plot of cost against time with time ticks 0, 1.5, 2, 10 and cost ticks $300, $220, $100]
Estimating the Cost Function
[Figure: step-by-step estimation of the cost functions on the temporal planning graph for the Shuttle, Helicopter, Car, and Airplane actions. Labels: Cost(At(LA)); Cost(At(Phx)) = Cost(Flight(Phx,LA)). Time ticks 0, 1.5, 2, 10; cost ticks $300, $220, $100]
Cost Propagation
Issues:
– At a given time point, each fact is supported by multiple actions
– Each action has more than one precondition
Propagation rules:
– Cost(f,t) = min {Cost(A,t) : f ∈ Effect(A)}
– Cost(A,t) = Aggregate(Cost(f,t) : f ∈ Pre(A))
  • Sum-propagation: Σ Cost(f,t)
  • Max-propagation: Max {Cost(f,t)}
  • Combination: 0.5 Σ Cost(f,t) + 0.5 Max {Cost(f,t)}
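The propagation rules can be sketched on the Tempe to L.A. example. This is an illustrative event-queue implementation, not the planner's code: the function `propagate`, the `ACTIONS` tuple layout, and the `AGGREGATES` table are hypothetical names. It also assumes that an action's own execution cost is added on top of the aggregated precondition costs, which is the reading that reproduces the $220 and $300 values in the example.

```python
import heapq

# (name, preconditions, effects, duration in hours, execution cost in $)
ACTIONS = [
    ("Shuttle(Tempe,Phx)", ["At(Tempe)"], ["At(Phx)"], 1.0, 20.0),
    ("Helicopter(Tempe,Phx)", ["At(Tempe)"], ["At(Phx)"], 0.5, 100.0),
    ("Car(Tempe,LA)", ["At(Tempe)"], ["At(LA)"], 10.0, 100.0),
    ("Airplane(Phx,LA)", ["At(Phx)"], ["At(LA)"], 1.0, 200.0),
]

# The three aggregation options for precondition costs.
AGGREGATES = {
    "sum": sum,
    "max": max,
    "combination": lambda costs: 0.5 * sum(costs) + 0.5 * max(costs),
}


def propagate(initial_facts, actions, aggregate="sum"):
    """Return {fact: [(time, cost), ...]}, the breakpoints of each step cost function."""
    agg = AGGREGATES[aggregate]
    current = {}  # fact -> cheapest cost found so far
    history = {}  # fact -> breakpoints where the cost improved
    events = [(0.0, 0.0, fact) for fact in initial_facts]
    heapq.heapify(events)
    while events:
        t, cost, fact = heapq.heappop(events)
        if cost >= current.get(fact, float("inf")):
            continue  # this event does not improve the fact's cost
        current[fact] = cost
        history.setdefault(fact, []).append((t, cost))
        # Re-evaluate every action that uses this fact and is fully supported.
        for _name, pre, eff, duration, exec_cost in actions:
            if fact in pre and all(p in current for p in pre):
                action_cost = exec_cost + agg([current[p] for p in pre])
                for effect in eff:
                    heapq.heappush(events, (t + duration, action_cost, effect))
    return history


cost_functions = propagate(["At(Tempe)"], ACTIONS)
```

Under these assumptions `cost_functions["At(LA)"]` holds the breakpoints (1.5, $300), (2, $220), and (10, $100), matching the plotted cost function. The loop runs until no event improves any cost, which corresponds to a fix-point style of termination.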
Termination Criteria
Deadline Termination: Terminate at time point t if:
– ∀ goal G: Deadline(G) ≤ t, or
– ∃ goal G: (Deadline(G) < t) ∧ (Cost(G,t) = ∞)
Fix-point Termination: Terminate at the time point t where we can no longer improve the cost of any proposition.
K-lookahead approximation: At the time point t where Cost(g,t) < ∞, repeat k times the process of applying the (set of) actions that can improve the cost functions.
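[Figure: temporal planning graph with Drive-car(Tempe,LA), H(T,P), Shuttle(T,P), and Plane(P,LA) at t = 0, 0.5, 1, 1.5, 10, and the cost-versus-time plot (time ticks 0, 1.5, 2, 10; cost ticks $300, $220, $100) annotated with "Earliest time point" and "Cheapest cost"]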
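The deadline termination test can be written as a small predicate. This sketch assumes the two conditions read as "every goal deadline is at or before t" or "some goal has passed its deadline while its cost is still infinite"; the function name `deadline_terminate` and its dictionary arguments are hypothetical.

```python
import math


def deadline_terminate(t, deadlines, cost_at_t):
    """Decide whether cost propagation should stop at time point t.

    deadlines: {goal: deadline}
    cost_at_t: {goal: Cost(goal, t)}; goals missing from the map count as infinite cost
    """
    # Condition 1: all goal deadlines have been reached.
    all_deadlines_reached = all(d <= t for d in deadlines.values())
    # Condition 2: some goal missed its deadline and is still unachieved.
    some_goal_missed = any(
        d < t and math.isinf(cost_at_t.get(goal, math.inf))
        for goal, d in deadlines.items()
    )
    return all_deadlines_reached or some_goal_missed
```

For instance, with a single goal At(LA) and a 6-hour deadline, the predicate is false at t = 3 and true at t = 7.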
Heuristic estimation using the cost functions
If the objective function is to minimize time: h = t0
If the objective function is to minimize cost: h = CostAggregate(G, t∞)
If the objective function is a function of both time and cost, O = f(time,cost), then:
h = min f(t, Cost(G,t)) s.t. t0 ≤ t ≤ t∞
E.g.: f(time,cost) = 100 × makespan + Cost; then h = 100 × 2 + 220 = 420, at t = 2 (t0 ≤ 2 ≤ t∞)
[Figure: plot of Cost(At(LA)) against time, with time ticks 0, t0 = 1.5, 2, t∞ = 10 and cost ticks $300, $220, $100]
Earliest achievement time: t0 = 1.5
Lowest cost time: t∞ = 10
The cost functions carry the information needed to track both the temporal and the cost metrics of the plan, and their inter-dependent relations.
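The direct calculation can be checked on the Cost(At(LA)) example. The sketch below is illustrative only: `COST_AT_LA` and `heuristic` are hypothetical names, and the code evaluates f only at the breakpoints of the step function, which is sufficient when f does not decrease as time grows (the cost is constant between breakpoints).

```python
# Breakpoints (time, cost) of Cost(At(LA)) from the example.
COST_AT_LA = [(1.5, 300.0), (2.0, 220.0), (10.0, 100.0)]


def heuristic(cost_function, objective):
    """Return (h, t): the minimum of objective(t, cost) over the breakpoints."""
    return min((objective(t, cost), t) for t, cost in cost_function)


# Minimize time only: h is the earliest achievement time t0.
h_time = COST_AT_LA[0][0]

# Minimize cost only: h is the cost at the last breakpoint (lowest cost time).
h_cost = COST_AT_LA[-1][1]

# Combined objective f(time, cost) = 100 * makespan + cost.
h_both, best_time = heuristic(COST_AT_LA, lambda t, cost: 100 * t + cost)
```

The three candidates evaluate to 450 (t = 1.5), 420 (t = 2), and 1100 (t = 10), so the combined heuristic picks h = 420 at t = 2.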
Heuristic estimation by extracting the relaxed plan
Relaxed plan (Hoffmann) satisfies all the goals while ignoring negative interactions:
– Takes positive interactions into account
– Provides a base set of actions that can be adjusted according to the neglected (relaxed) information (e.g. negative interactions, resource usage)
Need to find a good relaxed plan (among multiple ones) according to the objective function
Heuristic estimation by extracting the relaxed plan
General algorithm: Traverse backward, searching for actions that support all the goals. When action A is added to the relaxed plan RP:
– Supported facts: SF = SF ∪ Effects(A)
– Goals: G = (G ∪ Precond(A)) \ SF
Temporal planning with cost: If the objective function is f(time,cost), then A is selected such that f(t(RP+A), C(RP+A)) + f(t(Gnew), C(Gnew)) is minimal, where Gnew = (G ∪ Precond(A)) \ Effects
Finally, use mutexes to set orderings between A and the actions in RP so that fewer causal constraints are violated
[Figure: cost-versus-time plot (time ticks 0, t0 = 1.5, 2, t∞ = 10; cost ticks $300, $220, $100) and the Tempe, Phoenix, L.A. example, with f(t,c) = 100 × makespan + Cost]
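A backward extraction of this kind can be sketched on the travel example. This is a simplification, not the selection rule from the slide: each goal is supported by the action with the smallest f(time, cost) estimate, where the estimate is computed recursively from the initial state, and no mutex-based ordering is performed. The names `estimate` and `extract_relaxed_plan` are hypothetical, and the code assumes an acyclic domain in which every action has at least one precondition and every goal is reachable.

```python
from functools import lru_cache

# name -> (preconditions, effects, duration in hours, execution cost in $)
ACTIONS = {
    "Shuttle(Tempe,Phx)": (("At(Tempe)",), ("At(Phx)",), 1.0, 20.0),
    "Helicopter(Tempe,Phx)": (("At(Tempe)",), ("At(Phx)",), 0.5, 100.0),
    "Car(Tempe,LA)": (("At(Tempe)",), ("At(LA)",), 10.0, 100.0),
    "Airplane(Phx,LA)": (("At(Phx)",), ("At(LA)",), 1.0, 200.0),
}
INITIAL = frozenset({"At(Tempe)"})


def f(time, cost):
    """Objective function from the example: 100 * makespan + cost."""
    return 100 * time + cost


@lru_cache(maxsize=None)
def estimate(fact):
    """Return (time, cost, action) for the f-best way found to reach fact."""
    if fact in INITIAL:
        return (0.0, 0.0, None)
    best = None
    for name, (pre, eff, duration, exec_cost) in ACTIONS.items():
        if fact not in eff:
            continue
        subs = [estimate(p) for p in pre]
        time = max(s[0] for s in subs) + duration
        cost = sum(s[1] for s in subs) + exec_cost
        if best is None or f(time, cost) < f(best[0], best[1]):
            best = (time, cost, name)
    return best


def extract_relaxed_plan(goals):
    """Traverse backward from the goals, collecting supporting actions."""
    goals = set(goals) - INITIAL
    supported = set(INITIAL)
    plan = []
    while goals:
        goal = goals.pop()
        _, _, name = estimate(goal)
        pre, eff, _, _ = ACTIONS[name]
        plan.append(name)
        supported |= set(eff)                    # SF = SF ∪ Effects(A)
        goals = (goals | set(pre)) - supported   # G = (G ∪ Precond(A)) \ SF
    return plan[::-1]  # report actions in forward order


relaxed_plan = extract_relaxed_plan(["At(LA)"])
```

With f = 100 × makespan + cost, the sketch selects the shuttle followed by the airplane, the same option that gives h = 420 in the direct calculation.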
Empirical evaluation
Objective:
– Demonstrate that a metric temporal planner armed with our approach is able to produce plans that satisfy a variety of cost/makespan tradeoffs.
Testing problems: Randomly generated logistics problems from TP4 (Haslum & Geffner)
Load/unload(package,location): Cost = 1; Duration = 1
Drive-inter-city(location1,location2): Cost = 4.0; Duration = 12.0
Flight(airport1,airport2): Cost = 15.0; Duration = 3.0
Drive-intra-city(location1,location2,city): Cost = 2.0; Duration = 2.0
Empirical Results
[Figure: "Cost variation", Total Cost (axis 0 to 60) against Alpha (0.1 to 1)]
[Figure: "Makespan Variation", Makespan (axis 15 to 31) against Alpha (0.1 to 1)]
Results over 20 randomly generated temporal logistics problems that involve moving 4 packages between different locations in 3 cities:
O = f(time,cost) = α × Makespan + (1 − α) × TotalCost
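How α shifts the tradeoff can be illustrated with the weighted objective. The sketch below does not use the experimental data: it reuses the three (makespan, cost) options from the Tempe to L.A. example purely as an assumed stand-in, and the names `objective` and `best_option` are hypothetical.

```python
# (makespan in hours, total cost in $) options from the travel example,
# used here only as illustrative data.
OPTIONS = [(1.5, 300.0), (2.0, 220.0), (10.0, 100.0)]


def objective(alpha, makespan, total_cost):
    """O = alpha * Makespan + (1 - alpha) * TotalCost."""
    return alpha * makespan + (1 - alpha) * total_cost


def best_option(alpha, options=OPTIONS):
    """Return the (makespan, cost) pair that minimizes the weighted objective."""
    return min(options, key=lambda option: objective(alpha, *option))
```

On this toy data, α = 0 selects the cheapest option (10 hours, $100), α = 1 selects the fastest (1.5 hours, $300), and α = 0.95 selects the intermediate one (2 hours, $220).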
Empirical Results (cont.)
A higher look-ahead option generally produces better results in terms of solving time and quality.
The relaxed plan heuristic is generally more informative than the direct plan heuristic.
Related Work
TGP and TP4 aim at makespan optimization (they do not consider cost).
MO-GRT does multi-criteria search, but does not exploit the inter-dependent relations between the criteria.
ASPEN (JPL) uses an iterative repair technique to improve multi-dimensional plan quality.
Conclusion
Introduced time-sensitive cost functions to guide the heuristic search according to objective functions involving both time (makespan) and monetary action cost:
– Propagating the cost functions while building the temporal planning graph
– Extracting the heuristic values using the cost functions
– Preliminary experimental results with Sapa showing the utility of the time-sensitive cost functions
Future Work
Experiments with domains and problems from the planning competition
Improving the cost functions with better propagation rules and mutex information when building the temporal planning graph (TGP approach)
Heuristics for tracking other types of plan quality, such as execution flexibility
Multi-objective search involving non-combinable criteria