
Post on 21-Jan-2016


© Daniel S. Weld 1

Logistics

• PS3
  - Project
  - Additional problem

• Reading
  - Planning & CSP for “today”
  - SATplan paper review for Wed


PSet 2

Student(s) = s is a student
Takes(s, c, q) = s takes c during quarter q
IsQtr(q) = q is a quarter
Passes(s, c) = s passes course c
Higher(h, g) = grade h is higher than g
IsGradeIn(g, c) = g is a grade in course c

Every student who takes French passes it

Needs time argument

Needs person & time args

$\forall s, q\; [\mathit{Student}(s) \land \mathit{IsQtr}(q) \land \mathit{Takes}(s, \mathit{French}, q)] \Rightarrow \mathit{Passes}(s, \mathit{French})$

But what if Joe takes French in the fall, fails, and then doesn’t take French in the winter?
The antecedent is false… but $P \Rightarrow Q \equiv \neg P \lor Q$
So the formula is true?!?

What’s the fix??


PSet 2

Higher(h, g) = grade h is higher than g
GradeOf(g, s, c, q) = s has grade g in course c during q

The best score in Greek is always better than that in French

$\forall g, q\; [\exists s\; \mathit{GradeOf}(g, s, \mathit{French}, q)] \Rightarrow [\exists t, h\; \mathit{GradeOf}(h, t, \mathit{Greek}, q) \land \mathit{Higher}(h, g)]$


573 Topics

[Figure: map of course topics: Agency; Problem Spaces; Search; Knowledge Representation; Planning; Uncertainty; MDPs; Supervised Learning; Reinforcement Learning]


Immediate Outline

• Constraint satisfaction
  - Definition: factoring state spaces
  - Backtracking policies
  - Variable-ordering heuristics & preprocessing
• The planning problem
• Searching world states
• Graphplan
• SATplan
• Reachability analysis & heuristics
• Planning under uncertainty


Constraint Satisfaction

• Kind of search in which
  - States are factored into sets of variables
  - Search = assigning values to these variables
  - Structure of space is encoded with constraints
• Backtracking-style algorithms work
  - E.g. DFS for SAT (i.e. DPLL)
• But other techniques add speed
  - Propagation
  - Variable ordering
  - Preprocessing


Chinese Constraint Network

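[Figure: constraint network for ordering a Chinese meal. Variables: Soup, Chicken Dish, Vegetable, Rice, Seafood, Pork Dish, Appetizer. Constraints: Total Cost < $30; Must be Hot & Sour; No Peanuts (shown twice); Not Chow Mein; Not Both Spicy.]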


CSPs in the real world

• Scheduling Space Shuttle Repair
• Airport gate assignments
• Transportation Planning
• Supply-chain management
• Computer Configuration
• Diagnosis
• UI Optimization
• Etc...

[Figure caption: Adapting to Device Characteristics]


Binary Constraint Network

• Set of n variables: $x_1 \ldots x_n$
• Value domains for each variable: $D_1 \ldots D_n$
• Set of binary constraints (also “relations”)
  - $R_{ij} \subseteq D_i \times D_j$
  - Specifies which value pairs $(x_i, x_j)$ are consistent

Example (map coloring):
• One variable for each country
• Each domain = 4 colors
• $R_{ij}$ enforces $\neq$


Binary Constraint Network

Partial assignment of values = tuple of pairs
  {...(x, a)…} means variable x gets value a...

Tuple = consistent if all constraints satisfied
Tuple = full solution if consistent + has all vars

Tuple $\{(x_i, a_i) \ldots (x_j, a_j)\}$ = consistent w/ a set of vars $\{x_m \ldots x_n\}$
  iff $\exists a_m \ldots a_n$ such that $\{(x_i, a_i) \ldots (x_j, a_j), (x_m, a_m) \ldots (x_n, a_n)\}$ = consistent


N Queens

• Variables = board columns
• Domain values = rows
• $R_{ij} = \{(a_i, a_j) : (a_i \neq a_j) \land (|i - j| \neq |a_i - a_j|)\}$
  - e.g. $R_{12} = \{(1,3), (1,4), (2,4), (3,1), (4,1), (4,2)\}$

[Figure: 4×4 board with queens placed in the first three columns]

• $\{(x_1, 2), (x_2, 4), (x_3, 1)\}$ consistent with $(x_4)$
• Shorthand: “{2, 4, 1} consistent with $x_4$”
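The relation and the "consistent with" test above can be sketched in a few lines of Python. This is only an illustration: the function names `consistent_pair`, `relation`, and `consistent_with` are made up here, and columns and rows are numbered from 1 as on the slide.

```python
from itertools import product

def consistent_pair(i, ai, j, aj):
    """R_ij: queens in columns i and j (rows ai, aj) do not attack each other."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def relation(i, j, n):
    """Enumerate R_ij explicitly as a set of allowed (ai, aj) pairs."""
    return {(ai, aj) for ai, aj in product(range(1, n + 1), repeat=2)
            if consistent_pair(i, ai, j, aj)}

def consistent_with(rows, var, n):
    """True if columns 1..k with the given rows leave some value for column `var`."""
    return any(all(consistent_pair(c, r, var, v)
                   for c, r in enumerate(rows, start=1))
               for v in range(1, n + 1))

print(sorted(relation(1, 2, 4)))          # the pairs listed for R12 on a 4x4 board
print(consistent_with([2, 4, 1], 4, 4))   # the "{2, 4, 1} consistent with x4" check
```

Storing the relation as a predicate rather than an explicit set keeps the sketch short; the explicit set is only built to compare against the R12 example.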


CSP as a search problem?

• What are states?  (nodes in graph)

• What are the operators?   (arcs between nodes)

• Initial state?
• Goal test?


Chronological Backtracking (BT) (e.g., depth first search)

[Figure: queens board showing the BT search trace]

• Consistency check performed in the order in which vars were instantiated
• If c-check fails, try next value of current var
• If no more values, backtrack to most recent var
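The three rules above map directly onto a recursive depth-first search. The following is a minimal sketch for n-queens, assuming columns are instantiated in the fixed order x1, x2, ...; the node and check counters are an addition for illustration and are not claimed to match any figures in these slides.

```python
def ok(i, ai, j, aj):
    """R_ij for n-queens: different rows and different diagonals."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def backtrack(n):
    """Chronological backtracking; returns (solution, nodes, consistency_checks)."""
    rows = []                                  # rows[k] = row of the queen in column k+1
    stats = {"nodes": 0, "checks": 0}

    def extend():
        if len(rows) == n:
            return True
        col = len(rows) + 1
        for v in range(1, n + 1):
            stats["nodes"] += 1
            good = True
            # c-checks run in the order in which the vars were instantiated
            for c, r in enumerate(rows, start=1):
                stats["checks"] += 1
                if not ok(c, r, col, v):
                    good = False               # c-check failed: try next value
                    break
            if good:
                rows.append(v)
                if extend():
                    return True
                rows.pop()
        return False                           # no more values: back to most recent var

    found = extend()
    return (list(rows) if found else None), stats["nodes"], stats["checks"]

solution, nodes, checks = backtrack(6)
print(solution, nodes, checks)
```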


Backjumping (BJ)

• Similar to BT, but
  - more efficient when no consistent instantiation can be found for the current var
• Instead of backtracking to most recent var…
  - BJ reverts to deepest var which was c-checked against the current var

BJ discovers (2, 5, 3, 6) inconsistent with $x_6$
No sense trying other values of $x_5$

[Figure: queens board illustrating the backjump]
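To see where the jump lands in the 6-queens example, one can compute, for a dead-end variable, the deepest earlier variable that any of its values was c-checked against. The sketch below does only that computation (it is not a full BJ search), and the helper name `backjump_target` is hypothetical. It assumes $x_5$ currently holds row 4, which is the one row still open for $x_5$ after (2, 5, 3, 6).

```python
def ok(i, ai, j, aj):
    """R_ij for n-queens: different rows and different diagonals."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def backjump_target(rows, n):
    """rows = rows of columns 1..k; the next column k+1 is a suspected dead end.
    Returns the deepest column c-checked against any value of column k+1,
    or None if some value of column k+1 is actually consistent."""
    col = len(rows) + 1
    deepest = 0
    for v in range(1, n + 1):
        for c, r in enumerate(rows, start=1):  # checks in instantiation order
            deepest_checked = c
            if not ok(c, r, col, v):
                break                          # first failing check ends this value
        else:
            return None                        # v passed every check: no dead end
        deepest = max(deepest, deepest_checked)
    return deepest

# x1..x5 = (2, 5, 3, 6, 4); every value of x6 fails against x1..x4.
print(backjump_target([2, 5, 3, 6, 4], 6))
```

Because no value of $x_6$ ever reaches a check against $x_5$, the target is $x_4$, which is why trying other values of $x_5$ cannot help.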


Conflict-Directed Backjumping (CBJ)

• More sophisticated backjumping behavior
• Each variable has conflict set CS
  - Set of vars that failed c-checks w/ current val
  - Update this set on every failed c-check
• When no more values to try for $x_i$
  - Backtrack to deepest var, $x_d$, in $CS(x_i)$
  - And update $CS(x_d) := CS(x_d) \cup CS(x_i) - \{x_d\}$

CBJ discovers (2, 5, 3) inconsistent with $\{x_5, x_6\}$

[Figure: queens board over $x_1 \ldots x_6$ annotated with conflict sets: $CS(x_5) = \{1, 2, 3\}$, $CS(x_6) = \{1, 2, 3, 5\}$]
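One compact way to express the conflict-set bookkeeping is a recursive search in which each call returns its conflict set on failure. This is a sketch of that idea for n-queens, not necessarily the formulation used in the lecture; it records only the first failing variable for each rejected value, and the function name `cbj` is an assumption.

```python
def ok(i, ai, j, aj):
    """R_ij for n-queens: different rows and different diagonals."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def cbj(n):
    """Conflict-directed backjumping for n-queens; returns a solution or None."""
    rows = []                                  # rows[k] = row of column k+1

    def solve():
        """Return (solution, conflict_set) for the next uninstantiated column."""
        col = len(rows) + 1
        if col > n:
            return list(rows), set()
        conflicts = set()                      # CS(x_col)
        for v in range(1, n + 1):
            culprit = next((c for c, r in enumerate(rows, start=1)
                            if not ok(c, r, col, v)), None)
            if culprit is not None:            # failed c-check: remember the var
                conflicts.add(culprit)
                continue
            rows.append(v)
            solution, child_cs = solve()
            rows.pop()
            if solution is not None:
                return solution, set()
            if col not in child_cs:            # dead end below does not involve x_col,
                return None, child_cs          # so jump straight over this column
            conflicts |= child_cs - {col}      # CS(x_d) := CS(x_d) ∪ CS(x_i) - {x_d}
        return None, conflicts

    return solve()[0]

print(cbj(6))
```

Returning the child's conflict set unchanged when the current column is not in it is what implements the jump: the call stack unwinds until it reaches the deepest variable named in the set.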


BT vs. BJ vs. CBJ


Forward Checking (FC)

• Perform Consistency Check Forward
• Whenever a var is assigned a value
  - Prune inconsistent values from as-yet unvisited variables
  - Backtrack if domain of any var ever collapses

[Figure: queens board illustrating forward checking]

FC only visits consistent nodes, but not all such nodes
  - skips (2, 5, 3, 4), which CBJ visits
But FC can’t detect that (2, 5, 3) is inconsistent with $\{x_5, x_6\}$
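Forward checking can be sketched by carrying the pruned domains down the recursion. The version below is a minimal illustration for n-queens with a fixed column order; the name `forward_check` and the dictionary-of-lists domain representation are assumptions made for this example.

```python
def ok(i, ai, j, aj):
    """R_ij for n-queens: different rows and different diagonals."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def forward_check(n):
    """n-queens with forward checking; returns the first solution or None."""
    def solve(col, rows, domains):
        if col > n:
            return rows
        for v in domains[col]:
            # Prune values of as-yet unvisited variables that clash with (col, v).
            pruned = {f: [w for w in domains[f] if ok(col, v, f, w)]
                      for f in range(col + 1, n + 1)}
            if all(pruned.values()):           # no future domain collapsed
                result = solve(col + 1, rows + [v], {**domains, **pruned})
                if result is not None:
                    return result
        return None                            # every value collapsed a domain or failed

    start = {c: list(range(1, n + 1)) for c in range(1, n + 1)}
    return solve(1, [], start)

print(forward_check(6))
```

Since every value left in a domain is already consistent with the past assignments, no backward consistency checks are needed when a variable is visited.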

© Daniel S. Weld 19

Number of Nodes Explored

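[Figure: hierarchy of algorithms by number of nodes explored, from more to fewer. Groups shown: BT = BM; BJ = BMJ = BMJ2; CBJ = BM-CBJ = BM-CBJ2; FC; FC-CBJ.]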

© Daniel S. Weld 20

Number of Consistency Checks

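[Figure: hierarchy of algorithms by number of consistency checks, from more to fewer. Algorithms shown: BT, BJ, BM, BMJ, BMJ2, CBJ, BM-CBJ, BM-CBJ2, FC, FC-CBJ. The ordering is not recoverable from the transcript.]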

© Daniel S. Weld 21

Dynamic variable ordering

• In the N-queens examples we assumed
  - First $x_1$ then $x_2$ then ...
• But this order is not required
  - Any order is OK with respect to completeness
  - A good order leads to huge speedup
• A good heuristic (MRV):
  - Choose the variable w/ minimum remaining values
• This is easy if one is doing FC
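MRV is easy with FC because the pruned domains are already at hand, so picking the next variable is a single `min` over domain sizes. A minimal sketch for n-queens follows; the name `fc_mrv` and the tie-breaking rule (lowest column first) are choices made for this example only.

```python
def ok(i, ai, j, aj):
    """R_ij for n-queens: different rows and different diagonals."""
    return ai != aj and abs(i - j) != abs(ai - aj)

def fc_mrv(n):
    """Forward checking with dynamic MRV ordering; returns {column: row} or None."""
    def solve(assigned, domains):
        if len(assigned) == n:
            return assigned
        # MRV: choose the unassigned variable with the fewest remaining values.
        var = min((c for c in domains if c not in assigned),
                  key=lambda c: len(domains[c]))
        for v in domains[var]:
            pruned = {f: [w for w in domains[f] if ok(var, v, f, w)]
                      for f in domains if f not in assigned and f != var}
            if all(pruned.values()):           # no remaining domain collapsed
                result = solve({**assigned, var: v}, {**domains, **pruned})
                if result is not None:
                    return result
        return None

    start = {c: list(range(1, n + 1)) for c in range(1, n + 1)}
    return solve({}, start)

print(fc_mrv(8))
```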

© Daniel S. Weld 22

DVO MRV => WOW!!

Algo        17 Queens   21 Queens   27 Queens
FC-CBJmrv        1959        2572        5602
FC-CBJ          67090      114612      737008
FC              67329      115120     7448781
CBJ            428645      949128
BJ             436340      972065
BT             485597     1156015

© Daniel S. Weld 23

Preprocessing Strategies

• Even FC-CBJ is $O(b^d)$ time worst case
• Sometimes useful to preprocess
  - before doing exponential search
  - spend polynomial time to achieve local consistency
• Arc consistency
  - Consider all pairs of vars
  - Can values be eliminated from a domain à la FC?
  - Propagate
  - $O(d^2)$ time where d = number of vars
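The slide does not name a specific arc-consistency procedure. One common choice is the AC-3 scheme, sketched below under that assumption: a queue of arcs is revised until no domain changes. The names `revise` and `ac3`, the `neighbors` adjacency map, and the three-variable coloring example are all illustrative.

```python
from collections import deque

def revise(domains, xi, xj, allowed):
    """Drop values of xi that have no supporting value in xj's domain."""
    keep = [a for a in domains[xi]
            if any(allowed(xi, a, xj, b) for b in domains[xj])]
    changed = len(keep) != len(domains[xi])
    domains[xi] = keep
    return changed

def ac3(domains, neighbors, allowed):
    """Make every arc consistent in place; return False if a domain empties."""
    queue = deque((xi, xj) for xi in domains for xj in neighbors[xi])
    while queue:
        xi, xj = queue.popleft()
        if revise(domains, xi, xj, allowed):
            if not domains[xi]:
                return False
            # Propagate: arcs pointing at xi may have lost their support.
            queue.extend((xk, xi) for xk in neighbors[xi] if xk != xj)
    return True

# Tiny coloring chain A - B - C where A is already fixed to red.
domains = {"A": ["red"], "B": ["red", "green"], "C": ["red", "green"]}
neighbors = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
different = lambda xi, a, xj, b: a != b
print(ac3(domains, neighbors, different), domains)
```

In this example preprocessing alone fixes B to green and C to red, so the later search has nothing left to decide.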

© Daniel S. Weld 24

Constraint Satisfaction Recap

• CSP = Factoring a state space
• Chronological Backtracking (BT)
• Backjumping (BJ)
• Conflict-Directed Backjumping (CBJ)
• Forward checking (FC)
• Dynamic variable ordering heuristics
• Preprocessing Strategies

© Daniel S. Weld 25

Immediate Outline

• The planning problem
• Searching world states
• Constraint satisfaction
• Graphplan
• SATplan
• Reachability analysis & heuristics
• Planning under uncertainty

© Daniel S. Weld 26

Ways to make “plans”

Generative Planning
  - Reason from first principles (knowledge of actions)
  - Requires formal model of actions

Case-Based Planning
  - Retrieve old plan which worked on similar problem
  - Revise retrieved plan for this problem

Reinforcement Learning
  - Act “randomly”, noticing effects
  - Learn reward, action models, policy

© Daniel S. Weld 27

Generative Planning

Input
  - Description of (initial state of) world (in some KR)
  - Description of goal (in some KR)
  - Description of available actions (in some KR)

Output
  - Controller
    E.g. Sequence of actions
    E.g. Plan with loops and conditionals
    E.g. Policy = f: states -> actions

© Daniel S. Weld 28

Input Representation

• Description of initial state of world
  - E.g., Set of propositions:
    ((block a) (block b) (block c) (on-table a) (on-table b)
     (clear a) (clear b) (clear c) (arm-empty))
• Description of goal: i.e. set of worlds or ??
  - E.g., Logical conjunction
  - Any world satisfying conjunction is a goal
    (and (on a b) (on b c))
• Description of available actions

© Daniel S. Weld 29

Simplifying Assumptions

[Figure: agent loop linking Environment, Percepts, “What action next?”, and Actions]

• Static vs. Dynamic
• Fully Observable vs. Partially Observable
• Deterministic vs. Stochastic
• Instantaneous vs. Durative
• Full vs. Partial satisfaction
• Perfect vs. Noisy

© Daniel S. Weld 30

Classical Planning

[Figure: the agent loop with the classical choices marked: Static, Fully Observable, Deterministic, Instantaneous, Full, Perfect]

I = initial state
G = goal state
$O_i$ = operator with (prec) and (effects)

[ I ] $O_i$ $O_j$ $O_k$ $O_m$ [ G ]

© Daniel S. Weld 31

[Figure: “Classical Planning” (Static, Deterministic, Observable, Instantaneous, Propositional) at the center, with each relaxed assumption labeled by the techniques it calls for. Labels, in transcript order:
  - Dynamic: Replanning / Situated Plans
  - Durative: Temporal Reasoning
  - Continuous: Numeric Constraint Reasoning (LP/ILP)
  - Stochastic: Contingent/Conformant Plans, Interleaved Execution; MDP Policies; POMDP Policies
  - Partially Observable: Contingent/Conformant Plans, Interleaved Execution; Semi-MDP Policies]


Today’s Hot Research Areas

• Durative Actions
  - Simultaneous actions, events, deadline goals
• Planning Under Uncertainty
  - Modeling sensors; searching belief states

[Figure: plan diagram with nodes [ I ], $O_i$, $O_j$, $O_k$, “?”, $O_a$, $O_b$, $O_c$]


Representing Actions

• Situation Calculus
• STRIPS
• PDDL
• UWL
• Dynamic Bayesian Networks


How Represent Actions?

• Simplifying assumptions
  - Atomic time
  - Agent is omniscient (no sensing necessary)
  - Agent is sole cause of change
  - Actions have deterministic effects
• STRIPS representation
  - World = set of true propositions
  - Actions:
    • Precondition: (conjunction of literals)
    • Effects: (conjunction of literals)

[Figure: world states W0, W1, W2 connected by actions north11 and north12]


STRIPS Actions

• Action = function: worldState → worldState
• Precondition
  - says where function defined
• Effects
  - say how to change set of propositions

[Figure: world states W0 and W1 connected by action north11]

north11
  precond: (and (agent-at 1 1)
                (agent-facing north))
  effect:  (and (agent-at 1 2)
                (not (agent-at 1 1)))

Note: STRIPS doesn’t allow derived effects; you must be complete!
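Read as a partial function on sets of propositions, a STRIPS action is straightforward to sketch in Python. The `Action` tuple, the split of effects into `add` and `delete` sets, and the `apply` helper below are illustrative names, not part of any planner's API.

```python
from typing import FrozenSet, NamedTuple, Optional, Tuple

Prop = Tuple[str, ...]                         # e.g. ("agent-at", "1", "1")

class Action(NamedTuple):
    name: str
    precond: FrozenSet[Prop]                   # must all hold in the state
    add: FrozenSet[Prop]                       # positive effects
    delete: FrozenSet[Prop]                    # negated effects

def apply(state: FrozenSet[Prop], action: Action) -> Optional[FrozenSet[Prop]]:
    """Return the successor world state, or None where the action is undefined."""
    if not action.precond <= state:
        return None
    return (state - action.delete) | action.add

north11 = Action(
    name="north11",
    precond=frozenset({("agent-at", "1", "1"), ("agent-facing", "north")}),
    add=frozenset({("agent-at", "1", "2")}),
    delete=frozenset({("agent-at", "1", "1")}),
)

w0 = frozenset({("agent-at", "1", "1"), ("agent-facing", "north")})
w1 = apply(w0, north11)
print(sorted(w1))
```

Nothing is derived automatically: any proposition that should change has to appear in `add` or `delete`, which is the completeness requirement in the note above.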


Action Schemata

• Instead of defining: pickup-A and pickup-B and …
• Define a schema:

(:operator pick-up
  :parameters ((block ?ob1))
  :precondition (and (clear ?ob1)
                     (on-table ?ob1)
                     (arm-empty))
  :effect (and (not (clear ?ob1))
               (not (on-table ?ob1))
               (not (arm-empty))
               (holding ?ob1)))

Note: STRIPS doesn’t allow derived effects; you must be complete!
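A schema can be viewed as a function from parameter bindings to ground actions. The sketch below instantiates the pick-up schema for each block; the dictionary layout and the helper name `pick_up` are assumptions for illustration.

```python
def pick_up(ob):
    """Ground the pick-up schema for one block by binding ?ob1 to `ob`."""
    return {
        "name": f"pick-up-{ob}",
        "precond": {("clear", ob), ("on-table", ob), ("arm-empty",)},
        "add": {("holding", ob)},
        "delete": {("clear", ob), ("on-table", ob), ("arm-empty",)},
    }

# One schema replaces pickup-A, pickup-B, ... written out by hand.
ground_actions = [pick_up(block) for block in ("a", "b", "c")]
for action in ground_actions:
    print(action["name"], sorted(action["precond"]))
```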


Immediate Outline

• Constraint satisfaction
• The planning problem
• Searching world states
  - Regression
  - Heuristics
• Graphplan
• SATplan
• Reachability analysis & heuristics
• Planning under uncertainty


Planning as Search

• Nodes: World states
• Arcs: Actions
• Initial State: The state satisfying the complete description of the initial conds
• Goal State: Any state satisfying the goal propositions


Forward-Chaining World-Space Search

[Figure: blocks-world configurations between the Initial State and the Goal State]
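Forward-chaining search treats world states as nodes and applicable ground actions as arcs. The sketch below runs breadth-first search on a three-block world. Only pick-up appears in these slides; the put-down, stack, and unstack operators are the usual blocks-world definitions and are assumed here, as is an initial state with all three blocks on the table.

```python
from collections import deque
from itertools import permutations

BLOCKS = ("a", "b", "c")

def ground_actions():
    """Yield (name, precond, add, delete) for a small blocks world (assumed operators)."""
    for x in BLOCKS:
        yield (f"pick-up {x}",
               {("clear", x), ("on-table", x), ("arm-empty",)},
               {("holding", x)},
               {("clear", x), ("on-table", x), ("arm-empty",)})
        yield (f"put-down {x}",
               {("holding", x)},
               {("clear", x), ("on-table", x), ("arm-empty",)},
               {("holding", x)})
    for x, y in permutations(BLOCKS, 2):
        yield (f"stack {x} {y}",
               {("holding", x), ("clear", y)},
               {("on", x, y), ("clear", x), ("arm-empty",)},
               {("holding", x), ("clear", y)})
        yield (f"unstack {x} {y}",
               {("on", x, y), ("clear", x), ("arm-empty",)},
               {("holding", x), ("clear", y)},
               {("on", x, y), ("clear", x), ("arm-empty",)})

def plan(initial, goal):
    """Breadth-first forward search over world states; returns a shortest plan."""
    actions = list(ground_actions())
    start = frozenset(initial)
    frontier, seen = deque([(start, [])]), {start}
    while frontier:
        state, steps = frontier.popleft()
        if goal <= state:                      # any state satisfying the goal props
            return steps
        for name, pre, add, delete in actions:
            if pre <= state:                   # action is defined in this state
                nxt = frozenset((state - delete) | add)
                if nxt not in seen:
                    seen.add(nxt)
                    frontier.append((nxt, steps + [name]))
    return None

initial = ({("on-table", b) for b in BLOCKS} | {("clear", b) for b in BLOCKS}
           | {("arm-empty",)})
goal = {("on", "a", "b"), ("on", "b", "c")}
print(plan(initial, goal))
```

Breadth-first search is used only because it is short and returns a shortest plan; any of the search strategies from earlier in the course could be substituted.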


Backward-Chaining Search Thru Space of Partial World-States

[Figure: several blocks-world goal configurations of blocks A, B, C, D, E]

• Problem: Many possible goal states are equally acceptable.
• From which one does one search?

[Figure: blocks-world initial state. Initial State is completely defined]


Regression

• Regressing a goal, G, thru an action, A
• Yields the weakest precondition G’
  - Such that: if G’ is true before A is executed
  - G is guaranteed to be true afterwards

[Figure: G’, then action A with its precond and effect, then G. G’ and G each represent a set of world states]


Regression Example

pick-up
  :parameters ((block ?ob1))
  :precondition (and (clear ?ob1)
                     (on-table ?ob1)
                     (arm-empty))
  :effect (and (not (clear ?ob1))
               (not (on-table ?ob1))
               (not (arm-empty))
               (holding ?ob1))

G = (and (holding C) (on A B))

G’ = (and (clear C) (on-table C) (arm-empty) (on A B))

Disjunction preconditions
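For conjunctive goals and STRIPS actions without conditional effects, regression reduces to set operations: remove what the action adds, then require its preconditions. A minimal sketch under those assumptions follows, with the hypothetical helper `regress` and pick-up bound to block C.

```python
def regress(goal, precond, add, delete):
    """Regress a conjunctive goal through a STRIPS action.

    Returns the weakest precondition G' as a set of propositions,
    or None if the action deletes part of the goal."""
    if goal & delete:
        return None                            # action would undo a goal literal
    return (goal - add) | precond

# pick-up with ?ob1 bound to C
precond = {("clear", "C"), ("on-table", "C"), ("arm-empty",)}
add = {("holding", "C")}
delete = {("clear", "C"), ("on-table", "C"), ("arm-empty",)}

goal = {("holding", "C"), ("on", "A", "B")}
print(sorted(regress(goal, precond, add, delete)))
```

Goal literals the action does not touch, such as (on A B), pass through unchanged into G’.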


Conditional Effects


Regressing Conditional Effects

[Figure: G’, then action A with its precond and effect, then G. The action is labeled bank → home]

G = (and (at keys home) (at paycheck bank))

G’ = (and (at briefcase bank) (in keys briefcase) (not (in paycheck briefcase)) (at paycheck bank))
