Introduction to Automated Planning Revisiting Classical ...mgremesal/MIR/slides/06 - MIR -...

Post on 14-Oct-2020


1

Introduction to Automated Planning

Revisiting Classical Planning

2

PART I

Introduction to Automated Planning

3

plan n.

1. A scheme, program, or method worked out beforehand for the accomplishment of an objective: a plan of attack.

2. A proposed or tentative project or course of action: had no plans for the evening.

3. A systematic arrangement of elements or important parts; a configuration or outline: a seating plan; the plan of a story.

4. A drawing or diagram made to scale showing the structure or arrangement of something.

5. In perspective rendering, one of several imaginary planes perpendicular to the line of vision between the viewer and the object being depicted.

6. A program or policy stipulating a service or benefit: a pension plan.

Synonyms: blueprint, design, project, scheme, strategy

4

plan n. 1. A scheme, program, or method worked out beforehand for the accomplishment of an objective: a plan of attack.

5

Generating Plans of Action

Origin of automated planning: computer programs to aid human planners

Project management (consumer software)
Automatic schedule generation
» various OR and AI techniques

For some problems, we would like to generate plans (or pieces of plans) automatically

Much more difficult
Automated-planning research is starting to pay off

Here are some examples …

6

Space Exploration

Autonomous planning, scheduling, control
NASA: Jet Propulsion Lab and Ames Research Center

Remote Agent Experiment (RAX)
Deep Space 1
Mars Exploration Rover (MER)

7

Manufacturing

Sheet-metal bending machines - Amada Corporation
Software to plan the sequence of bends
[Gupta and Bourne, J. Manufacturing Sci. and Engr., 1999]

8

Games

Bridge Baron - Great Game Products
1997 world champion of computer bridge [Smith, Nau, and Throop, AI Magazine, 1998]
2004: 2nd place

Us: East declarer, West dummy
Opponents: defenders, South & North
Contract: East – 3NT
On lead: West at trick 3
East: ♠KJ74
West: ♠A2
Out: ♠QT98653

[Figure: task network for Finesse(P1; S), with subtasks LeadLow(P1; S), FinesseTwo(P2; S), EasyFinesse(P2; S), StandardFinesse(P2; S), BustedFinesse(P2; S), StandardFinesseTwo(P2; S), StandardFinesseThree(P3; S), FinesseFour(P4; S), and PlayCard leaves; cards shown: West ♠2, North ♠3 or ♠Q, East ♠J, South ♠5 or ♠Q]

9

Conceptual Model 1. Environment

State transition system Σ = (S,A,E,γ)

S = {states}
A = {actions}
E = {exogenous events}
γ = state-transition function

System Σ

10

State Transition System

Σ = (S,A,E,γ)

S = {states}
A = {actions}
E = {exogenous events}
State-transition function γ: S × (A ∪ E) → 2^S

Example:
S = {s0, …, s5}
A = {move1, move2, put, take, load, unload}
E = {}
γ: see the arrows

[Figure: the six states s0, …, s5, each drawn over location 1 and location 2, connected by arrows labeled take, put, move1, move2, load, unload]

The Dock Worker Robots (DWR) domain

11

Conceptual Model 2. Controller

Observation function h: S → O

Controller: given observation o in O, produces action a in A

[Figure: example state s3 (location 1, location 2)]

12

Conceptual Model 3. Planner’s Input

Planner input: planning problem

[Figure annotation: omit unless planning is online]

13

Planning Problem

Description of Σ
Initial state or set of states
» Initial state = s0
Objective
» Goal state, set of goal states, set of tasks, “trajectory” of states, objective function, …
» Goal state = s5

[Figure: state-transition diagram for s0, …, s5 with arrows labeled take, put, move1, move2, load, unload]

The Dock Worker Robots (DWR) domain

14

Conceptual Model 4. Planner’s Output

Planner output: instructions to the controller

15

Plans

Classical plan: a sequence of actions
» ⟨take, move1, load, move2⟩

Policy: partial function from S into A
» {(s0, take), (s1, move1), (s3, load), (s4, move2)}

[Figure: state-transition diagram for s0, …, s5, with the actions take, move1, load, move2 marked]

The Dock Worker Robots (DWR) domain
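The difference between a classical plan and a policy can be sketched in a few lines of Python. This is only an illustration: the table GAMMA lists just the four transitions implied by the plan and policy on this slide (the successor states are inferred from them, since the arrow diagram is not reproduced here), and the helper names run_plan and run_policy are hypothetical.

```python
# Transitions inferred from the plan and policy on the slide (an assumption:
# the full arrow diagram is not available, so only these four are listed).
GAMMA = {
    ("s0", "take"): "s1",
    ("s1", "move1"): "s3",
    ("s3", "load"): "s4",
    ("s4", "move2"): "s5",
}

def run_plan(state, plan):
    """Apply a classical plan (a fixed action sequence); return the visited states."""
    trace = [state]
    for action in plan:
        state = GAMMA[(state, action)]  # KeyError if the action is not applicable
        trace.append(state)
    return trace

def run_policy(state, policy, goal, max_steps=10):
    """Follow a policy (partial map from states to actions) until the goal is reached."""
    trace = [state]
    while state != goal and state in policy and len(trace) <= max_steps:
        state = GAMMA[(state, policy[state])]
        trace.append(state)
    return trace

plan = ["take", "move1", "load", "move2"]
policy = {"s0": "take", "s1": "move1", "s3": "load", "s4": "move2"}

print(run_plan("s0", plan))
print(run_policy("s0", policy, goal="s5"))
```

The plan is executed blindly, while the policy chooses each action by looking up the current state; in this deterministic example both visit the same states.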

16

Restrictive Assumptions

A0: Finite system: finitely many states, actions, events

A1: Fully observable: the controller knows Σ’s current state

A2: Deterministic: each action has only one outcome

A3: Static (no exogenous events): no changes but the controller’s actions

A4: Attainment goals: a set of goal states Sg (no states to be avoided, etc.)

A5: Sequential plans: a plan is a linearly ordered sequence of actions (a1, a2, …, an)

A6: Implicit time: no time durations; linear sequence of instantaneous states

A7: Off-line planning: planner doesn’t know the execution status

17

Classical Planning

Classical planning requires all eight restrictive assumptions
» Offline generation of action sequences for a deterministic, static, finite system, with complete knowledge, attainment goals, and implicit time

Reduces to the following problem:
» Given (Σ, s0, Sg)
» Find a sequence of actions (a1, a2, …, an) that produces a sequence of state transitions (s1, s2, …, sn) such that sn is in Sg.

This is just path-searching in a graph
» Nodes = states
» Edges = actions

Is this trivial?

18

Classical Planning

Generalize the earlier example:
» Five locations, three robot carts, 100 containers, three piles
» Then there are 10^277 states

Number of particles in the universe is only about 10^87
» The example is more than 10^190 times as large!

Automated-planning research has been heavily dominated by classical planning
» Dozens (hundreds?) of different algorithms
» We will only describe the simplest (state-space planning)

In this course we will focus on planning with uncertainty (non-determinism)

[Figure: state s1 (location 1, location 2) with arrows labeled take, put, move1, move2]

19

A running example: Dock Worker Robots

Generalization of the earlier example
A harbour with several locations
» e.g., docks, docked ships, storage areas, parking areas
Containers
» going to/from ships
Robot carts
» can move containers
Cranes
» can load and unload containers

20

A running example: Dock Worker Robots

Locations: l1, l2, …

Containers: c1, c2, …
» can be stacked in piles, loaded onto robots, or held by cranes

Piles: p1, p2, …
» fixed areas where containers are stacked
» pallet at the bottom of each pile

Robot carts: r1, r2, …
» can move to adjacent locations
» carry at most one container

Cranes: k1, k2, …
» each belongs to a single location
» move containers between piles and robots
» if there is a pile at a location, there must also be a crane there

21

A running example: Dock Worker Robots

Fixed relations: same in all states
» adjacent(l,l’), attached(p,l), belong(k,l)

Dynamic relations: differ from one state to another
» occupied(l), at(r,l)
» loaded(r,c), unloaded(r)
» holding(k,c), empty(k)
» in(c,p), on(c,c’)
» top(c,p), top(pallet,p)

Actions:
» take(c,k,p), put(c,k,p)
» load(r,c,k), unload(r), move(r,l,l’)

22

PART II

Representations for Classical Planning

23

Representations: Motivation

In most problems, far too many states to try to represent all of them explicitly as s0, s1, s2, …

Represent each state as a set of features, e.g.,
» a vector of values for a set of variables
» a set of ground atoms in some first-order language L

Define a set of operators that can be used to compute state-transitions

Don’t give all of the states explicitly
» Just give the initial state
» Use the operators to generate the other states as needed

24

Outline

Representation schemes
» Classical representation
» Set-theoretic representation
» State-variable representation
Examples: DWR and the Blocks World
Comparisons

25

Classical Representation

Start with a function-free first-order language
» Finitely many predicate symbols and constant symbols, but no function symbols

Example: the DWR domain
» Locations: l1, l2, …
» Containers: c1, c2, …
» Piles: p1, p2, …
» Robot carts: r1, r2, …
» Cranes: k1, k2, …

26

Classical Representation

Atom: predicate symbol and arguments
» Use these to represent both fixed and dynamic relations
» adjacent(l,l’), attached(p,l), belong(k,l), occupied(l), at(r,l), loaded(r,c), unloaded(r), holding(k,c), empty(k), in(c,p), on(c,c’), top(c,p), top(pallet,p)

Ground expression: contains no variable symbols - e.g., in(c1,p3)
Unground expression: at least one variable symbol - e.g., in(c1,x)

Substitution: θ = {x1 ← v1, x2 ← v2, …, xn ← vn}
» Each xi is a variable symbol; each vi is a term

Instance of e: result of applying a substitution θ to e
» Replace variables of e simultaneously, not sequentially
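Why "simultaneously, not sequentially" matters can be seen in a small sketch. The representation is an assumption made for illustration: an atom is a tuple (predicate, arg1, …, argn), a substitution is a dict, and any argument that appears as a key of the dict is treated as a variable. The function name substitute is hypothetical.

```python
def substitute(atom, theta):
    """Apply substitution theta to an atom given as (predicate, arg1, ..., argn).

    Each argument is looked up in theta exactly once, so all variables are
    replaced simultaneously rather than one after another.
    """
    predicate, *args = atom
    return (predicate, *(theta.get(arg, arg) for arg in args))

# Grounding the unground expression in(c1,x) with θ = {x ← p3}
print(substitute(("in", "c1", "x"), {"x": "p3"}))

# Swapping two variables: sequential replacement would produce on(x,x) or
# on(y,y), whereas simultaneous replacement yields on(y,x).
print(substitute(("on", "x", "y"), {"x": "y", "y": "x"}))
```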

27

States

State: a set s of ground atoms
» The atoms represent the things that are true in one of Σ’s states
» Only finitely many ground atoms, so only finitely many possible states

28

Operators

Operator: a triple o = (name(o), precond(o), effects(o))

name(o) is a syntactic expression of the form n(x1,…,xk)
» n: operator symbol - must be unique for each operator
» x1,…,xk: variable symbols (parameters)
  • must include every variable symbol in o

precond(o): preconditions
» literals that must be true in order to use the operator

effects(o): effects
» literals the operator will make true

29

Actions

Action: ground instance (via substitution) of an operator

30

Notation

Let S be a set of literals. Then
» S+ = {atoms that appear positively in S}
» S– = {atoms that appear negatively in S}

More specifically, let a be an operator or action. Then
» precond+(a) = {atoms that appear positively in a’s preconditions}
» precond–(a) = {atoms that appear negatively in a’s preconditions}
» effects+(a) = {atoms that appear positively in a’s effects}
» effects–(a) = {atoms that appear negatively in a’s effects}

Example:
» effects+(take(k,l,c,d,p)) = {holding(k,c), top(d,p)}
» effects–(take(k,l,c,d,p)) = {empty(k), in(c,p), top(c,p), on(c,d)}

31

Applicability

An action a is applicable to a state s if s satisfies precond(a),
» i.e., if precond+(a) ⊆ s and precond–(a) ∩ s = ∅

Here are an action and a state that it’s applicable to:

32

Result of Performing an Action

If a is applicable to s, the result of performing it is
γ(s,a) = (s – effects–(a)) ∪ effects+(a)

Delete the negative effects, and add the positive ones
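The applicability test and the transition function translate almost literally into set operations. The sketch below is illustrative rather than definitive: the Action record and the function names are hypothetical, atoms are encoded as tuples, the preconditions of take are taken from the set-theoretic version of the operator given later in these slides, and the extra atoms in the state s are made up so that the action is applicable.

```python
from typing import FrozenSet, NamedTuple, Tuple

Atom = Tuple[str, ...]  # e.g. ("holding", "crane1", "c3")

class Action(NamedTuple):
    name: str
    precond_pos: FrozenSet[Atom]
    precond_neg: FrozenSet[Atom]
    effects_pos: FrozenSet[Atom]
    effects_neg: FrozenSet[Atom]

def applicable(s: FrozenSet[Atom], a: Action) -> bool:
    """precond+(a) ⊆ s and precond–(a) ∩ s = ∅"""
    return a.precond_pos <= s and not (a.precond_neg & s)

def gamma(s: FrozenSet[Atom], a: Action) -> FrozenSet[Atom]:
    """γ(s,a) = (s – effects–(a)) ∪ effects+(a), defined only if a is applicable."""
    if not applicable(s, a):
        raise ValueError(f"{a.name} is not applicable")
    return (s - a.effects_neg) | a.effects_pos

take = Action(
    name="take(crane1,loc1,c3,c1,p1)",
    precond_pos=frozenset({
        ("belong", "crane1", "loc1"), ("attached", "p1", "loc1"),
        ("empty", "crane1"), ("top", "c3", "p1"), ("on", "c3", "c1"),
    }),
    precond_neg=frozenset(),
    effects_pos=frozenset({("holding", "crane1", "c3"), ("top", "c1", "p1")}),
    effects_neg=frozenset({
        ("empty", "crane1"), ("in", "c3", "p1"),
        ("top", "c3", "p1"), ("on", "c3", "c1"),
    }),
)

# A made-up state in which take is applicable.
s = take.precond_pos | {
    ("in", "c3", "p1"), ("in", "c1", "p1"),
    ("on", "c1", "pallet"), ("at", "r1", "loc2"),
}

s_next = gamma(s, take)
print(sorted(s_next))
```

After the action, the crane holds c3 and c1 is on top of p1, so the same take action is no longer applicable in s_next.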

33

Planning domain: language plus operators

Corresponds to a set of state-transition systems
Example: operators for the DWR domain

34

Planning Problems

Given a planning domain (language L, operators O)

Statement of a planning problem: a triple P = (O,s0,g)
» O is the collection of operators
» s0 is a state (the initial state)
» g is a set of literals (the goal formula)

The actual planning problem: P = (Σ,s0,Sg)
» s0 and Sg are as above
» Σ = (S,A,γ) is a state-transition system
» S = {all sets of ground atoms in L}
» A = {all ground instances of operators in O}
» γ = the state-transition function determined by the operators

I’ll often say “planning problem” when I mean the statement of the problem

35

Plans and Solutions

Plan: any sequence of actions σ = ⟨a1, a2, …, an⟩ such that each ai is a ground instance of an operator in O

The plan is a solution for P = (O,s0,g) if it is executable and achieves g
» i.e., if there are states s0, s1, …, sn such that
  • γ(s0,a1) = s1
  • γ(s1,a2) = s2
  • …
  • γ(sn–1,an) = sn
  • sn satisfies g
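The definition of a solution can be checked mechanically by executing the plan step by step. The following sketch makes simplifying assumptions: states are sets of proposition names, each action is a hypothetical (pre, delete, add) triple with positive preconditions only, and the two actions are the blocks-world actions pickup-c and stack-c-a listed later in these slides. The initial state is made up for the example.

```python
def is_solution(plan, s0, goal_pos, goal_neg=frozenset()):
    """Return True if the plan is executable from s0 and its final state satisfies g."""
    s = frozenset(s0)
    for pre, delete, add in plan:
        if not pre <= s:          # a_i is not applicable in s_(i-1)
            return False
        s = (s - delete) | add    # s_i = γ(s_(i-1), a_i)
    return goal_pos <= s and not (goal_neg & s)

# Ground actions as (pre, delete, add) triples.
pickup_c = (
    frozenset({"ontable-c", "clear-c", "handempty"}),
    frozenset({"ontable-c", "clear-c", "handempty"}),
    frozenset({"holding-c"}),
)
stack_c_a = (
    frozenset({"holding-c", "clear-a"}),
    frozenset({"holding-c", "clear-a"}),
    frozenset({"on-c-a", "clear-c", "handempty"}),
)

s0 = {"ontable-a", "ontable-c", "clear-a", "clear-c", "handempty"}
goal = frozenset({"on-c-a"})

print(is_solution([pickup_c, stack_c_a], s0, goal))   # executable and achieves g
print(is_solution([stack_c_a, pickup_c], s0, goal))   # first action not applicable
```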

36

Example

Let P1 = (O, s1, g1), where
» O is the set of operators given earlier
» g1 = {loaded(r1,c3), at(r1,loc2)}

37

Example (continued)

Here are three solutions for P1:

⟨take(crane1,loc1,c3,c1,p1), move(r1,loc2,loc1), move(r1,loc1,loc2), move(r1,loc2,loc1), load(crane1,loc1,c3,r1), move(r1,loc1,loc2)⟩

⟨take(crane1,loc1,c3,c1,p1), move(r1,loc2,loc1), load(crane1,loc1,c3,r1), move(r1,loc1,loc2)⟩

⟨move(r1,loc2,loc1), take(crane1,loc1,c3,c1,p1), load(crane1,loc1,c3,r1), move(r1,loc1,loc2)⟩

Each of them produces the state shown here:

38

Example (continued)

The first is redundant: can remove actions and still have a solution

⟨take(crane1,loc1,c3,c1,p1), move(r1,loc2,loc1), move(r1,loc1,loc2), move(r1,loc2,loc1), load(crane1,loc1,c3,r1), move(r1,loc1,loc2)⟩

⟨take(crane1,loc1,c3,c1,p1), move(r1,loc2,loc1), load(crane1,loc1,c3,r1), move(r1,loc1,loc2)⟩

⟨move(r1,loc2,loc1), take(crane1,loc1,c3,c1,p1), load(crane1,loc1,c3,r1), move(r1,loc1,loc2)⟩

The 2nd and 3rd are irredundant and shortest

39

Set-Theoretic Representation

Like classical representation, but restricted to propositional logic

States: Instead of a collection of ground atoms …
» {on(c1,pallet), on(c1,r1), on(c1,c2), …, at(r1,l1), at(r1,l2), …}

… use a collection of propositions (boolean variables):
» {on-c1-pallet, on-c1-r1, on-c1-c2, …, at-r1-l1, at-r1-l2, …}

40

Instead of operators like this one, take all of the operator instances, e.g., this one, and rewrite ground atoms as propositions:

take-crane1-loc1-c3-c1-p1
precond: belong-crane1-loc1, attached-p1-loc1, empty-crane1, top-c3-p1, on-c3-c1
delete: empty-crane1, in-c3-p1, top-c3-p1, on-c3-c1
add: holding-crane1-c3, top-c1-p1

41

Comparison

A set-theoretic representation is equivalent to a classical representation in which all of the atoms are ground

Exponential blowup
» If a classical operator contains n atoms and each atom has arity k, then it corresponds to c^(nk) actions, where c = |{constant symbols}|

42

State-Variable Representation

Use ground atoms for properties that do not change, e.g., adjacent(loc1,loc2)
For properties that can change, assign values to state variables
» Like fields in a record structure

Example state:
{top(p1)=c3, cpos(c3)=c1, cpos(c1)=pallet, holding(crane1)=nil, rloc(r1)=loc2, loaded(r1)=nil, …}

Classical and state-variable representations take similar amounts of space
» Each can be translated into the other in low-order polynomial time

43

Example: The Blocks World

Infinitely wide table, finite number of children’s blocks
Ignore where a block is located on the table
A block can sit on the table or on another block
Want to move blocks from one configuration to another
» e.g., [Figure: an initial state and a goal, each a configuration of stacked blocks]

Can be expressed as a special case of DWR
» But the usual formulation is simpler

I’ll give classical, set-theoretic, and state-variable formulations
» For the case where there are five blocks

44

Classical Representation: Symbols

Constant symbols:
» The blocks: a, b, c, d, e

Predicates:
» ontable(x) - block x is on the table
» on(x,y) - block x is on block y
» clear(x) - block x has nothing on it
» holding(x) - the robot hand is holding block x
» handempty - the robot hand isn’t holding anything

[Figure: an example configuration of the blocks a, b, c, d, e]

45

Classical Operators

unstack(x,y)
Precond: on(x,y), clear(x), handempty
Effects: ~on(x,y), ~clear(x), ~handempty, holding(x), clear(y)

stack(x,y)
Precond: holding(x), clear(y)
Effects: ~holding(x), ~clear(y), on(x,y), clear(x), handempty

pickup(x)
Precond: ontable(x), clear(x), handempty
Effects: ~ontable(x), ~clear(x), ~handempty, holding(x)

putdown(x)
Precond: holding(x)
Effects: ~holding(x), ontable(x), clear(x), handempty

[Figure: configurations of blocks a, b, c illustrating the operators]

46

Set-Theoretic Representation: Symbols

For five blocks, there are 36 propositions
Here are 5 of them:
» ontable-a - block a is on the table
» on-c-a - block c is on block a
» clear-c - block c has nothing on it
» holding-d - the robot hand is holding block d
» handempty - the robot hand isn’t holding anything

[Figure: an example configuration of the blocks a, b, c, d, e]

47

Set-Theoretic Actions

Fifty different actions

Here are four of them:

unstack-c-a
Pre: on-c-a, clear-c, handempty
Del: on-c-a, clear-c, handempty
Add: holding-c, clear-a

stack-c-a
Pre: holding-c, clear-a
Del: holding-c, clear-a
Add: on-c-a, clear-c, handempty

pickup-c
Pre: ontable-c, clear-c, handempty
Del: ontable-c, clear-c, handempty
Add: holding-c

putdown-c
Pre: holding-c
Del: holding-c
Add: ontable-c, clear-c, handempty

[Figure: configurations of blocks a, b, c illustrating the actions]
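The counts of 36 propositions and fifty actions for five blocks can be reproduced by enumerating the ground instances. The sketch below assumes that a block is never stacked on itself, so on(x,y), stack(x,y), and unstack(x,y) are instantiated only for x ≠ y; the naming scheme follows the propositions shown above.

```python
from itertools import permutations

blocks = ["a", "b", "c", "d", "e"]
pairs = list(permutations(blocks, 2))  # ordered pairs of distinct blocks

propositions = (
    [f"ontable-{x}" for x in blocks]
    + [f"on-{x}-{y}" for x, y in pairs]
    + [f"clear-{x}" for x in blocks]
    + [f"holding-{x}" for x in blocks]
    + ["handempty"]
)

actions = (
    [f"unstack-{x}-{y}" for x, y in pairs]
    + [f"stack-{x}-{y}" for x, y in pairs]
    + [f"pickup-{x}" for x in blocks]
    + [f"putdown-{x}" for x in blocks]
)

print(len(propositions), len(actions))  # 36 50
```

The four operators of the classical representation expand into fifty ground actions here; with more blocks or higher-arity operators the growth is much faster, which is the blowup noted in the comparison of the two representations.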

48

State-Variable Representation: Symbols

Constant symbols:
» a, b, c, d, e of type block
» 0, 1, table, nil of type other

State variables:
» pos(x) = y if block x is on block y
» pos(x) = table if block x is on the table
» pos(x) = nil if block x is being held
» clear(x) = 1 if block x has nothing on it
» clear(x) = 0 if block x is being held or has another block on it
» holding = x if the robot hand is holding block x
» holding = nil if the robot hand is holding nothing

[Figure: an example configuration of the blocks a, b, c, d, e]

49

State-Variable Operators

unstack(x : block, y : block)
Precond: pos(x)=y, clear(y)=0, clear(x)=1, holding=nil
Effects: pos(x)=nil, clear(x)=0, holding=x, clear(y)=1

stack(x : block, y : block)
Precond: holding=x, clear(x)=0, clear(y)=1
Effects: holding=nil, clear(y)=0, pos(x)=y, clear(x)=1

pickup(x : block)
Precond: pos(x)=table, clear(x)=1, holding=nil
Effects: pos(x)=nil, clear(x)=0, holding=x

putdown(x : block)
Precond: holding=x
Effects: holding=nil, pos(x)=table, clear(x)=1

[Figure: configurations of blocks a, b, c illustrating the operators]
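In the state-variable representation a state behaves like a record, so an operator is a check on some fields followed by an update of some fields. The sketch below applies unstack to a made-up three-block state; the dictionary encoding (keys such as ("pos", "c") and ("holding",)) and the function name are assumptions made for illustration.

```python
def unstack(state, x, y):
    """Apply unstack(x, y) to a state-variable assignment; return None if not applicable."""
    precond = {("pos", x): y, ("clear", y): 0, ("clear", x): 1, ("holding",): "nil"}
    effects = {("pos", x): "nil", ("clear", x): 0, ("holding",): x, ("clear", y): 1}
    if any(state.get(var) != val for var, val in precond.items()):
        return None
    # Variables not mentioned in the effects keep their old values.
    return {**state, **effects}

# Made-up state: c is on a, while a and b are on the table.
state = {
    ("pos", "a"): "table", ("pos", "b"): "table", ("pos", "c"): "a",
    ("clear", "a"): 0, ("clear", "b"): 1, ("clear", "c"): 1,
    ("holding",): "nil",
}

after = unstack(state, "c", "a")
print(after[("holding",)], after[("pos", "c")], after[("clear", "a")])
print(unstack(state, "a", "b"))  # not applicable: pos(a) is table, not b
```

Unlike the classical operator, there are no explicit negative effects: assigning a new value to a state variable implicitly discards the old one.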

50

Expressive Power

Any problem that can be represented in one representation can also be represented in the other two
Can convert in linear time and space, except when converting to set-theoretic (where we get an exponential blowup)

Conversions:
» Set-theoretic → classical: trivial
» Classical → set-theoretic: write all of the ground instances
» Classical → state-variable: P(x1,…,xn) becomes f_P(x1,…,xn)=1
» State-variable → classical: f(x1,…,xn)=y becomes P_f(x1,…,xn,y)

51

Comparison

Classical representation
» The most popular for classical planning, partly for historical reasons

Set-theoretic representation
» Can take much more space than classical representation
» Useful in algorithms that manipulate ground atoms directly
  • e.g., planning graphs and satisfiability
» Useful for certain kinds of theoretical studies

State-variable representation
» Equivalent to classical representation
» Less natural for logicians, more natural for engineers
» Useful in non-classical planning problems as a way to handle numbers, functions, time

52

PART III

State-Space Planning

53

Outline

State-space planning
» Forward search
» Backward search
» STRIPS

54

Forward Search

[Figure: forward search from the initial state, with branches for applicable actions such as take c3, move r1, take c2, …]

55

Properties

Forward-search is sound
» for any plan returned by any of its nondeterministic traces, this plan is guaranteed to be a solution

Forward-search also is complete
» if a solution exists then at least one of Forward-search’s nondeterministic traces will return a solution.

56

Deterministic Implementations

Some deterministic implementations of forward search:
» breadth-first search
» depth-first search
» best-first search (e.g., A*)
» greedy search

Breadth-first and best-first search are sound and complete
» But they usually aren’t practical because they require too much memory
» Memory requirement is exponential in the length of the solution

In practice, more likely to use depth-first search or greedy search
» Worst-case memory requirement is linear in the length of the solution
» In general, sound but not complete
  • The search tree can have infinite branches
  • But there are only finitely many states (assumption A0), so depth-first search can be made complete by doing loop-checking

[Figure: search tree from s0 through states s1, …, s5 via actions a1, …, a5, reaching a goal state sg]
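As one possible deterministic implementation, here is a minimal breadth-first forward search with loop-checking (a visited set). It is a sketch under simplifying assumptions, not the algorithm from the slides: states are sets of propositions, actions are hypothetical (pre, delete, add) triples with positive preconditions only, and the example uses the four set-theoretic blocks-world actions for blocks c and a with a made-up initial state.

```python
from collections import deque

def forward_search(actions, s0, goal):
    """Breadth-first forward search; returns a list of action names or None."""
    start = frozenset(s0)
    frontier = deque([(start, [])])
    visited = {start}                      # loop-checking: never revisit a state
    while frontier:
        s, plan = frontier.popleft()
        if goal <= s:                      # s satisfies g
            return plan
        for name, (pre, delete, add) in actions.items():
            if pre <= s:                   # action is applicable in s
                nxt = frozenset((s - delete) | add)
                if nxt not in visited:
                    visited.add(nxt)
                    frontier.append((nxt, plan + [name]))
    return None                            # every reachable state was explored

F = frozenset
actions = {
    "unstack-c-a": (F({"on-c-a", "clear-c", "handempty"}),
                    F({"on-c-a", "clear-c", "handempty"}),
                    F({"holding-c", "clear-a"})),
    "stack-c-a":   (F({"holding-c", "clear-a"}),
                    F({"holding-c", "clear-a"}),
                    F({"on-c-a", "clear-c", "handempty"})),
    "pickup-c":    (F({"ontable-c", "clear-c", "handempty"}),
                    F({"ontable-c", "clear-c", "handempty"}),
                    F({"holding-c"})),
    "putdown-c":   (F({"holding-c"}),
                    F({"holding-c"}),
                    F({"ontable-c", "clear-c", "handempty"})),
}

s0 = {"ontable-a", "ontable-c", "clear-a", "clear-c", "handempty"}
print(forward_search(actions, s0, F({"on-c-a"})))
print(forward_search(actions, s0, F({"on-a-c"})))  # unreachable with these actions
```

Because the visited set grows with the number of reachable states, this version illustrates the memory cost mentioned above; a depth-first variant would only need to remember the states on the current path.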

57

Branching Factor of Forward Search

Forward search can have a very large branching factor
» E.g., many applicable actions that don’t progress toward goal

Why this is bad:
» Deterministic implementations can waste time trying lots of irrelevant actions

Need a good heuristic function and/or pruning procedure

[Figure: initial state with many applicable actions a1, a2, a3, …, a50 branching toward the goal]

58

Backward Search

For forward search, we started at the initial state and computed state transitions
» new state = γ(s,a)

For backward search, we start at the goal and compute inverse state transitions
» new set of subgoals = γ⁻¹(g,a)

To define γ⁻¹(g,a), must first define relevance:

An action a is relevant for a goal g if
» a makes at least one of g’s literals true
  • g ∩ effects(a) ≠ ∅
» a does not make any of g’s literals false
  • g+ ∩ effects–(a) = ∅ and g– ∩ effects+(a) = ∅

59

Inverse State Transitions

If a is relevant for g, then
γ⁻¹(g,a) = (g – effects(a)) ∪ precond(a)

Otherwise γ⁻¹(g,a) is undefined

Example: suppose that
» g = {on(b1,b2), on(b2,b3)}
» a = stack(b1,b2)

What is γ⁻¹(g,a)?
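The question can be worked through with a short sketch. It assumes goals and preconditions that contain only positive literals (as in this example), encodes atoms as strings, and instantiates stack(b1,b2) from the classical blocks-world operator given earlier; the function name regress is hypothetical.

```python
def regress(goal, precond, add, delete):
    """Return the inverse transition of a positive goal through an action, or None.

    The action is relevant if it adds at least one goal atom and deletes none;
    the result is (g – effects(a)) ∪ precond(a).
    """
    if not (goal & add) or (goal & delete):
        return None
    return (goal - add) | precond

g = frozenset({"on(b1,b2)", "on(b2,b3)"})

# stack(b1,b2), instantiated from the classical operator stack(x,y)
precond = frozenset({"holding(b1)", "clear(b2)"})
add = frozenset({"on(b1,b2)", "clear(b1)", "handempty"})
delete = frozenset({"holding(b1)", "clear(b2)"})

print(sorted(regress(g, precond, add, delete)))
```

Under these assumptions the new set of subgoals is {on(b2,b3), holding(b1), clear(b2)}: the achieved atom on(b1,b2) is replaced by the preconditions of the action, and on(b2,b3) is carried over unchanged.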

60

[Figure: backward search tree with subgoal sets g0, …, g5, actions a1, …, a5, and the initial state s0]

61

Efficiency of Backward Search

Backward search can also have a very large branching factor
» As before, deterministic implementations can waste lots of time trying all of them

[Figure: initial state and goal, with many actions a1, a2, a3, …, a50 branching backward from the goal]

62

STRIPS

π ← the empty plan
do a modified backward search from g
» instead of γ⁻¹(s,a), each new set of subgoals is just precond(a)
» whenever you find an action that’s executable in the current state, then go forward on the current search path as far as possible, executing actions and appending them to π
repeat until all goals are satisfied

[Figure: search tree from g through subgoal sets g1, …, g6 and actions a1, …, a6, with the current search path marked and one subgoal set satisfied in s0; π = ⟨a6, a4⟩, s = γ(γ(s0,a6),a4)]