SAMPLING-BASED METHODS FOR FACTORED TASK AND MOTION PLANNING
Caelan Garrett, Tomás Lozano-Pérez, and Leslie Kaelbling
MIT EECS Research Qualifying Exam
Task and Motion Planning
■ Application
  ■ Fully autonomous robots in human environments
■ One (of many) challenges
  ■ Planning in mixed discrete-continuous (hybrid) spaces
■ Task planning (AI planning)
  ■ Discrete state/actions
■ Motion planning
  ■ Continuous robot movements
Related Work
■ Multi-Modal Motion Planning
  ■ Alami et al., Siméon et al., Hauser and Latombe, (Jennifer) Barry, Vega-Brown and Roy
  ■ Inefficient in high-dimensional state-spaces
■ Navigation Among Movable Obstacles (NAMO)
  ■ Stilman and Kuffner, Van Den Berg et al., Krontiris and Bekris
  ■ Address a specialized subclass of manipulation planning
■ Task and Motion Planning
  ■ Finite domains - Dornhege et al., Erdem et al., Dantam et al.
  ■ Cambon et al., Kaelbling and Lozano-Pérez, Lagriffoul et al., Srivastava et al., Garrett et al., Toussaint
  ■ Generally inflexible to new domains
Contributions
■ Sampling-based planning for a broad class of hybrid problems
  ■ (Not just task and motion planning)
■ General-purpose algorithms
  ■ Treat domain-specific samplers as blackboxes
  ■ Usable software that respects this abstraction
■ Efficient algorithms
  ■ Leverage factoring while sampling and searching
Outline
1. Factored transition systems
2. Sampling-based planning
3. Algorithms
4. Future directions
Factored Transition Systems
Problem Class
■ Discrete-time
  ■ Plans are finite sequences of controls (but with continuous parameterizations)
■ Deterministic
  ■ Actions always produce the intended effect
■ Observable
  ■ Access to the full world state
■ Hybrid
  ■ State & control composed of mixed discrete-continuous variables
Example Pick-and-Place Problem
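■ 1 robot: R
■ 2 movable objects: A, B
■ 1 region: RegionA
■ Goal constraints: robot at q*, object A in RegionA
[Figure: robot R at configuration q0, objects A and B at poses a0 and b0, RegionA, and goal configuration q*]
Initial state: $C_0 = x^0 = (q_0, a_0, b_0, \text{None})$
Goal constraints: $C_* = \{\text{RegionA},\ x_q = q_*\}$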
Factored Transition System
■ State variables: x
  ■ Robot configuration
  ■ Holding - A, B, None
  ■ Object poses
■ Control variables: u
  ■ Trajectories, grasping
■ Transition Relation: T(x, u, x')
  ■ Union of transition clauses
  ■ Clauses: {Move, MoveH, Pick, Place}
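[Figure: factor graph of one transition. State variables x: Rob (x_q), Hold (x_h), Obj A (x_A), Obj B (x_B). Control variables u: Traj (u_t), Grip (u_g). Next-state variables x': x'_q, x'_h, x'_A, x'_B]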
Benefits of Factoring
■ Extremely high-dimensional - (e.g. 100 < DOFs)
  ■ Each movable object introduces new DOFs
  ■ Discretized state-space grows combinatorially
  ■ Existing motion planning methods ineffective
■ We develop algorithms that exploit factoring
  ■ Sample subsets of state/control variables at a time that satisfy particular constraints
  ■ Discrete search using state-of-the-art AI planning algorithms
Factored States
■ Sample efficiency: a set of values for each variable can induce a large set of states
■ 10 samples for R, 4 samples for A, and 4 samples for B induce a large state-space (10 × 4 × 4 = 160 states)
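[Figure: sample sets for the state variables X = (x_q, x_A, x_B, x_h), with x_h ∈ {None, A, B}, and the control variables U = (u_t)]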
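The count on this slide can be reproduced with a few lines of Python. This is a minimal sketch: only the sample counts (10, 4, 4) come from the slide, and the value names are placeholders.

```python
from itertools import product

# Hypothetical sample sets; only the counts (10, 4, 4) come from the slide.
robot_confs = [f"q{i}" for i in range(10)]
poses_a = [f"a{i}" for i in range(4)]
poses_b = [f"b{i}" for i in range(4)]

# A state picks one value per variable, so the per-variable sets multiply.
states = list(product(robot_confs, poses_a, poses_b))
num_samples = len(robot_confs) + len(poses_a) + len(poses_b)

print(f"{num_samples} sampled values induce {len(states)} states")
```

Sampling cost grows with the sum of the set sizes, while the induced state-space grows with their product.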
Factored Move Transition
■ Robot moves while hand is empty
■ Factoring
  ■ Poses of objects A & B unchanged
  ■ Collision-free (CFree) constraints mention objects separately
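[Figure: Move clause factor graph over Rob, Hold, Obj A, Obj B, Traj, and Grip. A Motion constraint links x_q, u_t, and x'_q; Hold is None before and after; CFree constraints link the trajectory u_t to the object poses x_A and x_B; Grip is False]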
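As a rough illustration of the Move clause, the sketch below writes it as a conjunction of small constraints, each mentioning only a few variables. The 1D geometry, the `State` class, and the helper names `motion` and `cfree` are assumptions made for the example, not definitions from the talk.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class State:
    q: float            # robot configuration (1D toy world)
    a: float            # pose of object A
    b: float            # pose of object B
    h: Optional[str]    # held object: None, "A", or "B"

Traj = Tuple[float, float]  # toy trajectory: a straight segment (start, end)

def motion(q: float, traj: Traj, q_next: float) -> bool:
    """Motion constraint: the trajectory connects q to q_next."""
    return traj == (q, q_next)

def cfree(traj: Traj, pose: float, radius: float = 0.5) -> bool:
    """CFree constraint: the segment keeps `radius` clearance from ONE object."""
    lo, hi = min(traj), max(traj)
    return pose < lo - radius or pose > hi + radius

def move(x: State, traj: Traj, x_next: State) -> bool:
    """Move clause as a conjunction of factored constraints."""
    return (
        x.h is None and x_next.h is None           # hand is empty throughout
        and x_next.a == x.a and x_next.b == x.b    # object poses unchanged
        and motion(x.q, traj, x_next.q)
        and cfree(traj, x.a)                       # one CFree test per object
        and cfree(traj, x.b)
    )

x0 = State(q=0.0, a=5.0, b=-3.0, h=None)
print(move(x0, (0.0, 2.0), State(2.0, 5.0, -3.0, None)))  # clear of both objects
print(move(x0, (0.0, 6.0), State(6.0, 5.0, -3.0, None)))  # sweeps through A
```

Because each `cfree` call mentions a single object, the same trajectory can be tested against each object pose independently.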
Factored Pick Transition
■ Robot instantly grasps an object
■ Factoring
  ■ Robot conf & pose of object B unchanged
  ■ Kinematics (Kin) constraint involves just object A & robot
[Figure: Pick clause factor graph over Rob, Hold, Obj A, Obj B, Traj, and Grip, with Grasp, Stable, and Kin constraints; Hold changes from None to A, while x_q and x_B are unchanged]
Factored Controls
■ The same control (ut) can be applied in many different transitions
■ Allows reasoning about states in which the control can be applied
■ 26 trajectories admit many Move transitions (208)
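[Figure: two scenes with robot R and objects A and B in which the same trajectory u_t is applied, one with (x, u, x') ∈ C_Move and one with (x, u, x') ∉ C_Move; sample sets for X = (x_q, x_A, x_B, x_h), with x_h ∈ {None, A, B}, and U = (u_t)]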
Constraint Network
■ Plan skeleton: sequence of transition clauses
■ Constraint network: bipartite graph from variables to constraints
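[Figure: constraint network for a plan skeleton (a_1, ..., a_k) over state variables x^0_1, ..., x^k_m and control variables u^1_1, ..., u^k_n, linked by the initial constraints C_0, the clause constraints C_{a_1}, ..., C_{a_k}, and the goal constraints C_*]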
Only 7 out of 29 variables free
■ Sparse interactions between variables
[Figure: constraint network for the pick-and-place plan skeleton (C_Move, C^A_Pick, C^A_MoveH, C^A_Place, C_Move) over time steps 0 to 5 and rows Rob, Hold, Obj A, Obj B, Traj, Grip. Constraints: Motion, Grasp, Stable, Kin, Region, CFree, CFreeH. Fixed values include q0, a0, b0, and q*; the free variables are u^1_t, u^3_t, u^5_t, x^1_q, x^3_q, x^2_A, and x^4_A]
Sampling-Based Planning
Sampling-Based Motion Planning
■ Primitive procedures
  ■ Sample configurations (state variable)
  ■ Connect configurations (control variable)
  ■ Test collision
■ Probabilistic Roadmap (PRM)
  ■ Sample values
  ■ Discrete search
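The three primitives and the PRM loop can be sketched for a point robot in the unit square. Everything concrete here (the disc obstacle, the straight-line connector, the connection radius) is an assumption chosen to keep the example small.

```python
import math
import random
from collections import deque

OBSTACLE = (0.5, 0.5, 0.2)  # hypothetical disc obstacle: center x, center y, radius

def collision_free(p):
    """Test collision for a single configuration."""
    cx, cy, r = OBSTACLE
    return math.hypot(p[0] - cx, p[1] - cy) > r

def connect(p, q, step=0.01):
    """Connect two configurations with a straight segment, checked pointwise."""
    n = max(1, int(math.dist(p, q) / step))
    return all(
        collision_free((p[0] + (q[0] - p[0]) * i / n, p[1] + (q[1] - p[1]) * i / n))
        for i in range(n + 1)
    )

def prm(start, goal, num_samples=200, radius=0.3, seed=0):
    rng = random.Random(seed)
    # 1. Sample collision-free configurations.
    nodes = [start, goal]
    while len(nodes) < num_samples + 2:
        p = (rng.random(), rng.random())
        if collision_free(p):
            nodes.append(p)
    # 2. Connect nearby pairs whose straight segment is collision-free.
    edges = {i: [] for i in range(len(nodes))}
    for i in range(len(nodes)):
        for j in range(i + 1, len(nodes)):
            if math.dist(nodes[i], nodes[j]) <= radius and connect(nodes[i], nodes[j]):
                edges[i].append(j)
                edges[j].append(i)
    # 3. Discrete search (BFS) from node 0 (start) to node 1 (goal).
    parent = {0: None}
    queue = deque([0])
    while queue:
        i = queue.popleft()
        if i == 1:
            path = []
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
        for j in edges[i]:
            if j not in parent:
                parent[j] = i
                queue.append(j)
    return None

path = prm((0.1, 0.1), (0.9, 0.9))
print(len(path) if path else "no path found")
```

Sampling and connecting never look at the goal; only the final discrete search does.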
Dimensionality Reducing Constraints
■ Low dimensional pose stability constraint (Stable)
  ■ Directly sample the constraint
[Figure: Placement Sampler producing poses p1, p2, …]
Intersection of Constraints
■ Kinematic constraint (Kin) involves poses, grasps, and configurations
■ Conditional samplers - samplers with inputs
[Figure: Inverse Kinematics conditional sampler with inputs pose p and grasp g, producing configs q1, q2, …]
Composing Conditional Samplers
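[Figure: the Placement Sampler outputs a pose; Inverse Kinematics takes that pose and grasp g and outputs a config; the Motion Planner takes that config and config q2 and outputs a sequence of trajectories]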
■ Outputs of one conditional sampler are the inputs to another
■ Directed acyclic graph of conditional samplers
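Conditional samplers map naturally onto Python generators, and composition is then nested iteration. The sketch below uses a toy 1D world; the function names, the additive kinematics (config = pose + grasp), and the interpolated trajectory are illustrative assumptions.

```python
import random
from itertools import islice

def placement_sampler(region, rng):
    """Unconditional sampler: yields poses p1, p2, ... inside a 1D region."""
    lo, hi = region
    while True:
        yield rng.uniform(lo, hi)

def inverse_kinematics(pose, grasp):
    """Conditional sampler: inputs (pose, grasp), outputs configurations."""
    yield pose + grasp  # toy 1D arm with a single solution

def motion_planner(q_start, q_goal, steps=4):
    """Conditional sampler: inputs two configurations, outputs trajectories."""
    yield [q_start + (q_goal - q_start) * i / steps for i in range(steps + 1)]

def compose(region, grasp, q_start, rng):
    """Chain the samplers: each output becomes an input of the next one."""
    for pose in placement_sampler(region, rng):
        for conf in inverse_kinematics(pose, grasp):
            for traj in motion_planner(q_start, conf):
                yield pose, conf, traj

rng = random.Random(0)
samples = list(islice(compose((2.0, 3.0), grasp=-0.5, q_start=0.0, rng=rng), 3))
for pose, conf, traj in samples:
    print(round(pose, 3), round(conf, 3), [round(w, 3) for w in traj])
```

The nesting order follows the data dependencies, so the composition is a path through the directed acyclic graph of samplers.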
Lower-dimensional Solution Space
■ Sample at the intersection of several dimensionality-reducing constraints
■ When can the intersection be derived using each constraint individually?
■ Conditional constraint - partition of a constraint into input and output variables
■ Theorem: intersection is a submanifold if there exists a conditioning and ordering of conditional constraints such that:
  1. Each variable is an output once
  2. A variable is an output before it is an input
  3. For each constraint, the projection onto its input variables has full dimensionality*
Intersection of Projections
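■ Inductive intersection between $C_1 \cap \dots \cap C_i$ and $C_{i+1}$
■ Set $C_1 \cap \dots \cap C_i$ involves $z_1, \dots, z_i$
■ Constraint $C_{i+1}$ involves $z_i, z_{i+1}$
[Figure: $(C_1 \cap \dots \cap C_i) \cap C_{i+1}$ drawn over the domains $Z_1, \dots, Z_{i-1}, Z_i, Z_{i+1}$ of $z_1, \dots, z_{i-1}, z_i, z_{i+1}$, with the projections $\mathrm{proj}_{z_i}(C_1 \cap \dots \cap C_i)$ and $\mathrm{proj}_{z_i}(C_{i+1})$ shown in $Z_i$; labels: Input, Output]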
Graphical Interpretation
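1. Each variable “sampled” exactly once
2. Sampler inputs must be chosen before outputs
3. Set of valid input values is non-degenerate
[Figure: constraint graph over the free variables $u_t^1, u_t^3, u_t^5, x_q^1, x_q^3, x_A^2, x_A^4$ with constraints $\text{Grasp}(x_A^2)$, $\text{Stable}(x_A^4)$, $\text{Kin}(x_A^2, a_0, x_q^1)$, $\text{Kin}(x_A^2, x_A^4, x_q^3)$, $\text{Motion}(q_0, u_t^1, x_q^1)$, $\text{Motion}(x_q^1, u_t^3, x_q^3)$, $\text{Motion}(x_q^3, u_t^5, q_*)$]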
Each conditional sampler accounts for its own drop in dimensionality
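Conditions 1 and 2 are purely graph-theoretic and can be checked mechanically. The sketch below is one way to do that with the standard-library `graphlib` module (Python 3.9+). The input/output split assigned to each constraint is an assumed conditioning for the seven constraints of the pick-and-place example, and condition 3 (non-degenerate inputs) is geometric and not checked here.

```python
from graphlib import TopologicalSorter

# (constraint, input variables, output variables); the conditioning is assumed.
SAMPLERS = [
    ("Motion(x1q, u3t, x3q)", ["x1q", "x3q"], ["u3t"]),
    ("Motion(q0, u1t, x1q)", ["x1q"], ["u1t"]),
    ("Motion(x3q, u5t, q*)", ["x3q"], ["u5t"]),
    ("Kin(x2A, x4A, x3q)", ["x2A", "x4A"], ["x3q"]),
    ("Kin(x2A, a0, x1q)", ["x2A"], ["x1q"]),
    ("Stable(x4A)", [], ["x4A"]),
    ("Grasp(x2A)", [], ["x2A"]),
]

def order_samplers(samplers):
    """Return an evaluation order, or raise if conditions 1 or 2 cannot hold."""
    producer = {}
    for name, _, outputs in samplers:
        for var in outputs:
            if var in producer:  # condition 1: each variable is an output once
                raise ValueError(f"{var} is an output of more than one sampler")
            producer[var] = name
    graph = {}
    for name, inputs, _ in samplers:
        missing = [var for var in inputs if var not in producer]
        if missing:
            raise ValueError(f"{name} needs unsampled inputs {missing}")
        graph[name] = {producer[var] for var in inputs}
    # Condition 2: outputs precede inputs, i.e. the dependency graph is acyclic.
    # static_order() raises graphlib.CycleError if it is not.
    return list(TopologicalSorter(graph).static_order())

for name in order_samplers(SAMPLERS):
    print(name)
```

Any order returned this way evaluates Grasp and Stable before the Kin samplers, and each Kin sampler before the Motion samplers that consume its configuration.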
2 Variables, 3 Constraints
1. Sample: $y \sim C_1$
2. Conditionally sample: $x \sim C_2(y)$
3. Test: $(x, y) \in C_3$
[Figure: four panels in the (x, y) plane showing $C_1$ (1D), $C_2$ (1D), $C_3$ (2D), and the intersection $C_1 \cap C_2 \cap C_3$ (0D)]
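The sample, conditionally sample, test pattern can be written out for a concrete toy instance. The three constraints below are invented for illustration: C1 fixes y to one of a few values, C2 is the curve x = y², and C3 is the half-plane x + y ≤ 1.

```python
import random

def sample_c1(rng):
    """1. Sample: C1 constrains y alone (here, a finite set of values)."""
    return rng.choice([0.0, 0.5, 1.0])

def sample_c2(y):
    """2. Conditionally sample: C2 is the curve x = y**2, with y as the input."""
    return y ** 2

def satisfies_c3(x, y):
    """3. Test: C3 is a full-dimensional region, so it is only checked."""
    return x + y <= 1.0

def sample_solution(rng, max_attempts=100):
    """Rejection sampling over the lower-dimensional solution space."""
    for _ in range(max_attempts):
        y = sample_c1(rng)
        x = sample_c2(y)
        if satisfies_c3(x, y):
            return x, y
    return None

print(sample_solution(random.Random(0)))
```

Only C3 is handled by rejection; sampling (x, y) uniformly from the plane and testing all three constraints would hit the zero-dimensional solution set with probability zero.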
Robust Feasibility
■ Sampling-based methods typically cannot identify infeasibility
■ Also ineffective for feasible problems with a degenerate set of solutions
■ Investigate completeness for robustly feasible problems
  ■ Set of solutions is an open set in solution-space
Motion Planning
■ Robust feasibility ≈ positive clearance, positive ε-good, expansive
■ Motion skeleton with 1 free waypoint
[Figure: four motion planning problems (Problem 1 to Problem 4) for robot R from q0 to q*, each drawn as the problem and its configuration space; panel labels: Robustly Satisfiable; Satisfiable, Not Robustly Satisfiable; Unsatisfiable]
Task and Motion Planning
■ Plan skeleton: $(\text{MoveH}_A, \text{Place}_A)$, Free parameters: $x_A^2, x_q^1$
■ Constraints: $\{\text{Motion}(q_0, u_t^1, x_q^1),\ \text{CFreeH}(u_t^1, a_0),\ \text{CFreeH}(u_t^1, a_0, b_0),\ \text{Grasp}(a_0),\ \text{Stable}(x_A^2),\ \text{Kin}(a_0, x_A^2, x_q^1),\ \text{Region}(x_A^2)\}$
[Figure: three task and motion planning problems (Problem 1 to Problem 3) with robot R at q0, objects A and B at a0 and b0, and Region, showing the free parameters $x_q^1$ and $x_A^2$; panel labels: Robustly Satisfiable; Satisfiable, Not Robustly Satisfiable; Unsatisfiable]
Planning Algorithms
Domain-Independent Algorithms
■ Meta-parameter - set of conditional samplers
  ■ Samplers treated as blackboxes
  ■ Complete with respect to conditional samplers
  ■ Probabilistically complete given sufficient samplers
■ Algorithms still must:
  ■ Search through possible plan skeletons
  ■ Compose and order conditional samplers
  ■ Perform rejection sampling over the solution-space
Introduce 2 New Algorithms
■ Each algorithm repeats
  1. Sample values for discretization
  2. Search discretized problem for plan
■ Search implemented using blackbox algorithms
  ■ Breadth-First Search (BFS)
  ■ Off-the-shelf AI planner (FastDownward)
    ■ Compile to AI planning language (SAS+)
    ■ Exploits factoring in its search heuristics
Incremental Algorithm
■ Incremental ≈ probabilistic roadmap (PRM)
■ Repeat:
  1. Compose and sample conditional samplers
  2. Search discretized problem
■ Drawback - produces many unnecessary samples
■ Example: [Placement(A)→pA1, Grasp(A)→gA1, Placement(B)→pB1, Grasp(B)→gB1, IK(A, pA0, gA1)→q1, IK(B, pB0, gB1)→q2, IK(A, pA1, gA1)→q3, IK(B, pB1, gB1)→q4, Motion(q0, q1)→t1, Motion(q0, q2)→t2, Motion(q0, q3)→t3, Motion(q0, q4)→t4, Motion(q1, q0)→t5, Motion(q1, q2)→t6, Motion(q1, q3)→t7, …]
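The alternation between sampling and searching can be sketched as a loop over two blackbox callables. This is an illustrative skeleton, not the released implementation: the toy domain (a 1D robot hopping at most 0.3 per move) and the names `incremental`, `sample`, and `search` are assumptions.

```python
import random
from collections import deque

def incremental(sample, search, max_iterations=50):
    """Alternate between growing the discretization and searching it."""
    values = set()
    for iteration in range(1, max_iterations + 1):
        values |= sample(values)   # 1. compose and sample conditional samplers
        plan = search(values)      # 2. search the discretized problem
        if plan is not None:
            return plan, iteration
    return None, max_iterations

# Toy domain (hypothetical): move from 0.0 to 1.0 in hops of at most 0.3.
START, GOAL, MAX_HOP = 0.0, 1.0, 0.3
rng = random.Random(0)

def sample(values):
    """Blackbox sampler: two new waypoints per call, ignoring what exists."""
    return {rng.random(), rng.random()}

def search(values):
    """Blackbox search: BFS over the sampled waypoints plus start and goal."""
    nodes = sorted(values | {START, GOAL})
    parent = {START: None}
    queue = deque([START])
    while queue:
        q = queue.popleft()
        if q == GOAL:
            plan = []
            while q is not None:
                plan.append(q)
                q = parent[q]
            return plan[::-1]
        for q2 in nodes:
            if q2 not in parent and abs(q2 - q) <= MAX_HOP:
                parent[q2] = q
                queue.append(q2)
    return None

plan, iterations = incremental(sample, search)
print(iterations, plan)
```

The sampler here never consults the search result, which is the source of the drawback noted above: many sampled waypoints end up unused by the final plan.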
Focused Algorithm
■ Focused ≈ lazy PRM
■ Repeat:
  1. Search with real & lazy samples
  2. Sample values for lazy samples on found plan
■ Lazy samples ≈ lazy collision checking
Lazy Samples
■ Lazily sampling values reduces overhead from unnecessarily using conditional samplers
■ Example: [Placement(A)→pA1, Grasp(A)→gA1, IK(A, pA0, gA1)→q1, IK(A, pA1, gA1)→q2, Motion(q0, q1)→t1, MotionH(q1, q2, A, gA1)→t2, Motion(q2, q0)→t3]
  ■ Italics are real samples, bold are lazy samples
■ Lazy discretization can still be large
  ■ Share lazy samples across a sampler
  ■ Overly optimistic, but resolved through extra search
Scaling Experiments
■ Focused outperforms incremental
■ Heuristic search outperforms BFS
Diverse Experiments
Success percentage (%), Average runtime in sec. (t)
https://github.com/caelan/factored-transition-systems
STRIPStream
■ Predicate-based AI Planning Language
  ■ STRIPS/PDDL
  ■ pre/eff actions
■ More expressive
■ Same algorithms
https://github.com/caelan/ss
https://github.com/caelan/pddlstream
Ongoing / Future Work
Benchmarking in many frameworks
Drake https://github.com/caelan/ss-drake
PyBullet https://github.com/caelan/ss-pybullet
Cost-Sensitive Planning
■ Lower bounds on costs improve performance
■ Future work: theoretical analysis of asymptotic optimality properties
Planning using Learned Samplers
■ Learn diverse samplers using Bayesian Optimization, Generative Adversarial Networks, …
■ Active model learning and diverse action sampling for task and motion planning. Zi Wang, Caelan Reed Garrett, Leslie Pack Kaelbling, Tomás Lozano-Pérez
Constraint Network Optimization
■ When tractable, jointly solve for parameter values satisfying several constraints at once
  ■ Useful when constraints are strongly connected
  ■ Revert to sampling-based methods upon failure
■ Local optimization of solution
■ Preliminary work with Marc Toussaint
Fusing Sampler Instances
After fusing into a placement optimizer (Gurobi for now) and collision-free motion planners:
optimize:(o2, o5, o1, o3, o4, o0)->(#6, #5, #1, #18, #2, #17)
  (contained, o2, #6, o5) (pose, o1, #5) (kin, o1, #17, #5) (conf, #17) (kin, o2, #18, #6) (pose, o2, #6) (conf, #1) (contained, o1, #5, o5) (conf, #18) (conf, #2) (kin, o2, #2, o4) (kin, o1, #1, o3)
cfree-motion:(#17, o0, o2, #6, o1, #5)->(#35)
  (traj, #35) (motion, #17, #35, o0)
cfree-motion:(#1, #17, o2, #6)->(#41)
  (traj, #41) (motion, #1, #41, #17)
cfree-motion:(#18, #1, o1, o3, o2, #6)->(#47)
  (traj, #47) (motion, #18, #47, #1)
cfree-motion:(#2, #18, o1, o3)->(#54)
  (traj, #54) (motion, #2, #54, #18)
cfree-motion:(o0, #2, o1, o3, o2, o4)->(#12)
  (traj, #12) (motion, o0, #12, #2)
Individual sampler instances:
sample-pose:(o2, o5)->(#6)
  (pose, o2, #6) (contained, o2, #6, o5)
inverse-kinematics:(o2, #6)->(#18)
  (conf, #18) (kin, o2, #18, #6)
sample-pose:(o1, o5)->(#5)
  (pose, o1, #5) (contained, o1, #5, o5)
inverse-kinematics:(o1, #5)->(#17)
  (conf, #17) (kin, o1, #17, #5)
inverse-kinematics:(o2, o4)->(#2)
  (conf, #2) (kin, o2, #2, o4)
inverse-kinematics:(o1, o3)->(#1)
  (conf, #1) (kin, o1, #1, o3)
plan-motion:(o0, #2)->(#12)
  (traj, #12) (motion, o0, #12, #2)
plan-motion:(#2, #18)->(#54)
  (traj, #54) (motion, #2, #54, #18)
plan-motion:(#18, #1)->(#47)
  (traj, #47) (motion, #18, #47, #1)
plan-motion:(#1, #17)->(#41)
  (traj, #41) (motion, #1, #41, #17)
plan-motion:(#17, o0)->(#35)
  (traj, #35) (motion, #17, #35, o0)
Stochastic Planning
■ Approximate stochastic effects by determinization
  ■ Replan when unanticipated effects occur
■ Cost-sensitive planning to ensure induced policy makes progress towards goal
  ■ Action cost equal to the expected cost under a simple model
  ■ Induces an optimal policy for some probabilistically simple MDPs
  ■ These MDPs are reasonable approximations for some problems
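One possible reading of "expected cost under a simple model" is a retry model. This is an assumption for illustration and is not stated on the slide: an action with nominal cost $c$ succeeds independently with probability $p$ on each attempt, leaves the state unchanged on failure, and is repeated until it succeeds. The number of attempts is then geometrically distributed, which gives

```latex
\mathbb{E}[\text{cost}] \;=\; c \sum_{k=1}^{\infty} k \, p \, (1 - p)^{k-1} \;=\; \frac{c}{p}
```

Under this assumed model, a deterministic planner that charges $c / p$ for the action accounts for the expected number of retries, so less reliable actions look proportionally more expensive.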
Partially Observable Planning
■ Plan over distributions of states (belief-space)
  ■ Samplers operate on probability distributions (e.g. Multinoulli, Multivariate Gaussian, …)
■ Exogenous observations produce new values
  ■ Optimistically assume helpful observations
Multi-Robot Planning
■ Centralized scheduling of a team of robots
■ Similar algorithms but use temporal planners as search subroutine
■ Temporal FastDownward
■ PDDL rovers domain
Planning and Execution
■ Scheduling sampler evaluations
  ■ Prioritize samplers with low overhead and low probability of success (by solving a “meta” MDP)
  ■ Estimate overhead and probability of success per sampler
■ When replanning, many sampler evaluations may not be used
  ■ Defer sampler evaluations by scheduling samplers and actions together
  ■ Plan to sample a real value in the future
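The preference for cheap samplers that are likely to fail can be illustrated with a classical ordering rule for independent tests. The sketch substitutes that simpler rule for the "meta" MDP mentioned on the slide, and the overhead and success numbers are invented. It assumes every sampler on a plan must succeed, that outcomes are independent, and that evaluation stops at the first failure.

```python
from itertools import permutations

def expected_cost(order):
    """Expected total overhead when evaluation stops at the first failure."""
    total, p_reached = 0.0, 1.0
    for overhead, p_success in order:
        total += p_reached * overhead   # pay only if all earlier ones succeeded
        p_reached *= p_success
    return total

def schedule(samplers):
    """Sort by overhead / P(failure): cheap, failure-prone samplers go first."""
    def ratio(sampler):
        overhead, p_success = sampler
        return overhead / (1.0 - p_success) if p_success < 1.0 else float("inf")
    return sorted(samplers, key=ratio)

# Hypothetical (overhead in seconds, probability of success) estimates.
samplers = [(1.0, 0.9), (5.0, 0.2), (2.0, 0.5), (0.5, 0.99)]
ordered = schedule(samplers)
best = min(permutations(samplers), key=expected_cost)

print(ordered)
print(expected_cost(ordered), expected_cost(best))
```

Evaluating a likely failure early means a doomed plan is abandoned before the expensive samplers run, which is why both low overhead and low success probability raise a sampler's priority.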
Conclusion
■ General-purpose framework for exposing factoring in discrete-time hybrid systems
■ Techniques for solving a subclass of these systems using sampling
■ Domain-independent algorithms that operate on conditional samplers as blackboxes
■ Future directions include learning samplers, cost-sensitive planning, and planning & execution
Questions?
Hierarchy
■ Hierarchical action specifications
  ■ Assumption that refinement likely possible
  ■ Provide search guidance to a planner
  ■ Can postpone planning in some cases when planning and executing
■ Focused algorithm effective when few things to achieve
PR2 Demonstration
■ Visual object detection for coarse pose estimates
  ■ Tensorflow RCNN
■ Point cloud registration for fine pose estimates
  ■ PCL
■ Occupancy grid for collision checking
  ■ Octomap
Robotic Fabrication
■ Preliminary work with Yijiang Huang and Caitlin Mueller in the architecture department