Efficient and Asymptotically Optimal Kinodynamic Motion Planning via Dominance-Informed Regions

Zakary Littlefield and Kostas E. Bekris

Abstract— Motion planners have been recently developed that provide path quality guarantees for robots with dynamics. This work aims to improve upon their efficiency, while maintaining their properties. Inspired by informed search principles, one objective is to use heuristics. Nevertheless, comprehensive and fast spatial exploration of the state space is still important in robotics. For this reason, this work introduces Dominance-Informed Regions (DIR), which express both whether parts of the space are unexplored and whether they lie along a high quality path. Furthermore, to speed up the generation of a successful successor state, which involves collision checking or physics-based simulation, a proposed strategy generates the most promising successor in an informed way, while maintaining properties. Overall, this paper introduces a new informed and asymptotically optimal kinodynamic motion planner, the Dominance-Informed Region Tree (DIRT). The method balances exploration-exploitation tradeoffs without many explicit parameters. It is shown to outperform sampling-based and search-based methods for robots with significant dynamics.

I. INTRODUCTION

Kinodynamic motion planning introduces many challenges, such as high dimensionality, controllability issues, as well as geometric and kinematic considerations. There has been progress on addressing these problems and this work builds on top of these contributions. For instance, RRT can handle kinodynamic problems [1] but is proven to be suboptimal almost surely [2]. RRT∗ achieves asymptotic optimality [2] but can only be applied to systems for which an optimal local planner is available, i.e., a steering function that can connect two states exactly. It has been applied on several domains where these conditions hold [3]–[6].

Recent contributions achieved guarantees for systems without a steering function. These methods only need to forward propagate controls [7]. For instance, SST combines biased node selection with local pruning to improve solution quality over time [8]. Augmenting the state space with cost values and using branch-and-bound over RRT also results in asymptotic optimality [9]. A resolution complete approach partitions the state space, similar to classical search [10]. These methods do not use heuristics to guide search. This is an effective technique heavily used in discrete search [11], [12]. By guiding search, often high-quality solutions can be found fast. This integration of task space information has also been useful in sampling-based planners [13]–[17].

This work was sponsored by a NASA STRF award to Zakary Littlefield, a NASA ECF award to Kostas Bekris and a DoD-STTR award under Intelligent Automation Inc.

The authors are with the Computer Science Dept. of Rutgers University, 110 Frelinghuysen Road, Piscataway, NJ, USA, {zakary.littlefield, kostas.bekris}@rutgers.edu.

Fig. 1. Trajectories computed via the DIRT motion planning method for a fixed-wing airplane (left) and a planar manipulator pushing an object (right). Guidance via the proposed dominance-informed regions (DIRs) helps to find solutions quickly and improve path cost over time.

Heuristically-guided planners aim to quickly return solutions and optimize the addition of each new search node. These objectives are complicated, however, if the edge generation process is computationally expensive [18], or, as in continuous spaces, the number of possible edges is infinite. Balancing the number of edge evaluations per iteration creates a natural exploitation vs. exploration tradeoff, explored previously by the RRT-Blossom approach [19]. Existing heuristically guided planners do not address one of the following aspects:

• Kinodynamic problems where a steering function is not available.
• Achieving probabilistic completeness or asymptotic optimality.

This work uses a heuristic guidance process but also maintains probabilistic completeness and asymptotic optimality. It introduces the notion of Dominance-Informed Regions (DIR), a region for each tree node that balances Voronoi bias with path quality, thus representing an exploration-exploitation tradeoff implicitly. This gives rise to the asymptotically optimal Dominance-Informed Region Tree (DIRT). The method makes effective use of an optimistic heuristic and a maneuver generation process through an informed, blossom-like edge expansion process. Modifications are proposed for expensive propagation procedures. DIRT can also make use of a node pruning process, similar to SST, if asymptotic optimality is relaxed, maintaining a fast, space-efficient planner.

Overall, DIRT is able to quickly find solution trajectories and improve them in an anytime fashion. DIRT's singular parameter corresponds to the number of candidate edges to be considered per iteration, which provides a natural transition from exploration to exploitation when using lower to higher values. Experimental evaluations are presented for several kinodynamic benchmarks, which show the benefits relative to previously proposed solutions.


II. PROBLEM STATEMENT

Consider a state space X, divided into a collision-free subset Xf and an obstacle subset Xobs, with a start state x◦ ∈ Xf and a goal set XG ⊂ Xf. There is also a control space U. The dynamics of how controls influence new states are governed by a differential equation ẋ = f(x, u), which results in a trajectory π in Π. In this work, the assumption is that trajectories are generated via piecewise-constant controls.

Assumption 1: For a feasible motion planning problem that admits a solution, there exists a solution trajectory π generated by a piecewise constant control function Υ.

Each trajectory has a cost g according to a function cost : Π → R+. The solution trajectory must minimize this cost. Similar to prior work [8], the assumption is that the cost function is Lipschitz continuous w.r.t. X, additive, monotonically increasing, and non-degenerate. Given this cost objective, consider the following property:
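To make the propagation model concrete, the following minimal Python sketch forward-integrates ẋ = f(x, u) under a piecewise-constant control plan and accumulates an additive cost. The Euler integrator, the time-based running cost, and the function names are illustrative assumptions for this example, not details from the paper.

```python
import numpy as np

def propagate(x, u, f, duration, dt=0.01):
    """Forward-propagate state x under a constant control u for `duration`
    seconds using Euler integration of x_dot = f(x, u). Returns the visited
    states and the accumulated (additive) trajectory cost."""
    states, cost = [np.array(x, dtype=float)], 0.0
    for _ in range(int(round(duration / dt))):
        x_next = states[-1] + dt * f(states[-1], u)
        cost += dt  # example additive running cost: elapsed time
        states.append(x_next)
    return states, cost

# Usage with a toy single-integrator f(x, u) = u and a piecewise-constant plan.
if __name__ == "__main__":
    f = lambda x, u: np.asarray(u, dtype=float)
    plan = [((1.0, 0.0), 0.5), ((0.0, 1.0), 0.5)]   # (control, duration) pieces
    x, total = np.zeros(2), 0.0
    for u, T in plan:
        traj, c = propagate(x, u, f, T)
        x, total = traj[-1], total + c
    print(x, total)
```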

Definition 1: (Asymptotic Optimality) Let g∗ denote the optimal trajectory cost for a motion planning problem (Xf, x◦, XG). Let Yn denote a random variable that represents the minimum cost value among all trajectories returned by an algorithm at iteration n. The algorithm is asymptotically optimal if, for all independent runs, P({lim sup_{n→∞} Yn = g∗}) = 1.

A trajectory that returns the optimal cost g∗ is denoted as π∗. The cost function used may not be directly defined over X. Instead, the notion of a task space is used.

Definition 2: A task space, T, is a transformation of the state space X via a mapping function M : X → T.

This task space is application specific and usually captures the essence of the problem being solved, e.g., a manipulator's end effector configuration may be more important than the state of the entire arm. The goal region is also usually defined in the task space (denoted by TG), along with other useful primitives like distances. A task space is defined via a mapping from state space to task space (M) with a particular state x mapping to a task state τ.

Definition 3: A heuristic guidance function h, defined in T, must satisfy the following:

h(τ) = 0, ∀τ ∈ TG    (1)
h(τ) ≤ g∗(τ), ∀τ ∈ Tf.    (2)

These conditions specify that the heuristic function is optimistic, i.e., it does not overestimate the optimal trajectory cost to the goal. This task space guidance function enables a motion planner to check its overall progress toward reaching the goal. It can also be used as a discriminator for what edges are more likely to reach that goal.
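As an illustration of Definitions 2 and 3, here is a minimal Python sketch of a task-space mapping M and an optimistic heuristic, assuming a system whose task state is a planar position and whose speed is bounded by some v_max under a minimum-time cost. These specific choices are assumptions for the example, not the paper's benchmarks.

```python
import numpy as np

def task_map(x):
    """Illustrative mapping M: X -> T that keeps only the workspace position
    (here, the first two state coordinates) as the task state tau."""
    return np.asarray(x[:2], dtype=float)

def heuristic(tau, tau_goal, v_max=1.0):
    """Optimistic cost-to-go in task space for a minimum-time objective:
    straight-line distance divided by the maximum speed never overestimates
    the true remaining cost, and it evaluates to zero on the goal."""
    return np.linalg.norm(np.asarray(tau_goal) - np.asarray(tau)) / v_max
```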

The next assumption on task space mappings is used to argue properties of an algorithm in this context.

Assumption 2: A task space mapping M must maintain topological equivalence with respect to the cost function cost(·), heuristic function h(·), and distance functions dX(·), dT(·) between trajectories defined in X and in T.

The assumption indicates that the task space versions of the cost function and heuristic must accurately map to similar values in the state space. This assumption allows for analysis to continue in state space.

III. METHOD

DIRT is outlined in Alg. 1, while Alg. 2 explains how the dominance-informed regions (DIRs) are used for selecting a node. This region keeps track of local nodes, their path costs from the tree root, and heuristic estimates. A DIR is used during the sampling-based planner's selection process (Alg. 3). The propagation procedure employs a blossom of many controls, while pruning can optionally be employed as well.

A. Dominance-Informed Region Tree (DIRT)

Algorithm 1: DIRT(x◦, T, U, h(·), N)

1  T ← {x◦}, πsol ← ∅;
2  xnew ← x◦;
3  for N iterations do
4      if xnew ≠ ∅ and h(xnew) < h(parent(xnew)) then xsel ← xnew;
5      else xsel ← DIR Selection(T);
6      if Ecand(xsel) = ∅ then
7          Ecand(xsel) ← Blossom(xsel, U);
8      while Ecand(xsel) ≠ ∅ do
9          u ← arg min_{Ecand(xsel)} f(target);
10         Ecand(xsel) ← Ecand(xsel) \ {u};
11         xnew ← Propagate(xsel, u);
12         if xnew ≠ NULL and not BnB(xnew, πsol) then
13             Extend Tree(T, (xsel → xnew));
14             Update Dominance Regions(T, xsel, xnew);
15             if xnew ∈ XG and (πsol = ∅ or cost(π(xnew)) < cost(πsol)) then
16                 πsol ← π(xnew);
17             break;
18         else
19             xnew ← NULL;
20 return πsol;

Each iteration of DIRT starts with a selection of an existing node to expand. If the previous iteration of DIRT results in an expansion that has a smaller heuristic value than its parent (Line 4), that newly created node is selected. This greedy expansion strategy can find trajectories to the goal in relatively open environments easily, given a good heuristic function. Since the Blossom expansion is a relatively large amount of effort for a node expansion, compensating with a greedy selection can improve general performance of the motion planner as well. In all other cases, the node selection is performed with Alg. 2. The start state will always maintain a DIR with a radius equal to the maximum DIR radius assigned to any node in the search tree. Since the start is added to the search tree without any other nodes in its local region, this special consideration is necessary.

Then, a set of candidate edges is considered according to the Blossom algorithm (Alg. 4). These edges are ranked and processed in an order that aims to bring the system to the goal. Among these edges, the edge that proposes an end state (or target state) with a minimal f-value is examined further. The target state can either be an estimated end state, or the result of propagating the system dynamics.

Fig. 2. Comparison of how nodes are selected. (left) An RRT-like, Voronoi-based selection would select the closest node to the random sample. (right) The DIR-based method selects equally among all nodes that contain the random sample in their DIR. The DIR radius is dependent on g+h values at each node relative to those nearby.

Once an edge is selected, it is collision-checked for safety. This edge is also checked against the current best solution trajectory πsol. If adding this edge to the search tree would create a trajectory that has a larger f-value than cost(πsol), that edge does not get added to the search tree. This procedure is commonly known as branch and bound (abbreviated to BnB in Line 12 of Alg. 1). If the new edge is added, the DIR is computed for that node. If this edge was rejected for any reason, the next candidate edge for this node will be examined. This continues until a successful edge is added, or the candidate edge set is empty.
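To illustrate the edge ranking and the branch-and-bound test described above, here is a small self-contained Python sketch: candidate edges from a selected node are tried in increasing order of their estimated f = g + h value, and an edge whose f-value cannot improve on the incumbent solution cost is skipped. The data layout and helper names are illustrative assumptions (collision checking is omitted).

```python
import numpy as np

def best_admissible_edge(candidates, g_sel, heuristic, incumbent_cost):
    """candidates: list of (edge_cost, target_state) pairs from Blossom.
    Processes them in increasing f = g + h order and returns the first one
    that survives branch and bound against the incumbent solution cost."""
    ranked = sorted(candidates, key=lambda e: g_sel + e[0] + heuristic(e[1]))
    for edge_cost, target in ranked:
        f_value = g_sel + edge_cost + heuristic(target)
        if incumbent_cost is not None and f_value >= incumbent_cost:
            continue  # this edge cannot improve the best known solution
        return edge_cost, target
    return None

goal = np.array([5.0, 0.0])
h = lambda s: np.linalg.norm(goal - s)
cands = [(1.0, np.array([1.0, 0.0])), (1.0, np.array([0.0, 1.0]))]
print(best_admissible_edge(cands, g_sel=0.0, heuristic=h, incumbent_cost=6.5))
```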

B. Dominance-Informed Regions

Algorithm 2: DIR Selection(T)

1 xrand ← RandomSample(X);
2 Xcand ← {x ∈ T | xrand ∈ DIR(x)};
3 if |Xcand| = 0 then
4     xclosest ← Closest(T, xrand);
5     Xcand ← {x ∈ T | xclosest ∈ DIR(x)};
6 return x ∈ Xcand at random;

A key idea of the proposed motion planning algorithm is the notion of a dominance-informed region (DIR). These regions represent a volume of the task space that is claimed by a search node. These regions are defined as spheres, B(τ, r), of radius r centered at a task state τ ∈ T, which is represented by the search node. With a slight abuse of notation, checking if a task state τi exists in DIR(τj) is shorthand for checking if τi ∈ B(τj, rj). This operation of checking if a state exists within another DIR is the main operation in Alg. 2, and relies on radial nearest neighbor queries. Instead of relying only on Voronoi bias like in RRT, which does not consider path quality, using these informed regions allows for the path quality of a node to influence how frequently a node will be selected. An illustration of how selection differs from that in RRT is shown in Fig. 2.
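A small runnable Python sketch of the DIR-based selection in Alg. 2 follows, assuming each node stores its task state and DIR radius and using a brute-force containment check in place of radial nearest-neighbor queries. The dictionary layout is an illustrative assumption, not the paper's implementation.

```python
import random
import numpy as np

def dir_selection(nodes, sample):
    """nodes: list of dicts with 'tau' (task state) and 'radius' (DIR radius).
    Returns a node chosen uniformly among those whose DIR contains the sample;
    if none contains it, fall back to the DIRs covering the closest node."""
    contains = [n for n in nodes
                if np.linalg.norm(n["tau"] - sample) <= n["radius"]]
    if not contains:
        closest = min(nodes, key=lambda n: np.linalg.norm(n["tau"] - sample))
        contains = [n for n in nodes
                    if np.linalg.norm(n["tau"] - closest["tau"]) <= n["radius"]]
    return random.choice(contains)

# Toy usage: two nodes, the second with a larger DIR (e.g., due to a better f-value).
nodes = [{"tau": np.array([0.0, 0.0]), "radius": 0.5},
         {"tau": np.array([1.0, 0.0]), "radius": 1.5}]
print(dir_selection(nodes, np.array([0.9, 0.1]))["tau"])
```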

First, a uniform random sample is taken from the state space similar to RRT. Then, all DIRs that contain that random sample are composed into a candidate tree node set (where G is the search tree from the motion planner). If this candidate set is empty, this means that the random sample is in an unexplored region, and a secondary strategy is invoked. The closest tree node to the random sample is found, and the previous membership test is performed from that node's state. Now, the candidate set is guaranteed to have at least one element. This entire selection process has exploration properties when the search tree is small, but when the tree starts filling up the reachable space, more preference is given to nodes with good path quality from the start node.

Fig. 3. Computing a new DIR and updating others. A new node is added, which reduces the size of DIR(xj). Because the new node is not close enough to xi, DIR(xi) is not affected. The dotted line denotes the old DIR, and the solid lines are the current DIR.

Algorithm 3: Update Dominance Regions(T, xsel, xnew)

1 Xup ← {x ∈ T | dT(x, xnew) ≤ dT(xsel, xnew)};
2 ∀x ∈ Xup s.t. x.f > xnew.f : DIR(x) ← B(x, dT(x, xnew));
3 DIR(xnew) ← B(xnew, max_{x ∈ Xup} dT(xnew, x) s.t. x.f > xnew.f);

The computation of a node's DIR is performed via a local comparison to other nodes (Alg. 3). The upper limit for the radius of a node's DIR is the task-space distance dT between a node xnew and its parent xsel. The size can be smaller if there are other nodes nearby that have better path quality. The path quality measure f is the same as in other heuristic search methods, f = g + h, where g is the defined path cost of the trajectory from the start node to this node and h is the heuristic estimate of the cost to the goal. Nodes that have the best f-value locally will maintain larger DIRs, while suboptimal nodes will have their regions reduced since a better node in the local region has been discovered. An example illustration of this computation is shown in Fig. 3. A variant of the overall planner removes nodes if their DIRs are reduced too much and is discussed later.
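One plausible reading of this radius computation, following the description above (a new node's DIR is capped by the distance to its parent and shrunk by nearby nodes with better f, while dominated neighbors have their DIRs reduced), is sketched below in Python. Euclidean task-space distance and the dictionary layout are assumptions made for the example.

```python
import numpy as np

def update_dominance_regions(nodes, x_sel, x_new):
    """nodes: list of dicts with 'tau', 'radius', and 'f' (= g + h).
    Shrinks the DIR of nearby nodes dominated by x_new and assigns x_new a
    DIR no larger than its distance to the parent x_sel."""
    d_parent = np.linalg.norm(x_new["tau"] - x_sel["tau"])
    new_radius = d_parent
    for x in nodes:
        if x is x_new:
            continue
        d = np.linalg.norm(x["tau"] - x_new["tau"])
        if d > d_parent:
            continue                              # only nodes within the parent distance
        if x["f"] > x_new["f"]:
            x["radius"] = min(x["radius"], d)     # x is dominated: shrink its DIR
        else:
            new_radius = min(new_radius, d)       # a better node nearby caps x_new's DIR
    x_new["radius"] = new_radius
```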

C. Blossom Edge Expansion

In addition to selection, the proposed motion planner also generates edges differently than the basic RRT framework. The kinodynamic version of RRT will either perform one random control propagation or will propagate a set of random controls and try to add the one edge that gets closest to the randomly sampled state from the selection process. The second choice has been shown to cause lack of probabilistic completeness for kinodynamic problems [20]. In general, a larger number of control tries per iteration leads to more exploitation, while a single control tends toward exploration. This observation is what led to the proposed motion planner using a Blossom-like propagation procedure [19], which is adapted so as to utilize task-space heuristic information.

Algorithm 4: Blossom(x, U)

1 if x expanded previously then return Random U();
2 else return Ecand = {u ∈ Υ} s.t. |Ecand| = bn;

A Blossom approach aims to explore many different edges from a given tree node. As the number of candidate edges grows, the closer that set approximates the true reachability set from that tree node. This can be computationally inefficient for large numbers of edges, so care must be taken to reasonably limit the number of these edges.

For this algorithm, when a tree node is expanded for the first time, i.e., when the node is newly added to the tree, a set of candidate edges, Ecand, is generated. This can be a set of random controls or a curated set of controls designed by an expert. The size of this set is limited to a predetermined bn value, the blossom number. For every other expansion of this tree node, a single random control is generated. By switching from Blossom expansion to a single random expansion, asymptotic properties can be maintained, while getting the practical exploitation benefits.
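A sketch of the Blossom policy in Alg. 4, assuming a box-bounded control space: the first expansion of a node returns a spread of bn candidate controls (equally spaced, as used for the experiments in Sec. V), and every later expansion returns a single random control. The grid construction and node representation are illustrative assumptions.

```python
import numpy as np

def blossom(node, u_lo, u_hi, bn, rng=np.random.default_rng()):
    """First expansion of a node: return up to `bn` candidate controls spanning
    the control space on an equally spaced grid. Every later expansion: return
    a single random control, which preserves the planner's asymptotic properties."""
    u_lo, u_hi = np.asarray(u_lo, float), np.asarray(u_hi, float)
    if node.get("expanded", False):
        return [rng.uniform(u_lo, u_hi)]
    node["expanded"] = True
    per_dim = max(2, int(round(bn ** (1.0 / len(u_lo)))))
    axes = [np.linspace(lo, hi, per_dim) for lo, hi in zip(u_lo, u_hi)]
    grid = np.stack(np.meshgrid(*axes), axis=-1).reshape(-1, len(u_lo))
    return list(grid[:bn])

# First call yields a spread of controls, later calls a single random one.
node = {}
print(len(blossom(node, [-1.0, -0.5], [1.0, 0.5], bn=9)))   # -> 9
print(len(blossom(node, [-1.0, -0.5], [1.0, 0.5], bn=9)))   # -> 1
```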

D. Memory Constraint Considerations

Two methods related to DIRT take steps to limit the number of states the search tree stores [8], [10]. These pruning operations lead to significant performance improvements during runtime. This happens primarily through keeping data structures small, which in turn keeps the cost of each iteration small. For DIRT, there are several radial nearest neighbor queries that need to be performed to maintain and update the DIR for each node. While branch-and-bound can limit how far the search tree extends, it does not limit the number of nodes that can be added. A way to handle updating DIRs is to delete nodes that have a better node nearby.

Algorithm 5: Update DIR Pruning(G, xsel, xnew)

1 Update Dominance Regions(G, xsel, xnew);
2 Xd ← {x ∈ G | ∃x2 ∈ G : B(x, rx + dT(x, x2)) ⊂ B(x2, rx2)};
3 G ← G \ Xd;

Alg. 5 provides a function that can replace Line 14 in Alg. 1. In addition to reducing the size of these regions when a better node is added, nodes may be deleted if their DIR is completely encompassed in another node's DIR. New nodes can also be pruned by this process, behaving similarly to the new node being in collision with an obstacle. In this case, the new node is not added to the search tree and another candidate edge is examined. This pruning modification, however, complicates the analysis. Since the size of the DIR is limited by the distance to the parent node, it may be possible to show that a version of DIRT with Alg. 5 is asymptotically optimal. The following analysis section will study Alg. 1 as provided. Both the basic version and the pruning version are included in the experimental evaluation.
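The containment test behind this pruning reduces to checking whether one DIR ball lies entirely inside another, i.e., dist(τ, τ2) + r ≤ r2. A minimal Python sketch under that reading, with an illustrative data layout:

```python
import numpy as np

def prune_dominated(nodes):
    """Remove nodes whose DIR ball B(tau, r) is completely contained in another
    node's DIR ball, i.e., dist(tau, tau2) + r <= r2."""
    keep = []
    for x in nodes:
        contained = any(
            x is not x2 and
            np.linalg.norm(x["tau"] - x2["tau"]) + x["radius"] <= x2["radius"]
            for x2 in nodes)
        if not contained:
            keep.append(x)
    return keep

nodes = [{"tau": np.array([0.0, 0.0]), "radius": 2.0},
         {"tau": np.array([0.5, 0.0]), "radius": 0.5}]  # second DIR sits inside the first
print(len(prune_dominated(nodes)))   # -> 1
```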

IV. ANALYSIS

DIRT is shown to be asymptotically optimal following a Markov chain absorption argument (Theorem 11.3 [21]). A Markov chain is constructed that "observes" an optimal path at specified subdivisions and has an absorption state corresponding to the trajectory reaching the goal region. If the probability of transitioning from one Markov state to another is non-zero, the probability of reaching the absorbing state is 1 at infinity. This proves probabilistic completeness. A bound on the path cost of an observing trajectory can also be drawn using previous theoretical results [8]. As the number of states observing the optimal path goes toward infinity, the optimal trajectory will be generated with probability 1.

Definition 4: (Covering Ball Sequence) A covering ball sequence B(π, δ) for a trajectory π is a sequence of balls, B(x0, δ), B(x1, δ), ..., B(xn−1, δ), such that xi ∈ π.

Such a sequence can be constructed over π∗. Then, from any covering ball in the sequence, if a tree node exists in that ball, there is a positive probability of transitioning to the next covering ball. To transition from one covering ball to the next, there are two conditions:

• A node in the covering ball is selected for propagation.
• An edge is generated that terminates in the next ball.

If both operations have non-zero probability, the entire event's probability will also be non-zero. The probability that a node is selected is directly proportional to how large its DIR is relative to the task space's volume. The DIR of a node always extends some distance to another node, so its volume is always non-zero and the probability of selecting this node is also non-zero, i.e., γselect = µ(DIR(τ)) / (µ(T) · |Xcand|), where µ is the Lebesgue measure. It is likely that the true probability of node selection is much higher, since multiple nodes may be in the same ball, and the DIR region for nodes in this ball may have volume larger than the covering ball. This probability also does not include cases where a random sample does not fall into an existing DIR. This leads to a Voronoi bias similar to RRT.

For the probability of moving from one covering ball to the next, Theorem 17 from previous work proves that this is non-zero given assumptions about the smoothness of the system dynamics [8]. Since Alg. 4 reverts to a single random propagation per expansion, this probability is also non-zero.

Fig. 4. Test environments: (Left to Right) 2-link acrobot, fixed-wing airplane in a building, car-trailer in a maze, 4-link planar manipulator pushing an object to the marked region.

V. EXPERIMENTS

Algorithms: The algorithms compared are Randomized A∗ [17], AO-RRT [9], and SST [8]. Randomized A∗ and SST have parameters that introduce pruning. SST has an additional parameter for a selection radius. The parameters were chosen to maximize each algorithm's effectiveness in each evaluation. Two versions of DIRT are included: i) the basic Alg. 1 and ii) a pruning version of Line 14 with an adaptive pruning radius, denoted as DIRT-Prune. When Blossom generates a set of candidate edges, these edges are generated as equally spaced control values to span the control space of the system. Randomized A∗ only uses these generated candidate edges for expansion. These variants highlight the effectiveness of different modules of the proposed planner. Randomized A∗ uses a Blossom expansion strategy and shows the effects of not including randomized controls beyond greedy expansions. AO-RRT and SST have path quality guarantees but do not exhibit an informed nature.

Fig. 5. The number of solutions found over time, summed across runs, for each of the evaluated algorithm/system scenarios. Each planner instance is run 50 times to account for different random seeds.

Fig. 6. The average solution quality returned over time for each of the evaluated algorithm/system scenarios. Each planner instance is run 50 times to account for different random seeds (which results in some variance in solution quality).

Methods that require an optimal local planner are not considered. The GLC approach [10] is also not included, as its discretization is defined directly in the state space and not in a lower-dimensional task space. Due to the dimensionality of the considered benchmarks, the number of cells necessary to find a solution with GLC is prohibitive. Furthermore, the discretization of the state and control space depends on knowing the Lipschitz constants of the dynamics and cost function, which is troublesome in a task space and not guaranteed to satisfy the assumptions of GLC.

Benchmarks: A set of scenarios involving dynamical systems are considered (Fig. 4):

a) A 2-link acrobot, which is passive-active [22]. The state space is 4-dimensional. The control is the torque for the second joint. The system is tasked with reaching an upright balancing position while avoiding obstacles, including self collisions. This system is low dimensional, but highly non-linear. The task space is just the 2 angles representing the point mass on the end of the links. The heuristic is the distance between the given state and the upright goal in task space.

b) A fixed-wing airplane moving through a building with tight stairwells to reach the top floor. The state space is 9-dimensional [23]. The task space is the x, y, z location of the fixed-wing airplane, and the heuristic is computed via L2 distances to waypoints in the stairwells.

c) A second-order car-like vehicle with trailers, where the task is moving among a maze of obstacles. The state space for this system is 8-dimensional, corresponding to the SE(2) configuration of the car, the steering angle, forward velocity, and three trailer angles, θ1, θ2, θ3, relative to the chassis attachment. The controls for this system are the forward acceleration and the steering angle velocity. The dynamics are:

ẋ = v cos(θ) cos(ω),   ẏ = v sin(θ) cos(ω),
θ̇ = v sin(ω),   v̇ = a,
θ̇1 = v sin(θ1 − θ),   θ̇2 = v sin(θ1 − θ2),
θ̇3 = v sin(θ2 − θ3).

The task space is the x, y location of the front vehicle and the heuristic is the L2 norm between the state and the goal. (A sketch of integrating these dynamics is given after the benchmark descriptions.)

d) A non-prehensile manipulation task, where an object must be pushed to a goal region. The object has inertia once pushed. The state space is 8-dimensional, corresponding to the 4 manipulator joint values and the object position and velocity in the 2D plane. The heuristic is the sum of a lower-bounded time to reach the object with the end effector and a lower-bounded time for the object to move at maximum velocity toward the goal, ignoring obstacles.
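As referenced in benchmark c), the listed car-trailer dynamics can be integrated with a simple Euler step as in the Python sketch below. The integration scheme, the state ordering, and driving the steering angle directly by the second control (the steering angle velocity) are assumptions made for this example, not the authors' implementation.

```python
import numpy as np

def car_trailer_step(state, control, dt=0.01):
    """state = [x, y, theta, v, omega, th1, th2, th3]; control = [a, omega_dot].
    One Euler step of the second-order car-with-trailers dynamics listed above."""
    x, y, th, v, om, th1, th2, th3 = state
    a, om_dot = control
    dx = np.array([
        v * np.cos(th) * np.cos(om),   # x_dot
        v * np.sin(th) * np.cos(om),   # y_dot
        v * np.sin(om),                # theta_dot
        a,                             # v_dot (first control: forward acceleration)
        om_dot,                        # steering angle rate (second control)
        v * np.sin(th1 - th),          # th1_dot
        v * np.sin(th1 - th2),         # th2_dot
        v * np.sin(th2 - th3),         # th3_dot
    ])
    return state + dt * dx

print(car_trailer_step(np.zeros(8), [1.0, 0.1]))
```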

Remarks: The first metric is how quickly each algorithm can find a solution trajectory of any cost. The second evaluation is the solution cost over time. Since the algorithms are randomized, they are given 50 different random seeds and run independently. Solution quality and variance among all trials is shown in Fig. 6. Example search trees are given in Fig. 7.

Figs. 5 and 6 (left) show the results on the 2-link acrobot. Many algorithms can find solutions quickly. Randomized A∗, however, had difficulty finding solutions due to the pruning radius, which prohibits new nodes from being added in the highly non-linear space. SST gets around this by allowing new nodes to prune others only when path quality improves. SST does not find solutions for all instances within the time limit, unlike AO-RRT and DIRT. Both versions of DIRT find the best solutions and have lower variance.


Fig. 7. Example DIRT trees (for the planar manipulator the tree shown is the object's motion). Branch-and-bound and DIR-based pruning are applied.

Figs. 5 and 6 (middle left) correspond to the fixed-wing airplane. Randomized A∗ failed to find solutions and does not scale well. The DIRT methods effectively find solutions quickly with high solution quality. In these scenarios, there is only one possible corridor to the goal, which explains why both SST and DIRT find high quality solutions. DIRT finds them faster.

Figs. 5 and 6 (middle right) are results for the car-trailer. This is the most difficult benchmark due to the large number of obstacles and the large state space. Heuristic guidance is essential for finding solutions and DIRT is able to take advantage of it.

Figs. 5 and 6 (right) provide results for the planar manipulator case. SST and Randomized A∗ were ineffective in this space. SST did find some solution trajectories, but the number of successes was small. The DIRT methods found the most solutions. Encoding a heuristic with more knowledge about how the manipulator approaches the object could help. In terms of path quality, when the algorithms find solutions, DIRT has slightly better solutions on average.

VI. DISCUSSION AND CONCLUSION

Using heuristics in sampling-based planners is a standard way to improve efficiency, but it is unclear what is the best way to incorporate such information for continuous space problems with dynamics. The paper proposes dominance-informed regions (DIRs) as a natural method for biasing towards high quality nodes close to unexplored parts of the state space with a minimal number of parameters. DIRT also uses an informed edge expansion primitive based on multiple propagations and is able to provide asymptotic optimality. It is also shown to have good run times in practice and quickly finds high quality solutions, including with a pruning-based version that keeps the corresponding data structure sparse.

Future work includes providing an analysis of the pruning version. This involves finding the true bound on the size of the dominance regions relative to the clearance balls that tile the optimal path. Experimentally, there is evidence that the pruning version may still be asymptotically optimal, but it may impose more assumptions on the type of systems for which it is applicable. It is also interesting to evaluate the method in the context of systems that are more expensive to simulate, such as those that require a physics engine. In this case, propagating multiple edges for each node expansion can be costly. If the ranking of controls can be determined without robot simulation, performance similar to the results reported here can be achieved. This prediction heuristic can be the result of an approximated model with simpler dynamics, or can potentially be learned from data. The candidate edges may also be learned from previous experiences.

REFERENCES

[1] S. M. LaValle and J. J. Kuffner, "Randomized Kinodynamic Planning," IJRR, vol. 20, no. 5, pp. 378–400, May 2001.
[2] S. Karaman and E. Frazzoli, "Sampling-based algorithms for optimal motion planning," The International Journal of Robotics Research, vol. 30, no. 7, pp. 846–894, 2011.
[3] G. Goretkin, A. Perez, R. Platt, and G. Konidaris, "Optimal Sampling-based Planning for Linear-Quadratic Kinodynamic Systems," in IEEE Intl. Conf. on Robotics and Automation (ICRA), 2013.
[4] J.-H. Jeon, R. Cowlagi, S. Peters, S. Karaman, E. Frazzoli, P. Tsiotras, and K. Iagnemma, "Optimal Motion Planning with the Half-Car Dynamical Model for Autonomous High-Speed Driving," in American Control Conference (ACC), 2013.
[5] D. Webb and J. van den Berg, "Kinodynamic RRT*: Asymptotically Optimal Motion Planning for Robots with Linear Differential Constraints," in IEEE Intl. Conf. on Robotics and Automation (ICRA), 2013.
[6] S. Karaman and E. Frazzoli, "Sampling-Based Optimal Motion Planning for Non-holonomic Dynamical Systems," in IEEE Intl. Conf. on Robotics and Automation (ICRA), 2013.
[7] G. Papadopoulos, H. Kurniawati, and N. Patrikalakis, "Analysis of Asymptotically Optimal Sampling-based Motion Planning Algorithms for Lipschitz Continuous Dynamical Systems," 2014, http://arxiv.org/abs/1405.2872.
[8] Y. Li, Z. Littlefield, and K. E. Bekris, "Asymptotically optimal sampling-based kinodynamic planning," The International Journal of Robotics Research, vol. 35, no. 5, pp. 528–564, 2016.
[9] K. Hauser and Y. Zhou, "Asymptotically optimal planning by feasible kinodynamic planning in a state–cost space," IEEE Transactions on Robotics, vol. 32, no. 6, pp. 1431–1443, 2016.
[10] B. Paden and E. Frazzoli, "A Generalized Label Correcting Method for Optimal Kinodynamic Motion Planning," in WAFR, 2016.
[11] J. T. Thayer and W. Ruml, "Faster than weighted A*: An optimistic approach to bounded suboptimal search," in ICAPS, 2008, pp. 355–362.
[12] M. Phillips, M. Likhachev, and S. Koenig, "PA*SE: Parallel A* for slow expansions," in ICAPS, 2014.
[13] O. Adiyatov, K. Sultanov, O. Zhumabek, and H. A. Varol, "Sparse tree heuristics for RRT* family motion planners," in IEEE International Conference on Advanced Intelligent Mechatronics (AIM), July 2017, pp. 1447–1452.
[14] S. M. Persson and I. Sharf, "Sampling-based A* algorithm for robot path-planning," The International Journal of Robotics Research, vol. 33, no. 13, pp. 1683–1708, 2014.
[15] K. Bekris and L. Kavraki, "Informed and probabilistically complete search for motion planning under differential constraints," in First International Symposium on Search Techniques in Artificial Intelligence and Robotics (STAIR), Chicago, IL, 2008.
[16] S. Choudhury, J. D. Gammell, T. D. Barfoot, S. S. Srinivasa, and S. Scherer, "Regionally accelerated batch informed trees (RABIT*): A framework to integrate local information into optimal path planning," in IEEE International Conference on Robotics and Automation (ICRA), 2016, pp. 4207–4214.
[17] R. Diankov and J. Kuffner, "Randomized statistical path planning," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), 2007, pp. 1–6.
[18] V. Narayanan and M. Likhachev, "Heuristic search on graphs with existence priors for expensive-to-evaluate edges," 2017.
[19] M. Kalisiak and M. van de Panne, "RRT-Blossom: RRT with a local flood-fill behavior," in IEEE International Conference on Robotics and Automation (ICRA), 2006, pp. 1237–1242.
[20] T. Kunz and M. Stilman, "Kinodynamic RRTs with fixed time step and best-input extension are not probabilistically complete," in WAFR, 2014, pp. 233–244.
[21] C. Grinstead and J. Snell, Introduction to Probability. Providence, RI: American Mathematical Society, 2012.
[22] M. W. Spong, "Underactuated mechanical systems," in Control Problems in Robotics and Automation, Lecture Notes in Control and Information Sciences, B. Siciliano and K. P. Valavanis, Eds., 1997.
[23] A. Paranjape, K. Meier, X. Shi, S.-J. Chung, and S. Hutchinson, "Motion primitives and 3-D path planning for fast flight through a forest," in IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), Nov 2013.

