
Towards Multi-Robot Active Collaborative State Estimation via Belief Space Planning

Vadim Indelman*

Abstract— In this paper we address the problem of collaborative active state estimation within the framework of multi-robot simultaneous localization and mapping (SLAM). We assume each robot has to autonomously navigate to a pre-specified set of goals in unknown environments, and develop an approach that enables the robots to collaborate in order to reduce the uncertainty in their state estimation. We formulate this problem as multi-robot belief space planning, where the belief represents the probability distribution of robot states from the entire group, as well as the environment mapped thus far. Our approach is capable of guiding each robot to reduce its uncertainty by re-observing areas previously observed (only) by other robots. Direct observations between robot states, such as relative-pose measurements, are not required, providing enhanced flexibility for the group as the robots do not have to coordinate rendezvous with each other. Instead, our framework supports indirect constraints between the robots that are induced by mutual observations of the same area, possibly at different time instances, and accounts for these future multi-robot constraints within the planning phase. The proposed approach is evaluated in a simulation study.

I. INTRODUCTION

Accurate and reliable operation in complex, partially unknown or dynamically changing environments is essential in numerous problem domains, including autonomous navigation in GPS-deprived environments, search and rescue scenarios, object manipulation, human-robot interaction, satellite proximity operations and robotic surgery. The corresponding problem can be formulated within the simultaneous localization and mapping (SLAM) paradigm, where inference is performed over the robot states (e.g. pose, navigation state) and the perceived environment, and is also tightly related to vision-aided navigation.

While the SLAM problem has been extensively investigated for over a decade, state-of-the-art approaches often assume robot motion to be externally determined and focus on the inference part. Although treating the two processes separately simplifies the problem, and is a necessity in certain semi-autonomous applications (e.g. hand-held camera, user-driven robots), optimal performance is not guaranteed, as the principle of separation [30], [2] does not hold in the general case, often resulting in sub-optimal estimation accuracy.

Addressing the fully coupled problem involves accounting for different sources of uncertainty, such as stochastic dynamics and observations, within motion planning. To do so, we need to reason in terms of the belief, a probabilistic distribution over states of interest (e.g. robot poses), and how the belief evolves for different actions.

*Department of Aerospace Engineering, Technion - Israel Institute of Technology, Haifa 32000, Israel.

Fig. 1: 3D view of the considered synthetic scenario. Each robot adjusts its trajectory to observe areas mapped by another robot, thereby considerably reducing uncertainty (see also Figure 2).

The corresponding problem can be formulated as a partially observable Markov decision process (POMDP). Calculating a globally optimal solution to a POMDP is known to be computationally intractable [22] for all but the smallest problems. Thus, the research community has been extensively investigating approximate approaches that provide better scalability to support real-world problems, however often assuming operation in known environments, as well as assuming the robot state is directly observable; see e.g. [18], [24], [23], [29], [7].

In recent years, research focus has also shifted towards active SLAM approaches [27], [8], [26], [20], [31], [5], [28], [16] that trade off exploration of new areas against improving inference accuracy by actively performing loop-closure observations.

In this paper we address the problem of collaborative active state estimation within the framework of multi-robot SLAM. Collaboration between multiple, possibly heterogeneous, robots capable of information sharing has been investigated in both passive and active multi-robot SLAM contexts. Existing active approaches, however, typically focus mainly on coordination aspects to facilitate efficient exploration of new areas (see, e.g. [3], [6], [31]).

In contrast, here we approach the problem from a state estimation perspective, investigating how the robots can collaboratively improve inference quality. The passive problem formulation, where robot motion is externally determined and the objective is to infer states of interest as accurately as possible, has been investigated in recent years. These research efforts (including [25], [1], [17], [13], [4], [14]) explored different aspects of the problem, such as centralized vs. distributed architectures, direct and indirect multi-robot constraints, and multi-robot data association.

On the other hand, active approaches for cooperative state estimation have been much less investigated, in particular considering operation in unknown or uncertain environments. In this paper we address this problem within the framework of multi-robot SLAM and contribute an approach that allows a group of robots to determine (locally) optimal motion such that estimation accuracy is substantially improved while operating in unknown environments. Considering a centralized setting, we formulate the problem as multi-robot belief space planning, leveraging our previous research on (single-robot) planning in the generalized belief space [10], [11], [12].

In particular, we incorporate within the planning phase reasoning about (future) multi-robot constraints, which correspond to having the same scene observed by different robots, possibly at different time instances. These constraints provide an incentive for the robots to properly adjust their trajectories such that appropriate areas are observed, if these observations are expected to provide significant information gain (Figure 1). Moreover, our formulation does not require the robots to actually meet in order to benefit from each other's information, thereby providing enhanced flexibility.

II. PROBLEM FORMULATION

Let $x_i^r$ represent the pose of robot $r$ at time $t_i$, and denote by $L_i^r$ the environment perceived by robot $r$ by that time, e.g. represented by 3D points. We let $Z_i^r$ represent the local observations of robot $r$ at time $t_i$, i.e. measurements acquired by its onboard sensors, and define the joint state $\Theta_k^r$ over the robot's past and current poses and observed 3D points as

$$\Theta_k^r \doteq X_k^r \cup L_k^r, \qquad X_k^r \doteq \{x_0^r, \ldots, x_k^r\}. \quad (1)$$

The joint probability distribution function (pdf) over this joint state, given local observations $Z_{0:k}^r \doteq \{Z_0^r, \ldots, Z_k^r\}$ and controls $u_{0:k-1}^r \doteq \{u_0^r, \ldots, u_{k-1}^r\}$, is given by

$$p\left(\Theta_k^r \mid Z_{0:k}^r, u_{0:k-1}^r\right) \propto p(x_0^r) \prod_{i=1}^{k} p\left(x_i^r \mid x_{i-1}^r, u_{i-1}^r\right) p\left(Z_i^r \mid \Theta_i^{ro}\right), \quad (2)$$

where $\Theta_i^{ro} \subseteq \Theta_i^r$ are the random variables involved in the measurement likelihood term $p(Z_i^r \mid \Theta_i^{ro})$, which can be further expanded in terms of the individual measurements $z_{i,j}^r \in Z_i^r$ representing observations of 3D points $l_j$:

$$p\left(Z_i^r \mid \Theta_i^{ro}\right) = \prod_j p\left(z_{i,j}^r \mid x_i^r, l_j\right). \quad (3)$$

The motion and observation models in Eqs. (2) and (3) are assumed to have additive Gaussian noise,

$$x_{i+1}^r = f\left(x_i^r, u_i^r\right) + w_i^r, \quad w_i^r \sim \mathcal{N}\left(0, \Sigma_w^r\right) \quad (4)$$

$$z_{i,j}^r = h\left(x_i^r, l_j\right) + v_i^r, \quad v_i^r \sim \mathcal{N}\left(0, \Sigma_v^r\right) \quad (5)$$

where $\Sigma_w^r$ and $\Sigma_v^r$ are the process and measurement noise covariance matrices, respectively.

We now consider a group of $R$ robots, and denote by $\Theta_k$ the corresponding joint state

$$\Theta_k \doteq X_k \cup L_k, \qquad X_k \doteq \{X_k^r\}_{r=1}^{R}, \quad (6)$$

comprising the past and current poses $X_k$ of all robots, and where $L_k$ represents the environment perceived by the entire group. Assuming a common reference frame between the robots is established, $L_k$ includes all the 3D points in $L_k^r$ for each $r$, expressed in that reference frame. The joint pdf over $\Theta_k$ can now be written as

$$p\left(\Theta_k \mid Z_{0:k}, u_{0:k-1}\right) \propto \prod_{r=1}^{R} p\left(\Theta_k^r \mid Z_{0:k}^r, u_{0:k-1}^r\right), \quad (7)$$

where $u_{0:k-1}$ represents the controls of all robots and is defined as $u_{0:k-1} \doteq \{u_{0:k-1}^r\}_{r=1}^{R}$.

One can now calculate the maximum a posteriori (MAP) estimate of $\Theta_k$ in a centralized framework as

$$\Theta_k^\star = \arg\max_{\Theta_k} p\left(\Theta_k \mid Z_{0:k}, u_{0:k-1}\right). \quad (8)$$

The above formulation implicitly includes multi-robot constraints: these constraints arise from some robot $r$ observing, at the current time $t_k$, landmarks that have already been observed by another robot $r'$ at some time $t_{r'} \le t_k$. These multi-robot constraints are the key to improving estimation accuracy in a collaborative multi-robot setting.

In this paper we address the problem of how such multi-robot constraints can be planned ahead of time to improve estimation accuracy. To that end, we formulate the problem within the framework of belief space planning and develop an approach capable of autonomously guiding the robots such that the mentioned multi-robot constraints can be generated, if these are expected to provide significant information gain.

While our approach naturally also supports direct multi-robot constraints (e.g. one robot making relative-pose observations of another robot), we focus on the more general case of indirect multi-robot constraints. The latter provide enhanced flexibility, as there is no requirement for the robots to actually meet each other to perform collaborative inference; instead, the latter takes place whenever a common scene is observed by different robots.

III. MULTI-ROBOT BELIEF SPACE PLANNING

We formulate this problem within belief space planning, where the belief represents the joint pdf of the entire group at an appropriate time. In particular, the belief at the current time $t_k$ is just the joint pdf (7):

$$b\left(\Theta_k\right) \doteq p\left(\Theta_k \mid Z_{0:k}, u_{0:k-1}\right), \quad (9)$$


while the belief at a future time $t_{k+l}$ is similarly defined as $b(\Theta_{k+l}) \doteq p(\Theta_{k+l} \mid Z_{0:k+l}, u_{0:k+l-1})$, where the future observations $Z_{k+1:k+l}$ and controls $u_{k:k+l-1}$ are unknown. This belief can be written in terms of the belief $b(\Theta_k)$ from the current time and additional terms describing belief evolution at a future time:

$$b\left(\Theta_{k+l}\right) \propto b\left(\Theta_k\right) \cdot \prod_{i=1}^{l} \prod_{r=1}^{R} p\left(x_{k+i}^r \mid x_{k+i-1}^r, u_{k+i-1}^r\right) p\left(Z_{k+i}^r \mid \Theta_{k+i}^{ro}\right). \quad (10)$$

The belief $b(\Theta_{k+l})$ models the distribution of the joint state at a future time $t_{k+l}$, while accounting for the future controls $u_{k:k+l-1} \equiv \{u_{k:k+l-1}^r\}_{r=1}^{R}$ and observations $Z_{k+1:k+l} = \{Z_{k+1:k+l}^r\}_{r=1}^{R}$ of all $R$ robots in the group. Observe that the latter depend on the controls (different areas are observed for different controls). As we shall see next, this allows modeling future multi-robot constraints via observations of mutual 3D points. Importantly, these observations can be acquired at different time instances.
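To make the dependence of Eq. (10) on candidate controls concrete, the sketch below propagates a Gaussian approximation of a single robot's marginal belief through the motion terms, EKF-style. This is an illustrative simplification (the approach maintains the joint belief over the whole group, including the mapped 3D points), with the model f, its Jacobian F_jac, and the noise covariance Sigma_w supplied by the caller.

```python
import numpy as np

def propagate_belief(mean, cov, controls, f, F_jac, Sigma_w):
    """Predict a Gaussian belief through the motion model for each
    candidate future control, as in the motion terms of Eq. (10)."""
    beliefs = [(mean, cov)]
    for u in controls:
        F = F_jac(mean, u)              # motion-model Jacobian at the mean
        mean = f(mean, u)               # predicted mean (noise-free prediction)
        cov = F @ cov @ F.T + Sigma_w   # predicted covariance
        beliefs.append((mean, cov))
    return beliefs  # marginals b(.) for look-ahead steps 0, 1, ..., l

# Usage with a trivial linear model (assumption for illustration):
f = lambda x, u: x + u
F_jac = lambda x, u: np.eye(2)
out = propagate_belief(np.zeros(2), 0.1 * np.eye(2), [np.ones(2)] * 3,
                       f, F_jac, 0.01 * np.eye(2))
```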

We now define a general multi-robot objective function

$$J\left(u_{k:k+L-1}\right) \doteq \mathbb{E}\left[\sum_{l=0}^{L-1} c_l\left(b\left(\Theta_{k+l}\right), u_{k+l}\right) + c_L\left(b\left(\Theta_{k+L}\right)\right)\right], \quad (11)$$

which involves $L$ look-ahead steps for all robots, and where $c_l$ is the immediate cost function for the $l$th look-ahead step. The expectation operator accounts for all possible future observations $Z_{k+1:k+l}$. We are then interested in finding the optimal controls $u_{k:k+L-1}^\star$ for all $R$ robots via

$$u_{k:k+L-1}^\star = \arg\min_{u_{k:k+L-1}} J\left(u_{k:k+L-1}\right). \quad (12)$$

We note that while we assumed the robots have a single objective function (11), it can still include different terms for each robot (as will be seen next). One could also resort to a different objective function for each robot.

In particular, when $c_l$ includes the second moment of the belief $b(\Theta_{k+l})$, e.g. the covariance, calculating (12) facilitates a framework for actively and collaboratively reducing uncertainty in the joint state of the group. The problem, however, is further complicated when the objective function includes additional terms, such as distance to goal and control effort, which require a delicate balance of the importance of each term.

IV. DIRECT TRAJECTORY OPTIMIZATION

Finding a globally optimal solution $u_{k:k+L-1}^\star$ via Eq. (12) involves solving a partially observable Markov decision process (POMDP), which is known to be computationally intractable [22] for all but the smallest problems. The research community has been extensively investigating approximate approaches, including point-based POMDP, sampling-based and direct trajectory optimization methods. Our approach is of the latter class and is based on the previous work [10], [11], [12], trivially extended to the multi-robot centralized case considered herein.

At each time instant, we calculate locally-optimal controls $u_{k:k+L-1}$ for all $R$ robots, starting from a nominal trajectory or controls that are assumed to be given. The latter can be set according to the controls from the previous time instant, or determined by sampling-based motion planning techniques, such as RRT [19] or RRT* [15] (see, e.g., [29]).

In particular, as shown in our recent work [9], a similar probabilistic formulation that exploits multi-robot indirect constraints can be used within sampling-based techniques, in which case the approach reported herein can be applied to refine the identified best trajectories into optimal solutions.

The overall process for calculating the optimal controls $u_{k:k+L-1}^\star$ can be described as a dual-layer inference (see [10], [11], [12] for details). The outer layer constitutes an iterative optimization over the controls, which are updated at each iteration according to

$$u_{k:k+L-1} \leftarrow u_{k:k+L-1} + \Delta u_{k:k+L-1}. \quad (13)$$

Different optimization methods can be used; in this work, similarly to [10], [11], we use a simple gradient method. Thus, the controls of each robot $r$ are updated according to

$$u_{k:k+L-1}^r \leftarrow u_{k:k+L-1}^r - \lambda \nabla_r J, \quad (14)$$

where $\lambda$ is an appropriate step size and $\nabla_r J$ is the gradient of the objective function with respect to the current solution for the controls $u_{k:k+L-1}^r$ of robot $r$.

The inner layer performs inference over the belief: recalling the definition of the objective function $J(u_{k:k+L-1})$ from Eq. (11), each iteration (13) involves inference over the joint belief $b(\Theta_{k+l}) \sim \mathcal{N}(\Theta_{k+l}^\star, \Sigma_{k+l})$, given the current control values, for each look-ahead step, i.e. $l = 1, \ldots, L$.

Calculating the gradient captures how the belief, and in particular the covariance $\Sigma_{k+l}$, changes for different actions of the $R$ robots, and eventually entails the (locally) optimal controls for a given objective function. This provides a natural mechanism to facilitate active collaborative state estimation, i.e. adjusting robot trajectories to attain better estimation accuracy.
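The dual-layer structure can be sketched as follows, under our own simplifications: the inner layer is abstracted as a user-supplied callable J that performs belief inference and evaluates the objective for candidate controls, and the outer layer applies the per-robot update of Eq. (14). Here the gradient is approximated by finite differences for clarity, whereas [10], [11], [12] compute it analytically.

```python
import numpy as np

def optimize_controls(U, J, step=0.1, iters=20, eps=1e-4):
    """Outer-layer updates, Eq. (14): per-robot gradient descent on J.

    U : (R, L, d) array of controls for R robots over L look-ahead steps.
    J : callable evaluating the objective (runs inner-layer inference).
    """
    U = U.copy()
    for _ in range(iters):
        for r in range(U.shape[0]):
            grad = np.zeros_like(U[r])
            # Finite-difference approximation of grad_r J (illustrative only).
            for idx in np.ndindex(U[r].shape):
                U[r][idx] += eps
                j_plus = J(U)
                U[r][idx] -= 2 * eps
                j_minus = J(U)
                U[r][idx] += eps
                grad[idx] = (j_plus - j_minus) / (2 * eps)
            U[r] -= step * grad          # u^r <- u^r - lambda * grad_r J
    return U

# Example with a toy quadratic objective (assumption for illustration):
U0 = np.zeros((2, 3, 2))
J = lambda U: float(np.sum((U - 1.0) ** 2))
U_opt = optimize_controls(U0, J)
```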

In the next section we discuss in detail the mentioned inner-layer inference, and describe a methodology to account for future multi-robot constraints within this planning phase.

V. INCORPORATING FUTURE MULTI-ROBOT CONSTRAINTS INTO THE JOINT BELIEF $b(\Theta_{k+l})$

We begin with the joint belief $b(\Theta_{k+l})$ from Eq. (10) and write it explicitly as

$$b\left(\Theta_{k+l}\right) = p\left(\Theta_{k+l} \mid Z_{0:k+l}, u_{0:k+l-1}\right). \quad (15)$$

Recall that it is a function of the (unknown) future observations $Z_{k+1:k+l}$ and of the controls $u_{k:k+l-1}$ of all $R$ robots. While in inference the existence of observations is a given fact (either a measurement is obtained or it is not), when planning future actions we can only model probabilistically whether or not future measurements will be acquired. Intuitively, if a 3D point is outside the camera field of view, or is too far away (e.g. outside sensing range), it will not be observed by the robot's sensors. In previous work [11] we introduced binary latent variables to represent the probability of acquiring future measurements.

Here, we go one step further and note that similar reasoning allows us to model future multi-robot constraints, i.e. observations of 3D points by different robots, not necessarily at the same time. This observation is the key element that facilitates our approach for opportunistic active collaborative state estimation.

Specifically, for each robot $r$ we use two types of binary latent variables, denoted here by $\gamma_{k+l,j}^r$ and $\delta_{k+l,m}^r$. Variables of the first type, $\gamma_{k+l,j}^r$, represent a loop-closure event to happen at time $t_{k+l}$ (as in [11]), i.e. robot $r$ re-observing 3D points $l_j$ that were previously seen by robot $r$ and possibly by other robots. In contrast, the variables $\delta_{k+l,m}^r$ represent observations, to be made by robot $r$, of 3D points $l_m$ that were only observed by other robots. These latter observations induce additional (indirect) multi-robot constraints which, being part of the gradient calculation (14), attract the robots to locations that allow such constraints to take place.

Collecting these variables into corresponding sets $\Gamma_{k+l}^r \doteq \{\gamma_{k+l,j}^r\}$ and $\Delta_{k+l}^r \doteq \{\delta_{k+l,m}^r\}$, and then letting

$$\Gamma_{k+l} \doteq \bigcup_{r=1}^{R} \Gamma_{k+l}^r, \qquad \Delta_{k+l} \doteq \bigcup_{r=1}^{R} \Delta_{k+l}^r, \quad (16)$$

the joint pdf of the entire group at the $l$th look-ahead step is given by

$$p\left(\Theta_{k+l}, \Gamma_{k+l}, \Delta_{k+l} \mid Z_{0:k+l}, u_{0:k+l-1}\right). \quad (17)$$

The belief $b(\Theta_{k+l})$, see Eq. (15), can be calculated by marginalizing out the latent variables as in

$$b\left(\Theta_{k+l}\right) = \sum_{\Gamma_{k+l}} \sum_{\Delta_{k+l}} p\left(\Theta_{k+l}, \Gamma_{k+l}, \Delta_{k+l} \mid Z_{0:k+l}, u_{0:k+l-1}\right). \quad (18)$$
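For intuition, with only a handful of binary variables the marginalization in Eq. (18) can be carried out by brute-force enumeration, as in the toy sketch below; this is exponential in the number of variables, which is precisely why Section VI resorts to an EM treatment instead.

```python
from itertools import product

def marginalize_binary(joint_pdf, n_gamma, n_delta):
    """Eq. (18): sum the joint over all assignments of the binary
    latent variables (exponential in their number; toy sizes only)."""
    total = 0.0
    for gammas in product((0, 1), repeat=n_gamma):
        for deltas in product((0, 1), repeat=n_delta):
            total += joint_pdf(gammas, deltas)
    return total

# e.g. a toy unnormalized joint over two gammas and one delta:
joint = lambda g, d: 0.5 ** (sum(g) + sum(d) + 1)
b = marginalize_binary(joint, n_gamma=2, n_delta=1)
```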

Let us now focus on the joint pdf (17) and write it recursively in terms of the belief from the previous step, $b(\Theta_{k+l-1})$:

$$p\left(\Theta_{k+l}, \Gamma_{k+l}, \Delta_{k+l} \mid Z_{0:k+l}, u_{0:k+l-1}\right) \propto b\left(\Theta_{k+l-1}\right) \cdot \prod_{r=1}^{R} p\left(x_{k+l}^r \mid x_{k+l-1}^r, u_{k+l-1}^r\right) p\left(Z_{k+l}^r, \Gamma_{k+l}^r, \Delta_{k+l}^r \mid \Theta_{k+l}^{ro}\right), \quad (19)$$

and where the last term can be expressed using the binary latent variables as

$$p\left(Z_{k+l}^r, \Gamma_{k+l}^r, \Delta_{k+l}^r \mid \Theta_{k+l}^{ro}\right) = \prod_{l_j \in L_k^r} p\left(z_{k+l,j}^r \mid x_{k+l}^r, l_j, \gamma_{k+l,j}^r\right) p\left(\gamma_{k+l,j}^r \mid x_{k+l}^r, l_j\right) \cdot \prod_{l_m \in L_k \setminus L_k^r} p\left(z_{k+l,m}^r \mid x_{k+l}^r, l_m, \delta_{k+l,m}^r\right) p\left(\delta_{k+l,m}^r \mid x_{k+l}^r, l_m\right). \quad (20)$$

In the above equation one can see the mentioned binary variables of both types: the first product involves the variables $\gamma_{k+l,j}^r$, accounting for observations to be made by robot $r$ of 3D points in $L_k^r$, i.e. representing areas previously observed by robot $r$ and possibly also by other robots. The second product includes the variables $\delta_{k+l,m}^r$, modeling robot $r$'s observations of 3D points in $L_k \setminus L_k^r$, i.e. areas that were previously observed by other robots but not by robot $r$.

Intuitively, if the corresponding 3D points in $L_k \setminus L_k^r$ are estimated with high confidence (small uncertainty covariance), there is much to gain for robot $r$ by observing (some of) these points. In such cases, we would like each robot $r$ to be autonomously guided towards the appropriate areas so that its estimation quality can be improved (see Figures 1 and 2a). In the next section we discuss in detail a mechanism to accomplish this objective.

Remark: While Eq. (20) accounts for the areas observed by the current time $t_k$, as represented by the 3D points $L_k$, one could also consider, for each $l$th look-ahead step, new 3D points that will be observed for the first time during the time interval $[t_{k+1}, t_{k+l}]$. As we show in [9], this facilitates a framework for active collaborative state estimation while operating in unknown environments, and can lead to significantly improved estimation accuracy.

VI. INFERENCE OVER THE BELIEF

Having discussed in detail the joint belief $b(\Theta_{k+l})$, we now turn our attention to the corresponding maximum a posteriori (MAP) inference, which involves calculating the first two moments of that belief. This is the inner-layer inference, which is performed for each of the $L$ look-ahead steps as part of the gradient calculation in Eq. (14).

In other words, we are looking for $\Theta_{k+l}^\star, \Sigma_{k+l}$ that represent the belief at the $l$th look-ahead step:

$$b\left(\Theta_{k+l}\right) = \mathcal{N}\left(\Theta_{k+l}^\star, \Sigma_{k+l}\right). \quad (21)$$

We note that, alternatively, an information form could be used to attain better computational efficiency (as in [12]).

Calculating the MAP estimate of $\Theta_{k+l}$ involves marginalization over all the latent binary variables $\Gamma_{k+l}, \Delta_{k+l}$ (see Eq. (18)):

$$\Theta_{k+l}^\star = \arg\max_{\Theta_{k+l}} \sum_{\Gamma_{k+l}} \sum_{\Delta_{k+l}} p\left(\Theta_{k+l}, \Gamma_{k+l}, \Delta_{k+l} \mid Z_{0:k+l}, u_{0:k+l-1}\right),$$

which, similarly to the single-robot case, is computationally intractable. Instead, we resort to an expectation-maximization (EM) [21] approach and write

$$\Theta_{k+l}^\star = \arg\min_{\Theta_{k+l}} \mathbb{E}\left[-\log p\left(\Theta_{k+l}, \Gamma_{k+l}, \Delta_{k+l} \mid Z_{0:k+l}, u_{0:k+l-1}\right)\right],$$

where the expectation is taken with respect to $p\left(\Gamma_{k+l}, \Delta_{k+l} \mid \hat{\Theta}_{k+l}, Z_{0:k+l}, u_{0:k+l-1}\right)$, and where $\hat{\Theta}_{k+l}$ is the estimate of $\Theta_{k+l}$ at the current iteration. Following a similar process as in [11], [12], the above calculation of $\Theta_{k+l}^\star$ can be written recursively as in Eq. (22), which appears at the top of the next page. Linearizing about $\hat{\Theta}_{k+l}$ and employing algebraic manipulation, one can express this equation compactly as

$$\Theta_{k+l}^\star = \arg\min_{\Theta_{k+l}} \left\| A_{k+l}\, \Delta\Theta_{k+l} + b_{k+l} \right\|^2, \quad (23)$$

with the corresponding covariance (or, alternatively, information matrix) calculated as

$$\Sigma_{k+l} = \left(A_{k+l}^T A_{k+l}\right)^{-1}. \quad (24)$$
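In code, Eqs. (23)-(24) reduce to a standard linear least-squares solve followed by covariance recovery from the stacked Jacobian; the sketch below uses a random system standing in for $A_{k+l}$ and $b_{k+l}$, purely for illustration, and assumes the rows have already been whitened by the noise covariances.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((12, 6))   # stacked, whitened Jacobian A_{k+l}
b = rng.standard_normal(12)        # stacked residual vector b_{k+l}

# Eq. (23): dTheta = argmin || A * dTheta + b ||^2
dTheta, *_ = np.linalg.lstsq(A, -b, rcond=None)

# Eq. (24): covariance from the information matrix A^T A
Sigma = np.linalg.inv(A.T @ A)
```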


$$\Theta_{k+l}^\star = \arg\min_{\Theta_{k+l}} \left\| \Theta_{k+l-1} - \Theta_{k+l-1}^\star \right\|^2_{\Sigma_{k+l-1}} + \sum_{r=1}^{R} \left\| x_{k+l}^r - f\left(x_{k+l-1}^r, u_{k+l-1}^r\right) \right\|^2_{\Sigma_w^r} \quad (22a)$$

$$+ \sum_{r=1}^{R} \sum_{l_j \in L_k^r} p\left(\gamma_{k+l,j}^r = 1 \mid \hat{x}_{k+l}^r, \hat{l}_j, z_{k+l,j}^r\right) \left\| z_{k+l,j}^r - h\left(x_{k+l}^r, l_j\right) \right\|^2_{\Sigma_v} \quad (22b)$$

$$+ \sum_{r=1}^{R} \sum_{l_m \in L_k \setminus L_k^r} p\left(\delta_{k+l,m}^r = 1 \mid \hat{x}_{k+l}^r, \hat{l}_m, z_{k+l,m}^r\right) \left\| z_{k+l,m}^r - h\left(x_{k+l}^r, l_m\right) \right\|^2_{\Sigma_v} \quad (22c)$$

The terms $p(\gamma_{k+l,j}^r = 1 \mid \hat{x}_{k+l}^r, \hat{l}_j, z_{k+l,j}^r)$ and $p(\delta_{k+l,m}^r = 1 \mid \hat{x}_{k+l}^r, \hat{l}_m, z_{k+l,m}^r)$ in Eq. (22) represent the probabilities of observing the 3D points $l_j$ and $l_m$ at a future time $t_{k+l}$. In standard EM inference, one can evaluate both of these terms based on the estimates $\hat{x}_{k+l}^r$, $\hat{l}_j$ and $\hat{l}_m$ at the current iteration and the given observations $z_{k+l,j}^r$ and $z_{k+l,m}^r$ (see, e.g. [14]). However, this is not possible in the planning phase, since these future observations are unknown (we are trying to assess whether they will be acquired).

Nevertheless, these two terms play a key role in the described dual-layer inference, as they provide a mechanism to attract the robot to observe informative 3D points, i.e. observations that are expected to significantly impact the uncertainty covariance (24). Crucially, this is possible even for 3D points outside the robot's current sensing range (or field of view). By appropriately modeling these terms, e.g. letting the probability of observing a 3D point decrease with the robot's distance from it, it is possible to induce non-zero contributions to the gradient $\nabla_r J$ from Eq. (14). These contributions become dominant if the information gain is significant, and as a result the robot will be guided towards the corresponding informative 3D points.

It is exactly this mechanism that allows planning the mentioned indirect multi-robot constraints, as represented by the summation in Eq. (22c). In particular, the gradient $\nabla_r J$ calculated by each robot $r$ accounts for the impact of observing the 3D points $L_k \setminus L_k^r$ that were previously observed only by other robots.

In practical terms, while the summation in Eq. (22c) involves all the 3D points observed by the entire group of $R$ robots excluding robot $r$, one could prefer to only account for those areas that are within a certain neighborhood of robot $r$. One approach to attain this is to appropriately nullify the gradient of $p(\delta_{k+l,m}^r = 1 \mid \hat{x}_{k+l}^r, \hat{l}_m, z_{k+l,m}^r)$ such that it becomes zero outside a user-defined radius centered at $\hat{x}_{k+l}^r$, as in the sketch below.

VII. EXPERIMENTS

We evaluate the proposed approach for opportunistic collaborative active state estimation in a simulated environment, considering the problem of uncertainty-constrained autonomous aerial navigation in unknown and GPS-deprived environments. In the next section we describe the considered objective function $J$ from Eq. (11), and then in Section VII-B present the results.

A. Scenario and Objective Function

In this work we consider a multi-robot uncertainty-constrained autonomous navigation scenario, where each robot $r$ has to navigate to a pre-defined goal $x_g^r$ while operating in unknown environments and keeping its estimation uncertainty below a given (soft) threshold $\beta$. We use a single objective function for the entire group:

$$J\left(u_{k:k+L-1}\right) \doteq \sum_{r=1}^{R} \left[\left(1 - \alpha^r\right) \left\| \hat{x}_{k+L}^r - x_g^r \right\| + \alpha^r\, \mathrm{tr}\left(\Sigma_{k+L}^r\right)\right], \quad (25)$$

where $\Sigma_{k+L}^r$ is the covariance of robot $r$ at the last look-ahead step. For simplicity we do not include a penalty over control usage, and we assume maximum-likelihood observations, which allows omitting the expectation operator in Eq. (25). The parameter $\alpha^r$ represents an adaptive weight that trades off uncertainty reduction and goal attainment. Given an uncertainty threshold $\beta$, this parameter is calculated for each robot $r$ based on its pose covariance $\Sigma_{k+L}^r$ as (see further details in [10], [11], [12]):

$$\alpha^r \doteq \min\left(\frac{\mathrm{tr}\left(\Sigma_{k+L}^r\right)}{\beta},\ 1\right). \quad (26)$$

Since the objective function $J$ involves the belief of all $R$ robots (in terms of the first two moments $\hat{x}_{k+L}^r$ and $\Sigma_{k+L}^r$), calculating the actions $u_{k:k+L-1}^r$ for each robot $r$ via Eq. (14) takes into account their impact on robot $r$ and the other involved robots. In particular, when robot $r$ makes an observation of 3D points previously observed by other robots, the uncertainty covariance of these and possibly additional robots will be impacted. For each robot $r$, the gradient $\nabla_r J$ from Eq. (14) will therefore include contributions from all these robots, properly quantifying the effect of robot $r$'s candidate controls $u_{k:k+L-1}^r$ on the entire group, a process that implicitly also involves the controls $u_{k:k+L-1}^{r'}$ of the other robots $r' \in \{1, \ldots, R\} \setminus \{r\}$.

We note that while the goals $\{x_g^r\}_{r=1}^R$ are pre-defined, an interesting question that deserves further research is how to choose these goals online. Addressing this task-allocation problem could, for example, lead to improved estimation accuracy by enhancing collaboration between nearby robots.

B. Results

We assume each robot $r$ starts operating from a different location and needs to reach its own goal $x_g^r$, while localizing itself and mapping the environment perceived with its camera and range sensors. We use a soft uncertainty threshold of $\beta = 25$ meters for all robots. The proposed approach is demonstrated in two scenarios, as described below.

1) First Scenario: In the first scenario, shown in Figure 2a, we consider the case of two aerial robots passing by in opposite directions, each robot being guided towards its own goal. We assume the robots have accurate knowledge regarding their starting points (i.e. initial poses) and that the robots share a common reference frame. Furthermore, the robot trajectories are assumed to be sufficiently distant from each other such that the robots would not observe any common areas without properly adjusting their motion. As a result, without collaboration, each robot performs its own inference process (i.e. SLAM).

The result of our approach is shown in Figure 2: Figure 2a shows the robot trajectories and also displays uncertainty covariances as ellipsoids, while Figure 2b depicts the covariance evolution over time. Initially, each of the two robots proceeds directly to its goal, as its covariance is small compared to the uncertainty threshold $\beta$. However, despite using SLAM as the inference engine, uncertainty does develop over time, in particular due to short feature tracks, and eventually reaches $\beta$.

At this point, which in general happens at different time instances for the two robots, the parameter $\alpha^r$ becomes 1 and the goal-attainment term in Eq. (25) vanishes for robot $r$; as a result, robot $r$ attempts to reduce its uncertainty. A single-robot framework would guide each robot $r$ to re-observe informative 3D points previously mapped by robot $r$ itself [10], [11], [12]. In contrast, the proposed multi-robot centralized approach instead drives robot $r$ to observe informative areas previously seen by the other robot, $r'$, thereby implicitly forming multi-robot constraints spanning different time instances.

As a result, the uncertainty covariance is significantly reduced, as shown in Figure 2b, and the robots reach their goals with higher estimation accuracy. Figure 1 illustrates the corresponding robot trajectories and the 3D points observed by each robot. Note the 3D points mutually observed by the two robots (at different time instances) as the result of our multi-robot planning approach.

Finally, in Figure 3 we present the results of a statistical study comprising 50 runs for the two-robot case, where each run used different noise realizations (but the same goals and robot initial positions). One can observe that, overall, the robot trajectories planned by our approach are consistent and smooth, and the estimation errors and uncertainty covariances are significantly reduced.

2) Second Scenario: In the second scenario we demonstrate an advantage of the proposed multi-robot active collaborative estimation over single-robot belief space planning. The same objective function (25) and uncertainty threshold $\beta$ are considered; however, the scenario is somewhat different, as shown in Figure 4: one of the robots starts operating in an area without any 3D points, and as a result its uncertainty and estimation errors develop much faster, reaching the uncertainty threshold $\beta$ before reaching the goal.

Fig. 2: (a) Top view of robot trajectories; ellipses represent uncertainty covariances, and goals for each robot are denoted by a circle and a number. (b) Corresponding position estimation errors and uncertainty covariances (position norm error [m] vs. pose index; estimation error and square-root covariance for each of robots r1 and r2).

Fig. 3: Statistical study results. (a) Robot trajectories (North [m] vs. East [m]). (b) Corresponding position estimation errors (red) and square-root covariances for the red robot (position norm error [m] vs. pose index).

In the single-robot case, this triggers uncertainty-reduction motion planning that guides the robot to perform loop closures. However, in the considered case the goal is simply beyond the uncertainty budget [12]; reaching the goal with uncertainty below the threshold $\beta$ is infeasible. This can be seen in Figure 5, which shows the robot endlessly stuck in a cycle of going towards the goal, reaching the uncertainty threshold $\beta$, going back to reduce uncertainty via loop closures, and repeating this process forever.

In contrast, assuming the availability of another robot that happened to be operating in the same area, employing our approach for active opportunistic collaborative state estimation allows properly adjusting the robot's motion such that multi-robot constraints are created and, as a result, the uncertainty is significantly reduced (Figure 4). Interestingly, this can be considered as extending the mentioned single-robot uncertainty budget.

VIII. CONCLUSIONS

We presented an approach for active collaborative state estimation assuming the robots operate in unknown or uncertain environments. Our belief space planning framework, where the belief accounts for the uncertainty in robot poses and in the observed environment, allows the robots to calculate locally optimal trajectories such that their state estimation is improved. We showed scenarios where this corresponds to each robot being guided towards areas previously mapped by other robots, thereby facilitating an opportunistic collaborative state estimation framework.

Fig. 4: A different scenario, where the green robot starts operating in an area without distinctive landmarks and as a result develops significant errors. Our approach guides the robot to observe areas previously mapped by the red robot, thereby drastically reducing estimation errors and covariances. (a) Top view. (b) Position estimation errors and square-root uncertainty covariances (position norm error [m] vs. pose index, for robots r1 and r2).

Fig. 5: The same scenario as in Figure 4, however considering only a single robot. The goal is outside the uncertainty budget of a single robot; without assistance from another robot, as in Figure 4, the robot is unable to reach the goal within the pre-defined uncertainty bound. (a) Top view. (b) Position estimation error and square-root covariance (position norm error [m] vs. pose index).

ACKNOWLEDGMENTS

This work was partially supported by the Technion Autonomous Systems Program (TASP).

REFERENCES

[1] A. Bahr, M.R. Walter, and J.J. Leonard. Consistent cooperative localization. In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 3415–3422, May 2009.

[2] Y. Bar-Shalom and E. Tse. Dual effect, certainty equivalence, and separation in stochastic control. IEEE Transactions on Automatic Control, 19(5):494–500, 1974.

[3] W. Burgard, M. Moors, C. Stachniss, and F. Schneider. Coordinated multi-robot exploration. IEEE Trans. Robotics, 2005.

[4] A. Cunningham, K. Wurm, W. Burgard, and F. Dellaert. Fully distributed scalable smoothing and mapping with robust multi-robot data association. In IEEE Intl. Conf. on Robotics and Automation (ICRA), St. Paul, MN, 2012.

[5] J. Du, L. Carlone, M. Kaouk Ng, B. Bona, and M. Indri. A comparative study on active SLAM and autonomous exploration with particle filters. In Proc. of the IEEE/ASME Int. Conf. on Advanced Intelligent Mechatronics, pages 916–923, 2011.

[6] D. Fox, J. Ko, K. Konolige, B. Limketkai, D. Schulz, and B. Stewart. Distributed multi-robot exploration and mapping. Proceedings of the IEEE - Special Issue on Multi-robot Systems, 94(9), Jul 2006.

[7] G. Hollinger and G. Sukhatme. Stochastic motion planning for robotic information gathering. In Robotics: Science and Systems (RSS), 2013.

[8] S. Huang, N. Kwok, G. Dissanayake, Q. Ha, and G. Fang. Multi-step look-ahead trajectory planning in SLAM: Possibility and necessity. In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 1091–1096, 2005.

[9] V. Indelman. Towards cooperative multi-robot belief space planning in unknown environments. In Proc. of the Intl. Symp. of Robotics Research (ISRR), September 2015.

[10] V. Indelman, L. Carlone, and F. Dellaert. Towards planning in generalized belief space. In The 16th International Symposium on Robotics Research, Singapore, December 2013.

[11] V. Indelman, L. Carlone, and F. Dellaert. Planning under uncertainty in the continuous domain: a generalized belief space approach. In IEEE Intl. Conf. on Robotics and Automation (ICRA), 2014.

[12] V. Indelman, L. Carlone, and F. Dellaert. Planning in the continuous domain: a generalized belief space approach for autonomous navigation in unknown environments. Intl. J. of Robotics Research, 34(7):849–882, 2015.

[13] V. Indelman, P. Gurfil, E. Rivlin, and H. Rotstein. Graph-based distributed cooperative navigation for a general multi-robot measurement model. Intl. J. of Robotics Research, 31(9), August 2012.

[14] V. Indelman, E. Nelson, N. Michael, and F. Dellaert. Multi-robot pose graph localization and data association from unknown initial relative poses via expectation maximization. In IEEE Intl. Conf. on Robotics and Automation (ICRA), 2014.

[15] S. Karaman and E. Frazzoli. Sampling-based algorithms for optimal motion planning. Intl. J. of Robotics Research, 30(7):846–894, 2011.

[16] A. Kim and R.M. Eustice. Perception-driven navigation: Active visual SLAM for robotic area coverage. In IEEE Intl. Conf. on Robotics and Automation (ICRA), 2013.

[17] B. Kim, M. Kaess, L. Fletcher, J. Leonard, A. Bachrach, N. Roy, and S. Teller. Multiple relative pose graphs for robust cooperative mapping. In IEEE Intl. Conf. on Robotics and Automation (ICRA), pages 3185–3192, Anchorage, Alaska, May 2010.

[18] H. Kurniawati, D. Hsu, and W. S. Lee. SARSOP: Efficient point-based POMDP planning by approximating optimally reachable belief spaces. In Robotics: Science and Systems (RSS), 2008.

[19] S. M. LaValle and J. J. Kuffner. Randomized kinodynamic planning. Intl. J. of Robotics Research, 20(5):378–400, 2001.

[20] R. Martinez-Cantin, N. De Freitas, A. Doucet, and J.A. Castellanos. Active policy learning for robot planning and exploration under uncertainty. In Robotics: Science and Systems (RSS), 2007.

[21] T.P. Minka. Expectation-Maximization as lower bound maximization. November 1998.

[22] C. Papadimitriou and J. Tsitsiklis. The complexity of Markov decision processes. Mathematics of Operations Research, 12(3):441–450, 1987.

[23] R. Platt, R. Tedrake, L.P. Kaelbling, and T. Lozano-Pérez. Belief space planning assuming maximum likelihood observations. In Robotics: Science and Systems (RSS), pages 587–593, 2010.

[24] S. Prentice and N. Roy. The belief roadmap: Efficient planning in belief space by factoring the covariance. Intl. J. of Robotics Research, 2009.

[25] S.I. Roumeliotis and G.A. Bekey. Distributed multi-robot localization. IEEE Trans. Robot. Automat., August 2002.

[26] C. Stachniss, G. Grisetti, and W. Burgard. Information gain-based exploration using Rao-Blackwellized particle filters. In Robotics: Science and Systems (RSS), pages 65–72, 2005.

[27] C. Stachniss, D. Haehnel, and W. Burgard. Exploration with active loop-closing for FastSLAM. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), 2004.

[28] R. Valencia, J. Valls Miró, G. Dissanayake, and J. Andrade-Cetto. Active pose SLAM. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), pages 1885–1891, 2011.

[29] J. Van Den Berg, S. Patil, and R. Alterovitz. Motion planning under uncertainty using iterative local optimization in belief space. Intl. J. of Robotics Research, 31(11):1263–1278, 2012.

[30] W. Wonham. On the separation theorem of stochastic control. SIAM Journal on Control, 6(2):312–326, 1968.

[31] K.M. Wurm, C. Stachniss, and W. Burgard. Coordinated multi-robot exploration using a segmentation of the environment. In IEEE/RSJ Intl. Conf. on Intelligent Robots and Systems (IROS), Nice, France, September 2008.

