+ All Categories
Home > Documents > Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based...

Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based...

Date post: 09-Jul-2020
Category:
Upload: others
View: 0 times
Download: 0 times
Share this document with a friend
16
Eindhoven University of Technology MASTER Optimal scalable decentralized Kalman filter Oruc, S. Award date: 2008 Link to publication Disclaimer This document contains a student thesis (bachelor's or master's), as authored by a student at Eindhoven University of Technology. Student theses are made available in the TU/e repository upon obtaining the required degree. The grade received is not published on the document as presented in the repository. The required complexity or quality of research of student theses may vary by program, and the required minimum study period may vary in duration. General rights Copyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright owners and it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights. • Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain
Transcript
Page 1: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

Eindhoven University of Technology

MASTER

Optimal scalable decentralized Kalman filter

Oruc, S.

Award date:2008

Link to publication

DisclaimerThis document contains a student thesis (bachelor's or master's), as authored by a student at Eindhoven University of Technology. Studenttheses are made available in the TU/e repository upon obtaining the required degree. The grade received is not published on the documentas presented in the repository. The required complexity or quality of research of student theses may vary by program, and the requiredminimum study period may vary in duration.

General rightsCopyright and moral rights for the publications made accessible in the public portal are retained by the authors and/or other copyright ownersand it is a condition of accessing publications that users recognise and abide by the legal requirements associated with these rights.

• Users may download and print one copy of any publication from the public portal for the purpose of private study or research. • You may not further distribute the material or use it for any profit-making activity or commercial gain

Page 2: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

Eindhoven University of Technology Department of Electrical Engineering Control Systems

The Department of Electrical Engineering of the Eindhoven University of Technology accepts no responsibility for the contents of M.Sc. theses or practical training reports

Optimal Scalable Decentralized Kalman filter

by

S. Oruč

Master of Science thesis Project period: October 2008 Report Number: 08A/07 Commissioned by: Prof.dr.ir. P.P.J. van den Bosch Supervisors: Dr. M. Lazar Ir. J. Sijs Dr.ir. Tj. Tjalkens

Page 3: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 1

Optimal Scalable Decentralized Kalman FilterSertac Oruc

Abstract—The Kalman filter is a powerful state estimationalgorithm which incorporates noise models, process model andmeasurements to obtain an accurate estimate of the states ofthe process. Realization of conventional Kalman filter algorithmrequires a central processor that harvests measurements from allthe sensors on the field. Central systems have some drawbackslike reliability, central communication and high computationwhich result in a need for non-central algorithms.

The interest of this study is centered at optimality and scal-ability in Decentralized Kalman Filter (DKF). Current researchon DKF focuses on ”better” estimation of process states in termsof aspects like computation power, communication overhead,memory usage, robustness, reliability and scalability. This studytakes first optimality and then scalability in DKF as its focusand looks for an answer to problem of obtaining an optimal yetscalable DKF. An optimal DKF for a given network topology isfound and a mathematical problem formulation for an optimalscalable DKF is made.

Index Terms—Kalman filter, state estimation, consensus filter,covariance intersection, multi-sensor, decentralized/distributedKalman filter

I. INTRODUCTION

A. Background

WHEN a control engineer considers system analysisor controller design, first thing he does is forming

a deterministic model, using all scientific and engineeringknowledge he has. With this model and knowledge of controltheory the engineer can investigate the system behavior andbuild his controller. To do this, the engineer also needs to buildmeasurement devices to measure the interested variables of thesystem. Processing these measurement data for proper controlinputs forms a very important phase in control systems.

Deterministic models and control theory cannot provide asufficient tool to make analysis and design for every physicalsystem. First of all, no mathematical model can perfectlyrepresent a physical behavior. Every model reveals maincharacteristics and variables of the system and ignores the rest.Secondly dynamic systems are driven by ”indeterministic”noises and thirdly ”the measurement devices” or sensors do notprovide perfect data about the system but gives measurementswhich are corrupted by indeterministic noise.

Thus, as a proper and practical solution to this problem oneuses stochastic models and estimation theory to estimate thesystem behavior using noisy measurements and a stochasticprocess model.

MSc student S.Oruc is with the Department of Electrical Engineering, TUEindhoven, The Netherlands

e-mail: [email protected] September 2008

Fig. 1. System

In this project we deal with state estimation assuminga stochastic discrete time state-space model of the systemand model measurement noises and process noises as zeromean Gaussian functions and known error covariances. Wespecifically deal with the Kalman filter as the estimator.

The Kalman filter, which is also known as Linear QuadraticEstimator (LQE) in Control Theory, was created by RudolfEmil Kalman and published in 1960 in his famous paper ”ANew Approach to Linear Filtering and Prediction Problems”[1]. It was developed as a recursive solution to the discrete datalinear filtering problem. It uses process model, noise models,measurements and finds an accurate estimate together witherror covariance, namely uncertainty, of the estimate. It hasbeen a very active research area and has broad applicationsin automated devices, navigation systems, weather forecast,econometrics, tracking systems and many other diverse fields.

Since it was created, many variants of the Kalman filterhave been introduced, i.e. Information filter [2], The ExtendedKalman filter [3], Fast Kalman filter [4], Unscented Kalmanfilter [5], Schmidt’s extended filter and so on. Each of thesemethods attempts to solve another practical problem in ap-plications of Kalman filter. Information filter brought morepractical ease whereas Extended Kalman filter and UnscentedKalman filter provided applicability for non linear systems.In this paper we specifically consider Decentralized Kalmanfilter (DKF) and give a unique perspective to the problem offinding an optimal, scalable Kalman filter.

B. Problem Description

The system we consider in this study, which is shown infigure 1, is a network of nodes with their own sensors, henceown measurements. The states of a global process modeledby the equation x[k] = Ax[k−1]+w[k−1] in discrete time isdesired to be estimated at each node. This model is available toall the nodes. w is modeled as a zero mean Gaussian noise withan error covariance matrix of Q = E[wwT ]. In KF, modelingof process and measurement noises as zero mean Gaussianfunctions is an essential requirement for the formulation of

Page 4: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 2

KF. The noise is always modeled as zero mean Gaussian. Ifthe real noise is also zero mean Gaussian as modeled then KFestimate converges to zero.

All the nodes in the network measure some states of theglobal process. Measurements are modeled by measurementequations yi[k] = Cix[k]+ vi[k]. vi is modeled as a zero meanGaussian noise with a covariance matrix of Ri = E[vivT

i ].So each node is characterized by respective Ci matrix andmeasurement covariance matrix Ri.

Each node estimates the state vector of the same globalprocess. The sensor measurements are assumed not to becorrelated.

In a decentralized network structure like in figure 1 eachnode makes its own estimation and measurement. It is alsopossible to transmit the estimate and/or measurement to theneighbors. Throughout this paper the concept neighbor is usedfor communication neighbor which is the node directly con-nected to the operating node. The nodes are allowed to transmitdata to only their direct neighbors for better estimation.

”Decentralized Kalman Filter (DKF)” algorithm transmitsboth state estimates and measurements to the neighbors. Notethat the measurements are fresh data which are not correlatedwith the current estimates or other measurements made atneighbors. The estimates however are found by incorporatingthe information from history, therefore they are correlated withthe estimates from the other nodes. Also another algorithmwhich we call ”Local Kalman Filter (LKF)” is described inthis paper for comparison purposes, which differs basicallyfrom DKF such that LKF does not transmit state estimateswhereas DKF does.

In KF, trace of state error covariance matrix which is definedas P[k] = E[(x[k]− x[k])(x− x[k])T ] is minimized where x isthe estimate found by KF and x is the predicted real stateaccording to the model in equation x[k] = Ax[k−1]+w[k−1].If the model is accurately made this error is also equal tothe error between the real state and estimate. The trace ofP[k] gives the sum of squared error between predicted realstate and estimate which is minimized. Calculation of this P[k]matrix is crucial for the solution of optimal Kalman filter. Incentral KF P[k] is a function of A,Q,C,R and can be calculatedby central processor. However in DKF calculation of Pii[k] =E[(x[k]− xi[k])(x[k]− xi[k])T ] is not straight forward since xi[k]involves x j[k−1] j∈Ni terms because of state share betweenthe nodes. The incorporation of neighbor states brings the needfor calculation of cross-covariance terms which are defined asPi j = E[(x− xi)(x− x j)T ] into the picture. Calculation of thesePi j terms is the key challenge in finding an optimal scalableDKF algorithm.

In this study two results are achieved. Firstly an optimal de-centralized Kalman filter depending on the network topologyis obtained. This algorithm is built and discussed in sectionV. Secondly a problem formulation for a both optimal andscalable decentralized Kalman filter algorithm is proposed insection VI. Ultimately this research aims to find out an optimalscalable DKF algorithm with the following constraints:

• Every node of the network estimates the global state de-pending on the previous state estimates and measurementsof themselves and neighbors. (Decentralization)

• The algorithm must solely utilize peer-to-peer communi-cations, and should not involve global communications.(Decentralization)

• The algorithm must be scalable, which means adding orremoving a node must not change the algorithm on all ofthe nodes, although it affect a limited number of nodes.(Scalability)

• The algorithm must be optimal such that the error be-tween the estimated state and the true state must beminimized for an arbitrary network communication. (Op-timality)

C. Approach

Many algorithms that have been suggested for a decentral-ized Kalman algorithm are based on the constraint of sendingonly measurements. In this situation an optimal algorithm isalready known. In this paper we call this optimal decentralizedalgorithm ”Local Kalman Filter”, in which only measurementsare communicated, to distinguish it from our decentralizedKalman filter (DKF). In this research we challenge the prob-lem of finding an optimal and scalable DKF in which bothstate estimates and the measurements are transmitted. Forthis purpose first we consider optimality to find an optimalDKF and then we formulate the problem for both optimal andscalable algorithm.

Our approach is a simple yet creative one. In the firststep to have an optimal algorithm we specifically take cross-covariances into account by considering the network as awhole. We consider all of the cross covariances between everytwo nodes in the network. For this purpose we introduce a newrepresentation, called Global System Representation, for whichwe rebuild the Kalman filter equations. This representationmade it possible to calculate all the cross covariances in thenetwork and gave us a cost function for optimality in whichthe error between the individual estimates and predicted realstate is minimized. Solving this formulation, the optimal stateestimate at each individual node is obtained in terms of themeasurements and previous estimates from only the neighborsand the node itself and also the network topology. Since thenetwork topology must be made available to the nodes, thisalgorithm remains unscalable.

In the second step in section VI, we challenge the scal-ability constraint. We define scalability as ”Each node isallowed only to know the network connection of the firstorder neighborhood”. Using this statement a mathematicalformulation of an Optimal Scalable Decentralized KalmanFilter is derived. The idea in this approach is using only localnetwork topology instead of the whole network topology inthe individual algorithms on each node. The rest of the termsin the algorithm regarding the network connections other thanthe local network topology is modeled as uncertainty.

The solution of this scalable formulation depends on the

Page 5: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 3

modeling of this uncertainty mentioned above, about whichwe provide some ideas and propose this problem as a futurework.

II. NOMENCLATURE

x[k] The predicted real state according to the model atkth iteration.∗

yi[k] The measurement at ith node at kth iteration.xi The estimate of state x at ith nodexi− The predicted estimate (or prediction) of state x at

ith nodePi j Error covariance matrix between state estimates at ith

and jth nodeMi j Kalman gain for state from jth node on ith nodeKi j Kalman gain for measurement from jth node on ith

nodeN The set of all nodes in the networkNi Neighborhood of ith nodeN−

i Neighborhood of ith node without ith node (Ni−i)vi Measurement noise of the ith sensorw Process noiseRi Measurement noise covariance of the ith sensorQ Process noise covariance matrixCi Measurement matrix in the measurement model,

relating the state vector to the measurement (yi)of ith node

*For clarity time indices are dropped when there is no risk ofconfusion

III. KEY CONCEPTS

Throughout the paper some concepts and terms are used.For clarity these concepts and terms are explicitly stated toavoid confusion.

SensorDevices that measure a physical quantity and convert it into ananalog signal. In this paper by sensor measurement we meanlow-level interpreted digital data rather than raw analog data.

NodeUnit which has a sensor to make measurement and a processorto run an algorithm. Nodes also can communicate measure-ments and states. LKF and DKF algorithms are run on thesenodes. In the case central KF nodes send their measurements toa central processor and the central processor does the estimate.

NeighborThe node that is communication-wise adjacent to the operatingnode.

NeighborhoodThe set of nodes that is adjacent to the operating node and thenode itself.

2nd order neighborhoodThe set of nodes that are neighbors to the neighbor nodes ofthe operating nodes.

OptimalityIn this paper optimality refers to the minimization of the sumof squared errors between state estimate, xi and the predictedreal state, x according to the model. Throughout this paper theadjective ”optimal” is used in case this error is minimized forthe given constraints, such as decentralization, transmission ofstates, scalability.

ScalabilityPossibility to add/remove nodes such that the algorithms ononly a finite number of nodes change.

RobustnessThe performance of the system in case a node fails.

Globally Connected NetworkThe network structure in which every node has a directcommunication to all other nodes.

Data IncestThe situation in which a data is used more than once inestimation. In case states are transmitted, the same state canbe transmitted to a node from different nodes.

Covariance MatrixA matrix of covariances between elements of the same vector

Cross Covariance MatrixA matrix of covariances between elements of two differentvectors

Decentralized SystemA system which does not involve central activities like com-munication or processing but does these activities locally, inlocal processors.

Wireless Sensor Network (WSN)A network consisting of spatially distributed autonomousdevices(nodes) using sensors to cooperatively monitor physi-cal or environmental conditions such as temperature, sound,vibration, pressure and motion

ResidueThe discrepancy between estimate of the model on the nodeand the measurement taken in Kalman filtering

IV. KALMAN FILTER

Kalman filter [1] is an optimal recursive data process-ing algorithm. The term ”filter” refers to the fact that itis a data processing algorithm on a processor which usesnoisy measurements to find less noisy estimates. It givesthe ”optimal” estimate of the states of the process in leastsquares sense, combining model predictions and measurements[6]. KF processes all the measurements, uses the model ofthe system, the measurements and process noise to do theestimation without ignoring any of these information whichmakes Kalman Filter such a powerful estimator [6].

Conventional Kalman filtering is done in a centralized way,i.e. all measurements are gathered in a central processor andthe algorithm is run only in this central processor. For this

Page 6: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 4

reason we call this conventional Kalman filter central Kalmanfilter(CKF). This filter is proved to be optimal in least squaressense. On the other hand, LKF and DKF are done in adistributed fashion, i.e. measurements are processed locally inlocal processors(nodes) and each node runs its own algorithm.After the creation of KF till DKF there has been some researchsteps taken. In this chapter a related overview of evolution ofKalman filter is provided. First CKF is created and then theinformation filter(IF) has been found. IF made decentralizationand LKF, in which only measurements are sent, possible. LKFhas been followed by DKF which transmits state estimatesto the neighbors and processes past state estimations of theneighbors for better estimation.

A. Central Kalman Filter (CKF)

Central Kalman filter is the conventional Kalman filterfound by R. E. Kalman. The sensors provide measurementto a central processor and the algorithm is run in this centralprocessor.

The observed process is described in state space as:

x[k] = Ax[k−1]+w[k−1] (1)y[k] = Cx[k]+ v[k] (2)

The noise signals v and w are defined as measurement andprocess noises respectively which are modeled in CKF byzero mean Gaussians with covariance matrices R and Qrespectively. If we consider the system in figure 1 as anexample, all the measurements are sent to a central processorand the parameters are defined as follows;

y =

y1y2...

yn

C =

C1C2...

Cn

v =

v1v2...

vn

R =

R1 0 . . . 00 R2 . . . 0

0 0. . .

...0 0 . . . Rn

(3)

The Kalman filter consists of two stages at each time step. Atkth time step KF equations are written as follows; [7]:Time Update Equations:

x−[k] = Ax[k−1] (4)

P−[k] = AP[k−1]AT +Q (5)

Measurement Update Equations:

K[k] = P−[k]CT (CP−[k]CT +R)−1 (6)x[k] = x−[k]+K[k](y[k]−Cx−[k]) (7)P[k] = (I−K[k]C)P−[k] (8)

The time update stage propagates the past estimate in time tohave a prediction state x− at kth time step whereas measure-ment update stage updates the prediction with measurementsby weighing measurements and prediction by a gain calledKalman gain, K which is calculated for minimum least squareserror. The derivation of KF can be found in appendix A.

B. Information Filter (IF)

The information filter is a form of Kalman Filter that canbe derived by rewriting the central Kalman filter equations.It has an important role in decentralization of the algorithmwhich makes it important to understand. The information filteris derived with following two definitions:

Information Matrix: J[k] = P−1[k] (9)

Information State Vector: j[k] = P−1[k]x[k] (10)

IF estimates j and J instead of x and P. Therefore theirrespective stages are found as follows;

Time Update Equations:

J[k]− = AJ−1[k−1]AT +Q−1

(11)j[k]− = L[k] j[k−1] (12)L[k] := J[k]−AJ[k−1] (13)

Measurement Update Equations:

J[k] = J[k]−+CT R−1C (14)

j[k] = j[k]−+CT R−1y[k] (15)

IF is beneficial compared to KF in terms of computation whenthe number of measurements are larger than the number ofstates since then it is less expensive to invert J in equation 11and block diagonal matrix R in equations (14),(15), instead ofinverting (CP−[k]CT +R) in equation (6).

However more important added-value of IF is that it givesway to LKF and decentralization as explained in the nextsubsection.

C. Local Kalman Filter (LKF)

If we examine the IF equations we see that it is possibleto write CT R−1C and CT R−1y terms in finite sums of Ci,Ri,yiterms as;

CT R−1C = ∑i

CTi R−1

i Ci (16)

CT R−1y = ∑i

CTi R−1

i yi (17)

using (3). Then the IF equations can be written as;

Time Update Equations:

J[k]− = AJ−1[k−1]AT +Q−1

(18)j[k]− = L[k] j[k−1] (19)L[k] := J[k]−AJ[k−1] (20)

Measurement Update Equations:

J[k] = J[k]−+ ∑j∈Ni

CTj R−1

j C j (21)

j[k] = j[k]−+ ∑j∈Ni

CTj R−1

j y j (22)

Page 7: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 5

where Ni is the neighborhood of the ith node. Then in a givennetwork, like the network in figure 1, if only the measurementsare sent, the above equations derived from IF can be used suchthat a node harvests CT

j R−1j y j and CT

j R−1j C j terms from the

neighbors and processes these terms in its local algorithm. Wecall this decentralized algorithm Local Kalman Filter (LKF),in which only measurements are sent to the neighbors. Thisalgorithm is proven to give locally optimal estimate for ith

node [2]. We call this filter ”local” since it incorporates onlythe knowledge from its neighborhood. The measurement madeat a second order neighbor can never have an effect on theoperating node. However if the nodes send also the statesas well as the measurements then inherently a measurementmade at a second order neighbor reaches the operating nodein two time steps. Importance of sending the states as well asmeasurements is stressed in the next section.

D. Decentralized Kalman Filter (DKF)

After the creation of the Kalman filter for one measurement,it is realized that instead of using one sensor and one mea-surement of a plant it is more advantageous to use multiplesensors because of the issues like robustness, flexibility and thefact that a broader observation of the states can be measuredwith different types of sensors. The more sensors we use,the less the system performance is affected in case of failureof a sensor. The erroneous measurement of a single sensoraffects the overall performance less. All in all it is wiser touse multi-sensor systems and ”fuse” the information obtainedfrom different sensors to obtain an optimal estimation. Thisidea results in multi-sensor systems with a central processorin which a Kalman filter is run and fuses information fromdifferent sensors. [8]

Later on, the idea of using more sensors has been extendedto using also more processors since central processing involvessome drawbacks. First of all, central systems are not reliablesince the survival of the central processor is very critical.Secondly, for systems with large number of sensors, commu-nication with the central processor becomes difficult. Anotherreason, for spatially distributed systems, is that communicationwith central processor can be very difficult. As a result decen-tralized structures have been demanded for Kalman filtering[8], [9].

Decentralized architectures have become highly interestingsince they promise flexible systems in which nodes can beadded or removed and programmed independent of each other.The system becomes more robust since it does not depend onone master processor. No node or processor is very critical asall the nodes work in coherence without any hierarchy. Alsothe processor computation each processor must make can bereduced for advanced algorithms, which is not the focus inthis research.

In the previous subsection we already reached a decentral-ized algorithm which we call LKF. Different than the LKF,the DKF incorporates also the states of the neighbors intothe estimate. Incorporation of states brings the opportunity to

propagate a measurement throughout the network inherentlyin the state estimate. A measurement is made at node i at kth

time step and transmitted to node j. By transmitting the stateto node j, the measurement has an effect on the estimationat node j. Then at k + 1th time step, node j transmits itsestimation to node k which is a second order neighbor of nodei but after one time step, node k also inherently incorporatesthe same measurement in its estimation.

Transmission of states is crucial in decentralized structures.However note that for an arbitrary network incorporation ofneighbor states in DKF is not trivial. Data incest is likely tooccur unless the network connection is taken into account.In the next section we provide an optimal solution for DKFwhich incorporates states considering correlations between thestates and avoids also data incest.

V. OPTIMAL DKF

In this research our ultimate goal is to tackle the problemof having an optimal yet scalable DKF. The aimed algorithmhas to fulfill the constraints stated in the introduction section.Optimality and scalability are the key constraints.

In this study trace of auto-covariance matrix, which is thesum of squared errors between estimated states and the pre-dicted real state according to the model, is our cost function.Minimization of this cost function with respect to Kalmangains gives the Kalman gains for minimized cost function,hence the optimal algorithm.

Incorporation of the scalability constraint into the problemformulation is an issue that we challenge in this study. Intu-itively for the optimal DKF, the local algorithms must dependon the whole network connection, because of the state estimatecommunication which causes data incest. Formulating scala-bility and incorporating it into the mathematical formulationof the problem can solve this issue.

In this paper, we first ignore scalability, formulate and solveoptimal DKF problem given the network connection. Lateron, we define scalability mathematically such that it can beincorporated in the mathematical formulation to obtain anOptimal Scalable DKF.

A. Global System Representation

For optimality we introduce a new representation in whichwe consider the network as a whole. In this representationat each node we consider the effect of measurements andstate estimates of the neighborhood on estimation. The crosscorrelations between the estimates of each node are taken intoaccount as well as the auto correlations. Different than theprevious algorithms in which only auto covariance matriceslike P1,P2, ... are used, in this research P11,P22, ... are usedwhich are equal respectively. The cross covariance termsP12,P13, ... are defined as:

Pi j = E[(x− xi)(x− x j)T ] (23)

Page 8: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 6

We have the following process model at node i:

x[k] = Ax[k−1]+w[k−1] (24)yi[k] = Cix[k]+ vi[k] (25)

Here we take w[k] as process noise and vi[k] as measurementnoise which are expressed as Gaussian noises with zero meanand Q,Ri covariance matrices respectively. If we assume nocommunication between the nodes, the state estimate at nodei is calculated as follows:

predict:

x−i [k] = Axi[k−1] (26)

P−i [k] = APi[k−1]AT +Q (27)

update:

xi[k] = x−i [k]+Ki[k](yi−Cix−i [k]) (28)

Pi[k] = (I−Ki[k]Ci)P−i [k](I−Ki[k]Ci)T +Ki[k]RiKi[k]T (29)

Ki[k] = P−i [k]CTi (CiP−i [k]CT

i +Ri)−1 (30)

for optimal Ki[k]:

Pi[k] = (I−Ki[k]Ci)P−i [k] (31)

This brings us to the conventional Kalman filter equationswithout communication. Tr(Pi[k]) is taken as the cost function.In case we have communication the best estimate calculatedat a node does not depend on only the measurements andestimations made at that node but also the measurements andestimates of the neighbor nodes according to the communica-tion topology. Before going on with the problem formulationfor DKF in which both states and measurements are com-municated with the neighbors, Global System Representation(GSR), in which we take the network communication topologyexplicitly, is constructed with the following definitions:

X =

x1x2x3...

Y =

y1y2y3...

∆ =

A 0 0 . . .0 A 0 . . .0 0 A . . ....

......

. . .

Γ =

C1C2C3...

V =

v1v2v3...

Φ =

Q Q Q . . .Q Q Q . . .Q Q Q . . ....

......

. . .

Ω =

R1 0 0 . . .0 R2 0 . . .0 0 R3 . . ....

......

. . .

Π =

P11 P12 P13 . . .P21 P22 P23 . . .P31 P32 P33 . . .

......

.... . .

κ =

K11 K12 K13 . . .K21 K22 K23 . . .K31 K32 K33 . . .

......

.... . .

µ =

M11 M12 M13 . . .M21 M22 M23 . . .M31 M32 M33 . . .

......

.... . .

κi =[Ki1 Ki2 Ki3 · · · Ki j · · ·] where j ∈ Ni

µi =[Mi1 Mi2 · · · Mi j · · ·] where j ∈ Ni

µ−i =[Mi1 Mi2 · · · Mi j · · ·] where j ∈ N−

i

Γi =

C1C2...

C j...

where j ∈ Ni

So our process in equations (24),(25) can be expressed asfollows in Global System Representation:

x[k] = Ax[k−1]+w[k−1]Y [k] = ΓX [k]+V [k]

GSR results is a new problem formulation by consideringnetwork topology and all the estimations in all nodes that isexplained in the next section.

B. Problem Formulation

The problem formulation is based on the idea of correctingthe predicted state using the state estimates and measurementsfrom the neighbors and the measurement of operating node,just as in the conventional Kalman filter formulation. Forcorrection the residue terms between each measurement orstate estimate and the predicted state is weighed. For weighingeach measurement or state estimate, a Kalman gain is usedwhich is either Measurement Kalman gain Ki j or State Kalmangain Mi j.

Mathematically speaking the following formulation is foundfor ith node:

predict:

x−i [k] = Axi[k−1] (32)

P−i j [k] = APi j[k−1]AT +Q (33)

update:

xi[k] = x−i [k]+ ∑j∈Ni

Ki j[k](y j[k]−C j x−i [k])

+ ∑j∈N−i

Mi j[k](x−j [k]− x−i [k]) (34)

= ∑j∈Ni

Mi j[k]x−j [k]+ ∑j∈Ni

Ki j[k]y j[k] (35)

where Mii := I− ∑j∈N−i

Mi j− ∑j∈Ni

Ki jC j (36)

(37)

Here the predicted state x−i is corrected with measurementand state residue terms y j −C j x−i and x−j [k]− x−i [k]. Theseresidue terms are weighed with corresponding Kalman gainsKi j and Mi j. These Kalman gains are found for minimized cost

Page 9: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 7

function. The cost function is designated as sum of traces ofall auto-covariance matrices Pii as explained in the followingparagraphs.

For the calculation of Kalman gains we need error covari-ance Pii. For this purpose we use the estimation error:

since yi = Cix+ vi

ei = x− xi = ∑j∈Ni

Mi je−j − ∑j∈Ni

Ki jv j (38)

In matrix form:

ei = µi

e−1e−2e−3...

−κi

v1v2v3...

(39)

since E[e jvTk ] = 0 j,k ∈ Ni

Pii = E[eieTi ] = µi

P−11 P−12 · · ·P−21 P−22 · · ·

......

. . .

µT

i +κi

R1 0 · · ·0 R2 · · ·...

.... . .

κT

i

(40)

Thus it is shown that for the calculation of Pii, calculation ofall the cross covariance matrices Pi j is needed. Further moreauto- and cross-covariance matrices of the neighbors are alsoneeded. The real challenge in this research is finding thesecross covariances in a decentralized fashion.

Thus using Global System Representation Kalman filterformulation can be made globally as follows;

predict:

X−[k] = ∆X [k−1] (41)

Π−[k] = ∆Π[k−1]∆T +Φ (42)

update:

X [k] = µ [k]X−[k]+κ[k]Y [k] (43)

(Er[k] = X [k]− X [k] = µ [k]Err−[k]−κ [k]V [k]) (44)

Π[k] = E(Er[k]Er[k]T ) = µ [k]Π[k]−µ[k]T +κ [k]Ωκ[k]T (45)

Trace of Π[k] is the cost function in our optimizationproblem. Both state estimate and measurement sharing is donebetween neighboring nodes and the network connections areexplicitly taken into account in the cost function which will beshown in the next two sections. It will also be shown that thisformulation gives the same optimal estimate as conventionalKalman filter in case of global communication.

The communication topology is modeled such that Ki j andMi j values must be zero in case there is no communicationbetween ith and jth node. This means that the network com-munication must be known.

Note that in prediction step it is possible to decouple xiterms in X and P−i j terms in Π−[k] as in equations (32) and(33).

It is also worth to note that Mi j where i 6= j and Ki j valuesare independent variables where Mii values depend on thesevariables.

After this step it’s crucial to find κ and µ which minimizesthe trace of equation (45). It is also important to note that

Tr(Π[k]) = Tr(P11)+Tr(P22)+ . . . (46)

1) Solution for Global Communication: This section showsa solution in case of global communication. It is shown thatthis solution gives the same result as central Kalman filter.

This means the estimated states at each node are identicalsince all of the nodes are fed wit the same measurements fromthe field if the initial values are given the same. In this case thestate transition is needless so we remove Mi jterms where i 6= jsince xi− x j = 0. With this reasoning, analysis for measure-ment sharing can be made accordingly. Throughout this paper,the matrix algebra derivations can be made with the help of”Derivatives of traces” list given in Appendix C [10].

Tr(Π[k]) =N

∑i

Tr(Pii) where (47)

Tr(Pii) = Tr(MiiP−ii MTii )+Tr(

N

∑j

Ki jR jKTi j) from (45)

Mii = I−N

∑j=1

Ki jC j = I−κiΓ

then Tr(Pii) = Tr((I−κiΓ)Pii(I−κiΓ)T )+Tr(κiΩκTi )

(48)∂

∂κiTr(Π[k]) =

∂∂κi

Tr(Pii) since only Pii depends on κi

(49)

=−2PiiΓT +2κiΓPiiΓT +2κiΩ = 0 (50)

κi = PiiΓT (ΓPiiΓT +Ω)−1 (51)

But this is nothing but the same Kalman gain obtained usingconventional Kalman filter as in equation (6). Thus GSRapproach gives the same result as central Kalman filter in caseof global communication which is expected. In the next sectionsolution for ”Local Communication” in an arbitrary networkis given.

2) Solution for Local Communication: In equation (40) itis shown that Pii depends on only κi and µi. So also using theequation (46) we can make the decentralization:

∂∂κi

Tr(Π[k]) =∂

∂κiTr(Pii) (52)

∂∂ µ−i

Tr(Π[k]) =∂

∂ µ−iTr(Pii) (53)

where

Π = µiΠ−µTi +κiΩκT

i (54)

Pii = ∑j∈Ni

∑l∈Ni

MilPl jMTi j + ∑

j∈Ni

Ki jR jKTi j (55)

Page 10: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 8

Then for the gains Ki j and Mi j at ith node the followingequations are derived;

∂∂Ki j

Tr(Pii) =−2 ∑l∈Ni

MilP−li CTj +2Ki jR j = 0 where j ∈ Ni

(56)∂

∂Mi jTr(Pii) =−2 ∑

l∈Ni

Mil(P−li −P−l j ) = 0 where j ∈ N−i

(57)

rewriting equations (56), (57) and using equation (36):

P−ii CTj = ∑

l∈N−i

Mil(P−ii −P−li )CTj

+ ∑l∈Ni

KilClP−ii CTj +Ki jR j where j ∈ Ni (58)

(P−ii −P−i j ) = ∑l∈N−i

Mil(P−ii −P−i j )+ ∑l∈Ni

KilCl(P−ii −P−i j )

− ∑l∈N−i

Mil(P−li −P−l j ) where j ∈ N−i (59)

So this set of equations gives 2n− 1 matrix equations fromwhich the Kalman gains are calculated, where n is the numberof nodes in Ni. These equations can be written in matrix formas;

[κi µi

][Ai BiBT

i Ci

]=

[PiiΓT

i Pii]

(60)

(61)

where

Ai = ΓiP−ii ΓTi +Ωi (62)

Bi = ΓiPii (63)

Ci =

PiiPii...

P1iP2i...

omitting Pii−Pii term (64)

P ji =[Pji Pji · · ·Pji · · ·

]− [Pj1 Pj2 · · ·Pjk · · ·

](65)

where j ∈ Ni and k ∈ N−i

(66)

Hence the optimal DKF at ith node is obtained with thefollowing equations:

predict:x−i [k] = Axi[k−1] (67)

P−i j [k] = APi j[k−1]AT +Q (68)

update:

xi[k] = x−i [k]+ ∑j∈Ni

Ki j[k](y j[k]−C j x−i [k])

+ ∑j∈Ni

Mi j[k](x−j [k]− x−i [k]) (69)

[κi µi

]=

[PiiΓT

i Bi][

Ai BiBT

i Ci

]−1

(70)

Π[k] = µ[k]Π[k]−µ[k]T +κ[k]Ωκ [k]T (71)

Covariance matrices are required to be found for calculationsof Ai, Bi and Ci. The governing equation for covariance

Fig. 2. Comparison of Algorithms

matrices is equation (71). It involves all of the cross covariancematrices as well as all the Kalman gains in the network.Calculation of covariance matrices and the Kalman gains isindependent of the measurements. So these values can befound by each node using the equations (70) and (71). Soeach node can calculate Π off-line although this calculationmay cost too much and become impractical in case of a largenumber of nodes is used. Moreover, although this algorithm isoptimal, it is not scalable. Note that each node has to know theglobal network connection to calculate Kalman gains regardingthe connections.

The scalability issue is taken into consideration in thenext section. Given the scalability constraint the problemformulation for a scalable and optimal algorithm is stated.

Another important characteristic of this method is the factthat as the states converge, cross covariances approach to eachother. As a result Bi and Ci becomes singular and the inversionin equation (70) cannot be made. This is an inherent point thatstems from the formulation of Π. As the states converge, Πbecomes singular. Occurrence of singularity depends on theprecision of the processor (because of rounding off to zero incalculation of rank of Π). The singularity can be expected byexamining the equation (34) in problem formulation as well.Since the terms x j− xi goes to zero, Mi j can go to any value.

Figure 2 shows a table in which some key differences likenetwork structure, required and communicated informationbetween the algorithms CKF, LKF and DKF are listed. Inthis table N is the set of all nodes in the network and Ni isthe neighborhood of ith node.

The Heat Flow Bar experiment illustrates the comparisonbetween Optimal DKF, Local Kalman Filtering(LKF) andCentral Kalman Filter as well as the singularity issue on anexample.

Example: Heat Flow Bar

To illustrate the performance of Optimal DKF algorithmprovided in this section, we simulate an experiment in whicha bar is connected to two temperature reservoirs on bothends and heated from the middle. Our purpose is comparingDKF that we suggest in the previous section to CKF andLKF. Central Kalman filtering is done by a central processorwhich accesses all four measurements directly and processesthem to make an estimate. LKF and DKF are run in thelocal processors(nodes). For LKF nodes transmit only their

Page 11: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 9

Fig. 3. Heat Flow Bar

0 5 10290

300

310

320

3301 sec

0 5 10290

300

310

320

3305000 sec

0 5 10290

300

310

320

33010000 sec

0 5 10290

300

310

320

33015000 sec

Real TemperatureMeasurementscentral kfLKF(Node 1)DKF(Node 1)

Fig. 4. Temperature Profile of The Bar Over Time

measurements to their neighbors whereas in DKF both statesand measurements are transmitted and used as described inthe previous section.

The temperature profile of the bar is needed to be observed.The bar is modeled with 11 states each of which represents thetemperature of one segment of the bar as shown in figure 3.Reservoirs provide a boundary condition for the experimentand keep the temperature of the end points at 300K. The

0 5000 10000 150000

5

10

15

20RMS error per node in LKF

RM

S e

rror

(K

elvi

n)

time (sec)

central kfLKF(Node 1)LKF(Node 2)

Fig. 5. Local Kalman Filter

0 0.5 1 1.5 2 2.5

x 105

0

10

20

30

40

50RMS error per node in LKF

RM

S e

rror

(K

elvi

n)

time (sec)

central kfLKF(Node 1)LKF(Node 2)

Fig. 6. Local Kalman Filter Convergence

0 5000 10000 150000

5

10

15

20RMS error per node in DKF

RM

S e

rror

(K

elvi

n)

time (sec)

central kfDKF(Node 1)DKF(Node 2)

Fig. 7. Optimal Decentralized Kalman Filter

bar is heated from the 6th segment and temperature of only3rd ,5th,7th and 9th segments are measured. Each measurementis done by a node and the nodes can communicate with theirneighbors to collaborate data. Initially the bar is kept at 300Kwhich changes in time with the heat from the 6th segment. Thetemperature profile of the bar over time is shown in figure 4with solid line.

The algorithms use the same state space model of the systemas in figure 1, which is x[k] = Ax[k−1]+w[k−1] and y[k] =Cix[k]+ vi[k]. Here x denotes a vector consisting of 11 stateseach of which represents the temperature of one segment andw represents the heating/cooling noises on the states. Also Cidemotes measurement matrix and vi measurement noise. Forexample C1 = [00100000000]. Nodes do not know about theconstant heat given to the bar. So in model we use a very highQ so unreliable model and a low R, reliable measurement.

It is crucial to note that the state space model that the

Page 12: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 10

algorithms use does not know about the temperature reservoirsand the heat given to the bar. Because of that the modelassumes a Gaussian process noise since this is the best it cando as it never knows about the amount, power and positionof the heating or cooling being made. As mentioned before infirst section, in KF process noise has to be modeled as zeromean Gaussian. This is also in accordance with the fact thatthe nodes do not know about the properties of the heat. Thisbad modeling leads to a bias in the estimate as seen in figure6 since KF minimizes the sum of squared errors between theestimate and the mean value of ”predicted real state” by themodel. Since our KF models heat as zero mean Gaussian, thispredicted real state differs from the real state considerably.The difference between ”predicted real state” and real statecreates a bias in the error in figure 6.

Although this is the case we still use this set up with aconstant heat in the middle of the bar instead of setting up ascenario with a zero mean Gaussian heating/cooling processwhich could have been accurately described by the model.The reason for using this set up is that since the uncertaintyin the process is very high and model is very uncertain, themeasurements are very valuable for estimation and it is highlyneeded to reach far away nodes. In our system although wehave 11 segments(states), we measure 4 of them, the rest ispredicted by the model. If it would be possible to measureall of the segments then we could have been satisfied withonly the measurements and do not even run any Kalman filter.However note that there are many unobserved segments.

As the process model is very inaccurate, we choose a veryhigh Q, process noise covariance, whereas R, measurementnoise covariance, is quite small. This makes measurementsmuch more reliable than state predictions by the model in KF.

Since DKF incorporates states of the neighbors and statescarry inherently measurements from all of the network, DKFproves to be very useful in this situation. On the other hand inLKF only measurements are sent which makes it impossiblefor example for first node to estimate accurately 7th segmentbecause of the heat input at 6th segment.

Figure 5 shows performance of local Kalman filter. Rootmean squared error between the real states and estimatedstates at 1st node and 2nd node are drawn versus time. Sincethe nodes are symmetric, the error of the 3rd and 4th nodesfollow more or less the same trajectories as 2nd and 1st nodesrespectively. For this reason these two nodes are not drawn forclarity. Also central KF is drawn for comparison. The steadystate estimates are shown in figure 6. The 1st node have accessto measurement from only 2nd node and itself. Because ofthis reason its estimation is worse than 2nd node. On the otherhand 2nd node has access to the measurements from 1st and3rd nodes, hence it achieves a better estimate by incorporatingmeasurements from also the right side of the bar.

For the experiment which is shown in figure 7 a hybrid algo-rithm is used. In this experiment both states and measurementsare sent and Optimal Decentralized Kalman filter algorithmtogether with LKF is applied. When Π matrix in section V-A

is full rank, Optimal DKF is used. In case Π matrix is singularLKF is applied. When LKF algorithm runs covariances alsochange and at some time points Π matrix again becomes fullrank and DKF is used which causes the moments where wehave ”saw tooth” graph in the figure. When DKF works, evenone time step proves to be sufficient to decrease the errorsignificantly.

In figure 6 the error trajectories first tend to decrease butthen begins to increase and settles at a high value. Thisbehavior is related to the initial value attained in the algorithm.We chose initial temperature as 320K for each state. Asmeasurements are more dominant for the measured states,for the first iterations of the algorithm the measured statesquickly converges to the real value which makes the root meansquared error decrease quickly. After the quick convergenceof the measured states, the other states are predicted bythe model, using process noise model provided which is aninaccurate model. This causes the increase in the mean error.For example, the error between the steady state real state andthe converged estimate of first node in LKF is found as x−xlk f 1 =

[0.0,0.0,0.2,0.1,0.3,0.0,14.9,29.8,44.6,59.5,74.3

].

Since the first node accesses only the measurement from thesecond node it can estimate the left side very well, howeverfor the right side it extrapolates the result for left side since ituses only the model for the estimation of the right side. Thisexample shows the reason of the bias in error.

This experiment is a very clear example to show the im-portance of sending states in case the nodes measure differentstates of the plant and the process noise model is inaccurate.In our example the temperature profile differs significantly onleft and right sides of the heating point and the model doesnot have direct information about the heating/cooling poweror position. When only measurements are transmitted (LKF)first node cannot access to any information from the right sidesince it receives only the measurement of the second node.However for DKF case, first node accesses the measurementsfrom also the right side inherently in the state received fromthe second node. Sending states provide the possibility ofdiffusing information from one node to the rest of the network.This makes it possible to converge to a better steady state errorand DKF algorithm that is provided in the previous sectionprovides the optimal solution for the given network topologyand process model.

VI. OPTIMAL SCALABLE DKF

An important result of the previous section is the fact thatit is not possible to find a scalable solution to Optimal DKFformulation made in section V-B2. The scalability issue in theOptimal DKF stems from the calculation of Π in equation(71). The idea of transmission of the covariance matrices aswell as transmission of the estimates and measurements hasbeen considered to see if it can solve this issue. It is realizedthat real problem is calculating cross-covariance, Pi j terms.Suppose we want to find Pii term as in equation (40). Then itis needed to calculate also all the Pi j terms where ith node

Page 13: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 11

and jth node can be at the farthest edges of the network.It is shown in the calculation of Π that even farthest nodesaffect the algorithm of each other. Since distant nodes needinformation from each other to calculate their cross-covariancePi j and local communication constraint is given, Optimal DKFalgorithm derived in the previous section cannot be scalable.

The Covariance Intersection(CI) method [5] in which twocorrelated estimates are fused with an approximate covariancematrix is a practical, yet suboptimal solution for this problem.Like CI Method there are some studies which make similarapproximations for data fusion for correlated data like Consen-sus Filter [11], [12]. However, in this research our motivationis to make a formulation for an optimal algorithm given theconstraint of scalability.

A. Approach to the Scalability Problem

The first observation is that in the solution of this problemformulation in section V-B2, each node needs to know allthe connections in the network which contradicts with thescalability constraint.

The second observation is that the solution of the problemdepends on the formulation of the problem. If a formulationwhich takes scalability into account is made than a scalablesolution can be found. In section V-B the problem formulationincludes transmission of the state estimates as well as mea-surements which results in the correlation problem. In equation(40) cross correlations come into the picture.

A third observation is that ”scalability” must be defined tofit into the scalable problem formulation.

Our approach to this problem depends on these threeobservations. Firstly we define scalability as ”a node can seeonly its local network topology instead of whole networktopology”. This definition results in a problem formulationwhich involves only local network topology instead of globalnetwork topology of the network as in the optimal DKF.So any information regarding the non-neighbor nodes mustnot be involved in the algorithm of the operating node.The ”local network topology” can be the node itself, firstorder neighborhood, second order neighborhood or nth orderneighborhood. In this study we take ”local network topology”as first order neighborhood. So for example in the networkin figure 8 the algorithm in the first node can involve crosscovariance terms related to only second and third node. Therest of the cross covariances are modeled as ”uncertainty”terms. The neighbors can help the first node to approximatethis uncertainty terms as explained in the following sections.

B. Optimal Scalable DKF Problem Formulation

At this stage, we would like to introduce an exampletopology for which we will derive the optimal scalable DKFformulation without losing generality. Figure 8 shows ourexample topology and we are looking for an optimal scal-able DKF algorithm specifically on the first node. We send

Fig. 8. Example Topology for OSDKF

both states and measurements and incorporate those into theestimation as in equation (35). Then from equation (39)

e1 =[M11 M12 M13

]

e−1e−2e−3

− [

K11 K12 K13]

v1v2v3

(72)

e2 =[M21 M22 M24 M25

]

e−1e−2e−4e−5

[K21 K22 K24 K25

]

v1v2v4v5

(73)

e3 =[M31 M33 M37

]

e−1e−3e−7

− [

K31 K33 K37]

v1v3v7

(74)

However as the scalability condition implies, the algorithmin first node cannot involve e−4 ,e−5 ,e−7 and v4,v5,v7 terms. Asa result e2 and e3 as seen at the first node can be written as;

e2 = e12 +ϕ1

2 (75)

e3 = e13 +ϕ1

3 (76)

where

e12 =

[M21 M22

][e−1e−2

]− [

K21 K22][

v1v2

](77)

e13 =

[M31 M33

][e−1e−3

]− [

K31 K33][

v1v3

](78)

ϕ12 =

[M24 M25

][e−4e−5

]− [

K24 K25][

v4v5

](79)

ϕ13 = M37e−7 −K37v7 (80)

Then

P12 =[M11 M12 M13

]

P−11 P−12P−21 P−22P−31 P−32

[MT

21MT

22

]

+[K11 K12 K13

]

R1 00 R20 0

[KT

24KT

25

]+S1

12 (81)

Page 14: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 12

where

S112 = E[e1ϕT

2 ] =[M11 M12 M13

]

P−14 P−15P−24 P−25P−34 P−35

[MT

24MT

25

](82)

Similarly

P23 =[M21 M22 0

]

P−11 P−12 P−13P−21 P−22 P−23P−31 P−32 P−33

MT31

0MT

33

+[K21 K22 0

]

R1 0 00 R2 00 0 R3

KT310

KT33

+S1

23 (83)

where

S123 = E[e2ϕT

3 ]+E[ϕ2eT3 ]+E[ϕ2ϕT

3 ] (84)

Generalizing these cross covariance calculations we get

Π1 =

P11 P12 P13P21 P22 P23P31 P32 P33

(85)

µ1 =

M11 M12 M13M21 M22 0M31 0 M33

(86)

κ1 =

K11 K12 K13K21 K22 0K31 0 K33

(87)

Ω1 =

R1 0 00 R2 00 0 R3

(88)

Σ1 =

0 S112 S1

13S1

21 S122 S1

23S1

31 S132 S1

33

(89)

Π1 = µ1Π1µT1 +κ1Ω1κT

1 +Σ1 (90)

The solution of this formulation depends on the calculationsof unknown S1

i j matrices in Σ1. An estimate or an educatedapproximation of these values can lead to the solution ofthe formulation. Estimation or approximation of these ”un-certainty” terms is proposed as a future study to this research.The constraint for approximating or estimating S1

i j terms isthat only local communication is allowed for all nodes. Someideas about modeling these ”uncertainty” terms is given in thenext section.

C. Ideas for modeling uncertainty terms, Σ

Examining S112 we see that every term in S1

12 can be accessedby 2nd node except for P−34 and P−35 terms. For this reasonan approximation can be made by 2nd node for calculationof P12 at 1st node and then it can be sent to 1st node. Apossible approximation can be as follows by taking P34 ≈ P14and P35 ≈ P15;

S112∗=

[M11 +M13 M12

][P−14 P−15P−24 P−25

][MT

24MT

25

](91)

Similar approaches can be made for the other terms of Σ1 andthese uncertainties can be approximated by neighbor nodesand sent to the operating node. This approximation wouldsurely bring a bias in converged estimate. The stability andconvergence analysis of the proposed solutions have to bemade for the resulting algorithm.

VII. CONCLUSION

In this study we examined some optimality problems arisingby the decentralization of the Kalman filter. The first contribu-tion of this research is the introduction of a unique approach toDKF, which is called Global System Representation. Consid-ering the network of nodes as a whole, we stated an problemformulation for optimality depending on the given networktopology. Solving this problem formulation, we obtained anoptimal DKF dependent on the given network graph. Thissolution provided us with optimal local algorithms which alsotakes the network graph into account. This result is veryimportant since it proposes an algorithm which makes the bestestimate that can be obtained, taking the network topology intoaccount. So far in the literature the proposed algorithms forDKF have been compared with the central KF. However thecentral KF does not take the network topology into accountwhereas our Optimal DKF does and proves to be optimal ina sense that it minimizes the least squares error between thestate estimate and predicted real state by the model.

Although this solution for DKF is proved to be optimal,since it depends on the network graph, it is still unscalable.So the next challenge was incorporating scalability conditioninto the problem formulation so that a scalable algorithm canbe obtained.

A mathematical formulation of the scalable algorithm ismade in the last section. The solution of the formulationremained as dependant on the modeling of some uncertaintyterms which is proposed as a follow up to this research.

Lastly some comments, approximation tips are made for themodeling of these uncertainty terms. Still the formulation ofthese ”uncertainty” terms, stability analysis of it remained asto be researched in the future.

APPENDIX ADERIVATION OF KALMAN FILTER EQUATIONS

Predict:define e = x− x and e− = x− x− (92)

e[k] = Ae[k−1]−+w[k−1] from (1), (4), (92) (93)

P[k] = E[e[k]e[k]T ] (94)

P[k] = AP[k−1]AT +Q (95)Update:

x[k] = x−[k]+K[k](y[k]−Cx−[k]) (96)

Page 15: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 13

y[k]−Cx−[k] term is known as ”residual” or ”innovation”.Innovation represents the discrepency between the predictionCx−[k] and the measurement y[k]. The idea is improving theprediction towards the measurement by using a weight ofK[k], which is called Kalman gain. This Kalman gain canbe calculated optimally as a function of the measurementcovariance and predicted state covariance.

e[k] = (x[k]− x−[k])−K[k](Cx[k]−Cx−[k]+ v[k]) (97)from (2), (96) and (92)

e[k] = (I−K[k]C)e−[k]−K[k]v[k] (98)

P[k] = E[e[k]e[k]T ] (99)

P[k] = (I−K[k]C)P−[k](I−K[k]C)T +KRKT (100)

Here the trace of P[k], which is the sum of the variances ofeach state, should be minimized with respect to K[k] to findan optimal K[k].

since Tr(A) = Tr(AT )and P = PT

Tr(P[k]) = Tr(P−[k])−2Tr(P−[k]CT K[k]T ) (101)

+Tr(K[k](CP−[k]CT +R)K[k]T ) (102)d

dK[k]Tr(P[k]) =−2P−[k]CT +2K[k](CP−[k]CT +R) = 0

(103)

K[k] = P−[k]CT (CP−[k]CT +R)−1 (104)

using (104) in (100)

P[k] = (I−K[k]C)P−[k] (105)

APPENDIX BDERIVATION OF INFORMATION FILTER

Predict step is found by direct substitution of Informationfilter and Information State in equations (4) and (5). Updatestep is found eliminating K[k] by rewriting it in terms of P[k]and substituting into the equation (8).

K[k](CP−[k]CT +R) = P−[k]CT from equation (6) (106)

K[k]R = P−[k]CT −K[k]CP−[k]CT (107)

K[k]R = (I−K[k]C)P−[k]CT (108)

K[k] = P[k]CT R−1from equation (8) (109)

Using (8) and (109)

P−[k]−1 = P[k]−1(I−K[k]C) (110)

P[k]−1 = P−[k]−1 +CT R−1C (111)

J[k] = J−[k]+CT R−1C (112)

Also using (109) and (110) in (7) we find the expression forInformation state vector

x[k] = (I−K[k]C)x−[k]+K[k]y[k] (113)

P[k]−1x[k] = P[k]−1(I−K[k]C)x−[k]+P[k]−1K[k]y[k] (114)

P[k]−1x[k] = P−[k]−1x−[k]+CT R−1y[k] (115)

j[k] = j−[k]+CT R−1y[k] (116)

APPENDIX CDERIVATIVES OF TRACES

A. First Order

B. Second Order

REFERENCES

[1] R. Kalman, “A new approach to linear filtering and prediction problems,”Transaction of the ASME Journal of Basic Engineering, vol. 82, no. D,pp. 35–42, 1960.

[2] A. Mutambara and H. Durrant-Whyte, “A formally verified modular de-centralized robot control system,” . IEEE/RSJ International Conferenceon Intelligent Robots and Systems, 1993.

[3] B. Quine, J. Uhlmann, and H. Durrant-Whyte, “Implicit jacobians forlinearised state estimation in nonlinear systems.” Proceedings of the1995 American Control Conference, IEEE Press, 1995.

[4] A. A. Lange, “An algorithmic approach for improving and controllingthe quality of upper-air data,” WMO Instruments and Observing MethodsReport, No. 35, World Meteorological Organization, Geneva, Switzer-land, 1989, 1989.

[5] S. J. Julier and J. K. Uhlmann, “A non-divergent estimation algorithmin the presence of unknown correlations,” Proceedings of the AmericanControl Conference, New Mexico, 1997.

Page 16: Eindhoven University of Technology MASTER Optimal scalable ... · ized Kalman algorithm are based on the constraint of sending only measurements. In this situation an optimal algorithm

GRADUATION PAPER 14

[6] P. S. Maybeck, Stochastic models, estimation, and Control. AcademicPress, Harcourt Brace Jovanovich, 1979, ch. Chapter 1.

[7] G. Welch and G. Bishop, “An introduction to kalman filter,” 2006, uNC-Chapel Hill, TR-95-041.

[8] H. Hashmipour, S. Roy, and A. Laub, “Decentralized structures forparallel kalman filtering,” IEEE Transactions on Automatic Control,vol. 33, no. 1, pp. 88–93, 1988.

[9] H. Durant-Whyte, B. Rao, and H. Hu, “Towards a fully decentralizedarchitecture for multi-sensor data fusion,” in 1990 IEEE Int. Conf. onRobotics and Automation, Cincinnati, Ohio, USA, 1990, pp. 1331–1336.

[10] K. Petersen and P. M.S., The Matrix Cookbook, 2008.[11] M. Alighanbari and J. P. How, “An unbiased kalman consen-

sus algorithm,” Proceedings of the 2006 American Control Confer-ence,Minneapolis, Minnesota, USA, 2006.

[12] R. Olfati-Saber, “Distributed kalman filtering for sensor networks,”Proceedings of the 46th IEEE Conference on Decision and Control,2007.


Recommended