Unified Forms for Kalman and Finite Impulse
Response Filtering and Smoothing
Dan Simon∗ and Yuriy S. Shmaliy†
July 25, 2012
Abstract
The Kalman filter and smoother are optimal state estimators under certain
conditions. The Kalman filter is typically presented in a predictor/corrector
format, but the Kalman smoother has never been derived in that format. We
derive the Kalman smoother in a predictor/corrector format, thus providing a
unified form for the Kalman filter and smoother. We also discuss unbiased finite
impulse response (UFIR) filters and smoothers, which can provide a suboptimal
but robust alternative to Kalman estimators. We derive two unified forms for
UFIR filters and smoothers, and we derive lower and upper bounds for their
estimation error covariances. We present simulation results that demonstrate
the optimality of Kalman estimators and the robustness of UFIR estimators.
Keywords: Kalman filter, Kalman smoother, finite impulse response
filter, finite impulse response smoother, robust filter
∗Corresponding author. Cleveland State University, Cleveland, Ohio, United States. Email address [email protected].
†Universidad de Guanajuato, Salamanca, Mexico. Email address [email protected].
1 Introduction
In what follows, we assume that we have a linear system described as
xi = Fixi−1 + wi
yi = Hixi + vi (1)
where the time index i ≥ 1, xi is the K-dimensional state vector, yi is the
M -dimensional measurement, {wi} is a process noise sequence, {vi} is a mea-
surement noise sequence, and system matrices Fi and Hi are known. Our
objective is to estimate xi based on the measurements and our knowledge of
the system dynamics.
We use the term estimator to refer to the class of algorithms that includes
filtering, prediction, and smoothing. A filter is an estimator that estimates xi
based on measurements up to and including time i. A predictor is an estimator
that estimates xi based on measurements prior to time i. A smoother is an
estimator that estimates xi based on measurements both prior to, and later
than, time i. There are many ways to estimate the state of (1).
Kalman Estimation
Under certain conditions that will be reviewed in more detail in Section 2, the
Kalman estimator provides the optimal estimate of the state. The first Kalman
estimator that was derived was the Kalman filter [15]. In the case of smoothing,
the Kalman estimator is generally written as either a fixed-lag smoother, a
fixed-interval smoother, or a fixed-point smoother. These algorithms can be
described as follows [3, Chapter 7], [33, Chapter 9]:
• In a fixed-lag smoother we estimate xi for i ≥ 1 on the basis of measure-
ments up to and including time i+ q for a fixed value of q > 0;
• In a fixed-interval smoother we estimate xi for i ∈ [1, N ] on the basis of measurements up to and including time N ;
• In a fixed-point smoother we estimate xi on the basis of measurements up
to and including time i + q for a fixed value of i and for a fixed value of
q > 0.
As we will see in Section 2, the form of the Kalman smoother is much different
than that of the Kalman filter. In this paper, we derive a Kalman smoother
that is in the same form as the predictor/corrector form of the Kalman filter.
The Kalman estimator is an infinite impulse response (IIR) filter; that is,
each measurement ym affects each estimate xi for all m and i. The IIR na-
ture of the Kalman filter makes it optimal, but also can make it sensitive to
modeling errors. This problem was noted early in the history of Kalman estima-
tion [11], [25], [36]. Over the past few decades, researchers have proposed many
methods of robustifying the Kalman estimator [37, Chapter 9]. For example,
although the Kalman estimator is the optimal linear estimator, it is not optimal
for non-Gaussian noise. Noise in nature is often approximately Gaussian but
with heavier tails, and the Kalman estimator can be modified to provide im-
proved performance in this case [18], [19], [40]. Sometimes measurements do not
contain any useful information but consist entirely of noise with a certain prob-
ability, and the Kalman estimator can also be modified to deal with this situa-
tion [4], [5], [24], [35]. Kalman estimation with uncertainties in the system ma-
trices of (1) has been considered by many authors [13], [16], [39], [43] [44], [46];
this is often called adaptive or robust Kalman estimation [12]. Methods for
identifying the noise covariances are presented in [2], [20], [23].
Finite Impulse Response Estimation
Part of our goal here is the same as that of the research described above, but we
take a different approach. Whereas the previous research efforts aimed to im-
prove the Kalman estimator in the presence of mismodeling, we propose to use
a finite impulse response (FIR) estimator as an alternative to the Kalman esti-
mator. The advantages of transversal FIR estimators over Kalman estimators
were recognized as far back as the 1960s, particularly in the areas of stability
and robustness [14]. In spite of their history, FIR filters are not commonly used
for state estimation. This is probably due to the analytical complexity and large
computation burden of FIR estimators. We find only a few substantial results
in recent years. FIR smoothers can be used to design nonlinear filters [42] and
hybrid filters [47] for polynomial models. Order-recursive FIR smoothers were
proposed for state space [45]. General receding horizon FIR smoother theory
has been developed [1], [9], [10], [17]. More recently, unbiased FIR (UFIR)
smoothing of polynomial state space models has been considered [27], and FIR
smoothing was developed from the general p-shift estimator [28], [30], [31]. It-
erative UFIR algorithms have also been developed [28], [30]. These algorithms
have the same predictor/corrector structure as the Kalman filter, ignore the
statistics of the noise and initial estimation errors, and become virtually op-
timal as the length of the FIR window increases. They are also amenable to
parallel processing, which can allow them to operate as fast as the Kalman
filter. The form of the UFIR filter is much different than that of the UFIR
smoother, and so in this paper we derive a unified form for the UFIR filter and
smoother.
Overview of the Paper
Section 2 gives a brief review of Kalman filtering and smoothing, and then
derives a unified form for the two algorithms. It also presents an example that
illustrates a potential benefit of the unified form. Section 3 gives a review of
UFIR filtering and smoothing, and then derives two separate but equivalent
unified forms for the two algorithms. It also derives upper and lower bounds
for the estimation error covariance. Section 4 presents some simulation results
comparing Kalman filtering and smoothing with UFIR filtering and smoothing.
2 Kalman Filtering and Smoothing
In Kalman estimation our objective is to use the measurements and our knowl-
edge of the system matrices to estimate the state [33]. If our estimate of xi is
based on measurements up to and including time t, we denote the estimate as
xi|t. If t = i then we have xi|i, which is called the a posteriori state estimate. If
t = i−1 then we have xi|i−1, which is called the a priori state estimate. If t > i,
then we have a non-causal smoothed estimate, which uses future measurements
to obtain the estimate at time i. Many approaches are available for estimating
the state. Suppose the following conditions hold:
1. {wi} and {vi} are zero-mean, Gaussian, white, and uncorrelated, with
known covariances Qi and Ri respectively;
2. We have an initial state estimate before any measurements are processed
that we denote as x0|0;
3. (x0− x0|0) ∼ N(0, P0|0), which means that the initial estimation error is a
zero-mean normally-distributed random variable with known covariance
P0|0.
Then the Kalman filter output is the mean of the state conditioned on the
measurements up to and including the current time:
xi|i = E(xi|y1, y2, · · · , yi) (2)
for i ≥ 1. Furthermore, the Kalman filter estimate is the one that minimizes the
trace of the covariance of the estimation error. The Kalman filter algorithm can
be described as shown in Figure 1, although there are also many other equivalent
formulations of the Kalman filter [33]. If the process noise, measurement noise,
and initial estimation error are not Gaussian, then the Kalman filter is the
minimum-variance linear estimator, although there may be nonlinear estimators
that have a smaller variance.
Figure 1: The Kalman filter. Ki is the Kalman gain, Pi|i is the a posteriori estimation error covariance, and Pi|i−1 is the a priori estimation error covariance.

x0|0 = E(x0)
P0|0 = E[(x0 − x0|0)(x0 − x0|0)^T]
For i = 1, 2, · · ·
    xi|i−1 = Fi xi−1|i−1
    Pi|i−1 = Fi Pi−1|i−1 Fi^T + Qi
    Ki = Pi|i−1 Hi^T (Hi Pi|i−1 Hi^T + Ri)^−1
    xi|i = xi|i−1 + Ki (yi − Hi xi|i−1)
    Pi|i = (I − Ki Hi) Pi|i−1
Next i
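To make the recursion concrete, the loop of Figure 1 can be sketched in code. The following is a minimal NumPy sketch of our own (not part of the original algorithm statement), assuming time-invariant F, H, Q, and R; the time-varying case simply indexes these matrices by i.

```python
import numpy as np

def kalman_filter(F, H, Q, R, x0, P0, ys):
    """Predictor/corrector Kalman filter of Figure 1.

    Returns the a posteriori estimates x_{i|i} and covariances P_{i|i}
    for each measurement in ys.
    """
    x, P = x0, P0
    xs, Ps = [], []
    for y in ys:
        # Predict: a priori estimate x_{i|i-1} and covariance P_{i|i-1}.
        x = F @ x
        P = F @ P @ F.T + Q
        # Correct: Kalman gain, then a posteriori update.
        K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)
        x = x + K @ (y - H @ x)
        P = (np.eye(len(x)) - K @ H) @ P
        xs.append(x.copy())
        Ps.append(P.copy())
    return xs, Ps
```

Running this on a model such as the two-state system of Example 1 below returns one a posteriori estimate and covariance per measurement.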
In the case of smoothing, we use future measurements to obtain the optimal
state estimate. There are many ways to formulate the Kalman smoother. One
well-known algorithm is called the Rauch-Tung-Striebel (RTS) smoother, which
is a type of fixed-interval smoother [26], [33, Section 9.4.2]. Given measurements
yi for i ∈ [1, N ], the RTS smoother outputs estimates xi|N for all i ∈ [0, N ].
The RTS smoother algorithm is summarized in Figure 2. The RTS smoother
works by first executing the Kalman filter of Figure 1. The smoother then runs
backwards in time, starting at the final time, to obtain a smoothed estimate
and error covariance that are better than those of the Kalman filter.
Figure 2: The RTS smoother. Ks_i is the Kalman smoother gain, and Ps_i is the covariance of the error of the smoothed estimate at time i.

Execute the Kalman filter for i = 1, 2, · · · , n, as shown in Figure 1
Initialize Ps_n = Pn|n
For q = 1, 2, · · · , n
    i = n − q
    Ks_i = Pi|i F^T_{i+1} (Pi+1|i)^−1
    Ps_i = Pi|i − Ks_i (Pi+1|i − Ps_{i+1}) (Ks_i)^T
    xi|n = xi|i + Ks_i (xi+1|n − xi+1|i)
Next i
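The backward pass of Figure 2 admits an equally short sketch. The following NumPy illustration is ours and assumes a time-invariant F; it consumes the filtered quantities x_{i|i}, P_{i|i} and the one-step predictions x_{i+1|i}, P_{i+1|i} stored by the forward Kalman pass.

```python
import numpy as np

def rts_smoother(F, xs_filt, Ps_filt, xs_pred, Ps_pred):
    """Backward pass of the RTS smoother (Figure 2).

    xs_pred[i] and Ps_pred[i] hold the one-step predictions x_{i+1|i}
    and P_{i+1|i} made from the filtered estimate at time i.
    """
    n = len(xs_filt)
    xs_s = list(xs_filt)   # smoothed estimates; last entry stays x_{n|n}
    Ps_s = list(Ps_filt)   # smoothed covariances; last entry stays P_{n|n}
    for i in range(n - 2, -1, -1):
        # Smoother gain Ks_i = P_{i|i} F^T (P_{i+1|i})^{-1}.
        Ks = Ps_filt[i] @ F.T @ np.linalg.inv(Ps_pred[i])
        xs_s[i] = xs_filt[i] + Ks @ (xs_s[i + 1] - xs_pred[i])
        Ps_s[i] = Ps_filt[i] - Ks @ (Ps_pred[i] - Ps_s[i + 1]) @ Ks.T
    return xs_s, Ps_s
```

As the text observes, the smoothed covariances produced by this pass are never larger than the filtered ones.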
Unified Kalman Filtering and Smoothing
Figure 1 shows that the Kalman filter estimate can be written in the form
xi|i = γi xi−1|i−1 + Ki yi
where γi = (I − Ki Hi) Fi   (3)
for i ≥ 1. That is, the a posteriori estimate at the current time is a function
of the previous estimate and the new measurement. This is called a predic-
tor/corrector form. However, the smoothed estimate in Figure 2 does not have
this form. We would like to find a similar form for the smoothed estimate. In
particular, suppose that we want to use measurements {yi} for i ∈ [1, n] to
estimate the state at time n − q. We want to write the smoothed estimate in
the form
xn−q|n = γn,q xn−1|n−1 + Σ_{m=n−q+1}^{n} βn,q,m ym   (4)
where the smoother lag q > 0. This form would give the smoothed estimate
at time n − q conditioned on measurements up through time n, as a function
of the estimate at time n− 1, and as a function of the measurements between
times n− q + 1 and n. Such a form can serve at least two purposes.
First, we find it mathematically attractive to obtain unified forms for differ-
ent algorithms. We see this in many areas of science and engineering [6], [8], [21].
According to this perspective, the parallel form of (3) and (4) is intuitively ap-
pealing. If in (4) we set n = i, q = 0, γn,q = γi, βn,q,m = 0 for m < n, and
βn,q,n = Ki, then (4) reduces to (3).
Second, the smoother form of (4) may have practical benefits because it
directly shows the additional sensitivity of the smoothed estimate to each
measurement, beyond the sensitivity that is already incorporated in xn−1|n−1.
βn,q,m is the sensitivity of xn−q|n to ym for m ∈ [n−q+1, n]. These sensitivities
could be used to process the measurements in order of decreasing sensitivity so
that the most important measurements are processed first, in case the timeli-
ness of the smoothed estimate is important for a given application. To be more
specific, (4) can be written in algorithmic form by first computing
µ(l) = the value of m corresponding to the l-th largest value of βn,q,m   (5)
for m ∈ [n− q + 1, n] and l ∈ [1, q]. When we say “l-th largest value of βn,q,m”
in (5), we implicitly assume the use of some matrix or vector norm. After
computing (5), we perform the following:
x(0)_{n−q|n} = γn,q xn−1|n−1
For l = 1 to q
    m = n − q + l
    x(l)_{n−q|n} = x(l−1)_{n−q|n} + βn,q,µ(l) yµ(l)
Next l
The above algorithm gives a smoothed estimate that improves with each iteration so that at the end of the loop, x(q)_{n−q|n} = xn−q|n is the optimal Kalman
smoother output. More importantly, the algorithm processes the measurements
in an optimal order; they are processed in the order of their influence on the
estimate. An example is given later in this section.
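The ordering rule (5) and the loop above can be sketched as follows. This is a hypothetical NumPy illustration of our own: the gains βn,q,m are assumed to be given, and the spectral norm stands in for the unspecified matrix norm in (5).

```python
import numpy as np

def smooth_in_sensitivity_order(gamma, x_prev, betas, ys):
    """Accumulate the smoothed estimate (4), most influential terms first.

    betas and ys are dicts keyed by measurement index m in [n-q+1, n];
    returns the running partial estimates x^(0), ..., x^(q).
    """
    # mu(l): indices sorted by decreasing spectral norm of beta_{n,q,m}.
    order = sorted(betas, key=lambda m: -np.linalg.norm(betas[m], 2))
    x = gamma @ x_prev                  # x^(0) = gamma_{n,q} x_{n-1|n-1}
    partials = [x.copy()]
    for m in order:
        x = x + betas[m] @ ys[m]        # add the next most important term
        partials.append(x.copy())
    return partials                     # partials[-1] is the full sum (4)
```

Truncating the loop early corresponds to neglecting the least important measurements, which is the experiment carried out in Example 1.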
To write the RTS smoother algorithm of Figure 2 in the form of (4), we first
note from Figure 2 that xn−1|n can be written as
xn−1|n = xn−1|n−1 + Ks_{n−1} (xn|n − xn|n−1).   (6)
Proceeding inductively, it can be shown that
xn−q|n = xn−q|n−q + Σ_{l=1}^{q} Ln,q,l (xn−q+l|n−q+l − xn−q+l|n−q+l−1)
where Ln,q,l = Π_{r=0}^{l−1} Ks_{n−q+r}   (7)
for q ≥ 0. Now notice from the Kalman filter algorithm of Figure 1 that xn|n
can be written as
xn|n = (I −KnHn)Fnxn−1|n−1 +Knyn
= (I −KnHn)Fn(I −Kn−1Hn−1)Fn−1xn−2|n−2 +
(I −KnHn)FnKn−1yn−1 +Knyn. (8)
Proceeding inductively, it can be shown that
xn|n = (Π_{j=0}^{q} Jn−j Fn−j) xn−q−1|n−q−1 + Σ_{l=1}^{q} (Π_{j=0}^{l−1} Jn−j Fn−j) Kn−l yn−l + Kn yn
where Jn = I − Kn Hn   (9)
for q ≥ 0. Now replace n with n− 1 and replace q with q − 2 in (9) to obtain
xn−1|n−1 = M_{q−2} xn−q|n−q + Σ_{l=1}^{q−2} M_{l−1} Kn−1−l yn−1−l + Kn−1 yn−1   (10)
where
M_r = Π_{j=0}^{r} Jn−1−j Fn−1−j   if r ≥ 0
M_r = I   if r = −1
M_r = (Jn Fn)^−1   if r = −2.
Now write (7) as

xn−q|n = xn−q|n−q + Σ_{l=1}^{q} Ln,q,l (xn−q+l|n−q+l − J^−1_{n−q+l} xn−q+l|n−q+l + J^−1_{n−q+l} Kn−q+l yn−q+l)
       = xn−q|n−q + Σ_{l=1}^{q} Ln,q,l ((I − J^−1_{n−q+l}) xn−q+l|n−q+l + J^−1_{n−q+l} Kn−q+l yn−q+l).   (11)
Now use (10) to find xn−q+l|n−q+l for l ∈ [0, q] in terms of xn−1|n−1 and yj for
j ∈ [n− q+ l+1, n− 1], and substitute for xn−q+l|n−q+l in the above equation.
After some lengthy algebra, we obtain
xn−q|n = γn,q xn−1|n−1 + Σ_{m=n−q+1}^{n} βn,q,m ym   for q ≥ 0   (12)
where
γn,q = M^−1_{q−2} + Σ_{l=1}^{q} Ln,q,l (I − J^−1_{n−q+l}) M^−1_{q−l−2}
βn,q,m = Ln,q,q Kn   if m = n
βn,q,m = [Ln,q,q+m−n J^−1_m − M^−1_{q−2} M_{n−2−m} − Σ_{l=1}^{q+m−n−1} Ln,q,l (I − J^−1_{n−q+l}) M^−1_{q−l−2} M_{n−2−m}] Km   if m < n.
This algorithm, which we call the unified Kalman filter/smoother, is mathemat-
ically identical to the Kalman filter of Figure 1 for q = 0, and is mathematically
identical to the RTS smoother of Figure 2 for q > 0. Some similarities and dif-
ferences between the RTS form of the smoother and the unified form of the
smoother are as follows.
1. Both forms require that the standard forward Kalman filter execute before
any smoothing operations take place. Both forms require the forward error
covariances Pi|i and Pi+1|i, for all i ∈ [0, n− 1].
2. The RTS smoother requires saving the forward state estimates xi+1|i for
all i ∈ [0, n− 1]. The unified Kalman smoother requires saving the mea-
surements yi for all i ∈ [1, n].
Example 1
Consider the time-invariant system of (1) with
F = [1 ∆t; 0 1],  H = [1 0],  Q = [1 0; 0 1],  R = 10,
x0 = [1 0]^T,  P0|0 = [1 0; 0 1]   (13)
with ∆t = 0.1. Suppose the system runs until time index n = 41 and that we
want to find the smoothed estimate with a lag of q = 20. Equation (12) gives
the following values for ||βn,q,m||2 for m ∈ [22, 41]:
β = {0.10, 0.21, 0.34, 0.50, 0.71, 0.98, 1.34, 1.83, 2.49, 3.37,
4.56, 6.15, 8.30, 11.20, 15.10, 20.34, 27.40, 36.91, 49.71, 0.05} . (14)
Equation (5) shows that the µ(l) indices for l ∈ [1, 20] are given as follows:
µ = {40, 39, 38, 37, 36, 35, 34, 33, 32, 31,
30, 29, 28, 27, 26, 25, 24, 23, 22, 41} . (15)
This means that y40 is the most important measurement for the xn−q estimate,
y39 is the second most important measurement, and so on. Now suppose that
we need to neglect one of the measurements in (12) due to computational con-
straints or for some other reason. It stands to reason from (15) that neglecting
y41 would result in the smallest degradation of xn−q from its optimal value, ne-
glecting y22 would result in the second smallest degradation, and so on. We can
numerically confirm this reasoning for this example by using (12) to calculate
xn−q|n,r = γn,q xn−1|n−1 + Σ_{m=n−q+1}^{r−1} βn,q,m ym + Σ_{m=r+1}^{n} βn,q,m ym   (16)
for r ∈ [22, 41]. That is, xn−q|n,r is calculated in the same way as the optimal
smoothed estimate, except that it does not use yr. Figures 3 (linear scale) and 4
(log scale) show the RMS estimation error of 100 Monte Carlo simulations of
xn−q|n,r as a function of the missing measurement index r. The left-most point
on each plot corresponds to the optimal smoothed estimate, and the other el-
ements on the horizontal axes are in the opposite order of µ in (15). Each
smoothed estimation error (except for the first point) in the plots corresponds
to the neglect of a single measurement, with the importance of that measure-
ment increasing as we move from left to right on the plots. We see that, as
predicted, the estimation error gets worse as we leave out more important mea-
surements. We further see that if we neglect one of the few least important
measurements, then smoothing performance degrades only slightly relative to
the optimal performance, which is shown at the leftmost point on the plots.
Figure 3: This figure illustrates the effect of neglecting increasingly important measurements in a smoothed estimate (RMS smoothing error of the first and second states versus the sorted missing-measurement index). This shows the same data as in Figure 4 except on a linear scale.
Figure 4: This figure illustrates the effect of neglecting increasingly important measurements in a smoothed estimate (RMS smoothing error of the first and second states versus the sorted missing-measurement index). This shows the same data as in Figure 3 except on a log scale.
3 Unified UFIR Filtering and Smoothing
This section presents two forms for a unified setting for UFIR filtering and
smoothing based on the p-shift UFIR estimator proposed in [28], [31]. We begin
with some preliminaries, and then derive the two unified forms in Section 3.1.
We derive estimation error bounds in Section 3.2.
As in the previous section, we suppose that we have a linear system given
by (1) where yn is the most recent measurement. The UFIR estimator is con-
ceptually different than the Kalman estimator. The Kalman estimator uses
all measurements ym for m ∈ [1, n] to obtain the filtered estimate xn|n, or the
smoothed estimate xn−q|n for q ≥ 1.
However, the FIR estimator uses the N most recent measurements to obtain
the filtered estimate xn, or the smoothed estimate xn−q for some value of lag
q ∈ [0, N − 1], where N is a user-specified smoothing interval. We often set
q = ⌊N/2⌋, where ⌊·⌋ is the floor function. However, other values of q may
provide lower estimation errors [27]. We use the notation xn for the UFIR
filter estimate, and xn−q for the UFIR smoothed estimate, both of which are
conditioned on the N measurements ym for m ∈ [n −N + 1, n]. Since xn and
xn−q are functions of N , we sometimes write them as xn(N) and xn−q(N).
Like Gauss’s ordinary least squares (OLS), which is stated by the Gauss-
Markov theorem to be the best linear unbiased estimator (BLUE) [7], [33], the
UFIR filter/smoother ignores noise statistics and initial estimation errors. The
UFIR estimator requires an optimal averaging interval of Nopt points in order
for the mean square error (MSE) of the estimate to be minimal. It was recently
shown in [32] that Nopt can be estimated with high accuracy by minimizing
the derivative of E{[yn −Hnxn(N)][yn −Hnxn(N)]T } with respect to N . An
example of this approach is given in Section 4.
Like the Kalman filter, the UFIR estimator can be given in a fast iterative
form for filtering and prediction [30]. For smoothing, however, we need to
modify the estimator to obtain an iterative form. To provide this modification,
let us write the smoothed estimate [31, Eq. (27)] at the first time point in the
smoothing interval as
xm = H^−1_{n,m} Y_{n,m}   (17)

where m = n − N + 1 and H^−1_{n,m} = (H^T_{n,m} H_{n,m})^−1 H^T_{n,m} ∈ R^{K×M(n−m+1)} is the generalized left inverse of H_{n,m}, and

H_{n,m} = H̄_{n,m} F_{n,m} ∈ R^{M(n−m+1)×K}
Y_{n,m} = [y^T_n  y^T_{n−1}  · · ·  y^T_m]^T ∈ R^{M(n−m+1)×1}
F_{n,m} = [(F^{m+1}_{n,0})^T  (F^{m+1}_{n,1})^T  · · ·  F^T_{m+1}  I]^T ∈ R^{K(n−m+1)×K}
H̄_{n,m} = diag(Hn  Hn−1  · · ·  Hm) ∈ R^{M(n−m+1)×K(n−m+1)}
F^{r−g}_{r,h} = Π_{i=h}^{g} F_{r−i} ∈ R^{K×K},   (18)

where F_{n,m} and the block-diagonal matrix H̄_{n,m} each contain n − m + 1 blocks.
As can be seen, the form of (17) is reminiscent of the familiar OLS estimator
or BLUE.
Now, to provide a unified UFIR smoothing equation for arbitrary lag q,
the transition matrix Bn,m(q) must be specified such that xn−q = Bn,m(q)xm,
where xm is given in (17). By combining the forward-time and backward-time
solutions [38], this matrix can be found as
B_{n,m}(q) = F^{m+1}_{n−q,0}   if q ≤ n − m − 1
B_{n,m}(q) = I   if q = n − m
B_{n,m}(q) = (F^{n−q+1}_{m,0})^−1   if q ≥ n − m + 1.   (19)
The most general batch form of the unified UFIR filter (q = 0) and smoother (q > 0) is thus the following:

xn−q = B_{n,m}(q) H^−1_{n,m} Y_{n,m}   (20)

where q ≥ 0. Assuming 0 ≤ q < N − 1, one may also use the particular form of (20) given in [30, Eq. 20], although that form has the limitations discussed in Section 3.1.1 below.
If we now observe that the filter estimate (q = 0) is
xn = F^{m+1}_{n,0} H^−1_{n,m} Y_{n,m}   (21)

then (20) can be written as

xn−q = B_{n,m}(q) (F^{m+1}_{n,0})^−1 xn.   (22)
This equation plays a key role in the derivation of the second form of the UFIR
estimator discussed in Section 3.1.2 below.
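As an illustration, for a time-invariant model the batch estimator (20)-(22) takes a particularly simple form, since F^{m+1}_{n,0} = F^{N−1} and B_{n,m}(q) collapses to F^{N−1−q}. The following NumPy sketch is ours, under those assumptions.

```python
import numpy as np

def ufir_batch(F, H, ys, q=0):
    """Batch UFIR estimate (20) of x_{n-q}, time-invariant model.

    ys holds the N most recent measurements y_m, ..., y_n (oldest first).
    """
    N = len(ys)
    # Y_{n,m} stacks the measurements newest-first, per (18).
    Y = np.concatenate(ys[::-1])
    # H_{n,m} = diag(H, ..., H) F_{n,m}; with constant F and H its block
    # rows are H F^{N-1}, H F^{N-2}, ..., H.
    powers = [np.linalg.matrix_power(F, j) for j in range(N - 1, -1, -1)]
    Hscript = np.vstack([H @ P for P in powers])
    # Generalized left inverse of (17): x_m = (H^T H)^{-1} H^T Y.
    x_m = np.linalg.lstsq(Hscript, Y, rcond=None)[0]
    # Map x_m to x_{n-q}: B_{n,m}(q) collapses to F^{N-1-q} here.
    return np.linalg.matrix_power(F, N - 1 - q) @ x_m
```

With noiseless data this recovers the state exactly for any lag q in [0, N − 1], which is a convenient sanity check on the construction.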
3.1 Iterative UFIR filtering and smoothing
Employing (20) and (22) and following [28], [30], this section derives two forms
of the iterative unified UFIR filter/smoother, which are similar in form to the
unified Kalman filter/smoother.
3.1.1 The First Form of the Unified UFIR Filter/Smoother
Following the derivations given in [28, Appendices I and II], the first form of
the iterative UFIR estimator corresponding to (20) becomes
xl−q = Fl−q xl−q−1 + Kl (yl − Hl Ȳl xl−q−1)   (23)

for l ∈ [m + K, n], where

Yl = F^{l−q+1}_{l,0} = Π_{i=0}^{q−1} Fl−i   if q > 0
Yl = I   if q = 0
Ȳl = Yl Fl−q.   (24)

Since Yl and Ȳl are functions of q, we sometimes write them as Yl(q) and Ȳl(q).
Note that (23) defines an iterative procedure, so xl−q is not a useful estimate for
l < n. The iteration (23) leads to the optimal UFIR q-lag smoothed estimate
(assuming that N = Nopt) when l = n. The bias correction gain Kl ≜ Kl,m(q) is given as

Kl = Gl Yl^T Hl^T   (25)
where the generalized noise power gain (GNPG) Gl ≜ Gl,m(q) can be computed iteratively using

Gl = Fl−q (Ȳl^T Hl^T Hl Ȳl + G^−1_{l−1})^−1 F^T_{l−q}
   = [Yl^T Hl^T Hl Yl + (Fl−q Gl−1 F^T_{l−q})^−1]^−1.   (26)
The initial values for this iteration, xm+K−1−q and Gm+K−1, are given as
xs−q = B_{s,m}(q) H^−1_{s,m} Y_{s,m}   (27)
Gs = B_{s,m}(q) (H^T_{s,m} H_{s,m})^−1 B^T_{s,m}(q)   (28)
where s = m +K − 1. The index variable l in (23) ranges from m +K to n,
and the smoothed estimate xn−q is obtained when l = n in (23).
Since the UFIR estimator does not depend on initial conditions, we can
approximately set (27) to zero and (28) to the identity matrix. This simplifica-
tion gives relatively good estimates if N ≫ 1, although it may not be accurate
otherwise.
The first unified UFIR filter/smoother form (23) is summarized in Figure 5.
If q = 0 the algorithm becomes the UFIR filter, and if q > 0 the algorithm be-
comes the UFIR smoother. Note that for time-invariant systems the algorithm
is greatly simplified, although we do not show the simplified version here.
The estimate xn−q is the output of the iterative algorithm of Figure 5, and
is obtained by iteratively improving the estimate beginning with xn−(q+N−K).
Note that this algorithm outputs (q−K +1) estimates prior to the smoothing
interval. That is, it outputs xl−q for l ∈ [m +K,n], and (q −K + 1) of those
x indices are prior to the time index (n−N + 1). This situation is depicted in
Figure 6.
Figure 5: The first form of the iterative unified UFIR filter/smoother. The optimal value of N can be obtained as shown in Example 2 below.

Given: System dynamics of (1)
    K = number of states
    n = most recent measurement time index
    N = number of time points in smoother/filter interval
    q = smoother time lag (note that q = 0 for filtering)
Define: m = n − N + 1 = time index of earliest measurement used
    s = m + K − 1
Eq. (27): xs−q = B_{s,m}(q) H^−1_{s,m} Y_{s,m}
Eq. (28): Gs = B_{s,m}(q) (H^T_{s,m} H_{s,m})^−1 B^T_{s,m}(q)
For l = m + K to n
    Gl = [Yl^T Hl^T Hl Yl + (Fl−q Gl−1 F^T_{l−q})^−1]^−1
    Kl = Gl Yl^T Hl^T
    xl−q = Fl−q xl−q−1 + Kl (yl − Hl Ȳl xl−q−1)
Next l
Figure 6: This figure illustrates the first form of the UFIR smoother. n is the time index of the most recent measurement, N is the length of the smoothing interval, K is the number of states, and q is the smoother lag. The first form of the UFIR smoother obtains smoothed estimates between time indices (n − N + K − q) and (n − q). However, only the estimate at time (n − q) is optimal (assuming that N = Nopt).
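A minimal sketch of the first form for a time-invariant model with invertible F may help fix ideas. It is our own illustration: Yl = F^q and Ȳl = F^{q+1} are constant, and the batch initialization (27)-(28) is taken over the first K measurements.

```python
import numpy as np

def ufir_first_form(F, H, ys, q=0):
    """First-form UFIR estimate of x_{n-q}, time-invariant model.

    ys holds the N measurements y_m, ..., y_n (oldest first); F must be
    invertible when q > K - 1 (negative matrix powers are used).
    """
    K = F.shape[0]
    Y = np.linalg.matrix_power(F, q)           # Y_l, constant here
    Ybar = Y @ F                               # Ybar_l = Y_l F
    # Batch initialization at s = m+K-1, eqs. (27)-(28).
    powers = [np.linalg.matrix_power(F, j) for j in range(K - 1, -1, -1)]
    Hs = np.vstack([H @ P for P in powers])    # maps x_m to [y_s; ...; y_m]
    x_m = np.linalg.lstsq(Hs, np.concatenate(ys[:K][::-1]), rcond=None)[0]
    B = np.linalg.matrix_power(F, K - 1 - q)   # B_{s,m}(q) for constant F
    x = B @ x_m                                # estimate of x_{s-q}
    G = B @ np.linalg.inv(Hs.T @ Hs) @ B.T
    # Iterations (23)-(26) for l = m+K, ..., n.
    A = Y.T @ H.T @ H @ Y
    for y in ys[K:]:
        G = np.linalg.inv(A + np.linalg.inv(F @ G @ F.T))
        Kgain = G @ Y.T @ H.T
        x = F @ x + Kgain @ (y - H @ Ybar @ x)
    return x
```

As in Figure 6, the recursion starts from the estimate at time s − q and improves it until the output x_{n−q} is reached at l = n.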
3.1.2 The Second Form of the Unified UFIR Filter/Smoother
We see from (23) that when q = 0,

xl = Fl xl−1 + Kl (yl − Hl Fl xl−1)
Kl = Gl Hl^T
Gl = [Hl^T Hl + (Fl Gl−1 Fl^T)^−1]^−1   (29)
for l ∈ [m+K,n], and the initial values are given by
xs = F^{m+1}_{s,0} H^−1_{s,m} Y_{s,m}   (30)
Gs = F^{m+1}_{s,0} (H^T_{s,m} H_{s,m})^−1 (F^{m+1}_{s,0})^T   (31)
where, as in the previous section, s = m + K − 1. After the index l finishes
iterating from m+K to n, (29) provides the filtered estimate xn, and the q-lag
smoother estimate can then be computed by retrodicting xn to time n−q using
(22):
xn−q = B_{n,m}(q) (F^{m+1}_{n,0})^−1 xn.   (32)
We see that the second form of the UFIR smoother outputs estimates between
time indices (n−N +K) and n. Compare this with the first form of the UFIR
smoother, which is depicted in Figure 6.
The second unified UFIR filter/smoother form is summarized in Figure 7.
As with the first form, if q = 0 the algorithm is the UFIR filter, and if q > 0
the algorithm is the UFIR smoother. As with the first form, the algorithm
is greatly simplified for time-invariant systems, although we do not show the
simplified version here.
The algorithm of Figure 7 is conceptually similar to the RTS smoother of
Figure 2 in that first, filtering provides the estimates at each point in the time
window of interest, and then smoothing is obtained by retrodicting the filter
estimate back to time n− q, where q is the desired smoother lag.
Figure 7: The second form of the iterative unified UFIR filter/smoother. The optimal value of N can be obtained as shown in Example 2 below.

Given: K = number of states
    n = most recent measurement time index
    N = number of time points in smoother/filter interval
    q = smoother time lag (note that q = 0 for filtering)
Define: m = n − N + 1 = time index of earliest measurement used
    s = m + K − 1
Eq. (30): xs = F^{m+1}_{s,0} H^−1_{s,m} Y_{s,m}
Eq. (31): Gs = F^{m+1}_{s,0} (H^T_{s,m} H_{s,m})^−1 (F^{m+1}_{s,0})^T
For l = m + K to n
    Gl = [Hl^T Hl + (Fl Gl−1 Fl^T)^−1]^−1
    xl = Fl xl−1 + Gl Hl^T (yl − Hl Fl xl−1)
Next l
xn−q = B_{n,m}(q) (F^{m+1}_{n,0})^−1 xn
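For a time-invariant model with invertible F, the second form reduces to the following sketch (ours): the batch initialization (30)-(31), the filter recursion (29), and the retrodiction (32), which collapses to F^{−q} xn.

```python
import numpy as np

def ufir_second_form(F, H, ys, q=0):
    """Second-form UFIR estimate of x_{n-q}, time-invariant model.

    ys holds the N measurements y_m, ..., y_n (oldest first).
    """
    K = F.shape[0]
    # Batch initialization (30)-(31) over the first K measurements.
    powers = [np.linalg.matrix_power(F, j) for j in range(K - 1, -1, -1)]
    Hs = np.vstack([H @ P for P in powers])      # maps x_m to [y_s; ...; y_m]
    x_m = np.linalg.lstsq(Hs, np.concatenate(ys[:K][::-1]), rcond=None)[0]
    x = powers[0] @ x_m                          # x_s = F^{K-1} x_m, eq. (30)
    G = powers[0] @ np.linalg.inv(Hs.T @ Hs) @ powers[0].T   # eq. (31)
    # Filter recursion (29) for l = m+K, ..., n.
    for y in ys[K:]:
        G = np.linalg.inv(H.T @ H + np.linalg.inv(F @ G @ F.T))
        x = F @ x + G @ H.T @ (y - H @ F @ x)
    # Retrodict the filtered estimate back to time n-q, eq. (32).
    return np.linalg.solve(np.linalg.matrix_power(F, q), x)
```

Note that the loop is exactly the UFIR filter; only the final retrodiction step distinguishes the smoother from the filter, mirroring the structure of Figure 7.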
3.2 Estimation Error Covariance Bounds
This section derives bounds for the UFIR filter and smoother estimation errors.
Define the instantaneous error and its covariance as
ϵl−q = xl−q − x̂l−q
Pl−q = E{ϵl−q ϵ^T_{l−q}}   (33)

where x̂l−q denotes the estimate of xl−q.
Although the UFIR estimator has two equivalent forms (batch and iterative),
the covariance can be determined rigorously only via the batch form of (20).
If we try to calculate (33) iteratively using (23), then the calculation of Pl−q
will require continually expanding matrix operations at each iteration. There
is no such effect in the covariance calculation of the Kalman smoother, as we
see from Figure 2. Finding a rigorous closed-form analytical solution for the
UFIR smoother covariance via (20) results in a large mathematical burden and
is a topic of current research. On the other hand, it may not be necessary
because, unlike the Kalman smoother, the covariance is not required in the
UFIR smoother algorithm. Instead, reasonably accurate covariance bounds
may be sufficient for practical applications. We discuss the upper and lower
UFIR smoother covariance bounds below.
3.2.1 Upper Bound of UFIR Covariance
The upper bound (UB) of the UFIR smoother covariance P^{UB}_{n−q} can be found as
shown in Appendix A. The upper bound (A.5) involves accumulating process
noise covariances at each iteration. Therefore, the upper bound is relatively
tight for small N , and becomes more conservative when N ≫ 1. In a similar
way, the conservativeness of the upper bound increases with q. This means
that the upper bound of (A.5) is most useful (that is, most strict) for filtering
(q = 0), and for smoothing applications in which both N and q are small.
3.2.2 Lower Bound of UFIR Covariance
We can find the lower bound (LB) of the UFIR smoother covariance P^{LB}_{n−q} in two different ways. First, we use (A.5), which was derived from the first UFIR form of Section 3.1.1. We neglect all of the process noise covariances in (A.5) to obtain

P^{LB}_{l−q} = (I − Kl Hl Yl) Fl−q P^{LB}_{l−q−1} F^T_{l−q} (I − Kl Hl Yl)^T + Kl Rl Kl^T.   (34)

For time-invariant models, (34) becomes

P^{LB}_{l−q} = (I − Kl H F^q) F P^{LB}_{l−q−1} F^T (I − Kl H F^q)^T + Kl R Kl^T.   (35)
Both (34) and (35) correspond to the first UFIR estimator form (23).
We can also use the second UFIR form of Section 3.1.2 to find the LB of
the UFIR smoother covariance. We first find the LB for filtering (q = 0) as
P^{LB}_l = (I − Kl Hl) Fl P^{LB}_{l−1} F^T_l (I − Kl Hl)^T + Kl Rl Kl^T   (36)
and then use (32) to compute the LB for smoothing for time-varying and time-
invariant models as
Time-varying models: P^{LB}_{n−q} = B_{n,m}(q) (F^{m+1}_{n,0})^−1 P^{LB}_n (F^{m+1}_{n,0})^{−T} B^T_{n,m}(q)
Time-invariant models: P^{LB}_{n−q} = F^{−q} P^{LB}_n (F^{−q})^T   (37)
where P^{LB}_n is provided by (36) when l = n. Note that the LBs can also be
computed via the noise power gain to serve well in the three-sigma sense, as
shown in [29].
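As an illustration, the filtering bound (36) with the time-invariant retrodiction (37) can be iterated as follows. This NumPy sketch is ours; it uses the second-form gain Gl Hl^T, and F must be invertible when q > 0.

```python
import numpy as np

def ufir_lower_bound(F, H, R, N, q, G0):
    """Iterate the filtering LB (36) over the window, then retrodict
    to time n-q via the time-invariant form of (37).

    G0 is the GNPG at the start of the iterations. Process noise and
    initial errors are neglected, which is what makes this a lower bound.
    """
    K = F.shape[0]
    I = np.eye(K)
    P = np.zeros((K, K))          # the LB recursion starts from zero
    G = G0
    for _ in range(N - K):
        # Second-form GNPG and gain, eq. (29).
        G = np.linalg.inv(H.T @ H + np.linalg.inv(F @ G @ F.T))
        Kl = G @ H.T
        A = (I - Kl @ H) @ F
        P = A @ P @ A.T + Kl @ R @ Kl.T       # eq. (36)
    # Time-invariant retrodiction (37): P_{n-q} = F^{-q} P_n F^{-q,T}.
    Fq = np.linalg.matrix_power(F, -q) if q else I
    return Fq @ P @ Fq.T
```

Because the recursion only accumulates the measurement-noise contribution, the returned matrix is a (symmetric, positive semidefinite) floor on the achievable covariance rather than the covariance itself.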
4 Simulation Results
This section presents some simulation results to demonstrate the theory of the
preceding sections.
Example 2
Recall from the UFIR algorithms of Figures 5 and 7 that the user needs to select
N , which is the number of measurements used in the UFIR smoother/filter. If
N is too small, then there is too little information to form a reliable estimate
of the state because there are too few measurements involved in the estimate.
However, if N is too large, then bias errors enter the estimate due to the long
time scale. This example shows how we can estimate Nopt. As described in [32],
Nopt ≈ arg min_N ∂V(N)/∂N
where V(N) = E{[yn − Hn xn(N)][yn − Hn xn(N)]^T}.   (38)
We can numerically estimate the above derivative for various values of N , and
then use any optimization algorithm to minimize it. In this example we use
F = [1 ∆t; 0 1],  H = [1 0],  Q = [0 0; 0 1],  R = (5/6)^2   (39)
with ∆t = 0.1. Figure 8 shows ∂V/∂N along with the RMS value of the
estimation error, which was obtained from 10,000 UFIR filter time steps. We
see that ∂V/∂N is minimum at N = 10, while the estimation error is minimum
at N = 9. Furthermore, the estimation error is relatively flat near its minimum,
which indicates that the UFIR filter is relatively robust, at least for this simple
example. A more accurate estimate of Nopt can be obtained by using more
time steps to approximate V (N).
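The strategy of (38) can be sketched numerically. In the following illustration (ours), V(N) is approximated by the sample mean of the squared innovation yn − Hn xn(N) from a second-form UFIR filter that is simply restarted on each window from a zero state and identity GNPG, and the derivative is taken by a finite difference; these simplifications are ours, not from [32].

```python
import numpy as np

def innovation_variance(F, H, ys, N):
    """Sample estimate of V(N): mean squared innovation y_n - H x_n(N)."""
    errs = []
    K = F.shape[0]
    for n in range(N, len(ys)):
        G = np.eye(K)
        x = np.zeros(K)
        # Restarted second-form UFIR filter over the window ending at n.
        for y in ys[n - N + 1:n + 1]:
            G = np.linalg.inv(H.T @ H + np.linalg.inv(F @ G @ F.T))
            x = F @ x + G @ H.T @ (y - H @ F @ x)
        errs.append(((ys[n] - H @ x) ** 2).item())
    return float(np.mean(errs))

def estimate_Nopt(F, H, ys, N_range):
    """Pick N minimizing the finite-difference derivative of V(N), per (38)."""
    V = [innovation_variance(F, H, ys, N) for N in N_range]
    dV = np.diff(V)                    # finite-difference dV/dN
    return N_range[int(np.argmin(dV)) + 1]
```

With a long measurement record from the model of (39), this returns an estimate of Nopt in the spirit of Figure 8; the accuracy improves as the record length grows.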
Figure 8: This figure illustrates the strategy of finding the optimal measurement interval N for the UFIR filter (RMS estimation error and the derivative of V(N), plotted versus the filtering interval N). The estimation error is minimized at about the same value of N as the derivative of V(N).
Example 3
Next we compare Kalman filter, Kalman smoother, UFIR filter, and UFIR
smoother performance. We use the same system parameters as in the previous
example. For the UFIR filter and smoother, we set N = 12 to simulate an
error in our approximation of Nopt and to explore its robustness. This gives
a UFIR smoother lag q = N/2 = 6. For the Kalman filter and smoother,
we vary the value of Q(2, 2) assumed by the estimator. Measurement noise
is often not too difficult to statistically characterize because it comes from
engineered measurement devices. However, process noise is often very difficult
to characterize, and so errors in the Kalman filter/smoother value of Q are
common. We run the estimator simulations for n = 10, 000 time steps. The
Kalman smoother in this example uses all n measurements to find the smoothed
estimate xi|n for i ∈ [1, 10000] (see Figure 2). The UFIR smoother uses yk for
k ∈ [i − 11, i] to find the smoothed estimate xi−6, where i ∈ [7, 10000] (see
Figures 5 and 7).
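To make the smoother's window indexing concrete, the window arithmetic above can be written as a tiny helper (the function name is ours, for illustration only):

```python
def ufir_smoother_window(i, N=12, q=6):
    """Return the measurement indices k in [i - N + 1, i] used by the UFIR
    smoother, and the index i - q of the state being smoothed.
    Here N = 12 and the lag is q = N/2 = 6, as in Example 3."""
    ks = list(range(i - N + 1, i + 1))
    return ks, i - q

ks, idx = ufir_smoother_window(100)
# ks == [89, 90, ..., 100] (12 measurements), idx == 94
```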
Tables 1 and 2 compare the performance of the Kalman filter, Kalman
smoother, UFIR filter, and UFIR smoother, and are based on 20 Monte Carlo
simulations.
                    Assumed value of Q(2, 2)
                  0.01   0.1    1      10     100
Kalman filter     1.56   0.71   0.53   0.60   0.71
UFIR filter       0.56   0.56   0.56   0.56   0.56
Kalman smoother   0.84   0.40   0.30   0.35   0.47
UFIR smoother     0.34   0.34   0.34   0.34   0.34

Table 1: RMS estimation errors for the first state in Example 3. R = (5/6)^2, and the true value of Q(2, 2) is 1.
                    Assumed value of Q(2, 2)
                  0.01   0.1    1      10     100
Kalman filter     3.15   2.42   2.07   2.74   5.35
UFIR filter       2.26   2.26   2.26   2.26   2.26
Kalman smoother   1.58   1.20   1.01   1.40   3.18
UFIR smoother     1.10   1.10   1.10   1.10   1.10

Table 2: RMS estimation errors for the second state in Example 3. R = (5/6)^2, and the true value of Q(2, 2) is 1.
We make the following observations about Tables 1 and 2.
1. When the Kalman filter/smoother assumes the correct statistics of the
system, it is optimal, and so it outperforms the UFIR filter/smoother, as
expected.
2. The Kalman smoother always outperforms the Kalman filter, and the UFIR
smoother always outperforms the UFIR filter, as expected.
3. As the Kalman filter/smoother's assumed value of Q(2, 2) deviates from
the true value, its performance suffers, as expected. For this example,
performance suffers more when the assumed Q(2, 2) is smaller than the true
value. This agrees with our intuition: if we know that we have modeling
errors, we should put more emphasis on the measurements in the Kalman
filter/smoother rather than on the model information. However, when
Q(2, 2) decreases, the Kalman gain decreases, as we can deduce from Fig-
ure 1, so the filter places too much confidence in the model.
4. UFIR filter/smoother performance is invariant with respect to the assumed
value of Q(2, 2) because the UFIR estimator does not use any noise statistics.
Tables 1 and 2 show that the UFIR filter/smoother clearly outperforms the
Kalman filter/smoother when the assumed noise statistics are incorrect.
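The mismatched-Q experiment behind Tables 1 and 2 can be reproduced in outline with a standard predictor/corrector Kalman filter. This is a hedged sketch, not the authors' code [34]: the seed, initial conditions, and run length are ours, so the numbers will not match the tables exactly, but the trend (error growing as the assumed Q(2, 2) moves away from the true value of 1) should appear.

```python
import numpy as np

dt = 0.1
F = np.array([[1.0, dt], [0.0, 1.0]])
H = np.array([[1.0, 0.0]])
R = (5.0 / 6.0) ** 2
rng = np.random.default_rng(1)

def run_kf(ys, q22):
    """Standard predictor/corrector Kalman filter with assumed Q(2,2) = q22."""
    Q = np.diag([0.0, q22])
    x, P = np.zeros(2), 10.0 * np.eye(2)
    xhats = []
    for y in ys:
        x = F @ x                       # predict
        P = F @ P @ F.T + Q
        S = H @ P @ H.T + R             # innovation covariance (scalar here)
        K = (P @ H.T) / S               # Kalman gain
        x = x + (K * (y - H @ x)).ravel()
        P = (np.eye(2) - K @ H) @ P
        xhats.append(x)
    return np.array(xhats)

# Simulate the truth with the true Q(2,2) = 1
T = 5000
x = np.zeros(2)
xs, ys = [], []
for _ in range(T):
    x = F @ x + np.array([0.0, rng.standard_normal()])
    xs.append(x)
    ys.append((H @ x)[0] + rng.normal(0.0, np.sqrt(R)))
xs = np.array(xs)

rms = {}
for q22 in [0.01, 0.1, 1.0, 10.0, 100.0]:
    err = run_kf(ys, q22) - xs
    rms[q22] = np.sqrt(np.mean(err ** 2, axis=0))   # per-state RMS error
```

The UFIR estimator would be run once on the same data; its error column is constant across q22 because it never touches Q.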
5 Conclusion
We have derived a unified form for the Kalman filter and smoother that ex-
plicitly shows the sensitivity of the smoothed estimate to each measurement
between the time point of interest and the end of the smoothing interval. We
have derived two unified forms for the UFIR filter and smoother, along with
bounds for their estimation error covariances. We have seen that although the
Kalman filter and smoother are optimal when the system matrices are known,
the UFIR estimator can provide better robustness in the case of modeling er-
rors or uncertainties. The UFIR estimator does not require any knowledge of
noise statistics, which makes it an attractive alternative to the Kalman estima-
tor for some applications. The UFIR estimator requires more computational
effort than the Kalman estimator, although this could be circumvented through
parallel processing. MATLAB® source code is available on the internet so that
readers can replicate the examples in this paper [34].
Further research should focus on investigating the numerical properties of
the unified Kalman filter/smoother. Although we have shown that the unified
form is mathematically equivalent to the standard Kalman forms, we have
also observed numerical difficulties under certain conditions, so future research
should focus on characterizing and correcting those difficulties. Other future
research should focus on obtaining a computationally friendly equation for the
UFIR estimation error covariance. We have derived upper and lower bounds in
this paper, but the derivation of exact values remains for future work.
A Appendix
Estimation Error Covariance Upper Bound
Substitute x̂_{l−q} from (23) in (33) to obtain

$$P_{l-q} = E\{\epsilon_{l-q}\epsilon_{l-q}^T\}
        = E\{[F_{l-q}\epsilon_{l-q-1} + w_{l-q} - K_l(y_l - H_l Y_l \hat{x}_{l-q-1})]\,[\cdots]^T\}. \tag{A.1}$$
To find an iterative computation of (A.1), we need to express y_l in terms of
x_{l−q−1}. This can be done as follows:

$$\begin{aligned}
y_l &= H_l(F_l x_{l-1} + w_l) + v_l \\
    &= H_l F_l(F_{l-1} x_{l-2} + w_{l-1}) + H_l w_l + v_l \\
    &\;\;\vdots \\
    &= H_l Y_l x_{l-q-1} + H_l \sum_{j=1}^{q} Y_l(q-j)\, w_{l-q-1+j} + H_l w_l + v_l
\end{aligned} \tag{A.2}$$

which can be written as

$$y_l = H_l (Y_l x_{l-q-1} + M_l) + v_l, \quad
M_l \triangleq M_l(q) =
\begin{cases}
\sum_{j=1}^{q} Y_l(q-j)\, w_{l-q-1+j} + w_l & \text{if } q > 0 \\
w_l & \text{if } q = 0
\end{cases}. \tag{A.3}$$
Now we substitute (A.3) in (A.1) to get an iteration of P_l from l = m + K
to l = n. This iteration gives an upper bound for P_n because it is based on the
iteration of (23), but x̂_{l−q} in (23) is not optimal for any value of l except l = n
(assuming that N = N_opt). Substituting (A.3) in (A.1) gives
$$\begin{aligned}
P^{UB}_{l-q} &= E\{[F_{l-q}\epsilon_{l-q-1} + w_{l-q} - K_l y_l + K_l H_l Y_l \hat{x}_{l-q-1}][\cdots]^T\} \\
&= E\{[F_{l-q}\epsilon_{l-q-1} + w_{l-q} - K_l H_l Y_l x_{l-q-1} - K_l H_l M_l \\
&\qquad - K_l v_l + K_l H_l Y_l \hat{x}_{l-q-1}][\cdots]^T\} \\
&= E\{[F_{l-q}\epsilon_{l-q-1} - K_l H_l Y_l \epsilon_{l-q-1} + w_{l-q} - K_l H_l M_l - K_l v_l][\cdots]^T\} \\
&= E\{[(F_{l-q} - K_l H_l Y_l)\epsilon_{l-q-1} + w_{l-q} - K_l H_l M_l - K_l v_l][\cdots]^T\} \\
&= E\{[(F_{l-q} - K_l H_l Y_l)\epsilon_{l-q-1} + w_{l-q} - K_l(H_l M_l + v_l)][\cdots]^T\}.
\end{aligned} \tag{A.4}$$

Expanding this equation for q > 0, and using $E\{w_{l-q} M_l^T\} = Q_{l-q} Y_l^T(q-1)$, leads to

$$\begin{aligned}
P^{UB}_{l-q} &= (F_{l-q} - K_l H_l Y_l) P^{UB}_{l-q-1} (\cdots)^T + Q_{l-q} + K_l R_l K_l^T
  + K_l H_l E\{M_l M_l^T\} H_l^T K_l^T \\
&\qquad - E\{w_{l-q} M_l^T\} H_l^T K_l^T - K_l H_l E\{M_l w_{l-q}^T\} \\
&= (F_{l-q} - K_l H_l Y_l) P^{UB}_{l-q-1} (\cdots)^T + Q_{l-q} + K_l R_l K_l^T
  + K_l H_l (\bar{Q}_l + Q_l) H_l^T K_l^T \\
&\qquad - Q_{l-q} Y_l^T(q-1) H_l^T K_l^T - K_l H_l Y_l(q-1) Q_{l-q} \\
&= (I - K_l H_l Y_l(q-1))(F_{l-q} P^{UB}_{l-q-1} F_{l-q}^T + Q_{l-q})(\cdots)^T \\
&\qquad + K_l R_l K_l^T + K_l H_l [\bar{Q}_l + Q_l - Y_l(q-1) Q_{l-q} Y_l^T(q-1)] H_l^T K_l^T
\end{aligned} \tag{A.5}$$

where $\bar{Q}_l \triangleq \bar{Q}_l(q)$ can be written as

$$\bar{Q}_l = \sum_{j=1}^{q} Y_l(q-j)\, Q_{l-q-1+j}\, Y_l^T(q-j). \tag{A.6}$$
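The covariance identity E{M_l M_l^T} = Q̄_l + Q_l used in (A.5) can be checked numerically. The sketch below assumes a time-invariant system, under our reading of the notation in (A.2) that Y_l(k) reduces to F^(k+1) in the time-invariant case; the Monte Carlo setup and all names are ours.

```python
import numpy as np

dt, q = 0.1, 2
F = np.array([[1.0, dt], [0.0, 1.0]])
Q = np.diag([0.0, 1.0])
rng = np.random.default_rng(2)

# Time-invariant case: Y_l(k) = F^(k+1), so (A.3) gives
#   M = sum_{j=1}^{q} F^(q-j+1) w_j + w_{q+1},
# and (A.6) reduces to  Qbar = sum_{k=1}^{q} F^k Q (F^k)^T.
Qbar = sum(np.linalg.matrix_power(F, k) @ Q @ np.linalg.matrix_power(F, k).T
           for k in range(1, q + 1))
theory = Qbar + Q              # E{M M^T} = Qbar + Q, as used in (A.5)

# Monte Carlo estimate of E{M M^T}
n_samples = 200_000
w = np.zeros((n_samples, q + 1, 2))
w[:, :, 1] = rng.standard_normal((n_samples, q + 1))   # w ~ N(0, diag(0, 1))
M = sum(w[:, j - 1] @ np.linalg.matrix_power(F, q - j + 1).T
        for j in range(1, q + 1)) + w[:, q]
empirical = (M.T @ M) / n_samples
```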
References
[1] C. Ahn and P. Kim, “Fixed-lag maximum likelihood FIR smoother for
state-space models,” IEICE Electronics Express, vol. 5, no. 1, pp. 11–16,
January 2008.
[2] D. Alspach, “Gaussian sum approximations in nonlinear filtering and con-
trol,” Information Sciences, vol. 7, pp. 271–290, 1974.
[3] B. Anderson and J. Moore, Optimal Filtering, Mineola, New York: Dover,
2005.
[4] M. Athans, R. Whiting, and M. Gruber, “A suboptimal estimation algo-
rithm with probabilistic editing for false measurements with applications
to target tracking with wake phenomena,” IEEE Transactions on Auto-
matic Control, vol. AC-22, no. 3, pp. 372–384, June 1977.
[5] Y. Bar-Shalom, “Tracking methods in a multitarget environment,” IEEE
Transactions on Automatic Control, vol. AC-23, no. 4, pp. 618–626, Au-
gust 1978.
[6] C. Fonseca and P. Fleming, “Multiobjective optimization and multiple con-
straint handling with evolutionary algorithms. I. A unified formulation,”
IEEE Transactions on Systems, Man and Cybernetics, Part A: Systems
and Humans, vol. 28, no. 1, pp. 26–37, January 1998.
[7] C. F. Gauss, Theory of the Combination of Observations Least Subject to
Errors, Translated by G. W. Stewart, Philadelphia, Pennsylvania: SIAM
Publishing, 1995.
[8] J. Guerreiro and D. Trigueiros, “A Unified Approach to the Extraction
of Rules from Artificial Neural Networks and Support Vector Machines,”
Advanced Data Mining and Applications, (L. Cao, J. Zhong, and Y. Feng,
editors), pp. 34–42, Springer, 2010.
[9] S. Han and W. H. Kwon, “L2 – E FIR smoothers for deterministic discrete-
time state-space signal models,” IEEE Transactions on Automatic Control,
vol. 52, no. 5, pp. 927–932, May 2007.
[10] S. Han and W. H. Kwon, “A note on two-filter smoothing formulas,” IEEE
Transactions on Automatic Control, vol. 53, no. 3, pp. 849–854, March
2008.
[11] H. Heffes, “The effect of erroneous models on the Kalman filter response,”
IEEE Transactions on Automatic Control, vol. AC-11, pp. 541–543, July
1966.
[12] C. Hide, T. Moore, and M. Smith, “Adaptive Kalman filtering for low-cost
INS/GPS,” The Journal of Navigation, vol. 56, pp. 143–152, 2003.
[13] F. Hsiao and S. Pan, “Robust Kalman filter synthesis for uncertain mul-
tiple time-delay stochastic systems,” Journal of Dynamic Systems, Mea-
surement, and Control, vol. 118, pp. 803–808, December 1996.
[14] A. H. Jazwinski, Stochastic Processes and Filtering Theory, New York:
Academic, 1970.
[15] R. Kalman, “A new approach to linear filtering and prediction problems,”
ASME Journal of Basic Engineering, vol. 82, pp. 35–45, March 1960.
[16] S. Kosanam and D. Simon, “Kalman filtering with uncertain noise covari-
ances,” Intelligent Systems and Control, Honolulu, Hawaii, pp. 375–379,
August 2004.
[17] B. K. Kwon, S. Han, O. K. Kwon, and W. H. Kwon, “Minimum vari-
ance FIR smoothers for discrete-time state space models,” IEEE Signal
Processing Letters, vol. 14, no. 8, pp. 557–560, August 2007.
[18] C. Masreliez, “Approximate non-Gaussian filtering with linear state and
observation relations,” IEEE Transactions on Automatic Control, vol. AC-
20, pp. 107–110, February 1975.
[19] C. Masreliez and R. Martin, “Robust Bayesian estimation for the linear
model and robustifying the Kalman filter,” IEEE Transactions on Auto-
matic Control, vol. AC-22, no. 3, pp. 361–371, June 1977.
[20] R. Mehra, “Approaches to adaptive filtering,” IEEE Transactions on Au-
tomatic Control, vol. AC-17, no. 5, pp. 693–698, October 1972.
[21] R. Miller and L. Boxer, Algorithms Sequential and Parallel: A Unified
Approach, Pearson Education, 1999.
[22] A. Mohamed and K. Schwarz, “Adaptive Kalman Filtering for INS/GPS,”
Journal of Geodesy, vol. 73, no. 4, pp. 193–203, 1999.
[23] K. Myers and B. Tapley, “Adaptive sequential estimation with unknown
noise statistics,” IEEE Transactions on Automatic Control, vol. AC-21,
no. 4, pp. 520–523, August 1976.
[24] N. Nahi, “Optimal recursive estimation with uncertain observation,” IEEE
Transactions on Information Theory, vol. IT-15, no. 4, pp. 457–462, July
1969.
[25] T. Nishimura, “On the a priori information in sequential estimation prob-
lems,” IEEE Transactions on Automatic Control, vol. AC-11, pp. 197–204,
April 1966.
[26] H. Rauch, F. Tung, and C. Striebel, “Maximum likelihood estimates of
linear dynamic systems,” AIAA Journal, vol. 3, no. 8, pp. 1445–1450,
August 1965.
[27] Y. S. Shmaliy and L. Morales-Mendoza, “FIR smoothing of discrete-time
polynomial signals in state space,” IEEE Transactions on Signal Process-
ing, vol. 58, no. 5, pp. 2544–2555, May 2010.
[28] Y. S. Shmaliy, “Linear optimal FIR estimation of discrete time-invariant
state-space models,” IEEE Transactions on Signal Processing, vol. 58,
no. 6, pp. 3086–3096, June 2010.
[29] Y. S. Shmaliy and O. Ibarra-Manzano, “Noise power gain for discrete-time
FIR estimators,” IEEE Signal Processing Letters, vol. 18, no. 4, pp. 207–
210, April 2011.
[30] Y. S. Shmaliy, “An iterative Kalman-like algorithm ignoring noise and
initial conditions,” IEEE Transactions on Signal Processing, vol. 59, no. 6,
pp. 2465–2473, June 2011.
[31] Y. S. Shmaliy and O. Ibarra-Manzano, “Time-variant linear optimal finite
impulse response estimator for discrete-time state-space models,” Interna-
tional Journal of Adaptive Control and Signal Processing, vol. 26, no. 2,
pp. 95–104, February 2012.
[32] Y. S. Shmaliy, “Suboptimal FIR filtering of nonlinear models in additive
white Gaussian noise,” IEEE Transactions on Signal Processing, in print.
[33] D. Simon, Optimal State Estimation: Kalman, H∞, and Nonlinear Ap-
proaches, Hoboken, NJ: John Wiley & Sons, 2006.
[34] D. Simon, Unified Forms for Kalman and Finite Impulse Response Fil-
tering and Smoothing [Online], http://academic.csuohio.edu/simond/
UnifiedKalmanFIR, August 2012.
[35] R. Singer and R. Sea, “New results in optimizing surveillance system track-
ing and data correlation performance in dense multitarget environment,”
IEEE Transactions on Automatic Control, vol. 18, no. 6, pp. 571–582,
December 1973.
[36] T. Soong, “On a priori statistics in minimum-variance estimation prob-
lems,” ASME Journal of Basic Engineering, vol. 87D, pp. 109–112, March
1965.
[37] J. Spall (editor), Bayesian Analysis of Time Series and Dynamic Models,
New York: Marcel Dekker, 1988.
[38] H. Stark and J. W. Woods, Probability, Random Processes, and Estima-
tion Theory for Engineers, 2nd edition, Upper Saddle River, New Jersey:
Prentice Hall, 1994.
[39] Y. Theodor and U. Shaked, “Robust discrete-time minimum-variance fil-
tering,” IEEE Transactions on Signal Processing, vol. 44, no. 2, pp. 181–
189, February 1996.
[40] C. Tsai and L. Kurz, “An adaptive robustizing approach to Kalman filter-
ing,” Automatica, vol. 19, no. 3, pp. 278–288, May 1983.
[41] F. Wang and V. Balakrishnan, “Robust Kalman filters for linear time-
varying systems with stochastic parametric uncertainties,” IEEE Trans-
actions on Signal Processing, vol. 50, no. 4, pp. 803–813, 2002.
[42] X. Wang, “NFIR nonlinear filter,” IEEE Transactions on Signal Process-
ing, vol. 39, no. 7, pp. 1705–1708, July 1991.
[43] L. Xie, L. Lu, D. Zhang, and H. Zhang, “Improved robust H2 and H∞
filtering for uncertain discrete-time systems,” Automatica, vol. 40, no. 5,
pp. 873–880, May 2004.
[44] Y. Yang and W. Gao, “An Optimal Adaptive Kalman Filter,” Journal of
Geodesy, vol. 80, no. 4, pp. 177–183, 2006.
[45] J.-T. Yuan and J. A. Stuller, “Order-recursive FIR smoothers,” IEEE
Transactions on Signal Processing, vol. 42, no. 5, pp. 1242–1246, May
1994.
[46] X. Zhang, A. Heemink, and J. Van Eijkeren, “Performance robustness
analysis of Kalman filter for linear discrete-time systems under plant and
noise uncertainty,” International Journal of Systems Science, vol. 26, no. 2,
pp. 257–275, February 1995.
[47] X. Zhou and X. Wang, “FIR-median hybrid filters with polynomial fit-
ting,” Digital Signal Processing, vol. 39, no. 2, pp. 112–124, March 2004.