Applied Mathematical Modelling 36 (2012) 1842–1853
Maximum likelihood least squares identification for systems with autoregressive moving average noise
Wei Wang a, Feng Ding b,c,*, Jiyang Dai c
a Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China
b Control Science and Engineering Research Center, Jiangnan University, Wuxi 214122, China
c School of Information Engineering, Nanchang Hangkong University, Nanchang 330063, China
Article info
Article history: Received 12 January 2011; Received in revised form 14 July 2011; Accepted 27 July 2011; Available online 18 August 2011
Keywords: Least squares; Maximum likelihood; Parameter estimation; Recursive identification; CARARMA system
0307-904X/$ - see front matter © 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.apm.2011.07.083
This work was supported in part by the National Natural Science Foundation of China (60973043).
* Corresponding author at: Control Science and Engineering Research Center, Jiangnan University, Wuxi 214122, PR China.
E-mail addresses: [email protected] (W. Wang), [email protected] (F. Ding), [email protected] (J. Dai).
Abstract
Maximum likelihood methods are important for system modeling and parameter estimation. This paper derives a recursive maximum likelihood least squares identification algorithm for systems with autoregressive moving average noise, based on the maximum likelihood principle. In the derivation, we prove that maximizing the likelihood function is equivalent to minimizing a least squares cost function. The proposed algorithm differs from the corresponding generalized extended least squares algorithm. A simulation test shows that the proposed algorithm achieves higher estimation accuracy than the recursive generalized extended least squares algorithm.
1. Introduction
Mathematical modeling has been receiving much attention in parameter estimation and identification [1,2]. Maximum likelihood estimation methods are an important class of approaches for dynamical system identification; they have a long history [3] and have been applied in many areas, such as asset pricing in finance [4], image texture analysis [5], speech recognition [6] and spatial analysis [7]. In recent years, many likelihood estimation methods have been developed for different models. For example, Söderström et al. analyzed the accuracy of the time-domain maximum likelihood method and the sample maximum likelihood method for errors-in-variables and output error identification [8]; Agüero et al. discussed the relationship between the time- and frequency-domain maximum likelihood estimation methods [9]. In this paper, we discuss a maximum likelihood least squares method for equation error models with autoregressive moving average (ARMA) noises.
For systems with stochastic noises, least squares and gradient search methods are commonly used, e.g., the auxiliary model based least squares methods [10–15] and the gradient based iterative method [14,16] for output error type models. For equation error type models, Ding et al. discussed a least squares method for dual-rate ARX models [17]; Ding et al. presented iterative and recursive least squares methods for Hammerstein nonlinear ARMAX systems [18]; Han et al. developed a hierarchical least squares based iterative identification method for multivariable CARMA-like models [19]; Zhang et al. proposed a hierarchical gradient based iterative parameter estimation algorithm for multivariable output error moving average (OEMA) systems [20]. Wang et al. gave an extended stochastic gradient identification algorithm for Hammerstein–Wiener
ARMAX systems [21]; Chen and Ding presented modified stochastic gradient algorithms with fast convergence rates [22]; Xiao et al. presented a residual-based extended stochastic gradient algorithm for ARMAX models [23]; Ding et al. presented partially coupled stochastic gradient identification methods for non-uniformly sampled systems [24]; Wang et al. developed an input–output data filtering based least squares method for CARARMA models [25]. This paper proposes a maximum likelihood least squares method for CARARMA models, based on the maximum likelihood principle.
The remainder of this paper is organized as follows. Section 2 derives the maximum likelihood objective function of CARARMA models according to the maximum likelihood principle. Section 3 derives a maximum likelihood least squares algorithm for identifying the parameters of CARARMA models. Section 4 briefly gives a recursive generalized extended least squares algorithm for comparison. An illustrative example is given in Section 5, and concluding remarks are summarized in Section 6.
2. The basic principle
In previous work [26], Wang et al. studied the maximum likelihood identification method for dynamic adjustment models, i.e., the controlled autoregressive autoregressive (CARAR) models or equation error autoregressive models,
$$A(z)y(t) = B(z)u(t) + \frac{1}{C(z)}v(t).$$
This paper considers the following controlled autoregressive autoregressive moving average systems (CARARMA systems for short) [3,25],
$$A(z)y(t) = B(z)u(t) + \frac{D(z)}{C(z)}v(t), \tag{1}$$
where u(t) and y(t) are the system input and output, respectively, v(t) is a white noise with the normal distribution $\{v(t)\} \sim N(0, \sigma^2)$, and A(z), B(z), C(z) and D(z) are polynomials in $z^{-1}$, defined by
$$\begin{aligned}
A(z) &= 1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_{n_a} z^{-n_a},\\
B(z) &= b_1 z^{-1} + b_2 z^{-2} + \cdots + b_{n_b} z^{-n_b},\\
C(z) &= 1 + c_1 z^{-1} + c_2 z^{-2} + \cdots + c_{n_c} z^{-n_c},\\
D(z) &= 1 + d_1 z^{-1} + d_2 z^{-2} + \cdots + d_{n_d} z^{-n_d}.
\end{aligned}$$
Assume that the degrees $n_a$, $n_b$, $n_c$ and $n_d$ are known and that y(t) = 0, u(t) = 0 and v(t) = 0 for $t \le 0$. Define the inner variable,
$$w(t) := \frac{D(z)}{C(z)}v(t), \tag{2}$$
which is an ARMA process. Define the parameter vector $\theta$ and the information vector $\varphi(t)$ as
$$\theta := \begin{bmatrix} \theta_s \\ \theta_n \end{bmatrix} \in \mathbb{R}^n, \quad n := n_a + n_b + n_c + n_d,$$
$$\theta_s := [a_1, a_2, \ldots, a_{n_a}, b_1, b_2, \ldots, b_{n_b}]^{\mathrm T} \in \mathbb{R}^{n_a+n_b},$$
$$\theta_n := [c_1, c_2, \ldots, c_{n_c}, d_1, d_2, \ldots, d_{n_d}]^{\mathrm T} \in \mathbb{R}^{n_c+n_d},$$
$$\varphi(t) := \begin{bmatrix} \varphi_s(t) \\ \varphi_n(t) \end{bmatrix} \in \mathbb{R}^n,$$
$$\varphi_s(t) := [-y(t-1), -y(t-2), \ldots, -y(t-n_a), u(t-1), u(t-2), \ldots, u(t-n_b)]^{\mathrm T} \in \mathbb{R}^{n_a+n_b},$$
$$\varphi_n(t) := [-w(t-1), -w(t-2), \ldots, -w(t-n_c), v(t-1), v(t-2), \ldots, v(t-n_d)]^{\mathrm T} \in \mathbb{R}^{n_c+n_d}.$$
Eq. (2) can be written as
$$w(t) = [1 - C(z)]w(t) + D(z)v(t) = \varphi_n^{\mathrm T}(t)\theta_n + v(t). \tag{3}$$
Here the subscripts s and n denote the first letters of the words 'system' and 'noise', respectively. Using (2) and (3), Eq. (1) can be rewritten as
$$\begin{aligned}
y(t) &= [1 - A(z)]y(t) + B(z)u(t) + w(t) = \varphi_s^{\mathrm T}(t)\theta_s + w(t) &(4)\\
&= \varphi_s^{\mathrm T}(t)\theta_s + \varphi_n^{\mathrm T}(t)\theta_n + v(t) = \varphi^{\mathrm T}(t)\theta + v(t), &(5)
\end{aligned}$$
or
$$y(t) = -\sum_{i=1}^{n_a} a_i y(t-i) + \sum_{i=1}^{n_b} b_i u(t-i) - \sum_{i=1}^{n_c} c_i w(t-i) + \sum_{i=1}^{n_d} d_i v(t-i) + v(t). \tag{6}$$
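As a concrete illustration, the regression form (6) can be used directly to simulate a CARARMA data set. The sketch below is ours, not part of the original presentation; it assumes NumPy is available, uses the coefficient values of the example system in Section 5, and takes zero initial conditions, consistent with y(t) = u(t) = v(t) = 0 for t ≤ 0:

```python
import numpy as np

rng = np.random.default_rng(0)

# Coefficients of A(z), B(z), C(z), D(z) (the example system of Section 5)
a = [0.23, 0.90]      # a_1, a_2
b = [-0.85, 0.60]     # b_1, b_2
c = [0.62]            # c_1
d = [-0.36]           # d_1

N = 3000
u = rng.standard_normal(N)      # input: zero mean, unit variance
v = rng.standard_normal(N)      # white noise v(t)

w = np.zeros(N)                 # inner ARMA variable w(t), Eq. (2)
y = np.zeros(N)
for t in range(N):
    # w(t) = -sum_i c_i w(t-i) + sum_i d_i v(t-i) + v(t), cf. Eq. (3)
    w[t] = v[t]
    for i, ci in enumerate(c, start=1):
        if t - i >= 0:
            w[t] -= ci * w[t - i]
    for i, di in enumerate(d, start=1):
        if t - i >= 0:
            w[t] += di * v[t - i]
    # y(t) = -sum_i a_i y(t-i) + sum_i b_i u(t-i) + w(t), cf. Eq. (6)
    y[t] = w[t]
    for i, ai in enumerate(a, start=1):
        if t - i >= 0:
            y[t] -= ai * y[t - i]
    for i, bi in enumerate(b, start=1):
        if t - i >= 0:
            y[t] += bi * u[t - i]
```

Because the roots of A(z) and C(z) lie inside the unit circle for these values, the generated sequences remain bounded.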
Since the observed values {u(i), y(i)} for i < t and the parameter vector $\theta$ are uncorrelated with v(t) for a causal system, the joint conditional probability density function of y(1), y(2), ..., y(L) given $\theta$ and u(1), u(2), ..., u(L-1) can be expressed as
$$\begin{aligned}
p(y(1), y(2), \ldots, y(L) \mid u(1), u(2), \ldots, u(L-1), \theta)
&= p(y(L) \mid y(1), \ldots, y(L-1), u(1), \ldots, u(L-1), \theta)\,
   p(y(L-1) \mid y(1), \ldots, y(L-2), u(1), \ldots, u(L-1), \theta)
   \cdots p(y(1) \mid y(0), u(0), \theta)\\
&= \prod_{t=1}^{L} p(y(t) \mid y(1), \ldots, y(t-1), u(1), \ldots, u(t-1), \theta)\\
&= \prod_{t=1}^{L} p\Big(-\sum_{i=1}^{n_a} a_i y(t-i) + \sum_{i=1}^{n_b} b_i u(t-i)
   - \sum_{i=1}^{n_c} c_i w(t-i) + \sum_{i=1}^{n_d} d_i v(t-i) + v(t)
   \,\Big|\, y(1), \ldots, y(t-1), u(1), \ldots, u(t-1), \theta\Big)\\
&= k \prod_{t=1}^{L} p(v(t))
 = \frac{k}{(\sqrt{2\pi\sigma^2})^{L}} \exp\Big(-\frac{1}{2\sigma^2}\sum_{t=1}^{L} v^2(t)\Big),
\end{aligned} \tag{7}$$
where k is a constant. Let
$$y^L := \{y(1), y(2), \ldots, y(L)\}, \quad u^{L-1} := \{u(1), u(2), \ldots, u(L-1)\}.$$
From (7), we define the logarithm of $p(y^L \mid u^{L-1})$ as the log-likelihood function,
$$l(y^L \mid u^{L-1}, \theta) := \ln p(y^L \mid u^{L-1}, \theta) = \ln k - \frac{L}{2}\ln 2\pi - \frac{L}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^{L} v^2(t). \tag{8}$$
According to the maximum likelihood principle, the maximum likelihood estimate $\hat\sigma^2$ of the noise variance maximizes $l(y^L \mid u^{L-1}, \theta)$, and thus setting the derivative to zero gives
$$\left.\frac{\partial l(y^L \mid u^{L-1}, \theta)}{\partial \sigma^2}\right|_{\hat\sigma^2} = 0, \tag{9}$$
whose solution is given by
$$\hat\sigma^2 = \frac{1}{L}\sum_{t=1}^{L} v^2(t). \tag{10}$$
Substituting (10) into (8) gives
$$l(y^L \mid u^{L-1}, \theta) = \ln k - \frac{L}{2}\ln 2\pi - \frac{L}{2}\ln\Big(\frac{1}{L}\sum_{t=1}^{L} v^2(t)\Big) - \frac{L}{2} = k_1 - \frac{L}{2}\ln\Big(\frac{1}{L}\sum_{t=1}^{L} v^2(t)\Big), \tag{11}$$
where $k_1 := \ln k - \frac{L}{2}\ln 2\pi - \frac{L}{2}$. From (11), it is clear that the maximum of $l(y^L \mid u^{L-1}, \theta)$ over $\theta$ can be achieved by minimizing the following objective function,
$$J(\theta) := \frac{1}{2}\sum_{t=1}^{L} v^2(t)\Big|_{\theta}, \tag{12}$$
where v(t) is given by
$$v(t) = \frac{C(z)}{D(z)}[A(z)y(t) - B(z)u(t)]. \tag{13}$$
Thus, the maximum likelihood estimate $\hat\theta$ of the CARARMA models can be obtained by minimizing the objective function $J(\theta)$; in fact, it is a least squares objective function.
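To make (12) and (13) concrete, the residual v(t) can be computed by first forming e(t) = A(z)y(t) - B(z)u(t) and then filtering it through C(z)/D(z). The helpers below are hypothetical (our own names, assuming NumPy and zero initial conditions), a sketch rather than the paper's implementation:

```python
import numpy as np

def residuals(y, u, a, b, c, d):
    """v(t) = (C(z)/D(z))[A(z)y(t) - B(z)u(t)], cf. Eq. (13),
    with zero initial conditions; a, b, c, d are coefficient lists."""
    N = len(y)
    e = np.zeros(N)   # e(t) = A(z)y(t) - B(z)u(t)
    v = np.zeros(N)
    for t in range(N):
        e[t] = y[t]
        for i, ai in enumerate(a, start=1):
            if t - i >= 0:
                e[t] += ai * y[t - i]
        for i, bi in enumerate(b, start=1):
            if t - i >= 0:
                e[t] -= bi * u[t - i]
    for t in range(N):
        # D(z)v(t) = C(z)e(t)  =>  v(t) = e(t) + sum c_i e(t-i) - sum d_i v(t-i)
        v[t] = e[t]
        for i, ci in enumerate(c, start=1):
            if t - i >= 0:
                v[t] += ci * e[t - i]
        for i, di in enumerate(d, start=1):
            if t - i >= 0:
                v[t] -= di * v[t - i]
    return v

def objective(y, u, a, b, c, d):
    """The least squares cost J(theta) of Eq. (12)."""
    v = residuals(y, u, a, b, c, d)
    return 0.5 * float(np.sum(v ** 2))
```

On a noise-free data set, J vanishes at the true parameters; minimizing J over the coefficients yields the maximum likelihood estimate.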
3. The recursive maximum likelihood least squares algorithm
This section derives the recursive maximum likelihood least squares algorithm from the objective function $J(\theta)$ in (12). Let I be an identity matrix of appropriate size and $\mathbf{1}_n$ be an n-dimensional column vector whose entries are all 1. From (12), the objective function can be written in the recursive form
$$J(\theta, t) = J(\theta, t-1) + \frac{1}{2}v^2(t). \tag{14}$$
The following derivation is similar to that of the estimation algorithm for CARAR systems [26]. Using the first-order Taylor expansion, v(t) at $\theta = \hat\theta(t-1)$ can be approximately expressed as
$$v(t) \approx v(t)|_{\hat\theta(t-1)} + \left.\left[\frac{\partial v(t)}{\partial \theta}\right]^{\mathrm T}\right|_{\hat\theta(t-1)}[\theta - \hat\theta(t-1)]. \tag{15}$$
Let $\hat\theta(t)$ be the maximum likelihood estimate of $\theta$ at time t, i.e.,
$$\hat\theta(t) := \begin{bmatrix} \hat\theta_s(t) \\ \hat\theta_n(t) \end{bmatrix} \in \mathbb{R}^n,$$
$$\hat\theta_s(t) := [\hat a_1(t), \hat a_2(t), \ldots, \hat a_{n_a}(t), \hat b_1(t), \hat b_2(t), \ldots, \hat b_{n_b}(t)]^{\mathrm T} \in \mathbb{R}^{n_a+n_b},$$
$$\hat\theta_n(t) := [\hat c_1(t), \hat c_2(t), \ldots, \hat c_{n_c}(t), \hat d_1(t), \hat d_2(t), \ldots, \hat d_{n_d}(t)]^{\mathrm T} \in \mathbb{R}^{n_c+n_d}.$$
Define the estimates of the polynomials A(z), B(z), C(z) and D(z) at time t as
$$\begin{aligned}
\hat A(t, z) &= 1 + \hat a_1(t)z^{-1} + \hat a_2(t)z^{-2} + \cdots + \hat a_{n_a}(t)z^{-n_a},\\
\hat B(t, z) &= \hat b_1(t)z^{-1} + \hat b_2(t)z^{-2} + \cdots + \hat b_{n_b}(t)z^{-n_b},\\
\hat C(t, z) &= 1 + \hat c_1(t)z^{-1} + \hat c_2(t)z^{-2} + \cdots + \hat c_{n_c}(t)z^{-n_c},\\
\hat D(t, z) &= 1 + \hat d_1(t)z^{-1} + \hat d_2(t)z^{-2} + \cdots + \hat d_{n_d}(t)z^{-n_d}.
\end{aligned}$$
Define the information vector,
$$\varphi_f(t) := -\left.\frac{\partial v(t)}{\partial \theta}\right|_{\hat\theta(t-1)} = -\left.\left[\frac{\partial v(t)}{\partial a_1}, \ldots, \frac{\partial v(t)}{\partial a_{n_a}}, \frac{\partial v(t)}{\partial b_1}, \ldots, \frac{\partial v(t)}{\partial b_{n_b}}, \frac{\partial v(t)}{\partial c_1}, \ldots, \frac{\partial v(t)}{\partial c_{n_c}}, \frac{\partial v(t)}{\partial d_1}, \ldots, \frac{\partial v(t)}{\partial d_{n_d}}\right]^{\mathrm T}\right|_{\hat\theta(t-1)} \in \mathbb{R}^n. \tag{16}$$
Computing the partial derivatives of v(t) in (13) with respect to $a_j$, $b_j$, $c_j$ and $d_j$ at $\theta = \hat\theta(t-1)$ gives
$$\left.\frac{\partial v(t)}{\partial a_j}\right|_{\hat\theta(t-1)} = \frac{\hat C(t-1, z)}{\hat D(t-1, z)}\,y(t-j) =: z^{-j}y_f(t),$$
$$\left.\frac{\partial v(t)}{\partial b_j}\right|_{\hat\theta(t-1)} = -\frac{\hat C(t-1, z)}{\hat D(t-1, z)}\,u(t-j) =: -z^{-j}u_f(t),$$
$$\left.\frac{\partial v(t)}{\partial c_j}\right|_{\hat\theta(t-1)} = \frac{\hat A(t-1, z)}{\hat D(t-1, z)}\,y(t-j) - \frac{\hat B(t-1, z)}{\hat D(t-1, z)}\,u(t-j) =: z^{-j}w_f(t),$$
$$\left.\frac{\partial v(t)}{\partial d_j}\right|_{\hat\theta(t-1)} = -\frac{1}{\hat D(t-1, z)}\,v(t-j) =: -z^{-j}v_f(t),$$
where $y_f(t)$, $u_f(t)$, $v_f(t)$ and $w_f(t)$ are defined by
$$y_f(t) := \frac{\hat C(t-1, z)}{\hat D(t-1, z)}\,y(t), \quad u_f(t) := \frac{\hat C(t-1, z)}{\hat D(t-1, z)}\,u(t),$$
$$w_f(t) := \frac{\hat A(t-1, z)}{\hat D(t-1, z)}\,y(t) - \frac{\hat B(t-1, z)}{\hat D(t-1, z)}\,u(t), \quad v_f(t) := \frac{1}{\hat D(t-1, z)}\,v(t).$$
Their recursive relations are given by
$$y_f(t) = y(t) + \hat c_1(t-1)y(t-1) + \cdots + \hat c_{n_c}(t-1)y(t-n_c) - \hat d_1(t-1)y_f(t-1) - \cdots - \hat d_{n_d}(t-1)y_f(t-n_d),$$
$$u_f(t) = u(t) + \hat c_1(t-1)u(t-1) + \cdots + \hat c_{n_c}(t-1)u(t-n_c) - \hat d_1(t-1)u_f(t-1) - \cdots - \hat d_{n_d}(t-1)u_f(t-n_d),$$
$$w_f(t) = y(t) + \hat a_1(t-1)y(t-1) + \cdots + \hat a_{n_a}(t-1)y(t-n_a) - \hat b_1(t-1)u(t-1) - \cdots - \hat b_{n_b}(t-1)u(t-n_b) - \hat d_1(t-1)w_f(t-1) - \cdots - \hat d_{n_d}(t-1)w_f(t-n_d),$$
$$v_f(t) = \hat v(t) - \hat d_1(t-1)v_f(t-1) - \hat d_2(t-1)v_f(t-2) - \cdots - \hat d_{n_d}(t-1)v_f(t-n_d).$$
Thus from (16), we have
$$\varphi_f(t) = [-y_f(t-1), \ldots, -y_f(t-n_a), u_f(t-1), \ldots, u_f(t-n_b), -w_f(t-1), \ldots, -w_f(t-n_c), v_f(t-1), \ldots, v_f(t-n_d)]^{\mathrm T} \in \mathbb{R}^n.$$
Applying the Taylor series expansion to $J(\theta, t-1)$ gives
$$J(\theta, t-1) \approx \left.\left[\frac{\partial J(\theta, t-1)}{\partial \theta}\right]^{\mathrm T}\right|_{\hat\theta(t-1)}[\theta - \hat\theta(t-1)] + \frac{1}{2}[\theta - \hat\theta(t-1)]^{\mathrm T}\left.\frac{\partial^2 J(\theta, t-1)}{\partial \theta\,\partial \theta^{\mathrm T}}\right|_{\hat\theta(t-1)}[\theta - \hat\theta(t-1)] + \frac{1}{2}g(t), \tag{17}$$
where the variable g(t) is the residual of the Taylor expansion of $J(\theta, t-1)$. Since the first-order derivative of $J(\theta, t-1)$ at $\theta = \hat\theta(t-1)$ approximately equals zero, and
$$P^{-1}(t-1) := \left.\frac{\partial^2 J(\theta, t-1)}{\partial \theta\,\partial \theta^{\mathrm T}}\right|_{\hat\theta(t-1)}$$
is a positive-definite matrix, Eq. (14) can be written as
$$J(\theta, t) \approx \frac{1}{2}[\theta - \hat\theta(t-1)]^{\mathrm T}P^{-1}(t-1)[\theta - \hat\theta(t-1)] + \frac{1}{2}g(t) + \frac{1}{2}v^2(t). \tag{18}$$
From (18) and (14), we have
$$\begin{aligned}
2J(\theta, t) &\approx [\theta - \hat\theta(t-1)]^{\mathrm T}P^{-1}(t-1)[\theta - \hat\theta(t-1)] + v^2(t) + g(t)\\
&= [\theta - \hat\theta(t-1)]^{\mathrm T}P^{-1}(t-1)[\theta - \hat\theta(t-1)] + g(t) + \left[v(t)|_{\hat\theta(t-1)} + \left.\left[\frac{\partial v(t)}{\partial \theta}\right]^{\mathrm T}\right|_{\hat\theta(t-1)}[\theta - \hat\theta(t-1)]\right]^2\\
&= [\theta - \hat\theta(t-1)]^{\mathrm T}[P^{-1}(t-1) + \varphi_f(t)\varphi_f^{\mathrm T}(t)][\theta - \hat\theta(t-1)] - 2\hat v(t)\varphi_f^{\mathrm T}(t)[\theta - \hat\theta(t-1)] + \hat v^2(t) + g(t).
\end{aligned}$$
The second-order derivative of $J(\theta, t)$ in (14) with respect to $\theta$ is
$$P^{-1}(t) = \frac{\partial^2 J(\theta, t)}{\partial \theta\,\partial \theta^{\mathrm T}} = \frac{\partial^2 J(\theta, t-1)}{\partial \theta\,\partial \theta^{\mathrm T}} + v(t)\frac{\partial^2 v(t)}{\partial \theta\,\partial \theta^{\mathrm T}} + \frac{\partial v(t)}{\partial \theta}\left[\frac{\partial v(t)}{\partial \theta}\right]^{\mathrm T}. \tag{19}$$
Because the second-order derivative of v(t) with respect to $\theta$ at $\theta = \hat\theta(t-1)$ is zero, Eq. (19) can be written as
$$P^{-1}(t) = P^{-1}(t-1) + \varphi_f(t)\varphi_f^{\mathrm T}(t). \tag{20}$$
Applying the matrix inversion lemma
$$(A + BC)^{-1} = A^{-1} - A^{-1}B(I + CA^{-1}B)^{-1}CA^{-1}$$
to (20) gives
$$P(t) = P(t-1) - \frac{P(t-1)\varphi_f(t)\varphi_f^{\mathrm T}(t)P(t-1)}{1 + \varphi_f^{\mathrm T}(t)P(t-1)\varphi_f(t)}.$$
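This rank-one update avoids inverting an n x n matrix at every step. The small NumPy sketch below (our own check, not from the paper) verifies it against a direct inversion of (20):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
Q = rng.standard_normal((n, n))
P_prev = np.linalg.inv(Q @ Q.T + n * np.eye(n))   # a positive-definite P(t-1)
phi_f = rng.standard_normal(n)

# Direct inversion of P^{-1}(t) = P^{-1}(t-1) + phi_f phi_f^T, Eq. (20)
P_direct = np.linalg.inv(np.linalg.inv(P_prev) + np.outer(phi_f, phi_f))

# Rank-one update via the matrix inversion lemma (no n x n inversion needed)
denom = 1.0 + phi_f @ P_prev @ phi_f
P_update = P_prev - np.outer(P_prev @ phi_f, P_prev @ phi_f) / denom

assert np.allclose(P_direct, P_update)
```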
Let
$$L(t) := P(t)\varphi_f(t), \quad g^*(t) := -[L(t)\hat v(t)]^{\mathrm T}P^{-1}(t)[L(t)\hat v(t)] + \hat v^2(t) + g(t).$$
Thus, we have
$$L(t) = \frac{P(t-1)\varphi_f(t)}{1 + \varphi_f^{\mathrm T}(t)P(t-1)\varphi_f(t)}.$$
The objective function $J(\theta, t)$ can be written as
$$\begin{aligned}
2J(\theta, t) &\approx [\theta - \hat\theta(t-1) - L(t)\hat v(t)]^{\mathrm T}P^{-1}(t)[\theta - \hat\theta(t-1) - L(t)\hat v(t)] - [L(t)\hat v(t)]^{\mathrm T}P^{-1}(t)[L(t)\hat v(t)] + \hat v^2(t) + g(t)\\
&= [\theta - \hat\theta(t-1) - L(t)\hat v(t)]^{\mathrm T}P^{-1}(t)[\theta - \hat\theta(t-1) - L(t)\hat v(t)] + g^*(t).
\end{aligned}$$
Minimizing $2J(\theta, t)$ gives the estimate $\hat\theta(t)$ of $\theta$ as follows:
$$\hat\theta(t) = \hat\theta(t-1) + L(t)\hat v(t).$$
Define
$$\hat\varphi(t) := \begin{bmatrix} \varphi_s(t) \\ \hat\varphi_n(t) \end{bmatrix} \in \mathbb{R}^n, \quad \hat\varphi_n(t) := [-\hat w(t-1), \ldots, -\hat w(t-n_c), \hat v(t-1), \ldots, \hat v(t-n_d)]^{\mathrm T} \in \mathbb{R}^{n_c+n_d}.$$
Replacing $\theta_s$ with $\hat\theta_s(t-1)$ in (4), the estimate $\hat w(t)$ can be computed by
$$\hat w(t) = y(t) - \varphi_s^{\mathrm T}(t)\hat\theta_s(t-1).$$
Replacing $\varphi(t)$ and $\theta$ in (5) with their estimates $\hat\varphi(t)$ and $\hat\theta(t-1)$, the estimate $\hat v(t)$ of v(t) can be computed by
$$\hat v(t) = v(t)|_{\hat\theta(t-1)} = y(t) - \hat\varphi^{\mathrm T}(t)\hat\theta(t-1). \tag{21}$$
Thus the maximum likelihood recursive least squares (ML-RLS) identification algorithm for the controlled autoregressive autoregressive moving average system can be expressed as
$$\hat\theta(t) = \hat\theta(t-1) + L(t)\hat v(t), \tag{22}$$
$$\hat v(t) = y(t) - \hat\varphi^{\mathrm T}(t)\hat\theta(t-1), \tag{23}$$
$$L(t) = P(t-1)\hat\varphi_f(t)[1 + \hat\varphi_f^{\mathrm T}(t)P(t-1)\hat\varphi_f(t)]^{-1}, \tag{24}$$
$$P(t) = [I - L(t)\hat\varphi_f^{\mathrm T}(t)]P(t-1), \quad P(0) = p_0 I, \tag{25}$$
$$\hat\varphi_f(t) = [-y_f(t-1), \ldots, -y_f(t-n_a), u_f(t-1), \ldots, u_f(t-n_b), -w_f(t-1), \ldots, -w_f(t-n_c), v_f(t-1), \ldots, v_f(t-n_d)]^{\mathrm T}, \tag{26}$$
$$\varphi_s(t) = [-y(t-1), -y(t-2), \ldots, -y(t-n_a), u(t-1), u(t-2), \ldots, u(t-n_b)]^{\mathrm T}, \tag{27}$$
$$\hat\varphi_n(t) = [-\hat w(t-1), \ldots, -\hat w(t-n_c), \hat v(t-1), \ldots, \hat v(t-n_d)]^{\mathrm T}, \tag{28}$$
$$\hat w(t) = y(t) - \varphi_s^{\mathrm T}(t)\hat\theta_s(t-1), \tag{29}$$
$$\hat\varphi(t) = \begin{bmatrix} \varphi_s(t) \\ \hat\varphi_n(t) \end{bmatrix}, \tag{30}$$
$$y_f(t) = y(t) + \hat c_1(t-1)y(t-1) + \cdots + \hat c_{n_c}(t-1)y(t-n_c) - \hat d_1(t-1)y_f(t-1) - \cdots - \hat d_{n_d}(t-1)y_f(t-n_d), \tag{31}$$
$$u_f(t) = u(t) + \hat c_1(t-1)u(t-1) + \cdots + \hat c_{n_c}(t-1)u(t-n_c) - \hat d_1(t-1)u_f(t-1) - \cdots - \hat d_{n_d}(t-1)u_f(t-n_d), \tag{32}$$
$$w_f(t) = y(t) + \hat a_1(t-1)y(t-1) + \cdots + \hat a_{n_a}(t-1)y(t-n_a) - \hat b_1(t-1)u(t-1) - \cdots - \hat b_{n_b}(t-1)u(t-n_b) - \hat d_1(t-1)w_f(t-1) - \cdots - \hat d_{n_d}(t-1)w_f(t-n_d), \tag{33}$$
$$v_f(t) = \hat v(t) - \hat d_1(t-1)v_f(t-1) - \cdots - \hat d_{n_d}(t-1)v_f(t-n_d), \tag{34}$$
$$\hat\theta(t) = \begin{bmatrix} \hat\theta_s(t) \\ \hat\theta_n(t) \end{bmatrix}, \tag{35}$$
$$\hat\theta_s(t) = [\hat a_1(t), \ldots, \hat a_{n_a}(t), \hat b_1(t), \ldots, \hat b_{n_b}(t)]^{\mathrm T}, \tag{36}$$
$$\hat\theta_n(t) = [\hat c_1(t), \ldots, \hat c_{n_c}(t), \hat d_1(t), \ldots, \hat d_{n_d}(t)]^{\mathrm T}. \tag{37}$$
The steps involved in the ML-RLS algorithm are listed as follows:
1. Let t = 1, set the initial values $\hat\theta(0) = \mathbf{1}_n/p_0$, $P(0) = p_0 I$, $\hat v(0) = 1/p_0$, $\hat w(0) = 1/p_0$ and $y_f(i) = 1/p_0$, $u_f(i) = 1/p_0$, $w_f(i) = 1/p_0$, $v_f(i) = 1/p_0$ for $i \le 0$, where $p_0$ is a large number (e.g., $p_0 = 10^6$).
2. Collect the input–output data u(t) and y(t), and form the information vectors $\varphi_s(t)$ by (27), $\hat\varphi_n(t)$ by (28), $\hat\varphi(t)$ by (30) and $\hat\varphi_f(t)$ by (26).
3. Compute $\hat v(t)$ by (23).
4. Compute the gain vector L(t) by (24) and the covariance matrix P(t) by (25).
5. Compute $\hat w(t)$ by (29), and $y_f(t)$, $u_f(t)$, $w_f(t)$ and $v_f(t)$ by (31)–(34), respectively.
6. Update the parameter estimation vector $\hat\theta(t)$ by (22).
7. Increase t by 1 and go to Step 2.

Fig. 1. The flowchart of computing the ML-RLS estimate $\hat\theta(t)$.
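The steps above can be sketched compactly in Python. This is our own illustrative implementation (the function name `ml_rls` is hypothetical, NumPy is assumed, and past estimates are initialized at zero rather than the 1/p0 start-up values), not the authors' code:

```python
import numpy as np

def ml_rls(y, u, na, nb, nc, nd, p0=1e6):
    """A sketch of the ML-RLS recursion (22)-(37). Past estimates are
    initialized at zero, a simplification of the 1/p0 start-up values."""
    n = na + nb + nc + nd
    theta = np.zeros(n)
    P = p0 * np.eye(n)
    N = len(y)
    w_hat = np.zeros(N); v_hat = np.zeros(N)
    yf = np.zeros(N); uf = np.zeros(N); wf = np.zeros(N); vf = np.zeros(N)

    def past(x, t, k):
        # [x(t-1), ..., x(t-k)] with zeros before time 0
        return np.array([x[t - i] if t - i >= 0 else 0.0
                         for i in range(1, k + 1)])

    for t in range(N):
        # split theta(t-1) into the a, b, c, d estimates
        a = theta[:na]; b_ = theta[na:na + nb]
        c = theta[na + nb:na + nb + nc]; d = theta[na + nb + nc:]
        phi_s = np.concatenate([-past(y, t, na), past(u, t, nb)])          # (27)
        phi_n = np.concatenate([-past(w_hat, t, nc), past(v_hat, t, nd)])  # (28)
        phi = np.concatenate([phi_s, phi_n])                               # (30)
        phi_f = np.concatenate([-past(yf, t, na), past(uf, t, nb),
                                -past(wf, t, nc), past(vf, t, nd)])        # (26)
        v_hat[t] = y[t] - phi @ theta                                      # (23)
        w_hat[t] = y[t] - phi_s @ np.concatenate([a, b_])                  # (29)
        yf[t] = y[t] + c @ past(y, t, nc) - d @ past(yf, t, nd)            # (31)
        uf[t] = u[t] + c @ past(u, t, nc) - d @ past(uf, t, nd)            # (32)
        wf[t] = (y[t] + a @ past(y, t, na) - b_ @ past(u, t, nb)
                 - d @ past(wf, t, nd))                                    # (33)
        vf[t] = v_hat[t] - d @ past(vf, t, nd)                             # (34)
        gain = P @ phi_f / (1.0 + phi_f @ P @ phi_f)                       # (24)
        P = (np.eye(n) - np.outer(gain, phi_f)) @ P                        # (25)
        theta = theta + gain * v_hat[t]                                    # (22)
    return theta
```

With nc = nd = 0 the filtered vectors reduce to the raw data and the recursion collapses to the standard recursive least squares algorithm for ARX models, which is a convenient sanity check.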
The flowchart of computing the parameter estimate $\hat\theta(t)$ by the ML-RLS algorithm in (22)–(37) is shown in Fig. 1.
4. The recursive generalized extended least squares algorithm
To show the advantages of the ML-RLS algorithm, we give the recursive generalized extended least squares (RGELS) algorithm for identifying the parameter vector $\theta$ [25]:
$$\hat\theta(t) = \hat\theta(t-1) + L(t)[y(t) - \hat\varphi^{\mathrm T}(t)\hat\theta(t-1)], \tag{38}$$
$$L(t) = \frac{P(t-1)\hat\varphi(t)}{1 + \hat\varphi^{\mathrm T}(t)P(t-1)\hat\varphi(t)}, \tag{39}$$
$$P(t) = [I - L(t)\hat\varphi^{\mathrm T}(t)]P(t-1), \quad P(0) = p_0 I_n, \tag{40}$$
$$\hat\theta(t) = \begin{bmatrix} \hat\theta_s(t) \\ \hat\theta_n(t) \end{bmatrix}, \quad \hat\varphi(t) = \begin{bmatrix} \varphi_s(t) \\ \hat\varphi_n(t) \end{bmatrix}, \tag{41}$$
$$\varphi_s(t) = [-y(t-1), -y(t-2), \ldots, -y(t-n_a), u(t-1), u(t-2), \ldots, u(t-n_b)]^{\mathrm T}, \tag{42}$$
$$\hat\varphi_n(t) = [-\hat w(t-1), \ldots, -\hat w(t-n_c), \hat v(t-1), \ldots, \hat v(t-n_d)]^{\mathrm T}, \tag{43}$$
$$\hat w(t) = y(t) - \varphi_s^{\mathrm T}(t)\hat\theta_s(t), \tag{44}$$
$$\hat v(t) = y(t) - \hat\varphi^{\mathrm T}(t)\hat\theta(t). \tag{45}$$
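For comparison, a sketch of the RGELS recursion is below; it shares the bookkeeping of the ML-RLS sketch but forms the gain from the unfiltered information vector. As before, this is our own illustrative code (hypothetical function name, NumPy assumed, zero initial values):

```python
import numpy as np

def rgels(y, u, na, nb, nc, nd, p0=1e6):
    """A sketch of the RGELS recursion (38)-(45); unlike ML-RLS, the gain
    uses the unfiltered information vector phi(t)."""
    n = na + nb + nc + nd
    theta = np.zeros(n)
    P = p0 * np.eye(n)
    N = len(y)
    w_hat = np.zeros(N); v_hat = np.zeros(N)

    def past(x, t, k):
        return np.array([x[t - i] if t - i >= 0 else 0.0
                         for i in range(1, k + 1)])

    for t in range(N):
        phi_s = np.concatenate([-past(y, t, na), past(u, t, nb)])          # (42)
        phi_n = np.concatenate([-past(w_hat, t, nc), past(v_hat, t, nd)])  # (43)
        phi = np.concatenate([phi_s, phi_n])                               # (41)
        gain = P @ phi / (1.0 + phi @ P @ phi)                             # (39)
        theta = theta + gain * (y[t] - phi @ theta)                        # (38)
        P = (np.eye(n) - np.outer(gain, phi)) @ P                          # (40)
        w_hat[t] = y[t] - phi_s @ theta[:na + nb]                          # (44)
        v_hat[t] = y[t] - phi @ theta                                      # (45)
    return theta
```

Note that (44) and (45) use the updated estimate $\hat\theta(t)$, whereas (23) and (29) in the ML-RLS algorithm use $\hat\theta(t-1)$.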
The choice of the initial values of the RGELS algorithm is similar to that of the ML-RLS algorithm. The flowchart of computing the parameter estimate $\hat\theta(t)$ in the RGELS algorithm is shown in Fig. 2.
Fig. 2. The flowchart of computing the RGELS estimate $\hat\theta(t)$.
Table 1
The parameter estimates and errors ($\sigma^2 = 1.00^2$).

Algorithm   | t    | a1      | a2      | b1       | b2      | c1      | d1       | δ
RGELS       | 100  | 0.14341 | 0.68336 | -0.84788 | 0.55601 | 0.53968 | -0.61710 | 0.22896
RGELS       | 200  | 0.15186 | 0.84121 | -0.84430 | 0.60662 | 0.56043 | -0.60211 | 0.17086
RGELS       | 500  | 0.18153 | 0.85382 | -0.84988 | 0.60759 | 0.59662 | -0.45623 | 0.07638
RGELS       | 1000 | 0.19889 | 0.88476 | -0.85750 | 0.64841 | 0.61756 | -0.45225 | 0.07019
RGELS       | 2000 | 0.21621 | 0.89556 | -0.88460 | 0.65090 | 0.61692 | -0.42646 | 0.05853
RGELS       | 3000 | 0.22092 | 0.89570 | -0.85875 | 0.63017 | 0.62423 | -0.39578 | 0.03114
ML-RLS      | 100  | 0.19542 | 0.84787 | -0.74108 | 0.65672 | 0.48196 | -0.29879 | 0.13036
ML-RLS      | 200  | 0.23207 | 0.88688 | -0.83932 | 0.61429 | 0.51337 | -0.39421 | 0.07280
ML-RLS      | 500  | 0.24413 | 0.91055 | -0.89265 | 0.60274 | 0.63999 | -0.25899 | 0.07197
ML-RLS      | 1000 | 0.23559 | 0.89722 | -0.88693 | 0.57404 | 0.61981 | -0.33282 | 0.03384
ML-RLS      | 2000 | 0.23814 | 0.89651 | -0.89595 | 0.56468 | 0.60724 | -0.35283 | 0.03853
ML-RLS      | 3000 | 0.23315 | 0.89852 | -0.87012 | 0.59422 | 0.59383 | -0.35659 | 0.02159
True values |      | 0.23000 | 0.90000 | -0.85000 | 0.60000 | 0.62000 | -0.36000 |
Table 2
The ML-RLS parameter estimates and errors.

σ²          | t     | a1      | a2      | b1       | b2      | c1      | d1       | δ
0.50²       | 100   | 0.21067 | 0.86902 | -0.79140 | 0.60466 | 0.46114 | -0.29913 | 0.11711
0.50²       | 200   | 0.22933 | 0.88981 | -0.84133 | 0.59473 | 0.50528 | -0.39670 | 0.07735
0.50²       | 500   | 0.23931 | 0.90350 | -0.87057 | 0.59611 | 0.62896 | -0.27363 | 0.05731
0.50²       | 1000  | 0.23384 | 0.89911 | -0.86817 | 0.58380 | 0.61983 | -0.33381 | 0.02294
0.50²       | 2000  | 0.23334 | 0.89834 | -0.87275 | 0.58145 | 0.61029 | -0.35403 | 0.02022
0.50²       | 5000  | 0.23133 | 0.89716 | -0.86120 | 0.59892 | 0.60292 | -0.35673 | 0.01336
0.50²       | 10000 | 0.22855 | 0.89780 | -0.85867 | 0.60362 | 0.61130 | -0.36014 | 0.00833
1.00²       | 100   | 0.19542 | 0.84787 | -0.74108 | 0.65672 | 0.48196 | -0.29879 | 0.13036
1.00²       | 200   | 0.23207 | 0.88688 | -0.83932 | 0.61429 | 0.51337 | -0.39421 | 0.07280
1.00²       | 500   | 0.24413 | 0.91055 | -0.89265 | 0.60274 | 0.63999 | -0.25899 | 0.07197
1.00²       | 1000  | 0.23559 | 0.89722 | -0.88693 | 0.57404 | 0.61981 | -0.33282 | 0.03384
1.00²       | 2000  | 0.23814 | 0.89651 | -0.89595 | 0.56468 | 0.60724 | -0.35283 | 0.03853
1.00²       | 5000  | 0.23361 | 0.89404 | -0.87249 | 0.59807 | 0.59907 | -0.35859 | 0.02015
1.00²       | 10000 | 0.22716 | 0.89510 | -0.86747 | 0.60773 | 0.60992 | -0.36305 | 0.01437
True values |       | 0.23000 | 0.90000 | -0.85000 | 0.60000 | 0.62000 | -0.36000 |
Fig. 3. The ML-RLS and RGELS estimation errors δ versus t ($\sigma^2 = 1.00^2$).
Fig. 4. The ML-RLS parameter estimation errors δ versus t ($\sigma^2 = 0.50^2$ and $\sigma^2 = 1.00^2$).
5. Example
Extending the CARAR model in [26] to a CARARMA model, consider
$$\begin{cases}
A(z)y(t) = B(z)u(t) + \dfrac{D(z)}{C(z)}v(t),\\
A(z) = 1 + a_1 z^{-1} + a_2 z^{-2} = 1 + 0.23z^{-1} + 0.90z^{-2},\\
B(z) = b_1 z^{-1} + b_2 z^{-2} = -0.85z^{-1} + 0.60z^{-2},\\
C(z) = 1 + c_1 z^{-1} = 1 + 0.62z^{-1},\\
D(z) = 1 + d_1 z^{-1} = 1 - 0.36z^{-1},\\
\theta = [a_1, a_2, b_1, b_2, c_1, d_1]^{\mathrm T} = [0.23, 0.90, -0.85, 0.60, 0.62, -0.36]^{\mathrm T}.
\end{cases}$$
In the simulation, the input u(t) is taken as an uncorrelated measured stochastic signal sequence with zero mean and unit variance, and v(t) as a white noise sequence with zero mean and variances $\sigma^2 = 0.50^2$ and $\sigma^2 = 1.00^2$, respectively. Applying the ML-RLS algorithm and the RGELS algorithm to estimate the parameters of this system, the parameter estimates and their errors are shown in Table 1, and the parameter estimates and errors of the ML-RLS algorithm under different noise variances are shown in Table 2. The corresponding estimation errors $\delta := \|\hat\theta(t) - \theta\|/\|\theta\|$ versus t are shown in Figs. 3 and 4.
From Tables 1 and 2 and Figs. 3 and 4, we can draw the following conclusions:
- The estimation errors of both the ML-RLS and RGELS algorithms decrease as the data length increases (see Fig. 3). Thus the ML-RLS algorithm can effectively identify models with colored noises.
- For the same data length, the estimation accuracy of the ML-RLS algorithm is better than that of the RGELS algorithm (see Table 1 and Fig. 3).
- For the same data length, the smaller the noise variance is, the faster the convergence rate of the ML-RLS algorithm is (see Table 2 and Fig. 4).
6. Conclusions
This paper presents a maximum likelihood least squares identification algorithm for controlled autoregressive autoregressive moving average models based on the maximum likelihood principle. The simulation results show that the proposed algorithm is effective and that the estimation accuracy of the ML-RLS algorithm is higher than that of the RGELS algorithm for the same data length. The proposed identification method can also be extended to output error type models, nonlinear systems [27–29], missing-data systems [30–32], non-uniformly sampled-data systems [33–35] and multivariable systems [36], and can be combined with the multi-innovation identification theory [37–47] to study identification problems of systems with colored noises [48–53].
References
[1] Y.S. Xiao, F. Ding, Y. Zhou, M. Li, J.Y. Dai, On consistency of recursive least squares identification algorithms for controlled auto-regression models, Applied Mathematical Modelling 32 (11) (2008) 2207–2215.
[2] Y. Zhang, G.M. Cui, Bias compensation methods for stochastic systems with colored noise, Applied Mathematical Modelling 35 (4) (2011) 1709–1716.
[3] L. Ljung, System Identification: Theory for the User, second ed., Prentice-Hall, Englewood Cliffs, New Jersey, 1999.
[4] B. Kayahan, T. Stengos, Testing the capital asset pricing model with local maximum likelihood methods, Mathematical and Computer Modelling 46 (1–2) (2007) 138–150.
[5] T. Lundahl, W.J. Ohley, S.M. Kay, R. Siffert, Fractional Brownian motion: a maximum likelihood estimator and its application to image texture, IEEE Transactions on Medical Imaging 5 (3) (2007) 152–161.
[6] L.R. Bahl, F. Jelinek, R.L. Mercer, A maximum likelihood approach to continuous speech recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 5 (2) (2009) 179–190.
[7] M. Kyung, S.K. Ghosh, Maximum likelihood estimation for directional conditionally autoregressive models, Journal of Statistical Planning and Inference 140 (11) (2010) 3160–3179.
[8] T. Söderström, M. Hong, J. Schoukens, R. Pintelon, Accuracy analysis of time domain maximum likelihood method and sample maximum likelihood method for errors-in-variables and output error identification, Automatica 46 (4) (2010) 721–727.
[9] J.C. Agüero, J.I. Yuz, G.C. Goodwin, R.A. Delgado, On the equivalence of time and frequency domain maximum likelihood estimation, Automatica 46 (2) (2010) 260–270.
[10] F. Ding, J. Ding, Least squares parameter estimation with irregularly missing data, International Journal of Adaptive Control and Signal Processing 24 (7) (2010) 540–553.
[11] L.L. Han, J. Sheng, F. Ding, Y. Shi, Auxiliary model identification method for multirate multi-input systems based on least squares, Mathematical and Computer Modelling 50 (7–8) (2009) 1100–1106.
[12] D.Q. Wang, Y.Y. Chu, F. Ding, Auxiliary model-based RELS and MI-ELS algorithms for Hammerstein OEMA systems, Computers & Mathematics with Applications 59 (9) (2010) 3092–3098.
[13] Y.J. Liu, D.Q. Wang, F. Ding, Least-squares based iterative algorithms for identifying Box–Jenkins models with finite measurement data, Digital Signal Processing 20 (5) (2010) 1458–1467.
[14] F. Ding, P.X. Liu, G. Liu, Gradient based and least-squares based iterative identification methods for OE and OEMA systems, Digital Signal Processing 20 (3) (2010) 664–677.
[15] X.G. Liu, J. Lu, Least squares based iterative identification for a class of multirate systems, Automatica 46 (3) (2010) 549–554.
[16] D.Q. Wang, G.W. Yang, R.F. Ding, Gradient-based iterative parameter estimation for Box–Jenkins systems, Computers & Mathematics with Applications 60 (5) (2010) 1200–1208.
[17] J. Ding, F. Ding, The residual based extended least squares identification method for dual-rate systems, Computers & Mathematics with Applications 56 (6) (2008) 1479–1487.
[18] F. Ding, T. Chen, Identification of Hammerstein nonlinear ARMAX systems, Automatica 41 (9) (2005) 1479–1489.
[19] H.Q. Han, L. Xie, F. Ding, X.G. Liu, Hierarchical least squares based iterative identification for multivariable systems with moving average noises, Mathematical and Computer Modelling 51 (9–10) (2010) 1213–1220.
[20] Z.N. Zhang, F. Ding, X.G. Liu, Hierarchical gradient based iterative parameter estimation algorithm for multivariable output error moving average systems, Computers & Mathematics with Applications 61 (3) (2011) 672–682.
[21] D.Q. Wang, F. Ding, Extended stochastic gradient identification algorithms for Hammerstein–Wiener ARMAX systems, Computers & Mathematics with Applications 56 (12) (2008) 3157–3164.
[22] J. Chen, F. Ding, Modified stochastic gradient algorithms with fast convergence rates, Journal of Vibration and Control 17 (9) (2011) 1281–1286.
[23] Y.S. Xiao, Y. Zhang, J. Ding, J.Y. Dai, The residual based interactive least squares algorithms and simulation studies, Computers & Mathematics with Applications 58 (6) (2009) 1190–1197.
[24] F. Ding, G. Liu, X.P. Liu, Partially coupled stochastic gradient identification methods for non-uniformly sampled systems, IEEE Transactions on Automatic Control 55 (8) (2010) 1976–1981.
[25] D.Q. Wang, F. Ding, Input–output data filtering based recursive least squares identification for CARARMA systems, Digital Signal Processing 20 (4) (2010) 991–999.
[26] W. Wang, J.H. Li, R.F. Ding, Maximum likelihood identification algorithm for controlled autoregressive autoregressive models, International Journal of Computer Mathematics, doi:10.1080/00207160.2011.598514.
[27] F. Ding, P.X. Liu, G. Liu, Identification methods for Hammerstein nonlinear systems, Digital Signal Processing 21 (2) (2011) 215–238.
[28] D.Q. Wang, F. Ding, Least squares based and gradient based iterative identification for Wiener nonlinear systems, Signal Processing 91 (5) (2011) 1182–1189.
[29] F. Ding, Y. Shi, T. Chen, Auxiliary model based least-squares identification methods for Hammerstein output-error systems, Systems & Control Letters 56 (5) (2007) 373–380.
[30] F. Ding, G. Liu, X.P. Liu, Parameter estimation with scarce measurements, Automatica 47 (8) (2011) 1646–1655.
[31] J. Ding, Y. Shi, H.G. Wang, F. Ding, A modified stochastic gradient based parameter estimation algorithm for dual-rate sampled-data systems, Digital Signal Processing 20 (4) (2010) 1238–1247.
[32] F. Ding, P.X. Liu, H.Z. Yang, Parameter identification and intersample output estimation for dual-rate systems, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans 38 (4) (2008) 966–975.
[33] Y.J. Liu, L. Xie, F. Ding, An auxiliary model based recursive least squares parameter estimation algorithm for non-uniformly sampled multirate systems, Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering 223 (4) (2009) 445–454.
[34] F. Ding, L. Qiu, T. Chen, Reconstruction of continuous-time systems from their non-uniformly sampled discrete-time systems, Automatica 45 (2) (2009) 324–332.
[35] L. Xie, H.Z. Yang, F. Ding, Recursive least squares parameter estimation for non-uniformly sampled systems based on the data filtering, Mathematical and Computer Modelling 54 (1–2) (2011) 315–324.
[36] Y.J. Liu, J. Sheng, R.F. Ding, Convergence of stochastic gradient estimation algorithm for multivariable ARX-like systems, Computers & Mathematics with Applications 59 (8) (2010) 2615–2627.
[37] F. Ding, T. Chen, Performance analysis of multi-innovation gradient type identification methods, Automatica 43 (1) (2007) 1–14.
[38] F. Ding, Several multi-innovation identification methods, Digital Signal Processing 20 (4) (2010) 1027–1039.
[39] D.Q. Wang, F. Ding, Performance analysis of the auxiliary models based multi-innovation stochastic gradient estimation algorithm for output error systems, Digital Signal Processing 20 (3) (2010) 750–762.
[40] L.L. Han, F. Ding, Identification for multirate multi-input systems using the multi-innovation identification theory, Computers & Mathematics with Applications 57 (9) (2009) 1438–1449.
[41] L.L. Han, F. Ding, Multi-innovation stochastic gradient algorithms for multi-input multi-output systems, Digital Signal Processing 19 (4) (2009) 545–554.
[42] J.B. Zhang, F. Ding, Y. Shi, Self-tuning control based on multi-innovation stochastic gradient parameter estimation, Systems & Control Letters 58 (1) (2009) 69–75.
[43] F. Ding, X.P. Liu, G. Liu, Auxiliary model based multi-innovation extended stochastic gradient parameter estimation with colored measurement noises, Signal Processing 89 (10) (2009) 1883–1890.
[44] Y.J. Liu, Y.S. Xiao, X.L. Zhao, Multi-innovation stochastic gradient algorithm for multiple-input single-output systems using the auxiliary model, Applied Mathematics and Computation 215 (4) (2009) 1477–1483.
[45] L. Xie, H.Z. Yang, F. Ding, Modeling and identification for non-uniformly periodically sampled-data systems, IET Control Theory & Applications 4 (5) (2010) 784–794.
[46] F. Ding, P.X. Liu, G. Liu, Multi-innovation least squares identification for system modeling, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 40 (3) (2010) 767–778.
[47] Y.J. Liu, L. Yu, F. Ding, Multi-innovation extended stochastic gradient algorithm and its performance analysis, Circuits, Systems and Signal Processing 29 (4) (2010) 649–667.
[48] J. Chen, Y. Zhang, R.F. Ding, Auxiliary model based multi-innovation algorithms for multivariable nonlinear systems, Mathematical and Computer Modelling 52 (9–10) (2010) 1428–1434.
[49] H.H. Yin, Z.F. Zhu, F. Ding, Model order determination using the Hankel matrix of impulse responses, Applied Mathematics Letters 24 (5) (2011) 797–802.
[50] L. Chen, J.H. Li, R.F. Ding, Identification of the second-order systems based on the step response, Mathematical and Computer Modelling 53 (5–6) (2011) 1074–1083.
[51] B. Bao, Y.Q. Xu, J. Sheng, R.F. Ding, Least squares based iterative parameter estimation algorithm for multivariable controlled ARMA system modelling with finite measurement data, Mathematical and Computer Modelling 53 (9–10) (2011) 1664–1669.
[52] L.L. Xiang, L.B. Xie, Y.W. Liao, R.F. Ding, Hierarchical least squares algorithms for single-input multiple-output systems based on the auxiliary model, Mathematical and Computer Modelling 52 (5–6) (2010) 918–924.
[53] H.Q. Han, G.L. Song, Y.S. Xiao, Y.W. Liao, R.F. Ding, Performance analysis of the AM-SG parameter estimation for multivariable systems, Applied Mathematics and Computation 217 (12) (2011) 5566–5572.