Applied Mathematical Modelling 36 (2012) 1842–1853
Maximum likelihood least squares identification for systems with autoregressive moving average noise
Wei Wang a, Feng Ding b,c,*, Jiyang Dai c
a Key Laboratory of Advanced Process Control for Light Industry (Ministry of Education), Jiangnan University, Wuxi 214122, China
b Control Science and Engineering Research Center, Jiangnan University, Wuxi 214122, China
c School of Information Engineering, Nanchang Hangkong University, Nanchang 330063, China
Article info
Article history: Received 12 January 2011; Received in revised form 14 July 2011; Accepted 27 July 2011; Available online 18 August 2011
Keywords: Least squares; Maximum likelihood; Parameter estimation; Recursive identification; CARARMA system
0307-904X/$ - see front matter © 2011 Elsevier Inc. All rights reserved. doi:10.1016/j.apm.2011.07.083
This work was supported in part by the National Natural Science Foundation of China (60973043).
* Corresponding author at: Control Science and Engineering Research Center, Jiangnan University, Wuxi 214122, PR China.
E-mail addresses: [email protected] (W. Wang), [email protected] (F. Ding), [email protected] (J. Dai).
Abstract
Maximum likelihood methods are important for system modeling and parameter estimation. This paper derives a recursive maximum likelihood least squares identification algorithm for systems with autoregressive moving average noise, based on the maximum likelihood principle. In the derivation, we prove that maximizing the likelihood function is equivalent to minimizing a least squares cost function. The proposed algorithm differs from the corresponding generalized extended least squares algorithm. A simulation test shows that the proposed algorithm achieves higher estimation accuracy than the recursive generalized extended least squares algorithm.
1. Introduction
Mathematical modeling has been receiving much attention in parameter estimation and identification [1,2]. Maximum likelihood estimation methods are an important class of approaches for dynamical system identification; they have a long history [3] and have been applied in many areas, such as asset pricing in finance [4], image texture analysis [5], speech recognition [6] and spatial analysis [7]. In recent years, many likelihood estimation methods have been developed for different models. For example, Söderström et al. analyzed the accuracy of the time-domain maximum likelihood method and the sample maximum likelihood method for errors-in-variables and output error identification [8]; Agüero et al. discussed the relationship between the time- and frequency-domain maximum likelihood estimation methods [9]. In this paper, we discuss a maximum likelihood least squares method for equation error models with autoregressive moving average (ARMA) noises.
For systems with stochastic noises, least squares and gradient search methods are commonly used, e.g., the auxiliary model based least squares methods [10–15] and the gradient based iterative method [14,16] for output error type models. For equation error type models, Ding et al. discussed a least squares method for dual-rate ARX models [17]; Ding et al. presented iterative and recursive least squares methods for Hammerstein nonlinear ARMAX systems [18]; Han et al. developed a hierarchical least squares based iterative identification method for multivariable CARMA-like models [19]; Zhang et al. proposed a hierarchical gradient based iterative parameter estimation algorithm for multivariable output error moving average (OEMA) systems [20]. Wang et al. gave an extended stochastic gradient identification algorithm for Hammerstein–Wiener
ARMAX systems [21]; Chen and Ding presented modified stochastic gradient algorithms with fast convergence rates [22]; Xiao et al. presented a residual-based extended stochastic gradient algorithm for ARMAX models [23]; Ding et al. presented partially coupled stochastic gradient identification methods for non-uniformly sampled systems [24]; Wang et al. developed an input–output data filtering based least squares method for CARARMA models [25]. This paper proposes a maximum likelihood least squares method for CARARMA models, based on the maximum likelihood principle.
The remainder of this paper is organized as follows. Section 2 derives the maximum likelihood objective function of CARARMA models according to the maximum likelihood principle. Section 3 derives a maximum likelihood least squares algorithm for identifying the parameters of CARARMA models. Section 4 briefly gives a recursive generalized extended least squares algorithm for comparison. An illustrative example is given in Section 5, and concluding remarks are summarized in Section 6.
2. The basic principle
In previous work [26], Wang et al. studied the maximum likelihood identification method for dynamic adjustment models, i.e., the controlled autoregressive autoregressive (CARAR) models or equation error autoregressive models,
$$A(z)y(t) = B(z)u(t) + \frac{1}{C(z)}v(t).$$
This paper considers the following controlled autoregressive autoregressive moving average systems (CARARMA systems for short) [3,25],
$$A(z)y(t) = B(z)u(t) + \frac{D(z)}{C(z)}v(t), \tag{1}$$
where u(t) and y(t) are the system input and output, respectively, v(t) is a white noise with the normal distribution $\{v(t)\} \sim N(0, \sigma^2)$, and A(z), B(z), C(z) and D(z) are polynomials in $z^{-1}$, defined by
$$\begin{aligned}
A(z) &= 1 + a_1 z^{-1} + a_2 z^{-2} + \cdots + a_{n_a} z^{-n_a},\\
B(z) &= b_1 z^{-1} + b_2 z^{-2} + \cdots + b_{n_b} z^{-n_b},\\
C(z) &= 1 + c_1 z^{-1} + c_2 z^{-2} + \cdots + c_{n_c} z^{-n_c},\\
D(z) &= 1 + d_1 z^{-1} + d_2 z^{-2} + \cdots + d_{n_d} z^{-n_d}.
\end{aligned}$$
Assume that the degrees $n_a$, $n_b$, $n_c$ and $n_d$ are known and that y(t) = 0, u(t) = 0 and v(t) = 0 for $t \le 0$. Define the inner variable,
$$w(t) := \frac{D(z)}{C(z)}v(t), \tag{2}$$
which is an ARMA process. Define the parameter vector $\theta$ and the information vector $\varphi(t)$ as
$$\theta := \begin{bmatrix} \theta_s \\ \theta_n \end{bmatrix} \in \mathbb{R}^n, \quad n := n_a + n_b + n_c + n_d,$$
$$\theta_s := [a_1, a_2, \ldots, a_{n_a}, b_1, b_2, \ldots, b_{n_b}]^{\mathrm T} \in \mathbb{R}^{n_a+n_b},$$
$$\theta_n := [c_1, c_2, \ldots, c_{n_c}, d_1, d_2, \ldots, d_{n_d}]^{\mathrm T} \in \mathbb{R}^{n_c+n_d},$$
$$\varphi(t) := \begin{bmatrix} \varphi_s(t) \\ \varphi_n(t) \end{bmatrix} \in \mathbb{R}^n,$$
$$\varphi_s(t) := [-y(t-1), -y(t-2), \ldots, -y(t-n_a), u(t-1), u(t-2), \ldots, u(t-n_b)]^{\mathrm T} \in \mathbb{R}^{n_a+n_b},$$
$$\varphi_n(t) := [-w(t-1), -w(t-2), \ldots, -w(t-n_c), v(t-1), v(t-2), \ldots, v(t-n_d)]^{\mathrm T} \in \mathbb{R}^{n_c+n_d}.$$
Eq. (2) can be written as
$$w(t) = [1 - C(z)]w(t) + D(z)v(t) = \varphi_n^{\mathrm T}(t)\theta_n + v(t). \tag{3}$$
Here the subscripts s and n denote the first letters of the words 'system' and 'noise', respectively. Using (2) and (3), Eq. (1) can be rewritten as
$$\begin{aligned}
y(t) &= [1 - A(z)]y(t) + B(z)u(t) + w(t) = \varphi_s^{\mathrm T}(t)\theta_s + w(t) &(4)\\
&= \varphi_s^{\mathrm T}(t)\theta_s + \varphi_n^{\mathrm T}(t)\theta_n + v(t) = \varphi^{\mathrm T}(t)\theta + v(t), &(5)
\end{aligned}$$
or
$$y(t) = -\sum_{i=1}^{n_a} a_i y(t-i) + \sum_{i=1}^{n_b} b_i u(t-i) - \sum_{i=1}^{n_c} c_i w(t-i) + \sum_{i=1}^{n_d} d_i v(t-i) + v(t). \tag{6}$$
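As a concrete illustration, the regression form (6) can be used directly to simulate a CARARMA data set. The sketch below is ours, not part of the original presentation; it assumes NumPy is available, uses the coefficient values of the example system in Section 5, and takes zero initial conditions, consistent with y(t) = u(t) = v(t) = 0 for t ≤ 0:

```python
import numpy as np

rng = np.random.default_rng(0)

# Coefficients of A(z), B(z), C(z), D(z) (the example system of Section 5)
a = [0.23, 0.90]      # a_1, a_2
b = [-0.85, 0.60]     # b_1, b_2
c = [0.62]            # c_1
d = [-0.36]           # d_1

N = 3000
u = rng.standard_normal(N)      # input: zero mean, unit variance
v = rng.standard_normal(N)      # white noise v(t)

w = np.zeros(N)                 # inner ARMA variable w(t), Eq. (2)
y = np.zeros(N)
for t in range(N):
    # w(t) = -sum_i c_i w(t-i) + sum_i d_i v(t-i) + v(t), cf. Eq. (3)
    w[t] = v[t]
    for i, ci in enumerate(c, start=1):
        if t - i >= 0:
            w[t] -= ci * w[t - i]
    for i, di in enumerate(d, start=1):
        if t - i >= 0:
            w[t] += di * v[t - i]
    # y(t) = -sum_i a_i y(t-i) + sum_i b_i u(t-i) + w(t), cf. Eq. (6)
    y[t] = w[t]
    for i, ai in enumerate(a, start=1):
        if t - i >= 0:
            y[t] -= ai * y[t - i]
    for i, bi in enumerate(b, start=1):
        if t - i >= 0:
            y[t] += bi * u[t - i]
```

Because the roots of A(z) and C(z) lie inside the unit circle for these values, the generated sequences remain bounded.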
Since the observed values {u(i), y(i)} for i < t and the parameter vector $\theta$ are uncorrelated with v(t) for a causal system, the joint conditional probability density function of y(1), y(2), ..., y(L) given $\theta$ and u(1), u(2), ..., u(L-1) can be expressed as
$$\begin{aligned}
p(y(1), y(2), \ldots, y(L) \mid u(1), u(2), \ldots, u(L-1), \theta)
&= p(y(L) \mid y(1), \ldots, y(L-1), u(1), \ldots, u(L-1), \theta)\,
   p(y(L-1) \mid y(1), \ldots, y(L-2), u(1), \ldots, u(L-1), \theta)
   \cdots p(y(1) \mid y(0), u(0), \theta)\\
&= \prod_{t=1}^{L} p(y(t) \mid y(1), \ldots, y(t-1), u(1), \ldots, u(t-1), \theta)\\
&= \prod_{t=1}^{L} p\Big(-\sum_{i=1}^{n_a} a_i y(t-i) + \sum_{i=1}^{n_b} b_i u(t-i)
   - \sum_{i=1}^{n_c} c_i w(t-i) + \sum_{i=1}^{n_d} d_i v(t-i) + v(t)
   \,\Big|\, y(1), \ldots, y(t-1), u(1), \ldots, u(t-1), \theta\Big)\\
&= k \prod_{t=1}^{L} p(v(t))
 = \frac{k}{(\sqrt{2\pi\sigma^2})^{L}} \exp\Big(-\frac{1}{2\sigma^2}\sum_{t=1}^{L} v^2(t)\Big),
\end{aligned} \tag{7}$$
where k is a constant. Let
$$y^L := \{y(1), y(2), \ldots, y(L)\}, \quad u^{L-1} := \{u(1), u(2), \ldots, u(L-1)\}.$$
From (7), we define the logarithm of $p(y^L \mid u^{L-1})$ as the log-likelihood function,
$$l(y^L \mid u^{L-1}, \theta) := \ln p(y^L \mid u^{L-1}, \theta) = \ln k - \frac{L}{2}\ln 2\pi - \frac{L}{2}\ln \sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^{L} v^2(t). \tag{8}$$
According to the maximum likelihood principle, the maximum likelihood estimate $\hat\sigma^2$ of the noise variance maximizes $l(y^L \mid u^{L-1}, \theta)$, and thus setting the derivative to zero gives
$$\left.\frac{\partial l(y^L \mid u^{L-1}, \theta)}{\partial \sigma^2}\right|_{\hat\sigma^2} = 0, \tag{9}$$
whose solution is given by
$$\hat\sigma^2 = \frac{1}{L}\sum_{t=1}^{L} v^2(t). \tag{10}$$
Substituting (10) into (8) gives
$$l(y^L \mid u^{L-1}, \theta) = \ln k - \frac{L}{2}\ln 2\pi - \frac{L}{2}\ln\Big(\frac{1}{L}\sum_{t=1}^{L} v^2(t)\Big) - \frac{L}{2} = k_1 - \frac{L}{2}\ln\Big(\frac{1}{L}\sum_{t=1}^{L} v^2(t)\Big), \tag{11}$$
where $k_1 := \ln k - \frac{L}{2}\ln 2\pi - \frac{L}{2}$. From (11), it is clear that the maximum of $l(y^L \mid u^{L-1}, \theta)$ over $\theta$ can be achieved by minimizing the following objective function,
$$J(\theta) := \frac{1}{2}\sum_{t=1}^{L} v^2(t)\Big|_{\theta}, \tag{12}$$
where v(t) is given by
$$v(t) = \frac{C(z)}{D(z)}[A(z)y(t) - B(z)u(t)]. \tag{13}$$
Thus, the maximum likelihood estimate $\hat\theta$ of the CARARMA models can be obtained by minimizing the objective function $J(\theta)$; in fact, it is a least squares objective function.
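To make (12) and (13) concrete, the residual v(t) can be computed by first forming e(t) = A(z)y(t) - B(z)u(t) and then filtering it through C(z)/D(z). The helpers below are hypothetical (our own names, assuming NumPy and zero initial conditions), a sketch rather than the paper's implementation:

```python
import numpy as np

def residuals(y, u, a, b, c, d):
    """v(t) = (C(z)/D(z))[A(z)y(t) - B(z)u(t)], cf. Eq. (13),
    with zero initial conditions; a, b, c, d are coefficient lists."""
    N = len(y)
    e = np.zeros(N)   # e(t) = A(z)y(t) - B(z)u(t)
    v = np.zeros(N)
    for t in range(N):
        e[t] = y[t]
        for i, ai in enumerate(a, start=1):
            if t - i >= 0:
                e[t] += ai * y[t - i]
        for i, bi in enumerate(b, start=1):
            if t - i >= 0:
                e[t] -= bi * u[t - i]
    for t in range(N):
        # D(z)v(t) = C(z)e(t)  =>  v(t) = e(t) + sum c_i e(t-i) - sum d_i v(t-i)
        v[t] = e[t]
        for i, ci in enumerate(c, start=1):
            if t - i >= 0:
                v[t] += ci * e[t - i]
        for i, di in enumerate(d, start=1):
            if t - i >= 0:
                v[t] -= di * v[t - i]
    return v

def objective(y, u, a, b, c, d):
    """The least squares cost J(theta) of Eq. (12)."""
    v = residuals(y, u, a, b, c, d)
    return 0.5 * float(np.sum(v ** 2))
```

On a noise-free data set, J vanishes at the true parameters; minimizing J over the coefficients yields the maximum likelihood estimate.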
3. The recursive maximum likelihood least squares algorithm
This section derives the recursive maximum likelihood least squares algorithm from the objective function $J(\theta)$ in (12). Let I be an identity matrix of appropriate size and $\mathbf{1}_n$ be an n-dimensional column vector whose entries are all 1. From (12), the objective function can be written in the recursive form
$$J(\theta, t) = J(\theta, t-1) + \frac{1}{2}v^2(t). \tag{14}$$
The following derivation is similar to that of the estimation algorithm for CARAR systems [26]. Using the first-order Taylor expansion, v(t) at $\theta = \hat\theta(t-1)$ can be approximately expressed as
$$v(t) \approx v(t)|_{\hat\theta(t-1)} + \left.\left[\frac{\partial v(t)}{\partial \theta}\right]^{\mathrm T}\right|_{\hat\theta(t-1)}[\theta - \hat\theta(t-1)]. \tag{15}$$
Let $\hat\theta(t)$ be the maximum likelihood estimate of $\theta$ at time t, i.e.,
$$\hat\theta(t) := \begin{bmatrix} \hat\theta_s(t) \\ \hat\theta_n(t) \end{bmatrix} \in \mathbb{R}^n,$$
$$\hat\theta_s(t) := [\hat a_1(t), \hat a_2(t), \ldots, \hat a_{n_a}(t), \hat b_1(t), \hat b_2(t), \ldots, \hat b_{n_b}(t)]^{\mathrm T} \in \mathbb{R}^{n_a+n_b},$$
$$\hat\theta_n(t) := [\hat c_1(t), \hat c_2(t), \ldots, \hat c_{n_c}(t), \hat d_1(t), \hat d_2(t), \ldots, \hat d_{n_d}(t)]^{\mathrm T} \in \mathbb{R}^{n_c+n_d}.$$
Define the estimates of the polynomials A(z), B(z), C(z) and D(z) at time t as
$$\begin{aligned}
\hat A(t, z) &= 1 + \hat a_1(t)z^{-1} + \hat a_2(t)z^{-2} + \cdots + \hat a_{n_a}(t)z^{-n_a},\\
\hat B(t, z) &= \hat b_1(t)z^{-1} + \hat b_2(t)z^{-2} + \cdots + \hat b_{n_b}(t)z^{-n_b},\\
\hat C(t, z) &= 1 + \hat c_1(t)z^{-1} + \hat c_2(t)z^{-2} + \cdots + \hat c_{n_c}(t)z^{-n_c},\\
\hat D(t, z) &= 1 + \hat d_1(t)z^{-1} + \hat d_2(t)z^{-2} + \cdots + \hat d_{n_d}(t)z^{-n_d}.
\end{aligned}$$
Define the information vector,
$$\varphi_f(t) := -\left.\frac{\partial v(t)}{\partial \theta}\right|_{\hat\theta(t-1)} = -\left.\left[\frac{\partial v(t)}{\partial a_1}, \ldots, \frac{\partial v(t)}{\partial a_{n_a}}, \frac{\partial v(t)}{\partial b_1}, \ldots, \frac{\partial v(t)}{\partial b_{n_b}}, \frac{\partial v(t)}{\partial c_1}, \ldots, \frac{\partial v(t)}{\partial c_{n_c}}, \frac{\partial v(t)}{\partial d_1}, \ldots, \frac{\partial v(t)}{\partial d_{n_d}}\right]^{\mathrm T}\right|_{\hat\theta(t-1)} \in \mathbb{R}^n. \tag{16}$$
Computing the partial derivatives of v(t) in (13) with respect to $a_j$, $b_j$, $c_j$ and $d_j$ at $\theta = \hat\theta(t-1)$ gives
$$\left.\frac{\partial v(t)}{\partial a_j}\right|_{\hat\theta(t-1)} = \frac{\hat C(t-1, z)}{\hat D(t-1, z)}\,y(t-j) =: z^{-j}y_f(t),$$
$$\left.\frac{\partial v(t)}{\partial b_j}\right|_{\hat\theta(t-1)} = -\frac{\hat C(t-1, z)}{\hat D(t-1, z)}\,u(t-j) =: -z^{-j}u_f(t),$$
$$\left.\frac{\partial v(t)}{\partial c_j}\right|_{\hat\theta(t-1)} = \frac{\hat A(t-1, z)}{\hat D(t-1, z)}\,y(t-j) - \frac{\hat B(t-1, z)}{\hat D(t-1, z)}\,u(t-j) =: z^{-j}w_f(t),$$
$$\left.\frac{\partial v(t)}{\partial d_j}\right|_{\hat\theta(t-1)} = -\frac{1}{\hat D(t-1, z)}\,v(t-j) =: -z^{-j}v_f(t),$$
where $y_f(t)$, $u_f(t)$, $v_f(t)$ and $w_f(t)$ are defined by
$$y_f(t) := \frac{\hat C(t-1, z)}{\hat D(t-1, z)}\,y(t), \quad u_f(t) := \frac{\hat C(t-1, z)}{\hat D(t-1, z)}\,u(t),$$
$$w_f(t) := \frac{\hat A(t-1, z)}{\hat D(t-1, z)}\,y(t) - \frac{\hat B(t-1, z)}{\hat D(t-1, z)}\,u(t), \quad v_f(t) := \frac{1}{\hat D(t-1, z)}\,v(t).$$
Their recursive relations are given by
$$y_f(t) = y(t) + \hat c_1(t-1)y(t-1) + \cdots + \hat c_{n_c}(t-1)y(t-n_c) - \hat d_1(t-1)y_f(t-1) - \cdots - \hat d_{n_d}(t-1)y_f(t-n_d),$$
$$u_f(t) = u(t) + \hat c_1(t-1)u(t-1) + \cdots + \hat c_{n_c}(t-1)u(t-n_c) - \hat d_1(t-1)u_f(t-1) - \cdots - \hat d_{n_d}(t-1)u_f(t-n_d),$$
$$w_f(t) = y(t) + \hat a_1(t-1)y(t-1) + \cdots + \hat a_{n_a}(t-1)y(t-n_a) - \hat b_1(t-1)u(t-1) - \cdots - \hat b_{n_b}(t-1)u(t-n_b) - \hat d_1(t-1)w_f(t-1) - \cdots - \hat d_{n_d}(t-1)w_f(t-n_d),$$
$$v_f(t) = \hat v(t) - \hat d_1(t-1)v_f(t-1) - \hat d_2(t-1)v_f(t-2) - \cdots - \hat d_{n_d}(t-1)v_f(t-n_d).$$
Thus from (16), we have
$$\varphi_f(t) = [-y_f(t-1), \ldots, -y_f(t-n_a), u_f(t-1), \ldots, u_f(t-n_b), -w_f(t-1), \ldots, -w_f(t-n_c), v_f(t-1), \ldots, v_f(t-n_d)]^{\mathrm T} \in \mathbb{R}^n.$$
Applying the Taylor series expansion to $J(\theta, t-1)$ gives
$$J(\theta, t-1) \approx \left.\left[\frac{\partial J(\theta, t-1)}{\partial \theta}\right]^{\mathrm T}\right|_{\hat\theta(t-1)}[\theta - \hat\theta(t-1)] + \frac{1}{2}[\theta - \hat\theta(t-1)]^{\mathrm T}\left.\frac{\partial^2 J(\theta, t-1)}{\partial \theta\,\partial \theta^{\mathrm T}}\right|_{\hat\theta(t-1)}[\theta - \hat\theta(t-1)] + \frac{1}{2}g(t), \tag{17}$$
where the variable g(t) is the residual of the Taylor expansion of $J(\theta, t-1)$. Since the first-order derivative of $J(\theta, t-1)$ at $\theta = \hat\theta(t-1)$ approximately equals zero, and
$$P^{-1}(t-1) := \left.\frac{\partial^2 J(\theta, t-1)}{\partial \theta\,\partial \theta^{\mathrm T}}\right|_{\hat\theta(t-1)}$$
is a positive-definite matrix, Eq. (14) can be written as
$$J(\theta, t) \approx \frac{1}{2}[\theta - \hat\theta(t-1)]^{\mathrm T}P^{-1}(t-1)[\theta - \hat\theta(t-1)] + \frac{1}{2}g(t) + \frac{1}{2}v^2(t). \tag{18}$$
From (18) and (14), we have
$$\begin{aligned}
2J(\theta, t) &\approx [\theta - \hat\theta(t-1)]^{\mathrm T}P^{-1}(t-1)[\theta - \hat\theta(t-1)] + v^2(t) + g(t)\\
&= [\theta - \hat\theta(t-1)]^{\mathrm T}P^{-1}(t-1)[\theta - \hat\theta(t-1)] + g(t) + \left[v(t)|_{\hat\theta(t-1)} + \left.\left[\frac{\partial v(t)}{\partial \theta}\right]^{\mathrm T}\right|_{\hat\theta(t-1)}[\theta - \hat\theta(t-1)]\right]^2\\
&= [\theta - \hat\theta(t-1)]^{\mathrm T}[P^{-1}(t-1) + \varphi_f(t)\varphi_f^{\mathrm T}(t)][\theta - \hat\theta(t-1)] - 2\hat v(t)\varphi_f^{\mathrm T}(t)[\theta - \hat\theta(t-1)] + \hat v^2(t) + g(t).
\end{aligned}$$
The second-order derivative of $J(\theta, t)$ in (14) with respect to $\theta$ is
$$P^{-1}(t) = \frac{\partial^2 J(\theta, t)}{\partial \theta\,\partial \theta^{\mathrm T}} = \frac{\partial^2 J(\theta, t-1)}{\partial \theta\,\partial \theta^{\mathrm T}} + v(t)\frac{\partial^2 v(t)}{\partial \theta\,\partial \theta^{\mathrm T}} + \frac{\partial v(t)}{\partial \theta}\left[\frac{\partial v(t)}{\partial \theta}\right]^{\mathrm T}. \tag{19}$$
Because the second-order derivative of v(t) with respect to $\theta$ at $\theta = \hat\theta(t-1)$ is zero, Eq. (19) can be written as
$$P^{-1}(t) = P^{-1}(t-1) + \varphi_f(t)\varphi_f^{\mathrm T}(t). \tag{20}$$
Applying the matrix inversion lemma
$$(A + BC)^{-1} = A^{-1} - A^{-1}B(I + CA^{-1}B)^{-1}CA^{-1}$$
to (20) gives
$$P(t) = P(t-1) - \frac{P(t-1)\varphi_f(t)\varphi_f^{\mathrm T}(t)P(t-1)}{1 + \varphi_f^{\mathrm T}(t)P(t-1)\varphi_f(t)}.$$
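This rank-one update avoids inverting an n x n matrix at every step. The small NumPy sketch below (our own check, not from the paper) verifies it against a direct inversion of (20):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 4
Q = rng.standard_normal((n, n))
P_prev = np.linalg.inv(Q @ Q.T + n * np.eye(n))   # a positive-definite P(t-1)
phi_f = rng.standard_normal(n)

# Direct inversion of P^{-1}(t) = P^{-1}(t-1) + phi_f phi_f^T, Eq. (20)
P_direct = np.linalg.inv(np.linalg.inv(P_prev) + np.outer(phi_f, phi_f))

# Rank-one update via the matrix inversion lemma (no n x n inversion needed)
denom = 1.0 + phi_f @ P_prev @ phi_f
P_update = P_prev - np.outer(P_prev @ phi_f, P_prev @ phi_f) / denom

assert np.allclose(P_direct, P_update)
```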
Let
$$L(t) := P(t)\varphi_f(t), \quad g^*(t) := -[L(t)\hat v(t)]^{\mathrm T}P^{-1}(t)[L(t)\hat v(t)] + \hat v^2(t) + g(t).$$
Thus, we have
$$L(t) = \frac{P(t-1)\varphi_f(t)}{1 + \varphi_f^{\mathrm T}(t)P(t-1)\varphi_f(t)}.$$
The objective function $J(\theta, t)$ can be written as
$$\begin{aligned}
2J(\theta, t) &\approx [\theta - \hat\theta(t-1) - L(t)\hat v(t)]^{\mathrm T}P^{-1}(t)[\theta - \hat\theta(t-1) - L(t)\hat v(t)] - [L(t)\hat v(t)]^{\mathrm T}P^{-1}(t)[L(t)\hat v(t)] + \hat v^2(t) + g(t)\\
&= [\theta - \hat\theta(t-1) - L(t)\hat v(t)]^{\mathrm T}P^{-1}(t)[\theta - \hat\theta(t-1) - L(t)\hat v(t)] + g^*(t).
\end{aligned}$$
Minimizing $2J(\theta, t)$ gives the estimate $\hat\theta(t)$ of $\theta$ as follows:
$$\hat\theta(t) = \hat\theta(t-1) + L(t)\hat v(t).$$
Define
$$\hat\varphi(t) := \begin{bmatrix} \varphi_s(t) \\ \hat\varphi_n(t) \end{bmatrix} \in \mathbb{R}^n, \quad \hat\varphi_n(t) := [-\hat w(t-1), \ldots, -\hat w(t-n_c), \hat v(t-1), \ldots, \hat v(t-n_d)]^{\mathrm T} \in \mathbb{R}^{n_c+n_d}.$$
Replacing $\theta_s$ with $\hat\theta_s(t-1)$ in (4), the estimate $\hat w(t)$ can be computed by
$$\hat w(t) = y(t) - \varphi_s^{\mathrm T}(t)\hat\theta_s(t-1).$$
Replacing $\varphi(t)$ and $\theta$ in (5) with their estimates $\hat\varphi(t)$ and $\hat\theta(t-1)$, the estimate $\hat v(t)$ of v(t) can be computed by
$$\hat v(t) = v(t)|_{\hat\theta(t-1)} = y(t) - \hat\varphi^{\mathrm T}(t)\hat\theta(t-1). \tag{21}$$
Thus the maximum likelihood recursive least squares (ML-RLS) identification algorithm for the controlled autoregressive autoregressive moving average system can be expressed as
$$\hat\theta(t) = \hat\theta(t-1) + L(t)\hat v(t), \tag{22}$$
$$\hat v(t) = y(t) - \hat\varphi^{\mathrm T}(t)\hat\theta(t-1), \tag{23}$$
$$L(t) = P(t-1)\hat\varphi_f(t)[1 + \hat\varphi_f^{\mathrm T}(t)P(t-1)\hat\varphi_f(t)]^{-1}, \tag{24}$$
$$P(t) = [I - L(t)\hat\varphi_f^{\mathrm T}(t)]P(t-1), \quad P(0) = p_0 I, \tag{25}$$
$$\hat\varphi_f(t) = [-y_f(t-1), \ldots, -y_f(t-n_a), u_f(t-1), \ldots, u_f(t-n_b), -w_f(t-1), \ldots, -w_f(t-n_c), v_f(t-1), \ldots, v_f(t-n_d)]^{\mathrm T}, \tag{26}$$
$$\varphi_s(t) = [-y(t-1), -y(t-2), \ldots, -y(t-n_a), u(t-1), u(t-2), \ldots, u(t-n_b)]^{\mathrm T}, \tag{27}$$
$$\hat\varphi_n(t) = [-\hat w(t-1), \ldots, -\hat w(t-n_c), \hat v(t-1), \ldots, \hat v(t-n_d)]^{\mathrm T}, \tag{28}$$
$$\hat w(t) = y(t) - \varphi_s^{\mathrm T}(t)\hat\theta_s(t-1), \tag{29}$$
$$\hat\varphi(t) = \begin{bmatrix} \varphi_s(t) \\ \hat\varphi_n(t) \end{bmatrix}, \tag{30}$$
$$y_f(t) = y(t) + \hat c_1(t-1)y(t-1) + \cdots + \hat c_{n_c}(t-1)y(t-n_c) - \hat d_1(t-1)y_f(t-1) - \cdots - \hat d_{n_d}(t-1)y_f(t-n_d), \tag{31}$$
$$u_f(t) = u(t) + \hat c_1(t-1)u(t-1) + \cdots + \hat c_{n_c}(t-1)u(t-n_c) - \hat d_1(t-1)u_f(t-1) - \cdots - \hat d_{n_d}(t-1)u_f(t-n_d), \tag{32}$$
$$w_f(t) = y(t) + \hat a_1(t-1)y(t-1) + \cdots + \hat a_{n_a}(t-1)y(t-n_a) - \hat b_1(t-1)u(t-1) - \cdots - \hat b_{n_b}(t-1)u(t-n_b) - \hat d_1(t-1)w_f(t-1) - \cdots - \hat d_{n_d}(t-1)w_f(t-n_d), \tag{33}$$
$$v_f(t) = \hat v(t) - \hat d_1(t-1)v_f(t-1) - \cdots - \hat d_{n_d}(t-1)v_f(t-n_d), \tag{34}$$
$$\hat\theta(t) = \begin{bmatrix} \hat\theta_s(t) \\ \hat\theta_n(t) \end{bmatrix}, \tag{35}$$
$$\hat\theta_s(t) = [\hat a_1(t), \ldots, \hat a_{n_a}(t), \hat b_1(t), \ldots, \hat b_{n_b}(t)]^{\mathrm T}, \tag{36}$$
$$\hat\theta_n(t) = [\hat c_1(t), \ldots, \hat c_{n_c}(t), \hat d_1(t), \ldots, \hat d_{n_d}(t)]^{\mathrm T}. \tag{37}$$
The steps involved in the ML-RLS algorithm are listed as follows:
1. Let t = 1, set the initial values $\hat\theta(0) = \mathbf{1}_n/p_0$, $P(0) = p_0 I$, $\hat v(0) = 1/p_0$, $\hat w(0) = 1/p_0$ and $y_f(i) = 1/p_0$, $u_f(i) = 1/p_0$, $w_f(i) = 1/p_0$, $v_f(i) = 1/p_0$ for $i \le 0$, where $p_0$ is a large number (e.g., $p_0 = 10^6$).
2. Collect the input–output data u(t) and y(t), and form the information vectors $\varphi_s(t)$ by (27), $\hat\varphi_n(t)$ by (28), $\hat\varphi(t)$ by (30) and $\hat\varphi_f(t)$ by (26).
3. Compute $\hat v(t)$ by (23).
4. Compute the gain vector L(t) by (24) and the covariance matrix P(t) by (25).
5. Compute $\hat w(t)$ by (29), and $y_f(t)$, $u_f(t)$, $w_f(t)$ and $v_f(t)$ by (31)–(34), respectively.
6. Update the parameter estimation vector $\hat\theta(t)$ by (22).
7. Increase t by 1 and go to Step 2.

Fig. 1. The flowchart of computing the ML-RLS estimate $\hat\theta(t)$.
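The steps above can be sketched compactly in Python. This is our own illustrative implementation (the function name `ml_rls` is hypothetical, NumPy is assumed, and past estimates are initialized at zero rather than the 1/p0 start-up values), not the authors' code:

```python
import numpy as np

def ml_rls(y, u, na, nb, nc, nd, p0=1e6):
    """A sketch of the ML-RLS recursion (22)-(37). Past estimates are
    initialized at zero, a simplification of the 1/p0 start-up values."""
    n = na + nb + nc + nd
    theta = np.zeros(n)
    P = p0 * np.eye(n)
    N = len(y)
    w_hat = np.zeros(N); v_hat = np.zeros(N)
    yf = np.zeros(N); uf = np.zeros(N); wf = np.zeros(N); vf = np.zeros(N)

    def past(x, t, k):
        # [x(t-1), ..., x(t-k)] with zeros before time 0
        return np.array([x[t - i] if t - i >= 0 else 0.0
                         for i in range(1, k + 1)])

    for t in range(N):
        # split theta(t-1) into the a, b, c, d estimates
        a = theta[:na]; b_ = theta[na:na + nb]
        c = theta[na + nb:na + nb + nc]; d = theta[na + nb + nc:]
        phi_s = np.concatenate([-past(y, t, na), past(u, t, nb)])          # (27)
        phi_n = np.concatenate([-past(w_hat, t, nc), past(v_hat, t, nd)])  # (28)
        phi = np.concatenate([phi_s, phi_n])                               # (30)
        phi_f = np.concatenate([-past(yf, t, na), past(uf, t, nb),
                                -past(wf, t, nc), past(vf, t, nd)])        # (26)
        v_hat[t] = y[t] - phi @ theta                                      # (23)
        w_hat[t] = y[t] - phi_s @ np.concatenate([a, b_])                  # (29)
        yf[t] = y[t] + c @ past(y, t, nc) - d @ past(yf, t, nd)            # (31)
        uf[t] = u[t] + c @ past(u, t, nc) - d @ past(uf, t, nd)            # (32)
        wf[t] = (y[t] + a @ past(y, t, na) - b_ @ past(u, t, nb)
                 - d @ past(wf, t, nd))                                    # (33)
        vf[t] = v_hat[t] - d @ past(vf, t, nd)                             # (34)
        gain = P @ phi_f / (1.0 + phi_f @ P @ phi_f)                       # (24)
        P = (np.eye(n) - np.outer(gain, phi_f)) @ P                        # (25)
        theta = theta + gain * v_hat[t]                                    # (22)
    return theta
```

With nc = nd = 0 the filtered vectors reduce to the raw data and the recursion collapses to the standard recursive least squares algorithm for ARX models, which is a convenient sanity check.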
The flowchart of computing the parameter estimate $\hat\theta(t)$ by the ML-RLS algorithm in (22)–(37) is shown in Fig. 1.
4. The recursive generalized extended least squares algorithm
To show the advantages of the ML-RLS algorithm, we give the recursive generalized extended least squares (RGELS) algorithm for identifying the parameter vector $\theta$ [25]:
$$\hat\theta(t) = \hat\theta(t-1) + L(t)[y(t) - \hat\varphi^{\mathrm T}(t)\hat\theta(t-1)], \tag{38}$$
$$L(t) = \frac{P(t-1)\hat\varphi(t)}{1 + \hat\varphi^{\mathrm T}(t)P(t-1)\hat\varphi(t)}, \tag{39}$$
$$P(t) = [I - L(t)\hat\varphi^{\mathrm T}(t)]P(t-1), \quad P(0) = p_0 I_n, \tag{40}$$
$$\hat\theta(t) = \begin{bmatrix} \hat\theta_s(t) \\ \hat\theta_n(t) \end{bmatrix}, \quad \hat\varphi(t) = \begin{bmatrix} \varphi_s(t) \\ \hat\varphi_n(t) \end{bmatrix}, \tag{41}$$
$$\varphi_s(t) = [-y(t-1), -y(t-2), \ldots, -y(t-n_a), u(t-1), u(t-2), \ldots, u(t-n_b)]^{\mathrm T}, \tag{42}$$
$$\hat\varphi_n(t) = [-\hat w(t-1), \ldots, -\hat w(t-n_c), \hat v(t-1), \ldots, \hat v(t-n_d)]^{\mathrm T}, \tag{43}$$
$$\hat w(t) = y(t) - \varphi_s^{\mathrm T}(t)\hat\theta_s(t), \tag{44}$$
$$\hat v(t) = y(t) - \hat\varphi^{\mathrm T}(t)\hat\theta(t). \tag{45}$$
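For comparison, a sketch of the RGELS recursion is below; it shares the bookkeeping of the ML-RLS sketch but forms the gain from the unfiltered information vector. As before, this is our own illustrative code (hypothetical function name, NumPy assumed, zero initial values):

```python
import numpy as np

def rgels(y, u, na, nb, nc, nd, p0=1e6):
    """A sketch of the RGELS recursion (38)-(45); unlike ML-RLS, the gain
    uses the unfiltered information vector phi(t)."""
    n = na + nb + nc + nd
    theta = np.zeros(n)
    P = p0 * np.eye(n)
    N = len(y)
    w_hat = np.zeros(N); v_hat = np.zeros(N)

    def past(x, t, k):
        return np.array([x[t - i] if t - i >= 0 else 0.0
                         for i in range(1, k + 1)])

    for t in range(N):
        phi_s = np.concatenate([-past(y, t, na), past(u, t, nb)])          # (42)
        phi_n = np.concatenate([-past(w_hat, t, nc), past(v_hat, t, nd)])  # (43)
        phi = np.concatenate([phi_s, phi_n])                               # (41)
        gain = P @ phi / (1.0 + phi @ P @ phi)                             # (39)
        theta = theta + gain * (y[t] - phi @ theta)                        # (38)
        P = (np.eye(n) - np.outer(gain, phi)) @ P                          # (40)
        w_hat[t] = y[t] - phi_s @ theta[:na + nb]                          # (44)
        v_hat[t] = y[t] - phi @ theta                                      # (45)
    return theta
```

Note that (44) and (45) use the updated estimate $\hat\theta(t)$, whereas (23) and (29) in the ML-RLS algorithm use $\hat\theta(t-1)$.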
The choice of the initial values of the RGELS algorithm is similar to that of the ML-RLS algorithm. The flowchart of computing the parameter estimate $\hat\theta(t)$ in the RGELS algorithm is shown in Fig. 2.
Fig. 2. The flowchart of computing the RGELS estimate $\hat\theta(t)$.
Table 1
The parameter estimates and errors ($\sigma^2 = 1.00^2$).

Algorithm   | t    | a1      | a2      | b1       | b2      | c1      | d1       | δ
RGELS       | 100  | 0.14341 | 0.68336 | -0.84788 | 0.55601 | 0.53968 | -0.61710 | 0.22896
RGELS       | 200  | 0.15186 | 0.84121 | -0.84430 | 0.60662 | 0.56043 | -0.60211 | 0.17086
RGELS       | 500  | 0.18153 | 0.85382 | -0.84988 | 0.60759 | 0.59662 | -0.45623 | 0.07638
RGELS       | 1000 | 0.19889 | 0.88476 | -0.85750 | 0.64841 | 0.61756 | -0.45225 | 0.07019
RGELS       | 2000 | 0.21621 | 0.89556 | -0.88460 | 0.65090 | 0.61692 | -0.42646 | 0.05853
RGELS       | 3000 | 0.22092 | 0.89570 | -0.85875 | 0.63017 | 0.62423 | -0.39578 | 0.03114
ML-RLS      | 100  | 0.19542 | 0.84787 | -0.74108 | 0.65672 | 0.48196 | -0.29879 | 0.13036
ML-RLS      | 200  | 0.23207 | 0.88688 | -0.83932 | 0.61429 | 0.51337 | -0.39421 | 0.07280
ML-RLS      | 500  | 0.24413 | 0.91055 | -0.89265 | 0.60274 | 0.63999 | -0.25899 | 0.07197
ML-RLS      | 1000 | 0.23559 | 0.89722 | -0.88693 | 0.57404 | 0.61981 | -0.33282 | 0.03384
ML-RLS      | 2000 | 0.23814 | 0.89651 | -0.89595 | 0.56468 | 0.60724 | -0.35283 | 0.03853
ML-RLS      | 3000 | 0.23315 | 0.89852 | -0.87012 | 0.59422 | 0.59383 | -0.35659 | 0.02159
True values |      | 0.23000 | 0.90000 | -0.85000 | 0.60000 | 0.62000 | -0.36000 |
Table 2
The ML-RLS parameter estimates and errors.

σ²          | t     | a1      | a2      | b1       | b2      | c1      | d1       | δ
0.50²       | 100   | 0.21067 | 0.86902 | -0.79140 | 0.60466 | 0.46114 | -0.29913 | 0.11711
0.50²       | 200   | 0.22933 | 0.88981 | -0.84133 | 0.59473 | 0.50528 | -0.39670 | 0.07735
0.50²       | 500   | 0.23931 | 0.90350 | -0.87057 | 0.59611 | 0.62896 | -0.27363 | 0.05731
0.50²       | 1000  | 0.23384 | 0.89911 | -0.86817 | 0.58380 | 0.61983 | -0.33381 | 0.02294
0.50²       | 2000  | 0.23334 | 0.89834 | -0.87275 | 0.58145 | 0.61029 | -0.35403 | 0.02022
0.50²       | 5000  | 0.23133 | 0.89716 | -0.86120 | 0.59892 | 0.60292 | -0.35673 | 0.01336
0.50²       | 10000 | 0.22855 | 0.89780 | -0.85867 | 0.60362 | 0.61130 | -0.36014 | 0.00833
1.00²       | 100   | 0.19542 | 0.84787 | -0.74108 | 0.65672 | 0.48196 | -0.29879 | 0.13036
1.00²       | 200   | 0.23207 | 0.88688 | -0.83932 | 0.61429 | 0.51337 | -0.39421 | 0.07280
1.00²       | 500   | 0.24413 | 0.91055 | -0.89265 | 0.60274 | 0.63999 | -0.25899 | 0.07197
1.00²       | 1000  | 0.23559 | 0.89722 | -0.88693 | 0.57404 | 0.61981 | -0.33282 | 0.03384
1.00²       | 2000  | 0.23814 | 0.89651 | -0.89595 | 0.56468 | 0.60724 | -0.35283 | 0.03853
1.00²       | 5000  | 0.23361 | 0.89404 | -0.87249 | 0.59807 | 0.59907 | -0.35859 | 0.02015
1.00²       | 10000 | 0.22716 | 0.89510 | -0.86747 | 0.60773 | 0.60992 | -0.36305 | 0.01437
True values |       | 0.23000 | 0.90000 | -0.85000 | 0.60000 | 0.62000 | -0.36000 |
Fig. 3. The ML-RLS and RGELS estimation errors δ versus t ($\sigma^2 = 1.00^2$).
Fig. 4. The ML-RLS parameter estimation errors δ versus t ($\sigma^2 = 0.50^2$ and $\sigma^2 = 1.00^2$).
5. Example
Extending the CARAR model in [26] to a CARARMA model, consider
$$\begin{cases}
A(z)y(t) = B(z)u(t) + \dfrac{D(z)}{C(z)}v(t),\\
A(z) = 1 + a_1 z^{-1} + a_2 z^{-2} = 1 + 0.23z^{-1} + 0.90z^{-2},\\
B(z) = b_1 z^{-1} + b_2 z^{-2} = -0.85z^{-1} + 0.60z^{-2},\\
C(z) = 1 + c_1 z^{-1} = 1 + 0.62z^{-1},\\
D(z) = 1 + d_1 z^{-1} = 1 - 0.36z^{-1},\\
\theta = [a_1, a_2, b_1, b_2, c_1, d_1]^{\mathrm T} = [0.23, 0.90, -0.85, 0.60, 0.62, -0.36]^{\mathrm T}.
\end{cases}$$
In the simulation, the input u(t) is taken as an uncorrelated measured stochastic signal sequence with zero mean and unit variance, and v(t) as a white noise sequence with zero mean and variances $\sigma^2 = 0.50^2$ and $\sigma^2 = 1.00^2$, respectively. Applying the ML-RLS algorithm and the RGELS algorithm to estimate the parameters of this system, the parameter estimates and their errors are shown in Table 1, and the parameter estimates and errors of the ML-RLS algorithm under different noise variances are shown in Table 2. The corresponding estimation errors $\delta := \|\hat\theta(t) - \theta\|/\|\theta\|$ versus t are shown in Figs. 3 and 4.
From Tables 1 and 2 and Figs. 3 and 4, we can draw the following conclusions:
- The estimation errors of both the ML-RLS and RGELS algorithms decrease as the data length increases (see Fig. 3). Thus the ML-RLS algorithm can effectively identify models with colored noises.
- For the same data length, the estimation accuracy of the ML-RLS algorithm is better than that of the RGELS algorithm (see Table 1 and Fig. 3).
- For the same data length, the smaller the noise variance is, the faster the convergence rate of the ML-RLS algorithm is (see Table 2 and Fig. 4).
6. Conclusions
This paper presents a maximum likelihood least squares identification algorithm for controlled autoregressive autoregressive moving average models based on the maximum likelihood principle. The simulation results show that the proposed algorithm is effective and that the estimation accuracy of the ML-RLS algorithm is higher than that of the RGELS algorithm for the same data length. The proposed identification method can also be extended to output error type models, nonlinear systems [27–29], missing-data systems [30–32], non-uniformly sampled-data systems [33–35] and multivariable systems [36], and can be combined with the multi-innovation identification theory [37–47] to study identification problems of systems with colored noises [48–53].
References
[1] Y.S. Xiao, F. Ding, Y. Zhou, M. Li, J.Y. Dai, On consistency of recursive least squares identification algorithms for controlled auto-regression models, Applied Mathematical Modelling 32 (11) (2008) 2207–2215.
[2] Y. Zhang, G.M. Cui, Bias compensation methods for stochastic systems with colored noise, Applied Mathematical Modelling 35 (4) (2011) 1709–1716.
[3] L. Ljung, System Identification: Theory for the User, second ed., Prentice-Hall, Englewood Cliffs, New Jersey, 1999.
[4] B. Kayahan, T. Stengos, Testing the capital asset pricing model with local maximum likelihood methods, Mathematical and Computer Modelling 46 (1–2) (2007) 138–150.
[5] T. Lundahl, W.J. Ohley, S.M. Kay, R. Siffert, Fractional Brownian motion: a maximum likelihood estimator and its application to image texture, IEEE Transactions on Medical Imaging 5 (3) (2007) 152–161.
[6] L.R. Bahl, F. Jelinek, R.L. Mercer, A maximum likelihood approach to continuous speech recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 5 (2) (2009) 179–190.
[7] M. Kyung, S.K. Ghosh, Maximum likelihood estimation for directional conditionally autoregressive models, Journal of Statistical Planning and Inference 140 (11) (2010) 3160–3179.
[8] T. Söderström, M. Hong, J. Schoukens, R. Pintelon, Accuracy analysis of time domain maximum likelihood method and sample maximum likelihood method for errors-in-variables and output error identification, Automatica 46 (4) (2010) 721–727.
[9] J.C. Agüero, J.I. Yuz, G.C. Goodwin, R.A. Delgado, On the equivalence of time and frequency domain maximum likelihood estimation, Automatica 46 (2) (2010) 260–270.
[10] F. Ding, J. Ding, Least squares parameter estimation with irregularly missing data, International Journal of Adaptive Control and Signal Processing 24 (7) (2010) 540–553.
[11] L.L. Han, J. Sheng, F. Ding, Y. Shi, Auxiliary model identification method for multirate multi-input systems based on least squares, Mathematical and Computer Modelling 50 (7–8) (2009) 1100–1106.
[12] D.Q. Wang, Y.Y. Chu, F. Ding, Auxiliary model-based RELS and MI-ELS algorithms for Hammerstein OEMA systems, Computers & Mathematics with Applications 59 (9) (2010) 3092–3098.
[13] Y.J. Liu, D.Q. Wang, F. Ding, Least-squares based iterative algorithms for identifying Box–Jenkins models with finite measurement data, Digital Signal Processing 20 (5) (2010) 1458–1467.
[14] F. Ding, P.X. Liu, G. Liu, Gradient based and least-squares based iterative identification methods for OE and OEMA systems, Digital Signal Processing 20 (3) (2010) 664–677.
[15] X.G. Liu, J. Lu, Least squares based iterative identification for a class of multirate systems, Automatica 46 (3) (2010) 549–554.
[16] D.Q. Wang, G.W. Yang, R.F. Ding, Gradient-based iterative parameter estimation for Box–Jenkins systems, Computers & Mathematics with Applications 60 (5) (2010) 1200–1208.
[17] J. Ding, F. Ding, The residual based extended least squares identification method for dual-rate systems, Computers & Mathematics with Applications 56 (6) (2008) 1479–1487.
[18] F. Ding, T. Chen, Identification of Hammerstein nonlinear ARMAX systems, Automatica 41 (9) (2005) 1479–1489.
[19] H.Q. Han, L. Xie, F. Ding, X.G. Liu, Hierarchical least squares based iterative identification for multivariable systems with moving average noises, Mathematical and Computer Modelling 51 (9–10) (2010) 1213–1220.
[20] Z.N. Zhang, F. Ding, X.G. Liu, Hierarchical gradient based iterative parameter estimation algorithm for multivariable output error moving average systems, Computers & Mathematics with Applications 61 (3) (2011) 672–682.
[21] D.Q. Wang, F. Ding, Extended stochastic gradient identification algorithms for Hammerstein–Wiener ARMAX systems, Computers & Mathematics with Applications 56 (12) (2008) 3157–3164.
[22] J. Chen, F. Ding, Modified stochastic gradient algorithms with fast convergence rates, Journal of Vibration and Control 17 (9) (2011) 1281–1286.
[23] Y.S. Xiao, Y. Zhang, J. Ding, J.Y. Dai, The residual based interactive least squares algorithms and simulation studies, Computers & Mathematics with Applications 58 (6) (2009) 1190–1197.
[24] F. Ding, G. Liu, X.P. Liu, Partially coupled stochastic gradient identification methods for non-uniformly sampled systems, IEEE Transactions on Automatic Control 55 (8) (2010) 1976–1981.
[25] D.Q. Wang, F. Ding, Input–output data filtering based recursive least squares identification for CARARMA systems, Digital Signal Processing 20 (4) (2010) 991–999.
[26] W. Wang, J.H. Li, R.F. Ding, Maximum likelihood identification algorithm for controlled autoregressive autoregressive models, International Journal of Computer Mathematics, doi:10.1080/00207160.2011.598514.
[27] F. Ding, P.X. Liu, G. Liu, Identification methods for Hammerstein nonlinear systems, Digital Signal Processing 21 (2) (2011) 215–238.
[28] D.Q. Wang, F. Ding, Least squares based and gradient based iterative identification for Wiener nonlinear systems, Signal Processing 91 (5) (2011) 1182–1189.
[29] F. Ding, Y. Shi, T. Chen, Auxiliary model based least-squares identification methods for Hammerstein output-error systems, Systems & Control Letters 56 (5) (2007) 373–380.
[30] F. Ding, G. Liu, X.P. Liu, Parameter estimation with scarce measurements, Automatica 47 (8) (2011) 1646–1655.
[31] J. Ding, Y. Shi, H.G. Wang, F. Ding, A modified stochastic gradient based parameter estimation algorithm for dual-rate sampled-data systems, Digital Signal Processing 20 (4) (2010) 1238–1247.
[32] F. Ding, P.X. Liu, H.Z. Yang, Parameter identification and intersample output estimation for dual-rate systems, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans 38 (4) (2008) 966–975.
[33] Y.J. Liu, L. Xie, F. Ding, An auxiliary model based recursive least squares parameter estimation algorithm for non-uniformly sampled multirate systems, Proceedings of the Institution of Mechanical Engineers, Part I: Journal of Systems and Control Engineering 223 (4) (2009) 445–454.
[34] F. Ding, L. Qiu, T. Chen, Reconstruction of continuous-time systems from their non-uniformly sampled discrete-time systems, Automatica 45 (2) (2009) 324–332.
[35] L. Xie, H.Z. Yang, F. Ding, Recursive least squares parameter estimation for non-uniformly sampled systems based on the data filtering, Mathematical and Computer Modelling 54 (1–2) (2011) 315–324.
[36] Y.J. Liu, J. Sheng, R.F. Ding, Convergence of stochastic gradient estimation algorithm for multivariable ARX-like systems, Computers & Mathematics with Applications 59 (8) (2010) 2615–2627.
[37] F. Ding, T. Chen, Performance analysis of multi-innovation gradient type identification methods, Automatica 43 (1) (2007) 1–14.
[38] F. Ding, Several multi-innovation identification methods, Digital Signal Processing 20 (4) (2010) 1027–1039.
[39] D.Q. Wang, F. Ding, Performance analysis of the auxiliary models based multi-innovation stochastic gradient estimation algorithm for output error systems, Digital Signal Processing 20 (3) (2010) 750–762.
[40] L.L. Han, F. Ding, Identification for multirate multi-input systems using the multi-innovation identification theory, Computers & Mathematics with Applications 57 (9) (2009) 1438–1449.
[41] L.L. Han, F. Ding, Multi-innovation stochastic gradient algorithms for multi-input multi-output systems, Digital Signal Processing 19 (4) (2009) 545–554.
[42] J.B. Zhang, F. Ding, Y. Shi, Self-tuning control based on multi-innovation stochastic gradient parameter estimation, Systems & Control Letters 58 (1) (2009) 69–75.
[43] F. Ding, X.P. Liu, G. Liu, Auxiliary model based multi-innovation extended stochastic gradient parameter estimation with colored measurement noises, Signal Processing 89 (10) (2009) 1883–1890.
[44] Y.J. Liu, Y.S. Xiao, X.L. Zhao, Multi-innovation stochastic gradient algorithm for multiple-input single-output systems using the auxiliary model, Applied Mathematics and Computation 215 (4) (2009) 1477–1483.
[45] L. Xie, H.Z. Yang, F. Ding, Modeling and identification for non-uniformly periodically sampled-data systems, IET Control Theory & Applications 4 (5) (2010) 784–794.
[46] F. Ding, P.X. Liu, G. Liu, Multi-innovation least squares identification for system modeling, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 40 (3) (2010) 767–778.
[47] Y.J. Liu, L. Yu, F. Ding, Multi-innovation extended stochastic gradient algorithm and its performance analysis, Circuits, Systems and Signal Processing 29 (4) (2010) 649–667.
[48] J. Chen, Y. Zhang, R.F. Ding, Auxiliary model based multi-innovation algorithms for multivariable nonlinear systems, Mathematical and Computer Modelling 52 (9–10) (2010) 1428–1434.
[49] H.H. Yin, Z.F. Zhu, F. Ding, Model order determination using the Hankel matrix of impulse responses, Applied Mathematics Letters 24 (5) (2011) 797–802.
[50] L. Chen, J.H. Li, R.F. Ding, Identification of the second-order systems based on the step response, Mathematical and Computer Modelling 53 (5–6) (2011) 1074–1083.
[51] B. Bao, Y.Q. Xu, J. Sheng, R.F. Ding, Least squares based iterative parameter estimation algorithm for multivariable controlled ARMA system modelling with finite measurement data, Mathematical and Computer Modelling 53 (9–10) (2011) 1664–1669.
[52] L.L. Xiang, L.B. Xie, Y.W. Liao, R.F. Ding, Hierarchical least squares algorithms for single-input multiple-output systems based on the auxiliary model, Mathematical and Computer Modelling 52 (5–6) (2010) 918–924.
[53] H.Q. Han, G.L. Song, Y.S. Xiao, Y.W. Liao, R.F. Ding, Performance analysis of the AM-SG parameter estimation for multivariable systems, Applied Mathematics and Computation 217 (12) (2011) 5566–5572.