Page 1: Block partial derivative and its application to neural-net-based direct model-reference adaptive control

Block partial derivative and its application to neural-net-based direct-model-reference adaptive control

M.S. Ahmed

Indexing terms: Adaptive control, Neural networks, Nonlinear control systems

Abstract: The concept of block partial derivative and the associated algebra are introduced. The algebra is then applied to direct-model-reference adaptive control (MRAC) of discrete-time nonlinear plants. Three MRAC algorithms are developed, employing identification of the forward model, an extended MIT rule and an extended SPR rule. Local-convergence properties and results of a simulation study are presented.

1 Introduction

Model-reference adaptive control (MRAC) is an important class of adaptive-control scheme. In the direct-MRAC scheme, the regulator is updated online so that the plant output follows the output of a reference model. The control is thus based on two loops: an inner and an outer loop. The inner loop provides the feedback control, while the outer loop adjusts the parameters of the regulator.

In the MRAC of a linear plant, the reference model and the controller structure are chosen in such a way that a parameter set of the regulator exists to ensure perfect model following. However, for nonlinear plants with unknown structures, it may not be possible to ensure perfect model following. Nevertheless, other prior information and extensive simulation studies may be used to create a suitable reference model and controller structure. In many cases, perfect model following may not be achievable, and it may not be required either [1].

It is well known that, when perfect model following is not achievable (owing to structural uncertainty or other factors), the MIT rule (i.e. the gradient method) attempts to achieve a least-mean-squares solution [1]. Given a reference model and a controller structure, such a gradient rule can also be formulated for nonlinear plants. However, a major obstacle in designing control schemes for nonlinear plants is the lack of a suitable model structure. The structures used in modelling nonlinear plants have been the Hammerstein, Wiener and NARMAX representations [2, 3].

In recent years, the artificial neural network (ANN) has come to be an important element in describing nonlinear functions. Neural networks are known as universal function approximators [4]. It has been shown [5] that a feedforward multilayered neural network can approximate a continuous function arbitrarily well. Adaptive control of nonlinear plants using neural networks has been considered in References 6-8. However, all of them considered feedback-linearisable nonlinear plants. In this paper no a priori assumption is made about the plant structure. It is only assumed that the order and delay of the plant are known.

(c) IEE, 1994. Paper 1395D (C9), first received 5th January and in revised form 8th June 1994. The author is with the Department of Electrical & Computer Engineering, University of Michigan-Dearborn, Dearborn, MI 48128, USA.

IEE Proc.-Control Theory Appl., Vol. 141, No. 5, September 1994

The proposed method is based on a direct minimisation of the squared error between the plant output and the model output. The minimisation is carried out using a gradient-descent rule. The online optimisation procedure is based on a new concept of partial derivatives called 'block partial derivatives' (BPD). Three approaches to MRAC are developed, employing (a) identification of the forward model, (b) an extended MIT rule, and (c) an extended SPR rule. Local-convergence properties as well as results of a simulation study are presented. An interesting observation is that, for a linear plant with a linear controller, the proposed method reduces to the well-established MRAC schemes for a linear plant.

2 Block partial derivatives and associated algebra

In this Section, we introduce the concept of block partial derivatives. The algebra of block partial derivatives, which facilitates the partial-derivative computation when different blocks are connected in various configurations, is also presented.

2.1 Block partial derivatives (BPD)

Definition: Consider a block A as shown in Fig. 1a with the input x ∈ R^nx and output y ∈ R^ny. The block partial derivatives (BPDs) of A are defined as

∂^A y(t)/∂x_i(t),  i = 1, ..., nx    (1)

Remark 1: In the above definition all inputs other than x_i are held constant. However, the other signals within the block are allowed to vary owing to the change in x_i.

Remark 2: If a parameter set w ∈ R^nw in the block is also allowed to vary, its effect may be accommodated by representing the weights as auxiliary inputs, as shown in

The author thanks King Fahd University of Petroleum and Minerals for supporting this research.



Fig. 1b. This representation facilitates computation of BPDs with respect to the parameters.

Remark 3: When x and y are functions of time and the block is dynamic (i.e. has memory), the BPDs may be represented by rational dynamic operators.

Fig. 1  Block with input and output
a Block
b Parameters of the block shown as auxiliary inputs

2.2 Algebra of block partial derivatives

The MRAC algorithm proposed in this paper relies on the computation of various BPDs from blocks connected in different configurations. In this Section, we present the algebra which allows their computation.

2.2.1 Cascade blocks: The cascade of two blocks is shown in Fig. 2a. The cascade connection of the blocks A and B is represented as block C. Since the inputs y_i to block B are the outputs of block A, the BPDs of C can be obtained using the chain rule as [9]

∂^C z(t)/∂x_i(t) = Σ_j [∂^B z(t)/∂y_j(t)][∂^A y_j(t)/∂x_i(t)]    (2)

where z denotes the output of block B.

Fig. 2  Cascade and delay blocks
a Cascade block
b Delay block
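The chain rule of eqn. 2 can be checked numerically. The sketch below uses a hypothetical pair of static blocks (the functions and dimensions are ours, purely illustrative, not from the paper): it estimates the BPDs of each block by central differences and verifies that the BPDs of the cascade equal the product of the blockwise BPDs.

```python
import numpy as np

def jacobian(block, x, eps=1e-6):
    """Finite-difference estimate of the BPDs of `block` at input x:
    column i holds the partial derivative with respect to x_i,
    all other inputs held constant (central difference)."""
    cols = []
    for i in range(len(x)):
        xp, xm = x.copy(), x.copy()
        xp[i] += eps
        xm[i] -= eps
        cols.append((block(xp) - block(xm)) / (2 * eps))
    return np.column_stack(cols)

# Two hypothetical static blocks A and B, cascaded into C = B(A(x))
block_A = lambda x: np.array([np.sin(x[0]), x[0] ** 2, x[1]])
block_B = lambda y: np.array([y[0] * y[2] + np.tanh(y[1])])
block_C = lambda x: block_B(block_A(x))

x = np.array([0.7, -0.3])
# Chain rule of eqn. 2: BPDs of the cascade = product of blockwise BPDs
lhs = jacobian(block_C, x)
rhs = jacobian(block_B, block_A(x)) @ jacobian(block_A, x)
```

For static blocks this is the ordinary Jacobian chain rule; the BPD algebra extends it to dynamic blocks in the following subsections.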

2.2.2 Delay blocks: A class of blocks that plays an important role in the study of discrete-time dynamic systems is the delay block, shown in Fig. 2b. The input-output relationship is governed by

y(t) = x(t - 1) = q^-1 x(t)    (3)

where q^-1 is the backward-shift operator. Delay blocks represent the presence of memory. Therefore, in the presence of delay in the block, the explicit time dependence of the input and output must be brought in. The BPDs in this case are obtained as

∂^A y(t)/∂x(t) = q^-1    (4)

2.2.3 Feedback blocks: The most interesting and important block in control applications is a block with feedback. One such interconnection is shown in Fig. 3a. Such blocks typically arise in the design of discrete-time control systems. To retain simplicity, a feedback block with only one output is shown. However, the concept can be extended to blocks with multiple outputs. For the block shown in Fig. 3a, with v(t) denoting the feedback signal, the BPDs of C are obtained using the chain rule as

∂^C y(t)/∂x_i(t) = ∂^A y(t)/∂x_i(t) + [∂^A y(t)/∂v(t)][∂^B v(t)/∂y(t)] ∂^C y(t)/∂x_i(t)

or

∂^C y(t)/∂x_i(t) = [∂^A y(t)/∂x_i(t)]/[1 - F(t)]    (5)

where

F(t) = [∂^A y(t)/∂v(t)][∂^B v(t)/∂y(t)]

Fig. 3  Block configurations
a Feedback block
b Block with delays in the forward path
c Block with delays in both the forward and feedback paths



2.2.4 Delays in the forward path: Another interesting block in discrete-time dynamic systems is a block with delays in the forward path. One such interconnection is shown in Fig. 3b, where only one input is shown, although multiple inputs may well be considered. The BPDs in this case are obtained as*

∂^C y(t)/∂x(t) = Σ_{j>=1} [∂^A y(t)/∂x(t - j)][∂x(t - j)/∂x(t)] + ∂^A y(t)/∂x(t)    (6)

or, since x(t - j) = q^-j x(t),

∂^C y(t)/∂x(t) = B(t, q^-1)    (7)

with

B(t, q^-1) = Σ_{j>=0} [∂^A y(t)/∂x(t - j)] q^-j

2.2.5 Delays in both forward and feedback paths: Now consider a block with delays in both the forward and the feedback path, as shown in Fig. 3c. Combining the results of Sections 2.2.3 and 2.2.4, the BPDs of C are obtained as

∂^C y(t)/∂x(t) = B(t, q^-1) + [Σ_{j>=1} [∂^A y(t)/∂y(t - j)] q^-j] ∂^C y(t)/∂x(t)    (8)

or

∂^C y(t)/∂x(t) = B(t, q^-1)/A(t, q^-1)    (9)

and

∂^C y(t)/∂w(t) = [1/A(t, q^-1)] ∂^A y(t)/∂w(t)    (10)

where

A(t, q^-1) = 1 - Σ_{j>=1} [∂^A y(t)/∂y(t - j)] q^-j

with B(t, q^-1) as given following eqn. 7, and w(t) a parameter set of A.
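Remark 3 and eqns. 8-10 describe BPDs of dynamic blocks as rational operators in q^-1. A minimal sketch, assuming frozen (time-invariant) coefficients and zero initial conditions, of how such an operator B(q^-1)/A(q^-1) acts on a signal as a difference equation:

```python
def apply_rational_operator(b, a, x):
    """Apply y(t) = [B(q^-1)/A(q^-1)] x(t) as the recursion
    A(q^-1) y(t) = B(q^-1) x(t), i.e.
    y(t) = sum_j b[j] x(t-j) - sum_{j>=1} a[j] y(t-j),
    with a = [1, a1, ..., an] and b = [b0, b1, ..., bm].
    Signals are lists indexed by time; zero initial conditions."""
    y = []
    for t in range(len(x)):
        acc = sum(bj * x[t - j] for j, bj in enumerate(b) if t - j >= 0)
        acc -= sum(a[j] * y[t - j] for j in range(1, len(a)) if t - j >= 0)
        y.append(acc)
    return y

# Example: the delay-and-feedback operator q^-1 / (1 - 0.5 q^-1)
impulse = [1.0] + [0.0] * 4
response = apply_rational_operator([0.0, 1.0], [1.0, -0.5], impulse)
# response: [0.0, 1.0, 0.5, 0.25, 0.125]
```

Time-varying BPD operators can be handled the same way by re-evaluating the coefficient lists at each step.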

3 Direct MRAC for nonlinear plants

Consider the MRAC scheme of Fig. 4a. The objective of the control scheme is to adjust the controller parameters such that the plant output y(t) follows the reference output y_m(t). Introduce the error and the criterion function as

e(t) = y_m(t) - y(t)    (11)

J_g(w_g) = (1/2)e^2(t)    (12)

where w_g represents the adjustable parameters of the controller. To minimise J_g, we adjust the controller parameters along the negative-gradient direction. However, the gradient must be calculated for the block X, where X is an imaginary block containing the plant, the controller and the feedback connections (see Fig. 4b). Otherwise, the effect of all the interconnections will not be accounted for. This gives the following parameter-adjustment rule:

w_g(t) = w_g(t - 1) + η_g(t)Γ_g(t)e(t)    (13)

Γ_g(t) = ∂^X y(t)/∂w_g(t - 1)    (14)

* Eqn. 7 implies that, for the block C, the incremental output is given by δy(t) = B(t, q^-1) δx(t).


where η_g(t) is the 'learning rate' or 'adaptation rate' and Γ_g(t) is the 'sensitivity derivative'. In implementing the parameter-adaptation rule, the key consideration is the determination of η_g and Γ_g. In the following Section we derive a computational algorithm for Γ_g.

Fig. 4  Direct MRAC scheme and controller and plant
a Direct MRAC scheme (x(t): command input)
b Controller and plant with the imaginary blocks D and H
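The update of eqn. 13 can be illustrated on a toy problem in which the sensitivity derivative is available exactly. In this sketch the plant-plus-controller map is replaced by a hypothetical linear-in-parameters function (our choice, purely illustrative), so the sensitivity derivative reduces to the regressor; the constant learning rate is likewise illustrative rather than the normalised choice derived in Section 5.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=3)        # adjustable parameters (toy w_g)
w_true = np.array([2.0, 0.5, 1.0])       # parameters giving perfect following

def regressor(x):
    # For a toy linear-in-parameters map y = w . phi(x), the
    # sensitivity derivative dy/dw is simply the regressor phi(x)
    return np.array([x, x ** 2, 1.0])

eta = 0.05                               # constant learning rate (illustrative)
for t in range(2000):
    x = np.sin(0.1 * t)                  # persistently exciting command input
    phi = regressor(x)
    e = w_true @ phi - w @ phi           # model-following error e(t)
    w = w + eta * phi * e                # eqn-13-style gradient step
```

With a sufficiently exciting input the parameters converge to the model-matching values; without excitation only the output error, not the parameter error, is driven to zero.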

3.1 Gradient computation
Consider a discrete-time nth-order nonlinear plant, with the chosen controller and the reference model given as

Plant:
y(t) = f{y(t - 1), ..., y(t - n), u(t - d), ..., u(t - n)}    (15)

Controller:
u(t) = g{u(t - 1), ..., u(t - k), y(t), ..., y(t - l), x(t), ..., x(t - m)}    (16)

Model:
y_m(t) = [q^-d' B_m(q^-1)/A_m(q^-1)] x(t)    (17)

where

A_m(q^-1) = 1 - a_m1 q^-1 - ... - a_mv q^-v
B_m(q^-1) = b_m0 + b_m1 q^-1 + ... + b_m,v-d' q^-(v-d')

and f{.} and g{.} are nonlinear functions. x(t), y_m(t), u(t), y(t), d and d', respectively, are the command input, reference output, plant input, plant output, plant delay and model delay. Note that the reference model is chosen as linear, since this seems to be a natural choice. However, an appropriate nonlinear reference model may well be chosen.

In Fig. 4b the plant-controller combination is redrawn with the imaginary blocks D and H. Since X is a feedback block, eqn. 10 can be applied to obtain the BPD of y(t) with respect to w_g (eqn. 18).



To compute the BPDs of D and C, these blocks are represented in terms of the functions f{.} and g{.} in Figs. 5a and 5b. Since they have delays in both the feedback and forward paths, eqns. 9 and 10 can be used to obtain the corresponding operator polynomials, formed from the BPDs of f{.} and g{.} (eqns. 19-24). Since D is made up of two cascade blocks, one may also use eqn. 2 to obtain the BPDs of D from those of the plant and controller blocks. Combining eqns. 21-23, one obtains eqns. 25-27, where the arguments t and q^-1 are dropped for clarity. Similarly, combining eqns. 20, 22 and 24, one obtains

∂^D y(t)/∂w_g(t - 1) = [q^-d B/(AR)] ∂^g u(t)/∂w_g(t - 1)    (28)

Combining eqns. 14, 18, 27 and 28, one now obtains

Γ_g(t) = [q^-d B/(AR + q^-d BS)] ∂^g u(t)/∂w_g(t - 1)    (29)

Eqns. 25, 26 and 29 show that Γ_g may be computed using the block partial derivatives of f{.} and g{.}. If the functions f{.} and g{.} are known, these block partial derivatives can be easily computed from x(t), u(t), y(t) and their delayed values. In this paper, we will use a feedforward neural network to approximate these functions. However, the approach can also be used for other approximations based on polynomial or other basis functions.

4 Neural-net-based MRAC

In this Section, we describe how the neural net can be used to formulate direct MRAC for nonlinear plants. Feedforward neural networks can be effectively utilised to represent the unknown functions f{.} and g{.}. As mentioned in Section 1, the capability of feedforward neural networks in approximating nonlinear functions has been well documented in recent literature. Although polynomial functions also enjoy such an approximation property, the feedforward neural network has the advantage of a simple structure, the only structural parameters being the number of hidden layers and the number of neurons in each layer. Further, all the BPDs for f and g required to compute Γ_g(t) in eqn. 13 can be conveniently computed using the backpropagation algorithm.
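The backpropagation computation described in this Section (a one backpropagated from the output, read off at the input terminals) can be sketched for a small one-hidden-layer tanh network; the sizes and weights below are arbitrary, not those used in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
# A small one-hidden-layer tanh network (sizes are illustrative only):
# 3 inputs, 5 hidden neurons, 1 output
W1, b1 = rng.normal(size=(5, 3)), rng.normal(size=5)
W2, b2 = rng.normal(size=(1, 5)), rng.normal(size=1)

def net(x):
    return (W2 @ np.tanh(W1 @ x + b1) + b2)[0]

def input_bpds(x):
    """Backpropagate a one from the output terminal: the values
    arriving at the input terminals are the BPDs of the network
    output with respect to each input."""
    h = np.tanh(W1 @ x + b1)
    delta = W2[0] * (1.0 - h ** 2)   # sensitivity through the tanh layer
    return delta @ W1                # gradients at the input terminals

x = np.array([0.2, -0.4, 0.1])
g = input_bpds(x)
```

The same backward pass, multiplied by the appropriate neuron outputs, yields the derivatives with respect to the weights, which is how the BPDs for weight updating are obtained.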

Fig. 5  Block representation of plant and controller
a Plant
b Controller

Fig. 6  Neural-net-based MRAC with identification of the forward model

4.1 MRAC with plant identification (algorithm 1)
The proposed MRAC scheme with identification of the forward model is shown in Fig. 6. Two feedforward neural networks NN_f and NN_g are trained, one to assume the nonlinear plant function f{.}, while the other assumes the nonlinear controller function g{.}. The weights of NN_f are updated by minimising

J_f(w_f) = (1/2)‖ε(t)‖^2    (30)

along the negative-gradient direction (eqns. 31-33), where w_f represents the weights of NN_f, ε(t) is the plant-identification error and η_f is the learning rate.

All the BPDs for f needed in eqns. 25 and 33 can be obtained from the backpropagation network. The estimated BPDs of f{.} are equal to the ordered derivatives [10] of the network NN_f. These derivatives are obtained by backpropagating a one (unity value) from the output of NN_f. The BPDs in eqn. 25 are then obtained from the backpropagated values at the appropriate input terminals, while the BPDs in eqn. 33 are obtained by multiplying the backpropagated values with the appropriate neuron outputs [11]. Note that the ultimate use of the network NN_f is in the computation of the BPDs of f{.} in eqn. 25, which are in turn needed to train the network NN_g that constitutes the controller.

The network NN_g that serves as the controller is trained using eqn. 13, where the partial derivative Γ_g(t) is computed using eqns. 29, 25 and 26. The BPDs of g{.} are equal to the ordered derivatives of NN_g. These are obtained by backpropagating a one from the output terminal of NN_g in a way analogous to that described in the preceding paragraph.

The choice of the learning rates η_f(t) and η_g(t) is also very important. Their choice should ensure fast convergence of the controller parameters to the appropriate values as well as retain local stability [12]. Based on the local-stability requirement, guidelines for choosing η_f(t) and η_g(t) will be provided in Section 5.

4.2 MRAC without plant identification
The MRAC scheme described in Section 4.1 has the disadvantage of requiring the training of two neural networks. In addition to the training of the controller, training of the net NN_f describing the plant model is also needed. Following the development of MRAC schemes for linear plants, identification of the plant model may be avoided. To obtain such schemes, additional assumptions and/or signal filtering may be needed. We propose two such schemes in the following.

4.2.1 Extended MIT rule (algorithm 2): This approach is based on approximating the closed-loop system by the reference model. Following an approach similar to the development of eqn. 29, it can also be established that

∂^X y(t)/∂x(t) = [q^-d BT/(AR + q^-d BS)]    (34)

where T is the operator polynomial formed from the BPDs of g{.} with respect to the command input x and its delayed values. On the other hand, from eqn. 17, one also has

∂^M y_m(t)/∂x(t) = q^-d' B_m/A_m    (35)

where M represents the block containing the model (see Fig. 4a). Upon convergence, the right-hand sides of eqns. 34 and 35 must be the same. Hence, assuming that B and B_m do not share any common zeros, on convergence [1] one has

AR + q^-d BS = A_m A_o B+    (36)

and

T = q^(d-d') B_m A_o / b_0    (37)

where

B = b_0 B+

b_0 is the instantaneous gain, B+ is monic and A_o is an observer polynomial.

Although the above relationship is only valid on convergence, the proposed scheme avoids identification of the forward model by substituting eqn. 36 in eqn. 29. Further, assuming that b_0 is positive and absorbing it into the learning rate, one obtains the adaptation rule

w_g(t) = w_g(t - 1) + η_0(t)φ(t)e(t)    (38)

φ(t) = [q^-d/(A_m A_o)] ∂^g u(t)/∂w_g(t - 1)    (39)

where η_0 is an appropriate learning rate. This scheme parallels the MRAC based on the MIT rule for the linear plant [1].

4.2.2 Extended SPR rule (algorithm 3): A further manipulation of eqn. 39 gives

φ(t) = [D/(A_m A_o)] ψ(t)    (40)

where

ψ(t) = [q^-d/D] ∂^g u(t)/∂w_g(t - 1)    (41)

Since ψ(t) is an approximation of Γ_g(t)/b_0, a 1st-order approximation of e(t) in w̃_g(t - 1) gives

0 = e(t) + b_0 φ^T(t)w̃_g(t - 1) + O(ε_1^2)    (42)

or

e(t) = -b_0 φ^T(t)w̃_g(t - 1) + O(ε_1^2)

with

w̃_g(t - 1) = w_g(t - 1) - w_g^0    (43)

and

ε_1 = max ‖w̃_g(t - 1)‖

where w_g^0 contains the weights of the neural controller which cause e(t) to be zero and O(.) denotes the error order. If D is chosen such that b_0 D/(A_m A_o) is strictly positive real, the SPR rule [1] gives the following adjustment scheme for minimising e(t):

w_g(t) = w_g(t - 1) + η_*(t)ψ(t)e(t)    (44)

where η_* is a learning rate. Note that the previous two schemes are based on a gradient scheme which also relies on a 1st-order Taylor-series approximation.



In the special case of a linear plant with a linear controller, A and B correspond to the plant polynomials; R, S and T correspond to the controller polynomials, and w_g corresponds to the parameters of R, S and T. Further, since g{.} becomes linear, ∂^g u(t)/∂w_g contains u(t - 1), y(t), x(t) and their delayed values. The proposed three algorithms then reduce to the well-established MRAC schemes for the linear plant [1].

5 Convergence analysis

In this Section, we present the local-convergence properties of the proposed algorithms. The analysis of the local stability properties of nonlinear control systems involves the study of local behaviour through linearisation. Since in adaptive systems the plant and controller characteristics change slowly compared with the system states, separation of time scales gives simple yet effective ways of analysing the local stability of the closed-loop system [1].

In this procedure, first the stability of the closed-loop system is analysed for fixed controller parameters and the nominal values of the signals. Then the stability of the parameter-updating algorithm is studied separately. To ensure local stability of the adaptive-control strategy, it is required that both of the above result in locally stable systems.

It can be seen from eqn. 34 that the linearised closed-loop control system is given as

ỹ(t) = [q^-d BT/(AR + q^-d BS)] x̃(t) = G_c(q^-1) x̃(t)    (45)

Further, from eqn. 17, for the reference model

ỹ_m(t) = [q^-d' B_m/A_m] x̃(t) = G_m(q^-1) x̃(t)    (46)

where the tilde represents incremental values. On convergence, G_c must be equal to G_m. However, to satisfy this relationship with a physically realisable controller, d' and the degrees of the associated polynomials must satisfy some inequalities [1]. In terms of the problem stated in this paper, they translate to

d' >= d
deg A_o >= n - m + d - 1    (47)

deg R >= n - 1
deg S >= n - 1
deg T >= n - 1 + d - d'    (48)

These relationships can be used as guidelines for the choice of d', k, l and m. The characteristic equation of the closed-loop linearised system becomes

A_m A_o B+ = 0    (49)

To present the stability results we require the following assumptions:

(a) The neural net NN_g is chosen to be sufficiently large such that there exists a weight vector w_g^0 with

g{.} = NN_g(w_g^0, .)    (50)

(b) d', deg R, deg S and deg T are chosen to satisfy eqns. 47 and 48.

(c) The polynomial B(q^-1) at the operating point is discrete Hurwitz. Further, the polynomials A_m(q^-1) and A_o(q^-1) are chosen as discrete Hurwitz.


(d) For algorithm 1 (MRAC through identification of the forward model), the learning rates are chosen as [12]

η_g(t) = μ_g/(ε_g + γ_g(t)),  0 < μ_g < 2,  ε_g > 0    (51)
γ_g(t) = ‖Γ_g(t)‖^2    (52)

and similarly for η_f(t) in terms of μ_f, ε_f and γ_f(t) = ‖Γ_f(t)‖^2 (eqns. 53 and 54). Further, the neural net NN_f is chosen to be sufficiently large such that there exists a weight vector w_f^0 with

f{.} = NN_f(w_f^0, .)    (55)

(e) For algorithm 2 (extended MIT), the learning rate is chosen as

η_0(t) = μ_φ/(ε_φ + b_0 γ_φ(t)),  0 < μ_φ < 2,  ε_φ > 0    (56)
γ_φ(t) = ‖φ(t)‖^2    (57)

(f) For algorithm 3 (extended SPR rule), the learning rate is chosen as

η_*(t) = μ_*/(ε_* + γ_ψ(t)),  0 < μ_* < 2,  ε_* > 0    (58)
γ_ψ(t) = ψ^T(t)Γ_g(t)    (59)
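As a sketch, the normalised learning rate of eqns. 56-57 (as reconstructed here), with an upper bound on b_0 substituted for the unknown gain as suggested in Remark 3; the numerical values mirror Table 1 for example 1 but are otherwise illustrative:

```python
def learning_rate(phi, mu=0.5, eps=1e-3, b0_max=4.0):
    """Normalised learning rate in the style of eqns. 56-57:
    eta(t) = mu / (eps + b0 * ||phi(t)||^2), with 0 < mu < 2 and
    an upper bound b0_max used in place of the unknown b_0
    (Remark 3)."""
    gamma = sum(p * p for p in phi)      # gamma(t) = ||phi(t)||^2
    return mu / (eps + b0_max * gamma)

eta = learning_rate([0.3, -0.1, 0.2])    # hypothetical regressor sample
```

This normalisation keeps eta(t) * b_0 * ‖φ(t)‖^2 below μ < 2, which is what drives the monotone decrease of the parameter error in eqns. 66-67.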

We first present the result for algorithm 2 (extended MIT).

Proposition: Let assumptions (a)-(c) and (e) hold. Then the MRAC given by algorithm 2 is locally stable.

Proof: Assumption (a) implies that the function g{.} is completely describable by the net NN_g, which, together with assumption (b), ensures that on convergence the linearised closed-loop equation is given by eqns. 45 and 46, where

ỹ(t) = y(t) - y_ss    (60)

with

A_m(1)y_ss = B_m(1)x_ss  and  y_ss = y_m,ss

The subscript ss indicates steady-state values. It now follows from eqn. 49 that, when assumption (c) holds, for fixed weights of the neural net NN_g the local stability of the control system is ensured.

Since the neural net NN_g is not fixed, the weight-update equation must also be locally stable. Consider the training algorithm given by eqn. 13. In the vicinity of w_g^0, a linearisation of e(w_g^0, t) in w_g(t - 1) and assumption (b) imply

0 = e(w_g^0, t) = e(t) + Γ_g^T(t)w̃_g(t - 1)    (63)

or

e(t) = -Γ_g^T(t)w̃_g(t - 1) = -b_0 φ^T(t)w̃_g(t - 1)    (64)

where w̃_g(t - 1) and φ(t) are defined in eqns. 43 and 39, respectively. The last relationship follows from the fact that, at convergence, Γ_g(t) equals b_0 φ(t).



Substituting eqn. 64 in eqn. 38, and subtracting w_g^0 from both sides, one obtains

w̃_g(t) = {I - η_0(t)b_0 φ(t)φ^T(t)}w̃_g(t - 1)    (65)

Taking the squared norm of both sides and using eqn. 56, it follows that

‖w̃_g(t)‖^2 = ‖w̃_g(t - 1)‖^2 - η_0(t)b_0[2 - η_0(t)b_0‖φ(t)‖^2][φ^T(t)w̃_g(t - 1)]^2    (66)

With the ranges of μ_φ and ε_φ given in eqns. 56 and 57, it follows that

‖w̃_g(t)‖^2 <= ‖w̃_g(t - 1)‖^2    (67)

This shows that w_g^0 is locally stable. If, in addition, φ^T(t)w̃_g ≠ 0 for nonzero w̃_g, for almost all t, it follows that

‖w̃_g(t)‖^2 < ‖w̃_g(t - 1)‖^2    (68)

ensuring local asymptotic stability of w_g^0. Thus asymptotic convergence to the optimal weight w_g^0 is not ensured unless φ^T(t)w̃_g ≠ 0 for nonzero w̃_g, for almost all t. This requires that x(t) be sufficiently exciting. Such an observation is typical in adaptive control.

Remark 1: To ensure local stability it is required that the linearised plant model given by eqn. 45 at the operating point be minimum phase. If any of the zeros of the linearised plant lies outside the unit circle, to ensure stability it must be included in the zeros of G_m(z).

Remark 2: Using a similar approach, it can be shown that, for the local stability of algorithm 1, it is required that assumptions (a)-(d) hold, while for algorithm 3, local stability is ensured if assumptions (a)-(c) and (f) hold.

Remark 3: Computation of η_0(t) based on eqn. 56 requires knowledge of b_0. An upper bound on b_0 may be used instead. Recall that algorithm 2 is based on the assumption that b_0 is positive.

Remark 4: Eqn. 58 is of limited practical use, as Γ_g(t) is not known in algorithm 3. However, if b_0 D/(A_m A_o) is SPR, γ_ψ would be positive most of the time. This implies that, if η_*(t) is chosen as a small positive number, eqn. 58 would be satisfied for some μ. One practical approach to selecting the learning rate in this case is to use an algorithm similar to eqn. 51, yielding

η_*(t) ≈ μ/(ε + γ̄(t)),  μ > 0,  ε > 0    (69)
γ̄(t) = ‖ψ(t)‖^2    (70)

where μ is to be chosen through a simulation study.

Remark 5: If the sizes of the neural networks are not large enough to represent the nonlinearity perfectly (i.e. if the assumptions of eqns. 50 and 55 do not hold), a dead zone may be introduced in the neural-weight-updating algorithms, i.e. parameter updating may be suspended unless the absolute error is larger than a threshold.

Fig. 7  Gain and time constant of the linearised plant in example 1
a Steady-state gain
b Time constant

Fig. 8  Open-loop step response and MRAC of plant 1
a Open-loop step response with desired output ±0.25
b MRAC using algorithm 1
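A minimal sketch of the dead-zone modification of Remark 5, for a generic gradient-style weight update (the threshold, regressor and signals are hypothetical):

```python
def dead_zone_step(w, phi, e, eta, threshold):
    """Weight update with a dead zone: updating is suspended unless
    |e| exceeds the threshold, so that residual approximation error
    of an undersized network does not drift the weights."""
    if abs(e) <= threshold:
        return list(w)                        # inside the dead zone: freeze
    return [wi + eta * p * e for wi, p in zip(w, phi)]

w = [0.1, -0.2]
w_small = dead_zone_step(w, [1.0, 1.0], 0.01, 0.5, threshold=0.05)  # frozen
w_large = dead_zone_step(w, [1.0, 1.0], 0.20, 0.5, threshold=0.05)  # updated
```

The threshold should be matched to the size of the irreducible approximation error; too large a dead zone simply stops adaptation early.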

6 Simulation studies

This Section provides the results of a simulation study on two examples. In all the simulations, the neural network has been chosen with one hidden layer. A bias term has been used in every neuron. The nonlinear activation unit in the neural network has been the hyperbolic tangent function. The initial weights of the networks were set to small random values. The controller order has been chosen from the guidelines given in eqns. 47 and 48.

Table 1: Equations and parameters used in the simulation

                       Algorithm 1                Algorithm 2                  Algorithm 3
Equations used         13, 29, 32, 51, 53         38, 39, 56                   41, 44, 69
Parameters, example 1  μ_g = 0.5; μ_f = 0.5;      max{b_0} = 4; μ_φ = 0.5;     μ = 1;
                       ε_g = 0.001; ε_f = 0.001   ε_φ = 0.001                  ε = 0.001
Parameters, example 2  μ_g = 1.0; μ_f = 0.5;      max{b_0} = 0.5; μ_φ = 0.5;   μ = 1;
                       ε_g = 0.001; ε_f = 0.001   ε_φ = 0.001                  ε = 0.001

6.1 Example 1
A NARX model described by the following equation has been considered [13]:

y(t) = -0.9y(t - 1)/[1 + y^2(t - 1)] + u(t)    (71)
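Eqn. 71 (as reconstructed here; the scope of the fraction is our reading of the garbled original) can be simulated directly to examine the character of the open-loop step response:

```python
def plant_71(y_prev, u):
    # NARX plant of eqn. 71 as reconstructed here: the output feedback
    # passes through the saturating nonlinearity 1/(1 + y^2)
    return -0.9 * y_prev / (1.0 + y_prev ** 2) + u

y, traj = 0.0, []
for _ in range(60):
    y = plant_71(y, 0.25)        # constant command of magnitude 0.25
    traj.append(y)
```

In this reconstruction the response alternates while settling, in keeping with the oscillatory tendency reported for the open-loop response of Fig. 8a.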

Fig. 9  MRAC of plant 1
a Using algorithm 2
b Using algorithm 3 (D = 1)

Fig. 10  Plant input during MRAC of plant 1 using algorithm 2

The variations of the steady-state gain and the time constant of the linearised plant against the operating point are shown in Fig. 7. The time constant is plotted in terms of the sampling intervals. It can be observed that, although the steady-state gain in the given range does not vary to a large extent, the time constant changes significantly. The open-loop step response of the plant for a desired output value of ±0.25 is shown in Fig. 8a, which shows the tendency of limit-cycle oscillation. For the MRAC scheme the command input x(t) is varied over the range ±0.25 and the reference model has been taken as

y_m(t) - 0.8y_m(t - 1) = 0.2x(t - 1)    (72)

Inequalities 47 and 48 have been satisfied by taking l = k = m = 1. This corresponds to A_o(q^-1) = 1. For this example, three neurons are used in the hidden layer of NN_f and five neurons are used in the hidden layer of NN_g. The equations and parameters used for the different algorithms are shown in Table 1. For the SPR method D(q^-1) has been chosen as 1. Figs. 8b, 9a and 9b show the reference output and the model output for the three proposed algorithms. It can be observed that the model following is not perfect in most cases; however, it is satisfactory. Note that it is not even known whether, for the assumed controller structure, a controller exists which is capable of following the model perfectly. Fig. 10 shows the plant input when algorithm 2 has been applied. The inputs for the other algorithms have been similar.

Fig. 11  Steady-state gain and dominant time constant of the linearised plant in example 2
a Steady-state gain
b Dominant time constant

The large oscillations during the initial iterations may be avoided by starting with offline-trained neural networks (for the nominal plant) instead of neural nets with random weights.


IEE Proc.-Control Theory Appl., Vol. 141, No. 5, September 1994


6.2 Example 2

A second-order plant described by the following NARX model is used:

y(t) = 0.9722y(t - 1) + 0.3578u(t - 1) - 0.1295u(t - 2)
- 0.3103y(t - 1)u(t - 1) - 0.04228y²(t - 2)
+ 0.1663y(t - 2)u(t - 2)
- 0.03259y²(t - 1)y(t - 2)
- 0.3513y²(t - 1)u(t - 2)
+ 0.3084y(t - 1)y(t - 2)u(t - 2)
+ 0.1087y(t - 2)u(t - 1)u(t - 2)   (73)
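For checking an implementation, the NARX map of eqn. 73 can be written as a one-step predictor. A sketch; the exact factors of some cross terms (e.g. y(t-1)u(t-1), y(t-2)u(t-1)u(t-2)) are an assumed reading of the print and should be checked against Reference 3:

```python
def narx_plant(y1, y2, u1, u2):
    """One-step prediction y(t) from y(t-1), y(t-2), u(t-1), u(t-2),
    transcribing the terms of eqn. 73 (cross-term factors assumed)."""
    return (0.9722 * y1 + 0.3578 * u1 - 0.1295 * u2
            - 0.3103 * y1 * u1 - 0.04228 * y2 ** 2
            + 0.1663 * y2 * u2 - 0.03259 * y1 ** 2 * y2
            - 0.3513 * y1 ** 2 * u2 + 0.3084 * y1 * y2 * u2
            + 0.1087 * y2 * u1 * u2)
```

Two quick sanity checks: the map fixes the origin (all arguments zero give zero), and for y(t-1) = 1 with all other arguments zero, only the linear term survives and the prediction equals 0.9722.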

The model is obtained through identification of a laboratory-scale liquid-level system [3]. The variations of the steady-state gain and the dominant time constant of the linearised plant against the operating point are shown in Figs. 11a and 11b, respectively. The large variations of both the steady-state gain and the dominant time constant in the given range indicate a high degree of nonlinearity in the plant. For this example, the command input x(t) is varied between ±0.5 and the reference model is taken as

ym(t) - 0.8ym(t - 1) - 0.025ym(t - 2) = 0.2x(t - 1) - 0.025x(t - 2)   (74)
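A quick arithmetic check confirms that this reference model has unity steady-state gain, so a constant command is tracked exactly. A small sketch (variable names are illustrative):

```python
# Analytic DC gain of eqn. 74: (0.2 - 0.025)/(1 - 0.8 - 0.025) = 0.175/0.175
dc_gain = (0.2 - 0.025) / (1.0 - 0.8 - 0.025)

# Simulated steady state for a constant command x = 0.5
ym1 = ym2 = 0.0  # ym(t-1), ym(t-2)
for _ in range(200):
    ym1, ym2 = 0.8 * ym1 + 0.025 * ym2 + 0.2 * 0.5 - 0.025 * 0.5, ym1
```

Both routes give the same answer: the gain is 1 and the simulated output converges to the command value 0.5 (the model is stable, with poles at roughly 0.83 and -0.03).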

Inequalities 47 and 48 have been satisfied by taking l = k = m = 2. This corresponds to A(q⁻¹) = 1. For this example, three neurons are used in the hidden layer of both NNf and NNg. The equations and parameters used for the different algorithms are shown in Table 1. Figs.


12a and 12b show the results for algorithms 1 and 2, respectively. Figs. 13a and 13b show the results for algorithm 3, where D(q⁻¹) has been chosen as 1 - 0.9q⁻¹ and

Fig. 13 MRAC of plant 2 using algorithm 3: (a) D = 1 - 0.9q⁻¹; (b) D = 1 - 0.5q⁻¹

1 - 0.5q⁻¹, respectively. It can be observed that the initial convergence can differ significantly between algorithms and between choices of D(q⁻¹) (for the SPR rule). Fig. 14 shows the plant input when algorithm 2 has been

Fig. 14 Plant input during MRAC of plant 2 using algorithm 2

Fig. 12 MRAC of plant 2: (a) using algorithm 1; (b) using algorithm 2


used. This figure, together with Fig. 12b, shows that once the plant output has converged to the reference-model output, the control input is satisfactory. The inputs for the other algorithms have been similar.

From Figs. 8, 9, 12 and 13, it can be observed that algorithm 1 provides fast convergence and better tracking of the reference-model output. This is expected, as the method incorporates the identified model in the



controller-adjustment mechanism. The online identification of the forward model, however, increases the computation and memory requirements. The other two algorithms reduce the computational requirement by bypassing the identification part. Algorithm 2 is gradient based and is therefore likely to produce satisfactory results in the face of unmodelled dynamics in the plant, by providing a least-mean-squares adaptation of the controller. Although algorithm 3, which is based on the SPR rule, does not enjoy such a property, it offers the flexibility of choosing the D polynomial (see eqn. 41), which may advantageously be used to alter the convergence properties of the algorithm.
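The flavour of the gradient-based (MIT-rule) update behind algorithm 2 can be conveyed by a toy scalar example. The plant, controller structure and gains below are hypothetical, not from the paper; they are chosen so that perfect model following exists at θ = 0.8, and the sensitivity φ = ∂y/∂θ is generated by the same linear filter that maps u to y:

```python
def mit_rule_demo(gamma=0.05, steps=500):
    """Adapt a feedforward gain u = theta * x by the MIT rule,
    theta <- theta - gamma * e * d(e)/d(theta), on a toy linear plant."""
    y = ym = phi = theta = 0.0
    x = 1.0                               # constant command input
    for _ in range(steps):
        u = theta * x
        y_next = 0.5 * y + u              # plant:  y(t) = 0.5 y(t-1) + u(t-1)
        ym_next = 0.5 * ym + 0.8 * x      # model: ym(t) = 0.5 ym(t-1) + 0.8 x(t-1)
        phi = 0.5 * phi + x               # sensitivity: dy(t)/dtheta
        e = y_next - ym_next              # model-following error
        theta -= gamma * e * phi          # gradient step on e**2 / 2
        y, ym = y_next, ym_next
    return theta
```

The adapted gain settles at θ ≈ 0.8, the value giving perfect model following. The command here is a constant, which is minimal excitation; varying the command over a range, as in the examples above, also keeps the adaptation better excited.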

7 Conclusion

In this paper the concept of the block partial derivative and its associated algebra have been introduced. The algebra is applied to the model-reference adaptive control (MRAC) of discrete-time nonlinear plants. A feedforward multilayered neural network has been used to represent the nonlinearities of the plant and the controller. Three variants of the MRAC scheme, namely MRAC through plant identification, MRAC with the extended MIT rule and MRAC based on the extended SPR rule, are derived. The local-convergence properties of the algorithms are analysed, and this provides guidelines for choosing the controller order and the learning rate for the neural-net training. Results of simulation studies on two examples establish the feasibility of the proposed algorithms and validate the theoretical developments.

Although the proposed block partial-derivative algebra is applied here to an MRAC scheme and neural-net-based control, it can also be used for other optimisation-based control of nonlinear plants, as well as control based on other approximations of the nonlinearity. The block-by-block derivative evaluation permits the computation of the gradient vector with respect to the adjustable parameters.

8 References

1 ÅSTRÖM, K.J., and WITTENMARK, B.: 'Adaptive control' (Addison-Wesley, 1989)

2 HSIA, T.C.: 'System identification: least squares methods' (Lexington Books, Lexington, MA, 1976)

3 SALES, K.R., and BILLINGS, S.A.: 'Self-tuning control of nonlinear ARMAX models', Int. J. Control, 1990, 51, pp. 753-769

4 STINCHCOMBE, M., HORNIK, K., and WHITE, H.: 'Multilayer feedforward networks are universal approximators', Neural Networks, 1989, 2, pp. 359-366

5 CYBENKO, G.: 'Approximation by superposition of a sigmoidal function', Math. Control Signals Syst., 1989, 2, pp. 303-314

6 NARENDRA, K.S., and PARTHASARATHY, K.: 'Identification and control of dynamical systems using neural networks', IEEE Trans. Neural Netw., 1990, 1, (1), pp. 4-27

7 CHEN, F.C.: 'Back-propagation neural networks for nonlinear self-tuning adaptive control', IEEE Control Syst. Mag., 1990, 10, (1), pp. 44-48

8 LIU, C.C., and CHEN, F.C.: 'Adaptive control of nonlinear continuous-time systems using neural networks - general relative degree and MIMO cases', Int. J. Control, 1993, 58, (2), pp. 317-335

9 SOKOLNIKOFF, I.S., and REDHEFFER, R.M.: 'Mathematics of physics and modern engineering' (McGraw-Hill, 1970)

10 WERBOS, P.J.: 'Backpropagation through time: what it does and how to do it', Proc. IEEE, 1990, 78, pp. 1550-1560

11 RUMELHART, D.E., HINTON, G.E., and WILLIAMS, R.J.: 'Learning internal representations by error propagation', in 'Parallel distributed processing: explorations in the microstructure of cognition', Vol. 1 (Bradford Books, Cambridge, MA, 1986)

12 AHMED, M.S., and ANJUM, M.F.: 'Learning rate algorithm for neural net based direct adaptive control', Arab. J. Sci. Eng., 1993, 18, pp. 494-513

13 SU, H., QIN, S., and McAVOY, T.J.: 'Comparisons of four neural net learning methods for dynamic system identification', IEEE Trans. Neural Netw., 1992, 3, pp. 112-130


