Journal of Intelligent & Fuzzy Systems 29 (2015) 647–658, DOI: 10.3233/IFS-141376, IOS Press


Nonlinear discrete time optimal control based on Fuzzy Models

Xin Jin and Yung C. Shin∗
School of Mechanical Engineering, Purdue University, West Lafayette, IN, USA

Abstract. The approach of designing a discrete time optimal controller for a nonlinear system represented by a fuzzy model is presented in this paper. A fuzzy model with product inference engine, singleton fuzzifier, center average defuzzifier, and Gaussian membership functions is trained by the orthogonal least square (OLS) learning algorithm based on given input-output data pairs. An optimal control scheme is then formulated based on the fuzzy model. The numerical solution of the problem is achieved by use of a feasible-direction algorithm. To show the effectiveness of the proposed method, the simulation results of three nonlinear optimal control problems are presented. The results show that the performance of the proposed approach is quite similar to that of optimal control of the system represented by an explicit mathematical model, thus demonstrating the efficacy of the proposed scheme for optimal control of unknown nonlinear systems.

Keywords: Optimal controller, nonlinear system, fuzzy model, feasible-direction algorithm

1. Introduction

Optimal control theory, which has played an important role in the design of modern control systems, has as its objective the maximization of the return from, or the minimization of the cost of, the operation of physical, social, and economic processes [1]. To date, enormous efforts have been spent on the development of computational techniques for solving optimal control problems [2, 3]. However, many of these optimal control strategies are based on an explicit mathematical model of the system. It is well known that modeling and identification of the dynamics of a given nonlinear system are time consuming iterative endeavors that require model design, parameter identification and model validation at each step of the iteration. In contrast, fuzzy models can be easily established from the input-output data pairs. Therefore, a method of applying the classical nonlinear optimal control theory to fuzzy models is presented here.

∗Corresponding author. Yung C. Shin, School of Mechanical Engineering, Purdue University, West Lafayette, 47907 IN, USA. Tel.: +1 765 494 9775; Fax: +1 765 494 0539; E-mail: [email protected]

A fuzzy model consists of four components: fuzzy rule base, fuzzy inference engine, fuzzifier and defuzzifier [4]. The Stone-Weierstrass theorem shows that the fuzzy model with product inference engine, singleton fuzzifier, center average defuzzifier, and Gaussian membership functions has universal approximation capability, which means it can approximate any continuous nonlinear function on a compact set to arbitrary accuracy [5]. Therefore, this kind of fuzzy model is used to model nonlinear systems in the present study.

There are many different ways to train a fuzzy model, such as the back-propagation algorithm [6], gradient descent, least squares [7], clustering [8] and the OLS algorithm [9–11]. The OLS algorithm is among the most efficient and widely used of these methods. Specifically, an initial fuzzy system is first constructed with as many fuzzy basis functions as input-output pairs, and the OLS algorithm is then used to select the significant fuzzy basis functions that make up the final fuzzy model [9, 10].

1064-1246/15/$35.00 © 2015 – IOS Press and the authors. All rights reserved


Optimal control was introduced in the 1950s with the use of dynamic programming (leading to Hamilton-Jacobi-Bellman (HJB) partial differential equations) and the Pontryagin maximum principle (a generalization of the Euler-Lagrange equations deriving from the calculus of variations) [1, 12, 13]. However, the optimal control of nonlinear systems is still one of the most challenging and difficult subjects in control theory. In recent years, adaptive/approximate dynamic programming (ADP) algorithms [14–16] have gained much attention from researchers. ADP is a reinforcement learning approach based on adaptive critics that solves dynamic programming problems by using function approximation for the value function. It can be based on value iterations or policy iterations. In [17], a successive approximation method using the generalized Hamilton-Jacobi-Bellman (GHJB) equation was proposed to solve the near-optimal control problem for affine nonlinear discrete time systems; it requires a small perturbation assumption and an initially stable policy. The complete dynamics of the affine nonlinear system were assumed to be known in that approach.

In [18], the Q-learning policy iteration method was used to find the optimal strategies for linear discrete time systems, whose dynamics are defined by constant matrices, without requiring the system dynamics to be known. However, this method works only for linear systems, and it is not clear how to select the number of iterations required for convergence and stability.

Optimal control strategies for unknown affine nonlinear discrete time systems of the form x(k + 1) = f(x(k)) + g(x(k))u(k) [19–22] or continuous time linear systems [23] using offline trained neural networks have been presented. These schemes do not require explicit knowledge of the system dynamics, as only the learned neural network model is needed. A neural network is first used to learn the complete plant dynamics, and offline ADP is then carried out using only the learned neural network model, resulting in a novel optimal control law. However, these schemes can only be applied to the specific class of affine nonlinear discrete time systems or to continuous time linear systems.

Fuzzy models offer several advantages over neural network models. Using fuzzy basis function expansions, two sets of fuzzy basis functions can be easily combined, one generated from input-output pairs and the other obtained from linguistic fuzzy IF-THEN rules that may contain information which is not contained in the input-output data pairs [8]. Therefore, fuzzy models are used in the present work instead of neural network models, to approximate general nonlinear systems that are not limited to nonlinear affine systems.

The feasible-direction algorithm [24, 25] is used to obtain the numerical solution of the Euler-Lagrange equations of the formulated discrete time optimal control problem. This algorithm uses steepest descent to find the search direction and then applies a one-dimensional search routine to find the best step length at each iteration. It is computationally efficient and easy to implement. Finally, the proposed approach is applied to three general nonlinear systems to show its efficacy for control of unknown nonlinear systems. The results are quite similar to those of optimal control of the systems represented by explicit mathematical models, thus validating its effectiveness. In addition, the optimal control solutions based on the two kinds of models can be found in almost the same number of iterative steps. The computation time per step is longer for fuzzy models than for explicit mathematical models, because a fuzzy model has many more terms than an explicit mathematical model of the same dynamic system, but it is still very short given the computation speed of current computers.

Therefore, the proposed method is used to calculate the numerical solutions of the optimal control problems based on fuzzy models which approximate a general form of nonlinear discrete time systems (x(k + 1) = f(x(k), u(k))). The simulation results are very similar to those based on the explicit mathematical models, which demonstrates that the proposed scheme can achieve very accurate nonlinear optimal control results without the time consuming modeling and system identification procedures.

2. Fuzzy model

A fuzzy model consists of four principal elements [26]: fuzzifier, fuzzy rule base, fuzzy inference engine, and defuzzifier. A nonlinear discrete time multi-input, multi-output (MIMO) system can be separated into a group of multi-input, single-output (MISO) systems: $U \subset R^{n+m} \to R$, where $U$ is compact. The fuzzy model is established in state space form, such that the inputs of the fuzzy system are the $n$ states and $m$ inputs of the system and its output is the value of one state of the system at the next time instant.

MIMO fuzzy systems with singleton fuzzifier, product inference, center average defuzzifier, and Gaussian membership functions can be represented as follows, for p = 1, . . . , n [8, 9]:


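$$x_p(k+1) = f_p(x(k), u(k)) = \frac{\sum_{l=1}^{M_p} w_p^l \left(\prod_{i=1}^{n} \mu_{A_{ip}^l}(x_i(k))\right)\left(\prod_{j=1}^{m} \mu_{B_{jp}^l}(u_j(k))\right)}{\sum_{l=1}^{M_p} \left(\prod_{i=1}^{n} \mu_{A_{ip}^l}(x_i(k))\right)\left(\prod_{j=1}^{m} \mu_{B_{jp}^l}(u_j(k))\right)} \tag{1}$$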

where $f_p: U \subset R^{n+m} \to R$, $x_p(k+1)$ is the $p$th state of the system at time index $k+1$, $w_p^l$ is the singleton, $x(k) = [x_1(k), x_2(k), \ldots, x_n(k)]^T$ is the state vector of the system at time index $k$ and $\mu_{A_{ip}^l}(x_i(k))$ is the Gaussian membership function, defined by

$$\mu_{A_{ip}^l}(x_i(k)) = \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{ip}^l}{\sigma_{x_{ip}}^l}\right)^2\right) \tag{2}$$

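where $x_{ip}^l$ and $\sigma_{x_{ip}}^l$ are the center and width of the membership function for $x_i(k)$, respectively. Similarly, $u(k) = [u_1(k), u_2(k), \ldots, u_m(k)]^T$ is the input vector of the system at time index $k$ and $\mu_{B_{jp}^l}(u_j(k))$ is the Gaussian membership function, defined by

$$\mu_{B_{jp}^l}(u_j(k)) = \exp\left(-\frac{1}{2}\left(\frac{u_j(k) - u_{jp}^l}{\sigma_{u_{jp}}^l}\right)^2\right) \tag{3}$$

where $u_{jp}^l$ and $\sigma_{u_{jp}}^l$ are the center and width of the membership function for $u_j(k)$.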
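As an illustration of Eqs. (1) to (3), the following minimal sketch evaluates one MISO fuzzy model in plain Python. The rule container (a list of dictionaries with keys `w`, `xc`, `xs`, `uc`, `us`) and the two example rules are assumptions made for this illustration only; they are not taken from the paper.

```python
import math

def gaussian(v, center, width):
    """Gaussian membership value, as in Eqs. (2) and (3)."""
    return math.exp(-0.5 * ((v - center) / width) ** 2)

def fuzzy_output(x, u, rules):
    """Evaluate one MISO fuzzy model f_p(x, u) following Eq. (1).

    `rules` is a hypothetical container: each rule holds the singleton 'w',
    state centers/widths 'xc'/'xs' and input centers/widths 'uc'/'us'.
    """
    num = 0.0
    den = 0.0
    for r in rules:
        a = 1.0  # product inference: multiply all membership values
        for xi, c, s in zip(x, r["xc"], r["xs"]):
            a *= gaussian(xi, c, s)
        for uj, c, s in zip(u, r["uc"], r["us"]):
            a *= gaussian(uj, c, s)
        num += r["w"] * a
        den += a
    return num / den  # center average defuzzifier

# Two made-up rules with one state and one input.
rules = [
    {"w": 1.0, "xc": [0.0], "xs": [0.5], "uc": [0.0], "us": [0.5]},
    {"w": 3.0, "xc": [1.0], "xs": [0.5], "uc": [0.0], "us": [0.5]},
]
# Halfway between the two centers both rules fire equally,
# so the output is the average of the two singletons.
y_mid = fuzzy_output([0.5], [0.0], rules)
```

Because of the normalization in the denominator, the output always lies between the smallest and the largest singleton of the rule base.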

3. Orthogonal least square algorithm

The OLS algorithm is a very efficient and widely used method for training a fuzzy model. The OLS algorithm is a one-pass regression procedure, and is therefore much faster than other algorithms. Also, the OLS algorithm generates a robust fuzzy model that is not sensitive to noise in its inputs [8, 9]. In this paper, the widths (σ) of the fuzzy model are first fixed to cover the input state region. The resulting fuzzy model is then equivalent to a series expansion of fuzzy basis functions, which is linear in the parameters [8, 9]:

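$$f_p(x(k), u(k)) = \sum_{l=1}^{M_p} w_p^l h_p^l(x(k), u(k)) \tag{4}$$

where

$$h_p^l(x(k), u(k)) = \frac{\left(\prod_{i=1}^{n} \mu_{A_{ip}^l}(x_i(k))\right)\left(\prod_{j=1}^{m} \mu_{B_{jp}^l}(u_j(k))\right)}{\sum_{l=1}^{M_p} \left(\prod_{i=1}^{n} \mu_{A_{ip}^l}(x_i(k))\right)\left(\prod_{j=1}^{m} \mu_{B_{jp}^l}(u_j(k))\right)} \tag{5}$$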

However, since the normalization factor in the denominator is not known before the fuzzy basis functions are selected, a pseudo-fuzzy basis function needs to be defined as follows [10]:

$$q_p^l(x(k), u(k)) = \left(\prod_{i=1}^{n} \mu_{A_{ip}^l}(x_i(k))\right)\left(\prod_{j=1}^{m} \mu_{B_{jp}^l}(u_j(k))\right) \tag{6}$$

Then the fuzzy basis function can be expressed in terms of pseudo-fuzzy basis functions as follows [10]:

$$h_p^l(x(k), u(k)) = \frac{q_p^l(x(k), u(k))}{\sum_{l=1}^{M_p} q_p^l(x(k), u(k))} \tag{7}$$

For $N$ input-output training pairs $([x^t(k); u^t(k)], x_p^t(k+1))$, the following matrix form can be derived from $t = 1$ to $N$ [11]:

$$d = Hw + e \tag{8}$$

where $d = [x_p^1(k+1), \ldots, x_p^N(k+1)]^T$ and $H = [h_p^1, \ldots, h_p^{M_p}]$, with $h_p^l = [h_p^l(x^1(k); u^1(k)), \ldots, h_p^l(x^N(k); u^N(k))]^T$, $w = [w_p^1, \ldots, w_p^{M_p}]^T$, and $e = [e_1, \ldots, e_N]^T$.

The classical Gram-Schmidt orthogonal least-squares algorithm is used to determine the significant pseudo-fuzzy basis functions, which can then be normalized to fuzzy basis functions [9], and the weighting factors can be calculated as [11]:

$$w = (H^T H)^{-1} H^T d \tag{9}$$
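The selection step can be sketched as a forward regression with classical Gram-Schmidt orthogonalization, in the spirit of [9, 11]. The sketch below is a simplified illustration and not the authors' implementation: the function name `ols_select`, the fixed number of selected columns, and the tiny data set are assumptions. In the paper's setting each candidate column would hold one pseudo-fuzzy basis function evaluated at the N training points; the later normalization to fuzzy basis functions is omitted here.

```python
def ols_select(P, d, n_select):
    """Greedy OLS selection (illustrative sketch).

    P: list of candidate regressor columns, each a list of N numbers.
    d: list of N target values.
    Returns (indices of selected columns, least-squares weights for them).
    """
    dot = lambda a, b: sum(ai * bi for ai, bi in zip(a, b))
    dd = dot(d, d)
    chosen, W, g, A = [], [], [], []  # A[k]: projections of k-th pick on earlier W
    for _ in range(n_select):
        best = None
        for i, p in enumerate(P):
            if i in chosen:
                continue
            # Orthogonalize candidate p against the already selected directions.
            alphas = [dot(wj, p) / dot(wj, wj) for wj in W]
            w = list(p)
            for a, wj in zip(alphas, W):
                w = [wi - a * wji for wi, wji in zip(w, wj)]
            ww = dot(w, w)
            if ww < 1e-12:  # numerically dependent on selected columns
                continue
            gi = dot(w, d) / ww
            err = gi * gi * ww / dd  # error reduction ratio of this candidate
            if best is None or err > best[0]:
                best = (err, i, w, gi, alphas)
        if best is None:
            break
        _, i, w, gi, alphas = best
        chosen.append(i)
        W.append(w)
        g.append(gi)
        A.append(alphas)
    # Back substitution: recover weights of the original (non-orthogonal) columns.
    m = len(chosen)
    theta = [0.0] * m
    for k in range(m - 1, -1, -1):
        theta[k] = g[k] - sum(A[j][k] * theta[j] for j in range(k + 1, m))
    return chosen, theta

# Made-up data: the target is 2 * column 0 - 1 * column 2.
P = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 1]]
d = [2, 0, -1, -1]
chosen, theta = ols_select(P, d, 2)
```

For the selected columns, the weights returned by the back substitution coincide with the least-squares solution of the form (9) restricted to those columns.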

4. Optimal control for fuzzy model

The procedure of designing a discrete time optimal controller for nonlinear systems represented by a fuzzy model is presented in this section. In this paper, a feasible-direction algorithm is used to achieve the numerical solution of the Euler-Lagrange equations of the formulated discrete time optimal control problem [24].

The general problem considered in the solution algorithm is that of minimizing a cost function:

$$J = \theta[x(N)] + \sum_{k=0}^{N-1} \varphi[x(k), u(k)] \tag{10}$$

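subject to the MIMO fuzzy model trained by the OLS algorithm for p = 1, . . . , n:

$$x_p(k+1) = f_p(x(k), u(k)) = \frac{\sum_{l=1}^{M_p} w_p^l \prod_{i=1}^{n} \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{ip}^l}{\sigma_{x_{ip}}^l}\right)^2\right) \prod_{j=1}^{m} \exp\left(-\frac{1}{2}\left(\frac{u_j(k) - u_{jp}^l}{\sigma_{u_{jp}}^l}\right)^2\right)}{\sum_{l=1}^{M_p} \prod_{i=1}^{n} \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{ip}^l}{\sigma_{x_{ip}}^l}\right)^2\right) \prod_{j=1}^{m} \exp\left(-\frac{1}{2}\left(\frac{u_j(k) - u_{jp}^l}{\sigma_{u_{jp}}^l}\right)^2\right)} \tag{11}$$

$$x(0) = x_0 \tag{12}$$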

The augmented cost function is represented by

$$J_a = \theta[x(N)] + \sum_{k=0}^{N-1} \left\{ \varphi[x(k), u(k)] + \lambda^T(k+1)\left(f[x(k), u(k)] - x(k+1)\right) \right\} \tag{13}$$

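The gradient of $J_a$ with respect to $u$ is given by

$$g(k) = \frac{\partial \sum_{k=0}^{N-1} \varphi[x(k), u(k)]}{\partial u(k)} + \frac{\partial f[x(k), u(k)]^T}{\partial u(k)} \lambda(k+1) \tag{14}$$

where

$$\frac{\partial f[x(k), u(k)]}{\partial u(k)} = \begin{pmatrix} \dfrac{\partial f_1[x(k), u(k)]}{\partial u_1(k)} & \cdots & \dfrac{\partial f_1[x(k), u(k)]}{\partial u_m(k)} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_n[x(k), u(k)]}{\partial u_1(k)} & \cdots & \dfrac{\partial f_n[x(k), u(k)]}{\partial u_m(k)} \end{pmatrix}$$

and $k = 1, 2, \ldots, N-1$.

$\partial f[x(k), u(k)] / \partial u(k)$ is very easy to derive based on an explicit mathematical model of the nonlinear system. However, if based on a fuzzy model of the system, $\partial f[x(k), u(k)] / \partial u(k)$ should be computed as: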

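$$\frac{\partial f_p[x(k), u(k)]}{\partial u_q(k)} = \frac{\sum_{l_a=1}^{M_p} \sum_{l_b=1}^{M_p} a_p^{l_a} a_p^{l_b} q_p^{l_a} \left(w_p^{l_a} - w_p^{l_b}\right)}{\left(\sum_{l=1}^{M_p} a_p^{l}\right)^2} \tag{15}$$

where

$$a_p^{l_a} = \prod_{i=1}^{n} \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{ip}^{l_a}}{\sigma_{x_{ip}}^{l_a}}\right)^2\right) \times \prod_{j=1}^{m} \exp\left(-\frac{1}{2}\left(\frac{u_j(k) - u_{jp}^{l_a}}{\sigma_{u_{jp}}^{l_a}}\right)^2\right) \tag{16a}$$

$$a_p^{l_b} = \prod_{i=1}^{n} \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{ip}^{l_b}}{\sigma_{x_{ip}}^{l_b}}\right)^2\right) \times \prod_{j=1}^{m} \exp\left(-\frac{1}{2}\left(\frac{u_j(k) - u_{jp}^{l_b}}{\sigma_{u_{jp}}^{l_b}}\right)^2\right) \tag{16b}$$

$$q_p^{l_a} = -\frac{u_q(k) - u_{qp}^{l_a}}{\left(\sigma_{u_{qp}}^{l_a}\right)^2} \tag{16c}$$

The squared normalization factor in the denominator of (15) and the negative sign in (16c) follow from applying the quotient rule to (11) and differentiating the Gaussian membership function (3) with respect to $u_q(k)$.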

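The gradient of $J_a$ with respect to $x$ can be computed as:

$$\lambda(k) = \frac{\partial \sum_{k=0}^{N-1} \varphi[x(k), u(k)]}{\partial x(k)} + \frac{\partial f[x(k), u(k)]^T}{\partial x(k)} \lambda(k+1) \tag{17}$$

$$\lambda(N) = \frac{\partial \theta[x(N)]}{\partial x(N)} \tag{18}$$

where

$$\frac{\partial f[x(k), u(k)]}{\partial x(k)} = \begin{pmatrix} \dfrac{\partial f_1[x(k), u(k)]}{\partial x_1(k)} & \cdots & \dfrac{\partial f_1[x(k), u(k)]}{\partial x_n(k)} \\ \vdots & \ddots & \vdots \\ \dfrac{\partial f_n[x(k), u(k)]}{\partial x_1(k)} & \cdots & \dfrac{\partial f_n[x(k), u(k)]}{\partial x_n(k)} \end{pmatrix}$$

and $k = 1, 2, \ldots, N-1$.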



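Similar to $\partial f[x(k), u(k)] / \partial u(k)$, $\partial f[x(k), u(k)] / \partial x(k)$ for the fuzzy model should be represented as:

$$\frac{\partial f_p[x(k), u(k)]}{\partial x_q(k)} = \frac{\sum_{l_a=1}^{M_p} \sum_{l_b=1}^{M_p} a_p^{l_a} a_p^{l_b} q_p^{l_a} \left(w_p^{l_a} - w_p^{l_b}\right)}{\left(\sum_{l=1}^{M_p} a_p^{l}\right)^2} \tag{19}$$

where

$$a_p^{l_a} = \prod_{i=1}^{n} \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{ip}^{l_a}}{\sigma_{x_{ip}}^{l_a}}\right)^2\right) \times \prod_{j=1}^{m} \exp\left(-\frac{1}{2}\left(\frac{u_j(k) - u_{jp}^{l_a}}{\sigma_{u_{jp}}^{l_a}}\right)^2\right) \tag{20a}$$

$$a_p^{l_b} = \prod_{i=1}^{n} \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{ip}^{l_b}}{\sigma_{x_{ip}}^{l_b}}\right)^2\right) \times \prod_{j=1}^{m} \exp\left(-\frac{1}{2}\left(\frac{u_j(k) - u_{jp}^{l_b}}{\sigma_{u_{jp}}^{l_b}}\right)^2\right) \tag{20b}$$

$$q_p^{l_a} = -\frac{x_q(k) - x_{qp}^{l_a}}{\left(\sigma_{x_{qp}}^{l_a}\right)^2} \tag{20c}$$

As for the derivative with respect to the input, the squared normalization factor in the denominator of (19) results from applying the quotient rule to (11).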
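The double-sum form of the fuzzy model derivatives can be checked numerically. The sketch below is an illustration under stated assumptions: the rule container, the three example rules and the test point are made up, and the states and inputs are stacked into a single argument vector `z` so that one routine covers both the state and the input derivatives. The analytic gradient includes the squared sum of firing strengths that the quotient rule produces for the normalized model (11), and it is compared with central finite differences.

```python
import math

def firing(z, rule):
    """Unnormalized firing strength a^l: product of Gaussian memberships."""
    a = 1.0
    for v, c, s in zip(z, rule["c"], rule["s"]):
        a *= math.exp(-0.5 * ((v - c) / s) ** 2)
    return a

def model(z, rules):
    """Normalized fuzzy output; z stacks the states followed by the inputs."""
    a = [firing(z, r) for r in rules]
    return sum(r["w"] * ai for r, ai in zip(rules, a)) / sum(a)

def analytic_gradient(z, rules):
    """Gradient of the fuzzy output by the double-sum formula."""
    a = [firing(z, r) for r in rules]
    total = sum(a)
    grad = []
    for q in range(len(z)):
        s = 0.0
        for ra, aa in zip(rules, a):
            qa = -(z[q] - ra["c"][q]) / ra["s"][q] ** 2  # d ln(a^la) / d z_q
            for rb, ab in zip(rules, a):
                s += aa * ab * qa * (ra["w"] - rb["w"])
        grad.append(s / total ** 2)  # quotient-rule normalization
    return grad

def numeric_gradient(z, rules, h=1e-6):
    """Central finite differences, used only as a reference."""
    grad = []
    for q in range(len(z)):
        zp, zm = list(z), list(z)
        zp[q] += h
        zm[q] -= h
        grad.append((model(zp, rules) - model(zm, rules)) / (2 * h))
    return grad

# Hypothetical rule base: two states and one input, all widths 0.6.
rules = [
    {"w": 0.5, "c": [0.0, -0.5, 1.0], "s": [0.6, 0.6, 0.6]},
    {"w": -1.2, "c": [1.0, 0.5, 2.0], "s": [0.6, 0.6, 0.6]},
    {"w": 2.0, "c": [-1.0, 0.0, 1.5], "s": [0.6, 0.6, 0.6]},
]
z = [0.3, -0.2, 1.4]
ga = analytic_gradient(z, rules)
gn = numeric_gradient(z, rules)
max_diff = max(abs(x - y) for x, y in zip(ga, gn))
```

A finite-difference comparison of this kind is a cheap safeguard before the derivatives are used inside the costate recursion (17) and the gradient (14).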

Then the structure of the solution algorithm to find the optimal state trajectories and control inputs can be described as follows [24]:

Step 1: Select a feasible initial control trajectory $u^0(k)$ and set the iteration index $i = 0$.

Step 2: Using $u^i(k)$, solve (11) from the initial condition (12) to obtain $x^i(k)$.

Step 3: Using $u^i(k)$ and $x^i(k)$, solve (17) from the terminal condition (18) to obtain the Lagrange multipliers $\lambda^i(k)$ and calculate the gradients $g^i(k)$ from (14).

Step 4: Specify a search direction: $p^i(k) = -g^i(k)$.

Step 5: Apply a one-dimensional search routine along $p^i(k)$ to obtain $u^{i+1}(k)$. The corresponding line-optimization problem minimizes the following quantity:

$$\min_{\alpha > 0} J[u^i(k) + \alpha p^i(k)]$$

Step 6: If, for a given scalar $\varepsilon > 0$, the inequality $\| g^i(k) \| < \varepsilon$ holds, stop. Otherwise, set $i = i + 1$ and go to Step 2.
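Steps 1 to 6 can be sketched as follows. To keep the example short and self-contained, it makes several assumptions that differ from the paper's setup: the plant is the explicit Euler model of the carriage from Example 1 (Eq. (23)) rather than a trained fuzzy model, the cost is the performance index (25), and the one-dimensional search is a plain step-halving rule instead of the forward-backward search described in the text. All function names are illustrative.

```python
import math

TC, K0, HD, M = 0.2, 0.33, 1.1, 1.0   # parameters quoted in Example 1
N, R1 = 20, 1.0                       # horizon and position command

def step(x, u):
    """One step of the Euler model (23)."""
    x1, x2 = x
    return (x1 + TC * x2,
            x2 - TC * K0 / M * math.exp(-x1) * x1 - TC * HD / M * x2 + TC * u / M)

def rollout(u):
    """Step 2: forward simulation from x(0) = (0, 0)."""
    xs = [(0.0, 0.0)]
    for k in range(N):
        xs.append(step(xs[k], u[k]))
    return xs

def cost(u):
    """Performance index (25) with r1 = 1 and r2 = 0."""
    xs = rollout(u)
    J = (xs[N][0] - R1) ** 2 + 0.1 * xs[N][1] ** 2
    for k in range(N):
        J += (xs[k][0] - R1) ** 2 + 0.1 * xs[k][1] ** 2 + 0.2 * u[k] ** 2
    return J

def gradient(u):
    """Step 3: backward costate recursion (17)-(18) and gradient (14)."""
    xs = rollout(u)
    lam = (2 * (xs[N][0] - R1), 0.2 * xs[N][1])           # lambda(N)
    g = [0.0] * N
    for k in range(N - 1, -1, -1):
        x1, x2 = xs[k]
        g[k] = 0.4 * u[k] + TC / M * lam[1]                # uses lambda(k+1)
        a21 = -TC * K0 / M * math.exp(-x1) * (1.0 - x1)    # d f2 / d x1
        lam = (2 * (x1 - R1) + lam[0] + a21 * lam[1],
               0.2 * x2 + TC * lam[0] + (1 - TC * HD / M) * lam[1])
    return g

def solve(u, eps=1e-4, max_iter=200):
    J = cost(u)
    history = [J]
    for _ in range(max_iter):
        g = gradient(u)
        if math.sqrt(sum(gk * gk for gk in g)) < eps:      # Step 6
            break
        alpha = 1.0
        while alpha > 1e-10:                               # Step 5 (step halving)
            trial = [uk - alpha * gk for uk, gk in zip(u, g)]  # Step 4: p = -g
            J_trial = cost(trial)
            if J_trial < J:
                break
            alpha *= 0.5
        else:
            break                                          # no improving step found
        u, J = trial, J_trial
        history.append(J)
    return u, history

u_opt, J_hist = solve([0.0] * N)                           # Step 1: u0(k) = 0
```

With a fuzzy model, only `step` and the Jacobian entries used in `gradient` would change: they would be replaced by the fuzzy model output and its derivatives, while the loop structure stays the same.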

In Step 4, several other methods, such as conjugate gradient methods or quasi-Newton methods, can also be used to specify the search direction $p^i(k)$ [27]. All these methods use a search direction that satisfies the inner product condition $[p^i(k), g^i(k)] < 0$, which guarantees that the derivative $\partial J / \partial \alpha$ is always negative at $\alpha = 0$ (except when $u^i(k)$ is a stationary point), and therefore the objective function can be improved for some $\alpha > 0$.

In Step 5, there are many different ways to search for the best step length in the line search, such as the Wolfe conditions, the Goldstein conditions or a backtracking approach [28]. For the fuzzy model, these computation procedures are rather complex. Thus, the following forward-backward method is used to find the best step length $\alpha$ in a range $[c, d]$:

(1) Given $h = 0.1$, evaluate $J_k = J(\alpha_k)$, where $\alpha_k = c$ and $k = 0$.

(2) Compare the objective function values. Set $\alpha_{k+1} = \alpha_k + h$ and evaluate $J_{k+1} = J(\alpha_{k+1})$. If $J_{k+1} < J_k$, go to the forward step (3); otherwise, go to the stop step (4).

(3) Forward step: Set $\alpha_k = \alpha_{k+1}$, $J_k = J_{k+1}$, $k = k + 1$, go to (2).

(4) Stop step: Set $c = \alpha_k$, $d = \alpha_{k+1}$, output $[c, d]$ and stop.

Then choose a smaller $h$ and repeat the iteration until $(d - c) \le \varepsilon$. After that, choose $\alpha = \frac{c + d}{2}$.
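The step-length search in (1) to (4) can be written compactly as below. This is a sketch under assumptions: the shrink factor of 0.1 for h, the tolerance, and the upper bound `alpha_max` (added only so that the loop terminates if J keeps decreasing) are illustrative choices that the text does not specify, and the quadratic test function is made up.

```python
def forward_step_search(J, c=0.0, h=0.1, eps=1e-4, shrink=0.1, alpha_max=10.0):
    """Bracket a step length by marching forward, then refine with smaller h."""
    while True:
        a, Ja = c, J(c)                    # (1) start at the left end c
        while True:
            b = a + h                      # (2) trial point alpha_{k+1}
            Jb = J(b)
            if Jb < Ja and b < alpha_max:
                a, Ja = b, Jb              # (3) forward step
            else:
                c, d = a, b                # (4) stop step: bracket [c, d]
                break
        if d - c <= eps:
            return 0.5 * (c + d)           # midpoint of the final bracket
        h *= shrink                        # repeat with a smaller h

# Made-up one-dimensional objective with its minimum at alpha = 0.237.
alpha = forward_step_search(lambda a: (a - 0.237) ** 2)
```

Note that each refinement restarts at the left end c and only moves forward, so when the minimizer lies slightly to the left of c the returned value stays close to c; in the example the result is about 0.240 rather than 0.237.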

5. Simulation results

In this section, three simulation examples that illustrate the effectiveness of the proposed method are presented. They are the carriage and nonlinear spring system [29], the rigid asymmetric spacecraft [30] and the nonlinear continuous stirred tank reactor [31].

5.1. Example 1: carriage with nonlinear spring

Fig. 1. Carriage with nonlinear spring.

The optimal control law was applied to a cart with a mass $M$ moving on a plane, as shown in Fig. 1. This carriage is attached to the wall via a spring with elasticity $k$ given by

$$k = k_0 e^{-x_1} \tag{21}$$

where $x_1$ is the displacement of the carriage from the equilibrium position associated with the external force $u$. Finally, a damper with damping factor $h_d$ acts on the carriage. The system is given by the following continuous-time state-space nonlinear model [29]:

$$\dot{x}_1(t) = x_2(t) \tag{22a}$$

$$\dot{x}_2(t) = -\frac{k_0}{M} e^{-x_1(t)} x_1(t) - \frac{h_d}{M} x_2(t) + \frac{u(t)}{M} \tag{22b}$$

where $x_2$ is the carriage velocity. The parameters of the system are $M = 1$ kg and $k_0 = 0.33$ N/m, while the damping factor is $h_d = 1.1$. An Euler approximation of the system with sampling time $T_c = 0.2$ s is given by [29]:

$$x_1(k+1) = x_1(k) + T_c x_2(k) \tag{23a}$$

$$x_2(k+1) = x_2(k) - T_c \frac{k_0}{M} e^{-x_1(k)} x_1(k) - T_c \frac{h_d}{M} x_2(k) + T_c \frac{u(k)}{M} \tag{23b}$$

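Since Equation (23a) follows directly from the physical meanings of $x_1$ and $x_2$, only the state Equation (23b) needs to be approximated by the following fuzzy model:

$$x_2(k+1) = \frac{\sum_{l=1}^{M_2} w_2^l \left(\prod_{i=1}^{2} \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{i2}^l}{\sigma_{x_{i2}}^l}\right)^2\right)\right) \exp\left(-\frac{1}{2}\left(\frac{u(k) - u_{12}^l}{\sigma_{u_2}^l}\right)^2\right)}{\sum_{l=1}^{M_2} \left(\prod_{i=1}^{2} \exp\left(-\frac{1}{2}\left(\frac{x_i(k) - x_{i2}^l}{\sigma_{x_{i2}}^l}\right)^2\right)\right) \exp\left(-\frac{1}{2}\left(\frac{u(k) - u_{12}^l}{\sigma_{u_2}^l}\right)^2\right)} \tag{24}$$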

where $x_2(k+1)$ is represented by a fuzzy model composed of the states $x_1(k)$ and $x_2(k)$ and the input $u(k)$. The terms $\exp\left(-\frac{1}{2}\left(\frac{x_1(k) - x_{12}^l}{\sigma_{x_{12}}^l}\right)^2\right)$, $\exp\left(-\frac{1}{2}\left(\frac{x_2(k) - x_{22}^l}{\sigma_{x_{22}}^l}\right)^2\right)$ and $\exp\left(-\frac{1}{2}\left(\frac{u(k) - u_{12}^l}{\sigma_{u_2}^l}\right)^2\right)$ are Gaussian membership functions, which are connected by product inference and the center average defuzzifier. For $p = 2$, the widths of the states $x_1$ and $x_2$ ($\sigma_{x_{1p}}^l$ and $\sigma_{x_{2p}}^l$) were fixed to be 0.6 and the width of the input $u$ ($\sigma_{u_p}^l$) was fixed to be 0.6.

Then 300 input-output pairs were utilized to train the fuzzy model by the OLS algorithm to derive the centers of the states and input ($x_{1p}^l$, $x_{2p}^l$, $u_{1p}^l$) and the weighting vector ($w_p^l$). For the input-output pairs, the range of state $x_1$ was from −1.5 m to 1.5 m, the range of state $x_2$ was from −1 m/s to 1 m/s, and the range of input $u$ was from 0.5 N to 2.5 N. After training, 115 rules ($M_2 = 115$) were selected to form the fuzzy model of Equation (24).

To validate the fuzzy model, the system responses were simulated with the same inputs ($u(k) = -1$ N for $k = 0, 1, \ldots, 19$) for the discrete time mathematical model and the fuzzy model. As shown in Fig. 2, the trained fuzzy model approximates the mathematical model well.

The initial conditions are $x_1(0) = 0$ m and $x_2(0) = 0$ m/s. The command inputs are $r_1 = 1$ m and $r_2 = 0$ m/s, and the performance index is ($N = 20$):

$$J = (x_1(N) - r_1(N))^2 + 0.1(x_2(N) - r_2(N))^2 + \sum_{k=0}^{N-1} \left( (x_1(k) - r_1(k))^2 + 0.1(x_2(k) - r_2(k))^2 + 0.2u(k)^2 \right) \tag{25}$$

Using the proposed algorithm, the optimal control input for the system represented by a fuzzy model was derived. The feasible-direction algorithm [24, 25] was also used to derive the optimal control input for the system represented by an explicit mathematical model. Then the two optimal control inputs were applied to the mathematical model to obtain the state trajectories, as shown in Fig. 3. The simulation results show that the optimal control results for the nonlinear system represented by an explicit mathematical model and by a fuzzy model are quite similar. The performance index values for the mathematical model and the fuzzy model are 5.5758 and 5.5183 respectively, which indicates the effectiveness of the proposed method. Therefore, the approach can achieve very good nonlinear optimal control results without the traditional modeling and system identification procedures. In addition, this algorithm uses a fuzzy model to approximate the nonlinear system in the general form x(k + 1) = f(x(k), u(k)), instead of the nonlinear affine systems or linear systems treated by others.

Fig. 2. The system responses of the mathematical model and fuzzy model with the same control inputs for the cart with nonlinear spring.

5.2. Example 2: rigid asymmetric spacecraft

Fig. 3. Simulation results of optimal control for the carriage with nonlinear spring represented by a fuzzy model and an explicit mathematical model.

Tracking control of a rigid asymmetric spacecraft is a primary attitude control task. Due to the inherent nonlinearity of the attitude dynamics, tracking in large and rapid maneuvers is a complex undertaking. Therefore, this tracking problem with three independent axis controls is investigated here. Euler's equations for the angular velocities $x_1$, $x_2$, $x_3$ of the spacecraft are given by [30]

$$\dot{x}_1(t) = -\frac{I_3 - I_2}{I_1} x_2(t) x_3(t) + \frac{u_1(t)}{I_1} \tag{26a}$$

$$\dot{x}_2(t) = -\frac{I_1 - I_3}{I_2} x_1(t) x_3(t) + \frac{u_2(t)}{I_2} \tag{26b}$$

$$\dot{x}_3(t) = -\frac{I_2 - I_1}{I_3} x_1(t) x_2(t) + \frac{u_3(t)}{I_3} \tag{26c}$$

where $u_1$, $u_2$, $u_3$ are control torques, and $I_1 = 86.24$ kg·m², $I_2 = 85.07$ kg·m² and $I_3 = 113.59$ kg·m² are the spacecraft principal inertias.

An Euler approximation of the system with sampling time $T = 5$ s is given by

$$x_1(k+1) = x_1(k) + 5 \cdot (-0.3307 x_2(k) x_3(k) + 0.0116 u_1(k)) \tag{27a}$$

$$x_2(k+1) = x_2(k) + 5 \cdot (0.3215 x_1(k) x_3(k) + 0.0118 u_2(k)) \tag{27b}$$

$$x_3(k+1) = x_3(k) + 5 \cdot (0.0103 x_1(k) x_2(k) + 0.0088 u_3(k)) \tag{27c}$$
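The numerical coefficients in (27) follow from the inertias given with (26). The short check below recomputes them; the variable names are illustrative.

```python
# Principal inertias quoted with Eq. (26), in kg*m^2.
I1, I2, I3 = 86.24, 85.07, 113.59

# Coefficients of the quadratic terms in (26a)-(26c).
c1 = -(I3 - I2) / I1   # multiplies x2*x3, about -0.3307
c2 = -(I1 - I3) / I2   # multiplies x1*x3, about  0.3215
c3 = -(I2 - I1) / I3   # multiplies x1*x2, about  0.0103

# Input gains 1/I1, 1/I2, 1/I3.
b1, b2, b3 = 1.0 / I1, 1.0 / I2, 1.0 / I3

def spacecraft_step(x, u, T=5.0):
    """One Euler step of (27) with sampling time T in seconds."""
    x1, x2, x3 = x
    u1, u2, u3 = u
    return (x1 + T * (c1 * x2 * x3 + b1 * u1),
            x2 + T * (c2 * x1 * x3 + b2 * u2),
            x3 + T * (c3 * x1 * x2 + b3 * u3))

# First step of the validation run with constant torques of -0.05 Nm.
x_next = spacecraft_step((0.0, 0.0, 0.0), (-0.05, -0.05, -0.05))
```

Rounded to four decimal places, the computed values agree with the coefficients printed in (27a) to (27c).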

Similar to Example 1, to train the fuzzy model of the system, the widths of the states $x_1$, $x_2$ and $x_3$ were fixed to be 0.05, and the widths of the inputs $u_1$, $u_2$ and $u_3$ were fixed to be 0.1. Then 2000 input-output pairs were utilized to train the fuzzy models by the OLS algorithm to derive the centers of the states and inputs, and the weighting factors. For the input-output pairs, the ranges of states $x_1$, $x_2$ and $x_3$ were from −0.1 rad/s to 0.1 rad/s. The ranges of inputs $u_1$, $u_2$ and $u_3$ were from −0.2 Nm to 0.2 Nm. After training, 500 rules were selected for the fuzzy model of each of the three Equations (27a), (27b) and (27c).

To validate the fuzzy model, the responses with constant inputs ($u_1(k) = -0.05$ Nm, $u_2(k) = -0.05$ Nm, $u_3(k) = -0.05$ Nm for $k = 0, 1, \ldots, 14$) were simulated for the mathematical model and the fuzzy model. As shown in Fig. 4, the system responses of the fuzzy model are almost the same as those of the mathematical model.

The initial conditions are $x_1(0) = 0$ rad/s, $x_2(0) = 0$ rad/s and $x_3(0) = 0$ rad/s. The command inputs are set to $r_1 = 0.04$ rad/s, $r_2 = 0.04$ rad/s and $r_3 = 0.04$ rad/s, and the performance index is defined by ($N = 40$):

$$J = (x_1(N) - r_1(N))^2 + (x_2(N) - r_2(N))^2 + (x_3(N) - r_3(N))^2 + \sum_{k=0}^{N-1} \left( (x_1(k) - r_1(k))^2 + (x_2(k) - r_2(k))^2 + (x_3(k) - r_3(k))^2 + 0.01 u_1^2(k) + 0.01 u_2^2(k) + 0.01 u_3^2(k) \right) \tag{28}$$

Similar to Example 1, the optimal control inputs for the systems represented by the two models were derived and then the state trajectories were obtained, as shown in Fig. 5. The simulation results show that the optimal control results for the nonlinear systems represented by the two models are very close, and the performance index values for the fuzzy model and the mathematical model are both 0.0149. Therefore, this example illustrates that the proposed method can effectively solve nonlinear optimal control problems without explicit mathematical models.

5.3. Example 3: continuously stirred tank reactor

Consider a continuously stirred tank reactor shown in Fig. 6. The mass and heat balance for a single reaction $A \rightarrow B$ are [31]

$$\frac{dc}{dt} = \frac{1}{\theta}(c_f - c) - r \tag{29a}$$

$$\frac{dT}{dt} = \frac{1}{\theta}(T_f - T) + Jr - \alpha u (T - T_c) \tag{29b}$$

where the concentration of reaction $c$ and the temperature $T$ are the two states of the system. $u$ is the coolant flow rate control input. $\theta = 10$ min is the total start-up time of interest. $c_f = 1$ mol/L is the feed concentration of reaction. $T_f = 300$ K is the feed temperature. $J = 100$ is a chemical constant. $r$ is the reaction rate. $\alpha = 1.95 \times 10^{-4}$ m² is the dimensionless heat transfer area and $T_c = 290$ K is the room temperature.

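If dimensionless quantities are defined as $y_1 = \frac{c}{c_f}$, $y_2 = \frac{T}{Jc_f}$, $y_c = \frac{T_c}{Jc_f}$, $y_f = \frac{T_f}{Jc_f}$, and the reaction rate is rescaled as $\frac{r}{c_f}$ (still denoted $r$), one gets

$$\frac{dy_1}{dt} = \frac{1}{\theta}(1 - y_1) - r \tag{30a}$$

$$\frac{dy_2}{dt} = \frac{1}{\theta}(y_f - y_2) + r - \alpha u (y_2 - y_c) \tag{30b}$$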

Fig. 4. The system responses of the mathematical model and fuzzy model with the same control inputs for rigid asymmetric spacecraft.

where $y_1$, $y_2$, $y_c$ and $y_f$ are the dimensionless concentration, temperature, coolant temperature and feed temperature respectively, and the dimensionless irreversible reaction rate $r$ is given by

$$r = k_{10} y_1 e^{-N / y_2} \tag{31}$$

where $k_{10} = 300$ is a pre-exponential factor of forward constant and $N = 25.2$ is a gas constant.

An Euler approximation of the system (30) with sampling time $T = 0.4$ min is given by

$$y_1(k+1) = y_1(k) + T\left(\frac{1}{\theta}(1 - y_1(k)) - r(k)\right) \tag{32a}$$

$$y_2(k+1) = y_2(k) + T\left(\frac{1}{\theta}(y_f - y_2(k)) + r(k) - \alpha u(k)(y_2(k) - y_c)\right) \tag{32b}$$
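The discrete model (32) with the rate law (31) is straightforward to simulate. The sketch below uses the parameter values quoted in the text; the function and constant names are illustrative, and the flow rate of 370 used in the one-step check is an assumed reading of the reference input.

```python
import math

THETA = 10.0      # min
ALPHA = 1.95e-4   # heat transfer coefficient as quoted in the text
K10 = 300.0       # pre-exponential factor
N_GAS = 25.2      # constant N in the rate law (31)
J_CONST, CF, TF, TCOOL = 100.0, 1.0, 300.0, 290.0
YF = TF / (J_CONST * CF)      # dimensionless feed temperature, 3.0
YC = TCOOL / (J_CONST * CF)   # dimensionless coolant temperature, 2.9
T_S = 0.4         # sampling time in min

def rate(y1, y2):
    """Dimensionless reaction rate, Eq. (31)."""
    return K10 * y1 * math.exp(-N_GAS / y2)

def cstr_step(y1, y2, u):
    """One Euler step of Eqs. (32a) and (32b)."""
    r = rate(y1, y2)
    return (y1 + T_S * ((1.0 - y1) / THETA - r),
            y2 + T_S * ((YF - y2) / THETA + r - ALPHA * u * (y2 - YC)))

# Open-loop run with the constant validation input u(k) = 300 for 50 steps.
traj = [(1.0, 3.0)]
for _ in range(50):
    traj.append(cstr_step(traj[-1][0], traj[-1][1], 300.0))

# One step from the target state with u = 370 (assumed reference input).
y1_t, y2_t = 0.408126, 3.29763
y1_n, y2_n = cstr_step(y1_t, y2_t, 370.0)
```

As a rough consistency check, one step from the target state changes both states by well under 0.001, which suggests that the target values are close to an equilibrium of the discrete model for that input.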

To train the fuzzy model of the system, the widths of the states $y_1$ and $y_2$ were fixed to be 0.25 and the width of the input $u$ was fixed to be 250. Then 500 input-output pairs were utilized to train the fuzzy models by the OLS algorithm to derive the centers of the states and input, and the weighting factors. For the input-output pairs, the range of state $y_1$ was from 0 to 1, the range of state $y_2$ was from 3 to 4 and the range of input $u$ was from 0 to 1000. After training, 137 rules were selected for the fuzzy model of each of the two Equations (32a) and (32b). The system responses with the constant input ($u(k) = 300$ L/min for $k = 0, 1, \ldots, 49$) were simulated for the mathematical model and the fuzzy model. As shown in Fig. 7, the responses of the two models are virtually identical.

The initial conditions for the dimensionless concentration and temperature are $y_1(0) = 1$ ($c(0) = y_1(0) = 1$ mol/L) and $y_2(0) = 3$ ($T(0) = 100 y_2(0) = 300$ K) respectively. The objective of the optimal control problem is to find the coolant flow rate control $u(k) \ge 0$ that minimizes the functional

$$J = \alpha_1 (y_1(N) - r_1(N))^2 + \alpha_2 (y_2(N) - r_2(N))^2 + \sum_{k=0}^{N-1} \left( \alpha_1 (y_1(k) - r_1(k))^2 + \alpha_2 (y_2(k) - r_2(k))^2 + \alpha_3 (u(k) - u_s(k))^2 \right) \tag{33}$$

where $N = 50$, $r_1 = 0.408126$, $r_2 = 3.29763$, $u_s = 370$ k, $\alpha_1 = 100000$, $\alpha_2 = 2000$ and $\alpha_3 = 0.001$. So the desired values for concentration and temperature are $c_d = r_1 = 0.408126$ and $T_d = 100 r_2 = 329.763$ K respectively.

Similar to the previous examples, the optimal control inputs for the systems represented by a fuzzy model and an explicit mathematical model were derived and then the state trajectories were obtained, as shown in Fig. 8. The simulation results show that the optimal control results for the nonlinear systems represented by the two models are quite close. Moreover, the performance index values for the fuzzy model and the mathematical model are $2.6748 \times 10^6$ and $2.6733 \times 10^6$ respectively, which are also very similar, showing the effectiveness of the proposed scheme.

Fig. 5. Simulation results of optimal control for the rigid asymmetric spacecraft represented by a fuzzy model and an explicit mathematical model.

6. Conclusion

In this paper, a discrete time optimal controller for a nonlinear system represented by a fuzzy model is developed. With the product inference engine, singleton fuzzifier, center average defuzzifier and Gaussian membership functions, a fuzzy model was trained by the OLS learning algorithm, which is very efficient and not sensitive to noise in its inputs, and then the optimal control problem was formulated based on the fuzzy model. The numerical solution of the problem was obtained by use of a feasible-direction algorithm. The simulation results of three nonlinear optimal control examples showed that the performance of the proposed approach based on a fuzzy model is quite similar to that of optimal control of the system represented by an explicit mathematical model, thus demonstrating its efficacy for optimal control of unknown nonlinear systems. However, the feasible-direction algorithm converges slowly near minimum points and may sometimes converge to local minimum points; dealing with these drawbacks is an interesting topic for future research.

Fig. 6. Continuously stirred tank reactor [31].

Fig. 7. The system responses of the mathematical model and fuzzy model with the same control inputs for continuously stirred tank reactor.

Fig. 8. Simulation results of optimal control for the continuously stirred tank reactor represented by a fuzzy model and an explicit mathematical model.

References

[1] D.E. Kirk, Optimal Control Theory: An Introduction, Dover Publications, New York, 2012.

[2] I.M. Ross and M. Karpenko, A review of pseudospectral optimal control: From theory to flight, Annual Reviews in Control 36(2) (2012), 182–197.

[3] H.J. Kappen, V. Gomez and M. Opper, Optimal control as a graphical model inference problem, Machine Learning 87(2) (2012), 159–182.

[4] C.C. Lee, Fuzzy logic in control systems: Fuzzy logic controller. II, IEEE Transactions on Systems, Man and Cybernetics 20(2) (1990), 419–435.

[5] B. Kosko, Fuzzy systems as universal approximators, IEEE Transactions on Computers 43(11) (1994), 1329–1333.

[6] L.X. Wang and J.M. Mendel, Back-propagation fuzzy system as nonlinear dynamic system identifiers, IEEE International Conference on Fuzzy Systems (1992), 1409–1418.

[7] L.X. Wang, A Course in Fuzzy Systems, Prentice-Hall Press, USA, 1999.

[8] S.L. Chiu, Fuzzy model identification based on cluster estimation, Journal of Intelligent and Fuzzy Systems 2(3) (1994), 267–278.

[9] L.X. Wang and J.M. Mendel, Fuzzy basis functions, universal approximation, and orthogonal least-squares learning, IEEE Transactions on Neural Networks 3(5) (1992), 807–814.

[10] C.W. Lee and Y.C. Shin, Construction of fuzzy systems using least-squares method and genetic algorithm, Fuzzy Sets and Systems 137(3) (2003), 297–323.

[11] S. Chen, C.F.N. Cowan and P.M. Grant, Orthogonal least squares learning algorithm for radial basis function networks, IEEE Transactions on Neural Networks 2(2) (1991), 302–309.

[12] H.J. Sussmann and J.C. Willems, 300 years of optimal control: From the brachystochrone to the maximum principle, IEEE Control Systems 17(3) (1997), 32–44.

[13] L.S. Pontryagin, V.G. Boltyanskiy, R.V. Gamkrelidze and E.F. Mischenko, The Mathematical Theory of Optimal Processes, Interscience, New York, 1962.

[14] A. Al-Tamimi, F.L. Lewis and M. Abu-Khalaf, Discrete-time nonlinear HJB solution using approximate dynamic programming: Convergence proof, IEEE Transactions on Systems, Man, and Cybernetics, Part B: Cybernetics 38(4) (2008), 943–949.

[15] F.L. Lewis and D. Vrabie, Reinforcement learning and adaptive dynamic programming for feedback control, IEEE Circuits and Systems Magazine 9(3) (2009), 32–50.

[16] F.Y. Wang, H. Zhang and D. Liu, Adaptive dynamic programming: An introduction, IEEE Computational Intelligence Magazine 4(2) (2009), 39–47.

[17] Z. Chen and S. Jagannathan, Generalized Hamilton–Jacobi–Bellman formulation-based neural network control of affine nonlinear discrete-time systems, IEEE Transactions on Neural Networks 19(1) (2008), 90–106.

[18] A. Al-Tamimi, F.L. Lewis and M. Abu-Khalaf, Model-free Q-learning designs for linear discrete-time zero-sum games with application to H-infinity control, Automatica 43(3) (2007), 473–481.

[19] T. Dierks, B.T. Thumati and S. Jagannathan, Optimal control of unknown affine nonlinear discrete-time systems using offline-trained neural networks with proof of convergence, Neural Networks 22(5) (2009), 851–860.

[20] T. Dierks and S. Jagannathan, Online optimal control of nonlinear discrete-time systems using approximate dynamic programming, Journal of Control Theory and Applications 9(3) (2011), 361–369.

[21] Q. Wei, H. Zhang and J. Dai, Model-free multiobjective approximate dynamic programming for discrete-time nonlinear systems with general performance index functions, Neurocomputing 72(7) (2009), 1839–1848.

[22] A. Heydari and S.N. Balakrishnan, Finite-horizon control-constrained nonlinear optimal control using single network adaptive critics, IEEE Transactions on Neural Networks and Learning Systems 24(1) (2013), 145–157.

[23] Y. Jiang and Z.P. Jiang, Computational adaptive optimal control for continuous-time linear systems with completely unknown dynamics, Automatica 48(10) (2012), 2699–2704.

[24] A. Kotsialos, M. Papageorgiou and A. Messmer, Optimal coordinated and integrated motorway network traffic control, 14th International Symposium on Transportation and Traffic Theory, 1999, pp. 621–644.

[25] R.C. Carlson, I. Papamichail, M. Papageorgiou and A. Messmer, Optimal motorway traffic flow control involving variable speed limits and ramp metering, Transportation Science 44(2) (2010), 238–253.

[26] C.C. Lee, Fuzzy logic in control systems: Fuzzy logic controller. I, IEEE Transactions on Systems, Man and Cybernetics 20(2) (1990), 404–418.

[27] D.P. Bertsekas, Nonlinear Programming, Athena Scientific, Belmont, MA, 1999.

[28] W. Sun and Y. Yuan, Optimization Theory and Methods: Nonlinear Programming, Springer, New York, 2006.

[29] L. Magni, G. De Nicolao, R. Scattolini and F. Allgower, Robust model predictive control for nonlinear discrete time systems, International Journal of Robust and Nonlinear Control 13(3-4) (2003), 229–246.

[30] H. Jaddu, Direct solution of nonlinear optimal control problems using quasilinearization and Chebyshev polynomials, Journal of the Franklin Institute 339(4) (2002), 479–498.

[31] G.A. Hicks and W.H. Ray, Approximation methods for optimal control synthesis, The Canadian Journal of Chemical Engineering 49(4) (1971), 522–528.
