International Journal of Control, Automation, and Systems (2013) 11(3):496-502, DOI 10.1007/s12555-011-0243-y
ISSN: 1598-6446, eISSN: 2005-4092, http://www.springer.com/12555
Wavelet Reduced Order Observer based Adaptive Tracking Control for a
Class of Uncertain Nonlinear Systems using Reinforcement Learning
Manish Sharma and Ajay Verma
Abstract: This paper investigates the design of a reduced order observer and an observer based controller for a class of uncertain nonlinear systems using reinforcement learning. A new design approach for a wavelet based adaptive reduced order observer is proposed. The proposed observer identifies the unknown system dynamics in addition to reconstructing the states of the system. Reinforcement learning is realized through two wavelet neural networks (WNN), a critic WNN and an action WNN, which together form an adaptive WNN controller. A "strategic" utility function is approximated by the critic WNN and minimized by the action WNN. Owing to their superior learning capabilities, wavelet networks are employed in this work to identify the unknown system dynamics. The behavior of the closed loop system under feedback control based on the reconstructed states is then investigated. Uniform ultimate boundedness of the closed-loop tracking error is verified by a Lyapunov approach. A numerical example is provided to verify the effectiveness of the theoretical development.
Keywords: Adaptive control, Lyapunov functional, optimal control, reduced order observer, reinforcement learning, wavelet neural networks.
1. INTRODUCTION
In many practical systems the system model contains uncertain elements; these uncertainties may arise from additive unknown internal or external noise, environmental influence, nonlinearities such as hysteresis or friction, poor plant knowledge, reduced-order models, and uncertain or slowly varying parameters. A state observer for such an uncertain system is therefore useful for reconstructing the states of the dynamic system. The design of adaptive observers through estimation of states and parameters in linear and nonlinear systems has been actively studied in recent years [1-3]. Adaptive observers for nonlinear systems, especially reduced order observers, have attracted much attention due to their wide use in theory and practice; unlike full order observers, they need to estimate only the unmeasurable states of the studied system [4-7].
Reinforcement learning (RL) is a class of algorithms
for solving multi-step, sequential decision problems by
finding a policy for choosing sequences of actions that
optimize the sum of some performance criterion over
time [8-10]. In RL problems, an agent interacts with an
unknown environment. At each time step, the agent
observes the state, takes an action, and receives a reward.
The goal of the agent is to learn a policy (i.e., a mapping
from states to actions) that maximizes the long-term
return. The actor-critic algorithm is an implementation of RL with separate structures for perception (critic) and action (actor) [11-14]. Given a specific state, the actor decides what action to take and the critic evaluates the outcome of the action in terms of future reward (goal).
System identification plays a critical role in the design of controllers for uncertain nonlinear systems. A controller is expected to provide efficient, safe and desired performance. Designing such a controller requires a highly accurate model of the system, which is difficult to obtain due to modeling inaccuracies. In such cases intelligent tools are integrated with the control strategy to obtain reliable and accurate control performance. Over the last decade wavelet networks have attracted much attention from researchers. A wavelet network is constructed as an alternative to neural networks as a system identification tool. Wavelet networks integrate the space-frequency localization property of wavelets with the learning capabilities of neural networks to improve function approximation ability. Wavelet networks find application in multi-scale analysis and synthesis, time-frequency signal analysis in signal processing, and identification of nonstationary signals [15,16]. Owing to their multiresolution property and suitability for the development of online tunable control laws, adaptive wavelet based control strategies have been reported in the literature [16,17].
Incorporating the advantages of WNN, adaptive actor-
critic WNN-based control has emerged as a promising
approach for nonlinear systems. In actor-critic WNN based control, a long-term as well as a short-term system-performance measure can be optimized.

© ICROS, KIEE and Springer 2013
Manuscript received May 31, 2011; revised October 27, 2012; accepted February 26, 2013. Recommended by Editor Young Il Lee. Manish Sharma is with the Medicaps Institute of Management and Science, Rajiv Gandhi Technical University, Bhopal, India (e-mail: [email protected]). Ajay Verma is with the Institute of Engineering and Technology, DAVV, Indore, India (e-mail: [email protected]).
the role of the actor is to select actions, the role of the
critic is to evaluate the performance of the actor. This
evaluation is used to provide the actor with a signal that
allows it to improve its performance, typically by
updating its parameters along an estimate of the gradient
of some measure of performance, with respect to the
actor’s parameters. The critic WNN approximates a
certain “strategic” utility function that is similar to a
standard Bellman equation, which is taken as the long-
term performance measure of the system. The weights of
action WNN are tuned online by both the critic WNN
signal and the filtered tracking error. It minimizes the
strategic utility function and uncertain system dynamic
estimation errors so that the optimal control signal can be
generated. This optimal action NN control signal
combined with an additional outer-loop conventional
control signal is applied as the overall control input to
the nonlinear system. The outer loop conventional signal
allows the action and critic NNs to learn online while
making the system stable. This conventional signal that
uses the tracking error is viewed as the “supervisory”
signal [7].
This motivates us to consider the design of a WNN reduced order observer based adaptive tracking controller for a class of uncertain nonlinear systems using reinforcement learning. WNNs are used to approximate the system uncertainty as well as to optimize the performance of the control strategy.
The paper is organized as follows: Section 2 presents the system preliminaries, and the system description is given in Section 3. Reduced order observer and controller design are discussed in Sections 4 and 5, respectively. Section 6 deals with the tuning algorithm for the actor-critic wavelet networks. The stability analysis of the proposed control scheme and observer is given in Section 7. The effectiveness of the proposed strategy is illustrated through an example in Section 8, and Section 9 concludes the paper.
2. SYSTEM PRELIMINARIES
2.1. Fundamentals of wavelet neural network
A wavelet network is a type of building block for function approximation. The building block is obtained by translating and dilating a mother wavelet function. For a countable family of dilations $a_m$ and translations $b_n$, the wavelet family can be expressed as [18]

$$\left\{ |a_m|^{-d/2}\, \psi\!\left(\frac{x - b_n}{a_m}\right) : m \in Z,\; n \in Z^d \right\}. \qquad (1)$$

Considering

$$a_m = a_0^m, \qquad b_n = n a_0^m b_0, \qquad m \in Z,\; n \in Z^d, \qquad (2)$$

the wavelets in (1) can be expressed as

$$\left\{ \psi_{mn} = a_0^{-md/2}\, \psi\!\left(a_0^{-m} x - n b_0\right) : m \in Z,\; n \in Z^d \right\}, \qquad (3)$$

where the scalar parameters $a_0$ and $b_0$ define the step size of the dilation and translation discretizations (typically $a_0 = 2$ and $b_0 = 1$) and $x = [x_1, x_2, \ldots, x_n]^T \in R^n$ is the input vector.

The output of an $n$-dimensional WNN with $m$ wavelet nodes is [18]

$$f = \sum_{m \in Z} \sum_{n \in Z^d} \alpha_{mn} \psi_{mn}. \qquad (4)$$
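The discretized family (1)-(4) can be sketched numerically. The snippet below is an illustrative 1-D implementation; the Mexican-hat mother wavelet is an assumption chosen for simplicity (the paper's simulation uses the discrete Shannon wavelet), and the two-node weight table is a toy example.

```python
import numpy as np

def psi(x):
    # Mexican-hat mother wavelet (a stand-in choice for this sketch)
    return (1.0 - x**2) * np.exp(-x**2 / 2.0)

def wnn_output(x, alpha, a0=2.0, b0=1.0):
    """Output of a 1-D wavelet network, eq. (4):
    f(x) = sum over (m, n) of alpha[m, n] * a0^(-m/2) * psi(a0^(-m) x - n b0).
    `alpha` maps (m, n) index pairs to trainable weights."""
    total = 0.0
    for (m, n), w in alpha.items():
        total += w * a0 ** (-m / 2.0) * psi(a0 ** (-m) * x - n * b0)
    return total

# toy usage: a network with two wavelet nodes
alpha = {(0, 0): 0.5, (1, 1): -0.2}
y = wnn_output(0.3, alpha)
```

In the adaptive scheme of Section 4, the weights $\alpha_{mn}$ (and the translation/dilation parameters) are the quantities tuned online.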
3. SYSTEM DESCRIPTION
Consider a single input single output (SISO) nonlinear system of the form

$$\begin{aligned} \dot{x}_1 &= x_2, \\ \dot{x}_2 &= x_3, \\ &\;\vdots \\ \dot{x}_n &= f(x) + u, \\ y &= Cx, \end{aligned} \qquad (5)$$

where $x = [x_1, x_2, \ldots, x_n]^T$, $u$ and $y$ are the state vector, control input and output, respectively, and $f(x) : \Re^n \to \Re$ is a smooth unknown nonlinear function.
Rewriting the system (5) as

$$\dot{x} = Ax + B(f(x) + u), \qquad y = Cx, \qquad (6)$$

where $A$ is the $n \times n$ system matrix, $B$ is the input matrix of order $n \times 1$ and $C$ is the output matrix of order $1 \times n$. Also $x \in \Re^n$ and $y \in \Re^p$. For all real-time systems, $p \le n$. Since $C$ has full rank, it is possible to make a linear change of coordinates

$$\xi = \begin{bmatrix} \xi_m \\ \xi_u \end{bmatrix} = \Lambda x = \begin{bmatrix} C \\ Q \end{bmatrix} x, \qquad (7)$$

where $Q$ is chosen so that $\Lambda$ is an invertible matrix. Also $\xi_m \in \Re^p$ and $\xi_u \in \Re^{n-p}$.
Applying the coordinate transformation (7), the plant (6) takes the form

$$\begin{bmatrix} \dot{\xi}_m \\ \dot{\xi}_u \end{bmatrix} = \begin{bmatrix} F_1(\xi_m, \xi_u) \\ F_2(\xi_m, \xi_u) \end{bmatrix}, \qquad y = \phi(\xi_m, \xi_u), \qquad (8)$$

where $F(\xi) = f(\Lambda^{-1}\xi)$.

The motivation for the reduced order observer stems from the fact that in the plant model (6) the state $\xi_m$ is directly available for measurement, and hence it suffices to build an observer that estimates only the unmeasured state $\xi_u$. The order of such an observer corresponds to the dimension of the unmeasured state, namely $n - p \le n$. This type of observer is called a reduced order observer [6] and it has many important applications in design problems.
The objective is to formulate a state feedback control law that achieves the desired tracking performance. The control law is formulated using the transformed system (6). Let $y_d = [y_d, \dot{y}_d, \ldots, y_d^{(n-1)}]^T$ be the vector of the desired tracking trajectory. The following assumption is made for the systems under consideration.

Assumption 1: The desired trajectory $y_d(t)$ is smooth, continuous ($C^n$) and available for measurement.
4. WAVELET REDUCED ORDER OBSERVER
DESIGN
Applying the linear transformation (7), the system (6) takes the form

$$\begin{bmatrix} \dot{x}_m \\ \dot{x}_u \end{bmatrix} = \begin{bmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{bmatrix} \begin{bmatrix} x_m \\ x_u \end{bmatrix} + \begin{bmatrix} B_{11} \\ B_{21} \end{bmatrix} (f(x) + u), \qquad y = \begin{bmatrix} C_{11} & C_{12} \end{bmatrix} \begin{bmatrix} x_m \\ x_u \end{bmatrix} = y_1 + y_2, \qquad (9)$$

where $x_m \in \Re^p$ is the measured state, $x_u \in \Re^{n-p}$ is the unmeasured state and $y \in \Re^p$ is the output of the system (5), which depends on the measurable as well as the unmeasurable states; it is assumed that both parts of the output, $y_1$ and $y_2$, are explicitly available for measurement.

A wavelet based reduced order observer that estimates the states of the system (6) is given by

$$\dot{\hat{x}}_u = A_{21} x_m + A_{22} \hat{x}_u + m A_{12} (x_u - \hat{x}_u) + B_{21} u + B_{21} \hat{f}, \qquad (10)$$

where $\hat{x}_u$ is the estimate of the state vector $x_u$ and $m = [m_1, m_2, \ldots, m_{n-p}]^T$ is the observer gain matrix, selected such that the matrix $A_{22} - m A_{12}$ is stable. In this work a WNN is used for system identification. Since $A_{12} x_u = \dot{x}_m - A_{11} x_m - B_{11}(f(x) + u)$ from (9), substitution in (10) yields

$$\dot{\hat{x}}_u = A_{21} x_m + A_{22} \hat{x}_u + m\left(\dot{x}_m - A_{11} x_m - B_{11}(\hat{f} + u)\right) - m A_{12} \hat{x}_u + B_{21}(u + \hat{f}) + \varepsilon. \qquad (11)$$

Note that (11) contains $\dot{x}_m$, which is not available for the observer design. The following transformation is therefore applied to generate an intermediate state:

$$x'_u = \hat{x}_u - m x_m.$$

Applying this transformation, the observer takes the form

$$\dot{x}'_u = (A_{22} - m A_{12})\, \hat{x}_u + (A_{21} - m A_{11})\, x_m + (B_{21} - m B_{11})(u + \hat{f}) + \varepsilon. \qquad (12)$$
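The intermediate-state trick in (12) can be checked numerically: the observer integrates $x'_u$ and recovers $\hat{x}_u = x'_u + m x_m$, so the derivative of the measured state is never needed. A minimal sketch on an assumed 2-state linear plant (the nonlinearity $f$ is zero here purely for the demo; the paper estimates it with a WNN, and all matrices below are illustrative choices):

```python
import numpy as np

# Block partition per (9); values are demo assumptions
A11, A12 = np.array([[0.0]]), np.array([[1.0]])
A21, A22 = np.array([[-2.0]]), np.array([[-3.0]])
B11, B21 = np.array([[0.0]]), np.array([[1.0]])
m = np.array([[2.0]])            # observer gain: A22 - m A12 = [[-5]] is stable

dt, steps = 1e-3, 8000
x_m, x_u = np.array([1.0]), np.array([0.5])   # true plant states
xp = np.array([0.0])                          # intermediate state x' = x_hat_u - m x_m

for _ in range(steps):
    u, f = 0.0, 0.0
    x_hat_u = xp + m[0] * x_m                 # recover the estimate from x'
    # observer (12): uses only x_m, u and f_hat, never dx_m/dt
    dxp = ((A22 - m @ A12) @ x_hat_u + (A21 - m @ A11) @ x_m
           + (B21 - m @ B11)[0] * (u + f))
    # plant (9), simulated here only to generate data
    dx_m = A11 @ x_m + A12 @ x_u + B11[0] * (f + u)
    dx_u = A21 @ x_m + A22 @ x_u + B21[0] * (f + u)
    x_m, x_u, xp = x_m + dt * dx_m, x_u + dt * dx_u, xp + dt * dxp

err = abs((xp + m[0] * x_m) - x_u)[0]         # estimation error, decays as (13) predicts
```

With exact $\hat{f}$ the error obeys the homogeneous part of (13), so it contracts at the rate of $A_{22} - m A_{12}$.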
Assumption 2:
a) $\| f(x(t)) - f(\hat{x}(t)) \| \le \gamma_1 \| \tilde{x}_u \|$;
b) for a symmetric positive definite matrix $Q$ there exists a symmetric positive definite matrix $P$ such that

$$(A_{22} - m A_{12})^T P + P (A_{22} - m A_{12}) = -Q, \qquad (P B_{21})^T = C_{12},$$

where $\tilde{x}_u = x_u - \hat{x}_u$ is the unmeasurable state estimation error and $\gamma_1$ is a positive constant.

Now define the error system as

$$\dot{\tilde{x}}_u = (A_{22} - m A_{12})\, \tilde{x}_u + B_{21} \left( f(x) - \hat{f}(x) \right), \qquad \tilde{y} = C_{12} \tilde{x}_u = \tilde{y}_2. \qquad (13)$$
With the help of the tuning laws proposed below, the error term $\tilde{f}(x)$ is reduced to an arbitrarily small value, which is further attenuated by the robust control term $v_r$. The adaptation laws for the wavelet network used to approximate $f(x)$ are

$$\begin{aligned} \dot{\tilde{\alpha}}_f &= -\dot{\hat{\alpha}}_f = -\beta_1 \tilde{y}_2 \left( \hat{\varphi}_f - \hat{A}_f^T \hat{w}_f - \hat{B}_f^T \hat{c}_f \right), \\ \dot{\tilde{w}}_f &= -\dot{\hat{w}}_f = \beta_2 \tilde{y}_2 \hat{\alpha}_f \hat{A}_f, \\ \dot{\tilde{c}}_f &= -\dot{\hat{c}}_f = \beta_3 \tilde{y}_2 \hat{\alpha}_f \hat{B}_f, \end{aligned} \qquad (14)$$

where $\beta_1$, $\beta_2$ and $\beta_3$ are positive constant learning rates.
5. BASIC CONTROLLER DESIGN USING
FILTERED TRACKING ERROR
Define the state tracking error vector $\hat{e}(t)$ as

$$\hat{e}(t) = \hat{x}(t) - y_d(t) = \hat{e}_m(t) + \hat{e}_u(t). \qquad (15)$$

The filtered tracking error is defined as

$$\hat{r} = K_m \hat{e}_m + K_u \hat{e}_u, \qquad (16)$$

where $K_m \in \Re^{1 \times (n-p)}$ and $K_u \in \Re^{1 \times p}$ are appropriately chosen coefficient vectors such that $\hat{e} \to 0$ exponentially as $\hat{r} \to 0$.

Applying the feedback linearization method, the control law is defined as

$$u = y_d^{(n)} - K_m \hat{e}_m - K_u \hat{e}_u - \hat{r} - \hat{f}. \qquad (17)$$

Stability of the system (5) with the proposed observer and controller strategy is analyzed in Section 7.
6. ADAPTIVE WNN CONTROLLER DESIGN
A novel strategic utility function is defined as the
long-term performance measure for the system. It is ap-
proximated by the WNN critic signal. The action WNN
signal is constructed to minimize this strategic utility
function by using a quadratic optimization function. The
critic WNN and action WNN weight tuning laws are
derived. Stability analysis using the Lyapunov direct
method is carried out for the closed-loop system (6) with
novel weight tuning updates.
6.1. Strategic utility function
The utility function $p(k) = [p_i(k)] \in \Re^m$ is defined on the basis of the filtered tracking error $\hat{r}$ and is given by

$$p_i(k) = \begin{cases} 0 & \text{if } \hat{r}_i^2 \le \eta, \\ 1 & \text{if } \hat{r}_i^2 > \eta, \end{cases} \qquad (18)$$

where $p_i(k) \in \Re$, $i = 1, 2, \ldots, m$, and $\eta \in \Re$ is a predefined threshold. $p(k)$ can be considered as the current performance index, with $p(k) = 0$ indicating good tracking performance and $p(k) = 1$ poor tracking performance.

The strategic utility function $Q'(k) \in \Re^m$ is defined using this binary utility function as

$$Q'(k) = \alpha^N p(k+1) + \alpha^{N-1} p(k+2) + \cdots + \alpha^{k+1} p(N) + \cdots, \qquad (19)$$

where $\alpha \in \Re$, $0 < \alpha < 1$, and $N$ is the horizon. The above equation may be rewritten as

$$Q'(k) = \min_{u(k)} \left\{ \alpha Q'(k-1) - \alpha^{N+1} p(k) \right\}. \qquad (20)$$
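The binary utility (18) and a truncated-horizon evaluation of the discounted sum (19) can be sketched directly; the threshold, discount and horizon values below are illustrative assumptions:

```python
import numpy as np

def utility(r_hat, eta):
    """Binary utility (18): p_i = 0 if r_hat_i^2 <= eta, else 1."""
    return (np.asarray(r_hat) ** 2 > eta).astype(float)

def strategic_utility(p_future, alpha, N):
    """Truncated form of (19): Q'(k) = alpha^N p(k+1) + alpha^(N-1) p(k+2) + ...
    `p_future` holds p(k+1), p(k+2), ... over a finite window."""
    return sum(alpha ** (N - j) * p for j, p in enumerate(p_future))

p = utility([0.05, 0.4], eta=0.1)                 # small error -> 0, large -> 1
Qp = strategic_utility([0.0, 1.0, 1.0], alpha=0.5, N=3)
```

Because each $p_i$ is 0 or 1, $Q'$ simply accrues discounted penalties whenever the filtered tracking error leaves the $\eta$-band.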
6.2. Critic WNN
The long-term system performance can be approximated by the critic WNN by defining the prediction error as

$$e_c(k) = \hat{Q}(k) - \alpha \left( \hat{Q}(k-1) - \alpha^N p(k) \right), \qquad (21)$$

where $\hat{Q}(k) = \hat{w}_1^T(k)\, \phi(v_1^T x(k)) = \hat{w}_1^T(k)\, \phi_1(k)$, $e_c(k) \in \Re^m$, $\hat{Q}(k) \in \Re^m$ is the critic signal, $\hat{w}_1(k) \in \Re^{n_1 \times m}$ and $v_1 \in \Re^{mn \times n_1}$ represent the weight estimates, $\phi_1(k) \in \Re^{n_1}$ is the wavelet activation function and $n_1$ is the number of nodes in the wavelet layer. The objective function to be minimized by the critic WNN is defined as

$$E_c(k) = \tfrac{1}{2}\, e_c^T(k)\, e_c(k). \qquad (22)$$

The weight update rule for the critic WNN is derived from gradient-based adaptation:

$$\hat{w}_1(k+1) = \hat{w}_1(k) + \Delta \hat{w}_1(k), \qquad (23)$$

where $\Delta \hat{w}_1(k) = -\alpha_1 \left[ \partial E_c(k) / \partial \hat{w}_1(k) \right]$, or

$$\hat{w}_1(k+1) = \hat{w}_1(k) - \alpha_1 \phi_1(k) \left( \hat{w}_1^T(k) \phi_1(k) + \alpha^{N+1} p(k) - \alpha\, \hat{w}_1^T(k-1)\, \phi_1(k-1) \right)^T, \qquad (24)$$

where $\alpha_1 \in \Re$ is the WNN adaptation gain. The critic WNN weights are tuned by the reinforcement learning signal and the discounted past output values of the critic WNN.
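One gradient step of (23)-(24) is easy to verify in code: holding the $k{-}1$ terms fixed, the step scales the prediction error by $1 - \alpha_1 \|\phi_1(k)\|^2$. The layer sizes, learning rate and random activations below are sketch assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n1, m_out = 8, 1                      # wavelet nodes and output size (assumed)
w1 = rng.standard_normal((n1, m_out)) * 0.1

def critic_update(w1, phi_k, phi_km1, p_k, w1_prev, alpha=0.5, N=3, lr=0.05):
    """Gradient step (23)-(24) on E_c = 1/2 e_c^T e_c with prediction error (21):
    e_c(k) = w1^T phi(k) - alpha (w1(k-1)^T phi(k-1) - alpha^N p(k))."""
    e_c = w1.T @ phi_k - alpha * (w1_prev.T @ phi_km1 - alpha ** N * p_k)
    return w1 - lr * np.outer(phi_k, e_c)

phi_k, phi_km1 = rng.standard_normal(n1), rng.standard_normal(n1)
w1_new = critic_update(w1, phi_k, phi_km1, p_k=1.0, w1_prev=w1)
```

In the full scheme the wavelet activations $\phi_1(k)$ come from the discretized family of Section 2 rather than random draws.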
6.3. Action WNN
The action WNN is implemented to approximate the unknown nonlinear function $f(x(k))$ and to provide an optimal control signal as part of the overall input $u(k)$:

$$\hat{f}(k) = \hat{w}_2^T(k)\, \phi(v_2^T x(k)) = \hat{w}_2^T(k)\, \phi_2(k), \qquad (25)$$

where $\hat{w}_2(k) \in \Re^{n_2 \times m}$ and $v_2 \in \Re^{mn \times n_2}$ represent the weight estimates, $\phi_2(k) \in \Re^{n_2}$ is the activation function and $n_2$ is the number of nodes in the hidden layer. Let the unknown target output-layer weight of the action WNN be $w_2$; then

$$f(k) = w_2^T \phi(v_2^T x(k)) + \varepsilon_2(x(k)) = w_2^T \phi_2(k) + \varepsilon_2(x(k)), \qquad (26)$$

where $\varepsilon_2(x(k)) \in \Re^m$ is the WNN approximation error. From (25) and (26) we get

$$\tilde{f}(k) = \hat{f}(k) - f(k) = (\hat{w}_2(k) - w_2)^T \phi_2(k) - \varepsilon_2(x(k)), \qquad (27)$$

where $\tilde{f}(k) \in \Re^m$ is the functional estimation error.

The action WNN weights are tuned using the functional estimation error $\tilde{f}(k)$ and the error between the desired strategic utility function $Q_d(k) \in \Re^m$ and the critic signal $\hat{Q}(k)$. Define

$$e_a(k) = \tilde{f}(k) + \left( \hat{Q}(k) - Q_d(k) \right). \qquad (28)$$

The objective is to make the desired utility function $Q_d(k)$ zero at every step. Thus (28) becomes

$$e_a(k) = \tilde{f}(k) + \hat{Q}(k). \qquad (29)$$

The objective function to be minimized by the action WNN is

$$E_a(k) = \tfrac{1}{2}\, e_a^T(k)\, e_a(k). \qquad (30)$$

The weight update rule for the action WNN is also a gradient based adaptation:

$$\hat{w}_2(k+1) = \hat{w}_2(k) + \Delta \hat{w}_2(k), \qquad (31)$$

where $\Delta \hat{w}_2(k) = -\alpha_2 \left[ \partial E_a(k) / \partial \hat{w}_2(k) \right]$, or

$$\hat{w}_2(k+1) = \hat{w}_2(k) - \alpha_2 \phi_2(k) \left( \hat{Q}(k) + \tilde{f}(k) \right)^T, \qquad (32)$$

where $\alpha_2 \in \Re$ is the WNN adaptation gain.

The weight update rule (32) cannot be implemented in practice since the nonlinear function $f(x(k))$ is unknown. However, using (16), the functional estimation error can be written as

$$\tilde{f}(k) = \hat{r} - r + \delta(k). \qquad (33)$$

Substituting (33) into (32), we get

$$\hat{w}_2(k+1) = \hat{w}_2(k) - \alpha_2 \phi_2(k) \left( \hat{Q}(k) + \hat{r} - r + \delta(k) \right)^T.$$

To implement the weight update rule, the unknown but bounded disturbance $\delta(k)$ is taken to be zero. Then (32) is rewritten as

$$\hat{w}_2(k+1) = \hat{w}_2(k) - \alpha_2 \phi_2(k) \left( \hat{Q}(k) + \hat{r} - r \right)^T. \qquad (34)$$
Thus, after replacing the functional estimation error, the weight update for the action WNN is driven by the critic WNN output, the current filtered tracking error, and a conventional outer-loop signal. The block diagram of the proposed control strategy is shown in Fig. 1.
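The implementable update (34) is a single outer-product step. The sketch below uses toy values for the activations, critic output and tracking errors (all assumptions for illustration):

```python
import numpy as np

def action_update(w2, phi2, Q_hat, r_hat, r, lr=0.05):
    """Implementable action-WNN update (34):
    w2(k+1) = w2(k) - lr * phi2(k) (Q_hat(k) + r_hat - r)^T,
    with r_hat - r standing in for the unmeasurable f_tilde (delta set to zero)."""
    e = np.atleast_1d(Q_hat + r_hat - r)
    return w2 - lr * np.outer(phi2, e)

w2 = np.zeros((4, 1))                       # 4 wavelet nodes, 1 output (assumed)
w2_new = action_update(w2, phi2=np.array([1.0, 0.5, -0.5, 0.0]),
                       Q_hat=0.2, r_hat=0.3, r=0.1)
```

The update pushes $\hat{f}$ in the direction that shrinks both the critic's long-term cost estimate and the instantaneous filtered tracking error, which is exactly the dual role described above.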
7. STABILITY ANALYSIS
Consider a Lyapunov functional of the form

$$V = \tfrac{1}{2}\, \tilde{x}_u^T P \tilde{x}_u + \tfrac{1}{2}\, \hat{r}^2. \qquad (35)$$

Differentiating it along the trajectories of the system,

$$\dot{V} = -\tfrac{1}{2}\, \tilde{x}_u^T Q \tilde{x}_u + \tilde{x}_u^T P B_{21} \left( f(x) - \hat{f}(\hat{x}) \right) + \hat{r} \left( K_m \dot{\hat{e}}_m + K_u \dot{\hat{e}}_u \right).$$

Substituting the control law $u$ in the above equation and bounding term by term,

$$\dot{V} \le -\tfrac{1}{2}\, \lambda_{\min}(Q) \| \tilde{x}_u \|^2 + M_1 \| \tilde{x}_u \| + M_2 \| \tilde{x}_u \|^2 + M_3 \hat{r} + M_4 \| \tilde{x}_u \| \hat{r} - k_r \hat{r}^2,$$

where $M_1 = \| P B_{21} \| \gamma_3$, $M_2 = \gamma_1 \max \| P \|$, $M_3 = \max \hat{r} + \gamma_4$, $M_4 = \max \| K m C \|$, and $\gamma_1, \gamma_2, \gamma_3, \gamma_4 \ge 0$. The system is therefore stable as long as

$$k_r \hat{r}^2 + \tfrac{1}{2}\, \lambda_{\min}(Q) \| \tilde{x}_u \|^2 \ge M_1 \| \tilde{x}_u \| + M_2 \| \tilde{x}_u \|^2 + M_3 \hat{r} + M_4 \| \tilde{x}_u \| \hat{r} + \tfrac{1}{2} \| \tilde{y} \|^2. \qquad (36)$$

By proper selection of $k_r$, $P$ and $Q$, the above condition can be satisfied.
8. SIMULATION RESULTS
Simulation is performed to verify the effectiveness of
proposed WNN reduced order observer based control
strategy. Considering a system of the form
1 2
2 3
3 1 2 3 1 2
1 2 3
,
,
5 6 9 0.01 sin ,
2 .
x x
x x
x x x x x x u
y x x x
=
=
= − − − + +
= + +
�
�
�
(37)
Here x1 and x2 are assumed to be known states and x3
is assumed to be estimated using the proposed reduced
order observer. System belongs to the class of uncertain
nonlinear systems defined by (5) with n =3. It is assumed
that only output is available for measurement. The
proposed observer controller strategy is applied to this
system with an objective to solve the tracking problem of
system. The desired trajectory is taken as 0.5sindy t=
0.1cos 0.4.t+ + Initial conditions are taken as (0)x =
[0.6,0.2,0.5] .T Attenuation level for robust controller is
taken as 0.01. Controller gain vector is taken as [10,k =
5,1]. Wavelet networks with discrete Shannon’s wavelet
as the mother wavelet is used for approximating the
unknown system dynamics. Wavelet parameters for these
wavelet networks are tuned online using the proposed
adaptation laws. Initial conditions for all the wavelet
parameters are set to zero. Simulation results are shown
in the figures. As observed from the figures, system
response tracks the desired trajectory rapidly.
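As a cross-check on the example, the plant (37) can be simulated under an exact-model feedback-linearizing law of the form (17). This is only a stand-in for the full WNN observer-controller: $f$ is used directly rather than estimated, and the gains below are chosen for this sketch by pole placement at $-1, -2, -3$ (they are not the paper's gain vector):

```python
import numpy as np

def f(x):
    # plant nonlinearity from (37)
    return -5*x[0] - 6*x[1] - 9*x[2] + 0.01*x[0]*np.sin(x[1])

dt, T = 1e-3, 40.0
x = np.array([0.6, 0.2, 0.5])            # paper's initial condition
k = np.array([6.0, 11.0, 6.0])           # sketch gains: (s+1)(s+2)(s+3)

errs = []
for i in range(int(T / dt)):
    t = i * dt
    # desired trajectory yd = 0.5 sin t + 0.1 cos t + 0.4 and its derivatives
    yd  = 0.5*np.sin(t) + 0.1*np.cos(t) + 0.4
    yd1 = 0.5*np.cos(t) - 0.1*np.sin(t)
    yd2 = -0.5*np.sin(t) - 0.1*np.cos(t)
    yd3 = -0.5*np.cos(t) + 0.1*np.sin(t)
    e = x - np.array([yd, yd1, yd2])     # full-state tracking error
    u = yd3 - f(x) - k @ e               # exact-model control, for comparison only
    dx = np.array([x[1], x[2], f(x) + u])
    x = x + dt * dx                      # forward-Euler step
    errs.append(abs(e[0]))

final_err = errs[-1]                     # tracking error after 40 s
```

Under these assumptions the tracking error decays exponentially, consistent with the rapid convergence reported for the full scheme in Figs. 2 and 3.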
Fig. 1. Block diagram of the closed loop system.
Fig. 2. States and the desired trajectories.
Fig. 3. Control effort and tracking error.
Fig. 4. Actual output, estimated output and output error.
9. CONCLUSION
A WNN reduced order observer based adaptive
tracking control strategy is proposed for a class of
systems with unknown system dynamics. Adaptive
wavelet networks are used for approximating the
unknown system dynamics. Adaptation laws are
developed for online tuning of the wavelet parameters.
The stability of the overall system is guaranteed by using
the Lyapunov functional. The theoretical analysis is
validated by the simulation results. As observed from the
Fig. 2 that all the states are bounded and tracking their
desired trajectory. Fig. 3 indicates the control effort and
the tracking error between the system output and desired
output which converges rapidly. The plant output and the
estimated output are shown in Fig. 4. Also the error
between them is shown, which is of the order of 10–4
which reveals the efficient observer design. A
convergence pattern in the observer error is also reflected
from the Fig. 4.
REFERENCES
[1] H. Lens and J. Adamy, “Observer based controller
design for linear systems with input constraints,”
Proc. of the 17th World Congress, The Internation-
al Federation of Automatic Control, Seoul, Korea,
pp. 9916-9921, July 2008.
[2] F. Abdollahi, H. A Talebi, and R. V. Patel, “A sta-
ble neural network-based observer with application
to flexible-joint manipulators,” IEEE Trans. on
Neural Networks, vol. 17, no. 1, pp. 118-129, Janu-
ary 2006.
[3] M. Sharma, A. Kulkarni, and A. Verma, “Wavelet
adaptive observer based control for a class of un-
certain time delay nonlinear systems with input
constraints,” Proc. of IEEE International Confe-
rence on Advances in Recent Technologies in
Communication and Computing, ARTCOM, pp.
863-86, 2009.
[4] V. Sundarapandian, “Reduced order observer de-
sign for nonlinear systems,” Applied Mathematics
Letters, vol. 19, pp. 936-941, 2006.
[5] Z. F. Lai and D. X. Hao, “The design of reduced-
order observer for systems with monotone nonli-
nearities,” ACTA Automatica Sinica, vol. 33, no. 2,
pp. 1290-1293, 2007.
[6] Y. G. Liu and J. F. Zhang, “Reduced-order observ-
er-based control design for nonlinear stochastic sys-
tems,” Systems & Control Letters, vol. 52, pp. 123-
135, 2004.
[7] G. Bartolini, E. Punta, and T. Zolezzi, “Reduced-
order observer for sliding mode control of nonli-
near non-affine systems,” Proc. of the 47th IEEE
Conference on Decision and Control, Mexico,
2008.
[8] P. He and S. Jagannathan, “Reinforcement learning
neural-network-based controller for nonlinear dis-
crete-time systems with input constraints,” IEEE
Trans. on Systems, Man, and Cybernetics—Part B:
Cybernetics, vol. 37, no. 2, pp. 425-436, April 2007.
[9] W. S. Lin, L. H. Chang, and P. C. Yang, “Adaptive
critic anti-slip control of wheeled autonomous ro-
bot,” IET Control Theory Applications, vol. 1, no. 1,
January 2007.
[10] L. G. Crespo, “Optimal performance, robustness
and reliability base designs of systems with struc-
tured uncertainty,” Proc. of American Control Con-
ference, pp. 4219-4224, USA, Colorado, 2003.
[11] J. Peters and S. Schaal, “Policy gradient methods
for robotics,” Proc. of the IEEE International Con-
ference on Intelligent Robotics Systems, pp. 2219-
2225, 2006.
[12] J. J. Murray, C. Cox, G. G. Lendaris, and R. Saeks,
“Adaptive dynamic programming,” IEEE Trans. on
Syst., Man, Cybern., vol. 32, no. 2, pp. 140-153,
May 2002.
[13] D. V. Prokhorov and D. C. Wunsch, “Adaptive
critic designs,” IEEE Trans. on Neural Networks,
vol. 8, no. 5, pp. 997-1007, September 1997.
[14] H. V. Hasselt and M. Wiering, “Reinforcement
learning in continuous action spaces,” Proc. of
IEEE Symposium on Approximate Dynamic Pro-
gramming and Reinforcement Learning, pp. 272-
279, 2007.
[15] Q. Zhang and A. Benveniste, “Wavelet networks,”
IEEE Trans. on Neural Networks, vol. 3, no. 6, pp.
889-898, November 1992.
[16] J. Zhang, G. G. Walter, Y. Miao, and. W. Lee,
“Wavelet neural networks for function learning,”
IEEE Trans. on Signal Processing, vol. 43, no. 6,
pp. 1485-1497, June 1995.
[17] B. Delyon, A. Juditsky, and A. Benveniste, “Accu-
racy analysis for wavelet approximations,” IEEE
Trans. on Neural Networks, vol. 6, no. 2, pp. 332-
348, March 1995.
[18] W. Sun, Y. Wang, and J. Mao, “Wavelet network
for identifying the model of robot manipulator,”
Proc. of the 4th World Congress on Intelligent
Control and Automation, pp. 1634-1638, China,
June 2002.
Manish Sharma is pursuing a Ph.D.
degree in Electronics and Telecommuni-
cation Engineering from Devi Ahilya
University. His research interests include
nonlinear adaptive control, wavelet neur-
al network, observer based control and
system identification.
Ajay Verma received his Ph.D. degree in
Electronics and Telecommunication En-
gineering from Devi Ahilya University.
His research interests include nonlinear
dynamics and system theory, neural net-
works and nonlinear control system.