http://www.iaeme.com/IJMET/index.asp 445 [email protected]
International Journal of Mechanical Engineering and Technology (IJMET)
Volume 9, Issue 2, February 2018, pp. 445–460 Article ID: IJMET_09_02_046
Available online at http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=9&IType=2
ISSN Print: 0976-6340 and ISSN Online: 0976-6359
© IAEME Publication Scopus Indexed
ON-LINE LEARNING OF ROBOT INVERSE
DYNAMICS WITH CEREBELLAR MODEL
CONTROLLER IN FEEDFORWARD
CONFIGURATION
Lavdim Kurtaj, Vjosa Shatri* and Ilir Limani
Faculty of Electrical and Computer Engineering,
University of Prishtina “Hasan Prishtina”, 10000 Prishtina, Kosovo.
*Corresponding Author
ABSTRACT
Performance of robot control in trajectory tracking can be improved considerably
if the robot inverse dynamics model is known. It may be used in feedforward or in
computed torque configuration. Cerebellar model controllers can be used to acquire
the inverse robot dynamics model on-line. In this paper we explore different structural
aspects of the cerebellar controller in feedforward configuration for improving robot
control performance. The cerebellar controller is used beside a conventional proportional-derivative
controller, and it learns by using the output of the latter as teaching signal. Effects
of a cerebellar controller with dimensionality of input space lower than that of the
problem to be learned are explored. The influence of fully coupled Albus overlays with uniform
population coding for input dimensions, at different number, shape and width of
receptive fields, on the accuracy of the acquired model is investigated. Root-mean-square of
position and speed error is used as measure of control performance. How
normalization of receptive fields affects cerebellar control performance is explored by
using receptive fields with self-normalization property and those without it. A Simulink
model of the cerebellar controller that preserves layered organization is used, along with a robot
plant model built in SimMechanics.
Key words: Cerebellar model controller, robot inverse dynamics, on-line learning,
feedforward robot controller, receptive fields.
Cite this Article: Lavdim Kurtaj, Vjosa Shatri and Ilir Limani, On-line learning of
robot inverse dynamics with Cerebellar Model Controller in feedforward
configuration, International Journal of Mechanical Engineering and Technology 9(2),
2018. pp. 445–460.
http://www.iaeme.com/IJMET/issues.asp?JType=IJMET&VType=9&IType=2
1. INTRODUCTION
From a control point of view, robots have a number of actuators that must be driven in a
coordinated manner. Actuators drive joints that are connected by links, forming a structural
arrangement that is able to perform the tasks it is intended for. Typically each joint actuator is
treated as part of a separate independent control system [1] with its own controller of
proportional-integral-derivative (PID) type. But most robotic structures are characterized by
inherent dynamic interactions, or couplings, between joints. These couplings are
manifested as disturbances for independent joint controllers, and one relies on the ability of the
joint controller to suppress them to a satisfactory level. When actuators are equipped with high
reduction ratio gearboxes, as is the case for low speed operation, disturbances at the output axle
will be highly attenuated and will have little influence on control performance of independent
joint actuators [2]. Use of constant parameter controllers of PID type is justified in this case.
In more demanding applications more advanced controllers are to be used. A broad range of
them rely on knowledge of robot dynamics, and use this information to improve control
performance. With an accurate robot dynamics model this leads to perfect control [3]. Even
though perfect control cannot be obtained in practice, it serves as a guiding target for
controller design. The control problem is now converted to a problem of finding a robot dynamics model
[4] that is as accurate as possible.
An attractive way of finding a plant model, avoiding physical modeling, is to learn it with
some type of artificial neural network (ANN) [5]. The part of the brain that is thought to be highly
involved in coordination of multi-joint movements is the cerebellum [6]. How the neuronal structure
of the cerebellum, and of other parts of the brain [7], is related to the function it is involved in is a
question that has attracted much attention from the research community. For the cerebellum, the relation of
neuronal connectivity to its function has been covered by two theories of cerebellar
function, in 1969 by Marr [8] and in 1971 by Albus [9]. The theories assume a physiological
mechanism of learning at specific parts of connectivity between neurons, namely synapses,
for the former in the form of long-term potentiation (LTP) and for the latter in the form of long-term
depression (LTD). About a decade later, in 1982, Ito [10] found that the physiological mechanism
that aids learning at the specified site was in the form of LTD. Theories of cerebellar functioning
were followed by a computational model in 1975 by Albus [11]. Based on its main assumed
functionality in articulating multi-joint movements, it was named Cerebellar Model
Articulation Controller, or CMAC for short.
CMAC neural network in one of its extremes of implementation is treated as a lookup table
[12, 13], where binary equivalents of inputs serve as the address of a memory location where
information will be stored during learning or retrieved when it is used for control. Each
location represents exclusively a part of the multi-dimensional input space in the form of a
hypercube (multi-dimensional receptive field), one quantum wide (binary receptive field one
quantum wide) in each direction, with the same value over the corresponding hypercube. Only one
memory location will be updated during learning, and the content of only one memory
location will determine the output. The result of learning will be a stepped approximation of the
hyper-surface describing the input-output relationship for a multi-input-single-output (MISO)
process. For multi-input-multi-output (MIMO) processes each addressed location would be a vector with
number of elements equal to the number of outputs. This form of learning is able to learn any
MISO (and MIMO) process with desired accuracy [14], but uniform division of the multi-dimensional
input space into hypercubes can result in an enormous and impractical number of memory
locations. To attain the desired accuracy, quantum width must be determined from the steepest part of
the hyper-surface.
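The lookup-table extreme described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the implementation of [12, 13]: the class name TableCMAC, the learning gain beta and the example input ranges are hypothetical choices made for the sketch.

```python
import numpy as np

class TableCMAC:
    """Lookup-table CMAC: one memory cell per one-quantum-wide hypercube."""

    def __init__(self, lows, highs, quanta):
        self.lows = np.asarray(lows, dtype=float)
        self.highs = np.asarray(highs, dtype=float)
        self.quanta = int(quanta)
        # Storage grows as quanta ** dims, which is the practical limitation.
        self.table = np.zeros((self.quanta,) * len(self.lows))

    def address(self, x):
        """Quantize each input; the tuple of indices addresses one cell."""
        frac = (np.asarray(x, dtype=float) - self.lows) / (self.highs - self.lows)
        idx = np.clip((frac * self.quanta).astype(int), 0, self.quanta - 1)
        return tuple(idx)

    def output(self, x):
        # The content of a single cell determines the output.
        return self.table[self.address(x)]

    def learn(self, x, target, beta=1.0):
        # Only the single addressed cell is updated (beta is an assumed gain).
        a = self.address(x)
        self.table[a] += beta * (target - self.table[a])

# Hypothetical two-input example with 72 quanta per input.
cmac = TableCMAC(lows=[-1.0, -1.0], highs=[1.0, 1.0], quanta=72)
cmac.learn([0.2, -0.5], target=3.0)
```

With 72 quanta per input the two-input table already holds 72•72 = 5184 cells, and every added input dimension multiplies that number by 72.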
Number of partitions can be decreased by using wider receptive fields, with positive
influence on generalization, or by creating non-uniform partitions (receptive fields of different
widths) per dimension, or over the same dimension [18, 16]. One issue that deserves
consideration, when assuming adaptability at input layer, is applicability of it to the
distributed usage (by cerebellar parallel fibers [6]) of the same information as present in the
cerebellar neuronal connectivity.
All approaches suffer from the curse of dimensionality [18], i.e. exponential increase in the number of
higher dimensional hypercubes (or hyper-parallelepipeds with non-uniform partitions of input
dimensions) and corresponding weights where learning takes place. Approaches that apply
equally to the original model and all aforementioned modifications, and that aid in decreasing this
impractical storage space, are higher order receptive fields [19], Albus overlays [11] and
hashing [11]. The first approach, besides decreasing the number of hypercubes, will provide a smoother
approximation instead of the stepped one of standard CMAC [20]. Using Albus overlays will
lose some of the functionality, but it will contribute to a lower number of hypercubes when
wider receptive fields are used, by using only a number equal to dimensionality of input space
from the total number of higher dimensional receptive fields for a given input. Hashing is based on
the simple fact that for higher dimensional input spaces only a small fraction of the space will be used
under normal working conditions, making most of the hypercubes (higher dimensional
receptive fields) unused. Hashing is a many-to-one mapping, and will map this high number of
hypercubes to a much lower number of memory locations where learning information is stored.
Collisions of storage spaces may happen (if not intentionally resolved) and will be manifested
as noise, but with proper design they seem to be acceptable in practice. Higher order receptive
fields are biologically plausible. Albus overlays may also be thought of as biologically
plausible if considering random connectivity between input information and processing units
(neurons) that generate higher order receptive fields, especially under space constraints.
Hashing will reduce storage space in a computational model, but would not decrease the number of
processing units biologically.
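The many-to-one character of hashing can be illustrated with a short sketch. The hash function below is a generic polynomial hash chosen for illustration only; it is an assumption of this example and not the scheme used in [11]. The grid size and memory size are hypothetical as well.

```python
from itertools import product

def hashed_address(cell, memory_size):
    """Map a tuple of hypercube indices to one of memory_size slots."""
    h = 0
    for index in cell:
        # Illustrative polynomial hash; any many-to-one mapping would do.
        h = (h * 1_000_003 + int(index)) % memory_size
    return h

quanta, dims, memory_size = 20, 3, 1024
virtual_cells = quanta ** dims  # 8000 hypercubes in the full input space

# Slots actually addressed when every hypercube is visited once.
used_slots = {hashed_address(cell, memory_size)
              for cell in product(range(quanta), repeat=dims)}
```

Since 8000 hypercubes share at most 1024 slots, some hypercubes necessarily collide; in a working controller only the visited part of the input space competes for slots, which is what keeps the resulting noise tolerable.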
In this paper we explore the influence of receptive field shape and width, for cerebellar
model input signal coding, on the quality of acquired robot inverse dynamics during on-line
learning in feedforward configuration. Only Albus full-overlaid CMAC is considered.
Multiplication operator is used for generating higher-order receptive fields from one-dimensional
receptive fields used for input signal coding. In the rest of the paper, Section 2 gives a
short overview of utilization of the robot inverse dynamics model in feedforward and feedback
control structures. It is followed by a presentation of the neuronal circuit of the cerebellum and
typical information processing by cerebellar models. Results of simulations in controlling a
robot with a cerebellar model using receptive fields of different shapes and widths, while it is
learning robot inverse dynamics on-line, are presented in Section 3. The paper ends with the
Conclusions Section.
2. METHODS
Two main control structures where robot inverse dynamics can be used to improve control
performance are presented. Cerebellum based model will be used for on-line learning of robot
inverse dynamics.
2.1. Model of Robot Inverse Dynamics in Control
The inverse dynamics model of the robot can be used to improve control performance. With an
accurate model, performance of up to perfect control can be obtained [3], theoretically. Since
the controller will be implemented in a digital controller, with limited number of
calculations per second, even with an ideal robot inverse dynamics model perfect control is not
achievable. Another factor that can prevent perfect control is friction, for which it is almost impossible to
create an ideal model. Inaccuracies in the robot model itself or in the implementation make
the presence of a conventional controller indispensable. Figure 1 shows two standard configurations
where the inverse dynamics model can be incorporated in a joint controller. Figure 1(a) represents
a robot (joint) control structure with conventional PD controller augmented with a feedforward
controller utilizing the inverse dynamics model. If the robot model is accurate, all control action will
be generated by the feedforward controller. Unmodelled dynamics will be handled by the
conventional controller, and if there are none its output will be zero. Use of the inverse dynamics
controller in a computed torque control structure is shown in Figure 1(b). In this case actual joint
positions and speeds and desired joint accelerations serve as input to the block that
computes the necessary actuating torques. For smooth reference trajectories with continuous
joint positions, speeds and accelerations, performance of both structures will be the same [21].
The second configuration is sometimes implemented similarly to the structure in Figure 1(a), by
preserving standard connectivity of the conventional controller, and by using actual values for
joint positions and accelerations.
Figure 1 Robot controllers based on inverse dynamics model. (a) Robot control structure with
conventional PD controller augmented with feedforward controller utilizing robot inverse dynamics
model. (b) Robot computed torque controller
Notice that physical meaning of output signal from conventional PD controller will be
different for two control structures in Figure 1, being torque (or voltage) for first one in Figure
1(a), and acceleration for second control structure in Figure 1(b).
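The two structures can be written compactly, assuming the standard rigid-body form of robot dynamics with inertia matrix D(q), Coriolis and centrifugal terms C(q, q̇)q̇ and gravity vector G(q). The gain symbols K_P and K_D and the subscript d for desired values are notation introduced for this sketch.

```latex
\begin{align}
\tau &= D(q)\,\ddot{q} + C(q,\dot{q})\,\dot{q} + G(q)
  && \text{(inverse dynamics)} \\
\tau_c &= \underbrace{D(q_d)\,\ddot{q}_d + C(q_d,\dot{q}_d)\,\dot{q}_d + G(q_d)}_{\tau_{FF}}
  + \underbrace{K_P\,(q_d - q) + K_D\,(\dot{q}_d - \dot{q})}_{\tau_{PD}}
  && \text{(a) feedforward} \\
\tau_c &= D(q)\,\bigl[\ddot{q}_d
  + \underbrace{K_P\,(q_d - q) + K_D\,(\dot{q}_d - \dot{q})}_{a_{PD}}\bigr]
  + C(q,\dot{q})\,\dot{q} + G(q)
  && \text{(b) computed torque}
\end{align}
```

In (a) the PD output is added directly to the feedforward torque, so it has the meaning of a torque; in (b) it is added to the desired acceleration before multiplication by D(q), so it has the meaning of an acceleration.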
2.2. Neuronal Circuit of the Cerebellum and Information Processing
Cerebellum is the part of the brain that contains more than half of the total number of neurons,
while occupying only a fraction of the total volume. It is attributed a main functionality in
coordinating multi-joint movement. Being well separated from, but resembling, the brain, it is
also known as the little brain. Despite the huge number of neurons, its organization seems simple and
regular, as shown in Figure 2. Cerebellar cortex is organized in three layers of neurons. The inner
layer is the granular layer, where granule cells (smallest and most numerous type
of neuron) and Golgi cells mainly reside, GrC and GoC in Figure 2. The middle layer, namely the Purkinje cell
layer, contains only bodies of Purkinje cells arranged in a single cell thick layer, PC in Figure 2.
The third, outer layer is the molecular layer. It contains stellate cells and basket cells, StC and BaC in
Figure 2, and a remarkable organization of almost flat dendrite trees of Purkinje cells with
parallel fibers (axons of granule cells, PF in Figure 2) passing through them at right angles.
Information enters the cerebellum through mossy fibers (mf1, mf2 and mf3 in Figure 2) and
through climbing fibers (cf in Figure 2). In the developed cerebellum only one climbing fiber will
target a given Purkinje cell, and climbing fibers are assumed to carry the teaching signal in the form of error. All
climbing fibers originate from the inferior olivary nucleus (IO in Figure 2). The only output from
the cerebellar cortex is through Purkinje cell axons, which target deep cerebellar nuclei (DCN in
Figure 2), with axons of the latter being the only output from the cerebellum.
Figure 2 Neuronal circuit of the Cerebellum. Green lines ending with a circle are excitatory
connections; red lines ending with rhombs are inhibitory connections; climbing fibers with blue lines
ending with circles are also excitatory. mf1, mf2 and mf3: mossy fibers; gl: glomeruli; GrC: granule
cells; GoC: Golgi cells; aa: ascending axon; PF: parallel fibers; StC: stellate cells; BaC: basket cells;
PC: Purkinje cells; DCN: deep cerebellar nuclei; cf: climbing fiber; IO: inferior olive
Information carried by mossy fibers is population coded with (assumed) one-dimensional
receptive fields of specific form, with square (0/1 or binary), triangular and Gaussian being
the most common, shown in Figure 2 as 1-dim RF. They are processed expansively by the granule
cell and Golgi cell processing arrangement, generating several orders of magnitude more
parallel fibers than original mossy fibers. Theoretically all possible combinations between
mossy fibers of different input dimensions are formed. Parallel fibers are axons of granule
cells that rise vertically from the granule cell layer (ascending axon part, aa in Figure 2) through
the Purkinje cell layer toward the molecular layer, where they split in T-shaped form creating parallel
fibers. They are assumed to carry higher-dimensional information (n-dim RF in Figure 2) that
is distributed through several hundred Purkinje cell dendrite trees, but not necessarily
contacting all of them. One Purkinje cell may make contacts (synapses) with hundreds of
thousands (species dependent) of parallel fibers, out of orders of magnitude more of them
passing through. These contacts are assumed to be the main learning site [8, 9] where plasticity is
present, and are represented with synaptic weights wPF-PC in Figure 2. Co-activation of PF and
cf will induce LTD [10]. Hundreds of Purkinje cell axons target a deep cerebellar neuron,
which is also targeted by mossy fibers and climbing fibers and generates one excitatory output
from the cerebellum. DCN also inhibit IO neurons, the source of climbing fibers, shown in Figure 2
as a recurrent loop between DCN and IO. Stellate cells and basket cells also use
information from parallel fibers and create inhibitory connections with Purkinje cells:
stellate cells target the dendrite tree, whereas basket cells target the dendrite tree and form a specific basket
arrangement around the Purkinje cell body where the axon originates. Most models based on CMAC
[11] do not use the DCN neuron and use only one PC per output, assuming a simple summation
function for neurons of DCN and the same learning signal for all PC that target a given DCN
neuron. Also, cerebellar models do not necessarily respect the number of inputs that target a specific
neuron and the amount of divergence for outputs from cerebellar neurons. The Purkinje cell and
corresponding synaptic weights are modeled as a simple perceptron with linear activation
function. Functionality of the granule cell layer with GrCs and GoCs is modeled as a logical AND
or multiplication operator that generates multi-dimensional receptive fields at PF from one-dimensional
receptive fields at mossy fibers. The first operator can be used if input receptive
fields are logical (0/1), and the latter in the general case with population coded receptive fields of any
form.
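The simplified processing chain described above (granule cell layer as AND or multiplication, Purkinje cell as linear perceptron) can be sketched as follows. The function names and the example activations are hypothetical; the sketch only illustrates the two operators and is not the Simulink implementation used in this paper.

```python
import numpy as np

def granule_layer(mossy_activations, binary=False):
    """Combine 1-D receptive-field activations into n-D ones (parallel fibers)."""
    pf = np.asarray(mossy_activations[0], dtype=float)
    for act in mossy_activations[1:]:
        act = np.asarray(act, dtype=float)
        if binary:
            # Logical AND, valid only for 0/1 receptive fields.
            pf = np.logical_and.outer(pf > 0, act > 0).astype(float)
        else:
            # Multiplication operator, valid for receptive fields of any form.
            pf = np.multiply.outer(pf, act)
    # All combinations between fields of different input dimensions.
    return pf.ravel()

def purkinje_cell(pf, weights):
    """Linear perceptron: weighted sum of parallel-fiber activations."""
    return float(np.dot(weights, pf))

# Hypothetical activations of two inputs coded with 4 and 3 receptive fields.
mf1 = [0.0, 0.5, 0.5, 0.0]
mf2 = [0.25, 0.75, 0.0]
pf = granule_layer([mf1, mf2])          # 4 * 3 = 12 parallel fibers
output = purkinje_cell(pf, np.ones(pf.size))
```

For 0/1 receptive fields both operators give the same parallel-fiber pattern, which is why multiplication can be used as the general case.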
Generation of population coded information in the mossy fibers is supposed to
happen outside of the cerebellum. Most cerebellar models use transformations that
convert input signals of each dimension to a group of signals with compact support receptive
fields of specific form, like those seen in Figure 2 at input. They can be arranged in groups of
non-overlapping receptive fields (Albus layers) like in the standard Albus CMAC model [11], or
in a single layer of overlapping receptive fields in later CMAC models. In this paper three
shapes (square, triangular and Gaussian) of different widths for receptive fields used for
population coding input signals will be explored. Figure 3 shows one layer of one-dimensional
overlapping receptive fields used for coding input signal xi.
Figure 3 One layer of overlapping triangular receptive fields. Layer i has ni receptive fields, Bi,1 to
Bi,ni. Width of receptive field is CBi,4. All layers will be similar to the one shown, but may have a different
number of receptive fields and different widths
Only one layer of triangular receptive fields with a number of receptive fields of given
width is shown for population coding of input signal xi. Bi,k is basis function for receptive
field and CBi,4 is its width. Different inputs may use different receptive field shape and/or
width. Higher-order receptive fields carried by parallel fibers will be generated by using
multiplication operator over one-dimensional receptive fields that are used for population
coding input signals carried by mossy fibers.
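Population coding with one layer of overlapping triangular receptive fields can be sketched as below. It is assumed here that field centers are uniformly spaced over the input range and that the width is the full base of the triangle expressed in quanta; the function name and parameters are hypothetical, and the numeric values are only example values.

```python
import numpy as np

def triangular_code(x, low, high, n_fields, width_quanta):
    """Population-code scalar x with n_fields overlapping triangular fields."""
    quantum = (high - low) / (n_fields - 1)     # n_fields centers span the range
    centers = low + quantum * np.arange(n_fields)
    half_base = 0.5 * width_quanta * quantum    # assumption: width = full base
    # Activation is 1 at the field center and falls linearly to 0 at half_base.
    return np.clip(1.0 - np.abs(x - centers) / half_base, 0.0, None)

# Example: 73 fields, 8 quanta wide, over -2*pi to 2*pi.
activations = triangular_code(0.0, -2 * np.pi, 2 * np.pi, 73, 8)
active = int(np.sum(activations > 1e-9))
total = float(activations.sum())
```

In this example seven fields are active for x = 0 and the activations sum to 4, half the width in quanta. Away from the range edges this sum is the same for any input, so uniformly spaced triangular fields of this kind are normalized up to a constant factor.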
Model of the cerebellar processing units, including processing from input signals to
specific type of coding, is implemented as a cerebellar Simulink library [22, 23], and the same library is
used for performing simulations (MATLAB and Simulink are registered trademarks of The
MathWorks, Inc., www.mathworks.com/trademarks). Robot plant used for simulations is
the same as the one presented in [23].
3. RESULTS AND DISCUSSIONS
Receptive fields used for coding input signals are one of the factors that determine the ability of
the cerebellar model to acquire the robot model and the quality of the acquired model. Of importance are
their number, shape and width. Also, when seeking models with a lower number of
processing elements some input signals may be left out, resulting in a cerebellar model that
will not be able to acquire specific dynamical effects of the robot.
The robot from [23], with a single rotary joint in pendulum configuration, is used. The robot has one
link of length 1 m and of mass 1 kg. Link mass is linearly
distributed over link length. Gripper is considered as a point mass of 0.2 kg situated at the link end.
Friction coefficient is set to 0.35 Nm/rad/s. Fixed gain constants for the conventional
proportional-derivative (PD) controller are set to 30 for the proportional and 4 for the derivative
constant. Update rate for the PD controller was set to 1000 Hz and that of the cerebellar controller
was set to 100 Hz. Reference trajectory for learning and testing is an ideal sinusoidal trajectory
for joint position, joint speed and joint acceleration, with amplitude equal to 3π/2 rad. Quality
of the acquired model is evaluated with maximum joint position and joint speed errors, and root
mean square (RMS) of errors. Discrete RMS is calculated from signal samples over
one period of the reference signal.
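A discrete RMS over the most recent reference period can be computed as in the following sketch. The function name is hypothetical, and the sampling rate and period in the usage lines are example values; the RMS blocks of the Simulink model may be organized differently.

```python
import numpy as np

def rms_last_period(error, sample_rate_hz, period_s):
    """Discrete RMS of the error samples covering the last reference period."""
    n = int(round(sample_rate_hz * period_s))   # samples per period
    window = np.asarray(error, dtype=float)[-n:]
    return float(np.sqrt(np.mean(window ** 2)))

# Example: unit-amplitude sinusoidal error, 100 Hz sampling, 10 s period.
t = np.arange(2000) / 100.0
error = np.sin(2 * np.pi * t / 10.0)
rms = rms_last_period(error, sample_rate_hz=100.0, period_s=10.0)
```

Because the window always spans one full period, any change in the error shows up in the RMS trace gradually, over the period that follows it.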
3.1. Two-Dimensional CMAC for Three-Dimensional Robot Problem
Idealized inverse dynamics model for the robot plant, neglecting actuator dynamics (or
considering the actuator idealized as a torque drive with corresponding controller), is a three dimensional
problem [23] if dynamical friction is taken into consideration. For the cerebellar model in
feedforward configuration, given in Figure 1(a), input signals are desired joint position,
desired joint speed and desired joint acceleration, but in this first case a lower dimensional input
space will be used, by leaving out joint speed. If friction were not present, these two inputs would suffice for the
inverse dynamics of the robot in question.
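Why the problem is three dimensional can be seen from an idealized analytical model of the plant. The sketch below assumes a uniform link (inertia mL²/3 about the joint), gravity g = 9.81 m/s², joint angle measured from the hanging-down position and purely viscous friction. These modeling choices and the function name are assumptions made for illustration; the SimMechanics plant of [23] may differ in detail.

```python
import math

def pendulum_inverse_dynamics(q, dq, ddq, friction=0.35,
                              link_mass=1.0, link_length=1.0,
                              gripper_mass=0.2, g=9.81):
    """Joint torque needed for position q, speed dq and acceleration ddq."""
    # Assumed uniform link plus point-mass gripper at the link end.
    inertia = (link_mass * link_length ** 2 / 3.0
               + gripper_mass * link_length ** 2)
    # Gravity torque, with q measured from the hanging-down position (assumed).
    gravity = ((link_mass * link_length / 2.0 + gripper_mass * link_length)
               * g * math.sin(q))
    # The viscous friction term is the only term that depends on speed.
    return inertia * ddq + friction * dq + gravity

tau_with_speed = pendulum_inverse_dynamics(0.3, 1.0, 0.5)
tau_without_speed = pendulum_inverse_dynamics(0.3, 0.0, 0.5)
```

The two torques differ only by the friction term, so a controller with position and acceleration inputs alone cannot represent that difference.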
Input range for each input signal, -2π rad to 2π rad for joint position and
-2π•(2π•0.2)² rad/s² to 2π•(2π•0.2)² rad/s² for joint acceleration, is divided into 72 quanta of equal
width [11, 23]. Input signals are each population coded with 73 receptive fields of triangular
shape, 8 quanta wide. Reference sinusoidal signal period is set to 10 s. Output of the PD
controller served as teaching signal. Signals for calculation of RMS are sampled at the cerebellar
controller rate of 100 Hz. Simulink model of the control system with robot plant and PD
controller, augmented with cerebellar model inverse dynamics feedforward controller, is
shown in Figure 4. Inside the 1D_PosOri_mL_Ovar_vInt block is the robot model implemented in
SimMechanics. Load mass (mL) and orientation angles relative to base (alfab) are set to zero.
Control action of the PD controller is at the output of the ZOH (Zero-Order Hold) block. It is also the
place where we set the update rate for the PD controller. Cerebellar controller block contains a two-dimensional
CMAC, shown in Figure 5, with two blocks for coding input signals by
population code, blocks 1L_1D_TBF_x1 and 1L_1D_TBF_x1. Block 4D_BF (GrC-GoC)
will create up to four dimensional receptive fields from up to four input signals; currently only
two inputs are used and the other two are set to constant 1 with Dim_x3x4, becoming effectively
unused. Purkinje cell and weights are implemented inside the PC block, where learning gain and
learning rate are set. Update rate of the cerebellar controller is set in ZOH_CMAC (zero-order
hold for cerebellar controller). Two switches are used to select initial values for weights and to
set learning manually to on or to off state. Automatic control of learning state is done with the
block named Time Profile for Error. It blocks learning for two reference signal periods at the
beginning of simulation in order to obtain PD control reference performance. It also
blocks learning for two further reference signal periods at the end of the simulation run, to create
conditions for testing the quality of the robot inverse dynamics model acquired by the cerebellar
controller. Numbers near signal lines in Figure 4 and Figure 5 indicate dimensionality of the
corresponding signal. Two lines at mossy fibers are of dimensionality 73 each, indicating that
each input signal has 73 receptive fields, while number 5329 near parallel fibers represents the
number of two-dimensional receptive fields formed from all combinations of input receptive
fields (73•73 = 5329).
Figure 4 Control system with robot plant and PD controller, augmented with cerebellar model inverse
dynamics feedforward controller
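Learning with the PD output as teaching signal can be sketched as a simple gradient-like weight update. The update rule below (weights changed in proportion to the PD torque and to parallel-fiber activity) is an assumption made for illustration, as are the function and variable names; the PC block of the Simulink library may use a different or normalized rule.

```python
import numpy as np

n_fields = 73
weights = np.zeros(n_fields * n_fields)   # 73 * 73 = 5329 weights

def cmac_step(pf, tau_pd, weights, learning_gain, learning_on=True):
    """Return the feedforward torque, then adapt weights from the PD torque."""
    tau_cmac = float(weights @ pf)        # linear Purkinje cell output
    if learning_on:
        # Assumed rule: PD output acts as the error (teaching) signal.
        weights += learning_gain * tau_pd * pf
    return tau_cmac

# Hypothetical parallel-fiber pattern with a single active 2-D field.
a1 = np.zeros(n_fields)
a1[10] = 1.0
a2 = np.zeros(n_fields)
a2[20] = 1.0
pf = np.outer(a1, a2).ravel()

first = cmac_step(pf, tau_pd=2.0, weights=weights, learning_gain=0.1)
second = cmac_step(pf, tau_pd=2.0, weights=weights, learning_gain=0.1,
                   learning_on=False)
```

On the first call the weights are zero, so the feedforward torque is zero and the PD controller supplies all control action; after the update the cerebellar controller takes over part of it, which is the takeover effect studied in the simulations.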
Trend of RMS position error during three phases of robot control, with PD controller
augmented with feedforward cerebellar controller, at different learning gains is shown in
Figure 6. The lowest trace marks the automatic learning state control mentioned in the previous
paragraph. Learning gains are given in the inset of the same figure. During the first phase, lasting
20 seconds (PD Only Control), weights are initialized to zero and learning is set to off, so that
the PD controller alone handles all control action. All traces during this phase are coincident,
with the rising part during the first 10 seconds being initialization of the RMS calculation blocks. It is
followed by a constant value during the next 10 seconds, marking performance of control with the PD
controller only. Any change in the trace reflects cumulative behavior over the one input signal
period preceding the current time.
Figure 5 Cerebellar controller with two inputs
Second phase (Learning Phase) starts by automatic switching of learning to on. While the
cerebellar controller acquires the inverse dynamics model, RMS errors (and corresponding
position and speed errors) will mainly have a decreasing trend. The slope is steeper with higher
learning gain and control performance will improve. All traces converge, or would converge
for a longer learning phase, to an RMS position error value around 0.058 rad. This value, under
sinusoidal like shape of error, would correspond to maximal position error of 0.041 rad.
Inability to reach zero control error is caused by the controller being structurally inappropriate
to learn all dynamical effects present in the robot plant. In this case the cerebellar controller will
not be able to acquire friction dynamics, which are speed dependent, since speed information is not
present as an input dimension. Persistent error will have other adverse effects on quality of the
acquired model when transiting from learning on to learning off phase. It will cause
permanent oscillation of weights, similar to adaptive controllers, with possibly higher control
errors after the learning phase when learning is switched off. This is visible in the third phase of
traces in Figure 6 (Testing Phase). It can be seen that, for traces that reached the learning limit,
there is an increase in RMS error when learning is switched off that will be higher the higher
the learning rate, signifying more pronounced adaptive controller behavior. Variations during the
first 10 seconds of this phase, from 80 s to 90 s, are caused by the RMS calculation time
window; they pass after one period and the error settles to a constant value, since CMAC control
will be static. Lower learning gains will still manifest this effect, but with lower amplitudes
from the settled limit.
Figure 6 RMS of position error for three phases of robot control, with PD controller augmented with
feedforward cerebellar controller, at different learning gains
The behavior with lower learning gains takes a longer time to acquire the model, up to the
modeling capabilities of the given controller, and shows a less pronounced error increase when
learning turns off. This is shown in Figure 7 for 100 periods of the learning signal (1000 seconds)
with learning rate 0.1/(16*8), equivalent to the trace with the same mark in Figure 6. It can be
seen that gross learning happens during the first 100 seconds (80 seconds of learning). After
that small adjustments are made, mainly decreasing speed error, with little effect on maximum
and RMS of position error. They are manifested as a smoothing action on position control
and as smoother control action by the CMAC controller, seen by comparing corresponding insets
at the beginning and end phase of Figure 7(a). RMS of speed error contains a jump some instant
after the first 10 seconds. It is caused by the discontinuity of desired speed at the beginning, followed
by intense action of the PD controller (the initial sharp peak is clipped in the second plot of Figure 7(a)).
It will quickly enter dynamic steady state, indicated by the flat part of RMS errors until the initial
20 seconds. This initial transitory phase is also visible in RMS of position error, but it is
relatively low. With proper trajectory planning they can be highly attenuated. Relative
increase in error when learning stops is low and barely noticeable in the main plots
in Figure 7(b), but right insets with the last 40 seconds magnified for better visibility show
details. RMS of position error will increase by 1.147%, from about 0.05665 rad to about
0.0573 rad. Maximum position error will increase from 0.0345 rad to 0.0385 rad, with
theoretical value being 0.03454 rad.
Figure 7 Takeover of control by feedforward cerebellar inverse dynamics controller during learning
and RMS error trends. (a) Position error, speed error, and torques from PD and CMAC controllers.
(b) RMS of position error, RMS of speed error, and RMS of PD control action. Each subfigure has
two insets. Left insets are zoomed plots of the first 100 seconds of the corresponding plot with the same vertical
axis scaling. Right insets show the zoomed last 40 seconds (960 s to 1000 s) of the corresponding plot with
best-fit vertical axis. During time range 0 s to 20 s learning is off and only the PD controller is generating
control action. From then up to 980 s learning is on, with the CMAC controller acquiring the inverse
dynamics model. Learning is also off during the last 20 seconds (980 s to 1000 s), which shows the success of
the model acquired by CMAC. Right insets cover the transition from learning on to learning off, by 20
seconds each
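The theoretical value of maximum position error is consistent with a simple estimate, sketched below. It assumes that the friction torque at peak reference speed is left uncompensated by the two-input CMAC and is supplied entirely by the proportional term of the PD controller, with the derivative term neglected. This reading is an assumption of the sketch, not a derivation given in the text.

```python
import math

amplitude = 3 * math.pi / 2   # rad, amplitude of the reference position
period = 10.0                 # s, period of the reference signal
friction = 0.35               # Nm/(rad/s), friction coefficient
kp = 30.0                     # proportional gain of the PD controller

omega = 2 * math.pi / period               # rad/s, reference frequency
max_speed = amplitude * omega              # rad/s, peak reference speed
max_friction_torque = friction * max_speed # Nm, peak uncompensated torque
max_position_error = max_friction_torque / kp
```

With the parameters of this section the estimate evaluates to about 0.03454 rad.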
3.2. CMAC with Complete Modeling Capability would Zero Control Error
When all significant signals (dimensions) are used as inputs to the cerebellar controller, it is
expected to be able to learn a complete model of the plant. Control error in the ideal case
would be zero. In practical situations the quality of the acquired model is determined by other
constructive parameters, which help to make control error lower but not zero. For the example
from the previous section, complete modeling capability can be tested in two ways: by making
friction zero (which can be done in simulations), or by increasing input space dimensionality by
adding joint speed as a third input. If friction is made zero the resulting model will be
two-dimensional (2D), while for nonzero friction values the model will be three-dimensional (3D).
Only linear dynamical friction will be considered, and its coefficient Bd will be 0 or 0.35. Each
input signal is population coded by 17 triangular receptive fields (RF), each two quanta wide. In
the 2D case there are 17•17 = 289 2D RF, and there are 17•17•17 = 4913 3D RF for the 3D input
space. Learning gain is 0.1/(1*1) for all simulations of this section. Comparison between
models is based on achieved control performance, with a learning course similar to the one
shown in Figure 6. Maximal position error, RMS of position error, maximal speed error, and
RMS of speed error are compared during the last period of the learning phase (time range from
70 s to 80 s) and during the last 10 seconds (time range from 90 s to 100 s), when learning is off.
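The population coding described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the Simulink implementation: inputs are normalized to [0, 1], "two quanta wide" is read as a triangle whose base spans two center spacings, and the fully coupled multidimensional RF is formed as the product of one-dimensional activations.

```python
from itertools import product

N_RF = 17  # receptive fields per input dimension (value from the text)


def triangular_code(x, n_rf=N_RF):
    """Activations of n_rf uniformly spaced triangular RF for x in [0, 1]."""
    spacing = 1.0 / (n_rf - 1)
    return [max(0.0, 1.0 - abs(x - i * spacing) / spacing) for i in range(n_rf)]


def joint_code(inputs, n_rf=N_RF):
    """Fully coupled code: one RF per combination of 1D RF (product rule assumed)."""
    codes = [triangular_code(x, n_rf) for x in inputs]
    joint = []
    for combo in product(*codes):
        activation = 1.0
        for value in combo:
            activation *= value
        joint.append(activation)
    return joint


code_2d = joint_code([0.30, 0.72])        # 2D input space
code_3d = joint_code([0.30, 0.72, 0.55])  # joint speed added as third input

active_2d = sum(1 for a in code_2d if a > 0.0)
active_3d = sum(1 for a in code_3d if a > 0.0)
print(len(code_2d), active_2d, sum(code_2d))
print(len(code_3d), active_3d, sum(code_3d))
```

With this coding only two RF per dimension are active for any input, so 4 of the 289 2D RF and 8 of the 4913 3D RF contribute to the output at a time, and their activations sum to one.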
Figure 8 shows results of four simulation runs. The first two bars of each group correspond to
the 2D CMAC and the other two to the 3D CMAC, with the left bar of each subgroup being for
friction coefficient equal to 0 and the adjacent bar being for friction coefficient equal to 0.35.
It can be seen from maximal position error and from RMS of position error that all models are
able to learn the inverse dynamics model, with the exception of the 2D CMAC when the friction
coefficient is not zero (2D CMAC used for a 3D problem). 3D CMAC learning is not influenced much
by the value of the friction coefficient, seen as almost equal height for all right bar pairs of
all groups. It is assumed that ranges of input signals are covered properly. Individual behaviors
follow the same trend as shown in the previous section (Figure 6 and Figure 7), such as the
difference in values between the ending phase of learning and the steady state of the testing
phase, which can also be seen in Figure 8. It is highly pronounced for the 2D CMAC in the presence
of friction, corresponding to the black trace with learning rate 0.1/(16*1) in Figure 6, where
error while learning becomes lower but experiences a considerable increase when learning is
switched off (brown and dark-blue bars in Figure 8). This is caused by a large learning gain in
the presence of persistent error. Behavior with learning on and off in the 3D CMAC is similar to
learning with a lower learning gain, relative to the CMAC structure.
Figure 8 Position and speed maximal and RMS errors for two-dimensional (2D) and three-
dimensional (3D) CMAC, for problems without (Bd = 0) and with friction (Bd = 0.35). The two left
bars of each group correspond to the 2D CMAC and the other two bars correspond to the 3D CMAC. The
left bar of each subgroup of two is for the problem without friction
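Why a 2D CMAC cannot absorb the friction term can be illustrated without the closed control loop. The sketch below is a simplified stand-in, not the on-line scheme of this paper: the table is fitted directly to the inverse dynamics torque with an LMS rule instead of using the proportional-derivative output as teaching signal. The single-joint plant, its coefficients J and G, the sinusoidal trajectory, and the choice of position and acceleration as the first two inputs are all assumptions made for illustration. Only the number of RF, the learning gain and the friction coefficient are taken from the text.

```python
import math
from itertools import product

N_RF = 17          # receptive fields per input dimension (value from the text)
GAIN = 0.1         # learning gain (value from the text)
J, G = 0.5, 4.0    # hypothetical inertia and gravity coefficients
A, W = 1.0, 2.0    # hypothetical amplitude [rad] and angular frequency [rad/s]


def active_cells(x):
    """Sparse triangular coding of x in [0, 1]: two (index, activation) pairs."""
    pos = min(max(x, 0.0), 1.0) * (N_RF - 1)
    i = min(int(pos), N_RF - 2)
    f = pos - i
    return [(i, 1.0 - f), (i + 1, f)]


def encode(inputs):
    """Fully coupled code: product of per-dimension activations (assumed)."""
    cells = []
    for combo in product(*(active_cells(x) for x in inputs)):
        index = tuple(i for i, _ in combo)
        cells.append((index, math.prod(a for _, a in combo)))
    return cells


def samples(bd, n=200):
    """One period of a sinusoidal trajectory with its inverse dynamics torque."""
    data = []
    for k in range(n):
        t = 2.0 * math.pi * k / (n * W)
        q = A * math.sin(W * t)
        qd = A * W * math.cos(W * t)
        qdd = -A * W * W * math.sin(W * t)
        tau = J * qdd + bd * qd + G * math.sin(q)
        # Inputs normalized to [0, 1]; joint speed is the third input.
        x = ((q / A + 1) / 2, (qdd / (A * W * W) + 1) / 2, (qd / (A * W) + 1) / 2)
        data.append((x, tau))
    return data


def output(weights, cells):
    return sum(weights.get(i, 0.0) * a for i, a in cells)


def train_and_test(bd, dims, epochs=300):
    """LMS fit over repeated periods, then RMS torque error with learning off."""
    weights = {}
    data = [(encode(x[:dims]), tau) for x, tau in samples(bd)]
    for _ in range(epochs):
        for cells, tau in data:
            err = tau - output(weights, cells)
            for i, a in cells:
                weights[i] = weights.get(i, 0.0) + GAIN * err * a
    sq = [(tau - output(weights, cells)) ** 2 for cells, tau in data]
    return math.sqrt(sum(sq) / len(sq))


results = {(dims, bd): train_and_test(bd, dims) for dims in (2, 3) for bd in (0.0, 0.35)}
for key in sorted(results):
    print(key, round(results[key], 4))
```

On this trajectory every pair of position and acceleration values is visited twice per period with opposite joint speed, so a model that sees only those two inputs must return the same torque for both visits. Its RMS torque error therefore cannot fall below Bd·A·W/√2 when Bd is nonzero, however long it learns, while the 3D table is free to fit the friction term.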
3.3. Number, Shape and Width of Receptive Fields
These parameters are explored with the 2D CMAC in problems without friction, to avoid
masking of structural behavior by frictional effects. All evaluation cases follow the same
learning and testing phases as shown in Figure 6.
The trend of RMS of position error for a number of simulations with different numbers of
triangular receptive fields (RF) is shown in Figure 9. Two marks are given for every
simulation run: a red dot for error at the end of the learning phase and a blue circle for testing
phase error. Duration of learning is the same for all simulations, being 6 input signal periods
(cycles). Normally errors can be made smaller with longer training phases, if there is no
inherent limitation. For digital control systems, the update period of the conventional and
cerebellar controllers will be among the limiting factors.
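The idealized dependence of model accuracy on the number of triangular RF can be sketched in one dimension. The target function below is hypothetical, and the weights are set directly to the target values at the RF centers instead of being learned. The sketch therefore shows only the representational trend, not the learned results of Figure 9.

```python
import math


def triangular_code(x, n_rf):
    """Activations of n_rf uniformly spaced triangular RF for x in [0, 1]."""
    spacing = 1.0 / (n_rf - 1)
    return [max(0.0, 1.0 - abs(x - i * spacing) / spacing) for i in range(n_rf)]


def target(x):
    """Hypothetical smooth one-dimensional target."""
    return math.sin(2.0 * math.pi * x)


def rms_model_error(n_rf, points=1001):
    """RMS error of the RF model whose weights equal the target at RF centers."""
    spacing = 1.0 / (n_rf - 1)
    weights = [target(i * spacing) for i in range(n_rf)]
    sq = 0.0
    for k in range(points):
        x = k / (points - 1)
        y = sum(w * a for w, a in zip(weights, triangular_code(x, n_rf)))
        sq += (target(x) - y) ** 2
    return math.sqrt(sq / points)


rms_by_count = {n: rms_model_error(n) for n in (9, 17, 25)}
for n, value in rms_by_count.items():
    print(n, round(value, 5))
```

With triangular RF the model is a piecewise-linear interpolant, so its error shrinks roughly with the square of the center spacing. Learned models may deviate from this smooth trend when centers and widths match the problem more or less favorably.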
Figure 9 Number of triangular receptive fields (RF) and RMS of position error for feedforward
CMAC in two phases of learning. RMS error for each number of RF is given for two phases of
learning, when learning is on (red point) and at dynamic steady state when learning is off (blue
circle). Increasing the number of RF decreases RMS error. In practical situations there may be
inherent limitations that prevent this
Results of on-line learning for three shapes of RF under different numbers and widths of RF
are shown in Figure 10. Three shapes of RF were tested, with square (SBF), triangular (TBF)
and Gaussian (RBF) basis functions. Results of learning for different learning gains at a selected
receptive field shape are given in three columns of three subplots each. Rows correspond to
a given number of RF of a given width. One pair of bars corresponds to one simulation
run, with the coefficient marked under it determining learning gain. Blue bars represent RMS of
position error at the end of the learning phase (as in Figure 6), while brown bars represent
dynamic steady state RMS of position error during the testing phase. The general trend for all
nine subplots is similar. When learning gains are larger (several left pairs of bars in subplots),
errors when learning is on are lower than those when learning is off. These learning conditions
have more pronounced adaptive tracking behavior that aids in lowering tracking error, but the model
acquired by the cerebellar neural network will be less accurate, manifested by larger control error
when learning is turned off. Several pairs at the right of each subplot have the opposite behavior,
with RMS error being lower when learning is off than when learning was on.
Since cerebellar controller performance cannot change once learning stops, this is only an
indication of a decreasing cumulative trend during the last period of RMS error calculation. Final
performance is obtained after one RMS calculation period passes, and RMS error will be
constant thereafter (for a periodic reference signal), as shown during the last 10 seconds of
Figure 6. Some pair around the middle of each subplot, for a certain learning gain, will have about
the same error for the two bars of the pair. For gains lower than this (right pairs) learning may go
proportionally slower but with a better model acquired by the cerebellar controller. Learning
gains larger than this limit may provide faster performance improvement with a less accurate model
learned by the cerebellar controller, also accompanied by the risk of making control unstable.
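The effect of the RMS calculation period can be sketched with a synthetic error signal. The envelope, its decay rate and the 10 s period below are assumptions chosen only to mimic an error that shrinks while learning is on and is frozen once learning stops at 80 s. No controller is simulated.

```python
import math

PERIOD = 10.0   # RMS window of one reference period [s] (assumed)
DT = 0.01       # sampling step [s] (assumed)
T_STOP = 80.0   # learning is switched off at this time [s]


def amplitude(t):
    """Hypothetical error envelope: decays while learning, frozen afterwards."""
    return 0.05 * math.exp(-min(t, T_STOP) / 40.0)


def error(t):
    """Synthetic periodic position error with the envelope above."""
    return amplitude(t) * math.sin(2.0 * math.pi * t / PERIOD)


def windowed_rms(t_end):
    """RMS of the error over the last PERIOD seconds ending at t_end."""
    n = round(PERIOD / DT)
    acc = sum(error(t_end - k * DT) ** 2 for k in range(n))
    return math.sqrt(acc / n)


rms_at_stop = windowed_rms(80.0)             # window still contains learning
rms_one_period_later = windowed_rms(90.0)    # window fully inside testing phase
rms_two_periods_later = windowed_rms(100.0)

print(rms_at_stop, rms_one_period_later, rms_two_periods_later)
```

The value at the end of learning is larger only because its window still contains earlier, larger errors. One full period after learning stops the windowed RMS settles and no longer changes.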
Figure 10 CMAC learning with receptive fields (RF) of different shape. Columns show results of
learning for different learning gains at a selected receptive field shape. Rows are for a given
number of RF of a given width. One pair of bars corresponds to one simulation run, with the
coefficient marked under it determining learning gain. Blue bars represent RMS of position error at
the end of the learning phase (as in Figure 6), while brown bars represent dynamic steady state RMS
of position error during the testing phase. SBF: RF with square basis function (BF); TBF: RF with
triangular BF; RBF: RF with Gaussian BF
For all cases, after gross learning (learning phase from Figure 6) RMS errors reach
about the same level independent of learning gain, a level determined by structural
parameters. Notice that the scale for the first row is almost three times larger than that of the
two other rows, 0.09 versus 0.035. Better performance with a higher number of RF is noticed in each
column by looking from top to bottom. Increasing the order of RF is expected to improve
modeling performance of the cerebellar controller. This can be seen by looking at rows from left
to right, where only the shape (order) of RF changes (increases), leading to lower RMS errors
and better control performance. While behavior for SBF and TBF is as expected, behavior of
RBF is somewhat different. First, error with the lower number of RF (9) is slightly lower than that
with the higher number of RF (17), see the first and second subplots of the third column. This may
be due to a more favorable match of centers and widths of RF for the problem being learned. The
opposite of this may be seen in Figure 9 with triangular RF, in the range of 12 to 15 RF. Second,
increasing the number of RBF RF from the middle to the bottom subplot (17 RF to 25 RF) shows no
improvement in RMS error. The cause of this behavior may be in normalization of the two-dimensional
RF. The first two basis functions have the self-normalization property, but this is not the case
for the Gaussian RF used in the third case. The problem of normalization can be overcome by adding
a normalization stage, similar to fuzzy neural networks, or by using basis functions that have
normalization as an inbuilt property, like B-splines (having SBF and TBF as the first two members).
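The self-normalization property can be checked numerically by summing the activations of all RF while the input sweeps its range. The sketch below assumes inputs normalized to [0, 1], a square RF one center spacing wide, a triangular RF two spacings wide, and a Gaussian RF with a hypothetical width of 0.4 spacings. The widths used in the simulations are not restated here.

```python
import math

N_RF = 17
SPACING = 1.0 / (N_RF - 1)
SIGMA = 0.4 * SPACING  # hypothetical Gaussian width


def square(d):
    """Square RF: active within half a spacing of its center."""
    return 1.0 if -0.5 * SPACING <= d < 0.5 * SPACING else 0.0


def triangular(d):
    """Triangular RF: linear decay to zero at one spacing from the center."""
    return max(0.0, 1.0 - abs(d) / SPACING)


def gaussian(d):
    """Gaussian RF without any normalization."""
    return math.exp(-0.5 * (d / SIGMA) ** 2)


def total_activity(shape, x):
    """Sum of activations of all RF for input x."""
    return sum(shape(x - i * SPACING) for i in range(N_RF))


def ripple(shape, points=1001):
    """Spread of total activity while x sweeps [0, 1]."""
    totals = [total_activity(shape, k / (points - 1)) for k in range(points)]
    return max(totals) - min(totals)


def normalized_code(shape, x):
    """Explicit normalization stage: divide each activation by the total."""
    acts = [shape(x - i * SPACING) for i in range(N_RF)]
    total = sum(acts)
    return [a / total for a in acts]


ripples = {f.__name__: ripple(f) for f in (square, triangular, gaussian)}
print(ripples)
print(sum(normalized_code(gaussian, 0.53)))
```

Total activity is constant for square and triangular RF but fluctuates for the Gaussian RF. If the two-dimensional RF is formed as a product of one-dimensional activations, the total over all 2D RF is the product of the two one-dimensional totals, so the fluctuation is compounded in higher dimensions. The explicit normalization stage removes it.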
4. CONCLUSION
Cerebellar controller in feedforward configuration was used as augmentation to a conventional
proportional-derivative controller for the path tracking problem in robotics. The structure of the
cerebellar controller was that of a cerebellar model articulation controller (CMAC) with fully
coupled Albus overlays. The controller acquired robot inverse dynamics during on-line learning.
Different structural aspects and their influence on accuracy of the acquired model were explored.
A Simulink model of the control system, including CMAC and robot plant, was used for simulations.
A cerebellar Simulink model that preserves layered structure was used. The cerebellar controller
quickly learns inverse dynamics inside its modeling capability scope, and takes over control from
the conventional proportional-derivative controller. It was shown that using an input space of
lower dimension than that of the problem may limit modeling capabilities of the cerebellar
controller in acquiring the inverse dynamics model, and may result in inability to decrease control
error. Increasing the number of receptive fields for coding input signals will decrease error, but
there may be cases that do not follow this trend in a regular way, with negative or positive effects.
Usually this issue can be bypassed with input layer adaptation, where centers and widths of
receptive fields for coding input signals are determined adaptively. We evaluated only
uniform distribution of receptive fields, with the same widths per dimension, assuming them as
information source for distributed processing with overall result of uniform variation. Other
aspects explored were shape and width of receptive fields. Increasing the order of receptive fields
increased accuracy of the acquired model with the same number of receptive fields for the same
length of training phase. For learning gains lower than some value, control performance in the RMS
sense will not degrade when passing from the learning phase to the phase of controlling with the
acquired model. Learning gains above that value may provide faster decrease of control error,
accompanied by a less accurate model acquired by the cerebellar controller, but may risk making the
control system unstable. Basis functions without the normalization property manifested adverse
effects by not being able to increase accuracy of the acquired inverse dynamics model, caused by
activity fluctuations of the resulting higher dimensional receptive fields that could not be
overcome by learning over the selected time span. The general trend of RMS position error in
relation to learning gain was the same for all tested shapes, widths and numbers of receptive fields.
REFERENCES
[1] Wan Kyun Chung, Li-Chen Fu, Torsten Kröger. Motion Control. In: Bruno Siciliano,
Oussama Khatib, Eds., Springer Handbook of Robotics, Second Edition. Springer, 2016,
pp. 163-194.
[2] K. S. Fu, R. C. Gonzalez and C. S. G. Lee. Robotics: Control, Sensing, Vision, and
Intelligence. McGraw-Hill, Inc., 1987.
[3] Zhihua Qu and Darren M. Dawson. Robust tracking control of robot manipulators, IEEE
Press, New York, 1996.
[4] Peter Corke. Robotics, Vision and Control: Fundamental Algorithms in MATLAB,
Springer-Verlag, 2011.
[5] Nazmul Siddique and Hojjat Adeli. Computational Intelligence: Synergies of Fuzzy
Logic, Neural Networks and Evolutionary Computing, John Wiley & Sons, Ltd, 2013.
[6] Masao Ito. The Cerebellum: Brain for an Implicit Self, Pearson Education, Inc., 2012.
[7] J. C. Eccles, M. Ito and J. Szentágothai. The Cerebellum as a Neuronal Machine. Springer
Science+Business Media, New York, 1967.
[8] D. Marr. A Theory of Cerebellar Cortex. The Journal of Physiology, vol. 202, no. 2, Jun
1969, pp. 437-470.
[9] J. S. Albus. Theory of cerebellar function. Mathematical Biosciences, vol. 10, no. 1/2,
February 1971, pp. 25-61.
[10] M. Ito, M. Kano. Long-lasting depression of parallel fiber-Purkinje cell transmission
induced by conjunctive stimulation of parallel fibers and climbing fibers in the cerebellar
cortex. Neuroscience Letters, vol. 33, no. 3, 13 December 1982, pp. 253-258.
[11] J. S. Albus. New approach to manipulator control: the cerebellar model articulation
controller (CMAC). Transactions of the ASME Journal of Dynamic Systems,
Measurement, and Control, vol. 97, no. 3, September 1975, pp. 220-227.
[12] Chan-Mo Kim, Kwang-Ho Choi and Yong B. Cho. Hardware Design of CMAC Neural
Network for Control Applications. Proceedings of the International Joint Conference on
Neural Networks, 2003, 20-24 July 2003, Portland, OR, USA, pp. 953-958.
[13] Lavdim Kurtaj, Vjosa Shatri and Ilir Limani. New model of information processing at
granule cell layer makes cerebellum as biological equivalent for ANFIS and CANFIS:
Sharing of processing resources and generalization. IEEE International Conference on
Fuzzy Systems, 2017, pp. 1-8.
[14] Raul Rojas. Neural Networks: A Systematic Introduction. Berlin, New-York: Springer-
Verlag, 1996.
[15] Francisco J. González-Serrano, Aníbal R. Figueiras-Vidal, and Antonio Artés-Rodríguez.
Generalizing CMAC Architecture and Training. IEEE Transactions on Neural Networks,
Vol. 9, No. 6, November 1998, pp. 1509-1514.
[16] S. D. Teddy, E. M.-K. Lai and C. Quek. Hierarchically Clustered Adaptive Quantization
CMAC and Its Learning Convergence. IEEE Transactions on Neural Networks, Volume
18, Issue 6, November 2007, pp. 1658-1682.
[17] Hyongsuk Kim and Chun-Shin Lin. Use of Adaptive Resolution for Better CMAC
Learning. International Joint Conference on Neural Networks, 1992. IJCNN, 7-11 June
1992, Baltimore, MD, USA, pp. I-517-I-522.
[18] R. Bellman. Adaptive Control Processes. Princeton University Press, Princeton, 1961.
[19] Chiang Ching-Tsan and Lin Chun-Shin. CMAC with General Basis Functions. Neural
Networks, Elsevier Science Ltd., October 1996, Volume 9, Issue 7, pp. 1199 - 1211.
[20] Lavdim Kurtaj, Ilir Limani, Vjosa Shatri and Avni Skeja. Dependence of CMAC Neural
Network Properties at initial, during, and after Learning Phase from Input Mapping
Function. Proceedings of the 12th WSEAS International Conference on Systems Theory
and Scientific Computation (ISTASC’12), Istanbul, Turkey, August 21-23, 2012; ISBN
978-1-61804-115-9, pp. 187-192.
[21] Bruno Siciliano, Lorenzo Sciavicco, Luigi Villani and Giuseppe Oriolo. Robotics:
Modelling, Planning and Control, Springer, 2011.
[22] Vjosa Shatri, Lavdim Kurtaj and Ilir Limani. Hardware-in-the-loop architecture with
MATLAB/Simulink and QuaRC for rapid prototyping of CMAC neural network
controller for ball-and-beam plant. Proceedings of 2017 40th International Convention on
Information and Communication Technology, Electronics and Microelectronics, MIPRO
2017, 2017, pp. 1201-1206.
[23] Lavdim Kurtaj, Vjosa Shatri and Ilir Limani. Comparative performance of two types of
cerebellar model controllers for controlling robot joint: size, learning and generalization.
Proceedings of 2017 6th Mediterranean Conference on Embedded Computing, MECO
2017 - Including ECYPS 2017, 2017, pp. 1-5.
[24] P. Kamal Kumar, Taj, L. Praveen, Anoop Joshi and G Musalaiah. Fabrication of
Pneumatic Pick and Place Robot. International Journal of Civil Engineering and
Technology, 8(7), 2017, pp. 594–600.