Laboratório de Automação e Robótica - A. Bauchspiess – Soft Computing - Neural Networks and Fuzzy Logic
The McCulloch Neuron (1943)
g = step function. The Euclidean space ℜⁿ is divided into two regions, A and B:

  a = g( Σ_{i=1..n} w_i p_i − b ) = g( w pᵗ − b ) ∈ {0, 1}

[Figure: neuron with inputs p₁ … pₙ, weights w₁ … wₙ and bias b; for n = 2 the line w₁p₁ + w₂p₂ = b separates region A from region B in the (p₁, p₂) plane]
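The thresholding rule can be sketched in a few lines of Python (an illustrative example, not part of the original slides; the weights w = (1, 1) and bias b = 1.5 are hypothetical values chosen so the neuron computes a logical AND):

```python
# McCulloch-Pitts neuron: a = g(sum(w_i * p_i) - b), g = step function
def step(s):
    return 1 if s >= 0 else 0

def neuron(p, w, b):
    return step(sum(wi * pi for wi, pi in zip(w, p)) - b)

# For n = 2 the line w1*p1 + w2*p2 = b splits the plane into regions A and B.
# With the (hypothetical) choice w = (1, 1), b = 1.5 the neuron computes AND:
for p in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(p, neuron(p, (1, 1), 1.5))
```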
The McCulloch Neuron – as a pattern classifier
[Figure: two scatter plots of classes "o" and "x". Left: linearly separable collections; right: non-separable collections]

Some Boolean functions of two variables represented in the binary plane.
Linear and Non-Linear Classifiers
There exist m = 2^(2ⁿ) possible logical functions connecting n binary inputs to one binary output.

n | # of binary patterns | # of logical functions | # linearly separable | % linearly separable
1 |  2 |          4 |         4 | 100
2 |  4 |         16 |        14 | 87.5
3 |  8 |        256 |       104 | 40.6
4 | 16 |     65,536 |     1,772 | 2.9
5 | 32 | 4.3 x 10⁹  |    94,572 | 2.2 x 10⁻³
6 | 64 | 1.8 x 10¹⁹ | 5,028,134 | 3.1 x 10⁻¹³
The logical functions of one variable: A, Ā, 0, 1.
The 16 logical functions of two variables: A, Ā, B, B̄, 0, 1, A∧B, A∨B, A∧B̄, A∨B̄, Ā∧B, Ā∨B, Ā∧B̄, Ā∨B̄, A⊕B, ¬(A⊕B).
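The n = 2 row of the table can be checked by brute force (a sketch; the grid of weight and bias values searched is an arbitrary choice, sufficient for this tiny space):

```python
# Count how many of the 16 Boolean functions of two inputs a single
# threshold neuron can realize; the table's n = 2 row says 14.
import itertools

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

def separable(truth):
    # search a small grid of weights/bias for a separating threshold unit
    grid = [x / 2 for x in range(-8, 9)]
    for w1, w2, b in itertools.product(grid, grid, grid):
        if all((1 if w1*p1 + w2*p2 - b >= 0 else 0) == t
               for (p1, p2), t in zip(inputs, truth)):
            return True
    return False

count = sum(separable(t) for t in itertools.product([0, 1], repeat=4))
print(count)  # 14 of 16: only XOR and XNOR are not linearly separable
```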
Two-Step Binary Perceptron

[Figure: two-layer network of binary neurons; hidden neurons 3, 4 and 5 feed output neuron 6]
Neuron 6 implements a logical AND function by choosing

  b₆ = Σ_{i=3..5} w_i6

For example:

  w₃₆ = w₄₆ = w₅₆ = 1 and b₆ = 3  ⇒  a₆ = 1 if and only if a₃ = a₄ = a₅ = 1
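The AND choice above can be verified directly (a small sketch; the hidden activations a₃, a₄, a₅ are enumerated here rather than computed by a first layer):

```python
# Output neuron 6 with w36 = w46 = w56 = 1 and b6 = 3 fires only when
# all three hidden activations are 1, i.e. it implements a logical AND.
import itertools

def neuron6(a3, a4, a5, w=(1, 1, 1), b=3):
    s = w[0]*a3 + w[1]*a4 + w[2]*a5
    return 1 if s >= b else 0

for a in itertools.product([0, 1], repeat=3):
    print(a, neuron6(*a))  # only (1, 1, 1) -> 1
```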
Three Step Binary Perceptron
[Figure: three-step binary perceptron. Inputs p₁ and p₂ feed first-layer neurons that define the half-planes A, Ā, B, B̄; the following layers (neurons 9, 10, …) combine them, e.g. a₁₁ = A ∧ B]
Neurons and Artificial Neural Networks

§ Micro-structure: characteristics of each neuron in the network
§ Meso-structure: organization of the network
§ Macro-structure: association of networks, possibly with some analytical processing approach, for complex problems
[Figure: neuron with inputs p₁ … pₙ, weights w₁ … wₙ, a summing junction and a bias input b]

Bias: with p = 0, an output ≠ 0 is still possible!
Typical activation functions

Linear (purelin): Hopfield, BSB
  f(s) = s

Signal (hardlims): Perceptron
  f(s) = +1 if s ≥ 0;  −1 if s < 0

Step (hardlim): Perceptron, BAM
  f(s) = 1 if s ≥ 0;  0 if s < 0

Threshold with memory: Hopfield, BAM
  f(s) = +1 if s > 0;  −1 if s < 0;  unchanged if s = 0
Typical activation functions

BSB or Logical Threshold (satlin, satlins): BSB
  f(s) = +K if s ≥ +K;  s if −K < s < +K;  −K if s ≤ −K

Logistic (logsig): Perceptron, Hopfield, BAM, BSB
  f(s) = 1 / (1 + e⁻ˢ)

Hyperbolic Tangent (tansig): Perceptron, Hopfield, BAM, BSB
  f(s) = tanh(s) = (1 − e⁻²ˢ) / (1 + e⁻²ˢ)
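The activation functions above translate directly into code (a sketch; the function names follow the MatLab toolbox names quoted on the slides):

```python
import math

# Activation functions from the two slides above
def purelin(s):        return s                      # linear
def hardlims(s):       return 1 if s >= 0 else -1    # signal
def hardlim(s):        return 1 if s >= 0 else 0     # step
def satlins(s, K=1.0): return max(-K, min(K, s))     # BSB / logical threshold
def logsig(s):         return 1.0 / (1.0 + math.exp(-s))
def tansig(s):         return math.tanh(s)  # == (1 - e^(-2s)) / (1 + e^(-2s))

print(hardlim(-0.3), satlins(2.5), logsig(0))
```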
Meso-Structure – Network Organization...
- # of neurons per layer
- # of network layers
- connection type (forward, backward, lateral)
1- Multilayer Feedforward
Multilayer Perceptron (MLP)
Meso-Structure – Network Organization...
2- Single Layer laterally connected (BSB (self-feedback), Hopfield)
3 – Bilayer Feedforward/Feedback
Meso-Structure – Network Organization
4 – Multilayer Cooperative/Comparative Network
5 – Hybrid Network
[Figure: hybrid network composed of Sub-network 1 and Sub-network 2]
Neural Macro-Structure
[Figure: macro-structure example: Network 1 feeds Networks 2a, 2b and 2c, which feed Network 3]

- # of networks
- connection type
- size of networks
- degree of connectivity
Supervised Learning

§ Delta Rule → Perceptron:  δ ≡ d − y,  w ← w + µ δ x
§ Widrow-Hoff delta rule (LMS) → ADALINE, MADALINE:  w_ij ← w_ij + µ δ_j x_ij / Σ_k x_k²
§ Generalized Delta Rule

µ – learning rate

[Figure: adaptive element with input x and error δ = d − y fed back to adjust the weights]
Delta rule → Perceptron

Perceptron – Rosenblatt, 1957

[Figure: perceptron neuron j with inputs p_1j … p_nj, weights w_1j … w_nj, bias b_j, sum s_j and step output y_j]

Dynamics:
  s_j = Σ_i w_ij p_ij + b_j
  y_j = f(s_j) = 1 if s_j ≥ 0;  0 if s_j < 0

Delta Rule:
  δ_j = d_j − y_j
  w_ij ← w_ij + µ δ_j x_ij

µ – learning rate;  δ_j = 0 → the weight is not changed.

Psychological reasoning:
- positive reinforcement
- negative reinforcement
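The delta rule can be exercised on the (linearly separable) AND function (a sketch, not from the slides; µ = 0.25 and 20 epochs are arbitrary but sufficient choices):

```python
# Perceptron trained with the delta rule w <- w + mu*delta*x on AND.
patterns = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
w = [0.0, 0.0]
b = 0.0
mu = 0.25

for epoch in range(20):
    for p, d in patterns:
        s = w[0]*p[0] + w[1]*p[1] + b
        y = 1 if s >= 0 else 0
        delta = d - y                                  # delta_j = d_j - y_j
        w = [wi + mu*delta*pi for wi, pi in zip(w, p)] # w <- w + mu*delta*x
        b = b + mu*delta                               # bias as weight on input 1

print([1 if w[0]*p[0] + w[1]*p[1] + b >= 0 else 0 for p, _ in patterns])
# -> [0, 0, 0, 1], the AND truth table
```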
ADALINE and MADALINE

Widrow & Hoff, 1960 – (Multiple) Adaptive Linear Element

Training:
  y_j = Σ_i w_ij p_ij + b_j

  ε_j = d_j − s_j = d_j − (Σ_i w_ij p_ij + b_j)    (Obs: ε_j ≡ δ_j)

Delta Rule:  w_ij ← w_ij + µ δ_j x_ij

Widrow-Hoff delta rule (LMS – Least Mean Squared algorithm):
  w_ij ← w_ij + µ ε_j x_ij / Σ_k x_k²

0.1 < µ < 1 – trade-off between stability and convergence speed.
MatLab: NEWLIN, NEWLIND, ADAPT, LEARNWH
LMS Algorithm

Objective: learn a function f: ℜⁿ → ℜ from the samples (x_k, d_k).
{x_k}, {d_k} and {e_k} → stationary stochastic processes
e = d − y → actual stochastic error

  y = Σ_{i=1..n} x_i w_i = x wᵗ    → linear neuron

Expected value (assuming w deterministic):
  E[e²] = E[(d − y)²] = E[(d − x wᵗ)²] = E[d²] − 2 E[dx] wᵗ + w E[xᵗx] wᵗ

With E[xᵗx] ≡ R → input autocorrelation matrix, and E[dx] ≡ P → cross-correlation vector:
  E[e²] = E[d²] − 2 P wᵗ + w R wᵗ

Setting the partial derivatives to 0 for the optimal w*:
  0 = 2 w* R − 2 P
  w* = P R⁻¹    Optimal analytic solution of the optimization (solvelin.m)
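The analytic solution w* = P R⁻¹ can be checked on made-up samples, estimating R and P by sample averages (hypothetical data generated by d = 2x₁ − x₂, so the optimum must come out as w* = (2, −1)):

```python
# Analytic LMS solution w* = P R^-1 on a 2-input linear neuron.
xs = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (2.0, 1.0)]
ds = [2.0*x1 - 1.0*x2 for x1, x2 in xs]   # d generated by the rule d = 2*x1 - x2

n = len(xs)
# R = E[x^t x] and P = E[d x], estimated by sample averages
R = [[sum(x[i]*x[j] for x in xs)/n for j in range(2)] for i in range(2)]
P = [sum(d*x[i] for x, d in zip(xs, ds))/n for i in range(2)]

# w* solves w R = P (R symmetric); 2x2 case solved by Cramer's rule
det = R[0][0]*R[1][1] - R[0][1]*R[1][0]
w = ((P[0]*R[1][1] - P[1]*R[0][1]) / det,
     (R[0][0]*P[1] - R[1][0]*P[0]) / det)
print(w)  # recovers (2.0, -1.0)
```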
Iterative LMS Algorithm

Objective: adaptively learn a function f: ℜⁿ → ℜ from the samples (x_k, d_k).

Knowing P and R (with R⁻¹ existing), then for some w:
  ∇_w E[e²] = 2 w R − 2 P

Post-multiplying by ½ R⁻¹:
  ½ ∇_w E[e²] R⁻¹ = w − P R⁻¹ = w − w*

  w* = w − ½ ∇_w E[e²] R⁻¹

  w_{k+1} = w_k − c_k ∇_w E[e²] R⁻¹    (c_k = ½ → Newton's method)

LMS hypothesis:  E[e²_{k+1} | e²₀, e²₁, …, e²_k] = e²_k

How to cautiously find new (better) values for w_i, the free parameters?
Iterative LMS Algorithm...

Iterative (adaptive) solution (the optimal solution is never reached!)

Gradient of e²_k with respect to w:
  ∇_w e²_k = [∂e²_k/∂w_1, …, ∂e²_k/∂w_n]
           = [∂(d_k − y_k)²/∂w_1, …, ∂(d_k − y_k)²/∂w_n]
           = [−2(d_k − y_k) ∂y_k/∂w_1, …, −2(d_k − y_k) ∂y_k/∂w_n]
           = −2 e_k [∂y_k/∂w_1, …, ∂y_k/∂w_n]
           = −2 e_k [x_1k, …, x_nk] = −2 e_k x_kᵗ    (since y_k = x_k wᵗ)

Assuming R = I → estimated steepest descent algorithm:
  w_{k+1} = w_k − c_k ∇_w e²_k

so the LMS algorithm reduces to:
  w_{k+1} = w_k + 2 c_k e_k x_k

MADALINE (i – input, j – neuron), with normalization:
  w_ij ← w_ij + µ δ_j x_ij / Σ_k x_k²
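The update w_{k+1} = w_k + 2 c_k e_k x_k can be run as a loop (a sketch with made-up data; the target rule d = 2x₁ − x₂, the fixed step c = 0.05 and the 2000 samples are arbitrary choices):

```python
import random

# Iterative LMS: w_{k+1} = w_k + 2*c*e_k*x_k on a noiseless linear target,
# so the weights should converge toward (2, -1).
random.seed(0)
w = [0.0, 0.0]
c = 0.05

for k in range(2000):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    d = 2.0*x[0] - 1.0*x[1]
    y = w[0]*x[0] + w[1]*x[1]                    # linear neuron y = x w^t
    e = d - y                                    # stochastic error
    w = [wi + 2*c*e*xi for wi, xi in zip(w, x)]  # LMS update

print([round(wi, 3) for wi in w])
```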
The Multilayer Perceptron - The Generalized Delta Rule

Rumelhart, Hinton and Williams, PDP/MIT, 1986

[Figure: three-layer MLP with inputs p₁ = x_1^(0), p₂ = x_2^(0), p₃ = x_3^(0), hidden outputs x_1^(1), x_2^(1), x_3^(1) and outputs x_1^(2) = y₁, x_2^(2) = y₂]

Neuron Dynamics: Processing Element (PE) j in layer k, input i:

  s_j^(k) = w_0j^(k) + Σ_i w_ij^(k) x_i^(k−1)
  x_j^(k) = f(s_j^(k))

with f (activation function) continuous and differentiable.

Turning Point Question: how to find the error associated with an internal neuron?
The generalized delta rule

Training: quadratic error
  ε² = Σ_{j=1..m} (d_j − y_j)²

Weights of PE j:       w_j^(k) = (w_0j^(k), w_1j^(k), …, w_mj^(k))
Input vector of PE j:  x_j^(k−1) = (1, x_1j^(k−1), …, x_nj^(k−1))

Instantaneous gradient:
  ∇_j^(k) = ∂ε²/∂w_j^(k) = [∂ε²/∂w_0j^(k), ∂ε²/∂w_1j^(k), …, ∂ε²/∂w_mj^(k)]

With s_j^(k) = w_j^(k) x_j^(k−1) → ∂s_j^(k)/∂w_j^(k) = x_j^(k−1), so

  ∇_j^(k) = ∂ε²/∂w_j^(k) = (∂ε²/∂s_j^(k)) x_j^(k−1)

Defining the quadratic derivative error as
  δ_j^(k) = −½ ∂ε²/∂s_j^(k)

gives
  ∇_j^(k) = −2 δ_j^(k) x_j^(k−1)

Gradient of the error with respect to the weights as a function of the former layer signals!
The generalized delta rule...

For the output layer, the quadratic derivative error is:

  δ_j^(k) = −½ ∂/∂s_j^(k) Σ_{i=1..N_k} (d_i − y_i)² = −½ ∂/∂s_j^(k) Σ_{i=1..N_k} (d_i − f(s_i^(k)))²

The partial derivatives are 0 for i ≠ j:

  δ_j^(k) = −½ ∂(d_j − f(s_j^(k)))² / ∂s_j^(k) = (d_j − f(s_j^(k))) f′(s_j^(k)) = (d_j − x_j^(k)) f′(s_j^(k))

The output error associated with PE j, in the last layer:
  ε_j^(k) = d_j − x_j^(k) = d_j − y_j

giving:
  δ_j^(k) = ε_j^(k) · f′(s_j^(k))

Remember: the activation function f is continuous and differentiable.
The generalized delta rule...

For a hidden layer k, the quadratic derivative error can be calculated using the linear outputs of layer k+1 (chain rule):

  δ_j^(k) = −½ ∂ε²/∂s_j^(k) = −½ Σ_{i=1..N_{k+1}} (∂ε²/∂s_i^(k+1)) (∂s_i^(k+1)/∂s_j^(k)) = Σ_{i=1..N_{k+1}} δ_i^(k+1) ∂s_i^(k+1)/∂s_j^(k)

Taking into account that
  s_i^(k+1) = w_0i^(k+1) + Σ_{l=1..N_k} w_li^(k+1) f(s_l^(k))

we get
  δ_j^(k) = Σ_{i=1..N_{k+1}} δ_i^(k+1) ∂/∂s_j^(k) ( w_0i^(k+1) + Σ_{l=1..N_k} w_li^(k+1) f(s_l^(k)) )

Considering that ∂f(s_l^(k))/∂s_j^(k) = 0 if l ≠ j, and that ∂f(s_j^(k))/∂s_j^(k) = f′(s_j^(k)), we have:

  δ_j^(k) = ( Σ_{i=1..N_{k+1}} δ_i^(k+1) w_ji^(k+1) ) · f′(s_j^(k)) ≡ ε_j^(k) · f′(s_j^(k))

Finally, the quadratic derivative error for a hidden layer:
  δ_j^(k) = ε_j^(k) · f′(s_j^(k)),  with  ε_j^(k) = Σ_{i=1..N_{k+1}} δ_i^(k+1) w_ji^(k+1)
The "Error Backpropagation" algorithm

1. w_ij^(k) ← random, to initialize the network weights.
2. For (x, d), a training pair, obtain y. Feedforward propagation: ε² = Σ_{j=1..m} (d_j − y_j)²
3. k ← last layer
4. For each element j in layer k, do:
   Compute ε_j^(k) using
     ε_j^(k) = d_j − x_j^(k) = d_j − y_j                if k is the last layer,
     ε_j^(k) = Σ_{i=1..N_{k+1}} δ_i^(k+1) w_ji^(k+1)    if it is a hidden layer;
   Compute δ_j^(k) = ε_j^(k) · f′(s_j^(k)).
5. k ← k − 1; if k > 0 go to step 4, else continue.
6. w_ij^(k)(n+1) = w_ij^(k)(n) + 2 µ δ_j^(k) x_i^(k−1)
7. For the next training pair, go to step 2.
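The steps above can be transcribed for a tiny 2-2-1 network with logistic activations (a sketch, not the course's code; as a sanity check, the backpropagated gradient of ε² is compared with a numerical finite difference, and the two must agree):

```python
import math, random

f  = lambda s: 1/(1 + math.exp(-s))     # logistic activation
df = lambda s: f(s)*(1 - f(s))          # its derivative f'(s)

random.seed(1)
# W1[j] = [w0, w1, w2] for hidden PE j; W2 = [w0, w1, w2] for the output PE
W1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(2)]
W2 = [random.uniform(-1, 1) for _ in range(3)]

def forward(x):
    s1 = [w[0] + w[1]*x[0] + w[2]*x[1] for w in W1]   # s_j^(1)
    x1 = [f(s) for s in s1]                           # x_j^(1)
    s2 = W2[0] + W2[1]*x1[0] + W2[2]*x1[1]            # s^(2)
    return s1, x1, s2, f(s2)                          # y = x^(2)

x, d = (1.0, 0.0), 1.0
s1, x1, s2, y = forward(x)

delta2 = (d - y)*df(s2)                                # output: delta = eps*f'(s)
delta1 = [delta2*W2[j+1]*df(s1[j]) for j in range(2)]  # hidden: eps = sum delta*w

# gradient of eps^2 = (d - y)^2 w.r.t. W1[0][1], from nabla = -2*delta*x
grad = -2*delta1[0]*x[0]

# numerical check of the same derivative
h = 1e-6
W1[0][1] += h
e_plus = (d - forward(x)[3])**2
W1[0][1] -= 2*h
e_minus = (d - forward(x)[3])**2
W1[0][1] += h
num = (e_plus - e_minus)/(2*h)
print(abs(grad - num) < 1e-6)
```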
The Backpropagation Algorithm in practice

1 – In the standard form BP is very slow.
2 – BP pathologies: paralysis in regions of small gradient.
3 – Initial conditions can lead to local minima.
4 – Stop conditions: number of epochs, ∆w_ij < ϵ.
5 – BP variants:
  - trainbpm (with momentum)
  - trainbpx (adaptive learning rate)
  - ....
  - trainlm (Levenberg-Marquardt – J, Jacobian):  ∆W^(k) = (Jᵗ J + µ I)⁻¹ Jᵗ e

[Figure: network energy E over the states, showing stored patterns, a spurious pattern, an initial value and the recovered pattern; and an illustrative quadratic error e²(w_ij) as a function of the weights, where a "bad start" converges to a local minimum and a "good start" reaches the optimum]

Obs: the error surface is normally unknown. Steepest descent → go in the opposite direction of the local gradient ("downhill").
Computational Tools
• SNNS
• MatLab - Neural Network Toolbox
• NeuralWorks
• Java
• C++
• Hardware Implementations of ANNs
SNNS - Stuttgart Neural Network Simulator
MatLab - complete environment
- System Simulation
- Training
- Control

[Figure: Simulink block diagrams: a neural network built from netsum, tansig and purelin blocks with matrix gains, unit delays and discrete state-space plant models; and a Model Reference Controller (reference, plant output, control signal) driving a 4th-order liquid-level model (qi → h4)]
Demonstration - Perceptron

% Perceptron
% Training an ANN to learn to classify a non-linear problem
% Input patterns
P = [0 0 0 0 1 1 1 1
     0 0 1 1 0 0 1 1
     0 1 0 1 0 1 0 1]
% Target
% T = [1 0 1 1 1 0 1 0]   % linearly separable
T = [1 0 0 1 1 0 1 0]     % non-separable
% Try with Rosenblatt's Perceptron
net = newp(P,T,'hardlim')
% train the network
net = train(net,P,T)
Y = sim(net,P)

Non-separable target:  T = 1 0 0 1 1 0 1 0   Y = 1 0 1 0 0 0 1 0   (fails)
Separable target:      T = 1 0 1 1 1 0 1 0   Y = 1 0 1 1 1 0 1 0   (learns)
Demonstration - OCR
[Figure: training vectors for the character patterns, clean and with 20 % noise; ANN with inputs p₁ … p₆₃ and outputs y₁ … y₁₆]
Demonstration – OCR...
Training with 10 x (0, 10, 20, 30, 40, 50) % noise

[Figure: % of misclassifications of the neural OCR classifier vs. test noise: one curve for the error without noisy training patterns, one for the error using noisy training patterns (up to the given % of bits flipped)]

With some noisy training patterns → the network learns how to treat "any" noise.
Demonstration – LMS, ADALINE, FIR

  y(k) = w₀u(k) + w₁u(k−1) + w₂u(k−2) + … + wₙu(k−n)

  Y(z)/U(z) = w₀ + w₁z⁻¹ + w₂z⁻² + … + wₙz⁻ⁿ

Obs: a FIR model (zeros only) is always stable; an IIR model is more compact but can be unstable.

(TDL – Tapped Delay Line)    Sampling time: Ts = 0.1 sec

[Figure: identification run over 150 sec; the system changes at 80 sec:
  g₁ = 1 / (s² + 0.2s + 1)  for 0 – 79.9 sec
  g₂ = 3 / (s² + 2s + 1)    for 80 – 150 sec]
Demo – LMS, ADALINE, FIR...

% ADALINE - Adaptive dynamic system identification
% First sampled system - until 80 sec
g1 = tf(1,[1 .2 1]), gd1 = c2d(g1,.1)
% System changes dramatically - after 80 sec
g2 = tf(3,[1 2 1]), gd2 = c2d(g2,.1)
% Pseudo-Random Binary Signal - good for identification
u = idinput(120*10,'PRBS',[0 0.01],[-1 1]);
% time vector ...
[y1,t1,x1] = lsim(gd1,u1,t1);
[y2,t2,x2] = lsim(gd2,u2,t2,x1);
% Create a new ADALINE network with delayed inputs (FIR)
% Learning rate = 0.09
net = newlin(t,y,[1 2 3 4 5 6 7 8 9 10],0.09)
[net,Y,E] = adapt(net,t,y)
% design an average transfer function
netd = newlind(t,y)
Demo – LMS, ADALINE, FIR...
[Figure: verification signals and errors:
  u = idinput(1500,'PRBS',[0 0.01]), n = 10, lr = 0.1 → RMSE Set 1 = 6.5742
  u = idinput(1200,'PRBS',[0 0.05]), n = 10, lr = 0.1 → RMSE Set 2 = 22.7817]

The ADALINE learns the system AND also changes in the dynamics!!
But in another frequency range it is not so good... (needs adjusting of the TDL, lr, Ts)