Chapter 6: Artificial Neural Networks, Part 2 of 3 (Sections 6.4 – 6.6)
Asst. Prof. Dr. Sukanya Pongsuparb, Dr. Srisupa Palakvangsa Na Ayudhya, Dr. Benjarath Pupacdi
SCCS451 Artificial Intelligence, Week 12
Agenda
- Multi-layer Neural Network
- Hopfield Network
Multilayer Neural Networks
A multilayer perceptron is a feedforward neural network with one or more hidden layers.

[Figure: single-layer vs. multi-layer neural networks. The single-layer perceptron takes inputs x1, x2, applies weights w1, w2 in a linear combiner, and passes the result through a hard limiter with threshold θ to produce output Y. The multilayer network passes input signals through an input layer, a first and a second hidden layer, and an output layer that produces the output signals.]
Roles of Layers
- Input layer: accepts input signals from the outside world and distributes them to the neurons in the hidden layer. It usually does not do any computation.
- Output layer (computational neurons): accepts output signals from the last hidden layer, outputs to the world, and knows the desired outputs.
- Hidden layer (computational neurons): determines its own desired outputs.
Hidden (Middle) Layers
- Neurons in hidden layers are unobservable through the inputs and outputs of the network.
- Their desired outputs are unknown (hidden) from the outside and are determined by the layer itself.
- One hidden layer suffices for continuous functions; two hidden layers are needed for discontinuous functions.
- Practical applications mostly use 3 layers. More layers are possible, but each additional layer exponentially increases the computing load.
How do multilayer neural networks learn?
More than a hundred different learning algorithms are available for multilayer ANNs. The most popular method is back-propagation.
Back-propagation Algorithm
In a back-propagation neural network, the learning algorithm has 2 phases:
1. Forward propagation of inputs
2. Backward propagation of errors
The algorithm loops over the 2 phases until the errors obtained fall below a certain threshold. Learning is done in a similar manner as in a perceptron:
- A set of training inputs is presented to the network.
- The network computes the outputs.
- The weights are adjusted to reduce the errors.
The activation function used is the sigmoid function:
Y_sigmoid = 1 / (1 + e^(-X))
Common Activation Functions
- Step function: Y_step = 1 if X ≥ 0, 0 if X < 0. Hard-limit functions like this are often used for decision-making neurons in classification and pattern recognition.
- Sign function: Y_sign = +1 if X ≥ 0, −1 if X < 0. Also a hard-limit function.
- Sigmoid function: Y_sigmoid = 1 / (1 + e^(-X)). Popular in back-propagation networks; the output is a real number in the [0, 1] range.
- Linear function: Y_linear = X. Often used for linear approximation.
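As a minimal sketch, the four activation functions above can be written directly in Python (the function names are ours, not from the slides):

```python
import math

def step(x):
    # hard limiter: 1 if X >= 0, else 0 (decision-making / classification)
    return 1 if x >= 0 else 0

def sign(x):
    # hard limiter: +1 if X >= 0, else -1
    return 1 if x >= 0 else -1

def sigmoid(x):
    # smooth squashing function; output lies in the (0, 1) range
    return 1 / (1 + math.exp(-x))

def linear(x):
    # identity: often used for linear approximation
    return x
```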
3-layer Back-propagation Neural Network

[Figure: a 3-layer network. Input signals x1, x2, …, xn enter the input layer (neurons 1…n), pass through hidden-layer neurons 1…m via weights w_ij, and reach output-layer neurons 1…l via weights w_jk, producing outputs y1, y2, …, yl. Input signals flow forward; error signals flow backward.]
How a Neuron Determines Its Output
Very similar to the perceptron:
1. Compute the net weighted input:
   X = Σ_{i=1..n} x_i · w_i − θ
2. Pass the result to the activation function:
   Y_sigmoid = 1 / (1 + e^(-X))

Example: a hidden neuron j with inputs (2, 5, 1, 8), weights (0.1, 0.2, 0.5, 0.3), and θ = 0.2:
X = 0.1(2) + 0.2(5) + 0.5(1) + 0.3(8) − 0.2 = 3.9
Y = 1 / (1 + e^(-3.9)) = 0.98
The neuron then passes 0.98 forward to every neuron in the next layer.
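The two steps above (net weighted input, then sigmoid) can be checked with a short Python sketch using the slide's numbers (the function name `neuron_output` is ours):

```python
import math

def neuron_output(inputs, weights, theta):
    # step 1: net weighted input X = sum(x_i * w_i) - theta
    x = sum(xi * wi for xi, wi in zip(inputs, weights)) - theta
    # step 2: pass X through the sigmoid activation function
    return 1 / (1 + math.exp(-x))

# slide example: inputs (2, 5, 1, 8), weights (0.1, 0.2, 0.5, 0.3), theta = 0.2
y = neuron_output([2, 5, 1, 8], [0.1, 0.2, 0.5, 0.3], 0.2)
print(round(y, 2))  # 0.98
```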
How the Errors Propagate Backward
The errors are computed in a similar manner to the errors in the perceptron:
Error = the output we want − the output we get
e_k(p) = y_{d,k}(p) − y_k(p)   (error at output neuron k at iteration p)

Example: suppose the expected output of output neuron k is 1 and the actual output is 0.98. Then
e_k(p) = 1 − 0.98 = 0.02
The error signals then propagate backward from the output layer toward the input layer.
Back-Propagation Training Algorithm
Step 1: Initialization
Randomly define the weights and thresholds θ such that the numbers are within a small range:
(−2.4 / F_i , +2.4 / F_i)
where F_i is the total number of inputs of neuron i. The weight initialization is done on a neuron-by-neuron basis.

[Figure: the width of the random weight range, ±2.4 / F_i, shrinks as F_i grows.]
Back-Propagation Training Algorithm
Step 2: Activation
Propagate the input signals forward from the input layer to the output layer:
X = Σ_{i=1..n} x_i · w_i − θ
Y_sigmoid = 1 / (1 + e^(-X)),  Y ∈ [0, 1]

Example (same hidden neuron as before, θ = 0.2):
X = 0.1(2) + 0.2(5) + 0.5(1) + 0.3(8) − 0.2 = 3.9
Y = 1 / (1 + e^(-3.9)) = 0.98
Back-Propagation Training Algorithm
Step 3: Weight Training
There are 2 types of weight training:
1. For the output layer neurons
2. For the hidden layer neurons
*** It is important to understand that the input signals propagate forward first, and then the errors propagate backward to help train the weights. ***
In each iteration (p + 1), the weights are updated based on the weights from the previous iteration p. The signals keep flowing forward and backward until the errors are below some preset threshold value.
3.1 Weight Training (Output Layer Neurons)
These formulas are used to perform the weight corrections:
w_{j,k}(p+1) = w_{j,k}(p) + Δw_{j,k}(p)
Δw_{j,k}(p) = α · y_j(p) · δ_k(p)
δ_k(p) = y_k(p) · [1 − y_k(p)] · e_k(p)
where e_k(p) = y_{d,k}(p) − y_k(p) and δ is the error gradient.
In w_{j,k}(p+1) = w_{j,k}(p) + Δw_{j,k}(p), we want to compute the left-hand side, and we already know w_{j,k}(p). In Δw_{j,k}(p) = α · y_j(p) · δ_k(p), the learning rate α is predefined, y_j(p) is known from the forward pass, and δ_k(p) = y_k(p) · [1 − y_k(p)] · e_k(p) is computed from quantities we know how to compute. We do the above for each of the weights of the output layer neurons.
3.2 Weight Training (Hidden Layer Neurons)
These formulas are used to perform the weight corrections:
w_{i,j}(p+1) = w_{i,j}(p) + Δw_{i,j}(p)
Δw_{i,j}(p) = α · x_i(p) · δ_j(p)
δ_j(p) = y_j(p) · [1 − y_j(p)] · Σ_{k=1..l} δ_k(p) · w_{j,k}(p)
In δ_j(p) = y_j(p) · [1 − y_j(p)] · Σ_{k=1..l} δ_k(p) · w_{j,k}(p), we want to compute δ_j(p); y_j(p) is known from the forward pass, and the δ_k(p) values propagate back from the output layer. In Δw_{i,j}(p) = α · x_i(p) · δ_j(p), α is predefined and x_i(p) is an input. We do the above for each of the weights of the hidden layer neurons.
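The update formulas for both layer types can be collected into small helper functions. This is a sketch under our own naming, checked against numbers that appear in the XOR example later in the chapter:

```python
def delta_output(y_k, e_k):
    # delta_k(p) = y_k(p) * [1 - y_k(p)] * e_k(p)
    return y_k * (1 - y_k) * e_k

def delta_hidden(y_j, deltas, weights):
    # delta_j(p) = y_j(p) * [1 - y_j(p)] * sum_k delta_k(p) * w_jk(p)
    return y_j * (1 - y_j) * sum(d * w for d, w in zip(deltas, weights))

def updated_weight(w, alpha, signal, delta):
    # w(p+1) = w(p) + alpha * signal * delta, where signal is y_j(p)
    # for an output-layer weight or x_i(p) for a hidden-layer weight
    return w + alpha * signal * delta
```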
[Figure: iteration p = 1. Input signals propagate forward through the network, error signals propagate backward, and all weights (input-to-hidden and hidden-to-output) are trained.]
After the weights are trained in p = 1, we go back to Step 2 (Activation) and compute the outputs with the new weights. If the errors obtained with the updated weights are still above the error threshold, we start weight training for p = 2. Otherwise, we stop.
[Figure: iteration p = 2. The same forward and backward passes are repeated, and the weights are trained again.]
Example: 3-layer ANN for XOR
[Figure: the four XOR input points (0, 0), (0, 1), (1, 0), (1, 1) plotted in the x1–x2 plane.]
XOR is not a linearly separable function. A single-layer ANN such as the perceptron cannot deal with problems that are not linearly separable. We cope with such problems using multi-layer neural networks.
Example: 3-layer ANN for XOR
The network has two inputs x1 and x2 entering the (non-computing) input neurons 1 and 2, two hidden neurons (3 and 4), and one output neuron (5) producing y5. The hidden neurons receive weights w13, w23, w14, w24; the output neuron receives w35 and w45. Each computational neuron also has a threshold input of −1 weighted by its θ.
Each neuron computes X = Σ x_i · w_i − θ.
Let α = 0.1.
Example: 3-layer ANN for XOR
Training set: x1 = x2 = 1 and y_{d,5} = 0. Let α = 0.1.
Initial weights and thresholds:
w13 = 0.5, w14 = 0.9, w23 = 0.4, w24 = 1.0, w35 = −1.2, w45 = 1.1
θ3 = 0.8, θ4 = −0.1, θ5 = 0.3
Forward pass:
y3 = sigmoid(1(0.5) + 1(0.4) − 0.8) = sigmoid(0.1) = 0.5250
y4 = sigmoid(1(0.9) + 1(1.0) + 0.1) = sigmoid(2.0) = 0.8808
y5 = sigmoid(0.5250(−1.2) + 0.8808(1.1) − 0.3) = sigmoid(−0.63 + 0.9689 − 0.3) = 0.5097
e = 0 − 0.5097 = −0.5097
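The forward pass above can be reproduced in a few lines of Python (a sketch; the variable names are ours):

```python
import math

def sigmoid(x):
    return 1 / (1 + math.exp(-x))

# initial weights and thresholds from the example
w13, w23, w14, w24, w35, w45 = 0.5, 0.4, 0.9, 1.0, -1.2, 1.1
t3, t4, t5 = 0.8, -0.1, 0.3
x1, x2 = 1, 1

y3 = sigmoid(x1 * w13 + x2 * w23 - t3)  # sigmoid(0.1)
y4 = sigmoid(x1 * w14 + x2 * w24 - t4)  # sigmoid(2.0)
y5 = sigmoid(y3 * w35 + y4 * w45 - t5)
e = 0 - y5  # desired output is 0

print(round(y3, 4), round(y4, 4), round(y5, 4))  # 0.525 0.8808 0.5097
```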
Example: 3-layer ANN for XOR (2)
Back-propagation of error (p = 1, output layer):
δ5 = y5 · (1 − y5) · e = 0.5097 · (1 − 0.5097) · (−0.5097) = −0.1274
Δw_{j,k}(p) = α · y_j(p) · δ_k(p)
Δw_{3,5}(1) = 0.1 · 0.5250 · (−0.1274) = −0.0067
w_{j,k}(p+1) = w_{j,k}(p) + Δw_{j,k}(p)
w_{3,5}(2) = −1.2 − 0.0067 = −1.2067
Example: 3-layer ANN for XOR (3)
Back-propagation of error (p = 1, output layer), continuing with δ5 = −0.1274:
Δw_{4,5}(1) = 0.1 · 0.8808 · (−0.1274) = −0.0112
w_{4,5}(2) = 1.1 − 0.0112 = 1.0888
Example: 3-layer ANN for XOR (4)
Back-propagation of error (p = 1, output layer), threshold update. The threshold input is y = −1:
Δθ_k(p) = α · y(p) · δ_k(p)
Δθ5(1) = 0.1 · (−1) · (−0.1274) = 0.0127
θ5(p+1) = θ5(p) + Δθ5(p)
θ5(2) = 0.3 + 0.0127 = 0.3127
Example: 3-layer ANN for XOR (5)
Back-propagation of error (p = 1, hidden layer):
δ_j(p) = y_j(p) · (1 − y_j(p)) · Σ_k [δ_k(p) · w_{j,k}(p)]
δ3(p) = 0.525 · (1 − 0.525) · (−0.1274 · (−1.2)) = 0.0381
Δw_{i,j}(p) = α · x_i(p) · δ_j(p)
Δw_{1,3}(1) = 0.1 · 1 · 0.0381 = 0.0038
w_{i,j}(p+1) = w_{i,j}(p) + Δw_{i,j}(p)
w_{1,3}(2) = 0.5 + 0.0038 = 0.5038
Example: 3-layer ANN for XOR (6)
Back-propagation of error (p = 1, hidden layer):
δ4(p) = 0.8808 · (1 − 0.8808) · (−0.1274 · 1.1) = −0.0147
Δw_{1,4}(1) = 0.1 · 1 · (−0.0147) = −0.0015
w_{1,4}(2) = 0.9 − 0.0015 = 0.8985
Example: 3-layer ANN for XOR (7)
Back-propagation of error (p = 1, hidden layer), with δ3(p) = 0.0381:
Δw_{2,3}(1) = 0.1 · 1 · 0.0381 = 0.0038
w_{2,3}(2) = 0.4 + 0.0038 = 0.4038
Example: 3-layer ANN for XOR (8)
Back-propagation of error (p = 1, hidden layer), with δ4(p) = −0.0147:
Δw_{2,4}(1) = 0.1 · 1 · (−0.0147) = −0.0015
w_{2,4}(2) = 1.0 − 0.0015 = 0.9985
Example: 3-layer ANN for XOR (9)
Back-propagation of error (p = 1, hidden layer), threshold update for neuron 3:
Δθ3(1) = 0.1 · (−1) · 0.0381 = −0.0038
θ3(2) = 0.8 − 0.0038 = 0.7962
Example: 3-layer ANN for XOR (10)
Back-propagation of error (p = 1, hidden layer), threshold update for neuron 4:
Δθ4(1) = 0.1 · (−1) · (−0.0147) = 0.0015
θ4(2) = −0.1 + 0.0015 = −0.0985
Example: 3-layer ANN for XOR (11)
Now the 1st iteration (p = 1) is finished. The updated parameters are:
w13 = 0.5038, w14 = 0.8985, w23 = 0.4038, w24 = 0.9985, w35 = −1.2067, w45 = 1.0888
θ3 = 0.7962, θ4 = −0.0985, θ5 = 0.3127
The weight training process is repeated until the sum of squared errors is less than 0.001 (the error threshold).
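The whole first iteration can be verified with a short script that follows the slides step by step. Note that, as in the slides, the hidden-layer gradients use the output-layer weights from before their update:

```python
import math

sigmoid = lambda x: 1 / (1 + math.exp(-x))
alpha = 0.1

# initial parameters and the training example x1 = x2 = 1, yd = 0
w13, w23, w14, w24, w35, w45 = 0.5, 0.4, 0.9, 1.0, -1.2, 1.1
t3, t4, t5 = 0.8, -0.1, 0.3
x1, x2, yd = 1, 1, 0

# forward pass
y3 = sigmoid(x1 * w13 + x2 * w23 - t3)
y4 = sigmoid(x1 * w14 + x2 * w24 - t4)
y5 = sigmoid(y3 * w35 + y4 * w45 - t5)
e = yd - y5

# error gradients (hidden deltas use the pre-update w35, w45)
d5 = y5 * (1 - y5) * e
d3 = y3 * (1 - y3) * d5 * w35
d4 = y4 * (1 - y4) * d5 * w45

# weight and threshold updates (the threshold input is -1)
w35 += alpha * y3 * d5;  w45 += alpha * y4 * d5;  t5 += alpha * -1 * d5
w13 += alpha * x1 * d3;  w23 += alpha * x2 * d3;  t3 += alpha * -1 * d3
w14 += alpha * x1 * d4;  w24 += alpha * x2 * d4;  t4 += alpha * -1 * d4

print(round(w35, 4), round(w45, 4), round(t5, 4))  # -1.2067 1.0888 0.3127
```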
Learning Curve for XOR
[Figure: sum-squared network error (log scale, 10^-4 to 10^1) versus epoch, over 224 epochs.]
The curve shows the ANN learning speed. 224 epochs, or 896 iterations, were required.
Final Results
[Figure: the trained network with its final weights and thresholds; the magnitudes shown include 10.4, 9.8, 4.6, 4.7, 4.8, 6.4, 6.4, 7.3, and 2.8.]
Training again with different initial values may give a different result. Any result is acceptable so long as the sum of squared errors is below the preset error threshold.
Final Results

 Inputs    Desired output  Actual output  Error    Sum of squared errors
 x1  x2    yd              y5             e
 1   1     0               0.0155         -0.0155  0.0010
 0   1     1               0.9849          0.0151
 1   0     1               0.9849          0.0151
 0   0     0               0.0175         -0.0175

A different result is possible for different initial values, but the result always satisfies the error criterion.
McCulloch-Pitts Model: XOR Op.
[Figure: the same 5-neuron topology implemented McCulloch-Pitts style, with the sign function as the activation function. All connection weights are +1.0 except w35 = −2.0; the thresholds are θ3 = +1.5, θ4 = +0.5, θ5 = +0.5.]
Decision Boundary
[Figure: (a) decision boundary constructed by hidden neuron 3: x1 + x2 − 1.5 = 0; (b) decision boundary constructed by hidden neuron 4: x1 + x2 − 0.5 = 0; (c) decision boundaries constructed by the complete three-layer network.]
Problems of Back-Propagation
- Not similar to the process of a biological neuron
- Heavy computing load
Accelerated Learning in Multi-layer NN (1)
Represent the sigmoid function by the hyperbolic tangent:
Y_tanh = 2a / (1 + e^(−bX)) − a
where a and b are constants. Suitable values: a = 1.716 and b = 0.667.
Accelerated Learning in Multi-layer NN (2)
Include a momentum term in the delta rule:
Δw_{j,k}(p) = β · Δw_{j,k}(p − 1) + α · y_j(p) · δ_k(p)
where β is a positive number (0 ≤ β < 1) called the momentum constant. Typically, the momentum constant is set to 0.95. This equation is called the generalized delta rule.
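A sketch of the generalized delta rule as a helper function (the name `momentum_update` and the state-passing style are our own):

```python
def momentum_update(w, dw_prev, alpha, beta, y_j, delta_k):
    # generalized delta rule:
    #   dw(p) = beta * dw(p-1) + alpha * y_j(p) * delta_k(p)
    dw = beta * dw_prev + alpha * y_j * delta_k
    return w + dw, dw  # new weight, and the dw to reuse next iteration

# repeated updates in the same direction gain speed from the momentum term
w, dw = momentum_update(0.5, 0.0, 0.1, 0.95, 1.0, 0.2)  # dw = 0.02
w, dw = momentum_update(w, dw, 0.1, 0.95, 1.0, 0.2)     # dw = 0.039
```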
Learning with Momentum
[Figure: sum-squared error (log scale) and learning rate versus epoch for training with momentum, over 126 epochs.]
Training was reduced from 224 to 126 epochs.
Accelerated Learning in Multi-layer NN (3)
Adaptive learning rate, the idea:
- A small α gives a smooth learning curve.
- A large α gives fast learning, but possibly unstable.
Heuristic rule:
- Increase the learning rate when the change of the sum of squared errors keeps the same algebraic sign for several consecutive epochs.
- Decrease the learning rate when the sign alternates for several consecutive epochs.
Effect of Adaptive Learning Rate
[Figure: sum-squared error (log scale) and learning rate versus epoch; training takes 103 epochs.]
Momentum + Adaptive Learning Rate
[Figure: sum-squared error (log scale) and learning rate versus epoch; training takes 85 epochs.]
The Hopfield Network
- Neural networks were designed on an analogy with the brain, which has associative memory: we can recognize a familiar face in an unfamiliar environment, and our brain can recognize certain patterns even though some information about the patterns differs from what we have remembered.
- Multilayer ANNs are not intrinsically intelligent.
- Recurrent neural networks (RNNs) are used to emulate the human associative memory.
- The Hopfield network is an RNN.
The Hopfield Network: Goal
The goal is to recognize a pattern even if some parts differ from what the network was trained to remember. The Hopfield network is a single-layer, recurrent network: the network outputs are calculated and then fed back to adjust the inputs, and the process continues until the outputs become constant. Let's see how it works.
Single-layer n-neuron Hopfield Network
[Figure: n neurons, each receiving an input signal x_i and producing an output signal y_i; the outputs are fed back as inputs to the network.]
Activation Function
- If the neuron's weighted input is greater than zero, the output is +1.
- If the neuron's weighted input is less than zero, the output is −1.
- If the neuron's weighted input is zero, the output remains in its previous state.
Y_sign = +1 if X > 0;  −1 if X < 0;  Y (unchanged) if X = 0
Hopfield Network Current State
The current state of the network is determined by the current outputs, i.e. the state vector
Y = [y1, y2, …, yn]^T
What can it recognize?
- n = the number of inputs (and of neurons)
- Each input can be +1 or −1.
- There are 2^n possible sets of inputs/outputs, i.e. patterns.
- M = the total number of patterns the network was trained with, i.e. the total number of patterns we want the network to be able to recognize.
Example: n = 3, 2^3 = 8 possible states
[Figure: the 8 states (±1, ±1, ±1) drawn as the corners of a cube in (y1, y2, y3) space.]
Weights
Weights between neurons are usually represented in matrix form:
W = Σ_{m=1..M} Y_m · Y_m^T − M·I
Once the weights are calculated, they remain fixed.
For example, let's train the 3-neuron network to recognize the following 2 patterns (M = 2, n = 3):
Y1 = [+1, +1, +1]^T,  Y2 = [−1, −1, −1]^T
Weights (2)
With M = 2, Y1 = [1, 1, 1]^T, and Y2 = [−1, −1, −1]^T, we can determine the weight matrix as follows:
W = Y1·Y1^T + Y2·Y2^T − 2I
  = [1 1 1; 1 1 1; 1 1 1] + [1 1 1; 1 1 1; 1 1 1] − 2·[1 0 0; 0 1 0; 0 0 1]
  = [0 2 2; 2 0 2; 2 2 0]
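The same matrix arithmetic in NumPy (a minimal sketch, assuming NumPy is available):

```python
import numpy as np

Y1 = np.array([1, 1, 1])
Y2 = np.array([-1, -1, -1])
M = 2  # number of stored patterns

# W = sum_m Ym * Ym^T - M * I
W = np.outer(Y1, Y1) + np.outer(Y2, Y2) - M * np.eye(3, dtype=int)
print(W)
# [[0 2 2]
#  [2 0 2]
#  [2 2 0]]
```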
How is the Hopfield network tested?
Given an input vector X, we calculate the output in a similar manner as before:
Y_m = sign(W·X_m − θ),  m = 1, 2, …, M
where θ is the threshold matrix; in this case all thresholds are set to zero.
Y1 = sign([0 2 2; 2 0 2; 2 2 0] · [1, 1, 1]^T − 0) = [1, 1, 1]^T
Y2 = sign([0 2 2; 2 0 2; 2 2 0] · [−1, −1, −1]^T − 0) = [−1, −1, −1]^T
Stable States
As we see, Y1 = X1 and Y2 = X2. Thus both states are said to be stable (also called fundamental states).
Unstable States
With 3 neurons in the network, there are 8 possible states. The remaining 6 states are unstable; each converges to the fundamental memory that differs from it in a single bit:

 Possible state   Converges to fundamental memory
 ( 1,  1,  1)     ( 1,  1,  1)  (stable)
 (-1,  1,  1)     ( 1,  1,  1)
 ( 1, -1,  1)     ( 1,  1,  1)
 ( 1,  1, -1)     ( 1,  1,  1)
 (-1, -1, -1)     (-1, -1, -1)  (stable)
 (-1, -1,  1)     (-1, -1, -1)
 (-1,  1, -1)     (-1, -1, -1)
 ( 1, -1, -1)     (-1, -1, -1)
Error Correction Network
Each of the unstable states represents a single error relative to a fundamental memory. The Hopfield network can therefore act as an error correction network.
The Hopfield Network
The Hopfield network can store a set of fundamental memories and can recall them when presented with inputs that may be exactly those memories or slightly different. However, it may not always recall correctly. Let's see an example.
Example: When the Hopfield Network Cannot Recall
Suppose the network stores three fundamental memories:
X1 = (+1, +1, +1, +1, +1)
X2 = (+1, −1, +1, −1, +1)
X3 = (−1, +1, −1, +1, −1)
Let the probe vector be X = (+1, +1, −1, +1, +1). It is very similar to X1 (only one bit differs), but the network recalls it as X3. This is a problem with the Hopfield network.
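A sketch of this failure in NumPy, using synchronous sign updates where a zero net input keeps the previous state (our own update scheme; the slides do not fix one):

```python
import numpy as np

def recall(W, x, max_iters=20):
    # iterate y = sign(W y) until the state stops changing
    y = x.copy()
    for _ in range(max_iters):
        net = W @ y
        new = np.where(net > 0, 1, np.where(net < 0, -1, y))
        if np.array_equal(new, y):
            break
        y = new
    return y

X1 = np.array([1, 1, 1, 1, 1])
X2 = np.array([1, -1, 1, -1, 1])
X3 = np.array([-1, 1, -1, 1, -1])

# W = sum_m Xm * Xm^T - M * I, with M = 3 memories
W = np.outer(X1, X1) + np.outer(X2, X2) + np.outer(X3, X3) - 3 * np.eye(5, dtype=int)

probe = np.array([1, 1, -1, 1, 1])  # one bit away from X1
print(recall(W, probe))             # recalled as X3, not X1
```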
Storage Capacity of the Hopfield Network
Storage capacity is the largest number of fundamental memories that can be stored and retrieved correctly. The maximum number of fundamental memories M_max that can be stored in an n-neuron recurrent network is limited by
M_max = 0.15 · n