Chapter 9: Artificial Neural Networks
Introduction to the Back Propagation Neural Network (BPNN)
By KH Wong
Introduction
• Neural Network research is very popular
• A high-performance (multi-class) classifier
• Successful in handwritten optical character recognition (OCR), speech recognition, image noise removal, etc.
• Easy to implement; slow in learning but fast in classification
http://www.ninds.nih.gov/disorders/brain_basics/ninds_neuron.htm
http://yann.lecun.com/exdb/mnist/
Motivation
• Biological findings inspire the development of neural nets: inputs → weights → logic function → output
• Biological relation: inputs correspond to dendrites, and the neuron produces an output; humans compute using a net of neurons
Figure: a neuron model. X = inputs, W = weights, the neuron acts as a logic function, producing the output.
Applications
• Microsoft: XiaoIce AI
• ImageNet challenge: http://image-net.org/challenges/LSVRC/2015/ with 200 categories: accordion, airplane, ant, antelope, …, dishwasher, dog, domestic cat, dragonfly, drum, dumbbell, etc.
• TensorFlow
ILSVRC 2015
  Number of object classes: 200
  Training:   456,567 images, 478,807 objects
  Validation: 20,121 images, 55,502 objects
  Testing:    40,152 images, number of objects: ---
Different types of artificial neural networks
• Autoencoder
• DNN (deep neural network) and deep learning
• MLP (multilayer perceptron)
• RNN (recurrent neural network)
• RBM (restricted Boltzmann machine)
• SOM (self-organizing map)
• Convolutional neural network
• From https://en.wikipedia.org/wiki/Artificial_neural_network
• The method discussed in this presentation can be applied to many of the above nets.
Theory of Back Propagation Neural Net (BPNN)
• Use many samples to train the weights (W) and biases (b), so the network can be used to classify an unknown input into different classes
• Will explain:
  – How to use it after training: the forward pass (classification, i.e. recognition of the input)
  – How to train it: how to train the weights and biases (using forward and backward passes)
Back propagation is an essential step in many artificial neural network designs
• Used for training an artificial neural network
• For each training sample xi, a supervised (teacher) output ti is given
• For the i-th training sample xi:
  1) Feed-forward propagation: feed xi to the neural net and obtain the output yi. Error e_i = |t_i - y_i|^2
  2) Back propagation: feed e_i into the net from the output side and adjust the weights w (by finding Δw) to minimize e
• Repeat 1) and 2) for all samples until the overall error E is 0 or very small
Example: optical character recognition (OCR)
• Training: train the system first by presenting many samples with known classes to the network
• Recognition: when an image is input to the system, it will tell what character it is
Figure: the neural net gives Output3 = '1' and all other outputs = '0' for this input. Training up the network finds the weights (W) and biases (b).
Overview of this document
• Back Propagation Neural Networks (BPNN)
  – Part 1: Feed-forward processing (classification, or recognition)
  – Part 2: Back propagation (training the network); also includes forward processing, backward processing and weight updates
• Appendix: a MATLAB example is explained
  %source: http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
Part 1: Classification in action (the recognition process)
The forward pass of the Back Propagation Neural Net (BPNN). Assume the weights (W) and biases (b) have already been found by training (to be discussed in Part 2).
Recognition: assume the weights (W) and biases (b) were found earlier
Figure: an input character image (each pixel is X(u,v)) feeds the network; correct recognition gives Output3 = 1 while Output0, Output1, Output2, …, Outputn = 0.
Figure: a neural network with an input layer, hidden layers of neurons X^l_1, X^l_2, X^l_3, …, X^l_Nl (with weight sets W^l_1, W^l_2, …, W^l_Nl), and an output layer.
Exercise 1
• How many input and output neurons? Ans: 4 input and 2 output neurons.
• How many hidden layers does this network have? Ans: 3.
• How many weights in total? Ans: the first hidden layer has 4x4, the second hidden layer 3x4, the third hidden layer 3x3, and the third hidden layer to the output layer has 2x3 weights; total = 16 + 12 + 9 + 6 = 43.
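A quick MATLAB check of this count (layer sizes taken from the figure) is shown below:

sizes = [4 4 3 3 2];  % input, three hidden layers, output
total_weights = sum(sizes(1:end-1).*sizes(2:end))  % 4*4 + 4*3 + 3*3 + 3*2 = 43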
Figure (for the exercise): input neurons feed layers of neurons X^l_1, X^l_2, X^l_3, … with weight sets W^l_1, …, W^l_4. Question: what is this layer of neurons X called? Ans: X^l_4.
Multi-layer structure of a BP neural network
A layer has multiple neurons. Each neuron has weights w1, w2, w3, …, one bias b, and a transfer function f(). Y = set of outputs, X = set of inputs, W = set of weights, b = set of biases, such that for each neuron in hidden layer l:

  y_l = f(w_l x_l + b_l)

Figure: input layer → hidden layer l → other hidden layers → output layer.
• Inside each neuron there is a bias (b)
• Between any two neighboring neuron layers, a set of weights is found
Figure: inside a neuron. Inputs x(1), x(2), …, x(I) are weighted by w(1), w(2), …, w(I), summed into the internal signal u, and passed through the transfer function f(u) to give the output y.
Inside each neuron: x = input, y = output
Each neuron computes

  u = Σ_{i=1}^{I} w(i)x(i) + b,   y = f(u)

where b = bias, x(i) = inputs, w(i) = weights, and u = the internal signal. Typically f() is a logistic (sigmoid) function, i.e.

  f(u) = 1/(1 + e^{-βu});  assume β = 1 for simplicity,

therefore

  y = f(u) = 1/(1 + e^{-(Σ_{i=1}^{I} w(i)x(i) + b)})
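A minimal MATLAB sketch of one neuron's computation (the input, weight and bias values below are made up for illustration):

x = [0.5; 0.2; 0.9];      % inputs x(1),...,x(I)
w = [0.4; -0.3; 0.1];     % weights w(1),...,w(I)
b = 0.5;                  % bias
u = w'*x + b;             % internal signal u = sum_i w(i)x(i) + b
y = 1/(1 + exp(-u))       % sigmoid output y = f(u)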
BPNN forward pass
• The forward pass finds the output when an input is given. For example:
• Assume we have used N = 60,000 images (the MNIST database) to train a network to recognize c = 10 numerals.
• When an unknown image is given to the input, the output neuron corresponding to the correct answer will give the highest output level.
Figure: an input image feeds the network; 10 output neurons, one for each numeral 0, 1, 2, …, 9.
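As a sketch, the whole forward pass of a small two-layer network can be written in a few MATLAB lines (the names W1, b1, W2, b2 mirror the appendix program; the random values here are placeholders for trained parameters):

x  = rand(9,1);                     % a 3x3 image reshaped to 9x1
W1 = randn(5,9); b1 = randn(5,1);   % hidden layer: 5 neurons, 9 inputs each
W2 = randn(3,5); b2 = randn(3,1);   % output layer: 3 neurons (3 classes)
A1 = 1./(1 + exp(-(W1*x + b1)));    % hidden-layer outputs
Y  = 1./(1 + exp(-(W2*A1 + b2)));   % network outputs
[~, cls] = max(Y)                   % the largest output indicates the class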
Our simple demo program
• Training pattern: 3 classes (in 3 rows); each class has 3 training samples (the items in each row).
• After training, when an input (assume it is test image #2) is presented to the network, the network should tell you it is class 2.
Figure: training samples for class 1, class 2 and class 3; an unknown input gives the result: image of class 2.
Numerical example: architecture of our example
• Input layer: 9x1 pixels
• Hidden layer: weights W^l = 5 neurons with 9 inputs each; biases b^l = 5 neurons x 1 (one bias per neuron); weights W^l, biases b^l and transfer function f() for each neuron
• Output layer: 3x1
The input x
• P2=[50 30 25 215 225 231 31 22 34; ... %class1: 1st training sample. Gray level 0->255
Figure: the 9 pixel values P1=50, P2=30, P3=25, P4=215, P5=225, P6=235, P7=31, P8=22, P9=34 feed the 9 neurons in the input layer, followed by 5 neurons in the hidden layer and 3 neurons in the output layer.
Exercise 2: Feed forward
Input = P1, …, P9; output = Y1, Y2, Y3; teacher (target) = T1, T2, T3.
Figure: input layer P(i=1), …, P(i=9) → hidden layer 1 (A1: 5 neurons indexed by j; W^{l=1} = 9x5, b^{l=1} = 5x1, weights labelled (i=1,j=1), (i=2,j=1), …) → output layer (layer l=2, weights labelled (j=1,k=1), (j=2,k=1), (j=2,k=2), …), giving Y1=0.5101 (T1=1), Y2=0.4322 (T2=0), Y3=0.3241 (T3=0). Class 1 target code: T1,T2,T3 = 1,0,0.
Question: what is the target code for T1,T2,T3 if the sample is for class 3? Ans: 0,0,1.
Exercise 3: find Y1
Figure: a 3-input network. Input layer (l=1): neurons i=1,2,3 with inputs X=1, X=3.1, X=0.5. Hidden layer (l=2): neuron i=1 (bias b=0.5, output A1) and neuron i=2 (bias b=0.3, output A2). Output layer (l=3): neuron i=1 (bias b=0.7, output Y1) and neuron i=2 (bias b=0.6, output y2). Weights from the figure: input-to-hidden 0.1, 0.35, 0.4 (into A1) and 0.27, 0.73, 0.15 (into A2); hidden-to-output 0.6, 0.35 (into Y1) and 0.8, 0.25 (into y2).
Each neuron computes y = f(u) = 1/(1 + e^{-u}), with u = Σ_i x(i)w(i) + b.
%demo_bpnn_note1 khw ver15
u1 = 1*0.1 + 3.1*0.35 + 0.5*0.4 + 0.5     % input signal of hidden neuron 1
A1 = 1/(1+exp(-1*u1))
u2 = 1*0.27 + 3.1*0.73 + 0.5*0.15 + 0.3   % input signal of hidden neuron 2
A2 = 1/(1+exp(-1*u2))
u_Y1 = A1*0.6 + A2*0.35 + 0.7             % input signal of output neuron 1
Y1 = 1/(1+exp(-1*u_Y1))

%%%%%% result %%%%%%
%>>demo_bpnn_note1
% u1 = 1.8850
% A1 = 0.8682
% u2 = 2.9080
% A2 = 0.9482
% Y1 = 0.8253

Answer 3
Part 2: Back propagation processing (training the network)
Back Propagation Neural Net (BPNN) training.
Ref: http://en.wikipedia.org/wiki/Backpropagation
Back propagation stage
Figure: at layer l, the feed-forward pass (Part 1, studied before) computes x^{l+1} = f(W^l x^l + b^l) from x^l; Part 2 back-propagates the error from layer l+1 to layer l.
For training we need to find ∂E/∂w. Why? We will explain why and prove the necessary equations in the following slides.
The criteria to train a network
• Based on the overall error function; there are N samples and c classes to be learned (assume N = 60,000 in the MNIST dataset).

Error of the n-th training sample for all outputs (n = 1, …, N):

  E_n = (1/2) Σ_{k=1}^{c} (t_k^n - y_k^n)^2 = (1/2) ||t^n - y^n||^2

Overall error (error for all samples, all outputs):

  E = (1/N) Σ_{n=1}^{N} E_n = (1/2N) Σ_{n=1}^{N} Σ_{k=1}^{c} (t_k^n - y_k^n)^2

where t_k^n = the given true class (teacher) of the n-th training sample, and y_k^n = the output for the k-th class of the n-th training sample at the output of the feed-forward network.
Example: for the n-th training sample, the teacher says it is class t_k.
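A minimal MATLAB sketch of this error computation (T and Y here are made-up c x N teacher and output matrices, playing the roles of T and A2 in the appendix program):

T  = [1 0; 0 1; 0 0];              % one-hot teacher codes: c=3 classes, N=2 samples
Y  = [0.8 0.2; 0.1 0.7; 0.2 0.1];  % example network outputs
En = 0.5*sum((T - Y).^2, 1);       % error of each sample, E_n (1 x N)
E  = mean(En)                      % overall error, averaged over the N samples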
Before we back propagate data, we have to find the feed-forward error signal e(n) for training sample x(n) first. Recall the feed-forward processing: input = P1, …, P9; output = Y1, Y2, Y3; teacher = T1, T2, T3.
Figure: the same 9-5-3 network as before (A1: hidden layer 1 = 5 neurons indexed by j; W^{l=1} = 9x5, b^{l=1} = 5x1), giving Y1=0.5101 (T1=1), Y2=0.4322 (T2=0), Y3=0.3241 (T3=0).
For output 1: e(n) = (1/2)|Y1 - T1|^2 = 0.5*(0.5101 - 1)^2 = 0.12
Exercise 3: the training idea
• Assume this is the n-th training sample, and it belongs to class C.
• In the previous exercise we calculated that in this network Y1 = 0.8253.
• During training, for this input the teacher says t = 1.
a) What is the error value e?
b) How do we use this e?
• Answer a: e = (1/2)|Y1 - t|^2 = 0.5*(1 - 0.8253)^2 = 0.0153
• Answer b: We feed this e back into the network to find Δw to minimize the overall E (E = Σ_n e(n)). This is because we know that w_new = w_old + Δw gives a new w that decreases E; by applying this formula recursively, we can achieve a set of W that minimizes E.
How to back propagate?
For a neuron j with inputs i = 1, 2, …, I, the output is

  y_j = f(u_j),  u_j = Σ_{i=1}^{I} x_i w_{i,j} + b_j

By definition, the squared error at the output is E = (1/2)(t - y_j)^2, where t = target (teacher) and y_j = actual output.
We want to find ∂E/∂w_{i,j}. By the chain rule:

  ∂E/∂w_{i,j} = (∂E/∂y_j)(∂y_j/∂u_j)(∂u_j/∂w_{i,j})   ---(1)

But why do we need to find ∂E/∂w?
Figure: neuron j receives inputs x_1, …, x_I through weights w_{1,j}, …, w_{I,j}; its output is y_j.
Because ∂E/∂w_{i,j} tells you how to change w to minimize E. The method is called learning by gradient descent.
In each learning cycle (epoch), a new w is calculated using

  w_new = w_old + Δw

If we want E to decrease every learning cycle, make Δw = -η(∂E/∂w), i.e. learning by gradient descent; do it slowly, using a small +ve learning factor η ≈ 0.1. (The theory of gradient descent will be explained in the next slide.) That's why we need ∂E/∂w.
By the same argument, for the biases:

  b_new = b_old + Δb,  with  Δb = -η(∂E/∂b)
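A one-line MATLAB sketch of this update rule (dEdw stands for whatever value of ∂E/∂w has been computed; eta is the learning factor):

eta  = 0.1;                % small +ve learning rate
dEdw = 0.8;  w_old = 0.3;  % example gradient and weight values
w_new = w_old - eta*dEdw   % w_new = w_old + dw, with dw = -eta*dE/dw
% the bias update has the same form: b_new = b_old - eta*dEdb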
We need to find ∂E/∂w. Why?
• Ans: by the Taylor series definition,

  E(w_new) = E(w_old + Δw) = E(w_old) + (∂E/∂w)Δw + …   ---(*)

Here we set

  Δw = -η(∂E/∂w)   ---(**)

where η is a small +ve term used to set the learning rate. Putting (**) into (*):

  E(w_new) = E(w_old) - η(∂E/∂w)^2

Since η(∂E/∂w)^2 is always +ve, E(w_new) < E(w_old).
Conclusion: setting Δw = -η(∂E/∂w) will decrease E.
Using Taylor series:
http://www.fepress.org/files/math_primer_fe_taylor.pdf
http://en.wikipedia.org/wiki/Taylor's_theorem
Back propagation idea
Input = P1, …, P9; outputs = Y(k=1), Y(k=2), Y(k=3); teachers = T(k=1), T(k=2), T(k=3).
Figure: the same 9-5-3 network (A1: hidden layer 1 = 5 neurons indexed by j; W^{l=1} = 9x5, b^{l=1} = 5x1), giving Y(k=1)=0.5101 (T(k=1)=1), Y(k=2)=0.4322 (T(k=2)=0), Y(k=3)=0.3241 (T(k=3)=0).
e = (1/2)|Y1 - T1|^2 = 0.5*(0.5101 - 1)^2 = 0.12
Back propagate e to find a better w to reduce E.
The training algorithm
Loop many epochs until E is very small or W is stable
{ For n = 1 : N_all_training_samples
  { feed forward x(n) to the network to get y(n)
    e(n) = 0.5*[y(n) - t(n)]^2   // t(n) = teacher of sample x(n)
    back propagate e(n) into the network
    // shown earlier: if Δw = -η*∂E/∂w and w_new = w_old + Δw,
    // the output y(n) will be closer to t(n), hence e(n) will decrease
    find Δw = -η*∂E/∂w           // E will decrease; learning rate η ≈ 0.1
    update w_new = w_old + Δw = w_old - η*∂E/∂w   // for the weights
    similarly update b_new = b_old + Δb = b_old - η*∂E/∂b   // for the biases
  }
  E = sum_all_n( e(n) )
}
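The same loop, written as a self-contained MATLAB sketch for a single sigmoid output neuron (the tiny AND-like data set and all names here are made up for illustration):

X = [0 0 1 1; 0 1 0 1];  t = [0 0 0 1];   % 4 training samples and their teachers
w = randn(2,1);  b = 0;  eta = 0.1;
for epoch = 1:5000                        % loop many epochs
    E = 0;
    for n = 1:size(X,2)                   % for each training sample
        u = w'*X(:,n) + b;
        y = 1/(1 + exp(-u));              % feed forward
        E = E + 0.5*(y - t(n))^2;         % accumulate e(n)
        delta = (y - t(n))*y*(1-y);       % sensitivity dE/du, see eq. (2) later
        w = w - eta*delta*X(:,n);         % w_new = w_old - eta*dE/dw
        b = b - eta*delta;                % b_new = b_old - eta*dE/db
    end
    if E < 1e-4, break; end               % stop when E is very small
end

Note that the weight update happens inside the sample loop (one update per sample), as in the appendix program.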
Theory of how to find ∂E/∂w
An input x_j is connected to an output neuron k through the weight w_{j,k}:

  y_k = f(u_k),  u_k = Σ_{j=1}^{J} x_j w_{j,k} + b_k

We want to see how w_{j,k} affects E, so from (1), by the chain rule:

  ∂E/∂w_{j,k} = (∂E/∂y_k)(∂y_k/∂u_k)(∂u_k/∂w_{j,k}) = term1 * term2 * term3

Figure: inputs x_{j=1}, …, x_{j=J} with weights w_{j,k} feed the internal signal u_k of output neuron k, which outputs y_k.
Case 1: neuron k is at the output layer. We want to see how E will change if we change the weight w_{j,k}.

  ∂E/∂w_{j,k} = (∂E/∂y_k)(∂y_k/∂u_k)(∂u_k/∂w_{j,k}) = term1 * term2 * term3

term1: since E = 0.5(t_k - y_k)^2 is measured at the output,
  ∂E/∂y_k = -(t_k - y_k), with t_k = teacher (target) class and y_k = output.
term2: ∂y_k/∂u_k = f'(u_k) = f(u_k)(1 - f(u_k))   (see Appendix)
term3: ∂u_k/∂w_{j,k} = x_j, since u_k = Σ_j x_j w_{j,k} + b_k and b_k is constant.

Hence

  ∂E/∂w_{j,k} = -(t_k - y_k) f(u_k)(1 - f(u_k)) x_j

Note: define the sensitivity of output neuron k as

  δ_k = ∂E/∂u_k = term1 * term2 = -(t_k - y_k) f(u_k)(1 - f(u_k))   ---(2)

Figure: x_j → w_{j,k} → u_k → output y_k; teacher (target) class = t_k; e_k = 0.5(t_k - y_k)^2. Neuron k is an output neuron.
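In MATLAB, eq. (2) for a whole output layer at once might look like this sketch (y, t and x are assumed column vectors; it mirrors the s2 computation in the appendix program):

y = [0.51; 0.43; 0.32];  t = [1; 0; 0];  % example outputs and teacher
x = rand(5,1);                           % example inputs to the output layer
delta2 = -(t - y).*y.*(1 - y);           % eq. (2): sensitivity of each output neuron
dEdW2  = delta2*x'                       % dE/dw_{j,k} = delta_k * x_j, one per weight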
Case 2: neuron j is at a hidden layer. We want to see how E will change if we change the weight w_{i,j}. Note: the output y_j affects all K neurons connected to it in the next layer.

  ∂E/∂w_{i,j} = (∂E/∂y_j)(∂y_j/∂u_j)(∂u_j/∂w_{i,j}) = term1 * term2 * term3

term1: ∂E/∂y_j = Σ_{k=1}^{K} (∂E/∂u_k)(∂u_k/∂y_j) = Σ_{k=1}^{K} part1a * part1b
  part1a: ∂E/∂u_k = δ_k   (see eq. (2) of the last slide)
  part1b: ∂u_k/∂y_j = w_{j,k}, because u_k = Σ_j w_{j,k} y_j + b_k for each k, and y_j affects every u_k in the next layer.

So term1 = Σ_{k=1}^{K} δ_k w_{j,k}.

Figure: hidden neuron j (input x_i through weight w_{i,j}, which is W1 in the program; internal signal u_j, output y_j) connects through weights w_{j,k=1}, w_{j,k=2}, …, w_{j,k=K} (W2 in the program) to the output neurons indexed by k, with signals u_{k=1}, …, u_{k=K} and outputs y_{k=1}, …, y_{k=K}. A change of w_{i,j} therefore changes E through all of them.
Case 2: continued
term2 and term3 are similar to those in the previous slide:
  term2 = ∂y_j/∂u_j = f'(u_j) = f(u_j)(1 - f(u_j))   (df1 in the program, for hidden neuron j)
  term3 = ∂u_j/∂w_{i,j} = x_i   (the input x_i to hidden neuron j; P(:,i) in the program)

Hence

  ∂E/∂w_{i,j} = [ Σ_{k=1}^{K} δ_k w_{j,k} ] f(u_j)(1 - f(u_j)) x_i
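The corresponding MATLAB sketch for a hidden layer (delta2, W2 and the hidden outputs A1 are assumed to come from case 1; it mirrors the s1 computation in the appendix program):

W2 = rand(3,5);  delta2 = rand(3,1);    % placeholders for the output-layer values
A1 = rand(5,1);  x = rand(9,1);         % hidden outputs f(u_j) and layer input
delta1 = (W2'*delta2).*A1.*(1 - A1);    % term1 * term2 for each hidden neuron j
dEdW1  = delta1*x'                      % dE/dw_{i,j} = delta_j * x_i, a 5x9 matrix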
After all ∂E/∂w are found (once you have solved case 1 and case 2)
We can use this step to update all w using the gradient descent method (learning rate η ≈ 0.1):

  w_new = w_old + Δw = w_old - η(∂E/∂w)

so E is minimized.
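Putting both cases together, one complete update step might look like this MATLAB sketch (variable names follow the appendix program; 0.1 is the learning rate η):

W1 = rand(5,9);  b1 = rand(5,1);        % placeholder current parameters
W2 = rand(3,5);  b2 = rand(3,1);
x = rand(9,1);   A1 = rand(5,1);        % layer input and hidden outputs
delta2 = rand(3,1);                     % output sensitivities (case 1)
delta1 = (W2'*delta2).*A1.*(1 - A1);    % hidden sensitivities (case 2)
W2 = W2 - 0.1*delta2*A1';               % output-layer weights (case 1)
b2 = b2 - 0.1*delta2;                   % output-layer biases
W1 = W1 - 0.1*delta1*x';                % hidden-layer weights (case 2)
b1 = b1 - 0.1*delta1;                   % hidden-layer biases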
Revisit the training algorithm
For iter = 1 : all_epochs (or break when E is very small)
{ For n = 1 : N_all_training_samples
  { feed forward x(n) to the network to get y(n)
    e(n) = 0.5*[y(n) - t(n)]^2;   // t(n) = teacher of sample x(n)
    back propagate e(n) into the network
    // shown earlier: if Δw = -η*∂E/∂w and w_new = w_old + Δw,
    // the output y(n) will be closer to t(n), hence e(n) will decrease
    find Δw = -η*∂E/∂w            // E will decrease; learning rate η ≈ 0.1
    update w_new = w_old + Δw = w_old - η*∂E/∂w;   // for the weights
    similarly update b_new = b_old + Δb = b_old - η*∂E/∂b;   // for the biases
  }
  E = sum_all_n( e(n) )
}
Summary
• Learned what Back Propagation Neural Networks (BPNN) are
• Learned the forward pass
• Learned how to back propagate data during training of the BPNN
References
• Wiki
  – http://en.wikipedia.org/wiki/Backpropagation
  – http://en.wikipedia.org/wiki/Convolutional_neural_network
• MATLAB programs
  – Neural Network for Pattern Recognition - Tutorial: http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
  – CNN MATLAB example: http://www.mathworks.com/matlabcentral/fileexchange/38310-deep-learning-toolbox
• Open source library
  – TensorFlow: http://www.geekwire.com/2015/google-open-sources-tensorflow-machine-learning-system-offering-its-neural-network-to-outside-developers/
Appendices
Appendix 1: the sigmoid function f(u) and its derivative f'(u)

f(u) = 1/(1 + e^{-βu}); for simplicity set β = 1, so

  f(u) = 1/(1 + e^{-u})

f'(u) = df(u)/du = d/du (1 + e^{-u})^{-1}
      = e^{-u} / (1 + e^{-u})^2                         (using the chain rule)
      = [1/(1 + e^{-u})] * [e^{-u}/(1 + e^{-u})]
      = [1/(1 + e^{-u})] * [(1 + e^{-u} - 1)/(1 + e^{-u})]
      = [1/(1 + e^{-u})] * [1 - 1/(1 + e^{-u})]
      = f(u)(1 - f(u))

Thus f'(u) = df(u)/du = f(u)(1 - f(u)).

http://link.springer.com/chapter/10.1007%2F3-540-59497-3_175#page-1
http://mathworld.wolfram.com/SigmoidFunction.html
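A quick MATLAB sketch that checks this identity numerically with a central finite difference:

f = @(u) 1./(1 + exp(-u));          % sigmoid
u = -2:0.5:2;  h = 1e-6;
numeric  = (f(u+h) - f(u-h))/(2*h); % finite-difference estimate of f'(u)
analytic = f(u).*(1 - f(u));        % f(u)(1 - f(u))
max(abs(numeric - analytic))        % round-off small, confirming f' = f(1-f)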
Alternative derivation (for the output layer, in each neuron)

For the n-th sample, at the output layer:

  E = (1/2)(t_n - y_n)^2 = (1/2)(t_n - f(u_n))^2   ---(iii)

where y_n = f(u_n) is the current output and t_n is the true target (teacher).

From (iii):
  ∂E/∂y_n = -(t_n - y_n)  and  ∂y_n/∂u_n = f'(u_n)
so the sensitivity is
  δ = ∂E/∂u_n = -(t_n - y_n) f'(u_n)
Since u = Σ_i w(i)x(i) + b, we have ∂u/∂b = 1, hence

  ∂E/∂b = -(t_n - y_n) f'(u_n)   ---(ii)

and for each input x and weight w,

  ∂E/∂w = -(t_n - y_n) f'(u_n) x   ---(iv)

If we want E to decrease every learning cycle, make Δw = -η(∂E/∂w); do it slowly with a small +ve learning factor η (this is the gradient descent method, see the next slide). For each learning phase a new w is calculated:

  w_new = w_old + Δw = w_old - η(∂E/∂w)   ---(v)

By the same argument, using eq. (ii): b_new = b_old - η δ.

At the output (last) layer, with t = target (teacher) and y = output,

  δ^{l=L} = (y_n - t_n) f'(u^{l=L})

and the error is then back-propagated to the previous layer.
BPNN example in MATLAB
Based on "Neural Network for Pattern Recognition - Tutorial":
http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
Example: a simple BPNN
• Number of classes (number of output neurons) = 3
• Input: 9 pixels; each input is a 3x3 image
• Training samples = 3 for each class
• Number of hidden layers = 1
• Number of neurons in the hidden layer = 5
Display of testing patterns
Figure: the nine 3x3 testing patterns, three per class.
Architecture
Input: P = 9x1, indexed by i. Hidden layer A1: 5 neurons indexed by j; W^{l=1} = 9x5, b^{l=1} = 5x1. Output layer A2: 3 output neurons indexed by k; W^{l=2} = 5x3, b^{l=2} = 3x1. S1 is generated at layer l=1, S2 at layer l=2.

Hidden neuron j=1 (bias b1(j=1)) receives P(i=1), …, P(i=9) through weights W^{l=1}(i=1,j=1), …, W^{l=1}(i=9,j=1):

  A1(j=1) = 1 / (1 + e^{-( W^{l=1}(i=1,j=1)P(i=1) + W^{l=1}(i=2,j=1)P(i=2) + … + b1(j=1) )})

Output neuron k=1 (bias b2(k=1)) receives A1(j=1), …, A1(j=5) through weights W^{l=2}(j=1,k=1), …, W^{l=2}(j=5,k=1):

  A2(k=1) = 1 / (1 + e^{-( W^{l=2}(j=1,k=1)A1(j=1) + W^{l=2}(j=2,k=1)A1(j=2) + … + b2(k=1) )})
%source : http://www.mathworks.com/matlabcentral/fileexchange/19997-neural-network-for-pattern-recognition-tutorial
% clear memory (comments added by kh wong)
clear all
clc
nump=3; % number of classes
n=3;    % number of images per class
% training images reshaped into columns in P
% image size (3x3) reshaped to (1x9)

% training images
P=[196 35 234 232 59 244 243 57 226; ...
   188 15 236 244 44 228 251 48 230; ... % class 1
   246 48 222 225 40 226 208 35 234; ...

   255 223 224 255 0 255 249 255 235; ...
   234 255 205 251 0 251 238 253 240; ... % class 2
   232 255 231 247 38 246 190 236 250; ...

   25 53 224 255 15 25 249 55 235; ...
   24 25 205 251 10 25 238 53 240; ...    % class 3
   22 35 231 247 38 24 190 36 250]';

% testing images
N=[208 16 235 255 44 229 236 34 247; ...
   245 21 213 254 55 252 215 51 249; ... % class 1
   248 22 225 252 30 240 242 27 244; ...

   255 241 208 255 28 255 194 234 188; ...
   237 243 237 237 19 251 227 225 237; ... % class 2
   224 251 215 245 31 222 233 255 254; ...

   25 21 208 255 28 25 194 34 188; ...
   27 23 237 237 19 21 227 25 237; ...    % class 3
   24 49 215 245 31 22 233 55 254]';

% Normalization
P=P/256;
N=N/256;
% display the training images
figure(1),
for i=1:n*nump
    im=reshape(P(:,i), [3 3]);
    % remove the line below to reflect the true data input
    % im=imresize(im,20); % resize the image to make it clear
    subplot(nump,n,i),imshow(im); ...
        title(strcat('Train image/Class #', int2str(ceil(i/n))))
end
% display the testing images
figure,
for i=1:n*nump
    im=reshape(N(:,i), [3 3]);
    % remove the line below to reflect the true data input
    % im=imresize(im,20); % resize the image to make it clear
    subplot(nump,n,i),imshow(im);title(strcat('test image #', int2str(i)))
end
% targets
T=[ 1 1 1 0 0 0 0 0 0
    0 0 0 1 1 1 0 0 0
    0 0 0 0 0 0 1 1 1 ];

S1=5; % number of hidden-layer neurons
S2=3; % number of output neurons (= number of classes)

[R,Q]=size(P);
epochs = 10000;   % number of iterations
goal_err = 10e-5; % goal error
a=0.3;            % define the range of random variables
b=-0.3;
W1=a + (b-a) *rand(S1,R);  % weights between input and hidden neurons
W2=a + (b-a) *rand(S2,S1); % weights between hidden and output neurons
b1=a + (b-a) *rand(S1,1);  % biases of the hidden neurons
b2=a + (b-a) *rand(S2,1);  % biases of the output neurons
n1=W1*P;
A1=logsig(n1); % feedforward the first time
n2=W2*A1;
A2=logsig(n2); % feedforward the first time
e=A2-T;        % actually e=T-A2 in the main loop
error =0.5* mean(mean(e.*e)); % better to say e=T-A2, but no harm to error here
nntwarn off
for itr =1:epochs
    if error <= goal_err
        break
    else
        for i=1:Q % i indexes a column in P (9x9); each column P(:,i)
                  % is a training sample image: 9 training samples, 3 per class
            % A1=5x9: outputs of the hidden layer and inputs to the output layer
            % A2=3x9: outputs of the output layer
            % T=true class; each column in T is for 1 training sample
            % hidden_layer = 1, output_layer = 2
            df1=dlogsig(n1,A1(:,i)); % df1 is 5x1, for the 5 neurons in the hidden layer
            df2=dlogsig(n2,A2(:,i)); % df2 is 3x1, for the output neurons
            % s2 is sigma2 = sensitivity2 from the output layer, equation (2)
            s2 = -1*diag(df2) * e(:,i); % e=T-A2; df2=f'=f(1-f) of layer 2
            % s1 = 5x1
            s1 = diag(df1)* W2'* s2; % eq (3), feedback from s2 to s1
            % dW = -eta*s2*x in the slides; eta=0.1, s2 is found, x is A1

            % W2 is 3x5: each output neuron receives
            % 5 inputs from the 5 hidden neurons; update W2
            % sigma2 = s2 = -1*diag(df2)*e(:,i); e=T-A2; df2=f'=f(1-f) of layer 2
            % delta_W2 = -learning_rate*sigma2*input_to_output_layer
            % delta_W2 = -0.1*sigma2*A1
            W2 = W2-0.1*s2*A1(:,i)'; % learning rate=0.1, eq (2), output case
            % 3x5 = 3x5 - (3x1*1x5)
            % A1 = 5 hidden neuron outputs (5 hidden neurons)
            % A1(:,i)' = 1x5 = outputs of the hidden layer

            b2 = b2-0.1*s2; % bias update
            % 3x1 = 3x1 - 3x1
            % P(:,i)' = 1x9 = input to the hidden layer
            % s1 = 5x1 because each hidden node has 1 sensitivity (sigma)
            W1 = W1-0.1*s1*P(:,i)'; % update W1 in layer 1, see eq (3), hidden case
            % 5x9 = 5x9 - (5x1*1x9); P is 9x9 and, for an i, P(:,i)' = 1x9
            b1 = b1-0.1*s1; % bias update
            % 5x1 = 5x1 - 5x1

            A1(:,i)=logsig(W1*P(:,i)+b1); % forward pass again
            % 5x1 = 5x1
            A2(:,i)=logsig(W2*A1(:,i)+b2); % forward pass again
            % 3x1 = 3x1
        end
        e = T - A2; % for this e, put a -ve sign when finding s2
        error =0.5*mean(mean(e.*e));
        disp(sprintf('Iteration :%5d mse :%12.6f',itr,error));
        mse(itr)=error;
    end
end
threshold=0.9; % threshold of the system (higher threshold = more accuracy)

% training images result
%TrnOutput=real(A2)
TrnOutput=real(A2>threshold)

% applying test images to the NN; TESTING BEGINS HERE
n1=W1*N;
A1=logsig(n1);
n2=W2*A1;
A2test=logsig(n2);

% testing images result
%TstOutput=real(A2test)
TstOutput=real(A2test>threshold)

% recognition rate
wrong=size(find(TstOutput-T),1);
recognition_rate=100*(size(N,2)-wrong)/size(N,2)
% end of code
Result of the program
Figure: mse error vs. itr (epoch iteration).
Appendix: architecture of our demo program (exercise 3): write the formulas for A1(j=4) and A2(k=3). How many inputs, hidden neurons, outputs, and weights in each layer?

Input: P = 9x1, indexed by i. Hidden layer A1: 5 neurons indexed by j; W^{l=1} = 9x5, b^{l=1} = 5x1. Output layer A2: 3 output neurons indexed by k; W^{l=2} = 5x3, b^{l=2} = 3x1. S1 is generated at layer l=1, S2 at layer l=2.

Hidden neuron j=1 (bias b1(j=1)) receives P(i=1), …, P(i=9) through weights W^{l=1}(i=1,j=1), …, W^{l=1}(i=9,j=1):

  A1(j=1) = 1 / (1 + e^{-( W^{l=1}(i=1,j=1)P(i=1) + W^{l=1}(i=2,j=1)P(i=2) + … + b1(j=1) )})

Output neuron k=1 (bias b2(k=1)) receives A1(j=1), …, A1(j=5) through weights W^{l=2}(j=1,k=1), …, W^{l=2}(j=5,k=1):

  A2(k=1) = 1 / (1 + e^{-( W^{l=2}(j=1,k=1)A1(j=1) + W^{l=2}(j=2,k=1)A1(j=2) + … + b2(k=1) )})
Answer (exercise 3): write the values for A1(j=4) and A2(k=3)
• P = [0.7656 0.7344 0.9609 0.9961 0.9141 0.9063 0.0977 0.0938 0.0859] % each entry is P(i), i=1,2,3,…
• W^{l=1} = [0.2112 0.1540 -0.0687 -0.0289 0.0720 -0.1666 0.2938 -0.0169 -0.1127] % each entry is W^{l=1}(i, j=4), i=1,2,3,…
• b^{l=1} = 0.1441 % for neuron j=4
• Find A1(j=4):
  A1(j=4) = 1/(1 + exp(-(W^{l=1}·P + b^{l=1}))) = 0.49
• How many inputs, hidden neurons, outputs, weights and biases in each layer?
• Answer: inputs = 9, hidden neurons = 5, outputs = 3; weights in the hidden layer (layer 1) = 9x5, weights in the output layer (layer 2) = 5x3; 5 biases in the hidden layer (layer 1), 3 biases in the output layer (layer 2).
• The 4th hidden neuron is A1(j=4):

  A1(j=4) = 1 / (1 + e^{-( W^{l=1}(i=1,j=4)P(i=1) + W^{l=1}(i=2,j=4)P(i=2) + … + b1(j=4) )})