1
Pattern Recognition: Statistical and Neural
Lonnie C. Ludeman
Lecture 23
Nov 2, 2005
Nanjing University of Science & Technology
2
Happy Halloween
3
Lecture 23 Topics
1. Review the Backpropagation Algorithm
2. Give Preliminary Assignments for design of Artificial Neural Networks.
3. Discuss the flow diagram for the Backpropagation algorithm in more detail.
4. Answer Frequently Asked Questions (FAQ)
5. Specify some Guidelines for selection of parameters and structure
4
Backpropagation Training Algorithm for Feedforward Neural networks
5
Input pattern sample xk
6
Calculate Outputs First Layer
7
Calculate Outputs Second Layer
8
Calculate Outputs Last Layer
9
Check Performance
Over all Samples Error (each term of the sum is a Single Sample Error):
$E_{TOTAL}(p) = \sum_{i=0}^{N_s-1} \tfrac{1}{2}\left( d[x(p-i)] - f\left( w^T(p-i)\,x(p-i) \right) \right)^2$
Can be computed recursively:
$E_{TOTAL}(p+1) = E_{TOTAL}(p) + E_{p+1}(p+1) - E_{p-N_s}(p-N_s)$
10
Change Weights Last Layer using Rule #1
11
Change Weights previous Layer using Rule #2
12
Change Weights previous Layer using Modified Rule #2
13
Input pattern sample xk+1
Continue Iterations Until
14
Repeat the process until performance is satisfied or the maximum number of iterations is reached.
If performance is not satisfied at the maximum number of iterations, the algorithm stops and NO design is obtained.
If performance is satisfied, then the current weights and structure provide the required design.
15
Freeze Weights to get Acceptable Neural Net Design
16
Definitions
Convergence: Occurs when the tolerance parameter has been satisfied, thus giving an acceptable design.
Failure to Converge: Occurs when the tolerance level is not satisfied in NMAX iterations, thus giving no design.
Lock Up: Occurs when the algorithm produces saturated outputs that will only allow very small corrections to the weights.
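The lock-up definition can be illustrated numerically. The sketch below assumes a tanh nonlinearity and the usual gradient-descent form of backpropagation, in which each weight correction is scaled by the slope of the nonlinearity; the function name `tanh_slope` is hypothetical.

```python
import math

def tanh_slope(net):
    """Slope of tanh at the net input: 1 - tanh(net)**2."""
    return 1.0 - math.tanh(net) ** 2

# As the net input grows, the output saturates near 1 and the slope
# (and with it any slope-scaled weight correction) collapses toward 0.
for net in (0.0, 2.0, 10.0):
    print(f"net={net:5.1f}  output={math.tanh(net):.6f}  slope={tanh_slope(net):.2e}")
```

A saturated node therefore passes back almost no correction, which is the situation the definition calls lock up.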
17
Design Preliminary Assignments
(a) Select a feedforward neural network structure: number of layers, number of nodes per layer, types of nonlinearities
(b) Select learning and momentum parameters, acceptable error, and maximum number of iterations
(c) Construct a Training set with correct classification and normalize samples
(d) Select Training type: Supervised, By Sample or By Epoch
(e) Select Target values
18
Discussion of Backpropagation Training Algorithm for Feedforward Neural Networks
Training samples with classification
Computer Flow Diagram
Further Discussion of each step
19
Training Set: { (x(k) , dx(k) ), k=1,2,…,Ns }
Training Patterns
Desired output for each training pattern
20
Backpropagation Algorithm for Training Feedforward Artificial Neural Networks
Flow Chart
21
(Step 0) Initialization
Randomly assign the weights to all Layers:
(Method 1) Weights selected from [-1, 1]
(Method 2) Nguyen-Widrow Procedure
Set the iteration number p=1 and go to Step(1)
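A minimal sketch of Method 1 follows, assuming each layer is stored as a list of weight rows (one row per node) with one extra bias weight per node. The function name `init_weights`, the storage layout and the bias column are illustrative assumptions, not part of the lecture.

```python
import random

def init_weights(layer_sizes, seed=None):
    """Draw every weight uniformly from [-1, 1] (Method 1).

    layer_sizes = [inputs, nodes in layer 1, ..., nodes in last layer].
    Each node gets one weight per input plus one bias weight (assumed).
    """
    rng = random.Random(seed)
    weights = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        layer = [[rng.uniform(-1.0, 1.0) for _ in range(n_in + 1)]
                 for _ in range(n_out)]
        weights.append(layer)
    return weights

W = init_weights([2, 3, 1], seed=0)  # 2 inputs, 3 hidden nodes, 1 output
p = 1                                # iteration number, as in Step 0
```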
22
(Step 1) Select one of the training samples
Method 1- Fixed order: Define the order in which the training samples are drawn and keep that order on every pass through the data.
Method 2- Multiple Fixed orders: Use different specific orderings on different passes through the data.
Method 3- Fully Random Order: Randomly select the order on each pass through the data.
Using one of the methods above select a training sample x(p) which includes the pattern vector and the class assignment.
Then go to Step (2)
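The three ordering methods can be sketched as a generator of sample indices. The function name `sample_order` and the particular choice of alternating forward and reverse orders for Method 2 are assumptions made only for illustration.

```python
import random

def sample_order(n_samples, n_passes, method, rng=None):
    """Yield training-sample indices pass by pass for the chosen method."""
    rng = rng or random.Random()
    base = list(range(n_samples))
    if method == "fixed":                 # Method 1: same order every pass
        for _ in range(n_passes):
            yield from base
    elif method == "multiple_fixed":      # Method 2: cycle through preset orders
        orders = [base, base[::-1]]       # assumed pair of fixed orders
        for k in range(n_passes):
            yield from orders[k % len(orders)]
    elif method == "random":              # Method 3: reshuffle on each pass
        for _ in range(n_passes):
            order = base[:]
            rng.shuffle(order)
            yield from order
    else:
        raise ValueError(f"unknown method: {method}")

print(list(sample_order(3, 2, "fixed")))
print(list(sample_order(3, 2, "multiple_fixed")))
```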
23
(Step 2) Calculate outputs of each layer
Using selected sample x(p) and weights matrices for each layer W(k)(p), k=1 to NL
calculate outputs y(1)(p) of all nodes in layer 1,
and then
calculate all outputs in successive layers y(k)(p), k = 2, 3, …, NL.
Store all nodal outputs in memory.
Then Go To step(3)
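Step 2 can be sketched as follows, assuming tanh nonlinearities in every layer and weight rows that carry a trailing bias weight. Both choices, and the function name `forward`, are assumptions for illustration.

```python
import math

def forward(x, weights):
    """Return the stored outputs of every layer for one input pattern x.

    weights[k] is the matrix for layer k+1: one row per node, with the
    bias weight as the last entry of each row (assumed layout).
    """
    outputs = []
    y = list(x)
    for W in weights:
        y_aug = y + [1.0]  # append constant input for the bias weight
        y = [math.tanh(sum(w * v for w, v in zip(row, y_aug))) for row in W]
        outputs.append(y)  # keep every layer's outputs for the weight updates
    return outputs

layers = [[[0.5, -0.5, 0.0]],  # layer 1: one node, two inputs plus bias
          [[1.0, 0.0]]]        # layer 2: one node, one input plus bias
print(forward([1.0, 0.0], layers))
```

Keeping the full list of layer outputs, rather than only the last one, matches the instruction to store all nodal outputs in memory.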
24
(Step 3) Calculation of performance
Usually use the following Approximation to reduce computational complexity:
$E_{TOTAL}(p) = \sum_{i=0}^{N_s-1} \tfrac{1}{2}\left( d[x(p-i)] - f\left( w^T(p-i)\,x(p-i) \right) \right)^2$
Each term of the sum is a Single Sample Error; the sum is the Over all Samples Error.
25
This approximation can be computed recursively, taking care to avoid roundoff errors, by
$E_{TOTAL}(p+1) = E_{TOTAL}(p) + E_{p+1}(p+1) - E_{p-N_s}(p-N_s)$
When the error gets small this approximation becomes pretty good!
Then GO TO Step (4)
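The recursive computation can be sketched as a sliding window over the most recent Ns single-sample errors: the newest error is added and the oldest one is dropped. The class name `RunningTotalError`, the scalar-output assumption and the periodic exact re-summation used to limit roundoff drift are all assumptions; the roundoff remedy intended by the slide is not given here.

```python
import math
from collections import deque

class RunningTotalError:
    """Running sum of the last n_samples single-sample errors."""

    def __init__(self, n_samples, resum_every=1000):
        self.n_samples = n_samples
        self.resum_every = resum_every   # assumed roundoff safeguard
        self.window = deque()
        self.total = 0.0
        self.updates = 0

    def update(self, desired, actual):
        e = 0.5 * (desired - actual) ** 2      # single-sample error
        self.window.append(e)
        self.total += e                        # add the newest error
        if len(self.window) > self.n_samples:
            self.total -= self.window.popleft()  # drop the oldest error
        self.updates += 1
        if self.updates % self.resum_every == 0:
            self.total = math.fsum(self.window)  # exact re-sum of the window
        return self.total

tracker = RunningTotalError(n_samples=2)
print(tracker.update(1.0, 0.0))
print(tracker.update(1.0, 1.0))
print(tracker.update(-1.0, 1.0))
```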
26
(Step 4) Branch on Error Tolerance
If performance is not satisfied, that is
ETOTAL(p) > ε (acceptable total error),
then GO TO Step (5)
If performance is satisfied then the current weights and structure provide the
required design and the algorithm is stopped.
27
(Step 5) Branch on Maximum number of Iterations
If p = NMAX
then the algorithm stops
and NO acceptable design is obtained.
(This does not mean that an acceptable design is not possible, only that it was not found.)
If p < NMAX
then GO TO Step (6) (algorithm continues)
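Steps 4 and 5 together amount to a three-way decision, sketched below. The function name `next_action` and the returned strings are hypothetical; the example values 0.05 and 10,000 are the nominal tolerance and NMAX quoted in the guidelines.

```python
def next_action(e_total, p, eps, n_max):
    """Decide what the training loop does after the performance check."""
    if e_total <= eps:   # Step 4: tolerance satisfied, freeze the weights
        return "converged"
    if p >= n_max:       # Step 5: out of iterations, no design obtained
        return "failed"
    return "continue"    # go on to Step 6 and update the weights

print(next_action(e_total=0.20, p=10, eps=0.05, n_max=10_000))
print(next_action(e_total=0.01, p=10, eps=0.05, n_max=10_000))
print(next_action(e_total=0.20, p=10_000, eps=0.05, n_max=10_000))
```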
28
(Step 6) Calculation of weight changes using weight update equations
Calculate the weight changes and new weights for the last (L)th layer using the outputs from step (3) and the current input sample x(p) using Rule #1 for each nonlinearity.
Continue calculating new weights in reverse order for the preceding L-1 layers, back to the first layer, using Rule #2 and modified Rule #2.
If training by epoch, the weight changes are accumulated for each sample through one entire pass through the data before the weights are changed.
Go to Step (7)
29
(Step 7) Update the weight matrices
Update all weight matrices for all layers to get new weight matrices W(k)(p+1)
Then go to step (1)
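Rule #1 and Rule #2 are not restated in this lecture, so the sketch below substitutes the standard delta rule with momentum for a two-layer tanh network with a single output node, trained by sample. The function name `train_step`, the weight layout with trailing bias weights, and the defaults η = 0.8 and m = 0.3 (the nominal values from the guidelines) are assumptions; this is not necessarily the exact form of the lecture's rules.

```python
import math

def train_step(x, d, W1, W2, prev1, prev2, eta=0.8, m=0.3):
    """One by-sample update; returns new weights, the changes, and the error."""
    x_aug = list(x) + [1.0]
    h = [math.tanh(sum(w * v for w, v in zip(row, x_aug))) for row in W1]
    h_aug = h + [1.0]
    y = math.tanh(sum(w * v for w, v in zip(W2, h_aug)))
    # Output-layer delta: error times the tanh slope at the output.
    delta_out = (d - y) * (1.0 - y * y)
    # Hidden-layer deltas: back-propagated through the old output weights.
    delta_h = [delta_out * W2[j] * (1.0 - h[j] ** 2) for j in range(len(h))]
    # Weight change = eta * delta * input + m * previous change (momentum).
    dW2 = [eta * delta_out * v + m * pv for v, pv in zip(h_aug, prev2)]
    dW1 = [[eta * dj * v + m * pv for v, pv in zip(x_aug, prow)]
           for dj, prow in zip(delta_h, prev1)]
    new_W2 = [w + dw for w, dw in zip(W2, dW2)]
    new_W1 = [[w + dw for w, dw in zip(row, drow)]
              for row, drow in zip(W1, dW1)]
    return new_W1, new_W2, dW1, dW2, 0.5 * (d - y) ** 2

W1 = [[0.1, -0.2, 0.0], [0.3, 0.1, 0.0]]   # two hidden nodes, bias last
W2 = [0.2, -0.1, 0.0]                      # one output node, bias last
prev1 = [[0.0] * 3 for _ in W1]
prev2 = [0.0] * 3
errors = []
for _ in range(20):                        # repeat on one sample
    W1, W2, prev1, prev2, e = train_step([0.5, -0.5], 0.9, W1, W2, prev1, prev2)
    errors.append(e)
print(errors[0], errors[-1])
```

For training by epoch, the changes `dW1` and `dW2` would instead be summed over a full pass through the data and applied once at the end of the pass.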
30
Questions Concerning Neural Net Design
1. How do we select the Training parameters and structure of the NN? See later discussion.
2. Why do we Normalize the Training Samples? To avoid saturation of the nonlinearities.
3. What effect does sample order have on training? It can possibly speed up convergence.
31
4. Should we train by epoch or by sample? Training by sample may speed up convergence (but not significantly).
5. How do we select the Training Pattern set? Training set patterns should be representative of all patterns in each of the classes.
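The answer to question 2 can be made concrete with one possible normalization: shifting each feature to zero mean and scaling it to unit standard deviation. The lecture does not specify a scheme, so this choice and the function name `normalize` are assumptions.

```python
import statistics

def normalize(samples):
    """Scale each feature (column) to zero mean and unit standard deviation."""
    columns = list(zip(*samples))
    means = [statistics.fmean(c) for c in columns]
    # Guard against constant features, whose deviation would be zero.
    stds = [statistics.pstdev(c) or 1.0 for c in columns]
    return [[(v - mu) / s for v, mu, s in zip(row, means, stds)]
            for row in samples]

raw = [[100.0, 2.0], [200.0, 4.0]]
print(normalize(raw))
```

Without scaling, a feature of size 100 or more would drive a tanh node deep into saturation for any weight of ordinary size, leaving almost no slope for weight corrections.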
32
Guidelines for the “Art” of Neural Net Design
33
Guidelines for Neural Net Design
1. Choice of training parameter η
usually 0 < η < 10
η too big could cause lockup
higher η can give faster training
Nominal value η = 0.8
34
Guidelines for Neural Net Design
2. Choice of momentum parameter m
usually 0 < m < 1
m too big could cause lockup
higher m can give faster training
Nominal value m = 0.3
35
Guidelines for Neural Net Design
3. Choice of activation functions for neuron nonlinearities:
Tanh, Logistic, Linear, Step Function, Signum function, continuous, discrete
Bipolar continuous usually gives faster design with saturation at 1 and -1
Design relatively insensitive to type of nonlinearity.
Choose Tanh nonlinearities
36
Guidelines for Neural Net Design
4. Choice of Maximum number of Iterations NMAX
usually 1000 < NMAX < 10^8
No harm in selecting as large as possible to give time for the algorithm to converge. However too large of a value could mask unsuitable design parameters wasting resources and time!
Nominal value NMAX = 10,000
37
Guidelines for Neural Net Design
5. Choice of error tolerance ETOT
usually 10^-8 < ETOT < 0.5
ETOT too big could cause poor design and give poor generalization
ETOT too small can result in an excessive number of iterations and “grandmothering”
Nominal value ETOT = 0.05
38
Guidelines for Neural Net Design
6. Choice of number of Layers L
usually 0 < L < 5
L too big can cause slow convergence or no convergence, and results in an unnecessarily complex design if it does converge.
L too small may not give enough flexibility for the required design and thus not converge.
Nominal value L = 3
39
Guidelines for Neural Net Design
6. Choice of number of Layers L (cont.)
Approximation theory theorems imply that two layers are all that is needed, but the theorems only show that suitable nonlinear functions exist; they do not give them.
Our results on the Hyperplane-AND-OR structure give us reason to believe that three layers may be sufficient.
40
Guidelines for Neural Net Design
7. Choice of number nodes in each Layer
usually NL = 1 or NC (last layer)
Use NL = NC for a large number of classes (NC > 6). No need to use any more than the number of classes.
Use NL = 1 for a small number of classes. This gives enough dynamic range to discriminate between output values (sensitivity).
Nominal value NL = 1 or NC
41
Guidelines for Neural Net Design
8. Selection of Target values
Two class case- single output neuron
Bipolar Activation Nonlinearity
Unipolar Activation Nonlinearity
42
Selection of Target values (Continued)
Two class case- two output neurons
Bipolar Activation Nonlinearity
Unipolar Activation Nonlinearity
0.1
0.1
43
Selection of Target values (Continued)
K class case- one output neuron
t_i selected as the center of R_i
44
Selection of Target values (Continued)
K class case- K output neurons
Bipolar Activation Nonlinearity
Decision Rule
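The target values and decision rule for this slide did not survive in the text. A common convention, assumed here rather than taken from the lecture, is to set the target to the high value on the neuron of the true class and the low value on all others, and to decide for the class whose output neuron is largest. The function names and the default values +1 and -1 are hypothetical.

```python
def bipolar_targets(class_index, n_classes, high=1.0, low=-1.0):
    """Target vector: `high` on the true class's neuron, `low` elsewhere."""
    return [high if k == class_index else low for k in range(n_classes)]

def decide(outputs):
    """Assumed decision rule: pick the class with the largest output."""
    return max(range(len(outputs)), key=lambda k: outputs[k])

print(bipolar_targets(1, 3))
print(decide([-0.7, 0.4, -0.9]))
```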
45
Summary Lecture 23
1. Review the Backpropagation Algorithm
2. Give Preliminary Assignments for design of Artificial Neural Networks.
3. Discuss the flow diagram for the Backpropagation algorithm in more detail.
4. Answer Frequently Asked Questions (FAQ)
5. Specify some Guidelines for selection of parameters and structure
46
A few Jack-O-Lanterns
47
End of Lecture 23