Partha Pratim Deb, M.Tech (CSE), 1st Year
Netaji Subhash Engineering College
1. Biological inspiration vs. artificial neural network
2. Why Use Neural Networks?
3. Neural network applications
4. Learning strategy & Learning techniques
5. Generalization types
6. Artificial neurons
7. MLP neural networks and tasks
8. Learning mechanism used by the multilayer perceptron
9. Activation functions
10. Multi-Layer Perceptron example for approximation
The McCulloch-Pitts model of neurotransmission
Learning strategy
1. Supervised learning
2. Unsupervised learning
[Figure: training examples of two classes, labeled A and B]
Supervised learning is based on a labeled training set:
The class of each piece of data in the training set is known.
Class labels are pre-determined and provided in the training phase.
[Figure: labeled data points A and B assigned to known classes]
Supervised learning:
Tasks performed: Classification, Pattern Recognition
NN models: Perceptron, Feed-forward NN
("class of data is defined here")
Unsupervised learning:
Task performed: Clustering
NN model: Self-Organizing Maps
("class of data is not defined here")
Generalization types:
1. Linear
2. Nonlinear
Nonlinear generalization of the McCulloch-Pitts neuron:

   y = f(x, w)

Sigmoidal neuron:
   y = 1 / (1 + exp(-w^T x - a))

Gaussian neuron:
   y = exp(-||x - w||^2 / (2a^2))
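As a minimal sketch of these two neuron models (the function names and array shapes are my own, not from the slides):

```python
import numpy as np

def sigmoidal_neuron(x, w, a):
    # y = 1 / (1 + exp(-w^T x - a))
    return 1.0 / (1.0 + np.exp(-np.dot(w, x) - a))

def gaussian_neuron(x, w, a):
    # y = exp(-||x - w||^2 / (2 a^2))
    diff = np.asarray(x) - np.asarray(w)
    return np.exp(-np.dot(diff, diff) / (2.0 * a ** 2))
```

When the sigmoidal neuron's net input w^T x + a is zero its output is 0.5, and the Gaussian neuron outputs its maximum of 1 when x coincides with its centre w.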
MLP = multi-layer perceptron

Perceptron:
   y_out = w^T x

MLP neural network (two sigmoidal layers followed by a linear output):

First (hidden) layer, k = 1, 2, 3:
   y^1_k = 1 / (1 + exp(-(w^1_k)^T x - a^1_k)),   y^1 = (y^1_1, y^1_2, y^1_3)

Second layer, k = 1, 2:
   y^2_k = 1 / (1 + exp(-(w^2_k)^T y^1 - a^2_k)),   y^2 = (y^2_1, y^2_2)

Output:
   y_out = w^T y^2 = w_1 y^2_1 + w_2 y^2_2
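A forward pass through this 3-unit / 2-unit / linear-output architecture can be sketched as follows (the matrix shapes and function names are assumptions for illustration, not from the slides):

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def mlp_forward(x, W1, a1, W2, a2, w_out):
    # First sigmoidal layer: y^1_k = sigmoid(w^1_k . x + a^1_k), k = 1..3
    y1 = sigmoid(W1 @ x + a1)
    # Second sigmoidal layer: y^2_k = sigmoid(w^2_k . y^1 + a^2_k), k = 1..2
    y2 = sigmoid(W2 @ y1 + a2)
    # Linear output unit: y_out = w . y^2
    return w_out @ y2
```

Each row of W1 (shape 3x2) and W2 (shape 2x3) holds the weight vector of one unit, matching the per-unit equations above.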
• control
• classification
• prediction
• approximation
These can be reformulated in general as FUNCTION APPROXIMATION tasks.
Approximation: given a set of values of a function g(x), build a neural network that approximates g(x) for any input x.
The activation function is used to curve (squash) the neuron's net input, so the output captures nonlinear variation in the data.
Sigmoidal (logistic) function, common in MLPs.
Note: when the net input is 0, f = 0.5.

   g(a_i(t)) = 1 / (1 + exp(-k a_i(t)))

where k is a positive constant. The sigmoidal function gives a value in the range 0 to 1. Alternatively one can use tanh(ka), which has the same shape but a range of -1 to 1.
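As a quick numeric sanity check of these two activations (the identity tanh(ka) = 2·g(2a) − 1, for the same k, is a standard fact about the logistic function, not something stated on the slides):

```python
import math

def g(a, k=1.0):
    # Sigmoidal (logistic) activation: output in (0, 1); g(0) = 0.5.
    return 1.0 / (1.0 + math.exp(-k * a))

# tanh(k a) has the same shape but range (-1, 1); it is a rescaled logistic:
#   tanh(k a) = 2 * g(2a) - 1   (same k)
for a in (-2.0, 0.0, 0.7, 3.0):
    assert abs(math.tanh(a) - (2.0 * g(2.0 * a) - 1.0)) < 1e-12

assert g(0.0) == 0.5  # when the net input is 0, f = 0.5
```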
Input-output function of a neuron (rate coding assumption)
Multi-Layer Perceptron example for approximation
Algorithm (sequential)
1. Apply an input vector and calculate all activations, a and u.
2. Evaluate δ_i for all output units via:
      δ_i(t) = (d_i(t) - y_i(t)) g'(a_i(t))
   (Note the similarity to the perceptron learning algorithm.)
3. Backpropagate the δs to get error terms for the hidden layers using:
      δ_i(t) = g'(u_i(t)) Σ_k δ_k(t) w_ki
4. Evaluate weight changes using:
      w_ij(t+1) = w_ij(t) + η δ_i(t) z_j(t)
      v_ij(t+1) = v_ij(t) + η δ_i(t) x_j(t)
Here a simple identity activation function is used in an example to show how the neural network works.
Once the weight changes are computed for all units, the weights are updated at the same time (biases are included as weights here). An example:
Network: inputs x1, x2; hidden units z1, z2; outputs y1, y2.
First-layer weights: v11 = -1, v21 = 0, v12 = 0, v22 = 1; biases v10 = 1, v20 = 1.
Second-layer weights: w11 = 1, w21 = -1, w12 = 0, w22 = 1.
Use the identity activation function (i.e. g(a) = a).
Input is [0 1] with target [1 0].
All biases are set to 1 (not drawn, for clarity).
Learning rate = 0.1
Forward pass. Calculate the 1st-layer activations (x1 = 0, x2 = 1):
   u1 = -1×0 + 0×1 + 1 = 1
   u2 = 0×0 + 1×1 + 1 = 2
Calculate the first-layer outputs by passing the activations through the activation function:
   z1 = g(u1) = 1
   z2 = g(u2) = 2
Calculate the 2nd-layer outputs (weighted sums through the activation function):
   y1 = a1 = w11 z1 + w12 z2 + 1 = 1×1 + 0×2 + 1 = 2
   y2 = a2 = w21 z1 + w22 z2 + 1 = -1×1 + 1×2 + 1 = 2
Backward pass. The target is [1, 0], so d1 = 1 and d2 = 0. So:
   δ1 = (d1 - y1) = 1 - 2 = -1
   δ2 = (d2 - y2) = 0 - 2 = -2
Calculate the weight changes for the output (2nd) layer (cf. perceptron learning), using z1 = 1, z2 = 2:
   δ1 z1 = -1×1 = -1
   δ1 z2 = -1×2 = -2
   δ2 z1 = -2×1 = -2
   δ2 z2 = -2×2 = -4
With learning rate 0.1, the new 2nd-layer weights will be:
   w11 = 1 + 0.1×(-1) = 0.9
   w21 = -1 + 0.1×(-2) = -1.2
   w12 = 0 + 0.1×(-2) = -0.2
   w22 = 1 + 0.1×(-4) = 0.6
But first we must calculate the hidden-layer δ's. The output δ's propagate back through the old weights:
   δ1 w11 = -1×1 = -1
   δ2 w21 = -2×(-1) = 2
   δ1 w12 = -1×0 = 0
   δ2 w22 = -2×1 = -2
The δ's propagate back, giving the hidden-layer error terms:
   δ1 (hidden) = δ1 w11 + δ2 w21 = -1 + 2 = 1
   δ2 (hidden) = δ1 w12 + δ2 w22 = 0 - 2 = -2
These are then multiplied by the inputs (x1 = 0, x2 = 1):
   δ1 x1 = 1×0 = 0
   δ1 x2 = 1×1 = 1
   δ2 x1 = -2×0 = 0
   δ2 x2 = -2×1 = -2
Finally, change the 1st-layer weights (learning rate 0.1):
   v11 = -1 + 0.1×0 = -1
   v21 = 0 + 0.1×0 = 0
   v12 = 0 + 0.1×1 = 0.1
   v22 = 1 + 0.1×(-2) = 0.8
Note that the weights multiplied by the zero input are unchanged, as they do not contribute to the error.
The biases have also been changed (not shown).
Now go forward again (normally a new input vector would be used). With the updated weights and biases, the 1st-layer outputs are:
   z1 = 1.2
   z2 = 1.6
The 2nd-layer outputs are then:
   y1 = 1.66
   y2 = 0.32
The outputs are now closer to the target value [1, 0].
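The whole worked example can be reproduced in a few lines of NumPy (variable names mirror the slides; the vectorized form of steps 1-4 is my own phrasing of the update rules):

```python
import numpy as np

# Layer weights: row i holds the weights into unit i; biases kept separately.
V  = np.array([[-1.0, 0.0], [0.0, 1.0]])   # v11 v12 / v21 v22
v0 = np.array([1.0, 1.0])
W  = np.array([[1.0, 0.0], [-1.0, 1.0]])   # w11 w12 / w21 w22
w0 = np.array([1.0, 1.0])
x, d, eta = np.array([0.0, 1.0]), np.array([1.0, 0.0]), 0.1

# Forward pass (identity activation, so z = u and y = a).
z = V @ x + v0                   # [1, 2]
y = W @ z + w0                   # [2, 2]

# Backward pass: output deltas, then backpropagated hidden deltas
# (computed with the OLD weights, before any update).
delta_out = d - y                # [-1, -2]
delta_hid = W.T @ delta_out      # [1, -2]

# Simultaneous weight (and bias) updates.
W += eta * np.outer(delta_out, z);  w0 += eta * delta_out
V += eta * np.outer(delta_hid, x);  v0 += eta * delta_hid

# Forward again with the new weights.
z = V @ x + v0                   # [1.2, 1.6]
y = W @ z + w0                   # [1.66, 0.32] -- closer to the target [1, 0]
```

Note that `delta_hid` must be computed before `W` is overwritten, matching the slides' use of the old weights during backpropagation.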
Neural network applications

Pattern classification:
• Remote sensing and image classification
• Handwritten character/digit recognition

Control, time series, estimation:
• Machine control / robot manipulation
• Financial/scientific/engineering time-series forecasting

Optimization:
• Travelling salesperson problem
• Multiprocessor scheduling and task assignment

Real-world application examples:
• Hospital patient stay-length prediction
• Natural gas price prediction
• Artificial neural networks are inspired by the learning processes that take place in biological systems.
• Learning can be perceived as an optimisation process.
• Biological neural learning happens by the modification of the synaptic strength. Artificial neural networks learn in the same way.
• The synapse strength modification rules for artificial neural networks can be derived by applying mathematical optimisation methods.
• Learning tasks of artificial neural networks = function approximation tasks.
• The optimisation is done with respect to the approximation error measure.
• In general it is enough to have a single hidden layer neural network (MLP, RBF or other) to learn the approximation of a nonlinear function. In such cases general optimisation can be applied to find the change rules for the synaptic weights.
1. Haykin, S., Neural Networks: A Comprehensive Foundation.
2. Yegnanarayana, B., Artificial Neural Networks.
3. Zurada, J. M., Introduction to Artificial Neural Systems.
4. Hornik, K., Stinchcombe, M. and White, H., "Multilayer feedforward networks are universal approximators", Neural Networks, vol. 2, no. 5, pp. 359-366, 1989.
5. Kumar, P. and Walia, E. (2006), "Cash Forecasting: An Application of Artificial Neural Networks in Finance", International Journal of Computer Science and Applications, 3(1): 61-77.