Neural Network and MLP


This presentation covers the basics of neural networks and how an MLP works, with a simple activation function example and diagrams.


Partha Pratim Deb, M.Tech (CSE), 1st year

Netaji Subhash Engineering College

1. Biological inspiration vs. artificial neural network

2. Why Use Neural Networks?

3. Neural network applications

4. Learning strategy & Learning techniques

5. Generalization types

6. Artificial neurons

7. MLP neural networks and tasks

8. Learning mechanism used by the multilayer perceptron

9. Activation functions

10. Multi-Layer Perceptron example for approximation

The McCulloch-Pitts model

(figure: biological neurotransmission)
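To make the model concrete, here is a minimal Python sketch (not part of the original slides) of a McCulloch-Pitts unit: it outputs 1 when the weighted sum of its binary inputs reaches a threshold, and 0 otherwise.

```python
# Minimal McCulloch-Pitts threshold unit (illustrative sketch, not from the slides).
def mcculloch_pitts(inputs, weights, threshold):
    """Fire (return 1) if the weighted sum of binary inputs reaches the threshold."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total >= threshold else 0

# Example: with unit weights and threshold 2, the unit behaves as an AND gate.
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, mcculloch_pitts(x, weights=(1, 1), threshold=2))
```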

Learning strategy

1. Supervised learning

2. Unsupervised learning

(figure: training data points from two classes, A and B)

It is based on a labeled training set.

The class of each piece of data in the training set is known.

Class labels are pre-determined and provided in the training phase.

(figure: the A and B data points grouped into their respective classes)

Task performed: Classification, Pattern Recognition

NN model: Perceptron, Feed-forward NN

"class of data is defined here"

Unsupervised learning:

Task performed: Clustering

NN model: Self-Organizing Maps

"class of data is not defined here"

Generalization types:

1. Linear

2. Nonlinear

Nonlinear generalization of the McCulloch-Pitts neuron:

y = f(x, w)

Sigmoidal neuron: y = 1 / (1 + exp(-w^T x - a))

Gaussian neuron: y = exp(-||x - w||^2 / (2 a^2))
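A small numpy sketch of the two nonlinear neuron types above; the function names and sample values are illustrative assumptions, only the formulas come from the slide.

```python
import numpy as np

def sigmoidal_neuron(x, w, a):
    """y = 1 / (1 + exp(-w.x - a)): logistic function of the weighted input."""
    return 1.0 / (1.0 + np.exp(-(w @ x) - a))

def gaussian_neuron(x, w, a):
    """y = exp(-||x - w||^2 / (2 a^2)): largest when the input x is closest to the centre w."""
    return np.exp(-np.sum((x - w) ** 2) / (2.0 * a ** 2))

x = np.array([0.5, -1.0])
w = np.array([1.0, 2.0])
print(sigmoidal_neuron(x, w, a=0.1))   # value in (0, 1)
print(gaussian_neuron(x, w, a=1.0))    # value in (0, 1], equal to 1 only when x == w
```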

MLP = multi-layer perceptron

Perceptron: y_out = w^T x

MLP neural network:

First layer: y^1_k = 1 / (1 + exp(-(w^1_k)^T x - a^1_k)), k = 1, 2, 3; y^1 = (y^1_1, y^1_2, y^1_3)^T

Second layer: y^2_k = 1 / (1 + exp(-(w^2_k)^T y^1 - a^2_k)), k = 1, 2; y^2 = (y^2_1, y^2_2)^T

Output: y_out = w_1 y^2_1 + w_2 y^2_2 = w^T y^2
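These equations translate directly into a feed-forward pass; the following numpy sketch assumes 2 inputs, 3 first-layer units and 2 second-layer units purely for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)
x  = np.array([0.2, -0.5])        # input vector
W1 = rng.normal(size=(3, 2))      # first-layer weights w_k^1 (3 hidden units)
a1 = rng.normal(size=3)           # first-layer offsets a_k^1
W2 = rng.normal(size=(2, 3))      # second-layer weights w_k^2 (2 units)
a2 = rng.normal(size=2)           # second-layer offsets a_k^2
w  = rng.normal(size=2)           # output weights

y1 = sigmoid(W1 @ x + a1)         # y_k^1 = 1 / (1 + exp(-(w_k^1 . x) - a_k^1))
y2 = sigmoid(W2 @ y1 + a2)        # y_k^2 = 1 / (1 + exp(-(w_k^2 . y^1) - a_k^2))
y_out = w @ y2                    # y_out = w^T y^2
print(y_out)
```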

• control

• classification

• prediction

• approximation

These can be reformulated in general as FUNCTION APPROXIMATION tasks.

Approximation: given a set of values of a function g(x), build a neural network that approximates the g(x) values for any input x.

The activation function is used to curve (squash) the weighted input, so that the network can capture the variation in the data.

Sigmoidal (logistic) function, commonly used in MLPs

Note: when the net input is 0, the output is 0.5.

g(a_i(t)) = 1 / (1 + exp(-k a_i(t)))

where k is a positive constant. The sigmoidal function gives a value in the range 0 to 1. Alternatively, tanh(ka) can be used, which has the same shape but a range of -1 to 1.

Input-output function of a neuron (rate coding assumption)
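A quick numpy check of the two ranges mentioned above (the gain k = 2 and the sample points are arbitrary choices for illustration):

```python
import numpy as np

k = 2.0                                    # positive gain constant
a = np.linspace(-5.0, 5.0, 11)             # sample activation values (includes a = 0)
logistic = 1.0 / (1.0 + np.exp(-k * a))    # values stay in (0, 1); equals 0.5 at a = 0
tanh_out = np.tanh(k * a)                  # same S-shape, but values stay in (-1, 1)
print(logistic[5], logistic.min(), logistic.max())   # middle sample is exactly 0.5
print(tanh_out.min(), tanh_out.max())
```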

Multi-Layer Perceptron example for approximation

Algorithm (sequential)

1. Apply an input vector and calculate all activations, a and u.

2. Evaluate δ_k for all output units via

δ_i(t) = (d_i(t) - y_i(t)) g'(a_i(t))

(Note the similarity to the perceptron learning algorithm.)

3. Backpropagate the δ_k's to get error terms for the hidden layers using

δ_i(t) = g'(u_i(t)) Σ_k δ_k(t) w_ki(t)

4. Evaluate weight changes using (η is the learning rate)

w_ij(t+1) = w_ij(t) + η δ_i(t) z_j(t)

v_ij(t+1) = v_ij(t) + η δ_i(t) x_j(t)
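As a rough sketch of these four steps (assuming sigmoidal units, one hidden layer, and biases folded in as extra weights on a constant input of 1; the variable names beyond v, w and the learning rate are my own):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_step(x, d, v, w, eta=0.1):
    """One sequential update: forward pass, output deltas, backpropagated deltas, weight changes."""
    x1 = np.append(x, 1.0)                               # input plus constant bias input
    z = sigmoid(v @ x1)                                   # hidden outputs z = g(u)
    z1 = np.append(z, 1.0)                                # hidden outputs plus bias input
    y = sigmoid(w @ z1)                                   # network outputs y = g(a)
    delta_out = (d - y) * y * (1.0 - y)                   # step 2: (d_i - y_i) g'(a_i)
    delta_hid = (w[:, :-1].T @ delta_out) * z * (1.0 - z) # step 3: g'(u_i) * sum_k delta_k w_ki
    w = w + eta * np.outer(delta_out, z1)                 # step 4: w_ij += eta * delta_i * z_j
    v = v + eta * np.outer(delta_hid, x1)                 #          v_ij += eta * delta_i * x_j
    return v, w, y

rng = np.random.default_rng(1)
v = rng.normal(size=(2, 3))      # 2 hidden units, 2 inputs + bias
w = rng.normal(size=(2, 3))      # 2 outputs, 2 hidden units + bias
v, w, y = train_step(np.array([0.0, 1.0]), np.array([1.0, 0.0]), v, w)
print(y)
```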

Here I have used a simple identity activation function with an example to show how the neural network works.

Once weight changes are computed for all units, the weights are updated at the same time (biases are included as weights here). An example:

(network diagram: inputs x1, x2; hidden units z1, z2; outputs y1, y2)

First-layer weights: v11 = -1, v21 = 0, v12 = 0, v22 = 1, biases v10 = 1, v20 = 1

Second-layer weights: w11 = 1, w21 = -1, w12 = 0, w22 = 1

Use the identity activation function (i.e. g(a) = a).

We have input [0 1] with target [1 0].

All biases are set to 1 and will not be drawn, for clarity.

Learning rate = 0.1

With input [0 1] and target [1 0], set x1 = 0 and x2 = 1.

Forward pass. Calculate the first-layer activations:

u1 = -1×0 + 0×1 + 1 = 1

u2 = 0×0 + 1×1 + 1 = 2

Calculate the first-layer outputs by passing the activations through the activation function:

z1 = g(u1) = 1

z2 = g(u2) = 2

Calculate the second-layer outputs (weighted sums through the activation function):

y1 = a1 = 1×1 + 0×2 + 1 = 2

y2 = a2 = -1×1 + 1×2 + 1 = 2

Backward pass. The target is [1, 0], so d1 = 1 and d2 = 0. So:

δ1 = (d1 - y1) = 1 - 2 = -1

δ2 = (d2 - y2) = 0 - 2 = -2

Calculate the weight changes for the second (output) layer weights w (cf. perceptron learning), multiplying each δ by the corresponding hidden output (z1 = 1, z2 = 2):

δ1 z1 = -1

δ1 z2 = -2

δ2 z1 = -2

δ2 z2 = -4

Scaled by the learning rate 0.1, these give the new second-layer weights:

w11 = 1 + 0.1×(-1) = 0.9

w21 = -1 + 0.1×(-2) = -1.2

w12 = 0 + 0.1×(-2) = -0.2

w22 = 1 + 0.1×(-4) = 0.6

But first, to update the first-layer weights v, we must calculate the hidden-layer Δ's. The output δ's are multiplied by the (old) second-layer weights:

δ1 w11 = -1

δ2 w21 = 2

δ1 w12 = 0

δ2 w22 = -2

The Δ's propagate back (summing, since g'(u) = 1 for the identity activation):

Δ1 = -1 + 2 = 1

Δ2 = 0 - 2 = -2

And the Δ's are multiplied by the inputs (x1 = 0, x2 = 1):

Δ1 x1 = 0

Δ1 x2 = 1

Δ2 x1 = 0

Δ2 x2 = -2

Finally, change the weights (again scaled by the learning rate 0.1):

v11 = -1, v21 = 0, v12 = 0.1, v22 = 0.8

w11 = 0.9, w21 = -1.2, w12 = -0.2, w22 = 0.6

Note that the weights multiplied by the zero input are unchanged, as they do not contribute to the error. We have also changed the biases (not shown).

Now go forward again (we would normally use a new input vector). With the updated weights and biases, the hidden outputs are:

z1 = 1.2

z2 = 1.6

And the new outputs are:

y1 = 1.66

y2 = 0.32

The outputs are now closer to the target value [1, 0].
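The numbers in this example can be reproduced with a few lines of numpy. The sketch below treats the biases as extra weights on a constant input of 1, as the slides do; the matrix layout is my own choice for illustration.

```python
import numpy as np

eta = 0.1
x = np.array([0.0, 1.0, 1.0])            # x1, x2 and the constant bias input
d = np.array([1.0, 0.0])                  # target
V = np.array([[-1.0, 0.0, 1.0],           # v11, v12, v10
              [ 0.0, 1.0, 1.0]])          # v21, v22, v20
W = np.array([[ 1.0, 0.0, 1.0],           # w11, w12, w10
              [-1.0, 1.0, 1.0]])          # w21, w22, w20

# Forward pass with the identity activation g(a) = a.
z = np.append(V @ x, 1.0)                 # hidden outputs [1, 2] plus bias input
y = W @ z                                  # outputs [2, 2]

# Backward pass: with g'(a) = 1 the deltas are simply (target - output).
delta_out = d - y                          # [-1, -2]
delta_hid = W[:, :2].T @ delta_out         # [1, -2]

# Update all weights (biases included) at the same time.
W = W + eta * np.outer(delta_out, z)
V = V + eta * np.outer(delta_hid, x)

# Forward pass again with the updated weights.
z = np.append(V @ x, 1.0)
print(z[:2])        # approximately [1.2, 1.6]
print(W @ z)        # approximately [1.66, 0.32], closer to the target [1, 0]
```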

Neural network applications

Pattern classification:

• Remote sensing and image classification

• Handwritten character/digit recognition

Control, time series, estimation:

• Machine control / robot manipulation

• Financial/scientific/engineering time series forecasting

Optimization:

• Travelling salesperson problem

• Multiprocessor scheduling and task assignment

Real-world application examples:

• Hospital patient stay length prediction

• Natural gas price prediction

• Artificial neural networks are inspired by the learning processes that take place in biological systems.

• Learning can be perceived as an optimisation process.

• Biological neural learning happens by the modification of the synaptic strength. Artificial neural networks learn in the same way.

• The synapse strength modification rules for artificial neural networks can be derived by applying mathematical optimisation methods.

• Learning tasks of artificial neural networks = function approximation tasks.

• The optimisation is done with respect to the approximation error measure.

• In general, a single hidden layer neural network (MLP, RBF or other) is enough to learn the approximation of a nonlinear function. In such cases, general optimisation methods can be applied to find the update rules for the synaptic weights.

References

1. Artificial Neural Network, Simon Haykin.

2. Artificial Neural Network, Yegnanarayana.

3. Artificial Neural Network, Zurada.

4. Hornik, K., Stinchcombe, M. and White, H., "Multilayer feedforward networks are universal approximators", Neural Networks, vol. 2, no. 5, pp. 359-366, 1989.

5. Kumar, P. and Walia, E. (2006), "Cash Forecasting: An Application of Artificial Neural Networks in Finance", International Journal of Computer Science and Applications, 3(1): 61-77.