Neuro-fuzzy Systems
Khurshid Ahmad,
Khurshid Ahmad, Professor of Computer Science,
Department of Computer Science, Trinity College,
Dublin-2, IRELAND. 21st November 2012.
https://www.cs.tcd.ie/Khurshid.Ahmad/Teaching.html
Neuro-fuzzy models
A fuzzy inference system can be shown to be functionally equivalent to a class of adaptive networks.
The burden of specifying the parameters of the fuzzy inference can be transferred to an algorithm that attempts to learn the values of the parameters.
Jang, Jyh-Shing Roger., Sun, Chuen-Tsai & Mizutani, Eiji. (1997). Neuro-Fuzzy & Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River (NJ): Prentice Hall, Inc. (Chapters 8 and 12)
Neuro-fuzzy models
For complex control systems, there is a wealth of observational a priori knowledge related to the behaviour of inputs and output(s).
The input space may be partitioned – between, say, stimuli inducing normal behaviour and those inducing abnormal behaviour. The corresponding output can be noted.
Jang, Jyh-Shing Roger., Sun, Chuen-Tsai & Mizutani, Eiji. (1997). Neuro-Fuzzy & Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River (NJ): Prentice Hall, Inc. (Chapters 8 and 12)
Neuro-fuzzy models
Learn from the input-output data:
• Data mining;
• Machine Learning;
• Neural Networks;
• Genetic Algorithms;
• Hybrids → Neuro-fuzzy systems
Jang, Jyh-Shing Roger., Sun, Chuen-Tsai & Mizutani, Eiji. (1997). Neuro-Fuzzy & Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River (NJ): Prentice Hall, Inc. (Chapters 8 and 12)
Soft Computing
Neuro-fuzzy models
Learn from the input-output data:
• If a soft computing system is able to compute the input-output relationships, then it will LEARN to compute the relationships.
Jang, Jyh-Shing Roger., Sun, Chuen-Tsai & Mizutani, Eiji. (1997). Neuro-Fuzzy & Soft Computing: A Computational Approach to Learning and Machine Intelligence. Upper Saddle River (NJ): Prentice Hall, Inc. (Chapters 8 and 12)
Neuro-fuzzy models: Learn from the input-output data
The key notion in learning is that of THRESHOLD
Threshold is an Old English word meaning the piece of
timber or stone which lies below the bottom of a door, and
has to be crossed in entering a house; the sill of a doorway;
hence, the entrance to a house or building.
More technically, in contexts of wages and taxation, in
which wage or tax increases become due or obligatory when
some predetermined conditions are fulfilled (esp. above a
specified point on a graduated scale). [..]
Neuro-fuzzy models: Learn from the input-output data
The key notion in learning is that of THRESHOLD
Threshold in many specialist domains refers to a lower limit.
(i) Psychology: esp. in phrase threshold of consciousness.
(ii) In Physiology and more widely:
(a) the limit below which a stimulus is not
perceptible;
(b) the magnitude or intensity that must be exceeded
for a certain reaction or phenomenon to occur.
Neuro-fuzzy models: Learn from the input-output data
The key notion in learning is that of THRESHOLD
Threshold in many specialist domains refers to a lower limit.
(iii) In Electronics: (a) threshold device, element, etc.: a circuit element having one output
and a number of inputs, each of which accepts a binary signal and
multiplies it by some factor; the output is 0 or 1 depending on
whether or not the sum of the resulting quantities is less than a
certain threshold value;
(b) threshold function, a Boolean function that can be realized by such
an element; threshold logic, switching logic based on such elements.
Neuro-fuzzy models: Learn from the input-output data
The key notion in learning is that of THRESHOLD
Threshold in many specialist domains refers to a lower limit.
(iv) In Fuzzy Logic and Fuzzy Knowledge Bases, rules are
fired if the aggregation of the antecedents’ membership
functions is non-zero. The threshold value here is any
number greater than zero.
Neuro-fuzzy models: Learn from the input-output data
The key notion in learning is that of THRESHOLD
Threshold functions
v = w1·x + w2·y

(1) Threshold function:
φ(v) = 1 if v ≥ 0; φ(v) = 0 if v < 0.

(2) Piecewise-linear function:
φ(v) = 1 if v ≥ +1/2; φ(v) = v if +1/2 > v > −1/2; φ(v) = 0 if v ≤ −1/2.

(3) Sigmoid function:
φ(v) = 1 / (1 + exp(−a·v)).
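The three threshold functions can be sketched directly (a minimal illustration; the weights and inputs below are arbitrary values chosen for the example):

```python
import math

def threshold(v):
    """(1) Threshold function: 1 if v >= 0, else 0."""
    return 1.0 if v >= 0 else 0.0

def piecewise_linear(v):
    """(2) Piecewise-linear function, saturating at +/- 1/2."""
    if v >= 0.5:
        return 1.0
    if v <= -0.5:
        return 0.0
    return v  # linear in the middle region

def sigmoid(v, a=1.0):
    """(3) Sigmoid function with slope parameter a."""
    return 1.0 / (1.0 + math.exp(-a * v))

# Activation of v = w1*x + w2*y
w1, w2, x, y = 0.5, -0.25, 1.0, 2.0
v = w1 * x + w2 * y          # v = 0.0
print(threshold(v))          # 1.0 (v >= 0)
print(sigmoid(v))            # 0.5
```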
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
There are two fuzzy rules:
R1: IF x is A1 and y is B1 THEN f1=p1x+q1y+r1
R2: IF x is A2 and y is B2 THEN f2=p2x+q2y+r2
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
[Figure: the five-layer ANFIS network. Layer 1: membership nodes A1, A2 (input x) and B1, B2 (input y); Layer 2: product nodes (Π) outputting firing strengths w1, w2; Layer 3: normalization nodes (N) outputting w̄1, w̄2; Layer 4: adaptive nodes outputting w̄1f1, w̄2f2; Layer 5: summation node (Σ) outputting f.]
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
[Figure: membership functions for A1 (over x) and B1 (over y), the firing strength w1, the rule output f1 = p1x + q1y + r1, and the overall output f = (w1 f1 + w2 f2)/(w1 + w2).]
Neuro-fuzzy models: A case study
The operation of a fuzzy system depends on the execution of FOUR major tasks:
Fuzzification,
Inference,
Composition,
(Defuzzification).
The different layers in an adaptive network perform one or more of these tasks.
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
[Figure: the five-layer ANFIS network again – membership nodes (Layer 1), product nodes Π (Layer 2), normalization nodes N (Layer 3), consequent nodes (Layer 4), and the summation node Σ (Layer 5).]
Neuro-fuzzy models: A case study
One can argue that the first layer, which receives input from the external world, actually performs fuzzification.
Recall that fuzzification involves the choice of variables – fuzzy input and output variables and defuzzified output variable(s) – the definition of membership functions for the input variables, and the description of fuzzy rules. The membership functions defined on the input variables are applied to their actual values to determine the degree of truth for each rule premise. The degree of truth for a rule's premise is sometimes referred to as its α (alpha) value. If a rule's premise has a non-zero degree of truth, that is, if the rule applies at all, then the rule is said to fire.
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
LAYER 1: Every node i in this layer is an adaptive node with a node function

O1,i = μAi(x), for i = 1, 2
O1,i = μB(i−2)(y), for i = 3, 4

where the membership function is, for example, the generalized bell function

μA(x) = 1 / (1 + |(x − ci)/ai|^(2bi))

PREMISE PARAMETER SET := {ai, bi, ci}
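The generalized bell membership function of Layer 1 is straightforward to implement (the parameter values below are illustrative assumptions, not taken from the slides):

```python
def bell_mf(x, a, b, c):
    """Generalized bell membership function:
    mu(x) = 1 / (1 + |(x - c)/a|^(2b)).
    a: width, b: slope, c: centre -- the premise parameters {a, b, c}."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

print(bell_mf(3.0, a=2.0, b=2.0, c=3.0))  # 1.0 at the centre c
print(bell_mf(5.0, a=2.0, b=2.0, c=3.0))  # 0.5 at c + a
```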
Neuro-fuzzy models: A case study
The operation of the 2nd and 3rd layers in an adaptive network may be construed as equivalent to performing inference.
In lectures on knowledge representation we had defined inference as follows: The truth-value for the premise of each rule is computed and the conclusion applied to each part of the rule. This results in one fuzzy subset assigned to each output variable for each rule. MIN and PRODUCT are two inference methods. In MIN inferencing the output membership function is clipped off at a height corresponding to the computed degree of truth of a rule's premise. This corresponds to the traditional interpretation of the fuzzy logic AND operation. In PRODUCT inferencing the output membership function is scaled by the premise's computed degree of truth.
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
LAYER 2: Every node in this layer is a fixed node (Π); the node outputs the product of all incoming signals

O2,i = wi = μAi(x) · μB(j−2)(y), for i = 1, 2; j = 3, 4

Each node in this layer represents the firing strength of a rule; a fuzzy AND operator can be used.
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
LAYER 3: Every node in this layer is a fixed node (N); the ith node calculates the ratio of the ith rule's firing strength to the sum of all rules' firing strengths

O3,i = w̄i = wi / (w1 + w2), for i = 1, 2

Outputs of layer 3 are called NORMALIZED FIRING STRENGTHS.
Neuro-fuzzy models: A case study
The operation of the (3rd &) 4th layer(s) involves composition.
You may remember our definition of composition: All the fuzzy subsets assigned to each output variable are combined together to form a single fuzzy subset for each output variable. MAX and SUM are two composition rules. In MAX composition, the combined fuzzy subset is constructed by taking the pointwise maximum over all the fuzzy subsets assigned to the output variable by the inference rule. In SUM composition, the combined output fuzzy subset is constructed by taking the pointwise sum over all the fuzzy subsets assigned to the output variable by their inference rule. (Note that this can result in truth values greater than 1.)
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
LAYER 4: Every node in this layer is an adaptive node with a node function:

O4,i = w̄i · fi = w̄i (pi x + qi y + ri)

Here w̄i is the normalized firing strength from layer 3; the parameter set {pi, qi, ri} is the so-called CONSEQUENT PARAMETER SET.
Neuro-fuzzy models: A case study
And, finally, the output layer of an adaptive network performs the equivalent of defuzzification.
Defuzzification was defined as the process whereby the value from the composition stage is converted to a single number, a crisp value. Two popular defuzzification techniques are the CENTROID and MAXIMUM techniques. The CENTROID technique relies on using the centre of gravity of the membership function to calculate the crisp value of the output variable. The MAXIMUM techniques, and there are a number of them, broadly speaking use one of the variable values at which the fuzzy subset has its maximum truth value to compute the crisp value.
Neuro-fuzzy models: A case study
Consider a first-order Sugeno fuzzy model with two inputs (x & y) and one output (z).
LAYER 5: The single node in this layer is a fixed node labelled Σ, which computes the overall output as the summation of all incoming signals

O5,1 = Σi w̄i fi = (Σi wi fi) / (Σi wi)
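The five layers can be chained into a single forward pass. The sketch below uses two generalized bell membership functions per input; all parameter values are illustrative assumptions, not taken from the slides:

```python
def bell_mf(x, a, b, c):
    """Layer 1: generalized bell membership function."""
    return 1.0 / (1.0 + abs((x - c) / a) ** (2 * b))

def anfis_forward(x, y, premise, consequent):
    """Forward pass of the two-rule, first-order Sugeno model.
    premise: [(a, b, c) for A1, A2, B1, B2]
    consequent: [(p, q, r) for rule R1, rule R2]"""
    A1, A2, B1, B2 = premise
    # Layer 1: membership degrees
    muA = [bell_mf(x, *A1), bell_mf(x, *A2)]
    muB = [bell_mf(y, *B1), bell_mf(y, *B2)]
    # Layer 2: firing strengths (product of incoming signals)
    w = [muA[0] * muB[0], muA[1] * muB[1]]
    # Layer 3: normalized firing strengths
    total = w[0] + w[1]
    w_bar = [wi / total for wi in w]
    # Layer 4: weighted rule outputs, fi = p*x + q*y + r
    f = [p * x + q * y + r for (p, q, r) in consequent]
    # Layer 5: summation of all incoming signals
    return sum(wb * fi for wb, fi in zip(w_bar, f))

# Illustrative (assumed) premise and consequent parameters
premise = [(2.0, 2.0, 3.0), (2.0, 2.0, 9.0),     # A1, A2 over x
           (2.0, 2.0, 3.0), (2.0, 2.0, 9.0)]     # B1, B2 over y
consequent = [(1.0, 1.0, 0.0), (2.0, 0.5, 1.0)]  # rules R1, R2
print(anfis_forward(3.0, 3.0, premise, consequent))
```

At x = y = 3 rule R1 dominates (its membership functions are centred there), so the output lies very close to f1 = 3 + 3 + 0 = 6.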
Neuro-fuzzy models: A case study
The network below is an adaptive network that is functionally equivalent to a Takagi-Sugeno model.
[Figure: the five-layer ANFIS network – inputs x and y feed membership nodes A1, A2, B1, B2 (Layer 1); product nodes Π output w1, w2 (Layer 2); nodes N output the normalized strengths w̄1, w̄2 (Layer 3); adaptive nodes output w̄1f1, w̄2f2 (Layer 4); the Σ node outputs f (Layer 5).]
Neuro-fuzzy models
The goal of a number of statistical investigations is to predict the variation of a dependent variable on one or more independent variables using a mathematical equation. The dependence can be linear or non-linear.
Consider the linear dependence of a variable y on independent variable
x, sometimes with the proviso that the independent variables can be
observed without any (observational) error. The dependent variable y
may have different values for the SAME x.
One can argue that y is essentially a random variable and its distribution depends on x; typically, the quest is to find the relationship between the independent variable and the MEAN of the dependent variable y – the regression curve of y on x.
Neuro-fuzzy models
Assume that the dependence of y on x is linear; then for any given x the MEAN of the distribution of y is given as

ŷ = α + βx

Statisticians remind us that an observed y will differ from the mean ŷ by the value of a random variable, say, ε:

y = α + βx + ε
Neuro-fuzzy models
Assume that the dependence of y on x is linear; then for any given x the MEAN of the distribution of y is given as

ŷ = α + βx

Statisticians remind us that an observed y will differ from the mean ŷ by the value of a random variable, say, ε. The value of ε may be related to the possible error of measurement and to other variables that may have an influence on y:

y = α + βx + ε
Neuro-fuzzy models
What we have to do now is to use an OBSERVED data set containing the tuples {xi, yi} for a number of observations, i = 1, …, N, to estimate the values of α and β.
Given that we have assumed that the relation between x and y is linear, we have to find a straight line that may provide a fit. There could be many straight lines that can be fitted to the data set, and we have to choose the best one. We begin by predicting the value of y using estimates of α and β, which we will refer to as a and b:

ŷ = a + bx
Neuro-fuzzy models
The error in predicting the value of y given a corresponding value of x will be denoted as the error ei:

ei = yi − ŷi
Neuro-fuzzy models
Typically, instead of examining each ei individually, a difficult task, one might try to reduce the sum of the errors associated with the N observations to zero. This is rather unsuitable, as it can be satisfied by totally unsuitable lines (positive and negative errors cancel); instead, one minimizes the sum of the squares of the ei:

Σi ei² = Σi [yi − (a + b·xi)]²
Neuro-fuzzy models
Essentially, we equate the (partial) derivatives of the above sum of squares with respect to a and b to zero and we get the normal equations:

∂/∂a: 2 Σi [yi − (a + b·xi)]·(−1) = 0
∂/∂b: 2 Σi [yi − (a + b·xi)]·(−xi) = 0

Leading to:

Σi yi = n·a + b·Σi xi
Σi xi·yi = a·Σi xi + b·Σi xi²
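Solving the two normal equations for a and b gives the familiar least-squares line; a minimal sketch (the data set below is an assumed, noiseless example):

```python
def fit_line(xs, ys):
    """Solve the normal equations for a and b in y-hat = a + b*x."""
    n = len(xs)
    sum_x = sum(xs)
    sum_y = sum(ys)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    sum_x2 = sum(x * x for x in xs)
    # Eliminating a from: sum_y = n*a + b*sum_x and
    # sum_xy = a*sum_x + b*sum_x2 gives b, then a.
    b = (n * sum_xy - sum_x * sum_y) / (n * sum_x2 - sum_x ** 2)
    a = (sum_y - b * sum_x) / n
    return a, b

# Noiseless data on y = 1 + 2x recovers a = 1, b = 2
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
print(fit_line(xs, ys))  # (1.0, 2.0)
```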
Neuro Fuzzy Models
• 'Statisticians generally have good mathematical backgrounds with which to analyse decision-making algorithms theoretically. […] However, they often pay little or no attention to the applicability of their own theoretical results' (Raudys 2001:xi).
• Neural network researchers 'advocate that one should not make assumptions concerning the multivariate densities assumed for pattern classes'. Rather, they argue that 'one should assume only the structure of decision making rules', hence the emphasis on the minimization of classification errors, for instance.
Raudys, Šarūnas. (2001). Statistical and Neural Classifiers: An Integrated Approach to Design. London: Springer-Verlag.
Neuro Fuzzy Models
• In neural networks there are algorithms that have a theoretical justification and some that have 'no theoretical elucidation'.
• Given that there are strengths and weaknesses of both statistical and other soft computing algorithms (e.g. neural nets, fuzzy logic), one should integrate the two classifier design strategies (ibid).
Raudys, Šarūnas. (2001). Statistical and Neural Classifiers: An Integrated Approach to Design. London: Springer-Verlag.
Neuro Fuzzy Models
• The 'remarkable qualities' of neural networks: the dynamics of a single-layer perceptron progress from the simplest algorithms to the most complex ones:
• Initial training → each pattern class is characterized by its sample mean vector; the neuron behaves like a Euclidean distance classifier (EDC);
• Further training → the neuron begins to evaluate correlations and variances of features; the neuron behaves like the standard linear Fisher classifier;
• More training → the neuron minimizes the number of incorrectly identified training patterns; the neuron behaves like a support vector classifier.
• Statisticians and engineers usually design decision-making algorithms from experimental data by progressing from simple algorithms to more complex ones.
Raudys, Šarūnas. (2001). Statistical and Neural Classifiers: An Integrated Approach to Design. London: Springer-Verlag.
Neuro-fuzzy models
Adaptive Networks
A network typically comprises a set of nodes connected by directed links.
Each node performs a static node function on its incoming signals to generate a single node output.
Each link specifies the direction of signal flow from one node to another.
An adaptive network is a network structure whose overall input-output behaviour is determined by a collection of modifiable parameters.
Neuro-fuzzy models: Directed graphs
• The nodes of a directed graph are called processing elements.
• The links of the graph are called connections.
• Each connection functions as an instantaneous unidirectional signal-conduction path.
• Each processing element can receive any number of incoming connections, sometimes called input connections.
• Each processing element can have any number of outgoing connections, but the signals in all of these must be the same.
Neuro-fuzzy models: Directed graphs
[Figure: a processing unit with input connections, an output connection, and a fan-out.]
Neuro-fuzzy models: Directed graphs
In effect, each processing element has a single output connection that can branch or fan out into copies to form multiple output connections (sometimes called collaterals), each of which carries the same identical signal (the processing element's output signal).
Processing elements can have local memory.
Each processing element possesses a transfer function which can use (and alter) local memory, can use input signals, and which produces the processing element's output signal.
Neuro-fuzzy models: Directed graphs
The only inputs allowed to the transfer function are the values stored in the processing element's local memory and the current values of the input signals in the connections received by the processing element.
The only outputs allowed from the transfer function are values to be stored in the processing element's local memory and the processing element's output signal.
Neuro-fuzzy models: Directed graphs
Transfer functions can operate continuously or episodically.
If they operate episodically, there must be an input called "activate" that causes the processing element's transfer function to operate on the current input signals and local memory values and to produce an updated output signal (and possibly to modify local memory values). The "activate" input arrives via a connection from a scheduling processing element that is part of the network.
Continuous processing elements are always operating.
Real Neuroscience
Brains compute. This means that they process information, creating abstract representations of physical entities and performing operations on this information in order to execute tasks. One of the main goals of computational neuroscience is to describe these transformations as a sequence of simple elementary steps organized in an algorithmic way. The mechanistic substrate for these computations has long been debated. Traditionally, relatively simple computational properties have been attributed to the individual neuron, with the complex computations that are the hallmark of brains being performed by the network of these simple elements.
London, Michael and Michael Häusser (2005). Dendritic Computation. Annual Review of Neuroscience. Vol. 28, pp 503–32.
DEFINITIONS: Artificial Neural Networks
Artificial neural networks emulate threshold behaviour, simulate co-operative phenomena by a network of 'simple' switches, and are used in a variety of applications, like banking, currency trading, robotics, and experimental and animal psychology studies.
These information systems, neural networks or neuro-computing systems as they are popularly known, can be simulated by solving first-order difference or differential equations.
What can computers do? Artificial Neural Networks
In a restricted sense artificial neurons are simple emulations of biological neurons: the artificial neuron can, in principle, receive its input from all other artificial neurons in the ANN; simple operations are performed on the input data; and the recipient neuron can, in principle, pass its output on to all other neurons.
Intelligent behaviour can be simulated through computation in massively parallel networks of simple processors that store all their long-term knowledge in the connection strengths.
DEFINITIONS: Neurons & Appendages
A neuron is a cell with appendages; every cell has a nucleus, and one set of appendages brings in inputs – the dendrites – and another set helps to output signals generated by the cell.
[Figure: 'The Real McCoy' – a biological neuron with cell body, nucleus, dendrites, and axon.]
DEFINITIONS: Neurons & Appendages
The human brain is mainly composed of neurons: specialized cells that exist to transfer information rapidly from one part of an animal's body to another.
[Figure: a neuron with dendrites, soma, nucleus, and axon terminals.]
SOURCE: http://en.wikipedia.org/wiki/Neurons
DEFINITIONS: Neurons & Appendages
This communication is achieved by the transmission (and reception) of electrical impulses (and chemicals) from neurons and other cells of the animal. Like other cells, neurons have a cell body that contains a nucleus enshrouded in a membrane, which has a double-layered ultrastructure with numerous pores.
SOURCE: http://en.wikipedia.org/wiki/Neurons
DEFINITIONS: Neurons & Appendages
Neurons have a variety of appendages, referred to as cytoplasmic processes, known as neurites, which end in close apposition to other cells. In higher animals, neurites are of two varieties: axons are processes generally of uniform diameter that conduct impulses away from the cell body; dendrites are short-branched processes that are used to conduct impulses towards the cell body.
The ends of the neurites, i.e. axons and dendrites, are called synaptic terminals, and the cell-to-cell contacts they make are known as synapses.
SOURCE: http://en.wikipedia.org/wiki/Neurons
DEFINITIONS: The fan-ins and fan-outs
[Figure: a neuron with a fan-in of c. 10^4 connections, a summing junction, conduction speeds of 1–100 metres per second, asynchronous firing at c. 200 spikes per second, and a fan-out of c. 10^4 connections.]
10^10 neurons with 10^4 connections each and an average of 10 spikes per second = 10^15 adds/sec. This is a lower bound on the equivalent computational power of the brain.
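The lower-bound estimate above is a simple product of the three quantities, which can be checked directly:

```python
neurons = 10 ** 10     # c. 10^10 neurons in the brain
fan_in = 10 ** 4       # c. 10^4 connections per neuron
spike_rate = 10        # average spikes per second
adds_per_sec = neurons * fan_in * spike_rate
print(adds_per_sec == 10 ** 15)  # True
```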
Notes on Artificial Neural Networks
Input signals to a neural network from outside the network arrive via connections that originate in the outside world.
Outputs from the network to the outside world are connections that leave the network.
Notes on Artificial Neural Networks
• Artificial Neural Networks (ANN) are computational systems, either hardware or software, which mimic animate neural systems comprising biological (real) neurons.
• An ANN is architecturally similar to a biological system in that the ANN also uses a number of simple, interconnected artificial neurons.
Notes on Artificial Neural Networks
[Figure: neural networks sit between neuroscience and theory – observed biological processes (data) and biologically plausible mechanisms for neural processing & learning (biological neural network models) on one side, and theory (statistical learning theory & information theory) on the other.]
http://en.wikipedia.org/wiki/Neural_network#Neural_networks_and_neuroscience
Notes on Artificial Neural Networks
Neural Networks 'learn' by adapting in accordance with a training regimen: the network is subjected to particular information environments on a particular schedule to achieve the desired end-result.
There are three major types of training regimens or learning paradigms:
SUPERVISED;
UNSUPERVISED;
REINFORCEMENT or GRADED.
Notes on Artificial Neural Networks: Biological and Artificial NN's

Entity           | Biological Neural Networks                 | Artificial Neural Networks
Processing Units | Neurons                                    | Network Nodes
Input            | Dendrites                                  | Network Arcs
Output           | Axons                                      | Network Arcs
Inter-linkage    | Synaptic Contact (Chemical and Electrical) | Node to Node via Arcs
Connectivity     | Plastic Connections                        | Weighted Connections Matrix
Notes on Artificial Neural Networks: Biological and Artificial NN's
[Figure: brain connectivity map.]
SOURCE: http://brainmaps.org/index.php?p=brain-connectivity-maps-imagemap
Notes on Artificial Neural Networks: Biological and Artificial NN's
413 major areas in the animal brain; areas are connected to each other, some more connected than others.
SOURCE: http://brainmaps.org/index.php?p=brain-connectivity-maps-imagemap
Notes on Artificial Neural Networks: An operational view of Artificial NN's
[Figure: a schematic for an 'electronic' neuron – input signals x1…x4 are multiplied by weights wk1…wk4, a bias bk is added at the summing junction Σ, and an activation function produces the output signal yk.]
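The schematic can be sketched in a few lines; the weight, input, and bias values below are illustrative assumptions, and the sigmoid stands in for the activation function:

```python
import math

def neuron(xs, ws, b):
    """An 'electronic' neuron: weighted sum of inputs plus bias
    (the summing junction), passed through a sigmoid activation."""
    v = sum(x * w for x, w in zip(xs, ws)) + b
    return 1.0 / (1.0 + math.exp(-v))

# Illustrative input signals x1..x4, weights wk1..wk4, and bias bk
xs = [1.0, 0.5, -1.0, 2.0]
ws = [0.2, -0.4, 0.1, 0.3]
b = -0.5
print(neuron(xs, ws, b))  # 0.5 (the weighted sum is exactly zero)
```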
Notes on Artificial Neural Networks: An operational view of Artificial NN's
A neural network comprises:
• A set of processing units
• A state of activation
• An output function for each unit
• A pattern of connectivity among units
• A propagation rule for propagating patterns of activities through the network
• An activation rule for combining the inputs impinging on a unit with the current state of that unit to produce a new level of activation for the unit
• A learning rule whereby patterns of connectivity are modified by experience
• An environment within which the system must operate
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
Logic Gate: A digital circuit that implements an elementary logical operation. It has one or more inputs but ONLY one output. The conditions applied to the input(s) determine the voltage levels at the output. The output, typically, has two values: '0' or '1'.
Digital Circuit: A circuit that responds to discrete values of input (voltage) and produces discrete values of output (voltage).
Binary Logic Circuits: Extensively used in computers to carry out instructions and arithmetical processes. Any logical procedure may be effected by a suitable combination of the gates. Binary circuits are typically formed from discrete components like integrated circuits.
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
Logic Circuits: Designed to perform a particular logical function based on AND, OR (either), and NOR (neither). Those circuits that operate between two discrete (input) voltage levels, high & low, are described as binary logic circuits.
Logic element: A small part of a logic circuit, typically a logic gate, that may be represented by the mathematical operators in symbolic logic.
Gate | Input(s)      | Output
AND  | Two (or more) | High if and only if both (or all) inputs are high.
NOT  | One           | High if input low and vice versa.
OR   | Two (or more) | High if any one (or more) inputs are high.
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
The operation of an AND gate:

Input 1 | Input 2 | Output
0       | 0       | 0
0       | 1       | 0
1       | 0       | 0
1       | 1       | 1

AND(x, y) = minimum_value(x, y);
AND(1, 0) = minimum_value(1, 0) = 0;
AND(1, 1) = minimum_value(1, 1) = 1
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
A single layer perceptron can perform a number of logical operations which are performed by a number of computational devices.
A hard-wired perceptron: the perceptron below performs the AND operation. This is hard-wired because the weights are predetermined and not learnt.

Inputs x1 and x2, with weights w1 = +1 and w2 = +1, and bias θ = −1.5;
Σ = w1x1 + w2x2 + θ;
y = 1 if Σ ≥ 0; y = 0 if Σ < 0.
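The hard-wired AND perceptron can be checked directly against the AND truth table (a minimal sketch):

```python
def and_perceptron(x1, x2):
    """Hard-wired perceptron for AND: w1 = w2 = +1, theta = -1.5.
    Fires (y = 1) only when both inputs are 1."""
    s = 1 * x1 + 1 * x2 + (-1.5)
    return 1 if s >= 0 else 0

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, and_perceptron(x1, x2))
# Only the input pair (1, 1) yields output 1, matching the AND truth table
```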
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
A single layer perceptron can perform a number of logical operations which are performed by a number of computational devices.
A learning perceptron: the perceptron below learns the AND operation.
An algorithm: train the network for a number of epochs:
(1) Set the initial weights w1 and w2 and the bias θ to random numbers;
(2) Compute the weighted sum: x1*w1 + x2*w2 + θ;
(3) Calculate the output using a delta function: y(i) = delta(x1*w1 + x2*w2 + θ), where delta(x) = 1 if x is greater than zero and delta(x) = 0 if x is less than or equal to zero;
(4) Compute the difference between the actual output and the desired output: e(i) = ydesired − y(i);
(5) If the errors during a training epoch are all zero then stop; otherwise update wj(i+1) = wj(i) + α*xj*e(i), j = 1, 2.
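The algorithm above, run with the α, θ, and initial weights of the worked example that follows (α = 0.1, θ = −0.1, w = (0.3, −0.1)), can be sketched as below. Exact rational arithmetic (`Fraction`) is used so the run reproduces the hand-worked decimal values without floating-point knife-edges at sums of exactly zero:

```python
from fractions import Fraction as F

def delta(x):
    """Step function from step (3): 1 if x > 0, else 0."""
    return 1 if x > 0 else 0

def train_and(w1, w2, theta, alpha, max_epochs=100):
    """Train a perceptron on the AND truth table, following steps (1)-(5)."""
    data = [(0, 0, 0), (0, 1, 0), (1, 0, 0), (1, 1, 1)]
    for epoch in range(1, max_epochs + 1):
        all_zero = True
        for x1, x2, y_desired in data:
            y = delta(x1 * w1 + x2 * w2 + theta)   # steps (2)-(3)
            e = y_desired - y                      # step (4)
            if e != 0:                             # step (5)
                all_zero = False
                w1 += alpha * x1 * e
                w2 += alpha * x2 * e
        if all_zero:
            return epoch, w1, w2
    return None

result = train_and(F(3, 10), F(-1, 10), theta=F(-1, 10), alpha=F(1, 10))
print(result)  # (5, Fraction(1, 10), Fraction(1, 10))
```

The run stops in epoch 5 with w1 = w2 = 0.1, matching the final row of the worked example below.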
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
A single layer perceptron can perform a number of logical operations which are performed by a number of computational devices:
α = 0.1, θ = −0.1

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
1     | 0  | 0  | 0        | 0.3        | -0.1       | 0             | 0     | 0.3      | -0.1
1     | 0  | 1  | 0        | 0.3        | -0.1       | 0             | 0     | 0.3      | -0.1
1     | 1  | 0  | 0        | 0.3        | -0.1       | 1             | -1    | 0.2      | -0.1
1     | 1  | 1  | 1        | 0.2        | -0.1       | 0             | 1     | 0.3      | 0.0
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
A single layer perceptron can perform a number of logical operations which are performed by a number of computational devices.

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
2     | 0  | 0  | 0        | 0.3        | 0.0        | 0             | 0     | 0.3      | 0.0
2     | 0  | 1  | 0        | 0.3        | 0.0        | 0             | 0     | 0.3      | 0.0
2     | 1  | 0  | 0        | 0.3        | 0.0        | 1             | -1    | 0.2      | 0.0
2     | 1  | 1  | 1        | 0.2        | 0.0        | 1             | 0     | 0.2      | 0.0
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
A single layer perceptron can perform a number of logical operations which are performed by a number of computational devices.

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
3     | 0  | 0  | 0        | 0.2        | 0.0        | 0             | 0     | 0.2      | 0.0
3     | 0  | 1  | 0        | 0.2        | 0.0        | 0             | 0     | 0.2      | 0.0
3     | 1  | 0  | 0        | 0.2        | 0.0        | 1             | -1    | 0.1      | 0.0
3     | 1  | 1  | 1        | 0.1        | 0.0        | 0             | 1     | 0.2      | 0.1
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
A single layer perceptron can perform a number of logical operations which are performed by a number of computational devices.

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
4     | 0  | 0  | 0        | 0.2        | 0.1        | 0             | 0     | 0.2      | 0.1
4     | 0  | 1  | 0        | 0.2        | 0.1        | 0             | 0     | 0.2      | 0.1
4     | 1  | 0  | 0        | 0.2        | 0.1        | 1             | -1    | 0.1      | 0.1
4     | 1  | 1  | 1        | 0.1        | 0.1        | 1             | 0     | 0.1      | 0.1
Notes on Artificial Neural Networks: Rosenblatt's Perceptron
A single layer perceptron can perform a number of logical operations which are performed by a number of computational devices.

Epoch | X1 | X2 | Ydesired | Initial W1 | Initial W2 | Actual Output | Error | Final W1 | Final W2
5     | 0  | 0  | 0        | 0.1        | 0.1        | 0             | 0     | 0.1      | 0.1
5     | 0  | 1  | 0        | 0.1        | 0.1        | 0             | 0     | 0.1      | 0.1
5     | 1  | 0  | 0        | 0.1        | 0.1        | 0             | 0     | 0.1      | 0.1
5     | 1  | 1  | 1        | 0.1        | 0.1        | 1             | 0     | 0.1      | 0.1