

Definition of Soft Computing

“Basically, soft computing is not a homogeneous body of concepts and techniques. Rather, it is a partnership of distinct methods that in one way or another conform to its guiding principle. At this juncture, the dominant aim of soft computing is to exploit the tolerance for imprecision and uncertainty to achieve tractability, robustness and low solution cost. The principal constituents of soft computing are fuzzy logic, neurocomputing, and probabilistic reasoning, with the latter subsuming genetic algorithms, belief networks, chaotic systems, and parts of learning theory. In the partnership of fuzzy logic, neurocomputing, and probabilistic reasoning, fuzzy logic is mainly concerned with imprecision and approximate reasoning; neurocomputing with learning and curve-fitting; and probabilistic reasoning with uncertainty and belief propagation.”

Soft computing could therefore be seen as a series of techniques and methods so that real practical situations could be dealt with in the same way as humans deal with them, i.e. on the basis of intelligence, common sense, consideration of analogies, approaches, etc. In this sense, soft computing is a family of problem-resolution methods headed by approximate reasoning and functional and optimisation approximation methods, including search methods. Soft computing is therefore the theoretical basis for the area of intelligent systems, and it is evident that the difference between the area of artificial intelligence and that of intelligent systems is that the first is based on hard computing and the second on soft computing. Soft computing is still growing and developing.

From this other viewpoint, on a second level, soft computing can then be expanded into other components which contribute to a definition by extension, such as the one first given. From the beginning (Bonissone 2002), the components considered to be the most important on this second level are probabilistic reasoning, fuzzy logic and fuzzy sets, neural networks, and genetic algorithms, which because of their interdisciplinary nature, applications and results immediately stood out over other methodologies such as the previously mentioned chaos theory, evidence theory, etc. The popularity of genetic algorithms, together with their proven efficiency in a wide variety of areas and applications, their attempt to imitate natural creatures (e.g. plants, animals, humans) which are clearly soft (i.e. flexible, adaptable, creative, intelligent, etc.), and especially their extensions and different versions, transform this fourth second-level ingredient into the well-known evolutionary algorithms, which consequently comprise the fourth fundamental component of soft computing, as shown in Figure 2.

Importance of Soft Computing 

The aim of Soft Computing is to exploit the tolerance for imprecision, uncertainty, approximate reasoning, and partial truth in order to achieve close resemblance to human-like decision making. Soft Computing is a new multidisciplinary field, intended to construct a new generation of Artificial Intelligence, known as Computational Intelligence.


What does Soft Computing mean?

The main goal of Soft Computing is to develop intelligent machines and to solve nonlinear and mathematically unmodelled system problems (Zadeh 1994) and (Zadeh 2001). The applications of Soft Computing have demonstrated two main advantages. First, it made it possible to solve nonlinear problems for which mathematical models are not available. Second, it introduced human knowledge such as cognition, recognition, understanding, learning, and others into the fields of computing. This resulted in the possibility of constructing intelligent systems such as autonomous self-tuning systems and automatically designed systems.

As stated in (Verdegay 2003), since the fuzzy boom of the 1990s, methodologies based on fuzzy sets (i.e. soft computing) have become a permanent part of all areas of research, development and innovation, and their application has been extended to all areas of our daily life: health, banking, home, and they are also the object of study at different educational levels. Similarly, there is no doubt that, thanks to the technological potential that we currently have, computers can handle problems of tremendous complexity (both in comprehension and dimension) in a wide variety of new fields.

As we mentioned above, since the 1990s, evolutionary algorithms have proved to be extremely valuable for finding good solutions to specific problems in these fields, and thanks to their scientific attractiveness, the diversity of their applications and the considerable efficiency of their solutions in intelligent systems, they have been incorporated into the second level of soft computing components. Evolutionary algorithms, however, are merely another class of heuristics, or metaheuristics, in the same way as Tabu Search, Simulated Annealing, Hill Climbing, Variable Neighbourhood Search, Estimation of Distribution Algorithms, Scatter Search, Reactive Search and very many others are. Generally speaking, all these heuristic algorithms (metaheuristics) usually provide solutions which are not ideal, but which largely satisfy the decision-maker or the user. When they act on the basis that satisfaction is better than optimization, they perfectly illustrate Zadeh's famous sentence (Zadeh 1994):

“…in contrast to traditional hard computing, soft computing exploits the tolerance for imprecision, uncertainty, and partial truth to achieve tractability, robustness, low solution cost, and better rapport with reality.”

Fuzzy Sets and Fuzzy Logic

Fuzzy sets were introduced by Zadeh in 1965 to represent and manipulate data and information possessing nonstatistical uncertainties. Fuzzy set theory was specifically designed to mathematically represent uncertainty and vagueness and to provide formalized tools for dealing with the imprecision intrinsic to many problems.

Fuzzy logic provides an inference morphology that enables approximate human reasoning capabilities to be applied to knowledge-based systems. The theory of fuzzy logic provides a mathematical strength to capture the uncertainties associated with human cognitive processes, such as thinking and reasoning.

The conventional approaches to knowledge representation lack the means for representing the meaning of fuzzy concepts. As a consequence, the approaches based on first-order logic and classical probability theory do not provide an appropriate conceptual framework for dealing with the representation of commonsense knowledge, since such knowledge is by its nature both lexically imprecise and noncategorical. The development of fuzzy logic was motivated in large measure by the need for a conceptual framework which can address the issues of uncertainty and lexical imprecision.

Some of the essential characteristics of fuzzy logic relate to the following (Zadeh, 1992):

• In fuzzy logic, exact reasoning is viewed as a limiting case of approximate reasoning.

• In fuzzy logic, everything is a matter of degree.
• In fuzzy logic, knowledge is interpreted as a collection of elastic or, equivalently, fuzzy constraints on a collection of variables.
• Inference is viewed as a process of propagation of elastic constraints.


• Any logical system can be fuzzified.

There are two main characteristics of fuzzy systems that give them better performance for specific applications:
• Fuzzy systems are suitable for uncertain or approximate reasoning, especially for systems whose mathematical model is difficult to derive.
• Fuzzy logic allows decision making with estimated values under incomplete or uncertain information.

Operations on fuzzy sets

We extend the classical set-theoretic operations from ordinary set theory to fuzzy sets. We note that all those operations which are extensions of crisp concepts reduce to their usual meaning when the fuzzy subsets have membership degrees that are drawn from {0, 1}. For this reason, when extending operations to fuzzy sets we use the same symbols as in set theory.

Let A and B be fuzzy subsets of a nonempty (crisp) set X.

Definition 16. (intersection) The intersection of A and B is defined as (A ∩ B)(t) = min{A(t), B(t)} = A(t) ∧ B(t), for all t ∈ X.

Intersection of two triangular fuzzy numbers.
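To make Definition 16 concrete, here is a minimal Python sketch of the min-based intersection. The triangular membership function and all names are illustrative assumptions, not part of the source.

    # A fuzzy subset of a crisp set X is modelled here as a membership
    # function X -> [0, 1]; intersection is pointwise minimum (Definition 16).

    def triangular(a, b, c):
        """Membership function of a triangular fuzzy number peaking at b."""
        def mu(t):
            if a < t <= b:
                return (t - a) / (b - a)
            if b < t < c:
                return (c - t) / (c - b)
            return 0.0
        return mu

    def intersection(A, B):
        """(A ∩ B)(t) = min{A(t), B(t)} for all t in X."""
        return lambda t: min(A(t), B(t))

    A = triangular(0.0, 2.0, 4.0)
    B = triangular(1.0, 3.0, 5.0)
    C = intersection(A, B)
    print(C(2.5))  # membership degree of t = 2.5 in A ∩ B: 0.75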

4 Neural Computing (Biological and Artificial Neural Networks)

Neural Computing, e.g. Artificial Neural Networks, is one of the most interesting and rapidly growing areas of research, attracting researchers from a wide variety of scientific disciplines. Starting from the basics, Neural Computing covers all the major approaches, putting each in perspective in terms of their capabilities, advantages, and disadvantages.

An Artificial Neural Network (ANN) is an information processing paradigm that is inspired by the way biological nervous systems, such as the brain, process information. The key element of this paradigm is the structure of the information processing system. It is composed of a large number of highly interconnected processing elements (neurones) working in unison to solve specific problems.

ANNs, like people, learn by example. An ANN is configured for a specific application, such as pattern recognition or data classification, through a learning process. Learning in biological systems involves adjustments to the synaptic connections that exist between the neurones. This is true of ANNs as well.

4.1 The brain as an information processing system

The human brain contains about 10 billion nerve cells, or neurons. On average, each neuron is connected to other neurons through about 10,000 synapses. (The actual figures vary greatly, depending on the local neuroanatomy.) The brain's network of neurons forms a massively parallel information processing system. This contrasts with conventional computers, in which a single processor executes a single series of instructions.

As a discipline of Artificial Intelligence, Neural Networks attempt to bring computers a little closer to the brain's capabilities by imitating certain aspects of information processing in the brain, in a highly simplified way.

The brain is not homogeneous. At the largest anatomical scale, we distinguish cortex, midbrain, brainstem, and cerebellum. Each of these can be hierarchically subdivided into many regions, and areas within each region, either according to the anatomical structure of the neural networks within it, or according to the function performed by them. The overall pattern of projections (bundles of neural connections) between areas is extremely complex, and only partially known. The best mapped (and largest) system in the human brain is the visual system, where the first 10 or 11 processing stages have been identified.

We distinguish feedforward projections that go from earlier processing stages (near the sensory input) to later ones (near the motor output), from feedback connections that go in the opposite direction. In addition to these long-range connections, neurons also link up with many thousands of their neighbours. In this way they form very dense, complex local networks.

The basic computational unit in the nervous system is the nerve cell, or neuron. A biological neuron has (see Figure 61):
• Dendrites (inputs)
• Cell body
• Axon (output)

Figure 61: A biological neuron

A neuron receives input from other neurons (typically many thousands). Inputs sum (approximately). Once input exceeds a critical level, the neuron discharges a spike – an electrical pulse that travels from the body, down the axon, to the next neuron(s) (or other receptors). This spiking event is also called depolarization, and is followed by a refractory period, during which the neuron is unable to fire.

The axon endings (Output Zone) almost touch the dendrites or cell body of the next neuron. Transmission of an electrical signal from one neuron to the next is effected by neurotransmitters, chemicals which are released from the first neuron and which bind to receptors in the second. This link is called a synapse. The extent to which the signal from one neuron is passed on to the next depends on many factors, e.g. the amount of neurotransmitter available, the number and arrangement of receptors, the amount of neurotransmitter reabsorbed, etc.


Brains learn. From what we know of neuronal structures, one way brains learn is by altering the strengths of connections between neurons, and by adding or deleting connections between neurons. Furthermore, they learn "on-line", based on experience, and typically without the benefit of a benevolent teacher. The efficacy of a synapse can change as a result of experience, providing both memory and learning through long-term potentiation. One way this happens is through the release of more neurotransmitters. Many other changes may also be involved, see Figure 62.

Figure 62: A biological neuron

Learning in artificial neural networks

A neural network has to be configured such that the application of a set of inputs produces (either 'direct' or via a relaxation process) the desired set of outputs. Various methods to set the strengths of the connections exist. One way is to set the weights explicitly, using a priori knowledge. Another way is to 'train' the neural network by feeding it teaching patterns and letting it change its weights according to some learning rule.

We can categorize the learning situations into two distinct kinds. These are:

• Supervised learning or associative learning, in which the network is trained by providing it with input and matching output patterns. These input-output pairs can be provided by an external teacher, or by the system which contains the network (self-supervised).

• Unsupervised learning or self-organization, in which an (output) unit is trained to respond to clusters of patterns within the input. In this paradigm the system is supposed to discover statistically salient features of the input population. Unlike the supervised learning paradigm, there is no a priori set of categories into which the patterns are to be classified; rather, the system must develop its own representation of the input stimuli.


Hebb rule

Both learning paradigms discussed above result in an adjustment of the weights of the connections between units, according to some modification rule. Virtually all learning rules for models of this type can be considered as a variant of the Hebbian learning rule suggested by Hebb in the classic book Organization of Behaviour (Hebb 1949). The Hebb rule determines the change in the weight of the connection from ui to uj by Δwij = α · yi · yj, where α is the learning rate and yi, yj represent the activations of ui and uj respectively. Thus, if both ui and uj are activated, the weight of the connection from ui to uj should be increased.

Examples can be given of input/output associations which can be learned by a two-layer Hebb-rule pattern associator. In fact, it can be proved that if the set of input patterns used in training are mutually orthogonal, the association can be learned by a two-layer pattern associator using Hebbian learning. However, if the set of input patterns are not mutually orthogonal, interference may occur and the network may not be able to learn associations. This limitation of Hebbian learning can be overcome by using the delta rule.
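As a rough illustration of the Hebbian update Δwij = α · yi · yj on a two-layer pattern associator, consider the following Python sketch; the bipolar patterns and all names are illustrative assumptions.

    import numpy as np

    def hebb_update(W, y_in, y_out, alpha):
        """Hebb rule: each weight w_ij grows by alpha * y_i * y_j."""
        return W + alpha * np.outer(y_in, y_out)

    # Two mutually orthogonal input patterns (their dot product is zero),
    # so a two-layer pattern associator can store both without interference.
    inputs  = np.array([[1, -1,  1, -1],
                        [1,  1, -1, -1]])
    targets = np.array([[ 1, -1],
                        [-1,  1]])

    W = np.zeros((4, 2))
    for x, t in zip(inputs, targets):
        W = hebb_update(W, x, t, alpha=0.5)

    print(np.sign(inputs @ W))  # recalls the stored targets exactly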

Delta rule

The delta rule (Russell 2005), also called the Least Mean Square (LMS) method, is one of the most commonly used learning rules. For a given input vector, the output vector is compared to the correct answer. If the difference is zero, no learning takes place; otherwise, the weights are adjusted to reduce this difference. The change in weight from ui to uj is given by Δwij = α · yi · ej, where α is the learning rate, yi represents the activation of ui and ej is the difference between the expected output and the actual output of uj. If the set of input patterns form a linearly independent set, then arbitrary associations can be learned using the delta rule.

This learning rule not only moves the weight vector nearer to the ideal weight vector, it does so in the most efficient way. The delta rule implements a gradient descent by moving the weight vector from the point on the surface of the paraboloid down toward the lowest point, the vertex.
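A minimal sketch of the LMS update Δwij = α · yi · ej for a single-layer linear network follows; the data and the learning rate are illustrative assumptions.

    import numpy as np

    def delta_update(W, y_in, target, alpha):
        """Delta rule: adjust weights in proportion to the remaining error."""
        error = target - y_in @ W          # e_j = expected - actual output
        return W + alpha * np.outer(y_in, error)

    # Linearly independent inputs, so an arbitrary association is learnable.
    inputs  = np.array([[1.0, 0.0],
                        [1.0, 1.0]])
    targets = np.array([[0.5],
                        [1.5]])

    W = np.zeros((2, 1))
    for _ in range(200):                   # repeated presentations
        for x, t in zip(inputs, targets):
            W = delta_update(W, x, t, alpha=0.2)

    print(inputs @ W)                      # converges toward the targets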

In the case of linear activation functions, where the network has no hidden units, the delta rule will always find the best set of weight vectors. On the other hand, that is not the case for hidden units. The error surface is not a paraboloid, and so does not have a unique minimum point. There is no rule as powerful as the delta rule for networks with hidden units. There have been a number of theories in response to this problem. These include the generalized delta rule and the unsupervised competitive learning model.

Generalizing the ideas of the delta rule, consider a hierarchical network with an input layer, an output layer and a number of hidden layers. We consider only the case where there is one hidden layer. The network is presented with input signals which produce output signals that act as input to the middle layer. Output signals from the middle layer in turn act as input to the output layer to produce the final output vector. This vector is compared to the desired output vector. Since both the output and the desired output vectors are known, we can calculate the difference between them and obtain the error of the neural network. The error is backpropagated from the output layer through the middle layer to the units which are responsible for generating that output. The delta rule can be used to adjust all the weights. More details are presented in (Fausett 1994).
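The following Python sketch shows this scheme on a one-hidden-layer network; the XOR data, network size, and learning rate are assumptions for illustration, not from the source.

    import numpy as np

    rng = np.random.default_rng(0)
    X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # input signals
    T = np.array([[0.], [1.], [1.], [0.]])                  # desired outputs (XOR)

    W1 = rng.normal(0.0, 1.0, (2, 4))   # input -> hidden weights
    W2 = rng.normal(0.0, 1.0, (4, 1))   # hidden -> output weights
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))

    for _ in range(5000):
        H = sigmoid(X @ W1)             # middle-layer output
        Y = sigmoid(H @ W2)             # final output vector
        d_out = (T - Y) * Y * (1 - Y)   # error at the output layer
        d_hid = (d_out @ W2.T) * H * (1 - H)  # error backpropagated to hidden units
        W2 += 0.5 * H.T @ d_out         # delta-rule-style adjustment, each layer
        W1 += 0.5 * X.T @ d_hid

    print(Y.round(2))                   # approaches the desired outputs

Since the error surface for networks with hidden units has no unique minimum, a different random initialization may settle at a different solution.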

5. Examples of business applications

There are many applications of ANNs in today's business. Financial institutions are improving their decision making by enhancing the interpretation of behavioral scoring systems and developing superior ANN models of credit card risk and bankruptcy [14,22]. Securities and trading houses are developing and improving their forecasting techniques and trading strategies with ANNs. Insurance companies are managing risk better by using ANNs to develop a model of top underwriters and using this as a training and evaluation tool for other underwriters. Manufacturers are improving their product quality through predictive process control systems using ANNs [18]. Oil and gas corporations are learning more from their data by using ANNs to interpret seismic signals and sub-surface images to improve their exploration effort. Four actual ANN applications are now described.


5.1. Airline security control

With the increasing threat of terrorism, airline passengers' bags in international airports such as New York, Miami, and London go through an unusually rigorous inspection before being loaded into the cargo bay [4]. In addition to using metal detectors and x-ray stations to detect metal weapons, these airports use ANNs to screen for plastic explosives. They use a detection system which bombards the luggage with neutrons and monitors the gamma rays that are emitted in response. The network then analyzes the signal to decide whether the response predicts an explosive. The purpose of this operation is to detect explosives with a 95 percent probability, while minimizing the number of false alarms.

Detecting explosives using gamma rays is not simple, since different chemical elements release different frequencies. Explosive materials are rich in nitrogen, but so are some benign substances, including protein-rich materials such as wool and leather. Though an abundance of gamma rays at nitrogen's frequency raises some suspicion, it is difficult to make a distinction. To minimize the classification error, supervised training was conducted. The ANN was fed with a batch of instrument readings as well as the information on whether explosives were indeed present. The trained network was able to achieve its intended purpose. The entire security system can handle 600 to 700 bags per hour, and the network raises false alarms on only 2 percent of the harmless bags at the 95 percent detection point. This reduction in false alarms translates into many fewer bags that must be opened and examined each day. In turn, it reduces the cost of airport operations, increases the efficiency of the check-in process, and improves the satisfaction of customers.

5.2. Investment management and risk control

Neural Systems Inc. [17] makes use of a supervised network to mimic the recommendations of money managers on the optimal allocation of assets among Treasury instruments. The application demonstrated how well an ANN can be trained to recognize the shape and evolution of the interest-yield curve and to make recommendations as to long or short positions in the US Treasury market. The network was trained on measured and calculated economic indicators, such as the evolution of interest rates, price changes, and the shape and speed of the change of the yield curves. The network could then determine the optimal allocation among segments in various Treasury instruments being measured against a benchmark or comparator performance index. It could also determine the dynamic relationship between different variables in portfolio management and risk control. Consequently, it allowed more active control of the portfolio's level of certainty. Based on the experience gained with this application, another ANN with a higher level of complexity was subsequently developed.

5.3. Prediction of thrift failures

Professor Linda M. Salchenberger and her colleagues at the Loyola University of Chicago have developed an ANN to predict the financial health of savings and loan associations. They identified many possible inputs to the network. Through stepwise regression analyses, 5 significant variables were identified (out of 29). These variables were the ratios of: net worth/total assets, repossessed assets/total assets, net income/gross income, net income/total assets, and cash plus securities/total assets. These ratios were selected to measure, respectively, capital adequacy, asset quality, management efficiency, earnings, and liquidity. After identifying the input variables, they conducted some experiments and selected a single-middle-layer, feed-forward, backpropagation network consisting of 5 input nodes, 3 middle-layer nodes, and one output node (see Figure 6). The output node was interpreted as the probability that an institution was classified as failed or surviving. To train the network, supervised learning was conducted with training sets consisting of the five financial ratios and the corresponding failed or surviving result from 100 failed and 100 surviving S and L institutions between January 1986 and December 1987. The results showed that the three-layer ANN gained more predictive power over the logit model. The latter is equivalent to a two-layer (no middle-layer) network.

5.4. Prediction of stock price index

With limited knowledge about the stock market and with only data available from a public library, Ward Systems Group, Inc. [26] created an example showing how one might set up an ANN application to predict stock market behavior. The first step was to decide what to predict or classify (i.e., the target outputs). Obviously there are many possible outputs that could be predicted, such as turning points, market direction, etc. For this application, the next month's average Standard and Poor's stock price index was selected. The next step was to consider which input facts or parameters are necessary or useful for predicting the target outputs. In this case, the stock price index for the current month was chosen because it should be an important factor in predicting next month's index. In addition, nine other publicly available economic indicators were selected: unadjusted retail sales, average three-month Treasury bill rate, total U.S. Government securities, industrial production index, New York gold price, outstanding commercial paper and acceptances, Swiss Franc value, U.S. Government receipts, and U.S. Government expenditures (see Figure 7).

Next, the case characteristics for the problem were entered into the system. These included the defining characteristics (the names of the input parameters) and the classifying characteristics (the names of the output results). Finally, examples of previous results were entered in order to train the network. These case histories contain information for all the months in the years 1974 to 1979. The goal was to see if the system could predict the monthly stock price indexes in 1980.

After several hours of training, the network was able to predict the next month's stock price index for all of 1980. The results showed that such a neural system can produce the first 8-month predictions with less than 3.2% average absolute error and the entire 12-month predictions with only 4% average error. Therefore, through a carefully designed ANN, it is possible to predict the volatile stock market.

6. Limitations of artificial neural networks

The artificial neural network is undoubtedly a powerful tool for decision making, but there are several weaknesses in its use.

(1) ANN is not a general-purpose problem solver. It is good at complex numerical computation for the purposes of solving systems of linear or non-linear equations, organizing data into equivalent classes, and adapting the solution model to environmental changes. However, it is not good at such mundane tasks as calculating payroll, balancing checks, and generating invoices. Neither is it good at logical inference – a job suited for expert systems. Therefore, users must know when a problem could be solved with an ANN.

(2) There is no structured methodology available for choosing, developing, training, and verifying an ANN [23]. The solution quality of an ANN is known to be affected by the number of layers, the number of neurons at each layer, the transfer function of each neuron, and the size of the training set. One would think that the more data in the training set, the better the accuracy of the output. But this is not so. While too small a training set will prohibit the network from developing generalized patterns of the inputs, too large a one will break down the generalized patterns and make the network sensitive to input noise. In any case, the selection of these parameters is more of an art than a science. Users of ANNs must conduct experiments (or sensitivity analyses) to identify the best possible configuration of the network. This calls for easy-to-use and easy-to-modify ANN development tools that are gradually appearing on the market.


(3) There is no single standardized paradigm for ANN development. Because of its interdisciplinary nature, there have been duplicating efforts spent on ANN research. For example, the backpropagation learning algorithm was independently developed by three groups of researchers at different times: Werbos [29], Parker [19], and Rumelhart, Hinton, and Williams [21]. To resolve this problem, the ANN community should establish a repository of available paradigms to facilitate knowledge transfer between researchers.

Moreover, to make an ANN work, it must be tailored specifically to the problem it is intended to solve. To do so, users of ANN must select a particular paradigm as the starting prototype. However, there are many possible paradigms, and without proper training users may easily get lost. Fortunately, most of the ANN development tools commercially available today provide scores of sample paradigms that work on various classes of problems. A user may follow the advice and tailor it to his or her own needs.

Genetic algorithms

A genetic algorithm is a type of searching algorithm. It searches a solution space for an optimal solution to a problem. The key characteristic of the genetic algorithm is how the searching is done. The algorithm creates a "population" of possible solutions to the problem and lets them "evolve" over multiple generations to find better and better solutions. The generic form of the genetic algorithm is shown in Figure 51. The items in bold in the algorithm are defined here.

The population consists of the collection of candidate solutions that we are considering during the course of the algorithm. Over the generations of the algorithm, new members are "born" into the population, while others "die" out of the population. A single solution in the population is referred to as an individual. The fitness of an individual is a measure of how "good" the solution represented by the individual is. The better the solution, the higher the fitness value – obviously, this is dependent on the problem to be solved. The selection process is analogous to survival of the fittest in the natural world. Individuals are selected for "breeding" (or cross-over) based upon their fitness values. The crossover occurs by mingling two solutions together to produce two new individuals. During each generation, there is a small chance for each individual to mutate.
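Since Figure 51 is not reproduced here, the following Python skeleton sketches the generic loop just described; every function passed in is a problem-specific placeholder, and non-negative fitness values are assumed.

    import random

    def genetic_algorithm(make_individual, fitness, crossover, mutate,
                          pop_size=100, generations=50):
        """Generic GA: evolve a population of candidate solutions."""
        population = [make_individual() for _ in range(pop_size)]
        for _ in range(generations):        # simple termination condition
            weights = [fitness(ind) for ind in population]  # assumed >= 0
            children = []
            while len(children) < pop_size:
                # Selection is probabilistically weighted by fitness.
                a, b = random.choices(population, weights=weights, k=2)
                a, b = crossover(a, b)      # breed two new individuals
                children += [mutate(a), mutate(b)]
            population = children[:pop_size]
        return max(population, key=fitness)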

To use a genetic algorithm, there are several questions that need to be answered:
• How is an individual represented?
• How is an individual's fitness calculated?
• How are individuals selected for breeding?
• How are individuals crossed-over?
• How are individuals mutated?
• What is the size of the population?
• What are the "termination conditions"?


Most of these questions have problem-specific answers. The last two, however, can be discussed in a more general way. The size of the population is highly variable. The population should be as large as possible; the limiting factor is, of course, the running time of the algorithm. A larger population means more time-consuming calculation.

The algorithm in Figure 51 has a very vague end point – the meaning of "until the termination conditions are met" is not immediately obvious. The reason for this is that there is no one way to end the algorithm. The simplest approach is to run the search for a set number of generations – the longer, the better. Another approach is to end the algorithm after a certain number of generations pass with no improvement of the fitness of the best individual in the population. There are other possibilities as well. Since most of the other questions are dependent upon the search problem, we will look at two example problems that can be solved using genetic algorithms: finding a mathematical function's maximum and the travelling salesman problem.

3.2.1 Function maximization

Example (Thede 2004). One application for a genetic algorithm is to find values for a collection of variables that will maximize a particular function of those variables. While this type of problem could be solved otherwise, it is useful as an example of the operation of genetic algorithms. For this example, let's assume that we are trying to determine the variables that produce the maximum value of this function:

f(w, x, y, z) = w^3 + x^2 - y^2 - z^2 + 2yz - 3wx + wz - xy + 2

This could probably be solved using multivariable calculus, but it is a good simple example of the use of genetic algorithms. To use the genetic algorithm, we need to answer the questions listed in the previous section.

How is an individual represented?

What information is needed to have a "solution" of the maximization problem? It is clear that we need only the values of w, x, y, and z. Assuming that we have values for these four variables, we have a candidate solution for our problem.

The question is how to represent these four values. A simple way to do this is to use an array of four values (integers or floating-point numbers). However, for genetic algorithms it is usually better to have a larger individual – this way, variations can be done in a more subtle way. The research shows (Holland 1975) that representing individuals using bit strings offers the best performance. We can simply choose a size in bits for each variable, and then concatenate the four values together into a single bit string. For example, we will choose to represent each variable as a four-bit integer, making our entire individual a 16-bit string. Thus, an individual such as

1101 0110 0111 1100

represents a solution where w = 13, x = 6, y = 7, and z = 12.
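A small sketch of this encoding in Python; the helper names decode and encode are illustrative.

    def decode(bits):
        """Split a 16-bit string into four 4-bit unsigned integers w, x, y, z."""
        return tuple(int(bits[i:i + 4], 2) for i in range(0, 16, 4))

    def encode(w, x, y, z):
        """Concatenate four 4-bit values into a single 16-bit individual."""
        return "".join(format(v, "04b") for v in (w, x, y, z))

    print(decode("1101011001111100"))  # (13, 6, 7, 12), as in the text
    print(encode(13, 6, 7, 12))        # '1101011001111100'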

How is an individual's fitness calculated?

Next, we consider how to determine the fitness of each individual. There is generally a differentiation between the fitness and evaluation functions. The evaluation function is a function that returns an absolute measure of the individual. The fitness function is a function that measures the value of the individual relative to the rest of the population.

In our example, an obvious evaluation function would be to simply calculate the value of f for the given variables. For example, assume we have a population of 4 individuals:

1010 1110 1000 0011
0110 1001 1111 0110
0111 0110 1110 1011
0001 0110 1000 0000

The first individual represents w = 10, x = 14, y = 8, and z = 3, for an f value of 671. The values for the entire population can be computed in the same way, as in the sketch below.
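Reusing the decode helper sketched above, the evaluation of the whole sample population can be reproduced; the first individual gives 671 as stated, and the other values follow directly from the formula.

    def f(w, x, y, z):
        """The evaluation function from the example."""
        return w**3 + x**2 - y**2 - z**2 + 2*y*z - 3*w*x + w*z - x*y + 2

    population = ["1010111010000011",  # w=10, x=14, y=8,  z=3
                  "0110100111110110",  # w=6,  x=9,  y=15, z=6
                  "0111011011101011",  # w=7,  x=6,  y=14, z=11
                  "0001011010000000"]  # w=1,  x=6,  y=8,  z=0

    for bits in population:
        print(bits, f(*decode(bits)))  # evaluations: 671, -43, 239, -91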


The fitness function can be chosen from many options. For example, the individuals could be listed in order from the lowest to the highest evaluation function values and an ordinal ranking applied, or the fitness function could be the individual's evaluation value divided by the average evaluation value. Both approaches yield the selection probabilities discussed below.

The key is that the fitness of an individual should represent the value of the individual relative to the rest of the population, so that the best individual has the highest fitness.

How are individuals selected for breeding?

The key to the selection process is that it should be probabilistically weighted so that higher-fitness individuals have a higher probability of being selected. Other than these specifications, the method of selection is open to interpretation.

One possibility is to use the ordinal method for the fitness function, then calculate a probability of selection that is equal to the individual's fitness value divided by the total fitness of all the individuals. In the example above, that would give the first individual a 40% chance of being selected, the second a 20% chance, the third a 30% chance, and the fourth a 10% chance. It gives better individuals more chances to be selected.

A similar approach could be used with the average fitness calculation. This would give the first individual a 72% chance, the second a 5% chance, the third a 22% chance, and the fourth a 1% chance. This method makes the probability more dependent on the relative evaluation functions of each individual.
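The ordinal variant can be sketched as follows; the computed probabilities match the 40/20/30/10 percentages above, and random.choices then implements the weighted (roulette-wheel) draw. Names are illustrative.

    import random

    def ordinal_probs(evals):
        """Rank-based selection: probability = rank / sum of ranks."""
        order = sorted(range(len(evals)), key=lambda i: evals[i])
        ranks = [0] * len(evals)
        for rank, i in enumerate(order, start=1):
            ranks[i] = rank                # worst individual gets rank 1
        total = sum(ranks)
        return [r / total for r in ranks]

    evals = [671, -43, 239, -91]           # evaluation values computed earlier
    probs = ordinal_probs(evals)
    print(probs)                           # [0.4, 0.2, 0.3, 0.1]

    population = ["1010111010000011", "0110100111110110",
                  "0111011011101011", "0001011010000000"]
    parent = random.choices(population, weights=probs)[0]  # weighted selection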

How are individuals crossed-over?

Once we have selected a pair of individuals, they are "bred" – or in genetic algorithm language, they are crossed-over. Typically two children are created from each set of parents. One method for performing the cross-over is described here, but there are other approaches. Two locations are randomly chosen within the individual. These define corresponding substrings in each individual. The substrings are swapped between the two parent individuals, creating two new children. For example, let's look at our four individuals again:

1010 1110 1000 0011
0110 1001 1111 0110
0111 0110 1110 1011
0001 0110 1000 0000

Let's assume that the first and third individuals are chosen for cross-over. Keep in mind that the selection process is random. The fourth and fourteenth bits are randomly selected to define the substring to be swapped; the sketch below reproduces this cross-over.
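A Python sketch of this substring swap, treating the chosen bit positions as 1-indexed and inclusive; with positions 4 and 14 it produces exactly the two children listed in the second generation below.

    def crossover(parent_a, parent_b, lo, hi):
        """Swap the substring from bit lo through bit hi (1-indexed, inclusive)."""
        i, j = lo - 1, hi                  # 0-indexed slice bounds
        child_a = parent_a[:i] + parent_b[i:j] + parent_a[j:]
        child_b = parent_b[:i] + parent_a[i:j] + parent_b[j:]
        return child_a, child_b

    ind1 = "1010111010000011"              # first individual
    ind3 = "0111011011101011"              # third individual
    print(crossover(ind1, ind3, 4, 14))
    # ('1011011011101011', '0110111010000011')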


Thus, two new individuals are created. We should create new individuals until we replace the entire population – in our example, we need one more cross-over operation. Assume that the first and fourth individuals are selected this time. Note that an individual may be selected multiple times for breeding, while other individuals might never be selected. Further assume that the eleventh and sixteenth bits are randomly selected for the cross-over point. Applying this second cross-over, the second generation of the population is the following:

1011 0110 1110 1011
0110 1110 1000 0011
1010 1110 1000 0000
0001 0110 1000 0011

How are individuals mutated?

Finally, we need to allow individuals to mutate. When using bit strings, the easiest way to implement the mutation is to allow every single bit in every individual a chance to mutate. This chance should be very small, since we don't want individuals changing dramatically due to mutation. The percentage is typically set so that, on average, roughly one bit per individual changes.

The mutation will consist of having a bit "flip": 1 changes to 0 and 0 changes to 1. In our example, assume that the following bits have been chosen for mutation:

1011 0110 1110 1011 → 1011 0110 1110 1011 (no bit mutated)
0110 1110 1000 0011 → 0110 1010 1000 0011 (bit 6 flipped)
1010 1110 1000 0000 → 1010 1110 1001 0000 (bit 12 flipped)
0001 0110 1000 0011 → 0101 0110 1000 0001 (bits 2 and 15 flipped)
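A sketch of this bit-flip mutation; the default probability of 1/length gives roughly one flipped bit per individual on average, as suggested above.

    import random

    def mutate(bits, p=None):
        """Flip each bit independently with probability p."""
        p = (1.0 / len(bits)) if p is None else p
        flip = {"0": "1", "1": "0"}
        return "".join(flip[b] if random.random() < p else b for b in bits)

    print(mutate("0110111010000011"))      # e.g. '0110101010000011'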

Operators of genetic programming

Crossover Operator

Two primary operations exist for modifying structures in genetic programming. The most important one is the crossover operation. In the crossover operation, two solutions are combined to form two new solutions or offspring. The parents are chosen from the population by a function of the fitness of the solutions. Three methods exist for selecting the solutions for the crossover operation.

Another method for selecting the solution to be copied is tournament selection. Typically the genetic program chooses two solutions at random. The solution with the higher fitness wins. This method simulates biological mating patterns in which two members of the same sex compete to mate with a third one of a different sex. Finally, the third method is selection by rank. In rank selection, selection is based on the rank (not the numerical value) of the fitness values of the solutions of the population (Koza 1992).

The creation of offspring from the crossover operation is accomplished by deleting the crossover fragment of the first parent and then inserting the crossover fragment of the second parent. The second offspring is produced in a symmetric manner. For example, consider the two S-expressions in Figure 54, written in a modified Scheme programming language and represented as trees.

An important improvement that genetic programming displays over genetic algorithms is its ability to create two new solutions from the same solution. In Figure 55 the same parent is used twice to create two new children. This figure illustrates one of the main advantages of genetic programming over genetic algorithms: in genetic programming identical parents can yield different offspring, while in genetic algorithms identical parents would yield identical offspring. The bold selections indicate the subtrees to be swapped.
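A rough sketch of subtree crossover on S-expressions represented as nested Python lists; all helper names are illustrative assumptions. Because the crossover fragments are chosen at random, even two identical parents can yield different offspring, as the text notes.

    import copy
    import random

    def subtree_paths(tree, path=()):
        """Paths to every subtree; index 0 of a list is the operator."""
        paths = [path]
        if isinstance(tree, list):
            for i, child in enumerate(tree[1:], start=1):
                paths += subtree_paths(child, path + (i,))
        return paths

    def get_subtree(tree, path):
        for i in path:
            tree = tree[i]
        return tree

    def set_subtree(tree, path, fragment):
        for i in path[:-1]:
            tree = tree[i]
        tree[path[-1]] = fragment

    def gp_crossover(parent_a, parent_b):
        """Swap randomly chosen crossover fragments between two parents."""
        a, b = copy.deepcopy(parent_a), copy.deepcopy(parent_b)
        pa = random.choice(subtree_paths(a)[1:])   # skip the root itself
        pb = random.choice(subtree_paths(b)[1:])
        frag_a, frag_b = get_subtree(a, pa), get_subtree(b, pb)
        set_subtree(a, pa, frag_b)
        set_subtree(b, pb, frag_a)
        return a, b

    # Identical parents, possibly different offspring:
    parent = ['+', ['*', 'a', 'b'], ['-', 'x', 2]]
    print(gp_crossover(parent, parent))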


Mutation Operator

Mutation is another important feature of genetic programming. Two types of mutation are possible. In the first kind, a function can only replace a function or a terminal can only replace a terminal. In the second kind, an entire subtree can replace another subtree. Figure 56 illustrates the concept: the top parse tree is the original agent. The bottom-left parse tree illustrates a mutation of a single terminal (2) for another single terminal (a), as well as a mutation of a single function (-) for another single function (+). The parse tree on the bottom right illustrates a replacement of a subtree by another subtree.


3.3.2 Applications of genetic programming

Genetic programming can be used, for example, to solve the following kinds of tasks:

Gun Firing Program. A more complicated example consists of training a genetic program to fire a gun to hit a moving target. The fitness function is the distance by which the bullet misses the target. The program has to learn to take into account a number of variables, such as the wind velocity, the type of gun used, the distance to the target, the height of the target, and the velocity and acceleration of the target. This problem represents the type of problem for which genetic programs are best: a simple fitness function with a large number of variables.

Water Sprinkler System. Consider a program to control the flow of water through a system of water sprinklers. The fitness function is the correct amount of water evenly distributed over the surface. Unfortunately, there is no one variable encompassing this measurement. Thus, the problem must be modified to find a numerical fitness. One possible solution is placing water-collecting measuring devices at certain intervals on the surface. The fitness could then be the standard deviation in water level across all the measuring devices. Another possible fitness measure could be the difference between the lowest measured water level and the ideal amount of water; however, this number would not account in any way for the water levels at the other measuring devices, which may not be at the ideal mark.

Maze Solving Program. If one were to create a program to find the solution to a maze, the program would first have to be trained with several known mazes. The ideal solution from the start to the finish of the maze would be described by a path of dots. The fitness in this case would be the number of dots the program is able to find. In order to prevent the program from wandering around the maze too long, a time limit is implemented along with the fitness function.

