
Automatic Design of Neural Networks with L-Systems and Genetic Algorithms - A Biologically Inspired Methodology

Lídio Mauro Lima de Campos, Mauro Roisenberg and Roberto Célio Limão de Oliveira

Abstract— In this paper we introduce a biologically plausible methodology capable of automatically generating Artificial Neural Networks (ANNs) with an optimal number of neurons and an adequate connection topology. To do this, three biological metaphors were used: Genetic Algorithms (GA), Lindenmayer Systems (L-Systems) and ANNs. The methodology tries to mimic the natural process of nervous system growth and evolution, using L-Systems as a recipe for the development of the neurons and their connections, and the GA to evolve and optimize the nervous system architecture suited to a specific task. The technique was tested on three well-known simple problems in which recurrent network topologies must be evolved. A more complex problem, involving time series learning, was also proposed as an application. The experimental results show that our proposal is very promising and can generate appropriate neural network architectures with an optimal number of neurons and connections, good generalization capacity, small error and high noise tolerance.

I. INTRODUCTION

Natural Computing is a research area that, inspired by nature, investigates models and computational techniques and, at the same time, uses computational systems to synthesize natural phenomena, trying to understand the world around us. It is a highly interdisciplinary field that connects the natural sciences and computer science. One of the most promising computational paradigms studied by natural computing is Artificial Neural Network (ANN) technology. ANNs are inspired by the structural organization and functioning of nervous systems. They are widely used in a variety of applications, such as pattern recognition, control systems, financial forecasting, medicine and agriculture. However, some of the first questions still faced by a user applying ANN technology to a given problem are: What kind of ANN should I use? How many neurons will be needed to represent the problem solution with the desired accuracy?

Many researchers in natural computation are devoted to studying solutions to the problem of automatic design of neural networks [2], [4], [7], [11], [15], [14]. However, the realization of an optimized architecture, in terms of number of neurons and appropriate topology, is not a simple task. The time and effort required to design an ANN depend entirely on the nature and complexity of the task: e.g., a feedforward neural network with static neurons is

Lídio Mauro Lima de Campos is with the Department of Information Systems, Federal University of Pará in Castanhal, Brazil, e-mail: [email protected]. Mauro Roisenberg is with the Department of Informatics and Statistics, Federal University of Santa Catarina, Florianópolis, Brazil (e-mail: {mauro}@inf.ufsc.br). Roberto Célio Limão de Oliveira is with the Department of Electrical Engineering, Federal University of Pará, Brazil, e-mail: [email protected].

incapable of representing dynamic systems [8], which can only be appropriately modeled by a recurrent neural network topology. This dependence leads to a significant amount of time and effort being spent to find an optimal ANN architecture for a given class of problem.

The main idea is that Nature has already solved this problem as far as animal nervous systems are concerned. Each animal species has its own repertoire of behavior, generated and controlled by a nervous system that is codified in the genome. The genome, in turn, is optimized and tailored by evolutionary processes. This mechanism seeks to create and select the individuals best adapted to solve a specific problem: to survive and reproduce in a given environment. That is, Nature begins by looking for simple solutions and, once these solutions are no longer able to survive in the environment, ever more complex solutions are evolved. Based on this idea, we join three computing paradigms inspired by natural phenomena: Artificial Neural Networks, Genetic Algorithms (GA) and Lindenmayer Systems (L-Systems), which attempt to mimic the developmental process of neurons. This process is governed by the genetic information contained in every cell of the organism, and this information is not a blueprint of the final form but can be seen as a recipe that is followed by each cell individually.

Currently there are several methods for the automatic design of neural networks. Most of them use a GA to optimize neural network parameters, such as the number of neurons and the weight values; the early methods used direct encoding [24], [27]. These methods have scalability problems and a low level of biological plausibility and, in addition, are only capable of generating feedforward topologies, which are functional models only for the solution of static problems. Other researchers have proposed indirect encoding strategies, in which rules are codified that carry information on how the network is to be built: [11] uses grammars to generate graphs, while [1], [24], [25], [23], [26] sought inspiration in nature and used GAs and L-Systems to generate modular ANNs; these methodologies are more biologically plausible. In most of these studies the methodology is presented as a more scalable way to find the best number of neurons and connections, but in all of them only feedforward artificial neural network topologies are obtained.

Nowadays some researchers are trying to obtain more complex topologies using biological inspiration. To develop a new methodology able to generate neural network structures with both different numbers of neurons and different topologies, we investigate and describe in the next section the direct encoding, grammatical encoding and Genetic Auto-design of Neural Networks (GADONN) methods of automatic design of neural network structures, respectively. Then, in section III, our methodology, which uses L-rules and a GA, is presented; it can evolve networks to solve both static and dynamic problems. In section IV the experiments are described and discussed, and finally in section V conclusions and directions for future research are presented.

II. METHODS OF AUTOMATIC DESIGN OF NEURAL NETWORKS

This section discusses some design methods for ANN architectures. The two major components involved in the evolution of architectures are the genotype representation scheme and the evolutionary algorithm (EA) used to evolve the ANN architectures. One of the key issues in encoding is deciding how much information about the architecture should be encoded in the chromosome. In the next subsections we discuss some of these aspects, also presenting the degree of biological plausibility, the scalability and the neural computability of each method.

A. Direct Encoding

At one extreme of the methods for automatic design of neural networks are the so-called Direct Methods. In a Direct Encoding method all the details, i.e., every connection and node of the architecture, must be specified in the chromosome. These aspects are illustrated by the work of [27] and [18], who restricted their initial project to feedforward networks with a fixed number of units, for which the GA evolved the connection topology. As shown in Fig 1, the connection topology was represented by an NxN matrix (5x5), where each entry encodes the type of connection from the "from unit" to the "to unit". The entries in the connectivity matrix were either "0" (meaning no connection) or "L" (meaning a connection). In the direct method all the information about the structure of the neural network is represented directly in the binary string used as the genotype. In the method proposed by [27] the genotype is obtained by concatenating the lines or columns of the connectivity matrix into a binary string. Using this scheme, a binary string of size N^2 represents a neural network with N neurons. The direct encoding method has a low degree of biological plausibility, since no developmental process, of the kind of production rules that recall the DNA-controlled development of the embryo, mediates between genotype and phenotype. [4] considers that one of the major disadvantages of direct methods is their poor scalability; in other words, the length of the genotype representing the structure of the neural network grows steeply as the number of neurons in the network increases. The neural networks obtained by [18] and [28] can only simulate static problems.
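As a concrete sketch of this row-concatenation scheme of [27] (the helper names and the example matrix below are ours, for illustration only, not the matrix of Fig 1):

```python
# Direct encoding sketch: the genotype is simply the concatenation of the
# rows of an N x N connectivity matrix into a binary string of length N^2.

def encode_direct(matrix):
    """Concatenate the rows of an N x N connectivity matrix into a bit string."""
    return "".join(str(bit) for row in matrix for bit in row)

def decode_direct(genotype, n):
    """Rebuild the N x N connectivity matrix from an N^2-bit genotype."""
    assert len(genotype) == n * n
    return [[int(genotype[i * n + j]) for j in range(n)] for i in range(n)]

# An illustrative 5x5 matrix: entry (i, j) = 1 means a connection from
# unit i (the "from unit") to unit j (the "to unit").
matrix = [
    [0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
]
genotype = encode_direct(matrix)
print(len(genotype))                       # 25 bits for N = 5
print(decode_direct(genotype, 5) == matrix)
```

The quadratic genotype length visible here is exactly the scalability weakness discussed above.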

B. Grammatical Encoding or Indirect Encoding

At the other extreme are the indirect methods, which use developmental rules [12], [11], [1], [2], [17], [3], [22] to encode information in the genotype. In these methods, instead of encoding the connectivity pattern and its corresponding

Fig. 1. Direct Encoding

parameters in the genotype, a set of developmental rules is encoded into it. These rules in fact control the growth of a cell whose structure at maturity represents the connectivity pattern and its corresponding parameters. This kind of representation scheme is called indirect encoding; it has been used by many researchers [12], [11], [1], [2], [17], [3], [22] with the objective of reducing the genotypic representation of architectures. The details of each connection in an ANN are either predefined according to prior knowledge or specified by a set of deterministic developmental rules. The indirect encoding scheme can produce more compact genotypic representations of ANN architectures. [11] applied this general idea to the development of neural networks using a type of grammar called a "graph-generation grammar"; a simple example is presented in Fig 2(a),(b). The 8x8 matrix of Fig 2(b) can be interpreted as a connection matrix for a neural network: a 1 in row i and column i means that unit i is present in the network, and a 1 in row i and column j (i not equal to j) means that there is a connection from unit i to unit j. The result is the network shown in Fig 2(c). The grammar used was G=((S,A,B,C,D,a,b,c,e,p),(0,1),P,S); the production rules are shown in Fig 2(a). The chromosome is divided into separate rules, each consisting of five loci. The first locus is the left-hand side of the rule; the second through fifth loci are the four symbols in the matrix on the right-hand side of the rule. The possible alleles at each locus are the symbols A-Z and a-p. The first locus of the chromosome is fixed to be the start symbol, S; at least one rule taking S into a (2x2) matrix is necessary to get started in building a network from the grammar, Fig 3. All the other symbols are chosen at random. A network is built by applying the grammar rules encoded in the chromosome for a predetermined number of iterations.
The rules that take a-p to the 16 (2x2) matrices of zeros and ones were fixed and are not represented in the chromosome. Some researchers [2], [4] have argued that the indirect encoding method is more biologically plausible than direct encoding, because the development from genotype to phenotype is controlled by production rules, which vaguely recalls the development of the embryo as controlled by DNA. However, since the ultimate size of the connectivity matrix in Kitano's method is 2^n x 2^n after n rewriting steps, the method still suffers from scalability problems [4]. In Kitano's experiments, connections to or from nonexistent units, as well as recurrent connections, were ignored.
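The 2x2-block rewriting can be sketched as follows; the rule set here is invented for illustration and is not the grammar of Fig 2(a):

```python
# Matrix rewriting in the style of Kitano's graph-generation grammar.
# Each step replaces every symbol with a 2x2 block, so n steps grow the
# start symbol S into a 2^n x 2^n matrix. The productions below are
# hypothetical; terminals a, b expand to fixed blocks of 0s and 1s.

RULES = {
    "S": [["A", "B"], ["C", "D"]],
    "A": [["a", "a"], ["a", "a"]],
    "B": [["b", "b"], ["b", "b"]],
    "C": [["a", "b"], ["b", "a"]],
    "D": [["b", "a"], ["a", "b"]],
    "a": [["0", "0"], ["0", "0"]],
    "b": [["1", "1"], ["1", "1"]],
}

def develop(symbol, steps):
    """Apply the grammar in parallel for a fixed number of iterations."""
    matrix = [[symbol]]
    for _ in range(steps):
        grown = []
        for row in matrix:
            top, bottom = [], []
            for s in row:
                block = RULES[s]
                top.extend(block[0])
                bottom.extend(block[1])
            grown.extend([top, bottom])
        matrix = grown
    return matrix

m = develop("S", 3)
print(len(m), len(m[0]))   # 8 8 -- a connection matrix for 8 units
```

The doubling of the matrix side at every step is what makes the final connectivity matrix 2^n x 2^n, and hence the source of the scalability problem noted above.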

[20] also used both direct and indirect encoding methods: first, the chromosome contains information about the topology, architecture and learning parameters of the ANN; second, the chromosome contains the information necessary for a constructive method to give rise to an ANN topology (or architecture).

Fig. 2. Illustration of Kitano's "graph generation grammar", used to produce a network that solves the XOR problem. (a) Grammatical rules. (b) A connection matrix is produced from the grammar. (c) The resulting network.

Fig. 3. Illustration of a chromosome encoding grammar

C. Methods that use Lindenmayer Systems [16] - Automatic design of neural network structures (GADONN)

In the grammatical encoding method presented previously, the production rules are applied sequentially, but [2] argues that the biological motivation for the use of L-Systems lies in the fact that their production rules are applied in parallel, simultaneously replacing all the letters in a given word; this is intended to capture cell division in multicellular organisms, where many divisions can occur at the same time. [3] used a new method (GADONN) to design and optimize the structure of neural networks. GADONN is an indirect parallel method which uses a context-free L-System (Lindenmayer System) to encode developmental rules in the genotype. In order to avoid the scalability problem, instead of making the axiom grow in two dimensions (i.e., by replacing each element of the matrix with a 2x2 matrix), it is grown in one dimension. Furthermore, since the search domain is the space of feedforward networks, restricting the search to general feedforward networks (networks in which the forward links are not limited to being between two adjacent layers) leads to a lower triangular matrix, as seen in Fig 4. Hence, a neural network with N neurons results in a binary string of length N(N+1)/2, as opposed to N^2 in [27], which enhances the speed of convergence of the algorithm. In fact, reducing the dimensions of the search space spares the evaluation of many alternatives and mitigates the need to test them. This reduction in the number of feasible alternatives is one of the major advantages over the other methods. Reducing the dimensions of the search space also results in convergence of the method in fewer generations. A context-free L-System is used to encode the production rules.
The structure of the chromosome (genotype) representing the L-System is shown in Fig 5 and results in equation 1, with a default length of 65.

bl = gbits + inds·schar + char·inds·RHSchar   (1)

where: bl is the binary string length, gbits is the number of bits corresponding to the growth cycle (default 8), inds is the index size in bits (default 3), char is the number of characters in the alphabet set (default 8), RHSchar is the number of characters on the right-hand side of the production rules (default 2), and schar is the number of characters in the seed (default 3). The objective function f used by [4] is a linear combination of the training error, the validation error and the number of connections in the network, equation 2.

f = −k1·et − k2·ev − k3·nc   (2)

where: k1 is the weight factor of the training error, k2 is the weight factor of the validation error, k3 is the weight factor of the number of connections in the network, et is the training error (sum of squared errors in training), ev is the validation error (sum of squared errors in validation), and nc is the number of connections in the network. The best values of the weight factors in the objective function were obtained through an interactive procedure; these values are k1=0.5, k2=0.5 and k3=0.1. The backpropagation algorithm was used to train the ANNs, and the training was evaluated through the objective function. It was necessary to generalize the algorithm, making it applicable to all types of valid ANNs. The methodology proposed by [4] generates only feedforward topologies, which are purely functional models and cannot perform sequential processing. The method did not use context-sensitive production rules, which would enable the generation of more complex topologies such as recurrent neural networks.
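The genotype-size argument and the objective function above can be checked numerically. Note that the closed form used for equation (1) below is our reconstruction from the stated defaults (it reproduces the default length of 65), and the et, ev and nc values fed to f are made up for illustration:

```python
# Restricting the search to general feedforward networks needs only the
# lower triangle of the connection matrix: N*(N+1)//2 bits instead of N**2.
def triangular_bits(n):
    return n * (n + 1) // 2

print(triangular_bits(10), 10 ** 2)   # 55 vs 100 bits for N = 10

# One reading of equation (1) -- an assumption -- which matches the
# stated default chromosome length of 65 bits:
gbits, inds, char, RHSchar, schar = 8, 3, 8, 2, 3
bl = gbits + inds * schar + char * inds * RHSchar
print(bl)   # 65

# The GADONN objective of equation (2): f = -k1*et - k2*ev - k3*nc,
# with the weight factors reported in the text.
def objective(et, ev, nc, k1=0.5, k2=0.5, k3=0.1):
    return -k1 * et - k2 * ev - k3 * nc

# Hypothetical networks: lower errors and fewer connections score higher.
print(objective(0.2, 0.3, 12) < objective(0.1, 0.1, 8))   # True
```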

[10] proposed a new method using genetic algorithms and L-systems to grow efficient neural network structures. The proposed L-rules operate directly on a 2-dimensional cell matrix. The L-rules are produced automatically and have an "age" that controls their firing times, i.e., the times at which the rules are applied. The conventional neural model was modified to represent knowledge by birth (axon weights) and learning by experience (dendrite weights). This system makes it possible to find special structures that are very fast

Fig. 4. Different stages of the chromosome in GADONN (after growth), [4].

Fig. 5. Structure of the GADONN chromosome, [4].

both to train and to operate, compared to conventional layered methods.

The methodology proposed in section III follows the same general principles as [10]: the overall plan for building up the nervous system of a living organism is stored in the genes, and the genes are affected by natural selection via crossovers and mutations. Similarly, artificial genes are affected by the operations of genetic algorithms. However, some differences should be mentioned. The L-system proposed in section III is context-sensitive, whereas [10] used context-free L-systems. The synaptic connections and synaptic contacts of the method proposed by [10] are formed such that the resulting networks are acyclic; our methodology allows recurrent connections, because recurrence increases the complexity of the problems that can be handled by a neural network. [10] proposed a method that can generate feedforward networks fully or partially connected; our method only generates partially connected networks.

[30] proposed a method that uses Evolutionary Algorithms to design the architectures of ANNs, with both direct and indirect encoding strategies. The paper presents a comprehensive empirical evaluation of eight combinations of EAs and NNs on 16 public-domain and artificial data sets. The method presents excellent results only for feedforward neural networks. Methods for building recurrent architectures have been hindered by the fact that the available training algorithms are considerably more complex than those for feedforward networks. [21] present a new method to build recurrent ANNs based on evolutionary computation, evolving the architecture and the weights simultaneously. [13], [17] proposed neuroevolution methods that evolve fixed-topology networks and ways of combining traditional neural network learning algorithms with evolutionary methods.

III. PROPOSED METHODOLOGY FOR NEURAL NETWORKS DESIGN

This section describes and explores a biologically plausible methodology that implements a growth model based on a context-sensitive L-System [16], [5], capable of generating economical neural architectures with an optimal number of neurons and an adequate connection topology, defined as a directed graph representing the connectivity of the network. Each node represents a unit and each directed edge represents a weighted connection from one unit to another. Some initial assumptions were made. To imitate the mechanism by which structures such as nervous systems grow, we used a rewriting system. The basic idea is to define complex objects by successively replacing parts of a simple initial object using a set of rewriting rules or productions. A great advantage of this strategy is that it begins the search process in the solution space with simpler and smaller structures and only later tries progressively bigger and more complex ones, unlike a pure GA, which usually begins the search with an initial population evenly distributed over the solution space. Nature uses this process to generate and evolve the nervous system best suited to each creature accomplishing its task: to survive and procreate in its environment. Our approach is biologically motivated by the fact that, in the case of the human brain, there are many more neurons than nucleotides in the genome, as stated by [9]:

"Nature uses a biological developmental process to transform a genetic code into a nervous system. During the developmental process, cells divide using the genetic information. This allows one to encode incredibly complex systems with a compact code. For example, a human brain contains about 10^11 neurons, each one with an average of 10^5 connections. If the graph of connections was encoded using a list of destinations for each neuron, it would require 10^11 * 10^5 * log2(10^11) = 1.7 * 10^17 bits. The number of genes is in the order of 2 * 10^9. The two numbers differ by more than 10 orders of magnitude. How can the developmental process achieve such compression?"
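The parallel rewriting that L-Systems perform can be seen in Lindenmayer's classic two-symbol system, a standard textbook example (not taken from this paper):

```python
# Lindenmayer's algae L-system (A -> AB, B -> A): all symbols in the word
# are replaced in parallel at each step, so complex strings develop from
# a trivial axiom -- the same principle this methodology applies to
# growing network structures.

RULES = {"A": "AB", "B": "A"}

def step(word):
    """One parallel rewriting step: every symbol is replaced at once."""
    return "".join(RULES[s] for s in word)

word = "A"
for _ in range(4):
    word = step(word)
print(word)   # 'ABAABABA' -- word lengths follow the Fibonacci numbers
```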

A. Neural Network Codification with L-System

To model the metaphor of chromosome codification and the growing process of body structures, e.g., the nervous system, it was considered that the development of plants and animals is governed by the genetic information contained in each cell of the organism. Each cell contains the same genetic information (the genotype), which determines the way in which each cell behaves and, because of that, the final form and functioning of the organism (the phenotype). [6] stated that this genetic information is not a blueprint of that final form but can be seen as a recipe that is followed, not by the organism as a whole, but by each cell individually; the discovery of languages that can explain this developmental process remains an open problem for science [29]. The shape and behavior of a cell depend on the genes that are expressed in its interior, and which genes are actually expressed depends on the context of the cell. Already at the very beginning of an organism's life, subtle intercellular interaction takes place, changing the set of genes that are expressed in each cell. This process of cell differentiation is responsible for the formation of all the different organs. The L-System proposed in this research is based on the model created by [2], which also used a context-sensitive grammar (L-System), but with a codification scheme that allows recurrent connections. The grammar can be described as G = {Σ, Π, α}, where Σ = {A, B, C, D, ","} is the alphabet, Π is the set of production rules and α = A,B,C is the axiom. The production rules are described in Table I, where ε is the empty string.

TABLE I
PRODUCTION RULES OF THE GRAMMAR.

A → AB | ε        A is replaced by AB or by ε
B > C → DB | ε    If C is the successor of B, B is replaced by DB or ε
B < C → CD | ε    If B is the predecessor of C, C is replaced by CD or ε
D < D → D2        If D is the predecessor of D, D is replaced by D2.

Suppose that, starting with the axiom α = A,B,C, the first rule of Table I is applied repeatedly, a number of times equal to the number of inputs of the ANN; for example, for 2 inputs the resulting string is ABB, B, C. The third rule is then applied to ABB, B, C a number of times equal to the number of outputs of the ANN; for example, for one output the resulting string is ABB, B, CD. Applying the second rule of Table I to ABB, B, CD two times yields ABB, DDB, CD, and finally, taking the second option (ε) of the rules B > C → DB | ε, A → AB | ε and B < C → CD | ε, the generated string is BB, DD, D. This represents the ANN of Fig 6(a) - the comma represents another layer. The definition of context used here is not the same as in ordinary L-Systems: usually the context of a symbol being rewritten is directly to the left and right of that symbol, whereas here it is considered in relation to another layer. The recurrent ANN of Fig 6(b) can be obtained in a similar way, using the fourth production rule D < D → D2: applying this rule to BB, DD, D results in BB, DD, D2, where the final D2 represents the feedback connection. The ANN of Fig 6(c) is obtained by applying the rule B > C → DB | ε three times instead of two in the third stage presented above. The stages of the development process of the ANNs are shown in full detail in [7].
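The derivation above can be replayed step by step; this is a literal transcription of the rewrites in the text, not a general context-sensitive L-system engine:

```python
# Replay of the derivation: each list element is one layer (the comma in
# the text), and each call applies one replacement inside one layer.

def rewrite(layers, idx, old, new, times=1):
    """Replace symbol `old` by string `new` in layer `idx`, `times` times."""
    out = list(layers)
    for _ in range(times):
        out[idx] = out[idx].replace(old, new, 1)
    return out

net = ["A", "B", "C"]                      # the axiom alpha = A,B,C
net = rewrite(net, 0, "A", "AB", times=2)  # rule A -> AB, once per input
assert net == ["ABB", "B", "C"]
net = rewrite(net, 2, "C", "CD")           # rule B < C -> CD, once per output
assert net == ["ABB", "B", "CD"]
net = rewrite(net, 1, "B", "DB", times=2)  # rule B > C -> DB, applied twice
assert net == ["ABB", "DDB", "CD"]
# finally the epsilon options of the three rules:
net = rewrite(net, 0, "A", "")
net = rewrite(net, 1, "B", "")
net = rewrite(net, 2, "C", "")
print(net)   # ['BB', 'DD', 'D'] -- the network of Fig 6(a)
```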

B. Rules Extraction with Genetic Algorithms

In this subsection we present the process of rule extraction with the GA, considering the following question: why bother to use binary strings for chromosomes instead of L-system symbols directly? [11] proposed a methodology in which the production rules are directly codified in the chromosomes. That seems more beneficial than binary encoding because of the size of the chromosomes used; the computational effort using binary codification is larger than using either "direct encoding" or "indirect encoding", due to the process of searching for the production rules. Nevertheless, as we are trying to investigate a biologically plausible methodology, we decided to use a binary codification. In our work we try to

Fig. 6. Artificial Neural Networks Generated

model the "constructive" approach found in the evolutionary process. Here, L-rules are extracted automatically by the GA in order to control the number of firing times, i.e., the number of times we can apply each rule. The algorithm used to extract the rules from the chromosome is shown below.
1. Read the chromosome, six bits at a time.
2. Convert each group of six bits into a comma or into a symbol of the alphabet defined by the grammar, in agreement with Table II; this results in a string. Table II is read as follows: to determine the character coded by a string of bits, determine which of the four rows on the left corresponds to the first two bits of the string, then choose a column according to the middle two bits, and finally choose a row on the right using the last two bits. For example, the character corresponding to 001000 is A.
3. Find all the strings that code a production rule, following the representation adopted in Table I.
4. Throw away all production rules that do not conform to the restrictions given by the grammar, leaving only valid production rules.
5. Repeat steps 1-4, starting to read the bit string not just at the first bit but also at bits 2, 3, 4, 5, 6 and at bits 512, 511, 510, 509, 508, 507. All the production rules extracted in this way form the L-system for one network. Since the algorithm starts at all these bit positions and reads in both directions, the chromosome of each member of the population is read twelve times, which may increase the level of implicit parallelism of the genetic algorithm.

first    00  01  10  11   last
 00       B   2   A   B    00
 00       D   ,   ,   C    01
 00       D   B   ,   ,    10
 00       B   A   ,   B    11
 01       D   D   B   2    00
 01       D   A   D   ,    01
 01       C   B   D   C    10
 01       B   ,   C   D    11
 10       A   B   2   C    00
 10       B   B   B   B    01
 10       ,   D   D   D    10
 10       A   C   B   A    11
 11       B   B   C   B    00
 11       B   A   D   C    01
 11       B   2   D   C    10
 11       ,   B   ,   B    11

TABLE II
THE CONVERSION TABLE USED (row labels on the left: first two bits; columns: middle two bits; row labels on the right: last two bits)

C. Fitness Evaluation of the Neural Network Structure

Here we present how the fitness evaluation was carried out. As we want to find optimized neural structures capable of solving a given problem, the fitness function adopted rewards a minimum number of neurons in the intermediary layer, production rules and recurrent connections, and a minimum residual error, which can be represented by Equation 3 and Equation 4. The fitness is returned to the GA, which produces a new solution from the population to be evaluated. The fitness function has two branches: the first selects the satisfactory architectures, whose average error is smaller than or equal to the acceptable value (AV); the second discards the architectures that are not appropriate to simulate a specific problem, the constant 0.1 being used to give a low evaluation to those architectures. A satisfactory result for this function is a value that represents a minimum architecture, with good generalization capacity, small error and large noise tolerance.

Fitness = A1/ERM + A2/NCIH + A3/RP + A4/REC + 1,           if ERM ≤ AV     (3)

Fitness = (A1/ERM + A2/NCIH + A3/RP + A4/REC + 1) * 0.1,   otherwise       (4)

where: ERM = average error of the patterns at the output of the neural network; NCIH = number of neurons in the intermediary layer; RP = number of production rules; REC = number of recurrent connections; AV = acceptable value of error. A1, A2, A3 and A4 are appropriate constants.
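Reading the typeset formula as a sum of ratios A1/ERM + A2/NCIH + A3/RP + A4/REC + 1 — which is consistent with the ×0.1 branch giving unsatisfactory architectures a low score — the fitness can be sketched as below. This reading, and the guards against division by zero, are our assumptions, not stated in the paper.

```python
def fitness(erm, ncih, rp, rec, a1, a2, a3, a4, av, eps=1e-12):
    """Fitness of Equations (3)-(4), read as a sum of ratios: rewards
    small average error (ERM), few hidden neurons (NCIH), few production
    rules (RP) and few recurrent connections (REC). Architectures whose
    error exceeds the acceptable value AV get the same score scaled by 0.1.
    The eps/max() guards are ours, to keep the sketch total on zero counts."""
    base = (a1 / max(erm, eps) + a2 / max(ncih, 1)
            + a3 / max(rp, 1) + a4 / max(rec, 1) + 1)
    return base if erm <= av else 0.1 * base

# Constants reported for the parity experiment:
print(fitness(erm=0.005, ncih=2, rp=3, rec=1,
              a1=0.000001, a2=100, a3=1, a4=100, av=0.01))
```

A run with ERM above AV returns roughly one tenth of the score of an otherwise identical satisfactory architecture, which is how the GA is steered away from inadequate topologies.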

IV. EXPERIMENTS

In order to investigate the possibilities of the method, especially in problems where recurrent neural networks should be evolved, the technique was tested on three well-known simple problems, which are described below. The parity problem consists of presenting a string of bits at the input of the neural network and obtaining at the output the parity value of the string presented. The corresponding graph of the Mealy machine that implements the parity problem is shown in Fig 7. The parameters of the fitness function used in the simulations were A1=0.000001, A2=100, A3=1, A4=100 and AV=0.01; the value of AV guarantees a good generalization capacity in the sense of extrapolation, and the values of A1, A2, A3 and A4 were chosen heuristically. The number of epochs was 30000 and the number of generations was 100; the chromosome size was 512 bits. The minimum network obtained after the simulation is shown in Fig 8. Once the network architecture was obtained, its generalization capacity was checked. The bit string 0,1,0,1,1,0,1 was presented in sequence to the ANN input. The desired and the obtained outputs are shown below in Table III.
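The target behavior for this experiment is exactly the two-state Mealy machine of Fig 7; a minimal sketch (names are ours) that generates the reference outputs the evolved ANN must reproduce:

```python
def parity_outputs(bits):
    """Two-state Mealy machine for the parity problem (Fig. 7):
    the state is the parity of the bits seen so far, and the machine
    emits the current parity after each input bit."""
    state = 0
    outputs = []
    for b in bits:
        state ^= b          # a 1 toggles the parity, a 0 keeps it
        outputs.append(state)
    return outputs

print(parity_outputs([0, 1, 0, 1, 1, 0, 1]))  # -> [0, 1, 1, 0, 1, 1, 0]
```

The sequence produced for the test string 0,1,0,1,1,0,1 matches the desired outputs of Table III.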

The second problem studied comprised the languages proposed by [19], shown in Table IV. The Average Fitness and

Fig. 7. Mealy Machine for the Parity Problem

Fig. 8. Generated ANNs capable to recognize the Parity Problem

the Best Fitness for Language 1 are shown in Fig 9 and for Language 2 in Fig 10. The minimum network obtained after the simulation for Language 1 is shown in Fig 11(a) and for Language 2 in Fig 11(b). The generalization capacity was checked for Tomita's Language 1; the desired and the obtained outputs are shown below in Table V. To verify the viability of the

Desired Outputs   Obtained Outputs
0                 0.022707
1                 0.987723
1                 0.987723
0                 0.016073
1                 0.984752
1                 0.987743
0                 0.016074

TABLE III. OBTAINED OUTPUTS FOR THE FIRST EXPERIMENT

Language   Description   Examples
1          1*            ε, 1, 11, 111, 1111, 11111
2          10*           ε, 10, 1010, 101010

Language   Description   Examples
1          1*            ε, 0, 10, 01, 00, 011, 110
2          10*           ε, 1, 0, 11, 00, 101, 100

TABLE IV. LANGUAGES INVESTIGATED IN [19]
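Membership in the two target languages can be checked with simple predicates, which is useful for generating the accepted and rejected strings used to train and test the evolved networks. The description "10*" is read as repetitions of the block "10", as the listed examples (ε, 10, 1010, 101010) suggest; the function names are ours.

```python
def in_language_1(s: str) -> bool:
    """Tomita's Language 1: 1* (the empty string or only 1s)."""
    return all(c == "1" for c in s)

def in_language_2(s: str) -> bool:
    """Tomita's Language 2: (10)* (zero or more repetitions of '10')."""
    return len(s) % 2 == 0 and all(s[i:i + 2] == "10"
                                   for i in range(0, len(s), 2))

# Keep only the strings accepted by Language 1:
print([w for w in ["", "1", "111", "10", "1010", "110"] if in_language_1(w)])
```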

methodology for large problems, an experiment was carried out with a time series that represents the CPI (consumer price index) in the period from 1998-Jan to 2010-May (Central Bank of Brazil, as seen in Table VII). The CPI quantifies the costs of products at different moments. In other words, it measures change through time in the price level of consumer goods and services purchased by households, and is useful for calculating the inflation rate. The parameters of the fitness function used in the simulations were A1=0.00001, A2=1000, A3=10, A4=100 and AV=0.0001; the value of AV guarantees a good generalization capacity in the sense of the

Fig. 9. Best and Average Fitness versus generation for one run of the GA, Language 1.

Fig. 10. Best and Average Fitness versus generation for one run of the GA, Language 2.

extrapolation, and the values of A1, A2, A3 and A4 were chosen heuristically. The number of epochs was 40000 and the number of generations was 100; the chromosome size was 512 bits. The minimum network obtained after the simulation is identical to that of Fig 11(a). Fig 12 shows the obtained and desired values for the CPI. The extrapolation capacity for the CPI was checked; the desired and the obtained outputs are shown below in Table VI.
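For the time-series task the network must learn one-step-ahead prediction, i.e. produce x(k+1) from preceding values of the normalized series. A minimal sketch of building such training pairs; the window length of 3 is illustrative only, since the paper does not state the input window used.

```python
def one_step_pairs(series, window=3):
    """Build (input window, next value) pairs for one-step-ahead
    prediction, as needed for a series like the CPI of Table VII.
    The window length is our illustrative choice."""
    return [(series[k:k + window], series[k + window])
            for k in range(len(series) - window)]

# First six normalized CPI values of 1998 (Jan-Jun, Table VII):
cpi_1998 = [0.0126, 0.0014, 0.0033, 0.0023, 0.0014, 0.0041]
print(one_step_pairs(cpi_1998, window=3)[0])
# -> ([0.0126, 0.0014, 0.0033], 0.0023)
```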

The method of [4] is less computationally intensive than existing interactive design procedures (direct encoding) and grammatical encoding; hence it was applied to the automatic design of neural networks for complex processes, such as modeling the dynamics of a pH neutralization process and a CSTR reactor. The methodology presented here is more computationally intensive than the methods proposed by [4], [15] and [10] because of the complexity of the production rules and of the search algorithms used. Depending on the complexity of the problem, e.g. time series learning, the computational effort increases considerably and more complex topologies are

Fig. 11. Generated ANNs capable to recognize Tomita’s Language.

Obtained x(k+1)   Desired x(k+1)   Error
0.993624          1                0.006376
1                 1                0
1                 1                0
1                 1                0
0.00578           0                0.00578
0.026471          0                0.026471
0.978195          1                0.021805

TABLE V. OBTAINED OUTPUTS FOR TOMITA'S LANGUAGE 1

demanded. Recurrent neural networks are still not fully exploited because of their tedious training and complex system structure.

V. CONCLUSIONS

The methodology for automatic design of Neural Networks presented in Section III constitutes an important tool for obtaining optimized architectures. It implements a growth model capable of generating increasingly large and complex neural architectures. The Genetic Algorithm works with a population of Neural Networks instead of only one network, allowing a satisfactory architecture, one that simulates a certain task with a certain degree of precision, to be appropriately selected. That is not always possible with conventional design methods, where a non-optimized architecture imposes a computational load that can limit the practical application of Neural Networks or make their implementation unfeasible, for example, in hardware.

The experiments showed that the ANN design methodology used in this research works well for simple and complex problems where recurrent neural networks should be evolved to obtain efficient and economical architectures. Setting the parameters of the Genetic Algorithm is a very sensitive aspect of the methodology, and we are working to improve this point. The production rules presented in Table I can

Fig. 12. Obtained and Desired Values for CPI.

Month    CPI Obtained   CPI Desired   Error
97-JAN   0.006153       0.0185        1.711215e-5
97-DEC   0.006219       0.0056        1.912073e-7

TABLE VI. OBTAINED AND DESIRED VALUES FOR CPI

        JAN      FEB      MAR      APR      MAY      JUN
1998    0.0126   0.0014   0.0033   0.0023   0.0014   0.0041
1999    0.0064   0.0141   0.0095   0.0052   0.0008   0.0065
2000    0.0101   0.0005   0.0051   0.0025   0.004    -0.0001
2001    0.0064   0.004    0.0056   0.0086   0.0041   0.0052
2002    0.0079   0.0014   0.0042   0.0071   0.0028   0.0055
2003    0.0232   0.0137   0.0106   0.0112   0.0069   -0.0016
2004    0.0108   0.0028   0.0046   0.0031   0.0071   0.0078
2005    0.0085   0.0043   0.007    0.0088   0.0079   -0.0005
2006    0.0065   0.0001   0.0022   0.0034   -0.0019  -0.004
2007    0.0069   0.0034   0.0048   0.0031   0.0025   0.0042
2008    0.0097   0.0056   0.0045   0.0072   0.0087   0.0077
2009    0.0083   0.0021   0.0061   0.0047   0.0039   0.0012
2010    0.0129   0.0068   0.0086   0.0076   0.0021   -

        JUL      AUG      SEP      OCT      NOV      DEC
1998    -0.0025  -0.0052  -0.0017  0.002    -0.0019  0.0009
1999    0.012    0.0048   0.0019   0.0092   0.0112   0.006
2000    0.0191   0.0086   0.0004   0.0002   0.004    0.0062
2001    0.0136   0.0054   0.0012   0.0071   0.0085   0.007
2002    0.0103   0.0076   0.0066   0.0114   0.0314   0.0194
2003    0.0034   0.0013   0.0076   0.0021   0.0033   0.0043
2004    0.0059   0.0079   0.0001   0.001    0.0037   0.0063
2005    0.0013   -0.0044  0.0009   0.0042   0.0057   0.0046
2006    0.0006   0.0016   0.0019   0.0014   0.0024   0.0063
2007    0.0028   0.0042   0.0023   0.0013   0.0027   0.007
2008    0.0053   0.0014   -0.0009  0.0047   0.0056   0.0052
2009    0.0034   0.002    0.0018   0.0001   0.0026   0.0024

TABLE VII. NORMALIZED VALUES FOR CPI IN THE PERIOD OF 1998-JAN TO 2010-MAY - CENTRAL BANK OF BRAZIL

generate more complex architectures than the other methods presented in Section II, because recurrent connections were considered, but they demand a larger computational effort and more complex search algorithms. Some researchers do not worry about biological plausibility, and in some situations it is not necessary, for instance when the goal is simply to obtain an artifact with good generalization capacity. However, when a better understanding is needed of the mechanisms of biological development that transform the genetic code into the nervous system, biological inspiration is very important; otherwise it may be unnecessary.

REFERENCES

[1] Boers, E., Kuiper, H., Biological metaphors and the design of artificial neural networks. Master's thesis, Leiden University, Niels Bohrweg 1, 2333 CA, Leiden, Netherlands, 1992.

[2] Boers, E., Kuiper, H., Happel, B., Designing modular artificial neural networks. Tech. rep., Department of Computer Science and Theoretical Psychology, Leiden University, 1993.

[3] Boozarjomehry, R. B., Application of artificial intelligence in feedback linearization. Ph.D. thesis, Department of Chemical and Petroleum Engineering, The University of Calgary, 1997.

[4] Boozarjomehry, R. B., Svrcek, W., Automatic design of neural network structures. Computers and Chemical Engineering, vol. 25, pp. 1075-1088, 2001.

[5] Chomsky, N., Three models for the description of language. In: IRE Transactions on Information Theory, pp. 113-124, 1956.

[6] Dawkins, R., The Blind Watchmaker: Why the Evidence of Evolution Reveals a Universe without Design. W.W. Norton & Co., 1996.

[7] De Campos, L. M. L., Roisenberg, M., A biologically inspired methodology for neural networks design. In: IEEE Conference on Cybernetics and Intelligent Systems. IEEE, pp. 619-624, 2004.

[8] Devaney, R., An introduction to chaotic dynamical systems. Applicandae Mathematicae 19(2), pp. 204-205, May 1990.

[9] Gruau, F., Automatic definition of modular neural networks. Adaptive Behavior 3(2), pp. 151-183, 1995.

[10] Aho, I., Kemppainen, H., Koskimies, K., Makinen, E., Niemi, T., Searching neural network structures with L systems and genetic algorithms. International Journal of Computer Mathematics, vol. 73, pp. 55-75, 1999.

[11] Kitano, H., Designing neural networks by genetic algorithms using graph generation system. Complex Systems 4, pp. 461-476, 1990.

[12] Kitano, H., Neurogenetic learning: an integrated method of designing and training neural networks using genetic algorithms. Physica D 75, pp. 225-238, 1994.

[13] Stanley, K. O., Miikkulainen, R., Evolving neural networks through augmenting topologies. Evolutionary Computation, vol. 10, no. 2, pp. 99-127, 2002. http://nn.cs.utexas.edu/?stanley:ec02.

[14] Kohl, N., Miikkulainen, R., Evolving neural networks for strategic decision-making problems. Neural Networks 22, pp. 326-337, 2009.

[15] Lee, D. W., Kong, S. G., Evolvable neural network based on developmental models for mobile robot navigation. In: Proceedings of the International Joint Conference on Neural Networks, Montreal, Canada. IEEE, pp. 337-342, 2005.

[16] Lindenmayer, A., Mathematical models for cellular interactions in development I. Filaments with one-sided inputs. Journal of Theoretical Biology, vol. 18, pp. 288-300, 1968.

[17] Miikkulainen, R., Evolving neural networks. In: GECCO '10: Proceedings of the 12th Annual Conference on Genetic and Evolutionary Computation. ACM, New York, NY, USA, pp. 2441-2460, 2010.

[18] Miller, G. F., Todd, P. M., Hegde, S. U., Designing neural networks using genetic algorithms. In: Proceedings of the International Conference on Genetic Algorithms and Their Applications, pp. 379-384, 1989.

[19] Tomita, M., Dynamic construction of finite automata from examples using hill-climbing. In: Proceedings of the Fourth Annual Conference of the Cognitive Science Society, Ann Arbor, MI, pp. 105-108, 1982.

[20] Peralta, J., Gutierrez, G., Sanchis, A., ADANN: Automatic design of artificial neural networks. In: Proceedings of the 2008 GECCO Conference Companion on Genetic and Evolutionary Computation, pp. 1863-1870, 2008.

[21] Pujol, J. C. F., Evolution of artificial neural networks using a two-dimensional representation. Ph.D. thesis, School of Computer Science, University of Birmingham, UK, 1999.

[22] Salustowicz, R., A genetic algorithm for the topological optimization of neural networks. Ph.D. thesis, Technische Universität Berlin, 1995.

[23] Schiffmann, W., Evolutionäres Design von neuronalen Netzen (Evolutionary design of neural networks). In: Informatik in den Biowissenschaften, ser. Informatik Aktuell, R. Hofestädt, F. Krückeberg, and T. Lengauer (eds.). Springer, pp. 121-132, 1993.

[24] Vaario, J., An emergent modeling method for artificial neural networks. Ph.D. thesis, The University of Tokyo, Tokyo, Japan, 1993.

[25] Vaario, J., From evolutionary computation to computational evolution. Informatica 18, pp. 417-434, 1994.

[26] Vaario, J., Onitsuka, A., Shimohara, K., Formation of neural structures. In: Proceedings of the Fourth European Conference on Artificial Life, ECAL97. The MIT Press, pp. 214-223, 1997.

[27] Vico, F., Sandoval, F., Use of genetic algorithms in neural networks definition. In: Proceedings of the International Workshop on Artificial Neural Networks IWANN91, Granada, Spain. Springer-Verlag, pp. 196-203, September 1991.

[28] Whitley, D., The GENITOR algorithm and selection pressure: why rank-based allocation of reproduction is best. In: Proceedings of the International Conference on Genetic Algorithms and Their Applications, pp. 116-121, 1989.

[29] Feng, N., Ning, G., Zheng, X., A framework for simulating axon guidance. Neurocomputing, pp. 70-84, 2005.

[30] Cantú-Paz, E., Kamath, C., An empirical comparison of combinations of evolutionary algorithms and neural networks for classification problems. IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, pp. 915-927, 2005.

