Investigation of Mutation Rates Adaptivity in Changing

Environments: an Evolutionary Robotics Approach

Fabio Ticconi

August 30, 2012

Abstract

Taking inspiration from recent research in microbiology and evolutionary theory, this work

explores the possibility that making the mutation rate in a genetic algorithm

adaptive, instead of fixed, could improve performance under changing environmental

conditions. To this end, a simplified model of stress-driven mutation rates is proposed and

integrated into the general principles and techniques of Evolutionary Robotics. A notion of

environmental stress has been defined for three different environments, in which populations of

simulated Khepera robots have been evolved and analysed for their ability to regain fitness after

their environment was modified, and to keep their fitness high while the environment changed

periodically. The approach proved promising and outperformed the fixed mutation rate in

most experiments, in certain environments recovering more quickly from a drastic

change in the external conditions. Extensions need to be developed in future work to study

the potential and limits of mutation rate adaptivity.

Contents

1 Background

1.1 Epigenetics

1.2 Evolutionary Robotics

1.2.1 Artificial Neural Networks

1.2.2 Genetic Algorithms

1.3 Stress-driven Epigenetic Algorithm

2 Methods

2.1 Simulated Environment

2.2 Robot Architecture

2.2.1 Neural network architecture

2.3 StrEGA implementation

2.3.1 Dynamic brood size

3 Experimental setup

3.1 Simple sources switch

3.2 Simple sources switch with water

3.3 Complex sources switch

4 Results

4.1 Simple sources

4.1.1 Static environment

4.1.2 Switch

4.1.3 Periodic change

4.2 Simple sources with water

4.2.1 Static environment

4.2.2 Switch

4.2.3 Periodic change

4.3 Complex sources

4.3.1 Camera

4.3.2 Infrared

5 Conclusion

A Additional Plots

A.1 Simple sources switch

A.1.1 Static environment

A.1.2 Switch

A.1.3 Periodic change

A.2 Simple sources switch with water

A.2.1 Static environment

A.2.2 Switch

A.2.3 Periodic change

Acknowledgements

This work is dedicated to my family and to Anto, who give me love when I most need it.

My most sincere thanks go to all the people with whom I shared an amazing, challenging and

stimulating year. The closest friends already know. Of course they know: we have been

sweating and laughing side by side.

A big thanks to all the tutors and professors who have made the start of this adventure easy

and enjoyable, and the progress challenging and interesting.

A special thanks to Matthew, my supervisor, who has always been able to bring order to my

confused ideas and taught me a couple of important lessons, while always being friendly and

ready to give advice.

Introduction

The adaptive mutation is a little cloud that obscures the beauty and clearness of the

molecular biological perspective on Life.

Vasily Ogryzko

Mainstream evolutionary theory states that genetic mutations are blind, that is, they are not

directed onto specific genes. It is argued that they occur at random times and places in the

genotype and that they normally generate lower-fitness phenotypes, meaning that most

mutations are actually harmful, although they are an important source of variation in a population 1.

Crick's central dogma of molecular biology is often cited to defend this position:

The central dogma of molecular biology deals with the detailed residue-by-residue

transfer of sequential information. It states that such information cannot be

transferred back from protein to either protein or nucleic acid.

Following the argument of Crick (1970), the proteins synthesized using DNA and RNA

cannot in turn influence the nucleic acids. This strengthens the idea that mutations have

to be blind, because proteins that receive stimuli from outside the cytoplasm cannot react to

them by modifying the DNA, of course assuming the dogma is true.

If we take for granted the randomness of mutations, theoretically their rate should be very

small. They are needed to explore the genetic space and maintain diversity in a population,

but this should happen without making every offspring unable to survive because of too many

deleterious mutations.

As in classical Darwinian evolution, the phenotype of each individual is subjected to natural

selection, and individuals with deleterious mutations will likely be eliminated (because of

their reduced fitness), while those with neutral or just slightly harmful mutations may

continue to reproduce; finally, the few that have undergone a beneficial mutation

will be fitter and will reproduce more, thus spreading the newly found gene through the population.

1 See Jablonka and Lamb (2005) for a history of the debate about the randomness of mutations.

As Baer (2008) put it, deleterious mutations are the price we living organisms pay for the

ability to evolve.

During the last few decades, though, new findings in molecular biology have led researchers

to question whether this view is correct. Can mutations be adaptive, that is, can an

organism modify its own mutation rate or direct a mutation onto specific genes to

make its offspring more likely to survive?

This work does not aim to answer these questions, but to define and test a model able

to embody, in a simplified way, the essential mechanisms of mutation rate adaptivity. The

hypothesis of this work is that, whether they exist in nature or not, adaptive mutations can

be beneficial in a population undergoing environmental stress, accelerating the evolutionary

process.

A prediction of this work is that a properly defined stress-driven mutation rate adaptivity

should allow a population to escape a risk of extinction, and quickly regain its vitality. To do

that, such a model needs to be defined and then tested in a properly set up environment.

Chapter 1 first presents an overview of the current debate over the adaptivity of

mutations; Chapter 2 then describes the methods used here to define and test the model

mentioned above.

Chapter 3 describes the experimental setups in which the model has been tested,

followed by the results obtained in Chapter 4. Chapter 5 presents the conclusions of this

work, discusses the results and outlines possible future improvements.

Chapter 1

Background

The question of whether mutations can be adaptive has raised a long-lasting debate, since

it touches what is considered a central point of evolutionary theory and because, if the answer

is proven affirmative, we will be in fact acknowledging a sort of Lamarckian process 1 alongside

the classic Darwinian one.

Notwithstanding the importance of the question and the number of studies on the subject,

the problem cannot yet be considered closed.

On the one hand, proponents of the adaptivity hypothesis have shown that the mutation

rate is not constant even between closely related individuals and groups, and is instead

correlated with fitness: the lower the fitness, the higher the mutation rate (Baer et al., 2007;

Baer, 2008). It can be argued that low-fitness individuals, under certain environmental conditions,

start to have a higher mutation rate (possibly targeted at the genes that make them less

fit in that particular environment), thus increasing their likelihood of generating offspring

better able to cope with the environment than themselves. That would make their mutations

adaptive. A mutation is said to be directed if it occurs specifically in genes that can

increase the organism's fitness (Rosenberg, 2001).

On the other hand, however, this could just be due to the cost of keeping the DNA

intact, which in bad environmental conditions might be too high for an individual. This could lead

to an increased mutation rate and eventually to the generation of offspring with a higher fitness

than the parent. Other explanations that still fit a classic Darwinian framework are also possible

(Baer, 2008).

An example of how subtle the difference between an adaptive and a non-adaptive mutagenic

process can be is seen in particular lac mutants of Escherichia coli, unable to feed and grow

1 This is especially a problem if directed mutations are proven to occur, because in that case the environment

would directly modify particular genes, which would then be inherited.

properly in a lactose-rich environment. In this situation, there seems to be an adaptive response

that drives the lac cells to increase their mutation rate (a phenomenon known as hypermutation),

so that eventually an offspring will end up with a lac+ gene which, by increasing cell

growth in that particular environment, will quickly spread through the population.

Hendrickson et al. (2002) argued that the adaptivity of mutations in this case is only apparent

and proposed a model where the lac gene is amplified in the genotype by means of random

mutations, resulting in a slight increase of the growth rate of the cell.

The selection will then favour a further amplification of this gene, which also means an

increase in the probability of having one of these repeated lac genes mutated into a lac+ gene

at the normal, constant mutation rate. When this happens, lac+ cells will grow at a higher rate

and survive more than the others: selective pressure will make the lac+ gene spread into the

population, eventually covering it all. When analysed without taking gene amplification into

account, the researchers argued, this whole process can be misread as an example of adaptive

mutation rates or even a directed mutation.

Stumpf et al. (2007) instead tested many predictions of the amplification model, finding that

most of them do not hold in this case: for example, cells that do actually amplify the lac gene

can end up with a lac+ gene, but this is unstable because of the many copies of lac. The stable

lac+ cells are instead shown to have been generated without amplification in the parent.

In the last decade many other experiments and analyses have been performed on the lac

mutant of E. coli, bringing more evidence for the hypothesis of mutations as an adaptive stress

response to challenging environmental conditions. Directed mutation, however, seems to be

excluded: a transient hypermutation under stress conditions is enough for a lac population

to adapt and become lac+, whilst it is still debated how the population is able to

survive a huge, albeit temporary, increase in mutations (Gibson et al., 2010; Fonville, 2011;

Rosenberg, 2001; Galhardo et al., 2007; Gonzalez et al., 2008).

Condition-dependent mutation rates have also been found in some multicellular organisms,

although more controversially (Baer, 2008; Agrawal and Wang, 2008; Cotton, 2009; Sharp and

Agrawal, 2012), and it has also been suggested that human cancers arise in part as an evolutionarily

programmed side effect of age- and damage-inducible genetic instability affecting both

somatic and germ line lineages (Zhao and Epstein, 2008): an increase in mutation rates in

human male sperm with ageing and under environmental stress seems to lead to

increased species adaptation, albeit at a high cost for single individuals.

The debate, as said, is still open, but two more aspects need to be considered.

Firstly, most of the cited works assumed the mutation rate to be evolvable (because the

molecules that control DNA repair and copying are in fact encoded by the DNA itself).

Secondly, a recent work (Clune et al., 2008) pointed out how natural selection, in a strict

Darwinian framework, fails to find the optimal mutation rates in difficult environments if they

are encoded into the genome.

Assuming the last two points are true 2, mutation rates are not adaptive even on the phylogenetic

scale, let alone the ontogenetic one. A relatively recent field called epigenetics could help to escape

this apparent dead end for the adaptivity hypothesis, adding a new dimension to the scenario.

1.1 Epigenetics

DNA is just a tape carrying information, and a tape is no good without a player.

Epigenetics is about the tape player.

Bryan Turner 3

Epigenetics is the name given to a relatively recent field studying the molecules that are

around the DNA and that regulate gene expression. They are also found in the cytoplasm and

are highly sensitive to environmental stress (Yaish et al., 2011; Pecinka et al., 2010), modifying

gene expression (also switching particular genes on and off) in different conditions.

What has prompted a rather heated debate is the possibility for the epigenome (the name

given to the whole set of molecules with epigenetic functions) to be heritable 4.

It is easy to see why: the epigenome of an individual responds to environmental stress

by modifying gene expression or switching off a gene altogether, so the children that inherit the

epigenome could show the same effects on gene expression without having experienced the stress

themselves. This is called inheritance of acquired characteristics, and has been opposed since the

end of the 19th century on account of its Lamarckism (Weismann, 1893).

Since there is some evidence (Dinant et al., 2008) that DNA repair is one of the duties of

the epigenome, which in turn is highly responsive to environmental stress, it is possible that

this stress may be a cause of transient, adaptive mutation rate change.

In this work, as stated in the introduction, a simplified model of stress-driven mutation

rates is developed without trying to maintain a biological resemblance. The

2 The inability to find an optimal mutation rate, however, may not be a problem as long as there is

hypermutability, even if evolved by means of natural selection without any direct influence of the stress, as

Kang et al. (2006) show.

3 http://epigenome.eu/en/1,1,0

4 See Jablonka and Lamb (2005) and a counter-argument by Haig (2006); for evidence in plants see

Greer et al. (2011).

(possible) adaptivity of mutations in real organisms, together with the (still unproven) idea

of an epigenome as interface between the environmental stress and the genetic changes, is an

inspiration to investigate whether artificial organisms with such characteristics perform

better than others without them.

To develop such a model of phylogenetic adaptivity, incorporating epigenetics, the

principles and methods of Evolutionary Robotics (ER) have been adopted. The next section

presents a brief history of the field and the rationale of this approach for studying evolution.

1.2 Evolutionary Robotics

ER developed during the 1990s as a new approach to the study of cognition: a so-called

synthetic approach to cognition (Pfeifer et al., 2007). Its first use was as a model to study

minimally cognitive agents, which were built or simulated and put in an environment to solve

tasks. The simplified structure of both the robots and the environments, together with improvements

in computer hardware, allowed researchers to perform real-time as well as offline

analysis in reasonable time, compared to real organisms.

The whole idea is that of imitating the evolutionary process to shape the controller, as well

as the body, of a robot in a goal-oriented way. ER has been applied in engineering to design

actual robots or other complex tools, like satellites, but it has been most commonly used to

test scientific hypotheses in various fields: psychology, ecology, evolutionary biology, sociology,

economics or more recently abiogenesis (Parisi and Cecconi, 2006; Tuci et al., 2002; Nolfi and

Floreano, 2000; Pfeifer et al., 2008).

The assumption, of course, is that the natural environment does in fact constitute an arena

for real organisms, whose bodies and behaviours are tested and selected by means of natural

selection: this allows only the fittest individuals to survive and to become increasingly better in

that particular environment, shaping in this way the body and behaviour of natural species. If

the experimenter wants to analyse such a process in a controlled way, or to produce a robot fit to

a particular environment, he or she just needs to define precisely the environment characteristics

and a test function able to quantify how well a particular robot performs when put in that

environment for a certain time, leaving the rest to the artificial evolution.

The keywords in ER are, thus, evolution and controller: the first to shape the second so that

the needed behaviour can be seen. ER, in fact, arose after two other techniques had become

mature, artificial neural networks (ANN) and genetic algorithms (GA), that are outlined in the

next two subsections.

1.2.1 Artificial Neural Networks

Researchers involved in ER, as noted, wanted to produce agents able to move in simulated

and real environments, responding quickly to external stimuli. The behaviour-based controllers

introduced by Brooks, the first models for biologically inspired robotics, were too high-level an

abstraction for the purposes of ER. The idea of using artificial neural networks (which can

be easily encoded into genome-like strings) as a controller able to convert environmental stimuli

into behaviour was, at that point, a natural one (Nolfi and Floreano, 2000).

Firstly, ANNs are a simplified model of the biological neural networks which in turn are

the main focus of non-representationalist views of cognition, like connectionism, that were at

the time growing in recognition. In addition, ANNs were again experiencing an explosion in

research, after the sharp decline caused by Minsky and Papert (1969): the initial problems

of the computational limits of the perceptron had been solved with multilayer perceptrons. GAs

and the backpropagation algorithm, alone and combined, were shown to be able to correctly update

the weights of ANNs to solve complex tasks in the machine learning field, but in many

ER experiments there was no need to use the backpropagation algorithm 5. The main

difference is that with GAs the focus is on species adaptation, rather than on the adaptation of

single individuals. There is also evidence of a superiority of GAs in finding weights of ANNs

compared to backpropagation (Gupta and Sexton, 1999), so the latter is rarely found in ER

studies.

Many different ANN models, in addition to the simple multilayer perceptron, have been

studied and tested for ER experiments (Nolfi and Floreano, 2000): a particularly common one

is the Continuous-time recurrent neural network (CTRNN) (Beer, 2003), which is the one used

in this work.

1.2.2 Genetic Algorithms

Holland (1975) proposed a framework to systematically evolve genomes (initially only with

binary genes) using operators inspired by genetics, like mutation and recombination. His initial

goal was not that of inventing a system to solve complex tasks, but to study evolution under

controlled conditions and to show how the evolutionary process was not limited to natural

organisms.

5 In certain cases it was actually used, together with the genetic algorithm, to simulate both evolution of the

species and lifetime learning of each individual. Parameters of the backpropagation algorithm can then be evolved

too, removing the need for the experimenter to determine them. For the purpose of lifetime neural plasticity,

Hebbian learning has also been extensively used (Floreano et al., 2005).

The main idea is, given a certain problem, to define an unambiguous encoding of its possible

solutions (the genotypes), then create a set of strings that can be translated from the encoded

version into a testable representation. This is the starting population over which the artificial

selection can be applied.

A binary string, for example, could uniquely encode the shape of a particular object by

specifying, for each gene, whether a particular characteristic has to be present or not. The

object, built using the specifications in the genome, can be tested in a pre-specified environment.

A value of goodness, called fitness, is then assigned to that particular object in that particular

environment. A selection criterion can then be applied: for example keep only the best five

solutions and clone their genetic codes. Then apply mutations, and combine them two-by-two

using crossover, to get a population of the same size as the initial one. The process is then

repeated.
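
As a rough illustration of the loop just described, the following Python sketch evolves binary genomes on a toy problem. Everything in it is an assumption made for illustration only: the function name evolve, the population size, the mutation rate, the one-point crossover and the "count the ones" fitness are not taken from this work.

```python
import random

def evolve(fitness, genome_len=16, pop_size=20, n_best=5,
           mutation_rate=0.02, generations=50, seed=0):
    """Minimal generational GA on binary genomes (illustrative sketch)."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(genome_len)]
           for _ in range(pop_size)]
    for _ in range(generations):
        # Selection: keep only the n_best genomes of this generation.
        best = sorted(pop, key=fitness, reverse=True)[:n_best]
        pop = []
        while len(pop) < pop_size:
            # Combine two selected genomes with one-point crossover.
            a, b = rng.choice(best), rng.choice(best)
            cut = rng.randrange(1, genome_len)
            child = a[:cut] + b[cut:]
            # Mutation: flip each bit independently with probability mutation_rate.
            child = [1 - g if rng.random() < mutation_rate else g for g in child]
            pop.append(child)
    return max(pop, key=fitness)

# Toy fitness: the number of ones in the genome.
best = evolve(fitness=sum)
```

Note that this sketch has no elitism: the best genome of one generation is not copied unchanged into the next, so the best fitness can occasionally drop between generations.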

This process tends to increase the performance of the whole population over the generations. Given infinite

time, and mutations able to reach every genome, it would explore all the possible genome values, so an optimal solution would

eventually be found. It is more common, however, to stop the evolution when a certain fitness

threshold has been reached, meaning that the solution found is good enough for our needs,

or after a given number of iterations (generations).

Since the work of Holland (1975), many different parameters, genetic operators, fitness (test)

functions and selection criteria have been developed and tested, and GAs have been successfully

applied to a wide range of applications (Mitchell, 1998).

In this work, however, GAs are only considered as a stylized reproduction of the natural

evolutionary process for artificial organisms. To be able to study the effects of stress-induced

mutations on artificial organisms, the standard GA has been modified by including a simplified

version of an epigenome that encodes the mutation rate. The next section presents this novel

approach.

1.3 Stress-driven Epigenetic Algorithm

Adaptivity in mutation rates in living organisms, as seen, has been widely studied by biologists.

The same does not apply to the field of ER.

There have been, in fact, studies about improvements in function optimization when using

an adaptive mutation scheme (Thierens, 2002; Chan et al., 2008). Still, most ER works use a

fixed mutation rate, usually empirically found to be close to an optimal value (with respect

to the experiment-specific conditions).

Problems with hard-coded, fixed mutation rates arise when facing dynamic environments.

Under changing conditions, there might be a different optimal mutation rate. One approach to

this issue is to modify the mutation rate (and possibly other parameters)

in a way that could, in principle, keep up with the current optimal mutation rate (if one exists),

or at least achieve better results than a fixed mutation rate approach.

To achieve this goal, the previous work that introduced the Stress-driven EpiGenetic Algorithm

(StrEGA) (Ticconi, 2011) explored the use of the inverse of ranking

(with function optimization in mind). Each individual was associated not only

with a genome, as in classical GAs, but also with an epigenome consisting of a single epigene.

This encoded a number in the range [0, 1], representing the probability for each of the individual's

genes, after duplication, to be mutated. After evaluation, a stress indicator was used to update

the epigenome, either increasing or decreasing its value.
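
A minimal Python sketch of this data structure may help. It is not the StrEGA code: the names Individual and reproduce, the real-valued genes and the Gaussian perturbation with width sigma are all assumptions introduced here for illustration. Only the idea of a single epigene in [0, 1] acting as a per-gene mutation probability comes from the text.

```python
import random
from dataclasses import dataclass
from typing import List

@dataclass
class Individual:
    genome: List[float]  # real-valued genes (an assumption; any encoding would do)
    epigene: float       # probability in [0, 1] that each gene mutates on duplication

def reproduce(parent: Individual, rng: random.Random, sigma: float = 0.1) -> Individual:
    """Duplicate the parent's genome, mutating each gene with probability parent.epigene."""
    child_genome = [
        gene + rng.gauss(0.0, sigma) if rng.random() < parent.epigene else gene
        for gene in parent.genome
    ]
    # The epigene is copied unchanged here, mirroring the heritable epigenome of the
    # earlier StrEGA; the stress-driven update is deliberately left out of this sketch.
    return Individual(child_genome, parent.epigene)
```

With this layout two individuals in the same population can mutate at very different rates, which is the main difference from schemes that keep one global mutation rate.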

The main difference, compared to the cited studies on adaptive mutation rates in GAs, is

that each individual has its own epigenome, while normally a single mutation rate is made

global for the whole population. There are however two main points that make this previously

explored approach not perfectly applicable to this work.

Firstly, the epigenome was inheritable. Apart from the fact that epigenetic inheritance is

still a debated issue, in this work it was not strictly necessary.

Secondly, for an optimization task there is not normally a lifetime. The genome of the

individual is also its phenotype (the x coordinates of the point being passed to the function), and it directly

produces a fitness value (the y calculated by the function). The notion of stress was thus an

abstract idea: the inverse of ranking. In a truncation selection scheme, where only a fraction

of the best is allowed to reproduce, we can consider the best individual the least stressed

one, and vice versa the worst individual the most stressed one. Therefore, the inverse of ranking

served well to decide the amount of the epigenome modification for each individual at each

generation.
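
The inverse of ranking can be sketched in a few lines of Python. The normalisation of the rank to [0, 1] and, above all, the update rule in update_epigene (move the epigene towards the stress level by at most step) are hypothetical choices made for this example; the text only states that the stress indicator increases or decreases the epigene.

```python
def rank_stress(fitnesses):
    """Return one stress value per individual: 0.0 for the best, 1.0 for the worst."""
    n = len(fitnesses)
    if n == 1:
        return [0.0]
    # Indices of the individuals, ordered from highest to lowest fitness.
    order = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
    stress = [0.0] * n
    for rank, i in enumerate(order):
        stress[i] = rank / (n - 1)
    return stress

def update_epigene(epigene, stress, step=0.1):
    """Hypothetical rule: move the epigene towards the stress level by at most `step`."""
    delta = max(-step, min(step, stress - epigene))
    return min(1.0, max(0.0, epigene + delta))
```

Because only the ordering of the fitness values matters, this kind of stress measure also works when the fitness has no upper bound.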

The stress encoding used in the original work (Ticconi, 2011) led to better results than

the normal GA, over a variety of fixed mutation rates, but in other conditions it may not be

the best option. The main, practical problem with the idea of a stress-driven GA, therefore, is

that it increases the number of parameters an experimenter has to think about.

In this context, deciding what is stress is as difficult as deciding the fitness function. A complete

coupling of fitness and stress might not always be feasible (when the fitness is unbounded,

for example) or appropriate. Using the inverse of ranking may, however, be considered as a

simple alternative to environment-specific stress functions.

Despite this weakness, an objective advantage of an epigenetic-based approach remains: it

does not enlarge the genetic search space, as encoding the mutation rate in the genome would. It also

serves well the purpose of studying in isolation the adaptivity of mutation rate, excluding it

from the evolutionary process 6.

In the next chapter, after an overview of the simulator chosen for this work, a

more detailed description is given of how the StrEGA has been adapted to the current task.

6 As seen before, in biological organisms this is the main problem for researchers wishing to prove there is

an adaptive mechanism, since opponents always remark that such a mechanism could well be a product of

classic Darwinian evolution.

Chapter 2

Methods

In this work, an existing piece of software, Evorobot* 1, has been used and extended instead

of building everything from scratch. There are various reasons for this.

• Evorobot* is free, multi-platform and open source, and it has been used, modified and extended over the last 15 years by some of the researchers who started the ER field

• It is written in C++, thus speeding up the execution of the experiments, and it can be run both with and without a GUI 2, making it suitable for running on a computer cluster

• Most of all, the Khepera simulator was finely tuned, at the beginning of its development, after weeks of experiments using real robots, by sampling their sensor readings in all the conditions supported by the simulator itself and saving them in a file distributed with the package: it is therefore supposed to be more accurate than a purely mathematical simulator

Because of the last point, it is normally easy to transfer the simulated controller

to a real environment without a great loss in fitness. The trade-off is of course a

limitation on the kind of environment that can be defined, which has been here considered a

minor problem since Evorobot* already supports:

• light objects

• 3D cylindrical objects with configurable sizes, heights and colours

• walls with configurable heights and colours

• ground areas with configurable radius and colours

• Khepera robots with built-in and configurable sensors

which have proven sufficient to set up the experiments needed for this work.

1 Developed at LARAL, a laboratory of the Italian National Research Council, mainly by Stefano Nolfi and

Onofrio Gigliotta; it is a complete rewrite of the original Evorobot developed by Nolfi and Dario Floreano.

Website: http://laral.istc.cnr.it/evorobotstar

2 The GUI has been developed using the Qt Framework, version 4, which is also free, open source and performant.

The program is specifically intended for trial-based, multiple-seed evolutionary runs. Each

experiment has to be assigned a directory containing a configuration file, evorobot.cf, an optional

world file, evorobot.env, and an optional network file, evorobot.net.

When Evorobot* is run from inside an experiment directory, it reads the files and creates

appropriate internal structures, initializing the robot environment module, the evolutionary

module and the controller module. Most importantly, the number of repetitions, of generations,

of individuals, of trials and of life cycles has to be specified for each experiment.

After that, the evolution is started by first looking at the number of repetitions of the

experiment that have to be performed. For each repetition, an incremental seed is passed to the

random number generator (the first seed can be specified in the configuration file, the others

are that seed plus the index of the current repetition). This allows exact replications of the

experiments and helps to give them statistical robustness.
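
The seeding scheme can be illustrated with a short Python sketch. This is not Evorobot* code: the function names are hypothetical, Python's random module stands in for the simulator's own generator, and the repetition index is assumed to start at 0.

```python
import random

def repetition_seeds(first_seed, repetitions):
    """One seed per repetition: the first seed plus the repetition index."""
    # Assumption: indices start at 0, so the first repetition uses first_seed itself.
    return [first_seed + i for i in range(repetitions)]

def run_repetition(seed, draws=3):
    """Stand-in for a whole evolutionary run: return a few pseudo-random numbers."""
    rng = random.Random(seed)  # a separate generator for each repetition
    return [rng.random() for _ in range(draws)]

seeds = repetition_seeds(first_seed=100, repetitions=3)
runs = [run_repetition(s) for s in seeds]
```

Running the same repetition again with the same seed reproduces exactly the same sequence, while different repetitions follow different sequences.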

In each repetition, the whole environment is randomly re-initialized, and an evolutionary

process is started. The evolution lasts for the number of generations specified: in each of them,

a population is tested individual by individual, their fitness values recorded, then a fraction of

the best ones is allowed to reproduce while the others are killed. The children (offspring) of

the survivors will form a new population of the same size as the previous one, and the current

generation will end.

When an individual has to be tested, three experiment-specific functions (among others)

are used 3:

initialize_world()

initialize_robot_position()

ffitness()

3 Refer to the file robot-env.cpp for details of these functions. In the source code these

three functions are in fact function pointers, which are linked to the appropriate experiment-specific functions

using the name of the fitness function passed in the configuration file: instead of converting them into an

object-oriented style, it has been chosen to maintain the C-style approach traditionally used by the LARAL researchers.

A new, fully object-oriented version of Evorobot* was not yet available at the time of this work.


The experiment-specific versions of these functions are the three main pieces of code a researcher has to write to set up a simple experiment, unless he or she needs to modify the GA or use a non-supported network controller.

The first two functions are called whenever a new trial for an individual begins. If they are not overridden, the default functions are used: the first reads evorobot.env and randomly initializes the environment, and the second sets the robot position to random coordinates. The third is usually called at the end of the trial, but a parameter can be set to call it after every time step of the trial.
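The dispatch of the three hooks can be pictured with a small sketch. Everything here is hypothetical (the `Hooks` tuple, the `REGISTRY` dictionary, the `fitness_every_step` flag); it only mirrors the behaviour described in the text, namely default functions that can be replaced per experiment and a fitness function called either once per trial or at every time step.

```python
from typing import Callable, Dict, NamedTuple


class Hooks(NamedTuple):
    initialize_world: Callable[[dict], None]
    initialize_robot_position: Callable[[dict], None]
    ffitness: Callable[[dict], None]


def default_initialize_world(state: dict) -> None:
    state["world"] = "default"            # stands for: read evorobot.env, randomise


def default_initialize_robot_position(state: dict) -> None:
    state["robot"] = "random"             # stands for: random coordinates


def default_ffitness(state: dict) -> None:
    state["fitness_calls"] = state.get("fitness_calls", 0) + 1


DEFAULT_HOOKS = Hooks(default_initialize_world,
                      default_initialize_robot_position,
                      default_ffitness)

# Experiment-specific hooks, looked up by the fitness function name (hypothetical).
REGISTRY: Dict[str, Hooks] = {}


def hooks_for(fitness_name: str) -> Hooks:
    """Fall back to the defaults when no experiment-specific hooks are registered."""
    return REGISTRY.get(fitness_name, DEFAULT_HOOKS)


def run_trial(hooks: Hooks, steps: int, fitness_every_step: bool = False) -> dict:
    state: dict = {}
    hooks.initialize_world(state)           # called when a new trial begins
    hooks.initialize_robot_position(state)  # called when a new trial begins
    for _ in range(steps):
        if fitness_every_step:
            hooks.ffitness(state)
    if not fitness_every_step:
        hooks.ffitness(state)               # default: once, at the end of the trial
    return state
```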

In the next subsections some aspects of the simulator are explored in greater detail, with a focus on the modifications made for this work.

First, we look at the configuration files and at how the environment where the robots are tested is organized. Then there is an overview of the robot structure, its sensors and the architecture of the network that controls it.

Lastly, we present the modifications made to the classic GA to support stress-induced mutation rate variations.

2.1 Simulated Environment

The world is a square arena of 1000x1000 pixels, while the robot occupies a circle with an approximate radius of 27 pixels. The simulator makes the world toroidal by default, but in the current experiments walls (coloured in black) are used to limit the walkable surface and to stimulate the evolution of an obstacle avoidance behaviour (if a robot hits a wall, it dies and its fitness will be very low).

Depending on the experiment, there can be one, two or three ground zones (flat objects of different colours and a radius of 100 pixels), and ten cylindrical objects, five with a radius of 27 pixels and five with a radius of 12.5 pixels. The configuration file for the environment, evorobot.env, makes it easy to specify which objects have to be loaded in memory for an experiment by listing them on different lines. For example:

swall x0 y0 X1000 Y0 h1.0 c0.0 c0.0 c0.0

specifies an object of type wall, which is in fact a line of width 1 pixel starting at coordinates (0, 0) and ending at coordinates (1000, 0)⁴, with a height of 1 pixel and RGB set to (0, 0, 0), which is black⁵. The other objects are specified similarly, but without the second pair of coordinates: there is just one pair for the barycentre.

⁴ As usual in computer simulations, (0, 0) is the top left point of the arena.
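To make the structure of such a line concrete, here is a minimal parsing sketch. It is not the Evorobot* parser: the token layout (a one-letter key immediately followed by a number, with the Y token written without a space, and three consecutive `c` tokens for the RGB components) is an assumption drawn from the single example above, and the function names are hypothetical.

```python
def parse_env_line(line: str) -> dict:
    """Parse one object line, e.g. 'swall x0 y0 X1000 Y0 h1.0 c0.0 c0.0 c0.0'."""
    tokens = line.split()
    obj = {"type": tokens[0], "colour": []}
    for token in tokens[1:]:
        key, value = token[0], float(token[1:])
        if key == "c":
            obj["colour"].append(value)   # the c tokens are collected in order as R, G, B
        else:
            obj[key] = value              # x, y: start point; X, Y: end point; h: height
    return obj


def to_rgb255(colour) -> tuple:
    """Scale colour components from [0, 1] to [0, 255] for display."""
    return tuple(round(component * 255) for component in colour)
```

For the wall above, the parsed object has type `swall`, start point (0, 0), end point (1000, 0) and a colour that scales to (0, 0, 0), i.e. black.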

All the objects are loaded in memory at the beginning and can then be further manipulated. This is what initialize_world() is usually for. In this work, the ground areas are left in their fixed positions, as are the walls. In the experiments where cylinders are used, these are randomly repositioned at each initialize_world() call (at the beginning of each trial), as is the robot in every experiment (making sure that its starting position does not overlap an object, which would cause a premature death).

How the loaded world is initialized, however, depends on the parameters in the main configuration file, evorobot.cf. In some of the old experiment-specific functions in Evorobot*, for example, whether or not to randomize the cylindrical objects is decided by the parameter random_round.

In the next section there is an overview of some of the parameters used in the experiment.

2.2 Robot Architecture

The Khepera mobile robot was developed in the 90s by a team at EPFL (Mondada et al., 1999), just after the start of the evolutionary robotics field. It is a differential-wheeled, round-shaped robot of about 5.5cm diameter, able to run at a maximum speed of 1m/s.

The simulator supports light, ground and infrared (proximity) sensors, as well as some optional functionality of the physical robot, such as a gripper to grab objects and a linear camera (with a view angle from 0 to 360 degrees).

Proximity and light sensors are placed all around the robot, the ground sensor (which detects

the colour of one single pixel) is placed behind, facing the ground, and the camera on the top.

In this work, only infrared and ground sensors, and in some experiments the camera, are used. When the camera is used, the visual field is set to 36 degrees and divided into three vertical subfields, whose averaged values are copied into the respective visual inputs. If the camera is used, the values of the eight infrared sensors are averaged two by two and copied into the respective inputs; otherwise each of them has a dedicated input for the controller, which is described in the next subsection.
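The sensor pre-processing described above can be sketched as follows. The function names are hypothetical, and two details are assumptions: that the infrared sensors averaged together are adjacent ones, and that the number of camera pixels is divisible by three.

```python
def average_chunks(values, n_chunks):
    """Split `values` into n_chunks equal, contiguous groups and average each group."""
    size = len(values) // n_chunks
    return [sum(values[i * size:(i + 1) * size]) / size for i in range(n_chunks)]


def build_sensor_inputs(infrared, camera_pixels=None):
    """Return the external inputs of the controller.

    Without a camera: the 8 infrared readings, one input each.
    With a camera: 4 averaged infrared inputs followed by 3 averaged visual inputs.
    """
    if camera_pixels is None:
        return list(infrared)
    return average_chunks(infrared, 4) + average_chunks(camera_pixels, 3)
```

With the camera enabled, this gives 4 infrared inputs and 3 visual inputs, to which the ground sensor and the proprioceptors would still have to be added.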

⁵ In the configuration the range of values for colours is [0, 1] for each of the RGB components, which is then scaled up to the classic [0, 255] to be displayed with Qt.


2.2.1 Neural network architecture

The simulator supports various kinds of neural network but, as said, the CTRNN has been chosen. The architecture, as can be seen in Figure 2.1, is composed of three layers:

1. an input layer, where the sensor readings are copied, in addition to the proprioceptive sensors

2. a hidden layer composed of 6 recurrent leaky neurons, fully connected with the input layer and with the output layer

3. an output layer with only two neurons, whose values are used directly by the simulator to calculate the speed of the wheels

Figure 2.1: The neural network which controls the robot. At the bottom is the input layer, where the readings from the sensors (external or proprioceptive) are copied. In the middle are six recurrent leaky neurons, fully connected with the two motor neurons, which in turn control the speed of the wheels.

The proprioceptive sensors in the input layer are updated in the fitness function, and serve as a simplified internal feedback for the robot. This allows the robot to make decisions based not only on the external sensory inputs but also on the internal ones: if the robot is starving, for example, the energy proprioceptor will be very low. A more detailed description of the proprioceptors used will be given in the next chapter.

As said before, a CTRNN is used in this work, where each node is completely described by the following equation:

$$\tau_i \dot{y}_i = -y_i + \sum_{j=1}^{N} w_{ji}\, \sigma\big(g_j (y_j + \theta_j)\big) + I_i$$

This equation is integrated using the Euler method, so that at each time step the activations of the neurons are updated using this rule:

$$y_i \leftarrow y_i + \frac{h}{\tau_i} \Big( -y_i + \sum_{j=1}^{N} w_{ji}\, \sigma\big(g_j (y_j + \theta_j)\big) + I_i \Big)$$

$\tau_i$, $\theta_j$ and $w_{ji}$ are all evolved by the genetic algorithm. The output values of the two neurons in the output layer are used to update the speed of the motors (update_motors), which in turn is used to move the robot (move_robot). In one time step, the maximum movement in one direction is 20 pixels.
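The Euler update can be sketched in a few lines. This is an illustrative implementation of the standard CTRNN update, not the Evorobot* code: the logistic function is assumed for the activation, the state variable, time constant, bias, gain and external input of each neuron are passed as plain lists, and `w[j][i]` is taken to be the weight from neuron j to neuron i.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def ctrnn_euler_step(y, tau, theta, gain, w, inputs, h):
    """One Euler step of size h for all neurons; returns the new state vector."""
    n = len(y)
    # Output of every neuron, computed from the state before the update.
    out = [sigmoid(gain[j] * (y[j] + theta[j])) for j in range(n)]
    new_y = []
    for i in range(n):
        incoming = sum(w[j][i] * out[j] for j in range(n))
        dy = (-y[i] + incoming + inputs[i]) / tau[i]
        new_y.append(y[i] + h * dy)
    return new_y
```

For a single neuron with no recurrent weight and a constant input of 1, repeated steps make the state decay exponentially towards 1, which is the expected leaky-integrator behaviour.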

The controller cannot modify itself during the lifetime of the robot (there is no plasticity). The weights can only be modified by mutation when the best individuals reproduce, as presented in the next section.

2.3 StrEGA implementation

To support the variability of mutation rates in the StrEGA model, as said in Section 1.3, a

definition of stress has to be produced according to the particular environments and agents

being used.

As will be seen in the next chapter, the individuals in this work are rewarded for staying

not only alive, but also well: thus, starvation is considered a situation of high stress. In any

case, however, after the evaluation the stress will be a value in the range [0, 1], where 0 is no

stress at all and 1 is maximum stress.

The calculated stress is then used in the mutate_epi() function, a modification of the mutate() function in Evorobot*. It takes a gene as an argument and returns it, possibly mutated.

Since Evorobot* uses a string of integers in the range [0, 255] as a genotype, the gene passed to the mutating function is an integer value⁶.

Both the normal and the epigenetic mutating functions create a binary representation of the gene to be mutated, make a random check against the mutation rate for each of its 8 bits⁷ and, if the check is positive, flip the bit.

⁶ Each gene is scaled to the range [-5, 5] during the genotype-to-phenotype mapping, where the genome is translated into the actual weights of the neural network. In this work the same structure has been maintained: a few tests have been performed with a greater resolution, i.e. gene range [0, 1024], without a difference in fitness values.

⁷ The mutation rate here is, therefore, the probability for each of the 8 bits of a gene to be mutated.


The stress is used in mutate_epi() to modify the mutation rate such that, with x being the base mutation rate specified in the configuration file, the new mutation rate lies in the range [0.001, 2x], where high stress increases it and low stress decreases it. The check is then applied with the new mutation rate to each of the 8 bits.

To check whether the only useful component of the EGA was the possibility of increasing the base mutation rate, some of the experiments had the parameter add_stress activated. If the base mutation rate is x, after applying the stress the new value lies in the range [x, 2x]. Everything else remains the same as before.
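A minimal sketch of stress-modulated bit-flip mutation follows. Only the ranges of the resulting rate come from the text; the exact mapping from stress to rate is not given, so the linear forms used here (2x times the stress, floored at 0.001, and x(1 + stress) with add_stress) are assumptions, as is the Python signature of `mutate_epi`.

```python
import random

MIN_RATE = 0.001  # lower bound of the adaptive mutation rate


def effective_rate(base_rate: float, stress: float, add_stress: bool = False) -> float:
    """Map a stress value in [0, 1] to a per-bit mutation rate (assumed linear mapping)."""
    if add_stress:
        return base_rate * (1.0 + stress)            # range [x, 2x]
    return max(MIN_RATE, 2.0 * base_rate * stress)   # range [0.001, 2x]


def mutate_epi(gene: int, base_rate: float, stress: float,
               rng: random.Random, add_stress: bool = False) -> int:
    """Flip each of the 8 bits of a gene in [0, 255] with the stress-dependent rate."""
    rate = effective_rate(base_rate, stress, add_stress)
    for bit in range(8):
        if rng.random() < rate:
            gene ^= 1 << bit
    return gene


def gene_to_weight(gene: int) -> float:
    """Genotype-to-phenotype mapping: scale a gene in [0, 255] to [-5, 5]."""
    return -5.0 + 10.0 * gene / 255.0
```

Under this assumed mapping a stress of 0.5 leaves the base rate unchanged, lower values reduce it and higher values increase it up to twice the base rate.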

2.3.1 Dynamic brood size

A further exploration was to use the stress to influence not only the mutation rate, but also the number of offspring an individual can produce.

As said, only a certain fraction of the best individuals (determined by the nreproducing parameter) survives each generation, and the number of children of each father is normally fixed to a value specified by the offspring parameter. The final population size is then the product of the number of fathers and the offspring per father.

When the parameter dyn_brood is activated, instead, reproduction proceeds as follows:

1. after selecting the nreproducing best individuals (fathers from now on), order them by stress value

2. divide the fathers into (offspring-1) bins so that the first bin contains the least stressed individuals, and the last bin the most stressed ones

3. make the fathers in the first bin produce 2 children, the ones in the second bin 4 children and so on. If i is the index of the bin, from 1 to (offspring-1), the number of children each father belonging to bin i can produce is equal to 2i

We wanted to keep the total population the same size as with the normal reproduction system, that is equal to nreproducing times offspring, without changing the selection criterion (thus, the individuals are first selected by fitness as usual, then given different reproductive capabilities using the stress).
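The allocation of children can be sketched as follows, assuming as in the text that the number of fathers is divisible by (offspring-1). The function name and its list-based interface are hypothetical. With b = offspring - 1 bins of f = n/b fathers each, the total number of children is f(2 + 4 + ... + 2b) = f b (b + 1) = n (b + 1) = n times offspring, so the population size is preserved.

```python
def brood_sizes(stresses, offspring):
    """Number of children of each already-selected father, given its stress.

    Fathers are ranked by stress and split into (offspring - 1) equal bins;
    a father in bin i (1-based, least stressed first) gets 2 * i children.
    """
    n = len(stresses)
    bins = offspring - 1
    if bins < 1 or n % bins != 0:
        raise ValueError("number of fathers must be divisible by offspring - 1")
    per_bin = n // bins
    order = sorted(range(n), key=lambda k: stresses[k])  # least stressed first
    sizes = [0] * n
    for rank, father in enumerate(order):
        bin_index = rank // per_bin + 1
        sizes[father] = 2 * bin_index
    return sizes
```

For example, 6 fathers with offspring set to 4 are split into 3 bins of 2, producing 2, 4 and 6 children per father respectively, for a total of 24 = 6 x 4.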

In the following, a proof is provided that the proposed system yields the wanted result, assuming that the number of fathers is divisible by (offspring-1).

Let $n$ be the number of reproducing organisms, $m$ the number of children per reproducing individual and $P$ the size of the whole population, calculated as $P = n \cdot m$. Then the number of bins, as defined before, is $b = m - 1$, and the number of reproducing individuals in each bin is $f = n / b$. What we need to find is a succession of $b$ values,

2

Chapter 3

Experimental setup

The experiments have been performed in an incremental way, starting from a simple environment and slightly increasing its complexity.

The aim of the experiments is to test whether adapting the mutation rate to the environmental conditions helps a population to survive better in changing environments, and whether it leads in general to an increased overall fitness (excluding occasional fluctuations).

In each environment there are two sources of food: a good one, and a poisonous one. Two sinusoidal functions determine how beneficial or deleterious each source is at each generation. They can be seen in Figure 3.1.

Clearly, the worst moment for the population is after the two curves cross: the robots can still survive by eating from what was, a few generations before, the only food source, even though the amount of food they get at each time step is far less than before; at the same time, there is not yet a strong enough pressure to make them change source. When the source they are eating from becomes poisonous, the whole population experiences a big drop in performance, and after that individuals start to be selected for choosing the other source.

The stress should start to rise just after the switch, thus increasing the mutation rate and the exploration of the genetic space. Since the individuals will eventually lose their stress again and increase their fitness, the population should converge again until the next switch.

The environment becomes increasingly difficult because the switches become more frequent, i.e. at a certain point there will not be enough time to actually find an individual able to search for the other source, with or without adaptive mutation rates.

In the next subsections the experiments are presented in detail. Unless otherwise specified, each experiment has been repeated in full 10 times with different seeds, and each individual has been evaluated 10 times with a reloaded environment. Each evaluation (trial) lasts 5000 time steps.


(a) Easy

(b) Medium

(c) Difficult

Figure 3.1: The three sinusoidals used to calculate the amount of food (red) and poison (green). When food becomes negative it has become poison, and vice versa. The food is calculated as $food(x) = \sin(0.00015 x^2 + 1.5) \cdot a + b$, while the poison as $poison(x) = \sin(0.00015 x^2 - 1.5) \cdot a + b$. Figure 3.1a uses a = 0.5, b = 0.5, Figure 3.1b uses a = 0.7, b = 0.3, and Figure 3.1c uses a = 0.9, b = 0.1.
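The two curves can be reproduced with a short sketch. The function and preset names are hypothetical, and x is assumed to be the generation index, as suggested by the text; the squared argument makes the switches more frequent as x grows.

```python
import math

# (a, b) pairs for the three difficulty levels of Figure 3.1
PRESETS = {"easy": (0.5, 0.5), "medium": (0.7, 0.3), "difficult": (0.9, 0.1)}


def food(x: float, a: float, b: float) -> float:
    """Value of the initially good source at generation x (assumed meaning of x)."""
    return math.sin(0.00015 * x ** 2 + 1.5) * a + b


def poison(x: float, a: float, b: float) -> float:
    """Value of the initially poisonous source at generation x."""
    return math.sin(0.00015 * x ** 2 - 1.5) * a + b
```

With the easy preset both values stay within [0, 1], so a source is at worst useless; with the medium and difficult presets the values can become negative, which is when a source actually turns poisonous.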

3.1 Simple sources switch

This is the simplest environment. Apart from the walls of the arena, there are two ground areas of different colours, one near the east wall and the other near the west wall, as in Figure 3.2. The west wall has a section coloured white instead of black, which enables a discrimination task: at generation 0 the white on the wall means food, while on the opposite side there is poison.

Figure 3.2: The simple environment as displayed in Evorobot*. The white section can be seen on the west wall. At generation 0, the light blue area is food and the dark blue is poison.

The robots are equipped with 4 infrared sensors, a camera, a ground sensor and an energy proprioceptor. The energy is their food. When they go over a ground area, the value corresponding to the current generation is drawn from the respective sinusoidal: it can either increase or decrease their energy, which however cannot be greater than 1 or less than 0.

The robot loses a fixed amount of energy at each time step, and an additional amount proportional to its speed.

The fitness corresponds to the sum of the energy levels at each time step, divided by the number of time steps. This value is added to the fitness values of the other trials of the same individual, and the total is then averaged: the final value is the actual fitness used to decide whether the robot can reproduce or has to be discarded.

The stress corresponds to the number of time steps where the energy level was under 0.5.
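The energy bookkeeping, fitness and stress of this environment can be sketched as below. All names and cost parameters are hypothetical. Dividing the count of low-energy time steps by the trial length is an assumption, made only so that the stress falls in the [0, 1] range required in Section 2.3; the text itself only mentions the count.

```python
def clamp01(value: float) -> float:
    return max(0.0, min(1.0, value))


def update_energy(energy, source_value, base_cost, speed_cost, speed):
    """Energy after one time step: gain or loss from the source, minus movement costs."""
    return clamp01(energy + source_value - base_cost - speed_cost * abs(speed))


def trial_fitness(energy_trace):
    """Sum of the energy levels at each time step, divided by the number of time steps."""
    return sum(energy_trace) / len(energy_trace)


def trial_stress(energy_trace, threshold=0.5):
    """Fraction of time steps spent below the threshold (normalisation is assumed)."""
    low_steps = sum(1 for energy in energy_trace if energy < threshold)
    return low_steps / len(energy_trace)


def individual_fitness(trial_fitnesses):
    """Average of the per-trial fitness values of one individual."""
    return sum(trial_fitnesses) / len(trial_fitnesses)
```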

The optimal strategy is clearly to use the white wall as a discriminant and, depending on the generation, to go straight east or west.


3.2 Simple sources switch with water

In this case, to make the task more difficult (and thus solvable in more ways), a water source (again a ground area, with a different colour) has been added in the middle of the environment, as in Figure 3.3.

Figure 3.3: The simple environment with water as displayed in Evorobot*. The white section can be seen on the west wall. At generation 0, the light blue area is food and the dark blue is poison. The area in the middle is always water.

Compared to the previous setting, there are two more proprioceptors, for water and protein levels.

Energy is increased only if both the water and the protein levels are above 0.5. Proteins are gained in the same way as energy was gained in the previous setting (thus using the sinusoidal functions), while the water is completely refilled when the robot passes over it: it does not change with the generations.

This obliges the robot to continuously move between the current food area and the middle of the environment, making it lose a bit of energy in the process.

The stress is increased only in the time steps where the energy is below 0.5, using this formula:

$$stress_{i+1} = stress_i + 0.5\,(1 - water\_level) + 0.5\,(1 - protein\_level)$$
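The accumulation rule can be sketched as follows. The function names and the representation of a trial as a list of (energy, water, protein) tuples are hypothetical; the final division by the number of time steps is an assumption, introduced only to bring the result into the [0, 1] range used by the mutation function.

```python
def accumulate_stress(steps, threshold=0.5):
    """Apply the update rule over a trial.

    `steps` is a sequence of (energy, water_level, protein_level) tuples,
    one per time step; stress grows only while the energy is below the threshold.
    """
    stress = 0.0
    for energy, water_level, protein_level in steps:
        if energy < threshold:
            stress += 0.5 * (1.0 - water_level) + 0.5 * (1.0 - protein_level)
    return stress


def normalised_stress(steps, threshold=0.5):
    """Accumulated stress divided by the trial length (assumed normalisation)."""
    return accumulate_stress(steps, threshold) / len(steps)
```

A robot with low energy and a full water level accumulates half the stress per time step of one that lacks both water and proteins, which is the source of the decoupling between fitness and stress.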

As can be seen, there is a slight decoupling between fitness (which is calculated in the same way as before) and stress: an individual might have had a low fitness because it was able to reach neither the water source nor the food source, so its stress will be very high; or it might have had a low fitness but been good at reaching the water, only failing to find the food source, so its stress will not be as high.

If a random controller happens to be selected just after the switch (only because the others also have a very low fitness in that critical situation), its mutation rate will be higher than that of a good robot that is perfectly able to oscillate between the water and one side area, but does so on the wrong side. This should allow the children of the random walker to differ considerably from their parent, and possibly to be better at exploring the environment.

The optimal behaviour is still to reach the correct food area, and then to go back and forth

between it and the water source.

3.3 Complex sources switch

In this setting, the task is similar to the previous one: eat proteins and drink water to generate energy, whose summed amount, averaged over the trials, constitutes the fitness. The stress is also calculated in the same way, and the water area is still exactly in the middle of the environment.

Figure 3.4: The complex environment as displayed in Evorobot*. The big cylinders are randomly spread over the environment, while the small cylinders are grouped (their barycentre, however, can be in any of the four corners of the environment, chosen at random).

The robots are equipped, as before, with 4 infrared sensors, the camera, the ground sensor and the proprioceptors, but many of the experiments have also been replicated with a simplified structure, without the camera but with 8 infrared sensors. This has been done to see how the system performed with a different discrimination task.

The main difference with respect to the previous setting is that the food and poison sources are not ground areas, but cylindrical objects of different sizes: at generation 0, the small cylinders are proteins and the big cylinders are poison. They have different colours so that the camera can discriminate them more easily, and the white zone on the wall is of course absent in this experiment.

The robot has to go near them, discriminate whether they are big or small and eat them by touching them.

In each time step there is a small probability that eaten cylinders, either big or small, will grow back. In addition, big cylinders are always randomly spread over the environment, while small cylinders are always grouped together, although their barycentre is changed randomly for each trial.

The optimal strategy, then, is different in the two cases. When the small cylinders are proteins, the best approach would be to look around, avoiding poison, for the area where they are grouped and stay there. When the big cylinders are proteins, the agents have to continuously explore the environment since the cylinders are widespread, but they also need to avoid the cluster of small cylinders when they encounter it.


Chapter 4

Results

In the previous chapter, the experiments have been divided into three main groups. Each group of experiments has then been explored with different parameters. In the following there is a section for each group, which presents (unless otherwise specified):

1. the performance of the agents in a static environment, where each source gives either food (saturates completely) or poison (depletes the internal food state completely)

2. what happens if the pre-evolved agents are taken from the last generation of the previous experiment and re-evolved with completely switched sources

3. a complete evolution where the food and poison sources switch smoothly, following the previously presented sinusoidal functions

each in a different subsection. Additional plots can be seen in Appendix A.

4.1 Simple sources

See Figure 3.2 for details of the environment.

4.1.1 Static environment

In this case, the parameter kill_on_energy has always been used, considering how easy it is for an agent to reach an optimal behaviour: in most cases, twenty to thirty generations were needed to reach a best fitness near 1.

A population has been evolved for 100 generations in this fixed environment: the west source gives food (energy level set to 1 for each time step the robot is on the area), the east source gives poison (energy level set to 0 as soon as the robot enters it and, because of the kill_on_energy parameter, the robot dies instantly).


Results can be seen in Figure A.1. With all three mutation rates the best values quickly reached the maximum. The populations have been evolved without epigenetics, since they served just as a base for the switch: the population of the last generation of each of these groups has then been put into the environment with switched sources, repeated three times (with the same starting population) to test epigenetics and dynamic brood size.

4.1.2 Switch

As can be seen in Figure 4.1, when the evolved population is put into the switched environment there is a drastic performance drop, but in about 20 generations the fitness rises above 0.8 again. The base mutation rate is 0.01 here for all systems, meaning that the epigenetic one could go up to 0.02.

All three are able to recover fairly quickly, with the best value at the last generation being almost the same for all of them. We can see, instead, that using epigenetics (especially without dynamic brood size) leads to better performance for the average population.

(a) Best fitness (b) Mean fitness

Figure 4.1: Plot of the evolutionary performance after the last generation of the static environment (evolved with mutation rate 0.01) has been put in a completely switched environment (what was food is poison and vice versa). The red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic with dynamic brood size. Figure 4.1a shows the best fitness for each generation, while Figure 4.1b shows the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications, for each of the three groups.

Since this could have been explained by the increased mutation rate, two other replications have been made: one with base mutation rate 0.02, and the other with mutation rate 0.04 (see A.1.2 for details).

With 0.02 the situation is very similar: the plot of the best individuals does not show relevant differences between epigenetics and normal, even though the average seems to yield a worse result for the normal approach. When increasing again to 0.04, instead, the normal approach shows the first subtle decrease in the best plot, which can be considered negligible, and a huge decrease in the average plot, meaning that although the best individuals of the last generation were just a bit worse than with epigenetics, the whole population was much worse.

This means that the epigenetic approach can actually use an increased mutation rate without

a decrease in overall performance (because the stress goes down while the fitness goes up, so

eventually the mutation rate gets down-regulated), while the normal approach starts losing

performance.

4.1.3 Periodic change

In the third environment, each of the two ground areas can become either food or poison over time, according to two sinusoidals. The base mutation rate has been set to 0.01.

In Figure 4.2 we can see the performance while using the medium sinusoidal (see Figure 3.1b), without the parameter kill_on_energy. Five systems are now compared: normal, epigenetics, dynamic brood size, and two more: epigenetics with the add_stress parameter activated, and epigenetics with add_stress and also the dynamic brood size.


Figure 4.2: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with add_stress (violet), epigenetics with add_stress and dynamic brood size (sky blue). The base mutation rate is 0.01, the medium sinusoidal (Figure 3.1b) is used and kill_on_energy is not activated. Figure 4.2a shows the best fitness for each generation, while Figure 4.2b shows the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications, for each of the five groups.

The best performances, as before, are very similar, while plain epigenetics without other modifications again outperforms the others on the average fitness of the population. A variation of the parameters¹ also yields the same results (with a slight loss on the best plot for epigenetics, which still outperforms the others on the average plot).

¹ Adding the kill_on_energy parameter (Figure A.4), or using the hard sinusoidal (Figure A.5), with or without kill_on_energy (Figure A.6)

4.2 Simple sources with water

See Figure 3.3 for details of the environment.

4.2.1 Static environment

By obliging the agents to visit both the food area and the water to get a good fitness, the range of evolvable behaviours should be greater than before. As can be seen in Figure A.7, especially when increasing the mutation rate, it is harder to reach a high fitness even with 500 generations. The errors are far greater than before, meaning that over 10 runs some were more fortunate than others. The average fitness is heavily influenced by the mutation rate increase.

In the next subsection the last generation for each of the three mutation rates will be put

in a completely switched environment, as before.

4.2.2 Switch

Mutation rates 0.01 and 0.02 show a pattern similar to before. Epigenetics and normal have comparable best fitness values, and epigenetics outperforms both dynamic brood size and normal on the mean fitness plot. An increased reactivity to the new environment is seen, instead, in the experiment with 0.04 mutation rate (Figure 4.3).

In this case, the fixed mutation rate approach and the dynamic brood size are not able to exceed a fitness of 0.8 even at the last generation, and they show a smooth growth, in contrast to the epigenetic approach, where the increase is large and quick. The mean fitness plot, as in the other experiments, leaves the fixed mutation approach as the worst performer of the three, and epigenetics as the best.

4.2.3 Periodic change

This experiment, like Section 4.1.3, evolves agents from the beginning in a changing environment, using the same sinusoidals and again a base mutation rate of 0.01.

In Figure 4.4 we can see the performance while using the medium sinusoidal (Figure 3.1b) and the parameter kill_on_energy. The systems compared are as in Section 4.1.3: normal, epigenetics, dynamic brood size, epigenetics with the add_stress parameter activated, and epigenetics with add_stress and the dynamic brood size.



Figure 4.3: Plot of the evolutionary performance after the last generation of the static environment (evolved with mutation rate 0.04) has been put in a completely switched environment (what was food is poison and vice versa). The red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic with dynamic brood size. Figure 4.3a shows the best fitness for each generation, while Figure 4.3b shows the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications, for each of the three groups.




medium sinusoidal (Figure 3.1b) is used and kill_on_energy is activated. Figure 4.4a shows the best fitness for each generation, while Figure 4.4b shows the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications, for each of the five groups.

There is not a clear difference in performance in the best plot, but after every change epigenetics is again able to stay slightly above the others (except before the first switch, where it is outperformed by add_stress), while the best two are again epigenetics and dynamic brood size. This also happens when using the hard sinusoidal (see Figure A.12 for the experiment plots). In the other two experiments (medium sinusoidal with kill_on_energy (Figure A.10) and hard sinusoidal without kill_on_energy (Figure A.11)) the fixed mutation rate has slightly better performance in the best plot than the various epigenetic ones, but is still outperformed in the mean plot.

4.3 Complex sources

This group of experiments has been the most difficult one for the agents. Figure 3.4 shows how the environment is structured. The next two sections present, respectively, the results when using both camera and infrared sensors, and when using only the infrared sensors.

Because of the difficulty of the environment, reflected by the low performance, only the two lowest mutation rates (0.01 and 0.02) and the two easiest sinusoidals (easy and medium) have been used.

4.3.1 Camera

Easy sinusoidal

The sinusoidal used here is easy because, at worst, the cylinders give little to no food, but they are never really poisonous. Nevertheless, the task remains difficult and the performance tends to be low. In the comparison, however, epigenetics is never outperformed by the fixed mutation rate approach. When the base mutation rate is 0.01, all systems apart from add_stress follow a similar pattern. This also happens when the mutation rate is 0.02, but only if kill_on_energy is not activated. When it is (see Figure 4.5), epigenetics outperforms all the others.




easy sinusoidal (Figure 3.1a) is used and kill_on_energy is activated. Figure 4.5a shows the best fitness for each generation, while Figure 4.5b shows the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications, for each of the five groups.


Medium sinusoidal

The overall performance is very low: only in certain cases is the fitness value 0.8 reached. The five tested systems (as before) all tend to have low performance. In one combination of parameters (see Figure 4.6), however, the dynamic brood size outperforms all the others in both the best and the mean plot.








4.3.2 Infrared

Easy sinusoidal

Figure 4.7 shows one of the experiments with infrared sensors and the easy sinusoidal. All the systems have low performance and are very similar, but epigenetics is slightly better in both the best plot and the mean plot, where the improvement is more pronounced and is also followed by the dynamic brood size system.




easy sinusoidal (Figure 3.1a) is used and kill_on_energy is not activated. Figure 4.7a shows the best fitness for each generation, while Figure 4.7b shows the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications, for each of the five groups.

Medium sinusoidal

Using the medium sinusoidal decreases the overall performance while leaving the relative performance unchanged. Figure 4.8 shows the same configuration as for the easy sinusoidal. The performance of epigenetics is still low, but slightly better than that of the others. For the other configurations, the difference is negligible.







Chapter 5

Conclusion

Starting from a debated issue in evolutionary biology, the adaptivity of mutations, we developed

an Evolutionary Robotics model able to incorporate the idea of stress-driven modification of the

mutation rates, through the use of a structure parallel to the (artificial) genome: the epigenome.

Three different environments, of increasing difficulty, have been used to test the initial hypothesis. The results are promising but do not show a general and consistent increase in performance, even though in most of the experiments epigenetics scored better, meaning that it allowed the population either to recover more quickly from a situation of high stress, or to reach a high fitness value more quickly, or both.

The add stress and add stress with dynamic brood configurations served to check whether the performance gain of epigenetics could be due simply to the fact that the mutation rate increases. This has been shown not to be the case, because neither a higher fixed mutation rate nor the two add stress configurations were able to outperform the plain epigenetic system in most experiments. This could mean that at certain moments the mutation rate needs to be smaller, to allow a smoother evolutionary process (i.e., if the individuals are already good, there is no need for extensive exploration of the genetic space). When the environment changes, the mutation rate needs to be higher, so as to increase the probability that at least some of the individuals will survive.
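
The reasoning above can be illustrated with a minimal Python sketch. It is not the implementation used in the experiments: the linear mapping from stress to mutation rate, the function names `adaptive_mutation_rate` and `mutate`, the binary genome and the upper bound `max_rate` are all assumptions made for illustration. Only the base rate of 0.01 is taken from the experiments.

```python
import random


def adaptive_mutation_rate(stress, base_rate=0.01, max_rate=0.04):
    """Map a stress level in [0, 1] to a per-gene mutation rate.

    Hypothetical linear rule: no stress gives base_rate,
    full stress gives max_rate (an assumed upper bound).
    """
    stress = min(1.0, max(0.0, stress))  # clamp stress to [0, 1]
    return base_rate + stress * (max_rate - base_rate)


def mutate(genome, rate, rng):
    """Flip each bit of a binary genome independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in genome]


rng = random.Random(0)
genome = [0] * 1000

# Well-adapted individual: low stress, little exploration.
calm = mutate(genome, adaptive_mutation_rate(0.0), rng)

# Individual after an environmental change: high stress, more exploration.
stressed = mutate(genome, adaptive_mutation_rate(1.0), rng)
```

Under this rule a population that is already fit mutates at the base rate, while a stressed one explores the genetic space more widely, which is the behaviour the comparison with the add stress configurations suggests is useful.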

Different environments and different definitions of stress would need to be developed to confirm the results and to test to what extent an adaptive mutation scheme can be beneficial. In particular, it has not been tested whether increasing the range of the mutation rate in the epigenetic system can lead to too many deleterious mutations and reduce the performance.

Another path that looked promising in (Ticconi, 2011) is that of epigenetic inheritance. Allowing the mutation rate, modified by the stress experienced during the lifetime of an individual, to be passed to the individual's offspring would in fact enable a sort of phylogenetic memory, at least between nearby generations, of how harsh the environment is in that period.
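
As an illustration only, epigenetic inheritance could be sketched as follows. This is not the scheme explored in (Ticconi, 2011): the `Individual` class, the `inheritance` blending factor, the `MAX_RATE` bound and the linear stress update are hypothetical choices. The offspring starts from a mix of the base rate and the parent's stress-modified rate, so the memory of a harsh period fades over a few generations unless stress persists.

```python
from dataclasses import dataclass

BASE_RATE = 0.01  # base mutation rate, one of the values used in the experiments
MAX_RATE = 0.04   # assumed upper bound of the adaptive range


@dataclass
class Individual:
    mutation_rate: float = BASE_RATE

    def experience_stress(self, stress):
        """Move the rate towards MAX_RATE in proportion to stress in [0, 1]."""
        stress = min(1.0, max(0.0, stress))
        self.mutation_rate += stress * (MAX_RATE - self.mutation_rate)

    def offspring(self, inheritance=0.5):
        """Create a child whose rate blends BASE_RATE with the parent's current rate."""
        rate = (1.0 - inheritance) * BASE_RATE + inheritance * self.mutation_rate
        return Individual(mutation_rate=rate)


parent = Individual()
parent.experience_stress(1.0)   # harsh period: the rate rises to MAX_RATE
child = parent.offspring()      # inherits part of the increase
grandchild = child.offspring()  # the memory fades without further stress
```

With `inheritance=0.0` the sketch reduces to the non-inherited case, where every individual starts from the base rate regardless of what its parent experienced.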

This experimentation has, in addition, raised some questions about the suitability of a classical trial-based ER model for further analysis of this subject. An open-ended, multi-agent artificial life simulation might be a better tool to study the dynamics of mutation rates in a population acting in real time.

Bibliography

Aneil F Agrawal and Alethea D Wang. Increased transmission of mutations by low-condition

females: evidence for condition-dependent DNA repair. PLoS biology, 6(2):e30, February

2008. ISSN 1545-7885.

Charles F Baer. Does mutation rate depend on itself. PLoS biology, 6(2):e52, February 2008.

ISSN 1545-7885.

Charles F Baer, Michael M Miyamoto, and Dee R Denver. Mutation rate variation in multicellular eukaryotes: causes and consequences. Nature reviews. Genetics, 8(8):619-31, August 2007. ISSN 1471-0056.

Randall D. Beer. The Dynamics of Active Categorical Perception in an Evolved Model Agent.

Adaptive Behavior, 11(4):209-243, December 2003. ISSN 1059-7123.

KY Chan, TC Fogarty, and ME Aydin. Genetic algorithms with dynamic mutation rates

and their industrial applications. International Journal of Computational Intelligence and

Applications, 7(2), 2008.

Jeff Clune, Dusan Misevic, Charles Ofria, Richard E Lenski, Santiago F Elena, and Rafael

Sanjuan. Natural selection fails to optimize mutation rates for long-term adaptation on

rugged fitness landscapes. PLoS computational biology, 4(9):e1000187, January 2008. ISSN

1553-7358.

S Cotton. Condition dependent mutation rates and sexual selection. Journal of evolutionary

biology, 2009.

Francis Crick. Central Dogma in Molecular Biology. Nature, 227(8):561-563, 1970.

Christoffel Dinant, Adriaan B Houtsmuller, and Wim Vermeulen. Chromatin structure and

DNA damage repair. Epigenetics and chromatin, 1(1):9, January 2008. ISSN 1756-8935.

Dario Floreano, Mototaka Suzuki, and Claudio Mattiussi. Active vision and receptive field

development in evolutionary robots. Evolutionary computation, 13(4):527-44, January 2005.

ISSN 1063-6560.

NC Fonville. Stress-induced modulators of repeat instability and genome evolution. Journal of

Molecular Microbiology and Biotechnology, 2011.

Rodrigo S Galhardo, P J Hastings, and Susan M Rosenberg. Mutation as a stress response and

the regulation of evolvability. Critical reviews in biochemistry and molecular biology, 42(5):

399-435, 2007. ISSN 1040-9238.

Janet L Gibson, Mary-Jane Lombardo, Philip C Thornton, Kenneth H Hu, Rodrigo S Galhardo,

Bernadette Beadle, Anand Habib, Daniel B Magner, Laura S Frost, Christophe Herman, P J

Hastings, and Susan M Rosenberg. The sigma(E) stress response is required for stress-induced

mutation and amplification in Escherichia coli. Molecular microbiology, 77(2):415-30, July

2010. ISSN 1365-2958.

Caleb Gonzalez, Lilach Hadany, and RG Ponder. Mutability and importance of a hypermutable

cell subpopulation that produces stress-induced mutants in Escherichia coli. PLoS genetics,

4(10):e1000208, January 2008. ISSN 1553-7404.

EL Greer, TJ Maures, Duygu Ucar, and AG Hauswirth. Transgenerational epigenetic inheritance of longevity in Caenorhabditis elegans. Nature, 479(7373):365-371, October 2011. ISSN 0028-0836.

Jatinder N.D Gupta and Randall S Sexton. Comparing backpropagation with a genetic algorithm for neural network training. Omega, 27(6):679-684, December 1999. ISSN 0305-0483.

David Haig. Weismann Rules! OK? Epigenetics and the Lamarckian temptation. Biology &

Philosophy, 22(3):415-428, December 2006. ISSN 0169-3867.

Heather Hendrickson, E Susan Slechta, Ulfar Bergthorsson, Dan I Andersson, and John R

Roth. Amplification-mutagenesis: evidence that directed adaptive mutation and general

hypermutability result from growth with a selected gene amplification. Proceedings of the

National Academy of Sciences of the United States of America, 99(4):2164-9, February 2002.

ISSN 0027-8424.

John H. Holland. Adaptation in Natural and Artificial Systems. The University of Michigan

Press, 1975.

Eva Jablonka and Marion J Lamb. Evolution in four dimensions: Genetic, epigenetic, behavioral, and symbolic variation in the history of life. MIT Press, 2005.

Josephine M Kang, Nicole M Iovine, and Martin J Blaser. A paradigm for direct stress-induced

mutation in prokaryotes. FASEB journal: official publication of the Federation of American

Societies for Experimental Biology, 20(14):2476-85, December 2006. ISSN 1530-6860.

Marvin Minsky and Seymour Papert. Perceptrons: An introduction to computational geometry.

MIT Press, 1st edition, 1969.

Melanie Mitchell. An Introduction to Genetic Algorithms. The MIT Press, 1998.

F Mondada, E Franzi, and A Guignard. The development of Khepera. Experiments with the Mini-Robot Khepera, Proceedings of the First International Khepera Workshop, pages 7-14, 1999.

Stefano Nolfi and Dario Floreano. Evolutionary Robotics. MIT Press, 2000.

Domenico Parisi and Federico Cecconi. La società dei beni. Dalla famiglia allo stato alle imprese

private. Bollati Boringhieri, 2006.

Ales Pecinka, Huy Q Dinh, Tuncay Baubec, Marisa Rosa, Nicole Lettner, and Ortrun Mittelsten

Scheid. Epigenetic regulation of repetitive elements is attenuated by prolonged heat stress in

Arabidopsis. The Plant cell, 22(9):3118-29, September 2010. ISSN 1532-298X.

R Pfeifer, M Lungarella, and O Sporns. The synthetic approach to embodied cognition: a

primer. Handbook of Cognitive Science, 2008.

Rolf Pfeifer, Max Lungarella, and Fumiya Iida. Self-organization, embodiment, and biologically

inspired robotics. Science (New York, N.Y.), 318(5853):1088-93, November 2007. ISSN

1095-9203.

S M Rosenberg. Evolving responsively: adaptive mutation. Nature reviews. Genetics, 2(7):

504-15, July 2001. ISSN 1471-0056.

Nathaniel P Sharp and Aneil F Agrawal. Evidence for elevated mutation rates in low-quality

genotypes. Proceedings of the National Academy of Sciences of the United States of America,

109(16):6142-6, April 2012. ISSN 1091-6490.

Jeffrey D Stumpf, Anthony R Poteete, and Patricia L Foster. Amplification of lac cannot

account for adaptive mutation to Lac+ in Escherichia coli. Journal of bacteriology, 189(6):

2291-9, March 2007. ISSN 0021-9193.

D Thierens. Adaptive mutation rate control schemes in genetic algorithms. Evolutionary Computation, 2002. CEC '02. Proceedings of the 2002 Congress on, 1:980-985, 2002.

Fabio Ticconi. StrEGA: Stress-driven EpiGenetic Algorithm. Technical report, University of

Sussex, 2011.

Elio Tuci, M Quinn, and I Harvey. An evolutionary ecological approach to the study of learning

behavior using a robot-based model. Adaptive Behavior, 10(3-4):201-221, 2002.

August Weismann. The Germ-Plasm: A Theory of Heredity. Charles Scribner's Sons, 1893.

Mahmoud W Yaish, Joseph Colasanti, and Steven J Rothstein. The role of epigenetic processes

in controlling flowering time in plants exposed to stress. Journal of experimental botany, 62

(11):3727-35, July 2011. ISSN 1460-2431.

Yongzhong Zhao and Richard J Epstein. Programmed genetic instability: a tumor-permissive

mechanism for maintaining the evolvability of higher species through methylation-dependent

mutation of DNA repair genes in the male germ line. Molecular biology and evolution, 25(8):

1737-49, August 2008. ISSN 1537-1719.

Appendix A

Additional Plots

A.1 Simple sources switch

A.1.1 Static environment

(a) Mutation rate 0.01

(b) Mutation rate 0.02

(c) Mutation rate 0.04

Figure A.1: Each of the plots represents the values of the best (red line) and mean fitness, with the error bars calculated

using standard error (values taken from 10 repetitions of each experiment). The environment is static for the whole 100

generations.

A.1.2 Switch

Mutation rate 0.01

See Figure 4.1.

Mutation rate 0.02


Figure A.2: Plot of the evolutionary performance after the last generation of the static environment (evolved with mutation rate 0.02) has been put in a completely switched environment (what was food is poison and vice versa). The red line is the non-epigenetic approach, the green line is epigenetics and the blue line is epigenetics with dynamic brood size. In Figure A.2a there is the plot of the best fitness for each generation, while in Figure A.2b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications, for each of the three groups.

Mutation rate 0.04








A.1.3 Periodic change

Medium sinusoidal

See Figure 4.2.

Medium sinusoidal, Kill on Energy


Figure A.4: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics

with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,

the medium sinusoidal (Figure 3.1b) is used and kill on energy is activated. In Figure A.4a there is the plot of the

best fitness for each generation, while in Figure A.4b there is the plot of the mean fitness for each generation. In both

cases, the values are averaged over 10 experiment replications, for each of the five groups.

Hard sinusoidal




the hard sinusoidal (Figure 3.1c) is used and kill on energy is not activated. In Figure A.5a there is the plot of the



Hard sinusoidal, Kill on Energy




the hard sinusoidal (Figure 3.1c) is used and kill on energy is activated. In Figure A.6a there is the plot of the best

fitness for each generation, while in Figure A.6b there is the plot of the mean fitness for each generation. In both cases,

the values are averaged over 10 experiment replications, for each of the three groups.

A.2 Simple sources switch with water

A.2.1 Static environment

(a) Mutation rate 0.01

(b) Mutation rate 0.02

(c) Mutation rate 0.04

Figure A.7: Each of the plots represents the values of the best (red line) and mean fitness, with the error bars calculated

using standard error (values taken from 10 repetitions of each experiment). The environment is static for the whole 100

generations.

A.2.2 Switch

Mutation rate 0.01








Mutation rate 0.02








Mutation rate 0.04

See Figure 4.3.

A.2.3 Periodic change

Medium sinusoidal




the medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure A.10a there is the plot of

the best fitness for each generation, while in Figure A.10b there is the plot of the mean fitness for each generation. In

both cases, the values are averaged over 10 experiment replications, for each of the three groups.

Medium sinusoidal, Kill on Energy

See Figure 4.4.

Hard sinusoidal




the hard sinusoidal (Figure 3.1c) is used and kill on energy is not activated. In Figure A.11a there is the plot of the



Hard sinusoidal, Kill on Energy




the hard sinusoidal (Figure 3.1c) is used and kill on energy is activated. In Figure A.12a there is the plot of the best

fitness for each generation, while in Figure A.12b there is the plot of the mean fitness for each generation. In both cases,

the values are averaged over 10 experiment replications, for each of the three groups.
