Investigation of Mutation Rates Adaptivity in Changing
Environments: an Evolutionary Robotics Approach
Fabio Ticconi
August 30, 2012
Abstract
Taking inspiration from recent research in microbiology and evolutionary theory, this work
explores the possibility that making the mutation rate of a genetic algorithm adaptive, instead
of fixed, could improve performance under changing environmental conditions. To this end, a
simplified model of stress-driven mutation rates is proposed and integrated into the general
principles and techniques of Evolutionary Robotics. A notion of environmental stress has been
defined for three different environments in which populations of simulated Khepera robots were
evolved and analysed for their ability to regain fitness after their environment was modified,
and to keep their fitness high while the environment changed periodically. The approach proved
promising and outperformed the fixed mutation rate in most experiments, in certain environments
recovering more quickly from a drastic change in the external conditions. Extensions need to be
developed in future work to study the potential and limits of mutation rate adaptivity.
Contents
1 Background
1.1 Epigenetics
1.2 Evolutionary Robotics
1.2.1 Artificial Neural Networks
1.2.2 Genetic Algorithms
1.3 Stress-driven Epigenetic Algorithm
2 Methods
2.1 Simulated Environment
2.2 Robot Architecture
2.2.1 Neural network architecture
2.3 StrEGA implementation
2.3.1 Dynamic brood size
3 Experimental setup
3.1 Simple sources switch
3.2 Simple sources switch with water
3.3 Complex sources switch
4 Results
4.1 Simple sources
4.1.1 Static environment
4.1.2 Switch
4.1.3 Periodic change
4.2 Simple sources with water
4.2.1 Static environment
4.2.2 Switch
4.2.3 Periodic change
4.3 Complex sources
4.3.1 Camera
4.3.2 Infrared
5 Conclusion
A Additional Plots
A.1 Simple sources switch
A.1.1 Static environment
A.1.2 Switch
A.1.3 Periodic change
A.2 Simple sources switch with water
A.2.1 Static environment
A.2.2 Switch
A.2.3 Periodic change
Acknowledgements
This work is dedicated to my family and to Anto, who give me love when I most need it.
My most sincere thanks go to all the people with whom I shared an amazing, challenging and
stimulating year. The closest friends, they already know. Of course they know, we have been
sweating and laughing side by side.
A big thanks to all the tutors and professors who have made the start of this adventure easy
and enjoyable, and the progress challenging and interesting.
A special thanks to Matthew, my supervisor, who has always been able to put order in my
confused ideas and taught me a couple of important lessons, while always being friendly and
ready to give advice.
Introduction
The adaptive mutation is a little cloud that obscures the beauty and clearness of the
molecular biological perspective on Life.
Vasily Ogryzko
Mainstream evolutionary theory states that genetic mutations are blind, that is, they are not
directed at specific genes. It is argued that they occur at random times and places in the
genotype and that they normally generate lower-fitness phenotypes, meaning that most
mutations are actually harmful despite being an important source of variation in a population
1. Crick's central dogma of molecular biology is often cited to defend this position:
The central dogma of molecular biology deals with the detailed residue-by-residue
transfer of sequential information. It states that such information cannot be
transferred back from protein to either protein or nucleic acid.
Following the argument of Crick (1970), the proteins synthesized using DNA and RNA
cannot in turn influence the nucleic acids. This strengthens the idea that mutations have
to be blind, because proteins that receive stimuli from outside the cytoplasm cannot react to
them by modifying the DNA, assuming of course that the dogma is true.
If we take the randomness of mutations for granted, their rate should theoretically be very
small. Mutations are needed to explore the genetic space and to maintain diversity in a
population, but this should happen without making every offspring unable to survive because of
too many deleterious mutations.
According to classical Darwinian evolution, the phenotype of each individual is subjected to
natural selection: individuals with deleterious mutations will likely be eliminated (because of
their reduced fitness), while those with neutral or only slightly harmful mutations may
continue to reproduce; finally, the few that have undergone a beneficial mutation
will be fitter and will reproduce more, thus spreading the newly found gene into the population.
1 See Jablonka and Lamb (2005) for a history of the debate about the randomness of mutations.
As Baer (2008) put it, deleterious mutations are the price we living organisms pay for the
ability to evolve.
During the last few decades, though, new findings in molecular biology have led researchers
to question whether this view is correct. Can mutations be adaptive, that is, can an
organism modify its own mutation rate or direct a mutation onto specific genes to
make its offspring more likely to survive?
This work does not aim to answer these questions, but to define and test a model able
to embody, in a simplified way, the essential mechanisms of mutation rate adaptivity. The
hypothesis of this work is that, whether they exist in nature or not, adaptive mutations can
be beneficial in a population undergoing environmental stress, accelerating the evolutionary
process.
A prediction of this work is that a properly defined stress-driven mutation rate adaptivity
should allow a population to escape a risk of extinction and quickly regain its vitality. To
verify this, such a model needs to be defined and then tested in a properly set up environment.
Chapter 1 first presents an overview of the current debate over the adaptivity of
mutations, then Chapter 2 describes the methods used here to define and test the
above-mentioned model.
Chapter 3 describes the various experimental setups in which the model has been tested,
followed by the results obtained in Chapter 4. Chapter 5 presents the conclusions of this
work, where the results are discussed and possible future improvements are outlined.
Chapter 1
Background
The question of whether mutations can be adaptive has given rise to a long-lasting debate,
since it touches what is considered a central point of evolutionary theory and because, if the
answer proves affirmative, we would in fact be acknowledging a sort of Lamarckian process 1
alongside the classic Darwinian one.
Notwithstanding the importance of the question and the amount of studies on the subject,
this problem has yet to be considered closed.
On the one hand, proponents of the adaptivity hypothesis have shown that the mutation
rate is not constant even between closely related individuals and groups, and that it is instead
correlated with fitness: the lower the fitness, the higher the mutation rate (Baer et al., 2007;
Baer, 2008). It can be argued that low-fitness individuals, under certain environmental
conditions, start to have a higher mutation rate (possibly specific to the genes that make them
perform poorly in that particular environment), thus increasing their likelihood of generating
offspring better able to cope with the environment than themselves. That would make their
mutations adaptive. A mutation is said to be directed if it occurs specifically in genes that can
increase the organism's fitness (Rosenberg, 2001).
On the other hand, however, this could just be due to the cost of keeping the DNA
intact, which under bad environmental conditions might be too high for an individual. This could
lead to an increased mutation rate and eventually to the generation of offspring with a higher
fitness than the parent, or it could be due to other reasons still explainable under a classic
Darwinian framework (Baer, 2008).
An example of how subtle the difference between an adaptive and a non-adaptive mutagenic
process can be is seen with particular lac− mutants of Escherichia coli, unable to feed and grow
properly in a lactose-rich environment. In this situation, there seems to be an adaptive response
that drives the lac− cells to increase their mutation rate (a phenomenon known as hypermutation),
so that eventually an offspring will end up with a lac+ gene which, by increasing the cell
growth in that particular environment, will quickly spread into the population.
1 This is especially a problem if directed mutations are proven to happen, because in that case
the environment will directly modify particular genes, which will then be inherited.
Hendrickson et al. (2002) argued that the adaptivity of mutations in this case is only apparent
and proposed a model where the lac gene is amplified in the genotype by means of random
mutations, resulting in a slight increase of the growth rate of the cell.
The selection will then favour a further amplification of this gene, which also means an
increase in the probability of having one of these repeated lac genes mutated into a lac+ gene
at the normal, constant mutation rate. When this happens, lac+ cells will grow at a higher rate
and survive more than the others: selective pressure will make the lac+ gene spread into the
population, eventually covering it all. When analysed without taking gene amplification into
account, the researchers argued, this whole process can be misread as an example of adaptive
mutation rates or even a directed mutation.
Stumpf et al. (2007) instead tested many predictions of the amplification model, finding that
most of them do not hold in this case: for example, cells that do actually amplify the lac gene
can end up with a lac+ gene, but this is unstable because of the many copies of lac. The stable
lac+ cells are instead shown to have been generated without amplification in the parent.
In the last decade many other experiments and analyses have been performed on the lac−
mutant of E. coli, bringing more evidence to the hypothesis of mutations as an adaptive stress
response to challenging environmental conditions. Directed mutation, however, seems to be
excluded: a transient hypermutation under stress conditions is enough for a lac− population
to adapt and become lac+, whilst it is still debated how the population is able to
survive a huge, albeit temporary, increase in mutations (Gibson et al., 2010; Fonville, 2011;
Rosenberg, 2001; Galhardo et al., 2007; Gonzalez et al., 2008).
Condition-dependent mutation rates have also been found in some multicellular organisms,
although more controversially (Baer, 2008; Agrawal and Wang, 2008; Cotton, 2009; Sharp and
Agrawal, 2012), and it has also been suggested that human cancers arise in part as an
evolutionarily programmed side effect of age- and damage-inducible genetic instability affecting both
somatic and germ line lineages (Zhao and Epstein, 2008): an increase in mutation rates in the
human male sperm while ageing and when experiencing environmental stress seems to lead to
increased species adaptation, even though at a high cost for the single individuals.
The debate, as said, is still open, but two more aspects need to be considered.
Firstly, most of the cited works assume the mutation rate to be evolvable (because the
molecules that control DNA repair and copying are in fact encoded by the DNA itself).
Secondly, a recent work (Clune et al., 2008) pointed out how natural selection, in a strict
Darwinian framework, fails to find the optimal mutation rates in difficult environments if they
are encoded into the genome.
Assuming the last two points are true 2, mutation rates are not adaptive even on the
phylogenetic scale, let alone the ontogenetic one. A recent field called epigenetics could help to
escape this apparent dead end for the adaptivity hypothesis, adding a new dimension to the scenario.
1.1 Epigenetics
DNA is just a tape carrying information, and a tape is no good without a player.
Epigenetics is about the tape player.
Bryan Turner 3
Epigenetics is the name given to a relatively recent field studying the molecules that are
around the DNA and that regulate gene expression. They are also found in the cytoplasm and
are highly sensitive to environmental stress (Yaish et al., 2011; Pecinka et al., 2010), modifying
the genetic expression (also switching on and off particular genes) in different conditions.
What has prompted a rather heated debate is the possibility for the epigenome (the name
given to the whole set of molecules with epigenetic functions) to be heritable 4.
It is easy to see why: the epigenome of an individual responds to environmental stress
by modifying gene expression or switching off a gene altogether, so the children that inherit the
epigenome could show the same effects on gene expression without having experienced the stress
themselves. This is called inheritance of acquired characteristics, and has been opposed since the
end of the 19th century on account of its Lamarckism (Weismann, 1893).
Since there is some evidence (Dinant et al., 2008) that DNA repair is one of the duties of
the epigenome, which in turn is highly responsive to environmental stress, it is possible that
this stress may be a cause of transient, adaptive mutation rate change.
In this work, as stated in the introduction, a simplified model of stress-driven mutation
rates is developed without trying to maintain a biological resemblance. The
(possible) adaptivity of mutations in real organisms, together with the (still unproven) idea
of an epigenome as an interface between environmental stress and genetic changes, is an
inspiration to investigate whether artificial organisms with such characteristics perform better
than others without them.
2 The inability to find an optimal mutation rate, however, may not be a problem as long as there is
hypermutability, even if evolved by means of natural selection without any direct influence of the
stress, as Kang et al. (2006) show.
3 http://epigenome.eu/en/1,1,0
4 See Jablonka and Lamb (2005) and a counter-argument by Haig (2006); for evidence in plants see
Greer et al. (2011).
To develop such a model of phylogenetic adaptivity, incorporating epigenetics, the principles
and methods of Evolutionary Robotics (ER) were chosen. The next section
presents a brief history of the field and the rationale of this approach for studying evolution.
1.2 Evolutionary Robotics
ER developed during the 1990s as a new approach to the study of cognition: a so-called
synthetic approach to cognition (Pfeifer et al., 2007). Its first use was as a model to study
minimally cognitive agents, which were built or simulated and put in an environment to solve
tasks. The simplified structure of both the robots and the environments, together with
improvements in computer hardware, allowed researchers to perform real-time as well as offline
analysis in reasonable time, compared to real organisms.
The whole idea is that of imitating the evolutionary process to shape the controller, as well
as the body, of a robot in a goal-oriented way. ER has been applied in engineering to design
actual robots or other complex tools, like satellites, but it has been most commonly used to
test scientific hypotheses in various fields: psychology, ecology, evolutionary biology, sociology,
economics or more recently abiogenesis (Parisi and Cecconi, 2006; Tuci et al., 2002; Nolfi and
Floreano, 2000; Pfeifer et al., 2008).
The assumption, of course, is that the natural environment does in fact constitute an arena
for real organisms, whose bodies and behaviours are tested and selected by means of natural
selection: this allows only the fittest individuals to survive and to become increasingly better in
that particular environment, and in this way shapes the body and behaviour of natural species. If
the experimenter wants to analyse such a process in a controlled way, or to produce a robot fit for
a particular environment, he or she just needs to define precisely the characteristics of the
environment and a test function able to quantify how well a particular robot performs when put in
that environment for a certain time, leaving the rest to artificial evolution.
The keywords in ER are, thus, evolution and controller: the first to shape the second so that
the needed behaviour can be seen. ER, in fact, arose after two other techniques had become
mature, artificial neural networks (ANN) and genetic algorithms (GA), that are outlined in the
next two subsections.
1.2.1 Artificial Neural Networks
Researchers involved in ER, as said, wanted to produce agents able to move in simulated
and real environments, responding quickly to external stimuli. The behaviour-based controllers
introduced by Brooks, the first models for biologically inspired robotics, were too high-level an
abstraction for the purposes of ER. The idea of using artificial neural networks (which can
be easily encoded into genome-like strings) as a controller able to convert environmental stimuli
into behaviour was instead, at that point, a natural one (Nolfi and Floreano, 2000).
Firstly, ANNs are a simplified model of biological neural networks, which in turn are
the main focus of non-representationalist views of cognition, like connectionism, that were at
the time growing in recognition. In addition, ANNs were again experiencing an explosion in
research, after the big drop caused by the work of Minsky and Papert (1969): the initial problems
with the computational limits of the perceptron had been solved with multilayer perceptrons. GAs
and the backpropagation algorithm, alone and combined, were shown to be able to correctly update
the weights of ANNs to solve complex tasks in the machine learning field, but in many of
the ER experiments there was no need to use the backpropagation algorithm 5. The main
difference is that with GAs the focus is on the adaptation of the species, rather than on that of
single individuals. There is also evidence that GAs are superior to backpropagation in finding the
weights of ANNs (Gupta and Sexton, 1999), so the latter is rarely found in ER
studies.
Many different ANN models, in addition to the simple multilayer perceptron, have been
studied and tested for ER experiments (Nolfi and Floreano, 2000): a particularly common one
is the Continuous-time recurrent neural network (CTRNN) (Beer, 2003), which is the one used
in this work.
1.2.2 Genetic Algorithms
Holland (1975) proposed a framework to systematically evolve genomes (initially only with
binary genes) using operators inspired by genetics, like mutation and recombination. His initial
goal was not to invent a system to solve complex tasks, but to study evolution under
controlled conditions and to show that the evolutionary process is not limited to natural
organisms.
5 In certain cases it was actually used, together with the genetic algorithm, to simulate both the
evolution of the species and the lifetime learning of each individual. Parameters of the
backpropagation algorithm can then be evolved too, removing the need for the experimenter to
determine them. For the purpose of lifetime neural plasticity, Hebbian learning has also been
extensively used (Floreano et al., 2005).
The main idea is, given a certain problem, to define an unambiguous encoding of its possible
solutions (the genotypes), then create a set of strings that can be translated from the encoded
version into a testable representation. This is the starting population over which the artificial
selection can be applied.
A binary string, for example, could uniquely encode the shape of a particular object by
specifying, for each gene, whether a particular characteristic has to be present or not. The
object, built using the specifications in the genome, can be tested in a pre-specified environment.
A value of goodness, called fitness, is then assigned to that particular object in that particular
environment. A selection criterion can then be applied: for example, keep only the best five
solutions and clone their genetic codes. Then apply mutations and combine the clones two by two
using crossover, to obtain a population of the same size as the initial one. The process is then
repeated.
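The cycle just described can be sketched in a few lines of Python. This is only an illustration of the general scheme, not the implementation used in this work: the function name next_generation, the default of five parents, the mutation rate of 0.02 and the use of one-point crossover are all assumptions made for the example.

```python
import random

def next_generation(population, fitness, n_best=5, mutation_rate=0.02, rng=random):
    """One generation of a simple binary GA with truncation selection (sketch)."""
    size = len(population)
    # Keep only the n_best genomes, ranked from highest to lowest fitness.
    best = sorted(population, key=fitness, reverse=True)[:n_best]

    def mutated_clone(genome):
        # Copy the genome, flipping each bit independently with probability mutation_rate.
        return [1 - g if rng.random() < mutation_rate else g for g in genome]

    offspring = []
    while len(offspring) < size:
        # Clone and mutate two selected parents, then combine them with
        # one-point crossover (an assumed choice of crossover operator).
        a = mutated_clone(rng.choice(best))
        b = mutated_clone(rng.choice(best))
        cut = rng.randrange(1, len(a))
        offspring.append(a[:cut] + b[cut:])
    return offspring
```

Calling next_generation repeatedly, with sum as a toy fitness on bit strings, gives the iterative process described above; stopping criteria are left to the caller.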
This process tends to increase the performance of the whole population over the generations.
Given infinite time and a non-zero mutation rate, it would explore all the possible values for the
genome, so an optimal solution would eventually be found. It is more common, however, to stop the
evolution when a certain fitness threshold has been reached, meaning that the solution found is
good enough for the purpose at hand, or after a given number of iterations (generations).
Since the work of Holland (1975), many different parameters, genetic operators, fitness (test)
functions and selection criteria have been developed and tested, and GAs have been successfully
applied to a wide range of applications (Mitchell, 1998).
In this work, however, GAs are only considered as a stylized reproduction of the natural
evolutionary process for artificial organisms. To be able to study the effects of stress-induced
mutations onto artificial organisms, the standard GA has been modified by including a simplified
version of an epigenome that encodes the mutation rate. The next section presents this novel
approach.
1.3 Stress-driven Epigenetic Algorithm
Adaptivity in mutation rates in living organisms, as seen, has been widely studied by biologists.
The same does not apply to the field of ER.
There have been, in fact, studies about improvements in function optimization when using
an adaptive mutation scheme (Thierens, 2002; Chan et al., 2008). Still, most ER works use a
fixed mutation rate, usually empirically found to be close to an optimal value (with respect
to the experiment-specific conditions).
Problems with hard-coded, fixed mutation rates arise when facing dynamic environments.
Under changing conditions, there might be a different optimal mutation rate. An approach to
resolve this issue could work by modifying the mutation rate (and possibly other parameters)
in a way that could, in principle, keep up with the current optimal mutation rate (if one exists),
or at least achieve better results than a fixed mutation rate approach.
To achieve this goal, the previous work that introduced the Stress-driven EpiGenetic
Algorithm (StrEGA) (Ticconi, 2011) explored the use of the inverse of ranking
(with function optimization in mind). Each individual was associated not only
with a genome, as in classical GAs, but also with an epigenome consisting of a single
epigene. This encoded a number in the range [0, 1], representing the probability for each of the
individual's genes to be mutated after duplication. After evaluation, a stress indicator was used
to update the epigenome, either increasing or decreasing its value.
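As a minimal sketch of this data structure, an individual can be represented as a genome paired with a single epigene that sets the per-gene mutation probability at reproduction. The class and function names are hypothetical, and the mutation operator (replacing a gene with a fresh uniform value in [0, 1)) is an assumption made for the example, since the operator is not specified here.

```python
import random
from dataclasses import dataclass

@dataclass
class Individual:
    genome: list     # real-valued genes, assumed here to lie in [0, 1)
    epigene: float   # probability in [0, 1] that each gene mutates after duplication

def reproduce(parent, rng=random):
    """Duplicate the parent's genome, mutating each gene with probability parent.epigene."""
    child_genome = [
        # Assumed mutation operator: redraw the gene uniformly in [0, 1).
        rng.random() if rng.random() < parent.epigene else gene
        for gene in parent.genome
    ]
    # The epigene is copied to the child, as in the earlier StrEGA work
    # where the epigenome was inheritable.
    return Individual(child_genome, parent.epigene)
```

With epigene equal to 0 the child is an exact copy of the parent, while with epigene equal to 1 every gene is redrawn, which shows how a single number spans the range from pure cloning to random search.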
The main difference, compared to the cited studies on adaptive mutation rates in GAs, is
that each individual has its own epigenome, while normally a single mutation rate is made
global for the whole population. There are however two main points that make this previously
explored approach not perfectly applicable to this work.
Firstly, the epigenome was inheritable. Apart from the fact that epigenetic inheritance is
still a debated issue, in this work it was not strictly necessary.
Secondly, for an optimization task there is normally no lifetime. The genome of the
individual is also its phenotype (the x coordinates of the point passed to the function), and it
directly produces a fitness value (the y calculated by the function). The notion of stress was thus
an abstract idea: the inverse of ranking. In a truncation selection scheme, where only a fraction
of the best individuals is allowed to reproduce, we can consider the best individual the least
stressed one and, vice versa, the worst individual the most stressed one. Therefore, the inverse of
ranking served well to decide the amount of epigenome modification for each individual at each
generation.
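The inverse of ranking can be illustrated with a short sketch. The normalisation of the stress to [0, 1] and, in particular, the linear update rule with a fixed step are assumptions made for this example; the exact rule used in (Ticconi, 2011) is not reproduced here, and both function names are hypothetical.

```python
def rank_stress(fitnesses):
    """Stress per individual from its rank: 0.0 for the best, 1.0 for the worst."""
    n = len(fitnesses)
    if n == 1:
        return [0.0]
    # Indices ordered from highest to lowest fitness.
    order = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
    stress = [0.0] * n
    for rank, i in enumerate(order):
        stress[i] = rank / (n - 1)
    return stress

def update_epigene(epigene, stress, step=0.05):
    """Assumed rule: raise the epigene when stress > 0.5, lower it otherwise."""
    delta = step * (2.0 * stress - 1.0)
    # Clamp so that the epigene remains a valid probability.
    return min(1.0, max(0.0, epigene + delta))
```

For example, with fitness values 3.0, 1.0 and 2.0 the stresses are 0.0, 1.0 and 0.5: the worst individual's mutation probability is pushed up, the best one's is pushed down, and the median one is left unchanged.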
The stress encoding used in the original work (Ticconi, 2011) led to better results than
the normal GA, over a variety of fixed mutation rates, but in other conditions it may not be
the best option. The main practical problem with the idea of a stress-driven GA, therefore, is
that it increases the number of parameters an experimenter has to think about.
In this context, deciding what stress is can be as difficult as deciding the fitness function. A
complete coupling of fitness and stress might not always be feasible (when the fitness is unbounded,
for example) or appropriate. Using the inverse of ranking may, however, be considered a
simple alternative to environment-specific stress functions.
Despite this weakness, an objective advantage of an epigenetic-based approach remains: it
does not enlarge the genetic search space, as encoding the mutation rate in the genome would. It
also serves well the purpose of studying the adaptivity of the mutation rate in isolation,
excluding it from the evolutionary process 6.
The next chapter, after an overview of the simulator chosen for this work, gives a
more detailed description of how the StrEGA has been adapted to the current task.
6 As seen before, in biological organisms this is the main problem for researchers trying to prove
that there is an adaptive mechanism, since opponents always remark that such a mechanism could
well be a product of classic Darwinian evolution.
Chapter 2
Methods
In this work, an existing software package, Evorobot* 1, was used and extended instead
of building everything from scratch. There are various reasons for this.
• Evorobot* is free, multi-platform and open source, and it has been used, modified and
extended over the last 15 years by some of the researchers who started the ER field
• It is written in C++, thus speeding up the execution of the experiments, and it can be
run both with and without a GUI 2, making it suitable for running on a computer cluster
• Most of all, the Khepera simulator was finely tuned, at the beginning of its development,
after weeks of experiments using real robots, by sampling their sensor readings in
all the conditions supported by the simulator itself and saving them in a file distributed
with the package: it is therefore supposed to be more accurate than a purely mathematical
simulator
Because of the last point, it is normally easy to transfer the simulated controller
to a real environment without a great loss in fitness. The trade-off is of course a
limitation on the kind of environment that can be defined, which has been considered here a
minor problem since Evorobot* already supports:
• light objects
• 3D cylindrical objects with configurable sizes, heights and colours
• walls with configurable heights and colours
• ground areas with configurable radius and colours
• Khepera robots with built-in and configurable sensors
which have proven sufficient to set up the experiments needed for this work.
1 Developed at LARAL, a laboratory of the Italian National Research Council, mainly by Stefano
Nolfi and Onofrio Gigliotta; it is a complete rewrite of the original Evorobot developed by Nolfi
and Dario Floreano. Website: http://laral.istc.cnr.it/evorobotstar
2 The GUI has been developed using the Qt framework, version 4, which is also free, open source
and performant.
The program is specifically intended for trial-based, multiple-seed evolutionary runs. Each
experiment has to be assigned a directory containing a configuration file, evorobot.cf, an optional
world file, evorobot.env, and an optional network file, evorobot.net.
When Evorobot* is run from inside an experiment directory, it reads the files and creates
appropriate internal structures, initializing the robot environment module, the evolutionary
module and the controller module. Most importantly, the number of repetitions, of generations,
of individuals, of trials and of life cycles has to be specified for each experiment.
After that, the evolution is started by first looking at the number of repetitions of the
experiment that have to be performed. For each repetition, an incremental seed is passed to the
random number generator (the first seed can be specified in the configuration file; the others
will be that seed plus the index of the current repetition). This allows exact replication of the
experiments and helps to give them statistical robustness.
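The seeding scheme can be sketched as follows. This is a Python illustration of the idea rather than Evorobot* code: run_repetitions and run_once are hypothetical names, and it is assumed that the first repetition has index 0 and therefore uses the configured seed unchanged.

```python
import random

def run_repetitions(first_seed, n_repetitions, run_once):
    """Run an experiment several times, each with its own reproducible seed."""
    results = []
    for index in range(n_repetitions):
        # Seed of the current repetition: first seed plus repetition index.
        rng = random.Random(first_seed + index)
        results.append(run_once(rng))
    return results
```

Because every repetition owns a generator seeded deterministically, running the whole experiment twice with the same first seed reproduces each repetition exactly, while different repetitions still explore different random histories.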
In each repetition, the whole environment is randomly re-initialized and an evolutionary
process is started. The evolution lasts for the number of generations specified: in each of them,
the population is tested individual by individual and the fitness values are recorded, then a
fraction of the best individuals is allowed to reproduce while the others are killed. The children
(offspring) of the survivors form a new population of the same size as the previous one, and the
current generation ends.
When an individual has to be tested, three experiment-specific functions (among others)
are used 3:
• initialize_world()
• initialize_robot_position()
• ffitness()
3 Refer to the file robot-env.cpp for details of these functions. The source code shows that these
three functions are in fact function pointers, linked to the appropriate experiment-specific
functions through the name of the fitness function passed in the configuration file: instead of
converting them to an object-oriented style, the C-style approach traditionally used by the LARAL
researchers was maintained. A new, fully object-oriented version of Evorobot* was not yet
available at the time of this work.
The experiment-specific versions of these functions are the three main pieces of code a
researcher has to write to set up a simple experiment, unless he or she needs to modify the GA
or use a non-supported network controller.
The first two functions are called whenever a new trial for an individual begins. If they are not
overridden, the default functions are used: the first reads evorobot.env and randomly
initializes the environment, while the second sets the robot position to random coordinates. The
third is usually called at the end of the trial, but a parameter can be set to call it after every
time step of the trial.
In the next subsections some aspects of the simulator will be explored in greater
detail, with a focus on the modifications made for this work.
Firstly, we will look into the configuration files and into how the environment
where the robots are tested is organized. Then, there will be an overview of the robot structure, its
sensors and the architecture of the network that controls it.
Lastly, we will present the modifications made to the classic GA to support stress-induced
mutation rate variations.
2.1 Simulated Environment
The world is a square arena of 1000x1000 pixels, while the robot occupies a circle with an
approximate radius of 27 pixels. The simulator makes the world toroidal by default, but in the
current experiments walls (coloured in black) have been used to limit the walkable surface
and stimulate the evolution of an obstacle avoidance behaviour (if a robot hits a wall, it dies
and its fitness will be very low).
Depending on the experiment, there can be one, two or three ground zones (flat objects of
different colours and radius 100 pixels), and ten cylindrical objects, five with 27 pixels radius
and five with 12.5 pixels radius. The configuration file for the environment, evorobot.env, makes
it easy to specify which objects have to be loaded in memory for an experiment by listing them
on different lines, for example:
swall x0 y0 X1000 Y 0 h1.0 c0.0 c0.0 c0.0
specifies an object of type wall, which is in fact a line 1 pixel wide starting at coordinates
(0, 0) and ending at coordinates (1000, 0) 4, with a height of 1 pixel and RGB set to (0, 0, 0),
which is black 5. The other objects are specified similarly, but with a single pair of coordinates
for the barycentre instead of two pairs.
4 As usual in computer simulations, (0, 0) is the top left point of the arena.
All the objects are loaded in memory at the beginning and can later be manipulated
further. This is what initialize world() is usually for. In this work, the ground areas are
left in their fixed positions, as are the walls. In the experiments where cylinders are used, these are
randomly repositioned at each initialize world() call (at the beginning of each trial), as is the robot
in every experiment (making sure its starting position does not overlap an object, which would
cause a premature death).
How the loaded world is going to be initialized, however, depends on the parameters in
the main configuration file, evorobot.cf. In some of the old experiment-specific functions in
Evorobot*, for example, whether or not to randomize the cylindrical objects is decided by the
parameter random round.
The next section gives an overview of some of the parameters used in the experiments.
2.2 Robot Architecture
The Khepera mobile robot was developed in the 90s by a team at EPFL (Mondada et al.,
1999), just after the start of the evolutionary robotics field. It is a differential wheeled, round-
shaped robot of about 5.5 cm diameter, able to run at a maximum speed of 1 m/s.
The simulator supports light, ground and infrared (proximity) sensors, as well as some
optional functionality of the physical robot like a gripper, to grab objects, and a linear camera
(with a view angle from 0 to 360 degrees).
Proximity and light sensors are placed all around the robot, the ground sensor (which detects
the colour of one single pixel) is placed at the back, facing the ground, and the camera on the top.
In this work, only the infrared and ground sensors, and in some experiments the camera, are
used. When the camera is used, the visual field is set to 36 degrees and divided into three vertical
subfields, each of whose averaged values is copied into the respective visual input. If the camera is
used, the values of the eight infrared sensors are averaged two by two and copied into the respective
inputs; otherwise each of them has a dedicated input in the controller, which is described in the
next subsection.
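The sensor preprocessing described above can be sketched as follows. The sketch assumes that "two by two" means adjacent pairs of sensors and that the camera returns a flat list of pixel intensities whose length is a multiple of three; both the function names and these details are illustrative assumptions, not taken from the Evorobot* source.

```python
def average_pairs(readings):
    """Average adjacent pairs of readings, e.g. 8 infrared values -> 4 inputs."""
    if len(readings) % 2 != 0:
        raise ValueError("an even number of readings is required")
    return [(readings[k] + readings[k + 1]) / 2.0
            for k in range(0, len(readings), 2)]


def average_subfields(pixels, subfields=3):
    """Split the visual field into equal subfields and average each of them."""
    if not pixels or len(pixels) % subfields != 0:
        raise ValueError("pixel count must be a positive multiple of subfields")
    width = len(pixels) // subfields
    return [sum(pixels[s * width:(s + 1) * width]) / width
            for s in range(subfields)]


if __name__ == "__main__":
    print(average_pairs([0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.5, 1.0]))
    print(average_subfields([0.0, 0.0, 3.0, 3.0, 6.0, 6.0]))
```

With the camera active, the controller would therefore receive four infrared inputs and three visual inputs instead of eight infrared inputs.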
5In the configuration the range of values for colours is [0, 1] for each of the RGB components, which is then
scaled up into the classic [0, 255] to be displayed with QT.
2.2.1 Neural network architecture
The simulator supports various kinds of neural network, but, as said, the CTRNN has been
chosen. The architecture, as can be seen in Figure 2.1, is composed of three layers:
1. an input layer, where the sensor readings are copied, in addition to the proprioceptive
sensors
2. a hidden layer composed of 6 recurrent leaky neurons, fully connected with the input
layer and with the output layer
3. an output layer, which has only two neurons: their values are used directly by the simulator
to calculate the speed of the wheels
Figure 2.1: The neural network which controls the robot. At the bottom there is the input layer, where the
readings from the sensors (external or proprioceptors) are copied. In the middle there are six recurrent leaky neurons that are fully
connected with the two motor neurons, which in turn control the speed of the wheels.
The proprioceptive sensors in the input layer are updated in the fitness function, and serve
as a simplified internal feedback for the robot. This allows it to make decisions according not
only to the external sensory inputs, but also to the internal ones: if the robot is starving, for example,
the value of the energy proprioceptor will be very low. A more detailed description of the proprioceptors
used will be given in the next chapter.
As said before, a CTRNN has been used in this work, where each node is completely
described by the following equation:
$$\tau_i \dot{y}_i = -y_i + \sum_{j=1}^{N} w_{ji}\,\sigma(g_j(y_j + \theta_j)) + I_i$$
This equation is integrated using the Euler method, so that at each time step the activations of
the neurons are updated using this rule:
$$y_i = y_i + \frac{h}{\tau_i}\left(-y_i + \sum_{j=1}^{N} w_{ji}\,\sigma(g_j(y_j + \theta_j)) + I_i\right)$$
$\tau_i$, $\theta_j$ and $w_{ji}$ are all evolved by the genetic algorithm. The output values of the two neurons
in the output layer are used to update the speed of the motors (update motors), which in turn
is used to move the robot (move robot). In one time step, the maximum movement in one
direction is 20 pixels.
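A minimal sketch of the Euler update can make the rule concrete. It assumes the usual CTRNN reading of the symbols: y is the neuron state, tau the time constant, theta the bias, g the gain, w[j][i] the weight of the connection from neuron j to neuron i, I the external input, h the integration step, and a logistic function as the activation. The function names are hypothetical and the code is not taken from Evorobot*.

```python
import math


def sigmoid(x):
    """Logistic activation, assumed here as the neuron output function."""
    return 1.0 / (1.0 + math.exp(-x))


def ctrnn_step(y, inputs, weights, biases, taus, gains, h):
    """Advance all neuron states by one Euler step of size h."""
    n = len(y)
    # Output of every neuron j, computed from the states before the update.
    outputs = [sigmoid(gains[j] * (y[j] + biases[j])) for j in range(n)]
    new_y = []
    for i in range(n):
        # weights[j][i] is the connection from neuron j to neuron i.
        incoming = sum(weights[j][i] * outputs[j] for j in range(n))
        new_y.append(y[i] + (h / taus[i]) * (-y[i] + incoming + inputs[i]))
    return new_y


if __name__ == "__main__":
    # One unconnected neuron driven by a constant input relaxes toward that input.
    state = [0.0]
    for _ in range(200):
        state = ctrnn_step(state, inputs=[1.0], weights=[[0.0]], biases=[0.0],
                           taus=[1.0], gains=[1.0], h=0.1)
    print(state)
```

All outputs are computed before any state is changed, so the update is synchronous, which is what a plain Euler step of the equation requires.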
The controller cannot modify itself during the lifetime of the robot (there is no plasticity).
The weights can only be modified by mutation when the best individuals
reproduce, as is presented in the next section.
2.3 StrEGA implementation
To support the variability of mutation rates in the StrEGA model, as said in Section 1.3, a
definition of stress has to be produced according to the particular environments and agents
being used.
As will be seen in the next chapter, the individuals in this work are rewarded for staying
not only alive, but also well: thus, starvation is considered a situation of high stress. In any
case, however, after the evaluation the stress will be a value in the range [0, 1], where 0 is no
stress at all and 1 is maximum stress.
The calculated stress is then used in the mutate epi() function, a modification of the
mutate() function in Evorobot*. It takes a gene as an argument, and returns it, possibly
mutated.
Since Evorobot* uses, as a genotype, a string of integers in the range [0, 255], the gene passed
to the mutating function is an integer value 6.
Both the normal and the epigenetic mutating functions create a binary representation of
the gene to be mutated, make a random check against the mutation rate for each of its 8 bits 7
and, if the check is positive, flip the bit.
6 Each gene is scaled into the range [-5, 5] during the genotype to phenotype mapping, where the genome is
translated into the actual weights of the neural network. In this work, the same structure has been maintained:
a few tests have been performed with greater resolutions, i.e. gene range [0, 1024], without a difference in fitness
values.
7 The mutation rate is here, therefore, the probability for each of the 8 bits of a gene to be mutated.
The stress is used in mutate epi() to modify the mutation rate such that, x being the base
mutation rate specified in the configuration file, the new mutation rate lies in the range
[0.001, 2x], where high stress increases it and low stress decreases it. The check is then applied
with the new mutation rate to each of the 8 bits.
To check whether the only useful component of the EGA was just the possibility of increasing
the base mutation rate, some of the experiments had the parameter add stress activated. In this case, if the
base mutation rate is x, the new value after applying the stress is in the range [x, 2x]. For
the rest, the procedure remains the same as before.
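The following sketch illustrates a stress-scaled bitwise mutation of an 8-bit gene. The text only gives the ranges of the resulting mutation rate, not the exact mapping, so the piecewise linear mapping used here (stress 0.5 leaves the base rate unchanged) is purely an assumption, as are the function names; it is not the actual mutate epi() code.

```python
import random

MIN_RATE = 0.001  # lower bound of the adaptive mutation rate, as stated in the text


def stressed_rate(base_rate, stress, add_stress=False):
    """Map a stress in [0, 1] to a mutation rate (assumed piecewise linear)."""
    if not 0.0 <= stress <= 1.0:
        raise ValueError("stress must be in [0, 1]")
    if add_stress:
        # Stress can only raise the rate: range [x, 2x].
        return base_rate * (1.0 + stress)
    if stress >= 0.5:
        # From x at stress 0.5 up to 2x at stress 1.
        return base_rate * (2.0 * stress)
    # From MIN_RATE at stress 0 up to x at stress 0.5.
    return MIN_RATE + (base_rate - MIN_RATE) * (2.0 * stress)


def mutate_gene(gene, rate, rng):
    """Flip each of the 8 bits of an integer gene with probability `rate`."""
    for bit in range(8):
        if rng.random() < rate:
            gene ^= 1 << bit
    return gene


if __name__ == "__main__":
    rng = random.Random(0)
    rate = stressed_rate(base_rate=0.01, stress=0.9)
    print(rate, mutate_gene(170, rate, rng))
```

Because each bit is checked independently, a gene always stays within [0, 255] regardless of the rate.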
2.3.1 Dynamic brood size
A further exploration has been that of using the stress to influence not only the mutation rate,
but also the number of offspring an individual can produce.
As said, only a certain fraction of the best individuals (determined by the nreproducing
parameter) survives each generation, and the number of children of each father is normally
fixed to a value specified by the offspring parameter. The size of the final population is then
the number of fathers multiplied by the offspring per father.
When the parameter dyn brood is activated, reproduction follows this process instead:
1. after selecting the nreproducing best individuals (fathers from now on), order them by
stress value
2. divide the fathers into (offspring-1) bins so that the first bin contains the least stressed
individuals, and the last bin the most stressed ones
3. make the fathers in the first bin produce 2 children, the ones in the second bin 4 children
and so on. If i is the index of the bin, from 1 to (offspring-1), the number of children
each father in bin i can produce is equal to 2i
We wanted to keep the total population the same size as with the normal reproduction
system, that is nreproducing times offspring, without changing the
selection criterion (thus, the individuals are first selected by fitness as usual, then given different
reproductive capabilities according to their stress).
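The binning procedure can be sketched in a few lines, which also makes it easy to check the total population size on small examples. The function name is hypothetical, and the sketch assumes, as the text does, that the number of fathers is divisible by (offspring-1).

```python
def brood_sizes(stresses, offspring):
    """Return the number of children of each father, given the fathers' stress."""
    bins = offspring - 1
    n = len(stresses)
    if bins < 1 or n == 0 or n % bins != 0:
        raise ValueError("number of fathers must be divisible by (offspring - 1)")
    per_bin = n // bins
    # Indices of the fathers ordered from least to most stressed.
    order = sorted(range(n), key=lambda k: stresses[k])
    sizes = [0] * n
    for rank, father in enumerate(order):
        bin_index = rank // per_bin + 1   # bins are numbered from 1
        sizes[father] = 2 * bin_index     # bin i produces 2i children
    return sizes


if __name__ == "__main__":
    sizes = brood_sizes([0.9, 0.1, 0.5, 0.3], offspring=3)
    print(sizes, sum(sizes))
```

With n fathers, m = offspring and b = m - 1 bins of n / b fathers each, the total is (n / b)(2 + 4 + ... + 2b) = (n / b) b (b + 1) = n m, the same as with a fixed brood size.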
What follows is a proof that the proposed system yields the wanted result,
assuming that the number of fathers is divisible by (offspring-1).
Let n be the number of reproducing organisms, m the number of children per reproducing
individual and P the size of the whole population, calculated as P = n · m. Then
the number of bins, as defined before, is b = m - 1, and the number of reproducing
individuals in each bin is f = n / b. What we need to find is a succession of b values,
2
Chapter 3
Experimental setup
The experiments have been performed in an incremental way, starting from a simple environment
and slightly increasing its complexity.
The aim of the experiments is to test whether adapting the mutation rate to the environmental
conditions could help a population to survive better in changing environments, and whether it
led in general to an increased overall fitness (excluding possible fluctuations).
In each environment there are two sources of food: a good one, and a poisonous one.
Two sinusoidal functions determine how beneficial or deleterious a particular source is at each
generation. They can be seen in Figure 3.1.
Clearly, the worst moment for the population is after the cross: they can still survive by
eating what was, a few generations before, the only food source, even though the amount of
food they get at each time step is far less than before; at the same time, there is not yet a strong
enough pressure to make them change source. When the source they are eating from becomes
poisonous, the whole population experiences a big drop in performance, and after that they start
to be selected for choosing the other source.
The stress, just after the switch, should start to rise, thus increasing the mutation rate and
the exploration of the genetic space. Since the individuals will eventually start again to lose their
stress and increase their fitness, the population should converge again until the next switch.
The environment becomes increasingly difficult because the switches become more frequent,
i.e. at a certain point there will not be enough time to actually find an individual able to search
for the other source, with or without adaptive mutation rates.
In the next subsections the experiments are presented in detail. Unless otherwise specified,
each experiment has been completely repeated 10 times with a different seed, and each
individual has been re-evaluated 10 times with a reloaded environment. Each evaluation
(trial) lasts 5000 time steps.
(a) Easy
(b) Medium
(c) Difficult
Figure 3.1: The three sinusoidals used to calculate the amount of food (red) and poison (green). When food becomes
negative it has become poison, and vice versa. The food is calculated as $\mathrm{food}(x) = a \sin(0.00015\,x^2 + 1.5) + b$, while the
poison is calculated as $\mathrm{poison}(x) = a \sin(0.00015\,x^2 - 1.5) + b$. Figure 3.1a uses a = 0.5, b = 0.5, Figure 3.1b uses a = 0.7, b = 0.3,
and Figure 3.1c uses a = 0.9, b = 0.1.
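The two functions of Figure 3.1 can be evaluated with a short sketch. It assumes that x is the generation number and that the phase of the poison function is -1.5 (the sign is not legible in the caption, but it is the choice that makes food high and poison low at generation 0); the names are illustrative.

```python
import math

# Amplitude a and offset b for the three difficulty levels of Figure 3.1.
LEVELS = {"easy": (0.5, 0.5), "medium": (0.7, 0.3), "difficult": (0.9, 0.1)}


def food(generation, a, b):
    """Amount of food given by the initially good source at a generation."""
    return a * math.sin(0.00015 * generation ** 2 + 1.5) + b


def poison(generation, a, b):
    """Amount of food given by the initially poisonous source at a generation."""
    return a * math.sin(0.00015 * generation ** 2 - 1.5) + b


if __name__ == "__main__":
    a, b = LEVELS["medium"]
    for generation in (0, 50, 100, 150, 200):
        print(generation, round(food(generation, a, b), 3),
              round(poison(generation, a, b), 3))
```

Since the argument of the sine grows with the square of the generation, the switches between the two sources become more frequent as evolution proceeds, which is what makes the environment progressively harder.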
3.1 Simple sources switch
This is the simplest environment. Apart from the walls of the arena, there are two ground areas
of different colours, one near the east wall and the other near the west wall, as in Figure 3.2. The
west wall has a section coloured white, instead of black, which enables a discrimination
task: at generation 0, the white on the wall means food, while at the opposite side there is
poison.
Figure 3.2: The simple environment as displayed in Evorobot*. The white section can be seen on the west wall. At
generation 0, the light blue area is food and the dark blue is poison.
The robots are equipped with 4 infrared sensors, a camera, a ground sensor and an energy
proprioceptor. The energy is their food. When they go over a ground area, the value
corresponding to the current generation is drawn from the respective sinusoidal: it can either
increase or decrease their energy, which however cannot be greater than 1 or less than 0.
The robot loses a fixed amount of energy at each time step, and an additional amount
proportional to its speed.
The fitness corresponds to the sum of the energy levels at each time step, divided by the
number of time steps. This value is added to the fitness values of the other trials of
the same individual and then averaged: the final value is the actual fitness used to decide
whether the robot can reproduce or has to be discarded.
The stress corresponds to the number of time steps where the energy level was under 0.5.
The optimal strategy is clearly to use the white wall as a discriminant and, according to the
generation, to go straight toward east or west.
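The energy bookkeeping, fitness and stress of a single trial can be sketched as follows. The cost constants are placeholders, since the actual values are not given in the text, and dividing the stress count by the number of time steps is an assumption made here to obtain a value in [0, 1], as required in Section 2.3; all names are hypothetical.

```python
def step_energy(energy, intake, speed, base_cost=0.001, speed_cost=0.001):
    """Update the energy for one time step and clamp it to [0, 1].

    base_cost and speed_cost are placeholder values, not the ones used in the thesis.
    """
    energy += intake - base_cost - speed_cost * abs(speed)
    return min(1.0, max(0.0, energy))


def evaluate_trial(energy_trace):
    """Return (fitness, stress) of one trial from its per-step energy levels."""
    steps = len(energy_trace)
    if steps == 0:
        raise ValueError("the trial must contain at least one time step")
    fitness = sum(energy_trace) / steps
    # Fraction of the time steps in which the energy was under 0.5 (normalised).
    stress = sum(1 for energy in energy_trace if energy < 0.5) / steps
    return fitness, stress


if __name__ == "__main__":
    print(evaluate_trial([1.0, 0.4, 0.6, 0.2]))
```

The fitness of an individual would then be the average of the trial fitness values over all its trials.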
3.2 Simple sources switch with water
In this case, to make the task more difficult (and thus solvable in more ways), a water source
(again a ground area, with a different colour) has been added in the middle of the environment,
as in Figure 3.3.
Figure 3.3: The simple environment with water as displayed in Evorobot*. The white section can be seen on the west wall.
At generation 0, the light blue area is food and the dark blue is poison. The area in the middle is always water.
Compared to the previous setting, there are two more proprioceptors, for the water and protein
levels.
Energy is increased only if both the water and the protein levels are above 0.5. Proteins
are gained in the same way energy was gained in the previous setting (thus using the sinusoidal functions),
while the water is completely refilled when the robot passes over the water source: it does not change with the
generations.
This obliges the robot to move continuously between the current food area and the middle
of the environment, making it lose a bit of energy in the process.
The stress is increased only in the time steps where the energy is below 0.5, using this formula:
$$\mathrm{stress}_{i+1} = \mathrm{stress}_i + 0.5\,(1 - \mathrm{water\ level}) + 0.5\,(1 - \mathrm{protein\ level})$$
As can be seen, there is a slight decoupling between fitness (which is calculated in the same
way as before) and stress: an individual might have had a low fitness because it was able
to go over neither the water source nor the food source, thus its stress will be very high; or it might
have had a low fitness but been good at going over the water, only failing to find the food
source, thus its stress will be not so high.
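The decoupling between fitness and stress can be illustrated with a small sketch of the stress accumulation. Dividing the accumulated value by the number of time steps is an assumption made here to keep the result in [0, 1]; the function name and the example traces are hypothetical.

```python
def accumulate_stress(trace):
    """Accumulate stress over a trial given (energy, water, protein) per step."""
    if not trace:
        raise ValueError("the trial must contain at least one time step")
    stress = 0.0
    for energy, water, protein in trace:
        if energy < 0.5:
            # Only low-energy steps contribute, weighted by the two deficits.
            stress += 0.5 * (1.0 - water) + 0.5 * (1.0 - protein)
    return stress / len(trace)  # normalisation to [0, 1] (assumption)


if __name__ == "__main__":
    lost = [(0.2, 0.0, 0.0)] * 4         # found neither water nor food
    only_hungry = [(0.2, 1.0, 0.0)] * 4  # reached the water, missed the food
    print(accumulate_stress(lost), accumulate_stress(only_hungry))
```

Both example individuals have the same low energy, and hence the same fitness, but the one that at least reached the water accumulates half the stress.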
If a random controller happens to be selected just after the switch (only because the
others also have a very low fitness in that critical situation), its mutation rate would be higher
than that of a good robot that is perfectly able to oscillate between water and one side area, but does
so on the wrong one. This should allow the children of the random walker to be far away from
their parent, and possibly better at exploring the environment.
The optimal behaviour is still to reach the correct food area, and then to go back and forth
between it and the water source.
3.3 Complex sources switch
In this setting, the task is similar to the previous one: eat proteins and drink water to generate
energy, whose summed up amount, averaged over the trials, constitutes the fitness. Also the
stress is calculated in the same way, and the water area is still exactly in the middle of the
environment.
Figure 3.4: The complex environment as displayed in Evorobot*. The big cylinders are randomly spread over the
environment, while the small cylinders are grouped (their barycentre, however, can be in any of the four corners of
the environment, chosen at random).
The robots are equipped, as before, with 4 infrared sensors, the camera, the ground sensor
and the proprioceptors, but many of the experiments have also been replicated with a simplified
structure, without the camera but with 8 infrared sensors. This has been done to see how the
system performed with a different discrimination task.
The main difference with respect to the previous setting is that the food and poison sources are
not ground areas, but cylindrical objects of different sizes: at generation 0, the small cylinders
are proteins and the big cylinders are poison. They are of different colours to be discriminated
more easily by the camera, and the white zone on the wall is of course absent in this experiment.
The robot has to go near them, discriminate whether they are big or small and eat them
by touching them.
In each time step there is a small probability that eaten cylinders, either big or small, will
grow back. In addition, big cylinders are always randomly spread over the environment, while
small cylinders are always grouped together, although their barycentre is changed randomly for
each trial.
The optimal strategy, then, is different in the two cases. When the small cylinders are
proteins, the best approach would be to look around, avoiding poison, for the area where they
are grouped and stay there. When the big cylinders are proteins the agents have to explore the
environment continuously, since they are widespread, but also need to avoid the cluster of small
cylinders when they encounter it.
Chapter 4
Results
In the previous chapter, the experiments have been divided into three main groups. Each group
of experiments has then been explored with different parameters. In the following there is a
section for each group, presenting (unless otherwise specified):
1. the performance of the agents in a static environment, where each source gives either food
(saturates completely) or poison (depletes internal food state completely)
2. what happens if the pre-evolved agents are taken from the last generation of the previous
experiment and re-evolved with completely switched sources
3. a complete evolution where the food and poison sources switch smoothly, following the
previously presented sinusoidal functions
each in a different subsection. Additional plots can be seen in Appendix A.
4.1 Simple sources
See Figure 3.2 for details of the environment.
4.1.1 Static environment
In this case, the parameter kill on energy has always been used, considering the easiness for an
agent to reach an optimal behaviour: in most cases, twenty to thirty generations were needed
to have a best fitness near to 1.
A population has been evolved for 100 generations in this fixed environment: the west source
gives food (energy level set to 1 for each time step the robot is on the area), the east source gives
poison (energy level set to 0 as soon as the robot enters it, and because of the kill on energy
parameter, the robot dies instantly).
Results can be seen in Figure A.1. With all three mutation rates the best values reached
the maximum quickly. The populations have been evolved without epigenetics, since they served just as
a base for the switch: the population of the last generation of each of these groups has then been
put into the environment with switched sources, repeated three times (with the same starting
population) to test epigenetics and dynamic brood size.
4.1.2 Switch
As can be seen in Figure 4.1, when the evolved population is put into the switched environment
there is a drastic performance drop, but in about 20 generations the fitness rises again above
0.8. The base mutation rate is 0.01 here for all systems, meaning that the epigenetic one could go
up to 0.02.
All three are able to recover fairly quickly, with the best value being, at the last
generation, almost the same for all of them. We can see, instead, that using epigenetics (especially
without dynamic brood size) leads to better performance for the average population.
(a) Best fitness (b) Mean fitness
Figure 4.1: Plot of the evolutionary performance after the last generation of the static environment (evolved with
mutation rate 0.01) has been put in a completely switched environment (what was food is poison and vice versa). The red
line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic with
dynamic brood size. Figure 4.1a shows the best fitness for each generation, while Figure 4.1b shows
the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications,
for each of the three groups.
Since this could have been explained by the increased mutation rate, two more replications
have been made: one with base mutation rate 0.02, and the other with mutation rate 0.04 (see
A.1.2 for details).
With 0.02 the situation is very similar: the plot of the best individuals does not present
relevant differences between epigenetics and normal, even though the average seems to yield a
worse result for the normal approach. When increasing again to 0.04, instead, for the normal
approach there is the first subtle decrease in the best plot, even though it can be considered
negligible, and a huge decrease in the average plot, meaning that although the best individuals
of the last generation were just a bit worse than with epigenetics, the whole population was
much worse.
This means that the epigenetic approach can actually use an increased mutation rate without
a decrease in overall performance (because the stress goes down while the fitness goes up, so
eventually the mutation rate gets down-regulated), while the normal approach starts losing
performance.
4.1.3 Periodic change
In the third environment, each of the two ground areas can become either food or poison over time,
according to two sinusoidals. The base mutation rate has been set to 0.01.
In Figure 4.2 we can see the performance while using the medium sinusoidal (see Figure
3.1b), without the parameter kill on energy. Five systems are compared now: normal,
epigenetics, dynamic brood size, and two more: epigenetics with the add stress parameter
activated, and the same with dynamic brood size as well.
(a) Best fitness (b) Mean fitness
Figure 4.2: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with
add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01, the
medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure 4.2a there is the plot of the
best fitness for each generation, while in Figure 4.2b there is the plot of the mean fitness for each generation. In both
cases, the values are averaged over 10 experiment replications, for each of the five groups.
The best performances, as before, are very similar, while plain epigenetics without
other modifications again outperforms the others on the average fitness of the population. A
variation of the parameters 1 also yields the same results (with a slight loss on the best plot for
epigenetics, which still outperforms the others on the average plot).
1 Adding the kill on energy parameter (Figure A.4), or using the hard sinusoidal (Figure A.5), with or
without kill on energy (Figure A.6)
4.2 Simple sources with water
See Figure 3.3 for details of the environment.
4.2.1 Static environment
By obliging the agents to visit both the food area and the water to get a good fitness, the range
of evolvable behaviours should be greater than before. As can be seen in Figure A.7, especially
when the mutation rate is increased, it is harder to reach a high fitness even with 500 generations. The
errors are far greater than before, meaning that over 10 runs, some were more fortunate than
others. The average fitness is heavily influenced by the mutation rate increase.
In the next subsection the last generation for each of the three mutation rates will be put
in a completely switched environment, as before.
4.2.2 Switch
Mutation rates 0.01 and 0.02 show a pattern similar to the one seen before. Epigenetics and normal
have comparable best fitness values, and epigenetics outperforms both dynamic brood size and
normal on the mean fitness plot. An increased reactivity to the new environment is seen, instead,
in the experiment with 0.04 mutation rate (Figure 4.3).
In this case, the fixed mutation rate approach and the dynamic brood size are not able to
go over 0.8 of fitness even at the last generation, and they show a smooth growth, contrary
to the epigenetic approach where the increase is huge and quick. The mean fitness plot, as in
the other experiments, leaves the fixed mutation approach as the worst performer among the
three, and epigenetics as the best.
4.2.3 Periodic change
This experiment, like the one in Section 4.1.3, evolves agents from the beginning in a changing
environment, using the same sinusoidals and also a base mutation rate of 0.01.
In Figure 4.4 we can see the performance while using the medium sinusoidal (Figure
3.1b) and the parameter kill on energy. The systems compared are as in Section 4.1.3:
normal, epigenetics, dynamic brood size, epigenetics with the add stress parameter activated,
and the same with dynamic brood size as well.
(a) Best fitness (b) Mean fitness
Figure 4.3: Plot of the evolutionary performance after the last generation of the static environment (evolved with
mutation rate 0.04) has been put in a completely switched environment (what was food is poison and vice versa). The red
line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic with
dynamic brood size. Figure 4.3a shows the best fitness for each generation, while Figure 4.3b shows
the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications,
for each of the three groups.
(a) Best fitness (b) Mean fitness
Figure 4.4: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with
add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01, the
medium sinusoidal (Figure 3.1b) is used and kill on energy is activated. In Figure 4.4a there is the plot of the best
fitness for each generation, while in Figure 4.4b there is the plot of the mean fitness for each generation. In both cases,
the values are averaged over 10 experiment replications, for each of the five groups.
There is not a clear difference in performance in the best plot, but after every change
epigenetics is again able to stay slightly above the others (except before the first switch, where it
is outperformed by add stress), while in the mean plot the best two are again epigenetics and dynamic brood
size. This also happens when using the hard sinusoidal (see Figure A.12 for the experiment
plots). In the other two experiments (medium sinusoidal with kill on energy (Figure A.10)
and hard sinusoidal without kill on energy (Figure A.11)) the fixed mutation rate has slightly
better performance in the best plot than the various epigenetic ones, but is still outperformed
in the mean plot.
4.3 Complex sources
This group of experiments has been, for the agents, the most difficult one. Figure 3.4 shows
how the environment is structured. The next two sections present, respectively, the results when
using both camera and infrared sensors, and when using only the infrared sensors.
Because of the difficulty of the environment, reflected by the low performance, only the two
lowest mutation rates (0.01 and 0.02) and the two easiest sinusoidals (easy and medium)
have been used.
4.3.1 Camera
Easy sinusoidal
The sinusoidal used here is easy because, at worst, the cylinders give little to no food, but
they are never really poisonous. Nevertheless, the task remains difficult and the performance
tends to be low. In the comparison, however, epigenetics is never outperformed by the fixed
mutation rate approach. When the base mutation rate is 0.01, all systems apart from add stress
follow a similar pattern. This also happens when the mutation rate is 0.02, but only if
kill on energy is not activated. When it is (see Figure 4.5), epigenetics outperforms all the
others.
(a) Best fitness (b) Mean fitness
Figure 4.5: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with
add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.02, the
easy sinusoidal (Figure 3.1a) is used and kill on energy is activated. In Figure 4.5a there is the plot of the best fitness
for each generation, while in Figure 4.5b there is the plot of the mean fitness for each generation. In both cases, the
values are averaged over 10 experiment replications, for each of the five groups.
Medium sinusoidal
The overall performance is very low: only in certain cases is the fitness value 0.8 reached.
The five tested systems (the same as before) all tend to have low performance. In one combination of
parameters (see Figure 4.6), however, the dynamic brood size outperforms all the others in
both the best and the mean plot.
(a) Best fitness (b) Mean fitness
Figure 4.6: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with
add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01, the
medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure 4.6a there is the plot of the
best fitness for each generation, while in Figure 4.6b there is the plot of the mean fitness for each generation. In both
cases, the values are averaged over 10 experiment replications, for each of the five groups.
4.3.2 Infrared
Easy sinusoidal
Figure 4.7 shows one of the experiments with infrared sensors and easy sinusoidal. All
the systems have low performance and are very similar, but epigenetics is slightly better,
both in the best plot and in the mean plot, where the improvement is more pronounced
and is also matched by the dynamic brood size system.
(a) Best fitness (b) Mean fitness
Figure 4.7: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with
add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.02, the
easy sinusoidal (Figure 3.1a) is used and kill on energy is not activated. In Figure 4.7a there is the plot of the best
fitness for each generation, while in Figure 4.7b there is the plot of the mean fitness for each generation. In both cases,
the values are averaged over 10 experiment replications, for each of the five groups.
Medium sinusoidal
Using the medium sinusoidal decreases the overall performance, leaving the relative performance
the same. Figure 4.8 shows the same configuration as the one used with the easy sinusoidal. The
performance of the epigenetics is still low, but slightly better than that of the others. For the
other configurations, the difference is negligible.
(a) Best fitness (b) Mean fitness
Figure 4.8: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with
add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.02, the
medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure 4.8a there is the plot of the
best fitness for each generation, while in Figure 4.8b there is the plot of the mean fitness for each generation. In both
cases, the values are averaged over 10 experiment replications, for each of the five groups.
Chapter 5
Conclusion
Starting from a debated issue in evolutionary biology, the adaptivity of mutations, we developed
an Evolutionary Robotics model that incorporates the idea of stress-driven modification of the
mutation rates through a structure parallel to the (artificial) genome: the epigenome.
Three different environments, of increasing difficulty, have been used to test the initial
hypothesis. The results are promising but do not show a general and consistent increase
in performance, even though in most of the experiments the epigenetics scored better,
meaning that it allowed the population to recover more quickly from a situation of high
stress, to reach a high fitness value more quickly, or both.
The add stress and add stress with dynamic brood configurations served to check
whether an increase in performance by the epigenetics could be due simply to the fact that the
mutation rate increases. This has been shown not to be the case, because neither a higher fixed
mutation rate nor the two add stress configurations were able to outperform the plain epigenetic
system in most experiments. This could mean that at certain moments the mutation rate needs to
be smaller, to allow a smoother evolutionary process (i.e., if the individuals are already good,
there is no need for extensive exploration of the genetic space). When the environment changes,
the mutation rate needs to be higher, so as to increase the probability that at least some of the
individuals will survive.
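The trade-off described above can be illustrated with a minimal sketch. It is not the model used in the experiments: the linear mapping from stress to mutation rate, the function names and the `max_factor` parameter are assumptions introduced here for illustration only, and the genome is assumed to be a list of bits.

```python
import random


def adaptive_mutation_rate(base_rate, stress, max_factor=4.0):
    """Return a mutation rate that grows linearly with stress.

    Hypothetical mapping: stress is clamped to [0, 1]; a stress of 0 yields
    the base rate, a stress of 1 yields base_rate * max_factor.
    """
    stress = min(max(stress, 0.0), 1.0)
    return base_rate * (1.0 + (max_factor - 1.0) * stress)


def mutate(genome, rate, rng):
    """Flip each bit of the genome independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in genome]


if __name__ == "__main__":
    rng = random.Random(42)
    genome = [0] * 20
    # A well-adapted individual (low stress) is perturbed only slightly,
    # while a stressed one explores the genetic space more widely.
    for stress in (0.0, 0.5, 1.0):
        rate = adaptive_mutation_rate(0.02, stress)
        print(stress, rate, mutate(genome, rate, rng))
```

With a base rate of 0.02 and the assumed `max_factor` of 4, the rate ranges from 0.02 for an unstressed individual to 0.08 for a fully stressed one.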
Different environments and different definitions of stress would need to be developed to
confirm the results and to test to what extent an adaptive mutation scheme can be beneficial. In
particular, it has not been tested whether increasing the range of the mutation rate in the
epigenetic system can lead to too many deleterious mutations and reduce the performance.
Another path that looked promising in (Ticconi, 2011) is that of epigenetic inheritance.
Allowing the mutation rate, modified by the stress experienced during the lifetime of an
individual, to be passed to the individual's offspring would in fact enable a sort of phylogenetic
memory, at least between nearby generations, of how harsh the environment is in that period.
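A possible reading of this idea of epigenetic inheritance is sketched here. Everything in it is hypothetical: the `Individual` class, the rate bounds `min_rate` and `max_rate`, and the linear stress mapping are assumptions made for illustration, not part of the system evaluated in this work.

```python
import random
from dataclasses import dataclass


@dataclass
class Individual:
    genome: list          # assumed to be a list of bits
    mutation_rate: float  # the "epigenetic" state carried by the individual


def update_rate(individual, stress, min_rate=0.01, max_rate=0.08):
    """Set the individual's rate from the stress felt during its lifetime.

    Hypothetical linear mapping, with stress clamped to [0, 1].
    """
    stress = min(max(stress, 0.0), 1.0)
    individual.mutation_rate = min_rate + (max_rate - min_rate) * stress


def reproduce(parent, rng):
    """Create an offspring that inherits the parent's current mutation rate."""
    child_genome = [
        1 - gene if rng.random() < parent.mutation_rate else gene
        for gene in parent.genome
    ]
    # The rate is copied rather than reset to a base value, so the offspring
    # starts its life "remembering" how harsh the parent's environment was.
    return Individual(child_genome, parent.mutation_rate)
```

In this sketch the memory lasts only as long as successive generations keep experiencing similar stress, since each individual's rate is overwritten by `update_rate` during its own lifetime.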
This experimentation has, in addition, raised some questions about the suitability of a classical
trial-based ER model for further analysis of this subject. An open-ended, multi-agent artificial life
simulation might be a better tool to study the dynamics of mutation rates in a population acting
in real time.
Bibliography
Aneil F Agrawal and Alethea D Wang. Increased transmission of mutations by low-condition
females: evidence for condition-dependent DNA repair. PLoS biology, 6(2):e30, February
2008. ISSN 1545-7885.
Charles F Baer. Does mutation rate depend on itself. PLoS biology, 6(2):e52, February 2008.
ISSN 1545-7885.
Charles F Baer, Michael M Miyamoto, and Dee R Denver. Mutation rate variation in multicellular
eukaryotes: causes and consequences. Nature reviews. Genetics, 8(8):619-31, August
2007. ISSN 1471-0056.
Randall D. Beer. The Dynamics of Active Categorical Perception in an Evolved Model Agent.
Adaptive Behavior, 11(4):209-243, December 2003. ISSN 1059-7123.
KY Chan, TC Fogarty, and ME Aydin. Genetic algorithms with dynamic mutation rates
and their industrial applications. International Journal of Computational Intelligence and
Applications, 7(2), 2008.
Jeff Clune, Dusan Misevic, Charles Ofria, Richard E Lenski, Santiago F Elena, and Rafael
Sanjuan. Natural selection fails to optimize mutation rates for long-term adaptation on
rugged fitness landscapes. PLoS computational biology, 4(9):e1000187, January 2008. ISSN
1553-7358.
S Cotton. Condition dependent mutation rates and sexual selection. Journal of evolutionary
biology, 2009.
Francis Crick. Central Dogma in Molecular Biology. Nature, 227(8):561-563, 1970.
Christoffel Dinant, Adriaan B Houtsmuller, and Wim Vermeulen. Chromatin structure and
DNA damage repair. Epigenetics and chromatin, 1(1):9, January 2008. ISSN 1756-8935.
Dario Floreano, Mototaka Suzuki, and Claudio Mattiussi. Active vision and receptive field
development in evolutionary robots. Evolutionary computation, 13(4):527-44, January 2005.
ISSN 1063-6560.
NC Fonville. Stress-induced modulators of repeat instability and genome evolution. Journal of
Molecular Microbiology and Biotechnology, 2011.
Rodrigo S Galhardo, P J Hastings, and Susan M Rosenberg. Mutation as a stress response and
the regulation of evolvability. Critical reviews in biochemistry and molecular biology, 42(5):
399-435, 2007. ISSN 1040-9238.
Janet L Gibson, Mary-Jane Lombardo, Philip C Thornton, Kenneth H Hu, Rodrigo S Galhardo,
Bernadette Beadle, Anand Habib, Daniel B Magner, Laura S Frost, Christophe Herman, P J
Hastings, and Susan M Rosenberg. The sigma(E) stress response is required for stress-induced
mutation and amplification in Escherichia coli. Molecular microbiology, 77(2):415-30, July
2010. ISSN 1365-2958.
Caleb Gonzalez, Lilach Hadany, and RG Ponder. Mutability and importance of a hypermutable
cell subpopulation that produces stress-induced mutants in Escherichia coli. PLoS genetics,
4(10):e1000208, January 2008. ISSN 1553-7404.
EL Greer, TJ Maures, Duygu Ucar, and AG Hauswirth. Transgenerational epigenetic inheritance
of longevity in Caenorhabditis elegans. Nature, 479(7373):365-371, October 2011. ISSN
0028-0836.
Jatinder N.D Gupta and Randall S Sexton. Comparing backpropagation with a genetic algorithm
for neural network training. Omega, 27(6):679-684, December 1999. ISSN 0305-0483.
David Haig. Weismann Rules! OK? Epigenetics and the Lamarckian temptation. Biology &
Philosophy, 22(3):415-428, December 2006. ISSN 0169-3867.
Heather Hendrickson, E Susan Slechta, Ulfar Bergthorsson, Dan I Andersson, and John R
Roth. Amplification-mutagenesis: evidence that directed adaptive mutation and general
hypermutability result from growth with a selected gene amplification. Proceedings of the
National Academy of Sciences of the United States of America, 99(4):2164-9, February 2002.
ISSN 0027-8424.
John H. Holland. Adaptation in Natural and Artificial Systems. The University of Michigan
Press, 1975.
Eva Jablonka and Marion J Lamb. Evolution in four dimensions: Genetic, epigenetic, behavioral,
and symbolic variation in the history of life. MIT Press, 2005.
Josephine M Kang, Nicole M Iovine, and Martin J Blaser. A paradigm for direct stress-induced
mutation in prokaryotes. FASEB journal: official publication of the Federation of American
Societies for Experimental Biology, 20(14):2476-85, December 2006. ISSN 1530-6860.
Marvin Minsky and Seymour Papert. Perceptrons: An introduction to computational geometry.
MIT Press, 1st edition, 1969.
Melanie Mitchell. An Introduction to Genetic Algorithms. The MIT Press, 1998.
F Mondada, E Franzi, and A Guignard. The development of Khepera. Experiments with the
Mini-Robot Khepera, Proceedings of the First International Khepera Workshop, pages 7-14,
1999.
Stefano Nolfi and Dario Floreano. Evolutionary Robotics. MIT Press, 2000.
Domenico Parisi and Federico Cecconi. La società dei beni. Dalla famiglia allo stato alle imprese
private. Bollati Boringhieri, 2006.
Ales Pecinka, Huy Q Dinh, Tuncay Baubec, Marisa Rosa, Nicole Lettner, and Ortrun Mittelsten
Scheid. Epigenetic regulation of repetitive elements is attenuated by prolonged heat stress in
Arabidopsis. The Plant cell, 22(9):3118-29, September 2010. ISSN 1532-298X.
R Pfeifer, M Lungarella, and O Sporns. The synthetic approach to embodied cognition: a
primer. Handbook of Cognitive Science, 2008.
Rolf Pfeifer, Max Lungarella, and Fumiya Iida. Self-organization, embodiment, and biologically
inspired robotics. Science (New York, N.Y.), 318(5853):1088-93, November 2007. ISSN
1095-9203.
S M Rosenberg. Evolving responsively: adaptive mutation. Nature reviews. Genetics, 2(7):
504-15, July 2001. ISSN 1471-0056.
Nathaniel P Sharp and Aneil F Agrawal. Evidence for elevated mutation rates in low-quality
genotypes. Proceedings of the National Academy of Sciences of the United States of America,
109(16):6142-6, April 2012. ISSN 1091-6490.
Jeffrey D Stumpf, Anthony R Poteete, and Patricia L Foster. Amplification of lac cannot
account for adaptive mutation to Lac+ in Escherichia coli. Journal of bacteriology, 189(6):
2291-9, March 2007. ISSN 0021-9193.
D Thierens. Adaptive mutation rate control schemes in genetic algorithms. Evolutionary Computation,
2002. CEC '02. Proceedings of the 2002 Congress on, 1:980-985, 2002.
Fabio Ticconi. StrEGA: Stress-driven EpiGenetic Algorithm. Technical report, University of
Sussex, 2011.
Elio Tuci, M Quinn, and I Harvey. An evolutionary ecological approach to the study of learning
behavior using a robot-based model. Adaptive Behavior, 10(3-4):201-221, 2002.
August Weismann. The Germ-Plasm: A Theory of Heredity. Charles Scribner's Sons, 1893.
Mahmoud W Yaish, Joseph Colasanti, and Steven J Rothstein. The role of epigenetic processes
in controlling flowering time in plants exposed to stress. Journal of experimental botany, 62
(11):3727-35, July 2011. ISSN 1460-2431.
Yongzhong Zhao and Richard J Epstein. Programmed genetic instability: a tumor-permissive
mechanism for maintaining the evolvability of higher species through methylation-dependent
mutation of DNA repair genes in the male germ line. Molecular biology and evolution, 25(8):
1737-49, August 2008. ISSN 1537-1719.
Appendix A
Additional Plots
A.1 Simple sources switch
A.1.1 Static environment
(a) Mutation rate 0.01
(b) Mutation rate 0.02
(c) Mutation rate 0.04
Figure A.1: Each of the plots represents the values of the best (red line) and mean fitness, with the error bars calculated
using standard error (values taken from 10 repetitions of each experiment). The environment is static for the whole 100
generations.
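As an illustration of how per-generation curves with standard-error bars of this kind can be obtained from repeated runs, the following sketch may be useful. It is an assumption about the post-processing, not the code used to produce the plots; the function names and the data layout (one fitness series per replication) are hypothetical.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for a sequence of numbers.

    The standard error is the sample standard deviation divided by sqrt(n);
    it is reported as 0.0 when fewer than two values are available.
    """
    n = len(values)
    mean = statistics.fmean(values)
    if n < 2:
        return mean, 0.0
    return mean, statistics.stdev(values) / math.sqrt(n)


def summarize(runs):
    """Summarize replications generation by generation.

    `runs` is a list with one fitness series per replication, all of the same
    length; the result holds one (mean, standard error) pair per generation.
    """
    return [mean_and_standard_error(generation) for generation in zip(*runs)]
```

With 10 replications of 100 generations each, `summarize` would return 100 pairs, one per point on the plotted curve.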
A.1.2 Switch
Mutation rate 0.01
See Figure 4.1.
Mutation rate 0.02
(a) Best fitness (b) Mean fitness
Figure A.2: Plot of the evolutionary performance after the last generation of the static environment (evolved with
mutation rate 0.02) has been put in a completely switched environment (what was food is poison and vice versa). The
red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic
with dynamic brood size. In Figure A.2a there is the plot of the best fitness for each generation, while in Figure
A.2b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment
replications, for each of the three groups.
Mutation rate 0.04
(a) Best fitness (b) Mean fitness
Figure A.3: Plot of the evolutionary performance after the last generation of the static environment (evolved with
mutation rate 0.04) has been put in a completely switched environment (what was food is poison and vice versa). The
red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic
with dynamic brood size. In Figure A.3a there is the plot of the best fitness for each generation, while in Figure
A.3b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment
replications, for each of the three groups.
A.1.3 Periodic change
Medium sinusoidal
See Figure 4.2.
Medium sinusoidal, Kill on Energy
(a) Best fitness (b) Mean fitness
Figure A.4: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics
with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,
the medium sinusoidal (Figure 3.1b) is used and kill on energy is activated. In Figure A.4a there is the plot of the
best fitness for each generation, while in Figure A.4b there is the plot of the mean fitness for each generation. In both
cases, the values are averaged over 10 experiment replications, for each of the five groups.
Hard sinusoidal
(a) Best fitness (b) Mean fitness
Figure A.5: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics
with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,
the hard sinusoidal (Figure 3.1c) is used and kill on energy is not activated. In Figure A.5a there is the plot of the
best fitness for each generation, while in Figure A.5b there is the plot of the mean fitness for each generation. In both
cases, the values are averaged over 10 experiment replications, for each of the five groups.
Hard sinusoidal, Kill on Energy
(a) Best fitness (b) Mean fitness
Figure A.6: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics
with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,
the hard sinusoidal (Figure 3.1c) is used and kill on energy is activated. In Figure A.6a there is the plot of the best
fitness for each generation, while in Figure A.6b there is the plot of the mean fitness for each generation. In both cases,
the values are averaged over 10 experiment replications, for each of the five groups.
A.2 Simple sources switch with water
A.2.1 Static environment
(a) Mutation rate 0.01
(b) Mutation rate 0.02
(c) Mutation rate 0.04
Figure A.7: Each of the plots represents the values of the best (red line) and mean fitness, with the error bars calculated
using standard error (values taken from 10 repetitions of each experiment). The environment is static for the whole 100
generations.
A.2.2 Switch
Mutation rate 0.01
(a) Best fitness (b) Mean fitness
Figure A.8: Plot of the evolutionary performance after the last generation of the static environment (evolved with
mutation rate 0.01) has been put in a completely switched environment (what was food is poison and vice versa). The
red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic
with dynamic brood size. In Figure A.8a there is the plot of the best fitness for each generation, while in Figure
A.8b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment
replications, for each of the three groups.
Mutation rate 0.02
(a) Best fitness (b) Mean fitness
Figure A.9: Plot of the evolutionary performance after the last generation of the static environment (evolved with
mutation rate 0.02) has been put in a completely switched environment (what was food is poison and vice versa). The
red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic
with dynamic brood size. In Figure A.9a there is the plot of the best fitness for each generation, while in Figure
A.9b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment
replications, for each of the three groups.
Mutation rate 0.04
See Figure 4.3.
A.2.3 Periodic change
Medium sinusoidal
(a) Best fitness (b) Mean fitness
Figure A.10: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics
with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,
the medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure A.10a there is the plot of
the best fitness for each generation, while in Figure A.10b there is the plot of the mean fitness for each generation. In
both cases, the values are averaged over 10 experiment replications, for each of the five groups.
Medium sinusoidal, Kill on Energy
See Figure 4.4.
Hard sinusoidal
(a) Best fitness (b) Mean fitness
Figure A.11: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics
with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,
the hard sinusoidal (Figure 3.1c) is used and kill on energy is not activated. In Figure A.11a there is the plot of the
best fitness for each generation, while in Figure A.11b there is the plot of the mean fitness for each generation. In both
cases, the values are averaged over 10 experiment replications, for each of the five groups.
Hard sinusoidal, Kill on Energy
(a) Best fitness (b) Mean fitness
Figure A.12: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics
with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,
the hard sinusoidal (Figure 3.1c) is used and kill on energy is activated. In Figure A.12a there is the plot of the best
fitness for each generation, while in Figure A.12b there is the plot of the mean fitness for each generation. In both cases,
the values are averaged over 10 experiment replications, for each of the five groups.