Evolutionary Robotics master thesis
  • Investigation of Mutation Rates Adaptivity in Changing

    Environments: an Evolutionary Robotics Approach

    Fabio Ticconi

    August 30, 2012

  • Abstract

    Taking inspiration from recent research in microbiology and evolutionary theory, this work

    explores the possibility that making the mutation rate of a genetic algorithm adaptive,

    instead of fixed, could improve performance under changing environmental conditions.

    To this end, a simplified model of stress-driven mutation rates is proposed and

    integrated into the general principles and techniques of Evolutionary Robotics. A notion of

    environmental stress has been defined for three different environments, where populations of

    simulated Khepera robots have been evolved and analysed for their ability to regain fitness after

    their environment was modified, and to keep their fitness high while the environment changed

    periodically. The approach proved promising and outperformed the fixed mutation rate in

    most experiments, recovering more quickly, in certain environments, from a drastic

    change in the external conditions. Extensions need to be developed in future work to study

    the potential and limits of mutation rate adaptivity.

  • Contents

    1 Background
      1.1 Epigenetics
      1.2 Evolutionary Robotics
        1.2.1 Artificial Neural Networks
        1.2.2 Genetic Algorithms
      1.3 Stress-driven Epigenetic Algorithm
    2 Methods
      2.1 Simulated Environment
      2.2 Robot Architecture
        2.2.1 Neural network architecture
      2.3 StrEGA implementation
        2.3.1 Dynamic brood size
    3 Experimental setup
      3.1 Simple sources switch
      3.2 Simple sources switch with water
      3.3 Complex sources switch
    4 Results
      4.1 Simple sources
        4.1.1 Static environment
        4.1.2 Switch
        4.1.3 Periodic change
      4.2 Simple sources with water
        4.2.1 Static environment
        4.2.2 Switch
        4.2.3 Periodic change
      4.3 Complex sources
        4.3.1 Camera
        4.3.2 Infrared
    5 Conclusion
    A Additional Plots
      A.1 Simple sources switch
        A.1.1 Static environment
        A.1.2 Switch
        A.1.3 Periodic change
      A.2 Simple sources switch with water
        A.2.1 Static environment
        A.2.2 Switch
        A.2.3 Periodic change

  • Acknowledgements

    This work is dedicated to my family and to Anto, who give me love when I most need it.

    My most sincere thanks go to all the people with whom I shared an amazing, challenging and

    stimulating year. The closest friends, they already know. Of course they know: we have been

    sweating and laughing side by side.

    A big thanks to all the tutors and professors who have made the start of this adventure easy

    and enjoyable, and the progress challenging and interesting.

    A special thanks to Matthew, my supervisor, who has always been able to put order into my

    confused ideas and has taught me a couple of important lessons, while always being friendly and

    ready to give advice.

  • Introduction

    The adaptive mutation is a little cloud that obscures the beauty and clearness of the

    molecular biological perspective on Life.

    Vasily Ogryzko

    Mainstream evolutionary theory states that genetic mutations are blind, that is, they are not

    directed at specific genes. It is argued that they occur at random times and places in the

    genotype and that they normally generate lower-fitness phenotypes, meaning that most

    mutations are actually harmful, despite being an important source of variation in a

    population¹. Crick's central dogma of molecular biology is often cited to defend this position:

    The central dogma of molecular biology deals with the detailed residue-by-residue

    transfer of sequential information. It states that such information cannot be

    transferred back from protein to either protein or nucleic acid.

    Following the argument of Crick (1970), the proteins synthesized using DNA and RNA

    cannot in turn influence the nucleic acids. This strengthens the idea that mutations have

    to be blind, because proteins that receive stimuli from outside the cytoplasm cannot react to

    them by modifying the DNA, assuming, of course, that the dogma is true.

    If we take the randomness of mutations for granted, their rate should in theory be very

    small. Mutations are needed to explore the genetic space and maintain diversity in a population,

    but this should happen without making every offspring unable to survive because of too many

    deleterious mutations.

    According to classical Darwinian evolution, the phenotype of each individual is subject to natural

    selection: individuals with deleterious mutations will likely be eliminated (because of

    their reduced fitness), while those with neutral or only slightly harmful mutations may

    continue to reproduce; finally, the few that have undergone a beneficial mutation

    will be fitter and will reproduce more, thus spreading the newly found gene through the population.

    ¹ See Jablonka and Lamb (2005) for a history of the debate about the randomness of mutations.

  • As Baer (2008) put it, deleterious mutations are the price we living organisms pay for the

    ability to evolve.

    During the last few decades, though, new findings in molecular biology have led researchers

    to question whether this view is correct. Can mutations be adaptive, that is, can an

    organism modify its own mutation rate or direct a mutation at specific genes to

    make its offspring more likely to survive?

    This work does not aim to answer this question, but to define and test a model able

    to embody, in a simplified way, the essential mechanisms of mutation rate adaptivity. The

    hypothesis of this work is that, whether they exist in nature or not, adaptive mutations can

    be beneficial in a population undergoing environmental stress, accelerating the evolutionary

    process.

    A prediction of this work is that a properly defined stress-driven mutation rate adaptivity

    should allow a population to escape a risk of extinction and quickly regain its vitality. To verify

    this, such a model needs to be defined and then tested in a properly set up environment.

    Chapter 1 presents an overview of the current debate over the adaptivity of

    mutations; Chapter 2 describes the methods used to define and test the

    above-mentioned model.

    Chapter 3 describes the experimental setups in which the model has been tested,

    followed by the results obtained in Chapter 4. Chapter 5 presents the conclusions of this

    work, where the results are discussed and possible future improvements outlined.

  • Chapter 1

    Background

    The question of whether mutations can be adaptive has given rise to a long-lasting debate, since

    it touches what is considered a central point of evolutionary theory and because, if the answer

    proves affirmative, we would in fact be acknowledging a sort of Lamarckian process¹ alongside

    the classic Darwinian one.

    Notwithstanding the importance of the question and the number of studies on the subject,

    the problem cannot yet be considered closed.

    On the one hand, proponents of the adaptivity hypothesis have shown that the mutation

    rate is not constant even between closely related individuals and groups, and is instead

    correlated with fitness: the lower the fitness, the higher the mutation rate (Baer et al., 2007;

    Baer, 2008). It can be argued that low-fitness individuals, under certain environmental conditions,

    start to have a higher mutation rate (possibly targeting the genes that make them less

    fit in that particular environment), thus increasing their likelihood of generating offspring

    better able to cope with the environment than themselves. That would make their mutations

    adaptive. A mutation is said to be directed if it occurs specifically in genes that can

    increase the organism's fitness (Rosenberg, 2001).

    On the other hand, this could simply be due to the cost of keeping the DNA

    intact, which in bad environmental conditions might be too high for an individual, leading

    to an increased mutation rate and possibly to offspring with a higher fitness

    than the parent; or it could be due to other reasons still explainable under a classic Darwinian

    framework (Baer, 2008).

    An example of how subtle the difference between an adaptive and a non-adaptive mutagenic

    process can be is seen in particular lac− mutants of Escherichia coli, which are unable to feed

    and grow properly in a lactose-rich environment. In this situation, there seems to be an adaptive

    response that drives the lac− cells to increase their mutation rate (a phenomenon known as

    hypermutation), so that eventually an offspring will end up with a lac+ gene which, by increasing

    cell growth in that particular environment, will quickly spread through the population.

    ¹ This is especially a problem if directed mutations are proven to occur, because in that case the

    environment would directly modify particular genes, which would then be inherited.

    Hendrickson et al. (2002) argued that the adaptivity of mutations in this case is only apparent

    and proposed a model where the lac− gene is amplified in the genotype by means of random

    mutations, resulting in a slight increase in the growth rate of the cell.

    Selection will then favour a further amplification of this gene, which also means an

    increase in the probability of having one of these repeated lac− genes mutated into a lac+ gene

    at the normal, constant mutation rate. When this happens, lac+ cells will grow at a higher rate

    and survive more than the others: selective pressure will make the lac+ gene spread through the

    population, eventually taking it over. When analysed without taking gene amplification into

    account, the researchers argued, this whole process can be misread as an example of adaptive

    mutation rates or even of directed mutation.
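
    A back-of-the-envelope calculation illustrates why amplification alone raises the chance of
    obtaining a lac+ gene without any change in the mutation rate. It is only an illustration
    under stated assumptions, not the model of Hendrickson et al.: each of the n copies of the
    gene is assumed to mutate independently, with the same constant probability μ per generation.

```latex
P(\text{at least one } lac^{+} \text{ among } n \text{ copies})
  = 1 - (1 - \mu)^{n}
  \approx n\mu \qquad \text{for } n\mu \ll 1
```

    Under these assumptions the probability grows roughly linearly with the number of copies, so
    a cell with many copies can look as if it had a higher mutation rate even though μ is unchanged.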

    Stumpf et al. (2007) instead tested many predictions of the amplification model, finding that

    most of them do not hold in this case: for example, cells that do actually amplify the lac− gene

    can end up with a lac+, but this is unstable because of the many copies of the lac−. The stable

    lac+ cells are instead shown to have been generated without amplification in the parent.

    In the last decade many other experiments and analyses have been performed on the lac−

    mutant of E. coli, bringing more evidence to the hypothesis of mutations as an adaptive stress

    response to bad, challenging environmental conditions. Directed mutation, however, seems to be

    excluded: a transient hypermutation under stress conditions is enough for a lac−

    population to adapt and become lac+, while it is still debated how the population is able to

    survive a huge (although temporary) increase in mutations (Gibson et al., 2010; Fonville, 2011;

    Rosenberg, 2001; Galhardo et al., 2007; Gonzalez et al., 2008).

    Condition-dependent mutation rates have also been found in some multicellular organisms,

    although more controversially (Baer, 2008; Agrawal and Wang, 2008; Cotton, 2009; Sharp and

    Agrawal, 2012), and it has also been suggested that human cancers arise in part as an

    evolutionarily programmed side effect of age- and damage-inducible genetic instability affecting both

    somatic and germ line lineages (Zhao and Epstein, 2008): an increase in mutation rates in

    human sperm with ageing and under environmental stress seems to lead to

    increased species adaptation, albeit at a high cost for single individuals.

    The debate, as said, is still open, but one more aspect needs to be considered.

    Firstly, most of the cited works assumed the mutation rate to be evolvable (because the

    molecules that control DNA repair and copying are in fact encoded by the DNA itself).

    Secondly, a recent work (Clune et al., 2008) pointed out that natural selection, in a strict

    Darwinian framework, fails to find the optimal mutation rates in difficult environments if they

    are encoded into the genome.

    Assuming the last two points are true², mutation rates aren't adaptive even on the

    phylogenetic scale, let alone the ontogenetic one. A recent field called epigenetics could help the

    adaptivity hypothesis escape this apparent dead end, adding a new dimension to the scenario.

    1.1 Epigenetics

    DNA is just a tape carrying information, and a tape is no good without a player.

    Epigenetics is about the tape player.

    Bryan Turner³

    Epigenetics is the name given to a relatively recent field studying the molecules that

    surround the DNA and regulate gene expression. They are also found in the cytoplasm and

    are highly sensitive to environmental stress (Yaish et al., 2011; Pecinka et al., 2010), modifying

    gene expression (including switching particular genes on and off) in different conditions.

    What has prompted a rather heated debate is the possibility for the epigenome (the name

    given to the whole set of molecules with epigenetic functions) to be heritable⁴.

    It is easy to see why: the epigenome of an individual responds to environmental stress

    by modifying gene expression or switching off a gene altogether, so offspring that inherit the

    epigenome could show the same effects on gene expression without having experienced the stress

    themselves. This is called inheritance of acquired characteristics, and has been opposed since the

    end of the 19th century on account of its Lamarckism (Weismann, 1893).

    Since there is some evidence (Dinant et al., 2008) that DNA repair is one of the duties of

    the epigenome, which in turn is highly responsive to environmental stress, it is possible that

    such stress causes transient, adaptive changes in the mutation rate.

    In this work, as stated in the introduction, a simplified model of stress-driven mutation

    rates is developed without trying to preserve biological resemblance. The

    (possible) adaptivity of mutations in real organisms, together with the (still unproven) idea

    of an epigenome as an interface between environmental stress and genetic changes, is an

    inspiration to investigate whether artificial organisms with such characteristics perform

    better than others without them.

    ² The inability to find an optimal mutation rate, however, may not be a problem as long as there is

    hypermutability, even if evolved by means of natural selection without any direct influence of the

    stress, as Kang et al. (2006) show.

    ³ http://epigenome.eu/en/1,1,0

    ⁴ See Jablonka and Lamb (2005) and a counter-argument by Haig (2006); for evidence in plants see

    Greer et al. (2011).

    To develop such a model of phylogenetic adaptivity, incorporating epigenetics, the

    principles and methods of Evolutionary Robotics (ER) have been adopted. The next section

    presents a brief history of the field and the rationale for using this approach to study evolution.

    1.2 Evolutionary Robotics

    ER developed during the 1990s as a new approach to the study of cognition: a so-called

    synthetic approach to cognition (Pfeifer et al., 2007). Its first use was as a model to study

    minimally cognitive agents, which were built or simulated and put in an environment to solve

    tasks. The simplified structure of both the robots and the environments, together with

    improvements in computer hardware, allowed researchers to perform real-time as well as offline

    analysis in a reasonable time, compared to studies on real organisms.

    The core idea is to imitate the evolutionary process to shape the controller, as well

    as the body, of a robot in a goal-oriented way. ER has been applied in engineering to design

    actual robots or other complex tools, like satellites, but it has been most commonly used to

    test scientific hypotheses in various fields: psychology, ecology, evolutionary biology, sociology,

    economics or, more recently, abiogenesis (Parisi and Cecconi, 2006; Tuci et al., 2002; Nolfi and

    Floreano, 2000; Pfeifer et al., 2008).

    The assumption, of course, is that the natural environment does in fact constitute an arena

    for real organisms, whose bodies and behaviours are tested and selected by means of natural

    selection: this allows only the fittest individuals to survive and to become increasingly better

    adapted to that particular environment, shaping in this way the bodies and behaviours of natural

    species. If the experimenter wants to analyse such a process in a controlled way, or to produce a

    robot fit for a particular environment, he or she just needs to define precisely the characteristics

    of the environment and a test function able to quantify how well a particular robot performs when

    put in that environment for a certain time, leaving the rest to artificial evolution.

    The keywords in ER are, thus, evolution and controller: the first shapes the second so that

    the desired behaviour emerges. ER, in fact, arose after two other techniques had become

    mature: artificial neural networks (ANNs) and genetic algorithms (GAs), which are outlined in the

    next two subsections.

  • 1.2.1 Artificial Neural Networks

    The researchers involved in ER, as mentioned, wanted to produce agents able to move in simulated

    and real environments, responding quickly to external stimuli. The behaviour-based controllers

    introduced by Brooks, the first models for biologically inspired robotics, were too high-level an

    abstraction for the purposes of ER. The idea of using artificial neural networks (which can

    be easily encoded into genome-like strings) as controllers able to convert environmental stimuli

    into behaviour was, at that point, a natural one (Nolfi and Floreano, 2000).

    Firstly, ANNs are a simplified model of biological neural networks, which in turn are

    the main focus of non-representationalist views of cognition, like connectionism, that were

    growing in recognition at the time. In addition, ANN research was booming again

    after the sharp drop caused by Minsky and Papert (1969): the initial problems

    with the computational limits of the perceptron had been solved with multilayer perceptrons. GAs

    and the backpropagation algorithm, alone and combined, were shown to be able to correctly update

    the weights of ANNs to solve complex tasks in the machine learning field, but in many

    ER experiments there was no need to use the backpropagation algorithm⁵. The main

    difference is that with GAs the focus is on species adaptation, instead of the adaptation of

    single individuals. There is also some evidence that GAs can outperform backpropagation in finding

    ANN weights (Gupta and Sexton, 1999), so the latter is rarely found in ER

    studies.

    Many different ANN models, in addition to the simple multilayer perceptron, have been

    studied and tested for ER experiments (Nolfi and Floreano, 2000): a particularly common one

    is the Continuous-time recurrent neural network (CTRNN) (Beer, 2003), which is the one used

    in this work.

    1.2.2 Genetic Algorithms

    Holland (1975) proposed a framework to systematically evolve genomes (initially only with

    binary genes) using operators inspired by genetics, like mutation and recombination. His initial

    goal was not to invent a system to solve complex tasks, but to study evolution under

    controlled conditions and to show that the evolutionary process is not limited to natural

    organisms.

    ⁵ In certain cases it was actually used, together with the genetic algorithm, to simulate both evolution of the

    species and lifetime learning of each individual. Parameters of the backpropagation algorithm can then be evolved

    too, removing the need for the experimenter to determine them. For lifetime neural plasticity,

    Hebbian learning has also been extensively used (Floreano et al., 2005).

  • The main idea is, given a certain problem, to define an unambiguous encoding of its possible

    solutions (the genotypes), then create a set of strings that can be translated from the encoded

    version into a testable representation. This is the starting population on which artificial

    selection can be applied.

    A binary string, for example, could uniquely encode the shape of a particular object by

    specifying, for each gene, whether a particular characteristic has to be present or not. The

    object, built using the specifications in the genome, can be tested in a pre-specified environment.

    A value of goodness, called fitness, is then assigned to that particular object in that particular

    environment. A selection criterion can then be applied: for example, keep only the best five

    solutions and clone their genetic codes. Then apply mutations, and combine them two by two

    using crossover, to get a population of the same size as the initial one. The process is then

    repeated.
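
    The loop described above can be sketched in a few lines of Python. This is only a minimal
    illustration under stated assumptions, not the implementation used in this work: the fitness
    function (counting the 1-bits), the one-point crossover, the pairing of neighbouring clones
    and all parameter values are hypothetical choices made for the example.

```python
import random

def fitness(genome):
    # Toy fitness (an assumption): count the characteristics that are present.
    return sum(genome)

def mutate(genome, rate, rng):
    # Flip each binary gene independently with probability `rate`.
    return [1 - g if rng.random() < rate else g for g in genome]

def crossover(a, b, rng):
    # One-point crossover: head of parent a, tail of parent b.
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:]

def next_generation(population, n_best, rate, rng):
    # Truncation selection: keep only the n_best fittest genomes.
    best = sorted(population, key=fitness, reverse=True)[:n_best]
    # Clone the survivors until the original population size is restored, then mutate.
    size = len(population)
    clones = [mutate(best[i % len(best)], rate, rng) for i in range(size)]
    # Combine the mutated clones two by two (each with its neighbour).
    return [crossover(clones[i], clones[(i + 1) % size], rng) for i in range(size)]

def evolve(pop_size=20, genome_len=16, n_best=5, rate=0.05, generations=30, seed=0):
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(genome_len)] for _ in range(pop_size)]
    for _ in range(generations):
        population = next_generation(population, n_best, rate, rng)
    return max(population, key=fitness)

if __name__ == "__main__":
    champion = evolve()
    print(fitness(champion), champion)
```

    Passing an explicit random generator makes each run repeatable for a given seed, which is
    convenient when comparing different mutation rates on the same toy problem.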

    This process tends to increase the performance of the whole population over the generations.

    Given infinite time and a non-zero mutation rate, it would explore all the possible values of the

    genome, so an optimal solution would eventually be found. It is more common, however, to stop

    the evolution when a certain fitness threshold has been reached, meaning that the solution found

    is good enough for our needs, or after a given number of iterations (generations).

    Since the work of Holland (1975), many different parameters, genetic operators, fitness (test)

    functions and selection criteria have been developed and tested, and GAs have been successfully

    applied to a wide range of applications (Mitchell, 1998).

    In this work, however, GAs are only considered as a stylized reproduction of the natural

    evolutionary process for artificial organisms. To be able to study the effects of stress-induced

    mutations on artificial organisms, the standard GA has been modified by including a simplified

    version of an epigenome that encodes the mutation rate. The next section presents this novel

    approach.

    1.3 Stress-driven Epigenetic Algorithm

    Adaptivity in mutation rates in living organisms, as seen, has been widely studied by biologists.

    The same does not apply to the field of ER.

    There have been studies, in fact, on improvements in function optimization when using

    an adaptive mutation scheme (Thierens, 2002; Chan et al., 2008). Still, most ER works use a

    fixed mutation rate, usually empirically found to be close to an optimal value (with respect

    to the experiment-specific conditions).

    Problems with hard-coded, fixed mutation rates arise when facing dynamic environments.

    Under changing conditions, there might be a different optimal mutation rate. An approach to

    resolve this issue could work by modifying the mutation rate (and possibly other parameters)

    in a way that could, in principle, keep up with the current optimal mutation rate (if one exists),

    or at least achieve better results than a fixed mutation rate approach.

    To achieve this goal, the previous work that introduced the Stress-driven EpiGenetic

    Algorithm (StrEGA) (Ticconi, 2011) explored the use of the inverse of ranking

    (with function optimization in mind). Each individual was associated not only

    with a genome, as in classical GAs, but also with an epigenome consisting of a single

    epigene. This encoded a number in the range [0, 1], representing the probability for each of the

    individual's genes to be mutated after duplication. After evaluation, a stress indicator was used

    to update the epigenome, either increasing or decreasing its value.

    The main difference, compared to the cited studies on adaptive mutation rates in GAs, is

    that each individual has its own epigenome, while normally a single mutation rate is

    shared by the whole population. There are, however, two main points that make this previously

    explored approach not perfectly applicable to this work.

    Firstly, the epigenome was inheritable. Apart from the fact that epigenetic inheritance is

    still a debated issue, in this work it was not strictly necessary.

    Secondly, for an optimization task there is not normally a lifetime. The genome of the

    individual is also its phenotype (the x coordinates of the point passed to the function), and it

    directly produces a fitness value (the y calculated by the function). The notion of stress was thus

    an abstract idea: the inverse of ranking. In a truncation selection scheme, where only a fraction

    of the best is allowed to reproduce, we can consider the best individual the least stressed

    one and, vice versa, the worst individual the most stressed one. Therefore, the inverse of ranking

    served well to decide the amount of the epigenome modification for each individual at each

    generation.
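
    A rank-based stress signal of this kind can be sketched as follows. The sketch is hedged:
    the text only states that the best-ranked individual is the least stressed, that the epigene
    lies in [0, 1] and that it is increased or decreased after evaluation. The linear mapping
    from rank to stress, the update rule with its `step` parameter and the Gaussian perturbation
    with `sigma` are hypothetical choices for illustration, not the rule of the original StrEGA.

```python
import random

def rank_stress(fitnesses):
    """Return a stress in [0, 1] per individual: 0 for the best-ranked, 1 for the worst."""
    n = len(fitnesses)
    order = sorted(range(n), key=lambda i: fitnesses[i], reverse=True)
    stress = [0.0] * n
    for rank, i in enumerate(order):
        stress[i] = rank / (n - 1) if n > 1 else 0.0
    return stress

def update_epigene(epigene, stress, step=0.1):
    # Hypothetical rule: stress above 0.5 raises the mutation probability,
    # stress below 0.5 lowers it; the result is clamped to [0, 1].
    new_value = epigene + step * (2.0 * stress - 1.0)
    return min(1.0, max(0.0, new_value))

def mutate(genome, epigene, rng, sigma=0.1):
    # Each real-valued gene is perturbed independently with probability `epigene`.
    return [g + rng.gauss(0.0, sigma) if rng.random() < epigene else g for g in genome]

if __name__ == "__main__":
    rng = random.Random(0)
    fitnesses = [3.0, 1.0, 2.0]
    epigenes = [0.5, 0.5, 0.5]
    stresses = rank_stress(fitnesses)
    epigenes = [update_epigene(e, s) for e, s in zip(epigenes, stresses)]
    print(stresses, epigenes)
    print(mutate([0.0, 0.0, 0.0], epigenes[1], rng))
```

    Because every individual carries its own epigene, poorly ranked individuals explore more
    aggressively while well ranked ones are perturbed less, which is the per-individual behaviour
    the text contrasts with a single population-wide mutation rate.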

    The stress encoding used in the original work (Ticconi, 2011) led to better results than

    the normal GA, over a variety of fixed mutation rates, but in other conditions it may not be

    the best option. The main practical problem with the idea of a stress-driven GA, therefore, is

    that it increases the number of parameters an experimenter has to think about.

    In this context, deciding what stress is can be as difficult as deciding the fitness function. A

    complete coupling of fitness and stress might not always be feasible (when the fitness is unbounded,

    for example) or appropriate. Using the inverse of ranking may, however, be considered a

    simple alternative to environment-specific stress functions.

    Despite this weakness, an objective advantage of an epigenetic-based approach remains: it

    does not enlarge the genetic search space, as encoding the mutation rate in the genome would. It

    also serves well the purpose of studying the adaptivity of the mutation rate in isolation, excluding

    it from the evolutionary process⁶.

    The next chapter, after an overview of the simulator chosen for this work, gives a

    more detailed description of how the StrEGA has been adapted to the current task.

    ⁶ As seen before, in biological organisms this is the main problem for researchers wishing to prove that

    there is an adaptive mechanism, since opponents always remark that such a mechanism could well be a

    product of classic Darwinian evolution.

  • Chapter 2

    Methods

    In this work, an existing piece of software, Evorobot*¹, has been used and extended instead

    of building everything from scratch. There are various reasons for this:

    - Evorobot* is free, multi-platform and open source, and it has been used, modified and

      extended over the last 15 years by some of the researchers who started the ER field.

    - It is written in C++, which speeds up the execution of the experiments, and it can be

      run both with and without a GUI², making it suitable for running on a computer cluster.

    - Most of all, the Khepera simulator was finely tuned, at the beginning of its

      development, after weeks of experiments using real robots, by sampling their sensor readings in

      all the conditions supported by the simulator itself and saving them in a file distributed

      with the package: it is therefore expected to be more accurate than a purely mathematical

      simulator.

    Because of the last point, it is normally easy to transfer the simulated controller

    to a real environment without a great loss in fitness. The trade-off is, of course, a

    limitation on the kinds of environment that can be defined, which has been considered a

    minor problem here since Evorobot* already supports:

    - light objects

    - 3D cylindrical objects with configurable sizes, heights and colours

    - walls with configurable heights and colours

    - ground areas with configurable radius and colours

    - Khepera robots with built-in and configurable sensors

    which have proven sufficient to set up the experiments needed for this work.

    ¹ Developed at LARAL, a laboratory of the Italian National Research Council, mainly by Stefano Nolfi and

    Onofrio Gigliotta; it is a complete rewrite of the original Evorobot developed by Nolfi and Dario Floreano.

    Website: http://laral.istc.cnr.it/evorobotstar

    ² The GUI has been developed using the Qt Framework, version 4, which is also free, open source and

    performant.

    The program is specifically intended for trial-based, multiple-seed evolutionary runs. Each

    experiment has to be assigned a directory containing a configuration file, evorobot.cf, an optional

    world file, evorobot.env, and an optional network file, evorobot.net.

    When Evorobot* is run from inside an experiment directory, it reads the files and creates

    appropriate internal structures, initializing the robot environment module, the evolutionary

    module and the controller module. Most importantly, the number of repetitions, of generations,

    of individuals, of trials and of life cycles has to be specified for each experiment.

    After that, the evolution starts by checking how many repetitions of the experiment have to be performed. For each repetition, an incremental seed is passed to the random number generator (the first seed can be specified in the configuration file; the others are that seed plus the index of the current repetition). This allows exact replication of the experiments and helps to give them statistical robustness.
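The seeding scheme described above can be sketched as follows. This is only an illustration in Python (Evorobot* itself is written in C++): the names `repetition_seeds` and `run_repetition` are hypothetical, and starting the repetition index at 0 is an assumption.

```python
import random

def repetition_seeds(base_seed, n_repetitions):
    """Seed of repetition i is base_seed + i (index assumed to start at 0)."""
    return [base_seed + i for i in range(n_repetitions)]

def run_repetition(seed, n_draws=3):
    """Stand-in for one evolutionary run: draws numbers from a seeded generator."""
    rng = random.Random(seed)
    return [rng.random() for _ in range(n_draws)]

seeds = repetition_seeds(base_seed=1, n_repetitions=10)
# Running a repetition again with the same seed reproduces it exactly.
assert run_repetition(seeds[0]) == run_repetition(seeds[0])
```

Using a separate generator object per repetition, rather than a global one, keeps each repetition reproducible on its own.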

    In each repetition, the whole environment is randomly re-initialized and an evolutionary process is started. The evolution lasts for the specified number of generations: in each of them, the population is tested individual by individual and the fitness values are recorded; then a fraction of the best individuals is allowed to reproduce while the others are killed. The children (offspring) of the survivors form a new population of the same size as the previous one, and the current generation ends.
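A minimal sketch of one such generation, in Python for brevity. The parameter names `n_reproducing` and `offspring` echo the configuration parameters mentioned later in this chapter; the function names, the toy fitness (`sum`) and the toy mutation operator are hypothetical and are not taken from the Evorobot* source.

```python
import random

def next_generation(population, fitness, n_reproducing, offspring, mutate, rng):
    """Truncation selection: the n_reproducing fittest individuals each
    produce `offspring` mutated copies; all the others are discarded."""
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:n_reproducing]
    return [mutate(parent, rng) for parent in parents for _ in range(offspring)]

def redraw_one_gene(genome, rng):
    """Toy mutation operator (illustrative only): re-draw one random gene."""
    child = list(genome)
    child[rng.randrange(len(child))] = rng.randint(0, 255)
    return child

rng = random.Random(0)
population = [[rng.randint(0, 255) for _ in range(8)] for _ in range(20)]
for _ in range(5):
    # 4 parents x 5 children keeps the population at 20 individuals.
    population = next_generation(population, sum, 4, 5, redraw_one_gene, rng)
```

The population size is preserved only when it equals `n_reproducing * offspring`, which is the relation the text relies on.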

    When an individual has to be tested, three experiment-specific functions (among others) are used 3:

    - initialize world()

    - initialize robot position()

    - ffitness()

    3 Refer to the file robot-env.cpp to see details of these functions. It can be seen in the source code that these three functions are in fact function pointers, which are linked to the appropriate experiment-specific functions using the name of the fitness function passed in the configuration file: instead of converting them into an object-oriented style, it has been chosen to maintain the C-style approach traditionally used by the LARAL researchers. A new, fully object-oriented version of Evorobot* was not yet available at the time of this work.

    The experiment-specific versions of these functions are the three main pieces of code a researcher has to write to set up a simple experiment, unless he or she needs to modify the GA or use a non-supported network controller.

    The first two functions are called whenever a new trial for an individual begins. If they are not overridden, the default functions are used: the first reads evorobot.env and randomly initializes the environment, and the second sets the robot position to random coordinates. The third is usually called at the end of the trial, but a parameter can be set to call it after every time step of the trial.

    In the next subsections some of the aspects of the simulator will be explored in greater

    detail, with focus on the modifications made for this work.

    Firstly, we will look into the configuration files and into how the environment where the robots are tested is organized. Then, there will be an overview of the robot structure, its sensors and the architecture of the network that controls it.

    Lastly, we will present the modifications made to the classic GA to support stress-induced

    mutation rate variations.

    2.1 Simulated Environment

    The world is a square arena of 1000x1000 pixels, while the robot occupies a circle with an approximate radius of 27 pixels. The simulator makes the world toroidal by default, but walls (coloured in black) have been used in the current experiments to limit the walkable surface and stimulate the evolution of an obstacle avoidance behaviour (if a robot hits a wall, it dies and its fitness will be very low).

    Depending on the experiment, there can be one, two or three ground zones (flat objects of different colours and a radius of 100 pixels), and ten cylindrical objects, five with a 27 pixel radius and five with a 12.5 pixel radius. The configuration file for the environment, evorobot.env, makes it easy to specify which objects have to be loaded in memory for that experiment by listing them on different lines, for example:

    swall x0 y0 X1000 Y0 h1.0 c0.0 c0.0 c0.0

    specifies an object of type wall, which is in fact a line 1 pixel wide starting at coordinates (0, 0) and ending at coordinates (1000, 0) 4, with a height of 1 pixel and RGB set to (0, 0, 0), which is black 5. For the other objects the syntax is similar, but there is no second pair of coordinates, just one pair for the barycentre.

    4 As usual in computer simulations, (0, 0) is the top left point of the arena.
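To make the layout of such a line concrete, here is a small parsing sketch in Python. It is not the Evorobot* parser: the function `parse_wall` is hypothetical, and it assumes that each token is a one-letter key immediately followed by a number, with the three `c` tokens giving red, green and blue in [0, 1].

```python
def parse_wall(line):
    """Parse a wall line such as 'swall x0 y0 X1000 Y0 h1.0 c0.0 c0.0 c0.0'.
    Assumed token format: one-letter key followed directly by a number."""
    tokens = line.split()
    if not tokens or tokens[0] != "swall":
        raise ValueError("not a wall line")
    fields, colour = {}, []
    for token in tokens[1:]:
        key, value = token[0], float(token[1:])
        if key == "c":
            colour.append(value)      # colour components arrive in R, G, B order
        else:
            fields[key] = value
    return {
        "start": (fields["x"], fields["y"]),
        "end": (fields["X"], fields["Y"]),
        "height": fields["h"],
        # Scale [0, 1] components to the usual [0, 255] range for display.
        "rgb": tuple(round(c * 255) for c in colour),
    }

wall = parse_wall("swall x0 y0 X1000 Y0 h1.0 c0.0 c0.0 c0.0")
```

For the example line this yields a black wall running along the top edge of the arena, from (0, 0) to (1000, 0).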

    All the objects are loaded in memory at the beginning and can, if needed, be further manipulated. This is what initialize world() is usually for. In this work, the ground areas are left in their fixed positions, as are the walls. In the experiments where cylinders are used, these are randomly repositioned at each initialize world() call (at the beginning of each trial), as is the robot in every experiment (making sure its starting position does not overlap an object, which would cause a premature death).

    How the loaded world is initialized, however, depends on the parameters in the main configuration file, evorobot.cf. In some of the older experiment-specific functions in Evorobot*, for example, whether or not to randomize the cylindrical objects is decided by the parameter random round.

    In the next section there is an overview of some of the parameters used in the experiment.

    2.2 Robot Architecture

    The Khepera mobile robot was developed in the 90s by a team at EPFL (Mondada et al., 1999), just after the start of the evolutionary robotics field. It is a differential-wheeled, round robot of about 5.5 cm diameter, able to run at a maximum speed of 1 m/s.

    The simulator supports light, ground and infrared (proximity) sensors, as well as some optional functionality of the physical robot such as a gripper, to grab objects, and a linear camera (with a view angle from 0 to 360 degrees).

    Proximity and light sensors are placed all around the robot, the ground sensor (which detects

    the colour of one single pixel) is placed behind, facing the ground, and the camera on the top.

    In this work, only infrared and ground sensors, and in some experiments the camera, are used. When the camera is used, the visual field is set to 36 degrees and divided into three vertical subfields, and the averaged value of each subfield is copied into its respective visual input. In that case the eight infrared sensor values are also averaged two by two and copied into the respective inputs; otherwise each of them has a dedicated input to the controller, which is described in the next subsection.
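The way sensor readings are condensed into network inputs can be sketched as follows (Python, illustrative only). All function names are hypothetical, and several details are assumptions: that adjacent infrared sensors are the ones averaged together, that the camera delivers one value per degree, and the order in which inputs are concatenated.

```python
def average_pairs(values):
    """Average adjacent readings two by two: 8 infrared values -> 4 inputs."""
    if len(values) % 2:
        raise ValueError("expected an even number of readings")
    return [(values[i] + values[i + 1]) / 2 for i in range(0, len(values), 2)]

def camera_inputs(pixels, n_fields=3):
    """Split the visual field into equal subfields and average each one.
    Pixels left over when the split is not exact are ignored."""
    size = len(pixels) // n_fields
    return [sum(pixels[k * size:(k + 1) * size]) / size for k in range(n_fields)]

def build_inputs(infrared, ground, camera_pixels=None):
    """Without a camera every infrared sensor keeps its own input."""
    if camera_pixels is None:
        return list(infrared) + [ground]
    return average_pairs(infrared) + [ground] + camera_inputs(camera_pixels)

infrared = [0.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.5, 0.5]
pixels = [0.0] * 12 + [1.0] * 12 + [0.5] * 12   # 36 values, one per degree (assumed)
with_camera = build_inputs(infrared, 0.25, pixels)
without_camera = build_inputs(infrared, 0.25)
```

With the camera the controller receives 4 + 1 + 3 external inputs in this sketch, against 8 + 1 without it; proprioceptive inputs would be appended separately.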

    5 In the configuration the range of values for colours is [0, 1] for each of the RGB components, which is then scaled up into the classic [0, 255] to be displayed with Qt.

    2.2.1 Neural network architecture

    The simulator supports various kinds of neural network but, as said, the CTRNN (continuous-time recurrent neural network) has been chosen. The architecture, as can be seen in Figure 2.1, is composed of three layers:

    1. an input layer, where the sensor readings are copied, in addition to the proprioceptive sensors

    2. a hidden layer composed of 6 recurrent leaky neurons fully connected with the input layer and with the

    3. output layer, which has only two neurons: their values are used directly by the simulator to calculate the speed of the wheels

    Figure 2.1: The neural network which controls the robot. At the bottom there is the input layer, where the readings from the sensors (external or proprioceptors) are copied. In the middle there are six recurrent leaky neurons that are fully connected with the two motor neurons, which in turn control the speed of the wheels.

    The proprioceptive sensors in the input layer are updated in the fitness function, and serve as a simplified internal feedback for the robot. This allows it to make decisions according to not only the external sensory inputs, but also the internal ones: if the robot is starving, for example, the energy proprioceptor will be very low. A more detailed description of the proprioceptors used will be given in the next chapter.

    As said before, a CTRNN has been used in this work, where each node is completely described by the following equation:

$$\tau_i \dot{y}_i = -y_i + \sum_{j=1}^{N} w_{ji}\,\sigma\big(g_j (y_j + \theta_j)\big) + I_i$$

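    This equation is integrated using the Euler method, so that at each time step the activations of the neurons are updated using this rule, where $h$ is the integration step size:

$$y_i = y_i + \frac{h}{\tau_i}\Big(-y_i + \sum_{j=1}^{N} w_{ji}\,\sigma\big(g_j (y_j + \theta_j)\big) + I_i\Big)$$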

    $\tau_i$, $\theta_j$ and $w_{ji}$ are all evolved by the genetic algorithm. The output values of the two neurons in the output layer are used to update the speed of the motors (update motors), which in turn is used to move the robot (move robot). In one time step, the maximum movement in one direction is 20 pixels.
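The Euler update of the CTRNN can be sketched in a few lines of Python. This is a generic illustration, not the Evorobot* code: the function name `ctrnn_step` is hypothetical, and the logistic sigmoid is assumed as the activation function, which is the usual choice for CTRNNs but is not stated explicitly here.

```python
import math

def sigmoid(x):
    """Logistic activation (assumed)."""
    return 1.0 / (1.0 + math.exp(-x))

def ctrnn_step(y, tau, theta, gain, w, inputs, h):
    """One Euler step of a CTRNN.
    y: neuron states, tau: time constants, theta: biases, gain: gains,
    w[j][i]: weight from neuron j to neuron i, inputs: external inputs,
    h: integration step size."""
    n = len(y)
    out = [sigmoid(gain[j] * (y[j] + theta[j])) for j in range(n)]
    new_y = []
    for i in range(n):
        incoming = sum(w[j][i] * out[j] for j in range(n))
        dy = (-y[i] + incoming + inputs[i]) / tau[i]
        new_y.append(y[i] + h * dy)
    return new_y

# A single unconnected neuron with no input simply decays towards zero.
state = ctrnn_step([1.0], [1.0], [0.0], [1.0], [[0.0]], [0.0], h=0.1)
```

All outputs are computed from the old states before any state is updated, so the step is a synchronous update of the whole network.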

    The controller cannot modify itself during the lifetime of the robot (there is no plasticity). The weights can only be modified by mutation when the best individuals reproduce, as presented in the next section.

    2.3 StrEGA implementation

    To support the variability of mutation rates in the StrEGA model, as said in Section 1.3, a

    definition of stress has to be produced according to the particular environments and agents

    being used.

    As will be seen in the next chapter, the individuals in this work are rewarded for staying

    not only alive, but also well: thus, starvation is considered a situation of high stress. In any

    case, however, after the evaluation the stress will be a value in the range [0, 1], where 0 is no

    stress at all and 1 is maximum stress.

    The calculated stress is then used in the mutate epi() function, a modification of the mutate() function in Evorobot*. It takes a gene as an argument and returns it, possibly mutated.

    Since Evorobot* uses, as a genotype, a string of integers in the range [0, 255], the gene passed to the mutating function is an integer value 6.

    Both the normal and the epigenetic mutating functions create a binary representation of the gene to be mutated, make a random check using the mutation rate for each of its 8 bits 7 and, if the check is positive, flip the bit.

    6 Each gene is scaled into the range [-5, 5] during the genotype-to-phenotype mapping, where the genome is translated into the actual weights of the neural network. In this work, the same structure has been maintained: a few tests have been performed with greater resolutions, i.e. gene range [0, 1024], without a difference in fitness values.

    7 The mutation rate is here, therefore, the probability for each of the 8 bits of a gene to be mutated.
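The per-bit mutation and the gene scaling can be sketched as follows (Python, illustrative only). The names `mutate_gene` and `gene_to_weight` are hypothetical, and the linear form of the genotype-to-phenotype mapping is an assumption: the text only states the target range.

```python
import random

def mutate_gene(gene, rate, rng):
    """Flip each of the 8 bits of an integer gene in [0, 255]
    independently with probability `rate`."""
    for bit in range(8):
        if rng.random() < rate:
            gene ^= 1 << bit
    return gene

def gene_to_weight(gene, low=-5.0, high=5.0):
    """Map a gene in [0, 255] linearly onto [low, high] (linearity assumed)."""
    return low + (high - low) * gene / 255.0

rng = random.Random(42)
genome = [rng.randint(0, 255) for _ in range(10)]
mutated = [mutate_gene(g, 0.01, rng) for g in genome]
weights = [gene_to_weight(g) for g in mutated]
```

Because the bits are flipped in a plain binary encoding, a single flip of the most significant bit changes the resulting weight by about half of the whole range.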

    The stress is used in mutate epi() to modify the mutation rate such that, x being the base mutation rate specified in the configuration file, the new mutation rate lies in the range [0.001, 2x], where high stress increases it and low stress decreases it. The check is then applied using the new mutation rate for the 8 bits.

    To check whether the only useful component of the EGA was just the possibility of increasing the base mutation rate, some of the experiments had the parameter add stress activated. In that case, if the base mutation rate is x, the new value after applying the stress is in the range [x, 2x]. Everything else remains the same as before.
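One possible way to turn a stress value into a mutation rate within these ranges is sketched below in Python. Only the ranges come from the text; the linear interpolation between the two bounds and the function name `stressed_rate` are assumptions made for illustration, and the actual mapping may differ.

```python
def stressed_rate(base_rate, stress, add_stress=False, floor=0.001):
    """Map stress in [0, 1] to a mutation rate.
    Normal mode:     range [floor, 2 * base_rate].
    add_stress mode: range [base_rate, 2 * base_rate].
    The linear interpolation is an assumption, not taken from the source."""
    if not 0.0 <= stress <= 1.0:
        raise ValueError("stress must lie in [0, 1]")
    upper = 2.0 * base_rate
    lower = base_rate if add_stress else floor
    return lower + stress * (upper - lower)

relaxed = stressed_rate(0.01, 0.0)                    # lowest rate, no stress
starving = stressed_rate(0.01, 1.0)                   # highest rate, full stress
never_below_base = stressed_rate(0.01, 0.0, add_stress=True)
```

With a base rate of 0.01 the rate stays between 0.001 and 0.02 in the normal mode, and never drops below 0.01 when add stress is active.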

    2.3.1 Dynamic brood size

    A further exploration has been that of using the stress to influence not only the mutation rate,

    but also the amount of offspring an individual can produce.

    As said, only a certain fraction of the best individuals (determined by the nreproducing parameter) survives each generation, and the number of children of each father is normally fixed to a value specified by the offspring parameter. The final population size is then the product of the number of fathers and the offspring per father.

    When the parameter dyn brood is activated, instead, reproduction proceeds as follows:

    1. after selecting the nreproducing best individuals (fathers from now on), order them by stress value

    2. divide the fathers into (offspring-1) bins so that the first bin contains the least stressed individuals, and the last bin the most stressed ones

    3. make the fathers in the first bin produce 2 children, the ones in the second bin 4 children and so on. If i is the index of the bin, from 1 to (offspring-1), the number of children each father belonging to bin i can produce is equal to 2i

    We wanted to keep the total population the same size as with the normal reproduction system, that is equal to nreproducing times offspring, without changing the selection criterion (thus, the individuals are first selected by fitness as usual, then given different reproductive capabilities using the stress).
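The binning procedure can be checked numerically with a short Python sketch. The function name `brood_sizes` is hypothetical; the input is assumed to be the stress values of the fathers that have already been selected by fitness.

```python
def brood_sizes(stresses, offspring):
    """Return the number of children for each father.
    stresses: stress of each selected father, offspring: normal brood size."""
    n_bins = offspring - 1
    if len(stresses) % n_bins:
        raise ValueError("number of fathers must be divisible by offspring - 1")
    per_bin = len(stresses) // n_bins
    # Indices of the fathers, from least to most stressed.
    order = sorted(range(len(stresses)), key=lambda k: stresses[k])
    children = [0] * len(stresses)
    for rank, father in enumerate(order):
        bin_index = rank // per_bin + 1       # bins are numbered 1 .. offspring - 1
        children[father] = 2 * bin_index
    return children

stresses = [0.9, 0.1, 0.5, 0.3]
children = brood_sizes(stresses, offspring=5)
# The total matches the normal scheme: 4 fathers x 5 children each.
assert sum(children) == len(stresses) * 5
```

In this example the most stressed father produces 8 children and the least stressed one only 2, while the population size remains 20.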

    What follows is a proof that the proposed system yields the desired result, assuming that the number of fathers is divisible by (offspring-1).

    Let n be the number of reproducing organisms, m the number of children per reproducing individual and P the size of the whole population, calculated as n · m. Then the number of bins, as defined before, is b = m - 1, and the number of reproducing individuals in each bin is f = n/b. What we need to find is a succession of b values,

    2

    Chapter 3

    Experimental setup

    The experiments have been performed in an incremental way, starting from a simple environment

    and slightly increasing its complexity.

    The aim of the experiments is to test whether adapting the mutation rate to the environmental conditions could help a population to survive better in changing environments, and whether in general it leads to an increased overall fitness (excluding possible fluctuations).

    In each environment there are two sources of food: a good one, and a poisonous one. Two sinusoidal functions determine how beneficial or deleterious a particular source is at each generation. They can be seen in Figure 3.1.

    Clearly, the worst moment for the population is after the cross: they can still survive by eating what was, a few generations before, the only food source, even though the amount of food they get at each time step is far less than before; at the same time, there is not yet a strong enough pressure to make them change source. When the source they are eating from becomes poisonous, the whole population experiences a big drop in performance, and after that they start to be selected for choosing the other source.

    The stress, just after the switch, should start to rise, thus increasing the mutation rate and the exploration of the genetic space. Since the individuals will eventually start to lose their stress again and increase their fitness, the population should converge again until the next switch.

    The environment becomes increasingly difficult because the switches become more frequent, i.e. at a certain point there will not be enough time to actually find an individual able to search for the other source, with or without adaptive mutation rates.

    The experiments are presented in detail in the next subsections. Unless otherwise specified, each experiment has been completely repeated 10 times with a different seed, and each individual has been re-evaluated 10 times with a reloaded environment. Each evaluation (trial) lasts 5000 time steps.

    (a) Easy   (b) Medium   (c) Difficult

    Figure 3.1: The three sinusoidals used to calculate the amount of food (red) and poison (green). When food becomes negative it has become poison, and vice versa. The food is calculated as $food(x) = \sin(0.00015x^2 + 1.5)\,a + b$, while the poison as $poison(x) = \sin(0.00015x^2 - 1.5)\,a + b$. Figure 3.1a uses a = 0.5, b = 0.5; Figure 3.1b uses a = 0.7, b = 0.3; Figure 3.1c uses a = 0.9, b = 0.1.
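The two curves in the caption can be reproduced with the Python sketch below. The formulas are read here with a squared generation index and a phase of +1.5 for food and -1.5 for poison; the exponent and the minus sign are reconstructed from the caption and should be treated as assumptions, as are the function and dictionary names.

```python
import math

# (a, b) pairs for the three difficulty levels of Figure 3.1.
AMPLITUDE_OFFSET = {
    "easy": (0.5, 0.5),
    "medium": (0.7, 0.3),
    "difficult": (0.9, 0.1),
}

def food(generation, level="medium"):
    """Value of the food source at a given generation."""
    a, b = AMPLITUDE_OFFSET[level]
    return math.sin(0.00015 * generation ** 2 + 1.5) * a + b

def poison(generation, level="medium"):
    """Value of the poison source at a given generation."""
    a, b = AMPLITUDE_OFFSET[level]
    return math.sin(0.00015 * generation ** 2 - 1.5) * a + b

# Sample both curves over the first 200 generations.
curve = [(g, food(g), poison(g)) for g in range(0, 200, 10)]
```

Because the argument of the sine grows with the square of the generation, the two sources swap roles at shorter and shorter intervals, which matches the statement that the switches become more frequent.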

    3.1 Simple sources switch

    This is the simplest environment. Apart from the walls of the arena, there are two ground areas of different colours, one near the east wall and the other near the west wall, as in Figure 3.2. The west wall has a section coloured white, instead of black, which enables a discrimination task: at generation 0, the white on the wall means food, while at the opposite side there is poison.

    Figure 3.2: The simple environment as displayed in Evorobot*. The white section can be seen on the west wall. At generation 0, the light blue area is food and the dark blue is poison.

    The robots are equipped with 4 infrared sensors, a camera, a ground sensor and an energy proprioceptor. The energy is their food. When they go over a ground area, the value corresponding to the current generation is drawn from the respective sinusoidal: it can either increase or decrease their energy, which however cannot be greater than 1 or less than 0.

    The robot loses a fixed amount of energy at each time step, and an additional amount proportional to its speed.

    The fitness corresponds to the sum of the energy levels at each time step, divided by the number of time steps. This value is added to the fitness values of the other trials of the same individual and then averaged: the final value is the actual fitness used to decide whether the robot can reproduce or has to be discarded.

    The stress corresponds to the number of time steps where the energy level was under 0.5.
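Fitness and stress for this environment can be computed from the recorded energy levels, as in the Python sketch below. The function names are hypothetical. Dividing the stress count by the number of time steps, so that it falls in [0, 1] as required in Section 2.3, and averaging the stress over trials are both assumptions.

```python
def evaluate_trial(energy_trace):
    """energy_trace: energy level at each time step, each value in [0, 1].
    Returns (fitness, stress) for one trial."""
    steps = len(energy_trace)
    fitness = sum(energy_trace) / steps
    stressed_steps = sum(1 for energy in energy_trace if energy < 0.5)
    stress = stressed_steps / steps      # normalisation to [0, 1] is assumed
    return fitness, stress

def evaluate_individual(traces):
    """Average fitness and stress over all the trials of one individual."""
    results = [evaluate_trial(trace) for trace in traces]
    fitness = sum(f for f, _ in results) / len(results)
    stress = sum(s for _, s in results) / len(results)
    return fitness, stress

# One trial spent well fed, one trial spent starving.
fitness, stress = evaluate_individual([[1.0, 1.0, 1.0, 1.0], [0.0, 0.0, 0.0, 0.0]])
```

In this setting fitness and stress are tightly coupled: both are derived from the same energy trace.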

    The optimal strategy is clearly to use the white wall as a discriminant and, according to the generation, to go straight towards east or west.

    3.2 Simple sources switch with water

    In this case, to make the task more difficult (thus with more ways to be solved), a water source

    (again a ground area, with a different colour) has been added in the middle of the environment,

    as in figure 3.3.

    Figure 3.3: The simple environment with water as displayed in Evorobot*. The white section can be seen on the west wall. At generation 0, the light blue area is food and the dark blue is poison. The area in the middle is always water.

    Compared to the previous setting, there are two more proprioceptors for water and protein

    levels.

    Energy is increased only if both the water and the protein level are above 0.5. Proteins are gained in the same way as energy was gained in the previous setting (thus using the sinusoidal functions), while the water is completely refilled when the robot passes over it: it does not change with the generations.

    This forces the robot to move continuously between the current food area and the middle of the environment, making it lose a bit of energy in the process.

    The stress is increased only in the time steps where the energy is below 0.5, using this formula:

$$\text{stress}_{i+1} = \text{stress}_i + 0.5\,(1 - \text{water level}) + 0.5\,(1 - \text{protein level})$$

    As can be seen, there is a slight decoupling between fitness (which is calculated in the same way as before) and stress: an individual might have had a low fitness because it was able to reach neither the water source nor the food source, so its stress will be very high; or it might have had a low fitness but have been good at reaching the water, only failing to find the food source, so its stress will not be so high.
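The decoupling between fitness and stress can be illustrated with a small Python sketch of the stress update, reading the formula with the two terms (1 - water level) and (1 - protein level). The function name `update_stress` is hypothetical, and any final normalisation of the accumulated value to [0, 1] is left out because it is not specified here.

```python
def update_stress(stress, energy, water_level, protein_level):
    """Accumulate stress for one time step; it grows only while energy < 0.5."""
    if energy < 0.5:
        stress += 0.5 * (1.0 - water_level) + 0.5 * (1.0 - protein_level)
    return stress

# Two robots with the same low energy over 10 steps:
lost = 0.0        # finds neither water nor food
thirsty_ok = 0.0  # keeps its water level full but finds no food
for _ in range(10):
    lost = update_stress(lost, energy=0.2, water_level=0.0, protein_level=0.0)
    thirsty_ok = update_stress(thirsty_ok, energy=0.2, water_level=1.0, protein_level=0.0)
```

Both robots would receive a similar, low fitness, but the one that at least reaches the water accumulates half the stress, and therefore a lower mutation rate.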

    If a random controller manages to be selected just after the switch (only because the others also have a very low fitness in that critical situation), its mutation rate would be higher than that of a good robot that is perfectly able to oscillate between water and one side area, but does so on the wrong one. This should allow the children of the random walker to be far away from it in genetic space, and possibly better at exploring the environment.

    The optimal behaviour is still to reach the correct food area, and then to go back and forth

    between it and the water source.

    3.3 Complex sources switch

    In this setting, the task is similar to the previous one: eat proteins and drink water to generate

    energy, whose summed up amount, averaged over the trials, constitutes the fitness. Also the

    stress is calculated in the same way, and the water area is still exactly in the middle of the

    environment.

    Figure 3.4: The complex environment as displayed in Evorobot*. The big cylinders are randomly spread around the environment, while the small cylinders are grouped (their barycentre, however, can be in one of the four corners of the environment, chosen at random).

    The robots are equipped, as before, with 4 infrared sensors, the camera, the ground sensor and the proprioceptors, but many of the experiments have also been replicated with a simplified structure, without the camera but with 8 infrared sensors. This has been done to see how the system performed with a different discrimination task.

    The main difference with respect to the previous setting is that the food and poison sources are not ground areas, but cylindrical objects of different sizes: at generation 0, the small cylinders are proteins and the big cylinders are poison. They have different colours so that the camera can discriminate them more easily, and the white zone on the wall is of course absent in this experiment.

    The robot has to approach them, discriminate whether they are big or small and eat them by touching them.

    In each time step there is a small probability that eaten cylinders, either big or small, will grow back. In addition, big cylinders are always randomly spread around the environment, while small cylinders are always grouped together, although their barycentre is changed randomly for each trial.

    The optimal strategy, then, is different in the two cases. When the small cylinders are proteins, the best approach would be to look around, avoiding poison, for the area where they are grouped and stay there. When the big cylinders are proteins, the agents have to explore the environment continuously since they are widespread, but also need to avoid the cluster of small cylinders when they encounter it.

    Chapter 4

    Results

    In the previous chapter, the experiments have been divided into three main groups. Each group of experiments has then been explored with different parameters. In the following there is a section for each group, presenting (unless otherwise specified):

    1. the performance of the agents in a static environment, where each source gives either food (saturates completely) or poison (depletes the internal food state completely)

    2. what happens if the pre-evolved agents are taken from the last generation of the previous experiment and re-evolved with completely switched sources

    3. a complete evolution where the food and poison sources switch smoothly, following the previously presented sinusoidal functions

    each in a different subsection. Additional plots can be seen in Appendix A.

    4.1 Simple sources

    See Figure 3.2 for details of the environment.

    4.1.1 Static environment

    In this case, the parameter kill on energy has always been used, considering how easy it is for an agent to reach an optimal behaviour: in most cases, twenty to thirty generations were needed to have a best fitness close to 1.

    A population has been evolved for 100 generations in this fixed environment: the west source

    gives food (energy level set to 1 for each time step the robot is on the area), the east source gives

    poison (energy level set to 0 as soon as the robot enters it, and because of the kill on energy

    parameter, the robot dies instantly).

    Results can be seen in Figure A.1. With all three mutation rates the best values quickly reached the maximum. The populations have been evolved without epigenetics, since they served just as a base for the switch: the population of the last generation of each of these groups has then been put into the environment with switched sources, repeated three times (with the same starting population) to test epigenetics and dynamic brood size.

    4.1.2 Switch

    As can be seen in Figure 4.1, when the evolved population is put into the switched environment there is a drastic performance drop, but in about 20 generations the fitness rises above 0.8 again. The base mutation rate is 0.01 here for all systems, meaning that the epigenetic one could go up to 0.02.

    All three are able to recover fairly quickly, with the best value being, at the last generation, almost the same for all of them. We can see, instead, that using epigenetics (especially without dynamic brood size) leads to better performance for the average population.

    (a) Best fitness (b) Mean fitness

    Figure 4.1: Plot of the evolutionary performance after the last generation of the static environment (evolved with mutation rate 0.01) has been put in a completely switched environment (what was food is poison and vice versa). The red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic with dynamic brood size. Figure 4.1a shows the best fitness for each generation, while Figure 4.1b shows the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications for each of the three groups.

    Since this could have been explained by the increased mutation rate, two further replications have been made: one with base mutation rate 0.02, and the other with mutation rate 0.04 (see A.1.2 for details).

    With 0.02 the situation is very similar: the plot of the best individuals does not present relevant differences between epigenetics and normal, even though the average seems to yield a worse result for the normal approach. When increasing again to 0.04, instead, the normal approach shows its first subtle decrease in the best plot, even though it can be considered negligible, and a huge decrease in the average plot, meaning that although the best individuals of the last generation were just a bit worse than with epigenetics, the whole population was much worse.

    This means that the epigenetic approach can actually use an increased mutation rate without

    a decrease in overall performance (because the stress goes down while the fitness goes up, so

    eventually the mutation rate gets down-regulated), while the normal approach starts losing

    performance.

    4.1.3 Periodic change

    In the third environment, the two ground areas can become both food or poison over time,

    according to two sinusoidals. The base mutation rate has been set to 0.01.

    In Figure 4.2 we can see the performance while using the medium sinusoidal (see Figure

    3.1b), without the parameter kill on energy. The compared systems are five now: normal,

    epigenetics, dynamic brood size, and two more: epigenetics with the add stress parameter

    activated, and with also the dynamic brood size.

    (a) Best fitness (b) Mean fitness

    Figure 4.2: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with

    add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01, the

    medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure 4.2a there is the plot of the

    best fitness for each generation, while in Figure 4.2b there is the plot of the mean fitness for each generation. In both

    cases, the values are averaged over 10 experiment replications, for each of the five groups.

    The best performances, as before, are very similar, while plain epigenetics without other modifications again outperforms the others on the average fitness of the population. A variation of parameters 1 also yields the same results (with a slight loss on the best plot for epigenetics, which still outperforms the others on the average plot).

    1 Adding the kill on energy parameter (Figure A.4), or using the hard sinusoidal (Figure A.5), with or without kill on energy (Figure A.6)

    4.2 Simple sources with water

    See Figure 3.3 for details of the environment.

    4.2.1 Static environment

    By forcing the agents to visit both the food area and the water to get a good fitness, the range of evolvable behaviours should be greater than before. As can be seen in Figure A.7, especially when the mutation rate is increased, it is harder to reach a high fitness even with 500 generations. The errors are far greater than before, meaning that over 10 runs, some were more fortunate than others. The average fitness is heavily influenced by the increase in mutation rate.

    In the next subsection the last generation for each of the three mutation rates will be put

    in a completely switched environment, as before.

    4.2.2 Switch

    Mutation rates 0.01 and 0.02 show a pattern similar to before. Epigenetics and normal have comparable best fitness values, and epigenetics outperforms both dynamic brood size and normal on the mean fitness plot. An increased reactivity to the new environment is seen, instead, in the experiment with 0.04 mutation rate (Figure 4.3).

    In this case, the fixed mutation rate approach and the dynamic brood size are not able to exceed a fitness of 0.8 even at the last generation, and they show a smooth growth, in contrast to the epigenetic approach, where the increase is large and quick. The mean fitness plot, as in the other experiments, leaves the fixed mutation approach as the worst performer among the three, and epigenetics as the best.

    4.2.3 Periodic change

    This experiment, like Section 4.1.3, evolves agents from the beginning in a changing environment, using the same sinusoidals and also a base mutation rate of 0.01.

    In Figure 4.4 we can see the performance while using the medium sinusoidal (Figure

    3.1b) and the parameter kill on energy. The systems compared are as in Section 4.1.3:

    normal, epigenetics, dynamic brood size, epigenetics with the add stress parameter activated,

    and with the dynamic brood size.

    (a) Best fitness (b) Mean fitness

    Figure 4.3: Plot of the evolutionary performance after the last generation of the static environment (evolved with mutation rate 0.04) has been put in a completely switched environment (what was food is poison and vice versa). The red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic with dynamic brood size. Figure 4.3a shows the best fitness for each generation, while Figure 4.3b shows the mean fitness for each generation. In both cases, the values are averaged over 10 experiment replications for each of the three groups.

    (a) Best fitness (b) Mean fitness

    Figure 4.4: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with

    add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01, the

    medium sinusoidal (Figure 3.1b) is used and kill on energy is activated. In Figure 4.4a there is the plot of the best

    fitness for each generation, while in Figure 4.4b there is the plot of the mean fitness for each generation. In both cases,

    the values are averaged over 10 experiment replications, for each of the five groups.

    There is no clear difference in performance in the best plot, but after every change

    epigenetics again stays slightly above the others (except before the first switch, where it

    is outperformed by add stress), while in the mean plot the best two are again epigenetics and

    dynamic brood size. This also happens when using the hard sinusoidal with kill on energy (see

    Figure A.12 for the experiment plots). In the other two experiments (medium sinusoidal without

    kill on energy (Figure A.10) and hard sinusoidal without kill on energy (Figure A.11)) the fixed

    mutation rate performs slightly better in the best plot than the various epigenetic systems, but

    is still outperformed in the mean plot.

    4.3 Complex sources

    This group of experiments has been, for the agents, the most difficult one. Figure 3.4 shows

    how the environment is structured. The next two sections present, respectively, the results when

    using both camera and infrared sensors, and when using only the infrared sensors.

    Because of the difficulty of the environment, reflected by the low performance, only the two

    lowest mutation rates (0.01 and 0.02) and the two easiest sinusoidals (easy and medium)

    have been used.

    4.3.1 Camera

    Easy sinusoidal

    The sinusoidal used here is easy because, at worst, the cylinders give little to no food, but

    they are never really poisonous. Nevertheless, the task remains difficult and the performance

    tends to be low. In the comparison, however, epigenetics is never outperformed by the fixed

    mutation rate approach. When the base mutation rate is 0.01, all systems apart from add stress

    follow a similar pattern. This also happens when the mutation rate is 0.02, but only if

    kill on energy is not activated. When it is (see Figure 4.5), epigenetics outperforms all the

    others.

    (a) Best fitness (b) Mean fitness

    Figure 4.5: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with

    add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.02, the

    easy sinusoidal (Figure 3.1a) is used and kill on energy is activated. In Figure 4.5a there is the plot of the best fitness

    for each generation, while in Figure 4.5b there is the plot of the mean fitness for each generation. In both cases, the

    values are averaged over 10 experiment replications, for each of the five groups.


  • Medium sinusoidal

    The overall performance is very low: only in certain cases is a fitness value of 0.8 reached.

    As before, all five tested systems tend to perform poorly. In one combination of

    parameters (see Figure 4.6), however, the dynamic brood size outperforms all the others in

    both the best and the mean plot.

    (a) Best fitness (b) Mean fitness

    Figure 4.6: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with

    add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01, the

    medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure 4.6a there is the plot of the

    best fitness for each generation, while in Figure 4.6b there is the plot of the mean fitness for each generation. In both

    cases, the values are averaged over 10 experiment replications, for each of the five groups.


  • 4.3.2 Infrared

    Easy sinusoidal

    Figure 4.7 shows one of the experiments with infrared sensors and the easy sinusoidal. All

    the systems have low, very similar performance, but epigenetics is slightly better,

    both in the best plot and in the mean plot, where the improvement is more pronounced

    and is also shown by the dynamic brood size system.

    (a) Best fitness (b) Mean fitness

    Figure 4.7: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with

    add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.02, the

    easy sinusoidal (Figure 3.1a) is used and kill on energy is not activated. In Figure 4.7a there is the plot of the best

    fitness for each generation, while in Figure 4.7b there is the plot of the mean fitness for each generation. In both cases,

    the values are averaged over 10 experiment replications, for each of the five groups.


  • Medium sinusoidal

    Using the medium sinusoidal decreases the overall performance, leaving the relative performance

    the same. Figure 4.8 shows the same configuration as with the easy sinusoidal. The

    performance of epigenetics is still low, but slightly better than that of the others. For the other

    configurations, the difference is negligible.

    (a) Best fitness (b) Mean fitness

    Figure 4.8: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics with

    add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.02, the

    medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure 4.8a there is the plot of the

    best fitness for each generation, while in Figure 4.8b there is the plot of the mean fitness for each generation. In both

    cases, the values are averaged over 10 experiment replications, for each of the five groups.


  • Chapter 5

    Conclusion

    Starting from a debated issue in evolutionary biology, the adaptivity of mutations, we developed

    an Evolutionary Robotics model able to incorporate the idea of stress-driven modification of the

    mutation rates, through the use of a structure parallel to the (artificial) genome: the epigenome.

    Three different environments, of increasing difficulty, have been used to test the initial

    hypothesis. The results are promising but do not show a general and consistent increase

    in performance, even though in most of the experiments epigenetics scored better,

    meaning that it allowed the population either to recover more quickly from a situation of high

    stress, or to reach a high fitness value more quickly, or both.

    The add stress and add stress with dynamic brood size configurations served to check

    whether an increase in performance by the epigenetics could be due simply to the fact that the

    mutation rate increases. This has been shown not to be the case, because neither a higher fixed

    mutation rate nor the two add stress configurations were able to outperform the plain epigenetic

    system in most experiments. This could mean that at certain moments the mutation rate needs to

    be smaller, to allow a smoother evolutionary process (i.e., if the individuals are already good,

    there is no need for extensive exploration of the genetic space). When the environment changes,

    the mutation rate needs to be higher, so as to increase the probability that at least some of the

    individuals will survive.
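
    The behaviour described above, a low mutation rate while individuals are already fit and a

    higher one under stress, can be sketched as follows. This is only a minimal illustration and not

    the model used in the experiments: the linear mapping, the function names, the max_multiplier

    parameter and the fitness-based definition of stress are all assumptions introduced here.

```python
def stress_from_fitness(fitness, target=1.0):
    """Hypothetical stress measure: fitness shortfall relative to a target,
    clipped to the interval [0, 1]. Not the thesis' definition of stress."""
    shortfall = (target - fitness) / target
    return min(1.0, max(0.0, shortfall))


def adaptive_mutation_rate(base_rate, stress, max_multiplier=4.0):
    """Scale the base mutation rate linearly with stress.

    stress = 0 returns base_rate unchanged; stress = 1 returns
    base_rate * max_multiplier. The linear form is an assumption.
    """
    if not 0.0 <= stress <= 1.0:
        raise ValueError("stress must lie in [0, 1]")
    return base_rate * (1.0 + (max_multiplier - 1.0) * stress)


if __name__ == "__main__":
    # A fit individual keeps a rate near the base; a stressed one explores more.
    for fitness in (0.9, 0.5, 0.1):
        stress = stress_from_fitness(fitness)
        print(fitness, round(adaptive_mutation_rate(0.01, stress), 4))
```

    With a base rate of 0.01 and the assumed multiplier of 4, the rate spans the same 0.01 to 0.04

    range used for the fixed mutation rates in the experiments.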

    Different environments and different definitions of stress would need to be developed to

    confirm the results and test to what extent an adaptive mutation scheme can be beneficial. In

    particular, it has not been tested whether increasing the range for the mutation rate in the

    epigenetic system can lead to too many deleterious mutations and reduce the performance.

    Another path that looked promising in (Ticconi, 2011) is that of epigenetic inheritance.

    Allowing the mutation rate, modified by the stress experienced during the lifetime of an

    individual, to be passed to the individual's offspring would in fact enable a sort of phylogenetic

    memory, at least between nearby generations, of how harsh the environment is in that period.

    This experimentation has, in addition, raised some questions about the feasibility of using a

    classical trial-based ER model to analyse this subject further. An open-ended, multi-agent

    artificial life simulation might be a better tool to study the dynamics of mutation rates in a

    population acting in real time.
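
    One possible reading of epigenetic inheritance is sketched below: the offspring starts from a

    blend of its parent's stress-modified mutation rate and the base rate, so that the memory of a

    harsh period fades over a few generations. The blending rule, the inheritance weight and the

    function names are hypothetical and are not taken from (Ticconi, 2011).

```python
def inherited_rate(parent_rate, base_rate, inheritance=0.5):
    """Blend the parent's lifetime-modified rate with the base rate.

    inheritance = 0 means no memory (offspring restart at base_rate);
    inheritance = 1 means the parent's rate is passed on unchanged.
    """
    if not 0.0 <= inheritance <= 1.0:
        raise ValueError("inheritance must lie in [0, 1]")
    return inheritance * parent_rate + (1.0 - inheritance) * base_rate


def rate_trace(stressed_rate, base_rate, inheritance, generations):
    """Rates along one lineage when no further stress is experienced."""
    rates = [stressed_rate]
    for _ in range(generations):
        rates.append(inherited_rate(rates[-1], base_rate, inheritance))
    return rates


if __name__ == "__main__":
    # The elevated rate decays towards the base rate generation by generation.
    print(rate_trace(0.04, 0.01, 0.5, 3))
```

    A weight below 1 keeps the memory short-lived, which matches the idea of a memory limited to

    nearby generations.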


  • Bibliography

    Aneil F Agrawal and Alethea D Wang. Increased transmission of mutations by low-condition

    females: evidence for condition-dependent DNA repair. PLoS biology, 6(2):e30, February

    2008. ISSN 1545-7885.

    Charles F Baer. Does mutation rate depend on itself? PLoS biology, 6(2):e52, February 2008.

    ISSN 1545-7885.

    Charles F Baer, Michael M Miyamoto, and Dee R Denver. Mutation rate variation in multicellular

    eukaryotes: causes and consequences. Nature reviews. Genetics, 8(8):619-31, August

    2007. ISSN 1471-0056.

    Randall D. Beer. The Dynamics of Active Categorical Perception in an Evolved Model Agent.

    Adaptive Behavior, 11(4):209-243, December 2003. ISSN 1059-7123.

    KY Chan, TC Fogarty, and ME Aydin. Genetic algorithms with dynamic mutation rates

    and their industrial applications. International Journal of Computational Intelligence and

    Applications, 7(2), 2008.

    Jeff Clune, Dusan Misevic, Charles Ofria, Richard E Lenski, Santiago F Elena, and Rafael

    Sanjuan. Natural selection fails to optimize mutation rates for long-term adaptation on

    rugged fitness landscapes. PLoS computational biology, 4(9):e1000187, January 2008. ISSN

    1553-7358.

    S Cotton. Condition dependent mutation rates and sexual selection. Journal of evolutionary

    biology, 2009.

    Francis Crick. Central Dogma in Molecular Biology. Nature, 227(8):561-563, 1970.

    Christoffel Dinant, Adriaan B Houtsmuller, and Wim Vermeulen. Chromatin structure and

    DNA damage repair. Epigenetics and chromatin, 1(1):9, January 2008. ISSN 1756-8935.


  • Dario Floreano, Mototaka Suzuki, and Claudio Mattiussi. Active vision and receptive field

    development in evolutionary robots. Evolutionary computation, 13(4):527-44, January 2005.

    ISSN 1063-6560.

    NC Fonville. Stress-induced modulators of repeat instability and genome evolution. Journal of

    Molecular Microbiology and Biotechnology, 2011.

    Rodrigo S Galhardo, P J Hastings, and Susan M Rosenberg. Mutation as a stress response and

    the regulation of evolvability. Critical reviews in biochemistry and molecular biology, 42(5):

    399-435, 2007. ISSN 1040-9238.

    Janet L Gibson, Mary-Jane Lombardo, Philip C Thornton, Kenneth H Hu, Rodrigo S Galhardo,

    Bernadette Beadle, Anand Habib, Daniel B Magner, Laura S Frost, Christophe Herman, P J

    Hastings, and Susan M Rosenberg. The sigma(E) stress response is required for stress-induced

    mutation and amplification in Escherichia coli. Molecular microbiology, 77(2):415-30, July

    2010. ISSN 1365-2958.

    Caleb Gonzalez, Lilach Hadany, and RG Ponder. Mutability and importance of a hypermutable

    cell subpopulation that produces stress-induced mutants in Escherichia coli. PLoS genetics,

    4(10):e1000208, January 2008. ISSN 1553-7404.

    EL Greer, TJ Maures, Duygu Ucar, and AG Hauswirth. Transgenerational epigenetic inheritance

    of longevity in Caenorhabditis elegans. Nature, 479(7373):365-371, October 2011. ISSN

    0028-0836.

    Jatinder N.D Gupta and Randall S Sexton. Comparing backpropagation with a genetic algorithm

    for neural network training. Omega, 27(6):679-684, December 1999. ISSN 0305-0483.

    David Haig. Weismann Rules! OK? Epigenetics and the Lamarckian temptation. Biology &

    Philosophy, 22(3):415-428, December 2006. ISSN 0169-3867.

    Heather Hendrickson, E Susan Slechta, Ulfar Bergthorsson, Dan I Andersson, and John R

    Roth. Amplification-mutagenesis: evidence that directed adaptive mutation and general

    hypermutability result from growth with a selected gene amplification. Proceedings of the

    National Academy of Sciences of the United States of America, 99(4):2164-9, February 2002.

    ISSN 0027-8424.

    John H. Holland. Adaptation in Natural and Artificial Systems. The University of Michigan

    Press, 1975.


  • Eva Jablonka and Marion J Lamb. Evolution in four dimensions: Genetic, epigenetic,

    behavioral, and symbolic variation in the history of life. MIT Press, 2005.

    Josephine M Kang, Nicole M Iovine, and Martin J Blaser. A paradigm for direct stress-induced

    mutation in prokaryotes. FASEB journal: official publication of the Federation of American

    Societies for Experimental Biology, 20(14):2476-85, December 2006. ISSN 1530-6860.

    Marvin Minsky and Seymour Papert. Perceptrons: An introduction to computational geometry.

    MIT Press, 1st edition, 1969.

    Melanie Mitchell. An Introduction to Genetic Algorithms. The MIT Press, 1998.

    F Mondada, E Franzi, and A Guignard. The development of khepera. Experiments with the

    Mini-Robot Khepera, Proceedings of the First International Khepera Workshop, pages 7-14,

    1999.

    Stefano Nolfi and Dario Floreano. Evolutionary Robotics. MIT Press, 2000.

    Domenico Parisi and Federico Cecconi. La società dei beni. Dalla famiglia allo stato alle imprese

    private. Bollati Boringhieri, 2006.

    Ales Pecinka, Huy Q Dinh, Tuncay Baubec, Marisa Rosa, Nicole Lettner, and Ortrun Mittelsten

    Scheid. Epigenetic regulation of repetitive elements is attenuated by prolonged heat stress in

    Arabidopsis. The Plant cell, 22(9):3118-29, September 2010. ISSN 1532-298X.

    R Pfeifer, M Lungarella, and O Sporns. The synthetic approach to embodied cognition: a

    primer. Handbook of Cognitive Science, 2008.

    Rolf Pfeifer, Max Lungarella, and Fumiya Iida. Self-organization, embodiment, and biologically

    inspired robotics. Science (New York, N.Y.), 318(5853):1088-93, November 2007. ISSN

    1095-9203.

    S M Rosenberg. Evolving responsively: adaptive mutation. Nature reviews. Genetics, 2(7):

    504-15, July 2001. ISSN 1471-0056.

    Nathaniel P Sharp and Aneil F Agrawal. Evidence for elevated mutation rates in low-quality

    genotypes. Proceedings of the National Academy of Sciences of the United States of America,

    109(16):6142-6, April 2012. ISSN 1091-6490.

    Jeffrey D Stumpf, Anthony R Poteete, and Patricia L Foster. Amplification of lac cannot

    account for adaptive mutation to Lac+ in Escherichia coli. Journal of bacteriology, 189(6):

    2291-9, March 2007. ISSN 0021-9193.


  • D Thierens. Adaptive mutation rate control schemes in genetic algorithms. Evolutionary

    Computation, 2002. CEC '02. Proceedings of the 2002 Congress on, 1:980-985, 2002.

    Fabio Ticconi. StrEGA: Stress-driven EpiGenetic Algorithm. Technical report, University of

    Sussex, 2011.

    Elio Tuci, M Quinn, and I Harvey. An evolutionary ecological approach to the study of learning

    behavior using a robot-based model. Adaptive Behavior, 10(3-4):201-221, 2002.

    August Weismann. The Germ-Plasm: A Theory of Heredity. Charles Scribner's Sons, 1893.

    Mahmoud W Yaish, Joseph Colasanti, and Steven J Rothstein. The role of epigenetic processes

    in controlling flowering time in plants exposed to stress. Journal of experimental botany, 62

    (11):3727-35, July 2011. ISSN 1460-2431.

    Yongzhong Zhao and Richard J Epstein. Programmed genetic instability: a tumor-permissive

    mechanism for maintaining the evolvability of higher species through methylation-dependent

    mutation of DNA repair genes in the male germ line. Molecular biology and evolution, 25(8):

    1737-49, August 2008. ISSN 1537-1719.


  • Appendix A

    Additional Plots

    A.1 Simple sources switch


  • A.1.1 Static environment

    (a) Mutation rate 0.01

    (b) Mutation rate 0.02

    (c) Mutation rate 0.04

    Figure A.1: Each of the plots represents the values of the best (red line) and mean fitness, with the error bars calculated

    using standard error (values taken from 10 repetitions of each experiment). The environment is static for the whole 100

    generations.
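
    The per-generation averages and standard-error bars described in the caption can be computed

    along the following lines. This is a sketch under assumptions: the function names are

    illustrative, and the use of the sample standard deviation (n - 1 denominator) is a common

    convention that the text does not confirm.

```python
import math
import statistics


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for one generation.

    The standard error is the sample standard deviation divided by sqrt(n).
    """
    n = len(values)
    if n < 2:
        raise ValueError("at least two replications are needed")
    mean = statistics.fmean(values)
    sem = statistics.stdev(values) / math.sqrt(n)
    return mean, sem


def summarise_runs(runs):
    """Summarise replications generation by generation.

    runs is a list of fitness curves, one per replication, all of equal
    length. Returns one (mean, standard error) pair per generation.
    """
    return [mean_and_standard_error(per_generation) for per_generation in zip(*runs)]


if __name__ == "__main__":
    # Three made-up replications over two generations (illustrative data only).
    runs = [[0.2, 0.5], [0.4, 0.7], [0.3, 0.6]]
    for generation, (mean, sem) in enumerate(summarise_runs(runs)):
        print(generation, round(mean, 3), round(sem, 3))
```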


  • A.1.2 Switch

    Mutation rate 0.01

    See Figure 4.1.

    Mutation rate 0.02

    (a) Best fitness (b) Mean fitness

    Figure A.2: Plot of the evolutionary performance after the last generation of the static environment (evolved with

    mutation rate 0.02) has been put in a completely switched environment (what was food is poison and vice versa). The

    red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic

    with dynamic brood size. In Figure A.2a there is the plot of the best fitness for each generation, while in Figure

    A.2b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment

    replications, for each of the three groups.


  • Mutation rate 0.04

    (a) Best fitness (b) Mean fitness

    Figure A.3: Plot of the evolutionary performance after the last generation of the static environment (evolved with

    mutation rate 0.04) has been put in a completely switched environment (what was food is poison and vice versa). The

    red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic

    with dynamic brood size. In Figure A.3a there is the plot of the best fitness for each generation, while in Figure

    A.3b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment

    replications, for each of the three groups.


  • A.1.3 Periodic change

    Medium sinusoidal

    See Figure 4.2.

    Medium sinusoidal, Kill on Energy

    (a) Best fitness (b) Mean fitness

    Figure A.4: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics

    with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,

    the medium sinusoidal (Figure 3.1b) is used and kill on energy is activated. In Figure A.4a there is the plot of the

    best fitness for each generation, while in Figure A.4b there is the plot of the mean fitness for each generation. In both

    cases, the values are averaged over 10 experiment replications, for each of the five groups.


  • Hard sinusoidal

    (a) Best fitness (b) Mean fitness

    Figure A.5: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics

    with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,

    the hard sinusoidal (Figure 3.1c) is used and kill on energy is not activated. In Figure A.5a there is the plot of the

    best fitness for each generation, while in Figure A.5b there is the plot of the mean fitness for each generation. In both

    cases, the values are averaged over 10 experiment replications, for each of the five groups.


  • Hard sinusoidal, Kill on Energy

    (a) Best fitness (b) Mean fitness

    Figure A.6: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics

    with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,

    the hard sinusoidal (Figure 3.1c) is used and kill on energy is activated. In Figure A.6a there is the plot of the best

    fitness for each generation, while in Figure A.6b there is the plot of the mean fitness for each generation. In both cases,

    the values are averaged over 10 experiment replications, for each of the five groups.


  • A.2 Simple sources switch with water

    A.2.1 Static environment

    (a) Mutation rate 0.01

    (b) Mutation rate 0.02

    (c) Mutation rate 0.04

    Figure A.7: Each of the plots represents the values of the best (red line) and mean fitness, with the error bars calculated

    using standard error (values taken from 10 repetitions of each experiment). The environment is static for the whole 100

    generations.


  • A.2.2 Switch

    Mutation rate 0.01

    (a) Best fitness (b) Mean fitness

    Figure A.8: Plot of the evolutionary performance after the last generation of the static environment (evolved with

    mutation rate 0.01) has been put in a completely switched environment (what was food is poison and vice versa). The

    red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic

    with dynamic brood size. In Figure A.8a there is the plot of the best fitness for each generation, while in Figure

    A.8b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment

    replications, for each of the three groups.


  • Mutation rate 0.02

    (a) Best fitness (b) Mean fitness

    Figure A.9: Plot of the evolutionary performance after the last generation of the static environment (evolved with

    mutation rate 0.02) has been put in a completely switched environment (what was food is poison and vice versa). The

    red line is the non-epigenetic approach, the green line is with epigenetics and the blue line is epigenetic

    with dynamic brood size. In Figure A.9a there is the plot of the best fitness for each generation, while in Figure

    A.9b there is the plot of the mean fitness for each generation. In both cases, the values are averaged over 10 experiment

    replications, for each of the three groups.


  • Mutation rate 0.04

    See Figure 4.3.


  • A.2.3 Periodic change

    Medium sinusoidal

    (a) Best fitness (b) Mean fitness

    Figure A.10: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics

    with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,

    the medium sinusoidal (Figure 3.1b) is used and kill on energy is not activated. In Figure A.10a there is the plot of

    the best fitness for each generation, while in Figure A.10b there is the plot of the mean fitness for each generation. In

    both cases, the values are averaged over 10 experiment replications, for each of the five groups.


  • Medium sinusoidal, Kill on Energy

    See Figure 4.4.

    Hard sinusoidal

    (a) Best fitness (b) Mean fitness

    Figure A.11: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics

    with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,

    the hard sinusoidal (Figure 3.1c) is used and kill on energy is not activated. In Figure A.11a there is the plot of the

    best fitness for each generation, while in Figure A.11b there is the plot of the mean fitness for each generation. In both

    cases, the values are averaged over 10 experiment replications, for each of the five groups.


  • Hard sinusoidal, Kill on Energy

    (a) Best fitness (b) Mean fitness

    Figure A.12: Five systems are compared here: normal (red), epigenetics (green), dynamic brood size (blue), epigenetics

    with add stress (violet), epigenetics with add stress and dynamic brood size (sky blue). The base mutation rate is 0.01,

    the hard sinusoidal (Figure 3.1c) is used and kill on energy is activated. In Figure A.12a there is the plot of the best

    fitness for each generation, while in Figure A.12b there is the plot of the mean fitness for each generation. In both cases,

    the values are averaged over 10 experiment replications, for each of the five groups.


