Research Project Report
An overview of Behavioural-based Robotics
with simulated implementations
on an Underwater Vehicle.
presented by:
Marc Carreras i Pérez
supervisor:
Dr. Joan Batlle i Grabulosa
doctoral programme:
Informàtica Industrial / Tecnologies Avançades de Producció
Institut d'Informàtica i Aplicacions
Universitat de Girona
Girona, 14 July 2000
INDEX
1 Introduction
2 Behavioural-based Robotics versus Traditional AI
3 Fundamentals of Behaviour-based Robotics
   3.1 Principles
   3.2 Expression of behaviours
   3.3 Behavioural choice and design
   3.4 Behavioural encoding
   3.5 Assembling behaviours
   3.6 Adaptive Behaviour-based Robotics
4 Behaviour-based Approaches
   4.1 Subsumption architecture
      4.1.1 Description
      4.1.2 Implementation
   4.2 Action Selection Dynamics
      4.2.1 Description
      4.2.2 Implementation
   4.3 Motor Schemas approach
      4.3.1 Description
      4.3.2 Implementation
   4.4 Process Description Language
      4.4.1 Description
      4.4.2 Implementation
   4.5 Comparison
5 Conclusions and Future Work
   5.1 Conclusions
   5.2 Future work
References
Appendix A. Simulation of missions with an AUV
   A.1 The control architecture
   A.2 The low-level controllers
   A.3 Model of an Autonomous Underwater Vehicle
   A.4 Underwater environment
Appendix B. Published results
   B.1 MCMC 2000
   B.2 CCIA 2000
   B.3 Q&A-R 2000
1 Introduction
This investigative report presents the work carried out between November 1999
and July 2000 in the Robotics and Computer Vision group at the University of
Girona.
This report concentrates on control architectures for autonomous
robots. A control architecture is the part of a control system in an autonomous
robot which is in charge of proposing correct actions in order to achieve its mission.
Concretely, this report focuses on Behaviour-based Robotics, a methodology for
building control architectures. The intention of this report is not to survey the state
of the art in this field, but rather it tries to give an overall description of the subject
and to show implementations of different behaviour-based approaches. In
particular, four architectures have been implemented to achieve a simple mission
in a computer-simulated environment. These behaviour-based architectures are
Subsumption, Action Selection Dynamics, Motor Schema and Process Description
Language. The purpose of this report is to illustrate the field of Behaviour-based
Robotics through theoretic aspects and representative implemented examples.
The presented work is part of an investigative project based on underwater
robotics. The research line started in 1992 and has been supported by different
research projects of the Spanish Government. At the moment a control architecture
is being implemented to transform our Remotely Operated Vehicle, GARBI (Amat
et al, 1996), into an Autonomous Underwater Vehicle (AUV). The control
architecture, called OOCAA (Object Oriented Control Architecture for Autonomy)
(Ridao et al, 2000), was designed as a hybrid architecture (refer to section 2)
containing aspects of behaviour-based robotics in its lower layer. The work
presented in this report is integrated with the research project in different aspects.
As an immediate contribution, the underwater-simulated environment, specially
developed for this investigative work, is used as a tool to design and test new parts
of the OOCAA before implementing them in the robot. In the medium term, the completion of the PhD, which begins with this investigative phase, should improve the control performance of GARBI.
With the purpose of contributing to the investigation on underwater robotics,
implementations were carried out using an AUV model and a three-dimensional
simulated environment. However, the overview of Behaviour-based Robotics was
dealt with from a general point of view concerning autonomous vehicles. The four
chosen architectures are suitable for any other kind of robotic vehicle as well. Another important point to note is that this investigation is done with a single robot; for this reason, all behaviour-based concepts concerning multi-agent systems have been omitted.
This investigative report is structured as follows. Section 2 reviews the history of
control architectures for autonomous robots, starting with traditional methods of
Artificial Intelligence applied to robot control. The most important facts are reviewed up to the inception of the field of Behaviour-based Robotics. Once
this field has been introduced, section 3 summarises the most important principles
and characteristics of Behaviour-based systems. Section 4 deals with the four
compared architectures. All the architectures are firstly described as they were
originally designed, then an implemented example is shown. Simulation results
and discussion of positive and negative aspects are given. The final part of this
section compares the four architectures. Section 5 concludes the report and
proposes future work to be done in order to start the PhD. An overview of the
thesis proposal is described. Finally, appendixes complete the report with the
description of the simulated missions with an AUV (Appendix A) and the related
publications of this investigative work (Appendix B).
2 Behavioural-based Robotics versus Traditional AI
The first attempt at building autonomous robots began around the mid-twentieth
century with the emergence of Artificial Intelligence. The approach begun at that
time is now referred to as “Traditional AI”, “Classical AI”, “Deliberative approach”
or “Hierarchical approach”. Traditional AI relies on a centralised world model for
verifying sensory information and generating actions in the world. The design of
the control architecture is based on a top-down philosophy. The robot control
architecture is broken down into an orderly sequence of functional components
(Brooks, 1991) and the user formulates explicit tasks and goals for the system, see
Figure 1:
Figure 1. Functional decomposition of a Traditional AI control architecture: Perception → Modeling → Planning → Task execution → Motor control, connecting sensors to actuators.
1. Perception. In the first component, a sensor interpreter resolves noise and
conflicts in the sensory data. Perception algorithms are used to find
characteristics and objects within the environment.
2. Modeling. With the data obtained from perception, the world-modeling
component builds a symbolic representation of the world. This representation
contains geometric details of all objects in the robot’s world with their positions
and orientation.
3. Planning. The planner then operates on these symbolic descriptions of the world and produces a sequence of actions to achieve the goals given by the users.
4. Task execution. This function controls the execution of the planned tasks
giving the set-points to each actuator.
5. Motor control. This control system is used to control the actuators in
accordance with the set-points.
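As an illustrative sketch of this five-stage pipeline, the stages can be chained as a strictly sequential loop. All function bodies below are invented stubs used only to show the data flow, not code from any actual system:

```python
# Minimal sketch of the sequential sense-model-plan-act cycle of
# Traditional AI. Every stage function is an invented stub; only the
# strict ordering of the components is the point.

def perceive(sensor_readings):
    # 1. Perception: resolve noise, keep usable readings.
    return [r for r in sensor_readings if r is not None]

def build_model(percepts):
    # 2. Modeling: build a (here trivially) symbolic world model.
    return {"objects": percepts}

def make_plan(world, goal):
    # 3. Planning: produce a sequence of actions towards the goal.
    if goal in world["objects"]:
        return ["turn_towards", "move_forward"]
    return ["explore"]

def control_cycle(sensor_readings, goal):
    # 4./5. Task execution and motor control would run this plan,
    # sending set-points to each actuator in order.
    world = build_model(perceive(sensor_readings))
    return make_plan(world, goal)
```

Every control cycle repeats the whole chain, which is precisely what makes the approach slow in dynamic environments.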
The principal characteristics which all Traditional AI approaches have in common are:
• Hierarchical structure. The main goals are divided into different tasks, sub-
tasks, etc, in a hierarchical manner. Higher levels in the hierarchy provide sub-
goals for lower subordinate levels. The tasks are accomplished using a top-down
methodology.
• Sequential processing. These processes are executed in serial form starting
with the sense activities, moving through the modeling, and planning, ending
with the actuation.
• Symbolic planner. The planner reasons, basing itself on a symbolic world
model. The world must be generated linking the physical perceptions to the
corresponding symbols.
• Functionally compartmentalised. There is a clear subdivision of the different tasks which must be carried out. Each component in the control architecture is in charge of only one of these functions.
In the 1970’s one of the earliest autonomous robots, called Shakey (Fikes &
Nilsson, 1971) (Nilsson, 1984), was built using a deliberative control architecture.
Shakey inhabited a set of specially prepared rooms. It navigated from room to
room, trying to satisfy a given goal. Experiments with this robot revealed new difficulties: the planning algorithms failed on non-trivial problems, the integration of the world representations was extremely difficult and, finally, the planning did not work as well as had been hoped. The algorithms were improved and better results were obtained. However, the environment had been totally adapted to the
robot’s perceptive requirements. Many other robotic systems have been built with
the traditional AI approach, some of the best known are (Albus, 1991), (Huang,
1996), (Lefebvre & Saridis, 1992), (Chatila & Laumond, 1985) and (Laird &
Rosenbloom P.S., 1990). Nevertheless, traditional approaches still have problems
when dealing with complex, non-structured and changing environments. Only in
structured and highly predictable environments have they proved to be suitable.
The principal problems found in Traditional AI can be listed as:
• Computation. Traditional AI requires large amounts of data storage and
intense computation. For an autonomous mobile robot this can be a serious
problem.
• Real time processing. The real world has its own dynamics and for this
reason systems must react fast enough to perform their tasks. Most often,
traditional AI isn’t fast enough because the information is processed centrally.
Modeling and planning are long sequential processes, and the longer they take,
the more the world will have changed by the time the robot decides to act. The agent needs many simple, parallel processes instead of a few long, sequential ones.
• Robustness and Generalization. Traditional AI usually lacks generalization capacity. If a novel situation arises, the system breaks down or stops altogether. It also does not take into consideration the problems of noise in the sensory data and actuators when building its symbolic representation.
• The accurate world model. In order to plan correctly, the world model must
be very accurate. This requires high-precision sensors and careful calibration,
both of which are very difficult and expensive.
• The Frame problem. This problem arises when trying to maintain a model of
a continuously changing world. If the autonomous robot inhabits a real world,
the objects will move and the light will change. In any event, the planner needs
a model with which to plan.
• The Symbol-grounding problem. The world model uses symbols, such as
“door”, “corridor”, etc, which the planner can use. The Symbol-grounding
problem refers to how the symbols are related to real world perceptions. The
planner is enclosed in a symbolic world model while the robot acts in the open real world.
In the middle of the 1980s, due to dissatisfaction with the performance of robots in
dealing with the real world, a number of scientists began rethinking the general
problem of organizing intelligence. Among the most important opponents to the AI
approach were Rodney Brooks (Brooks, 1986), Rosenschein and Kaelbling
(Rosenschein, 1986) and Agre and Chapman (Agre & Chapman, 1987). They
criticized the symbolic world which Traditional AI used and wanted a more
reactive approach with a strong relation to the perceived world and the actions therein. They implemented these ideas using a network of simple computational
elements indirectly connecting sensors to actuators in a distributed manner. There
were no central models of the world explicitly represented. The model of the world
used was the real one perceived by the sensors at each moment. Leading the new
paradigm, Brooks proposed the “Subsumption Architecture” which was the first
approach to the new field of “Behaviour-based Robotics”.
Instead of the top-down approach of Traditional AI, Behaviour-based systems use a
bottom-up philosophy like that in Reactive Robotics. Reactive systems provide
rapid real-time responses using a collection of pre-programmed rules. Reactive systems are characterized by fast stimulus-response connections; however, as they do not have any kind of internal state, they are incapable of using internal representations to deliberate or learn new behaviours. On the other hand, Behaviour-based systems
can store states in a distributed representation, allowing a certain degree of high-
level deliberation.
The Behaviour-based approach uses a set of simple parallel behaviours which react
to the perceived environment proposing the response the robot must take in order
to accomplish the behaviour (see Figure 2). There are neither problems of world
modeling nor of real time processing. Nevertheless, another difficulty has to be solved: how to select the proper behaviours for robustness and efficiency in accomplishing goals. Two new questions, which Traditional AI does not consider, also appeared: how to adapt the architecture in order to improve its goal-achievement, and how to adapt it when new situations appear. This simple
but powerful methodology was in great contrast to Traditional AI and, from its
beginning, provided for simplicity, parallelism, perception-action mapping and real
implementations.
Figure 2. A behaviour-based architecture: parallel behaviours (avoid obstacles, go to point, explore, manipulate the world) connecting sensors to actuators.
Behaviour-based Robotics has been widely used and investigated since then. This new field has attracted researchers from many diverse disciplines: biologists, neuroscientists, philosophers, linguists, psychologists and, of course, people working in computer science and artificial intelligence, all of whom have found practical uses for this approach in their various fields of endeavour. The field of Behaviour-based Robotics has also been referred to as the “New AI” or, under a more appropriate denomination for the fields mentioned above, “Embodied Cognitive Science”.
Figure 3. A hybrid architecture structured in three layers between sensors and actuators: a deliberative layer, a control execution layer and a reactive layer.
Finally, some researchers have adopted a hybrid approach between Traditional AI
and Behaviour-based Robotics. Hybrid systems attempt a compromise between
bottom-up and top-down methodologies. Usually the control architecture is
structured in three layers: the deliberative layer, the control execution layer and
the functional reactive layer (see Figure 3). The deliberative layer transforms the
mission into a set of tasks which form a plan. The reactive behaviour-based
system takes care of the real time issues related to the interactions with the
environment. The control execution layer interacts between the upper and lower
layers, supervising the accomplishment of the tasks. Hybrid architectures take
advantage of the hierarchical planning aspects of Traditional AI and the reactive
and real time aspects of behavioural approaches. Hybrid architectures have been
widely used. Some of the best known are AuRA (Arkin, 1986), the Planner-Reactor
Architecture (Lyons, 1992) and Atlantis (Gat, 1991) used in the Sojourner Mars
explorer.
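Schematically, the three layers can be sketched as nested function calls. The layer interfaces below are invented placeholders for illustration; they do not reproduce the OOCAA design:

```python
# Toy sketch of a three-layer hybrid architecture. The interfaces are
# invented for illustration and do not reproduce OOCAA.

def deliberative_layer(mission):
    # Transform the mission into an ordered plan of tasks.
    return list(mission)

def reactive_layer(task, percept):
    # Behaviour-based real-time response for the current task.
    return f"{task}:{percept}"

def control_execution_layer(mission, percepts):
    # Supervise task accomplishment, passing each task down to the
    # reactive layer as percepts arrive from the environment.
    actions = []
    for task, percept in zip(deliberative_layer(mission), percepts):
        actions.append(reactive_layer(task, percept))
    return actions
```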
3 Fundamentals of Behaviour-based Robotics
Behaviour-based Robotics is a methodology for designing autonomous agents and
robots. The behaviour-based methodology is a bottom-up approach inspired by
biology, a collection of behavioural acts in parallel achieving goals. Behaviours are
implemented as a control law using inputs and outputs. They can also store states
constituting a distributed representation system (Mataric, 1999). The basic
structure consists of all behaviours taking inputs from the robot’s sensors and
sending outputs to the robot’s actuators (see Figure 4). A coordinator is needed in order to send only one command at a time to the motors.
Figure 4. Basic behaviour-based structure: stimuli from the sensors feed Behaviour 1 … Behaviour n in parallel; a coordinator merges their outputs into a single command for the actuators.
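The structure of Figure 4 can be sketched as follows. The behaviour names and the priority-based coordinator rule are illustrative assumptions, not part of the report:

```python
# Minimal sketch of the structure in Figure 4: parallel behaviours map
# a stimulus to a proposed response; a coordinator picks one command.
# Behaviour names and the priority rule are illustrative assumptions.

def avoid_obstacles(stimulus):
    # Propose a turn only when an obstacle is sensed.
    return "turn_away" if stimulus.get("obstacle") else None

def go_to_point(stimulus):
    # Head towards the goal whenever one is given.
    return "head_to_goal" if stimulus.get("goal") else None

BEHAVIOURS = [avoid_obstacles, go_to_point]  # ordered by priority

def coordinator(stimulus):
    # Send only one command at a time: first active behaviour wins.
    for behaviour in BEHAVIOURS:
        response = behaviour(stimulus)
        if response is not None:
            return response
    return "stop"
```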
The internal structure of a behaviour can itself be composed of different modules connected to sensors, to other modules and, finally, to the coordinator (Brooks, 1986). However, behaviours must be completely independent of each
other. The global structure is a network of interacting behaviours comprising low-
level control and high-level deliberation abilities. The latter is performed by the distributed representation, which can contain states and consequently change the behaviour according to its information.
The parallel structure of simple behaviours allows a real time response with low
computational cost. Autonomous robots using this methodology can be built easily and at low cost. Behaviour-based Robotics has demonstrated reliable performance
in standard robotic activities such as navigation, obstacle avoidance, terrain
mapping, object manipulation, cooperation, learning maps and walking. (Arkin,
1998) and (Pfeifer & Scheier, 1999) are two principal references which overview the
field.
3.1 Principles
There are a few basic principles which have been used by all researchers in Behaviour-based Robotics. These principles provide the keys to the success of the methodology.
• Parallelism. Behaviours are executed concurrently. Each one can run on its
own processor. Parallelism appears at all levels, from behavioural design to
software and hardware implementation. This characteristic contributes to the
speed of computation and consequently to the dynamics between the robot and
the environment.
• Modularity. The system is organised into different modules (behaviours). The
important fact is that each module must run independently. This important
principle contributes to the robustness of the system. If, for example, one behaviour fails due to the breakdown of a sensor, the others will continue running and the robot will always remain under control. Another important
consequence of modularity is the possibility of building the system
incrementally. In the design phase, the priority behaviours will first be
implemented and tested. Once they run correctly, more behaviours will be
added to the system.
• Situatedness/Embeddedness. “Situatedness” means that the robot is situated in and surrounded by the real world. For this reason it must not operate on an abstract representation of reality; it must use the world as it is actually perceived. “Embeddedness” refers to the fact that the robot exists as a physical entity in the real world. This implies that the robot is subjected to physical forces, damage and, in general, to any influence from the environment. The robot should not try to model these influences or plan around them; instead it should use this system-environment interaction to act and react with the same dynamics as the world.
• Emergence. This is the most important principle of Behaviour-based Robotics.
It is based on the principles explained above and attempts to explain why the
set of parallel and independent behaviours can arrive at a composite behaviour
for the robot to accomplish the expected goals. Emergence is the property which
results from the interaction between the robotic behavioural system and the
environment. Due to emergence, the robot exhibits behaviours that were never explicitly pre-programmed or pre-designed. Numerous
researchers have talked about emergence. Two examples are “Intelligence
emerges from interaction of the components of the system” (Brooks, 1991) and
“Emergence is the appearance of novel properties in whole systems” (Moravec,
1988).
3.2 Expression of behaviours
There are several ways with which to express a robotic behaviour:
1. Stimulus-Response Diagrams
Behaviours are represented using Stimulus(input)-Response(output) (SR) blocks as
shown in Figure 5. Behaviours are arranged in a parallel structure and their outputs are channelled into a coordination mechanism which produces an appropriate response (Figure 4). Stimulus-Response diagrams are the most intuitive method of expressing behaviours.
Figure 5. A Stimulus-Response (SR) block: a stimulus enters the behaviour, which produces a response.
2. Functional Notation
Behaviours can also be expressed as mathematical functions. A coordination function then evaluates the individual behaviours and generates the response which will be transmitted to the actuators:

b(s) = r, where b : behaviour, s : stimulus, r : response

coordinate[ b1(s1), b2(s2), ..., bn(sn) ] = response
3. Finite State Acceptor Diagrams
Finite State Acceptors (FSA) (Arbib et al, 1981) are used to describe aggregations
and sequences of behaviours during the accomplishment of some high-level goals.
They make the active behaviours and the transitions between them explicit. A
finite state acceptor M is specified by M = (Q, δ, q0, F), where:

Q : set of allowable behavioural states
δ : transition function between behavioural configurations
q0 : starting behavioural configuration
F : set of accepting states
The finite state acceptor can be diagrammed as a set of circles representing behaviours, with arrows indicating the stimuli. Stimuli change the active behaviour; accepting states are represented by a double circle. Figure 6 shows a simple example of an FSA with states A, B and C, transition function δ(A, i1) = A, δ(A, i2) = B, δ(B, i3) = B, δ(B, i4) = C, δ(C, i5) = C, and M = ({A, B, C}, δ, A, {B, C}).
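The FSA of Figure 6 can be encoded directly as a transition table. The code below is an illustrative sketch following the figure, not code from the report:

```python
# Sketch of the FSA of Figure 6 as a transition table. States and
# inputs follow the figure; the encoding itself is illustrative.

TRANSITIONS = {
    ("A", "i1"): "A", ("A", "i2"): "B",
    ("B", "i3"): "B", ("B", "i4"): "C",
    ("C", "i5"): "C",
}
START = "A"
ACCEPTING = {"B", "C"}  # double-circled states in the figure

def run(inputs):
    # Follow the stimuli from the start state; report the final
    # behavioural state and whether it is an accepting one.
    state = START
    for stimulus in inputs:
        state = TRANSITIONS[(state, stimulus)]
    return state, state in ACCEPTING
```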
3.3 Behavioural choice and design
There are many approaches to choosing and designing the behaviours which must appear in the architecture. Three of the best known are described next.
1. Ethologically guided/constrained design
This method is based on animal behaviour. Behaviours are designed after
consulting biological literature searching for animal behaviours which can be used
for a robotic system. Then, the animal model which accomplishes the desired
behaviour is translated to a more suitable behaviour system for the robot. The
schema-based robotic navigational system was invented using this methodology
based on the navigational behaviour of the toad (Arbib & House, 1987). Figure 7
shows the design methodology for ethologically guided systems.
Figure 7. Ethologically guided design methodology: consult ethological literature, extract a model, import the model to the robot, run robotic experiments and evaluate the results; if needed, enhance the model or guide new biological experiments.
2. Situated activity-based design
A robot’s actions are predicated on the situations in which it may find itself. The design requires a solid understanding of the relationship between the robotic agent and its environment; all possible situations must then be related to a behaviour. The perception problem is reduced to recognising the situation in which the robot
finds itself at any given moment. Different projects can be found in (Agre &
Chapman, 1987) and (Schoppers, 1987). Figure 8 shows the situated activity design
methodology.
Figure 8. Situated activity design methodology: assess agent-environment dynamics, partition into situations, create situational responses, import the behaviours to the robot, run robotic experiments and evaluate the results; if needed, enhance, expand or correct the behavioural responses.
3. Experimentally driven design
The basic premise is to build a set of abilities, make a trial run in the real world,
debug imperfect behaviours and add to the behavioural system. This is a trial and
error approach. By iterative repetition of this process, the behaviour-based
architecture is built. This method has been widely used by such people as Brooks
(Brooks, 1989) and Payton (Payton et al, 1992), among others. Figure 9 shows the
experimentally driven design methodology.
Figure 9. Experimentally driven design methodology: build a minimal system, exercise the robot, evaluate the results and add new behavioural competences until done.
3.4 Behavioural encoding
To encode the behavioural response we must create a functional mapping from the stimulus plane to the motor plane. The motor plane usually has two parameters: the strength (magnitude of the response) and the orientation (direction of the action). A behaviour can be expressed as (S, R, β) where:
• S: Stimulus Domain
S is the domain of all perceivable stimuli. Each stimulus “s” is represented by
s(p,λ), where “p” is the class of perception and “λ” is its strength. Each class “p” has a threshold value τ above which a response is generated.
• R: Range of Responses
For autonomous vehicles with six degrees of freedom, the response r ∈ R of a
behaviour is a six-dimensional vector: r = [x, y, z, θ, φ, ψ] composed of the three
translation degrees of freedom and the three rotational degrees of freedom.
Each parameter is composed of strength and orientation values. When there are
different responses ri, the final response is ri ’= gi · ri , where gi is a gain which
specifies the strength of the behaviour relative to the others.
• β: Behavioural Mapping
The mapping function β relates the stimulus domain with the response range for each individual active behaviour: β(s) → r. This mapping function generates a response only when λ > τ. Behavioural mappings β can be:
• Null: the stimulus produces no motor response.
• Discrete: a countable set of responses. Examples of this kind of mapping can be found in (Bonasso, 1992) or in the Subsumption architecture (Brooks, 1986).
• Continuous: an infinite space of potential reactions. Examples can be found in potential fields (Khatib, 1985) or spin fields (Slack, 1990).
3.5 Assembling behaviours
When we combine and coordinate multiple behaviours, the emergent behaviour appears. This is the product of the interaction between the robotic system and the real world. The coordination function is

ρ = C(G * B(S))

where:
ρ : six-dimensional final vector response
C : coordination function
G : relative strengths of the behaviours
* : element-by-element product
B : behaviours
S : stimuli
The two primary coordination mechanisms are:
• Competitive methods. The output is the selection of a single behaviour. The
coordinator chooses only one behaviour to control the robot. Depending on
different criterions the coordinator determines which behaviour is best for the
control of the robot. Preferable methods are suppression networks such as
Subsumption architecture (Brooks, 1986), action-selection (Maes, 1990) and
voting-based coordination (Rosenblatt & Payton, 1989).
• Cooperative methods. The output is the superposition of the force gradients
given by all the behaviours. The coordinator applies a method which takes all the behavioural responses and generates an output which will control the robot.
Behaviours which generate a stronger output will impose a greater influence on
the final behaviour of the robot. Principal methods are vector summation
(potential fields, (Khatib, 1985)) and behavioural blending (Saffiotti et al, 1995).
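The two coordination mechanisms can be sketched for two-dimensional response vectors. The behaviour outputs and gains below are invented for illustration:

```python
# Sketch of the two coordination mechanisms for 2-D response vectors
# (x, y). Behaviour outputs and gains are invented for illustration.

def competitive(responses, priorities):
    # Competitive: select the single response of the highest-priority
    # behaviour that is active (non-None); all others are suppressed.
    for name in priorities:
        if responses.get(name) is not None:
            return responses[name]
    return (0.0, 0.0)

def cooperative(responses, gains):
    # Cooperative: gain-weighted vector summation of all responses,
    # as in potential-field methods.
    x = sum(gains[n] * r[0] for n, r in responses.items() if r)
    y = sum(gains[n] * r[1] for n, r in responses.items() if r)
    return (x, y)
```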
Basic behaviour-based structures use only a coordinator which operates using all
the behaviours to generate the robot’s response. However, there are more complex
systems with different groups of behaviours coordinated by different coordinators.
Each group generates an output and, from these intermediate outputs, the final response of the robot is generated through a final coordinator. These recursive
structures are used in high level deliberation. By means of these structures a
distributed representation can be made and the robot can behave differently
depending on internal states, achieving multiphase missions.
3.6 Adaptive Behaviour-based Robotics
One of the fields associated with Behaviour-based Robotics is adaptation. Intelligence cannot be understood without adaptation. If a robot requires autonomy and robustness, it must adapt itself to the environment. The primary reasons for autonomous adaptivity are:
• The robot’s programmer doesn’t know all the parameters of the behaviour-
based system.
• The robot must be able to perform in different environments.
• The robot must be able to perform in changing environments.
The properties which adaptive systems in robotics must exhibit are (Kaelbling, 1999):
• Tolerance to sensor noise.
• Adjustability. A robot must learn continuously while performing its task in the
environment.
• Suitability. Learning algorithms must be adequate for all kinds of
environments.
• Strength. The adaptive system must possess the ability to drive the robot to a desired place in order to obtain the desired data.
However, adaptation is a broad term. According to (McFarland, 1991) there are various levels of adaptation in a behaviour-based system:
• Sensory Adaptation. Sensors become more attuned to the environment and
changing conditions of light, temperature, etc.
• Behavioural Adaptation. The individual behaviours are adjusted relative to
the others.
• Evolutionary Adaptation. This adaptation is done over a long period of time
inducing change in individuals of one species, in this case robots. Descendants
change their internal structure based on the success or failure of their ancestors
in the environment.
• Learning as Adaptation. The robot learns different behaviours or different
coordination methods which improve its performance.
Using the mathematical notation explained in sections 3.2 and 3.4 we can see
where adaptation can be applied to a simple behaviour:
ri = gi · bi (si)
1. The mapping function bi, which relates the stimulus si to the response ri.
2. The magnitude of the response which is controlled by the gain gi.
3. The necessity of a new behaviour i.
In the coordination function, from section 3.5, there are also some terms which can
be adapted:
ρ = C ( G * B (S))
4. The set of behaviours bi which constitute an assemblage B, in case the
behaviour-based system has more than one coordination phase.
5. The relative strengths of each behaviour response G.
6. The coordination function C.
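As a toy illustration of item 5 (adapting the relative strengths G), the gains could be nudged by a scalar performance signal. The update rule and learning rate below are invented examples, not a method from the report:

```python
# Toy illustration of adapting the relative strengths G (item 5):
# nudge each active behaviour's gain with a scalar performance
# signal. The update rule and rate are invented for illustration.

def adapt_gains(gains, active, reward, rate=0.1):
    # Increase the gains of behaviours active during good outcomes,
    # decrease them after bad ones; keep gains non-negative.
    return {
        name: max(0.0, g + rate * reward) if name in active else g
        for name, g in gains.items()
    }
```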
As can be seen, many parameters can be adapted in a behaviour-based robotic
system. At the moment there are still few examples of real robotic systems which learn to behave (Kaelbling, 1999), nor is there an established methodology for developing adaptive behaviour-based systems. The two approaches most commonly
used are Reinforcement learning and Evolutionary techniques (Ziemke, 1998;
Arkin, 1998; Dorigo & Colombetti, 1998; Pfeifer & Scheier, 1999). Both have
interesting characteristics but also disadvantages like the convergence time or the
difficulties in finding a reinforcement or fitness function respectively. In many
cases they are implemented over control architectures based on neural networks.
Using the adaptive methodologies the weights of the network are modified until an
optimal response is obtained. The two approaches have demonstrated the
feasibility of the theories in real robots in all levels of adaptation. The basic ideas of
the two methodologies are next described.
• Reinforcement learning.
Reinforcement learning is a class of learning algorithm in which a scalar
evaluation (reward or punishment) of the performance of the control architecture
is available from the interaction with the environment. The goal of an RL
algorithm is to maximise the expected evaluation by adjusting the parameters of
the control architecture; this adjustment determines the control policy being
applied. The evaluation is generated by a critic using a utility function. Generally
this utility function is unknown and must also be learned. The evaluation can be
returned immediately or after a delay; in the latter case the reinforcement has to
be distributed over the current and past control signals. The two predominant
algorithms in RL are the Adaptive Heuristic Critic (AHC) and Q-learning. In
AHC, the process of learning the decision policy is separated from learning the
utility function of the critic, whereas in Q-learning a single "Q" function is learned
to evaluate both. This Q function denotes the total reinforcement obtained by
choosing an action first and then following the policy in future time steps.
Currently, Q-learning dominates adaptive behaviour-based architectures using
reinforcement learning. For a complete introduction to RL refer to (Kaelbling,
1996).
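The core of Q-learning is a single update rule. The following C sketch shows one tabular update step; the state and action sets, learning rate and discount factor are illustrative assumptions, not parameters of any specific robotic system.

```c
#define N_STATES  4
#define N_ACTIONS 2

/* Q-table, zero-initialised. One update applies:
   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)) */
static double Q[N_STATES][N_ACTIONS];

double max_q(int state) {
    /* best estimated return achievable from 'state' */
    double best = Q[state][0];
    for (int a = 1; a < N_ACTIONS; a++)
        if (Q[state][a] > best) best = Q[state][a];
    return best;
}

void q_update(int s, int a, double reward, int s_next,
              double alpha, double gamma) {
    /* move Q(s,a) towards the bootstrapped target */
    Q[s][a] += alpha * (reward + gamma * max_q(s_next) - Q[s][a]);
}
```

Repeating this update while the robot interacts with its environment drives the Q-table towards the expected total reinforcement of each state-action pair.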
• Evolutionary Robotics.
Evolutionary learning techniques are inspired by the mechanisms of natural
selection, the principal method being Genetic Algorithms (GAs). Evolutionary
algorithms typically start from a randomly initialised population of individuals
(genotypes) encoded as strings of bits or real numbers. Each individual encodes
the control system of a robot and is evaluated in the environment; from this
evaluation a fitness score is assigned which measures the ability to perform the
desired task. Individuals that obtain higher fitness values are allowed to
reproduce by generating copies of their genotypes with changes introduced by
genetic operators (mutation, crossover, duplication, etc.). By repeating this
process over several generations, the fitness values of the population increase.
Evolutionary robotics has shown good results in real robots. The techniques are
usually applied over neural networks, modifying the weights of the nodes.
Evolutionary algorithms have yielded more reliable solutions than reinforcement
learning when the reinforcement frequency is low. However, evolutionary
approaches also have real-time problems due to the long time needed to converge
on a reasonable solution. For a complete introduction to Evolutionary Robotics
refer to (Nolfi, 1998) and (Harvey et al., 1997).
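The generational loop just described can be sketched as follows. The fitness function (a toy "one-max" bit-counting task standing in for a robot evaluation) and the elitist, mutation-only reproduction scheme are simplifying assumptions; a full GA would also include crossover and stochastic selection.

```c
#include <stdlib.h>

#define POP   8
#define GENES 16

/* Toy fitness: count of 1-genes, standing in for an evaluation score. */
int fitness(const unsigned char *g) {
    int f = 0;
    for (int i = 0; i < GENES; i++) f += g[i];
    return f;
}

/* One generation: copy the fittest individual over the population, then
   flip each gene with probability 1/GENES (individual 0 is kept as an
   unmutated elite copy). */
void next_generation(unsigned char pop[POP][GENES]) {
    int best = 0;
    for (int i = 1; i < POP; i++)
        if (fitness(pop[i]) > fitness(pop[best])) best = i;
    unsigned char elite[GENES];
    for (int j = 0; j < GENES; j++) elite[j] = pop[best][j];
    for (int i = 0; i < POP; i++)
        for (int j = 0; j < GENES; j++) {
            pop[i][j] = elite[j];
            if (i != 0 && rand() % GENES == 0) pop[i][j] ^= 1;
        }
}
```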
4 Behaviour-based Approaches
Many proposals have appeared since the field of Behaviour-based Robotics began in
1986 with Subsumption architecture, yet there still isn’t a clear classification of the
different techniques in the literature. A global classification, accepted by the
majority of scientists (Ziemke, 1998) (Maes, 1994), lists systems according to their
adaptability. The first proposals were non-adaptive approaches or engineering
approaches. In these proposals, a sophisticated action selection mechanism is
implemented and tuned manually. Later on, adaptive approaches appeared. These
architectures have a simplistic action selection mechanism with sophisticated
learning techniques such as reinforcement learning or evolutionary approaches
(section 3.6).
This report has concentrated on the engineering approaches because they form the
base of Behaviour-based Robotics and because most adaptive approaches are also
based on them. From among the many proposals, four architectures have been
chosen which we think represent the overall methodologies and which have
successfully been implemented in real robots. These architectures were designed
independently and are based on different ideas within the field of Behaviour-based
Robotics. The architectures and their basic characteristics can be seen in Table 1.
Control architecture | Behavioural choice and design | Behavioural encoding | Assembling behaviours | Programming method
Subsumption architecture | Experimentally | Discrete | Competitive, arbitration via inhibition and suppression | AFSM, Behavioural Language or behavioural libraries
Action Selection Dynamics | Experimentally | Discrete | Competitive, arbitration via levels of activation | Mathematical algorithms
Schema-based approach | Ethologically | Continuous using potential fields | Cooperative via vector summation and normalisation | Parameterised behavioural libraries
Process Description Language | Experimentally | Continuous | Cooperative via values integration and normalisation | Process Description Language
Table 1
4.1 Subsumption architecture
4.1.1 Description
The Subsumption architecture was designed by Rodney Brooks in the 1980s at the
Massachusetts Institute of Technology. His work opened the field of Behaviour-
based Robotics. To overcome the problems encountered in Traditional AI when
designing real robotic systems, Brooks proposed a completely different
methodology. He rejected the centralised symbolic world model and proposed a
decentralised set of simple modules which reacted more rapidly to environmental
changes. To accomplish this, he presented the Subsumption architecture in 1986
with the paper “A Robust Layered Control System for a Mobile Robot” (Brooks,
1986). Later on, he modified a few aspects of the architecture as a result of
suggestions from J.H. Connell. The final modifications on the Subsumption
architecture can be found in (Brooks, 1989) and (Connell, 1990). Subsumption
Architecture has been widely applied in all kinds of robots since then. Some further
modifications have also been proposed. However, in this report, the original
Subsumption approach will be described.
Subsumption Architecture is a method of reducing a robot’s control architecture
into a set of task-achievement behaviours or competences represented as separate
layers. Individual layers work on individual goals concurrently and
asynchronously. All the layers have direct access to the sensory information.
Layers are organised hierarchically allowing higher layers to inhibit or suppress
signals from lower layers. Suppression eliminates the control signal from the lower
layer and substitutes it with the one proceeding from the higher layer. When the
output of the higher layer is not active, the suppression node doesn’t affect the
lower layer signal. Inhibition, on the other hand, eliminates the signal from the
lower layer without substituting it. Through these mechanisms, higher-level
layers can subsume lower-levels. The hierarchy of layers with the suppression and
inhibition nodes constitute the competitive coordination method, see Figure 10.
Figure 10: Behaviour layers between sensors and actuators, coordinated through suppression (S) and inhibition (I) nodes.
All layers are constantly attentive to the sensory information. When the output of a
layer becomes active, it suppresses or inhibits the outputs of the layers below,
taking control of the vehicle. Internally, a layer has states and timers allowing
deliberation and continued activity for a period of time after the activation
conditions have finished.
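The two coordination nodes can be sketched as simple functions. The signal representation below is an assumption made for illustration; it is not part of Brooks's formalism.

```c
/* Illustrative signal carried on a wire between layers. */
typedef struct {
    int active;      /* the layer currently has an output */
    double command;  /* control signal when active */
} LayerOutput;

/* Suppression node: the higher layer's signal replaces the lower one's
   whenever the higher layer is active. */
LayerOutput suppress(LayerOutput higher, LayerOutput lower) {
    return higher.active ? higher : lower;
}

/* Inhibition node: an active higher layer cancels the lower signal
   without substituting its own. */
LayerOutput inhibit(LayerOutput higher, LayerOutput lower) {
    if (higher.active) { lower.active = 0; lower.command = 0.0; }
    return lower;
}
```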
This architecture can be built incrementally, adding layers at different phases. For
example, the basic layers, such as “go to behaviour” or “avoid obstacles behaviour”,
can be implemented and tested in the first phase. Once they work properly, new
layers can be added without the necessity of redesigning previous ones.
The layers of the Subsumption architecture were originally designed as a set of
modules called Augmented Finite State Machines (AFSMs). Each AFSM is
composed of a Finite State Machine (FSM) connected to a set of registers and a
set of timers or alarm clocks, see Figure 11. Registers store information from
inside the FSM as well as from outside sensors and other AFSMs. Timers enable
state changes after a certain period of time, while the finite state machine
changes its internal state depending on the current state and inputs. An input
message or the expiration of a timer can change the state of the machine. Inputs
to an AFSM can be suppressed by other machines and its outputs can be
inhibited. An AFSM behaves like a single FSM but with the added characteristics
of registers and timers.
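A minimal sketch of an AFSM follows, assuming a table-driven transition function; the field names and sizes are illustrative, not Brooks's notation.

```c
/* Sketch of an Augmented FSM: a plain FSM state plus registers and a
   countdown timer. All names and sizes are illustrative assumptions. */
typedef struct {
    int state;          /* current FSM state */
    double regs[4];     /* registers latching sensor / other-AFSM values */
    int timer;          /* ticks left until a forced state change */
    int timeout_state;  /* state entered when the timer expires */
} AFSM;

/* One clock tick: an expired timer forces a state change; otherwise a
   transition table maps (state, input) to the next state. */
void afsm_tick(AFSM *m, int input, int table[][2]) {
    if (m->timer > 0 && --m->timer == 0)
        m->state = m->timeout_state;
    else
        m->state = table[m->state][input];
}
```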
Figure 11: An Augmented Finite State Machine: a finite state machine connected to registers and timers, with inhibition (I) and suppression (S) points.
Using a set of augmented finite state machines a layer can be implemented to act
like a behaviour. A layer is constructed of a network of AFSM joined by wires with
suppression and inhibition nodes. Figure 12 shows the AFSM network designed by
Brooks for a mobile robot with the layers “avoid objects”, “wander” and “explore”,
(Brooks, 1986). As the figure shows, designing the network so as to accomplish a
desired behaviour is not exactly clear. For this reason, Brooks developed a
programming language, the Behavioural Language (Brooks, 1990), which
generates the AFSM network using a single rule set for each behaviour. The high-
level language is compiled to the intermediate AFSM representation and then
further compiled to run on a range of processors.
Figure 12
One of the principles of the Subsumption architecture is the independence of the
layers. The implementation methodology, as stated above, consists of building the
layers incrementally once the previous layers have been tested. Nevertheless, in
Figure 12, we can see some wires which go from one layer to another breaking the
independence. This fact was shown by Connell (Connell, 1990) who proposed a total
independence of the layers until the coordination phase. This assures the
possibility of implementing the layers incrementally without redesigning the
previous ones. This is also useful in order to map each layer into a different
processor in the robot. Connell and other researchers have also proposed other
formalisms for implementing the layers instead of AFSMs. Usually, computer
programs based on simple rules are used, without FSMs, for ease of
programming. AFSMs should be considered the formalism Brooks chose, for its
simplicity and rapid processing, to implement the Subsumption architecture, not
as a part of the architecture itself.
4.1.2 Implementation
Subsumption Architecture has been implemented with three behaviours and
tested on an AUV in a simulated underwater environment. The mission consisted
of reaching a collection of way-points while avoiding obstacles and traps. The first
behaviour “Go to” is in charge of driving the vehicle toward the way-points. The
second behaviour “Obstacle avoidance” has the goal of maintaining the vehicle
away from obstacles. The third behaviour “Avoid trapping” is used to depart from
zones in which the vehicle could be trapped. For a complete description of the
behaviours and the simulated environment refer to Appendix A.
The three behaviours have been implemented as three functions which take
sensory information as input and output, when the behaviour becomes active, the
three-dimensional velocity the vehicle should follow. Two suppression nodes were
used to coordinate the behaviours. AFSMs were not used, due to the simplicity of
the executable functions in the simulated environment; however, as mentioned
before, the implementation formalism is not an essential part of the Subsumption
architecture. The three behaviours were arranged as shown in Figure 13.
Figure 13: The three behaviours, "Obstacle avoidance", "Avoid trapping" and "Go to", coordinated by two suppression nodes.
The hierarchy of behaviours was constructed as follows: at the top, the “Obstacle
avoidance” behaviour followed by the “Avoid trapping” and “Go to” behaviours. This
hierarchy primarily assures that the vehicle stays away from obstacles. As the
mission never requires proximity to obstacles, the “Obstacle avoidance” in the top
level assures the safety of the vehicle. At the second level, the "Avoid trapping"
behaviour takes control from the "Go to" behaviour if the vehicle becomes
trapped. As in an AFSM, a timer was used to keep each behaviour active for 5
seconds after its activation conditions had finished.
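The priority-plus-timer arbitration used in this experiment can be sketched as follows. The tick-based persistence counter assumes a 1-second sample time, and the function and index names are illustrative.

```c
/* Priority order used in the experiment: 0 = obstacle avoidance,
   1 = avoid trapping, 2 = go to. A countdown keeps a behaviour active
   for PERSIST ticks after its activation conditions stop holding
   (the 5-second timer, assuming a 1-second sample time). */
#define N_BEH   3
#define PERSIST 5

int select_behaviour(const int conditions[N_BEH], int timers[N_BEH]) {
    /* refresh or decay each behaviour's persistence timer */
    for (int i = 0; i < N_BEH; i++)
        if (conditions[i]) timers[i] = PERSIST;
        else if (timers[i] > 0) timers[i]--;
    /* highest-priority behaviour still active wins */
    for (int i = 0; i < N_BEH; i++)
        if (timers[i] > 0) return i;
    return N_BEH - 1;  /* default: "go to" */
}
```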
Some graphical results of the Subsumption approach can be seen in the next three
figures. Figure 14 shows an aerial view of the simulation and Figure 15 shows a
lateral view. Finally Figure 16 shows a three-dimensional representation of the
simulation.
Figure 14
Figure 15
Figure 16
Given the results, it can be said that the principal advantages of the Subsumption
approach are robustness, modularity and easy tuning of the behaviours.
Behaviours can be tuned individually and, once they work properly, mixed. The
design of the hierarchy is very easy once the priorities of the behaviours are known
(a difficult task when working with a large architecture). This architecture is very
modular, every behaviour can be implemented with a different processor with the
responses coordinated as a final step. A sequential algorithm is not necessary as all
the behaviours are completely independent. The principal disadvantage is the non-
optimal trajectories, due to the competitive coordination method, with a lot of
bends when the active behaviour changes. Table 2 summarises the principal
characteristics of Subsumption Architecture.
SUBSUMPTION ARCHITECTURE
Developer Rodney Brooks, Massachusetts Institute of Technology
References (Brooks, 1986; Brooks, 1989; Connell, 1990)
Behavioural choice and design Experimentally
Behavioural encoding Discrete
Assembling behaviours Competitive, arbitration via inhibition and suppression
Programming method AFSM, Behavioural Language or behavioural libraries
Positive aspects Modularity, Robustness and Tuning time
Negative aspects Development time and performance (non-optimal paths)
Table 2
4.2 Action Selection Dynamics
4.2.1 Description
Action Selection Dynamics (ASD) is an architectural approach which uses a
dynamic mechanism for behaviour (or action) selection. Pattie Maes from the AI-
Lab at MIT developed it toward the end of the 1980s. Principal references are
(Maes, 1989), (Maes, 1990) and (Maes, 1991). Behaviours have associated
activation levels which are used to arbitrate competitively the activity which will
take control of the robot. Other approaches to action selection have been proposed
(Tyrell, 1993); however, ASD is the best known and most commonly cited.
Action Selection Dynamics uses a network of nodes to implement the control
architecture. Each node represents a behaviour (an external behaviour of the
robot). The nodes are called competence modules. This network of modules is used
to determine which competence module will be active and therefore control the
robot. The coordination method is competitive, only one module can be active at
any moment. To activate the competence modules some binary states are used.
Each competence module has three lists of states which define its interaction
within the network. The first list is the precondition list and contains all the states
which should be true so that the module becomes executable. The second list is the
add list and contains all the states which are expected to be true after the
activation of the module. Finally the third list is the delete list and contains the
states which are expected to become false after the execution of the module.
The states are external perceptions of the environment gathered by the sensors.
Usually some kind of processing will be necessary to transform the analogue
outputs of the sensors to a binary state. For example, for the state “No_Obstacle”,
all the values provided by the sonar have to be processed to determine if there are
nearby obstacles. The states can also be internal assumptions or motivations of the
robot. The state “Way-point_Reached” could be one example. The states are also
used to determine the goals and protected goals of the robot. The goals would be the
states which are desired to be true. The protected goals are the goals already
achieved and therefore retained. The mission of the robot is defined by the
assignment of the goals to some states.
Once all the states and competence modules are defined, the decision network can
be built. Different links appear between the nodes based on the precondition, add
and delete lists:
• Successor link: For each state which appears in the add list of module A and
in the precondition list of module B, a successor link joins A with B.
• Predecessor link: A predecessor link joins B with A if there is a successor
link between A and B.
• Conflicter link: For each state which appears in the delete list of module B
and in the precondition list of module A, a conflicter link joins A with B.
In Figure 17 successor links can be seen as continuous arrows and conflicter links
as lines ending in a black dot. Note that predecessor links are inverted successor
links.
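The link construction follows directly from the three lists. The bit-vector representation of the lists below is an assumption made for illustration.

```c
#define N_MODULES 3
#define N_STATES  6

/* Each list is a bit vector over the binary states. */
typedef struct {
    int precond[N_STATES];
    int add[N_STATES];
    int del[N_STATES];
} Module;

/* Successor link A->B: some state is in A's add list and in B's
   precondition list. */
int successor_link(const Module *a, const Module *b) {
    for (int s = 0; s < N_STATES; s++)
        if (a->add[s] && b->precond[s]) return 1;
    return 0;
}

/* Conflicter link A->B: some state is in B's delete list and in A's
   precondition list. */
int conflicter_link(const Module *a, const Module *b) {
    for (int s = 0; s < N_STATES; s++)
        if (b->del[s] && a->precond[s]) return 1;
    return 0;
}
```

Given the lists of every module, the whole network can be generated automatically by evaluating these two predicates for each ordered pair of modules.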
Figure 17: An ASD network of six competence modules linked to binary states, goals and a protected goal, with successor links (arrows) and conflicter links (lines ending in a dot).
Activation of competence modules occurs depending on the quantity of energy they
are given. The energy is spread in two phases. In the first phase three different
mechanisms are used:
1. Activation by the states: if at least one state in the precondition list is true,
activation energy is transferred to the competence module.
2. Activation by the goals: if at least one state in the add list belongs to a goal
state, activation energy is transferred to the competence module.
3. Activation by the protected goals: if at least one state in the delete list
belongs to a protected goal state, activation energy is removed from the
competence module.
The spread of energy in phase one is shown in Figure 17 with dotted lines. On the
other hand, phase two spreads energy from competence modules. Three
mechanisms are also used:
1. Activation of Successors: Executable modules spread a fraction of their own
energy to successors which aren’t executable if the state of the link is false.
The goal is to increase activation of modules which become executable after
the execution of the predecessor module.
2. Activation of Predecessors: Non-executable modules spread a fraction of their
own energy to their predecessors if the state of the link is false. The goal is to
spread energy to the modules whose execution would make the successor
module executable.
3. Inhibition of Conflicters: Competence modules decrease the energy of their
conflicter modules if the state of the link is true. The goal is to decrease the
energy of modules which, by becoming active, would render the module's
preconditions false.
In each cycle the competence modules increase or decrease their energy until a
global maximum and minimum level are reached. The activated module has to
fulfil three conditions:
1. It has to be executable (all preconditions have to be true).
2. Its level of energy has to surpass a threshold.
3. Its level of energy has to be higher than that of any other module fulfilling
conditions 1 and 2.
When a module becomes active, its level of energy is reinitialised to 0. If none of
the modules fulfil condition 2, the threshold is decreased. Several parameters are
used for the thresholds and the amount of energy to be spread. Also, normalisation
rules assure that all modules have the same opportunity to become active. Note
that module energy accumulates incrementally, so the sample time becomes very
important: it determines the speed of the accumulation. For the mathematical
notation of this algorithm refer to (Maes, 1989).
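The selection rule defined by the three conditions above can be sketched as follows; the data representation is an illustrative assumption, and the energy-spreading phases are assumed to have run beforehand.

```c
#define N_MOD 3

/* Select the active module: it must be executable (all preconditions
   true), its energy must surpass the threshold, and it must have the
   highest energy among the modules that qualify. Returns -1 if none
   qualifies (the caller would then lower the threshold). The winner's
   energy is reset to 0, as the algorithm prescribes. */
int select_module(const int executable[N_MOD], double energy[N_MOD],
                  double threshold) {
    int winner = -1;
    for (int i = 0; i < N_MOD; i++)
        if (executable[i] && energy[i] > threshold &&
            (winner < 0 || energy[i] > energy[winner]))
            winner = i;
    if (winner >= 0) energy[winner] = 0.0;
    return winner;
}
```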
The intuitive idea of Action Selection Dynamics is that by using the network and
the spreading of energy, after some time, the active module is the best action to
take for the current situation and current goals. Although Action Selection
Dynamics is complex and difficult to design, it has been successfully tested in real
robotic systems.
4.2.2 Implementation
Three behaviours have been implemented and tested using Action Selection
Dynamics in an underwater simulated environment with an AUV. As in
Subsumption, the mission consisted of reaching a collection of way-points while
avoiding obstacles and entrapment. The first behaviour "Go to" is in
charge of driving the vehicle toward the way-points. The second behaviour
“Obstacle avoidance” has the goal of keeping the vehicle away from obstacles. And
the third behaviour “Avoid trapping” is used to depart from zones in which the
vehicle could be trapped. For a complete description of the behaviours and the
simulated environment refer to Appendix A.
In the implementation of the ASD network, the three behaviours represent three
competence modules. Each behaviour has been implemented in a function. The
coordination module which contains the ASD network has also been implemented
in another function following the mathematical notations found in (Maes, 1989).
Each module uses different binary states in its lists. In Table 3 all the states used
are shown. Note that from the six states, three are the negation of the other three.
This simplifies the ASD algorithm. The goal of the robot is the state
"NO_WAY-POINT": when this state is true, the robot has arrived at and passed
through all the way-points and the mission is therefore complete. Once the states are
described, the precondition list, add list and delete list of each module have to be
defined, see Table 4.
State Description
WAY-POINT There is a way-point to go to.
NO_WAY-POINT There isn’t any way-point.
OBSTACLE There is a nearby obstacle.
NO_OBSTACLE There isn’t any nearby obstacle.
TRAPPED The vehicle cannot depart from the same zone.
NO_TRAPPED The vehicle is moving through different zones.
Table 3
Go To:
  CONDITION LIST: WAY-POINT, NO_OBSTACLE, NO_TRAPPED
  ADD LIST: NO_WAY-POINT, OBSTACLE, TRAPPED
  DELETE LIST: WAY-POINT
Obstacle Avoidance:
  CONDITION LIST: OBSTACLE
  ADD LIST: NO_OBSTACLE
  DELETE LIST: OBSTACLE
Avoid Trapping:
  CONDITION LIST: WAY-POINT, NO_OBSTACLE, TRAPPED
  ADD LIST: NO_TRAPPED, OBSTACLE
  DELETE LIST: TRAPPED
Table 4
With the lists of each competence module the network can be implemented, see
Figure 18. Note that the design phase for the ASD architecture consists of the
specification of the states and the lists. Then, the entire network can be generated
automatically and only some parameters must be tuned.
Figure 18: The ASD network for the "Go to", "Obstacle avoidance" and "Avoid trapping" competence modules; the executable module whose energy satisfies max{E1, E2, E3} > threshold becomes active.
The ASD network spreads energy from the states, from the goal and between the
competence modules. The competence module which is executable and has more
energy than the others and than the threshold activates and controls the robot
until the next iteration of the algorithm. The parameters of the decision network that have been
used can be seen in Table 5. The spreading of energy between the modules is
determined by the relationship between different parameters.
Parameter Description Value
π Maximum level of energy per module 40 units
φ Amount of energy spread by a state 10 units
γ Amount of energy spread by a goal 20 units
δ Amount of energy spread by a protected goal 0 units
θ Threshold for becoming active 15 units
Ts Sample time 1 second
Table 5
Some graphical results of the ASD approach can be seen in the next three figures.
Figure 19 shows an aerial view of the simulation and Figure 20 shows a lateral
view. Finally Figure 21 shows a three-dimensional representation of the
simulation. As can be seen, the path obtained with Action Selection Dynamics is
close to optimal. The principal advantages of this method are the robustness of
the architecture and the automatic coordination once the network has been
generated. However, the design and implementation phases are very complex and
difficult.
Figure 19
Figure 20
Figure 21
Table 6 summarises the principal characteristics of Action Selection Dynamics.
ACTION SELECTION DYNAMICS
Developer Pattie Maes, Massachusetts Institute of Technology
References (Maes, 1989; Maes, 1990; Maes, 1991)
Behavioural choice and design Experimentally
Behavioural encoding Discrete
Assembling behaviours Competitive, arbitration via levels of activation
Programming method Mathematical algorithms
Positive aspects Modularity and Robustness
Negative aspects Development time and complexity
Table 6
4.3 Motor Schemas approach
4.3.1 Description
Schema-based theories appeared in the eighteenth century as a philosophical
model for the explanation of behaviour. Schemas were defined as the mechanism of
understanding sensory perception in the process of storing knowledge. Later on, in
the beginning of the twentieth century, the schema theory was adapted in
psychology and neuroscience as a mechanism for expressing models of memory and
learning. Finally in 1981, Michael Arbib adapted the schema theory for a robotic
system (Arbib, 1981). He built a simple schema-based model inspired by the
behaviour of the frog to control robots. Since then, schema-based methodologies
have been widely used in robotics. The principal proposal is Motor Schemas
developed by Ronald Arkin at Georgia Institute of Technology, Atlanta. Arkin
proposed Motor Schemas (Arkin, 1987) as a new methodology of Behaviour-based
Robotics.
From a robotic point of view “a motor schema is the basic unit of behaviour from
which complex actions can be constructed; it consists of the knowledge of how to act
or perceive as well as the computational process by which it is enacted” (Arkin,
1993). Each schema operates as a concurrent, asynchronous process initiating a
behavioural intention. Motor schemas react proportionally to sensory information
perceived from the environment. All schemas are always active producing outputs
to accomplish their behaviour. The output of a motor schema is an action vector
which defines the way the robot should move. The vector is produced using the
potential fields method (Khatib, 1985). However, instead of producing an entire
field, only the robot’s instantaneous reaction to the environment is produced
allowing a simple and rapid computation.
The coordination method is cooperative and consists of vector summation of all
motor schema output vectors and normalisation. A single vector is obtained
determining the instantaneous desired velocity for the robot. Each behaviour
contributes to the emergent global behaviour of the system. The relative
contribution of each schema is determined by a gain factor. Safety or dominant
behaviours must have higher gain values. Normalisation assures that the final
vector is within the limits of the particular robot’s velocities. Figure 22 shows the
structure of motor schema architecture.
Figure 22: The motor schema architecture: behaviour outputs are summed with their gains (R = Σ Gi · Ri) and normalised before reaching the actuators.
Implementation of each behaviour can be done with parameterised behavioural
libraries in which behaviours like “move ahead”, “move-to-goal”, “avoid-static-
obstacle”, “escape” or “avoid-past” can be found. Schemas have internal parameters
depending on the behaviour and an external parameter, the gain value. Each
schema can be executed on a different processor. Nevertheless, the outputs must
have the same format in order to be summed by the coordinator. For two-
dimensional vehicle control refer to (Arkin, 1989) and for three-dimensional control
to (Arkin, 1992). For a set of positions, each behaviour generates a potential field
which indicates the directions to be followed by the robot in order to accomplish the
behaviour. Merging all the behaviours using the coordinator provides a global
potential field giving an intuitive view of the motor schema architecture
performance, Figure 23.
Figure 23
4.3.2 Implementation
Motor Schema architecture has been implemented with three behaviours and
tested in an underwater simulated environment with an AUV. As in the previous
tests, the mission consisted of reaching a collection of way-points while avoiding
obstacles and entrapment. The first behaviour "Go to" is in charge of driving
the vehicle toward the way-points. The second behaviour “Obstacle avoidance” has
the goal of keeping the vehicle away from obstacles. And the third behaviour
“Avoid trapping” is used to depart from zones in which the vehicle could be
trapped. For a complete description of the behaviours and the simulated
environment refer to Appendix A.
Each behaviour has been implemented in a different function. A simple
coordination module is used to sum the signals and normalise the output. The
structure of the control architecture can be seen in Figure 24. After tuning the
system, “Obstacle avoidance” behaviour has the highest gain value, followed by
“Avoid trapping” and “Go to” behaviours. As in Subsumption architecture, higher
priority has been given to the safety behaviour “Obstacle avoidance”, followed by
“Avoid trapping” to take control over “Go to” when it is necessary.
Figure 24: The three behaviours, "Obstacle avoidance", "Avoid trapping" and "Go to", coordinated by gain-weighted vector summation (R = Σ Gi · Ri) and normalisation.
Some graphical results of the Motor Schema approach can be seen in the next three
figures. Figure 25 shows an aerial view of the simulation and Figure 26 shows a
lateral view. Finally Figure 27 shows a three-dimensional representation of the
simulation.
Figure 25
Figure 26
Figure 27
After reviewing the results, it can be said that the principal advantages are
simplicity and ease of implementation as well as optimised paths. The architecture
can be implemented in different processors because the algorithm isn’t sequential.
However, difficulties appeared in tuning the gain values. The values are very
sensitive and have to be tuned together. When new behaviours are added,
re-tuning is necessary because the sum of the responses of some behaviours can
cancel the effect of others, such as "Obstacle avoidance". For this reason, robustness and
modularity are very low. Table 7 summarises the principal characteristics of Motor
Schema approach.
MOTOR SCHEMA APPROACH
Developer Ronald Arkin, Georgia Institute of Technology
References (Arkin, 1987) (Arkin, 1989) (Arkin, 1992)
Behavioural choice and design Ethologically
Behavioural encoding Continuous using potential fields
Assembling behaviours Cooperative via vector summation and normalisation
Programming method Parameterised behavioural libraries
Positive aspects Simplicity, Development time and Path optimisation
Negative aspects Tuning time, robustness and modularity
Table 7
4.4 Process Description Language
4.4.1 Description
Process Description Language (PDL) was introduced in 1992 by Luc Steels from
the VUB Artificial Intelligence Laboratory, Belgium. PDL (Steels, 1992; Steels,
1993) is intended as a tool for implementing process networks in real robots. PDL
is a language which allows the description and interaction of different processes,
constituting a cooperative dynamic architecture.
PDL architecture is organised with different active behaviour systems, Figure 28.
Each one is intended as an external behaviour of the robot like “explore”, “go
towards target” or “obstacle avoidance”. Each behaviour also contains many active
processes operating in parallel. Processes represent simple movements which the
behaviour will use to reach its goal. Processes take information from sensors and
generate a control action if needed. The control action is related to the set-points
that several actuators of the robot must reach. Concretely, a process output is an
increment which is added to or subtracted from some actuator set-points. This
means, for example, that the process "turn right if the left bumpers are touched"
will add a value to the left motor speed set-point and subtract it from the right
one during each sample time in which the necessary conditions are true (in this
case, touching the left bumpers). The contributions of all the processes are added
to the current set-points and then a normalisation assures a correct output.
This simple methodology constitutes the cooperative coordination method.
Ri = Ri-1 + Σj (Gj · rj)

Figure 28: Structure of the PDL architecture. The processes of each behaviour
system take stimuli from the sensors; the coordinator (Σ) adds their
contributions to the current set-points and normalises (Norm.) the result before
sending it to the actuators.
PDL also defines the language in which such processes are implemented. The
functions are very simple, allowing high-speed processing. For example, the
process “turn right if the left bumpers are touched” would be implemented as:

void turn_right(void)
{
    /* left bumper touched: increment the left speed set-point,
       decrement the right one */
    if (bumper_mapping[3] > 0) {
        add_value(left_speed, 1);
        add_value(right_speed, -1);
    }
}
The relative contribution of each process is determined by the value added to or
subtracted from the set-points. Processes with large values will exert a greater
influence on the robot. The ultimate direction taken by the robot will depend on
which process influences the overall behaviour in the strongest way. It must be
noted that these values are added at each time step. For this reason it is very
important that the ranges of these values be related to the sample time at which
the architecture is working: with small values and a large sample time, the
architecture might not be able to control the robot. The dynamics of the
architecture must be faster than the dynamics of the robot. This is because PDL
works by manipulating derivatives of the set-points, implying a fast control loop
to assure that the system is stable. Although PDL is structured in simple and
fast processes, its dynamics will always have to be faster than those of the
robot or vehicle to be controlled.
The overall execution algorithm is defined by the following recursive procedure:
1. All quantities are frozen (sensory information and set-points).
2. All processes are executed and their relative contributions are added or
subtracted.
3. The set-points are changed based on the overall contribution of the
processes.
4. The set-points are sent to the actuator controllers.
5. The latest sensory quantities are read in.
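The five steps above can be sketched as a minimal C loop. This is only an illustration of the recursion, not the actual PDL runtime: the quantity names, the two example processes and the fixed sensor layout are assumptions.

```c
#include <stddef.h>

#define N_QUANTITIES 4
#define N_PROCESSES  2

enum { BUMPER_LEFT, BUMPER_RIGHT, LEFT_SPEED, RIGHT_SPEED };

double quantity[N_QUANTITIES];   /* frozen quantities: sensors and set-points */
double delta[N_QUANTITIES];      /* contributions accumulated during one cycle */

/* Processes never write set-points directly; they only add increments. */
void add_value(int q, double v) { delta[q] += v; }

void turn_right(void) {          /* "turn right if the left bumper is touched" */
    if (quantity[BUMPER_LEFT] > 0) {
        add_value(LEFT_SPEED, 1);
        add_value(RIGHT_SPEED, -1);
    }
}

void turn_left(void) {           /* mirror process for the right bumper */
    if (quantity[BUMPER_RIGHT] > 0) {
        add_value(LEFT_SPEED, -1);
        add_value(RIGHT_SPEED, 1);
    }
}

void (*process[N_PROCESSES])(void) = { turn_right, turn_left };

/* One PDL cycle following steps 1-5 above. */
void pdl_cycle(const double sensors[2]) {
    int i;
    for (i = 0; i < N_QUANTITIES; i++) delta[i] = 0.0;  /* 1: freeze          */
    for (i = 0; i < N_PROCESSES; i++) process[i]();     /* 2: run processes   */
    for (i = 0; i < N_QUANTITIES; i++)
        quantity[i] += delta[i];                        /* 3: apply deltas    */
    /* 4: here the set-points would be sent to the actuator controllers */
    quantity[BUMPER_LEFT]  = sensors[0];                /* 5: read sensors in */
    quantity[BUMPER_RIGHT] = sensors[1];
}
```

Note that the set-points persist between cycles: when no process is active, the speeds keep their last value, which is exactly why the loop must run faster than the vehicle dynamics.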
4.4.2 Implementation
Process Description Language architecture has been implemented with three
behaviours and tested in the underwater simulated environment with an AUV. As
in previous tests, the mission consisted of reaching a collection of way-points
avoiding obstacles and entrapment. The first behaviour “Go to” is in charge of
driving the vehicle toward the way-points. The second behaviour “Obstacle
avoidance” has the goal of keeping the vehicle away from obstacles. And the third
behaviour “Avoid trapping” is used to depart from zones in which the vehicle could
be trapped. For a complete description of the behaviours and the simulated
environment refer to Appendix A.
Each behaviour has been implemented in a different function. The low-level
processes of each behaviour were assembled so that each behaviour generates a
single response. The response is a vector which slightly changes the current
velocity vector of the vehicle in the direction desired by the behaviour. The
coordinator is a simple module which sums the current velocity vector with those
generated by the
behaviours. The final vector is normalised. Although the processes have been
assembled into a single response per behaviour, the methodology is the same;
this can be regarded as a high-level PDL approach. Figure 29 shows the structure
of this architecture.
Figure 29: Structure of the implemented PDL architecture. The responses of
“Obstacle avoidance”, “Avoid trapping” and “Go to” are summed (Σ) with the
current velocity vector, Ri = Ri-1 + Σj (Gj · rj), and normalised.
The magnitude of the vectors is used to give some behaviours priority over
others. The maximum speed vector of the robot and the maximum magnitudes of the
behaviour vectors can be seen in Table 8.
Vector Maximum magnitude [m/s]
Robot speed 0.5
“Obstacle avoidance” 0.15
“Avoid trapping” 0.04
“Go to” 0.03
Table 8
This means that when the “Obstacle avoidance” behaviour becomes active, it will
affect the overall behaviour in the strongest way. The “Avoid trapping” will only
dominate the “Go to” behaviour to depart from possible entrapment situations. All
these values are closely related to the sample time of the control architecture.
In this case PDL works with derivatives of the speed, which means that the
dynamics of PDL must be faster than the dynamics of the robot. As with the other
evaluated architectures, simulations were first carried out with a sample time
of 1 second. However, at this rate the control architecture was not fast enough,
and the final sample time was reduced to 0.1 seconds.
Some graphical results of the Process Description Language approach can be seen in
the next three figures. Figure 30 shows an aerial view of the simulation and Figure
31 shows a lateral view. Finally, Figure 32 shows a three-dimensional
representation of the simulation.
Figure 30
Figure 31
Figure 32
After reviewing the simulated results it can be said that PDL provides an easy
tool for implementing a control architecture. Its advantages are simplicity and
optimised paths once the architecture is tuned. However, as the coordination
method is cooperative, the tuning was very difficult, and when new behaviours
are added re-tuning is necessary. There is also the problem of the sample time,
which must be shorter than in the other approaches. Finally, because the final
velocity vector is obtained incrementally, the architecture must run
sequentially, which hinders modularity. Table 9 summarises the principal
characteristics of the Process Description Language approach.
PROCESS DESCRIPTION LANGUAGE
Developer Luc Steels, VUB Artificial Intelligence Laboratory
References (Steels, 1992; Steels, 1993)
Behavioural choice and design Experimentally
Behavioural encoding Continuous
Assembling behaviours Cooperative via values integration and normalisation
Programming method Process Description Language
Positive aspects Complexity, Development time and simplicity
Negative aspects Small sample time, Tuning time, Robustness and Modularity
Table 9
4.5 Comparison
Once the four behaviour-based architectures have been implemented and tested,
some conclusions can be drawn. It should be noted that the architectures have
been implemented for a simple mission; however, this should be sufficient to
identify the strengths and deficiencies of each one. As commented above, each
architecture has exhibited advantages and disadvantages, and for this reason
each one will be well suited to a particular application. First of all, Table 10
summarises the different properties that have been discussed in relation to the
four architectures, ranking the architectures from best (1st) to worst (4th) for
each property.
Property \ Architecture:   1st     2nd     3rd     4th
Performance                SCHE.   P.D.L.  A.S.D.  SUBS.
Modularity                 SUBS.   A.S.D.  P.D.L.  SCHE.
Robustness                 SUBS.   A.S.D.  P.D.L.  SCHE.
Development time           SCHE.   P.D.L.  SUBS.   A.S.D.
Tuning time                SUBS.   A.S.D.  P.D.L.  SCHE.
Simplicity                 SCHE.   P.D.L.  SUBS.   A.S.D.
Table 10
Looking at the table, it can be seen that the properties group according to the
coordination method. Competitive methods (Subsumption and A.S.D.) offer
robustness, modularity and easy tuning. This is due to the fact that they only
have one active behaviour at any moment. Robustness follows because in dangerous
situations only a safety behaviour will act and the danger is avoided.
Modularity is also an important property of competitive methodologies because
more behaviours can be added without influencing the old ones; only the
coordinator has to be adapted to the new input. For this reason the tuning time
is very short: behaviours are tuned independently and, once they work properly,
they never have to be re-tuned.
However, competitive methods also have disadvantages, mainly in the coordinator
design. In order to choose only one active behaviour, either a complex method
must be used (A.S.D.) or a clear understanding of the hierarchy of behaviours is
necessary (Subsumption). Once this is done, the final tune-up is very easy, but
the development time is usually long and the coordinator can become very
complex. Another negative property is poor performance, due to the absence of
instantaneous merging of behaviours: many bends appear in the path when more
than one behaviour acts consecutively.
As competitive approaches, the two methodologies studied, Subsumption and
Action Selection Dynamics, possess all these properties. However, they have
quite different philosophies. Subsumption is more of a low-level approach: the
hierarchy of behaviours has to be known and then the network has to be designed.
Subsumption offers a set of tools, the suppression and inhibition nodes, to
build the network. For this reason the implementation is simple, but the design
can be quite difficult. On the other hand, Action Selection Dynamics is a
high-level approach to building an architecture. All the competence modules are
completely described and the design consists of filling in all the module lists.
Once all the behaviours are perfectly described, the network automatically
chooses the best behaviour. In A.S.D. the design is easier but the
implementation is more difficult.
In contrast to competitive approaches, methods with a cooperative coordinator
have other properties, such as simplicity, good performance and short
development time. Because all behaviours are active, the response is always a
merging of them. This means that the path described by the robot will be
smoother than that obtained with a competitive method, so good performance is a
common property.
Another property is simplicity. The coordinator will be very simple, because the
final output is usually some kind of sum of all the behaviour outputs multiplied
by gain factors. In relation to this simplicity, the development time will be
short. However, this simplicity causes great difficulty in tuning the priority
gains, as the values are very sensitive and critical. In extreme situations,
non-safe behaviours can cancel the safe ones. Unfortunately, modularity will be
very poor as a result, because each new behaviour forces the re-tuning of all
the priority gains.
The differences between the two cooperative approaches studied, Motor Schema and
Process Description Language, lie in the level of abstraction and in the
coordinator. In Motor Schema, behaviours are implemented individually with the
behavioural library; it is more of a high-level design. In PDL, however,
behaviours are implemented as low-level processes which slightly change the
set-points of the actuators: the implementation is simple but the design is more
complex. Nevertheless, the principal difference between the two approaches is
found in the coordinator. In Motor Schema the output is obtained at every time
step from the outputs of the behaviours. In PDL, on the other hand, the output
is an integration of all the outputs generated before. This implies that a small
sample time is needed to assure the stability of the system, which can be
problematic if there is a lot of computation to do.
Concluding this comparison, it can be said that, depending on the requirements
of the robot to be controlled, one method can prove more appropriate than
another. A cooperative method may be more suitable if better performance is
necessary, or a competitive method if robustness is the basic premise. The
choice of control architecture could also depend on hardware availability, the
sensorial system and compatibility with adaptive algorithms.
5 Conclusions and Future Work
5.1 Conclusions
This investigative work reviews the basic aspects of Behaviour-based Robotics
architectures. The report begins with an overview of the basic fundamentals of the
subject. In addition, four behaviour-based approaches have been described and
implemented. The application field of the work is underwater robotics.
Implementations are based on a simulated underwater vehicle that must accomplish
a mission in a simulated underwater environment. Finally, a comparative study of
the implemented architectures is provided. The goal of the work has been to
understand the basic characteristics and limitations of Behaviour-based Robotics
through representative examples, with the final purpose of improving control
techniques in autonomous robots such as the GARBI underwater vehicle.
Behaviour-based Robotics has demonstrated its feasibility for controlling
autonomous robots, due mainly to characteristics such as parallelism,
modularity, the absence of centralised knowledge representations and direct
sensory-motor interconnection. These features make behaviour-based architectures
suitable for easy and incremental implementation, fast execution and good
performance. However, disadvantages appear when adjusting behaviour and
coordination parameters. The problem is to determine which action must be taken
at every time step, and it arises from the correct coordination of behaviours.
As mentioned in section 4.5, competitive methods have good robustness but poor
performance when the active behaviour changes continuously. As far as
cooperative methods are concerned, they perform very well when the parameters
are properly tuned, but their low robustness can lead to control failures.
Another constraint revealed by the tested architectures is adaptation. As
explained in section 3.6, adaptation can be applied at different levels of the
control architecture. From our experience with the simulated mission, the need
for an adaptive system at the behavioural level is obvious. In a real test of a
control system, adaptation could also be necessary at the sensory and learning
stages. Adaptation in behaviour-based systems is currently a very active
research subject. The problem is to determine a feasible automatic way to tune
the parameters of the architecture or to generate new behaviours. A control
architecture for an autonomous vehicle that is to inhabit a natural environment
cannot be designed without adaptive abilities.
5.2 Future work
Future work will consist of designing a behaviour-based architecture that takes
into account the different aspects highlighted in the conclusions. The idea is
to exploit the advantages found in the behaviour-based approaches and to propose
solutions for their disadvantages. At the moment the proposal is not yet fully
defined; however, the directions to be taken and some desirable characteristics
of the architecture are outlined below. This future work is intended to start a
thesis that should finish with a contribution to behaviour-based robotics.
Basics
The proposed behaviour-based architecture will be designed for an underwater
vehicle. The approach is focused on a control architecture for navigation;
hence, responses represent a movement of the vehicle. The architecture is
composed of several simple behaviours, as in the approaches presented in this
report. Each one is completely independent and generates a normalised response.
Each response consists of a three-dimensional vector “vi” with a magnitude “mi”
between 0 and 1. Associated with this vector, an activation level “ai” indicates
how important it is for the behaviour to take control of the robot. This value
also lies between 0 and 1, see Figure 33. This codification allows a clear
separation between the control action and the activation of the behaviour.
Figure 33: Response ri of a behaviour bi given a stimulus S: a vector
vi = (xi, yi, zi) with magnitude mi in [0, 1], expressed in the local frame
(XL, YL, ZL), together with an activation level ai in [0, 1].
A hybrid coordination system
Coordination of the responses is done through a hybrid approach between
competition and cooperation. The aim is to design a coordination system that
keeps the robustness of competitive approaches as well as the good performance
of the cooperative ones. Like the suppression nodes of the Subsumption
architecture, the proposed coordination system is composed of a set of two-input
nodes which generate a merged control response. The nodes form a hierarchical
and cooperative coordination system. The idea is to exploit the good performance
of cooperation when the predominant behaviour is not completely active.
Each node has a dominant behaviour that suppresses the response of the
non-dominant behaviour when the former is completely activated (a=1). However,
when the dominant behaviour is only partially activated (0<a<1), the final
response is a combination of both inputs. The idea is that non-dominant
behaviours can slightly modify the responses of dominant behaviours when these
are not completely activated. For example, if the dominant behaviour is
“obstacle avoidance” and the non-dominant one is “go to point”, when “obstacle
avoidance” is only slightly activated (the obstacles are still far away), a
mixed response is obtained. In non-critical situations, cooperation between
behaviours is allowed; nevertheless, robustness is preserved in critical
situations. The proposed node for coordinating behaviours can be seen in
Figure 34.
Figure 34: Coordination node nij merging a dominant response ri and a
non-dominant response rj into rij:

vij = vi + vj · (1 - ai)^2   if ai > 0;    vij = vj   if ai = 0;
if mij > 1 then vij = vij / mij

aij = ai + aj · (1 - ai)^2   if ai > 0;    aij = aj   if ai = 0;
if aij > 1 then aij = 1
The node “nij” has the property of generating a normalised response like those
produced by the behaviours. The effect of the non-dominant behaviour depends on
the squared activation of the dominant one, assuring that in a critical
situation between both, the dominant will always take control. Using these
nodes, all the behaviours can be coordinated: depending on the situation, the
control response could be produced by all the behaviours or by only one.
The coordination method can be classified as a hybrid approach because the
response is the one generated by the dominant behaviour, modified by the
non-dominant behaviours according to the activation level of the former.
Although no hybrid coordination system of this kind appears in the literature,
we think that the method offers good properties and can be successfully
implemented in an autonomous robot.
Adaptivity
To coordinate all the behaviours, many arrangements of the nodes are possible.
Assuming that a response is used only once (each response is connected to only
one coordination node), “n-1” nodes are needed for “n” behaviours, see
Figure 35. This coordination system allows all possible combinations to be
encoded, each combination representing a hierarchical structure of the
coordination system. Then, using an adaptive algorithm, the best arrangement for
accomplishing the current mission can be obtained. Using techniques such as
reinforcement learning or evolutionary robotics (section 3.6), the network of
nodes could be adapted online to optimise missions. The possibility of using
adaptivity at the sensory and learning levels will also be studied.
Figure 35: Example of a coordination network. The stimuli from the sensors feed
four behaviours whose responses are merged by three nodes (n21, n34 and n2'1'),
each with a dominant (D) and a non-dominant (ND) input, before the result is
sent to the actuators.
Future work planning
To carry out the proposed future work different phases should be accomplished.
These phases will be tested and modified according to the results obtained in the
implementation with the robot.
Phase 1 To implement and test the feasibility of the hybrid coordination method
and to redesign it if necessary.
Phase 2 To survey the fields of “Reinforcement learning” and “Evolutionary
Robotics” in order to find a useful approach to online behavioural
adaptation for the proposed architecture.
Phase 3 To implement the architecture in an autonomous underwater robot,
demonstrating the feasibility of the proposed control architecture with a
representative mission.
References
Agre, P. E. and Chapman, D. (1987). Pengi: an implementation of a Theory of
Activity. In: Proceedings of the Sixth Annual Meeting of the American
Association for Artificial Intelligence, pp. 268-272, Seattle, Washington.
Albus, J. (1991). Outline for a theory of Intelligence. IEEE Transactions on
Systems, Man and Cybernetics, vol. 21, is. 3, pp. 473-509.
Amat, J., Batlle, J., Casals, A., and Forest, J. (1996). GARBI: a low cost ROV,
constrains and solutions. In: 6ème Seminaire IARP en robotique sous-
marine , pp. 1-22, Toulon-La Seyne, France.
Arbib, M. A. (1981). Perceptual structures and distributed motor control. Handbook
of physiology-The nervous system II: Motor Control, Chap. 33, pp. 1449-80.
Oxford, UK: Oxford University Press.
Arbib, M. A. and House, D. (1987). Depth and detours: An Essay on Visually guided
behaviour. Vision, Brain and Cooperative Computation, pp. 129-63. MIT
Press.
Arbib, M. A., Kfoury, A J, and Moll, R. N. (1981). A basis for Theoretical Computer
Science. Springer-Verlag.
Arkin, R. C. (1986). Path Planning for a Vision-Based Autonomous Robot. In:
Proceedings of the SPIE Conference on Mobile Robots, pp. 240-49,
Cambridge, MA.
Arkin, R. C. (1987). Motor schema based navigation for a mobile robot: an approach
to programming by behaviour. In: Proceedings of the IEEE conference on
robotics and automation, pp. 264-71, Raleigh, NC.
Arkin, R. C. (1989). Motor schema-based mobile robot navigation. International
Journal of Robotics Research, vol. 8, is. 4, pp. 92-112.
Arkin, R. C. (1992). Behavior-Based Robot Navigation for Extended Domains.
Adaptive Behavior, vol. 1, is. 9, pp. 201-225.
Arkin, R. C. (1993). Modeling neural function at the schema level: Implications and
results for robotic control. Biological neural networks in invertebrate
neuroethology and robotics, Chap. 17, pp. 383-410. Boston: Academic Press.
Arkin, R. C. (1998). Behavior-based Robotics. MIT Press.
Balch, T. and Arkin, R. C. (1993). Avoiding the Past: A simple but effective strategy
for Reactive Navigation. In: Proceedings of the IEEE International
Conference on Robotics and Automation, vol. 1, pp. 678-85, Atlanta, GA.
Bonasso, P. (1992). Reactive Control of Underwater Experiments. Applied
Intelligence, vol. 2, is. 3, pp. 201-04.
Brooks, R. (1986). A Robust Layered Control System for a Mobile Robot. IEEE
Journal of Robotics and Automation, vol. RA-2, is. 1, pp. 14-23.
Brooks, R. (1989). A robot that walks: Emergent Behavior from a Carefully Evolved
Network. Neural Computation, vol. 1, is. 2, pp. 253-262.
Brooks, R. (1990). The Behavior Language. A.I. Memo No. 1227, MIT AI Laboratory.
Brooks, R. (1991). Intelligence Without Reason. A.I. Memo No. 1293, MIT AI
Laboratory.
Brooks, R. (1991). New approaches to Robotics. Science, vol. 253, pp. 1227-1232.
Chatila, R. A. and Laumond, J. C. (1985). Position referencing and consistent
World Modeling for Mobile Robots. In: IEEE International Conference on
Robotics and Automation.
Connell, J. H. (1990). Minimalist Mobile robotics: A colony-style architecture for an
Artificial Creature. Academic Press.
Dorigo, M. and Colombetti, M. (1998). Robot Shaping: An Experiment in Behavior
Engineering. MIT Press.
Fikes, R. E. and Nilsson, N. J. (1971). STRIPS: A new approach to the application
of theorem proving and problem solving. Artificial Intelligence, vol. 2, pp.
189-208.
Fossen, T. I. (1995). Guidance and Control of Ocean vehicles. John Wiley & Sons.
Gat, E. (1991). Reliable Goal-Directed Reactive Control of Autonomous Mobile
Robots. Ph.D. Dissertation, Virginia Polytechnic Institute and State
University, Blacksburg.
Goheen, K. (1995). Techniques for URV Modelling. Underwater Robotics Vehicles,
Chap. 4. TSI Press.
Harvey, I., Husbands, P., Cliff, D., Thomson, A., and Jakobi, A. (1997).
Evolutionary Robotics: the Sussex approach. Robotics and Autonomous
systems, vol. 20, is. 2-4, pp. 205-224.
Huang, H-M. (1996). An architecture and a methodology for intelligent control.
IEEE Expert: Intelligent Systems and their applications, vol. 11, is. 2, pp.
46-55.
Kaelbling, L. P. (1996). Reinforcement Learning: A Survey. Journal of Artificial
Intelligence Research, vol. 4, pp. 237-285.
Kaelbling, L. P. (1999). Robotics and Learning. The MIT encyclopedia of the
cognitive sciences, pp. 723-724. MIT Press.
Khatib, O. (1985). Real-Time Obstacle Avoidance for Manipulators and Mobile
Robots. In: Proceedings of the IEEE International Conference on Robotics
and Automation, pp. 500-05, St. Louis, MO.
Laird, J. E. and Rosenbloom P.S. (1990). Integrating, Execution, Planning, and
Learning in soar for External Environments. In: Proceedings, AAAI-90, pp.
1022-1029.
Lefebvre, D. and Saridis, G. (1992). A computer architecture for intelligent
Machines. In: Proceedings of the IEEE International Conference on Robotics
and Automation, pp. 245-50, Nice, France.
Lyons, D. (1992). Planning, Reactive. Encyclopedia of Artificial Intelligence, pp.
1171-82. John Wiley and Sons, New York.
Maes, P. (1989). How to do the right thing. Connection Science, vol. 1, pp. 291-323.
Maes, P. (1990). Situated Agents Can Have Goals. Robotics and Automation
Systems, vol. 6, pp. 49-70.
Maes, P. (1991). A bottom-up mechanism for behaviour selection in an artificial
creature. From animals to animats: Proceedings of the first international
conference on simulation of adaptive behaviour. Ed. JA Meyer & SW Wilson,
MIT Press / Bradford Books.
Maes, P. (1994). Modeling Adaptive Autonomous Agents. Artificial Life Journal,
vol. 1, is. 1 & 2, pp. 135-162.
Mataric, M. (1999). Behavior-based Robotics. The MIT encyclopedia of the cognitive
sciences, pp. 74-77. MIT Press.
McFarland, D. (1991). What it means for robot behavior to be adaptive. From
animals to animats: Proceedings of the first International Conference on
Simulation of Adaptive Behavior, pp. 22-28. MIT Press.
Moravec, H. (1988). Mind Children: The future of Robot and Human Intelligence.
Harvard University Press.
Nilsson, N. J. (1984). Shakey the Robot. Stanford Research Institute AI Center,
Technical rep. 323.
Nolfi, S. (1998). Evolutionary Robotics: Exploiting the full power of self-
organization. Connection Science, vol. 10, is. 3-4, pp. 167-183.
Payton, D., Keirsey, D., Kimble, D., Krozel, J., and Rosenblatt, J. (1992). Do
Whatever Works: A robust approach to fault-tolerant autonomous control.
Applied Intelligence, vol. 2, is. 3, pp. 225-50.
Pfeifer, R. and Scheier, C. (1999). Understanding Intelligence. MIT Press.
Ridao, P., Yuh, J., Batlle, J., and Sugihara, K. (2000). On AUV Control
Architecture. In: Proceedings of the IEEE International Conference on
Intelligent Robots and Systems, Takamatsu, Japan.
Rosenblatt, J. and Payton, D. (1989). A fine-grained alternative to the subsumption
architecture for Mobile Robot Control. In: Proceedings of the International
Joint Conference on Neural Networks, pp. 317-23.
Rosenschein, S. J. and Kaelbling, L. P. (1986). The synthesis of digital machines with
provable epistemic properties. In: Proceedings of the Conference on
Theoretical Aspects of Reasoning About Knowledge, pp. 83-98, Los Altos,
California.
Saffiotti, A., Konolige, K., and Ruspini, E. (1995). A multi-valued logic approach to
integrating planning and control. Artificial Intelligence, vol. 76, is. 1-2, pp.
481-526.
Schoppers, M. (1987). Universal Plans for Reactive Robots in Unpredictable
Environments. In: Proceedings of the Tenth International Joint Conference
on Artificial Intelligence (IJCAI-87), pp. 852-59.
Slack, M. (1990). Situationally Driven Local Navigation for Mobile Robots. JPL
Publication No. 90-17, NASA Jet Propulsion Laboratory, Pasadena, CA.
Steels, L. (1992). The PDL reference manual. Memo 92-5, VUB AI Lab, Brussels,
Belgium.
Steels, L. (1993). Building agents with autonomous behaviour systems. The
artificial route to artificial intelligence. Building situated embodied agents.
Lawrence Erlbaum Associates, New Haven.
Tyrell, T. (1993). Computational mechanisms for action selection. Doctoral
dissertation, University of Edinburgh, Scotland.
Yuh, J. (1990). Modeling and Control of Underwater Robotic Vehicles. IEEE
Transactions on Systems, Man and Cybernetics, vol. 20, is. 6, pp. 1475-1483.
Ziemke, T. (1998). Adaptive Behavior in Autonomous Agents. Presence, vol. 7, is. 6,
pp. 564-587.
Appendix A. Simulation of missions with an AUV.
In order to test the behaviour-based architectures explained in section 4, a
simulator has been implemented using Matlab/Simulink. The simulator is composed
of a control architecture, low-level controllers, a model of an Autonomous
Underwater Vehicle and a graphical interface with an underwater environment.
Figure 36 shows the interconnections of the different processes; each of them is
described in the next subsections.
Figure 36: Interconnection of the simulator processes. The behaviour-based
control architecture (“Go to”, “Obstacle avoidance” and “Avoid trapping” plus
the coordination module) sends the vehicle speed to the low-level controllers,
which drive the thruster speeds of the AUV model; the underwater environment
returns the position and orientation of the vehicle and the sonar distances.
All these components make possible the simulation of a simple mission which, we
think, is sufficient to evaluate the different aspects of each control
architecture. Note that typical problems of real robots (position estimation,
noise in signals, faults in sonar distances) are not simulated. Neglecting these
aspects breaks several principles of Behaviour-based Robotics; however, these
simulations are intended to prove only the performance of the control
architectures, not the principles. We assume that if the architectures were
implemented in a real robot, the properties of Behaviour-based Robotics would
assure robustness against these problems. For this reason the chosen mission is
very simple, and we think that nearly the same results could be obtained in a
real experiment.
The mission consists of reaching three way-points, one after the other, avoiding
obstacles and entrapment. The starting point of the mission and the three goal
points can be seen in the 3D representation shown in Figure 37. The figure also
shows the dimensions of the environment in metres. In some places the sea floor
rises and the vehicle is forced to go around or over the obstacles.
Figure 37
A.1 The control architecture
The control architecture block contains one of the architectures of section 4.
However, in order to compare the architectures, a fixed set of behaviours has
been determined. For this mission three behaviours have been chosen: “Go to”,
“Obstacle avoidance” and “Avoid trapping”. All the architectures were
implemented with these behaviours, changing implementation details but
maintaining their structure.
These behaviours use different inputs but generate an established output. The
output is a three-dimensional vector representing the speed at which the vehicle
should move, see Figure 38. The vector is defined by two angles, “α” and “β”,
and a magnitude “M”. The angle “α” represents the angle the vehicle should turn
with respect to its yaw angle. The angle “β”, combined with the magnitude of the
speed, represents the vertical movement of the vehicle. Note that this angle is
referred to the horizontal plane; this is possible due to the stability of the
modelled vehicle (pitch and roll angles are always nearly zero), see section
A.3. The magnitude “M” is the modulus of the speed, expressed in m/s; the
maximum value for our vehicle is 0.5 m/s. The coordinator module merges the
three speed vectors and generates another speed vector like the previous ones,
saturating the magnitude to 0.5 m/s.
Figure 38: Speed vector of magnitude M defined by the angles α (with respect to
the vehicle yaw, in the local frame XL, YL, ZL) and β (with respect to the
horizontal plane), shown relative to the global frame (XG, YG, ZG).
• “Go to” behaviour
This behaviour drives the vehicle towards the goal point. It proposes a speed
vector with a constant magnitude whose direction joins the position of the
vehicle with the goal point, see Figure 39. The inputs of the behaviour are the
position and orientation of the vehicle. The goal points are stored by the
behaviour and changed when the robot comes within a close distance.
Figure 39: “Go to” response: the angles α and β point the speed vector from the
vehicle's local frame (XL, YL, ZL) towards the goal point, expressed in the
global frame (XG, YG, ZG).
• “Obstacle avoidance” behaviour
This behaviour prevents the robot from crashing into obstacles. Its inputs are
the sonar distances. The vehicle has seven sonars: three at the front, one on
each side, one at the back and another below (section A.4). Each sonar direction
has a configurable range. If one of the measured distances is less than this
range, the behaviour generates an opposing response. The behaviour processes all
the distances and creates a 3D speed vector indicating the direction the robot
must take to avoid the obstacles detected by the sonars, see Figure 40.
Figure 40: repulsive speed vector generated by the “Obstacle avoidance”
behaviour when a sonar detects an obstacle.
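A sketch of this computation is shown below (in Python; the linear weighting law, the gain and the range value are assumptions, since the text only specifies that readings below the range generate an opposing response):

```python
import math

SONAR_RANGE = 2.0  # configurable range per sonar direction [m] (assumption)

def obstacle_avoidance(distances, directions, gain=0.5):
    """Sum a repulsive contribution for every sonar reading below range.
    distances: measured sonar distances [m]; directions: unit vectors
    (x, y, z) of each sonar in the vehicle frame."""
    x = y = z = 0.0
    for d, (ux, uy, uz) in zip(distances, directions):
        if d < SONAR_RANGE:
            w = gain * (SONAR_RANGE - d) / SONAR_RANGE  # closer -> stronger
            x -= w * ux                                 # push away from the
            y -= w * uy                                 # detected obstacle
            z -= w * uz
    m = math.sqrt(x * x + y * y + z * z)
    if m == 0.0:
        return 0.0, 0.0, 0.0     # nothing inside range: behaviour inactive
    return math.atan2(y, x), math.asin(z / m), m
```

With a single front sonar detecting an obstacle 1 m ahead, the proposed vector points straight backwards.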
• “Avoid trapping” behaviour
The avoid trapping (Balch & Arkin, 1993) behaviour is used to escape from
situations where the robot becomes trapped. The input of the behaviour is the
position of the robot, which is used to maintain a local map of the recent
path of the vehicle. The map is centred on the robot position and has a finite
number of cells. Each cell contains the number of times the robot has visited
it. If the sum of all the values is higher than a threshold, the behaviour
becomes active and a speed vector is generated. The direction of the vector is
the one opposed to the centre of gravity of the local map, computed using the
cell values as weights. The magnitude is proportional to the sum of the cell
values. The cells are incremented at a configurable sample time and saturated
at a maximum value. After a long time, the cell values are decreased, allowing
the robot to return to a previously visited zone.
This behaviour becomes active in two different situations. The first one is
when the vehicle is trapped in front of obstacles. In this case the behaviour
will take control of the vehicle and drive it away to another zone. The second
situation is when the vehicle is navigating very slowly because of the
interaction of the other behaviours. In this case the cell values will
increase quickly and the behaviour will become active, driving the vehicle
away from the path, see Figure 41.
Figure 41: speed vector generated by the “Avoid trapping” behaviour, directed
away from the recently visited path.
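The visit-count map could be sketched as follows (a Python sketch; the grid size, threshold, saturation and magnitude gain are illustrative assumptions, and the slow decay of old cells is omitted for brevity):

```python
import math

class AvoidTrapping:
    """Sketch of the 'avoid trapping' local map: a grid of visit
    counters centred on the robot, active when the summed counts of the
    local window exceed a threshold."""
    def __init__(self, size=7, cell=1.0, threshold=3, saturation=10):
        self.size, self.cell = size, cell
        self.threshold, self.saturation = threshold, saturation
        self.counts = {}  # (i, j) grid cell -> number of visits

    def update(self, x, y):
        i, j = int(x // self.cell), int(y // self.cell)
        self.counts[(i, j)] = min(self.counts.get((i, j), 0) + 1,
                                  self.saturation)
        half = self.size // 2  # local window centred on the robot
        local = {k: v for k, v in self.counts.items()
                 if abs(k[0] - i) <= half and abs(k[1] - j) <= half}
        total = sum(local.values())
        if total <= self.threshold:
            return None                      # behaviour inactive
        # centre of gravity of the local map, weighted by cell counts
        cx = sum(k[0] * v for k, v in local.items()) / total
        cy = sum(k[1] * v for k, v in local.items()) / total
        alpha = math.atan2(j - cy, i - cx)   # opposed to the c.o.g.
        m = min(0.5, 0.05 * total)           # grows with the counts
        return alpha, 0.0, m
```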
A.2 The low-level controllers
The low-level control module is in charge of accomplishing the set-points
given by the control architecture. The inputs of the module are the speed
vector and the position and orientation of the robot. The speed vector is
decomposed into three set-points: the yaw, the horizontal speed and the
vertical speed. The yaw set-point is obtained from the current yaw and the “α”
value. The horizontal speed set-point is obtained from the horizontal
component of the speed vector, and the vertical speed set-point from the
vertical component.
Each set-point is accomplished by an individual controller. A PID controller
was chosen and tuned for each set-point. Figure 42 shows the structure of the
low-level controllers. The set-points are compared with the real values
obtained from the position and orientation of the model. The yaw controller
uses the current yaw of the vehicle. The horizontal speed controller uses the
composition of the “x” and “y” derivatives, and the vertical speed controller
uses the “z” derivative.
Figure 42
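This decomposition, together with a generic PID loop, could be sketched as follows (the PID gains and function names are placeholders, not GARBI's actual tuning):

```python
import math

class PID:
    """Minimal PID controller (gains are placeholders)."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint, measured):
        err = setpoint - measured
        self.integral += err * self.dt
        derivative = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * derivative

def setpoints(alpha, beta, m, current_yaw):
    """Decompose the (alpha, beta, M) vector into the three set-points."""
    yaw_sp = current_yaw + alpha        # desired yaw
    horizontal_sp = m * math.cos(beta)  # horizontal speed component [m/s]
    vertical_sp = m * math.sin(beta)    # vertical speed component [m/s]
    return yaw_sp, horizontal_sp, vertical_sp
```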
As the vehicle is stable (section A.3), the decomposition of the speed vector
allows a total independence of the horizontal and vertical controllers.
However, the horizontal controllers (the yaw and horizontal speed controllers)
act on the same thrusters (the two horizontal thrusters, see section A.3). For
this reason they were tuned together. On the other hand, the two vertical
thrusters (section A.3) are only controlled by the vertical speed controller.
It must be noted that the low-level controllers are not very complex. The
intention was to implement simple controllers and to tune them to achieve
realistic results rather than impressive but unrealistic ones. The responses
of the controllers with the model were compared with real responses until they
were very similar. It is important to notice that the intention of the report
is to test control architectures. For this purpose it is important to work
with a simulator that reacts like a real robot.
A.3 Model of an Autonomous Underwater Vehicle
To simulate the behaviour of an AUV a hydrodynamic model is necessary. In our
case the model was adapted to our underwater vehicle GARBI. First of all the
underwater robot will be described. Then, the model for an underwater vehicle will
be analysed.
• The underwater vehicle
GARBI (Amat et al., 1996) was first conceived as a Remotely Operated Vehicle
(ROV) for exploration in waters up to 200 metres in depth. At the moment a
control architecture is being implemented to transform this vehicle into an
Autonomous Underwater Vehicle. GARBI, see Figure 43, was designed with the aim
of building an underwater vehicle using low-cost materials, such as
fibre-glass and epoxy resins. To solve the problem of resistance to underwater
pressure, the vehicle is servo-pressurised to the external pressure by using a
compressed air bottle, like those used in scuba diving. Air consumption is
required only during vertical displacements, in which the decompression valves
release the required amount of air to keep the internal pressure at the same
level as the external one. The vehicle can incorporate two arms, which allow
it to perform some object-manipulation tasks through tele-operation.
Figure 43
The vehicle has four thrusters, see Figure 44: two for horizontal movements
(x axis) and two for vertical movements (z axis). Additionally, it is possible
to add another thruster in the transverse direction (y axis). Due to the
distribution of the weights, the vehicle is completely stable: pitch and roll
angles are always insignificant. For this reason the vertical and horizontal
movements are totally independent. The vehicle also has several sensors: two
compasses, two pressure sensors, two water speed sensors and five sonars. Its
dimensions are: length 1.3 m, height 0.9 m and width 0.7 m. The maximum speed
is 3 knots and the weight is 150 kg.
Figure 44: disposition of the thrusters (T1, T2, T3 and T4) on the GARBI
vehicle, with the body axes and the prow, stern, starboard and larboard sides
indicated.
• The hydrodynamic model
An AUV can be modelled by using two approaches (Goheen, 1995). The first one
uses predictive techniques which build up the model from the basic physical laws
governing the dynamic behaviour of the system. The second uses testing
techniques which are based on parameter estimation from real tests. The model
used for GARBI is based on predictive techniques. As described in the literature
(Yuh, 1990) and (Fossen, 1995), the hydrodynamic equation of motion of an
underwater vehicle with 6 DOF can be conveniently described as follows:
$$ Bu = \left({}^G M_{RB} + {}^G M_A\right)\dot{V}^G + \left({}^G C_{RB}(V^G) + {}^G C_A(V^G)\right)V^G + {}^G D(V^G)\,V^G + {}^G G(O) \quad \text{(eq.1)} $$

$$ {}^G G(O) = {}^G F_B + {}^G F_W \quad \text{(eq.2)} $$
where,
$B$: thruster configuration matrix
$u = (\omega_1^2, \omega_2^2, \omega_3^2, \omega_4^2, \omega_5^2)^T$: control inputs vector
$\omega_i$: angular speed of the propeller $i$
${}^G M_{RB}$: inertia matrix
${}^G M_A$: added-mass matrix
$\dot{V}^G = ({}^G a_{GE}, {}^G \alpha_{GE})^T$: acceleration vector
$V^G = ({}^G v_{GE}, {}^G \omega_{GE})^T$: velocity vector
$O = (\phi\ \theta\ \psi)^T$: roll, pitch and yaw angles
${}^G C_{RB}$: rigid-body Coriolis matrix
${}^G C_A$: added Coriolis matrix
${}^G D$: damping matrix
${}^G G$: gravity & buoyancy vector
The super-index denotes the coordinate system where vector components are
expressed. {G} is a robot fixed coordinate system and {E} is an earth fixed
coordinate system. For simulation purposes it’s interesting to compute the
evolution of the robot position and orientation as a function of the forces acting on
the vehicle. This can be easily computed, arranging the terms of (eq.1) as follows:
$$ K = Bu - \left({}^G C_{RB}(V^G) + {}^G C_A(V^G)\right)V^G - {}^G D(V^G)\,V^G - {}^G G(O) \quad \text{(eq.3)} $$

$$ \dot{V}^G = \left({}^G M_{RB} + {}^G M_A\right)^{-1} K \quad \text{(eq.4)} $$
The velocity vector can be computed through integration:

$$ V^G = \int \dot{V}^G \, dt \quad \text{(eq.5)} $$
and finally, the rate of change of the position and the orientation can be computed
as follows:
$$ \begin{pmatrix} \dot{r}^E \\ \dot{O} \end{pmatrix} = \begin{pmatrix} {}^E_G R(O) & 0_{3\times 3} \\ 0_{3\times 3} & T^{-1}(O) \end{pmatrix} \cdot \begin{pmatrix} {}^G v_{GE} \\ {}^G \omega_{GE} \end{pmatrix} \quad \text{(eq.6)} $$

where:
${}^E_G R$ is the rotation matrix, $\dot{r}^E = (\dot{x}\ \dot{y}\ \dot{z})^T$ and

$$ T^{-1}(O) = \begin{pmatrix} 1 & \sin\phi\tan\theta & \cos\phi\tan\theta \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi/\cos\theta & \cos\phi/\cos\theta \end{pmatrix} $$
The parameters of the equations were obtained from the characteristics of the
GARBI robot, except for the added-mass and damping matrices, which were taken
from the model of the underwater vehicle ODIN from the University of Hawaii.
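One Euler integration step of eqs. 3 to 6 could be sketched as follows (a numerical sketch assuming numpy; all model matrices are passed in as placeholder callables rather than GARBI's actual coefficients):

```python
import numpy as np

def step(V, r, O, B, u, M_inv, C_fn, D_fn, G_fn, R_fn, Tinv_fn, dt):
    """One Euler step of the simulation loop.
    V: 6-DOF body velocity; r: earth-fixed position; O: (roll, pitch, yaw).
    M_inv is the inverse of (M_RB + M_A); C_fn returns the total Coriolis
    matrix (C_RB + C_A), D_fn the damping matrix, G_fn the gravity and
    buoyancy vector, R_fn the rotation matrix, Tinv_fn the matrix T^-1(O)."""
    K = B @ u - (C_fn(V) + D_fn(V)) @ V - G_fn(O)      # eq. 3
    V_dot = M_inv @ K                                  # eq. 4
    V = V + V_dot * dt                                 # eq. 5 (Euler)
    J = np.block([[R_fn(O), np.zeros((3, 3))],
                  [np.zeros((3, 3)), Tinv_fn(O)]])
    pose_dot = J @ V                                   # eq. 6
    return V, r + pose_dot[:3] * dt, O + pose_dot[3:] * dt
```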
A.4 Underwater environment
To test the control architectures, an environment has been generated in which
the robot can navigate and interact. The environment is generated from a
grey-scale “bmp” file and can be changed easily. The grey-scale value of each
pixel is translated into a depth value. The three-dimensional environment is
then generated (Figure 37) and the robot can be placed inside it, detecting
the sea floor or the obstacles. This detection is done by simulating seven
sonar sensors. The disposition of the sonars can be seen in Figure 45, with
different detected distances. Six of them are situated in the horizontal plane
and the last one points along the negative “z” axis. Note that it is not
necessary to use a sonar in the positive “z” axis: the method of generating
the environment only allows the sea floor to be represented, not intermediate
objects. As can be seen, each sonar is simulated as a cone.
Figure 45: disposition of the seven simulated sonar sensors, shown in top and
lateral views.
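The translation from pixel grey levels to depth could be as simple as the following sketch (whether darker pixels mean deeper points, and the maximum depth value, are assumptions not fixed by the text):

```python
def depth_map(pixels, max_depth):
    """Translate grey-scale pixel values (0-255) into depths [m].
    Assumption: darker pixels (lower values) are deeper points."""
    return [[(255 - p) / 255.0 * max_depth for p in row] for row in pixels]
```

A white pixel then maps to depth 0 and a black pixel to the maximum depth.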
As a tool to visually monitor the results of the simulations, a graphical
interface shows the evolution of the robot during the development of a
mission. The graphical interface, see Figure 46, shows a screen with a top
view and a lateral view of the evolution of the vehicle inside the
environment. The environment is represented with contour lines in the top
view. The sonars and the goal points are also represented. The interface
provides numerical information on the position of the vehicle, the sonar
sensors, the motors, the goal point and the way-point.
Figure 46
Finally, a three-dimensional representation can also be generated once the
simulation has finished. Figure 47 shows the path of a simulation using the
Motor Schema architecture.
Figure 47
Appendix B. Published results
As a result of this research work, a conference paper was published at the
Q&A-R 2000 international conference. In addition, two conference papers have
been accepted for presentation at the MCMC 2000 international conference and
the CCIA 2000 national conference.
B.1 MCMC 2000
Title: An overview on behaviour-based methods for AUV control.
Authors: M. Carreras, J. Batlle, P. Ridao and G.N. Roberts.
Conference: 5th IFAC Conference on Manoeuvring and Control of Marine Crafts.
Place: Aalborg, Denmark.
Date: August 23-25, 2000
B.2 CCIA 2000
Title: An Underwater Autonomous Agent. From simulation to experimentation.
Authors: Pere Ridao, Joan Batlle, and Marc Carreras.
Conference: 3r Congrés Català d'Intel.ligència Artificial.
Place: Vilanova i la Geltrú, Catalunya.
Date: October 5-7, 2000
B.3 Q&A-R 2000
Title: Reactive control of an AUV using Motor Schemas.
Authors: M. Carreras, J. Batlle and P. Ridao.
Conference: International conference on Quality control, Automation and Robotics.
Place: Cluj-Napoca, Romania.
Date: May 19-20, 2000