Research Project Report
An overview of Behavioural-based Robotics
with simulated implementations
on an Underwater Vehicle.
presented by:
Marc Carreras i Pérez
supervisor:
Dr. Joan Batlle i Grabulosa
doctoral programme:
Informàtica Industrial / Tecnologies Avançades de Producció
Institut d'Informàtica i Aplicacions
Universitat de Girona
Girona, 14 July 2000
INDEX
1 Introduction
2 Behavioural-based Robotics versus Traditional AI
3 Fundamentals of Behaviour-based Robotics
   3.1 Principles
   3.2 Expression of behaviours
   3.3 Behavioural choice and design
   3.4 Behavioural encoding
   3.5 Assembling behaviours
   3.6 Adaptive Behaviour-based Robotics
4 Behaviour-based Approaches
   4.1 Subsumption architecture
      4.1.1 Description
      4.1.2 Implementation
   4.2 Action Selection Dynamics
      4.2.1 Description
      4.2.2 Implementation
   4.3 Motor Schemas approach
      4.3.1 Description
      4.3.2 Implementation
   4.4 Process Description Language
      4.4.1 Description
      4.4.2 Implementation
   4.5 Comparison
5 Conclusions and Future Work
   5.1 Conclusions
   5.2 Future work
References
Appendix A. Simulation of missions with an AUV
   A.1 The control architecture
   A.2 The low-level controllers
   A.3 Model of an Autonomous Underwater Vehicle
   A.4 Underwater environment
Appendix B. Published results
   B.1 MCMC 2000
   B.2 CCIA 2000
   B.3 Q&A-R 2000
1 Introduction
This investigative report presents the work carried out between November 1999
and July 2000 in the Robotics and Computer Vision group at the University of
Girona.
This report concentrates on control architectures for autonomous
robots. A control architecture is the part of a control system in an autonomous
robot which is in charge of proposing correct actions in order to achieve its mission.
Concretely, this report focuses on Behaviour-based Robotics, a methodology for
building control architectures. The intention of this report is not to survey the state
of the art in this field, but rather it tries to give an overall description of the subject
and to show implementations of different behaviour-based approaches. In
particular, four architectures have been implemented to achieve a simple mission
in a computer-simulated environment. These behaviour-based architectures are
Subsumption, Action Selection Dynamics, Motor Schema and Process Description
Language. The purpose of this report is to illustrate the field of Behaviour-based
Robotics through theoretic aspects and representative implemented examples.
The presented work is part of an investigative project based on underwater
robotics. The research line started in 1992 and has been supported by different
research projects of the Spanish Government. At the moment a control architecture
is being implemented to transform our Remotely Operated Vehicle, GARBI (Amat
et al, 1996), into an Autonomous Underwater Vehicle (AUV). The control
architecture, called OOCAA (Object Oriented Control Architecture for Autonomy)
(Ridao et al, 2000), was designed as a hybrid architecture (refer to section 2)
containing aspects of behaviour-based robotics in its lower layer. The work
presented in this report is integrated with the research project in different aspects.
As an immediate contribution, the underwater-simulated environment, specially
developed for this investigative work, is used as a tool to design and test new parts
of the OOCAA before implementing them in the robot. In the medium term, the completion of the PhD, which begins with this investigative phase, should improve the control performance of GARBI.
With the purpose of contributing to the investigation on underwater robotics,
implementations were carried out using an AUV model and a three-dimensional
simulated environment. However, the overview of Behaviour-based Robotics was
dealt with from a general point of view concerning autonomous vehicles. The four
chosen architectures are suitable for any other kind of robotic vehicle as well. Another important point to note is that this investigation is done with a single robot; for this reason, all behaviour-based concepts concerning multi-agent systems have been omitted.
This investigative report is structured as follows. Section 2 reviews the history of
control architectures for autonomous robots, starting with traditional methods of
Artificial Intelligence applied to robot control. The most important facts are reviewed up to the inception of the field of Behaviour-based Robotics. Once
this field has been introduced, section 3 summarises the most important principles
and characteristics of Behaviour-based systems. Section 4 deals with the four
compared architectures. All the architectures are firstly described as they were
originally designed, then an implemented example is shown. Simulation results
and discussion of positive and negative aspects are given. The final part of this
section compares the four architectures. Section 5 concludes the report and
proposes future work to be done in order to start the PhD. An overview of the
thesis proposal is described. Finally, appendixes complete the report with the
description of the simulated missions with an AUV (Appendix A) and the related
publications of this investigative work (Appendix B).
2 Behavioural-based Robotics versus Traditional AI
The first attempt at building autonomous robots began around the mid-twentieth
century with the emergence of Artificial Intelligence. The approach begun at that
time is now referred to as “Traditional AI”, “Classical AI”, “Deliberative approach”
or “Hierarchical approach”. Traditional AI relies on a centralised world model for
verifying sensory information and generating actions in the world. The design of
the control architecture is based on a top-down philosophy. The robot control
architecture is broken down into an orderly sequence of functional components
(Brooks, 1991) and the user formulates explicit tasks and goals for the system, see
Figure 1:
Figure 1. Functional decomposition of a Traditional AI control architecture: Perception → Modeling → Planning → Task execution → Motor control, connecting sensors to actuators.
1. Perception. In the first component, a sensor interpreter resolves noise and
conflicts in the sensory data. Perception algorithms are used to find
characteristics and objects within the environment.
2. Modeling. With the data obtained from perception, the world-modeling
component builds a symbolic representation of the world. This representation
contains geometric details of all objects in the robot’s world with their positions
and orientation.
3. Planning. The planner then operates on these symbolic descriptions of the world and produces a sequence of actions to achieve the goals given by the users.
4. Task execution. This function controls the execution of the planned tasks
giving the set-points to each actuator.
5. Motor control. This control system is used to control the actuators in
accordance with the set-points.
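As an illustrative sketch of this five-stage pipeline, the stages can be chained as a strictly sequential loop. All function bodies below are invented stubs used only to show the data flow, not code from any actual system:

```python
# Minimal sketch of the sequential sense-model-plan-act cycle of
# Traditional AI. Every stage function is an invented stub; only the
# strict ordering of the components is the point.

def perceive(sensor_readings):
    # 1. Perception: resolve noise, keep usable readings.
    return [r for r in sensor_readings if r is not None]

def build_model(percepts):
    # 2. Modeling: build a (here trivially) symbolic world model.
    return {"objects": percepts}

def make_plan(world, goal):
    # 3. Planning: produce a sequence of actions towards the goal.
    if goal in world["objects"]:
        return ["turn_towards", "move_forward"]
    return ["explore"]

def control_cycle(sensor_readings, goal):
    # 4./5. Task execution and motor control would run this plan,
    # sending set-points to each actuator in order.
    world = build_model(perceive(sensor_readings))
    return make_plan(world, goal)
```

Every control cycle repeats the whole chain, which is precisely what makes the approach slow in dynamic environments.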
The principal characteristics which all Traditional AI approaches have in common are:
• Hierarchical structure. The main goals are divided into different tasks, sub-
tasks, etc, in a hierarchical manner. Higher levels in the hierarchy provide sub-
goals for lower subordinate levels. The tasks are accomplished using a top-down
methodology.
• Sequential processing. These processes are executed in serial form starting
with the sense activities, moving through the modeling, and planning, ending
with the actuation.
• Symbolic planner. The planner reasons, basing itself on a symbolic world
model. The world must be generated linking the physical perceptions to the
corresponding symbols.
• Functionally compartmentalised. There is a clear subdivision of the different tasks which must be carried out. Each component in the control architecture is in charge of only one of these functions.
In the 1970’s one of the earliest autonomous robots, called Shakey (Fikes &
Nilsson, 1971) (Nilsson, 1984), was built using a deliberative control architecture.
Shakey inhabited a set of specially prepared rooms. It navigated from room to
room, trying to satisfy a given goal. Experiments with this robot revealed new difficulties: the planning algorithms failed on non-trivial problems, the integration of the world representations was extremely difficult and, finally, the planning did not work as well as had been hoped. The algorithms were improved and better results were obtained. However, the environment had been totally adapted to the
robot’s perceptive requirements. Many other robotic systems have been built with
the traditional AI approach, some of the best known are (Albus, 1991), (Huang,
1996), (Lefebvre & Saridis, 1992), (Chatila & Laumond, 1985) and (Laird &
Rosenbloom P.S., 1990). Nevertheless, traditional approaches still have problems
when dealing with complex, non-structured and changing environments. Only in
structured and highly predictable environments have they proved to be suitable.
The principal problems found in Traditional AI can be listed as:
• Computation. Traditional AI requires large amounts of data storage and
intense computation. For an autonomous mobile robot this can be a serious
problem.
• Real time processing. The real world has its own dynamics and for this
reason systems must react fast enough to perform their tasks. Most often,
traditional AI isn’t fast enough because the information is processed centrally.
Modeling and planning are long sequential processes, and the longer they take,
the more the world will have changed by the time the robot decides to act. The agent needs many simple, parallel processes instead of a few long, sequential ones.
• Robustness and Generalization. Traditional AI usually lacks generalization capacity. If a novel situation arises, the system breaks down or stops altogether. It also does not take into consideration the problems of noise in the sensory data and actuators when building its symbolic representation.
• The accurate world model. In order to plan correctly, the world model must
be very accurate. This requires high-precision sensors and careful calibration,
both of which are very difficult and expensive.
• The Frame problem. This problem arises when trying to maintain a model of
a continuously changing world. If the autonomous robot inhabits a real world,
the objects will move and the light will change. In any event, the planner needs
a model with which to plan.
• The Symbol-grounding problem. The world model uses symbols, such as
“door”, “corridor”, etc, which the planner can use. The Symbol-grounding
problem refers to how the symbols are related to real world perceptions. The
planner is enclosed in a symbolic world model while the robot acts in the open real world.
In the middle of the 1980s, due to dissatisfaction with the performance of robots in
dealing with the real world, a number of scientists began rethinking the general
problem of organizing intelligence. Among the most important opponents to the AI
approach were Rodney Brooks (Brooks, 1986), Rosenschein and Kaelbling
(Rosenschein, 1986) and Agre and Chapman (Agre & Chapman, 1987). They
criticized the symbolic world which Traditional AI used and wanted a more
reactive approach with a strong relation to the perceived world and the actions therein. They implemented these ideas using a network of simple computational
elements indirectly connecting sensors to actuators in a distributed manner. There
were no central models of the world explicitly represented. The model of the world
used was the real one perceived by the sensors at each moment. Leading the new
paradigm, Brooks proposed the “Subsumption Architecture” which was the first
approach to the new field of “Behaviour-based Robotics”.
Instead of the top-down approach of Traditional AI, Behaviour-based systems use a
bottom-up philosophy like that in Reactive Robotics. Reactive systems provide
rapid real-time responses using a collection of pre-programmed rules. Reactive systems are characterized by fast stimulus-response connections; however, as they do not have any kind of internal state, they are incapable of using internal representations to deliberate or learn new behaviours. On the other hand, Behaviour-based systems
can store states in a distributed representation, allowing a certain degree of high-
level deliberation.
The Behaviour-based approach uses a set of simple parallel behaviours which react
to the perceived environment proposing the response the robot must take in order
to accomplish the behaviour (see Figure 2). There are neither problems of world
modeling nor of real time processing. Nevertheless, another difficulty has to be solved: how to select the proper behaviours for robustness and efficiency in accomplishing goals. Two new questions, which Traditional AI does not consider, also appeared: how to adapt the architecture in order to improve its goal-achievement, and how to adapt it when new situations appear. This simple
but powerful methodology was in great contrast to Traditional AI and, from its
beginning, provided for simplicity, parallelism, perception-action mapping and real
implementations.
Figure 2. A behaviour-based architecture: parallel behaviours (avoid obstacles, go to point, explore, manipulate the world) connecting sensors to actuators.
Behaviour-based Robotics has been widely used and investigated since then. This new field has attracted researchers from many diverse disciplines: biologists, neuroscientists, philosophers, linguists, psychologists and, of course, people working in computer science and artificial intelligence, all of whom have found practical uses for this approach in their various fields of endeavour. The field of Behaviour-based Robotics has also been referred to as the “New AI” or, under a more appropriate denomination for the fields mentioned above, “Embodied Cognitive Science”.
Figure 3. A hybrid architecture structured in three layers between sensors and actuators: a deliberative layer, a control execution layer and a reactive layer.
Finally, some researchers have adopted a hybrid approach between Traditional AI
and Behaviour-based Robotics. Hybrid systems attempt a compromise between
bottom-up and top-down methodologies. Usually the control architecture is
structured in three layers: the deliberative layer, the control execution layer and
the functional reactive layer (see Figure 3). The deliberative layer transforms the
mission into a set of tasks which form a plan. The reactive behaviour-based
system takes care of the real time issues related to the interactions with the
environment. The control execution layer interacts between the upper and lower
layers, supervising the accomplishment of the tasks. Hybrid architectures take
advantage of the hierarchical planning aspects of Traditional AI and the reactive
and real time aspects of behavioural approaches. Hybrid architectures have been
widely used. Some of the best known are AuRA (Arkin, 1986), the Planner-Reactor
Architecture (Lyons, 1992) and Atlantis (Gat, 1991) used in the Sojourner Mars
explorer.
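Schematically, the three layers can be sketched as nested function calls. The layer interfaces below are invented placeholders for illustration; they do not reproduce the OOCAA design:

```python
# Toy sketch of a three-layer hybrid architecture. The interfaces are
# invented for illustration and do not reproduce OOCAA.

def deliberative_layer(mission):
    # Transform the mission into an ordered plan of tasks.
    return list(mission)

def reactive_layer(task, percept):
    # Behaviour-based real-time response for the current task.
    return f"{task}:{percept}"

def control_execution_layer(mission, percepts):
    # Supervise task accomplishment, passing each task down to the
    # reactive layer as percepts arrive from the environment.
    actions = []
    for task, percept in zip(deliberative_layer(mission), percepts):
        actions.append(reactive_layer(task, percept))
    return actions
```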
3 Fundamentals of Behaviour-based Robotics
Behaviour-based Robotics is a methodology for designing autonomous agents and
robots. The behaviour-based methodology is a bottom-up approach inspired by
biology, a collection of behavioural acts in parallel achieving goals. Behaviours are
implemented as a control law using inputs and outputs. They can also store states
constituting a distributed representation system (Mataric, 1999). The basic
structure consists of all behaviours taking inputs from the robot’s sensors and
sending outputs to the robot’s actuators (see Figure 4). A coordinator is needed in order to send only one command at a time to the motors.
Figure 4. Basic behaviour-based structure: stimuli from the sensors feed Behaviour 1 … Behaviour n in parallel; a coordinator merges their outputs into a single command for the actuators.
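The structure of Figure 4 can be sketched as follows. The behaviour names and the priority-based coordinator rule are illustrative assumptions, not part of the report:

```python
# Minimal sketch of the structure in Figure 4: parallel behaviours map
# a stimulus to a proposed response; a coordinator picks one command.
# Behaviour names and the priority rule are illustrative assumptions.

def avoid_obstacles(stimulus):
    # Propose a turn only when an obstacle is sensed.
    return "turn_away" if stimulus.get("obstacle") else None

def go_to_point(stimulus):
    # Head towards the goal whenever one is given.
    return "head_to_goal" if stimulus.get("goal") else None

BEHAVIOURS = [avoid_obstacles, go_to_point]  # ordered by priority

def coordinator(stimulus):
    # Send only one command at a time: first active behaviour wins.
    for behaviour in BEHAVIOURS:
        response = behaviour(stimulus)
        if response is not None:
            return response
    return "stop"
```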
The internal structure of a behaviour can itself be composed of different modules connected to sensors, to other modules and, finally, to the coordinator (Brooks, 1986). However, behaviours must be completely independent of each
other. The global structure is a network of interacting behaviours comprising low-
level control and high-level deliberation abilities. The latter is performed by the distributed representation, which can contain states and consequently change the behaviour according to its information.
The parallel structure of simple behaviours allows a real time response with low
computational cost. Autonomous robots using this methodology can be built easily and at low cost. Behaviour-based Robotics has demonstrated reliable performance
in standard robotic activities such as navigation, obstacle avoidance, terrain
mapping, object manipulation, cooperation, learning maps and walking. (Arkin,
1998) and (Pfeifer & Scheier, 1999) are two principal references which overview the
field.
3.1 Principles
There are a few basic principles which have been used by all researchers in Behaviour-based Robotics. These principles provide the keys to the success of the methodology.
• Parallelism. Behaviours are executed concurrently. Each one can run on its
own processor. Parallelism appears at all levels, from behavioural design to
software and hardware implementation. This characteristic contributes to the
speed of computation and consequently to the dynamics between the robot and
the environment.
• Modularity. The system is organised into different modules (behaviours). The
important fact is that each module must run independently. This important
principle contributes to the robustness of the system. If, for example, one behaviour fails due to the breakdown of a sensor, the others will continue running and the robot will always remain under control. Another important
consequence of modularity is the possibility of building the system
incrementally. In the design phase, the priority behaviours will first be
implemented and tested. Once they run correctly, more behaviours will be
added to the system.
• Situatedness/Embeddedness. “Situatedness” means that the robot is situated in and surrounded by the real world. For this reason it must not operate on an abstract representation of reality; it must use the world as it is actually perceived. “Embeddedness” refers to the fact that the robot exists as a physical entity in the real world. This implies that the robot is subjected to physical forces, damage and, in general, to any influence from the environment. The robot should not try to model these influences or plan around them; instead it should use this system-environment interaction to act and react with the same dynamics as the world.
• Emergence. This is the most important principle of Behaviour-based Robotics.
It is based on the principles explained above and attempts to explain why the
set of parallel and independent behaviours can arrive at a composite behaviour
for the robot to accomplish the expected goals. Emergence is the property which
results from the interaction between the robotic behavioural system and the
environment. Due to emergence, the robot exhibits behaviours that were never explicitly pre-programmed or pre-designed. Numerous
researchers have talked about emergence. Two examples are “Intelligence
emerges from interaction of the components of the system” (Brooks, 1991) and
“Emergence is the appearance of novel properties in whole systems” (Moravec,
1988).
3.2 Expression of behaviours
There are several ways with which to express a robotic behaviour:
1. Stimulus-Response Diagrams
Behaviours are represented using Stimulus(input)-Response(output) (SR) blocks as
shown in Figure 5. Behaviours are arranged in a parallel structure and their outputs are channelled into a coordination mechanism which produces an appropriate response (Figure 4). Stimulus-Response diagrams are the most intuitive method of expressing behaviours.
Figure 5. A Stimulus-Response (SR) block: a stimulus enters the behaviour, which produces a response.
2. Functional Notation
Behaviours can also be expressed as mathematical functions. A coordination function then evaluates the individual behaviours and generates the response which will be transmitted to the actuators:

b(s) = r, where b : behaviour, s : stimulus, r : response

coordinate[ b1(s1), b2(s2), ..., bn(sn) ] = response
3. Finite State Acceptor Diagrams
Finite State Acceptors (FSA) (Arbib et al, 1981) are used to describe aggregations
and sequences of behaviours during the accomplishment of some high-level goals.
They make the active behaviours and the transitions between them explicit. A
finite state acceptor M is specified by M = (Q, δ, q0, F), where:

Q : set of allowable behavioural states
δ : transition function between behavioural configurations
q0 : starting behavioural configuration
F : set of accepting states
The finite state acceptor can be diagrammed as a set of circles representing behaviours, with arrows indicating the stimuli. Stimuli change the active behaviour; accepting states are represented by a double circle. Figure 6 shows a simple example of an FSA with states A, B and C, transition function δ(A, i1) = A, δ(A, i2) = B, δ(B, i3) = B, δ(B, i4) = C, δ(C, i5) = C, and M = ({A, B, C}, δ, A, {B, C}).
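The FSA of Figure 6 can be encoded directly as a transition table. The code below is an illustrative sketch following the figure, not code from the report:

```python
# Sketch of the FSA of Figure 6 as a transition table. States and
# inputs follow the figure; the encoding itself is illustrative.

TRANSITIONS = {
    ("A", "i1"): "A", ("A", "i2"): "B",
    ("B", "i3"): "B", ("B", "i4"): "C",
    ("C", "i5"): "C",
}
START = "A"
ACCEPTING = {"B", "C"}  # double-circled states in the figure

def run(inputs):
    # Follow the stimuli from the start state; report the final
    # behavioural state and whether it is an accepting one.
    state = START
    for stimulus in inputs:
        state = TRANSITIONS[(state, stimulus)]
    return state, state in ACCEPTING
```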
3.3 Behavioural choice and design
There are many approaches to choosing and designing the behaviours which must appear in the architecture. Three of the best known are described next.
1. Ethologically guided/constrained design
This method is based on animal behaviour. Behaviours are designed after
consulting biological literature searching for animal behaviours which can be used
for a robotic system. Then, the animal model which accomplishes the desired
behaviour is translated to a more suitable behaviour system for the robot. The
schema-based robotic navigational system was invented using this methodology
based on the navigational behaviour of the toad (Arbib & House, 1987). Figure 7
shows the design methodology for ethologically guided systems.
Figure 7. Ethologically guided design methodology: consult ethological literature, extract a model, import the model to the robot, run robotic experiments and evaluate the results; if needed, enhance the model or guide new biological experiments.
2. Situated activity-based design
A robot’s actions are predicated on the situations in which it may find itself. The design requires a solid understanding of the relationship between the robotic agent and its environment; all possible situations must then be related to a behaviour. The perception problem is reduced to recognising the situation in which the robot
finds itself at any given moment. Different projects can be found in (Agre &
Chapman, 1987) and (Schoppers, 1987). Figure 8 shows the situated activity design
methodology.
Figure 8. Situated activity design methodology: assess agent-environment dynamics, partition into situations, create situational responses, import the behaviours to the robot, run robotic experiments and evaluate the results; if needed, enhance, expand or correct the behavioural responses.
3. Experimentally driven design
The basic premise is to build a set of abilities, make a trial run in the real world,
debug imperfect behaviours and add to the behavioural system. This is a trial and
error approach. By iterative repetition of this process, the behaviour-based
architecture is built. This method has been widely used by such people as Brooks
(Brooks, 1989) and Payton (Payton et al, 1992), among others. Figure 9 shows the
experimentally driven design methodology.
Figure 9. Experimentally driven design methodology: build a minimal system, exercise the robot, evaluate the results and add new behavioural competences until done.
3.4 Behavioural encoding
To encode the behavioural response we must create a functional mapping from the stimulus plane to the motor plane. The motor plane usually has two parameters: the strength (magnitude of the response) and the orientation (direction of the action). A behaviour can be expressed as (S, R, β) where:
• S: Stimulus Domain
S is the domain of all perceivable stimuli. Each stimulus “s” is represented by
s(p,λ), where “p” is the class of perception and “λ” is its strength. Each class “p” has a threshold value τ above which a response is generated.
• R: Range of Responses
For autonomous vehicles with six degrees of freedom, the response r ∈ R of a
behaviour is a six-dimensional vector: r = [x, y, z, θ, φ, ψ] composed of the three
translation degrees of freedom and the three rotational degrees of freedom.
Each parameter is composed of strength and orientation values. When there are
different responses ri, the final response is ri ’= gi · ri , where gi is a gain which
specifies the strength of the behaviour relative to the others.
• β: Behavioural Mapping
The mapping function β relates the stimulus domain with the response range for each individual active behaviour: β(s) → r. This mapping function generates a response only when λ > τ. Behavioural mappings β can be:
• Null: the stimulus produces no motor response.
• Discrete: a countable set of responses. Examples of this kind of mapping can be found in (Bonasso, 1992) or in the Subsumption architecture (Brooks, 1986).
• Continuous: an infinite space of potential reactions. Examples can be found in potential fields (Khatib, 1985) or spin fields (Slack, 1990).
3.5 Assembling behaviours
When we combine and coordinate multiple behaviours, the emergent behaviour appears. This is the product of the interaction between the robotic system and the real world. The coordination function is

ρ = C(G * B(S))

where:
ρ : six-dimensional final vector response
C : coordination function
G : relative strengths of the behaviours
* : element-by-element product
B : behaviours
S : stimuli
The two primary coordination mechanisms are:
• Competitive methods. The output is the selection of a single behaviour. The
coordinator chooses only one behaviour to control the robot. Depending on
different criterions the coordinator determines which behaviour is best for the
control of the robot. Preferable methods are suppression networks such as
Subsumption architecture (Brooks, 1986), action-selection (Maes, 1990) and
voting-based coordination (Rosenblatt & Payton, 1989).
• Cooperative methods. The output is the superposition of the force gradients
given by all the behaviours. The coordinator applies a method which takes all the behavioural responses and generates an output which will control the robot.
Behaviours which generate a stronger output will impose a greater influence on
the final behaviour of the robot. Principal methods are vector summation
(potential fields, (Khatib, 1985)) and behavioural blending (Saffiotti et al, 1995).
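The two coordination mechanisms can be sketched for two-dimensional response vectors. The behaviour outputs and gains below are invented for illustration:

```python
# Sketch of the two coordination mechanisms for 2-D response vectors
# (x, y). Behaviour outputs and gains are invented for illustration.

def competitive(responses, priorities):
    # Competitive: select the single response of the highest-priority
    # behaviour that is active (non-None); all others are suppressed.
    for name in priorities:
        if responses.get(name) is not None:
            return responses[name]
    return (0.0, 0.0)

def cooperative(responses, gains):
    # Cooperative: gain-weighted vector summation of all responses,
    # as in potential-field methods.
    x = sum(gains[n] * r[0] for n, r in responses.items() if r)
    y = sum(gains[n] * r[1] for n, r in responses.items() if r)
    return (x, y)
```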
Basic behaviour-based structures use only a coordinator which operates using all
the behaviours to generate the robot’s response. However, there are more complex
systems with different groups of behaviours coordinated by different coordinators.
Each group generates an output and, from these intermediate outputs, the final response of the robot is generated through a final coordinator. These recursive
structures are used in high level deliberation. By means of these structures a
distributed representation can be made and the robot can behave differently
depending on internal states, achieving multiphase missions.
3.6 Adaptive Behaviour-based Robotics
One of the fields associated with Behaviour-based Robotics is adaptation. Intelligence cannot be understood without adaptation. If a robot requires autonomy and robustness, it must adapt itself to the environment. The primary reasons for autonomous adaptivity are:
• The robot’s programmer doesn’t know all the parameters of the behaviour-
based system.
• The robot must be able to perform in different environments.
• The robot must be able to perform in changing environments.
The properties which adaptive systems in robotics must exhibit are (Kaelbling, 1999):
• Tolerance to sensor noise.
• Adjustability. A robot must learn continuously while performing its task in the
environment.
• Suitability. Learning algorithms must be adequate for all kinds of
environments.
• Strength. The adaptive system must possess the ability to drive the robot to a desired place in order to obtain the desired data.
However, adaptation is a broad term. According to (McFarland, 1991) there are various levels of adaptation in a behaviour-based system:
• Sensory Adaptation. Sensors become more attuned to the environment and
changing conditions of light, temperature, etc.
• Behavioural Adaptation. The individual behaviours are adjusted relative to
the others.
• Evolutionary Adaptation. This adaptation is done over a long period of time
inducing change in individuals of one species, in this case robots. Descendants
change their internal structure based on the success or failure of their ancestors
in the environment.
• Learning as Adaptation. The robot learns different behaviours or different
coordination methods which improve its performance.
Using the mathematical notation explained in sections 3.2 and 3.4 we can see
where adaptation can be applied to a simple behaviour:
ri = gi · bi (si)
1. The mapping function bi, which relates the stimulus si to the response ri.
2. The magnitude of the response which is controlled by the gain gi.
3. The necessity of a new behaviour i.
In the coordination function, from section 3.5, there are also some terms which can
be adapted:
ρ = C ( G * B (S))
4. The set of behaviours bi which constitute an assemblage B, in case the
behaviour-based system has more than one coordination phase.
5. The relative strengths of each behaviour response G.
6. The coordination function C.
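As a toy illustration of item 5 (adapting the relative strengths G), the gains could be nudged by a scalar performance signal. The update rule and learning rate below are invented examples, not a method from the report:

```python
# Toy illustration of adapting the relative strengths G (item 5):
# nudge each active behaviour's gain with a scalar performance
# signal. The update rule and rate are invented for illustration.

def adapt_gains(gains, active, reward, rate=0.1):
    # Increase the gains of behaviours active during good outcomes,
    # decrease them after bad ones; keep gains non-negative.
    return {
        name: max(0.0, g + rate * reward) if name in active else g
        for name, g in gains.items()
    }
```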
As can be seen, many parameters can be adapted in a behaviour-based robotic
system. At the moment there are still few examples of real robotic systems which learn to behave (Kaelbling, 1999), nor is there an established methodology for developing adaptive behaviour-based systems. The two approaches most commonly
used are Reinforcement learning and Evolutionary techniques (Ziemke, 1998;
Arkin, 1998; Dorigo & Colombetti, 1998; Pfeifer & Scheier, 1999). Both have
interesting characteristics but also disadvantages like the convergence time or the
difficulties in finding a reinforcement or fitness function respectively. In many
cases they are implemented over control architectures based on neural networks.
Using the adaptive methodologies the weights of the network are modified until an
optimal response is obtained. The two approaches have demonstrated the
feasibility of the theories in real robots in all levels of adaptation. The basic ideas of
the two methodologies are next described.
• Reinforcement learning.
Reinforcement learning is a class of learning algorithm in which a scalar
evaluation (reward or punishment) of the performance of the control architecture
is available from the interaction with the environment. The goal of an RL
algorithm is to maximise the expected evaluation by adjusting the parameters of
the control architecture; this adjustment determines the control policy being
applied. The evaluation is generated by a critic using a utility function. Generally
this utility function is unknown and must also be learned. The evaluation can be
returned immediately or after a delay; in the latter case the reinforcement has to
be distributed over the current and past control signals. The two predominant
algorithms in RL are the Adaptive Heuristic Critic (AHC) and Q-learning. In
AHC, the process of learning the decision policy is separated from learning the
utility function of the critic, whereas in Q-learning a single "Q" function is learned
to evaluate both. This Q function denotes the total reinforcement obtained by
choosing an action first and then following the policy in future time steps.
Currently, Q-learning dominates adaptive behaviour-based architectures using
reinforcement learning. For a complete introduction to RL refer to (Kaelbling,
1996).
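The core of Q-learning is a single update rule. The following C sketch shows one tabular update step; the state and action sets, learning rate and discount factor are illustrative assumptions, not parameters of any specific robotic system.

```c
#define N_STATES  4
#define N_ACTIONS 2

/* Q-table, zero-initialised. One update applies:
   Q(s,a) += alpha * (r + gamma * max_a' Q(s',a') - Q(s,a)) */
static double Q[N_STATES][N_ACTIONS];

double max_q(int state) {
    /* best estimated return achievable from 'state' */
    double best = Q[state][0];
    for (int a = 1; a < N_ACTIONS; a++)
        if (Q[state][a] > best) best = Q[state][a];
    return best;
}

void q_update(int s, int a, double reward, int s_next,
              double alpha, double gamma) {
    /* move Q(s,a) towards the bootstrapped target */
    Q[s][a] += alpha * (reward + gamma * max_q(s_next) - Q[s][a]);
}
```

Repeating this update while the robot interacts with its environment drives the Q-table towards the expected total reinforcement of each state-action pair.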
• Evolutionary Robotics.
Evolutionary learning techniques are inspired by the mechanisms of natural
selection, the principal method being Genetic Algorithms (GAs). Evolutionary
algorithms typically start from a randomly initialised population of individuals
(genotypes) encoded as strings of bits or real numbers. Each individual encodes
the control system of a robot and is evaluated in the environment; from this
evaluation a fitness score is assigned which measures the ability to perform the
desired task. Individuals that obtain higher fitness values are allowed to
reproduce by generating copies of their genotypes with changes introduced by
genetic operators (mutation, crossover, duplication, etc.). By repeating this
process over several generations, the fitness values of the population increase.
Evolutionary robotics has shown good results in real robots. The techniques are
usually applied over neural networks, modifying the weights of the nodes.
Evolutionary algorithms have yielded more reliable solutions than reinforcement
learning when the reinforcement frequency is low. However, evolutionary
approaches also have real-time problems due to the long time needed to converge
on a reasonable solution. For a complete introduction to Evolutionary Robotics
refer to (Nolfi, 1998) and (Harvey et al., 1997).
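The generational loop just described can be sketched as follows. The fitness function (a toy "one-max" bit-counting task standing in for a robot evaluation) and the elitist, mutation-only reproduction scheme are simplifying assumptions; a full GA would also include crossover and stochastic selection.

```c
#include <stdlib.h>

#define POP   8
#define GENES 16

/* Toy fitness: count of 1-genes, standing in for an evaluation score. */
int fitness(const unsigned char *g) {
    int f = 0;
    for (int i = 0; i < GENES; i++) f += g[i];
    return f;
}

/* One generation: copy the fittest individual over the population, then
   flip each gene with probability 1/GENES (individual 0 is kept as an
   unmutated elite copy). */
void next_generation(unsigned char pop[POP][GENES]) {
    int best = 0;
    for (int i = 1; i < POP; i++)
        if (fitness(pop[i]) > fitness(pop[best])) best = i;
    unsigned char elite[GENES];
    for (int j = 0; j < GENES; j++) elite[j] = pop[best][j];
    for (int i = 0; i < POP; i++)
        for (int j = 0; j < GENES; j++) {
            pop[i][j] = elite[j];
            if (i != 0 && rand() % GENES == 0) pop[i][j] ^= 1;
        }
}
```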
4 Behaviour-based Approaches
Many proposals have appeared since the field of Behaviour-based Robotics began in
1986 with Subsumption architecture, yet there still isn’t a clear classification of the
different techniques in the literature. A global classification, accepted by the
majority of scientists (Ziemke, 1998) (Maes, 1994), lists systems according to their
adaptability. The first proposals were non-adaptive approaches or engineering
approaches. In these proposals, a sophisticated action selection mechanism is
implemented and tuned manually. Later on, adaptive approaches appeared. These
architectures have a simplistic action selection mechanism with sophisticated
learning techniques such as reinforcement learning or evolutionary approaches
(section 3.6).
This report has concentrated on the engineering approaches because they form the
base of Behaviour-based Robotics and because most adaptive approaches are also
based on them. From among the many proposals, four architectures have been
chosen which we think represent the overall methodologies and which have
successfully been implemented in real robots. These architectures were designed
independently and are based on different ideas within the field of Behaviour-based
Robotics. The architectures and their basic characteristics can be seen in Table 1.
Control architecture | Behavioural choice and design | Behavioural encoding | Assembling behaviours | Programming method
Subsumption architecture | Experimentally | Discrete | Competitive, arbitration via inhibition and suppression | AFSM, Behavioural Language or behavioural libraries
Action Selection Dynamics | Experimentally | Discrete | Competitive, arbitration via levels of activation | Mathematical algorithms
Schema-based approach | Ethologically | Continuous using potential fields | Cooperative via vector summation and normalisation | Parameterised behavioural libraries
Process Description Language | Experimentally | Continuous | Cooperative via values integration and normalisation | Process Description Language
Table 1
4.1 Subsumption architecture
4.1.1 Description
The Subsumption architecture was designed by Rodney Brooks in the 1980s at the
Massachusetts Institute of Technology. His work opened the field of Behaviour-
based Robotics. To overcome the problems encountered in Traditional AI when
designing real robotic systems, Brooks proposed a completely different
methodology. He rejected the centralised symbolic world model and proposed a
decentralised set of simple modules which reacted more rapidly to environmental
changes. To accomplish this, he presented the Subsumption architecture in 1986
with the paper “A Robust Layered Control System for a Mobile Robot” (Brooks,
1986). Later on, he modified a few aspects of the architecture as a result of
suggestions from J.H. Connell. The final modifications on the Subsumption
architecture can be found in (Brooks, 1989) and (Connell, 1990). Subsumption
Architecture has been widely applied in all kinds of robots since then. Some further
modifications have also been proposed. However, in this report, the original
Subsumption approach will be described.
Subsumption Architecture is a method of reducing a robot’s control architecture
into a set of task-achievement behaviours or competences represented as separate
layers. Individual layers work on individual goals concurrently and
asynchronously. All the layers have direct access to the sensory information.
Layers are organised hierarchically allowing higher layers to inhibit or suppress
signals from lower layers. Suppression eliminates the control signal from the lower
layer and substitutes it with the one proceeding from the higher layer. When the
output of the higher layer is not active, the suppression node doesn’t affect the
lower layer signal. Inhibition, on the other hand, eliminates the signal from the
lower layer without substituting it. Through these mechanisms, higher-level
layers can subsume lower-levels. The hierarchy of layers with the suppression and
inhibition nodes constitute the competitive coordination method, see Figure 10.
Figure 10: Behaviour layers between sensors and actuators, coordinated through suppression (S) and inhibition (I) nodes.
All layers are constantly attentive to the sensory information. When the output of a
layer becomes active, it suppresses or inhibits the outputs of the layers below,
taking control of the vehicle. Internally, a layer has states and timers allowing
deliberation and continued activity for a period of time after the activation
conditions have finished.
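The two coordination nodes can be sketched as simple functions. The signal representation below is an assumption made for illustration; it is not part of Brooks's formalism.

```c
/* Illustrative signal carried on a wire between layers. */
typedef struct {
    int active;      /* the layer currently has an output */
    double command;  /* control signal when active */
} LayerOutput;

/* Suppression node: the higher layer's signal replaces the lower one's
   whenever the higher layer is active. */
LayerOutput suppress(LayerOutput higher, LayerOutput lower) {
    return higher.active ? higher : lower;
}

/* Inhibition node: an active higher layer cancels the lower signal
   without substituting its own. */
LayerOutput inhibit(LayerOutput higher, LayerOutput lower) {
    if (higher.active) { lower.active = 0; lower.command = 0.0; }
    return lower;
}
```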
This architecture can be built incrementally, adding layers at different phases. For
example, the basic layers, such as “go to behaviour” or “avoid obstacles behaviour”,
can be implemented and tested in the first phase. Once they work properly, new
layers can be added without the necessity of redesigning previous ones.
The layers of the Subsumption architecture were originally designed as a set of
modules called Augmented Finite State Machines (AFSMs). Each AFSM is
composed of a Finite State Machine (FSM) connected to a set of registers and a
set of timers or alarm clocks, see Figure 11. Registers store information from
inside the FSM as well as from outside sensors and other AFSMs. Timers enable
state changes after a certain period of time, while the finite state machine
changes its internal state depending on the current state and inputs. An input
message or the expiration of a timer can change the state of the machine. Inputs
to an AFSM can be suppressed by other machines and its outputs can be
inhibited. An AFSM behaves like a single FSM but with the added characteristics
of registers and timers.
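A minimal sketch of an AFSM follows, assuming a table-driven transition function; the field names and sizes are illustrative, not Brooks's notation.

```c
/* Sketch of an Augmented FSM: a plain FSM state plus registers and a
   countdown timer. All names and sizes are illustrative assumptions. */
typedef struct {
    int state;          /* current FSM state */
    double regs[4];     /* registers latching sensor / other-AFSM values */
    int timer;          /* ticks left until a forced state change */
    int timeout_state;  /* state entered when the timer expires */
} AFSM;

/* One clock tick: an expired timer forces a state change; otherwise a
   transition table maps (state, input) to the next state. */
void afsm_tick(AFSM *m, int input, int table[][2]) {
    if (m->timer > 0 && --m->timer == 0)
        m->state = m->timeout_state;
    else
        m->state = table[m->state][input];
}
```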
Figure 11: An Augmented Finite State Machine: a finite state machine connected to registers and timers, with inhibition (I) and suppression (S) points.
Using a set of augmented finite state machines a layer can be implemented to act
like a behaviour. A layer is constructed of a network of AFSM joined by wires with
suppression and inhibition nodes. Figure 12 shows the AFSM network designed by
Brooks for a mobile robot with the layers “avoid objects”, “wander” and “explore”,
(Brooks, 1986). As the figure shows, designing the network so as to accomplish a
desired behaviour is not exactly clear. For this reason, Brooks developed a
programming language, the Behavioural Language (Brooks, 1990), which
generates the AFSM network using a single rule set for each behaviour. The high-
level language is compiled to the intermediate AFSM representation and then
further compiled to run on a range of processors.
Figure 12
One of the principles of the Subsumption architecture is the independence of the
layers. The implementation methodology, as stated above, consists of building the
layers incrementally once the previous layers have been tested. Nevertheless, in
Figure 12, we can see some wires which go from one layer to another breaking the
independence. This fact was shown by Connell (Connell, 1990) who proposed a total
independence of the layers until the coordination phase. This assures the
possibility of implementing the layers incrementally without redesigning the
previous ones. This is also useful in order to map each layer into a different
processor in the robot. Connell and other researchers have also proposed other
formalisms for implementing the layers instead of AFSMs. Usually, computer
programs based on simple rules are used, without FSMs, for ease of
programming. AFSMs should be considered the formalism Brooks chose, for its
simplicity and rapid processing, to implement the Subsumption architecture, not
as a part of the architecture itself.
4.1.2 Implementation
Subsumption Architecture has been implemented with three behaviours and
tested on an AUV in a simulated underwater environment. The mission consisted
of reaching a collection of way-points while avoiding obstacles and traps. The first
behaviour “Go to” is in charge of driving the vehicle toward the way-points. The
second behaviour “Obstacle avoidance” has the goal of maintaining the vehicle
away from obstacles. The third behaviour “Avoid trapping” is used to depart from
zones in which the vehicle could be trapped. For a complete description of the
behaviours and the simulated environment refer to Appendix A.
The three behaviours have been implemented as three functions which take
sensory information as input and output, when the behaviour becomes active, the
three-dimensional velocity the vehicle should follow. Two suppression nodes were
used to coordinate the behaviours. AFSMs were not used, due to the simplicity of
the executable functions in the simulated environment; however, as mentioned
before, the implementation formalism is not an essential part of the Subsumption
architecture. The three behaviours were arranged as shown in Figure 13.
Figure 13: The three behaviours, "Obstacle avoidance", "Avoid trapping" and "Go to", coordinated by two suppression nodes.
The hierarchy of behaviours was constructed as follows: at the top, the “Obstacle
avoidance” behaviour followed by the “Avoid trapping” and “Go to” behaviours. This
hierarchy primarily assures that the vehicle stays away from obstacles. As the
mission never requires proximity to obstacles, the “Obstacle avoidance” in the top
level assures the safety of the vehicle. At the second level, the "Avoid trapping"
behaviour takes control from the "Go to" behaviour if the vehicle becomes
trapped. As in an AFSM, a timer was used to keep each behaviour active for 5
seconds after its activation conditions had finished.
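The priority-plus-timer arbitration used in this experiment can be sketched as follows. The tick-based persistence counter assumes a 1-second sample time, and the function and index names are illustrative.

```c
/* Priority order used in the experiment: 0 = obstacle avoidance,
   1 = avoid trapping, 2 = go to. A countdown keeps a behaviour active
   for PERSIST ticks after its activation conditions stop holding
   (the 5-second timer, assuming a 1-second sample time). */
#define N_BEH   3
#define PERSIST 5

int select_behaviour(const int conditions[N_BEH], int timers[N_BEH]) {
    /* refresh or decay each behaviour's persistence timer */
    for (int i = 0; i < N_BEH; i++)
        if (conditions[i]) timers[i] = PERSIST;
        else if (timers[i] > 0) timers[i]--;
    /* highest-priority behaviour still active wins */
    for (int i = 0; i < N_BEH; i++)
        if (timers[i] > 0) return i;
    return N_BEH - 1;  /* default: "go to" */
}
```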
Some graphical results of the Subsumption approach can be seen in the next three
figures. Figure 14 shows an aerial view of the simulation and Figure 15 shows a
lateral view. Finally Figure 16 shows a three-dimensional representation of the
simulation.
Figure 14
Figure 15
Figure 16
Given the results, it can be said that the principal advantages of the Subsumption
approach are robustness, modularity and easy tuning of the behaviours.
Behaviours can be tuned individually and, once they work properly, mixed. The
design of the hierarchy is very easy once the priorities of the behaviours are known
(a difficult task when working with a large architecture). This architecture is very
modular, every behaviour can be implemented with a different processor with the
responses coordinated as a final step. A sequential algorithm is not necessary as all
the behaviours are completely independent. The principal disadvantage is the non-
optimal trajectories, due to the competitive coordination method, with a lot of
bends when the active behaviour changes. Table 2 summarises the principal
characteristics of Subsumption Architecture.
SUBSUMPTION ARCHITECTURE
Developer Rodney Brooks, Massachusetts Institute of Technology
References (Brooks, 1986; Brooks, 1989; Connell, 1990)
Behavioural choice and design Experimentally
Behavioural encoding Discrete
Assembling behaviours Competitive, arbitration via inhibition and suppression
Programming method AFSM, Behavioural Language or behavioural libraries
Positive aspects Modularity, Robustness and Tuning time
Negative aspects Development time and performance (non-optimal paths)
Table 2
4.2 Action Selection Dynamics
4.2.1 Description
Action Selection Dynamics (ASD) is an architectural approach which uses a
dynamic mechanism for behaviour (or action) selection. Pattie Maes from the AI-
Lab at MIT developed it toward the end of the 1980s. Principal references are
(Maes, 1989), (Maes, 1990) and (Maes, 1991). Behaviours have associated
activation levels which are used to arbitrate competitively the activity which will
take control of the robot. Other approaches to action selection have been proposed
(Tyrell, 1993); however, ASD is the best known and most commonly cited.
Action Selection Dynamics uses a network of nodes to implement the control
architecture. Each node represents a behaviour (an external behaviour of the
robot). The nodes are called competence modules. This network of modules is used
to determine which competence module will be active and therefore control the
robot. The coordination method is competitive, only one module can be active at
any moment. To activate the competence modules some binary states are used.
Each competence module has three lists of states which define its interaction
within the network. The first list is the precondition list and contains all the states
which should be true so that the module becomes executable. The second list is the
add list and contains all the states which are expected to be true after the
activation of the module. Finally the third list is the delete list and contains the
states which are expected to become false after the execution of the module.
The states are external perceptions of the environment gathered by the sensors.
Usually some kind of processing will be necessary to transform the analogue
outputs of the sensors to a binary state. For example, for the state “No_Obstacle”,
all the values provided by the sonar have to be processed to determine if there are
nearby obstacles. The states can also be internal assumptions or motivations of the
robot. The state “Way-point_Reached” could be one example. The states are also
used to determine the goals and protected goals of the robot. The goals would be the
states which are desired to be true. The protected goals are the goals already
achieved and therefore retained. The mission of the robot is defined by the
assignment of the goals to some states.
Once all the states and competence modules are defined, the decision network can
be built. Different links appear between the nodes based on the precondition, add
and delete lists:
• Successor link: For each state which appears in the add list of module A and
in the precondition list of module B, a successor link joins A with B.
• Predecessor link: A predecessor link joins B with A if there is a successor
link between A and B.
• Conflicter link: For each state which appears in the delete list of module B
and in the precondition list of module A, a conflicter link joins A with B.
In Figure 17 successor links can be seen as continuous arrows and conflicter links
as lines ending in a black dot. Note that predecessor links are inverted successor
links.
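The link construction follows directly from the three lists. The bit-vector representation of the lists below is an assumption made for illustration.

```c
#define N_MODULES 3
#define N_STATES  6

/* Each list is a bit vector over the binary states. */
typedef struct {
    int precond[N_STATES];
    int add[N_STATES];
    int del[N_STATES];
} Module;

/* Successor link A->B: some state is in A's add list and in B's
   precondition list. */
int successor_link(const Module *a, const Module *b) {
    for (int s = 0; s < N_STATES; s++)
        if (a->add[s] && b->precond[s]) return 1;
    return 0;
}

/* Conflicter link A->B: some state is in B's delete list and in A's
   precondition list. */
int conflicter_link(const Module *a, const Module *b) {
    for (int s = 0; s < N_STATES; s++)
        if (b->del[s] && a->precond[s]) return 1;
    return 0;
}
```

Given the lists of every module, the whole network can be generated automatically by evaluating these two predicates for each ordered pair of modules.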
Figure 17: An ASD network of six competence modules linked to binary states, goals and a protected goal, with successor links (arrows) and conflicter links (lines ending in a dot).
Activation of competence modules occurs depending on the quantity of energy they
are given. The energy is spread in two phases. In the first phase three different
mechanisms are used:
1. Activation by the states: if at least one state in the precondition list is true,
activation energy is transferred to the competence module.
2. Activation by the goals: if at least one state in the add list belongs to a goal
state, activation energy is transferred to the competence module.
3. Activation by the protected goals: if at least one state in the delete list
belongs to a protected goal state, activation energy is removed from the
competence module.
The spread of energy in phase one is shown in Figure 17 with dotted lines. On the
other hand, phase two spreads energy from competence modules. Three
mechanisms are also used:
1. Activation of Successors: Executable modules spread a fraction of their own
energy to successors which aren’t executable if the state of the link is false.
The goal is to increase activation of modules which become executable after
the execution of the predecessor module.
2. Activation of Predecessors: Non-executable modules spread a fraction of their
own energy to their predecessors if the state of the link is false. The goal is to
spread energy to the modules whose execution would make the successor
module executable.
3. Inhibition of Conflicters: Competence modules decrease the energy of their
conflicter modules if the state of the link is true. The goal is to decrease the
energy of modules which, by becoming active, would render the module's
preconditions false.
In each cycle the competence modules increase or decrease their energy until a
global maximum and minimum level are reached. The activated module has to
fulfil three conditions:
1. It has to be executable (all preconditions have to be true).
2. Its level of energy has to surpass a threshold.
3. Its level of energy has to be higher than that of any other module fulfilling
conditions 1 and 2.
When a module becomes active, its level of energy is reinitialised to 0. If none of
the modules fulfil condition 2, the threshold is decreased. Several parameters are
used for the thresholds and the amount of energy to be spread. Also, normalisation
rules assure that all modules have the same opportunity to become active. Note
that module energy accumulates incrementally, so the sample time becomes very
important: it determines the speed of the accumulation. For the mathematical
notation of this algorithm refer to (Maes, 1989).
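The selection rule defined by the three conditions above can be sketched as follows; the data representation is an illustrative assumption, and the energy-spreading phases are assumed to have run beforehand.

```c
#define N_MOD 3

/* Select the active module: it must be executable (all preconditions
   true), its energy must surpass the threshold, and it must have the
   highest energy among the modules that qualify. Returns -1 if none
   qualifies (the caller would then lower the threshold). The winner's
   energy is reset to 0, as the algorithm prescribes. */
int select_module(const int executable[N_MOD], double energy[N_MOD],
                  double threshold) {
    int winner = -1;
    for (int i = 0; i < N_MOD; i++)
        if (executable[i] && energy[i] > threshold &&
            (winner < 0 || energy[i] > energy[winner]))
            winner = i;
    if (winner >= 0) energy[winner] = 0.0;
    return winner;
}
```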
The intuitive idea of Action Selection Dynamics is that by using the network and
the spreading of energy, after some time, the active module is the best action to
take for the current situation and current goals. Although Action Selection
Dynamics is complex and difficult to design, it has been successfully tested in real
robotic systems.
4.2.2 Implementation
Three behaviours have been implemented and tested using Action Selection
Dynamics in an underwater simulated environment with an AUV. As in
Subsumption, the mission consisted of reaching a collection of way-points while
avoiding obstacles and entrapment. The first behaviour "Go to" is in
charge of driving the vehicle toward the way-points. The second behaviour
“Obstacle avoidance” has the goal of keeping the vehicle away from obstacles. And
the third behaviour “Avoid trapping” is used to depart from zones in which the
vehicle could be trapped. For a complete description of the behaviours and the
simulated environment refer to Appendix A.
In the implementation of the ASD network, the three behaviours represent three
competence modules. Each behaviour has been implemented in a function. The
coordination module which contains the ASD network has also been implemented
in another function following the mathematical notations found in (Maes, 1989).
Each module uses different binary states in its lists. In Table 3 all the states used
are shown. Note that from the six states, three are the negation of the other three.
This simplifies the ASD algorithm. The goal of the robot is the state
"NO_WAY-POINT": when this state is true, the robot has arrived at and passed
through all the way-points and the mission is therefore complete. Once the states are
described, the precondition list, add list and delete list of each module have to be
defined, see Table 4.
State Description
WAY-POINT There is a way-point to go to.
NO_WAY-POINT There isn’t any way-point.
OBSTACLE There is a nearby obstacle.
NO_OBSTACLE There isn’t any nearby obstacle.
TRAPPED The vehicle cannot depart from the same zone.
NO_TRAPPED The vehicle is moving through different zones.
Table 3
Go To:
  CONDITION LIST: WAY-POINT, NO_OBSTACLE, NO_TRAPPED
  ADD LIST: NO_WAY-POINT, OBSTACLE, TRAPPED
  DELETE LIST: WAY-POINT
Obstacle Avoidance:
  CONDITION LIST: OBSTACLE
  ADD LIST: NO_OBSTACLE
  DELETE LIST: OBSTACLE
Avoid Trapping:
  CONDITION LIST: WAY-POINT, NO_OBSTACLE, TRAPPED
  ADD LIST: NO_TRAPPED, OBSTACLE
  DELETE LIST: TRAPPED
Table 4
With the lists of each competence module the network can be implemented, see
Figure 18. Note that the design phase for the ASD architecture consists of the
specification of the states and the lists. Then, the entire network can be generated
automatically and only some parameters must be tuned.
Figure 18: The ASD network for the "Go to", "Obstacle avoidance" and "Avoid trapping" competence modules; the executable module whose energy satisfies max{E1, E2, E3} > threshold becomes active.
The ASD network spreads energy from the states, from the goal and between the
competence modules. The competence module which is executable and has more
energy than the others and than the threshold activates and controls the robot
until the next iteration of the algorithm. The parameters of the decision network that have been
used can be seen in Table 5. The spreading of energy between the modules is
determined by the relationship between different parameters.
Parameter Description Value
π Maximum level of energy per module 40 units
φ Amount of energy spread by a state 10 units
γ Amount of energy spread by a goal 20 units
δ Amount of energy spread by a protected goal 0 units
θ Threshold for becoming active 15 units
Ts Sample time 1 second
Table 5
Some graphical results of the ASD approach can be seen in the next three figures.
Figure 19 shows an aerial view of the simulation and Figure 20 shows a lateral
view. Finally Figure 21 shows a three-dimensional representation of the
simulation. As can be seen, the path obtained with Action Selection Dynamics is
close to optimal. The principal advantages of this method are the robustness of
the architecture and the automatic coordination once the network has been
generated. However, the design and implementation phases are very complex and
difficult.
Figure 19
Figure 20
Figure 21
Table 6 summarises the principal characteristics of Action Selection Dynamics.
ACTION SELECTION DYNAMICS
Developer Pattie Maes, Massachusetts Institute of Technology
References (Maes, 1989; Maes, 1990; Maes, 1991)
Behavioural choice and design Experimentally
Behavioural encoding Discrete
Assembling behaviours Competitive, arbitration via levels of activation
Programming method Mathematical algorithms
Positive aspects Modularity and Robustness
Negative aspects Development time and complexity
Table 6
4.3 Motor Schemas approach
4.3.1 Description
Schema-based theories appeared in the eighteenth century as a philosophical
model for the explanation of behaviour. Schemas were defined as the mechanism of
understanding sensory perception in the process of storing knowledge. Later on, in
the beginning of the twentieth century, the schema theory was adapted in
psychology and neuroscience as a mechanism for expressing models of memory and
learning. Finally in 1981, Michael Arbib adapted the schema theory for a robotic
system (Arbib, 1981). He built a simple schema-based model inspired by the
behaviour of the frog to control robots. Since then, schema-based methodologies
have been widely used in robotics. The principal proposal is Motor Schemas
developed by Ronald Arkin at Georgia Institute of Technology, Atlanta. Arkin
proposed Motor Schemas (Arkin, 1987) as a new methodology of Behaviour-based
Robotics.
From a robotic point of view “a motor schema is the basic unit of behaviour from
which complex actions can be constructed; it consists of the knowledge of how to act
or perceive as well as the computational process by which it is enacted” (Arkin,
1993). Each schema operates as a concurrent, asynchronous process initiating a
behavioural intention. Motor schemas react proportionally to sensory information
perceived from the environment. All schemas are always active producing outputs
to accomplish their behaviour. The output of a motor schema is an action vector
which defines the way the robot should move. The vector is produced using the
potential fields method (Khatib, 1985). However, instead of producing an entire
field, only the robot’s instantaneous reaction to the environment is produced
allowing a simple and rapid computation.
The coordination method is cooperative and consists of vector summation of all
motor schema output vectors and normalisation. A single vector is obtained
determining the instantaneous desired velocity for the robot. Each behaviour
contributes to the emergent global behaviour of the system. The relative
contribution of each schema is determined by a gain factor. Safety or dominant
behaviours must have higher gain values. Normalisation assures that the final
vector is within the limits of the particular robot’s velocities. Figure 22 shows the
structure of motor schema architecture.
Figure 22: The motor schema architecture: behaviour outputs are summed with their gains (R = Σ Gi · Ri) and normalised before reaching the actuators.
Implementation of each behaviour can be done with parameterised behavioural
libraries in which behaviours like “move ahead”, “move-to-goal”, “avoid-static-
obstacle”, “escape” or “avoid-past” can be found. Schemas have internal parameters
depending on the behaviour and an external parameter, the gain value. Each
schema can be executed on a different processor. Nevertheless, the outputs must
have the same format in order to be summed by the coordinator. For two-
dimensional vehicle control refer to (Arkin, 1989) and for three-dimensional control
to (Arkin, 1992). For a set of positions, each behaviour generates a potential field
which indicates the directions to be followed by the robot in order to accomplish the
behaviour. Merging all the behaviours using the coordinator provides a global
potential field giving an intuitive view of the motor schema architecture
performance, Figure 23.
Figure 23
4.3.2 Implementation
Motor Schema architecture has been implemented with three behaviours and
tested in an underwater simulated environment with an AUV. As in the previous
tests, the mission consisted of reaching a collection of way-points while avoiding
obstacles and entrapment. The first behaviour "Go to" is in charge of driving
the vehicle toward the way-points. The second behaviour “Obstacle avoidance” has
the goal of keeping the vehicle away from obstacles. And the third behaviour
“Avoid trapping” is used to depart from zones in which the vehicle could be
trapped. For a complete description of the behaviours and the simulated
environment refer to Appendix A.
Each behaviour has been implemented in a different function. A simple
coordination module is used to sum the signals and normalise the output. The
structure of the control architecture can be seen in Figure 24. After tuning the
system, “Obstacle avoidance” behaviour has the highest gain value, followed by
“Avoid trapping” and “Go to” behaviours. As in Subsumption architecture, higher
priority has been given to the safety behaviour “Obstacle avoidance”, followed by
“Avoid trapping” to take control over “Go to” when it is necessary.
Figure 24: The three behaviours, "Obstacle avoidance", "Avoid trapping" and "Go to", coordinated by gain-weighted vector summation (R = Σ Gi · Ri) and normalisation.
Some graphical results of the Motor Schema approach can be seen in the next three
figures. Figure 25 shows an aerial view of the simulation and Figure 26 shows a
lateral view. Finally Figure 27 shows a three-dimensional representation of the
simulation.
Figure 25
Figure 26
Figure 27
After reviewing the results, it can be said that the principal advantages are
simplicity and ease of implementation as well as optimised paths. The architecture
can be implemented in different processors because the algorithm isn’t sequential.
However, difficulties appeared in tuning the gain values. The values are very
sensitive and have to be tuned together. When new behaviours are added,
re-tuning is necessary because the sum of the responses of some behaviours can
cancel the effect of others, such as "Obstacle avoidance". For this reason, robustness and
modularity are very low. Table 7 summarises the principal characteristics of Motor
Schema approach.
MOTOR SCHEMA APPROACH
Developer Ronald Arkin, Georgia Institute of Technology
References (Arkin, 1987) (Arkin, 1989) (Arkin, 1992)
Behavioural choice and design Ethologically
Behavioural encoding Continuous using potential fields
Assembling behaviours Cooperative via vector summation and normalisation
Programming method Parameterised behavioural libraries
Positive aspects Simplicity, Development time and Path optimisation
Negative aspects Tuning time, robustness and modularity
Table 7
4.4 Process Description Language
4.4.1 Description
Process Description Language (PDL) was introduced in 1992 by Luc Steels from
the VUB Artificial Intelligence Laboratory, Belgium. PDL (Steels, 1992; Steels,
1993) is intended as a tool for implementing process networks in real robots. PDL
is a language which allows the description and interaction of different processes,
constituting a cooperative dynamic architecture.
PDL architecture is organised with different active behaviour systems, Figure 28.
Each one is intended as an external behaviour of the robot like “explore”, “go
towards target” or “obstacle avoidance”. Each behaviour also contains many active
processes operating in parallel. Processes represent simple movements which the
behaviour will use to reach its goal. Processes take information from sensors and
generate a control action if needed. The control action is related to the set-points
that several actuators of the robot must reach. Concretely, a process output is an
increment which is added to or subtracted from some actuator set-points. This
means, for example, that the process "turn right if the left bumpers are touched"
will add a value to the left motor speed set-point and subtract it from the right
one during each sample time in which the necessary conditions are true (in this
case, touching the left bumpers). The contributions of all the processes are added
to the current set-points and then a normalisation assures a correct output.
This simple methodology constitutes the cooperative coordination method.
Ri = Ri-1 + Σj (Gj · rj)

Figure 28: Structure of the PDL architecture. The processes of each behaviour
system take stimuli from the sensors; the coordinator (Σ) adds their
contributions to the current set-points and normalises (Norm.) the result before
sending it to the actuators.
PDL also defines the language in which such processes are implemented. The
functions are very simple, allowing high-speed processing. For example, the
process “turn right if the left bumpers are touched” would be implemented as:

void turn_right(void)
{
    /* left bumper touched: increment the left speed set-point,
       decrement the right one */
    if (bumper_mapping[3] > 0) {
        add_value(left_speed, 1);
        add_value(right_speed, -1);
    }
}
The relative contribution of each process is determined by the value added to or
subtracted from the set-points. Processes with large values will exert a greater
influence on the robot. The ultimate direction taken by the robot will depend on
which process influences the overall behaviour in the strongest way. It must be
noted that these values are added at each time step. For this reason it is very
important that the ranges of these values be related to the sample time at which
the architecture is working: with small values and a large sample time, the
architecture might not be able to control the robot. The dynamics of the
architecture must be faster than the dynamics of the robot. This is because PDL
works by manipulating derivatives of the set-points, implying a fast control loop
to assure that the system is stable. Although PDL is structured in simple and
fast processes, its dynamics will always have to be faster than those of the
robot or vehicle to be controlled.
The overall execution algorithm is defined by the following recursive procedure:
1. All quantities are frozen (sensory information and set-points).
2. All processes are executed and their relative contributions are added or
subtracted.
3. The set-points are changed based on the overall contribution of the
processes.
4. The set-points are sent to the actuator controllers.
5. The latest sensory quantities are read in.
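The five steps above can be sketched as a minimal C loop. This is only an illustration of the recursion, not the actual PDL runtime: the quantity names, the two example processes and the fixed sensor layout are assumptions.

```c
#include <stddef.h>

#define N_QUANTITIES 4
#define N_PROCESSES  2

enum { BUMPER_LEFT, BUMPER_RIGHT, LEFT_SPEED, RIGHT_SPEED };

double quantity[N_QUANTITIES];   /* frozen quantities: sensors and set-points */
double delta[N_QUANTITIES];      /* contributions accumulated during one cycle */

/* Processes never write set-points directly; they only add increments. */
void add_value(int q, double v) { delta[q] += v; }

void turn_right(void) {          /* "turn right if the left bumper is touched" */
    if (quantity[BUMPER_LEFT] > 0) {
        add_value(LEFT_SPEED, 1);
        add_value(RIGHT_SPEED, -1);
    }
}

void turn_left(void) {           /* mirror process for the right bumper */
    if (quantity[BUMPER_RIGHT] > 0) {
        add_value(LEFT_SPEED, -1);
        add_value(RIGHT_SPEED, 1);
    }
}

void (*process[N_PROCESSES])(void) = { turn_right, turn_left };

/* One PDL cycle following steps 1-5 above. */
void pdl_cycle(const double sensors[2]) {
    int i;
    for (i = 0; i < N_QUANTITIES; i++) delta[i] = 0.0;  /* 1: freeze          */
    for (i = 0; i < N_PROCESSES; i++) process[i]();     /* 2: run processes   */
    for (i = 0; i < N_QUANTITIES; i++)
        quantity[i] += delta[i];                        /* 3: apply deltas    */
    /* 4: here the set-points would be sent to the actuator controllers */
    quantity[BUMPER_LEFT]  = sensors[0];                /* 5: read sensors in */
    quantity[BUMPER_RIGHT] = sensors[1];
}
```

Note that the set-points persist between cycles: when no process is active, the speeds keep their last value, which is exactly why the loop must run faster than the vehicle dynamics.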
4.4.2 Implementation
Process Description Language architecture has been implemented with three
behaviours and tested in the underwater simulated environment with an AUV. As
in previous tests, the mission consisted of reaching a collection of way-points
avoiding obstacles and entrapment. The first behaviour “Go to” is in charge of
driving the vehicle toward the way-points. The second behaviour “Obstacle
avoidance” has the goal of keeping the vehicle away from obstacles. And the third
behaviour “Avoid trapping” is used to depart from zones in which the vehicle could
be trapped. For a complete description of the behaviours and the simulated
environment refer to Appendix A.
Each behaviour has been implemented in a different function. The low-level
processes of each behaviour were assembled so that each behaviour generates a
single response. The response is a vector which slightly changes the current
velocity vector of the vehicle in the direction desired by the behaviour. The
coordinator is a simple module which sums the current velocity vector with those
generated by the
behaviours. The final vector is normalised. Although the processes have been
assembled into a single response per behaviour, the methodology is the same;
this can be regarded as a high-level PDL approach. Figure 29 shows the structure
of this architecture.
Figure 29: Structure of the implemented PDL architecture. The responses of
“Obstacle avoidance”, “Avoid trapping” and “Go to” are summed (Σ) with the
current velocity vector, Ri = Ri-1 + Σj (Gj · rj), and normalised.
The magnitude of the vectors is used to give some behaviours priority over
others. The maximum speed vector of the robot and the maximum magnitudes of the
behaviour vectors can be seen in Table 8.
Vector Maximum magnitude [m/s]
Robot speed 0.5
“Obstacle avoidance” 0.15
“Avoid trapping” 0.04
“Go to” 0.03
Table 8
This means that when the “Obstacle avoidance” behaviour becomes active, it will
affect the overall behaviour in the strongest way. The “Avoid trapping” will only
dominate the “Go to” behaviour to depart from possible entrapment situations. All
these values are closely related to the sample time of the control architecture.
In this case PDL works with derivatives of the speed, which means that the
dynamics of PDL must be faster than the dynamics of the robot. As with the other
evaluated architectures, simulations were first carried out with a sample time
of 1 second. However, at this rate the control architecture was not fast enough,
and the final sample time was reduced to 0.1 seconds.
Some graphical results of the Process Description Language approach can be seen in
the next three figures. Figure 30 shows an aerial view of the simulation and Figure
31 shows a lateral view. Finally, Figure 32 shows a three-dimensional
representation of the simulation.
Figure 30
Figure 31
Figure 32
After reviewing the simulated results it can be said that PDL provides an easy
tool for implementing a control architecture. Its advantages are simplicity and
optimised paths once the architecture is tuned. However, as the coordination
method is cooperative, the tuning was very difficult, and when new behaviours
are added re-tuning is necessary. There is also the problem of the sample time,
which must be shorter than in the other approaches. Finally, because the final
velocity vector is obtained incrementally, the architecture must run
sequentially, which hinders modularity. Table 9 summarises the principal
characteristics of the Process Description Language approach.
PROCESS DESCRIPTION LANGUAGE
Developer Luc Steels, VUB Artificial Intelligence Laboratory
References (Steels, 1992; Steels, 1993)
Behavioural choice and design Experimentally
Behavioural encoding Continuous
Assembling behaviours Cooperative via values integration and normalisation
Programming method Process Description Language
Positive aspects Complexity, Development time and simplicity
Negative aspects Small sample time, Tuning time, Robustness and Modularity
Table 9
4.5 Comparison
Once the four behaviour-based architectures have been implemented and tested,
some conclusions can be drawn. It should be noted that the architectures have
been implemented for a simple mission; however, this should be sufficient to
identify the strengths and deficiencies of each one. As commented above, each
architecture has exhibited advantages and disadvantages, and for this reason
each one will be well suited to a particular application. First of all, Table 10
summarises the different properties that have been discussed in relation to the
four architectures, ranking the architectures from best (1st) to worst (4th) for
each property.
Property \ Architecture:   1st     2nd     3rd     4th
Performance                SCHE.   P.D.L.  A.S.D.  SUBS.
Modularity                 SUBS.   A.S.D.  P.D.L.  SCHE.
Robustness                 SUBS.   A.S.D.  P.D.L.  SCHE.
Development time           SCHE.   P.D.L.  SUBS.   A.S.D.
Tuning time                SUBS.   A.S.D.  P.D.L.  SCHE.
Simplicity                 SCHE.   P.D.L.  SUBS.   A.S.D.
Table 10
Looking at the table, it can be seen that the properties group according to the
coordination method. Competitive methods (Subsumption and A.S.D.) offer
robustness, modularity and easy tuning. This is due to the fact that they only
have one active behaviour at any moment. Robustness follows because in dangerous
situations only a safety behaviour will act and the danger is avoided.
Modularity is also an important property of competitive methodologies because
more behaviours can be added without influencing the old ones; only the
coordinator has to be adapted to the new input. For this reason the tuning time
is very short: behaviours are tuned independently and, once they work properly,
they never have to be re-tuned.
However, competitive methods also have disadvantages, mainly in the coordinator
design. In order to choose only one active behaviour, either a complex method
must be used (A.S.D.) or a clear understanding of the hierarchy of behaviours is
necessary (Subsumption). Once this is done, the final tune-up is very easy, but
the development time is usually long and the coordinator can become very
complex. Another negative property is poor performance, due to the absence of
instantaneous merging of behaviours: many bends appear in the path when more
than one behaviour acts consecutively.
As competitive approaches, the two methodologies studied, Subsumption and
Action Selection Dynamics, possess all these properties. However, they have
quite different philosophies. Subsumption is more of a low-level approach: the
hierarchy of behaviours has to be known and then the network has to be designed.
Subsumption offers a set of tools, the suppression and inhibition nodes, to
build the network. For this reason the implementation is simple, but the design
can be quite difficult. On the other hand, Action Selection Dynamics is a
high-level approach to building an architecture. All the competence modules are
completely described and the design consists of filling in all the module lists.
Once all the behaviours are perfectly described, the network automatically
chooses the best behaviour. In A.S.D. the design is easier but the
implementation is more difficult.
In contrast to competitive approaches, methods with a cooperative coordinator
have other properties, such as simplicity, good performance and short
development time. Because all behaviours are active, the response is always a
merging of them. This means that the path described by the robot will be
smoother than that obtained with a competitive method, so good performance is a
common property.
Another property is simplicity. The coordinator will be very simple, because the
final output is usually some kind of sum of all the behaviour outputs multiplied
by gain factors. In relation to this simplicity, the development time will be
short. However, this simplicity causes great difficulty in tuning the priority
gains, as the values are very sensitive and critical. In extreme situations,
non-safe behaviours can cancel the safe ones. Unfortunately, modularity will be
very poor as a result, because each new behaviour forces the re-tuning of all
the priority gains.
The differences between the two cooperative approaches studied, Motor Schema and
Process Description Language, lie in the level of abstraction and in the
coordinator. In Motor Schema, behaviours are implemented individually with the
behavioural library; it is more of a high-level design. In PDL, however,
behaviours are implemented as low-level processes which slightly change the
set-points of the actuators: the implementation is simple but the design is more
complex. Nevertheless, the principal difference between the two approaches is
found in the coordinator. In Motor Schema the output is obtained at every time
step from the outputs of the behaviours. In PDL, on the other hand, the output
is an integration of all the outputs generated before. This implies that a small
sample time is needed to assure the stability of the system, which can be
problematic if there is a lot of computation to do.
Concluding this comparison, it can be said that, depending on the requirements
of the robot to be controlled, one method can prove more appropriate than
another. A cooperative method may be more suitable if better performance is
necessary, or a competitive method if robustness is the basic premise. The
choice of control architecture could also depend on hardware availability, the
sensorial system and compatibility with adaptive algorithms.
5 Conclusions and Future Work
5.1 Conclusions
This investigative work reviews the basic aspects of Behaviour-based Robotics
architectures. The report begins with an overview of the basic fundamentals of the
subject. In addition, four behaviour-based approaches have been described and
implemented. The application field of the work is underwater robotics.
Implementations are based on a simulated underwater vehicle that must accomplish
a mission in a simulated underwater environment. Finally, a comparative study of
the implemented architectures is provided. The goal of the work has been to
understand the basic characteristics and limitations of Behaviour-based Robotics
through representative examples, with the final purpose of improving control
techniques in autonomous robots such as the GARBI underwater vehicle.
Behaviour-based Robotics has demonstrated its feasibility for controlling
autonomous robots, due mainly to characteristics such as parallelism,
modularity, the absence of centralised knowledge representations and direct
sensory-motor interconnection. These features make behaviour-based architectures
suitable for easy and incremental implementation, fast execution and good
performance. However, disadvantages appear when adjusting behaviour and
coordination parameters. The problem is to determine which action must be taken
at every time step, and it arises from the correct coordination of behaviours.
As mentioned in section 4.5, competitive methods have good robustness but poor
performance when the active behaviour changes continuously. As far as
cooperative methods are concerned, they perform very well when the parameters
are properly tuned, but their low robustness can lead to control failures.
Another constraint revealed by the tested architectures is adaptation. As
explained in section 3.6, adaptation can be applied at different levels of the
control architecture. From our experience with the simulated mission, the need
for an adaptive system at the behavioural level is obvious. In a real test of a
control system, adaptation could also be necessary at the sensory and learning
stages. Adaptation in behaviour-based systems is currently a very active
research subject. The problem is to determine a feasible automatic way to tune
the parameters of the architecture or to generate new behaviours. A control
architecture for an autonomous vehicle that is to inhabit a natural environment
cannot be designed without adaptive abilities.
5.2 Future work
Future work will consist of designing a behaviour-based architecture that takes
into account the different aspects highlighted in the conclusions. The idea is
to exploit the advantages found in the behaviour-based approaches and to propose
solutions for their disadvantages. At the moment the proposal is not yet fully
defined; however, the directions to be taken and some desirable characteristics
of the architecture are outlined below. This future work is intended to start a
thesis that should finish with a contribution to behaviour-based robotics.
Basics
The proposed behaviour-based architecture will be designed for an underwater
vehicle. The approach is focused on a control architecture for navigation;
hence, responses represent a movement of the vehicle. The architecture is
composed of several simple behaviours, as in the approaches presented in this
report. Each one is completely independent and generates a normalised response.
Each response consists of a three-dimensional vector “vi” with a magnitude “mi”
between 0 and 1. Associated with this vector, an activation level “ai” indicates
how important it is for the behaviour to take control of the robot. This value
also lies between 0 and 1, see Figure 33. This codification allows a clear
separation between the control action and the activation of the behaviour.
Figure 33: Response ri of a behaviour bi given a stimulus S: a vector
vi = (xi, yi, zi) with magnitude mi in [0, 1], expressed in the local frame
(XL, YL, ZL), together with an activation level ai in [0, 1].
A hybrid coordination system
Coordination of the responses is done through a hybrid approach between
competition and cooperation. The aim is to design a coordination system that
keeps the robustness of competitive approaches as well as the good performance
of the cooperative ones. Like the suppression nodes of the Subsumption
architecture, the proposed coordination system is composed of a set of two-input
nodes which generate a merged control response. The nodes form a hierarchical
and cooperative coordination system. The idea is to exploit the good performance
of cooperation when the predominant behaviour is not completely active.
Each node has a dominant behaviour that suppresses the response of the
non-dominant behaviour when the former is completely activated (a=1). However,
when the dominant behaviour is only partially activated (0<a<1), the final
response is a combination of both inputs. The idea is that non-dominant
behaviours can slightly modify the responses of dominant behaviours when these
are not completely activated. For example, if the dominant behaviour is
“obstacle avoidance” and the non-dominant one is “go to point”, when “obstacle
avoidance” is only slightly activated (the obstacles are still far away), a
mixed response is obtained. In non-critical situations, cooperation between
behaviours is allowed; nevertheless, robustness is preserved in critical
situations. The proposed node for coordinating behaviours can be seen in
Figure 34.
Figure 34: Coordination node nij merging a dominant response ri and a
non-dominant response rj into rij:

vij = vi + vj · (1 - ai)^2   if ai > 0;    vij = vj   if ai = 0;
if mij > 1 then vij = vij / mij

aij = ai + aj · (1 - ai)^2   if ai > 0;    aij = aj   if ai = 0;
if aij > 1 then aij = 1
The node “nij” has the property of generating a normalised response like those
produced by the behaviours. The effect of the non-dominant behaviour depends on
the squared activation of the dominant one, assuring that in a critical
situation between both, the dominant will always take control. Using these
nodes, all the behaviours can be coordinated: depending on the situation, the
control response could be produced by all the behaviours or by only one.
The coordination method can be classified as a hybrid approach because the
response is the one generated by the dominant behaviour, modified by the
non-dominant behaviours according to the activation level of the former.
Although no hybrid coordination system of this kind appears in the literature,
we think that the method offers good properties and can be successfully
implemented in an autonomous robot.
Adaptivity
To coordinate all the behaviours, many arrangements of the nodes are possible.
Assuming that a response is used only once (each response is connected to only
one coordination node), “n-1” nodes are needed for “n” behaviours, see
Figure 35. This coordination system allows all possible combinations to be
encoded, each combination representing a hierarchical structure of the
coordination system. Then, using an adaptive algorithm, the best arrangement for
accomplishing the current mission can be obtained. Using techniques such as
reinforcement learning or evolutionary robotics (section 3.6), the network of
nodes could be adapted online to optimise missions. The possibility of using
adaptivity at the sensory and learning levels will also be studied.
Figure 35: Example of a coordination network. The stimuli from the sensors feed
four behaviours whose responses are merged by three nodes (n21, n34 and n2'1'),
each with a dominant (D) and a non-dominant (ND) input, before the result is
sent to the actuators.
Future work planning
To carry out the proposed future work different phases should be accomplished.
These phases will be tested and modified according to the results obtained in the
implementation with the robot.
Phase 1 To implement and test the feasibility of the hybrid coordination method
and to redesign it if necessary.
Phase 2 To survey the fields of “Reinforcement learning” and “Evolutionary
Robotics” in order to find a useful approach to online behavioural
adaptation for the proposed architecture.
Phase 3 To implement the architecture in an autonomous underwater robot,
demonstrating the feasibility of the proposed control architecture with a
representative mission.
References
Agre, P. E. and Chapman, D. (1987). Pengi: an implementation of a Theory of
Activity. In: Proceedings of the Sixth Annual Meeting of the American
Association for Artificial Intelligence, pp. 268-272, Seattle, Washington.
Albus, J. (1991). Outline for a theory of Intelligence. IEEE Transactions on
Systems, Man and Cybernetics, vol. 21, is. 3, pp. 473-509.
Amat, J., Batlle, J., Casals, A., and Forest, J. (1996). GARBI: a low cost ROV,
constrains and solutions. In: 6ème Seminaire IARP en robotique sous-
marine , pp. 1-22, Toulon-La Seyne, France.
Arbib, M. A. (1981). Perceptual structures and distributed motor control. Handbook
of physiology-The nervous system II: Motor Control, Chap. 33, pp. 1449-80.
Oxford, UK: Oxford University Press.
Arbib, M. A. and House, D. (1987). Depth and detours: An Essay on Visually guided
behaviour. Vision, Brain and Cooperative Computation, pp. 129-63. MIT
Press.
Arbib, M. A., Kfoury, A J, and Moll, R. N. (1981). A basis for Theoretical Computer
Science. Springer-Verlag.
Arkin, R. C. (1986). Path Planning for a Vision-Based Autonomous Robot. In:
Proceedings of the SPIE Conference on Mobile Robots, pp. 240-49,
Cambridge, MA.
Arkin, R. C. (1987). Motor schema based navigation for a mobile robot: an approach
to programming by behaviour. In: Proceedings of the IEEE conference on
robotics and automation, pp. 264-71, Raleigh, NC.
Arkin, R. C. (1989). Motor schema-based mobile robot navigation. International
Journal of Robotics Research, vol. 8, is. 4, pp. 92-112.
Arkin, R. C. (1992). Behavior-Based Robot Navigation for Extended Domains.
Adaptive Behavior, vol. 1, is. 9, pp. 201-225.
Arkin, R. C. (1993). Modeling neural function at the schema level: Implications and
results for robotic control. Biological neural networks in invertebrate
neuroethology and robotics, Chap. 17, pp. 383-410. Boston: Academic Press.
Arkin, R. C. (1998). Behavior-based Robotics. MIT Press.
Balch, T. and Arkin, R. C. (1993). Avoiding the Past: A simple but effective strategy
for Reactive Navigation. In: Proceedings of the IEEE International
Conference on Robotics and Automation, vol. 1, pp. 678-85, Atlanta, GA.
Bonasso, P. (1992). Reactive Control of Underwater Experiments. Applied
Intelligence, vol. 2, is. 3, pp. 201-04.
Brooks, R. (1986). A Robust Layered Control System for a Mobile Robot. IEEE
Journal of Robotics and Automation, vol. RA-2, is. 1, pp. 14-23.
Brooks, R. (1989). A robot that walks: Emergent Behavior from a Carefully Evolved
Network. Neural Computation, vol. 1, is. 2, pp. 253-262.
Brooks, R. (1990). The Behavior Language. A.I. Memo No. 1227, MIT AI Laboratory.
Brooks, R. (1991). Intelligence Without Reason. A.I. Memo No. 1293, MIT AI
Laboratory.
Brooks, R. (1991). New approaches to Robotics. Science, vol. 253, pp. 1227-1232.
Chatila, R. A. and Laumond, J. C. (1985). Position referencing and consistent
World Modeling for Mobile Robots. In: IEEE International Conference on
Robotics and Automation.
Connell, J. H. (1990). Minimalist Mobile robotics: A colony-style architecture for an
Artificial Creature. Academic Press.
Dorigo, M. and Colombetti, M. (1998). Robot Shaping: An Experiment in Behavior
Engineering. MIT Press.
Fikes, R. E. and Nilsson, N. J. (1971). STRIPS: A new approach to the application
of theorem proving and problem solving. Artificial Intelligence, vol. 2, pp.
189-208.
Fossen, T. I. (1995). Guidance and Control of Ocean vehicles. John Wiley & Sons.
Gat, E. (1991). Reliable Goal-Directed Reactive Control of Autonomous Mobile
Robots. Ph.D. Dissertation, Virginia Polytechnic Institute and State
University, Blacksburg.
Goheen, K. (1995). Techniques for URV Modelling. Underwater Robotics Vehicles,
Chap. 4. TSI Press.
Harvey, I., Husbands, P., Cliff, D., Thomson, A., and Jakobi, A. (1997).
Evolutionary Robotics: the Sussex approach. Robotics and Autonomous
systems, vol. 20, is. 2-4, pp. 205-224.
Huang, H-M. (1996). An architecture and a methodology for intelligent control.
IEEE Expert: Intelligent Systems and their applications, vol. 11, is. 2, pp.
46-55.
Kaelbling, L. P. (1996). Reinforcement Learning: A Survey. Journal of Artificial
Intelligence Research, vol. 4, pp. 237-285.
Kaelbling, L. P. (1999). Robotics and Learning. The MIT encyclopedia of the
cognitive sciences, pp. 723-724. MIT Press.
Khatib, O. (1985). Real-Time Obstacle Avoidance for Manipulators and Mobile
Robots. In: Proceedings of the IEEE International Conference on Robotics
and Automation, pp. 500-05, St. Louis, MO.
Laird, J. E. and Rosenbloom P.S. (1990). Integrating, Execution, Planning, and
Learning in soar for External Environments. In: Proceedings, AAAI-90, pp.
1022-1029.
Lefebvre, D. and Saridis, G. (1992). A computer architecture for intelligent
Machines. In: Proceedings of the IEEE International Conference on Robotics
and Automation, pp. 245-50, Nice, France.
Lyons, D. (1992). Planning, Reactive. Encyclopedia of Artificial Intelligence, pp.
1171-82. John Wiley and Sons, New York.
Maes, P. (1989). How to do the right thing. Connection Science, vol. 1, pp. 291-323.
Maes, P. (1990). Situated Agents Can Have Goals. Robotics and Automation
Systems, vol. 6, pp. 49-70.
Maes, P. (1991). A bottom-up mechanism for behaviour selection in an artificial
creature. From animals to animats: Proceedings of the first international
conference on simulation of adaptive behaviour. Ed. JA Meyer & SW Wilson,
MIT Press / Bradford Books.
Maes, P. (1994). Modeling Adaptive Autonomous Agents. Artificial Life Journal,
vol. 1, is. 1 & 2, pp. 135-162.
Mataric, M. (1999). Behavior-based Robotics. The MIT encyclopedia of the cognitive
sciences, pp. 74-77. MIT Press.
McFarland, D. (1991). What it means for robot behavior to be adaptive. From
animals to animats: Proceedings of the first International Conference on
Simulation of Adaptive Behavior, pp. 22-28. MIT Press.
Moravec, H. (1988). Mind Children: The future of Robot and Human Intelligence.
Harvard University Press.
Nilsson, N. J. (1984). Shakey the Robot. Stanford Research Institute AI Center,
Technical rep. 323.
Nolfi, S. (1998). Evolutionary Robotics: Exploiting the full power of self-
organization. Connection Science, vol. 10, is. 3-4, pp. 167-183.
Payton, D., Keirsey, D., Kimble, D., Krozel, J., and Rosenblatt, J. (1992). Do
Whatever Works: A robust approach to fault-tolerant autonomous control.
Applied Intelligence, vol. 2, is. 3, pp. 225-50.
Pfeifer, R. and Scheier, C. (1999). Understanding Intelligence. MIT Press.
Ridao, P., Yuh, J., Batlle, J., and Sugihara, K. (2000). On AUV Control
Architecture. In: Proceedings of the IEEE International Conference on
Intelligent Robots and Systems, Takamatsu, Japan.
Rosenblatt, J. and Payton, D. (1989). A fine-grained alternative to the subsumption
architecture for Mobile Robot Control. In: Proceedings of the International
Joint Conference on Neural Networks, pp. 317-23.
Rosenschein, S. J. and Kaelbling, L. P. (1986). The synthesis of digital machines with
provable epistemic properties. In: Proceedings of the Conference on
Theoretical Aspects of Reasoning About Knowledge, pp. 83-98, Los Altos,
California.
Saffiotti, A., Konolige, K., and Ruspini, E. (1995). A multi-valued logic approach to
integrating planning and control. Artificial Intelligence, vol. 76, is. 1-2, pp.
481-526.
Schoppers, M. (1987). Universal Plans for Reactive Robots in Unpredictable
Environments. In: Proceedings of the Tenth International Joint Conference
on Artificial Intelligence (IJCAI-87), pp. 852-59.
Slack, M. (1990). Situationally Driven Local Navigation for Mobile Robots. JPL
Publication No. 90-17, NASA Jet Propulsion Laboratory, Pasadena, CA.
Steels, L. (1992). The PDL reference manual. Memo 92-5, VUB AI Lab, Brussels,
Belgium.
Steels, L. (1993). Building agents with autonomous behaviour systems. The
artificial route to artificial intelligence. Building situated embodied agents.
Lawrence Erlbaum Associates, New Haven.
Tyrell, T. (1993). Computational mechanisms for action selection. Doctoral
dissertation, University of Edinburgh, Scotland.
Yuh, J. (1990). Modeling and Control of Underwater Robotic Vehicles. IEEE
Transactions on Systems, Man and Cybernetics, vol. 20, is. 6, pp. 1475-1483.
Ziemke, T. (1998). Adaptive Behavior in Autonomous Agents. Presence, vol. 7, is. 6,
pp. 564-587.
Appendix A. Simulation of missions with an AUV.
In order to test the behaviour-based architectures explained in section 4, a
simulator has been implemented using Matlab/Simulink. The simulator is composed
of a control architecture, low-level controllers, a model of an Autonomous
Underwater Vehicle and a graphical interface with an underwater environment.
Figure 36 shows the interconnections of the different processes; each of them is
described in the next subsections.
Figure 36: Interconnection of the simulator processes. The behaviour-based
control architecture (“Go to”, “Obstacle avoidance” and “Avoid trapping” plus
the coordination module) sends the vehicle speed to the low-level controllers,
which drive the thruster speeds of the AUV model; the underwater environment
returns the position and orientation of the vehicle and the sonar distances.
All these components make possible the simulation of a simple mission which, we
think, is sufficient to evaluate the different aspects of each control
architecture. Note that typical problems of real robots (position estimation,
noise in signals, faults in sonar distances) are not simulated. Neglecting these
aspects breaks several principles of Behaviour-based Robotics; however, these
simulations are intended to prove only the performance of the control
architectures, not the principles. We assume that if the architectures were
implemented in a real robot, the properties of Behaviour-based Robotics would
assure robustness against these problems. For this reason the chosen mission is
very simple, and we think that nearly the same results could be obtained in a
real experiment.
The mission consists of reaching three way-points, one after the other, avoiding
obstacles and entrapment. The starting point of the mission and the three goal
points can be seen in the 3D representation shown in Figure 37. The figure also
shows the dimensions of the environment in metres. In some places the sea floor
rises and the vehicle is forced to go around or over the obstacles.
Figure 37
A.1 The control architecture
The control architecture block contains one of the architectures of section 4.
However, in order to compare the architectures, a fixed set of behaviours has
been determined. For this mission three behaviours have been chosen: “Go to”,
“Obstacle avoidance” and “Avoid trapping”. All the architectures were
implemented with these behaviours, changing implementation details but
maintaining their structure.
These behaviours use different inputs but generate an established output. The
output is a three-dimensional vector representing the speed at which the vehicle
should move, see Figure 38. The vector is defined by two angles, “α” and “β”,
and a magnitude “M”. The angle “α” represents the angle the vehicle should turn
with respect to its yaw angle. The angle “β”, combined with the magnitude of the
speed, represents the vertical movement of the vehicle. Note that this angle is
referred to the horizontal plane; this is possible due to the stability of the
modelled vehicle (pitch and roll angles are always nearly zero), see section
A.3. The magnitude “M” is the modulus of the speed, expressed in m/s; the
maximum value for our vehicle is 0.5 m/s. The coordinator module merges the
three speed vectors and generates another speed vector like the previous ones,
saturating the magnitude to 0.5 m/s.
Figure 38: Speed vector of magnitude M defined by the angles α (with respect to
the vehicle yaw, in the local frame XL, YL, ZL) and β (with respect to the
horizontal plane), shown relative to the global frame (XG, YG, ZG).
• “Go to” behaviour
This behaviour drives the vehicle towards the goal point. It proposes a speed
vector with a constant magnitude whose direction joins the position of the
vehicle with the goal point, see Figure 39. The inputs of the behaviour are the
position and orientation of the vehicle. The goal points are stored by the
behaviour and changed when the robot comes within a close distance.
Figure 39: “Go to” response: the angles α and β point the speed vector from the
vehicle's local frame (XL, YL, ZL) towards the goal point, expressed in the
global frame (XG, YG, ZG).
• “Obstacle avoidance” behaviour
This behaviour prevents the robot from crashing into obstacles. Its inputs are
the sonar distances. The vehicle has seven sonars: three at the front, one on
each side, one at the back and another below (section A.4). Each sonar direction
has a configurable range. If one of the measured distances is less than this
range, the behaviour generates an opposing response. The behaviour processes all
the distances and creates a 3D speed vector indicating the direction the robot
must take to avoid the obstacles detected by the sonars, see Figure 40.
Figure 40: repulsive speed vector generated by the “Obstacle avoidance”
behaviour when a sonar detects an obstacle.
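A sketch of this computation is shown below (in Python; the linear weighting law, the gain and the range value are assumptions, since the text only specifies that readings below the range generate an opposing response):

```python
import math

SONAR_RANGE = 2.0  # configurable range per sonar direction [m] (assumption)

def obstacle_avoidance(distances, directions, gain=0.5):
    """Sum a repulsive contribution for every sonar reading below range.
    distances: measured sonar distances [m]; directions: unit vectors
    (x, y, z) of each sonar in the vehicle frame."""
    x = y = z = 0.0
    for d, (ux, uy, uz) in zip(distances, directions):
        if d < SONAR_RANGE:
            w = gain * (SONAR_RANGE - d) / SONAR_RANGE  # closer -> stronger
            x -= w * ux                                 # push away from the
            y -= w * uy                                 # detected obstacle
            z -= w * uz
    m = math.sqrt(x * x + y * y + z * z)
    if m == 0.0:
        return 0.0, 0.0, 0.0     # nothing inside range: behaviour inactive
    return math.atan2(y, x), math.asin(z / m), m
```

With a single front sonar detecting an obstacle 1 m ahead, the proposed vector points straight backwards.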
• “Avoid trapping” behaviour
The avoid trapping (Balch & Arkin, 1993) behaviour is used to escape from
situations where the robot becomes trapped. The input of the behaviour is the
position of the robot, which is used to maintain a local map of the recent
path of the vehicle. The map is centred on the robot position and has a finite
number of cells. Each cell contains the number of times the robot has visited
it. If the sum of all the values is higher than a threshold, the behaviour
becomes active and a speed vector is generated. The direction of the vector is
the one opposed to the centre of gravity of the local map, computed using the
cell values as weights. The magnitude is proportional to the sum of the cell
values. The cells are incremented at a configurable sample time and saturated
at a maximum value. After a long time, the cell values are decreased, allowing
the robot to return to a previously visited zone.
This behaviour becomes active in two different situations. The first one is
when the vehicle is trapped in front of obstacles. In this case the behaviour
will take control of the vehicle and drive it away to another zone. The second
situation is when the vehicle is navigating very slowly because of the
interaction of the other behaviours. In this case the cell values will
increase quickly and the behaviour will become active, driving the vehicle
away from the path, see Figure 41.
Figure 41: speed vector generated by the “Avoid trapping” behaviour, directed
away from the recently visited path.
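The visit-count map could be sketched as follows (a Python sketch; the grid size, threshold, saturation and magnitude gain are illustrative assumptions, and the slow decay of old cells is omitted for brevity):

```python
import math

class AvoidTrapping:
    """Sketch of the 'avoid trapping' local map: a grid of visit
    counters centred on the robot, active when the summed counts of the
    local window exceed a threshold."""
    def __init__(self, size=7, cell=1.0, threshold=3, saturation=10):
        self.size, self.cell = size, cell
        self.threshold, self.saturation = threshold, saturation
        self.counts = {}  # (i, j) grid cell -> number of visits

    def update(self, x, y):
        i, j = int(x // self.cell), int(y // self.cell)
        self.counts[(i, j)] = min(self.counts.get((i, j), 0) + 1,
                                  self.saturation)
        half = self.size // 2  # local window centred on the robot
        local = {k: v for k, v in self.counts.items()
                 if abs(k[0] - i) <= half and abs(k[1] - j) <= half}
        total = sum(local.values())
        if total <= self.threshold:
            return None                      # behaviour inactive
        # centre of gravity of the local map, weighted by cell counts
        cx = sum(k[0] * v for k, v in local.items()) / total
        cy = sum(k[1] * v for k, v in local.items()) / total
        alpha = math.atan2(j - cy, i - cx)   # opposed to the c.o.g.
        m = min(0.5, 0.05 * total)           # grows with the counts
        return alpha, 0.0, m
```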
A.2 The low-level controllers
The low-level control module is in charge of accomplishing the set-points
given by the control architecture. The inputs of the module are the speed
vector and the position and orientation of the robot. The speed vector is
decomposed into three set-points: the yaw, the horizontal speed and the
vertical speed. The yaw set-point is obtained from the current yaw and the “α”
value. The horizontal speed set-point is obtained from the horizontal
component of the speed vector, and the vertical speed set-point from the
vertical component.
Each set-point is accomplished by an individual controller. A PID controller
was chosen and tuned for each set-point. Figure 42 shows the structure of the
low-level controllers. The set-points are compared with the real values
obtained from the position and orientation of the model. The yaw controller
uses the current yaw of the vehicle. The horizontal speed controller uses the
composition of the “x” and “y” derivatives, and the vertical speed controller
uses the “z” derivative.
Figure 42
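This decomposition, together with a generic PID loop, could be sketched as follows (the PID gains and function names are placeholders, not GARBI's actual tuning):

```python
import math

class PID:
    """Minimal PID controller (gains are placeholders)."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_err = 0.0

    def step(self, setpoint, measured):
        err = setpoint - measured
        self.integral += err * self.dt
        derivative = (err - self.prev_err) / self.dt
        self.prev_err = err
        return self.kp * err + self.ki * self.integral + self.kd * derivative

def setpoints(alpha, beta, m, current_yaw):
    """Decompose the (alpha, beta, M) vector into the three set-points."""
    yaw_sp = current_yaw + alpha        # desired yaw
    horizontal_sp = m * math.cos(beta)  # horizontal speed component [m/s]
    vertical_sp = m * math.sin(beta)    # vertical speed component [m/s]
    return yaw_sp, horizontal_sp, vertical_sp
```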
As the vehicle is stable (section A.3), the decomposition of the speed vector
allows a total independence of the horizontal and vertical controllers.
However, the horizontal controllers (the yaw and horizontal speed controllers)
act on the same thrusters (the two horizontal thrusters, see section A.3). For
this reason they were tuned together. On the other hand, the two vertical
thrusters (section A.3) are only controlled by the vertical speed controller.
It must be noted that the low-level controllers are not very complex. The
intention was to implement simple controllers and to tune them to achieve
realistic results rather than impressive but unrealistic ones. The responses
of the controllers with the model were compared with real responses until they
were very similar. It is important to notice that the intention of the report
is to test control architectures. For this purpose it is important to work
with a simulator that reacts like a real robot.
A.3 Model of an Autonomous Underwater Vehicle
To simulate the behaviour of an AUV a hydrodynamic model is necessary. In our
case the model was adapted to our underwater vehicle GARBI. First of all the
underwater robot will be described. Then, the model for an underwater vehicle will
be analysed.
• The underwater vehicle
GARBI (Amat et al., 1996) was first conceived as a Remotely Operated Vehicle
(ROV) for exploration in waters up to 200 metres in depth. At the moment a
control architecture is being implemented to transform this vehicle into an
Autonomous Underwater Vehicle. GARBI, see Figure 43, was designed with the aim
of building an underwater vehicle using low-cost materials, such as
fibre-glass and epoxy resins. To solve the problem of resistance to underwater
pressure, the vehicle is servo-pressurised to the external pressure by using a
compressed air bottle, like those used in scuba diving. Air consumption is
required only during vertical displacements, in which the decompression valves
release the required amount of air to keep the internal pressure at the same
level as the external one. The vehicle can incorporate two arms, which allow
it to perform some object-manipulation tasks through tele-operation.
Figure 43
The vehicle has four thrusters, see Figure 44: two for horizontal movements
(x axis) and two for vertical movements (z axis). Additionally, it is possible
to add another thruster in the transverse direction (y axis). Due to the
distribution of the weights, the vehicle is completely stable: pitch and roll
angles are always insignificant. For this reason the vertical and horizontal
movements are totally independent. The vehicle also has several sensors: two
compasses, two pressure sensors, two water speed sensors and five sonars. Its
dimensions are: length 1.3 m, height 0.9 m and width 0.7 m. The maximum speed
is 3 knots and the weight is 150 kg.
Figure 44: disposition of the thrusters (T1, T2, T3 and T4) on the GARBI
vehicle, with the body axes and the prow, stern, starboard and larboard sides
indicated.
• The hydrodynamic model
An AUV can be modelled by using two approaches (Goheen, 1995). The first one
uses predictive techniques which build up the model from the basic physical laws
governing the dynamic behaviour of the system. The second uses testing
techniques which are based on parameter estimation from real tests. The model
used for GARBI is based on predictive techniques. As described in the literature
(Yuh, 1990) and (Fossen, 1995), the hydrodynamic equation of motion of an
underwater vehicle with 6 DOF can be conveniently described as follows:
$$ Bu = \left({}^G M_{RB} + {}^G M_A\right)\dot{V}^G + \left({}^G C_{RB}(V^G) + {}^G C_A(V^G)\right)V^G + {}^G D(V^G)\,V^G + {}^G G(O) \quad \text{(eq.1)} $$

$$ {}^G G(O) = {}^G F_B + {}^G F_W \quad \text{(eq.2)} $$
where,
$B$: thruster configuration matrix
$u = (\omega_1^2, \omega_2^2, \omega_3^2, \omega_4^2, \omega_5^2)^T$: control inputs vector
$\omega_i$: angular speed of the propeller $i$
${}^G M_{RB}$: inertia matrix
${}^G M_A$: added-mass matrix
$\dot{V}^G = ({}^G a_{GE}, {}^G \alpha_{GE})^T$: acceleration vector
$V^G = ({}^G v_{GE}, {}^G \omega_{GE})^T$: velocity vector
$O = (\phi\ \theta\ \psi)^T$: roll, pitch and yaw angles
${}^G C_{RB}$: rigid-body Coriolis matrix
${}^G C_A$: added Coriolis matrix
${}^G D$: damping matrix
${}^G G$: gravity & buoyancy vector
The super-index denotes the coordinate system where vector components are
expressed. {G} is a robot fixed coordinate system and {E} is an earth fixed
coordinate system. For simulation purposes it’s interesting to compute the
evolution of the robot position and orientation as a function of the forces acting on
the vehicle. This can be easily computed, arranging the terms of (eq.1) as follows:
$$ K = Bu - \left({}^G C_{RB}(V^G) + {}^G C_A(V^G)\right)V^G - {}^G D(V^G)\,V^G - {}^G G(O) \quad \text{(eq.3)} $$

$$ \dot{V}^G = \left({}^G M_{RB} + {}^G M_A\right)^{-1} K \quad \text{(eq.4)} $$
The velocity vector can be computed through integration:

$$ V^G = \int \dot{V}^G \, dt \quad \text{(eq.5)} $$
and finally, the rate of change of the position and the orientation can be computed
as follows:
$$ \begin{pmatrix} \dot{r}^E \\ \dot{O} \end{pmatrix} = \begin{pmatrix} {}^E_G R(O) & 0_{3\times 3} \\ 0_{3\times 3} & T^{-1}(O) \end{pmatrix} \cdot \begin{pmatrix} {}^G v_{GE} \\ {}^G \omega_{GE} \end{pmatrix} \quad \text{(eq.6)} $$

where:
${}^E_G R$ is the rotation matrix, $\dot{r}^E = (\dot{x}\ \dot{y}\ \dot{z})^T$ and

$$ T^{-1}(O) = \begin{pmatrix} 1 & \sin\phi\tan\theta & \cos\phi\tan\theta \\ 0 & \cos\phi & -\sin\phi \\ 0 & \sin\phi/\cos\theta & \cos\phi/\cos\theta \end{pmatrix} $$
The parameters of the equations were obtained from the characteristics of the
GARBI robot, except for the added-mass and damping matrices, which were taken
from the model of the underwater vehicle ODIN from the University of Hawaii.
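One Euler integration step of eqs. 3 to 6 could be sketched as follows (a numerical sketch assuming numpy; all model matrices are passed in as placeholder callables rather than GARBI's actual coefficients):

```python
import numpy as np

def step(V, r, O, B, u, M_inv, C_fn, D_fn, G_fn, R_fn, Tinv_fn, dt):
    """One Euler step of the simulation loop.
    V: 6-DOF body velocity; r: earth-fixed position; O: (roll, pitch, yaw).
    M_inv is the inverse of (M_RB + M_A); C_fn returns the total Coriolis
    matrix (C_RB + C_A), D_fn the damping matrix, G_fn the gravity and
    buoyancy vector, R_fn the rotation matrix, Tinv_fn the matrix T^-1(O)."""
    K = B @ u - (C_fn(V) + D_fn(V)) @ V - G_fn(O)      # eq. 3
    V_dot = M_inv @ K                                  # eq. 4
    V = V + V_dot * dt                                 # eq. 5 (Euler)
    J = np.block([[R_fn(O), np.zeros((3, 3))],
                  [np.zeros((3, 3)), Tinv_fn(O)]])
    pose_dot = J @ V                                   # eq. 6
    return V, r + pose_dot[:3] * dt, O + pose_dot[3:] * dt
```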
A.4 Underwater environment
To test the control architectures, an environment has been generated in which
the robot can navigate and interact. The environment is generated from a
grey-scale “bmp” file and can be changed easily. The grey-scale value of each
pixel is translated into a depth value. The three-dimensional environment is
then generated (Figure 37) and the robot can be placed inside it, detecting
the sea floor or the obstacles. This detection is done by simulating seven
sonar sensors. The disposition of the sonars can be seen in Figure 45, with
different detected distances. Six of them are situated in the horizontal plane
and the last one points along the negative “z” axis. Note that it is not
necessary to use a sonar in the positive “z” axis: the method of generating
the environment only allows the sea floor to be represented, not intermediate
objects. As can be seen, each sonar is simulated as a cone.
Figure 45: disposition of the seven simulated sonar sensors, shown in top and
lateral views.
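The translation from pixel grey levels to depth could be as simple as the following sketch (whether darker pixels mean deeper points, and the maximum depth value, are assumptions not fixed by the text):

```python
def depth_map(pixels, max_depth):
    """Translate grey-scale pixel values (0-255) into depths [m].
    Assumption: darker pixels (lower values) are deeper points."""
    return [[(255 - p) / 255.0 * max_depth for p in row] for row in pixels]
```

A white pixel then maps to depth 0 and a black pixel to the maximum depth.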
As a tool to visually monitor the results of the simulations, a graphical
interface shows the evolution of the robot during the development of a
mission. The graphical interface, see Figure 46, shows a screen with a top
view and a lateral view of the evolution of the vehicle inside the
environment. The environment is represented with contour lines in the top
view. The sonars and the goal points are also represented. The interface
provides numerical information on the position of the vehicle, the sonar
sensors, the motors, the goal point and the way-point.
Figure 46
Finally, a three-dimensional representation can also be generated once the
simulation has finished. Figure 47 shows the path of a simulation using the
Motor Schema architecture.
Figure 47
Appendix B. Published results
As a result of this research work, a conference paper was published at the
Q&A-R 2000 international conference. In addition, two conference papers have
been accepted for presentation at the MCMC 2000 international conference and
the CCIA 2000 national conference.
B.1 MCMC 2000
Title: An overview on behaviour-based methods for AUV control.
Authors: M. Carreras, J. Batlle, P. Ridao and G.N. Roberts.
Conference: 5th IFAC Conference on Manoeuvring and Control of Marine Crafts.
Place: Aalborg, Denmark.
Date: August 23-25, 2000
B.2 CCIA 2000
Title: An Underwater Autonomous Agent. From simulation to experimentation.
Authors: Pere Ridao, Joan Batlle, and Marc Carreras.
Conference: 3r Congrés Català d'Intel.ligència Artificial.
Place: Vilanova i la Geltrú, Catalunya.
Date: October 5-7, 2000
B.3 Q&A-R 2000
Title: Reactive control of an AUV using Motor Schemas.
Authors: M. Carreras, J. Batlle and P. Ridao.
Conference: International conference on Quality control, Automation and Robotics.
Place: Cluj-Napoca, Romania.
Date: May 19-20, 2000