Page 1: Evolving Agents In a Hostile Environmentweb.mst.edu/~tauritzd/courses/ec/fs2003/project/Berry.pdfEvolving Agents In a Hostile Environment Alex J. Berry December 4, 2003 Abstract This

Evolving Agents In a Hostile Environment

Alex J. Berry

December 4, 2003

Abstract

This paper describes research into the application of evolutionary algorithms that evolve agents in a hostile environment. Our hostile environment is the location of a terrorist attack in progress. The types of agents in the environment are first responders, terrorists, and victims. Terrorist agents are in the environment to create mayhem. Victim agents move through the environment in a “panicky” manner. First responder agents act like and perform the tasks of real first responders. The agents are evolved using an evolutionary computation technique called genetic programming. Representations using expression and decision trees were used for the individuals. Experiments were performed with many changes in parameters. Fluctuating environments, crossover-only evolution, and precreated individuals were some of the experimental runs. After many experiments it was found that these agents will need very large, complex decision and expression trees to meet our requirements. It was also shown that this technique of using GP to create agents can be successfully used to create intelligent agents that can operate in the defined environment.

Keywords

Evolutionary Algorithms, Genetic Programming, Agent Technology


Contents

1 Introduction
    1.1 Background
    1.2 Approach
    1.3 Genetic Programming
    1.4 Research Questions

2 Methodology
    2.1 Map Generator
    2.2 Agents
        2.2.1 Generic Agents
        2.2.2 Victim Agents
        2.2.3 First Responder Agents
        2.2.4 Terrorist Agents
        2.2.5 Judging an agent’s performance
    2.3 Viewer

3 Evolutionary Process
    3.1 Individuals
        3.1.1 Boolean Trees
        3.1.2 Action Trees
    3.2 Initialization
    3.3 Evaluation
    3.4 Selection
    3.5 Reproduction
    3.6 Competition

4 Experimental Setup

5 Results
    5.1 Future Experiments

6 Conclusion
    6.1 Research Answers
    6.2 Future Work

1 Introduction

Terrorist attacks are a threat in the world today. These attacks can occur anywhere. The VEnOM¹ Labs group is currently working on a system called FiRSTE that will be used to train first responders in attacks against public buildings. One of the things lacking in FiRSTE is autonomous people walking around the virtual environment interacting with the trainees. This project is designed to produce agents that can be used in the VR environment. This will provide a better learning experience in the virtual world. This paper covers the beginning of the experiment with a model in a two-dimensional world. The remainder of the introduction describes the background information, the approach taken, and the research questions being explored.

¹ http://web.umr.edu/ vrcs

1.1 Background

This simulation of agents in a hostile environment is based on a paper written by Thomas Haynes in 1995 [HW95]. Haynes simulated simple agents in a hostile environment using genetic programming techniques to evolve behaviors. The environment presented was based on placing energy and mines randomly across a two-dimensional grid. Agents were evolved to find energy and mines on the grid. The emphasis of his approach was evolving an agent that could operate on a randomly generated environment, in his case 10 × 10 or 20 × 20 grids. He allowed communication between multiple agents, which enabled the creation of better agents. He showed that agents could operate in randomly generated environments to detect the energy and mines. This project extends some of these goals. Agents are used in randomly generated environments. Terrorist agents that can add traps to the environment are introduced, along with other agent types and abilities. No form of energy usage is tracked in this iteration of the project, although limiting air for first responders is a consideration for the future.

1.2 Approach

In modeling a hostile environment three types of agents are used. Victim agents are created to simulate how a victim would act. A first responder agent performs typical duties such as locating hazards and helping victims. The third type of agent is a terrorist. These agents are able to alter the hostility of the environment by adding traps to the map and by having the ability to kill other agents. A simulation is performed with these agents to judge how well they perform in the environment.

Using the information from the simulation, the agents are evolved to improve their abilities in the environment. The first responders become better at rescuing and detecting, and the terrorists become more “effective” in their placement of traps and use of their abilities. An Evolutionary Computation technique called genetic programming is employed to develop the programs that will drive the agents to success.

1.3 Genetic Programming

Genetic Programming is an EC technique where the individuals in the population are executable computer programs. GP has been thoroughly researched by Koza, Langdon and many other individuals [Koz90, LQ95]. The individuals in the population are parse trees that consist of terminals and non-terminals. Two types of trees are used for this project: traditional LISP-style decision trees and expression trees. The decision tree is a combination of if-then-else statements that are combined with actions. Typically, the boolean part of the if statement is monotonic. In this case expression trees have been used for the boolean statements. Decision and expression trees are discussed further in Section 3.1. The other major changes to the typical evolutionary cycle in GP are in the reproduction phase with the mutation and crossover operators. Figure 1 shows a typical crossover operation. Two parents are selected, one is copied, and a subtree in the second is used to replace a subtree in the first. Figure 2 shows a typical mutation operation where a randomly chosen subtree in the individual is replaced.

Figure 1: A crossover operation

Figure 2: A mutation operation
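The two operators can be sketched on LISP-style trees stored as nested Python tuples. This is only an illustration under stated assumptions, not the project's C++ code: the tuple encoding, the helper names (`subtree_paths`, `get_subtree`, `replace_subtree`, `crossover`, `mutate`, `random_tree`), and the 0.3 chance of stopping early in `random_tree` are all hypothetical choices made for the example.

```python
import random

# A tree is either a terminal (an int) or a tuple (operator, left, right).

def subtree_paths(tree, path=()):
    """Return the path (tuple of child indices) of every node in the tree."""
    paths = [path]
    if isinstance(tree, tuple):
        for i, child in enumerate(tree[1:], start=1):
            paths.extend(subtree_paths(child, path + (i,)))
    return paths

def get_subtree(tree, path):
    """Follow a path of child indices down from the root."""
    for i in path:
        tree = tree[i]
    return tree

def replace_subtree(tree, path, new):
    """Return a copy of tree with the node at path replaced by new."""
    if not path:
        return new
    i = path[0]
    return tree[:i] + (replace_subtree(tree[i], path[1:], new),) + tree[i + 1:]

def crossover(parent_a, parent_b, rng):
    """Copy parent_a and overwrite one random subtree with one from parent_b."""
    target = rng.choice(subtree_paths(parent_a))
    donor = get_subtree(parent_b, rng.choice(subtree_paths(parent_b)))
    return replace_subtree(parent_a, target, donor)

def random_tree(depth, rng):
    """Grow a random arithmetic tree of at most the given depth."""
    if depth == 0 or rng.random() < 0.3:
        return rng.randint(1, 9)
    return (rng.choice("+-*%"), random_tree(depth - 1, rng), random_tree(depth - 1, rng))

def mutate(parent, rng, max_depth=2):
    """Copy parent and replace one random subtree with a freshly grown one."""
    target = rng.choice(subtree_paths(parent))
    return replace_subtree(parent, target, random_tree(max_depth, rng))

rng = random.Random(0)
parent_a = ("+", ("*", 1, 4), ("%", 3, 2))
parent_b = ("+", ("^", 5, 8), ("-", 7, 6))
child = crossover(parent_a, parent_b, rng)
mutant = mutate(parent_a, rng)
```

Because tuples are immutable, both operators return new trees and leave the parents untouched, which mirrors the "one is copied" step in the description above.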

1.4 Research Questions

The research questions this paper addresses are as follows:

• What is the complexity level of a good agent?

• What sort of search path will a first responder agent follow?

• Which control parameters produce the best first responder?

• Which control parameters produce the best terrorist?

• Do better agents develop using the same map for all generations or a new map for each generation?


Table 1: Bit Usage for Map Array

Bit Number   Usage
b0           North Wall
b1           East Wall
b2           South Wall
b3           West Wall
b4           Neutral Agent
b5           Good Agent
b6           Hostile Agent
b7           Memory
b8           Trap Present

2 Methodology

The project consists of three elements: the map generator, the agents, and the genetic programs. Each of these was approached with different techniques to combine the whole project as one system. The parts were coded in C++ using object-oriented techniques. A viewer has also been developed to show the agents at work.

2.1 Map Generator

One of the core items of this project is the map generator. This generator is used to generate maps for performing the experiments and is also used for memory storage for the agents. Figure 3 shows an example of a randomly generated map. Several concerns were considered when creating the map class: low memory usage, high speed, and adaptability for expansion were the requirements of the map generator. The generic size feature allows for maps from 1 × 1 up to n × n, limited only by the memory of the computer. The map layout is currently done with an array of bit arrays. Each element of the array is used to store information about the cell. The cell can have walls, agents, explosives, and other attributes. The information about these attributes is stored bitwise in a sixteen-bit array. Table 1 shows the usage of the bits.

Eight bits are being used for information storage, which is half usage for each cell. Later, if more bits are needed, the array can easily be changed to a 32-bit array, giving twice the storage space. Currently a 1000 × 1000 map would use approximately 20MB of memory. For this iteration of the project, randomly generated maps are being used. These maps are generated based on a wall weight, pwall. If pwall = .20 then the map will have about 20% walls. Building maps that actually look like real building floor plans was also a consideration. It was decided that at this time it was more important to get agents that could work in any environment. Later, an approach to building maps that look like real floor plans will be used.
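The bitwise cell layout of Table 1 and the wall weight can be sketched as follows. This is a minimal illustration, not the project's C++ map class: the names `Cell`, `WALLS`, and `generate_map` are hypothetical, and setting each of the four wall bits independently with probability pwall is an assumption about how "about 20% walls" is achieved.

```python
import random
from enum import IntFlag

class Cell(IntFlag):
    """Bit assignments following Table 1 (b0 is the least significant bit)."""
    NORTH_WALL = 1 << 0
    EAST_WALL = 1 << 1
    SOUTH_WALL = 1 << 2
    WEST_WALL = 1 << 3
    NEUTRAL_AGENT = 1 << 4
    GOOD_AGENT = 1 << 5
    HOSTILE_AGENT = 1 << 6
    MEMORY = 1 << 7
    TRAP = 1 << 8

WALLS = (Cell.NORTH_WALL, Cell.EAST_WALL, Cell.SOUTH_WALL, Cell.WEST_WALL)

def generate_map(rows, cols, pwall, seed=None):
    """Return a rows x cols grid of ints; each wall bit is set with probability pwall.

    Assumption: wall bits are drawn independently per cell, so the walls of
    neighbouring cells are not forced to agree with each other.
    """
    rng = random.Random(seed)
    grid = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for wall in WALLS:
                if rng.random() < pwall:
                    grid[r][c] |= int(wall)
    return grid

grid = generate_map(50, 43, pwall=0.20, seed=1)
grid[0][0] |= int(Cell.TRAP)              # place a trap in the top-left cell
has_trap = bool(grid[0][0] & Cell.TRAP)   # test a single bit
grid[0][0] &= ~int(Cell.TRAP)             # clear the trap bit again
```

Storing every attribute as one bit keeps a cell inside a single small integer, which is the property the text relies on for low memory usage and easy widening to 32 bits.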


Figure 3: A generated map with pwall = .30


2.2 Agents

Three types of agents operate on the map with different goals and abilities. Victim, first responder, and terrorist agents are all derived from a generic agent base class.

2.2.1 Generic Agents

The generic agent base class provides basic functionality for any type of agent. It takes care of keeping a copy of the map for memory purposes and provides a generic interface for the EA to use. The goal is for the EA to operate on the agents without knowing the agent types. Each of the agent types adds to this basic functionality to provide its individual abilities.

2.2.2 Victim Agents

The victim agents were the first expansion of the generic agent base. The victim agents move undirected in the environment in a “panicky” state. They have one goal: trying to escape the building. If a victim agent runs across a first responder, there is a chance that it escapes the building. The victim agent has no ability to kill or save a first responder or terrorist. Upon locating a first responder, a victim can share information with that agent. However, this memory sharing will not be clean; there is a chance that the victim will give incorrect information to the first responder. This increases the similarity to real life, where the victims will be in a state of panic and not communicating well. A parameter ploss is used to control this loss of information. If ploss = 1 then all of the transferred memory has the chance of being bad, and if ploss = 0 then no information will be lost in the transfer. For the victim-first responder communication, the victim and the first responder must be at the same location on the map.
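One way to picture the ploss parameter is the sketch below. It is an illustration under explicit assumptions rather than the paper's method: the text does not say how bad information is produced, so the function name `transfer_memory`, the dictionary representation of memory, and the choice to corrupt each transferred cell independently with probability ploss by flipping one wall bit are all hypothetical.

```python
import random

def transfer_memory(victim_memory, ploss, rng):
    """Copy a victim's remembered cells, corrupting each one with probability ploss.

    victim_memory maps (row, col) -> int cell value. The corruption model
    (flip one random bit among the four wall bits) is an assumption.
    """
    received = {}
    for pos, value in victim_memory.items():
        if rng.random() < ploss:
            value ^= 1 << rng.randrange(4)  # flip one of the wall bits b0..b3
        received[pos] = value
    return received

rng = random.Random(0)
memory = {(0, 0): 0b0101, (0, 1): 0b0000, (1, 0): 0b1111}
clean = transfer_memory(memory, ploss=0.0, rng=rng)  # nothing is lost
noisy = transfer_memory(memory, ploss=1.0, rng=rng)  # every cell may be wrong
```

Under this model ploss = 0 reproduces the victim's memory exactly, while ploss = 1 exposes every transferred cell to corruption, matching the two extremes described in the text.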

2.2.3 First Responder Agents

The goals of the first responder in the real world are to identify victims and hazards at the location and, if possible, to remove potential hazards. In the developed environments all of these goals are simulated. The first responder identifies living victims and helps to remove them from the building. When finding a deceased victim, the first responder remembers the location. The first responders attempt to identify terrorists in the building. When a potential hazard is found, the location is marked in the memory of that agent. The responder can attempt to remove the trap at its own risk. There is a chance that the trap will kill the first responder before the responder can remove it. First responders can also catch and detain terrorists.

First responders interact with other first responders in the environment. To do this, radio communication is simulated. The information that a first responder receives on a given iteration is shared with all other first responders in the environment. This is a simple memory copy from one responder to another. When conflicting information is found, the agent keeps its own memory. For this iteration, the first responders do not interact in any other way.
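The radio sharing rule can be sketched as a dictionary merge in which existing entries win. This is a minimal illustration: the names `merge_radio` and `broadcast` and the representation of memory as a mapping from cell position to observed value are assumptions, not details taken from the implementation.

```python
def merge_radio(own, received):
    """Merge another responder's observations into an agent's memory.

    Both arguments map (row, col) -> observed cell value. Cells the agent
    already knows are left untouched, so on conflict its own memory wins.
    """
    merged = dict(received)
    merged.update(own)  # own entries overwrite conflicting received ones
    return merged

def broadcast(new_observations, memories):
    """Share each responder's new observations with all other responders."""
    updated = []
    for i, memory in enumerate(memories):
        result = dict(memory)
        for j, observed in enumerate(new_observations):
            if i != j:
                result = merge_radio(result, observed)
        updated.append(result)
    return updated

# Two responders disagree about cell (0, 0); each keeps its own value.
memories = [{(0, 0): 1, (0, 1): 4}, {(0, 0): 2, (2, 2): 8}]
new_observations = [{(0, 0): 1, (0, 1): 4}, {(0, 0): 2, (2, 2): 8}]
updated = broadcast(new_observations, memories)
```

After the broadcast both responders know about cells (0, 1) and (2, 2), but their conflicting views of (0, 0) remain unchanged.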

2.2.4 Terrorist Agents

The terrorist agent plays another important role in the environment. The general goal of a terrorist agent is to cause mayhem in the environment. With the ability to attack first responders and victims, the terrorist can slow the progress of clearing the building. The terrorist agents are able to set traps that have timed detonation. When the traps detonate, they destroy all walls in the eight surrounding cells and kill all agents in those cells. Terrorists are only able to communicate with other terrorists when they come in contact with each other, at which time they can share accurate information with each other.
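The detonation rule can be sketched on a grid of bit-coded cells. This is an illustration under assumptions: the function name `detonate` and the agent dictionary are hypothetical, and because the text only mentions the eight surrounding cells, the sketch leaves the trap's own cell outside the blast apart from clearing its trap bit.

```python
NORTH, EAST, SOUTH, WEST = 1, 2, 4, 8   # wall bits b0..b3 from Table 1
WALL_MASK = NORTH | EAST | SOUTH | WEST
TRAP = 1 << 8                            # trap bit b8 from Table 1

def detonate(grid, agents, row, col):
    """Explode the trap at (row, col) and return the names of killed agents.

    grid is a list of lists of int cell values; agents maps a name to a
    (row, col) position. Walls are cleared in the eight surrounding cells and
    agents standing in those cells are removed from the agents dictionary.
    """
    grid[row][col] &= ~TRAP  # the trap itself is used up
    killed = []
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            r, c = row + dr, col + dc
            inside = 0 <= r < len(grid) and 0 <= c < len(grid[0])
            if (dr, dc) == (0, 0) or not inside:
                continue
            grid[r][c] &= ~WALL_MASK
            killed += [name for name, pos in agents.items() if pos == (r, c)]
    for name in killed:
        del agents[name]
    return killed

grid = [[WALL_MASK] * 3 for _ in range(3)]
grid[1][1] |= TRAP
agents = {"victim1": (0, 0), "responder1": (1, 1), "victim2": (2, 1)}
killed = detonate(grid, agents, 1, 1)
```

In this example the two victims next to the trap are killed, while the responder standing on the trap cell survives only because of the assumption stated above.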

2.2.5 Judging an agent’s performance

The agents are judged based on their performance in their tasks. Currently the victim agents are not judged, since this agent type is not evolved in this iteration of the project. The other agents are judged based on the following criteria.

• First Responder
    - How many victims did the responder help?
    - How many terrorists did the responder catch?
    - How many traps did the responder remove?
    - Did the first responder survive?
    - How much of the map did the first responder explore?

• Terrorist
    - How many victims were killed by its traps?
    - How many first responders were killed by its traps?
    - How many first responders and victims were killed by contact?
    - How long did the terrorist survive?

2.3 Viewer

A viewer was developed in wxWindows² to show runs with agents in the environment. wxWindows is a cross-platform graphical user interface development library. It allows the viewer to be compiled for running in Windows as well as Linux. The viewer has two modes of operation. The interface can run from a file that was generated from a run in the evolutionary process or it can run a randomly generated map. Figures 3 and 4 are from the viewer application.

² http://www.wxWindows.org


Table 2: Color scheme of the viewer

Color    Shape    What is it
White    Lines    Walls
Green    Box      Victim
Blue     Box      First Responder
Red      Box      Terrorist
Yellow   Circle   Trap

Colored squares are used to identify information in the map. Table 2 tells what each of the colors means. Combinations of colors mean that multiple things occupy the location. For example, a blue box and a red box combine to make a magenta box, meaning that a first responder and a terrorist occupy the location.

3 Evolutionary Process

The evolutionary process takes a population and uses evolution to alter the individuals in it over time. Two populations are evolved for this project: a population of first responders and a population of terrorists. Over time this should produce first responders that get better as terrorists get better, and vice versa. Co-evolution is therefore occurring in this project: agents that have opposite goals are being evolved to compete against each other. At this time, there is no special feature of this project to account for this.

3.1 Individuals

Genetic programming is being used in this research. The two populations being evolved share the same individual representation. This was achieved by giving a terminal in the individual different meanings for the different agents. For instance, a terminal that means place a trap for a terrorist agent means remove a trap for a first responder. Each individual consists of a collection of boolean and action trees.

3.1.1 Boolean Trees

The boolean tree is an expression tree. It contains boolean operators and terminals associated with the map. Instances of this tree are used in the action trees as the boolean expression for the if-then-else statements. The operators in boolean trees are XOR, AND, OR, and NOT. The terminals and their meanings are shown in Table 3. “S” represents some location on the map, which can be the current location of the agent or one of the eight surrounding squares.

Each of the operators, except for NOT, is binary and points to two terminals or two expression trees. The NOT operator points to one expression tree or one terminal. Figure 5 shows a sample boolean tree that was created at random.


Figure 4: The viewer with agents operating

Table 3: Terminals for the boolean trees

Terminal     Meaning
VICTIM(S)    Is there a victim at location S?
TRAP(S)      Is there a trap at location S?
VALIDM(S)    Is moving to S a valid move?
FIRSTR(S)    Is there a first responder at location S?
TERROR(S)    Is there a terrorist at location S?


Figure 5: A Boolean Tree (root AND with two children: NOT applied to VICTIM(S), and OR applied to TRAP(S) and TERROR(S))

Table 4: Terminals and their meanings for the action trees

Terminal     First Responder                     Terrorist
MOVE(S)      Move to location S.                 Move to location S.
SAVE(S)      Save victim(s) at location S.       Nothing
KILL(S)      Trap terrorist(s) at location S.    Kill agent(s) at location S.
PRTRAP(S)    Remove trap at location S.          Place trap at location S.

The tree in Figure 5 evaluates to true when there is not a victim at S and, in addition, either a terrorist or a trap is at location S.
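Evaluating such a tree is a short recursion. The sketch below encodes the Figure 5 tree as nested tuples and evaluates it against a dictionary of sensor readings. The encoding, the function name `evaluate`, and the simplification that all terminals refer to one fixed location S are assumptions made for the example.

```python
from operator import and_, or_, xor

BINARY = {"AND": and_, "OR": or_, "XOR": xor}

def evaluate(tree, sensors):
    """Evaluate a boolean tree given as nested tuples.

    A terminal is a string such as "VICTIM"; sensors maps each terminal name
    to the truth value observed at location S.
    """
    if isinstance(tree, str):
        return sensors[tree]
    if tree[0] == "NOT":
        return not evaluate(tree[1], sensors)
    op, left, right = tree
    return BINARY[op](evaluate(left, sensors), evaluate(right, sensors))

figure5 = ("AND", ("NOT", "VICTIM"), ("OR", "TRAP", "TERROR"))
no_victim_terrorist = evaluate(figure5, {"VICTIM": False, "TRAP": False, "TERROR": True})
victim_present = evaluate(figure5, {"VICTIM": True, "TRAP": True, "TERROR": True})
print(no_victim_terrorist, victim_present)  # True False
```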

3.1.2 Action Trees

The action trees consist of boolean trees, an operator, and terminals. Since these are decision trees, the primary operator is the if-then-else statement, or IFTE. The IFTE contains a boolean tree and points to two other action trees that can be actions or further IFTE statements. The actions that can be taken are listed in Table 4. These actions have different meanings based on whether a first responder or terrorist is using them. The first responders and terrorists interpret tree results differently, making the tree generic. Terrorists do nothing if they choose save, and first responders can only kill terrorists. An action tree is illustrated in Figure 6. Anywhere a (BOOLTREE) is seen, there would actually be a full boolean tree at that location. Algorithm 1 shows the pseudocode that is equivalent to the tree shown in Figure 6.

Algorithm 1 Action Tree Example

    if BOOLTREE1 then
        if BOOLTREE2 then
            MOVE to S
        else
            KILL
        end if
    else
        if BOOLTREE3 then
            SAVE
        else
            MOVE to S
        end if
    end if

Figure 6: An Action Tree (root IFTE on BoolTree 1; its then-branch is an IFTE on BoolTree 2 choosing between MOVE(S) and KILL, and its else-branch is an IFTE on BoolTree 3 choosing between SAVE and MOVE(S))

3.2 Initialization

The trees for the agents can be initialized in two different ways. The first is random tree generation. When the evolution starts, trees are generated with two depths: a depth for the boolean subtrees and a total depth for the root action tree. The depths are controlled with input parameters. The second technique for initialization uses good past individuals as a starting point for evolution. Generating good individuals with GP techniques takes time, so once some are found they are a good starting point for future runs of the application.
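Random initialization with the two depth parameters, together with the walk through an IFTE tree described in Section 3.1.2, can be sketched as follows. The code is illustrative only: the tuple encoding, the function names, and the choice to always grow full trees to the requested depth are assumptions, since the text does not say which generation method was used.

```python
import random

ACTIONS = ("MOVE", "SAVE", "KILL", "PRTRAP")                        # Table 4
BOOL_TERMINALS = ("VICTIM", "TRAP", "VALIDM", "FIRSTR", "TERROR")  # Table 3

def random_bool_tree(depth, rng):
    """Grow a boolean tree with the given depth (assumed 'full' growth)."""
    if depth <= 1:
        return rng.choice(BOOL_TERMINALS)
    op = rng.choice(("AND", "OR", "XOR", "NOT"))
    if op == "NOT":
        return (op, random_bool_tree(depth - 1, rng))
    return (op, random_bool_tree(depth - 1, rng), random_bool_tree(depth - 1, rng))

def random_action_tree(depth_at, depth_bt, rng):
    """Grow an IFTE tree of depth depth_at whose conditions have depth depth_bt."""
    if depth_at <= 1:
        return rng.choice(ACTIONS)
    return ("IFTE", random_bool_tree(depth_bt, rng),
            random_action_tree(depth_at - 1, depth_bt, rng),
            random_action_tree(depth_at - 1, depth_bt, rng))

def run(tree, condition):
    """Walk an action tree; condition(bool_tree) -> bool decides each IFTE."""
    while isinstance(tree, tuple):
        _, test, then_branch, else_branch = tree
        tree = then_branch if condition(test) else else_branch
    return tree

# Algorithm 1, with the boolean trees left as named placeholders.
algorithm1 = ("IFTE", "BOOLTREE1",
              ("IFTE", "BOOLTREE2", "MOVE", "KILL"),
              ("IFTE", "BOOLTREE3", "SAVE", "MOVE"))
truth = {"BOOLTREE1": False, "BOOLTREE3": True}
action = run(algorithm1, lambda test: truth[test])  # takes the else-branch, then SAVE

individual = random_action_tree(depth_at=4, depth_bt=3, rng=random.Random(0))
```

The `condition` callback is where a boolean-tree evaluator would be plugged in. Keeping it separate lets the same action tree be interpreted by either agent type.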

3.3 Evaluation

Every generation the agents are evaluated. This evaluation takes the form of running a simulation multiple times with the agents. The simulation is run multiple times because a good agent could be killed very quickly depending on initial placement. This gives an average fitness that is more representative of the individual. The agents all enter the environment differently. The victim and terrorist agents are randomly placed in the grid. There is a terrorist for every six or seven victims; this is controlled with a parameter. The responder agents enter through doors on the side of the map and proceed to explore the environment. The terrorists continue to set traps and the victims continue to move randomly. After the simulation has finished, the agents are evaluated on their performance and that fitness is used by the EA.

Fitness is determined based on the items listed in Section 2.2.5. Fitness for a first responder is based on the number of victims rescued, survival time, terrorists caught, traps defused, and the amount of the map explored. These factors are multiplied by constants and form the fitness function shown in Equation 1. The terrorist agent has a similar arrangement, just with different goals. The goals include the number of traps placed, agents killed by traps, agents killed by contact, survival time, and the number of friendlies killed. This fitness calculation is shown in Equation 2. The terms in Equations 1 and 2 are explained in Table 5. These fitnesses are then used for agent competition.

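\[ F_{FirR} = \mathit{VH} \cdot c_{VH} + \mathit{TC} \cdot c_{TC} + \mathit{TR} \cdot c_{TR} + \mathit{FST} \cdot c_{FST} + \mathit{ME} \cdot c_{ME} \tag{1} \]

\[ F_{Terr} = \mathit{TK} \cdot c_{TK} + \mathit{CK} \cdot c_{CK} + \mathit{TP} \cdot c_{TP} + \mathit{TST} \cdot c_{TST} - \mathit{FK} \cdot c_{FK} \tag{2} \]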
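As a worked illustration, Equations 1 and 2 translate directly into two small functions. The default constants are the values the paper lists in Table 6. The function and argument names are hypothetical, and the sketch scores a single simulation rather than the average over several runs.

```python
def first_responder_fitness(vh, tc, tr, fst, me, sim_time, c=None):
    """Equation 1, with the constants listed in Table 6 as defaults."""
    c = c or {"VH": 5, "TC": 1, "TR": 1, "FST": 1 / sim_time, "ME": 10}
    return (vh * c["VH"] + tc * c["TC"] + tr * c["TR"]
            + fst * c["FST"] + me * c["ME"])

def terrorist_fitness(tk, ck, tp, tst, fk, sim_time, c=None):
    """Equation 2; kills of fellow terrorists (fk) are subtracted."""
    c = c or {"TK": 5, "CK": 2, "TP": 1, "TST": 1 / sim_time, "FK": 0.5}
    return (tk * c["TK"] + ck * c["CK"] + tp * c["TP"]
            + tst * c["TST"] - fk * c["FK"])

# A responder that helped 3 victims, removed 1 trap, explored 25 cells and
# survived all 1000 steps: 3*5 + 0*1 + 1*1 + 1000/1000 + 25*10 = 267
responder_score = first_responder_fitness(vh=3, tc=0, tr=1, fst=1000, me=25, sim_time=1000)

# A terrorist with 2 trap kills, 1 contact kill, 4 traps placed, survival for
# half the run, and 1 fellow terrorist killed: 10 + 2 + 4 + 0.5 - 0.5 = 16
terrorist_score = terrorist_fitness(tk=2, ck=1, tp=4, tst=500, fk=1, sim_time=1000)
print(responder_score, terrorist_score)  # 267.0 16.0
```

With these constants a single explored cell is worth twice a rescued victim, which shows how strongly map exploration dominates the responder's score.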

3.4 Selection

Selection was a simple rank-based selection. The trees are sorted based on their calculated fitness. Since the populations are small, around ten, the best two or three individuals are used to generate the children.

3.5 Reproduction

Reproduction consists of two operators: crossover and mutation. Only one is performed at a time due to the destructive nature of these operators when using GP. A parameter controls the chance of each one occurring. The crossover operator copies parent A and chooses a subtree to replace. This can be an action or boolean tree. An action or boolean tree of matching depth is found in parent B to replace the one in parent A. A parameter is used to control the chance that a given tree is chosen for crossover.

Mutation is very similar. The child is created by copying a parent, and a boolean or action subtree is chosen for mutation. This tree is then replaced by a randomly generated tree of similar depth. Again, a parameter controls the selection of this tree.

Table 5: Label Definitions for Equations 1 and 2

Label   Meaning
VH      Number of Victims Helped
TC      Number of Terrorists Caught
TR      Number of Traps Removed
FST     First Responder Survival Time
ME      Number of cells in the map explored
TK      Number of Kills via Traps
CK      Number of Kills via Contact
TP      Number of Traps Placed
FK      Number of other Terrorists Killed
TST     Terrorist Survival Time

3.6 Competition

Competition was simple for this project. The worst two or three population members are replaced by the children. Other methods could be used, but this is appropriate since the population size has always been around ten.
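Selection, reproduction, and competition together form one generation step, which can be sketched as below. This is a minimal outline under assumptions: the function name `next_generation` and its arguments are hypothetical, the operators are passed in as callables, and toy integer individuals stand in for the trees so the example stays short. The `cross_mut_chance` argument mirrors the parameter of the same purpose in the dat file, where 1 means pure crossover and 0 means pure mutation.

```python
import random

def next_generation(population, fitness, crossover, mutate,
                    cross_mut_chance, rng, n_parents=2, n_children=2):
    """One generation: rank, breed from the best, replace the worst.

    population is a list of individuals; fitness maps an individual to a
    number; crossover(a, b) and mutate(a) return new individuals.
    """
    ranked = sorted(population, key=fitness, reverse=True)
    parents = ranked[:n_parents]
    children = []
    for _ in range(n_children):
        if rng.random() < cross_mut_chance:   # crossover only
            a, b = rng.sample(parents, 2)
            children.append(crossover(a, b))
        else:                                 # mutation only
            children.append(mutate(rng.choice(parents)))
    # The worst n_children members are dropped in favour of the children.
    return ranked[:len(ranked) - n_children] + children

# Toy example: individuals are integers and fitness is the value itself.
population = list(range(10))
new_population = next_generation(
    population,
    fitness=lambda x: x,
    crossover=lambda a, b: a + b,
    mutate=lambda a: a + 1,
    cross_mut_chance=1.0,
    rng=random.Random(0),
)
```

Here the two best individuals (9 and 8) produce two children by crossover, and the two worst (0 and 1) are removed, so the population size stays at ten.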

4 Experimental Setup

The experimental setup consists of two primary parts. The first was choosing values for the constants in the fitness functions; the values used are shown in Table 6. These constants were chosen based on the initial focus of the project. For the first responders, map exploration is the most important factor, thus a multiplier of 10. For the terrorists the most important factor was kills using traps, thus it has the highest multiplier in that fitness function. The survival time term is scaled so that it contributes 1 if the agent survives the entire simulation. The reason for this is that in this iteration it is more important to see the agents performing the other tasks than merely surviving.

Next, the parameter file structure was created. Table 7 shows the parameters and their ranges. These are put in the dat file and used by the agent simulator to evolve agents. The agent simulator is a console application that runs the evolution and simulations of the agents. To run this application, the executable is run with the dat file as the first argument. This program outputs the fitnesses every loginterval generations.


Table 6: Fitness Constants used in Equations 1 and 2 for the experiments.

Label    Value
c_VH     5
c_TC     1
c_TR     1
c_FST    1/SimulationTime
c_ME     10
c_TK     5
c_CK     2
c_TP     1
c_FK     1/2
c_TST    1/SimulationTime

Table 7: Parameters in the dat file

Parameter Name        Meaning                                  Range
mapsize rows cols     Size of the map                          rows > 0, cols > 0
mapregen N            How often the map regenerates            N > 0
Seed N                Starting Seed                            Any Integer
AgentSizeFR N         Number of First Responders               N > 0
AgentSizeS N          Number of Victims                        N > 0
AgentSizeT N          Number of Terrorists                     N > 0
NumBombs N            Maximum Number of Traps                  N ≥ 0
MaxNumGens N          Maximum Number of Generations            N ≥ 0
StopFit D             Fitness to stop at                       D ≥ 0
DepthBT N             Initial Boolean Tree Depth               N > 0
DepthAT N             Initial Action Tree Depth                N > 0
NumRuns N             Number of times to run the simulation    N ≥ 0
loginterval N         How often to log information             N ≥ 0
cross mut chance D    Crossover vs Mutation Chance             0 ≤ D ≤ 1
pc cross D            Crossover Tree Selection Chance          0 ≤ D ≤ 1
pc mut D              Mutation Tree Selection Chance           0 ≤ D ≤ 1
pwall D               Percent Walls in Generated Map           0 ≤ D ≤ 1
debuglevel N          Debug Output Level                       N ≥ 0
SimulationTime N      Number of steps in each simulation       N ≥ 0
ploss D               Memory loss with victim contact          0 ≤ D ≤ 1


Table 8: Parameter values used in the runs

Parameter Name      Default   Run 1    Run 2    Run 3    Run 4
mapsize             0 0       50 43    50 43    50 43    50 43
mapregen            0         0        5        5        0
Seed                N/A       N/A      N/A      10       N/A
AgentSizeFR         10        10       10       10       10
AgentSizeS          30        30       30       30       30
AgentSizeT          5         5        5        5        5
NumBombs            5         5        5        5        5
MaxNumGens          ∞         ∞        ∞        ∞        ∞
StopFit             ∞         ∞        ∞        ∞        ∞
DepthBT             3         3        3        3        3
DepthAT             10        10       10       10       10
NumRuns             5         5        5        5        5
loginterval         10        10       10       10       10
cross mut chance    .5        .5       .8       1        1
pc cross            .12       .12      .12      .12      .12
pc mut              .21       .21      .21      .21      .21
pwall               .15       .15      .15      .15      .15
debuglevel          10        10       10       10       10
SimulationTime      1000      1000     5000     10000    10000
ploss               N/A       N/A      N/A      N/A      N/A

5 Results

Many different experiments needed to be run to find optimum parameter values to use for the generation of intelligent agents. The first experiments run in this research were done to get a feel for the parameter values. To start with, parameter values were chosen based on knowledge of the problem, knowledge of GP, and some random values. Table 8 shows the default values for the parameters as well as the parameter values used in later runs. The first experiments ran with no exceptional results and were all similar to run 1 in configuration. The fitness levels from these experiments ranged around 300 on average. Occasional exceptions with fitnesses up to 1000 were seen, but did not last long. Figure 7 shows a graph of the fitnesses versus number of generations for this experiment. The reason that the fitnesses can increase and decrease so drastically is that the agents' movements are affected by other agents in the environment.

Due to the fact that good results were not being produced, it was decided toturn the terrorists off and to restrict first responders to movement only to seeif good first responders could develop in the environment. This is referred toin the table as run 2. Two parameters were raised for run 2: cross mut chance

was raised to .8 and mapregen was set to 5. This run resulted in a loweraverage fitness, about 250, but more individuals reaching above a fitness of1400. Figure 8 is graph showing the output from this run. Several runs were

16

Page 17: Evolving Agents In a Hostile Environmentweb.mst.edu/~tauritzd/courses/ec/fs2003/project/Berry.pdfEvolving Agents In a Hostile Environment Alex J. Berry December 4, 2003 Abstract This

Figure 7: Run 1 Fitness versus number of generations output. (Plot of Fitness Level, 0 to 2000, against Number of Generations, 0 to 5000; series: Best Individual, Worst Individual, Average Individual.)


Figure 8: Run 2 Fitness versus number of generations output. (Plot of Fitness Level, 0 to 1800, against Number of Generations, 0 to 20000; series: Best Individual, Worst Individual, Average Individual.)

done similarly to this one, all producing similar results.

For the next experiments, the cross mut chance parameter was explored further. It was run set to 0 and to 1. When it was set to 0 and pure mutation was being used, no good results were produced. The results from run 1 were better with the chance set to 0.5. However, when the parameter was set to 1, a pure crossover evolution, good results were starting to appear when suddenly the program was killed. Figure 9 shows the output from this run. As one can see, there is little change in the data for the first 4750 generations, but suddenly between generations 4750 and 4810 intelligence starts to emerge. Upon exploration I found that the program was killed due to excessive memory usage. Figure 11 shows the number of nodes for the best individual for runs 3 and 4. The best individual from run 3 has a depth of 60 and 3,354,136 nodes. This experiment was run again, as run 4, unseeded to see if the same results would occur. Figure 10 shows the output of this run. This time the explosion of size occurs at 100 generations, and again this giant tree is the highest fitness individual in the population. Table 9 shows the tree node count, tree depth, and fitness for the best individuals at the ends of all of the runs. This appears to point to the fact that crossover is the more important operator and that this is a very complex problem.
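The role of the cross mut chance parameter and the tree growth seen under pure crossover can be illustrated with a minimal sketch. It assumes that cross mut chance is the probability of choosing crossover over mutation (consistent with 0 meaning pure mutation and 1 meaning pure crossover) and that individuals are stored as nested tuples; the function names and the tree encoding are hypothetical and are not taken from the project's code.

```python
import random

def choose_operator(cross_mut_chance, rng):
    """Pick crossover with probability cross_mut_chance, otherwise mutation."""
    return "crossover" if rng.random() < cross_mut_chance else "mutation"

def node_count(tree):
    """Count nodes of a tree stored as nested tuples: (label, child, child, ...)."""
    return 1 + sum(node_count(child) for child in tree[1:])

def subtrees(tree, path=()):
    """Yield (path, subtree) for every node; a path is a tuple of child indices."""
    yield path, tree
    for i, child in enumerate(tree[1:], start=1):
        yield from subtrees(child, path + (i,))

def replace_at(tree, path, new_subtree):
    """Return a copy of tree with the node at path replaced by new_subtree."""
    if not path:
        return new_subtree
    i = path[0]
    return tree[:i] + (replace_at(tree[i], path[1:], new_subtree),) + tree[i + 1:]

def crossover(parent_a, parent_b, rng):
    """Subtree crossover: graft a random subtree of parent_b into parent_a."""
    path, _ = rng.choice(list(subtrees(parent_a)))
    _, donor = rng.choice(list(subtrees(parent_b)))
    return replace_at(parent_a, path, donor)
```

Nothing in this sketch bounds the size of the child: a leaf of one parent can be replaced by the whole of the other parent, so repeated crossover can grow trees quickly unless a depth or node limit is enforced after the operator is applied.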

Another approach that was attempted was the creation of an individual by hand that would produce good results. Figure 12 shows this individual. This individual only contains movement terminals; no other factors are taken into


Figure 9: Run 3 Fitness versus number of generations output. (Plot of Fitness Level, 0 to 1800, against Number of Generations, 0 to 5000; series: Best Individual, Worst Individual, Average Individual.)

Table 9: Node counts, tree depths, and fitness for the end of the runs.

Run   Tree Depth   Number of Nodes   Fitness
1     5            21                757
2     14           315               535
3     60           3,354,136         1615
4     53           2,864,677         1105


Figure 10: Run 4 Fitness versus number of generations output. (Plot of Fitness Level, 0 to 1600, against Number of Generations, 0 to 140; series: Best Individual, Worst Individual, Average Individual.)


Figure 11: Number of nodes vs generations for Runs 3 & 4. (Plot of Number of Nodes, 0 to 3.5e+06, against Number of Generations, 0 to 5000; series: Run 3, Run 4.)


IFTE

MOVE(N) VALIDM(N) IFTE

MOVE(E) VALIDM(E) IFTE

MOVE(S) VALIDM(S) IFTE

MOVE(W) VALIDM(W) SAVE

Figure 12: Hand generated good individual

account. This tree produced good results on its first evaluation, with a fitness of 1635, but lost steam after that. Later it was realized that this individual can easily get into movement cycles. This shows that more use of the agent’s memory will be needed to produce good results.
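The movement cycles described above can be reproduced in a small sketch. It assumes that each IFTE node in Figure 12 tests VALIDM(d) and executes MOVE(d) when the test succeeds, falling through to the next IFTE otherwise, with SAVE as the final fallback. The grid encoding, the coordinate convention, and all function names are illustrative assumptions, not the simulator's interface.

```python
# Assumed semantics: VALIDM(d) is true when the neighbouring cell in direction d
# is inside the map and free ("."); MOVE(d) steps into it; SAVE leaves the agent in place.
DIRS = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}

def valid_move(grid, pos, d):
    """Stand-in for VALIDM(d): is the cell in direction d a free cell of the grid?"""
    x, y = pos[0] + DIRS[d][0], pos[1] + DIRS[d][1]
    return 0 <= y < len(grid) and 0 <= x < len(grid[0]) and grid[y][x] == "."

def hand_built_step(grid, pos):
    """Nested IFTE chain: try N, then E, then S, then W, otherwise SAVE."""
    for d in "NESW":
        if valid_move(grid, pos, d):
            return (pos[0] + DIRS[d][0], pos[1] + DIRS[d][1]), "MOVE(%s)" % d
    return pos, "SAVE"

def run(grid, start, steps):
    """Apply the hand-built individual repeatedly and record every position."""
    pos, trace = start, [start]
    for _ in range(steps):
        pos, _action = hand_built_step(grid, pos)
        trace.append(pos)
    return trace

# Open 3 by 3 map; positions are (x, y) with y growing southwards.
trace = run(["...", "...", "..."], (0, 2), 8)
```

On this map the agent walks north, then east, and then alternates between the top-right cell and the cell below it, so only 6 of the 9 cells are ever visited. Because the individual has no state, the same position always yields the same move, which is why some form of memory is needed to break the cycle.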

5.1 Future Experiments

There are more experiments that need to be performed. These experiments cannot currently be done due to the complex nature of the problem. Tree simplification algorithms and better mutation and crossover operators will need to be added before these experiments can be performed. The following list itemizes the experiments that would enhance the results.

• Fluctuating the ratios of victim:terrorist:first responder agents to find an appropriate level.

• Agents developed in a constant/changing map environment from different evolutionary cycles to compare parameters.

• The best agents from static and random evolution compete to find out which approach is better.

• Experiments that vary all of the control parameters to see what levels of agents are produced.


6 Conclusion

First responder training is becoming more important in our world with today’s terrorist threats. One of the locations of terrorist attacks is in buildings. Currently, VEnOM Labs is working on a virtual reality software suite to train first responders in this environment. One of the features that would be helpful is autonomous agents in the VR environment. This research has looked at the first steps of creating first responder and terrorist agents that will be added to the VR world in the future. This research has shown that this problem has a very complex genetic programming solution. The best individuals that have been produced have depths of around 60 and node counts in the millions. It has also been shown that the crossover operator is better for reproduction than the mutation operator. The use of GP to evolve agents in a hostile environment has proved successful in this research. With further exploration and new operators, agents that can be used in the VR world will be created. In Section 1.4, five questions were set out to be explored; the research answers are looked at next.

6.1 Research Answers

• What is the complexity level of a good agent?

A good agent will be very complex. At the outset of this research it was thought that the individuals needed to produce good results would be fairly simple. Through experimentation, it was found that the best individuals were becoming very complex. Very large trees were being executed in the evolutionary steps. A simplification algorithm will be needed in the future to attempt to reduce the complexity of the trees. The complexity of the research was reduced as the experiments progressed; however, these features will be added back in the future.
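One possible shape for such a simplification step is sketched below. It applies a single rewrite rule, collapsing an IFTE node whose two branches are identical, and is only sound under the assumption that condition nodes have no side effects, which the project may or may not guarantee. The nested-tuple encoding and the terminal labels are hypothetical.

```python
def simplify(tree):
    """Bottom-up rewrite: IFTE(cond, X, X) becomes X when both branches are equal.

    Trees are nested tuples of the form (label, child, child, ...).
    Assumes evaluating cond has no side effects, so dropping it is safe.
    """
    label = tree[0]
    children = tuple(simplify(child) for child in tree[1:])
    if label == "IFTE" and len(children) == 3 and children[1] == children[2]:
        return children[1]
    return (label,) + children
```

Because children are simplified before their parent, a collapse deep in the tree can expose further collapses higher up in the same pass. A practical simplifier would need more rules than this one to make a dent in trees with millions of nodes.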

• What sort of search path will a First Responder Agent follow?

None of the runs shown, nor the trial runs that were unrecorded, show agents traveling in an intelligent search pattern. Part of this is due to the fact that there is no terminal for “Has the agent been at location S?”. This terminal will need to be added to get the agents to explore more of the map. The manually created individual also showed this.
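A minimal sketch of how such a terminal might work is given below. The terminal name VISITED, the set-based memory, the grid encoding, and the fixed N, E, S, W preference order are all assumptions made for illustration; none of them come from the project's terminal set.

```python
DIRS = {"N": (0, -1), "E": (1, 0), "S": (0, 1), "W": (-1, 0)}

def neighbour(pos, d):
    """Cell reached from pos by one step in direction d."""
    return (pos[0] + DIRS[d][0], pos[1] + DIRS[d][1])

def valid_move(grid, pos, d):
    """True when the cell in direction d is inside the map and free (".")."""
    x, y = neighbour(pos, d)
    return 0 <= y < len(grid) and 0 <= x < len(grid[0]) and grid[y][x] == "."

def visited(memory, pos, d):
    """Hypothetical VISITED(d) terminal: has the agent already been in that cell?"""
    return neighbour(pos, d) in memory

def step_with_memory(grid, pos, memory):
    """Prefer a valid move into an unvisited cell; fall back to any valid move."""
    valid = [d for d in "NESW" if valid_move(grid, pos, d)]
    fresh = [d for d in valid if not visited(memory, pos, d)]
    choice = (fresh or valid or [None])[0]
    new_pos = neighbour(pos, choice) if choice else pos
    memory.add(new_pos)
    return new_pos

def explore(grid, start, steps):
    """Run the policy and return the set of cells the agent has been in."""
    pos, memory = start, {start}
    for _ in range(steps):
        pos = step_with_memory(grid, pos, memory)
    return memory
```

On an open 3 by 3 map, starting in the bottom-left corner, this policy visits all nine cells within eight steps. In an evolved individual the preference for unvisited cells would not be hard-coded as it is here; the terminal would simply be available for the tree to use.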

• Which control parameters produce the best First Responder?

Again, little was learned in this area. It is known that crossover is a better reproduction operator. It also appears that the fitness function constants that were used will produce good agents that can explore the map.

• Which control parameters produce the best Terrorist?

Nothing was learned about terrorist parameters as their evolution was disabled early on in the experimentation process, since the focus was on first responder behavior. Later the evolution of these agents will be re-enabled.


• Do better agents develop using the same map for all generations or a new map for each generation?

The agents that were produced in the results of this project were unaffected by changing the map versus not changing the map. Runs 3 and 4 in the results section show that good agents can be produced either way, if indeed good agents are emerging.

6.2 Future Work

There are several things that could be added to this project to expand it in the future. One is expansion into a 3D environment, specifically a VR environment, to see how the agents do. There are many evolutionary approaches that can be looked at, such as a Learning Classifier System or an LCS-GP hybrid. A minimax approach could be explored where a GP evolves the heuristic evaluation. Also, it was not taken into account that the populations will become very strong against each other, but not against other populations. An island approach where population members are shared over time should help to fix this problem.
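The island approach mentioned above could take the following form. This is a sketch of one common variant, ring migration, in which each island periodically sends copies of its best individuals to the next island; the function name, the ring topology, and the replace-the-worst policy are assumptions, since the text does not specify how members would be shared.

```python
def migrate(islands, fitness, n_migrants=1):
    """Ring migration between separately evolving populations.

    Each island sends copies of its n_migrants best individuals to the next
    island, where they replace that island's worst individuals. Population
    sizes are unchanged. Assumes n_migrants is smaller than every island.
    """
    best = [sorted(pop, key=fitness, reverse=True)[:n_migrants] for pop in islands]
    new_islands = []
    for i, pop in enumerate(islands):
        incoming = best[(i - 1) % len(islands)]
        survivors = sorted(pop, key=fitness, reverse=True)[:len(pop) - len(incoming)]
        new_islands.append(survivors + incoming)
    return new_islands
```

Calling such a function every few generations would expose each population to strong individuals evolved against different opponents, which is the effect the island approach is meant to achieve.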

