Distributed Evolution for Swarm Robotics
Suranga Hettiarachchi
Computer Science Department
University of Wyoming
Committee Members:
Dr. William Spears – Computer Science (Committee Chair / Research Advisor)
Dr. Diana Spears – Computer Science
Dr. Thomas Bailey – Computer Science
Dr. Richard Anderson-Sprecher – Statistics
Dr. David Thayer – Physics and Astronomy
Outline
• Goals and Contributions
• Robot Swarms
• Physicomimetics Framework
• Offline Evolutionary Learning
• Novel Distributed Online Learning
• Obstacle Avoidance with Physical Robots
• Conclusion and Future Work
Goals
• To improve the state-of-the-art of obstacle avoidance in swarm robotics.
• To create a novel real-time learning algorithm for swarm robotics, to improve performance in changing environments.
Contributions
• Improved performance in obstacle avoidance:
  • Scales to far higher numbers of robots and obstacles than the norm
• Invented an online population-based learning algorithm:
  • Demonstrate feasibility of the algorithm with obstacle avoidance, in environments that change dynamically and are three times denser than the norm, with obstructed perception
• Hardware implementation:
  • Implemented the obstacle avoidance algorithm on real robots
Obstacle Avoidance
Hardware Implementation
Online Learning Algorithm
Outline
• Goals and Contributions
• Robot Swarms
• Physicomimetics Framework
• Offline Evolutionary Learning
• Novel Distributed Online Learning
• Obstacle Avoidance with Physical Robots
• Conclusion and Future Work
Robot Swarms
• Robot swarms can act as distributed computers, solving problems that a single robot cannot
• For many tasks, having a swarm maintain cohesiveness while avoiding obstacles and performing the task is of vital importance
• Example Task: Chemical Plume Source Tracing
Chemical Plume Source Tracing
Outline
• Goals and Contributions
• Robot Swarms
• Physicomimetics Framework
• Offline Evolutionary Learning
• Novel Distributed Online Learning
• Obstacle Avoidance with Physical Robots
• Conclusion and Future Work
Physicomimetics for Robot Control
• Biomimetics: Gain inspiration from biological systems and ethology.
• Physicomimetics: Gain inspiration from physical systems. Good for formations.
Physicomimetics Framework
Robots have limited sensor range, and friction for stabilization.

(Figure: virtual forces F, exerted on a robot A by other robots a1–a4 and by the environment, cause a displacement d in its behavior.)
Robots are controlled via “virtual” forces from nearby robots, goals, and obstacles. F = ma control law.
Seven robots form a hexagon
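The F = ma control loop above can be sketched as follows. This is a minimal illustration, not the thesis code: the split attract/repel force law, friction factor, mass, and velocity cap are placeholder values chosen for the sketch.

```python
import math

def pairwise_force(r, sep=50.0, G=1200.0, p=2.0):
    """Toy signed force: attract beyond the desired separation, repel inside it."""
    mag = G / r**p
    return mag if r > sep else -mag

def physicomimetics_step(robot, neighbors, dt=0.1, mass=1.0, fmax=1.0, vmax=0.5):
    """One control step: sum virtual forces from neighbors, then apply F = ma."""
    fx = fy = 0.0
    for n in neighbors:
        dx, dy = n["x"] - robot["x"], n["y"] - robot["y"]
        r = math.hypot(dx, dy)
        if r == 0.0:
            continue
        mag = max(-fmax, min(fmax, pairwise_force(r)))  # clamp to Fmax
        fx += mag * dx / r
        fy += mag * dy / r
    # F = ma: update velocity, then position; the 0.9 factor is
    # a placeholder for the friction used for stabilization
    robot["vx"] = (robot["vx"] + (fx / mass) * dt) * 0.9
    robot["vy"] = (robot["vy"] + (fy / mass) * dt) * 0.9
    speed = math.hypot(robot["vx"], robot["vy"])
    if speed > vmax:
        robot["vx"] *= vmax / speed
        robot["vy"] *= vmax / speed
    robot["x"] += robot["vx"] * dt
    robot["y"] += robot["vy"] * dt
```

Goal and obstacle forces would be summed into the same loop; only robot-robot terms are shown here.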
Two Classes of Force Laws
Newtonian:   F = G·m_i·m_j / r^p

Lennard-Jones (LJ):   F = 24ε (2d·σ¹²/r¹³ − c·σ⁶/r⁷)
The left “Newtonian” force law, is good for creating swarms in rigid formations. The right “Lennard-Jones” force law (LJ) more easily models fluid behavior, which is potentially better for maintaining cohesion while avoiding obstacles.
The left is the “classic” law; the right is a novel use of the LJ force law for robot control.
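The two force-law magnitudes can be sketched roughly as below. The parameter defaults (G, p, ε, c, d, σ, Fmax) are illustrative, not the evolved settings from the thesis, and the LJ expression is negated relative to the slide's form so that a positive value means attraction.

```python
def newtonian_force(r, G=1200.0, p=2.0, m_i=1.0, m_j=1.0, fmax=1.0):
    """'Newtonian' gravitational-style magnitude F = G*mi*mj / r^p, capped at Fmax."""
    return min(fmax, G * m_i * m_j / r**p)

def lennard_jones_force(r, epsilon=1.0, c=1.0, d=1.0, sigma=50.0, fmax=1.0):
    """LJ-style force, sign flipped so positive = attraction, negative = repulsion.
    c weights the attractive term, d the repulsive term; sigma sets the
    equilibrium separation scale. Clamped to [-Fmax, Fmax]."""
    f = 24.0 * epsilon * (c * sigma**6 / r**7 - 2.0 * d * sigma**12 / r**13)
    return max(-fmax, min(fmax, f))
```

The clamp is what keeps the steep repulsive wall of LJ from producing unbounded forces at small separations.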
What do these force laws look like?
Change in Force Magnitude with Varying Distance for Robot–Robot Interactions

(Graph: force magnitude vs. distance, for Fmax = 1.0 and Fmax = 4.0; desired robot separation distance = 50.)
Outline
• Goals and Contributions
• Robot Swarms
• Physicomimetics Framework
• Offline Evolutionary Learning
• Novel Distributed Online Learning
• Obstacle Avoidance with Physical Robots
• Conclusion and Future Work
Swarm Learning (Offline)
• Typically, the interactions between the swarm robots are learned via simulation in “offline” mode.
Swarm Simulation
(Diagram: initial rules feed into offline learning, such as an Evolutionary Algorithm (EA), which sends rules to a swarm simulation environment and receives fitness back; the loop produces final rules that achieve the desired behavior.)
Offline Learning Approach
• An Evolutionary Algorithm (EA) is used to evolve the rules for the robots in the swarm.
• A global observer assigns fitness to the rules based on the collective behavior of the swarm in the simulation.
• Each member of the swarm uses the same rules. The swarm is a homogeneous distributed system.
• For physicomimetics, the rules consist of force law parameters.
Force Law Parameters
• Parameters of the “Newtonian” force law:
G – “gravitational” constant of robot-robot interactions
p – power of the force law for robot-robot interactions
Fmax – maximum force of robot-robot interactions
Similar 3-tuples for obstacle/goal-robot interactions:
G_r-r, p_r-r, Fmax_r-r, G_r-o, p_r-o, Fmax_r-o, G_r-g, p_r-g, Fmax_r-g
• Parameters of the LJ force law:
ε – strength of the robot-robot interactions
c – non-negative attractive robot-robot parameter
d – non-negative repulsive robot-robot parameter
Fmax – maximum force of robot-robot interactions
Similar 4-tuples for obstacle/goal-robot interactions:
ε_r-r, c_r-r, d_r-r, Fmax_r-r, ε_r-o, c_r-o, d_r-o, Fmax_r-o, ε_r-g, c_r-g, d_r-g, Fmax_r-g
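An offline EA over these parameter vectors might look like the following sketch. The `fitness` callback stands in for a full swarm-simulation run that scores connectivity, reachability, and time to goal; population size, generation count, and mutation strength are illustrative, not the values used in the thesis.

```python
import random

def evolve(fitness, n_genes=12, pop_size=30, generations=40, mut_sigma=0.1):
    """Toy EA: truncation selection plus Gaussian mutation over real-valued
    force-law parameter vectors (e.g., the 12 LJ parameters)."""
    pop = [[random.uniform(0.1, 10.0) for _ in range(n_genes)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # keep the best half (elitist)
        children = [
            [max(1e-6, g + random.gauss(0.0, mut_sigma)) for g in random.choice(parents)]
            for _ in range(pop_size - len(parents))
        ]
        pop = parents + children
    return max(pop, key=fitness)
```

Because parents survive unmutated, the best individual never degrades under a deterministic fitness function.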
Measuring Fitness
• Connectivity (Cohesion): maximum number of robots connected via a communication path.
• Reachability (Survivability): percentage of robots that reach the goal.
• Time to Goal: time taken by at least 80% of the robots to reach the goal.

High fitness corresponds to high connectivity, high reachability, and low time to goal.
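The first two metrics can be computed directly from robot positions. A minimal sketch, assuming circular communication and goal ranges (a simplification of the real sensing model):

```python
from collections import deque
import math

def connectivity(positions, comm_range):
    """Maximum number of robots connected via a communication path:
    BFS over the graph whose edges join robots within comm_range."""
    n, seen, best = len(positions), set(), 0
    for start in range(n):
        if start in seen:
            continue
        comp, queue = 0, deque([start])
        seen.add(start)
        while queue:
            i = queue.popleft()
            comp += 1
            for j in range(n):
                if j not in seen and math.dist(positions[i], positions[j]) <= comm_range:
                    seen.add(j)
                    queue.append(j)
        best = max(best, comp)
    return best

def reachability(positions, goal, radius):
    """Percentage of robots within `radius` of the goal."""
    reached = sum(1 for p in positions if math.dist(p, goal) <= radius)
    return 100.0 * reached / len(positions)
```

Time to goal would be read off the simulation clock when reachability first passes 80%.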
Summary of Results
• We compared the performance of the best “Newtonian” force law found by the EA to the best LJ force law.
• The “Newtonian” force law produces more rigid structures, making it difficult to navigate through obstacles. This causes poor performance, despite high connectivity.
• Lennard-Jones is superior, because the swarm acts as a viscous fluid. Connectivity is maintained while allowing the robots to reach the goal in a timely manner.
• The Lennard-Jones force law demonstrates scalability in the number of robots and obstacles.
Connectivity of Robots
Time for 80% of the Robots to Reach the Goal

Force Law | Obstacles | 20 robots | 40   | 60   | 80   | 100
Newt      | 20        | 1160      | 1260 | 1290 | 1530 | 1920
Newt      | 100       | -         | -    | -    | -    | -
LJ        | 20        | 470       | 480  | 490  | 510  | 520
LJ        | 100       | 640       | 650  | 670  | 680  | 690
A Problem
• The simulation assumes a certain environment. What happens if the environment changes when the swarm is fielded?
• We can’t go back to the simulation world.
• Can the swarm adapt “on-line” in the field?
Environment trained on.
Environment changes.
Performance degrades.
Frequently Proposed Solution
• Each robot has sufficient CPU power and memory to maintain a complete map of the environment.
• When environment changes, each robot runs an EA internally, on a simulation of the new environment.
• Robots wait until new rules are evolved.
• It is better to learn in the field, in real time.
4 days of simulation time
Outline
• Goals and Contributions
• Robot Swarms
• Physicomimetics Framework
• Offline Evolutionary Learning
• Novel Distributed Online Learning
• Obstacle Avoidance with Physical Robots
• Conclusion and Future Work
Example
• The maximum velocity is increased by 1.5x.
• Obstacles are tripled in size.
• High obstacle density creates cul-de-sacs and robots are left behind. Collisions also occur.
• Obstructed perception is also introduced.
• The learned offline rules are no longer sufficient.
Environment trained on.
Environment changes.
Performance degrades.
Novel Online Learning Approach
• Borrow from evolution.
• Each robot in the swarm is an individual in a population that interacts with its neighbors.
• Each robot contains a slightly mutated copy of the best rule set found with offline learning.
• When the environment changes, some mutations perform better than others.
• Better performing robots share their knowledge with poorer performing neighbors.
• We call this “Distributed Agent Evolution with Dynamic Adaptation to Local Unexpected Scenarios” (DAEDALUS).
DAEDALUS for Obstacle Avoidance
• Each robot is initialized with randomly perturbed (via mutation) versions of the force laws learned with the offline simulation.
• Robots are penalized if they collide with obstacles and/or are left behind.
• Robots that are most successful and are moving will retain the highest worth, and share their force laws with neighboring robots that were not as successful.
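A single DAEDALUS interaction might be sketched as below. The `worth` bookkeeping (penalties for collisions and for being left behind) is assumed to happen elsewhere, and copying the neighbor's worth on adoption is a simplification for the sketch, not necessarily the thesis's exact rule.

```python
import random

def daedalus_update(robot, neighbors, mut_sigma=0.05):
    """One DAEDALUS interaction: a robot that meets a more successful
    neighbor adopts a slightly mutated copy of that neighbor's force laws."""
    best = max(neighbors, key=lambda n: n["worth"], default=None)
    if best is not None and best["worth"] > robot["worth"]:
        # Mutation keeps diversity in the swarm's rule population
        robot["rules"] = [g + random.gauss(0.0, mut_sigma) for g in best["rules"]]
        # Simplification: assume the better rules transfer their worth
        robot["worth"] = best["worth"]
```

Run once per robot per control cycle, this spreads good rule sets through the swarm without any central observer.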
Experimental Setup
• There are five goals to reach in a long corridor.
• Between each goal is a different obstacle course.
• Robots that are left behind (due to obstacle cul-de-sacs) do not proceed to the next goal.
• The number of robots that survive to reach the last goal is low. We want the robots to learn to do better, while in the field.
DAEDALUS Results
• DAEDALUS succeeded in dramatically reducing the number of collisions and improving survivability, despite the difficulties caused by obstructed perception.
• Our results depended on the mutation rate. Can DAEDALUS learn that also?
20 minutes of simulation time
Further DAEDALUS Results
• DAEDALUS also succeeded in learning the appropriate mutation rate for the robots. Hence, the system strikes a balance between exploration and exploitation.
Number of Robots Surviving with Different Mutation Rates

Stage  | Total | 1% | 3% | 5% | 7% | 9%
start  | 60    | 12 | 12 | 12 | 12 | 12
goal 1 | 53    |  8 | 10 | 11 | 12 | 12
goal 2 | 45    |  9 |  6 | 10 |  9 | 11
goal 3 | 40    |  7 |  6 | 10 |  8 |  9
goal 4 | 34    |  5 |  6 |  9 |  8 |  6
goal 5 | 32    |  5 |  5 |  9 |  7 |  6
Effect of Mutation Rate on Survival
60 Robots moving towards 5 goals through 90 obstacles in between each goal
Collision Reduction
Summary of DAEDALUS
• Creating rapidly adapting robots in changing environments is challenging.
• Offline learning can yield initial “seed” rules, which must then be perturbed.
• The key is to maintain “diversity” in the rules that control the members of the swarm.
• Collective behaviors still arise from the local interactions of a diverse population of robots.
Outline
• Goals and Contributions
• Robot Swarms
• Physicomimetics Framework
• Traditional Offline Learning
• Novel Distributed Online Learning
• Obstacle Avoidance with Physical Robots
• Conclusion and Future Work
Obstacle Avoidance with Robots
• Use three Maxelbot robots
• Use a 2D trilateration localization algorithm (not a part of this thesis)
• Design and develop an obstacle avoidance module (OAM)
• Implement physicomimetics on a real outdoor robot
Hardware Architecture of Maxelbot
(Diagram: a MiniDRAGON for motor control that executes physicomimetics, a MiniDRAGON for trilateration that provides robot coordinates, the OAM with A-to-D conversion, RF and acoustic sensors, and IR sensors, all connected over I2C buses.)
Physicomimetics for Obstacle Avoidance
• Constant “virtual” attractive goal force in front of the leader
• “Virtual” repulsive forces from four sensors mounted on the front of the leader, if obstacles detected
• The resultant force creates a change in velocity due to F = ma
• Power supplied to the motors is changed based on the forces acting on the leader.
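The force-to-motor mapping can be sketched as follows for a differential drive; the gains and power levels are illustrative placeholders, not the Maxelbot's actual calibration.

```python
def motor_power(f_forward, f_lateral, base_power=50.0, turn_gain=30.0, max_power=80.0):
    """Map the net virtual force on the leader to (left, right) motor power.
    A lateral force steers by creating a power differential; a braking
    (negative forward) force reduces both motors equally."""
    forward = base_power + 20.0 * max(-1.0, min(1.0, f_forward))
    turn = turn_gain * max(-1.0, min(1.0, f_lateral))  # >0 means a push to the left
    # Obstacle on the right pushes leftward: the left motor is reduced,
    # matching the behavior shown in the graphs
    left = max(0.0, min(max_power, forward - turn))
    right = max(0.0, min(max_power, forward + turn))
    return left, right
```

When both middle sensors fire, the repulsive force is mostly a braking term, so both motors drop together and the robot stops.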
Obstacle Avoidance Methodology
• Measure the performance of physicomimetics with repulsion from obstacles
• All experiments are conducted outdoors in “Prexy’s Pasture”
• Three Maxelbots: one leader and two followers
• Graphs show the correlation between raw sensor readings and motor power
• The leader uses the physicomimetics algorithm with the obstacle avoidance module
• Focus is on the obstacle avoidance by the leader, not the formation control
Maxelbot Turning Left - Obstacle on the Right
(Graph: right-most sensor reading and power to the left motor, plotted against time.)
If there is an obstacle on the right, power to left motor is reduced
Maxelbot Turning Right - Obstacle on the Left
(Graph: left-most sensor reading and power to the right motor, plotted against time.)
If there is an obstacle on the left, power to right motor is reduced
If there is an obstacle in front, power to both motors is reduced
Maxelbot Stopping Behavior - Both Middle Sensors Detect an Obstacle
(Graph: average of the two middle sensor readings and average motor power, plotted against time.)
Further Analysis of Sensor Reading and Motor Power
• Scatter plots give more information
• They provide a broader picture of the data
• They show the correlation of motor power with distance to an obstacle in inches (the robots ignore obstacles more than 30” away)
Movie of 3 Maxelbots, Leader has OAM
Maxelbot Turning Right - Obstacle on the Left
(Scatter plot: power to the right motor vs. distance to the obstacle on the left, in inches. Power changes first when the left sensor sees the obstacle, and again when the left middle sensor also sees it.)
Outline
• Goals and Contributions
• Robot Swarms
• Physicomimetics Framework
• Offline Evolutionary Learning
• Novel Distributed Online Learning
• Obstacle Avoidance with Physical Robots
• Conclusion and Future Work
Contributions
• Improved performance in obstacle avoidance:
  • Applied a new force law for robot control, to improve performance
  • Provided novel objective performance metrics for obstacle avoiding swarms
  • Improved scalability of the swarm in obstacle avoidance
  • Improved performance of obstacle avoidance with obstructed perception
• Invented a real-time learning algorithm (DAEDALUS):
  • Demonstrated that a swarm can improve performance by mutating and exchanging force laws
  • Demonstrated feasibility of DAEDALUS with obstacle avoidance, in environments three times denser than the norm
  • Explored the trade-offs of mutation on homogeneous and heterogeneous swarm learning
• Hardware implementation:
  • Presented a novel robot control algorithm that merges physicomimetics with obstacle avoidance
Future Work
• Use DAEDALUS to provide practical solutions to real world problems
• Provide obstacle avoidance capability to all the robots in the formation
• Develop robots with greater data exchange capability
• Adapt the physicomimetics framework to incorporate performance feedback for specific tasks and situational awareness
• Extend the physicomimetics framework for sensing and performing tasks in a marine environment (with Harbor Branch)
• Introduce robot/human roles and interactions to the distributed evolution architecture
Work Published
• Spears W., Spears D., Heil R., Kerr W., and Hettiarachchi S. An overview of physicomimetics. Lecture Notes in Computer Science - State of the Art Series, Volume 3342, 2004. Springer.
• Hettiarachchi S. and Spears W. Moving swarm formations through obstacle fields. Proceedings of the 2005 International Conference on Artificial Intelligence, Volume 1, 97-103. CSREA Press.
• Hettiarachchi S., Spears W., Green D., and Kerr W. Distributed agent evolution with dynamic adaptation to local unexpected scenarios. Proceedings of the 2005 Second GSFC/IEEE Workshop on Radical Agent Concepts. Springer.
• Spears W., Zarzhitsky D., Hettiarachchi S., and Kerr W. Strategies for multi-asset surveillance. IEEE International Conference on Networking, Sensing and Control, 2005, 929-934. IEEE Press.
• Hettiarachchi S. and Spears W. (2006). DAEDALUS for agents with obstructed perception. In SMCals/06 IEEE Mountain Workshop on Adaptive and Learning Systems, pp. 195-200. IEEE Press. Best Paper Award.
• Hettiarachchi S. (2006). Distributed online evolution for swarm robotics. In Doctoral Mentoring Program, AAMAS06, T. Ishida and A. B. Hassine (Eds.), Autonomous Agents and Multi Agent Systems, pp. 17-18.
• Hettiarachchi S., Maxim P., and Spears W. (2007). An architecture for adaptive swarms. In Robotics Research Trends, X. P. Guo (Ed.). Nova Publishers (Book Chapter).
Thank You
Questions?
Backup Slides
The next set of slides may be confusing because they are intended to be placed among slides 1–49.
DAEDALUS for Reducing Collisions
• Slightly mutate robot-obstacle force law interactions.
• Those robots that do not collide give their force laws to poorer performing robots.
DAEDALUS for Improving Survival
• Previous experiment did not attempt to alleviate the situation where robots are left behind.
• This is caused by the large number of cul-de-sacs produced by the high obstacle density.
• Slightly mutate robot-robot interaction, if there is a nearby moving neighbor.
• Rapidly mutate robot-goal interaction, if there are no neighbors.
Improved Survival
The two online experiments are independent of each other.
Task: Obstacle Avoidance with Obstructed Perception
Robots must organize themselves into a formation and then move toward a goal, while avoiding obstacles.
• A robot may not see another robot, due to the presence of obstacles.
• If r > minD, then robot A and robot B have their perception obstructed.
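One common way to model obstructed perception is a line-of-sight test between two robots against a circular obstacle; the sketch below compares the distance from the obstacle center to the segment AB with the obstacle radius, which may differ in detail from the r > minD test on the slide.

```python
import math

def line_of_sight(a, b, obstacle, radius):
    """True if robot A can see robot B, i.e. the segment AB does not
    pass through the circular obstacle."""
    (ax, ay), (bx, by), (ox, oy) = a, b, obstacle
    dx, dy = bx - ax, by - ay
    seg_len2 = dx * dx + dy * dy
    if seg_len2 == 0.0:  # A and B coincide
        return math.hypot(ox - ax, oy - ay) > radius
    # Project the obstacle center onto AB, clamped to the segment
    t = max(0.0, min(1.0, ((ox - ax) * dx + (oy - ay) * dy) / seg_len2))
    cx, cy = ax + t * dx, ay + t * dy
    return math.hypot(ox - cx, oy - cy) > radius
```

In the simulation, a robot would simply skip the force contribution of any neighbor for which this test fails.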
DAEDALUS Results
Results averaged over 100 independent runs
We do not train children on hard problems immediately; instead, we train them on easier problems first. This is counter to accepted wisdom in the EA community.
DAEDALUS online learning is improving performance.
Homogeneous DAEDALUS
• All robots had the same mutation rate, which was 5%.
• The results may depend quite heavily on choosing the correct mutation rate.
• The best mutation rate may also depend on the environment, and should potentially change as the environment changes.
• We decided to explore this effect by conducting several experiments with different mutation rates.
Heterogeneous DAEDALUS
• We attempted to address the problem of choosing the correct mutation rate.
• We divided the robots into five groups of equal size.
• Each group of 12 robots was assigned a mutation rate of 1%, 3%, 5%, 7%, or 9%, respectively.
• This mimics the behavior of children that have different “comfort zones” in their rate of exploration.
Heterogeneous Results
Results averaged over 100 independent runs
The result at the final goal is essentially identical to the average of the five performance curves in the previous graph. Can DAEDALUS learn the proper “comfort zone” instead?
Analogy – Children Learning
• Borrowed from the analogy of a “swarm” of children learning some task.
• They share useful information as to the rules they might use, but they also share meta-information as to the level of exploration that is actually safe!
• Very bold children might encourage their more timid comrades to explore more than they would initially.
• If a very bold child has an accident, the rest of the children will become more timid.
Extended Heterogeneous DAEDALUS - Results
Results averaged over 100 independent runs
DAEDALUS now allows the robots to receive a neighbor’s mutation rate, in addition to the neighbor’s rules. The results are close to those achieved by homogeneous DAEDALUS with the best mutation rate!
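The extension can be sketched as a small change to the exchange step: the adopting robot also copies, and slightly perturbs, the better neighbor's mutation rate. The clamp to the 1%–9% range mirrors the rates studied above; the perturbation size is an assumption of the sketch.

```python
import random

def extended_daedalus_update(robot, neighbors):
    """Extended DAEDALUS sketch: a less successful robot adopts the better
    neighbor's rules AND mutation rate, so the swarm learns its own
    exploration 'comfort zone'."""
    best = max(neighbors, key=lambda n: n["worth"], default=None)
    if best is not None and best["worth"] > robot["worth"]:
        rate = best["mut_rate"]
        # Perturb the adopted rate a little, keeping it in the studied range
        robot["mut_rate"] = min(0.09, max(0.01, rate + random.gauss(0.0, 0.005)))
        # Mutate the adopted rules using the neighbor's rate
        robot["rules"] = [g + random.gauss(0.0, rate) for g in best["rules"]]
```

This is the code-level analogue of bold children passing on their comfort zone along with what they have learned.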
Why Physicomimetics?
• Capable of maintaining formations of robots
• Designed as a leader-follower algorithm
• Allows robots to move quickly, due to minimal communication
• Can use theory to set parameters
Physicomimetics for Formation Control
• The leader provides an attractive goal force for the followers
• The follower uses F = ma to compute the change in velocity that is required to follow the leader
• Power supplied to the motors is changed based on the changes in velocity
Formation Control Methodology
• Measure the quality of physicomimetics without repulsion from obstacles
• All experiments are conducted outdoors in “Prexy’s Pasture”
• Three Maxelbots: one leader and two followers
• Results averaged over 10 runs
• Leader remotely controlled (no physicomimetics)
• The leader does not have obstacle avoidance capability
• Focus is on the formation control, not the obstacle avoidance
Triangular Formation
Triangular Formation Results
Linear Formation
Linear Formation Results
Maxelbot Turning Right - Obstacle on the Left
(Scatter plot: power to the right motor vs. distance to the obstacle on the left, in inches. A lag in stopping and in starting, due to physicomimetic inertia, helps counteract noisy sensors. Power changes first when the left sensor sees the obstacle, and again when the left middle sensor sees it.)
Maxelbot Turning Left - Obstacle on the Right
(Scatter plot: power to the left motor vs. distance to the obstacle on the right, in inches. A lag in starting and in stopping, due to AP inertia, helps counteract noisy sensors. Power changes first when the right sensor sees the obstacle, and again when the right middle sensor sees it.)
Maxelbot Stopping Behavior - Both Middle Sensors Detect an Obstacle
(Scatter plot: average of left and right motor power vs. distance to the obstacle, in inches. Power will be reduced if the outermost sensors see an obstacle when the inner sensors do not.)