Date post: 20-Jul-2020
1 Distributed Intelligent Systems – W11 Machine-Learning Methods Applied to Distributed Robotic Systems
Transcript
Page 1: Distributed Intelligent Systems – W11 Machine-Learning ... · • Challenges in multi-robot scenarios – Credit assignment problems ... Two fundamental reasons making robot control

Distributed Intelligent Systems – W11: Machine-Learning Methods Applied to Distributed Robotic Systems


Outline

• Expensive optimization problems
  – Noise resistance
  – Evaluation time

• Challenges in multi-robot scenarios
  – Credit assignment problems
  – Co-adaptation strategies

• Co-adaptation examples
  – Co-learning obstacle avoidance
  – Co-evolving coordinated motion


Expensive Optimization and Noise Resistance


Expensive Optimization Problems

Two fundamental reasons make robot control design and optimization expensive in terms of time:

1. The time needed to evaluate a candidate solution (e.g., tens of seconds) >> the time needed to apply metaheuristic operators (e.g., milliseconds)

2. Noisy performance evaluations disrupt the adaptation process and require multiple evaluations to estimate the actual performance


Expensive Optimization Problems

1. Time for evaluation of a candidate solution >> time for application of metaheuristic operators

• Example: obstacle avoidance
• Robots need to encounter obstacles to learn to avoid them
• Evaluation span of 20-60 s, depending on the size of the arena
• Current processors can execute hundreds of billions of instructions in that time (e.g., ARM Cortex-A9, ~5000 MIPS)

[Di Mario and Martinoli, Robotica, 2014]
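As a rough back-of-the-envelope check of the gap between evaluation time and operator time, the sketch below multiplies the quoted processor throughput by the evaluation span. The function name is hypothetical; the only inputs are the ~5000 MIPS figure and the 20-60 s span given on the slide.

```python
# Back-of-the-envelope estimate using only the figures quoted on the slide.
MIPS = 5000                  # ~ARM Cortex-A9, in million instructions per second
EVAL_SPANS_S = (20, 60)      # evaluation span range, in seconds


def instructions_during(span_s: float, mips: float = MIPS) -> float:
    """Instructions a processor could execute while one candidate is evaluated."""
    return mips * 1e6 * span_s


for span in EVAL_SPANS_S:
    print(f"{span:>3d} s -> {instructions_during(span):.1e} instructions")
```

Under these figures, one evaluation spans on the order of 10^11 instructions, which is why the cost of the metaheuristic operators themselves is negligible.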


Expensive Optimization Problems

2. Noisy performance evaluations disrupt the adaptation process and require multiple evaluations to estimate the actual performance

• Multiple evaluations at the same point in the search space yield different results
• Example: fitness distribution for obstacle avoidance
• Noise from: sensors, actuators, initial conditions, other robots
• Noise causes decreased convergence speed and residual error

[Figure: fitness distribution for obstacle avoidance; x-axis: fitness, y-axis: # evaluations]

[Di Mario and Martinoli, Robotica, 2014]


Reducing Evaluation Time

General recipe: exploit more abstracted, calibrated representations (models and simulation tools)

See also multi-level modeling lectures


Dealing with Noisy Evaluations

• Better information about a candidate solution can be obtained by combining multiple noisy evaluations
• We could systematically evaluate each candidate solution a fixed number of times → not efficient from a computational perspective
• We want to dedicate more computational time to evaluating promising solutions and to eliminate the “lucky” ones as quickly as possible
• Idea: re-evaluate and aggregate → each candidate solution might have been evaluated a different number of times → compare the aggregated values
• In GA, good and robust candidate solutions survive over generations; in PSO, they survive in the individual memory
• Use dedicated functions for aggregating multiple evaluations: e.g., minimum and average, or generalized aggregation functions (e.g., quasi-linear weighted means), perhaps combined with a statistical test for comparing the resulting aggregated performances
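The re-evaluate-and-aggregate idea can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the `Candidate` class, the toy `noisy_fitness` function (a maximization problem with additive Gaussian noise), and the choice of mean and minimum as aggregation functions. It is not the implementation used in the cited work.

```python
import random
import statistics


class Candidate:
    """A candidate solution together with all of its noisy evaluations."""

    def __init__(self, params):
        self.params = params
        self.samples = []

    def add_evaluation(self, value):
        self.samples.append(value)

    def aggregated(self, how="mean"):
        # Aggregate however many evaluations this candidate has received so far.
        if how == "mean":
            return statistics.fmean(self.samples)
        if how == "min":  # pessimistic aggregate for a maximization problem
            return min(self.samples)
        raise ValueError(f"unknown aggregation: {how}")


def noisy_fitness(params, rng, sigma=0.1):
    """Hypothetical noisy evaluation: true value -sum(x^2), to be maximized."""
    return -sum(x * x for x in params) + rng.gauss(0.0, sigma)


def better(a, b, how="mean"):
    """Compare two candidates on their aggregated value, not on a single sample."""
    return a.aggregated(how) > b.aggregated(how)


rng = random.Random(0)
survivor = Candidate([0.1, -0.2])   # has survived several iterations
newcomer = Candidate([0.3, 0.3])    # evaluated only once so far
for _ in range(10):
    survivor.add_evaluation(noisy_fitness(survivor.params, rng))
newcomer.add_evaluation(noisy_fitness(newcomer.params, rng))
print(len(survivor.samples), len(newcomer.samples), better(survivor, newcomer))
```

Candidates that keep surviving accumulate samples, so a single lucky evaluation is progressively averaged out, while a newcomer is compared on whatever evidence it has.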


[Figure: two panels, GA and PSO]


Testing Noise-Resistant Algorithms on Benchmarks

• Benchmark 1: Sphere and Generalized Rosenbrock functions
  – 30 real parameters [Pugh et al., SIS 2005]
  – 24 real parameters [Di Mario et al., CEC 2014]
  – Minimize objective function
  – Expensive only because of noise

• Benchmark 2: obstacle avoidance on a robot
  – 24 real parameters
  – Maximize objective function
  – Expensive because of noise and evaluation time

Biased results!


Benchmark 1: Gaussian Additive Noise on Generalized Rosenbrock

Fair test: same number of evaluations of candidate solutions for all algorithms (i.e., N generations/iterations of the standard versions compared with N/2 of the noise-resistant ones)

[Pugh et al., SIS 2005]

Biased results: low number of runs (20) and population size (20) < search dimension (30)!


Benchmark 1: Functions

• Sphere

• Rosenbrock

• Normalized and bounded to [0, 1]

• Gaussian noise model

• Bernoulli noise model

[Figure: surface plots of the Sphere and Rosenbrock functions]

[Di Mario et al., CEC 2014]
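The slide's formulas did not survive extraction. As a sketch, the standard textbook definitions of the Sphere and generalized Rosenbrock functions can be combined with additive Gaussian noise as follows. The normalization to [0, 1] and the Bernoulli noise model mentioned above are not reproduced here, and the helper name `with_gaussian_noise` is hypothetical.

```python
import random


def sphere(x):
    """Sphere function: sum of squares, minimum 0 at the origin."""
    return sum(xi * xi for xi in x)


def rosenbrock(x):
    """Generalized Rosenbrock function, minimum 0 at (1, ..., 1)."""
    return sum(
        100.0 * (x[i + 1] - x[i] ** 2) ** 2 + (1.0 - x[i]) ** 2
        for i in range(len(x) - 1)
    )


def with_gaussian_noise(f, sigma, rng):
    """Wrap f so that each call returns f(x) plus zero-mean Gaussian noise."""
    def noisy(x):
        return f(x) + rng.gauss(0.0, sigma)
    return noisy


rng = random.Random(42)
noisy_rosenbrock = with_gaussian_noise(rosenbrock, sigma=0.05, rng=rng)
point = [1.0] * 24  # 24 real parameters, as in the second benchmark setting
print([round(noisy_rosenbrock(point), 3) for _ in range(3)])
```

Repeated calls at the same point return different values, which is exactly the situation the noise-resistant variants are meant to handle.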


Rosenbrock with Gaussian Noise: Increasing σ

[Figure: four panels, for σ = 0, σ = 0.01, σ = 0.05, and σ = 0.1]

[Di Mario et al., CEC 2014]


Increasing Population Size Does Not Help

[Di Mario et al., CEC 2014]


Bernoulli Noise: Positive and Negative Amplitudes


Benchmark 2: Obstacle Avoidance on a Mobile Robot

• Similar to [Floreano and Mondada 1996]
  – Discrete-time, single-layer, artificial recurrent neural network controller
  – Shaping of neural weights and biases (24 real parameters)
  – Fitness function: rewards speed, straight movement, avoiding obstacles

• Different from [Floreano and Mondada 1996]
  – Environment: bounded open space of 2x2 m instead of a maze

V = average wheel speed, Δv = difference between wheel speeds, i = value of most active proximity sensor

[Pugh J., EPFL PhD Thesis No. 4256, 2008]
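The fitness formula itself is missing from the transcript. One form commonly attributed to Floreano and Mondada, and consistent with the three symbols defined above, is F = V · (1 − √Δv) · (1 − i), with all three quantities normalized to [0, 1]. The sketch below assumes that form; treat it as an illustration rather than the exact function used in the experiments.

```python
import math


def obstacle_avoidance_fitness(v: float, dv: float, i: float) -> float:
    """Assumed form F = V * (1 - sqrt(dv)) * (1 - i), all inputs in [0, 1].

    v  : normalized average wheel speed (rewards speed)
    dv : normalized difference between wheel speeds (penalizes turning)
    i  : normalized value of the most active proximity sensor (penalizes proximity)
    """
    for name, value in (("v", v), ("dv", dv), ("i", i)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return v * (1.0 - math.sqrt(dv)) * (1.0 - i)


print(obstacle_avoidance_fitness(1.0, 0.0, 0.0))    # fast, straight, far from obstacles
print(obstacle_avoidance_fitness(0.5, 0.25, 0.5))   # slower, turning, near an obstacle
```

Each factor corresponds to one of the three rewarded behaviors: speed, straight movement, and obstacle avoidance.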


Baseline Experiment: Extended-Time Adaptation

• Compare the basic algorithms with their corresponding noise-resistant version

• Population size 100, 100 iterations, evaluation span 300 seconds (150 s for noise-resistant algorithms)→ 34.7 days

• Fair test: same total evaluation time for all the algorithms

• Realistic simulation (Webots)
• Best evolved solutions averaged over 30 runs
• Best candidate solution in the final pool selected based on 5 runs of 30 s each; performance tested over 40 runs of 30 s each
• Similar performance for all algorithms

[Pugh J., EPFL PhD Thesis No. 4256, 2008]
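The 34.7-day figure follows directly from population size × iterations × evaluation span. The sketch below reproduces it, along with the 1/100 budget used for the limited-time experiments. The assumption that a noise-resistant algorithm performs two evaluations per candidate per iteration (one new evaluation plus one re-evaluation) is mine, chosen because it makes the 150 s span give the same total time.

```python
SECONDS_PER_DAY = 86_400
SECONDS_PER_HOUR = 3_600


def total_time_s(pop_size: int, iterations: int, span_s: float,
                 evals_per_iteration: int = 1) -> float:
    """Total evaluation time when candidates are evaluated one after another."""
    return pop_size * iterations * span_s * evals_per_iteration


standard_s = total_time_s(100, 100, 300)
# Assumption: two evaluations per candidate per iteration for noise-resistant variants.
noise_resistant_s = total_time_s(100, 100, 150, evals_per_iteration=2)

print(f"standard:        {standard_s / SECONDS_PER_DAY:.1f} days")
print(f"noise-resistant: {noise_resistant_s / SECONDS_PER_DAY:.1f} days")
print(f"1/100 budget:    {standard_s / 100 / SECONDS_PER_HOUR:.1f} hours")
```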


Where Can Noise-Resistant Algorithms Make the Difference?

• Limited adaptation time
• Hybrid adaptation (simulation/hardware in the loop)
• Large amount of noise

Notes:
• All examples from shaping obstacle avoidance behavior
• Best learned/evolved solution averaged over multiple runs
• Fair tests: same total amount of evaluation time for all the different algorithms (standard and noise-resistant)


Limited-Time Adaptation Trade-Offs

• 1 robot, 24 parameters
• Total adaptation time = 8.3 hours (1/100 of previous learning time)
• Trade-offs: population size, number of iterations, evaluation span
• Realistic simulation (Webots)

[Figure: varying population size vs. number of iterations; annotations: “Good with small populations”, “No advantage”]

[Pugh J., EPFL PhD Thesis No. 4256, 2008]


Hybrid Adaptation with Real Robots

• Move from realistic simulation (Webots) to real robots after 90% learning (even faster evolution)
• Compromise between time and accuracy
• Noise resistance helps manage the transition

[Pugh J., EPFL PhD Thesis No. 4256, 2008]


Hybrid Adaptation vs. Only Real Robots

• Noise-resistant PSO
• Hybrid: 30 iterations in simulation, then 30 iterations on real robots
• Achieves similar fitness to running 60 iterations on real robots
• Requires half the real-robot evaluation time

[Di Mario and Martinoli, Robotica, 2014]


Increasing Noise Level – Set-Up

• Scenario 1: One robot learning obstacle avoidance

• Scenario 2: One robot learning obstacle avoidance, one robot running pre-evolved obstacle avoidance

• Scenario 3: Two robots co-learning obstacle avoidance

Idea: more robots → more noise (as perceived by an individual robot), because there is no explicit communication between the robots; in scenario 3, however, information is shared through the population manager.

[Figure: 1x1 m arena, PSO, 50th iteration, scenario 3]

[Pugh et al., SIS 2005]


Increasing Noise Level – Sample Results

[Pugh et al., SIS 2005]


Why Do Noise-Resistant Algorithms Make the Difference?

Standard PSO vs. a-posteriori evaluations

[Figure legend: PSO gbest; avg of 1000 eval]

[Di Mario et al., CEC 2014]


Why Do Noise-Resistant Algorithms Make the Difference?

Noise-resistant PSO vs. a-posteriori evaluations

[Figure legend: PSO gbest; avg of 1000 eval]

[Di Mario et al., CEC 2014]


From Single to Multi-Unit Systems: Co-Adaptation in a Shared World


Adaptation in Multi-Robot Scenarios

• Collective: fitness becomes noisy due to partial perception and independent parallel actions


Credit Assignment Problem

With limited communication, no communication at all, or partial perception:

• A robot cannot distinguish the environmental modifications caused by its own actions from those generated by others.
• Punishments and rewards are likely to be inconsistent.


Co-Adaptation in a Collaborative Framework


Co-Shaping Collaborative Behavior

Three orthogonal axes to consider (extremes and balanced solutions are possible):

1. Performance evaluation: individual vs. group fitness or reinforcement

2. Solution sharing: private vs. public policies

3. Team diversity: homogeneous (identical controller and hardware) vs. heterogeneous learning


Search Algorithms for MR Systems

Policy    Performance   Sharing   Diversity
i-pr-he   individual    private   heterogeneous
i-pr-ho   individual    private   homogeneous
i-pu-he   individual    public    heterogeneous
i-pu-ho   individual    public    homogeneous
g-pr-he   group         private   heterogeneous
g-pr-ho   group         private   homogeneous
g-pu-he   group         public    heterogeneous
g-pu-ho   group         public    homogeneous

Legend: do not make sense (inconsistent); interesting (consistent); possible but not scalable
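The eight policy labels are simply the Cartesian product of the three binary axes. A small sketch that regenerates them, using a hypothetical `policy_code` helper for the abbreviations:

```python
from itertools import product

AXES = {
    "performance": ("individual", "group"),
    "sharing": ("private", "public"),
    "diversity": ("heterogeneous", "homogeneous"),
}


def policy_code(performance: str, sharing: str, diversity: str) -> str:
    """Abbreviate a policy as in the table, e.g. 'i-pr-he'."""
    return f"{performance[0]}-{sharing[:2]}-{diversity[:2]}"


policies = [policy_code(*combo) for combo in product(*AXES.values())]
print(policies)
```

Which of the eight combinations are consistent, inconsistent, or not scalable is a design judgment that the code does not encode.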


Population-Based Search Algorithms for Multi-Robot Systems

Example of collaborative co-learning with binary encoding of 100 candidate solutions and 2 robots


Stick-Pulling Case Study: Homogeneous Learning

• See W10 lecture
• Optimization of a single GTP for the whole team


Stick-Pulling Case Study: Heterogeneous Learning

• See W10 lecture
• Learning to specialize the team members (multiple GTPs)

[Figure annotations: viable for exploring heterogeneous solutions; not scalable; heterogeneity allowed, but eventually a roughly homogeneous solution via shuffling around of candidate solutions; homogeneity enforced]


Co-Adaptation for Obstacle Avoidance


Population-Based Search Algorithms for Multi-Robot Systems


Distributed Robotic Adaptation

• Standard approach: evaluate candidate solutions on the robots but centralize the population manager
• New approach: also distribute the population manager on the robots and share population management through communication channels
• Currently: synchronization at the end of an iteration/generation
• Why PSO:
  – appears interesting since this metaheuristic works well with small pools of candidate solutions: candidate pool size ≈ robot team size
  – limited particle neighborhood sizes → scalable, on-board operation
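A minimal sketch of the distributed scheme, assuming one particle per robot, a ring neighborhood of size three, and a snapshot of personal bests exchanged at the end of each iteration. The update rule is the textbook PSO velocity update; the fitness function, class name, and parameter values (w = 0.6, pw = nw = 2.0) are illustrative assumptions, and the exact formulation in the cited work may differ.

```python
import random

W, PW, NW = 0.6, 2.0, 2.0  # inertia, personal-best and neighborhood-best weights


def fitness(x):
    """Hypothetical stand-in for an on-robot evaluation (to be maximized)."""
    return -sum(xi * xi for xi in x)


class Robot:
    """Each robot hosts exactly one particle and manages it locally."""

    def __init__(self, dim, rng):
        self.rng = rng
        self.x = [rng.uniform(-1.0, 1.0) for _ in range(dim)]
        self.v = [0.0] * dim
        self.pbest_x, self.pbest_f = list(self.x), fitness(self.x)

    def step(self, nbest_x):
        # Textbook PSO update using only locally available information.
        for d in range(len(self.x)):
            self.v[d] = (W * self.v[d]
                         + PW * self.rng.random() * (self.pbest_x[d] - self.x[d])
                         + NW * self.rng.random() * (nbest_x[d] - self.x[d]))
            self.x[d] += self.v[d]
        f = fitness(self.x)
        if f > self.pbest_f:
            self.pbest_x, self.pbest_f = list(self.x), f


def run(n_robots=10, dim=24, iterations=50, seed=0):
    rng = random.Random(seed)
    robots = [Robot(dim, rng) for _ in range(n_robots)]
    for _ in range(iterations):
        # Synchronization point: robots exchange personal bests with ring neighbors.
        snapshot = [(r.pbest_f, r.pbest_x) for r in robots]
        for k, robot in enumerate(robots):
            neighbors = [snapshot[(k - 1) % n_robots], snapshot[k],
                         snapshot[(k + 1) % n_robots]]
            nbest_x = max(neighbors, key=lambda item: item[0])[1]
            robot.step(nbest_x)
    return robots


team = run(n_robots=4, dim=3, iterations=5)
print([round(r.pbest_f, 4) for r in team])
```

Because each robot only needs the bests of its two ring neighbors, the communication load per robot stays constant as the team grows.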


Varying the Robotic Group Size


• Same control architecture as [Floreano & Mondada, 1996] (ANN, 24 weights to tune, Khepera III has 9 proximity sensors)
• Same fitness function as [Floreano & Mondada, 1996]
• Webots world similar to [Pugh et al., 2005] but 3x3 m
• Robot group size: 1, 2, 5, 10
• PSO parameters
  – Swarm size: 10
  – pw = nw = 2.0
  – w = 0.6

[Pugh and Martinoli, Swarm Intelligence J., 2009]


Varying the Robotic Group Size – Learning vs. Testing Environment

• Gradually increase number of robots on team
• Up to 10x faster learning with little performance loss
• Arena 3x3 m

[Figure: two conditions compared: learned in a group of 10 robots (10x faster), final evaluation as single robot; learned as single robot, final evaluation as single robot]

[Pugh and Martinoli, Swarm Intelligence J., 2009]


Distributed Adaptation with Real Robots (Pugh, 2008)

• Before adaptation (5x speed-up)
• After adaptation (5x speed-up)


Increasing Number of Robots: Impact of Noise Resistance

• Webots experiments
• 1x1 m arena (high density!)
• Fair test: same amount of total evaluation time for each bar
• Performance decreases with the number of robots (avoidance is more difficult in overcrowded arenas)
• Noise resistance makes the difference in high-density (i.e., noisier) scenarios

[Di Mario and Martinoli, Robotica, 2014]


Impact of Limited-Time Adaptation

• Webots experiments
• 1x1 m arena (high density!)
• Full-time adaptation: 417 h
• Limited-time adaptation: 8 h
• 52 times shorter evaluation time, 17% max drop in performance
• Same obstacle avoidance strategy

Recipe:
1. Evaluation span includes at least 1 interaction
2. Swarm size = dimension of parameter space
3. Use noise-resistant algorithms
4. Dedicate max time budget to iterations

[Di Mario and Martinoli, Robotica, 2014]


Co-Adaptation for Coordinated Motion


The SWARM-BOTS project (2001-2005)
http://www.swarm-bots.org

We call “swarm-bot” an artifact composed of a number of simpler robots, called “s-bots”, capable of self-assembling and self-organizing to adapt to its environment.

S-bots can connect to and disconnect from each other to self-assemble and form structures when needed, and disband at will.


The coordinated motion task

• Four s-bots are connected in a swarm-bot formation

• Their chassis are randomly oriented

• The s-bots should be able to
  – collectively choose a direction of motion
  – move as far as possible


Coordinated motion: The traction sensor

• Connected s-bots apply pulling/pushing forces to each other when moving
• Each s-bot can measure a traction force acting on its turret/chassis connection
• The traction force indicates the mismatch between
  – the average direction of motion of the group
  – the desired direction of motion of the single s-bot
• Simple perceptrons are evolved as controllers

[Figure: perceptron controller with 4 traction sensors and 2 motors]


Coordinated motion: The evolutionary algorithm

• Binary encoded genotype
  – 8 bits per real-valued parameter of the neural controllers
• Generational evolutionary algorithm
  – 100 individuals evolved for 100 generations
  – The 20 best individuals are allowed to reproduce in each generation
  – Mutation (3% per bit) is applied to the offspring
• The perceptron is cloned and downloaded to each s-bot
• Fitness is evaluated by looking at the swarm-bot performance
  – Each individual is evaluated with equal starting conditions
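One generation of such an algorithm can be sketched as follows. The slide fixes the encoding (8 bits per parameter), the population size (100), the number of reproducing individuals (20), and the mutation rate (3% per bit). Everything else is an assumption made for illustration: the number of parameters, the decoding range, the rule that each parent produces an equal share of mutated offspring without crossover, and the placeholder fitness.

```python
import random

BITS = 8                 # bits per real-valued parameter (from the slide)
POP, PARENTS = 100, 20   # population size and number of reproducing individuals
P_MUT = 0.03             # per-bit mutation probability
N_PARAMS = 10            # assumption: number of controller parameters
LO, HI = -10.0, 10.0     # assumption: range each parameter is decoded into


def decode(genotype):
    """Map each group of 8 bits linearly onto [LO, HI]."""
    params = []
    for k in range(0, len(genotype), BITS):
        value = int("".join(map(str, genotype[k:k + BITS])), 2)
        params.append(LO + (HI - LO) * value / (2 ** BITS - 1))
    return params


def mutate(genotype, rng):
    """Flip each bit independently with probability P_MUT."""
    return [bit ^ 1 if rng.random() < P_MUT else bit for bit in genotype]


def next_generation(population, fitnesses, rng):
    """Keep the best PARENTS genotypes; each yields POP // PARENTS mutated offspring."""
    ranked = sorted(zip(fitnesses, population), key=lambda item: item[0], reverse=True)
    parents = [genotype for _, genotype in ranked[:PARENTS]]
    return [mutate(parent, rng) for parent in parents for _ in range(POP // PARENTS)]


rng = random.Random(1)
population = [[rng.randint(0, 1) for _ in range(N_PARAMS * BITS)] for _ in range(POP)]
fitnesses = [sum(genotype) for genotype in population]  # placeholder fitness only
population = next_generation(population, fitnesses, rng)
print(len(population), len(population[0]))
```

In the experiment described on the slide, the placeholder fitness would be replaced by the group-level evaluation of the swarm-bot running clones of the decoded perceptron.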


Population-Based Search Algorithms for Multi-Robot Systems


Coordinated motion: Fitness evaluation

• The fitness F of a genotype is given by the distance covered by the group:

  $$F = \frac{\lVert X(150) - X(0) \rVert}{D}$$

  where X(t) is the coordinate vector of the center of mass at time t, and D is the maximum distance that can be covered in 150 simulation cycles
• Fitness is evaluated 5 times (fixed number per candidate solution!), starting from different random initializations
• The resulting average is assigned to the genotype
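A sketch of this evaluation scheme, assuming the fitness of one run is the displacement of the center of mass between the first and last simulation cycle, divided by D. That normalized form is inferred from the definitions of X(t) and D above, and the function names are hypothetical.

```python
import math
import statistics


def run_fitness(x_start, x_end, d_max):
    """Assumed single-run fitness: center-of-mass displacement divided by D."""
    return math.dist(x_start, x_end) / d_max


def genotype_fitness(runs, d_max):
    """Average the single-run fitness over a fixed number of evaluations."""
    return statistics.fmean(run_fitness(start, end, d_max) for start, end in runs)


# Five hypothetical runs as (start, end) positions of the center of mass.
runs = [((0.0, 0.0), (3.0, 4.0)), ((1.0, 1.0), (4.0, 5.0)), ((0.0, 0.0), (0.0, 6.0)),
        ((2.0, 0.0), (2.0, 4.0)), ((0.0, 0.0), (5.0, 0.0))]
print(genotype_fitness(runs, d_max=10.0))
```

Unlike the re-evaluate-and-aggregate approach discussed earlier in the lecture, every genotype here receives the same fixed number of evaluations.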


Coordinated motion: Results

Replication   Performance
1             0.87888
2             0.83959
3             0.88338
4             0.71567
5             0.79573
6             0.75209
7             0.83425
8             0.85848
9             0.87222
10            0.76111

[Figure labels: average fitness; post-evaluation]


Coordinated motion: Real s-bots

[Figure panels: flexibility; default (used for evolution)]


Coordinated motion: Scalability

[Figure panels: flexibility and scalability; scalability]


Conclusion


Take Home Messages

• Machine-learning techniques (population-based and hill-climbing algorithms) can be used for design and optimization of software and hardware features in multi-robot settings
• The cost of an optimization problem is heavily influenced by the amount of noise in the evaluation function, the time needed for evaluating a candidate solution, and the dimension of the parameter space
• Collaborative co-adaptation strategies can be differentiated along three axes: public/private solutions, homogeneous/heterogeneous system, individual/group performance
• Multi-robot platforms can be exploited for testing multiple candidate solutions in parallel
• One way to bypass the credit assignment problem in multi-robot contexts is to enforce homogeneity and reward group performance
• PSO appears to be well suited for fully distributed on-board operation and fairly robust to small pools of candidate solutions


Additional Literature – Week 11

Books
• T. Balch and L. E. Parker (Eds.), Robot teams: From diversity to polymorphism. Natick, MA: A K Peters.

Papers
• Zhang Y., Antonsson E. K., and Martinoli A., “Evolutionary Engineering Design Synthesis of On-Board Traffic Monitoring Sensors”. Research in Engineering Design, 19(2-3): 113-125, 2008.
• Pugh J. and Martinoli A., “Multi-Robot Learning with Particle Swarm Optimization”. Proc. of the Fifth ACM Int. Joint Conf. on Autonomous Agents and Multi-Agent Systems, May 2006, Hakodate, Japan, pp. 441-448.
• Dorigo M., Trianni V., Sahin E., Groß R., Labella T., Nolfi S., Baldassare G., Deneubourg J.-L., Mondada F., Floreano D., and Gambardella L., “Evolving Self-organising Behaviours for a Swarm-bot”. Autonomous Robots, 17: 223-245, 2004.
• Murciano A. and Millán J. del R., “Specialization in Multi-Agent Systems Through Learning”. Biological Cybernetics, 76: 375-382, 1997.
• Mataric M. J., “Learning in behavior-based multi-robot systems: Policies, models, and other agents”. Special Issue on Multi-disciplinary studies of multi-agent learning, Ron Sun, editor, Cognitive Systems Research, 2(1): 81-93, 2001.
