UNIVERSITY OF CALIFORNIA, SAN DIEGO
Extremum Seeking for Mobile Robots
A Dissertation submitted in partial satisfaction of the
requirements for the degree Doctor of Philosophy
in
Engineering Sciences (Mechanical Engineering)
by
Nima Ghods
Committee in charge:
Professor Miroslav Krstic, Chair
Professor Robert Bitmead
Professor William Helton
Professor Raymond de Callafon
Professor Michael Todd
2011
Copyright
Nima Ghods, 2011
All rights reserved.
The Dissertation of Nima Ghods is approved, and
it is acceptable in quality and form for publication
on microfilm and electronically:
Chair
University of California, San Diego
2011
For my mother
who suddenly faced adversity in this great land
and sacrificed in order to keep opportunity alive for her children.
TABLE OF CONTENTS
Signature Page
Dedication
Table of Contents
List of Figures
Acknowledgements
Vita
Abstract of the Dissertation
1 Introduction
    1.1 Thesis Overview
2 Slow Sensor
    2.1 Introduction
    2.2 Model of a Metal Oxide Sensor
    2.3 Extremum Seeking Design for Slow Sensors
    2.4 Slow Sensor and a Static Map
    2.5 Drifting Sensor and a Static Map
    2.6 Navigation of a 2D Point Mass With a Slow Sensor
3 Source Seeking for Nonholonomic Unicycle with Speed Regulation
    3.1 Introduction
    3.2 Vehicle Model and Control Design
    3.3 The Average System
    3.4 Stability for Small Positive or Negative Vc
    3.5 Stability for Medium and Large Positive Vc
    3.6 Conclusion
4 Multi-Agent Deployment Over a Source
    4.1 Introduction
    4.2 Control Design
    4.3 Free Anchors
    4.4 Fixed Anchors
    4.5 Simulation Results
    4.6 Conclusions
5 Multi-agent Deployment with Stochastic Extremum Seeking
    5.1 Introduction
    5.2 Vehicle Model and Local Agent Cost
    5.3 Control Design
    5.4 Stability Analysis
        5.4.1 Case 1
        5.4.2 Case 2
    5.5 Simulation
    5.6 Conclusion
6 Light Source Seeking Experiments
    6.1 Introduction
    6.2 Vehicle Design
    6.3 Experiment
        6.3.1 Localization and Tracking of a Light Source
        6.3.2 Level Set Tracking of a Light Source
        6.3.3 Collision Avoidance
    6.4 Conclusion and Future Work
7 Plume Source Seeking Experiments
    7.1 Introduction
    7.2 Testbed Setup
    7.3 Robot Design
    7.4 Experiment Results
    7.5 Conclusion and Future Work
A Stability Analysis
B Averaging in Infinite Dimensions
Bibliography
LIST OF FIGURES
Figure 2.1: (a) An example of metal oxide sensor TGS2602 responding to four different concentrations of ethanol. (b) Comparison of the first order sensor model and the real sensor reaction to ethanol.
Figure 2.2: Extremum seeking block diagrams. The modified extremum seeking algorithm (b) applies both to the case with a slow sensor (ε > 0) and to the case with a sensor modeled as a pure integrator, which we also refer to as a ‘drifting sensor’ (ε = 0). In both cases (ε > 0 and ε = 0), the washout filter is optional (both h > 0 and h = 0 are permissible).
Figure 2.3: Gas concentration distribution along the pipe with gas leak at position 0.
Figure 2.4: Simulation results for modified extremum seeking with slow sensor dynamics. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗. (c) The signal after the high pass filter. (d) The slow sensor reading.
Figure 2.5: Simulation results for extremum seeking with Gsensor(s) = b/s with washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗. (c) The signal after the high pass filter.
Figure 2.6: Simulation results for extremum seeking with Gsensor(s) = b/s and without washout filter. (a) Output of the nonlinear map. (b) The sensor position relative to θ∗.
Figure 2.7: Modified ES for 2D point mass vehicle with slow sensor. The scheme applies both to the case with a slow sensor (ε > 0) and to the case with a sensor modeled as a pure integrator, which we also refer to as a ‘drifting sensor’ (ε = 0), and with both h > 0 and h = 0 being permissible.
Figure 2.8: Simulation results for extremum seeking on a 2D point mass with a slow sensor. (a) Vehicle trajectory with the intensity of the nonlinear map in the background. (b) Output of the nonlinear map. (c) The slow sensor output. (e) The output of the washout filter. (d) and (f) The control input of x-axis and y-axis before the addition of the perturbation, respectively.
Figure 3.1: The notation used in the model of vehicle sensor and center dynamics.
Figure 3.2: Block diagram of source seeking via tuning of angular velocity and forward velocity using one reading.
Figure 3.3: Diagram of the error variables relating the vehicle and the source.
Figure 3.4: Simulation results for steering-based unicycle source seeking with forward speed regulation: (a), (b), (c) showing the evolution of the variables rc, θ, and Vc + bξ, respectively, and (d) showing the trajectory of the vehicle.
Figure 3.5: The difference in trajectories for small positive and negative Vc. The two cases yield convergence to the average equilibria (3.31) and (3.32), respectively. For Vc < 0 the vehicle points towards the source at the end of the transient, whereas for Vc > 0 the vehicle points away from the source at the end of the transient.
Figure 3.6: Simulation result of vehicle trajectory using steering-based source seeking and forward speed regulation on a Rosenbrock function (the white shading represents the maximum).
Figure 3.7: Two trajectories of the same vehicle, with the only difference being the initial condition in θ. The vehicle converges to two different average equilibria, (3.33) and (3.34). (a) shows the evolution of the relative angle between the vehicle heading and the source, with µ0 ≈ π/3. (b) shows the trajectory of the vehicles.
Figure 3.8: Three trajectories of the same vehicle, with the only difference being the value of Vc. The vehicle converges to three different trajectories that encircle the source. (a) shows the evolution of the relative angle between the vehicle heading and the source, with µ0 ≈ 0 when Vc is close to V_c^upper and µ0 ≈ π/2 when Vc ≫ V_c^upper. (b) shows the trajectory of the vehicles.
Figure 4.1: Vehicle density function for λ = 5 and λ(α) = 5(2 − α).
Figure 4.2: Block diagram of a single follower agent.
Figure 4.3: Double y-axis plots of the vehicle trajectories showing time scale on the left y-axis, the signal field strength on the right y-axis, and the location of the vehicles on the x-axis. (a) Agent deployment with fixed anchors. (b) Agent deployment with free anchors.
Figure 4.4: Theoretical plot of (a) Formation distribution function and (b) Formation density function for the fixed and free anchor cases.
Figure 4.5: (a) Agent deployment with free anchors starting far from the equilibrium with linearly increasing parameters. (b) Group of 11 agents using free anchor case to achieve seeking of a moving source.
Figure 5.1: Shows a group of vehicles using the stochastic extremum seeking algorithm with Case 1 perturbations and interaction gains given by (5.68). The anchor agents are denoted by red triangles and the follower agents are denoted by blue dots. The agents start inside the dashed black line and converge to a circular formation around the source.
Figure 5.2: Shows a group of vehicles using the stochastic extremum seeking algorithm with Case 2 perturbations. The agents start inside the dashed black line and converge to a line formation centered around the source with the anchor agents at the end of the line formation.
Figure 6.1: Graphical interpretation of the unicycle model with a decoupled sensor. The red dot indicates the sensor's location.
Figure 6.2: ANT (a) top view (b) bottom view.
Figure 6.3: CAD rendering of the PCB.
Figure 6.4: Photographs of the ANT performing source seeking with overlaid trajectory, appearing in order from left to right, top to bottom.
Figure 6.5: Photographs of the ANT performing level set tracing at 15 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.6: Picture of the testbed after the ANT had traced the level set several times.
Figure 6.7: Photographs of two ANTs performing source seeking in a field produced by two light sources at 10 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.8: Photographs of the ANT performing obstacle avoidance while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.9: Photographs of the ANTs avoiding each other while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.
Figure 7.1: Wind tunnel (a) the intake (b) the outlet.
Figure 7.2: Smoke chamber (a) picture of the smoke chamber (b) diagram of smoke chamber.
Figure 7.3: Matlab GUI used to run experiments. The GUI has communication states on the left, the test controls on the top right, and the real time plots on the bottom right.
Figure 7.4: Plume-bot (a) picture of the plume-bot (b) CAD of plume-bot.
Figure 7.5: Custom designed circuit board.
Figure 7.6: Smoke sensor (a) picture of the smoke sensor (b) circuit diagram for particulate sensors.
Figure 7.7: Circuit diagram for wind sensors.
Figure 7.8: Block diagram of the overall experiment.
Figure 7.9: Picture of the plume-bot during a plume source seeking test.
Figure 7.10: A 35 sec trajectory of the plume-bot performing smoke plume localization in a wind tunnel with a rightward wind of 1 m/s.
ACKNOWLEDGEMENTS
First and foremost, I would like to express my gratitude to my advisor Professor
Miroslav Krstic for all the opportunities that he has provided for me. His excellent
advice and guidance have tremendously helped my academic as well as professional
development. It was truly an honor to work with him.
I would like to thank my mother Tara, my sister Mashia, my niece Armita, and
my brother-in-law Shahram for their love and support.
I would like to thank the members of my committee for their helpful questions
and comments, and lending their time and expertise to this project.
I would like to thank my fellow graduate students, Antranik Siranosian, Jen-
nie Cochran, Dan Arnold, James Gray, Paul Frihauf, David Zhang, Andrew Kwok,
Gabe Graham, Chad Foerster, Christopher Colburn, Ahsan Samiee, James Krieger,
Alicia Powers, Nikos Berkiaris-Liberis, Halil Basturk, Alex Scheinker, Alex Simp-
kins, Ameet Deshpande, Charles Kinney, Matthew Graham, Bahman Gharesifard,
Mike Ouimet, Delphine Bresch-Pietri, Artem Chakirov, Michael Bohm and Gideon
Prior for creating an enjoyable and collaborative research environment. A special
thanks goes to Jennie, Antranik, and Paul for all their advice and help.
I would like to thank the iBotics team, Gregory Mills, Jenny Wize, Paul Wise-
caver, Thomas Denewiler, Andrew Meares, and Chris Barngrover. A special thanks
to Andrew and Thomas for all their enthusiasm and love for robots.
I would like to thank the people on the MURI and LANL plume project, Ramon
Huerta, Lev Tsimring, Alexander Vergara Tinoco, Kerem Muezzinoglu, Terry Peters,
Nikolai Rulkov, Mikhail Rabinovich, Matt Bement, and Charles Farrar.
I would like to thank the MAE machine shop staff, Chris Cassidy, Thomas
Chalfant, and David Lischer for lending me lots of help and letting me use the shop
at ungodly hours.
I would like to thank all the undergrad team members that helped me with my
research experiments and side projects.
Finally, I would like to thank my friends John Crawford, Rody Tebcherani, and
Cezario Tebcherani, who I can always count on. A special thanks to John for his
listening ear and for always helping me put things in perspective.
This dissertation includes reprints of the following papers:
N. Ghods, and M. Krstic, “Source seeking with very slow or drifting sensors,” pro-
visionally accepted for Journal of Dynamic Systems, Measurement, and Control.
(Chapter 2)
N. Ghods, and M. Krstic, “Speed regulation in steering-based source seeking,” Au-
tomatica, vol. 46, pp. 452–459, 2010. (Chapter 3)
N. Ghods and M. Krstic, “Multi-agent deployment over a source,” provisionally
accepted for IEEE Transactions on Control Systems Technology. (Chapter 4)
N. Ghods, P. Frihauf, and M. Krstic, “Multi-Agent Deployment in the Plane Using
Stochastic Extremum Seeking,” IEEE Conference on Decision and Control, 2010.
(Chapter 5)
The dissertation author was the primary investigator and author of these publi-
cations.
VITA
2006 B.S. in Mechanical Engineering, University of California, San Diego
2006-2008 Teaching Assistant, Department of Mechanical Engineering, University of California, San Diego
2011 Ph.D. in Engineering Sciences (Mechanical Engineering), University of California, San Diego
PUBLICATIONS
C. Zhang, D. Arnold, N. Ghods, A.A. Siranosian and M. Krstic, “Source seeking with non-holonomic unicycle without position measurement and with tuning of forward velocity,” Systems & Control Letters, vol. 56, issue 3, pp. 245–252, 2007.
J. Cochran, N. Ghods, A. Siranosian, and M. Krstic, “3D source seeking for underactuated vehicles without position measurement,” IEEE Transactions on Robotics, vol. 25, pp. 245–252, 2009.
N. Ghods, and M. Krstic, “Speed regulation in steering-based source seeking,” Automatica, vol. 46, pp. 452–459, 2010.
N. Ghods, and M. Krstic, “Source seeking with very slow or drifting sensors,” provisionally accepted for Journal of Dynamic Systems, Measurement, and Control.
N. Ghods, and M. Krstic, “Multi-agent deployment over a source,” provisionally accepted for IEEE Transactions on Control Systems Technology.
ABSTRACT OF THE DISSERTATION
Extremum Seeking for Mobile Robots
by
Nima Ghods
Doctor of Philosophy in Engineering Sciences (Mechanical Engineering)
University of California, San Diego, 2011
Professor Miroslav Krstic, Chair
The work in this thesis describes theoretical and experimental results of ex-
tremum seeking applied to vehicle(s) with the objective of localizing the source of
an unknown, nonlinear, signal field. For environments where position information
is unavailable, the extremum seeking method is applied to autonomous vehicles as
a means of navigating to find the source of some signal which the vehicles can mea-
sure locally. The signal is at maximum intensity at the source and decreases with
distance away from the source. Although we only assume that the signal field has
a maximum in experiments, to prove theoretical stability we use a quadratic form as a
local approximation of the signal field.
We explore the idea of dealing with a very slow or drifting sensor and provide
stability results for several distinct variations of an extremum seeking scheme for 1D
optimization and 2D source localization with point-mass vehicle dynamics. Detailed
convergence analysis and simulations for steering-based source seeking with forward
velocity regulation applied to nonholonomic vehicles are provided. We develop a
deterministic algorithm in a continuum to deploy a group of autonomous vehicles
(agents) capable of measuring relative position to neighbors, in a line formation,
which has a higher density of agents near the source of a measurable signal and a
lower density away from the source in 1D. We also consider stochastic swarming
algorithms in 2D that force the net of agents to spread, maintain a formation, and
seek a source without position information, whereby each agent is given a local
measurement of signal field and the relative distance from neighbors.
Experimental results of extremum seeking applied to mobile vehicles to perform
localization, tracking, and level-set tracing of a light source are shown. We perform
experiments with multiple vehicles using extremum seeking not only to localize the
light source but also to avoid objects and each other. Finally, we discuss details
of setting up a testbed to produce a characterized smoke plume and the results of
plume source seeking experiments.
1
Introduction
The main goal of this work is to develop algorithms based on extremum seeking
for autonomous vehicle(s). Using theoretical and experimental results we show that
these algorithms allow the vehicle(s) to localize an unknown source. Throughout
this work we assume the signal is at maximum intensity at the source and decreases
with distance away from the source. The main idea for the control law is to guide
the vehicle(s) up the gradient of the signal to find the source. For the theoretical
results we assume a quadratic form for the signal field. The quadratic assumption
can be relaxed using the same methods in [2, 53].
For coordinated motion control and autonomous agents, deprivation of position
information is an area of rapidly growing interest. Extremum seeking is a use-
ful concept in environments where GPS is unavailable and inertial navigation is
too expensive, such as urban environments, underwater, under ice and in caves.
Extremum seeking is a real-time, non-model based adaptive control technique for
tuning parameters to optimize an unknown nonlinear map. Extremum seeking re-
lies on persistence of excitation, usually a sinusoid, to perturb the parameters being
tuned. This quantifies the effects of the parameters on the output of the nonlinear
map, then uses that information to generate estimates of the optimal parameter
values. Extremum seeking [2] has been advanced or employed in applications by
several other authors [10, 4, 38, 53, 54, 1, 44, 45, 61, 46, 52, 62, 8, 57, 56].
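The loop described above can be illustrated with a minimal sketch, assuming a scalar static map and a forward-Euler discretization. The function name `classical_es`, the test map, and all numerical values (a, ω, k, h, step size) are illustrative assumptions and are not taken from this thesis.

```python
import math

def classical_es(f, theta0, a=0.1, omega=5.0, k=1.0, h=1.0,
                 dt=1e-3, t_end=100.0):
    """Forward-Euler simulation of a perturbation-based extremum seeking loop.

    f      : unknown static map to be maximized (only its output is used)
    theta0 : initial parameter estimate
    a, omega : amplitude and frequency of the sinusoidal perturbation
    k      : adaptation gain, h : washout filter pole
    All default values are hypothetical tuning choices.
    """
    theta_hat = theta0      # parameter estimate (integrator state)
    lp = f(theta0)          # low-pass state; y - lp realizes the washout s/(s+h)
    t = 0.0
    for _ in range(int(t_end / dt)):
        s = math.sin(omega * t)
        y = f(theta_hat + a * s)        # probe the map with the perturbed parameter
        eta = y - lp                    # washout (high-pass) filter output
        lp += dt * h * eta              # low-pass state update: lp' = h (y - lp)
        theta_hat += dt * k * eta * s   # demodulate with sin(wt), then integrate
        t += dt
    return theta_hat

# Hypothetical quadratic map with its maximum at theta = 2.
f = lambda th: 1.0 - (th - 2.0) ** 2
theta_final = classical_es(f, theta0=1.0)
print(theta_final)
```

The estimate drifts toward the maximizer because, on average, the demodulated signal is proportional to the local slope of the map; the loop never evaluates a gradient or a model of `f` directly.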
In the present work we attempt to overcome some of the newly-faced challenges
of source seeking with autonomous vehicle(s) using extremum seeking. In working
on chemical localization, one is faced with the problem of very slow sensor dynamics,
which causes the overall system to perform poorly or become unstable. In Chapter
2 the problem of slow sensor dynamics is addressed. In [11] the vehicle is constrained
to have a constant forward velocity, which creates a trade-off between convergence
speed and size of the ring to which the vehicle converges. The forward velocity constraint
is unrealistic for ground and underwater vehicles since most of the time they have
the ability not only to slow down but go backwards. In Chapter 3 we explore the
benefits of being able to regulate the forward velocity. The problem of multiple
vehicles with local information about the source and their neighbors performing
source seeking is analyzed in Chapters 4 and 5.
The thrust of the investigator’s effort as a Ph.D. candidate has been theoretical
development of control algorithms. These algorithms have been employed in sev-
eral applications including an autonomous underwater robot, a light-source seeking
robot, and a plume-source seeking robot. The final chapters set forth the experimen-
tal work that employs the algorithms. As is often the case when theory juxtaposes
with application, the applications give insight as to future theoretical work that
could improve robotic performance in extremum seeking. Some of the theoretical
work done in [11, 12] is experimentally verified in Chapter 6. The difficult task
of seeking the source of a complex smoke plume experimentally is considered in
Chapter 7.
1.1 Thesis Overview
The contents of this thesis are as follows.
Chapter 2 presents a modified extremum seeking scheme to account for and
exploit slow sensor dynamics. We also consider the worst case, which is sensor
dynamics governed by a pure integrator.
Chapter 3 presents an extremum seeking based design, with the intent of bring-
ing the vehicle to a stop, or as close to a stop as possible. The vehicle speed
is controlled using simple derivative-like feedback of the sensor measurement (the
derivative is approximated with a washout filter) to which a speed bias parameter
Vc is added. The angular velocity is tuned using standard extremum seeking.
Chapter 4 presents a control algorithm, built from two components, for vehicles
that are capable of sensing a local signal field and their position relative to their
neighbors. One component of the control law is inspired
by the heat partial differential equation (PDE) and it results in the agents deploying
between two anchor agents. The other component of the control law is based on
extremum seeking and it achieves higher vehicle density around the source. Using
averaging theory for PDEs we prove that the vehicle density will be highest around
the source.
Chapter 5 presents the deployment of a group of N autonomous fully actuated
vehicles (agents) in a non-cooperative manner in a planar signal field using stochastic
extremum seeking, with the objective of spreading, maintaining a formation, and
seeking a source. The vehicles are not able to sense their own positions but are
capable of sensing the distance between their neighbors and themselves.
Chapter 6 presents the robot design and experimental results for localizing,
tracking, and level-set tracing of a light source. The experimental results in this chapter
validate some of the numerical and theoretical results presented in [11, 12].
Chapter 7 presents the construction of a testbed and experimental results for
smoke plume source localization experiments. The experiments done in this chapter
are the first steps in validating the theoretical work in Chapter 2.
2
Slow Sensor
In this chapter we introduce a new idea of how to extend extremum seeking to
deal with a slow or drifting sensor. Slow sensors arise in many applications, including
sensing chemical concentrations in tracking of contaminant plumes. Slow sensors
are often the cause of poor performance and a potential cause of instability. In this
chapter we design a modified extremum seeking scheme to account for and exploit slow
sensor dynamics. We also consider the worst case, which is sensor dynamics governed
by a pure integrator. We provide stability results for several distinct variations of
an extremum seeking scheme for one-dimensional optimization. Then we develop a
design for source seeking in a plane using a fully actuated vehicle, prove its closed-
loop convergence, and present simulation results. We use metal-oxide microhotplate
gas sensors as a real world example of slow sensor dynamics, model the sensor based
on experimental data, and employ the identified sensor model in our source seeking
simulations.
2.1 Introduction
Recent advances in extremum seeking have shown it to be a powerful tool in real
time non-model based control and optimization [10, 4, 38, 51, 54, 1]. Success has
been achieved in compensating slow actuator dynamics [60, 59, 11], but no results
have been reported on extremum seeking for plants with slow sensor dynamics, or
in the extreme case of sensors governed by a pure integrator (drifting sensors). In
this thesis we introduce a new idea of how to extend extremum seeking to deal with
a slow or drifting sensor.
For simplicity, we first consider a single-parameter extremum seeking problem
with a static map, and sensor dynamics. Then we consider a 2D problem with
simple vehicle dynamics, and with slow sensor dynamics. The classical extremum
seeking scheme [2] is modified by observing that the integrator, a key adaptation
element, is already present in the sensor dynamics, if they are governed by a pure
integrator. We perform an appropriate (time-varying) swap of the integrator block
and the demodulation block (Section 2.3), and as a result obtain a scheme where
the map output converges to the extremum quickly, while the sensor output may
converge slowly, or it may even drift to infinity (in the case of a sensor modeled by
a pure integrator). Stability and simulation results are presented first for a system
with a slow sensor (Section 2.4). This is followed by results for a sensor governed by
a pure integrator (Section 2.5). (These results do not imply one another.) Finally,
results for the case of a 2D point mass vehicle with a slow sensor are presented
(Section 2.6).
Traditional methods for gas plume seeking using slow metal oxide sensors [28,
29, 30] (reviewed in Section 2.2) either wait for a large enough change in the sensor
reading or for the sensor reading to settle before they act. Most of these search
methods [5, 34, 35] are based on mimicking insect behavior (mainly moths) to localize
the source of an odor without much consideration of the sensor dynamics. The modified
ES scheme reacts to the sensor reading continuously, which allows the overall system
to converge to an optimum much faster than the sensor settling time.
Our compensation of slow sensor dynamics does not amount to employing a
differentiator after the sensor to cancel the integrator in the sensor and act on the
trend of the signal, rather than on the value of the signal. This approach would result
in amplification of noise. Instead, our approach leverages the integrator action in the
sensor, to have it assume the role of the tuning element in the extremum seeking
loop. We highlight this by considering both a version of the modified extremum
seeking scheme with the standard washout filter in the loop and a version without
the washout filter, proving stability in each case.
To show the capabilities of the modified extremum seeking scheme with the
metal oxide sensors we consider the realistic two dimensional problem of trying to
localize a gas leak in a room with a single moving sensor. In the 2D source seeking
problem we are faced with the problem that two integrators exist in the loop, one
from the sensor and one associated with the vehicle model. A modification of the
extremum seeking scheme is needed to reduce the loop phase drop from 180° to a
lesser value. This modification comes in the form of a washout filter to approximate
a differentiator, or, if preferred, in the form of a phase-lead compensator.
2.2 Model of a Metal Oxide Sensor
Due to their small size, metal oxide based microhotplate sensors can be used
to develop portable, sensitive, and low-cost gas monitoring systems to detect, for
example, leakage of hazardous gases. Modeling metal oxide microhotplate sensor
dynamics accurately can prove to be very difficult, as seen in [22, 20, 21]. In this
section we make a reasonable assumption to simplify the complicated models. The
basic premise of the sensor model in [22, 20, 21] is that the sensor reading is driven
by an exponential of the concentration of several gases, and the gas concentrations
are governed by several coupled ODEs, which correspond to chemical reactions. We
are concerned with locating the maximum of a single gas with little fluctuation in
temperature.
Tests were performed to better understand the leading dynamics of the sensor.
A gas with a certain concentration was released at 30 [sec] into the experiment, then
the gas was flushed out at 600 [sec]. Figure 2.1 (a) shows the reaction of a TGS2602
metal oxide microhotplate sensor [19] to ethanol at four different concentrations.
Note in Figure 2.1 (a) that the sensor reading takes around 120 [sec] to settle,
independently of the gas concentration.
From these tests we see that the dominant dynamics of the sensor are governed
by a first order system
\[
G_{\mathrm{sensor}}(s) = \frac{b}{s + \varepsilon}, \tag{2.1}
\]
[Figure 2.1, panel (a): "Sensor Reaction to Ethanol", sensor resistance (kΩ) versus time (sec) for 250 ppm, 200 ppm, 150 ppm, and 100 ppm. Panel (b): "Sensor Reading", sensor resistance (kΩ) versus time (sec), showing the sensor reading for 250 ppm and the first order sensor model reading.]
Figure 2.1: (a) An example of metal oxide sensor TGS2602 responding to four different concentrations of ethanol. (b) Comparison of the first order sensor model and the real sensor reaction to ethanol.
where b and ε are positive constants that depend on the sensor and the type of
gases. After performing several tests we observed that, although ε is positive, its
magnitude is quite small (on the order of 10−2). By inspection we set b = 0.037 and
ε = 0.046 to get the model for the gas sensor reacting to ethanol. Figure 2.1 (b)
compares the identified sensor model against the real TGS2602 gas sensor reading.
The sensor model parameters change for different gases and different sensors but
always stay positive. Note that methods in [2] can be applied if the sensor also
contains any fast dynamics.
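As a rough numerical illustration of the first order model (2.1), the sketch below evaluates its closed-form step response with the identified values b = 0.037 and ε = 0.046. The function names and the 98% settling criterion are assumptions made here for illustration; they do not come from the experiments.

```python
import math

B_GAIN = 0.037   # b from the text (TGS2602 reacting to ethanol)
EPSILON = 0.046  # epsilon from the text


def step_response(t, u=1.0, b=B_GAIN, eps=EPSILON):
    """Output of mu' = -eps*mu + b*u for a constant input u applied at t = 0, mu(0) = 0."""
    return (b * u / eps) * (1.0 - math.exp(-eps * t))


def settling_time(fraction=0.98, eps=EPSILON):
    """Time at which the step response reaches `fraction` of its final value."""
    return -math.log(1.0 - fraction) / eps


if __name__ == "__main__":
    print("time constant 1/eps [s]:", 1.0 / EPSILON)
    print("DC gain b/eps:", B_GAIN / EPSILON)
    print("98% settling time [s]:", settling_time())
    print("fraction of final value reached at t = 120 s:",
          step_response(120.0) / (B_GAIN / EPSILON))
```

With these parameters the time constant 1/ε is roughly 22 sec, so the 120 sec settling time quoted above corresponds to about 5.5 time constants of the model.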
2.3 Extremum Seeking Design for Slow Sensors
In this section, we modify the classical extremum seeking scheme to work with
very slow sensors. In the extreme case the sensors are governed by a pure integrator,
namely drifting sensors. We start with a key observation that an integrator is already
a part of the classical extremum seeking loop in Figure 2.2(a). We need to modify the
scheme so that the sensor itself is performing the task of this integrator. To do this,
we need to swap the integrator and the multiplication by sin(ωt) in Figure 2.2(a),
i.e., to move the integrator upstream in the signal path. This is not a simple swap of
linear blocks because a multiplication by a time varying signal is involved. However,
using integration by parts, we get that
\[ \int_0^t \eta(\tau)\sin(\omega\tau)\,d\tau = \sin(\omega t)\int_0^t \eta(\tau)\,d\tau - \omega\int_0^t \cos(\omega\tau)\int_0^\tau \eta(\sigma)\,d\sigma\,d\tau . \tag{2.2} \]
We use this observation to convert the scheme in Figure 2.2(a) to the scheme in
Figure 2.2(b), where the guiding idea is that the sensor is a pure integrator, namely,
ε = 0. As we shall see, this modification also works when ε > 0.
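The identity (2.2) can be checked numerically. The following is a minimal sketch that evaluates both sides with the trapezoidal rule; the test signal η and the helper names are arbitrary choices made here, not part of the design.

```python
import math


def cumtrapz(y, dt):
    """Cumulative trapezoidal integral of uniformly sampled data y, starting at 0."""
    out = [0.0]
    for i in range(1, len(y)):
        out.append(out[-1] + 0.5 * dt * (y[i - 1] + y[i]))
    return out


def check_identity(omega=30.0, t_end=2.0, n=20001):
    """Return (lhs, rhs) of the integration-by-parts identity (2.2) at t = t_end."""
    dt = t_end / (n - 1)
    t = [i * dt for i in range(n)]
    # Arbitrary smooth test signal standing in for eta (an assumption for this check).
    eta = [math.exp(-0.5 * ti) + 0.3 * math.cos(2.0 * ti) for ti in t]

    # Left-hand side: integral of eta(tau) * sin(omega * tau).
    lhs = cumtrapz([e * math.sin(omega * ti) for e, ti in zip(eta, t)], dt)[-1]

    # Right-hand side: sin(omega t) * int(eta) - omega * int(cos(omega tau) * int(eta)).
    eta_int = cumtrapz(eta, dt)
    inner = cumtrapz([math.cos(omega * ti) * ei for ti, ei in zip(t, eta_int)], dt)[-1]
    rhs = math.sin(omega * t_end) * eta_int[-1] - omega * inner
    return lhs, rhs


if __name__ == "__main__":
    lhs, rhs = check_identity()
    print("lhs =", lhs, "rhs =", rhs, "difference =", abs(lhs - rhs))
```

The two sides should agree up to the discretization error of the trapezoidal rule.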
In the following sections we will show, using averaging theory, that the modified
extremum seeking scheme can be used to maximize a signal (for example gas con-
centration), using just the output of the sensor and without any knowledge of the
map parameters or the sensor parameters.
Figure 2.2: Extremum seeking block diagrams: (a) classical extremum seeking algorithm; (b) modified ES for slow sensor. The modified extremum seeking algorithm (b) applies both to the case with a slow sensor (ε > 0) and to the case with a sensor modeled as a pure integrator, which we also refer to as a ‘drifting sensor’ (ε = 0). In both cases (ε > 0 and ε = 0), the washout filter is optional (both h > 0 and h = 0 are permissible).
2.4 Slow Sensor and a Static Map
We consider applications in which the goal is to maximize the output of an
unknown nonlinear map f(θ) by varying the input θ. The signal f(θ(t)) is measured
through a slow sensor, namely, the signal µ(t), governed by the ODE
\[ \dot{\mu} = -\varepsilon\mu + b f(\theta) . \tag{2.3} \]
Let the maximizing value of θ be denoted as θ∗. We assume that the nonlinear map
is quadratic,
\[ J = f(\theta) = f^* - q_\theta(\theta - \theta^*)^2, \tag{2.4} \]
where, besides θ∗ and f∗ being unknown, $q_\theta$ is an unknown positive constant.
In this section we study the case of a slow sensor (ε > 0 but small). We consider
both the ES scheme with a washout filter (h > 0) and without a washout filter
(h = 0). In the next section we address the same two cases but for a sensor modeled
as a pure integrator (ε = 0).
Let $\hat\theta$ be the estimate of θ∗, and $\tilde\theta = \hat\theta - \theta^*$ be the error. From Figure 2.2 (b) we
obtain
\[ \hat\theta = k\left(\eta\sin(\omega t) + \frac{1}{s}\left[-\eta\,\omega\cos(\omega t)\right]\right). \tag{2.5} \]
Note that we mix time- and frequency-domain notation by using the brackets [·] to denote that the transfer function acts as an operator on a time-domain function.
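To prove stability we are going to analyze $\tilde\theta$, η, and µ. Assuming the nonlinear
map (2.4) and the block diagram in Figure 2.2 (b) we obtain
\[ \mu = \frac{b}{s + \varepsilon}\left[f^* - q_\theta(\theta - \theta^*)^2\right] \tag{2.6} \]
\[ \eta = \frac{s}{s + h}[\mu] \tag{2.7} \]
\[ \tilde\theta = k\left(\eta\sin(\omega t) + \frac{1}{s}\left[-\eta\,\omega\cos(\omega t)\right]\right) - \theta^* . \tag{2.8} \]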
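By rearranging (2.7), multiplying (2.6) and (2.8) by s, replacing θ − θ∗ with $\tilde\theta + a\sin(\omega t)$, and
setting τ = ωt we obtain
\[ \frac{d\mu}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - \varepsilon\mu\right] \tag{2.9} \]
\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - \varepsilon\mu - h\eta\right] \tag{2.10} \]
\[ \frac{d\tilde\theta}{d\tau} = -\frac{1}{\omega}\,k\left(h\eta + \varepsilon\mu - b f^* + b q_\theta(\tilde\theta + a\sin(\tau))^2\right)\sin(\tau) . \tag{2.11} \]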
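Using the following two identities
\[ \frac{1}{2\pi}\int_0^{2\pi} (\tilde\theta + a\sin(\tau))^2\,d\tau = \tilde\theta^2 + \frac{a^2}{2} \tag{2.12} \]
\[ \frac{1}{2\pi}\int_0^{2\pi} (\tilde\theta + a\sin(\tau))^2\sin(\tau)\,d\tau = \tilde\theta a, \tag{2.13} \]
to average (2.9)–(2.11) we obtain
\[ \frac{d\mu_{avg}}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta\left(\tilde\theta_{avg}^2 + \frac{a^2}{2}\right) - \varepsilon\mu_{avg}\right] \tag{2.14} \]
\[ \frac{d\eta_{avg}}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta\left(\tilde\theta_{avg}^2 + \frac{a^2}{2}\right) - \varepsilon\mu_{avg} - h\eta_{avg}\right] \tag{2.15} \]
\[ \frac{d\tilde\theta_{avg}}{d\tau} = -\frac{k b a q_\theta}{\omega}\,\tilde\theta_{avg} . \tag{2.16} \]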
The equilibrium of the averaged system (2.14)–(2.16) is
\[ \mu^e_{avg} = \frac{b}{\varepsilon}\left(f^* + \frac{q_\theta a^2}{2}\right) \tag{2.17} \]
\[ \eta^e_{avg} = 0 \tag{2.18} \]
\[ \tilde\theta^e_{avg} = 0. \tag{2.19} \]
The Jacobian of (2.14)–(2.16) at $(\mu^e_{avg}, \eta^e_{avg}, \tilde\theta^e_{avg})$ is
\[ J_{avg} = \frac{1}{\omega}\begin{bmatrix} -\varepsilon & 0 & 0 \\ -\varepsilon & -h & 0 \\ 0 & 0 & -k b a q_\theta \end{bmatrix}. \tag{2.20} \]
Given that the nonlinear map has a maximum ($q_\theta > 0$) and that the sensor is stable
(ε > 0) and non-inverting (b > 0), it follows that, if we choose a, ω, k, h > 0, the
Jacobian (2.20) is Hurwitz and the equilibrium of the averaged system (2.14)–(2.16)
is locally exponentially stable. From the averaging theorem [36] we get the following
result.
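Theorem 2.1 There exists ω∗ such that for all finite ω > ω∗ the system in Figure
2.2 (b) with nonlinear map (2.4) has a unique exponentially stable periodic solution
$(\mu^{2\pi/\omega}(t), \eta^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of period 2π/ω which satisfies
\[ \left\| \begin{pmatrix} \mu^{2\pi/\omega}(t) - \dfrac{b}{\varepsilon}\left(f^* + \dfrac{q_\theta a^2}{2}\right) \\ \eta^{2\pi/\omega}(t) \\ \tilde\theta^{2\pi/\omega}(t) \end{pmatrix} \right\| \le O(1/\omega), \quad \forall\, t \ge 0. \tag{2.21} \]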
Since $\theta - \theta^* = \tilde\theta + a\sin(\omega t) = (\tilde\theta - \tilde\theta^{2\pi/\omega}) + \tilde\theta^{2\pi/\omega} + a\sin(\omega t)$, the theorem implies
that the first term converges to zero, the second term is O(1/ω), and the third term is O(a).
Thus $\limsup_{t\to\infty} |\theta(t) - \theta^*| = O(1/\omega) + O(a)$. Hence, we get
\[ \limsup_{t\to\infty} |f(\theta(t)) - f^*| = O(a^2 + 1/\omega^2) , \tag{2.22} \]
which characterizes the asymptotic performance of the extremum seeking loop in
Figure 2.2 (b).
Figure 2.4 shows simulations for a moving sensor along the length of a pipe, where
the objective is to localize a gas leak on the pipe with the use of sensor-compensated
extremum seeking, with the gas distribution, which is shown in Figure 2.3, modeled
in the form
\[ f(\theta) = \frac{\delta^*}{1 + p_\theta(\theta - \theta^*)^2}, \tag{2.23} \]
where δ∗ = 250, pθ = 0.5, and θ∗ = 0. The extremum seeking parameters were
chosen as ω = 30, a = 0.2, k = 10, and h = 1. We assume the sensor model (2.1)
with the parameters ε = 0.046 and b = 0.037. Figure 2.4(b) shows the position of
the sensor in reference to the gas leak with a starting position of 3. The nonlinear
map output (J) and the sensor position (θ) quickly converge to a periodic motion
around f ∗ and θ∗, respectively. The signal after the washout filter (η), shown in
Figure 2.4(c), goes to zero.
Figure 2.3: Gas concentration distribution along the pipe with gas leak at position 0 (gas concentration in ppm versus distance in m).

Note in Figure 2.4(d) that the sensor reading converges very slowly. The time
interval for which J and θ are shown in Figure 2.4 is only one tenth of the time
interval on which η and µ are shown. This is done in order to display the details
of the rapidly convergent sensor position θ, while the sensor reading µ is about ten
times slower. More specifically, even though it takes the sensor reading 120 [sec] to
settle, the extremum seeking algorithm is able to tune θ to achieve maximum output
from the nonlinear map in less than 6 [sec]. The convergence would be orders of
magnitude slower if the algorithm had to wait for the sensor reading to settle every
time it wanted to tweak θ.
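The loop of Figure 2.2 (b) can be sketched in a few lines of code. The following is a minimal simulation under stated assumptions, not the code used to produce Figure 2.4: it uses the map (2.23) and the parameter values quoted above, a fixed-step RK4 integrator, and zero initial conditions for the sensor and the washout filter (the initial conditions and the state-space realization of the washout filter are choices made here).

```python
import math


def simulate_slow_sensor_es(t_end=20.0, dt=1e-3, theta_hat0=3.0,
                            omega=30.0, a=0.2, k=10.0, h=1.0,
                            eps=0.046, b=0.037,
                            delta=250.0, p=0.5, theta_star=0.0):
    """Integrate the loop of Figure 2.2(b) with RK4; returns (times, theta_hat samples)."""

    def f(theta):                      # gas distribution (2.23)
        return delta / (1.0 + p * (theta - theta_star) ** 2)

    def theta_hat(t, s):               # estimate (2.5): k * (eta*sin(wt) + integrator state)
        mu, w, z = s
        return k * ((mu - w) * math.sin(omega * t) + z)

    def rhs(t, s):
        mu, w, z = s
        eta = mu - w                   # washout output: eta = s/(s+h)[mu]
        theta = theta_hat(t, s) + a * math.sin(omega * t)
        return (-eps * mu + b * f(theta),             # sensor (2.3)
                h * eta,                              # low-pass state inside the washout
                -eta * omega * math.cos(omega * t))   # integrator branch of (2.5)

    def shifted(s, ds, c):
        return tuple(x + c * d for x, d in zip(s, ds))

    s = (0.0, 0.0, theta_hat0 / k)     # assumed: sensor and filter start at rest
    times, estimates = [0.0], [theta_hat0]
    for i in range(int(round(t_end / dt))):
        t = i * dt
        k1 = rhs(t, s)
        k2 = rhs(t + 0.5 * dt, shifted(s, k1, 0.5 * dt))
        k3 = rhs(t + 0.5 * dt, shifted(s, k2, 0.5 * dt))
        k4 = rhs(t + dt, shifted(s, k3, dt))
        s = tuple(x + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
                  for x, d1, d2, d3, d4 in zip(s, k1, k2, k3, k4))
        times.append(t + dt)
        estimates.append(theta_hat(t + dt, s))
    return times, estimates


if __name__ == "__main__":
    times, estimates = simulate_slow_sensor_es()
    tail = [e for t, e in zip(times, estimates) if t >= 18.0]
    print("mean of theta_hat over the last 2 s:", sum(tail) / len(tail))
```

Setting `h=0.0` removes the washout filter and `eps=0.0` gives the drifting sensor, so the same sketch can be reused for the variants discussed in this chapter.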
In some applications the use of washout filters may be undesirable because they
act as approximate differentiators and therefore may result in the amplification of
noise. Dropping the washout filter still results in a stable system. The washout filter
is used for performance reasons, not for stability reasons or to ‘cancel’ the extremely
slow (integrator-like) dynamics of the sensor. The proof for this case (omitted) is
very similar to the proof for the case where the sensor is a pure integrator but the
ES scheme does employ a washout filter (Theorem 2.3), with the Jacobian of the
averaged system given as
\[ J_{avg} = \frac{1}{\omega}\begin{bmatrix} -\varepsilon & 0 \\ 0 & -k b a q_\theta \end{bmatrix}. \tag{2.24} \]
Figure 2.4: Simulation results for modified extremum seeking with slow sensor dynamics. (a) Output of the nonlinear map (ethanol concentration J in ppm). (b) The sensor position relative to θ∗ (position estimate of the gas leak, in m). (c) The signal after the high pass filter (η). (d) The slow sensor reading (µ in kΩ).

Theorem 2.2 Consider the system in Figure 2.2 (b) with the nonlinear map of form
(2.4) and without the washout filter. There exists ω∗ such that for all finite ω > ω∗
the system has a unique exponentially stable periodic solution $(y^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of
period 2π/ω which satisfies
\[ \left\| \begin{pmatrix} y^{2\pi/\omega}(t) - \dfrac{b}{\varepsilon}\left(f^* + \dfrac{q_\theta a^2}{2}\right) \\ \tilde\theta^{2\pi/\omega}(t) \end{pmatrix} \right\| \le O(1/\omega), \quad \forall\, t \ge 0 . \tag{2.25} \]
Simulation (not included) for the system in Theorem 2.2 shows a convergence
rate that is inferior to that of the algorithm with the washout filter (Theorem 2.1).
This convergence rate difference is not captured by the averaging analysis because
the approximation accuracy of averaging is low when some of the eigenvalues of the
average system are small due to small ε.
2.5 Drifting Sensor and a Static Map
Our scheme works even when ε = 0, which is the case when the sensor is a
pure integrator. This is a rather extreme situation of a sensor that responds, but
permanently drifts in its value (towards infinity). All that we can achieve in this
case is to maximize the sensor’s input, since its output never settles.
The stability analysis for this case mimics some parts of the proof for Theorem
2.1. Assuming the nonlinear map in (2.4) and setting ε = 0, we write (2.6) as
\[ \mu = \frac{b}{s}\left[f^* - q_\theta\left(\tilde\theta + a\sin(\omega t)\right)^2\right]. \tag{2.26} \]
Since the sensor output µ is not going to settle when its input θ settles, we do not
include the sensor output as a state for which we are proving convergence. We only
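study the states $\tilde\theta$ and η, whose equations are
\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta(\tilde\theta + a\sin(\tau))^2 - h\eta\right] \tag{2.27} \]
\[ \frac{d\tilde\theta}{d\tau} = -\frac{1}{\omega}\,k\left(h\eta - b f^* + b q_\theta(\tilde\theta + a\sin(\tau))^2\right)\sin(\tau) . \tag{2.28} \]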
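Using the identities (2.12) and (2.13) we obtain the following averaged equations
\[ \frac{d\eta_{avg}}{d\tau} = \frac{1}{\omega}\left[b f^* - b q_\theta\left(\tilde\theta_{avg}^2 + \frac{a^2}{2}\right) - h\eta_{avg}\right] \tag{2.29} \]
\[ \frac{d\tilde\theta_{avg}}{d\tau} = \frac{1}{\omega}\left[-k b a q_\theta\,\tilde\theta_{avg}\right]. \tag{2.30} \]
The averaged system (2.29), (2.30) has the equilibrium
\[ \left[\eta^e_{avg},\ \tilde\theta^e_{avg}\right] = \left[\frac{b}{h}\left(f^* + \frac{q_\theta a^2}{2}\right),\ 0\right], \tag{2.31} \]
with the Jacobian
\[ J_{avg} = \frac{1}{\omega}\begin{bmatrix} -h & 0 \\ 0 & -k b a q_\theta \end{bmatrix}. \tag{2.32} \]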
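Theorem 2.3 There exists ω∗ such that for all finite ω > ω∗ the system in Figure
2.2 (b) with the nonlinear map of form (2.4) and ε = 0 in the sensor dynamics has a
unique exponentially stable periodic solution $(\eta^{2\pi/\omega}(t), \tilde\theta^{2\pi/\omega}(t))$ of period 2π/ω which
satisfies
\[ \left\| \begin{pmatrix} \eta^{2\pi/\omega}(t) - \dfrac{b}{h}\left(f^* + \dfrac{q_\theta a^2}{2}\right) \\ \tilde\theta^{2\pi/\omega}(t) \end{pmatrix} \right\| \le O(1/\omega), \quad \forall\, t \ge 0 . \tag{2.33} \]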
Figure 2.5 shows a simulation with a sensor Gsensor(s) = b/s, θ∗ = 0, f ∗ = 1,
qθ = 0.5, and b = 1. The ES parameters are chosen as ω = 30, a = 0.2, k = 10,
and h = 1. Figure 2.5(a) shows the ability of the sensor-compensated ES scheme
to maximize the output of a nonlinear map even with a marginally stable sensor.
Figure 2.5(b) shows θ starting from 3 and converging to a periodic motion around
θ∗. Figure 2.5(c) shows how the signal after the washout filter (η) converges to a
periodic motion around $\eta^e_{avg} = 1.02$. The response for µ(t) is not shown since it
drifts in a linear manner towards infinity, as expected.
The scheme studied in Theorem 2.3 contains a cascade of the sensor’s integrator
dynamics and of a washout filter. It may appear that the key to the result is that a
differentiator cancels an integrator. This is not the case at all, as we illustrate with
the next simulations, for the system with Gsensor(s) = b/s and without the washout
filter (i.e., with h = 0). This simple result is given without a proof, which follows
from the fact that the (scalar) Jacobian is −kbaqθ/ω (in the τ time scale).
Figure 2.5: Simulation results for extremum seeking with $G_{\text{sensor}}(s) = b/s$ with washout filter. (a) Output of the nonlinear map (J). (b) The sensor position relative to θ∗ (estimate of θ∗). (c) The signal after the high pass filter (η).
Theorem 2.4 Consider the system in Figure 2.2 without the washout filter, with
ε set to zero in the sensor dynamics, and the nonlinear map of form (2.4). There
exists ω∗ such that for all ω > ω∗ the system has a unique exponentially stable
periodic solution $\tilde\theta^{2\pi/\omega}(t)$ of period 2π/ω which satisfies
\[ \left\| \tilde\theta^{2\pi/\omega}(t) \right\| \le O(1/\omega), \quad \forall\, t \ge 0 . \tag{2.34} \]
Simulation results for the system in Theorem 2.4 are shown in Figure 2.6 for
f ∗ = 1, qθ = 0.5, b = 1, ω = 30, a = 0.2, k = 10, and h = 1. As expected, θ and
f converge to a periodic motion around θ∗ and f ∗, respectively. The drifting sensor
without the washout filter has significant oscillations after settling compared to the
previous case with the washout filter. The significance of the result in Theorem 2.4,
shown in Figure 2.6, is that the modified extremum seeking scheme is not merely
acting based on the signal trend/derivative rather than on the signal value, which
would have been the case if the inclusion of a washout filter had turned out to be
crucial. Rather than ‘canceling’ the sensor’s integrator, our scheme leverages it, by
using its presence for the function of tuning θ(t) in the ES loop.
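For the case of Theorem 2.4 the loop collapses to a single scalar equation: with ε = 0 and no washout filter, η = µ, and differentiating (2.5) gives dθ̂/dt = k b f(θ) sin(ωt), which is the integration-by-parts identity (2.2) read backwards. The sketch below integrates that equation with RK4 for the quadratic map (2.4) and the parameter values listed above. The function names, the initial estimate of 3, and the choice of 60 perturbation periods are assumptions made here.

```python
import math


def simulate_drifting_no_washout(n_periods=60, steps_per_period=200, theta_hat0=3.0,
                                 omega=30.0, a=0.2, k=10.0, b=1.0,
                                 f_star=1.0, q=0.5, theta_star=0.0):
    """RK4 integration of d(theta_hat)/dt = k*b*f(theta)*sin(omega*t); returns samples."""
    dt = 2.0 * math.pi / (omega * steps_per_period)

    def rhs(t, theta_hat):
        theta = theta_hat + a * math.sin(omega * t)
        f = f_star - q * (theta - theta_star) ** 2      # quadratic map (2.4)
        return k * b * f * math.sin(omega * t)

    theta_hat, samples = theta_hat0, [theta_hat0]
    for i in range(n_periods * steps_per_period):
        t = i * dt
        k1 = rhs(t, theta_hat)
        k2 = rhs(t + 0.5 * dt, theta_hat + 0.5 * dt * k1)
        k3 = rhs(t + 0.5 * dt, theta_hat + 0.5 * dt * k2)
        k4 = rhs(t + dt, theta_hat + dt * k3)
        theta_hat += dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        samples.append(theta_hat)
    return samples


def last_period_mean_and_amplitude(samples, steps_per_period=200):
    """Mean and half peak-to-peak amplitude of the estimate over the final period."""
    tail = samples[-steps_per_period:]
    return sum(tail) / len(tail), 0.5 * (max(tail) - min(tail))


if __name__ == "__main__":
    mean, amplitude = last_period_mean_and_amplitude(simulate_drifting_no_washout())
    print("mean of theta_hat over last period:", mean)
    print("oscillation amplitude over last period:", amplitude)
```

The residual oscillation amplitude is roughly k b f∗/ω, which is consistent with the O(1/ω) bound in (2.34) and with the more pronounced oscillations noted for this case.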
2.6 Navigation of a 2D Point Mass With a Slow Sensor
In this section we study the case of a slow sensor (ε > 0 but small) on a vehicle
modeled as a 2D point mass
\[ \dot{x}(t) = u_x(t) \tag{2.35} \]
\[ \dot{y}(t) = u_y(t) , \tag{2.36} \]
where ux(t), uy(t) are two independent velocity inputs to the vehicle. For simplicity
of our presentation we assume that the nonlinear map is quadratic with the form
\[ f(x, y) = f^* - q_x(x - x^*)^2 - q_y(y - y^*)^2, \tag{2.37} \]
where (x∗, y∗) is the maximizer, f∗ is the maximum and $q_x$, $q_y$ are some unknown
positive constants.

Figure 2.6: Simulation results for extremum seeking with $G_{\text{sensor}}(s) = b/s$ and without washout filter. (a) Output of the nonlinear map (J). (b) The sensor position relative to θ∗ (estimate of θ∗).
We develop a two-input scheme, which accounts for the integrator dynamics of
the vehicle's two actuation channels, in the following manner. We start from the
scheme for a static map in Figure 2.2. To get an integrator to appear at the input
of the nonlinear map, we first place the term $1 = \frac{s}{s}$ between the ES gain (k) and
the addition of the perturbation a sin(ωt). Then, taking the term $\frac{1}{s}$ from the term
$\frac{s}{s}$ and moving it downstream in the signal flow direction and past the perturbation
input, which results in a differentiation of the perturbation, we get an integrator
to appear at the input of the nonlinear map. Then, realizing that a differentiator
s, which remains from the term $\frac{s}{s}$, cannot be implemented, we replace it with an
approximate differentiator, i.e., a washout filter $\frac{s}{s + d_x}$. Finally, we take advantage of
the availability of the integrator in the lowest branch of the extremum seeking loop
and, with a suitable block diagram manipulation, arrive at the scheme given in the
x-channel of the scheme in Figure 2.7.
Figure 2.7: Modified ES for 2D point mass vehicle with slow sensor. The scheme applies both to the case with a slow sensor (ε > 0) and to the case with a sensor modeled as a pure integrator, which we also refer to as a ‘drifting sensor’ (ε = 0), and with both h > 0 and h = 0 being permissible.

To go from a 1D scheme to a two-input 2D-navigation scheme we simply add
another extremum seeking channel with the perturbation and the demodulation
applied with a 90° phase shift, as was done in [60]. The vehicle control is given by
\[ u_x(t) = a\omega\cos(\omega t) + k_x\xi_x(t) \tag{2.38} \]
\[ u_y(t) = -a\omega\sin(\omega t) + k_y\xi_y(t) . \tag{2.39} \]
We introduce the new coordinates
\[ \tilde{x} = x - x^* - a\sin(\omega t) \tag{2.40} \]
\[ \tilde{y} = y - y^* - a\cos(\omega t) . \tag{2.41} \]
With the new coordinates the map (2.37) becomes
\[ f(x, y) = f^* - q_x(\tilde{x} + a\sin(\omega t))^2 - q_y(\tilde{y} + a\cos(\omega t))^2 . \tag{2.42} \]
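From the block diagram in Figure 2.7 we write the equations for µ, η, $\xi_x$, and $\xi_y$
\[ \mu = \frac{b}{s + \varepsilon}\left[f^* - q_x(x - x^*)^2 - q_y(y - y^*)^2\right] \tag{2.43} \]
\[ \eta = \frac{s}{s + h}[\mu] \tag{2.44} \]
\[ \xi_x = \eta\sin(\omega t) - \frac{1}{s}\left[\eta\,\omega\cos(\omega t) + d_x\xi_x\right] \tag{2.45} \]
\[ \xi_y = \eta\cos(\omega t) + \frac{1}{s}\left[\eta\,\omega\sin(\omega t) - d_y\xi_y\right]. \tag{2.46} \]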
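By replacing (x, y) with $(\tilde{x}, \tilde{y})$, letting τ = ωt, and rearranging (2.43)–(2.46) we
obtain the ODEs
\[ \frac{d\mu}{d\tau} = \frac{1}{\omega}\left[-\varepsilon\mu + b f^* - b q_x(\tilde{x} + a\sin(\tau))^2 - b q_y(\tilde{y} + a\cos(\tau))^2\right] \tag{2.47} \]
\[ \frac{d\eta}{d\tau} = \frac{1}{\omega}\left[-\varepsilon\mu - h\eta + b f^* - b q_x(\tilde{x} + a\sin(\tau))^2 - b q_y(\tilde{y} + a\cos(\tau))^2\right] \tag{2.48} \]
\[ \frac{d\xi_x}{d\tau} = -\frac{1}{\omega}\left[\left(h\eta + \varepsilon\mu - b f^* + b q_x(\tilde{x} + a\sin(\tau))^2 + b q_y(\tilde{y} + a\cos(\tau))^2\right)\sin(\tau) + d_x\xi_x\right] \tag{2.49} \]
\[ \frac{d\tilde{x}}{d\tau} = \frac{k_x}{\omega}\,\xi_x \tag{2.50} \]
\[ \frac{d\xi_y}{d\tau} = -\frac{1}{\omega}\left[\left(h\eta + \varepsilon\mu - b f^* + b q_x(\tilde{x} + a\sin(\tau))^2 + b q_y(\tilde{y} + a\cos(\tau))^2\right)\cos(\tau) + d_y\xi_y\right] \tag{2.51} \]
\[ \frac{d\tilde{y}}{d\tau} = \frac{k_y}{\omega}\,\xi_y . \tag{2.52} \]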
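Using the identities (2.12) and (2.13) to average (2.47)–(2.52), we get
\[ \frac{d\mu_{avg}}{d\tau} = \frac{1}{\omega}\left[-\varepsilon\mu_{avg} + b f^* - b q_x\left(\tilde{x}_{avg}^2 + \frac{a^2}{2}\right) - b q_y\left(\tilde{y}_{avg}^2 + \frac{a^2}{2}\right)\right] \tag{2.53} \]
\[ \frac{d\eta_{avg}}{d\tau} = \frac{1}{\omega}\left[-h\eta_{avg} - \varepsilon\mu_{avg} + b f^* - b q_x\left(\tilde{x}_{avg}^2 + \frac{a^2}{2}\right) - b q_y\left(\tilde{y}_{avg}^2 + \frac{a^2}{2}\right)\right] \tag{2.54} \]
\[ \frac{d\xi_{x\,avg}}{d\tau} = -\frac{1}{\omega}\left(b a q_x\tilde{x}_{avg} + d_x\xi_{x\,avg}\right) \tag{2.55} \]
\[ \frac{d\tilde{x}_{avg}}{d\tau} = \frac{k_x}{\omega}\,\xi_{x\,avg} \tag{2.56} \]
\[ \frac{d\xi_{y\,avg}}{d\tau} = -\frac{1}{\omega}\left(b a q_y\tilde{y}_{avg} + d_y\xi_{y\,avg}\right) \tag{2.57} \]
\[ \frac{d\tilde{y}_{avg}}{d\tau} = \frac{k_y}{\omega}\,\xi_{y\,avg} . \tag{2.58} \]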
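The equilibrium of the averaged system is
\[ \mu^e_{avg} = \frac{b}{\varepsilon}\left(f^* + \frac{(q_x + q_y)a^2}{2}\right) \tag{2.59} \]
\[ \eta^e_{avg} = 0 \tag{2.60} \]
\[ \xi^e_{x\,avg} = 0 \tag{2.61} \]
\[ \tilde{x}^e_{avg} = 0 \tag{2.62} \]
\[ \xi^e_{y\,avg} = 0 \tag{2.63} \]
\[ \tilde{y}^e_{avg} = 0 , \tag{2.64} \]
with the Jacobian of (2.53)–(2.58) at $(\mu^e_{avg}, \eta^e_{avg}, \xi^e_{x\,avg}, \tilde{x}^e_{avg}, \xi^e_{y\,avg}, \tilde{y}^e_{avg})$ given by
\[ J_{avg} = \frac{1}{\omega}\begin{bmatrix}
-\varepsilon & 0 & 0 & 0 & 0 & 0 \\
-\varepsilon & -h & 0 & 0 & 0 & 0 \\
0 & 0 & -d_x & -b a q_x & 0 & 0 \\
0 & 0 & k_x & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & -d_y & -b a q_y \\
0 & 0 & 0 & 0 & k_y & 0
\end{bmatrix}. \tag{2.65} \]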
Given that the nonlinear map has a maximum ($q_x, q_y > 0$) and that the sensor
is stable (ε > 0) and non-inverting (b > 0), it follows that, if we choose
$a, \omega, k_x, k_y, d_x, d_y, h > 0$, the Jacobian (2.65) is Hurwitz and the equilibrium (2.59)–(2.64)
of the averaged system (2.53)–(2.58) is locally exponentially stable. From the
averaging theorem [36] we get the following result.
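Theorem 2.5 There exists ω∗ such that for all finite ω > ω∗ the system in Figure
2.7 with nonlinear map (2.37) has a unique exponentially stable periodic solution
$(\mu^{2\pi/\omega}(t), \eta^{2\pi/\omega}(t), \xi_x^{2\pi/\omega}(t), \tilde{x}^{2\pi/\omega}(t), \xi_y^{2\pi/\omega}(t), \tilde{y}^{2\pi/\omega}(t))$ of period 2π/ω which satisfies
\[ \left\| \begin{pmatrix}
\mu^{2\pi/\omega}(t) - \dfrac{b}{\varepsilon}\left(f^* + \dfrac{(q_x + q_y)a^2}{2}\right) \\
\eta^{2\pi/\omega}(t) \\
\xi_x^{2\pi/\omega}(t) \\
\tilde{x}^{2\pi/\omega}(t) \\
\xi_y^{2\pi/\omega}(t) \\
\tilde{y}^{2\pi/\omega}(t)
\end{pmatrix} \right\| \le O(1/\omega), \quad \forall\, t \ge 0. \tag{2.66} \]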
Since $x - x^* = \tilde{x} + a\sin(\omega t) = (\tilde{x} - \tilde{x}^{2\pi/\omega}) + \tilde{x}^{2\pi/\omega} + a\sin(\omega t)$, the theorem
implies that the first term converges to zero, the second term is O(1/ω), and the third term
is O(a). Thus $\limsup_{t\to\infty} |x(t) - x^*| = O(1/\omega) + O(a)$. Similarly in y we obtain
$\limsup_{t\to\infty} |y(t) - y^*| = O(1/\omega) + O(a)$. Hence, we get
\[ \limsup_{t\to\infty} |f(x(t), y(t)) - f^*| = O(a^2 + 1/\omega^2) , \tag{2.67} \]
which characterizes the asymptotic performance of the extremum seeking loop in
Figure 2.7.
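The Hurwitz property of (2.65) can be read off its block structure: a lower triangular block with eigenvalues −ε/ω and −h/ω, and one 2×2 block per channel with characteristic polynomial λ² + dλ + kbaq (before the 1/ω scaling). The sketch below computes the six eigenvalues in closed form. The function names are hypothetical, and the curvatures used in the usage section (250 and 125) are an assumption: they are the local quadratic coefficients of the bell-shaped map used in the simulations of this section, not values given for the quadratic map (2.37).

```python
import cmath


def channel_eigenvalues(d, k, b, a, q, omega):
    """Eigenvalues of (1/omega) * [[-d, -b*a*q], [k, 0]], one channel block of (2.65)."""
    disc = cmath.sqrt(d * d - 4.0 * k * b * a * q)
    return [(-d + disc) / (2.0 * omega), (-d - disc) / (2.0 * omega)]


def jacobian_eigenvalues(eps, h, dx, dy, kx, ky, b, a, qx, qy, omega):
    """All six eigenvalues of the block-structured Jacobian (2.65)."""
    sensor_block = [complex(-eps / omega), complex(-h / omega)]  # triangular block
    return (sensor_block
            + channel_eigenvalues(dx, kx, b, a, qx, omega)
            + channel_eigenvalues(dy, ky, b, a, qy, omega))


def is_hurwitz(eigenvalues):
    """True when every eigenvalue has a strictly negative real part."""
    return all(ev.real < 0.0 for ev in eigenvalues)


if __name__ == "__main__":
    eigs = jacobian_eigenvalues(eps=0.046, h=1.0, dx=0.2, dy=0.2, kx=1.0, ky=1.0,
                                b=0.037, a=0.5, qx=250.0, qy=125.0, omega=20.0)
    for ev in eigs:
        print(ev)
    print("Hurwitz:", is_hurwitz(eigs))
```

For positive d and positive kbaq both roots of each channel polynomial have negative real parts, which matches the condition stated before Theorem 2.5.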
Figure 2.8 shows simulations of a point mass vehicle starting at position (1,1) using
a sensor with slow dynamics and actuator-sensor-compensated extremum seeking
on a nonlinear map modeled in the form
\[ f(x, y) = \frac{\delta^*}{1 + p_x(x - x^*)^2 + p_y(y - y^*)^2}, \tag{2.68} \]
where δ∗ = 250, $p_x = 1$, $p_y = 0.5$ and (x∗, y∗) = (0, 0). The extremum seeking
parameters are ω = 20, a = 0.5, $k_x = 1$, $k_y = 1$, $d_x = 0.2$, $d_y = 0.2$ and h = 1.
We assume the sensor model (2.1) with the parameters ε = 0.046 and b = 0.037.
It is interesting to note that the time it takes the vehicle to settle to the location
of the maximum concentration is one fourth the time that it takes the sensor reading
to settle. The increase in convergence time of the position of the sensor from the
previous 1D case to the 2D case is mainly due to the addition of the actuator
dynamics for the vehicle.
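A minimal sketch of the two-channel loop of Figure 2.7 is given below, using the control laws (2.38)–(2.39), the filters (2.44)–(2.46), the map (2.68) and the parameter values just listed. It is an illustration under stated assumptions rather than the code behind Figure 2.8: the RK4 integrator, the step size, the zero initial filter states, and the roughly 80 sec horizon are choices made here.

```python
import math


def simulate_point_mass_es(n_periods=255, steps_per_period=100, x0=1.0, y0=1.0,
                           omega=20.0, a=0.5, kx=1.0, ky=1.0, dx=0.2, dy=0.2, h=1.0,
                           eps=0.046, b=0.037, delta=250.0, px=1.0, py=0.5):
    """RK4 integration of the loop in Figure 2.7; returns lists of x and y samples."""
    dt = 2.0 * math.pi / (omega * steps_per_period)

    def rhs(t, s):
        mu, w, zx, zy, x, y = s
        sn, cs = math.sin(omega * t), math.cos(omega * t)
        eta = mu - w                                  # washout output (2.44)
        xi_x = eta * sn - zx                          # (2.45)
        xi_y = eta * cs + zy                          # (2.46)
        f = delta / (1.0 + px * x * x + py * y * y)   # map (2.68), source at (0, 0)
        return (-eps * mu + b * f,                    # sensor (2.43)
                h * eta,                              # low-pass state inside the washout
                eta * omega * cs + dx * xi_x,         # integrator state of (2.45)
                eta * omega * sn - dy * xi_y,         # integrator state of (2.46)
                a * omega * cs + kx * xi_x,           # u_x from (2.38)
                -a * omega * sn + ky * xi_y)          # u_y from (2.39)

    def shifted(s, ds, c):
        return tuple(v + c * d for v, d in zip(s, ds))

    s = (0.0, 0.0, 0.0, 0.0, x0, y0)   # assumed: sensor and filters start at rest
    xs, ys = [x0], [y0]
    for i in range(n_periods * steps_per_period):
        t = i * dt
        k1 = rhs(t, s)
        k2 = rhs(t + 0.5 * dt, shifted(s, k1, 0.5 * dt))
        k3 = rhs(t + 0.5 * dt, shifted(s, k2, 0.5 * dt))
        k4 = rhs(t + dt, shifted(s, k3, dt))
        s = tuple(v + dt / 6.0 * (d1 + 2.0 * d2 + 2.0 * d3 + d4)
                  for v, d1, d2, d3, d4 in zip(s, k1, k2, k3, k4))
        xs.append(s[4])
        ys.append(s[5])
    return xs, ys


if __name__ == "__main__":
    xs, ys = simulate_point_mass_es()
    print("mean x over last period:", sum(xs[-100:]) / 100.0)
    print("mean y over last period:", sum(ys[-100:]) / 100.0)
```

Averaging the position over one perturbation period removes the a sin(ωt) and a cos(ωt) components, so the printed means approximate the center of the vehicle's final orbit around the source.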
Figure 2.8: Simulation results for extremum seeking on a 2D point mass with a slow sensor. (a) Vehicle trajectory with the intensity of the nonlinear map in the background. (b) Output of the nonlinear map. (c) The slow sensor output. (e) The output of the washout filter. (d) and (f) The control input of x-axis and y-axis before the addition of the perturbation, respectively.

Similar to the one dimensional slow sensor case, modified extremum seeking for the
two dimensional case with point mass actuator dynamics can also be proven to be
stable with no washout filter or with a purely drifting sensor.
This chapter is in full a reprint of the material as it appears in: N. Ghods, and M.
Krstic, “Source seeking with very slow or drifting sensors,” provisionally accepted
for Journal of Dynamic Systems, Measurement, and Control.
The dissertation author was the primary investigator and author of this paper.
3 Source Seeking for Nonholonomic Unicycle with Speed Regulation
The simplest strategy for extremum seeking-based source localization, for sources
with unknown spatial distributions and nonholonomic unicycle vehicles without po-
sition measurement, employs a constant positive forward speed. Steering of the
vehicle in the plane is performed using only the variation of the angular velocity.
While keeping the forward speed constant is a reasonable strategy motivated by im-
plementation with aerial vehicles, it leads to complexities in the asymptotic behavior
of the vehicle, since the vehicle cannot settle—at best it can converge to a small-size
attractor around the source. In this paper we regulate the forward velocity, with
the intent of bringing the vehicle to a stop, or as close to a stop as possible. The
vehicle speed is controlled using simple derivative-like feedback of the sensor mea-
surement (the derivative is approximated with a washout filter) to which a speed
bias parameter Vc is added. The angular velocity is tuned using standard extremum
seeking. We prove two results. For Vc in a certain range around zero, we show that
the vehicle converges to a ring around the source and on average the limit of the
vehicle’s heading is either directly away or towards the source. For other values of
Vc > 0, the vehicle converges to a ring around the source and it revolves around the
source. Interestingly, the average heading of this revolution around the source is
more outward than inward. The theoretical results are illustrated with simulations.
3.1 Introduction
In [11, 59], we considered the problem of seeking the source of a scalar signal
using a nonholonomic vehicle with no position information. We designed two distinct
strategies—keeping the angular velocity constant and tuning the forward speed by
extremum seeking [59]; and keeping the forward speed constant and tuning the
angular velocity by extremum seeking [11]. The strategy in [59] generates vehicle
motions that resemble triangles, rhombi, or stars (with arc-shaped sides), which drift
towards the source, resulting in periodic motions around the source. The strategy
in [11] generates motions that sinusoidally converge towards the source and settle
into an almost periodic (in a mathematical sense of the term) motion in a ring around
the source. While the proof of the result [11] is more challenging, the vehicle motion
is much more efficient than with the strategy in [59], since the simple tuning of
the heading results in trajectories where the distance of the vehicle from the source
decreases monotonically.
Neither of the strategies in [11, 60] is ideal, since [60] sacrifices the transients,
whereas [11] complicates the asymptotic performance. In this paper we aim for the
best of both worlds, but not by simply combining the strategies in [11] and [60].
We propose something more elegant, a strategy that partly simplifies the approach
in [11], while adding a simple derivative-like feedback to a nominal forward speed
Vc. This feedback allows the vehicle to slow down as it gets closer to the source and
converge closer to the source without giving up convergence speed.
We prove two results, for quadratic signal fields that decay with the distance from
the source. For Vc in a certain range around zero, we show that the vehicle converges
to a ring around the source and on average the limit of the vehicle’s heading is either
directly away or towards the source. For other values of Vc > 0, the vehicle converges
to a ring around the source and it revolves around the source. Interestingly, the
average heading of this revolution around the source is more outward than inward—
this is possible because the vehicle’s speed is not constant, it is lower during the
outward steering intervals and higher during the inward steering intervals. The
theoretical results are illustrated with simulations. A simulation is also done to
consider the case of a Rosenbrock function as the signal field.

Figure 3.1: The notation used in the model of vehicle sensor and center dynamics.
In Section 3.2 a description of the vehicle model and extremum seeking scheme
is given. We derive the averaged system in Section 3.3. We prove local exponential
convergence results to ring/annulus-shaped sets around the source in Sections 3.4
and 3.5. Section 3.4 deals with the case of small |Vc|, whereas Section 3.5 deals with
medium and large positive values of Vc. Simulation results in Sections 3.4 and 3.5
illustrate the distinct behaviors exhibited using different values of Vc. In Section 3.6
we summarize the set of possible motions and attractors near the source that are
achieved for different values of a key design parameter.
3.2 Vehicle Model and Control Design
We consider a mobile agent modeled as a unicycle with a sensor mounted a
distance R away from the center. The diagram in Figure 3.1 depicts the position,
heading, angular and forward velocities for the vehicle center and sensor. The
equations of motion for the vehicle’s center are
\[ \dot{r}_c = v e^{j\theta} \tag{3.1} \]
\[ \dot{\theta} = \Omega \tag{3.2} \]
where $r_c$ is a complex variable that represents the center of the vehicle in 2D, θ is
the orientation, and v and Ω are the forward and angular velocity inputs, respectively.
The sensor is located at $r_s = r_c + Re^{j\theta}$. Note that this convenient complex
representation of the position would be less useful if extending this work to a 3D
setting.
The task of the vehicle is to seek a source that emits a signal (for example, the
concentration of a chemical, biological agent, electromagnetic, acoustic, or even ther-
mal signal) which decays as a function of distance away from the source. We assume
this signal field is distributed according to an unknown nonlinear map f (r(x, y))
which has an isolated local maximum f ∗ = f(r∗) where r∗ is the location of the lo-
cal maximum. We design a controller that achieves local convergence to r∗ without
knowledge of the shape of f , using only the measurement f(rs). We could design
a control law to force the vehicle’s trajectory to evolve according to the gradient
dynamical system $\dot{r}_c = -\nabla f$, if we knew both the shape of the map f and the
position of the vehicle rc, and further if the vehicle were fully actuated. In that case
the trajectory of rc would asymptotically converge to the set of stationary points of
f where ∇f(r∗) = 0. In the absence of the knowledge of function f(x, y) and of the
vehicle’s position, we have to employ techniques of non model-based optimization.
In addition, in the absence of direct actuation of the vehicle’s position, namely,
for a nonholonomic vehicle that cannot be directly steered sideways and all of its
motion has to be produced using forward and angular velocity inputs, the task of
source-seeking becomes even more challenging.
We employ extremum-seeking to tune the angular velocity (Ω) directly and the
forward velocity (v) indirectly. This scheme is depicted by the block diagram in
Figure 3.2. The control laws are given by
Ω = aω cos(ωt) + cξ sin(ωt) (3.3)
v = Vc + bξ , (3.4)
where ξ is the output of the washout filter, namely, of the approximate differentiator
of f(rs, t). The performance can be influenced by the parameters a, c, b, R, h, ω
and Vc. We tune angular velocity Ω with the basic extremum-seeking tuning law,
which has a perturbation term, aω cos(ωt), to excite the system. The ξ sin(ωt) term
estimates the angular gradient of the map.
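The control laws (3.3)–(3.4) together with the unicycle model (3.1)–(3.2) can be written as a single right-hand-side function. The sketch below is illustrative only: the function name, the state-space realization of the washout filter (ξ = f minus a low-pass state), and all default parameter values are hypothetical placeholders, since no numerical values are specified at this point in the text.

```python
import cmath
import math


def unicycle_source_seeking_rhs(t, state, f, omega=20.0, a=1.0, c=50.0, b=0.5,
                                h=1.0, R=0.1, Vc=0.005):
    """Time derivative of state = (r_c, theta, lp) under the control laws (3.3)-(3.4).

    r_c is the complex position of the vehicle center, theta the heading, and lp
    the low-pass filter state used to realize the washout filter s/(s+h).
    All default parameter values are placeholders chosen for illustration.
    """
    r_c, theta, lp = state
    r_s = r_c + R * cmath.exp(1j * theta)      # sensor position
    xi = f(r_s) - lp                           # washout output: xi = s/(s+h)[f]
    Omega = a * omega * math.cos(omega * t) + c * xi * math.sin(omega * t)   # (3.3)
    v = Vc + b * xi                            # (3.4)
    return (v * cmath.exp(1j * theta),         # (3.1)
            Omega,                             # (3.2)
            h * xi)                            # low-pass filter state


if __name__ == "__main__":
    def field(r):
        """Hypothetical quadratic signal field with its maximum at the origin."""
        return 1.0 - abs(r) ** 2

    state = (1.0 + 0.0j, 0.0, field(1.0 + 0.1))   # filter state matched to the reading
    print(unicycle_source_seeking_rhs(0.0, state, field))
```

When the filter state equals the current reading, ξ = 0, so the vehicle moves at the bias speed Vc and the angular velocity reduces to the perturbation aω cos(ωt).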
Figure 3.2: Block diagram of source seeking via tuning of angular velocity and forward velocity using one reading.
The forward velocity v = Vc + bξ is chosen using the following intuition. When
the vehicle is approaching the source, heading straight towards it, the sensor reading
is increasing and hence ξ > 0. It is reasonable to increase the speed of the vehicle
when it is going towards the source. Conversely, when the vehicle is past the source
and the signal reading is decreasing, i.e., ξ < 0, the vehicle should be slowed down,
which (3.4) achieves.
We stress that the steering feedback (3.3) does not employ the nonlinear damp-
ing introduced in [11]. The damping needed to exponentially stabilize the average
equilibria is provided by the forward speed feedback (3.4).
3.3 The Average System
We focus on maps which depend on the distance from the source only. Since
our goal is only the establishment of local convergence, we assume that the map is
quadratic, and given by
\[ f(r_s) = f^* - q_r|r_s - r^*|^2 \tag{3.5} \]
where r∗ is the unknown maximizer, f∗ = f(r∗) is the unknown maximum and $q_r$
is an unknown positive constant.
We define an output error variable
\[ e = \frac{h}{s + h}[f] - f^* , \tag{3.6} \]
where $\frac{h}{s+h}[f]$ is a low-pass filter applied to the sensor reading f, which allows us
to express ξ, the output of the washout filter, as $\xi = \frac{s}{s+h}[f] = f(r_s) - \frac{h}{s+h}[f] = f(r_s) - f^* - e$,
noting also that $\dot{e} = h\xi$.
Consider the system
\[ \dot{r}_c = (V_c + b\xi)e^{j\theta} \tag{3.7} \]
\[ \dot{\theta} = a\omega\cos(\omega t) + c\xi\sin(\omega t) \tag{3.8} \]
\[ \dot{e} = h\xi \tag{3.9} \]
\[ \xi = -\left(q_r|r_s - r^*|^2 + e\right) \tag{3.10} \]
\[ r_s = r_c + Re^{j\theta} \tag{3.11} \]
shown in Figure 3.2. To analyze this system we start by defining the shifted variables
\[ \hat{r}_c = r_c - r^* \tag{3.12} \]
\[ \hat{\theta} = \theta - a\sin(\omega t) \tag{3.13} \]
\[ \hat{e} = e - q_r R^2 . \tag{3.14} \]
We also introduce the time scale change
\[ \tau = \omega t, \tag{3.15} \]
and introduce a map from the position $\hat{r}_c$ to a scalar quantity θ∗, given by
\[ -\hat{r}_c = |\hat{r}_c|e^{j\theta^*} \tag{3.16} \]
\[ \theta^* = -\frac{j}{2}\ln\left(-\frac{\hat{r}_c}{\bar{\hat{r}}_c}\right) = \arg(r^* - r_c) , \tag{3.17} \]
where θ∗ represents the heading angle towards the source located at r∗ when the
vehicle is at $r_c$, and $\bar{\hat{r}}_c$ is the complex conjugate of $\hat{r}_c$. Using these definitions, the
expression for ξ is
\[ \begin{aligned} \xi &= -\left(q_r|r_c + Re^{j\theta} - r^*|^2 + \hat{e} - q_r R^2\right) \\ &= -\left(q_r\left(|\hat{r}_c|^2 - 2R|\hat{r}_c|\cos(\hat{\theta} - \theta^* + a\sin(\tau))\right) + \hat{e}\right). \end{aligned} \tag{3.18} \]

Figure 3.3: Diagram of the error variables relating the vehicle and the source.
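The dynamics of the shifted system are
\[ \frac{d\hat{r}_c}{d\tau} = \frac{1}{\omega}(V_c + b\xi)e^{j(\hat{\theta} + a\sin(\tau))} \tag{3.19} \]
\[ \frac{d\hat{\theta}}{d\tau} = \frac{1}{\omega}c\xi\sin(\tau) \tag{3.20} \]
\[ \frac{d\hat{e}}{d\tau} = \frac{1}{\omega}h\xi. \tag{3.21} \]
We next define error variables $\tilde{r}_c$ and $\tilde{\theta}$ (depicted in Figure 3.3), which represent
the distance to the source, and the difference between the vehicle’s heading and the
optimal heading, respectively,
\[ \tilde{r}_c = |\hat{r}_c| \tag{3.22} \]
\[ \tilde{\theta} = \hat{\theta} - \theta^*. \tag{3.23} \]
The resulting dynamics for the error variables are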
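\[ \frac{d\tilde{r}_c}{d\tau} = \frac{d\sqrt{\hat{r}_c\bar{\hat{r}}_c}}{d\tau} = \frac{1}{2|\hat{r}_c|}\left(\frac{d\hat{r}_c}{d\tau}\bar{\hat{r}}_c + \hat{r}_c\frac{d\bar{\hat{r}}_c}{d\tau}\right) = -\frac{V_c + b\xi}{\omega}\cos\left(\tilde{\theta} + a\sin(\tau)\right) \tag{3.24} \]
\[ \frac{d\tilde{\theta}}{d\tau} = \frac{d\hat{\theta}}{d\tau} - \frac{d\theta^*}{d\tau} = \frac{d\hat{\theta}}{d\tau} + \frac{j}{2|\hat{r}_c|^2}\left(\frac{d\hat{r}_c}{d\tau}\bar{\hat{r}}_c - \hat{r}_c\frac{d\bar{\hat{r}}_c}{d\tau}\right) = \frac{1}{\omega}\left[c\xi\sin(\tau) + \frac{V_c + b\xi}{\tilde{r}_c}\sin\left(\tilde{\theta} + a\sin(\tau)\right)\right] \tag{3.25} \]
\[ \frac{d\hat{e}}{d\tau} = \frac{1}{\omega}h\xi \tag{3.26} \]
\[ \xi = -\left(q_r\tilde{r}_c^2 + \hat{e} - 2q_r R\tilde{r}_c\cos\left(\tilde{\theta} + a\sin(\tau)\right)\right). \tag{3.27} \]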
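The system of equations is periodic with period $2\pi$, and the averaged error system is
\begin{align}
\frac{d\tilde r_c^{\mathrm{ave}}}{d\tau} &= \frac{1}{\omega}\Big[bJ_0(a)\left(q_r(\tilde r_c^{\mathrm{ave}})^2 + \hat e^{\mathrm{ave}}\right)\cos(\tilde\theta^{\mathrm{ave}}) \notag\\
&\qquad - bq_rR\,\tilde r_c^{\mathrm{ave}}\left(1 + J_0(2a)\cos(2\tilde\theta^{\mathrm{ave}})\right) - V_cJ_0(a)\cos(\tilde\theta^{\mathrm{ave}})\Big] \tag{3.28}\\
\frac{d\tilde\theta^{\mathrm{ave}}}{d\tau} &= \frac{1}{\omega}\Big[-q_r\left(2cRJ_1(a) + bJ_0(a)\right)\tilde r_c^{\mathrm{ave}}\sin(\tilde\theta^{\mathrm{ave}}) + bq_rRJ_0(2a)\sin(2\tilde\theta^{\mathrm{ave}}) \notag\\
&\qquad + \frac{V_cJ_0(a) - bJ_0(a)\hat e^{\mathrm{ave}}}{\tilde r_c^{\mathrm{ave}}}\sin(\tilde\theta^{\mathrm{ave}})\Big] \tag{3.29}\\
\frac{d\hat e^{\mathrm{ave}}}{d\tau} &= -\frac{h}{\omega}\Big[\left(q_r(\tilde r_c^{\mathrm{ave}})^2 + \hat e^{\mathrm{ave}}\right) - 2q_rRJ_0(a)\,\tilde r_c^{\mathrm{ave}}\cos(\tilde\theta^{\mathrm{ave}})\Big], \tag{3.30}
\end{align}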
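where $J_0(a)$ and $J_1(a)$ are Bessel functions of the first kind. The averaged error system (3.28)–(3.30) has four equilibria defined by
\begin{align}
\begin{bmatrix}\tilde r_c^{\mathrm{ave,eq1}} \\ \tilde\theta^{\mathrm{ave,eq1}} \\ \hat e^{\mathrm{ave,eq1}}\end{bmatrix} &= \begin{bmatrix}\dfrac{V_cJ_0(a)}{bq_rR\rho_1} \\ \pi \\ e_{12}\end{bmatrix}, \tag{3.31}\\
\begin{bmatrix}\tilde r_c^{\mathrm{ave,eq2}} \\ \tilde\theta^{\mathrm{ave,eq2}} \\ \hat e^{\mathrm{ave,eq2}}\end{bmatrix} &= \begin{bmatrix}-\dfrac{V_cJ_0(a)}{bq_rR\rho_1} \\ 0 \\ e_{12}\end{bmatrix}, \tag{3.32}\\
\begin{bmatrix}\tilde r_c^{\mathrm{ave,eq3}} \\ \tilde\theta^{\mathrm{ave,eq3}} \\ \hat e^{\mathrm{ave,eq3}}\end{bmatrix} &= \begin{bmatrix}\rho_0 \\ \pi + \mu_0 \\ e_{34}\end{bmatrix}, \tag{3.33}\\
\begin{bmatrix}\tilde r_c^{\mathrm{ave,eq4}} \\ \tilde\theta^{\mathrm{ave,eq4}} \\ \hat e^{\mathrm{ave,eq4}}\end{bmatrix} &= \begin{bmatrix}\rho_0 \\ \pi - \mu_0 \\ e_{34}\end{bmatrix}, \tag{3.34}
\end{align}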
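where
\begin{align}
\rho_0 &= \frac{\sqrt{\gamma_1}}{\sqrt{2}\,cJ_1(a)} \tag{3.35}\\
\mu_0 &= \arctan\left(\frac{\sqrt{\gamma_2}}{b\sqrt{q_rR}\,(1 - J_0(2a))}\right) \tag{3.36}\\
e_{12} &= -\frac{2V_cJ_0^2(a)}{b\rho_1} - \frac{\left(V_cJ_0^2(a)\right)^2}{q_rb^2R^2\rho_1^2} \tag{3.37}\\
e_{34} &= -\frac{\gamma_1}{2c^2RJ_1^2(a)} \tag{3.38}\\
&\quad + \frac{bq_rRhJ_0(a)\sqrt{2\gamma_1}\,(1 - J_0(2a))}{cJ_1(a)\sqrt{\gamma_2 + b^2q_rR(1 - J_0(2a))^2}} \tag{3.39}
\end{align}
and
\begin{align*}
\gamma_1 &= cJ_1(a)J_0(a)V_c + b^2q_rR\rho_2\\
\gamma_2 &= 2cJ_1(a)J_0(a)V_c - b^2q_rR\rho_3\\
\rho_1 &= 1 + J_0(2a) - 2J_0^2(a) \ge 0\\
\rho_2 &= J_0^2(a) - J_0(2a) - J_0(2a)J_0^2(a) + J_0^2(2a)\\
\rho_3 &= -2J_0^2(a) + 2J_0(2a)J_0^2(a) - J_0^2(2a) + 1 \ge 0.
\end{align*}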
Note that, due to properties of Bessel functions, $1 - J_0(2a)$ is positive for all positive $a$. In addition, $\rho_1(a)$ and $\rho_3(a) = (1 - J_0(2a))\rho_1(a)$ are positive for all positive and sufficiently small values of $a$. In fact, both $\rho_1(a)$ and $\rho_3(a)$ appear to be positive for all positive values of $a$ (rather than only for small $a > 0$), but this may be difficult to prove.
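As a rough numerical illustration (not part of the original analysis), the following Python sketch checks two of the period averages that produce the Bessel functions in (3.28)–(3.30), checks the identity $\rho_3 = (1 - J_0(2a))\rho_1$, and evaluates $\rho_1$ and $\rho_3$ on a grid. The helper names `bessel_j` and `periodic_average`, the test values of $a$ and $\theta$, and the grid $a \in (0, 5]$ are assumptions made for this sketch. A grid evaluation is only suggestive and does not replace a proof.

```python
import math

def bessel_j(n, x, samples=400):
    """Bessel function J_n(x) from Bessel's integral, midpoint rule on [0, pi]."""
    step = math.pi / samples
    total = 0.0
    for k in range(samples):
        t = (k + 0.5) * step
        total += math.cos(n * t - x * math.sin(t))
    return total / samples

def periodic_average(g, samples=2000):
    """Average of g(tau) over one period [0, 2*pi)."""
    step = 2.0 * math.pi / samples
    return sum(g(k * step) for k in range(samples)) / samples

def rho1(a):
    return 1.0 + bessel_j(0, 2.0 * a) - 2.0 * bessel_j(0, a) ** 2

def rho3(a):
    j0, j0_2a = bessel_j(0, a), bessel_j(0, 2.0 * a)
    return -2.0 * j0 ** 2 + 2.0 * j0_2a * j0 ** 2 - j0_2a ** 2 + 1.0

a, theta = 1.8, 0.7  # illustrative values chosen for this sketch
# average of cos(theta + a sin(tau)) should equal J0(a) cos(theta)
avg_cos = periodic_average(lambda t: math.cos(theta + a * math.sin(t)))
# average of cos(theta + a sin(tau)) sin(tau) should equal -J1(a) sin(theta)
avg_mix = periodic_average(lambda t: math.cos(theta + a * math.sin(t)) * math.sin(t))
print(avg_cos - bessel_j(0, a) * math.cos(theta))
print(avg_mix + bessel_j(1, a) * math.sin(theta))

grid = [0.1 * k for k in range(1, 51)]  # a in (0, 5]
identity_gap = max(abs(rho3(x) - (1.0 - bessel_j(0, 2.0 * x)) * rho1(x)) for x in grid)
print(identity_gap, min(rho1(x) for x in grid), min(rho3(x) for x in grid))
```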
Due to the transformation (3.22), the four equilibria (3.31)–(3.34) can only be related back to the original system if $\tilde r_c^{\mathrm{ave}}$ is real and positive. It should be noted that $\tilde r_c^{\mathrm{ave,eq1}}$ and $\tilde r_c^{\mathrm{ave,eq2}}$ cannot simultaneously be positive (note that $V_c$ can be either positive or negative), and also that $\tilde r_c^{\mathrm{ave,eq3}}$ and $\tilde r_c^{\mathrm{ave,eq4}}$ are real only when $\gamma_1 > 0$. In the next two sections we will show stability of the four average equilibria (not all of them simultaneously) for different values of the speed bias parameter $V_c$, and infer the appropriate convergence properties for the non-average system (3.24)–(3.27).
Each of the four average equilibria (3.31)–(3.34) represents a ring around the
source. However, more interesting information is obtained when considering the
average values of θ. With equilibrium 1 the vehicle points away from the source,
with equilibrium 2 it points directly towards the source, and with equilibria 3 and
4 the vehicle points, on the average, outwards relative to the ring, revolving around
the source in the counterclockwise direction for equilibrium 3 and in the clockwise
direction for equilibrium 4.
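The first equilibrium can be checked numerically. The Python sketch below evaluates the right-hand sides of (3.28)–(3.30), without the common factor $1/\omega$, at the radius and heading given in (3.31). Two assumptions are made for illustration only: the parameter values are those quoted later for the simulation of Figure 3.4, and the equilibrium value of the filter state is obtained by setting (3.30) to zero rather than by evaluating (3.37). The function names are hypothetical.

```python
import math

def bessel_j(n, x, samples=400):
    """Bessel function J_n(x) from Bessel's integral, midpoint rule on [0, pi]."""
    step = math.pi / samples
    return sum(math.cos(n * (k + 0.5) * step - x * math.sin((k + 0.5) * step))
               for k in range(samples)) / samples

def averaged_rhs(r, theta, e, p):
    """Right-hand sides of (3.28)-(3.30) without the common 1/omega factor."""
    j0, j1 = bessel_j(0, p["a"]), bessel_j(1, p["a"])
    j0_2a = bessel_j(0, 2.0 * p["a"])
    b, c, h, q, R, Vc = p["b"], p["c"], p["h"], p["q"], p["R"], p["Vc"]
    dr = (b * j0 * (q * r ** 2 + e) * math.cos(theta)
          - b * q * R * r * (1.0 + j0_2a * math.cos(2.0 * theta))
          - Vc * j0 * math.cos(theta))
    dtheta = (-q * (2.0 * c * R * j1 + b * j0) * r * math.sin(theta)
              + b * q * R * j0_2a * math.sin(2.0 * theta)
              + (Vc * j0 - b * j0 * e) / r * math.sin(theta))
    de = -h * ((q * r ** 2 + e) - 2.0 * q * R * j0 * r * math.cos(theta))
    return dr, dtheta, de

# Assumed parameter values (those quoted for the simulation of Figure 3.4, q_r = 1)
p = {"a": 1.8, "b": 4.0, "c": 80.0, "h": 2.0, "q": 1.0, "R": 0.1, "Vc": 0.005}
j0, j0_2a = bessel_j(0, p["a"]), bessel_j(0, 2.0 * p["a"])
rho1 = 1.0 + j0_2a - 2.0 * j0 ** 2
r_eq = p["Vc"] * j0 / (p["b"] * p["q"] * p["R"] * rho1)  # radius in (3.31)
theta_eq = math.pi                                       # heading in (3.31)
# Assumption of this sketch: e_eq solves (3.30) = 0 at (r_eq, theta_eq)
e_eq = -p["q"] * r_eq ** 2 + 2.0 * p["q"] * p["R"] * j0 * r_eq * math.cos(theta_eq)
residuals = averaged_rhs(r_eq, theta_eq, e_eq, p)
print(r_eq, residuals)
```

Residuals at the level of rounding error indicate that the radius and heading in (3.31) annihilate the averaged dynamics for this parameter set.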
3.4 Stability for Small Positive or Negative Vc
In this section we analyze the stability properties of the system shown in Figure 3.2
when the parameter Vc is small but not zero.
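Theorem 3.1 Consider the system in Figure 3.2 with nonlinear map (3.5) that has a maximum ($q_r > 0$). Let the parameters $c, b, R, h$ be chosen as positive. Let the parameter $a$ be chosen so that $J_0(a), J_0(2a), J_1(a), 1 + J_0(2a) - 2J_0^2(a) > 0$. Let the parameter $V_c$ be nonzero and such that either
\begin{equation}
V_c \in (V_c^{\mathrm{lower}}, 0), \quad \text{where } V_c^{\mathrm{lower}} \triangleq -\frac{bq_rR(1 + J_0(2a)) + h}{2J_0^2(a)}R\rho_1, \tag{3.40}
\end{equation}
or
\begin{equation}
V_c \in (0, V_c^{\mathrm{upper}}), \quad \text{where } V_c^{\mathrm{upper}} \triangleq \frac{b^2q_rR\rho_3}{2cJ_1(a)J_0(a)}. \tag{3.41}
\end{equation}
There exist constants $\omega^* > 0$ and $\delta > 0$ such that, for all $\omega > \omega^*$, if the initial conditions $r_c(0), \theta(0), e(0)$ are such that the following quantities are sufficiently small,
\begin{align}
\left|\,|r_c(0) - r^*| - \frac{|V_c|J_0(a)}{bq_rR\rho_1}\right| &< \delta R \tag{3.42}\\
|\theta(0) - \arg(r_c(0) - r^*) - n\pi| &< \delta a, \quad n \in \mathbb{N} \tag{3.43}\\
\left|e(0) - q_rR^2 - e_{12}\right| &< \delta q_rR^2, \tag{3.44}
\end{align}
then the trajectory of the vehicle center $r_c(t)$ locally exponentially converges to, and remains in, the ring
\begin{equation}
\frac{|V_c|J_0(a)}{bq_rR\rho_1} - O(1/\omega) \le \|r_c - r^*\| \le \frac{|V_c|J_0(a)}{bq_rR\rho_1} + O(1/\omega). \tag{3.45}
\end{equation}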
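Proof: The Jacobian of the average system (3.28)–(3.30) at the equilibria (3.31) and (3.32) is (at both equilibria) given by
\begin{equation}
A^{\mathrm{eq1}} = \frac{1}{\omega}\begin{bmatrix}
-\dfrac{2V_cJ_0^2(a)}{R\rho_1} - bq_rR(1 + J_0(2a)) & 0 & -bJ_0(a)\\
0 & \eta & 0\\
-2hJ_0(a)\left(q_rR + \dfrac{V_c}{bR\rho_1}\right) & 0 & -h
\end{bmatrix} \tag{3.46}
\end{equation}
where
\begin{equation}
\eta = \frac{2cJ_1(a)J_0(a)}{b\rho_1}V_c - \frac{bq_rR\rho_3}{\rho_1}. \tag{3.47}
\end{equation}
By applying a similarity transformation with the matrix
\begin{equation}
T = \begin{bmatrix} 1 & 0 & 0\\ 0 & 0 & 1\\ 0 & 1 & 0 \end{bmatrix}, \tag{3.48}
\end{equation}
we convert the Jacobian (3.46) into the block diagonal matrix
\begin{equation}
\mathrm{diag}\left\{\frac{1}{\omega}\begin{bmatrix}
-\dfrac{2V_cJ_0^2(a)}{R\rho_1} - bq_rR(1 + J_0(2a)) & -bJ_0(a)\\
-2hJ_0(a)\left(q_rR + \dfrac{V_c}{bR\rho_1}\right) & -h
\end{bmatrix}, \ \frac{\eta}{\omega}\right\}. \tag{3.49}
\end{equation}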
The characteristic equation for this Jacobian is the combination of the characteristic equations of the two blocks, which is
\begin{align}
(\omega s)^2 + \zeta(\omega s) + hbq_rR\rho_1 &= 0 \tag{3.50}\\
\omega s - \eta &= 0, \tag{3.51}
\end{align}
where
\begin{equation}
\zeta = \frac{2J_0^2(a)V_c}{R\rho_1} + bq_rR(1 + J_0(2a)) + h. \tag{3.52}
\end{equation}
According to the Routh-Hurwitz criterion, to guarantee that the roots of the polynomial have negative real parts, each coefficient must be greater than zero. Hence, we need $\eta < 0$ in (3.47) and $\zeta > 0$ in (3.52). Both of these conditions are satisfied under either condition (3.40) or (3.41) of Theorem 3.1. By applying Theorem 10.4 from [36] to this result, we conclude that the error system (3.24)–(3.27) has two distinct, exponentially stable periodic solutions within $O(1/\omega)$ of the equilibria (3.31) and (3.32), which proves that the vehicle center $r_c$ converges to the annulus (3.45) around the source $r^*$.
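The two conditions used in the proof are easy to evaluate for a given parameter set. The following Python sketch computes $\eta$ from (3.47), $\zeta$ from (3.52), the constant term of (3.50), and the two speed-bias bounds $V_c^{\mathrm{lower}}$ and $V_c^{\mathrm{upper}}$. The function name `theorem_3_1_quantities` is hypothetical, and the parameter values are assumed to be those quoted for the simulation of Figure 3.4 with $q_r = 1$.

```python
import math

def bessel_j(n, x, samples=400):
    """Bessel function J_n(x) from Bessel's integral, midpoint rule on [0, pi]."""
    step = math.pi / samples
    return sum(math.cos(n * (k + 0.5) * step - x * math.sin((k + 0.5) * step))
               for k in range(samples)) / samples

def theorem_3_1_quantities(a, b, c, h, q, R, Vc):
    """Evaluate eta (3.47), zeta (3.52) and the speed-bias bounds."""
    j0, j1, j0_2a = bessel_j(0, a), bessel_j(1, a), bessel_j(0, 2.0 * a)
    rho1 = 1.0 + j0_2a - 2.0 * j0 ** 2
    rho3 = (1.0 - j0_2a) * rho1
    return {
        "eta": 2.0 * c * j1 * j0 * Vc / (b * rho1) - b * q * R * rho3 / rho1,
        "zeta": 2.0 * j0 ** 2 * Vc / (R * rho1) + b * q * R * (1.0 + j0_2a) + h,
        "const": h * b * q * R * rho1,  # constant term of (3.50)
        "V_lower": -(b * q * R * (1.0 + j0_2a) + h) * R * rho1 / (2.0 * j0 ** 2),
        "V_upper": b ** 2 * q * R * rho3 / (2.0 * c * j1 * j0),
    }

# Assumed values: the ES parameters quoted for Figure 3.4, with q_r = 1
vals = theorem_3_1_quantities(a=1.8, b=4.0, c=80.0, h=2.0, q=1.0, R=0.1, Vc=0.005)
# Routh-Hurwitz for the quadratic block (3.50) and the scalar block (3.51)
stable = vals["eta"] < 0 and vals["zeta"] > 0 and vals["const"] > 0
print(vals, stable)
```

Changing `Vc` in the call shows how $\eta$ changes sign at $V_c^{\mathrm{upper}}$ and $\zeta$ at $V_c^{\mathrm{lower}}$.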
Simulation: Figure 3.4 shows the simulation with the map parameters r∗ =
(0, 0), qr = 1 and vehicle initial conditions of r0 = (1, 1) and θ0 = −π/2. The
ES parameters are chosen as ω = 20, a = 1.8, R = 0.1, c = 80, b = 4, h = 2, and
Vc = 0.005, which satisfies (3.41). Figures 3.4 (a), (b), and (c) show that the error
variables converge very near the theoretical equilibrium values. Figure 3.4 (d) shows
the trajectory of the vehicle in the signal field. It appears from Figure 3.4 (d) as if
the vehicle comes to a full stop. This is actually not the case, as we note from the
zoom frame in Figure 3.4 (c), and as we further explain in Remark 3.1.
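For readers who wish to reproduce a trajectory of this kind, the closed loop (3.7)–(3.11) can be integrated directly. The Python sketch below is a minimal illustration under stated assumptions, not the code used to produce Figure 3.4: it uses a fixed-step fourth-order Runge-Kutta scheme, represents positions as complex numbers, and initializes the filter state so that $\xi(0) = 0$ (the initial filter state is not given in the text). The names `closed_loop_rhs`, `rk4_step`, and `simulate` are hypothetical.

```python
import cmath
import math

def closed_loop_rhs(t, state, p):
    """Right-hand side of (3.7)-(3.9) with xi from (3.10) and r_s from (3.11)."""
    rc, theta, e = state
    rs = rc + p["R"] * cmath.exp(1j * theta)              # sensor position (3.11)
    xi = -(p["q"] * abs(rs - p["r_star"]) ** 2 + e)       # washout output (3.10)
    drc = (p["Vc"] + p["b"] * xi) * cmath.exp(1j * theta)  # (3.7)
    dtheta = (p["a"] * p["omega"] * math.cos(p["omega"] * t)
              + p["c"] * xi * math.sin(p["omega"] * t))    # (3.8)
    de = p["h"] * xi                                       # (3.9)
    return (drc, dtheta, de)

def rk4_step(t, state, dt, p):
    """One classical Runge-Kutta step for the tuple-valued state."""
    def shifted(s, k, w):
        return tuple(si + w * ki for si, ki in zip(s, k))
    k1 = closed_loop_rhs(t, state, p)
    k2 = closed_loop_rhs(t + dt / 2, shifted(state, k1, dt / 2), p)
    k3 = closed_loop_rhs(t + dt / 2, shifted(state, k2, dt / 2), p)
    k4 = closed_loop_rhs(t + dt, shifted(state, k3, dt), p)
    return tuple(s + dt / 6 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

def simulate(p, rc0, theta0, t_end, dt=1e-3):
    """Return the list of states (r_c, theta, e) on a uniform time grid."""
    rs0 = rc0 + p["R"] * cmath.exp(1j * theta0)
    e0 = -p["q"] * abs(rs0 - p["r_star"]) ** 2  # assumption: xi(0) = 0
    state = (rc0, theta0, e0)
    trajectory = [state]
    for k in range(int(round(t_end / dt))):
        state = rk4_step(k * dt, state, dt, p)
        trajectory.append(state)
    return trajectory

# Map and ES parameters as quoted in the text for Figure 3.4
p = {"r_star": 0j, "q": 1.0, "omega": 20.0, "a": 1.8, "R": 0.1,
     "c": 80.0, "b": 4.0, "h": 2.0, "Vc": 0.005}
short_run = simulate(p, rc0=1 + 1j, theta0=-math.pi / 2, t_end=0.01)
print(len(short_run), short_run[-1])
```

Calling `simulate(p, 1 + 1j, -math.pi / 2, 30.0)` and plotting `abs(state[0] - p["r_star"])` against time gives a curve that can be compared qualitatively with Figure 3.4 (a); the result depends on the assumed initial filter state and step size.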
Figure 3.5 shows the main difference between the small positive and negative
Vc with the map parameters, initial conditions, and ES parameters chosen to be
the same as the simulation in Figure 3.4 for both vehicles except for the parameter
Vc, which was set to +0.02 for one and −0.02 for the other. While with Vc > 0
the vehicle heading converges to a value pointing directly away from the source,
as predicted by the average equilibrium for the heading in (3.31), with Vc < 0
the vehicle heading converges to a value pointing directly towards the source, as
predicted by the average equilibrium for the heading in (3.32).
The abilities of this extremum seeking scheme on a non-quadratic function can be seen in Figure 3.6, where the vehicle converges to the maximum with the unknown map being a Rosenbrock function. The Rosenbrock function is characterized by an extremely deep valley along the parabola $x^2 = y$ that leads to the global minimum and is often used to test the abilities of an optimization scheme [48]. The Rosenbrock function used in Figure 3.6 has a maximum at (1, 1) with the following form
\begin{equation}
f(r_s) = -\frac{1}{2}(1 - x_s)^2 - (y_s - x_s^2)^2, \tag{3.53}
\end{equation}
where $x_s = \mathrm{Re}(r_s)$ and $y_s = \mathrm{Im}(r_s)$. The vehicle is given the starting position $r_0 = (-0.5, -0.5)$ and heading $\theta_0 = \pi$. The ES parameters are chosen as $\omega = 20$, $a = 1.8$, $R = 0.1$, $c = 80$, $b = 5$, $h = 1$, and $V_c = -0.005$.

Figure 3.4: Simulation results for steering-based unicycle source seeking with forward speed regulation: (a), (b), (c) showing the evolution of the variables $\tilde r_c$, $\tilde\theta$, and $V_c + b\xi$, respectively, and (d) showing the trajectory of the vehicle. Panel titles: (a) Absolute Distance from the Source, (b) Relative Angle between Vehicle and Source, (c) Forward speed, (d) Vehicle Trajectory.
Remark 3.1: The vehicle does not come to a full stop, as evident from Figure
3.4 (c), even though it slows down nearly to a stop due to a very small Vc = 0.005.
However, unlike in [11], the vehicle, after entering the annulus, does not revolve
around the source. It points, on the average, towards or away from the source, depending
on the sign of Vc. The vehicle’s angular velocity and forward speed oscillate
but the vehicle does not drift clockwise or counter-clockwise in the annulus. While
this fact is evident from the simulations, unfortunately it cannot be proved. This
is because only the relative heading with respect to the source has an exponentially
stable equilibrium. The absolute heading, after averaging the θ-system (3.25), has a
continuum of equilibria, but none of them are exponentially stable, which precludes
the possibility of proving, using the averaging method, that no drift occurs.
Similar to [11], the vehicle converges to an annulus around the source with a
radius proportional to Vc. From (3.49) we see that when h is large the decay rate
in the radial state rc of the vehicle is a function of two terms, one with Vc and the
other with b, unlike [11], where the convergence rate depends only on Vc, and where a
trade-off between the annulus size and convergence speed exists (faster convergence
implies a larger annulus, because the vehicle has constant speed). In the present
design we can choose Vc ≪ b and achieve fast convergence to a small annulus around
the source. With the choice of small Vc the vehicle comes almost to a stop, as shown
in Figure 3.4.
The linearization step fails when Vc = 0, due to the singularity at rc = 0 in
(3.25). For this reason, nothing can be said about the system behavior even though
Vc = 0 verifies the Routh-Hurwitz criterion. The singularity at rc = 0 also manifests
itself in the average equilibria (3.31) and (3.32), where rc = 0 at both equilibria,
but the heading has a non-unique value (θ = π or θ = 0).
3.5 Stability for Medium and Large Positive Vc
For medium or large values of Vc the vehicle converges to the average equilibria
3 and 4, namely to an annulus within which the vehicle revolves around the source,
similar to the vehicle trajectories produced by the algorithm in [11]. However, as
we shall see, an interesting difference relative to [11] arises thanks to the fact that
forward speed is not constant, which allows the vehicle to revolve around the source
with non-tangential average heading.
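Theorem 3.2 Consider the system in Figure 3.2 with nonlinear map (3.5) that has a maximum ($q_r > 0$). Let the parameters $c, b, R, h$ be chosen as positive. Let the parameter $a$ be chosen so that $J_0(a), J_0(2a), J_1(a), 1 + J_0(2a) - 2J_0^2(a) > 0$. Let
\begin{equation}
V_c > V_c^{\mathrm{upper}}, \tag{3.54}
\end{equation}
where $V_c^{\mathrm{upper}}$ is defined in (3.41). There exist constants $\omega^* > 0$ and $\delta > 0$ such that, for all $\omega > \omega^*$, if the initial conditions $r_c(0), \theta(0), e(0)$ are such that
\begin{align}
\left|\,|r_c(0) - r^*| - \frac{\sqrt{\gamma_1}}{\sqrt{2}\,cJ_1(a)}\right| &< \delta R \tag{3.55}\\
|\theta(0) - \arg(r_c(0) - r^*) - (2n + 1)\pi \pm \mu_0| &< \delta a, \quad n \in \mathbb{N} \tag{3.56}\\
\left|e(0) - q_rR^2 - e_{34}\right| &< \delta q_rR^2, \tag{3.57}
\end{align}
where $\tilde\theta^{\mathrm{ave,eq3\,or\,4}}$ and $\hat e^{\mathrm{ave,eq3\,or\,4}}$ are from the equilibria (3.33) and (3.34), then the trajectory of the vehicle center $r_c(t)$ locally exponentially converges to, and remains in, the annulus
\begin{equation}
\frac{\sqrt{\gamma_1}}{\sqrt{2}\,cJ_1(a)} - O(1/\omega) \le |r_c - r^*| \le \frac{\sqrt{\gamma_1}}{\sqrt{2}\,cJ_1(a)} + O(1/\omega). \tag{3.58}
\end{equation}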
Proof: We first note that condition (3.54) ensures that $\gamma_2 > 0$. We also note that the statement of the theorem relies on $\gamma_1$ being positive, since it appears under the square root. To see that $\gamma_1$ is indeed positive, we express it as
\begin{equation}
\gamma_1 = \frac{\gamma_2}{2} + b^2q_rR\left(\frac{\rho_3}{2} + \rho_2\right), \tag{3.59}
\end{equation}
where
\begin{equation}
\frac{\rho_3}{2} + \rho_2 = \frac{1}{2}(1 - J_0(2a))^2 \ge 0, \quad \forall a. \tag{3.60}
\end{equation}
Since $\gamma_2 > 0$, it follows that $\gamma_1 > 0$ and thus that the average equilibria (3.33) and (3.34) are well defined.

Figure 3.5: The difference in trajectories for small positive and negative $V_c$. The two cases yield convergence to the average equilibria (3.31) and (3.32), respectively. For $V_c < 0$ the vehicle points towards the source at the end of the transient, whereas for $V_c > 0$ the vehicle points away from the source at the end of the transient.

Figure 3.6: Simulation result of vehicle trajectory using steering-based source seeking and forward speed regulation on a Rosenbrock function (the white shading represents the maximum).
As done in the proof of Theorem 3.1, we can calculate the Jacobians for equilibria (3.33) and (3.34), which happen to be the same matrix at both equilibria. Due to the complicated form of the Jacobian matrix, we do not show the matrix and instead just show its characteristic polynomial:
\begin{align}
0 = {}& (\omega s)^3 + \left(Rbq_r(1 + J_0(2a)) + \frac{b^2q_rJ_0(a)}{cJ_1(a)}(1 - J_0(2a)) + h\right)(\omega s)^2 \notag\\
& + \left(\left(2q_rR + \frac{bq_rJ_0(a)}{cJ_1(a)}\right)\gamma_2 + Rbq_rh\rho_1\right)(\omega s) + 2Rq_rh\gamma_2. \tag{3.61}
\end{align}
According to the Routh-Hurwitz criterion, to guarantee that the roots of the polynomial have negative real parts, each coefficient must be greater than zero and the product of the $s^2$ and $s^1$ coefficients must be greater than the $s^0$ coefficient. The product of the $s^2$ and $s^1$ coefficients minus the $s^0$ coefficient is
\begin{align}
& bq_r^2\left(\left(2q_rR + \frac{bq_rJ_0(a)}{cJ_1(a)}\right)\gamma_2 + Rbq_rh\rho_1\right)\left(R(1 + J_0(2a)) + \frac{bJ_0(a)}{cJ_1(a)}(1 - J_0(2a))\right) \notag\\
& + q_rh\left(\frac{bJ_0(a)}{cJ_1(a)}\gamma_2 + Rb\rho_1\right). \tag{3.62}
\end{align}
With the condition (3.54), the Routh-Hurwitz criterion is satisfied and therefore the Jacobian for the equilibria (3.33) and (3.34) is Hurwitz. By applying Theorem 10.4 from [36] to this result, we conclude that the error system (3.24)–(3.27) has two distinct, exponentially stable periodic solutions within $O(1/\omega)$ of the equilibria (3.33) and (3.34), which proves that the vehicle center $r_c$ converges to the annulus (3.58) around the source $r^*$.
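The Routh-Hurwitz test for the cubic (3.61) can also be carried out numerically. The Python sketch below builds the three coefficients of (3.61) in the variable $\omega s$ and applies the third-order Routh-Hurwitz conditions directly, rather than evaluating (3.62). The function names are hypothetical, and the parameter values are assumed to be those quoted for the simulations ($a = 1.8$, $b = 4$, $c = 80$, $h = 2$, $q_r = 1$, $R = 0.1$).

```python
import math

def bessel_j(n, x, samples=400):
    """Bessel function J_n(x) from Bessel's integral, midpoint rule on [0, pi]."""
    step = math.pi / samples
    return sum(math.cos(n * (k + 0.5) * step - x * math.sin((k + 0.5) * step))
               for k in range(samples)) / samples

def cubic_coefficients(a, b, c, h, q, R, Vc):
    """Coefficients (a2, a1, a0) of (3.61) in the variable omega*s."""
    j0, j1, j0_2a = bessel_j(0, a), bessel_j(1, a), bessel_j(0, 2.0 * a)
    rho1 = 1.0 + j0_2a - 2.0 * j0 ** 2
    rho3 = (1.0 - j0_2a) * rho1
    gamma2 = 2.0 * c * j1 * j0 * Vc - b ** 2 * q * R * rho3
    a2 = R * b * q * (1.0 + j0_2a) + b ** 2 * q * j0 / (c * j1) * (1.0 - j0_2a) + h
    a1 = (2.0 * q * R + b * q * j0 / (c * j1)) * gamma2 + R * b * q * h * rho1
    a0 = 2.0 * R * q * h * gamma2
    return a2, a1, a0

def is_hurwitz_cubic(a2, a1, a0):
    """Routh-Hurwitz test for s^3 + a2 s^2 + a1 s + a0."""
    return a2 > 0 and a1 > 0 and a0 > 0 and a2 * a1 > a0

params = dict(a=1.8, b=4.0, c=80.0, h=2.0, q=1.0, R=0.1)  # assumed values
large_bias = is_hurwitz_cubic(*cubic_coefficients(Vc=1.0, **params))
small_bias = is_hurwitz_cubic(*cubic_coefficients(Vc=0.01, **params))
print(large_bias, small_bias)
```

For a speed bias below $V_c^{\mathrm{upper}}$ the quantity $\gamma_2$ is negative, so the constant coefficient of (3.61) is negative and the test fails, which is consistent with condition (3.54).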
Simulation: On the approach towards the source, the vehicle trajectory with $V_c > V_c^{\mathrm{upper}}$ is very similar to the trajectory for $V_c \in (V_c^{\mathrm{lower}}, V_c^{\mathrm{upper}})$. However, as the vehicle for $V_c > V_c^{\mathrm{upper}}$ gets close to the source, it begins to encircle the source clockwise or counterclockwise, depending on the initial conditions. Figure 3.7 shows the simulation for $V_c > V_c^{\mathrm{upper}}$, with two different initial conditions, one that converges to the average equilibrium (3.33) and the other that converges to the average equilibrium (3.34). The simulations in Figure 3.7 were done with map parameters and ES parameters chosen to be the same as in the simulation in Figure 3.4 except for $V_c = 1$, which satisfies (3.54).
Figure 3.8 shows a simulation of three vehicles with three different values for $V_c$. The simulations in Figure 3.8 were done with map parameters and ES parameters chosen to be the same as in the simulation in Figure 3.4 except for $V_c$. The three values of $V_c$ were chosen as $1.001 \times V_c^{\mathrm{upper}}$, $10 \times V_c^{\mathrm{upper}}$, and $100 \times V_c^{\mathrm{upper}}$ to show that the vehicle's average heading ranges from directly away from the source for $V_c$ slightly larger than $V_c^{\mathrm{upper}}$ to almost tangential to the ring for $V_c \gg V_c^{\mathrm{upper}}$. Note that this behavior is explained by (3.36) and how it relates to $V_c^{\mathrm{upper}}$.
3.6 Conclusion
We have proposed a modification of the nonholonomic source seeking algorithm
in [11], with a regulation of the vehicle forward speed which allows the vehicle to
slow down as it gets close to the source. We have proved the convergence to a
neighborhood of the source in three cases, identifying three classes of attractors:
• $V_c \in (V_c^{\mathrm{lower}}, 0)$: the vehicle points, on the average, directly towards the source, and does not drift around the ring. This is a continuum of attractors, parametrized by the position on the ring.
• $V_c \in (0, V_c^{\mathrm{upper}})$: the vehicle points, on the average, directly away from the source, and does not drift around the ring. This is a continuum of attractors, parametrized by the position on the ring.
• $V_c > V_c^{\mathrm{upper}}$: the vehicle revolves around the source in the clockwise or counterclockwise direction, depending on the initial condition. The vehicle's average heading ranges from slightly outward relative to the ring (for $V_c \gg V_c^{\mathrm{upper}}$) to almost directly away from the source (for $V_c$ only slightly larger than $V_c^{\mathrm{upper}}$).
While our new strategy is not applicable to fixed-wing aircraft, it is applicable to mobile robots, marine vehicles, and rotorcraft. Of the three ranges for the speed bias parameter $V_c$, namely, $V_c \in (V_c^{\mathrm{lower}}, 0)$, $V_c \in (0, V_c^{\mathrm{upper}})$, and $V_c > V_c^{\mathrm{upper}}$, from the point of view of asymptotic performance, the negative range $V_c \in (V_c^{\mathrm{lower}}, 0)$ seems preferable, because the vehicle virtually stops near the source and because it points directly towards the source on average.

Figure 3.7: Two trajectories of the same vehicle, with the only difference being the initial condition in $\theta$. The vehicle converges to two different average equilibria, (3.33) and (3.34). (a) shows the evolution of the relative angle between the vehicle heading and the source, with $\mu_0 \approx \pi/3$. (b) shows the trajectory of the vehicles.

Figure 3.8: Three trajectories of the same vehicle, with the only difference being the value of $V_c$. The vehicle converges to three different trajectories that encircle the source. (a) shows the evolution of the relative angle between the vehicle heading and the source, with $\mu_0 \approx 0$ when $V_c$ is close to $V_c^{\mathrm{upper}}$ and $\mu_0 \approx \pi/2$ when $V_c \gg V_c^{\mathrm{upper}}$. (b) shows the trajectory of the vehicles.
This chapter is in full a reprint of the material as it has been submitted to: N. Ghods and M. Krstic, “Speed regulation in steering-based source seeking,” Automatica, vol. 46, pp. 452–459, 2010.
The dissertation author was the primary investigator and author of this paper.
4 Multi-Agent Deployment Over a Source
We consider the problem of deploying a group of autonomous vehicles (agents)
in a formation which has higher density near the source of a measurable signal
and lower density away from the source. The spatial distribution of the signal and
the location of the source are unknown but the signal is known to decay with the
distance from the source. The vehicles do not have the capability of sensing their
own positions but they are capable of sensing the relative position between them
and their neighbors. We design a control algorithm based on a combination of two
components. One component of the control law is inspired by the heat PDE and it
results in the agents deploying between two anchor agents. The other component of
the control law is based on extremum seeking and it achieves higher vehicle density
around the source. Using averaging theory for PDEs we prove that the vehicle
density will be highest around the source. We also quantify the density function
of the agents’ deployment position. By discretizing the model with respect to the
continuous agent index, we obtain decentralized control laws for discrete agents and
illustrate the theoretical results with simulations.
4.1 Introduction
Extremum seeking has proved to be a powerful tool in real-time non-model based
control and optimization for single unmanned autonomous vehicles [59, 60, 11, 13,
39]. In recent years, extremum seeking has also been used for groups of unmanned
autonomous vehicles in a network with each vehicle having limited local information
[51, 49].
We consider the task of seeking the maximum of a signal field while simultaneously
achieving a formation distribution which has higher density around the
areas with higher signal strength. We combine the method of extremum-seeking
with diffusion feedback to have a group of vehicles complete the task of formation
deployment and source-seeking.
With the new method we explore two different types of control for the agents on
the boundary, which we refer to as anchors: (1) the case of free anchors and (2) the
case of fixed anchors. The free anchor case allows the agents on the edge of the for-
mation to freely move, whereas the fixed anchors case has stationary anchor agents
that start at a desired location. Different deployment distributions are achieved in
the two cases.
The diffusion-based feedback enables the overall multi-agent formation to act
as a net of source seekers, rather than as a group of independent, uncoordinated
seekers, who intrude upon each others’ space. With the free anchors the user casts
the net in a manner to prompt attraction towards the source and spread around
the source. In the fixed anchor case the ends of the net are fixed and the agents in
between distribute such that they have the highest density near the source.
In the present paper we consider only the one-dimensional problem. The two-
dimensional coordinated source seeking problem allows a much broader array of
problem formulations, depending on various possible formation topologies. For this
reason, we focus on the 1D situation to introduce the design ideas and analysis
techniques.
The motivation for using the diffusion/heat PDE is that the diffusion action
induces each agent to take a position half way between his two neighbors. By
combining diffusion with extremum seeking one obtains a swarm of agents where
each agent is driven by two competing strategies, extremum seeking which aims to
place all the agents at the extremum, and diffusion which aims to spread the agents
evenly, provided the anchors are apart. The overall result of these two effects is
that the agents are deployed more densely near the extremum than away from the
extremum. We quantify this density in the paper.
The problem of understanding when the individual actions of interacting agents
give rise to a coordinated behavior has received considerable attention in many fields.
In the control community, the interest in coordination phenomena has been recently
promoted by the need of controlling groups of unmanned autonomous vehicles. A
basic, fairly simplified setup considers a group of n mobile agents, each one described
by a dynamic system capturing the evolution of its heading angle [31] or its position
and velocity [55]. When agents interact with a limited number of neighbors, one
faces the problem of designing a decentralized control scheme (where each agent
uses only the neighbors’ information) in order to orchestrate the collective behavior.
Decentralization implies that the control action can be computed in a distributed
fashion.
A method often used to design and analyze a decentralized controller for a group
of agents is to treat the agents as a continuum. Relations between distributed
consensus algorithms and the heat equation are made in [18]. In [37], agents use
model reference adaptive control laws to track desired trajectories, using either the
heat equation or the wave equation as reference models. Boundary control of PDEs
was used to deploy vehicles into planar curves in [23]. A continuum model for
a swarm of vehicles is formulated using a vehicle density function in [32]. In [9]
deployment on a line segment is achieved by using feedback laws consistent with the
spatially discretized heat equation.
Multi-agent and GPS-enabled source seeking problems have been solved in [43,
47]. A hybrid strategy for solving the source seeking problem was developed in [41].
In [33, 63, 14] the problem proposed in this paper is treated as a GPS-enabled
game problem where each agent is trying to maximize its own cost function, but in
these algorithms the agents also require the cost information of their neighbors.
Section 4.2 presents a description of the vehicle model and the control scheme
for both free and fixed anchor cases. In Sections 4.3 and 4.4 we prove local exponential convergence
to an equilibrium whose density function has its maximum around
the source. Section 4.3 deals with the case of free anchors,
whereas Section 4.4 deals with fixed anchors. Simulation results in Section 4.5
illustrate the distinct behavior exhibited using free and fixed control for the anchor
agents with and without independent parameters for each agent.
4.2 Control Design
We consider vehicles modeled as velocity-actuated point masses
\begin{equation}
x_t = v \tag{4.1}
\end{equation}
where $x$ is the vector of positions of the point masses, and $v$ is the vector of the vehicles' velocity inputs. It is common to consider the heat equation
\begin{equation}
x_t(\alpha, t) = x_{\alpha\alpha}(\alpha, t) \tag{4.2}
\end{equation}
as a model that governs the position $x(\alpha, t)$ at time $t$ of an agent indexed by $\alpha$ in a large (continuum) group of agents, where each agent is able to sense its nearest neighbors and apply diffusion feedback actuated through the velocity input, namely
\begin{equation}
v(\alpha, t) = x_{\alpha\alpha}(\alpha, t), \tag{4.3}
\end{equation}
with boundary conditions imposed on $x_t(0, t)$ and $x_t(1, t)$. The subscripts denote partial derivatives with respect to the corresponding variables. For simplicity and without loss of generality we choose the spatial domain $\alpha \in [0, 1]$.
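To make the diffusion feedback (4.3) concrete, the following Python sketch discretizes the agent index $\alpha$ with a central difference, so that each interior agent moves towards the midpoint of its two neighbors. It is only an illustration under stated assumptions: the two end agents are held fixed here, time stepping is forward Euler, and the function name `diffusion_step` and all numerical values are hypothetical.

```python
def diffusion_step(x, dt, kappa=1.0):
    """One forward-Euler step of v_i = kappa * (x[i+1] - 2 x[i] + x[i-1]) / d_alpha^2.

    The first and last agents are held fixed (an assumption of this sketch).
    """
    n = len(x)
    d_alpha = 1.0 / (n - 1)  # agent index alpha lives on [0, 1]
    new_x = list(x)
    for i in range(1, n - 1):
        second_difference = (x[i + 1] - 2.0 * x[i] + x[i - 1]) / d_alpha ** 2
        new_x[i] = x[i] + dt * kappa * second_difference
    return new_x

n = 11
x = [-1.0] + [0.0] * (n - 2) + [2.0]  # interior agents start bunched together
for _ in range(2000):
    # dt * kappa / d_alpha^2 = 0.4 <= 0.5 keeps the explicit scheme stable
    x = diffusion_step(x, dt=0.004)
spacing = [x[i + 1] - x[i] for i in range(n - 1)]
print(spacing)
```

With the end agents fixed, diffusion alone spreads the remaining agents evenly between them, which is the behavior that the extremum seeking term later reshapes.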
Extremum-seeking on a single vehicle modeled as a velocity-actuated point mass has been studied in [60]. The control law used in [60] is
\begin{align}
v(t) &= a\omega\cos(\omega t) + c\xi\sin(\omega t) \tag{4.4}\\
\xi &= \frac{s}{s + h}[J], \tag{4.5}
\end{align}
where $J$ is the measurement of the signal field and $a$, $\omega$, $c$, and $h$ are parameters chosen by the designer. The washout filter (4.5) is not required for stability [53], but is used to achieve better performance.
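A minimal simulation sketch of the control law (4.4)–(4.5) for one point-mass vehicle is given below in Python. It is an illustration under stated assumptions rather than the implementation of [60]: the signal field is taken to be quadratic, the washout filter is realized through the state $\eta = \frac{h}{s+h}[J]$ so that $\xi = J - \eta$, the integration uses a fixed-step Runge-Kutta scheme, and all parameter values and names (`es_rhs`, `simulate`) are hypothetical. The field parameters `x_star` and `f_star` are used only to generate measurements and are not available to the controller.

```python
import math

def es_rhs(t, x, eta, p):
    """Vehicle and filter dynamics under the control law (4.4)-(4.5)."""
    J = p["f_star"] - p["q"] * (x - p["x_star"]) ** 2  # assumed quadratic field
    xi = J - eta                                       # washout output s/(s+h)[J]
    dx = (p["a"] * p["omega"] * math.cos(p["omega"] * t)
          + p["c"] * xi * math.sin(p["omega"] * t))    # velocity input (4.4)
    deta = p["h"] * (J - eta)                          # low-pass state h/(s+h)[J]
    return dx, deta

def simulate(p, x0, t_end, dt=1e-3):
    """Integrate with classical RK4 and return the final position."""
    x = x0
    eta = p["f_star"] - p["q"] * (x0 - p["x_star"]) ** 2  # start with xi(0) = 0
    for k in range(int(round(t_end / dt))):
        t = k * dt
        k1 = es_rhs(t, x, eta, p)
        k2 = es_rhs(t + dt / 2, x + dt / 2 * k1[0], eta + dt / 2 * k1[1], p)
        k3 = es_rhs(t + dt / 2, x + dt / 2 * k2[0], eta + dt / 2 * k2[1], p)
        k4 = es_rhs(t + dt, x + dt * k3[0], eta + dt * k3[1], p)
        x += dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        eta += dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
    return x

# Hypothetical parameter values chosen for illustration
p = {"a": 0.2, "omega": 50.0, "c": 5.0, "h": 1.0, "q": 1.0,
     "x_star": 2.0, "f_star": 1.0}
x_final = simulate(p, x0=2.5, t_end=15.0)
print(abs(x_final - p["x_star"]))
```

The vehicle position keeps oscillating with amplitude of order $a$ because of the perturbation $a\sin(\omega t)$, so the final distance to the maximizer is expected to be of that order rather than zero.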
In this paper, given only the measurements of the values of the function $J = f(x)$, we employ a mix of extremum-seeking and nearest-neighbor based diffusion feedback given by
\begin{align}
v(\alpha, t) &= \kappa(\alpha)x_{\alpha\alpha}(\alpha, t) + a(\alpha)\omega\cos(\omega t) + c(\alpha)\xi(\alpha, t)\sin(\omega t) \tag{4.6}\\
\xi(\alpha, t) &= \frac{s}{s + h(\alpha)}[J(\alpha, t)], \tag{4.7}
\end{align}
where the performance can be influenced by the positive parameters $a(\alpha)$, $c(\alpha)$, $\kappa(\alpha)$, $h(\alpha)$, and $\omega$. The parameters can vary with respect to $\alpha$, which allows each vehicle to have different parameters.
For the agents on the boundary (anchor agents) we consider two different types of
control laws. We explore first the case of having the anchors free to move according
to the shape and location of the signal field, and then consider the case where the
user deploys the anchors to desired locations.
The free anchor boundary conditions have the form
\begin{align}
v(0, t) &= -\kappa(0)\nu + a(0)\omega\cos(\omega t) + c(0)\xi(0, t)\sin(\omega t) \tag{4.8}\\
v(1, t) &= \kappa(1)\nu + a(1)\omega\cos(\omega t) + c(1)\xi(1, t)\sin(\omega t), \tag{4.9}
\end{align}
where $\nu$ is a constant velocity which makes the anchors expand outward until the extremum seeking (ES) term is large enough to counteract $\nu$ and stop the expansion of the anchors.
The fixed anchor boundary conditions have the form
\begin{align}
x(0, t) &= \underline{x} \tag{4.10}\\
x(1, t) &= \overline{x}, \tag{4.11}
\end{align}
where $\underline{x}$ and $\overline{x}$ are the desired fixed locations of the boundary agents. The fixed
boundary conditions are used to force the agents in between the anchors (follower
agents) to distribute between the desired locations. The fixed anchors can be virtual
points whose positions are fed to the nearest followers, or the fixed anchors can
represent a physical boundary like a wall that the followers can sense.
With the free anchors there are no restrictions on where the formation will end
up. The deployment range depends primarily on the initial anchor velocities ν. On
the other hand, the fixed anchor case allows the user to pick an area of interest and
have the agents explore all of this area.
We assume that the nonlinear map defining the distribution of the signal field is
quadratic and takes the form
\begin{equation}
J = f(x) = f^* - q(x - x^*)^2, \tag{4.12}
\end{equation}
where x is the position of the vehicle, x∗ is the maximizer, f ∗ = f(x∗) is the
maximum, and q is an unknown positive constant. The assumption of the quadratic
form for the signal field is used to simplify the stability proof.
4.3 Free Anchors
In this section we analyze the convergence properties of the feedback law (4.6)–(4.9). We define an output error variable $e(\alpha, t) = \frac{h(\alpha)}{s + h(\alpha)}[J(\alpha, t)] - f^*$, where $\frac{h(\alpha)}{s + h(\alpha)}$ is a low-pass filter applied to the sensor reading $J$, which allows us to express $\xi(\alpha, t)$, the signal from the washout filter, as $\xi(\alpha, t) = \frac{s}{s + h(\alpha)}[J(\alpha, t)] = J(\alpha, t) - f^* - e(\alpha, t)$, noting also that $e_t(\alpha, t) = h(\alpha)\xi(\alpha, t)$.
To study the vehicle formation in a continuum case we use the formation density function
\begin{equation}
p(x) = \frac{d}{dx}\phi^{-1}(x) = \frac{1}{\phi'(\phi^{-1}(x))} \tag{4.13}
\end{equation}
where $\phi^{-1}(x)$ is the inverse function of the vehicle position $\phi(\alpha)$ and $\phi'$ denotes the derivative with respect to the function's only argument.
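Theorem 4.1 Consider the closed-loop system
\begin{align}
x_t(\alpha, t) &= \kappa(\alpha)x_{\alpha\alpha}(\alpha, t) + a(\alpha)\omega\cos(\omega t) + c(\alpha)\xi(\alpha, t)\sin(\omega t) \tag{4.14}\\
e_t(\alpha, t) &= h(\alpha)\xi(\alpha, t) \tag{4.15}\\
\xi(\alpha, t) &= -q(x(\alpha, t) - x^*)^2 - e(\alpha, t) \tag{4.16}
\end{align}
with the free boundary conditions (B.C.)
\begin{align}
x_t(0, t) &= -\kappa(0)\nu + a(0)\omega\cos(\omega t) + c(0)\xi(0, t)\sin(\omega t) \tag{4.17}\\
x_t(1, t) &= \kappa(1)\nu + a(1)\omega\cos(\omega t) + c(1)\xi(1, t)\sin(\omega t), \tag{4.18}
\end{align}
where $\kappa(\alpha), h(\alpha), a(\alpha), c(\alpha) > 0$ and $a(\alpha), c(\alpha)$ are chosen such that $\frac{d}{d\alpha}\left(a(\alpha)c(\alpha)\right) < \frac{a(\alpha)c(\alpha)}{2}$, $\forall\alpha \in [0, 1]$, $q > 0$, and $\nu \in \mathbb{R}$. There exists $\omega^* > 0$ such that, for all $\omega > \omega^*$, there exists a periodic solution $(x^{2\pi/\omega}(\alpha, t), e^{2\pi/\omega}(\alpha, t))$ of period $2\pi/\omega$ in $t$ and with the property that
\begin{equation}
|x^{2\pi/\omega}(\alpha, t) - x^* - \rho(\alpha)|^2 \le O\left(\frac{1}{\omega} + \max_\alpha a(\alpha)\right) \tag{4.19}
\end{equation}
$\forall\alpha \in [0, 1]$, $t \ge 0$, where
\begin{align}
\rho(\alpha) &= a_0^{\mathrm{free}}e^{\gamma(\alpha)} - a_1^{\mathrm{free}}e^{-\gamma(\alpha)}, \tag{4.20}\\
a_0^{\mathrm{free}} &= \frac{\nu}{e^{\gamma(1)} - e^{-\gamma(1)}}\left(\frac{1}{\lambda^2(1)} + \frac{e^{-\gamma(1)}}{\lambda^2(0)}\right), \tag{4.21}\\
a_1^{\mathrm{free}} &= \frac{\nu}{e^{\gamma(1)} - e^{-\gamma(1)}}\left(\frac{1}{\lambda^2(1)} + \frac{e^{\gamma(1)}}{\lambda^2(0)}\right), \tag{4.22}\\
\gamma(\alpha) &= \int_0^\alpha \lambda(\sigma)\,d\sigma, \text{ and} \tag{4.23}\\
\lambda(\sigma) &= \sqrt{\frac{qc(\sigma)a(\sigma)}{\kappa(\sigma)}} \tag{4.24}
\end{align}
such that whenever the quantities
\begin{align}
& |x(0, 0) - x^* - \rho(0)|^2, \quad \int_0^1 |x(\alpha, 0) - x^* - \rho(\alpha)|^2\,d\alpha, \tag{4.25}\\
& \int_0^1 |x_\alpha(\alpha, 0) - \rho_\alpha(\alpha)|^2\,d\alpha, \text{ and } \int_0^1 \left|e(\alpha, 0) + \frac{qa^2}{2} + q\rho^2(\alpha)\right|^2 d\alpha \tag{4.26}
\end{align}
are sufficiently small, the solution $(x(\alpha, t), e(\alpha, t))$ exponentially converges to $(x^{2\pi/\omega}(\alpha, t), e^{2\pi/\omega}(\alpha, t))$ in $H_1[0, 1] \times L_2[0, 1]$ norm.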
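Proof: We start the proof by defining the error variable
\begin{equation}
\tilde x = x - x^* - a(\alpha)\sin(\omega t), \tag{4.27}
\end{equation}
where $x^*$ is the location of the source, and the new time variable
\begin{equation}
\tau = \omega t. \tag{4.28}
\end{equation}
The resulting dynamics become
\begin{align}
\tilde x_\tau(\alpha, \tau) &= \frac{1}{\omega}\left(\kappa(\alpha)\left(\tilde x_{\alpha\alpha}(\alpha, \tau) + a''(\alpha)\sin(\tau)\right) + c(\alpha)\xi(\alpha, \tau)\sin(\tau)\right), \tag{4.29}\\
e_\tau(\alpha, \tau) &= \frac{h(\alpha)}{\omega}\xi(\alpha, \tau), \tag{4.30}\\
\xi(\alpha, \tau) &= -q(\tilde x(\alpha, \tau) + a(\alpha)\sin(\tau))^2 - e(\alpha, \tau) \tag{4.31}
\end{align}
with B.C.
\begin{align}
\tilde x_\tau(0, \tau) &= \frac{1}{\omega}\left(-\kappa(0)\nu + c(0)\xi(0, \tau)\sin(\tau)\right) \tag{4.32}\\
\tilde x_\tau(1, \tau) &= \frac{1}{\omega}\left(\kappa(1)\nu + c(1)\xi(1, \tau)\sin(\tau)\right). \tag{4.33}
\end{align}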
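The average error system is
\begin{align}
\tilde x^{\mathrm{ave}}_\tau(\alpha, \tau) &= \frac{1}{\omega}\left(\kappa(\alpha)\tilde x^{\mathrm{ave}}_{\alpha\alpha}(\alpha, \tau) - qc(\alpha)a(\alpha)\tilde x^{\mathrm{ave}}(\alpha, \tau)\right) \tag{4.34}\\
e^{\mathrm{ave}}_\tau(\alpha, \tau) &= -\frac{h(\alpha)}{\omega}\left(q(\tilde x^{\mathrm{ave}}(\alpha, \tau))^2 + \frac{qa^2(\alpha)}{2} + e^{\mathrm{ave}}(\alpha, \tau)\right) \tag{4.35}
\end{align}
with B.C.
\begin{align}
\tilde x^{\mathrm{ave}}_\tau(0, \tau) &= \frac{1}{\omega}\left(-\kappa(0)\nu - qc(0)a(0)\tilde x^{\mathrm{ave}}(0, \tau)\right) \tag{4.36}\\
\tilde x^{\mathrm{ave}}_\tau(1, \tau) &= \frac{1}{\omega}\left(\kappa(1)\nu - qc(1)a(1)\tilde x^{\mathrm{ave}}(1, \tau)\right). \tag{4.37}
\end{align}
The equilibrium profile of the average error system (4.34)–(4.37) is
\begin{equation}
\left[\tilde x^{\mathrm{ave},e}(\alpha),\ e^{\mathrm{ave},e}(\alpha)\right] = \left[\rho(\alpha),\ -\frac{qa^2(\alpha)}{2} - q\rho^2(\alpha)\right], \tag{4.38}
\end{equation}
where $\rho(\alpha)$ is given in (4.20).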
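For constant parameters the equilibrium profile can be checked directly. The Python sketch below evaluates $\rho(\alpha)$ from (4.20)–(4.24) with constant $\lambda$, so that $\gamma(\alpha) = \lambda\alpha$, and computes the residuals of the averaged boundary conditions (4.36)–(4.37) and of the interior equation (4.34) at steady state, the latter with a central difference. The parameter values and the function name `rho_free` are assumptions made for this illustration.

```python
import math

def rho_free(alpha, lam, nu):
    """rho(alpha) from (4.20)-(4.22) for constant lambda, gamma(alpha) = lam * alpha."""
    denom = math.exp(lam) - math.exp(-lam)
    a0 = nu / denom * (1.0 / lam ** 2 + math.exp(-lam) / lam ** 2)
    a1 = nu / denom * (1.0 / lam ** 2 + math.exp(lam) / lam ** 2)
    return a0 * math.exp(lam * alpha) - a1 * math.exp(-lam * alpha)

# Hypothetical constant parameters chosen so that lambda = 5
kappa, q, c, a, nu = 1.0, 1.0, 50.0, 0.5, 2.0
lam = math.sqrt(q * c * a / kappa)  # (4.24)

# Residuals of the steady-state boundary conditions (4.36) and (4.37)
bc_left = -kappa * nu - q * c * a * rho_free(0.0, lam, nu)
bc_right = kappa * nu - q * c * a * rho_free(1.0, lam, nu)

# Residual of the steady state of (4.34) at an interior point, central difference
d, alpha = 1e-3, 0.4
rho_aa = (rho_free(alpha + d, lam, nu) - 2.0 * rho_free(alpha, lam, nu)
          + rho_free(alpha - d, lam, nu)) / d ** 2
interior = kappa * rho_aa - q * c * a * rho_free(alpha, lam, nu)
print(bc_left, bc_right, interior)
```

The boundary values reduce to $\rho(0) = -\nu/\lambda^2$ and $\rho(1) = \nu/\lambda^2$ in this constant-parameter case, which is what makes both boundary residuals vanish.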
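We shift the system state by its equilibrium profile with the following transformation
\begin{align}
w(\alpha, \tau) &= \tilde x^{\mathrm{ave}}(\alpha, \tau) - \tilde x^{\mathrm{ave},e}(\alpha) \tag{4.39}\\
z(\alpha, \tau) &= e^{\mathrm{ave}}(\alpha, \tau) - e^{\mathrm{ave},e}(\alpha), \tag{4.40}
\end{align}
which results in the following dynamics
\begin{align}
w_\tau(\alpha, \tau) &= \frac{1}{\omega}\left(\kappa(\alpha)w_{\alpha\alpha}(\alpha, \tau) - qc(\alpha)a(\alpha)w(\alpha, \tau)\right) \tag{4.41}\\
z_\tau(\alpha, \tau) &= -\frac{h(\alpha)}{\omega}\left(q(w(\alpha, \tau) + \rho(\alpha))^2 + z(\alpha, \tau) - q\rho^2(\alpha)\right) \notag\\
&= -\frac{h(\alpha)}{\omega}\left(qw^2(\alpha, \tau) + 2q\rho(\alpha)w(\alpha, \tau) + z(\alpha, \tau)\right) \tag{4.42}
\end{align}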
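with B.C.
\begin{align}
w_\tau(0, \tau) &= \frac{1}{\omega}\left(-\kappa(0)\nu - qc(0)a(0)(w(0, \tau) + \rho(0))\right) = -\frac{qc(0)a(0)}{\omega}w(0, \tau) \tag{4.43}\\
w_\tau(1, \tau) &= \frac{1}{\omega}\left(\kappa(1)\nu - qc(1)a(1)(w(1, \tau) + \rho(1))\right) = -\frac{qc(1)a(1)}{\omega}w(1, \tau). \tag{4.44}
\end{align}
Linearizing the averaged error system produces
\begin{align}
w_\tau(\alpha, \tau) &= \frac{1}{\omega}\left(\kappa(\alpha)w_{\alpha\alpha}(\alpha, \tau) - qc(\alpha)a(\alpha)w(\alpha, \tau)\right) \tag{4.45}\\
z_\tau(\alpha, \tau) &= -\frac{h(\alpha)}{\omega}\left(2q\rho(\alpha)w(\alpha, \tau) + z(\alpha, \tau)\right) \tag{4.46}
\end{align}
with B.C.
\begin{align}
w_\tau(0, \tau) &= -\frac{qc(0)a(0)}{\omega}w(0, \tau) \tag{4.47}\\
w_\tau(1, \tau) &= -\frac{qc(1)a(1)}{\omega}w(1, \tau). \tag{4.48}
\end{align}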
Using Lemma A.1 in Appendix A, where $k_1 = \kappa(\alpha)$, $k_2 = qa(\alpha)c(\alpha)$, $k_3 = 2qh(\alpha)\rho(\alpha)$, and $k_4 = h(\alpha)$, we get that the averaged error system has an exponentially stable equilibrium. Applying Theorem 3.6 and Example 6.4 in [27] (details in Appendix B), we can state that there exists $\omega^* > 0$ such that, for all $\omega > \omega^*$, there exists a periodic solution $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ of period $2\pi/\omega$ in $t$ and with the property that
\[ |x^{2\pi/\omega}(0,t) - x^* - \rho(0)|^2 + \int_0^1 |x^{2\pi/\omega}(\alpha,t) - x^* - \rho(\alpha)|^2\,d\alpha + \int_0^1 |x^{2\pi/\omega}_\alpha(\alpha,t) - \rho'(\alpha)|^2\,d\alpha \le O\Bigl(1/\omega + \max_\alpha a(\alpha)\Bigr), \tag{4.49} \]
so that the solution $(x(\alpha,t), e(\alpha,t))$ locally exponentially converges to $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ in $H^1[0,1]\times L^2[0,1]$ norm. Agmon's inequality combined with Young's inequality yields
\[ \sup_\alpha |\zeta(\alpha,t)|^2 \le \zeta^2(0,t) + \int_0^1 |\zeta(\alpha,t)|^2\,d\alpha + \int_0^1 |\zeta_\alpha(\alpha,t)|^2\,d\alpha. \tag{4.50} \]
By applying (4.50) to (4.49) we get the bound (4.19).
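For completeness, the step behind (4.50) can be sketched as follows. This is a standard argument written in the notation above, not a reproduction of the derivation in the appendices; $\|\cdot\|$ denotes the $L^2[0,1]$ norm in $\alpha$. Writing $\zeta^2(\alpha,t)$ as the integral of its derivative from $0$ and applying the Cauchy–Schwarz inequality gives the Agmon-type bound, and Young's inequality $2ab \le a^2 + b^2$ gives the final form:

\[ \zeta^2(\alpha,t) = \zeta^2(0,t) + 2\int_0^\alpha \zeta(s,t)\zeta_s(s,t)\,ds \le \zeta^2(0,t) + 2\|\zeta(t)\|\,\|\zeta_\alpha(t)\| \le \zeta^2(0,t) + \|\zeta(t)\|^2 + \|\zeta_\alpha(t)\|^2 . \]

Taking the supremum over $\alpha \in [0,1]$ on the left-hand side recovers (4.50).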
Now we take a look at how the parameters affect the density function.
Proposition 4.1 The averaged equilibrium (4.20)–(4.24) has the following formation density function
\[ p(x) = \frac{1 + \dfrac{x - x^*}{\sqrt{(x-x^*)^2 + 4a_0a_1}}}{\lambda(\phi^{-1}(x))\Bigl(x - x^* + \sqrt{(x-x^*)^2 + 4a_0a_1}\Bigr)}, \tag{4.51} \]
where $a_0 = a_0^{\mathrm{free}}$, $a_1 = a_1^{\mathrm{free}}$ are given in (4.21) and (4.22).
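Proof: We start by taking the vehicle position function, which has the form
\[ x = \phi(\alpha) = \rho(\alpha) + x^* = a_0e^{\gamma(\alpha)} - a_1e^{-\gamma(\alpha)} + x^*, \tag{4.52} \]
and solving (4.52), which is a quadratic equation in $e^{\gamma(\alpha)}$, for $\gamma$ to obtain
\[ \gamma(\alpha) = \ln\left(\frac{x - x^* + \sqrt{(x-x^*)^2 + 4a_0a_1}}{2a_0}\right). \tag{4.53} \]
We use (4.23) to rewrite $\gamma$ in terms of $\lambda$ and differentiate both sides with respect to $x$ to obtain
\[ \left(\frac{d}{dx}\phi^{-1}(x)\right)\lambda(\phi^{-1}(x)) = \frac{1 + \dfrac{x - x^*}{\sqrt{(x-x^*)^2 + 4a_0a_1}}}{x - x^* + \sqrt{(x-x^*)^2 + 4a_0a_1}} \tag{4.54} \]
and then simply solve for the density function $p(x) = \frac{d}{dx}\phi^{-1}(x)$.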
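As a numerical sanity check of Proposition 4.1, the sketch below compares the closed-form density (4.51) with a finite-difference estimate of $d\alpha/dx$ along the curve $x = \phi(\alpha)$ from (4.52). It assumes a constant $\lambda$ with $\gamma(\alpha) = \lambda\alpha$, and the values of `a0` and `a1` are hypothetical placeholders chosen only for illustration (in the text they come from (4.21) and (4.22)).

```python
import numpy as np

# Hypothetical illustration values; in the text a0 and a1 come from (4.21)-(4.22).
a0, a1, lam, x_star = 0.02, 0.03, 5.0, 0.0

def phi(alpha):
    """Vehicle position (4.52), assuming gamma(alpha) = lam * alpha."""
    g = lam * alpha
    return a0 * np.exp(g) - a1 * np.exp(-g) + x_star

def density(x):
    """Formation density (4.51) for constant lambda."""
    s = x - x_star
    r = np.sqrt(s**2 + 4.0 * a0 * a1)
    return (1.0 + s / r) / (lam * (s + r))

alpha = np.linspace(0.0, 1.0, 2001)
x = phi(alpha)

# d(alpha)/dx estimated by finite differences on the non-uniform grid x
p_numeric = np.gradient(alpha, x)

# Compare on interior points, where the finite difference is second order
p_formula = density(x[1:-1])
max_rel_err = np.max(np.abs(p_numeric[1:-1] - p_formula) / p_formula)
print("max relative error:", max_rel_err)
```

The two curves should agree up to discretization error, since (4.51) simplifies to $1/\bigl(\lambda\sqrt{(x-x^*)^2 + 4a_0a_1}\bigr)$ when $\lambda$ is constant.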
Figure 4.1 shows two density plots, with the parameters chosen so that λ = 5 for the solid black line and λ(α) = 5(2 − α) for the dashed blue line, and with ν = 2 and x∗ = 0 for both. The plots show that the vehicles with higher values of λ(α) squeeze towards the maximum x∗, while the vehicles with lower values of λ(α) spread out more.
We consider the simple case of constant λ, to show the effect of λ and ν on the
density function at x∗. The formula for density function at x∗ with constant λ is
given by
p(x∗) =
√κλ sinh(λ)
ν√
2 + 2 cosh(λ), (4.55)
where it can be noted that as λ increases so does the density function at x∗, while
the opposite is true for ν.
Figure 4.1: Vehicle density function for λ = 5 and λ(α) = 5(2 − α). Horizontal axis: position (x − x∗); vertical axis: density (n vehicles/∆x).
4.4 Fixed Anchors
In this section we highlight the differences between the analysis of the fixed anchor case and that of the free anchor case. The main difference between the two cases is that the fixed anchor case forces the formation deployment profile to lie between $\underline{x}$ and $\overline{x}$, which in turn confines the density function to the same range. Unlike in the free anchor case, in the fixed anchor case the anchors are stationary.
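Theorem 4.2 Consider the system
\[ x_t(\alpha,t) = \kappa x_{\alpha\alpha}(\alpha,t) + a\omega\cos(\omega t) + c\xi(\alpha,t)\sin(\omega t) \tag{4.56} \]
\[ e_t(\alpha,t) = h\xi(\alpha,t) \tag{4.57} \]
\[ \xi(\alpha,t) = -q(x(\alpha,t) - x^*)^2 - e(\alpha,t) \tag{4.58} \]
with the fixed boundary conditions
\[ x(0,t) = \underline{x} \tag{4.59} \]
\[ x(1,t) = \overline{x} \tag{4.60} \]
where $\underline{x}$ and $\overline{x} \in \mathbb{R}$. There exists $\omega^* > 0$ such that, for all $\omega > \omega^*$, there exists a periodic solution $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ of period $2\pi/\omega$ in $t$ and with the property that
\[ |x^{2\pi/\omega}(\alpha,t) - x^* - \rho(\alpha)|^2 \le O\Bigl(\frac{1}{\omega} + \max_\alpha a(\alpha)\Bigr) \tag{4.61} \]
$\forall \alpha \in [0,1]$, $t \ge 0$, where
\[ \rho(\alpha) = a_0^{\mathrm{fixed}}e^{\gamma(\alpha)} - a_1^{\mathrm{fixed}}e^{-\gamma(\alpha)}, \tag{4.62} \]
\[ a_0^{\mathrm{fixed}} = \frac{\overline{x} - x^*(1 - e^{-\gamma(1)}) - \underline{x}e^{-\gamma(1)}}{e^{\gamma(1)} - e^{-\gamma(1)}}, \tag{4.63} \]
\[ a_1^{\mathrm{fixed}} = \frac{\overline{x} - x^*(1 - e^{\gamma(1)}) - \underline{x}e^{\gamma(1)}}{e^{\gamma(1)} - e^{-\gamma(1)}}, \tag{4.64} \]
and $\gamma$ is given by (4.23), such that whenever the quantities
\[ \int_0^1 |x(\alpha,0) - x^* - \rho(\alpha)|^2\,d\alpha \tag{4.65} \]
and
\[ \int_0^1 \Bigl|e(\alpha,0) + \frac{qa^2}{2} + q\rho^2(\alpha)\Bigr|^2 d\alpha \tag{4.66} \]
are sufficiently small, the solution $(x(\alpha,t), e(\alpha,t))$ exponentially converges to $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ in $L^2[0,1]\times L^2[0,1]$ norm.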
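Proof: Similar to the proof for Theorem 4.1, we start by applying (4.27) and (4.28) to system (4.56)–(4.58) with the B.C. (4.59)–(4.60), and then by averaging we obtain
\[ x^{\mathrm{ave}}_\tau(\alpha,\tau) = \frac{1}{\omega}\bigl(\kappa(\alpha)x^{\mathrm{ave}}_{\alpha\alpha}(\alpha,\tau) - qc(\alpha)a(\alpha)x^{\mathrm{ave}}(\alpha,\tau)\bigr) \tag{4.67} \]
\[ e^{\mathrm{ave}}_\tau(\alpha,\tau) = -\frac{h(\alpha)}{\omega}\Bigl(q\bigl(x^{\mathrm{ave}}(\alpha,\tau)\bigr)^2 + \frac{qa(\alpha)^2}{2} + e^{\mathrm{ave}}(\alpha,\tau)\Bigr) \tag{4.68} \]
with B.C.
\[ x^{\mathrm{ave}}(0,\tau) = \underline{x} - x^* \quad\text{and}\quad x^{\mathrm{ave}}(1,\tau) = \overline{x} - x^*. \tag{4.69} \]
The average error system (4.67)–(4.69) has an equilibrium defined by
\[ \bigl[x^{\mathrm{ave}_e}(\alpha),\, e^{\mathrm{ave}_e}(\alpha)\bigr] = \Bigl[\rho(\alpha),\, -\frac{qa^2}{2} - q\rho^2(\alpha)\Bigr], \tag{4.70} \]
where $\rho(\alpha)$ is given in (4.62). We omit the details of the averaging, but would like to point out that the main difference between averaging the fixed case and the free case is in the boundary condition, which yields different coefficients for the equilibrium (4.62).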
Shifting the averaged system by the equilibrium and linearizing, we get
\[ w_\tau(\alpha,\tau) = \frac{1}{\omega}\bigl(\kappa(\alpha)w_{\alpha\alpha}(\alpha,\tau) - qc(\alpha)a(\alpha)w(\alpha,\tau)\bigr) \tag{4.71} \]
\[ z_\tau(\alpha,\tau) = -\frac{h(\alpha)}{\omega}\bigl(2q\rho(\alpha)w(\alpha,\tau) + z(\alpha,\tau)\bigr) \tag{4.72} \]
with B.C. $w(0,\tau) = w(1,\tau) = 0$.
Using Lemma A.3 in Appendix A, where $k_1 = \kappa(\alpha)$, $k_2 = qa(\alpha)c(\alpha)$, $k_3 = 2qh(\alpha)\rho(\alpha)$, and $k_4 = h(\alpha)$, we get that the averaged error system has an exponentially stable equilibrium. Using Theorem 3.6 and Example 6.4 in [27] (details in Appendix B), we can state that there exists $\omega^* > 0$ such that, for all $\omega > \omega^*$, there exists a periodic solution $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ of period $2\pi/\omega$ in $t$ and with the property that
\[ \int_0^1 |x^{2\pi/\omega}(\alpha,t) - x^* - \rho(\alpha)|^2\,d\alpha \le O\Bigl(1/\omega + \max_\alpha a(\alpha)\Bigr) \tag{4.73} \]
so that the solution $(x(\alpha,t), e(\alpha,t))$ locally exponentially converges to $(x^{2\pi/\omega}(\alpha,t), e^{2\pi/\omega}(\alpha,t))$ in $L^2[0,1]\times L^2[0,1]$ norm. By applying (4.50) to (4.73) we get the bound (4.61).
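The same result as in Proposition 4.1 holds for the averaged equilibrium of the fixed anchor case (4.62)–(4.64), with the formation density function given by (4.51), where $a_0 = a_0^{\mathrm{fixed}}$ and $a_1 = a_1^{\mathrm{fixed}}$ are given in (4.63) and (4.64), respectively. Evaluating (4.51) at $x = x^*$ gives $p(x^*) = 1/\bigl(\lambda\sqrt{4a_0a_1}\bigr)$, so the formation density function at position $x^*$ with a constant $\lambda$ is
\[ p(x^*) = \frac{\sinh(\lambda)}{\lambda\sqrt{\beta}}, \tag{4.74} \]
where
\[ \beta = \bigl(x^{*2} - x^*(\underline{x} + \overline{x})\bigr)\bigl(2 - 2\cosh(\lambda)\bigr) - 2\underline{x}\,\overline{x}\cosh(\lambda) + \underline{x}^2 + \overline{x}^2, \tag{4.75} \]
and $\underline{x}$, $\overline{x}$ are the anchor positions $x(0,t)$ and $x(1,t)$. This density increases with larger $\lambda$ and decreases as the difference between $\underline{x}$ and $\overline{x}$ grows.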
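The fixed-anchor coefficients can be checked numerically. The sketch below evaluates (4.63) and (4.64) for a constant $\lambda$ (so that $\gamma(\alpha) = \lambda\alpha$, an assumption of this example), reading the first boundary value in those formulas as $x(1,t)$ and the second as $x(0,t)$, which is the assignment that reproduces the boundary conditions (4.69). The numerical values (anchors at 0 and 1, source at 0.6, $\lambda = 5$) are illustrative choices, and the function names are hypothetical.

```python
import numpy as np

def fixed_anchor_coeffs(x_lo, x_hi, x_star, gamma1):
    """Coefficients (4.63)-(4.64); x_lo = x(0,t), x_hi = x(1,t) (assumed reading)."""
    den = np.exp(gamma1) - np.exp(-gamma1)
    a0 = (x_hi - x_star * (1.0 - np.exp(-gamma1)) - x_lo * np.exp(-gamma1)) / den
    a1 = (x_hi - x_star * (1.0 - np.exp(gamma1)) - x_lo * np.exp(gamma1)) / den
    return a0, a1

# Illustrative values: anchors at 0 and 1, source at 0.6, constant lambda = 5
x_lo, x_hi, x_star, lam = 0.0, 1.0, 0.6, 5.0
a0, a1 = fixed_anchor_coeffs(x_lo, x_hi, x_star, gamma1=lam)

def rho(alpha):
    """Equilibrium profile (4.62) with gamma(alpha) = lam * alpha."""
    return a0 * np.exp(lam * alpha) - a1 * np.exp(-lam * alpha)

def density(x):
    """Formation density (4.51) with constant lambda."""
    s = x - x_star
    r = np.sqrt(s**2 + 4.0 * a0 * a1)
    return (1.0 + s / r) / (lam * (s + r))

p_at_source = density(x_star)   # equals 1 / (lam * sqrt(4 * a0 * a1))
p_away = density(0.2)           # density at a point away from the source
print(rho(0.0), rho(1.0), p_at_source, p_away)
```

The profile should satisfy $\rho(0) = x(0,t) - x^*$ and $\rho(1) = x(1,t) - x^*$, and the density at the source should exceed the density away from it.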
4.5 Simulation Results
To implement the algorithm in Section 4.2 we must first understand how to choose and tune the parameters a, c, κ, ω, h, and ν. Higher values of a and c increase the attraction of the vehicles towards the source, while higher values of κ decrease it. The parameters ω and a are chosen such that the quantity $1/\omega + \max_\alpha a(\alpha)$ is sufficiently small. The cutoff frequency h of the washout filter has to be high enough to remove most of the DC component of the sensor signal, but lower than the perturbation frequency ω. In the free anchor case, the higher the ratio $\nu\kappa/(ac)$, the farther the anchor vehicles settle from the source, thereby causing the formation to spread out.
To apply the algorithm in Section 4.2, we discretize the continuous model (4.6). The two anchor agents do not require any modification of their control laws (4.8), (4.9), and (4.10), since these laws do not include any partial differentiation with respect to the agent index. The state variables $x(\alpha,t)$ and $\xi(\alpha,t)$ become $x(i\delta,t)$ and $\xi(i\delta,t)$, where $i = 0, \ldots, n+1$, $\delta = 1/(n+1)$, and $n$ is the number of follower agents. We denote the two anchor agents' states as $[x_0, \xi_0]$ and $[x_{n+1}, \xi_{n+1}]$, and the interior seeking agents' states as $[x_i, \xi_i]$.
We discretize the seeking agents' control laws (4.6) by using three-point central differencing to approximate the spatial derivatives, obtaining
\[ v_i(t) = \kappa_i\frac{x_{i+1} - 2x_i + x_{i-1}}{\delta^2} + a_i\cos(\omega t) + c_i\xi_i(t)\sin(\omega t), \tag{4.76} \]
which can be rearranged as
\[ v_i(t) = \kappa_i\frac{\Delta x_{i+1,i} + \Delta x_{i-1,i}}{\delta^2} + a_i\cos(\omega t) + c_i\xi_i(t)\sin(\omega t), \tag{4.77} \]
where $\Delta x_{j,i} = x_j - x_i$. The washout filter becomes
\[ \xi_i(t) = \frac{s}{s+h}[J_i(t)], \tag{4.78} \]
where $J_i$ is the sensor reading of agent $i$. Figure 4.2 shows the block diagram for one follower agent.
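The discretized follower law can be prototyped in a few lines. The following is a minimal forward-Euler sketch of the fixed-anchor case, not the simulation code behind the figures, and it rests on several assumptions that are not taken from the text: the probing term uses the amplitude $a\omega$ as in (4.56), the washout filter is realized as $\xi_i = J_i - \eta_i$ with $\dot\eta_i = h\xi_i$ and $\eta_i(0) = J_i(0)$ so that $\xi_i(0) = 0$, the spacing is $\delta = 1/(\text{number of agents} - 1)$, and the perturbation frequency is set to $\omega = 200$ to keep this simple integrator well inside the averaging regime. The function name `simulate_fixed_anchors` and the signal field $J = f^* - q(x - x^*)^2$ with $f^* = 1$, $q = 10$, $x^* = 0.6$ are illustrative choices.

```python
import numpy as np

def simulate_fixed_anchors(n_agents=11, T=5.0, dt=2e-4,
                           a=0.008, c=15.0, kappa=0.05, omega=200.0, h=10.0,
                           f_star=1.0, q=10.0, x_star=0.6):
    """Forward-Euler sketch of the discretized law (4.77) with fixed anchors.

    Returns the agent positions averaged over the last perturbation period.
    """
    delta = 1.0 / (n_agents - 1)                 # assumed agent spacing in alpha
    x = np.linspace(0.0, 1.0, n_agents)          # equidistant release
    eta = f_star - q * (x - x_star) ** 2         # low-pass state, so xi(0) = 0
    steps = int(round(T / dt))
    window = int(round(2.0 * np.pi / omega / dt))  # samples in one period
    tail_sum = np.zeros(n_agents)

    for k in range(steps):
        t = k * dt
        J = f_star - q * (x - x_star) ** 2       # sensor reading of each agent
        xi = J - eta                             # washout (high-pass) output
        lap = np.zeros(n_agents)
        lap[1:-1] = (x[2:] - 2.0 * x[1:-1] + x[:-2]) / delta ** 2
        v = kappa * lap + a * omega * np.cos(omega * t) + c * xi * np.sin(omega * t)
        v[0] = v[-1] = 0.0                       # anchors are stationary
        x = x + dt * v
        eta = eta + dt * h * xi
        if k >= steps - window:
            tail_sum += x

    return tail_sum / window

x_mean = simulate_fixed_anchors()
gaps = np.diff(x_mean)
print("mean positions:", np.round(x_mean, 3))
print("gaps between neighbors:", np.round(gaps, 3))
```

In this sketch the gaps between neighboring agents should come out smallest around the source at $x^* = 0.6$ and largest next to the anchor farthest from it, which is the qualitative behavior described for the fixed anchor case.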
Figure 4.2: Block diagram of a single follower agent.
The signal field parameters for the plots in Figure 4.3 are $f^* = 1$, $q = 10$, and $x^* = 0.6$. We apply (4.77) with $a = 0.008$, $c = 15$, $\kappa = 0.05$, $\omega = 45$, and $h = 10$ for all follower agents, and $x_0 = 0$, $x_n = 1$ for the anchor agents, to simulate the fixed anchor case with 11 agents. Figure 4.3(a) shows the evolution of a group of autonomous vehicles, with fixed boundary agents, all released equidistantly between $x_0$ and $x_n$. The agents deploy more densely around the signal source (peak) than away from the source, which is consistent with the form of the density function (4.51), where $a_0 = a_0^{\mathrm{fixed}}$ and $a_1 = a_1^{\mathrm{fixed}}$ are given in (4.63) and (4.64), respectively.
We simulate the free boundary condition case using
\[ v_0(t) = -\kappa\nu + a\omega\cos(\omega t) + c\xi_0(t)\sin(\omega t), \tag{4.79} \]
\[ v_n(t) = \kappa\nu + a\omega\cos(\omega t) + c\xi_n(t)\sin(\omega t), \tag{4.80} \]
and (4.77), where a, c, κ, ω, and h have the same values as in the first simulation and ν = 0.5. Figure 4.3(b) shows the evolution of a group of 11 autonomous vehicles, with free boundary control, released with the anchor agents at positions 0 and 0.1 and the follower agents spread equally between them. The deployment density is consistent with the theoretically predicted solid curve in Figure 4.1.
The theoretical distribution and density functions for the free and fixed anchor cases are shown in Figure 4.4. Figure 4.4(a) shows the normalized vehicle ID number (α) on the y-axis and the vehicle location on the x-axis. It shows that, in the free anchor case, the agents cover less of the area (between 0.2 and 1) than in the fixed anchor case, where the agents are forced to cover the area between 0 and 1. Figure 4.4(b) shows that the free anchor case has a higher density around the source than the fixed anchor case.
The simulation in Figure 4.5 is produced with the same parameters as the simulation shown in Figure 4.3(b), except that in Figure 4.5(a) the extremum seeking parameters are $a_i = 0.008(1 + i/n)$, $c_i = 15(1 + i/n)$, and in Figure 4.5(b) the source is moving according to $x^*(t) = c_{x^*} + a_{x^*}\sin(\omega_{x^*}t)$, where $c_{x^*} = 0.6$, $a_{x^*} = 0.2$, and $\omega_{x^*} = \pi/5$. Figure 4.5(a) shows how increasing the parameters a and c with the agent index i pulls the agents with a higher i closer to the source. Figure 4.5(b) shows how the algorithm handles a moving source.
4.6 Conclusions
We have introduced algorithms that expand the capability of previous single-
agent source seeking algorithms. The new multi-agent source seeking algorithms
cover the area around the source in such a manner that the highest density of
agents is achieved at the source and the density decreases away from the source.
This form of deployment is achieved by combining standard extremum seeking with
consensus-type ideas, namely, by using algorithms that are simultaneously driven by
the local signal strength and by diffusion feedback, which employs the distance to
the nearest agents. While diffusion aims to place an agent exactly halfway between
its neighbors, extremum seeking aims to pull the agent closer to the source. In
the presence of anchor agents, which deploy some distance apart, the result is that
agents deploy more densely near the source than away from the source.
Of interest for future research is to extend the present algorithms to the stochastic
case, namely, to replace the sinusoidally forced extremum seeking algorithms by
extremum seeking algorithms forced by white noise [39]. In addition, it is of interest
to extend the current results for one-dimensional formations in one-dimensional
space to higher-dimensional formations in higher-dimensional space. Finally, it is of
interest to extend the present results to non-holonomic vehicles.
This chapter is in full a reprint of the material as it has been submitted to: N.
Ghods, and M. Krstic, “Multi-agent deployment over a source,” under review.
The dissertation author was the primary investigator and author of this paper.
Figure 4.3: Double y-axis plots of the vehicle trajectories showing time on the left y-axis, the signal field strength on the right y-axis, and the location of the vehicles on the x-axis. (a) Agent deployment with fixed anchors. (b) Agent deployment with free anchors.
Figure 4.4: Theoretical plot of (a) the formation distribution function (α versus position x) and (b) the formation density function (n vehicles/∆x versus position x − x∗) for the fixed and free anchor cases.
Figure 4.5: (a) Agent deployment with free anchors starting far from the equilibrium with linearly increasing parameters a and c. (b) Group of 11 agents using the free anchor case to achieve seeking of a moving source.
5 Multi-agent Deployment with Stochastic Extremum Seeking
We consider the problem of deploying a group of N autonomous fully actuated vehicles (agents) in a non-cooperative manner in a planar signal field using the recently introduced method of stochastic extremum seeking. The spatial distribution of the signal is unknown to the vehicles but known to be convex. The vehicles are not able to sense their own positions but are capable of sensing the distance between their neighbors and themselves. Each vehicle employs a stochastic extremum seeking control law whose goal is to minimize the value of the measured signal, namely to be as close as possible to the bottom of the signal field, as well as to simultaneously minimize a function of the distances between neighboring agents. Such a seemingly conflicting and mutually competitive nature of the agents' control laws produces a Nash equilibrium that depends on the agents' control parameters and the unknown signal distribution. We prove local exponential convergence, both almost surely and in probability, to a small neighborhood of the Nash equilibrium. The theoretical results are illustrated with simulations.
5.1 Introduction
Recently, extremum seeking has been considered for distributed control of vehi-
cles in a network with each vehicle having limited local information in [51, 50, 24].
The applications include groups of vehicles operating underwater, under ice, in caves
or in urban environments where GPS is unavailable, or where inertial navigation sys-
tems are too costly. Other applications include scenarios where communication or
interaction among all agents is not feasible.
We investigate a stochastic version of non-cooperative source seeking by navi-
gating the autonomous vehicles with the help of a random perturbation. We use
stochastic extremum seeking and apply an extra force to some of the vehicles, which
we refer to as anchor agents, to increase the deployment area. The remaining
agents, which we refer to as follower agents, achieve deployment over a source by
using stochastic extremum seeking to maximize or minimize their local costs. The
vehicles have no knowledge of their own position, nor the position of the source,
and are only required to sense the distance between their neighbors and themselves.
In an application, the signal could be the concentration of a chemical or biological
agent, or it could be an electromagnetic, acoustic, or thermal source. The strength
of the signal is assumed to decay away from the source through diffusion or other
physical processes, but the spatial distribution of the signal is not available to the
vehicles.
The work [50] considers a non-cooperative problem where each agent is trying to
maximize or minimize their local cost function, which results in the convergence of
the group of agents to a Nash equilibrium. In [50], similar to the one agent case [60],
each agent employs two out-of-phase sinusoidal perturbations in order to generate
gradient estimates in the x and y directions for the extremum seeking algorithm.
We consider two cases of excitation for the group of N vehicles. Case 1 uses an independent Brownian motion on a unit circle for every vehicle, and Case 2 uses only one Brownian motion on a unit circle for all vehicles, but with limited interaction between neighbors. We provide a stability analysis for both cases based on stochastic averaging theorems recently developed in [40]. The choice of using random processes for perturbation was motivated by [6] and [7], where it is observed that the bacterium Escherichia coli (E. coli) is able to move up chemical gradients towards higher densities of nutrients by using what appears to be random searching from time to time. In the works [42, 39], also motivated by E. coli, the problem of stochastic source seeking was considered for vehicles with unicycle dynamics.
In Section 5.2, we give a description of the vehicle model and the cost function used by each agent. Section 5.3 presents the control scheme for Case 1, which allows interaction among all agents, and for Case 2, which allows limited interaction between the agents. In Section 5.4 we prove convergence, in probability and almost surely, of a group of vehicles to a Nash equilibrium under the control laws of Case 1 and Case 2. Simulation results in Section 5.5 illustrate the distinct behavior exhibited in the two cases.
5.2 Vehicle Model and Local Agent Cost
We consider vehicles modeled as a velocity-actuated point mass
\[ \frac{dx_i}{dt} = v_{xi}, \qquad \frac{dy_i}{dt} = v_{yi}, \tag{5.1} \]
where $(x_i, y_i)$ is the position of the vehicle in the plane, and $v_{xi}$, $v_{yi}$ are the vehicle velocity inputs. The subscript $i$ is used to denote the $i$th vehicle.
We assume that the nonlinear map defining the distribution of the signal field is quadratic and takes the form
\[ f_i(x_i, y_i) = f^* + q_x(x_i - x^*)^2 + q_y(y_i - y^*)^2 \tag{5.2} \]
where $(x^*, y^*)$ is the minimizer, $f^* = f(x^*, y^*)$ is the minimum, and $(q_x, q_y)$ are unknown positive constants. To account for the interactions between the vehicles we assume that each vehicle can sense the distance,
\[ d_{ij}(x, y) = \sqrt{(x_i - x_j)^2 + (y_i - y_j)^2}, \tag{5.3} \]
between itself and other vehicles. The cost function
\[ J_i = f_i + \sum_{j \in N} q_{ij}d_{ij}^2 \tag{5.4} \]
includes inter-vehicle interactions, where $q_{ij} \ge 0$ is the weighting that vehicle $i$ puts on its distance to vehicle $j$.
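To make the cost (5.4) concrete, the sketch below evaluates $J_i$ for all vehicles at once from their positions and a gain matrix. The function name `local_costs`, its argument layout, and the two-vehicle numbers are hypothetical choices made for this illustration only.

```python
import numpy as np

def local_costs(pos, q_gain, source=(0.0, 0.0), f_star=0.0, qx=1.0, qy=1.0):
    """Cost J_i of (5.4) for every vehicle.

    pos    : (N, 2) array of vehicle positions (x_i, y_i)
    q_gain : (N, N) array of weights q_ij >= 0 (row i belongs to vehicle i)
    """
    pos = np.asarray(pos, dtype=float)
    q_gain = np.asarray(q_gain, dtype=float)
    # Quadratic signal field (5.2) at each vehicle
    f = f_star + qx * (pos[:, 0] - source[0]) ** 2 + qy * (pos[:, 1] - source[1]) ** 2
    # Squared pairwise distances d_ij^2 from (5.3)
    diff = pos[:, None, :] - pos[None, :, :]
    d2 = np.sum(diff ** 2, axis=-1)
    return f + np.sum(q_gain * d2, axis=1)

# Two vehicles, source at the origin, asymmetric weights
J = local_costs(pos=[(0.0, 0.0), (3.0, 4.0)],
                q_gain=[[0.0, 0.5], [0.25, 0.0]])
print(J)
```

Because the weights need not be symmetric, each vehicle can value the same distance differently, which is what makes the problem non-cooperative.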
5.3 Control Design
To deploy the agents about the source position, we propose a control scheme that utilizes Brownian motion on the unit circle as the excitation signal to perform stochastic extremum seeking. Brownian motion on the unit circle has the following form:
\[ Y = e^{jB} = [Y_1, Y_2]^T = [\cos(B), \sin(B)]^T, \tag{5.5} \]
where $j$ is the imaginary unit, and $B$ is a 1-dimensional Brownian motion.
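A sample path of (5.5) is easy to generate, and its long-run time averages illustrate why the two components are useful probing signals. The sketch below is an illustration under assumptions of its own: the step size, horizon, and seed are arbitrary, and the function name `circle_brownian` is hypothetical. It draws Gaussian increments for $B$, maps them onto the circle, and prints time averages that should be close to the values $0$, $1/2$, and $0$ expected under the uniform distribution on the circle.

```python
import numpy as np

def circle_brownian(T=2000.0, dt=0.01, phase=0.0, seed=0):
    """Sample path of Y = [cos(B), sin(B)] with B a 1-D Brownian motion."""
    rng = np.random.default_rng(seed)
    n = int(round(T / dt))
    increments = np.sqrt(dt) * rng.standard_normal(n)   # B(t+dt) - B(t) ~ N(0, dt)
    B = phase + np.concatenate(([0.0], np.cumsum(increments)))
    return np.cos(B), np.sin(B)

Y1, Y2 = circle_brownian()

mean_Y1 = Y1.mean()             # expected to be near 0
mean_Y1_sq = (Y1 ** 2).mean()   # expected to be near 1/2
mean_Y1Y2 = (Y1 * Y2).mean()    # expected to be near 0
print(mean_Y1, mean_Y1_sq, mean_Y1Y2)
```

The path stays on the unit circle by construction, so the excitation is bounded even though $B$ itself is not.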
First, for clarity, we introduce the control scheme for a single vehicle i, which does not impose any constraints on the excitation signal. Then, we discuss the deployment of N vehicles, which utilizes excitation signals of two types. In the first type, each vehicle uses an independent Brownian motion. In the second type, every vehicle uses the same Brownian motion process, but the initial conditions of the processes differ by $k\pi/2$, $k \in \mathbb{Z}$, between neighboring vehicles.
We propose the following stochastic control algorithm for vehicle i:
\[ v_{xi} = a\dot{\eta}_{1i} + c_x\xi_i\eta_{1i} + \nu_{xi}, \tag{5.6} \]
\[ v_{yi} = a\dot{\eta}_{2i} + c_y\xi_i\eta_{2i} + \nu_{yi}, \tag{5.7} \]
\[ \xi_i = \frac{s}{s+h}[J_i], \tag{5.8} \]
\[ \eta_{1i} = \cos(W_i(t/\epsilon)), \tag{5.9} \]
\[ \eta_{2i} = \sin(W_i(t/\epsilon)), \tag{5.10} \]
where $\xi_i$ is the output of the washout filter for the cost $J_i$, $\eta_{1i}$ and $\eta_{2i}$ are used as perturbations in the stochastic extremum seeking scheme, $a, c_x, c_y, \epsilon, h > 0$ are extremum seeking design parameters, and $\nu_{xi}, \nu_{yi} \in \mathbb{R}$. In (5.8), $s$ denotes the Laplace variable of the transfer function acting on the cost $J_i$. We consider vehicles with $(\nu_{xi}, \nu_{yi}) \neq 0$ to be the anchor agents and those with $\nu_{xi} = \nu_{yi} = 0$ to be the follower agents. The signal $(W_i(t), t \ge 0)$ is a standard Brownian motion defined in a complete probability space $(\Omega, \mathcal{F}, P)$ with sample space $\Omega$, σ-field $\mathcal{F}$, and probability measure $P$.
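Using Itô's formula, $\eta_{1i}$ and $\eta_{2i}$ can be written as the solution to the differential equations
\[ d\eta_{1i} = -\frac{1}{2\epsilon}\cos(W_i(t/\epsilon))\,dt - \sin(W_i(t/\epsilon))\,dW_i(t/\epsilon), \tag{5.11} \]
\[ d\eta_{2i} = -\frac{1}{2\epsilon}\sin(W_i(t/\epsilon))\,dt + \cos(W_i(t/\epsilon))\,dW_i(t/\epsilon), \tag{5.12} \]
which are equivalent to the stochastic differential equations
\[ d\eta_{1i} = -\frac{1}{2\epsilon}\eta_{1i}\,dt - \eta_{2i}\,dW_i, \tag{5.13} \]
\[ d\eta_{2i} = -\frac{1}{2\epsilon}\eta_{2i}\,dt + \eta_{1i}\,dW_i, \tag{5.14} \]
with initial condition $W_i(0) = 0$ and $[\eta_{1i}(0), \eta_{2i}(0)]^T = [\cos(\phi_i), \sin(\phi_i)]^T$, $\phi_i \in \mathbb{R}$, i.e., $\eta_{1i}^2(0) + \eta_{2i}^2(0) = 1$. This equivalence is shown in more detail in Section II of [40]. Hence, the control signals (5.6) and (5.7) become
\[ v_{xi} = -\frac{a}{2\epsilon}\eta_{1i} - a\eta_{2i}\dot{W}_i + c_x\xi_i\eta_{1i} + \nu_{xi}, \tag{5.15} \]
\[ v_{yi} = -\frac{a}{2\epsilon}\eta_{2i} + a\eta_{1i}\dot{W}_i + c_y\xi_i\eta_{2i} + \nu_{yi}, \tag{5.16} \]
where $\dot{W}_i$ is a white noise signal for all $i \in \{1, 2, \ldots, N\}$. The vehicle dynamics in closed loop with the control laws (5.6)–(5.10) are rewritten as
\[ dx_i = -\frac{a}{2\epsilon}\eta_{1i}\,dt - a\eta_{2i}\,dW_i + c_x\xi_i\eta_{1i}\,dt + \nu_{xi}\,dt, \tag{5.17} \]
\[ dy_i = -\frac{a}{2\epsilon}\eta_{2i}\,dt + a\eta_{1i}\,dW_i + c_y\xi_i\eta_{2i}\,dt + \nu_{yi}\,dt, \tag{5.18} \]
where $\eta_{1i}$ and $\eta_{2i}$ are given in (5.13) and (5.14), respectively.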
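The Itô step that produces the drift term $-\frac{1}{2\epsilon}$ can be sketched as follows. The shorthand $B_i(t) = W_i(t/\epsilon)$ is introduced here for illustration and does not appear in the original text; its quadratic variation satisfies $d\langle B_i\rangle_t = dt/\epsilon$. Applying Itô's formula to $\cos(B_i)$ and $\sin(B_i)$ gives

\[ d\cos(B_i) = -\sin(B_i)\,dB_i - \tfrac{1}{2}\cos(B_i)\,d\langle B_i\rangle_t = -\frac{1}{2\epsilon}\cos(B_i)\,dt - \sin(B_i)\,dB_i, \]
\[ d\sin(B_i) = \cos(B_i)\,dB_i - \tfrac{1}{2}\sin(B_i)\,d\langle B_i\rangle_t = -\frac{1}{2\epsilon}\sin(B_i)\,dt + \cos(B_i)\,dB_i, \]

which has the same form as (5.11) and (5.12). The second-order correction is what keeps the pair $(\eta_{1i}, \eta_{2i})$ on the unit circle.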
Remark 5.1: It is not necessary to choose the Brownian motion on the unit
circle as the probing signal in the stochastic design for one vehicle. It is only re-
quired that the excitation signals in the x and y directions are uncorrelated and
bounded. Note that the Brownian motion on the unit circle was primarily chosen
for the ease that it provides in the stochastic averaging and the ability to use one
Brownian motion per vehicle or, as it will be shown in the next section, use only
one Brownian motion for the entire group of vehicles.
For the stable deployment of N vehicles, with dynamics (5.13)–(5.14), (5.17)–(5.18), we impose additional constraints on the Brownian motion $W_i$ and the initial condition of the Brownian motion on a unit circle $(\cos(\phi_i), \sin(\phi_i))$ according to the two types of excitation that we consider.
Case 1: For this case, we require the Brownian motion used by the $i$th agent, $W_i(t)$, to be uncorrelated with the Brownian motion used by the $j$th agent, $W_j(t)$, for $i \neq j$, and allow $\phi_i \in \mathbb{R}$. Under these constraints, each vehicle is allowed to interact with any of the other vehicles.
Case 2: This case allows every vehicle to use the same Brownian motion, $W_i(t) = W(t)$, $\forall i \in \{1, 2, \ldots, N\}$, but requires the initial condition of the Brownian motion on a unit circle to satisfy
\[ \begin{aligned} \phi_i - \phi_j &= \frac{k\pi}{2}, && i \in \Omega_{\mathrm{odd}},\ j \in \Omega_{\mathrm{even}}, \\ \phi_i - \phi_j &= k\pi, && i, j \in \Omega_{\mathrm{odd}} \text{ or } i, j \in \Omega_{\mathrm{even}}, \end{aligned} \tag{5.19} \]
$\forall i, j \in \{1, 2, \ldots, N\}$, $k \in \mathbb{Z}$, where $\Omega_{\mathrm{odd}}$ and $\Omega_{\mathrm{even}}$ are nonempty sets chosen such that
\[ \Omega_{\mathrm{even}} \cup \Omega_{\mathrm{odd}} = \{1, 2, \ldots, N\} \quad\text{and}\quad \Omega_{\mathrm{even}} \cap \Omega_{\mathrm{odd}} = \emptyset. \tag{5.20} \]
With these constraints, a vehicle in the set $\Omega_{\mathrm{odd}}$ cannot gather distance information about vehicles in the same set because their perturbation signals are correlated. Therefore a vehicle in the set $\Omega_{\mathrm{odd}}$ indirectly interacts with another vehicle in the same set by influencing vehicles in the set $\Omega_{\mathrm{even}}$. The same is true for vehicles in the set $\Omega_{\mathrm{even}}$.
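The role of the phase offsets can be illustrated with a short computation. It is a sketch under the assumption, made here for illustration, that with a common Brownian motion the perturbation of vehicle $i$ behaves like $\cos(s + \phi_i)$ with $s$ uniformly distributed on $[0, 2\pi]$ in the long run. The stationary cross-average of two such perturbations is then

\[ \frac{1}{2\pi}\int_0^{2\pi} \cos(s + \phi_i)\cos(s + \phi_j)\,ds = \frac{1}{2}\cos(\phi_i - \phi_j), \]

which equals $\pm\frac{1}{2}$ when $\phi_i - \phi_j$ is a multiple of $\pi$ (fully correlated perturbations, as within $\Omega_{\mathrm{odd}}$ or within $\Omega_{\mathrm{even}}$) and vanishes when $\phi_i - \phi_j$ is an odd multiple of $\pi/2$ (uncorrelated perturbations, as between the two sets).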
Remark 5.2: Besides dealing with N vehicles converging to a Nash equilibrium,
the main difference between [60], where cos(ωt) and sin(ωt) were used as probing
signals, and this work is that we use components of Brownian motion on the unit
circle as a probing signal.
5.4 Stability Analysis
In this section, we present and prove local stability, in a specific probabilistic
sense, for a group of vehicles with the two excitation cases presented in Section 5.3.
We define an output error variable $e_i = \frac{h}{s+h}[J_i(t)] - f^*$, where $\frac{h}{s+h}$ is a low-pass filter applied to the cost $J_i$, which allows us to express $\xi_i(t)$, the signal from the washout filter, as $\xi_i(t) = \frac{s}{s+h}[J_i(t)] = J_i(t) - f^* - e_i(t)$, noting also that $\dot{e}_i = h\xi_i$.
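The relation between the washout filter and the low-pass state can be tried out numerically. The following is a minimal forward-Euler sketch, with a hypothetical function name `washout` and an arbitrary test signal; the internal state `eta` plays the role of $f^* + e_i$ in the text. A constant offset in $J$ should be removed from $\xi$, while a sinusoid at frequency $\omega$ should pass with gain $\omega/\sqrt{\omega^2 + h^2}$.

```python
import numpy as np

def washout(J, h, dt):
    """Discrete sketch of xi = s/(s+h)[J]: eta' = h*(J - eta), xi = J - eta."""
    J = np.asarray(J, dtype=float)
    eta = 0.0                      # low-pass state (f* + e in the text's notation)
    xi = np.empty_like(J)
    for k, Jk in enumerate(J):
        xi[k] = Jk - eta           # high-pass output
        eta += dt * h * xi[k]      # forward-Euler update of the low-pass state
    return xi

h, omega, dt = 10.0, 45.0, 1e-4
t = np.arange(0.0, 4.0, dt)
J = 3.0 + 0.5 * np.sin(omega * t)  # constant offset plus a probing-frequency ripple
xi = washout(J, h, dt)

period = int(round(2.0 * np.pi / omega / dt))
tail = xi[-period:]                                # last period, after the transient
expected_amp = 0.5 * omega / np.hypot(omega, h)    # high-pass gain at omega
print(tail.mean(), np.abs(tail).max(), expected_amp)
```

With these values the offset of 3 disappears from the output, while the ripple is attenuated only slightly, which is the behavior the extremum seeking loop relies on.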
5.4.1 Case 1
Here we show stability for a group of fully actuated vehicles with control laws
(5.6)–(5.10) using the Case 1 perturbations.
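Theorem 5.1 Consider the closed-loop system
\[ dx_i = -\frac{a}{2\epsilon}\eta_{1i}\,dt - a\eta_{2i}\,dW_i + c_x\xi_i\eta_{1i}\,dt + \nu_{xi}\,dt, \tag{5.21} \]
\[ dy_i = -\frac{a}{2\epsilon}\eta_{2i}\,dt + a\eta_{1i}\,dW_i + c_y\xi_i\eta_{2i}\,dt + \nu_{yi}\,dt, \tag{5.22} \]
\[ de_i = h\xi_i\,dt, \tag{5.23} \]
\[ d\eta_{1i} = -\frac{1}{2\epsilon}\eta_{1i}\,dt - \eta_{2i}\,dW_i, \tag{5.24} \]
\[ d\eta_{2i} = -\frac{1}{2\epsilon}\eta_{2i}\,dt + \eta_{1i}\,dW_i, \tag{5.25} \]
\[ \xi_i = q_x(x_i - x^*)^2 + q_y(y_i - y^*)^2 + \sum_{j=1}^{N} q_{ij}d_{ij}^2 - e_i, \tag{5.26} \]
$\forall i \in \{1, 2, \ldots, N\}$, where the parameters $\nu_x, \nu_y \in \mathbb{R}^N$, $a, c_x, c_y, h, q_x, q_y > 0$ and $q_{ij} \ge 0$, $\forall i, j \in \{1, 2, \ldots, N\}$, and the signal $W_i$ is a Brownian motion with $W_i(0) = 0$, $W_i(t) \neq W_j(t)$. If the initial conditions are $[\eta_{1i}(0), \eta_{2i}(0)]^T = [\cos(\phi_i), \sin(\phi_i)]^T$ with $\phi_i \in \mathbb{R}$ and $x(0), y(0), e(0)$ are such that the quantities $|x_i(0) - x^* - x_i^{\mathrm{eq}}|$, $|y_i(0) - y^* - y_i^{\mathrm{eq}}|$, $|e_i(0) - e_i^{\mathrm{eq}}|$ are sufficiently small, where $(x^*, y^*)$ is the minimizer of (5.2),
\[ x^{\mathrm{eq}} = \frac{1}{c_xa}Q_x^{-1}\nu_x, \tag{5.27} \]
\[ y^{\mathrm{eq}} = \frac{1}{c_ya}Q_y^{-1}\nu_y, \tag{5.28} \]
\[ e_i^{\mathrm{eq}} = q_x(x_i^{\mathrm{eq}})^2 + q_y(y_i^{\mathrm{eq}})^2 + \frac{a^2}{2}(q_x + q_y) + \sum_{j \in N} q_{ij}\bigl[(x_i^{\mathrm{eq}} - x_j^{\mathrm{eq}})^2 + (y_i^{\mathrm{eq}} - y_j^{\mathrm{eq}})^2\bigr], \tag{5.29} \]
and the matrices $Q_x$ and $Q_y$, given by
\[ Q_{x\,ij} = \begin{cases} -q_x - \sum_{k=1}^{N} q_{ik}, & i = j \\ q_{ij}, & i \neq j, \end{cases} \tag{5.30} \]
\[ Q_{y\,ij} = \begin{cases} -q_y - \sum_{k=1}^{N} q_{ik}, & i = j \\ q_{ij}, & i \neq j, \end{cases} \tag{5.31} \]
are invertible, then there exist constants $C_x, C_y, \gamma_x, \gamma_y > 0$ and a function $T(\epsilon) : (0, \epsilon_0) \to \mathbb{N}$ such that for any $\delta > 0$
\[ \lim_{\epsilon \to 0} \inf\bigl\{t \ge 0 : |x_i(t) - x^* - x_i^{\mathrm{eq}}| > C_xe^{-\gamma_xt} + \delta + a\bigr\} = \infty, \quad \text{a.s.}, \tag{5.32} \]
\[ \lim_{\epsilon \to 0} \inf\bigl\{t \ge 0 : |y_i(t) - y^* - y_i^{\mathrm{eq}}| > C_ye^{-\gamma_yt} + \delta + a\bigr\} = \infty, \quad \text{a.s.}, \tag{5.33} \]
and
\[ \lim_{\epsilon \to 0} P\bigl\{|x_i(t) - x^* - x_i^{\mathrm{eq}}| \le C_xe^{-\gamma_xt} + \delta + a,\ \forall t \in [0, T(\epsilon)]\bigr\} = 1, \tag{5.34} \]
\[ \lim_{\epsilon \to 0} P\bigl\{|y_i(t) - y^* - y_i^{\mathrm{eq}}| \le C_ye^{-\gamma_yt} + \delta + a,\ \forall t \in [0, T(\epsilon)]\bigr\} = 1, \tag{5.35} \]
$\forall i \in \{1, 2, \ldots, N\}$, with $\lim_{\epsilon \to 0} T(\epsilon) = \infty$. The constants $C_x$ and $C_y$ depend on both the initial condition $(x(0), y(0), e(0))$ and the parameters $a, c_x, c_y, \nu_x, \nu_y, h, q_x, q_y$. The constants $\gamma_x, \gamma_y$ depend on the parameters $a, c_x, c_y, \nu_x, \nu_y, h, q_x, q_y$.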
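Proof: We start by defining the error variables
\[ \tilde{x} = x - x^* - a\eta_1, \tag{5.36} \]
\[ \tilde{y} = y - y^* - a\eta_2, \tag{5.37} \]
and define $[\chi_1(t), \chi_2(t)] = [\eta_1(\epsilon t), \eta_2(\epsilon t)]$. Since
\[ d\tilde{x} = dx - \frac{a}{2}\chi_1(t/\epsilon)\,dt - \chi_2(t/\epsilon)\,dW, \tag{5.38} \]
\[ d\tilde{y} = dy - \frac{a}{2}\chi_2(t/\epsilon)\,dt + \chi_1(t/\epsilon)\,dW, \tag{5.39} \]
we obtain the following dynamics for the error variables:
\[ \frac{d\tilde{x}}{dt} = c_x\xi\chi_1(t/\epsilon) + \nu_x, \tag{5.40} \]
\[ \frac{d\tilde{y}}{dt} = c_y\xi\chi_2(t/\epsilon) + \nu_y, \tag{5.41} \]
\[ \frac{de}{dt} = h\xi, \tag{5.42} \]
\[ \begin{aligned} \xi_i ={}& q_x\bigl(\tilde{x}_i + a\chi_{1i}(t/\epsilon)\bigr)^2 + q_y\bigl(\tilde{y}_i + a\chi_{2i}(t/\epsilon)\bigr)^2 \\ &+ \sum_{j \in N} q_{ij}\Bigl[\bigl(\tilde{x}_i + a\chi_{1i}(t/\epsilon) - \tilde{x}_j - a\chi_{1j}(t/\epsilon)\bigr)^2 + \bigl(\tilde{y}_i + a\chi_{2i}(t/\epsilon) - \tilde{y}_j - a\chi_{2j}(t/\epsilon)\bigr)^2\Bigr] - e_i, \end{aligned} \tag{5.43} \]
\[ d\chi_1(t) = -\frac{1}{2}\chi_1(t)\,dt - \chi_2(t)\,dW, \tag{5.44} \]
\[ d\chi_2(t) = -\frac{1}{2}\chi_2(t)\,dt + \chi_1(t)\,dW. \tag{5.45} \]
We use the general stochastic averaging result given in Theorem 2 of [40] to analyze the error system. We first calculate the average system of (5.40)–(5.42). The signals $\chi_1$ and $\chi_2$ are both components of the Brownian motion on a unit circle, which is known to be exponentially ergodic with invariant distribution $\mu(S) = \frac{l(S)}{2\pi}$ for any set $S \subset \{(x, y) \in \mathbb{R}^2 \mid x^2 + y^2 = 1\}$, where $l(S)$ denotes the length (Lebesgue measure) of $S$ [3]. The integral over the entire space of functions of Brownian motion on a unit circle can be reduced to the integral from 0 to $2\pi$. Since
\[ \int_{\mathbb{R}} \cos^{2k+1}(s)\,\mu(ds) = \int_0^{2\pi} \cos^{2k+1}(s)\frac{1}{2\pi}\,ds = 0, \tag{5.46} \]
\[ \int_{\mathbb{R}} \cos^2(s)\,\mu(ds) = \int_0^{2\pi} \cos^2(s)\frac{1}{2\pi}\,ds = \frac{1}{2}, \tag{5.47} \]
\[ \int_{\mathbb{R}}\int_{\mathbb{R}} \cos(s)\cos(r)\,\mu(ds)\,\mu(dr) = \int_0^{2\pi}\int_0^{2\pi} \cos(s)\cos(r)\frac{1}{4\pi^2}\,ds\,dr = 0, \tag{5.48} \]
(note that the same applies to the sine function) and
\[ \int_{\mathbb{R}} \cos(s)\sin(s)\,\mu(ds) = \int_0^{2\pi} \cos(s)\sin(s)\frac{1}{2\pi}\,ds = 0, \tag{5.49} \]
we obtain the average error system
\[ \frac{d\tilde{x}^{\mathrm{ave}}}{dt} = c_xaQ_x\tilde{x}^{\mathrm{ave}} + \nu_x, \tag{5.50} \]
\[ \frac{d\tilde{y}^{\mathrm{ave}}}{dt} = c_yaQ_y\tilde{y}^{\mathrm{ave}} + \nu_y, \tag{5.51} \]
\[ \frac{de_i^{\mathrm{ave}}}{dt} = h\Bigl(-e_i^{\mathrm{ave}} + q_x(\tilde{x}_i^{\mathrm{ave}})^2 + q_y(\tilde{y}_i^{\mathrm{ave}})^2 + \frac{a^2}{2}(q_x + q_y)\Bigr) + h\sum_{j \in N} q_{ij}\bigl[(\tilde{x}_i^{\mathrm{ave}} - \tilde{x}_j^{\mathrm{ave}})^2 + (\tilde{y}_i^{\mathrm{ave}} - \tilde{y}_j^{\mathrm{ave}})^2\bigr]. \tag{5.52} \]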
Using the fact that $Q_x$ and $Q_y$ have the special form shown in (5.30) and (5.31), together with the Gershgorin Circle Theorem (Theorem 7.2.1 in [25]), we get that as long as $q_x, q_y > 0$, the matrices $Q_x$, $Q_y$ have eigenvalues with negative real parts (i.e., they are Hurwitz and invertible).
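The Gershgorin argument can be written out explicitly. The sketch below assumes $q_{ii} = 0$, which is not stated in the text but is natural since $d_{ii} = 0$. Row $i$ of $Q_x$ has diagonal entry $-q_x - \sum_{k} q_{ik}$ and off-diagonal absolute row sum $R_i = \sum_{j \neq i} q_{ij} = \sum_k q_{ik}$, so every eigenvalue $\lambda$ of $Q_x$ lies in some disc

\[ \Bigl|\lambda + q_x + \sum_{k=1}^{N} q_{ik}\Bigr| \le \sum_{j \neq i} q_{ij}, \qquad\text{which implies}\qquad \operatorname{Re}\lambda \le -q_x - \sum_{k=1}^{N} q_{ik} + \sum_{j \neq i} q_{ij} = -q_x < 0 . \]

The same bound holds for $Q_y$ with $q_y$ in place of $q_x$. Since the weights $q_{ij}$ need not be symmetric, the eigenvalues may be complex, but their real parts remain bounded away from zero.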
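The average error system has equilibria (5.27), (5.28), and (5.29) with the Jacobian
\[ A = \begin{bmatrix} \frac{c_xa}{2\pi}Q_x & 0 & 0 \\ 0 & \frac{c_ya}{2\pi}Q_y & 0 \\ 0 & 0 & -hI \end{bmatrix}. \tag{5.53} \]
The matrices $Q_x$ and $Q_y$ are Hurwitz, which implies that $A$ is Hurwitz and that the equilibria (5.27), (5.28), and (5.29) are exponentially stable.
Using Theorem 2 in [40], there exist constants $c > 0$, $r > 0$, $\gamma > 0$ and a function $T(\epsilon) : (0, \epsilon_0) \to \mathbb{N}$ such that for any $\delta > 0$ and any initial conditions $|\Lambda^\epsilon(0)| < r$,
\[ \lim_{\epsilon \to 0} \inf\bigl\{t \ge 0 : |\Lambda^\epsilon(t)| > c|\Lambda^\epsilon(0)|e^{-\gamma t} + \delta\bigr\} = \infty, \quad \text{a.s.}, \tag{5.54} \]
and
\[ \lim_{\epsilon \to 0} P\bigl\{|\Lambda^\epsilon(t)| \le c|\Lambda^\epsilon(0)|e^{-\gamma t} + \delta,\ \forall t \in [0, T(\epsilon)]\bigr\} = 1, \tag{5.55} \]
with $\lim_{\epsilon \to 0} T(\epsilon) = \infty$, where
\[ \Lambda^\epsilon(t) = \begin{bmatrix} \tilde{x} - x^{\mathrm{eq}} \\ \tilde{y} - y^{\mathrm{eq}} \\ e - e^{\mathrm{eq}} \end{bmatrix}. \tag{5.56} \]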
The results (5.54) and (5.55) state that the norm of the error vector $\Lambda^\epsilon(t)$ exponentially converges, both almost surely and in probability, to below an arbitrarily small residual value δ over an arbitrarily long time interval, which tends to infinity as ε goes to zero. In particular, the $\tilde{x}_i$-component and the $\tilde{y}_i$-component of the error vector, for all $i \in \{1, 2, \ldots, N\}$, converge to below δ, which gives us (5.32)–(5.35).
5.4.2 Case 2
Here we show stability for a group of fully actuated vehicles with control laws
(5.6)–(5.10) using the Case 2 perturbations.
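Theorem 5.2 Consider the closed-loop system (5.21)–(5.26), where the parameters $\nu_x, \nu_y \in \mathbb{R}^N$, $a, c_x, c_y, h, q_x, q_y > 0$, and $q_{ij} \ge 0$ $\forall i, j \in \{1, 2, \ldots, N\}$. If the initial conditions are $W_i(0) = 0$, $[\eta_{1i}(0), \eta_{2i}(0)]^T = [\cos(\phi_i), \sin(\phi_i)]^T$ with $\phi_i$ chosen such that (5.19)–(5.20) holds and $x(0), y(0), e(0)$ are such that the following quantities
\[ |x_i(0) - x^* - x_i^{\mathrm{eq}}|, \quad |y_i(0) - y^* - y_i^{\mathrm{eq}}|, \quad |e_i(0) - e_i^{\mathrm{eq}}|, \tag{5.57} \]
are sufficiently small, where
\[ \begin{bmatrix} x^{\mathrm{eq}} \\ y^{\mathrm{eq}} \end{bmatrix} = A_{xy}^{-1}\begin{bmatrix} \nu_x \\ \nu_y \end{bmatrix}, \tag{5.58} \]
\[ e_i^{\mathrm{eq}} = q_x(x_i^{\mathrm{eq}})^2 + q_y(y_i^{\mathrm{eq}})^2 + \frac{a^2}{2}(q_x + q_y) + \sum_{j \in N} q_{ij}\bigl[(x_i^{\mathrm{eq}} - x_j^{\mathrm{eq}})^2 + (y_i^{\mathrm{eq}} - y_j^{\mathrm{eq}})^2\bigr], \tag{5.59} \]
and the matrix $A_{xy}$ is invertible and is given by
\[ A_{xy} = \begin{bmatrix} c_xa(Q_\Omega - Iq_x) & -c_xaQ_\Omega \\ -c_yaQ_\Omega & c_ya(Q_\Omega - Iq_y) \end{bmatrix}, \tag{5.60} \]
\[ Q_{\Omega\,ij} = \begin{cases} -\sum_{k \in \Omega_{\mathrm{odd}}} q_{ik}, & i \in \Omega_{\mathrm{even}},\ i = j \\ -\sum_{k \in \Omega_{\mathrm{even}}} q_{ik}, & i \in \Omega_{\mathrm{odd}},\ i = j \\ q_{ij}, & i \in \Omega_{\mathrm{even}},\ j \in \Omega_{\mathrm{odd}} \\ q_{ij}, & i \in \Omega_{\mathrm{odd}},\ j \in \Omega_{\mathrm{even}} \\ 0, & \text{otherwise}, \end{cases} \tag{5.61} \]
then there exist constants $C_x, C_y, \gamma_x, \gamma_y > 0$ and a function $T(\epsilon) : (0, \epsilon_0) \to \mathbb{N}$ such that for any $\delta > 0$
\[ \lim_{\epsilon \to 0} \inf\bigl\{t \ge 0 : |x_i(t) - x^* - x_i^{\mathrm{eq}}| > C_xe^{-\gamma_xt} + \delta + a\bigr\} = \infty, \quad \text{a.s.}, \tag{5.62} \]
\[ \lim_{\epsilon \to 0} \inf\bigl\{t \ge 0 : |y_i(t) - y^* - y_i^{\mathrm{eq}}| > C_ye^{-\gamma_yt} + \delta + a\bigr\} = \infty, \quad \text{a.s.}, \tag{5.63} \]
and
\[ \lim_{\epsilon \to 0} P\bigl\{|x_i(t) - x^* - x_i^{\mathrm{eq}}| \le C_xe^{-\gamma_xt} + \delta + a,\ \forall t \in [0, T(\epsilon)]\bigr\} = 1, \tag{5.64} \]
\[ \lim_{\epsilon \to 0} P\bigl\{|y_i(t) - y^* - y_i^{\mathrm{eq}}| \le C_ye^{-\gamma_yt} + \delta + a,\ \forall t \in [0, T(\epsilon)]\bigr\} = 1, \tag{5.65} \]
$\forall i \in \{1, 2, \ldots, N\}$, with $\lim_{\epsilon \to 0} T(\epsilon) = \infty$.
Proof: Similar to the proof for Theorem 5.1, we start by applying (5.36), (5.37) and defining $[\chi_1(t), \chi_2(t)] = [\eta_1(\epsilon t), \eta_2(\epsilon t)]$. By employing stochastic averaging to compute the average system and then linearizing the average system about the equilibrium $[x^{\mathrm{eq}}, y^{\mathrm{eq}}, e^{\mathrm{eq}}]^T$, we obtain the Jacobian
\[ A = \begin{bmatrix} A_{xy} & 0 \\ 0 & -hI \end{bmatrix}, \tag{5.66} \]
which is block diagonal and Hurwitz since $A_{xy}$ and $-hI$ are both Hurwitz. By applying Theorem 2 in [40], similar to Theorem 5.1, we can obtain the results (5.62)–(5.65).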
5.5 Simulation
In this section, we show numerical results for a group of vehicles with the control scheme presented in Section 5.3. For the following simulations, without loss of generality, we let the unknown location of the minimum of the signal field be at the origin, $(x^*, y^*) = (0, 0)$, and let the unknown signal field parameters be $(q_x, q_y) = (1, 1)$.
In Figure 5.1 we consider 13 vehicles with Case 1 perturbations. We choose the design parameters as $a = 0.01$, $c_x = c_y = 150$, $h = 10$, and define agents 1 through 6 as the anchor agents with the forcing terms
\[ (\nu_{xi}, \nu_{yi}) = 0.05\Bigl(\cos\Bigl(\frac{i\pi}{3}\Bigr), \sin\Bigl(\frac{i\pi}{3}\Bigr)\Bigr), \tag{5.67} \]
where $i = 1, \ldots, 6$. In addition to the design parameters, we picked the interaction gains $q_{ij}$ such that
\[ \begin{cases} q_{i,i+1} = q_{i+1,i} = 0.5, & i \in \{1, \ldots, 12\},\ i \neq 6 \\ q_{i,13} = 0.5, & i \in \{7, \ldots, 12\} \\ q_{i,i-6} = q_{i-6,i} = 1, & i \in \{7, \ldots, 12\} \\ q_{i,j} = 0, & \text{otherwise}. \end{cases} \tag{5.68} \]
Figure 5.1 shows the ability of the control algorithm to produce a circular distribu-
tion around the source with a higher density of vehicles near the source. In this plot,
the trajectories of the vehicles are not shown to avoid obscuring the final vehicle
formation.
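The interaction gains (5.68) can be assembled into the matrix $Q_x$ of (5.30) to check numerically that the Hurwitz property used in the proof of Theorem 5.1 holds for this simulation setup. The sketch below is an illustration only; the 1-based bookkeeping and variable names are choices made here, and $q_x = 1$ is the signal field parameter stated for the simulations.

```python
import numpy as np

N, qx = 13, 1.0

# Build q_ij from (5.68) using 1-based indices; row and column 0 are unused padding.
q = np.zeros((N + 1, N + 1))
for i in range(1, 13):
    if i != 6:
        q[i, i + 1] = q[i + 1, i] = 0.5     # neighbors along the two rings
for i in range(7, 13):
    q[i, 13] = 0.5                          # followers 7..12 weight the center agent
    q[i, i - 6] = q[i - 6, i] = 1.0         # follower i paired with anchor i-6
q = q[1:, 1:]                               # drop the padding

# Q_x from (5.30): off-diagonal q_ij, diagonal -q_x - sum_k q_ik
Qx = q - np.diag(qx + q.sum(axis=1))

eigs = np.linalg.eigvals(Qx)
offdiag_row_sums = np.abs(Qx).sum(axis=1) - np.abs(np.diag(Qx))
gershgorin_right_edge = np.max(np.diag(Qx) + offdiag_row_sums)
print("largest real part:", eigs.real.max())
print("Gershgorin right edge:", gershgorin_right_edge)
```

For these gains every Gershgorin disc ends at $-q_x$, so all eigenvalues have real parts at or below $-1$ and $Q_x$ is invertible, even though the gain matrix is not symmetric.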
In Figure 5.2 we consider 5 vehicles with Case 2 constraints. We pick agent 1 and
agent 5 as the anchor agents with (νx1, νy1) = (−0.1, 0.1), (νx5, νy5) = (0.1,−0.1),
and choose the other design parameters to be the same as in the previous simulation.
We assume that each vehicle interacts with only the closest indexed agents with a
weighting of 0.5, i.e., qi,i+1 = qi+1,i = 0.5, i = 1, ..., 4. Figure 5.2 shows a line
formation centered at the source, with a higher density of agents near the source,
and generated by agents using a single Brownian motion signal.
Illustrated in these figures is the effect of the forcing terms (νxi, νyi) assigned
to the anchor agents. By carefully selecting these forcing terms, other geometric
deployments can be made, which will be distorted by the signal field. For instance,
79
Figure 5.1: Shows a group of vehicles using the stochastic extremum seeking algo-rithm with Case 1 perturbations and interaction gains given by (5.68). The anchoragents are denoted by red triangles and the follower agents are denoted by blue dots.The agents start inside the dashed black line and converge to a circular formationaround the source.
if νxi and νyi (5.67) were defined as
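\[ \nu_{xi} = 0.05\left(a\cos\left(\frac{i\pi}{3}\right) - b\sin\left(\frac{i\pi}{3}\right)\right) \tag{5.69} \]
\[ \nu_{yi} = 0.05\left(a\cos\left(\frac{i\pi}{3}\right) + b\sin\left(\frac{i\pi}{3}\right)\right), \tag{5.70} \]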
where a is the semimajor axis and b is the semiminor axis, an elliptical deployment
will result.
5.6 Conclusion
In this chapter, we presented a stochastic extremum seeking algorithm for a group
of agents, with two different constraints on the agents, to achieve stable deployment
over a source. We presented a stability proof that shows convergence of the vehicles
to a Nash equilibrium, both in the almost sure sense and in probability, when using
Figure 5.2: Shows a group of vehicles using the stochastic extremum seeking algorithm with Case 2 perturbations. The agents start inside the dashed black line and converge to a line formation centered around the source with the anchor agents at the end of the line formation.
two kinds of excitation signals. We show simulation results for the control algorithm
applied to agents on a static source.
This chapter is in full a reprint of the material as it has been submitted to:
N. Ghods, P. Frihauf, and M. Krstic, “Multi-agent deployment in the plane using
stochastic extremum seeking,” IEEE Conference on Decision and Control, 2010.
The dissertation author was the primary investigator and author of this paper.
6
Light Source Seeking Experiments
In this chapter we consider the problem of seeking a light source with an au-
tonomous ground vehicle. The vehicle does not have the capability of sensing its
position or the position of the source but is capable of sensing the light signal orig-
inating from the light bulb. The light field created by the light bulb decays away
from the position of the light bulb but the vehicle does not have the knowledge of the
functional form of this field. We employ a control strategy that keeps the forward
velocity constant and tunes the angular velocity via extremum seeking. First, we
present the design for a light-seeking robot. We produce experimental results of a
vehicle performing localizing, tracking, and tracing level-sets of a light source. We
also present multiple vehicles seeking a light source while avoiding objects and each
other.
6.1 Introduction
Research on applications of autonomous vehicles is wide, varied, and
constantly growing. In particular, the field of research dealing with vehicles deprived
of position information is rapidly gaining interest. These vehicles must navigate and
perform a desired task without the use of GPS or inertial navigation. The vehicles
that we use in this work do not have lateral motion capabilities.
In this chapter we present experiments to support some of the theoretical and
numerical results covered in [11, 12]. We will employ a control scheme based on
extremum seeking to control the heading of ground vehicles while keeping their for-
ward speed constant. In [11] theoretical results for basic extremum seeking applied
to the steering of autonomous vehicles are provided. In [12], the application of
extremum seeking to vehicles with objectives and configurations different
from those covered by the theory is presented.
Extremum seeking employs a periodic probing motion of the vehicle to search
the signal space, which then provides the necessary information to orient the vehicle
in the correct direction. There exist applications for which this probing motion is
undesirable, in which case extremum seeking can still be applied via a slight modi-
fication of decoupling the sensor from the body on the vehicle. In the experiments
presented here we modified the extremum seeking method to separate the desired
tuning of the vehicle orientation from the undesirable periodic probing. The concept
behind decoupled extremum seeking is that the sensor can move along the vehicle
body, providing the necessary probing motion, while the vehicle itself moves in a
smooth fashion. Implementing decoupled extremum seeking does not hinder the
vehicle’s capability of source seeking.
In Section 6.2 we provide the design of the Autonomous Nonholonomic Tracker
(ANT), which is used in the light-seeking experiments. Section 6.3 presents exper-
imental results for localizing a stationary light source, tracking a moving source,
tracing the level sets, and source seeking and collision avoidance with one or two
robots. We conclude this chapter with our future intentions in Section 6.4.
6.2 Vehicle Design
The basic vehicle configuration used for extremum seeking assumes the vehicle
itself can readily perform the movement caused by the periodic perturbation used to
search the space. In our case it is inefficient to have the entire robot move in these
periodic probing motions, so we decouple the sensor from the
body of the robot. The ANT was designed around the unicycle model with a decoupled
sensor depicted in Figure 6.1. The key aspects in the design of the ANT were keeping
the sensor at a distance R from the center, keeping the axis of rotation of the vehicle
Figure 6.1: Graphical interpretation of the unicycle model with a decoupled sensor. The red dot indicates the sensor's location. (Diagram labels: x, y, θ, θs, v.)
at its center rc, and having separate actuation to decouple the sensor sweeping θs
from the vehicle turning (θ). Numerical validation of the decoupled unicycle model
used for source seeking is discussed in [12].
The ANT was assembled with two decks made of acrylic. As shown in Figure
6.2 the wireless communications, battery, steering servo, and the decoupling sensor
servo are housed in between the acrylic decks. The bottom of the lower deck houses
the steering gears and the driving servo. The top deck contains the light sensor arm
and circuit board.
The ANT uses two types of sensors on-board: a light sensor and an IR proximity
sensor. The TAOS TSL14S-LF light sensor is placed at the tip of the sweeping
sensor arm to provide light intensity readings. The light sensor output passes
through a low-pass RC filter, with a cutoff frequency of 10 Hz, built on a custom
printed circuit board (PCB), to remove high frequency noise. The ANT has two
Sharp GP2D120XJ00F IR proximity sensors located at the front left and front right of the robot,
which help the robot detect and avoid obstacles.
The ANT uses two Hitec HS-85MG micro servos for locomotion, one for contin-
uously moving forward and the other for steering. A Cirrus CS301 micro servo is
used to provide the sweeping motion of the sensor arm. All servos are controlled by
the PWM (pulse width modulation) that comes from the microprocessor. To power
Figure 6.2: ANT (a) top view (b) bottom view. Labeled components: light-sensor arm, battery, IR sensor, wireless communication, servo, PCB, wheel, axle, bearing, steering gear, driving servo.
all the electronics on the ANT we use a Tenergy 2S-500-10 lithium polymer battery
pack. A 5V voltage regulator is used to maintain a consistent supply voltage to the
electrical components.
A custom designed PCB, shown in Figure 6.3, was created for the ANT. At the
core of the PCB is the dsPIC30F4012 microprocessor from Microchip. There are
six analog input channels and six PWM output channels on the 28-pin microproces-
sor, which allow for the addition of three more sensors in the future depending on
the application. The microprocessor on the PCB also connects through the MPR
connector to a wireless xBee communication module used for data collation.
The control algorithm for the ANTs is given as
\[ v = v_0 \tag{6.1} \]
\[ \theta = c\sin(\omega t)\,\frac{s}{s+h}[J] + d\,(IR_{\text{left}} - IR_{\text{right}}) \tag{6.2} \]
\[ \theta_s = a\omega\cos(\omega t) \tag{6.3} \]
where v commands the surge servo, θ commands the steering servo, and θs commands
the decoupled sensor arm. The light sensor reading J and the IR sensor readings
IRleft and IRright are the inputs to the control algorithm. The parameter v0 is a
constant that determines the forward speed of the robot. The extremum seeking
parameters are a, ω, c, and h where a is the probing amplitude, ω is the sinusoidal
Figure 6.3: CAD rendering of the PCB. Labeled parts: sensor inputs, battery input, ON/OFF switch, MPR connection, programming connection, motor outputs, PIC microcontroller.
sweeping frequency, c is the adaptive gain, and h is the cutoff frequency of the
washout filter. The addition of the d term, which acts as an obstacle avoidance
gain, in (6.2) was made to give the robot the ability of avoiding obstacles and other
robots. The ANT was programmed with a digital version of the extremum seeking
algorithm (6.1)–(6.3) in the MPLAB integrated development environment provided
by Microchip.
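A digital version of (6.1)–(6.3) can be sketched as follows. This is only an illustration in Python, not the firmware that ran on the dsPIC. It assumes a forward-Euler discretization of the washout filter s/(s + h), and the class name, sample time, and gain values are hypothetical.

```python
import math

class AntController:
    """Discrete-time sketch of (6.1)-(6.3) with a forward-Euler washout filter."""

    def __init__(self, v0, a, omega, c, h, d, dt):
        self.v0, self.a, self.omega = v0, a, omega
        self.c, self.h, self.d, self.dt = c, h, d, dt
        self.low_pass = 0.0  # state of the low-pass part of s/(s+h)
        self.t = 0.0

    def step(self, J, ir_left, ir_right):
        # Washout filter: s/(s+h)[J] = J - h/(s+h)[J]
        washed = J - self.low_pass
        self.low_pass += self.dt * self.h * (J - self.low_pass)
        # (6.2): demodulated light signal plus obstacle avoidance term
        theta_cmd = (self.c * math.sin(self.omega * self.t) * washed
                     + self.d * (ir_left - ir_right))
        # (6.3): probing command for the decoupled sensor arm
        theta_s_cmd = self.a * self.omega * math.cos(self.omega * self.t)
        self.t += self.dt
        return self.v0, theta_cmd, theta_s_cmd  # (6.1): constant surge command

# Hypothetical gains and a 10 ms sample time, for illustration only.
controller = AntController(v0=1.0, a=0.1, omega=10.0, c=5.0, h=1.0, d=0.2, dt=0.01)
v, theta_cmd, theta_s_cmd = controller.step(J=2.0, ir_left=1.0, ir_right=0.5)
```

Because the washout filter removes the constant part of J, a steady light reading contributes nothing to the steering command, and only the IR difference remains.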
6.3 Experiment
In this section we show the extremum seeking method employed on the ANT.
The ANT is given the task of seeking the source or a level set produced by a light
source while avoiding obstacles. The ANT has no information about its position or
the position of the source. Similar to most mobile vehicles, the ANT has kinematic
constraints, which do not allow the robot to move sideways. Considering these
constraints, one of the advantages of the extremum seeking method is being able to
simultaneously solve a nonholonomic steering problem while also solving an adaptive
optimization problem. The experiments done in this section use one or two desk
lamps as the source and a table gridded with 0.15m (6.0in) squares to give a better
idea of relative distance as the vehicle moves around on the table.
6.3.1 Localization and Tracking of a Light Source
Here we show experimental results of the extremum seeking method to not only
localize a light source but also to track the light source once the source moves.
We designed an experiment to test how the algorithm would handle the worst case
scenario of moving sources, i.e., instantly moving from one location to another. The
experiment was done with two light bulbs. The experiment begins with one light
bulb turned on, then once the robot has converged to the light bulb it is turned
off and a second light bulb is turned on. From the perspective of the vehicle this
experiment emulates a source that can instantly move from one location to another.
The first two photos in Figure 6.4 show the ANT starting from a location away from
the light bulb and then quickly converging to the light bulb. Since the extremum
seeking algorithm never stops searching the vehicle continues to sniff around the
light source. As shown in the last two photos of Figure 6.4, once the light source is
switched the vehicle starts converging to the new light source.
6.3.2 Level Set Tracking of a Light Source
Tracing out the curves which define a specific value of the signal is a good way to
gain more information about the signal field. These curves are referred to as a level
set or isoline. A simple modification to the extremum seeking algorithm produces a
simple solution for implementation on the ANT to perform level set tracing. In the
tracking experiment the robot was trying to maximize the light intensity J that it
was measuring. In this experiment we modify (6.2) by replacing J with the negative
absolute value of the difference of the sensor reading J and the desired level set
value Jd. The steering control law, modified for level set tracing, becomes
\[ \theta = c\sin(\omega t)\,\frac{s}{s+h}\big[-|J - J_d|\big] + d\,(IR_{\text{left}} - IR_{\text{right}}). \tag{6.4} \]
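The substitution used in (6.4) turns level set tracing into a maximization problem: the modified signal is largest (zero) exactly on the desired level set and decreases on either side. A minimal sketch, with a hypothetical function name:

```python
def level_set_signal(J, J_d):
    """Signal fed to the washout filter in (6.4): maximal (zero) when J equals J_d."""
    return -abs(J - J_d)

# Readings below, on, and above a desired level J_d = 3.0 (illustrative values).
samples = [level_set_signal(J, 3.0) for J in (2.0, 3.0, 4.0)]
```

The extremum seeking loop itself is unchanged; it simply climbs this modified signal instead of the raw light intensity.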
For this experiment we hang the two lamps above the table to produce a peanut-
like shaped signal field. Figure 6.5 shows a sequence of pictures of the vehicle
employing extremum seeking to trace a level set of the light source. The pictures are
taken at fifteen second intervals. A marker attached to the bottom of the vehicle is
used to draw the vehicle’s path as it performs the level set tracing. Figure 6.6 shows
Figure 6.4: Photographs of the ANT performing source seeking with overlaid trajectory, appearing in order from left to right, top to bottom.
Figure 6.5: Photographs of the ANT performing level set tracing at 15 sec intervals, appearing in order from left to right, top to bottom.
the test table after five minutes, where the ANT has traced out the level set two and
a half times. The vehicle traced a peanut-like shape of approximately 45in × 27in
(115cm × 70cm) with a maximum deviation of approximately 2in (5cm) between
laps. From these pictures we can conclude that a vehicle employing extremum
seeking can successfully perform level set tracing on a static unknown source given
a desired signal intensity Jd.
6.3.3 Collision Avoidance
In almost all applications of mobile vehicles collision avoidance is an important
part of the task. Here we present three experiments that show the collision avoidance
capabilities of the ANTs. The experimental setup is very similar to the setup in the
light tracing experiment, where the light sources were hung above the test table.
Figure 6.7 shows a sequence of pictures of a red and black ANT employing extremum
seeking to track two light sources. The pictures are taken at ten second intervals.
The first picture of Figure 6.7 shows the two desk lamps being used as sources as
well as the starting position of the robots. The red and the black ANTs start next
to each other but once they are turned on they repel each other and head to two
different light sources. The last picture in Figure 6.7 shows each robot settling to a
Figure 6.6: Picture of the testbed after the ANT had traced the level set several times
different light source.
A second experiment was done with one light source and some obstacles in the
way of the robot. As shown in Figure 6.8 the robot avoids the two objects on its way
to the light source. A final experiment was done to see how well the two robots can
avoid each other while tracking one light source. Figure 6.9 shows how they avoided
each other once they both arrived at the source. After some time the two robots
settled into a small circular trajectory, with the robots at opposite ends.
6.4 Conclusion and Future Work
In this chapter we showed that extremum seeking applied to autonomous vehicles
allows for the completion of a variety of tasks, such as source tracking, level set
tracing, and multi-vehicle source seeking while avoiding collision. In the future,
we plan to experiment with multi-vehicle algorithms with methods similar to the
ones mentioned in Chapters 4 and 5 but for nonholonomic vehicles. We also plan to
investigate the application of extremum seeking in performing cooperative tracking
of multiple targets.
Figure 6.7: Photographs of two ANTs performing source seeking in a field produced by two light sources at 10 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.8: Photographs of the ANT performing obstacle avoidance while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.
Figure 6.9: Photographs of the ANTs avoiding each other while tracking a light source at 5 sec intervals, appearing in order from left to right, top to bottom.
7
Plume Source Seeking
Experiments
Tracking a plume of chemical back to its source is made difficult by the com-
plexity of a plume structure caused by turbulence and shifts in the prevailing wind
direction. Insects overcome this problem using forms of anemotaxis, which involve
traveling upwind when an attractive chemical is perceived. We combine the method
of extremum seeking with the biologically inspired idea of traveling upwind to
achieve plume source localization. We create an apparatus that is able to produce
a wide range of plumes. We present experimental results of an autonomous vehicle
equipped with a smoke sensor and a wind direction sensor seeking the source of a
smoke plume.
7.1 Introduction
Tracking plumes to their source is a difficult task, as it is highly affected by the
turbulence of the media and by the sensitivity of the sensors to both the media
and other contaminants in the media. In general, most attempts at plume tracking
have used the “PC on board” philosophy. The assumption is that a great deal of
processing is required to extract enough data to track a plume, as the data used
by biological systems ([16, 17, 26]) may be quite detailed and subtle. Data ranging
from edge detection to gradient calculations might be used to track plumes.
In this chapter, we describe a robot implementing a simple algorithm. This algo-
rithm is based on a combination of extremum seeking and wind direction feedback,
and contains no explicit state or memory and no internal processing of sensory data.
The robot simply reacts to external environmental conditions. However, the robot
is capable of tracking an odor plume reliably upstream, and has a high success rate
from anywhere within the plume, and with any initial configuration. In Section 7.2,
we cover the construction of a testbed that allows the operator to control the smoke
concentration at the source and the wind speed. Section 7.3 shows the design and
assembly of a mobile robot with the capability of tracking a smoke plume source,
which we refer to as plume-bot. The experimental results are shown in Section 7.4.
We conclude this chapter with potential future work in Section 7.5.
7.2 Testbed Setup
This testbed consists of three main parts: a wind tunnel, a chamber with a
known smoke concentration, and a base station computer. The wind tunnel has two
fans that control wind speed, which allows us to perform tests at a wide range of
wind flow environments. The smoke chamber allows us to produce a smoke source at
the intake of the wind tunnel. The base station computer is used to control the fans
and the smoke release, and to record the status of the plume-bot during an experiment.
The wind tunnel has overall interior dimensions of 1.2 m wide by 2.4 m long and
0.33 m high. The entire tunnel was constructed using plywood, except for the top,
which needed to be clear acrylic in order for the vision system to track the position
of the plume-bot. To avoid muzzle turbulence, which would misrepresent natural
conditions in the tunnel, an intake was designed and constructed using standard
0.15 m long drinking straws stacked together to form a honeycomb structure. To
maximize intake flow, the honeycomb has the same cross-sectional dimensions as the
tunnel itself. The outlet section houses two 0.10 m diameter DC brushless fans that
were attached to a 0.20 m wide by 0.25 m high tapered outlet. The fans pull the
air through the system and force it through a 0.20 m diameter air duct that leads
to the lab fume hood. An electronic ignition device and smoke chamber are located
at the intake where the smoke can be released into the box. Ignition and fan speed
controls are provided through a micro-controller board with a serial RS232 interface
to the base station computer. Figure 7.1 shows the intake and the outlet of the
wind tunnel box. The clear acrylic is attached to a metal frame which hinges onto
the wind tunnel box. The hinged acrylic allows us to easily access the inside of the
wind tunnel box for placing and moving the plume-bot.
Figure 7.1: Wind tunnel (a) the intake (b) the outlet
Creating an apparatus with a reliable and characterized smoke plume is the most
difficult task of this testbed due to the complex nature of the plume. Characterizing
our smoke plume allows us to understand how our system will work with similar
environments outside our testbed and allows us to reliably compare the different
experiments with each other. To characterize the smoke plume we first start with
characterizing flow through the box. A good descriptor of the wind flow is the
Reynolds number, which is given by
\[ Re = \frac{\rho u d_n}{\mu} \tag{7.1} \]
\[ d_n = \frac{4A}{p} \tag{7.2} \]
where $u$, $\rho$, $d_n$, $\mu$ are the wind velocity, air density, hydraulic diameter of the tunnel, and
dynamic viscosity of air, respectively. The hydraulic diameter is given
in terms of the cross-sectional area $A$ and the perimeter $p$. For our case the formula simplifies to
\[ Re = \frac{2\rho u a b}{\mu(a+b)} \tag{7.3} \]
where a and b are the width and height of the box. The Reynolds number can
be used to determine if flow is laminar, transitional, or turbulent. The flow is laminar
when $Re \le 2300$, transitional when $2300 < Re < 4000$, and turbulent when $Re \ge 4000$.
Given that air has a density of 1.205 kg/m^3 and a dynamic viscosity of $1.983 \times 10^{-5}$
kg/(m s) and that the box's cross section is 1.2 m wide and 0.33 m high, we can write
the Reynolds number just in terms of the wind velocity as follows
\[ Re = 16000u. \tag{7.5} \]
By controlling the wind velocity we can produce all three types of flows. For example,
if we wanted laminar flow we would control the wind speed to be less than 0.14 m/s
and for turbulence we would set the wind speed to higher than 0.25 m/s.
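The arithmetic behind these wind speed limits can be reproduced with a few lines of Python. This is a sketch with hypothetical function names. It evaluates (7.1)–(7.2) for the stated tunnel cross section and, separately, the speed limits implied by the coefficient printed in (7.5). With the values quoted above, direct evaluation of (7.3) gives a coefficient of roughly 3.1 × 10^4 per m/s, about twice the printed 16000, so the sketch keeps the two calculations apart rather than assuming which characteristic length was intended.

```python
def hydraulic_diameter(width, height):
    """Hydraulic diameter d_n = 4A/p of a rectangular duct, following (7.2)."""
    area = width * height
    perimeter = 2.0 * (width + height)
    return 4.0 * area / perimeter

def reynolds_number(u, rho, mu, d_n):
    """Reynolds number Re = rho*u*d_n/mu, following (7.1)."""
    return rho * u * d_n / mu

def speed_for_reynolds(re_target, re_per_unit_speed):
    """Wind speed at which Re reaches re_target when Re = re_per_unit_speed * u."""
    return re_target / re_per_unit_speed

RHO = 1.205             # air density, kg/m^3
MU = 1.983e-5           # dynamic viscosity of air, kg/(m s)
WIDTH, HEIGHT = 1.2, 0.33  # tunnel cross section, m

d_n = hydraulic_diameter(WIDTH, HEIGHT)
re_per_speed = reynolds_number(1.0, RHO, MU, d_n)  # coefficient from (7.3)

# Limits implied by the coefficient printed in (7.5).
u_laminar_max = speed_for_reynolds(2300.0, 16000.0)
u_turbulent_min = speed_for_reynolds(4000.0, 16000.0)
```

The limits from (7.5) come out at about 0.14 m/s and 0.25 m/s, matching the values quoted in the text.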
The smoke chamber allows us to control the concentration and pressure of the
infused smoke released at the intake. The smoke chamber consists of a cylinder tube
with sealed ends, a pressure controlled inlet, a hot plate to create smoke particulates,
and an outlet hose that releases the smoke into the wind tunnel box. Figure 7.2
shows a picture of the smoke chamber. During each test a set amount of powder is
placed onto the hot plate igniter and the inlet pressure is set to be slightly above
the pressure inside the wind tunnel to allow the smoke to leak into the wind tunnel.
The base station computer interfaces with the control box, the plume-bot wireless
serial link, and the overhead video camera (mounted six feet above the apparatus).
A Matlab GUI running on the base station computer collects data and controls
the experiment. Matlab provides image processing tools that we use to locate a
bright light on the plume-bot and track its position as the plume-bot moves across
the camera’s field of vision. The Matlab GUI was used to collect data from the
camera, plume-bot, and the wind tunnel. Figure 7.3 shows a snapshot of the GUI
where the controls are on the top right, the real time video and vehicle trajectory
are on the bottom right, and the connection states to the plume-bot and the wind
tunnel are on the left.
Figure 7.2: Smoke chamber (a) picture of the smoke chamber (b) diagram of smoke chamber. Diagram labels: igniter, smoke output line, smoke powder bowl, dry air inlet.
7.3 Robot Design
In this section we discuss the design of the plume-bot. The plume-bot consists
of an acrylic frame with two in-line wheels and two side supports. The two in-line
wheels are both steered by a gear assembly and a radio controlled (RC) servo. The
rear wheel, which moves the vehicle forward, is turned by another servo modified for
continuous rotation. The side supports are each terminated with a single bearing
and serve to prevent the plume-bot from tipping. The plume-bot is shown in Figure
7.4.
At the core of the electronics system on the plume-bot is the plume-bot con-
troller board (shown in Figure 7.5). This custom-made printed circuit board (PCB)
is based upon an Atmel microcontroller and was designed as a general purpose tool
for controlling the vehicle hardware, interfacing with analog sensors, and communi-
cating via serial links with other devices or computers.
Low cost, wireless telemetry at 9600 bps was obtained with a pair of 433 MHz
RF transceivers from Parallax Inc. The link is unidirectional, with the base station
Figure 7.3: Matlab GUI used to run experiments. The GUI has communication states on the left, the test controls on the top right, and the real time plots on the bottom right.
Figure 7.4: Plume-bot (a) picture of the plume-bot (b) CAD of plume-bot
receiving real-time data from the plume-bot. In order to facilitate modulation with
the carrier wave for transmission, the data packets are prefixed and suffixed with
symmetrical bit patterns. In addition, each data packet contains a packet ID and
an error checksum. The standard data packet from the plume-bot consists of the
current battery voltage and the current smoke sensor reading. Other packets may
contain control parameters for debugging. The packet IDs are sequential and the
base station software, upon missing a packet ID, will attempt to re-synchronize the
connection.
The plume-bot is equipped with a single compact optical smoke sensor that
allows the plume-bot to avoid colliding with the walls of the wind tunnel. The
smoke sensor, shown in Figure 7.6 (a), comes in a 46 × 30 × 18 mm package. The
smoke sensor outputs a voltage proportional to smoke density in the sensor’s opening
located in its center. The output voltage goes from 0 to 4 volts, which corresponds
to a dust density of 0 to 0.5 mg/m3, respectively. A circuit diagram of the optical
smoke sensor is shown in Figure 7.6 (b). The smoke sensor is mounted on a forward
facing arm that can be moved side to side with an RC servo. A 15 mm diameter fan
is mounted in an acrylic box behind the sensor to force the air through the sensor’s
Figure 7.5: Custom designed circuit board
opening and to prevent false readings from stagnant smoke in the particle sensor’s
detection chamber.
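Since the sensor output is stated to be proportional to smoke density, with 0 to 4 volts corresponding to 0 to 0.5 mg/m^3, a reading can be converted as in the following sketch. The function name and the range check are illustrative assumptions, not part of the plume-bot firmware.

```python
V_MAX = 4.0        # sensor output at full scale, volts
DENSITY_MAX = 0.5  # dust density at full scale, mg/m^3

def smoke_density(voltage):
    """Convert a smoke sensor voltage to dust density in mg/m^3 (linear map)."""
    if not 0.0 <= voltage <= V_MAX:
        raise ValueError("voltage outside the sensor's stated 0 to 4 V range")
    return DENSITY_MAX * voltage / V_MAX

half_scale = smoke_density(2.0)
```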
Figure 7.6: Smoke sensor (a) picture of the smoke sensor (b) circuit diagram for particulate sensors.
The plume-bot is equipped with a novel wind direction sensor consisting of a
pair of self-heated thermistor anemometers. The cooling effect of wind blowing over
the thermistor causes the temperature of the thermistor to drop. A differential
amplifier, shown in Figure 7.7, is used to amplify the voltage difference between
the two thermistors. By placing one thermistor on the right side and one on the left side of the
plume-bot, the voltage output of the amplifier can be used to determine whether the
plume-bot is facing with the wind or against it, i.e., giving the angle of attack. The wind
sensor is calibrated to give 0 volts when the plume-bot is facing 90 degrees to the left
of the wind flow, 2.5 volts when the plume-bot is facing upwind, and 5 volts when
the plume-bot is facing 90 degrees to the right of the wind flow. The wind sensor
does not produce any meaningful output when the plume-bot is facing downwind;
therefore the plume-bot's initial heading in the experiments is always set within
90 degrees to the left or right of the wind flow.
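The three calibration points above can be turned into a heading estimate as sketched below. The text only gives the values at 0, 2.5, and 5 volts, so the linear interpolation between them is an assumption, as are the function name and the sign convention (negative for left of the wind flow, positive for right).

```python
V_UPWIND = 2.5   # calibrated output when facing upwind, volts
V_SPAN = 2.5     # voltage change corresponding to 90 degrees
V_FULL = 5.0     # calibrated output at 90 degrees to the right, volts

def wind_angle_deg(voltage):
    """Heading relative to upwind in degrees, assuming a linear sensor response."""
    if not 0.0 <= voltage <= V_FULL:
        raise ValueError("voltage outside the calibrated 0 to 5 V range")
    return (voltage - V_UPWIND) / V_SPAN * 90.0

angles = [wind_angle_deg(v) for v in (0.0, 2.5, 5.0)]
```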
Figure 7.7: Circuit diagram for wind sensors.
The algorithm used on the plume-bot is a combination of extremum seeking and
wind direction feedback. The extremum seeking algorithm tries to drive the plume-
bot to the location of highest smoke concentration, while the wind feedback tries
to make the plume-bot go upstream. The full control law consists of setting the
forward velocity to a constant and the angular velocity $\dot{\theta}$ to
\[ \dot{\theta} = a\omega\cos(\omega t) + \frac{s}{s+h}[\mu]\sin(\omega t) + p\sin(\phi) \tag{7.6} \]
where $\phi$ is the robot angle relative to the incoming wind, $\mu$ is the smoke sensor
reading, a, ω, and h are extremum seeking parameters, and p is the weighting on the
wind feedback term. A block diagram of the entire system is shown in Figure 7.8.
The addition of wind feedback to the extremum seeking algorithm was biologically
inspired. Moths, for example, do not only search for the plume but also surge
upwind [58].
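The control law (7.6) can be sketched in discrete time as follows. This is an illustration only: the washout filter is discretized with forward Euler, the names are hypothetical, the numerical gains are placeholders, and the sign convention of the wind angle is taken as given by the sensor.

```python
import math

class Washout:
    """Forward-Euler discretization of the washout filter s/(s+h)."""

    def __init__(self, h, dt):
        self.h, self.dt = h, dt
        self.low_pass = 0.0

    def update(self, x):
        out = x - self.low_pass  # s/(s+h)[x] = x - h/(s+h)[x]
        self.low_pass += self.dt * self.h * (x - self.low_pass)
        return out

def turn_rate(t, washed_mu, phi, a, omega, p):
    """Angular velocity command of (7.6): probing, demodulation, wind feedback."""
    return (a * omega * math.cos(omega * t)
            + washed_mu * math.sin(omega * t)
            + p * math.sin(phi))

# One control step with placeholder values.
washout = Washout(h=1.0, dt=0.01)
washed = washout.update(0.3)  # smoke sensor reading mu
rate = turn_rate(t=0.0, washed_mu=washed, phi=0.2, a=0.1, omega=10.0, p=0.5)
```

The first two terms are the extremum seeking part of the law; the last term vanishes when the measured wind angle is zero, so it acts only when the heading deviates from the upwind direction.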
Figure 7.8: Block diagram of the overall experiment. Blocks: plume (unknown function of the position), smoke sensor, wind sensor, vehicle dynamics, and the controller with washout filter s/(s + h), probing signal aω cos(ωt), demodulation signal k sin(ωt), and wind feedback p sin(·). Signals: µ, φ, θ, V.
7.4 Experiment Results
In this section we discuss the experimental procedure then show the results of
plume experiments. In this experiment the plume-bot searches for a smoke source
using two kinds of information: smoke concentration detected by the smoke sensors
and wind direction detected by the wind sensors. The basic strategy given in (7.6)
is to perform local search for a plume and to track it in the upwind direction.
Figure 7.9 shows a picture of the robot performing source seeking on the smoke
plume. After tuning of the parameters in the algorithm we started testing. Thirty
tests were run for a wind speed of 1 m/s and the robot placed 1.8 m (6.0 ft) down-
stream and 0.61m (2.0 ft) to the right of the source with a heading of 55 degrees to
the right of the oncoming wind. The starting location was chosen as far as possi-
ble downstream and close to the edge of the smoke plume. Out of the thirty tests
twenty-one were successful, where success is defined as the smoke sensor on the
robot coming within 0.15m (6.0 in) of the smoke source. Figure 7.10 shows a plot
of a successful run, where the plume-bot travels from the edge of the smoke plume
Figure 7.9: Picture of the plume-bot during a plume source seeking test
to the source of the smoke plume within 35 sec. Tests with different wind speeds
proved to have similar rates of success.
Twenty tests were performed without the wind sensor feedback. In these twenty
tests the plume-bot only reached the plume source eight times. We speculate
that the reason for the lack of success of the tests without wind sensor feedback was
the pockets of smoke that the plume-bot would encounter. Once the plume-bot met
a pocket of smoke, it would try to follow the high concentration in the smoke pocket
downstream.
7.5 Conclusion and Future Work
In our tests, the extremum seeking algorithm with wind feedback found the
source of the smoke plume in 21 of 30 runs (70%). In the future we would want to use
chemical sensors with slow sensor dynamics and implement the extremum seeking
algorithm for slow sensors, discussed in Chapter 2, to perform source seeking of a
chemical. We would also want to perform plume source localization in a more real-
istic, less controlled environment. The use of multiple plume-bots would be useful
to increase the success rate.
Figure 7.10: A 35 sec trajectory of the plume-bot performing smoke plume localization in a wind tunnel with a rightward wind of 1 m/s. Axes: X [m], Y [m]. Legend: vehicle trajectory, plume source, approximate plume edge, starting location.
Appendix A
Stability Analysis
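Lemma A.1 Consider the following system
\[ w_\tau(\alpha,\tau) = k_1(\alpha) w_{\alpha\alpha}(\alpha,\tau) - k_2(\alpha) w(\alpha,\tau) \tag{A.1} \]
\[ z_\tau(\alpha,\tau) = -k_3(\alpha) z(\alpha,\tau) - k_4(\alpha) w(\alpha,\tau) \tag{A.2} \]
with boundary conditions
\[ w_\tau(0,\tau) = -k_2(0) w(0,\tau) \quad \text{and} \quad w_\tau(1,\tau) = -k_2(1) w(1,\tau), \tag{A.3} \]
where $k_1(\alpha)$, $k_2(\alpha)$, $k_3(\alpha)$, $k_4(\alpha)$ are strictly positive bounded functions, and $k_2(\alpha)$
satisfies $k_2'(\alpha) < \frac{1}{2} k_2(\alpha)$, $\forall \alpha \in [0, 1]$. The system (A.1)–(A.3) is exponentially stable
at the equilibrium $w = 0$, $z = 0$, i.e., there exist $M > 0$ and $\mu > 0$ such that for all
$\tau > 0$,
\[ \Omega(\tau) \le M e^{-\mu\tau}\,\Omega(0), \tag{A.4} \]
where
\[ \Omega(\tau) = \int_0^1 w(\alpha,\tau)^2\,d\alpha + \int_0^1 w_\alpha(\alpha,\tau)^2\,d\alpha + w(0,\tau)^2 + \int_0^1 z(\alpha,\tau)^2\,d\alpha. \tag{A.5} \]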
Proof: Let $V(\tau)$ be the Lyapunov functional,
\[ V(\tau) = \frac{m}{2}\int_0^1 w_\alpha(\alpha,\tau)^2\,d\alpha + \frac{m}{2} w(0,\tau)^2 + \frac{1}{2}\int_0^1 z(\alpha,\tau)^2\,d\alpha, \tag{A.6} \]
where m is a positive scalar to be determined. Computing the derivative of V (τ)
gives
\[ \dot{V} = m\int_0^1 w_{\tau\alpha}(\alpha,\tau) w_\alpha(\alpha,\tau)\,d\alpha + m\,w_\tau(0,\tau) w(0,\tau) + \int_0^1 z_\tau z\,d\alpha. \tag{A.7} \]
Integrating the first term by parts, we obtain
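\[ \begin{aligned}
\dot{V} ={} & m\,w_\tau w_\alpha\big|_0^1 - m\int_0^1 w_\tau(\alpha,\tau) w_{\alpha\alpha}(\alpha,\tau)\,d\alpha + m\,w_\tau(0,\tau) w(0,\tau) \\
& + \int_0^1 z_\tau(\alpha,\tau) z(\alpha,\tau)\,d\alpha.
\end{aligned} \tag{A.8} \]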
Substituting (A.1)–(A.3) yields
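\[ \begin{aligned}
\dot{V} ={} & -m\,k_2(\alpha) w(\alpha,\tau) w_\alpha(\alpha,\tau)\big|_0^1 - m\int_0^1 k_1(\alpha) w_{\alpha\alpha}(\alpha,\tau)^2\,d\alpha \\
& + m\int_0^1 k_2(\alpha) w(\alpha,\tau) w_{\alpha\alpha}(\alpha,\tau)\,d\alpha - m\,k_2(0) w(0,\tau)^2 - \int_0^1 k_3(\alpha) z(\alpha,\tau)^2\,d\alpha \\
& - \int_0^1 k_4(\alpha) w(\alpha,\tau) z(\alpha,\tau)\,d\alpha.
\end{aligned} \tag{A.9} \]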
The second term is negative and can be removed. Integrating by parts on the third
term of (A.9), gives
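\[ \begin{aligned}
\dot{V} \le{} & -m\int_0^1 k_2(\alpha) w_\alpha(\alpha,\tau)^2\,d\alpha - m\,k_2(0) w(0,\tau)^2 \\
& - m\int_0^1 k_2'(\alpha) w(\alpha,\tau) w_\alpha(\alpha,\tau)\,d\alpha - \int_0^1 k_3(\alpha) z(\alpha,\tau)^2\,d\alpha \\
& - \int_0^1 k_4(\alpha) w(\alpha,\tau) z(\alpha,\tau)\,d\alpha.
\end{aligned} \tag{A.10} \]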
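We now bound $\dot{V}$ by applying the Cauchy-Schwarz and Young's inequalities to the
third and last terms with the parameters $\theta_1, \theta_2 > 0$:
\[ \begin{aligned}
\dot{V} \le{} & -m\int_0^1 k_2(\alpha) w_\alpha(\alpha,\tau)^2\,d\alpha - m\,k_2(0) w(0,\tau)^2 - \int_0^1 k_3(\alpha) z(\alpha,\tau)^2\,d\alpha \\
& + \frac{1}{2\theta_1}\int_0^1 z(\alpha,\tau)^2\,d\alpha + \frac{m}{2\theta_2}\int_0^1 k_2'(\alpha)\,d\alpha \int_0^1 w_\alpha(\alpha,\tau)^2\,d\alpha \\
& + \frac{m}{2}\int_0^1 \left(\frac{\theta_1}{m} k_4^2(\alpha) + \theta_2 k_2'(\alpha)\right) d\alpha \int_0^1 w(\alpha,\tau)^2\,d\alpha.
\end{aligned} \tag{A.11} \]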
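Applying the Poincaré inequality to the last term, which states
\[ \int_0^1 w(\alpha,\tau)^2\,d\alpha \le 2 w(0,\tau)^2 + 4\int_0^1 w_\alpha(\alpha,\tau)^2\,d\alpha, \tag{A.12} \]
letting
\[ k_2 = \min_{\alpha\in[0,1]} \left(k_2(\alpha) - 2k_2'(\alpha)\right), \tag{A.13} \]
\[ k_3 = \min_{\alpha\in[0,1]} k_3(\alpha), \tag{A.14} \]
\[ k_4 = \max_{\alpha\in[0,1]} k_4(\alpha), \tag{A.15} \]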
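and choosing $\theta_1 = 1/k_3$ and $\theta_2 = 1/2$, we get
\[ \begin{aligned}
\dot{V} \le{} & -m\left(k_2 - 2\frac{k_4^2}{m k_3}\right)\int_0^1 w_\alpha(\alpha,\tau)^2\,d\alpha - m\left(k_2 - \frac{k_4^2}{m k_3}\right) w(0,\tau)^2 \\
& - \frac{k_3}{2}\int_0^1 z(\alpha,\tau)^2\,d\alpha.
\end{aligned} \tag{A.16} \]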
Selecting the analysis parameters m = 4 k24
k2k3, we find
V ≤− mµ
2
∫ 1
0
wα(α, τ)2 dα− mµ
2w(0, τ)2 − µ
2
∫ 1
0
z(α, τ)2 dα
≤− µV, (A.17)
where µ = min (k2, k3). From the comparison Lemma [36] and Lemma A.2, we have
Ω(τ) ≤ 1p1V (τ) ≤ 1
p1e−µτV (0) ≤ p2
p1e−µτΩ(0), (A.18)
where p1 = 12min
(m8, 1), and p2 = 1
2max(m, 1). The result (A.4) is obtained from
(A.18) with M = p2p1.
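As a quick sanity check on the constants chosen in this proof, the following Python sketch evaluates m, µ, p1, p2, and M from the three scalar bounds in (A.13)–(A.15), and also evaluates the two bracketed coefficients in (A.16). The function names and the sample bound values are hypothetical and serve only as an illustration; they are not taken from the dissertation.

```python
def lemma_a1_constants(k2_lower, k3_lower, k4_upper):
    """Return (m, mu, p1, p2, M) as selected in the proof of Lemma A.1.

    k2_lower, k3_lower, k4_upper stand for the scalars in (A.13)-(A.15);
    all three are assumed to be strictly positive.
    """
    if min(k2_lower, k3_lower, k4_upper) <= 0.0:
        raise ValueError("all bounds must be strictly positive")
    m = 4.0 * k4_upper ** 2 / (k2_lower * k3_lower)  # choice made before (A.17)
    mu = min(k2_lower, k3_lower)                     # decay rate in (A.17)
    p1 = 0.5 * min(m / 8.0, 1.0)                     # lower constant in (A.19)
    p2 = 0.5 * max(m, 1.0)                           # upper constant in (A.19)
    return m, mu, p1, p2, p2 / p1


def a16_coefficients(k2_lower, k3_lower, k4_upper, m):
    """Bracketed factors multiplying the w_alpha and w(0) terms in (A.16)."""
    shared = k4_upper ** 2 / (m * k3_lower)
    return k2_lower - 2.0 * shared, k2_lower - shared


# Hypothetical sample bounds, chosen only for illustration.
m, mu, p1, p2, M = lemma_a1_constants(1.0, 2.0, 1.0)
c_grad, c_bdry = a16_coefficients(1.0, 2.0, 1.0, m)
print(m, mu, p1, p2, M, c_grad, c_bdry)
```

With this choice of m, the first coefficient reduces to half of the lower bound on k2, which is what allows (A.16) to be bounded by (A.17) with µ equal to the smaller of the two lower bounds.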
Lemma A.2 There exist $p_1 > 0$ and $p_2 > 0$ such that
\[
p_1\,\Omega(\tau) \le V(\tau) \le p_2\,\Omega(\tau), \tag{A.19}
\]
where $\Omega(\tau)$ and $V(\tau)$ are defined in (A.5) and (A.6), respectively.
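Proof: With $p_2 = \frac{1}{2}\max(m, 1)$, the right-hand inequality in (A.19) is immediate. Bounding $V(\tau)$ from below with the Poincaré inequality (A.12),
\[
V(\tau) \ge \frac{m}{4}\int_0^1 w_\alpha(\alpha,\tau)^2\,d\alpha + \frac{m}{16}\int_0^1 w(\alpha,\tau)^2\,d\alpha + \frac{3m}{8}\,w(0,\tau)^2 + \frac{1}{2}\int_0^1 z(\alpha,\tau)^2\,d\alpha, \tag{A.20}
\]
we obtain the left-hand inequality in (A.19), with $p_1 = \frac{1}{2}\min\left(\frac{m}{8}, 1\right)$.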
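The two inequalities used in this proof can be spot-checked numerically. The sketch below evaluates Ω from (A.5) and V from (A.6) for one fixed snapshot in τ, using a composite trapezoid rule, and tests both the Poincaré inequality (A.12) and the bounds (A.19) for a few values of m. The test profiles for w and z, the helper names, and the grid size are assumptions made purely for illustration.

```python
import math


def trapz(f, n=2000):
    """Composite trapezoid rule for the integral of f over [0, 1]."""
    h = 1.0 / n
    total = 0.5 * (f(0.0) + f(1.0))
    for i in range(1, n):
        total += f(i * h)
    return total * h


def omega_and_v(w, w_alpha, z, m):
    """Evaluate Omega (A.5) and V (A.6) for one snapshot of (w, z)."""
    int_w2 = trapz(lambda a: w(a) ** 2)
    int_wa2 = trapz(lambda a: w_alpha(a) ** 2)
    int_z2 = trapz(lambda a: z(a) ** 2)
    omega = int_w2 + int_wa2 + w(0.0) ** 2 + int_z2
    v = 0.5 * m * int_wa2 + 0.5 * m * w(0.0) ** 2 + 0.5 * int_z2
    return omega, v, int_w2, int_wa2


# Hypothetical smooth test profiles; w_alpha is the exact derivative of w.
w = lambda a: math.cos(3.0 * a) + a ** 2
w_alpha = lambda a: -3.0 * math.sin(3.0 * a) + 2.0 * a
z = lambda a: math.sin(5.0 * a)

results = []
for m in (0.5, 2.0, 20.0):
    omega, v, int_w2, int_wa2 = omega_and_v(w, w_alpha, z, m)
    p1 = 0.5 * min(m / 8.0, 1.0)
    p2 = 0.5 * max(m, 1.0)
    poincare_ok = int_w2 <= 2.0 * w(0.0) ** 2 + 4.0 * int_wa2  # (A.12)
    sandwich_ok = p1 * omega <= v <= p2 * omega                # (A.19)
    results.append((m, poincare_ok, sandwich_ok))

print(results)
```

A check of this kind does not replace the proof, since it only samples a single pair of functions, but it is a convenient guard against transcription errors in the constants p1 and p2.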
Lemma A.3 Consider the following system
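\[
w_\tau(\alpha,\tau) = k_1(\alpha)\,w_{\alpha\alpha}(\alpha,\tau) - k_2(\alpha)\,w(\alpha,\tau) \tag{A.21}
\]
\[
z_\tau(\alpha,\tau) = -k_3(\alpha)\,z(\alpha,\tau) - k_4(\alpha)\,w(\alpha,\tau) \tag{A.22}
\]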
with boundary conditions w(0, τ) = 0, w(1, τ) = 0, where k1(α), k2(α), k3(α), and
k4(α) are strictly positive bounded functions ∀α ∈ [0, 1]. The system (A.21)–(A.22)
is exponentially stable at the equilibrium w = 0, z = 0, i.e., there exists µ > 0 such
that for all τ > 0,
\[
V(\tau) \le e^{-\mu\tau}\,V(0), \tag{A.23}
\]
where $V(\tau) = \frac{1}{2}\int_0^1 \frac{m\,w(\alpha,\tau)^2}{k_1(\alpha)}\,d\alpha + \frac{1}{2}\int_0^1 z(\alpha,\tau)^2\,d\alpha$ and $m > 0$ is given in the proof.
Proof: Computing the derivative of $V$ gives
\[
\dot V = \int_0^1 \frac{m}{k_1(\alpha)}\,w_\tau(\alpha,\tau)\,w(\alpha,\tau)\,d\alpha + \int_0^1 z_\tau(\alpha,\tau)\,z(\alpha,\tau)\,d\alpha. \tag{A.24}
\]
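Substituting (A.21) and (A.22), we obtain
\begin{align}
\dot V ={}& m\int_0^1 w_{\alpha\alpha}(\alpha,\tau)\,w(\alpha,\tau)\,d\alpha - m\int_0^1 \frac{k_2(\alpha)}{k_1(\alpha)}\,w(\alpha,\tau)^2\,d\alpha \tag{A.26} \\
& - \int_0^1 k_3(\alpha)\,z(\alpha,\tau)^2\,d\alpha - \int_0^1 k_4(\alpha)\,w(\alpha,\tau)\,z(\alpha,\tau)\,d\alpha. \tag{A.27}
\end{align}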
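Integrating the first term by parts and applying the Cauchy–Schwarz and Young inequalities with the parameter $\theta > 0$ to the last term, we get
\[
\begin{aligned}
\dot V \le{}& m\,w(\alpha,\tau)\,w_\alpha(\alpha,\tau)\Big|_0^1 - m\int_0^1 w_\alpha(\alpha,\tau)^2\,d\alpha - m\int_0^1 \frac{k_2(\alpha)}{k_1(\alpha)}\,w(\alpha,\tau)^2\,d\alpha \\
& - \int_0^1 k_3(\alpha)\,z(\alpha,\tau)^2\,d\alpha + \int_0^1 \frac{\theta\,k_4^2(\alpha)}{2}\,w(\alpha,\tau)^2\,d\alpha + \int_0^1 \frac{1}{2\theta}\,z(\alpha,\tau)^2\,d\alpha.
\end{aligned}
\tag{A.28}
\]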
Given the boundary conditions $w(0,\tau) = w(1,\tau) = 0$, the first term is zero. The second term is nonpositive and can be dropped. Combining the remaining terms, we get
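\[
\begin{aligned}
\dot V \le{}& -\int_0^1 \frac{m}{k_1(\alpha)}\left(k_2(\alpha) - \frac{\theta\,k_1(\alpha)\,k_4^2(\alpha)}{2m}\right) w(\alpha,\tau)^2\,d\alpha \\
& - \int_0^1 \left(k_3(\alpha) - \frac{1}{2\theta}\right) z(\alpha,\tau)^2\,d\alpha.
\end{aligned}
\tag{A.29}
\]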
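Letting
\[
\bar{k}_1 = \max_{\alpha\in[0,1]} k_1(\alpha), \tag{A.30}
\]
\[
\underline{k}_2 = \min_{\alpha\in[0,1]} k_2(\alpha), \tag{A.31}
\]
\[
\underline{k}_3 = \min_{\alpha\in[0,1]} k_3(\alpha), \tag{A.32}
\]
\[
\bar{k}_4 = \max_{\alpha\in[0,1]} k_4(\alpha), \tag{A.33}
\]
and choosing $\theta = 1/\underline{k}_3$ and $m = \bar{k}_1\bar{k}_4^2/(\underline{k}_2\,\underline{k}_3)$, we get
\[
\begin{aligned}
\dot V \le{}& -\frac{m}{2}\,\underline{k}_2\int_0^1 \frac{w(\alpha,\tau)^2}{k_1(\alpha)}\,d\alpha - \frac{1}{2}\,\underline{k}_3\int_0^1 z(\alpha,\tau)^2\,d\alpha \\
\le{}& -\mu V,
\end{aligned}
\tag{A.34}
\]
where $\mu = \min(\underline{k}_2, \underline{k}_3)$. Solving (A.34) for $V(\tau)$ gives (A.23).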
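The decay estimate (A.23) can be illustrated with a small finite-difference experiment. The sketch below integrates (A.21)–(A.22) with homogeneous Dirichlet conditions on w using central differences in α and explicit Euler steps in τ, and checks at every step that a trapezoid-rule approximation of V stays below e^{−µτ}V(0). The coefficient profiles, initial data, grid, and step size are all hypothetical choices for illustration; m is taken as k̄1k̄4²/(k̲2k̲3), the value for which (A.29) implies (A.34).

```python
import math

N = 40                      # number of grid intervals on [0, 1]
da = 1.0 / N
grid = [i * da for i in range(N + 1)]

# Hypothetical coefficient profiles, strictly positive and bounded on [0, 1].
k1 = [1.0 + 0.5 * a for a in grid]
k2 = [1.0 + a for a in grid]
k3 = [2.0 + a for a in grid]
k4 = [1.0 + 0.5 * a for a in grid]

m = max(k1) * max(k4) ** 2 / (min(k2) * min(k3))   # Lyapunov weight
mu = min(min(k2), min(k3))                          # decay rate in (A.34)


def lyapunov(w, z):
    """Trapezoid-rule approximation of V from Lemma A.3."""
    terms = [0.5 * m * w[i] ** 2 / k1[i] + 0.5 * z[i] ** 2 for i in range(N + 1)]
    return da * (sum(terms) - 0.5 * (terms[0] + terms[-1]))


def simulate(t_final=0.5, dt=1e-4):
    """Explicit Euler in tau; dt satisfies dt * max(k1) / da**2 < 1/2."""
    w = [math.sin(math.pi * a) for a in grid]
    w[0] = w[-1] = 0.0                     # enforce w(0) = w(1) = 0 exactly
    z = [-1.0] * (N + 1)
    v0 = lyapunov(w, z)
    bound_ok = True
    for n in range(1, int(round(t_final / dt)) + 1):
        z_new = [z[i] + dt * (-k3[i] * z[i] - k4[i] * w[i]) for i in range(N + 1)]
        w_new = w[:]
        for i in range(1, N):
            w_aa = (w[i + 1] - 2.0 * w[i] + w[i - 1]) / da ** 2
            w_new[i] = w[i] + dt * (k1[i] * w_aa - k2[i] * w[i])
        w, z = w_new, z_new
        if lyapunov(w, z) > math.exp(-mu * n * dt) * v0:
            bound_ok = False
    return v0, lyapunov(w, z), bound_ok


v0, v_end, bound_ok = simulate()
print(v0, v_end, bound_ok)
```

In an experiment of this kind the observed decay is typically much faster than the guaranteed rate µ, because the estimate discards the diffusion term in (A.28) entirely.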
Appendix B
Averaging in Infinite Dimensions
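We rewrite the system as
\[
\dot u = Au + F(t/\epsilon, u) \tag{B.1}
\]
with $\epsilon = 1/\omega$. For the PDE system (4.14)–(4.18) in Chapter 4, Section 3, with dynamic boundary conditions, we introduce a system of the form (B.1) with $u = (x, e, x_l, x_r)^T$ by defining its linear operator as
\[
A = \begin{pmatrix}
A_0 & 0 & 0 & 0 \\
0 & L & 0 & 0 \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0
\end{pmatrix}, \tag{B.2}
\]
\[
D(A) = \left\{ u \in D(A_0) \times L_2(0,1) \times \mathbb{R}^2 \;\middle|\; B_l x = x_l \text{ and } B_r x = x_r \right\}. \tag{B.3}
\]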
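The linear operator $A_0$ is defined as
\[
A_0 f(\alpha) = \kappa(\alpha)\,\frac{d^2 f(\alpha)}{d\alpha^2}, \tag{B.4}
\]
with the domain
\[
D(A_0) = \left\{ f(\alpha) \in L_2(0,1) : f(\alpha) \text{ and } \frac{df(\alpha)}{d\alpha} \text{ are abs. cont.},\ \frac{d^2 f(\alpha)}{d\alpha^2} \in L_2(0,1) \right\}, \tag{B.5}
\]
and the linear operator $L$ is defined as
\[
L f(\alpha) = -h(\alpha)\,f(\alpha) \tag{B.6}
\]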
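with the domain $D(L) = L_2(0,1)$. The linear operators $B_l$ and $B_r$ are defined as
\begin{align}
B_l f(\alpha) &= f(0) \tag{B.7} \\
B_r f(\alpha) &= f(1) \tag{B.8} \\
D(B_l) &= \{ f(\alpha) \in L_2(0,1) : f(\alpha) \text{ is abs. cont.} \} \tag{B.9} \\
D(B_r) &= \{ f(\alpha) \in L_2(0,1) : f(\alpha) \text{ is abs. cont.} \}. \tag{B.10}
\end{align}
The nonlinear operator $F = (F_1, F_2, F_3, F_4)^T$ is defined with $\epsilon = 1/\omega$ and
\begin{align}
F_1(\omega t, x, e)(\alpha) &= \kappa(\alpha)\,a''(\alpha)\sin(\omega t) - c(\alpha)\,\xi(\omega t, x, e)(\alpha)\sin(\omega t) \tag{B.11} \\
F_2(\omega t, x, e)(\alpha) &= -q\,\bigl(x(\alpha) + a(\alpha)\sin(\omega t)\bigr)^2 \tag{B.12} \\
F_3(\omega t, x, e) &= -\nu + \kappa(0)\,a''(0)\sin(\omega t) - c(0)\,\xi(\omega t, x, e)(0)\sin(\omega t) \tag{B.13} \\
F_4(\omega t, x, e) &= \nu + \kappa(1)\,a''(1)\sin(\omega t) - c(1)\,\xi(\omega t, x, e)(1)\sin(\omega t) \tag{B.14} \\
\xi(\omega t, x, e)(\alpha) &= -q\,\bigl(x(\alpha) + a(\alpha)\sin(\omega t)\bigr)^2 - e(\alpha). \tag{B.15}
\end{align}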
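Similarly, for the PDE system (4.14), (4.15), (4.16), (4.59) in Chapter 4, Section 4, with homogeneous Dirichlet boundary conditions, we define the operator $A$ by
\[
A = \begin{pmatrix} A_0 & 0 \\ 0 & L \end{pmatrix}, \tag{B.16}
\]
\[
D(A) = \left\{ \begin{pmatrix} x \\ e \end{pmatrix} \in D(A_0) \times L_2(0,1) \;\middle|\; B_l x = 0 \text{ and } B_r x = 0 \right\}, \tag{B.17}
\]
with the nonlinearity $F = (F_1, F_2)^T$.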
To use Theorem 3.6 in [27], the system (B.1) must satisfy the following assumptions:
• $F$ is almost periodic and satisfies the smoothness conditions from Section 2 of [27] (continuously differentiable). Both conditions are trivially satisfied for (B.11)–(B.15).
• The linear operator $A$, which is such that $\|T_A(t)\| \le M e^{kt}$ for some positive $M$ and $k$, must satisfy hypothesis (H) given in [27]: if $h : [s, \infty) \to X$ is norm-continuous, then
(i) $\int_s^t T_A(t-\tau)\,h(\tau)\,d\tau \in D(A)$, for $s \le t$;
(ii) $\left\| A \int_s^t T_A(t-\tau)\,h(\tau)\,d\tau \right\| \le M e^{kt} \sup_{s \le \tau \le t} \|h(\tau)\|$, for $s \le t$.
It is a routine extension of known results [15] that, for both (B.2) and (B.16), $A$ generates an analytic semigroup and that properties (i) and (ii) in hypothesis (H) hold. Hence the conditions of [27] are satisfied and Theorems 1 and 2 in Chapter 4 follow.
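To make the role of averaging concrete, consider the component F2 in (B.12) at a single spatial point. Averaging over one period of the perturbation replaces sin(ωt) by its mean, 0, and sin²(ωt) by its mean, 1/2, so the average of F2 is −q(x² + a²/2). The Python sketch below checks this closed form against a direct numerical average. It assumes that q is a constant scalar and that x and a are fixed real numbers at the chosen point; the sample values and function names are hypothetical.

```python
import math


def f2(theta, x, a, q):
    """F2 from (B.12) at one spatial point, with theta = omega * t."""
    return -q * (x + a * math.sin(theta)) ** 2


def period_average(func, n=1000):
    """Mean of func over one period, from n equally spaced samples."""
    return sum(func(2.0 * math.pi * i / n) for i in range(n)) / n


# Hypothetical sample values at a single point alpha.
x, a, q = 0.7, 0.2, 1.5

numeric = period_average(lambda theta: f2(theta, x, a, q))
closed_form = -q * (x ** 2 + 0.5 * a ** 2)
print(numeric, closed_form)
```

Equally spaced sampling is exact up to rounding here because the integrand is a low-order trigonometric polynomial in θ.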
Bibliography
[1] V. Adetola and M. Guay, “Parameter convergence in adaptive extremum-seeking control,” Automatica, vol. 43, no. 1, pp. 105–110, 2007.
[2] K. B. Ariyur and M. Krstic, Real Time Optimization by Extremum Seeking Control. Wiley-Interscience, 2003.
[3] J. R. Baxter and G. A. Brosamler, “Energy and the law of iterated logarithm,” Mathematica Scandinavica, vol. 38, pp. 115–136, 1976.
[4] R. Becker, R. King, R. Petz, and W. Nitsche, “Adaptive closed-loop separation control on a high-lift configuration using extremum seeking,” AIAA, vol. 45, no. 6, p. 1382, 2007.
[5] J. Belanger and E. Arbas, “Behavioral strategies underlying pheromone-modulated flight in moths: lessons from simulation studies,” Journal of Comparative Physiology A: Sensory, Neural, and Behavioral Physiology, vol. 183, no. 3, pp. 345–360, 1998.
[6] H. Berg, E Coli in Motion. Springer New York, 2003.
[7] H. Berg and D. A. Brown, “Chemotaxis in e. coli analyzed by three-dimensional tracking,” Nature, vol. 239, pp. 500–504, 1972.
[8] E. Biyik and M. Arcak, “Gradient climbing in formation via extremum-seeking and passivity-based coordination rules,” Asian J. Control: Special Issue on “Collective Behavior and Control of Multi-Agent Systems”, vol. 10, no. 2, pp. 201–211, March 2008.
[9] R. Carli and F. Bullo, “Quantized coordination algorithms for rendezvous and deployment,” SIAM J. Control Optim., vol. 48, no. 3, pp. 1251–1274, 2009.
[10] C. Centioli, F. Iannone, G. Mazza, M. Panella, L. Pangione, S. Podda, A. Tuccillo, V. Vitale, and L. Zaccarian, “Maximization of the lower hybrid power coupling in the frascati tokamak upgrade via extremum seeking,” Control Engineering Practice, vol. 16, no. 12, pp. 1468–1478, 2008.
[11] J. Cochran and M. Krstic, “Nonholonomic source seeking with tuning of angular velocity,” IEEE Transactions on Automatic Control, vol. 54, pp. 717–731, 2009.
[12] J. Cochran, A. Siranosian, N. Ghods, and K. M, “Source seeking with nonholonomic unicycle without position measurements and with tuning of angular velocity part ii: Applications,” IEEE Conference on Decision and Control, 2007.
[13] J. Cochran, A. Siranosian, N. Ghods, and M. Krstic, “3d source seeking for underactuated vehicles without position measurement,” IEEE Transactions on Robotics, pp. 117–129, 2009.
[14] J. Cortes, S. Martínez, T. Karatas, and F. Bullo, “Coverage control for mobile sensing networks,” IEEE Transactions on Robotics and Automation, vol. 20, no. 2, pp. 243–255, 2004.
[15] R. F. Curtain and H. J. Zwart, An introduction to infinite-dimensional linear systems theory. Springer-Verlag, New York, 1995.
[16] K. Dittmer, F. Grasso, and J. Atema, “Effects of varying plume turbulence on temporal concentration signals available to orienting lobsters,” Biological Bulletin, pp. 232–233, 1995.
[17] ——, “Obstacles to flow produce distinctive patterns of odor dispersal on a scale that could be detected by marine animals,” Biological Bulletin, pp. 313–314, 1996.
[18] G. Ferrari-Trecate, A. Buffa, and M. Gati, “Analysis of coordination in multi-agent systems through partial difference equations,” IEEE Transactions on Automatic Control, vol. 51, no. 6, pp. 1058–1063, 2006.
[19] Technical information for TGS2620 data sheet, revised 03/05 ed., Figaro Engineering Inc.
[20] A. Fort, M. Mugnaini, S. Rocchi, V. V. M.B. Serrano-Santos, and R. Spinicci, “Surface state model for conductance responses during thermal-modulation of SnO2-based thick film sensors. part I. model derivation,” IEEE Trans. Instr. Meas., 2006.
[21] ——, “Surface state model for conductance responses during thermal-modulation of SnO2-based thick film sensors. part II. experimental verification,” IEEE Trans. Instr. Meas., 2006.
[22] A. Fort, M. Mugnaini, S. Rocchi, M. Serrano-Santos, V. Vignoli, and R. Spinicci, “Simplified models for sno2 sensors during chemical and thermal transients in mixtures of inert, oxidizing and reducing gases,” Sensors and Actuators B: Chemical, vol. 124, no. 1, pp. 245–259, 2007.
[23] P. Frihauf and M. Krstic, “Leader-enabled deployment into planar curves,” IEEE Transactions on Automatic Control, submitted.
[24] N. Ghods and M. Krstic, “Multi-agent deployment over a source,” submitted to IEEE Transactions on Control Systems Technology.
[25] G. H. Golub and C. F. V. Loan, Matrix Computations, 3rd ed. Baltimore, MD: The Johns Hopkins University Press, 1996.
[26] F. Grasso, T. Consi, D. Mountain, and J. Atema, “Behavior of purely chemotactic robot lobster reveals different odor dispersal patterns in the jet region and the patch field of a turbulent plume,” Biological Bulletin, pp. 312–313, 1996.
[27] J. Hale and S. V. Lunel, “Averaging in infinite dimensions,” Integral Equations Appl., vol. 2, no. 4, pp. 463–494, 1990.
[28] H. Ishida, T. Nakamoto, T. Moriizumi, T. Kikas, and J. Janata, “Plume-tracking robots: A new application of chemical sensors,” Biological Bulletin, vol. 200, pp. 222–226, 2001.
[29] H. Ishida, G. Nakayama, T. Nakamoto, and T. Moriizum, “Odor-source localization in the clean room by an autonomous mobile sensing system,” Sensors and Actuators B: Chemical, vol. 33, no. 1-3, pp. 115–121, 1996, Eurosensors IX.
[30] ——, “Controlling a gas/odor plume-tracking robot based on transient responses of gas sensors,” Sens., Proceedings of IEEE, vol. 2, pp. 1665–1670, 2002.
[31] A. Jadbabaie, J. Lin, and A. S. Morse, “Coordination of groups of mobile autonomous agents using nearest neighbor rules,” IEEE Transactions on Automatic Control, vol. 48, no. 6, pp. 988–1001, 2003.
[32] E. W. Justh and P. S. Krishnaprasad, “Equilibria and steering laws for planar formations,” Systems & Control Letters, vol. 52, no. 1, pp. 25–38, 2004.
[33] J. C. K. Laventall, “Coverage control by multi-robot networks with limited-range anisotropic sensory,” International Journal of Control, vol. 82, pp. 1113–1121, 2009.
[34] R. Kanzaki, “Coordination of wing motion and walking suggests common control of zigzag motor program in a male silkworm moth,” Sensory, Neural, and Behavioral Physiology, vol. 182, no. 3, pp. 267–276, 1998.
[35] R. Kanzaki, N. Sugi, and T. Shibuya, “Self-generated zigzag turning of bombyx mori males during pheromone-mediated upwind walking,” Zoological Science, vol. 9, no. 3, pp. 515–527, 1992.
[36] H. Khalil, Nonlinear Systems. Prentice-Hall, 2002.
[37] J. Kim, K.-D. Kim, V. Natarajan, S. D. Kelly, and J. Bentsman, “Pde-based model reference adaptive control of uncertain heterogeneous multiagent networks,” Nonlinear Analysis: Hybrid Systems, vol. 2, no. 4, pp. 1152–1167, 2008.
[38] Y. Li, A. Rotea, G. T.-C. Chiu, L. Mongeau, and I.-S. Paek, “Extremum seeking control of a tunable thermoacoustic cooler,” IEEE Trans. Contr. Syst. Technol., vol. 13, pp. 527–536, 2005.
[39] S. J. Liu and M. Krstic, “Stochastic source seeking for nonholonomic unicycle,” Automatica, to appear.
[40] ——, “Stochastic averaging in continuous time and its applications to extremum seeking,” IEEE Transactions on Automatic Control, to appear.
[41] C. G. Mayhew, R. G. Sanfelice, and A. Teel, “Robust source-seeking hybrid controllers for nonholonomic vehicles,” American Control Conference, pp. 2722–2727, June 2008.
[42] A. R. Mesquita, J. P. Hespanha, and K. Astrom, “Optimotaxis: A stochastic multi-agent optimization procedure with point measurements,” in HSCC, 2008, pp. 358–371.
[43] P. Ogren, E. Fiorelli, and N. Leonard, “Cooperative control of mobile sensor networks: adaptive gradient climbing in a distributed environment,” IEEE Trans. Automat. Contr., vol. 29, pp. 1292–1302, 2004.
[44] Y. Ou, C. Xu, E. Schuster, T. Luce, J. R. Ferron, and M. Walker, “Extremum-seeking finite-time optimal control of plasma current profile at the diii-d tokamak,” 2007 American Ctrl. Conf., 2007.
[45] K. Peterson and A. Stefanopoulou, “Extremum seeking control for soft landing of an electromechanical valve actuator,” Automatica, vol. 29, pp. 1063–1069, 2004.
[46] D. Popovic, M. Jankovic, S. Magner, and A. Teel, “Extremum seeking methods for optimization of variable cam timing engine operation,” IEEE Transactions on Control Systems Technology, vol. 14, no. 3, pp. 398–407, 2006.
[47] B. Porat and A. Neohorai, “Localizing vapor-emitting sources by moving sensors,” IEEE Trans. Signal Processing, vol. 44, pp. 1018–1021, 1996.
[48] M. Potter and K. De Jong, “A cooperative coevolutionary approach to function optimization,” in Parallel Problem Solving from Nature PPSN III. Springer Berlin / Heidelberg, 1994, vol. 866, pp. 249–257.
[49] M. S. Stankovic, K. H. Johansson, and D. M. Stipanovic, “Distributed seeking of nash equilibria with applications to mobile sensor networks,” submitted to IEEE Tran. on Automatic Control.
[50] M. S. Stankovic, K. Johansson, and D. M. Stipanovic, “Distributed seeking of nash equilibria in mobile sensor networks,” submitted to 2010 Proc. IEEE Conf. on Decision and Control.
[51] M. S. Stankovic and D. Stipanovic, “Stochastic extremum seeking with applications to mobile sensor networks,” 2009 American Control Conference, 2009.
[52] K. Stegath, N. Sharma, C. Gregory, and W. E. Dixon, “An extremum seeking method for non-isometric neuromuscular electrical stimulation,” IEEE International Conference on Systems, Man and Cybernetics, pp. 2528–2532, 2007.
[53] Y. Tan, D. Nesic, and I. M. Mareels, “On non-local stability properties of extremum seeking controllers,” Automatica, vol. 42, pp. 889–903, 2006.
[54] M. Tanelli, A. Astolfi, and S. Savaresi, “Non-local extremum seeking control for active braking control systems,” Conf. on Control Applications, 2006.
[55] H. Tanner, A. Jadbabaie, and G. Pappas, “Flocking in fixed and switching networks,” IEEE Transactions on Automatic Control, vol. 52, pp. 863–868, 2007.
[56] H.-H. Wang and M. Krstic, “Extremum seeking for limit cycle minimization,” IEEE Transactions on Automatic Control, vol. 45, pp. 2432–2436, 2000.
[57] H.-H. Wang, S. Yeung, and M. Krstic, “Experimental application of extremum seeking on an axial-flow compressor,” IEEE Transactions on Control Systems Technology, vol. 8, pp. 300–309, 1999.
[58] T. D. Wyatt, “Moth flights of fancy,” Nature, vol. 369, pp. 98–99, 1994.
[59] C. Zhang, D. Arnold, N. Ghods, A. Siranosian, and M. Krstic, “Source seeking with nonholonomic unicycle without position measurement and with tuning of forward velocity,” Systems & Ctrl. Letters, vol. 56, pp. 245–252, 2007.
[60] C. Zhang, A. Siranosian, and M. Krstic, “Extremum seeking for moderately unstable systems and for autonomous vehicle target tracking without position measurements,” Automatica, vol. 43, pp. 1832–1839, 2007.
[61] X. Zhang, D. Dawson, W. Dixon, and B. Xian, “Extremum seeking nonlinear controllers for a human exercise machine,” Proc. 2004 IEEE Conf. Decision and Ctrl., 2004.
[62] X. Zhang, D. M. Dawson, W. E. Dixon, and B. Xian, “Extremum seeking nonlinear controllers for a human exercise machine,” IEEE Transactions on Mechatronics, vol. 14, no. 2, pp. 233–240, 2006.
[63] M. Zhu and S. Martínez, “Distributed coverage games for mobile visual sensor networks,” SIAM Journal on Control and Optimization, submitted, January 2010.