Naturally Adaptive ICT
Mark Shackleton
Pervasive ICT Research Centre
BT Research & Venturing
Talk outline
• BT Research & Venturing
• Why BT is interested in Nature-inspired Computing and Communications
• Adaptive and Autonomic Systems
• Some autonomic Self-* examples:
– “Fly Phones”: self-configuring channel allocation for mobile networks using principles from developmental biology
– “Self Service”: an autonomic protocol for dynamic service provision
– “Embryo”: a fully decentralised, autonomic service management framework, inspired by morphogenesis
• From nature-inspired heuristics to engineering and design principles
BT Research & Venturing
• BT’s Research facility
• Provides innovation and R&D for BT’s lines of business
• About 300 people based at:
– Adastral Park, Martlesham Heath, Ipswich, Suffolk
• My team: nature-inspired & adaptive solutions
Today’s Problems with IT / ICT
• ICT systems are becoming so complicated that they are largely unmanageable:
– sheer scale of current and envisaged ICT deployments
– heterogeneity of the underlying infrastructure that nobody (“no single person”) can understand
– unanticipated and unwanted interactions between components
• These add up to frequent failures or sub-optimal system-level behaviour, and costly, error-prone system administration
• “Nature-inspired Computing” and more specifically “Autonomic Computing” - by analogy to the human autonomic nervous system, regulating “basic” functions e.g. blood pressure, heart rate, breathing
“Autonomic Computing”
TCO, Resilience, Telephony, SysAdmin->BusGoals
Autonomic Computing
• Self-configuring
– adaptation to IT system changes, such as new nodes becoming available or going offline
• Self-optimising
– tuning resources and load balancing
• Self-protecting
– guard against damage from attacks or failures
• Self-healing
– recovery from, or work around, failed components
BT’s interest in Nature-inspired and Autonomic Solutions
• The Digital Networked Economy will require the support of a highly adaptive underlying ICT infrastructure.
• This will be dynamic, heterogeneous and support multiple domains of ownership and control.
• It will need to adapt to transient changes in demand as well as longer-term usage trends.
• It will embed “autonomic behaviour” that reduces deployment and running costs, whilst enhancing resilience.
Adaptive / Autonomic ICT - decentralised, autonomic, lightweight
From: complex & costly manual management
To: self-managing “autonomic” ICT solutions
And: resilient provision of services in a complex Pervasive ICT world
• Self-healing ICT “immune systems”
• Self-managing Peer-to-Peer and Decentralised architectures
• Complex systems engineering & nature-inspired solutions
[Diagram label: network]
“Autonomic Computing”
…a nature-inspired analogy!
IBM’s “autonomic mgr” control loop
The Autonomic Analogy
“..we should not only consider what the autonomic nervous system does but also how it does it… Successful self-management lies in the way [biological systems] achieve this functionality.”
From: O. Babaoglu, M. Jelasity, A. Montresor. Grassroots Approach to Self-Management in Large-Scale Distributed Systems. In Proceedings of the EU-NSF Strategic Research Workshop on Unconventional Programming Paradigms, Mont Saint-Michel, France, 15-17 September 2004.
Architectures & principles + Specific algorithms
Adaptive Systems versus Elements
[Diagram labels: local entity rules, interactions, global behaviour, tune/control]
Novel design principles...
Problem statement: Increasingly complicated (diverse, dynamic and seemingly unpredictable) ICT systems have created a management crisis, with serious reliability issues and a severe loss of confidence from the average user.
Traditional solution: Reinforce central control, artificially reduce diversity through enforcement of restrictive usage policies, “fight” emergent system properties by blocking unsupervised (e.g. P2P) interactions.
Consequences: Escalating cost of ownership, waste of computing resources, lost opportunities for innovative use of technology.
Alternative solution: Reject “complication”, embrace “complexity”. Adopt novel design principles that take advantage of emergent phenomena, learn to rely on statistical rather than deterministic predictability, focus on developing methods to “promote” desirable system properties rather than on “micro-management”.
Some (NI) Design Heuristics (1)
• Local rules - wherever possible use local rules and decision making to achieve overall behaviour
• Interactions - by combining local decision making with carefully crafted interactions between neighbourhood nodes/entities the desired global behaviour can often be achieved
• Positive and negative feedback - biological systems make extensive use of feedback to control processes and achieve robust design of structure and behaviour
Some (NI) Design Heuristics (2)
• Decentralised solutions - often a given problem is in essence a decentralised problem - in this case a decentralised solution may be well matched
– In addition, nodes in a decentralised system often "bring their own resources" which can help provide a scalable solution
• Engineered-in behaviour versus explicit external control - where possible it is preferable to embody some management within the system itself
– Policy-based management is still appropriate and possible via tuning parameters and via the system's in-built adaptability
Some examples using these approaches to create adaptive solutions...
Ex#1: “FlyPhones”
From Fruit Flies to Mobile Phones
Frequency allocation - the problem
• Interference between neighboring base stations should be prevented.
• The number of available frequencies is limited.
• Bandwidth has to be very carefully distributed between adjacent cells.
Cell differentiation for fruit flies
• Most cells have the potential to make bristles.
• They all start to express the corresponding gene (greyscale = density of the associated transcription factor)
• But they gradually specialise until only a few actually develop a bristle.
• Clearly a self-organised process.
Underlying mechanism
• This process obeys a fully decentralised control mechanism.
• It involves a local positive feedback loop…
• Coupled with cross-inhibition of neighboring cells (Delta-Notch signalling)
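The interplay of local positive feedback and cross-inhibition can be illustrated with a toy two-cell model. This is only a sketch and not a model of real Delta-Notch signalling: the update rule, the parameter values and the name `lateral_inhibition` are assumptions chosen to show how a small initial difference is amplified until one cell dominates.

```python
def lateral_inhibition(a, b, rate=0.5, steps=200):
    """Toy two-cell model: each cell reinforces itself and inhibits the other.

    a, b are expression levels in [0, 1]. The rule is illustrative only.
    """
    for _ in range(steps):
        # x * (1 - x) keeps each level inside [0, 1];
        # the difference term grows the leading cell and suppresses the trailing one.
        da = rate * a * (1 - a) * (a - b)
        db = rate * b * (1 - b) * (b - a)
        a, b = a + da, b + db
    return a, b


# A 2% head start is enough for the first cell to take over.
winner, loser = lateral_inhibition(0.51, 0.49)
```

With identical starting levels the two cells stay balanced at the unstable equilibrium; any small perturbation tips the pair towards one "bristle" cell and one suppressed neighbour.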
Why is an original solution needed?
• This is in fact a very complex problem:
– The base stations are not regularly distributed (i.e. they don’t have the same size and/or number of neighbours)
– The continuously fluctuating traffic must be taken into account.
• A centralised decision process is not particularly well adapted…
• But a system allocating frequencies on the basis of local competition between cells is.
The “Flyphones” algorithm
• The equivalent of the natural feedback loop is implemented for each available frequency.
• Through the “negotiation” process, each base station starts to develop a preference for some frequencies (moving away from the unstable equilibrium)…
• And simultaneously inhibits its neighbours from using them.
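The negotiation described above can be sketched as one feedback loop per frequency running on every base station. The code below is a hedged illustration, not the FlyPhones update rule itself: the function name `flyphones_sketch`, the form of the update, the `rate` and `steps` values and the three-station line topology are all assumptions.

```python
def flyphones_sketch(neighbours, prefs, rate=0.5, steps=300):
    """Run one lateral-inhibition loop per frequency, then pick a channel per station.

    neighbours -- dict: station -> list of adjacent stations
    prefs      -- dict: station -> list of preferences in [0, 1], one per frequency
    """
    prefs = {s: list(p) for s, p in prefs.items()}  # work on a copy
    n_freq = len(next(iter(prefs.values())))
    for _ in range(steps):
        new = {}
        for s, p in prefs.items():
            row = []
            for f in range(n_freq):
                # Inhibition felt on frequency f: mean preference of the neighbours.
                inhib = sum(prefs[n][f] for n in neighbours[s]) / len(neighbours[s])
                # Reinforce f when ahead of the neighbours, suppress it when behind.
                row.append(p[f] + rate * p[f] * (1 - p[f]) * (p[f] - inhib))
            new[s] = row
        prefs = new  # synchronous update of all stations
    # Each station keeps the frequency it ended up preferring most.
    return {s: max(range(n_freq), key=lambda f: p[f]) for s, p in prefs.items()}


# Three stations in a line (A - B - C), two frequencies.
# Only A starts with a slight bias; B and C sit at the unstable equilibrium.
neighbours = {"A": ["B"], "B": ["A", "C"], "C": ["B"]}
start = {"A": [0.51, 0.5], "B": [0.5, 0.5], "C": [0.5, 0.5]}
allocation = flyphones_sketch(neighbours, start)
```

In this toy run A's slight bias pushes B off frequency 0, which in turn frees C to adopt it, so adjacent stations end on different frequencies without any central decision.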
Flyphones
Nature-inspired innovation
• Self-organising
• Self-healing
• Micro Cell Decisions
• Macro Result
‘Autonomic’ Network
• Dynamic
• Scale-independent
• Distributed
• Self-organising
• Self-healing
58 base stations, 4 from 29 channels: 10^253 solutions
680 base stations, 6 from 42 channels: 10^4031 solutions
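The size of the search space for the 58-station network (of order 10^253) can be checked with a short calculation. The sketch assumes, which the slide does not state, that every base station independently picks an unordered subset of channels and that interference constraints are ignored; `log10_assignments` is an illustrative name.

```python
import math


def log10_assignments(stations, channels, per_station):
    """log10 of the number of ways every station can pick its channels independently."""
    return stations * math.log10(math.comb(channels, per_station))


# 58 stations, each choosing 4 of 29 channels: C(29, 4) ** 58 candidate plans.
digits = log10_assignments(58, 29, 4)
```

Under that assumption `digits` falls between 253 and 254, which is why exhaustive search is out of the question even for the smaller network.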
Applying FlyPhones
[Chart: Comparing Different Types of Frequency Constraints. x-axis: Time/s (0 to 250); y-axis: Constraint (0 to 500). Series: Co-Site + Far-Site; Co-Site + Far-Site + Spurious; Co-Site + Far-Site + Spurious + Intermod.]
Field Network Radio scenario
- 600 lines of communication, 250 channels, mobile transceivers
- currently solved centrally, statically (obvious difficulties!)
- FlyPhones can solve it too
- contract research to use FlyPhones in dynamic management
Ex#2: SelfService - an autonomic protocol for dynamic service provision
Telco network->ICT services; Bgd; NI
“Alone in the world” (~PC model)
[Diagram: three nodes (A, B, C), each with needs for and installed copies of services x, y and z. Legend: “Need for service x”, “Installed module x”.]
Pros and cons
• Highly robust to node or network failure.
• End user has total control.
• Need a lot of onboard power (good for hardware manufacturers!).
• Need many copies of every application (good for software manufacturers!).
• Need virtually no ICT infrastructure (not good for service providers!).
• Amazing waste of resources (“99% idle time” syndrome).
“Thin client”
[Diagram: nodes A, B, C and D with services x, y and z. Legend: “Need for service x”, “Installed module x”, “Client-server relationship”.]
Pros and cons
• Extremely brittle (single point of failure!)
• Administrator has total control.
• Mixed picture for the hardware industry (need powerful servers, but only low-end PCs)
• Mixed picture for the software industry (depends on license management).
• Mixed picture for ICT providers (network services are paramount, but risk of bottlenecks and QoS degradation).
Quotes: (from IBM’s Autonomic Computing Manifesto)
• “An autonomic computing system knows its environment and the context surrounding its activity, and acts accordingly (…) It will tap available resources, even negotiate the use by other systems of its underutilized elements, changing both itself and its environment in the process.”
• “An autonomic computing system cannot exist in a hermetic environment (…) Standard ways of system identification, communication and negotiation – perhaps even new classes of system-neutral intermediates or ‘agents’ specifically assigned the role of cyber-diplomats to regulate conflicting resource demands – need to be invented and agreed on.”
• n.b. Plus industry trends: SOA, P2P, Grid, Web services
“SelfService”
[Diagram: nodes A, B and C with services x, y and z, captioned “P2P service provision”. Legend: “Need for service x”, “Installed module x”, “Service-specific client-server relationship”.]
Pros and cons
• Intermediate robustness (no single point of failure, but problems will tend to be “non-local”).
• End user is back in charge (decides what to install).
• Interesting model for sharing resources (i.e. P2P utility computing).
• A clear step towards pervasive ICT and a great opportunity for service providers, if it works...
What is the challenge?
• The difficulty is of course to ensure adequate service coverage, in terms of accessibility, reliability, latency etc.
• This has to be achieved without central control or planning, otherwise:
– It won’t scale.
– We’ll lose many of the benefits (in terms of agility and adaptability).
“SelfService”
• Objectives:
– To support reliable, fault-tolerant access to a sub-set of services, which are required at local “access points”.
– To reduce the need for installation/running of the corresponding software modules on local devices used as access points.
– Without having to rely on dedicated servers.
• Underlying hypothesis: There are unpredictable but consistent patterns of activity, which can be used to select stable partnerships (e.g. “device X, hosting service S1, is able to provide it to device Y for 80% of business hours”).
Experimental algorithm
[Flowchart. Steps: START (generate request); broadcast request; targeted request; store provider’s address; download component; increment delay; “forget” provider; (re-)examine request; process or send job; EXIT. Decision points: already know provider?; reply received? (twice); delay reached unacceptable limit?]
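One plausible reading of the flowchart is sketched below. The connections between the boxes are not recoverable from the slide, so the control flow here is an assumption, as are the names `request_service`, `ask` and `broadcast`, the `max_delay` threshold and the `max_rounds` safety limit.

```python
def request_service(known, service, ask, broadcast, max_delay=3, max_rounds=20):
    """Assumed request loop for one service.

    known     -- dict: service -> provider address (updated in place)
    ask       -- callable(provider, service) -> True if the provider replied
    broadcast -- callable(service) -> provider address, or None if nobody replied
    """
    delay = 0
    for _ in range(max_rounds):            # safety limit, not part of the flowchart
        provider = known.get(service)
        if provider is None:               # no known provider: broadcast request
            provider = broadcast(service)
            if provider is None:
                return "download component"  # nobody offers it: install locally
            known[service] = provider      # store provider's address
        if ask(provider, service):         # targeted request, reply received
            return "job sent to " + provider
        delay += 1                         # no reply: increment delay
        if delay >= max_delay:             # delay reached unacceptable limit
            del known[service]             # "forget" provider, re-examine request
            delay = 0
    return "unresolved"


# Provider B has gone quiet; C answers broadcasts and requests.
replies = {"B": False, "C": True}
known = {"x": "B"}
result = request_service(known, "x",
                         ask=lambda p, s: replies[p],
                         broadcast=lambda s: "C")
```

In this run the node retries B until the delay limit is hit, forgets it, discovers C by broadcast and ends with C stored as its provider for service x.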
Locally maintained information
[Diagram: record kept locally for each service. Labels: Service, Subscription, ID, QoS value, requests, success, and a list of KnownProviderID / score entries.]
Decision rules
[Diagram: per-service record. Labels: Service, Subscription, ID, QoS value, requests, success, and a list of KnownProviderID / score entries.]
• Maintain a “subscription” to each required service component
• Can keep a record of QoS attributes, such as speed of response
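The locally maintained record could be represented roughly as follows. The field names follow the labels on the slide, but the scoring rule (plus or minus one per request) and the running-average QoS update are assumptions made for illustration, as are the names `Subscription`, `record` and `best_provider`.

```python
from dataclasses import dataclass, field


@dataclass
class Subscription:
    """Per-service record a node might keep; update rules are illustrative."""
    service_id: str
    qos_value: float = 0.0          # assumed: running average of response time
    requests: int = 0
    successes: int = 0
    providers: dict = field(default_factory=dict)  # KnownProviderID -> score

    def record(self, provider_id, ok, response_time, weight=0.2):
        """Log one request outcome and adjust the provider's score."""
        self.requests += 1
        self.successes += 1 if ok else 0
        self.providers[provider_id] = self.providers.get(provider_id, 0) + (1 if ok else -1)
        if ok:
            # Exponentially weighted average of observed response times.
            self.qos_value += weight * (response_time - self.qos_value)

    def best_provider(self):
        """Highest-scoring known provider, or None if none is known."""
        return max(self.providers, key=self.providers.get, default=None)


sub = Subscription("x")
sub.record("B", ok=True, response_time=2.0)
sub.record("C", ok=False, response_time=0.0)
sub.record("B", ok=True, response_time=4.0)
```

Keeping a score per provider is what lets a node settle into the stable partnerships that the underlying hypothesis relies on.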
“Alone in the world” (benchmark)
[Chart. x-axis: Time-step (0 to 1000); y-axis: Fraction of required modules installed locally (average), 0% to 100%. Series: P = 0.05, P = 0.1, P = 0.2.]
Load balancing
[Two histograms. x-axis: Average delay (bins 0 - 1, 1 - 10, 10 - 100, 100 - 1000); y-axis: Number of peers (logarithmic scale, 1 to 1000).]
Broadcasts & downloads (~bandwidth)
[Chart. x-axis: Time-step (0 to 1000); y-axis: Fraction (percentage), 0% to 100%. Series: Broadcast requests; Locally installed components.]
Pervasive “SelfService”
• Mobile nodes (e.g. PDAs)
• Regular, but unpredictable, daily activity cycles.
• Colour code = QoS.
• Size = number of modules installed.
• Grey links = in range.
• White links = identified opportunities for co-operation.
Biological Morphogenesis as a source of inspiration?
• In morphogenesis, individual stem cells simultaneously differentiate (specialise) and move in space (equivalent to rewiring in a network)
• In this way cells of the right type occupy the right location in the developing organism
• Neighbours influence each other’s choice via a dynamic web of positive and negative feedback
• This aspect of the developmental process shares many characteristics with co-operative peer-to-peer (P2P) service provision across networks
– deciding which service to host is equivalent to differentiation
– selecting providers via “rewiring” is similar to cell migration
Ex#3: Embryo
• Embryo is a fully decentralised, autonomic service management framework, inspired by morphogenesis
• It is capable of inducing the local installation of components AND modifying the topology of a peer-to-peer interaction overlay network
• In doing so, it adapts the overall system so as to meet the needs of the majority of peers
• Simulations show that Embryo supports deployment of new applications, adding or retiring of service components and re-scaling (reallocating resources) “on the fly” without explicit management. => Autonomic
Embryo’s key mechanisms
• Rewiring (~Cell migration)
– nodes send “adverts” offering their available service components, as well as their own needs
– these signals are propagated using local “gossiping”
– links are made where there is a reciprocal match of offer & need
– nodes maintain an awareness of their local context via adverts they have seen “pass them by”
• Changing type (~Cell differentiation)
– a node is permitted to be of only one type (i.e. host only one service component) since we wish to explore a node’s ability to dynamically reconfigure within its current context
– neighbours exhibit implicit mutual inhibition since there is a reduced pressure to offer a service that is already offered by a neighbour
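The two mechanisms can be sketched in a few lines. This is a hedged illustration rather than Embryo's actual rules: the advert layout (a dict with `offers` and `needs`), the function names `reciprocal_match` and `choose_type`, and the "unmet demand" scoring are all assumptions.

```python
from collections import Counter


def reciprocal_match(a, b):
    """Rewiring test: link two nodes only if each offers something the other needs."""
    return a["offers"] in b["needs"] and b["offers"] in a["needs"]


def choose_type(current, adverts):
    """Changing type: host the service with the largest unmet demand in adverts seen."""
    pressure = Counter()
    for ad in adverts:
        pressure.update(ad["needs"])    # a need raises the pressure to host that service
        pressure[ad["offers"]] -= 1     # an existing offer lowers it (implicit inhibition)
    best, score = max(pressure.items(), key=lambda kv: kv[1], default=(current, 0))
    # Only switch when another service is under more pressure than the current one.
    return best if score > pressure[current] else current


n1 = {"offers": "s1", "needs": {"s2"}}
n2 = {"offers": "s2", "needs": {"s1"}}
n3 = {"offers": "s1", "needs": {"s3"}}

linked = reciprocal_match(n1, n2)       # n1 and n2 can serve each other
new_type = choose_type("s1", [n2, n3])  # s1 is already offered nearby, s3 is not
```

Here a node currently hosting s1 sees that a neighbour already offers s1 while nobody offers s3, so it switches to s3; no explicit inhibitory signal is exchanged.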
Embryo: simulation of an autonomic service management framework
• Each node acts as an “access point” for one application type (indicated by shape)
• Each node hosts and makes available one service component type (indicated by colour & number)
• The three strips show which (of 3) application types require which service components
Embryo: some simulation results
[Chart: Time until stable (N = 9). x-axis: Population size (0 to 416); y-axis: Time until stable (0 to 16000).]
• The time the system takes to converge to a stable state grows only logarithmically with population size (i.e. number of nodes or cells)
• This suggests good scalability properties
“Rewiring” ~ Cell migration/adhesion
[Chart: Successful handshakes (average per node, N = 9). x-axis: Population size (0 to 288); y-axis: Handshakes (0 to 18).]
• The number of “corrective actions” (rewiring events) a node must make before the system reaches steady state grows slowly with population size
• Note that when there are more service types scalability is better still (downward trend - not shown)
Changing type ~ Differentiation
[Chart: Metamorphoses (average per node, N = 9). x-axis: Population size (0 to 416); y-axis: Metamorphoses (0 to 6).]
• Bigger population sizes (i.e. more nodes/cells) require fewer differentiation events per node
• This suggests good scalability properties
Some Design Heuristics (1)
• Local rules - wherever possible use local rules and decision making to achieve overall behaviour
• Interactions - by combining local decision making with carefully crafted interactions between neighbourhood nodes/entities the desired global behaviour can often be achieved
• Positive and negative feedback - biological systems make extensive use of feedback to control processes and achieve robust design of structure and behaviour
– Flyphones uses explicit inhibitory signalling
– Embryo exhibits implicit inhibition, i.e. if a neighbour offers a service then I don’t need to offer it
Some Design Heuristics (2)
• Decentralised solutions - often a given problem is in essence a decentralised problem (c.f. Flyphones) - in this case a decentralised solution may be well matched. In addition, nodes in a decentralised system often "bring their own resources" which can help provide a scalable solution
• Engineered-in behaviour versus explicit external control - where possible it is preferable to embody some management within the system itself; policy-based management is still appropriate and possible via tuning parameters and via the system's in-built adaptability
From Heuristics to Design Patterns?
• Babaoglu has proposed using the following “Design Patterns” (abstractions) taken from Biology…
• Plain Diffusion and Reaction-Diffusion
• Chemotaxis - movement over gradients
• Replication - an epidemics metaphor
• Stigmergy - distributed control via environment
• …these are cast in a common framework describing topologies, and overlaid “communication strategies”.
Ref: Design Patterns from Biology for Distributed Computing. Ozalp Babaoglu, Geoffrey Canright, Andreas Deutsch, Gianni Di Caro, Frederick Ducatelle, Luca Gambardella, Niloy Ganguly, Márk Jelasity, Roberto Montemanni, and Alberto Montresor.
Design principles
• The desire of autonomic computing is for ICT systems that are capable of “looking after themselves” without external intervention
• Self-managing distributed systems are the only alternative to complication-induced paralysis
• Engineering individual components capable of self-organising reliably into a whole that is “greater than the sum of its parts” is the only way to design such systems
– multi-scale adaptability that takes advantage of emergent properties is key
• We can use nature-inspired “architectures & principles” or more literal instances of “algorithms found in nature”
– both are helpful and valid approaches
Additional benefits of the approach
• “High level” policies can provide control, without having to explicitly plan or manage the details
• Control can be delegated where appropriate to “engineered-in” self-* properties
• Systems can adapt to changes, such as changes in demand
• Robustness to failure of components
• Improved scalability
• The approach often brings many of the above benefits as well as comparable performance to existing solutions
Self-organisation
• By definition, a system composed of units making autonomous decisions based on locally available information can only be “driven” to a desirable state via self-organisation.
• This requires engineering the reasoning and decision-making engine running on individual units so as to promote the emergence of the “right” collective behaviour.
• In turn, this means adapting the predictive techniques of natural complexity science (both analytical and numerical) to meet the needs of artificial complex systems designers.
• This may seem relatively trivial in principle, but experimental validation requires prototype implementation.
“Complex Systems” approaches
• The tools of Complex Systems research are likely to prove increasingly important:
– this can be in terms of “engineering-in” the right properties… (E.g. this approach helped with the design of the SWAN system.)
– or in terms of helping understand macroscopic behaviour.
• …which in a way is the same thing: understanding the macroscopic properties resulting from a variety of possible rules/parameter combinations is key to selecting a set that will promote emergence of the desired system behaviour.
Summary… and challenges!
• The move to huge-scale, dynamic, heterogeneous networks and systems requires new approaches to complement the existing ICT engineering toolkit
– the world has never been more ready for NIS thinking...
• Bio-inspired solutions coupled with tools from complexity science provide a rich source of such novel approaches
– we already have many examples of “point solutions”
• The challenge is to move beyond bio-inspired design heuristics to bio-inspired design patterns and principles
– Babaoglu’s framework is an interesting early example of this
– but how do we plan to get NIS solutions into “mainstream” computer science and Software Engineers’ thinking & day-to-day toolkits?
Thank you
Acknowledgements:
• Fabrice Saffre
• Richard Tateson