Naturally Adaptive ICT

Mark Shackleton

Pervasive ICT Research Centre

BT Research & Venturing

Talk outline

• BT Research & Venturing

• Why BT is interested in Nature-inspired Computing and Communications

• Adaptive and Autonomic Systems

• Some autonomic Self-* examples:

– “Fly Phones”: self-configuring channel allocation for mobile networks using principles from developmental biology

– “Self Service”: an autonomic protocol for dynamic service provision

– “Embryo”: a fully decentralised, autonomic service management framework, inspired by morphogenesis

• From nature-inspired heuristics to engineering and design principles

BT Research & Venturing

• BT’s Research facility

• Provides innovation and R&D for BT’s lines of business

• About 300 people based at:

– Adastral Park, Martlesham Heath, Ipswich, Suffolk

• My team: nature-inspired & adaptive solutions

Today’s Problems with IT / ICT

• ICT systems are becoming so complicated that they are increasingly unmanageable:

– sheer scale of current and envisaged ICT deployments

– heterogeneity of the underlying infrastructure that nobody (“no single person”) can understand

– unanticipated and unwanted interactions between components

• These add up to frequent failures or sub-optimal system-level behaviour, and costly, error-prone system administration

• “Nature-inspired Computing” and more specifically “Autonomic Computing” - by analogy to the human autonomic nervous system, regulating “basic” functions e.g. blood pressure, heart rate, breathing

“Autonomic Computing”

TCO, Resilience, Telephony, SysAdmin->BusGoals

Autonomic Computing

• Self-configuring

– adaptation to IT system changes, such as new nodes becoming available or going offline

• Self-optimising

– tuning resources and load balancing

• Self-protecting

– guard against damage from attacks or failures

• Self-healing

– recovery from, or work around, failed components

BT’s interest in Nature-inspired and Autonomic Solutions

• The Digital Networked Economy will require the support of a highly adaptive underlying ICT infrastructure.

• This will be dynamic, heterogeneous and support multiple domains of ownership and control.

• It will need to adapt to transient changes in demand as well as longer-term usage trends.

• It will embed “autonomic behaviour” that reduces deployment and running costs, whilst enhancing resilience.

Adaptive / Autonomic ICT - decentralised, autonomic, lightweight

From: complex & costly manual management

To: self-managing “autonomic” ICT solutions

And: resilient provision of services in a complex Pervasive ICT world

• Self-healing ICT “immune systems”

• Self-managing Peer-to-Peer and Decentralised architectures

• Complex systems engineering & nature-inspired solutions

[Diagram label: network]

“Autonomic Computing”

…a nature-inspired analogy!

IBM’s “autonomic mgr” control loop

The Autonomic Analogy

“..we should not only consider what the autonomic nervous system does but also how it does it… Successful self-management lies in the way [biological systems] achieve this functionality.”

From: O. Babaoglu, M. Jelasity, A. Montresor. Grassroots Approach to Self-Management in Large-Scale Distributed Systems. In Proceedings of the EU-NSF Strategic Research Workshop on Unconventional Programming Paradigms, Mont Saint-Michel, France, 15-17 September 2004.

Architectures & principles + Specific algorithms

Adaptive Systems versus Elements

[Diagram labels: Local entity rules; Interactions; Global behaviour; Tune/control]

Novel design principles...

Problem statement: Increasingly complicated (diverse, dynamic and seemingly unpredictable) ICT systems have created a management crisis, with serious reliability issues and a severe loss of confidence from the average user.

Traditional solution: Reinforce central control, artificially reduce diversity through enforcement of restrictive usage policies, “fight” emergent system properties by blocking unsupervised (e.g. P2P) interactions.

Consequences: Escalating cost of ownership, waste of computing resources, lost opportunities for innovative use of technology.

Alternative solution: Reject “complication”, embrace “complexity”. Adopt novel design principles that take advantage of emergent phenomena, learn to rely on statistical rather than deterministic predictability, focus on developing methods to “promote” desirable system properties rather than on “micro-management”.

Some (NI) Design Heuristics (1)

• Local rules - wherever possible use local rules and decision making to achieve overall behaviour

• Interactions - by combining local decision making with carefully crafted interactions between neighbourhood nodes/entities the desired global behaviour can often be achieved

• Positive and negative feedback - biological systems make extensive use of feedback to control processes and achieve robust design of structure and behaviour

Some (NI) Design Heuristics (2)

• Decentralised solutions - often a given problem is in essence a decentralised problem - in this case a decentralised solution may be well matched

– In addition, nodes in a decentralised system often "bring their own resources" which can help provide a scalable solution

• Engineered-in behaviour versus explicit external control - where possible it is preferable to embody some management within the system itself

– Policy-based management is still appropriate and possible via tuning parameters and via the system's in-built adaptability

Some examples using these approaches to create adaptive solutions...

Ex#1: “FlyPhones”

From Fruit Flies to Mobile Phones

Frequency allocation - the problem

• Interference between neighboring base stations should be prevented.

• The number of available frequencies is limited.

• Bandwidth has to be very carefully distributed between adjacent cells.

Cell differentiation for fruit flies

• Most cells have the potential to make bristles.

• They all start to express the corresponding gene (greyscale = density of the associated transcription factor)

• But they gradually specialise until only a few actually develop a bristle.

• Clearly a self-organised process.

Underlying mechanism

• This process obeys a fully decentralised control mechanism.

• It involves a local positive feedback loop…

• Coupled with cross-inhibition of neighboring cells (Delta-Notch signalling)

Why is an original solution needed?

• This is in fact a very complex problem:

– The base stations are not regularly distributed (i.e. they don’t have the same size and/or number of neighbours)

– The continuously fluctuating traffic must be taken into account.

• A centralised decision process is not particularly well adapted…

• But a system allocating frequencies on the basis of local competition between cells is.

The “Flyphones” algorithm

• The equivalent of the natural feedback loop is implemented for each available frequency.

• Through the “negotiation” process, each base station starts to develop a preference for some frequencies (moving away from the unstable equilibrium)…

• And simultaneously inhibits its neighbours from using them.
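The feedback loop described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the FlyPhones implementation: the update rule, the per-station normalisation, the parameter names `gain` and `inhibition`, and the helper `preferred_channel` are all invented for illustration.

```python
import random


def flyphones_sketch(neighbours, n_channels, steps=200,
                     gain=0.1, inhibition=0.2, seed=0):
    """Toy model: every station keeps one preference value per channel.

    `neighbours` maps a station name to the list of stations it interferes with.
    All parameters are hypothetical; they are not taken from the talk.
    """
    rng = random.Random(seed)
    # Start close to the unstable equilibrium: near-equal preferences plus noise.
    pref = {s: [1.0 + rng.uniform(-0.01, 0.01) for _ in range(n_channels)]
            for s in neighbours}
    for _ in range(steps):
        updated = {}
        for s, nbrs in neighbours.items():
            raw = []
            for c in range(n_channels):
                # Local positive feedback: a preference reinforces itself.
                boost = gain * pref[s][c]
                # Cross-inhibition: neighbours' preference for c pushes ours down.
                push = inhibition * sum(pref[n][c] for n in nbrs)
                raw.append(max(0.0, pref[s][c] + boost - push))
            total = sum(raw)
            # Normalise so that channels compete within a station; keep the old
            # values if every channel was inhibited down to zero.
            updated[s] = [v / total for v in raw] if total > 0 else pref[s]
        pref = updated
    return pref


def preferred_channel(pref_row):
    """Index of the channel a station currently prefers most."""
    return max(range(len(pref_row)), key=pref_row.__getitem__)
```

With two neighbouring stations, e.g. `flyphones_sketch({"A": ["B"], "B": ["A"]}, n_channels=2)`, the small initial asymmetry is amplified until each station settles on a different channel. On larger, irregular networks this simplified rule may settle with residual conflicts, so it should be read as an illustration of the principle only.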

Flyphones

Nature-inspired innovation

• Self-organising

• Self-healing

• Micro Cell Decisions

• Macro Result

‘Autonomic’ Network

• Dynamic

• Scale-independent

• Distributed

• Self-organising

• Self-healing

58 base stations, 4 from 29 channels: 10^253 solutions

680 base stations, 6 from 42 channels: 10^4031 solutions

Applying FlyPhones

[Figure: Comparing Different Types of Frequency Constraints. x-axis: Time/s (0 to 250); y-axis: Constraint (0 to 500). Series: Co-Site + Far-Site; Co-Site + Far-Site + Spurious; Co-Site + Far-Site + Spurious + Intermod.]

Field Network Radio scenario

– 600 lines of communication, 250 channels, mobile transceivers

– currently solved centrally, statically (obvious difficulties!)

– FlyPhones can solve it too

– contract research to use FlyPhones in dynamic management

Ex#2: SelfService - an autonomic protocol for dynamic service provision

Telco network->ICT services; Bgd; NI

“Alone in the world” (~PC model)

[Diagram: nodes A, B and C with services x, y and z. Legend: “Need for service x”; “Installed module x”.]

Pros and cons

• Highly robust to node or network failure.

• End user has total control.

• Need a lot of onboard power (good for hardware manufacturers!).

• Need many copies of every application (good for software manufacturers!).

• Need virtually no ICT infrastructure (not good for service providers!).

• Amazing waste of resources (“99% idle time” syndrome).

“Thin client”

[Diagram: nodes A, B, C and D with services x, y and z. Legend: “Need for service x”; “Installed module x”; “Client-server relationship”.]

Pros and cons

• Extremely brittle (single point of failure!)

• Administrator has total control.

• Mixed picture for the hardware industry (need powerful servers, but only low-end PC’s)

• Mixed picture for the software industry (depends on license management).

• Mixed picture for ICT providers (network services are paramount, but risk of bottlenecks and QoS degradation).

Quotes (from IBM’s Autonomic Computing Manifesto):

• “An autonomic computing system knows its environment and the context surrounding its activity, and acts accordingly (…) It will tap available resources, even negotiate the use by other systems of its underutilized elements, changing both itself and its environment in the process.”

• “An autonomic computing system cannot exist in a hermetic environment (…) Standard ways of system identification, communication and negotiation – perhaps even new classes of system-neutral intermediates or ‘agents’ specifically assigned the role of cyber-diplomats to regulate conflicting resource demands – need to be invented and agreed on.”

• n.b. Plus industry trends: SOA, P2P, Grid, Web services

“SelfService”

[Diagram: P2P service provision between nodes A, B and C with services x, y and z. Legend: “Need for service x”; “Installed module x”; “Service-specific client-server relationship”.]

Pros and cons

• Intermediate robustness (no single point of failure, but problems will tend to be “non-local”).

• End user is back in charge (decides what to install).

• Interesting model for sharing resources (i.e. P2P utility computing).

• A clear step towards pervasive ICT and a great opportunity for service providers, if it works...

What is the challenge?

• The difficulty is of course to ensure adequate service coverage, in terms of accessibility, reliability, latency etc.

• This has to be achieved without central control or planning, otherwise:

– It won’t scale.

– We’ll lose many of the benefits (in terms of agility and adaptability).

“SelfService”

• Objectives:

– To support reliable, fault-tolerant access to a sub-set of services, which are required at local “access points”.

– To reduce the need for installation/running of the corresponding software modules on local devices used as access points.

– Without having to rely on dedicated servers.

• Underlying hypothesis: There are unpredictable but consistent patterns of activity, which can be used to select stable partnerships (e.g. “device X, hosting service S1, is able to provide it to device Y for 80% of business hours”).

Experimental algorithm

[Flowchart. Boxes: START (generate request); Broadcast request; Store provider’s address; Download component; Targeted request; Increment delay; “Forget” provider; (Re-)examine request; Process or send job; EXIT. Decisions (branches labelled “yes”): Already know provider; Reply received (twice); Delay reached unacceptable limit.]
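One possible reading of the flowchart above is sketched below. The edges of the original diagram are not recoverable from the text, so the control flow here is an assumption, as are the class name `AccessPoint`, its fields, the `max_delay` limit and the two callables standing in for the network.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Set


@dataclass
class AccessPoint:
    """One peer's local state in the sketch; all names are hypothetical."""
    max_delay: int = 3
    providers: Dict[str, str] = field(default_factory=dict)  # service -> provider address
    delay: Dict[str, int] = field(default_factory=dict)      # service -> unanswered requests
    installed: Set[str] = field(default_factory=set)         # locally installed components

    def request(self, service: str,
                broadcast: Callable[[str], Optional[str]],
                targeted: Callable[[str, str], bool]) -> str:
        """Handle one request and report which branch was taken."""
        if service in self.installed:
            return "local"                      # process the job locally
        provider = self.providers.get(service)
        if provider is None:
            provider = broadcast(service)       # returns an address, or None if no reply
            if provider is None:
                self.installed.add(service)     # nobody answered: download the component
                return "downloaded"
            self.providers[service] = provider  # store provider's address
        if targeted(provider, service):
            self.delay[service] = 0
            return "remote"                     # job sent to the known provider
        # No reply: increment the delay and forget the provider at the limit.
        self.delay[service] = self.delay.get(service, 0) + 1
        if self.delay[service] >= self.max_delay:
            del self.providers[service]
            self.delay[service] = 0
            return "forgot"
        return "retry"
```

In this reading, a peer installs a component locally only when no provider answers a broadcast, which is consistent with the stated objective of reducing local installation without relying on dedicated servers.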

Locally maintained information

[Tables. Field labels: Service; Subscription; ID; QoS value; requests; success; KnownProviderID; score.]

Decision rules

[Tables. Field labels: Service; Subscription; ID; QoS value; requests; success.]

• Maintain a “subscription” to each required service component

• Can keep a record of QoS attributes, such as speed of response
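A subscription record of this kind might look as follows. This is a minimal sketch: the slides name the fields (service, QoS value, requests, success, provider ID, score) but not how they are computed, so treating the QoS value as a mean response time and the score as a success ratio are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Subscription:
    """Locally maintained record for one required service component."""
    service: str
    provider_id: str
    requests: int = 0
    success: int = 0
    qos_value: float = 0.0  # assumed here: mean response time of successful requests

    def record(self, succeeded: bool, response_time: float) -> None:
        """Update the counters after one request to the provider."""
        self.requests += 1
        if succeeded:
            self.success += 1
            # Incremental mean over successful requests only.
            self.qos_value += (response_time - self.qos_value) / self.success

    @property
    def score(self) -> float:
        """Assumed scoring rule: fraction of requests that succeeded."""
        return self.success / self.requests if self.requests else 0.0
```

A decision rule could then compare `score` or `qos_value` against a threshold to decide whether to keep a provider, look for another one, or install the component locally.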

“Alone in the world” (benchmark)

[Figure: fraction of required modules installed locally (average), 0% to 100%, against time-step (0 to 1000). Series: P = 0.05; P = 0.1; P = 0.2.]

Load balancing

[Figure: two histograms of number of peers (logarithmic scale, 1 to 1000) against average delay, binned as 0 - 1, 1 - 10, 10 - 100 and 100 - 1000.]

Broadcasts & downloads (~bandwidth)

[Figure: fraction (percentage, 0% to 100%) against time-step (0 to 1000). Series: Broadcast requests; Locally installed components.]

Pervasive “SelfService”

• Mobile nodes (e.g. PDAs)

• Regular, but unpredictable, daily activity cycles.

• Colour code = QoS.

• Size = number of modules installed.

• Grey links = in range.

• White links = identified opportunities for co-operation.

Biological Morphogenesis as a source of inspiration?

• In morphogenesis, individual stem cells simultaneously differentiate (specialise) and move in space (equivalent to rewiring in a network)

• In this way cells of the right type occupy the right location in the developing organism

• Neighbours influence each other’s choice via a dynamic web of positive and negative feedback

• This aspect of the developmental process shares many characteristics with co-operative peer-to-peer (P2P) service provision across networks

– deciding which service to host is equivalent to differentiation

– selecting providers via “rewiring” is similar to cell migration

Ex#3: Embryo

• Embryo is a fully decentralised, autonomic service management framework, inspired by morphogenesis

• It is capable of inducing the local installation of components AND modifying the topology of a peer-to-peer interaction overlay network

• In doing so, it adapts the overall system so as to meet the needs of the majority of peers

• Simulations show that Embryo supports deployment of new applications, adding or retiring of service components and re-scaling (reallocating resources) “on the fly” without explicit management. => Autonomic

Embryo’s key mechanisms

• Rewiring (~Cell migration)

– nodes send “adverts” offering their available service components, as well as their own needs

– these signals are propagated using local “gossiping”

– links are made where there is a reciprocal match of offer & need

– nodes maintain an awareness of their local context via adverts they have seen “pass them by”

• Changing type (~Cell differentiation)

– a node is permitted to be of only one type (i.e. host only one service component) since we wish to explore a node’s ability to dynamically reconfigure within its current context

– neighbours exhibit implicit mutual inhibition since there is a reduced pressure to offer a service that is already offered by a neighbour
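The two mechanisms above can be sketched as a pair of local rules. This is an illustrative assumption rather than Embryo's actual code: the `Node` fields, the notion of "pressure" as observed demand minus nearby supply, and the function names are all hypothetical.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import List, Set


@dataclass
class Node:
    name: str
    offers: str      # the single service component this node hosts
    needs: Set[str]  # components required by the node's own application
    # Needs observed in adverts that have "passed by" (local context).
    seen_needs: Counter = field(default_factory=Counter)


def reciprocal_match(a: Node, b: Node) -> bool:
    """Rewiring rule: link only if each node offers something the other needs."""
    return a.offers in b.needs and b.offers in a.needs


def choose_type(node: Node, neighbours: List[Node]) -> str:
    """Differentiation rule: host the component under most unmet local pressure."""
    offered_nearby = Counter(n.offers for n in neighbours)
    candidates = set(node.seen_needs) | {node.offers}
    # Implicit inhibition: each neighbour already offering a component
    # reduces the pressure to offer it too.
    pressure = {svc: node.seen_needs[svc] - offered_nearby[svc]
                for svc in candidates}
    best = max(sorted(candidates), key=pressure.__getitem__)
    # Switch only when another component is under strictly more pressure.
    return best if pressure[best] > pressure[node.offers] else node.offers
```

For example, a node hosting `x` that has seen more demand for `z` will switch to `z`, unless enough neighbours already offer `z` to cancel that pressure.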

Embryo: simulation of an autonomic service management framework

• Each node acts as an “access point” for one application type (indicated by shape)

• Each node hosts and makes available one service component type (indicated by colour & number)

• The three strips show which (of 3) application types require which service components

Embryo: some simulation results

[Figure: Time until stable (N = 9). x-axis: Population size (0 to 416); y-axis: Time until stable (0 to 16000).]

• The time to convergence of the system to stable state grows only logarithmically with population size (i.e. number of nodes or cells)

• This suggests good scalability properties

“Rewiring” ~ Cell migration/adhesion

[Figure: Successful handshakes (average per node, N = 9). x-axis: Population size (0 to 288); y-axis: Handshakes (0 to 18).]

• The number of “corrective actions” (rewiring events) a node must make before the system reaches steady state grows slowly with population size

• Note that when there are more service types scalability is better still (downward trend - not shown)

Changing type ~ Differentiation

[Figure: Metamorphoses (average per node, N = 9). x-axis: Population size (0 to 416); y-axis: Metamorphoses (0 to 6).]

• Bigger population sizes (i.e. more nodes/cells) require fewer differentiation events per node

• This suggests good scalability properties

Some Design Heuristics (1)

• Local rules - wherever possible use local rules and decision making to achieve overall behaviour

• Interactions - by combining local decision making with carefully crafted interactions between neighbourhood nodes/entities the desired global behaviour can often be achieved

• Positive and negative feedback - biological systems make extensive use of feedback to control processes and achieve robust design of structure and behaviour

– Flyphones uses explicit inhibitory signalling

– Embryo exhibits implicit inhibition i.e. if a neighbour offers a service then I don’t need to offer it

Some Design Heuristics (2)

• Decentralised solutions - often a given problem is in essence a decentralised problem (c.f. Flyphones) - in this case a decentralised solution may be well matched. In addition, nodes in a decentralised system often "bring their own resources" which can help provide a scalable solution

• Engineered-in behaviour versus explicit external control - where possible it is preferable to embody some management within the system itself; policy-based management is still appropriate and possible via tuning parameters and via the system's in-built adaptability

From Heuristics to Design Patterns?

• Babaoglu has proposed using the following “Design Patterns” (abstractions) taken from Biology…

• Plain Diffusion and Reaction-Diffusion

• Chemotaxis - movement over gradients

• Replication - an epidemics metaphor

• Stigmergy - distributed control via environment

• …these are cast in a common framework describing topologies, and overlaid “communication strategies”.

Ref: Design Patterns from Biology for Distributed Computing. Ozalp Babaoglu, Geoffrey Canright, Andreas Deutsch, Gianni Di Caro, Frederick Ducatelle, Luca Gambardella, Niloy Ganguly, Márk Jelasity, Roberto Montemanni, and Alberto Montresor.

Design principles

• The desire of autonomic computing is for ICT systems that are capable of “looking after themselves” without external intervention

• Self-managing distributed systems are the only alternative to complication-induced paralysis

• Engineering individual components capable of self-organising reliably into a whole that is “greater than the sum of its parts” is the only way to design such systems

– multi-scale adaptability that takes advantage of emergent properties is key

• We can use nature-inspired “architectures & principles” or more literal instances of “algorithms found in nature”

– both are helpful and valid approaches

Additional benefits of the approach

• “High level” policies can provide control, without having to explicitly plan or manage the details

• Control can be delegated where appropriate to “engineered-in” self* properties

• Systems can adapt to changes, such as changes in demand

• Robustness to failure of components

• Improved scalability

• The approach often brings many of the above benefits as well as comparable performance to existing solutions

Self-organisation

• By definition, a system composed of units making autonomous decisions based on locally available information can only be “driven” to a desirable state via self-organisation.

• This requires engineering the reasoning and decision-making engine running on individual units so as to promote the emergence of the “right” collective behaviour.

• In turn, this means adapting the predictive techniques of natural complexity science (both analytical and numerical) to meet the needs of artificial complex systems designers.

• This may seem relatively trivial in principle, but experimental validation requires prototype implementation.

“Complex Systems” approaches

• The tools of Complex Systems research are likely to prove increasingly important:

– this can be in terms of “engineering-in” the right properties… (E.g. this approach helped with the design of the SWAN system.)

– or in terms of helping understand macroscopic behaviour.

• …which in a way is the same thing: understanding the macroscopic properties resulting from a variety of possible rules/parameter combinations is key to selecting a set that will promote emergence of the desired system behaviour.

Summary… and challenges!

• The move to huge-scale, dynamic, heterogeneous networks and systems requires new approaches to complement the existing ICT engineering toolkit

– the world has never been more ready for NIS thinking...

• Bio-inspired solutions coupled with tools from complexity science provide a rich source of such novel approaches

– we already have many examples of “point solutions”

• The challenge is to move beyond bio-inspired design heuristics to bio-inspired design patterns and principles

– Babaoglu’s framework is an interesting early example of this

– but how do we plan to get NIS solutions into “mainstream” computer science and Software Engineers’ thinking & day-to-day toolkits?

Thank you

Acknowledgements:

• Fabrice Saffre

• Richard Tateson

