Autonomic Computing and Networking
Pieter Simoens, Steven Latré, Filip De Turck, Bart Dhoedt
Future Internet Department
17/05/2011, Gent
Outline
• Research Context
• Thin/Smart client computing
• Autonomic Communications
• Introduction to Demos
Why autonomic systems?
Autonomic systems: managing complex things is difficult
Autonomic Systems
Observation
Complexity of ICT systems is growing
Issues
- Management gets complex (high opex)
- System configuration is error-prone and sub-optimal
- Difficult to recover from unforeseen situations
Autonomic Systems
Inspiration: The Human Body
- Distributed responsibilities
- Collaborating control systems
- Each system: optimised for specific task
- Under control of central system
- Learns and adapts online
- Governed by high-level goal: Stay Alive
Autonomic Systems
Autonomic systems decrease management complexity by performing low-level configurations themselves
The system adapts its behavior to changes in
- The environment
- End-user needs
- Service requirements
It is governed by high-level policies
- Representing business goals
- Defined and managed by human operators
Autonomic Computing
MAPE control loop (IBM 2001)
- Knows itself and its context
- Configures, reconfigures, heals and protects itself
- Optimizes continuously
- Can interact with outside world
- Anticipates to balance resources and needs, without involving users
“Civilization advances by extending the number of important operations which we can perform without thinking about them.” - Alfred North Whitehead
ACN @ Future Internet Dpt.
1. Autonomic Technologies
   - Automatic policy translation
   - Autonomic adaptation
   - Scalability and multi-agent management
   - Learning
   - Design and implementation of an autonomic service platform
2. Autonomic Communication
3. Autonomic Distributed Computing
4. Integrated infrastructures
5. Smart Client Computing
6. Autonomics for IoT
   - Sensor networks
   - ICT for Green
Outline
• Research Context
• Thin/Smart client computing
• Autonomic Communications
• Introduction to Demos
Introduction
• Thin client?
  • ideally limited to I/O functions (display, network)
  • CPU and storage hosted in the network
• Rationale:
  • Enhanced software life cycle management
  • Data security, privacy and integrity
  • Increased terminal lifetime
  • Data is available
optimized for wired LAN environments, non I/O intensive applications
Objectives
X-layer optimization for better performance
• wireless link optimizations
• image transmission optimizations
• optimized management (profiling, migration, reservations, ...)
[Figure labels: access network, core network, public hotspot; keywords: energy-efficient, QoE, mobile multimedia, intelligent]
MobiThin
• FP7-STREP (call 1, Challenge 1.1 “Future Internet”)
• Time frame
  • start: Jan 1st, 2008
  • end: June 30th, 2010
MobiThin system
Build a mobile thin client service in a wireless environment for heterogeneous applications
System Overview
Project Highlights - Integrated System
[Figure labels: Backbone network; Management Server; two access networks with Clients 1 to 6; Adaptive Thin Client Protocol; Thin Client Servers 1, 2 and 10; Application Image Service; Data Storage Service; User Sessions; Self Management Protocol]
SLM levels:
- Management Server SLM
- Thin Client Server SLM (physical host)
- User Session SLM (VM that runs apps)
- Channel server-side SLM
- Channel client-side SLM
- Mobile Device SLM
Highlights:
- Fully functional E2E system has been built, based on requirements analyzed at the start of the project
- Cross-layer optimizations = the core business of the project
  1) wireless X-layer mechanisms (thin client protocol - PHY-MAC)
  2) thin client protocol optimizations
     - scheduled updates
     - event buffering
  3) self-management of the service
     - VM migration supporting QoS, peak load avoidance, …
     - server consolidation for green computing
  4) SLM framework spanning the complete system developed
Possible actions per level
Actions:
- Relocate session to other server, start/stop extra server
- Redistribution of resources to certain session, compensating over-spenders by under-spenders
- Choice of channel (= image transmission protocol)
- Tuning of channel parameters: color depth, UDP/TCP, user event buffering, scheduled updates, streaming
- (Semi-)physical changes: display brightness, wireless interface sleep time
Levels:
- Management Server SLM
- Thin Client Server SLM (physical host)
- User Session SLM (VM that runs apps)
- Channel server-side SLM
- Channel client-side SLM
- Mobile Device SLM
Server Consolidation
• When the workload on the system is low, energy can be saved by shutting down redundant thin client servers.
• When the workload rises, extra thin client servers should be powered on.
Server Consolidation Algorithm
Decide how many servers are needed in the (near) future based on the system load in a previous time frame
[Figure: system load versus time t]
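A minimal sketch of such a decision rule, not the MobiThin algorithm itself. It assumes that the load of the previous time frame is summarised by a percentile, that load is counted in concurrent user sessions, and that all servers have the same capacity. The function name and all parameters are hypothetical.

```python
import math


def servers_needed(load_samples, percentile, capacity_per_server, min_servers=1):
    """Estimate how many servers to keep online in the next time frame.

    load_samples: system load measured in the previous time frame
                  (assumed unit: concurrent user sessions).
    percentile:   percentile of that load to provision for, in (0, 100].
    """
    if not load_samples:
        return min_servers
    ordered = sorted(load_samples)
    # Nearest-rank percentile: smallest sample that covers `percentile` % of the window.
    rank = max(1, math.ceil(percentile / 100 * len(ordered)))
    target_load = ordered[rank - 1]
    return max(min_servers, math.ceil(target_load / capacity_per_server))


window = [40, 55, 70, 120, 90, 60, 45, 30]  # hypothetical load samples
print(servers_needed(window, 100, capacity_per_server=10))  # provision for the peak
print(servers_needed(window, 50, capacity_per_server=10))   # provision for the median
```

Provisioning for a lower percentile keeps fewer servers online and saves energy, at the price of rejecting users when the load exceeds the estimate.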
Server Consolidation
P      CPU load   #online servers   %rejected users   %difference with simulated #online servers
6.25   80.7       13.8              0.6               -0.8
12.5   74.4       14.8              5.2               -1.8
25     67.2       18.3              3.8                5.9
50     58.9       21.3              4.8                1.7
75     47.4       23.7              4.4               -5.3
100    50         25                0                  0
[Figure: CPU load and #online servers versus P]
Max. Energy Savings: 45%
MobiThin Gains
• Successful project, rated “Excellent” by EU
• Strong partnership, good prospects for future collaborations
• Foundation laid for innovative research ideas
• Good output in publications
  • > 20 accepted publications
  • Best paper award
• Standardisation through ETSI (ISG-MTC)
  • 2 work items completed
From Thin to Smart
Thin client: run the whole application on a server
Problems
• Constant and high bandwidth needed
• Extra latency is always introduced
• Doesn't work well with some multimedia applications (e.g. augmented reality)
Smart client
• Only offload parts of the software
• Adapt the deployment to the changing context and the changing optimization goal
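To make the idea of a context-dependent deployment concrete, here is a small sketch that decides per software component whether it runs on the device or on the server. The component names, cost figures, goal names ("latency", "energy") and the radio energy constant are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    local_ms: float     # execution time on the device
    remote_ms: float    # execution time on the server
    transfer_kb: float  # kilobytes exchanged per call when offloaded
    local_mj: float     # device energy per call when run locally


def plan_deployment(components, bandwidth_kbps, rtt_ms, goal, radio_mj_per_kb=0.5):
    """Return {component name: "device" or "server"} for the current context."""
    plan = {}
    for c in components:
        # Network cost of one offloaded call: round trip plus transmission time.
        transfer_ms = rtt_ms + c.transfer_kb * 8 / bandwidth_kbps * 1000
        if goal == "latency":
            local_cost, remote_cost = c.local_ms, c.remote_ms + transfer_ms
        elif goal == "energy":
            local_cost, remote_cost = c.local_mj, c.transfer_kb * radio_mj_per_kb
        else:
            raise ValueError(f"unknown goal: {goal}")
        plan[c.name] = "server" if remote_cost < local_cost else "device"
    return plan


components = [
    Component("tracker", local_ms=30, remote_ms=5, transfer_kb=50, local_mj=20),
    Component("recognizer", local_ms=400, remote_ms=40, transfer_kb=20, local_mj=300),
]
print(plan_deployment(components, bandwidth_kbps=10000, rtt_ms=20, goal="latency"))
print(plan_deployment(components, bandwidth_kbps=500, rtt_ms=100, goal="latency"))
```

Re-running the planner whenever the measured bandwidth, latency or goal changes gives the adaptive behaviour described above: on the good network only the heavy component is offloaded, on the poor network everything stays on the device.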
Outline
• Research Context
• Thin/Smart client computing
• Autonomic Communications
• Introduction to Demos
The goal of autonomic communications
Optimize the Quality of Experience, maximize the revenue … and do it fast!
From high-level goals to low-level device configurations:
    Router> enable
    Router# configure terminal
    Router(config)# interface ethernet 1/1
    Router(config-if)# ethernet
    Router(config-line)# exit
    Router(config)# end
    Router#
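The step from a high-level goal to per-device settings could be pictured as a rule lookup followed by expansion over devices. The sketch below is only an illustration: the policy vocabulary, the rule table and the "set ..." lines are invented and do not correspond to any real router CLI.

```python
# Hypothetical rule table: (service, goal) -> abstract settings.
POLICY_RULES = {
    ("video", "high_qoe"): {"queue": "priority", "min_bandwidth_mbps": 8},
    ("web", "best_effort"): {"queue": "default", "min_bandwidth_mbps": 0},
}


def translate(policy, devices):
    """Expand one high-level policy into abstract configuration lines per device."""
    settings = POLICY_RULES[(policy["service"], policy["goal"])]
    configs = {}
    for device in devices:
        lines = [f"device {device['name']}"]
        for interface in device["interfaces"]:
            lines.append(f"  interface {interface}")
            for key, value in sorted(settings.items()):
                lines.append(f"    set {key} {value}")
        configs[device["name"]] = lines
    return configs


policy = {"service": "video", "goal": "high_qoe"}
devices = [{"name": "r1", "interfaces": ["eth0", "eth1"]}]
for line in translate(policy, devices)["r1"]:
    print(line)
```

The operator only edits the policy; the device-specific lines are regenerated automatically whenever the policy or the set of devices changes.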
Computing vs. Communications
Autonomic Computing
• Presented by IBM in 2001
• Homogeneous components
• 1 computing environment
• MAPE control loop
  • Monitor
  • Analyze
  • Plan
  • Execute
Autonomic Communications
• Extension to IBM’s model
• Heterogeneous devices
• Networked system
• More complex control loops
  • Model-based translation
  • Semantically enriched
  • Reasoning & learning
  • Policy-based management
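The four MAPE phases (Monitor, Analyze, Plan, Execute) can be sketched as a small control loop. This is a minimal illustration under stated assumptions: the managed system (a fixed demand spread over a variable number of servers), the target load and the action names are hypothetical.

```python
class MapeLoop:
    """Minimal Monitor-Analyze-Plan-Execute loop around one managed resource."""

    def __init__(self, sensor, actuator, target, tolerance):
        self.sensor = sensor        # callable returning the current measurement
        self.actuator = actuator    # callable applying an action
        self.target = target
        self.tolerance = tolerance
        self.history = []           # knowledge shared between the phases

    def monitor(self):
        value = self.sensor()
        self.history.append(value)
        return value

    def analyze(self, value):
        # Report a deviation only when it exceeds the tolerated band.
        deviation = value - self.target
        return deviation if abs(deviation) > self.tolerance else 0.0

    def plan(self, deviation):
        if deviation > 0:
            return "scale_up"
        if deviation < 0:
            return "scale_down"
        return None

    def execute(self, action):
        if action is not None:
            self.actuator(action)

    def run_once(self):
        action = self.plan(self.analyze(self.monitor()))
        self.execute(action)
        return action


# Hypothetical managed system: fixed demand spread over a variable number of servers.
system = {"servers": 2, "demand": 300.0}


def load_per_server():
    return system["demand"] / system["servers"]


def apply(action):
    system["servers"] += 1 if action == "scale_up" else -1


loop = MapeLoop(load_per_server, apply, target=100.0, tolerance=20.0)
actions = [loop.run_once() for _ in range(3)]
print(actions, system["servers"])
```

The more complex control loops of autonomic communications would replace the fixed rules in `analyze` and `plan` with model-based, learned or policy-driven logic.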
Complexity
Manage complexity of an Operations Support System
Real-time dynamic management
Per service or per subscriber management
Will we ever be able to tackle such complexity?
Parallel with robotics
Millions of interactions
Trying to “mimic” human behavior
Still in early stages
Introducing intelligence into the network
[Figure: management domain boundaries between Home Network (Customer), Access Network (Internet Service Provider), Core Internet (Network Operators), Access Network (Internet Service Provider) and Datacenter (Application Service Provider)]
HOW?
• Scalable
• Privacy
• Trustworthy
• Intelligent
• Human-governed
• Secure
A federation of autonomic elements (AE)
[Figure: autonomic elements (AE) placed in each management domain: Home Network (Customer), Access Network (Internet Service Provider), Core Internet (Network Operators), Access Network (Internet Service Provider), Datacenter (Application Service Provider)]
Interactions between AEs:
• distributed reasoning
• service discovery
• contract negotiation
• context exchange
Research focus
Design and implementation of architectural components for federated management of future networks and services
[Figure: Autonomic Element, with components:]
• Semantic Enterprise Service Bus
• Federation Orchestrator
• Contract Negotiator
• State Comparator
• Planning Agent
• Policy Framework
• Interaction Endpoint
• Context Manager
• Optimization Algorithm
• Monitoring Probe
• Resource Interface
• Information & Data Model
• ...
• Resource Endpoint
• Operator Interface
• Autonomic Interaction Interface
Characteristics:
• loosely coupled management components
• semantic communication and collaboration
• policy driven
• end-to-end federation of management domains
Research directions
[Figure: federation of autonomic elements (AE) across the management domains, together with the internal components of an Autonomic Element]
• semantic inter-domain contract negotiation
• autonomic cloud management
• control loop design
• automated policy translation
Automatic policy translation
FP7 ECODE
Introducing autonomic behaviour in today’s routers
• FP7 STREP (Call 1.6 “New paradigms and experimental facilities”)
• Time frame
  • Start: September 2008
  • End: December 2011
FP7 ECODE
Experimental COgnitive Distributed Engine
Cognitive engine on top of an existing router
Integration of learning capability into self-adaptive closed-loop control process
Communication systems autonomously interrelated and controlled, dynamically adapting to changing environments
Role of learning
• How to diagnose their own state, own activity/behavior, and environment over time (and thus detect, identify and analyze problems)
• How (cost-effectively) and when (timely) to adapt decisions and to tune reaction/execution (and thus be able to increase their functionality and performance)
• When to operate autonomously and when to cooperate
Augment the control paradigm of a pre-defined decision-making process and pre-determined execution with a learning component
[Figure: evolution of the router architecture]
• Today: router with Routing and Forwarding
• Step 1 (overlay): Learning weakly coupled to Routing and Forwarding
• Step 2 (integrated): Routing + Learning and Forward + Learning, strongly coupled
ECODE machine learning in practice
Different TCP stacks cause different levels of fairness
[Figure: fairness between competing TCP stacks (Cubic, Reno, Highspeed, Vegas)]
• Different TCP stacks have different responsiveness
• Variations due to
  • Different TCP dialects
  • Defective stacks: ignore congestion warnings
• Profile Based Accountability: holding subscribers (i.e. stacks) accountable for their behaviour
[Figure: responsiveness versus aggressiveness, with a "good zone"; stacks in the good zone are rewarded, stacks in the bad zone are punished]
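A toy version of the good zone / bad zone decision is sketched below. The slides do not define how aggressiveness and responsiveness are measured, so the definitions here (mean sending rate relative to a fair share, and the fraction of congestion signals followed by a rate decrease) and both thresholds are assumptions made for illustration only.

```python
def profile(rates, congestion_flags, fair_share):
    """Summarise one stack's observed behaviour as (aggressiveness, responsiveness)."""
    aggressiveness = sum(rates) / len(rates) / fair_share
    signals = 0
    reacted = 0
    for i in range(len(rates) - 1):
        if congestion_flags[i]:
            signals += 1
            if rates[i + 1] < rates[i]:  # the stack backed off after the warning
                reacted += 1
    responsiveness = reacted / signals if signals else 1.0
    return aggressiveness, responsiveness


def decide(aggressiveness, responsiveness, max_aggressiveness=1.0, min_responsiveness=0.5):
    """Reward stacks inside the (assumed rectangular) good zone, punish the others."""
    in_good_zone = (aggressiveness <= max_aggressiveness
                    and responsiveness >= min_responsiveness)
    return "reward" if in_good_zone else "punish"


flags = [False, True, False, True, False, False]
well_behaved = profile([10, 12, 6, 8, 4, 6], flags, fair_share=10)
defective = profile([10, 12, 14, 16, 18, 20], flags, fair_share=10)
print(decide(*well_behaved), decide(*defective))
```

In the ECODE setting the zone boundaries would be learned from traffic rather than fixed by hand as they are here.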
Outline
• Research Context
• Thin/Smart client computing
• Autonomic Communications
• Introduction to Demos
Demo 1 – hybrid remote display
Motivation: graphical content diversity
Office application (text editor, spreadsheet, e-mail client)
• large areas of solid color
• few colors
• updates cover small part of screen
• low update frequency
• Encode through remote display protocol (VNC)
Multimedia application (video streaming, 3D game)
• no homogeneous areas
• fine-grained complex color patterns
• updates cover whole screen
• high update frequency
• Encode through video codec (H.264)
Dynamically switching between protocols
Decision on output encoding format based on amount of motion between subsequent frames
• inefficient transport of multimedia data via a thin client protocol
  • high bandwidth
  • unresponsive user interface
• video codecs are designed for transport of video
  • minimal bandwidth requirements for a given amount of motion
  • higher client CPU load due to decoding
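The switching decision could look roughly like the sketch below. It assumes that "amount of motion" is approximated by the fraction of pixels that differ between two subsequent frames and that two thresholds (hysteresis) are used to avoid flapping between encoders. Both the metric and the threshold values are assumptions, not the demo's actual implementation.

```python
def motion_fraction(previous, current):
    """Fraction of pixels that changed between two equally sized frames."""
    changed = sum(1 for a, b in zip(previous, current) if a != b)
    return changed / len(current)


class EncoderSelector:
    """Choose between a remote display protocol and a video codec per frame."""

    def __init__(self, to_video=0.3, to_vnc=0.1):
        self.to_video = to_video  # switch to H.264 at or above this motion level
        self.to_vnc = to_vnc      # switch back to VNC at or below this motion level
        self.mode = "VNC"

    def update(self, previous, current):
        motion = motion_fraction(previous, current)
        if self.mode == "VNC" and motion >= self.to_video:
            self.mode = "H.264"
        elif self.mode == "H.264" and motion <= self.to_vnc:
            self.mode = "VNC"
        return self.mode


# Frames are flat lists of pixel values (hypothetical 10-pixel frames).
frames = [[0] * 10, [0] * 9 + [1], [1] * 10, [1] * 8 + [2, 2], [1] * 8 + [2, 2]]
selector = EncoderSelector()
modes = [selector.update(a, b) for a, b in zip(frames, frames[1:])]
print(modes)
```

The gap between the two thresholds keeps the encoder stable when the motion level hovers around a single cut-off value.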
Demo set-up
Demo 2 – SLRG inferencing
• Identification of Shared Link Resource Groups (SLRGs)
• Goal: improve recovery time of link failures by learning
• OSPF area
• One node is enabled with SLRG inference
• Learns
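The slides do not describe how the groups are inferred. One simple way to picture the learning step, purely as an assumption, is to count which links fail together and to merge links whose joint failures were seen often enough. The link names and the `min_support` threshold below are hypothetical.

```python
from collections import Counter
from itertools import combinations


def infer_shared_groups(failure_events, min_support=2):
    """Group links that were observed failing together at least `min_support` times.

    failure_events: list of sets of link ids that failed (nearly) simultaneously.
    """
    pair_counts = Counter()
    for event in failure_events:
        for pair in combinations(sorted(event), 2):
            pair_counts[pair] += 1

    parent = {}

    def find(link):
        # Union-find lookup with path halving.
        parent.setdefault(link, link)
        while parent[link] != link:
            parent[link] = parent[parent[link]]
            link = parent[link]
        return link

    for (a, b), count in pair_counts.items():
        if count >= min_support:
            parent[find(a)] = find(b)

    groups = {}
    for link in list(parent):
        groups.setdefault(find(link), set()).add(link)
    return sorted(sorted(group) for group in groups.values())


events = [
    {"n1-n2", "n1-n3"},
    {"n1-n2", "n1-n3"},
    {"n4-n5"},
    {"n1-n2", "n4-n5"},
]
print(infer_shared_groups(events))
```

Once a group is known, a router could treat the failure of one member as a hint that the other members are down too and reroute around all of them at once, which is where the gain in recovery time would come from.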
Demonstration – iLab.t setup
• Using three nodes
[Figure: iLab.t set-up with hosts ctl, vhost-0 and vid, and nodes n1 to n10 in an OSPF area; labels: video output, demo control, video streaming]
Demonstration – video screen
• Showing three video streams
• What to look for?
  • Video interruptions;
  • standard OSPF (left side) and SRG inference enabled OSPF (right side).
• For learned SRGs
  • compare left and right parts of a stream;
  • compare streams;
  • compare local and remote link failures.
Demonstration – status screen