Thomas Dreibholz
Institute for Experimental Mathematics
University of Duisburg-Essen, Germany
Reliable Server Pooling –
A Novel IETF Architecture for Availability-Sensitive Services
Thomas Dreibholz Reliable Server Pooling – A Novel IETF Architecture for Availability-Sensitive Services P. 2
Table of Contents
What is Reliable Server Pooling?
  Prototype Demonstration
  Terminology and Protocols
  Motivation and Application Scenarios
Failure Detection
  Dynamic Pools
  “Unclean” Shutdowns
  Session Monitoring
Failover Mechanism
  Applying Client-Based State Sharing
Conclusion and Outlook
Thomas Dreibholz's Reliable Server Pooling Page: http://tdrwww.iem.uni-due.de/dreibholz/rserpool/
What is “Reliable Server Pooling”?
Prototype Demonstration
Reliable Server Pooling (RSerPool)
Terminology:
  Pool Element (PE): Server
  Pool: Set of PEs
  PE ID: ID of a PE in a pool
  Pool Handle: Unique pool ID
  Handlespace: Set of pools
  Pool Registrar (PR)
  Pool User (PU): Client
Support for Existing Applications:
  Proxy Pool User (PPU)
  Proxy Pool Element (PPE)
Protocols:
  ASAP (Aggregate Server Access Protocol)
  ENRP (Endpoint Handlespace Redundancy Protocol)
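The terminology above maps naturally onto a small data structure. The following Python sketch is illustrative only: the class, field, and method names (`Handlespace`, `Pool`, `PoolElement`, `register`, `deregister`, `resolve`) and the address format are assumptions made for this example, not definitions taken from ASAP or ENRP.

```python
from dataclasses import dataclass, field


@dataclass
class PoolElement:
    """A server (PE), identified by its PE ID within one pool."""
    pe_id: int
    address: str  # hypothetical transport address, e.g. "10.0.0.1:7000"


@dataclass
class Pool:
    """A set of PEs, identified by a unique pool handle."""
    handle: str
    elements: dict = field(default_factory=dict)  # PE ID -> PoolElement


class Handlespace:
    """A set of pools, as managed by a pool registrar (PR)."""

    def __init__(self):
        self.pools = {}  # pool handle -> Pool

    def register(self, handle, pe):
        # The pool is created implicitly when its first PE registers.
        pool = self.pools.setdefault(handle, Pool(handle))
        pool.elements[pe.pe_id] = pe

    def deregister(self, handle, pe_id):
        pool = self.pools.get(handle)
        if pool is None:
            return
        pool.elements.pop(pe_id, None)
        if not pool.elements:
            # Assumption of this sketch: an empty pool is dropped.
            del self.pools[handle]

    def resolve(self, handle):
        """Return the PEs of a pool, as a PU would request from a PR."""
        pool = self.pools.get(handle)
        return list(pool.elements.values()) if pool else []
```

A PE registers under a pool handle, and a PU later resolves the same handle to obtain the list of PEs it can choose from.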
Session Failover using Client-Based State Sharing
Necessary to handle failover:
  A new PE must be able to recover the session state of the old PE
Simple solution for many applications:
  Usage of “state cookies” [LCN2002]
  Now part of the ASAP protocol!
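The state-cookie idea can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions: the classes, the JSON encoding, and the single `processed` counter are hypothetical and do not reflect the ASAP message format. The PE serializes its session state, the PU keeps only the latest cookie, and a new PE restores from it after a failure.

```python
import json


class PoolElement:
    """A server that can export and import session state as a cookie."""

    def __init__(self, name):
        self.name = name
        self.processed = 0  # session state: work units completed so far

    def work(self, units):
        self.processed += units

    def make_cookie(self):
        # Serialize the session state; the PU treats it as opaque bytes.
        return json.dumps({"processed": self.processed}).encode()

    def restore(self, cookie):
        # A new PE resumes from the state stored in the cookie.
        self.processed = json.loads(cookie.decode())["processed"]


class PoolUser:
    """A client that stores only the most recent cookie of its session."""

    def __init__(self):
        self.cookie = None

    def on_cookie(self, cookie):
        self.cookie = cookie

    def failover(self, new_pe):
        # Hand the last cookie to the newly selected PE, if there is one.
        if self.cookie is not None:
            new_pe.restore(self.cookie)
        return new_pe


pu = PoolUser()
old_pe = PoolElement("PE-1")
old_pe.work(40)
pu.on_cookie(old_pe.make_cookie())  # state after 40 units is saved at the PU
old_pe.work(15)                     # these 15 units are lost if PE-1 fails now

new_pe = pu.failover(PoolElement("PE-2"))
print(new_pe.processed)  # 40
```

The 15 units processed after the last cookie have to be repeated by the new PE, which is the re-processing effort that the cookie interval trades off against overhead.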
Server Selection Rules (Pool Policies)
What is a Pool Policy?
  A rule for the selection of PEs
  Defined in our IETF Working Group draft (draft-ietf-rserpool-policies-07.txt)
Application of Policies
  Registrar: Creates PE list upon request by PU
  Pool User: Selection of a PE from the list
  Both according to the pool policies (pool-specific!)
Non-Adaptive Policies
  Stateless: Random (RAND)
  Stateful: Round Robin (RR) (default policy, must be supported)
Adaptive Policy
  Least Used (LU)
    Load definition is application-specific!
    Round robin among multiple least-loaded PEs
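The three policies can be sketched as follows. This is a simplified Python illustration, not the selection procedure specified in the policies draft: the function and class names are assumptions, PEs are plain strings, and the load values are passed in as a dictionary because the load definition is application-specific.

```python
import random


def select_random(pes):
    """RAND: stateless, uniformly random choice."""
    return random.choice(pes)


class RoundRobin:
    """RR: stateful, cycles through the PE list in order."""

    def __init__(self):
        self.position = 0

    def select(self, pes):
        pe = pes[self.position % len(pes)]
        self.position += 1
        return pe


class LeastUsed:
    """LU: adaptive, picks the least-loaded PE; ties are served round robin."""

    def __init__(self):
        self.rr = RoundRobin()

    def select(self, pes, load):
        lowest = min(load[pe] for pe in pes)
        candidates = [pe for pe in pes if load[pe] == lowest]
        return self.rr.select(candidates)


pes = ["PE-1", "PE-2", "PE-3"]
load = {"PE-1": 0.5, "PE-2": 0.2, "PE-3": 0.2}  # hypothetical load values

rr = RoundRobin()
print([rr.select(pes) for _ in range(4)])  # ['PE-1', 'PE-2', 'PE-3', 'PE-1']

lu = LeastUsed()
print(lu.select(pes, load), lu.select(pes, load))  # PE-2 PE-3
```

Round Robin needs a stable PE list to produce stable rounds; if PEs join and leave frequently, the position counter points at a different PE each time and the selection degenerates towards random behaviour.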
The Application Model
Server
  PE capacity
  Shared among sessions (multi-tasking principle)
Client
  Requests are generated
    Request size (effort)
    Request interval (frequency)
  Waiting queue for requests
  Sequential processing
System Utilization
  PU:PE ratio
  Provisioning for a certain target utilization, e.g. 80%
$$\text{systemUtilization} = \text{puToPERatio} \cdot \frac{\text{RequestSize}}{\text{RequestInterval} \cdot \text{AvgCapacity}}$$
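The provisioning rule can be checked with a short calculation. The Python sketch below assumes that the request size is measured in abstract work units, the request interval in seconds, and the average capacity in work units per second; these units, the function names, and the sample values are assumptions of the example.

```python
def system_utilization(pu_to_pe_ratio, request_size, request_interval, avg_capacity):
    """Fraction of the pool's capacity that the offered load consumes."""
    return pu_to_pe_ratio * request_size / (request_interval * avg_capacity)


def pu_to_pe_ratio_for(target_utilization, request_size, request_interval, avg_capacity):
    """PU:PE ratio that yields the given target utilization."""
    return target_utilization * request_interval * avg_capacity / request_size


# Hypothetical workload: each PU sends a request of 1e6 work units every
# 10 seconds, and each PE processes 1e6 work units per second on average.
ratio = pu_to_pe_ratio_for(0.80, request_size=1e6, request_interval=10.0, avg_capacity=1e6)
print(f"PUs per PE for 80% target utilization: {ratio:.1f}")
print(f"Resulting utilization: {system_utilization(ratio, 1e6, 10.0, 1e6):.0%}")
```

In this example a single PU keeps a PE busy for one tenth of the time, so eight PUs per PE reach the 80% target.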
Performance Metrics
Provider's Perspective: “Does my server capacity gain revenue?”
  Average utilization of server resources [%]
User's Perspective: “How much time is needed to process my requests?”
  Average handling speed [% of average server capacity]
Depends on: Queuing Startup Server Failover
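As a rough illustration of the user's metric, the Python sketch below assumes that handling speed is the request size divided by the total handling time, normalized by the average server capacity. The split of the handling time into queuing, startup, and processing (including any failover delay) is an assumption of this example, as are the function name and the units; the slide itself only names the metric and its unit.

```python
def handling_speed_percent(request_size, queuing_time, startup_time,
                           processing_time, avg_capacity):
    """Handling speed as a percentage of the average server capacity."""
    handling_time = queuing_time + startup_time + processing_time
    return 100.0 * (request_size / handling_time) / avg_capacity


# Hypothetical request: 1e6 work units on a PE with 1e6 work units/s,
# delayed by 0.5 s of queuing and 0.5 s of startup before 1 s of processing.
speed = handling_speed_percent(1e6, queuing_time=0.5, startup_time=0.5,
                               processing_time=1.0, avg_capacity=1e6)
print(f"{speed:.0f}% of average server capacity")
```

Any extra delay, whether from queuing, startup, or a failover, lengthens the handling time and therefore lowers the handling speed seen by the user.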
Dynamic Pools – A Proof of Concept
Ideal case: a “clean” shutdown
  PEs abort their sessions before shutting down
  Not critical ... except for extremely low MTBF
Round Robin: no stable rounds -> random behaviour
[Figure: Handling Speed]
“Unclean” Shutdowns
Re-processing effort increases (due to lost work)
Session monitoring is crucial: fast failure detection -> quick failover
[Figures: Handling Speed; Utilization]
Session Monitoring
Session monitoring is crucial
Various possible mechanisms:
  Keep-alives
  Part of the application protocol, e.g. transaction timeouts
Endpoint Keep-Alive Monitoring
  Here: small impact
  When is it useful?
    Short and frequent requests
    Minimizes startup time (see paper for details)
[Figure: Handling Speed]
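Timeout-based failure detection, as used by keep-alive monitoring, can be sketched as follows. The Python example is a minimal illustration: the class name, the explicit `now` timestamps, and the timeout value are assumptions, and no actual ASAP messages are modelled.

```python
class KeepAliveMonitor:
    """Declares an endpoint failed when no keep-alive reply arrives in time."""

    def __init__(self, timeout):
        self.timeout = timeout  # seconds without a reply before failure
        self.last_seen = {}     # endpoint -> time of last reply

    def on_reply(self, endpoint, now):
        self.last_seen[endpoint] = now

    def failed(self, now):
        """Return the endpoints whose last reply is older than the timeout."""
        return [ep for ep, seen in self.last_seen.items()
                if now - seen > self.timeout]


monitor = KeepAliveMonitor(timeout=2.0)
monitor.on_reply("PE-1", now=0.0)
monitor.on_reply("PE-2", now=0.0)
monitor.on_reply("PE-1", now=1.5)  # PE-1 keeps answering, PE-2 goes silent
print(monitor.failed(now=3.0))     # ['PE-2']
```

A shorter timeout detects failures faster and so triggers the failover earlier, at the price of more keep-alive traffic.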
Using Client-Based State Sharing
More cookies -> less re-processing, better handling speed
But what about overhead?
[Figures: Handling Speed; Utilization]
Configuring a Useful Cookie Interval
Cookie size: a few bytes up to ~64K (limit)
Idea: For known MTBF (in request times):
set cookie interval to achieve a certain goodput (e.g. 98%)
Choice of goodput depending on application's requirements
=> Accepting a certain amount of re-processing work
Results: For realistic MTBF:
high goodput already at moderate cookie rate
overhead significantly rises for too-high goodput -> inefficient!
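The trade-off between goodput and cookie rate can be made concrete with a back-of-the-envelope model. The Python sketch below is a simplification introduced for this example and not the model used in the paper: it assumes evenly spaced cookies, failures that hit at a uniformly random point between two cookies (so half a cookie interval of work is lost on average), and an MTBF expressed in request times. All function and parameter names are hypothetical.

```python
def goodput(cookies_per_request, mtbf):
    """Share of processing that is useful work (not re-processing)."""
    lost_per_failure = 1.0 / (2.0 * cookies_per_request)  # in request times
    return 1.0 / (1.0 + lost_per_failure / mtbf)


def cookies_per_request_for(target_goodput, mtbf):
    """Cookie rate needed to reach the target goodput (inverse of goodput())."""
    return target_goodput / (2.0 * mtbf * (1.0 - target_goodput))


# Hypothetical MTBF of 5 request times.
for target in (0.90, 0.98, 0.999):
    rate = cookies_per_request_for(target, mtbf=5.0)
    print(f"goodput {target:.1%}: {rate:.1f} cookies per request")
```

Under these assumptions the required cookie rate grows with 1 / (1 - goodput), so pushing the goodput target close to 100% multiplies the number of cookies, which matches the observation that overly ambitious goodput targets become inefficient.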
[Figure: Cookies per Request]
Conclusion and Outlook
Conclusion
  RSerPool is the IETF's upcoming standard for service availability
  3 basic server selection policies
  Failure detection mechanisms:
    Session monitoring
    Endpoint keep-alives
  Failover mechanism: Client-based state sharing
Future Work
  From simulation to reality:
    Tests with our prototype implementation in the PlanetLab
    First results already available [KiVS2007]
  Security analysis and robustness against DoS attacks
Thank You for Your Attention!
Any Questions?
Visit Our Project Homepage: http://tdrwww.iem.uni-due.de/dreibholz/rserpool/
Thomas Dreibholz, [email protected]
To be continued ...
The RSerPool Protocol Stack
Aggregate Server Access Protocol (ASAP)
  PR <-> PE: Registration, Deregistration and Monitoring by Home-PR (PR-H)
  PR <-> PU: Server Selection, Failure Reports
Endpoint Handlespace Redundancy Protocol (ENRP)
  PR <-> PR: Handlespace Synchronisation
ASAP is the IETF's first Session Layer standard!
Motivation
Motivation of RSerPool:
  Unified, application-independent solution for service availability
  Not available before => Foundation of the IETF RSerPool Working Group
Application Scenarios for RSerPool:
  Main motivation: Telephone Signalling (SS7) over IP
  Under discussion by the IETF:
    Load Balancing
    Voice over IP (VoIP) with SIP
    IP Flow Information Export (IPFIX)
    ... and many more!
Requirements for RSerPool:
  “Lightweight” (low resource requirements, e.g. embedded devices!)
  Real-Time (quick failover)
  Scalability (e.g. to large (corporate) networks)
  Extensibility (e.g. by new server selection rules)
  Simple (automatic configuration: “just turn on, and it works!”)