Application-Oriented Networking
through
Virtualization and Service Composition
by
Hadi Bannazadeh
A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Electrical and Computer Engineering Department
University of Toronto
Copyright © 2010 by Hadi Bannazadeh
Abstract
Application-Oriented Networking
through
Virtualization and Service Composition
Hadi Bannazadeh
Doctor of Philosophy
Electrical and Computer Engineering Department
University of Toronto
2010
Future networks will face major challenges in accommodating emerging and future
networked applications. These challenges include significant architectural and management
issues pertaining to future networks. In this thesis, we study several of these challenges,
including configurability, application-awareness, rapid application-creation and
deployment, and scalable QoS management. To address these challenges, we
propose a novel Application-Oriented Network (AON) architecture as a converged com-
puting and communication network in which application providers are able to flexibly
configure in-network resources on-demand. The resources in AON are virtualized and
offered to the application providers through service-oriented approaches.
To enable large-scale experimentation with future network architectures and applica-
tions, in the second part of this thesis, we present the Virtualized Application Networking
Infrastructure (VANI) as a prototype of an Application-Oriented Network. VANI utilizes
a service-oriented control and management plane that provides flexible and dynamic
allocation, release, programming and configuration of resources used for creating
applications or performing network research experiments from layer three and up. Moreover, VANI
resources allow development of network architectures that require a converged network
of computing and communications resources such as in-network processing, storage and
software- and hardware-based reprogrammable resources. We also present a Distributed
Ethernet Traffic Shaping (DETS) system, used for bandwidth virtualization in VANI and
designed to guarantee send and receive Ethernet traffic rates in a computing
cluster or a datacenter.
The third part of this thesis addresses the problem of scalable QoS and admission
control in service-oriented environments where a limited number of instances of service
components are shared among different application classes. We first use Markov Deci-
sion Processes to find optimal solutions to this problem. Next we present a scalable and
distributed heuristic algorithm able to guarantee the probability of successful completion
of a composite application. The proposed algorithm does not assume a specific distribution
type for service execution times and application request inter-arrival times, and hence
is suitable for systems with stationary or non-stationary request arrivals. We use
simulations and experimental measurements to show the effectiveness of the proposed
solutions and algorithms in various parts of this thesis.
to the memory of my father
Acknowledgements
The completion of this thesis would not have been possible without the support of
many people. First and foremost, I owe my deepest gratitude to my supervisor, Professor
Alberto Leon-Garcia, for his guidance and generous support throughout my research. I
would like to thank him for the insightful discussions and ideas that shaped my research
and led me to the completion of this thesis. Professor Leon-Garcia is not only a great
supervisor but also an admirable person whom I will always regard as a role model.
I would like to thank the honorable members of my committee: Professors Ben Liang,
Paul Chow, Baochun Li, Gordon Agnew and Ashish Khisti for their evaluation of my
thesis and their invaluable comments and feedback.
I would also like to thank the university staff members, especially Ms. Linda Espeut,
Mr. Vladimirio Cirillo and Ms. Darlene Gorzo for their generous help and administrative
support.
During the years at UofT, I received support, feedback and encouragement from my
dear friends and teammates at the Network Architecture Lab, especially from Alireza
Bigdeli, Armin Ghayoori, Keith Redmond, Ali Tizghadam, Ramy Farha, Ivan Hernandez,
Agop Koulakezian and Houman Rastegarfar. I would like to express my gratitude and
thanks to all of them.
I also had the privilege to work with many students at UofT as part of their education
process. I wish to thank them all for their dedication, hard work and willingness to
experiment. They are Arbab Khan, Gordon Tam, Saleh Dani, Justin Seto, Andrew Mehes,
Michael Ens, Ian Gartley, Tom Yue, Darryl Chung, Mingliang Ma, Maxim Galash, Wenyu
Li and Anthony Das Santos.
Throughout my Ph.D. years I received family-like friendship from many friends. I
would like to thank all of them for the memorable moments: Amin Farbod, Reza Safian,
Amirali Basri, Maryam Bahrami, Mostafa Haghiri, Kamran Farzan, Mehdi Lotfinezhad,
and David Brown.
I would also like to thank my mother, brothers and sisters for their unconditional love
and support without which the completion of this thesis would not have been possible.
Meeting my wife was one of the most wonderful events during my Ph.D. years. I would
like to thank my beloved wife, Sara, for her selfless love and support. I am grateful for
her sacrifices and patience.
Contents
1 Introduction 1
1.1 Vision of A Future Network . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.1 Motivating Application Scenarios . . . . . . . . . . . . . . . . . . 4
1.2 Research Goals and Challenges . . . . . . . . . . . . . . . . . . . . . . . 7
1.3 Proposed Solutions Overview . . . . . . . . . . . . . . . . . . . . . . . . 10
1.4 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
I Application-Oriented Networking 16
2 Background and Requirement Analysis 17
2.1 New Computing Models . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 New Applications through Composition . . . . . . . . . . . . . . . . . . . 18
2.3 Emergence of Cloud Computing . . . . . . . . . . . . . . . . . . . . . . . 21
2.4 Evolution of Traditional Service Providers . . . . . . . . . . . . . . . . . 22
2.5 Introduction of Smart Phones . . . . . . . . . . . . . . . . . . . . . . . . 24
2.6 Advancements in Content Delivery Networks . . . . . . . . . . . . . . . . 25
2.7 Future Networks Architecture . . . . . . . . . . . . . . . . . . . . . . . . 27
3 Application-Oriented Networking 29
3.1 AON Application Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.2 AON Control and Management Planes . . . . . . . . . . . . . . . . . . . 39
3.3 Application-Oriented Routers . . . . . . . . . . . . . . . . . . . . . . . . 42
3.4 Application-Oriented Routers Use Cases . . . . . . . . . . . . . . . . . . 44
3.4.1 Telecom Service Providers . . . . . . . . . . . . . . . . . . . . . . 44
3.4.2 Enterprise Networks . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.4.3 Overlay Networks and Content Distribution Networks . . . . . . . 48
3.5 Related Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
II Virtualized Application Networking Infrastructure 52
4 Virtualized Application Networking Infrastructure 53
4.1 VANI Design Requirements . . . . . . . . . . . . . . . . . . . . . . . . . 56
4.1.1 VANI Architecture . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.1.2 Current Physical Resources in VANI (VANIv1 Resources) . . . . . 60
4.1.3 Example: Requesting a Resource in VANI . . . . . . . . . . . . . 63
4.2 VANI Control and Management Plane (VANI-CMP) . . . . . . . . . . . 64
4.2.1 User Management . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.2.2 Authentication Authorization Accounting . . . . . . . . . . . . . 65
4.2.3 Resource Allocation . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.2.4 Generic Resources and Registration . . . . . . . . . . . . . . . . . 66
4.3 SOA-Based Implementation of VANI-CMP . . . . . . . . . . . . . . . . . 67
4.4 Security in VANI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.5 Guaranteeing Bandwidth in VANI . . . . . . . . . . . . . . . . . . . . . . 70
4.5.1 Interconnecting VANI Nodes in IP Layer . . . . . . . . . . . . . . 71
4.5.2 Interconnecting VANI Nodes in Ethernet Layer . . . . . . . . . . 72
4.5.3 Experimentation with L3 Protocols . . . . . . . . . . . . . . . . . 73
4.6 SW-Based Resources in VANI . . . . . . . . . . . . . . . . . . . . . . . . 73
4.7 Federation with GENI . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.8 A VANI Node . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.9 Performance Evaluations . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.9.1 Reprogrammable Hardware Resource . . . . . . . . . . . . . . . . 78
4.9.2 Processing Service and Network Virtualization . . . . . . . . . . . 80
4.10 Experiments & Applications . . . . . . . . . . . . . . . . . . . . . . . . . 83
5 A Distributed Ethernet Traffic Shaping System 84
5.1 Distributed Ethernet Traffic Shaping (DETS) system . . . . . . . . . . . 89
5.1.1 DETS Protocol . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
5.1.2 DETS for Linux OS . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.2 DETS System Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
5.2.1 Rate Allocator Module . . . . . . . . . . . . . . . . . . . . . . . . 91
5.2.2 Performance Improvements . . . . . . . . . . . . . . . . . . . . . 97
5.3 Performance Evaluations . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
5.4 Modifications to Ethernet Control Plane . . . . . . . . . . . . . . . . . . 103
III QoS & Admission Control in Service-Oriented Systems 105
6 Allocating Services to Applications using Markov Decision Processes 106
6.1 Concurrent Service Executions . . . . . . . . . . . . . . . . . . . . . . . . 108
6.1.1 Problem Formulation . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.1.2 Markov Decision Process Formulation . . . . . . . . . . . . . . . . 111
6.1.3 Optimal Policy with Different Services . . . . . . . . . . . . . . . 113
6.1.4 The Optimal Policy and Performance Comparison . . . . . . . . . 115
6.2 Sequential Service Executions . . . . . . . . . . . . . . . . . . . . . . . . 118
6.2.1 Problem formulation . . . . . . . . . . . . . . . . . . . . . . . . . 118
6.2.2 Markov Decision Process formulation . . . . . . . . . . . . . . . . 123
6.2.3 Optimal policy and performance comparison . . . . . . . . . . . . 124
7 A Distributed Probabilistic Commitment-Control Algorithm 130
7.1 QoS Control in a Service-Oriented System . . . . . . . . . . . . . . . . . 133
7.2 Probabilistic Modeling of Service Commitment . . . . . . . . . . . . . . . 137
7.3 Computing Over-Commitment Probability . . . . . . . . . . . . . . . . . 142
7.4 Distributed Algorithm for Service Commitment . . . . . . . . . . . . . . 144
7.4.1 DASC Complexity Analysis . . . . . . . . . . . . . . . . . . . . . 146
7.5 DASC Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . 148
7.6 Queue-enabled Distributed Algorithm for Service Commitment . . . . . . 156
7.6.1 Problem Formulation and Description . . . . . . . . . . . . . . . . 157
7.6.2 Q-DASC Performance Evaluation . . . . . . . . . . . . . . . . . . 158
7.7 Related work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 160
8 Application Admission Control System 165
8.1 Problem Statement . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
8.2 Steady-State Based Application Admission Control System . . . . . . . . 168
8.3 Online Optimization-based Application Admission Control System . . . . 170
8.3.1 Feasibility Check . . . . . . . . . . . . . . . . . . . . . . . . . . . 172
8.3.2 Scenario Generation . . . . . . . . . . . . . . . . . . . . . . . . . 173
8.3.3 Optimal Admission Decisions For Generated Scenarios . . . . . . 174
8.3.4 Final Decision Making . . . . . . . . . . . . . . . . . . . . . . . . 176
8.4 Performance Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
9 Conclusions 181
9.1 Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 181
9.1.1 Application-Oriented Networking . . . . . . . . . . . . . . . . . . 182
9.1.2 Virtualized Application Networking Infrastructure . . . . . . . . . 183
9.1.3 Scalable and Distributed QoS and Admission Control . . . . . . . 184
9.1.4 Related Educational Contributions . . . . . . . . . . . . . . . . . 186
9.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
Appendices 191
A Queue-Enabled Service Commitment 191
A.1 Time to Enter Service in a G/G/C/N System . . . . . . . . . . . . . . . 191
A.2 TES for G/D/C/N System . . . . . . . . . . . . . . . . . . . . . . . . . 199
A.3 TES for G/M/C/N System . . . . . . . . . . . . . . . . . . . . . . . . . 199
B Computing Over-Commitment Probability using Chernoff’s Bound 201
C Derivation of Gk(t) Probability 206
D Simulation Environment Description 208
Bibliography 210
Glossary 226
List of Tables
4.1 Average maximum FPGA programming time . . . . . . . . . . . . . . . . 80
4.2 UDP and TCP traffic measurements in a VANI node in MBytes per second
(MBps) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
List of Figures
1.1 Vision of a future network . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Example of a future application: Smart Grids . . . . . . . . . . . . . . . 5
2.1 Basic Service-Oriented Architecture model (source: http://www.w3.org) . 20
3.1 Three planes in an Application-Oriented Network . . . . . . . . . . . . . 30
3.2 Application Plane Resources . . . . . . . . . . . . . . . . . . . . . . . . . 31
3.3 Multiple Applications in AON . . . . . . . . . . . . . . . . . . . . . . . . 32
3.4 Application Plane Architecture . . . . . . . . . . . . . . . . . . . . . . . 34
3.5 Application-Oriented Network Reference Model . . . . . . . . . . . . . . 40
3.6 Overall view of an Application-Oriented Network with multiple AORs and
applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.7 Telecommunication services in an AON . . . . . . . . . . . . . . . . . . . 45
3.8 Enterprise Service Bus and AON . . . . . . . . . . . . . . . . . . . . . . 47
3.9 Peer-to-Peer network in AON . . . . . . . . . . . . . . . . . . . . . . . . 49
4.1 VANI design requirements . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2 VANI architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3 Researcher interaction with VANI planes . . . . . . . . . . . . . . . . . . 59
4.4 Virtualizing physical resources in VANI . . . . . . . . . . . . . . . . . . . 61
4.5 A sample interaction between a researcher and VANI to secure a resource 63
4.6 A sample schema for generic XML content in a getRequest response message 67
4.7 Connecting VANI nodes in IP layer . . . . . . . . . . . . . . . . . . . . . 71
4.8 Connecting VANI nodes in Ethernet layer . . . . . . . . . . . . . . . . . 72
4.9 Large scale experimentation with new L3 protocols . . . . . . . . . . . . 74
4.10 Connecting VANI to GENI . . . . . . . . . . . . . . . . . . . . . . . . . . 76
4.11 Reprogrammable Hardware (BEE2 Board) . . . . . . . . . . . . . . . . . 78
4.12 Traffic measurement experiment topology . . . . . . . . . . . . . . . . . . 81
5.1 A system with five nodes and two virtual nodes on each . . . . . . . . . . 85
5.2 TCP rate back off due to interfering UDP traffic . . . . . . . . . . . . . . 86
5.3 DETS measurement and rate control points . . . . . . . . . . . . . . . . 89
5.4 DETS System Internal Modules . . . . . . . . . . . . . . . . . . . . . . . 92
5.5 DETS performance evaluations for system shown in Figure 5.1 . . . . . . 98
5.6 Performance evaluation of rate allocation algorithms a) RAA-SlowProbe
b) RAA-FastProbe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
5.7 Performance evaluation of rate allocation algorithms a) RAA-FairShare b)
RAA-ForwardExplicit . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
5.8 DETS in Ethernet control plane . . . . . . . . . . . . . . . . . . . . . . . 103
6.1 A system with m different service types and N instances of each type . . 109
6.2 A system with three types of service and two classes of applications . . . 111
6.3 A system with three types of services, two classes of applications and two
types of instances for service type 3 . . . . . . . . . . . . . . . . . . . . . 113
6.4 Optimal policy when the system is in state (n1, n2), and α = 1, β = 0.1 . 115
6.5 Optimal policy when the system is in state (n1, n2), and α = 1, β = 0.5 . 116
6.6 Performance Comparison between Complete Sharing, Complete Partition-
ing and MDP-based partitioning mechanisms . . . . . . . . . . . . . . . . 117
6.7 A system with m different service types and N instances of each type . . 119
6.8 A system with three types of service and two classes of applications . . . 122
6.9 Optimal policy when the system is in state (n11, n12, n22), and γ = 0.1: a)
n22 = 1, b) n22 = 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
6.10 Optimal policy when the system is in state (n11, n12, n22), and γ = 0.3: a)
n22 = 1, b) n22 = 4 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 126
6.11 Performance Comparison between No Commitment Policy, Full Commit-
ment Policy and MDP-based partitioning mechanisms (α = −0.1, β =
0.5, γ = 0.1) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
6.12 A sample beta distribution . . . . . . . . . . . . . . . . . . . . . . . . . . 128
6.13 Performance Comparison between No Commitment Policy, Full Commit-
ment Policy and MDP-based partitioning with a beta distribution for ser-
vice execution time and (α = −0.1, β = 0.5, γ = 0.1) . . . . . . . . . . . . 128
7.1 A sample service-oriented environment . . . . . . . . . . . . . . . . . . . 133
7.2 Composition Operations . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
7.3 A service-oriented system with three agents, each controlling one service
type . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144
7.4 Distributed Algorithm for Service Commitment in SDL (Specification and
Description Language) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
7.5 Beta pdf for service execution time with parameters α = 2.333 and β = 4.666 148
7.6 Application failure ratio for a system with two application classes and two
service types . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 149
7.7 Comparing DASC throughput with bottleneck-based admission control
algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
7.8 A service-oriented environment consisting of twelve service types and three
applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
7.9 Applications failure ratios in the system . . . . . . . . . . . . . . . . . . 152
7.10 Failure ratios in services 1 to 6 vs. applications request rates . . . . . . . 153
7.11 Comparison between four admission control mechanisms with stationary
request arrivals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 153
7.12 Comparison between four admission control mechanisms with on-off bursty
request arrivals with burst time (T) . . . . . . . . . . . . . . . . . . . . . 154
7.13 Applications queuing probability with ample number of queuing spaces
using Q-DASC algorithm . . . . . . . . . . . . . . . . . . . . . . . . . . . 159
7.14 Applications failure probability based on queue size in Q-DASC algorithm 160
8.1 A sample service-oriented environment . . . . . . . . . . . . . . . . . . . 167
8.2 Application Admission Control System using Online Optimization . . . . 171
8.3 System reward for four different techniques . . . . . . . . . . . . . . . . . 178
8.4 Application 1 and application 2 failure rates based on the applications
request rate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179
A.1 Distributions for residual service times in a service with uniform execution
time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 193
A.2 Distributions for residual service times in a service with Normal execution
time . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 194
A.3 Distributions for residual service times in a service with a Beta execution
time α = 2.333, β = 4.666 . . . . . . . . . . . . . . . . . . . . . . . . . . 195
A.4 TES distribution and calculated bound for beta distribution with α =
2.333, β = 4.666 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 198
B.1 A sample d(s) for a service with 900 instances, and random pi values for 1000
application instances . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 204
Chapter 1
Introduction
The present Internet has become an essential infrastructure in modern society in spite of
its glaring and serious shortcomings with regard to security, reliability, and performance
[1]. A constant in the thirty-year history of the Internet has been continual growth in
scale, both in the number of Internet users and in the diversity of Internet applications.
The growth in the Internet has been fueled by the steady improvement in cost and
performance of computing and communications technology.
The Internet is currently entering a new phase of more dramatic and diverse growth
which is driven by: (1) new computing models, where a new application can be created
with the same ease as designing a new web page through the linking of service building
blocks; and (2) new Internet users in the form of communicating devices such as
smart wireless phones and tablets, high definition monitors, smart sensors, alarms and
controllers.
The next generation Internet will be challenged to support a much more diverse
and a much greater number of applications as well as a new generation of communicating
devices. The goal of this thesis is to study how future networks can support and facilitate
creation, deployment and management of these emerging applications. In particular, we
study how service composition techniques in application creation and virtualization of
resources can empower future networks in this support.
In the following, we will first present our vision of future networks, followed by brief
descriptions of some example application scenarios that require a new network architec-
ture. We then outline our research goals and challenges, and briefly preview our solutions.
Finally, we describe the thesis structure and research contributions.
1.1 Vision of A Future Network
Figure 1.1 shows our vision of the emerging future networks. In this vision, the network
will be mainly comprised of an optical backbone network, core/metro/access networks
and finally the terminals, datacenters and end users. The backbone optical network
will be responsible for transferring massive volumes of data between the core network
components. The access network will provide very high bandwidth connectivity using
various wireless and optical technologies to the network users.
Terminals and users in future networks fall into various classes. One major class
will be mobile computing nodes that will have a combination of advanced features such
as high processing power, a long battery life and sophisticated user interfaces including
touch screens, speech and image recognition, etc. Moreover, users will have high mobility
that will require hand-offs between various types of access technologies.
Another class of the network end terminals will be smart sensors and/or actuators,
such as smart grid sensors, that require a very high level of responsiveness and reliability
from the network and will be deployed en masse. The sheer volume of these communicating
devices will challenge the scale and cost-points of future networks.
Another major type of network “users” will be massive datacenters that exploit in-
expensive computing and storage commodity resources and constitute factories used for
creating applications that require massive processing, storage and bandwidth at low cost.
These datacenters will be connected to the network using very high bandwidth optical
Figure 1.1: Vision of a future network
networking technologies.
In combination, these new network users at the edge of future networks, together
with the deployment of a very high bandwidth optical core network, will open the door
for the introduction of a vast universe of applications and intelligence serving different
purposes.
In this application universe, different classes of applications (e.g., sensors, human
communications, machine-to-machine, content distribution, etc.) will have different and
sometimes contradictory expectations from the network. These diverging requirements
might force the creation of separate networks, unless future networks become capable of
serving these applications on a shared infrastructure.
In this thesis our main research objective is to study future network architectures
that can provide customizable support for applications. In the next subsection we briefly
examine several motivating application scenarios to illustrate the type of challenges that
future networks will face and to motivate our research on future network architectures.
1.1.1 Motivating Application Scenarios
Future networks will support a diverse range of applications over a variety of access
technologies. In this section, we discuss four sample application scenarios in two general
categories to show the types of challenges that future networks will face and the application
requirements that need to be addressed.
Smart Infrastructures
Smart infrastructures such as smart utility grids and smart transportation systems are an
important class of future applications. In this class of applications, sensors and actuators
will be deployed in massive scale throughout the network. The sensors allow real-time
environmental information to be gathered in a variety of settings and the actuators allow
control actions to be exercised in response to environmental conditions according to
various policies and objectives.
On the other hand, inexpensive computing and storage in massive datacenters allow
introduction of applications that can receive the sensor data, process it, and generate
and forward commands to the actuators. The combination of smart sensors, actuators
and affordable large scale processing and storage will enable introduction of smart in-
frastructures that will revolutionize the way we live. One of the major requirements of
these future smart infrastructures is having a responsive network to reliably transfer the
sensor data and actuator commands between the sensors, actuators and datacenters.
Smart grids are an example of smart infrastructures that deploy sensors and actuators
in homes and housing and industrial complexes as shown in Figure 1.2. The smart
infrastructure in smart grids enables not only improved energy efficiency, but also new
business models for energy pricing and trading as well as energy consumption that is
sensitive to carbon emissions. The sensors in smart grids will generate large volumes
Figure 1.2: Example of a future application: Smart Grids
of data. The generated data needs to be securely transferred through the network to
datacenters that are able to process data at massive scale and produce commands that
are to be forwarded to the actuators residing in the homes and housing and manufacturing
complexes. The amount of data that smart grids can potentially generate, as well as
the levels of reliability and responsiveness that they need, will surpass those of the
applications supported by the current Internet.
Another type of smart infrastructure involves smart transportation systems comprised
of traffic sensors and traffic signals as well as networked cars and passengers equipped
with wireless devices, GPS and cameras. These smart transportation systems generate
a very large amount of data that needs to be securely passed through the network to
the datacenters for processing. The commands generated in these datacenters will not
only guide human-driven traffic but may also direct smart, machine-controlled moving
elements. This automated control will be essential to addressing future energy
challenges and to maximizing the use of green and renewable sources of energy.
From these two sample applications, we can see that future networks need to be highly
reliable and responsive, and able to handle a large number of wireless and mobile users
at massive scale with high security and accountability.
Content Distribution Networks
Content distribution and human-to-human communications and interactions are driving
the emergence of many new applications that will be empowered by the availability of
high bandwidth in the backbone optical networks, inexpensive computing resources and
smart mobile terminals. These applications need to handle a large number of mobile
and heterogeneous users. The heterogeneity and mobility are driven by the universal
acceptance of smart wireless phones, netbooks and new devices such as the iPad that use
and build on high-bandwidth wireless access technologies.
One example application will be high quality streaming of an event to a large number
of users. The end-users in this application will be using heterogeneous devices and differ-
ent access technologies and are mostly mobile. In this application the distribution and
streaming needs to be adaptive to each user’s available bandwidth and device playback
capabilities or user preferences. In a mobile environment, devices may experience
temporary disconnections; hence, novel caching techniques are required to maximize the
Quality of Experience. In this class of applications, users need to interact with each other
and produce metadata that can be consumed by other users. The user experience will
also be improved by a large volume of metadata (speech-to-text, etc.) that needs to
be generated automatically using powerful computing resources. Also, the distribution
model in this class of applications needs to be customized to the application's business
model requirements, the type of content, as well as the target end-users.
In this class of applications, many novel functionalities are required including: efficient
content multicasting, smart caching, content conversion, image rendering and recognition,
specialized encryption and decryption functionalities, speech recognition and speech to text
as well as AAA (Authentication/Authorization/Accounting) operations. These function-
alities are also required to be highly reliable, robust and affordable and sometimes they
may only be needed for a short period of time.
Another example of content distribution applications is 3D presence systems [2] that
will enable a group of individuals to interact in 3D across a wide area network. This
application, unlike the previous scenario, might involve a small group of users; however,
it requires very high-speed, high-bandwidth connections and large amounts
of processing. The introduction of 3D technologies and inexpensive computing resources
in datacenters used for image processing, together with a very high bandwidth backbone
optical network, will enable these types of content distribution and streaming
applications, which will need different functionalities than current networks can provide.
Having seen these scenarios, we can expect that future networks will face unprecedented
challenges from a diverse range of applications. These challenges will push networks to
match the advancements in the technologies at their periphery (users, commoditized
computing, and access technologies) and within (optical networks). In this thesis, our
goal is to study future network architectures and the types of capabilities that these
networks need to provide.
1.2 Research Goals and Challenges
In the previous section, we presented our vision of future networks and we briefly pre-
viewed a few sample future application classes. We also discussed some challenging
requirements of these applications. To enable the introduction of such applications, it is
essential that the network address these challenges and more. In this section we
name a few of those challenges and outline the main research questions studied in this
thesis. These challenges are grouped into four main categories: configurability
and application-orientation, facilitating application creation, scalable service management
and QoS control, and mobility and security.
• Configurability and Application-Orientation: Traditional networks are usu-
ally designed, installed and configured once for all applications that operate over
them. Future networks need to be flexible and configurable to adapt to application
requirements. This configurability should exist at different levels, from lower-layer
link configuration up to application-specific routing functions. For instance, applications
should be able to customize the network architecture according to their
distribution model as well as their chosen caching, forwarding, broadcasting and
multicasting approaches for their specific content. Future networks require config-
urable and application-oriented components which enable each application to fulfill
its own set of requirements. We call such a network an Application-Oriented Net-
work. One of the main research goals in this thesis is to develop a framework in
which application-orientation and configurability can be realized in future networks.
• Facilitating Application Creation: Future distributed applications need to be
created, deployed and retired rapidly to adapt to future agile and competitive
business models. Networks can foster this agility by offering common services
used in the full life cycle of an application. Many of these common services will
use computing and communication resources. Future networks should provide a
network of computing and communication resources on which these services and
ultimately applications can operate. To provide such support, virtualization and
composition techniques will be heavily used in future networks.
Virtualization has been introduced as a technique to hide the underlying hardware
resources from the applications. Virtualization will be used in different levels in fu-
ture networks to facilitate application creation and to fulfill application-orientation
and configurability requirements. Virtualization will be both a solution and a
challenge in future networks, since their success and scope will depend
on the advancements in virtualization technologies in various domains such
as bandwidth virtualization as well as computing virtualization.
To facilitate application creation, we also need to be able to compose new applica-
tions using these common service components and virtualized resources. Another
major challenge in future networks is to define open and flexible interfaces to these
common services to enable their incorporation in different applications in the face
of heterogeneity.
• Scalable Service Management and QoS Control: Guaranteeing QoS has
been and will be a major challenge for any network. The scope and diversity of the
applications that will operate over future networks depend directly on the level of
QoS, responsiveness and predictability that these networks will provide.
Future networks will offer a much more diverse range of features to the applications.
Therefore, their management scope will cover managing these new features as well.
To limit the costs associated with management in future networks, scalable and
automated service and resource management solutions will be required. A major
challenge in the introduction of future application-oriented networks is to design and
develop such scalable service management systems and their associated algorithms.
Another major research goal in this thesis is to study QoS control mechanisms that
are able to guarantee desirable network behavior at large scale.
• Mobility and Security: Two of the main challenges in future networks are mo-
bility and security management. Generally, networks with an IP-centric transport
stratum have difficulty handling mobility and security issues. With the emergence
of a new generation of applications and users, these two challenging issues will continue
to play a major role in the success of any network architecture. Although we do
not directly address these challenges in this study, one of our main research goals
is to provide a flexible platform for enabling incorporation of novel mobility and
security management systems in future applications.
Addressing the complete list of challenges in a single dissertation is impossible. Nevertheless,
we focus on a subset of these challenges in areas including configurability,
application-orientation, network-facilitated application creation, virtualization and scal-
able QoS management. We also direct the interested reader to other studies in our
research group on autonomous service management [3] and core network management [4]
that have addressed other challenges in future networks.
1.3 Proposed Solutions Overview
In this section, we briefly overview our proposed solutions for some of the studied chal-
lenges. In this study, we propose a new network architecture, called the Application-Oriented
Network (AON), to address the configurability and application-orientation challenges
in future networks. An AON is a converged computing and
communication network that facilitates the creation of a diverse range of applications through
resource virtualization and a service-oriented application creation paradigm.
In AON, an application is a high-level distributed function that is composed of several
lower-level service components and is designed either to deliver a service to the end-users
or to be used in another, higher-level application. One of the main objectives in AON
is to enable application marketplaces in which application providers can find available
components that are designed and developed separately and use them in their applica-
tions. To do so, AON follows the Service-Oriented Architecture (SOA) [5] application
creation paradigm.
In SOA, high-level applications and business processes are created by composing
service components that can be accessed through well-defined and standard interfaces.
Service-Oriented Architecture enables loose coupling and higher interoperability among
service components that are used in creating an application while each can be developed
and deployed independently. Our proposal for AON utilizes this paradigm to facilitate
application creation in future networks.
Virtualization is another major technique that we heavily use to facilitate application
creation and to provide configurability and application-orientation. "Virtualization"
refers to different technologies in areas such as computer hardware, software, memory,
storage, and data [6]. Nevertheless, we refer to a virtualized resource as a resource
that provides the essential capabilities of the real physical resource and is abstracted
from the physical resource. Through virtualization, AON allows application providers to
rapidly deploy and retire an application. Application providers are also able to flexibly
and dynamically configure the virtualized resources to satisfy their applications'
requirements.
Our proposed architecture for AON consists of three main planes: control, manage-
ment and application planes. The resources that are required for creating an application
are virtualized and abstracted as service components in the application plane. In AON
application plane, a virtual network of computing and communication resources is
allocated to each application, within which that application can operate. The application
providers secure access to these resources through the open and well-defined interfaces
of the control plane. The AON management plane, on the other hand, is responsible for
managing the resources in the application plane.
Although applications in the application plane can follow any network architecture
that they choose, we propose a generic application plane architecture to address future
applications' requirements. This generic architecture includes two enriched layers: a service
layer and a transport layer that covers content-delivery as well as data-delivery
functions.
To validate this architecture, we designed and developed a prototype of this network
called Virtualized Application Networking Infrastructure (VANI) [7, 8] that allows ap-
plication providers and networking researchers to create new distributed applications. In
the realm of network experimentation testbeds, VANI is a major contribution as it en-
ables network researchers and application providers to experiment with new distributed
applications and network architectures. Moreover, VANI allows experimentation with
new layer three protocols instead of Internet Protocol (IP). VANI also includes a repro-
grammable hardware resource that allows application providers to perform customized
hardware-based processing in the network.
Another major contribution of this study in the field of network virtualization is the
introduction of a Distributed Ethernet Traffic Shaping system (DETS) [9] that is able to
guarantee the send and receive bandwidth on virtual networks created for applications
in VANI. Although DETS is proposed for VANI, it is also capable of operating in any
virtual machine-based computing cluster or large datacenter such as cloud computing
[10] datacenters to improve network performance and minimize interference between different
users' traffic. One of the main advantages of DETS is that, unlike other Ethernet
congestion control mechanisms, it does not require any changes to Ethernet equipment
and can operate on the host systems. We propose four algorithms for the DETS core
module, the rate allocator. We compare the performance of these four algorithms and
describe their characteristics through experimentation and measurements.
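The rate allocator at the core of DETS must repeatedly divide the available Ethernet bandwidth among the virtual networks sharing a host. As a rough illustration of the kind of computation such an allocator performs, the following Python sketch computes a classic max-min fair allocation; the function and the max-min policy are illustrative assumptions, not one of the actual DETS algorithms, which are described in Chapter 5.

```python
def max_min_fair(capacity, demands):
    """Allocate link capacity among flows by max-min fairness.

    demands: {flow_id: requested_rate}. Flows demanding less than
    their fair share keep their demand; the freed-up surplus is
    redistributed among the remaining, heavier flows.
    """
    allocation = {}
    remaining = dict(demands)
    cap = capacity
    while remaining:
        fair_share = cap / len(remaining)
        # Satisfy every flow whose demand fits within the fair share.
        satisfied = {f: d for f, d in remaining.items() if d <= fair_share}
        if not satisfied:
            # No flow is below the fair share: split capacity equally.
            for f in remaining:
                allocation[f] = fair_share
            break
        for f, d in satisfied.items():
            allocation[f] = d
            cap -= d
            del remaining[f]
    return allocation
```

For example, with a 10 Gb/s link and demands of 2, 4 and 8 Gb/s, the allocator grants 2, 4 and 4 Gb/s respectively: the two smaller demands are fully satisfied and the remainder goes to the heaviest flow.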
One of the main challenges in a service-oriented system such as VANI is to guarantee
an agreed level of QoS for the composite applications. To address this issue, we inves-
tigate large scale QoS and admission control mechanisms and propose new algorithms
to guarantee the QoS for applications created through service composition in service-
oriented systems. Specifically, we focus on application admission and QoS control to
guarantee successful application completion in service-oriented systems where a set of
service components with a limited number of instances are shared between different ap-
plications. The goal is to allocate these limited resources in a way that the system revenue
is maximized and an agreed level of application QoS is met. In this problem, there are
several defining parameters that can affect the possible solutions, including service execution
time distributions, application request arrival processes, and scalability
concerns.
We first formulate this problem using Markov Decision Processes (MDP) [11, 12]
for small-scale systems that have exponentially distributed service execution times and
are subject to stationary Poisson request arrival processes. Next, we introduce the
problem of QoS control in these environments [13] and we propose distributed heuristics
to guarantee the probability of successful completion of an admitted application instance
[13, 14]. The proposed algorithm is called Distributed Algorithm for Service Commitment
(DASC) [15]. DASC is able to operate with both stationary and non-stationary request
arrival processes and covers both queue-less and queue-enabled services. Moreover, it
does not require the service execution time distributions to be exponential. DASC uses a
probabilistic model to predict future resource usage in the system and makes admission
decisions based on the current and the projected state of the system.
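To illustrate the flavour of such predictive admission decisions, the sketch below admits a new application instance only if a simple binomial model of instance departures predicts a sufficiently high probability that capacity will be available. The model, the function names, and the 0.95 target are hypothetical simplifications for illustration, not the actual DASC algorithm.

```python
import math

def completion_prob(busy, capacity, p_finish):
    """Probability that a slot frees up, assuming each of the `busy`
    instances independently finishes in time with probability
    p_finish (a binomial departure model)."""
    # We need enough departures so that busy - departures < capacity.
    need = busy - capacity + 1
    if need <= 0:
        return 1.0  # a slot is already free
    prob = 0.0
    for k in range(need, busy + 1):
        prob += math.comb(busy, k) * p_finish**k * (1 - p_finish)**(busy - k)
    return prob

def admit(busy, capacity, p_finish, target=0.95):
    """DASC-flavoured decision: admit only if the predicted
    probability of successful completion meets the target."""
    return completion_prob(busy, capacity, p_finish) >= target
```

With five busy instances, a capacity of four, and a 90% chance that each instance finishes in time, the predicted success probability exceeds 0.99 and a new instance is admitted; with only a 10% finishing chance, it is rejected.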
We present alternative steady-state based admission control approaches that can guar-
antee the probability of successful completion in the steady-state. Through simulations
and performance comparisons we show that DASC is able to operate with both station-
ary and non-stationary request arrivals, while the steady-state based approaches can only
operate with stationary request arrivals.
We also propose an application admission control system to both guarantee an agreed
level of QoS using DASC and maximize the system revenue by admitting more valuable
application classes to the system [16]. For the application admission control system, we
investigate the steady-state based solutions as well as online combinatorial optimization
approaches to maximize system revenue.
1.4 Thesis Structure
The thesis is composed of three parts that correspond to the major contributions of this
study:
• Part I: Application-Oriented Networking: In this part, we present the AON
architecture. We start, in Chapter 2, by analyzing the background and require-
ments for AON and studying the major trends in computer and communication
technologies [17]. The AON discussion is followed with the description of the
AON main planes and layers as well as their responsibilities and functionalities
in Chapter 3. We also present the main component in an AON network which is
an Application-Oriented Router (AOR) that utilizes hardware and software com-
ponents to support content processing and delivery. We present several sample use
cases in which AORs can add value to different application scenarios.
• Part II: Virtualized Application Networking Infrastructure: In Chapter 4,
we describe the VANI architecture and its main resources, and we show how a researcher
or an application provider can interact with VANI. We also discuss different func-
tionalities of VANI control and management plane. We describe how new resources
can be created and registered in VANI. Performance measurements on VANI re-
programmable resources and VANI internal fabric are presented as well.
In Chapter 5, we present the DETS system that is able to guarantee the send and
receive bandwidth on virtual networks created for applications in VANI. In this
chapter, we also present the main DETS modules and their corresponding algorithms, as
well as measurements and performance evaluations.
• Part III: QoS & Admission Control in Service-Oriented Systems: In this
final part, we investigate the problem of service allocation in service-oriented sys-
tems. We first formulate this problem using Markov Decision Processes in Chapter
6. Next, we introduce the problem of QoS control in these environments and we describe
the DASC system in Chapter 7 and present the probabilistic and predictive
model used in DASC for both queue-less and queue-enabled systems. In Chapter
8, we propose an application admission control system in service-oriented systems.
Performance evaluations of each of the proposed algorithms are presented in
the relevant chapters. Finally, in Chapter 9, we offer concluding remarks and
discuss our contributions as well as our future work.
Part I
Application-Oriented Networking
Chapter 2
Background and Requirement
Analysis
In this chapter, we focus on the challenges to network architecture presented by new ap-
plications. We examine how the well-known trends in the commoditization of hardware,
software, and communications technology have enabled new distributed computing mod-
els. New applications based on these models have been very disruptive because of the
clear advantages that they have over traditional ones. We examine the features of these
new applications that have made them successful and identify the potential additional
benefits that may result from their associated computing models. We discuss how these
new models are leading to a new service or application provider infrastructure in which
computing and communications technologies converge in new ways.
2.1 New Computing Models
Relentless technology advance, captured by the rubric of "Moore's Law", has been a
steady driver for change in networking equipment, devices, services, and applications.
Improvements in computation power and cost have facilitated the execution of more
complex software, which in turn has stimulated more demand for improved hardware.
This virtuous cycle has taken a dramatic turn in the last few years as the basic enabling
computing, communications, and software technologies have become commoditized. New
distributed computing models have appeared that are fundamentally disrupting tradi-
tional models for offering services and applications by leveraging commodity resources
and introducing new business models.
Peer-to-peer applications are a prime example of these disruptive trends [18]. Com-
modity computing, communications, and software have also enabled new applications
that attain entirely new levels of scale, with Google search the preeminent example.
Peer-to-peer applications and Google search represent extremes of distributed comput-
ing in terms of ownership and control of resources, but they also share the advantages
inherent in distributed computing. Both examples can achieve huge levels of scale. Their
design provides for the delivery of the application through systems that are loosely cou-
pled so that faults can be addressed through simple mechanisms that exploit inexpensive
redundancy. Both designs incorporate self-organizing mechanisms to address the man-
agement of a huge aggregation of resources. Self-organizing mechanisms are also used
to ensure connectivity and basic levels of performance. It is clear that these new infrastructures,
built atop shared and/or commodity resources, can achieve very
large scale, while having the potential to provide higher reliability, performance, and much
lower operating costs.
2.2 New Applications through Composition
We have seen that the Internet has become the platform to support new applications
and that the associated infrastructure is becoming more decentralized. Moreover, the
approaches to creating new applications are also changing in a direction where innovation
becomes more decentralized. In this environment, new applications are created through
composition of service components, both in the form of "mashups", as well as in a more
rigorous form by using a Service Oriented Architecture [19, 20, 5].
The success of Web protocols and standards in enabling the deployment of a massive
system through the uncoordinated efforts of a global community provides support for
efforts to create applications through the linking of interoperable, loosely coupled software
components that are accessed through Internet protocols. New application providers
such as Google and Yahoo now offer access to components that provide services such as
search, map, chat, and photo sharing, to other application developers through Application
Programming Interfaces (APIs). The term "mashup" denotes web applications
in which several sources are used to create a new service [21]. For example, Google Maps
has provided the basis for a huge number of mapping mashups. The importance of the
mashup phenomenon is that it marks the emergence of a new mode of application creation
where applications are created through a distributed and collaborative process and where
the application at any given point in time is the cumulative result of a community effort.
The term Web 2.0 refers to this network-centric platform [22].
The emergence of Service Oriented Architecture (SOA) standards represents another
related major trend for the delivery of applications [19, 20]. The key architectural con-
cept in SOA is the service orientation that enables the rapid and easy composition and
management of large-scale distributed services in the face of component autonomy and
heterogeneity. Service composition concepts are not new but have increasingly come into
the spotlight in recent years with the emergence of new technologies such as XML
and Web Services (WS).
The SOA model follows three steps: register, find, and invoke, as shown in Figure
2.1. The Web Services set of specifications [23] is an instantiation of this model. Web Services
specifications provide uniform interfaces to loosely-coupled software components.
These specifications provide a messaging framework for the transfer of information, an
XML-based [24] grammar for defining web services [23], and the means for locating web
services. As in mashups, a key concept of interest to us in Web Services is the ability
Figure 2.1: Basic Service-Oriented Architecture model (source: http://www.w3.org)
to create new applications through the composition of service components that can be
accessed through standard Web Service interfaces. However, SOA goes further through
the development of the Business Process Execution Language (BPEL) [25], which al-
lows business processes to be implemented as workflows involving multiple web services.
From a network architecture perspective, SOA shifts the focus to an overlay network of
computing resources where messages are exchanged according to content and in doing so
opens the way to application-oriented networking.
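The register/find/invoke cycle can be made concrete with a toy registry sketch in Python; the `ServiceRegistry` class and the stub `geocode` endpoint are hypothetical illustrations, standing in for a UDDI-style registry and a real Web Service.

```python
class ServiceRegistry:
    """Toy registry illustrating SOA's register/find/invoke cycle."""

    def __init__(self):
        self._services = {}

    def register(self, name, endpoint):
        # A provider publishes its service under a well-known name.
        self._services.setdefault(name, []).append(endpoint)

    def find(self, name):
        # A requester discovers all matching providers.
        return self._services.get(name, [])

# Invoke: the requester calls the discovered endpoint directly.
registry = ServiceRegistry()
registry.register("geocode", lambda addr: (43.66, -79.40))  # stub provider
endpoint = registry.find("geocode")[0]
print(endpoint("40 St George St, Toronto"))  # → (43.66, -79.4)
```

The point of the indirection is that the requester binds to a service name at run time rather than to a concrete provider, which is what makes components loosely coupled and independently replaceable.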
The emergence of SOA as a new paradigm for service provisioning highlights the
importance of service orientation in the future application-oriented networks. SOA-based
loosely-coupled systems are giving enterprises greater agility when it comes to adjusting
the structure of their businesses to meet changing business requirements. This model
of flexible and decentralized application creation has enabled the introduction of many new
service and application providers, and future networks need to consider this application
creation paradigm to facilitate and enable agility in application creation.
2.3 Emergence of Cloud Computing
Recently, the cloud computing model has emerged as a platform for the deployment of
applications and services. This model relies on very large scale datacenters attached to the
Internet cloud [10, 26]. These datacenters heavily utilize virtualization techniques on
top of commodity hardware and software components. The resources in the cloud are
consequently inexpensive and affordable for many application providers that find it more
economically viable to use these resources rather than investing in an in-house deployment
of these resources.
The cloud computing resources are primarily based on virtualization of two pillar
resources: computing and storage. Application providers can use these virtualized resources
to store and process data. Moreover, they can dynamically secure or release
resources based on the current and/or anticipated load on their applications, using
programmable and open WS interfaces as in Amazon Elastic Compute Cloud (EC2)
[27] or open-source systems based on that interface, as in the Eucalyptus project [28].
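A threshold-based elastic scaling loop over such a programmable interface might look like the sketch below; the `CloudClient` object and its methods in the commented usage are hypothetical stand-ins for an EC2-style WS API, not actual Amazon calls.

```python
def scaling_decision(load_per_instance, instances, high=0.8, low=0.3):
    """Return the change in instance count (+1, -1 or 0) given the
    average utilization of the currently running instances."""
    if load_per_instance > high:
        return +1            # scale out before saturation
    if load_per_instance < low and instances > 1:
        return -1            # release an instance to cut cost
    return 0

# Hypothetical EC2-style usage (illustrative only):
# client = CloudClient(credentials)
# delta = scaling_decision(client.avg_utilization(), client.count())
# if delta > 0:
#     client.run_instances(image_id="app-image", count=delta)
# elif delta < 0:
#     client.terminate_instances(count=-delta)
```

Even this trivial policy captures the economic appeal of the model: resources are paid for only while the measured load justifies them.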
The cloud computing datacenters are connected to the network with very high speed
optical connections. Together with massive processing power and storage, they facilitate
the introduction of applications that require processing large volumes of data. Another
advantage of the cloud computing model is that it has shortened the deployment phase of
the application lifecycle significantly and has made the creation of new applications considerably
faster than before. It also enables applications to dynamically adapt to changes
in load by increasing or decreasing the amount of resources they use. Nevertheless,
improvements are needed in network virtualization, since it has been shown
[29, 9, 30] that even internal traffic inside cloud datacenters does not enjoy guaranteed
performance, and there is considerable interference between different users' traffic.
The introduction and success of the cloud computing model is another indicator that
future applications are moving toward platforms that enable rapid application creation
through composition of basic service components over shared platforms. It is also a prime
example of how commodity resources and virtualization techniques can facilitate creation
of low cost and short-life applications.
2.4 Evolution of Traditional Service Providers
The infrastructure of traditional service providers (i.e. telephone companies) is also
changing towards one that is based on the Internet Protocol, under pressure from tech-
nology advances and new application providers. While suffering from some clear dis-
advantages relative to new application providers, traditional service providers remain
superior in terms of mobility as well as reliability, security, and well-established business
models.
The infrastructure of the traditional service provider is undergoing a fundamental
transition to a multi-service packet-switching architecture based on Internet Protocol
(IP). This transition includes the introduction of a new control plane based on the Session
Initiation Protocol (SIP) [31] that enables the replacement of existing services such as
voice, and the introduction of services such as instant messaging and presence. This IP
based network architecture is articulated by the Next Generation Networks Focus Group
at the International Telecommunication Union [32].
According to the ITU-T definition, a Next Generation Network (NGN) [32, 33] is a
packet-based network able to provide services including Telecommunication Services and
able to make use of multiple broadband, QoS-enabled transport technologies and in which
service-related functions are independent from underlying transport-related technologies.
The NGN’s architecture allows decoupling of the network’s transport and service layers.
This means that whenever a provider wants to enable a new service, they can do so
by defining it directly at the service layer without considering the transport layer - i.e.
services are independent of transport details.
IP Multimedia Subsystem (IMS) [34] is an effort by telecom-oriented standard bodies
to realize the NGN concepts and extend the new control plane to any access network,
and it presents a natural evolution from the traditional closed signaling system to the
NGN service control system. IMS was developed for controlling the access to services by
customers of Third Generation wireless access networks. In the IMS approach, servers, in
the user’s home service provider’s network, control access to all services. Consequently,
the service provider can determine what services are delivered, at what quality level, and
at what cost.
IPSphere and its Service Signaling Stratum (SSS) [35] represent another telecom industry effort
to enable end-to-end services across multiple service providers. An interesting aspect of
the SSS is that web services are used in its implementations. Operators can publish the
services they are willing to provide, and other operators can use web services to negotiate
and secure the resources to enable end-to-end services.
The development of SSS presents interesting possibilities for the development of an
environment where traditional and emerging providers can work together in the delivery
of services and applications. Further development of systems such as SSS can provide
the means for these players to interact dynamically in a distributed fashion, and in doing
so, create an open market for applications.
While initial implementations of IMS are based on traditional client/server architec-
tures, it is clear that the emerging distributed models associated with new disruptive
applications are applicable. We can therefore anticipate that future application and ser-
vice provider infrastructures will be based on similar, if not identical, infrastructures
that converge computing and communications to accommodate the emerging paradigms
for application creation and delivery. In addition, we will see the impact
of the main principle of the NGN and IMS architecture, namely the independence of
the service-related functionalities from the transport-related functionalities, in future
application-oriented networks. The main advantage of this separation model is the emergence
of numerous new service and application providers that utilize telecommunication
infrastructure for delivering their services to the users.
2.5 Introduction of Smart Phones
The introduction of smart phones and associated applications has been another major
trend in the past few years, marking the transition to mobile computers as the default
user devices. This success is mainly due to the introduction of powerful low-power
processors, multi-touch interactive displays, high bandwidth 3G
and 3G+ wireless networks [36], and the upcoming fourth generation (4G) of wireless
access networks, Long Term Evolution (LTE) [37]. The introduction of these devices has triggered an
explosion in the number and diversity of Internet-based applications. This success has
also stimulated the introduction of other novel devices, e.g. the iPad, which in turn will
generate another wave of applications.
Applications on smart phones enjoy enhancements in other services such as location-
based services and Instant Messaging (IM) services. They are also able to utilize cloud-based
computing for processing-intensive tasks such as speech recognition.
Another interesting trend in smartphone applications is the popularity and universal
acceptance of application marketplaces, such as the Apple App Store [38] and the open
Android application marketplace [39], in which users and application providers interact
and the required transactions are managed.
In combination with other trends and technologies like cloud computing and SOA-based
technologies, the success of these devices exemplifies how the Internet, and
the infrastructure that has emerged around it, have truly become the platform for the
delivery of an unlimited number of applications. It is also an indicator that wireless,
high mobility and high bandwidth terminals will constitute the majority of users in future
networks, in contrast to the traditional view of network users as being mainly fixed and
wired.
2.6 Advancements in Content Delivery Networks
During the last decade content delivery networks have seen major advancements and
technological breakthroughs. A Content Distribution (or Delivery) Network (CDN) is an
overlay network upon which content (e.g., video) is distributed and delivered to the end
users. In a CDN, content is usually copied onto multiple servers across a wide area, and users
connect to one of these servers to receive a copy of the content, rather than contacting
a central server. Akamai [40] is a major content delivery network which was a pioneer
in this field. In Akamai, the content producers push their content to the Akamai edge
servers and users receive the content from these servers. The major shortcoming of this
model is that different content delivery networks are usually not inter-operable and
users' interactivity with the provided content is limited.
Another prime example of content delivery networks is peer-to-peer file-sharing
networks such as BitTorrent [41], which consume a large portion of global Internet traffic
with minimal centralized management and have proved how innovative, distributed and
self-managing systems can operate effectively over the commodity, shared resources of
ordinary Internet users.
Publish/subscribe systems are another important class of content delivery networks
[42]. Publish/subscribe systems use an asynchronous messaging paradigm to link publish-
ers and subscribers of event information. One of the main protocols for pub/sub systems
is the Extensible Messaging and Presence Protocol (XMPP) [43], an XML-based protocol
first developed for Instant Messaging services that is now becoming one of the main
candidates for asynchronous message delivery.
A big advantage of the publish/subscribe paradigm is that publishers are loosely cou-
pled to subscribers. Publishers need not know of the existence of specific subscribers and
they can remain ignorant of the system topology. Publish/subscribe offers the op-
portunity for better scalability than traditional client-server paradigms through parallel
operation, message caching and tree-based routing. Achieving this scalability at high
message rates, however, may require hardware-based message processing and rule matching.
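The loose coupling described above can be sketched with a minimal in-memory broker. This is an illustrative sketch only, not XMPP or any real pub/sub middleware, and the topic names are hypothetical:

```python
from collections import defaultdict

class Broker:
    """Minimal in-memory pub/sub broker: publishers and subscribers
    interact only through topic names, never directly."""
    def __init__(self):
        self._subscribers = defaultdict(list)  # topic -> list of callbacks

    def subscribe(self, topic, callback):
        self._subscribers[topic].append(callback)

    def publish(self, topic, event):
        # The publisher remains ignorant of who (if anyone) receives the event.
        for callback in self._subscribers[topic]:
            callback(event)

broker = Broker()
received = []
broker.subscribe("presence/alice", received.append)
broker.publish("presence/alice", {"status": "online"})
# 'received' now holds the delivered event
```

Because publishers and subscribers share only a topic name, either side can be added, removed or replicated without the other noticing, which is the property that makes the paradigm attractive for scaling.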
Multimedia streaming applications are another emerging class of content delivery ap-
plications. Adaptive streaming [44] is the latest trend in this class of application. In
adaptive streaming, the streamed content format, and consequently required bandwidth,
is adapted to the end-user device capabilities and the available bandwidth. The HTTP
protocol is widely used in adaptive streaming. This class of applications will face
another major challenge with the emergence of 3D streaming applications [2]. Moreover,
with high bandwidth availability and growing demand for low-delay, high-quality video,
streaming uncoded, raw high-definition multimedia content will become more attractive,
especially since the content will be converted to many formats and played on
heterogeneous devices.
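The adaptation logic described above can be illustrated with a simple rate-selection sketch. The representation ladder and its bitrates are hypothetical, not drawn from any real adaptive streaming system:

```python
def select_representation(available_kbps, representations):
    """Pick the highest-bitrate representation that fits the measured
    bandwidth, falling back to the lowest one when nothing fits.
    'representations' maps a quality label to its bitrate in kbit/s."""
    fitting = {k: v for k, v in representations.items() if v <= available_kbps}
    if not fitting:
        return min(representations, key=representations.get)
    return max(fitting, key=fitting.get)

# Hypothetical representation ladder for one piece of content
ladder = {"240p": 400, "480p": 1200, "720p": 2500, "1080p": 5000}
choice = select_representation(3000, ladder)  # -> "720p"
```

In a real adaptive streaming client this decision is re-evaluated per segment as the bandwidth estimate changes, which is what lets the stream degrade and recover gracefully.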
According to a recent Cisco Visual Networking Index report [45], video traffic currently
accounts for more than a third of Internet traffic, and another third is associated with
peer-to-peer file-sharing networks. It is expected that by the end of 2014 all video
traffic (P2P, TV, on-demand, and Internet) will account for more than 91 percent of
global consumer traffic. It is anticipated that 57 percent of consumer Internet traffic
in 2014 will be Internet video, mainly due to expected advancements in HDTV and
3D video.
Considering these statistics and other major content delivery services such as YouTube
[46], Hulu [47], and IPTV [48], we conclude that content delivery will continue to be
one of the most bandwidth-, processing- and storage-consuming applications on future
networks. Therefore, any future network architecture has to offer solid solutions for
efficient content delivery, including smart caching, forwarding, broadcasting, and
multicasting of live as well as on-demand content (and associated metadata) to a large
number of heterogeneous (and mostly mobile) devices.
2.7 Future Networks Architecture
Although the Internet has become essential infrastructure for modern society and has
enjoyed enormous success in delivering a myriad of services, it suffers from major
shortcomings in several areas, such as security flaws, weak mobility support, lack of
QoS guarantees, traffic interference and isolation, and addressing and forwarding
(multicasting/broadcasting) problems. With the introduction of new systems and
applications on the Internet, these problems will become more significant and will affect
network performance more than before.
Although many patch solutions have been proposed for these problems (e.g., firewalls,
proxies, NATs, TCP-friendly protocols), it is widely accepted that the current Internet
has reached its limits and suffers from ossification [1, 49], and that research into new
architectures is needed to address the challenges that future networks will face.
There have also been several proposals for new network architectures and protocols
[50, 51, 52]. Among the proposed architectures, the authors of [53] propose a new network
architecture based on the pub/sub model. In [54], Palo Alto Research Center (PARC)
researchers propose content-centric networking in contrast to the location-centric view
of the network; in this proposed content-centric network, a packet address points to
content rather than to a location.
Our work in this thesis falls into this body of research, and we study this problem
from the applications' point of view. Our goal is to create an environment in which
networks can participate more actively in the full application lifecycle, including
application creation, deployment and retirement. In the next chapter, we describe our
view on future network architectures and present a new architecture called the
Application-Oriented Network (AON). To address the challenges imposed by future
applications, AON is designed as a converged computing and communication network. To
arrive at the AON architecture, we considered the major trends and architectures
discussed in this chapter, such as the Next Generation Networks architecture,
Service-Oriented Architecture, Content Delivery Networks
and mobile networks.
One of the major obstacles to introducing new network architectures was, and still is,
experimentation with proposed architectures in a large-scale environment, possibly with
massive numbers of end users. To address this problem, there have been
several initiatives to build large scale testbeds for networking research. Examples of
these initiatives are GENI [55, 56], PlanetLab [57, 58], ProtoGENI (Emulab) [59, 60],
and ORCA [61] in the United States, FEDERICA in Europe [62], G-Lab in Germany,
and i2CAT in Spain. In the second part of this thesis, we present the Virtualized
Application Networking Infrastructure (VANI), which is designed and developed based on
AON principles and enables experimentation with new network architectures and protocols
and with distributed applications for future networks.
Chapter 3
Application-Oriented Networking
In the previous chapter, we analyzed several trends in computer and communication
networking and in particular we discussed how commodity hardware and software led to
new paradigms in application creation and to the introduction of numerous applications
on the Internet platform. In this chapter, we consider the role that future networks can
play to further advance the creation of new applications and services. We introduce
an Application-Oriented Network (AON) as a converged computing and communications
network that provides flexible and dynamic support to application providers for delivering
diversified compositional services. AON support is provided through enriched transport
and service strata and a service-oriented approach to utilization of virtualized shared
resources.
AON is a converged network; in AON we eliminate the separation of computing
and communication technologies and combine them in a new approach. In particular, a
collection of networking and computing resources can be secured through AON to create
a distributed application. Unlike many other networks that deliver their services to the
end-users, the AON users are application providers. Application providers, on the other
hand, deal with the end-users of their applications. Applications in AON are created
through composition of service components and virtualized resources and can be in a
Figure 3.1: Three planes in an Application-Oriented Network: the application plane (virtualized resources and service components), the control plane (allocates resources in the application plane), and the management plane (manages the AON and the resources in the application plane)
diversified range of domains, such as telecommunication services, enterprise services
and content delivery networks. In AON, multiple applications with different requirements
are able to coexist, with on-demand access to network resources and the ability to
flexibly configure and manage these resources. Applications might also have short life
cycles: they can be easily deployed, grown or shrunk in scale, and finally retired.
In the rest of this chapter, we present a reference model for an Application-Oriented
Network. The reference model is designed to describe how the AON goals can be fulfilled
and also to describe the framework for collaboration and interaction between the main
players in an AON, namely service providers and application providers.
The AON reference model has three main planes: management plane, control plane
and AON user plane, also called AON application plane (Figure 3.1). As we explained
earlier, AON users are application providers. Application providers can deploy applica-
tions in the application plane using the resources instantiated in this plane. The control
plane, on the other hand, is used by the application providers to secure access to the
resources and service components in the application plane. The management plane is
responsible for managing these resources as well as the Application-Oriented Network. In
the next section, we describe AON application plane characteristics, and its architecture.
Figure 3.2: Application Plane Resources (processing, networking and storage resources, and facilitating service components)
3.1 AON Application Plane
The AON application plane is composed of virtualized resources and service components
required for creating an application. These resources can be communication resources
(e.g. virtual links), and computing resources such as virtual processing, reprogrammable
hardware resources, and storage resources as well as any hardware-based or software-
based service components needed for creating applications. These service components
can include, for example, database services, orchestration services, and content conversion
services (Figure 3.2). Other examples of resources are general service components and
software-as-service components such as application-specific authentication, authorization,
accounting, as well as security-related services such as encryption/decryption services.
In the AON application plane, all resources are virtualized and represented by one or
more service components. Virtualization, as we explained earlier, is a technique that is
used to instantiate a virtual resource that provides the essential capabilities of the real
physical resource. The virtual resources are abstracted from the physical resource and
can be shared among many users without interference. For instance, a virtual computing
resource is a processing resource that might share a physical processing resource with
other virtual resources.
Figure 3.3: Multiple Applications in AON (virtual resources are assigned to each application by the AON control plane and managed by the AON management plane)
The virtualized resources and service components expose their functionalities through
well-defined open interfaces, such as Web Services, that are platform-independent and can
be invoked from heterogeneous environments. Application providers are able to program,
configure and compose these resources using SOA technologies according to their own
requirements to create a more complex application or a service that can be used by other
applications. For instance, an orchestrator service can be built on top of processing and
storage services, or as another example, a content conversion service can be built using
a virtualized reprogrammable hardware resource.
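As a rough illustration of this composition model, the following sketch builds a content conversion service on top of stand-in processing and storage services. The class and method names here are invented for illustration and are not part of any AON or VANI interface:

```python
class ProcessingService:
    """Stand-in for a virtualized processing resource exposed as a service."""
    def run(self, operation, payload):
        return f"{operation}({payload})"

class StorageService:
    """Stand-in for a virtualized storage resource exposed as a service."""
    def __init__(self):
        self._store = {}
    def put(self, key, value):
        self._store[key] = value
    def get(self, key):
        return self._store.get(key)
    def contains(self, key):
        return key in self._store

class ContentConversionService:
    """A composed service: converts content via the processing service
    and caches the results in the storage service."""
    def __init__(self, processing, storage):
        self.processing = processing
        self.storage = storage
    def convert(self, content_id, content, target_format):
        key = (content_id, target_format)
        if not self.storage.contains(key):
            # Convert once, cache, then serve subsequent requests from storage.
            self.storage.put(key, self.processing.run(f"to_{target_format}", content))
        return self.storage.get(key)

converter = ContentConversionService(ProcessingService(), StorageService())
result = converter.convert("clip1", "raw", "h264")  # -> "to_h264(raw)"
```

The point of the sketch is that the composed service knows only the interfaces of the underlying services, so either could be backed by software or by a virtualized hardware resource without changing the composition.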
The virtualized resources and service components are assigned to each application per
application provider request. These resources are secured for each application provider
through the AON control plane. The AON management plane performs the management-
related tasks such as monitoring and fault management on these virtualized resources.
In other words, the AON control and management planes cooperatively create a resource
pool in which each application can operate. Consequently, multiple applications are able
to coexist in an AON, each owning one of the created resource pools (Figure 3.3).
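A minimal sketch of this on-demand pool creation, assuming a single shared capacity table and per-application pools (the resource names and the API shape are hypothetical, not a defined AON interface):

```python
class ControlPlane:
    """Sketch of on-demand allocation: each application receives its own
    isolated resource pool drawn from a shared capacity of virtual
    resources."""
    def __init__(self, capacity):
        self.capacity = dict(capacity)  # remaining shared capacity
        self.pools = {}                 # application id -> its resource pool

    def allocate(self, app_id, request):
        # Check the whole request before deducting anything, so a failed
        # request leaves the shared capacity untouched.
        if any(self.capacity.get(r, 0) < n for r, n in request.items()):
            raise RuntimeError("insufficient capacity")
        for r, n in request.items():
            self.capacity[r] -= n
        self.pools[app_id] = dict(request)

    def release(self, app_id):
        # Return a retired application's resources to the shared capacity.
        for r, n in self.pools.pop(app_id).items():
            self.capacity[r] += n

cp = ControlPlane({"vcpu": 8, "storage_gb": 100})
cp.allocate("app1", {"vcpu": 4, "storage_gb": 50})  # app1's isolated pool
```

Each pool is disjoint from the others by construction, which is the isolation property the management plane then monitors and maintains.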
Applications in the application plane can follow any layered network architecture that
satisfies their requirements. However, to address the application requirements discussed
in the previous chapter, we propose a generic architecture for the application plane. In
proposing this generic architecture, we study the functionalities, as well as the
resources, that need to be embedded in the application plane.
To arrive at the application plane architecture, we considered major trends in the
computing and communication fields (described in the previous chapter), especially Next
Generation Networks architecture [32], Service-Oriented Architecture [5] as well as Con-
tent Distribution Networks. The AON application plane architecture has the main char-
acteristics of the NGN and SOA architectures: the separation of services from transport
and a service-oriented design for the service layer.
Traditional transport layers are mostly designed to perform pure digital data delivery
between two geographically separated points. As new applications emerge, however, the
need for performing content-delivery functions in a network, in addition to data delivery,
becomes more significant, as content delivery becomes the default and dominant
communication transfer mode in future networks. For this reason, the transport layer in
the AON application plane incorporates content-delivery functionalities to accommodate
content distribution applications.
In comparison to the Next Generation Networks (NGN) reference model, the AON
reference model can be seen as a new interpretation of NGN principles. In AON, the key
NGN principle of separating the service layer from the transport layer is adapted so that
the “abstraction level” of the delivery concept in the transport layer is raised from raw
digital data delivery to the more advanced content delivery.
The AON reference model acknowledges the key principle of SOA which is the service-
orientation in the service layer. Therefore, this architecture for applications appears as an
evolution from the NGN reference model and SOA. This evolution provides the platform
for achieving benefits of both in a converged network.
Figure 3.4 shows the AON application plane architecture that includes two main
strata and internal planes of an application. As can be seen, within the AON application
plane each application can have its own user-plane, control plane and management plane
operating on its allocated resource pool.
Figure 3.4: Application Plane Architecture (service and transport strata, with per-application user, control and management planes)
Service Stratum
The service stratum in the AON application plane embraces the functions that facilitate
the development and deployment of services and applications, as well as the service
modules and applications themselves. Based on service-orientation concepts, the
functionalities inside the service stratum are those that enable the rapid development
and deployment of services, including search, location, identity, instant messaging and
application-specific authentication, authorization and accounting. Other example
components in this layer are modules responsible for orchestrating services and creating
new complex services and
applications.
As stated before, the cornerstone of this layer is the set of service-orientation
concepts and the provision of facilities for creating new applications. In-network
service-layer components can also include third-party services that can be used in
service-oriented application creation. Among these services are alternative accounting
services, orchestration engines, inter-networking services, instant messaging services
and localization services. The AON control and management planes provide the
functionalities
necessary for interactions between the application-providers and generic service providers.
In the control plane description we discuss this functionality in more detail.
Transport Stratum
The main differentiating characteristic of the transport layer in future application-oriented
networks compared to conventional networks is the inclusion of content-delivery tasks in
the transport layer in addition to the pure data-delivery tasks. As we described in the
previous chapters, the majority of global Internet traffic is currently consumed by content-
delivery and file-sharing applications, and these types of applications are becoming the
dominant type of application, while the traditional one-to-one human communication
application is becoming a special-case scenario. Therefore, we propose the inclusion of
content delivery to accommodate this class of applications and to fulfill the requirements
discussed in the previous chapters, such as efficient and smart content distribution,
caching, conversion, encryption and decryption. Moreover, this change enables improved
and efficient handling of mobility and security challenges in future networks.
The inclusion of content delivery tasks in transport layer functionalities implies inclu-
sion of major content-delivery related resources into this layer in addition to traditional
networking resources. The most important of these are processing and storage, which are
the most basic needs of future applications at the transport level. In the rest of this
subsection, we elaborate on the requirements for, and the advantages of, including these
resources in the transport stratum.
Processing
In-network processing resources in AON can be used in many content-delivery related
functionalities such as content conversion, compression and decompression, encryption
and decryption, content validation, content-based routing and content transformation.
In-network transport nodes equipped with processing resources can also host a range
of essential services for application creation provided by third parties. Among these
services are pub-sub engines, message-passing services, security engines, conversion
services, and compression and decompression engines, all deployed on the processing
resources. The AON control plane provides the functionalities necessary for interactions
between application providers and these third-party service providers.
Although most of the current content processing systems are software-based, to fulfill
the scalability requirement of future applications hardware-based content processors may
be needed to empower in-network content processing functions. These hardware-based
content processors have to be configurable, customizable, and reprogrammable to meet
application-specific requirements. An example network architecture that will benefit from
this capability is the pub-sub architecture, especially for hardware-based rule-matching
and content-based routing. 3D video distribution networks or even software-defined mo-
bile networks can also benefit from such reprogrammable hardware resources.
The inclusion of powerful processors in the transport layer also allows for significant
improvements in privacy- and security-related operations. For instance,
processing-intensive tasks such as rigorous security checks on packets, messages and
content can be performed in the network to meet applications' security requirements. Later
in this chapter, we discuss security concerns in AON in more detail.
Processing can also be used in mobile networks to improve quality and efficiency in
mobility related functions. For instance, it can be used in performing handover during
video streaming to a mobile user. In this scenario, in order to provide a smooth handover
experience, multiple format conversions can be performed on a video stream and the
generated streams can be forwarded to heterogeneous devices or end-points associated
with the user. Combining this with adaptive streaming approaches in content distribution
will lead to a far better video and multimedia streaming experience in future networks.
Other application scenarios in which in-network processors can be useful are nu-
merous. As another example, network-coding-based content distribution networks can
significantly benefit from in-network processors to perform image processing and content
coding/decoding tasks using the general processors, Graphics Processing Units (GPU),
network processors as well as reprogrammable hardware resources.
Storage
Storage is another basic requirement of most applications especially in content distribu-
tion. Applications choose various methods for dealing with the problem of storing content
according to constraints such as the content type and target users. Some applications
exploit distributed commodity storage resources spread throughout the network such as
BitTorrent [41] which utilizes a peer-to-peer based configuration while others use a more
centralized approach.
Storage is also required for the reliable and efficient caching and delivery of content,
especially to temporarily unavailable nodes in mobile networks [63] as well as in pub/sub
systems. Storage is also needed for efficient content multicasting and broadcasting with
advanced functionalities such as playback.
Content storage together with processing is also useful in live and adaptive streaming
applications where a large number of end-users need to receive a metadata-enriched
multimedia stream over heterogeneous access technologies and mobile devices. Different
conditions in the access network, such as handover or signal loss may lead to temporary
disconnection of the mobile node. In a mobile network, in-network and near-to-the-
user storage capabilities become very useful especially for performing smart and efficient
content caching. The need for in-network storage for delivering content to mobile nodes
has also been discussed in [63], in which the authors propose a clean-slate cache-
and-forward architecture for video delivery in mobile networks.
Networking
Networking resources are the resources that provide pure data-delivery between different
resources inside an AON or between the AON resources and applications end-users.
These resources have been traditionally the main part of transport layers. To satisfy
future applications data-delivery requirements, in an AON the applications are able to
specify their networking requirements. Many applications require guaranteed network
connections while others prefer the traditional best-effort connectivity. For instance, the
Internet transport has been designed based on a simple best-effort packet-forwarding
approach. Although there have been many efforts to introduce advanced QoS guarantees
in IP-based transport networks, many applications continue to find this model of data
delivery simplistic and insufficient. AON enables application providers to request
different levels of Quality of Service by specifying rate, delay and other parameters.
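The kind of request described here might take the following shape. This is a hypothetical data structure for illustration, not a defined AON interface:

```python
from dataclasses import dataclass

@dataclass
class NetworkRequest:
    """Hypothetical shape of a networking request in AON: the application
    provider specifies QoS parameters instead of accepting best effort."""
    src: str
    dst: str
    rate_mbps: float = 0.0      # 0 means no rate guarantee requested
    max_delay_ms: float = None  # None means no delay bound requested

    def is_best_effort(self):
        return self.rate_mbps == 0.0 and self.max_delay_ms is None

guaranteed = NetworkRequest("nodeA", "nodeB", rate_mbps=100, max_delay_ms=20)
best_effort = NetworkRequest("nodeA", "nodeC")
```

Defaulting to best effort keeps the traditional model available, while applications with stricter needs simply populate the QoS fields.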
Configurability is one of the main requirements of an Application-Oriented Network.
Therefore, in an AON, unlike in traditional transport layers, application providers are
even allowed to configure the data-delivery network topology to adapt it to application
requirements. This functionality has also been demonstrated in the CANARIE network
using User Controlled Light Path (UCLP) Web Services [64, 65]. An Application-Oriented
Network also has to provide different levels of communication services, at varying
granularity, to applications that require them. These communication services include
(but are not limited to) optical light path connections, circuit-switched, packet-switched
or MPLS-based connections, multicast and broadcast network links, and connections to
multi-homed end-points. The networking resources in AON are virtualized and made
available to application providers through well-defined and open interfaces of the AON
control plane.
In comparison to the current CDNs, the combination of content-delivery (processing,
storage) and the data-delivery (networking) in AON transport layer enables a more ad-
vanced content-delivery that can be customized for an application and a specific content.
The inclusion of processing, storage, and content-delivery in general, in future networks,
is the true manifestation of the convergence between computing and communications re-
sources in an Application-Oriented Network, as we discussed in the previous chapters.
3.2 AON Control and Management Planes
The AON reference model is composed of an application plane (which can itself host
multiple application stacks), a control plane responsible for the dynamic allocation of
application plane resources to each application, and a management plane (Figure 3.5).
The AON control plane is also responsible for fast failure management, while the
management plane performs long-term management tasks on the application plane's
virtualized resources, such as provisioning, re-provisioning, prediction, pricing, and
fault monitoring and management.
The three-plane model is traditionally used in telecommunication networks to draw
a boundary between different functionalities required in high-quality operation of a net-
work. The Internet Protocol and the TCP/IP model of communications, however, lack a
clear identification of control and management planes. While this has been a major
advantage for IP and has allowed it to grow in scale while remaining manageable, it is
generally believed that the lack of a well-designed control and management plane in the
Internet will be a major reason for replacing this architecture in future networks.
Therefore, AON's success directly depends on its control and management plane
architecture and on the flexibility, scalability and type of functionalities they can
provide.
Figure 3.5: Application-Oriented Network Reference Model (multiple application stacks in the application plane, each with service and transport strata, flanked by the AON control and management planes)
The most important feature of an Application-Oriented Network is that it must be
configurable and application-oriented. The AON control plane provides mechanisms for
on-demand allocation and configuration of the network resources per application provider
request. For example, in most conventional networks, transport-level network topologies
are mainly determined by the topology of physical links connecting the routers. In an
application-oriented transport layer, however, network topology should be determined
based on the application requirements. It is also possible for an application to dy-
namically change the topology according to various factors, such as changes in load or
operation costs (e.g., power).
Another main functionality of the AON control and management planes is to enable
interaction between application providers and service providers. Service providers can
offer services that can be used by application providers. AON therefore provides the
main functionalities and a framework for these types of interactions and handles the
related accounting, management and monitoring issues.
Service providers can register their services in the management plane, and the control
plane allocates them to application providers on demand. The control and management
planes together handle the authentication, authorization and accounting aspects of this
interaction between application providers and service providers. In other words, service
producers do not need to know the identity of service consumers, and vice versa.
In AON, services follow generic, platform-independent and well-defined interfaces so
that they can be allocated, controlled and managed by the AON control and management
plane. The generic interface is also needed to enable application providers to incorpo-
rate the resources and service components in their applications as simply as possible.
In AON, each class of resources follows a generic interface template. For instance, pro-
grammable resources (e.g., processing resources) can follow one generic interface, while
storage resources can follow another generic interface. Network resources are another
class of resources that need their own generic interface.
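One way to picture such per-class generic interface templates is as abstract base classes, one per resource class. The method names below are illustrative assumptions, not the actual AON interfaces:

```python
from abc import ABC, abstractmethod

class ProgrammableResource(ABC):
    """Generic interface template for programmable resources
    (e.g., virtual processors, reprogrammable hardware)."""
    @abstractmethod
    def program(self, image): ...
    @abstractmethod
    def start(self): ...
    @abstractmethod
    def stop(self): ...

class StorageResource(ABC):
    """Generic interface template for storage resources."""
    @abstractmethod
    def write(self, key, data): ...
    @abstractmethod
    def read(self, key): ...

class NetworkResource(ABC):
    """Generic interface template for networking resources."""
    @abstractmethod
    def connect(self, endpoint_a, endpoint_b, qos=None): ...
    @abstractmethod
    def disconnect(self, link_id): ...
```

Any concrete resource that implements its class's template can be allocated and composed through the same control-plane machinery, which is precisely what keeps application creation simple for the provider.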
Another important function of the AON management plane is its monitoring and
measurement operations. Monitoring and measurement are needed in order to control QoS,
and are moreover required for applications to adapt to changes in the environments in
which they operate.
All the communications between the control plane and management plane need to be
secured and authenticated so that the security risks in an AON are minimized. In terms
of interactions between the application providers and the AON control and management
planes, all such communications are done through secure channels and different levels of
authentication and authorization are performed to enable secure access to AON resources.
Different access levels for application providers are also defined to efficiently
authenticate and authorize the usage of resources in the application plane. The AON
control and management planes then allocate resources to application providers based on
their profile limits, and provide isolation between the resources allocated to different
applications in the application plane so that they cannot interfere with each other's
operation. This also stops applications from performing a security attack, such as sniffing,
on another application.
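A sketch of the profile-limit check described above; the provider names, resource names and limits are hypothetical:

```python
class ProfileAuthorizer:
    """Sketch of profile-limit checking: each application provider has a
    per-resource limit in its profile, and allocation requests that would
    exceed the limit are rejected."""
    def __init__(self, profiles):
        self.profiles = profiles                # provider -> {resource: limit}
        self.usage = {p: {} for p in profiles}  # provider -> {resource: used}

    def authorize(self, provider, resource, amount):
        limit = self.profiles.get(provider, {}).get(resource, 0)
        used = self.usage.get(provider, {}).get(resource, 0)
        if used + amount > limit:
            return False  # request exceeds the provider's profile limit
        self.usage[provider][resource] = used + amount
        return True
```

An unknown provider has an implicit limit of zero, so unregistered parties are rejected by the same check that enforces registered providers' quotas.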
In the application plane, however, application providers are free to follow any security
approach they choose; AON does not limit them to a particular method or technique.
Communications between application providers and the resources in the application plane
extend beyond AON security checks, and AON does not impose any specific protocol,
format or encryption technique on these interactions either.
3.3 Application-Oriented Routers
In this section, we focus on Application-Oriented Routers (AORs) which are network-level
nodes in Application-Oriented Networks. We examine the types of functionalities which
should be embedded in these emerging network elements. We also discuss several use
cases of AORs, based on the proposed network architecture including AOR in enterprise
networks as well as in telecommunication networks and content delivery networks.
Due to the inclusion of content-delivery functions at the transport level, application-
oriented routers are able not only to perform the conventional routers' task of pure
data delivery, but also to perform content-delivery related tasks. AOR operations also
include the previously described tasks of content processing and content storage.
In order to meet high throughput and low latency requirements, applications
need processing technology that goes beyond conventional software processing. For
instance, in the case of XML processing, it has been found that, relative to transactional
database processing, the desired response times and transaction rates cannot be achieved
without major improvements in XML parsing [66, 67]. To reach the required throughput,
AORs will exploit hardware techniques for processing-intensive operations, especially in
the form of hardware-based XML processing, validation, transformation, encryption,
decryption, compression, decompression, and content-based routing [68]. Thus AORs
emerge as networking components that are high in performance and reliability,
and that include traditional layer three routing capabilities as well as the described content-
delivery tasks.
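As a concrete illustration of content-based routing, the following sketch routes a message by inspecting its XML payload rather than its packet header. The element names and the routing table are illustrative assumptions, not part of the proposed architecture; an AOR would perform the equivalent lookup in hardware.

```python
# Hypothetical sketch of content-based routing: the next hop is chosen from
# the XML root element, not from the packet header. ROUTES and the addresses
# below are assumed for illustration only.
import xml.etree.ElementTree as ET

ROUTES = {
    "order": "10.0.0.1",    # assumed order-processing cluster
    "invoice": "10.0.0.2",  # assumed billing cluster
}
DEFAULT_NEXT_HOP = "10.0.0.254"

def route_by_content(xml_payload: bytes) -> str:
    """Pick a next hop based on the root element of the XML payload."""
    root = ET.fromstring(xml_payload)
    return ROUTES.get(root.tag, DEFAULT_NEXT_HOP)
```

Any payload whose root element is not in the table falls back to the default next hop, mirroring how a conventional router would treat traffic it has no content policy for.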
Another important requirement for application-oriented networks is the ability to con-
figure network elements based on application requirements. In most conventional
networks, transport-level network topologies are mainly determined by the topology of
the physical links connecting the data routers. In an application-oriented transport layer,
however, the network topology should be determined by the application requirements:
some applications are best suited to a flat peer-to-peer topology, while others might require
a hierarchical architecture.
In AON, applications share the same infrastructure for application development and
deployment. However, there is no "one size fits all" configuration in AON; network
resources and elements are configured based on the applications' requirements. As a
result, Application-Oriented Routers are not pre-configured devices that provide some
basic functionality to all applications and force application providers to follow a
predefined configuration. On the contrary, AORs open the network-level
entities to application providers and give them the option to configure
their allocated resources as they prefer. This is achieved through virtualization
of resources and well-defined open interfaces to configure the resources on-
demand.
Figure 3.6 shows our overall view of AORs in an Application-Oriented Network. As
can be seen, a multiplicity of applications shares a converged communication and computing
infrastructure as well as the hardware and software components embedded in application-
oriented routers. Each application in this view has a resource pool in a set of AORs, and
the end-users and terminals can be part of one or more applications.
Application-oriented routers will also have to meet many of the traditional require-
ments of existing service provider infrastructure. The traditional capabilities to engineer
Figure 3.6: Overall view of an Application-Oriented Network with multiple AORs and applications
reliability and performance into the overall system will be needed, as will the incorporation
of novel self-management mechanisms that reduce operating expenses.
3.4 Application-Oriented Routers Use Cases
The success of Application-Oriented Routers depends on the value they add to the current
applications and also the facilities they provide for future ones. Therefore, in this section,
we present some use cases for application-oriented routers in the context of the proposed
architecture for application-oriented networks.
3.4.1 Telecom Service Providers
IP Multimedia Subsystem (IMS) [34] is one of the major candidate architectures for
next generation telecommunication networks. In this subsection we focus on the
potential contributions of AORs to IMS networks. IMS is an effort by the telecom-
oriented standards bodies to realize the NGN concepts and extend the new control plane
Figure 3.7: Telecommunication services in an AON
to any access network, and it presents a natural evolution from the traditional closed
signaling system to the NGN service control system. IMS was developed to control
customer access to services in Third Generation wireless access networks. In the
IMS approach, servers in the user's home service provider's network control access to
all services. Consequently, the service provider can determine what services are delivered,
at what quality level, and at what cost.
In the context of future application-oriented networks, IMS service providers can utilize
the content-processing and storage functionalities embedded in the AORs to increase
the quality level of their current services and to introduce new services using newly
available functionalities.
Among these newly available functionalities are content transformation
and transcoding, which enable connectivity between heterogeneous devices, as well as
content multicasting and other sophisticated content processing tasks such as encryp-
tion/decryption, pattern matching, and compression/decompression. The need for this
type of network support for content delivery has recently gained more attention; in [69]
the authors propose the idea of network support for content delivery in ambient
networks, which is aligned with our view of application-oriented networks.
For example, consider the case shown in Figure 3.7. User A has an active
multimedia session with user B. At one point, user A decides to change his or her device
from a SIP phone with a limited set of capabilities to a more powerful device like a laptop,
or to another SIP phone with a different set of capabilities. To do so, user A
initiates a transfer procedure and transfers the session from the first device to the
second. If, for any reason, user B's device does not support the coding required by
the new device, the transfer procedure fails. The transfer procedure might also be
unsuccessful due to incompatibilities between protocol stacks.
The storage and processing capabilities in AORs will be very useful in handling mobility-
and security-related tasks in IMS networks. For instance, in the above scenario,
in the case of a hand-off, content can be temporarily stored in an AOR while the user
device is temporarily disconnected from the access network. In this scenario, smart
caching combined with adaptive stream processing in AORs enables a fast and efficient
connection-resume phase in which the user experiences minimal disruption.
Issues like content transformation from one format to another can be solved much
more easily using AORs. For example, an intermediate AOR is able to perform the
necessary conversion between different media formats, or to transform one SIP message
into another. It can also compress, decompress, encrypt, or decrypt the content. In addition,
content validation, another common and processing-intensive task, can be performed
in AORs.
Another use case for AORs is content-based policy enforcement. For example, if user
A in Figure 3.7 sends a high-priority message in an emergency situation, an AOR can,
based on a policy, identify the priority of the message and treat it differently from
ordinary messages. In another use case, if a SIP device needs to access an XML-based
service, an AOR can perform the required transformation between the protocols.
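The emergency-message example can be sketched as a small classifier. The `Priority` header and the class names below are illustrative assumptions; real policies would be supplied by the service provider.

```python
# Illustrative sketch of content-based policy enforcement: the AOR inspects a
# SIP-like message and assigns a forwarding class from a (hypothetical)
# Priority header. Header and class names are assumptions for illustration.
def classify(message: str) -> str:
    """Return 'expedited' for emergency-priority messages, else 'best-effort'."""
    headers = {}
    for line in message.splitlines():
        if ":" in line:
            name, value = line.split(":", 1)
            headers[name.strip().lower()] = value.strip().lower()
    if headers.get("priority") == "emergency":
        return "expedited"
    return "best-effort"
```

A message carrying `Priority: emergency` would be placed in the expedited class, while all other traffic receives best-effort treatment.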
Figure 3.8: Enterprise Service Bus and AON
3.4.2 Enterprise Networks
Enterprise Service Buses (ESBs) [70] provide the functionality necessary for deploying
enterprise applications, mainly in the context of Service-Oriented Architecture [5].
In the context of Application-Oriented Networks, ESBs can offload their
processing-intensive tasks to Application-Oriented Routers. These tasks include
content validation, compression/decompression, pub/sub message delivery, rule-based
matching and content forwarding, content encryption/decryption, and content transfor-
mation.
Current proprietary ESBs use XML-processing appliances for content processing
tasks, especially encryption/decryption and content transformation [71]. These
XML-processing appliances, however, are not standardized, are not available to small
vendors, and are not affordable for many enterprises. In an Application-Oriented
Network, an enterprise can exploit the content processing facilities embedded in the
AOR to increase its application's quality and decrease expenses, eventually leading to
better quality of service at very large scales. In addition, enterprises can use
the storage capacity provided in the AORs to store their content at lower cost, and for
reliable message delivery to temporarily unavailable nodes, especially mobile nodes, in
the pub/sub model of message delivery.
For example, consider the case shown in Figure 3.8, where an enterprise uses
an ESB to support its SOA-based operations. One of the main tasks usually performed
by an ESB is content validation, which is very processing intensive and thus places
a heavy burden on the ESB. An AOR can perform this task for the ESB, and it can
also perform security-related tasks such as content encryption and decryption.
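The validation task an AOR could take over from an ESB can be sketched as follows. A real AOR would validate against a full XML schema in hardware; this minimal stand-in only checks well-formedness and an assumed required root element.

```python
# Minimal stand-in for content validation offloaded from an ESB to an AOR.
# Real deployments would validate against an XML schema; here we check only
# well-formedness and a required root element ("purchaseOrder" is assumed).
import xml.etree.ElementTree as ET

def validate(payload: bytes, required_root: str = "purchaseOrder") -> bool:
    """Accept only well-formed XML whose root matches the expected element."""
    try:
        return ET.fromstring(payload).tag == required_root
    except ET.ParseError:
        return False
```

Messages that fail this check would be rejected at the AOR, so malformed traffic never reaches the ESB's application servers.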
Another use case for AORs in an enterprise environment is content-based routing and
multicasting. This functionality is valuable when there is a request for an unspecified
service: the AOR can forward the request to a server based on a policy, in a way similar to
a distributed ESB. Given these use case scenarios, we can conclude that by using
AORs, ESBs and enterprise networks can scale and adapt to business demand with
high agility.
3.4.3 Overlay Networks and Content Distribution Networks
In the context of media and content distribution networks, one of the major issues is
effective content caching and multicasting, especially for live content streaming [72].
In Application-Oriented Networks, application providers can utilize AORs' capabilities
in both content storage and content processing to deliver high quality services to
their users. Among these applications are Video-on-Demand and TV
applications.
AON content delivery can cover functionalities provided by traditional CDNs such as
Akamai, and it can also cover advanced content delivery functions by enabling application
providers to create their own application-specific CDN architecture using features such
Figure 3.9: Peer-to-Peer network in AON
as locality and identity to provide customized content delivery for their users. Users
can interact with the content and with other users, and can produce metadata that can be
used by other application users.
As another example, we can mention peer-to-peer networks and other flat or hierarchical
overlay networks, which are currently unable to flourish due to their need for robust
in-network nodes with processing and storage capabilities. An example of these hierarchical
architectures has been studied in our research group [73]. In this structure,
unpopular content is stored on a small number of powerful nodes, while popular
content is copied onto many computers at the edge of the network and distributed in a
peer-to-peer topology. AORs' content-delivery features allow these types of new network
architectures to flourish.
In peer-to-peer networks, AORs can be used to store critical data that needs to
reside on a robust node. In addition, as shown in Figure 3.9, AORs can be used
to store popular content based on usage patterns and users' locations. As a result,
these networks can deliver better quality services while using network bandwidth
efficiently. In this application scenario, the AORs' content storage functionality
can be used to store content near the interested users.
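The popularity-driven placement described above can be sketched as a simple policy. The threshold and location labels are assumed tuning parameters for illustration; a deployed system would derive them from measured usage patterns.

```python
# Toy model of popularity-driven content placement: popular content is
# replicated to edge peers, unpopular content is kept on a few robust AORs.
# The threshold and the location labels are assumptions for illustration.
from collections import Counter

def place(request_counts: Counter, popular_threshold: int = 100) -> dict:
    """Map each content id to a placement class based on request volume."""
    return {
        cid: "edge-peers" if hits >= popular_threshold else "robust-aor"
        for cid, hits in request_counts.items()
    }
```

Re-running the policy periodically as request counts change would let content migrate between the robust core and the peer-to-peer edge.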
3.5 Related Work
There have been many attempts at developing new architectures for future networks [50,
51, 52, 53, 54]; however, our proposed AON differs from many other proposals. This
is mainly because AON is designed to serve application providers instead of end-users,
and allows a multiplicity of applications with different internal architectures to coexist
over the same shared virtualized infrastructure. In other words, multiple proposed network
architectures, each with its own design rationale, can be deployed
inside an Application-Oriented Network.
Nevertheless, among the proposed network architectures, we need to distinguish
our work from two related efforts. The first is Cisco's Application-Oriented
Network [74]. Cisco has introduced its Application-Oriented Network line of products,
which performs on-the-edge XML processing functionality, especially for Cisco's enterprise
customers. The Application-Oriented Network that we define in this chapter differs
from Cisco's AON in many aspects. In our AON we provide content delivery in addition
to message processing, and we provide a framework for allocating virtualized in-network
resources to different application providers. In other words, we create a network of
computing and communication resources and allocate them to applications on-demand.
Another related work is the Slice-based Facility Architecture (SFA), proposed by GENI
[75] for federating network research testbeds in the United States. In SFA, researchers
are able to request the instantiation of a slice of testbed resources across a federated network
of testbeds in order to perform an experiment on a new network architecture. SFA and AON
differ in many ways. SFA's goal is to enable experimentation over
multiple testbeds, and in this regard it is not designed to address future network chal-
lenges. SFA has neither a three-plane architecture nor a clear statement about content-
delivery support in future networks. Moreover, SFA does not acknowledge that the
service-oriented application creation paradigm and its related set of technologies are
crucial to the success of future networks and central to addressing their require-
ments. The last, but not least important, difference is that SFA is not concerned with
managing resources to deliver a required quality of service.
Part II
Virtualized Application Networking
Infrastructure
Chapter 4
Virtualized Application Networking
Infrastructure
In the past few years, the idea of clean-slate network design has circulated in the
networking community, and there have been several proposals for new network
architectures and protocols [50, 51, 52]. One of the major obstacles in introducing
new network architectures was, and still is, experimentation with the proposed architectures
in a large-scale environment, possibly with massive numbers of end users.
To address this problem, there have been several initiatives to build large-scale testbeds
for networking research.
GENI [55] is one of these initiatives; it tries to create a testbed by federating
different testbeds, such as PlanetLab [57, 58] and Emulab (ProtoGENI) [60], on top of
a research-dedicated network in the United States. GENI is still in the design and
development phase, but it currently follows a slice-based architecture [56, 75]. In GENI,
different testbeds connect to each other through GENI wrappers. The
exact communication protocol between the GENI wrapper and the testbed is left to each
testbed's control plane, and currently a few major control planes in GENI
are trying to federate using the wrappers.
Among the above testbeds, PlanetLab [57] is probably the most developed. PlanetLab
provides edge hosts on the Internet and implements a slice-based architecture using the Linux
vServer [76] technology. PlanetLab, however, does not have a clear solution for experimentation
with new layer three protocols, and it is not clear how it would facilitate
building high-scale new routers that require hardware-based acceleration.
In Canada, there is a research-dedicated optical network called CANARIE [77] that
provides light paths connecting universities and research centers across Canada. CANARIE
has sponsored the design and development of User Controlled Light Path (UCLP) [65]
software that enables researchers to configure CANARIE network elements
on-demand through Web Services (WS) interfaces.
Another major initiative is FEDERICA [62] in Europe, under development
through the federation of several European research network platforms such as i2CAT in
Spain and HEAnet in Ireland. FEDERICA uses the WS-based UCLP software to create
on-demand virtual networks atop the participating test platforms.
Another project for experimentation with lower layer protocols and networking algorithms
is NetFPGA [78]. NetFPGA is a PCI card with a Field Programmable Gate
Array (FPGA) chip and four Gigabit Ethernet interfaces that can be used to develop
different networking components, such as a layer three router or a hardware
accelerator.
In this chapter, we present a new testbed for networking experiments and networked
systems. This testbed differs from the above-mentioned projects in several aspects. It
benefits from a novel architecture for control and management functions capable of managing
various hardware-based and software-based resources. It also allows experimenting
with new network architectures that require in-network content processing and storage
capabilities. Moreover, it includes a new high-performance, high-throughput hardware
resource that makes experimentation with hardware-based or hardware-accelerated
networking algorithms and protocols as easy as experimentation with software-based
protocols.
Our vision in designing this testbed was to develop a converged application-oriented
computing and communications infrastructure to support an open applications marketplace.
We investigated architectural aspects of this application-oriented network and
presented a proposal in the first part of this thesis. We also investigated autonomic
management issues and proposed an approach using virtual networks in [4, 79].
The essential aspects enabling the above application-oriented environment are:
1. service-oriented application creation; 2. Infrastructure-as-a-Service methods for
configuring and scaling resources to support applications; and 3. virtualization of physical
resources.
Based on this view of an Application-Oriented Network, we began the development
of a testbed that would allow university researchers and application providers to develop
new networked systems and networking architectures. This testbed, the Virtualized
Application Networking Infrastructure (VANI), allows the creation of virtual networks
of computing and communications resources. A VANI node consists of resources such as
processing, storage, networking, and programmable hardware. A service-oriented control
and management plane allows VANI nodes to be interconnected into virtual networks to
support applications operating in the applications plane.
In the rest of this chapter, we describe the main requirements in the VANI design,
its architecture, and its main components. We also explain how our design satisfies these
requirements. Moreover, we present performance evaluations of the resources developed
for this infrastructure, including a virtualized reprogrammable hardware resource
that enables hardware-based experimentation with networking algorithms and protocols.
4.1 VANI Design Requirements
The Virtualized Application Networking Infrastructure (VANI) is a testbed that allows
university researchers and application providers to utilize its internal resources to rapidly
create and deploy networked systems, and even to experiment with new layer three protocols.
Although the underlying concepts of the VANI testbed come from our view of an
Application-Oriented Network [17], networked systems running in the VANI environment
can follow any architecture at any networking layer. The only limitation
researchers face in VANI is that their experiments must run on top of Ethernet
as layer two. Next, we describe the main requirements in designing VANI.
The first requirement is that the VANI testbed should allow experimentation with
future network architectures that might not fit the traditional layer three definitions.
Currently, networks are primarily responsible for delivering raw data, but future
network architectures may shift network tasks up to new functionalities
required by emerging applications. Among these functionalities could be
content delivery in addition to data delivery (as in the network architecture discussed
in [17]), which would imply having content processing and storage functions in the
infrastructure.
The second main requirement was to allow researchers to experiment with new layer
three protocols (in the traditional definition of L3) instead of the current Internet
Protocol. To do so, we designed the testbed assuming that everything above layer two
can be redesigned and experimented with, and we chose the Ethernet protocol as the basis
of our layer two design.
The third main requirement is to be able to set up experiments or create
new applications rapidly, using already-developed, ready-to-use components that
can be accessed through open interfaces. These components can be virtualized
resources such as processing, low-latency hardware processing, and accelerator nodes, or
software components such as the event processors that are used in many experiments for
Figure 4.1: VANI design requirements
data gathering and analysis. This requirement is satisfied through the use of SOA
technologies and standards that allow flexible and dynamic composition of
reusable service components.
The fourth main requirement was to provide an isolated and secure environment for
researchers to carry out their experiments and develop their networked applications. This
requirement has to be satisfied at different levels, such as traffic separation, bandwidth
allocation, storage access, secure access to the physical resources, and isolation between
different physical resources. The fifth main requirement was monitoring and debugging
mechanisms. In our design, we envisioned powerful complex event processing
components that can be customized to gather and analyze test and debugging data for
each experiment separately, as well as for the testbed itself.
4.1.1 VANI Architecture
Based on these main requirements, we designed a two-plane architecture for our platform:
a control and management plane (VANI-CMP) and an applications plane (VANI-AP).
VANI-CMP is responsible for virtualizing physical resources and allocating them to
Figure 4.2: VANI architecture
the researchers and application providers. Researchers deploy their
applications and experiments in the VANI applications plane (VANI-AP). Applications
operating in the applications plane can have their own architecture inside an applications-
plane slice created by VANI-CMP.
For example, an experiment or application could be a new layer three protocol that
covers OSI layer three and four functions, could replace the TCP/IP layer, or could be a new
content delivery network. Figure 4.2 shows this architecture, including its two planes.
All virtualized resources and service components that researchers can use to
create an application reside in the applications plane. Researchers request these
resources through the testbed control and management plane, and can then connect
directly to the virtualized resources in the applications plane through any resource-specific
protocol such as HTTP, UDP/IP, or ssh.
For example, a user can request to upload or download a file to the storage
service through the control plane; then, if permitted by the control plane, the user
directly contacts the storage file service over an HTTP/TLS connection to download or
upload the files.
Figure 4.3: Researcher interaction with VANI planes
The VANI control and management plane (VANI-CMP) is responsible for allocating testbed
resources to researchers. Researchers request a resource through VANI-CMP's
Web Service interface. A WS interface was chosen due to its universal acceptance for
SOA and the abundance of available tools for orchestrating and creating new applications
from independent Web Services.
After receiving a resource request from a researcher, VANI-CMP authenticates
the researcher, authorizes the request, and then sends the request to the resource
virtualization layer, which abstracts a physical resource and offers it as a service
to the control and management layer. If the allocation is successful, VANI-CMP
records the allocation and replies to the researcher with a successful return result.
VANI-CMP also programs and releases resources whenever an authorized researcher
requests it. Figure 4.3 depicts the logical view of the VANI testbed and how
a researcher interacts with the VANI planes.
4.1.2 Current Physical Resources in VANI (VANIv1 Resources)
Currently, several physical resources have been virtualized and made available to VANI
users. The design and development details of these resources are presented in [8];
here, we briefly overview the resources and the types of functionality they offer
to researchers.
In VANI, all physical resources are virtualized. Through virtualization, we separate
applications from their underlying physical resources. To do so, we developed a virtualization
layer and virtualization agents for each physical resource, as shown in Figure 4.4.
The task of the virtualization layer is to coordinate the system-wide virtualization of a
resource and to expose the resource to the rest of the system as a service component with
a Web Service interface; the agents' task is to launch or destroy virtual resources
on top of each physical resource.
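The split between the coordinating layer and the per-resource agents can be sketched as follows. The class and method names are our own illustrative assumptions, not VANI's actual API.

```python
# Conceptual sketch (names assumed, not VANI's API) of the virtualization
# layer/agent split: each agent launches or destroys virtual resources on one
# physical resource; the layer coordinates agents system-wide and would
# expose the result as a Web Service to VANI-CMP.
class VirtualizationAgent:
    def __init__(self, capacity: int):
        self.capacity = capacity  # virtual resources this physical box can host
        self.active = 0

    def launch(self) -> bool:
        """Start one virtual resource if the physical resource has room."""
        if self.active < self.capacity:
            self.active += 1
            return True
        return False

    def destroy(self) -> None:
        """Tear down one virtual resource."""
        self.active = max(0, self.active - 1)

class VirtualizationLayer:
    """System-wide coordinator: tries each agent until one can allocate."""
    def __init__(self, agents):
        self.agents = agents

    def allocate(self) -> bool:
        return any(agent.launch() for agent in self.agents)
```

The layer hides which physical resource actually hosts the virtual instance, which is exactly the abstraction the control plane relies on.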
The first physical resource that we virtualized is the reprogrammable hardware
resource. To develop this resource we used BEE2 boards [80]. Each BEE2 board
has four high-end Xilinx Field Programmable Gate Arrays (FPGAs), each connected to
four 10GE interfaces. We have virtualized all four FPGAs on a BEE2 board so that a
researcher can ask for one or more FPGAs and program them as desired.
Researchers can request an FPGA through the control plane and then program,
configure, or release it. They also have access to libraries for controlling the 10GE
interfaces and other commonly used hardware blocks such as DDR2 memory
modules. After programming an FPGA, a researcher can connect directly to the FPGA
through the 10GE interfaces using whatever protocol is designed for that FPGA. For
example, a researcher can use one FPGA or all four FPGAs to develop a layer three router
Figure 4.4: Virtualizing physical resources in VANI
with 4x10GE or 16x10GE ports, or a content-based router that routes packets
based on packet payloads rather than their headers. We present the performance
evaluation results for this hardware resource in the performance evaluation section of
this chapter.
Another physical resource in the VANI testbed is the processing resource. The processing
service is developed based on the Linux vServer [76] technology. Linux vServer is
OS-level virtualization software that creates a virtual processing node on top of a
Linux kernel. Researchers can obtain a processing resource through VANI-CMP and
release it whenever they wish. Once a virtual processing node is allocated, the
researcher can ssh directly to the node. Researchers can also program the virtual
processing node with a specific image, create an image of their own, save it on the
storage service, share it with others, or program other virtual nodes with that image.
We have also virtualized the internal fabric of the testbed for creating virtual networks.
The internal fabric consists of a set of high-capacity Ethernet switches that
isolate traffic between different applications and experiments by creating separate virtual
LANs. Moreover, it allows different experiments to intercommunicate through shared
virtual LANs that all of them can access. This resource, together with the processing resource,
enables VANI to guarantee bandwidth for an experiment. We discuss this feature in
more detail in the bandwidth guarantee section.
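The VLAN-based isolation performed by the fabric can be modeled as a small allocator. The VLAN ID range is an assumption for illustration; the real fabric would program the switches accordingly.

```python
# Toy model of VLAN-based traffic isolation in the internal fabric: each
# experiment receives its own VLAN so traffic cannot mix. The VLAN ID range
# is an illustrative assumption.
class FabricModel:
    def __init__(self, vlan_ids=range(100, 4095)):
        self._free = list(vlan_ids)
        self.by_experiment = {}

    def isolate(self, experiment: str) -> int:
        """Assign the next free VLAN to an experiment and record it."""
        vlan = self._free.pop(0)
        self.by_experiment[experiment] = vlan
        return vlan
```

A shared VLAN for intercommunicating experiments would simply be one further allocation that several experiments are granted access to.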
The gateway and bridge resources enable communication between different VANI nodes.
If a resource in VANI needs to be accessible from the Internet or from a resource
in another VANI node, it can request a public address through the gateway service and
keep that address for the duration that external access is needed. The researcher can
release the public address when it is no longer needed.
The bridge service is used for experiments involving new layer three protocols on top
of an Ethernet network. Using the bridge service, a researcher can send and receive layer
two Ethernet frames to and from any other VANI node, and hence can develop and
test new layer three protocols over a wide area network. This functionality is only
available if the VANI nodes are connected by a wide area Ethernet network. We
discuss this case in more detail later.
Another physical resource developed for VANI is the storage resource. The storage
resource is implemented on a set of distributed file servers that together emulate one large storage
server. Researchers are able to connect to the storage service through VANI-CMP and
then directly connect to a file server for uploading and downloading files. All the direct
communications to the file servers for uploading and downloading files are done over a
secure HTTP/TLS connection. Researchers can use this service to store images for pro-
gramming other resources such as the processing resource and the reprogrammable hardware
resource, and they can also share files with other researchers through this service.
Figure 4.5: A sample interaction between a researcher and VANI to secure a resource
(getResource, programResource, and releaseResource calls, each with authentica-
tion/authorization and accounting steps)
4.1.3 Example: Requesting a Resource in VANI
Figure 4.5 shows a sample message exchange scenario between a researcher, the VANI
control and management plane and physical resources inside a VANI node. A researcher
requests a resource by invoking the getResource operation of the VANI-CMP
WS interface. In that request, the researcher includes the type of resource, the duration,
and the number of required resources.
VANI-CMP authenticates and authorizes the request and forwards the request to the
resource. All resources in the testbed expose their operations to VANI-CMP through a
generic WSDL interface. This makes it possible to easily extend the types of resources
and services in the testbed without changing the control and management software.
The resource responds back to the control plane request with a success result, and
a Universally Unique IDentifier (UUID) for the resource. The control plane stores this
returned UUID and passes it to the researcher. The researcher can program the resource
identified by the returned UUID, and release it at a later time.
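The exchange above can be sketched in a few lines (hypothetical Python class and method names mirroring the WSDL operations; authentication, authorization, and accounting are reduced to comments):

```python
import uuid

class Resource:
    """Stand-in for a physical resource exposing the generic WS interface."""
    def __init__(self):
        self.allocated = {}

    def get_resource(self, kind, duration, count):
        rid = str(uuid.uuid4())            # the resource generates its own UUID
        self.allocated[rid] = (kind, duration, count)
        return rid

    def program_resource(self, rid, image):
        assert rid in self.allocated
        return "programmed"

    def release_resource(self, rid):
        del self.allocated[rid]

class ControlPlane:
    """Authenticates the request, forwards it, and records the returned UUID."""
    def __init__(self, resource):
        self.resource = resource
        self.bookings = {}                 # uuid -> user, for accounting

    def get_resource(self, user, credentials, kind, duration, count=1):
        # auth/authz and record keeping would happen here before forwarding
        rid = self.resource.get_resource(kind, duration, count)
        self.bookings[rid] = user
        return rid

cp = ControlPlane(Resource())
rid = cp.get_resource("alice", "secret", kind="processing", duration=3600)
print(cp.resource.program_resource(rid, image="debian.img"))  # programmed
```

Note how the UUID travels from the resource through the control plane to the researcher, so the control plane can account for the booking without the resource ever learning who requested it.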
In the next section, we delve into the control and management design and we describe
its main functionalities in detail.
4.2 VANI Control and Management Plane (VANI-CMP)
In this section, we describe in detail the main functions of VANI-CMP. We also discuss
the main technologies that are used in design and development of VANI-CMP. These
technologies are mainly SOA-based technologies such as the Enterprise Service Bus (ESB)
[81] and the Business Process Execution Language (BPEL) [25] orchestrator engine.
VANI-CMP is responsible for performing AAA operations and allocating resources to
researchers and application providers. In addition, it performs user management
functions, and stores and manages the testbed configuration data. It also has a registry
for all services and resources that can be used by researchers for creating a new application
or experiment setup.
VANI-CMP is designed and developed using BPEL and deployed on an Enterprise
Service Bus. Similarly to the resources and services inside the testbed, all internal com-
ponents and functions of VANI-CMP have also been developed as independent service
components, and are accessed through Web Services.
The use of ESB and Web Services enables VANI-CMP to be easily extended in func-
tionality and accessed through other types of interfaces in the future. This design choice
also enables independent development, testing, and redeployment of internal functions of
VANI-CMP such as AAA operations, configuration management, etc. Moreover, the use
of the BPEL language for VANI-CMP enables a high-level description of the VANI control
and management operations. This enables rapid and easy modifications of the control
and management logic.
In the next subsections, we examine each of the functionalities of the control and
management plane and we describe the design steps and interfaces of each of the modules.
4.2.1 User Management
Three concepts are used to manage users in VANI: application plans, service levels, and
plan administrator levels. Application plans are used to represent different experiments and
to organize resources and resource usage in each experiment. When booking a resource,
the researcher must specify which plan (experiment) the resource is being booked on.
Any researcher belongs to a service level which governs what control operations s/he is
allowed to call and also how much of each resource s/he is allowed to book. Custom
service levels may be designed for specific users in order to maintain flexibility. Lastly,
plan administrator levels are used to govern access to certain resources. Resource users
will be granted specific levels of access defining their ability to release, program, save,
etc.
4.2.2 Authentication Authorization Accounting
The control software is responsible for handling authentication of users. All operations
in the control plane require users to provide credentials. Currently, credentials are in the
form of a user name and password combination; however, the implementation allows this
to be easily changed. On every call to the control software, the user is authenticated and
a check is made to ensure that the user has the rights to execute the requested operation.
In addition to authentication, the control software is responsible for authorizing access
to resources. Every access to a resource involves two checks: ensuring that the resource
belongs to the user, and that the user has the rights to manipulate the resource as requested.
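The two checks can be illustrated as follows (a minimal sketch with hypothetical data structures; in VANI-CMP the checks run inside the BPEL authentication component against the data store):

```python
def authorize(bookings, permissions, user, rid, action):
    """Two checks before any resource manipulation (illustrative data
    structures only): the resource must belong to the user, and the user
    must be allowed to perform the requested action."""
    owns = bookings.get(rid) == user
    allowed = action in permissions.get(user, set())
    return owns and allowed

bookings = {"res-42": "alice"}                      # uuid -> owner
permissions = {"alice": {"program", "release"}}     # user -> allowed actions
print(authorize(bookings, permissions, "alice", "res-42", "program"))  # True
print(authorize(bookings, permissions, "bob", "res-42", "release"))    # False
```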
In order to prevent outsiders from directly accessing resources and bypassing the
control plane, all requests to resources require credentials known only to the control
plane. This credential is generated when resources are initialized.
The control software keeps a record of every time a resource is booked or released. This
keeps an account of which resource was used by which user (on which plan) and for
how long as well as all resources currently in use. Resources are identified by a UUID
generated by the resource and passed back through the control plane.
4.2.3 Resource Allocation
Resources are booked through the control plane whether the user is a researcher or an
application provider building a resource on top of another. Users provide their credentials
and specify which resource they wish to book (on which VANI node) and the plan to
which the resource will belong. The control plane ensures the user is allowed to book the
resource and determines the location (WSDL address) of the resource in the network. A
getResource request is then made to the resource. The resource does not know who is
requesting the resource as this information is hidden by the control software. If successful,
the resource will return a UUID identifying the resource as well as any other relevant
data which is then passed back to the user. The UUID is used by the control plane for
accounting purposes.
4.2.4 Generic Resources and Registration
New resources can be made available dynamically in the control plane through a reg-
istration operation. A new resource registration must include a unique name, a service name,
a port name, one or more WSDL addresses, and optionally a JNLP address for the re-
source's GUI. The service and port name are used to create an endpoint reference which
is assigned to the partner link when the resource is to be accessed. The resource may
have multiple WSDL addresses if there are different instances of the resource on differ-
ent VANI nodes. The control software will select the appropriate address depending on
which node the user is attempting to access. Lastly, a JNLP address may be included
which allows resource creators to design and deploy their own GUI using Java web start
technology [82].

    <xsd:element name="getRequestGenericContents">
      <xsd:complexType>
        <xsd:sequence>
          <xsd:element name="internalIP" type="xsd:string"></xsd:element>
          <xsd:element name="uuid" type="xsd:string"></xsd:element>
        </xsd:sequence>
      </xsd:complexType>
    </xsd:element>

Figure 4.6: A sample schema for generic XML content in a getRequest response message
In order for resource creators to dynamically add new resources to the control plane,
it is necessary to use a generic WSDL interface for all resources. The main objective
with the generic interface is to provide a template that makes creating resources easy
while providing flexibility. This is accomplished by providing a number of operations
and messages that are common to many resources, such as get, release, and program.
To maintain flexibility, each operation contains an optional XML string which can be
used to customize the data that is passed in and out (figure 4.6). Furthermore, a generic
operation is included in the WSDL which can be used to support operations not already
covered by the template.
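A minimal sketch of such a template, with a get operation returning XML shaped like the schema of figure 4.6 (the Python class and method names are hypothetical stand-ins for the WSDL operations):

```python
import uuid
import xml.etree.ElementTree as ET

class GenericResource:
    """Template with the operations common to most resources; each call
    accepts an optional XML string for resource-specific data."""

    def get(self, xml_extras=""):
        # respond with generic XML content shaped like figure 4.6
        root = ET.Element("getRequestGenericContents")
        ET.SubElement(root, "internalIP").text = "10.0.0.5"
        ET.SubElement(root, "uuid").text = str(uuid.uuid4())
        return ET.tostring(root, encoding="unicode")

    def release(self, rid, xml_extras=""):
        return "released"

    def program(self, rid, image, xml_extras=""):
        return "programmed"

    def generic(self, operation, xml_extras=""):
        # catch-all for operations not covered by the template
        raise NotImplementedError(operation)

xml_out = GenericResource().get()
print(ET.fromstring(xml_out).find("internalIP").text)  # 10.0.0.5
```

The optional XML string plays the same role as the schema in figure 4.6: common operations stay fixed, while each resource type can still pass its own data in and out.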
4.3 SOA-Based Implementation of VANI-CMP
The Control and Management software is implemented as a collection of web services and
BPEL models. The design is modular and flexible allowing for components to be replaced
or changed as required. The control plane is a BPEL model wrapped with a WSDL
exposing a number of operations for application providers and researchers. Currently,
there are five key components, each implemented as a BPEL model: authentication, data
store, resource manager, storage manager, and the dynamic partner link generator. In
this section, a brief description of each component is provided before focusing on how
the components fit together. For more information on each component, please refer to
the relevant section.
The data store stores all the data required by the control software. This may include
user authentication data, resource allocation and accounting, and network data. The
authentication component is responsible for checking user credentials as well as ensuring
users have the rights to execute operations. The storage and resource manager are used
to access the resources on the network. The managers determine the location (stored
as a WSDL address) of the resource in the network before forwarding requests to the
appropriate location. The dynamic partner link generator is used throughout the control
plane to dynamically choose an endpoint reference. This allows calls to be made to
different web-services determined at run time (provided they have the same interface).
The data store consists of a MySQL database, a BPEL model and three web services:
query generator, database, and result processor. The query generator has a number of
operations used to generate different SQL queries. The database has one operation which
takes in a SQL query and returns the result of the query in XML. This web service has
a socket connection to a database subagent which executes the query on the MySQL
database. The third web service processes the XML result.
The authentication component is implemented using a BPEL model and makes use
of some of the operations provided by the data store. It provides a number of operations
to check user login credentials and to ensure that users have permission to execute the
requested operation. In addition, this component is responsible for ensuring users have permission
to book or manipulate (release, program etc.) booked resources.
The dynamic partner link generator is used to dynamically assign an endpoint refer-
ence to a partner link. First, a call is made to the data store to determine the WSDL
address, service name, and port name; these are then passed to a web service which wraps
the service name as a QName and the port as an NCName. An endpoint reference is then
created using the WSDL address, service name, and port name, and returned.
The resource and storage manager provide an interface for accessing resources avail-
able on the network and storage. A call is made to the dynamic partner link generator to
dynamically assign an endpoint reference to the partner link depending on which resource
is being accessed.
4.4 Security in VANI
One of the basic requirements in the VANI design was to ensure that experiments run in
an environment that is secure and isolated from other applications and experiments. To
create this secure environment, we have to consider security issues in various parts of the
system architecture.
The first part is to secure the communications between the researchers and VANI-
CMP. In VANI, all communications between these two entities are encrypted using secure
SSL connections and the WS-Security specification. To do so, each researcher has to share
his/her public key with VANI (and vice versa). On top of that VANI-CMP authen-
ticates the researchers and application providers using the credentials provided in all
transactions, and then, authorizes the researcher’s access level to the resource.
The second part is the communications between the resources and VANI-CMP. These
communications have also been encrypted. Moreover, credentials only known to the
resource and VANI-CMP are included in all communications from VANI-CMP to the
resources.
All internal traffic within one experiment is separated from other experiments using
tagged Ethernet VLANs. By proper configuration of the testbed internal fabric resource,
we are able to isolate these tagged VLANs from each other. This case is discussed in
more detail in the bandwidth guarantee section.
Communications inside the applications plane, internal to one experiment, or coming
to and from that experiment may or may not be encrypted, depending on the experiment,
and are therefore outside the scope of the VANI design. This allows researchers to
freely design and develop new encryption and decryption algorithms in different layers
inside their application plane slice.
4.5 Guaranteeing Bandwidth in VANI
In order to make sure that one experiment cannot undermine another experiment’s ca-
pability to send and receive traffic, we need to have a bandwidth guarantee mechanism
in place. Likewise, for communications between different VANI nodes, there should be a
rate guarantee in place so that a distributed experiment could have a guaranteed access
to the available bandwidth.
Since all communication in VANI is carried over VLAN-tagged Ethernet frames, we
developed an Ethernet rate limiting mechanism in the processing nodes. With this mechanism,
we limit the rate at which each virtual processing node sends traffic to and receives traffic
from other virtual processing nodes inside a VANI node. To guarantee the send
and receive rate, we designed and developed a novel Ethernet traffic shaping system,
called Distributed Ethernet Traffic Shaping system (DETS) [9] that we describe in the
next chapter.
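DETS itself is presented in the next chapter; as background, a per-node Ethernet rate limit can be understood as a token-bucket shaper along these lines (a generic sketch, not the actual DETS design):

```python
class TokenBucket:
    """Generic token-bucket shaper: tokens (bytes) refill at the configured
    rate; a frame may be sent only if enough tokens are available."""
    def __init__(self, rate_bytes_per_s, burst_bytes):
        self.rate = rate_bytes_per_s
        self.burst = burst_bytes
        self.tokens = burst_bytes
        self.last = 0.0

    def allow(self, frame_len, now):
        # refill tokens for the elapsed time, capped at the burst size
        self.tokens = min(self.burst,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if frame_len <= self.tokens:
            self.tokens -= frame_len
            return True        # frame may be sent
        return False           # queue or drop until tokens refill

tb = TokenBucket(rate_bytes_per_s=12_500_000, burst_bytes=15_000)  # ~100 Mbps
print(tb.allow(1500, now=0.0))   # True: the burst allowance covers one frame
```

Over any long interval the admitted bytes cannot exceed rate x time plus one burst, which is the kind of per-virtual-node send/receive bound the text describes.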
The gateway and bridge services also control the rate at which an experiment sends
(receives) traffic to (from) the VANI wide area network. The wide area network that is
used to connect the VANI nodes would be a research-dedicated network like CANARIE
[77] or ORION [83] that can guarantee the aggregated traffic to/from the VANI nodes.
If the wide area network were able to provide dynamic and on-demand bandwidth allo-
cation, VANI would be able to use this functionality whenever an experiment asks for
sending/receiving traffic to/from the wide area network. VANI nodes could also be con-
nected to the public Internet network, however, bandwidth could not be guaranteed for
the experiments in this case.
To request a bandwidth guarantee in VANI, a researcher can specify the bandwidth re-
quirements of a virtual processing node in the resource get request. Likewise, a bandwidth
requirement can be specified when access to the VANI wide area network is requested.
The virtualization layer in VANI control and management plane makes sure that the
specified requirements are met when allocating virtual resources to the experiment.
If an experiment needs more VLANs, it can simply ask for a new VLAN to be added to
the experiment. Also, if separate experiments or applications need to intercommunicate
over VLANs, one of them can ask the control plane to create a shared VLAN, and then
add the other experiments to it.
Experiments can also communicate through the gateway, using the public addresses
allocated via the bridge and gateway services.
4.5.1 Interconnecting VANI Nodes in IP Layer
Figure 4.7: Connecting VANI nodes in IP layer (GW = gateway, VR = virtual resource;
virtual resources on private 10.X.X.X addresses reach the IP network through gateways
with public IP addresses)
Figure 4.7 shows how we can set up an experiment or create a distributed application
across a wide area IP network. In this setting, all resources inside an experiment in a
VANI node get a local IP address in the 10.X.X.X range. All resources can send traffic
to the wide area network using the NAT functionality implemented in the gateway service
(shown as GW in figure 4.7). It is possible to put multiple gateways in place and direct
outgoing traffic to different gateways to avoid bottlenecks in the system.
On the other hand, if a resource needs to be accessible from the wide area network, the
researcher can ask the gateway service for a public address/name, and the gateway service
redirects all traffic to that public address to the resource’s internal IP address/VLAN.
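The gateway's address handling can be pictured as a simple lease table (hypothetical API; the public addresses come from the 203.0.113.0/24 documentation range, not from a real deployment):

```python
class Gateway:
    """Lease-table sketch: public addresses are handed out on demand and
    mapped to an internal IP/VLAN pair until released."""
    def __init__(self, public_pool):
        self.free = list(public_pool)
        self.nat = {}                      # public addr -> (internal ip, vlan)

    def lease(self, internal_ip, vlan):
        public = self.free.pop(0)          # take the next free public address
        self.nat[public] = (internal_ip, vlan)
        return public

    def release(self, public):
        del self.nat[public]
        self.free.append(public)           # return the address to the pool

gw = Gateway(["203.0.113.10", "203.0.113.11"])
pub = gw.lease("10.0.1.5", vlan=101)
print(pub, "->", gw.nat[pub])   # 203.0.113.10 -> ('10.0.1.5', 101)
```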
4.5.2 Interconnecting VANI Nodes in Ethernet Layer
Figure 4.8: Connecting VANI nodes in Ethernet layer (BR = bridge; per-node VLANs
carried over a wide area Ethernet network using Q-in-Q tag 100)
Figure 4.8 shows an Ethernet connected VANI. Ethernet connected VANIs use the
bridge service instead of the gateway service to interconnect. Inside a VANI node, all
resources in an experiment communicate using a specific VLAN which is unique to the
VANI node. If an experiment needs to operate across multiple VANI nodes (for instance,
to test a new layer three protocol), the VANI wide area network has to be able to
transfer Ethernet frames. In this case, a unique Q-in-Q tag [84] would be assigned
to the experiment. The bridge service would be used to re-frame the internal tagged
Ethernet frames into wide area Q-in-Q frames, and the destination bridge would do the
reverse operation, and deliver the Ethernet frames to the destination MAC/VLAN in the
destination VANI node.
Since Q-in-Q tagged Ethernet frames might not be available in a wide area network,
we are able to define public MACs that can be used for redirecting traffic to an internal
MAC/VLAN by the bridge service. This functionality would enable any other Ethernet-
based experiment to send Ethernet frames to a resource in another experiment through
the bridge service.
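The re-framing step amounts to inserting an 802.1ad outer tag ahead of the experiment's inner 802.1Q tag, which can be sketched as follows (simplified: PCP/DEI bits are zero and the frame check sequence is ignored):

```python
def add_qinq_tag(frame: bytes, outer_vlan: int) -> bytes:
    """Insert an 802.1ad (Q-in-Q) outer tag right after the MAC addresses,
    leaving the experiment's inner 802.1Q tag intact."""
    outer = (0x88A8).to_bytes(2, "big") + outer_vlan.to_bytes(2, "big")
    return frame[:12] + outer + frame[12:]   # dst(6) + src(6) + outer + rest

# inner frame: dst MAC, src MAC, 802.1Q tag (TPID 0x8100, VLAN 10), payload
inner = bytes(6) + bytes(6) + bytes.fromhex("8100000a") + b"payload"
framed = add_qinq_tag(inner, outer_vlan=100)
print(framed[12:14].hex(), framed[16:18].hex())   # 88a8 8100
```

The destination bridge performs the reverse: it strips the four outer-tag bytes and forwards the frame to the destination MAC/VLAN inside its VANI node.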
4.5.3 Experimentation with L3 Protocols
Figure 4.9 shows how the testbed could be used to test a new layer three protocol in
a large-scale and distributed environment using proxy nodes. In this setting, the new
L3 protocol is tunneled within an IP payload to a resource inside a VANI node; that
resource then strips off the IP header and feeds the new L3 packet over the VANI wide
area Ethernet network.
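The de-tunneling step at the proxy resource amounts to stripping the IPv4 carrier header, as in this simplified sketch (it trusts the IHL field and ignores fragmentation and checksums):

```python
def strip_ip_header(packet: bytes) -> bytes:
    """Recover a tunneled L3 packet from an IPv4 carrier: the low nibble of
    the first byte (IHL) gives the header length in 32-bit words."""
    ihl = (packet[0] & 0x0F) * 4          # IPv4 header length in bytes
    return packet[ihl:]

# minimal 20-byte IPv4 header (version 4, IHL 5) carrying a new-L3 payload
carrier = bytes([0x45]) + bytes(19) + b"NEW-L3-PACKET"
print(strip_ip_header(carrier))   # b'NEW-L3-PACKET'
```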
4.6 SW-Based Resources in VANI
One of the main contributions in our testbed control and management plane is that we
could encapsulate any software or hardware resource in our testbed as a service. To do
so, the resource can be virtualized, and abstracted as a service component that follows a
generic resource WSDL template. Then it can be registered into the control plane and
made available to other researchers. Details on how this task can be accomplished have
been discussed in the control and management plane section of this chapter.

Figure 4.9: Large scale experimentation with new L3 protocols (access through IP
tunnels; e.g., a "Red" network protocol stack deployed in slices of VANI nodes and
tested at scale)
Examples of such resources offered as a service include any hardware function or resource
that can be reused in different applications and experiments, such as hardware accelerators
for encryption, decryption, content conversion, and content compression/decompression.
Also other reconfigurable hardware modules such as NetFPGA could be virtualized and
offered to the researchers on an on-demand basis.
Other types of processing nodes could also be offered to the researchers as a resource.
For example, Amazon Elastic Compute Cloud (EC2) nodes [27], GENI virtual pro-
cessing nodes, VMware-based virtualized processing nodes [85], or Graphics Processing
Units (GPUs) could be controlled and managed by VANI-CMP.
Moreover, software services such as a BPEL orchestrator engine and a Complex Event
Processing (CEP) engine could be developed and/or deployed on top of current virtual
resources and made available to the researchers through VANI-CMP.
Currently, we have developed and deployed several software-based resources as service
components in VANI. In this section, we briefly go over these resources and we describe
what functionalities each resource provides:
1. BPEL orchestrator as a service is able to execute a BPEL project and to orchestrate
a composite application.
2. Complex Event Processing as a service is customizable to receive events from
different sources using different protocols (JMS, SNMP, etc.). This service is able
to analyze received events, produce notifications and events, and send them to
different destinations using different protocols. We have used this service for
performance monitoring and analysis of VANI.
3. Database as a service is able to store, search, and retrieve data on demand. Re-
searchers can get this resource, program it using a database file, and query it by
sending SQL commands over a WS interface to the database resource. The DB
resource uses the MySQL engine and stores its data on VANI's storage service.
4. Sensor as a service is able to manage data from different sensors and forward it,
on demand, to wherever a researcher asks. For example, a researcher can ask for
sensor data on wind or sun conditions in a specific location for a limited time.
This allows creation of many new applications and experiments using the sensor
service.
5. GENI federation service enables access to PlanetLab GENI resources through VANI-
CMP for the researchers connected to VANI. We discuss this service in the next
section in more detail as we describe the interconnection between VANI and GENI.
4.7 Federation with GENI
GENI is an initiative to enable large-scale experiments through federation between
different testbeds. Federation in GENI is done using GENI wrappers. A GENI wrapper
is developed for each testbed, and testbeds connect to each other through these wrappers.
In VANI, we developed a wrapper for the control and management plane, and through that
we invoke GENI wrapper operations to get a node on any GENI testbed. We tested our
wrapper with the PlanetLab GENI wrapper and managed to obtain a PlanetLab processing
node through our VANI-CMP.

Figure 4.10: Connecting VANI to GENI
In VANI, researchers are able to get PlanetLab processing resources using the VANI
generic resource template. Since PlanetLab does not offer a storage service, and also
does not support other VANI features such as processing and bandwidth guarantees,
access to PlanetLab processing resources does not include these functionalities.
Figure 4.10 shows the structure of the interconnection between VANI and PlanetLab through
the GENI wrappers. Currently, we are in the development phase of offering VANI re-
sources to GENI researchers through the VANI wrapper.
4.8 A VANI Node
A VANI node is composed of the resources described in this chapter, their corresponding
virtualization software, control and management software, and the storage service. A
VANI node can be deployed entirely on a computer cluster composed of ordinary computing
blades and manageable Ethernet networking elements. The basic resources in a VANI
node are the processing resource, the storage service, and the fabric service for the network
virtualization that are deployed on a computer cluster.
All other resources and the control and management software are deployed on these
basic services. In addition, all other software-based resources, the virtualization layer
for resources such as the reconfigurable hardware resource, and the VANI wrapper for
connecting to GENI testbeds are also deployed on these basic resources.
The only elements that cannot be found in a normal computer cluster are the reconfig-
urable hardware resources, the gateway and bridge services, and the required 10GE Ethernet
switches. These resources are also co-located with the computing cluster to provide
the WAN connectivity and to enable running experimentation with the reconfigurable
hardware resource.
In the future, we will publish instruction manuals on how to connect to the VANI
control and management plane and how to access resources through the developed GUI
as well as the secure WS interfaces. We will also describe how application providers can
access all the features described in this chapter, including registering a new service in
VANI.
4.9 Performance Evaluations
So far, we have presented the VANI architecture and discussed different aspects of its
design. To determine whether the currently developed resources can meet the VANI design
requirements, we performed several experiments on those resources. In this section, we
present performance measurements on two key physical resources that have been virtualized
and offered to researchers in VANI. The first one is the reprogrammable hardware resource, and
the next one is the processing resource. Our main focus in this part is to see whether
we can guarantee the promised quality of service to the researchers that use these
resources in their experiments.

Figure 4.11: Reprogrammable Hardware (BEE2 Board): a control FPGA and four user
FPGAs with 10 Gbps Ethernet ports, DDR2 DIMM slots, and 20/40 Gbps inter-chip links
4.9.1 Reprogrammable Hardware Resource
By introducing a virtualized and reprogrammable hardware resource in VANI, we enable
researchers to test new networking algorithms and protocols using high performance and
high throughput hardware resources. To do so, we virtualized BEE2 boards developed at
the University of California at Berkeley. A BEE2 board consists of one controlling FPGA
and four high-capacity Xilinx Virtex-II FPGAs (figure 4.11) that can be programmed by
users. Each FPGA has four 10GE interfaces and 4 GB of memory.
In VANI, a researcher can get a set of FPGAs on a BEE2 board, and can ask for
on-board inter-chip communication channels which can carry up to 5 GigaBytes per
second (GBps). The detailed design of the BEE2 virtualization system, and of its introduction
as a resource in VANI, can be found in [8]. Here, we present the performance measurements
on this resource. The parameters of interest are the programming time of the FPGAs
through the virtualization software, as well as the speed at which the FPGAs can send
and receive data.
The first parameter is the time it takes a researcher to program an FPGA through
the testbed control plane. We would also like to know how this time changes if
four researchers want to program all four FPGAs concurrently. To do so, we developed
a bitstream that initializes all 10GE interfaces on the FPGAs and starts sending a burst
of UDP/IP packets on one of its 10GE interfaces, and we programmed the FPGAs through
VANI-CMP using the generated bitstream several times. Table 4.1 shows the average
maximum time it takes to program one, two, three, and four FPGAs.
As can be seen, it only takes 30 seconds on average to program an FPGA in the case
where all four FPGAs are programmed concurrently, and this time is around 11 seconds
if only one FPGA is programmed at a time.
This fast programming time allows a researcher to get an FPGA with four 10GE
interfaces in less than a minute, run an experiment, and return the FPGA to the VANI
resource pool as soon as it is no longer required.
In the next experiment, we measured the speed at which the
FPGAs can send and receive traffic. To do so, we developed a traffic generator using
Verilog hardware description language, and we started sending traffic from one 10GE
interface to another 10GE interface on the same FPGA, and we recorded the maximum
bandwidth that we could receive in the hardware resource. We also compared this with
the traffic statistics gathered by the Ethernet switch connected to the FPGA. We repeated
this experiment several times and were able to send and receive Ethernet frames at a
rate of 1 GBps, which is equal to 8 Gbps. The reason that we could not send more traffic is
the 8b/10b encoding mechanism of the 10GE-CX4 interfaces; 8 Gbps is thus the maximum
achievable traffic rate per port on a BEE2 board. In our measurements, this rate did not
change if all ports started sending and receiving traffic at the same time since separate
internal modules are controlling each port. This experiment shows that one FPGA alone
can send and receive 32 Gbps of traffic. If a researcher gets all four FPGAs on a BEE2
board, it is possible to send and receive traffic at a rate of 4x32 = 128 Gbps.

FPGAs programmed concurrently      1    2    3    4
Programming time (s)              11   17   24   30

Table 4.1: Average maximum FPGA programming time
We have used this reprogrammable resource in developing the high capacity gateway
and bridge service for VANI, and we have developed a bandwidth control mechanism on
this resource that controls and guarantees the rate at which one experiment can send
and receive traffic to/from a wide area network. In the future, we will present our design
for the gateway and bridge service, and we will present our performance measurements
for this service as well.
node01 from/to         UDP        UDP (rl)    TCP           TCP (rl)
node02 (12.50 MBps)    24.5/24.3  12.4/12.4   15~35/24.7    12.3/12.3
node03 (18.75 MBps)    24.5/24.3  18.8/18.8   15~35/24.3    18.4/18.4
node04 (25.00 MBps)    24.5/24.3  25.3/25.3   15~35/24.1    24.8/24.6
node05 (31.25 MBps)    24.5/24.3  31.7/31.6   15~35/22.1    31.3/31.1
node06 (31.25 MBps)    24.5/24.3  31.7/31.6   15~35/23.2    31.3/31.1

Table 4.2: UDP and TCP traffic measurements in a VANI node in MBytes per second
(MBps); "rl" denotes rate-limited
4.9.2 Processing Service and Network Virtualization
Another main physical resource that we have virtualized is the processing resource,
which is virtualized using the Linux vServer software. There have been studies on
processing virtualization techniques [86], and also specifically on Linux vServer [76].
Linux vServer performance evaluations show that this virtualization module imposes very
low overhead on overall system performance.
[Figure omitted: six processing servers (node01 to node06) connected by 1GE links to the VANI internal fabric; each server hosts virtual processing nodes (VN_1_1 to VN_6_5), and VLANs 101 to 105 carry experiments 1 to 5.]
Figure 4.12: Traffic measurement experiment topology
However, since we are also doing network virtualization in addition to the processing
node virtualization, we conducted two more experiments that we believe were necessary
to show that virtual processing nodes can have guaranteed access to the VANI network.
In our experiment, we virtualized cluster blades with dual Xeon 1530 CPUs, 2GB
of RAM, and one 1GE interface. The Linux kernel version that we used was 2.6.16, and
we used the vServer 2.3.2 patch. The developed virtualization layer allows up to ten virtual
nodes on a physical node. For this experiment, we initialized and launched 5 virtual
nodes on a node named node01. We also launched five other virtual processing nodes on
five separate servers with the same capabilities as node01. These nodes are named
node02 to node06. Each of the virtual nodes in node01 belongs to an experiment that
includes one other virtual node running on one of the other nodes. The topology and
VLAN tags for experiments are shown in figure 4.12.
In this experiment, we measured the UDP and TCP traffic rates that each virtual
node in an experiment could send and receive in different cases. The first case is to find
the maximum achievable rate when no limit is placed on the traffic rate and only one
experiment is active. This rate is 122 MB per second (MBps) for both UDP and TCP
traffic, which is equal to 976 Mbit per second (Mbps). Table 4.2 shows the achievable
rates in different cases when all experiments are active and send as fast as they can.
Since all experiments running on node01 try to send and receive on one 1Gbps Ethernet
link concurrently, they get different shares of the available bandwidth in different cases.
In Table 4.2, we show the maximum traffic rate in MBps between a virtual node on
node01 and its corresponding virtual node on node02 to node06. The UDP column shows
the maximum rate when all virtual nodes in all experiments try to send and receive UDP
traffic, concurrently, without any rate limit mechanism in place. The TCP column shows
the TCP rate in this case. As can be seen, because of the massive packet loss in this
case, TCP cannot achieve a stable rate, and its rate fluctuates between 15 and 35 MBps.
These measurements demonstrate the need for a rate limiting mechanism when different
experiments run on a shared virtualized infrastructure.
The columns with (rl) show measurements when we limit the send and receive rates
of the experiments to 12.5, 18.75, 25, 31.25, and 31.25 MBps respectively, totaling
118.75 MBps (950 Mbps). As can be seen, using the rate limit functionality we could
meet the bandwidth guarantee requirements (with at most 1% deviation from the
target rate) in a VANI node. Another case that we studied is the one in which all
virtual nodes in one experiment start sending traffic to a single virtual node concurrently.
This results in congestion on the shared link serving the destination virtual node. To
solve this problem, we have developed a novel traffic control mechanism that we present
in the next chapter of this thesis.
4.10 Experiments & Applications
The testbed can be used to run large-scale experiments on networked systems,
applications, and network architectures from layer three up. In particular, by providing
processing and storage services in all testbed nodes, it is designed to enable
experimentation with applications that need responsiveness and quality of service
guarantees. Example applications that could use these capabilities are video streaming
applications and smart power grid networked applications.
Because the experiment configuration can be changed on demand and on the fly, and
because of the everything-as-a-service foundation of the testbed's network architecture,
architectures such as green network architectures can be tested on this testbed. In a
green network architecture, the network topology and configuration can change in
response to changes in the status of renewable energy generation and consumption.
Based on the same capabilities, we are in the process of building a green orchestrator
engine that will use many aspects of the testbed, including on-demand configuration,
short-lived resource leases, and the testbed's status and performance monitoring tools.
The results of this work will be published soon.
Also, due to the availability of storage and processing resources in the testbed
nodes, the testbed can be used to experiment with various content delivery networks,
such as hybrid peer-to-peer networks. In hybrid p2p networks, peers and in-network
resources can be organized and structured so that content is delivered to users with
lower search and delivery times. Implementing content-based routers and distributed
publish/subscribe systems would also be possible in our testbed, and these services
could be offered to researchers as standalone, reusable service components to
facilitate experiment setup and application creation.
Chapter 5
A Distributed Ethernet Traffic Shaping System
The architecture of the local area networks is facing new challenges with the emergence
of cloud computing [87] and the deployment of massive data centers [26]. This new
computing paradigm allows users to access a virtual network of resources in the cloud that
can be called upon to deploy applications on demand. At the same time, the networking
research community has moved toward creating similar platforms for experimenting with
new networking concepts and architectures [7]. As in cloud computing, these networking
testbeds offer a virtual network of resources to researchers so that they can evaluate
their networked systems at large scale.
The creation of these research testbeds and cloud computing platforms has become
possible mainly due to the advancement of virtualization techniques that have made
separation of the virtual computing resource and the underlying physical resources much
easier, and have allowed operation of multiple virtual machines on one physical resource.
Inherent in such shared resource environments is the potential for disruptive interac-
tion among users and hence the need for new techniques to provide network and resource
isolation. The Virtualized Application Networking Infrastructure (VANI) [7, 8], presented
in the previous chapter of this thesis, is an example of a networking research testbed that
allocates a virtual network of resources to researchers. An important requirement in
VANI is to guarantee network access rates and isolation between different experiments.
In this chapter, we present the Distributed Ethernet Traffic Shaping (DETS) system
and its corresponding algorithms, designed to provide guaranteed network access rates
in VANI. The DETS system is applicable not only to VANI, but also to computing
clusters and data centers that virtualize and share their resources among different virtual
networks. DETS deployment in a cluster or a data center does not require any changes in
system hardware, and it can be deployed on top of ordinary computing blades and Ethernet
switches.
[Figure omitted: five physical nodes PN1 to PN5 attached to an Ethernet switch; each hosts two virtual nodes, with VN11 to VN15 on VLAN #1 and VN21 to VN25 on VLAN #2.]
Figure 5.1: A system with five nodes and two virtual nodes on each
The primary role of DETS is to control and regulate the traffic sent and received
on VLANs. In particular, this is required when more than one virtual machine runs
on a physical node, and each has to send and receive a guaranteed rate of traffic on a
dedicated VLAN over a shared Ethernet access link. Figure 5.1 shows a sample scenario
for DETS. In this sample system, we have five physical nodes (PNs), each having two
[Figure omitted: received TCP rate (Mbps) on VLAN 1 versus time.]
Figure 5.2: TCP rate back off due to interfering UDP traffic
running virtual nodes (VNs). All these PNs are connected to an Ethernet network, and
the VNs running on these PNs require a guaranteed access rate to the Ethernet network.
For the sake of simplicity, we show an Ethernet network with just one Ethernet switch,
but in general it is possible to have many switches in a network. In this topology, the
VNs running on a node work separately and can only communicate with their peer VNs
on other physical nodes.
If VN11, VN12, VN13, and VN14 start sending traffic to VN15, they can consume
all the available bandwidth on the Ethernet link that connects PN5 to the Ethernet
switch. This may cause problems for traffic sent from nodes VN21, VN22, VN23, and VN24
to node VN25, which shares the Ethernet link with VN15. Therefore, a traffic shaping
or rate control mechanism is needed to limit the rate at which PN5 can receive traffic
for VN15, so that VN25 can also receive traffic at a guaranteed rate.
This problem becomes especially evident if the interfering traffic (traffic for VN15)
is UDP and the underdog traffic (traffic for VN25) is TCP. The high volume of UDP
packets on the link to PN5 virtually disables TCP traffic to VN25, as the experimental
results in Figure 5.2 show. In this figure, VN25 receives the maximum possible TCP
rate if no traffic is sent to node VN15. However, as soon as UDP traffic is sent to node
VN15 (around time 300 in Figure 5.2), the TCP rate drops to almost zero until the UDP
traffic stops (at around time 1000). This experiment shows not only the sensitivity of a
TCP flow's rate to a competing UDP flow, but also the importance of having a traffic
shaping and rate control system to guarantee an agreed access rate for the different
virtual nodes on a physical node that share one Ethernet link. Although there
have been proposals for TCP-friendly transport protocols [88, 89], in many systems and
environments, such as in VANI, it is not desirable to impose a specific flavor of transport
protocol on the virtual machines. The problem of network performance degradation in
virtualized environments has also been studied in [29], where the authors, through
measurements on Amazon's Elastic Compute Cloud services, concluded that virtualization
techniques can cause significant throughput instability.
Current Ethernet flow control uses PAUSE signals [90]. When multiple ports flood a
port, the Ethernet switch sends PAUSE signals back to the flooding ports so that they
stop sending for an amount of time specified in the PAUSE message. It has been generally
accepted that the pause mechanism in Ethernet flow control is not suitable for solving
new challenges facing these networks [26]. To address Ethernet congestion problems,
two new IEEE task forces (802.1Qau [91] and 802.1Qbb [91]) have been created. The
main approach in these task forces is to perform flow control at the level of class of service
by marking frames at Ethernet switches. In contrast to these approaches, our proposed
system operates at the edge of the Ethernet network on the computing hosts in a cluster
or a data center.
We direct interested readers to [26, 30, 92, 93, 94] for a survey on the recent work
on Ethernet network congestion control for data centers. The current proposed methods
for congestion management entail modifying Ethernet network elements. Moreover, the
majority of the proposed systems are Congestion-Notification-based systems with no
explicit rate information [30, 92], which have been shown to have drawbacks, such as
slow recovery in comparison to explicit rate systems [92].
The salient explicit rate congestion management system, Forward Explicit Congestion
Notification (FECN) [92, 94], passes an explicit rate from the congestion point to the source
point based on the utilization ratio of the congested link. Our system is also an explicit
rate system, but differs from FECN in several aspects.
The DETS system is more than just an Ethernet congestion management system. In
particular, DETS allows setting guaranteed limits on the send and receive rates of each
virtual network, and it shapes the traffic so that virtual networks do not interfere with
each other's ability to send and receive traffic. Unlike FECN, DETS does not need any
change to the current Ethernet equipment, and it can be deployed in current computing
cluster systems and data centers. Moreover, our system is capable of supporting both
fair and weighted fair bandwidth allocation mechanisms. In addition, in allocating rates
to the sending nodes, the system considers the available sending capacity of the sending
nodes, which results in higher throughput. Nevertheless, we emphasize that DETS is
designed to address congestion at the egress ports of Ethernet networks. Consequently,
it does not directly address congestion inside the network.
The DETS operation is transparent to the virtual machines running on the host
system; virtual machines only see the decreases and increases in the send and receive
traffic rates of certain flows. In other words, the applications need not report their
bandwidth requirements, since the measurements are done in DETS. However, since our
system runs on the host system, its rate-setting and measurement periods are limited
by the system's timer (about 55 ms).
The organization of this chapter is as follows: Section 2 describes our proposed system,
identifies key control and measurement points, and presents the DETS protocol. Section
3 presents the DETS system design and its main internal modules. In this section, we also
[Figure omitted: physical nodes PN1 to PN5, each hosting VN1x and VN2x, attached to an Ethernet switch; measure and rate control points sit at the sending nodes, while a measure point and rate allocator sits at the receiving node, with rate measurement reports flowing to the allocator and rate control commands flowing back.]
Figure 5.3: DETS measurement and rate control points
propose four different algorithms developed for DETS. The DETS system performance
measurements are presented in Section 4, and in Section 5 we describe the modifications
to the Ethernet control plane required to port the DETS system to Ethernet network
elements. Finally, in Section 6, we present concluding remarks and our future work.
5.1 Distributed Ethernet Traffic Shaping (DETS) system
The DETS system is designed to control the rate of the traffic generated by each virtual
machine according to the total traffic rate at the destination virtual node. DETS controls
the sending rate of the traffic in the originating VN before it enters the Ethernet network
based on a target rate imposed by the receiving virtual node.
In the VANI system, a virtual LAN is created for the virtual nodes that are in one
group, and an "over-the-top" rate controller software is run on each of the physical nodes.
This software is able to control the rate at which each virtual machine sends traffic to
any other virtual machine in that virtual network. The module is also able to measure
received traffic to each virtual node, and detect if the received rate limit is violated. If
the received rate limit is violated, the receiving node is declared the congested node. The
controller then monitors the sent traffic to the congested node and controls its rate at
the sending node. This system is depicted in Figure 5.3 which shows the control and
measurement points.
Each agent in DETS has two separate modules: a send rate controller and a receive
rate allocator. The send rate controller monitors the rate of traffic sent to any virtual
machine that is facing congestion, and reports it to the rate allocator in the congested
node (node PN5 in the example scenario, called the receiving node in the remainder
of this document). The rate allocator at the receiving node (PN5) allocates a rate to
each sending node and sends set-rate commands to the corresponding send rate controller
modules in the sending nodes. The send rate controllers apply the received set-rate
commands (at the rate control points shown in Figure 5.3), and subsequently the traffic
sent to the congested node (PN5) is shaped accordingly.
The DETS system can be implemented in any cluster with any operating system that
is able to control the egress Ethernet traffic rate. In the next section, we focus on a
cluster of Linux-based computing nodes, and we describe the system design and protocol
for deploying DETS in such a cluster.
5.1.1 DETS Protocol
The DETS protocol has five types of messages:
1. Traffic Report message, sent from a sending node to a receiving node; includes the
measured rate, the current rate limit, and the available rate.
2. Initialize Traffic Control message, sent from a receiving node to a sending node to
initialize the traffic controller for that receiving node.
3. Set Rate message, sent from a receiving node to a sending node; includes the allocated
rate that the sending node has been granted.
4. Keep Alive message, sent from a receiving node to a sending node while traffic control
on the receiving node is active.
5. Deactivate Traffic Control message, sent from a receiving node to a sending node to
deactivate traffic control for that receiving node.
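The thesis specifies the five message types and their payloads but not a concrete wire format. The following sketch renders them as Python dataclasses with a hypothetical JSON encoding, purely to make the protocol vocabulary concrete; all field names and the encoding are our assumptions:

```python
from dataclasses import dataclass
from enum import IntEnum
import json

class MsgType(IntEnum):
    TRAFFIC_REPORT = 1        # sender -> receiver: measured/limit/available rates
    INIT_TRAFFIC_CONTROL = 2  # receiver -> sender: start controlling traffic
    SET_RATE = 3              # receiver -> sender: granted sending rate
    KEEP_ALIVE = 4            # receiver -> sender: traffic control still active
    DEACTIVATE = 5            # receiver -> sender: stop traffic control

@dataclass
class DetsMessage:
    mtype: MsgType
    src: str                     # originating node id, e.g. "PN1"
    dst: str                     # destination node id, e.g. "PN5"
    measured_rate: float = 0.0   # Mbps, used by TRAFFIC_REPORT
    rate_limit: float = 0.0      # Mbps: current limit (report) or grant (SET_RATE)
    available_rate: float = 0.0  # Mbps, spare send capacity (TRAFFIC_REPORT)

    def encode(self) -> bytes:
        # Hypothetical JSON wire encoding, chosen only for readability here.
        return json.dumps({"t": int(self.mtype), "s": self.src, "d": self.dst,
                           "m": self.measured_rate, "l": self.rate_limit,
                           "a": self.available_rate}).encode()

    @staticmethod
    def decode(raw: bytes) -> "DetsMessage":
        f = json.loads(raw)
        return DetsMessage(MsgType(f["t"]), f["s"], f["d"],
                           f["m"], f["l"], f["a"])

# A sender reporting 80 Mbps measured against an 80 Mbps limit, with 20 Mbps spare:
report = DetsMessage(MsgType.TRAFFIC_REPORT, "PN1", "PN5",
                     measured_rate=80.0, rate_limit=80.0, available_rate=20.0)
assert DetsMessage.decode(report.encode()) == report
```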
5.1.2 DETS for Linux OS
In Linux, traffic shaping can be done on egress and ingress traffic. The main command
for performing traffic shaping is the 'tc' command [95]. This command can operate on a
virtual interface (serving a VLAN), and it can also be used for measuring the send and
receive rates. The shaping in our system is done in the Linux hosts, and it is transparent
to the virtual machines running on them.
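As an illustration of the kind of `tc` invocations such shaping relies on, the sketch below builds the commands for capping egress traffic on a VLAN interface with a token bucket filter and for reading back the counters used in rate measurement. The interface name, rate, and burst/latency values are examples only, not the thesis's actual configuration:

```python
# Illustrative construction of Linux 'tc' commands for DETS-style egress
# shaping on a per-VLAN virtual interface. Values are examples.

def shape_cmds(iface: str, rate_mbit: int):
    """Return the commands to cap egress traffic on 'iface' with a
    token bucket filter (tbf) and to query its statistics."""
    return [
        # Replace the root qdisc with a token bucket filter at the target rate.
        ["tc", "qdisc", "replace", "dev", iface, "root",
         "tbf", "rate", f"{rate_mbit}mbit", "burst", "32kb", "latency", "50ms"],
        # Show per-qdisc statistics (sent bytes/packets) to measure the rate.
        ["tc", "-s", "qdisc", "show", "dev", iface],
    ]

# Example: cap the interface serving VLAN 101 at 400 Mbps.
for cmd in shape_cmds("eth0.101", 400):
    print(" ".join(cmd))
```

In a real deployment these argument lists would be passed to the shell (for example via `subprocess.run`) on each physical host.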
5.2 DETS System Design
Figure 5.4 shows the design of DETS. In the send rate control module, there is one
state machine per receiving node. There are also two internal sub-modules in the receive
rate allocator module: the first is responsible for communicating with the sending nodes,
and the second allocates the rates to the sending nodes.
5.2.1 Rate Allocator Module
The core part of the DETS system is the rate allocator module, which allocates the
sending rate to each sending node. The rate allocator module utilizes a Rate Allocation
Algorithm (RAA) to determine the rate at which each sending node can send traffic to
the receiving node.

[Figure omitted: the DETS system comprises a send rate control subsystem (send rate measurement and control) and a receive rate allocator with two sub-modules (rate allocator and sending-node communication), built on top of Linux traffic measurement and shaping.]
Figure 5.4: DETS System Internal Modules
In the RAA design, we need to consider that the measurements in the send rate
control modules are capped by the rate set by the RAA. To better explain this limitation
and its implication for algorithm design, we use an example scenario. Assume that in
Figure 5.3 the system is in a steady state with four virtual nodes (VN11 to VN14)
sending traffic to VN15 at rates of 80, 80, 20, and 20 Mbps, respectively. At this point,
if VN11 stops sending traffic to VN15, the rate allocation algorithm may reallocate the
vacant rate to the other nodes. However, since there are no measurements of sending
rates above the rate limits, the RAA needs a mechanism to probe VN12 to VN14 to see
whether these sending nodes need to send more traffic. Without a probing mechanism,
the RAA could allocate rate to a node that does not need the extra allocation, and the
available bandwidth would be wasted.
The probing mechanism allows us to provide fairness in rate allocation to virtual
nodes. Assume that in the above example all nodes have similar importance and have
equal amounts of traffic to send to VN15; then the above allocation is not fair, since
two of the virtual nodes have been allocated rates (80 Mbps each) that are much higher
than the rates allocated to the other two nodes.

input : active nodes list and their send capacity
output: granted rate for each node
1) Calculate the fair rate: fairRate ← totalRate / activeNodes;
2) Assign fairRate to all active nodes considering their send capacity:
while there is unallocated rate and there are nodes with sending capacity do
    grantRate[i] ← min(fairRate, maxRate[i]);
    if fairRate > maxRate[i] then
        fairly distribute the extra rate among the other nodes;
    end
end
Algorithm 1: RAAFairShare

If VN13 and VN14 had more traffic to
send, this rate allocation would be unfair. In this case, the probing mechanism in the
RAA starts probing the nodes with lower allocated rates to see whether they have more
traffic to send, and whether they need a larger allocated rate.
The probing mechanism in the RAA works through gradual increases and decreases in
the rate allocations to different nodes, while monitoring the resulting increases and
decreases in the rate measurements. Probing may reduce bandwidth utilization, but
this can be acceptable in order to overcome the above-mentioned problem.
Another important factor in RAA design is to consider the available traffic sending
capacity of the sending nodes during rate allocation. Assume that in Figure 5.3, VN14
is sending 20 Mbps to VN15 and 80 Mbps to VN12, and its total send limit is 100 Mbps.
In this case, VN14 cannot send any more traffic to VN15. Therefore, the rate allocation
algorithm in PN5 should consider the available sending capacity of the sending nodes in
its rate allocation.
There are a number of possible allocation algorithms that can be used in this system.
Next, we propose four such rate allocation algorithms: the Fair Share algorithm
(RAA-FS), the Slow Probe algorithm (RAA-SP), the Fast Probe algorithm (RAA-FP),
and the Forward Explicit algorithm (RAA-FE).
input : active nodes list and their requested rate and send capacity
output: granted rate to each node
1) Inflate the requested rate of the nodes that fully use their allocated rate by 10%;
2) Calculate the total requested rate;
3) Calculate the scaling ratio for the requested rates based on the available rate:
ratio ← totalAvailableRate / totalReqRate;
4) while there is unallocated rate and there are nodes with sending capacity do
    grantRate[i] ← min(reqRate[i] * ratio, maxRate[i]);
    if reqRate[i] * ratio > maxRate[i] then
        fairly distribute the extra rate among the other nodes;
    end
end
Algorithm 2: RAASlowProbe
The fair share algorithm (RAA-FS) calculates a fair rate by dividing the receive
rate limit by the number of sending nodes that have traffic to send, and allocates that fair
share to each of the active sending nodes. This algorithm is suitable for cases where
the sending nodes need to be treated identically in the rate allocation process, independent
of the amount of traffic they require, as shown in the pseudo code presented in Algorithm 1.
In this rate allocation mechanism, if the calculated fair rate is more than the sending
capacity of a sending node, the extra rate is fairly distributed among the other sending
nodes with available sending capacity. This algorithm is oblivious to the differences in
the rates requested by the active nodes, and it does not perform any probing to see
whether the nodes have more traffic to send. Although RAA-FS is fair, it might result
in bandwidth underutilization, since some sending nodes might not need all of their
allocated rate.
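The fair-share idea of Algorithm 1 can be made concrete with a short Python sketch. The function name and the water-filling loop structure are ours; the thesis's pseudo code only states the redistribution informally:

```python
def raa_fair_share(total_rate, max_rate):
    """Sketch of RAA-FS (Algorithm 1): split total_rate evenly among active
    nodes; when a node's send capacity (max_rate[i]) is below its fair share,
    saturate it and redistribute the surplus among the remaining nodes."""
    grant = {i: 0.0 for i in max_rate}
    remaining = total_rate
    unsat = set(max_rate)                       # nodes that can still take more
    while unsat and remaining > 1e-9:
        fair = remaining / len(unsat)
        capped = [i for i in unsat if max_rate[i] - grant[i] <= fair]
        if not capped:
            for i in unsat:                     # everyone can absorb the fair rate
                grant[i] += fair
            remaining = 0.0
        else:
            for i in capped:                    # saturate capped nodes first,
                remaining -= max_rate[i] - grant[i]
                grant[i] = max_rate[i]          # then re-divide what is left
                unsat.remove(i)
    return grant

# Four senders share a 100 Mbps receive limit; one can send at most 10 Mbps.
# Node 1 gets its full 10 Mbps and the surplus splits evenly over nodes 2-4.
print(raa_fair_share(100, {1: 10, 2: 100, 3: 100, 4: 100}))
```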
The second algorithm, the slow probe algorithm (RAA-SP), allocates rates to the
sending nodes based on the rate measurement reports received from their send rate
control modules. The algorithm identifies the nodes that are fully utilizing their allocated
rate, and inflates their rate requests by a percentage (for example, 10%) to give them an
opportunity to increase their rate relative to the other sending nodes that are not using
their allocated rate. RAA-SP then calculates the total requested rate and allocates a
portion of the available bandwidth to each node. This portion is calculated from the
inflated request rates and the receive rate limit, as presented in this algorithm's pseudo
code (Algorithm 2).

input : active nodes list and their requested rate and send capacity
output: granted rate to each node
1) Execute the Slow Probe algorithm: grantRate ← RAASlowProbe();
2) Sort all nodes that fully utilized their allocated rate according to their granted
rate, and calculate the mean of the rates granted to them;
3) while there is a node with a rate above the mean (upper) do
    while there is a node with a rate below the mean (lower) do
        multiply the rate of the lower node by d and deduct the increase from the
        upper node, considering the lower node's send capacity;
        if the upper node's new rate goes below the mean then
            average the lower and upper rates and assign the average to both;
        end
    end
end
Algorithm 3: RAAFastProbe

RAA-SP gradually probes the sending nodes that are fully utilizing their allocated
rate, and gives them a better chance of obtaining a larger allocation. RAA-SP, however,
does not address the fairness problem, since it does not reallocate rate from the nodes
with high allocated rates to the nodes with low allocated rates.
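A minimal executable rendering of the RAA-SP idea follows. The 95% "fully utilizing" threshold and the omission of the capacity-redistribution loop are simplifying assumptions of ours, not details from the thesis:

```python
def raa_slow_probe(total_rate, measured, limit, max_rate, inflate=0.10):
    """Sketch of RAA-SP (Algorithm 2). Nodes that fully use their current
    allocation have their request inflated (default 10%) so they can probe
    for more; all requests are then scaled proportionally to fit the
    available rate, capped by each node's send capacity."""
    req = {}
    for i in measured:
        r = measured[i]
        if r >= 0.95 * limit[i]:          # treated as fully utilizing: probe up
            r = limit[i] * (1 + inflate)
        req[i] = min(r, max_rate[i])      # never request beyond send capacity
    total_req = sum(req.values())
    if total_req <= total_rate:
        return req                        # everything fits: grant the requests
    ratio = total_rate / total_req        # shrink proportionally to fit
    return {i: req[i] * ratio for i in req}

# Two saturated senders (25 of 25 Mbps) and two light ones share 60 Mbps.
measured = {1: 25.0, 2: 25.0, 3: 10.0, 4: 5.0}
limit = {i: 25.0 for i in measured}
cap = {i: 100.0 for i in measured}
print(raa_slow_probe(60.0, measured, limit, cap))
```

The saturated nodes' inflated requests give them a proportionally larger slice of the 60 Mbps, which is exactly the probing behaviour described above.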
The third algorithm, the fast probe RAA (RAA-FP), extends the slow probe algorithm
by reallocating sending rates from the nodes with higher allocated rates to the nodes
with lower allocated rates. In contrast to the two previous algorithms, RAA-FP addresses
both fairness and bandwidth utilization concerns. This algorithm sorts the nodes that
fully utilize their allocated rate and calculates the mean rate allocated to these nodes
(shown in the pseudo code presented in Algorithm 3). RAA-FP then picks the node with
the highest allocated rate and the node with the lowest allocated rate. RAA-FP multiplies the rate
allocated to the lowest-rate node by a parameter d (d > 1) and deducts the extra
allocated rate from the node with the highest allocated rate, provided the resulting
deducted rate does not go below the mean allocated rate. Otherwise, it takes the average
of the highest and lowest allocated rates, and allocates this average rate to both nodes.
This change in the allocated rates is made considering the free sending capacity of the
node with the lower allocated rate. This operation is repeated on the next pair of nodes
with the next highest and lowest allocated rates, until all rates allocated to the fully
utilizing nodes have been revised.
Our performance evaluations show that the fast probe rate allocation algorithm
(RAA-FP) is able to achieve the probing goals rather quickly, since it gives the nodes
that are fully utilizing their allocated rate more opportunity to send more traffic.
Moreover, it achieves better fairness in rate allocation, since it reduces the gap between
the nodes with high allocated rates and the nodes with low allocated rates. The choice
of the parameter d controls the trade-off between fairness and bandwidth utilization. A
small value of d results in higher bandwidth utilization but lower fairness in the rate
allocations; a large value of d results in lower bandwidth utilization in exchange for
higher fairness.
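The rebalancing step of Algorithm 3 can be sketched as follows. The pairing of lowest against highest node and the capping of the boosted rate at the mean are our reading of the pseudo code; the function assumes send capacities at least as large as current grants:

```python
def raa_fast_probe(grant, full, max_rate, d=2.0):
    """Sketch of the RAA-FP rebalancing step (Algorithm 3). Among the nodes
    that fully utilize their allocation ('full'), rate moves from the
    highest-granted node to the lowest-granted one: the low grant is
    multiplied by d (capped by send capacity and by the mean), and the
    increase is taken from the high node unless that would push it below
    the mean, in which case the pair is averaged."""
    nodes = sorted(full, key=lambda i: grant[i])   # ascending by current grant
    mean = sum(grant[i] for i in nodes) / len(nodes)
    lo, hi = 0, len(nodes) - 1
    while lo < hi:
        l, h = nodes[lo], nodes[hi]
        if grant[l] >= mean or grant[h] <= mean:
            break                                  # nothing left to rebalance
        new_low = min(grant[l] * d, max_rate[l], mean)
        delta = new_low - grant[l]
        if grant[h] - delta < mean:
            avg = (grant[l] + grant[h]) / 2.0      # deduction would undershoot:
            grant[l] = grant[h] = avg              # average the pair instead
        else:
            grant[l] = new_low
            grant[h] -= delta
        lo += 1
        hi -= 1
    return grant

# The text's example: two nodes hold 80 Mbps each, two hold 20 Mbps each.
g = raa_fast_probe({1: 80.0, 2: 80.0, 3: 20.0, 4: 20.0}, full=[1, 2, 3, 4],
                   max_rate={i: 100.0 for i in range(1, 5)}, d=2.0)
print(g)   # the 80/20 gap narrows while the 200 Mbps total is preserved
```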
The fourth algorithm is inspired by the FERA algorithm introduced in [92] for FECN-
based Ethernet congestion management. This algorithm is designed to enable a comparison
between a DETS-based rate allocation system and a FECN-based system. It has been
shown [94] that the FERA algorithm has a better convergence time than other
proposals for Ethernet congestion control. The essence of FERA is to control the queue
length of an outgoing Ethernet switch port by assigning a fair share rate to the flows
passing through that port. This algorithm uses a linear (or hyperbolic) control function
to adjust the allocated (fair) rate so as to achieve a target queue length (Qeq).
We modified this algorithm to arrive at a target receiving rate at the receiving node.
This algorithm (called RAA-FE) calculates a fair rate r_{i+1} for the (i+1)-th interval
based on the value r_i in the i-th interval and a control function

    f(r) = 1 - k * (r - R_t) / R_t

in which k is a constant, r is the measured receiving rate, and R_t is the target rate.
DETS sends back the calculated rates to the sending nodes, and the sending nodes
apply the rates to their rate controller modules. Compared to the previous algorithms,
this algorithm does not require the rate measurements at the sending nodes, and does
not support weighted fair allocation.
In the original FERA, the intervals are as short as 1 ms, but in DETS the intervals
are about 55 ms. Rate adjustments are therefore made only every 55 ms, which makes
rate convergence a challenge for this algorithm. Although the linear control function
leads to a faster convergence time than the hyperbolic function, our experiments show
that RAA-FE takes about 40 intervals (> 2 s) to converge to the fair rate. The analytical
results in [92] show this slow convergence as well. This is mainly because this algorithm
does not use the sending rate measurements.
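One RAA-FE control interval amounts to scaling the current fair rate by the linear control function above. The sketch below makes that update explicit; the value of k and the example rates are illustrative, not taken from the thesis:

```python
def raa_fe_step(r_fair, r_measured, r_target, k=0.1):
    """One RAA-FE interval: scale the fair rate by the linear control
    function f(r) = 1 - k * (r - R_t) / R_t, so the fair rate shrinks when
    the measured receiving rate overshoots the target R_t and grows when it
    undershoots. k = 0.1 is an illustrative gain, not the thesis's value."""
    f = 1.0 - k * (r_measured - r_target) / r_target
    return r_fair * f

# Receiving 500 Mbps against a 400 Mbps target shrinks the fair rate;
# receiving 300 Mbps against the same target grows it.
print(raa_fe_step(r_fair=100.0, r_measured=500.0, r_target=400.0))  # 97.5
print(raa_fe_step(r_fair=100.0, r_measured=300.0, r_target=400.0))  # 102.5
```

Because DETS applies this update only once per ~55 ms interval, many such steps are needed before the fair rate settles, which matches the ~40-interval convergence reported above.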
5.2.2 Performance Improvements
To improve the performance of the DETS system, we have embedded several performance
improvement mechanisms in the system. These improvements are mainly to reduce the
number of exchanged messages and to better predict the required sending rate of sending
nodes.
The first improvement is in the rate measurement reports. The send rate control
module can send measurement reports only when there is a major change in the measured
rate. By doing so, the measured rates are reported at a lower frequency. Also, if the
measured rate falls below a minimum threshold, the send rate control module can stop
reporting it, and the rate allocator will automatically allocate a minimum rate to that
node.
The send rate control module can also use a prediction algorithm for the rate at which
[Figure omitted: TCP rate (Mbps) on VLAN 1 and UDP rate (Mbps) on VLAN 2 versus time.]
Figure 5.5: DETS performance evaluations for system shown in Figure 5.1
a sending node will generate traffic to a receiving node during the next time period, and
send it to the rate allocator module. This predicted rate can be calculated based on the
current and past measurements. This prediction improves the rate allocation algorithm
performance since it considers the predicted rate requirements of a node instead of past
sending rate measurements.
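The thesis does not specify the predictor, only that it combines current and past measurements. A simple candidate of that kind is an exponentially weighted moving average; the function name and smoothing factor below are our assumptions:

```python
def predict_rate(history, alpha=0.5):
    """Hypothetical next-interval rate predictor of the kind suggested
    above: an exponentially weighted moving average over past per-interval
    rate measurements (oldest first, newest last). alpha is an assumed
    smoothing factor, not a value from the thesis."""
    est = history[0]
    for r in history[1:]:
        est = alpha * r + (1 - alpha) * est   # weight recent samples more
    return est

# A rising sending rate yields a prediction above the plain average,
# anticipating the node's growing demand for the next interval.
print(predict_rate([10.0, 20.0, 40.0]))   # -> 27.5
```

Feeding such a prediction to the rate allocator lets it track a ramping sender more closely than the raw last measurement would.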
To reduce the number of rate allocation messages generated by the rate allocator
module, this module can send these messages to the send rate control modules only when
there is a major change in the allocated rate.
To make sure that the messages of the DETS protocol are delivered to the distributed
modules with minimum delay, DETS messages can be conveyed on a separate physical
or virtual network. They can even be marked with high priority, so that they have a
better chance of reaching their destinations when the network is congested.
5.3 Performance Evaluations
In this section, we present experimental results that show DETS can achieve isolation
between virtual LANs. We implemented the DETS system in C++ and deployed it on
11 nodes with 1GE Ethernet connections in a computing cluster, and we created two
VLANs on the Ethernet switches. As in our VANI processing virtualization service [8],
we used Linux vServer technology for virtualization and deployed two virtual nodes on
each physical server. One virtual node in a physical node is connected to VLAN 1, and
the other one is connected to VLAN 2. This setting is similar to the one depicted in
Figure 5.1, except that we used eleven physical nodes instead of five nodes.
We set the send and receive limit rate for all virtual nodes in the first VLAN to 400
Mbps, and in the second VLAN to 500 Mbps, and we used the fast probe rate allocation
algorithm on both VLANs with parameter d = 2. We started sending TCP traffic from
10 nodes to one node. We expect DETS to control the rate at which the receiving node
receives traffic, limiting it to 400 Mbps. We also expect that if the nodes in the second
VLAN start sending UDP traffic to the receiving node, the TCP flows destined to that
machine are not overwhelmed by the interfering UDP traffic.
Our results (presented in Figure 5.5) show that DETS is able to achieve both goals.
In this figure, the rate measurements are shown at every time unit (every 55 ms). As can
be seen, when all nodes in the second VLAN simultaneously start sending UDP traffic
to the receiving node (around time unit 320 in Figure 5.5), TCP traffic on the first
VLAN is momentarily disrupted, and it takes two time units for the control algorithm to
receive the measurements, make the decision, and apply the limits on the sending nodes.
After this short transient period, TCP traffic bounces back quickly and continues at the
400 Mbps limit rate.
We also evaluated and compared the performance of the four allocation algorithms.
To do so, we set up a VLAN with 10 virtual nodes sending a mix of UDP and TCP
traffic to one virtual node, and we monitored the received traffic on the receiving node.
[Figure 5.6: Performance evaluation of rate allocation algorithms: a) RAA-SlowProbe (RAA-SP), b) RAA-FastProbe (RAA-FP). Panels a1/b1 show the received rate (Mbps) over time; panels a2/b2 show the mean and standard deviation of the allocated rate (Mbps).]
[Figure 5.7: Performance evaluation of rate allocation algorithms: a) RAA-FairShare (RAA-FS), b) RAA-ForwardExplicit (RAA-FE). Panels a1/b1 show the received rate (Mbps) over time; panels a2/b2 show the mean of the allocated rate (Mbps).]
We also limited the peak rate of three of the sending nodes to a low value (20 Mbps).
This helps us better compare the performance of the proposed algorithms.
We developed an on/off burst traffic generator that generates a burst of UDP or TCP
traffic for a random period between 0 and T, and then stops sending traffic for another
random period between 0 and T. We used various values for T ranging from 0.5 s to 10 s
on different nodes. This traffic generator enables DETS performance evaluation under
time-varying and bursty UDP and TCP traffic.
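The on/off pattern above can be sketched as follows; this is a minimal Python illustration of the timing logic only, with the actual traffic transmission delegated to a caller-supplied callback (a hypothetical interface, since the thesis does not describe the generator's implementation).

```python
import random


def on_off_bursts(T, total_time, send_burst):
    """Sketch of the Section 5.3 on/off burst pattern: transmit for a uniform
    random period in [0, T] seconds, stay silent for another uniform random
    period in [0, T], and repeat until total_time has elapsed.

    send_burst(seconds) is a caller-supplied callback (hypothetical API)
    that would blast UDP or TCP traffic for the given duration.
    """
    elapsed = 0.0
    while elapsed < total_time:
        on = random.uniform(0, T)
        send_burst(on)                 # ON period: generate traffic
        off = random.uniform(0, T)     # OFF period: stay silent
        elapsed += on + off
```

Running one instance per node with different T values (0.5 s to 10 s, as in the experiments) yields the time-varying, bursty load used to stress the allocation algorithms.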
Figures 5.6(a1, b1) and 5.7(a1, b1) show the received rate measurements on the
receiving node for all algorithms for the period of 82 seconds (1500 time units). Figures
5.6(a2, b2) show the measured mean and standard deviation of the allocated rate to the
nodes by the slow probe and fast probe (d = 2) algorithms, respectively. Figures 5.7(a2,
b2) show the mean value of the allocated rate by RAA-FS, and RAA-FE. The fluctuations
in the received rate measurements are due to the on-off nature of the generated traffic.
It can be seen that the fast probe and the slow probe algorithms achieve better
utilization of the received bandwidth than the fair share algorithm, especially since some
of the nodes have less sending capacity than the others. As expected, the slow probe
algorithm outperforms the fast probe algorithm in terms of receiving bandwidth
utilization. However, the fast probe algorithm achieves a lower standard deviation across
flows coming from different virtual nodes than the slow probe algorithm.
The RAA-FE algorithm performs poorly compared to the other algorithms: it converges
slowly and has difficulty stabilizing. This is mainly because of the fluctuations in the
generated traffic. RAA-FE also does not consider the sending rate measurements, and
does not have a probing mechanism.
In general, the fast probe algorithm is the best choice when weighted fairness is
required, while a user who needs strict fairness in rate allocation can pick the fair share
scheme. The slow probe algorithm is for the cases where the user wants to increase
[Figure 5.8: DETS in Ethernet control plane. DETS modules sit in Ethernet switches SW1, SW2 and SW3; DETS control messages flow between the sending-side port (SW1, port 8) and the receiving-side port (SW3, port 5).]
the bandwidth utilization at the expense of fairness, and does not want sudden changes
in a traffic flow's rate, preferring gradual ones. In DETS, it is possible to run different
rate allocation algorithms on different virtual networks, as long as the algorithms satisfy
the network isolation requirement. This allows users to pick an algorithm that suits their
needs.
5.4 Modifications to Ethernet Control Plane
Here we discuss the inclusion of the DETS protocol in the Ethernet control plane, so that
Ethernet switching equipment can perform DETS operations even without (or with
minimal) help from hosts attached to the Ethernet network.
We propose that in an Ethernet network, the distributed modules of the DETS system
be embedded in the Ethernet switches, and that DETS messages be added to the Ethernet
control messages. To do so, traffic destined to a receiving node has to be controlled on
ingress ports by the edge Ethernet switches. The DETS messages could be added to the
MAC Control type of Ethernet frames (EtherType = 0x8808), as specified in the IEEE
802.3 family of specifications [90]. The only message currently defined for this frame type
is the PAUSE message (opcode = 0x0001); DETS messages can use other, unassigned
opcodes in this frame type. These messages have to be carried in VLAN-tagged frames,
since DETS is designed to control rates per VLAN.
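One possible layout for such a message is sketched below: a VLAN-tagged frame whose inner EtherType is the MAC Control value 0x8808. Note that only the PAUSE opcode (0x0001) is standardized; the DETS opcode value (0x00F0) and the 4-byte rate payload used here are illustrative assumptions, not defined by the thesis or by IEEE 802.3.

```python
import struct


def dets_set_rate_frame(dst, src, vlan_id, rate_mbps,
                        priority=7, opcode=0x00F0):
    """Build a VLAN-tagged MAC Control frame carrying a hypothetical DETS
    set-rate message.

    dst/src are 6-byte MAC addresses.  The 802.1Q tag (TPID 0x8100) carries
    the VLAN ID and a high priority code point, reflecting the suggestion
    that DETS messages be marked high priority.  Opcode and payload layout
    are assumptions for illustration only.
    """
    tci = (priority << 13) | (vlan_id & 0x0FFF)       # PCP + VID
    frame = dst + src
    frame += struct.pack('!HH', 0x8100, tci)          # 802.1Q tag
    frame += struct.pack('!H', 0x8808)                # MAC Control EtherType
    frame += struct.pack('!HI', opcode, rate_mbps)    # opcode + rate payload
    frame += b'\x00' * max(0, 60 - len(frame))        # pad to minimum size
    return frame
```

A switch receiving such a frame on its sending-side port would apply `rate_mbps` to the shaper for that VLAN, or forward the message to the host if the host (or its NIC) can shape traffic itself.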
Figure 5.8 shows an Ethernet network equipped with DETS. The rate allocator mod-
ule operates on the receiving port of an edge Ethernet switch (SW3, port 5), and the
send rate control module and the traffic shaper operate on the sending port of the orig-
inating edge Ethernet switch (SW1, port 8). The set rate messages are sent from the
receiving port to the sending port. The sending port applies the allocated rate to the
sent traffic, and can forward the rate control messages to the sending host in case it (or
its NIC) is able to do the traffic shaping.
Part III
QoS & Admission Control in
Service-Oriented Systems
105
Chapter 6
Allocating Services to Applications
using Markov Decision Processes
In the first two parts of this thesis, we analyzed the impact of service-oriented approaches
to application creation on future network architectures, and more specifically their central
role in network-facilitated application creation in an Application-Oriented Network. In
this part of the thesis, we focus on improving the quality of experience for applications
created based on this paradigm.
In the Service-Oriented application creation paradigm, services that are designed and
developed independently can be composed with other service components to create new
applications or more complex service components. Nowadays we can see the effect
of this paradigm on different aspects of networking, such as the development of new
applications through composition of service components, both in the form of “mashups”
[21] as well as in a more rigorous form by using the Service-Oriented Architecture [96,
19]. For example, Google Maps has provided the basis for a huge number of mapping
mashups. The importance of the mashup phenomenon is that it marks the emergence of
a new mode of application creation where applications are created through a distributed
and collaborative process. The term Web 2.0 refers to this emerging network-centric
platform [22]. In addition, SOA-based loosely coupled IT systems have given enterprises
greater agility in adjusting the structure of their businesses to meet changing business
requirements. Another example is the application of this paradigm to multimedia
applications, through the composition of multimedia services [97].
There is a large body of literature in the area of service composition. In [98], the authors
have discussed the service composition problem from the QoS-awareness point of view.
They have argued that the problem of composing services with different QoS parameters
for creating an application with a set of constraints on different QoS parameters is a
Linear Programming problem, and they have used the simplex method to find the best
service set for satisfying the application’s constraints.
A QoS-aware middleware for composing multimedia services for providing multimedia
applications has been proposed in [97]. The authors have shown that the problem of
composing services is an NP-hard problem and they have proposed a heuristic algorithm
for composing services in both centralized and P2P manner for satisfying the overall QoS
constraints of multimedia applications. In their peer-to-peer algorithm, upon receiving
a request from the user, the system starts finding candidate services that satisfy the
overall QoS constraints, and in the end it decides which services should be chosen to
properly serve the user's interests and QoS constraints. In [99], the authors have proposed
a Markov Decision Process (MDP) model for combining services while having multiple
choices for each service to increase the overall reward for work flows while exploring
different possibilities.
The problem of scheduling workflows while composing web services has been discussed
in [100]. The authors have proposed a genetic search approach which searches among
the possible order of a vast number of business processes and tries to find the best order
so that it satisfies the overall QoS constraint of the business processes. In this chapter,
we address the service composition problem in the presence of conflicting requests
for different services in composite applications. We study this problem in two different
cases. The first case assumes applications that require simultaneous execution of service
components, and in the second case we investigate the applications that execute service
components in sequence. We propose optimal policies for assigning service instances to
different applications using Markov Decision Processes in both cases. After formulating
the problem in a form of an MDP problem, we obtain the optimal policy, and we compare
performance of the system while following this policy with systems that use the Complete
Sharing (CS) or Complete Partitioning (CP) [101] mechanisms.
The rest of this chapter is organized as follows: in the next section we define the problem
of service allocation in the case of concurrent service executions. Then we formulate
this problem as an MDP problem. In subsection 6.1.2, we analyze the optimal solution,
and in section 6.1.3, we analyze the problem in case a service has instances with different
QoS parameters. In section 6.1.4, we present the optimal policy for a sample system and
we compare its performance with the CS and CP methods.
In the second part of this chapter, we extend the MDP-based service allocation to
the applications that execute service components in sequence. Similar to the first case,
we define and formulate this problem using MDP, and we obtain the optimal policy for
a sample system and present the performance evaluations and comparison results.
6.1 Concurrent Service Executions
6.1.1 Problem Formulation
Consider an environment with m types of services and k classes of composite applications
(Figure 6.1). For simplicity, we assume that all instances of one service have similar
QoS parameters. Each class of composite application is composed of a set of services.
For example, a class 1 application is composed of services 1, 3 and m, while a class 2
application is composed of services 1, 2, 4 and m and a class 3 application is composed
of only one service of type m. Therefore, a request for a class 1 application will be
accepted whenever there are free instances of services of type 1, 3 and m. Also, a request
for a class 2 application will be accepted whenever there are free instances of services
of type 1, 2, 4 and m, and similarly a request for a class 3 application will be accepted
whenever there is a free instance of service type m. As can be seen, there is a conflict
among the service requirements of class 1, 2 and 3 applications. Consider a case where
a high request rate for class 3 applications results in allocating all services of type m,
hence decreasing the chance of accepting other classes of applications that require a type
m service instance. This leaves instances of other types of services underutilized while
the requests for the other applications are being rejected.
[Figure 6.1: A system with m different service types and N instances of each type]
To solve this problem, and consequently maximize the overall utilization, there
should be a mechanism to allow or deny acceptance of requests for the different classes
of application. In this section, we propose an MDP-based partitioning model for obtaining
an optimal policy for accepting or denying requests, and we compare the results of
enforcing this policy, in terms of achieving higher utilization, with the other policies,
including Complete Sharing and Complete Partitioning [101].
Under the CS policy, a request for each class of application will be accepted whenever
there is one free instance of each corresponding service. In this algorithm, no reservation
for any of the applications is carried out. The CS policy, as described before, results in a
non-optimal allocation of services to the applications. Under the CP policy, a constant
number of service instances is allocated to each application class and cannot be shared
with other classes of applications. While this policy seems fair, it underutilizes the
services.
We propose a mechanism for accepting or rejecting the requests for each class of
application at the time of the request. We assume that for each class of application,
interarrival and holding times are exponentially distributed, where λi (1 ≤ i ≤ k) is the
arrival rate of class i applications, and µi (1 ≤ i ≤ k) is the service rate of class i
applications. Also, ni (1 ≤ i ≤ k) is the number of class i applications currently being
served in the system.
First we assume a simple model consisting of only two classes of applications (k = 2)
and three types of services (m = 3). Application class 1 is composed of services 1, 2 and
3. Application class 2 is composed of services 1 and 2 (Figure 6.2). We assume that
all services satisfy the QoS requirements of all classes of applications. Also, we have N
instances of each type of service in our system, and n1 and n2 represent the number of
applications currently in the system from class 1 and class 2, respectively. Therefore, a
state vector (n1, n2) represents the current state of the system.
Let S = {s = (n1, n2) | n1, n2 ≥ 0, n1 + n2 ≤ N} be the set of system states, and st
be the system state at time t. Based on the statistical assumptions, {st, t ≥ 0} is a
continuous time Markov chain whose transitions are the event of an arrival or departure
of an application.
We now formulate our problem as a Markov Decision Process [102]. Our objective
is to maximize the utilization of the services and increase the revenue. Therefore, our
decision process is to determine how we should treat the next request arrival while the
system is in state s. The system can either accept only a request for a class 1 application, or
[Figure 6.2: A system with three types of service and two classes of applications]
only a request for a class 2 application, or accept requests for both classes of applications.
Therefore, whenever the system enters the state (n1, n2), the system knows whether it
will serve the next request for either class of application or reject it. We assume that
rejected requests do not interfere with the system. As a result, the possible next actions
based on the state s are:

A(s) = {0}: only accept a request for class 1.
A(s) = {1}: only accept a request for class 2.
A(s) = {2}: accept requests for both classes.
Our objective is to find an optimal policy for each state to maximize the reward which
is the weighted sum of the applications currently being served in the system.
6.1.2 Markov Decision Process Formulation
This initial continuous-time Markov Decision Process can be converted into an equivalent
discrete-time MDP by applying the uniformization technique [102]. To do so, we
define the sampling rate c := N(µ1 + µ2) + λ1 + λ2; during each sampling interval only
one transition can occur, corresponding to either the arrival of a request, the departure
of a request, or a fictitious event.
To maximize the utilization in our problem we try to maximize the reward function
which is the weighted sum of different classes of applications in the system. Therefore
we use the MDP infinite-horizon discounted reward model [102, 103], and we define our
one-step reward function as follows:
R(s) = αn1 + βn2 (6.1)
The optimal discounted value function and the optimal policy can be computed using
the value iteration algorithm [103]:

V_{n+1}(s) = max_a [ R(s) + ε Σ_{s'} P^a_{ss'} V_n(s') ]   (6.2)

in which ε is the discounting factor and P^a_{ss'} is the transition probability from state s to
state s' while applying policy a; its value is as follows:
• When there is an arrival for a request for a class i application and we accept the
request the probability is: λi/c
• When there is an arrival for a request for a class i application and we reject the
request the probability is: λi/c
• The probability of departure of a class i application from the system is: niµi/c
• The probability of the fictitious event is: 1 − (Σ_i n_i µ_i + Σ_i λ_i)/c
Now we can recursively compute the sequence of n-stage values V_n(s) using the method
of successive approximations [103], and take the limit of this sequence as n goes to infinity.
It is shown that V(s) := lim_{n→∞} V_n(s) exists and is the solution of the infinite-horizon
discounted problem [103].
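The uniformized value iteration above can be sketched concretely. The Python code below is a minimal illustration, with our own function and variable names, of solving the two-class, three-service system of Figure 6.2: since services 1 and 2 are shared by both classes, feasibility reduces to n1 + n2 ≤ N. The transition probabilities are exactly the uniformized ones listed above; the parameter defaults follow Section 6.1.4.

```python
from itertools import product


def solve(N=10, lam1=5.0, lam2=5.0, mu1=1.0, mu2=1.0,
          alpha=1.0, beta=0.1, eps=0.99, iters=1500):
    """Illustrative value-iteration solver for the two-class system of
    Figure 6.2 under the uniformization of Section 6.1.2 (sketch, not the
    thesis implementation)."""
    c = N * (mu1 + mu2) + lam1 + lam2            # uniformization constant
    # services 1 and 2 are shared by both classes, so n1 + n2 <= N
    S = [(n1, n2) for n1, n2 in product(range(N + 1), repeat=2)
         if n1 + n2 <= N]
    V = {s: 0.0 for s in S}

    def backup(V, n1, n2):
        """One Bellman backup: (best value, best action) in state (n1, n2)."""
        best_v, best_a = float('-inf'), None
        for a in (0, 1, 2):   # 0: accept class 1 only, 1: class 2 only, 2: both
            ev = 0.0
            # class-1 arrival: admitted iff the action allows it and room remains
            nxt = (n1 + 1, n2) if a in (0, 2) and n1 + n2 < N else (n1, n2)
            ev += lam1 / c * V[nxt]
            # class-2 arrival
            nxt = (n1, n2 + 1) if a in (1, 2) and n1 + n2 < N else (n1, n2)
            ev += lam2 / c * V[nxt]
            # departures
            if n1:
                ev += n1 * mu1 / c * V[(n1 - 1, n2)]
            if n2:
                ev += n2 * mu2 / c * V[(n1, n2 - 1)]
            # fictitious (self-loop) event absorbs the remaining probability
            ev += (1.0 - (lam1 + lam2 + n1 * mu1 + n2 * mu2) / c) * V[(n1, n2)]
            q = alpha * n1 + beta * n2 + eps * ev   # R(s) + discounted future
            if q > best_v:
                best_v, best_a = q, a
        return best_v, best_a

    for _ in range(iters):
        V = {(n1, n2): backup(V, n1, n2)[0] for (n1, n2) in S}
    policy = {(n1, n2): backup(V, n1, n2)[1] for (n1, n2) in S}
    return V, policy
```

With the defaults above (β = 0.1), states holding many class 2 applications should map to action 0 (admit class 1 only), qualitatively matching the shape of the optimal policy reported in Section 6.1.4.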
[Figure 6.3: A system with three types of services, two classes of applications and two types of instances for service type 3]
6.1.3 Optimal Policy with Different Services
In the previous section, we formulated the problem for the case where all services
of one type are similar. But in some situations, there are different service components
with different QoS parameters that cover similar functionalities and can be substituted
for each other. For example, consider the case where we have two classes of applications
and three types of services. Application class 1 is composed of services 1, 2 and 3, while
application class 2 is composed of services 2 and 3. Also, we have two types of service
3 in the system: among the N instances of service 3, L instances are similar from the
QoS point of view (S3.1 services), and the remaining (N − L) instances have similar QoS
properties to each other but different from the first L instances (S3.2) (Figure 6.3).
We assume that after solving a Linear Programming (LP) problem for satisfying the
constraints of each class of application, we have found that a class 2 application can use
both types of service 3 instances, but a class 1 application can only use the instances of
type S3.1.
Now the problem is to propose a policy for accepting or rejecting requests for class 1
and class 2 applications so as to maximize the utilization of service instances. We show
that this problem is similar to the previous problem, and the policy maker can use the
previously proposed model to obtain the optimal policy and make optimal decisions.
Since the services of type S3.2 can only be used by class 2 applications, the decision
is whether to use an S3.1 instance for a class 1 application or for a class 2 application.
Therefore, the applications compete for a limited number of instances instead of
competing for all available instances. Upon arrival of a request for a class 2 application,
the system assigns an S3.2 instance if one is free. If no S3.2 instance is available, the
system, based on the MDP model, decides whether it should give an instance of type
S3.1 to this request or keep it for later use by a class 1 application.
The formulation of this problem is similar to that of the previous problem, except
that among the N service instances in the system we have L instances of type S3.1,
and the arrival rate of class 2 applications is λ2 pf instead of λ2, in which pf is the
probability that a request for a class 2 application arrives to the system when there is
no free S3.2 service instance. Note that pf can be obtained directly using the Erlang B
formula as follows:
pf = [ (λ/µ)^m / m! ] / [ Σ_{n=0}^{m} (λ/µ)^n / n! ],   m = N − L   (6.3)
Based on this problem formulation, in order to find the optimal policy we use the
following one-step reward function:
R(s) = αn1 + βn2 (6.4)
where n2 represents the number of class 2 applications that have come into the system
and have not found any free S3.2 instances. Again, we try to maximize this reward using
MDP with the infinite-horizon discounted reward model. Similar to the previous
problem, we can use the method of successive approximations for finite-period Markov
Decision Processes to find the optimal policy that maximizes the weighted-sum reward
function.
6.1.4 The Optimal Policy and Performance Comparison
Based on the presented MDP problem, we computed the optimal policy for the first
problem described and formulated earlier. We found the optimal policies for request
arrival rates λ1 = λ2 = 5, service rates µ1 = µ2 = 1, and N = 10, and we set ε to
0.99 in Equation 6.2. We obtained the optimal decision in each state for the cases
(α = 1, β = 0.1) and (α = 1, β = 0.5).
        9   0
        8   0  0
        7   0  0  0
        6   0  0  0  0
        5   0  0  0  0  0
  n2    4   0  0  0  0  0  0
        3   2  0  0  0  0  0  0
        2   2  2  0  0  0  0  0  0
        1   2  2  2  2  0  0  0  0  0
        0   2  2  2  2  2  0  0  0  0  1
            0  1  2  3  4  5  6  7  8  9
                          n1

Figure 6.4: Optimal policy when the system is in state (n1, n2), and α = 1, β = 0.1
Figure 6.4 and Figure 6.5 show the optimal policy for each case when the system is in
state (n1, n2). In both figures, '0' indicates that the system will only accept a request for
a class 1 application, '1' indicates that the system will only accept a request for a class 2
application, and '2' indicates that the system will accept requests for both classes of
applications. As can be seen, when the weight of class 2 applications is low and plenty
of them are currently being served in the system, our decision-making mechanism
suggests rejecting new requests for class 2 applications (Figure 6.4). However, if the
weight of class 2 applications is high, we have to accept
        9   0
        8   0  0
        7   0  0  0
        6   0  0  0  0
  n2    5   2  2  0  0  0
        4   2  2  2  0  0  0
        3   2  2  2  2  0  0  0
        2   2  2  2  2  2  0  0  0
        1   2  2  2  2  2  2  0  0  0
        0   2  2  2  2  2  2  2  2  1  1
            0  1  2  3  4  5  6  7  8  9
                          n1

Figure 6.5: Optimal policy when the system is in state (n1, n2), and α = 1, β = 0.5
more requests for that class of application (Figure 6.5).
We simulated the system and compared its performance under the MDP-based
partitioning mechanism with the Complete Sharing (CS) and Complete Partitioning
(CP) mechanisms [101].
As we described before, in CS method, the system accepts any request for any class
of application if it has enough room to serve that request. In other words, the system
does not reserve any of its resources for any class of application. In CP method, the
system keeps a constant number of service instances for each application class and does
not allocate that portion to any other class of application. In our implementation of the
CP method, we divided the resources based on the weights of each class.
Figure 6.6 shows the comparison results between these three methods. Figure 6.6(a)
shows the case where α = 1, and β = 0.1 and Figure 6.6(b) shows the case where α = 1,
and β = 0.5.
The x-axis in both figures represents the request rate in terms of λ1 and λ2; in both
figures λ1 = λ2, varying from 1 to 30. The y-axis represents the reward value, which is
the weighted sum of the number of applications currently in the system, under each of
the partitioning methods. As can be seen, the MDP-based partitioning mechanism
outperforms the other two mechanisms,
[Figure 6.6: Performance comparison between Complete Sharing, Complete Partitioning and MDP-based partitioning mechanisms. Both panels plot reward versus λ1, λ2 (0 to 30) for CS, CP and MDP-based partitioning: (a) α = 1 and β = 0.1; (b) α = 1 and β = 0.5.]
especially when the request rate is high. When the request rate is low, there is no
significant difference between the CS, CP, and MDP-based partitioning. However, when
the load is high and the weight of the second class of applications is low, using the
MDP-based partitioning results in 60% more reward compared to the CS method, and
10% more reward compared to the CP method.
In the next section, we revisit this problem by relaxing some of the assumptions. We
again study the optimal policy and we present MDP-based solutions for this problem.
6.2 Sequential Service Executions
In the previous section [11], we studied the problem of optimally allocating services
to different applications, and we proposed a Markov Decision Process approach for
solving it. One of the main assumptions we made in that section was that all service
instances are committed by the system to the application throughout its lifetime. In
this section, we relax this assumption further. We propose an optimal policy for
reserving service instances for different applications and business processes using
Markov Decision Processes. We obtain the optimal policy for a sample case and compare
its performance with that of a system that uses a Full Commitment Policy or a No
Commitment Policy in assigning service instances to applications.
6.2.1 Problem Formulation
Consider an environment with m types of services and k classes of applications or business
processes (Figure 6.7). Each class of application is composed of a set of services. For
example, a class 1 application is composed of services 1, 3 and m, while a class 2 is
composed of services 1, 2, 4 and m and a class 3 application is composed of only one
service of type m.
Each application uses a service for a limited time during its lifetime and the service
is free for the rest of the time. Whenever the system receives a request for a class of
application, it can accept the request or deny it. If the system accepts the request, one
policy is to put all corresponding instances of services on hold, until the application
execution finishes. We call this policy a Full Commitment Policy (FCP). Under this
policy, the system can accept a request for a class 1 application whenever there are free
instances of service types 1, 3 and m. Also, a request for a class 2 application will be
accepted whenever there are free instances of service types 1, 2, 4 and m, and similarly
a request for a class 3 application will be accepted whenever there is a free instance
of service type m. As can be seen, there is a conflict among the service requirements of
application classes 1, 2 and 3.
[Figure 6.7: A system with m different service types and N instances of each type]
Consider a case where a high request rate for class 3 applications results in the consump-
tion of all service instances of type m, hence decreasing the chance of accepting
other classes of applications that require a type m service. This leaves instances of other
types of services underutilized while the requests for the other applications are being
rejected. In the previous section, we analyzed this problem and provided optimal
solutions for a sample scenario. Since some types of applications or business processes
do not need the service instances throughout their lifetime, under the FCP policy the
service instances could be underutilized, even with an optimal assignment of service
instances to applications or business processes.
An alternative policy is to accept a request for any type of application whenever
there is one free instance of its first service. We call this policy a No Commitment
Policy (NCP). Although this policy seems simple, it has a significant drawback: applications
and business processes that are composed of only one service and have high request
rates can easily consume all instances of that service and force other applications to fail
when they need that particular service.
Another policy is the Partial Commitment Policy (PCP). Under this policy, the
system assigns service instances to applications considering both the fact that
applications do not need all instances throughout their lifetime and the fact that the
system should guarantee some level of service availability to all accepted applications.
In this section, we analyze this policy, formulate the corresponding problem, and
propose an optimal solution using Markov Decision Processes. Using the obtained
policy, we compare its results, in terms of achieving higher service utilization, with
those of the other policies.
Our proposed mechanism accepts or rejects the requests for each class of application
at the time of the request. In other words, by rejecting a request, we reserve available
service instances for future use by other classes of application.
For each class of application, interarrival times are exponentially distributed, where
λi^{-1} (1 ≤ i ≤ k) is the mean interarrival time of class i applications, and µj^{-1} (1 ≤ j ≤ m)
is the mean execution time of service type j. Each application class is composed of
a set of services, and nij (1 ≤ i ≤ k, 1 ≤ j ≤ m) is the number of class i applications
currently being served by service instances of type j in the system. Also, we have N
instances of each service type in our system. As a result, the state vector of the system
is: s = (n11, n21, ..., nk1, n12, n22, ..., nk2, ..., n1m, ..., nkm)
If an application class does not need a specific type of service at all, its corresponding n_ij will be 0 throughout the system lifetime, and therefore it can be omitted from the state vector. The set of all possible states, S, is given by:
S = { s : n_ij ≥ 0, 1 ≤ i ≤ k, 1 ≤ j ≤ m, Σ_i n_ij ≤ N for every j }   (6.5)
Also, each application class starts from one service and executes a sequence of services step by step according to a plan specified in an execution language such as the Business Process Execution Language. Therefore, the state space is limited to the states that are valid under the planned execution path. Throughout this chapter, we only consider execution plans with no conditional branches.
For example, Figure 6.8 shows a sample scenario consisting of only two classes of applications (k = 2) and two types of services (m = 2). Application class 1 is composed of services 1 and 2; application class 2 is composed only of service 2. Therefore, the state vector (n_11, n_12, n_22) represents the current state of the system.
Let S = {(n_11, n_12, n_22) | 0 ≤ n_11 ≤ N, 0 ≤ n_12 + n_22 ≤ N} be the system state space, and let s_t be the system state at time t. Based on the statistical assumptions, {s_t, t ≥ 0} is a continuous-time Markov chain whose transitions are the arrival or departure of an application, or the transition of an application from one service to the next according to its execution plan.
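As a concrete illustration, the state space of this sample scenario can be enumerated directly. This is a minimal sketch (the function name and the use of Python are ours, not from the thesis); it simply encodes the two constraints n_11 ≤ N and n_12 + n_22 ≤ N.

```python
from itertools import product

# Sketch: enumerate the state space S of the sample scenario with
# k = 2 application classes and m = 2 service types, N instances each.
# State (n11, n12, n22): service type 1 serves only class 1, so
# n11 <= N; service type 2 serves both classes, so n12 + n22 <= N.
def enumerate_states(N):
    return [(n11, n12, n22)
            for n11, n12, n22 in product(range(N + 1), repeat=3)
            if n12 + n22 <= N]

print(len(enumerate_states(6)))  # 196 reachable states for N = 6
```

For N = 6 this yields 7 choices of n_11 times 28 feasible (n_12, n_22) pairs, i.e. 196 states, which is small enough for exact dynamic programming.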
Ultimately, for each state s, the optimal solution should tell us whether or not to accept the next request for each class of application. Thus the action vector is:

a = (a_11, a_21, ..., a_k1, a_12, a_22, ..., a_k2, ..., a_1m, ..., a_km)   (6.6)

in which a_ij ∈ {0, 1} is the act of accepting or rejecting a request for a class i application
while entering the system at service j. Consequently, the action space of the system is A = {a : a_ij ∈ {0, 1}, 1 ≤ i ≤ k, 1 ≤ j ≤ m}.

Figure 6.8: A system with two types of service and two classes of applications (each service type has N instances; application class 1 uses service types 1 and 2, application class 2 uses only service type 2)

This action space, however, can be simplified based on the execution plan of each application class. Later in this section, we
will present a sample action space.
We formulate our problem as a Markov Decision Process [102, 103]. Our objective is to maximize the utilization of the services and increase the revenue. The decision is how to treat the next request arrival while the system is in state s. For example, whenever the sample system enters state (n_11, n_12, n_22), it decides whether it will serve the next request for either class of application or reject it. We assume that rejected requests do not interfere with the system.
Therefore, in state s, the possible next actions are to accept requests only for class 1 applications, only for class 2 applications, or for both classes of application; thus A(s) = {(0, 1), (1, 0), (1, 1)}. For simplicity, we use the following action representation in the rest of this chapter:

A(s) = 0, which means only accept requests for class 1.

A(s) = 1, which means only accept requests for class 2.

A(s) = 2, which means accept requests for both classes.
Our objective is to find an optimal policy for each state that maximizes the reward, which is the weighted sum of the applications currently being served in the system.
6.2.2 Markov Decision Process formulation
This initial continuous-time Markov Decision Process can be converted into an equivalent discrete-time MDP by applying the uniformization technique [103]. To do so, we define the uniformization constant c := N Σ_j μ_j + Σ_i λ_i. During each sampling interval only one transition can occur, corresponding to either a change in state or a fictitious event. To maximize utilization, we maximize a reward function that is the weighted sum of the different classes of applications in the system. Therefore, we use the MDP infinite-horizon discounted reward model [102, 103], and we define our one-step reward function as follows:
R(s, s') = α Δ⁺(s, s')n_11 + β Δ⁺(s, s')n_12 + γ Δ⁺(s, s')n_22   (6.7)

Δ⁺(s, s')n_ij = max{ n_ij(s') − n_ij(s), 0 }

in which Δ⁺(s, s')n_ij denotes the amount of increase in n_ij due to the transition from state s to state s'.
The optimal discounted value function and the optimal policy can be computed using dynamic programming techniques and the value iteration algorithm [102, 103]:

V_{n+1}(s) = max_a { Σ_{s'} P^a_{ss'} ( R(s, s') + ε V_n(s') ) }   (6.8)

in which ε is the discounting factor and P^a_{ss'} is the transition probability from state s to state s' under action a, whose value is as follows:
• When a request for a class i application arrives and we accept it, the transition probability is λ_i/c.

• When a request for a class i application arrives and we reject it, the probability is likewise λ_i/c (the state does not change).

• The probability that an execution of service j completes is Σ_i n_ij μ_j / c.

• The probability of the fictitious event is 1 − (Σ_j Σ_i n_ij μ_j + Σ_i λ_i)/c.
Now we can recursively compute the sequence of n-stage values V_n(s) using the method of successive approximations [103] and take the limit of this sequence as n goes to infinity. It is shown that V(s) := lim_{n→∞} V_n(s) exists and is the solution of the infinite-horizon discounted problem.
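The value-iteration recursion of Equation 6.8 can be sketched generically. The code below is an illustrative kernel run on a tiny synthetic MDP, not the thesis model: P holds one transition matrix per action (as produced by uniformization), R the one-step rewards, and eps the discount factor.

```python
import numpy as np

# Sketch: generic value iteration for the recursion of Equation 6.8,
#   V_{n+1}(s) = max_a sum_{s'} P[a][s, s'] * (R[s, s'] + eps * V_n(s')).
# P has shape (actions, states, states); R has shape (states, states).
def value_iteration(P, R, eps=0.9, tol=1e-9, max_iter=10000):
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        # Q[a, s]: expected one-step reward plus discounted continuation
        Q = np.einsum('ast,st->as', P, R) + eps * np.einsum('ast,t->as', P, V)
        V_new = Q.max(axis=0)
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return V, Q.argmax(axis=0)

# Toy 2-state MDP (hypothetical numbers): action 1 jumps to the other
# state; a reward of 1 is earned whenever the chain lands in state 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],   # action 0: stay
              [[0.0, 1.0], [1.0, 0.0]]])  # action 1: switch
R = np.array([[0.0, 1.0], [0.0, 1.0]])
V, policy = value_iteration(P, R)
print(policy)  # optimal: switch in state 0, stay in state 1
```

Because the discounted Bellman operator is a contraction, the stopping rule on successive iterates bounds the distance to the fixed point V(s).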
6.2.3 Optimal policy and performance comparison
Based on the presented MDP problem, we computed the optimal policy for the sample system composed of two types of services and two classes of business processes. We found the optimal policies for mean request inter-arrival times λ_1^{-1} = λ_2^{-1} = 60, mean execution times μ_1^{-1} = 30 and μ_2^{-1} = 40, weights (α = −0.1, β = 0.5), and N = 6, and we set ε to 0.99 in Equation 6.8. We obtained the optimal decision in each state for γ = 0.1 and γ = 0.3.
To reflect the importance of not terminating a business process or application midway through its execution path, we chose a negative value for α and a positive value for β. The sum (α + β) represents the importance of a class 1 application or business process relative to a class 2 one, whose importance is represented by γ. We chose a negative value for α because, if the system lets an application enter and then, upon completion of its first step, forces it to leave due to the unavailability of a free service instance, the system pays a cost of α.
Figures 6.9 and 6.10 show the optimal policy for each case when the system is in state (n_11, n_12, n_22). We show the results for n_22 = 1 (Figures 6.9a and 6.10a) and for n_22 = 4 (Figures 6.9b and 6.10b).
Figure 6.9: Optimal policy when the system is in state (n_11, n_12, n_22) and γ = 0.1: a) n_22 = 1, b) n_22 = 4

a) n_22 = 1 (rows n_11 = 6..0, columns n_12 = 0..5):

n11\n12   0  1  2  3  4  5
   6      1  0  0  0  0  0
   5      2  2  0  0  0  0
   4      2  2  2  0  0  0
   3      2  2  2  0  0  0
   2      2  2  2  2  0  0
   1      2  2  2  2  0  0
   0      2  2  2  2  2  0

b) n_22 = 4 (columns n_12 = 0..2):

n11\n12   0  1  2
   6      0  0  0
   5      0  0  0
   4      0  0  0
   3      0  0  0
   2      2  0  0
   1      2  0  0
   0      2  2  0
In all figures, '0' indicates that the system only accepts requests for class 1 applications, '1' that it only accepts requests for class 2 applications, and '2' that it accepts requests for both classes. As can be seen, when the weight of a class 2 application is low and plenty of them are currently being served in the system, the decision-making mechanism rejects new requests for class 2 applications (Figure 6.9) and thereby reserves the remaining
resources for class 1 applications. However, if the weight of a class 2 application or business process is high, the system has to accept more requests for that class of application (Figure 6.10). The results also show that if the number of class 2 applications in the system is high, the system should reject new requests for that class and reserve the free service instances for the other class of application.

Figure 6.10: Optimal policy when the system is in state (n_11, n_12, n_22) and γ = 0.3: a) n_22 = 1, b) n_22 = 4

a) n_22 = 1 (rows n_11 = 6..0, columns n_12 = 0..5):

n11\n12   0  1  2  3  4  5
   6      1  1  1  1  0  0
   5      2  2  2  2  0  0
   4      2  2  2  2  0  0
   3      2  2  2  2  2  0
   2      2  2  2  2  2  0
   1      2  2  2  2  2  0
   0      2  2  2  2  2  0

b) n_22 = 4 (columns n_12 = 0..2):

n11\n12   0  1  2
   6      1  0  0
   5      2  0  0
   4      2  0  0
   3      2  2  0
   2      2  2  0
   1      2  2  0
   0      2  2  0
We simulated the described system and compared the performance of the optimal MDP-based partitioning mechanism with two other policies: the Full Commitment Policy and the No Commitment Policy. For FCP, we use the Complete Partitioning (CP) mechanism [101]. In the CP method, the system keeps a constant number of service instances for each application class and does not allocate that portion to any other class of application. In our implementation of the CP method, we divided the resources based on the weight of each class.
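The weight-based division of instances in our CP implementation can be sketched as follows; the rounding rule (remainder handed to heavier classes first) is our assumption for illustration.

```python
# Sketch: divide the N instances of a shared service between application
# classes in proportion to their weights, as in a Complete Partitioning
# implementation; the remainder rule (heavier classes first) is assumed.
def partition_by_weight(N, weights):
    total = sum(weights)
    shares = [int(N * w / total) for w in weights]
    remainder = N - sum(shares)
    for idx in sorted(range(len(weights)), key=lambda i: -weights[i]):
        if remainder == 0:
            break
        shares[idx] += 1
        remainder -= 1
    return shares

# class 1 weight = alpha + beta = 0.4, class 2 weight = gamma = 0.1
print(partition_by_weight(6, [0.4, 0.1]))  # [5, 1]
```

With the weights of the sample system, the N = 6 instances of the shared service split into 5 instances reserved for class 1 and 1 for class 2.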
Figure 6.11 shows the comparison between these three methods for the case (α = −0.1, β = 0.5, γ = 0.1). The x-axis represents the mean request inter-arrival time λ_1^{-1}, with λ_1 = λ_2 and λ_1^{-1} varying from 8 to 60. The y-axis represents the system revenue or reward, i.e., the weighted sum of the number of applications currently being served in the system under each of the partitioning policies.
As can be seen, the MDP-based partitioning policy outperforms the other two mechanisms, especially when the request rates are high. When the request rate is low (the inter-arrival time is high), there is no significant difference between FCP, NCP and MDP-based partitioning. However, when the load is high, MDP-based partitioning yields 60% more reward than the No Commitment Policy and 30% more reward than the Full Commitment Policy.

Figure 6.11: Performance comparison between the No Commitment Policy, the Full Commitment Policy and the MDP-based partitioning mechanism (α = −0.1, β = 0.5, γ = 0.1)
Another experiment we carried out concerned the service execution time distribution. So far, we have used the exponential distribution for the service execution time; for some types of services, however, this assumption might not be accurate. The exponential distribution gives the problem the memoryless property, which lets us use the Markov Decision Process approach to obtain the optimal policy, and it is also helpful for studying the problem behavior in the mean sense.
Therefore, in this part we examine how effective the obtained policy remains when the service execution times follow another type of distribution. To do so, we assumed a Beta distribution for the execution time of each service instance.

Figure 6.12: A sample beta distribution

Figure 6.13: Performance comparison between the No Commitment Policy, the Full Commitment Policy and MDP-based partitioning with a beta distribution for service execution time (α = −0.1, β = 0.5, γ = 0.1)
The Beta distribution has some interesting properties that make it a good candidate for modeling many types of services and processes [104]. Figure 6.12 shows the beta probability density function we used for this experiment. As can be seen, with this distribution we can specify an optimistic estimate, a pessimistic estimate, and a most likely estimate of the service execution time.
Based on these assumptions, we replaced the execution times of both services with beta distributions having the same means as the exponential distributions, μ_1^{-1} and μ_2^{-1}. Figure 6.13 shows the result of this experiment. As can be seen, the optimal policy found for the exponential distribution achieves satisfactory results for the Beta distribution case as well. The beta parameters used for this experiment are α = 2.33, β = 4.66, m_1 = 30, m_2 = 40 (here α and β are the beta shape parameters, not the reward weights).
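Drawing beta-distributed execution times with a prescribed mean can be sketched as below. The scaling of the unit-interval beta sample to the time axis is our own choice and is not taken from the thesis, which fixes only the shape parameters and the means.

```python
import random

# Sketch: sample service execution times from a scaled beta distribution.
# Beta(2.33, 4.66) on [0, 1] has mean 2.33/(2.33 + 4.66) = 1/3, so
# multiplying samples by 3*mean yields execution times with the desired
# mean, supported on [0, 3*mean].  The scaling choice is an assumption.
def beta_service_time(mean, a=2.33, b=4.66):
    return 3.0 * mean * random.betavariate(a, b)

random.seed(0)
samples = [beta_service_time(30) for _ in range(100000)]
print(round(sum(samples) / len(samples), 1))  # empirical mean, close to 30
```

Matching only the mean is exactly what the experiment requires, since the optimal MDP policy was derived from mean rates.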
In this section, we presented optimal policies for making admission decisions in service-oriented systems. These policies, however, are suitable only for small-scale systems, since their computation becomes infeasible at large scale. Moreover, they apply to service-oriented systems exposed to stationary Poisson request arrival processes. In the next chapters, we extend this work and propose heuristics that operate in large-scale systems in a distributed fashion and handle both stationary and non-stationary demands.
Chapter 7
A Distributed Probabilistic
Commitment-Control Algorithm
In the previous chapter, we introduced the problem of optimal allocation of services to applications, and we proposed a Markov Decision Process [103] approach to solve it. We first studied the problem assuming that applications require all corresponding service instances throughout their lifetime [11], with exponential distributions for application request inter-arrival times and application execution times. We then addressed the case in which applications do not need all corresponding service instances throughout their lifetime [12], again assuming exponential distributions for inter-arrival times and service execution times. In this chapter, we propose an algorithm for the service commitment problem with the following desirable properties: the proposed heuristic does not restrict the distributions of service execution times or application request inter-arrival times to any specific type, it can be implemented in a distributed and scalable environment, and it can guarantee an important QoS parameter in a service-oriented environment.
A key challenge in application creation through service composition is to guarantee the
quality of service of the created applications [97, 105, 106, 107, 108]. Guaranteeing quality
of service in a service-oriented environment has increasingly received more attention as
new types of large-scale applications are built based on this new paradigm [109, 110, 111].
We consider the QoS guarantee problem in terms of an important QoS metric for composite applications: the probability of successful completion or, equivalently, its complement, the probability of failure. We propose a Distributed Algorithm for Service Commitment (DASC) that guarantees this QoS parameter. This algorithm can be part of a service-oriented system that orchestrates the execution of composite applications such as business workflows, telecommunication applications or mixed IT/telecommunication applications.
The orchestrator system's main task is to invoke different services according to the application's execution plan. Generally, each invoked service has a stochastic execution time [112]. To guarantee successful completion of an application, a service component provider has to provide a characterization of this stochastic behavior to the system, and the system has to consider this behavior when admitting requests for composite applications.
If a system overlooks these stochastic characteristics it may excessively invoke a ser-
vice, causing it to: serve an excessive number of application instances resulting in per-
formance degradation to applications in the system; refuse to serve some application
instances; or queue instances resulting in unwanted delays in application execution. To
avoid these undesirable events, the system includes an admission control mechanism to
control its commitments to the application instances and to guarantee the probability of
successful completion.
The design of an admission controller depends on the properties of the demand for
the applications. For example, if the demand is stationary (stationary arrival rate and
stationary service times) the admission control can be designed using off-line and steady-
state analyses. In this case, techniques and approximations such as decomposition-based
methods [113, 114, 115] can be used to find the acceptable region of request arrival rates, and admission mechanisms are then used to enforce arrivals using rate regulators.
If the demand is non-stationary, other techniques are required for admission control.
For example, for each arriving application request, an online admission controller can
calculate the likelihood that all service components of the application can be completed
given the current state of the system.
The DASC algorithm is designed to operate in non-stationary demand environments (namely with non-stationary request arrivals), and it uses a predictive model that delivers a target level of the probability of successful completion for admitted application instances. DASC does not assume any specific distribution type for application request inter-arrival times, and it is capable of functioning in a distributed and scalable environment. DASC does assume that the service execution time distributions in a service-oriented system are known and remain unchanged over time.

We present two versions of the DASC algorithm: one with no queuing permitted for application instances, and one, discussed later in this chapter, in which the system queues application instances instead of dropping them.
We present simulations that show that without a commitment control mechanism,
the successful application completion probability can be very low. We also show that our
algorithms are able to meet QoS goals. Moreover, we compare DASC performance with
alternative steady-state based admission controllers, and we show that DASC performs
better especially where demand is bursty and non-stationary.
The chapter is organized as follows. In the next three sections, we state the problem of service commitment in a service-oriented environment, discuss the mathematical basis of the problem, and present the corresponding modeling and formulation. In section 4, we present the DASC algorithm, followed by performance evaluation results in section 5.
We extend the proposed algorithm to systems that can provide a limited number of queuing spots for services. We present the modifications to the formulations and the performance evaluation results. Finally, we review the related work and discuss our contribution to this problem.

Figure 7.1: A sample service-oriented environment (two application classes sharing instances of service types S1 and S2)
Another issue that we do not discuss in this chapter is system revenue maximization. In this chapter, we assume that all application classes generate the same revenue for the system, and we are only interested in guaranteeing the quality of service. In the last chapter of this part, we propose techniques that accompany the DASC algorithm to maximize the system revenue by admitting more valuable application classes and rejecting less valuable ones.
7.1 QoS Control in a Service-Oriented System
We are interested in guaranteeing the probability of successful completion of an applica-
tion in a service-oriented environment. To clarify the problem we begin with a simple
example. Figure 7.1 shows a service-oriented environment in which two different ap-
plication classes use two different service types. In this example, application class one
begins by executing service one followed by service two. Application class two merely
uses service type two.
The problem of interest in this chapter arises when there are contending requests for shared services. For example, a high number of requests for application class two might
result in the consumption of all available service two instances. Consequently, application one instances completing service one would fail to continue their execution because no service two instances are available. Thus application one instances have to either leave the system without completing or face unwanted delays. To avoid this problem, a service-oriented system can use an admission control mechanism to control its service commitments by only admitting application instances when they are highly likely to complete successfully.
To design an admission controller for a service-oriented system, we need to consider
the environment in which it is operating. In a stationary environment with stationary
demand, we can design an admission controller using off-line steady-state based analyses
of the system. In such an environment, we can find the acceptable arrival rate region in
which the successful completion probability can be guaranteed in the steady-state. To
enforce operation within this acceptable region we can use token bucket regulators at the
portal to a service-oriented system. Although, to the best of our knowledge, there are no exact closed-form solutions for the associated Finite Capacity Queuing Network (FCQN), in general there are a variety of approximation- and decomposition-based [113, 114, 115], simulation-based [116] and bottleneck-analysis [117] methods that can be used to find the region of acceptable request arrival rates in steady state.
When demand is non-stationary we need a different approach that can make admission
decisions based on the current state of the system including the current application
instances that are being served in the system. The design of this type of system requires
a transient-state analysis of the system and involves on-line decision making based on
these transient-state analyses. In this chapter, we present DASC as an algorithm that is
able to handle non-stationary arrivals in a distributed and scalable environment.
In DASC, each service component has an agent that tracks service component usage as well as future commitments. When a new request for an application arrives at a service-oriented system (for example, at a service-oriented orchestration engine), DASC first queries all the corresponding service component agents in parallel. Each agent accepts or rejects the admission of that request based on the current state and its anticipation of future usage, and the admission controller then makes the final admission decision according to the agent responses. We will show that DASC achieves higher application throughput than the steady-state based approaches when the service-oriented system is exposed to bursty request arrivals, while still guaranteeing the application successful completion (or failure) probability.
DASC requires knowledge of the application execution plans. We assume that in a service-oriented environment the structure of an application, in terms of its logical service components and their inter-connections, is known. This is not an unreasonable assumption, since the service components involved in an application and the application execution flow are known in most SO-based applications. DASC also requires the probabilistic properties of service execution times; this allows DASC to anticipate an application's future service usage and commit the necessary service instances to each admitted application instance.
Figure 7.2: Composition Operations (sequential, conditional with branch probabilities p and 1 − p, parallel, and loop with l iterations)
To model a service-oriented system for the DASC algorithm, we first need some definitions. Assume that in a service-oriented system there are L different application classes A_i (i = 1, ..., L) and M different service types S_j (j = 1, ..., M), and that each service type has N_j instances. Let S denote the set of all service types, S = {S_j : 1 ≤ j ≤ M}, and let U_i ⊆ S be the set of services required for creating application i based on the composition function C_i(U_i).
The composition function C_i uses basic operators for service composition:

Definition 1) In a service-oriented environment, services can be composed using five types of operations (four of which are shown in Figure 7.2):

a) Sequential operation ⊗: S_j ⊗ S_k means that service S_k is executed after the completion of the execution of service S_j.

b) Conditional operation ©: S_j © S_k means that the system executes either service S_j or service S_k. We assume that the probability of choosing S_j is p_j and of choosing S_k is p_k, with p_j + p_k = 1.

c) Parallel operation ⊕: S_j ⊕ S_k means that services S_j and S_k are executed in parallel, and the output is not available until both services finish their execution.

d) Loop operation ⊗^l: ⊗^l S_j means that the system must execute l sequential iterations of service type j before continuing the execution of the application.

e) End operation ⊙ marks the end of execution.

The sequential operator ⊗ is also used as the fork and join operator together with the parallel and conditional operators.
We use these operators to analyze application instance execution times in terms of the individual service execution times, according to the application's execution plan.
Definition 2) An application execution path (or execution path) is the path that a given instance of an application follows, starting from one service type and ending with another. By definition, there are no conditional operators within one execution path of an application.

Definition 3) An application execution plan (or execution plan) is a plan that outlines all execution paths of an application, including all of its conditional operations, and describes the sequence of service executions of an application from start to finish.
Our goal in this chapter is to present an algorithm that can guarantee the probability of application completion. In other words, we would like to have:

P_{f_i} ≤ π   ∀i ∈ {1, ..., L}   (7.1)

where P_{f_i} is the probability of failure of one instance of application class i, and π is an agreed-upon threshold. Furthermore, this algorithm should be scalable and capable of operating in a distributed environment, and it should not involve excessive computation overhead.
7.2 Probabilistic Modeling of Service Commitment
In this section we consider random variables corresponding to the execution time of an application. For simplicity of notation, we assume that the service components appearing in the execution plan of an application are numbered S_1, S_2, ..., and that each such service appears only once in the plan, so that the corresponding execution times can be unambiguously denoted by X_1, X_2, .... The execution time of a service j instance is a random variable X_j with probability density function (pdf) f_j(t) and cumulative distribution function (cdf) F_j(t). The pdf of the application execution time along one path can be computed by combining the pdfs of the corresponding services in that path.
We begin with the sequential operator. The execution time of S_j ⊗ S_k is a random variable Y_⊗ equal to the sum of the two service execution times, Y_⊗ = X_j + X_k, with pdf f_{Y_⊗}(t) = f_j(t) ∗ f_k(t) (the convolution), assuming independence between the execution times and no waiting time between the execution of two consecutive services.
Similar to the sequential operation, the execution time of the loop operation on one service is the l-fold convolution of that service's execution time pdf. In other words, the execution time of ⊗^l S_j is Y_{⊗^l} = Σ_{c=1}^{l} X_{jc}, in which the X_{jc} are i.i.d. random variables with pdf f_j(t). Thus the pdf of the loop operation is f_{Y_{⊗^l}}(t) = f_j^{(l)}(t), the l-fold convolution of f_j(t).
The execution time of the parallel operator (S_j ⊕ S_k) is a random variable Y_⊕ equal to max(X_j, X_k), with pdf f_{Y_⊕}(t) = F_k(t) f_j(t) + F_j(t) f_k(t). The execution time of the conditional operator (S_j © S_k) is the random variable

Y_© = X_j with probability p_j, and X_k with probability p_k (p_j + p_k = 1),

with pdf f_{Y_©}(t) = p_j f_j(t) + p_k f_k(t).
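The operator pdfs above can be checked numerically on a discretised time grid; the grid, the exponential example pdfs, and the NumPy representation are our illustrative choices, not values from the thesis.

```python
import numpy as np

# Sketch: combine service execution-time pdfs on a uniform grid (step dt)
# according to the composition operators, using two exponential pdfs as
# hypothetical example services.
dt = 0.01
t = np.arange(0, 20, dt)
f_j = np.exp(-t)
f_j /= f_j.sum() * dt              # Exp(1) for S_j, renormalised on grid
f_k = 2.0 * np.exp(-2.0 * t)
f_k /= f_k.sum() * dt              # Exp(2) for S_k, renormalised on grid

# sequential S_j (*) S_k: convolution of the two pdfs
f_seq = np.convolve(f_j, f_k)[:len(t)] * dt
# parallel S_j (+) S_k: pdf of max(X_j, X_k) = F_k f_j + F_j f_k
F_j, F_k = np.cumsum(f_j) * dt, np.cumsum(f_k) * dt
f_par = F_k * f_j + F_j * f_k
# conditional S_j (c) S_k with p_j = 0.3: mixture of the pdfs
f_cond = 0.3 * f_j + 0.7 * f_k

# each composed pdf should integrate to approximately one
print([round(float(f.sum() * dt), 1) for f in (f_seq, f_par, f_cond)])
```

The integral check (each result summing to about one) is a quick way to catch errors in the operator formulas before using them in elapsed-time computations.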
For an instance of application i, suppose we start service j at time zero and consider the time until we complete a subsequent service k. Let h_ijk(t) denote the pdf of this elapsed time and let H_ijk(t) be the corresponding cdf. Now suppose we are interested in the probability that, having started service j at time zero, the application execution is in service k at time t. Let m be the service that precedes service k in the execution plan; then the application execution is in service k at time t if (1) it has completed service m by time t, and (2) it has not yet completed service k by time t. This implies that the probability in question is given by:

G_ijk(t) = H_ijm(t) − H_ijk(t)   (7.2)

See Appendix C for a derivation of this result. Also, the probability that application i, having just started service j, is still at the same service at time t is simply G_ijj(t) = 1 − H_ijj(t) = 1 − F_j(t).
Similarly, we can compute G_ijk(t|t_0), the probability that an application class i instance that has already been in service j for t_0 seconds will be at service k at time t, by replacing the pdf f_j(t) with its conditional counterpart f_j(t|t_0) = f_j(t − t_0)/(1 − F_j(t_0)).
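Equation 7.2 can likewise be evaluated numerically. In this sketch (the distributions and grid are our choices, not thesis values), service j is itself the predecessor m of service k, so H_ijm is the cdf of X_j and H_ijk is the cdf of X_j + X_k.

```python
import numpy as np

# Sketch: evaluate G_ijk(t) = H_ijm(t) - H_ijk(t) numerically, with
# service j (Exp(1)) as the predecessor m of service k (Exp(0.5)).
dt = 0.01
t = np.arange(0, 30, dt)
f_j = np.exp(-t)
f_j /= f_j.sum() * dt                 # pdf of X_j on the grid
f_k = 0.5 * np.exp(-0.5 * t)
f_k /= f_k.sum() * dt                 # pdf of X_k on the grid

h_m = f_j                                   # elapsed time to finish j
h_k = np.convolve(f_j, f_k)[:len(t)] * dt   # elapsed time to finish j then k
H_m, H_k = np.cumsum(h_m) * dt, np.cumsum(h_k) * dt
G_jk = H_m - H_k   # probability of being in service k at time t

print(round(float(G_jk[int(5 / dt)]), 2))  # ~0.15: P(in service k at t = 5)
```

For this pair of exponentials, G has the closed form 2e^{-t/2} − 2e^{-t}, which the grid computation reproduces; the closed form exists only because of the exponential choice, while the numerical route works for arbitrary pdfs.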
Using G_ijk(t), we can now characterize the random variable for the number of busy instances of any service at any given future time t. Define an indicator for the event that the application execution will be in service k at time t, having started at service j at time zero:

I_ijk(t) = 1 with probability G_ijk(t), and 0 with probability 1 − G_ijk(t).   (7.3)

For applications with multiple execution paths, the probabilities in the above indicator must be multiplied by p_ijk, the probability that application i, now in service type j, will visit service type k in the future according to its execution plan. Similarly, for application instances that have already been in service j for t_0 seconds, the probabilities in the indicator (7.3) are replaced with their conditional versions G_ijk(t|t_0).
The number of busy service instances of type k at time t is found by adding the indicators of all application instances in the system of any class i (indexed by l) that are being served by some service j at time 0 and may therefore be at service type k at time t:

S_k(t) = Σ_l I_{i_l jk}(t)   (7.4)
We can now specify the probability of over-commitment in service type k at a future
time t (Pock(t)), that is, that the number of admitted applications needing service of type
k at time t exceeds the number Nk of service instances that have been provisioned. In
other words, Pock(t) is the probability that service type k is over-committed at a future
time t due to admission of too many applications:
Pock(t) = P{Sk(t) > Nk}    (7.5)
In the DASC algorithm, the system computes this probability at the time of re-
ceiving a request for an application to ensure that the system is highly likely to have
the necessary free instances to serve the application instance in each of the succeeding
service types along the application execution paths for the time needed. Furthermore,
DASC needs to compute the above probability for any future time t in order to meet an
agreed service level (Toc). Pock is a major parameter in our algorithm, and we discuss
its computation in later sections. In contrast to other admission control systems, the
incoming request rate is not a factor in computing this probability. We only require that
the service execution time distributions remain unchanged over time. For this reason our
proposed algorithm can operate in systems with bursty or non-stationary request arrivals,
and can handle transient surges in applications request rate without compromising the
service-level agreements.
We note that in this model, not only can different service components have different
service execution time distributions, but a single service type can also have different
execution time distributions for different application classes. However, in the rest of this
chapter, for the sake of simplicity, we assume that each service component has only one
execution time distribution for all application classes.
Consider the application failure probability which is the probability that an appli-
cation instance cannot complete its execution plan due to the unavailability of a free
instance of service type k at the time that the application needs the service k. We call
this probability Pfijk, and it can be computed as follows:

Pfijk = ∫_0^∞ Pock(t) hijm(t) dt    (7.6)
in which hijm(t) is the pdf of the time to complete the execution of all services preceding
service type k (up to service m), or equivalently, the pdf for the start of the execution of
service type k.
The DASC system keeps Pock(t) always below an over-commitment threshold Toc, so
an upper bound for the application failure probability is the over-commitment threshold:
Pfijk ≤ Toc    (7.7)
In other words, Toc is the upper bound for an application class i failure probability
at service k, if it starts its execution from service j. Consequently, to find the upper
bound of total application failure probability, we need to consider failure probabilities at
all services based on the application execution plan, as follows:
Pfij = ∑_{k=j+1}^{l} ( Pfijk ∏_{m=j+1}^{k−1} (1 − Pfijm) )    (7.8)
where we assume that the last possible service is service l. Each term in the above sum
is the probability that the application execution fails at service k. By taking partial
derivatives of the above equation, it can be shown that Pfij is a monotonically increasing
function of Pfijk. Therefore, an upper bound for Pfij can be obtained by applying the
upper bound Toc for Pfijk:

Pfij ≤ 1 − (1 − Toc)^(l−j)    (7.9)

in which (l − j) represents the maximum number of services that an application i instance
has to traverse to finalize its execution.
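Bound (7.9) is simple enough to evaluate directly. The following sketch (helper names are ours) computes the bound and inverts it to choose a per-service threshold Toc for a desired total failure probability; the 7-service path length in the usage note is an illustrative assumption, not a value from the text:

```python
def total_failure_bound(t_oc, path_length):
    """Eq. (7.9): Pfij <= 1 - (1 - Toc)^(l - j),
    where path_length = l - j is the number of remaining services."""
    return 1.0 - (1.0 - t_oc) ** path_length

def required_toc(target_failure, path_length):
    """Invert Eq. (7.9): the largest per-service threshold Toc whose
    bound still keeps the total failure probability at the target."""
    return 1.0 - (1.0 - target_failure) ** (1.0 / path_length)
```

For example, a total failure target of 10^-2 over a hypothetical 7-service path gives a per-service threshold of roughly 1.4 × 10^-3, the same order as the 1.5 × 10^-3 threshold used later in Section 7.5.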
In the next section, we focus on the over-commitment probability, and to compute
this probability we use the Central Limit Theorem (CLT).
7.3 Computing Over-Commitment Probability
The random variable in (7.3) is a Bernoulli random variable at time t, and Sk(t) in (7.4)
is the sum of multiple non-identically distributed Bernoulli random variables. Therefore,
we can compute the mean and variance of the indicator function as:
E[Iijk(t)] = Gijk(t)
VAR[Iijk(t)] = Gijk(t) − Gijk(t)^2
And consequently the mean and variance of the sum random variable will be:

ηk(t) = E[Sk(t)] = ∑_l E[Iiljk(t)] = ∑_l Giljk(t)    (7.10)

σk(t)^2 = VAR[Sk(t)] = ∑_l VAR[Iiljk(t)] + ∑_l ∑_l′ COV(Iiljk(t), Iml′nk(t))    (7.11)
in which l and l′ denote the instances of applications currently in the system.
Now imagine that there is an unlimited number of servers available to support
each service type. If so, the application instances will all flow along their execution
paths without having to contend with each other for servers, and so they will not
interact at all. Consequently, their corresponding indicator functions are independent
random variables. Because the algorithm keeps the over-commitment probability small,
the number of servers of the various types is effectively ample. Therefore we assume that
the Bernoulli random variables in equation (7.4) are independent, the above covariance
terms are zero, and the variance of the sum random variable is:
σk(t)^2 = ∑_l VAR[Iiljk(t)] = ∑_l Giljk(t) − ∑_l (Giljk(t))^2 = ηk(t) − ∑_l (Giljk(t))^2    (7.12)
We know from the Central Limit Theorem (CLT) [118] (p. 278) that the sum of n
independent random variables approaches a Gaussian random variable whose mean and
variance equal the sums of the means and variances of the individual random variables,
respectively. Therefore, the over-commitment probability can be approximated using the
CLT as:
Pock(t) = P{Sk(t) > Nk} = 1 − Φ((Nk − ηk(t)) / σk(t))    (7.13)
in which Φ is the cdf of a standard Gaussian random variable (mean η = 0 and
variance σ^2 = 1). The approximation of the over-commitment probability via the
Central Limit Theorem becomes more accurate as the number of application instances
in the system and the number of service instances grow large, which is the case for many
real systems.
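Under the independence assumption, computing Pock(t) at a given future time t reduces to summing the Giljk(t) terms of equations (7.10) and (7.12) and evaluating the Gaussian tail of (7.13). A minimal sketch (the names are ours, not from the thesis):

```python
import math

def over_commitment_prob(G_values, N_k):
    """CLT approximation of Pock(t) = P{Sk(t) > Nk}, Eqs. (7.10)-(7.13).

    G_values: the Giljk(t) value of every application instance l currently
    in the system, evaluated at the future time t of interest."""
    eta = sum(G_values)                          # Eq. 7.10: mean of Sk(t)
    var = sum(g * (1.0 - g) for g in G_values)   # Eq. 7.12: independent Bernoullis
    if var == 0.0:
        return 0.0 if eta <= N_k else 1.0
    z = (N_k - eta) / math.sqrt(var)
    # 1 - Phi(z), with Phi the standard Gaussian cdf
    return 1.0 - 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

For instance, 500 in-flight instances each with Giljk(t) = 0.2 and Nk = 120 provisioned instances give a tail probability of roughly 0.013, so such a request would pass a threshold Toc = 0.1 but fail Toc = 0.01.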
In summary, whenever an application request enters the service-oriented environ-
ment, the application admission control system computes the probability of over-
commitment, at any time during the application's lifetime, for every service along its
execution path; if that probability is less than the permitted threshold (Toc), it
allows the application request to enter the system.
Another technique for computing the over-commitment probability is to use the theory
of large deviations and Chernoff’s bound. Chernoff’s bound enables us to find better
approximations of the probability if the target threshold Toc is very small, while the CLT-
based technique gives us good approximations of the probability for the target thresholds
in the range of 10−3 and higher. In the Appendix, we discuss this alternative method for
Figure 7.3: A service-oriented system with three agents, each controlling one service type
computing this probability using the theory of large deviations and Chernoff's bound. In
the next section, we present the Distributed Algorithm for Service Commitment in more
detail, and we discuss how this system can be implemented in a distributed environment.
7.4 Distributed Algorithm for Service Commitment
Figure 7.3 shows the decentralized implementation of the service commitment function.
Each service type is controlled by one agent. The task of each agent is to monitor the
instances of one service type. Whenever the agent starts serving one application instance,
it informs the agents responsible for succeeding service types that it has just started the
execution of an application instance. The recipient agents have to store the relevant
information regarding each particular application instance and use it to compute their
over commitment probabilities every time a request for an application arrives. In other
words, the agent for service type k computes the parameters for the random variable
Sk(t) for all t in future.
In this distributed algorithm, when the admission controller receives a request for an
Figure 7.4: Distributed Algorithm for Service Commitment in SDL (Specification and Description Language)
application, it asks the corresponding agents whether they will have enough resources
to serve that application during the period in which the application is anticipated to
be served by their associated service types. Since all agents keep records of the
applications that are likely to use their service type, they can answer the query with a
'yes' or a 'no'. If the replies are all yes, the admission controller admits
the application, and it tells the corresponding agents to commit the necessary resources for
the just-admitted application instance.
It is noteworthy that the agents do not need to compute the relevant distribution
functions each time they receive a query from their preceding agents. Those distributions
can be provided to each agent by another computing module in the system, and the agents
can store them in memory and use them as the need arises. Also, to avoid over-
commitment, a queried agent that accepts to serve the application can temporarily
commit its resources for a limited time, until it receives another message from the
admission controller confirming the acceptance or rejection of the application request.
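The query/temporary-commit/confirm exchange described above resembles a two-phase commit. The following sketch captures that control flow; the class and method names are hypothetical, and each agent's per-request over-commitment probability is taken as a precomputed input rather than evaluated from (7.13):

```python
class Agent:
    """One agent per service type; holds temporary commitments per request."""

    def __init__(self, over_commit_prob, threshold):
        self.p_oc = over_commit_prob   # precomputed max Pock(t) for a request
        self.threshold = threshold     # Toc for this service type
        self.temp_commits = set()

    def check_over_commitment(self, request_id):
        """Reply 'yes'/'no'; on 'yes', temporarily commit resources."""
        if self.p_oc < self.threshold:
            self.temp_commits.add(request_id)
            return True
        return False

    def confirm(self, request_id):
        """Admission controller accepted: make the commitment permanent."""
        self.temp_commits.discard(request_id)

    def release(self, request_id):
        """Admission controller rejected: undo the temporary commitment."""
        self.temp_commits.discard(request_id)

def admit(request_id, agents):
    """Admission controller: admit only if every agent on the path says yes."""
    replies = [a.check_over_commitment(request_id) for a in agents]
    if all(replies):
        for a in agents:
            a.confirm(request_id)
        return True
    for a, ok in zip(agents, replies):
        if ok:
            a.release(request_id)      # roll back temporary commitments
    return False
```

A single 'no' reply rolls back every temporary commitment, so no resources stay reserved for a rejected request.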
The SDL (Specification and Description Language) [119] diagram depicted in Figure 7.4
presents the DASC algorithm. It shows that the admission controller queries the corre-
sponding agents upon receiving a request for admitting an application instance. The
SDL diagram shows the messages exchanged between the admission controller and the
agents responsible for the services used in creating an application, as well as the agents'
internal states and their interactions in DASC.
7.4.1 DASC Complexity Analysis
DASC is a distributed algorithm in which each agent is responsible for controlling one
service type. Therefore, for complexity analysis, we focus on one agent and we investigate
its processing and memory requirements.
If we represent the maximum lifetime of the longest living application in the system
by T, then the memory needed for storing the future estimate of instance usage for a
service type is O(T). Further, an agent has to store some information for each
application instance in the system that might use its service in the future. If we denote
the maximum number of application instances in the system by Na, then the memory for
storing application-specific data is O(Na). So, in total, each agent needs O(T + Na)
memory to store the data required by the algorithm.
In addition, for each incoming request, each agent has to compute the over-commitment
probability over the maximum lifetime of the longest-living application in the system.
Therefore, the processing complexity for each agent is O(T).
Since the algorithm is distributed, we need to analyze the communication overhead
in the system as well. In DASC, the admission controller has to query the corresponding
agents in order to admit a request, so the admission overhead is O(K), where K is
the maximum number of services in the system. Also, as the application proceeds with
its execution, each agent is required to notify the succeeding agents of the latest change
in the instance's location. Therefore, the total communication overhead for an admitted
application instance is O(K(K − 1)).
For example, in a system with 12 service components (presented in the performance
evaluation section), when a request enters the system the admission controller commu-
nicates with 12 other agents (in the worst case) to make an admission decision. These 12
messages are sent and processed in parallel, and each combined communication and com-
putation takes less than 10 ms; hence the total decision-making time is under 10 ms. We
believe that for many systems and applications this decision-making time is quite
acceptable. In addition, in this system, agents need a maximum of 500 KB of memory each.
The number of exchanged messages could be reduced if the bottleneck services in a
system are identified using an off-line analysis, and only the agents responsible for those
services are queried when making a decision. In some systems, this reduction would be
Figure 7.5: Beta pdf for service execution time with parameters α = 2.333 and β = 4.666
significant if only a small portion of the services are bottleneck services.
7.5 DASC Performance Evaluation
In this section, we present the performance evaluation results for our proposed algorithm
for two different systems. The performance metric of interest is the applications failure
ratio. Application failure ratio is the ratio of the number of failed applications to the
number of applications admitted to the system. We would like this ratio to be less than
the threshold set for the failure probability. We also evaluate the application failure
ratios in each of the service types. Moreover, we compare the DASC algorithm against
steady-state based admission control systems in terms of application failure ratio as well
as applications throughput.
We begin by simulating the simple system described in the first section and depicted
in Figure 7.1, which is composed of two service types and two application classes. We
assume that the service provisioning has been performed and 100 instances for each
Figure 7.6: Application failure ratio for a system with two application classes and two service types
of service types one and two have been provisioned. We also assumed identical Beta
distributions for the service execution times of both services (Figure 7.5). We chose the
Beta distribution since it can represent many pdf shapes and hence is useful in modeling
many types of services [104]. The Beta pdf parameters used in our experiment are
α = 2.333 and β = 4.666.
For generating application request inter-arrival times, we used a geometric distri-
bution with parameter p ranging from 0.01 to 0.1 in 0.01 steps. Figure 7.6 shows the
application class one failure ratio at service type two for four different cases. In the first
case there is no commitment control in place; for the other three cases we applied
the DASC algorithm with thresholds of 0.5, 0.1, and 0.01. It is evident that without the
DASC algorithm the performance is very poor, i.e., more than 50% application failure at
high request rates. However, by applying the DASC algorithm, the system can achieve
its target QoS, even when the request rate is high.
We compared the DASC performance with an alternative steady-state based admis-
sion control mechanism, designed using a bottleneck analysis of the system [117]. In
this method, we identify the bottleneck service (S2 in our system), and
Figure 7.7: Comparing DASC throughput with the bottleneck-based admission control algorithm: a) geometric arrivals; b) on/off bursty arrivals
we approximate the system performance by the bottleneck service performance. At the
bottleneck service, we used the Erlang-B formula to find the acceptable region of arrival
rates to the system, in the steady state, that contains the probability of overflow at S2 at
the target levels of 10−2 and 10−3. Then we use token regulators on the request arrival
processes to enforce the acceptable arrival rates.
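The Erlang-B blocking probability used for this baseline can be computed with the standard recursion; the sketch below is a generic implementation, not code from the thesis:

```python
def erlang_b(servers, offered_load):
    """Erlang-B blocking probability B(N, a) via the standard recursion:
    B(0, a) = 1,  B(n, a) = a * B(n-1, a) / (n + a * B(n-1, a))."""
    b = 1.0
    for n in range(1, servers + 1):
        b = offered_load * b / (n + offered_load * b)
    return b
```

Sweeping the offered load until erlang_b(N, a) reaches the target level (e.g. 10^-2) yields the admissible arrival-rate region that the token regulators then enforce.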
We compared the throughput of a system controlled using Erlang-B and a system
controlled by the DASC algorithm, for geometric request arrivals (Figure 7.7a) and
on-off bursty request arrivals (Figure 7.7b). In the bursty case, we generated a burst
of request arrivals using a geometric distribution with parameter 0.01 for a period of
T, followed by another burst of arrivals with parameter 0.1 for a period of T. Figure 7.7b
shows the applications throughput of the on-off bursty arrival process based on the value
of T . It can be seen that DASC outperforms the steady-state based admission control
system in both stationary and bursty arrival cases in terms of applications throughput
while it can meet the target QoS. This improvement is more significant in the bursty
case since DASC makes the admission decision based on the current state of the system,
and its anticipation of future usage.
Figure 7.8: A service-oriented environment consisting of twelve service types and three applications
Figure 7.9: Applications failure ratios in the system
Our simulations also show that the choice of target failure probability affects system
throughput. The lower we set this target, the more conservative the system becomes in
admitting application requests, which reduces the application throughput.
Also, as we increase the number of service instances, the approximations become more
accurate, mainly because the CLT approximation improves.
Next we simulated the more complex system depicted in Figure 7.8 which consists
of twelve service types and three application classes. The applications in this system
have sequential, conditional and parallel operations and application class two has one
loop operation. Again, we assume that the provisioning has been performed and 200
instances of each service type have been provisioned. For the service execution times, we
assumed an identical Beta distribution for all twelve service types. We set the threshold
for the total application failure to 10−2 and used bound (7.9) to set the threshold for the
over-commitment probability of each service type (Toc) to 1.5 × 10−3.
The parameters that we evaluated in this simulation are the total application failure
ratio, and the applications failure ratios in each service type separately. Moreover, we
compared DASC performance on this system with three other admission control mecha-
Figure 7.10: Failure ratios in services 1 to 6 vs. applications request rates
Figure 7.11: Comparison between four admission control mechanisms with stationary request arrivals: a) total application failure ratios; b) applications served successfully
Figure 7.12: Comparison between four admission control mechanisms with on-off bursty request arrivals with burst time (T): a) applications served successfully (ratio); b) application failure (ratio)
nisms. The simulation period consisted of 750000 time units. For generating application
requests, we used geometric distributions with parameter p ranging from 0.01 to 0.1. In
this sample service-oriented environment, our analysis shows that the bottleneck service
is S3. Therefore, we computed the required parameters for different values of p ranging
from 0.01 to 0.1 in 0.01 steps, which covers a low request rate up to a request rate that
loads the system with twice its provisioned capacity at the bottleneck service.
Figure 7.9 shows that even under very high request rates the total application failure
using DASC remains under the guaranteed level of 10−2. We also measured the individ-
ual application class failure ratios at each service component. Figure 7.10 shows these
measured failure ratios at service S1 to S6. These measured ratios are all below the
target threshold (1.5 ∗ 10−3) even under very high request rates.
We also compared the DASC performance against three other admission controllers
with both stationary and non-stationary request arrivals. Two of the admission con-
trollers are token bucket regulators that enforce an acceptable region of arrival rates on
the arrival process. In one of these, the acceptable region is obtained using the bottleneck
analysis of the system, and applying the Erlang-B formula as described before. The other
admission controller uses simulation-based techniques to find the best arrival rates in the
steady-state to maximize the throughput while keeping the failure probability less than
the target threshold of 10−2. The third controller does not apply any admission control
on the arrival process, and admits requests for applications if there exists a free instance
of the first service component of the composed application in its execution plan.
Figure 7.11 shows the measured applications throughput and failure ratios for the
stationary arrivals based on the request rate, and Figure 7.12 shows these parameters
for the on-off bursty request arrivals based on the burst period (T). DASC out-
performs the other mechanisms in both cases in terms of total throughput, and is able to
meet the QoS target. With stationary arrivals and high request rates, this improve-
ment is approximately 20% compared to the next best method (simulation-based). Note
that the bottleneck approach is overly conservative and provides lower throughputs and
very low application failure ratios. The improvement becomes much more visible with
non-stationary request arrivals. This is mainly because DASC is able to take advantage
of "openings" through transient-state analysis of the system. The DASC throughput is
higher than the other methods when the burst period is large, and it achieves comparable
throughput to the no-commitment algorithm when the burst period is small, while still
meeting the target QoS. Interestingly, due to the transient-analysis property of the DASC
algorithm, at some low burst periods DASC can find more openings, and hence achieve
a higher throughput compared to other low burst periods, while still keeping the failure
probability below the threshold.
7.6 Queue-enabled Distributed Algorithm for Service Commitment
In the previous sections, we presented the Distributed Algorithm for Service Commit-
ment (DASC) as an application admission control mechanism for service-oriented envi-
ronments that is able to guarantee the probability of successful completion for admitted
application instances. So far, we assumed that the system is not allowed to queue appli-
cation instances, and if an application instance finds no free instance of a service at the
time it needs that service, the application instance leaves the system. In this section, we
modify our algorithm so that a service offers a small number of queuing spaces to mitigate
application failures. We allow queuing, but keep its usage under an agreed level.
The number of required queuing spaces in a DASC-controlled queue-enabled system is
very small compared to the number of service instances, since the DASC algorithm keeps
the probability of over-commitment very low. For instance, if we assume that the threshold
for the probability of over-commitment is Toc and the total number of instances of a service
type is N, then we roughly need at least TocN queuing spaces to mitigate the application
failures. Considering the fact that Toc is usually very low, the number of queuing spaces
is significantly smaller than the total number of service instances.
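The TocN sizing rule amounts to a one-liner; with the thresholds used earlier in this chapter (Toc = 1.5 × 10^-3, N = 200 instances) it suggests that a single queuing space per service type already covers the expected over-commitments, consistent with the small queue sizes examined later. A sketch (the helper name is ours):

```python
import math

def queue_spaces_needed(t_oc, num_instances):
    """Rough sizing rule from the text: about Toc * N queuing spaces are
    needed to absorb over-commitments; round up and keep at least one."""
    return max(1, math.ceil(t_oc * num_instances))
```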
This section is organized as follows: In the next subsection, we present the modifica-
tions to the formulations to accommodate the queuing capability in the system. Then,
we discuss the extensions to the distributed algorithm that enable it to make the admission
decision in a distributed environment. This subsection is followed by the performance
evaluation section.
In the appendix, we present a set of theorems and corollaries that are used in obtain-
ing the required parameters in the Queue-enabled DASC algorithm (Q-DASC) and are
referred to in the formulation and algorithm section.
7.6.1 Problem Formulation and Description
To start applying this extension to DASC, we need to present a brief analytical description
of some parts of the DASC algorithm that we need to modify.
One of the main parameters in DASC is Gijk(t), the probability that an application
instance of class i, having just started execution of service type j, will be at service k at
time t, as formulated in (7.2). We need to consider the effect of adding a queue on this
parameter.
Assume that an application instance i arrives at service j and finds itself at the qth
spot in the queue (1st spot being the head of queue), and assume that there will be no
further queuing for that application instance along its way to service k, then we have:
hqijk(t) = gqj(t) ∗ hijk(t)
hqijm(t) = gqj(t) ∗ hijm(t)    (7.14)
in which gqj(t) is the pdf of the Time to Enter Service (TES) for the queued application
instance.
By replacing hijk(t) with its queue-enabled representation hqijk(t), we have:

Gqijk(t) = Hqijm(t) − Hqijk(t)    (7.15)

in which Hqijk(t) is the cdf of hqijk(t).
Similarly, the probability that the application i which just joined the qth spot in
service j's queue is still in the queue or is executing service j at time t is:
Gqijj(t) = 1 − Hqijj(t).
Finding a closed form for this distribution in the general case is quite difficult and im-
practical. In the appendix, we present a series of results that are used in finding a lower
bound for this TES distribution for the queued instances. In queue-enabled systems, we
use this bound to compute the required over-commitment probabilities.
In the Q-DASC algorithm, the agent responsible for the queue computes the TES
mean (ηqj) and variance (σqj^2) for the queued application instance using the results in
the last section (Theorem 3 and Corollary 3). Then it reports these parameters to the
succeeding agents. The succeeding agents, in turn, compute the convolutions in (7.14)
using the received parameters, assuming that the TES distribution is a Normal
distribution with parameters (ηqj, σqj^2), and apply them to (7.15) to update their
future resource usage:

hqijk(t) = N(t; ηqj, σqj^2) ∗ hijk(t)    (7.16)
If σqj^2 is much less than the variance of hijk(t), the above Normal distribution, in
comparison to the hijk(t) distribution, can be treated as a delta function centered at ηqj,
i.e., δ(t − ηqj). This can easily be shown using a frequency-domain analysis of the above
distributions. In this case, the equations in (7.14) and (7.15) become:

hqijk(t) = gqj(t) ∗ hijk(t) ≈ hijk(t − ηqj), ∀t > ηqj
Gqijk(t) = Hqijm(t) − Hqijk(t) ≈ Hijm(t − ηqj) − Hijk(t − ηqj), ∀t > ηqj    (7.17)

As can be seen, in this case the future estimates are simply shifted versions of the
estimates used in the queue-less DASC algorithm.
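The delta-function shortcut in (7.17) can be checked numerically: convolving hijk with a narrow Normal TES pdf is nearly indistinguishable from shifting hijk by ηqj. A small sketch (the function name, grid, and parameter values are illustrative, not from the thesis):

```python
import numpy as np

def shifted_estimate(h, eta_q, dt):
    """Eq. (7.17) approximation: gqj * hijk(t) ~= hijk(t - eta_q) when the
    TES variance is small; realized here as a simple index shift."""
    shift = int(round(eta_q / dt))
    out = np.zeros_like(h)
    out[shift:] = h[: len(h) - shift]
    return out
```

Comparing this shifted estimate against the exact convolution with a narrow Normal TES pdf (e.g. mean 2.0, standard deviation 0.05 on a 0.01-step grid) shows a pointwise discrepancy well below one percent of the pdf peak.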
7.6.2 Q-DASC Performance Evaluation
In order to evaluate Q-DASC performance and examine its effect on the quality of service,
we simulated the complex system introduced in Figure 7.8.
We first assume that all service types have an ample number of queuing spaces, and we
Figure 7.13: Applications queuing probability with an ample number of queuing spaces using the Q-DASC algorithm
measured the applications queuing ratio instead of the applications failure ratio. The
queuing ratio is measured by dividing the number of queued application instances by the
total number of admitted application instances. In particular, we wanted to check whether
the proposed Q-DASC algorithm would operate below a target queuing ratio.
In another simulation we assume that services only have a few queuing spots, and we
determine whether these few spots would translate to lower applications failure ratios.
The reason for this experiment is to show that the system rarely needs to queue the appli-
cation instances since the commitment control mechanism restrains the over-commitment
probability.
In this section, we assume service component characteristics identical to those de-
scribed in Section 7.5, and we also use the geometric distribution based request
arrival generator as well as the thresholds described in Section 7.5.
Our first measurement is the application queuing ratio. The target level for this ratio
is 10^{-2}. Figure 7.13 depicts the queuing ratio for all three application classes based on
the applications request rates, assuming that there is an ample number of queuing spaces.
Clearly, the Q-DASC algorithm keeps the queuing ratio under the threshold even when
Figure 7.14: Applications failure probability based on queue size in the Q-DASC algorithm
the offered load to the system is very high.
Figure 7.14 shows the effect of the number of queuing spots on the applications failure
probability. In this simulation, we assumed that the requests for applications follow a geometric
distribution with parameter p equal to 0.1, which offers a load to the system that is almost
twice the system's capacity. It is evident that by adding a few queuing spaces for the
over-committed application instances, we can significantly reduce the applications failure
ratio, in comparison to a queue-less system, even when the offered load is very high.
7.7 Related work
One of the main issues in service-oriented systems that has been extensively studied is the
problem of QoS-aware service composition. This problem deals with the cases where each
service component has a specific set of QoS parameters and an overall QoS constraint
has to be met for a composite application [98, 97, 105, 106, 107, 108]. Among the papers
discussing this problem, we can mention [98], in which the authors formulated the
problem as a linear programming problem, and [105], in which the authors proposed
heuristics for optimal service composition considering general distributions for services.
We consider our work an extension of these works, since we guarantee successful
completion of an application according to the service-level agreements.
The probabilistic nature of service execution and its influence on contracts between
service providers and application providers have also been studied in [112], in which the
authors have argued that instead of contracts that are based on hard bounds, probability
distributions can be used in soft contracts between web service providers and their clients.
In the first chapter of this part of the thesis, the problem of service allocation in
service-oriented environments was introduced, and the optimal solution of the prob-
lem using Markov Decision Processes [103] was presented. The computation of optimal
admission and allocation policies using MDPs has some limitations for real large-scale
systems, especially due to the problem of state-space explosion and the assumptions on ex-
ecution distributions. In this chapter, however, we extended this work and proposed
algorithms for guaranteeing quality of service in service-oriented systems.
In addition to the area of application creation through service composition, the work
in this chapter touches on other research fields. For example, in the operations research
field, we can point to relevant research in admission control to a network of loss queues.
For example in [120], the authors proposed optimal solutions for admission control to two
queues in tandem, assuming only two user classes and exponential distributions. In [121],
the authors extended the work to multiple queues in tandem and presented a heuristic
algorithm as well. However, guaranteeing QoS is not a concern in their work.
In queuing theory, there has been a vast amount of research on analyzing queuing
network performance metrics [116]. While there are many types of queuing networks, very
few have exact analytical solutions for performance parameters [116, 122, 115, 114], and
many approximation techniques have been proposed to find approximate performance
metrics (especially throughput) [113, 123, 124, 125, 126, 127]. We modeled our problem
as an open Finite Capacity Queuing Network (FCQN) with limited or no waiting spaces
and loss [113]. These networks do not have closed-form solutions [114], and generally
approximations are used to analyze their performance metrics in steady state. For
example in [113], the authors have presented a technique based on queuing network
analyzer [123] that approximates the throughput and expected waiting time. The authors
have found that the approximations are more accurate under light and moderate load,
and they become less accurate when the system is under heavy load. This and other methods
are based on decomposing the networks into individual queues. We direct interested
readers to [114] for a complete survey on these methods.
The inclusion of fork-join queues in a network makes its analysis more complicated.
In fact, for fork-join queues exact analytical results only exist for the mean response time
of a two server system [128, 129]. Although these types of queues can be seen in many
applications, these queues have not received much research attention because they are very
difficult to analyze. For example in [126], the authors have proposed an approximation
technique for an open queuing network with fork-join queues and normal queues, but they
have assumed a blocking type of open FCQN composed of M/M/C/K queues.
Another decomposition-based approximation method is the bottleneck analysis dis-
cussed in [117] and [130]. In this approach, the bottleneck queue is determined, and
through its analysis, approximations for the network of queues can be obtained.
To the best of our knowledge, our work is the first that has used a probabilistic approach
to control admission in an open FCQN with losses that can guarantee the loss probability
for systems with non-stationary request arrival processes.
Admission to networks of queues has also been studied by the telecommunication
research community in the context of admission to wireless networks. A comprehensive
survey of this field can be found in [131]. The context of our problem, however, is different
since we are dealing with composing multiple services and creating new applications. In
addition, while assuming exponential distribution for calls in wireless networks seems
reasonable, it would not be an accurate assumption in service-oriented environments.
Moreover, parallel and loop operations in service composition do not have a match in the
CAC problem of wireless cellular networks.
The closest wireless CAC algorithm to our problem is introduced in [132], in which
the authors have also used a convolution-based approach to predict the future resource
usage in the cellular network. However, in that paper the authors have stopped short
of analytically computing the call dropping probabilities in the way we formulated the
over-commitment and application failure probabilities.
Other prediction-based papers in the field of CAC in wireless networks involve us-
ing linear predictors and Wiener-process based predictors for future resource usage [133],
which basically anticipate the future based on the past. However, in our problem, by
utilizing the knowledge of the execution plans and service execution times, we can ana-
lytically anticipate the future resource usage in a much more accurate way.
In the real-time operating systems field, there are numerous articles on scheduling
and admission control mechanisms for real-time tasks [134, 135]. Recently the focus
has been more on tasks and jobs that have probabilistic execution times [136, 137].
For example, the authors in [137], in addition to presenting a survey of the relevant
publications, describe a technique for computing the probability of missed deadlines
on a monoprocessor real-time operating system. In [138], the authors have approximated
the task execution distribution by Coxian distributions of exponentials and performed
the schedulability analysis for a multiprocessor real-time application. Although our work
in this chapter is presented for a system that orchestrates the execution of applications,
mainly at the application level in a distributed and service-oriented environment,
variations of this modeling could also be applied to schedulability analysis in real-time,
large-scale, distributed multiprocessor systems.
In the next chapter, we study another issue in making admission decisions in service-
oriented systems, namely the problem of system revenue maximization. The system rev-
enue in service-oriented systems can be maximized by admitting more valuable applica-
tion classes to the system and rejecting less valuable ones considering the applications
request arrival rates. We also present an application admission control system that com-
bines the DASC algorithm and the reward-based admission controller to maximize the
system revenue as well as to guarantee QoS.
Chapter 8
Application Admission Control System
In a service-oriented environment, service instances are allocated to composite applica-
tions so that the required performance is provided. Application admission control can
be used to ensure that appropriate amounts of instances are committed to applications,
given the revenue each application brings to the system and the system's current com-
mitment. The techniques described so far are able to control the over-commitment and
failure probabilities and guarantee the application success probability. However, they do
not address the issue of maximizing the system's overall revenue.
In this chapter, we extend our study by proposing an application admission control
system for service-oriented environments. The proposed system makes the admission
decision in two steps. Upon receiving a request for an application, in the first step the
system checks, according to its current commitments, whether it can guarantee the
target QoS in terms of the probability of successful completion. This check is called the
feasibility check part of the admission control system that uses Distributed Algorithm
for Service Commitment (DASC) [14] described in the previous chapter.
In the second step, a revenue maximization unit is used to maximize the system
Application Admission Control System 166
revenue by accepting more valuable applications to the system. For this unit, we propose
two approaches. The first approach is a steady-state based revenue maximization method
that is simple to implement but does not capture the transient state of the system.
The second approach is a more sophisticated method based on online optimization
techniques, which itself consists of three sub-blocks. The main sub-block is an online
optimizer block that solves a binary integer programming problem to maximize system
revenue. The proposed approaches in this chapter are different from the MDP-based
solutions proposed in Chapter 6 in that they avoid the exponential service execution
times and request inter-arrival assumptions of the MDP-based methods.
In this chapter, we first state the problem of reward-based admission. The steady-
state based approach is discussed next, and the online optimization approach to application
admission control and its main blocks are discussed in Section 8.3. These blocks are the
feasibility check block, scenario generator block, online optimizer block, and the final
decision maker. The binary integer programming problem is formulated in this section
as well. Lastly, we present the performance evaluation and comparison results.
8.1 Problem Statement
Assume a service-oriented environment in which there are different service types, and
where different applications can be created by composing sets of different service types.
Each instance of an application requires each of its service types during part of the appli-
cation lifetime. Service instances can be used by other application instances as soon as
they become idle.
Figure 8.1 shows an example system with 3 types of applications and 3 service types:
Application 1 is composed of service types 1, 2 and 3; application 2 is composed of service
types 2 and 3; and application 3 is composed solely of service type 3.
In the example, application 1 first executes service type 1, and then executes service
Figure 8.1: A sample service-oriented environment
type 2, and finally it goes to the last service type. Similarly, application 2 executes service
type 2 followed by service type 3.
Multiple applications can contend for the same service, and we suppose that each
application brings a different reward to the system. For example if application 2 brings
a low reward to the system while applications 1 and 3 bring higher rewards, then the
system should avoid over-committing service type 3 to application 2 at the expense of
applications 1 and 3. Application admission control entails regulating the admission of
applications so that application requirements are met while system revenue is maximized.
In the previous chapter, we proposed a distributed heuristic algorithm for the prob-
lem of service-commitment in service-oriented systems called DASC [15]. The DASC
algorithm makes sure that the system delivers a guaranteed level of quality of service in
terms of success probability for each accepted application instance.
Another aspect of the problem of application admission control is to ensure maxi-
mization of the system revenue. In this chapter, two revenue maximization methods are
proposed: one is a steady-state based method and the other is an online optimization-
based method that maximizes the system revenue by solving a linear programming prob-
lem. In the next section, we first describe the steady-state based method for revenue
maximization.
8.2 Steady-State Based Application Admission Control System
In this section, we study the revenue maximization problem in service-oriented systems in
steady-state. Our goal is to obtain a set of admission parameters for application classes
to maximize the system overall revenue.
Assume that there is a service-oriented environment with M different service types and
L different application classes in which the reward for serving an instance of application
class i is Ri. In this system, the probability that an application class i instance uses a
service type k instance during its execution is pik.
We assume that the incoming request process for application class i is a renewal process
[139] with mean 1/λi. In other words, its interarrival time follows a general distribution
with mean 1/λi. Also we assume that the execution time of service type k follows a
general distribution with mean mk. From [139], we can find the expected number of
application i instances being served at service type k in the steady state as pikλimk if
there were no limitation on the number of service instances in the system. If we sum the expected
values for all application classes, then the expected number of busy service instances of
type k in the steady-state would be:
\sum_{i=1}^{L} p_{ik}\, \lambda_i\, m_k \qquad (8.1)
On the other hand, since there is a finite number of service instances of each service
type in the system, we define an admission control parameter zi for application class
i indicating the portion of requests for class i applications that can enter the system.
Therefore, we define a linear programming problem for finding the optimum values of the
zi that maximize the system's overall reward in the steady state, considering the
limitations on the number of service instances (Nk) in all service types as follows:
\max \;\; \sum_{i=1}^{L} (\lambda_i R_i)\, z_i \qquad (8.2)

\text{s.t.} \quad m_k \sum_{i=1}^{L} (\lambda_i p_{ik})\, z_i \le N_k, \quad \forall k \in \{1, 2, \ldots, M\}

z_i \in [0, 1], \quad \forall i \in \{1, 2, \ldots, L\}
The optimum values achieved by solving the above linear programming problem are
used by our proposed service-oriented system to control the incoming request rate for
each application class entering the system. This control can be enforced using token
bucket mechanisms that regulate the admission rate of application classes to the system.
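The token bucket regulation mentioned above can be sketched as follows; the class arrival rate, admission fraction, and bucket capacity are illustrative values, not parameters from the thesis experiments.

```python
# Hedged sketch: enforcing an admission fraction z_i for one application
# class with a token bucket that refills at rate z_i * lambda_i.
import random

class TokenBucket:
    """Admit a request only if a whole token is available; tokens refill
    continuously at the configured rate, capped at the bucket capacity."""
    def __init__(self, rate, capacity):
        self.rate, self.capacity = rate, capacity
        self.tokens = capacity
        self.last = 0.0

    def admit(self, now):
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

random.seed(1)
lam, z = 0.1, 0.5                       # arrival rate and LP admission fraction
bucket = TokenBucket(rate=z * lam, capacity=2)
t, admitted, total = 0.0, 0, 5000
for _ in range(total):
    t += random.expovariate(lam)        # illustrative Poisson-like arrivals
    admitted += bucket.admit(t)
print("admitted fraction:", admitted / total)   # close to z = 0.5
```

In the long run the bucket admits roughly the fraction z of requests, which is the behavior the optimizer's output is meant to enforce.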
The rate control parameters required for this algorithm are calculated in an optimizer
module. If the request arrival process is stationary with a known arrival rate for each
application, these parameters can be provided to the module manually. On the other
hand, automatic rate measurement techniques can be used when the application request
arrival processes are not stationary; in this case, the optimizer can recalculate these
parameters every time the incoming request rates change.
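As an illustration of the optimizer module, for the special case of a single service type (M = 1) the LP in (8.2) reduces to a fractional knapsack that a simple greedy rule solves exactly; for M > 1 a general-purpose LP solver would be used instead. All rates, rewards, and capacities below are made-up values.

```python
# Greedy solution of (8.2) for the special case M = 1: maximize
# sum(lam_i * R_i * z_i) subject to m * sum(lam_i * p_i * z_i) <= N,
# z_i in [0, 1]. Classes are filled in order of reward per unit of
# expected capacity consumed, R_i / (p_i * m), which is optimal here.

def admission_fractions(lams, rewards, probs, m, N):
    z = [0.0] * len(lams)
    budget = N
    order = sorted(range(len(lams)),
                   key=lambda i: rewards[i] / (probs[i] * m), reverse=True)
    for i in order:
        cost = m * lams[i] * probs[i]          # capacity used if z_i = 1
        z[i] = min(1.0, budget / cost)
        budget = max(0.0, budget - z[i] * cost)
    return z

lams    = [0.01, 0.02, 0.015]    # illustrative request rates per class
rewards = [0.4, 0.2, 0.8]        # reward per served instance
probs   = [1.0, 1.0, 1.0]        # probability of using the service type
m, N    = 1333, 20               # mean execution time, service instances

z = admission_fractions(lams, rewards, probs, m, N)
used = m * sum(l * p * zi for l, p, zi in zip(lams, probs, z))
print("z =", [round(v, 3) for v in z], "expected busy instances:", round(used, 2))
```

The greedy fills the most valuable class first (here class 3, reward 0.8) and throttles the rest once the expected number of busy instances reaches N.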
It is important to note that in the steady-state method, the reward-based admission
block is totally separate from the commitment block: if an application request passes
the reward-based admission control mechanism, it still needs to pass the service commitment
checks in order to enter the system. This makes the implementation of the system
simple, and as will be shown in the performance evaluation section, it can
achieve acceptable performance results.
In the next section, we describe another revenue maximization approach using online
optimization techniques. Although this alternative method is more complicated than the
steady-state method, it can better capture the transient states of the system and
make better admission decisions in those states.
8.3 Online Optimization-based Application Admission Control System
If we assume exponential distributions for the service execution times and for the appli-
cations' request inter-arrival times, finding the optimal solution for the problem leads us
to solving a dynamic programming problem using Markov Decision Processes, which
we studied in [12]. However, in the general distribution case, the search for the optimal
solution involves solving a multi-stage stochastic programming problem [140].
In a multi-stage stochastic programming problem, in contrast to a deterministic pro-
gramming problem, we try to find optimal decisions at each stage, considering the stochas-
tic nature of the problem and uncertainty about future events. In our case, for instance,
whenever a request enters the service-oriented system, we would like to know whether
we should accept the request and gain its corresponding reward, or wait for later request
arrivals for other more valuable application classes. The uncertainty in this problem is
the time and type of the future request arrivals, and the execution times of the corre-
sponding service types. In a multi-stage stochastic problem the decisions should be made
when a request for an application arrives to the system, while somehow accounting for
the uncertainty in future stages when future requests might come and leave.
Due to the enormous number of uncertainties, this approach to finding optimal so-
lutions for application admission control in service-oriented systems becomes compu-
tationally intensive and infeasible for real systems, especially for systems that
require online decision making.
Another approach to this problem, which we follow in this section, is a heuristic tech-
nique that finds near-optimal decisions using online optimization approaches [140]. In
the online optimization approach, we try to find the best decisions for accepting or rejecting a
Figure 8.2: Application Admission Control System using Online Optimization
request for an application class as requests arrive to the system, in an online manner.
As described in [140], online stochastic combinatorial optimization approaches have
been used to solve many different decision making problems such as scheduling and re-
source allocation. For example in [141], the authors have studied an online optimization
technique for the problem of admission control to a media-on-demand system.
The online optimization approach for our problem consists of finding the optimal
decision for some sample scenarios of the system trajectory instead of finding the opti-
mal decision that could be achieved from a computation-intensive multi-stage stochastic
programming problem. In particular, our online optimization approach considers a few
sample scenarios up to a finite horizon, and finds the optimal decisions for those scenar-
ios, instead of considering all the uncertainties in events that may occur in the future. To
do so, we have to consider a few factors, such as the number and status of the application
instances that are already being served in the system as well as the number of available
service instances.
An additional important factor in making the decision is the reward that each ap-
plication class instance brings to the system. The system has to make a decision on
either accepting the newly arrived request, or waiting for future more valuable requests.
Two other important factors in the decision making are the time between request arrivals
for different application classes, and the time for executing services for each application
class.
Therefore, we propose the following algorithm for the problem of application admis-
sion control in service-oriented systems using the online optimization approach, and we
elaborate more on each of the following steps later in this section.
1) Upon receiving a request for an application class, we check the feasibility of ac-
cepting the request, and we reject the request if accepting it is not feasible.
2) We generate some scenarios for the possible system trajectory in the future.
3) We find the optimal decision of either accepting the newly arrived request for the
application or rejecting it in each generated scenario.
4) We make the final decision of accepting or rejecting the request based on the output
of each decision making process in step 3.
Figure 8.2 shows the block diagram of this proposed algorithm.
8.3.1 Feasibility Check
The feasibility check function in our algorithm evaluates the system's current commitments
to the application instances that are already being served, in order to guarantee an agreed level
of quality of service. To do so, we use our previously proposed algorithm for the service
commitment. In the previous chapter [14], we stated the problem of service commitment
in service-oriented systems, and we proposed a distributed algorithm for this problem
called DASC. In DASC, we define a threshold for the over-commitment probability, and
we keep the over-commitment probability under this threshold by rejecting the requests
for application classes that might push this probability above the threshold. By doing
so, we guarantee an agreed level of application success probability for the admitted
application instances.
In our admission control system, we utilize the DASC algorithm to check the feasibility
of admitting an application instance. It is important to note that by feasibility, we mean
guaranteeing the agreed level of success for all of the application instances that are already
being served, as well as for the newly arrived request. If this check shows that we cannot
deliver the guaranteed level, we reject the request immediately; otherwise we
proceed to the next step of the algorithm.
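The feasibility gate described above can be sketched schematically. The helper `overcommit_probability` below is a made-up placeholder standing in for the convolution-based DASC computation of the previous chapter, not the actual algorithm; only the threshold logic is the point of the sketch.

```python
# Schematic of the feasibility-check gate: accept only if the estimated
# over-commitment probability stays under the agreed threshold.
THRESHOLD = 0.01   # agreed over-commitment probability bound (as in DASC)

def overcommit_probability(commitments, new_request):
    # PLACEHOLDER: in the real system this value comes from the DASC
    # convolution-based estimates of future service usage.
    return (sum(commitments) + new_request) / 100.0

def feasible(commitments, new_request):
    """True if admitting the new request keeps the probability bounded."""
    return overcommit_probability(commitments, new_request) <= THRESHOLD

print(feasible([0.2, 0.3], 0.4))   # within the threshold -> accepted
print(feasible([0.5, 0.4], 0.4))   # exceeds the threshold -> rejected
```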
8.3.2 Scenario Generation
The second step in our online optimization approach is to generate a number of sample
scenarios for the system trajectory. These scenarios consist of scenarios for the application
instances that are currently in the system as well as scenarios for applications that arrive
in the future.
The scenario generating mechanism identifies exact times for service executions as
well as the execution path of the application. For instance, if an application class 1
instance in Figure 8.1 is currently being served in service type 1, the scenario generator
would tell us that this instance finishes execution of that service at time unit 700, continues
to service type 2 and finishes executing it at time unit 2500, and then starts
executing service type 3 and leaves the system at time unit 4200. As in this example, in
each additional scenario, an exact timing is assigned to each of these transitions and also
the exact execution path is specified.
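A minimal sketch of such a scenario generator follows, assuming illustrative uniform execution times and the execution paths of Figure 8.1; the real generator would sample the distributions and path probabilities estimated from historical data.

```python
# Hedged sketch of a scenario generator: sample exact service-completion
# times along an application's execution path up to a finite horizon.
import random

random.seed(7)
HORIZON = 6000                      # finite horizon (e.g. a max app lifetime)
PATHS = {1: ["S1", "S2", "S3"],     # execution paths from Figure 8.1
         2: ["S2", "S3"],
         3: ["S3"]}

def sample_exec_time():
    # stand-in for the per-service execution-time distribution
    return random.uniform(1000, 2000)

def generate_scenario(app_class, arrival_time):
    """Return [(service, enter_time, leave_time), ...] for one instance."""
    events, t = [], arrival_time
    for svc in PATHS[app_class]:
        done = t + sample_exec_time()
        if done > HORIZON:
            break                   # truncate the scenario at the horizon
        events.append((svc, t, done))
        t = done
    return events

scenario = generate_scenario(1, arrival_time=0.0)
for svc, start, end in scenario:
    print(f"{svc}: {start:.0f} -> {end:.0f}")
```

Each call produces one concrete timeline of transitions, exactly the kind of event series the online optimizer consumes.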
To generate these scenarios, we can use the distributions of execution times for
each service type, the distributions of request inter-arrival times, and the probabilities
associated with choosing each service and consequently the applications' execution paths.
We have discussed these distributions and probabilities in the previous chapters [12, 14].
An approach to obtain the required distributions is to use the system’s historical data.
To do so, the historical data of the system’s activity has to be recorded and analyzed to
find the required distributions and probabilities. In the rest of this chapter, we assume
that these distributions are already available to the online admission control system,
and the scenario generator block uses these distributions for generating scenarios for the
current application instances and future arrivals.
Another issue in generating scenarios is specifying a finite horizon for these scenarios.
In other words, we have to decide how far into the future of the system we want to look
when generating scenarios. To some extent, assuming a finite horizon for generating scenarios
resembles defining a sliding window in discrete-time signal processing systems: in those
systems, a limited window of the accumulated data is used for processing,
and decisions are made based on the windowed data.
There are various factors in determining the horizon, such as the storage and pro-
cessing limitations of each system. Application lifetimes are another important factor
in determining the horizon. Therefore, the decision on the length of this horizon can be
made by the system designers based on each system's resources and scale. The approach
we take in this thesis is to assume a finite horizon based on the maximum lifetime
of the application classes.
After generating the required scenarios, we are ready to proceed to the next step of
our online optimization approach that is discussed in the next subsection.
8.3.3 Optimal Admission Decisions For Generated Scenarios
The online optimizer in our proposed system is responsible for finding the optimal decision
for accepting or rejecting the requests in each scenario. To do so, we formulate a linear
programming problem.
In our online optimizer, the reward for serving an application request r is represented
by w(r), and the decision for accepting or rejecting that request is represented by a(r),
which takes one of two values: 0 for rejecting and 1 for accepting the request r.
There are also K different service types in the system, S = {sj, (1 ≤ j ≤ K)}, where
each has N(sj) instances. The scenario generator block produces a series of events and
the timings associated to each event for the online optimization block, as described in
the previous section.
Based on these definitions, we can define the following Binary Integer Programming
(BIP) problem for finding the optimal admission decision for each scenario:
\max \; W = \sum_{r \in R} w(r)\, a(r), \quad a(r) \in \{0, 1\} \qquad (8.3)

\text{s.t.} \quad \sum_{r \in R} e_r(s_j, t_e)\, a(r) \le N(s_j), \quad \forall t_e \in T, \; s_j \in S \qquad (8.4)

T = \bigcup_{r} T_{e_r}, \qquad S = \{s_j, \; (1 \le j \le K)\}
in which R represents the set of all requests for applications, including the newly arrived
request, represented by r = 0. Also, T_{e_r} is the set of all event times associated with one
particular request r, and e_r is the execution path of request r, both provided by the
scenario generator block.
The objective of this binary integer programming problem is to maximize the sys-
tem reward W by accepting or rejecting each request in the generated scenario. This
maximization is subject to the service capacities at the time of each transition in the
applications' execution paths. This constraint is evaluated by considering er(sj, te), which
indicates whether request r is in service type j at transition time te. The set of
all these transition times, T, is produced by the scenario generation mech-
anism for each scenario. The formulated BIP finds the optimal admission decision for all
requests in one scenario. However, our main concern is a(0), which shows the decision for
accepting or rejecting the newly arrived request in that particular scenario.
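The BIP in (8.3)-(8.4) can be illustrated on a toy scenario by brute-force enumeration of the accept/reject vectors; branch and bound or a BIP solver (such as lp_solve) replaces this in practice, and the request data below are invented for illustration.

```python
# Toy brute-force solution of the BIP in (8.3)-(8.4): enumerate every
# a(r) in {0,1}^|R| and keep the feasible vector with maximal reward W.
from itertools import product

# requests[r] maps (service, event_time) -> occupancy e_r(s_j, t_e)
# in the generated scenario; r = 0 is the newly arrived request.
requests = {
    0: {("S3", 10): 1, ("S3", 20): 1},
    1: {("S2", 10): 1, ("S3", 20): 1},
    2: {("S3", 10): 1, ("S3", 20): 1},
}
w = {0: 0.8, 1: 0.2, 2: 0.4}             # rewards w(r)
capacity = {"S2": 1, "S3": 1}            # N(s_j), deliberately tight

def feasible(a):
    load = {}
    for r, on in a.items():
        if not on:
            continue
        for key, use in requests[r].items():
            load[key] = load.get(key, 0) + use
    return all(v <= capacity[s] for (s, _), v in load.items())

best, best_a = -1.0, None
for bits in product([0, 1], repeat=len(requests)):
    a = dict(zip(requests, bits))
    if feasible(a):
        W = sum(w[r] * a[r] for r in requests)
        if W > best:
            best, best_a = W, a
print("W* =", best, "a(0) =", best_a[0])   # the high-reward request 0 wins
```

Here requests 0, 1, and 2 all contend for the single S3 instance at time 20, so the optimizer admits only the most valuable one and reports a(0) = 1.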
The above integer programming problem can be solved efficiently using techniques
such as branch and bound. The two main outputs of this step are a(0) and the max-
imum achievable reward (W ) that are fed into the next step of our proposed online
admission control system, explained in the next subsection. It is important to note that
this optimization problem is solved for each scenario; hence the number of scenarios that
can be evaluated is limited by the available time for making the decision, as well as the
time required for solving the stated BIP problem given the available processing power.
Therefore, in the general case, the number of scenarios, and consequently the number of opti-
mizations, will be determined by the system designers based on the system's specifications
and resources.
8.3.4 Final Decision Making
In the previous subsection, we found the optimal decision for accepting or rejecting the
request for an application in each scenario. The next step is to make the final admission
decision. To make this decision, the outputs of the online optimizer block (i.e., a(0) and
W for each scenario) are fed to the final decision maker block. Based on these obtained
parameters, we can adopt one of the following approaches for making the admission
decision:
1) Voting: accept the request if the majority of the optimal decisions for the generated
scenarios are in favor of accepting the request. This approach is similar to a voting
mechanism in which a decision is made when the majority of the voters agree with the
decision.
2) Conservative: accept the request if all of the decisions are in favor of accepting the
request.
3) Greedy: accept the request if at least one of the decisions is in favor of accepting
the request.
4) Maximum reward: accept the request if the total reward gained by accepting the
request is more than the total reward gained by rejecting the request.
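The four rules above can be sketched as follows. The per-scenario optimizer outputs are made up, and the maximum-reward rule is interpreted here as comparing the total reward over scenarios that favor accepting against the total over scenarios that favor rejecting, which is one plausible reading of rule 4.

```python
# Sketch of the final decision maker: combine per-scenario optimizer
# outputs (a0 = optimal decision for the new request, W = scenario reward)
# under the four combination rules described above.

def decide(results, rule):
    """results: list of (a0, W) pairs, one per generated scenario."""
    accepts = [W for a0, W in results if a0 == 1]
    rejects = [W for a0, W in results if a0 == 0]
    if rule == "voting":
        return len(accepts) > len(rejects)
    if rule == "conservative":
        return len(rejects) == 0
    if rule == "greedy":
        return len(accepts) > 0
    if rule == "max_reward":
        return sum(accepts) > sum(rejects)
    raise ValueError(rule)

results = [(1, 2.0), (0, 1.5), (1, 1.8)]   # illustrative scenario outputs
for rule in ("voting", "conservative", "greedy", "max_reward"):
    print(rule, "->", decide(results, rule))
```

With two of three scenarios favoring acceptance, voting, greedy, and maximum-reward accept while the conservative rule rejects.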
8.4 Performance Evaluation
To evaluate the performance of the proposed algorithm, we simulated the system depicted
in Figure 8.1. We wrote a C++ program, and for solving the linear programming problem
we used an open-source library called lpsolve [142].
We set the number of instances of each service type (i.e. N1, N2, and N3) to 20. For
the service execution times, we assumed a beta distribution [118] for all three service
types with parameters α = 2.333, β = 4.666, an optimistic value of 1000, a pessimistic
value of 2000, and a mean of 1333 time units.
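The scaled beta distribution above can be sampled with the standard library; the sketch below only verifies that rescaling the beta variate to the optimistic/pessimistic range yields a mean near 1333 time units.

```python
# Sampling service execution times: beta(2.333, 4.666) rescaled to the
# [1000, 2000] range, giving mean 1000 + 1000 * a/(a+b) ~ 1333 time units.
import random

random.seed(42)
ALPHA, BETA = 2.333, 4.666
LO, HI = 1000.0, 2000.0

def service_time():
    return LO + (HI - LO) * random.betavariate(ALPHA, BETA)

samples = [service_time() for _ in range(100000)]
mean = sum(samples) / len(samples)
print("empirical mean:", round(mean, 1))   # close to 1333
```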
The simulation period in our simulation is 180000 time units, which is 30 times the
maximum lifetime of an application 1 instance. We also chose similar geometric
distributions for the request inter-arrival times for all three classes of applications, with
parameter p ranging from 0.002 to 0.010. We also assumed the following rewards for
successful termination of each application instance: 0.4 for an application class 1 instance,
0.2 for an application 2, and 0.8 for an application 3. The penalties for unsuccessful
termination of applications are 0.3 for an application 1 instance failed in service 2, 0.8
for an application 1 instance failed in service 3, and 0.4 for an application 2 instance
failed in service 3. No cost is associated with rejecting a request for an application, and
rejected requests will leave the system and will not interfere with the system in future.
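The arrival and reward parameters above can be captured in a small configuration sketch. The dictionary names and the inverse-transform sampler are illustrative, and the per-class p value here is just one point from the stated 0.002 to 0.010 range:

```python
import math
import random

# One sample point from the stated range p ∈ [0.002, 0.010]; the same p is
# used for all three classes here purely for illustration.
ARRIVAL_P = {1: 0.005, 2: 0.005, 3: 0.005}

# Rewards for successful termination and penalties for failures, as listed.
REWARD = {1: 0.4, 2: 0.2, 3: 0.8}
PENALTY = {(1, "service2"): 0.3, (1, "service3"): 0.8, (2, "service3"): 0.4}

def next_interarrival(app_class, rng=random):
    """Geometric inter-arrival time on {1, 2, ...} with mean 1/p,
    drawn via inverse-transform sampling."""
    p = ARRIVAL_P[app_class]
    u = rng.random()
    return max(1, math.ceil(math.log1p(-u) / math.log1p(-p)))
```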
We evaluated the system performance using four different techniques. The first is the No Commitment Policy (NCP), in which the system neither tries to maximize the system revenue nor avoids over-commitments.
For the second mechanism, we used only the DASC algorithm, which guarantees the quality of service but does not address reward maximization. We set the threshold parameter in this algorithm to 1%, meaning that the system guarantees a 99% success probability for admitted application requests.
For the third technique, we used the steady-state based application admission control algorithm [14]. This technique is based on a steady-state analysis of the system,
Figure 8.3: System reward for four different techniques
and uses a linear-programming technique to obtain admission regulation parameters that maximize the system revenue in the steady state.
For the fourth mechanism, we used the online optimization-based system composed of the feasibility check block, the scenario-generating block, the online-optimization block, and the decision-making block. For the feasibility check block, we used the DASC algorithm with a threshold parameter of 1%. For the scenario generation block, we generated three different scenarios for a period twice as long as the lifetime of the longest-living application (i.e. application class 1). As mentioned earlier, we used the lpsolve library [142] to solve the binary integer programming problem, and for decision making we used the voting mechanism.
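A toy end-to-end sketch of this fourth mechanism is shown below. It substitutes a brute-force search for the lpsolve binary integer program and a trivial random scenario generator for the thesis's forecast model, so it only illustrates how the blocks are wired together; all names and parameter values are our own:

```python
import itertools
import random

N_SCENARIOS, HORIZON = 3, 8          # illustrative sizes, not the thesis's
REWARD = {1: 0.4, 2: 0.2, 3: 0.8}    # rewards per application class

def generate_scenario(rng):
    """Scenario-generator stand-in: a hypothetical sequence of future
    request classes over the optimization horizon."""
    return [rng.choice([1, 2, 3]) for _ in range(HORIZON)]

def best_accept_mask(requests, capacity):
    """Per-scenario optimizer stand-in: brute-force the accept/reject
    vector maximizing total reward under the instance budget (the thesis
    solves this as a binary integer program with lpsolve)."""
    best_value, best_mask = -1.0, None
    for mask in itertools.product([0, 1], repeat=len(requests)):
        if sum(mask) <= capacity:
            value = sum(REWARD[c] for c, m in zip(requests, mask) if m)
            if value > best_value:
                best_value, best_mask = value, mask
    return best_mask

def admit(pending_class, free_instances, seed=0):
    """Final decision maker using the voting rule: admit iff a majority of
    scenarios' optimal solutions accept the pending request (index 0)."""
    rng = random.Random(seed)
    votes = 0
    for _ in range(N_SCENARIOS):
        requests = [pending_class] + generate_scenario(rng)
        votes += best_accept_mask(requests, free_instances)[0]
    return votes > N_SCENARIOS / 2
```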
Figure 8.3 shows the system revenue over the simulation period for these four mechanisms. Figure 8.4, on the other hand, shows the failure probability for application classes 1 and 2 on a logarithmic scale.
As can be seen, the NCP technique's performance is acceptable when the system
Figure 8.4: Application 1 and application 2 failure rates based on the application request rate
is lightly loaded, but the system revenue degrades drastically as the system's load increases. Moreover, the application failure results for this technique are extremely poor, given its best-effort nature.
The second observation concerns the performance of the DASC algorithm when it is the sole mechanism in place. As can be seen, although the application failure probability stays under the threshold, the system revenue does not improve under this algorithm.
The steady-state based admission control algorithm, combined with the DASC algorithm, performs better than the two previous techniques. As can be observed, the system revenue increases using this combination and, at the same time, the required quality of service can be delivered.
However, the online optimization approach outperforms the steady-state based technique, especially when the system is not heavily overloaded. The main reason for this improvement is the ability of the online optimization technique to capture the transient conditions of the system, as opposed to the steady-state based admission control technique, which relies on the system's steady-state conditions. This is confirmed by the fact that the performance improvements mainly occur when the system is not heavily overloaded and there is still potential in the system for revenue maximization. As the system becomes overloaded, its capacity saturates, and therefore both the steady-state based technique and the online optimization based system perform well.
Chapter 9
Conclusions
Future networks should cope with the challenges imposed by emerging and future generations of applications; otherwise, the range and scope of applications over future networks will be limited by the design choices of the past. In this thesis, we studied the requirements of future networks and applications, and we addressed various challenges in future networks by proposing an architecture, a network research testbed, and scalable and distributed QoS control algorithms.
9.1 Contributions
While most present proposals for future network architectures have been designed to address the requirements of a particular class of applications, we have taken the research on future network architectures a step further by proposing an application-oriented network architecture as a configurable converged communication and computing network. Based on this new network architecture, we designed a Virtualized Application Networking Infrastructure that enables networking researchers to experiment with new network architectures and distributed applications. We have also proposed a novel scalable and distributed QoS and admission control algorithm for Service-Oriented systems and for Finite Capacity Queuing Networks in general. Overall, the contributions of this thesis can be listed as follows:
9.1.1 Application-Oriented Networking
We proposed a novel network architecture, called an Application-Oriented Network architecture, that addresses challenges posed by future network applications, such as configurability and application-awareness, and facilitates application creation through virtualization of processing, storage, reprogrammable hardware, and software resources that are commonly used in application creation. We proposed a three-plane architecture for AON comprising
a control plane, a management plane, and an application plane. Applications are able to
configure the resources in the application plane to satisfy their own requirements. These
resources are virtualized computing, storage, hardware and software resources, and other
resources and functionalities needed for rapid application creation.
A multiplicity of applications can coexist over the same shared virtualized infrastructure in the AON application plane, which is managed and controlled by the other two AON planes: AON management and AON control. The latter is responsible for control-related functions such as allocation and release of resources as well as failure recovery operations, while the former is responsible for management-related functions such as monitoring, provisioning, re-provisioning, and long-term fault management. We also proposed an architecture for applications in the application plane that has three main characteristics: a two-layer (service and transport) architecture, a service-oriented service layer, and a transport layer that provides content and data delivery.
The proposed architecture can be helpful for a diverse range of applications that require responsiveness, reliability, security, smart caching, and efficient content broadcasting/multicasting. Mobile networks can also utilize the processing and storage capabilities embedded in the architecture to perform smart and adaptive content conversion and distribution to mobile nodes that experience hand-offs as well as temporary disconnections.
9.1.2 Virtualized Application Networking Infrastructure
In this thesis, we presented the Virtualized Application Networking Infrastructure (VANI) as a networking research testbed that allows experimentation with new networked systems and distributed applications. Compared to other networking research testbeds, VANI utilizes a service-oriented control and management plane that provides flexible and dynamic allocation, release, programming, and configuration of resources used for performing large-scale experiments in a wide area network from layer three up. VANI resources in the application plane allow development of network architectures that require a converged network of computing and communications resources, including in-network processing and storage.
Another main contribution in VANI is the introduction of a reprogrammable hardware resource that can be allocated on-demand to experiments that require high-performance and high-throughput computing. This resource is designed based on virtualization of hardware resources, in particular FPGAs, and provides well-defined interfaces for researchers to program and configure it. Through experimentation and measurements, we showed that the reprogrammable hardware resource can be programmed rapidly and can achieve very high throughput using its 16x10GE interfaces.
VANI also allows registration of new hardware and software resources in the control and management plane. This facilitates experimentation, since researchers can set up new experiments rapidly using the available service components developed independently by other researchers.
VANI is, in essence, a prototype of our proposed Application-Oriented Network architecture and a proof of concept to showcase how the proposed AON concepts can be realized and how distributed applications and new network architectures can be built on such a network.
Another major contribution of this study was the design and development of DETS, a novel system to shape and regulate Ethernet traffic in VANI as well as in a computing cluster or a datacenter. The DETS system is required where a host node is connected to several virtual local area networks and the sending and receiving traffic rate on each of these virtual networks has to be guaranteed and controlled. Without this control, an excess of received traffic on one of these virtual networks could disturb the other virtual networks' ability to receive traffic at a guaranteed rate.
While most current solutions for Ethernet congestion control rely on simple Congestion Notification-based mechanisms, and virtually all of them require a change in the Ethernet hardware equipment, our proposed DETS system does not require any changes in the hardware. It is also able to operate in a distributed fashion using one of the four algorithms proposed for rate allocation. Through experimentation on an actual Linux-based computing cluster, we showed the effectiveness of DETS, compared the performance of the four algorithms, and discussed their characteristics. We also proposed modifications to the Ethernet control plane so that DETS can be natively supported by Ethernet networking elements.
9.1.3 Scalable and Distributed QoS and Admission Control
In this thesis, we studied the problem of QoS and admission control and of allocating instances of services to different applications in service-oriented environments. In this problem, a limited number of service instances of each service component are shared among different application classes. The major concerns are two-fold: maximizing the system revenue by allocating the service instances to the more valuable application classes, considering the service execution times and request inter-arrival times of each application class; and guaranteeing the successful completion of an admitted application instance.
We presented a method for obtaining the optimal policy for maximizing system revenue using Markov Decision Processes for small-scale systems with exponential service execution times and request inter-arrival times. We analyzed the case where the constituting service components of an application are executed concurrently throughout the application lifetime, as well as the case where the service components are executed sequentially and hence are not required throughout the application lifetime.
We presented the optimal policy for prototype examples, and we compared the performance of applying this policy with that of a system using Complete Sharing or Complete Partitioning mechanisms. In all cases, we showed that applying the policies obtained from Markov Decision Processes results in considerable improvement in system revenue compared to the other two mechanisms, especially when the request rates for the applications are high.
As another major contribution of this study, we presented a Distributed Algorithm for Service Commitment (DASC) that guarantees a specified probability of successful completion for an application in a service-oriented system, in settings with stationary as well as non-stationary arrivals. We showed that the Central Limit Theorem can help us compute this probability, and we also described an alternative approach for computing it using Chernoff's bound. The DASC algorithm can be implemented in a distributed environment and does not assume any specific distribution type for service execution times or application request inter-arrival times.
For stationary systems, we proposed two steady-state based alternative approaches (one based on bottleneck analysis, and the other based on simulation) that use token bucket regulators to control the admission of application requests to the system. These algorithms are simpler to implement than the DASC algorithm, but they cannot operate in non-stationary environments. DASC, however, is able to perform in both stationary and non-stationary environments using a transient-state analysis of the system. We presented performance evaluation results showing the effectiveness of the DASC algorithm in a simple service-oriented system as well as in a complex system with both stationary and non-stationary request arrivals.
We also showed that by adding a few queuing spaces, we can guarantee a specified level of queuing probability for an application instance and, at the same time, significantly reduce the application failure probability. In doing so, we presented a series of theorems and corollaries that can be used to find bounds for the time-to-enter-service distribution in general queuing systems.
To maximize the system revenue in addition to guaranteeing QoS, we proposed an application admission control system for service-oriented systems. The proposed system is able to use a simple steady-state approach or an online optimization approach for maximizing the system revenue, in addition to the DASC algorithm that guarantees the required probability of success.
The online optimization block of our system is composed of three sub-blocks: the scenario-generating block, the online optimizer, and the final decision maker. We elaborated on the functionality of each block, and we discussed the important factors in designing each of them. We also formulated a binary integer programming problem that maximizes the system revenue in the online optimizer block. The simulation results and performance comparisons show that the proposed system can achieve its objectives and improve the system performance.
9.1.4 Related Educational Contributions
The last, but not the least, contribution of this study is the education of several University of Toronto (UofT) students, especially through their involvement in performing experiments with the AON architecture and in the design and development of various parts of VANI.
In the early stages of this study, we were conducting experiments on the AON architecture and applications. Justin Seto and Andrew Mehes helped us in this process by implementing a prototype of a new network architecture in AON for their final year design project at the Electrical and Computer Engineering (ECE) department, UofT. The developed system has an XML-delivery function in its transport layer and uses a peer-to-peer mechanism to organize its network. The two other students involved in this process were Michael Ens and Ian Gartley, Engineering Science students who performed experiments with the NaradaBrokering pub/sub system as well as a new open-source XML parser.
A major force in the VANI project was Keith Redmond, a MASc student at the University of Toronto. We worked very closely together on the design and development of the virtualization layer for the main resources in VANI, including processing, storage, reprogrammable hardware, and the internal fabric. In the summer of 2008, Tom Yue worked with us as a summer student on parts of the VANI virtualization layer, specifically on the WS interfaces of the reprogrammable hardware resource. Darryl Chung was also a summer student, who developed the base for a Graphical User Interface for VANI in the summer of 2008.
Gordon Tam was an Engineering Science student who helped us develop the VANI control and management plane software. He started working with us on his final year design project and continued his collaboration during the summer of 2009 as a summer student. In the summer of 2009, a group of summer students helped us develop various software resources in VANI, including the database resource, orchestrator resource, hardware-based gateway resource, and GENI-VANI interworking resource. These students were Arbab Khan, Saleh Dani, Mingliang Ma, Maxim Galash, and Wenyu Li. Three of these students (Arbab Khan, Saleh Dani, Maxim Galash), together with Anthony Das Santos, worked on a prototype of a green orchestrator engine and developed a sensor resource for VANI as their final year design project. Mingliang Ma also helped us explore some of our future work regarding automatic application deployment in VANI. Arbab Khan is still cooperating with us as a summer student to integrate, maintain, and improve the VANI control and management software, and the developed processing, storage, gateway, and internal fabric resources.
The author takes pride in having worked with these students and in being a part of their education at the University of Toronto.
9.2 Future Work
This dissertation has covered many subjects in dealing with the challenges of future networks. In terms of future work, there are many possibilities in each of the covered topics. For Application-Oriented Networks in general, and VANI in particular, an important piece of future work is to develop large-scale applications based on this architecture and the developed testbed.
One application that we are currently investigating is a green application orchestrator engine. With the green orchestrator engine, we intend to create a distributed follow-the-sun system that is able to move service components to VANI nodes that have better access to green energy such as solar or wind power. The green orchestrator system is built on VANI using a variety of software-based resources developed for VANI, including the complex event processing service and the sensor service.
Another application of VANI is in software-defined radio. In wireless networks, VANI is capable of processing large amounts of aggregated and digitized radio signals in its reprogrammable hardware resources. This capability facilitates advanced research on software-radio systems and future wireless technologies.
A major extension to the AON control and management plane, as well as to VANI, is to develop functionalities that automate application creation and deployment. In an automated system, an application provider would be able to specify the high-level business goals of an application, and the system could identify the appropriate service components and deploy them in the right places in an AON to deliver the required functionality. Inclusion of autonomous management techniques in VANI is another possible extension of the work on the VANI testbed.
Additional future work on VANI includes interconnecting VANI to GENI testbeds so that GENI researchers can use VANI resources to carry out federated experiments, as well as setting up VANI nodes at different sites across a wide area network to enable large-scale experimentation. In addition, we plan to include new hardware resources in VANI, such as the new BEE3 boards and GPU-based hardware. We hope that VANI can serve as a breeding ground for research on large-scale and advanced networked systems in Canada in the future.
In terms of future work on the Distributed Ethernet Traffic Shaping system, we intend to further explore the DETS protocol modifications to the Ethernet control plane and to develop proof-of-concept Ethernet switches with this capability using the hardware resources developed for VANI.
In scalable QoS and admission control for service-oriented systems, we intend to further explore the potential of transient-state analysis for maximizing the system revenue by predicting the revenue that a system would lose or gain by admitting a request for an application, especially when the system is not overloaded and there is room for gaining more revenue.
Another extension to this work could be to include scheduling mechanisms for the queued application instances in the system. Further development of the proposed commitment algorithm to reduce power consumption in a service-oriented system, through anticipation of future resource requirements and putting surplus resources into a low-power mode, could be another area of future research. Finally, incorporating the proposed QoS-control mechanisms into a real service-oriented system such as AON is another major extension of this work that we would like to explore in the future.
Appendices
Appendix A
Queue-Enabled Service Commitment
In Q-DASC, we use the pdf of the Time to Enter Service (TES) in a G/G/C/N queuing system. Finding exact solutions for the TES distribution is in general very difficult. Therefore, in this section, we introduce several results that help us find approximations for the TES.
A.1 Time to Enter Service in a G/G/C/N System
Assume a G/G/C/N system with a general distribution for the request arrivals. It has C independent instances of one service with execution time pdf f(t) and mean µ, and it has N queue spots in front of them.
We are interested in the distribution of the Time to Enter Service (TES) for the queued requests, assuming that all service instances are busy. To do so, we define the system state at time t as s(t) = (t − t1, t − t2, ..., t − tC), t1 ≤ t2 ≤ ... ≤ tC, in which ti represents the time at which the ith instance started serving a request.
Finding a closed-form representation of the distribution of the time to enter service (TES) for each of the requests in the queue is quite difficult and impractical in the general case. However, in this section, we develop a series of theorems that lead to bounds on these distributions.
Our intuition is that we only need to study the residual times of the j longest-served requests already in the system to find the time to enter service for a queued request in spot j of the queue (with spot 1 being the head of the queue). Following this intuition, we then find the relation between the TES and the residual times of the requests that are being served. We use concepts from stochastic orders [143] to determine when our intuition is correct.
Definition 1: Let X and Y be two random variables which have the following prop-
erty:
P{X > t} ≤ P{Y > t} ∀t ∈ (−∞,∞) (A.1)
then X is said to be smaller than Y in the usual stochastic order, shown by X ≤st Y .
This property can also be represented in terms of cumulative distribution functions (cdf) as follows:
FX(t) ≥ FY (t) ∀t ∈ (−∞,∞) (A.2)
In other words, the cdf of X is lower bounded by the cdf of Y.
Definition 2: A nonnegative random variable X with distribution function F and survival function F̄(t) ≡ 1 − F(t) is said to be Increasing Failure Rate (IFR) if −log F̄ is convex on {t : F̄(t) > 0}. Likewise, X is said to be Decreasing Failure Rate (DFR) if −log F̄ is concave on {t : F̄(t) > 0}.
The next theorem gives a necessary and sufficient condition for a random variable to be IFR or DFR.
Theorem 1) The random variable X is IFR [DFR] if, and only if, [X−t1|X > t1] ≥st
[≤st][X − t2|X > t2] whenever t1 ≤ t2.
proof: Theorem 1.A.13 in [143].
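For example, for a uniform U(10, 20) service time (an IFR distribution, used in Example 1 below), Theorem 1 can be checked numerically: the conditional residual survival P{X − t > s | X > t} is non-increasing in the elapsed time t. A small sketch, with function names of our own choosing:

```python
def survival(u, a=10.0, b=20.0):
    """Survival function of X ~ U(a, b)."""
    if u <= a:
        return 1.0
    if u >= b:
        return 0.0
    return (b - u) / (b - a)

def residual_survival(t, s, a=10.0, b=20.0):
    """P{X - t > s | X > t}: survival of the residual after elapsed time t."""
    return survival(t + s, a, b) / survival(t, a, b)

# For an IFR service time this quantity is non-increasing in t: the longer
# a request has been served, the sooner it is likely to finish.
```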
According to this theorem, if the execution time of a service has the IFR property, then the application instances already being served in the system are more likely to finish their execution in the order of their arrival to that service. Similarly, if it is DFR, the instances are more likely to finish their execution in the reverse order of their arrival.

Figure A.1: Distributions for residual service times in a service with uniform execution time
Therefore, the next lemma follows at once from the above definitions and theorem:
Lemma 1) Assume f(t) is the pdf of an IFR [DFR] service execution time, F(t) is its cumulative distribution function (cdf), and the system is in state s(t). Then F1(t) ≥ F2(t) ≥ ... ≥ FC(t) ≥ F(t) [F1(t) ≤ F2(t) ≤ ... ≤ FC(t) ≤ F(t)], in which Fi(t) is the cdf of the random variable Tri that denotes the residual time of the ith service instance.
The uniform distribution is an IFR distribution. Among other IFR distributions, we can name the Normal distribution, the Gamma and Weibull distributions for α > 1, and the modified extreme value distribution [144]. DFR distributions are rare, but as an example we can name the log-normal distribution [144]. Equality in Lemma 1 holds for the exponential distribution, which has a constant failure rate and is the boundary between the IFR and DFR types of distributions.
Example 1: Assume that a service execution time has a uniform distribution U(10, 20). Figure A.1 shows the Fi(t) distributions for a system that has four service instances (C = 4) and is in the state s(t) = (t − 15, t − 12, t − 8, t − 4).
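The residual-time cdfs in this example can be computed in closed form: conditioned on an elapsed service time e, the residual of a U(a, b) service is uniform on (max(a − e, 0), b − e). A sketch reproducing the ordering of Lemma 1 for the state above (the helper name is ours):

```python
def residual_cdf(t, elapsed, a=10.0, b=20.0):
    """Cdf of the residual time of a U(a, b) service that has already run
    for `elapsed` units: uniform on (max(a - elapsed, 0), b - elapsed)."""
    lo, hi = max(a - elapsed, 0.0), b - elapsed
    if t <= lo:
        return 0.0
    if t >= hi:
        return 1.0
    return (t - lo) / (hi - lo)

# State s(t) = (t-15, t-12, t-8, t-4): elapsed times in decreasing order.
ELAPSED = [15, 12, 8, 4]
```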
Figure A.2: Distributions for residual service times in a service with Normal execution time
Example 2: Figure A.2 shows the distributions for a service with a Normal distribution N(200, 10). In this system there are four service instances (C = 4), and the system is in the state s(t) = (t − 150, t − 120, t − 80, t − 40).
Example 3: Figure A.3 shows the distributions for a system with four servers. The service execution time is the Beta distribution used in the previous sections and depicted in Figure 7.5; this Beta distribution is also an IFR distribution. The figure depicts the system in the state s(t) = (t − 1500, t − 1200, t − 800, t − 400).
Using the above definitions, theorem, and lemma, we can now turn back to the properties of the Time to Enter Service (TES) for a queued application instance.
Theorem 2) In a G/G/C/N system, the distribution of the TES for the first instance in the queue (head of the queue) is lower bounded by the distribution of the residual time of any of the requests Already Being Served (ABS) in the system. In other words, G1C(t) ≥ Fi(t), where G1C(t) is the cdf of the TES of the first request in the queue.
proof: Assume that the system is in state s(t) and there is one request in the queue. The time to enter service (TES) for that request is a random variable denoted Tw1C and is
Figure A.3: Distributions for residual service times in a service with a Beta execution time, α = 2.333, β = 4.666
equal to min(Tr1, Tr2, ..., TrC), in which Tri is the residual service time of the ith server. The cdf of Tw1C is denoted G1C(t), and the cdf of Tri is denoted Fi(t).
From the definitions, we wish to prove that Tw1C ≤st Tri, ∀ 0 < i ≤ C. To do so, we have to show:
P{Tw1C < t} ≥ P{Tri < t} ∀t > 0 (A.3)
To prove this inequality, we show that the event Tri < t is a subset of the event Tw1C < t. This is true, since if Tri < t then Tw1C = min(Tr1, ..., TrC) ≤ Tri < t. As a result, the inequality holds, and the theorem is proved.
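A quick Monte Carlo check of Theorem 2 for the uniform example of Lemma 1: the head-of-queue TES, min(Tr1, ..., TrC), has an empirical cdf at least as large as that of each individual residual time. The function name, sample size, and evaluation point are our own choices:

```python
import random

def check_theorem2(t=4.0, n=50000, seed=3):
    """Empirical cdfs at point t for Tw1C = min of residuals and for each
    residual, in the uniform U(10, 20) example with state (15, 12, 8, 4)."""
    rng = random.Random(seed)
    elapsed = [15, 12, 8, 4]
    count_min, count_each = 0, [0] * len(elapsed)
    for _ in range(n):
        # Residual of a U(10, 20) service after e elapsed units is
        # uniform on (max(10 - e, 0), 20 - e).
        residuals = [rng.uniform(max(10 - e, 0.0), 20 - e) for e in elapsed]
        count_min += min(residuals) < t
        for i, r in enumerate(residuals):
            count_each[i] += r < t
    return count_min / n, [c / n for c in count_each]
```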
The next two corollaries discuss the properties of this random variable in terms of its mean, as well as its behavior for IFR [DFR] distributions.
Corollary 1) In a G/G/C/N system, the mean TES for the first request in the queue (head of the queue) is not more than the mean residual time of any of the requests in the system. In other words, m1C ≤ mi.
proof: This follows from Theorem 2 and the fact that m1C = ∫0^∞ (1 − G1C(t)) dt.
Corollary 2) In a G/G/C/N system with an IFR [DFR] service time, we have Tw1C ≤st Tr1 [TrC], and consequently m1C ≤ m1 [mC].
proof: This corollary follows from Theorem 2, Lemma 1, and Corollary 1.
Interestingly, the above corollary states that in a system with an IFR [DFR] service time, the distribution of the TES for the first request in the queue is lower bounded by the residual-time distribution of the longest- [shortest-] served Already Being Served (ABS) request in the system.
The next theorem considers the properties of the time to enter service (TES) random variable for the other application instances in the service queue.
Theorem 3) In a G/G/C/N system, the distribution of the TES of the jth request (2 ≤ j ≤ C) in the queue is lower bounded by the distribution of the maximum of any combination of j ABS requests in the system.
proof: If we define TwjC as the random variable representing the TES of the jth request in the queue, we can define a set VC as follows:
VC = {Tr1, Tr2, ..., TrC} (A.4)
We define VjC ⊂ VC as any subset of random variables in VC with |VjC| = j, assuming 2 ≤ j ≤ C. We need to prove:
TwjC ≤st max(VjC), i.e., P{TwjC < t} ≥ P{max(VjC) < t} ∀t > 0.
Again, to show that this inequality holds, we prove that the event max(VjC) < t is a subset of the event TwjC < t. This is true: if max(VjC) < t, then all j requests in VjC have finished their service before t, so at least j servers have become free, and hence TwjC will surely be less than t. As a result, the inequality holds and the theorem is proved.
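Theorem 3 can likewise be checked by simulation for the uniform example, under the assumption that the jth queued request enters service no later than the jth departure among the ABS requests (i.e., the jth order statistic of the residual times). All names, sample sizes, and the evaluation point are our own:

```python
import random

def check_theorem3(j=2, t=6.0, n=50000, seed=5):
    """Empirical cdfs at t of the jth departure time among the ABS requests
    (an upper bound on the jth queued request's TES) and of the maximum
    over the j longest-served residuals, for the uniform U(10, 20) example."""
    rng = random.Random(seed)
    elapsed = [15, 12, 8, 4]          # longest-served first
    hits_order = hits_max = 0
    for _ in range(n):
        residuals = [rng.uniform(max(10 - e, 0.0), 20 - e) for e in elapsed]
        hits_order += sorted(residuals)[j - 1] < t   # jth order statistic
        hits_max += max(residuals[:j]) < t           # max of first j residuals
    return hits_order / n, hits_max / n
```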
Corollary 3) In a G/G/C/N system with an IFR [DFR] service time, the cdf of the TES for the jth instance in the queue is lower bounded by the cdf of the maximum of the first [last] j ABS instances in the system:
GjC(t) ≥ ⋃_{k=1}^{j} [⋃_{k=C−j+1}^{C}] Fk(t), j ≤ C (A.5)
The next theorem finds an upper bound for the distribution of the TES in a G/G/C/N system.
Theorem 4) In a G/G/C/N system, the distribution of the TES of the jth request (j ≥ 2) in the queue is upper bounded by the distribution of the TES of the (j − 1)th request in the queue.
proof: We know that TwjC ≥ Tw(j−1)C; therefore GjC(t) ≤ G(j−1)C(t), ∀t, j ≥ 2.
From Corollary 3 and Theorem 3, we can see that for IFR [DFR] systems, GjC(t) is bounded by G(j−1)C(t) and ⋃_{k=1}^{j} [⋃_{k=C−j+1}^{C}] Fk(t).
In summary, we showed that to find bounds on the TES distribution for the jth request in the queue, we only need to analyze the residual times of j requests that are already being served in the system. If the service time distribution is IFR, these j requests can be the longest-served ones. Since many distributions in real systems can be characterized as IFR distributions, it can be concluded that our initial intuition is correct for most real systems. However, for DFR distributions, better bounds can be obtained by analyzing the j shortest-served requests.
In Q-DASC, if an application instance is queued, we find the TES mean and variance using the lower bounds, and we distribute them to the other agents so that they can update their future usage estimations.
As mentioned earlier finding the exact TES distribution for general service execution
times is very difficult because it not only depends on the service execution time distribu-
tion but also on the current state of the system as well as start time of ABS instances.
Therefore, we performed performance evaluations on the beta distribution that we used
for the DASC performance evaluations in Chapter 7 in order to assess the tightness of the bound.

Figure A.4: TES distribution and calculated bound for beta distribution with α = 2.333, β = 4.666. [Six panels: the TES distributions for spots 1 to 5 in the queue, each comparing the calculated bound (means 1.5, 3.3, 5.8, 8.6, 11.7; stdevs 0.7, 2.0, 3.2, 4.4, 5.5) with simulation (means 2.3, 2.7, 3.2, 3.7, 4.3; stdevs 0.3, 0.5, 0.7, 0.9, 1.1), and the service execution time distribution (mean = 1333.3, stdev = 166.7).]

To do so, we assumed a queue with 200 busy servers with beta execution times
with a maximum service time of 2000. The start times of all ABS instances are uniformly
chosen from 5 to 1995 in steps of 10.
Figure A.4 shows the TES distributions and corresponding means and standard devi-
ations for the first five instances in the queue from spots one to five using simulation as
well as the bounds obtained in Corollary 3. As can be seen, the bound on the distribution is
lower than the distribution found through simulations, as expected. We can also observe
that the bound is tighter for the first few spots in the queue and becomes more conservative
as we move further down the queue. In the next sections we study the time-to-enter-service
distribution properties for G/D/C/N and G/M/C/N queuing systems.
A.2 TES for G/D/C/N System
A G/D/C/N queuing system has a deterministic service time of d
seconds. We assume that all C servers in the system are busy and the system is in state
s(t) = (t − t1, t − t2, ..., t − tC), in which t1 ≤ t2 ≤ ... ≤ tC, and t − tj < d. Also, we know
that a deterministic distribution is an IFR distribution.
Therefore, the TES for the first request in the queue will be equal to the residual time
of the longest-served request in the system, d − (t − t1). Similarly, the TES for the
jth request in queue is a deterministic value equal to:
TwjC = d− (t− tj), j ≤ C (A.6)
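Equation (A.6) is straightforward to evaluate from the system state. A small sketch (function and parameter names are ours), taking the elapsed service times t − ti of the C busy servers:

```python
def tes_gd(d, elapsed, j):
    """TES of the j-th queued request in a G/D/C/N system, as in equation
    (A.6): with deterministic service time d, the j-th request enters service
    at the j-th smallest residual time d - (t - t_i) among the busy servers."""
    residuals = sorted(d - e for e in elapsed)   # residual times, ascending
    return residuals[j - 1]
```

For instance, with d = 10 and elapsed times (9, 5, 2), the first queued request waits 1 second and the second waits 5.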
A.3 TES for G/M/C/N System
Assume that there is a G/M/C/N system with service rate µ. It can be easily shown that
TES in a G/M/C/N system follows an m-Erlang distribution. The TES distribution
for the first request in queue is exponential (m-Erlang with parameters m = 1 and rate Cµ),
and the distribution for the jth request in queue is an m-Erlang distribution with parameters
m = j and rate Cµ. Also, the residual service times in a G/M/C/N system are all i.i.d.
exponential with rate µ. In this type of system, the total delay TdjC of the
jth request in queue has mean and variance:
E[TdjC] = E[TwjC] + E[Ts] = j/(Cµ) + 1/µ = (j + C)/(Cµ)

VAR[TdjC] = VAR[TwjC] + VAR[Ts] = j/(Cµ)² + 1/µ² = (j + C²)/(Cµ)²
in which Ts is the request's service time.
The interesting observation from the above equations is that for j ≪ C² the variance
of the delay in the system is almost equal to the variance of the service time. In other
words, in systems with an ample number of servers, the variance of the TES for the first few
requests in queue (j/(Cµ)², j ≪ C²) is almost negligible compared to the variance
of the service time (1/µ²).
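The m-Erlang claim is easy to check by simulation: by memorylessness, each departure from the C busy servers occurs after an Exp(Cµ) gap, so the TES of the jth queued request is the sum of j such gaps. A quick Monte Carlo sketch (parameter names are ours, not from the thesis implementation):

```python
import random

def mean_tes_gm(C, mu, j, trials=200_000, seed=1):
    """Estimate E[TES] of the j-th queued request in a G/M/C/N system by
    summing j independent Exp(C*mu) departure gaps; the estimate should
    approach the m-Erlang mean j/(C*mu)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += sum(rng.expovariate(C * mu) for _ in range(j))
    return total / trials
```

With C = 10, µ = 0.5, and j = 3, the estimate approaches j/(Cµ) = 0.6.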
Appendix B
Computing Over-Commitment
Probability using Chernoff’s Bound
The Central Limit Theorem gives a good approximation of the over-commitment probability
when the total number of service instances is within a few standard deviations
(σk(t)) of the mean ηk(t). Therefore, if this range is more than a few standard deviations
and the required over-commitment probability threshold (Toc) is less than 0.001, it is
better to use a tighter bound on the probability. To do so, we use the theory of large
deviations and the Chernoff's bound [145] to compute the probability of over-commitment.
The following is the definition of the Chernoff's bound:
P{Sk(t) ≥ Nk} ≤ e^(−(sNk − µS(s))),   s > 0, ∀t > 0   (B.1)
in which Sk(t) is the sum random variable (7.4), and µS(s) = ln ψS(s) is the logarithmic
moment generating function of the Sk(t) random variable. Since the right-hand
side of the above inequality holds for any s > 0, we can minimize it by finding the
s* which satisfies the following equation:
Nk = µ′S(s)   (B.2)
Substituting the definition of the sum random variable (7.4), we have:

P{Sk(t) ≥ Nk} ≤ e^(−s*Nk + Σ_{i,j} ln((1 − Gijk(t)) + Gijk(t)e^(s*)))

in which s* is the solution of the following equation:

Nk = Σ_{i,j} Gijk(t) / (Gijk(t) + (1 − Gijk(t))e^(−s))
It has been shown that the probability in (B.1) can be approximated, for random
variables that are the sum of finitely many random variables (like our defined sum
random variable (7.4)), as follows [146]:

P{Sk(t) ≥ Nk} ≈ (1 / (s*√(2πµ″S(s*)))) e^(−(s*Nk − µS(s*)))   (B.3)
To make sure that the probability of over-commitment remains less than the threshold,
we should compute s∗ for all times t, and compute the probability in approximation (B.3).
To do so, we use characteristics of the sum random variable Sk(t). We know that
Sk(t) is the sum of n independent Bernoulli random variables, in which n represents the
number of applications that can be served by service type k at time t. Therefore, we
analyze the problem in the general case as follows:
Assume Xi, i = 1, 2, ..., n are n independent Bernoulli random variables with parameters
(pi, qi), where pi + qi = 1. We define the random variable Y as Y = Σ_{i=1}^{n} Xi. We
have:

η := E[Y] = Σ_{i=1}^{n} pi,   (B.4)

σ² := VAR[Y] = Σ_{i=1}^{n} VAR[Xi] = Σ_{i=1}^{n} piqi   (B.5)
The Chernoff's bound is:

P{Y ≥ N} ≤ e^(−sN) E[e^(sY)] = e^(−sN) Π_{i=1}^{n} (qi + pie^(s)),   s > 0   (B.6)
After taking the derivative with respect to s, we find s* as the root of the following
function:

d(s) = Σ_{i=1}^{n} pi / (pi + qie^(−s)) − N,   s > 0   (B.7)
Also, the second derivative of the right-hand side of the Chernoff's bound, i.e. the
derivative of d(s), is:

d′(s) = Σ_{i=1}^{n} piqie^(−s) / (pi + qie^(−s))²,   s > 0   (B.8)
As can be seen, d′(s) is always positive, and therefore d(s) is a strictly increasing
function that has at most one root. Figure B.1 shows a sample of this function. In
this figure, we depict d(s) for a service type with N = 900 instances and n = 1000
application instances in the system with random pi's. As expected, this function is
strictly increasing and (in this case) has one root. Therefore, we present a five-step
algorithm for finding the root. At each step, this algorithm examines the cases where
the function has no root, or has one root that is much larger than one,
much less than one, or close to one. Without getting further into the mathematical
details, we present this algorithm for finding the optimum s* as follows:
Figure B.1: A sample d(s) for a service with 900 instances and random pi's for 1000 application instances. [Figure: plot of d(s) versus s.]
1) If N ≥ n, then s* = ∞ and the bound is 0, which means the system has more
service instances than the number of admitted applications and the probability of over-
commitment is zero. Otherwise go to the next step.

2) If N ≤ η, then s* = 0 and the bound is 1. This means that the number of service
instances is less than the mean number of admitted applications, and by using the CLT we
can see that the over-commitment probability is more than 0.5. Otherwise go to the next
step.
3) If η < N < n, d(s) is a strictly increasing function with exactly one root.
If that root is much less than 1 (s* ≪ 1), we have:

s* = (N − η)/σ²,   s* ≪ 1

If the above equation yields s* < 0.5, then s* is the answer; otherwise proceed to
the next step.
4) For s* ≫ 1, we have:

s* = (Σ_{i=1}^{n} pi^(−1) − n) / (n − N),   s* ≫ 1

If the above equation yields s* > 5, then s* is the answer; otherwise proceed to the
next step.
5) Otherwise, s* is in the range (0.5, 5). In this case, we can compute the root very
efficiently using Newton's method.

Our simulations show that in most cases the above algorithm ends at the fourth step
and there is no need to use Newton's method. However, even when it is needed,
Newton's method achieves a sufficiently accurate answer for our problem in a
few iterations.
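The steps above can be sketched in a few lines. This is an illustration rather than the thesis implementation: the step-4 closed form is replaced here by running Newton's method directly from the step-3 initialization, which converges to the same root, and all names are ours:

```python
import math

def chernoff_s_star(p, N, tol=1e-10, max_iter=100):
    """Root s* of d(s) = sum_i p_i/(p_i + q_i*e^{-s}) - N (equation (B.7)).
    Steps 1-2 handle the no-root cases; Newton's method, started from the
    small-s* approximation of step 3, covers the rest."""
    n = len(p)
    eta = sum(p)                                  # mean of Y, as in (B.4)
    var = sum(pi * (1.0 - pi) for pi in p)        # variance of Y, as in (B.5)
    if N >= n:                                    # step 1: bound is 0
        return math.inf
    if N <= eta:                                  # step 2: bound is 1
        return 0.0
    s = max((N - eta) / var, 1e-8)                # step 3 initial guess
    for _ in range(max_iter):
        d = sum(pi / (pi + (1 - pi) * math.exp(-s)) for pi in p) - N
        dp = sum(pi * (1 - pi) * math.exp(-s) /
                 (pi + (1 - pi) * math.exp(-s)) ** 2 for pi in p)  # (B.8)
        step = d / dp
        s_next = s - step
        s = s / 2.0 if s_next <= 0 else s_next    # keep the iterate positive
        if abs(step) < tol:
            break
    return s

def chernoff_bound(p, N):
    """Evaluate the bound e^{-sN} * prod_i (q_i + p_i*e^s) of (B.6) at s*."""
    s = chernoff_s_star(p, N)
    if s == math.inf:
        return 0.0
    if s == 0.0:
        return 1.0
    log_rhs = -s * N + sum(math.log((1 - pi) + pi * math.exp(s)) for pi in p)
    return math.exp(log_rhs)
```

For n = 100 identical Bernoulli(0.5) applications and N = 70 service instances, the root is s* = ln(7/3) and the resulting bound is well below the 0.001 threshold.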
As we explained earlier, to compute the over-commitment probability at all future
times we have to compute s* for every time t at which the application is likely to
be in that service. By calculating s* and obtaining Poc(t), we can make
sure that the application failure probability is less than the agreed threshold Toc at all
times. The computation of s* for all t, however, can be a computationally intensive
task in some systems. To overcome this obstacle, we propose a practical technique
for computing the root of equation (B.2).
Our solution is to combine the CLT-based method and the Chernoff's bound method.
In this technique, the system computes the over-commitment probability based on the
mean and variance values using the central limit theorem, as described in the previous
subsection. Moreover, the system keeps track of the time th at which the CLT-based method
gives the highest value for the over-commitment probability. If the highest CLT-based
computed probability is less than 0.001, the system computes the root of the
equation (B.2) using the above-mentioned technique, and the Chernoff's bound
for that particular time th is then computed using s*.
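The CLT screening step reduces to a Gaussian tail evaluation. A sketch (function name is ours) of the probability that the system maximizes over the candidate times before deciding whether to refine with the Chernoff bound:

```python
import math

def clt_overcommit_prob(eta, sigma, N):
    """CLT approximation of P{S_k(t) >= N_k} for a sum with mean eta and
    standard deviation sigma: the standard normal tail at z = (N - eta)/sigma."""
    z = (N - eta) / sigma
    return 0.5 * math.erfc(z / math.sqrt(2.0))
```

Scanning this over the candidate times t yields th as the argmax; only if the maximum falls below 0.001 is the Chernoff root computed at th.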
Appendix C
Derivation of Gk(t) Probability
Assume that there is an application that can be created by cascading m different services
as follows: S1 ⊗ S2 ⊗ ... ⊗ Sj ⊗ Sk ⊗ ... ⊗ Sm ⊙. The execution times of all services are
independent random variables Xi (i = 1, ..., m), with pdfs fi(t) (i = 1, ..., m).
We want to find the probability that at time t the application has finished the exe-
cution of all services before service k and is currently executing the service k:
Gk(t) = P{ Σ_{i=1}^{j} Xi < t < Σ_{i=1}^{j} Xi + Xk }

We define Yj as Yj := Σ_{i=1}^{j} Xi, with pdf fYj(t) and cdf FYj(t). Now we have:
Gk(t) = P{Yj < t < Yj + Xk}

= ∫_0^t fYj(τ) P{τ < t < Yj + Xk | Yj = τ} dτ

= ∫_0^t fYj(τ) P{t < τ + Xk} dτ

= ∫_0^t fYj(τ) (1 − Fk(t − τ)) dτ

= FYj(t) − ∫_0^t fYj(τ) Fk(t − τ) dτ

= FYj(t) − ∫_0^t ∫_0^{t−τ} fYj(τ) fk(λ) dλ dτ
With the change of variable λ = ν − τ, we have:

∫_0^t ∫_0^{t−τ} fYj(τ) fk(λ) dλ dτ = ∫_0^t ∫_τ^t fYj(τ) fk(ν − τ) dν dτ

= ∫_0^t (fYj ∗ fk)(ν) dν = FYk(t)

in which FYk(t) is the cdf of Yk := Yj + Xk.
Therefore, the probability Gk(t) is equal to:
Gk(t) = FY j(t)− FY k(t)
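The final expression lends itself to direct numerical evaluation by discrete convolution. A sketch, assuming each service-time pdf is sampled on a common uniform grid (function and parameter names are ours):

```python
import numpy as np

def gk_numeric(pdfs, k, dt):
    """Evaluate G_k(t) = F_Yj(t) - F_Yk(t) on a uniform grid of spacing dt,
    where pdfs[i] holds the sampled pdf of service i+1 and Y_j is the sum of
    the k-1 execution times preceding service k."""
    n = len(pdfs[0])
    if k == 1:                       # nothing precedes service 1: F_Yj(t) = 1
        F_yj = np.ones(n)
        f_yk = np.asarray(pdfs[0], dtype=float)
    else:
        f_yj = np.asarray(pdfs[0], dtype=float)
        for f in pdfs[1:k - 1]:      # pdf of Y_j by repeated convolution
            f_yj = np.convolve(f_yj, f)[:n] * dt
        F_yj = np.cumsum(f_yj) * dt
        f_yk = np.convolve(f_yj, pdfs[k - 1])[:n] * dt
    F_yk = np.cumsum(f_yk) * dt
    return F_yj - F_yk
```

As a check, with two unit-rate exponential services G_2(t) = t e^(−t), which equals e^(−1) ≈ 0.37 at t = 1.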
Appendix D
Simulation Environment Description
In this thesis, we have frequently used simulation techniques to evaluate the performance of
the proposed systems and algorithms. The simulation environment and techniques used
for each performance evaluation have been described in the relevant parts of each
chapter. In this appendix we present an overall description of the simulation
environment and techniques used for this study.
The simulations in this thesis were all conducted on a 56-node computing cluster
in the Network Architecture Lab in the Department of Electrical and Computer Engi-
neering, University of Toronto. Each of these 56 computing nodes has two 1.7 GHz Xeon
processors, two 40 GB local hard drives, and 2 GB of RAM. This considerable amount
of processing power allowed us to easily repeat each simulation many times (> 20 per
point) and use the calculated mean values of the obtained results to evaluate the per-
formance of the proposed algorithms. We have also calculated the confidence intervals
for these results and found that, since the number of trial runs is quite large, the
confidence intervals are very narrow.
To make sure that the simulations are correct, we followed a step-by-step, modular
approach. In each case, we started the simulation process by simulating simpler cases,
and we analyzed the extensive logs produced by the simulator to verify that the internal
states and variables were correct. We also followed a modular design approach for our
simulations and tested each module in isolation, which increased the quality of the simulations
by simplifying the debugging process.
We have also evaluated the correctness of the random number generators by perform-
ing statistical analysis on the generated random numbers. The input and output of each
simulation are described in the performance evaluation sections of each chapter.
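The per-point statistics described above reduce to a simple computation. A sketch (the normal approximation and the 95% level are our illustrative choices; the function name is ours):

```python
import math
import statistics

def mean_with_ci(samples, z=1.96):
    """Mean of repeated simulation runs and the half-width of its
    normal-approximation 95% confidence interval."""
    m = statistics.fmean(samples)
    s = statistics.stdev(samples)
    return m, z * s / math.sqrt(len(samples))
```

The half-width shrinks as 1/√(number of runs), which is why a large number of trial runs yields very narrow confidence intervals.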
Bibliography
[1] T. Anderson, L. Peterson, S. Shenker, and J. Turner. Overcoming the internet
impasse through virtualization. Computer, 38(4):34 – 41, april 2005.
[2] Zhenyu Yang, Wanmin Wu, Klara Nahrstedt, Gregorij Kurillo, and Ruzena Bajcsy.
Enabling multi-party 3d tele-immersive environments with viewcast. ACM Trans.
Multimedia Comput. Commun. Appl., 6(2):1–30, 2010.
[3] A. Tizghadam and A. Leon-Garcia. Autonomic traffic engineering for network
robustness. Selected Areas in Communications, IEEE Journal on, 28(1):39 –50,
january 2010.
[4] R. Farha and A. Leon-Garcia. Blueprint for an Autonomic Service Architecture.
In Autonomic and Autonomous Systems, 2006. ICAS ’06. 2006 International Con-
ference on, July 2006.
[5] K.A. Abuosba and A.A. El-Sheikh. Formalizing service-oriented architectures. IT
Professional, 10(4):34 –38, july-aug. 2008.
[6] Virtualization. http://en.wikipedia.org/wiki/Virtualization.
[7] Hadi Bannazadeh, Alberto Leon-Garcia, et al. Virtualized Application Net-
working Infrastructure. In Proc. of the 6th International Conference on Testbeds
and Research Infrastructures for the Development of Networks and Communities,
Berlin, Germany, May 2010.
[8] Keith Redmond, Hadi Bannazadeh, Alberto Leon-Garcia, and Paul Chow. Devel-
opment of a Virtualized Application Networking Infrastructure Node. In Proc. of
the 3rd IEEE Workshop on Enabling the Future Service-Oriented Internet, Hon-
olulu, Hawaii, December 2009.
[9] Hadi Bannazadeh and Alberto Leon-Garcia. A Distributed Ethernet Traffic Shap-
ing System. In Proc. of the 17th IEEE Workshop on Local and Metropolitan Area
Networks (LANMAN 2010), Long Branch, NJ, May 2010.
[10] Michael Cusumano. Cloud computing and saas as new computing platforms. Com-
munications of the ACM, 53(4):27–29, 2010.
[11] Hadi Bannazadeh and Alberto Leon-Garcia. Allocating Services to Applications
using Markov Decision Processes. In proc. of IEEE Int. Conf. on Service-Oriented
Computing and Applications, SOCA’07, pages 141–146, Newport Beach, California,
June 2007.
[12] Hadi Bannazadeh and Alberto Leon-Garcia. Service Commitment Strategies in
Allocating Services to Applications. In proc. of IEEE Int. Conf. on Service Com-
puting, SCC’07, pages 91–97, Salt Lake City, Utah, July 2007.
[13] Hadi Bannazadeh and Alberto Leon-Garcia. A Distributed Algorithm for Ser-
vice Commitment in Allocating Services to Applications. In proc. of 2nd IEEE
Asia-Pacific Service Computing Conference, APSCC’07, pages 446–453, Tsukuba,
Japan, Dec 2007.
[14] Hadi Bannazadeh and Alberto Leon-Garcia. Probabilistic Approach to Service
Commitment in Service-Oriented Systems. In in the proc. of IEEE Congress on
Services, Honolulu, Hawaii, July 2008.
[15] Hadi Bannazadeh and Alberto Leon-Garcia. A distributed probabilistic commit-
ment control algorithm for service-oriented systems. To appear in IEEE Transactions
on Network and Service Management (TNSM).
[16] Hadi Bannazadeh and Alberto Leon-Garcia. Online optimization in application
admission control for service oriented systems. In Asia-Pacific Services Computing
Conference, 2008. APSCC ’08. IEEE, pages 482–487, Yilan, Taiwan, Dec 2008.
[17] Hadi Bannazadeh and Alberto Leon-Garcia. On the Emergence of an Application-
Oriented Network Architecture. In proc. of IEEE Int. Conf. on Service-Oriented
Computing and Applications, SOCA’07, pages 47–54, Newport Beach, California,
June 2007.
[18] Stephanos Androutsellis-Theotokis and Diomidis Spinellis. A survey of peer-to-peer
content distribution technologies. ACM Comput. Surv., 36(4):335–371, 2004.
[19] Service-Oriented Architecture. www.ibm.com/soa.
[20] OASIS Reference Model for Service Oriented Architecture 1.0. http://www.oasis-
open.org.
[21] Francis Shanahan. Amazon.com Mashups. Wrox Press Ltd., Birmingham, UK,
2007.
[22] Tim O’Reilly. What is web 2.0: Design patterns and business models for the next
generation of software. Available online at http://oreilly.com/web2/archive/
what-is-web-20.html.
[23] W3C Working Group Note. Web services architecture. Available online at http:
//www.w3.org/TR/ws-arch/.
[24] W3C. extensible markup language (xml). Available online at http://www.w3.
org/XML/.
[25] Poornachandra Sarang, Matjaz Juric, and Benny Mathew. Business Process Execution
Language for Web Services BPEL and BPEL4WS. Packt Publishing, Birmingham,
UK, 2006.
[26] Krishna Kant. Data center evolution: A tutorial on state of the art, issues, and
challenges. Computer Networks, 53(17):2939 – 2965, December 2009.
[27] James Murty. Programming Amazon Web Services: S3, EC2, SQS, FPS, and
SimpleDB. O’Reilly Media Inc, California, 2008.
[28] D. Nurmi, R. Wolski, C. Grzegorczyk, G. Obertelli, S. Soman, L. Youseff, and
D. Zagorodnov. The eucalyptus open-source cloud-computing system. In Clus-
ter Computing and the Grid, 2009. CCGRID ’09. 9th IEEE/ACM International
Symposium on, pages 124 –131, Shanghai, May 2009.
[29] Guohui Wang and T. S. Eugene Ng. The impact of virtualization on network
performance of amazon ec2 data center. In Proceedings of the 29th IEEE Conference
on Computer Communications, INFOCOM 2010, San Diego, CA, March 2010.
[30] M. Alizadeh, B. Atikoglu, A. Kabbani, A. Lakshmikantha, Rong Pan, B. Prab-
hakar, and M. Seaman. Data center transport mechanisms: Congestion control
theory and ieee standardization. In Communication, Control, and Computing,
2008 46th Annual Allerton Conference on, pages 1270–1277, Sept. 2008.
[31] Alan B. Johnston. SIP: Understanding the Session Initiation Protocol. Artech
House Publishers, 2009.
[32] ITU-T. Next generation networks global standards initiative. Available online at
http://www.itu.int/ITU-T/ngn.
[33] K. Knightson, N. Morita, and T. Towle. Ngn architecture: generic principles,
functional architecture, and implementation. Communications Magazine, IEEE,
43(10):49 – 56, oct. 2005.
[34] Gonzalo Camarillo and Miguel A. Garcia-Martin. The 3G IP Multimedia Subsystem
(IMS). John Wiley & Sons Ltd, England, 2006.
[35] TM forum. Ipsphere forum. Available online at http://www.tmforum.org/
ipsphere.
[36] Cornelia Kappler. UMTS Networks and Beyond. John Wiley & Sons, England,
2009.
[37] Pierre Lescuyer and Thierry Lucidarme. Evolved Packet System, The LTE and
SAE Evolution of 3G UMTS. John Wiley & Sons, England, 2008.
[38] Alasdair Allan. Learning iPhone Programming: From Xcode to App Store. O’Reilly
Media, CA, USA, 2010.
[39] Reto Meier. Professional Android 2 Application Development. Wiley Publishing,
USA, 2010.
[40] Akamai. http://www.akamai.com.
[41] R.L. Xia and J.K. Muppala. A survey of bittorrent performance. Communications
Surveys Tutorials, IEEE, 12(2):140 –158, second 2010.
[42] Gero Mühl, Ludger Fiege, and Peter Pietzuch. Distributed Event-Based Systems.
Springer, Germany, 2006.
[43] P. Saint-Andre. Xmpp: lessons learned from ten years of xml messaging. Commu-
nications Magazine, IEEE, 47(4):92 –96, april 2009.
[44] Jacob Chakareski and Pascal Frossard. Adaptive systems for improved media
streaming experience. Communications Magazine, IEEE, 45(1):77 –83, jan. 2007.
[45] Cisco. Cisco visual networking index: Forecast and methodology, 2009-2014. Avail-
able online at http://www.cisco.com.
[46] Youtube. http://www.youtube.com.
[47] Hulu. http://www.hulu.com.
[48] E. Mikoczy, D. Sivchenko, Bangnan Xu, and J.I. Moreno. Iptv systems, standards
and architectures: Part ii - iptv services over ims: Architecture and standardization.
Communications Magazine, IEEE, 46(5):128 –135, may 2008.
[49] J.S. Turner and D.E. Taylor. Diversifying the internet. In Global Telecommunica-
tions Conference, 2005. GLOBECOM ’05. IEEE, volume 2, Dec 2005.
[50] Steven M. Bellovin, David D. Clark, Adrian Perrig, and Dawn Song. A
Clean-Slate Design for the Next-Generation Secure Internet, 2005. Available
at http://sparrow.ece.cmu.edu/group/pub/bellovin_clark_perrig_song_
nextGenInternet.pdf.
[51] Stanford University Clean Slate Design For Internet: An Interdisciplinary Research
Program. http://cleanslate.stanford.edu.
[52] 100x100 project. http://100x100network.org.
[53] Särelä M., Rinta-aho T., and Tarkoma S. Rtfm: Publish/subscribe internetwork-
ing architecture. ICT-MobileSummit 2008 Conference Proceedings, Paul Cunning-
ham and Miriam Cunningham (Eds), IIMC International Information Management
Corporation, 2008.
[54] Van Jacobson, Diana K. Smetters, James D. Thornton, Michael F. Plass,
Nicholas H. Briggs, and Rebecca L. Braynard. Networking named content. In
CoNEXT ’09: Proceedings of the 5th international conference on Emerging net-
working experiments and technologies, pages 1–12, New York, NY, USA, 2009.
ACM.
[55] GENI System Overview, September 2008. Available at http://www.geni.net.
[56] GENI Control Framework Requirements, January 2009. Available at http://www.
geni.net.
[57] Peterson L. PlanetLab: A Blueprint for Introducing Disruptive Technology into
the Internet. http://www.planet-lab.org, January 2004.
[58] PlanetLab GENI Control Framework Overview, January 2009. Available at http:
//www.geni.net.
[59] Emulab - network emulation testbed. http://www.emulab.net.
[60] Mike Hibler, Robert Ricci, Leigh Stoller, Jonathon Duerig, Shashi Guruprasad,
Tim Stack, Kirk Webb, and Jay Lepreau. Large-scale Virtualization in the Emulab
Network Testbed. In Proceedings of the 2008 USENIX Annual Technical Confer-
ence, pages 113–128, June 2008.
[61] Open resource control architecture. http://nicl.cod.cs.duke.edu/orca/about.
html.
[62] P. Szegedi, S. Figuerola, M. Campanella, V. Maglaris, and C. Cervello-Pastor.
With evolution for revolution: managing FEDERICA for future Internet research.
Communications Magazine, IEEE, 47(7):34–39, July 2009.
[63] Snehapreethi Gopinath, Shweta Jain, Shivesh Makharia, and Dipankar Raychaud-
huri. An experimental study of the cache-and-forward network architecture in
multi-hop wireless scenarios. In Proc. of the 17th IEEE Workshop on Local and
Metro Area Networks (LANMAN 2010), Long Branch, NJ, May 2010.
[64] E. Grasa, G. Junyent, S. Figuerola, A. Lopez, and M. Savoie. Uclpv2: a network
virtualization framework built on web services [web services in telecommunications,
part ii]. Communications Magazine, IEEE, 46(3):126 –134, march 2008.
[65] E. Grasa et al. UCLPv2: A Network Virtualization Framework Built on Web
Services. Communications Magazine, IEEE, 46(3):126–34, March 2008.
[66] Matthias Nicola and Jasmi John. Xml parsing: A threat to database performance.
In In proc. of 12th Intl. Conference on Information and Knowledge Management,
pages 175–178, New Orleans, Louisiana, 2003.
[67] D. Davis and M.P. Parashar. Latency performance of soap implementations. In
Cluster Computing and the Grid, 2002. 2nd IEEE/ACM International Symposium
on, New Orleans, Louisiana, may 2002.
[68] Hadi Bannazadeh. Hardware-based Content Processing, May 2007.
[69] F. Hartung, N. Niebert, A. Schieder, R. Rembarz, S. Schmid, and L. Eggert. Ad-
vances in network-supported media delivery in next-generation mobile systems.
Communications Magazine, IEEE, 44(8):82 –89, aug. 2006.
[70] D. Chappell. Theory in Practice: Enterprise Service Bus. O’Reilly Media, USA,
2004.
[71] IBM. Websphere datapower soa appliances. http://www-01.ibm.com/software/
integration/datapower/.
[72] Bo Li and Hao Yin. Peer-to-peer live video streaming on the internet: issues,
existing approaches, and challenges [peer-to-peer multimedia streaming]. Commu-
nications Magazine, IEEE, 45(6):94 –99, june 2007.
[73] I. Hernandez-Serrano, S. Sharma, and A. Leon-Garcia. Reliable p2p networks:
Treblecast and treblecast. In Parallel Distributed Processing, 2009. IPDPS 2009.
IEEE International Symposium on, pages 1 –8, 2009.
[74] Cisco. Application-oriented networking. http://www.cisco.com.
[75] Larry Peterson, Soner Sevinc, Jay Lepreau, Robert Ricci, John Wroclawski, Ted
Faber, Stephen Schwab, and Scott Baker. Slice-based facility architecture. Avail-
able online at http://www.geni.net.
[76] Marc E. Fiuczynski and Herbert Pötzl. Linux-VServer, Resource Efficient OS-Level
Virtualization, June 2007. Available at http://ols.108.redhat.com/2007/
Reprints/potzl-Reprint.pdf.
[77] CANARIE Inc. CANARIE: Canadian Network for the Advancement of Research,
Industry and Education. http://www.canarie.ca.
[78] Glen Gibb, John W. Lockwood, Jad Naous, Paul Hartke, and Nick McKeown.
NetFPGA: An Open Platform for Teaching How to Build Gigabit-Rate Network
Switches and Routers. Trans. on Education, 51(3):364–369, August 2008.
[79] Yu Cheng, R. Farha, A. Tizghadam, Myung Sup Kim, M. Hashemi, A. Leon-Garcia,
and J.W.-K. Hong. Virtual network approach to scalable ip service deployment and
efficient resource management. Communications Magazine, IEEE, 43(10):76 – 84,
oct. 2005.
[80] C. Chang, J. Wawrzynek, and R.W. Brodersen. BEE2: a high-end reconfigurable
computing system. Design and Test of Computers, IEEE, 22(2):114–125, March-
April 2005.
[81] Sun Microsystems Inc. OpenESB: The Open Enterprise Service Bus. http://
open-esb.dev.java.net.
[82] Sun Microsystems Inc.: Java Web Start Technologies. http://java.sun.com/
javase/technologies/desktop/javawebstart.
[83] Ontario Research and Innovation Optical Network (ORION). http://www.orion.
on.ca.
[84] IEEE 802.1ad-2005, Virtual Bridged Local Area Networks Amendment 4: Provider
Bridges, 2006. Available at http://standards.ieee.org.
[85] Inc VMWare. VMware: A Virtual Computing Environment. http://www.vmware.
com, 2001.
[86] Padala P., Zhu X., Wang Z., Singhal S., and Shin K.G. Performance Evaluation
of Virtualization Technologies for Server Consolidation, 2007. Available at http:
//www.hpl.hp.com/techreports/2007/HPL-2007-59R1.html.
[87] Cloud Computing Definition, National Institute of Standards and Technol-
ogy, Version 15, 2006. Available at http://csrc.nist.gov/groups/SNS/
cloud-computing/index.html.
[88] The Internet Engineering Task Force (IETF). Rfc3448: Tcp friendly rate control
(tfrc). http://www.ietf.org/rfc/rfc3448.txt.
[89] S. Biyani and J. Martin. A comparison of tcp-friendly congestion control protocols.
In Computer Communications and Networks, 2004. ICCCN 2004. Proceedings. 13th
International Conference on, pages 255 –260, Oct 2004.
[90] IEEE 802.3x-1997, Local and Metropolitan Area Networks: Specification for 802.3
Full Duplex Operation, 1997. Available at http://standards.ieee.org.
[91] IEEE 802.1au, Virtual Bridged Local Area Networks Amendment Congestion No-
tification. Available at www.ieee802.org/1/pages/802.1au.html.
[92] Jinjing Jiang, R. Jain, and Chakchai So-In. An explicit rate control framework for
lossless ethernet operation. In Communications, 2008. ICC ’08. IEEE International
Conference on, pages 5914–5918, May 2008.
[93] Gary McAlpine, Manoj Wadekar, Tanmay Gupta, Alan Crouch, and Don Newell.
An architecture for congestion management in ethernet clusters. In IPDPS ’05:
Proceedings of the 19th IEEE International Parallel and Distributed Processing
Symposium - Workshop 9, page 211.1, 2005.
[94] Chakchai So-In, R. Jain, and Jinjing Jiang. Enhanced forward explicit conges-
tion notification (e-fecn) scheme for datacenter ethernet networks. In Performance
Evaluation of Computer and Telecommunication Systems, 2008. SPECTS 2008.
International Symposium on, pages 542 –546, June 2008.
[95] Linux Advanced Routing and Traffic Control. Available at http://lartc.org/.
[96] M. Bichler and K-J. Lin. Service-Oriented Computing. IT Systems Perspectives,
39(3):99–101, March 2006.
[97] X. Gu and K. Nahrstedt. Distributed Multimedia Service composition with statis-
tical QoS Assurances. IEEE Transactions on Multimedia, 8(1):141–151, Feb 2006.
[98] L. Zeng, B. Benatallah, A.H.H Ngu, M. Dumas, J.Kalagnanam, and H. Chang.
QoS-Aware Middleware for Web Service Composition. IEEE Transactions on Soft-
ware Engineering, 30(5):311–327, May 2004.
[99] P. Doshi, R. Goodwin, R. Akkiraju, and K. Verma. Dynamic workflow composition
using Markov decision processes. In Proc. IEEE International Conference on Web
Services, pages 576–582, July 2004.
[100] Thomas Phan and Wen-Syan Li. Heuristics-based scheduling of composite web
service workloads. In MW4SOC ’06: Proceedings of the 1st workshop on Middleware
for Service Oriented Computing (MW4SOC 2006), pages 30–35, New York, NY,
USA, 2006. ACM.
[101] K.W. Ross and D.H.K. Tsang. The stochastic knapsack problem. Communications,
IEEE Transactions on, 37(7):740 –747, jul 1989.
[102] D.P. Bertsekas. Dynamic Programming and Optimal Control, volume 1. Athena
Scientific, Belmont, Massachusetts, third edition, 2005.
[103] M.L. Puterman. Markov Decision Processes. Wiley Inter-Science, New York, 1994.
[104] S.D. Moitra. Skewness and the Beta Distribution. Journal of Operation Research
Society, 41(10):953–961, Oct 1990.
[105] Menasce Daniel A., Casalicchio Emiliano, and Dubey Vinod. A heuristic approach
to optimal service selection in service oriented architectures. In WOSP ’08: Pro-
ceedings of the 7th international workshop on Software and performance, pages
13–24, New York, NY, USA, 2008. ACM.
[106] Danilo Ardagna and Barbara Pernici. Adaptive service composition in flexible
processes. IEEE Transactions on Software Engineering, 33:369–384, 2007.
[107] Valeria Cardellini, Emiliano Casalicchio, Vincenzo Grassi, Francesco Lo Presti, and
Raffaela Mirandola. Qos-driven runtime adaptation of service oriented architec-
tures. In ESEC/FSE ’09: Proceedings of the the 7th joint meeting of the European
software engineering conference and the ACM SIGSOFT symposium on The foun-
dations of software engineering, pages 131–140, New York, NY, USA, 2009. ACM.
[108] Tao Yu, Yue Zhang, and Kwei-Jay Lin. Efficient algorithms for web services selec-
tion with end-to-end qos constraints. ACM Trans. Web, 1(1):6, 2007.
[109] David Chappell and David Berry. Soa - ready for primetime: The next-generation,
grid-enabled service-oriented architecture. SOA Magazine, September 2007.
[110] Menasce Daniel A., Ruan Honglei, and Gomaa Hassan. Qos management in service-
oriented architectures. Perform. Eval., 64(7-8):646–663, 2007.
[111] Markus Schmid and Reinhold Kroeger. Decentralised qos-management in service
oriented architectures. In Distributed Applications and Interoperable Systems, vol-
ume 5053/2008, pages 44–57. Springer Berlin / Heidelberg, 2008.
[112] S. Rosario, A. Benveniste, S. Haar, and C. Jard. Probabilistic QoS and Soft Con-
tracts for Transaction-Based Web Services Orchestrations. IEEE Transaction on
Services Computing, 1(4):187–200, October-December 2008.
[113] Leyuan Shi. Approximate analysis for queueing networks with finite capacity and customer loss. European Journal of Operational Research, 85(1):178–191, 1995.
[114] Boualem Rabta. Rapid Modelling for Increasing Competitiveness, chapter A Review
of Decomposition Methods for Open Queueing Networks, pages 25–42. 2009.
[115] Carolina Osorio and Michel Bierlaire. An analytic finite capacity queueing network model capturing the propagation of congestion and blocking. European Journal of Operational Research, 196(3):996–1007, 2009.
[116] H. Kobayashi and B. Mark. System Modeling and Analysis: Foundations of System Performance Evaluation. Pearson Education, Inc., Upper Saddle River, New Jersey, 2009.
[117] Raj Jain. The art of computer systems performance analysis: techniques for experimental design, measurement, simulation, and modeling. John Wiley & Sons, Inc., New York, NY, 1991.
[118] A. Papoulis and S. U. Pillai. Probability, Random Variables and Stochastic Processes. McGraw-Hill, New York, 2002.
[119] ITU-T Recommendation Z.100, Specification and Description Language. Available online at http://www.itu.int/rec/T-REC-Z.100-200711-I/en, 2007.
[120] Cheng-Yuan Ku, Din-Yuen Chan, and Lain-Chyr Hwang. Optimal reservation
policy for two queues in tandem. Inf. Process. Lett., 85(1):27–30, 2003.
[121] Cheng-Yuan Ku and Scott Jordan. Near optimal admission control for multiserver
loss queues in series. European Journal of Operational Research, 144(1):166–178,
2003.
[122] S. Balsamo, V. De Nitto Persone, and R. Onvural. Analysis of Queueing Networks with Blocking. Kluwer Academic Publishers, 2001.
[123] W. Whitt. The queueing network analyzer. The Bell System Technical Journal,
62(9):2779–2815, 1983.
[124] A. Heindl. Approximate analysis of queueing networks with finite buffers and losses
by decomposition. Technical Report 1998-8, 1998.
[125] J.C. Strelen. Loss queueing networks with bursty arrival processes and phase type service times: Approximate analysis. In Proceedings of the 5th IFIP Workshop on Performance Modelling and Evaluation of ATM Networks, pages 87/1–10, 1997.
[126] Sushant Jain and J. MacGregor Smith. Open finite queueing networks with M/M/C/K parallel servers. Computers & Operations Research, 21(3):297–317, 1994.
[127] R. Sadre, B. Haverkort, and A. Ost. An efficient and accurate decomposition method for open finite and infinite buffer queueing networks. In Proceedings of the Third International Workshop on Numerical Solution of Markov Chains, page 120, 1999.
[128] Abigail Lebrecht and William J. Knottenbelt. Response time approximations in fork-join queues. In Proceedings of the 23rd Annual UK Performance Engineering Workshop (UKPEW), June 2007.
[129] R. Nelson and A.N. Tantawi. Approximate analysis of fork/join synchronization in parallel queues. IEEE Transactions on Computers, 37(6):739–743, Jun 1988.
[130] Edward D. Lazowska, John Zahorjan, G. Scott Graham, and Kenneth C. Sevcik.
Quantitative system performance: computer system analysis using queueing net-
work models. Prentice-Hall, Inc., Upper Saddle River, NJ, USA, 1984.
[131] Majid Ghaderi and Raouf Boutaba. Call admission control in mobile cellular net-
works: a comprehensive survey: Research articles. Wirel. Commun. Mob. Comput.,
6(1):69–93, 2006.
[132] D.A. Levine, I.F. Akyildiz, and M. Naghshineh. A Resource Estimation and Call Admission Algorithm for Wireless Multimedia Networks using Shadow Cluster Concept. IEEE/ACM Transactions on Networking, 5(1):1–12, Feb 1997.
[133] T. Zhang, E. van den Berg, J. Chennikara, P. Agrawal, Jyh-Cheng Chen, and T. Kodama. Local predictive resource reservation for handoff in multimedia wireless IP networks. IEEE Journal on Selected Areas in Communications, 19(10):1931–1941, Oct 2001.
[134] Ti-Yen Yen and Wayne Wolf. Performance estimation for real-time distributed
embedded systems. IEEE Trans. Parallel Distrib. Syst., 9(11):1125–1136, 1998.
[135] Lei Ju, Abhik Roychoudhury, and Samarjit Chakraborty. Schedulability analysis of MSC-based system models. In RTAS ’08: Proceedings of the 2008 IEEE Real-Time and Embedded Technology and Applications Symposium, pages 215–224, Washington, DC, USA, 2008. IEEE Computer Society.
[136] Firat Kart, Louise E. Moser, and P. Michael Melliar-Smith. Building a distributed e-healthcare system using SOA. IT Professional, 10(2):24–30, 2008.
[137] Sorin Manolache, Petru Eles, and Zebo Peng. Schedulability analysis of applica-
tions with stochastic task execution times. Trans. on Embedded Computing Sys.,
3(4):706–735, 2004.
[138] Sorin Manolache, Petru Eles, and Zebo Peng. Schedulability analysis of multipro-
cessor real-time applications with stochastic task execution times. In ICCAD ’02:
Proceedings of the 2002 IEEE/ACM international conference on Computer-aided
design, pages 699–706, New York, NY, USA, 2002. ACM.
[139] Sheldon M. Ross. Stochastic Processes. John Wiley & Sons, 1996.
[140] P.V. Hentenryck and R. Bent. Online Stochastic Combinatorial Optimization. The
MIT Press, Cambridge, Massachusetts, 2006.
[141] Martin Bichler and Thomas Setzer. Admission control for media on demand ser-
vices. Service Oriented Computing and Applications, 1(1):65–73, Apr 2007.
[142] lp_solve: Mixed Integer Linear Programming (MILP) solver. http://sourceforge.net/projects/lpsolve.
[143] M. Shaked and J.G. Shanthikumar. Stochastic Orders and Their Applications. Academic Press, Boston, Massachusetts, 1994.
[144] Richard E. Barlow, Frank Proschan, and Larry C. Hunter. Mathematical Theory
of Reliability. SIAM, New York, NY, 1996.
[145] Alberto Leon-Garcia. Probability, Statistics, and Random Processes For Electrical Engineering. Addison-Wesley, New York, 2008.
[146] Joseph Y. Hui. Switching and traffic theory for integrated broadband networks.
Kluwer Academic Publishers, Massachusetts, 1990.
Glossary
ABS Already Being Served.
AON Application-Oriented Network.
AOR Application-Oriented Router.
BEE2 Berkeley Emulation Engine 2.
BIP Binary Integer Programming.
BPEL Business Process Execution Language.
CAC Call Admission Control.
CDN Content Distribution (Delivery) Network.
CEP Complex Event Processing.
CLT Central Limit Theorem.
CP Complete Partitioning.
CS Complete Sharing.
DASC Distributed Algorithm for Service Commitment.
DETS Distributed Ethernet Traffic Shaping.
DFR Decreasing Failure Rate.
EC2 Amazon Elastic Compute Cloud.
ESB Enterprise Service Bus.
FCP Full Commitment Policy.
FCQN Finite Capacity Queuing Network.
FECN Forward Explicit Congestion Notification.
FPGA Field-Programmable Gate Array.
GENI Global Environment for Network Innovations.
GPU Graphics Processing Unit.
GUI Graphical User Interface.
HTTP Hypertext Transfer Protocol.
IFR Increasing Failure Rate.
IMS IP Multimedia Subsystem.
IP Internet Protocol.
JMS Java Message Service.
LP Linear Programming.
MDP Markov Decision Processes.
NCP No Commitment Policy.
NGN Next Generation Network.
PCP Partial Commitment Policy.
Q-DASC Queue-enabled Distributed Algorithm for Service Commitment.
RAA Rate Allocation Algorithm.
RAA-FE Rate Allocation Algorithm-Forward Explicit.
RAA-FP Rate Allocation Algorithm-Fast Probe.
RAA-FS Rate Allocation Algorithm-Fair Share.
RAA-SP Rate Allocation Algorithm-Slow Probe.
SDL Specification and Description Language.
SIP Session Initiation Protocol.
PN Physical Node.
SNMP Simple Network Management Protocol.
VN Virtual Node.
SOA Service-Oriented Architecture.
SSL Secure Sockets Layer.
SSS Service Signaling Stratum.
TES Time to Enter Service.
TLS Transport Layer Security.
UCLP User Controlled Light Path.
UUID Universally Unique IDentifier.
VANI Virtualized Application Networking Infrastructure.
VANI-AP VANI Application Plane.
VANI-CMP VANI Control and Management Plane.
VLAN Virtual Local Area Network.
WS Web Service.
WSDL Web Service Description Language.
XML Extensible Markup Language.
XMPP Extensible Messaging and Presence Protocol.