
End-to-End Quality of Service for High-End Applications

Ian Foster (a,b), Markus Fidler (c), Alain Roy (d), Volker Sander (e), Linda Winkler (a)

(a) Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, U.S.A.

(b) Department of Computer Science, The University of Chicago, Chicago, IL 60637, U.S.A.

(c) Department of Computer Science, Aachen University, 52064 Aachen, Germany

(d) Department of Computer Sciences, University of Wisconsin-Madison, Madison, WI 53706, U.S.A.

(e) Central Institute for Applied Mathematics, Forschungszentrum Jülich GmbH, 52425 Jülich, Germany

High-end networked applications such as distance visualization, distributed data analysis, and advanced collaborative environments have demanding quality of service (QoS) requirements. Particular challenges include concurrent flows with different QoS specifications, high bandwidth flows, application-level monitoring and control, and end-to-end QoS across networks and other devices. We describe a QoS architecture and implementation that together help to address these challenges. The General-purpose Architecture for Reservation and Allocation (GARA) supports flow-specific QoS specification, immediate and advance reservation, and online monitoring and control of both individual resources and heterogeneous resource ensembles. Mechanisms provided by the Globus Toolkit are used to address resource discovery and security issues when resources span multiple administrative domains. Our prototype GARA implementation builds on differentiated services mechanisms to enable the coordinated management of two distinct flow types (foreground media flows and background bulk transfers) as well as the co-reservation of networks, CPUs, and storage systems. We present results obtained on a wide area differentiated services testbed that demonstrate our ability to deliver QoS for realistic flows.

1. Introduction

Investigations of network quality of service (QoS) have tended to focus on the aggregation of relatively low-bandwidth flows associated with Web and media streaming applications. Yet the QoS requirements associated with these flows are not representative of all interesting applications. For example, distance visualization applications encountered in science and engineering can involve data transfers and media streaming at hundreds (ultimately thousands) of megabits per second (Mb/s), while the bulk data transfer operations required for replication or analysis of large datasets can require sustained high bandwidths expressed in terms of terabytes per hour. Advanced collaborative environments can require complex mixes of these and other flows, with varying service level requirements and many interdependencies.

The development of QoS for such high-end applications introduces major challenges for both QoS protocols and higher-level architectures that use these protocols to provide end-to-end solutions for users.

At the architectural level, new concepts and constructs are required for dealing with end-to-end flows that involve multiple scarce resources: for example, advance reservation mechanisms, to ensure availability [14,17,52]; co-reservation of network, compute, storage, and other resources [11]; control and monitoring application programmer interfaces (APIs) for application-level adaptation [26,15,45]; and policy mechanisms able to deal with large reservations and complex hierarchical allocation strategies.

When considering appropriate QoS protocols to support these high-end applications, two major contenders are apparent: Integrated Services and Differentiated Services. The Integrated Services architecture [7] aims to address these heterogeneous demands for quality of service by allowing end-to-end reservations of network capacity for individual flows, usually by using the Resource Reservation Protocol (RSVP) [53]. Unfortunately, the fine granularity of the Integrated Services approach as originally specified was unlikely to scale well enough for wide use in the Internet. (Note however recent proposals that avoid the need to police all flows [35].) In addition to attempted modifications of Integrated Services, Differentiated Services (DS) [6], the most recent approach of the Internet Engineering Task Force (IETF) towards quality of service, addresses these scalability issues by aggregating micro-flows into classes. The core network then needs to support only a small number of service classes, which offers better scalability. While DS has advantages in terms of scalability, it is not obvious whether and how it can support specialized high-end flows.

The work that we present in this article addresses both the higher-level architecture and protocol challenges just described. We describe the General-purpose Architecture for Reservation and Allocation (GARA), a resource management architecture that builds on mechanisms provided by the Globus Toolkit [19] to support secure immediate and advance co-reservation, online monitoring/control, and policy-driven management of a variety of resource types, including networks [21]. Then, we describe the application of GARA concepts and constructs to DS networks. We present a DS resource manager (i.e., bandwidth broker [6,31]) and explain how this resource manager integrates with GARA facilities (e.g., advance reservation, authentication/authorization). We describe how this resource manager builds on DS mechanisms to support two heterogeneous types of flows within a single framework: latency- and jitter-sensitive (e.g., media flows) and high-bandwidth but latency-insensitive (e.g., bulk transfer). We also propose a policy model that allows admission control decisions to be made at multiple levels. Finally, we present performance experiments conducted on both local area and wide area DS network testbeds; our results demonstrate our ability to support multiple flow types and to co-reserve network and CPU resources.

The rest of this article is structured as follows. In Section 2 we introduce the QoS requirements of high-end applications. Then, in Section 3, we describe GARA and its implementation. In Section 4, we discuss its application in the context of DS networks, and in Section 5 we present our experimental results. We discuss multi-domain issues in Section 6 and related work in Section 7, and conclude with a discussion of future directions.

2. QoS Requirements of High-End Applications

We use three representative examples to illustrate the QoS requirements of the high-end network applications that are encountered, for example, in advanced scientific and engineering computing [20].

2.1. Application Descriptions

Distance visualization of large datasets. Scientific instruments and supercomputer simulations generate large amounts of data: tens of terabytes today, petabytes within a few years. Remote interactive exploration of such datasets requires that the conventional visualization pipeline be decomposed across multiple resources [1,5,18]. A realistic configuration might involve moving data at hundreds or thousands of Mb/s to a data analysis and rendering engine which then generates and streams real-time MPEG-2 (or later HDTV) video to remote client(s), with control information flowing in the other direction. QoS parameters of particular interest for this class of application include bandwidth, latency, and jitter; resources involved in delivering this QoS include storage, network, CPU, and visualization engines.

Large data transfers. In other settings, large datasets are not visualized remotely but instead are transferred in part or in their entirety to remote sites for storage and/or analysis [33,4,8,32]. The need to coordinate the use of other resources with the completion of these multi-gigabyte or terabyte transfers leads to a need for QoS guarantees of the form "data delivered by deadline" rather than instantaneous bandwidth. Achieving this goal requires the scheduling of storage systems and CPUs as well as networks so as to achieve often extremely high transfer rates.

High-end collaborative environments. High-end collaborative work environments involve immersive virtual reality systems, high-resolution displays, connections among many sites, and multiple interaction modalities including audio, video, floor control, tracking, and data exchange. For example, the Argonne "Access Grid" currently connects some 15 sites via multiple audio, video, and control streams, with the audio streams especially vulnerable to loss. Such applications require QoS mechanisms that allow the distinct characteristics of these different flows to be represented and managed [13,24].

2.2. QoS Requirements

Heterogeneous flows. The applications of interest frequently incorporate multiple flows with widely varying characteristics, in terms of bandwidth, latency, jitter, reliability, and other requirements. GARA addresses these requirements through (a) support for per-flow QoS specifications while maintaining DS-like scalability and (b) a QoS-mechanism-independent architecture that adapts to multiple techniques. A common API means that, for example, a distance visualization application can specify the distinct requirements of high-volume data and latency-sensitive control flows in a mechanism-independent manner; these flows might then be mapped to different mechanisms: e.g., Multi-Protocol Label Switching (MPLS) [41] and DS.

High bandwidth flows. Some applications involve high bandwidth flows that may require a large percentage of the available bandwidth on a high-speed link. For example, we and others have demonstrated transfer rates of over a Gb/s over wide area networks. This characteristic has significant implications for both mechanisms and policy. QoS mechanisms are required that can support such flows while allowing coexistence with other flows having different characteristics. At the policy level, we believe that approaches are required that allow for the coordinated management of resources in multiple domains, so that virtual organizations (e.g., a scientific collaboration) can express policies that coordinate the allocation of the resources available to them in different domains.

Need for end-to-end QoS. Satisfying application-level QoS requirements often requires the coordinated management of resources other than networks: for example, a high-speed data transfer can require the scheduling of storage system, network, and CPU resources. As we shall see, GARA addresses this requirement by defining an extensible architecture that can deal with a range of different resource types and by providing support for the co-allocation of multiple resources.

Need for application-level control. High end-to-end performance requires that applications be able to discover resource availability (GARA), monitor achieved service, and dynamically modify both QoS requests (to the network and to other resources such as CPUs, for example to reduce reservations when load drops [16]) and application behavior.

Need for advance reservation. Specialized resources required by high-end applications such as high-bandwidth virtual channels, scientific instruments, and supercomputers are scarce and in high demand; in the absence of advance reservation mechanisms, coordination of the necessary resources is difficult. Reservation mechanisms are needed to ensure that resources and services may be scheduled in advance. Snell et al. have shown that a meta-scheduler, which schedules a set of Grid resources, can improve the overall effectiveness of the Grid by requesting a deterministic resource in advance [44].

3. The GARA QoS Architecture

We designed GARA to meet the QoS requirements listed above. We introduce GARA concepts here, and then describe below how we apply these concepts in DS networks to manage the allocation of a particular flavor of QoS capability, namely Premium and Guaranteed Rate service.


Figure 1. On the left, the principal APIs used within GARA. On the right, the principal components of the GARA prototype as instantiated for DS and DSRT (CPU scheduler) services, with our own resource manager and slot manager being used in both cases. In the DS case, commands are issued to a router, while in the DSRT case commands are issued to a DSRT server (for tracking reservations).

3.1. GARA Overview

GARA defines APIs that allow users and applications to manipulate reservations of different resources in uniform ways. For example, essentially the same calls are used to make an immediate or advance reservation of a network or CPU resource. Once a reservation is made, an opaque object called a reservation handle is returned that allows the calling program to monitor, modify, and cancel the reservation. Other functions allow reservations to be monitored by polling or through a callback mechanism in which a user's function is called every time the state of the reservation changes in an interesting way.

As illustrated on the left side of Figure 1, GARA defines APIs at multiple levels so as to maximize both the functionality delivered to the user and opportunities for code reuse in implementations. In particular, the Local Reservation and Allocation Manager (LRAM) API provides direct access to reservation functions within a trust domain, while the remote API provides remote access to LRAM functionality, addressing issues of authentication and authorization. Both APIs implement the functionality described in the preceding paragraph.

The uniform treatment of reservations provided by GARA makes it possible to define and reuse co-reservation and co-allocation libraries that encode strategies for the coordinated use of multiple resources [11]. Because different resources (e.g., computers and storage systems) can be manipulated via the same function calls, standard libraries can be developed that encode strategies for dealing with, for example, co-reservation and fault recovery.

One co-reservation library that we have developed in support of our work with GARA implements an end-to-end network API that provides end-to-end analogs of each of the remote API calls. This API allows the user to create, monitor, cancel, etc., network co-reservations: that is, reservations involving more than one network resource. This API allows users and applications to ignore details of the underlying network topology.

Figure 2 illustrates the use of this end-to-end API. This program first determines the bandwidth requirements of an application and then queries to determine available Premium bandwidth over the path of interest. A reservation is created for the smaller of these two values, and the reservation handle H is used to bind the reservation to the previously created flow. The application then checks periodically to see whether the reservation can be increased. Notice that the changes to what is otherwise a conventional socket-based code are small.

UDP-streamer(host A, host B) {
    (PortA, PortB) = new_socket_conn(A, B)
    F = compute_flow_requirement()
    When = {NOW, 60 mins}
    Max = EnquireE2EResv(A, B, When)
    if (Max.forward > F) then
        R = F
    else
        R = Max.forward
    H = CreateE2EResv(A, B, R, 0, When)
    BindE2EResv(H, PortA, PortB)
    repeat until done {
        <send for a while>
        Max = EnquireE2EResv(A, B, When)
        if (Max.forward > 0 && R < F) then {
            R = Max.forward + R
            if (R > F) then
                R = F
            ModifyE2EResv(H, R, When)
        }
    }
}

Figure 2. Pseudo-code for a simple application that uses the GARA end-to-end API to first make and subsequently monitor and modify a reservation. For brevity, this code does not include error checking.

We note that while this example emphasizes application-centered monitoring and control of reservation state, GARA also supports third-party reservation operations. For example, we could remove the reservation logic from Figure 2 altogether and instead perform appropriate reservation operations in a separate process.

3.2. GARA Implementation

We review GARA implementation issues and status, working up from the bottom of our API stack.

GARA must provide admission control and reservation enforcement for multiple resources of different types. Because few resources provide reservation capabilities, we have implemented our own resource manager so as to ensure availability of reservation functions. As illustrated in Figure 1, this manager uses a slot table [14,31] to keep track of reservations and invokes resource-specific operations to enforce reservations. Requests to this resource manager are made via the LRAM API and result in calls to functions that add, modify, or delete slot table entries; timer-based callbacks generate call-outs to resource-specific routines to enable and cancel reservations. Note that only certain elements of this resource manager need to be replaced to instantiate a new resource interface. To date, we have developed resource managers for DS networks (described below), for the Distributed Soft Real-Time (DSRT) CPU scheduler [9], and for the Distributed Parallel Storage System (DPSS) [50], a network storage system; others are under development.
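To make the slot-table idea concrete, the following is a minimal sketch of admission control against a slot table, assuming a single scalar capacity per resource. The names (SlotTable, Reservation, reserve) are ours for illustration and are not the GARA API.

from dataclasses import dataclass
from typing import List

@dataclass
class Reservation:
    start: float      # seconds since epoch
    end: float
    amount: float     # e.g., Mb/s of Premium bandwidth

class SlotTable:
    """Illustrative slot table: tracks reservations and admits new ones."""

    def __init__(self, capacity: float):
        self.capacity = capacity
        self.reservations: List[Reservation] = []

    def _peak_load(self, t0: float, t1: float) -> float:
        # The committed load is piecewise constant and can only rise at a
        # reservation's start time, so checking t0 and all starts in the
        # interval is sufficient.
        points = {t0} | {r.start for r in self.reservations if t0 <= r.start < t1}
        return max(sum(r.amount for r in self.reservations
                       if r.start <= p < r.end) for p in points)

    def reserve(self, start: float, end: float, amount: float) -> Reservation:
        if self._peak_load(start, end) + amount > self.capacity:
            raise ValueError("admission control: insufficient capacity")
        r = Reservation(start, end, amount)
        self.reservations.append(r)
        return r

    def cancel(self, r: Reservation) -> None:
        self.reservations.remove(r)

A resource-specific manager would, in addition, register timer callbacks at each reservation's start and end times to invoke the enable and cancel routines mentioned above.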

Our implementation of the end-to-end API invokes a path service to identify the resource managers that must be contacted to arrange for an end-to-end reservation, and then makes a series of GARA remote API calls to perform the co-reservation operation. See below for a discussion of issues that arise when traversing multiple domains.

Our GARA prototype uses two "Grid" services provided by the Globus Toolkit: the Monitoring and Discovery Service (MDS) [10], currently based on the Lightweight Directory Access Protocol (LDAP), which is used for publishing reservation status information and for accessing path information; and the public-key based Grid Security Infrastructure for authentication and authorization services. The interfaces to these services are simple and well-defined (LDAP and GSS-API, respectively), hence it is straightforward to substitute alternative implementations.

4. GARA and Differentiated Services Networks

The DS architecture is based on a simple model in which packets entering a network are classified and possibly conditioned at the boundaries of the network by service provisioning policies, and assigned to different behavior aggregates. Within the core of the network, packets are forwarded according to the per-hop behavior (PHB) associated with the DS classification. These mechanisms have the advantage of not requiring that per-flow state be maintained within the network. However, few guarantees can be made about end-to-end behaviors, which instead emerge as the composition of the PHBs associated with individual links.

4.1. Integrating Differentiated Services and GARA

We have interfaced GARA concepts and constructs to DS mechanisms in order to manage the allocation of Premium or Guaranteed Rate service bandwidth. As shown in Figure 4, we associate GARA resource managers with the locations at network edges where admission control occurs. These resource managers are, in essence, what DS papers call "bandwidth brokers" [6]: they generate their region's marked (Premium) traffic allocations and control the devices (e.g., routers) used to enforce these allocations. Requests to resource managers are authenticated, ensuring secure operation.

We have constructed our DS resource manager to support two classes of Premium service: a foreground service, for latency- and jitter-sensitive flows (e.g., multimedia streaming and control), and a background service, for long-lived, high bandwidth but latency-insensitive flows (e.g., bulk data transfer operations). The resource manager changes background reservations dynamically as foreground reservations come and go, generating callbacks to the application when a reservation changes. This strategy allows bulk data transfers to co-exist with multimedia flows. The amount of bandwidth available for background reservations over a particular time period can then be controlled via policy mechanisms. We report results with this approach below. Our prototype supports multiple foreground reservations but initially only a single background reservation; the extensions required to support multiple background flows are not complex.
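The following sketch illustrates this adjustment loop under stated assumptions: the notify_bulk_app callback, the dictionary of foreground reservations, and the 90 percent headroom factor (consistent with the experiments below, where the background flow starts at 90 percent of the Premium capacity) are ours, not GARA code.

class PremiumClassManager:
    """Sketch: recompute the single background reservation whenever a
    foreground reservation is added or removed, and notify the sender."""

    def __init__(self, premium_capacity_mbps: float, notify_bulk_app):
        self.capacity = premium_capacity_mbps
        self.foreground = {}               # reservation id -> Mb/s
        self.notify_bulk_app = notify_bulk_app
        self.background_rate = 0.0

    def _recompute_background(self) -> None:
        unused = self.capacity - sum(self.foreground.values())
        # Assumed 10% headroom rather than filling the Premium class completely.
        self.background_rate = max(0.0, 0.9 * unused)
        self.notify_bulk_app(self.background_rate)   # callback to the bulk sender

    def add_foreground(self, resv_id: str, rate_mbps: float) -> None:
        if sum(self.foreground.values()) + rate_mbps > self.capacity:
            raise ValueError("foreground reservation exceeds Premium capacity")
        self.foreground[resv_id] = rate_mbps
        self._recompute_background()

    def remove_foreground(self, resv_id: str) -> None:
        self.foreground.pop(resv_id, None)
        self._recompute_background()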

A resource management framework for DS networks must also address end-to-end issues. A typical wide area flow requires allocations of Premium bandwidth at multiple edge routers and also within interior domains. For example, in Figure 4, a Premium flow from ANL to LBNL should, in principle, require an allocation not only from the ANL domain for the ANL/ESnet interface (where marking occurs) but also from ESnet for the ANL-LBNL transit traffic and from the LBNL domain for the ESnet/LBNL interface. Hence, we need to associate resource managers with multiple DS domains and to implement co-reservation strategies. Co-reservation operations must be designed with end-to-end verification in mind. In our example, an application that omitted to obtain a reservation for ESnet transit traffic could cause problems for other ANL-LBNL traffic, for example if the aggregate ANL-ESnet traffic exceeded what was allowed by the current ANL-ESnet service level agreement (SLA).

Most DS work assumes that co-reservation operations are encapsulated in the local domain's resource manager: hence, a request to reserve bandwidth from ANL to LBNL results in the ANL manager contacting the ESnet manager, which in turn contacts the LBNL manager. Upon receipt of a positive response from both other managers, a reservation is established. This approach has the advantages of providing trusted co-reservation and of encapsulating all bandwidth broker communication within a single local entity. The approach has disadvantages in settings where end-to-end reservations involve resources other than networks, as a hierarchical co-reservation structure results, or where allocation policies at interior domains depend on factors other than the identity of the requesting manager.

An alternative approach to this problem is to define a two-phase commit protocol. In this approach, an application program, or an agent working on behalf of an application program, contacts each manager in turn. In the first phase, a manager can indicate that acceptance of a reservation is conditional on the requestor securing acceptance (indicated by a signed certificate) from the next manager.
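A minimal sketch of such a two-phase exchange is given below. The DomainManager interface, the certificate argument standing in for the signed acceptance, and the sequential walk along the path are assumptions made for illustration; this is not a specification of the GARA protocol.

from typing import Protocol, List, Optional

class DomainManager(Protocol):
    def prepare(self, request: dict, prior_certificate: Optional[bytes]) -> bytes:
        """Phase 1: return a signed conditional acceptance, or raise."""
    def commit(self, certificate: bytes) -> None:
        """Phase 2: turn the conditional acceptance into a firm reservation."""
    def abort(self, certificate: bytes) -> None:
        """Release a conditional acceptance."""

def co_reserve(managers: List[DomainManager], request: dict) -> List[bytes]:
    accepted: List[bytes] = []
    cert: Optional[bytes] = None
    try:
        # Phase 1: walk the path; each acceptance is conditional on the next.
        for m in managers:
            cert = m.prepare(request, cert)
            accepted.append(cert)
        # Phase 2: all domains agreed, so confirm everywhere.
        for m, c in zip(managers, accepted):
            m.commit(c)
        return accepted
    except Exception:
        # Roll back any conditional acceptances already granted.
        for m, c in zip(managers, accepted):
            m.abort(c)
        raise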

In both approaches, inter-domain SLAs can either be established statically (in which case reservations can only be made if they fit within the pre-established SLAs), or they can be established dynamically, as reservations are made. The latter approach provides greater flexibility but requires more sophisticated policy and enforcement engines in interior domains, as discussed below.

Our initial GARA prototype implements neither of the approaches just described but instead relies on the end-to-end library to implement co-reservations correctly. We assume two domains and static SLAs between domains; hence, we need to allocate bandwidth at just two locations. Reservation policies are expressed via access control lists associated with individual resource managers. These limitations are not inherent in our model and are being removed in current work.

4.2. Differentiated Service Configuration

The final issue to be addressed in a DS implementation relates to how PHBs are configured to provide the premium services desired for particular applications. In our DS implementation, this set-up involves the use of Committed Access Rate (CAR) and Weighted Fair Queuing (WFQ) mechanisms configured by means of the Modular Quality of Service Command-Line Interface (MQC) [51]. Figure 3 shows a schematic of the functionality that is applied at ingress routers.

We police the guaranteed rate traffic on the ingress ports of edge routers and mark all conforming packets by setting the DS Code-Points defined in [39]. An over-provisioned Weighted Fair Queuing configuration is used on the egress port of all routers.

The operation of CAR is controlled via commands issued to the router by the associated GARA resource manager as reservations become active, terminate, are modified, or are cancelled. These commands enter, remove, or modify flow specifications that define a Premium service flow in terms of its source and destination IP address and port, and its rate limit specification (desired average transmission rate bandwidth and a normal and excess burst size). Communication from the resource manager to the Cisco Systems router is performed via the Command Line Interface.

We also use CAR on the ingress ports of inter-domain routers, where it is used to enforce SLAs negotiated with other domains, by rate limiting the marked premium traffic that will be accepted from another domain.

WFQ is used on the egress port of edge routers and in interior routers. WFQ ensures that in periods of congestion (i.e., when packets get queued in the router because the output link does not provide the capacity for delivering them immediately), each DS class receives at least the fraction of the output bandwidth given by the weight defined for that class. Hence, as long as the total marked traffic destined for an output port does not exceed the allocated output bandwidth, WFQ can be used to ensure that marked traffic is forwarded without delay despite congestion in other classes.
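As a small illustration of the quoted guarantee, the share of the egress link that a class is assured under congestion is simply its weight divided by the sum of the configured weights; the weights and link rate below are made-up example values, not our router configuration.

def wfq_guaranteed_share(weights: dict, link_mbps: float) -> dict:
    """Minimum bandwidth per class under congestion: weight_i / sum(weights)."""
    total = sum(weights.values())
    return {cls: link_mbps * w / total for cls, w in weights.items()}

# Example: a 100 Mb/s egress port with Premium weighted 60 and Best-Effort 40
# guarantees Premium at least 60 Mb/s when the port is congested.
print(wfq_guaranteed_share({"premium": 60, "best-effort": 40}, 100.0))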

This use of CAR and WFQ approximates a Guaranteed Rate service, which can be built either on top of the Expedited Forwarding (EF) PHB described by the IETF's DS Working Group in [12], or by using one of the Olympic services built on top of the Assured Forwarding (AF) PHB described in [29].

Applying CAR and WFQ raises the question of how these mechanisms should be configured to meet application-level QoS requirements. This question is complicated by the wide variety of flows that we wish to support (UDP, TCP, low and high bandwidth) and the geographic scale over which QoS is required: from a few meters to thousands of kilometers. Considerable experimentation on the testbed described in the next section has been performed to understand these issues.


Figure 3. The ingress router configuration consists of a classification, policing and marking unit and of an optional traffic shaper. Traffic shaping is not applied for services that require a low latency, whereas guaranteed rate streams are shaped by applying a number of per-flow holding queues prior to WFQ scheduling.

5. Experimental Studies

We report on experiments designed to evaluate the effectiveness of both the GARA architecture and our DS implementation. In particular, we show how a long-lived bulk data transfer with a deadline can share a Premium class with short-lived prioritized foreground flows that require explicit reservation. Two efficient implementations that support such applications are shown. One uses backward signaling, in which case the reservation manager provides information about currently unused Premium capacity to background bulk data transfer applications; these, in turn, are instrumented to adapt their sending rate dynamically to the available Premium capacity. The other option allows the background bulk data transfer to make use of the resource manager's ability to pace the bulk data transfer traffic through traffic shaping at the output interface of the ingress router. Hence, the transmission rate is automatically updated by TCP's self-clocking mechanisms.

5.1. Experimental Configuration

Our experimental configuration, illustrated in Figure 4, comprises a laboratory testbed at Argonne National Laboratory (the Globus Advance Reservation Network Testbed: GARNET) connected to a number of remote sites, including Lawrence Berkeley National Laboratory (LBNL). Connectivity to LBNL is provided by the Energy Sciences Network (ESnet) DS testbed. GARNET allows controlled experimentation with basic DS mechanisms; the wide area extensions allow for more realistic operation, albeit with a small number of sites. As end-system resources are located in different domains, we must deal with distributed authentication and authorization.

Cisco Systems 7507 routers are used for all experiments. Within GARNET, these routers are connected by OC3 ATM connections; across wide area links, they are connected by VCs of varying capacity. We are restricted to these relatively slow speeds because the 7507 cards do not implement CAR and WFQ at speeds faster than OC3. End system computers are connected to routers by either switched Fast Ethernet or OC3 connections. CAR and WFQ are used for QoS enforcement, as described above. Flow specifications supplied to CAR use a bandwidth computed from the user-specified required bandwidth, taking into account packet headers (note that this requires packet size information), with nonconforming traffic dropped. Burst size and excess burst size parameters are both set as follows: if using TCP, to the bandwidth (in bytes/second) times the assumed maximum round trip time, subject to a minimum of 8 Kbytes and a maximum of 2 Mbytes; if using UDP, to 1/4 of this value, subject to a minimum of 8 Kbytes. WFQ was configured statically in all experiments.
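The burst-size rule just stated can be written down directly; the sketch below is our transcription (the function name and the bits-to-bytes conversion are ours, and the UDP case is read as one quarter of the bandwidth-delay product).

def burst_size_bytes(bandwidth_bps: float, rtt_s: float, protocol: str) -> int:
    # Bandwidth-delay product in bytes, from a rate given in bits per second.
    bdp_bytes = (bandwidth_bps / 8.0) * rtt_s
    if protocol.upper() == "TCP":
        # TCP: clamp to the stated 8 KB minimum and 2 MB maximum.
        burst = min(max(bdp_bytes, 8 * 1024), 2 * 1024 * 1024)
    else:
        # UDP: one quarter of the bandwidth-delay product, same 8 KB floor.
        burst = max(bdp_bytes / 4.0, 8 * 1024)
    return int(burst)

# Example: a 40 Mb/s TCP flow with an assumed 75 ms maximum round trip time
# yields 375,000 bytes for both the burst and excess burst parameters.
print(burst_size_bytes(40_000_000, 0.075, "TCP"))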

Figure 4. The experimental configuration used in this work, showing our local GARNET testbed and its extensions to remote sites connected via ESnet and MREN.

During the initial experiment, no traffic shaping is performed on Premium flows beyond the limited shaping provided by WFQ in the presence of congestion. While the lack of shaping has not proven to be a significant problem to date, it will likely be required in future, more dynamic environments.

The network speeds supported in this testbed are clearly not adequate for the high-end applications discussed above: the largest Premium flow that we can support is around 80 Mb/s. Nevertheless, this testbed configuration has allowed us to validate multiple aspects of our general approach. We plan to extend our approach to higher-speed networks in future work.

5.2. Multiple Flows: Local Area Case

Our first experiments evaluate GARA's ability to support multiple flows simultaneously and to support application monitoring of, and adaptation to, changes in reservation status.

We first report on experiments conducted on our local GARNET testbed: see Figure 4. We configured GARNET to create a 45 Mb/s Premium channel in a 100 Mb/s network. We then created five distinct flows: a bulk data transfer, operating as a "background" flow; a competing 80 Mb/s Best-Effort UDP flow (a traffic generator submitting 1,000 byte packets every 100 µsecs); and three independent, short-lived foreground flows with immediate reservations. In this and subsequent experiments we used a simple data transfer program, a modified version of ttcp that was able to limit the frequency at which it wrote to the socket buffer, in order to achieve a user-specified bandwidth, as our "application." The source and destination computers used for the Premium flows were distinct from the computers used for the competing flows.

Figure 5. Performance achieved for a mixture of Premium and Best-Effort services on GARNET. We demonstrate that a bulk-transfer (background) application is able to exploit unused Premium traffic without affecting foreground reservations. See text for details.

Figure 5 shows the bandwidth delivered to the foreground, background, and Best-Effort flows during the experiment. We succeed in delivering "excess" Premium bandwidth to the bulk transfer application without compromising the foreground flows. The good bulk transfer performance that was achieved was made possible by the resource manager's callbacks to the bulk transfer application, which allowed the application to change its sending rate in response to changes in its allocated bandwidth, thereby avoiding packet drops and the invocation of TCP slow-start. The following is a more detailed explanation of the graph:

1. The graph begins with the background TCP traffic, which has a bulk-transfer reservation. This flow is initially allocated 40.5 Mb/s Premium bandwidth: that is, 90 percent of the 45 Mb/s Premium traffic.

2. The competitive UDP traffic is started shortly after the bulk transfer but does not affect it, due to the Premium status of the bulk transfer flow.

3. At 25 secs, another application makes an immediate 36 Mb/s reservation and initiates a 32 Mb/s foreground flow. A callback notifies the bulk transfer application, which reduces its sending rate to adapt to the reduced reservation. (This and other similar transitions take a little time due to the time required to control the router.)

4. At 48 secs, the foreground application finishes its transmission and then cancels its reservation. Another callback allows the bulk transfer process to increase its sending rate to adapt to the newly available Premium traffic.

5. Subsequently, two other foreground flows are created, with similar effects: a 9 Mb/s reservation (8 Mb/s flow) from 75 to 105 secs and an 18 Mb/s reservation (16 Mb/s flow) from 130 to 160 secs.

6. At time 185, the background flow completes and cancels its reservation. The competing traffic rate increases to its target of 80 Mb/s, actually exceeding this briefly because of the filled router queues.


Notice that each time the bulk transfer reservation is reduced, the bulk-transfer rate drops momentarily then recovers. We attribute this behavior to the fact that TCP shrinks its window size when packets are dropped, due to slow start or congestion avoidance.

The model of bulk transfers used to date is intended to increase the economical use of the Guaranteed Rate service. An important improvement to this mechanism would be to add a minimum reservation for the background traffic. By thus guaranteeing a particular amount of bandwidth to the background class, deadline staging operations can easily be implemented. Again, economical use of the service implies that the required reservation to meet the deadline might change over time. Whenever a status update of the available bandwidth is received by the resource manager, it can easily calculate the new required amount of bandwidth.

While Figure 5 shows only a single bulk transfer flow, the basic mechanism can easily be extended to support multiple flows. Assume that there are n bulk transfer flows, with minimum bandwidths b_0, ..., b_{n-1}. At any given time, there is unused bandwidth U, and we know that

\[ U \geq \sum_{i=0}^{n-1} b_i \]

This is because we do admission control to ensure that each bulk transfer flow will always have its minimum bandwidth. We will assign each actual bandwidth, B_i, as:

\[ B_i = U \cdot \frac{b_i}{\sum_{j=0}^{n-1} b_j} \]

This gives a proportional share of the bandwidth, and ensures that each flow gets its minimum bandwidth. An optimization could be to decay the minimum bandwidth when the application is able to send at rates greater than the minimum bandwidth, thus allowing more bandwidth to be given to other flows. This allows the links to be shared, while still assisting the bulk transfer flows in finishing as soon as possible.
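For concreteness, a direct transcription of the B_i formula in plain Python follows; the function name is ours.

def allocate_bulk_bandwidth(unused_capacity: float, minimum_rates: list) -> list:
    """Split unused Premium capacity U among bulk flows with minima b_0..b_{n-1}."""
    total_min = sum(minimum_rates)
    # Admission control guarantees U >= sum of minima, so every share
    # U * b_i / total_min is at least b_i.
    assert unused_capacity >= total_min, "admission control should guarantee this"
    return [unused_capacity * b / total_min for b in minimum_rates]

# Example: U = 40 Mb/s shared by flows with minima 5, 10 and 5 Mb/s
# -> allocations of 10, 20 and 10 Mb/s, each at least its minimum.
print(allocate_bulk_bandwidth(40.0, [5.0, 10.0, 5.0]))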

We now extend this model of bulk data transfer by also applying for unused Best-Effort bandwidth. The required reservation of Premium capacity to meet a deadline can, in particular, be decreased during the run time of a bulk data transfer if parallel sockets are used for the transmission. One of the sockets has to be configured to generate Premium traffic at the confirmed rate to ensure that the deadline is met, whereas no reservation of network capacity is made for the remaining sockets; these are instead mapped to a less than Best-Effort service in order to allow for fairness among competing and responsive Best-Effort flows. Within the Internet2 QBone project, the DS Scavenger service [47] was proposed recently to provide such a less than Best-Effort service for high-bandwidth flows. By applying the Scavenger service, the bulk transfer application collects not only unused Premium bandwidth but also unused Best-Effort capacity, at the price that the Scavenger service can be starved by Best-Effort traffic. Whenever the bulk transfer application succeeds in transmitting an appreciable amount of additional data via the Scavenger service, forward signaling of a reduced required Premium rate to the reservation manager can be applied to decrease the reserved Premium capacity, thus reducing costs and allowing for a smaller blocking rate at the reservation manager due to the higher available Premium capacity.

Our prototype implementation of such a parallel bulk data transfer application is based on a simple data fragmentation and the use of two parallel sockets, with the proposed mapping of one socket to the Premium service and one to the Scavenger service. To overcome the effects of TCP congestion control in the Scavenger class, there is the implementation option of using multiple striped sockets for this class, as offered by gridftp [3].

Figures 6 and 7 show results obtained from our testbed implementation of this combined use of a Premium and a Scavenger service. A target deadline of 200 seconds for a 280 MByte file transfer is applied. The required Premium rate to ensure this deadline with a safety margin of 10 seconds is derived to be 12 Mb/s, for which an initial reservation is made. In addition to this Guaranteed Rate service, the sender applies the Scavenger service in parallel to utilize unused Premium and Best-Effort capacity, if any is available. Each time the sender successfully transmits a configurable additional amount of data (25 MByte) by applying the Scavenger service, it recomputes the reservation of Premium capacity that is required to meet the deadline, and performs a reservation update.
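A sketch of this rate-update rule, in our own formulation rather than the authors' code, is shown below: after each chunk delivered by the Scavenger stream, the Premium reservation is recomputed from the remaining bytes, the time left, and the safety margin.

def required_premium_rate_mbps(remaining_bytes: int,
                               seconds_to_deadline: float,
                               safety_margin_s: float = 10.0) -> float:
    # Rate needed to move the remaining data before the deadline, keeping
    # the stated safety margin; clamp the divisor to avoid division by zero.
    usable = max(seconds_to_deadline - safety_margin_s, 1.0)
    return remaining_bytes * 8 / usable / 1_000_000

# Initial reservation for the experiment in the text:
# 280 MByte in 200 s with a 10 s margin comes out at roughly 12 Mb/s.
print(required_premium_rate_mbps(280 * 10**6, 200.0))

# After some data has been delivered by the Scavenger stream, the call is
# repeated with the smaller remaining volume and the reservation lowered,
# e.g. 180 MByte left with 150 s to go.
print(required_premium_rate_mbps(180 * 10**6, 150.0))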

Figure 6 addresses a scenario without additional traffic across an ATM bottleneck link of roughly 42 Mb/s net capacity. Besides the guaranteed rate of 12 Mb/s used by the Premium stream, the Scavenger stream can use the remaining capacity at a rate of about 30 Mb/s. After 7 seconds, the sender performs the first Premium service rate adaptation, due to the additional 25 MByte of data transferred by means of the Scavenger service, and lowers the Premium capacity reservation to about 10.5 Mb/s. Since the ATM bottleneck link in this experiment is not used by any other flow, the rate at which the Scavenger service can be used increases proportionately. This process occurs repeatedly during the file transfer and allows the reservation manager to redistribute the freed Premium capacity. In addition, the use of parallel sockets reduces the file transfer time to 58 seconds.

Figure 7 shows the same scenario under congestion. After 10, 40 and 70 seconds, congestion occurs, each time for 10 seconds. During these periods the congestion leads to the intended starvation of the less than Best-Effort Scavenger service and thus achieves the desired fairness among Scavenger and Best-Effort flows. Nevertheless, the deadline of the file transfer is never endangered, since the Premium flow still receives the currently required guaranteed rate. Due to the reduced amount of data transmitted via the Scavenger service, Premium rate adaptations are made less frequently in this scenario, and an overall file transfer time of 88 seconds is measured, which is still well below the Premium flow target deadline of 190 seconds.

Therefore, the use of parallel streams in a file transfer scenario and the mapping of these streams onto a Guaranteed Rate and a Scavenger service achieves three main goals:

• The file transfer deadline is guaranteed.

• Excess Premium and Best-Effort capacity can be used if available, to reduce the overall transmission time, and to be able to free Premium resources earlier.

• Fairness towards the Best-Effort class is achieved by applying the less than Best-Effort Scavenger service, thus allowing the coexistence of responsive Best-Effort traffic with high-bandwidth Scavenger flows.

A slightly different implementation of the parallel data transfer can alternatively be based on DS Assured Forwarding, which can also be used to implement a Guaranteed Rate service. Within one AF class, a differentiation in terms of drop precedence can be applied to differently marked packets, based on an implementation of Multiple Random Early Detection. At the ingress node of a DS domain, a meter typically applies a token bucket mechanism such as the Two Rate Three Color Marker [28] proposed for use with AF. This marker marks packets to be treated in the core with a low drop probability if the traffic conforms to a committed information rate, and with a high drop probability if it exceeds this rate.

5.3. Multiple Flows: Wide Area Case

Our next experiments repeat those just described over the wide area network from ANL to LBNL: see Figure 4. Here, we used WFQ to configure our testbed with 55 Mb/s Premium traffic over the 60 Mb/s UBR VC between ANL and LBNL and 27 Mb/s Premium traffic within GARNET. Note that when congested, this wide area Premium traffic configuration is a good approximation to priority queuing. (Only 27 Mb/s Premium traffic was allowed on GARNET in these particular experiments because of either extra traffic or a bad device on a Fast Ethernet segment of the network that we were unable to control; in other experiments, we have successfully configured up to 45 Mb/s Premium.) Here, the background flow is initially allocated 24.3 Mb/s Premium bandwidth (that is, 90 percent of 27 Mb/s), the competing Best-Effort UDP flow operates at 50 Mb/s (1,250 byte packets every 200 µsecs), and two foreground flows are created: a 16 Mb/s flow (18 Mb/s reservation) at 37 secs and an 8 Mb/s flow (9 Mb/s reservation) at 94 secs.

Figure 6. Performance achieved with a combination of a Guaranteed Rate service and a Scavenger service. We demonstrate that a bulk-transfer application with a deadline is able to use a Guaranteed Rate service and in addition exploits unused Premium and Best-Effort capacity by a Scavenger service. See text for details.

As shown in Figure 8, the results obtained in the wide area are almost as good as in the local area. We attribute the somewhat more dynamic behavior during reservation changes to the fact that the kernel buffers associated with the bulk transfer socket take some time to empty. Hence, data is initially sent too rapidly for the updated router configuration, forcing packets to be dropped and TCP to go into slow-start mode. This effect is magnified by the larger bandwidth-delay product and hence larger socket buffers (1 MB in this case) in the wide area network.

5.4. Evaluation of TCP Pacing

So far, we required the application to be instrumented to adapt to a given rate. In this section we introduce the use of traffic shaping as a vehicle to pace TCP throughput efficiently.

The actual throughput achieved by TCP applications depends on two main factors:

• The size of the advertised window determines the transmission rate of the transmitter (disregarding the congestion window). Using the socket API, the related window size can be influenced by explicitly setting the socket buffer size. The optimal socket buffer size should be equal to the bandwidth-delay product [46]; a small sizing sketch follows this list. However, making the socket buffer too large might result in reduced throughput, because the sender might transmit more than the network is actually capable of handling, particularly during the slow-start process when TCP's self-clocking capability is not the dominant influence on bandwidth.

• The application has to provide data to the socket buffer quickly enough that the TCP stack can actually fall into the so-called steady state. This ability often depends on the state of the local operating system, as data might be read from disk or the CPU may be heavily used by other applications. The following section addresses this issue.

Figure 7. Performance achieved with a combination of a Guaranteed Rate service and a Scavenger service. While the Scavenger service reacts to periods of congestion, the Guaranteed Rate service remains stable. We see three periods of congestion: from 12-22, from 42-52, and from 72-82 seconds.
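As a small sketch of the buffer-sizing point referenced in the list above, the send buffer can be sized to the bandwidth-delay product of the reserved rate and the path round trip time; the values used here are examples only, and this is not GARA code.

import socket

def set_send_buffer_to_bdp(sock: socket.socket, rate_bps: float, rtt_s: float) -> int:
    # Bandwidth-delay product in bytes for the reserved rate and path RTT.
    bdp_bytes = int(rate_bps / 8.0 * rtt_s)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bdp_bytes)
    # The kernel may adjust the requested value; return what was granted.
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)

# Example: a 16 Mb/s reservation over a 75 ms path needs roughly 150 KB.
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
print("granted send buffer:", set_send_buffer_to_bdp(s, 16_000_000, 0.075), "bytes")
s.close()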

There has been some discussion as to whether shaping of TCP traffic, i.e., TCP pacing, might increase fairness and throughput [34,2]. However, none of these studies was concerned with TCP flows using a virtual leased line, i.e., a Premium aggregate. Pacing TCP traffic in an environment offering a Guaranteed Rate service facilitates the simple use of network reservations even without knowledge of the actual rate at which the application writes data to the socket buffer. In the above experiments, we instrumented the application to adapt its socket buffer write frequency to the available background rate. In a scenario where TCP pacing is controlled by GARA's resource manager, the use of an oversized socket buffer leverages TCP's self-clocking feature to control the speed of transmission. By coordinating a reservation with the shaping rate, packet drops can be avoided and a well-defined throughput can be established.

The following experiment is designed to demonstrate that shaping a TCP flow enables it to work smoothly with a Premium service without too much effort. We demonstrate two Premium TCP flows between Chicago (ANL) and California (LBNL) which try to exceed the rate they have reserved. This is a fairly likely scenario, since it is often hard for programmers to estimate the bandwidth their applications use. Also, we have already shown elsewhere [43,23] that applications that do not exceed their rate do not have a problem.


Figure 8. Performance achieved for a mixture of Premium and Best-Effort services on a wide area testbed. We demonstrate good performance even in the wide area. See text for details.

Specifically, we used a socket buffer size of 1 MB, and the round trip time was 75 ms. Each flow ran at a different time, but both are shown on the same graph. The application tried to write to the socket buffer at 64 Mb/s while it only made a reservation for 16 Mb/s. One of the flows is shaped to match the reservation bandwidth and is therefore paced to avoid packet drops. Figure 9 shows the achieved throughput for each of the flows. There are two things to notice in this figure. First, the shaped flow has a steady bandwidth at the reservation it made. Second, the unshaped flow has an unstable instantaneous bandwidth. Although it is not obvious from the graph, the average bandwidth of that flow is 9440 Kb/s, which is significantly less than the reservation.

One might hope that using selective acknowledgements would eliminate the need for shaping. This is because SACK can recover from multiple packet losses in roughly one round trip time. However, as can be seen from Figure 10, this is not the case. In this figure, we repeated the same experiment as in Figure 9, but with selective acknowledgements enabled. In this case, the instantaneous bandwidth varies much less, but the average bandwidth is still significantly less than the reservation, i.e., 11992 Kb/s compared to 14448 Kb/s. Even though SACK can recover from packet losses more easily, TCP still reacts to the dropped packets as if they imply network congestion.

The results demonstrate that bulk transfer operation can be controlled by TCP pacing in a Guaranteed Rate service. Without detailed knowledge about the rate at which the application is sending, GARA can react to the current reservation state and provide a constant, smooth transfer rate. Of course, a suitable shaping buffer has to be available in the edge router. This, however, is a realistic assumption, as commodity networking devices offer buffer space on the order of MBytes [43].


Figure 9. Achieved throughput for a Premium TCP flow exceeding the reservation. The SACK option was disabled. The average achieved throughput was 9440 Kb/s without shaping and 14320 Kb/s when shaping was activated. See text for details.

The results also present the benefit of using SACK. While the throughput of the experiment without the SACK option oscillated due to packet loss, SACK could reduce the amplitude of these oscillations. However, the throughput achieved with SACK was still significantly below that of the paced experiment.

Figure 10. Achieved throughput for a Premium TCP flow exceeding the reservation. This time, the SACK option was enabled. The average achieved throughput was 11992 Kb/s without shaping and 14448 Kb/s when shaping was activated. See text for details.

5.5. Co-Reservation of CPU and Network

An important challenge addressed by GARA is the co-reservation of multiple resources: for example, network and CPU, to ensure that a receiver can process incoming data. The experiment reported here demonstrates the ability of GARA to support such co-reservation. Specifically, we established a TCP flow and showed that we can maintain data transfer performance despite competing traffic on the network and competing computational load on the receiving host.

We conducted this experiment on GARNET, using the 100 Mb/s network as before, except that this time Premium traffic is configured to use up to 95 Mb/s. A TCP flow is started, and network and CPU reservations and load are applied in various combinations.

1. A ttcp application is started without network congestion and without any reservation.

2. At 10 secs, an 80 Mb/s traffic generator is started. Because of network congestion, ttcp falls back to slow start and congestion control, with the result that ttcp performance drops precipitously and most available bandwidth is consumed by the competing traffic.

3. At 40 secs, the TCP application creates an immediate network reservation through GARA. Performance increases dramatically.

4. At 60 secs, a significant competing CPU load is imposed on the TCP receiver host. TCP throughput is significantly affected, due to contention for the CPU.

5. At 80 secs, we use GARA to reserve a significant amount of CPU for the receiving TCP process through the DSRT manager. The achieved rate increases immediately, although some variation remains due to the interval-based scheduling used by DSRT.

6. At 120 secs, we cancel the network reservation; TCP performance drops precipitously once again.

7. At 160 secs, we cancel the CPU reservation; this has little further impact on performance.

Figure 11. Performance achieved for a TCP flow in the presence of competing UDP traffic and host load, for various combinations of network and CPU reservation. We demonstrate GARA's ability to co-reserve multiple resource types.

6. Policy in Multidomain Settings

We sketch an approach to expanding GARA to support more sophisticated policy enforcement, particularly in multi-domain settings.

6.1. General Approach

We assume the following system model. An end-to-end reservation may involve multiple resources located in different domains. Resource allocation decisions within a domain remain the responsibility of that domain; hence, end-to-end reservations must be authorized by all appropriate domains or by entities to which domains have delegated this authority.

Our policy approach is designed to support a flexible mix of policy options, for example:

• A domain may allocate resources on the basis of user identity. Such a policy may be appropriate in the case of unique resources for which users make distinct requests, e.g., supercomputers or specialized network resources such as a low-bandwidth outgoing connection.

• A domain may allocate resources in response to a request forwarded from another domain with which some agreement has been negotiated previously. For example, a transit service domain (e.g., ESnet in Figure 4) might negotiate an agreement to accept any allocation request forwarded from another DS domain, up to some SLA limit.

• A domain may allocate resources in response to a request authorized by some third party, such as a virtual organization with which the domain has negotiated an agreement previously [6]. This delegation of authorization allows a community to negotiate agreements with multiple domains in order to obtain control of some amount of Premium end-to-end bandwidth.

We anticipate multiple such authorization policies being active at one time. For example, in an environment such as that of Figure 4, a transit domain such as ESnet might support the following policies (a sketch encoding such rules follows the list):

• Accept immediate reservations of Premium bandwidth from any domain with a previously negotiated SLA, subject to the constraint that no single request can be more than 100 Mb/s and the total requests from a domain cannot exceed its SLA.

• Accept immediate and advance reservation requests labeled as “HEP” if approved by a server operated by the high energy physics community, up to limits and at times previously negotiated with that community.
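
The following Python sketch encodes the two example transit-domain policies as simple rules with an admission check. The class and field names, the caps, and the admit routines are our own simplifications for exposition; they are not part of GARA, nor of any actual SLA or policy-server format.

# Illustrative encoding of the two example transit-domain policies; all names
# and fields are hypothetical simplifications, not a GARA policy language.
from dataclasses import dataclass

@dataclass
class SlaRule:
    """Immediate Premium reservations forwarded by a peer DS domain."""
    peer_domain: str
    sla_limit_mbps: float                # aggregate Premium bandwidth in the SLA
    per_request_cap_mbps: float = 100.0  # no single request may exceed this
    allocated_mbps: float = 0.0          # currently admitted Premium traffic

    def admit(self, request_mbps: float) -> bool:
        if request_mbps > self.per_request_cap_mbps:
            return False                 # single request too large
        if self.allocated_mbps + request_mbps > self.sla_limit_mbps:
            return False                 # domain would exceed its SLA
        self.allocated_mbps += request_mbps
        return True

@dataclass
class CommunityRule:
    """Reservations carrying a community label and approved by that
    community's authorization server."""
    label: str                           # e.g., "HEP"
    authorizer: str                      # identity of the community policy server
    limit_mbps: float                    # aggregate negotiated with the community
    allocated_mbps: float = 0.0

    def admit(self, request_mbps: float, label: str, approved_by: str) -> bool:
        if label != self.label or approved_by != self.authorizer:
            return False                 # not covered by this rule
        if self.allocated_mbps + request_mbps > self.limit_mbps:
            return False                 # community aggregate exhausted
        self.allocated_mbps += request_mbps
        return True

# A request is admitted if any active rule accepts it.
rules = [SlaRule("domain-B", sla_limit_mbps=400.0),
         CommunityRule("HEP", authorizer="hep-policy-server", limit_mbps=200.0)]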

We believe that the authentication and authorization mechanisms provided by the Globus Toolkit offer a basis on which to explore these issues. The Akenti system [49] also provides important relevant technology. We provide a more detailed discussion of the handling of policies in such a distributed environment elsewhere [42].

6.2. Bulk Transfers in Multidomain Settings

So far our model for bulk transfers has assumed a single administrative domain. In a Grid environment, however, this assumption is of limited use, as virtual organizations typically consist of multiple administrative domains. In fact, whenever resources of different institutions are co-allocated, any request for network services passes through at least three domains: the two end domains and at least one transit domain. Figure 12 illustrates this problem.

In a single-domain environment, one resource manager is able to provide a bulk-transfer class as described above. In a multi-domain environment, the situation is more complicated, because there can be many resource managers, each handling its own set of foreground and bulk-transfer flows, and each a potential source of feedback to the application.

This problem can be addressed by introducing the abstraction of an aggregated end-to-end reservation, or core tunnel. Users authorized to use this tunnel can then request portions of its aggregate bandwidth by contacting just the two end domains. The intermediate domains do not need to be contacted as long as the total bandwidth remains less than the size of the tunnel. The importance of this aggregation is increased by the fact that in many Grids, multiple intermediate domains can be involved, with aggregates split across different egress points. By locating the bulk transfer within a single core tunnel, we can perform the adaptation steps described above efficiently.
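
A minimal sketch of the core-tunnel idea follows, assuming the tunnel size has already been negotiated with the intermediate domains; the CoreTunnel class and its methods are illustrative only, not part of GARA. The point is that per-request admission depends only on the remaining tunnel capacity, so the end domains can admit and release bulk flows without contacting the transit domains.

# Minimal sketch of admission against a pre-provisioned core tunnel; the
# class and method names are illustrative, not part of GARA.
class CoreTunnel:
    def __init__(self, capacity_mbps: float):
        self.capacity_mbps = capacity_mbps   # aggregate negotiated with transit domains
        self.reservations = {}               # request id -> bandwidth (Mb/s)

    def used_mbps(self) -> float:
        return sum(self.reservations.values())

    def request(self, req_id: str, bandwidth_mbps: float) -> bool:
        """Admit a flow only if the aggregate stays within the tunnel."""
        if self.used_mbps() + bandwidth_mbps > self.capacity_mbps:
            return False                     # would exceed the tunnel; deny locally
        self.reservations[req_id] = bandwidth_mbps
        return True

    def release(self, req_id: str) -> None:
        self.reservations.pop(req_id, None)

# Example: a 500 Mb/s tunnel shared by bulk transfers between two end domains.
tunnel = CoreTunnel(500.0)
assert tunnel.request("bulk-1", 300.0)       # admitted by the end domains alone
assert not tunnel.request("bulk-2", 300.0)   # denied: aggregate would exceed 500 Mb/s
tunnel.release("bulk-1")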

7. Related Work

The general problem of QoS implementation and management is receiving increased attention (see, e.g., [25]). However, there has been little work on the specific problems addressed in this paper, namely advance reservation and co-reservation of heterogeneous collections of resources for end-to-end QoS, and the use of DS mechanisms to support flow types encountered in high-end applications.

Proposals for advance reservations typically employ cooperating servers that coordinate advance reservations along an end-to-end path [52,17,14,27]. Techniques have been proposed for representing advance reservations, for balancing immediate and advance reservations [17], and for advance reservation of predictive flows [14]. However, this work has not addressed the co-reservation of resources of different types.

The concept of a bandwidth broker (similar to GARA’s network resource manager) is due to Nichols and Jacobson [40]. The Internet2 QBone initiative and the related Bandwidth Broker Working Group are developing testbeds, requirements specifications, and design approaches for bandwidth brokering intended to scale to the Internet [48]. However, advance reservations do not form part of their design. Other groups have investigated the use of DS mechanisms (e.g., [54]), but not for multiple flow types. Hoo et al. [30] propose mechanisms for the secure negotiation of end-to-end reservations.

The co-reservation of multiple resource types has been investigated in the multimedia community: see, for example, [36,38,37]. However, these techniques are specialized to specific resource types.

8. Conclusions and Future Work

We have described a QoS architecture that supports immediate and advance reservation (and co-reservation) of multiple resource types; application-level monitoring and control of QoS behavior; and support for multiple concurrent flows with different characteristics. We have also described how this architecture can be realized in the context of differentiated services networks. We presented experimental results that demonstrate our ability to deliver QoS to multiple flows in local and wide area networks.

Figure 12. The multi-domain reservation problem. Alice needs to contact three BBs to make a network reservation from her computer in domain A to Charlie’s computer in domain C. (Diagram: bandwidth brokers BB-A, BB-B, and BB-C in Domains A, B, and C, with Alice in domain A and Charlie in domain C.)

In future work we plan to improve and extend GARA in a variety of areas, including improved representation and implementation of policy, more sophisticated adaptation mechanisms (including real-time monitoring of network status), and more sophisticated co-reservation algorithms [11]. We also plan to extend our evaluation of GARA mechanisms to a wider range of applications and more complex networks. GARA mechanisms are being incorporated into the Open Grid Services Architecture-compliant [22] version 3.0 of the Globus Toolkit.

Acknowledgments

We gratefully acknowledge assistance provided by Rebecca Nitzan and Robert Olson with experimental studies. Numerous discussions with our colleagues Gary Hoo, Bill Johnston, Carl Kesselman, and Steven Tuecke have helped shape our approach to quality of service. We also thank Cisco Systems for an equipment donation that allowed creation of the GARNET testbed. This work was supported in part by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, U.S. Department of Energy, under Contract W-31-109-Eng-38; by the Defense Advanced Research Projects Agency under contract N66001-96-C-8523; by the National Science Foundation; and by the NASA Information Power Grid program.

REFERENCES

1. M. Aeschlimann, P. Dinda, L. Kallivokas, J. Lopez, B. Lowekamp, and D. O’Hallaron, Preliminary report on the design of a framework for distributed visualization, Proceedings of the Parallel and Distributed Processing Techniques and Applications Conference (1999).

2. A. Aggarwal, S. Savage, and T. Anderson, Understanding the Performance of TCP Pacing, Proceedings of IEEE Infocom (2000) 1157–1165.

3. W. Allcock, J. Bester, J. Bresnahan, A. Chervenak, I. Foster, C. Kesselman, S. Meder, V. Nefedova, D. Quesnel, and S. Tuecke, Data Management and Transfer in High-Performance Computational Grid Environments, Parallel Computing (2001).

4. H. Andrade, T. Kurc, A. Sussman, and J. Saltz, Active Proxy-G: Optimizing the Query Execution Process in the Grid, Proceedings of SC (2002).

5. W. Bethel, B. Tierney, J. Lee, D. Gunter, and S. Lau, Using high-speed WANs and network data caches to enable remote and distributed visualization, Proceedings of ACM/IEEE Supercomputing Conference (2000).

6. S. Blake, D. Black, M. Carlson, M. Davies, Z. Wang, and W. Weiss, An Architecture for Differentiated Services, RFC 2475 (1998).

7. R. Braden, D. Clark, and S. Shenker, Integrated Services in the Internet Architecture: an Overview, RFC 1633 (1994).

8. A. Chervenak, I. Foster, C. Kesselman, C. Salisbury, and S. Tuecke, The Data Grid: Towards an Architecture for the Distributed Management and Analysis of Large Scientific Data Sets, Journal of Network and Computer Applications 23 (2001) 187–200.

9. H. Chu, and K. Nahrstedt, CPU Service Classes for Multimedia Applications, Proceedings of IEEE Multimedia Computing and Systems (1999).

10. K. Czajkowski, S. Fitzgerald, I. Foster, and C. Kesselman, Grid Information Services for Distributed Resource Sharing, Proceedings of the Tenth IEEE International Symposium on High-Performance Distributed Computing (HPDC-10) (2001).

11. K. Czajkowski, I. Foster, and C. Kesselman, Co-allocation Services for Computational Grids, Proceedings of the 8th IEEE Symposium on High Performance Distributed Computing (1999).

12. B. Davie, A. Charny, J.C.R. Bennett, K. Benson, J.Y. Le Boudec, W. Courtney, S. Davari, V. Firoiu, and D. Stiliadis, An Expedited Forwarding PHB (Per-Hop Behavior), RFC 3246 (2002).

13. T. DeFanti, and R. Stevens, Teleimmersion, [20] 131–156.

14. M. Degermark, T. Kohler, S. Pink, and O. Schelen, Advance Reservations for Predictive Service in the Internet, ACM/Springer Journal of Multimedia Systems 5 (3) (1997).

15. T. DeWitt, T. Gross, B. Lowekamp, N. Miller, P. Steenkiste, and J. Subhlok, ReMoS: A Resource Monitoring System for Network Aware Applications, Technical Report, Carnegie Mellon University, CMU-CS-97-194 (1997).

16. N. G. Duffield, P. Goyal, A. Greenberg, P. Mishra, K. K. Ramakrishnan, and J. E. van der Merwe, A flexible model for resource management in virtual private networks, ACM SIGCOMM (1999).

17. D. Ferrari, A. Gupta, and G. Ventre, Distributed Advance Reservation of Real-Time Connections, ACM/Springer Journal of Multimedia Systems 5 (3) (1997).

18. I. Foster, J. Insley, G. von Laszewski, C. Kesselman, and M. Thiebaux, Distance Visualization: Data Exploration on the Grid, IEEE Computer 32 (12) (1999) 36–43.

19. I. Foster, and C. Kesselman, Globus: A Toolkit-Based Grid Architecture, [20] 259–278.

20. I. Foster and C. Kesselman (Eds.), The Grid: Blueprint for a New Computing Infrastructure, Morgan Kaufmann Publishers (1999).

21. I. Foster, C. Kesselman, C. Lee, R. Lindell, K. Nahrstedt, and A. Roy, A Distributed Resource Management Architecture that Supports Advance Reservations and Co-Allocation, Proceedings of the International Workshop on Quality of Service (1999) 27–36.

22. I. Foster, C. Kesselman, J. Nick, and S. Tuecke, The Physiology of the Grid: An Open Grid Services Architecture for Distributed Systems Integration, www.globus.org/research/papers/physiology.pdf (2002).

23. I. Foster, A. Roy, V. Sander, and L. Winkler, End-to-End Quality of Service for High-End Applications, Technical Report, Mathematics and Computer Science Division, Argonne National Laboratory, www.mcs.anl.gov/qos/end_to_end.pdf (1999).

24. E. Frecon, C. Greenhalgh, and M. Stenius, The DiveBone - An Application-Level Network Architecture for Internet-Based CVEs, Symposium on Virtual Reality Software and Technology (1999).

25. R. Guerin, and H. Schulzrinne, Network Quality of Service, [20] 479–503.

26. D. Gunter, B. Tierney, K. Jackson, J. Lee, and M. Stoufer, Dynamic Monitoring of High-Performance Distributed Applications, Proceedings of the 11th IEEE Symposium on High Performance Distributed Computing (2002).

27. A. Hafid, G. Bochmann, and R. Dssouli, A Quality of Service Negotiation Approach with Future Reservations (NAFUR): A Detailed Study, Computer Networks and ISDN Systems 30 (8) (1998).

28. J. Heinanen, and R. Guerin, A Two Rate Three Color Marker, RFC 2698 (1999).

29. J. Heinanen, F. Baker, W. Weiss, and J. Wroclawski, Assured Forwarding PHB Group, RFC 2597 (1999).

30. G. Hoo, K. Jackson, and W. Johnston, Design of the STARS Network QoS Reservation System, Technical Report, Lawrence Berkeley National Laboratory (2000).

31. G. Hoo, W. Johnston, I. Foster, and A. Roy, QoS as middleware: Bandwidth broker system design, Technical Report, LBNL (1999).

32. W. Hoschek, J. Jaen-Martinez, A. Samar, H. Stockinger, and K. Stockinger, Data Management in an International Data Grid Project, IEEE/ACM International Workshop on Grid Computing (2000).

33. W. Johnston, J. Guojun, G. Hoo, C. Larsen, J. Lee, B. Tierney, and M. Thompson, Distributed environments for large data-objects: Broadband networks and a new view of high performance large scale storage-based applications, Proceedings of Internetworking (1996).

34. J. Kulik, R. Coulter, D. Rockwell, and C. Partridge, Paced TCP for High Delay-Bandwidth Networks, Proceedings of the Workshop on Satellite-Based Information Systems (WOSBIS) (1999).

35. S. Machiraju, M. Seshadri, and I. Stoica, A Scalable and Robust Solution for Bandwidth Allocation, International Workshop on Quality of Service (2002).

36. A. Mehra, A. Indiresan, and K. Shin, Structuring Communication Software for Quality-of-Service Guarantees, Proceedings of the 17th Real-Time Systems Symposium (1996).

37. K. Nahrstedt, H. Chu, and S. Narayan, QoS-aware Resource Management for Distributed Multimedia Applications, Journal on High-Speed Networking (1998).

38. K. Nahrstedt, and J. M. Smith, Design, Implementation and Experiences of the OMEGA End-Point Architecture, IEEE JSAC, Special Issue on Distributed Multimedia Systems and Technology 14 (7) (1996) 1263–1279.

39. K. Nichols, S. Blake, F. Baker, and D. Black, Definition of the Differentiated Services Field (DS Field) in the IPv4 and IPv6 Headers, RFC 2474 (1998).

40. K. Nichols, V. Jacobson, and L. Zhang, A Two-bit Differentiated Services Architecture for the Internet, RFC 2638 (1999).

41. E. Rosen, A. Viswanathan, and R. Callon, Multiprotocol Label Switching Architecture, RFC 3031 (2001).

42. V. Sander, W. A. Adamson, I. Foster, and A. Roy, End-to-End Provision of Policy Information for Network QoS, Proceedings of the 10th IEEE Symposium on High Performance Distributed Computing (2001).

43. V. Sander, I. Foster, A. Roy, and L. Winkler, A Differentiated Services Implementation for High-Performance TCP Flows, Terena Networking Conference (2000).

44. Q. Snell, M. Clement, D. Jackson, and C. Gregory, The Performance Impact of Advance Reservation Meta-scheduling, IPDPS Workshop on Job Scheduling Strategies for Parallel Processing (JSSPP), Springer-Verlag LNCS 1911 (2000).

45. P. Steenkiste, Adaptation Models for Network-Aware Distributed Computations, Proceedings of CANPC (1999).

46. W. Stevens, TCP/IP Illustrated, Vol. 1: The Protocols, Addison-Wesley (1997).

47. B. Teitelbaum, Future Priorities for Internet2 QoS, www.internet2.edu/qos/wg/papers/qosFuture01.pdf (2001).

48. B. Teitelbaum, S. Hares, L. Dunn, V. Narayan, R. Neilson, and F. Reichmeyer, Internet2 QBone - Building a Testbed for Differentiated Services, IEEE Network 13 (5) (1999).

49. M. Thompson, W. Johnston, S. Mudumbai, G. Hoo, K. Jackson, and A. Essiari, Certificate-based Access Control for Widely Distributed Resources, Proceedings of the 8th Usenix Security Symposium (1999).

50. B. Tierney, W. Johnston, L. Chen, H. Herzog, G. Hoo, G. Jin, and J. Lee, Distributed Parallel Data Storage Systems: A Scalable Approach to High Speed Image Servers, Proceedings of ACM Multimedia (1994).

51. S. Vegesna, IP Quality of Service, Cisco Press (2001).

52. L.C. Wolf, and R. Steinmetz, Concepts for Reservation in Advance, Kluwer Journal on Multimedia Tools and Applications 4 (3) (1997).

53. J. Wroclawski, The Use of RSVP with IETF Integrated Services, RFC 2210 (1997).

54. I. Yeom, and A. L. Narasimha Reddy, Modeling TCP Behavior in a Differentiated-Services Network, Technical Report, TAMU ECE (1999).

