
Quality of Service

Internet Quality-of-Service (QoS) Henning Schulzrinne Columbia University Fall 2003
Transcript
  • Internet Quality-of-Service (QoS), Henning Schulzrinne, Columbia University, Fall 2003

  • Quality of Service
      • Motivation
      • Service availability
      • Elementary queueing theory
      • Traffic characterization & control
      • Integrated services (RSVP, NSIS)
      • Differentiated services (DiffServ)

  • What is quality of service?
      • Many applications are sensitive to the effects of delay (+ jitter) and packet loss
          • may have a floor below which utility drops to zero
      • The existing Internet architecture provides a best-effort service
          • All traffic is treated equally (generally, FIFO queuing)
          • No mechanism for distinguishing between delay-sensitive and best-effort traffic
      • Original IP architecture (IPv4) has a TOS (type-of-service) byte in the packet header
          • RFC 795: defined multiple axes (delay, throughput, reliability)
          • rarely used outside some (rumored) military networks

    [Figure: utility ($) as a function of bandwidth]

  • Motivation
      • QoS service availability
          • not good enough if all but 2 minutes of my phone call sound perfect
      • Support mission-critical applications that can't tolerate disruption
          • VoIP
          • VPNs (LAN emulation)
          • high-availability computing
      • Charge more for business applications vs. consumer applications

  • Service availability
      • Users do not care about QoS
          • at least not about packet loss, jitter, delay
          • rather, it's service availability: how likely is it that I can place a call and not get interrupted?
      • availability = MTBF / (MTBF + MTTR)
          • MTBF = mean time between failures
          • MTTR = mean time to repair
      • availability = successful calls / first call attempts
      • equipment availability: 99.999% (5 nines) → 5 minutes/year of downtime
      • AT&T (2003):
          Service               Availability
          Long-distance voice   99.978%
          ATM data              99.999%
          Frame relay data      99.998%
          IP                    99.991%
      • Sprint IP frame relay SLA: 99.5%
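As a rough illustration of the availability figures on this slide, the sketch below converts an availability value into expected downtime per year and evaluates MTBF / (MTBF + MTTR). The function names and the sample MTBF/MTTR values are hypothetical and not taken from the slides.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability = MTBF / (MTBF + MTTR)."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected unavailable minutes per year for a given availability."""
    return (1.0 - avail) * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Availability values quoted on the slide.
    for label, a in [("5 nines", 0.99999), ("99.5% SLA", 0.995)]:
        print(f"{label}: {downtime_minutes_per_year(a):.1f} minutes/year")
    # Hypothetical equipment: one failure per 10,000 h, 4 h to repair.
    print(f"availability: {availability(10_000, 4):.5f}")
```

Five nines works out to a little over 5 minutes of downtime per year, while a 99.5% SLA allows roughly 44 hours, which is why the two are hard to compare directly.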

  • Availability: PSTN metrics
      • PSTN metrics (World Bank study):
          • fault rate
              • should be less than 0.2 per main line
          • fault clearance (~ MTTR)
              • next business day
          • call completion rate
              • during network busy hour
              • varies from about 60% to 75%
          • dial tone delay

  • Example PSTN statistics (source: World Bank)

  • Measurement setup
      • Active measurements
          • call duration 3 or 7 minutes
          • UDP packets:
              • 36 bytes alternating with 72 bytes (FEC)
              • 40 ms spacing
      • September 10 to December 6, 2002
      • 13,500 call hours

  • Call success probability
      • 62,027 calls succeeded, 292 failed → 99.53% availability
      • roughly constant across I2, I2+, commercial ISPs
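The availability figure on this slide follows from the definition "successful calls / first call attempts". A minimal check, with a hypothetical function name:

```python
def call_availability(succeeded: int, failed: int) -> float:
    """Fraction of call attempts that succeeded."""
    return succeeded / (succeeded + failed)


if __name__ == "__main__":
    # Counts quoted on the slide.
    print(f"{call_availability(62_027, 292):.2%}")
```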

  • Overall network loss
      • PSTN: once connected, call usually of good quality
          • exception: mobile phones
      • compute periods of time below loss threshold
          • 5% causes degradation for many codecs
          • others acceptable till 20%

  • Network outages
      • sustained packet losses
          • arbitrarily defined at 8 packets
          • far beyond any recoverable loss (FEC, interpolation)
      • 23% outages
      • make up significant part of 0.25% unavailability
      • symmetric: A→B, B→A
      • spatially correlated: A→B, A→X
      • not correlated across networks (e.g., I2 and commercial)

  • Outage-induced call abortion probability
      • Long interruption → user likely to abandon call
      • from E.855 survey: $P[\text{holding}] = e^{-t/17.26}$ (t in seconds)
          • → half the users will abandon call after 12 s
      • 2,566 have at least one outage
      • 946 of 2,566 expected to be dropped → 1.53% of all calls
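The holding-time model on this slide can be checked numerically. The sketch below evaluates the exponential fit with the 17.26 s constant quoted from the E.855 survey and derives the outage length at which half the users have given up. Function names are hypothetical. The last line assumes the 1.53% figure is taken relative to the 62,027 successful calls reported earlier in the deck.

```python
import math

HOLD_CONSTANT_S = 17.26  # time constant of the fit quoted on the slide


def p_holding(t_seconds: float) -> float:
    """Probability that a user is still holding after t seconds of outage."""
    return math.exp(-t_seconds / HOLD_CONSTANT_S)


def median_abandon_time() -> float:
    """Outage length after which half the users have abandoned the call."""
    return HOLD_CONSTANT_S * math.log(2)


if __name__ == "__main__":
    print(f"median abandon time: {median_abandon_time():.1f} s")
    print(f"P[holding] at 12 s: {p_holding(12):.3f}")
    # Assumption: dropped calls are compared against successful calls.
    print(f"dropped share: {946 / 62_027:.2%}")
```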

  • Conclusions from measurement
      • Availability in space is (mostly) solved; availability in time restricts usability for new applications
      • initial investigation into service availability for VoIP
      • need to define metrics for, say, web access
          • unify packet loss and "no Internet dial tone"
      • far less than 5 nines
      • working on identifying fault sources and locations
      • looking for additional measurement sites

  • What's next?
      • Existing SLAs are mostly useless
          • too many exceptions
          • wrong time scales: month vs. minutes
          • no guarantees for interconnects
      • Existing measurements similarly dubious
      • Limited ability to learn from mistakes
          • what are the primary causes of service unavailability?
          • what can I do to protect myself: multi-homing via same fiber? diverse access mechanisms?
      • Consumers of services have no good ways to compare service availability
          • only some very large customers may get access to carrier-internal data
      • Thus, market failure
      • Need published metrics
          • similar to switch availability reporting

  • What's hard to scale (and not)
      • Signaling does not have to be hard:
          • one message, on a reliable peering channel or IP router alert option
          • NSIS effort in the IETF?
      • YESSIR: RTCP-based signaling
          • 700 MHz Celeron processor
          • 10,000 flow setups/second
          • 300,000 soft-state flows
      • If scaling matters, sink-tree based reservation (BGRP)

  • Diversity is good
      • Unlike routing, no need for single signaling protocol:
          • multicast is much harder
          • dumb end devices
          • edge "pop-up" only show up in edge nodes

  • AAA
      • Signaling can easily be done in ASIC (no harder than IP), but
          • need cryptographic verification of request
          • need interface to Authentication, Authorization, Accounting (AAA)
      • cross-domain authentication hard, but 3G networks will do it anyway
      • easier if both sides ask their own access router
      • see also: iPass for dial-up, OSP (open settlement protocol)

  • AAA example
      • [Figure: source and destination connected across the Internet through access routers AR1 and AR2; labels: "signs request", "reserves for both directions"]
      • Cell phone model: both sides pay

  • Reservation scaling
      • Example: every long-distance call in the US uses VoIP with per-flow resource reservation
      • 2000: 567.4 billion minutes @ 10 minutes each → 1,800 calls/second
      • single MySQL server can sustain 500 to 2,000 queries+updates/second
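The call-rate estimate on this slide is a simple average over one year. A short sketch of the arithmetic, with hypothetical names. It gives the mean rate only and says nothing about busy-hour peaks.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600  # 31,536,000 seconds; leap years ignored


def mean_call_setups_per_second(total_minutes: float,
                                minutes_per_call: float) -> float:
    """Average call (flow) setups per second over one year."""
    calls_per_year = total_minutes / minutes_per_call
    return calls_per_year / SECONDS_PER_YEAR


if __name__ == "__main__":
    # Figures quoted on the slide: 567.4 billion minutes, 10 minutes per call.
    rate = mean_call_setups_per_second(567.4e9, 10)
    print(f"{rate:.0f} call setups/second")
```

At roughly 1,800 setups per second, the average load sits inside the range the slide quotes for a single database server.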

  • Business models don't work
      • Most of the time, "tin" service is no worse than "platinum" service
          • can't impress others with platinum AmEx card
          • no frequent flyer bonuses
      • → everybody switches only when the network is in bad shape

  • Resource control & reservation
      • [Figure: router components: reservation protocol, application, admission control, packet scheduler, classifier & route selection, routing protocols & DBs, traffic control DB; labels: data, Tspec, Y/N. Source: USC EE-S 555]

  • RED (Random Early Detection)
      • TCP synchronization effect: during overload, many connections lose packets and go into slow start
      • RED: start dropping based on average queue occupancy (vs. instantaneous queue occupancy)
      • Parameter setting critical and non-trivial
      • See also RFC 2309

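    [Figure: discard behavior versus average queue occupancy: do not discard between 0 and THmin; discard with increasing probability Pd between THmin and THmax; discard above THmax]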
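The RED behavior described on this slide can be sketched as follows. This is a simplified illustration, not the full algorithm: it keeps an exponentially weighted average of the queue length and maps it to a discard probability Pd through the two thresholds. It leaves out refinements such as spacing drops by packet count and handling idle periods. The class name, parameter names, and default values are hypothetical.

```python
import random


class RedQueue:
    """Simplified RED: an averaged queue length drives the discard decision."""

    def __init__(self, th_min=5.0, th_max=15.0, max_p=0.1,
                 weight=0.002, rng=None):
        self.th_min = th_min    # below this average, never discard
        self.th_max = th_max    # at or above this average, always discard
        self.max_p = max_p      # Pd reached just below th_max
        self.weight = weight    # EWMA weight for the average queue length
        self.avg = 0.0
        self.rng = rng or random.Random()

    def discard_probability(self) -> float:
        """Pd as a function of the average queue occupancy."""
        if self.avg < self.th_min:
            return 0.0
        if self.avg >= self.th_max:
            return 1.0
        span = self.th_max - self.th_min
        return self.max_p * (self.avg - self.th_min) / span

    def on_arrival(self, queue_len: int) -> bool:
        """Update the average with the current length; True means discard."""
        self.avg = (1 - self.weight) * self.avg + self.weight * queue_len
        return self.rng.random() < self.discard_probability()
```

Because the decision uses the average rather than the instantaneous occupancy, a short burst does not trigger discards, while sustained overload does. The small default weight reflects that choice, and it is one of the parameters the slide calls critical and non-trivial to set.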

  • ECN (Explicit Congestion Notification)
      • Extension of RED: mark instead of drop
      • RFC 2481 (A Proposal to add Explicit Congestion Notification (ECN) to IP)
          • bit 6 of the IP TOS byte (ECT) indicates support for the mechanism
          • bit 7 of the IP TOS byte (CE, labeled ECN in these slides) indicates congestion
      • Needs cooperation of TCP (or similar protocols)
          • TCP should act almost as if packet was dropped (reduce congestion window), but don't do slow-start

    [Figure: packets carry ECT=1, ECN=0 when uncongested and ECT=1, ECN=1 once marked; the receiver returns an ECN echo in the TCP ACK]
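The TCP reaction described on this slide can be contrasted with a timeout-driven loss in a few lines. This is a sketch under stated assumptions: window sizes are counted in segments, an ECN echo halves the window as a single dropped packet would, and a timeout restarts from one segment. The function and event names are hypothetical.

```python
from typing import Tuple


def react_to_congestion(cwnd: float, event: str) -> Tuple[float, float]:
    """Return (new_cwnd, new_ssthresh) in segments after a congestion event."""
    half = max(cwnd / 2.0, 2.0)
    if event == "ecn_echo":
        # Marked, not dropped: reduce the window but skip slow start.
        return half, half
    if event == "timeout":
        # Loss found by timeout: fall back to one segment and slow-start.
        return 1.0, half
    raise ValueError(f"unknown event: {event}")


if __name__ == "__main__":
    print(react_to_congestion(20.0, "ecn_echo"))
    print(react_to_congestion(20.0, "timeout"))
```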

  • Next steps in signaling (NSIS)
      • RSVP not widely used for resource reservation
          • but is used for MPLS path setup
          • design heavily biased by multicast needs
          • marginal and after-the-fact security
          • limited support for IP mobility
      • Thus, IETF NSIS working group developing new framework for general state management protocol
          • resource reservation
          • NAT and firewall control
          • traffic and QoS measurement
          • MPLS and lambda path setup
      • Split into two components:
          • NSLP: services
          • NTLP: transport

  • NSIS
      • On-path vs. off-path
          • off-path: bandwidth brokers
      • Discovery of next NTLP or NSLP hop
          • use router alert option

    [Figure: NSIS protocol stack; labels: UDP, TCP, SCTP, NTLP, QoS, NAT/FW, measure]

