Post on 14-Jan-2016
1
Proposed future direction for CHEETAH
Outline
Strategy discussion:
What's our goal for the CHEETAH network: eScience network or a scalable GP network?
Bandwidth sharing mode: Book-Ahead (BA) or Immediate-Request (IR)?
Tactical aspects: Network evolution
Networking software modules
Application software modules
Interconnection to HOPI/DRAGON
Malathi Veeraraghavan, University of Virginia
August 23, 2006
2
Observation
"Many e-science experiments are unique applications that involve collaboration among a handful of facilities. As a result, networks supporting these experiments are optimized to provide maximum throughput to a few facilities, as opposed to moderate throughput to millions of users, which is the raison d'etre for commercial networks."
3
eScience networks
eScience network requirements
- Number of users is small
- Hard to achieve high utilization; also not important
- Overprovision network to keep call blocking rate low
We can then focus on creating software to allow scientists to automatically create high-speed application-specific topologies: AST, UCLP, OSCARS, USN scheduler, BRUW
Bandwidth-sharing algorithms of less concern
4
General-purpose commercial networks
Has to be scalable: large number of users
- Metcalfe's statement: Value of a network grows as the square of the number of users
High utilization is an important goal
Low call blocking probability or low waiting time for resources
Focus on efficient bandwidth-sharing algorithms
5
Circuit/VC service on GP commercial networks
Just for ISPs/enterprise admins: needs similar to eScience
- router-to-router circuits
- limited number of users
- high-bandwidth, long-held circuits
- low price not a high priority
- need BA mode of bandwidth sharing
For end users
- large number of users
- can only offer moderate BW and limited call holding times
- IR mode of sharing becomes feasible
6
BW sharing modes in circuit/VC networks
Mean waiting time is proportional to mean call holding time
Can afford to have a queueing-based solution if calls are short
[Figure: bandwidth-sharing modes arranged by link capacity m and call holding time. Axis labels: large m (moderate throughput), small m (high throughput); short calls (bank teller), long calls (doctor's office).]
Modes shown:
- immediate-request with call blocking + retries ("call queueing") (video, gaming)
- immediate-request with delayed-start times (file transfers)
- book-ahead
m is the link capacity expressed in channels; e.g., if 1 Gbps circuits are assigned on a 10 Gbps link, m = 10
7
Impact of increasing m at different values of link utilization Ud
[Figure: two panels plotted against m, the link capacity expressed in channels (log scale, 10^0 to 10^3), each with curves for Ud = 90%, 80%, 60%, and 40%. Panel 1: PQ, the probability of an arriving job finding all m circuits busy (0 to 1); annotated point: m = 10, Pq = 41%. Panel 2: offered load, i.e., call arrival rate / call departure rate (0 to 1000). Axis annotations: low-rate per-call circuits, high-rate per-call circuits.]
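The PQ panel can be reproduced approximately if the link is modeled as an M/M/m queue. That model is an assumption here, since the slide does not name it, and the function names below are illustrative. Under that assumption PQ is the Erlang C formula, and with m = 10 and Ud = 80% it evaluates to about 0.41, which agrees with the Pq = 41% annotation.

```python
def erlang_b(m: int, offered_load: float) -> float:
    """Erlang B blocking probability for m channels, via the stable recursion."""
    b = 1.0
    for n in range(1, m + 1):
        b = offered_load * b / (n + offered_load * b)
    return b


def erlang_c(m: int, utilization: float) -> float:
    """Probability that an arriving call finds all m circuits busy (assumed M/M/m)."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must be in [0, 1)")
    # Offered load in Erlangs: call arrival rate / call departure rate.
    offered_load = utilization * m
    b = erlang_b(m, offered_load)
    return b / (1.0 - utilization * (1.0 - b))


if __name__ == "__main__":
    for m in (1, 10, 100, 1000):
        row = ", ".join(
            f"Ud={u:.0%}: {erlang_c(m, u):.3f}" for u in (0.4, 0.6, 0.8, 0.9)
        )
        print(f"m={m:4d}  {row}")
```

At a fixed utilization the probability of waiting falls as m grows, which is the trend the slide uses to argue that small m is a poor fit for immediate-request sharing.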
8
Impact of mean call holding time 1/μ
[Figure: two panels plotted against the mean call holding time 1/μ (0 to 30 minutes) at Ud = 90%. Panel 1: N, the number of ports aggregating traffic on to the link (log scale, 10^0 to 10^5), with curves for m = 1000, λ = 1 call/hour; m = 100, λ = 1 call/hour; m = 10, λ = 1 call/hour; and m = 10, λ = 10 calls/hour. Panel 2: E[Wd], the mean waiting time for delayed calls (0 to 30 minutes), with curves for m = 10, m = 100, and m = 1000.]
λ: per-host call-generation rate; Ud: 90%
$E[W_d] = \frac{1}{m \mu (1 - U_d)}$
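As a worked example, the waiting-time expression on this slide, read as E[W_d] = 1 / (m μ (1 − U_d)), can be evaluated directly. The sketch below also estimates N under the assumption that each port offers λ/μ Erlangs and that N ports together load the link to U_d · m Erlangs; that relation is inferred from the axis labels rather than stated on the slide, and the function names are illustrative.

```python
def mean_wait_delayed(m: int, holding_min: float, utilization: float) -> float:
    """Mean waiting time (minutes) for calls that are delayed.

    Implements E[W_d] = 1 / (m * mu * (1 - U_d)), where 1/mu is the mean
    call holding time in minutes.
    """
    mu = 1.0 / holding_min
    return 1.0 / (m * mu * (1.0 - utilization))


def aggregation_ports(m: int, holding_min: float, calls_per_hour: float,
                      utilization: float) -> float:
    """Number of ports N needed to load the link to the given utilization.

    Assumption (not from the slide): each port offers lambda/mu Erlangs,
    so N * lambda / mu = U_d * m.
    """
    per_port_load = calls_per_hour * (holding_min / 60.0)
    return utilization * m / per_port_load


if __name__ == "__main__":
    for m in (10, 100, 1000):
        wait = mean_wait_delayed(m, holding_min=30.0, utilization=0.9)
        ports = aggregation_ports(m, 30.0, calls_per_hour=1.0, utilization=0.9)
        print(f"m={m:4d}: E[W_d]={wait:.2f} min, N={ports:.0f}")
```

With m = 10 and Ud = 90%, a delayed call waits on average as long as one mean holding time, so long-held calls on a small-m link imply long waits.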
9
Main findings of analysis
Two key parameters: m and mean call holding time
If m is small (per-circuit BW is high)
- and mean call holding time is large, then need BA to avoid long waiting times
- and mean call holding time is small (file transfers), then use "call queueing"
If m is large, switch hardware costs increase
- N, the number of aggregation ports, is high; level of demultiplexing is high
Moderate m: best choice
10
Book-Ahead (BA) or Immediate-Request (IR)?
Bandwidth-sharing mechanisms by network type
eScience networks
- Book-Ahead (BA): very large file transfers need high BW and long holding times; remote viz. needs to reserve other resources such as displays
- Immediate-Request (IR): None?
general-purpose networks
- Book-Ahead (BA): circuit service to only ISPs/enterprise admins (router-to-router circuits)
- Immediate-Request (IR): circuit service for end users: host-to-host + router-to-router (end-to-end) circuits; partial-path router-to-router circuits on congested links (called in by end user)
11
Support for the BA mechanism of bandwidth sharing
Since RSVP-TE does not have parameters for BA calls (call duration, start time), this mode is not implemented in switch controllers
Need an external scheduler to manage bandwidth into the future
Easiest to make it centralized - one per domain
Cannot utilize the BW management software implemented in switch controllers as part of GMPLS control-plane software
The BA mode is necessary for high-BW, long-held calls
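A minimal sketch of what an external scheduler has to track that RSVP-TE does not carry: a start time and a call duration per request, checked against link capacity over the whole future interval. The class and field names are hypothetical, and a single link stands in for a whole domain.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Reservation:
    start: float     # requested start time, in seconds (hypothetical units)
    duration: float  # call duration, in seconds
    channels: int    # channels requested on the link

    @property
    def end(self) -> float:
        return self.start + self.duration


@dataclass
class LinkScheduler:
    capacity: int                          # m, the link capacity in channels
    booked: list = field(default_factory=list)

    def peak_usage(self, start: float, end: float) -> int:
        """Largest number of channels already booked at any instant in [start, end)."""
        overlapping = [r for r in self.booked if r.start < end and r.end > start]
        # Usage can only rise when a reservation starts, so check those instants.
        instants = {start} | {r.start for r in overlapping if r.start > start}
        return max(
            sum(r.channels for r in overlapping if r.start <= t < r.end)
            for t in instants
        )

    def book(self, request: Reservation) -> bool:
        """Admit the request only if capacity holds over its whole interval."""
        peak = self.peak_usage(request.start, request.end)
        if peak + request.channels > self.capacity:
            return False
        self.booked.append(request)
        return True
```

Keeping one such scheduler per domain matches the centralized option on the slide; interdomain circuits would additionally need scheduler-to-scheduler signaling, which this sketch does not attempt.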
12
Support for the IR mechanism of bandwidth sharing
Switches have built-in (G)MPLS control-plane software (RSVP-TE/OSPF-TE)
Bandwidth management is part of RSVP-TE switch controller software
Hence it is distributed bandwidth management
Need to limit call holding time - reminders for renewals and automatic release
Moderate-to-high per-call bandwidth
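The holding-time limit with renewal reminders and automatic release can be pictured as a lease on the circuit. The sketch below is one possible shape for that logic; the class, field names, and state labels are hypothetical and are not part of RSVP-TE or any switch controller.

```python
from dataclasses import dataclass


@dataclass
class CircuitLease:
    granted_at: float     # time the circuit was granted or last renewed (seconds)
    max_hold: float       # holding-time limit per grant (seconds)
    reminder_lead: float  # how long before expiry the caller is reminded (seconds)

    @property
    def expires_at(self) -> float:
        return self.granted_at + self.max_hold

    def state(self, now: float) -> str:
        """Return 'active', 'renewal-due', or 'released' for the given time."""
        if now >= self.expires_at:
            return "released"        # automatic release once the limit is reached
        if now >= self.expires_at - self.reminder_lead:
            return "renewal-due"     # time to send the renewal reminder
        return "active"

    def renew(self, now: float) -> bool:
        """Restart the holding-time clock; fails once the circuit is released."""
        if now >= self.expires_at:
            return False
        self.granted_at = now
        return True
```

A caller that never renews loses the circuit after `max_hold` seconds, which bounds the holding time that the waiting-time analysis depends on.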
13
To implement BA, IR, or both?
Implement only BA
- Develop and "standardize" protocols for scheduler-to-scheduler signaling for interdomain circuits (one centralized scheduler per domain)
- Implement scheduler and test with other networks
- Create software tools to enable scientists and ISP/enterprise admins to visualize network topologies and request appropriate circuits/VCs
- High-BW, long-held: Therefore AAA is a must
- Path being pursued by DRAGON, USN, OSCARS, UCLP
14
Opportunity missed if the whole optical testbed community only experiments with BA
What opportunity?
Enable the creation of large-scale circuit/VC networks with moderate-rate circuits that can support a brand new class of applications
- economic value for the telecom industry
A "reservations-oriented" mode of networking to complement today's connectionless Internet
- à la airlines that complement roadways
Could prove useful to FIND, GENI, net-neutrality
Alternative pricing models for bandwidth
15
What "brand new class of applications?"
Video, video, video
Gaming
Remote software access + Sync. storage
Async storage
Multimedia (large) files in web sites
16
Video applications
Improve quality of conferencing, telephony, surveillance, entertainment and distance-learning by a significant degree
Expend bandwidth for a higher-quality, lower-latency, multi-camera, auto-movement, auto-mixing experience
Make the "flat world" flatter
Energy savings/environmental benefits
Moderate bandwidth - IR with call blocking/retries
17
Gaming applications
Current gamers buy personal graphics cards
Players talk of "lag" caused by differences in graphics processing speeds
Moderate-speed circuits can enable a new class of games in which rapidly-changing scenes are possible
- compare movies in which multiple story lines keep scenes changing vs. gaming scenes
Players connect to graphics servers
Data transferred is not GL commands, but rather rendered bits (doable?)
Moderate bandwidth - IR with call blocking/retries
18
Remote software access/sync storage
Remote software access
- Reduce computer administration cost
- Personal computers vs. machine rooms
- I loaded 22 new applications on my new laptop
- Instead: connect and run!
- Virtual Computing Laboratory: Mladen Vouk, NCSU
Synchronous storage access
- Disaster recovery
Moderate bandwidth - IR with call blocking/retries
19
Asynchronous storage
Asynchronous storage depots will lower costs for
- backups
- disaster recovery
Need for increased storage grows with multimedia files
High bandwidth, short calls - IR with delayed start
20
Larger files in web sites
Multimedia files in web sites
- Imagine the use of video/audio files in all sorts of web sites instead of ASCII
- My own course PPT files: I use audio sparingly because of bandwidth
- Think assembly instructions for electric fans, furniture
- Kinesthetic learning - show me a video
- Think hotel web pages
- Show me exactly where the beach is relative to my room; do I have a balcony - saying it in text format is one thing; seeing it in a video format quite another!
Content distribution network & web caching
High bandwidth, short calls - IR with delayed start
21
Are all these "high"-BW apps just a matter of increasing BW of links in the current Internet?
No
The socialistic mode of bandwidth sharing on the Internet discourages individual investment in network bandwidth
Age-old question:
- should we pay for bandwidth with tax dollars - "free" for the whole community? "Tragedy of the commons" (Tanenbaum)
- should we create a network where individuals can pay for bandwidth on congested links more directly? - think higher-toll HOV lanes
22
What does all this mean?
Let's build a scalable circuit/VC network in which bandwidth is shared in IR mode
- Scalability will create "Metcalfe's value"
- Provides an opportunity to finally recoup our investment in (G)MPLS technologies
- standards creation effort
- implementation: Cisco, Juniper, Sycamore, Movaz
Assign at least a few of the optical testbeds that we are investing in now to study whether this IR mode of bandwidth sharing can help with our understanding of net-neutrality, economic growth, FIND questions
IR more natural in data world unlike in airlines (BA)
23
Argument: IR is just a "now" in BA
BA and IR cannot coexist without some form of bandwidth partitioning
BA allows for high-BW, long-duration calls
IR calls will suffer a high call blocking rate if supported through BA scheduler (the "add-now-as-an-option-in-scheduler" solution)
Should you admit an IR call if it arrives a few seconds before start time of a BA call and hope it completes before the BA call start time, or reject the call and waste bandwidth?
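The dilemma can be made concrete with a small admission check. This is only a sketch: the function name, its parameters, and the "optimistic" switch are hypothetical, the IR call is assumed to take one channel, and the channels already in use are assumed to stay busy until the BA call starts.

```python
def admit_ir(now: float, hold_limit: float, in_use: int, capacity: int,
             ba_start: float, ba_channels: int, optimistic: bool = False):
    """Decide whether to admit a one-channel IR call ahead of a booked BA call.

    Returns (admitted, may_collide). may_collide is True when the IR call
    could still hold its channel at ba_start and the BA call would not fit.
    """
    if in_use + 1 > capacity:
        return False, False                    # no free channel right now
    overlaps_ba = now + hold_limit > ba_start  # may still be active at BA start
    ba_would_not_fit = in_use + 1 + ba_channels > capacity
    may_collide = overlaps_ba and ba_would_not_fit
    if may_collide and not optimistic:
        return False, False                    # reject; the channel idles until ba_start
    return True, may_collide


if __name__ == "__main__":
    # Conservative policy: the call is rejected and bandwidth is wasted.
    print(admit_ir(0.0, 60.0, in_use=4, capacity=10, ba_start=30.0, ba_channels=6))
    # Optimistic policy: the call is admitted and may collide with the BA call.
    print(admit_ir(0.0, 60.0, in_use=4, capacity=10, ba_start=30.0, ba_channels=6,
                   optimistic=True))
```

Neither policy is satisfactory on a shared pool of channels, which is the slide's case for partitioning bandwidth between the two modes.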
24
CHEETAH and TSI
The CHEETAH network solves only part of the TSI problem
Other problems
- Cray computer I/O problem
- Local-area connectivity within NCSU
If the CHEETAH project were a production solution to support TSI, we should spend money to solve these two problems for TSI
But as an experimental short-lived networking project, where should we focus?
25
Outline
Strategy discussion: What's our goal for the CHEETAH network:
eScience network or a scalable GP network?
Bandwidth sharing: Book-Ahead (BA) or Immediate-Request (IR)?
Tactical aspects: CHEETAH network evolution
Networking software modules
Application software modules
Interconnection to HOPI/DRAGON
26
Network evolution to support IR
Current CHEETAH network only supports 10 circuits per OC192 link
- remember IR mode does not work well when m, the link capacity in channels, is small (i.e., 10)
Recoup OC1-crossconnect capability of the SN16Ks from its current 1Gbps use
Has three advantages
- supports higher m; better for IR
- GMPLS standards based signaling
- Call setup delay: 166ms for two-hop instead of 1.5sec!
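To illustrate why a higher m helps IR with call blocking, the sketch below compares Erlang B blocking at the same utilization for m = 10 (1 Gbps circuits on an OC192) and m = 192 (one OC1 per call, since an OC192 carries 192 OC1 channels). Poisson arrivals, blocked calls cleared, and no retries are modeling assumptions made here, not claims from the slides.

```python
def blocking_probability(m: int, offered_load: float) -> float:
    """Erlang B blocking probability for m channels and a load in Erlangs."""
    b = 1.0
    for n in range(1, m + 1):
        b = offered_load * b / (n + offered_load * b)
    return b


def compare_granularities(utilization: float) -> dict:
    """Blocking at equal offered utilization for the two channel granularities."""
    options = {"1 Gbps circuits (m=10)": 10, "OC1 circuits (m=192)": 192}
    return {
        label: blocking_probability(m, utilization * m)
        for label, m in options.items()
    }


if __name__ == "__main__":
    for label, p in compare_granularities(0.8).items():
        print(f"{label}: blocking probability {p:.2e}")
```

At 80% offered utilization the m = 10 link blocks roughly one call in eight, while the m = 192 link blocks far fewer, which is the statistical multiplexing gain behind the "supports higher m; better for IR" bullet.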
27
Network evolution options
Four options:
- VLAN-enabled NICs + VLSR for SN16K
- 15454 with VLSR
- IP router with VLSR
- Ethernet switch with VLSR
28
Example: web caching application
[Figure: web caching hosts on the CHEETAH network, connected over VLANs. Labels in order: zelda4; ORNL; zelda3; Atlanta, GA; wukong; Raleigh, NC; mvsut6; C'ville, VA. Sites shown: UTK, UGa, Gatech, Duke, NCSU, UNC, UVa, VT.]
Xiuduan Fang, xf4c@virginia.edu
Bob Gisiger, rwg5f@virginia.edu
29
VLAN-enabled NICs + SN16K VLSR
SN16K has data-plane support to map a sub-Gb/s VLAN on an Ethernet port to a corresponding number of OC1s on a SONET port
But, it does not have control-plane support for this type of circuit
Even GMPLS support for the GbE port mapping to a 21-OC1 VCAT signal is an experimental release just for CHEETAH usage
Because GMPLS support for such hybrid circuits is non-standard
Can implement our own (non-standard) solution as a VLSR
But, goal is to use off-the-shelf switches with GMPLS support to demonstrate IR mode
30
15454/VLSR at each PoP
Make the 15454 serve as the intermediary between Ethernet NICs in hosts and SONET based SN16Ks at CHEETAH PoPs
A 15454 VLSR could be useful for other projects, UCLP, Ultralight
Cisco has no plans to implement a GMPLS control-plane engine for the 15454
Two problems:
- Non-standard solution for hybrid circuits
- VLAN ID continuity requirement
Cannot support partial-path circuits
31
IP router/VLSR at each PoP
Use channelized OCxx SONET interfaces to connect IP router to SN16K
Connect web caches to router
Have routers initiate pure SONET circuit setup
Use PBR or just ordinary routing table update to map flows to different OCxx circuits; support multiple circuits from one web cache
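The flow-to-circuit mapping can be pictured as a first-match policy table keyed on source and destination prefixes. The sketch below is not router configuration; the rule structure, circuit labels, and the documentation-range addresses are all hypothetical.

```python
from dataclasses import dataclass
from ipaddress import ip_address, ip_network


@dataclass(frozen=True)
class PolicyRule:
    src: str      # source prefix, e.g. a single web cache
    dst: str      # destination prefix reached over the circuit
    circuit: str  # hypothetical label for an OCxx circuit


def select_circuit(rules, src_ip: str, dst_ip: str, default: str = "ip-routed") -> str:
    """Return the circuit of the first matching rule, else the default path."""
    src, dst = ip_address(src_ip), ip_address(dst_ip)
    for rule in rules:
        if src in ip_network(rule.src) and dst in ip_network(rule.dst):
            return rule.circuit
    return default


if __name__ == "__main__":
    # One web cache (192.0.2.10) mapped onto two different circuits.
    rules = [
        PolicyRule("192.0.2.10/32", "198.51.100.0/24", "oc1-a"),
        PolicyRule("192.0.2.10/32", "203.0.113.0/24", "oc1-b"),
    ]
    print(select_circuit(rules, "192.0.2.10", "198.51.100.7"))
    print(select_circuit(rules, "192.0.2.10", "203.0.113.7"))
    print(select_circuit(rules, "192.0.2.11", "203.0.113.7"))
```

Matching on both source and destination is what lets one web cache use several circuits at once; an ordinary routing table update would match on destination only.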
32
CHEETAH wide-area network
[Figure: CHEETAH wide-area network. SN16000 switches at the Raleigh PoP (MCNC), the ORNL PoP, and the Atlanta PoP (SOX/SLR), each with a control card, OC192 cards, and GbE/10GbE cards. Link labels: OC-192; OC-192 (via NLR/SLR/NCREN); via NCREN; via Nysernet/HOPI; via Vortex/HOPI. Other labels: End hosts, GaTech, ORNL, NCSU, UVa, CUNY.]
33
CHEETAH evolution to support sub-Gb/s circuits
[Figure: CHEETAH topology for sub-Gb/s circuits. SN16000 switches at the Raleigh PoP, the ORNL PoP, and the Atlanta PoP, each with a control card and OC192 cards. Link labels: OC-192; OC-192 (via NLR/SLR/NCREN); GbE. Other labels: End hosts, GaTech, ORNL, NCSU, UVa, CUNY.]
34
IP router/VLSR at each PoP
Can support end-to-end circuits
- web caching
- CDN servers
- video apps at 10-15Mbps - map to one OC1
- storage depots
Has the potential to support PPCs (partial-path circuits)
Place router with VLSR in enterprises at edge of GbE CHEETAH access link
35
Ethernet switch/VLSR at each PoP
Does not help with the problems noted in today's Gb/s circuit use of the SN16K
- long call setup delays: 1.5sec
- non-standard solution
- high per-circuit BW
Using an Ethernet switch/VLSR at an enterprise (e.g. CUNY) requires all VLANs sharing 1Gbps CHEETAH access link to be switched to the same exit SN16K.
Even worse, m=1 if whole 1Gb/s link used for a circuit
36
Software modules required
Networking software:
- CVLSR for IP router
- CTCP code to support multiple simultaneous flows
Application software:
- Add CHEETAH API to web caching Squid software
- Write software for video apps
- CDN and storage software
37
Upcoming year goals specified in special report
Work item | Milestones and deliverable | Responsible individuals
Item I | Stabilize the CHEETAH network; increase user base; complete doc. | MV and Xuan/Tao
Item II | Extend CHEETAH network by adding routers/VLAN switches/MSPPs | MV and Tao
Item III | Interconnect to testbeds, such as HOPI, USN, DRAGON | MV and Tao
38
Upcoming year goals specified in special report
Work item | Milestones and deliverable | Responsible individuals
Item IV | Develop software, both apps and networking s/w, such as CVLSRs | Router CVLSR: Mark; CTCP: Helali; Apps: Xiuduan
Item V | Support ORNL and NCSU in TSI apps | Mark and Tao/MV
Item VI | Enhance theoretical understanding of sharing modes | IR blocking/retries: XF; BA: Xiangfei; IR delayed start: Mark
39
Equipment required
IP routers with channelized SONET cards with GA GMPLS UNI implementation
- need one for ORNL PoP
- if we can partner with SOX in ATL, NCREN in Raleigh, MAX in McLean, purchase channelized OC192 cards
IP routers with GbE blades for V. Tech and UVA
If NC 454 transponders are unavailable, purchase transponders for DC-Raleigh NLR link - since HOPI doesn't have this link
Colocation costs at NLR McLean and Raleigh
40
Interconnect CHEETAH and HOPI
Through IP routers
With our IP router/VLSR combo, set up router-to-router SONET OC1 circuits via CHEETAH and router-to-router VLAN virtual circuits through HOPI. At routers, do PBR mapping for flows or just update routing tables
This means packets go back to IP layer between two networks
41
CHEETAH-HOPI interconnection
[Figure: CHEETAH-HOPI interconnection. Labels: McLean, VA; 10GbE; CHEETAH; NC; GA; TN; eight web caches; circuit types VLAN, MPLS, SONET.]
HOPI network: courtesy of Rick Summerhill
42
HOPI and web caching
Seems like a good match
Rick's black cloud experiment - same as web caching
Exercises "hybrid" goal of HOPI
Small per-circuit BW possible with VLANs
43
Connection to DRAGON
Spoke with Jerry Sobieski, Aug. 12, 2006
He said DRAGON PoPs have Ethernet VLAN switches
Therefore, can use similar IP router demarcation points to interconnect CHEETAH/HOPI to DRAGON
44
Conclusions
We'd like to
- enable and demonstrate general-purpose apps using circuit/VC service with scalability as a key goal
- support IR mode of bandwidth sharing with limited per-call bandwidth and limited call holding times
- call blocking with retries
- delayed start