CGW03, Crakow, 28 October 2003
DataTAG Project Update
CGW'2003 workshop, Crakow (Poland), October 28, 2003
Olivier Martin, CERN, Switzerland
DataTAG partners
http://www.datatag.org
Funding agencies
Cooperating Networks
DataTAG Mission
EU-US Grid network research:
High performance transport protocols
Inter-domain QoS
Advance bandwidth reservation
EU-US Grid interoperability
Sister project to EU DataGRID
TransAtlantic Grid
Main DataTAG achievements (EU-US Grid interoperability)
GLUE interoperability effort with DataGrid, iVDGL & Globus
GLUE testbed & demos
VOMS design and implementation in collaboration with DataGrid
VOMS evaluation within iVDGL underway
Integration of GLUE compliant components in DataGrid and VDT middleware
Main DataTAG achievements (Advanced networking)
Internet land speed records have been beaten one after the other by DataTAG project members and/or teams closely associated with DataTAG:
Atlas Canada lightpath experiment (iGRID2002)
New Internet2 land speed record (I2 LSR) by Nikhef/Caltech team (SC2002)
Scalable TCP, HSTCP, GridDT & FAST experiments (DataTAG partners & Caltech)
Intel 10GigE tests between CERN (Geneva) and SLAC (Sunnyvale) (Caltech, CERN, Los Alamos NL, SLAC)
New I2LSR (Feb 27-28, 2003): 2.38 Gb/s sustained rate, single TCP/IPv4 flow, 1 TB in one hour, Caltech-CERN
Latest IPv4 & IPv6 I2LSRs were awarded live from Indianapolis during Telecom World 2003:
May 6, 2003: 987 Mb/s single TCP/IPv6 stream
Oct 1, 2003: 5.44 Gb/s sustained rate, single TCP/IPv4 stream, 1.1 TB in 26 minutes, i.e. one 680 MB CD per second
Significance of I2LSR to the Grid?
Essential to establish the feasibility of multi-Gigabit/second single stream IPv4 & IPv6 data transfers:
Over dedicated testbeds in a first phase
Then across academic & research backbones
Last but not least across campus networks
Disk to disk rather than memory to memory
Study impact of high performance TCP over disk servers
Next steps:
Above 6 Gb/s expected soon between CERN and Los Angeles (Caltech/CENIC PoP) across DataTAG & Abilene
Goal is to reach 10 Gb/s with new PCI Express buses
Study alternatives to standard TCP:
Non-TCP transport
HSTCP, FAST, Grid-DT, etc.
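The gain from these TCP variants can be seen in a back-of-the-envelope recovery-time comparison. The sketch below is a hypothetical illustration, not from the slides: it assumes a 1 Gb/s path, a 120 ms transatlantic RTT and 1460-byte segments, and contrasts standard TCP's one-segment-per-RTT recovery with Scalable TCP's fixed 1%-per-ACK multiplicative growth.

```python
import math

MSS_BITS = 1460 * 8               # standard Ethernet MSS, in bits
RTT = 0.120                       # assumed transatlantic RTT, seconds
LINK_BPS = 1e9                    # assumed 1 Gb/s path

# Window (in segments) needed to fill the pipe: bandwidth * delay product.
w_full = LINK_BPS * RTT / MSS_BITS

# Standard TCP (Reno): halve the window on loss, then add one segment
# per RTT, so recovery takes W/2 RTTs.
reno_rtts = w_full / 2

# Scalable TCP (Kelly, 2003): multiply the window by 0.875 on loss, then
# grow it by ~1% per RTT's worth of ACKs -> a constant number of RTTs,
# independent of the window size.
scalable_rtts = math.log(1 / 0.875) / math.log(1.01)

print(f"Reno recovery after one loss:     {reno_rtts * RTT:.0f} s")
print(f"Scalable recovery after one loss: {scalable_rtts * RTT:.1f} s")
```

On such a path standard TCP needs on the order of ten minutes to return to full rate after a single loss, while Scalable TCP recovers in under two seconds, which is why these variants were being evaluated.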
Impact of high performance flows across A&R backbones?
Possible solutions:
Use of "TCP friendly" non-TCP (i.e. UDP) transport
Use of Scavenger (i.e. less than best effort) services
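For the Scavenger option, marking is all an end host has to do; the network's queuing policy does the rest. A minimal sketch follows, assuming a POSIX system; DSCP CS1 (decimal 8) is the code point conventionally used for Scavenger/less-than-best-effort service, and the IPv4 TOS byte carries the DSCP in its upper six bits.

```python
import socket

SCAVENGER_DSCP = 8                 # CS1, the conventional Scavenger code point
TOS_VALUE = SCAVENGER_DSCP << 2    # 0x20 in the IPv4 TOS byte

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, TOS_VALUE)
# Datagrams sent on this socket are now marked CS1; routers configured with
# a Scavenger class will forward them only when the link is otherwise idle,
# so a bulk Grid transfer cannot starve ordinary best-effort traffic.
```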
DataTAG testbed overview (phase 1/2.5G & phase 2/10G)
Layer 1/2/3 networking (1)
Conventional layer 3 technology is no longer fashionable because of:
High associated costs, e.g. 200-300 KUSD for a 10G router interface
Implied use of shared backbones
The use of layer 1 or layer 2 technology is very attractive because it helps to solve a number of problems, e.g.:
1500 byte Ethernet frame size (layer 1)
Protocol transparency (layers 1 & 2)
Minimum functionality, hence, in theory, much lower costs (layers 1 & 2)
Layer 1/2/3 networking (2)
So-called "lambda Grids" are becoming very popular.
Pros:
Circuit oriented model like the telephone network, hence no need for complex transport protocols
Lower equipment costs (typically a factor of 2 or 3 per layer)
The concept of a dedicated end-to-end light path is very elegant
Cons:
"End to end" is still very loosely defined, i.e. site to site, cluster to cluster, or really host to host
High cost, scalability, and the additional middleware required to deal with circuit set-up, etc.
Multi vendor 2.5Gb/s layer 2/3 testbed
[Diagram: multi-vendor 2.5 Gb/s layer 2/3 testbed. Routers (Alcatel 7770, Cisco 7606, Juniper M10, Procket 8801), GigE switches, layer 2 and layer 3 servers, an Alcatel 1670 multiplexer and a Cisco ONS 15454 interconnect CERN and STARLIGHT over 2.5G and 10G links, with connections to GEANT, VTHD, Abilene, ESnet, Canarie, GARR, INRIA, INFN/CNAF, UvA and PPARC/SuperJanet.]
State of 10G deployment and beyond
Still little deployed, because of lack of demand, hence:
Lack of products
High costs, e.g. 150 KUSD for a 10GigE port on a Juniper T320 router
Even switched, layer 2, 10GigE ports are expensive; however, prices should come down to 10 KUSD/port towards the end of 2003.
40G deployment, although more or less technologically ready, is unlikely to happen in the near future, i.e. before LHC starts.
10G DataTAG testbed extension to Telecom World 2003 and Abilene/CENIC
Sponsors: Cisco, HP, Intel, OPI (Geneva's Office for the Promotion of Industries & Technologies), Services Industriels de Geneve, Telehouse Europe, T-Systems
On September 15, 2003, the DataTAG project was the first transatlantic testbed offering direct 10GigE access, using Juniper's VPN layer 2/10GigE emulation.
Impediments to high E2E throughput across LAN/WAN infrastructure
For many years the Wide Area Network has been the bottleneck; this is no longer the case in many countries, thus, in principle, making the deployment of data intensive Grid infrastructure possible!
Recent I2LSR records show for the first time ever that the network can be truly transparent and that throughputs are limited by the end hosts.
The dream of abundant bandwidth has now become a reality in large, but not all, parts of the world!
The challenge has shifted from getting adequate bandwidth to deploying adequate LANs and cybersecurity infrastructure, as well as making effective use of it!
Major transport protocol issues still need to be resolved; however, there are many encouraging signs that practical solutions may now be in sight.
Single TCP stream performance under periodic losses
[Figure: effect of packet loss. Bandwidth utilization (%) vs. packet loss frequency (%, log scale from 0.000001 to 10), for a WAN path (RTT = 120 ms) and a LAN path (RTT = 0.04 ms), with 1 Gbps of available bandwidth. At a loss rate of 0.01%, LAN bandwidth utilization is 99% while WAN utilization is only 1.2%.]
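The LAN/WAN asymmetry shown here follows directly from the well-known Mathis et al. approximation for steady-state TCP throughput under periodic loss, throughput ~= (MSS/RTT) * C/sqrt(p) with C ~ 1.22. A quick check against the plotted parameters (an MSS of 1460 bytes is assumed; the other numbers are from the slide):

```python
import math

C = math.sqrt(3.0 / 2.0)          # ~1.22 for periodic losses
MSS_BITS = 1460 * 8               # assumed standard Ethernet MSS, in bits
LINK_BPS = 1e9                    # 1 Gbps available bandwidth (from slide)
LOSS = 1e-4                       # 0.01% packet loss rate (from slide)

def utilization(rtt_s: float) -> float:
    """Fraction of the 1 Gbps link a single TCP flow can use."""
    throughput = (MSS_BITS / rtt_s) * C / math.sqrt(LOSS)
    return min(throughput / LINK_BPS, 1.0)

print(f"LAN (RTT = 0.04 ms): {utilization(0.04e-3):.0%}")   # ~100%
print(f"WAN (RTT = 120 ms):  {utilization(0.120):.1%}")     # ~1.2%
```

The computed WAN utilization (~1.2%) matches the plot: at the same loss rate, the 3000x larger RTT cuts a single flow's achievable rate by the same factor.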
TCP throughput is much more sensitive to packet loss in WANs than in LANs
TCP's congestion control algorithm (AIMD) is not suited to gigabit networks:
Poor, limited feedback mechanisms
The effect of even very small packet loss rates is disastrous
TCP is inefficient in high bandwidth*delay networks
The future performance of data intensive grids looks grim if we continue to rely on the widely-deployed TCP Reno stack