Post on 13-Jan-2016
George Kola
Computer Sciences Department
University of Wisconsin-Madison
kola@cs.wisc.edu
http://www.cs.wisc.edu/condor
DiskRouter: A Mechanism for High Performance Large Scale Data Transfers
Outline
› Problem
› DiskRouter Overview
› Details
› Real life DiskRouters
› Experiments
Problem
SDSC to NCSA
› Bottleneck bandwidth: 12.5 MBPS
› Latency: 67 ms
Transfer rate achieved by applications for a 1 GB file
› Scp: 0.66 MBPS
› GridFTP (1 stream): 0.85 MBPS
› GridFTP (10 streams): 3.52 MBPS
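The gap between the bottleneck bandwidth and the rates applications see is consistent with window-limited TCP, where one stream cannot exceed its window size divided by the round-trip time. A rough check, assuming the 67 ms figure is a round-trip time, 1 MB is 10^6 bytes, and a 64 KiB window (the window size is an illustrative assumption, not a number from the slides):

```python
# Back-of-the-envelope check of window-limited TCP throughput.
BOTTLENECK_MBPS = 12.5       # MB/s, from the slide
RTT_S = 0.067                # assumed to be a round-trip time
WINDOW_BYTES = 64 * 1024     # assumed TCP window (hypothetical value)


def bandwidth_delay_product(bandwidth_mbps, rtt_s):
    """Bytes that must be in flight to keep the link full."""
    return bandwidth_mbps * 1e6 * rtt_s


def window_limited_rate(window_bytes, rtt_s):
    """Upper bound on a single stream's throughput, in MB/s."""
    return window_bytes / rtt_s / 1e6


bdp = bandwidth_delay_product(BOTTLENECK_MBPS, RTT_S)
rate = window_limited_rate(WINDOW_BYTES, RTT_S)
print(f"bandwidth-delay product: {bdp / 1e3:.1f} KB")
print(f"one stream with a 64 KiB window: at most {rate:.2f} MB/s")
```

Under these assumptions the link needs roughly 840 KB in flight to stay full, while a 64 KiB window caps a single stream at just under 1 MB/s, which is the same order as the single-stream rates listed above.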
DiskRouter Overview
› Mechanism to efficiently move large amounts of data (order of terabytes)
› Uses disk as a buffer to aid in large scale data transfers
› Application-level overlay network used for routing
› Ability to use higher level knowledge for data movement
A Simple Case
[Diagram: A -> B]
A is transferring a large amount of data to B
A Simple Case
[Diagram: A -> C (DiskRouter) -> B]
C is an intermediate node between A and B
A Simple Case with DiskRouter
Improves performance when the bandwidth fluctuation between A and C is independent of the bandwidth fluctuation between C and B
[Diagram: A -> C (DiskRouter) -> B, with a plot comparing the transfer with DiskRouter and without DiskRouter]
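The benefit of buffering at C when the two hops fluctuate independently can be illustrated with a toy model. This is a minimal sketch, not DiskRouter's implementation: it assumes discrete time steps, an unbounded buffer at C, and made-up per-step bandwidths.

```python
def delivered_without_buffer(bw_ac, bw_cb):
    """No storage at C: each step is limited by the slower hop."""
    return sum(min(a, c) for a, c in zip(bw_ac, bw_cb))


def delivered_with_buffer(bw_ac, bw_cb):
    """C stores what it cannot forward yet and drains it later."""
    buffered = 0
    delivered = 0
    for a, c in zip(bw_ac, bw_cb):
        buffered += a              # A fills C's buffer at the A-C rate
        sent = min(c, buffered)    # C forwards at most the C-B rate
        buffered -= sent
        delivered += sent
    return delivered


# Hypothetical per-step bandwidths: the hops are fast at different times.
bw_ac = [10, 2, 10, 2]
bw_cb = [2, 10, 2, 10]

print(delivered_without_buffer(bw_ac, bw_cb))  # 8
print(delivered_with_buffer(bw_ac, bw_cb))     # 24
```

Without a buffer every step is pinned to the slower hop. With a buffer, C absorbs bursts on the A-C hop and forwards them when the C-B hop is fast, so the total approaches what the slower hop can carry on average.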
Data Mover/Distributed Cache
Source writes to the closest DiskRouter and Destination picks it up from its closest DiskRouter
[Diagram: Source -> DiskRouter Cloud -> Destination]
Routing Between DiskRouters
C need not be in the path between A and B
[Diagram: DiskRouter A, DiskRouter B and DiskRouter C]
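The slides do not say how a route through the overlay is chosen. One plausible policy, shown here purely as an assumption, is to pick the path whose slowest link is fastest (a "widest path" search). The link bandwidths below are hypothetical.

```python
import heapq


def widest_path(links, src, dst):
    """Return (bottleneck, path) maximizing the minimum link bandwidth."""
    graph = {}
    for (u, v), bw in links.items():
        graph.setdefault(u, []).append((v, bw))
        graph.setdefault(v, []).append((u, bw))

    best = {src: float("inf")}
    heap = [(-float("inf"), src, [src])]   # max-heap via negated widths
    while heap:
        neg_width, node, path = heapq.heappop(heap)
        width = -neg_width
        if node == dst:
            return width, path
        if width < best.get(node, 0):
            continue                        # stale heap entry
        for nxt, bw in graph.get(node, []):
            new_width = min(width, bw)
            if new_width > best.get(nxt, 0):
                best[nxt] = new_width
                heapq.heappush(heap, (-new_width, nxt, path + [nxt]))
    return 0, []


# Hypothetical bandwidths in Mbps; C is not on the direct A-B path.
links = {("A", "B"): 30, ("A", "C"): 90, ("C", "B"): 80}
print(widest_path(links, "A", "B"))  # (80, ['A', 'C', 'B'])
```

In this made-up topology the detour through C has a bottleneck of 80 Mbps against 30 Mbps on the direct link, which is the situation the slide describes: C is useful even though it is not on the direct path.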
Network Monitoring
› Uses ‘Pathrate’ for estimating network capacity
› Performs actual transfers for measurement
› Logs the data rate seen by different components
› Generates network interface stats on the machines involved in the transfers
Implementation Details
› Uses multiple sockets and explicitly sets TCP buffer sizes
› Overlaps disk I/O and socket I/O
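Both techniques can be sketched for a single stream with the Python standard library. This is an illustration rather than DiskRouter's code: the function names, chunk size, queue depth, and buffer size are all assumptions, and the operating system may clamp the requested buffer size. A multi-socket transfer would apply the same treatment to each stream.

```python
import queue
import socket
import threading

CHUNK = 256 * 1024       # disk read size (arbitrary choice)
TCP_BUF = 1024 * 1024    # requested socket buffer size; the OS may clamp it


def tune(sock, size=TCP_BUF):
    """Explicitly request send and receive buffer sizes."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)


def send_file(fileobj, sock, depth=8):
    """Read from disk in a helper thread while this thread writes to the socket."""
    chunks = queue.Queue(maxsize=depth)   # bounded so reads cannot run far ahead

    def reader():
        while True:
            data = fileobj.read(CHUNK)
            chunks.put(data)
            if not data:                  # empty bytes marks end of file
                return

    worker = threading.Thread(target=reader, daemon=True)
    worker.start()
    sent = 0
    while True:
        data = chunks.get()
        if not data:
            break
        sock.sendall(data)                # overlaps with the next disk read
        sent += len(data)
    worker.join()
    return sent
```

The bounded queue is the design choice that matters: it lets the disk read of the next chunk proceed while the current chunk is on the wire, without letting memory use grow when the network is the slower side.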
Client Side
› Client library provided
› Applications can call library functions for network I/O
› Functions provided for common case file transfer (overlaps network I/O and disk I/O)
› Third party transfer support
Real Life DiskRouters
[Diagram: deployed DiskRouters at UW Madison, UW Milwaukee, StarLight, MCS ANL, NCSA, SDSC and INFN Italy. Link labels (bandwidth, latency): 90 Mbps, 3.3 ms; 90 Mbps, 5.5 ms; 514 Mbps, 0.85 ms; 30 Mbps, 126.6 ms; 411 Mbps, 8 ms; 518 Mbps, 67 ms; 94 Mbps, 2.7 ms]
Outline
› Overview
› Details
› Real Life DiskRouters
› Experiments
Testing Multiroute
[Diagram: UW Madison, UW Milwaukee and StarLight. Link labels (bandwidth, latency): 90 Mbps, 3.3 ms; 90 Mbps, 5.5 ms; 411 Mbps, 8 ms]
Multiroute Improves Performance
[Plot: data rate in Megabits/second. Series: Total Data into Starlight, Data From Milwaukee, Data From Madison]
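The slides do not describe how data is divided between routes. As a hedged illustration of why using several routes can raise the total rate, the sketch below splits a transfer into chunks in proportion to each route's rate, so that all routes finish at about the same time. The route names and rates are hypothetical, and rates are assumed to be integers.

```python
def split_chunks(total_chunks, route_rates):
    """Assign chunks to routes in proportion to their (integer) rates."""
    total_rate = sum(route_rates.values())
    shares = {
        route: total_chunks * rate // total_rate
        for route, rate in route_rates.items()
    }
    leftover = total_chunks - sum(shares.values())
    # Hand any rounding leftovers to the fastest routes first.
    fastest_first = sorted(route_rates, key=route_rates.get, reverse=True)
    for route in fastest_first[:leftover]:
        shares[route] += 1
    return shares


# Hypothetical route rates in Mbps.
routes = {"direct": 90, "via_relay": 60}
print(split_chunks(100, routes))  # {'direct': 60, 'via_relay': 40}
```

With this split the aggregate rate is the sum of the route rates, provided the routes do not share a bottleneck.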
SRB to Unitree Transfer Using Stork
› Data movement from SDSC to NCSA via Starlight (3 TB of data had to be moved)
› Integrated into Stork
› Found significant performance gain
Link between SDSC and NCSA
[Diagram: SDSC, StarLight and NCSA. Link labels (bandwidth, latency): 518 Mbps, 67 ms; 94 Mbps, 2.7 ms]
Starlight DiskRouter Stats
[Plots: Data Inflow, Data Outflow, Memory Used, Disk Used]
GridFTP vs DiskRouter
[Plot: End-to-End Data Rate Seen by Stork (MBPS) vs. Time. Y-axis: Megabytes/second. Panels: GridFTP, DiskRouter]
A Glimpse of Performance
Transfer of 1 GB file from SDSC (San Diego) to NCSA (Urbana-Champaign)
Tool                  Transfer Rate
Scp                   0.66 MBPS
GridFTP (1 stream)    0.85 MBPS
GridFTP (10 streams)  3.52 MBPS
DiskRouter            10.77 MBPS
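To put the measured rates in perspective, the short calculation below derives DiskRouter's speedup over each tool and its share of the 12.5 MBPS bottleneck bandwidth quoted on the Problem slide. It uses only numbers from the slides; the variable names are illustrative.

```python
# Measured transfer rates from the slide, in MBPS.
rates_mbps = {
    "Scp": 0.66,
    "GridFTP (1 stream)": 0.85,
    "GridFTP (10 streams)": 3.52,
    "DiskRouter": 10.77,
}
BOTTLENECK_MBPS = 12.5   # bottleneck bandwidth from the Problem slide

# Speedup of DiskRouter relative to each of the other tools.
speedup = {
    tool: rates_mbps["DiskRouter"] / rate
    for tool, rate in rates_mbps.items()
    if tool != "DiskRouter"
}
utilization = rates_mbps["DiskRouter"] / BOTTLENECK_MBPS

for tool, factor in speedup.items():
    print(f"DiskRouter vs {tool}: {factor:.1f}x")
print(f"DiskRouter uses {utilization:.0%} of the bottleneck bandwidth")
```

DiskRouter comes out roughly 16 times faster than Scp, about 13 times faster than single-stream GridFTP, about 3 times faster than GridFTP with 10 streams, and reaches about 86% of the bottleneck bandwidth.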
Work In Progress
› Computation on data streams in the DiskRouter
› Ability to perform computation in the nodes attached locally to the DiskRouter
› Working together with Stork to add intelligence to data movement
Questions
› Thanks for listening