GridFTP Introduction – Page 1
Grid Forum 5
GridFTP
Steve Tuecke
Argonne National Laboratory
Overview
Motivation for GridFTP Working Group
Requirements
GridFTP Solution
GridFTP Working Group Documents
Role of GridFTP Working Group
GridFTP Working Group Motivation
Data transfer solutions have been developed by the Globus Project over the past ~5 years; GridFTP is the 3rd generation
Grid Forum started ~1 year ago to promote and develop Grid technologies
  – Critical mass of people working in this area
Grid Forum GridFTP working group formed to foster the further specification and development of GridFTP
  – Community effort to move GridFTP forward
Some Important Definitions
Resource
Network protocol
Network enabled service
Application Programming Interface (API)
Syntax
Software Development Kit (SDK)
Resource
Entity that is to be shared
  – Includes computers, storage, data, software
Does not have to be a physical entity
  – Condor pool, distributed file system, …
Defined in terms of interfaces, not devices
  – E.g. LSF defines a compute resource
  – Open/close/read/write defines access to a distributed file system, e.g. NFS, AFS, DFS
Network Protocol
A formal description of message formats and a set of rules for message exchange
  – Rules may define the sequence of message exchanges
  – Protocol may define a state change in an endpoint
Good protocols are designed to do one thing
  – Protocols can be layered
Examples of protocols
  – IP, TCP, TLS, FTP, HTTP, Kerberos
Network Enabled Services
Implementation of a protocol that defines a set of capabilities
  – Protocol defines interaction with service
  – All services require protocols
  – Not all protocols are used to provide services (e.g. IP, TLS)
Examples: FTP and Web servers
  – Web Server uses: IP Protocol, TCP Protocol, TLS Protocol, HTTP Protocol
  – FTP Server uses: IP Protocol, TCP Protocol, FTP Protocol, Telnet Protocol
API (Application Programming Interface)
A specification for a set of routines to facilitate application development
  – Refers to definition, not implementation, e.g. there are many implementations of MPI
Spec often language-specific (or IDL)
  – Routine name, number, order and type of arguments; mapping to language constructs
  – Behavior or function of routine
Examples
  – GSS-API, MPI
Syntax
A specification for how a defined set of information is encoded into bits
  – A syntax may be defined as part of a protocol or API
    » Protocol messages have a defined syntax
    » A syntax may be used as an API function argument
  – But a syntax can also stand alone
Good syntax is designed to do one thing
  – Syntaxes can be layered
Examples
  – XML, ASN.1, X.509, LDIF
SDK (Software Development Kit)
A particular instantiation of an API
SDK consists of libraries and tools
  – Provides implementation of API specification
Can have multiple SDKs for an API
Examples of SDKs
  – MPICH, Motif Widgets
Multiple APIs but a Single Protocol
Example: TCP/IP
Multiple APIs: BSD sockets, Winsock, System V streams, …
Different programs use different APIs
Interoperability: programs using different APIs can exchange information
[Diagram: two applications, one using the WinSock API and one using the Berkeley Sockets API, communicate over the TCP/IP protocol (reliable byte streams)]
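The point of this slide can be sketched in a few lines of Python. This is only an illustration: the server is written against the standard library's `socketserver` API and the client against the lower-level `socket` API, yet they interoperate because both speak TCP. The upper-casing "service" and the names `UpperHandler` and `round_trip` are made up for the example.

```python
import socket
import socketserver
import threading


class UpperHandler(socketserver.StreamRequestHandler):
    """Server side, written against the socketserver API (toy service)."""

    def handle(self):
        line = self.rfile.readline()      # read one newline-terminated request
        self.wfile.write(line.upper())    # reply with the upper-cased bytes


def round_trip(message: bytes) -> bytes:
    """Client side, written against the plain socket API."""
    # Port 0 asks the operating system for any free port on loopback.
    with socketserver.TCPServer(("127.0.0.1", 0), UpperHandler) as server:
        worker = threading.Thread(target=server.handle_request, daemon=True)
        worker.start()
        host, port = server.server_address
        with socket.create_connection((host, port), timeout=5) as sock:
            sock.sendall(message + b"\n")
            reply = sock.makefile("rb").readline()
        worker.join(timeout=5)
    return reply.rstrip(b"\n")
```

Neither side needs to know which API the other was written with; only the byte stream on the wire is shared.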
Single API, but Multiple Protocols
E.g., GSS-API
GSS-API provides portability: any correct program compiles & runs on a platform
Does not provide interoperability: all processes must link against the same SDK
  – E.g., GSI and Kerberos versions of GSS-API
[Diagram: two applications both call GSS-API; one is linked against the GSI SDK and speaks the GSI protocol over TCP/IP, the other against the Kerberos SDK and speaks the Kerberos protocol over TCP/IP. The two protocols use different message formats, exchange sequences, etc.]
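A toy sketch of the same situation, assuming nothing about the real GSS-API: `SecurityContext` is a hypothetical common API, and `MechanismA` and `MechanismB` are invented implementations with incompatible wire formats. Application code written against the API is portable across both, but a message wrapped by one cannot be unwrapped by the other.

```python
import json
from abc import ABC, abstractmethod


class SecurityContext(ABC):
    """Hypothetical common API; stands in for something like GSS-API."""

    @abstractmethod
    def wrap(self, payload: str) -> bytes: ...

    @abstractmethod
    def unwrap(self, message: bytes) -> str: ...


class MechanismA(SecurityContext):
    """Invented mechanism whose wire format is a JSON envelope."""

    def wrap(self, payload):
        return json.dumps({"mech": "A", "data": payload}).encode()

    def unwrap(self, message):
        envelope = json.loads(message.decode())   # ValueError if not JSON
        if envelope.get("mech") != "A":
            raise ValueError("not a mechanism A message")
        return envelope["data"]


class MechanismB(SecurityContext):
    """Invented mechanism whose wire format is tag + 4-byte length + body."""

    def wrap(self, payload):
        body = payload.encode()
        return b"B" + len(body).to_bytes(4, "big") + body

    def unwrap(self, message):
        if message[:1] != b"B":
            raise ValueError("not a mechanism B message")
        length = int.from_bytes(message[1:5], "big")
        return message[5:5 + length].decode()


def echo(sender: SecurityContext, receiver: SecurityContext, text: str) -> str:
    """Portable application code: it only ever touches the common API."""
    return receiver.unwrap(sender.wrap(text))
```

`echo` works when both ends use the same mechanism and raises `ValueError` when they differ, which is the portability-without-interoperability case described above.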
I.e., Standard APIs and Protocols are Both Important: For Different Reasons
Standard APIs/SDKs are important
  – They enable application portability
  – But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?)
Standard protocols are important
  – Enable cross-site interoperability
  – Enable shared infrastructure
  – But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)
Grid Data Needs
Transfer of large amounts of data (petabytes or terabytes) between storage systems
Access to large amounts of data (terabytes or gigabytes) by many geographically distributed applications and users for analysis, visualization, etc.
Requirements
Grid Security Infrastructure (GSI) and Kerberos support
Third-party control of data transfer
Parallel data transfer
Striped data transfer
Partial file transfer
Automatic negotiation of TCP buffer/window size
Support for reliable/recoverable data transfer
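To make the partial and parallel transfer requirements concrete, here is a minimal sketch of the bookkeeping involved: dividing a file into (offset, length) ranges, one per stream, and reassembling blocks that arrive in any order. It is not the GridFTP wire format; `split_ranges` and `reassemble` are illustrative names.

```python
def split_ranges(total_size, streams):
    """Return (offset, length) pairs that cover [0, total_size) exactly once."""
    if total_size < 0 or streams < 1:
        raise ValueError("total_size must be >= 0 and streams >= 1")
    base, extra = divmod(total_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        # The first `extra` streams carry one additional byte each.
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges


def reassemble(blocks, total_size):
    """Rebuild the file from (offset, data) blocks received in any order."""
    out = bytearray(total_size)
    for offset, data in blocks:
        out[offset:offset + len(data)] = data
    return bytes(out)
```

A partial file transfer is the special case of requesting a single range; a parallel transfer sends every range at once over separate data connections.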
Candidate Standards
FTP
  – Defined by a set of IETF RFCs
  – No partial file, parallel/striped, GSI, etc.
  – Separate control & data channels
WebDAV
  – New extension to HTTP
  – No third-party transfer, parallel/striped, etc.
  – Combined control & data channel
Separate Control & Data Channels
WebDAV combines control and data over a single channel
FTP splits control and data
  – Supports multiple, user-selectable data channel protocols
Advantage to split channels
  – Third-party transfers handled cleanly
  – Can (cleanly) define new data channel protocols
    » E.g. parallel/striped transfer, automatic TCP buffer/window negotiation
  – Amenable to high-performance proxies
    » E.g. for firewalls, load balancing, etc.
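A sketch of why split channels make third-party transfers clean in standard FTP: the client holds a control connection to each server, asks one to listen (PASV), and tells the other to connect there (PORT), so the data flows directly between the two servers. The helper names, host, and paths below are illustrative; only the PASV/PORT/STOR/RETR commands and the 227 reply format come from the FTP standard.

```python
import re


def parse_pasv_reply(reply):
    """Extract (host, port) from a reply like
    '227 Entering Passive Mode (10,0,0,5,19,137)'."""
    match = re.search(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError("not a 227 passive-mode reply")
    nums = [int(group) for group in match.groups()]
    host = ".".join(str(n) for n in nums[:4])
    port = nums[4] * 256 + nums[5]        # port is sent as two bytes
    return host, port


def port_argument(host, port):
    """Encode (host, port) as the h1,h2,h3,h4,p1,p2 argument of PORT."""
    return ",".join(host.split(".") + [str(port // 256), str(port % 256)])


def third_party_plan(pasv_reply_from_destination, src_path, dst_path):
    """Control-channel commands to send after the destination answered PASV."""
    host, port = parse_pasv_reply(pasv_reply_from_destination)
    return [
        ("source", "PORT " + port_argument(host, port)),
        ("destination", "STOR " + dst_path),
        ("source", "RETR " + src_path),
    ]
```

The client never touches the file data; it only relays an address between two control channels.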
GridFTP Solution
Built on existing FTP standards
  – RFC 959: File Transfer Protocol
  – RFC 2228: FTP Security Extensions
  – RFC 2389: Feature Negotiation Mechanism for the File Transfer Protocol
  – Draft: FTP Extensions
Extends standards with
  – Additions to security extensions, partial file transfer, parallel/striped transfer, TCP buffer/window size tuning
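As background for TCP buffer/window size tuning: a common rule of thumb (not taken from these slides) is to size the buffer to the bandwidth-delay product of the path. The sketch below computes it with integer arithmetic and requests that size on a socket; the operating system may clamp or adjust the value, so the function reports what was actually granted. The function names are illustrative.

```python
import socket


def bdp_bytes(bandwidth_bits_per_s, rtt_ms):
    """Bandwidth-delay product in bytes: bits/s * seconds / 8."""
    return bandwidth_bits_per_s * rtt_ms // 8000


def request_buffers(sock, size):
    """Ask for send/receive buffers of `size` bytes; return what the OS granted."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, size)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, size)
    granted_send = sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)
    granted_recv = sock.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF)
    return granted_send, granted_recv
```

For example, a 1 Gbit/s path with a 50 ms round-trip time gives `bdp_bytes(1_000_000_000, 50)`, i.e. 6,250,000 bytes, far more than typical default buffer sizes.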
GridFTP Implementation Status
Modified wu-ftpd server
  – Most features
Modified ncftp client
  – Security, TCP buffer setting
Modified HPSS & Unitree ftpd server
  – Security
Globus Toolkit client and server SDKs, and command line tools
  – Most features
Striped FTP server (aka DPSS2)
GridFTP Working Group Documents
GridFTP: A Data Transfer Protocol for the Grid
  – Overview of working group activities and documents
  – Requirements
  – Informational draft
GridFTP: FTP Extensions for the Grid
  – Protocol specification
GridFTP Protocol Specifications
Existing standards
  – RFC 959: File Transfer Protocol
  – RFC 2228: FTP Security Extensions
  – RFC 2389: Feature Negotiation Mechanism for the File Transfer Protocol
  – Draft: FTP Extensions
New drafts
  – GridFTP: FTP Extensions for the Grid
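Feature negotiation (RFC 2389) is what lets a client discover which extensions a server supports before using them. A minimal sketch of parsing a FEAT reply follows; the reply layout (a `211-` opening line, one feature per line indented by a space, a closing `211 ` line) is from RFC 2389, while the function name and the sample feature names in the usage note are only illustrative.

```python
def parse_feat(reply):
    """Map feature name -> parameter string from a FEAT reply."""
    lines = reply.splitlines()
    # A server with no extensions may answer with a single "211 ..." line.
    if len(lines) == 1 and lines[0].startswith("211 "):
        return {}
    if (len(lines) < 2 or not lines[0].startswith("211-")
            or not lines[-1].startswith("211 ")):
        raise ValueError("not a well-formed FEAT reply")
    features = {}
    for line in lines[1:-1]:
        if not line.startswith(" "):
            continue                      # feature lines begin with a space
        name, _, params = line.strip().partition(" ")
        features[name.upper()] = params
    return features
```

Given a reply listing `SIZE`, `MDTM`, and `MLST size*;create;modify*`, a client would check `"SIZE" in parse_feat(reply)` before relying on that command.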
GridFTP APIs
Should there be standard API(s)?
  – POSIX I/O
  – SRB client
  – grid_storage
  – globus_ftp_client
  – MPI-IO
  – HDF5
  – etc.
Beyond scope of this working group
Common protocol beneath these APIs would allow interoperability
Role of GridFTP Working Group
Bring together those who are interested in the future of GridFTP to help foster the…
  – continued specification and standardization of GridFTP
  – development of interoperable GridFTP implementations
  – widespread adoption of GridFTP as a transfer protocol for the Grid
Develop drafts which together define GridFTP
  – May submit some of them to IETF
Move GridFTP forward to better address Grid data transfer requirements
NOT Goals of GridFTP Working Group
This working group will not start from first principles
  – Starting point is roughly GridFTP as it now exists
  – FTP base is assumed
It's not design by committee
  – Seeking rough consensus, with broad input
  – Draft authors and WG chair have final say
GF5 GridFTP Working Session
Is this appropriate for Grid Forum?
Who is interested in participating, and in what capacity?
Is the problem scoped appropriately (at least for now)?
What are the right drafts to write?
Establish rough timeline for drafts
A Call To Arms
The Grid Forum data working group needs to do more than just gather 3 times a year to chat about data management.
But Grid Forum is only appropriate for this activity if people meaningfully participate.
  – I will be doing this regardless.
  – But it will hopefully be done better and faster with broad participation.
  – If there is not meaningful participation, I won’t bother with the overhead of Grid Forum.