GridFTP Introduction: Grid Forum 5, GridFTP. Steve Tuecke, Argonne National Laboratory.

Date post: 18-Dec-2015

GridFTP Introduction – Page 1

Grid Forum 5

GridFTP

Steve Tuecke, Argonne National Laboratory

Overview

Motivation for GridFTP Working Group
Requirements
GridFTP Solution
GridFTP Working Group Documents
Role of GridFTP Working Group

GridFTP Working Group Motivation

Data transfer solutions have been developed by the Globus Project over the past ~5 years; GridFTP is the 3rd generation
Grid Forum started ~1 year ago to promote and develop Grid technologies
  – Critical mass of people working in this area
Grid Forum GridFTP working group formed to foster the further specification and development of GridFTP
  – Community effort to move GridFTP forward

Some Important Definitions

Resource
Network protocol
Network enabled service
Application Programming Interface (API)
Syntax
Software Development Kit (SDK)

Resource

Entity that is to be shared
  – Includes computers, storage, data, software
Does not have to be a physical entity
  – Condor pool, distributed file system, …
Defined in terms of interfaces, not devices
  – E.g. LSF defines a compute resource
  – Open/close/read/write defines access to a distributed file system, e.g. NFS, AFS, DFS

Network Protocol

A formal description of message formats and a set of rules for message exchange
  – Rules may define sequence of message exchanges
  – Protocol may define state-change in endpoint, e.g. file system state change
Good protocols are designed to do one thing
  – Protocols can be layered
Examples of protocols
  – IP, TCP, TLS, FTP, HTTP, Kerberos
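As a rough illustration of the definition above, the following Python sketch defines a toy protocol: a message format, rules for the order of exchanges, and a state change in the endpoint. The verbs HELLO, ECHO and BYE, the reply words, and the class names are all invented for this example; they are not part of FTP or GridFTP.

```python
from enum import Enum, auto
from typing import Tuple


class State(Enum):
    IDLE = auto()
    GREETED = auto()
    CLOSED = auto()


# Message format (assumed for illustration): "<VERB> <argument>\n" in ASCII.
def encode(verb: str, arg: str = "") -> bytes:
    return f"{verb} {arg}".strip().encode("ascii") + b"\n"


def decode(raw: bytes) -> Tuple[str, str]:
    verb, _, arg = raw.decode("ascii").rstrip("\n").partition(" ")
    return verb, arg


class Endpoint:
    """Holds the endpoint state and enforces the exchange rules."""

    def __init__(self) -> None:
        self.state = State.IDLE

    def handle(self, raw: bytes) -> bytes:
        verb, arg = decode(raw)
        # Rule: HELLO is only valid as the first message.
        if verb == "HELLO" and self.state is State.IDLE:
            self.state = State.GREETED  # state change in the endpoint
            return encode("OK", f"welcome {arg}")
        # Rule: ECHO and BYE are only valid after HELLO.
        if verb == "ECHO" and self.state is State.GREETED:
            return encode("OK", arg)
        if verb == "BYE" and self.state is State.GREETED:
            self.state = State.CLOSED
            return encode("OK", "goodbye")
        return encode("ERR", "unexpected message")
```

Sending ECHO before HELLO yields an ERR reply, which is the "sequence of message exchanges" rule in action.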

Network Enabled Services

Implementation of a protocol that defines a set of capabilities
  – Protocol defines interaction with service
  – All services require protocols
  – Not all protocols are used to provide services (e.g. IP, TLS)
Examples: FTP and Web servers

[Figure: two protocol stacks. Web Server: IP, TCP, TLS and HTTP protocols. FTP Server: IP, TCP, FTP and Telnet protocols.]

API (Application Programming Interface)

A specification for a set of routines to facilitate application development
  – Refers to definition, not implementation, e.g. there are many implementations of MPI
Spec often language-specific (or IDL)
  – Routine name, number, order and type of arguments; mapping to language constructs
  – Behavior or function of routine
Examples
  – GSS API, MPI

Syntax

A specification for how a defined set of information is encoded into bits
  – A syntax may be defined as part of a protocol or API
    » Protocol messages have defined syntax
    » A syntax may be used as API function argument
  – But syntax can also stand alone
Good syntax designed to do one thing
  – Syntaxes can be layered
Examples
  – XML, ASN.1, X.509, LDIF

SDK (Software Development Kit)

A particular instantiation of an API
SDK consists of libraries and tools
  – Provides implementation of API specification
Can have multiple SDKs for an API
Examples of SDKs
  – MPICH, Motif Widgets
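The API/SDK distinction can be sketched in a few lines of Python. Here `ReduceAPI` plays the role of the API (a routine name, argument types and documented behavior), while `LoopSDK` and `BuiltinSDK` are two independent implementations of it, loosely analogous to the several implementations of MPI. All names are hypothetical and chosen only for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class ReduceAPI(ABC):
    """The API: a definition of routines, with no implementation."""

    @abstractmethod
    def total(self, values: List[int]) -> int:
        """Return the sum of values; an empty list sums to 0."""


class LoopSDK(ReduceAPI):
    """One implementation ("SDK") of the API, using an explicit loop."""

    def total(self, values: List[int]) -> int:
        result = 0
        for v in values:
            result += v
        return result


class BuiltinSDK(ReduceAPI):
    """A second implementation of the same API, using the built-in sum."""

    def total(self, values: List[int]) -> int:
        return sum(values)


def application(sdk: ReduceAPI) -> int:
    # Application code depends only on the API, so it runs with either SDK.
    return sdk.total([1, 2, 3, 4])
```

Because `application` is written against the API only, swapping one SDK for the other requires no change to the application.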

Multiple APIs but a Single Protocol

Example: TCP/IP

Multiple APIs: BSD sockets, Winsock, System V streams, …
Different programs use different APIs
Interoperability: programs using different APIs can exchange information

[Figure: two applications, one using the WinSock API and one using the Berkeley Sockets API, exchanging data over the TCP/IP protocol (reliable byte streams).]

Single API, but Multiple Protocols

E.g., GSS-API

GSS-API provides portability: any correct program compiles & runs on a platform
Does not provide interoperability: all processes must link against same SDK
  – E.g., GSI and Kerberos versions of GSS-API

[Figure: two applications, each written to GSS-API. One links against the GSI SDK and speaks the GSI protocol over TCP/IP; the other links against the Kerberos SDK and speaks the Kerberos protocol over TCP/IP. The two protocols use different message formats, exchange sequences, etc.]

I.e., Standard APIs and Protocols are Both Important: For Different Reasons

Standard APIs/SDKs are important
  – They enable application portability
  – But w/o standard protocols, interoperability is hard (every SDK speaks every protocol?)
Standard protocols are important
  – Enable cross-site interoperability
  – Enable shared infrastructure
  – But w/o standard APIs/SDKs, application portability is hard (different platforms access protocols in different ways)

Grid Data Needs

Transfer of large amounts of data (petabytes or terabytes) between storage systems

Access to large amounts of data (terabytes or gigabytes) by many geographically distributed applications and users for analysis, visualization, etc.

Requirements

Grid Security Infrastructure (GSI) and Kerberos support
Third-party control of data transfer
Parallel data transfer
Striped data transfer
Partial file transfer
Automatic negotiation of TCP buffer/window size
Support for reliable/recoverable data transfer
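Two of these requirements reduce to simple arithmetic, sketched below in Python. The buffer size uses the standard bandwidth-delay product as a target, and the range splitter shows one possible way to divide a file into partial transfers for parallel streams. The function names and the even-split policy are assumptions made for illustration; they are not taken from the GridFTP drafts.

```python
from typing import List, Tuple


def tcp_buffer_bytes(bandwidth_bps: int, rtt_ms: int) -> int:
    """Bandwidth-delay product in bytes, rounded up.

    bandwidth_bps is in bits per second and rtt_ms is the round-trip
    time in milliseconds, so bits * ms / 8000 gives bytes.
    """
    return (bandwidth_bps * rtt_ms + 7999) // 8000


def split_ranges(file_size: int, streams: int) -> List[Tuple[int, int]]:
    """Split [0, file_size) into contiguous (offset, length) pieces.

    Each piece could be requested as a partial file transfer on its own
    data stream. Pieces differ in length by at most one byte; empty
    pieces are dropped when there are more streams than bytes.
    """
    if streams < 1:
        raise ValueError("streams must be at least 1")
    base, extra = divmod(file_size, streams)
    ranges = []
    offset = 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        if length:
            ranges.append((offset, length))
        offset += length
    return ranges
```

For example, a 1 Gbit/s path with a 50 ms round trip gives `tcp_buffer_bytes(1_000_000_000, 50) == 6_250_000`, far above typical default socket buffers, which is why buffer negotiation appears among the requirements.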

Candidate Standards

FTP
  – Defined by a set of IETF RFCs
  – No partial file, parallel/striped, GSI, etc.
  – Separate control & data channels
WebDAV
  – New extension to HTTP
  – No third party transfer, parallel/striped, etc.
  – Combined control & data channel

Separate Control & Data Channels

WebDAV combines control and data over single channel
FTP splits control and data
  – Supports multiple, user selectable data channel protocols
Advantage to split channels
  – Third party transfers handled cleanly
  – Can (cleanly) define new data channel protocols
    » E.g. parallel/striped transfer, automatic TCP buffer/window negotiation
  – Amenable to high-performance proxies
    » E.g. for firewalls, load balancing, etc.
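To illustrate why split channels make third-party transfers clean, here is a Python sketch of the control-channel commands a client could send to two plain FTP servers so that the data flows directly between them. It only builds the command list and parses a sample `227` reply; it performs no network I/O. The server labels `src` and `dst`, the function names and the sample address are assumptions, and GridFTP's own extensions are not shown.

```python
import re
from typing import List, Tuple


def parse_pasv(reply: str) -> Tuple[str, int]:
    """Parse a '227 ... (h1,h2,h3,h4,p1,p2)' reply into (host, port)."""
    match = re.search(r"\d{1,3}(?:,\d{1,3}){5}", reply)
    if not reply.startswith("227") or match is None:
        raise ValueError(f"not a usable PASV reply: {reply!r}")
    fields = [int(x) for x in match.group(0).split(",")]
    host = ".".join(str(x) for x in fields[:4])
    port = fields[4] * 256 + fields[5]  # port is sent as two bytes
    return host, port


def port_argument(host: str, port: int) -> str:
    """Format (host, port) as the argument of a PORT command."""
    return ",".join(host.split(".") + [str(port // 256), str(port % 256)])


def third_party_plan(src_path: str, dst_path: str,
                     pasv_reply_from_dst: str) -> List[Tuple[str, str]]:
    """Return (server, command) pairs in the order the client sends them.

    The destination listens (PASV); the source is told to connect to
    that address (PORT); then the two sides of the transfer are started.
    """
    host, port = parse_pasv(pasv_reply_from_dst)
    return [
        ("dst", "PASV"),
        ("src", f"PORT {port_argument(host, port)}"),
        ("dst", f"STOR {dst_path}"),
        ("src", f"RETR {src_path}"),
    ]
```

The client holds two control connections but never touches the data itself, which is only possible because the data channel is separate from the control channel.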

GridFTP Solution

Built on existing FTP standards
  – RFC 959: File Transfer Protocol
  – RFC 2228: FTP Security Extensions
  – RFC 2389: Feature Negotiation for the File Transfer Protocol
  – Draft: FTP Extensions
Extends standards with
  – Additions to security extensions, partial file transfer, parallel/striped transfer, TCP buffer/window size tuning,
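Feature negotiation (RFC 2389) is the mechanism that lets a client discover which extensions a server supports before using them. The Python sketch below parses the text of a `FEAT` reply into a dictionary. The sample reply in the usage note lists only ordinary FTP features chosen for illustration; the feature names a GridFTP server would advertise are not given in these slides and are not guessed here. The function name is an assumption.

```python
from typing import Dict


def parse_feat(reply: str) -> Dict[str, str]:
    """Map each advertised feature name to its parameter string.

    A multi-line reply starts with a '211-' line and ends with a '211 '
    line; each feature sits on its own line beginning with one space.
    A single '211 ' line means the server advertises no features.
    """
    lines = reply.splitlines()
    if len(lines) == 1 and lines[0].startswith("211 "):
        return {}
    if (len(lines) < 2 or not lines[0].startswith("211-")
            or not lines[-1].startswith("211 ")):
        raise ValueError("not a well-formed FEAT reply")
    features = {}
    for line in lines[1:-1]:
        if not line.startswith(" "):
            continue  # ignore anything that is not a feature line
        name, _, params = line[1:].partition(" ")
        features[name.upper()] = params
    return features
```

A client might call it as `parse_feat("211-Features:\r\n MDTM\r\n REST STREAM\r\n SIZE\r\n211 End\r\n")` and then check `"REST" in features` before attempting a restarted transfer.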

GridFTP Implementation Status

Modified wu-ftpd server
  – Most features
Modified ncftp client
  – Security, TCP buffer setting
Modified HPSS & Unitree ftpd server
  – Security
Globus Toolkit client and server SDKs, and command line tools
  – Most features
Striped FTP server (aka DPSS2)

GridFTP Working Group Documents

GridFTP: A Data Transfer Protocol for the Grid
  – Overview of working group activities and documents
  – Requirements
  – Informational draft
GridFTP: FTP Extensions for the Grid
  – Protocol specification

GridFTP Protocol Specifications

Existing standards
  – RFC 959: File Transfer Protocol
  – RFC 2228: FTP Security Extensions
  – RFC 2389: Feature Negotiation for the File Transfer Protocol
  – Draft: FTP Extensions
New drafts
  – GridFTP: FTP Extensions for the Grid

GridFTP APIs

Should there be standard API(s)?
  – Posix I/O
  – SRB client
  – grid_storage
  – globus_ftp_client
  – MPI-IO
  – HDF5
  – etc.
Beyond scope of this working group
Common protocol beneath these APIs would allow interoperability

Role of GridFTP Working Group

Bring together those who are interested in the future of GridFTP to help foster the…
  – continued specification and standardization of GridFTP
  – development of inter-operable GridFTP implementations
  – widespread adoption of GridFTP as a transfer protocol for the Grid
Develop drafts which together define GridFTP
  – May submit some of them to IETF
Move GridFTP forward to better address Grid data transfer requirements

NOT Goals of GridFTP Working Group

This working group will not start from first principles
  – Starting point is roughly GridFTP as it now exists
  – FTP base is assumed
It's not design by committee
  – Seeking rough consensus, with broad input
  – Draft authors and WG chair have final say

GF5 GridFTP Working Session

Is this appropriate for Grid Forum?
Who is interested in participating, and in what capacity?
Is the problem scoped appropriately (at least for now)?
What are the right drafts to write?
Establish rough timeline for drafts

A Call To Arms

The Grid Forum GridFTP working group needs to do more than just gather 3 times a year to chat about data management.

But Grid Forum is only appropriate for this activity if people meaningfully participate.
  – I will be doing this regardless.
  – But it will hopefully be done better and faster with broad participation.
  – If there is not meaningful participation, I won’t bother with the overhead of Grid Forum.

