
1 Grid Programming in Java & Python Prepared by Gregor von Laszewski Argonne National Laboratory...

Date post: 11-Jan-2016
Page 1: 1 Grid Programming in Java & Python Prepared by Gregor von Laszewski Argonne National Laboratory gregor@mcs.anl.gov Keith R. Jackson Lawrence Berkeley.


Grid Programming in Java & Python

Prepared by

Gregor von Laszewski
Argonne National Laboratory
gregor@mcs.anl.gov

Keith R. Jackson

Lawrence Berkeley National Laboratory

[email protected]


Chapter Outline

Introduction to Grids
– What is a Grid?
– What is the Globus Toolkit?
– What is a Commodity Grid Kit?

Using and Programming Grids with the Java™ and Python CoG Kits
– Secure access to remote resources
– Remote job submission and monitoring
– Distributed grid information management and remote data access
– Graphical component to access a Grid

Conclusion for this part of the presentation
Exercises


Integration of High-end Resources

On-demand creation of powerful virtual computing systems

Sensor nets

Data archives

Supercomputers

Software catalogs

Colleagues


Grid Computing and Existing Technologies

Commonalities between “Grid computing” and major industrial thrusts
– Business-to-business, peer-to-peer, application service providers, storage service providers, distributed computing, Internet computing

Differences between Grid computing and existing technologies
– Complicated requirements: “Run program X at site Y subject to community policy P, providing access to data at Z according to policy Q”
– High performance: unique demands of advanced & high-performance systems


What challenges do we have to solve?

• Authenticate once
• Specify simulation (code, resources, etc.)
• Locate resources
• Negotiate authorization, acceptable use, etc.
• Acquire resources
• Initiate computation
• Steer computation
• Access remote datasets
• Collaborate on results
• Account for usage

(Figure labels: Domain 1, Domain 2)


An Example Application: An Advanced Scientific Instrument

Avatar

Virtual Reality Cave

Scientist

Advanced Photon Source

Electronic Library and Databases

Computing Portal Clients

Supercomputer


The Globus Toolkit

The Globus Toolkit provides a range of basic Grid Services
– Security, information, fault detection, communication, resource management

These services are simple and orthogonal
– Can be used independently: mix and match
– Programming model independence

For each service there is generally a well-defined API
Standards are used extensively
– E.g. LDAP, GSS-API, X.509


Globus Approach

A toolkit and collection of services addressing key technical problems
– Modular “bag of services” model
– Not a vertically integrated solution
– General infrastructure tools (aka middleware) that can be applied to many application domains

Interdomain issues, rather than clustering
– Integration of intradomain solutions

Distinction between local and global services


Globus Hourglass

Focus on architecture issues
– Propose set of core services as basic infrastructure
– Use to construct high-level, domain-specific solutions

Design principles
– Keep participation cost low
– Enable local control
– Support adaptation
– “IP hourglass” model

(Figure, hourglass from top to bottom: Applications, Diverse global services, Core Globus services, Local OS)


Layered Grid Architecture

(Figure: the Grid layers Application, Collective, Resource, Connectivity, and Fabric, drawn next to the Internet Protocol Architecture layers Application, Transport, Internet, and Link)


Production Grids & Testbeds

NASA’s Information Power Grid

The Alliance National Technology Grid

GUSTO Testbed


Key Grid Services

Resource Discovery
– List all of the Solaris machines with at least 16 processors and 2 GB of memory

Resource Acquisition
– Submit my job to this machine

Data Management/Movement
– Locate, or create, a replica of this data set
– Move this data set to this host

Security
– Authentication
– Authorization
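The resource-discovery request above can be pictured as a directory search filter, since the toolkit's information services build on LDAP. The following is a minimal Python 3 sketch of composing such a filter in LDAP prefix notation. The attribute names (`os`, `cpucount`, `memorymb`) are hypothetical placeholders and not the actual information-service schema.

```python
def ldap_and(*clauses):
    """Combine filter clauses with a logical AND (LDAP prefix notation)."""
    return "(&" + "".join(clauses) + ")"


def equals(attr, value):
    return "({}={})".format(attr, value)


def at_least(attr, value):
    return "({}>={})".format(attr, value)


# Attribute names are illustrative assumptions only.
solaris_filter = ldap_and(
    equals("os", "solaris"),
    at_least("cpucount", 16),
    at_least("memorymb", 2048),
)
print(solaris_filter)
```

Building the filter from small helpers keeps the parentheses balanced, which is the most common mistake when such filters are written by hand.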


Grid Security Infrastructure

Standards-based infrastructure that provides:
– Single sign-on
– Authentication & data integrity/confidentiality
– Delegation

Based on:
– X.509 proxy certificates (draft IETF PKIX standard)
– TLS
– GSS-API


Delegation in GSI

Participants: Client, Compute Resource, Tertiary Storage

1) Mutual authentication using TLS (Client and Compute Resource)
2) Delegate GSI credential (the Compute Resource now holds a GSI proxy)
3) Submit job
4) Mutual authentication using TLS, using the GSI proxy (Compute Resource and Tertiary Storage)
5) Retrieve data set
6) Run computation
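The delegation steps above can be illustrated with a toy model of a credential chain. This Python 3 sketch contains no cryptography and is not the GSI implementation. The `Credential` class, the subject name, and the `/CN=proxy` suffix convention are assumptions made for illustration; the point is only that each delegated proxy can be traced back to the user who signed on once.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Credential:
    subject: str
    issuer: Optional["Credential"] = None  # None marks the long-term user credential

    def delegate(self):
        """Issue a new proxy credential derived from this one."""
        return Credential(subject=self.subject + "/CN=proxy", issuer=self)

    def root_identity(self):
        """Walk the issuer chain back to the original user identity."""
        cred = self
        while cred.issuer is not None:
            cred = cred.issuer
        return cred.subject


user = Credential("/O=Grid/CN=Alice")    # hypothetical subject name
client_proxy = user.delegate()           # single sign-on proxy held by the client
compute_proxy = client_proxy.delegate()  # proxy delegated to the compute resource

# The storage system is contacted with the compute resource's proxy,
# yet the request still maps to the original user.
print(compute_proxy.subject, "->", compute_proxy.root_identity())
```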


GridFTP

Implements many of the optional RFCs for the FTP protocol.
– Authentication using the Grid Security Infrastructure through the GSS-API
– Restart markers
– Third-party transfers

Adds some new extensions to support high-performance transfers.
– Parallel streams
– Striped servers
– TCP buffer tuning
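The idea behind parallel streams and restart markers can be sketched with plain byte-range arithmetic. The Python 3 example below is an illustration only and does not reproduce the GridFTP wire protocol. The helper names `split_ranges` and `remaining` are hypothetical.

```python
def split_ranges(total_bytes, streams):
    """Return (offset, length) pairs that together cover total_bytes."""
    base, extra = divmod(total_bytes, streams)
    ranges, offset = [], 0
    for i in range(streams):
        length = base + (1 if i < extra else 0)
        ranges.append((offset, length))
        offset += length
    return ranges


def remaining(ranges, done_bytes):
    """Given the bytes already confirmed per stream, list what is still missing."""
    return [(off + d, ln - d) for (off, ln), d in zip(ranges, done_bytes) if d < ln]


parts = split_ranges(10, 3)
print(parts)                        # [(0, 4), (4, 3), (7, 3)]
print(remaining(parts, [4, 1, 3]))  # [(5, 2)]
```

After an interruption only the missing ranges need to be requested again, which is what makes restarting a large transfer cheap.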


Definition: CoG Kit

A Commodity Grid (CoG) Kit defines and implements a set of general components that map Grid functionality into a commodity environment.

– Web/CGI, Java, Python, CORBA, DCOM, ....
– CoG Kits help us build applications, Problem Solving Environments, and Portals.
– CoG Kits take the good things from framework & Grid


To further elaborate …

Computational Grids
– Globus Infrastructure

+

Commodity Technologies
– CORBA, Java, Python, Perl, Web technologies, ...

+

Science Application

=

Computing Portal
– Problem Solving Environments
– Workbenches


Motivation: Java & Python CoG Kits

• Use and leverage existing technologies for Grid programming
  - The capabilities of the framework onto which Grid Services are mapped can be exploited: Objects, Events, Exceptions, JNDI™, ...
  - Objects like jobs/tasks can be defined.
  - XML support is provided.
  - GUIs, ..., IDEs can be used (Forte, BOA Constructor)
• Maximize software flexibility, extensibility, and reusability
• Provide foundations for application developer teams that are familiar with this framework to develop applications in it
  - Reduce development and maintenance cost
• Use as glue for many technologies
  - Python is well suited to tying together many different languages/technologies.


What is the Java CoG Kit?

The Java CoG Kit provides a mapping between Java and the Globus Toolkit. It extends the use of Globus by enabling access to advanced Java features such as events and objects for Grid programming.

The Java CoG Kit is implemented in pure Java. It speaks the Grid protocols.

It is not a wrapper of the C Globus Toolkit.
This allows integration within applets.
Mostly client-side support


What is the Python CoG Kit?

Similarly, the Python CoG Kit provides a mapping between Python and the Globus Toolkit. It extends the use of Globus by enabling access to advanced Python features such as events and objects for Grid programming.

The Python CoG Kit is implemented as a series of Python extension modules that wrap the Globus C code.

Uses SWIG (http://www.swig.org) to help generate the interfaces.


Status: Java CoG Kit

Modified core Globus components (Protocols)
Basic services are provided accessing:
– Security (GSI)
– Remote job submission and monitoring (GRAM)
– Quality of service (GARA)
– Remote Data Access (GSIFTP)
– Information Service Access (MDS)
– Certificate store (myProxy)

Currently 100% client-side components
Includes reusable Grid GUI components


Status: Python CoG Kit

Basic services are provided accessing:
– Security (security)
– Remote job submission and monitoring (gramClient)
– GASS server (gassServerEZ)
– Secure high-performance network IO (io)
– Protocol independent data transfers (gassCopy)
– High performance Grid FTP transfers (ftpClient)
– Support for building Grid FTP servers (ftpControl)
– Remote file IO (gassFile)
– Replica Catalog (replicaCatalog)
– Replica Management (replicaManagement)
– GSI enabled SOAP (GSISOAP)


Python CoG Kit basics

Everything is contained in the pyGlobus package.
The low-level C wrappers are all named modulec, e.g., ftpClientc, gramClientc
– Shouldn’t need to access these directly; instead use the Python wrappers (ftpClient, gramClient)

Exceptions are used to handle error conditions
– pyGlobus.util.GlobusException is the base class for all pyGlobus exceptions. It extends the base Python exceptions.Exception class

The Python wrapper classes manage the underlying memory
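The practical benefit of a single exception base class is that callers can catch every module-specific error in one place. The Python 3 sketch below defines a local stand-in for the base class so that it runs without pyGlobus installed. The subclass name `FtpClientException` and the `transfer` function are hypothetical.

```python
class GlobusException(Exception):
    """Local stand-in for pyGlobus.util.GlobusException."""


class FtpClientException(GlobusException):  # hypothetical module-specific error
    pass


def transfer():
    """Pretend transfer that always fails, to exercise the handler."""
    raise FtpClientException("connection refused")


try:
    transfer()
except GlobusException as ex:
    # Catching the base class handles errors from every wrapper module.
    caught = "{}: {}".format(type(ex).__name__, ex)
    print(caught)
```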


Building the Python CoG Kit

Uses standard GNU tools (autoconf, make)
Download the latest source from http://www-itg.lbl.gov/Grid/projects/pyGlobus/
tar zxvf pyGlobus-VERSION.tar.gz
cd pyGlobus; mkdir build; cd build
../configure
– Use --with-flavor=<Globus flavor string> to override the default flavor
– Use --with-python=<path to python> to override the default python installation
make; make install
– Installs into your site-packages directory (no need to edit PYTHONPATH)


Where do we go from here?

On which computers can I perform my task?
– Use a Grid information service (MDS) to answer this.
– Use the LDAP browser as a good example of how to develop a Grid-based Portal to a directory based information service.

How can I execute a job on a remote machine?
– Use the job submission API or GUI

How can I access data on a remote machine?
– Use Grid FTP.
– Use the protocol independent gassCopy module.
– Remote file IO using the gassFile module.

Secure high-performance network IO.


LDAP Browser and Editor

User-friendly Windows Explorer-like interface to LDAP directories
– It allows browsing and modification of information
– It is written entirely in Java
  • Uses JFC (SwingSet) for the GUI
  • Uses JNDI class library for LDAP access
  • Connects to LDAP v2 and v3 servers.
– Note: v3 standard is defined as we speak

Recognition
– More than 1600 commercial user licenses
– Many users in academia (license for free)
– Winner of the Best Student Application award in the Novell Developers' Contest
– Jars top 25%


Features

Browsing, editing, searching
Export
Binary attributes, customizable attribute editors
Object templates
LDAP v3 aware, SSL support
Drag and drop, copy-and-paste interface
Named sessions
Applet support


Screenshot


Tiny Information Query Program

    MDS mds = new MDS("mds.globus.org", 389, "o=Grid");

    // Find compute resources that currently report 64 free nodes.
    MDSresult result = mds.search(
        "(&(objectclass=GridComputeResource)(freenodes=64))",
        "contact freenodes totalnodes");

    String contact = result.get("contact");

    System.out.println(result.print());

This can also be done with JNDI or the Netscape SDK.
We have this layer for portability.


Remote Job Submission

Access to remote resources is an elementary Grid problem

Supports delegation to the remote resource


Gram Submission Demo

Hello World

1

2

3

Bang

A test to stderr

3

2

1

Bang


C Job Submission Example

    static void
    globus_l_globusrun_gram_callback_func(void *user_arg, char *job_contact,
                                          int state, int errorcode)
    {
        globus_i_globusrun_gram_monitor_t *monitor;

        monitor = (globus_i_globusrun_gram_monitor_t *) user_arg;

        globus_mutex_lock(&monitor->mutex);
        monitor->job_state = state;

        switch(state)
        {
        case GLOBUS_GRAM_PROTOCOL_JOB_STATE_PENDING:
            /* Job is queued; keep waiting. */
            break;
        case GLOBUS_GRAM_PROTOCOL_JOB_STATE_FAILED:
            if(monitor->verbose)
            {
                globus_libc_printf("GLOBUS_GRAM_PROTOCOL_JOB_STATE_FAILED\n");
            }
            monitor->done = GLOBUS_TRUE;
            break;
        case GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE:
            if(monitor->verbose)
            {
                globus_libc_printf("GLOBUS_GRAM_PROTOCOL_JOB_STATE_DONE\n");
            }
            monitor->done = GLOBUS_TRUE;
            break;
        }

        /* Wake up the thread waiting in the submission routine. */
        globus_cond_signal(&monitor->cond);
        globus_mutex_unlock(&monitor->mutex);
    }


C Job Submission Example (cont.)

    globus_l_globusrun_gramrun(char *request_string, unsigned long options,
                               char *rm_contact)
    {
        char *callback_contact = GLOBUS_NULL;
        char *job_contact = GLOBUS_NULL;
        globus_i_globusrun_gram_monitor_t monitor;
        int err;

        monitor.done = GLOBUS_FALSE;
        monitor.verbose = verbose;
        globus_mutex_init(&monitor.mutex, GLOBUS_NULL);
        globus_cond_init(&monitor.cond, GLOBUS_NULL);

        err = globus_module_activate(GLOBUS_GRAM_CLIENT_MODULE);
        if(err != GLOBUS_SUCCESS) { … }

        /* Register the callback and obtain a contact string for it. */
        err = globus_gram_client_callback_allow(
            globus_l_globusrun_gram_callback_func,
            (void *) &monitor,
            &callback_contact);
        if(err != GLOBUS_SUCCESS) { … }

        /* Submit the job and ask for callbacks on every state change. */
        err = globus_gram_client_job_request(rm_contact, request_string,
            GLOBUS_GRAM_PROTOCOL_JOB_STATE_ALL, callback_contact, &job_contact);
        if(err != GLOBUS_SUCCESS) { … }

        /* Block until the callback marks the job as finished. */
        globus_mutex_lock(&monitor.mutex);
        while(!monitor.done)
        {
            globus_cond_wait(&monitor.cond, &monitor.mutex);
        }
        globus_mutex_unlock(&monitor.mutex);

        globus_gram_client_callback_disallow(callback_contact);
        globus_free(callback_contact);

        globus_mutex_destroy(&monitor.mutex);
        globus_cond_destroy(&monitor.cond);


Gram

Creating a job

    GramJob job = new GramJob("&(executable=/bin/sleep)(arguments=15)");

Listening to state changes via listeners

    job.addListener(new GramJobListener() {
        public void statusChanged(GramJob job) {
            System.out.println("Job [" + job.getIDAsString() + "]:"
                + " Status : " + job.getStatusAsString());
        }
    });
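The job description passed to GramJob above is an RSL string: an ampersand followed by parenthesized attribute=value relations. A minimal Python 3 sketch of assembling such a string from pairs follows. The helper name `to_rsl` is hypothetical, and the sketch assumes simple values that need no quoting or escaping.

```python
def to_rsl(attributes):
    """Render (name, value) pairs as a conjunction of RSL relations."""
    return "&" + "".join("({}={})".format(name, value) for name, value in attributes)


rsl = to_rsl([("executable", "/bin/sleep"), ("arguments", 15)])
print(rsl)  # &(executable=/bin/sleep)(arguments=15)
```

Generating the string from data avoids unbalanced parentheses when job descriptions are built programmatically, for example inside a portal.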


gramClient

    from pyGlobus import gramClient
    from threading import *

    cond = 0                     # set to 1 by the callback when the job ends
    condV = Condition(Lock())

    try:
        client = gramClient.GramClient()
        callbackContact = client.set_callback(func, condV)
        jobContact = client.submit_request(rm, rsl,
            gramClient.JOB_STATE_ALL, callbackContact)
        # Wait until the callback reports a final job state.
        condV.acquire()
        while cond == 0:
            condV.wait()
        condV.release()
    except GramClientException as ex:
        print(ex)


gramClient Job State Callback

    def func(cv, contact, state, error):
        global cond
        if state == gramClient.JOB_STATE_FAILED:
            print("Job failed")
            cv.acquire()
            cond = 1
            cv.notify()
            cv.release()
        elif state == gramClient.JOB_STATE_DONE:
            print("Job is done")
            cv.acquire()
            cond = 1
            cv.notify()
            cv.release()
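The wait-and-notify pattern used by the submission code and its callback can be tried without a Grid installation. In the Python 3 sketch below an ordinary thread plays the role of the GRAM callback thread. The state constants and the `fake_job` function are stand-ins, not pyGlobus names.

```python
import threading

JOB_STATE_ACTIVE, JOB_STATE_DONE = "ACTIVE", "DONE"  # stand-in constants

done = False
cond = threading.Condition()
seen = []


def on_state(state):
    """Callback invoked from another thread for each state change."""
    global done
    seen.append(state)
    if state == JOB_STATE_DONE:
        with cond:  # acquire, set the flag, notify, release
            done = True
            cond.notify()


def fake_job():
    """Plays the role of the thread that delivers job state callbacks."""
    for state in (JOB_STATE_ACTIVE, JOB_STATE_DONE):
        on_state(state)


worker = threading.Thread(target=fake_job)
worker.start()
with cond:
    while not done:  # the loop guards against spurious wakeups
        cond.wait(timeout=5)
worker.join()
print(seen)
```

Checking the flag in a loop matters: if the callback fires before the main thread starts waiting, the loop body is skipped and nothing blocks.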


Redirecting stdout with gramClient

    from pyGlobus import gassServerEZ

    opts = gassServerEZ.STDOUT_ENABLE
    server = gassServerEZ.GassServerEZ(opts)
    url = server.getURL()
    # Point the job's stdout at the local GASS server.
    rsl = "&(executable=/bin/sleep)(arguments=15)(stdout=%s/dev/stdout)" % url

(normal job submission from here)


Remote Data Transfer

Directly use the Grid FTP protocol to transfer the data.

Use the gassCopy module for protocol independent transfers. It currently supports the ftp, gsiftp, http, and https protocols.
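A protocol-independent transfer layer has to decide from the URL alone which protocol handler applies. The Python 3 sketch below shows that dispatch decision using only the standard library. The function name `can_copy` and the example hosts are hypothetical, and the sketch treats only the four schemes listed above as supported.

```python
from urllib.parse import urlparse

SUPPORTED = {"ftp", "gsiftp", "http", "https"}  # schemes listed on the slide


def can_copy(src_url, dest_url):
    """True when both endpoints use a scheme from the supported set."""
    return all(urlparse(u).scheme in SUPPORTED for u in (src_url, dest_url))


print(can_copy("gsiftp://host.example.org/data/in", "https://other.example.org/out"))
print(can_copy("gsiftp://host.example.org/data/in", "rsync://other.example.org/out"))
```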


Java UrlCopy (protocol independent file transfer)

    import org.globus.io.urlcopy.*;

    UrlCopy c = new UrlCopy();
    c.setSourceUrl(from);
    c.setDestinationUrl(to);
    c.setUseThirdPartyCopy(true); // hint to enable third-party transfer

    // register a transfer listener
    c.setListener(new UrlCopyListener() {
        public void transfer(int total, int current) {
            System.out.println(total + " " + current);
        }
        public void transferError(Exception e) {
            System.out.println("transfer failed: " + e.getMessage());
        }
    });

    c.copy(); // this starts the copy


GSI FTPClient (not GridFTP)

    import org.globus.io.ftp.*;

    GSIFTPClient ftp = new GSIFTPClient(host, port);
    ftp.authenticate();
    ftp.setType(FTPClient.BINARY);

    // we have convenient calls like
    ftp.makeDir(dir);
    ftp.deleteDir(dir3);

    // the listener for the transfers
    TransferProgressListener listener = new TransferProgressListener() {
        public void transfer(int total, int current, String from, String to) {
            System.out.println(total + " " + current + " " + from + " " + to);
        }
        public void transferError(String from, String to, Exception e) {
            System.out.println("transfer failed: " + from + " " + to);
        }
    };


Simple Examples

Retrieve a Makefile from a remote site and save it to Makefile.bak
– File dst = new File("Makefile.bak");
– ftp.get("Makefile", dst, listener);

Retrieve all files in remote directory to a directory called “Destination”
– File destinationDir = new File("Destination");
– ftp.get("*", destinationDir, true, listener);

Disconnect
– ftp.disconnect();

Examples in source code: apis/examples, UrlCopy, FTPClient


ftpClient

Use Grid FTP to transfer a file.

    from pyGlobus import ftpClient
    from pyGlobus.util import Buffer

    handleAttr = ftpClient.HandleAttr()
    opAttr = ftpClient.OperationAttr()
    marker = ftpClient.RestartMarker()
    ftpClnt = ftpClient.FtpClient(handleAttr)
    ftpClnt.get(url, opAttr, marker, done_func, condV)
    buf = Buffer(64*1024)
    handle = ftpClnt.register_read(buf, data_func, 0)

    def data_func(cv, handle, buffer, bufHandle, bufLen, offset, eof, error):
        # Write the received block, then register the next read until EOF.
        g_dest.write(buffer)
        if not eof:
            try:
                handle = g_ftpClient.register_read(g_buffer, data_func, 0)
            except Exception as e:
                …


Performance Options in ftpClient

Setting TCP buffer size

    from pyGlobus import ftpControl

    battr = ftpControl.TcpBuffer()
    battr.set_fixed(64*1024)

Or

    battr.set_automatic(16*1024, 8*1024, 64*1024)
    opAttr.set_tcp_buffer(battr)

Setting parallelism

    para = ftpControl.Parallelism()
    para.set_mode(ftpControl.PARALLELISM_FIXED)
    para.set_size(3)
    opAttr.set_parallelism(para)
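A common rule of thumb for choosing a TCP buffer size is the bandwidth-delay product: the number of bytes that must be in flight to keep the path full. The Python 3 sketch below computes it and shows how parallel streams reduce the per-stream requirement. The function names and the example link figures are illustrative assumptions, not values from the toolkit.

```python
def tcp_buffer_bytes(bandwidth_mbit_per_s, rtt_ms):
    """Bandwidth-delay product in bytes, using integer arithmetic."""
    bits_in_flight = bandwidth_mbit_per_s * 1_000_000 * rtt_ms // 1000
    return bits_in_flight // 8


def per_stream_buffer(bandwidth_mbit_per_s, rtt_ms, streams):
    """With parallel streams, each stream needs only its share of the total."""
    return tcp_buffer_bytes(bandwidth_mbit_per_s, rtt_ms) // streams


print(tcp_buffer_bytes(100, 50))      # 100 Mbit/s path, 50 ms round trip: 625000
print(per_stream_buffer(100, 50, 3))  # 208333
```

For such a path a fixed 64 KB buffer would be far too small for a single stream, which is why buffer tuning and parallelism are offered together.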


Using ftpClient plugins

Debug Plugin

    from pyGlobus import ftpClient

    f = open("/tmp/out", "w+")
    debugPlugin = ftpClient.DebugPlugin(f, "foobar")
    handleAttr.add_plugin(debugPlugin)

RestartMarker Plugin

    restartPlugin = ftpClient.RestartMarkerPlugin(beginCB, markerCB, doneCB, None)


gassCopy

Provides a protocol independent API to transfer remote files.

from pyGlobus import gassCopy

from pyGlobus import ftpClient

    srcAttr = gassCopy.Attr()
    handleAttr = gassCopy.HandleAttr()
    destAttr = gassCopy.Attr()
    ftpSrcAttr = ftpClient.OperationAttr()
    ftpDestAttr = ftpClient.OperationAttr()
    srcAttr.set_ftp(ftpSrcAttr)
    destAttr.set_ftp(ftpDestAttr)
    copy = gassCopy.GassCopy(handleAttr)
    copy.copy_url_to_url(srcUrl, srcAttr, destUrl, destAttr)


Remote File IO

Supports reading and writing remote files from http, https, ftp, gsiftp servers.
– Can be opened either as normal Python file objects or as int file descriptors suitable for use with the os module.

    from pyGlobus import gassFile

    gf = gassFile.GassFile()    # separate name so the module is not shadowed
    f = gf.fopen("gsiftp://foo.bar.com/file", "r")
    lines = f.readlines()
    gf.fclose(f)


Secure High-Performance IO.

Uses the Grid Security Infrastructure to provide authentication.

Provides access to the underlying network parameters for tuning performance.


TCP Server example

    attr = NetIOAttr.TCPIOAttr()
    attr.set_authentication_mode(
        io.GLOBUS_IO_SECURE_AUTHENTICATION_MODE_GSS_API)
    authData = AuthData.AuthData()
    authData.set_callback(auth_callback, None)
    attr.set_authorization_mode(
        io.GLOBUS_IO_SECURE_AUTHORIZATION_MODE_CALLBACK, authData)
    attr.set_channel_mode(
        io.GLOBUS_IO_SECURE_CHANNEL_MODE_GSI_WRAP)

    soc = GSITCPSocket.GSITCPSocket()
    port = soc.create_listener(attr)
    soc.listen()
    childSoc = soc.accept(attr)
    buf = Buffer.Buffer(size)
    bytesRead = childSoc.read(buf, size, size)

We will develop a similar example for Java. It is already available as part of the GASS server.


TCP Client example

    attr = NetIOAttr.TCPIOAttr()
    attr.set_authentication_mode(
        io.GLOBUS_IO_SECURE_AUTHENTICATION_MODE_GSS_API)
    authData = AuthData.AuthData()
    attr.set_authorization_mode(
        io.GLOBUS_IO_SECURE_AUTHORIZATION_MODE_SELF, authData)
    attr.set_channel_mode(
        io.GLOBUS_IO_SECURE_CHANNEL_MODE_GSI_WRAP)

    soc = GSITCPSocket.GSITCPSocket()
    soc.connect(host, port, attr)
    nBytes = soc.write(str, len(str))


Web Services in Grids

Provides a uniform, lightweight, standards-based framework for:
– Service description (WSDL)
– Discovery (UDDI / WSIL)
– Invocation (SOAP)

Allows the use of standard tools
Separates the interface from implementation details
– Including separating service definition from protocol bindings

Language/programming model agnostic
Reduces the cost of building scientific applications
Uses SOAP over HTTP over the Grid Security Infrastructure
– Similar to HTTPS, but supports delegation
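To make the invocation step concrete, the Python 3 sketch below builds the XML envelope that a SOAP 1.1 call carries, using only the standard library. The namespace `urn:example-Echo`, the method name `echo`, and the helper `soap_request` are hypothetical. Transport and GSI security are out of scope here.

```python
import xml.etree.ElementTree as ET

SOAP_ENV = "http://schemas.xmlsoap.org/soap/envelope/"  # SOAP 1.1 envelope namespace


def soap_request(namespace, method, **params):
    """Serialize a method call as Envelope > Body > method > parameters."""
    envelope = ET.Element("{%s}Envelope" % SOAP_ENV)
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_ENV)
    call = ET.SubElement(body, "{%s}%s" % (namespace, method))
    for name, value in params.items():
        ET.SubElement(call, name).text = str(value)
    return ET.tostring(envelope, encoding="unicode")


xml_text = soap_request("urn:example-Echo", "echo", s="spam")
print(xml_text)
```

Because the message is plain XML, the same request can be produced or consumed from any language, which is the language-agnostic property noted above.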


GSISOAP

Built on SOAP.py, looking at switching to ZSI

GSISOAP client example

    from pyGlobus import GSISOAP, ioc

    proxy = GSISOAP.SOAPProxy("https://host.lbl.gov:8081", namespace="urn:gtg-Echo")
    proxy.channel_mode = ioc.GLOBUS_IO_SECURE_CHANNEL_MODE
    proxy.delegation_mode = ioc.GLOBUS_IO_SECURE_DELEGATION_MODE_NONE
    print(proxy.echo("spam, spam, spam, eggs, and spam"))


GSISOAP (cont.)

GSISOAP server example

    from pyGlobus import GSISOAP, ioc

    def echo(s, _SOAPContext):
        cred = _SOAPContext.delegated_cred
        # Do something useful with cred here
        return s

    server = GSISOAP.SOAPServer("host.lbl.gov", 8081)
    server.channel_mode = ioc.GLOBUS_IO_SECURE_CHANNEL_MODE_GSI_WRAP
    server.delegation_mode = ioc.GLOBUS_IO_SECURE_DELEGATION_MODE_FULL_PROXY
    server.registerFunction(SOAP.MethodSig(echo, keywords=0, context=1), "urn:gtg-Echo")
    server.serve_forever()


Reliable File Transfer Service

A Grid Web Service that:
– Provides a simple interface to reliably transfer data between GridFTP servers
– Supports reliable transfers in the face of network outages, machine crashes, and other faults
– Supports the automated tuning of network parameters to optimize bandwidth


RFT Usage

Components: Client, WSDL Repository (UDDI?, WSIL?), Reliable File Transfer Service, two GridFTP servers

– Client: retrieve WSDL from the WSDL repository
– Client: request transfer, and delegate GSI proxy to the Reliable File Transfer Service (SOAP + GSI)
– Reliable File Transfer Service: initiate GridFTP third-party transfer between the GridFTP servers using the delegated GSI credential


Fault Recovery

GridFTP Server GridFTP Server

Reliable File Transfer Service

Restart Markers are periodically returned

Network Partition occurs


Fault Recovery

GridFTP Server GridFTP Server

Reliable File Transfer Service

Restart Markers are periodically returned

Restart transfer from last restart marker


Fault Recovery (cont.)

GridFTP Server GridFTP Server

Reliable File Transfer Service

Restart Markers are periodically returned and written to the DB

DB

Machine crashes


Fault Recovery (cont.)

[Figure: the Reliable File Transfer Service, backed by a DB, driving a transfer between two GridFTP servers]

– Restart markers are periodically returned and written to the DB
– After machine reboot, recover incomplete transfers from the DB
– Restart transfer from last restart marker
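Persisting restart markers so that incomplete transfers can be found after a reboot might look like the following sketch. It uses Python's standard sqlite3 module as a stand-in for the DB. The table layout, the function names, and the example.org URLs are all assumptions made for illustration; the slides do not describe the real schema.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the marker store, creating the table on first use."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS transfers ("
        " handle TEXT PRIMARY KEY, src TEXT, dest TEXT,"
        " marker INTEGER NOT NULL, size INTEGER NOT NULL)"
    )
    return db


def record_transfer(db, handle, src, dest, size):
    """Register a new transfer with its restart marker at offset 0."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT INTO transfers VALUES (?, ?, ?, 0, ?)",
            (handle, src, dest, size),
        )


def save_marker(db, handle, marker):
    """Persist the latest restart marker for a transfer."""
    with db:
        db.execute(
            "UPDATE transfers SET marker = ? WHERE handle = ?",
            (marker, handle),
        )


def incomplete_transfers(db):
    """Return the transfers that must be restarted after a reboot."""
    rows = db.execute(
        "SELECT handle, src, dest, marker, size FROM transfers"
        " WHERE marker < size ORDER BY handle"
    )
    return rows.fetchall()


db = open_db()
record_transfer(db, "t1", "gsiftp://a.example.org/f1", "gsiftp://b.example.org/f1", 100)
record_transfer(db, "t2", "gsiftp://a.example.org/f2", "gsiftp://b.example.org/f2", 50)
save_marker(db, "t1", 40)   # t1 was interrupted part-way through
save_marker(db, "t2", 50)   # t2 finished
pending = incomplete_transfers(db)
print(pending)
```

With a file path in place of ":memory:", the same query run after a restart would return every transfer whose marker had not yet reached its size, along with the offset to resume from.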


RFT WSDL

<definitions name="FileTransferService" namespace stuff …>
  <message name="submitTransferRequest">
    <part name="srcUrl" type="xsd:string"/>
    <part name="destUrl" type="xsd:string"/>
    <part name="priority" type="xsd:int"/>
  </message>
  <message name="submitTransferResponse">
    <part name="return" type="xsd:string"/>
  </message>
  <message name="checkStatusRequest">
    <part name="handle" type="xsd:string"/>
  </message>
  <message name="checkStatusResponse">
    <part name="return" type="xsd:string"/>
  </message>
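To see what a client learns from such a description, the sketch below parses the message definitions with Python's standard xml.etree.ElementTree module. The slide elides the namespace declarations, so the WSDL 1.1 and XML Schema namespaces used here are assumptions, as are the closing tag and the helper name message_parts.

```python
import xml.etree.ElementTree as ET

# Assumed namespace: the slide elides it ("namespace stuff").
WSDL_NS = "http://schemas.xmlsoap.org/wsdl/"

WSDL_TEXT = """\
<definitions name="FileTransferService"
             xmlns="http://schemas.xmlsoap.org/wsdl/"
             xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <message name="submitTransferRequest">
    <part name="srcUrl" type="xsd:string"/>
    <part name="destUrl" type="xsd:string"/>
    <part name="priority" type="xsd:int"/>
  </message>
  <message name="submitTransferResponse">
    <part name="return" type="xsd:string"/>
  </message>
  <message name="checkStatusRequest">
    <part name="handle" type="xsd:string"/>
  </message>
  <message name="checkStatusResponse">
    <part name="return" type="xsd:string"/>
  </message>
</definitions>
"""


def message_parts(wsdl_text):
    """Map each message name to its list of (part name, type) pairs."""
    root = ET.fromstring(wsdl_text)
    result = {}
    for message in root.findall("{%s}message" % WSDL_NS):
        parts = [
            (part.get("name"), part.get("type"))
            for part in message.findall("{%s}part" % WSDL_NS)
        ]
        result[message.get("name")] = parts
    return result


messages = message_parts(WSDL_TEXT)
for name, parts in messages.items():
    print(name, parts)
```

The output shows the two request/response pairs: a submit call that takes a source URL, a destination URL, and a priority and returns a string, and a status call that takes a handle and returns a string.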


RFT Internals

[Figure: a client talks to the service through a GSI SOAP interface; inside the service are a persistent DB, a manager thread, a transfer queue, transfer threads, and an optimization module; the transfer threads drive two GridFTP servers]

– Submit transfer request using 2 phase commit protocol
– Write transfer to DB and then to queue
– Read next transfer
– Obtain TCP buffer size
– Initiate transfer
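The flow on this slide (write to the DB, enqueue, then let transfer threads pick up work) can be sketched with Python's standard queue and threading modules. This is a simplified illustration, not the RFT implementation: a dictionary stands in for the persistent DB, the two-phase commit step is omitted, the buffer size is a fixed placeholder, and the transfer itself is a no-op. All class and method names are hypothetical.

```python
import queue
import threading


class TransferService:
    """Toy model of a DB-backed transfer queue with worker threads."""

    def __init__(self, num_threads=2):
        self.db = {}                        # stand-in for the persistent DB
        self.lock = threading.Lock()
        self.transfer_queue = queue.Queue()
        self.threads = [
            threading.Thread(target=self._worker, daemon=True)
            for _ in range(num_threads)
        ]
        for thread in self.threads:
            thread.start()

    def submit(self, src_url, dest_url):
        """Write the transfer to the DB first, then to the queue."""
        with self.lock:
            handle = "transfer-%d" % (len(self.db) + 1)
            self.db[handle] = {"src": src_url, "dest": dest_url, "status": "queued"}
        self.transfer_queue.put(handle)
        return handle

    def status(self, handle):
        with self.lock:
            return self.db[handle]["status"]

    def _worker(self):
        while True:
            handle = self.transfer_queue.get()   # read next transfer
            if handle is None:                   # shutdown sentinel
                self.transfer_queue.task_done()
                return
            with self.lock:
                record = self.db[handle]
                record["status"] = "active"
            buffer_size = self._tcp_buffer_size(record)
            self._run_transfer(record, buffer_size)
            with self.lock:
                record["status"] = "done"
            self.transfer_queue.task_done()

    def _tcp_buffer_size(self, record):
        return 64 * 1024     # placeholder; a real service would measure the path

    def _run_transfer(self, record, buffer_size):
        pass                 # placeholder for the third-party GridFTP transfer

    def wait(self):
        """Block until every queued transfer has been processed."""
        self.transfer_queue.join()

    def shutdown(self):
        for _ in self.threads:
            self.transfer_queue.put(None)
        for thread in self.threads:
            thread.join()


service = TransferService()
handle = service.submit("gsiftp://a.example.org/f", "gsiftp://b.example.org/f")
service.wait()
final_status = service.status(handle)
service.shutdown()
print(handle, final_status)
```

Recording the request before enqueuing it means a crash between the two steps loses no work: a real service could rebuild its queue from the stored records.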


RFT Usage Example

from rftClient import *

handle = ""
srcUrl = "gsiftp://host.lbl.gov/data/climate_model_1234"
destUrl = "gsiftp://lorenz.ncar.edu/tmp/climate_model_1234"

# Submit the transfer request; the returned handle identifies it.
try:
    handle = submit_transfer(srcUrl, destUrl)
except TransferRequestException as ex:
    print(ex)
…
# Query the state of the transfer using the handle.
try:
    status = check_transfer_status(handle)
except TransferRequestException as ex:
    print(ex)
print(status)
…
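A client would normally check the status repeatedly until the transfer finishes. The sketch below shows one way to write that polling loop. Since the rftClient module is not available here, it defines stand-in versions of submit_transfer, check_transfer_status, and TransferRequestException. The status strings ("Pending", "Active", "Done", "Failed"), the handle value, and the wait_for_transfer helper are assumptions, not part of the API shown on the slide.

```python
import time


class TransferRequestException(Exception):
    """Stand-in for the exception used in the slide's example."""


def make_fake_client(statuses):
    """Return stub (submit_transfer, check_transfer_status) functions.

    The status stub replays the given non-empty list of statuses, one
    per call, and keeps returning the last one once the list runs out.
    """
    remaining = list(statuses)

    def submit_transfer(src_url, dest_url):
        if not (src_url.startswith("gsiftp://") and dest_url.startswith("gsiftp://")):
            raise TransferRequestException("expected gsiftp:// URLs")
        return "handle-1"

    def check_transfer_status(handle):
        if handle != "handle-1":
            raise TransferRequestException("unknown handle: %s" % handle)
        return remaining.pop(0) if len(remaining) > 1 else remaining[0]

    return submit_transfer, check_transfer_status


def wait_for_transfer(handle, check_status, poll_interval=0.0, max_polls=10,
                      final_states=("Done", "Failed")):
    """Poll until the transfer reaches a final state or polls run out."""
    for _ in range(max_polls):
        status = check_status(handle)
        if status in final_states:
            return status
        time.sleep(poll_interval)
    raise TimeoutError("transfer %s did not finish" % handle)


submit_transfer, check_transfer_status = make_fake_client(["Pending", "Active", "Done"])
handle = submit_transfer("gsiftp://host.lbl.gov/data/climate_model_1234",
                         "gsiftp://lorenz.ncar.edu/tmp/climate_model_1234")
final = wait_for_transfer(handle, check_transfer_status)
print(final)
```

Passing the status function in as an argument keeps the loop independent of the client library, so the stub can later be swapped for a real status call.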


Many more CoGs

– Perl
– JSP
– CORBA
– …
– GUIs
– Applications


Java CoG Requirements

JDK 1.2
Security
– Needs a security package such as IAIK
– We are able to replace security packages, based on a couple of common functions, with a public domain package
– We hope that these packages will reach the maturity of IAIK


Python CoG Requirements

Python 2.0+
The GT2-beta (or better) release of Globus
Support for dynamic libraries
Currently pyGlobus is tested on
– Linux
– Solaris
– Should compile
– Will support win32 when Globus is ported


Review

Modifications to core Globus
– HTTP-based protocols
– APIs are not sufficient: we need services
Components useful to enhance grid infrastructure
– JavaBeansTM can be used to integrate with COTS.
– Java and Python have a rich set of libraries (network, LDAP, ...).
– Grid programming in Java and Python is easier!
– Java uses an event model, not callbacks
– GUIs do not have everything: we need scripting!


References

Java CoG Kit
– http://www.globus.org/cog
Python CoG Kit
– http://www-itg.lbl.gov/Grid/projects/pyGlobus
Globus
– http://www.globus.org
SWIG
– http://www.swig.org

[email protected]

[email protected]


Acknowledgements

This research is supported by the Mathematical, Information, and Computational Science Division subprogram of the Office of Advanced Scientific Computing Research, U.S. Department of Energy, under Contract W-31-109-Eng-38 and under Contract DE-AC03-76SF00098 with the University of California. Globus research and development is supported by DARPA, DOE, and NSF.

