2-1.1 Job Submission Slides for Grid Computing: Techniques and Applications by Barry Wilkinson, Chapman & Hall/CRC press, © 2009. Chapter 2, pp. 35-59. For educational use only. All rights reserved. Aug 24, 2009

2-1.1

Job Submission


Types of jobs to be submitted to a Grid

• Programs written in C, C++, … that need to be compiled.

• Java programs that need a Java Virtual Machine

• Pre-compiled application packages

2-1.2

Submitting a job that needs to be compiled

2-1.3

Fig. 2.1

Java programs

• Quite similar to compiling C programs, except Java compiler (javac) creates class file (bytecode) that is interpreted by a Java Virtual Machine (java).

• It is the Java Virtual Machine that is the executing program and the class file is an input file.

• Other class files usually need to be called too, found in path specified by CLASSPATH variable, so this variable must be set up properly.

2-1.4

Submitting a Java job

2-1.5

Fig. 2.2

• Java programs offer more portability because class file could be sent to any remote computer having a Java Virtual Machine installed.

• However, speed of execution may be less than executing fully compiled binaries.

• Some studies have shown Java programs to run at about 70% of the speed of equivalent C programs.

• Many internal components of Grid middleware software such as Globus actually use a mixture of Java and C. Java commonly used to create Web service components.

2-1.6

Types of Applications

Since a Grid is a collection of computers, user might wish to use these computers collectively to solve problems.

Two ways:

• Parallel programs -- Break problem down into tasks that need to be done to solve problem and submit individual tasks to different computers to work on them simultaneously.

• Parameter sweep problems -- Run same job on different computers at same time but with different input parameters.

Particularly attractive for Grid computing platforms because no dependences between each sweep (usually).
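
A parameter sweep can be sketched as generating one independent submission per parameter value. The sketch below only builds the command lines and does not run them. The program name prog1, the parameter values and the EPR file names are hypothetical; the only globusrun-ws options used are ones described elsewhere in these slides (-submit, -b, -o, -c).

```python
# Sketch: build one globusrun-ws command per sweep value (nothing is executed).
# "prog1", the values and the jobN.epr file names are hypothetical placeholders.

def sweep_commands(program, values):
    """Return one argv list per parameter value, each a batch submission."""
    commands = []
    for i, value in enumerate(values):
        epr_file = f"job{i}.epr"           # file that receives the job ID (EPR)
        commands.append([
            "globusrun-ws", "-submit",
            "-b",                           # batch mode: return immediately
            "-o", epr_file,                 # save the job ID for later queries
            "-c", program, str(value),      # -c and the program come last
        ])
    return commands


if __name__ == "__main__":
    for cmd in sweep_commands("prog1", [0.1, 0.2, 0.5]):
        print(" ".join(cmd))
```

Because each sweep is independent, the generated commands could be issued in any order or to different hosts.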

2-1.7

Grid Resource Allocation Management (GRAM)

Principal job submission component of Globus

2-1.8

Figure: Globus Open Source Grid Software (I. Foster). Component diagram showing Web services and non-WS components from GT2, GT3 and GT4, grouped under Security, Data Management, Execution Management, Information Services and Common Runtime. Highlighted component: GRAM.

2-1.10

Job submission components

Fig. 2.3

Running simple jobs across a Grid computing environment

2-1.11

Fig. 2.4

Specifying the job

Two basic ways a job might be specified:

• Directly by name of executable with required input arguments

or

• By a job description file (more powerful)

2-1.12

Directly

For very simple jobs, one can submit a single job using the -c option, e.g.,

globusrun-ws -submit -c prog1 arg1 arg2

which executes program prog1 with arguments arg1 and arg2 on the local host.

-c option actually causes globusrun-ws to generate a job description with the named program and arguments that follow.

-c option must be the last globusrun-ws option (why?).

2-1.13
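
One way to think about the "why?": everything that follows -c is taken as the program name and its arguments, so any option placed after -c would be handed to the program instead. The sketch below is an illustrative model of that behaviour only; it is not the actual globusrun-ws parser, and the function name is hypothetical.

```python
# Illustrative model only: this is NOT globusrun-ws's real option parser.

def split_at_c(argv):
    """Split argv into (tool_options, program, program_args) at the first -c."""
    if "-c" not in argv:
        return argv, None, []
    i = argv.index("-c")
    rest = argv[i + 1:]
    if not rest:
        raise ValueError("-c needs a program name")
    return argv[:i], rest[0], rest[1:]


if __name__ == "__main__":
    # -s placed before -c is treated as a tool option ...
    print(split_at_c(["-submit", "-s", "-c", "/bin/echo", "hello"]))
    # ... but placed after -c it becomes an argument of /bin/echo.
    print(split_at_c(["-submit", "-c", "/bin/echo", "hello", "-s"]))
```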

Example

globusrun-ws -submit -c /bin/echo hello

Globus job monitoring output is created on the command line and will indicate that the job completes.

However, output from the echo program (hello) is not displayed and is lost, as is any standard output without further specification (see later).

1b.14

2-1.15

Job Description File

Gives details such as:

• Job Description
  - Name of executable
  - Number of instances
  - Arguments
  - Input files
  - Output files
  - Directories
  - Environment variables, paths, ...

• Resource requirements
  - Processor
    - Number, cores, ...
    - Type
    - Speed, ...
  - Memory

Used to match job with resources

Job Description Languages

Several languages invented.

• Globus-specific:
  – Globus 1 and 2 used their Resource Specification Language, RSL (version 1)
  – Globus 3 used an XML version called RSL-2
  – Globus 4 uses a variation of RSL-2 in a JDD (Job Description Document)

• Job Submission Description Language (JSDL)
  – A recent industry-wide standard (2005)

2-1.16

2-1.17

Resource Specification Language

RSL version 1

• A meta-language describing job and its required execution.

Provides specification for:

• Job description - directory, executable, arguments, environment
• Resource requirements - machine type, number of nodes, memory, etc.

2-1.18

RSL Version 1 examples

Constraints Example

Conjunction (AND): &

• To create 3-5 instances of myProg, each on a machine with at least 64 Mbytes memory available to me for 1 hour:

&(executable=myProg)(count>=3)(count<=5)(memory>=64)(max_time=60)

2-1.19

Constraints Example

Disjunction (OR): |

• To create 5 instances of myProg, each on a machine with at least 64 Mbytes memory, or 7 instances of myProg, each on a machine with at least 32 Mbytes memory:

&(executable=myProg)(|(&(count=5)(memory>=64))(&(count=7)(memory>=32)))
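
The meaning of the & (AND) and | (OR) operators in the two constraint examples can be sketched as plain Boolean predicates. This is not an RSL parser; the function names and the idea of testing a single (count, memory) pair are assumptions made purely for illustration.

```python
# Sketch: the & (AND) and | (OR) constraint examples written as predicates.
# Not an RSL parser; function names and inputs are hypothetical.

def conjunction_ok(count, memory_mb):
    """&(count>=3)(count<=5)(memory>=64): every clause must hold."""
    return 3 <= count <= 5 and memory_mb >= 64


def disjunction_ok(count, memory_mb):
    """|(&(count=5)(memory>=64))(&(count=7)(memory>=32)): either group may hold."""
    return (count == 5 and memory_mb >= 64) or (count == 7 and memory_mb >= 32)


if __name__ == "__main__":
    print(conjunction_ok(4, 128))   # all clauses satisfied
    print(conjunction_ok(6, 128))   # violates count<=5
    print(disjunction_ok(7, 48))    # second alternative satisfied
    print(disjunction_ok(5, 48))    # neither alternative satisfied
```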

2-1.20

Requesting multiple resources

multirequest: +

• To execute 5 instances of myProg1 on a machine with at least 64 Mbytes memory and execute 2 instances of myProg2:

+(&(count=5)(memory>=64)(executable=myProg1))(&(count=2)(executable=myProg2))

XML Job Description languages

• With the introduction of XML in the early 2000s, job description languages began to be changed to XML.

2-1.21

2-1.22

Using XML

• Much more elegant and flexible, and in keeping with Web services.

• Can use XML parsers.

• Allows more powerful mechanisms with job schedulers.

• Resource scheduler/broker applies specification to local resources.

2-1.23

Resource Specification Language, RSL version 2

• XML job description language used in Globus version 3 (GT3).

2-1.24

RSL-2

• XML version of RSL 1

• Can specify everything from executable, paths, arguments, input/output, error file, number of processes, max/min execution time, max/min memory, job type, etc.

2-1.25

GT 3 RSL-2 Example

Specifying Executable

(executable=/bin/echo)

<gram:executable>
  <rsl:path>
    <rsl:stringElement value="/bin/echo"/>
  </rsl:path>
</gram:executable>

2-1.26

RSL and GT 3.2 RSL-2 comparison for echo program

&((executable=/bin/echo)

(directory="/bin")

(arguments="Hello World")

(stdin=/dev/null)

(stdout="stdout")

(stderr="stderr")

(count=1)

)

<?xml version="1.0" encoding="UTF-8"?>
<rsl:rsl xmlns:rsl="http://www.globus.org/namespaces/2003/04/rsl"
  xmlns:gram="http://www.globus.org/namespaces/2003/04/rsl/gram"
  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
  xsi:schemaLocation="
    http://www.globus.org/namespaces/2003/04/rsl
    c:/ogsa-3.0/schema/base/gram/rsl.xsd
    http://www.globus.org/namespaces/2003/04/rsl/gram
    c:/ogsa-3.0/schema/base/gram/gram_rsl.xsd">
  <gram:job>
    <gram:executable>
      <rsl:path>
        <rsl:stringElement value="/bin/echo"/>
      </rsl:path>
    </gram:executable>
    <gram:directory>
      <rsl:path>
        <rsl:stringElement value="/bin"/>
      </rsl:path>
    </gram:directory>
    <gram:arguments>
      <rsl:string>
        <rsl:stringElement value="Hello World"/>
      </rsl:string>
    </gram:arguments>
    <gram:stdin>
      <rsl:path>
        <rsl:stringElement value="/dev/null"/>
      </rsl:path>
    </gram:stdin>
    <gram:stdout>
      <rsl:pathArray>
        <rsl:path>
          <rsl:substitutionRef name="HOME"/>
          <rsl:stringElement value="/stdout"/>
        </rsl:path>
      </rsl:pathArray>
    </gram:stdout>
    <gram:stderr>
      <rsl:pathArray>
        <rsl:path>
          <rsl:substitutionRef name="HOME"/>
          <rsl:stringElement value="/stderr"/>
        </rsl:path>
      </rsl:pathArray>
    </gram:stderr>
    <gram:count>
      <rsl:integer value="1"/>
    </gram:count>
    <gram:jobType>
      <gram:enumeration>
        <gram:enumerationValue> <gram:multiple/> </gram:enumerationValue>
      </gram:enumeration>
    </gram:jobType>
    <gram:gramMyJobType>
      <gram:enumeration>
        <gram:enumerationValue> <gram:collective/> </gram:enumerationValue>
      </gram:enumeration>
    </gram:gramMyJobType>
    <gram:dryRun> <rsl:boolean value="false"/> </gram:dryRun>
    <gram:saveState> <rsl:boolean value="true"/> </gram:saveState>
    <gram:twoPhase> <rsl:integer value="600"/> </gram:twoPhase>
  </gram:job>
</rsl:rsl>

2-1.27

Job Description Document (JDD)

• RSL-2 was renamed JDD in the more recent Globus 4 (GT4) documents.

• Similar to original RSL-2 but simplified syntax.

• Not completely interchangeable.

2-1.28

GT 4 JDD Example

Specifying Executable

executable=/bin/echo

<executable>/bin/echo</executable>

2-1.29

GT 4.0 JDD for echo program

<?xml version="1.0" encoding="UTF-8"?>
<job>
  <executable>/bin/echo</executable>
  <directory>${GLOBUS_USER_HOME}</directory>
  <argument>Hello</argument>
  <argument>World</argument>
  <stdout>${GLOBUS_USER_HOME}/stdout</stdout>
  <stderr>${GLOBUS_USER_HOME}/stderr</stderr>
</job>
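
A job description of this shape can also be generated programmatically rather than typed by hand. The following is a minimal sketch using Python's standard xml.etree.ElementTree; the helper name build_jdd is hypothetical, and only the element names that appear in the echo example (job, executable, directory, argument, stdout, stderr) are used.

```python
# Sketch: generate a JDD like the echo example with the standard library.
# build_jdd is a hypothetical helper, not part of Globus.
import xml.etree.ElementTree as ET


def build_jdd(executable, arguments, stdout, stderr, directory=None):
    """Return a JDD document as a string, one <argument> element per argument."""
    job = ET.Element("job")
    ET.SubElement(job, "executable").text = executable
    if directory is not None:
        ET.SubElement(job, "directory").text = directory
    for arg in arguments:
        ET.SubElement(job, "argument").text = arg
    ET.SubElement(job, "stdout").text = stdout
    ET.SubElement(job, "stderr").text = stderr
    return ET.tostring(job, encoding="unicode")


if __name__ == "__main__":
    print(build_jdd("/bin/echo", ["Hello", "World"],
                    "${GLOBUS_USER_HOME}/stdout",
                    "${GLOBUS_USER_HOME}/stderr",
                    directory="${GLOBUS_USER_HOME}"))
```

The resulting text could be saved to a file and passed to globusrun-ws with the -f option.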

Job Submission Description Language (JSDL)

• A standard introduced by GGF (Global Grid Forum) in 2005 and beginning to be widely adopted.

2-1.30

Basic JSDL structure

<JobDefinition>

<JobDescription>

<JobIdentification> ... </JobIdentification>

<Application> ... </Application>

<Resources> ... </Resources>

<DataStaging> ... </DataStaging>

</JobDescription>

</JobDefinition>

2-1.31

For executables operating in a Linux (POSIX) environment, <Application> contains a <POSIXApplication> element:

<POSIXApplication name="xsd: ... ">

<Executable> ... </Executable>

<Argument> ... </Argument>

<Input> ... </Input>

<Output> ... </Output>

<Error> ... </Error>

<WorkingDirectory> ... </WorkingDirectory>

</POSIXApplication>

POSIX: Portable Operating System Interface, a collection of IEEE standards that define APIs, compatible with most versions of Unix/Linux.

2-1.32

Sample Linux job description

<?xml version="1.0" encoding="UTF-8"?>

<jsdl:JobDefinition

xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"

xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">

<jsdl:JobDescription>

<jsdl:Application>

<JobName>Test Job</JobName>

<Description>Hello world Job</Description>

<jsdl-posix:POSIXApplication >

<jsdl-posix:Executable>/bin/echo</jsdl-posix:Executable>

<jsdl-posix:Argument>hello, world</jsdl-posix:Argument>

<jsdl-posix:Output>${GLOBUS_USER_HOME}/stdout</jsdl-posix:Output>

<jsdl-posix:Error>${GLOBUS_USER_HOME}/stderr</jsdl-posix:Error>

</jsdl-posix:POSIXApplication>

</jsdl:Application>

</jsdl:JobDescription>

</jsdl:JobDefinition>

2-1.33
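
Because JSDL is namespaced XML, a standard XML parser can pull the executable and arguments back out of a description. The sketch below uses Python's xml.etree.ElementTree on a trimmed copy of the sample; the function name and the "posix" prefix in the lookup table are arbitrary choices for illustration (elements are matched by namespace URI, not by prefix).

```python
# Sketch: read executable and arguments from a JSDL document.
# SAMPLE is a trimmed copy of the sample job description; names are illustrative.
import xml.etree.ElementTree as ET

NS = {
    "jsdl": "http://schemas.ggf.org/jsdl/2005/11/jsdl",
    "posix": "http://schemas.ggf.org/jsdl/2005/11/jsdl-posix",
}

SAMPLE = """\
<jsdl:JobDefinition
    xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
    xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
  <jsdl:JobDescription>
    <jsdl:Application>
      <jsdl-posix:POSIXApplication>
        <jsdl-posix:Executable>/bin/echo</jsdl-posix:Executable>
        <jsdl-posix:Argument>hello, world</jsdl-posix:Argument>
      </jsdl-posix:POSIXApplication>
    </jsdl:Application>
  </jsdl:JobDescription>
</jsdl:JobDefinition>
"""


def executable_and_args(jsdl_text):
    """Return (executable, [arguments]) from the POSIXApplication element."""
    root = ET.fromstring(jsdl_text)
    app = root.find(".//posix:POSIXApplication", NS)
    exe = app.find("posix:Executable", NS).text
    args = [a.text for a in app.findall("posix:Argument", NS)]
    return exe, args


if __name__ == "__main__":
    print(executable_and_args(SAMPLE))
```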

<Resources> describes requirements of resources for job and can include:

<Resources>

<CandidateHosts> ...</CandidateHosts>

<FileSystem> ... </FileSystem>

<ExclusiveExecution> ... </ExclusiveExecution>

<OperatingSystem> ... </OperatingSystem>

<CPUArchitecture> ... </CPUArchitecture>

<IndividualCPUSpeed> ... </IndividualCPUSpeed>

<IndividualCPUTime> ... </IndividualCPUTime>

<IndividualCPUCount> ... </IndividualCPUCount>

<IndividualNetworkBandwidth> ... </IndividualNetworkBandwidth>

<IndividualPhysicalMemory> ... </IndividualPhysicalMemory>

<IndividualVirtualMemory> ... </IndividualVirtualMemory>

<IndividualDiskSpace> ... </IndividualDiskSpace>

<TotalCPUTime> ... </TotalCPUTime>

<TotalCPUCount> ... </TotalCPUCount>

<TotalPhysicalMemory> ... </TotalPhysicalMemory>

<TotalVirtualMemory> ... </TotalVirtualMemory>

<TotalDiskSpace> ... </TotalDiskSpace>

<TotalResourceCount> ... </TotalResourceCount>

</Resources>

2-1.34

Submitting a job

2-1.35

2-1.36

GT4 job submission command globusrun-ws

• Submit and monitor GRAM jobs

• Written in C, for faster startup and execution than earlier Java version

• Supports multiple and single job submission

• Handles credential management

• Streaming of job stdout/err during execution

2-1.37

Simple job submission

• Step 1: Create proxy with the grid-proxy-init command.

• Step 2: Issue globusrun-ws with parameters to specify job.

2-1.38

Some globusrun-ws flags (options) for job submission

2-1.39

Running GT 4 Job using XML job description file

• Command:

globusrun-ws -submit -f prog.xml

where prog.xml specifies job in JDD.

-submit causes job to be submitted.

Submitted to localhost (machine that is executing command) as no contact resource specified.

Submitted immediately using “fork”.

2-1.40

With named executable: -c option

Example: Submit program echo with argument hello to default localhost.

globusrun-ws -submit -c /bin/echo hello

-c causes globusrun-ws to generate job description with named program and arguments.

-c option, if used, must be last option.

Only useful for very simple single jobs.

2-1.41

Output modes

-submit Submits (or resubmits) a job in one of three output modes: batch, interactive, or interactive-streaming.

Default (without additional flags to specify) is interactive.

2-1.42

Interactive mode

Example

Submit program echo with argument hello to default localhost.

% globusrun-ws -submit -c /bin/echo hello

Submitting job...Done.
Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f
Termination time: 07/20/2005 17:44 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.

The output shows the job ID, followed by the several states that the job goes through.
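
If this output is captured by a script, the job ID and the sequence of states can be picked out line by line. The following is a minimal sketch that assumes the output has exactly the line formats shown in the example ("Job ID: ..." and "Current job state: ..."); the function name is hypothetical.

```python
# Sketch: extract job ID and job states from captured globusrun-ws output.
# Assumes the line formats shown in the example; parse_output is hypothetical.
import re

SAMPLE_OUTPUT = """\
Submitting job...Done.
Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f
Termination time: 07/20/2005 17:44 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
"""


def parse_output(text):
    """Return (job_id, [states in the order they were reported])."""
    job_id = None
    states = []
    for line in text.splitlines():
        m = re.match(r"Job ID:\s*(\S+)", line)
        if m:
            job_id = m.group(1)
        m = re.match(r"Current job state:\s*(\S+)", line)
        if m:
            states.append(m.group(1))
    return job_id, states


if __name__ == "__main__":
    print(parse_output(SAMPLE_OUTPUT))
```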

Streaming

Refers to sending contents of a stream of data from one location to another location as it is generated.

Often associated with Linux standard output and standard error streams, stdout and stderr.

For a program that creates output on remote machine, need:

• Files to hold output and error messages, or
• Re-direct output and error messages to user console.

2-1.43

Interactive-streaming mode: -s option

Provides for capturing program output and error messages and re-directing them to user’s console (output of globusrun-ws) or to specified files.

2-1.44

Interactive-streaming mode: re-direction to user console

-s option

Example

globusrun-ws -submit -s -c /bin/echo hello

Output (hello) redirected to (globusrun-ws) stdout.

Error messages redirected to (globusrun-ws) stderr.

2-1.45

2-1.46

Interactive-streaming mode: re-direction to files

-s option with -so and -se options:

-s for streaming output

and

-so to specify output file
-se to specify error file

2-1.47

Example

globusrun-ws -submit -s -so outfile -se errorfile -c /bin/echo hello

where outfile is name of file holding output, errorfile is name of file holding error messages, and hello is argument for echo.

2-1.48

Specify streaming to files using job description file

Example (JDD)

<job>
  <executable>/bin/echo</executable>
  <argument>Hello</argument>
  <stdout>jobOut</stdout>
  <stderr>jobErr</stderr>
</job>

Batch submission

A long-standing Computer Science term from early days of computing, where jobs are submitted to a system in a group (a batch) and wait their turn to be executed sometime in the future.

Originally appeared when programs were submitted by punched cards to a shared system, perhaps to be run overnight. (The author remembers those days with frustration.)

Batch submission really part of a scheduling approach.

2-1.49

Batch submission: -b option

In globusrun-ws, batch referred to as an output mode because of way output generated.

Once job submitted, control returned to command line, and one will need to query system to find out status of job.

2-1.50

For example, suppose we ran the job:

globusrun-ws -submit -c /bin/sleep 100

in interactive mode. The command would return when program (sleep for 100 seconds in this case) completes.

We would get normal globusrun-ws output, such as:

Submitting job...Done.

Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f

Termination time: 07/20/2005 17:44 GMT

Current job state: Active

Current job state: CleanUp

Current job state: Done

Destroying job...Done.

only each line would appear as the job moves to its next state.

2-1.51

Alternatively, could execute sleep in batch output mode (-b option):

globusrun-ws -submit -b -c /bin/sleep 100

Output would immediately appear of the form:

Submitting job...Done.
Job ID: uuid:f9544174-60c5-11d9-97e3-0002a5ad41e5
Termination time: 01/08/2005 16:05 GMT

Displays ManagedJob EPR as job ID (more on this later).

Control returned to command line.

Program may not have finished. In this case it will not finish for 100 seconds.

2-1.52

Now one has to query state of job to find out when it completes.

Need job ID (ManagedJob EPR)

Convenient to have that put in a file using -o option when submitting job, e.g.

globusrun-ws -submit -b -o jobEPR -c /bin/sleep 100

where jobEPR holds the job ID (ManagedJob EPR).

2-1.53

To watch status of submitted job

“Attach” interactive monitoring with -monitor option.

Job ID (ManagedJob EPR) provided with -j option, e.g.:

globusrun-ws -monitor -j jobEPR

where jobEPR holds ManagedJob EPR.

Then can see stages job goes through with interactive output immediately:

job state: Active

Current job state: CleanUp

Current job state: Done

Requesting original job description...Done.

Destroying job...Done

although job itself still batch output job.

2-1.54

2-1.55

Some other options

-status Reports the current state of the job and exits

-kill Requests immediate cancellation of job and exits.

2-1.56

2-1.57

Specifying where job is submitted

Request to run job processed by “factory” service called ManagedJobFactoryService.

Default URL:

https://localhost:8443/wsrf/services/ManagedJobFactoryService

2-1.58

To specify where job is submitted

-F Specifies “contact” for the job submission.

globusrun-ws -submit -F http://localhost:8440 -f prog1.xml

Job submitted to localhost, to the Globus container that hosts services, running on port 8440.

Factory service still located at wsrf/services/ManagedJobFactoryService
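
How a host and port combine with the fixed service path can be sketched as simple string composition. This is only an illustration: the helper name is hypothetical, and treating the scheme, host and port of the default URL as fallbacks is an assumption of the sketch, not a statement of how globusrun-ws resolves a contact.

```python
# Sketch: compose a factory service URL from host and port.
# factory_url is hypothetical; the defaults mirror the default URL in the slides.

SERVICE_PATH = "/wsrf/services/ManagedJobFactoryService"


def factory_url(host="localhost", port=8443, scheme="https"):
    """Return the full URL of the ManagedJobFactoryService on host:port."""
    return f"{scheme}://{host}:{port}{SERVICE_PATH}"


if __name__ == "__main__":
    print(factory_url())                        # default contact
    print(factory_url("140.221.65.193", 4444))  # a different host and port
```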

2-1.59

Selecting a different host

Example

globusrun-ws -submit -F https://140.221.65.193:4444/wsrf/services/ManagedJobFactoryService -f prog1.xml

2-1.60

Many other options

Example

-term time

Set an absolute termination time, or a time relative to successful job creation.

Transferring Files

2-1.61

Job submission command, for example:

globusrun-ws -submit -F http://coit-grid01.uncc.edu -c prog1

requires prog1 to exist on the remote machine in the default directory ( ${GLOBUS_USER_HOME} ).

Up to user to ensure executable is in place.

GridFTP

A Globus component that provides for:

• Large data transfers
• Secure transfers
• Fast transfers
  – Parallel transfers -- employing multiple virtual channels sharing a single physical network connection
  – Striping -- employing multiple physical channels using multiple hardware interfaces.
• Reliable transfers
• Third party transfers.

2-1.62

Figure: Globus Open Source Grid Software (I. Foster). Component diagram of GT2, GT3 and GT4 components. Highlighted component: GridFTP.

Third party transfers

Transferring a file from one remote location to another remote location controlled by a party at another location (the third party).

Already seen third party transfers in Grid portal at file management portlet.

There, user can initiate a transfer between two locations from portal running on a third system.

2-1.64

GridFTP third party transfers

2-1.65

Fig. 2.5

ReliableFileTransfer (RFT) service

GridFTP is not a Web/Grid service.

ReliableFileTransfer (RFT) service provides service interface and additional features for reliable file transfers (retry capabilities etc.).

RFT uses GridFTP servers to effect actual transfer.

2-1.66

Figure: Globus Open Source Grid Software (I. Foster). Component diagram of GT2, GT3 and GT4 components. Highlighted component: RFT.

Globus file transfer command globus-url-copy

Example

globus-url-copy gsiftp://www.coit-grid02.uncc.edu/~abw/hello file:///home/abw/

copies file hello from coit-grid02.uncc.edu (the source URL) to the local machine (the destination URL) using GridFTP.

User needs valid security credentials (a certificate and proxy).

2-1.68

Question

Why three /’s in file URL, i.e. file:/// ?

Answer

The general form of file URL is file://host/path. If host is omitted, it is assumed to be localhost, which leaves three /’s, i.e. file:///.
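
The empty host can be seen by splitting the two URLs from the globus-url-copy example with Python's standard urllib.parse. This is only an illustration of URL structure (scheme, host, path); the helper name is hypothetical and nothing here contacts a server.

```python
# Sketch: split a URL into scheme, host and path with the standard library.
# describe is a hypothetical helper; no network access is performed.
from urllib.parse import urlparse


def describe(url):
    """Return (scheme, host, path) for a URL."""
    parts = urlparse(url)
    return parts.scheme, parts.netloc, parts.path


if __name__ == "__main__":
    # Host part is empty for file:///..., so the path starts at the third slash.
    print(describe("file:///home/abw/"))
    print(describe("gsiftp://www.coit-grid02.uncc.edu/~abw/hello"))
```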

2-1.69

File Staging

Moving complete files to where they are needed.

Usually associated with input and output files.

Input files need to be moved to where program is located.

Output files generated need to be moved back to user, or used as input to other programs.

Note: different from input and output streaming, which moves a series of data items as a stream as it is generated.

2-1.70

File staging

2-1.71

Fig. 2.6

Staging example in JDD

<job>
  <fileStageOut>
    <transfer>
      <sourceUrl>file:///prog1Out</sourceUrl>
      <destinationUrl>gsiftp://coit-grid05.uncc.edu:2811/prog1Out</destinationUrl>
    </transfer>
  </fileStageOut>
</job>
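
Before submitting, it can be useful to list which transfers a job description asks for. The sketch below reads the (source, destination) pairs from a fileStageOut section using Python's xml.etree.ElementTree; the function name is hypothetical and the embedded document is a copy of the staging example.

```python
# Sketch: list (source, destination) pairs in a JDD fileStageOut section.
# stage_out_transfers is a hypothetical helper; JDD copies the staging example.
import xml.etree.ElementTree as ET

JDD = """\
<job>
  <fileStageOut>
    <transfer>
      <sourceUrl>file:///prog1Out</sourceUrl>
      <destinationUrl>gsiftp://coit-grid05.uncc.edu:2811/prog1Out</destinationUrl>
    </transfer>
  </fileStageOut>
</job>
"""


def stage_out_transfers(jdd_text):
    """Return a list of (sourceUrl, destinationUrl) tuples."""
    root = ET.fromstring(jdd_text)
    return [(t.findtext("sourceUrl"), t.findtext("destinationUrl"))
            for t in root.findall("./fileStageOut/transfer")]


if __name__ == "__main__":
    for source, destination in stage_out_transfers(JDD):
        print(source, "->", destination)
```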

2-1.72

Staging example in JSDL

<jsdl:DataStaging>
  <jsdl:FileName>/inputfiles/prog1Input</jsdl:FileName>
  <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
  <jsdl:Source>
    <jsdl:URI>gsiftp://coit-grid05.uncc.edu:2811/prog1Input</jsdl:URI>
  </jsdl:Source>
</jsdl:DataStaging>

2-1.73

2-1.74

Sources of GT 4 information

http://www.globus.org/toolkit/docs

2-1.75

Questions (multiple choice)

2-1.76

When one issues the GT4.0 command:

globusrun-ws -submit -F localhost:8440 -s -so hello1 -c /bin/echo hello

what is hello?

(a) A java class

(b) An xml file containing the description of the job to be run

(c) The executable to run in Globus

(d) The argument for the program that will be executed

2-1.77

When one issues the GT4.0 command:

globusrun-ws -submit -F localhost:8440 -s

-so hello1 -c /bin/echo hello

is the order of the flags important, and if so why?

(a) Not important

(b) Important: -c must be last as it uses the remaining arguments

(c) Important: -s must be before -so

(d) Important: -F must be first

2-1.78

When one issues the GT4.0 command:

globusrun-ws -submit -F localhost:8440 -s

-so hello1 -c /bin/echo hello

what is localhost?

(a) The server logged into running globusrun-ws.

(b) The computer you are using to log into the server

(c) None of the other answers.

2-1.79

What does the tag <count> specify in an RSL-2/JDD file?

(a) The number of different jobs submitted.

(b) The number of computers to use.

(c) The number of identical jobs to submit.

(d) The number of arguments.

