
3-1.1

Schedulers

Slides for Grid Computing: Techniques and Applications by Barry Wilkinson, Chapman & Hall/CRC press, © 2009. Chapter 3, pp. 65-99. For educational use only. All rights reserved. Aug 26, 2009

Job Schedulers

• Assigns work (jobs) to compute resources to meet specified job requirements within constraints of available resources and their characteristics

• An optimization problem.

• Objective usually to maximize throughput of jobs.

3-1.2

Job scheduler

3-1.3
Fig. 3.1

Scheduling Policies

Some traditional scheduling policies:

•First-in, first-out (longest waiting job)

•Favor certain types of jobs

•Shortest job first

•Smallest (or largest) memory first

•Short (or long) running job first

•Priority based

3-1.4

Job and resource matching

• Useful for a distributed heterogeneous computing platform such as a Grid platform

• Found in schedulers we will describe

• Requires both the characteristics of the job and of the resources to be described.

• For dynamic characteristics such as resource load, a mechanism is necessary for reporting them.

3-1.5

Types of jobs

1. Named executable that can execute on target resources, possibly with named input and output files.

2. OS (Linux) commands

3. Scripts

4. Programs that first need compiling.
   – For Java, executable is the JVM, input is the Java class file

5. Array jobs - multiple instances of same job executable

6. Workflow – series of interdependent jobs

3-1.6

Batch jobs

• Most jobs expected to be batch jobs. (Interactive jobs possible.)

• One of the expected types of jobs is the long-running, unattended batch job.

• Standard input, standard output and standard error often redirected to files.

3-1.7

Types of Compute Resources

• Usually local compute resources consist of individual computers, sometimes hundreds of them, connected together in a cluster.

• Such clusters have been around for many years

• Schedulers designed to handle cluster configurations

3-1.8

Typical cluster configuration

3-1.9
Fig. 3.2

Scheduling Compute Resources

Resource characteristics the scheduler will consider:

• Static characteristics of machines
  - Processor type, speed, number of cores, threads
  - Main memory, cache memory, ...

• Dynamic machine conditions:
  - Load on machine
  - Available disk storage
  - Network load

• Network connections and characteristics

• Characteristics of job:
  - Code size, data, expected exec. time, memory requirements
  - Location of input files, output files, input/output staging

• User preferences/requirements

3-1.10

Monitoring Job Progress

Schedulers monitor job progress and report back to the user.

Typically, a job exists in one of various states as it goes through processing, e.g.:

3-1.11
Fig. 3.3

Scheduler with automatic data placement components

(Input/output staging)

3-1.12
Fig. 3.4

e.g. Stork

Fault Tolerance

Checkpoint concept

3-1.13

Often available in schedulers

Fig. 3.5

Advance reservation

• Term used for requesting actions at times in the future

• In this context, requesting a job to start at some time in the future.

• Both computing resources and network resources are involved

• The network connection, usually the Internet, is not itself reserved.

• Found in recent schedulers

3-1.14

Reasons one might want advance reservation in Grid computing

• Reserved time chosen to reduce network or resource contention.

• Resources not physically available except at certain times.

• Jobs require access to a collection of resources simultaneously, e.g. data generated by experimental equipment.

• A deadline for results of work

• Parallel programming jobs in which jobs must communicate between themselves during execution.

• Workflow tasks in which the component jobs depend upon one another (a series of interdependent jobs).

6d-1.15

• Without advance reservation, schedulers will schedule jobs from a queue with no guarantee of when they will actually run.

Synchronization

• Critical that distributed resources are synchronized, i.e. they all “see” the same time.

• Synchronization can be achieved by running a Network Time Protocol (NTP) daemon synchronizing time with a public time server.
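For example (an illustrative snippet, not from the slides), ntpd on a Linux resource can be pointed at public pool servers in /etc/ntp.conf:

server 0.pool.ntp.org iburst
server 1.pool.ntp.org iburst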

6d-1.16

Resource Broker

3-1.17
Fig. 3.6

Intermediary between user and resources that negotiates for resources and brokers agreement. May involve negotiating cost.

3-1.18

Quiz

Give one reason why a scheduler or resource broker is used in conjunction with Globus:

(a) Globus does not provide the ability to submit jobs.

(b) Globus does not provide the ability to make advance reservations.

(c) No reason whatsoever.

(d) Globus does not provide the ability to transfer files.

3-1.19

In the context of schedulers, what is meant by the term “Advance Reservation”?

(a) Requesting an advance.

(b) Submitting a more advanced job.

(c) Move onto the next job.

(d) Requesting actions at a future time.

3-1.20

Scheduler Examples

• Sun Grid Engine

• Condor/Condor-G

(Sun) Grid Engine

Has all the usual features of a job scheduler including:

• Various scheduling algorithms
• Job and machine matching
• Multiple job queues
• Checkpointing for fault tolerance and job migration
• Multiple array jobs
• Parallel job support
• Advance reservation (from SGE version 6.2)
• Accounting
• Command line and GUI interfaces
• DRMAA interface (see later)

3-1.21

SGE Command line interface

3-1.22

6d-1.23

qsub

6d-1.24

Command to submit job.

Job specified as (shell) script, or as named executable with the -b yes option:

Example

qsub -b y uptime

The immediate output is of the form:

Your job 238 ("uptime") has been submitted.
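Array jobs (job type 5 earlier) are submitted with the -t option; a sketch, assuming a hypothetical script job.sh that reads the $SGE_TASK_ID environment variable to select its input:

qsub -t 1-10 job.sh

This submits one array job whose 10 tasks are scheduled independently.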

qstat

Command to display status of jobs.

qstat issued after previous qsub might produce display:

6d-1.25
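An illustrative display (job ID, user and times are hypothetical) might be:

job-ID  prior    name    user  state  submit/start at      queue  slots
------------------------------------------------------------------------
   238  0.55500  uptime  abw   qw     08/26/2009 10:15:02          1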

qw indicates waiting in queue.

Once job completed, qstat will display nothing.

By default, standard output and standard error are redirected to files named <job_name>.o<job_ID> and <job_name>.e<job_ID>.
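For the uptime job above (job ID 238), these files would be uptime.o238 and uptime.e238.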

Grid Engine Graphical User Interface

• Started with the qmon command.

3-1.26
Fig. 3.7

Grid Engine job submission GUI interface

3-1.27
Fig. 3.8

Grid Engine job submission - advanced section

3-1.28
Fig. 3.9

Submitting a job through GRAM and through an SGE scheduler

3-1.29
Fig. 3.10

Running Globus job with SGE scheduler using globusrun-ws command

Scheduler selected by name using the -Ft option (i.e. factory type).

Name for Sun Grid Engine (obviously) is SGE. Hence:

globusrun-ws -submit -Ft SGE -f prog1.xml

submits job described in job description file called prog1.xml.
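A minimal sketch of what such a GT4 job description file might contain (the executable and file paths here are hypothetical):

<job>
   <executable>${GLOBUS_USER_HOME}/prog1</executable>
   <stdout>${GLOBUS_USER_HOME}/prog1.out</stdout>
   <stderr>${GLOBUS_USER_HOME}/prog1.err</stderr>
</job>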

6d-1.30

Output

Delegating user credentials...Done.
Submitting job...Done.
Job ID: uuid:d23a7be0-f87c-11d9-a53b-0011115aae1f
Termination time: 07/20/2008 17:44 GMT
Current job state: Active
Current job state: CleanUp
Current job state: Done
Destroying job...Done.
Cleaning up any delegated credentials...Done.

6d-1.31

Note: the user credentials have to be delegated

Actual machine running job

Scheduler will choose machine that job is run on, which can vary for each job. Hence

globusrun-ws -submit -s -Ft SGE -c /bin/hostname

submits executable hostname to SGE scheduler in streaming mode redirecting output to console, with usual Globus output.

Output: Hostname displayed as output will be that of machine running job and may vary.

6d-1.32

Specifying Submit Host

Submit host and location for factory service can be specified by using -F option, e.g.:

globusrun-ws -submit -s
    -F http://coit-grid03.uncc.edu:8440
    -Ft SGE -c /bin/hostname

6d-1.33

6d-1.34

Condor

• Developed at the University of Wisconsin-Madison in the mid-1980s to convert a collection of distributed workstations and clusters into a high-throughput computing facility.

• Key concept - using wasted computer power of idle workstations.

• Hugely successful.

• Many institutions now operate Condor clusters.

Condor

• Essentially a job scheduler
  – jobs scheduled in background on distributed computers, but without the user needing an account on the individual computers.

• Users compile their programs for the computers Condor is going to use, and include the Condor libraries, which among other things handle input and capture output.

• Job described in a job description file.

• Condor then ships job off to appropriate computers.

6d-1.35

Ideal Use Case

• Executing a long-running job multiple times with different parameters (parameter-sweep problem)
  – No communication between jobs
  – Each job can be scheduled independently.

• If a single parameter sweep takes n hours on a single computer.

• With p sweeps, this would take np hours.

• With m computers, it would take np/m hours

(where p is a multiple of m). m times faster.
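As an illustration with arbitrary numbers: if one sweep takes n = 2 hours and p = 100 sweeps are needed, a single computer takes 200 hours, while m = 20 computers take 200/20 = 10 hours, i.e. 20 times faster.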

6d-1.36

Submitting multiple parameter sweeps across m computers

3-1.37
Fig. 3.11

3-1.38

Condor Features

• Includes:– Resource finder

– Batch queue manager

– Scheduler

– Checkpoint/restart

– Process migration

3-1.39

Intended to run job even if:

• Machines crash

• Disk space exhausted

• Software not installed

• Machines needed by others

• Machines managed by others

• Machines far away

3-1.40

Condor Structure

• A collection of Condor machines called a pool.

• Machines have one or more of 4 roles:

– One central manager
– Submit machine(s)
– Execution machine(s)
– One checkpoint server

3-1.41

Central Manager

• Resource broker for a pool.

• Keeps track of:
  – Which machines are available
  – What jobs are running

• Negotiates which machine will run which job, etc.

• Only one central manager per pool.

3-1.42

Submit Machine

• A machine which can submit jobs to the pool.

• Must be at least one submit machine in a pool, and usually more than one.

Execute Machine

• A machine on which jobs can be run.

• Must be at least one execute machine in a pool, and usually more than one.

3-1.43

Checkpoint Server

• Machine which stores checkpoint files produced by jobs which checkpoint.

• Can only be one checkpoint machine in a pool.

• Optional to have checkpoint machine.

3-1.44

Possible Configuration

• A central manager.

• Some machines that can only be submit hosts.

• Some machines that can only be execute hosts.

• Some machines that can be both submit and execute hosts.

3-1.45
Fig. 3.12

General Condor configuration

Internal steps to execute a job in Condor

3-1.46
Fig. 3.13

3-1.47

Types of Jobs

• Classified according to the environment provided.

• Currently nine environments (Condor 7.0.1, released 2008):

– Vanilla
– Standard
– Java
– MPI (for legacy)
– Parallel
– Grid (or Globus depending on Condor version)
– Scheduler (possibly for legacy)
– Local
– VM (Virtual Machine)

3-1.48

Vanilla Universe

• For straightforward jobs written in compiled languages such as C or C++, or pre-compiled applications, shell scripts and Windows batch files.

• For jobs that cannot be (or simply are not) compiled with the Condor libraries.

• Does not provide for features such as checkpointing or remote system calls.

• May be less efficient than running under other universes, but it only requires an executable.

• Code restrictions such as on the use of system calls.

3-1.49

Example job submission

# This is a comment condor submit file for prog1 job

Universe = vanilla

Executable = prog1

Output = prog1.out

Error = prog1.error

Log = prog1.log

Queue

Condor has its own job description language to describe job in a “submit description file”

Not in XML, as it predates XML.

Simple Submit Description File Example

3-1.50

Submitting Multiple Jobs

Done by adding a number after the Queue command, i.e.:

Submit Description File Example

# condor submit file for program prog1

Universe = vanilla

Executable = prog1

Queue 500

will submit 500 identical prog1 jobs at once. Can use multiple Queue commands with Arguments for each instance.
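A sketch of that pattern (argument values are hypothetical):

# condor submit file: three prog1 runs with different arguments
Universe = vanilla
Executable = prog1
Arguments = 10
Queue
Arguments = 20
Queue
Arguments = 30
Queue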

Submitting a job

condor_submit command

condor_submit prog1.sdl

where prog1.sdl is submit description file.

Without any other specification, Condor will attempt to find a suitable execute machine from all those available.

Condor works with and without a shared file system.

Most local clusters set up with shared file system and Condor will not need to explicitly transfer files.

3-1.51

Transferring files

When it is necessary to explicitly tell Condor to transfer files, additional parameters are included in the submit description file:

# condor submit file for uptime

# with explicit file transfers

Universe = vanilla

Executable = uptime

Output = uptime.out

Error = uptime.error

Log = uptime.log

Should_transfer_files = YES

When_to_transfer_output = ON_EXIT

Queue

3-1.52

After submitting job, there will be a message that job has been submitted, for example after:

condor_submit prog1.sdl

Submitting job(s).

Logging submit event(s).

1 job(s) submitted to cluster 662.

3.1.53

Monitoring

Can query the status of the Condor queue with:

condor_q

Get output of form:

Queue

-- Submitter: coit-grid02.uncc.edu : <152.15.98.25:32821> :

ID OWNER SUBMITTED RUN_TIME ST PRI SIZE CMD

.

662.0 abw 5/23 17:36 0+00:00:00 I 0 9.8 uptime

16 jobs; 1 idle, 0 running, 15 held

Status: H (hold), R (running), I (idle, waiting for machine), C (completed), U (unexpanded, never been run) or X (removed).

3-1.54

3-1.55

Standard Universe

• For jobs compiled with Condor libraries.

• Allows for checkpointing and remote system calls.

• Must be single threaded.

• Not available under Windows.

Compiling in standard universe

condor_compile

Example to compile and link Condor libraries would be:

condor_compile cc -o prog1 prog1.c

Simplest submit job description would be:

#Simplest possible condor submit file for prog1 job

Executable = prog1

Queue

which would run job in standard universe as that is default.

Without file transfer commands in the submit description file, all standard input/output/error (stdin, stdout, stderr) is lost or, in Linux jargon, redirected to /dev/null.

3-1.56

3-1.57

Checkpointing

• Certain jobs can checkpoint, both periodically for safety and when interrupted.

• If checkpointed job interrupted, it will resume at the last checkpointed state when it starts again.

• Generally no change to source code - need to link Condor’s Standard Universe support library.

• Checkpointing disabled by including command

+WantCheckpoint = False

in submit description file before queue.

Java universe

For submitting Java programs run by the Java Virtual Machine.

Executable is Java class file, as universe invokes JVM automatically.

Example submit description file

# This is a comment condor submit file for java job

Universe = java

Executable = Prog1.class

Arguments = Prog1 1234

Output = prog1.out

Error = prog1.error

Log = prog1.log

Should_transfer_files = IF_NEEDED

When_to_transfer_output = ON_EXIT

Queue

3-1.58

The first argument must be the name of the class (here Prog1) to be executed by the JVM.

Grid universe

Condor can be used as the environment for Grid computing:

• Stand-alone without Grid middleware such as Globus

or alternatively

• Integrated with the Globus toolkit.

3-1.59

Stand-alone Grid environment: Flocking

• Condor pools can be joined together in a process called flocking.

• Create Grid computing environment with different pools under different administrative domains.

• Migration will occur if a suitable computer not available in original pool.

• In Condor-C, jobs can move from one computer’s job queue to another.
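A minimal sketch of how flocking is enabled in the Condor configuration (host names are hypothetical; see the Condor manual for the full set of parameters):

# condor_config on the submitting pool: pools jobs may flock to
FLOCK_TO = condor.other-pool.edu

# condor_config in the receiving pool: pools allowed to flock in
FLOCK_FROM = condor.our-pool.edu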

3-1.60

Stand-alone Grid environment Condor Glidein mechanism

Enables computers to join a Condor pool temporarily.

Condor command:

condor_glidein <contact argument>

where <contact argument> could be hostname or job manager/scheduler or Globus resource.

Various options enable more information to be passed.

3-1.61

Condor’s matchmaking mechanism

To choose the best computer to run the job.

Condor ClassAd

Based upon notion that jobs and resources advertise themselves in “classified advertisements”, which include their characteristics and requirements.

Job ClassAd matched against resource ClassAd.

3-1.62

Example

A user might advertise his/her job:

My job needs an Intel Core 2 processor (or equivalent Intel-compatible processor) with a speed of at least 2 GHz, at least 6 GB of main memory and 1 TB of working disk space.

using job attributes of ClassAd (in submit description file).

3-1.63

Compute resources advertise their capabilities, for example:

I am a computer with an AMD Phenom processor operating at a speed of 2.6 GHz with 256 GB of main memory and 16 TB of working disk space.

using machine attributes of the resource ClassAd.

3-1.64

3-1.65

ClassAd Matchmaking Steps

1. Agents (jobs) and resources (computers) advertise their characteristics and requirements in “classified advertisements.”

2. Matchmaker scans ClassAds and creates pairs that satisfy each other's constraints and preferences.

3. Matchmaker informs both parties of match.

4. Agent and resource make contact.

3-1.66

Condor’s ClassAd Matchmaking Mechanism

Fig 3.14

ClassAd commands

MyType

Identifies type of ClassAd:

MyType = "Job"

or

MyType = "Machine"

3-1.67

ClassAd commands

TargetType

Specifies what ClassAd is to match with:

TargetType = “Machine”

or

TargetType = “Job”

3-1.68

Machine ClassAd

Set up during system configuration.

Some attributes provided by Condor but their values can be dynamic and alter during system operation.

Machine attributes can describe such things as:
• Machine name
• Architecture
• Operating system
• Main memory available for job
• Disk memory available for job
• Processor performance
• Current load, etc.

3-1.69

3-1.70

3-1.71

Job ClassAd

A job is typically characterized by its resource requirements and preferences. May include:

– What the job requires
– What the job desires
– What the job prefers, and
– What the job will accept

using Boolean expressions. These details are put in the submit description file.

3-1.72

Matchmaking commands Requirements and Rank

Available for both job ClassAd and machine ClassAd:

Requirements -- specify machine requirements.

Rank -- used to differentiate between multiple machines that can satisfy the requirements; can identify a preference based upon user criteria.

3-1.73

Requirements

Requirements = <Boolean Expression>

Use a C/Java-like Boolean expression that evaluates to TRUE for a match.

3-1.74

Machine Requirements

A machine ClassAd might be:

MyType = “Machine”

TargetType = “Job”

Machine = "coit-grid02.cs.uncc.edu"

Arch = “INTEL”

OpSys = "LINUX"

Disk = 1000 * 1024

Memory = 100 * 1024

Requirements = (LoadAvg<=0.2)

3-1.75

Job Requirements

Example

My job needs a machine with at least 6 GB of main memory and 25 MB of disk space.

ClassAd

MyType = “Job”

TargetType = “Machine”

Universe = ...

Executable = ...

Requirements = (Memory >= 6*1024) && (Disk >= 25*1024)

3-1.76

Rank

Rank = <expression>

Evaluates to a floating point number.

Resource with highest rank chosen.

3-1.77

3-1.78

Job ClassAd's Rank statement

• Can be used in the job ClassAd for selection between compatible machines.

• The machine with the highest rank is chosen.

Example

Rank = (Memory * 10000) + KFlops

(This ranks machines by memory and floating-point performance; Memory and KFlops are machine attributes.)

Rank

Sometimes just TRUE (1) or FALSE (0) is sufficient for rank, i.e.:

Rank = (Target.Memory > 10000)

3-1.79

3-1.80

Machine Rank Statement

Can also be used in the machine ClassAd for matchmaking.

Example

Rank = (Department == “Computer Science”)

where Department is defined in the job ClassAd, say:

Department=“Computer Science”

3-1.81

Job ClassAd

[ MyType = "Job"
  TargetType = "Machine"
  …
  Department = "Computer Science"
  … ]

Machine ClassAd

[ MyType = "Machine"
  TargetType = "Job"
  …
  Rank = (Department == "Computer Science")
  … ]

Using rank in Machines ClassAd

3-1.82

Directed Acyclic Graph Manager (DAGMan)

Meta-scheduler

Allows one to specify dependencies between Condor Jobs.

3-1.83

Example

“Do not run Job B until Job A completed successfully”

Especially important for jobs working together (as in Grid computing).

3-1.84

Directed Acyclic Graph (DAG)

• A data structure used to represent dependencies.

• Directed graph.

• Must not have cycles (acyclic).

• Each job is a node in DAG.

• Each node can have any number of parents and children.

3-1.85

DAG

[Diamond DAG figure: Job A at the top; Jobs B and C are children of A; Job D is the child of both B and C.]

Do job A.

Do jobs B and C after job A has finished.

Do job D after both jobs B and C have finished.

6d-1.86

Defining a DAG

• Defined by a .dag file, listing nodes and their dependencies.

• Each “job” statement has a name (say A) and a file (say a.condor)

• PARENT-CHILD statement describes relationship between two or more jobs

• Other statements available.

6d-1.87

Example

# diamond.dag
Job A a.sub
Job B b.sub
Job C c.sub
Job D d.sub
Parent A Child B C
Parent B C Child D

Directed Acyclic Graph Manager (DAGMan) DAGs

3-1.88
Fig. 3.15

# (a) DAG

JOB A A.sub

JOB B B.sub

JOB C C.sub

PARENT A CHILD B

PARENT B CHILD C

3-1.89

# (b) DAG

JOB A A.sub

JOB B B.sub

JOB C C.sub

JOB D D.sub

JOB E E.sub

PARENT A CHILD B C

PARENT B CHILD D

PARENT C D CHILD E

3-1.90

DAG for PARENT A B CHILD C D

# DAG

JOB A A.sub

JOB B B.sub

JOB C C.sub

JOB D D.sub

PARENT A B CHILD C D

3-1.91
Fig. 3.16

3-1.92

Running DAG

condor_submit_dag

Start a DAG with dag file diamond.dag.

condor_submit_dag diamond.dag

Submits a Scheduler Universe Job with DAGMan as executable.

3-1.93

DAGMan

• Acts as a scheduler, managing the submission of jobs based upon the DAG dependencies.

• Holds and submits jobs to Condor queue at appropriate times.

3-1.94

Job Failures

• DAGMan continues until it cannot make progress and then creates a rescue file holding current state of DAG.

• When failed job ready to re-run, rescue file used to restore prior state of DAG.

Rescue file

Completed nodes are marked with DONE.

# DAG

JOB A A.sub DONE

JOB B B.sub DONE

JOB C C.sub DONE

JOB D D.sub

JOB E E.sub

PARENT A CHILD B C

PARENT B CHILD D

PARENT C D CHILD E

Nodes A, B, and C have completed before a failure occurred.

DAG can restart with node D and then node E.

DONE can be inserted by users. Useful for testing.

Restart is at the level of nodes.

3-1.95

3-1.96

Summary of Key Condor Features

• High throughput computing using an opportunistic environment.

• Provides a mechanism for running jobs on remote machines.

• Matchmaking

• Checkpointing

• DAG scheduling

3-1.97

More Information

• http://www.cs.wisc.org/condor

• Chapter 11, Condor and the Grid, D. Thain, T. Tannenbaum, and M. Livny, Grid Computing: Making The Global Infrastructure a Reality, F. Berman, A. J. G. Hey, and G. Fox, editors, John Wiley, 2003.

• “Condor-G: A Computation Management Agent for Multi-Institutional Grids,” J. Frey, T. Tannenbaum, I. Foster, M. Livny, S. Tuecke, Proc. 10th Int. Symp. High Performance Distributed Computing (HPDC-10) Aug. 2001.

Questions (Multiple choice)

3-1.98

What command is used to submit the executable /bin/hostname to the SGE scheduler?

(a) qsub -b y /bin/hostname

(b) qsub /bin/hostname

(c) sge_submit -Ft SGE -c /bin/hostname

(d) qmon /bin/hostname

3-1.99
SAQ 3-4

What Globus command is used to submit the executable /bin/hostname to coit-grid03.uncc.edu with the SGE local scheduler?

(a) globusrun-ws -submit -F coit-grid03.uncc.edu -s SGE -c /bin/hostname

(b) globusrun-ws -submit -s -F coit-grid03.uncc.edu -Ft SGE -c /bin/hostname

(c) globusrun-ws -submit coit-grid03.uncc.edu -Ft SGE -c /bin/hostname

(d) qsub coit-grid03.uncc.edu /bin/hostname

3-1.100
SAQ 3-5

In a Condor environment, what is displayed after issuing the condor_status command?

(a) An error message as this is not a valid Condor command

(b) A list of computers in the condor pool and their status

(c) A list of jobs in the Condor queue and their status

(d) The status of the Condor central manager

3-1.101
SAQ 3-7

In Condor, where is the job ClassAd?

(a) In the submit description file

(b) In a file that is specified in the condor_submit command in addition to the submit description file

(c) In a file that the user transfers to the computer separately

(d) In the local newspaper

3-1.102
SAQ 3-8

What is DAGMan in Condor?

(a) A Data Access Grid Manager

(b) A software tool for providing checkpointing

(c) A scheduler that can schedule jobs in a workflow

(d) A Database Group Manager for Condor

3-1.103
SAQ 3-10

What is checkpointing?

(a) Someone pointing to a check

(b) The process of someone checking the grade of an assignment

(c) Saving state information periodically to be able to recover from failures during execution

(d) The process of a compiler finding syntax errors

3-1.104
SAQ 3-11

Which of the following is NOT a Condor environment?

(a) Globus/Grid

(b) Vanilla

(c) Chocolate

(d) Standard

(e) Java

3-1.105
SAQ 3-12

3-1.106

Quiz

Identify which of the following are similarities between the Condor ClassAd and Globus RSL-1/RSL-2/JDD. (There may be more than one similarity.)

(a) There are no similarities.

(b) They both provide a means of specifying command line arguments for the job.

(c) They both provide a means of specifying whether a named user is allowed to execute a job.

(d) They both provide a means of specifying machine requirements for a job.

