Page 1: Session9part1


UNICORE day at ISSGC’09

Presenters: Rebecca Breu, Bastian Demuth, Mathilde Romberg

Jülich Supercomputing Centre (JSC)

7 July 2009

7th International Summer School on Grid Computing

Page 2: Session9part1


Agenda

9:00 – 10:30 Principles of Job Submission and Execution Management

Set the scene

11:00 – 12:30 UNICORE – Architecture and Components

Technical overview of how UNICORE works and how it is used

14:00 – 15:30 UNICORE Basic Practical

Practical: submitting jobs with the command line client

16:00 – 17:30 UNICORE Workflow Practical

Practical: submitting workflows with the graphical client

18:00 – 19:00 UNICORE: An Application

Example applications using UNICORE

Page 3: Session9part1


Session 9: Principles of Job Submission and Execution Management

Author/Presenter: Achim Streit, Mathilde Romberg

Jülich Supercomputing Centre (JSC)

Page 4: Session9part1


Job Submission

Page 5: Session9part1


Jobs

Job
- Some work to be executed
- Requires CPU and memory
- Possibly accesses additional resources, e.g., storage, devices, services

Job scheduling
- Policy for assigning jobs to resources

Courtesy of Prof. Felix Wolf, RWTH Aachen

Page 6: Session9part1


Resources

Compute
- Memory
- Central Processing Units
- Nodes
- Threads/Tasks

Data
- Size
- Transfer rate

Network
- Bandwidths

...

Page 7: Session9part1


How to Differentiate Compute Resources?

By number of CPUs:
- Single processor
- Multiprocessor

Multiprocessor systems can be grouped into:
- Shared memory: equal access time to memory from each processor
- Distributed memory: each CPU has its own memory and I/O; different address spaces
- Distributed shared memory: shared address space; access time depends on the location of the data in memory

Page 8: Session9part1


Multiprocessor Systems – Examples

- SMP (Symmetric (shared-memory) MultiProcessor): IBM Power 4/5/6 nodes, multi-core chips
- MPP (Massively Parallel Processor): IBM Blue Gene/P, Cray XT4
- NUMA (Non-Uniform Memory Access): SGI Altix
- Cluster: MareNostrum (IBM Power4/5/6 system), Tera-10 (self-built cluster)

[Map: locations of the example systems – Jülich, Helsinki, Munich, Barcelona]

Page 9: Session9part1


Job Scheduling

Page 10: Session9part1


Job Scheduling

Policy for assigning jobs to resources.

Inputs:
- Set of jobs with requirements
- Set of resources

Criteria for assignment:
- Fairness
- Efficiency
- Minimize response time (interactive users) and turnaround time (batch jobs)
- Maximize throughput

Courtesy of Prof. Felix Wolf, RWTH Aachen

Page 11: Session9part1


Usage of Multiprocessor Systems

Typically the resource demands of users/jobs exceed the available resources, so users/jobs compete. Resource requirements also typically differ from one user (or application) to the next:
- Large/small (in terms of number of processors)
- Large/small (in terms of amount of memory)
- Long/short (in terms of duration of resource usage)

A form of resource management and job scheduling is required!
- How to share the available resources among the competing jobs?
- When does a job start and which resources are assigned?

Page 12: Session9part1


Resource Management & Job Scheduling – 1

Time-sharing (or time-slicing)
- Several jobs share the same resource; jobs are executed quasi-simultaneously
- Resources are not exclusively assigned to jobs
- Resource usage of a job is reduced to short time slices (a few clock ticks of the processor); jobs need more than a single time slice to complete
- Each job gets the resource assigned in round-robin fashion
- New jobs start immediately, but execution takes longer than on a dedicated resource
- Typically handled by the operating system

Examples: SMP machines, your own Linux PC

Page 13: Session9part1


Resource Management & Job Scheduling – 2

Space-sharing (or space-slicing)
- Resources are exclusively assigned to a job until it completes
- Jobs may have to wait until enough resources are free for them to start
- Needs a separate resource management system (also known as a batch system) and a job scheduler

Examples:
- MPP systems, clusters, etc.
- LoadLeveler, Torque + Maui, PBSPro, OpenCCS, SLURM, …

Space-sharing based resource management and job scheduling is commonly used on clusters and other multiprocessor systems.

Page 14: Session9part1


Job Submission on Multiprocessor Systems: Example – LoadLeveler

IBM Tivoli Workload Scheduler LoadLeveler, available for AIX and Linux. Jobs are submitted via a job command file.

Basic LoadLeveler commands:
- llsubmit <cmdfile> – submit a job
- llq – show queued and running jobs
- llstatus – display status information
- llcancel <job_id> – delete a queued or running job

Page 15: Session9part1


LoadLeveler cmd_file examples – 1

IBM Blue Gene/P system @ Jülich – JUGENE:

    # @ job_name = BGP-LoadL-Sample-1
    # @ comment = "BGP Job by Size"
    # @ error = $(job_name).$(jobid).out
    # @ output = $(job_name).$(jobid).out
    # @ environment = COPY_ALL;
    # @ wall_clock_limit = 00:20:00
    # @ notification = error
    # @ notify_user = [email protected]
    # @ job_type = bluegene
    # @ bg_size = 32
    # @ queue
    /usr/local/bin/mpirun -exe `/bin/pwd`/wait_bgp.rts \
        -mode VN -np 48 -verbose 1 -args "-t 1"

(bg_size: size of the partition; wall_clock_limit: runtime/duration; the executable may only be mpirun!)

Page 16: Session9part1


LoadLeveler cmd_file examples – 2

IBM p690 eServer Cluster 1600 @ Jülich – JUMP

    #@ job_type = parallel
    #@ output = out.$(jobid).$(stepid)
    #@ error = err.$(jobid).$(stepid)
    #@ wall_clock_limit = 00:15:00
    #@ notify_user = [email protected]
    #@ node = 2
    #@ total_tasks = 64
    #@ data_limit = 1.5GB
    #@ queue
    myprogram

(resource requirements – node: number of nodes for the job, total_tasks: total number of tasks in the job; wall_clock_limit: runtime/duration; myprogram: the executable)

Page 17: Session9part1


Job Submission on Multiprocessor Systems: Example – Torque + Maui

Torque is the resource manager; Maui is the cluster scheduler.

Basic Maui commands:
- msub – submit a new job
- showq – display a detailed, prioritized list of active and idle jobs
- showstart – show the estimated start time of idle jobs
- showstats – show detailed usage statistics for the users, groups, and accounts the user has access to
- canceljob – cancel existing jobs

Page 18: Session9part1


Job submission in Maui

Via command line:

    msub -l nodes=32:ppn=2,pmem=1800mb,walltime=3600 myscript

Resource list: 32 nodes with 2 processors each, 1800 MB per task, and 3600 seconds duration; myscript is the script file to execute.
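Since Torque is the resource manager here, the same resource list can typically also be embedded in the submitted script itself as Torque directives instead of being passed on the msub command line. A minimal sketch (the program name and the MPI launch line are made-up assumptions, not part of the original slide):

    #!/bin/bash
    # 32 nodes with 2 processors each, 1800 MB per task, 3600 s walltime
    #PBS -l nodes=32:ppn=2
    #PBS -l pmem=1800mb
    #PBS -l walltime=3600
    # run from the directory the job was submitted from
    cd $PBS_O_WORKDIR
    # start 64 tasks (32 nodes x 2 processors per node); placeholder program
    mpirun -np 64 ./myprogram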

Page 19: Session9part1


Lessons Learned

Each job submission system is different:
- Different commands for submission, status query, and cancellation
- Different options, scheduling policies, …
- Even different configurations of the same job submission system exist on different multiprocessor systems

Job requirements are specified differently:
- As command-line parameters of the job submission command
- In a separate job command file

Different kinds of job requirements exist: nodes and tasks per node, total tasks, …

Page 20: Session9part1


Job submission and the Grid

A higher meta level with more abstraction is needed to describe the requirements of jobs in a Grid of heterogeneous systems. Many proprietary solutions exist; each Grid middleware uses its own language, e.g.:
- AJO in UNICORE 5, ClassAds/JDL in Condor, JDL in gLite, RSL in Globus Toolkit, xRSL in ARC/NorduGrid, etc.

And there is JSDL 1.0:
- An Open Grid Forum (OGF) standard
- http://www.gridforum.org/documents/GFD.56.pdf

Page 21: Session9part1


JSDL – Introduction

- JSDL stands for Job Submission Description Language
- A language for describing the requirements of computational jobs for submission to Grids and other systems
- Can also be used between middleware systems or for submitting to any Grid middleware (interoperability)

JSDL does not define a submission interface, what the results of a submission look like, or how resources are selected.

Page 22: Session9part1


JSDL Document

A JSDL document is an XML document, which may contain:
- Generic (job) identification information
- An application description
- Resource requirements (the main focus is computational jobs)
- Descriptions of required data files

A JSDL document is a template: it can be submitted multiple times and used to create multiple job instances. No job-instance-specific attributes, e.g. start or end time, can be defined.
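As an illustration of the data-file descriptions, a staging entry inside a JobDescription looks roughly like the following sketch (element names follow GFD.56; the file name and source URI are made-up examples):

    <jsdl:DataStaging>
      <jsdl:FileName>input.dat</jsdl:FileName>
      <jsdl:CreationFlag>overwrite</jsdl:CreationFlag>
      <jsdl:DeleteOnTermination>true</jsdl:DeleteOnTermination>
      <jsdl:Source>
        <jsdl:URI>http://example.org/data/input.dat</jsdl:URI>
      </jsdl:Source>
    </jsdl:DataStaging>

This asks the executing middleware to fetch input.dat into the job directory before the job starts and to remove it when the job terminates.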

Page 23: Session9part1


JSDL – A Hello World Example

    <?xml version="1.0" encoding="UTF-8"?>
    <jsdl:JobDefinition
        xmlns:jsdl="http://schemas.ggf.org/jsdl/2005/11/jsdl"
        xmlns:jsdl-posix="http://schemas.ggf.org/jsdl/2005/11/jsdl-posix">
      <jsdl:JobDescription>
        <jsdl:Application>
          <jsdl-posix:POSIXApplication>
            <jsdl-posix:Executable>/bin/echo</jsdl-posix:Executable>
            <jsdl-posix:Input>/dev/null</jsdl-posix:Input>
            <jsdl-posix:Output>std.out</jsdl-posix:Output>
            <jsdl-posix:Argument>hello</jsdl-posix:Argument>
            <jsdl-posix:Argument>world</jsdl-posix:Argument>
          </jsdl-posix:POSIXApplication>
        </jsdl:Application>
      </jsdl:JobDescription>
    </jsdl:JobDefinition>

When executed, this job runs /bin/echo hello world and writes "hello world" to the file std.out.

Page 24: Session9part1


JSDL – Resource Requirements Description

- Supports simple descriptions of resource requirements; NOT a comprehensive resource requirements language
- Can be extended with other elements for richer or more abstract descriptions
- Main target is compute jobs: CPU, memory, file system/disk, and operating system requirements
- Allows some flexibility for aggregate (total) requirements: "I want 10 CPUs in total and each resource should have 2 or more"
- Very basic support for network requirements
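For example, the aggregate requirement quoted above – 10 CPUs in total, at least 2 per resource – could be expressed with the Resources element roughly as follows (a sketch using the element names of GFD.56):

    <jsdl:Resources>
      <jsdl:IndividualCPUCount>
        <jsdl:LowerBoundedRange>2.0</jsdl:LowerBoundedRange>
      </jsdl:IndividualCPUCount>
      <jsdl:TotalCPUCount>
        <jsdl:Exact>10.0</jsdl:Exact>
      </jsdl:TotalCPUCount>
    </jsdl:Resources>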

Page 25: Session9part1


JSDL application extensions

SPMD (single-program-multiple-data)
- http://www.ogf.org/documents/GFD.115.pdf
- Extends JSDL 1.0 for parallel applications (MPI, PVM, etc.)
- Allows specifying the number of processes, processes per host, and threads per process along with the application (see the sketch below)
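To illustrate, an SPMD application description takes the place of the POSIXApplication element; a sketch along the lines of GFD.115 (the executable path and process counts are made-up, and the namespace/variation URIs should be checked against the specification):

    <jsdl:Application>
      <jsdl-spmd:SPMDApplication
          xmlns:jsdl-spmd="http://schemas.ogf.org/jsdl/2007/02/jsdl-spmd">
        <jsdl-posix:Executable>/usr/local/bin/mysim</jsdl-posix:Executable>
        <jsdl-spmd:NumberOfProcesses>64</jsdl-spmd:NumberOfProcesses>
        <jsdl-spmd:ProcessesPerHost>4</jsdl-spmd:ProcessesPerHost>
        <jsdl-spmd:ThreadsPerProcess>1</jsdl-spmd:ThreadsPerProcess>
        <jsdl-spmd:SPMDVariation>http://www.ogf.org/jsdl/2007/02/jsdl-spmd/MPI</jsdl-spmd:SPMDVariation>
      </jsdl-spmd:SPMDApplication>
    </jsdl:Application>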

HPC (high performance computing)
- http://www.ogf.org/documents/GFD.111.pdf
- Extends JSDL 1.0 for HPC applications running as operating system processes (e.g., user name, environment, and working directory can be specified)
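A corresponding sketch for the HPC profile application (element names as defined in GFD.111; the user name and paths are made-up, and the exact namespace should be verified against the specification):

    <jsdl:Application>
      <jsdl-hpcpa:HPCProfileApplication
          xmlns:jsdl-hpcpa="http://schemas.ggf.org/jsdl/2006/07/jsdl-hpcpa">
        <jsdl-hpcpa:Executable>/usr/local/bin/mysim</jsdl-hpcpa:Executable>
        <jsdl-hpcpa:WorkingDirectory>/home/alice/run01</jsdl-hpcpa:WorkingDirectory>
        <jsdl-hpcpa:Environment name="OMP_NUM_THREADS">4</jsdl-hpcpa:Environment>
        <jsdl-hpcpa:UserName>alice</jsdl-hpcpa:UserName>
      </jsdl-hpcpa:HPCProfileApplication>
    </jsdl:Application>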

Page 26: Session9part1


Lessons Learned

JSDL is a standardized language for describing jobs to be submitted to Grid resources:
- Not only the job itself (application, arguments, input, output, etc.), but also resource requirements (CPU, memory, etc.)
- Extensions for specific application domains (parallel programs, HPC applications) exist

BUT: a JSDL document cannot be submitted directly to Grid resources, i.e. to the resource management and job scheduling system of a cluster or multiprocessor system in a Grid.

Page 27: Session9part1


Execution and Job Management – 1

One of the essential functionalities and components of a Grid middleware.

Deals with:
- Initiating/submitting, monitoring, and managing jobs
- Handling and staging of all job data
- Coordinating and scheduling multi-step jobs

Examples:
- XNJS in UNICORE 6 (sessions 10-12 today)
- WS-GRAM in GT4 (sessions 19-21 on Thursday)
- WMS in gLite (sessions 24-26 on Friday)
- ARC Client in NorduGrid/ARC (sessions 29-30 on Saturday)

Page 28: Session9part1


Execution and Job Management – 2

Initiating/submitting, monitoring, and managing jobs:
- Translates the Grid job into a specific job (application details, resources, etc.) for the target system
- Submits the job to the resource management system using its proprietary way of job submission
- Frequently polls the job status (waiting/queued, running/executing, failed, aborted, paused, finished, etc.) from the resource management system
- Provides "access" to the job, its status, and its data during its runtime and after its (successful or unsuccessful) completion
- If the resource management system is not available/reachable at job submission time, the job is cached for a later hand-over

Page 29: Session9part1


Execution and Job Management – 3

Handling and staging of all job data, incl. the job directory and persistency:
- Creates, manages, and destroys the job directory
- All data submitted with the job as input is stored in the job directory
- Data is staged in from remote data resources/archives
- All data generated by the job is preserved and/or staged out after the successful completion of the job

Coordinating and scheduling multi-step jobs:
- If a job consists of more than one step (a workflow), the required resources are orchestrated
- Manages the proper initiation of the workflow execution
- The execution of the workflow is controlled and monitored

Page 30: Session9part1


Lessons Learned

Execution and job management is needed: a meta-layer on top of the Grid resources that provides a uniform way of accessing the Grid and an intuitive, secure, and easy-to-use interface for the user.

Page 31: Session9part1


Introduction to UNICORE (from 30,000 ft)

More in sessions 10-12, today.

Page 32: Session9part1


(Short) History Lesson

UNiform Interface to COmputing REsources – seamless, secure, and intuitive.

Initial development started in two German projects funded by the German Ministry of Education and Research (BMBF):

08/1997 – 12/1999: UNICORE project
- Results: well-defined security architecture with X.509 certificates, intuitive GUI, central job supervisor based on Codine (a predecessor of SGE) from Genias

1/2000 – 12/2002: UNICORE Plus project
- Results: implementation enhancements (e.g., replacement of Codine by a custom NJS), extended job control (workflows), application-specific interfaces (plugins)

Continuous development since 2002 in several EU projects; open source community development since summer 2004.

Page 33: Session9part1


Projects

More than a decade of German and European research & development and infrastructure projects, among them: UNICORE, UNICORE Plus, EUROGRID, GRIP, GRIDSTART, OpenMolGRID, UniGrids, VIOLA, DEISA, NextGRID, CoreGRID, D-Grid IP, EGEE-II, OMII-Europe, A-WARE, Chemomentum, eDEISA, PHOSPHORUS, D-Grid IP 2, SmartLM, PRACE, D-MON, DEISA2, ETICS2, WisNetGrid – and many others.

[Timeline figure: the projects arranged along the years 1999-2011]

Page 34: Session9part1


UNICORE – Grid driving HPC

Used in:
- DEISA (European Distributed Supercomputing Infrastructure)
- The national German supercomputing centre NIC
- Gauss Centre for Supercomputing (alliance of the three German HPC centres)
- PRACE (European PetaFlop HPC infrastructure) – starting up
- But also in non-HPC-focused infrastructures (e.g., D-Grid)

Taking up major requirements from, e.g.:
- HPC users
- HPC user support teams
- HPC operations teams

Page 35: Session9part1


UNICORE – www.unicore.eu

Open source (BSD license):
- Open developer community on SourceForge
- Contributing your own developments is easily possible

Design principles:
- Standards: OGSA-conform, WS-RF compliant
- Open, extensible, interoperable
- End-to-end, seamless, secure, and intuitive
- Security: X.509, proxy and VO support
- Workflow and application support directly integrated
- Variety of clients: graphical, command line, portal, API, etc.
- Quick and simple installation and configuration
- Support for many operating and batch systems
- 100% Java 5

Page 36: Session9part1


UNICORE in use – some examples

Page 37: Session9part1


Usage in D-Grid

Core D-Grid sites committing parts of their existing resources to D-Grid:
- Approx. 700 CPUs
- Approx. 1 PByte of storage
- UNICORE is installed and used

Additional sites received extra money from the BMBF for buying compute clusters and data storage:
- Approx. 2000 CPUs
- Approx. 2 PByte of storage
- UNICORE (as well as Globus and gLite) is installed as soon as the systems are in place

[Map: D-Grid sites; visible labels include LRZ and DLR-DFD]

Page 38: Session9part1


DEISA – Distributed European Infrastructure for Supercomputing Applications (www.deisa.eu)

Consortium of leading national HPC centers in Europe:
- Deploys and operates a persistent, production-quality, distributed, heterogeneous HPC environment
- Uses UNICORE as Grid middleware, on top of DEISA's core services: dedicated network, shared file system, common production environment at all sites
- Used e.g. for workflow applications

Member centers: IDRIS – CNRS (Paris, France), FZJ (Jülich, Germany), RZG (Garching, Germany), CINECA (Bologna, Italy), EPCC (Edinburgh, UK), CSC (Helsinki, Finland), SARA (Amsterdam, NL), HLRS (Stuttgart, Germany), BSC (Barcelona, Spain), LRZ (Munich, Germany), ECMWF (Reading, UK)

More in session 33, Monday 9:00 – 10:30.

Page 39: Session9part1


Interoperability and Usability of Grid Infrastructures

Focus on providing common interfaces and integration of major Grid software infrastructures:
- OGSA-DAI, VOMS, GridSphere, OGSA-BES, OGSA-RUS
- UNICORE, gLite, Globus Toolkit, CROWN

Apply the interoperability components in application use cases.

Page 40: Session9part1


Grid Services based Environment to enable Innovative Research

Provide an integrated Grid solution for workflow-centric, complex applications with a focus on data, semantics, and knowledge:
- Provide decision support services for risk assessment, toxicity prediction, and drug design
- End user focus: ease of use, domain-specific tools, a "hidden Grid"
- Based on UNICORE 6

More in sessions 12-13, this afternoon.

Page 41: Session9part1


Commercial usage at T-Systems

Slide courtesy of Alfred Geiger, T-Systems SfR

Page 42: Session9part1


Lessons Learned

- UNICORE has a strong HPC background, but it is not limited to HPC use cases; it can be used in any Grid
- UNICORE is OGSA-conform and WS-RF compliant
- UNICORE is open, extensible, and interoperable
- UNICORE is open source and coded in Java
- UNICORE is used in EU and national projects, European e-infrastructures, National Grid Initiatives (NGIs), commercial environments, etc.
- Documentation, tutorials, mailing lists, community links, software, source code, and more: http://www.unicore.eu