EGEE-III INFSO-RI-222667
Enabling Grids for E-sciencE
www.eu-egee.org
Emidio Giorgio, INFN Catania
ISSGC’09, Nice-Sophia Antipolis, 10.07.2009
Middleware Overview
Friday, 10 July 2009
Overview of gLite middleware, ISSGC 2009
Outline
• General overview
• Security System
– VOMS server
– LCAS LCMAPS
• Information Service
– Berkeley DB Information Index (BDII)
• Workload Management System
– WMS mechanism
– JDL
– Computing Element
– Logging and bookkeeping
• Questions
gLite Middleware overview
EGEE Project and gLite
• Enabling Grids for E-sciencE (EGEE) is a large multi-disciplinary grid infrastructure
– Brings together more than 120 European organisations
– Consists of ~300 sites in 48 countries and more than 68,000 CPUs
– Is available to some 10,000 users 24 hours a day, 7 days a week
– Processes more than 150,000 jobs per day from different scientific domains
• gLite is the middleware powering the EGEE infrastructure and many other related projects
– Is an integrated set of components designed to enable resource sharing among different institutions
– Pulls together contributions from many other projects, including LCG and VDT
– Provides users with a large set of services
Additional Infrastructures: GILDA
• EGEE provides a training infrastructure: GILDA (Grid INFN Laboratory for Dissemination Activities)
– Runs the entire gLite stack
– Used to demonstrate EGEE grid technology
– Supports beginner and expert training courses on gLite
• Adopted by several Grid projects worldwide
• Has its own Certification Authority
• Available to everyone, 365 days a year
• Used in the ISSGC school series
• Since 2007, middleware other than gLite has also been tested on GILDA
gLite in the Grid “ecosystem”
[Diagram: projects and components surrounding gLite, spanning the USA and the EU: EDG, LCG, EGEE, Globus, MyProxy, Condor, VDT, DataTAG, CrossGrid, OSG, SRM, NextGrid, DEISA, GridCC, interactive Grid, and future grids.]
The Middleware structure
• Applications have access both to Higher-Level Grid Services and to Foundation Grid Middleware
• Higher-Level Grid Services are meant to help users build their computing infrastructure, but should not be mandatory
• Foundation Grid Middleware is developed within EGEE
– Must be complete and robust
– Should allow interoperation with other major grid infrastructures
– Should not assume the use of Higher-Level Grid Services
gLite Services Decomposition
[Diagram: gLite services grouped by area]
• Access: API, CLI
• Job Management Services: Computing Element, Workload Management
• Data Services: Metadata Catalog, Storage Element, Data Movement, File & Replica Catalog
• Security Services: Authorization, Authentication
• Information & Monitoring Services: Information & Monitoring, Service Discovery
• Other boxes shown in the diagram: Accounting, Auditing, Job Provenance, Package Manager, Network Monitoring
gLite infrastructure
[Diagram: the gLite infrastructure, divided into Workload Management System (WMS) and Data Management]
Security System
gLite Security
• Authentication is based on an X.509 public key infrastructure (PKI)
– Certificate Authorities (CAs) issue (long-lived) certificates identifying individuals (much like a passport)
– Trust between CAs and sites is established (offline)
– To reduce vulnerability, Grid users are identified by (short-lived) proxies of their certificates
• Proxies can
– Be delegated to a service so that it can act on the user’s behalf
– Include additional attributes (such as VO information via the VO Membership Service, VOMS)
– Be stored in an external proxy store (MyProxy)
– Be renewed (in case they are about to expire)
X.509 Proxy Certificate
• Proxy: GSI extension to X.509 identity certificates
– Signed by the normal end-entity certificate (or by another proxy)
• It enables single sign-on
• It supports some important features:
– Delegation
– Mutual authentication
• It has a limited lifetime (minimizing the risk of “compromised credentials”)
• It is created by the voms-proxy-init command
– Options for voms-proxy-init:
-hours <lifetime of credential>
-bits <length of key>
GRID Security: Components
Users
• Large and dynamic population
• Different accounts at different sites
• Personal and confidential data
• Heterogeneous privileges (roles)
• Desire for Single Sign-On
“Groups”
• “Group” data
• Access patterns
• Membership
Sites
• Heterogeneous resources
• Access patterns
• Local policies
• Membership
[Diagram: Users, “Groups” and Sites arranged around the Grid]
VOMS: concepts
Virtual Organization Membership Service:
– Extends the proxy with info on VO membership, groups and roles
– Fully compatible with GSI
– Each VO has a database containing group membership, role and capability information for each user
– The user contacts the VOMS server requesting their authorization info
– The server sends the authorization info to the client, which includes it in a proxy certificate
[Diagram: the client sends a query (authentication request) to the VOMS server, which checks its AuthDB and, if OK, returns a VOMS AC; the AC is added to the proxy with subject C=IT/O=INFN/L=CNAF/CN=Pinco Palla/CN=proxy.]
FQAN and AC
• VOMS uses the Fully Qualified Attribute Name (FQAN) to express membership and other authorization info
• Group membership, roles and capabilities may be expressed in a format that binds them together
– <group>/Role=[<role>][/Capability=<capability>]
• FQANs are included in an Attribute Certificate (AC)
• Attribute Certificates are used to bind a set of attributes (such as membership, roles, authorization info, etc.) to an identity
• ACs are digitally signed
• VOMS uses an AC to include the attributes of a user in a proxy certificate
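As an illustration of the FQAN format, here is a minimal Python sketch that splits an FQAN string into its group, role and capability parts. The `FQAN` class and `parse_fqan` function are hypothetical names, not part of VOMS, and treating the value `NULL` as "not set" is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FQAN:
    group: str                 # e.g. "/gilda/grelc"
    role: Optional[str]        # None when absent or "NULL" (assumption)
    capability: Optional[str]  # None when absent or "NULL" (assumption)


def parse_fqan(text: str) -> FQAN:
    """Split '<group>/Role=<role>/Capability=<capability>' into its parts."""
    group_parts = []
    role = None
    capability = None
    for part in text.strip("/").split("/"):
        if part.startswith("Role="):
            value = part[len("Role="):]
            role = None if value in ("", "NULL") else value
        elif part.startswith("Capability="):
            value = part[len("Capability="):]
            capability = None if value in ("", "NULL") else value
        else:
            # Anything that is not a Role or Capability is a (sub)group name.
            group_parts.append(part)
    return FQAN("/" + "/".join(group_parts), role, capability)
```

For example, `parse_fqan("/gilda/grelc/Role=NULL/Capability=NULL")` yields the group `/gilda/grelc` with no role and no capability.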
VOMS Certificate
• The AC is included by the client in a well-defined, non-critical extension, ensuring compatibility with GT-based mechanisms
asli@levrek:~$ voms-proxy-init --voms gilda
Your identity: /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta/[email protected]
Enter GRID pass phrase:
Creating temporary proxy .................................... Done
Contacting voms.ct.infn.it:15001 [/C=IT/O=INFN/OU=Host/L=Catania/CN=voms.ct.infn.it] "gilda" Done
Creating proxy .................................. Done
Your proxy is valid until Tue Jun 26 03:16
asli@levrek:~$
VOMS Certificate
asli@levrek:~$ voms-proxy-info -all
subject : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta/CN=proxy
issuer : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta
identity : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta
type : proxy
strength : 512 bits
path : /tmp/x509up_u18948
timeleft : 11:57:20
=== VO gilda extension information ===
VO : gilda
subject : /C=IT/O=GILDA/OU=Personal Certificate/L=INFN/CN=Marco Fargetta
issuer : /C=IT/O=INFN/OU=Host/L=Catania/CN=voms.ct.infn.it
attribute : /gilda/Role=NULL/Capability=NULL
attribute : /gilda/grelc/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/grelc.unile.it/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/grelc.unile.it/sakila/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/grelc02.unile.it/Role=NULL/Capability=NULL
attribute : /gilda/grelc/das/grelc02.unile.it/sakila/Role=NULL/Capability=NULL
timeleft : 11:57:48
asli@levrek:~$
(The attribute lines carry the user's VOMS attributes.)
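Output like the `voms-proxy-info -all` listing can be post-processed by a script, for instance to check how long the proxy is still valid. The sketch below assumes the `key : value` layout shown on the slide; `parse_proxy_info` and `seconds_left` are hypothetical helper names, not part of the VOMS tools.

```python
def parse_proxy_info(output: str) -> dict:
    """Collect 'key : value' lines; every key maps to a list of values,
    because keys such as 'attribute' and 'timeleft' appear more than once."""
    info = {}
    for line in output.splitlines():
        # Skip section banners ("=== VO ... ===") and lines without a separator.
        if line.startswith("===") or " : " not in line:
            continue
        key, value = line.split(" : ", 1)
        info.setdefault(key.strip(), []).append(value.strip())
    return info


def seconds_left(timeleft: str) -> int:
    """Convert an 'HH:MM:SS' timeleft value into seconds."""
    hours, minutes, seconds = (int(x) for x in timeleft.split(":"))
    return hours * 3600 + minutes * 60 + seconds
```

A wrapper script could, for example, request a new proxy when `seconds_left` drops below the expected run time of a job.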
LCAS & LCMAPS
• At resource level, authorization info is extracted from the proxy and processed by LCAS and LCMAPS
• Local Centre Authorization Service (LCAS)
– Checks if the user is authorized
– Checks if the user is banned at the site
• Local Credential Mapping Service (LCMAPS)
– Maps remote credentials to local credentials (e.g. different UNIX uid/gid)
– Also maps VOMS groups and roles (full support of FQANs), which enables privilege separation
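The two-step decision (authorize first, then map to a local account) can be sketched in Python as follows. This is only a conceptual illustration: the policy tables, account names and function names are invented for the example and do not reflect the real LCAS/LCMAPS plugin configuration.

```python
from typing import Optional, Tuple

# Hypothetical site policy data, invented for this sketch.
BANNED_DNS = {"/C=IT/O=Example/CN=Banned User"}
ALLOWED_VOS = {"gilda"}
# Most specific FQAN prefix first: (prefix, local account, local group).
MAPPINGS = [
    ("/gilda/Role=admin", "gildasgm", "gildaadm"),
    ("/gilda", "gilda001", "gilda"),
]


def authorize(dn: str, vo: str) -> bool:
    """LCAS-like check: the VO must be allowed and the user must not be banned."""
    return vo in ALLOWED_VOS and dn not in BANNED_DNS


def map_credentials(fqan: str) -> Optional[Tuple[str, str]]:
    """LCMAPS-like mapping from an FQAN to a local (account, group) pair."""
    for prefix, account, group in MAPPINGS:
        if fqan == prefix or fqan.startswith(prefix + "/"):
            return account, group
    return None  # no mapping rule applies
```

Listing the role-specific rule before the generic VO rule is what gives different local privileges to different roles in the same VO.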
VOMS enabled Grid
• A user can be in multiple VOs
– Rights are aggregated
• A VO can have groups
– Different rights for each group, e.g. different groups of experimentalists
– Groups can be nested
• A VO has roles
– Assigned for specific purposes, e.g. system admin
– Rights apply when the user assumes the role
• The proxy certificate carries the additional attributes
Information Service
• What?
– System to collect information on the state of resources
• Why?
– To discover the resources of the grid and their nature
– To check the health status of resources
– To provide data in order to manage the workload more efficiently
• How?
– Monitoring and publishing fresh data on the state of resources
– Adopting a well-known data model
• Who?
– Users searching for specific resources for their activity
– Workload Management System
– Other monitoring systems
Information Service Systems
• The gLite data model is based on the Grid Laboratory Uniform Environment (GLUE) Schema
• The IS architecture used in gLite is the Berkeley DB Information Index (BDII)
– It was adopted in the LCG middleware as the Information System provider
– It is an evolution of the Globus Monitoring and Discovery System (MDS)
– It is based on Lightweight Directory Access Protocol (LDAP) servers
GLUE Schema
• Describes the Grid resource information stored in the IS
• Independent of the underlying technology
• The current release is mapped onto
– LDAP
– XML
– ClassAd (Condor matchmaking language)
• The entities of the GLUE Schema are organised hierarchically
– They include the concepts of Site, Cluster, Computing Element, Storage Element, and an abstraction of a service
GLUE Schema Structure
[Diagram: hierarchy of GLUE entities linked by one-to-many (1 to *) relations]
• Site: collection of resources owned by a single organisation. Contains info on the location, the administrator, web page and so on
• Service: description of a deployed service
• Cluster: set of heterogeneous resources. Contains info on shared directories
• Sub-Cluster: set of homogeneous resources. Contains the size of the set
• Host: contains details of hardware (features and performance) and software
• Other entities shown: StorageElement, ComputingElement, Job, VOview, State, PolicyInfo
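The one-to-many nesting of Site, Cluster and Sub-Cluster can be pictured with a few Python dataclasses. This is a simplified sketch of the hierarchy only; the field names are illustrative assumptions and are not the attribute names defined by the GLUE Schema.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubCluster:
    name: str
    hosts: int  # size of the homogeneous set of hosts


@dataclass
class Cluster:
    name: str
    subclusters: List[SubCluster] = field(default_factory=list)


@dataclass
class Site:
    name: str
    clusters: List[Cluster] = field(default_factory=list)

    def total_hosts(self) -> int:
        """Sum the sizes of all sub-clusters of all clusters at this site."""
        return sum(sc.hosts for c in self.clusters for sc in c.subclusters)
```

A site with one cluster made of two sub-clusters of 8 and 16 hosts would report `total_hosts() == 24`.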
GRISs, local BDII and BDII
Abbreviations:
BDII: Berkeley DataBase Information Index
GIIS: Grid Index Information Server
GRIS: Grid Resource Information Server
• Each site can run a BDII, which collects the information given by the local BDIIs
• At each site, a *local* BDII collects the information given by the GRISes
• Local GRISes run on CEs and SEs at each site and report dynamic and static information
The IS in gLite
[Diagram: three sites (Site 1, Site 2, Site 3). At each site, local GRISes running on the CEs, SEs and the RB report to a site BDII hosted on a CE; the site BDIIs are in turn queried by the BDIIs labelled BDII-A, BDII-B and BDII-C.]
BDII
• Users and other Grid services (such as the WMS) can query BDIIs to get information about the Grid status.
• Each BDII collects information from the site GIISes (or local BDIIs) defined in a configuration file, which it accesses through a web interface.
• Every two minutes a cron job runs a script that collects information (pull model) from all the GIISes (local BDIIs) listed in the configuration file.
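The pull model can be sketched as a single collection pass over the configured sources. The sketch below is conceptual: `pull_once` is a hypothetical function, each source is represented by a plain Python callable rather than a real LDAP query, and skipping unreachable sources is a design assumption, not documented BDII behaviour.

```python
from typing import Callable, Dict, List

Entry = Dict[str, str]


def pull_once(sources: Dict[str, Callable[[], List[Entry]]]) -> Dict[str, List[Entry]]:
    """Query every configured source once and return a snapshot per source."""
    snapshot = {}
    for name, fetch in sources.items():
        try:
            snapshot[name] = fetch()
        except Exception:
            # Assumption: an unreachable site must not block the others,
            # so it is simply left out of this snapshot.
            continue
    return snapshot
```

A scheduler (cron in the slide) would call `pull_once` every two minutes and publish the resulting snapshot.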
Summary
• The security system of gLite is based on X.509 certificates
– Users are identified by certificates
– VOMS servers link users to VOs, groups and roles by adding attributes to the proxy certificate
– LCAS and LCMAPS control local access to the resources by checking the user certificates
• The Information System provided by gLite is the BDII
– The information is organised following the GLUE Schema
– The current implementation uses only BDIIs to check the state of the resources
– The user can contact the top BDII in the hierarchy to get information on all the resources
gLite Workload Management System
Outline
• gLite Overview
• Workload Management System
– WMS Architecture
– Job state machine
– Job Description Language Overview
• Security overview
gLite services
[Diagram: the User Interface, Workload Management, Logging & Bookkeeping, Information System, File and Replica Catalogs, Authorization Service, and the Computing Element and Storage Element at Site X, connected by arrows labelled submit, query, retrieve, publish state, update credential and discover services.]
WMS Objectives
• The Workload Management System (WMS) comprises a set of Grid middleware components responsible for the distribution and management of tasks across Grid resources.
• The purpose of the Workload Manager (WM) is to accept and satisfy requests for job management coming from its clients
– The meaning of a submission request is to pass the responsibility for the job to the WM
– The WM will pass the job to an appropriate CE for execution, taking into account the requirements and preferences expressed in the job description file
• The decision on which resource should be used is the outcome of a matchmaking process.
WMS Architecture
[Diagram: WMS components, annotated one at a time]
• Job management requests (submission, cancellation) are expressed via a Job Description Language (JDL)
• Submission requests are kept for a while if no resources are immediately available
• An appropriate CE is found for each submission request, taking into account job requests and preferences, Grid status, and utilization policies on resources
• A repository of resource information is available to the matchmaker; it is updated via notifications and/or active polling on resources
• A dedicated component performs the actual job submission and monitoring
Job Description Language
• In gLite, the Job Description Language (JDL) is used to describe jobs for execution on the Grid.
• The JDL adopted within the gLite middleware is based on Condor’s CLASSified ADvertisement language (ClassAd)
– A ClassAd is a record-like structure composed of a finite number of attributes separated by semicolons (;)
– A ClassAd is highly flexible and can be used to represent arbitrary services
• The JDL is used in gLite to specify the job’s characteristics and constraints, which are used during the matchmaking process to select the best resources that satisfy the job’s requirements.
Job Description Language (cont.)
• The JDL syntax consists of statements like:
Attribute = value;
• Comments must be preceded by a hash character (#) or follow the C++ comment syntax
• WARNING: the JDL is sensitive to blank characters and tabs. No blank characters or tabs should follow the semicolon at the end of a line.
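To make the statement syntax concrete, here is a minimal Python sketch that reads flat `Attribute = value;` statements, one per line, and skips `#` and `//` comments. It is not the real JDL/ClassAd parser: `parse_flat_jdl` is a hypothetical helper, it does not handle nested `[ ... ]` records or multi-line values, and unlike the real tools it tolerates trailing blanks.

```python
def parse_flat_jdl(text: str) -> dict:
    """Parse one 'Attribute = value;' statement per line into a dict of raw values."""
    attributes = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip empty lines and comment lines (# or C++-style //).
        if not line or line.startswith("#") or line.startswith("//"):
            continue
        if not line.endswith(";"):
            raise ValueError(f"missing ';' in: {raw!r}")
        # Split on the first '=' only, so '==' or '>=' inside the value survive.
        name, sep, value = line[:-1].partition("=")
        if not sep:
            raise ValueError(f"missing '=' in: {raw!r}")
        attributes[name.strip()] = value.strip()
    return attributes
```

Values are returned as raw strings (including their quotes); interpreting them as strings, lists or expressions is left out of this sketch.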
JDL: an example
Type = "Job";
JobType = "Normal";
Executable = "startGen4.sh";
Environment = {"CLASSPATH=./gfal.jar:./gint.jar","LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH","LCG_GFAL_VO=gilda","LCG_RFIO_TYPE=dpm"};
Arguments = " 0 0 10 4 10000 aliserv6.ct.infn.it lfn:/grid/gilda/valeria/2000pillar.dat /gilda/issgc07/";
StdOutput = "sample.out";
StdError = "sample.err";
InputSandbox = {"startGen4.sh","gint.jar","gfal.jar","libGFalFile.so"};
OutputSandbox = {"sample.err","sample.out","res.txt"};
Requirements = Member("GLITE-3_1_0",other.GlueHostApplicationSoftwareRunTimeEnvironment);
• Executable indicates which file will be executed remotely
• Environment specifies environment variables that will be set at run time
• Arguments is a string passed to Executable as its arguments
• StdOutput is the remote file to which standard output will be redirected
• StdError is the remote file to which standard error will be redirected
• InputSandbox defines a set of local files to be staged remotely for execution
• OutputSandbox defines a set of remote files to be retrieved after execution
• Requirements specifies a set of characteristics (hardware or software) that the resource must have
Other relevant JDL attributes
• If your job needs a file stored somewhere, you can specify its LFN:
DataRequirements = {
  [
    InputData = {"lfn:/grid/gilda/emidio/test.txt"};
    DataCatalogType = "DLI";
    DataCatalog = "http://lfc-gilda.ct.infn.it:8085";
  ]
};
DataAccessProtocol = {"rfio","gsiftp"};
• The file will not be copied; instead, your job will be scheduled to a CE near the SE holding that file
• That is crucial when dealing with large files
Other relevant JDL attributes
• Rank: overrides the UI’s default fitness function by which resources are ranked
Rank = ( other.GlueCEStateWaitingJobs == 0 ? other.GlueCEStateFreeCPUs : -other.GlueCEStateWaitingJobs);
Rank = ( other.GlueCEInfoTotalCPUs);
• RetryCount: overrides the default number of times a job will be resubmitted after the first failure
RetryCount = 7;
• Requirements: any of the wide set of attributes published by the BDII can be required. Regular expressions can also be used, and conditions can be combined with logical operators ( ||, &&, ! )
Requirements = (RegExp("pd.infn.it",other.GlueCEUniqueID));
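The effect of Requirements and Rank on matchmaking can be imitated in a few lines of Python: Requirements filters the candidate CEs, Rank orders the survivors. This is only a sketch of the idea, not the WMS matchmaker; the functions are hypothetical, and each CE is a plain dict keyed by the GLUE attribute names used in the examples.

```python
import re
from typing import Dict, List

CE = Dict[str, object]


def requirements(ce: CE) -> bool:
    # Mirrors: Requirements = (RegExp("pd.infn.it", other.GlueCEUniqueID));
    return re.search("pd.infn.it", str(ce["GlueCEUniqueID"])) is not None


def rank(ce: CE) -> int:
    # Mirrors: WaitingJobs == 0 ? FreeCPUs : -WaitingJobs
    waiting = int(ce["GlueCEStateWaitingJobs"])
    return int(ce["GlueCEStateFreeCPUs"]) if waiting == 0 else -waiting


def match(ces: List[CE]) -> List[CE]:
    """Keep the CEs that satisfy the requirements, highest rank first."""
    return sorted((ce for ce in ces if requirements(ce)), key=rank, reverse=True)
```

With this Rank expression, an idle CE with free CPUs always ranks above a CE with a queue, and among busy CEs the one with the shortest queue wins.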
Workflows of jobs
• With a single request, multiple jobs can be generated and executed
• A Directed Acyclic Graph (DAG) is a set of jobs where the input, output, or execution of one or more jobs depends on one or more other jobs
• A Collection is a group of jobs with no dependencies
– basically a collection of JDLs
• A Parametric job is a job having one or more attributes in the JDL that vary their values according to parameters
• Using compound jobs it is possible to submit a (possibly very large, up to thousands) group of jobs in one shot
– Submission time reduction
  • Single call to the WMProxy server / single authentication and authorization process
  • Sharing of files between jobs
– Availability of both a single Job Id to manage the group as a whole and a Job Id for each single job in the group
[Diagram: example DAG with nodes nodeA, nodeB, nodeC, nodeD and nodeE]
DAG example
[
  Type = "dag";
  InputSandbox = {"son.sh"};
  nodes = [
    son1 = [
      description = [
        JobType = "Normal";
        Executable = "/bin/sh";
        InputSandbox = {root.InputSandbox};
        Arguments = "son.sh 1";
        StdOutput = "son1.output";
        StdError = "son1.error";
        OutputSandbox = {"final1.input","son1.output","son1.error"};
      ];
    ];
    final = [
      description = [
        JobType = "Normal";
        Executable = "/bin/sh";
        InputSandbox = {"final.sh", root.nodes.son1.description.OutputSandbox[0]};
        Arguments = "final.sh";
        StdOutput = "dag.out";
        StdError = "dag.err";
        OutputSandbox = {"dag.out","dag.err"};
      ];
    ];
    dependencies = { {son1,final} };
  ];
]
Single submission, single job id
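The `dependencies` list determines the order in which the nodes may run. As a sketch of that idea (not of how the WMS schedules DAGs internally), the pairs can be fed to Python's standard `graphlib` module to obtain one valid execution order; `execution_order` is a hypothetical helper and requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def execution_order(dependencies):
    """dependencies: iterable of (parent, child) pairs, where the child
    can only start after the parent has finished."""
    sorter = TopologicalSorter()
    for parent, child in dependencies:
        sorter.add(child, parent)  # declare: child depends on parent
    # static_order() yields every node after all of its predecessors.
    return list(sorter.static_order())
```

For the example above, `execution_order([("son1", "final")])` returns `["son1", "final"]`; a cyclic dependency list would raise `graphlib.CycleError`, which matches the "acyclic" requirement.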
[issgc59@issgc-ui ~]$ glite-wms-job-submit -d emidio -o jobid-file sfk-explorer.jdl
Connecting to the service https://gilda-wms-01.ct.infn.it:7443/glite_wms_wmproxy_server
====================== glite-wms-job-submit Success ======================
The job has been successfully submitted to the WMProxy
Your job identifier is:
https://gilda-lb-01.ct.infn.it:9000/4OaQng0PdA1nZJZHMcilqA
The job identifier has been saved in the following file:
/home/issgc59/jid
=====================================================================
Jobs State Machine (1/9)
venerdì 10 luglio 2009
Enabling Grids for E-sciencE
EGEE-III INFSO-RI-222667 Overview of gLite middleware, ISSGC 2009 43
Jobs State Machine (1/9)
Submitted job is entered by the user to the User Interface but not yet transferred to Network Server for processing
Jobs State Machine (2/9)
Waiting: the job has been accepted by the WMS and is waiting for Workload Manager processing, or is being processed by WMHelper modules.
Jobs State Machine (3/9)
Ready: the job has been processed by the WM but not yet transferred to the CE (local batch system queue).
Jobs State Machine (4/9)
Scheduled: the job is waiting in the queue on the CE.
Jobs State Machine (5/9)
Running: the job is running.
Jobs State Machine (6/9)
Done: the job has exited, or is considered to be in a terminal state by CondorC (e.g., submission to the CE has failed in an unrecoverable way).
Jobs State Machine (7/9)
Aborted: job processing was aborted by the WMS (e.g., the job waited in the WM queue or on the CE for too long, or the user credentials expired).
Jobs State Machine (8/9)
Cancelled: the job has been successfully cancelled on user request.
Jobs State Machine (9/9)
Cleared: the output sandbox was transferred to the user or removed because of a timeout.
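The nine states can be summarised as a small state machine. The sketch below is a simplification under stated assumptions: the transition table (a linear path from Submitted to Cleared, with Aborted and Cancelled reachable from any state before Done) is an illustration inferred from the state descriptions, not the complete set of transitions used by the middleware. The names `HAPPY_PATH`, `can_move` and `is_final` are hypothetical.

```python
from enum import Enum


class JobState(Enum):
    SUBMITTED = "Submitted"
    WAITING = "Waiting"
    READY = "Ready"
    SCHEDULED = "Scheduled"
    RUNNING = "Running"
    DONE = "Done"
    ABORTED = "Aborted"
    CANCELLED = "Cancelled"
    CLEARED = "Cleared"


# ASSUMPTION: the usual successful path through the states, in order.
HAPPY_PATH = [
    JobState.SUBMITTED, JobState.WAITING, JobState.READY,
    JobState.SCHEDULED, JobState.RUNNING, JobState.DONE, JobState.CLEARED,
]

# Build the (simplified) transition table.
TRANSITIONS = {state: set() for state in JobState}
for current, following in zip(HAPPY_PATH, HAPPY_PATH[1:]):
    TRANSITIONS[current].add(following)
for state in HAPPY_PATH[:5]:  # every state before Done
    # ASSUMPTION: a job can be aborted or cancelled at any point before Done.
    TRANSITIONS[state] |= {JobState.ABORTED, JobState.CANCELLED}


def can_move(current, following):
    """True if the simplified table allows current -> following."""
    return following in TRANSITIONS[current]


def is_final(state):
    """True if no transition leaves this state in the simplified table."""
    return not TRANSITIONS[state]


print(can_move(JobState.READY, JobState.SCHEDULED))
print([s.value for s in JobState if is_final(s)])
```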
Logging and Bookkeeping
• Every step of the job life cycle is logged on a service called Logging and Bookkeeping (LB)
• It is useful for users who want to know the status of their jobs
– When a job is submitted, the UI logs it on the LB
– As a result of submission, a job identifier is returned
– The WMS logs each step of scheduling
– The CE logs when it receives a job (scheduled), when the job is running and when it is done
– Users can query the LB for the job status by providing the job id
• Asynchronous updates
Example job identifier: https://gilda-lb-01.ct.infn.it:9000/fw4Ua8b_7Z8Vd8oJC74NCw
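The logging pattern described for the LB (several components append events for a job id, and users ask for the latest status) can be sketched in a few lines of Python. This is a toy in-memory model, not the LB service or its API; the class and method names are hypothetical, and only the job identifier is taken from the slide.

```python
from collections import defaultdict


class LoggingAndBookkeeping:
    """Toy in-memory model: one ordered event list per job id."""

    def __init__(self):
        self._events = defaultdict(list)

    def log(self, job_id, source, state):
        """Record that component `source` saw the job enter `state`."""
        self._events[job_id].append((source, state))

    def status(self, job_id):
        """Return the most recently logged state, or None if unknown."""
        events = self._events.get(job_id)
        return events[-1][1] if events else None

    def history(self, job_id):
        """Return a copy of all events logged for the job."""
        return list(self._events.get(job_id, []))


lb = LoggingAndBookkeeping()
job_id = "https://gilda-lb-01.ct.infn.it:9000/fw4Ua8b_7Z8Vd8oJC74NCw"

lb.log(job_id, "UI", "Submitted")    # the UI logs the submission
lb.log(job_id, "WMS", "Waiting")     # the WMS logs each scheduling step
lb.log(job_id, "WMS", "Ready")
lb.log(job_id, "CE", "Scheduled")    # the CE logs receipt, start and end
lb.log(job_id, "CE", "Running")
lb.log(job_id, "CE", "Done")

print(lb.status(job_id))
```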
The Computing Element
• The CE is the front-end machine (master node) to a local batch system
– Supported batch systems are PBS (Torque/MAUI), LSF and Condor
• The WMS “pushes” job execution requests to the CE using Condor-G
– When a CE receives a job, the job is placed in a queue
– The job is then executed on the first available of its Worker Nodes (where the batch system clients run)
– When execution is complete, output files are copied to the CE using scp
• If the job is successfully executed, output files are copied back to the WMS using globus-url-copy
• By querying the LB, users know when a job is done and can retrieve the output
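The queue-then-dispatch behaviour of the CE can be illustrated with a short Python sketch. It assumes a plain first-in, first-out queue and ignores scheduling policies, file transfers and failures, all of which a real batch system handles; the class `ComputingElement` and the job and node names are hypothetical.

```python
from collections import deque


class ComputingElement:
    """Toy model of a CE: a FIFO job queue in front of a set of worker nodes."""

    def __init__(self, worker_nodes):
        self.queue = deque()                    # jobs waiting on the CE
        self.free_nodes = deque(worker_nodes)   # idle worker nodes
        self.running = {}                       # job -> worker node

    def receive(self, job):
        """A job pushed by the WMS is placed in the queue."""
        self.queue.append(job)

    def dispatch(self):
        """Start queued jobs while worker nodes are free; return those started."""
        started = []
        while self.queue and self.free_nodes:
            job = self.queue.popleft()
            node = self.free_nodes.popleft()
            self.running[job] = node
            started.append((job, node))
        return started

    def finish(self, job):
        """Mark a job as done and free its worker node."""
        self.free_nodes.append(self.running.pop(job))


ce = ComputingElement(["wn1", "wn2"])
for name in ("job-a", "job-b", "job-c"):
    ce.receive(name)

first = ce.dispatch()    # two nodes, so two jobs start; job-c stays queued
ce.finish("job-a")       # wn1 becomes free again
second = ce.dispatch()   # job-c starts on the freed node
print(first, second)
```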
Summary
• The WMS receives users’ requests for job execution
• Requests are expressed through JDL
– JDL allows users to specify requirements that the selected resources must satisfy
• The WMS processes the request and chooses (matchmaking) a Computing Element for the actual execution
– The status of resources is known to the WMS through queries to the BDII
• The CE tries to execute the job and copies the output files back to the WMS
– The status of execution is logged on the LB
• Users query the LB, discover that their job is done and download the output files from the WMS
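The matchmaking step in the summary amounts to filtering the known resources by the job's requirements and then picking the best remaining one. The Python sketch below shows that idea only. The resource records, attribute names (`free_cpus`, `max_cpu_time`) and host names are invented for illustration and do not reproduce the information published by the BDII or the WMS ranking logic.

```python
def matchmake(resources, requirements, rank):
    """Return the best resource that satisfies the requirements, or None."""
    candidates = [ce for ce in resources if requirements(ce)]
    if not candidates:
        return None
    # The highest rank wins, mirroring the "filter, then rank" idea.
    return max(candidates, key=rank)


# ASSUMPTION: a hypothetical snapshot of resource status, as the WMS
# might obtain it from an information system.
resources = [
    {"id": "ce-a.example.org", "free_cpus": 0, "max_cpu_time": 2880},
    {"id": "ce-b.example.org", "free_cpus": 4, "max_cpu_time": 600},
    {"id": "ce-c.example.org", "free_cpus": 12, "max_cpu_time": 1440},
    {"id": "ce-d.example.org", "free_cpus": 7, "max_cpu_time": 1440},
]

# The job needs at least one free CPU and 720 minutes of CPU time.
chosen = matchmake(
    resources,
    requirements=lambda ce: ce["free_cpus"] > 0 and ce["max_cpu_time"] >= 720,
    rank=lambda ce: ce["free_cpus"],
)
print(chosen["id"])
```

In JDL the same two roles are played by the requirements the user specifies and by a ranking expression; here they are ordinary Python callables.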
References
• gLite
– http://www.glite.org
• GILDA Infrastructure
– https://gilda.ct.infn.it/
• VOMS
– http://infnforge.cnaf.infn.it/projects/voms
• GGF Security
– http://www.gridforum.org/security
• GLUE Schema
– http://glueschema.forge.cnaf.infn.it/
• EGEE
– http://www.eu-egee.org
www.glite.org
Questions?