DIRAC Review (12th December 2005) Stuart K. Paterson 1

DIRAC Review

Workload Management System

Contents

- Introduction & Overview
- The Life Cycle of a DIRAC Job
- Central Services & Interactions
- Distributed Workload Management
- WMS Strategies
- Outlook & Improvements

Introduction

- The WMS is a key component of DIRAC
  - Realizes the PULL mechanism
  - Community Overlaying Grid System (COGS) paradigm
- WMS services:
  - Rely on a MySQL Job Database
  - XML-RPC protocol used for client-service access
  - Jabber is used for communication between services
  - Condor ClassAd for job JDL

The Life of a DIRAC Job

- Consider a typical DaVinci analysis job, since other use cases involve a subset of its steps
- Job is submitted to the WMS via the DIRAC API
  - See tomorrow’s presentation

[Figure: User, GANGA, DIRAC API and DIRAC, with the calls submit(), status() and getOutput()]

WMS Overview

[Figure: WMS overview diagram. Components: JobReceiver, DataOptimiser, Matcher, JobDB, TaskQueue, LFC, AgentDirector, AgentMonitor, LCG WMS, Pilot Agent (shown twice), Computing Resource]

Secure Job Receiver

- Currently the only secure service
- Assigns a job ID
- Saves the job in the JobDB
- Also uploads and saves the proxy of the user
- Notifies an Optimiser (Data or FIFO, depending on the requirements of the job)
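
The steps on this slide can be sketched in a few lines of Python. This is only an illustration: the class, its attributes and the rule "jobs with input data go to the Data Optimiser" are assumptions, not the DIRAC implementation. The initial ‘ready’ status follows the JobDB Interface slide.

```python
import itertools


class JobReceiver:
    """Toy stand-in for the Job Receiver; every name here is hypothetical."""

    def __init__(self):
        self._ids = itertools.count(1)  # source of new job IDs
        self.job_db = {}                # job ID -> stored job record
        self.proxies = {}               # job ID -> saved user proxy
        self.notified = []              # (optimiser name, job ID) notifications

    def submit(self, jdl, proxy):
        job_id = next(self._ids)
        self.job_db[job_id] = {"jdl": jdl, "status": "ready"}
        self.proxies[job_id] = proxy
        # Assumed routing rule: input data -> Data Optimiser, otherwise FIFO.
        optimiser = "Data" if jdl.get("InputData") else "FIFO"
        self.notified.append((optimiser, job_id))
        return job_id
```

Keeping the notification separate from the database write mirrors the slide: the receiver only records the job and hands it on, it does no scheduling itself.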

Input/Output Sandbox Services

- At present a MySQL DB is used for storing the I/O sandboxes
  - Very fast and efficient
  - No problems observed for ‘small’ files (<10 MB)
  - Limit on DB size of 4 GB
- Proposal is to move to Grid storage for this, but:
  - This will be slower
  - Extra dependency on LFC
- Will be implemented and tested
  - Final decision will be made based on performance

Input/Output Sandbox Mechanism

Job Database

- MySQL DB, but this is not accessed directly
  - Accessed through the JobDB class
- Contains full information about all the jobs
  - Job description and status
  - Primary job parameters
    - Common for all jobs (e.g. Owner), access optimized
  - Extra job parameters
    - Arbitrary key/value pairs
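
One way to picture the split between primary and extra parameters is a fixed, indexed table plus a key/value table. The sketch below uses SQLite in place of MySQL so that it runs without a server. The table, column and function names are assumptions made for illustration, not the actual JobDB schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # SQLite stands in for MySQL here
conn.executescript("""
    CREATE TABLE Jobs (               -- primary parameters, common to all jobs
        JobID  INTEGER PRIMARY KEY,
        Owner  TEXT NOT NULL,
        Status TEXT NOT NULL
    );
    CREATE INDEX JobsByOwner ON Jobs (Owner);  -- access optimized
    CREATE TABLE JobParameters (      -- extra parameters, arbitrary key/value
        JobID INTEGER NOT NULL REFERENCES Jobs (JobID),
        Name  TEXT NOT NULL,
        Value TEXT NOT NULL,
        PRIMARY KEY (JobID, Name)
    );
""")


def add_job(owner, **extra):
    """High-level 'add job' operation: one primary row plus any extras."""
    cur = conn.execute(
        "INSERT INTO Jobs (Owner, Status) VALUES (?, 'ready')", (owner,)
    )
    job_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO JobParameters VALUES (?, ?, ?)",
        [(job_id, name, str(value)) for name, value in extra.items()],
    )
    return job_id


def jobs_of(owner):
    """Bulk query on a primary parameter; served by the Owner index."""
    rows = conn.execute(
        "SELECT JobID FROM Jobs WHERE Owner = ? ORDER BY JobID", (owner,)
    )
    return [row[0] for row in rows]
```

Callers go through `add_job` and `jobs_of` rather than writing SQL, which is the point of wrapping the database in a class.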

JobDB Interface

- Marks job status as ‘ready’ when added
- Not only a thin layer on top of SQL statements
  - Performs high-level operations
    - Adding jobs
    - Removing jobs
  - Provides bulk queries (e.g. for job monitoring)
- Scalability issues
  - Test system run with up to 15000 (production and analysis) jobs without automatic cleaning; no problems so far…

WMS Overview

[Figure: WMS overview diagram with JobReceiver and JobDB marked]

Data Optimizer

- Instantiates File Catalog client (LFC)
- Retrieves requirements of the job from the JobDB
  - Uses Condor ClassAd and MySQL
- Sets job status to ‘waitingdata’
- Checks LFC for input data files and determines suitable SEs
  - Job fails meaningfully if the data is not available
  - Read-only LFC could speed up this process
- Sets job status to ‘waiting’ ‘PilotAgent Submission’ if data is available
  - Inserts job into Task Queue
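
The flow above can be sketched as a single function. The catalogue is modelled as a plain dictionary from LFN to the set of SEs holding a replica. The function name, the ‘failed’ status label and the rule that a suitable SE must hold every input file are assumptions of this sketch, not confirmed DIRAC behaviour.

```python
def optimise(job, catalogue, task_queue):
    """Check input data and queue the job; all names here are hypothetical."""
    job["status"] = "waitingdata"
    suitable = None
    for lfn in job["input_data"]:
        replicas = set(catalogue.get(lfn, ()))
        # Assumption: a suitable SE holds every input file seen so far.
        suitable = replicas if suitable is None else suitable & replicas
        if not suitable:
            job["status"] = "failed"
            job["reason"] = f"no suitable SE for input data up to {lfn}"
            return False
    job["suitable_ses"] = sorted(suitable or ())
    job["status"] = ("waiting", "PilotAgent Submission")
    task_queue.append(job)
    return True
```

Recording a reason alongside the failure is one reading of "fails meaningfully": the user sees which file blocked the job.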

Data Optimizer - Improvements

- Currently uses a proxy from the process owner’s certificate to access LFC
  - Could use user proxy or server certificate
  - Read-only LFC would solve this issue
- Optimizer currently checks all input files for each job
  - Can move to directories of datasets in the future
- Need to optimize LFC interaction for bulk queries
  - In cooperation with LFC developers

Task Queue

- Deliberately have many task queues
  - 1 Task Queue per set of requirements
  - Drastically reduces the matching time
- Works OK for production jobs
- For analysis jobs with many varied requirements this remains to be seen
  - ‘Double matching’ helps with this (see later)
- Too many queues can cause problems
  - Hierarchical organisation of queues with respect to requirements should improve matching
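
A minimal sketch of why one queue per requirement set reduces matching time: a resource is compared once per queue rather than once per job. The class and method names are hypothetical, requirements are simplified to exact key/value matches, and the site names other than LCG.CNAF.it are illustrative.

```python
from collections import defaultdict, deque


class TaskQueues:
    """One FIFO queue per distinct set of requirements (illustrative only)."""

    def __init__(self):
        self._queues = defaultdict(deque)

    @staticmethod
    def _key(requirements):
        # Equal requirement sets map to the same hashable key.
        return frozenset(requirements.items())

    def insert(self, job_id, requirements):
        self._queues[self._key(requirements)].append(job_id)

    def match(self, resource):
        """Return the first job from a queue whose requirements the resource meets."""
        for key, queue in self._queues.items():
            if queue and all(resource.get(name) == value for name, value in key):
                return queue.popleft()
        return None

    def __len__(self):
        # Number of non-empty queues, i.e. comparisons needed per match.
        return sum(1 for queue in self._queues.values() if queue)
```

With 1000 production jobs sharing one requirement set, `match` inspects one queue instead of 1000 jobs. The same structure shows the risk named on the slide: if every analysis job has unique requirements, the number of queues approaches the number of jobs.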

WMS Overview

[Figure: WMS overview diagram with JobReceiver, JobDB, DataOptimiser, TaskQueue and LFC marked]

Agent Director (1)

- Agent Director is an API for Pilot Agent submission to LCG
  - Sets up the proxy of the user for submission to LCG
- Monitors jobs in ‘waiting’ ‘PilotAgent Submission’ status
  - Polls frequently
- After submission the job enters ‘waiting’ ‘PilotAgent Response’
  - Since the job remains in the ‘waiting’ state, it can be picked up at any time by existing Agents from the same user

Agent Director (2)

- Currently only used for ‘user’ jobs
  - Move to AD for production
- Can submit agents for each job in the Task Queue
  - Currently Pilot Agents are submitted by cron job independently of the Task Queue state
- Can also potentially have agents working in filling mode (some infinite queues)
- Could be used for submission to other Grids…

Agent Monitoring Service

- Checks Pilot Agents of jobs in ‘waiting’ state
  - Currently every 5 mins
- Monitoring of Pilot Agents on LCG allows:
  - Catch jobs spending too much time in the ‘Waiting’ state
  - Quickly spot the ‘Aborted’ status problem
    - Aborted agents are tracked and can be accounted
  - Can also spot the problem of Pilot Agents being submitted to batch queues
- Assigns ‘waiting’ ‘Proxy Expired’ state
- Can flag jobs for Agent Director to submit further Pilot Agents as necessary

Future Developments of AD and AM Services

- Ensure Pilot Agents are submitted to different LCG sites if the job requirements permit it
  - Not immediately obvious how to accomplish this
  - Would need direct submission to LCG CE
- Make use of MyProxy Server
  - Automatically renew user proxies before submission when necessary
  - Pipe long-life proxy with jobs?
  - Run proxy monitor as a service which can deliver renewed user proxy?

WMS Overview

[Figure: WMS overview diagram with JobReceiver, JobDB, DataOptimiser, TaskQueue, LFC, AgentMonitor and AgentDirector marked]

Job State Machine up to this point

Jobs may be picked up as soon as they enter the ‘Waiting’ state

Pilot Agent running on LCG WN

- Simple wrapper script sent as LCG job
  - Installs DIRAC
- Runs a standard DIRAC Agent which polls for the particular job from a particular user
  - If not successful, requests any job from that user whose requirements are satisfied by the site it runs on
    - ‘Filling’ mode (see later)
- DIRAC Agent starts the JobAgent module, which performs the job requests

Matcher (1)

- Receives request from Pilot Agent
- Only responds to sites in ‘mask’
  - Contains list of allowed sites
- Checks available jobs in task queue
- Matches requirements of job (e.g. possible SEs) to requirements from Agent (e.g. owner, JobID at site with particular LocalSE)
  - Double match: the agent can put specific requirements on jobs (e.g. job of a particular owner or certain priority level)

Matcher (2)

- Matcher has a ‘semaphore’ mechanism to ensure a job is only picked up once
- Assigns ‘matched’ state, sets the site for the job, logs this info and deletes the job from the task queue
- Sends job to WN
- Doesn’t need to be secure
  - However, must ensure jobs are not picked up in error
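
The two Matcher slides can be combined into a short sketch: a site mask, a two-way ("double") match, and a lock that plays the role of the ‘semaphore’ so that two agents cannot receive the same job. The class, the dictionary keys and the simplified match rules are assumptions for illustration. The real service expresses requirements as Condor ClassAds.

```python
import threading


class Matcher:
    """Illustrative matcher; names and match rules are hypothetical."""

    def __init__(self, site_mask, task_queue):
        self.site_mask = set(site_mask)  # allowed sites
        self.task_queue = list(task_queue)
        self._lock = threading.Lock()    # plays the role of the 'semaphore'

    def request_job(self, agent):
        if agent["site"] not in self.site_mask:
            return None                  # only respond to sites in the mask
        with self._lock:                 # one agent at a time may take a job
            for job in self.task_queue:
                # Job's requirement on the agent: a usable SE at the site.
                job_accepts_agent = agent["local_se"] in job["possible_ses"]
                # Agent's requirement on the job: optional owner constraint.
                agent_accepts_job = agent.get("owner") in (None, job["owner"])
                if job_accepts_agent and agent_accepts_job:
                    self.task_queue.remove(job)
                    job["status"] = "matched"
                    job["site"] = agent["site"]
                    return job
        return None
```

Removing the job from the queue while still holding the lock is what guarantees a job is handed out at most once.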

WMS Workflow

Job State Machine

Job Agent Module Running on WN

- Job Agent requests a job from the WMS and gets the JDL if successful
- Installs any software not available locally
  - Links to any pre-installed software are created local to the job during installation of DIRAC (dirac-install)
- Creates Job Wrapper using information local to the WN
  - Template + job-specific parameters (e.g. job JDL) + site-specific parameters (e.g. software paths)
- Job Agent executes Job Wrapper
  - Local InProcess CE
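
The "template + job-specific parameters + site-specific parameters" recipe can be illustrated with Python's `string.Template`. The template text, the parameter names and the shell-script form of the wrapper are all assumptions made for this sketch. The slides do not show the real template.

```python
from string import Template

# Hypothetical wrapper template; placeholders are filled in on the WN.
WRAPPER_TEMPLATE = Template(
    "#!/bin/sh\n"
    "# Generated job wrapper for job $job_id\n"
    "export SOFTWARE_ROOT=$software_root\n"
    "exec $executable $arguments\n"
)


def create_job_wrapper(job_params, site_params):
    """Merge job-specific and site-specific values into the template."""
    return WRAPPER_TEMPLATE.substitute({**job_params, **site_params})
```

Because `substitute` raises `KeyError` on a missing placeholder, an incomplete site description is caught when the wrapper is created rather than when the job runs.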

Job Agent

[Figure: Job Agent steps: Job Preparation, Installing Application Software, Create Job Wrapper]

WMS Overview

[Figure: WMS overview diagram with JobReceiver, JobDB, DataOptimiser, TaskQueue, LFC, AgentMonitor, AgentDirector, Matcher, Pilot Agent and LCG WMS marked]

WMS Overview

[Figure: WMS overview diagram, detail: Pilot Agent and Computing Resource]

WMS Overview

[Figure: WMS overview diagram, detail: Computing Resource]

Job Wrapper (1)

- Downloads input sandbox
  - Currently InputSandbox is a DIRAC WMS-specific service
  - Can also use generic LFNs
- Provides access to the input data
  - Resolves the input data LFN into a “best replica” PFN for the execution site
  - Generates an appropriate POOL XML slice for the protocol

Input Data Access Strategy

- Attempt to stage input data (multi-threaded, via lcg-gt)
  - Try rfio and dcap
  - Returns TURL for protocol (when it works)
- The returned TURLs currently don’t work inside the applications (also can’t ‘pin’ files yet…)
- Currently use globally constructed TURL from DIRAC Storage Element class
  - Works fine with Gaudi applications
- If this isn’t available, bring datasets local
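
The strategy reads as an ordered fallback chain, which can be sketched generically. The three strategy functions below are stand-ins: the host name and URL layout are placeholders, and none of them calls lcg-gt or the DIRAC Storage Element class.

```python
def resolve_input(lfn, strategies):
    """Try each access strategy in order and return (name, location)."""
    for name, strategy in strategies:
        try:
            location = strategy(lfn)
        except OSError:
            continue  # this strategy failed, fall through to the next one
        if location:
            return name, location
    raise RuntimeError(f"no access strategy worked for {lfn}")


def staged_turl(lfn):
    # Stand-in for staging; here it always fails to return a usable TURL.
    raise OSError("staging did not return a usable TURL")


def constructed_turl(lfn):
    # Stand-in for a globally constructed TURL; host and layout are placeholders.
    return f"rfio://se.example.org{lfn}"


def local_copy(lfn):
    # Stand-in for bringing the dataset local: return a path in the job directory.
    return "./" + lfn.rsplit("/", 1)[-1]
```

Keeping the order in a list makes it easy to promote staging again once the returned TURLs work inside the applications.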

Job Wrapper (2)

- Invokes the job application in a child process
- Runs a watchdog process in parallel with the application:
  - Provides heartbeats for the Job Monitoring Service
  - Collects the application’s CPU and memory consumption to generate average numbers at the end
  - May catch the application in a ‘stalled’ state if no CPU consumption is detected
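
The watchdog logic can be sketched independently of how CPU time is actually measured. In the sketch below the CPU reader and the heartbeat sender are injected as callables. The class name, the `stalled_after` threshold and the method names are assumptions, and memory sampling is left out for brevity.

```python
class Watchdog:
    """Illustrative watchdog: samples cumulative CPU seconds on each tick."""

    def __init__(self, read_cpu_seconds, send_heartbeat, stalled_after=3):
        self.read_cpu_seconds = read_cpu_seconds  # callable returning CPU seconds so far
        self.send_heartbeat = send_heartbeat      # callable taking the latest sample
        self.stalled_after = stalled_after        # intervals without CPU use before 'stalled'
        self.samples = []

    def tick(self):
        """Take one sample, send a heartbeat, report whether the job looks stalled."""
        self.samples.append(self.read_cpu_seconds())
        self.send_heartbeat(self.samples[-1])
        return self.is_stalled()

    def is_stalled(self):
        # CPU time is cumulative, so equal first and last values mean no consumption.
        recent = self.samples[-(self.stalled_after + 1):]
        return len(recent) > self.stalled_after and recent[-1] == recent[0]

    def average_cpu_rate(self, interval):
        """Average CPU seconds used per wall-clock second, given the tick interval."""
        if len(self.samples) < 2:
            return 0.0
        return (self.samples[-1] - self.samples[0]) / (interval * (len(self.samples) - 1))
```

In a real wrapper `tick` would be called on a timer while the child process runs. Injecting the reader keeps the stall rule testable without starting a process.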

Job Wrapper (3)

- May receive messages through a messaging system
  - Jabber messaging was demonstrated
  - Possibility to kill the application gracefully or spy on the application output
  - Not used currently, as security issues should be sorted out first, if at all possible
- Collects the job execution environment and consumption parameters and passes them to the Job Monitoring Service
  - Local job IDs, worker node characteristics and load, total CPU consumption, job timing, etc.

Job Wrapper (4)

- Uploading output sandbox
  - Currently OutputSandbox is a DIRAC WMS-specific service
  - Might be moved to a generic SE implementation soon
- Uploading output data
  - Uploads output data to a predefined SE
    - Default one or user defined
  - Chooses PFN path according to the LHCb conventions
    - May be overridden by user (not recommended)
- Notifies the Job Monitoring Service of the changes in the job state

Job Wrapper Workflow

Overview of State Machine After Job Reaches WN (1)

Overview of State Machine After Job Reaches WN (2)

Overview of State Machine After Job Reaches WN (3)

Overview of State Machine for Failed Jobs

DIRAC Job in Final State

- Once the DIRAC Agent(s) have executed, the Pilot Agent terminates gracefully, freeing the resource
- If successful, the job is in the ‘outputready’ state, which means it is retrievable
- User can request output at any time

Job Logging

Logging is a mixture of primary (Job State) and secondary (App State) job states

JOB STATE                  DATE        TIME      SITE
submission                 2005-12-11  13:40:21  LCG.CNAF.it
ready                      2005-12-11  13:40:23  LCG.CNAF.it
waitingdata                2005-12-11  13:40:24  LCG.CNAF.it
waiting                    2005-12-11  13:40:26  LCG.CNAF.it
matched                    2005-12-11  13:53:09  LCG.CNAF.it
Job received by Agent      2005-12-11  13:53:09  LCG.CNAF.it
Installing Software        2005-12-11  13:53:09  LCG.CNAF.it
Job prepared to submit     2005-12-11  13:55:01  LCG.CNAF.it
scheduled                  2005-12-11  13:55:01  LCG.CNAF.it
queued                     2005-12-11  13:55:01  LCG.CNAF.it
running                    2005-12-11  13:55:03  LCG.CNAF.it
Starting DIRAC job         2005-12-11  13:55:03  LCG.CNAF.it
DIRAC job initialization   2005-12-11  13:55:04  LCG.CNAF.it
Getting Input Data         2005-12-11  13:55:04  LCG.CNAF.it
Starting the application   2005-12-11  13:55:27  LCG.CNAF.it
DaVinci step 1 started     2005-12-11  13:55:30  LCG.CNAF.it
DaVinci execution, step 1  2005-12-11  13:55:38  LCG.CNAF.it
DaVinci, step 1 done       2005-12-11  13:56:54  LCG.CNAF.it
Job finalization           2005-12-11  13:56:54  LCG.CNAF.it
Job finished successfully  2005-12-11  13:56:55  LCG.CNAF.it
done                       2005-12-11  13:56:55  LCG.CNAF.it
outputready                2005-12-11  13:56:56  LCG.CNAF.it
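
The timestamps in this log are enough to work out where the job spent its time. The sketch below copies six of the primary-state entries from the log and computes the intervals between them. The helper name `seconds_between` is hypothetical.

```python
from datetime import datetime

# Primary job states and timestamps copied from the log above.
LOG = [
    ("submission", "2005-12-11 13:40:21"),
    ("waiting", "2005-12-11 13:40:26"),
    ("matched", "2005-12-11 13:53:09"),
    ("running", "2005-12-11 13:55:03"),
    ("done", "2005-12-11 13:56:55"),
    ("outputready", "2005-12-11 13:56:56"),
]


def seconds_between(log, start_state, end_state):
    """Elapsed whole seconds between two logged states."""
    times = {
        state: datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        for state, stamp in log
    }
    return int((times[end_state] - times[start_state]).total_seconds())


print(seconds_between(LOG, "waiting", "matched"))         # 763
print(seconds_between(LOG, "running", "done"))            # 112
print(seconds_between(LOG, "submission", "outputready"))  # 995
```

For this job, 763 of the 995 seconds from submission to ‘outputready’ were spent in the ‘waiting’ state before a Pilot Agent picked the job up, against 112 seconds between ‘running’ and ‘done’.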

Job Monitoring Service

- At all stages the Job Monitoring Service is used as an interface to update status information
  - Changes status in Job DB directly
  - Also updates the Job Logging information
- Two entry points, one for writing, one for reading
  - Move writing to secure service in the future
- Optimized for bulk queries
  - One of the most solicited services
  - Could maintain a separate cache of state information to reduce load, if this becomes necessary in the future

Cleaning Up

- Cleaning Agent is used for production jobs
- Accounting Agent monitors jobs in final state
  - Extracts accounting info and marks job as ‘deleted’
  - Cleaning Agent deletes job on next loop
- For analysis/user jobs, could mark as ‘Accounting Sent’ then mark as ‘Purgeable’
  - To be decided
- New Cleaning Agent can implement policy
  - If output retrieved, hold job for ~1 day more??
  - If output not yet retrieved, hold job for ~1 week??
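
The tentative retention policy can be written as a single predicate. The 1 day and 1 week values are the provisional figures from the slide. The function name and the choice to count the extra day from the retrieval time are assumptions of this sketch.

```python
from datetime import datetime, timedelta

HOLD_AFTER_RETRIEVAL = timedelta(days=1)   # provisional value from the slide
HOLD_IF_NOT_RETRIEVED = timedelta(weeks=1) # provisional value from the slide


def is_purgeable(finished_at, retrieved_at, now):
    """Return True once the job has been held long enough to be purged."""
    if retrieved_at is not None:
        # Output retrieved: hold the job for about one more day.
        return now - retrieved_at >= HOLD_AFTER_RETRIEVAL
    # Output not yet retrieved: hold the job for about a week after it finished.
    return now - finished_at >= HOLD_IF_NOT_RETRIEVED
```

A Cleaning Agent loop would call this for each job in a final state and mark those returning `True` as ‘Purgeable’.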

Overall Job State Machine

Current WMS Pilot Agent Strategy

- Now submit up to 4 Pilot Agents per job (unless all are being ‘Aborted’)
  - These are submitted one at a time with a waiting period of 5 minutes between
- Agent ‘Filling’ mode
  - When a Pilot Agent arrives at a WN, it first requests a particular job from the user. Next, the Pilot Agent will request any job from the same user
    - If the requirements of the job match the site this is successful
  - This should be optimized to make the most of the available resource, e.g. check time left on WN
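
The submission rule (at most 4 Pilot Agents per job, one at a time, 5 minutes apart) can be expressed as a small decision function. The function and constant names are hypothetical, and the special handling of ‘Aborted’ agents mentioned on the slide is deliberately not modelled.

```python
MAX_PILOTS_PER_JOB = 4
SUBMISSION_INTERVAL = 5 * 60  # seconds between submissions for one job


def should_submit_pilot(job_status, submitted_at, now):
    """Decide whether another Pilot Agent is due; times are in seconds."""
    if job_status != "waiting":
        return False  # job has already been picked up by an agent
    if len(submitted_at) >= MAX_PILOTS_PER_JOB:
        return False  # cap reached for this job
    # First pilot goes out immediately, later ones after the waiting period.
    return not submitted_at or now - submitted_at[-1] >= SUBMISSION_INTERVAL
```

Since the job stays in ‘waiting’ while pilots are in flight, the first pilot to reach a worker node wins and the check on `job_status` stops further submissions.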

Possible WMS Strategies (1)

- Simplest strategy is no strategy: 1 Pilot Agent per job
- ‘Filling’ mode
- Multi-threaded Agent infrastructure in place
  - Can run jobs in parallel, can be a huge improvement
  - Especially when mixing jobs of different priority and nature
    - Reading data, downloading etc. can be complementary activities

Possible WMS Strategies (2)

- Picking up jobs with higher priority can be achieved through the ‘double matching’ mechanism
- Running jobs of different members of the same VO by the same Pilot Agent
  - Very promising mechanism to optimize the workload for the LHCb VO as a whole

Outlook & Improvements (1)

- Input/Output Sandbox move to Grid storage
- Job ‘failures’ on LCG will be recovered as much as possible
  - e.g. treatment of stalled jobs
- Ensure Pilot Agents are submitted to different LCG sites if the job requirements permit it
  - To cope with troublesome sites

Outlook & Improvements (2)

- Extended life of user proxies using MyProxy Server or …
- Explore use of Multi-Threaded Agent on the Grid
- Provide at least minimal interactivity with running job
  - Job killing/spying
  - Spotting stalled applications