Resource Manager for Grid with global job queue and with planning based on local schedules

Date post: 09-Feb-2016
Description:
Keldysh Institute of Applied Mathematics Russian Academy of Sciences. Resource Manager for Grid with global job queue and with planning based on local schedules. V.N.Kovalenko, E.I.Kovalenko, D.A.Koryagin, E.Z.Ljubimskii, A.V.Orlov, E.V.Huhlaev - PowerPoint PPT Presentation
Page 1: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Resource Manager for Grid with global job queue and with planning based on local schedules

V.N.Kovalenko, E.I.Kovalenko, D.A.Koryagin, E.Z.Ljubimskii, A.V.Orlov, E.V.Huhlaev

{kvn,kei,koryagin,ljubimsk,ao,huh}@keldysh.ru

Keldysh Institute of Applied Mathematics

Russian Academy of Sciences

Page 2: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Job submitting in Globus system

Job submitting by means of Broker

[Diagram: Broker]

Page 3: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Resource Brokers

GRID Resource Broker (GRB) - HPC lab, University of Lecce, Italy and CACR, California Institute of Technology. http://sara.unile.It/grb/

EZ-Grid - Department of Computer Science, University of Houston. http://www.cs.uh.edu/~ezgrid/

MetaDispatcher - Keldysh Institute of Applied Mathematics, Moscow

Page 4: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Job submitting in Globus system

Job submitting by means of Broker

[Diagram: Broker]

Page 5: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Architecture of MetaDispatcher

[Architecture diagram; labels recovered from the slide: Client; Client utilities (submit, status, cancel, clean, get-output); Metadispatcher; Gatekeeper; Jobmanager-meta; Command interpreter (submit, status, cancel, get-output, clean); JOBS SPOOL; Scheduler (reacts to events); Job monitor; Proxy; Deleg. Proxy; Deleg.-2 Proxy; interception; Loc. copies: executable, stdin; Bufferized stdout, stderr; Request (RSL), status; Start (jobid); Cancel, tail, Cleanup; Target GRAM Job manager; Gatekeeper; MDS: GRIS, GIIS; Proving of Job; Statics; Dynamics.]

Page 6: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Problem of scheduling

The problem of scheduling is solved over two sets: 1) the set of jobs and 2) the set of computing elements.

Scheduling results:
- The dispatch time for each job
- The place where the job should be directed and executed

Page 7: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Two management levels, local and global, each with its own objects: job, queue, and management system (Local Resource Monitor (LRM) and MetaDispatcher, respectively).

[Diagram labels: Global level: MetaDispatcher, global queue (jobs); Local level: LRM, local queue (jobs); Config.; Config. file]

Page 8: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Question 1: In What Order Should the Global Jobs Be Served?

The order in which the scheduler serves the job queue should differ from FIFO.

The user should have management facilities for placing his job at any position in the global queue.

To achieve that:
- A limited budget is allocated to each user.
- Within the budget limits, the user prices his jobs.
- Function GP evaluates the global priority of the job:

GP = GP(price, required resources, run time)

[Diagram: a new job being placed among the jobs of the global queue.]
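The pricing scheme above can be sketched as a priority queue. This is only an illustration under stated assumptions, not the MetaDispatcher code: the slides do not give the form of GP, so the formula used here (price offered per requested CPU-hour) and the names `Job`, `GlobalQueue` and `global_priority` are hypothetical.

```python
import heapq
from dataclasses import dataclass
from itertools import count


@dataclass
class Job:
    name: str
    user: str
    price: float      # amount the user assigns to this job
    cpus: int         # required resources (here: number of CPUs)
    run_time: float   # requested run time, in hours


def global_priority(job):
    """Hypothetical GP: price offered per requested CPU-hour."""
    return job.price / (job.cpus * job.run_time)


class GlobalQueue:
    """Global queue served in order of decreasing GP rather than FIFO."""

    def __init__(self, budgets):
        self.budgets = dict(budgets)   # remaining budget per user
        self._heap = []
        self._seq = count()            # tie-breaker: earlier submission first

    def submit(self, job):
        # A user can only price jobs within the remaining budget.
        if job.price > self.budgets.get(job.user, 0.0):
            raise ValueError(f"{job.user} cannot afford price {job.price}")
        self.budgets[job.user] -= job.price
        # heapq is a min-heap, so the priority is negated.
        heapq.heappush(self._heap, (-global_priority(job), next(self._seq), job))

    def pop(self):
        """Remove and return the job with the highest global priority."""
        return heapq.heappop(self._heap)[2]
```

With equal resource requests, a job priced at 30 is served before a job priced at 10 regardless of submission order, which is how a user can move a job toward the head of the queue by spending more of the budget.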

Page 9: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Question 2: When to Forward a Job to a Target Computing Element?

If the destination of a job is determined at the moment it enters the global queue, and the job is immediately routed to a local queue, it may be delayed there by the arrival of local jobs. At the same time, resources of other computing elements may become free and idle.

The conclusion: It is more reasonable to keep global jobs in the global queue as long as possible, ideally up to the moment of start.

[Diagram: a new job and queues of jobs.]

Page 10: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Question 3: To Which Computing Elements Should a Job Be Passed?

The scheduling model of a computing installation:

A set of resources

Resource description:
- Static attributes: OS type, CPU time, memory volume
- Dynamic attributes: free/busy, resource amount
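The resource model above can be illustrated with a small matching routine. It is a sketch under assumptions: the attribute names (`os_type`, `memory_mb`, `free_cpus`) and the `Request` type are hypothetical, chosen only to show static and dynamic attributes being checked against a job's requirements.

```python
from dataclasses import dataclass


@dataclass
class Resource:
    name: str
    os_type: str       # static attribute
    memory_mb: int     # static attribute
    free: bool         # dynamic attribute
    free_cpus: int     # dynamic attribute: currently available amount


@dataclass
class Request:
    os_type: str
    memory_mb: int
    cpus: int


def candidates(request, resources):
    """Return names of resources whose attributes satisfy the request."""
    return [
        r.name
        for r in resources
        if r.os_type == request.os_type          # static checks
        and r.memory_mb >= request.memory_mb
        and r.free                               # dynamic checks
        and r.free_cpus >= request.cpus
    ]
```

Static attributes can be checked once per job; dynamic attributes change over time, so the candidate list has to be recomputed when the state of the resources changes.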

Page 11: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Resource Release Time

Busy resources have an additional attribute: the release time, estimated from the request of a running job. Knowing the release time, the scheduler can plan the future usage of a busy resource.

However, the scheduler must have a guarantee that the planned global job will really start and will not stay waiting in a local queue.

[Diagram: running jobs plotted on Resource and Time axes.]
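One way to use release times for planning is sketched below. The model is an assumption made for illustration: a computing element is a pool of identical CPUs, and the release time of a running job is its start time plus the run time stated in its request. The names `RunningJob`, `release_time` and `earliest_start` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RunningJob:
    start: float           # time the job started
    requested_time: float  # run time stated in the job request
    cpus: int


def release_time(job):
    """Estimated release time: start plus the requested run time."""
    return job.start + job.requested_time


def earliest_start(now, total_cpus, running, needed_cpus):
    """Earliest time at which `needed_cpus` are expected to be free."""
    if needed_cpus > total_cpus:
        raise ValueError("request exceeds the size of the computing element")
    free = total_cpus - sum(j.cpus for j in running)
    if free >= needed_cpus:
        return now
    # Release CPUs in order of estimated release time until enough are free.
    for job in sorted(running, key=release_time):
        free += job.cpus
        if free >= needed_cpus:
            return max(now, release_time(job))
    raise RuntimeError("unreachable: all CPUs released but request not met")
```

The result is only an estimate: it does not by itself guarantee that the global job will start at that time, which is the concern raised on this slide.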

Page 12: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Question 4: How Should the Interaction of the Global Scheduler and Local Resource Monitor Be Organized?

Autonomy of computing element: Each computing element of the Grid belongs to a certain owner, who can restrict access for external jobs completely or partly.

If two jobs, local and global, ask for free resources, which one should be preferred?

If global and local jobs demand the same resources, their priorities are compared. For this purpose each computing element i defines the function LPi() that calculates the local priority of a global job. This function depends on the job's price, consumable resources and run time:

LPi = LPi(price, consumable resources, run time)
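The role of LPi can be sketched as a function that each owner configures independently. Everything concrete here is assumed for illustration: the slides do not give a formula for LPi, so the weighted price per CPU-hour, the `external_allowed` switch and the rule that ties go to the local job are hypothetical choices.

```python
def make_local_priority(price_weight, external_allowed=True):
    """Build a hypothetical LP_i for one computing element.

    The owner tunes `price_weight` and may close the element to
    external (global) jobs altogether.
    """
    def lp(price, cpus, run_time):
        if not external_allowed:
            return float("-inf")   # global jobs never win on this element
        return price_weight * price / (cpus * run_time)
    return lp


def preferred(local_priority, global_job, lp):
    """Decide which of two competing jobs gets the free resources.

    `global_job` is a (price, cpus, run_time) tuple; ties go to the local job.
    """
    p_global = lp(*global_job)
    return "global" if p_global > local_priority else "local"
```

Because every computing element supplies its own LPi, the same global job can win the comparison on one element and lose it on another, which preserves the owner's autonomy.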

Page 13: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Question 4: How Should the Interaction of the Global Scheduler and Local Resource Monitor Be Organized?

The global scheduler should distribute its jobs so that global jobs do not delay the start of any more "expensive" local jobs.

[Diagram: running jobs on Resource and Time axes; a job job_G from the global queue with priority P_G = LP(job_G) and a job job_L from the local queue with priority P_L; the case shown is P_G < P_L.]
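The rule on this slide can be expressed as an admissibility check. The sketch assumes a simplified model that is not in the slides: a single resource, with each planned local job occupying it for a known interval. The names `PlannedLocalJob` and `may_place` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class PlannedLocalJob:
    priority: float   # P_L
    start: float      # planned start on the resource
    end: float        # planned release


def may_place(p_global, start, end, planned):
    """True if occupying [start, end) delays no local job with P_L > P_G."""
    for job in planned:
        overlaps = job.start < end and start < job.end
        if overlaps and job.priority > p_global:
            return False
    return True
```

A global job with priority P_G is refused only where it would collide with a more expensive local job; it may still take a slot that overlaps cheaper local jobs, or any slot with no overlap at all.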

Page 14: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Schedule

The local schedule is the plan of resource occupation by local jobs for some period of time in the future.

Local schedule: For each local job

{priority, assigned resources, occupation and release time}

[Diagram: Resource and Time axes; running jobs, followed in the future by planned jobs labelled priority1, priority2, priority3, priority4.]
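The schedule record above maps naturally onto a small data structure. The following sketch assumes that "assigned resources" is a CPU count on a pool of identical CPUs; the names `ScheduleEntry` and `free_cpus` are hypothetical. It computes how many CPUs the local schedule leaves free throughout a given time window.

```python
from dataclasses import dataclass


@dataclass
class ScheduleEntry:
    priority: float
    cpus: int            # assigned resources
    occupation: float    # planned occupation time
    release: float       # planned release time


def free_cpus(schedule, total_cpus, start, end):
    """Minimum number of CPUs left free by the schedule during [start, end)."""

    def used_at(t):
        return sum(e.cpus for e in schedule if e.occupation <= t < e.release)

    # Usage only rises at occupation times, so it is enough to check the
    # window start and every occupation time that falls inside the window.
    points = [start] + [e.occupation for e in schedule if start < e.occupation < end]
    return total_cpus - max(used_at(t) for t in points)
```

A global job that needs n CPUs for a window fits into the local schedule without disturbing it when `free_cpus` for that window is at least n.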

Page 15: Resource Manager for Grid with global  job queue and with planning  based on local schedules

The local schedule is drawn up by special agents of the global scheduler. Such agents, working on each computing installation, arrange the schedule in precise conformity with the scheduling strategy and configuration parameters of the local monitor.

The current state of all local schedules is delivered to the information base of the global scheduler; thus the scheduler has available the information about the usage plan of all virtual organization resources.

On the basis of this aggregate schedule the scheduler can make up the layout of global jobs allocation to resources.
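The flow described here (agents report local schedules, the scheduler plans over the aggregate) can be sketched as follows. The model is deliberately simplified and assumed for illustration: each computing element is treated as one resource whose schedule is a list of non-overlapping busy intervals, and the class and method names are hypothetical.

```python
class InformationBase:
    """Holds the latest local schedule reported by each agent."""

    def __init__(self):
        # computing element name -> sorted list of (start, end) busy intervals
        self._schedules = {}

    def update(self, element, busy_intervals):
        """Called by an agent: replace the stored schedule of one element."""
        self._schedules[element] = sorted(busy_intervals)

    def earliest_slot(self, element, now, duration):
        """Earliest start >= now of a free gap of `duration` on one element."""
        t = now
        for start, end in self._schedules[element]:
            if start - t >= duration:
                break              # the job fits before this busy interval
            t = max(t, end)
        return t

    def best_element(self, now, duration):
        """Return (start_time, element) where the job can start soonest."""
        return min(
            ((self.earliest_slot(e, now, duration), e) for e in self._schedules),
            key=lambda pair: pair[0],
        )
```

Because the decision is taken over the aggregate schedule, a short job can be sent to an element with a small gap between local jobs, while a longer job is sent wherever a sufficiently long free interval appears first.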

Page 16: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Program architecture of scheduling

[Diagram labels: Scheduler; Data Base; Global queue (jobs); three Agent and LRM pairs; Queue.]

Page 17: Resource Manager for Grid with global  job queue and with planning  based on local schedules

The global scheduler, implementing a certain scheduling strategy, makes up the global schedule.

The information base resides next to the scheduler and stores the aggregate schedule. For data management, a distributed system like Spitfire of the DataGrid project, with a relational database as its core, is considered.

The local agents of the scheduler work on each computing element. Interacting with the local resource monitor, the agent arranges a local schedule of this computing element and transfers updates to the global scheduler. The proposed implementation is based on the Maui scheduler.

Page 18: Resource Manager for Grid with global  job queue and with planning  based on local schedules

Future directions:

Backfill algorithm implementation at the global level to avoid blocking of jobs.

Advanced resource reservation for distributed multiprocessor jobs.

Economic model of virtual organization as applied to scheduling.

