SLIDE 1

Application Scheduling

Richard Lagerstrom
CUG 2003 / Columbus, Ohio, USA
15 MAY 2003

SLIDE 2

Scheduling Hierarchy and Scope

Name        Scope                                       Example
UNICOS/mp   Single Node, Multiprocessor                 Process
PScheD      Single System, Multinode                    Placement
PBS Pro     Organizational, Departmental, or Cluster    Batch
Globus      Global                                      Grid

SLIDE 3

Functional Organization

[Diagram: PBS Pro components (pbs_server, pbs_sched, pbs_mom) layered above PScheD, which runs on the UNICOS/mp kernel]

SLIDE 4

History

• Psched was ported from Cray T3E

• Enhanced to do initial placement

• Modified to support multi-CPU nodes

• User and admin. displays through psview

• More displays with apstat

• Cray X1 kernel cannot initiate applications without the assistance of psched

SLIDE 5

Introduction

• Placement strategies

• Placement requirements

• Starting an application

• Gang scheduling

• Migration

• The PBSpro interface

SLIDE 6

Placement Strategies

Many configurable options

• Equalize node workload

• Minimize node fragmentation

• Maximize processor utilization

SLIDE 7

Placement Requirements

• Power-of-2 MSP/SSP per node

• Memory loaded when executing

Accelerated applications need:

• Global address space ID (GASID) for off-node accelerated memory references

• Node contiguity
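
The first and last requirements are mechanical checks. A minimal sketch in C, with invented helper names (this is not psched's interface):

    #include <stdbool.h>
    #include <stdio.h>

    /* Illustrative checks for two of the requirements above;
     * the function names are invented, not psched's API. */

    /* MSPs/SSPs requested per node must be a power of two. */
    static bool power_of_two(unsigned n)
    {
        return n != 0 && (n & (n - 1)) == 0;
    }

    /* Accelerated applications must land on contiguous nodes. */
    static bool contiguous(const int *nodes, int count)
    {
        for (int i = 1; i < count; i++)
            if (nodes[i] != nodes[i - 1] + 1)
                return false;
        return true;
    }

    int main(void)
    {
        int nodes[] = { 2, 3, 4 };
        printf("3 per node allowed? %d\n", power_of_two(3));          /* 0 */
        printf("nodes 2,3,4 contiguous? %d\n", contiguous(nodes, 3)); /* 1 */
        return 0;
    }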

SLIDE 8

Six-node Example

• Each node has 4 MSPs and 16GB memory

• Five with application flavor

• One with operating system and support flavors

App  App  App  App  App  OS/SUP

SLIDE 9

Application Mapping

How many PEs are allocated to a node?

• User option to choose PEs/node

• Psched will pick a mapping by default

• Memory usage per PE is the major reason for user-specified mapping

SLIDE 10

Mapping Examples

-n 10 -N 4

-n 10 -N 2

-n 5 -N 1
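
Each mapping implies a node count of ceil(n / N). A quick sketch of the arithmetic (illustrative only, not psched code):

    #include <stdio.h>

    /* Nodes needed to place n PEs at N PEs per node: ceiling division. */
    static int nodes_needed(int n, int N)
    {
        return (n + N - 1) / N;
    }

    int main(void)
    {
        printf("-n 10 -N 4 -> %d nodes\n", nodes_needed(10, 4)); /* 3: 4+4+2 */
        printf("-n 10 -N 2 -> %d nodes\n", nodes_needed(10, 2)); /* 5 */
        printf("-n 5  -N 1 -> %d nodes\n", nodes_needed(5, 1));  /* 5 */
        return 0;
    }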

SLIDE 11

Posting an Application

Support node phase

• Run aprun -n x -N y a.out

• aprun checks for option errors

• aprun sends an RPC request to psched to post the app for placement

• aprun waits for a signal to continue

• Psched gets PBSpro queue limits

• Psched creates an apteam and joins aprun
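
The post-and-wait handshake can be pictured as below. This is a sketch only: the RPC is reduced to a comment, and the choice of SIGUSR1 is an assumption, not aprun's actual implementation.

    #include <signal.h>
    #include <stdio.h>
    #include <unistd.h>

    /* Sketch of aprun's post-and-wait behavior. */

    static volatile sig_atomic_t placed = 0;

    static void on_start(int sig)
    {
        (void)sig;
        placed = 1;            /* psched says the app has been placed */
    }

    int main(void)
    {
        signal(SIGUSR1, on_start);       /* signal choice is assumed */

        /* 1. Post the app for placement (hypothetical RPC call):
         *    post_to_psched(n_pes, pes_per_node, "a.out");        */

        /* 2. Block until psched signals that placement is done.   */
        while (!placed)
            pause();

        puts("placement complete; ready to exec PE 0");
        return 0;
    }

Running this and sending `kill -USR1 <pid>` from another shell plays psched's part in the handshake.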

SLIDE 12

Placing an Application

• Psched generates placement information

• Psched sends placement information to thekernel

• On the next time slice psched sends a start signal to aprun to exec() PE 0 of the app
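
The placement information can be imagined as a per-node list of PEs. The struct layout below is invented for illustration; the real hand-off goes through the kernel's apteamctl interface.

    #include <stdio.h>

    /* Hypothetical shape of a place list; not the UNICOS/mp definition. */
    struct place_entry {
        int node;       /* node hosting these PEs      */
        int first_pe;   /* first PE mapped to the node */
        int num_pes;    /* PEs assigned to the node    */
    };

    int main(void)
    {
        /* e.g. aprun -n 10 -N 4: ten PEs, four per node */
        struct place_entry list[] = {
            { 0, 0, 4 },
            { 1, 4, 4 },
            { 2, 8, 2 },
        };

        for (int i = 0; i < 3; i++)
            printf("node %d: PEs %d..%d\n", list[i].node,
                   list[i].first_pe,
                   list[i].first_pe + list[i].num_pes - 1);
        return 0;
    }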

SLIDE 13

Application Startup

Application node(s) phase

• Execution begins in startup() which sets up the shared memory environment

• All PEs of the app are created with a placed fork()

• App execution begins in main()
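
A plain-UNIX analogue of this startup sequence, with ordinary fork() standing in for the X1's placed fork (in the real system the kernel, not the app, places each PE on its assigned node):

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    #define NPES 4    /* example size */

    int main(void)
    {
        int pe = 0;                      /* this process starts as PE 0 */

        for (int i = 1; i < NPES; i++) {
            pid_t pid = fork();          /* placed fork() on the X1 */
            if (pid == 0) { pe = i; break; }
        }

        /* ...shared-memory setup would happen here in startup()... */
        printf("PE %d entering main()\n", pe);

        if (pe == 0)                     /* PE 0 outlives the others */
            while (wait(NULL) > 0)
                ;
        return 0;
    }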

SLIDE 14

Running an Application

User types aprun -n20 a.out

[Diagram: (1) aprun sends an RPC request to psched and waits; (2) psched hands the placement to the kernel via apteamctl; (3) psched signals aprun, which moves to app node[0] and execs PE 0; startup runs and the app begins]

SLIDE 15

Application Execution

• Apps are time sliced by psched

• Memory of inactive apps may page out

• Memory of active apps is locked in
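
mlockall() is the closest user-space analogue to the active-app page locking described above; it is shown only to illustrate the idea, not as anything psched itself does:

    #include <stdio.h>
    #include <sys/mman.h>

    int main(void)
    {
        /* Active slice: pin current and future pages in memory. */
        if (mlockall(MCL_CURRENT | MCL_FUTURE) != 0)
            perror("mlockall");   /* usually needs privilege */
        else
            puts("pages locked for the active time slice");

        /* Inactive slice: release the locks; pages may page out. */
        munlockall();
        return 0;
    }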

SLIDE 16

Gang Scheduling

Five gangs – three parties

Three time-slice example
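
The idea can be simulated in a few lines. The grouping of five gangs into three parties below is invented for illustration; in each slice, every gang in the running party executes together:

    #include <stdio.h>

    int main(void)
    {
        /* Five gangs packed into three parties (invented grouping). */
        const char *party[3][2] = {
            { "gang A", "gang B" },
            { "gang C", "gang D" },
            { "gang E", NULL     },
        };

        for (int slice = 0; slice < 3; slice++) {   /* three time slices */
            printf("slice %d runs:", slice);
            for (int g = 0; g < 2; g++)
                if (party[slice][g])
                    printf(" %s", party[slice][g]);
            printf("\n");
        }
        return 0;
    }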

SLIDE 17

Gang Scheduling 1

Five gangs – three parties

First time slice


SLIDE 18

Gang Scheduling 2

Five gangs – three parties

Second time slice


SLIDE 19

Gang Scheduling 3

Five gangs – three parties

Third time slice


SLIDE 20

Application Migration

• A target place list is generated

• The app is disconnected to stop execution and unlock its memory pages

• The target place list is given to the kernel

• The app is connected

• Memory pages are moved from the origin nodes or disk to the target nodes

• Execution begins on the target nodes
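
The sequence reads naturally as code. Every function below is a stub standing in for the psched/kernel step named in its comment; none is a real UNICOS/mp call:

    #include <stdio.h>

    struct app { int id; };                          /* stand-in handle */

    static void generate_place_list(struct app *a) { (void)a; }
    static void disconnect_app(struct app *a)      { (void)a; }
    static void give_list_to_kernel(struct app *a) { (void)a; }
    static void connect_app(struct app *a)         { (void)a; }
    static void move_pages(struct app *a)          { (void)a; }

    static void migrate(struct app *a)
    {
        generate_place_list(a);     /* choose target nodes              */
        disconnect_app(a);          /* stop execution, unlock pages     */
        give_list_to_kernel(a);     /* kernel learns the new placement  */
        connect_app(a);             /* app reattached on target nodes   */
        move_pages(a);              /* pages stream from origin or disk */
        puts("execution resumes on the target nodes");
    }

    int main(void)
    {
        struct app a = { 1 };
        migrate(&a);
        return 0;
    }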

SLIDE 21

Application Exit

• Each PE exits

• PE 0 waits for all other PEs to exit

• When PE 0 exits the kernel detects the PE count is zero

• The kernel sends psched the app exit signal

• Psched deletes the kernel’s apteam entry and its internal information about the app
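
A plain-UNIX analogue of the exit accounting, with wait() standing in for the kernel's PE-count bookkeeping (nothing here is X1-specific code):

    #include <stdio.h>
    #include <sys/wait.h>
    #include <unistd.h>

    int main(void)
    {
        int live = 4;                        /* PE 0 plus three others */

        for (int i = 1; i < 4; i++)
            if (fork() == 0)
                _exit(0);                    /* each other PE exits */

        while (wait(NULL) > 0)               /* PE 0 reaps the rest */
            if (--live == 1)
                puts("only PE 0 left; kernel awaits its exit");

        puts("PE count reaches zero: psched deletes the apteam entry");
        return 0;
    }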

SLIDE 22

PBSpro/Psched Interface

[Diagram: psched supplies resource and usage information to PBS; PBS returns application queue limits drawn from its limits database]

SLIDE 23

End
