© 2007 IBM Corporation

IBM Global Engineering Solutions

IBM Blue Gene/P

Job Submission

IBM Blue Gene/P System Administration

Job submission

Basic procedure

1. Create a block

2. Allocate a block

3. Boot a block

4. Run a job

5. Free the block, or run another job
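The ordering of the steps above can be sketched as a small state machine. This is an illustrative model only: the `Block` class, its method names and the state names are hypothetical and are not part of any Blue Gene tool.

```python
# Illustrative model of the block lifecycle from the slide; not a Blue Gene API.
# Class, method and state names are hypothetical.

class Block:
    def __init__(self, name):
        self.name = name
        self.state = "free"      # step 1: the block exists but is not in use
        self.jobs_run = 0

    def allocate(self):
        # Step 2: only a free block can be allocated.
        if self.state != "free":
            raise RuntimeError(f"block {self.name} is {self.state}, not free")
        self.state = "allocated"

    def boot(self):
        # Step 3: only an allocated block can be booted.
        if self.state != "allocated":
            raise RuntimeError(f"block {self.name} is {self.state}, not allocated")
        self.state = "booted"

    def run_job(self):
        # Step 4: a job needs a booted block. The block stays booted afterwards,
        # so another job can follow without re-booting (step 5, second option).
        if self.state != "booted":
            raise RuntimeError(f"block {self.name} is {self.state}, not booted")
        self.jobs_run += 1

    def free(self):
        # Step 5, first option: release an allocated or booted block.
        if self.state == "free":
            raise RuntimeError(f"block {self.name} is already free")
        self.state = "free"


block = Block("R00-M0")
block.allocate()
block.boot()
block.run_job()
block.run_job()   # run another job on the same booted block
block.free()
print(block.state, block.jobs_run)
```

The model makes one point explicit: running a second job does not require repeating the allocate and boot steps, whereas any job after a free does.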

Block creation

Two ways of creating blocks

– Block builder

– mmcs_db_console

The use of block builder is recommended

– Block builder can create any available block

– Block builder is a lot easier to use…

Block creation

Block builder

Available via Navigator

Able to create any block with a valid block size

– 16, 32, 64, 128 and 256 nodes (mesh)

– 512 and multiples of 512 nodes (torus/mesh)

Starting card

– 16 node : J00, J01

– 64 node : N00, N02, N04, N08, N10, N12 or N14

– 128 node : N00, N04, N08 or N12

– 256 node : N00 or N08
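The block-size rule above can be written as a short check. This is a sketch of the rule as stated on the slide, not code from Navigator or the block builder; the function name `block_topologies` is hypothetical, and starting-card constraints are not modelled.

```python
# Sketch of the block-size rule listed on the slide.
# The function name is hypothetical; starting cards are not checked.

SMALL_BLOCK_SIZES = (16, 32, 64, 128, 256)   # smaller blocks: mesh only
LARGE_BLOCK_UNIT = 512                       # larger blocks: multiples of 512


def block_topologies(nodes):
    """Return the topologies the slide lists for a block of `nodes` nodes.

    Raises ValueError if `nodes` is not a valid block size.
    """
    if nodes in SMALL_BLOCK_SIZES:
        return ("mesh",)
    if nodes >= LARGE_BLOCK_UNIT and nodes % LARGE_BLOCK_UNIT == 0:
        return ("torus", "mesh")
    raise ValueError(f"{nodes} is not a valid block size")


for size in (16, 256, 512, 2048):
    print(size, block_topologies(size))
```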

Block creation

mmcs_db_console

Able to create most of the available blocks

Provides a set of commands to create blocks

– genblock : a base partition

– genblocks : each base partition on the system

– gensmallblock : a sub-base partition

– genBPblock : a set of base partitions

– genfullblock : the entire system

Use Navigator for pass-through or split cables

Block deletion

Available via mmcs_db_console

mmcs$ delete bgpblock R00-M0

Type ‘help delete’ at the mmcs shell prompt for usage

Block deletion is not available via Navigator’s GUI

– The mmcs_db_console within the Navigator is available

Exercise

Create a block from the block builder

Create a block from the mmcs_db_console

Delete a block from the mmcs_db_console

Job modes

There are three job modes: Virtual Node Mode (VNM), SMP Mode and Dual Mode

MPI ranks (processes) per node and threads per process:

VNM : 4 processes/node, 1 thread/process

SMP : 1 process/node, 4 threads/process

Dual : 2 processes/node, 2 threads/process

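Placement of ranks and threads on the four CPUs of a node (from the slide diagram):

Virtual Node Mode : CPU 0 = Rank 0, CPU 1 = Rank 1, CPU 2 = Rank 2, CPU 3 = Rank 3

SMP Mode : CPU 0 = Rank 0, CPU 1 = thread, CPU 2 = thread, CPU 3 = thread

Dual Mode : CPU 0 = Rank 0, CPU 1 = thread, CPU 2 = Rank 1, CPU 3 = thread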
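The per-mode figures above determine how many MPI ranks a job gets on a block of a given size. The following sketch works that arithmetic through; the function name `job_layout` and the dictionary keys are hypothetical, and the mode labels are the ones used on the slide.

```python
# Processes per node and threads per process for each job mode
# (values from the slide; the mode labels are the slide's own).
MODES = {
    "VNM": (4, 1),
    "SMP": (1, 4),
    "Dual": (2, 2),
}


def job_layout(nodes, mode):
    """Return rank and thread counts for a job on `nodes` nodes in `mode`."""
    processes_per_node, threads_per_process = MODES[mode]
    return {
        "mpi_ranks": nodes * processes_per_node,
        "threads_per_rank": threads_per_process,
        "cpus_used_per_node": processes_per_node * threads_per_process,
    }


for mode in MODES:
    print(mode, job_layout(64, mode))
```

In every mode the product of processes per node and threads per process is 4, so all four CPUs of a node can be used; the modes differ only in how that capacity is split between MPI ranks and threads.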

Job submission

Ways to submit a job

mmcs_db_console

mpirun

LoadLeveler

Job submission

mmcs_db_console

A console for the Midplane Management Control System (MMCS)

Used to configure and allocate blocks of compute nodes and I/O nodes and run programs on the BG/P system

Intended mainly for administrator use

Requires access to the service node

Environment variables need to be set

– /etc/profile.d/bgp.sh

Caveats when submitting jobs from the console

– No stdin support

– stdout & stderr sent to files

Job submission

mmcs_db_console

1. $ cd /bgsys/drivers/ppcfloor/bin

2. $ ./mmcs_db_console

3. mmcs$ allocate_block R00-M0

4. mmcs$ boot_block

5. mmcs$ submit_job /bghome/test/hello.rts /bghome/test

6. mmcs$ free R00-M0

7. mmcs$ quit

Type ‘help’ at the mmcs shell prompt for available commands

Job submission

mmcs commands

allocate_block : marks the block as allocated, but does not boot it

boot_block : initializes, loads and starts the block resources

submit_job : starts an executable running on the currently selected block

free : releases the resources associated with the block ID

Job submission

mpirun

Launches jobs on the BG/P hardware and acts as a job monitor

– mpirun continually monitors the status of the job and terminates when the job is done

– Transparently forwards stdin and receives stdout and stderr

Acts as a gateway for debuggers such as gdb and TotalView

Each job requires a partition

– Can be allocated on the fly (-np or -shape)

– Or a predefined partition can be used

Can boot partitions from their initial state

– Disable this feature with -noallocate

– User should verify no overlapping busy hardware

Can optionally leave booted partitions in place with -nofree

Job submission

mpirun

$ mpirun -partition R00-M0 -mode SMP -cwd /bghome/test -exe /bghome/test/hello.rts

partition : specifies which block to use

mode : specifies the execution processor mode

cwd : specifies the current working directory

exe : specifies the program to run

Type mpirun -h for available options
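When mpirun is driven from a script, it can help to build the argument list programmatically rather than by string concatenation. The sketch below assembles the invocation shown above using only the four options from this slide, written with a single hyphen; the helper name `build_mpirun_command` is hypothetical, and the command is only printed, not executed.

```python
import shlex


def build_mpirun_command(partition, mode, cwd, exe):
    """Return the mpirun argument list for the four options on the slide."""
    return [
        "mpirun",
        "-partition", partition,   # which block to use
        "-mode", mode,             # execution processor mode
        "-cwd", cwd,               # working directory for the job
        "-exe", exe,               # program to run
    ]


cmd = build_mpirun_command(
    "R00-M0", "SMP", "/bghome/test", "/bghome/test/hello.rts"
)
# Print the command line; on a system with mpirun installed, the list
# could be passed to subprocess.run(cmd) instead.
print(shlex.join(cmd))
```

Keeping the command as a list avoids quoting problems if a path contains spaces.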

Job submission

LoadLeveler

Allocates machine resources to run jobs

Scheduling of jobs depends on the availability of resources within the system

A user submits a job using a job command file

Maximizes the efficiency of the cluster by maximizing the utilization of resources
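The slides do not show a job command file, so the following is only a rough sketch of what one might look like for a Blue Gene job, generated from Python. The keywords (`job_type`, `bg_size`, `executable`, `arguments`, `output`, `error`, `queue`), the mpirun location and the helper name `make_job_command_file` are assumptions to be checked against the LoadLeveler documentation for the installed release.

```python
def make_job_command_file(exe, cwd, bg_size, mode="SMP"):
    """Return the text of a minimal LoadLeveler job command file (a sketch).

    All keywords below are assumptions; verify them against the
    LoadLeveler documentation before use.
    """
    lines = [
        "# @ job_type = bluegene",
        f"# @ bg_size = {bg_size}",                      # number of compute nodes
        "# @ executable = /bgsys/drivers/ppcfloor/bin/mpirun",  # assumed path
        f"# @ arguments = -mode {mode} -cwd {cwd} -exe {exe}",
        "# @ output = $(jobid).out",
        "# @ error = $(jobid).err",
        "# @ queue",
    ]
    return "\n".join(lines) + "\n"


text = make_job_command_file("/bghome/test/hello.rts", "/bghome/test", 64)
print(text, end="")
```

A file like this would typically be handed to the scheduler with `llsubmit`; LoadLeveler then selects and boots a suitable partition itself, which is why no block name appears in the file.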

Job submission

LoadLeveler

Some of the tasks it can perform:

Choosing the next job to run

Examining the job requirements

Collecting available resources in its cluster

Dispatching the job to the selected machine

Controlling running jobs

Creating reservations and scheduling jobs to run in the reservations

Job preemption to enable high-priority jobs to run immediately

Fair share scheduling to automatically balance resources among users or groups of users

Co-scheduling to enable several jobs to be scheduled to run at the same time

Example code

1. Write simple hello world:

/* Hello World program */
#include <stdio.h>

int main(void)
{
    printf("Hello World!\n");
    return 0;
}

2. Compile the program:

/bgsys/drivers/ppcfloor/comm/bin/mpicc -o hello hello.c

3. Run the program:

Assuming that the program lives in /bgsys/apps and you want the results (STDOUT and STDERR) to be written to /bgsys/apps/results:

At the mmcs_db_console prompt:

mmcs$ submit_job /bgsys/apps/hello /bgsys/apps/results/

Exercise

Submit a job using mmcs_db_console

– Free the block after the job finishes

Submit a job using mpirun

Submit a job using LoadLeveler

Job termination

mmcs_db_console

killjob, kill_job

1. mmcs$ killjob R00-M0 124

2. mmcs$ wait_job

Terminating a job can take a while

– The default timeout is 5 minutes

Job termination

mpirun

Control-C

– mpirund will do a cleanup

– Do not send multiple Control-Cs

• A second Control-C will force termination

• A third Control-C is almost equivalent to kill -9, which may leave the block state in limbo

Scripting

A list of commands for mmcs_db_console can be written to a file for scripting

$ mmcs_db_console < script_file

script_file is a plain ASCII text file containing a list of commands for mmcs_db_console

Scripting

Sample script_file

Create and test several blocks

$ cat script_file

genblock R00-M0 R00-M0 64

allocate R00-M0

free R00-M0

genblock R00-M1 R00-M1 64

allocate R00-M1

free R00-M1

quit
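For more than a couple of midplanes, a script_file like the sample above is easier to generate than to type. The sketch below reproduces the sample's command sequence for a list of midplanes; the helper name `make_block_test_script` is hypothetical, and the third `genblock` argument is copied verbatim from the sample (see ‘help genblock’ in the console for its meaning).

```python
def make_block_test_script(midplanes, genblock_arg=64):
    """Return mmcs_db_console script text that creates, allocates and
    frees one block per midplane, following the sample script_file."""
    lines = []
    for midplane in midplanes:
        # The block is named after its midplane, as in the sample.
        lines.append(f"genblock {midplane} {midplane} {genblock_arg}")
        lines.append(f"allocate {midplane}")
        lines.append(f"free {midplane}")
    lines.append("quit")
    return "\n".join(lines) + "\n"


script = make_block_test_script(["R00-M0", "R00-M1"])
print(script, end="")
```

The resulting text can be saved to a file and fed to the console with `mmcs_db_console < script_file`.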

Bridge API

Public API used by job schedulers

– LoadLeveler, SLURM, Altair PBS Pro, Platform LSF, Cobalt

– Used by mpirun too

Has interfaces to manage various Blue Gene resources

– Create, destroy and query logical constructs such as jobs and partitions

– Query physical entities such as midplanes, node cards, switches and cables

– Essentially a thin abstraction layer over the database

Requires a polling model to obtain machine state, for example:

– Grab a snapshot of the machine state

– Create a partition based on free resources

– Boot the partition

– Poll the partition state until it is INITIALIZED

