INTRODUCTION TO HIGH PERFORMANCE
COMPUTING
Course material:
http://rcg.group.shef.ac.uk/courses/hpcintro/
GETTING STARTED
Getting an Account
Before you can start using Bessemer you need to register for an account.
Students can also have an account on Bessemer with the permission of their supervisors.
Accounts are available by emailing [email protected]
Connecting to Bessemer
Windows - PuTTY / MobaXterm
Download and install PuTTY, MobaXterm or another SSH client.
Hostname: <username>@bessemer.shef.ac.uk
Connecting to Bessemer
Linux / macOS
Linux and macOS both have a terminal emulator pre-installed.
Once you have a terminal open run the following command:
ssh -X <username>@bessemer.shef.ac.uk
ssh -Y <username>@bessemer.shef.ac.uk
where you replace <username> with your CICS username. The -X and -Y flags enable X11 forwarding, so graphical applications can display on your machine.
Connecting to the Training Cluster
Host: traininghpc
Username: Muse username
Password: Muse password
Connecting to ShARC
Platform independent
Open a browser and type: https://myapps.shef.ac.uk
Log in with your university account.
Click on Connect via myAPPs Portal and log in.
INTRODUCTION
A supercomputer is a computer with a high level of computing performance compared to a general-purpose computer.
Bessemer specifications:
• CPU cores: 1040
• Memory: 5184 GiB
• GPUs: 4
• Storage: 460 TiB
Machine
Dell PowerEdge C6420
Central Processing Units:
• 2 x Intel Xeon Gold 6138
• 2.00 GHz
Memory:
• 192 GB
• 2666 MHz
• DDR4
General CPU node specifications
25 nodes are publicly available
Operating System
• CentOS 7.x
• Interactive and batch job scheduling software: Slurm
• Many applications, compilers, libraries and parallel processing tools
Worker node #1
Worker node #2
Worker node #3
Worker node #4
Worker node #25
…
Login node #1
Login node #2
Shared user file storage
Two Bessemer head-nodes are gateways to the cluster of worker nodes.
The head-nodes’ main purpose is to allow access to the worker nodes, NOT to run CPU-intensive programs.
All CPU-intensive computations must be performed on the worker nodes. This is achieved by:
srun --pty bash -i
RUNNING SIMPLE PROGRAMS
Setting up your software development environment
You can set up your software environment for a job with the command
module
All the available software environments can be listed by using
module avail
You can then select the ones you wish to use by using
module add
Using modules
• List modules
• Available modules
• Load module
Write a simple “Hello World!” application and run it!
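As a minimal sketch of that exercise (the file name hello.sh is illustrative, not part of the course material), a shell “Hello World!” could look like this:

```shell
# Create a tiny script (hello.sh is just an example name).
cat > hello.sh <<'EOF'
#!/bin/bash
echo "Hello World!"
EOF

# Make it executable and run it.
chmod +x hello.sh
./hello.sh
```

On the cluster you would run these commands from an interactive session started with srun --pty bash -i.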
Demonstration
PRACTICE SESSION
Start an interactive session on the training machine using
srun --pty bash -i
In the LOGIN NODE, extract the course examples:
tar -xvf /usr/local/courses/hpc_intro_long.tgz
We are studying inflammation in patients who have been given a new treatment for arthritis, and we need to analyse the first set of inflammation data. The data sets are held in comma-separated values (CSV) format. Each row holds the observations for one patient. Each column holds the inflammation measured on one day.
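To see what the per-day averaging amounts to, here is a sketch with mock data in the same layout (two patients, three days; the values are invented, and awk stands in for the R commands used in the session):

```shell
# Mock data: each row is a patient, each column is a day.
printf '0,1,2\n2,3,4\n' > mock-inflammation.csv

# Mean inflammation per day (per column) - the same quantity
# apply(dat, 2, mean) computes in R.
awk -F, '{ for (i = 1; i <= NF; i++) sum[i] += $i } END { for (i = 1; i <= NF; i++) printf "%g%s", sum[i]/NR, (i < NF ? "," : "\n") }' mock-inflammation.csv
# prints: 1,2,3
```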
For this practice session we will run the R application. The latest version of R can be loaded with
module load apps/R/3.6.1/binary
Change directory to the ‘hpc_intro_long/data’ directory and run R with
$ R
From the R session you can run a series of commands to plot the inflammation data.
dat <- read.csv(file = "inflammation-01.csv", header = FALSE)
avg_day_inflammation <- apply(dat, 2, mean)
plot(avg_day_inflammation)
To exit R, type
q()
MANAGING YOUR JOBS
Slurm is the resource management, job scheduling and batch control system. It:
• Starts up interactive jobs on available workers
• Schedules all batch-oriented (i.e. non-interactive) jobs
• Attempts to create a fair-share environment
• Optimises resource utilisation
Difference between interactive and non-interactive jobs
Until now, you have used interactive jobs. However, there are certain facts that cannot be ignored:
• Maximum time limit for interactive jobs is 8 hours.
• You must keep your connection alive!
This makes it inconvenient or impossible to solve time-consuming problems.
Solution? Non-interactive jobs
NON-INTERACTIVE JOBS
1) Write a job-submission shell script
You can submit your job using a shell script. A general job-submission shell script begins with the “shebang” line:
#!/bin/bash
2) Next you may specify some options, such as memory limit.
#SBATCH --"OPTION"="VALUE"
3) Load the appropriate modules if necessary.
module load "MODULE NAME"
4) Run your program by using the Slurm “srun” command.
srun "PROGRAM"
Save the script (“submission.sh”) and use
sbatch submission.sh
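Putting the four steps above together, a complete submission.sh might look like the following sketch. The option values, module name and program name are placeholders, not real entries on Bessemer:

```
#!/bin/bash
# 2) Resource options (values are illustrative).
#SBATCH --mem=8000
#SBATCH --time=00:10:00

# 3) Load any modules your program needs (placeholder name).
module load "MODULE NAME"

# 4) Run your program through Slurm.
srun "PROGRAM"
```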
Note the job submission number. For example:
Submitted batch job 1226
Check your output file when the job is finished.
cat "JOB_NAME"-1226.out
JOB SUBMISSION
Jobs typically pass through several states in the course of their execution. The typical states are PENDING, RUNNING, SUSPENDED, COMPLETING, and COMPLETED.
Display the job queue.
squeue
Shows job details:
sacct -j "JOB_ID"
Deletes job from queue:
scancel "JOB_ID"
Managing Jobs
Monitoring and controlling your jobs
Additional options for job submission
Name your submission:
#SBATCH --comment=test_job
Specify nodes and tasks for MPI jobs:
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=16
Memory allocation:
#SBATCH --mem=16000
Additional options for job submission
Specify the output file name:
#SBATCH --output=output.%j.test.out
Request time:
#SBATCH --time=00:30:00
Email notification:
#SBATCH [email protected]
For the full list of the available options please visit the Slurm manual webpage at https://slurm.schedmd.com/pdfs/summary.pdf.
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=40
#SBATCH --mem=64000
#SBATCH [email protected]
module load OpenMPI/3.1.3-GCC-8.2.0-2.31.1
srun program
A maximum of 40 cores can be requested per node in the general use queues.
EXAMPLE
DEMONSTRATION
Write a single script!
#!/bin/bash
module add apps/python/3.6/binary
srun python hello.py
You simply type:
sbatch myjob.sh
PRACTICE SESSION
Change directory to the r folder of the course examples.
Inspect the script file rslurm.sh and check that it will execute the R job for computing the means of the inflammation data sets.
Submit your R job using the command
sbatch rslurm.sh
JOB ARRAY
Job arrays offer a mechanism for submitting and managing collections of similar jobs quickly and easily.
All jobs must have the same initial options (e.g. size, time limit, etc.).
#SBATCH --array=0-4
Job arrays are only supported for batch jobs and the array index values are specified using the --array or -a option of the sbatch command.
The option argument can be specific array index values, or a range of index values with an optional step size.
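As a sketch, the accepted forms of the option argument look like this (the index values are illustrative):

```
#SBATCH --array=0-4       # a range of index values: 0,1,2,3,4
#SBATCH --array=1,3,7     # specific index values
#SBATCH --array=0-20:4    # a range with step size 4: 0,4,8,12,16,20
```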
JOB ARRAY
Job ID and Environment Variables
Job arrays will have two additional environment variables set.
SLURM_ARRAY_JOB_ID will be set to the first job ID of the array.
SLURM_ARRAY_TASK_ID will be set to the job array index value.
srun ./fish < fish${SLURM_ARRAY_TASK_ID}.in > fish${SLURM_ARRAY_TASK_ID}.out
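Outside Slurm you can sketch what a single array task sees by setting the index by hand (the fish file names follow the example above; in a real job Slurm exports the variable for you):

```shell
# Simulate the index Slurm would export for one task of --array=0-4.
SLURM_ARRAY_TASK_ID=2

# The same expansion the srun line performs:
input="fish${SLURM_ARRAY_TASK_ID}.in"
output="fish${SLURM_ARRAY_TASK_ID}.out"
echo "$input -> $output"   # prints: fish2.in -> fish2.out
```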
Submitting a Job Array
Job submission script (named submit.sh):
#!/bin/bash
#SBATCH --array=0-4
srun ./fish < fish${SLURM_ARRAY_TASK_ID}.in > fish${SLURM_ARRAY_TASK_ID}.out
Job submission:
sbatch submit.sh
Getting help
• Web site
- http://www.shef.ac.uk/cics/research
• Iceberg Documentation
- http://www.sheffield.ac.uk/cics/research/hpc/iceberg
• Training (also uses the learning management system)
- http://www.shef.ac.uk/cics/research/training
• Discussion Group (based on Google Groups)
- https://groups.google.com/a/sheffield.ac.uk/forum/?hl=en-GB#!forum/hpc
• E-mail the group: [email protected]
• Help on Google Groups
- http://www.sheffield.ac.uk/cics/groups
• Contacts