Date post: | 22-Dec-2015 |
its.unc.edu 2
Course Objectives
Word for the Day: Heterogeneous
Emerald: the Swiss army knife of computing, something for everyone :)
Something you can use today
A reference for something you can use tomorrow
Course Objectives Cont.
Educate users on the broader aspects of research computing
Practical knowledge to allow you to efficiently perform your research
Pointers towards more advanced topics
Course Outline
What are compute clusters and Emerald in particular?
Accessing Emerald
• login
• file systems
Running jobs on Emerald: Job Management
• job schedulers
• batch commands
• submitting jobs
• specialty scripts
Available Software
• software
• package space
Compiling Code
Help Documentation
Getting Started on Emerald
• http://help.unc.edu/6020
• General overview of Emerald for a range of users
Short Course: Getting Started on Emerald
• http://help.unc.edu/6479
• Detailed notes for beginning Emerald users
What is Emerald?
General Purpose Linux Cluster
• Maintained by Research Computing Group
Appropriate for all users regardless of expertise level
Other Servers:
• Cedar/Cypress (128-processor SGI/Altix): a large shared memory system
• Topsail (4160-processor Dell Linux Cluster): homogeneous capability cluster with fast interconnect
Mass Storage
• Account access
What is a compute cluster?
Some Typical Components
Compute Nodes
Interconnect
Shared File System
Software
Operating System (OS)
Job Scheduler/Manager
Mass Storage
Emerald is a Heterogeneous Cluster
Compute Nodes
• Xeon blades, IBM Power 4 and Power 5
Interconnect
• Gigabit Ethernet (aka gigE or GbE)
Shared File Systems
• AFS, NFS, and GPFS
Mass Storage
• ~/ms
Software
• a large collection of licensed and public-domain software in package space
Operating Systems (OS)
• RH5 (64-bit), RH4 (32-bit), and AIX (64-bit)
Job Scheduler/Manager
• all handled by LSF
Advantages of Using Emerald
High performance
Large capacity
Parallel processing
Many available software packages
Variety of compiling options
Shared file systems
Mass storage
Emerald Compute Nodes
Mostly IBM BladeCenter Xeon blades
• all are dual-socket Intel Xeons
• 1, 2, or 4 cores/socket (i.e. 2, 4, or 8 processors/node)
• 2.0, 2.8, 3.0, or 3.2 GHz processors
• varying memory, mostly 2 or 4 GB per core
IBM Power 4 and 5
•large memory, varying processor speeds
Cluster is constantly evolving
Emerald Summary
Over 200 host blade nodes, Intel Xeon
• Over 800 blade cores
• typically 2-4 GB memory per core
4 IBM AIX p575s, Power 5
• 64 cores, large memory
1 IBM AIX p690, Power 4
• 32 cores, large (128 GB) shared memory
Gigabit Ethernet switching fabric
Running 32- and 64-bit Linux and 64-bit AIX
Emerald Details
Run the lshosts command to see resources for each node (host). Note host, model, ncpus, maxmem, resources
% lshosts
HOST_NAME type model cpuf ncpus maxmem maxswp server RESOURCES
bc12-n01 X86_64 Xeon_3_2 12.0 2 3954M 996M Yes (X64bit blade blade12 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon32)
bc10-n10 X86_64 Xeon_2_8 11.7 2 3954M 996M Yes (X64bit blade blade10 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28)
bc09-n01 X86_64 Xeon_2_8 11.7 2 3954M 996M Yes (X64bit blade blade9 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28)
bc01-n01 X86_64 Xeon_3_0 11.9 8 32190M 29313M Yes (X64bit blade blade1 L26 lammpi mem32 mpich2 mpichp4 RH5 tmp100G xeon30)
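The lshosts listing above can be filtered with standard text tools. This is a minimal sketch, assuming the column order shown on this slide (ncpus in the 5th column, maxmem in the 6th). The four sample rows are copied from the slide in place of live output, and the function names sample_lshosts and big_hosts are hypothetical.

```shell
#!/bin/sh
# Sample rows copied from the lshosts listing on this slide.
sample_lshosts() {
cat <<'EOF'
HOST_NAME type model cpuf ncpus maxmem maxswp server RESOURCES
bc12-n01 X86_64 Xeon_3_2 12.0 2 3954M 996M Yes (X64bit blade blade12 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon32)
bc10-n10 X86_64 Xeon_2_8 11.7 2 3954M 996M Yes (X64bit blade blade10 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28)
bc09-n01 X86_64 Xeon_2_8 11.7 2 3954M 996M Yes (X64bit blade blade9 L26 lammpi mem3 mem4 mpich2 mpichp4 RH5 tmp25G xeon28)
bc01-n01 X86_64 Xeon_3_0 11.9 8 32190M 29313M Yes (X64bit blade blade1 L26 lammpi mem32 mpich2 mpichp4 RH5 tmp100G xeon30)
EOF
}

# Print host name, ncpus and maxmem for hosts with at least $1 CPUs.
# Assumption: ncpus is the 5th whitespace-separated column; the header
# line (NR == 1) is skipped.
big_hosts() {
    awk -v min="$1" 'NR > 1 && $5 + 0 >= min + 0 { print $1, $5, $6 }'
}

sample_lshosts | big_hosts 8
# prints: bc01-n01 8 32190M
```

On Emerald itself the same filter would presumably be fed by the real command, i.e. lshosts | big_hosts 8.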
Logging Into Emerald
UNIX/Linux/OSX
• ssh my_onyen@emerald.unc.edu
• ssh -l my_onyen emerald.unc.edu
Windows: SSH Secure Shell
•X windows software -> shareware.unc.edu
•Setting up a Profile for Emerald
•Forwarding X11 packets
Head Nodes
Emerald has multiple head nodes or login nodes for
•login and basic file manipulation
•compiling
• testing short (under about 1 minute), small-memory jobs
Login nodes run the Linux operating system
•take the Introduction to Linux class or see some of the many online tutorials if you are unfamiliar with Linux
Home Directory on Emerald
Home Directory
•/afs/isis/home/m/y/my_onyen/
•250MB quota
•~/private/
•Files backed up daily [ ~/OldFiles ]
• Space quota/usage in Home Directory: fs lq
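The home directory path above appears to be built from the first two characters of the Onyen. This is a small sketch under that assumption, which is inferred only from the single /afs/isis/home/m/y/my_onyen/ example. The function name afs_home is hypothetical.

```shell
#!/bin/sh
# Build the AFS home directory path for an Onyen.
# Assumption: the two intermediate directories are the first and second
# characters of the Onyen, as in /afs/isis/home/m/y/my_onyen/.
afs_home() {
    onyen="$1"
    first=$(printf '%s' "$onyen" | cut -c1)
    second=$(printf '%s' "$onyen" | cut -c2)
    printf '/afs/isis/home/%s/%s/%s\n' "$first" "$second" "$onyen"
}

afs_home my_onyen
# prints: /afs/isis/home/m/y/my_onyen
```

The resulting path could then be passed to the quota command from the slide, e.g. fs lq "$(afs_home my_onyen)".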
Work Directories on Emerald
No space limit but periodically cleaned
Not backed up!!!
Work Directories:
• /netscr/my_onyen, /nas/my_onyen, /nas2/my_onyen: 26.2 TB total
• /largefs: optimized for large file operations (> 1 MB), 23 TB
• /smallfs: optimized for small file operations (< 1 MB), 16 TB
File Permissions
Your home directory is in AFS space. AFS is a distributed networked file system.
Permissions are determined by ACLs (access control lists)
•see Introduction to AFS (http://help.unc.edu/215)
The other file systems, /largefs, /netscr, etc., are controlled by the usual Linux file permissions
• making everything under /netscr/my_onyen accessible: chmod -R a+rX /netscr/my_onyen
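The capital X in the chmod command above is easy to overlook. The sketch below runs the same command on a throwaway directory (created with mktemp, standing in for /netscr/my_onyen). It shows that directories become searchable while ordinary data files do not become executable. The helper other_bits is hypothetical.

```shell
#!/bin/sh
# Demonstrate "chmod -R a+rX" on a throwaway directory tree.
demo=$(mktemp -d)
mkdir -p "$demo/results"
echo "data" > "$demo/results/run1.txt"

# Start from a private tree: no access for group or others.
chmod -R go-rwx "$demo"

# a+r : everyone may read files and list directories.
# a+X : execute (search) permission is added only to directories and to
#       files that are already executable by someone.
chmod -R a+rX "$demo"

# Show the "others" permission bits (characters 8-10 of the ls -l mode string).
other_bits() { ls -ld "$1" | cut -c8-10; }
echo "directory: $(other_bits "$demo/results")"
echo "file:      $(other_bits "$demo/results/run1.txt")"
# prints:
# directory: r-x
# file:      r--
```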
Mass Storage
“To infinity … and beyond” - Buzz Lightyear
access via ~/ms
looks like an ordinary disk file system, but data is actually stored on tape
“limitless” capacity
data is backed up
For storage only, not a work directory (i.e. don’t run jobs from here)
if you have many small files, use tar or zip to create a single file for better performance
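The tar advice above can be sketched as follows. The example builds 100 small stand-in files in a temporary directory, bundles them into one compressed archive, and checks the archive contents. The directory and file names are made up for illustration, and the final copy to ~/ms is shown only as a comment.

```shell
#!/bin/sh
# Bundle many small files into one archive before copying to mass storage.
work=$(mktemp -d)
mkdir "$work/results"

# Create 100 small files to stand in for real job output.
i=1
while [ "$i" -le 100 ]; do
    echo "result $i" > "$work/results/out_$i.txt"
    i=$((i + 1))
done

# -c create, -z gzip-compress, -f archive name; -C changes directory first
# so the archive stores relative paths (results/out_1.txt, ...).
tar -czf "$work/results.tar.gz" -C "$work" results

# Count the data files stored in the archive.
count=$(tar -tzf "$work/results.tar.gz" | grep -c 'out_.*\.txt$')
echo "files in archive: $count"
# prints: files in archive: 100

# On Emerald the single archive would then be copied to mass storage, e.g.
#   cp "$work/results.tar.gz" ~/ms/
```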
What do a Job Scheduler and batch system do?
Manage Resources
allocate user tasks to resources
monitor tasks
process control
manage input and output
report status, availability, etc
enforce usage policies
LSF
All Research Computing clusters use LSF to do job scheduling and management
LSF (Load Sharing Facility) is a (licensed) product from Platform Computing
• Fairly distribute compute nodes among users
• enforce usage policies for established queues
most common queues: int, now, week, month
• RC uses Fair Share scheduling, not first come, first served (FCFS)
LSF commands typically start with the letter b (as in batch), e.g. bsub, bqueues, bjobs, bhosts, …
• see man pages for much more info!
Simplified view of LSF
bsub -R X64bit -q week myjob
[Diagram: a user logged in to a login node submits the job; the job is routed to a queue, where it waits with other queued jobs (job_J, job_F, myjob, job_7); the job is then dispatched to run on an available host which satisfies the job requirements.]
Common batch commands
bsub - submit jobs
bqueues - view info on defined queues
• bqueues -l week
bkill - stop/cancel submitted job
bjobs - view submitted jobs
• bjobs -u all
bhist - job history
• bhist -l <jobID>
bhosts - status and resources of hosts (nodes)
Common batch commands
bpeek - display output of running job
Use man pages to get much more info!
• man bjobs
bfree - query LSF to find job slots currently available that fit your resource requirement
• this is an RC command extension
• bfree -help (or -h)
jobmon - monitor changes in job status
• this is an RC command, typically run in a separate window
Submitting Jobs: bsub Command
Submit Jobs - bsub
• All files must be in scratch space, e.g. /netscr, /largefs, /smallfs
• Home directory is not mounted on compute nodes
• bsub [-bsub_opts] executable [-exec_opts]
bsub continued
Common bsub options:
• -o <filename> (e.g. -o out.%J)
• -q <queue name> (e.g. -q now)
• -R "resource specification" (e.g. -R xeon30)
• -n <number of processes> (used for parallel, MPI jobs)
• -a <application specific esub> (e.g. -a mpichp4, used on MPI jobs)
Two methods to submit jobs:
bsub example: submit the executable job, myexe, to the week queue to run on a 64-bit Linux OS and redirect output to the file out.<jobID> (default is to mail output)
Method 1: Command Line
• bsub -q week -R X64bit -o out.%J myexe
Method 2: Create a file (details to follow) called, for example, myexe.bsub, and then submit that file. Note the redirect symbol, <
• bsub < myexe.bsub
Method 2 cont.
The file you submit contains all the bsub options you want, so for this example myexe.bsub will look like this
• #BSUB -q week
• #BSUB -o out.%J
• #BSUB -R X64bit
• myexe
This is actually a shell script, so the top line could be the normal #!/bin/csh, etc., and you can run any commands you would like.
• if this doesn’t mean anything to you then never mind :)
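As a sketch of Method 2, the snippet below writes a job file like the myexe.bsub example and does a dry run instead of submitting it. Assumptions: the #!/bin/sh top line is one possible choice of shell (the slide mentions #!/bin/csh), the echo line stands in for "any commands you would like", and ./myexe assumes the executable sits in the submission directory.

```shell
#!/bin/sh
# Write a job file equivalent to the myexe.bsub example above.
jobfile=$(mktemp)
cat > "$jobfile" <<'EOF'
#!/bin/sh
#BSUB -q week
#BSUB -o out.%J
#BSUB -R X64bit
# Ordinary shell commands may precede the executable.
echo "job started on $(hostname)"
./myexe
EOF

# Dry run: list the bsub directives and show the command that would
# submit the file. Nothing is sent to LSF here.
grep '^#BSUB' "$jobfile"
echo "bsub < $jobfile"
```

The three #BSUB lines carry the same options as the Method 1 command line, so the two submissions should be equivalent.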
Parallel Job example
Batch Command Line Method
bsub -q week -o out.%J -n 30 -a mpichp4 mpirun.lsf myParallelExe
Batch File Method
bsub < myexe.bsub
where myexe.bsub will look like this
• #BSUB -q week
• #BSUB -o out.%J
• #BSUB -n 30
• #BSUB -a mpichp4
• mpirun.lsf myexe
Submitting Jobs: Specialty Scripts
Running a SAS job through batch (2 ways)
• bsub -q week -R blade sas program.sas
• bsas test.sas
Running a Matlab job through batch (2 ways)
• bsub -q week -R blade matlab -nodisplay -nojvm -nosplash program.m -logfile program.log
• bmatlab test.m
Interactive Jobs: Setup
X-Windows
• Linux/OSX: X11 client
• Windows: X-Win32
Offered on UNC Software Acquisition site https://shareware.unc.edu
Port forwarding on SSH Secure Shell
Setting up a session on X-Win32
Interactive Jobs: Submission
-Ip or -Is
• bsub -q int -R blade -Ip sas
• bsub -q int -R blade -Ip gv
• bsub -q int -R blade -Ip matlab
• bsub -q int -Is tcsh
Specialty Scripts
•xsas
•xstata
Licensed Software
over 20 licensed software applications (some are site licensed, others restricted)
• Matlab, Maple, Mathematica, Gaussian, Accelrys Materials Studio and Discovery Studio modules, Sybyl, Schrodinger, SAS, Stata, ArcGIS, NAG, IMSL, Totalview, and more.
compilers (licensed and otherwise)
• Intel, PGI, Absoft, GNU, IBM
Numerous other packages provided for research and technical computing
• including BLAST, PyMol, SOAP, PLINK, NWChem, R, Cambridge Structural Database, Amber, Gromacs, Petsc, Scalapack, Netcdf, Babel, Qt, Ferret, Gnuplot, Grace, iRODS, XCrySDen, and more.
Available Software
Most of the software is installed under AFS and is made available through package space.
AFS (Andrew File System) is a distributed networked file system. Your home directory and software packages are mounted in AFS space.
Changes made to your package space are preserved over login sessions.
Package Space
Use ipm (Isis Package Manager) to manage your packages.
ipm commands
• ipm add (ipm a)
• ipm remove (ipm r)
• ipm query (ipm q)
Available packages
• http://help.unc.edu/1689
man ipm
Compiling on Emerald
Compilers
•FORTRAN 77/90/95
•C/C++
Parallel Computing
•MPI (MPICH, LAM/MPI, MPICH-GM)
•OpenMP
Compiling Details on Emerald
Compiler         Package name              Command
Intel            intel_fortran, intel_CC   ifort, icc, icpc
Portland Group   pgi                       pgf77, pgf90, pgcc, pgCC
Absoft           profortran                f77, f90
GNU              gcc                       gfortran, g77, gcc, g++
Compiling MPI programs
Use the MPI wrappers to compile your program
•mpicc, mpiCC, mpif90, mpif77
•the wrappers will find the appropriate include files and libraries and then invoke the actual compiler
•for example, mpicc will invoke either gcc, icc, or pgcc depending upon which package you have loaded
Compiling Details on Emerald
Add a compiler into your working environment
• ipm add package_name
Compile a code
• command code.c -o executable
Run executable on a compute node using the bsub command
• bsub -q week -R blade executable
Contacting Research Computing
Questions?
For assistance with Emerald, please contact the Research Computing Group:
•Email: [email protected]
•Phone: 919-962-HELP
•Submit help ticket at http://help.unc.edu