Before We Start
• Sign in • Request an account if necessary
• Windows users: Download PuTTY
  • Google "PuTTY" • First result • Save putty.exe to your Desktop
• Alternative: ETX at newriver.arc.vt.edu (or log in over SSH; see the sketch below)
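If you already have a terminal (macOS, Linux, or PuTTY itself), a plain SSH login works the same way. A minimal sketch, using the newriver.arc.vt.edu host named above; the username is a placeholder:

```bash
# Log in to NewRiver over SSH (replace "yourpid" with your VT username).
ssh yourpid@newriver.arc.vt.edu
```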
Today’s goals
• Introduce ARC
• Give an overview of HPC today
• Give an overview of VT-HPC resources
• Familiarize the audience with interacting with VT-ARC systems
Should I Pursue HPC?
• Necessity: Are local resources insufficient to meet your needs?
  – Very large jobs
  – Very many jobs
  – Large data
• Convenience: Do you have collaborators?
  – Share projects between different entities
  – Convenient mechanisms for data sharing
Parallelism is the New Moore’s Law
• Power and energy efficiency impose a key constraint on the design of microarchitectures
• Clock speeds have plateaued
• Hardware parallelism is increasing rapidly to make up the difference
Research in HPC is Broad
• Earthquake Science and Civil Engineering
• Molecular Dynamics
• Nanotechnology
• Plant Science
• Storm modeling
• Epidemiology
• Particle Physics
• Economic analysis of phone network patterns
• Brain science
• Analysis of large cosmological simulations
• DNA sequencing
• Computational Molecular Sciences
• Neutron Science
• International Collaboration in Cosmology and Plasma Physics
Who Uses HPC?
• >2 billion core-hours allocated
• 1,400 allocations
• 350 institutions
• 32 research domains
[Chart: allocations by research domain: Physics (91) 19%; Molecular Biosciences (271) 17%; Astronomical Sciences (115) 13%; Atmospheric Sciences (72) 11%; Materials Research (131) 9%; Chemical, Thermal Sys (89) 8%; Chemistry (161) 7%; Scientific Computing (60) 2%; Earth Sci (29) 2%; Training (51) 2%]
Popular Software Packages
• Molecular Dynamics: Gromacs, LAMMPS, NAMD, Amber
• CFD: OpenFOAM, Ansys, Star-CCM+
• Finite Elements: deal.II, Abaqus
• Chemistry: VASP, Gaussian, PSI4, Q-Chem
• Climate: CESM
• Bioinformatics: Mothur, QIIME, mpiBLAST
• Numerical Computing/Statistics: R, MATLAB, Julia
• Visualization: ParaView, VisIt, EnSight
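On ARC systems, as on most HPC clusters, packages like these are typically accessed through an environment-module system. A minimal sketch; the module name below is illustrative, so check what is actually installed:

```bash
# See which packages (and versions) are installed on the cluster.
module avail

# Load one into your environment (name/version is illustrative).
module load gromacs

# Confirm what is currently loaded.
module list
```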
Learning Curve
• Linux: Command-line interface
• Scheduler: Shares resources among multiple users (see the sample job script below)
• Parallel Computing:
  • Need to parallelize code to take advantage of a supercomputer's resources
  • Third-party programs or libraries make this easier
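To give a feel for the scheduler, here is a minimal batch script, assuming a PBS/Torque-style scheduler of the kind ARC systems use; the queue name, allocation name, and resource counts are placeholders:

```bash
#!/bin/bash
#PBS -l nodes=1:ppn=16     # request 1 node with 16 cores
#PBS -l walltime=1:00:00   # request 1 hour of run time
#PBS -q normal_q           # queue name (placeholder; system-specific)
#PBS -A YourAllocation     # allocation to charge (placeholder)

cd $PBS_O_WORKDIR          # run from the directory the job was submitted from
./my_program               # your (possibly parallel) executable
```

Submitting it with `qsub job.sh` places it in the queue; the scheduler starts it once the requested resources are free.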
Essential Components of HPC
• Supercomputing resources
• Storage
• Visualization
• Data management
• Network infrastructure
• Support
Terminology
• Core: A computational "unit"
• Socket: A single CPU ("processor"). Includes roughly 4-16 cores.
• Node: A single "computer". Includes roughly 2-8 sockets.
• Cluster: A single "supercomputer" consisting of many nodes.
• GPU: Graphics processing unit. Attached to some nodes. General-purpose GPUs (GPGPUs) can be used to speed up certain kinds of codes.
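On any Linux node you can see this hierarchy directly. A small illustration (the output values shown are made up for a 2-socket, 8-cores-per-socket node):

```bash
# Show the socket/core layout of the current node.
lscpu | grep -E 'Socket|Core|^CPU\(s\)'
#   CPU(s):               16
#   Core(s) per socket:   8
#   Socket(s):            2
```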
Blade : Rack : System
• 1 node: 2 sockets x 8 cores = 16 cores
• 1 chassis: 10 nodes = 160 cores
• 1 rack (frame): 4 chassis = 640 cores
• 1 system: 10 racks = 6,400 cores
Shared vs. Distributed memory
Shared memory:
• All processors have access to a pool of shared memory
• Access times vary from CPU to CPU in NUMA systems
• Example: SGI UV, CPUs on the same node
Distributed memory:
• Memory is local to each processor
• Data exchange by message passing over a network
• Example: clusters with single-socket blades
[Diagram: shared memory: many processors (P) attached to one memory pool; distributed memory: each processor with its own memory (M), connected by a network]
Multi-core systems
• Current processors place multiple processor cores on a die
• Communication details are increasingly complex:
  – Cache access
  – Main memory access
  – QuickPath / HyperTransport socket connections
  – Node-to-node connection via network
[Diagram: multi-core nodes, each with local memory, connected by a network]
Accelerator-based Systems
• Calculations made in both CPUs and GPUs
• No longer limited to single-precision calculations
• Load balancing critical for performance
• Requires specific libraries and compilers (CUDA, OpenCL); see the sketch below
• Co-processor from Intel: MIC (Many Integrated Core)
[Diagram: nodes pairing CPUs with GPUs and their memory, connected by a network]
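In practice, working on a GPU node usually starts by loading the vendor toolchain and checking the attached devices. The module name below is an assumption; `nvidia-smi` is NVIDIA's standard status tool:

```bash
# Load the CUDA toolkit (module name is illustrative; try `module avail cuda`).
module load cuda

# List the GPUs on this node, their memory, and current utilization.
nvidia-smi
```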
Advanced Research Computing
• Unit within the Office of the Vice President for Information Technology
• Provides centralized resources for:
  – Research computing
  – Visualization
• Staff to assist users
• Website: http://www.arc.vt.edu
ARC Goals
• Advance the use of computing and visualization in VT research
• Centralize resource acquisition, maintenance, and support for the research community
• Provide support to facilitate usage of resources and minimize barriers to entry
• Enable and participate in research collaborations between departments
Personnel
• Associate VP for Research Computing: Terry Herdman
• Director, Visualization: Nicholas Polys
• Computational Scientists:
  • John Burkardt
  • Justin Krometis
  • James McClure
  • Srijith Rajamohan
  • Bob Settlage
• Software Engineer: Nathan Liles
Personnel (Continued)
• System Administrators (UAS): Josh Akers, Matt Strickler
• Solutions Architects: Brandon Sawyers, Chris Snapp
• Business Manager: Alana Romanella
• User Support GRAs: Mai Dahshan, Ahmed Ibrahim, TBA
Compute Resources
| System | Usage | Nodes | Node Description | Special Features |
|---|---|---|---|---|
| NewRiver | GPGPU, data intensive | 165 | 126x 24 cores, 128 GB (Haswell); 39x 28 cores, 512 GB (Broadwell) | 78 P100 GPUs; 8 K80 GPUs; 16 "big data" nodes; 24 512 GB nodes; 2 3 TB nodes |
| Cascades | Large-scale CPU | 196 | 32 cores, 128 GB (Broadwell) | 8 K80 GPUs; 2 3 TB nodes |
| DragonsTooth | Single-node | 96 | 24 cores, 256 GB (Haswell) | OpenStack on 24 nodes |
| BlueRidge | Large-scale CPU | 408 | 16 cores, 64 GB (Sandy Bridge) | 4 K40 GPUs; 18 128 GB nodes |
| Huckleberry | Deep learning | 14 | 2x IBM Power8 (Minsky), 256 GB | 96 P100 GPUs with NVLink |
Storage Resources
| Name | Intent | File System | Total Size | Maximum Usage | Data Lifespan |
|---|---|---|---|---|---|
| Home | Long-term storage of files | Qumulo | 219 TB | 640 GB per user | Unlimited |
| Group | Shared storage for research groups | GPFS | 1.0 PB | 10 TB per group | Unlimited |
| Work | Fast I/O, temporary storage | GPFS | 1.1 PB | 14 TB per user | 120 days |
| Archive | Long-term storage for infrequently-accessed files | LTFS | Unlimited | Unlimited | |
Storage within a Job

| Name | Intent | File System | Environment Variable | Per User Maximum | Data Lifespan | Available On |
|---|---|---|---|---|---|---|
| Local Scratch | | Local disk (hard drives) | $TMPDIR | Size of node hard drive | Length of job | Compute nodes |
| Memory (tmpfs) | Very fast I/O | Memory (RAM) | $TMPFS | Size of node memory | Length of job | Compute nodes |
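A common pattern in a job script is to stage data into fast node-local scratch via $TMPDIR, compute there, and copy results back before the job ends. A minimal sketch; the paths and program name are placeholders:

```bash
# Stage input to node-local scratch for fast I/O during the run.
cp $HOME/project/input.dat $TMPDIR/
cd $TMPDIR
./my_program input.dat > output.dat
# Local scratch is wiped when the job ends, so stage results back out.
cp output.dat $HOME/project/
```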
Visualization Resources
• VisCube: 3D immersion environment with three 10′ × 10′ walls and a floor of 1920×1920 stereo projection screens
• DeepSix: Six tiled monitors with a combined resolution of 7680×3200
• ROVR Stereo Wall
• AISB Stereo Wall
Getting Started with ARC
• Review ARC's system specifications and choose the right system(s) for you
  – Specialty software
• Apply for an account online at the Advanced Research Computing website
• When your account is ready, you will receive confirmation from ARC's system administrators
Allocation System: Goals
• Track projects that use ARC systems and document how resources are being used
• Ensure that computational resources are allocated appropriately based on needs:
  – Research: Provide computational resources for your research lab
  – Instructional: System access for courses or other training events
Allocation Eligibility
To qualify for an allocation, you must meet at least one of the following:
• Be a Ph.D.-level researcher (post-docs qualify)
• Be an employee of Virginia Tech and the PI for research computing
• Be an employee of Virginia Tech and the co-PI for a research project led by a non-VT PI
Allocation Application Process
• Create a research project in the ARC database
• Add grants and publications associated with the project
• Create an allocation request using the web-based interface
• Allocation review may take several days
• Users may be added to run jobs against your allocation once it has been approved
Resources
• ARC Website: http://www.arc.vt.edu
• ARC Compute Resources & Documentation: http://www.arc.vt.edu/computing
• ARC Software: http://www.arc.vt.edu/software
• New Users Guide: http://www.arc.vt.edu/newusers
• Frequently Asked Questions: http://www.arc.vt.edu/faq
• Linux Introduction: http://www.arc.vt.edu/unix