Introduction to High Performance Computing at ZIHmlieber/slides/Architecture.pdf · 2010-06-22 ·...

Page 1: Introduction to High Performance Computing at ZIHmlieber/slides/Architecture.pdf · 2010-06-22 · High Performance Computing at ZIH Architecture of the PC Farm (Deimos) Slide 2 -

Zellescher Weg 12

Trefftz-Bau/HRSK 151

Phone +49 351 - 463 - 39871

Guido Juckeland ([email protected])

Center for Information Services and High Performance Computing (ZIH)

Introduction to High Performance Computing at ZIH

Architecture of the PC Farm (Deimos)


Slide 2 - Guido Juckeland

Agenda

PC Farm Components

AMD Opteron Processors and Systems

Infiniband Networks


PC Farm Components (Deimos)


Linux Networx PC-Farm (Deimos)

1292 AMD Opteron x85 dual-core CPUs (2.6 GHz)

726 compute nodes with 2, 4 or 8 CPU cores

2 GiByte main memory per core

2 Infiniband interconnects (MPI- and I/O-Fabric)

68 TByte SAN-Storage

70, 150 or 290 GByte scratch disk per node

OS: SuSE SLES 10

Batch system: LSF

Compilers: PathScale, PGI, Intel, GNU

3rd party applications: Ansys100, CFX, Fluent, Gaussian, LS-DYNA, Matlab, MSC,…
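The headline figures above can be combined into rough aggregate numbers. The following Python sketch is a back-of-the-envelope estimate, not an official specification. It assumes that the 2 GiByte per core applies uniformly to all cores, and that each core completes 2 double-precision floating-point operations per cycle; the latter is an assumption about this Opteron generation and does not appear on the slide.

```python
# Back-of-the-envelope aggregate figures for Deimos.
# FLOPS_PER_CYCLE is an ASSUMPTION, not a number taken from the slides.

CPUS = 1292              # dual-core AMD Opteron x85 CPUs (from the slide)
CORES_PER_CPU = 2        # dual-core
CLOCK_GHZ = 2.6          # from the slide
MEM_PER_CORE_GIB = 2     # from the slide, assumed uniform here
FLOPS_PER_CYCLE = 2      # assumed: one add and one multiply per cycle


def total_cores(cpus: int = CPUS, cores_per_cpu: int = CORES_PER_CPU) -> int:
    """Total number of CPU cores in the farm."""
    return cpus * cores_per_cpu


def total_memory_gib(cores: int, mem_per_core_gib: int = MEM_PER_CORE_GIB) -> int:
    """Aggregate main memory if every core has the same share."""
    return cores * mem_per_core_gib


def peak_gflops(cores: int,
                clock_ghz: float = CLOCK_GHZ,
                flops_per_cycle: int = FLOPS_PER_CYCLE) -> float:
    """Theoretical peak in GFlop/s under the stated assumptions."""
    return cores * clock_ghz * flops_per_cycle


if __name__ == "__main__":
    cores = total_cores()
    print(f"cores:       {cores}")
    print(f"memory:      {total_memory_gib(cores)} GiByte")
    print(f"peak (est.): {peak_gflops(cores) / 1000:.1f} TFlop/s")
```

The memory estimate is a lower bound, since some quad nodes carry more memory than 2 GiByte per core.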


Deimos - Partitions

2 Master Nodes

– Not accessible to users; used for PC farm management

4 Login Nodes

– 4-core nodes

– Reachable via DNS round robin at deimos.hrsk.tu-dresden.de

Single, dual and quad nodes

– 1, 2 or 4 CPUs

– 4, 8 or 16 GiByte main memory (24 Quads with 32 GiByte)

– 80, 160 or 300 GByte local disks

Set up as phase 1 and phase 2 nodes

– Identical hardware

– Differences in the connection to the MPI and the I/O fabric (covered later)
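The three node classes can be written down as a small table and checked against the "2 GiByte per core" rule. This is a minimal Python sketch; the class and field names are hypothetical, and the 24 quad nodes with 32 GiByte are deliberately left out of the table.

```python
from dataclasses import dataclass

CORES_PER_CPU = 2        # dual-core Opterons
MEM_PER_CORE_GIB = 2     # rule of thumb from the overview slide


@dataclass(frozen=True)
class NodeClass:
    """One standard node configuration (names are illustrative)."""
    name: str
    cpus: int
    memory_gib: int
    local_disk_gb: int

    @property
    def cores(self) -> int:
        return self.cpus * CORES_PER_CPU

    @property
    def expected_memory_gib(self) -> int:
        return self.cores * MEM_PER_CORE_GIB


# Standard configurations as listed on the slide.
NODE_CLASSES = [
    NodeClass("single", cpus=1, memory_gib=4, local_disk_gb=80),
    NodeClass("dual", cpus=2, memory_gib=8, local_disk_gb=160),
    NodeClass("quad", cpus=4, memory_gib=16, local_disk_gb=300),
]

if __name__ == "__main__":
    for node in NODE_CLASSES:
        ok = node.memory_gib == node.expected_memory_gib
        print(f"{node.name:6s} {node.cpus} CPU(s), {node.cores} cores, "
              f"{node.memory_gib} GiByte, rule holds: {ok}")
```

When requesting resources from the batch system, the per-node core count and memory in such a table are the limits a single-node job has to fit into.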


AMD Opteron Processors and Systems


AMD Opteron CPU - Design

AMD Opteron x85 (2.6 GHz)

On-chip memory controller (2 memory channels with 3.2 GiByte/s transfer bandwidth each)

Per core: 64 KiByte Level 1 instruction cache and 64 KiByte Level 1 data cache

1 MiByte Level 2 cache

64-bit extension of the IA-32 x86 architecture (x86-64, x64 or EM64T)

Now also available as quad-core CPUs
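The memory bandwidth figures above translate into per-CPU and per-node numbers with simple arithmetic. The Python sketch below assumes that both cores of a CPU share the memory controller equally when both are busy, and that each CPU in a multi-CPU node accesses only its own locally attached memory, so that bandwidth adds up across CPUs. Both are simplifying assumptions, not statements from the slide.

```python
# Memory bandwidth arithmetic for one Opteron CPU and for a whole node.
# ASSUMPTIONS: equal sharing between cores, purely local memory accesses.

CHANNELS_PER_CPU = 2            # from the slide
CHANNEL_BANDWIDTH_GIB_S = 3.2   # from the slide, per channel
CORES_PER_CPU = 2               # dual-core


def cpu_bandwidth() -> float:
    """Peak memory bandwidth of one CPU in GiByte/s."""
    return CHANNELS_PER_CPU * CHANNEL_BANDWIDTH_GIB_S


def per_core_share() -> float:
    """Bandwidth per core when both cores stream from memory at once."""
    return cpu_bandwidth() / CORES_PER_CPU


def node_bandwidth(cpus: int) -> float:
    """Aggregate peak bandwidth of a node with `cpus` CPUs."""
    return cpus * cpu_bandwidth()


if __name__ == "__main__":
    print(f"per CPU:  {cpu_bandwidth():.1f} GiByte/s")
    print(f"per core: {per_core_share():.1f} GiByte/s")
    for cpus in (1, 2, 4):
        print(f"{cpus}-CPU node: {node_bandwidth(cpus):.1f} GiByte/s")
```

These are peak values; sustained bandwidth measured by a benchmark will be lower.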


AMD Opteron – Block diagram

[Block diagram, components shown: instruction TLB and Level 1 instruction cache; fetch, pick, decode and pack stages; three 8-entry schedulers with ALU and AGU units; one 36-entry scheduler with FADD, FMUL and FMISC units; data TLB and Level 1 data cache with ECC; branch prediction with 2k branch targets, 16k history counters, RAS and target address; Level 2 cache with L2 ECC, L2 tags and L2 tag ECC; System Request Queue (SRQ); crossbar (XBAR); memory controller and HyperTransport]


Deimos – Layout of a single-CPU node

[Diagram: one AMD Opteron 185 with 4 GiByte of memory, connected via HyperTransport to the peripheral devices (Infiniband, Ethernet, disk)]


Deimos – Layout of a dual-CPU node

[Diagram: two AMD Opteron 285 CPUs, each with its own 4 GiByte of memory, linked by HyperTransport; the peripheral devices (Infiniband, Ethernet, disk) are attached via HyperTransport]


Deimos - Layout of a quad-CPU Node

[Diagram: four AMD Opteron 885 CPUs, each with its own 4 GiByte of memory, interconnected by HyperTransport links; the peripheral devices (Infiniband, Ethernet, disk) are attached via HyperTransport]


Infiniband Networks


Basic Layout


More complicated structures


Infiniband-Stack


Consequences for the user

No standard Linux network interfaces (eth0, ...)

No IP addresses

No direct traffic monitoring possible

Very low MPI latency (about 5-15 μs)

High MPI bandwidth (up to 900 MiByte/s)

The batch system does not know about the state of the Infiniband fabric
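The latency and bandwidth figures above can be turned into a rough estimate of message transfer times. The Python sketch below uses a simple linear model (time = latency + size / bandwidth), which is a common simplification and not a measurement of Deimos. It picks 5 μs from the quoted latency range and takes the quoted 900 MiByte/s as the asymptotic bandwidth; all function names are illustrative.

```python
# Simple latency-bandwidth model for one MPI message.
# ASSUMPTION: transfer time grows linearly with message size.

LATENCY_S = 5e-6               # lower end of the 5-15 µs range on the slide
BANDWIDTH_B_S = 900 * 2**20    # 900 MiByte/s expressed in bytes per second


def transfer_time(n_bytes: float,
                  latency_s: float = LATENCY_S,
                  bandwidth: float = BANDWIDTH_B_S) -> float:
    """Estimated time in seconds to send a message of n_bytes."""
    return latency_s + n_bytes / bandwidth


def effective_bandwidth(n_bytes: float,
                        latency_s: float = LATENCY_S,
                        bandwidth: float = BANDWIDTH_B_S) -> float:
    """Achieved bandwidth in bytes per second for one message."""
    return n_bytes / transfer_time(n_bytes, latency_s, bandwidth)


def half_performance_size(latency_s: float = LATENCY_S,
                          bandwidth: float = BANDWIDTH_B_S) -> float:
    """Message size at which half of the peak bandwidth is reached."""
    return latency_s * bandwidth


if __name__ == "__main__":
    for size in (8, 1024, 64 * 1024, 2**20, 16 * 2**20):
        mib_s = effective_bandwidth(size) / 2**20
        print(f"{size:>9d} bytes: {transfer_time(size) * 1e6:10.1f} µs, "
              f"{mib_s:7.1f} MiByte/s")
    print(f"half-performance size: {half_performance_size():.0f} bytes")
```

In this model small messages are dominated by latency, so aggregating many small messages into fewer large ones is what brings the effective bandwidth close to the peak.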


Deimos Infiniband-Layout (rough sketch)

[Diagram: every node is connected to both the MPI network and the I/O network]


Deimos MPI-Fabric

+-------------------+       +--------------------+       +-------------------+
| Switch 1          |       | Switch 2           |       | Switch 3          |
|                   |  30x  |                    |  30x  |                   |
| Rack 05           |-------| Rack 20            |-------| Rack 25           |
|                   |       |                    |       |                   |
| all Phase1 Nodes  |       | Phase2 Duals+Quads |       | Phase 2 Singles   |
+-------------------+       +--------------------+       +-------------------+

Three 288-port Voltaire ISR 9288 IB switches with 4x Infiniband ports
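The chain of three switches implies a port budget and a hop count that matter for job placement. The Python sketch below is only an illustration: it assumes that "30x" means 30 cables between neighbouring switches and that each cable occupies one port on both switches it connects. Neither detail is spelled out on the slide.

```python
# Port budget of the MPI fabric: three 288-port switches in a chain.
# ASSUMPTION: "30x" denotes 30 cables between adjacent switches, and each
# cable occupies one port on both of the switches it connects.

PORTS_PER_SWITCH = 288
LINKS_BETWEEN_NEIGHBOURS = 30
SWITCHES = ["Switch 1 (Rack 05)", "Switch 2 (Rack 20)", "Switch 3 (Rack 25)"]


def node_ports(index: int, count: int = len(SWITCHES)) -> int:
    """Ports left for nodes on the switch at position `index` in the chain."""
    neighbours = int(index > 0) + int(index < count - 1)
    return PORTS_PER_SWITCH - neighbours * LINKS_BETWEEN_NEIGHBOURS


def switch_hops(src: int, dst: int) -> int:
    """Number of inter-switch link bundles a message crosses in the chain."""
    return abs(src - dst)


if __name__ == "__main__":
    for i, name in enumerate(SWITCHES):
        print(f"{name}: {node_ports(i)} ports available for nodes")
    print("total:", sum(node_ports(i) for i in range(len(SWITCHES))))
    print("hops Switch 1 -> Switch 3:", switch_hops(0, 2))
```

Under these assumptions, traffic between nodes on different switches shares the 30 inter-switch links, so jobs whose nodes all sit on one switch avoid that bottleneck.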


Deimos I/O Fabric

Tree structure with

– One 192-port Voltaire ISR 9288 IB switch with 4x Infiniband ports (Rack 07)

– 36 passive 24-port Mellanox IB switches (4x)

[Diagram: the Voltaire core switch at the root of the tree, with the 24-port Mellanox switches below it serving the Phase 1 and Phase 2 nodes]
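The tree structure of the I/O fabric forces a trade-off between uplinks to the core switch and ports left for nodes. The slides do not say how many uplinks each 24-port switch uses, so the Python sketch below simply explores the options. It assumes that every leaf switch uses the same number of uplinks and that the core switch carries only these uplinks, which is a simplification.

```python
# Port budget of the I/O fabric tree for different uplink counts.
# ASSUMPTION: every leaf switch uses the same number of uplinks to the
# core switch; the slides do not state this number.

CORE_PORTS = 192       # Voltaire core switch
LEAF_SWITCHES = 36     # 24-port Mellanox switches
LEAF_PORTS = 24


def max_uplinks_per_leaf() -> int:
    """Largest uniform uplink count that fits on the core switch."""
    return CORE_PORTS // LEAF_SWITCHES


def node_ports(uplinks_per_leaf: int) -> int:
    """Leaf ports left for nodes across the whole tree."""
    if not 1 <= uplinks_per_leaf <= max_uplinks_per_leaf():
        raise ValueError("uplink count does not fit on the core switch")
    return LEAF_SWITCHES * (LEAF_PORTS - uplinks_per_leaf)


def oversubscription(uplinks_per_leaf: int) -> float:
    """Ratio of node-facing ports to uplinks on one leaf switch."""
    return (LEAF_PORTS - uplinks_per_leaf) / uplinks_per_leaf


if __name__ == "__main__":
    for uplinks in range(1, max_uplinks_per_leaf() + 1):
        print(f"{uplinks} uplink(s) per leaf: {node_ports(uplinks)} node ports, "
              f"oversubscription {oversubscription(uplinks):.1f}:1")
```

More uplinks reduce contention towards the core switch but leave fewer ports for nodes, which is the usual design tension in such a tree.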

