Systems Controls @lrz - Lawrence Livermore National Laboratory

Systems Controls @lrz.de

Detlef Labrenz (labrenz@lrz.de) 18. Sept. 2014

17.09.2014 Leibniz-Rechenzentrum

Agenda

• Introduction

• Power Infrastructure

• Cooling Infrastructure

• IT systems

• Issues @lrz.de

• Discussion

Leibniz Supercomputing Centre

Munich Bavaria Germany & Europe

• We provide generic IT services to all Munich universities

• We provide special IT services to all universities in Bavaria

• Network, High Performance and Grid Computing

• Backup and Archive Services

• IT Management

• We provide supercomputing resources to scientists in Europe

• Member of the German Gauss Supercomputing Centre

• Part of the European HPC Infrastructure PRACE

• Operating Tier-0 Supercomputing Center (SuperMUC system)

• Investigations on Future HPC Systems:

• Hardware Architectures

• Programming Models & System Software

• Zero Emission Data Center

• Re-Use of Waste Heat

SuperMUC: IBM System x iDataPlex

With Direct Water Cooling

Torsten Bloth, IBM Lab Services - © IBM Corporation

iDataPlex DWC rack with water-cooled nodes

(rear view of water manifolds)

Data Center Infrastructure

Layout Power Infrastructure

Controls Power Infrastructure

• Equipment:

- Transformer, switching, ...: SIEMENS

- Dyn UPS: Piller

- Battery backup: Emerson

- Diesel generator: MTU

• Metering

- SOCOMEC & WinCC (power)

- Piller & WinCC (power, messaging)

- JCI Metasys M5 (power)

- deZem (power)

- SWM - Utility Provider (power)

• Monitoring

- Siemens WinCC

Layout Cooling Infrastructure

Controls Cooling Infrastructure

• Equipment:

- Cooling towers: Gohl, Jaeggi

- Chiller: McQuay, CARRIER

- CRAC/CRAH: GEA, WEISS, STULZ, RC Group

- Pumps: Grundfos/ABB & KSB

• Metering

- Krohne (flow)

- Calec (heat)

- WIKA and others (pressure, temperature)

• Monitoring & Operations

- JCI Metasys

Monitoring of IT Systems

• SuperMUC

- Vendor solution: IBM tool set based on Icinga

- Power and energy readings at server level (PDU, paddle cards, RAPL counters) and system level

- Temperature at server level and room level

- Pressure/heat at system level

• CoolMUC

- Vendor solution: power/heat/temperature/flow control

• Clusters & servers, NAS systems

- Nagios-based in-house tools

• Tape libraries

• Networking
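
The slide lists RAPL counters as one source of server-level power and energy readings but does not say how they are evaluated. As a minimal sketch: RAPL exposes a cumulative energy counter, so average power is the energy difference between two readings divided by the elapsed time. The function name `average_power_watts` and its parameters are hypothetical. The microjoule unit and the wrap-around range follow the Linux powercap sysfs files `energy_uj` and `max_energy_range_uj`. The wrap handling assumes the counter wrapped at most once between the two readings.

```python
def average_power_watts(energy_start_uj, energy_end_uj, interval_s, max_range_uj):
    """Average power in watts between two cumulative energy readings.

    energy_start_uj / energy_end_uj: counter values in microjoules.
    interval_s: elapsed time between the two readings in seconds.
    max_range_uj: value at which the counter wraps back to zero.
    """
    if interval_s <= 0:
        raise ValueError("interval_s must be positive")

    delta_uj = energy_end_uj - energy_start_uj
    if delta_uj < 0:
        # Assumption: the counter wrapped exactly once during the interval.
        delta_uj += max_range_uj

    # Convert microjoules to joules, then joules per second to watts.
    return delta_uj / 1_000_000 / interval_s


if __name__ == "__main__":
    # Illustrative values only, not measurements from any real system.
    print(average_power_watts(1_000_000, 3_000_000, 2.0, 10_000_000))
```

Sampling often enough that the counter cannot wrap twice between readings keeps the single-wrap assumption valid.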

Issues @lrz.de

• Power infrastructure

- Monitoring, reporting (dashboard)

- Quality of reported measurements

• Cooling infrastructure

- Operation of cooling loops (hydraulics, meta controls)

- Operation of cooling towers

• Information management

- Integration and consolidation of heterogeneous data sources

- Interoperability of differing system controls

• General

- Interaction with vendors/contractors of BMS

- DCIM strategy: in-house, vendor-based, or open source

Topics of Interest

• Requirements of liquid cooled systems for BMS

• Requirements of large HPC systems for systems control

• Roadmaps for BMS and DCIM

• Vendors' view on status and trends in system controls

• Standardization

• APIs

• Lessons learned and white paper on "Best practice in systems controls for HPC data centers"

Zero Emission Supercomputing Centre

Thank You!

Overview Cooling Infrastructure

[Figure: schematic of the cooling infrastructure by building level (basement, ground floor, 1st to 3rd floor, roof, outdoor grounds). Recoverable labels: water treatment (3x reverse osmosis); cooling towers (4x Gohl); vapor tower (Dunstturm, 1x Gohl); cooling towers (2x Gohl + 5x Jäggi); chillers (2x McQuay + 5x Carrier); well; chilled-water distributor/collector; SuperMUC compute section (≈ 150 racks) with direct module cooling at 30–60 °C; SuperMUC interconnect ("RDHX"); SuperMUC storage; disks; tape libraries; servers (25x KKT Kraus racks); servers (≈ 95x racks); network & core servers I and II; precision coolers; "free cooling" (winter); recooling; air handling units (RLT, 2x GEA); UKG and KKG cooling units; static UPS; dynamic UPS; transformers; emergency generator (NEA); HRR, NSR, DAR.]