
InfiniCortex and the Renaissance in Polish Supercomputing

Date post: 16-Apr-2017
Upload: inside-bigdatacom
DDN Users' Group Meeting, Supercomputing 2016, Salt Lake City, 15 November 2016. From Singapore to Warsaw: InfiniCortex and the Renaissance in Polish Supercomputing. Marek Michalewicz, Interdisciplinary Centre for Mathematical and Computational Modelling (ICM), University of Warsaw, Poland; Institute for Advanced Computational Science, Stony Brook University, USA; A*STAR Computational Resource Centre, Singapore
Transcript
Page 1: InfiniCortex and the Renaissance in Polish Supercomputing

DDN Users' Group Meeting, Supercomputing 2016

Salt Lake City, 15 November 2016

From Singapore to Warsaw: InfiniCortex and the Renaissance in Polish Supercomputing

Marek Michalewicz

Interdisciplinary Centre for Mathematical and Computational Modelling (ICM), University of Warsaw, Poland
Institute for Advanced Computational Science, Stony Brook University, USA
A*STAR Computational Resource Centre, Singapore

Page 2: InfiniCortex and the Renaissance in Polish Supercomputing

A*CRC Datacenter 1

Level 17 at Fusionopolis, Singapore

Page 3: InfiniCortex and the Renaissance in Polish Supercomputing

A*CRC Datacenter 2

Matrix Building at Biopolis

Page 4: InfiniCortex and the Renaissance in Polish Supercomputing

Mellanox Metro-X testing since early 2013
Goal: to connect HPC resources at Fusionopolis with storage and genomics pipeline at Biopolis

A*CRC Metro-X testing team: Stephen Wong, Tay Teck Wee, Steven Chew

Page 5: InfiniCortex and the Renaissance in Polish Supercomputing

NOT GRID!

NOT CLOUD!

NOT "Internet"!

InfiniCortex is ...

InfiniCortex demo at SC16: booth 501

Page 6: InfiniCortex and the Renaissance in Polish Supercomputing

InfiniCortex is like a living global brain

The InfiniCortex uses a metaphor of a human brain's outer layer, the Cortex, consisting of a highly connected and dense network of neurons enabling thinking ..., to deliver concurrent supercomputing across the globe utilising trans-continental InfiniBand and a Galaxy of Supercomputers.

Page 7: InfiniCortex and the Renaissance in Polish Supercomputing

InfiniCortex Components

1. Galaxy of Supercomputers
• Supercomputer interconnect topology work by Y. Deng, M. Michalewicz and L. Orlowski
• Obsidian Strategics Crossbow InfiniBand router

2. ACA 100 & ACE 10
• Asia Connects America 100 Gbps, by November 2014
• Asia Connects Europe 10 Gbps, established February 2015

3. InfiniBand over trans-continental distances
• Using Obsidian Strategics Longbow range extenders

4. Application layer
• from simplest file transfer: dsync+
• to complex workflows: ADIOS, multi-scale models

Page 8: InfiniCortex and the Renaissance in Polish Supercomputing

InfiniCortex Team

With help from:
SingAREN: A/Prof Francis Lee, Prof Lawrence Wong
NTU: Stanley Goh

A*CRC:
Tan Geok Lian (Networking)
Lim Seng (Networking)
Dr Jonathan Low (H/W, S/W, Applications)
Dr Gabriel Noaje (S/W, Applications)
Lukasz Orlowski (S/W, Applications)
Dr Dominic Chien (S/W, Applications)
Dr Liou Sing-Wu (S/W, Applications)

Yves Poppe, International connectivity
Prof Yuefan Deng
Dr Marek Michalewicz (PI)
A/Prof Tan Tin Wee (PI)
Dr David Southwell

Page 9: InfiniCortex and the Renaissance in Polish Supercomputing

(most) Project Partners 2014-2016

Huawei, HPE, Fujitsu, Aspera, Bright Cluster, Altair, ByteScale, Arista FermiLab, George Washington University

Team Europe GEANT, TEIN,

France: University of Reims, Poland: PSNC, ICM

Page 10: InfiniCortex and the Renaissance in Polish Supercomputing

InfiniCortex 2014 (phase 1)

[Map: sites TITECH (Tokyo), A*STAR (Singapore), NCI (Canberra), Seattle, SC14 (New Orleans), GA Tech (Atlanta); legend: 10Gbps InfiniBand, 100Gbps InfiniBand]

Enabling geographically dispersed HPC facilities to collaborate and function as ONE concurrent supercomputer, bringing the capability to address and solve grand challenges to the next level of efficiency and scale.

Page 11: InfiniCortex and the Renaissance in Polish Supercomputing

100Gbps Bandwidth Utilization

Page 12: InfiniCortex and the Renaissance in Polish Supercomputing

InfiniCortex 2015: InfiniBand ring around the World

100Gbps InfiniBand East-ward link: Singapore - trans-Pacific - USA - trans-Atlantic - Europe
10Gbps InfiniBand West-ward link: Singapore - Europe (via TEIN*CC)

[Map: sites TITECH (Tokyo), NCI (Canberra), Seattle, GA Tech (Atlanta), SC15 (Austin, TX), PSNC (Poznan), URCA (Reims), A*STAR (Singapore); network labels: ANA200/Internet2/GEANT, ESnet/Internet2, 100Gbps shared Internet2-SingAREN; legend: 10Gbps InfiniBand, 100Gbps InfiniBand]

Page 13: InfiniCortex and the Renaissance in Polish Supercomputing

InfiniCloud 2015: True HPC Cloud around the globe

[Map: sites NCI (Canberra), SC15 (Austin), GA TECH (Atlanta), SBU (New York), A*CRC (Singapore), PSNC (Poznań), URCA (Reims), UAlberta (Edmonton)]

Page 14: InfiniCortex and the Renaissance in Polish Supercomputing

InfiniCortex demo, SC15, Austin, TX, USA

• 7 InfiniBand sub-nets
• 7 countries: Singapore, USA, Australia, Japan, Poland, France, Canada
• 100Gbps Singapore-Austin
• 10-30Gbps rest of network
• ~15 Universities and Research entities
• ~40 partners and growing
• HPC InfiniCloud over 4 continents

Page 15: InfiniCortex and the Renaissance in Polish Supercomputing

Science, Technology and Research Network (STAR-N) connects all National Supercomputing Centre stakeholders: A*STAR, NUS, NTU and Industrial users with 100Gbps + InfiniBand links.

Singapore InfiniBand connectivity

[Map: nodes NUS, NTU, A*STAR Fusionopolis, SingAREN Global Switch, A*STAR Biopolis; locations WOODLANDS, SELETAR, CHANGI, NOVENA, OUTRAM, ONE-NORTH, JURONG; legend: 500Gbps Infinera Cloud Express, 100Gbps InfiniBand, 10/40/100Gbps InfiniBand/IP]

• A high bandwidth network to connect the distributed login nodes
• Provide high speed access to users (both public and private) anywhere
• Support transfer of large data-sets (both locally and internationally)
• Builds local and international network connectivity (Internet2, TEIN*CC)
• ASEAN, USA, Europe, Australia, Japan, Middle East

Page 16: InfiniCortex and the Renaissance in Polish Supercomputing

Genomic Institute of Singapore - National Supercomputing Centre (GIS-NSCC) Integration

[Diagram: GIS side: NGSP Sequencers at B2 (Illumina + PacBio); POLARIS, Genotyping and other Platforms in L4~L8; Data Manager; Compute; Tiered Storage. NSCC side: NSCC Gateway; Data Manager; Compute; Tiered Storage. Link labels: 500Gbps Primary link; 1 Gbps per sequencer; 1 Gbps per machine; 10 Gbps; 100 Gbps]

STEP 1: Sequencers stream directly to NSCC Storage (NO footprint in GIS).

STEP 2: Automated pipeline analysis once sequencing completes. Processed data resides in NSCC.

STEP 3: Data manager indexes and annotates processed data and replicates metadata to GIS, allowing data to be searched and retrieved.

Page 17: InfiniCortex and the Renaissance in Polish Supercomputing

[Network diagram: Biopolis - Fusionopolis connectivity (1.18Tb). Sites: GIS DC (Biopolis), Matrix DC (Biopolis), Network Room (Fusionopolis), NSCC (Fusionopolis), A*CRC (Fusionopolis), NTU, NTU DR Site, BMRC Research Institutes, SERC Research Institutes. Other labels: HPC, Storage (Isilon), Storage System, Large Memory Nodes, 8 x 10Gbps (80Gbps), 240Gbps used by MTX (+ 160Gbps spare), GIS IP Network, Sequencers (1GE), links A to D. Numbered labels on devices and links refer to the legend below.]

Legend:

1 - Inter-rack fibers (To be procured)
2 - Available dark fibres
3 - Infinera CX-100 are 500Gbps DWDM switches for multiplexing 5 x 100GE of total capacity over a single dark fibre.
4 - Arista 100Gbps Ethernet Switch for Core Backbone
5 - Exanet 100Gbps Core Switches using Cisco Nexus switches.
6 - Obsidian Longbow C400 InfiniBand Range Extender switch. This allows combined capacity of 40Gbps of native InfiniBand connectivity over a distance of 10km-40km, depending on the type of transceivers used.
7 - A-CWDM81 (Coarse Wavelength Division Multiplexing) - performs optical multiplex/demultiplex functions necessary to carry two 4 x QDR range-extended InfiniBand links (as well as a bonus 10G Ethernet or Fiber Channel circuit) over a single fiber pair across a campus or metro area network.
8 - InfiniBand EDR/FDR Switch
9 - Exanet 10/40Gbps Ethernet switch
10 - 4 x 10Gbps (40Gbps) InfiniBand link
11 - 40Gbps QDR link
12 - 9 x 10Gbps ethernet links
13 - 100Gbps ethernet links
14 - 10Gbps ethernet link
15 - 400Gbps combined capacity over a single dark fibre
16 - Mellanox MTX 6100 InfiniBand Switch with up to 240Gbps of InfiniBand capacity over 6 pairs of dark fibres
17 - 2 x 40Gbps ethernet link
18 - 100GE edge switch (To be procured)
19 - Packetlight DWDM Switches: 200G Transponder - Optical Network Transport (OTN) Switches; ROADM (88ch) - Reconfigurable Optical Add/Drop Multiplexer
20 - 10GE Ethernet
21 - 100G EDR link
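The capacity figures quoted in the legend can be cross-checked with simple arithmetic. The following is a minimal Python sketch; the variable names are illustrative, and the per-pair figure for the MTX 6100 assumes the 240Gbps is spread evenly over the six fibre pairs, which the slide does not state.

```python
# Cross-check of link capacities quoted in the legend (all values in Gbps).

# Item 3: Infinera CX-100 multiplexes 5 x 100GE over a single dark fibre.
infinera_total = 5 * 100

# Item 10: 4 x 10Gbps lanes form one 40Gbps InfiniBand link.
ib_link = 4 * 10

# Item 16: MTX 6100 carries up to 240Gbps over 6 pairs of dark fibres.
# Assumption (not stated on the slide): capacity is split evenly per pair.
mtx_per_pair = 240 / 6

# Item 15 and the diagram label: 400Gbps on one dark fibre,
# of which 240Gbps is used by MTX and 160Gbps is spare.
combined = 240 + 160

print(infinera_total, ib_link, mtx_per_pair, combined)
```

Under that even-split assumption, each MTX fibre pair would carry the same 40Gbps as one range-extended QDR link.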

Page 18: InfiniCortex and the Renaissance in Polish Supercomputing

Polish Academic Supercomputing and Networking Landscape, 2016

[Map: Poland]

PIONIER Academic Network Consortium, coordinated by PSNC, Poznan
7,500 km own fiber

Five Academic Supercomputing Centers, combined ~6.7 PFLOPS:
PSNC: 1
Cyfronet: 2.4
WCNS: 1
ICM: 1.3
TASK: 1
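The combined figure follows from the per-centre values on the slide. A minimal Python sketch, assuming the per-centre numbers are in PFLOPS (the slide gives the unit only for the total):

```python
import math

# Per-centre figures from the slide, assumed to be in PFLOPS.
centres = {"PSNC": 1.0, "Cyfronet": 2.4, "WCNS": 1.0, "ICM": 1.3, "TASK": 1.0}

combined = sum(centres.values())
print(f"combined: {combined:.1f} PFLOPS")

# Compare with a tolerance because of floating-point rounding.
assert math.isclose(combined, 6.7)
```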

Page 19: InfiniCortex and the Renaissance in Polish Supercomputing

Number of Polish HPC systems in Top500 list

[Bar chart: x-axis years 1995 to 2016; y-axis number of systems, 0 to 7]

Page 20: InfiniCortex and the Renaissance in Polish Supercomputing

Position of the top Polish HPC system at Top500 list (lower is better)

[Chart: x-axis years 1995 to 2016; y-axis list position, 100 to 500. Data labels as extracted, not matched to years: 479, 408, 467, 430, 68, 138, 85, 88, 106, 145, 232, 211, 59, 38]

Page 21: InfiniCortex and the Renaissance in Polish Supercomputing

Rmax of the top Polish HPC system at Top500 list

[Chart: x-axis years 1995 to 2016; y-axis Rmax on a logarithmic scale from 1 to 10 000 000, with GFLOPS, TFLOPS and PFLOPS levels marked]

Page 22: InfiniCortex and the Renaissance in Polish Supercomputing

Polish HPC systems in Top500 list, June 2016

Page 23: InfiniCortex and the Renaissance in Polish Supercomputing

Poznan Supercomputing and Networking Centre

SC16: booth 501

CYFRONET, Krakow

Page 24: InfiniCortex and the Renaissance in Polish Supercomputing

www.icm.edu.pl

Interdisciplinary Centre for Mathematical and Computational Modelling

University of Warsaw, Poland

Page 25: InfiniCortex and the Renaissance in Polish Supercomputing

www.icm.edu.pl

ICM as an interdisciplinary centre

METEOROLOGICAL ACTIVITY

• Academic (University of Warsaw) service.
• Meteorological activity at ICM UW since 1996.
• Results are publicly available:
  – http://www.meteo.pl
  – http://maps.meteo.pl
• Three independent NWP models:
  – UM (Unified Model), UK Met Office,
  – WRF (Weather Research and Forecasting Model),
  – COAMPS (Coupled Ocean/Atmosphere Mesoscale Prediction System).
• Forecasts are used by public sector, scientists and commercial users.
• Flexible approach to meet sophisticated requirements.

Page 26: InfiniCortex and the Renaissance in Polish Supercomputing

www.icm.edu.pl

Statistics of the main service for 2016

ICM as an interdisciplinary centre

METEOROLOGICAL ACTIVITY

Page 27: InfiniCortex and the Renaissance in Polish Supercomputing

www.icm.edu.pl

UM (Unified Model):

– spatial resolution: 4 km and 1.5 km, length: up to 72 hours.

ICM as an interdisciplinary centre

METEOROLOGICAL ACTIVITY

Page 28: InfiniCortex and the Renaissance in Polish Supercomputing

www.icm.edu.pl

WRF (Weather Research and Forecasting Model) – spatial resolution: 3.4 km, length: up to 120 hours.

ICM as an interdisciplinary centre

METEOROLOGICAL ACTIVITY

Page 29: InfiniCortex and the Renaissance in Polish Supercomputing

www.icm.edu.pl

COAMPS (Coupled Ocean/Atm. Mesoscale Pred. System) – spatial resolution: 13 km, length: up to 108 hours.

ICM as an interdisciplinary centre

METEOROLOGICAL ACTIVITY

Page 30: InfiniCortex and the Renaissance in Polish Supercomputing

Storage Requirements from RFP (Singapore)

Tier                 1PF Config              2PF Config              3PF Config
TIER 0 Scratch FS    2PB 150GB/s,            4PB 150GB/s**,          6PB 150GB/s**,
                     500GB/s Burst Buffer    500GB/s Burst Buffer    500GB/s Burst Buffer
TIER 1 Home FS       2PB 100GB/s             2PB 100GB/s             2PB 100GB/s
TIER 2 Nearline      3PB 50GB/s              3PB 50GB/s              3PB 50GB/s
TIER 3 Archive       5PB 20TB/h              5PB 20TB/h              5PB 20TB/h

[Diagram label: HSM]

Page 31: InfiniCortex and the Renaissance in Polish Supercomputing

NSCC / A*STAR End-To-End Storage Architecture

[Architecture diagram; components:]
• 1PF Compute Cluster
• 265TB Burst Buffer with DDN Infinite Memory Engine (IME) at 500 GB/s
• DDN EXAScaler (Lustre) for Scratch: 4PB at 200 GB/s performance
• EDR InfiniBand N/w
• GS-WOS Bridge
• PFS Stats Collection & Monitoring
• DDN GRIDScaler for Home & Nearline: 4PB at 100 GB/s performance
• WOS over 10GbE
• NAS Gateways & Data Transfer Nodes
• 5PB DDN WOS Object Storage Archive
• MetroX: Remote Login Nodes at NUS
• MetroX: Remote Login Nodes at NTU

Page 32: InfiniCortex and the Renaissance in Polish Supercomputing

Rack Performance: IME

~550GB/s Read, Write
~50 Million IOPs

[Chart: IOR File-per-Process (GB/s), Write and Read; y-axis 0 to 560 000 in steps of 140 000]
[Chart: 4k Random IOPS, Write and Read; logarithmic y-axis 1 to 100 000 000]

Page 33: InfiniCortex and the Renaissance in Polish Supercomputing

ICM Lustre for OKEANOS (CRAY XC40) schematics

Poland

Page 34: InfiniCortex and the Renaissance in Polish Supercomputing

Configuration Details

• 5x DDN SF12KA Head (HA)
• 25x DDN SS8460 Disk Shelf
• 2100 6TB Disks (HGST He8) = total raw 12.6PB
• RAID6 8+2: usable space 10PB
• 25x OSS with 2x dual port Mellanox FDR HCA
• 2x MDS + Netapp E2700 for metadata
• 10x Mellanox 6025 InfiniBand 36 port FDR Switch
• Full Fat Tree IB topology
• 20x CRAY LNET Router (STRIO) with dual port FDR HCA
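The raw and usable capacities quoted for the OKEANOS Lustre file system can be reproduced from the disk count and RAID layout. A minimal Python sketch, assuming decimal units (1 PB = 1000 TB) and ignoring file system overhead and spare disks, neither of which the slide specifies:

```python
# Capacity check for the Lustre configuration listed above.
disks = 2100
disk_tb = 6             # decimal terabytes per disk
shelves = 25
data, parity = 8, 2     # RAID6 8+2: 8 data disks per 10-disk group

disks_per_shelf = disks // shelves
raw_pb = disks * disk_tb / 1000
usable_pb = raw_pb * data / (data + parity)

print(f"disks per shelf: {disks_per_shelf}")
print(f"raw: {raw_pb:.1f} PB, usable after RAID6 8+2: {usable_pb:.2f} PB")
```

The usable figure comes out slightly above 10 PB, consistent with the rounded "usable space 10PB" on the slide.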

Page 35: InfiniCortex and the Renaissance in Polish Supercomputing

Benchmark Results

IOR-2.10.3: MPI Coordinated Test of Parallel I/O

Command: cray-ior -w -t 4m -b 512g -F -k -E -D 240 -i 1 -o /ddn/5sfa/ior_testfile
blocksize = 512 GiB, filesize = 105 TiB, clients = 210 (1 per node), xfersize = 4 MiB
Max Write: 152.0 GB/sec (141.6 GiB/sec)
Max (Ops): 114 592 (write)

Command: /ddn/cray/cray-ior -r -t 4m -b 512g -F -k -E -D 240 -i 1 -o /ddn/5sfa/ior_testfile
Max Read: 181.9 GB/sec (168.5 GiB/sec)
Max (Ops): 114 589 (read)
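The reported aggregate file size and the GB/s to GiB/s conversion of the write result can be checked from the IOR parameters. A minimal Python sketch, assuming file-per-process mode (`-F`) with a single segment, so that each of the 210 clients writes exactly one 512 GiB block; the helper name `gb_to_gib` is illustrative:

```python
GIB = 2**30
TIB = 2**40

clients = 210       # one IOR task per node
block_gib = 512     # -b 512g

# Assumption: one segment per task, so each client writes one block.
filesize_tib = clients * block_gib * GIB / TIB

def gb_to_gib(gb_per_s: float) -> float:
    """Convert decimal GB/s (10^9 bytes) to binary GiB/s (2^30 bytes)."""
    return gb_per_s * 1e9 / GIB

write_gib = gb_to_gib(152.0)

print(f"aggregate file size: {filesize_tib:.0f} TiB")
print(f"max write: {write_gib:.1f} GiB/s")
```

Both values agree with the figures reported for the write test (105 TiB and 141.6 GiB/sec).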

