
SHORT OVERVIEW OF CURRENT STATUS

“SKIF-GRID” SUPERCOMPUTING PROJECT OF THE UNION STATE OF RUSSIA AND BELARUS. SHORT OVERVIEW OF CURRENT STATUS. A. A. Moskovsky, Program Systems Institute, Russian Academy of Sciences. IKI - MSR Research Workshop, Moscow, 10-12 June, 2009.
Page 1: SHORT OVERVIEW OF CURRENT STATUS

“SKIF-GRID” SUPERCOMPUTING PROJECT OF THE UNION STATE OF RUSSIA AND BELARUS

SHORT OVERVIEW OF CURRENT STATUS

A. A. Moskovsky
Program Systems Institute, Russian Academy of Sciences

IKI - MSR Research Workshop
Moscow, 10-12 June, 2009

Page 2: SHORT OVERVIEW OF CURRENT STATUS

Pereslavl-Zalessky

• Russian Golden Ring
• City: 857 years old
• Hometown of Great Dukes of Russia
• The first building site of Peter the Great's navy
• Ancient capital of the Russian Orthodox Church

Map: Moscow to Pereslavl-Zalessky, 120 km

Page 3: SHORT OVERVIEW OF CURRENT STATUS

“SKIF-GRID” PROJECT TIMELINE

1. 2000-2004 - SKIF project, SKIF K-1000 is #98 in Top500
2. June 2004 - first proposal filed for “SKIF-GRID” project
3. March 2007 - approved by Government
4. March 2008 - SKIF-MSU supercomputer deployed (#36 in June 08 Top 500)
5. May 2008 - “SKIF-Testbed” federation created
6. March 2009 - alliance agreement signed for SKIF series 4 development

Page 4: SHORT OVERVIEW OF CURRENT STATUS

PROJECT ORGANIZATION: 2007-2008

Project directions
1. Grid technology
2. Supercomputers
   • SW
   • HW
3. Security
4. Pilot projects – applications of HPC and grid technology

Page 5: SHORT OVERVIEW OF CURRENT STATUS


«SKIF MSU»

Page 6: SHORT OVERVIEW OF CURRENT STATUS

SKIF MSU

• Theoretical peak performance: 60 TFlops
• Linpack: 47 TFlops
• Advanced clustering solutions: diskless computational nodes
• Original blade design

Parameter          Value
CPU architecture   x86-64
CPU model          Intel Xeon E5472, 3.0 GHz (4 cores)
Nodes (dual CPU)   625
CPU cores total    5,000
Interconnect       InfiniBand DDR, Fat Tree
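
The 60 TFlops peak figure can be cross-checked against the node and core counts in the table. The following is a minimal sketch, assuming 4 double-precision floating-point operations per core per clock cycle; that per-cycle figure is not stated on the slide, while all other numbers are taken from it.

```python
# Figures from the slide.
NODES = 625
CPUS_PER_NODE = 2        # "dual CPU"
CORES_PER_CPU = 4
CLOCK_GHZ = 3.0
LINPACK_TFLOPS = 47.0

# Assumption (not on the slide): double-precision flops per core per cycle.
FLOPS_PER_CYCLE = 4


def peak_tflops(cores, clock_ghz, flops_per_cycle):
    """Theoretical peak in TFlops: cores x GHz x flops/cycle gives GFlops."""
    return cores * clock_ghz * flops_per_cycle / 1000.0


total_cores = NODES * CPUS_PER_NODE * CORES_PER_CPU
peak = peak_tflops(total_cores, CLOCK_GHZ, FLOPS_PER_CYCLE)
efficiency = LINPACK_TFLOPS / peak  # fraction of peak achieved on Linpack

print(f"cores: {total_cores}")
print(f"peak: {peak:.1f} TFlops")
print(f"Linpack efficiency: {efficiency:.1%}")
```

Under this assumption the computed peak matches the slide's 60 TFlops, and the 47 TFlops Linpack result corresponds to roughly 78% of peak.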

Page 7: SHORT OVERVIEW OF CURRENT STATUS

«SKIF-Testbed» a/k/a “SKIF-Polygon”

• Federation of HPC centers, ~100 TFlops
• 4 computers in the current Top 500:
  • MSU (#35 in Top500)
  • South Urals State University
  • Tomsk State University
  • Ufa State Technical University

Page 8: SHORT OVERVIEW OF CURRENT STATUS

Middleware platform – UNICORE 6.1

• X.509 for security
• Certificate Authority at Pereslavl-Zalessky (PyCA)
• Site platform:
  • UNICORE 6.1
  • Java 1.5
  • Linux
  • Torque
• Experimental sites: UNICORE is complemented with additional services/modules

Page 9: SHORT OVERVIEW OF CURRENT STATUS

Applications (2007-2008)

• HPC applications:
  • Drug design (MSU Belozersky Institute, SRCC, Chelyabinsk SU)
  • Inverse problems in soil remote sensing (SRCC)
  • Computational chemistry (MSU Chemistry department)
• Geophysical data services
• Mammography database prototype (N.N. Semenov Chemical Physics Institute, RAS)
• Text mining (PSI RAS)
• Engineering (South Ural University …)
• Space Research Institute...
• …

Page 10: SHORT OVERVIEW OF CURRENT STATUS


SKIF-Aurora

2009-2010: second phase of SKIF-GRID project

Page 11: SHORT OVERVIEW OF CURRENT STATUS

SKIF Series 4: original R&D goals

• Highest density of performance (largest possible number of CPUs per 1U)
  • Lower latency
  • Fewer cables and connectors, hence better reliability
  • Increased heat output per 1U: we need a new cooling technology… How to?
• Improved interconnect: we need better scalability, bandwidth and latency than the best available solutions provide (e.g. InfiniBand QDR)
• New approach to monitoring and management of the supercomputer
• Combining standard CPUs and accelerators in computational nodes of the supercomputer

Page 12: SHORT OVERVIEW OF CURRENT STATUS


Spring’2008: SKIF Series 4 — How To?

Page 13: SHORT OVERVIEW OF CURRENT STATUS


Summer’2008: SKIF Series 4 — Know How!

• Italian-Russian cooperation: «SKIF Series 4» == «SKIF-AURORA Project»
• Designed by an alliance of Eurotech, PSI RAS and RSC SKIF, with support from Intel
• To be presented at ISC 09

Program Systems Institute of RAS

Page 14: SHORT OVERVIEW OF CURRENT STATUS

SKIF-Aurora distinctive features

• No moving parts
• Liquid cooling – power efficiency
• x86-64 processors (Intel Nehalem)
• 3-D torus interconnect
• Redundant management/monitoring subsystem
• FPGA on board (optional)
• SSD disks (optional)
• QDR InfiniBand

Page 15: SHORT OVERVIEW OF CURRENT STATUS

SKIF-Aurora

• 32 nodes per chassis
• 64 CPUs in 6U
• Up to 8 chassis per rack
• Up to 512 CPUs per rack
• Up to 2048 cores
• To build 500 TFlops: 21 racks in 2009
• Scalable due to 3-D torus
• 10 kW per chassis
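
The density figures above are mutually consistent, which a short calculation can confirm. This is a minimal sketch using only the slide's numbers, except for `ASSUMED_FLOPS_PER_CYCLE`, which is an assumption introduced here to estimate the clock rate the 500 TFlops target would imply; the slide does not state a clock frequency.

```python
# Figures from the slide.
NODES_PER_CHASSIS = 32
CPUS_PER_CHASSIS = 64      # "64 CPUs in 6U"
CHASSIS_PER_RACK = 8
CORES_PER_RACK = 2048
KW_PER_CHASSIS = 10
RACKS = 21
TARGET_TFLOPS = 500

# Assumption (not on the slide): double-precision flops per core per cycle.
ASSUMED_FLOPS_PER_CYCLE = 4

# Derived quantities.
cpus_per_node = CPUS_PER_CHASSIS // NODES_PER_CHASSIS
cpus_per_rack = CPUS_PER_CHASSIS * CHASSIS_PER_RACK
cores_per_cpu = CORES_PER_RACK // cpus_per_rack
kw_per_rack = KW_PER_CHASSIS * CHASSIS_PER_RACK
total_cores = CORES_PER_RACK * RACKS

# Per-core performance needed to reach the target with 21 full racks.
gflops_per_core = TARGET_TFLOPS * 1000 / total_cores
implied_clock_ghz = gflops_per_core / ASSUMED_FLOPS_PER_CYCLE

print(f"CPUs per node: {cpus_per_node}, CPUs per rack: {cpus_per_rack}")
print(f"cores per CPU: {cores_per_cpu}, total cores: {total_cores}")
print(f"power per full rack: {kw_per_rack} kW")
print(f"required: {gflops_per_core:.2f} GFlops/core "
      f"(about {implied_clock_ghz:.2f} GHz under the assumption)")
```

The derived 512 CPUs per rack agrees with the slide, and a full rack of 8 chassis would draw up to 80 kW.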

Page 16: SHORT OVERVIEW OF CURRENT STATUS

SKIF-AURORA: Designed by the alliance of Eurotech, PSI RAS and RSC SKIF

• PCBs, mechanics, power supply, cooling, levels 1 and 2 of the management system
• Level 3 of the management system, interconnect (3D-torus: firmware, routing, drivers, MPI-2…), FPGA as accelerator

Page 17: SHORT OVERVIEW OF CURRENT STATUS


SKIF-AURORA Management Subsystem

Page 18: SHORT OVERVIEW OF CURRENT STATUS

3-D torus interconnect implementation

[Diagram labels: System Interconnect, 3D-torus; Subsidiary Interconnect, InfiniBand; FPGA FPGA FPGA FPGA ...; CPU CPU CPU CPU; standard part; non-standard part]

• Only the QCD-specific part is implemented by the Italian team
• Russian teams to upgrade the network to a general-purpose interconnect (MPI 2.0), due to appear in fall 2009
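
To illustrate why a 3-D torus scales, the sketch below computes a node's six neighbours and the minimal hop count between two nodes with wraparound links. It is a generic model of torus topology, not the SKIF-Aurora firmware or routing; the function names and the 8 x 8 x 8 dimensions are hypothetical, since the slides do not give the torus size.

```python
def torus_neighbors(coord, dims):
    """Return the neighbours of `coord` along each axis, with wraparound."""
    neighbors = []
    for axis, size in enumerate(dims):
        for step in (-1, 1):
            n = list(coord)
            n[axis] = (n[axis] + step) % size  # wrap around the ring
            neighbors.append(tuple(n))
    return neighbors


def torus_hops(a, b, dims):
    """Minimal hop count between a and b: shortest way round each ring."""
    return sum(
        min(abs(x - y), size - abs(x - y))
        for x, y, size in zip(a, b, dims)
    )


def torus_diameter(dims):
    """Largest minimal hop count between any two nodes."""
    return sum(size // 2 for size in dims)


dims = (8, 8, 8)  # hypothetical torus size, chosen only for illustration
origin_neighbors = torus_neighbors((0, 0, 0), dims)
hops = torus_hops((0, 0, 0), (7, 4, 1), dims)
diameter = torus_diameter(dims)

print(origin_neighbors)
print(f"hops from (0,0,0) to (7,4,1): {hops}")
print(f"diameter of {dims} torus: {diameter}")
```

Each node keeps a fixed number of links regardless of system size, and the worst-case hop count grows only with half the length of each dimension, which is the usual argument for torus scalability.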

Page 19: SHORT OVERVIEW OF CURRENT STATUS

R&D Directions Using FPGA

• Collective MPI operations using FPGA
• FPGA to facilitate support of PGAS languages (UPC, Titanium, etc.)
• FPGA+CPU hybrid computing

Page 20: SHORT OVERVIEW OF CURRENT STATUS

Conclusions

• Is based on collaboration between international teams
• Harnesses shared expertise and results
• Aims to develop a family of petascale-level supercomputers with innovative techniques:
  • Higher density of CPUs (flops per volume)
  • Efficient water cooling system
  • Scalable, powerful 3D-torus interconnect
  • Etc.

Page 21: SHORT OVERVIEW OF CURRENT STATUS


Datacenter visualization

Page 22: SHORT OVERVIEW OF CURRENT STATUS


Datacenter visualization

Page 23: SHORT OVERVIEW OF CURRENT STATUS

THANKS

SKIF-GRID web site: http://skif-grid.botik.ru
