Columbia Supercomputer and the NASA Research & Education Network

Page 1

Columbia Supercomputer and the NASA Research & Education Network

WGISS 19: CONAE

March, 2005

David Hartzell

NASA Ames / CSC

[email protected]

Page 2

Agenda

• Columbia

• NREN-NG

• Applications

Page 3

NASA’s Columbia System

• NASA Ames has embarked on a joint SGI and Intel Linux supercomputer project.
– Initially twenty 512-processor Intel IA-64 SGI Altix nodes
– NREN-NG: an optical wide-area support network

• NLR will be the optical transport for this network, delivering high bandwidth to other NASA centers.

• Achieved 51.9 Teraflops with all 20 nodes in November 2004

• Currently 2nd on the Top500 list
– Other systems have since come online that are faster.
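
As a rough scale check on the figures above, twenty 512-processor nodes give 10,240 processors, so the 51.9 Teraflops figure works out to roughly 5 GFlops sustained per processor. A minimal arithmetic sketch:

```python
# Rough scale check on the Columbia figures quoted above
# (simple arithmetic only, not an official benchmark breakdown).
nodes = 20
procs_per_node = 512
sustained_tflops = 51.9           # figure quoted on the slide

total_procs = nodes * procs_per_node
per_proc_gflops = sustained_tflops * 1000 / total_procs

print(f"{total_procs} processors, ~{per_proc_gflops:.1f} GFlops sustained each")
# -> 10240 processors, ~5.1 GFlops sustained each
```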

Page 4

Columbia

Page 5

Preliminary Columbia Uses

Space Weather Modeling Framework (SWMF)

SWMF has been developed at the University of Michigan under the NASA Earth Science Technology Office (ESTO) Computational Technologies (CT) Project to provide “plug and play” Sun-to-Earth simulation capabilities to the space physics modeling community.

Estimating the Circulation and Climate of the Ocean (ECCO)
Continued success in ocean modeling has improved the model, and the work continued even during Columbia's very busy Return to Flight period.

Finite-Volume General Circulation Model (fvGCM)
Very promising results from 1/4° fvGCM runs encouraged its use for real-time weather prediction during hurricane seasons; one goal is to predict hurricanes accurately in advance.

Return to Flight (RTF)
Simulations of tumbling debris from foam and other sources are being used to assess the threat that shedding such debris poses to various elements of the Space Shuttle Launch Vehicle.

Page 6

20 Nodes in Place

• Kalpana was on site at the beginning of the project

• The first two new systems were received on 28 June and placed into service that week.

• As of late October 2004, all systems were in place.

Page 7

Power

• Ordered and received twenty 125 kW PDUs

• Upgrade / installation of power distribution panels

Page 8

Cooling

• New floor tiles
• Site visits conducted
• Plumbing in HSPA and HSPB complete
• Heating problem contingency plans developed

Page 9

Networking

• Each Columbia node has four 1 GigE interfaces
• And one 10 GigE
• Plus Fibre Channel and InfiniBand
• Required all new fiber and copper infrastructure, plus switches

Page 10

Components

Front End
- 128p Altix 3700 (RTF)

Networking
- 10GigE Switch (32-port)
- 10GigE Cards (1 per 512p)
- InfiniBand Switch (288-port)
- InfiniBand Cards (6 per 512p)
- Altix 3900 2048 NUMAlink Kits

Compute Nodes
- Altix 3700: 12 x 512p
- "Altix 3900": 8 x 512p

Storage Area Network
- Brocade Switch: 2 x 128-port

Storage (440 TB)
- FC RAID: 8 x 20 TB (8 racks)
- SATA RAID: 8 x 35 TB (8 racks)

[System diagram: twelve Altix 3700 512p nodes (A512p) and eight "Altix 3900" 512p nodes (T512p), plus the 128p RTF front end, interconnected via InfiniBand and 10GigE and attached through two 128-port FC switches to eight 20 TB Fibre Channel racks and eight 35 TB SATA racks.]
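
As a quick consistency check, the two storage tiers listed above do add up to the quoted 440 TB total:

```python
# Consistency check on the 440 TB storage total listed above.
fc_raid_tb = 8 * 20       # eight Fibre Channel RAID racks of 20 TB each
sata_raid_tb = 8 * 35     # eight SATA RAID racks of 35 TB each

print(f"FC: {fc_raid_tb} TB + SATA: {sata_raid_tb} TB = {fc_raid_tb + sata_raid_tb} TB")
# -> FC: 160 TB + SATA: 280 TB = 440 TB
```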

Page 11

NREN Goals

• Provide a wide-area, high-speed network for large data distribution and real-time interactive applications

• Provide access to NASA research and engineering communities - primary focus: supporting distributed data access to/from Columbia

• Provide access to federal and academic entities via peering with High Performance Research and Engineering Networks (HPRENs)

• Perform early prototyping and proofs-of-concept of new technologies that are not ready for the production network (NASA Integrated Services Network, NISN)

Page 12

NREN-NG

• The NREN Next Generation (NG) wide-area network will be expanded from OC-12 to 10 GigE within the next 3-4 months to support Columbia applications.

• NREN will “ride” the National Lambda Rail (NLR) to reach the NASA research centers and major exchange locations.

Page 13

NREN-NG Target Approach: Implementation Plan, Phase 1

[Network map: NREN sites (ARC/NGIX-West, JPL, GSFC, NGIX-East, MATP, JSC, GRC, MSFC, LRC) and peering points, linked at 10 GigE over NLR nodes at Sunnyvale, Los Angeles, Houston, Chicago/StarLight, Cleveland, and MSFC.]

Page 14

NREN-NG Progress

• Equipment order has been finalized.
• Construction of the network will proceed from west to east.
• Temporary 1 GigE connection to JPL is in place, moving to 10 GigE by end of summer.
• Current NREN paths to/from Columbia are seeing gigabit/s transfers.
• NREN-NG will ride the National Lambda Rail network in the US.

Page 15

The NLR

• National Lambda Rail (NLR)

• NLR is a U.S. consortium of educational institutions and research entities that partnered to build a nationwide fiber network for research activities.

– NLR offers wavelengths and/or Ethernet transport services to members.

– NLR is buying a 20-year right-to-use of the fiber.

Page 16

NLR – Optical Infrastructure, Phase 1

[Map of the NLR Layer 1 route: Seattle, Portland, Boise, Ogden/Salt Lake, Denver, Kansas City, Chicago, Cleveland, Pittsburgh, Washington DC, Raleigh, Atlanta, Jacksonville, San Diego, and Los Angeles.]

Page 17

Page 18

Some Current NLR Members

• CENIC

• Pacific Northwest GigaPOP

• Pittsburgh Supercomputing Center

• Duke (coalition of NC universities)

• Mid-Atlantic Terascale Partnership

• Cisco Systems

• Internet2

• Florida LambdaRail

• Georgia Institute of Technology

• Committee on Institutional Cooperation (CIC)

• Texas / LEARN

• Cornell

• Louisiana Board of Regents

• University of New Mexico

• Oklahoma State Regents

• UCAR/FRGP

Plus Agreements with:

• SURA (AT&T fiber donation)

• Oak Ridge National Lab (ORNL)

Page 19

NLR Applications

• Pure optical wavelength research

• Transport of Research and Education Traffic (like Internet2/Abilene today)

• Private Transport of member traffic

• Experience operating and managing an optical network
– Development of new technologies to integrate optical networks into existing legacy networks

Page 20

Columbia Applications: Distribution of Large Data Sets

– Finite Volume General Circulation Model (fvGCM): global atmospheric model

– Requirements (Goddard – Ames):
• ~23 million points
• 0.25 degree global grid
• 1 Terabyte data set for a 5-day forecast

– No data compression required prior to data transfer

– Assumes BBFTP for file transfers, instead of FTP or SCP

GSFC – Ames performance for transferring the 1 TB data set:

Scenario                                Bandwidth [Gigabits/sec] (LAN/WAN)   Data Transfer Time (hours)
Current GSFC – Ames performance         1.00 / 0.155                         17 - 22
GSFC – Ames performance (1/10 Gig)      1.00 / 10.00                         3 - 5
GSFC – Ames performance (Full 10 Gig)   10.00 / 10.00                        0.4 - 1.1
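
To make the table concrete, here is a minimal sketch of the ideal transfer time for the 1 Terabyte forecast set, assuming the slower of the LAN/WAN legs is the bottleneck and ignoring protocol and end-host overhead (which is why the measured ranges in the table are higher):

```python
# Ideal (overhead-free) transfer time for the 1 TB fvGCM forecast set,
# taking the slower of the LAN and WAN legs as the bottleneck.
DATASET_BITS = 1e12 * 8    # 1 Terabyte expressed in bits

scenarios = {
    "Current (1 Gbit/s LAN / 0.155 Gbit/s WAN)": (1.00e9, 0.155e9),
    "1 Gbit/s LAN / 10 Gbit/s WAN":              (1.00e9, 10.00e9),
    "Full 10 Gbit/s LAN and WAN":                (10.00e9, 10.00e9),
}

for name, (lan_bps, wan_bps) in scenarios.items():
    hours = DATASET_BITS / min(lan_bps, wan_bps) / 3600
    print(f"{name}: ~{hours:.1f} h minimum")
# -> ~14.3 h, ~2.2 h, ~0.2 h; the 17-22, 3-5, and 0.4-1.1 hour ranges in
#    the table include protocol and end-host overhead on top of this.
```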

Page 21

Columbia Applications: Distribution of Large Data Sets

• ECCO: Estimating the Circulation and Climate of the Ocean. Joint activity among Scripps, JPL, MIT & others.

• Run requirements are increasing as model scope and resolution are expanded:
– November '03 = 340 GBytes / day
– February '04 = 2000 GBytes / day
– February '05 = 4000 GBytes / day (est.)

– Bandwidth for distributed data-intensive applications can be the limiter
– Need high-bandwidth alternatives and better file transfer options

NREN transfer performance:

Scenario                      Bandwidth [Gigabits/sec] (LAN/WAN)   Data Transfer Time (hours)
Previous NREN performance     1.0 / 0.155                          6 - 12
NREN Feb 2005 (CENIC 1G)      1.0 / 1.0                            0.6 - 0.9
Projected NREN (CENIC 10G)    10.0 / 10.0                          0.2 - 0.4
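
Another way to read the growth figures above is the sustained rate needed just to move each day's ECCO output within 24 hours; a rough sketch, ignoring overhead:

```python
# Sustained rate needed to move each day's ECCO output in 24 hours
# (rough sketch; ignores protocol overhead and assumes an even duty cycle).
daily_output_gbytes = {"Nov '03": 340, "Feb '04": 2000, "Feb '05 (est.)": 4000}

for label, gbytes in daily_output_gbytes.items():
    rate_mbps = gbytes * 8 * 1000 / (24 * 3600)   # GBytes/day -> Mbit/s
    print(f"{label}: ~{rate_mbps:.0f} Mbit/s sustained")
# -> ~31, ~185, ~370 Mbit/s; the 4 TB/day estimate already exceeds the old
#    155 Mbit/s WAN path even with a perfectly even 24-hour transfer.
```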

Page 22

Page 23

Page 24

hyperwall-1: large images

Page 25

Columbia Applications: Disaster Recovery/Backup

– Transfer up to seven 200-gigabyte files per day between Ames and JPL
– Limiting factors:
• Bandwidth: recent upgrade from OC-3 POS to 1 Gigabit Ethernet
• Compression: 4:1 compression utilized for WAN transfers at lower bandwidths; compression limited throughput to 29 Mbps (end-host constraint)

Projected transfer improvement (ARC – JPL):

Scenario                     Bandwidth [Gigabits/sec] (LAN/WAN)   Data Compression Required   Data Transfer Time (hours)
JPL – Ames (OC-3 POS)        1.00 / 0.155                         Yes (4:1)                   27 - 31
JPL – Ames (CENIC 1 GigE)    1.00 / 1.00                          No                          4.4 - 6.2
JPL – Ames (CENIC 10 GigE)   10.00 / 10.00                        No                          0.6 - 1.5
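
For scale, a sketch of the ideal WAN-limited time for the full daily backup load of seven 200 GB files, ignoring protocol overhead (the measured ranges in the table sit above these figures):

```python
# Ideal WAN-limited time for the daily backup load (seven 200 GB files),
# ignoring protocol and end-host overhead.
files_per_day, file_gb = 7, 200
total_bits = files_per_day * file_gb * 8e9        # ~1.4 TB/day in bits

for label, wan_bps in [("OC-3 POS (155 Mbit/s)", 0.155e9),
                       ("CENIC 1 GigE", 1.0e9),
                       ("CENIC 10 GigE", 10.0e9)]:
    print(f"{label}: ~{total_bits / wan_bps / 3600:.1f} h uncompressed")
# -> ~20.1 h, ~3.1 h, ~0.3 h; on the OC-3 path 4:1 compression was used,
#    but the compressing end host capped the effective rate at 29 Mbps,
#    which is consistent with the 27-31 hour figure in the table.
```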

Page 26

Thanks.

David Hartzell

[email protected]

