
HPSS at Los Alamos - THICMicrosoft PowerPoint - LANL.glee.980421.ppt Created Date 4/24/1998 7:58:02...

Page 1: HPSS at Los Alamos - THICMicrosoft PowerPoint - LANL.glee.980421.ppt Created Date 4/24/1998 7:58:02 PM ...

HPSS at Los Alamos

Gary Lee
Mail Stop B269
Data Storage Systems (CIC-11) Group
Los Alamos National Laboratory

Phone: +1-505-667-2828; FAX: +1-505-667-0168

email: [email protected]
URL: http://storage.lanl.gov/cic11/hpss.html

Presented at the THIC Meeting in Albuquerque, NM
April 21, 1998

Page 2: Outline

• Overview of the High Performance Storage System (HPSS)
• Current Status at Los Alamos
• Accelerated Strategic Computing Initiative (ASCI) Requirements
• Challenges
• Vision

Page 3: Overview of HPSS

• Scalable, parallel, high-performance software system
• A collaborative effort
• Vendor supported - IBM
• A major ASCI project
• Winner of a 1997 R&D 100 Award

Page 4: HPSS Collaborators

Page 5:

Page 6: Overview of HPSS

• High Capacity: store petabytes of data and billions of files
• High Performance: data transfer rates in the GB/sec range
• Parallel data transfers across disks, tapes, and networks

Page 7:

Page 8: Current Status at LANL

• Storage for Science-Based Stockpile Stewardship (SBSS) program, ASCI, grand challenge problems
• In production in both open and secure networks
• Currently accessible from Crays and ASCI Blue systems, with deployment to LANs in progress

Page 9: Current Status at LANL

[Timeline figure spanning 1996 to 1999, with labels: Development, Deployment, Production; versions V3, V4, V5.]

Page 10: Current Status at LANL

• Version 3.2 deployed, v4.1 in Q4 1998
• Locally-written Parallel Storage Interface (PSI) is the user interface
• Metrics

Users     200
Files     350K
Storage   20 TB
Growth    33 GB/day

Storage growth is ~ 66 GB/day

Availability is ~ 95% since 1/1/98

Availability problems are primarily due to causes other than HPSS: HIPPI network, DCE server
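As a rough cross-check of the metrics above, the mean file size follows from dividing total storage by file count. The sketch below is illustrative arithmetic only; decimal units (1 TB = 10^12 bytes) are an assumption, since the slide does not say which convention it uses.

```python
# Illustrative arithmetic on the slide's metrics (20 TB, 350K files).
# Decimal units are an assumption; the slide does not state the convention.
TB = 10**12
MB = 10**6


def average_file_size_mb(total_bytes: int, file_count: int) -> float:
    """Mean file size in MB for total_bytes spread over file_count files."""
    return total_bytes / file_count / MB


avg = average_file_size_mb(20 * TB, 350_000)
print(f"average file size: {avg:.1f} MB")
```

The result, roughly 57 MB, falls between the 67 MB (open) and 38 MB (secure) averages reported on the File Size Issue slide.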

Page 11: Current Status at LANL

• Data Transfer Performance
– 500 KB/s for reads and writes of small files to disk
– 10 MB/s for reads and writes of large files to disk
– 10 MB/s for reads and writes of large files to one-way tape
– 20 MB/s for reads and writes of large files to two-way tape
• Networks: FDDI for control, HIPPI-800 for data transfer

Page 12: Current Status at LANL

• Performance Issues
– Small file performance: file and metadata creation - currently 1+ sec
– Disk performance: problems with SSA adapter
– System configuration: limited equipment funds
– HIPPI device driver problems
• File Size Issue - much smaller than expected
– open: 67 MB, secure: 38 MB
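One way to see why small files hurt is a back-of-envelope model in which every file pays a fixed creation cost before data streams at the large-file rate. The sketch below assumes that simple model, with the 1 second creation cost and the 10 MB/s disk rate taken from the slides; the function name and the model itself are illustrative assumptions, not measurements from the LANL system.

```python
def effective_rate_mb_s(file_size_mb: float,
                        overhead_s: float = 1.0,
                        stream_rate_mb_s: float = 10.0) -> float:
    """Effective throughput when each file costs overhead_s seconds of
    setup and then transfers at stream_rate_mb_s (assumed model)."""
    total_time_s = overhead_s + file_size_mb / stream_rate_mb_s
    return file_size_mb / total_time_s


# File sizes in MB; 67 and 38 are the averages quoted on the slide.
for size_mb in (0.5, 38.0, 67.0, 1000.0):
    rate = effective_rate_mb_s(size_mb)
    print(f"{size_mb:7.1f} MB file -> {rate:5.2f} MB/s effective")
```

Under these assumptions the fixed cost dominates until files reach hundreds of megabytes, which suggests why an average file size well below expectations is a performance concern.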

Page 13: ASCI Requirements

• Accelerated Strategic Computing Initiative (ASCI)
– Purpose - to accelerate computing technology
– Funded by DOE
– Replace nuclear testing with modeling and simulation

Page 14: ASCI Requirements - A Time of Change

System       Capacity     Transfer rate
Photostore   <1 TB        0.5 MB/sec
CFS          100 TB       5 MB/sec
HPSS         petabytes    1 GB/sec

Timeline markers: 1979, 1998

Page 15: ASCI Requirements

• Two Driving Assumptions
– Capacity: growth rate of 750 memories/year
– Bandwidth: 1/2 of memory in < 20 minutes
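The two assumptions turn into concrete numbers once a memory size is chosen. The sketch below reads "memory" as the total main memory of the ASCI computing system, which is an interpretation rather than something the slide spells out, and uses 50 TB as a purely illustrative input (it matches the 50,000 GB top of the axis on the growth chart). Decimal units are also an assumption.

```python
# Decimal units are an assumption; the slides do not state the convention.
GB = 10**9
TB = 10**12
PB = 10**15

MEMORIES_PER_YEAR = 750    # capacity assumption from the slide
MEMORY_FRACTION = 0.5      # bandwidth assumption: half of memory ...
WINDOW_SECONDS = 20 * 60   # ... moved in under 20 minutes


def yearly_storage_bytes(memory_bytes: int) -> int:
    """Storage written per year under the 750-memories/year assumption."""
    return MEMORIES_PER_YEAR * memory_bytes


def min_bandwidth_bytes_per_s(memory_bytes: int) -> float:
    """Lowest sustained rate that moves half of memory within 20 minutes."""
    return MEMORY_FRACTION * memory_bytes / WINDOW_SECONDS


memory = 50 * TB  # hypothetical system memory, for illustration only
print(f"storage per year:  {yearly_storage_bytes(memory) / PB:.1f} PB")
print(f"minimum bandwidth: {min_bandwidth_bytes_per_s(memory) / GB:.1f} GB/s")
```

For a 50 TB memory this gives 37.5 PB per year, which agrees with the largest data label on the growth chart, and a sustained rate of about 20.8 GB/s.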

Page 16: ASCI Requirements

[Chart: The ASCI Data Storage Challenge (ASCI System Memory & Storage Growth). Horizontal axis: 1997 to 2004. Vertical axis: GB, 0 to 50,000. Data labels: 1.1 PB, 3.4 PB, 6.7 PB, 12 PB, 21 PB, 37.5 PB.]

Page 17: Challenges

• Funding for data storage to meet ASCI needs
• Accelerating data storage technology
– network-attached storage devices (NASD)
– striped tape systems (RAIT, RATS)
– bandwidth aggregation devices (GNATS)
– innovative caching, pre-fetch, data reduction techniques
– practical, scalable, parallel I/O
– practical, scalable, storage management
• New data storage paradigm

Page 18: Challenges - HPSS Tape Striping

[Figure: a user file striped across four tapes at 10 MB/sec each.]

Aggregate throughput: 40 MB/sec
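The striping idea can be sketched in a few lines: blocks of the user file are dealt round-robin to the tapes, and the ideal aggregate rate is the per-tape rate times the stripe width. This is a toy illustration, not HPSS code; the function names and the block size are hypothetical, and the ideal rate assumes all drives stream in parallel with no other bottleneck.

```python
def stripe(data: bytes, width: int, block_size: int) -> list[bytes]:
    """Deal data round-robin, block by block, across `width` stripes."""
    stripes = [bytearray() for _ in range(width)]
    for offset in range(0, len(data), block_size):
        target = (offset // block_size) % width
        stripes[target] += data[offset:offset + block_size]
    return [bytes(s) for s in stripes]


def aggregate_rate_mb_s(width: int, per_tape_mb_s: float) -> float:
    """Ideal throughput when every tape in the stripe streams in parallel."""
    return width * per_tape_mb_s


parts = stripe(b"ABCDEFGHIJKLMNOP", width=4, block_size=2)
print(parts)                         # [b'ABIJ', b'CDKL', b'EFMN', b'GHOP']
print(aggregate_rate_mb_s(4, 10.0))  # 40.0
```

With four tapes at 10 MB/sec each, the ideal aggregate matches the 40 MB/sec shown on the slide.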

Page 19: Challenges - HPSS Multi-level Tape Striping with Parity

[Figure: a user file striped across two groups of tapes; each group has four data tapes at 10 MB/sec each plus one parity tape.]
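The slide does not say how the parity tape is computed. A common choice in RAID-style designs is byte-wise XOR across the data blocks of a stripe, which lets any single lost block be rebuilt from the survivors. The sketch below assumes that scheme; the block contents are made up for illustration.

```python
from functools import reduce


def xor_blocks(blocks: list[bytes]) -> bytes:
    """Byte-wise XOR of equal-length blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# Four data blocks, one per data tape in a stripe group (made-up contents).
data_blocks = [b"tape", b"disk", b"hpss", b"lanl"]
parity = xor_blocks(data_blocks)

# Simulate losing the third tape and rebuild its block from the rest.
survivors = data_blocks[:2] + data_blocks[3:]
rebuilt = xor_blocks(survivors + [parity])
print(rebuilt == data_blocks[2])
```

The cost of this protection is one extra tape per group: with four data tapes and one parity tape, user data still moves at the four-tape rate while five drives are busy.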

Page 20: Vision

• Common Data Storage Infrastructure
– Improved connectivity
– Enhanced performance
– Peripheral sharing
– Central administration
– Higher device utilization
– Increased availability

Page 21: Vision

Evolution of Data Storage Infrastructures

[Timeline figure, Past - Today - Tomorrow - Future, with stages:]
– Mainframe-based
– Server-based
– Commodity desktop - protocol engine data movers
– Higher BW PE-2, storage area net
– SAN, RAN, PE-3, lightweight protocols
– No PE, standards-based, ST, memory interconnect

Page 22: Past Mainframe-Based Movement

[Diagram labels: Device; Controller; ESCON; 370 channel; Mainframe; GW; Client; general purpose network (HIPPI-800), not reliable, heavy protocol (TCP/IP)]

Data moves from client through general purpose net to mainframe, then through the 370/ESCON storage network to the peripheral. The storage network is just a set of channels from the mainframe intermediary to peripherals; ESCON allows for switching and sharing between mainframes.

Page 23: Recent Past, Workstation/Server Class Data Movers

[Diagram labels: Device; Controller; SSA; workstation class mover; general purpose network (HIPPI-800), not reliable, heavy protocols; Client]

Data flows from client to mover via general purpose network, then SCSI/SSA to devices, with or without sharing (SCSI - channel; SSA - storage network).

Page 24: Current - Commodity Desktop Protocol Engine Version 1 Data Mover

[Diagram labels: Device; Controller; SCSI; SSA; protocol engine 1 commodity desktop data mover; general purpose network (HIPPI-800), not reliable, heavy protocols; Client]

Data flows from client to mover via general purpose network, then through SCSI/SSA to devices, with or without sharing (SCSI - channel; SSA - storage network).

Page 25: Next: Higher bandwidth mover to device with standards-based sharing, begin direct client attach experiments

[Diagram labels: Device; Controller; FC-AL or SSA-2 (SAN, not general purpose net, lesser protocol); SSA; direct attach test; general purpose network (HIPPI-800), not reliable, heavy protocols; Client]

Data flows through general purpose network to protocol engine 2 commodity desktop data mover, then into FC-AL or FC-EL or SSA-2 storage network to devices (full storage network sharing).

Page 26: Beyond: SAN for mover device connectivity, RAN for client to mover, light weight protocol in both nets, much thinner mover

[Diagram labels: Device; Controller; SAN-attached RAIT controller with parity; FC-AL or SSA-2 (SAN, not general purpose net, lesser protocol); SSA; SAN direct attach; protocol engine 3 commodity desktop data mover; HIPPI-6400 reliable (RAN), light-weight protocol, not general purpose net; Client]

Data flows from client through RAN (light protocol) to thin mover, then to device over SAN (light protocol). Alternate path: client direct to device via SAN.

Page 27: Way beyond: no protocol engine, standards-based controllers, ST with third party for device to device, memory interconnect instead of IOS interconnect, mover gone

[Diagram labels: Device; Controller; SAN; SAN direct attach; Clients (five nodes, each with cpus, memory, ios); HIPPI-6400 reliable (RAN), light-weight protocol, not general purpose net; HIPPI-6400 memory attach (ST with 3rd party controllers); HIPPI-6400 memory-attached ST with 3rd party RAIT controller with parity]

Data moves from client memory to controller memory via RAN light protocol, ST/NUMA-like (sea of memory including storage system), or data could move from device to device via the same net (third party).

Page 28: For more information on HPSS

http://storage.lanl.gov/cic11/hpss.html

email: [email protected]

