
High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

Page 1: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

High Performance Storage System (HPSS)

Jason Hick
Mass Storage Group

[email protected]

HEPiX
October 26-30, 2009

Page 2: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX


Agenda

• How HPSS Works

• Current Features

• Future Directions (to Extreme Scale)

Page 3: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX


HPSS as a Hierarchical Storage Management (HSM) System

• The top of the pyramid is the Class of Service (COS)

• The pyramid is a single hierarchy; we have many of these

• Each level is a storage class; each storage class can be striped (disk and tape) and can produce multiple copies (tape only)

• Migration copies files to lower levels

• Files can exist at all levels within a hierarchy

• All hardware within a level is continually replaced for technology refresh

[Figure: storage pyramid. Level labels: Fast Disk, High Capacity Disk, Local Disk or Tape, Remote Disk or Tape. Axis labels: Capacity, Latency.]
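The hierarchy described on this slide can be sketched as a small data model. This is only an illustration under stated assumptions: the class names, the `migrate` method, and the three level names are hypothetical and are not part of any HPSS API. It shows the two behaviours the bullets name: migration copies a file down one level, and a file can then exist at several levels at once.

```python
from dataclasses import dataclass, field


@dataclass
class StorageClass:
    """One level of the pyramid (hypothetical model, not an HPSS API)."""
    name: str
    files: set = field(default_factory=set)


@dataclass
class Hierarchy:
    """An ordered list of storage classes, top level first."""
    levels: list

    def write(self, filename: str) -> None:
        # In this sketch, new data always lands at the top level.
        self.levels[0].files.add(filename)

    def migrate(self, filename: str, from_level: int) -> None:
        # Migration copies (does not move) the file one level down.
        if filename not in self.levels[from_level].files:
            raise KeyError(f"{filename} is not at level {from_level}")
        if from_level + 1 >= len(self.levels):
            raise ValueError("already at the lowest level")
        self.levels[from_level + 1].files.add(filename)

    def locations(self, filename: str) -> list:
        # A file may be present at several levels simultaneously.
        return [lvl.name for lvl in self.levels if filename in lvl.files]


# Illustrative level names only.
hierarchy = Hierarchy([StorageClass("Fast Disk"),
                       StorageClass("High Capacity Disk"),
                       StorageClass("Local Tape")])
hierarchy.write("run42.dat")
hierarchy.migrate("run42.dat", from_level=0)
hierarchy.migrate("run42.dat", from_level=1)
print(hierarchy.locations("run42.dat"))
```

After both migrations the file is reported at all three levels, because nothing in this sketch purges the upper copies.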


Page 4: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX


An HPSS Transfer

1. Client issues READ to Core Server

2. Core Server accesses metadata on disk

3. Core Server commands Mover to stage file from tape to disk

4. Mover stages file from tape to disk

5. Core Server sends lock and ticket back to client

6. Mover reads data and sends to client over LAN

[Diagram labels: Client Clusters, LAN Switch, HPSS Core Server, Metadata, HPSS Movers, Data Disks, Tape]
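The six numbered steps of the read can be traced with a toy simulation. This is a minimal sketch, not HPSS code: the `CoreServer` and `Mover` classes, the path, the file contents, and the ticket format are all hypothetical, chosen only to show that control traffic goes through the Core Server while data flows from the Mover to the client.

```python
class Mover:
    """Moves data between tape, disk, and the client (hypothetical model)."""

    def __init__(self):
        self.tape = {"/archive/run42.dat": b"detector data"}  # made-up content
        self.disk = {}

    def stage(self, path):
        # Step 4: copy the file from tape to disk.
        self.disk[path] = self.tape[path]

    def send(self, path):
        # Step 6: read from disk and send the data to the client.
        return self.disk[path]


class CoreServer:
    """Handles metadata and control only; it never touches file data."""

    def __init__(self, mover):
        self.mover = mover
        self.metadata = {"/archive/run42.dat": "tape"}  # path -> location

    def read(self, path):
        location = self.metadata[path]       # Step 2: access metadata
        if location == "tape":
            self.mover.stage(path)           # Steps 3-4: command the stage
            self.metadata[path] = "disk"
        # Step 5: return a lock and ticket to the client.
        return {"path": path, "lock": True, "ticket": 1}


mover = Mover()
core = CoreServer(mover)
ticket = core.read("/archive/run42.dat")     # Step 1: client issues READ
data = mover.send(ticket["path"])            # Step 6: mover -> client
print(data)
```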


Page 5: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

HPSS Current Features (v7)

• Single client transfer optimizations
  – Globus gridFTP service
  – Striping a single file across Disk or Tape drives
  – Aggregation-capable clients (HTAR, PSI)

• Manage 10s of PBs effectively
  – Dual copy on tape, delayed or real-time
  – Technology insertion
  – Recover data from another copy
  – Aggregation on migration to tape

• Data Management Possibilities
  – User-defined attributes on files

• File System Interfaces
  – GPFS/HPSS Integration – IBM
  – Lustre/HPSS Integration – CEA/Sun-CFS
  – Virtual File System interface


Page 6: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

HPSS Feature – gridFTP Transfers

• Data Transfer Working Group
  – Data transfer nodes at ORNL-LCF, ANL-LCF, and LBNL-NERSC with ESNet
  – Optimize WAN transfers between global file systems and archives at the sites

• Dedicated WAN nodes are helping users
  – Several 20TB days between HPSS and DTN global file system
  – Several large data set/project movements between sites

• Have plans for
  – SRM: BeStMan to aid in scheduling and persistent transfers between sites
  – Increasing network (ESNet) and transfer nodes as usage increases
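To put a "20TB day" in perspective, the figure can be converted to an average sustained rate. The calculation below is a sketch that assumes decimal units (1 TB = 10^12 bytes), which the slide does not specify, and assumes the transfer is spread evenly over 24 hours.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_mb_per_s(terabytes_per_day: float) -> float:
    """Average rate in MB/s, assuming 1 TB = 10**12 bytes and 1 MB = 10**6 bytes."""
    return terabytes_per_day * 1e12 / SECONDS_PER_DAY / 1e6


rate = sustained_rate_mb_per_s(20)
print(f"{rate:.0f} MB/s, {rate * 8 / 1000:.2f} Gbit/s")
# prints "231 MB/s, 1.85 Gbit/s"
```

Under those assumptions a 20 TB day corresponds to roughly 231 MB/s held for the whole day.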


Page 7: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

HPSS Feature – Striping transfers across disk/tape

[Diagram labels: Client Cluster, Client Cluster I/O Node, LAN Switch, HPSS Core Server, Metadata, HPSS Movers, Data Disks, Tape]

Client network BW is the bottleneck
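Striping itself can be illustrated with a generic round-robin layout. This is a sketch of the general technique only: the `stripe` and `unstripe` helpers, the block size, and the stripe width are assumptions made for illustration and do not describe how HPSS actually lays data out on disk or tape.

```python
def stripe(data: bytes, width: int, block_size: int) -> list:
    """Distribute fixed-size blocks of data round-robin over `width` devices."""
    stripes = [bytearray() for _ in range(width)]
    for i in range(0, len(data), block_size):
        stripes[(i // block_size) % width] += data[i:i + block_size]
    return [bytes(s) for s in stripes]


def unstripe(stripes: list, block_size: int) -> bytes:
    """Reassemble the original byte stream by reading the devices in turn."""
    out = bytearray()
    offsets = [0] * len(stripes)
    device = 0
    while offsets[device] < len(stripes[device]):
        start = offsets[device]
        out += stripes[device][start:start + block_size]
        offsets[device] += block_size
        device = (device + 1) % len(stripes)
    return bytes(out)


parts = stripe(b"abcdefghij", width=3, block_size=2)
print(parts)                     # [b'abgh', b'cdij', b'ef']
print(unstripe(parts, 2))        # b'abcdefghij'
```

Each device holds only a fraction of the file, so the devices can be read in parallel; with a single client node, as on this slide, the client's network link then limits the rate.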


Page 8: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

HPSS Feature – Multi-node transfers & striping in HPSS

[Diagram labels: Client Cluster, Client Cluster I/O Nodes, LAN Switch, HPSS Core Server, Metadata, HPSS Movers, Data Disks, Tape]

Match client BW to HPSS mover BW
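The reasoning behind matching client and mover bandwidth can be reduced to a one-line model: the achievable rate is bounded by the smaller of the two aggregate bandwidths. The function and the node counts and link speeds below are hypothetical values for illustration; they are not measurements from the talk, and the model ignores switch and storage device limits.

```python
def effective_bandwidth(client_nodes: int, client_gbps: float,
                        movers: int, mover_gbps: float) -> float:
    """Upper bound on transfer rate: the slower side of the path wins."""
    client_side = client_nodes * client_gbps
    mover_side = movers * mover_gbps
    return min(client_side, mover_side)


# One client node against four movers: the client link is the bottleneck.
single = effective_bandwidth(client_nodes=1, client_gbps=10, movers=4, mover_gbps=10)

# Four client nodes against four movers: both sides are matched.
multi = effective_bandwidth(client_nodes=4, client_gbps=10, movers=4, mover_gbps=10)

print(single, multi)   # 10 40
```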


Page 9: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

HPSS Feature – Virtual File System

[Diagram labels: Linux Client containing Unix/Posix Application, Posix File System Interface, HPSS VFS Extensions & Daemons, HPSS Client API, Data Buffer; HPSS Cluster (AIX or Linux) containing HPSS Core Server and HPSS Data Movers; Control and Data paths; Optional SAN Data Path]

• HPSS accessed using standard UNIX/Posix semantics

• Run standard applications on HPSS such as IBM DB2, IBM TSM, NFSv4, and Samba

• VFS available for Linux
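In practice, standard UNIX/Posix semantics means an application uses ordinary file calls and needs no HPSS-specific code. The sketch below shows such calls; a temporary directory stands in for the mount point, because the real mount path is site-specific and is not given in the talk.

```python
import os
import tempfile

# Stand-in for a hypothetical HPSS VFS mount point on the Linux client.
mount_point = tempfile.mkdtemp()
path = os.path.join(mount_point, "results.txt")

with open(path, "w") as f:            # ordinary open() and write()
    f.write("archived via POSIX calls\n")

os.rename(path, path + ".bak")        # ordinary rename()

with open(path + ".bak") as f:        # ordinary read()
    content = f.read()

print(os.listdir(mount_point), content)
```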


Page 10: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

HPSS Feature – User-defined Attributes

• Goals:
  – Provide an extensible set of APIs that will insert/update/delete/select UDAs from database
  – Provide robust search capability

• Storage based on DB2 pureXML

• Possible uses:
  – Checksum type w/value
  – Application specific
  – Expiration/action date
  – File version
  – Lustre path
  – Tar file TOCs

• Planned uses:
  – HSI: cksum, expiration date, trashcan, annotation, some application specific
  – HTAR: creator code and expiration date
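The insert/update/delete/select operations on XML-stored attributes can be sketched with Python's standard `xml.etree.ElementTree`. This is an illustration only: the in-memory catalogue, the element names, and the `set_uda`, `delete_uda`, and `select` helpers are hypothetical and are neither the HPSS UDA API nor DB2 pureXML.

```python
import xml.etree.ElementTree as ET

# Hypothetical catalogue: one <file> element per file, UDAs as child elements.
catalog = ET.Element("catalog")


def set_uda(path: str, key: str, value: str) -> None:
    """Insert attribute `key` on the file at `path`, or update it if present."""
    node = next((f for f in catalog.findall("file") if f.get("path") == path), None)
    if node is None:
        node = ET.SubElement(catalog, "file", path=path)
    attr = node.find(key)
    if attr is None:
        attr = ET.SubElement(node, key)
    attr.text = value


def delete_uda(path: str, key: str) -> None:
    """Remove attribute `key` from the file at `path`, if it exists."""
    for node in catalog.findall("file"):
        if node.get("path") == path:
            for attr in node.findall(key):
                node.remove(attr)


def select(key: str, value: str) -> list:
    """Return the paths of all files whose attribute `key` equals `value`."""
    return [f.get("path") for f in catalog.findall(f"file[{key}='{value}']")]


set_uda("/home/u/a.tar", "expiration", "2012-01-01")
set_uda("/home/u/a.tar", "checksum", "md5:abc")
set_uda("/home/u/b.tar", "expiration", "2012-01-01")
delete_uda("/home/u/b.tar", "expiration")
print(select("expiration", "2012-01-01"))   # ['/home/u/a.tar']
```

Storing attributes as XML keeps the set extensible: a new attribute is just a new child element, with no schema change.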


Page 11: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

Extreme Scale (2018-2020)

• Series of workshops conducted by users, applications, and organizations starting in 2007

• Proposed new program within DOE to realize computing at exascale levels

• Challenges:
  – Power
    • 20 MW - ?
  – Cost (size of the system, # of racks)
    • 3.6 - 300PB of memory
  – Storage
    • Exabytes of data, millions of concurrent accesses, PBs dataset movement between sites

• HPSS held an ES workshop and determined the following challenges:
  – Scalability
  – Data Management
  – System Management
  – Hardware


Page 12: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

HPSS v8.1

• Multiple Metadata Servers
  – Optimizes multiple client transfers
  – Enables managing Exabytes of data effectively

• On-line Upgrades
  – Ability to upgrade HPSS software while the system remains available to users


Page 13: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

HPSS post 8.1

• Advanced Data Management
  – Collaboration with data management community (SRMs, Content Managers…)

• Integration with 3rd party tape monitoring applications
  – Crossroads, HiStor, Sun solutions?

• Metadata footprint reduction

• New client caching for faster pathname operations


Page 14: High Performance Storage System (HPSS) Jason Hick Mass Storage Group jhick@lbl HEPiX

Thank you. Questions?
