HPSS/DFS: Integration of a Distributed File System with a Mass Storage System NASA Goddard Conference on Mass Storage Systems and Technologies IEEE Symposium on Mass Storage Systems March 23-26, 1998 Rajesh Agarwalla Transarc Corporation Rena Haynes Sandia National Laboratories
Page 1: HPSS/DFS: Integration of a Distributed File System …storageconference.us/1998/presentations/b2-2-AGARWAvg.pdfHPSS/DFS: Integration of a Distributed File System with a Mass Storage

HPSS/DFS: Integration of a Distributed File System

with a Mass Storage System

NASA Goddard Conference on Mass Storage Systems and Technologies

IEEE Symposium on Mass Storage Systems

March 23-26, 1998

Rajesh Agarwalla

Transarc Corporation

Rena Haynes

Sandia National Laboratories


The Team

This work was performed jointly under the auspices of the U.S. Department of Energy as part of the Accelerated Strategic Computing Initiative (ASCI) by LLNL (Contract W-7405-ENG-48), LANL (Contract W-7405-ENG-36), and SNL (Contract DE-AC04-94AL85000).


Motivation

• Information intensive era

• Cost effective storage of data

• Efficient and seamless access to data


Storage of data

Memory hierarchy

Level      Medium  Access time, transfer rate  Cost
Primary    RAM     60 ns                       $2 per MB
Secondary  Disk    8 ms, ~40 MB/s              $0.10 per MB
Tertiary   Tape    >4 min, ~5 MB/s             $0.002 per MB

Mass storage systems
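To make the cost argument concrete, the per-megabyte prices listed above can be multiplied out for a fixed amount of data. The sketch below is only illustrative: the 1 TB example size and the decimal convention (1 TB = 1,000,000 MB) are assumptions, not figures from the slides.

```python
# Per-MB prices taken from the memory hierarchy figures on this slide.
PRICE_PER_MB = {
    "RAM": 2.00,    # primary
    "Disk": 0.10,   # secondary
    "Tape": 0.002,  # tertiary
}


def cost_per_tier(megabytes):
    """Return the dollar cost of holding `megabytes` of data in each tier."""
    return {tier: megabytes * price for tier, price in PRICE_PER_MB.items()}


if __name__ == "__main__":
    one_terabyte_mb = 1_000_000  # assumed example size, decimal megabytes
    for tier, dollars in cost_per_tier(one_terabyte_mb).items():
        print(f"{tier:5s} ${dollars:,.0f}")
```

At these prices disk is 50 times more expensive per megabyte than tape, which is the gap that migration to a mass storage system is meant to exploit.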


Access to data

• Integrated access
• Data integrity
• Security

•Scalable across geographically disparate locations

File system

Distributed file system

• High speed I/O
  • Parallel paths
  • Third party transfers


Solution

•Integrate filesystem with mass storage system

•Migrate data from filesystem to mass storage system

•Cache data from mass storage system to filesystem

•Transparent migration / caching

•Efficient migration/caching mechanism


Previous approaches

• Filesystem uses mass storage system as back end
  • DMF, AMASS
  • Needs kernel modifications with OS upgrades by mass storage system vendor

• Mass storage system implements a filesystem interface
  • CFS, Unitree, HPSS
  • Lacks benefits of distributed filesystems
  • May need specialized clients


Our approach

• DMAPI
  • Recent standard interface between filesystem and data management apps
  • Adopted by X/Open - XDSM

• Integrate
  • DFS™ distributed file system with
  • HPSS mass storage system
  • via DMAPI

• DFS Storage Management Toolkit (DFS SMT) layer
  • An implementation of the DMAPI standard for DFS
  • HPSS interface layer using DFS SMT


Our Requirements

• Transparent archiving and caching of data
• Partial file residency
• No kernel mods by mass storage vendor across OS updates
• Preserve existing functionality/performance of DFS and HPSS
• Add a mode where file modifications are visible in both DFS and HPSS


Integrated HPSS-DFS Architecture

[Figure: architecture diagram with control and data paths. Components shown: Client Application, DFS Client, DFS File Server, DFS SMT, HPSS DMAP, DMAP Gateway, HPSS Client API, Native HPSS Interfaces, HPSS, DFS, Storage Devices.]


DFS Storage Management Toolkit (SMT) Architecture

[Figure: layered diagram divided into User Space and Kernel Space. Components shown: Data Management Application, DMAPI library, Device driver, DMBASE, DMLFS, Episode, DFS File Exporter.]


HPSS Create Example

[Figure: control and data flow for a create issued through HPSS. Components shown: Client Application, HPSS Client, Parallel FTP, HPSS Client API, HPSS, DMAP Gateway, HPSS DMAP, DFS SMT, DFS File Server, DFS Client, DFS, Storage Devices.]


DFS Read Example

[Figure: control and data flow for a read issued through DFS. Components shown: Client Application, DFS Client, DFS File Server, DFS SMT, HPSS DMAP, DMAP Gateway, HPSS Client, HPSS Client API, Native Interfaces, HPSS, DFS, Storage Devices.]


DFS SMT Features

• Filesystem sends notifications to DM application
• Then if necessary waits for response from DM application
• DM application processes notification
  • e.g. caches data into the filesystem from mass store

• DM application responds to the notification
• DM application can initiate operations on files via DMAPI
  • e.g. migrating data from filesystem to mass store
  • e.g. making migrated data non-resident in the filesystem
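The notify, wait, process, respond cycle described above can be pictured with a small in-memory simulation. This is a sketch under stated assumptions, not the DMAPI interface itself: the class names (`ManagedFile`, `MassStore`, `DMApplication`, `Filesystem`), the `"read"` event label, and the `"continue"` response are all hypothetical, and the blocking wait is modeled as an ordinary function call.

```python
from dataclasses import dataclass


@dataclass
class ManagedFile:
    name: str
    data: bytes = b""
    resident: bool = True    # True while the data is on filesystem disk
    migrated: bool = False   # True once a copy exists in the mass store


class MassStore:
    """Hypothetical in-memory stand-in for the mass storage system."""

    def __init__(self):
        self._objects = {}

    def put(self, name, data):
        self._objects[name] = data

    def get(self, name):
        return self._objects[name]


class DMApplication:
    """Hypothetical DM application; not the real DMAPI calls."""

    def __init__(self, store):
        self.store = store

    def handle_event(self, event, f):
        # Process the notification: cache data back if it was purged.
        if event == "read" and not f.resident:
            f.data = self.store.get(f.name)
            f.resident = True
        return "continue"  # response sent back to the filesystem

    def migrate(self, f):
        # DM-initiated operation: copy data from filesystem to mass store.
        self.store.put(f.name, f.data)
        f.migrated = True

    def purge(self, f):
        # DM-initiated operation: make migrated data non-resident.
        if not f.migrated:
            raise ValueError("refusing to purge data that was never migrated")
        f.data = b""
        f.resident = False


class Filesystem:
    def __init__(self, dm_app):
        self.dm_app = dm_app

    def read(self, f):
        # Notify the DM application and wait for its response
        # (modeled here as a plain synchronous call).
        if not f.resident:
            response = self.dm_app.handle_event("read", f)
            if response != "continue":
                raise OSError("read aborted by DM application")
        return f.data


if __name__ == "__main__":
    app = DMApplication(MassStore())
    fs = Filesystem(app)
    f = ManagedFile("report.dat", b"payload")
    app.migrate(f)   # copy to mass store
    app.purge(f)     # drop the on-disk copy
    print(fs.read(f), f.resident)
```

The client calling `read` never sees the recall; that is the transparency the slides ask for.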


DFS SMT Features - 2

• Provides for storage of DM attributes with files
  • Filesystem visible attributes
    • e.g. filesystem operations which should generate notifications to DM app
  • Filesystem opaque attributes
    • Understood by DM application
    • e.g. pointers to migrated data

• Implements all required DMAPI features


DFS SMT Features - 3

• Many optional DMAPI features provided
  • Persistent event masks
  • Persistent managed regions
  • Persistent attributes
  • Real removal of residency of migrated data
  • Punch holes in files
  • Non-blocking lock upgrades


Concept - DFS Filesets

[Figure: a disk divided into aggregates (partitions), labeled /dev/sd1b and /dev/sd1g, holding the filesets user.rich, user.donna, catia.bin, catia.lib, and proj.data.]

• Total file space at fileservers divided into filesets

• Each fileset is a separate tree-like filesystem

• Fileset: collection of related files

• Unit of administration, backup, replication


How are filesets linked in DFS?

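[Figure: the DFS client's view of the name space rooted at /:, with directories user and public, entries rich and donna, and files such as dirx/, filex, and .cshrc, drawn over the filesets root.dfs, user, user.rich, and user.donna.]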

• Embedded junctions called mount points

• Mount points join filesets into a single, global, uniform name space in DFS
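A path lookup that crosses mount points can be sketched as follows. The fileset names come from the slide, but which entries are mount points, the `MountPoint` class, and the `resolve` helper are assumptions made for illustration; real DFS lookups involve the file server and fileset location services, which this toy model leaves out.

```python
class MountPoint:
    """Hypothetical junction: a directory entry that names another fileset."""

    def __init__(self, fileset):
        self.fileset = fileset


# Each fileset is its own tree. Directories are dicts, files are the
# string "file". The layout below is an assumed reading of the figure.
FILESETS = {
    "root.dfs": {"user": MountPoint("user"), "public": {}},
    "user": {"rich": MountPoint("user.rich"),
             "donna": MountPoint("user.donna")},
    "user.rich": {"dirx": {}, "filex": "file"},
    "user.donna": {".cshrc": "file", "filex": "file"},
}


def resolve(path, root="root.dfs"):
    """Walk `path` from the root fileset, following mount points.

    Returns (name of the fileset that holds the final entry, the entry).
    """
    fileset = root
    node = FILESETS[root]
    for part in (p for p in path.split("/") if p):
        node = node[part]
        if isinstance(node, MountPoint):
            # Crossing a junction: continue in the target fileset's tree.
            fileset = node.fileset
            node = FILESETS[fileset]
    return fileset, node


if __name__ == "__main__":
    for p in ("/public", "/user/rich/filex", "/user/donna/.cshrc"):
        print(p, "is in fileset", resolve(p)[0])
```

The caller sees one tree; only the returned fileset name reveals that the walk crossed from one fileset into another.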


Other DFS Additions

• Episode
  • Support for punching holes in files
    • Ability to mark holes as offline data for purged data
  • Support for storing file attributes for files
    • Attributes inherently linked with respective file (no auxiliary file)

• Data backup
  • Filesets are unit of data dump and restore
  • Dump/restore facilities extended for file attributes, purged holes
  • Migrated and purged data not recalled currently when dumping
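Hole punching with offline marking is what makes partial file residency possible. A minimal sketch of the bookkeeping, assuming a simple list of half-open byte ranges, is shown below; the `SparseFile` class and its methods are hypothetical and say nothing about how Episode stores this information on disk.

```python
class SparseFile:
    """Tracks which byte ranges of a file are purged (offline)."""

    def __init__(self, size):
        self.size = size
        self.offline = []  # sorted, non-overlapping (start, end) ranges

    def punch_hole(self, start, end):
        """Mark bytes [start, end) as offline, merging with existing holes."""
        start, end = max(0, start), min(self.size, end)
        if start >= end:
            raise ValueError("empty or out-of-range hole")
        merged = []
        for s, e in self.offline:
            if e < start or s > end:
                merged.append((s, e))          # disjoint: keep as is
            else:
                start, end = min(s, start), max(e, end)  # overlap or touch
        merged.append((start, end))
        self.offline = sorted(merged)

    def is_resident(self, start, end):
        """True if no byte in [start, end) falls inside an offline hole."""
        return all(e <= start or s >= end for s, e in self.offline)


if __name__ == "__main__":
    f = SparseFile(100)
    f.punch_hole(10, 20)
    f.punch_hole(20, 30)
    print(f.offline, f.is_resident(0, 10), f.is_resident(5, 15))
```

A read that touches only resident ranges can be served without any recall, while a read into a hole would trigger a notification to the DM application.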


DMAPI Extensions for DFS SMT

• Filesets and aggregates
  • Mounting/unmounting aggregates
  • Fileset destruction
  • Enumerate fileset information

• Management interfaces
  • Scan by attribute
  • DCE security authentication information
  • ACL / permission events

• Mirrored fileset support
  • Synchronous post events for name space modification
  • Events for permissions changes
  • Association of pre and post events


HPSS Additions

• Data residency supported in HPSS
  • HPSS Only
  • Archived
  • Mirrored

• Name Server
  • Fileset type
  • Junctions to link filesets into HPSS tree

• Client API
  • Fileset behavior
  • Junction processing

• Shared transactional boundaries to support atomic behavior between DFS and HPSS

• Bitfile Server - data consistency

• File families


Performance Test Hardware Configuration at Sandia National Laboratories

DFS

Sun Ultra Sparc 2
DFS client and server
128 MB memory
Sun Fibre Channel SPARCstorage Array (25GB)
1GB DFS Cache

ATM Network

HPSS

2 IBM RS6000 model 570
IBM 3494 Tape Library
3590 Tapes
IBM 7135 Disk Array (45GB)


Average Time to Write a File

[Figure: average write time in seconds (logarithmic axis, 0.001 to 10) against file size in KB (8, 32, 128, 512, 2048, 8192). Series: DFS, Archived SMT DFS, Mirrored SMT DFS, Native HPSS.]


Connectathon Test Description

• Create: create 155 files 62 directories 5 levels deep
• Delete: remove 155 files 62 directories 5 levels deep
• Lookup: 500 getwd and stat calls
• Attr: 1000 chmods and stats on 10 files
• Write: write 1MB file 10 times
• Read: read 1MB file 10 times
• Readdir: 20500 entries read, 200 files
• Link: 200 renames and links on 10 files
• Symlink: 400 symlinks and readlinks on 10 files
• Statfs: 1500 statfs calls
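The Create totals above are consistent with a recursive tree builder that makes 5 files and 2 subdirectories in every directory, 5 levels deep. Those two per-directory parameters are an assumption (the slide gives only the totals); the sketch below builds such a tree in a temporary directory and counts what it created. The `build_tree` function is hypothetical and is not the Connectathon source.

```python
import os
import tempfile


def build_tree(root, levels, files, dirs):
    """Create `files` files and `dirs` subdirectories per directory,
    recursing `levels` deep. Returns (files_created, dirs_created)."""
    if levels == 0:
        return 0, 0
    nfiles = ndirs = 0
    for i in range(files):
        # Create an empty file, as a create-style test would.
        open(os.path.join(root, f"file.{i}"), "w").close()
        nfiles += 1
    for j in range(dirs):
        sub = os.path.join(root, f"dir.{j}")
        os.mkdir(sub)
        ndirs += 1
        f, d = build_tree(sub, levels - 1, files, dirs)
        nfiles += f
        ndirs += d
    return nfiles, ndirs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as top:
        # Assumed parameters: 5 levels, 5 files and 2 directories each.
        print(build_tree(top, levels=5, files=5, dirs=2))
```

With these parameters the counts are 5 x (1 + 2 + 4 + 8 + 16) = 155 files and 2 + 4 + 8 + 16 + 32 = 62 directories, matching the slide.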


Integrated DFS Connectathon Performance

[Figure: bar chart of elapsed time in seconds (0 to 6) for each Connectathon test: Create, Delete, Lookup, Attr, Write, Read, Readdir, Link, Symlink, Statfs. Series: DFS, Archived SMT DFS.]


Integrated HPSS I/O Performance

[Figure: elapsed time in seconds (0 to 200) against megabytes written and megabytes read (50, 100, 500, 1000). Series: HPSS Only Fileset, Mirrored HPSS Fileset.]


Conclusions

• Flexible DFS/HPSS Integration

• Minimal impact on data I/O rates
  • Archived DFS file creation equivalent to DFS at 8KB
  • Mirrored DFS file creation equivalent to DFS at 32KB
  • Mirrored HPSS performance equivalent to native HPSS (<3% difference)

• Connectathon performance overhead
  • Archived DFS Connectathon performance overhead ~20%

• Greater administrative complexity


Future Work

• DFS
  • Support for fileset movement and replication
  • Support for full fileset dumps
  • Client visible DM attributes

• HPSS DMAP ports to other platforms
• Easier administration tools
• Performance enhancements
• DMAPI extensions
  • Better support for distributed systems
  • Name space synchronization
  • Parallel file system support


Additional Information

• HPSS URL: www5.clearlake.ibm.com:6001

• DFS URL: www.transarc.com

• Availability: Sun Solaris and IBM AIX platforms, July - September

