1. A&T Advisory Board EDC Storage Area Network (SAN), April 19, 2004. Ken Gacke, Brian Sauer, Doug Jaton ([email_address], [email_address], [email_address])
2. Agenda
3. Storage Architecture: Direct Attached Storage
[Diagram: Linux, Sun, and SGI servers, each with direct attached storage, connected via Ethernet]
- Difficult to reallocate resources
- File sharing via Network (NFS, FTP)
  - NFS Performance/Security Issues
  - I/O Performance/Bandwidth
- Data Availability Concerns
  - Server failure => no data access
4. Storage Technology: Disk Farm SAN Configuration
[Diagram: Linux, Sun, and SGI servers sharing a disk farm through a fibre switch, connected via Ethernet]
- Logical Reallocation of Resources
- File sharing via Network (NFS, FTP)
  - NFS Performance/Security Issues
  - I/O Performance/Bandwidth
- Data Availability Concerns
  - Server failure => no data access
5. Storage Technology: Clustered File System SAN Configuration
[Diagram: Linux, Sun, and SGI servers, each running CXFS/CFS, sharing a file system through a fibre switch, connected via Ethernet]
- Hardware/Software Solution
- Logical Reallocation of Resources
- Shared File System
6. Storage Architecture
- File sharing across multiple servers
  - Heterogeneous Platform Support (IRIX, Solaris, Linux)
  - Reduce number of file copies (see the sketch after this list)
    - Reduce I/O requirements on server
    - Reduce time required to transfer data
- Increase disk storage utilization
  - Logical reallocation of storage resources
- Maintain data access when a server fails
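The copy-reduction argument above can be made concrete with a back-of-the-envelope sketch. The file size and network rate below are illustrative assumptions, not figures from the slides:

```python
# Data movement for one product file: network copy between
# direct-attached servers vs. a shared clustered file system
# (CXFS/CFS). All numbers are illustrative assumptions.

FILE_GB = 10            # hypothetical product size
NET_MB_S = 12.5         # assumed LAN payload rate (~100Mb Ethernet)

def ftp_copy(file_gb):
    """FTP between two servers: read on source, write on destination,
    and the file ends up stored twice."""
    server_io_gb = 2 * file_gb               # one read + one write
    copies = 2                               # duplicate storage
    transfer_min = file_gb * 1024 / NET_MB_S / 60
    return server_io_gb, copies, transfer_min

def shared_fs(file_gb):
    """Clustered file system: all hosts see the same blocks over
    fibre, so there is no second copy and no LAN transfer step."""
    return file_gb, 1, 0.0

for name, fn in (("FTP copy", ftp_copy), ("Shared FS", shared_fs)):
    io_gb, copies, minutes = fn(FILE_GB)
    print(f"{name:9s} server I/O={io_gb:3d}GB "
          f"copies={copies} transfer={minutes:.0f} min")
```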
7. Digital Reproduction CR1 SAN, April 19, 2004. Ken Gacke, SAIC Contractor ([email_address])
8. Historical Architecture (No SAN)
[Diagram: UniTree Server (Tape Drives: 8x9840, 2x9940B) and Product Distribution server connected via Ethernet]
Architecture Notes:
1) Data transfer via FTP
2) Duplicate storage on both servers
3) Multiple data file I/O required on both servers
4) System bandwidth constrained by network
9. CR1 SAN Timeline
- DMF Production Release in December 2001
  - Fully automated Data Migration process
  - 21TB migrated to DMF within 3 months (rate worked out below)
    - Data migration during off hours
    - Full data access through data migration period
- SGI CXFS Certified SAN Configuration
  - CXFS on two IRIX servers, DMF and PDS
  - 8 port and 16 port Brocade fibre switches
  - Test DMF/CXFS configuration
  - Performed final CXFS testing
- DMF/CXFS released to production on 11/5/02
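As a sanity check on the migration figures above, the sustained rate implied by 21TB in roughly three months of off-hours operation works out close to the 8.5MB/sec average reported on the CR1 summary slide. The 8-hour nightly window is an assumption; the slides say only "off hours":

```python
# Implied sustained rate for the DMF migration: 21TB in ~90 days,
# writing only during an assumed 8-hour nightly window.

TB = 1024**4                          # bytes per terabyte
migrated = 21 * TB
window_sec = 90 * 8 * 3600            # 90 days x 8h/night (assumed)

rate = migrated / window_sec / 1024**2
print(f"required sustained rate: {rate:.1f} MB/s")   # ~8.5 MB/s
```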
10. CR1 SAN Architecture
[Diagram: DMF Server and Product Distribution server on Ethernet, with 1Gb and 2Gb fibre to the shared disk cache and tape drives (8x9840, 2x9940B)]
Disk cache file systems: /dmf/edc 68GB, /dmf/doqq 547GB, /dmf/guo 50GB, /dmf/pds 223GB, /dmf/pdsc 1100GB
11. CR1 SAN Architecture
12. CR1 SAN Summary
- 2TB Disk Cache storing 67 Terabytes on the backend
- 2003 Average Monthly Data Throughput
  - Average data throughput of 8.5MB/sec, including tape access (see the arithmetic below)
- Minimal System/Ops Administration
  - SGI Software, RAID, and Fibre Switches
  - CXFS supported on SGI IRIX, Linux, Solaris, Windows, etc.
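For scale, an 8.5MB/sec average sustained around the clock corresponds to roughly 21TB moved per month:

```python
# Monthly volume implied by the 2003 average throughput figure.
rate_mb_s = 8.5
sec_per_month = 30 * 24 * 3600        # 30-day month
print(f"~{rate_mb_s * sec_per_month / 1024**2:.0f} TB/month")  # ~21
```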
13. Landsat SAN, April 19, 2004. Brian Sauer, SAIC Contractor ([email_address])
14. Landsat SAN Goals
- Improve Overall Performance (3 Hrs -> 1.5 Hrs)
- Maximize Disk Storage Through Shared Resources
- Centralized Management (System Admin, Hardware Eng)
- Overcome Old SCSI RAID Obsolescence (Ciprico 6900)
- Utilize Existing Investment in Fibre Channel Storage
  - Existing investment in Ciprico NetArrays
  - Combined throughput of over 240MB/sec
- Total Usable Storage over 10TB
- SGI, Linux and SUN Clients
- Integrate in Phases as Tasks Become SAN Ready
15. Landsat SAN Overview
- 13 TB of Raw Storage Utilizing Ciprico NetArrays
- Eleven Linux and Six SGI Clients
  - Data Capture System Database Server (DDS)
  - Landsat Processing System (LPS)
  - Landsat Archive Management System (LAM)
  - Image Assessment System (IAS)
  - Landsat Product Generation System (LPGS)
- ADIC StorNext File System Software
  - Shared High Performance File System
- Qlogic Fibre Channel Host Bus Adapters
16. Landsat OLD Data Flow
[Diagram: L7 L0Ra Archive (LAM); L7 Raw CC Archive (LAM); L7 Processing System (LPS), 85 minutes to process; DCS Database Server (DDS); Capture & Transfer System (CTS); RCC data moving via a 14-minute pass, two 24-minute transfers, and a 20-minute transfer. The timing is worked through in the sketch below.]
17. Landsat SAN
[Diagram: Satellite dish feeding LGS and CTS1-CTS3, each with RAID3 storage on the SAN; DDS, LAM, and LPS share RAW DATA and L0RA DATA on the SAN]
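A serial sum of the stage times from the old data flow lands close to the 3 Hrs -> 1.5 Hrs goal on slide 14. Treating the stages as strictly sequential is an assumption; the slides do not state how much the transfers overlapped:

```python
# Old flow: pass capture, three FTP hops, then LPS processing.
# SAN flow: shared storage removes the FTP hops entirely.

pass_min = 14
transfers_min = [24, 24, 20]      # FTP hops between CTS/DDS/LPS/LAM
process_min = 85                  # LPS processing time

old = pass_min + sum(transfers_min) + process_min
san = pass_min + process_min
print(f"old flow: {old} min (~{old/60:.1f} h)")   # 167 min, ~2.8 h
print(f"SAN flow: {san} min (~{san/60:.1f} h)")   # 99 min, ~1.7 h
```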
18. Landsat SAN Summary
- Able to share data in a high performance environment to reduce the amount of storage necessary
- Increase in overall performance of the Landsat Ground System
- Able to utilize existing equipment
  - Currently testing with other vendors
- Disk availability for projects during off-peak times, e.g. IAS
- Disadvantages / Challenges
  - Challenge to integrate an open solution
  - CIPRICO RAID controller failures
    - Not good for real-time I/O
  - Challenge to integrate into multiple tasks
    - Difficult to guarantee I/O
19. LP DAAC SAN Forum, April 19, 2004. Douglas Jaton, SAIC Contractor ([email_address])
20. LP DAAC Data Pool Phase I SAN Goals
- Phase I Data Pool Implementation in early FY03
- Access/Distribution Method (ftp site):
  - Support increased electronic distribution
  - Reduce need to pull data from archive silos
  - Reduce need for order submissions (and media/shipping costs)
  - Give science and applications users timely, direct access to data, including machine access
  - Allow users to tailor their data views to more quickly locate the data they need
- The Data Pool SAN infrastructure effectively acts as a subset archive of the full ECS archive
21. LP DAAC Data Pool (SAN) Configuration
- Data Pools are an additional subset inventory of science data (granule, browse, metadata) that resides in a separate inventory database, with the physical files resident on a local storage area network (SAN = 44TB)
  - STK D178 RAID racks with 1 Sun E450 metadata server
  - Data Pool inventory is managed via a 2nd Sybase inventory database
- Data Pool contents are populated from the primary ECS archive
  - Subscriptions can be fully qualified, with population occurring at insert time in the primary ECS archive (a function of ingest) (forward population)
  - Historical data load from the primary ECS archive via query (historical population capability) in support of science or user requirements
  - NASA's intent is to grow the on-line store into a working copy of the most popular data
- Dataset collections belong to groups and are configured for N days of persistence; they are automatically removed at expiration (rolling archive concept)
  - Data management of this 2nd archive, keeping it synchronized with the primary, has been problematic and has increased O&M costs
- Data Pool web client(s) and/or anonymous ftp site access are used to navigate contents, browse, access, and download data products. Directory structure is used:
  - /datapool/<mode>/<group>/<dataset.version>/<YYYY.MM.DD>, e.g. /datapool/ops/astt/ast_l1b.001/1999.12.31 (the placeholder names were lost in extraction and are inferred from the example; see the sketch below)
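A small sketch of that directory convention; the component names are inferred from the single example path above, so treat them as hypothetical:

```python
# Build a Data Pool path of the (inferred) form
# /datapool/<mode>/<group>/<dataset>.<version>/<YYYY.MM.DD>
from datetime import date

def datapool_path(mode, group, dataset, version, acq_date):
    return (f"/datapool/{mode}/{group}/"
            f"{dataset}.{version:03d}/{acq_date:%Y.%m.%d}")

print(datapool_path("ops", "astt", "ast_l1b", 1, date(1999, 12, 31)))
# -> /datapool/ops/astt/ast_l1b.001/1999.12.31
```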
22. LP DAAC Data Pool Contents & Access
- ASTER collection over U.S. States and Territories (no billing!)
- MODIS Group (TERRA & AQUA)
  - 8 day rolling archive of daily data for MODIS (expiry rule sketched below)
  - 12 months of data for higher level products
    - Most 8-day, 16-day, and 96-day products
- Web client interface(s) to navigate & browse data holdings via the Sybase inventory database
- Public Access: http://lpdaac.usgs.gov/datapool/datapool.asp
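The rolling-archive rule from the configuration slide (N days of persistence per group, automatic removal at expiration) reduces to a one-line age test. The group names and windows below reflect the MODIS figures above; the data structures are illustrative:

```python
# Rolling archive: a granule expires once it is older than its
# group's persistence window.
from datetime import date, timedelta

PERSISTENCE_DAYS = {"modis_daily": 8, "modis_higher_level": 365}

def expired(acq_date, group, today):
    return today - acq_date > timedelta(days=PERSISTENCE_DAYS[group])

today = date(2004, 4, 19)
print(expired(date(2004, 4, 1), "modis_daily", today))    # True
print(expired(date(2004, 4, 15), "modis_daily", today))   # False
```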
23. LP DAAC Data Pool Phase II SAN Goals
- Phase II FY04: Optimize System Throughput (systemic resource):
- Maximize Disk Storage Through Shared Resources
- Centralized Management (System Admin, Hardware Engr) of disk
- High Performance fibre channel connections
  - SGI, Linux and SUN Clients
- Decrease turn-around time for production and distribution orders
- Integrate SAN into ECS subsystems in phases as tasks become SAN ready/capable
  - Granules will be served from the SAN (Data Pool) if available, rather than staged from tape, meaning less thrashing of the archives for popular datasets (see the sketch below)
    - Effectively allows more ingest bandwidth, as there is less archive drive contention
    - The trick is to maintain rule sets for popular data to minimize silo thrashing
  - Less copying of data: no need for dedicated read-only caches across ingest, archive staging, production, media (PDS), and distribution (ftp push & pull)
- Fully utilize the SAN infrastructure across the sub-systems of the full ECS archive
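A minimal sketch of the serve-from-SAN-first rule described above. The function and the dict-backed stores are hypothetical; the slides describe the policy, not an API:

```python
# Prefer the on-line Data Pool copy; stage from tape only on a miss,
# and populate the pool so later readers avoid the silo.

def serve_granule(granule_id, datapool, tape_archive):
    path = datapool.get(granule_id)       # SAN hit: no silo mount
    if path is None:
        path = tape_archive[granule_id]   # miss: stage from tape
        datapool[granule_id] = path       # forward-populate the pool
    return path

pool = {}
tape = {"AST_L1B:0042": "/datapool/ops/astt/ast_l1b.001/..."}
print(serve_granule("AST_L1B:0042", pool, tape))  # staged from tape
print(serve_granule("AST_L1B:0042", pool, tape))  # served from pool
```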
24. LP DAAC SAN Overview
25. SAN Reality Check, April 19, 2004. Brian Sauer, SAIC Contractor ([email_address])
26. EDC SAN Experience
- TSSC understands this new technology
- Bring it in at the right level and at the right time to satisfy USGS programmatic requirements
- SAN technology is not a one-size-fits-all solution set
  - Need to balance complexity vs. benefits
- Project Requirements Differ
  - Size of SAN (storage, number of clients, etc.)
  - Open System Versus Single Vendor
- Provides high performance shared storage access
- Provides better manageability and utilization
- Provides flexibility in reallocating resources
- Requires trained Storage Engineers
- Complex architecture, especially as the number of nodes increases
27. EDC SAN Reality Check
- Vendors typically oversell SAN architecture
- Hardware: switches, HBAs, fibre infrastructure
- Hardware/software maintenance
- Disk maintenance higher than tape
  - Power & cooling of disk vs. tape
- Requires additional/stronger System Engineering
- Requires highly skilled System Administration
- Lifecycle is significantly shorter with disk vs. tape
28. EDC SAN Reality Check
- Difficult to share resources among projects in an enterprise environment
  - The ability to fund a large shared infrastructure has historically been problematic for EDC
  - Ability to allocate and guarantee performance to projects (storage, bandwidth, security, peak vs. sustained)
  - Scheduling among multiple projects would be challenging
- Not all projects require a SAN
  - SAN will not replace the Tape Archive(s) anytime soon
  - Direct attached storage may be sufficient for many projects