Evolution of Archive Technologies at the National Snow and Ice Data Center
Ruth Duerr
National Snow and Ice Data Center
1540 30th St, Boulder CO 80309-0449
Phone: +1-303-735-0136
FAX: +1-303-492-2468
E-mail: [email protected]
Presented at the THIC Meeting at the National Center for Atmospheric Research, 1850 Table Mesa Drive, Boulder CO 80305-5602
July 19-20, 2005
Evolution of Archive Technologies at the National Snow and Ice Data Center
Presented at the THIC Conference, Boulder CO, July 19-20, 2005
Outline
• A brief history of NSIDC
• Current holdings
• Current systems
• Thoughts on the future
A brief history of NSIDC
• World Data Center for Glaciology
  1957 - U.S. National Committee for the IGY awarded the operation of WDC-A for Glaciology to the American Geographical Society under the direction of Dr. William O. Field
  1970 - WDC for Glaciology transferred to the U.S. Geological Survey in Tacoma, Washington under the direction of Dr. Mark F. Meier
  1976 - WDC for Glaciology transferred to the NOAA Environmental Data and Information Service; an agreement between the University of Colorado and NOAA placed the WDC at CU-Boulder, Colorado under the direction of Roger Barry
A brief history of NSIDC (continued)
• National Snow and Ice Data Center
  1982 - NOAA grants NSIDC its name
  1983 - NSIDC receives a grant from NASA to archive Nimbus 7 passive microwave data
  1990 - NSIDC receives funding from NSF for the Arctic System Science (ARCSS) Data Coordination Center (ADCC)
  1993 - NSIDC receives first NASA Distributed Active Archive Center (DAAC) contract
A brief history of NSIDC (continued)
• 1996 - Antarctic Data Coordination Center (ADCC) established with NSF support
• 1999 - Antarctic Glaciological Data Center (AGDC) established with NSF support
• 2002 - Frozen Ground Data Center (FGDC) established with International Arctic Research Center (IARC) support
Current Holdings
• Information Center
  ~44,000 monographs, reports, serials, reprints, etc.
Current Holdings (continued)
Analog Archives
• ~10,000 glacier photos
• ~7,000 sea ice charts
• ~1,440 maps
• TBD cu ft of manuscripts and other records
Data Holdings (continued)
Digital Archives
• ~440 publicly advertised data sets
• 4.6 million granules in ECS system
• >3.5 million files in non-ECS systems
Archive Types
• ~8 TB on-line
• ~80 TB near-line
• >5 TB off-line
• Off-site backups for primary data without recovery agreements
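As a rough worked example of what these tier sizes imply, the sketch below sums the figures quoted on this slide and reports each tier's share. The variable names are illustrative, and the off-line figure is treated as a lower bound because the slide gives it only as ">5 TB".

```python
# Approximate archive tiers from the slide, in terabytes.
# The off-line value is a lower bound (">5 TB"), so the totals are lower bounds too.
tiers_tb = {"on-line": 8, "near-line": 80, "off-line": 5}

total_tb = sum(tiers_tb.values())
shares = {name: size / total_tb for name, size in tiers_tb.items()}

# Volume that would have to move to disk for an entirely on-line archive.
not_online_tb = total_tb - tiers_tb["on-line"]

for name, size in tiers_tb.items():
    print(f"{name:>9}: {size:3d} TB ({shares[name]:.0%})")
print(f"total (at least): {total_tb} TB; not yet on-line: {not_online_tb} TB")
```

Under these assumptions the near-line tier dominates the archive, which is why the choice of near-line technology matters so much in the slides that follow.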
[Figure: NSIDC Near-Line Archive Size and NSIDC Total Archive Size. Vertical axis: Terabytes (TB), 0 to 90; horizontal axis: Date, labelled 10/6/05 to 6/6/09.]
NSIDC Distribution Statistics
[Figure: NSIDC Data and Information Request Totals, Fiscal Years 1978-2004. Vertical axis: 0 to 9,000; horizontal axis: Fiscal Year.]
Sixty-six percent of requests in FY 2004 were for ECS products.
[Figure: ECS Product Distribution in GB per Month and Instrument Type. Vertical axis: 0 to 3,500.]
[Figure: ECS Product - Total Granules Distributed by Month and Instrument Type. Vertical axis: 0 to 90,000.]
NSIDC Architecture
[Diagram: NSIDC system architecture, VJT 7/05. The labels recoverable from the diagram are listed below.]
Networks and external connections
• UCB GE Campus LAN
• Abilene via Front Range GigaPop
• Cisco GE switch, ECS router, EBnet router, firewall
• LASP
• Raytheon (Denver)
• GES DAAC (MODIS), AMSR-E SIPS, I-SIPS, PO.DAAC AMSR L1A, EDOS
• M&O LAN (includes M&O Intranet Server and Backup Server)
• ECS Production LAN
• UCB non-ECS LAN
Ingest & Distribution
• ftp/file server
• tape production
• Disk storage (FC and SATA @ RAID-5)
• Rimage CD/DVD
• AMSR L1A PDR/Met
Web Services
• Guide Documents
• SOTC, All about…
• Catalog/DIF generator
• Client Interfaces
  - EDG, GISMO, PSQ, SNOWI
  - ECHO Client (WSRD)
Archive Services
• MAID VTL w/ AMASS (transition from tape archive in progress)
• Off-site archive
Infrastructure Services
• Email
• Calendar Manager
• Center-wide Intranet
• Backup Services
• SNIPS
• IDS
Science Processing Dev / Production
• SSM/I, NISE, NRTSI
  - Std Processing Env.
• GISMO/PSQ/WSRD
  - backend services
• AMSR-E PDR/.Met
ICESat/GLAS Remote SCF Server
• Visualization
• Subsetting Services
• (Includes storage for archived GLAS products)
HEG Server
• HDF-EOS-to-GeoTIFF
ECS Subsetter
• Spatial, temporal and parameter subsetting of granules (HDF-EOS only)
Data Management
• Data Dictionary
• V0Gateway
  - Inv. Search/Results
• Order Requests
  - subsetting services
• ECHO transfers
Data Server
• Manage the archive
  - STK Powderhorn w/ AMASS and ACSLS
• Insert data into archive
• Search and retrieve data from the archive
Ingest Polling Server
• AMSR, AMSR-E
• NISE
• MODIS
• ICESat
Data Pool
• On-line storage of most recently ingested data (StorNext SAN)
• Data accessible via Web GUI or ftp
Infrastructure
• Backup services
• What's Up
• SNIPS
• MSS and CSS
• Email Gateway
Order Manager
• Manages orders from V0Gateway, MTMGW, and Spatial Sub. Server
• Transfers orders to PDS for media requests
NSIDC Data Catalog
• Contains metadata about each published data set
  GCMD- and FGDC-compliant metadata
• Used to drive web page creation
• Modification underway to include OAIS/PREMIS compliant metadata for all data sets
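A minimal sketch of how a catalog record can drive web page creation is shown below. The record fields, the identifier, and the page template are hypothetical placeholders chosen for illustration; they are not NSIDC's actual catalog schema or page layout.

```python
from html import escape
from string import Template

# Hypothetical catalog record; field names and values are illustrative only.
record = {
    "id": "EXAMPLE-0001",
    "title": "Example Sea Ice <Data> Set",
    "summary": "Placeholder summary text.",
}

# Hypothetical page template filled from the record's fields.
PAGE = Template(
    "<html><head><title>$title</title></head>"
    "<body><h1>$title</h1><p>$summary</p><p>Data set ID: $id</p></body></html>"
)

def render_page(rec):
    """Fill the page template from one catalog record, escaping HTML characters."""
    return PAGE.substitute({key: escape(str(value)) for key, value in rec.items()})

html_page = render_page(record)
print(html_page)
```

Keeping the metadata in one catalog and generating pages from it means a correction to a record only has to be made once.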
NSIDC DAAC ECS Interfaces
[Diagram: data flows into and out of the ECS at the NSIDC DAAC. Bracketed values [ ] are archive volumes at L+1yr in GB/day; T = Terra, A = Aqua. The labels recoverable from the diagram are listed below.]
Clients and users
• EDG client, MTMGW, Subscriptions
• QA SCFs, SCFs, science users, general users
Central system
• ECS NSIDC DAAC
Labelled sources and flows
• NISE; NSIDC PDR Server; NISE Data
• EDOS
• AMSR L1A [1.3]
• GBAD PDS [<1]
• AMSR-E PDS [1.4]
• AMSR-E SIPS; L2+ Products [3.6]
• GDAAC, LDAAC, LaTIS; NISE Data [total <1]
• MODIS SCFs; MODIS L2/L3 Snow/Ice Products; QA Updates [2/T, 1/A]; [<1/T, <1/A]
• MODAPS; MODIS L2/L3 Snow/Ice Products [120/T, 16/A]
• NASDA; AMSR-E L0 Science & GBAD PDSs; 8 mm back-up for first 90 days [1.4]; emergency back-up following L+90
• GLAS L0 [5.8]
• GLAS SIPS; GLAS L1+ Products [23]; Data/QA Updates
• LP DAAC; ASTER Anc [<1]
• GDAAC; Aqua Ancillary from EMOS [<1]
• ECHO; Metadata and Browse
Total Data Distribution
• Aqua AMSR-E Data Products [7]
• Aqua MODIS Data Products [16]
• Terra MODIS Data Products [40]
• ADEOS II AMSR Data Products [1]
• GLAS Data Products [7]
NSIDC/UCB Interfaces
[Diagram: data flows between NSIDC/UCB systems and external providers and users. Legend: via UCB network; via NASA EMSn/ECS; media (tape/CDR). The labels recoverable from the diagram are listed below.]
NSIDC/UCB hosts
• sidads
• adcc / agdc (arcss)
• kryos (e-mail server)
• glacier
• GISMO-E
• bipolar
• EDG/SNOWI (web server)
• Coriolis
Data providers and partners
• NISE Data (ftp push)
• NSIDC/ECS: AMSR/AMSR-E L1A, PDRs and met (ftp push)
• SSM/I Wentz Tas (RSS)
• AMSR/AMSR-E (JPL/PO.DAAC)
• SSM/I TBs (ftp pull)
• SNODAS (NOHRSC)
• Glacier Photos
• AVHRR imagery (Antarctic Ship)
• Permafrost Investigators
• ARCSS Investigators
• Vorticity Modeling Data (USGS/Denver), rcp
• NCEP Global Reanalysis (CDC)
• NCEP Global Reanalysis (UNH)
• IMS (NCDC), ftp pull
• Sea Ice Charts (Arg)
• Dehn Ice Charts
• Ice Charts/Reports (NIC)
• AMSR-E NRT (MSFC)
• Weekly Snow Map (Rutgers)
• GHRC/UAH MODIS Browse (ftp push)
• GLAS data (ftp push)
Users and outputs
• Public user communities
• Requests, results, orders (TCP sockets)
• E-mail and attachments
• html, .gif, jpeg, data, etc.
• Documents, images, forms
• GISMO/PSQ
• rSCF requests
• Subsetted GLAS data (DVD)
• Media distribution (DVD, CD)
Transfer methods shown
• ftp push, ftp pull, ftp push/pull
• Fastcopy push
• rcp
• FedEx
COPAN MAID - Early Experiences
• Using AMASS to provide a file system front end to a COPAN 200T system configured as an L700 with seven 9940 drives
  COPAN to act as a drop-in replacement for the STK 9710
• Migration of data underway
• The big surprise was performance
• A minor surprise is the ever-shrinking total archive size
Thoughts on the Future
• NSIDC DAAC is pushing for an entirely on-line archive
  What technologies will allow this? (SATA RAID, COPAN, or ???)
  What happens to media?
  How do we ensure preservation of these data over time?
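One common approach to the preservation question, offered here as an illustration rather than as anything the slides describe, is periodic fixity checking: record a checksum for every archived file, then recompute and compare on a schedule. The sketch below assumes SHA-256 and an in-memory manifest; the function names and the sample file are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def build_manifest(root):
    """Map each file's path (relative to root) to its checksum."""
    root = Path(root)
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }

def audit(root, manifest):
    """Return the names of files that are missing or whose checksum changed."""
    current = build_manifest(root)
    return sorted(name for name, digest in manifest.items()
                  if current.get(name) != digest)

# Demonstration on a temporary directory standing in for an archive volume.
with tempfile.TemporaryDirectory() as tmp:
    granule = Path(tmp) / "granule_001.dat"
    granule.write_bytes(b"example granule contents")
    manifest = build_manifest(tmp)
    clean = audit(tmp, manifest)          # nothing has changed yet
    granule.write_bytes(b"example granule c0ntents")  # simulate silent corruption
    damaged = audit(tmp, manifest)

print(clean, damaged)
```

The same manifest can also be used to confirm that files arrive intact when data are migrated from one storage technology to another.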