Transcript
Page 1: AGU2015 Rosen v3 - Dawn Wrightdusk.geo.orst.edu/Pickup/Esri/AGU2015/IN43B-1734-Rosen.pdf · 2016-01-19 · IN43B– 1734 The Synthetic Aperture Radar Science Data Processing Foundry

IN43B-1734: The Synthetic Aperture Radar Science Data Processing Foundry Concept for Earth Sciences

Paul A. Rosen (1), Hook Hua (1), Charles Norton (1), Michael M. Little (2)

(1) Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109
(2) National Aeronautics and Space Administration, Earth Science Technology Office, Washington DC

Abstract

Since 2008, NASA's Earth Science Technology Office and the Advanced Information Systems Technology Program have invested in two technology evolutions to meet the needs of the community of scientists exploiting the rapidly growing database of international synthetic aperture radar (SAR) data. JPL, working with the science community, has developed the InSAR Scientific Computing Environment (ISCE), a next-generation interferometric SAR processing system that is designed to be flexible and extensible. ISCE currently supports many international spaceborne data sets but has been primarily focused on geodetic science and applications. A second evolutionary path, the Advanced Rapid Imaging and Analysis (ARIA) science data system, uses ISCE as its core science data processing engine and produces automated science and response products, quality assessments and metadata. The success of this two-front effort has been demonstrated in NASA's ability to respond to recent events with useful disaster support. JPL has enabled high-volume and low-latency data production by reusing the hybrid cloud computing science data system (HySDS) that runs ARIA, leveraging on-premise cloud computing assets that are able to burst onto Amazon Web Services (AWS) as needed. Beyond geodetic applications, needs have emerged to process large volumes of time-series SAR data collected for estimation of biomass and its change, in such campaigns as the upcoming AfriSAR field campaign. ESTO is funding JPL to extend the ISCE-ARIA model to a “SAR Science Data Processing (SDP) Foundry” to on-ramp new data sources and to produce new science data products to meet the needs of science teams and, in general, science community members. An extension of the ISCE-ARIA model to support on-demand processing will permit PIs to leverage this Foundry to produce data products from accepted data sources when they need them. This paper will describe each of the elements of the SAR SDP Foundry and describe their integration into a new conceptual approach to enable more effective use of SAR instruments.


[Figure: ARIA advanced data system diagram, showing the ARIA-EQ (R&TD) prototype system and the ARIA-MH (AIST) extensions. Recoverable labels include:
- Sensors and data sources: radar satellite sensors and providers/archives (ESA: ERS-1/2, Envisat, Sentinel-1a/b; JAXA: ALOS-1/2; ASI: COSMO-SkyMed; NASA: DESDynI); GPS constellation and GPS stations; seismometer events; optical satellites (Quickbird, Worldview 1&2); weather data for tropospheric correction
- Data and products: Level-0 raw data, orbit data, coherence, coseismic interferogram, line-of-sight displacement, along-track displacement, time series, InSAR and GPS integration, fault model, damage proxy map (DPM)
- Hazard types: earthquake, volcano, landslide, flood, subsidence
- ARIA-MH: cloud-based computing, storage, and data management]

Figure note: ARIA-EQ (R&TD) is developing all connections in the blue regions, while ARIA-MH (AIST) will develop advanced functions for maturing the data system capabilities and enabling cloud-based hazard monitoring, data processing, management, and distribution. ARIA Co-Laboratory (PDF) is advancing low-latency GPS, optical satellite imagery and fault modeling integration.

ISCE in ARIA (Advanced Rapid Imaging and Analysis system)

Acknowledgement

The authors would like to thank the Earth Science Technology Office and High End Computing Program at NASA for support. This work was performed at the Jet Propulsion Laboratory, California Institute of Technology under a contract with NASA.

Copyright © 2015 California Institute of Technology. Government sponsorship acknowledged. All Rights Reserved.

The NISAR mission will launch in 2020 and will collect large volumes of SAR data to measure Earth’s changing ecosystems, dynamic surfaces, and ice masses, providing information about biomass, natural hazards, sea level rise, and groundwater.

HySDS Option for the Proposed NISAR Mission

Our Hybrid Cloud Science Data System (HySDS) is currently being considered as a part of the solution for dealing with the flood of data in need of processing. The project will process the raw Level-0 data into Level-2 products using ISCE and provide these low-level products directly to the scientists. HySDS would control the processing to these levels and could carry data to higher-level science products.

Key Considerations

• Any SAR mission can produce thousands to millions of images
  - Orbital global mapping produces data continuously
  - Sub-orbital flight missions and campaigns support specific objectives
• Science Data Products vary depending on system design, domain and intended purpose
  - Radar frequency (band)
  - Scanning strategy (multi-pass, single-pass, etc.)
  - Platform operations artifacts (orbital vs. aircraft)
• Science Data Processing has some common characteristics
  - High volume of (typically) embarrassingly parallel processing jobs
  - Quality assurance, metadata and registration of images
  - Cloud computing offers a scalable, if not affordable, solution
    o Prioritization for scheduling
    o Event triggers
    o Low-latency processing
  - Create metadata for provenance, geolocation, temporal, quality
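The scheduling ideas in the list above (many independent jobs, prioritization, event triggers) can be illustrated with a small priority queue. This is only a sketch under stated assumptions: the class and method names (`Job`, `JobQueue`, `on_event`) and the priority values are hypothetical and are not taken from HySDS or ARIA.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count


@dataclass(order=True)
class Job:
    priority: int                         # lower number = more urgent
    seq: int                              # tie-breaker: keeps submission order
    scene_id: str = field(compare=False)  # payload, ignored when ordering


class JobQueue:
    """Hypothetical scheduler: routine jobs wait behind event-triggered ones."""

    def __init__(self):
        self._heap = []
        self._counter = count()

    def submit(self, scene_id, priority=10):
        heapq.heappush(self._heap, Job(priority, next(self._counter), scene_id))

    def on_event(self, scene_ids):
        # Event trigger: scenes covering an event jump ahead of routine work.
        for scene_id in scene_ids:
            self.submit(scene_id, priority=0)

    def drain(self):
        # Pop every job in priority order; each job is independent of the
        # others, so a real system could hand them to parallel workers.
        order = []
        while self._heap:
            order.append(heapq.heappop(self._heap).scene_id)
        return order


queue = JobQueue()
queue.submit("routine-A")
queue.submit("routine-B")
queue.on_event(["quake-1"])
print(queue.drain())  # ['quake-1', 'routine-A', 'routine-B']
```

Because each scene is an independent job, low-latency response in this sketch comes purely from ordering the queue, not from changing the processing itself.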

Synthetic Aperture Radar Science Data Processing (SAR SDP) Foundry Concept

• Definition
  - A set of user-selectable components, implemented in a scalable processing environment, that leverage a common framework for producing community-accepted Science Data Products from SAR instruments
  - Supports multiple research and applied science communities
  - Community review/acceptance of processing model and subsequent improvements
  - Community-defined science data products
• Components
  - Interface to instruments which have been on-ramped
  - Production processing codes for community-defined Science Data Products which have been on-ramped
  - ISCE: processing environment for instrument output
  - ARIA SDS: end-to-end SDS for SAR processing and data management
    o Provides provenance, metadata, quality control, registration and workflow
  - Hybrid Cloud: provides scalable processing environment, including AWS
  - Foundry User Interface
    o Implements business model
    o Permits user selection of instrument, scenes, standard data products
  - EOSDIS-designated repository provides common destination for output products
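The user-selection step described above (instrument, scenes, standard data product) can be sketched as a request object that is checked against the set of on-ramped instruments. Everything named here is an assumption for illustration: the catalog contents, the product names, and the `ProcessingRequest` and `validate` names are hypothetical and do not describe an actual Foundry interface.

```python
from dataclasses import dataclass

# Hypothetical catalog: on-ramped instruments and the standard products
# accepted for each. The product names are illustrative placeholders.
ON_RAMPED = {
    "COSMO-SkyMed": {"interferogram", "coherence"},
}


@dataclass(frozen=True)
class ProcessingRequest:
    instrument: str
    scenes: tuple
    product: str


def validate(request, catalog=ON_RAMPED):
    """Accept a request only for an on-ramped instrument and product."""
    if request.instrument not in catalog:
        raise ValueError(f"instrument not on-ramped: {request.instrument}")
    if request.product not in catalog[request.instrument]:
        raise ValueError(f"product not accepted: {request.product}")
    if not request.scenes:
        raise ValueError("select at least one scene")
    return True


request = ProcessingRequest("COSMO-SkyMed", ("scene-001", "scene-002"), "interferogram")
print(validate(request))  # True
```

Rejecting a request before any job is queued keeps processing limited to community-accepted instrument and product combinations.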

SAR SDP Foundry Benefits

• Processing is under the control of the customer with data and funding
  - JPL can leverage their cloud interface
  - NASA can leverage OCIO SEWP Acquisition and simply use a WBS instead of a PR
  - Non-NASA collaborators, through agreement, can buy their own processing on AWS
• Processing environment is published and community-accepted
• Clearly defined processes for on-ramping instruments and data product specifications
  - Interface Control Documents publish requirements for L0 and L1 to permit processing
  - Instrument Team can account for high-volume processing at initial product design
• Processing improvements are shared among the science communities
  - Example: reliable use of spot pricing
• Science Data Products can become available to the communities regardless of who funded their production
  - Consistent with 2004 InSAR Working Group Workshop Summary Report (10/20/2004)
  - Can also deliver to an optional destination for immediate use

SAR SDP Foundry Funding Model

• Major ongoing costs
  - SDP processing
  - Repository of data products
• Instrument development team
  - Produces L0 and L1 data products conforming to Foundry interface
  - On-ramping of instrument by the Foundry team at JPL
    o Foundry creates configuration model, selectable configuration file, processing pattern
    o Acceptance by Instrument Team in conjunction with appropriate Science Team
• Research or Applied Sciences Community Science Team defines, tests, accepts and funds data products
  - Funds JPL for implementation in the Foundry
  - Funds community acceptance tests of output products
• EOSDIS funds repository and stewardship functions
  - Presumably ASF repository functions
• Foundry technology development competes for AIST funding
• Ongoing operations should not be funded by individual PI projects
  - Maintenance of Foundry, including help desk, training, software maintenance
• However, data processing costs themselves would be paid for by customers as part of their project
  - PI controls what processing to pay for

Will It Work?

• ESTO is supporting this effort as a technology demonstration
• Goal is to demonstrate that the Foundry is:
  - Scientifically valuable
  - Technically feasible and efficient
  - Cost effective
  - Supported by the user and application development community
• We look forward to working with the community to explore the Foundry’s potential

Advanced Rapid Imaging & Analysis for Monitoring Hazards, using HySDS Technology and ISCE

[Figure panel labels: Hydrology; Oceanography; Land Cover and Change; Cryospheric Science; Crustal Dynamics]

Once on-ramped, PIs can quickly process the data they select, to the extent that they can afford it.

Bottom Right: EcoSAR image acquired over Andros Island, Bahamas, in March 2014. Top 4 right: from InSAR Workshop Summary Report, 2004, Oxnard, CA.

InSAR Scientific Computing Environment on the Cloud

The SAR SDP Foundry

[Figure: sensors shown feeding the Foundry]
- UAVSAR, AirMOSS: L-Band, P-Band (JPL)
- EcoSAR: P-Band (GSFC)
- COSMO-SkyMed: X-Band (Italy)
- DLR F-SAR: X, C, S, L, P-Bands (DLR)
- Sentinel 1: C-Band (ESA)

L0 or L1 SAR data from NASA and other sources, as well as new processing workflows, can be on-ramped into the Foundry. Figure shows some of the planned data sets (COSMO-SkyMed implemented).

NISAR Big Data Handling
• Average input data volume to SDS: 3 TB/day
• Average daily production volume: 104 TB/day
  - L0 (6 TB per day)
  - L1 (32 TB per day, L-band)
  - L2 (66 TB per day, L-band)
• Aggregate data generated per day
  - Sustained mean 1.2 GB/s: forward processing
  - Sustained mean 4.8 GB/s: bulk reprocessing (optional)
• New architectures needed to support data handling and throughput
  - SDS processing at JPL on-premise
  - SDS processing collocated with DAAC
  - SDS processing at public cloud
  - Hybrid cloud: on-premise and public cloud for processing and data storage
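As a quick consistency check on the figures above, the per-level volumes can be summed and converted to a sustained rate. The sketch assumes decimal units (1 TB = 10^12 bytes, 1 GB = 10^9 bytes), which the poster does not state; the function name is illustrative.

```python
# Daily production volumes quoted on the poster, in terabytes per day.
DAILY_TB = {"L0": 6, "L1": 32, "L2": 66}

TB = 1e12                 # assumption: decimal terabytes
GB = 1e9                  # assumption: decimal gigabytes
SECONDS_PER_DAY = 86_400


def mean_rate_gb_per_s(tb_per_day):
    """Convert a daily volume in TB into a sustained mean rate in GB/s."""
    return tb_per_day * TB / SECONDS_PER_DAY / GB


total_tb = sum(DAILY_TB.values())
forward_rate = mean_rate_gb_per_s(total_tb)

print(f"total: {total_tb} TB/day")
print(f"forward processing: {forward_rate:.1f} GB/s")
print(f"bulk/forward ratio: {4.8 / 1.2:.0f}x")
```

Under these assumptions the three levels sum to the quoted 104 TB/day, which corresponds to roughly 1.2 GB/s sustained. The quoted bulk reprocessing rate of 4.8 GB/s is four times the forward rate.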

Expected high data throughput needs of modern SAR missions
• Sentinel 1A/1B
  - C-band SAR
  - 1.8 TB/day raw data
• NISAR (2020)
  - L-band SAR
  - Deliver >100 TB data products per day to DAACs
• SWOT (2020)
  - Bulk reprocessing requires >3 GB/second mean sustained data throughput to DAAC

SAR SDP Foundry Next Steps

• Characterize and develop on-ramping cost model
• Open HySDS adaptation for SAR SDP Foundry to beta users
• Develop charging protocols
• Define a sensor and workflow of interest to the community to on-ramp
• Exercise the Foundry with customer inputs

• Utilizes both on-premise and off-site infrastructure
  - PB-scale processing and storage purely in public cloud currently too expensive
  - Hybrid cloud data system architecture
  - Burst out to public cloud when demand exceeds on-premise resources
• Leverage Amazon GovCloud US to address export control and firewall security issues
• Processing at low cost (up to 10X cheaper) using high-resiliency data system that can run in competitive AWS spot market
• Auto-scaling of science data system
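Running in a spot market means a compute node can be reclaimed mid-job, so a resilient data system has to tolerate interrupted work. A minimal sketch of that idea is to put an interrupted job back on the queue until it succeeds or exhausts its attempts. The exception class, function name, and retry limit below are hypothetical and do not describe how HySDS implements resiliency.

```python
from collections import deque


class SpotInterruption(Exception):
    """Raised by a worker when its (simulated) spot node is reclaimed."""


def run_with_requeue(jobs, execute, max_attempts=3):
    """Run jobs in order; requeue any job whose node was interrupted."""
    queue = deque((job, 1) for job in jobs)
    done, failed = [], []
    while queue:
        job, attempt = queue.popleft()
        try:
            execute(job)
            done.append(job)
        except SpotInterruption:
            if attempt < max_attempts:
                queue.append((job, attempt + 1))  # retry later on another node
            else:
                failed.append(job)                # give up and report the job
    return done, failed


# Simulated worker: the first attempt at "scene-2" loses its node.
interrupted_once = set()

def execute(job):
    if job == "scene-2" and job not in interrupted_once:
        interrupted_once.add(job)
        raise SpotInterruption(job)


done, failed = run_with_requeue(["scene-1", "scene-2", "scene-3"], execute)
print(done, failed)  # ['scene-1', 'scene-3', 'scene-2'] []
```

The requeue only works because the jobs are independent and safe to repeat, which matches the embarrassingly parallel workload described earlier.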

Hybrid Cloud Computing Science Data System (HySDS)

[Figure: HySDS architecture diagram. Recoverable labels:
- Input: JPL MOS/GDS delivers raw instrument L0a, ancillary data, and Inst HK (GDS L0 RAW)
- Platform: JPL on-premise cloud and/or AWS
- Management: Resource Management; Workflow Management (orchestrates jobs); Event Management (events); Auto-Scaling (monitors alerts and provisions VMs/containers)
- Workers (each gets jobs): Crawler Worker (monitors); Ingest Worker (extracts data); Triage Worker (writes work data); Product Localize Worker (writes data products to staging)
- Processing and storage: L0-L2 PGEs (EC2 compute VMs), which localize inputs and publish data products; L0-L2 Pipelines & Ingest; Ops Work Data Cache; L0-L2 Data Products (object store); Discovery Services (publishes metadata)
- Output: DAAC Staging; DAAC]

Cloud Computing-based SAR Foundry

VM, Bare Metal, Beowulf Clusters

Discovery Services, Access Services, Compute Services, Storage Services, Analytics Services, UX Services, Provenance Services

Infrastructure “Fabric”

SDS services for Faceted Discovery, Monitoring Rules, Conditional Actions, Workflow Processing, and Real-time Metrics

Science Users, Response Agencies, DAACs, Ground Systems Operations

The SAR Foundry leverages a computing fabric layer over multiple heterogeneous infrastructures for diversification and high resiliency.

Science Data Products: faceted search and on-demand services

Science Data System: faceted metrics and PROV-ES provenance

# compute nodes over time

Sentinel-1A scenes for Nepal EQ response

Real-Time Situational Awareness of SAR Processing

Automatic Scaling of Science Data System

• The size of the data system automatically grows/shrinks based on computing needs.
• Scaled up to thousands of compute nodes
• Demonstrated capability of higher internal data throughput rates than currently expected NISAR needs
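The grow/shrink behavior, combined with bursting to the public cloud once on-premise capacity is exhausted, can be sketched as a simple sizing rule driven by queue depth. The function name, parameters, and numbers are assumptions for illustration only, not the actual HySDS scaling policy.

```python
import math


def plan_nodes(queued_jobs, jobs_per_node, on_premise_nodes, max_cloud_nodes):
    """Return how many on-premise and public-cloud nodes to run."""
    needed = math.ceil(queued_jobs / jobs_per_node)
    on_premise = min(needed, on_premise_nodes)         # fill local capacity first
    cloud = min(needed - on_premise, max_cloud_nodes)  # burst only the overflow
    return {"on_premise": on_premise, "cloud": cloud}


# The system shrinks to zero when idle and bursts when demand exceeds
# the (hypothetical) 20 on-premise nodes.
for depth in (0, 150, 5000):
    print(depth, plan_nodes(depth, jobs_per_node=10,
                            on_premise_nodes=20, max_cloud_nodes=1000))
```

Re-evaluating this rule periodically makes the node count follow the queue in both directions, which is the kind of behavior a "compute nodes over time" plot would show.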

Multiple Concurrent Platforms

• Amazon Web Services
• OpenStack Cloud Software
• Windows Azure
• Eucalyptus
• Nebula
