IN43B-1734: The Synthetic Aperture Radar Science Data Processing Foundry Concept for Earth Sciences
Paul A. Rosen (1), Hook Hua (1), Charles Norton (1), Michael M. Little (2)
(1) Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109
(2) National Aeronautics and Space Administration, Earth Science Technology Office, Washington DC
Abstract
Since 2008, NASA's Earth Science Technology Office and the Advanced Information Systems Technology Program have invested in two technology evolutions to meet the needs of the community of scientists exploiting the rapidly growing database of international synthetic aperture radar (SAR) data. JPL, working with the science community, has developed the InSAR Scientific Computing Environment (ISCE), a next-generation interferometric SAR processing system that is designed to be flexible and extensible. ISCE currently supports many international spaceborne data sets but has been primarily focused on geodetic science and applications. A second evolutionary path, the Advanced Rapid Imaging and Analysis (ARIA) science data system, uses ISCE as its core science data processing engine and produces automated science and response products, quality assessments and metadata. The success of this two-front effort has been demonstrated in NASA's ability to respond to recent events with useful disaster support. JPL has enabled high-volume and low-latency data production by reusing the hybrid cloud computing science data system (HySDS) that runs ARIA, leveraging on-premise cloud computing assets that are able to burst onto Amazon Web Services (AWS) as needed. Beyond geodetic applications, needs have emerged to process large volumes of time-series SAR data collected for estimation of biomass and its change, in such campaigns as the upcoming AfriSAR field campaign. ESTO is funding JPL to extend the ISCE-ARIA model to a “SAR Science Data Processing (SDP) Foundry” to on-ramp new data sources and to produce new science data products to meet the needs of science teams and, in general, science community members. An extension of the ISCE-ARIA model to support on-demand processing will permit PIs to leverage this Foundry to produce data products from accepted data sources when they need them. This paper will describe each of the elements of the SAR SDP Foundry and describe their integration into a new conceptual approach to enable more effective use of SAR instruments.
[Figure: ARIA data system diagram. Labels recoverable from the extraction indicate inputs from radar satellites, GPS stations, seismometers, and optical satellites; ISCE-based InSAR processing; products including coseismic interferograms, coherence, damage proxy maps, fault models, and time series; and the hazard types earthquake, volcano, landslide, flood, and subsidence.]
ISCE in ARIA (Advanced Rapid Imaging and Analysis system)
Acknowledgement
The authors would like to thank the Earth Science Technology Office and High End Computing Program at NASA for support. This work was performed at the Jet Propulsion Laboratory, California Institute of Technology under a contract with NASA.
Copyright © 2015 California Institute of Technology. Government sponsorship acknowledged. All Rights Reserved.
The NISAR mission will launch in 2020 and will collect large volumes of SAR data to measure Earth’s changing ecosystems, dynamic surfaces, and ice masses, providing information about biomass, natural hazards, sea level rise, and groundwater.
HySDS Option for the Proposed NISAR Mission
Our Hybrid cloud Science Data System (HySDS) is currently being considered as a part of the solution for dealing with the flood of data in need of processing. The project will process the raw Level-0 data into Level-2 products using ISCE and provide these low-level products directly to the scientists. HySDS would control the processing to these levels and could carry data to higher-level science products.
Key Considerations
• Any SAR mission can produce thousands to millions of images
  - Orbital global mapping produces data continuously
  - Sub-orbital flight missions and campaigns support specific objectives
• Science Data Products vary depending on system design, domain and intended purpose
  - Radar frequency (Band)
  - Scanning strategy (multi-pass, single pass, etc.)
  - Platform operations artifacts (orbital vs. aircraft)
• Science Data Processing has some common characteristics
  - High volume of (typically) embarrassingly parallel processing jobs
  - Quality Assurance, metadata and registration of images
  - Cloud Computing offers scalable, if not affordable, solution
    o Prioritization for scheduling
    o Event Triggers
    o Low latency processing
  - Create metadata for provenance, geolocation, temporal, quality
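The scheduling considerations above (prioritization, event triggers, low latency) can be illustrated with a minimal sketch. It assumes a simple in-memory priority queue in which event-triggered jobs jump ahead of routine ones. The class names, priority levels, and scene identifiers are hypothetical and are not taken from ARIA or HySDS.

```python
import heapq
from dataclasses import dataclass, field
from itertools import count

# Hypothetical priority levels: a lower number is served first.
EVENT_TRIGGERED = 0  # e.g. a disaster response request
ROUTINE = 1          # standing, systematic processing


@dataclass(order=True)
class Job:
    priority: int
    seq: int                              # tie-breaker: preserves submission order
    scene_id: str = field(compare=False)  # not used for ordering


class JobQueue:
    """In-memory priority queue of independent scene-processing jobs."""

    def __init__(self):
        self._heap = []
        self._seq = count()

    def submit(self, scene_id, priority=ROUTINE):
        heapq.heappush(self._heap, Job(priority, next(self._seq), scene_id))

    def next_job(self):
        """Return the most urgent job, or None if the queue is empty."""
        return heapq.heappop(self._heap) if self._heap else None


queue = JobQueue()
queue.submit("scene-001")
queue.submit("scene-002")
queue.submit("scene-event-A", priority=EVENT_TRIGGERED)

order = []
while (job := queue.next_job()) is not None:
    order.append(job.scene_id)
print(order)  # ['scene-event-A', 'scene-001', 'scene-002']
```

Because each scene is an independent job, workers could pull from such a queue in parallel; the sequence counter only guarantees first-in, first-out order within a priority level.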
Synthetic Aperture Radar Science Data Processing (SAR SDP) Foundry Concept
• Definition
  - A set of user-selectable components, implemented in a scalable processing environment and leveraging a common framework, for producing community-accepted Science Data Products from SAR instruments
  - Support multiple research and applied science communities
  - Community review/acceptance of processing model and subsequent improvements
  - Community defined science data products
• Components
  - Interface to Instruments which have been on-ramped
  - Production Processing Codes for Community defined Science Data Products which have been on-ramped
  - ISCE: Processing environment for instrument output
  - ARIA SDS: end-to-end SDS for SAR processing and data management
    o provides provenance, metadata, quality control, registration and workflow
  - Hybrid Cloud: Provides scalable processing environment, including AWS
  - Foundry User Interface
    o Implements Business Model
    o Permits user selection of instrument, scenes, standard data products
  - EOS-DIS designated repository provides common destination for output products
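As a rough illustration of the user-selection step (instrument, scenes, standard data product), the sketch below checks a request against a registry of on-ramped instruments. Everything here is an assumption made for illustration: the registry contents, the product names, and the `ProcessingRequest` and `validate` names are hypothetical, and the real Foundry interface is not specified on this poster.

```python
from dataclasses import dataclass

# Hypothetical registry: on-ramped instrument -> accepted product names.
ON_RAMPED = {
    "COSMO-SkyMed": {"interferogram", "coherence"},
}


@dataclass(frozen=True)
class ProcessingRequest:
    instrument: str
    scenes: tuple
    product: str


def validate(request, registry=ON_RAMPED):
    """Return a list of problems; an empty list means the request is acceptable."""
    problems = []
    if request.instrument not in registry:
        problems.append(f"instrument not on-ramped: {request.instrument}")
    elif request.product not in registry[request.instrument]:
        problems.append(f"product not accepted for this instrument: {request.product}")
    if not request.scenes:
        problems.append("no scenes selected")
    return problems


ok = ProcessingRequest("COSMO-SkyMed", ("scene-001", "scene-002"), "interferogram")
bad = ProcessingRequest("UnknownSAR", (), "interferogram")
print(validate(ok))   # []
print(validate(bad))  # two problems: unknown instrument, no scenes
```

Keeping the registry as data rather than code mirrors the idea that on-ramping a new instrument or product extends what users may select without changing the interface.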
SAR SDP Foundry Benefits
• Processing is under the control of the customer with data and funding
  - JPL can leverage their cloud interface
  - NASA can leverage OCIO SEWP Acquisition and simply use a WBS instead of a PR
  - Non-NASA collaborators, through agreement, can buy their own processing on AWS
• Processing environment is published and community-accepted
• Clearly defined processes for on-ramping instruments and data product specifications
  - Interface Control Documents publish requirements for L0 and L1 to permit processing
  - Instrument Team can account for high volume processing at initial product design
• Processing improvements are shared among the science communities
  - Example: Reliable use of Spot-pricing
• Science Data Products can become available to the communities regardless of who funded their production
  - Consistent with 2004 InSAR Working Group Workshop Summary Report (10/20/2004)
  - Can also deliver to an optional destination for immediate use
SAR SDP Foundry Funding Model
• Major Ongoing Costs
  - SDP Processing
  - Repository of Data Products
• Instrument development team
  - Produces L0 and L1 data products conforming to Foundry interface
  - On-ramping of instrument by the Foundry team at JPL
    o Foundry creates configuration model, selectable configuration file, processing pattern
    o Acceptance by Instrument Team in conjunction with Appropriate Science Team
• Research or Applied Sciences Community Science Team defines, tests, accepts, and funds data products
  - Funds JPL for Implementation in the Foundry
  - Funds Community acceptance tests of output products
• EOS-DIS funds Repository and Stewardship Functions
  - Presumably ASF Repository Functions
• Foundry technology development competes for AIST funding
• Ongoing operations should not be funded by individual PI projects
  - Maintenance of Foundry, including Help Desk, training, software maintenance
• However, data processing costs themselves would be paid for by customers as part of their Project
  - PI controls what processing to pay for
Will It Work?
• ESTO is supporting this effort as a technology demonstration
• Goal is to demonstrate that the Foundry is:
  - Scientifically valuable
  - Technically feasible and efficient
  - Cost effective
  - Supported by the user and application development community
• We look forward to working with the community to explore the Foundry’s potential
Advanced Rapid Imaging & Analysis for Monitoring Hazards, using HySDS Technology and ISCE
Once on-ramped, PIs can quickly process the data they select, to the extent that they can afford it.
Bottom Right: EcoSAR acquired image over Andros Island, Bahamas in March 2014. Top 4 right: from InSAR Workshop Summary Report, 2004, Oxnard, CA.
InSAR Scientific Computing Environment on the Cloud
The SAR SDP Foundry
[Figure labels: L-Band and P-Band (JPL); P-Band (GSFC); X-Band (Italy); X, C, S, L, P-Bands (DLR); C-Band (ESA)]
L0 or L1 SAR data from NASA and other sources, as well as new processing workflows, can be on-ramped into the Foundry. Figure shows some of the planned data sets (COSMO-SkyMed implemented).
NISAR Big Data Handling
• Average Input Data Volume to SDS: 3 TB/day
• Average Daily Production Volume: 104 TB/day
  - L0 (6 TB per day)
  - L1 (32 TB per day, L-band)
  - L2 (66 TB per day, L-band)
• Aggregate Data Generated per day
  - Sustained mean 1.2 GB/s: forward processing
  - Sustained mean 4.8 GB/s: bulk reprocessing (optional)
• New architectures needed to support data handling and throughput
  - SDS processing at JPL on-premise
  - SDS processing collocated with DAAC
  - SDS processing at public cloud
  - Hybrid cloud: on-premise and public cloud for processing and data storage
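The NISAR volume and throughput figures quoted in this section can be cross-checked with a few lines of arithmetic. This is a sketch that assumes decimal units (1 TB = 1000 GB) and an 86,400-second day; the variable and function names are illustrative only.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 s

# Daily product volumes quoted in this section, in TB per day.
daily_tb = {"L0": 6, "L1": 32, "L2": 66}
total_tb = sum(daily_tb.values())


def tb_per_day_to_gb_per_s(tb_per_day):
    """Convert TB/day to a sustained mean rate in GB/s (assumes 1 TB = 1000 GB)."""
    return tb_per_day * 1000 / SECONDS_PER_DAY


forward_gb_s = tb_per_day_to_gb_per_s(total_tb)

# Inverse check: daily volume implied by the quoted 4.8 GB/s bulk reprocessing rate.
bulk_tb_day = 4.8 * SECONDS_PER_DAY / 1000

print(total_tb)                # 104
print(f"{forward_gb_s:.2f}")   # 1.20
print(f"{bulk_tb_day:.1f}")    # 414.7
```

Under these assumptions the three product levels sum to the stated 104 TB/day, which corresponds to the stated 1.2 GB/s for forward processing, and the optional 4.8 GB/s bulk reprocessing rate is four times that.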
Expected high data throughput needs of modern SAR missions
• Sentinel 1A/1B
  - C-band SAR
  - 1.8 TB/day raw data
• NISAR (2020)
  - L-band SAR
  - Deliver >100 TB data products per day to DAACs
• SWOT (2020)
  - Bulk reprocessing requires >3 GB/second mean sustained data throughput to DAAC
SAR SDP Foundry Next Steps
• Characterize and develop on-ramping cost model
• Open HySDS adaptation for SAR SDP Foundry to beta users
• Develop charging protocols
• Define a sensor and workflow of interest to the community to on-ramp
• Exercise the Foundry with customer inputs
Hybrid Cloud Computing Science Data System (HySDS)
• Utilizes both on-premise and off-site infrastructure
  - PB-scale processing and storage purely in public cloud currently too expensive
  - Hybrid Cloud data system architecture
  - Burst out to public cloud when demand exceeds on-premise resources
• Leverage Amazon GovCloud US to address export control and firewall security issues
• Processing at low cost (up to 10X cheaper) using high-resiliency data system that can run in competitive AWS spot market
• Auto-scaling of science data system
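The hybrid "burst" behavior described for HySDS, where work goes to the public cloud only when demand exceeds on-premise resources, can be sketched as a simple placement rule. This is an illustrative assumption about the policy, not the HySDS implementation; `place_jobs` and its parameters are hypothetical names.

```python
def place_jobs(pending_jobs, on_premise_free_slots):
    """Fill on-premise capacity first, then overflow to the public cloud.

    Both arguments are non-negative integers (job count, free worker slots).
    """
    if pending_jobs < 0 or on_premise_free_slots < 0:
        raise ValueError("counts must be non-negative")
    local = min(pending_jobs, on_premise_free_slots)
    return {"on_premise": local, "public_cloud": pending_jobs - local}


print(place_jobs(50, 200))   # {'on_premise': 50, 'public_cloud': 0}
print(place_jobs(500, 200))  # {'on_premise': 200, 'public_cloud': 300}
```

A real system would also weigh data locality, transfer costs, and spot-market availability; this sketch captures only the overflow rule.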
[Figure: HySDS architecture diagram. Raw instrument L0a data, ancillary data, and instrument housekeeping (HK) data arrive from JPL MOS/GDS and are processed on the JPL on-premise cloud and/or AWS. Labeled components: Resource Management; Workflow Management (orchestrates jobs); Event Management; Auto-Scaling (Monitor Alerts & Provisions); Discovery Services (publishes metadata); Ingest Worker (extracts data); Triage Worker (writes work data); Crawler Worker (monitors); Product Localize Worker (writes data products to staging); L0-L2 Pipelines & Ingest (VMs/containers); L0-L2 PGEs (EC2 Compute VMs); L0-L2 Data Products (Object Store); Ops Work Data Cache; DAAC Staging; DAAC.]
Cloud Computing-based SAR Foundry
[Figure: layered architecture. Users: Science Users, Response Agencies, DAACs, Ground Systems Operations. Services: Discovery, Access, Compute, Storage, Analytics, UX, and Provenance Services. Infrastructure “Fabric”: VM, Bare Metal, Beowulf Clusters.]
SDS services for Faceted Discovery, Monitoring Rules, Conditional Actions, Workflow Processing, and Real-time Metrics
The SAR Foundry leverages a computing fabric layer over multiple heterogeneous infrastructures for diversification and high resiliency.
Science Data Products: Faceted search and on-demand services
Science Data System: Faceted Metrics and PROV-ES provenance
[Figure panels: number of compute nodes over time; Sentinel-1A scenes for Nepal EQ response]
Real-Time Situational Awareness of SAR Processing
Automatic Scaling of Science Data System
• The size of the data system automatically grows/shrinks based on computing needs.
• Scaled up to thousands of compute nodes
• Demonstrated capability of higher internal data throughput rates than currently expected NISAR needs
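One simple way a data system could grow and shrink with computing needs is to derive a target node count from the job backlog. The rule below is a hedged sketch under that assumption; the function name, the `jobs_per_node` parameter, and the node limits are hypothetical and do not describe the actual HySDS auto-scaling logic.

```python
import math


def desired_nodes(queued_jobs, jobs_per_node, min_nodes=0, max_nodes=1000):
    """Target node count: enough nodes for the backlog, clamped to [min_nodes, max_nodes]."""
    if jobs_per_node <= 0:
        raise ValueError("jobs_per_node must be positive")
    needed = math.ceil(queued_jobs / jobs_per_node)
    return max(min_nodes, min(max_nodes, needed))


print(desired_nodes(0, 4))       # 0    (scale in when idle)
print(desired_nodes(10, 4))      # 3    (round up to cover the backlog)
print(desired_nodes(10_000, 4))  # 1000 (capped at max_nodes)
```

The upper clamp acts as a cost guard, and the lower bound lets an operator keep a warm pool of nodes for low-latency response.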
Multiple Concurrent Platforms
Amazon Web Services
OpenStack Cloud Software
Windows Azure
Eucalyptus
Nebula