UK E-Science Initiative and its Application to SDO
J.L. Culhane
MSSL
SUMMARY
• The UK Astrogrid
• Dealing with SDO Data Volumes
• The PPARC E-Science AO
• HMI Data Products and Pipeline
What is the Grid?
Ian Foster, Argonne National Lab & University of Chicago
“A Grid is a system that:
• Coordinates resources that are not subject to centralized control.
• Uses standard, open, general-purpose protocols and interfaces.
• Delivers nontrivial qualities of service.”
- Ian Foster, “What is the Grid? A Three Point Checklist”
[Figure: the GRID linking PC, mainframe, space missions, network, laptop, phone / PDA, and printer]
UK Astrogrid
• Astrogrid is one of three major world-wide projects (along with European AVO and US-VO projects) which aim to create an astronomical Virtual Observatory
• Astrogrid has a significant Solar Physics component
• The Virtual Observatory will be a set of co-operating and interoperable software systems that:
– allow users to interrogate multiple data centres in a seamless and transparent way;
– provide powerful new analysis and visualisation tools;
– give data centres a standard framework for publishing and delivering services using their data.
How does Astrogrid work?
[Figure: a user, a web interface, and web services in front of a data archive, data storage, data transformation & processing, and a distributed network of registries]
Web Service: “A web service is any piece of software that makes itself available over the Internet and uses a standardized XML messaging system.”
- Ethan Cerami, “Top Ten FAQs for Web Services”, The O’Reilly Network
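To make the "standardized XML messaging" in the definition above concrete, here is a minimal sketch of a client building an XML request and a service reading it back. The element and attribute names (`query`, `instrument`, `timeRange`, `start`, `end`) and the sample dates are hypothetical and are not taken from any Astrogrid schema.

```python
import xml.etree.ElementTree as ET


def build_query(instrument: str, start: str, end: str) -> bytes:
    """Serialise a data request as a small XML message (hypothetical schema)."""
    root = ET.Element("query")
    ET.SubElement(root, "instrument").text = instrument
    window = ET.SubElement(root, "timeRange")
    window.set("start", start)
    window.set("end", end)
    return ET.tostring(root, encoding="utf-8")


def parse_query(message: bytes) -> dict:
    """Recover the request fields on the service side."""
    root = ET.fromstring(message)
    window = root.find("timeRange")
    return {
        "instrument": root.findtext("instrument"),
        "start": window.get("start"),
        "end": window.get("end"),
    }


# Round trip: what the client sends is what the service reads.
message = build_query("HMI", "2003-04-08T00:00:00", "2003-04-10T00:00:00")
request = parse_query(message)
```

Because both sides agree only on the message format, the client and the service can be written in different languages and run at different sites.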
Astrogrid Registry
[Figure: resources (data archive, data storage, data transformation) linked by a distributed network of registries]
Registry: “Dynamic database of metadata describing a set of Internet-available resources. A registry is used to identify and locate resources satisfying user-specified criteria, and to direct more detailed information requests to the relevant services.”
- Robert Hanisch, STScI
Registry Database
METADATA:
• Basic: ID, title, service type
• Curation: Location, contact, publisher, creator, etc.
• Metadata: Allowed methods, input / output variables, etc.
• Metadata Format: Wavelength, coordinates, instrument coverage…
Registries contain information about resources
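As a rough illustration of how a registry "identifies and locates resources satisfying user-specified criteria", the sketch below stores a few metadata records and filters them. The record fields, identifiers, and data-centre names are all invented for the example; they loosely follow the metadata groups listed above and do not represent the actual Astrogrid registry schema.

```python
from dataclasses import dataclass, field


@dataclass
class ResourceRecord:
    # Illustrative fields only: basic, curation, and coverage metadata.
    identifier: str
    title: str
    service_type: str
    publisher: str
    instruments: set = field(default_factory=set)
    wavebands: set = field(default_factory=set)


def find_resources(registry, service_type=None, instrument=None, waveband=None):
    """Return the records that match every criterion that was supplied."""
    matches = []
    for record in registry:
        if service_type is not None and record.service_type != service_type:
            continue
        if instrument is not None and instrument not in record.instruments:
            continue
        if waveband is not None and waveband not in record.wavebands:
            continue
        matches.append(record)
    return matches


# Hypothetical registry contents.
registry = [
    ResourceRecord("ex:archive-a", "Example X-ray archive", "archive",
                   "Example Data Centre A", {"Yohkoh"}, {"x-ray"}),
    ResourceRecord("ex:archive-b", "Example helioseismology archive", "archive",
                   "Example Data Centre B", {"HMI"}, {"optical"}),
    ResourceRecord("ex:transform-c", "Example calibration service", "transformation",
                   "Example Data Centre B", {"HMI"}, {"optical"}),
]

hmi_archives = find_resources(registry, service_type="archive", instrument="HMI")
```

The registry returns only descriptions of resources; the detailed data request would then go to the service named in the matching record.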
Solar Interior to Outer Atmosphere
Science goal: Connect observations of the interior to fluctuations in the solar atmosphere
Data Required: Helioseismology observations connected with solar atmosphere observations
Current difficulties: Being able to search efficiently for solar atmospheric events that may be responding to an excitation source in the interior
Grid future: Ability to:
– Search easily for events, e.g. flux emergence, AR evolution, flares, coronal mass ejections, over specific time periods
– Extract parameters over the cycle from the atmosphere and interior in order to compare their evolution
Crucial for SDO to relate convection zone observations to magnetic field data for Photosphere and above
SDO HMI Archiving and Processing
• SDO instruments generate raw data (~ 2 Tbyte/day) along with derived products
• Derived products result from pipeline processing that must keep up with the flow of incoming data
• GRID or Virtual Observatory approach could allow:
– Distributed data holding
– Distributed processing capability
• Network bandwidths and processing power at single sites set limits:
– Available network bandwidths for users could limit data transfer from/between multiple archives
– All data at one site implies considerable processing power accessible by many distributed users
Distributed Archive Approach
• Multiple copies of the data desirable
• Needs a minimum of two geographically separated sources with the advantages:
– Greater resilience in ability to supply users
– Load sharing between different providers (network and processing)
– Avoids need for single site to provide excessive processing power
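The resilience and load-sharing advantages listed above can be sketched as a simple selection rule: send each request to the least-loaded copy of the archive that is currently reachable. The `Mirror` type, site names, and load figures are hypothetical, and a real system would need health checks and a better load measure.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Mirror:
    name: str             # hypothetical site label
    available: bool       # whether the site currently answers requests
    active_requests: int  # requests currently being served


def choose_mirror(mirrors) -> Optional[Mirror]:
    """Pick the least-loaded mirror that is up, or None if all are down."""
    candidates = [m for m in mirrors if m.available]
    if not candidates:
        return None
    return min(candidates, key=lambda m: m.active_requests)


# Load sharing: the quieter site is chosen while both are up.
mirrors = [Mirror("site-a", True, 12), Mirror("site-b", True, 3)]
first_choice = choose_mirror(mirrors)

# Resilience: if that site goes down, requests fall back to the other copy.
mirrors[1].available = False
fallback = choose_mirror(mirrors)
```

With a single archive, the second case would leave users with no source at all.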
Single Archive Approach
• Solar data normally stored in a raw form and need to be processed before use
• Processing involves extraction and calibration of selected observations.
• For data (e.g. helioseismology data) involving extended time intervals, processing data at source is desirable
• Advantages that result:
– Reduced amount of information to be returned to user
– Affords the instrument teams more control over the processing and quality of their data products
but
• Heavy loading of processors at single archive site unless requests are for high-level lower-volume data products
Network Issues
• UK has “SuperJanet” backbone currently at 10 Gbps
• Local access points operate at 2.5 Gbps (e.g. UCL interconnect rate to backbone)
• Europe has “Geant” backbone at 10 Gbps covering UK, France, Germany, Sweden, Switzerland with 2.5 Gbps local interconnects
• Transatlantic connection to Geant currently 2.5 Gbps with upgrade to 10 Gbps planned for 2004
• Discussion of “Global” 1 Tbps network by 2006?
• Geant driven in part by needs of HEP community for LHC – hence SDO may not have a problem in moving data between sites
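A back-of-envelope check of the claim that SDO may not have a problem moving data: the sketch below compares the ~2 Tbyte/day raw data rate quoted earlier with the 2.5 Gbps and 10 Gbps link speeds. It assumes 1 Tbyte = 10^12 bytes and that the whole link is available to SDO traffic (the `efficiency` parameter, an addition for this example, lets that be relaxed).

```python
SECONDS_PER_DAY = 24 * 60 * 60
DAILY_VOLUME_BYTES = 2e12  # ~2 Tbyte/day, assuming 1 Tbyte = 10**12 bytes


def sustained_rate_bps(bytes_per_day: float) -> float:
    """Average bit rate needed to move one day's data in one day."""
    return bytes_per_day * 8 / SECONDS_PER_DAY


def transfer_time_hours(volume_bytes: float, link_bps: float,
                        efficiency: float = 1.0) -> float:
    """Hours needed to move a volume over a link at the given usable fraction."""
    return volume_bytes * 8 / (link_bps * efficiency) / 3600


rate_mbps = sustained_rate_bps(DAILY_VOLUME_BYTES) / 1e6
hours_at_2g5 = transfer_time_hours(DAILY_VOLUME_BYTES, 2.5e9)
hours_at_10g = transfer_time_hours(DAILY_VOLUME_BYTES, 10e9)

print(f"Sustained rate needed: {rate_mbps:.1f} Mbps")
print(f"One day's data over 2.5 Gbps: {hours_at_2g5:.2f} h")
print(f"One day's data over 10 Gbps:  {hours_at_10g:.2f} h")
```

Under these idealised assumptions a day's raw data needs about 185 Mbps sustained, or roughly 1.8 hours on a 2.5 Gbps link and under half an hour at 10 Gbps. Shared links would deliver less, so the margin is smaller in practice.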
PPARC E-Science AO
• Proposals due by 31st May, 2003
• Existence of first level Astrogrid infrastructure assumed
• Proposals should:
– Be for the application of infrastructure and related techniques to “real” data sets
– Underpin science but close connection between projects and the science programme is essential
– Demonstrate an enabling role for eventual science exploitation
– Ensure development of standards and deployment of Grid infrastructure
• SDO bid is now anticipated by PPARC
HMI Data Analysis Pipeline
[Figure: HMI data analysis pipeline flow chart, Version 1.2w, with a "Data Product" column. Box labels recovered from the chart:]
• Brightness feature maps
• Solar limb parameters
• Vector magnetograms, fast algorithm
• Vector magnetograms, inversion algorithm
• Doppler velocity
• Heliographic Doppler velocity maps
• Tracked tiles of Dopplergrams
• Stokes I, V
• Filtergrams
• Continuum brightness
• Tracked full-disk 1-hour averaged continuum maps
• Stokes I, Q, U, V
• Full-disk 10-min averaged maps
• Tracked tiles
• Line-of-sight magnetograms
• Egression and ingression maps
• Time-distance cross-covariance function
• Ring diagrams
• Wave phase shift maps
• Wave travel times
• Local wave frequency shifts
• Spherical harmonic time series to l=1000
• Mode frequencies and splitting
• Brightness images
• Line-of-sight magnetic field maps
• Coronal magnetic field extrapolations
• Coronal and solar wind models
• Far-side activity index
• Deep-focus v and cs maps (0-200 Mm)
• High-resolution v and cs maps (0-30 Mm)
• Carrington synoptic v and cs maps (0-30 Mm)
• Full-disk velocity, v(r,Θ,Φ), and sound speed, cs(r,Θ,Φ), maps (0-30 Mm)
• Internal sound speed, cs(r,Θ) (0<r<R)
• Internal rotation, Ω(r,Θ) (0<r<R)
• Vector magnetic field maps
HMI Data Processing
Enabling Code/Algorithms
Net Access/Mirror
HMI SRR/SCR Presentation April 8-10
HMI Science Data Analysis Plan
Science Exploitation
HMI Data Volumes
Net Access
END OF TALK
What is Astrogrid?
Astrogrid is a £5 M data grid project that will link data archives, resources, and disciplines from UK space institutions into a virtual observatory.
Data Archives
• Mullard Space Science Laboratory
• Rutherford Appleton Laboratory
• University of Cambridge
• University of Leicester
• Royal Observatory Edinburgh
• Queen's University Belfast
• Jodrell Bank Observatory
Resources
• Datasets
• Processors
• Storage
• Other virtual observatories
Disciplines
• Astrophysics
• Solar Physics
• Solar Terrestrial Physics
GRID/Virtual Observatory
Within a virtual observatory:
• Datasets need not all be stored at a single site
• Metadata and registries allow the system to handle a distributed archive.
• Different organisations or countries could host the different datasets or different parts of the datasets (e.g. split by time).
• Complete catalogues relating to particular datasets should be held wherever the data are held.
• Distributed data holding reduces the pressure on:
– Network connection to an archive
– Processing capabilities needed at the archive site
• Most accessed data could be selectively copied to distributed archives e.g. EGSO, Astrogrid
• Derived data products should be held at distributed sites
• Material needed for more detailed searches should be described by metadata in appropriate registries.
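One way to picture a dataset "split by time" across hosts is a small lookup, of the kind a registry could hold, that maps a requested time interval to the sites holding the matching data. The site names and date ranges below are invented for illustration.

```python
from datetime import datetime

# Hypothetical holdings: (site, start, end), with the end date exclusive.
HOLDINGS = [
    ("site-a", datetime(2008, 1, 1), datetime(2009, 1, 1)),
    ("site-b", datetime(2009, 1, 1), datetime(2010, 1, 1)),
    ("site-c", datetime(2010, 1, 1), datetime(2011, 1, 1)),
]


def sites_for_interval(start, end, holdings=HOLDINGS):
    """Return the sites whose holdings overlap the half-open interval [start, end)."""
    return [site for site, held_from, held_to in holdings
            if start < held_to and end > held_from]


# A request spanning a year boundary is served by two sites.
needed = sites_for_interval(datetime(2008, 12, 15), datetime(2009, 2, 1))
```

A request that falls inside one host's range touches only that host, which is how the split relieves both the network connection and the processing load at any single archive.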
Example: Solar / Stellar Flares
Science Problem: A solar physicist studying the flare mechanism would like to gather data on both solar and stellar flares.
Data Required: X-ray datasets: lightcurves, spectra, and redshift / blueshift information from SOHO, Yohkoh, EXOSAT, ROSAT, XMM, Chandra, etc.
Current Issues: No stellar flare catalogue (at the time the science problem was written); datasets provided by several different archives with no common interface.
[Figure: Solar Flare Catalogue #1, Solar Flare Catalogue #2, the Yohkoh, Solar-B, XMM, and Chandra archives, a user web interface, a merged solar flare list, and a new stellar flare catalogue]
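The "merged solar flare list" in this example could be produced along the following lines. This is only a sketch under stated assumptions: each catalogue is reduced to (start time, source) pairs, entries starting within five minutes of the previous merged event are treated as the same flare, and all sample values are made up.

```python
from datetime import datetime, timedelta


def merge_flare_lists(*catalogues, tolerance=timedelta(minutes=5)):
    """Merge catalogues of (start_time, source) entries into one time-ordered list.

    An entry whose start time falls within `tolerance` of the previous merged
    event is treated as the same flare and its source is added to that event.
    """
    entries = sorted(entry for catalogue in catalogues for entry in catalogue)
    merged = []
    for start, source in entries:
        if merged and start - merged[-1]["start"] <= tolerance:
            merged[-1]["sources"].append(source)
        else:
            merged.append({"start": start, "sources": [source]})
    return merged


# Made-up sample entries for two catalogues.
catalogue_1 = [
    (datetime(2003, 1, 1, 10, 0), "catalogue-1"),
    (datetime(2003, 1, 2, 8, 30), "catalogue-1"),
]
catalogue_2 = [
    (datetime(2003, 1, 1, 10, 3), "catalogue-2"),
    (datetime(2003, 1, 3, 14, 0), "catalogue-2"),
]

merged = merge_flare_lists(catalogue_1, catalogue_2)
```

Matching on start time alone is a simplification; a real merge would probably also compare position on the disk and flare class before declaring two entries to be the same event.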
HMI Data Archive
HMI Data Flow
HMI Dataflow Concept
HMI Standard Data Products
UK Astrogrid Scientific Aims
• Improve the quality, efficiency, ease, speed, and cost-effectiveness of on-line astronomical research
• Make comparison and integration of data from diverse sources seamless and transparent
• Remove data analysis barriers to interdisciplinary research
• Make science involving manipulation of large datasets as easy and as powerful as possible.
UK Astrogrid Practical Goals
• Develop, with our IVOA partners (including European Grid of Solar Observations/EGSO), internationally agreed standards for data, metadata, data exchange and provenance
• Develop a software infrastructure for data services
• Establish a physical grid of resources shared by AstroGrid and key data centres
• Construct and maintain an AstroGrid Service and Resource Registry
• Implement a working Virtual Observatory system based around key UK databases and of real scientific use to astronomers
• Provide a user interface to that VO system
• Provide, either by construction or by adaptation, a set of science user tools to work with that VO system
• Establish a leading position for the UK in VO work