Astrophysics on the OSG (LIGO, SDSS, DES)
Kent Blackburn
LIGO Laboratory
California Institute of Technology
Open Science Grid Consortium Meeting
University of Florida
January 23, 2006
Outline and Contributors
- LIGO on the OSG: Kent Blackburn, Duncan Brown, Albert Lazzarini, David Meyers
- SDSS, NEO & DES on the OSG: Nickolai Kuropatkin, Neha Sharma, Chris Stoughton, James Annis, Steve Kent
Gravitational Wave Physics on the OSG
Laser Interferometer Gravitational wave Observatory (LIGO)
LIGO Scientific Collaboration (LSC)
LIGO on the Open Science Grid
- Search for Gravitational Waves
  - Hanford, WA
  - Livingston, LA
  - Plus GEO, TAMA and VIRGO
- LIGO Scientific Collaboration
  - ~40 Institutions worldwide
  - ~400 individuals contributing
- LIGO Data Grid (LDG)
  - Nine Grid Sites
  - Over 2000 CPUs
  - Multi-Petabyte Data Archive at Caltech
- Scientific Data Collection grouped into temporal “Science Runs”
  - Currently in Science Run 5
  - Goal to collect one year “plus” of design sensitivity data
  - One Terabyte of data each day
- Analysis carried out primarily on the LIGO Data Grid (LDG)
  - Stepping out onto the OSG
http://www.ligo.caltech.edu
LIGO Data Analysis Classifications
- Principal Classifications of Searches
  - Binary Inspiral (Neutron Stars & Black Holes): consumes bulk of LIGO Data Grid resources
  - Burst (Supernovae and other Unmodeled Events): coincidence between different data streams necessary
  - Stochastic Background (Similar to the CMB): computationally least demanding but requires cross correlation
  - Periodic (Pulsars, Rotating Neutron Stars): signal sinusoidal in reference frame of source; All Sky Survey could promote Global Warming (Order 10^20 FLOPS)
- Binary Inspiral Search selected for initial adoption onto the OSG
  - Workflow well suited to Open Science Grid
  - Already using a similar set of Grid Technologies within LIGO Data Grid
  - Simple parametric parallelization of algorithms
  - Optimal filtering of data against tens of thousands of waveforms
  - Computationally demanding but interesting on the scale of the OSG
- Expect other searches to follow once OSG trailblazing work is done
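To make "optimal filtering of data against tens of thousands of waveforms" and "simple parametric parallelization" concrete, here is a minimal sketch, not the LIGO pipeline. It assumes white noise, so the matched filter reduces to a sliding inner product. The toy chirp family, the four-template bank, the noise level and the injected signal are all hypothetical.

```python
import math
import random

def chirp_template(n, f0, f1):
    """Unit-norm toy chirp sweeping from f0 to f1 cycles/sample over n samples."""
    raw = [math.sin(2 * math.pi * (f0 * t + (f1 - f0) * t * t / (2 * n)))
           for t in range(n)]
    norm = math.sqrt(sum(x * x for x in raw))
    return [x / norm for x in raw]

def filter_one_template(data, template):
    """Slide one template over the data; return (peak |inner product|, offset)."""
    m = len(template)
    best_stat, best_offset = 0.0, 0
    for offset in range(len(data) - m + 1):
        stat = abs(sum(data[offset + k] * template[k] for k in range(m)))
        if stat > best_stat:
            best_stat, best_offset = stat, offset
    return best_stat, best_offset

# Hypothetical four-template bank; a real bank holds tens of thousands.
bank = [chirp_template(128, 0.05, f1) for f1 in (0.10, 0.15, 0.20, 0.25)]

# Synthetic data: weak white noise plus template 2 injected at sample 300.
rng = random.Random(0)
data = [rng.gauss(0.0, 0.2) for _ in range(1024)]
for k, x in enumerate(bank[2]):
    data[300 + k] += 8.0 * x

# Each template is filtered independently of the others, so every call
# below could run as its own grid job (parametric parallelization).
results = [filter_one_template(data, tmpl) for tmpl in bank]
best_index = max(range(len(bank)), key=lambda i: results[i][0])
best_stat, best_offset = results[best_index]
print(best_index, best_offset, round(best_stat, 2))
```

Because no call depends on another, the list comprehension over the bank is the part that maps naturally onto many independent workflow nodes.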
Binary Inspiral Search Experiences on the Open Science Grid
- First attempt at July, 2005 OSG Consortium Meeting in Milwaukee, Wisconsin
  - Unsuccessful at submitting a binary inspiral workflow at any OSG site
  - Authentication was primary reason for failures (LIGO VO not part of 0.2.1)
  - Other issues discovered with the version of VDS distributed in 0.2.1
- First successful completion of a binary inspiral workflow October 1st, 2005 on LIGO’s OSG Integration Testbed Cluster at Caltech
  - Eight-node, dual-CPU cluster with two terabytes of disk space
  - Running a “patched” version of VDS on top of OSG 0.2.1
  - Used a test workflow involving ~38 GB of LIGO data and about 700 DAG nodes
- Followed up by running at LIGO’s OSG Production sites at PSU (PBS) and UWM (Condor) (once VDS patch applied at each)
- Collaborated with several CMS resources to further test outside LIGO’s VO
  - Worked with clusters at San Diego, Nebraska and Caltech
  - All clusters added LIGO’s VOMS to allow authentication
  - Updated OSG 0.2.1 with VDS patches
  - Mixed results due to size of LIGO data sets transferred for this test workflow
- Worked with Deployment and Integration Teams to assure LIGO’s functional requirements appeared in the OSG 0.4 software stack (just announced!)
Greatly Simplified LIGO DAG
LIGO’s Next Move on the OSG
- The OSG 0.4.0 release should greatly improve the OSG for LIGO’s Binary Inspiral Workflow
- A workflow geared toward actually conducting a scientific study would involve at least 16000 DAG nodes and close to two terabytes of data.
  - Recent OSG motivated activities in LIGO have produced a nearly 10:1 reduction in data through improved data selection and compression
- Need to develop more flexible workflows that don’t challenge the limited data storage resources typical of a present day OSG site
  - Pegasus is used to construct concrete DAGs from abstract DAX workflows
  - Flexibility here to recognize and adapt to OSG site specifics could facilitate greater utilization of the OSG as an abstract “Grid”
- Develop ability to benefit from Storage Resource Management
  - Typical LIGO data analyses benefit from being able to repeat the analysis on the same data set with improved calibration and selection criteria
  - LIGO is currently bringing up an SE on our local ITB cluster at Caltech to experiment with SRM
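For a sense of scale, the quoted workflow sizes can be turned into rough per-node averages. This is back-of-envelope arithmetic only. It assumes decimal units (1 TB = 1000 GB), that data is spread evenly over DAG nodes (it almost certainly is not), and that the 10:1 reduction applies to the two-terabyte figure, which the slides do not state.

```python
GB_PER_TB = 1000  # decimal units, an assumption

# Figures quoted in the slides.
test_gb, test_nodes = 38, 700                    # October 2005 test workflow
study_gb, study_nodes = 2 * GB_PER_TB, 16000     # science-scale workflow
reduction = 10                                   # "nearly 10:1"

# Average data per DAG node, in megabytes (illustrative only).
test_mb_per_node = test_gb * 1000 / test_nodes
study_mb_per_node = study_gb * 1000 / study_nodes

# Science-scale volume if the 10:1 reduction applied to it (assumption).
study_gb_reduced = study_gb / reduction

print(round(test_mb_per_node, 1), study_mb_per_node, study_gb_reduced)
```

Under these assumptions, a science-scale run involves roughly 23 times as many nodes as the test workflow.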
Astronomy on the OSG
Sloan Digital Sky Survey (SDSS)
Experimental Astrophysics Group (EAG)
Fermi National Accelerator Laboratory
Near Earth Objects
- Near Earth Objects (NEOs)
  - Comets and asteroids nudged by the gravitational attraction of planets into orbits that pass by the Earth's neighborhood
  - Composed of water ice and dust, formed early in the history of the Solar System
  - The scientific interest in comets and asteroids is due to their being remnants of the early solar system; the interest in NEOs is their potential for hitting the Earth…
- 37 Near Earth Object candidates are identified in the SDSS imaging data
  - Apparent magnitudes r = 19 – 21 and proper motions of 1.3 to 18 degrees per day
  - The Earth collision rate for this population (size greater than 20 m) is estimated to be one per century
How to find Near Earth Objects
[Figure: a small portion of an SDSS imaging “run”, showing camcols 1 to 6 across 5 fields]
- SDSS imaging data consist of 6 strips called “camcols”.
- The above image is a small portion of a “run” that extends for 800 fields.
- Near Earth Objects show up as streaks in 3 colors.
NEO Workflow
[Diagram: NEO workflow spanning the SDSS Cluster, the TAM Cluster and four OSG sites]
- Sites: SDSS Cluster, TAM Cluster, OSG-Site (x4)
- Storage and services: GFS Disk, Local Disk, RLS Server on TAM, Compute Node Local Disk
- Workflow generation: VDL Generation, VDL2XML, Abstract DAX Creation, Concrete DAG Creation, Condor Submit DAG
- Data handling: Register Input Files, Query RLS to Create Condor Submit File, Copy Input to Local Disk, Copy Output Back, Register Output Files, Transfer Output Back to TAM
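As an illustration of what the "Concrete DAG Creation" and "Condor Submit DAG" steps might hand to Condor, the sketch below writes a DAGMan input file with one chain of nodes per camcol. It is not the tooling used by the authors. The node names, the four-step chain, the `.sub` submit-file names and the run number `1234` are hypothetical; only the `JOB`, `VARS` and `PARENT ... CHILD` keywords are standard DAGMan syntax.

```python
# Hypothetical per-camcol chain, loosely following the data-handling
# labels in the workflow diagram; "run_neo" is an assumed compute step.
STEPS = ["stage_in", "run_neo", "stage_out", "register_output"]

def neo_dag(run, camcols=range(1, 7)):
    """Return DAGMan text with one independent chain of nodes per camcol."""
    lines = []
    for camcol in camcols:
        nodes = [f"{step}_{run}_{camcol}" for step in STEPS]
        for step, node in zip(STEPS, nodes):
            # Each node points at an (assumed) submit file for its step.
            lines.append(f"JOB {node} {step}.sub")
            lines.append(f'VARS {node} run="{run}" camcol="{camcol}"')
        # Within a camcol, every step waits for the previous one.
        for parent, child in zip(nodes, nodes[1:]):
            lines.append(f"PARENT {parent} CHILD {child}")
    return "\n".join(lines) + "\n"

dag_text = neo_dag("1234")
print(dag_text)
```

The text could be saved to a file and passed to `condor_submit_dag`. Since the six camcol chains share no edges, DAGMan would be free to run them in parallel.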
NEO Job Statistics
Total Jobs: 180
Total Input Data: 9*180 = 1620 GB
Total Output Data: 12*180 = 2160 K
[Diagram: data handled per job]
- Run-Rerun (Avg: 150 Fields), split into camcol 1 to camcol 6
- 6 tar balls (Avg: 1.5*6 = 9 GB): Neo-$run-$camcol-Input.tar
- Neo-Executable
- 6 Neo Par files (Avg: 2*6 = 12 K): neo-00$run-$camcol-$rerun.par
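The totals on this slide follow directly from the per-camcol averages. The short sketch below reproduces the arithmetic and expands the two file-name patterns. The run and rerun values (`1234`, `40`) are made up for illustration.

```python
CAMCOLS = 6
JOBS = 180
TARBALL_GB = 1.5   # average input tar ball per camcol
PAR_KB = 2         # average output par file per camcol

input_per_job_gb = TARBALL_GB * CAMCOLS       # 1.5 * 6
output_per_job_kb = PAR_KB * CAMCOLS          # 2 * 6
total_input_gb = input_per_job_gb * JOBS      # 9 * 180
total_output_kb = output_per_job_kb * JOBS    # 12 * 180

def input_tarballs(run):
    """Expand Neo-$run-$camcol-Input.tar for all six camcols."""
    return [f"Neo-{run}-{camcol}-Input.tar" for camcol in range(1, CAMCOLS + 1)]

def par_files(run, rerun):
    """Expand neo-00$run-$camcol-$rerun.par for all six camcols."""
    return [f"neo-00{run}-{camcol}-{rerun}.par" for camcol in range(1, CAMCOLS + 1)]

print(total_input_gb, total_output_kb)
print(input_tarballs("1234")[0], par_files("1234", "40")[0])
```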
Quasar Spectra Fitting using SDSS
- Quasars are supermassive black holes. Swirling clouds of gas and plasma falling into a black hole glow at many different wavelengths. We measure the spectrum of the light to measure the properties of each quasar.
- The SDSS provides us with 50,000 quasar spectra. We make fits to these spectra that include the following components:
  - Power-law continuum, decreasing as e-
  - A Balmer continuum due to ionized Hydrogen, with a characteristic bump from 2000 to 4000 Angstroms
  - Strong emission lines from ionized gas, such as Hydrogen, Nitrogen, Oxygen, and Magnesium
  - Many faint emission lines from Iron
  - Starlight from the galaxy that surrounds the quasar
Example Quasar Spectrum with Fit
Quasar Fit Production: Science using the Generic Grid Gofer (GGG)
- All jobs are stored in “jobs” table.
- Available grid sites are stored in “pool” table.
- Job Manager takes jobs from the database, creates Condor DAG files and submits them to sites from the pool in an automatic mode.
- Two main parts – Job Manager and DAG Creator
- All completed stages of a job are recorded in the database together with submission time and execution time
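The jobs/pool bookkeeping described above can be sketched with an in-memory SQLite database. This is a guess at the general shape, not the GGG implementation. The column names, the site names and the fill-sites-in-order policy are assumptions, and the DAG creation and actual Condor submission are left out.

```python
import sqlite3

def create_tables(conn):
    """Create hypothetical 'jobs' and 'pool' tables."""
    conn.executescript("""
        CREATE TABLE jobs (
            id INTEGER PRIMARY KEY,
            spectrum TEXT,
            stage TEXT DEFAULT 'new',
            site TEXT,
            submitted_at REAL,
            exec_seconds REAL
        );
        CREATE TABLE pool (site TEXT PRIMARY KEY, free_slots INTEGER);
    """)

def dispatch(conn, now):
    """Assign waiting jobs to sites with free slots; return [(job_id, site)]."""
    assigned = []
    waiting = [row[0] for row in conn.execute(
        "SELECT id FROM jobs WHERE stage = 'new' ORDER BY id")]
    sites = conn.execute(
        "SELECT site, free_slots FROM pool ORDER BY site").fetchall()
    for site, free_slots in sites:
        batch, waiting = waiting[:free_slots], waiting[free_slots:]
        for job_id in batch:
            # Record the stage change together with the submission time.
            conn.execute(
                "UPDATE jobs SET stage = 'submitted', site = ?, submitted_at = ? "
                "WHERE id = ?", (site, now, job_id))
            assigned.append((job_id, site))
        conn.execute(
            "UPDATE pool SET free_slots = free_slots - ? WHERE site = ?",
            (len(batch), site))
    return assigned

def record_completion(conn, job_id, exec_seconds):
    """Mark a job done and store its execution time."""
    conn.execute(
        "UPDATE jobs SET stage = 'done', exec_seconds = ? WHERE id = ?",
        (exec_seconds, job_id))

conn = sqlite3.connect(":memory:")
create_tables(conn)
conn.executemany("INSERT INTO jobs (spectrum) VALUES (?)",
                 [(f"spec-{i}",) for i in range(5)])
conn.executemany("INSERT INTO pool VALUES (?, ?)",
                 [("site_a", 2), ("site_b", 1)])

assigned = dispatch(conn, now=0.0)
record_completion(conn, assigned[0][0], exec_seconds=42.0)
print(assigned)
```

Keeping the state in the database means the manager can be restarted and pick up the remaining `new` jobs on its next pass.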
Workflow in Generic Grid Gofer
Nickolai Kuropatkin
Astronomy Experiences on the Grid
- Experience tells us that Grid is more suitable for CPU Intensive Jobs … achieve parallelism … more jobs … finish sooner
- Running locally would limit the number of jobs run simultaneously
- On OSG, can run several run-reruns and camcols within a run-rerun in parallel
- Current Workflow also will facilitate further analysis

                             Spectra             NEO
                             (CPU Intensive)     (Data & CPU Intensive)
Grid Match                   Ideal for Grid      Grid not very happy
Total No. of Jobs            ~50000              180
Data Input/Job               1 Megabyte          9 Gigabytes
Data Output/Job              2 Megabytes         12 Kilobytes
Avg. Rate of Job Completion  800-1200 per day    10-15 per day ?
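The completion rates in the table imply rough wall-clock times for each campaign. The sketch below simply divides job counts by the quoted rates. It assumes the rates stay constant over the whole campaign, and the NEO rate carries a question mark in the source, so its range is especially tentative.

```python
def days_to_finish(total_jobs, rate_low, rate_high):
    """Return (fastest, slowest) days to finish at the given jobs-per-day rates."""
    return total_jobs / rate_high, total_jobs / rate_low

# Job counts and rates taken from the table.
spectra_days = days_to_finish(50000, 800, 1200)
neo_days = days_to_finish(180, 10, 15)

# Total input volume per campaign, in gigabytes (decimal units assumed).
spectra_input_gb = 50000 * 1 / 1000   # 1 MB per job
neo_input_gb = 180 * 9                # 9 GB per job

print([round(d, 1) for d in spectra_days], neo_days)
print(spectra_input_gb, neo_input_gb)
```

Although the spectra campaign has far more jobs, it moves much less input data in total, which is consistent with the table's "Ideal for Grid" versus "Grid not very happy" verdicts.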
Future Grid Projects in Astronomy
In the coming year (2005-2006) the Experimental Astrophysics Group (EAG) has 4 projects planned for the Open Science Grid:
- The Simulation effort for the Dark Energy Survey (DES)
- Genetic algorithm fitting of Sloan Digital Sky Survey (SDSS) Quasar Spectra
- Search for Near Earth Asteroids (NEOs) in the SDSS Imaging data
- The Co-addition of the SDSS Southern Stripe