
Ecss des

Date post: 14-Jul-2015
Upload: raminder-singh
Efficient Production of Synthetic Skies for the Dark Energy Survey Raminder Singh Science Gateways Group Indiana University, Bloomington. [email protected]
Transcript
Page 1: Ecss des

Efficient Production of Synthetic Skies for the Dark Energy Survey

Raminder Singh
Science Gateways Group

Indiana University, Bloomington. [email protected]

Page 2: Ecss des

Background & Explanation

• First project to combine four different statistical probes of dark matter and dark energy. The four methods are quasi-independent:
  • Baryon acoustic oscillations in the matter power spectrum.
  • Abundance and spatial distribution of galaxy groups and clusters.
  • Weak gravitational lensing by large-scale structure.
  • Type Ia supernovae.

• 5000-square degree survey of cosmic structure traced by galaxies

• Simulation Working Group is generating simulations for galaxy yields in various cosmologies

• Analysis of these simulated catalogs offers a quality assurance capability for cosmological and astrophysical analysis of upcoming DES telescope data.

Page 3: Ecss des

Blind Cosmology Challenge (BCC)

Figure: Processing steps to build a synthetic galaxy catalog. The XBaya workflow currently controls the top-most element (N-body simulations), which consists of methods to sample a cosmological power spectrum (ps), generate an initial set of particles (ic), and evolve the particles forward in time with Gadget (N-body). The remaining methods are run manually on distributed resources.

• N-body simulation to simulate the dark matter
• Post-processed using an empirical approach to link galaxy properties to dark matter structures (halos)
• Gravitational lensing shear (new Spherical Harmonic Tree code)
• ADDGALS methodology, empirical tuning.

Page 4: Ecss des

Project Plan

• The project requested ECSS support to automate and optimize processing on XSEDE resources.

• ECSS staff member Dora Cai worked with the group on the work plan.

• The work plan was discussed and a timeline was given to each task.

• The work plan called for porting all codes to TACC Ranger.

Page 5: Ecss des

Main Goals

• Automated (re)submission of jobs to remove the inefficiency of manual submission
  • The walltime limit of the TACC Ranger long queue is 48 hrs, so jobs need to restart.
  • Multiple simulations can run in parallel.
• Automatic data archival
  • Data need to be archived on Ranch.
  • Data need to be moved to SLAC for post-processing.
• Provenance
• User portal
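
The automated resubmission goal can be sketched as a do-while style loop: submit one segment bounded by the queue's walltime, then resubmit from the last restart point until the run is complete. This is a minimal illustration in Python, not the project's workflow code. `run_segment`, `run_with_restarts`, and the hour-based progress accounting are hypothetical stand-ins for a real batch submission that writes and reads restart files.

```python
WALLTIME_LIMIT_HOURS = 48  # Ranger long-queue limit quoted on the slide


def run_segment(hours_done, total_hours_needed, walltime=WALLTIME_LIMIT_HOURS):
    """Hypothetical stand-in for one batch job.

    Advances the simulation by at most `walltime` hours and returns the new
    cumulative progress, as if it had been read back from a restart file.
    """
    return min(hours_done + walltime, total_hours_needed)


def run_with_restarts(total_hours_needed, max_submissions=100):
    """Resubmit segments until the run finishes; return the submission count."""
    hours_done = 0
    submissions = 0
    while True:  # do-while: always submit at least once
        hours_done = run_segment(hours_done, total_hours_needed)
        submissions += 1
        if hours_done >= total_hours_needed:
            return submissions
        if submissions >= max_submissions:
            raise RuntimeError("gave up: too many resubmissions")
```

Under these assumptions, a run that needs 100 hours of compute takes three submissions (48 + 48 + 4 hours). The `max_submissions` cap is a design choice added here so that a job that never progresses cannot resubmit forever.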

Page 6: Ecss des

Development Roadmap

XSEDE Quarter    Developments    Lessons Learned

July - Sept 2011
• Deployed/tested the N-body simulation code on Ranger.
• Developed/tested individual codes using Apache Airavata.
• Developed BCC Parameter Maker for workflow configuration.
• Added module loading support on Ranger using GRAM (RSL parameters mod, mod_del).

Sept - Dec 2011
• Enhanced the Parameter Maker script to accept inputs from the Airavata workflow.
• Full workflow run for a single, smaller N-body simulation.
• Ranger's queue walltime limit is 48 hrs.

Jan - Mar 2012
• Medium-scale testing.
• Restart capability using Do-While constructs.
• Running multiple simulations in parallel for production runs.
• Evaluated file movement using GridFTP, bbcp, and Globus Online.
• The Globus Online client library is required to initiate transfers from the workflow.

Page 7: Ecss des

Development Roadmap

XSEDE Quarter    Developments    Lessons Learned

April - June 2012
• Ran a few more production simulations, 2 in parallel.
• Changed I/O routines to fix issues.
• Able to produce full simulation data for post-processing.
• Lustre issue with too many files created in a single folder.
• GRAM canceling jobs after a few hours in case of a connection timeout with the client.

July - Sept 2012
• Investigated the job canceling issue and reported it to the Globus team.
• Changed the CoG API to add debug statements to debug the issue.
• Ran more production boxes.
• Migration to SDSC Trestles.
• Globus team confirmed that the issue is not on the server side, but the client didn't give a proper error to debug.
• MPI library loading and RSL parameter issues resolved using GRAM on Trestles.

Oct - Dec 2012
• Able to run 2 full 4-box simulations using the workflow.
• Able to resolve the job cancel issue for the latest run.
• Restart files were not written properly because of a new exception.
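
The Lustre problem listed for April - June 2012 (too many files created in a single folder) is commonly avoided by spreading output files across subdirectories. The slides say only that the I/O routines were changed, so the sketch below illustrates one generic mitigation and is not the fix the team applied. The function name `sharded_path`, the `files_per_dir` value, and the example paths are all hypothetical.

```python
from pathlib import PurePosixPath


def sharded_path(root, filename, index, files_per_dir=1000):
    """Place the file with sequence number `index` into a numbered subdirectory.

    Files 0..999 go to <root>/0000, files 1000..1999 to <root>/0001, and so on,
    so no single directory holds more than `files_per_dir` entries.
    """
    if index < 0:
        raise ValueError("index must be non-negative")
    subdir = f"{index // files_per_dir:04d}"
    return str(PurePosixPath(root) / subdir / filename)
```

For example, under these assumptions `sharded_path("/scratch/run", "snap.1500", 1500)` maps to `/scratch/run/0001/snap.1500`.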

Page 8: Ecss des

XBaya Workflow

• Three large-volume realizations with 2048^3 particles using 1024 processors
• One smaller volume of 1400^3 particles using 512 processors.
• These 4 boxes need about 300,000 SUs
• Each cosmology simulation entails roughly 56 TB of data.
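
To give a sense of scale, the particle counts can be worked out directly. The sketch below assumes the box sizes on this slide are cubes, that is, 2048 and 1400 particles per side raised to the third power (the exponent appears to have been lost in the transcript). The constant and function names are illustrative only.

```python
LARGE_BOX_SIDE = 2048   # particles per side, large-volume realizations
SMALL_BOX_SIDE = 1400   # particles per side, smaller volume
N_LARGE_BOXES = 3


def particles(side):
    """Number of particles in a cubic box with `side` particles per edge."""
    return side ** 3


large = particles(LARGE_BOX_SIDE)                  # 8,589,934,592
small = particles(SMALL_BOX_SIDE)                  # 2,744,000,000
total = N_LARGE_BOXES * large + small              # 28,513,803,776

print(f"large box: {large:,} particles")
print(f"small box: {small:,} particles")
print(f"all four boxes: {total:,} particles")
```

On this reading, one set of four boxes evolves roughly 28.5 billion particles.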

Page 9: Ecss des

Workflow Description

• BCC Parameter Maker
  This initial setup code is written as a Python script and prepares the necessary configuration and parameter files for the workflow execution.

• CAMB
  The CAMB (Code for Anisotropies in the Microwave Background) application computes the power spectrum of dark matter, which is necessary for generating the simulation initial conditions.

• 2LPTic
  The Second-order Lagrangian Perturbation Theory initial conditions code (2LPTic) is an MPI-based C code that computes the initial conditions for the simulation from parameters and an input power spectrum generated by CAMB.

• LGadget
  The LGadget simulation code is an MPI-based C code that uses a TreePM algorithm to evolve a gravitational N-body system. The outputs of this step are system state snapshot files, lightcone files, and some properties of the matter distribution, including diagnostics such as total system energies and momenta. The total output from LGadget depends on the resolution and the number of system snapshots stored, and approaches 10 terabytes for large DES simulation volumes.
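
The data dependencies between the four steps can be sketched as a simple sequential pipeline in which each stage consumes what the previous stages produced. This is an illustration in Python only: it does not use the Airavata or XBaya APIs, and the stage functions, the `Context` dictionary, and the placeholder string values are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Context = Dict[str, str]


def parameter_maker(ctx: Context) -> Context:
    # Prepares configuration and parameter files for the later stages.
    return {**ctx, "params": "bcc_params"}


def camb(ctx: Context) -> Context:
    # Computes the dark matter power spectrum from the parameters.
    return {**ctx, "power_spectrum": f"ps({ctx['params']})"}


def two_lpt_ic(ctx: Context) -> Context:
    # Builds initial conditions from the parameters and the power spectrum.
    ic = f"ic({ctx['params']}, {ctx['power_spectrum']})"
    return {**ctx, "initial_conditions": ic}


def lgadget(ctx: Context) -> Context:
    # Evolves the N-body system starting from the initial conditions.
    return {**ctx, "snapshots": f"snapshots({ctx['initial_conditions']})"}


STAGES: List[Tuple[str, Callable[[Context], Context]]] = [
    ("BCC Parameter Maker", parameter_maker),
    ("CAMB", camb),
    ("2LPTic", two_lpt_ic),
    ("LGadget", lgadget),
]


def run_pipeline(stages, ctx=None):
    """Run the stages in order; return the final context and completed names."""
    ctx = dict(ctx or {})
    completed = []
    for name, stage in stages:
        ctx = stage(ctx)  # raises KeyError if an upstream product is missing
        completed.append(name)
    return ctx, completed
```

Because each stage looks up the products it needs by key, running a stage out of order fails immediately with a `KeyError`, which mirrors the ordering constraint described above. The `completed` list is the kind of record a provenance-based restart could use to resume after the last finished stage.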

Page 10: Ecss des

Production results

• We have run four full-size boxes with Airavata. The following table compares the time taken using Airavata with the time taken when submitting jobs manually.

* Ignores a 1.5-day delay due to a network connection error with the job listener and the subsequent GRAM5 job deletion.

Page 11: Ecss des

Lessons learned

• MPI libraries and job scripts can differ between resources. Users need to experiment to learn what works.

• User scratch policies can differ between machines, so it can take some time to migrate between resources.

• Migration of working codes from one machine to another can take weeks to months.

• Grid services and client libraries need first-class support.

XSEDE ticketing system is your Best Friend!

Page 12: Ecss des

Future Goals

• Migrate to SDSC Trestles for the next production run.
• The group is planning to work with Apache Airavata for future extensions.
  • Implement intermediate start of the workflow in case of failure, based on provenance information.
  • Post-processing
• Plan for TACC Stampede migration.
• Currently we are using the Globus Online GUI for file transfer but would like to integrate it with the workflow using APIs.
• Migrate post-processing and quality assurance code to XSEDE and develop a post-processing workflow.
• Try to integrate post-processing steps on SLAC.

Page 13: Ecss des

Team & Publication

DES Simulation Working Group
August E. Evrard, Departments of Physics and Astronomy (Michigan)
Brandon Erickson, grad student (Michigan)
Risa Wechsler, asst. professor (Stanford/SLAC)
Michael Busha, postdoc (Zurich)
Matt Becker, grad student (Chicago)
Andrey V. Kravtsov, Department of Astronomy and Astrophysics (Chicago)

ECSS
Suresh Marru, Science Gateways Group (Indiana)
Marlon Pierce, Science Gateways Group (Indiana)
Lars Koesterke, TACC (Texas)
Dora Cai, NCSA (Illinois)

Publication: XSEDE '12: Brandon M. S. Erickson, Raminderjeet Singh, August E. Evrard, Matthew R. Becker, Michael T. Busha, Andrey V. Kravtsov, Suresh Marru, Marlon Pierce, and Risa H. Wechsler. 2012. A high throughput workflow environment for cosmological simulations. In Proceedings of the 1st Conference of the Extreme Science and Engineering Discovery Environment: Bridging from the eXtreme to the campus and beyond (XSEDE '12). ACM, New York, NY, USA, Article 34, 8 pages. DOI=10.1145/2335755.2335830 http://doi.acm.org/10.1145/2335755.2335830

