
Simulation Management and Execution Control

Date post: 09-Jan-2017
Upload: daniel-wheeler

fasteners

psutils

$ corrcli watcher start
Launch watcher with ID: 6ea96596ac29
$ python my_script.py params.json 6ea96596ac29
$ corrcli jobs list
   label     status    time stamp         pid
0  27f427d2  finished  16-06-26 21:03:46  27800
$ corrcli sync

Daniel Wheeler (a) and Faïçal Yannick P. Congo (a, b)

(a) Material Science and Engineering Division, Material Measurement Laboratory, NIST
(b) LIMOS, Blaise Pascal University, France

Email: [email protected]


I-What is simulation management?

Data context (metadata) is essential to verify scientific research claims. Version control and other workflow tools capture varying aspects of data context. Simulation management is concerned with the metadata for a scientific simulation or execution. Execution control tools capture simulation execution context in an analogous way to version control.


Workflow Tools

Wrapping Tools

II-Execution control in context

Execution Control

Version Control

Ubiquitous, robust
Command line
Web integration
Highly collaborative

Not suitable for capturing execution context

Suitable for recording stable automated executions

Provides log, search and view of execution history

Capture entire simulation context

Version environments
Collaborative

Not collaborative with current tools

Not robust or ubiquitous

Not suitable for log, search and view of history

Suitable for building pipelines of distinct tasks

Enables a clear division of tasks enabling flexible and adjustable pipeline implementation by non-experts

Black box design for each section of the pipeline

Monolithic in nature encouraging isolated ecosystem of tools

IV-Summary

III-Execution Control Apps

$ smt init
$ smt configure --executable=python --main=script.py
$ smt run params.json
$ smt list
538732dcc47d

Sumatra (smt)

Record saved at end of execution
Relatively mature code base
Local web view
Computation launched as a subprocess
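The subprocess model described above can be illustrated with a minimal sketch. It is not Sumatra's implementation: the function name `run_and_record`, the record fields, and the label format are all hypothetical, chosen only to show a tool that launches the computation itself and writes a single record once the computation has ended.

```python
import json
import subprocess
import sys
import time
import uuid
from pathlib import Path


def run_and_record(command, store_dir):
    """Run `command` as a subprocess and save one record when it exits.

    Hypothetical helper for illustration; not part of Sumatra or CoRR.
    """
    store = Path(store_dir)
    store.mkdir(parents=True, exist_ok=True)
    label = uuid.uuid4().hex[:12]  # assumed label format
    started = time.time()
    # The tool owns the child process, so it knows exactly when it ends.
    result = subprocess.run(command, capture_output=True, text=True)
    record = {
        "label": label,
        "command": list(command),
        "return_code": result.returncode,
        "duration_s": round(time.time() - started, 3),
        "stdout": result.stdout,
    }
    # The record is written once, after the computation has finished.
    (store / f"{label}.json").write_text(json.dumps(record, indent=2))
    return record
```

For example, `run_and_record([sys.executable, "my_script.py", "params.json"], "records")` would run the script and leave one JSON file in a `records` directory. A consequence of this design is that nothing is saved if the wrapper itself is killed before the child exits.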

CoRR-cli

Robust local record store on file system
Use “watcher” to inspect process (no subprocesses)
Continuously update records
Sync to CoRR API independently
https://github.com/usnistgov/corr-cli/
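The watcher approach can be sketched in the same spirit. The code below is an assumption-laden illustration, not the CoRR-cli implementation: `is_running` and `watch` are hypothetical names, the record layout is invented, and the existence check relies on POSIX signal semantics (it should not be used on Windows). It shows a separate process that polls a PID it did not launch and rewrites the record after every sample.

```python
import json
import os
import time
from pathlib import Path


def is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 checks existence without sending a signal
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def watch(pid, record_path, interval=0.1, max_samples=None):
    """Poll `pid` and rewrite the record file after every sample.

    Hypothetical helper for illustration; not part of CoRR-cli.
    """
    record = {"pid": pid, "status": "running", "samples": []}
    path = Path(record_path)
    while is_running(pid):
        record["samples"].append(time.time())
        path.write_text(json.dumps(record))  # record is updated continuously
        if max_samples is not None and len(record["samples"]) >= max_samples:
            return record  # stop early; the process is still running
        time.sleep(interval)
    record["status"] = "finished"
    path.write_text(json.dumps(record))
    return record
```

Because the watcher never launches the computation, the record on disk is current even if the watched process crashes. One caveat of this simple check: a finished child that its parent has not yet reaped still counts as existing.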

CoRR

Separate API and Cloud apps for scalability
Code base under construction
Centralized cloud platform
https://github.com/usnistgov/corr

Features

Capture version control details
Capture python dependencies
Capture input and output files
Capture parameters
Capture CPU and memory usage
Capture record commentary
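To make the feature list concrete, here is a minimal sketch of how such a metadata snapshot might be assembled with the Python standard library alone. Every function name and record field is hypothetical and does not reflect the record format of CoRR or Sumatra. Memory usage is left out because it needs a platform-specific call or a third-party library such as psutil.

```python
import hashlib
import json
import subprocess
import sys
import time
from importlib import metadata
from pathlib import Path


def git_revision(repo_dir="."):
    """Return the current commit hash, or None outside a git repository."""
    try:
        out = subprocess.run(
            ["git", "rev-parse", "HEAD"],
            cwd=repo_dir, capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing, not a repository, or no commits yet
    return out.stdout.strip()


def installed_packages():
    """Return sorted 'name==version' strings for installed distributions."""
    found = set()
    for dist in metadata.distributions():
        try:
            found.add(f"{dist.metadata['Name']}=={dist.version}")
        except Exception:
            continue  # skip distributions with unreadable metadata
    return sorted(found)


def file_digest(path):
    """Return the SHA-256 hex digest of a file's contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def capture_context(params_path, input_paths=(), comment=""):
    """Collect a metadata snapshot for one execution (illustrative only)."""
    return {
        "version_control": git_revision(),
        "python": sys.version.split()[0],
        "dependencies": installed_packages(),
        "parameters": json.loads(Path(params_path).read_text()),
        "inputs": {str(p): file_digest(p) for p in input_paths},
        "cpu_seconds": time.process_time(),  # CPU time of this process so far
        "comment": comment,
    }
```

Hashing input and output files rather than copying them keeps the record small while still letting a later reader verify that the same data was used.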

What is CoRR?

The Cloud of Reproducible Records is a web platform and command line tool client (CoRR-cli) for recording execution context as a set of records.

Future Work

Overcome government security issues to host live app
Implement searchable table view of data
Implement common metadata standard and interoperability between CoRR and Sumatra
Enlarge CoRR-cli metadata capture capabilities to be equivalent to Sumatra for Python
