Peter Bajcsy, Rob Kooper, Luigi Marini, Barbara Minsker and Jim Myers
National Center for Supercomputing Applications (NCSA), University of Illinois at Urbana-Champaign (UIUC)
POC: Peter Bajcsy, email: [email protected]
CyberIntegrator: A Meta-Workflow System Designed for Solving Complex Scientific Problems using Heterogeneous Tools
Outline
• Problem Formulation
  – Meta-Workflow Definitions
  – Past Work
• Design
  – Workflow Requirements Driven by Environmental Observatories
  – Architecture of the NCSA Meta-workflow Prototype Called CyberIntegrator
• Implementation
  – Key Capabilities of CyberIntegrator
• Use Cases
  – Environmental and Hydrological Engineering
• Summary
Problem Formulation
Science Problem Formulation
System Problem Formulation
Work Flow Problem Formulation
Meta-Workflow Definition
• Meta-workflow (MWF) definitions in the past:
  – (1) Workflow aspect: a workflow is an aggregation of tasks; a meta-workflow is an aggregation of workflows or a hierarchy of workflows
  – (2) Process management aspect: large activities have to be integrated, executed and evaluated in the process of conducting electronic commerce
• Our meta-workflow definition covers multiple dimensions:
  – (1) hierarchical structure and organization of software
    • combinatorial explosion of module connections
  – (2) heterogeneity of software tools and computational resources
    • the many different engines and software applications that people use for a reason
  – (3) usability of tool and workflow interfaces
  – (4) community sharing of fragments and user-friendly security
  – (5) community knowledge and provenance
  – (6) execution and built-in fault tolerance, etc.
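The workflow aspect of the definition (a workflow aggregates tasks, a meta-workflow aggregates workflows) can be sketched as a small recursive data structure. This is a minimal illustration only: the class names `Task` and `Workflow` and the task names are hypothetical and are not taken from CyberIntegrator.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Task:
    """A single unit of work (hypothetical name, for illustration only)."""
    name: str


@dataclass
class Workflow:
    """An aggregation of tasks and/or other workflows.

    A workflow that contains other workflows plays the role of a meta-workflow.
    """
    name: str
    steps: List[Union[Task, "Workflow"]] = field(default_factory=list)

    def depth(self) -> int:
        """Number of workflow levels: 1 for a plain workflow, more when nested."""
        nested = [s.depth() for s in self.steps if isinstance(s, Workflow)]
        return 1 + max(nested, default=0)

    def flatten(self) -> List[str]:
        """Task names in listed order, walking the hierarchy depth-first."""
        names: List[str] = []
        for step in self.steps:
            if isinstance(step, Workflow):
                names.extend(step.flatten())
            else:
                names.append(step.name)
        return names


# Made-up task names; two plain workflows aggregated into one meta-workflow.
load = Workflow("load", [Task("read_raster"), Task("reproject")])
analyze = Workflow("analyze", [Task("segment"), Task("summarize")])
meta = Workflow("meta", [load, analyze, Task("report")])
```

Here `meta.depth()` distinguishes a plain workflow from a hierarchy, and `meta.flatten()` shows the order in which the leaf tasks would be visited.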
Previous Work
• Other efforts:
  – Business process workflow architectures (FlowMark, WSFL and BPEL), serving the business community
  – Scientific workflow architectures: DAGMan, Taverna, SciFlo, Kepler, D2K, OGRE, CCA, Pegasus, GridFlow, GridAnt, Triana and GSFL
• Comparison:
  – Our work focuses on simple end-user interactions with information technologies while using all execution mechanisms transparently (workflow by example).
  – Our work builds provenance-to-recommendation pipelines for the benefit of a community (recommendations based on provenance information).
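One way to picture a provenance-to-recommendation pipeline is to count which tool followed which in past sessions and suggest the most frequent successors. The sketch below assumes provenance is available as ordered lists of tool names. That record format, the function names and the tool names are all assumptions made for illustration, not the mechanism CyberIntegrator uses.

```python
from collections import Counter, defaultdict
from typing import Dict, List


def build_successor_counts(sessions: List[List[str]]) -> Dict[str, Counter]:
    """Count, for each tool, how often each other tool was run right after it."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for steps in sessions:
        for current, following in zip(steps, steps[1:]):
            counts[current][following] += 1
    return counts


def recommend(counts: Dict[str, Counter], tool: str, top_n: int = 2) -> List[str]:
    """Return up to top_n tools that most often followed `tool` in the provenance."""
    return [name for name, _ in counts.get(tool, Counter()).most_common(top_n)]


# Made-up provenance: each inner list is one user's sequence of tool invocations.
sessions = [
    ["load_image", "georeference", "segment"],
    ["load_image", "georeference", "mosaic"],
    ["load_image", "segment", "summarize"],
    ["load_image", "georeference", "segment"],
]
counts = build_successor_counts(sessions)
next_tools = recommend(counts, "load_image")
```

A tool with no recorded history simply yields an empty recommendation list, which keeps the sketch safe for new tools.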
Research Topics
• Data Translations: Semantic and syntactic mapping of data structures
• Provenance Information: Granularity of gathered provenance information for recommendations, auditing and re-construction
• HCI: User interface design issues and community dependencies
• Meta-Data: Federation of distributed (data, tool, computational resource) registries
• Execution: Just-in-time data delivery with respect to remote computing; cost-benefit analysis of data transfer vs. CPU requirements; execution triggered by streaming data
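The cost-benefit question of data transfer vs. CPU requirements can be illustrated with a back-of-the-envelope model: running remotely pays off only when the compute time saved exceeds the time spent moving the data. The model below is a deliberate simplification (it ignores latency, queueing and result transfer), and all parameter names and numbers are assumptions.

```python
def local_time_s(work_ops: float, local_ops_per_s: float) -> float:
    """Estimated time to run the job on the local machine."""
    return work_ops / local_ops_per_s


def remote_time_s(data_bytes: float, bandwidth_bytes_per_s: float,
                  work_ops: float, remote_ops_per_s: float) -> float:
    """Estimated time to ship the input data and run the job remotely."""
    transfer = data_bytes / bandwidth_bytes_per_s
    compute = work_ops / remote_ops_per_s
    return transfer + compute


def prefer_remote(data_bytes: float, bandwidth_bytes_per_s: float,
                  work_ops: float, local_ops_per_s: float,
                  remote_ops_per_s: float) -> bool:
    """True when the remote estimate beats the local estimate."""
    remote = remote_time_s(data_bytes, bandwidth_bytes_per_s, work_ops, remote_ops_per_s)
    return remote < local_time_s(work_ops, local_ops_per_s)


# Illustrative numbers only: a 10x faster remote machine and a 100 MB/s link.
small_input = prefer_remote(1e9, 1e8, 1e12, 1e9, 1e10)   # 10 s transfer + 100 s compute vs 1000 s local
large_input = prefer_remote(1e11, 1e8, 1e12, 1e9, 1e10)  # 1000 s transfer + 100 s compute vs 1000 s local
```

With the same compute load, the small input favors remote execution while the large input does not, because the transfer time alone matches the local run time.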
Design
Design Goals
• Make scientific discoveries easier
  – Workflow by example (step-by-step experimentation)
  – Design friendly user interfaces
  – Build seamless access to heterogeneous data/tools/resources
  – Provide data and process provenance information
  – Recommend data, tools and computational resources
  – Derive higher level semantic tools
Meta-workflow Architecture
Implementation
Meta-Workflow Features
• Workflow by example
• Support of heterogeneous executors
  – Workflows: GeoLearn, D2K, Kepler/Ptolemy
  – Applications: MS Excel, Im2Learn, ArcGIS
  – Web services: D2KWS
• Provenance
  – Gathering & Meta-data repositories
• Recommendations
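The features above (step-by-step execution across different kinds of executors, with provenance gathered along the way) could be organized roughly as follows. This is a sketch under stated assumptions: the `Executor` interface, the `Session` class and both stand-in executors are hypothetical, and neither stand-in calls a real application or web service.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, List


class Executor(ABC):
    """Common interface hiding how a given kind of tool is actually run."""
    kind: str

    @abstractmethod
    def run(self, tool: str, data: Any) -> Any:
        ...


class LocalFunctionExecutor(Executor):
    """Stand-in for an application executor; runs plain Python callables."""
    kind = "application"

    def __init__(self, tools: Dict[str, Callable[[Any], Any]]) -> None:
        self._tools = tools

    def run(self, tool: str, data: Any) -> Any:
        return self._tools[tool](data)


class EchoServiceExecutor(Executor):
    """Stand-in for a web-service executor; performs no network access."""
    kind = "web_service"

    def run(self, tool: str, data: Any) -> Any:
        return {"service": tool, "payload": data}


class Session:
    """Runs one step at a time and records provenance for every step."""

    def __init__(self, executors: List[Executor]) -> None:
        self._executors = {e.kind: e for e in executors}
        self.provenance: List[Dict[str, Any]] = []

    def step(self, kind: str, tool: str, data: Any) -> Any:
        result = self._executors[kind].run(tool, data)
        self.provenance.append(
            {"executor": kind, "tool": tool, "input": data, "output": result}
        )
        return result


session = Session([LocalFunctionExecutor({"sum": sum}), EchoServiceExecutor()])
total = session.step("application", "sum", [1, 2, 3])
wrapped = session.step("web_service", "publish", total)
```

The user-facing call is the same for both steps; only the `kind` argument selects the executor, and the recorded `session.provenance` list is the kind of input a recommendation step could consume.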
Meta-workflow Editor
Use Cases
Meta-Workflow R&D Drivers
• Community drivers:
  – Environmental Science: CLEANER
  – Hydrological Science: CUAHSI
• Science drivers:
  – Environmental Modeling of Nutrient Distribution
    • Monte Carlo simulations of the maximum amount of pollution that a water body can receive each day and still retain its uses
  – Understanding the Dynamic Evolution of Land-Surface Variables in the Illinois River Basin
    • Data-driven analyses of multi-variable relationships from remote sensing data
• Technology drivers:
  – Collaboratory Cyberenvironments
Summary
• The problem of designing a highly interactive scientific meta-workflow system is very complex
• Key capabilities of our meta-workflow prototype implementation called CyberIntegrator were demonstrated with two use cases.
• We plan on building and deploying a practical tool for multiple communities.
• Publications:
  – Image Spatial Data Analysis Group at NCSA
  – URL: http://isda.ncsa.uiuc.edu
• Questions:
  – Peter Bajcsy; Email: [email protected]
Hydro-informatics
Backup
Meta-workflow System Information
Terminology
• Engines are stand-alone environments and applications that are used by many tools
  – Examples: Matlab, MS Excel, D2K, Im2Learn, ArcGIS, Kepler
• Tools are solutions specific to a problem and consist of several algorithms
  – Examples: Image Calculator in Im2Learn, Pie chart visualization in MS Excel, …
• Algorithms are code fragments that perform a specific operation in a tool
  – Examples: image addition operation in Image Calculator
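The three terms nest naturally: an engine hosts tools, and a tool groups algorithms. A minimal sketch of that relationship, using only the examples named above, might look like this. The class and function names are hypothetical, and the algorithm list for the Excel tool is left empty because the slides do not name one.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Tool:
    """A problem-specific solution made of several algorithms."""
    name: str
    algorithms: List[str] = field(default_factory=list)


@dataclass
class Engine:
    """A stand-alone environment or application that hosts tools."""
    name: str
    tools: Dict[str, Tool] = field(default_factory=dict)

    def add_tool(self, tool: Tool) -> None:
        self.tools[tool.name] = tool


def engines_providing(engines: List[Engine], algorithm: str) -> List[str]:
    """Names of engines that have at least one tool containing `algorithm`."""
    return [
        e.name
        for e in engines
        if any(algorithm in t.algorithms for t in e.tools.values())
    ]


im2learn = Engine("Im2Learn")
im2learn.add_tool(Tool("Image Calculator", ["image addition"]))
excel = Engine("MS Excel")
excel.add_tool(Tool("Pie chart visualization"))  # algorithms not named in the slides

providers = engines_providing([im2learn, excel], "image addition")
```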
Environmental Science
Hydrological Science