Scientific Workflows in e-Science

Page 1: Scientific Workflows in e-Science

August 31 2006, Elsevier, Amsterdam

Scientific Workflows in e-Science

Dr Zhiming Zhao ([email protected])

System and Network Engineering, University of Amsterdam
Virtual Laboratory for e-Science

Page 2: Scientific Workflows in e-Science

Outline

• Background
• Scientific workflow management system
• Virtual Laboratory for e-Science
• Our approach
• Challenges and research lines
• Activities

Page 3: Scientific Workflows in e-Science

Problem solving: a typical scenario in scientific research

• Analysis

• Hypothesis

• Related work

• Propose experiments

• Define steps

• Prototype computing systems

• Perform experiments

• Data collection

• Visualization

• Validation

• Adjust experiment

• Refine hypothesis

• Presentation

• Dissemination

Phases: define problems, experiments, data analysis, discovery

Activities are:

- Iterative, dynamic, and human centered
- Dependent on different levels of resources

Page 4: Scientific Workflows in e-Science

Example scenarios

In problem analysis:
• Identify domains, search key problems, find typical methods, and review related work

In scientific experiments (scientific computing & data processing):
• Define dependencies between computing and data processing tasks, and schedule their runtime behavior

In data analysis:
• Visualize and compare the results of different parameters, keep meaningful configurations, and continue experiments
• Search related work, compare results

In dissemination:
• Document experiments, present results, citation, publication

Page 5: Scientific Workflows in e-Science

Computer support for problem solving

Problem Solving Environment (E. Gallopoulos et al., IEEE CS Eng. 1994):
• Organizes different software components/tools
• Allows a user to assemble these tools at a high level of abstraction
• Controls runtime behavior of experiments
• Examples: MATLAB, Ptolemy, etc.

Traditional PSE: organize and execute resources locally!

Distributed resources:
• Distributed/parallel computing
• Visualization, remote resource invocation
• Distributed data sharing & dissemination

Scientific workflow management systems: a new guise of PSE!

Page 6: Scientific Workflows in e-Science

Inside a Scientific Workflow Management System

In our view, a SWMS implements at least:

• A model for describing workflows;
• An engine for executing/managing workflows;
• Different levels of support for a user to compose, execute, and control a workflow.

[Figure: a SWMS. Labels: workflow (based on a certain model), engine, resources; user support: composition, engine level control, resource level control.]
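The three parts listed on this slide (a workflow model, an engine, and user support) can be illustrated with a minimal sketch. The names `Task`, `Workflow`, and `Engine`, and the `on_step` callback standing in for user-level control, are illustrative assumptions and are not taken from any particular SWMS; the model is reduced to a linear pipeline for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Task:
    """One workflow step: a name plus a function applied to the running value."""
    name: str
    run: Callable[[float], float]


@dataclass
class Workflow:
    """The model: here simply an ordered pipeline of tasks (an assumption)."""
    tasks: List[Task] = field(default_factory=list)


class Engine:
    """The engine: executes a workflow and reports each step to the user."""

    def __init__(self, on_step: Optional[Callable[[str, float], bool]] = None):
        # on_step is a hypothetical user-support hook; returning False
        # stops the run, which stands in for engine level control.
        self.on_step = on_step

    def execute(self, workflow: Workflow, value: float) -> float:
        for task in workflow.tasks:
            value = task.run(value)
            if self.on_step is not None and not self.on_step(task.name, value):
                break
        return value


# Composition: the user assembles tasks into a workflow.
pipeline = Workflow([
    Task("scale", lambda x: x * 2),
    Task("shift", lambda x: x + 1),
    Task("square", lambda x: x * x),
])

log: List[str] = []


def stop_at_seven(name: str, value: float) -> bool:
    """User control: record the step and stop once the value reaches 7."""
    log.append(name)
    return value < 7


result = Engine(on_step=stop_at_seven).execute(pipeline, 3)
```

With the callback in place the run stops after the second task; without it, the engine runs the whole pipeline.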

Page 7: Scientific Workflows in e-Science

Scientific Workflows in e-Science

Workflows vary across different:

• Phases of experiments: design, runtime control, dissemination
• Abstractions of resources: concrete and abstract
• Levels of activity detail: computing, data access, search/matching, human activities

[Figure: layers from experiment processes to abstract workflows to executable (concrete) workflows, alongside workflows for administration, e.g., AAA, and other issues.]
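The distinction between abstract and executable (concrete) workflows can be sketched as a binding step: an abstract workflow names only the kind of resource each step needs, and a concrete workflow fixes the actual resource. The step names, the catalog, and the host names below are made-up placeholders, and picking the first matching resource is an assumption, not a scheduling policy described in the slides.

```python
from typing import Dict, List, Tuple

# Abstract workflow: steps name only the capability they need.
abstract_workflow: List[Tuple[str, str]] = [
    ("fetch", "storage"),
    ("align", "compute"),
    ("plot", "visualization"),
]

# Hypothetical resource catalog; host names are placeholders.
catalog: Dict[str, List[str]] = {
    "storage": ["store-01.example.org"],
    "compute": ["node-a.example.org", "node-b.example.org"],
    "visualization": ["viz.example.org"],
}


def concretize(
    workflow: List[Tuple[str, str]], resources: Dict[str, List[str]]
) -> List[Tuple[str, str]]:
    """Bind every abstract step to the first resource offering its capability."""
    concrete = []
    for step, capability in workflow:
        candidates = resources.get(capability, [])
        if not candidates:
            raise LookupError(f"no resource offers {capability!r} for step {step!r}")
        concrete.append((step, candidates[0]))
    return concrete


concrete_workflow = concretize(abstract_workflow, catalog)
```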

Page 8: Scientific Workflows in e-Science

Diversity in SWMS

Taverna:
- Web services based language: Scufl
- FreeFluo: engine
- Graphical visualization of workflow

Kepler:
- Actor, director
- MoML
- Execution models

Triana:
- Components
- Task graph
- Data/control flow

DAGMan:
- Computing tasks
- DAG

Pegasus:
- Based on DAGMan
- VDL
- DAG
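DAGMan and Pegasus are listed here as DAG-based. As a rough illustration of what a DAG model implies for execution order, the sketch below uses Python's standard `graphlib.TopologicalSorter` (Python 3.9+). It does not use DAGMan's or Pegasus's own input formats, and the task names are invented for the example.

```python
from graphlib import TopologicalSorter

# Each key maps a task to the set of tasks that must finish before it.
dependencies = {
    "preprocess": set(),
    "simulate": {"preprocess"},
    "analyze": {"simulate"},
    "visualize": {"simulate"},
    "report": {"analyze", "visualize"},
}

sorter = TopologicalSorter(dependencies)
sorter.prepare()  # raises graphlib.CycleError if the graph is not a DAG

batches = []
while sorter.is_active():
    ready = sorted(sorter.get_ready())  # tasks whose predecessors are all done
    batches.append(ready)               # a real engine would submit these jobs
    sorter.done(*ready)
```

Tasks that land in the same batch have no dependency on each other, so an engine could run them in parallel.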

Page 9: Scientific Workflows in e-Science

Virtual Laboratory for e-Science

[Figure: VL-e layered architecture]

• Application layer: Dutch telescience, Data intensive science, Medical diagnosis, Bioinformatics ASP, Biodiversity, Food Informatics
• Generic e-science framework layer
• Grid layer

Page 10: Scientific Workflows in e-Science

Mission

Effectively reuse existing workflow management systems, and provide a generic e-Science framework for different application domains.

A generic framework can:
• Improve the reuse of workflow components and of workflows across different experiments
• Reduce the learning cost for different systems
• Allow application users to work in a consistent environment when the underlying infrastructure changes

Page 11: Scientific Workflows in e-Science

Previous work: VLAM-G environment

VLAM-G:
• A Grid-enabled PSE
• Data intensive applications
• Visual interface
• Two levels of workflow support
• Human interaction support

Page 12: Scientific Workflows in e-Science

Workflow in VLAM-G

Page 13: Scientific Workflows in e-Science

Experiment Topology – Graphical representation of self-contained data processing modules attached to each other in a workflow.

[Figure: diagram with the elements PROJECT, EXPERIMENT, ARRAYMEASUREMENT, DATA ANALYSIS, COMMENT, PROPERTY, OWNER, CONTRIBUTOR, and COMMENTATOR, connected by relations such as hasExperiments, hasSteps, hasNextExperiment/hasPrevExperiment, hasNextStep/hasPrevStep, hasComments, hasProperties, hasContributors, isPerformedBy, and isMadeBy. Each element and relation is annotated LINK, COPY, or NOREUSE.]

Process-Flow Template (PFT) – Graphical representation of data elements and processing steps in an experimental procedure.

Study – Descriptions of experimental steps represented as an instance of a PFT with references to experiment topologies.

VLAM-G PFT/Study
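The relationship between a PFT and a Study (a Study is an instance of a PFT whose steps reference experiment topologies) can be sketched with two small classes. This is only an illustration of the template/instance idea, not VLAM-G's data model: the class names, the `attach` method, the template name `microarray`, and the identifier `topology-42` are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ProcessFlowTemplate:
    """Reusable description of the steps in an experimental procedure."""
    name: str
    steps: List[str]


@dataclass
class Study:
    """An instance of a PFT; each step may reference an experiment topology."""
    template: ProcessFlowTemplate
    topology_refs: Dict[str, Optional[str]] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Start with every template step present and no topology attached.
        for step in self.template.steps:
            self.topology_refs.setdefault(step, None)

    def attach(self, step: str, topology_id: str) -> None:
        """Reference an experiment topology from one step of the study."""
        if step not in self.topology_refs:
            raise KeyError(f"{step!r} is not a step of {self.template.name!r}")
        self.topology_refs[step] = topology_id

    def unbound_steps(self) -> List[str]:
        """Steps that do not yet reference any experiment topology."""
        return [s for s, ref in self.topology_refs.items() if ref is None]


pft = ProcessFlowTemplate("microarray", ["array measurement", "data analysis"])
study = Study(pft)
study.attach("data analysis", "topology-42")
unbound = study.unbound_steps()
```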

Page 14: Scientific Workflows in e-Science

Lessons learned

• How to introduce a new PSE to a domain scientist?
  - Because it has a beautiful architecture?
  - Or because it allows scientists to keep their current work style?
• How to use existing work?
  - Do scientists need one system or more options?
• How to include the user in the computing loop?
  - Dynamic workflows and human-in-the-loop computing are important.

Z. Zhao et al., “Scientific workflow management: between generality and applicability”, QSIC 2005, Australia

Page 15: Scientific Workflows in e-Science

Workflow support in VL-e

Recommend suitable workflow systems for different application domains:
• Analyze typical application use cases
• Define small projects with different application domains
• Review existing workflow systems
• Recommend four workflow systems: Triana, Taverna, Kepler, and VLAM-G

In the long term:
• Extend VLAM-G and develop our own generic workflow framework

Page 16: Scientific Workflows in e-Science

A workflow bus paradigm

[Figure: a workflow composed of sub workflows 1, 2, and 3, connected through the workflow bus to the engines Taverna, Kepler, and Triana.]

A workflow bus is a special workflow system for executing meta workflows, in which sub workflows will be executed by different engines.

Z. Zhao et al., “Workflow bus for e-Science”, in IEEE Int’l Conf. e-Science 2006, Amsterdam
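The workflow bus idea, a meta workflow whose sub workflows are executed by different engines, can be sketched as a dispatcher over registered engine adapters. The sketch is a toy under stated assumptions: the adapters are plain Python functions that only record which engine handled a step and do not call Taverna, Kepler, or Triana; running the sub workflows strictly in sequence is also an assumption. The class and function names are hypothetical.

```python
from typing import Callable, Dict, List, NamedTuple


class SubWorkflow(NamedTuple):
    name: str
    engine: str   # which engine should execute this sub workflow
    payload: str  # stand-in for the engine-specific workflow description


class WorkflowBus:
    """Runs a meta workflow by handing each sub workflow to its engine."""

    def __init__(self) -> None:
        self._adapters: Dict[str, Callable[[str, str], str]] = {}

    def register(self, engine: str, adapter: Callable[[str, str], str]) -> None:
        self._adapters[engine] = adapter

    def run(self, meta_workflow: List[SubWorkflow], data: str) -> str:
        # Sub workflows run in sequence; each output feeds the next input.
        for sub in meta_workflow:
            if sub.engine not in self._adapters:
                raise KeyError(f"no adapter registered for engine {sub.engine!r}")
            data = self._adapters[sub.engine](sub.payload, data)
        return data


def fake_adapter(engine: str) -> Callable[[str, str], str]:
    """Placeholder adapter that only records which engine handled the step."""
    return lambda payload, data: f"{data} -> {engine}:{payload}"


bus = WorkflowBus()
for engine_name in ("taverna", "kepler", "triana"):
    bus.register(engine_name, fake_adapter(engine_name))

meta = [
    SubWorkflow("sub workflow 1", "taverna", "query"),
    SubWorkflow("sub workflow 2", "kepler", "model"),
    SubWorkflow("sub workflow 3", "triana", "filter"),
]
trace = bus.run(meta, "input")
```

A real bus would replace `fake_adapter` with wrappers around each engine's own invocation interface; the dispatch loop itself would stay the same.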

Page 17: Scientific Workflows in e-Science

Applications of workflow bus

Use case 1:
• A user has a workflow in Taverna
• Some functionality is missing in Taverna but can be provided by Triana
• The user can develop the workflow in the two systems, and run it via the workflow bus

Use case 2:
• A user wants to execute a Taverna or Triana workflow in multiple instances with different input data
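Use case 2 amounts to launching the same workflow several times with different inputs. A minimal sketch with Python's standard `concurrent.futures` is shown below; `run_workflow` is a hypothetical stand-in for one workflow run, and the parameter names `rate` and `steps` are invented for the example.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def run_workflow(params: Dict[str, float]) -> float:
    """Stand-in for one workflow run; a real bus would invoke an engine here."""
    return params["rate"] * params["steps"]


# One parameter set per workflow instance.
inputs: List[Dict[str, float]] = [
    {"rate": 0.5, "steps": 10},
    {"rate": 1.0, "steps": 10},
    {"rate": 2.0, "steps": 10},
]

# Launch the instances concurrently; map() returns results in input order.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = list(pool.map(run_workflow, inputs))
```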

Page 18: Scientific Workflows in e-Science

Ongoing research

• Web services in data intensive applications
• Execution models for Grid workflows
• Including PSE in scientific workflows
• Industrial standards in scientific workflows

Page 19: Scientific Workflows in e-Science

Relevance between our research and Elsevier’s work

• The same context, when viewed at the scale of the entire lifecycle of e-Science experiments
• Different focuses:
  - We focus on the runtime behavior of scientific experiments, e.g., Grid computing, data/computing intensive applications, and scheduling of computing tasks
  - Elsevier highlights data search and integration on well-structured databases, research preparation, and literature search and management

Page 20: Scientific Workflows in e-Science

Cont.

• Different characteristics in workflows:
  - In our workflows, processing and managing runtime dynamic data are the key patterns
  - In Elsevier workflows, storing, replicating, accessing, matching, and integrating static data might be more common
• Facing similar challenges:
  - Semantics-based data search and integration
  - Workflow provenance
  - Collaborative interaction (workflow development, resource sharing, knowledge transfer)
  - Modeling user profiles

Page 21: Scientific Workflows in e-Science

Activities

• Int’l workshop on “Workflow systems in e-Science”, organized by Zhiming Zhao and Adam Belloum, in the context of ICCS06, Reading University, May 28, 2006. The proceedings are in LNCS, Springer Verlag. A special issue will be published in the Scientific Programming Journal. http://staff.science.uva.nl/~zhiming/iccs-wses

• Workshop on “Scientific workflows and industrial workflow standards in e-Science”, organized by Adam Belloum and Zhiming Zhao, in the context of the IEEE e-Science and Grid computing conference in Amsterdam, December 2006.
  - Pegasus, Dr. Ewa Deelman (Department of Computer Science, University of Southern California)
  - BPEL, Dr. Dieter König (IBM Research Germany Development Laboratory)
  - Kepler, Dr. Bertram Ludäscher (Department of Computer Science, University of California, Davis)
  - Taverna, Prof. Peter Rice (European Bioinformatics Institute)
  - WS and Semantic issues, Dr. Steve Ross-Talbot (CEO and a co-founder of Pi4 Technologies)
  - Triana, Dr. Ian J. Taylor (Department of Computer Science, Cardiff University)
  http://staff.science.uva.nl/~adam/workshop/VL-e-workshop.htm

Page 22: Scientific Workflows in e-Science

References

1. Virtual Laboratory for e-Science: www.vl-e.nl

2. System and Network Engineering, Faculty of Science, University of Amsterdam: http://www.science.uva.nl/research/sne/

3. Z. Zhao, A. Belloum, H. Yakali, P.M.A. Sloot, and L.O. Hertzberger: Dynamic Workflow in a Grid Enabled Problem Solving Environment, in Proceedings of the 5th International Conference on Computer and Information Technology (CIT2005), pp. 339-345. IEEE Computer Society Press, Shanghai, China, September 2005.

4. Z. Zhao, A. Belloum, A. Wibisono, F. Terpstra, P.T. de Boer, P.M.A. Sloot, and L.O. Hertzberger: Scientific workflow management: between generality and applicability, in Proceedings of the International Workshop on Grid and Peer-to-Peer based Workflows in conjunction with the 5th International Conference on Quality Software, pp. 357-364. IEEE Computer Society Press, Melbourne, Australia, September 19th-21st 2005.

5. Z. Zhao, A. Belloum, P.M.A. Sloot, and L.O. Hertzberger: Agent technology and scientific workflow management in an e-Science environment, in Proceedings of the 17th IEEE International Conference on Tools with Artificial Intelligence (ICTAI05), pp. 19-23. IEEE Computer Society Press, Hong Kong, China, November 14th-16th 2005.

