
Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Date post: 10-May-2015
Upload: anpawlik
Description:
Slides on Taverna (www.taverna.org.uk) from the talk given at the STFC/NERC workshop "Workflow approaches to investigation of biological complexity", 15-16 October 2013.
Transcript
Page 1: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Taverna workflows: provenance and reproducibility

Aleksandra Pawlik
The University of Manchester

Workflow approaches to investigation of biological complexity
STFC/NERC Workshop, 15-16 October 2013

Page 2: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Workflows for improvement

Workflows are more than just pipelines…

• Scaling up automated execution
• Bringing together distributed and continually changing resources
• Dealing with different standards, interfaces and implementations
• Support for repeatable analysis

Page 3: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Taverna Engine Execution

• Workflows in Scufl2
• Functional dataflow, simple control flows, implicit iteration
• Linking services and tools
• Different data resources and formats
• “In Workflow Programming” (e.g. Beanshell scripting)
• Provenance collection: W3C PROV-O, OPM
• Plug-in Framework
• Infrastructures: Web Services (SOAP, REST), Grid, HPC
• Common Tools: Excel Spreadsheets, Google Refine, R
• OAuth security plug-in
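
The implicit iteration mentioned on this slide can be illustrated with a minimal Python sketch. It is not Taverna's API: the helper `run_with_implicit_iteration` and the example service `to_upper` are hypothetical names, and the sketch assumes that a service written for single values is invoked once per element when it is handed a list, with several list inputs combined as a cross product.

```python
from itertools import product


def run_with_implicit_iteration(service, *inputs):
    """Call `service` once per combination of input values.

    Assumption for this sketch: a plain value counts as a one-item list,
    and several list inputs are combined as a cross product.
    """
    wrapped = [value if isinstance(value, list) else [value] for value in inputs]
    results = [service(*combo) for combo in product(*wrapped)]
    # With no list among the inputs there is exactly one call,
    # so hand back its result without the list wrapper.
    if not any(isinstance(value, list) for value in inputs):
        return results[0]
    return results


def to_upper(identifier):
    """Hypothetical single-value service."""
    return identifier.upper()


single = run_with_implicit_iteration(to_upper, "p12345")
many = run_with_implicit_iteration(to_upper, ["p12345", "q67890"])
pairs = run_with_implicit_iteration(lambda a, b: a + b, ["a", "b"], ["1", "2"])
```

The point of the sketch is that the service itself never contains a loop: the caller decides how often to invoke it from the shape of the data it receives.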

Page 4: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Workflow engine to run workflows

List of services

Construct and visualise workflows

Taverna Workbench
• Customizable for domains (e.g. expose services only for biodiversity)
• Desktop application
• Intermediate results views
• Plug-in framework

Page 5: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Taverna User Spectrum

[Figure: spectrum of Taverna users and tools.
User roles: Workflow Engineer, Computational Scientist, Domain Scientist (Workflow User).
Tools: Workbench, Workbench Components, Lite, Player, Domain-Specific Website / Tool / Portal.
Axes: Workflow Visibility (High to Low); Concept Knowledge (Taverna to Domain).]

Page 6: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Application Factory: productive by reuse

• Right apps, right users
• Commodity apps: Web. Spreadsheets. R.
• Customisation
• Mixed workflow / scripting
• Deployment / Portability
• Web based / desktop
• Virtualised deployments
• Cloud hosted service
• A cloud-enabled local host
• Local ownership
• Capability building

Infrastructure: Legacy, others and your own software, datasets, services, codes, and platforms. Optimise and manage use of computing infrastructure.

WFMS middleware: Support design, config. and execution of workflows. Manage utility actions for data, logging, security, compute, error. Shield incompatibilities & complexity.

Workflow: Parameterised, integrative, multi-step (data) pipelines, analytics, computational protocols. Can be repetitively reused.

Apps: Domain/task specific apps that incorporate (an ecosystem of) workflows. Integrate

Page 7: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Reuse and Reproducibility

Page 8: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

~6,000 members, over 300 groups, over 3,000 workflows

Page 9: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Taverna Components

Workflow blocks made of a workflow

• Well described
• Well behaved
• Well looked after
• Agreed fail
• Agreed formats in and out
• Agreed provenance

Deposited in myExperiment
Grouped into families

Page 10: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Provenance: how did you do it?

The link between computation and results

Collecting -> Using Provenance

Reporting at different scales/ levels

PDIFF: comparing provenance traces to diagnose divergence across experimental results [Woodman et al, 2011]
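
As a rough illustration of using provenance to diagnose divergence, the sketch below walks two recorded traces in step order and reports the first step at which they disagree. It is a simplified stand-in, not the PDIFF algorithm: the trace layout (a list of dicts with `step`, `inputs` and `output` keys) and the function `first_divergence` are assumptions made for this example.

```python
def first_divergence(trace_a, trace_b):
    """Return (step_name, reason) for the first disagreement, or None.

    Each trace is assumed to be a list of dicts with the keys
    'step', 'inputs' and 'output', recorded in execution order.
    """
    for step_a, step_b in zip(trace_a, trace_b):
        if step_a["step"] != step_b["step"]:
            return (step_a["step"], "different step executed")
        if step_a["output"] != step_b["output"]:
            if step_a["inputs"] != step_b["inputs"]:
                return (step_a["step"], "inputs differ")
            # Same inputs but a different output points at the step itself,
            # for example a remote service that changed between runs.
            return (step_a["step"], "same inputs, different output")
    if len(trace_a) != len(trace_b):
        return (None, "traces have different lengths")
    return None


# Hypothetical traces of two runs of the same two-step workflow.
run_1 = [
    {"step": "fetch", "inputs": ["id1"], "output": "seqA"},
    {"step": "align", "inputs": ["seqA"], "output": "aln1"},
]
run_2 = [
    {"step": "fetch", "inputs": ["id1"], "output": "seqB"},
    {"step": "align", "inputs": ["seqB"], "output": "aln2"},
]

divergence = first_divergence(run_1, run_2)
identical = first_divergence(run_1, run_1)
```

Here the two runs already disagree at `fetch` although the inputs match, so the later difference at `align` is a consequence rather than the cause.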

Page 11: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Research Objects
http://www.researchobject.org/

bundles and relates digital resources of a scientific experiment or investigation using standard mechanisms

http://www.w3.org/community/rosc/
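
To make the idea of bundling and relating resources concrete, here is a minimal Python sketch that lists the parts of an experiment in a single manifest and serialises it as JSON. The manifest layout, the field names (`title`, `aggregates`, `path`, `role`), the file paths and the function `build_manifest` are all hypothetical and chosen only for illustration; they are not the Research Object specification.

```python
import json


def build_manifest(title, resources):
    """Describe a bundle as a title plus a list of (path, role) resources."""
    return {
        "title": title,
        "aggregates": [{"path": path, "role": role} for path, role in resources],
    }


# Hypothetical contents of one investigation.
manifest = build_manifest(
    "Example investigation",
    [
        ("workflow.t2flow", "workflow"),
        ("inputs/sequences.fasta", "input data"),
        ("provenance/run1.ttl", "provenance trace"),
    ],
)

manifest_json = json.dumps(manifest, indent=2)
```

Keeping the workflow, its inputs and its provenance trace in one described bundle is what lets someone else find all the pieces of a result together.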

Page 12: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Galaxy execution

Tools

Taverna server

Wrap as Tool

Taverna in Galaxy

Workflow in

Upload

Page 13: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

The Taverna Suite of Tools

[Diagram labels: Client User Interfaces; Workflow Repository; Service Catalogue; Third Party Tools; Web Portals / Gateways; Activity and Service Plug-in Manager; Workflow Provenance; Workflow Server; Workflow Engine; Virtual Machine; Prog APIs; Command Line; Player; Workflow Components; Workbench; Taverna Lite; Interaction Server]

Page 14: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Freely available

Open source

Current version 2.4

80,000+ downloads across versions

Active user forum & support

Windows/Mac OS X/Linux/Unix

Sustainability and user support

Tutorials and Workshops

www.taverna.org.uk

Page 15: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Taverna in other projects

BioDiversity Virtual e-Laboratory www.biovel.eu

Wf4Ever www.wf4ever-project.org

VPH-Share www.vph-share.eu

SCAPE www.scape-project.eu

Pacific Northwest National Laboratory www.pnnl.gov

Scientific Workflows and Provenance Working Group www.dataone.org

iPlant Collaborative www.iplantcollaborative.org

HELIO www.helio-vo.eu

KBase www.kbase.us

SHIWA www.shiwa-workflow.eu

Page 16: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

Data-centric Computation: Scientific workflows over Distributed Cyber-Infrastructure.

Data sharing libraries and catalogues for all types of scientific artefacts and all types of scientists.

Knowledge Management: Metadata, semantics digital exchange, preservation, publishing

Software Engineering: Software sustainability, software and data policy, training

Products Methods

Page 17: Taverna workflows: provenance and reproducibility - STFC/NERC workshop 2013

For more information

Taverna http://www.taverna.org.uk

myExperiment http://www.myexperiment.org

myGrid http://www.mygrid.org.uk

