Page 1: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

VBI Web Services Workshop 26-27 May 2005

Performing In silico Experiments in a Service Based Architecture: Solutions and Issues

Chris Wroe, Phillip Lord, Robert Stevens & Carole Goble

The University of Manchester, UK

http://www.mygrid.org.uk

Page 2: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

EPSRC funded UK eScience Program Pilot Project

Thanks to the other members of the Taverna project, http://taverna.sf.net

Page 3: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Core
• Matthew Addis, Nedim Alpdemir, Tim Carver, Rich Cawley, Neil Davis, Alvaro Fernandes, Justin Ferris, Robert Gaizaukaus, Kevin Glover, Carole Goble, Chris Greenhalgh, Mark Greenwood, Yikun Guo, Jan Humble, Ananth Krishna, Peter Li, Phillip Lord, Darren Marvin, Simon Miles, Luc Moreau, Arijit Mukherjee, Tom Oinn, Juri Papay, Savas Parastatidis, Norman Paton, Terry Payne, Matthew Pocock, Milena Radenkovic, Stefan Rennick-Egglestone, Peter Rice, Ian Roberts, Martin Senger, Nick Sharman, Robert Stevens, Victor Tan, Anil Wipat, Paul Watson, Jimi Worthington and Chris Wroe.

Users
• Simon Pearce and Claire Jennings, Institute of Human Genetics, School of Clinical Medical Sciences, University of Newcastle, UK
• Hannah Tipney, May Tassabehji, Andy Brass, St Mary’s Hospital, Manchester, UK
• Steve Kemp, Liverpool, UK

Postgraduates
• Martin Szomszor, Duncan Hull, Jun Zhao, Pinar Alper, Keith Flanagan, Antoon Goderis, Tracy Craddock, Alastair Hampshire

Industrial
• Dennis Quan, Sean Martin, Michael Niemi, Syd Chapman (IBM)
• Robin McEntire (GSK)

Collaborators
• Keith Decker

Page 4: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Bioinformatics Services

• A typical HAD environment
– Distributed, Autonomous and very, very Heterogeneous

• No standard API or calling mechanisms

• Complex types are often implicit – everything is String

• No domain typing – everything is String

• Numerous Services and growing

• Close the world – controlled, but constrained

• Open the world – uncontrolled, but versatile

Page 5: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

In silico Bioinformatics

• Bioinformatics experiments use 1, 2 up to N services chained together

• Ultimate result is the goal and some or all intermediates are part of the goal

• Intermediates are necessary for evidence gathering
• Often need to be repeated
• Often need to be re-purposed
• Workflows offer a suitable model for bioinformatics experiments
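The chaining of services described on this slide can be sketched in a few lines. This is only an illustration, not Taverna or Scufl code: `run_workflow`, `mask_repeats` and `to_upper` are hypothetical names, and the two "services" are local stand-ins for remote calls. The point is that every intermediate is kept alongside the final result, so it can serve as evidence and the run can be repeated or re-purposed.

```python
from typing import Callable, Dict, List, Tuple

# A step is a (name, service) pair; each service maps one string to another.
Step = Tuple[str, Callable[[str], str]]


def run_workflow(steps: List[Step], initial_input: str) -> Dict[str, str]:
    """Run the steps in order, feeding each output to the next step.

    Every intermediate is stored under its step name so it can be
    inspected later as evidence.
    """
    results: Dict[str, str] = {"input": initial_input}
    current = initial_input
    for name, service in steps:
        current = service(current)
        results[name] = current
    return results


# Hypothetical stand-ins for remote bioinformatics services.
def mask_repeats(seq: str) -> str:
    return seq.replace("tttt", "NNNN")


def to_upper(seq: str) -> str:
    return seq.upper()


workflow = [("mask", mask_repeats), ("normalise", to_upper)]
results = run_workflow(workflow, "acgttttacg")
final = results["normalise"]
```

Re-running the same `workflow` list on a different input is the "repeat" case; swapping one step for another is the "re-purpose" case.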

Page 6: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Williams-Beuren Syndrome
• Contiguous sporadic gene deletion disorder
• 1/20,000 live births, caused by unequal crossover (homologous recombination) during meiosis
• Haploinsufficiency of the region results in the phenotype

[Figure: physical map of chromosome 7 (~155 Mb), highlighting the ~1.5 Mb region at 7q11.23. Labels: WBS, SVAS, patient deletions, CTA-315H11, CTB-51J22, ‘Gap’.]

Page 7: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

1. Identify new, overlapping sequence of interest
2. Characterise the new sequence at nucleotide and amino acid level

Cutting and pasting between numerous web-based services, e.g. BLAST, InterProScan, etc.

12181 acatttctac caacagtgga tgaggttgtt ggtctatgtt ctcaccaaat ttggtgttgt
12241 cagtctttta aattttaacc tttagagaag agtcatacag tcaatagcct tttttagctt
12301 gaccatccta atagatacac agtggtgtct cactgtgatt ttaatttgca ttttcctgct
12361 gactaattat gttgagcttg ttaccattta gacaacttca ttagagaagt gtctaatatt
12421 taggtgactt gcctgttttt ttttaattgg gatcttaatt tttttaaatt attgatttgt
12481 aggagctatt tatatattct ggatacaagt tctttatcag atacacagtt tgtgactatt
12541 ttcttataag tctgtggttt ttatattaat gtttttattg atgactgttt tttacaattg
12601 tggttaagta tacatgacat aaaacggatt atcttaacca ttttaaaatg taaaattcga
12661 tggcattaag tacatccaca atattgtgca actatcacca ctatcatact ccaaaagggc
12721 atccaatacc cattaagctg tcactcccca atctcccatt ttcccacccc tgacaatcaa
12781 taacccattt tctgtctcta tggatttgcc tgttctggat attcatatta atagaatcaa

Page 8: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

• The individual scientist doodling

• Workflows & distributed queries to link up your own and others resources

• Data intensive, up stream pipelines

• Reuse - sharing and adapting workflows & resources, and their outcomes

• Semantic descriptions for discovery, validation & linkage

• Whole experiment lifecycle, including logging provenance

Middleware for data intensive in silico biology by bioinformaticians

[Diagram labels: Discovering and reusing experiments and resources; Managing lifecycle, provenance and results; Sharing services & experiments; Personalisation; Forming experiments; Executing & monitoring experiments]

Page 9: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

An Open World
• Open source
• Open domain services and resources
• Open community
• Open application
– Nothing specific to biology but oriented to
• Open model and open data
– No prescribed typing or domain data model
– A layered information model
• Open architecture
– Service Oriented Architecture
– Loosely coupled
– Web services based
– Assemble your own components
– Designed to work together

[Diagram labels: Taverna; Freefluo; Grimoire; Registry; Event Notification; mIR; Pedro; Annotation; Feta; Discovery; Info. Model; Soaplab; Gowlab; BioNanny; Mediator; Portal; LSIDs; KAVE; DQP]

Page 10: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Stakeholders
• Biologists
• Bioinformaticians
• Service Providers

Page 11: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Activation Energy
• Jam today
• Important for take up and community building.
• Take up leads to much better understanding.
• Energy of bioinformaticians and service providers
• Dealing with lots of legacy remote services
• Incorporating my bits and pieces
• Networking effects
• Added value with added effort

[Graph labels: Cost; Benefit]

Page 12: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Scufl: Simple Conceptual Unified Flow Language
Taverna: Writing, running workflows & examining results
SOAPLAB: Makes applications available
Freefluo: Workflow engine to run workflows

[Diagram labels: Taverna; Freefluo; SOAPLAB Web Service; Any Application; Web Service, e.g. DDBJ BLAST; SeqHound Service; Special processor]

http://taverna.sourceforge.net/

Page 13: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Viewer plug-ins

Service failure protocol

Viewer plug-ins

Page 14: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

[UML class diagram of the information model; multiplicities and the pairing of associations with classes are not recoverable from the extraction.
Class and attribute labels, in extraction order: <<Resource>> Operations.Operation; Annotation.SemanticConcept; Subject; Object; Resources.Resource (+getId:URIString); ProgrammeResource (+name:String); <<Resource>> Study (+name:String, +description:String, +startTime:DateTime, +endTime:DateTime, +status:String); Agent; ExperimentInstance; Investigation; <<Resource>> ExperimentDesign; Programme; LabBookView (+name:String, +rule:String).
Association labels: uses; contains; method; has instances; researchFocus; acts in; initiates; episodes; labBooks; scmInvestigator; has participants; participates in; selected studies.]

Life Science Identifiers

Model Driven Approach

OWL & RDFS Ontologies: to annotate and classify entities with a common vocabulary based on a common understanding.

RDF Knowledge Added Value to Experiment

Information Repository and Common Information model for e-Science
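Life Science Identifiers are named on this slide without further detail. As a hedged illustration, an LSID is a URN of the form `urn:lsid:authority:namespace:object[:revision]`, and the sketch below splits one into its parts. The function name `parse_lsid` and the example identifier are hypothetical and are not taken from the myGrid code base.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> LSID:
    """Split an LSID URN into authority, namespace, object and revision."""
    parts = text.split(":")
    # Expect "urn", "lsid", three mandatory parts and an optional revision.
    if len(parts) not in (5, 6):
        raise ValueError(f"not an LSID: {text!r}")
    if parts[0].lower() != "urn" or parts[1].lower() != "lsid":
        raise ValueError(f"not an LSID: {text!r}")
    if any(part == "" for part in parts[2:5]):
        raise ValueError(f"LSID has an empty component: {text!r}")
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Hypothetical identifier, for illustration only.
example = parse_lsid("urn:lsid:example.org:sequence:ABC123:2")
```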

Page 15: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Williams-Beuren Workflows

Characterisation of nucleotide sequence

Identification of overlapping sequence

Characterisation of protein sequence

Page 16: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

WBS Workflow Experience
• Correct and biologically meaningful results: found all expected results, plus an unnoticed pseudo gene
• Automation: saved time, increased productivity
• Sharing: other people have used and want to develop the workflows, notably for mouse and chicken

Page 17: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Gene annotation pipelines

Microarray analysis pipelines

Find differentially expressed genes, e.g. NF-kappa B inhibitor protein

Autoimmune disease of the thyroid in which the immune system of an individual attacks cells in the thyroid gland resulting in hyperthyroidism

Graves Disease

Page 18: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Trypanosomiasis in cattle

Chicken genome

Mouse genome

Reuse: adapting and sharing best practice and know-how across a community

Chris Wroe, Carole Goble, Antoon Goderis, Phillip Lord, Simon Miles, Juri Papay, Pinar Alper, Luc Moreau. Recycling workflows and services through discovery and reuse. Concurrency and Computation: Practice and Experience, accepted for publication.

Page 19: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

[Architecture diagram; the placement of components within groups is not recoverable from the extraction.
Group labels: Third-party tools; Applications; Core Services; External Services; Service & workflow discovery; Workflow enactment; Metadata Management; Data Management; e-Science coordination; Web Service (Grid Service) communication fabric.
Component labels: Utopia; Haystack; LSID Launchpad; myGrid information model; Feta semantic discovery; GRIMOIRES registry; Web portals; Taverna e-Science workbench; Taverna-Freefluo workflow engine; KAVE metadata store; ProQA provenance manager; myGrid ontology; Soaplab; Gowlab; Termino; Lexical mark-up; Legacy applications; Web Services; OGSA-DAI databases; Web Sites; OGSA-DQP service; e-Science mediator; e-Science process patterns; e-Science events; LSID support; mIR myGrid information repository; Notification service; Pedro semantic publication; Java applications; Executable codes with an IDL; Custom databases.]

Page 20: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

• Taverna currently ships with access to over 1000 services

• But it wasn’t always the case!
• Lack of available services, at least at first
• A lot of activation energy needed, which hopefully gets less as services get pooled

• Service partnerships and network effects

• If your service ain’t there, that’s an obstacle.

First, catch your service

Page 21: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

• Soaplab and Gowlab wrappers
• http://industry.ebi.ac.uk/soaplab/
• WSDL scavenging
• Processor abstraction over stereotypical invocation patterns of service families

• Many services are not plain WSDL

• API consumer in Taverna 1.1

Service Bootstrapping

Page 22: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Classes and Interfaces presented here

User selects appropriate methods to be exposed within Taverna

API Consumer Interface
• Interoperate existing APIs with SOAP services, SoapLab, BioMoby, SeqHound, caBIG, BioJava, etc.
• Refine complex APIs to sets of task centric functionality
• Take advantage of myGrid infrastructure (monitoring, result browsing, provenance, etc.) and apply it to your APIs

•Taverna 1.1 onwards, download API consumer and toolset at http://taverna.sf.net

Page 23: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Import into Taverna

Previously created API definition is imported – methods and constructors appear as components alongside other services.

Page 24: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Invocation Heterogeneity
• WSDL - single Web Service operation described in a WSDL file.
• Local Java or Beanshell function
• Soaplab - CORBA-like stateful protocol of the Web Service operations
• Nested workflow - implemented by a Scufl workflow.
• BioMOBY processor.
• SeqHound - a Representational State Transfer style interface
• BioMart - directly issues queries over a relational database.
• Styx - executes a workflow subgraph containing streamed services using P2P data transfer based on the Styx Grid service protocol.

[Diagram: BLAST exposed through two differently shaped services, both wrapped as Processors.
SOAPLAB BLAST service: createJob(), setProgram(), setDatabase(), setE_value(), run(), getResults()
IBM Life Sciences BLAST service: blastQuery()]
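The processor idea on this slide, one uniform call shape over services with very different invocation styles, can be sketched as an adapter. Everything below is an assumption made for illustration: the class names, the `invoke` signature and the two stub services are hypothetical and only mimic the shape of a stateful Soaplab-style protocol and a single-call BLAST operation. It is not Taverna's processor API.

```python
from abc import ABC, abstractmethod
from typing import Dict


def fake_report(sequence: str, program: str, database: str) -> str:
    """Stand-in for a real BLAST run."""
    return f"{program} hits for {sequence} in {database}"


class StatefulBlastService:
    """Hypothetical multi-call service: create a job, configure it, run it."""

    def __init__(self) -> None:
        self._jobs: Dict[int, Dict[str, str]] = {}

    def create_job(self, sequence: str) -> int:
        job_id = len(self._jobs) + 1
        self._jobs[job_id] = {"sequence": sequence}
        return job_id

    def set_program(self, job_id: int, program: str) -> None:
        self._jobs[job_id]["program"] = program

    def set_database(self, job_id: int, database: str) -> None:
        self._jobs[job_id]["database"] = database

    def run(self, job_id: int) -> None:
        job = self._jobs[job_id]
        job["result"] = fake_report(job["sequence"], job["program"], job["database"])

    def get_results(self, job_id: int) -> str:
        return self._jobs[job_id]["result"]


class SingleCallBlastService:
    """Hypothetical service exposing one operation that does everything."""

    def blast_query(self, sequence: str, program: str, database: str) -> str:
        return fake_report(sequence, program, database)


class Processor(ABC):
    """Uniform shape seen by the workflow: named inputs in, named outputs out."""

    @abstractmethod
    def invoke(self, inputs: Dict[str, str]) -> Dict[str, str]:
        ...


class StatefulBlastProcessor(Processor):
    def __init__(self, service: StatefulBlastService) -> None:
        self.service = service

    def invoke(self, inputs: Dict[str, str]) -> Dict[str, str]:
        # Hide the multi-step protocol behind a single call.
        job_id = self.service.create_job(inputs["sequence"])
        self.service.set_program(job_id, inputs["program"])
        self.service.set_database(job_id, inputs["database"])
        self.service.run(job_id)
        return {"report": self.service.get_results(job_id)}


class SingleCallBlastProcessor(Processor):
    def __init__(self, service: SingleCallBlastService) -> None:
        self.service = service

    def invoke(self, inputs: Dict[str, str]) -> Dict[str, str]:
        report = self.service.blast_query(
            inputs["sequence"], inputs["program"], inputs["database"]
        )
        return {"report": report}


inputs = {"sequence": "acgt", "program": "blastn", "database": "embl"}
stateful_out = StatefulBlastProcessor(StatefulBlastService()).invoke(inputs)
single_out = SingleCallBlastProcessor(SingleCallBlastService()).invoke(inputs)
```

A workflow engine written against `Processor` does not need to know which of the two services sits underneath.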

Page 25: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

[Diagram labels: Taverna Workbench; Scufl + Workflow Object Model; Freefluo workflow enactor; Enactor; processors for Plain Web Service, Soaplab, Local App, BioMOBY, SeqHound and BioMART]

Three tiered abstraction

Application data flow layer: Scufl graph + service introspection

Execution flow layer: list management; implicit iteration mechanism; MIME & semantic type decoration; fault management; service alternates

Processor invocation layer

Workflow Execution
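The "implicit iteration mechanism" listed in the execution flow layer can be illustrated with a deliberately simplified sketch: when a processor that expects a single item receives a list (or a list of lists), it is applied to each element and the nesting is preserved. The function names are hypothetical, and the sketch covers only the single-input case; it makes no claim about how Freefluo combines several list inputs.

```python
from typing import Any, Callable


def invoke_with_implicit_iteration(processor: Callable[[str], str], value: Any) -> Any:
    """Apply a single-item processor to a value of any list depth."""
    if isinstance(value, list):
        # Recurse so the output has the same nesting as the input.
        return [invoke_with_implicit_iteration(processor, item) for item in value]
    return processor(value)


def reverse_complement(seq: str) -> str:
    """Hypothetical single-item processor used for the demonstration."""
    pairs = {"a": "t", "t": "a", "c": "g", "g": "c"}
    return "".join(pairs[base] for base in reversed(seq))


single = invoke_with_implicit_iteration(reverse_complement, "aacg")
nested = invoke_with_implicit_iteration(reverse_complement, [["aacg"], ["tt", "gc"]])
```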

Page 26: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Architecture Confusagram

Tom Oinn, Mark Greenwood, Matthew Addis, M. Nedim Alpdemir, Justin Ferris, Kevin Glover, Carole Goble, Duncan Hull, Darren Marvin, Peter Li, Phillip Lord, Matthew R. Pocock, Martin Senger, Robert Stevens, Anil Wipat and Chris Wroe. Taverna: Lessons in creating a workflow environment for the life sciences. Concurrency and Computation: Practice and Experience, in press.

Page 27: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Soaplab Service

WSDL Web Service

BioMOBY Service

Local Java Service

Page 28: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Workflows are not the only game

Workflows

OGSA-DQP

Applications

e-Science coordination e-Science mediator

e-Science process patterns

e-Science events

Notification service

Mediator

Protein Phosphatases

Page 29: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

?

• How to select among 1000+ services?

• Mostly inputs & outputs are “string”

• Domain specific descriptions of capabilities

• Selection is part of workflow assembly by bioinformaticians

• Selection of alternates for failure also generally user defined, and usually replicas, but need not be.

So many services, so poorly described

Page 30: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Semantic discovery
• Publish and find services (and workflows) with description using an ontology (in OWL/RDF)

• Define domain types for objects passed around and a set of dimensions with which service capabilities can be defined using processor abstraction

• Bootstrapping descriptions

• Mining and maintaining descriptions

• The Expert Annotator

• GRIMOIRE / WebDAV directory

• Tie into BioMOBY central

• http://phoebus.cs.man.ac.uk:8100/feta-beta/mygrid/descriptions/

Phillip Lord, Pinar Alper, Chris Wroe, and Carole Goble. Feta: A light-weight architecture for user oriented semantic service discovery. In Proc. of 2nd European Semantic Web Conference, Crete, June 2005.
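A toy version of discovery by domain type, as opposed to "everything is String", might look like the following. It is a sketch under stated assumptions: the concept names, the `PARENT` hierarchy, the service descriptions and `find_services` are all hypothetical, and a plain dictionary stands in for an OWL/RDF ontology. It is not Feta's query interface.

```python
from typing import Dict, List, Optional

# Hypothetical is-a hierarchy: child concept -> parent concept.
PARENT: Dict[str, str] = {
    "nucleotide_sequence": "sequence",
    "protein_sequence": "sequence",
    "sequence": "data",
    "blast_report": "data",
}


def is_a(concept: Optional[str], ancestor: str) -> bool:
    """True if concept equals ancestor or is one of its descendants."""
    while concept is not None:
        if concept == ancestor:
            return True
        concept = PARENT.get(concept)
    return False


# Hypothetical descriptions annotated with semantic input and output types.
DESCRIPTIONS: List[Dict[str, str]] = [
    {"name": "blastn_search", "input": "nucleotide_sequence", "output": "blast_report"},
    {"name": "blastp_search", "input": "protein_sequence", "output": "blast_report"},
    {"name": "sequence_length", "input": "sequence", "output": "data"},
]


def find_services(
    descriptions: List[Dict[str, str]], have: str, want: Optional[str] = None
) -> List[str]:
    """Names of services that accept `have` and, if given, produce `want`."""
    matches = []
    for description in descriptions:
        accepts = is_a(have, description["input"])
        produces = want is None or is_a(description["output"], want)
        if accepts and produces:
            matches.append(description["name"])
    return matches


any_output = find_services(DESCRIPTIONS, "nucleotide_sequence")
reports_only = find_services(DESCRIPTIONS, "nucleotide_sequence", want="blast_report")
```

With string typing alone, all three services would appear to accept the same input; the semantic annotation is what rules out `blastp_search` here.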

Page 31: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Web Interface

Processor

API

Processor

API

Semantic Web Services: Layered model

Generic Schema for Service (part of Information model)

Specific Application Ontology, e.g. caCORE

Wroe C, Goble CA, Greenwood M, Lord P, Miles S, Papay J, Payne T, Moreau L. Automating Experiments Using Semantic Data on a Bioinformatics Grid. IEEE Intelligent Systems, Jan/Feb 2004.

We don’t describe WSDL, we describe operations and processors

We are classifying for people not machines, so don’t be too clever!

Page 32: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Service: name, description, author, organisation

Operation: name, description, task, method, resource, application

Parameter: name, description, semantic type, format, transport type, collection type, collection format

Relations: hasInput, hasOutput

Subclass labels: WSDL based Web service; WSDL based operation; Soaplab service; bioMoby service; workflow; Local Java code
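The Service, Operation and Parameter descriptions on this slide map naturally onto three record types. The sketch below uses the attribute names from the slide; the assumptions are that hasInput and hasOutput link an Operation to its Parameters, that a Service owns a list of Operations, and that all values in the example are invented for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Parameter:
    name: str
    description: str = ""
    semantic_type: str = ""
    format: str = ""
    transport_type: str = ""
    collection_type: str = ""
    collection_format: str = ""


@dataclass
class Operation:
    name: str
    description: str = ""
    task: str = ""
    method: str = ""
    resource: str = ""
    application: str = ""
    # Assumed reading of the hasInput / hasOutput relations.
    inputs: List[Parameter] = field(default_factory=list)
    outputs: List[Parameter] = field(default_factory=list)


@dataclass
class Service:
    name: str
    description: str = ""
    author: str = ""
    organisation: str = ""
    # Assumption: a service groups one or more operations.
    operations: List[Operation] = field(default_factory=list)


# Invented example values.
query = Parameter(name="query", semantic_type="nucleotide_sequence", format="fasta")
report = Parameter(name="report", semantic_type="blast_report", format="text")
search = Operation(
    name="search",
    task="similarity_search",
    application="blast",
    inputs=[query],
    outputs=[report],
)
service = Service(name="example_blast", organisation="example.org", operations=[search])
input_types = [p.semantic_type for op in service.operations for p in op.inputs]
```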

Page 33: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Service hassles
• The workflows are only as good as the services they link together.
• Licensing models
• Instability and unreliability
• BioNanny + QoS registry description
• Configurable fault tolerance and fail over strategies for graceful failure

• Few alternates and genuine replica services
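A configurable fail over strategy of the kind mentioned above could be sketched as "retry each service a few times, then move to the next alternate". The function name, parameters and the two stub services are hypothetical; this is not Taverna's fault-tolerance configuration, only the general pattern.

```python
import time
from typing import Callable, Dict, List, Sequence


class AllServicesFailed(Exception):
    """Raised when the primary service and every alternate have failed."""


def invoke_with_failover(
    alternates: Sequence[Callable[[str], str]],
    payload: str,
    retries: int = 2,
    delay: float = 0.0,
) -> str:
    """Try each service in order, retrying each one before moving on."""
    errors: List[str] = []
    for service in alternates:
        label = getattr(service, "__name__", repr(service))
        for attempt in range(retries + 1):
            try:
                return service(payload)
            except Exception as exc:  # a remote call can fail in many ways
                errors.append(f"{label} attempt {attempt + 1}: {exc}")
                if attempt < retries:
                    time.sleep(delay)
    raise AllServicesFailed("; ".join(errors))


# Hypothetical services: an unavailable primary and a working replica.
calls: Dict[str, int] = {"primary": 0}


def primary(seq: str) -> str:
    calls["primary"] += 1
    raise ConnectionError("service unavailable")


def replica(seq: str) -> str:
    return f"report for {seq}"


result = invoke_with_failover([primary, replica], "acgt", retries=1)
```

As the slide notes, the pattern only helps when alternates or genuine replicas exist; with a single service in the list it degrades to plain retrying.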

Page 34: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Type management: Shims

[Workflow diagram. Goal: ‘I want to identify new sequences which overlap with my query sequence and determine if they are useful’.
Step labels: Mask; BLAST; Simplify and Compare; Lister; Retrieve; BLAST2.
Data labels: Sequence (i.e. last known 3000bp); Identify new sequences and determine their degree of identity; Old BLAST result; Sequence database entry; Fasta format sequence; Genbank format sequence; Alignment of full query sequence v full ‘new’ sequence.]

• The fiddly bits necessitated by not having a common type system or object model, or building elaborate wrappers

• Adding functionality to Web Services

• Shim libraries; Automatic deployment at workflow assembly

• Beanshell scripts for quick and dirty scripting
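A shim is typically a few lines of format conversion between two services. As a hedged example, the sketch below turns numbered, space-separated sequence lines (the GenBank-style layout) into FASTA. The function name and the identifier `query` are hypothetical, and the code assumes it is given only the numbered sequence lines, not a full database entry.

```python
import re


def genbank_origin_to_fasta(origin_text: str, identifier: str, width: int = 60) -> str:
    """Convert numbered sequence lines to a FASTA record.

    Assumes origin_text holds only position numbers and bases, so every
    run of letters is sequence data.
    """
    bases = "".join(re.findall(r"[A-Za-z]+", origin_text)).lower()
    lines = [bases[i:i + width] for i in range(0, len(bases), width)]
    return "\n".join([f">{identifier}"] + lines) + "\n"


fragment = "12181 acatttctac caacagtgga tgaggttgtt"
fasta = genbank_origin_to_fasta(fragment, "query", width=20)
```

In Taverna terms, a conversion like this would sit between a service that emits database entries and one that only accepts FASTA.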

Page 35: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

• Put the workflow together to duplicate how they did the linking without duplicating how they did the on-the-fly integration

• Post hoc analysis. Don’t analyse data piece by piece; receive all the data at once

• Service interoperability but fragmented results

• Because integration needs smarter workflows and smart thinking about data types.

• Close the world with Shims or services and build domain objects.

• Smarter ways of visualising and linking intermediate results using provenance graphs

• Custom visualisation application

Provenance Record

Result Result Result Result Result

Input

Workflow Practices
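Linking fragmented results through a provenance record can be illustrated with a small graph walk: each data item remembers which processor produced it and from which inputs, so any result can be traced back to what it was derived from. The record layout, the identifiers and the `lineage` function are assumptions made for this sketch, not the myGrid provenance format.

```python
from typing import Dict, List, NamedTuple


class ProvenanceEntry(NamedTuple):
    processor: str      # which step produced the data item
    inputs: List[str]   # ids of the data items it consumed


def lineage(record: Dict[str, ProvenanceEntry], data_id: str) -> List[str]:
    """Ids of every item data_id was derived from, nearest ancestors first."""
    seen: List[str] = []
    queue = list(record[data_id].inputs) if data_id in record else []
    while queue:
        current = queue.pop(0)
        if current in seen:
            continue
        seen.append(current)
        if current in record:
            queue.extend(record[current].inputs)
    return seen


# Hypothetical record for a run of an overlap-finding workflow.
record = {
    "masked_seq": ProvenanceEntry("mask", ["query_seq"]),
    "blast_report": ProvenanceEntry("blast", ["masked_seq"]),
    "new_seqs": ProvenanceEntry("retrieve", ["blast_report"]),
    "alignment": ProvenanceEntry("blast2", ["query_seq", "new_seqs"]),
}
ancestors = lineage(record, "alignment")
```

A visualisation tool could use the same walk to group every intermediate result under the final result it contributed to.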

Page 36: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Gene annotation pipeline workflow Integration and visualisation of GD annotation workflow results

Provenance Record

Custom Data Model

Input

Result

Integrated results

Page 37: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Integration and interoperation

[Diagram: two services compared layer by layer. Layer labels, shown once per service: e-Science Semantics; Configuration; Invocation model; Interface; Data format; Domain Semantics; Syntax.
Bridging labels: Provenance Annotation; Service & Data Annotation; App & Shim Services; Information Model; Data identity; LSID; Ontologies; Custom Data Objects; Workflows; Processors; Shims.]

• Information model is a container for domain semantics
• Linking stuff together is Integration Lite

Page 38: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Take Homes
• Our apps are providing real scientific results – or at least the hypotheses…
• The problem is not really gathering and coordinating services, but gathering and coordinating the results
• Are you interoperating or integrating?
• Careful thought has to go into the abstractions we apply to services for finding them and running them
• Activation energy vs reusability of service: ROI and altruism
• We need more services, more replicas of services, better service interfaces and better reliability and stability
• Most of our services turn out not to be vanilla WSDL
• Light touch vs added value

Page 39: Performing  In silico  Experiments in a Service Based Architecture:  Solutions and Issues

Performing In silico Experiments in a Service Based Architecture: Solutions and Issues

Chris Wroe, Phillip Lord, Robert Stevens & Carole Goble

The University of Manchester, UK

http://www.mygrid.org.uk

