BioMoby and Taverna 2 Tutorial

Post on 09-Jan-2016

Preamble

► The Taverna 2 platform is constantly changing; while the look and feel of the workbench may change, the functionality won’t!

Getting started

If you don’t see a ‘Biomoby’ folder, you need to tell Taverna to go and fetch it for you!

From the ‘Service Panel’:

Import new services -> Biomoby service…

Choose the ‘default’ registry (more on this later!)

Now we see that Taverna has fetched Biomoby services for us to use.

Installing a Plugin

We will also make sure that the Spreadsheet import plugin is installed!

To install the plugin, from the toolbar, click on Advanced and then on Updates and plugins.

In the resulting window, click on Find New Plugins.

Among others, we discovered the Taverna 2 Spreadsheet activity. Check the box and then click on Install.

The plugin is being installed! Once it has been fully installed, you will have to restart the workbench.

Here we see the activity under the Service templates node in the service panel.

► The Service panel lists all of the services available to a workflow designer.

► Under the node ‘Biomoby’, Moby services are shown.

► Services are sorted by their Biomoby Service type.

► If you wish to use registries other than the default one, you can add new Biomoby Activities exactly as we did for the default registry.

► Taverna even remembers the registry (or registries) that you chose when you restart the workbench!

Creating Workflows

► We will start by adding a Biomoby datatype to the workflow.

► From the menu bar, click on Advanced -> Biomoby Datatype Browser -> http://…

► The Datatype viewer should be visible now.
► Context click on the root node of the tree (Object) and Add Datatype – ‘Object’ to the workflow.

► The Workflow Explorer now shows that we have a Processor called Object
  • Has 3 input ports: id, namespace and article name
  • Has 1 output port: mobyData
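
The three input ports correspond to attributes carried in a BioMoby message. As a rough illustration only, the sketch below builds such a message in Python. It assumes the usual BioMoby XML envelope (MOBY / mobyContent / mobyData / Simple) and the namespace URI shown; the exact message Taverna places on mobyData may differ, and the helper names `q` and `build_object_message` and the `query_id` default are hypothetical.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"  # assumed BioMoby XML namespace
ET.register_namespace("moby", MOBY_NS)


def q(name):
    """Return the namespace-qualified form of a BioMoby tag or attribute."""
    return f"{{{MOBY_NS}}}{name}"


def build_object_message(obj_id, namespace, article_name="", query_id="a1"):
    """Wrap a base Object (id + namespace) in a BioMoby-style envelope."""
    root = ET.Element(q("MOBY"))
    content = ET.SubElement(root, q("mobyContent"))
    data = ET.SubElement(content, q("mobyData"), {q("queryID"): query_id})
    simple = ET.SubElement(data, q("Simple"), {q("articleName"): article_name})
    ET.SubElement(simple, q("Object"), {q("id"): obj_id, q("namespace"): namespace})
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # The id and namespace values are the ones used later in this tutorial.
    print(build_object_message("656461", "NCBI_gi", "identifier"))
```

The point of the sketch is only that id, namespace and article name end up as attributes around a single Object element, which is what downstream Biomoby services consume.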

► The Workflow diagram illustrates our processor.

► If we click on our Datatype, and then on the Details tab, we can do interesting things with our Object.

► The Details tab provides us with some information on our datatype.

► Please click on the Datatype registry query button to proceed.

► To find those services that operate on a specific domain, we can restrict our search to only those that operate on the namespace(s) that we specify.

► Click Yes, then navigate to the namespace NCBI_gi and click Done.

► The resulting window illustrates which services produce and consume our datatype.

► Navigate to the bioinfo.icapture.ubc.ca node and context click on getGenBankFasta.

► Add the service to the workflow.

► Notice how Taverna automatically made the appropriate connection from our datatype to our Biomoby service.

► The Workflow Explorer now shows that we have a Processor called getGenBankFasta
  • Has 1 input port: Object(identifier)
  • Has 1 output port: FASTA(fasta)

► The Workflow diagram illustrates our processor.

► To discover more services that we can use, click on the getGenBankFasta activity, then click on the ‘Details’ tab and finally click on the Browse Biomoby service details button.

► The resultant window displays the service’s inputs and outputs.

► There are also tool tips that show up when your mouse hovers over any particular input or output; they tell you which namespaces the data type is valid in.

► FYI
► If we context (right) click on the leaf Object(‘identifier’), we can bring up a menu that will allow us to add that datatype to our workflow.
► Not only would the datatype be added, but Taverna will attempt to make the appropriate connections for you too!

► Context clicking on an output reveals a menu with 3 options:
  • A brief search for services that consume our datatype
  • A semantic search for services that consume our datatype
  • Adding a parser to the workflow that understands our datatype

► The result of choosing to add a parser for FASTA to our workflow.

► The parser allows us to extract:
  • The namespace and id from FASTA
  • The namespace and id from the child String
  • The textual content from the child String
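
To make the three extractions concrete, here is a minimal Python sketch of what such a parser does. It is not Taverna's parser: the XML shape (a FASTA element with a child String whose article name is content), the namespace URI, and the sample sequence text are all assumptions made for illustration, and `parse_fasta_object` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

MOBY_NS = "http://www.biomoby.org/moby"  # assumed BioMoby XML namespace


def q(name):
    """Return the namespace-qualified form of a BioMoby tag or attribute."""
    return f"{{{MOBY_NS}}}{name}"


# Hand-written stand-in for a FASTA object; the sequence text is made up.
SAMPLE = f"""
<moby:FASTA xmlns:moby="{MOBY_NS}" moby:namespace="NCBI_gi" moby:id="656461">
  <moby:String moby:namespace="" moby:id="" moby:articleName="content">&gt;example
ACGTACGT</moby:String>
</moby:FASTA>
"""


def parse_fasta_object(xml_text):
    """Pull out the fields the tutorial lists for the FASTA parser."""
    fasta = ET.fromstring(xml_text.strip())
    child = fasta.find(q("String"))
    return {
        "namespace": fasta.get(q("namespace")),        # namespace of FASTA
        "id": fasta.get(q("id")),                      # id of FASTA
        "content_namespace": child.get(q("namespace")),  # namespace of child String
        "content_id": child.get(q("id")),              # id of child String
        "content": child.text,                         # plain FASTA text
    }


if __name__ == "__main__":
    print(parse_fasta_object(SAMPLE))
```

Only the `content` field is plain text; that is the value a non-Moby service would want.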

► The result of choosing to conduct a brief search for services that consume FASTA.

► We will add the service getDragonBlastText to our workflow by choosing ‘Add service -…’ from the context menu.

► The current state of our workflow shown graphically.

► Again, Taverna made a guess to determine the appropriate connections. Sometimes the guess isn’t correct, but usually it is.

► A more complex view of our workflow.

► Finding services that consume NCBI_BLAST_Text starts by browsing the details for the Biomoby service ‘getDragonBlastText’.

► Conduct a brief search.

► Add the service ‘parseBlastText’ to our workflow.

► Our current workflow.

► Workflow inputs are added by clicking on the little red triangle (located on the main menu bar).

► The result from adding 2 inputs:
  • id
  • namespace

► The workflow input id will be connected to Object’s input port ‘id’. You can make the connection by clicking on the workflow input id and dragging it to the Object’s input port ‘id’.

► Workflow after connecting the workflow input ‘id’.

► The workflow input namespace will connect to Object’s input port ‘namespace’.

► Workflow after connection of the workflow inputs.

► Workflow outputs are added by clicking on the green triangle (located on the toolbar).

► The result from adding 2 workflow outputs:
  • moby_blast_ids
  • fasta_out

► The output moby_blast_ids will be connected to parseBlastText’s output port Object(Collection –’hit_ids’).

► You can make the connection by dragging the parseBlastText’s output port to the workflow output port.

► The output fasta_out will be connected to Parse Moby Data(FASTA) output port fasta_’content’.
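
At this point the workflow is a small directed graph of processors and data links. The sketch below writes those links down at processor level and derives one valid execution order. It is only a model of the diagram, not how Taverna stores or schedules workflows; the `in:` and `out:` prefixes for workflow ports and the function name `execution_order` are conventions invented here.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# (source, sink) pairs taken from the steps above; port names are omitted.
LINKS = [
    ("in:id", "Object"),
    ("in:namespace", "Object"),
    ("Object", "getGenBankFasta"),
    ("getGenBankFasta", "Parse Moby Data(FASTA)"),
    ("getGenBankFasta", "getDragonBlastText"),
    ("getDragonBlastText", "parseBlastText"),
    ("parseBlastText", "out:moby_blast_ids"),
    ("Parse Moby Data(FASTA)", "out:fasta_out"),
]


def execution_order(links):
    """Return the nodes in an order where every source precedes its sinks."""
    sorter = TopologicalSorter()
    for source, sink in links:
        sorter.add(sink, source)  # the sink depends on the source
    return list(sorter.static_order())


if __name__ == "__main__":
    for node in execution_order(LINKS):
        print(node)
```

Writing the links out this way makes it easy to check that every workflow output is reachable from the two workflow inputs.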

► To run the workflow, click on File from the toolbar.
► Choose ‘Run workflow’.

► A prompt to add values to our 2 workflow inputs.

► To add a value to the input ‘id’, click on the id tab and choose ‘New value’.

► Enter 656461 as the id.

► Choose the namespace tab and click on ‘New value’.

► Enter NCBI_gi as the value for namespace.
► Once you are done, click on ‘Run workflow’.

► Our workflow in action.

► Once the workflow is complete, we can examine the results of our workflow.

► We may not have results for moby_blast_ids*
► * We may need to configure the service to be less stringent. More on this later!

► Without the parser, FASTA is represented as a Moby message, fully enclosed in its wrapper.

► Non-moby services do not expect this kind of message.
  • Example for moby_blast_ids:

► Non-moby services expect just the sequence, and using the Parse Moby Data(FASTA) processor, we can extract just that.

► Moby services can interact with the other services in Taverna.

► Let’s add a Soaplab service.

► We will choose a nucleic_restriction Soaplab service called ‘restrict’.

► Drag it into our workflow.

► We will connect the output port fasta_’content’ from the service Parse Moby Data(FASTA) to the input port ‘sequence_direct_data’ from the service restrict.

► Context click on the Parse Moby Data(FASTA) and choose Link from output … fasta_’content’.

► Then click on the Soaplab service.
► In the resulting list menu, choose sequence_direct_data.

► The result of our actions so far.

► We will need to add another workflow output to capture the output of restrict.

► Create an output called restrict_out.

► Connect the output port ‘outfile’ from the service restrict to the workflow output restrict_out.

► Once the connections have been made, run the workflow again using the same inputs.

► The workflow on top has some extra services added to it:
  • FASTA2HighestGenericSequenceObject from the authority bioinfo.icapture.ubc.ca, a conversion service
  • runRepeatMasker from the authority genome.imim.es, an analysis service
  • A Moby parser for the output DNASequence from runRepeatMasker
  • A workflow output Masked_Sequence

► Add them to your workflow.

► The service runRepeatMasker is configurable, i.e. it consumes Secondary parameters.

► To edit these parameters, click on the service, and in the Details tab choose ‘Configure’.

► The name of the parameter is on the left and the value is on the right.

► Clicking on the Value will bring up a drop-down menu, an input text field, or any other appropriate field depending on the parameter.

► The parameter species contains an enumerated list of possibilities.

► Select human.
► When you have made your selection, close the window.

► Before we run the workflow again, we will make the workflow input port id a port that can take in a list of strings.
  • Context click on the id port and choose to edit the port.
  • Then give the port a List of depth 1 port type.

► Let’s run the workflow.

► We will run our workflow with a list of values.
  • Click on the id tab and then click on New value twice.

► Enter 656461 and 654321 as the ids.
► Enter NCBI_gi as the value for namespace.
► Our workflow will now run using each id with the single namespace.
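
The effect of giving id a depth-1 list while namespace stays a single value can be pictured as below. This is a rough model of the pairing only, not Taverna's iteration machinery; `iterate_over_ids` is a hypothetical name.

```python
def iterate_over_ids(ids, namespace):
    """Pair every id in a depth-1 list with the single namespace value."""
    return [{"id": one_id, "namespace": namespace} for one_id in ids]


# The two ids and the namespace entered in the steps above.
runs = iterate_over_ids(["656461", "654321"], "NCBI_gi")

if __name__ == "__main__":
    for run in runs:
        print(run)
```

Each dictionary stands for one invocation of the Object processor, so two ids give two runs through the downstream services.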

► Imagine now that you want to run the workflow using a FASTA sequence that you input yourself (without the gi identifier).

► To do this, context click on getDragonBlastText and choose Browse Biomoby service details.
  • Expand the Inputs node and context click on FASTA(‘sequence’).
  • Choose Add Datatype – FASTA(‘sequence’) to the workflow.

► A FASTA datatype will be added to the workflow and the appropriate links created.

► Notice the datatype FASTA on the left of the workflow.
  • Since the datatype FASTA hasa String, a String was also added to our workflow and the appropriate connection was made.

► We will now have to add another workflow input and connect it to the String component of FASTA.

► A workflow input ‘sequence’ was added to the workflow and a connection was made from the workflow input to the input port ‘value’ of String.

► We also removed the link between getGenBankFasta and getDragonBlastText by context clicking on the link on the workflow diagram and choosing to remove the link.

► Now when we choose to run our workflow, we will also have the chance to enter a FASTA sequence.

► In addition, we configured the getDragonBlastText service and made the evalue 10,000.

► Go ahead and enter any FASTA sequence as the input to the workflow input ‘sequence’.

► Run the workflow.

► Any results can be saved by simply choosing to Save result.
  • You will be prompted to enter a directory to save the results.

Using the Spreadsheet Import Plugin

Let’s remove the Sequence workflow input from our workflow, so that our workflow looks similar to the one pictured here.

• We would like to import a spreadsheet with our data in it.
• Our data has 2 columns: sequence id and sequences.
• You can obtain the demo spreadsheet from: http://dev.biordf.net/~kawas/sequences.xls

From the Service panel, we need to navigate to the node SpreadsheetImport located directly below the Service templates node.

When we add the import plugin, we are asked to configure it.

• In the columns section, we will only import our sequences (B to B).
• In the rows section, we will import all rows and exclude the header row.
• In the column to port name mapping, we will map column B to fasta_sequence.

Once we click on the Ok button, we will see our SpreadsheetImport activity on the canvas.

Notice that the output is fasta_sequence.
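
The configuration above (column B only, header row excluded, column B mapped to fasta_sequence) amounts to a simple column selection. The Python sketch below mimics it on rows that are already in memory; it does not read .xls files, it is not the plugin's code, and both the function name `import_columns` and the sample rows are made up for illustration.

```python
def import_columns(rows, column_to_port, skip_header=True):
    """Return {port_name: [cell, ...]} for the selected spreadsheet columns.

    rows: list of rows, each a list of cell values
    column_to_port: mapping from a single column letter to a port name
    """
    data_rows = rows[1:] if skip_header else rows
    ports = {}
    for letter, port in column_to_port.items():
        index = ord(letter.upper()) - ord("A")  # "A" -> 0, "B" -> 1
        ports[port] = [row[index] for row in data_rows]
    return ports


# Made-up rows standing in for the two columns of the demo spreadsheet.
rows = [
    ["sequence id", "sequence"],
    ["seq1", ">seq1\nACGT"],
    ["seq2", ">seq2\nTTGA"],
]
outputs = import_columns(rows, {"B": "fasta_sequence"})

if __name__ == "__main__":
    print(outputs)
```

The result is one output port, fasta_sequence, holding a list with one entry per data row, which is why the activity's output can feed a processor that iterates over a list.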

Since our spreadsheet contains FASTA sequences, we need to connect the output port of the import activity to the value input port of our String activity.

Next we will add a widget that will prompt us for our spreadsheet.

Navigate to the Select File service located under the Local services -> ui node.

Add it to the workflow!

Once we add the Select File widget to our workflow, we need to connect the output port, selectedFile, to the input port of SpreadsheetImport (fileurl).

We now need to give the input ports for the Select File activity constant values!

This can be done by context clicking on each of the input ports and choosing Set constant value from the resulting menu.

• fileExtensions – the extensions of the files we are interested in; set to xls
• title – the title to give the select file widget; set to Select spreadsheet
• fileExtLabels – the labels to give the extensions we provided above; set to Excel Spreadsheet

Note: if the input ports for this widget are not filled in, it will fail to run!
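
As a quick checklist for the note above, the three constants can be written down and checked for completeness before running. This is plain Python bookkeeping, not a Taverna API; `SELECT_FILE_CONSTANTS` and `missing_inputs` are hypothetical names.

```python
# The three constant values set in the steps above.
SELECT_FILE_CONSTANTS = {
    "fileExtensions": "xls",
    "title": "Select spreadsheet",
    "fileExtLabels": "Excel Spreadsheet",
}

REQUIRED_PORTS = ("fileExtensions", "title", "fileExtLabels")


def missing_inputs(constants, required=REQUIRED_PORTS):
    """List the required input ports that have no (or an empty) value."""
    return [name for name in required if not constants.get(name)]


if __name__ == "__main__":
    print(missing_inputs(SELECT_FILE_CONSTANTS))
```

An empty list means every input port of the widget has a value.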

Run the workflow again using the inputs that we have previously used.

You will be prompted to open your spreadsheet. Do so and see Taverna utilize your data!

Some results produced by using the spreadsheet data.

DOWNLOAD THE WORKFLOW

Get the workflow for this tutorial from http://dev.biordf.net/~kawas/t2_biomoby_tutorial.t2flow