Date post: | 20-Jan-2016 |
Upload: | bethany-lee |
Analysis: Discovery of possible regulatory motifs
What follows is a simulation of the proposed graphical interface. As you go through the simulation please consider what capabilities you would want to serve your research and annotation interests.
A narrative to help you go through the simulation appears in a red-bordered box, such as the one below.
To begin:
1. Click on Slide Show (on the upper toolbar)
2. Click View Show
3. Click the Continue button
Scenario 5
You’ve decided you want to know what regulates the expression of nif genes, encoding the machinery for nitrogen fixation. Here’s your strategy:
• Collect nif genes from Anabaena PCC 7120 into set
• Include in set orthologs of the Anabaena genes
• Extract 5’ sequences from all genes in set
• Analyze set of 5’ sequences for motifs
• (Search for other genes with same motifs)
Analysis: Discovery of possible regulatory motifs
Build set Display set Modify set Set operation
Click on Build set to begin finding orfs with the desired specifications.
All items in
All open reading frames of
All amino acid sequences of
All intergenic regions of
Human-annotated orfs of
Private set
Public set
Build set Display set Modify set Set operation Cancel
Choose set type
The first goal is to find all open reading frames within Anabaena PCC 7120 annotated as nif genes, so click on All open reading frames of.
All items in All open reading frames of
Arthrospira platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Anabaena PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium
Unicellular
Filamentous
All
Display set Modify set Set operation Cancel
Choose set type Choose database
Build set
Click on Anabaena PCC 7120
All items in Anabaena PCC 7120
Display set Modify set Set operation Cancel
such that:
Variable Data Operation Function Done
Choose database
Build set
All open reading frames of
Choose set type
You want to compare the description of each orf with “nif”. To get a tool to extract the description, click on Function.
All items in Anabaena PCC 7120
Display set Modify set Set operation Cancel
such that:
Variable Data Operation Function Done
Choose database
Closest ortholog of
Protein product of
Upstream region of
Downstream region of
Description of
Category of
Annotation level of
Description of
Choose function
(item
Build set
All open reading frames of
Choose set type
Click on Description of.
All items in
Display set Modify set Set operation Cancel
Variable Data Operation Function Done
Description of
Choose function
(item) =
includes
excludes
Op
Build set
You want to find orfs whose description includes the word “nif”. Click on includes.
Anabaena PCC 7120 such that:
Choose database
All open reading frames of
Choose set type
All items in
Display set Modify set Set operation Cancel
Data Operation Function Done
includes
Op
nif
Type description term(s)
Build set
Description of
Choose function
(item)
You can type in any characters to search for. For this simulation, the term “nif” is provided. Press the Enter key.
No more specifications. Press the Done button.
Done
Save results and script
Save only results
If this were a complicated search, you might want to save the specifications as a script. In this case, just save the results by clicking on Save only results.
7120 nif genes
Type name of set
All orfs of Anabaena whose descriptions include “nif” will be collected into a set. You can name the set anything you want. For this simulation, a name is provided. Press the Enter key.
Build set Display set Modify set Set operation
Anab7120:all0687 hupL [NiFe] uptake hydrogenase large subunit, C terminus
Anab7120:all0687 hupL [NiFe] uptake hydrogenase large subunit, N terminus
Anab7120:all0688 hupS [NiFe] uptake hydrogenase small subunit
Anab7120:alr0692 similar to nifU
Anab7120:alr0874 nifH2 dinitrogenase reductase
Anab7120:asr1309 similar to nifU
Anab7120:alr1407 nifV1 homocitrate synthase
Anab7120:asr1408 nifZ iron-sulfur cofactor synthesis
Anab7120:asr1409 nifT
Done
Set: 7120 nif genes
<< more items >>
This is the result of the search. The set is displayed both as a list of orfs and as a graphical representation of the genetic neighborhood of each orf. You can find out more about an orf by clicking its name or its arrow. For now, just press Continue.
This search, like most, is only a beginning. It brought up some unintended hits (the search for “nif” also matched “NiFe”). More seriously, it brought up many genes that are probably in the middle of operons and unlikely to be preceded by regulatory motifs. The genetic neighborhood gives clues as to operon structure. Select the two most likely orfs to begin operons by clicking on the circles next to alr0874 and alr1407.
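The unintended “NiFe” hits come from plain substring matching. The sketch below contrasts a loose test with a stricter gene-name pattern, using four descriptions copied from the result list. It assumes that the includes operator is a case-insensitive substring test (the simulation does not say) and that nif gene names look like “nif” followed by a capital letter; the function names are made up for illustration.

```python
import re

# (orf id, description) pairs copied from the simulated result list.
ORFS = [
    ("all0688", "hupS [NiFe] uptake hydrogenase small subunit"),
    ("alr0692", "similar to nifU"),
    ("alr0874", "nifH2 dinitrogenase reductase"),
    ("asr1409", "nifT"),
]

def includes(description, term):
    """Assumed behaviour of 'includes': case-insensitive substring test."""
    return term.lower() in description.lower()

# Stricter, case-sensitive pattern: 'nif' at a word boundary plus a capital letter.
NIF_GENE = re.compile(r"\bnif[A-Z]")

def names_nif_gene(description):
    """True if the description mentions something shaped like a nif gene name."""
    return NIF_GENE.search(description) is not None

loose = [orf for orf, desc in ORFS if includes(desc, "nif")]
strict = [orf for orf, desc in ORFS if names_nif_gene(desc)]
print(loose)   # all four orfs, including the [NiFe] hydrogenase
print(strict)  # the hydrogenase subunit is dropped
```

The stricter pattern removes the hydrogenase hit, but it cannot tell which orfs begin operons; that still takes the genetic neighborhood and your judgment.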
Let’s suppose you proceed in a like fashion through the rest of the list. Press Done.
Build set Display set Modify set Set operation Done
Set: 7120 nif genes
Anab7120:alr0874 nifH2 dinitrogenase reductase
Anab7120:alr1407 nifV1 homocitrate synthase
Anab7120:all1438 nifE nitrogenase Fe/Mo cofactor
Anab7120:all1455 nifH dinitrogenase reductase
Anab7120:all1517 nifB nitrogen fixation protein
Anab7120:alr2968 nifV2 homocitrate synthase
The set now consists of the six Anabaena nif genes that you judged most likely to be preceded by transcriptional signals. It might be interesting to see where this set is located on the genome. To do this, click Display set, then make some room by clicking on Show graphic.
Display set
Show orf ID
Show gene name
Show description
Show coordinates
Show graphic
Show neighbors: +/- 1
Show map
Replace the space-consuming description with coordinates by clicking on Show
description, and then click Show coordinates and finally Show map.
Build set Display set Modify set Set operation Done
Set: 7120 nif genes
Anab7120:alr0874 nifH2 1008496 -> 1009389
Anab7120:alr1407 nifV1 1671878 -> 1673011
Anab7120:all1438 nifE 1696389 <- 1697831
Anab7120:all1455 nifH 1713396 <- 1714283
Anab7120:all1517 nifB 1776670 <- 1778097
Anab7120:alr2968 nifV2 3609625 -> 3611012
Anabaena chromosome (6413771 bp)
Four of the six putative nif operons are clustered near 1.7 Mb... but back to business. Our idea was to extend the set to include orthologs in other nitrogen-fixing cyanobacteria. To do this, click Set operation, then Transformations, then Ortholog of.
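The clustering visible on the map can also be read straight off the coordinates. The sketch below takes the left coordinate of each orf as listed in the set and looks for the largest group that fits in a fixed window. The 150 kb window and the function name are assumptions chosen for illustration, not part of the proposed interface.

```python
# Left coordinates (bp) of the six orfs, as listed in the set display.
GENE_STARTS = {
    "nifH2": 1008496,
    "nifV1": 1671878,
    "nifE": 1696389,
    "nifH": 1713396,
    "nifB": 1776670,
    "nifV2": 3609625,
}
CHROMOSOME_BP = 6413771

def largest_cluster(positions, window_bp):
    """Return the names of the largest group of genes lying within window_bp."""
    ordered = sorted(positions.items(), key=lambda item: item[1])
    best = []
    for index, (_, start) in enumerate(ordered):
        # Collect every gene that starts no more than window_bp after this one.
        group = [name for name, pos in ordered[index:] if pos - start <= window_bp]
        if len(group) > len(best):
            best = group
    return best

cluster = largest_cluster(GENE_STARTS, window_bp=150_000)
span = max(GENE_STARTS[g] for g in cluster) - min(GENE_STARTS[g] for g in cluster)
print(cluster, span, f"{span / CHROMOSOME_BP:.1%} of the chromosome")
```

With these numbers the group is nifV1, nifE, nifH, and nifB, which matches the four operons the map shows near 1.7 Mb.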
Set operation
Maintenance
Set operations
Analysis tools
Discovery tools
Transformations
Closest ortholog of
Protein product of
Upstream region of
Downstream region of
Ortholog of
Orthologs of (
Build set Display set Modify set Set operation Cancel
All open reading frames of
All amino acid sequences of
All intergenic regions of
Human-annotated orfs of
Public set
Private set
Choose set type
You want the orthologs of the orfs in the set you just made. This set is yours – a private
set – as opposed to certain sets that are available to all users. Click Private set.
Orthologs of (
Build set Display set Modify set Set operation Cancel
Private set
Choose set type
The list of choices will consist of whatever sets you may have created. Choose the one
you just made: 7120 nif genes.
7120 IS895 seqs
7120 nif genes
7120 STTR7 regions
Light-specific genes
Npun STTR7 regions
7120 nif genes
Choose set
Orthologs of (
Build set Display set Modify set Set operation Cancel
Private set
Choose set type
At present, the set of filamentous cyanobacteria includes just the nitrogen-fixing strains Nostoc punctiforme, Trichodesmium erythraeum, and Anabaena PCC 7120. Click on Filamentous.
7120 nif genes
Choose set
Arthrospira platensis
Gloeobacter violaceus
Microcystis aeruginosa
Nostoc punctiforme
Anabaena PCC 7120
Prochlorococcus MED4
Prochlorococcus MIT9313
Prochlorococcus S120
Synechococcus PCC6301
Synechococcus PCC7942
Synechococcus WH8102
Synechocystis PCC 6803
Thermosynechococcus
Trichodesmium erythraeum
Unicellular
Filamentous
All
Choose database
in )
Orthologs of (
Build set Display set Modify set Set operation Cancel
Private set
Choose set type
7120 nif genes
Choose set
Filamentous
Choose database
in )
all nif genes
Type name of set
All orthologs of the selected nif genes will be combined and saved in a set of
your choice. For this simulation, a name is provided. Press the Enter key.
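As a rough picture of what the Orthologs of transformation does, the sketch below expands a set of orfs through a lookup table and keeps only members of the chosen databases. Everything here is hypothetical: the table pairs orfs simply because they share a gene name in the simulated lists, `Unicell:demo_orf` is an invented entry added to show the database filter, and the simulation does not specify how the real interface would compute orthology.

```python
# Hypothetical ortholog table; pairings are guessed from shared gene names.
ORTHOLOGS = {
    "Anab7120:alr0874": ["NostPunc:637.025", "Unicell:demo_orf"],
    "Anab7120:alr1407": ["NostPunc:510.011"],
    "Anab7120:all1438": ["NostPunc:651.072"],
    "Anab7120:all1517": ["NostPunc:510.021"],
}

# Database prefixes standing in for the "Filamentous" group (assumed labels).
FILAMENTOUS = {"Anab7120", "NostPunc"}

def orthologs_of(orf_set, databases, table=ORTHOLOGS):
    """Return each orf plus its listed orthologs, restricted to the given databases."""
    result = []
    for orf in orf_set:
        for candidate in [orf] + table.get(orf, []):
            database = candidate.split(":", 1)[0]
            if database in databases and candidate not in result:
                result.append(candidate)
    return result

seed = ["Anab7120:alr0874", "Anab7120:alr1407"]
print(orthologs_of(seed, FILAMENTOUS))
```

The original orfs stay in the result because Anabaena PCC 7120 is itself one of the filamentous databases, as in the simulated output.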
Build set Display set Modify set Set operation Done
Set: all nif genes
Anab7120:alr0874 nifH2 dinitrogenase reductase
Anab7120:alr1407 nifV1 homocitrate synthase
Anab7120:all1438 nifE nitrogenase Fe/Mo cofactor
Anab7120:all1455 nifH dinitrogenase reductase
Anab7120:all1517 nifB nitrogen fixation protein
Anab7120:alr2968 nifV2 homocitrate synthase
NostPunc:637.025 nifH2 dinitrogenase reductase
NostPunc:510.011 nifV1 homocitrate synthase
NostPunc:651.072 nifE nitrogenase Fe/Mo cofactor
NostPunc:510.021 nifB nitrogen fixation protein
<< more items >>
The set now consists of nif genes from all filamentous cyanobacteria. From this set we want to extract the upstream sequences. Click on Set operation, then click on Transformations and Upstream region of.
Set operation
Maintenance
Set operations
Analysis tools
Discovery tools
Transformations
Ortholog of
Protein product of
Upstream region of
Downstream region of
Upstream region of (
Build set Display set Modify set Set operation Cancel
All open reading frames of
Human-annotated orfs of
Public set
Private set
Choose set type
Again you want the orfs from a set you made yourself, so click on Private set.
Upstream region of (
Build set Display set Modify set Set operation Cancel
Private set
Choose set type
7120 IS895 seqs
7120 nif genes
7120 STTR7 regions
all nif genes
Light-specific genes
Npun STTR7 regions
all nif genes
Choose set
)
The set you just defined magically appears on the list (no chance for
misspelling). Click on it.
Upstream region of (
Build set Display set Modify set Set operation Cancel
Private set
Choose set type
all nif genes
Choose set
)
Give this new set of 5’ regions a descriptive name (done here for you). Press the Enter key.
all nif genes – 5’
Type name of set
Build set Display set Modify set Set operation Done
Set: all nif genes – 5’
Anab7120.C:1006982-1008496d
Anab7120.C:1671462-1671878d
Anab7120.C:1697832-1698138c
Anab7120.C:1713264-1713395c
Anab7120.C:1778098-1779034c
Anab7120.C:3609273-3609624d
NostPunc.637:37288-37376d
NostPunc.510:15955-16325d
NostPunc.651:60311-60584c
NostPunc.510:5239-6338c
<< more items >>
The resulting set consists of sequences, not orfs, and so the elements are defined by coordinates. Clicking on a coordinate brings up the sequence display (see Scenario 6). Clicking on a graph of an orf brings up the orf’s annotation page. Click Continue.
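To make the coordinate notation concrete, here is a minimal sketch of how an element such as `Anab7120.C:1697832-1698138c` could be parsed and turned into sequence. It assumes that the trailing `d` and `c` mean direct and complementary strand (consistent with the `->` and `<-` arrows in the coordinate display, but not stated in the simulation) and that coordinates are 1-based and inclusive. The function names and the ten-base toy sequence are made up.

```python
import re

# replicon:start-end followed by a strand letter (assumed: d = direct, c = complement).
REGION = re.compile(r"^(?P<replicon>[^:]+):(?P<start>\d+)-(?P<end>\d+)(?P<strand>[dc])$")
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")

def parse_region(text):
    """Split a region label into (replicon, start, end, strand)."""
    match = REGION.match(text)
    if match is None:
        raise ValueError(f"unrecognized region: {text!r}")
    return (match["replicon"], int(match["start"]), int(match["end"]), match["strand"])

def extract(sequence, start, end, strand):
    """Slice 1-based inclusive coordinates; reverse-complement when strand is 'c'."""
    piece = sequence[start - 1:end]
    if strand == "c":
        piece = piece.translate(COMPLEMENT)[::-1]
    return piece

print(parse_region("Anab7120.C:1697832-1698138c"))
toy = "AAACCCGGGT"  # stand-in for a replicon sequence
print(extract(toy, 2, 5, "d"), extract(toy, 2, 5, "c"))
```

Reverse-complementing the `c` regions puts every upstream sequence in the same orientation relative to its gene, which matters when the sequences are later compared for shared motifs.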
The final step in this procedure is to analyze the set of upstream sequences of nif genes, hoping to find a common motif. Click on Set operation, then Analysis tools. Tools based on Position-Specific Scoring Matrices (PSSMs) are most often used for the task. Click on one of these: Meme.
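If Position-Specific Scoring Matrices are unfamiliar, the toy sketch below shows the idea: count the bases at each position of a few aligned sites, convert the counts to log-odds scores, and slide the matrix along a sequence to find the best match. This is not how Meme or the Gibbs sampler discover motifs (they must also find the sites in the first place); it only illustrates the matrix they report. The three sites, the test sequence, the pseudocount, and the uniform background are all made up for illustration.

```python
import math

def build_pssm(sites, pseudocount=1.0, background=0.25):
    """Log2-odds matrix (one dict per column) from equal-length aligned DNA sites."""
    width = len(sites[0])
    if any(len(site) != width for site in sites):
        raise ValueError("all sites must have the same length")
    matrix = []
    for column in range(width):
        counts = {base: pseudocount for base in "ACGT"}
        for site in sites:
            counts[site[column]] += 1
        total = sum(counts.values())
        matrix.append(
            {base: math.log2(counts[base] / total / background) for base in "ACGT"}
        )
    return matrix

def best_hit(sequence, matrix):
    """Return (score, start) of the highest-scoring window in the sequence."""
    width = len(matrix)
    scored = [
        (sum(matrix[i][sequence[start + i]] for i in range(width)), start)
        for start in range(len(sequence) - width + 1)
    ]
    return max(scored)

SITES = ["TGTACA", "TGTAGA", "TGTACT"]  # made-up aligned sites
pssm = build_pssm(SITES)
score, position = best_hit("CCGTGTACAGG", pssm)
print(round(score, 2), position)
```

A positive score means the window looks more like the sites than like background sequence; the same matrix could then be used to search for other genes with the same motif.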
Set operation
Maintenance
Set operations
Analysis tools
Discovery tools
Transformations
Align
PSSM: Gibbs sampler
PSSM: Meme
Make HMM
PSSM: Meme of (
Build set Display set Modify set Set operation Cancel
Public set
Private set
Choose set type
Click Private set and then all nif genes – 5’ to give Meme the set of 5’ sequences.
7120 IS895 seqs
7120 nif genes
7120 STTR7 regions
all nif genes
all nif genes – 5’
Npun STTR7 regions
all nif genes – 5’
Choose set
)
PSSM: Meme of (
Build set Display set Modify set Set operation Cancel
Private set
Choose set type
Give the results a name, press Enter, and the task is accomplished.
all nif genes – 5’
Choose set
)
PSSM:all nif – 5’
Type name of results
Analysis: Discovery of possible regulatory motifs
Summary
• The interface facilitates operations on sets of genes and sequences
• The interface puts at your disposal powerful tools (that already exist), without the need to figure out a different computer environment
• Taken together, these capabilities let researchers who are not particularly adept at computer programming focus on the function of noncoding sequences
But don’t be fooled – the interface does not yet exist. That’s the point of the proposal!