RiboSearch. Ben Daniel Ariel, Kirshner Naomi. Instructor: Dr. Danny Barash, Adaya Cohen.

Date post: 21-Dec-2015
Transcript
Page 1: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RiboSearch

Ben Daniel Ariel
Kirshner Naomi

Instructor: Dr. Danny Barash
Adaya Cohen

Page 2: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Introduction

Biological Introduction
Method Layout
"The merge strategy"
Results and Conclusions

Page 3: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RNA

A single-stranded nucleic acid made up of 4 nucleotides:

Purines: adenine (A), guanine (G)

Pyrimidines: cytosine (C), uracil (U)

WC pairs:

A-U G-C
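The pairing rules on this slide can be written as a small lookup. This is a minimal sketch; the names `WC_PAIRS`, `is_wc_pair` and `reverse_complement` are illustrative and do not come from the presentation.

```python
# Watson-Crick pairs for RNA as listed on the slide: A-U and G-C.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def is_wc_pair(x: str, y: str) -> bool:
    """True if bases x and y form a Watson-Crick pair."""
    return (x.upper(), y.upper()) in WC_PAIRS


def reverse_complement(rna: str) -> str:
    """Strand that would pair with `rna` when read in the opposite direction."""
    return "".join(COMPLEMENT[b] for b in reversed(rna.upper()))


print(is_wc_pair("A", "U"))
print(reverse_complement("GGAU"))
```

A stem (double strand) is then simply a stretch whose two sides are reverse complements of each other.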

Page 4: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Biological Introduction

Old scheme

Proteins carry out all biological functions

RNA: only a stage between DNA and protein, with no catalytic function

DNA -> RNA -> Protein

Page 5: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Biological Introduction

New scheme

Since the discovery of self-splicing RNAs in the early 1980s, a number of new structural and catalytic RNAs have been discovered.

Recent studies focusing on non-coding and small RNAs have led to the discovery of RNA molecules that possess essential regulatory functions.

DNA -> RNA -> Protein

Page 6: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RNA Secondary Structure

The secondary structure of many RNAs is usually more conserved than their sequence

a. Hairpin
b. Internal loop
c. Bulge loop
d. Junction
e. Stem (double strand)
f. Pseudoknot

Page 7: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Riboswitch

RNA control elements that regulate gene expression without the participation of proteins

They use a unique mechanism whereby small molecules bind to the aptamer/box region, causing a conformational switch

They were initially found in the 5' UTR of bacteria, with successive discoveries in prokaryotes

There is evidence suggesting that riboswitches could be found in eukaryotes.

[Figure: riboswitch layout. Labels: 5', 5' UTR, aptamer, expression platform, coding section, 3' UTR, 3']

Page 8: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Riboswitch mechanism

Guanine binds to the aptamer region, which causes a conformational change in the expression platform, which in turn regulates guanine metabolism.

Page 9: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

G-box

Regulates genes related to purine metabolism and transport

Binds purines

Consists of 2 hairpins and 1 internal junction

Page 10: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RiboSearch

Goal
Finding G-box in eukaryotic genomes

Method
Combining existing search methods into one overall package

Page 11: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Search Methods

Whiffer – CS department, BGU
RNAMotif – Macke et al., 2001
RNAProfile – Pavesi et al., 2004
STR2 – CS department, BGU

Page 12: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Whiffer

Input
Pattern that consists of:
  Sequence information
  Variable gaps
  Base pairing brackets representing WC pairs

Output
Candidate locations that meet the constraints imposed by the method

[ <<<<2 ]TA ]5[ GTNTCTAC ]3[ <<<<< ]3[ CCNNNAA ]3[ <<<<< ]5[ <<<<

Page 13: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Whiffer

Method
Uses simple matching, based on the constraints, as opposed to dynamic programming.
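To illustrate what constraint-based matching without dynamic programming can look like, here is a minimal sketch that handles only sequence information and variable gaps (no base-pairing brackets). The tuple encoding of the pattern, the gap bounds and the function names are assumptions made for this example; they are not Whiffer's pattern syntax or implementation. The two sequence motifs and the test sequence are the ones shown on the slides.

```python
import re

# Hypothetical pattern encoding (NOT Whiffer's own syntax):
#   ("seq", "GTNTCTAC")  literal bases, N matches any base
#   ("gap", 3, 11)       between 3 and 11 arbitrary bases (bounds are illustrative)
PATTERN = [("seq", "GTNTCTAC"), ("gap", 3, 11), ("seq", "CCNNNAA")]


def compile_pattern(elements):
    """Translate the element list into one regular expression."""
    parts = []
    for el in elements:
        if el[0] == "seq":
            parts.append(el[1].upper().replace("N", "[ACGTU]"))
        elif el[0] == "gap":
            parts.append("[ACGTU]{%d,%d}" % (el[1], el[2]))
        else:
            raise ValueError("unknown element: %r" % (el,))
    # Wrapping in a lookahead reports overlapping candidates as well.
    return re.compile("(?=(" + "".join(parts) + "))")


def find_candidates(genome, elements):
    """Return (start, end) positions of every region matching the pattern."""
    rx = compile_pattern(elements)
    return [(m.start(1), m.end(1)) for m in rx.finditer(genome.upper())]


# G-box query sequence as it appears in the alignment slides.
QUERY_GBOX = "CGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTAAATGTCCGACTATG"
print(find_candidates(QUERY_GBOX, PATTERN))
```

A strict pattern of this kind is highly specific, so a candidate that deviates from the pattern at a single constrained position is not reported at all.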

Page 14: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RNAMotif

Input
Database of nucleotide sequences
Description file that consists of:
  Descriptor section
  Score section (optional)

Output
Candidates that meet the conditions of the descriptor and the scoring scheme

Page 15: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

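RNAMotif

Sample descriptor file:

    descr
        h5( minlen=6, maxlen=8 )
            ss( minlen=4, maxlen=6 )
        h3

    score
        {
            # count g and c over all structure elements of the candidate
            gcnt = 0;
            glen = 0;
            for( i = 1; i <= NSE; i++ ){
                llen = length( se[ i ] );
                glen = glen + llen;
                for( j = 1; j <= llen; j++ ){
                    b = se[ i, j, 1 ];
                    if( b == "g" || b == "c" )
                        gcnt++;
                }
            }
            # reject candidates with a GC fraction below 0.4
            SCORE = 1.0 * gcnt / glen;
            if( SCORE < .4 )
                REJECT;
        }

[Figure: hairpin with helix strands h5 and h3 and single-stranded loop ss]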
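The sample descriptor describes a hairpin (a 6 to 8 base helix enclosing a 4 to 6 base loop) and scores each hit by its GC fraction. The following is a brute-force sketch of the same idea in Python, assuming strict A-U/G-C pairing; the function name and the scanning approach are illustrative and are not how RNAMotif itself is implemented.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}


def find_hairpins(seq, stem=(6, 8), loop=(4, 6), min_gc=0.4):
    """Return (start, end, gc_fraction) for every h5-ss-h3 hit in seq."""
    seq = seq.upper().replace("T", "U")
    hits = []
    for start in range(len(seq)):
        for n_stem in range(stem[0], stem[1] + 1):
            for n_loop in range(loop[0], loop[1] + 1):
                end = start + 2 * n_stem + n_loop
                if end > len(seq):
                    continue
                h5 = seq[start:start + n_stem]
                h3 = seq[end - n_stem:end]
                # h5[k] must pair with h3[n_stem-1-k] (antiparallel strands).
                paired = all(
                    (h5[k], h3[n_stem - 1 - k]) in PAIRS for k in range(n_stem)
                )
                if not paired:
                    continue
                cand = seq[start:end]
                gc = sum(b in "GC" for b in cand) / len(cand)
                # Same cut-off as the score section: reject below 0.4.
                if gc >= min_gc:
                    hits.append((start, end, round(gc, 2)))
    return hits


print(find_hairpins("GGGCGCAAAAGCGCCC"))   # GC-rich hairpin
print(find_hairpins("AAAUAUGGGGAUAUUU"))   # pairs correctly but is AT-rich
```

The second sequence folds into a valid hairpin but falls below the GC threshold, which mirrors the REJECT branch of the score section.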

Page 16: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RNAMotif

Method
Two-stage algorithm

Stage I: Compilation stage
Analyzing the specific motif, called a descriptor, and converting it into a search tree based on the helical nesting of the motif

Page 17: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RNAMotif

Method
Two-stage algorithm

Stage II: DFS
Depth-first search of the tree that was created by the compilation stage

Each time a complete solution to the descriptor is found, the candidate is passed to an optional score section for scoring and ranking

In the absence of a score section, the candidate is accepted

Page 18: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RNAProfile

Input

Number of distinct hairpins a motif has to contain

Set of unaligned RNA sequences expected to share a common motif

Page 19: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RNAProfile

Output

Regions that are most conserved throughout the sequences, according to:
  Sequence of the regions
  Secondary structure that can be formed according to base-pairing and thermodynamic rules

Page 20: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RNAProfile

Method

Two phases

Phase I:
Extracting a set of candidate regions from each input sequence, whose predicted optimal secondary structure contains the number of hairpins given as input

Phase II:
The regions selected are compared with each other to find the group of most similar ones, formed by a region taken from each sequence

Page 21: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Method Summary

Whiffer
Combines sequence and structure similarity
Very high specificity: potential candidates may be ruled out

RNAMotif
Similarity based mostly on structural elements, according to the descriptor

RNAProfile
Similarity based on both sequence and structure
Recommended as a post-processing step

Page 22: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

The merge strategy

[Flowchart]
Input: query sequence and structure (bracket notation), e.g. (((..((((…)))).))
Parsing: the query is parsed separately for Whiffer and for RNAMotif
Search: Whiffer and RNAMotif
Output: candidates
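The parsing step has to turn a bracket-notation structure into something each search tool can use. A minimal sketch of that first step is shown here: it recovers the base pairs and groups them into helices. The function names are illustrative, and because the structure on the slide is abbreviated with an ellipsis, a separate balanced example is used.

```python
def parse_dot_bracket(structure):
    """Return sorted (i, j) index pairs for matching brackets."""
    stack, pairs = [], []
    for i, ch in enumerate(structure):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError("unbalanced ')' at position %d" % i)
            pairs.append((stack.pop(), i))
        elif ch != ".":
            raise ValueError("unexpected character %r" % ch)
    if stack:
        raise ValueError("unbalanced '(' at position %d" % stack[-1])
    return sorted(pairs)


def helices(pairs):
    """Group stacked pairs (i, j), (i+1, j-1), ... into (i, j, length)."""
    out = []
    for i, j in pairs:
        if out and out[-1][0] + out[-1][2] == i and out[-1][1] - out[-1][2] == j:
            out[-1] = (out[-1][0], out[-1][1], out[-1][2] + 1)
        else:
            out.append((i, j, 1))
    return out


example = "((..((...)).))"
print(parse_dot_bracket(example))
print(helices(parse_dot_bracket(example)))
```

Each helix could then be emitted as base-pairing brackets for a Whiffer-style pattern or as an h5/h3 pair for an RNAMotif-style descriptor; that translation is not shown.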

Page 23: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Candidates -> Filtering -> RNAProfile (post processing) -> Final candidates

1. The location is contained within a gene

2. The gene is relevant to the requested function (purine metabolism)
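Assuming the two numbered criteria describe the filtering step, they can be sketched as a simple check against gene annotations. The `Gene` record, the keyword test on the function description and all data below are hypothetical; the presentation does not say how the annotations were obtained or compared.

```python
from dataclasses import dataclass


@dataclass
class Gene:
    name: str
    start: int
    end: int
    function: str  # free-text annotation (hypothetical field)


def filter_candidates(candidates, genes, keyword="purine"):
    """Keep candidates that lie inside a gene whose annotation mentions keyword."""
    kept = []
    for c_start, c_end in candidates:
        for g in genes:
            inside = g.start <= c_start and c_end <= g.end   # criterion 1
            relevant = keyword in g.function.lower()          # criterion 2
            if inside and relevant:
                kept.append(((c_start, c_end), g.name))
                break
    return kept


# Made-up annotations and candidate coordinates, for illustration only.
genes = [
    Gene("geneA", 100, 900, "Purine transport"),
    Gene("geneB", 1000, 2000, "Sugar metabolism"),
]
candidates = [(150, 210), (1100, 1160), (5000, 5060)]
print(filter_candidates(candidates, genes))
```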

Page 24: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Final candidates -> Sequence alignment -> Biological experiments

Page 25: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Results – prokaryote
Bacillus halodurans

                   Whiffer   RNAMotif   Merge
Candidates         4         7          7
True positives     4         2          4
False positives    0         5          3
False negatives    0         2          0
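One way to compare the three columns is to derive precision and recall from the counts. The ratios below are computed here for illustration and are not reported in the presentation; the counts are read from the Bacillus halodurans table as (true positives, false positives, false negatives) per tool.

```python
# (TP, FP, FN) per tool, read from the prokaryote results table.
RESULTS = {
    "Whiffer": (4, 0, 0),
    "RNAMotif": (2, 5, 2),
    "Merge": (4, 3, 0),
}


def precision_recall(tp, fp, fn):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


for tool, counts in RESULTS.items():
    p, r = precision_recall(*counts)
    print(f"{tool:9s} precision={p:.2f} recall={r:.2f}")
```

On these counts the merged run finds all 4 true positives, as Whiffer does, while 4 of its 7 candidates are correct compared with 2 of 7 for RNAMotif alone.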

Page 26: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Results – eukaryote
Arabidopsis thaliana

                    Whiffer   RNAMotif Run #1   RNAMotif Run #2   Merge
Candidates          0         30                70000             -
Final candidates    0         0                 17                11

Page 27: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Results – eukaryote
Arabidopsis thaliana

Most promising candidates

Page 28: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

c2__11199940_11199996

queryGBox               CGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTAAATGTCCGACTAT 50
c2__11199940_11199996_  --TTCAGGTC-CATCTTTGGCTAGACCGAAGTCAGATAATTTGGCGTTAT 47
                          *  *  *  **     *  *  ****    * *  *** *     ***

queryGBox               G-------- 51
c2__11199940_11199996_  AGTCCTGAA 56

Page 29: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

c3_20894864_20894920

c3_sequences  GGATGAGGAACCAATTGACCCTGGATTTCAAGATT-TACAAAAGAACGTA 49
queryGBox     -------------CGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTA 37
                             **    ***    **** ** ***     * ****

c3_sequences  AGCATCC------- 56
queryGBox     AATGTCCGACTATG 51
              *   ***
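The row of asterisks under each alignment block marks columns where both sequences carry the same base. A short sketch of how such a line, and a simple identity figure, can be computed from two aligned rows is given here. The function names are illustrative, the identity definition (matches divided by gap-free columns) is one of several in use, and the input is the first block of the c3 alignment from the slide.

```python
def conservation_line(a, b):
    """'*' where both aligned rows have the same base, ' ' elsewhere."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    return "".join("*" if x == y and x != "-" else " " for x, y in zip(a, b))


def identity(a, b):
    """Identical columns divided by columns without a gap in either row."""
    stars = conservation_line(a, b).count("*")
    aligned = sum(1 for x, y in zip(a, b) if x != "-" and y != "-")
    return stars / aligned if aligned else 0.0


# First block of the c3 alignment, copied from the slide.
C3_ROW = "GGATGAGGAACCAATTGACCCTGGATTTCAAGATT-TACAAAAGAACGTA"
QUERY_ROW = "-" * 13 + "CGTGGATATGGCACGCAAGTTTCTACCGGGCACCGTA"

print(C3_ROW)
print(QUERY_ROW)
print(conservation_line(C3_ROW, QUERY_ROW))
print(f"identity over gap-free columns: {identity(C3_ROW, QUERY_ROW):.2f}")
```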

Page 30: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

RiboSearch - Conclusions

Filters false positives

Sequences are far less conserved within eukaryotes than within prokaryotes

The merge strategy is essential in eukaryotic genome searches

Page 31: RiboSearch Ben Daniel ArielKirshner Naomi Instructor : Dr. Danny Barash Adaya Cohen.

Our thanks

Dr. Danny Barash

Adaya Cohen

