
Exploring Protein Sequences Tutorial 5. Exploring Protein Sequences Multiple alignment –ClustalW...

Date post: 21-Dec-2015
Page 1: Exploring Protein Sequences Tutorial 5. Exploring Protein Sequences Multiple alignment –ClustalW Motif discovery –MEME –Jaspar.

Exploring Protein Sequences

Tutorial 5

Page 2

Exploring Protein Sequences

• Multiple alignment
  – ClustalW

• Motif discovery
  – MEME
  – JASPAR

Page 3

Multiple Sequence Alignment

• More than two sequences
  – DNA
  – Protein

• Evolutionary relation
  – Homology, phylogenetic tree
  – Detect motif

Unaligned sequences:

GTCGTAGTCGGCTCGAC
GTCTAGCGAGCGTGAT
GCGAAGAGGCGAGC
GCCGTCGCGTCGTAAC

Aligned sequences:

GTCGTAGTCG-GC-TCGAC
GTC-TAG-CGAGCGT-GAT
GC-GAAG-AG-GCG-AG-C
GCCGTCG-CG-TCGTA-AC

[Figure: phylogenetic tree with leaves A, B, C, D]

Page 4

Multiple Sequence Alignment

• Dynamic programming
  – Optimal alignment
  – Exponential in the number of sequences

• Progressive
  – Efficient
  – Heuristic

[Figure: the same four sequences, their alignment, and the tree with leaves A, B, C, D, as on page 3]
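The dynamic-programming option can be illustrated for the two-sequence case. The following is a minimal sketch, assuming a toy scoring scheme (match +1, mismatch -1, linear gap penalty -1) chosen only for illustration; the function name `global_alignment_score` is hypothetical and not part of any tool named in these slides. For k sequences the same recurrence needs a k-dimensional table, which is where the exponential cost comes from.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best global (Needleman-Wunsch) alignment of a and b."""
    # prev[j] = best score for aligning the prefix of a processed so far with b[:j]
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        curr = [i * gap]  # a[:i] aligned against an empty prefix of b
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # a[i-1] aligned to a gap
            left = curr[j - 1] + gap  # b[j-1] aligned to a gap
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


# Two of the sequences from the slide (score depends on the toy parameters)
score = global_alignment_score("GTCGTAGTCGGCTCGAC", "GTCTAGCGAGCGTGAT")
```

The table has (len(a)+1) x (len(b)+1) cells, so the pairwise case is quadratic; only one row is kept in memory here because the sketch returns the score and not the alignment itself.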

Page 5

ClustalW

“CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice”, J. D. Thompson et al., Nucleic Acids Research 22(22):4673-4680, 1994

Page 6

ClustalW

• Progressive
  – At each step, align two existing alignments or sequences
  – Gaps present in older alignments remain fixed

Unaligned sequences:

GTCGTAGTCGGCTCGAC
GTCTAGCGAGCGTGAT
GCGAAGAGGCGAGC
GCCGTCGCGTCGTAAC

Aligned sequences (first 15 columns, as shown on the slide):

GTCGTAGTCG-GC-T
GTC-TAG-CGAGCGT
GC-GAAG-AG-GCG-
GCCGTCG-CG-TCGT
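One way to picture the "gaps remain fixed" rule: when an existing alignment is aligned against something new, its rows change only by gaining whole gap columns, and the gaps already present stay where they are. The sketch below assumes exactly that and nothing more; `insert_gap_columns`, `ungap`, and the column indices are hypothetical, and the step that decides where new columns go (the profile alignment itself) is not shown.

```python
def insert_gap_columns(alignment, columns):
    """Insert an all-gap column before each original column index in `columns`."""
    result = []
    for row in alignment:
        pieces = []
        start = 0
        for col in sorted(columns):
            pieces.append(row[start:col])  # untouched stretch, old gaps included
            pieces.append("-")             # the new gap, added to every row
            start = col
        pieces.append(row[start:])
        result.append("".join(pieces))
    return result


def ungap(row):
    """Recover the original sequence by dropping gap characters."""
    return row.replace("-", "")


# First two aligned rows from the slide; the indices 5 and 12 are arbitrary
old = ["GTCGTAGTCG-GC-T", "GTC-TAG-CGAGCGT"]
new = insert_gap_columns(old, [5, 12])
```

Removing the gaps from any row of `new` gives back the same sequence as before, which is the invariant every progressive step preserves.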

Page 7

ClustalW - Input

Scoring matrix

Gap scoring

Input sequences

Page 8

ClustalW - Output

Page 9

ClustalW - Output

Input sequences

Pairwise alignment scores

Building alignment

Final score

Page 10

ClustalW - Output

Page 11

ClustalW - Output

Sequence names

Sequence positions

Match strength in decreasing order: * : .
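The three symbols can be reproduced with a short sketch. It assumes the residue groups commonly documented for Clustal protein alignments ("strong" groups give ':', "weak" groups give '.') and assumes that any column containing a gap is left unmarked; the function names are hypothetical. The input is the six-sequence example that appears on the motif slide later in this tutorial.

```python
# Residue groups as commonly documented for Clustal (treated here as an assumption)
STRONG_GROUPS = ["STA", "NEQK", "NHQK", "NDEQ", "QHRK", "MILV", "MILF", "HY", "FYW"]
WEAK_GROUPS = ["CSA", "ATV", "SAG", "STNK", "STPA", "SGND",
               "SNDEQK", "NDEQHK", "NEQHRK", "FVLIM", "HFY"]


def column_symbol(column):
    """Return '*', ':', '.' or ' ' for one alignment column."""
    residues = set(column)
    if "-" in residues:
        return " "                      # assumption: gapped columns stay unmarked
    if len(residues) == 1:
        return "*"                      # fully conserved
    if any(residues <= set(group) for group in STRONG_GROUPS):
        return ":"                      # all residues share one strong group
    if any(residues <= set(group) for group in WEAK_GROUPS):
        return "."                      # all residues share one weak group
    return " "


def consensus_line(alignment):
    """Build the symbol line shown under a block of aligned sequences."""
    return "".join(column_symbol(col) for col in zip(*alignment))


example = ["YDEEGGDAEE", "YDEEGGDAEE", "YGEEGADYED",
           "YDEEGADYEE", "YNDEGDDYEE", "YHDEGAADEE"]
print(consensus_line(example))  # prints "* :**   *:"
```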

Page 12

http://www.megasoftware.net/

Page 13

Can we find motifs using multiple sequence alignment?

Residue frequency at each alignment position:

     1    2    3    4    5    6    7    8    9    10
A    0    0    0    0    0    0.5  1/6  1/3  0    0
D    0    0.5  1/3  0    0    1/6  5/6  1/6  0    1/6
E    0    0    2/3  1    0    0    0    0    1    5/6
G    0    1/6  0    0    1    1/3  0    0    0    0
H    0    1/6  0    0    0    0    0    0    0    0
N    0    1/6  0    0    0    0    0    0    0    0
Y    1    0    0    0    0    0    0    0.5  0    0

Aligned sequences:

  1 3 5 7 9
..YDEEGGDAEE..
..YDEEGGDAEE..
..YGEEGADYED..
..YDEEGADYEE..
..YNDEGDDYEE..
..YHDEGAADEE..
  * :**   *:

Motif: a widespread pattern with biological significance
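The frequency table can be derived directly from the six aligned sequences by counting residues column by column. A minimal sketch, assuming a gap-free alignment block; the function name `frequency_matrix` is hypothetical.

```python
from collections import Counter
from fractions import Fraction


def frequency_matrix(alignment):
    """Map each residue to its frequency at every alignment position."""
    n = len(alignment)
    counts = [Counter(column) for column in zip(*alignment)]
    residues = sorted(set("".join(alignment)) - {"-"})
    # Fractions keep values such as 1/6 and 5/6 exact
    return {r: [Fraction(c[r], n) for c in counts] for r in residues}


alignment = ["YDEEGGDAEE", "YDEEGGDAEE", "YGEEGADYED",
             "YDEEGADYEE", "YNDEGDDYEE", "YHDEGAADEE"]
matrix = frequency_matrix(alignment)

for residue, row in matrix.items():
    print(residue, " ".join(str(value) for value in row))
```

Each column of the result sums to 1, which is a quick sanity check on any hand-built frequency table.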

Page 14

Can we find motifs using multiple sequence alignment?

YES! NO

Page 15

MEME – Multiple EM for Motif Elicitation

• http://meme.sdsc.edu/

• Motif discovery from unaligned sequences
  – Genomic or protein sequences

• Flexible model of motif presence (a motif can be absent from some sequences or appear several times in one sequence)

Page 16

MEME - Input

Email address

Multiple input sequences

How many times in each sequence?

How many motifs?

How many sites?

Range of motif lengths

Page 17

MEME - Output

Motif length

Number of times

Like BLAST

Page 18

MEME - Output

Probability * 10

‘a’=10, ‘:’=0

Page 19

MEME - Output

Low uncertainty = high information content
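The link between uncertainty and information content can be made concrete with Shannon entropy. The sketch below assumes the simplest definition, information = log2(alphabet size) minus the column entropy, with a uniform background and no small-sample correction; MEME's own calculation may differ in those details, and the function name is hypothetical.

```python
import math


def information_content(column_probs, alphabet_size=20):
    """Bits of information in one motif column (uniform background assumed)."""
    # Entropy measures uncertainty; zero-probability residues contribute nothing
    entropy = -sum(p * math.log2(p) for p in column_probs if p > 0)
    return math.log2(alphabet_size) - entropy


# A fully conserved protein column: no uncertainty, maximal information
conserved = information_content([1.0])

# A column where all 20 amino acids are equally likely: no information
uniform = information_content([1 / 20] * 20)
```

A fully conserved column reaches the maximum of log2(20), about 4.32 bits for proteins, while a uniform column carries essentially none.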

Page 20

MEME - Output

Multilevel Consensus

Page 21

MEME - Output

Sequence names

Reverse complement (genomic input only)

Position in sequence

Strength of match

Motif within sequence

Page 22

MEME - Output

Overall strength of motif matches

Sequence lengths

Motif instance

‘-’ = other strand

Page 23

MAST

• Searches for motifs (one or more) in sequence databases:
  – Like BLAST, but with motifs as input
  – Similar to iterations of PSI-BLAST

• Profile defines strength of match
  – Multiple motif matches per sequence
  – Combined E-value for all motifs

• MEME uses MAST to summarize results:
  – Each MEME result is accompanied by the MAST result for searching the discovered motifs on the given sequences.

Page 24

JASPAR

• Profiles
  – Transcription factor binding sites
  – Multicellular eukaryotes
  – Derived from published collections of experiments

• Open data access

Page 25

JASPAR

• Profiles
  – Modeled as matrices
  – Can be converted into PSSMs for scanning genomic sequences

     1    2    3    4    5    6    7    8    9    10
A    0    0    0    0    0    0.5  1/6  1/3  0    0
D    0    0.5  1/3  0    0    1/6  5/6  1/6  0    1/6
E    0    0    2/3  1    0    0    0    0    1    5/6
G    0    1/6  0    0    1    1/3  0    0    0    0
H    0    1/6  0    0    0    0    0    0    0    0
N    0    1/6  0    0    0    0    0    0    0    0
Y    1    0    0    0    0    0    0    0.5  0    0
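Converting a frequency matrix into a PSSM and scanning with it can be sketched as follows. This assumes a log-odds score against a background distribution with a small pseudocount so that zero frequencies do not produce minus infinity; the toy DNA matrix, the uniform background, the pseudocount value, and the function names are all made up for illustration and are not taken from JASPAR.

```python
import math


def to_pssm(freqs, background, pseudocount=0.01):
    """Turn {symbol: [frequency per position]} into log2-odds scores."""
    pssm = {}
    for symbol, probs in freqs.items():
        bg = background[symbol]
        pssm[symbol] = [
            math.log2((p + pseudocount * bg) / ((1 + pseudocount) * bg))
            for p in probs
        ]
    return pssm


def scan(sequence, pssm):
    """Score every window; the sequence must use only symbols in the matrix."""
    width = len(next(iter(pssm.values())))
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        hits.append((start, sum(pssm[ch][i] for i, ch in enumerate(window))))
    return hits


# Hypothetical 3-column DNA profile that favours "AGC"
toy_freqs = {
    "A": [0.7, 0.1, 0.1],
    "C": [0.1, 0.1, 0.7],
    "G": [0.1, 0.7, 0.1],
    "T": [0.1, 0.1, 0.1],
}
uniform = {base: 0.25 for base in "ACGT"}
pssm = to_pssm(toy_freqs, uniform)
best_start, best_score = max(scan("TTAGCTT", pssm), key=lambda hit: hit[1])
```

Positive scores mark windows that look more like the profile than like background; in practice a score threshold decides which windows are reported as hits.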

Page 26

Search profile

http://jaspar.cgb.ki.se/

Page 27

http://jaspar.cgb.ki.se/

