
Bioinformatics Master Course Sequence Alignment Lecture 9a Pattern matching part I.

Date post: 15-Jan-2016
Bioinformatics Master Course Sequence Alignment

Lecture 9a: Pattern matching, part I

Sequence Patterns vs. Protein Structure

I. Protein-Protein interaction
1. enzyme (protein) - substrate: serine protease trypsin
2. receptor (protein) - ligand: growth hormone receptor
3. antibody (protein) - antigen: immunoglobulin (Ig)

II. Protein-Ion and small molecule interaction
1. protein - ion (Ca2+, Mg2+, Na+, K+, Cl–, HCO3–, SO42–): calmodulin
2. pump - ion, coupled to enzymatic function: ATPase
3. channel - water: aquaporin

III. Protein-DNA/RNA interaction
1. enzyme - DNA: Eco-RI, ribozyme
2. binder - DNA groove: leucine zipper, zinc finger
3. regulator - RNA: KH domain

Reactions and Interactions

• What is the difference between a reaction and an interaction?
  – change in chemical bonding

• Which one of these is a chemical bond?
  1. H3C-CH2-O-H
  2. Na+ Cl–
  3. H-O-H···OH2
  4. H-O-CH2-CH3···H3C-CH2-O-H

Bond Strength

• Bond strength and lifetime are a function of temperature
  – vibration (bond stretching), thermal background

• Non-covalent interactions depend very much on the medium
  – compare salt crystal with salt solution

• Interaction strength has a strong distance dependence
  – ion-ion ~ r^-2, dipole-dipole ~ r^-4, quadrupole-quadrupole ~ r^-6
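
To get a feeling for these exponents, the sketch below compares how much of each interaction remains when the distance doubles. It takes the power laws exactly as listed on the slide and ignores all prefactors (charges, dipole moments, the medium), so the numbers are relative only. The names `EXPONENTS` and `remaining_fraction` are illustrative and not part of the lecture.

```python
# Exponents as listed on the slide; prefactors are deliberately ignored.
EXPONENTS = {
    "ion-ion": 2,
    "dipole-dipole": 4,
    "quadrupole-quadrupole": 6,
}


def remaining_fraction(exponent, distance_ratio):
    """Fraction of an r**-exponent interaction left when r grows by distance_ratio."""
    return distance_ratio ** (-exponent)


for name, n in EXPONENTS.items():
    print(f"{name:22s} r -> 2r leaves {remaining_fraction(n, 2.0):.4f}")
```

The higher the multipole order, the faster the interaction fades: doubling the distance leaves a quarter of the ion-ion term but less than 2% of the quadrupole-quadrupole term.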

Binding: Complementary Interfaces

Binding requires complementary interfaces:

Interfaces have characteristic and conserved residue patterns or motifs

Sequence Patterns and Profiles

• Comparison between sequence pattern matching and similarity scoring

PATTERN                  SCORE
exact word               identity
regular expression       weight matrix
Hidden Markov Model      profile
generalized profile      general Hidden Markov Model
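
The first rows of this comparison can be sketched in a few lines. The example below is only an illustration under stated assumptions: the sequence is made up, and the weight matrix values are invented (a real matrix would be derived from a multiple alignment). It shows that an exact word and a regular expression give a yes/no answer, while a weight matrix assigns every window a score that can be ranked.

```python
import re

sequence = "MKWSAWSGHDS"  # made-up sequence, for illustration only

# 1. Exact word: the motif is either present or absent.
exact_hit = sequence.find("WSAWS")

# 2. Regular expression: some positions may vary, the answer is still yes/no.
regex_hit = re.search(r"WS.WS", sequence)

# 3. Weight matrix: one column of residue weights per motif position.
#    An empty column stands for "any residue" and contributes nothing.
WEIGHTS = [
    {"W": 2.0},
    {"S": 1.5, "T": 0.5},
    {},
    {"W": 2.0, "F": 0.5},
    {"S": 1.5, "T": 0.5},
]


def window_scores(seq, weights, default=-1.0):
    """Score every window of len(weights) residues; unlisted residues get `default`."""
    width = len(weights)
    scores = []
    for start in range(len(seq) - width + 1):
        window = seq[start:start + width]
        score = sum(
            (col.get(res, default) if col else 0.0)
            for col, res in zip(weights, window)
        )
        scores.append((start, window, score))
    return scores


best = max(window_scores(sequence, WEIGHTS), key=lambda hit: hit[2])
print(exact_hit, regex_hit.group(), best)
```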

Resources

• PROSITE: biologically significant sites, patterns and profiles
  – www.ebi.ac.uk/ppsearch/

• PFAM: large collection of multiple sequence alignments
  – www.sanger.ac.uk/Software/Pfam/

• DIP: interacting proteins
  – dip.doe-mbi.ucla.edu/

• Specialized Databases
  – Immunoglobulins: imgt.cines.fr/
  – Ca2+-binding proteins: structbio.vanderbilt.edu/cabp_database/

• Molecular visualisation packages
  – VMD: www.ks.uiuc.edu/Research/vmd/
  – MOLMOL: www.mol.biol.ethz.ch/wuthrich/software/molmol/
  – Rasmol: www.umass.edu/microbio/rasmol/

Protein-Protein Interactions

Protein Interaction Networks

Most proteins are functionally linked to other proteins

H Jeong, SP Mason, A-L Barabási & ZN Oltvai "Lethality and centrality in protein networks" Nature 2001;411(6833):41

I.1 Enzyme: Serine Protease Trypsin

• Specific class of hydrolases
  – cleave peptide bonds at specific residue positions
  – aspartate proteases, cysteine proteases, serine proteases

• Trypsin is a serine protease
  – cleaves on the C-terminal side of the basic residues Lys and Arg
  – one of the three principal digestive proteases
    • the other two are pepsin and chymotrypsin
  – produced in an inactive form by the pancreas

• Pattern: His57, Asp102 and Ser195 (H-D-S)
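
The cleavage rule stated above (cut after Lys or Arg) is itself a simple sequence pattern. The sketch below applies only that rule to a made-up peptide; real digestion has exceptions and missed cleavages that are not modelled here, and the function name `tryptic_digest` is illustrative.

```python
def tryptic_digest(protein):
    """Split a one-letter protein sequence after every Lys (K) or Arg (R)."""
    peptides = []
    current = ""
    for residue in protein:
        current += residue
        if residue in "KR":
            # Trypsin cuts on the C-terminal side of K and R.
            peptides.append(current)
            current = ""
    if current:
        # Whatever follows the last cleavage site is the final peptide.
        peptides.append(current)
    return peptides


print(tryptic_digest("MASKLLRGTE"))  # invented sequence
```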

[Figure: reaction scheme of peptide bond cleavage at the trypsin serine side chain (Trypsin-CH2-OH), with H2O]

Serine Protease: Trypsin

• Pattern: His57, Asp102 and Ser195 (H-D-S)

Principle of Catalysis

http://www.chemguide.co.uk/physical/basicrates/catalyst.html

Trypsin Complex with Inhibitor

1btc.pdb

I.2 Receptor: Growth Hormone Receptor

• Membrane-borne receptors:
  – extracellular domain
    • ligand-binding site
  – transmembrane domain
    • anchoring in the cell membrane
  – intracellular domain
    • kinase or another signalling module (typically)

• Receptor for growth hormone
  – member of the cytokine receptor superfamily
  – dimerizes upon binding growth hormone as ligand
  – activates intracellular kinase, triggers cellular signalling cascade

• Most structures only contain the extracellular or intracellular domain
  – transmembrane domain is difficult to crystallize

• Patterns:
  – YGEFS (growth hormone receptor)
  – WSxWS (cytokine receptor family)
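
Both patterns translate directly into regular expressions, with `x` read as "any residue". The sketch below is a minimal illustration; the test sequence is invented and the helper `find_motifs` is a hypothetical name, not a tool used in the lecture.

```python
import re

# Slide patterns written as regular expressions ("." stands for x, any residue).
MOTIFS = {
    "growth hormone receptor": "YGEFS",
    "cytokine receptor family": "WS.WS",
}


def find_motifs(sequence):
    """Return (motif name, 1-based start position, matched text) for every hit."""
    hits = []
    for name, regex in MOTIFS.items():
        for match in re.finditer(regex, sequence):
            hits.append((name, match.start() + 1, match.group()))
    return hits


example = "TTAYGEFSEVLHPWSEWSPP"  # invented sequence, for illustration only
for hit in find_motifs(example):
    print(hit)
```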

Growth Hormone Receptor Complex with Growth Hormone

1a22.pdb

I.3 Immune System: Antibody

• Antibodies (immunoglobulins, or Ig)
  – immune system: bind ‘foreign’ (non-self) characteristic structures
    • e.g. protein surfaces

• Heavy Chain and Light Chain
• Constant part (Fc) and Variable part (Fv)
  – Fv: specific recognition of target molecule (‘antigen’)

• structure called ‘Ig fold’:
  – two β-sheets face-to-face, with ‘Greek-key’ motif
  – binding site between two Ig folds
  – hypervariable loops participate in binding:
    • H1, H2, H3 and L1, L2, L3
    • composition characteristic for antigen

Pfam Ig Family Alignment

Patterns of Hypervariable Loops

Loop   | Before                                             | After                                                   | Length
CDR-L1 | always Cys                                         | always Trp                                              | 10 to 17
CDR-L2 | generally Ile-Tyr, also Val-Tyr, Ile-Lys, Ile-Phe  | -                                                       | always 7
CDR-L3 | always Cys                                         | always Phe-Gly-xxx-Gly                                  | 7 to 11
CDR-H1 | always Cys-xxx-xxx-xxx                             | always Trp                                              | 10 to 12
CDR-H2 | typically Leu-Glu-Trp-Ile-Gly                      | Lys, Arg-Leu, Ile, Val, Phe, Thr, Ala-Thr, Ser, Ile, Ala | 16 to 19
CDR-H3 | always Cys-xxx-xxx                                 | always Trp-Gly-xxx-Gly                                  | 3 to 25
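
Rules of this kind (residues before, residues after, allowed length) map onto a regular expression with lookaround. The sketch below encodes only the CDR-L3 row as I read it: a Cys directly before the loop, Phe-Gly-x-Gly directly after it, and 7 to 11 residues in between. The light-chain fragment is illustrative, and `find_cdr_l3` is a hypothetical helper; a real chain contains several cysteines, so a hit should be checked against the alignment.

```python
import re

# (?<=C)      the residue before the loop is Cys
# {7,11}      loop length from the table
# (?=FG[A-Z]G) the loop is followed by Phe-Gly-x-Gly
CDR_L3 = re.compile(r"(?<=C)([A-Z]{7,11})(?=FG[A-Z]G)")


def find_cdr_l3(light_chain):
    """Return the first stretch that satisfies the CDR-L3 rule, or None."""
    match = CDR_L3.search(light_chain)
    return match.group(1) if match else None


fragment = "DFATYYCQQYNSYPLTFGGGTKLEIK"  # illustrative light-chain fragment
print(find_cdr_l3(fragment))
```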

Antibody Structure

Kontou et al. Eur J Biochem 2000 267 2389

1F3R.pdb

Antibody Diversity

• Gene translocation

• heavy chain – multiple VH genes join with one DH and one JH

• light chain – multiple VL genes join with one JL gene

www.cat.cc.md.us/courses/bio141/lecguide/unit3/humoral/antibodies/abydiversity/abydiversity.html

Protein-Ion and Protein-‘small molecule’ Interactions

II.1 Ion Binding: Calmodulin

• Two domains, each with two ‘EF-hands’:
  – helix-loop-helix structure
  – loop contains Ca2+-binding motif

• Ca2+ ion: 6-fold coordinated:
  – oxygens from residues 1, 3, 5, 7, 9, and 12 in EF loop: D-K-D-G-D-G-T-I-T-T-K-Q
  – one water molecule
  – three are negatively charged

• Ca2+ binding changes conformation of entire protein from closed to open
  – open conformation exposes hydrophobic surface area
  – binding site for calmodulin target proteins
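
The coordinating positions can be read off the loop sequence programmatically. The sketch below uses the loop exactly as printed on the slide and the 1-based positions 1, 3, 5, 7, 9 and 12; treating Asp and Glu as the negatively charged residues is my reading of the slide, not something it states explicitly.

```python
EF_LOOP = "DKDGDGTITTKQ"             # loop sequence as given on the slide
COORDINATING = (1, 3, 5, 7, 9, 12)   # 1-based loop positions from the slide

# Residue found at each coordinating position.
ligands = [(pos, EF_LOOP[pos - 1]) for pos in COORDINATING]

# Assumption: "negatively charged" means Asp (D) or Glu (E).
acidic = [pos for pos, res in ligands if res in "DE"]

print(ligands)
print("acidic positions:", acidic)
```

With the slide's sequence this yields three acidic positions (1, 3 and 5), which agrees with the statement that three of the ligands are negatively charged.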

Calmodulin Complex with Calcium Ions

1exr.pdb

II.2 Ion Pump: Calcium ATPase (ATP synthase)

• protein complex
  – links electrical potential to ATP hydrolysis/synthesis
  – interconversion between mechanical and electrochemical energy in molecular motors

• F1F0 ATPase: reversible proton pump/motor
• P-type ATPases: transport ions across membrane against a concentration gradient
  – Pattern: D-K-T-G-T-[LIVM]-[TIS]
  – Next to aspartate which is phosphorylated during reaction cycle

• Na+/K+-ATPase: ubiquitous membrane transport protein in mammalian cells
  – maintains high K+ and low Na+ in cytoplasm for normal membrane potentials and cellular activities

• Ca-ATPases: Ca2+ from cytoplasm to organelles (mammalian)
  – e.g. sarcoplasmic reticulum, endoplasmic reticulum
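
The P-type ATPase pattern is written in PROSITE-style notation, which converts mechanically into a regular expression. The converter below is a minimal sketch covering only the elements that appear in this lecture plus the common `x(n)`, `x(n,m)` and `{...}` forms; anchors such as `<` and `>` are not handled. The function name and the test sequence are illustrative assumptions.

```python
import re


def prosite_to_regex(pattern):
    """Translate a simple PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.rstrip(".").split("-"):
        # Split "x(2,4)" into the residue part "x" and the repeat counts 2 and 4.
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, low, high = m.group(1), m.group(2), m.group(3)
        if core == "x":
            core = "."                       # any residue
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"   # any residue except those listed
        if low:
            core += "{" + low + ("," + high if high else "") + "}"
        parts.append(core)
    return "".join(parts)


regex = prosite_to_regex("D-K-T-G-T-[LIVM]-[TIS]")
hit = re.search(regex, "AAICSDKTGTLTTNQM")  # invented sequence
print(regex, hit.start() + 1, hit.group())
```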

ATPases

F1Fo-ATPase: www.rpi.edu/dept/bcbp/molbiochem/MBWeb/mb1/part2/f1fo.htm

Ca2+-ATPase: www.utoronto.ca/maclennan/rint1.htm

ATPase: Calcium Ions in Active Site

1eul.pdb

II.3 Membrane Channel: Aquaporin

Conserved NPA motifs: Asn, Pro and Ala stabilise loops through multiple hydrogen bonds

Bert de Groot: www.mpibpc.mpg.de/groups/de_groot/bgroot.html

Aquaporin: Motifs

• NPA: stabilizes loops B and E

• G(a)xxxG(a)xxG(a):
  – crossing of right-handed helical bundles

Andreas Engel and Henning Stahlberg, in: Current Topics in Membranes (2001), Hohmann, Agre & Nielsen (Eds.) Academic Press

Aquaporin Subunit

Bert de Groot: www.mpibpc.mpg.de/groups/de_groot/bgroot.html

1j4n.pdb

Protein-DNA/RNA Interactions

III.1 Enzyme: Eco-RI

• Restriction enzyme:
  – cuts palindromic sequences
  – complex of one DNA molecule with two Eco-RI molecules with inversion symmetry

www.accessexcellence.org/RC/VL/GG/restriction.html
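
For DNA, "palindromic" means that a site equals its own reverse complement, which is what allows two enzyme molecules to bind it with inversion symmetry. The sketch below checks this property; the recognition sequence GAATTC used in the example is the well-known Eco-RI site and is not given on the slide, and the function names are illustrative.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the sequence of the opposite strand, read 5' to 3'."""
    return dna.translate(COMPLEMENT)[::-1]


def is_palindromic(site):
    """A DNA site is palindromic if both strands read the same 5' to 3'."""
    return site == reverse_complement(site)


print(is_palindromic("GAATTC"))   # Eco-RI recognition site
print(is_palindromic("GATTACA"))  # arbitrary non-palindromic sequence
```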

Eco-RI

1qrh.pdb

III.2a DNA recognition: Leucine Zipper

• Dimer
  – Leu interactions
  – binds DNA by a fork-shaped structure

• ‘coiled-coil’ structure:
  – leucines on one side of helix
  – 7-residue repeat; one helix turn is 3.6 residues

a b c d e f g (position)

256 KVEELLS KNYHLEN EVARLKK LVG 279
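
The 7-residue repeat can be made explicit by labelling each residue with its heptad position. The sketch below does this for the 24 residues shown (256 to 279). The register is an assumption on my part: the first residue is assigned to position g, chosen so that the repeating leucines land on position d, following the usual coiled-coil convention of hydrophobic residues at a and d. The slide itself does not state the register.

```python
HEPTAD = "abcdefg"
ZIPPER = "KVEELLSKNYHLENEVARLKKLVG"  # residues 256-279 from the slide
FIRST_RESIDUE = 256


def heptad_register(sequence, first_position):
    """Label each residue with a heptad position, starting at first_position."""
    offset = HEPTAD.index(first_position)
    return [
        (FIRST_RESIDUE + i, residue, HEPTAD[(offset + i) % 7])
        for i, residue in enumerate(sequence)
    ]


# Assumed register: residue 256 sits at heptad position g.
labelled = heptad_register(ZIPPER, "g")
d_positions = [(num, res) for num, res, pos in labelled if pos == "d"]
a_positions = [(num, res) for num, res, pos in labelled if pos == "a"]
print("d:", d_positions)
print("a:", a_positions)
```

Under this assumed register every d position holds a leucine, spaced seven residues apart, which is the "leucines on one side of helix" arrangement described above.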

Leucine Zipper: Complex with DNA

1an2.pdb

Leucine Zipper: 7-Residue Repeat

III.2b DNA Recognition: Zinc Finger Proteins

• zinc coordinates several side chains
  – pulls them together to form ‘finger’ loops

• Pattern: C-x(2,4)-C-x(12,15)-H-x(3,5)-H or C-x(2,4)-C-x(12,15)-C
  – recognize nucleic acids (DNA or RNA)
    • modulate genes (also proteins can be targeted)

• modulate important functions:
  – gene expression
  – reverse transcription and virus assembly

• drug discovery targets:
  – pathogen-specific 3D structures
  – different from endogenous (cellular) zinc finger proteins
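
The first zinc finger pattern (Cys, 2 to 4 residues, Cys, 12 to 15 residues, His, 3 to 5 residues, His) can be searched with a regular expression that uses bounded repeats. The sketch below is an illustration only; the test fragment resembles a classical Cys2-His2 finger and is used here as an assumption, not as data from the lecture.

```python
import re

# C-x(2,4)-C-x(12,15)-H-x(3,5)-H written as a regular expression.
C2H2 = re.compile(r"C.{2,4}C.{12,15}H.{3,5}H")


def find_zinc_fingers(sequence):
    """Return (1-based start, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in C2H2.finditer(sequence)]


fragment = "RPYACPVESCDRRFSRSDELTRHIRIHTGQKP"  # illustrative C2H2-like fragment
print(find_zinc_fingers(fragment))
```

Bounded repeats such as `.{12,15}` make the pattern tolerant of loop-length variation, at the cost of more chance matches in long sequences.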

Zinc Finger Complex with DNA

1a1h.pdb

III.3 RNA Regulation: KH Domain

• bind to specific DNA/RNA locations
  – regulation of RNA synthesis and metabolism
  – combination with other domains
  – Pattern: G-x-x-G

• ribonucleoprotein (RNP) domain
• double stranded RNA binding domain (dsRBD)
• K Homology (KH) domain
  – recognize tetranucleotide motifs
  – high affinity/specificity:
    • RNA secondary structure
    • repeated sequence elements

• alpha/beta fold similar to ribosomal proteins

KH Domain Complex with RNA

1k1g.pdb

Copyright ©2005 American Society of Plant Biologists

Przybilski, R., et al. Plant Cell 2005;17:1877-1885

The HHRz

Hammerhead Motif of Ribozyme

Hammerhead Motif of Ribozyme

• three base-paired helices (I-III)
• core of 11 highly conserved, non-complementary nucleotides
  – necessary for the catalysis

• catalytic motif discovered by sequence comparison of plant viroids
  – site-specific, self-catalyzed cleavage

(Birikh, 1997)

academic.brooklyn.cuny.edu/chem/zhuang/QD/toppage1.htm

Hammerhead Ribozyme Action

488d.pdb

Copyright ©2005 American Society of Plant Biologists

Przybilski, R., et al. Plant Cell 2005;17:1877-1885

Modeling of the Arabidopsis HHRz Ara2
