1
Bioinformatics Master Course Sequence Alignment
Lecture 9a: Pattern matching
part I
2
Sequence Patterns vs. Protein Structure
I. Protein-Protein interaction
  1. enzyme (protein) substrate: serine protease trypsin
  2. receptor (protein) ligand: growth hormone receptor
  3. antibody (protein) antigen: immunoglobulin (Ig)
II. Protein-Ion and small molecule interaction
  1. protein ion (Ca2+, Mg2+, Na+, K+, Cl–, HCO3–, SO42–): calmodulin
  2. pump ion, coupled to enzymatic function: ATPase
  3. channel water: aquaporin
III. Protein-DNA/RNA interaction
  1. enzyme DNA: Eco-RI, ribozyme
  2. binder DNA groove: leucine zipper, zinc finger
  3. regulator RNA: KH domain
3
Reactions and Interactions
• What is the difference between a reaction and an interaction?
  – change in chemical bonding
• Which one of these is a chemical bond?
  1. H3C-CH2-O-H
  2. Na+ Cl–
  3. H-O-H···OH2
  4. H-O-CH2-CH3···H3C-CH2-O-H
4
Bond Strength
• Bond strength and lifetime are a function of temperature
  – vibration (bond stretching), thermal background
• Non-covalent interactions depend very much on the medium
  – compare salt crystal with salt solution
• Interaction strength has a strong distance dependence
  – ion-ion ~ r^-2, dipole-dipole ~ r^-4, quadrupole-quadrupole ~ r^-6
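To get a feeling for how quickly these interactions fade, the power laws quoted above can be evaluated directly. This is only a sketch: the exponents are the ones written on the slide, and the function name `relative_strength` is made up for this example.

```python
# Relative change in interaction strength when the separation r is scaled,
# using the power laws quoted on the slide (strength ~ r**-n).
EXPONENTS = {
    "ion-ion": 2,
    "dipole-dipole": 4,
    "quadrupole-quadrupole": 6,
}


def relative_strength(scale: float, exponent: int) -> float:
    """Factor by which the strength changes when r becomes scale * r."""
    return scale ** (-exponent)


for name, n in EXPONENTS.items():
    factor = relative_strength(2.0, n)
    print(f"{name:22s} r -> 2r: strength x {factor:.4f}")
```

Doubling the distance leaves a quarter of the ion-ion term but well under 2% of the quadrupole-quadrupole term, which is why the higher-order terms only matter at close contact.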
5
Binding: Complementary Interfaces
Binding requires complementary interfaces:
Interfaces have characteristic and conserved residue patterns or motifs
6
Sequence Patterns and Profiles
• Comparison between sequence pattern matching and similarity scoring

PATTERN | SCORE
exact word | identity
regular expression | weight matrix
Hidden Markov Model | profile
generalized profile / general Hidden Markov Model
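The difference between the two columns can be made concrete with a small sketch. A regular expression gives a yes/no answer, while a weight matrix gives a graded score, so a near miss is still visible. The pattern used is the P-type ATPase motif that appears later in these slides; the weights, the default penalty and both test sequences are hypothetical values chosen for illustration only.

```python
import re

# Pattern approach: a regular expression either matches or it does not.
PATTERN = re.compile(r"DKTGT[LIVM][TIS]")

# Score approach: a toy position weight matrix (hypothetical weights).
# Each column maps residues to a score; unlisted residues get DEFAULT.
DEFAULT = -1.0
PWM = [
    {"D": 2.0}, {"K": 2.0}, {"T": 2.0}, {"G": 2.0}, {"T": 2.0},
    {"L": 1.0, "I": 1.0, "V": 1.0, "M": 1.0},
    {"T": 1.0, "I": 1.0, "S": 1.0},
]


def pwm_score(window: str) -> float:
    """Sum the column scores for one window of len(PWM) residues."""
    return sum(col.get(res, DEFAULT) for col, res in zip(PWM, window))


def best_window(seq: str) -> tuple[float, int]:
    """Return (best score, start index) over all windows of the sequence."""
    width = len(PWM)
    scored = [(pwm_score(seq[i:i + width]), i)
              for i in range(len(seq) - width + 1)]
    return max(scored)


exact = "AAICSDKTGTLTQNR"    # hypothetical sequence containing the motif
variant = "AAICSDKTGSLTQNR"  # same sequence with one substitution (T -> S)

for seq in (exact, variant):
    print(seq, bool(PATTERN.search(seq)), best_window(seq))
```

The variant is rejected outright by the pattern but still scores 9 out of a possible 12 against the matrix, at the same position.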
7
Resources
• PROSITE: biologically significant sites, patterns and profiles
  – www.ebi.ac.uk/ppsearch/
• PFAM: large collection of multiple sequence alignments
  – www.sanger.ac.uk/Software/Pfam/
• DIP: interacting proteins
  – dip.doe-mbi.ucla.edu/
• Specialized Databases
  – Immunoglobulins: imgt.cines.fr/
  – Ca2+-binding proteins: structbio.vanderbilt.edu/cabp_database/
• Molecular visualisation packages
  – VMD: www.ks.uiuc.edu/Research/vmd/
  – MOLMOL: www.mol.biol.ethz.ch/wuthrich/software/molmol/
  – Rasmol: www.umass.edu/microbio/rasmol/
8
Protein-Protein Interactions
9
Protein Interaction Networks
Most proteins are functionally linked to other proteins
H Jeong, SP Mason, A-L Barabási & ZN Oltvai "Lethality and centrality in protein networks" Nature 2001;411(6833):41
10
I.1 Enzyme: Serine Protease Trypsin
• Proteases: a specific class of hydrolases
  – cleave peptide bonds at specific residue positions
    • aspartate proteases, cysteine proteases, serine proteases
• Trypsin is a serine protease
  – cleaves on the C-terminal side of the basic residues Lys and Arg
  – one of the three principal digestive proteases
    • other two are pepsin and chymotrypsin
  – produced in an inactive form by the pancreas
• Pattern: His57, Asp102 and Ser195 (H-D-S)
[Figure: reaction scheme of peptide bond cleavage by trypsin, showing the peptide substrate, H2O and the trypsin HO-CH2 group]
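The cleavage rule stated above (cut on the C-terminal side of Lys and Arg) is easy to apply to a sequence in silico. The following is a minimal sketch of that rule only: it ignores known exceptions such as reduced cleavage before proline, and the function name and test sequence are hypothetical.

```python
def tryptic_digest(sequence: str) -> list[str]:
    """Split a protein sequence after every Lys (K) or Arg (R).

    Simplified rule taken from the slide; exceptions such as reduced
    cleavage before Pro are deliberately ignored in this sketch.
    """
    peptides = []
    start = 0
    for i, residue in enumerate(sequence):
        if residue in "KR":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Trailing peptide that does not end in K or R
        peptides.append(sequence[start:])
    return peptides


print(tryptic_digest("MAGKSLRTEEKW"))  # hypothetical sequence
```

Every peptide except possibly the last one ends in K or R, which is the property used when tryptic peptides are matched against a sequence database.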
11
Serine Protease: Trypsin
• Pattern: His57, Asp102 and Ser195 (H-D-S)
12
Principle of Catalysis
http://www.chemguide.co.uk/physical/basicrates/catalyst.html
13
Trypsin Complex with Inhibitor
1btc.pdb
14
I.2 Receptor: Growth Hormone Receptor
• Membrane-borne receptors:
  – extra-cellular domain
    • ligand-binding site
  – transmembrane domain
    • anchoring in the cell membrane
  – intracellular domain
    • kinase or another signalling module (typically)
• Receptor for growth hormone
  – member of the cytokine receptor superfamily
  – dimerizes upon binding growth hormone as ligand
  – activates intracellular kinase, triggers cellular signalling cascade
• Most structures only contain extra/intracellular domain
  – transmembrane domain is difficult to crystallize
• Patterns:
  – YGEFS (growth hormone receptor)
  – WSxWS (cytokine receptor family)
15
Growth Hormone Receptor Complex with Growth Hormone
1a22.pdb
16
I.3 Immune System: Antibody
• Antibodies (immunoglobulins, or Ig)
  – immune system: bind ‘foreign’ (non-self) characteristic structures
    • e.g. protein surfaces
• Heavy Chain and Light Chain
• Constant part (Fc) and Variable part (Fv)
  – Fv: specific recognition of target molecule (‘antigen’)
• structure called ‘Ig fold’:
  – two β-sheets face-to-face, with ‘Greek-key’ motif
  – binding site between two Ig folds
  – hypervariable loops participate in binding:
    • H1, H2, H3 and L1, L2, L3
    • composition characteristic for antigen
17
Pfam Ig Family Alignment
18
Patterns of Hypervariable Loops
Loop | Before | After | Length
CDR-L1 | always Cys | always Trp | 10 to 17
CDR-L2 | generally Ile-Tyr, also Val-Tyr, Ile-Lys, Ile-Phe | - | always 7
CDR-L3 | always Cys | always Phe-Gly-xxx-Gly | 7 to 11
CDR-H1 | always Cys-xxx-xxx-xxx | always Trp | 10 to 12
CDR-H2 | typically Leu-Glu-Trp-Ile-Gly | Lys, Arg-Leu, Ile, Val, Phe, Thr, Ala-Thr, Ser, Ile, Ala | 16 to 19
CDR-H3 | always Cys-xxx-xxx | always Trp-Gly-xxx-Gly | 3 to 25
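Rules of this kind (flanking residues plus a length range) translate directly into a regular expression. As a sketch, the CDR-L3 row becomes: Cys, a loop of 7 to 11 residues, then Phe-Gly-x-Gly. The function name is made up, and the light-chain fragment is an illustrative string written for this example, not data from the lecture.

```python
import re

# CDR-L3 rule from the table: preceded by Cys, 7 to 11 residues long,
# followed by Phe-Gly-x-Gly. The loop itself is captured as a named group.
CDR_L3 = re.compile(r"C(?P<loop>.{7,11})FG.G")


def find_cdr_l3(light_chain: str):
    """Return (loop sequence, 0-based start) of the first candidate, or None."""
    m = CDR_L3.search(light_chain)
    if m is None:
        return None
    return m.group("loop"), m.start("loop")


fragment = "DFATYYCQQHYTTPPTFGQGTKVEIK"  # illustrative fragment (assumption)
print(find_cdr_l3(fragment))
```

A full-length chain contains more than one Cys, so a match like this should be read as a candidate to be checked against the other loops, not as a definitive annotation.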
19
Antibody Structure
Kontou et al. Eur J Biochem 2000; 267: 2389
1F3R.pdb
20
Antibody Diversity
• Gene translocation
  • heavy chain – multiple VH genes join with one DH and one JH
  • light chain – multiple VL genes join with one JL gene
www.cat.cc.md.us/courses/bio141/lecguide/unit3/humoral/antibodies/abydiversity/abydiversity.html
21
Protein-Ion and Protein-‘small molecule’ Interactions
22
II.1 Ion Binding: Calmodulin
• Two domains, each with two ‘EF-hands’:
  – helix-loop-helix structure
  – loop contains Ca2+-binding motif
• Ca2+-ion: 6-fold coordinated:
  – Oxygens from residues 1, 3, 5, 7, 9, and 12 in EF loop: D-K-D-G-D-G-T-I-T-T-K-Q
  – one water molecule
  – three are negatively charged
• Ca2+-binding changes conformation of entire protein from closed to open
  – open conformation exposes hydrophobic surface area
  – binding site for calmodulin target proteins
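The coordinating positions can be read off the loop programmatically. This sketch uses the 12-residue loop exactly as written on the slide and simply indexes it at the listed positions; the variable names are made up for the example.

```python
EF_LOOP = "DKDGDGTITTKQ"                 # loop as written on the slide
LIGAND_POSITIONS = (1, 3, 5, 7, 9, 12)   # 1-based positions from the slide
ACIDIC = set("DE")                       # negatively charged side chains

# Pair each coordinating position with the residue found there
ligands = [(pos, EF_LOOP[pos - 1]) for pos in LIGAND_POSITIONS]
n_acidic = sum(res in ACIDIC for _, res in ligands)

for pos, res in ligands:
    print(f"position {pos:2d}: {res}")
print("acidic ligand residues:", n_acidic)
```

For the loop as given, the three acidic residues are the aspartates at positions 1, 3 and 5, in line with the "three are negatively charged" remark.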
23
Calmodulin Complex with Calcium Ions
1exr.pdb
24
II.2 Ion Pump: Calcium ATPase (ATP synthase)
• protein complex
  – links electrical potential to ATP hydrolysis/synthesis
  – interconversion between mechanical and electrochemical energy in molecular motors
• F1F0 ATPase: reversible proton pump/motor
• P-type ATPases: transport ions across membrane against a concentration gradient
  – Pattern: D-K-T-G-T-[LIVM]-[TIS]
  – Next to aspartate which is phosphorylated during reaction cycle
• Na+/K+-ATPase: ubiquitous membrane transport protein in mammalian cells
  – maintains high K+ and low Na+ in cytoplasm for normal membrane potentials and cellular activities
• Ca-ATPases: pump Ca2+ from cytoplasm into organelles (mammalian)
  – e.g. sarcoplasmic reticulum, endoplasmic reticulum
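Patterns written in this PROSITE style can be turned into regular expressions mechanically. The converter below is a minimal sketch, not the PROSITE reference implementation: it handles single residues, `[..]` alternatives, `{..}` exclusions, `x` wildcards and `(n)` or `(n,m)` repeats, and does not handle the `<` and `>` terminus anchors. The function name and the test sequence are hypothetical.

```python
import re


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE-style pattern into a Python regular expression."""
    parts = []
    for element in pattern.strip(".").split("-"):
        # Split off an optional repeat suffix such as x(2,4) or A(3)
        m = re.fullmatch(r"(.+?)(?:\((\d+)(?:,(\d+))?\))?", element)
        core, lo, hi = m.groups()
        if core == "x":
            core = "."                      # any residue
        elif core.startswith("{"):
            core = "[^" + core[1:-1] + "]"  # excluded residues
        # [ABC] alternatives and single letters are already valid regex
        if lo is not None:
            core += "{" + lo + ("," + hi if hi else "") + "}"
        parts.append(core)
    return "".join(parts)


regex = prosite_to_regex("D-K-T-G-T-[LIVM]-[TIS]")
print(regex)
print(re.search(regex, "AAICSDKTGTLTQNR"))  # hypothetical sequence
```

Keeping the conversion separate from the search makes it easy to reuse the same scanner for any of the patterns listed in this lecture.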
25
ATPases
F1Fo-ATPase: www.rpi.edu/dept/bcbp/molbiochem/MBWeb/mb1/part2/f1fo.htm
Ca2+-ATPase: www.utoronto.ca/maclennan/rint1.htm
26
ATPase: Calcium Ions in Active Site
1eul.pdb
27
II.3 Membrane Channel: Aquaporin
Conserved NPA motifs: Asn, Pro and Ala stabilise loops through multiple hydrogen bonds
Bert de Groot: www.mpibpc.mpg.de/groups/de_groot/bgroot.html
28
Aquaporin: Motifs
• NPA: stabilizes loops B and E
• G(a)xxxG(a)xxG(a):
  – crossing of right-hand helical bundles
Andreas Engel and Henning Stahlberg, in: Current Topics in Membranes (2001), Hohmann, Agre & Nielsen (Eds.) Academic Press
29
Aquaporin Subunit
Bert de Groot: www.mpibpc.mpg.de/groups/de_groot/bgroot.html
1j4n.pdb
30
Protein-DNA/RNA Interactions
31
III.1 Enzyme: Eco-RI
• Restriction enzyme:
  – cuts palindromic sequences
  – complex of one DNA molecule with two Eco-RI molecules with inversion symmetry
www.accessexcellence.org/RC/VL/GG/restriction.html
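In the DNA sense, a palindrome is a site that equals its own reverse complement, which is what allows two identical enzyme molecules to bind it symmetrically. A short sketch of that check, using the Eco-RI recognition site GAATTC; the function names and the test sequence are made up for this example.

```python
# Translation table for complementing an unambiguous DNA string
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def is_palindromic_site(dna: str) -> bool:
    """True if the site reads the same on both strands (5' to 3')."""
    return dna == reverse_complement(dna)


def find_sites(dna: str, site: str) -> list[int]:
    """0-based start positions of every occurrence of the site."""
    return [i for i in range(len(dna) - len(site) + 1)
            if dna[i:i + len(site)] == site]


ECORI_SITE = "GAATTC"
print(is_palindromic_site(ECORI_SITE))
print(find_sites("TTGAATTCAAGGGAATTCTT", ECORI_SITE))  # hypothetical sequence
```

Because the site is its own reverse complement, scanning one strand is enough to find the sites on both.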
32
Eco-RI
1qrh.pdb
33
III.2a DNA recognition: Leucine Zipper
• Dimer
  – Leu interactions
  – binds DNA by a fork-shaped structure
• ‘coiled-coil’ structure:
  – leucines on one side of helix
  – 7-residue repeat; one helix turn is 3.6 residues
a b c d e f g (position)
256 K V E E L L S K N Y H L E N E V A R L K K L V G 279
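The 7-residue repeat can be checked on the sequence shown above by grouping the leucines according to their position modulo 7. This is a sketch under one assumption, namely that the fragment is numbered 256 to 279 as on the slide; it does not try to assign the a to g labels, only to show which leucines fall in the same heptad phase.

```python
from collections import defaultdict

ZIPPER = "KVEELLSKNYHLENEVARLKKLVG"  # residues 256-279 from the slide
FIRST_RESIDUE = 256


def leucines_by_phase(seq: str, first: int = 1, period: int = 7) -> dict:
    """Group leucine residue numbers by their index modulo the period."""
    phases = defaultdict(list)
    for i, res in enumerate(seq):
        if res == "L":
            phases[i % period].append(first + i)
    return dict(phases)


phases = leucines_by_phase(ZIPPER, FIRST_RESIDUE)
best = max(phases.values(), key=len)
print(phases)
print("leucines sharing one heptad position:", best)
```

Three of the five leucines are spaced exactly seven residues apart. Since 7 residues span slightly less than two turns of 3.6 residues, these side chains line up along one face of the helix.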
34
Leucine Zipper: Complex with DNA
1an2.pdb
35
Leucine Zipper: 7-Residue Repeat
36
III.2b DNA Recognition: Zinc Finger Proteins
• zinc coordinates several side chains
  – pulls them together to form ‘finger’ loops
• Pattern: C-x(2,4)-C-x(12,15)-H-x(3,5)-H or C-x(2,4)-C-x(12,15)-C
  – recognize nucleic acids (DNA or RNA)
    • modulate genes (also proteins can be targeted)
• modulate important functions:
  – gene expression
  – reverse transcription and virus assembly
• drug discovery targets:
  – pathogen-specific 3D structures
  – different from endogenous (cellular) zinc finger proteins
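Because the zinc finger pattern uses variable-length spacers (C, 2 to 4 residues, C, 12 to 15 residues, H, 3 to 5 residues, H), it is useful to report which spacer lengths a hit actually uses. The sketch below does this with named groups; the function name is made up and the example fragment is an illustrative string to be treated as hypothetical.

```python
import re

# C2H2 form of the pattern with one named group per variable spacer
ZINC_FINGER_C2H2 = re.compile(
    r"C(?P<s1>.{2,4})C(?P<s2>.{12,15})H(?P<s3>.{3,5})H"
)


def scan(seq: str) -> list[tuple[int, str, tuple[int, int, int]]]:
    """Return (start, matched segment, spacer lengths) for each hit."""
    hits = []
    for m in ZINC_FINGER_C2H2.finditer(seq):
        spacers = tuple(len(m.group(g)) for g in ("s1", "s2", "s3"))
        hits.append((m.start(), m.group(0), spacers))
    return hits


example = "RPYACPVESCDRRFSRSDELTRHIRIHTGQKP"  # illustrative fragment
for start, segment, spacers in scan(example):
    print(start, segment, spacers)
```

Note that `finditer` reports non-overlapping hits only, so tandem fingers that share residues would need a lookahead-based scan instead.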
37
Zinc Finger Complex with DNA
1a1h.pdb
38
III.3 RNA Regulation: KH Domain
• bind to specific DNA/RNA locations
  – regulation of RNA synthesis and metabolism
  – combination with other domains
  – Pattern: G-x-x-G
• ribonucleoprotein (RNP) domain
• double stranded RNA binding domain (dsRBD)
• K Homology (KH) domain
  – recognize tetranucleotide motifs
  – high affinity/specificity:
    • RNA secondary structure
    • repeated sequence elements
• alpha/beta fold similar to ribosomal proteins
39
KH Domain Complex with RNA
1k1g.pdb
40
Copyright ©2005 American Society of Plant Biologists
Przybilski, R., et al. Plant Cell 2005;17:1877-1885
The HHRz
Hammerhead Motif of Ribozyme
41
Hammerhead Motif of Ribozyme
• three base-paired helices (I-III)
• core of 11 highly conserved, non-complementary nucleotides
  – necessary for the catalysis
• catalytic motif discovered by sequence comparison of plant viroids
  – site-specific, self-catalyzed cleavage
(Birikh, 1997)
academic.brooklyn.cuny.edu/chem/zhuang/QD/toppage1.htm
42
Hammerhead Ribozyme Action
488d.pdb
43
Copyright ©2005 American Society of Plant Biologists
Przybilski, R., et al. Plant Cell 2005;17:1877-1885
Modeling of the Arabidopsis HHRz Ara2
44