© Burkhard Rost (TUM Munich)
title: Secondary structure prediction
short title: pp1_SecStrPred_1
lecture: Protein Prediction 1 - Protein structure TUM summer 2014
Wednesday June 4, 2014
Announcements
Videos: YouTube / www.rostlab.org
THANKS: Tim Karl + Jonas Reeb
Special lectures:
• Apr 15 - Andrea Schafferhans
No lecture:
• May 29 Thu Ascension day
• Jun 03 Tue no lecture
• Jun 10 Tue Whitsun holidays
• Jun 19 Thu Corpus Christi
LAST lecture: July 1
Exam: July 8 - this room
• Makeup: Oct 21 - morning
CONTACT: Lothar Richter [email protected]
Notation: protein structure 1D, 2D, 3D
PQITLWQRPLVTIKIGGQLKEALLDTGADDTVL
[Figure: the same protein in 1D, 2D and 3D notation; residue axes numbered 1 to 90; strand segments marked E; scale in kcal/mol from 0 to -5]
1D: secondary structure prediction
Words
• Secondary structure prediction
• 2ndary structure prediction
• 2D prediction
Coverage of structure space
Secondary structure prediction
DSSP secondary assignment has 8 “states”
H = Helix
G = 3-10 helix
I = Pi helix
E = Extended (strand)
B = beta-bridge, single strand residue
T = Turn, i.e. one turn of helix
S = bend
“ “ = loop
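The eight DSSP states are usually collapsed to three (helix, strand, other) before predictions are scored. The sketch below shows one common reduction; the slides do not state which mapping is used, so the table `DSSP_TO_3` and the function name `reduce_dssp` are assumptions made for illustration.

```python
# Assumed reduction (not stated on the slide):
# H, G, I -> helix (H); E, B -> strand (E); T, S, blank -> loop (L).
DSSP_TO_3 = {
    "H": "H", "G": "H", "I": "H",
    "E": "E", "B": "E",
    "T": "L", "S": "L", " ": "L",
}

def reduce_dssp(states: str) -> str:
    """Map a string of 8-state DSSP symbols onto the three states H, E, L."""
    return "".join(DSSP_TO_3[s] for s in states)

reduced = reduce_dssp("HHHGGG EEB TS")
print(reduced)
```

Other reductions exist (for example counting only H as helix), and the choice changes the measured accuracy, so it should be reported along with any result.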
Goal of secondary structure prediction
LEDKSPDHNPTGID
AKGKPMDRNFTGRNHPPKDSS
AAQVKDALTK
LEQWGTLAQLRAIWEQELTDFPEFLTMMARQETWLGWLTI
helix strand
loop
LAVIGVLMKW
FVFLMIEKIYHKLT
DIRVGLTYYIAQ
VNTFVGTFAAVAHAL
W Kabsch & C Sander (1985) Identical pentapeptides with different backbones. Nature 317:207
How do pentapeptides occur in 2 states?
L Pauling & RB Corey (1953) PNAS 39:247-252
L Pauling, RB Corey & HR Branson (1951) PNAS 37:205-234
W Kabsch & C Sander (1983) Biopolymers 22:2577-2637
DSSP
Pauling’s H-bond pattern used in DSSP
Secondary structure prediction methods
L Pauling, RB Corey and HR Branson (1951) Two Hydrogen-Bonded Helical Configurations of the Polypeptide Chain. PNAS 37:205-211.
L Pauling, RB Corey and HR Branson (1951) The Structure of Proteins: Two Hydrogen-bonded Helical Configurations of the Polypeptide Chain. PNAS 37:205-234.
AG Szent-Györgyi & C Cohen (1957) Role of proline in polypeptide chain configuration of proteins. Science 126:697.
some are more equal than others ...
Sec str pred methods: single residues
L Pauling, RB Corey and HR Branson (1951) Two Hydrogen-Bonded Helical Configurations of the Polypeptide Chain. PNAS 37:205-211.
L Pauling, RB Corey and HR Branson (1951) The Structure of Proteins: Two Hydrogen-bonded Helical Configurations of the Polypeptide Chain. PNAS 37:205-234.
AG Szent-Györgyi & C Cohen (1957) Role of proline in polypeptide chain configuration of proteins. Science 126:697.
MF Perutz, MG Rossmann, AF Cullis, G Muirhead, G Will and AT North (1960) Structure of haemoglobin: a three-dimensional Fourier synthesis at 5.5 Å resolution, obtained by X-ray analysis. Nature 185:416-422.
JC Kendrew, RE Dickerson, BE Strandberg, RJ Hart, DR Davies and DC Phillips (1960) Structure of myoglobin: a three-dimensional Fourier synthesis at 2 Å resolution. Nature 185:422-427.
Albert Szent-Györgyi
Albert Szent-Györgyi von Nagyrapolt (Sep 16, 1893 - Oct 22, 1986)
1937: Nobel Prize in Physiology or Medicine "for his discoveries in connection with the biological combustion processes, with special reference to vitamin C and the catalysis of fumaric acid"
Shapers and Shakers
[Photo: Albert Szent-Györgyi von Nagyrapolt; NIH © Wikipedia]
Simple prediction: frequency
First step (Szent-Györgyi)
• Proline breaks a helix
• Helices span several turns, i.e. >4 residues
-> identify helices/non-helices
Proline bends main chain
Simple prediction: frequency
from Proline to odds for all
....,....1....,....2....
QEKSPREVTMKKGDILTLLNSTNK
E..E EEEEEE
AA D E G I K L M N P Q R S T V
E 1 1 3 1 1 1
L 1 1 1 4 1 1 1 1 2 1
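The table above is obtained by counting, for every amino acid, how often it is observed in each state. A minimal sketch of that counting step follows; the toy sequence and state string are hypothetical (the column alignment of the slide's example did not survive extraction), and `count_states` is an illustrative name.

```python
from collections import Counter

def count_states(sequence: str, states: str) -> dict:
    """Count how often each amino acid is observed in each secondary structure state."""
    if len(sequence) != len(states):
        raise ValueError("sequence and state string must have equal length")
    counts = {}
    for aa, sec in zip(sequence, states):
        counts.setdefault(sec, Counter())[aa] += 1
    return counts

# Toy input (hypothetical, NOT the aligned example from the slide).
toy_seq = "KKLTLLNP"
toy_sec = "LLEEEELL"
table = count_states(toy_seq, toy_sec)
for sec, per_aa in table.items():
    print(sec, dict(per_aa))
```

Dividing each count by the total for that amino acid turns the table into the single-residue odds used by first-generation methods.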
Secondary structure prediction methods
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
Robson B & Pain RH (1971) Analysis of the Code Relating Sequence to Conformation in Proteins: Possible Implications for the Mechanism of Formation of Helical Regions. J. Mol. Biol. 58:237-259.
Chou PY & Fasman GD (1974) Prediction of protein conformation. Biochemistry 13:211-215.
Garnier J, Osguthorpe DJ and Robson B (1978) Analysis of the accuracy and Implications of simple methods for predicting the secondary structure of globular proteins. J. Mol. Biol. 120:97-120.
Sec struc pred: 1st gen
1st generation (1957-1978): e.g. Chou-Fasman / GOR
single residue odds
$p(\mathrm{SEC} \mid AA_i)$ = probability for observing secondary structure state SEC for amino acid AA at position $i$
$p(\mathrm{SEC} \mid AA_i) = p(\mathrm{SEC} \mid AA_j) \quad \forall\, i, j$
Erabutoxin β (3ebx)
V32 V36 V51
this is not a valine!
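A first-generation predictor in the sense above can be sketched in a few lines: estimate p(SEC | AA) from counts, then give every residue its most likely state regardless of position. The counts below are invented toy numbers, and `state_odds` and `predict_single_residue` are hypothetical names; this is not the Chou-Fasman or GOR parameter set.

```python
def state_odds(counts: dict) -> dict:
    """Estimate p(SEC | AA) from counts[SEC][AA]."""
    totals = {}
    for per_aa in counts.values():
        for aa, n in per_aa.items():
            totals[aa] = totals.get(aa, 0) + n
    return {aa: {sec: counts[sec].get(aa, 0) / total for sec in counts}
            for aa, total in totals.items()}

def predict_single_residue(sequence: str, odds: dict, default: str = "L") -> str:
    """Assign each residue its most likely state, ignoring position and neighbours."""
    return "".join(max(odds[aa], key=odds[aa].get) if aa in odds else default
                   for aa in sequence)

# Invented toy counts for three amino acids (illustration only).
toy_counts = {
    "H": {"A": 8, "V": 1, "P": 0},
    "E": {"A": 1, "V": 7, "P": 1},
    "L": {"A": 1, "V": 2, "P": 9},
}
odds = state_odds(toy_counts)
prediction = predict_single_residue("AVPVA", odds)
print(prediction)
```

Both valines receive the same state because the odds do not depend on position, which is the limitation the three valines of erabutoxin illustrate.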
How to assess performance?
problem 1: where to get secondary structure from?
problem 2: how to measure?
Secondary structure prediction accuracy
• Q3: three-state per-residue accuracy
$$Q_3 = \frac{\text{number of correctly predicted residues in states helix, strand, other}}{\text{number of residues in protein}}$$
Schulz GE & Schirmer RH (1979) Prediction of secondary structure from the amino acid sequence. In: (eds). Principles of protein structure. Berlin: Springer-Verlag, pp 108-130.
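The Q3 definition translates directly into code. The sketch below assumes both strings are already reduced to three states and reports the ratio as a percentage, as the accuracies in this lecture are quoted; the function name `q3` is illustrative.

```python
def q3(observed: str, predicted: str) -> float:
    """Three-state per-residue accuracy in percent."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("strings must be non-empty and of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)

print(q3("HHEL", "HHLL"))
```

Q3 counts residues only, so a prediction can reach a good Q3 while still cutting segments into unrealistic pieces.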
Secondary structure prediction methods
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
published: 63% accuracy
Secondary Structure Assignment: DSSP
Dictionary of protein Secondary Structure
ASSESSING secondary structure prediction
Wolfgang Kabsch & Chris Sander (1983) Biopolymers 22:2577-637
[Photos: Wolfgang Kabsch, Chris Sander]
Chris Sander
Sloan Kettering Cancer Center, NYC
papers:
• >770 papers (May 2011)
• 1 >6,000 citations (May 2011)
• 6 >1,000 citations (May 2011)
• 87 over 100
• H-index 92 (ISI May 2011)
ISCB Fellow
Shapers and Shakers
Secondary structure prediction methods
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy (assessed in 1994)
2nd Generation: what would you do?
Secondary Structure Prediction: Segment 83
Dictionary of protein Secondary Structure
Wolfgang Kabsch & Chris Sander (1983) Biopolymers 22:2577-637
W Kabsch & C Sander (1985) Identical pentapeptides with different backbones. Nature 317:207
W Kabsch & C Sander (1983) Segment 83 unpublished
Secondary structure prediction: 1.+2. Generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy (Q3)
segments (2. generation)
• GORIII 1986-92
55-60% Q3
• Gibrat J-F, Garnier J and Robson B (1987) Further developments of protein secondary structure prediction using information theory. New parameters and consideration of residue pairs. J. Mol. Biol. 198:425-443.
• Biou V, Gibrat JF, Levin JM, Robson B and Garnier J (1988) Secondary structure prediction: combination of three different methods. Prot. Engin. 2:185-191.
• Garnier J & Robson B (1989) The GOR method for predicting secondary structure in proteins. In: Fasman GD (ed.). Prediction of protein structure and the principles of protein conformation. New York: Plenum Press, pp 417-465.
Sec struc pred: 1st gen
1st generation (1957-1978): single residue odds
e.g. Chou-Fasman/GOR
2nd generation (1983-1992): odds for windows
e.g. GORIII
$p_1(\mathrm{SEC}_i \mid AA_i)$ = probability for observing secondary structure state SEC for amino acid AA at position $i$
$p(\mathrm{SEC} \mid AA_i) = \sum_{j=i-w}^{i+w} p_1(\mathrm{SEC}_j \mid AA_j)$ = probability for observing secondary structure state SEC for amino acid AA at position $i$, summed over the window
Erabutoxin β (3ebx)
V32 V36 V51
w=3
With windows: $p(\mathrm{SEC} \mid AA_i) \neq p(\mathrm{SEC} \mid AA_j)$
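The window idea of the second generation can be sketched as a sum of single-residue values over positions i-w to i+w. The values in `p1` are invented toy numbers and the function names are hypothetical; GORIII itself uses information-theoretic parameters that are not reproduced here.

```python
def window_scores(sequence: str, p1: dict, w: int = 3) -> list:
    """For each position i, sum p1[state][aa] over residues i-w..i+w (clipped at the ends)."""
    scores = []
    for i in range(len(sequence)):
        lo, hi = max(0, i - w), min(len(sequence), i + w + 1)
        scores.append({state: sum(p1[state].get(sequence[j], 0.0) for j in range(lo, hi))
                       for state in p1})
    return scores

def predict_window(sequence: str, p1: dict, w: int = 3) -> str:
    """Pick the state with the largest window sum at every position."""
    return "".join(max(sc, key=sc.get) for sc in window_scores(sequence, p1, w))

# Invented single-residue values for two amino acids (illustration only).
p1 = {
    "H": {"A": 0.8, "V": 0.1},
    "E": {"A": 0.1, "V": 0.7},
    "L": {"A": 0.1, "V": 0.2},
}
print(predict_window("AAAVAAA", p1, w=1))
print(predict_window("VVVAVVV", p1, w=1))
```

In the first toy sequence the valine is predicted as helix because its neighbours dominate the sum, so the same residue type no longer receives the same state everywhere.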
Secondary structure prediction: 1.+2. Generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy
segments (2. generation)
• GORIII 1986-92
55-60% accuracy
problems
• < 100% they said: 65% max
Helix formation is local
residues i and i+3
THYROID hormone receptor (2nll)
Secondary structure prediction: 1.+2. Generation
problems
• < 100% they said: 65% max
• < 40% they said: strand non-local
β-sheet formation is NOT local
Erabutoxin β (3ebx)
Secondary structure prediction: 1.+2. Generation
problems
• < 100% they said: 65% max
• < 40% they said: strand non-local
• short segments
Problems of secondary structure predictions (before 1994)
SEQ! KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLD
OBS! EEEE E E E EEEEEE EEEEEE EEEEEEHHHEEEE
TYP! EHHHH EE EEEE EE HHHEE EEEHH
INSERT: concept of neural networks
Simple neural network
[Figure: two input units connected to one output unit through the junctions J11 and J12]
$out_0 = in_1 J_{11} + in_2 J_{12}$
$out = \tanh(out_0)$
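The two-input network on this slide fits in one function. This is a direct transcription of the two formulas; the function name `simple_network` and the example values are illustrative only.

```python
import math

def simple_network(in1: float, in2: float, j11: float, j12: float) -> float:
    """out = tanh(in1 * J11 + in2 * J12)."""
    out0 = in1 * j11 + in2 * j12  # linear step: weighted sum of the inputs
    return math.tanh(out0)        # non-linear step: squash into (-1, 1)

print(simple_network(1.0, 0.0, 0.5, -0.5))
```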
Training a neural network 1
Training a neural network 2
$\mathrm{Error} = (out_{net} - out_{want})^2$
[Figure: out as a function of in; axes from -2 to 2]
Training a neural network 3
[Figure: error as a function of the junctions]
Training a neural network 4
[Figure: out as a function of in; axes from -2 to 2]
Neural networks classify points
Simple neural network with hidden layer
$$out_i = f\left( \sum_j J^{2}_{ij} \cdot f\left( \sum_k J^{1}_{jk} \cdot in_k \right) \right)$$
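The hidden-layer formula is two nested weighted sums, each passed through the trigger function f. The sketch below assumes f is the logistic sigmoid (the lecture describes a sigmoid that projects the sum onto 0-1); `forward`, `j1` and `j2` are illustrative names for the function and the two junction matrices.

```python
import math

def sigmoid(x: float) -> float:
    """Logistic trigger function, maps any sum onto the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, j1, j2, f=sigmoid):
    """out_i = f(sum_j j2[i][j] * f(sum_k j1[j][k] * in_k))."""
    hidden = [f(sum(w * x for w, x in zip(row, inputs))) for row in j1]
    return [f(sum(w * h for w, h in zip(row, hidden))) for row in j2]

# Two inputs, two hidden units, one output; all junctions zero.
print(forward([1.0, 0.0], [[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0]]))
```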
Principles of networks: input -> output
two steps:
1. linear: sum over all input × connection
2. non-linear: sigmoid trigger, i.e., project sum onto 0-1
step 1: $\sum_j \mathrm{connection}_{ij} \cdot \mathrm{input}_j$
step 2: [Figure: sigmoid curve; x-axis: input to unit (=sum); y-axis: output from unit, 0 to 1.0]
input = 3 adjacent residues in protein sequence (e.g. from :ACACC:)
output = secondary structure state of central residue (α or L)
[Figure: input units s1, s2, s3; junctions J define the decision line; result: sum < decision line]
Principles of neural networks: error
• output: $out_i = \sum_{j=1}^{N_{in}+1} J_{ij}\, in_j$
$in_j$ value of input unit j; $out_i$ value of output unit i; $J_{ij}$ connection between input unit j and output unit i
• error: $E = \sum_{i=1}^{N_{out}} (out_i - des_i)^2$
$out_i$ value of output unit i; $des_i$ secondary structure state observed for central amino acid for output unit i (e.g. for a helix: des1=1, des2=0, des3=0)
• free variables: connections { J }
• goal: representation of set of examples (training set) for which the mapping input->output is known, i.e., the secondary structure state of the central residue has been observed
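The output and error definitions can be checked numerically with a few lines. The sketch treats the extra input unit (index N_in + 1) as a constant 1, i.e. a bias; that reading is an assumption, and the function names are illustrative.

```python
def linear_output(inputs, J):
    """out_i = sum_j J[i][j] * in_j, with a constant 1 appended as the extra input unit."""
    extended = list(inputs) + [1.0]  # assumed meaning of the (N_in + 1)-th unit
    return [sum(w * x for w, x in zip(row, extended)) for row in J]

def squared_error(out, des):
    """E = sum_i (out_i - des_i)^2."""
    return sum((o - d) ** 2 for o, d in zip(out, des))

# Desired output for a helix residue: des1=1, des2=0, des3=0.
error = squared_error([0.5, 0.4, 0.1], [1.0, 0.0, 0.0])
print(round(error, 2))
```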
Principles of neural networks: training
$$\Delta J_{ij}(t+1) = -\epsilon\, \frac{\partial E(t)}{\partial J_{ij}(t)} + \alpha\, \Delta J_{ij}(t-1)$$
where ∂E/∂J is the derivative of the error with respect to the network connection; t is the algorithmic time given by the presentation of one example; ε determines the step width of the change (learning strength, typically some 0.01); α gives the contribution of the momentum term (∆J(t-1), typically some 0.2), which permits uphill moves
training = change of connections {J} such that E decreases
simplest procedure:
• gradient descent
[Figure: error as a function of the connections { J }]
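Gradient descent with a momentum term can be tried on a one-dimensional toy error before applying it to a network. The sketch below uses the step width and momentum values quoted above (0.01 and 0.2); the error function E(J) = (J - 2)^2 and the name `train` are hypothetical choices for illustration.

```python
def train(grad, j0: float, epsilon: float = 0.01, alpha: float = 0.2, steps: int = 2000) -> float:
    """Repeat: delta = -epsilon * dE/dJ + alpha * previous delta; J += delta."""
    j, prev_delta = j0, 0.0
    for _ in range(steps):
        delta = -epsilon * grad(j) + alpha * prev_delta
        j += delta
        prev_delta = delta
    return j

# Toy error E(J) = (J - 2)^2 with derivative 2 * (J - 2); the minimum is at J = 2.
j_final = train(lambda j: 2.0 * (j - 2.0), j0=0.0)
print(round(j_final, 6))
```

With a real network the same update is applied to every connection, and the derivative comes from one training example at a time.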
Effect of over-training: theory
[Figure: curves on a 0-100 scale versus training time; the over-training region is marked]
Effect of over-training: practice
[Figure: number of correct classifications per example (0.55 to 0.95) versus number of cycles (0 to 25); curves: ratio for training set, ratio for testing set]
RETURN: secondary structure prediction
Secondary structure predictions of 1. and 2. generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy
segments (2. generation)
• GORIII 1986-92
55-60% accuracy
problems
• < 100% they said: 65% max
• < 40% they said: strand non-local
• short segments
Neural Network for secondary structure
[Figure: input units for the symbols ACDEFGHIKLMNPQRSTVWY. (20 amino acids plus spacer); output units H, E, L; example window with observed states: D (L), R (E), Q (E), G (E), F (E), V (E), P (E), A (H), A (H), Y (H), V (E), K (E), K (E)]
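A sequence window is typically presented to such a network as a binary vector with one unit per symbol and position. The sketch below assumes this one-hot coding over the 21 symbols shown on the slide and a window of 13 residues (w=6), padding beyond the chain ends with the spacer; `encode_window` is an illustrative name.

```python
ALPHABET = "ACDEFGHIKLMNPQRSTVWY."  # 20 amino acids plus spacer, as on the slide

def encode_window(sequence: str, centre: int, w: int = 6) -> list:
    """Binary input vector for residues centre-w..centre+w, 21 units per position."""
    vector = []
    for pos in range(centre - w, centre + w + 1):
        # Positions outside the chain are coded with the spacer symbol.
        symbol = sequence[pos] if 0 <= pos < len(sequence) else "."
        unit = [0] * len(ALPHABET)
        unit[ALPHABET.index(symbol)] = 1
        vector.extend(unit)
    return vector

vec = encode_window("DRQGFVPAAYVKK", centre=6)
print(len(vec), sum(vec))
```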
NN predicts secondary structure
method: neural network (unbalanced); overall accuracy: 62%
[Figure: accuracy for helix, strand, other]
... and developer believes that application of machine learning is all the intelligence he will ever need...
NN sec str: training dynamics
$$E^{\mu} = \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$$
$$\Delta J^{\mu} \propto -\frac{\partial E^{\mu}\{J\}}{\partial J}$$
[Figure: performance (0 to 1) for other, strand and helix versus time (1 to 10); time: 1 step = 20,000 training samples]
NN predicts secondary structure
method: neural network (unbalanced); overall accuracy: 62%
comparison: data bank distribution
comparison: 33:33:33
full pie: all correctly predicted residues
[Figure: pie charts for helix, strand, other]
Balanced training
normal training:
$$E^{\mu} = \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2, \qquad \Delta J^{\mu} \propto -\frac{\partial E^{\mu}\{J\}}{\partial J}$$
balanced training:
$$E = \sum_{\mu=\alpha,\beta,L} \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$$
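One way to read the balanced error is that every training step sees one example of each class (helix, strand, loop), so frequent classes no longer dominate the updates. The sketch below implements only that sampling scheme; it is an interpretation of the formula rather than the documented procedure, and `balanced_batches` and the toy examples are hypothetical.

```python
import random

def balanced_batches(examples, rng, steps):
    """Yield, at every step, one randomly chosen example per class."""
    by_class = {}
    for features, label in examples:
        by_class.setdefault(label, []).append((features, label))
    for _ in range(steps):
        yield [rng.choice(by_class[label]) for label in sorted(by_class)]

# Toy training set dominated by loop examples (features are placeholders).
examples = [([0.1], "H"), ([0.2], "E")] + [([0.3 + 0.01 * k], "L") for k in range(8)]
batches = list(balanced_batches(examples, random.Random(0), steps=5))
print([label for _, label in batches[0]])
```

Summing the per-example error over each such batch gives the balanced E, which trades some overall accuracy for better strand and helix predictions.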
Balanced training: dynamics
[Figure: performance (0 to 1) for other, strand and helix versus time (1 to 10), shown for unbalanced and for balanced training]
train unbalanced: $E^{\mu} = \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$, $\Delta J^{\mu} \propto -\partial E^{\mu}\{J\} / \partial J$
train balanced: $E = \sum_{\mu=\alpha,\beta,L} \sum_i \left( o_i^{\mu} - d_i^{\mu} \right)^2$
method: neural network
unbalanced: overall accuracy 62%
balanced: overall accuracy 60%
comparison: data bank distribution
comparison: 33:33:33
full pie: all correctly predicted residues
Neural networks DO improve if developer does something more than dream the machine learning dream...
Secondary structure predictions of 1. and 2. generation
single residues (1. generation)
• Chou-Fasman, GOR 1957-70/80
50-55% accuracy
segments (2. generation)
• GORIII 1986-92
55-60% accuracy
problems
• < 100% they said: 65% max
• < 40% they said: strand non-local
• short segments
β-sheet formation is NOT local
Erabutoxin β (3ebx)
Conclusion: not all sound explanations are right!
Bad segment prediction
HHHHHHHHHEEEEE
HHHHEEE
HHHHHHHEEEEE
1st level
2nd level
comparison:
observed:
SEQ! KELVLALYDYQEKSPREVTMKKGDILTLLNSTNKDWWKVEVNDRQGFVPAAYVKKLD
OBS! EEEE E E E EEEEEE EEEEEE EEEEEEHHHEEEE
TYP! EHHHH EE EEEE EE HHHEE EEEHH
Select samples at random
$$\Delta J_{ij}(t+1) = -\epsilon\, \frac{\partial E(t)}{\partial J_{ij}(t)} + \alpha\, \Delta J_{ij}(t-1)$$
where ∂E/∂J is the derivative of the error with respect to the network connection; t is the algorithmic time given by the presentation of one example; ε determines the step width of the change (learning strength, typically some 0.01); α gives the contribution of the momentum term (∆J(t-1), typically some 0.2), which permits uphill moves
[Figure: error as a function of the connections { J }]
Local correlations in reality
residues i and i+3
Erabutoxin β (3ebx)
How to get those into the prediction?
PHDsec: structure-to-structure network
[Figure: second network whose input is the predicted state of adjacent residues, e.g. V (E), P (E), A (H), and whose output units are H, E, L]
B Rost (1996) Methods Enzymol 266:525-39
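A structure-to-structure network takes the outputs of the first network for a window of adjacent residues as its input. The sketch below shows only how such input vectors could be assembled; it is simplified (three values per residue, zero padding at the chain ends) and does not reproduce the exact PHDsec input coding. `second_level_inputs` is an illustrative name.

```python
def second_level_inputs(first_level, w: int = 1):
    """Build one input vector per residue from first-level (H, E, L) outputs in i-w..i+w."""
    padding = (0.0, 0.0, 0.0)  # simplification: zeros beyond the chain ends
    inputs = []
    for i in range(len(first_level)):
        window = []
        for j in range(i - w, i + w + 1):
            window.extend(first_level[j] if 0 <= j < len(first_level) else padding)
        inputs.append(window)
    return inputs

# Hypothetical first-level outputs for three residues, ordered (H, E, L).
first_level = [(0.6, 0.3, 0.1), (0.2, 0.7, 0.1), (0.1, 0.2, 0.7)]
inputs = second_level_inputs(first_level, w=1)
print(inputs[1])
```

Because the second network sees the predicted states of the neighbours, it can learn that isolated single-residue helices or strands are unlikely, which improves segment lengths.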
Better segment prediction
HHHHHHHHHEEEEE
HHHHEEE
HHHHHHHEEEEE
1st level
2nd level
comparison:
observed:
Better prediction of segment lengths
[Figure A: number of segments (0 to 1200) versus segment length (0 to 50) for DSSP and PHD; inset for segment lengths 25 to 50]
[Figure B: difference in number of observed - predicted segments (-800 to 800) versus segment length (0 to 10) for helix, strand, loop]
Structure-to-structure network: Invented?
N Qian & TJ Sejnowski (1988) Predicting the secondary structure of globular proteins using neural network models. J. Mol. Biol. 202:865-884.
Other ideas
More output units, e.g. instead of central residue: take central 3
1. 9 output units
2. average output -> 3 units
output back into neural networks:
Gianluca Pollastri, Dariusz Przybylski, B Rost and Pierre Baldi (2002) Improving the prediction of protein secondary structure in three and eight classes using recurrent neural networks and profiles. Proteins: Structure, Function, and Bioinformatics 47:228-235.
Other ideas
output back into neural networks:
Gianluca Pollastri, Dariusz Przybylski, B Rost and Pierre Baldi (2002) Proteins 47:228-235: Fig. 1
idea: P Frasconi & M Gori (1996) IEEE Trans Neural Netw 7:1521-5
STILL ONLY 60+ε% accuracy.
How to improve beyond that?
How to get more data into it?
Evolution has it!
[Figure: percentage sequence identity (0 to 100) versus number of residues aligned (0 to 250); regions: "Sequence identity implies structural similarity!" and "Don't know region"]
C Sander & R Schneider 1991 Proteins 9:56-68
B Rost 1999 Prot Engin 12:85-94
           1                                                   50
fyn_human  VTLFVALYDY EARTEDDLSF HKGEKFQILN SSEGDWWEAR SLTTGETGYI
yrk_chick  VTLFIALYDY EARTEDDLSF QKGEKFHIIN NTEGDWWEAR SLSSGATGYI
fgr_human  VTLFIALYDY EARTEDDLTF TKGEKFHILN NTEGDWWEAR SLSSGKTGCI
yes_chick  VTVFVALYDY EARTTDDLSF KKGERFQIIN NTEGDWWEAR SIATGKTGYI
src_avis2  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_aviss  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_avisr  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
src_chick  VTTFVALYDY ESRTETDLSF KKGERLQIVN NTEGDWWLAH SLTTGQTGYI
stk_hydat  VTIFVALYDY EARISEDLSF KKGERLQIIN TADGDWWYAR SLITNSEGYI
src_rsvpa  .......... ESRIETDLSF KKRERLQIVN NTEGTWWLAH SLTTGQTGYI
hck_human  ..IVVALYDY EAIHHEDLSF QKGDQMVVLE ES.GEWWKAR SLATRKEGYI
blk_mouse  ..FVVALFDY AAVNDRDLQV LKGEKLQVLR .STGDWWLAR SLVTGREGYV
hck_mouse  .TIVVALYDY EAIHREDLSF QKGDQMVVLE .EAGEWWKAR SLATKKEGYI
lyn_human  ..IVVALYPY DGIHPDDLSF KKGEKMKVLE .EHGEWWKAK SLLTKKEGFI
lck_human  ..LVIALHSY EPSHDGDLGF EKGEQLRILE QS.GEWWKAQ SLTTGQEGFI
ss81_yeast .....ALYPY DADDDdeISF EQNEILQVSD .IEGRWWKAR R.ANGETGII
abl_mouse  ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWV
abl1_human ..LFVALYDF VASGDNTLSI TKGEKLRVLG YnnGEWCEAQ ..TKNGQGWV
src1_drome ..VVVSLYDY KSRDESDLSF MKGDRMEVID DTESDWWRVV NLTTRQEGLI
mysd_dicdi .....ALYDF DAESSMELSF KEGDILTVLD QSSGDWWDAE L..KGRRGKV
yfj4_yeast ....VALYSF AGEESGDLPF RKGDVITILK ksQNDWWTGR V..NGREGIF
abl2_human ..LFVALYDF VASGDNTLSI TKGEKLRVLG YNQNGEWSEV RSKNG.QGWV
tec_human  .EIVVAMYDF QAAEGHDLRL ERGQEYLILE KNDVHWWRAR D.KYGNEGYI
abl1_caeel ..LFVALYDF HGVGEEQLSL RKGDQVRILG YNKNNEWCEA RlrLGEIGWV
txk_human  .....ALYDF LPREPCNLAL RRAEEYLILE KYNPHWWKAR D.RLGNEGLI
yha2_yeast VRRVRALYDL TTNEPDELSF RKGDVITVLE QVYRDWWKGA L..RGNMGIF
abp1_sacex .....AEYDY EAGEDNELTF AENDKIINIE FVDDDWWLGE LETTGQKGLF
B Rost (1996) Methods Enzymol 266:525-39
SH3
Src-homology 3 domain
one domain of proteins such as Src tyrosine kinase (STK)
Evolution improves prediction
Evolutionary profile implicitly captures history of an individual protein!
[Figure: fly, chicken, rat, mouse, human]
PHD: Neural network & evolutionary information
Protein: GYIYDPEDGDPDDGVNPGTDF
Alignments:
GYIYDPAVGDPDNGVEPGTEF
GYIYDPEVGDPTQNIPPGTKF
GYEYDPAEGDPDNGVKPGTSF
GYEYDPAEGDPDNGVKPGTAF
profile table: one row per residue position, one column per amino acid (column order GSAPD NTEKQ CVHIR LMYFW), entries = number of aligned sequences with that amino acid
[Figure: profile table -> input layer -> first or hidden layer -> second or output layer (units H, E, L) -> pick maximal unit => current prediction; junctions J1 and J2 connect the layers s0, s1, s2; each input block corresponds to the 21*3 bits coding for the profile of one residue]
B Rost & C Sander (1993) PNAS 90:7558-62
B Rost (1996) Methods Enzymol 266:525-39
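The profile table on this slide can be reproduced by counting amino acids column by column in the alignment. The sketch below uses the five sequences as read off the slide (which of them is the query protein is assumed from the slide labels) and stores raw counts; the exact normalisation and coding that PHD feeds into the network are not shown here, and `profile_counts` is an illustrative name.

```python
from collections import Counter

# Sequences read off the slide: the protein plus four aligned homologues.
alignment = [
    "GYIYDPEDGDPDDGVNPGTDF",  # assumed to be the query protein
    "GYIYDPAVGDPDNGVEPGTEF",
    "GYIYDPEVGDPTQNIPPGTKF",
    "GYEYDPAEGDPDNGVKPGTSF",
    "GYEYDPAEGDPDNGVKPGTAF",
]

def profile_counts(sequences):
    """Count amino acids column by column; gaps ('.') are skipped."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("aligned sequences must have equal length")
    return [Counter(s[i] for s in sequences if s[i] != ".") for i in range(length)]

profile = profile_counts(alignment)
print(dict(profile[0]), dict(profile[2]))
```

A fully conserved column such as the first one carries a single count of 5, while variable columns spread the counts over several amino acids; this is the extra information a single sequence cannot provide.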
From sequence to profile
[Figure: workflow with the components sequence data bank, BLAST, MaxHom, filter (sequence identity from 25% to 100% versus number of residues aligned, marked at 80), extract alignment, PHD]
B Rost (1996) Methods Enzymol 266:525-39
PHDsec: more details
[Figure: PHDsec architecture with input layer, hidden layer and output layer; output units H, L, E with example values H = 0.5, L = 0.1, E = 0.4. Per-residue unit counts shown: 21+3 and 4+1.]
first level: sequence-to-structure
second level: structure-to-structure
input local in sequence: local alignment, 13 adjacent residues, e.g. :::AAAAA.LLLLIIAAGCCSGVV:::
input global in sequence: global statistics, whole protein (unit counts shown: 20, 4, 4, 4)
• %AA: percentage of each amino acid in protein
• Length: length of protein (≤60, ≤120, ≤240, >240)
• ∆ N-term: distance: centre, N-term (≤40, ≤30, ≤20, ≤10)
• ∆ C-term: distance: centre, C-term (≤40, ≤30, ≤20, ≤10)
Profile table for the example alignment:
  A   C   L   I   G   S   V  ins  del  cons
100   0   0   0   0   0   0    0    0  1.17
100   0   0   0   0   0   0   33    0  0.42
  0   0 100   0   0   0   0    0   33  0.92
  0   0  33  66   0   0   0    0    0  0.74
 66   0   0   0  33   0   0    0    0  1.17
  0  66   0   0   0  33   0    0    0  0.74
  0   0   0  33   0   0  66    0    0  0.48
B Rost (1996) Methods Enzymol 266:525-39
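Two of the inputs listed on this slide are easy to illustrate: the local window of 13 adjacent residues and the length bins (≤60, ≤120, ≤240, >240). The sketch below assumes that positions beyond the sequence ends are filled with a spacer symbol and that the length is coded as one unit per bin; both choices, and the names `local_window` and `length_bin`, are assumptions for illustration rather than the PHDsec implementation.

```python
SPACER = "."  # assumed 21st symbol for positions outside the protein


def local_window(sequence, centre, width=13):
    """Return the window of `width` residues centred on index `centre`."""
    half = width // 2
    # Pad both ends so that windows near the termini keep the full width.
    padded = SPACER * half + sequence + SPACER * half
    return padded[centre:centre + width]


def length_bin(n_residues):
    """Code the protein length as four units: <=60, <=120, <=240, >240."""
    units = [0, 0, 0, 0]
    for i, bound in enumerate((60, 120, 240)):
        if n_residues <= bound:
            units[i] = 1
            return units
    units[3] = 1
    return units


print(local_window("ACDEFGH", 0))
print(length_bin(100))
```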
Jury
single network vs. jury decision
architectures 1, 2, 3, 4
centre of mass = jury over 1-4
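The jury decision over architectures 1-4 can be sketched as an average of the network outputs followed by picking the maximal unit. Reading "centre of mass" as the arithmetic mean is an assumption, the output values below are invented for illustration, and `jury` is a hypothetical name.

```python
def jury(outputs):
    """Average the per-state outputs of several networks and pick the maximum."""
    states = outputs[0].keys()
    mean = {s: sum(o[s] for o in outputs) / len(outputs) for s in states}
    return max(mean, key=mean.get), mean


# Invented outputs of four differently built networks for one residue.
networks = [
    {"H": 0.6, "E": 0.3, "L": 0.1},
    {"H": 0.2, "E": 0.6, "L": 0.2},
    {"H": 0.5, "E": 0.4, "L": 0.1},
    {"H": 0.5, "E": 0.3, "L": 0.2},
]

state, mean = jury(networks)
print(state, mean)
```

In this example the second network alone would predict E, while the jury over all four predicts H.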
PROFsec: Evolutionary information + more
B Rost (2001) J Struct Biol 134, 204-18
HEADER CYTOSKELETON
COMPND ALPHA SPECTRIN (SH3 DOMAIN)
SOURCE CHICKEN (GALLUS GALLUS) BRAIN
AUTHOR M.NOBLE,R.PAUPTIT,A.MUSACCHIO,M.SARASTE
Spectrin homology domain (SH3)
[Figure: SH3 domain structure; labels: 59%, 65%, 72%]
Prediction accuracy varies!
[Figure: histogram of prediction accuracy per protein chain. x-axis: per-residue accuracy (Q3), 0 to 100; y-axis: number of protein chains, 0 to 70. <Q3> = 72.3%; sigma = 10.5%. Labelled chains: 1spf, 1bct, 1stu, 3ifm, 1psm.]
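The per-residue accuracy Q3 and its spread over proteins can be computed as in the sketch below. It assumes that Q3 is the percentage of residues whose predicted state (H, E or L) matches the observed one, and that sigma is the population standard deviation over proteins; the slide does not say which estimator was used, and the Q3 values in the usage lines are invented.

```python
from statistics import mean, pstdev


def q3(observed, predicted):
    """Percentage of residues predicted in the correct one of three states."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("need two non-empty strings of equal length")
    correct = sum(o == p for o, p in zip(observed, predicted))
    return 100.0 * correct / len(observed)


def summarise(q3_values):
    """Mean and standard deviation of Q3 over a set of proteins."""
    return mean(q3_values), pstdev(q3_values)


print(q3("HHHHEEEELL", "HHHEEEEELL"))
print(summarise([60.0, 70.0, 80.0]))
```

A distribution with sigma around 10 percentage points means that the average alone says little about any single protein.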
Stronger predictions more accurate!
[Figure: Q3 per protein (y-axis, 0 to 100) against reliability index averaged over protein (x-axis, 3 to 9), with the fit Q3fit = 21 + 8.7 * <RI>, where <RI> is the reliability index averaged over the protein. Network sketch: input alphabet ACDEFGHIKLMNPQRSTVWY. and output units H, E, L for the residues D (L), R (E), Q (E), G (E), F (E), V (E), P (E), A (H), A (H), Y (H), V (E), K (E), K (E). Two example outputs: H=0.5 E=0.4 L=0.1 and H=0.8 E=0.1 L=0.1. Inset: the Q3 histogram with <Q3> = 72.3%; sigma = 10.5%; labelled chains 1spf, 1bct, 1stu, 3ifm, 1psm.]
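The two example outputs on this slide (H=0.5 E=0.4 L=0.1 and H=0.8 E=0.1 L=0.1) predict the same state with different strength. A minimal sketch of a reliability index follows, assuming the definition RI = integer part of 10 * (largest output - second largest output), clipped to 0..9; this definition is not spelled out on the slide and the function names are hypothetical.

```python
import math


def reliability_index(outputs):
    """Scale the gap between the two strongest output units to 0..9."""
    ranked = sorted(outputs.values(), reverse=True)
    gap = ranked[0] - ranked[1]
    # The small epsilon guards against floating-point noise,
    # e.g. 0.5 - 0.4 evaluating to slightly less than 0.1.
    return max(0, min(9, math.floor(10 * gap + 1e-9)))


def predict(outputs):
    """Pick the maximal unit and report how reliable that choice is."""
    state = max(outputs, key=outputs.get)
    return state, reliability_index(outputs)


print(predict({"H": 0.5, "E": 0.4, "L": 0.1}))  # weak helix prediction
print(predict({"H": 0.8, "E": 0.1, "L": 0.1}))  # strong helix prediction
```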
Correct prediction of correctly predicted residues
[Figure: overall per-residue accuracy (y-axis, 70 to 100) against percentage of residues predicted (x-axis, 0 to 100) for PHDsec, PHDacc and PHDhtm. Points along the curves are marked RI=9, RI=0, RI=4 and 7.]
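A curve of this kind can be produced by keeping only residues predicted at or above a reliability threshold and measuring accuracy on that subset. The sketch below assumes each residue is given as a pair (reliability index, prediction correct or not); the data and the name `accuracy_vs_coverage` are invented for illustration.

```python
def accuracy_vs_coverage(residues):
    """For each RI threshold return (threshold, % residues kept, % correct)."""
    total = len(residues)
    curve = []
    for threshold in range(9, -1, -1):
        kept = [ok for ri, ok in residues if ri >= threshold]
        if not kept:
            continue  # no residue reaches this reliability
        coverage = 100.0 * len(kept) / total
        accuracy = 100.0 * sum(kept) / len(kept)
        curve.append((threshold, coverage, accuracy))
    return curve


# Invented example: (reliability index, prediction correct?)
residues = [(9, True), (9, True), (5, True), (5, False), (0, False)]
curve = accuracy_vs_coverage(residues)
for point in curve:
    print(point)
```

Lowering the threshold moves along the curve towards 100% of residues predicted at the overall accuracy.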
BAD errors are frequent!
[Figure: histogram of BAD errors per protein chain. x-axis: BAD error (H for E, or E for H), 0 to 40; y-axis: number of protein chains, 0 to 350. <BAD> = 4.0%; sigma = 5.9%. Inset: percentage of errors (0 to 20) against cumulative percentage of protein chains (0 to 100).]
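The BAD error counts only confusions between helix and strand (H for E, or E for H). A minimal sketch, assuming the percentage is taken over all residues of the protein; the denominator is not stated on the slide and `bad_error` is a hypothetical name.

```python
BAD_PAIRS = {("H", "E"), ("E", "H")}


def bad_error(observed, predicted):
    """Percentage of residues with helix predicted as strand or vice versa."""
    if not observed or len(observed) != len(predicted):
        raise ValueError("need two non-empty strings of equal length")
    confused = sum((o, p) in BAD_PAIRS for o, p in zip(observed, predicted))
    return 100.0 * confused / len(observed)


print(bad_error("HHHHEEEELL", "HHHEEEHELL"))  # two H/E confusions in ten residues
print(bad_error("HHLL", "HLLH"))              # errors involving L are not BAD
```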
False prediction for engineered proteins!
GB1: IgG-binding domain of protein G (CHAMELEON) Kim & Berg, Nature, 366, 267-270, 1993
....,....1....,....2....,....3....,....4....,....5....,..
AA TTYKLILNGKTLKGETTTEAVDAATAEKVFKQYANDNGVDGEWTYDDATKTFTVTEK
DSSP EEEEEEE EEEEEEEEE HHHHHHHHHHHHHHHHH EEEEEEE EEEEEEEE
PHD 30 EEEEEE E EEHHHHHHHHHHHHHHEEE EEEEEE EEEEE
PHD no EEEEEE EEEEEHHHHHHHHHHHHHHHH EEEEE EEEEEE

AATAEKVFKQY AWTVEKAFKTF
PHD 30 EEEEEE EEEEEEE HHHHHHHHHEEE EEEE EEEEEE
PHD no EEEEEE EEEEEEHHHHHHHHHHHHHHH EEEEE EEEEEE

EWTYDDATKTF AWTVEKAFKTF
PHD 30 EEEEEE EEE EHHHHHHHHHHHHHHHH EEEEE EEEEEE
PHD no EEEEEE E E EHHHHHHHHHHHHHHHH HHHHHHH EEEEE

AWTVEKAFKTF HHHHH
Lecture plan PP1: Structure
01: 2014/04/08 Tue: sorry
02: 2014/04/10 Thu: welcome: who we are
03: 2014/04/15 Tue: Intro I - acids/structure (Andrea Schafferhans)
04: 2014/04/17 Thu: SKIP: Easter vacation
05: 2014/04/22 Tue: SKIP: Easter vacation
06: 2014/04/24 Thu: Intro II - 3D comparisons
07: 2014/04/29 Tue: Alignment 1
08: 2014/05/01 Thu: SKIP: “May day” - (NOT to be confused with “m’aidez”)
09: 2014/05/06 Tue: SKIP: student assembly (SVV)
10: 2014/05/08 Thu: Alignment 2
11: 2014/05/13 Tue: Alignment 3
12: 2014/05/15 Thu: Comparative modeling 1
13: 2014/05/20 Tue: Comparative modeling 1-2, Experimental structure determination
14: 2014/05/22 Thu: no lecture
15: 2014/05/27 Tue: Experimental structure determination / 3D -> 1D: Secondary structure assignment
16: 2014/05/29 Thu: SKIP: holiday (Ascension Day)
17: 2014/06/03 Tue: SKIP: no lecture
18: 2014/06/05 Thu: 1D: Secondary structure prediction 1
19: 2014/06/10 Tue: SKIP: Whitsun holidays
20: 2014/06/12 Thu: 1D: Transmembrane helix prediction
21: 2014/06/17 Tue: Nobel prize symposium
22: 2014/06/19 Thu: SKIP: Corpus Christi (Fronleichnam)
23: 2014/06/24 Tue: 1D: Transmembrane strand prediction, solvent accessibility
24: 2014/06/26 Thu: 2D prediction
25: 2014/07/01 Tue: 3D prediction/wrap up
26: 2014/07/03 Thu: wrap up 2
27: 2014/07/08 Tue: examen, no lecture
28: 2014/07/10 Thu: no lecture