Prediction of T cell epitopes using artificial neural networks
Morten Nielsen,
CBS, BioCentrum,
DTU
Objectives
• How to train a neural network to predict peptide:MHC class I binding
• Understand why NNs perform the best
– Higher order sequence information
• The wisdom of the crowd!
– Why enlightened despotism does not work even for neural networks
Outline
• MHC class I epitopes
– Why MHC binding?
• How to predict MHC binding?
– Information content
– Weight matrices
– Neural networks
• Neural network theory
– Sequence encoding
• Examples
Prediction of HLA binding specificity
Simple motifs
– Allowed/non-allowed amino acids
Extended motifs
– Amino acid preferences (SYFPEITHI)
– Anchor/preferred/other amino acids
Hidden Markov models
– Peptide statistics from sequence alignment (previous talk)
Neural networks
– Can take sequence correlations into account
SYFPEITHI predictions
Extended motifs based on peptides from the literature and peptides eluted from cells expressing specific HLAs (i.e., binding peptides)
The scoring scheme is not readily accessible. Positions defined as anchor or auxiliary anchor positions are weighted differently (higher). The final score is the sum of the scores at each position.
Predictions can be made for several HLA-A, -B and -DRB1 alleles, as well as some mouse K, D and L alleles.
BIMAS
Matrix made from peptides with a measured T1/2 for the MHC-peptide complex
The matrices are available on the website. The final score is the product of the scores at each position in the matrix, multiplied by a constant (different for each MHC) to give a prediction of the T1/2.
Predictions can be obtained for several HLA-A, -B and -C alleles, mouse K, D and L alleles, and a single cattle MHC.
How to predict
The effect on the binding affinity of having a given amino acid at one position can be influenced by the amino acids at other positions in the peptide (sequence correlations).
– Two adjacent amino acids may, for example, compete for the space in a pocket in the MHC molecule.
Artificial neural networks (ANNs) are ideally suited to take such correlations into account
Higher order sequence correlations
Neural networks can learn higher order correlations!
– What does this mean?
S S => 0
L S => 1
S L => 1
L L => 0
No linear function can learn this (XOR) pattern
Say that the peptide needs one and only one large amino acid at positions P3 and P4 to fill the binding cleft
How would you formulate this to test if a peptide can bind?
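The XOR pattern above can be checked numerically. A minimal sketch, assuming S = 0 / L = 1 encoding and a four-neuron hidden layer (illustrative choices, not the lecture's actual networks), shows that a network with one hidden layer can fit the pattern that defeats any linear function:

```python
import numpy as np

# The slide's XOR pattern: a binder needs one and only one large residue
# at P3/P4.  Encoding assumption: S (small) = 0, L (large) = 1.
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]], dtype=float)
y = np.array([[0.0], [1.0], [1.0], [0.0]])

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(1)
W1, b1 = rng.normal(size=(2, 4)), np.zeros(4)   # hidden layer: 4 neurons
W2, b2 = rng.normal(size=(4, 1)), np.zeros(1)   # single output neuron

losses = []
for _ in range(10000):
    h = sigmoid(X @ W1 + b1)                    # forward pass
    out = sigmoid(h @ W2 + b2)
    losses.append(float(((out - y) ** 2).mean()))
    d_out = (out - y) * out * (1 - out)         # backpropagate squared error
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= 2.0 * h.T @ d_out; b2 -= 2.0 * d_out.sum(0)
    W1 -= 2.0 * X.T @ d_h;   b1 -= 2.0 * d_h.sum(0)

print(out.ravel())  # trained outputs for the four input patterns
```

Dropping the hidden layer reduces the model to a linear (weight-matrix-like) scorer, which cannot drive the loss to zero on this pattern.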
Neural network learning higher order correlations
• How is mutual information calculated?
• Information content was calculated as
– Gives information in a single position
• Similar relation for mutual information
– Gives mutual information between two positions
Mutual information
I = Σ_a p_a log(p_a / q_a)

I = Σ_{a,b} p_ab log(p_ab / (p_a · p_b))
Mutual information. Example
ALWGFFPVA
ILKEPVHGV
ILGFVFTLT
LLFGYPVYV
GLSPTVWLS
YMNGTMSQV
GILGFVFTL
WLSLLVPFV
FLPSDFFPS
P1 P6
P(G1) = 2/9 = 0.22, ..., P(V6) = 4/9 = 0.44, ...
P(G1,V6) = 2/9 = 0.22, P(G1)·P(V6) = 8/81 = 0.10
log(0.22/0.10) > 0!
I = Σ_{a,b} p_ab log(p_ab / (p_a · p_b))
Knowing that you have G at P1 allows you to make an educated guess about what you will find at P6: P(V6) = 4/9, but P(V6|G1) = 1.0!
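The worked example can be reproduced directly from the formula. A minimal sketch using the nine peptides from the slide (the helper name is ours, not a standard library function):

```python
from collections import Counter
from math import log

# The nine peptides from the slide; compare positions P1 and P6.
peptides = ["ALWGFFPVA", "ILKEPVHGV", "ILGFVFTLT", "LLFGYPVYV",
            "GLSPTVWLS", "YMNGTMSQV", "GILGFVFTL", "WLSLLVPFV",
            "FLPSDFFPS"]

def mutual_information(seqs, i, j):
    """I = sum_ab p_ab * log(p_ab / (p_a * p_b)) between columns i and j."""
    n = len(seqs)
    pa = Counter(s[i] for s in seqs)           # marginal at column i
    pb = Counter(s[j] for s in seqs)           # marginal at column j
    pab = Counter((s[i], s[j]) for s in seqs)  # joint distribution
    return sum((c / n) * log((c / n) / ((pa[a] / n) * (pb[b] / n)))
               for (a, b), c in pab.items())

print(mutual_information(peptides, 0, 5))  # P1 vs P6 (0-based indices)
```

A positive value confirms that P1 and P6 are correlated in this set, exactly the kind of higher order signal a weight matrix discards but a neural network can exploit.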
313 binding peptides 313 random peptides
Mutual information
SLLPAIVEL YLLPAIVHI TLWVDPYEV GLVPFLVSV KLLEPVLLL LLDVPTAAV LLDVPTAAV LLDVPTAAV
LLDVPTAAV VLFRGGPRG MVDGTLLLL YMNGTMSQV MLLSVPLLL SLLGLLVEV ALLPPINIL TLIKIQHTL
HLIDYLVTS ILAPPVVKL ALFPQLVIL GILGFVFTL STNRQSGRQ GLDVLTAKV RILGAVAKV QVCERIPTI
ILFGHENRV ILMEHIHKL ILDQKINEV SLAGGIIGV LLIENVASL FLLWATAEA SLPDFGISY KKREEAPSL
LERPGGNEI ALSNLEVKL ALNELLQHV DLERKVESL FLGENISNF ALSDHHIYL GLSEFTEYL STAPPAHGV
PLDGEYFTL GVLVGVALI RTLDKVLEV HLSTAFARV RLDSYVRSL YMNGTMSQV GILGFVFTL ILKEPVHGV
ILGFVFTLT LLFGYPVYV GLSPTVWLS WLSLLVPFV FLPSDFFPS CLGGLLTMV FIAGNSAYE KLGEFYNQM
KLVALGINA DLMGYIPLV RLVTLKDIV MLLAVLYCL AAGIGILTV YLEPGPVTA LLDGTATLR ITDQVPFSV
KTWGQYWQV TITDQVPFS AFHHVAREL YLNKIQNSL MMRKLAILS AIMDKNIIL IMDKNIILK SMVGNWAKV
SLLAPGAKQ KIFGSLAFL ELVSEFSRM KLTPLCVTL VLYRYGSFS YIGEVLVSV CINGVCWTV VMNILLQYV
ILTVILGVL KVLEYVIKV FLWGPRALV GLSRYVARL FLLTRILTI HLGNVKYLV GIAGGLALL GLQDCTMLV
TGAPVTYST VIYQYMDDL VLPDVFIRC VLPDVFIRC AVGIGIAVV LVVLGLLAV ALGLGLLPV GIGIGVLAA
GAGIGVAVL IAGIGILAI LIVIGILIL LAGIGLIAA VDGIGILTI GAGIGVLTA AAGIGIIQI QAGIGILLA
KARDPHSGH KACDPHSGH ACDPHSGHF SLYNTVATL RGPGRAFVT NLVPMVATV GLHCYEQLV PLKQHFQIV
AVFDRKSDA LLDFVRFMG VLVKSPNHV GLAPPQHLI LLGRNSFEV PLTFGWCYK VLEWRFDSR TLNAWVKVV
GLCTLVAML FIDSYICQV IISAVVGIL VMAGVGSPY LLWTLVVLL SVRDRLARL LLMDCSGSI CLTSTVQLV
VLHDDLLEA LMWITQCFL SLLMWITQC QLSLLMWIT LLGATCMFV RLTRFLSRV YMDGTMSQV FLTPKKLQC
ISNDVCAQV VKTDGNPPE SVYDFFVWL FLYGALLLA VLFSSDFRI LMWAKIGPV SLLLELEEV SLSRFSWGA
YTAFTIPSI RLMKQDFSV RLPRIFCSC FLWGPRAYA RLLQETELV SLFEGIDFY SLDQSVVEL RLNMFTPYI
NMFTPYIGV LMIIPLINV TLFIGSHVV SLVIVTTFV VLQWASLAV ILAKFLHWL STAPPHVNV LLLLTVLTV
VVLGVVFGI ILHNGAYSL MIMVKCWMI MLGTHTMEV MLGTHTMEV SLADTNSLA LLWAARPRL GVALQTMKQ
GLYDGMEHL KMVELVHFL YLQLVFGIE MLMAQEALA LMAQEALAF VYDGREHTV YLSGANLNL RMFPNAPYL
EAAGIGILT TLDSQVMSL STPPPGTRV KVAELVHFL IMIGVLVGV ALCRWGLLL LLFAGVQCQ VLLCESTAV
YLSTAFARV YLLEMLWRL SLDDYNHLV RTLDKVLEV GLPVEYLQV KLIANNTRV FIYAGSLSA KLVANNTRL
FLDEFMEGV ALQPGTALL VLDGLDVLL SLYSFPEPE ALYVDSLFF SLLQHLIGL ELTLGEFLK MINAYLDKL
AAGIGILTV FLPSDFFPS SVRDRLARL SLREWLLRI LLSAWILTA AAGIGILTV AVPDEIPPL FAYDGKDYI
AAGIGILTV FLPSDFFPS AAGIGILTV FLPSDFFPS AAGIGILTV FLWGPRALV ETVSEQSNV ITLWQRPLV
Neural network training
• Sequence encoding
– Sparse
– Blosum
– Hidden Markov model
• Network ensembles
– Cross-validated training
– Benefit from ensembles
Sequence encoding
• How to represent a peptide amino acid sequence to the neural network?
• Sparse encoding (all amino acids are equally dissimilar)
• Blosum encoding (encodes similarities between the different amino acids)
• Weight matrix (encodes the position-specific amino acid preference of the HLA binding motif)
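A sketch of the sparse encoding described above (the helper name and alphabet ordering are illustrative; real tools may order the residues differently):

```python
import numpy as np

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"   # the 20 standard residues

def sparse_encode(peptide):
    """One-hot ("sparse") encoding: each residue becomes a 20-long
    binary vector, so a 9mer maps to a 9 x 20 input matrix."""
    x = np.zeros((len(peptide), 20))
    for pos, aa in enumerate(peptide):
        x[pos, AMINO_ACIDS.index(aa)] = 1.0
    return x

x = sparse_encode("GILGFVFTL")
print(x.shape)   # a 9mer gives 9 x 20 = 180 network inputs
```

Because any two distinct one-hot vectors are at the same distance from each other, sparse encoding treats all substitutions as equally dissimilar; Blosum encoding instead replaces each one-hot row with the residue's row from a substitution matrix, so similar amino acids get similar inputs.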
Evaluation of prediction accuracy
PSSM
Neural network training. Cross validation
Cross validation
Train on 4/5 of data
Test on 1/5
=> Produce 5 different neural networks, each with a different prediction focus
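The 5-fold scheme can be sketched as follows (a generic partitioning helper, not the actual training code):

```python
import random

def five_fold_partitions(data, seed=0):
    """Split data into 5 folds; yield (train, test) pairs so that each
    peptide appears in exactly one test set, as on the slide."""
    items = list(data)
    random.Random(seed).shuffle(items)          # fixed seed: reproducible folds
    folds = [items[i::5] for i in range(5)]
    for i in range(5):
        test = folds[i]
        train = [x for j, f in enumerate(folds) if j != i for x in f]
        yield train, test

data = [f"pep{i}" for i in range(25)]
for train, test in five_fold_partitions(data):
    assert len(train) == 20 and len(test) == 5  # 4/5 train, 1/5 test
```

Each of the five networks is trained on a different 4/5 of the data, and its held-out 1/5 is used to stop training at the point of best generalization.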
Neural network training curve
Maximum test set performance
Most capable of generalizing
Network ensembles
The Wisdom of the Crowds
The Wisdom of Crowds. Why the Many are Smarter than the Few. James Surowiecki
One day in the fall of 1906, the British scientist Francis Galton left his home and headed for a country fair… He believed that only a very few people had the characteristics necessary to keep societies healthy. He had devoted much of his career to measuring those characteristics, in fact, in order to prove that the vast majority of people did not have them. … Galton came across a weight-judging competition… Eight hundred people tried their luck. They were a diverse lot: butchers, farmers, clerks and many other non-experts… The crowd had guessed … 1,197 pounds; the ox weighed 1,198.
Network ensembles
• No single network, with a particular architecture and sequence encoding scheme, will consistently perform the best
• Enlightened despotism fails for neural network predictions too
– For some peptides, BLOSUM encoding with a four-neuron hidden layer best predicts the peptide/MHC binding; for other peptides a sparse-encoded network with zero hidden neurons performs best
– Wisdom of the crowd
• Never use just one neural network
• Use network ensembles
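In practice the ensemble prediction is simply the average of the member networks' outputs. A minimal sketch with stand-in "networks" (any callables returning a score; the real members would be trained ANNs with different encodings and architectures):

```python
def ensemble_predict(networks, peptide):
    """Wisdom-of-the-crowd prediction: average the outputs of all
    member networks instead of trusting a single "despot" network."""
    scores = [net(peptide) for net in networks]
    return sum(scores) / len(scores)

# Hypothetical toy "networks": here just constant-scoring callables.
nets = [lambda p: 0.60, lambda p: 0.70, lambda p: 0.50]
print(ensemble_predict(nets, "GILGFVFTL"))  # ~0.6
```

Averaging cancels the individual networks' uncorrelated errors, which is why the ensemble outperforms any single member on average.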
Evaluation of prediction accuracy
ENS: Ensemble of neural networks trained using sparse, Blosum, and weight matrix sequence encoding
T cell epitope identification
Lauemøller et al., Reviews in Immunogenetics 2001
NetMHC-3.0 update
• IEDB + more proprietary data
• Higher accuracy for existing ANNs
• More human alleles
• Non-human alleles (mice + primates)
• Prediction of 8mer binding peptides for some alleles
• Prediction of 10- and 11mer peptides for all alleles
• Output to spreadsheet
NetMHC Output
Prediction of 10- and 11mers using 9mer prediction tools
Approach:
For each peptide of length L, create 6 pseudo peptides by deleting a sliding window of length L-9, always keeping positions 1, 2, 3 and the C-terminal residue (position 9 of the resulting 9mer)
Example:
MLPQWESNTL = MLPWESNTL
MLPQESNTL
MLPQWSNTL
MLPQWENTL
MLPQWESTL
MLPQWESNL
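The sliding-window deletion can be sketched as follows (hypothetical helper name; indices are 0-based in the code):

```python
def pseudo_9mers(peptide):
    """For an L-mer (L > 9), delete a sliding window of length L-9,
    always keeping positions 1, 2, 3 and the C-terminal residue."""
    L = len(peptide)
    w = L - 9
    # The deletion window may start after P3 and must leave the
    # C-terminal residue intact: 6 placements for 10mers and 11mers.
    return [peptide[:s] + peptide[s + w:] for s in range(3, L - w)]

print(pseudo_9mers("MLPQWESNTL"))
# ['MLPWESNTL', 'MLPQESNTL', 'MLPQWSNTL', 'MLPQWENTL', 'MLPQWESTL', 'MLPQWESNL']
```

Each pseudo 9mer is scored with the 9mer network, and the six scores are then combined as shown on the following slide.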
Prediction of 10- and 11mers using 9mer prediction tools
Final prediction = average of the 6 log-scores:
(0.477 + 0.405 + 0.564 + 0.505 + 0.559 + 0.521)/6 = 0.505
Affinity: exp(log(50000)·(1 - 0.505)) = 211.5 nM
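The two steps above, averaging the log-scores and inverting the 1 - log50k transform, can be sketched as:

```python
from math import exp, log

def score_to_affinity_nm(score):
    """Map a network log-score in [0, 1] back to affinity in nM via
    the 1 - log50k transform used on the slide: aff = 50000^(1-score)."""
    return exp(log(50000) * (1 - score))

# The six pseudo-9mer scores from the slide's example.
scores = [0.477, 0.405, 0.564, 0.505, 0.559, 0.521]
avg = sum(scores) / len(scores)         # final prediction: ~0.505
print(avg, score_to_affinity_nm(avg))   # affinity: ~211.5 nM
```

Note the transform is monotonically decreasing: a score of 1 corresponds to 1 nM (very strong binding) and a score of 0 to 50,000 nM.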
Prediction using ANN trained on 10mer peptides
Prediction of 10- and 11mers using 9mer prediction tools
Examples. Hepatitis C virus. Epitope predictions
Hotspots
SARS T cell epitope identification
Peptide binding affinity
A01 predicted peptides offered to rA*0101
[Bar chart: peptide binding affinity KD (µM), y-axis 0.0 to 2.5, for peptides A1-6929 through A1-6943]
Peptides tested: 15/15 (100 %)
Binders (KD < 500 nM): 14/15 (93%)
More SARS CTL epitopes
Peptide binding affinity
A03 predicted peptides offered to rA*1101
[Bar chart: peptide binding affinity KD (µM), y-axis 0.0 to 2.5, for peptides A3-6959 through A3-6973]
Peptide binding affinity
B7 predicted peptides offered to rB*0702
[Bar chart: peptide binding affinity KD (µM), y-axis 0.0 to 2.5, for peptides B7-6989 through B7-7003]
Peptide binding affinity
B58 predicted peptides offered to rB*5801
[Bar chart: peptide binding affinity KD (µM), y-axis 0.0 to 2.5, for peptides B58-7035 through B58-7049]
Peptide binding affinity
B62 predicted peptides offered to rB*1501
[Bar chart: peptide binding affinity KD (µM), y-axis 0.0 to 2.5, for peptides HLA-B62 7050 through 7064]
A0301: 11/15   A1101: 14/15   B0702: 10/15
A0201 / B5801 / B1501: 13/15, 12/14
A2 supertype:
Molecule used: rA0201 / human β2m
Peptide binding affinity
A02 predicted peptides offered to rA*0201
[Bar chart: peptide binding affinity KD (µM), y-axis 0.0 to 2.5, for peptides A2-6944 through A2-6958]
Binders: 12/15
Vaccine design. Polytope optimization
• Successful immunization can be obtained only if the epitopes encoded by the polytope are correctly processed and presented.
• Cleavage by the proteasome in the cytosol, translocation into the ER by the TAP complex, as well as binding to MHC class I should be taken into account in an integrative manner.
• The design of a polytope can be done in an effective way by modifying the sequential order of the different epitopes, and by inserting, as linkers between the epitopes, specific amino acids that favor optimal cleavage and transport by the TAP complex.
Vaccine design. Polytope construction
[Diagram: polytope starting configuration, epitopes joined by linkers from NH2 to COOH, with C-terminal cleavage sites, cleavage within epitopes, and new-epitope cleavage at the junctions]
Immunological Bioinformatics, The MIT press.
Polytope optimization Algorithm
• Optimization of four measures:
1. The number of poor C-terminal cleavage sites of epitopes (predicted cleavage < 0.9)
2. The number of internal cleavage sites (within-epitope cleavages with a prediction larger than the predicted C-terminal cleavage)
3. The number of new epitopes (number of processed and presented epitopes in the fusion regions spanning the epitopes)
4. The length of the linker regions inserted between epitopes
• The optimization seeks to minimize the above four terms by use of Monte Carlo Metropolis simulations [Metropolis et al., 1953]
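A generic Metropolis loop over epitope orderings might look like the sketch below; the toy cost function is purely illustrative and stands in for the four predicted terms above, which require cleavage, TAP and MHC predictors:

```python
import math, random

def metropolis_optimize(epitopes, cost, steps=10000, T=1.0, seed=0):
    """Monte Carlo Metropolis search over epitope orderings: propose a
    swap of two epitopes, accept it if the cost drops, otherwise accept
    with probability exp(-dE/T)."""
    rng = random.Random(seed)
    order = list(epitopes)
    best, best_cost = list(order), cost(order)
    e = best_cost
    for _ in range(steps):
        i, j = rng.sample(range(len(order)), 2)
        order[i], order[j] = order[j], order[i]      # propose a swap
        e_new = cost(order)
        if e_new <= e or rng.random() < math.exp(-(e_new - e) / T):
            e = e_new                                # accept
            if e < best_cost:
                best, best_cost = list(order), e
        else:
            order[i], order[j] = order[j], order[i]  # reject: undo swap
    return best, best_cost

# Toy cost (assumption): penalize junctions where adjacent epitopes'
# boundary residues match, loosely mimicking new-epitope creation.
eps = ["GILGFVFTL", "LLFGYPVYV", "FLPSDFFPS", "YMNGTMSQV"]
toy_cost = lambda order: sum(a[-1] == b[0] for a, b in zip(order, order[1:]))
best, c = metropolis_optimize(eps, toy_cost)
print(best, c)
```

Occasionally accepting uphill moves lets the search escape local minima, which a pure greedy swap procedure over orderings cannot do.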
Polytope optimal configuration
Immunological Bioinformatics, The MIT press.
Summary
• MHC class I binding can be very accurately predicted using ANNs
• Higher order sequence correlations are important for peptide:MHC-I binding
• ANNs can be trained without overfitting
– Using multiple sequence encoding schemes
– Wisdom of the crowd
• Optimization can generate polytopes with a high likelihood of antigen presentation