
A Predictive Method for the Evaluation of Peptide Binding in Pocket 1 of HLA-DRB1 via Global Minimization of Energy Interactions

I.P. Androulakis,1 N.N. Nayak,1 M.G. Ierapetritou,1 D.S. Monos,2 and C.A. Floudas1*

1 Department of Chemical Engineering, Princeton University, Princeton, New Jersey
2 Department of Pediatrics, University of Pennsylvania, The Children’s Hospital of Philadelphia, Philadelphia, Pennsylvania

ABSTRACT Human leukocyte antigens (HLA), or histocompatibility molecules, are glycoproteins that play a pivotal role in the development of an effective immune response. An important function of the HLA molecules is the ability to bind and present antigen peptides to T lymphocytes. Presently there is no comprehensive way of predicting and energetically evaluating peptide binding on HLA molecules. To quantitatively determine the binding specificity of a class II HLA molecule interacting with peptides, a novel decomposition approach based on deterministic global optimization is proposed that takes advantage of the topography of the HLA binding groove and examines the interactions of the bound peptide with the five different pockets. In particular, the main focus of this paper is the study of pocket 1 of HLA-DR1 (DRB1*0101 allele). The determination of the minimum energy conformation is based on the ECEPP/3 potential energy model that describes the energetics of the atomic interactions. The minimization of the total potential energy is formulated on the set of peptide dihedral angles, Euler angles, and translation variables that describe the relative position. The deterministic global optimization algorithm αBB, which has been shown to be ε-convergent to the global minimum potential energy through the solution of a series of nonlinear convex optimization problems, is utilized. The PACK conformational energy model, which utilizes the ECEPP/3 model but also allows the consideration of protein chain interactions, is interfaced with αBB. MSEED, a program used to calculate the solvation contribution via the area accessible to the solvent, is also interfaced with αBB. Results are presented for the entire array of naturally occurring amino acids binding to pocket 1 of the HLA-DR1 molecule, and very good agreement with experimental binding assays is obtained. Proteins 29:87–102, 1997. © 1997 Wiley-Liss, Inc.

Key words: peptide docking; global optimization; HLA-DRB1 class II protein

INTRODUCTION

The docking problem has received a lot of attention in the open literature. The presented methods can be classified as shape-based methods, which rely on a molecular surface representation, and energy-based methods, which optimize the interaction energy in order to determine good dockings. Shape-based methods have the advantage of being less computationally intensive, since the number of possible binding modes can be greatly reduced by using a simplified model for the shapes of the receptor and binder. Works based on this idea include those of Lee and Richards [22], Connolly [8], Bacon and Moult [5], Jiang and Kim [19], and Kuntz and coworkers [17].

Energy-based methods, on the other hand, represent a more precise way of determining good dockings, but they are more computationally demanding. For this reason, most of the proposed approaches are based on Monte Carlo simulation and simulated annealing, such as the works of Goodsell and Olson [12], Hart and Read [16], and Caflisch and coworkers [7]. Rosenfeld and coworkers [35] present a peptide binding study based on random selection and minimization among potential peptide structures. More recently, dynamic programming optimization [21] has been used to optimize the overall free energy based on a fragment assembly algorithm, and molecular dynamics simulation has been used to study the binding affinity of the HLA-B*2705 protein [34]. All the proposed approaches identify the importance of accurate prediction, which leads to the need to establish efficient and systematic ways of predicting the globally most energetically favorable docking mode.

Contract grant sponsor: National Science Foundation; Air Force Office of Scientific Research; American Diabetes Association.

Dr. Androulakis’s current address is Corporate Research Science Laboratories, Exxon Research & Engineering Co., Annandale, NJ 08801.

*Correspondence to: C.A. Floudas, Department of Engineering, Princeton University, Princeton, NJ 08544-5263.

Received 10 December 1996; Accepted 17 April 1997

PROTEINS: Structure, Function, and Genetics 29:87–102 (1997)

© 1997 WILEY-LISS, INC.

Histocompatibility molecules, or human leukocyte antigens (HLA), are cell surface molecules that form complexes with self- and nonself-peptides. The HLA–peptide complex is recognized by the T-lymphocyte receptor and initiates antigen-specific immune responses. HLA molecules are very polymorphic, and each of them may interact with a large number of peptides. Both characteristics, their polymorphism and their binding promiscuity, serve the basic function of presenting a wide range of antigens to the immune system. Appropriately presented antigens induce either tolerance or an active immune response. HLA molecules are classified as class I and class II. This distinction relates to the mode of interaction with peptides as well as to their function and distribution in the tissues. The differences between the two classes, as well as an attempt to predict the structure of class II MHC from the structure of class I, are the subject of a recent publication [20].

In this study the presented results involve the class II molecule HLA-DR1 (DRB1*0101). A deterministic global optimization approach is proposed for determining the conformation of the binding complex with the global minimum of interaction energy. This approach is applied to evaluating the total potential energy of the entire array of amino acids interacting with pocket 1 of HLA-DR1. A detailed description of the HLA-DR1 molecule is presented in the next section. In this paper we present the problem definition and formulation, the proposed global optimization approach, and the results for all naturally occurring amino acids binding in pocket 1 of HLA-DR1, discussed in comparison with the available experimental data.

PROBLEM DEFINITION AND PROPOSED APPROACH

HLA-DR1 Molecule

The binding of an influenza virus peptide to the MHC protein HLA-DR1 [38] is illustrated in Figure 1. The HLA-DR1 molecule is shown in white, while the bound peptide is in gray, with the locations of the major protein pockets defined by residues shown in different shades. Notice that the peptide adopts an extended conformation in the binding groove of this complex. (All pictures were created with the molecular graphics program GRASP [31].)

Histocompatibility proteins are organized into two major classes. Polymorphic residues in both class I and class II proteins are clustered in the peptide-binding region and are responsible for the different peptide specificities. The major distinctive features of the class I and class II loci are (i) allograft rejection properties, (ii) relative tissue distributions, and (iii) differing chemical compositions [32]. Class I proteins generally bind fragments that range from 8 to 10 residues in length. Additionally, the protein pockets in this class show allele-defined tendencies to bind particular amino acid side chains, or to bind nonspecifically. In contrast, class II molecules bind much longer fragments, and it has proven difficult to define the binding tendencies of the various pockets [38]. The different binding properties of these two classes are conjectured to be a result of the more open structure of class II peptides. This allows longer peptides to be situated in the MHC binding groove [32].

The HLA-DR1 molecule consists of an α chain (33–35 kDa) and a β chain (26–28 kDa) consisting of 366 amino acid residues. The β1 chain of the HLA-DR1 locus is highly variable, while all other regions tend to be relatively invariant.

Crystallographic studies [38] have shown that peptide binding is accommodated by five polymorphic pockets on the surface of the HLA-DR1 molecule. Each of these pockets can accommodate a single amino acid residue when a particular peptide is bound [38]. Accordingly, these pockets play a major role in determining the peptide specificity of class II molecules. Both pocket 1 and pocket 4 have been implicated as playing vital roles in peptide binding and subsequent recognition by T cells [10,14].

Pocket 1 is the largest and deepest pocket of the HLA-DR1 molecule. The area of contact for potential binders has been estimated at 200 Å² [38]. The pocket has been implicated as an "anchor" site for the peptide. It has been postulated that the residues that bind to the other four pockets are mainly determined by which residue in the binding peptide attaches to this pocket [14]. Pocket 1 consists of hydrophobic residues, including several phenylalanine groups. This accounts for the preference of this pocket to accommodate hydrophobic residues, such as tyrosine and phenylalanine [38]. The large size of this pocket makes it the most solvent-accessible of the five pockets.

Pocket 4 is a relatively large pocket that is much shallower than pocket 1. The pocket consists of predominantly hydrophobic amino acids, except for a positively charged arginine group. This accounts for this pocket’s tendency to bind residues that have large, aliphatic side chains or negatively charged side chains. Thus, residues such as glutamate and aspartate bind favorably to this pocket, while positively charged groups such as lysine or arginine are repelled [15,38]. Pocket 4 has been shown to play an important role in the recognition of the bound peptide by T lymphocytes [10]. Pocket 4, in addition to pockets 6, 7, and 9, has been shown to be 90% inaccessible to solvent [38].

The other three pockets (6, 7, and 9) are considered to affect the determination of peptide binding to a lesser degree. Pocket 6 is a shallow pocket that prefers smaller residues. The similarly shallow pocket 7 is nondiscerning in its binding activities and only partially accommodates side chains. Pocket 9 binds aliphatic side chains due to its small, hydrophobic nature [38].

Proposed Approach

The modeling and optimization studies of the interactions between the HLA-DR1 protein and a virus peptide are based on a novel decomposition scheme. As described in the previous section, the binding specificity of the HLA-DR1 molecule is mainly determined by the binding characteristics of its five pockets, which enables the investigation of each one separately. This paper concentrates on the study of pocket 1. The key ideas in the proposed decomposition approach are (i) to consider the binding at each pocket separately, (ii) to study the binding of each amino acid to each pocket by considering one at a time, and (iii) to create a rank-ordered list of the binding amino acids for each pocket, based on an energetic criterion that reflects the binding specificity.

The proposed decomposition approach consists of the following stages.

Stage 1

In stage 1 the pocket of the HLA-DR1 protein is represented by a number of residues, as described in detail in the subsection Pocket Definition. The work of Stern and colleagues [38] provides information on the constituent amino acids of each pocket of the HLA-DR1 protein. Furthermore, this work provides the cartesian coordinates of the atoms of each amino acid of the HLA-DR1 protein.

Stage 2

For the specific pocket interacting with each naturally occurring amino acid, a mathematical model is formulated that represents all the energetic atom-to-atom interactions. These interactions are classified as (i) inter-interactions between the atoms of the residues that define the pocket of the HLA-DR1 protein and the atoms of the considered naturally occurring amino acid, and (ii) intra-interactions between the atoms of the considered naturally occurring amino acid. These interactions consist of electrostatic, nonbonded, hydrogen-bonding, torsional, and loop-closing components. In addition, solvation energy is considered, based on solvent-accessible areas. The detailed mathematical model and potential functions used are described in the subsections Protein Representation and Potential Energy Model.

Stage 3

Given a mathematical model that accounts for all the inter- and intra-interactions of the specific pocket and the considered naturally occurring amino acid, in stage 3 we formulate the global optimization problem, which minimizes the total potential energy, as explained in detail in the subsection Problem Formulation.

Fig. 1. HLA-DR1 bound to an influenza virus peptide.

Stage 4

A deterministic global optimization method, αBB [3,4,24,25], is adopted at stage 4 for the solution of the resulting nonconvex mathematical model of stage 3. This stage requires the connection of the αBB global optimization method with the conformational energy program PACK [37], which utilizes ECEPP/3 [36], and the program MSEED [33], which supplies the solvation contribution, as described in detail in the section Deterministic Global Optimization.

Stage 5

In stage 5, we introduce an energy-based criterion that allows for the comparison of the binding between a given pocket and each naturally occurring amino acid. This measure, denoted as $\Delta E$, is the difference between (i) the global minimum total potential energy obtained in stage 4, denoted $E_{\mathrm{Total}}$, and (ii) the global minimum potential energy of the considered naturally occurring amino acid when it is far away from the pocket, denoted $E^{\circ}_{\mathrm{Res}}$:

$$\Delta E = E_{\mathrm{Total}} - E^{\circ}_{\mathrm{Res}} \tag{1}$$

Note that the energies $E_{\mathrm{Total}}$ and $E^{\circ}_{\mathrm{Res}}$ include the solvation energy, as discussed in more detail in the subsection Potential Energy Model. This criterion represents a measure of the binding affinity of each amino acid to the given pocket, in the sense that it quantifies the tendency of an amino acid to bind to the pocket of the HLA-DR1 molecule. The amino acid that exhibits the lowest $\Delta E$ corresponds to the one with the best possible binding to that pocket of the HLA-DR1 protein.

Stage 6

In stage 6, we repeat the previous stages for each naturally occurring amino acid and hence create a rank-ordered list for the binding of each of them to the specific pocket. The detailed results are shown in the section Computational Studies and Discussion.
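Stages 5 and 6 can be illustrated with a minimal Python sketch. The function names and the residue labels are hypothetical, and the energy values are placeholders chosen only to exercise the code; they are not results from this study.

```python
from typing import Dict, List, Tuple


def binding_energy(e_total: float, e_res_free: float) -> float:
    """Delta E = E_Total - E_Res^0, following Eq. (1)."""
    return e_total - e_res_free


def rank_binders(energies: Dict[str, Tuple[float, float]]) -> List[Tuple[str, float]]:
    """Return (residue, Delta E) pairs sorted from lowest to highest Delta E."""
    delta = {res: binding_energy(e_tot, e_free)
             for res, (e_tot, e_free) in energies.items()}
    return sorted(delta.items(), key=lambda item: item[1])


# Placeholder (E_Total, E_Res^0) pairs in arbitrary energy units.
# These numbers are invented for illustration and are NOT from the paper.
example = {
    "RES_A": (-30.0, -10.0),
    "RES_B": (-12.0, -5.0),
    "RES_C": (-40.0, -35.0),
}
ranking = rank_binders(example)
for residue, d_e in ranking:
    print(f"{residue}: Delta E = {d_e:.1f}")
```

In this sketch the residue with the lowest ΔE is listed first, which matches the criterion stated in stage 5.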

MATHEMATICAL MODELING

Protein Representation

The geometry of a protein can be fully described by defining the relative cartesian coordinates of each atom. Instead of specifying the coordinate vector for all atoms in a protein, one can specify all bond lengths, covalent bond angles, and dihedral angles. Under biological conditions, the bond lengths and bond angles are fairly rigid and thus can be assumed to be fixed at their equilibrium values. Under this assumption, the dihedral angles determine the geometric shape of the folded protein. The names of the dihedral angles of a folded protein chain follow a standard nomenclature, as shown in Figure 2.

If more than one polypeptide is involved, then the relative orientations and locations of the different chains must be defined. This can most easily be accomplished by defining a translation vector and a rotation matrix. The translation is achieved through the cartesian coordinates of the initial nitrogen atom of each independent chain. The Euler angles specify the rotations necessary to orient a particular polypeptide and are defined as the angles between the coordinate axes defined by the initial hydrogen, nitrogen, and alpha carbon of each residue. The detailed determination of the Euler angles is given in Appendix A.
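The rigid-body placement described above can be sketched in Python as a rotation followed by a translation. The Euler angle convention actually used is defined in Appendix A, which is not reproduced here, so the Z-Y-Z convention below is an assumption made purely for illustration; `rotation_zyz` and `place_chain` are hypothetical names.

```python
import math
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def rotation_zyz(e1: float, e2: float, e3: float) -> List[List[float]]:
    """Rotation matrix Rz(e1) * Ry(e2) * Rz(e3).

    ASSUMPTION: Z-Y-Z Euler convention; the paper's own convention is in its Appendix A.
    """
    c1, s1 = math.cos(e1), math.sin(e1)
    c2, s2 = math.cos(e2), math.sin(e2)
    c3, s3 = math.cos(e3), math.sin(e3)
    return [
        [c1 * c2 * c3 - s1 * s3, -c1 * c2 * s3 - s1 * c3, c1 * s2],
        [s1 * c2 * c3 + c1 * s3, -s1 * c2 * s3 + c1 * c3, s1 * s2],
        [-s2 * c3, s2 * s3, c2],
    ]


def place_chain(local_coords: Sequence[Vec3], euler: Vec3, translation: Vec3) -> List[Vec3]:
    """Rotate chain-local atom coordinates, then shift them by the translation vector."""
    rot = rotation_zyz(*euler)
    placed = []
    for x, y, z in local_coords:
        placed.append(tuple(
            rot[row][0] * x + rot[row][1] * y + rot[row][2] * z + translation[row]
            for row in range(3)
        ))
    return placed


# A 90-degree rotation about z maps (1, 0, 0) to (0, 1, 0) before the shift.
placed_atoms = place_chain([(1.0, 0.0, 0.0)], (math.pi / 2, 0.0, 0.0), (1.0, 2.0, 3.0))
```

With this parameterization a chain contributes three translation variables and three Euler angles in addition to its dihedral angles.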

Pocket Definition

The relative energies of minimization of the five protein pockets are mainly determined by the residues that constitute these pockets. A Program for Pocket Definition, denoted as PPD, constructs these pockets by selecting all residues that are within a radius R of the atoms of the crystallographic binder. A range of values for R has been evaluated in an attempt to find a radius that realistically represents the pocket while limiting the number of residues necessary to define it. The user provides a file containing the coordinates of the HLA-DR1 molecule, a file with the coordinates of the influenza virus binding peptide, and the value of the radius R. Each of the five pockets of the HLA-DR1 molecule was run through PPD for three different radii (R = 4.0, 4.5, and 5.0 Å). The program filters through the α and β strands of the molecule and returns two output files: a list of all atoms within the radius R together with their exact distances, and a PDB file with the coordinates of all residues that define a given protein pocket for the specific radius. Table I presents the residues defining each of the protein pockets for the various radii considered. Several important observations should be made. First, there is the intuitive trend of the pockets becoming more complex with increased radius. Second, pocket 1 is composed of a significantly larger number of residues than any of the other pockets. Finally, in some cases an increase in radius does not cause the inclusion of additional amino acids. An example of this is pocket 7, where the pocket is identically defined across the entire range of radii.

Fig. 2. Dihedral angles of a standard amino acid.
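The distance-cutoff selection performed by PPD can be sketched as follows. This is not the PPD program itself: the function name, the data layout, and the toy coordinates are hypothetical, and real input would come from the PDB coordinate files mentioned above.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
Atom = Tuple[str, Point]  # (residue label, atom coordinates in Angstrom)


def define_pocket(protein_atoms: List[Atom], binder_atoms: List[Point],
                  radius: float) -> Dict[str, float]:
    """Map each residue with an atom within `radius` of any binder atom
    to its closest atom-to-atom distance."""
    selected: Dict[str, float] = {}
    for label, xyz in protein_atoms:
        closest = min(math.dist(xyz, b) for b in binder_atoms)
        if closest <= radius:
            selected[label] = min(closest, selected.get(label, math.inf))
    return selected


# Toy coordinates, invented for illustration only.
toy_protein = [
    ("RES1", (0.0, 0.0, 0.0)),
    ("RES1", (1.0, 0.0, 0.0)),
    ("RES2", (3.5, 0.0, 0.0)),
    ("RES3", (9.0, 0.0, 0.0)),
]
toy_binder = [(0.0, 3.0, 0.0)]

for r in (4.0, 4.5, 5.0):
    print(r, sorted(define_pocket(toy_protein, toy_binder, r)))
```

Because the selection is a simple cutoff, the residue set for a larger radius always contains the set for a smaller one, which is the nesting visible in Table I.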

The resulting PDB file is then translated to the internal coordinate system by a program denoted as ARAS (Amino acid Residue Angle Solver). The output file obtained by this program is the one required by the conformational energy program PACK to evaluate the potential energy, as described in detail in the next section.

Potential Energy Model

Molecular dynamics calculations employ an empirically derived set of potential energy contributions for approximating the force field of the protein system. These energy functions are based on specific types of interactions instead of being associated with a particular molecule. The parameters for these correlations have been determined to provide the best possible agreement with experimental data. Many different parameterizations have been proposed for approximating the force field in protein folding calculations. Some of the most popular ones are ECEPP [26], MM2 [1], ECEPP/2 [30], CHARMM [6], DISCOVER [9], AMBER [41,42], GROMOS [39], ENCAD [23], MM3 [2], and ECEPP/3 [29]. In this work the detailed ECEPP/3 [29] potential model is utilized. In this potential model, it is assumed that the covalent bond lengths and angles are fixed at their equilibrium values, and the conformational energy is treated as the sum of electrostatic, nonbonded, hydrogen bonding, torsional, and cystine contributions.

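The potential function of ECEPP/3 includes the following terms:

$$
\begin{aligned}
E ={}& \sum_{(i,j)\in ES} \frac{332.0\, q_i q_j}{D\, r_{ij}} && \text{(Electrostatic)} \\
&+ \sum_{(i,j)\in NB} F\,\frac{A}{r_{ij}^{12}} - \frac{C}{r_{ij}^{6}} && \text{(Nonbonded)} \\
&+ \sum_{(hx)\in HX} F\,\frac{A'}{r_{hx}^{12}} - \frac{B}{r_{hx}^{10}} && \text{(Hydrogen bonding)} \\
&+ \sum_{k\in TOR} \left(\frac{E_0}{2}\right)\left(1 \pm \cos n_k \theta_k\right) && \text{(Torsional)} \\
&+ \sum_{l\in LOOP} B_L \sum_{i_l=1}^{i_l=3} \left(r_{i_l} - r_{i_0}\right)^2 && \text{(Cystine loop-closing)} \\
&+ \sum_{l\in LOOP} A_L \left(r_{4_l} - r_{4_0}\right)^2 && \text{(Cystine torsional)}
\end{aligned}
$$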
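As a rough illustration of the first two terms of this potential, the following Python sketch evaluates the electrostatic and nonbonded contributions for pairs of atoms. It is not the PACK or ECEPP/3 code. Reading D as a dielectric constant is an assumption, a single (A, C) pair is applied to all atom pairs for brevity (real force fields use type-dependent parameters), and all numerical values are placeholders.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]
ChargedAtom = Tuple[Point, float]  # (coordinates in Angstrom, partial charge in e)


def electrostatic(q_i: float, q_j: float, r_ij: float, dielectric: float) -> float:
    """332.0 * q_i * q_j / (D * r_ij).

    The factor 332.0 gives kcal/mol for charges in electron units and
    distances in Angstrom. ASSUMPTION: D is a dielectric constant.
    """
    return 332.0 * q_i * q_j / (dielectric * r_ij)


def nonbonded(a: float, c: float, r_ij: float, f: float = 1.0) -> float:
    """F * A / r^12 - C / r^6, the 12-6 nonbonded term."""
    return f * a / r_ij ** 12 - c / r_ij ** 6


def inter_energy(pocket: List[ChargedAtom], binder: List[ChargedAtom],
                 a: float, c: float, dielectric: float) -> float:
    """Sum both terms over every pocket-atom / binder-atom pair."""
    total = 0.0
    for xyz_p, q_p in pocket:
        for xyz_b, q_b in binder:
            r = math.dist(xyz_p, xyz_b)
            total += electrostatic(q_p, q_b, r, dielectric) + nonbonded(a, c, r)
    return total


# Placeholder atoms and parameters, invented for illustration only.
toy_pocket = [((0.0, 0.0, 0.0), 0.5)]
toy_binder = [((1.0, 0.0, 0.0), -0.5)]
e_inter = inter_energy(toy_pocket, toy_binder, a=1.0, c=2.0, dielectric=2.0)
```

The hydrogen bonding, torsional, and cystine terms would be accumulated in the same way over their own index sets.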

TABLE I. PPD Pocket Compositions for R = 4.0–5.0 Å

Pocket   R = 4.0 Å    R = 4.5 Å    R = 5.0 Å
1        Ile α31      Ile α31      Ile α31
         Trp α43      Trp α43      Trp α43
         Ser α53      Ser α53      Ser α53
         Val β85      Val β85      Val β85
         Phe β89      Phe β89      Phe β89
         Phe α32      Asn β82      Asn β82
         Ala α52      Phe α32      Thr β90
         Phe α54      Ala α52      Phe α32
         Gly β86      Phe α54      Ala α52
         -            Gly β86      Phe α54
         -            Phe α24      Gly β86
         -            -            Phe α24
         -            -            Glu α55
4        Gln α9       Gln α9       Gln α9
         Phe β13      Phe β13      Phe β13
         Arg β71      Arg β71      Arg β71
         Tyr β78      Tyr β78      Tyr β78
         Asn α62      Leu β26      Leu β26
         Gln β70      Asn α62      Asn α62
         Ala β74      Gln β70      Gln β70
         -            Ala β74      Ala β74
         -            Glu α11      Glu α11
6        Glu α11      Glu α11      Glu α11
         Val α65      Val α65      Val α65
         Leu β11      Leu β11      Leu β11
         Asn α62      Asn α62      Arg β71
         Asp α66      Asp α66      Asn α62
         -            -            Asp α66
         -            -            Phe β13
7        Val α65      Val α65      Val α65
         Glu β28      Glu β28      Glu β28
         Trp β61      Trp β61      Trp β61
         Arg β71      Arg β71      Arg β71
         Asn α69      Asn α69      Asn α69
         Tyr β47      Tyr β47      Tyr β47
         Leu β67      Leu β67      Leu β67
9        Ile α72      Ile α72      Ile α72
         Met α73      Met α73      Met α73
         Trp β9       Trp β9       Trp β9
         Tyr β60      Tyr β60      Tyr β60
         Asn α69      Asn α69      Leu α70
         Arg α76      Arg α76      Asn α69
         Asp β57      Asp β57      Arg α76
         -            Trp β61      Asp β57
         -            -            Trp β61

In addition, the solvation energy is considered through the utilization of the program MSEED [33], which supplies solvent-accessible areas. Once these areas have been calculated, the following formula can be utilized to define the solvation potential:

$$E_{\mathrm{SOL}} = \sum_{i=1}^{n} \sigma_{k(i)} A_i \tag{2}$$

where n is the total number of atoms in the molecule, $\sigma_{k(i)}$ is a coefficient dependent upon the atom type, and $A_i$ is the solvent-accessible area of the ith atom. The $\sigma$ coefficients were determined by the research performed in Ref. 40.

The solvent-accessible area is determined by rolling a spherical test probe over the surface of the molecule (see Fig. 3). The areas of direct contact between the molecule and the probe define the accessible surface. Additionally, the area of the bottommost part of the probe traces the surface in inaccessible cavities of the protein. The probe radius is equivalent to the van der Waals radius of a water molecule, which is 1.4 Å. These empirical solvent-accessible surface areas are calculated by the program MSEED. This program utilizes Connolly’s analytical algorithm, which is described in Ref. 33. Note that $E_{\mathrm{SOL}}$ is only added to the overall potential at local minima, and hence is not explicitly stated in the above equation. This is done because the parameters of the JRF set used in Ref. 40 were derived based on a set of tetrapeptide conformations that correspond to local minima of the ECEPP potential energy [38]. The total energy $E_{\mathrm{Total}}$ is then defined as:

$$E_{\mathrm{Total}} = E + E_{\mathrm{SOL}}$$
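Equation (2) and the definition of the total energy can be written as a short Python sketch. The per-atom areas would come from a surface program such as MSEED and the σ coefficients from the parameter set of Ref. 40; the values below are placeholders invented for illustration and are not the JRF parameters.

```python
from typing import Dict, Sequence


def solvation_energy(areas: Sequence[float], atom_types: Sequence[str],
                     sigma: Dict[str, float]) -> float:
    """E_SOL = sum_i sigma_{k(i)} * A_i, where k(i) is the type of atom i."""
    return sum(sigma[kind] * area for area, kind in zip(areas, atom_types))


def total_energy(e_potential: float, e_solvation: float) -> float:
    """E_Total = E + E_SOL, with E_SOL added at a local minimum of E."""
    return e_potential + e_solvation


# Placeholder coefficients and areas (NOT the JRF values, NOT MSEED output).
toy_sigma = {"C": 0.5, "O": -1.0}
e_sol = solvation_energy([10.0, 4.0], ["C", "O"], toy_sigma)
e_total = total_energy(-20.0, e_sol)
```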

Problem Formulation

As described in the subsection Protein Representation, a particular amino acid chain can be defined by a translation vector, a rotation matrix, and the corresponding set of dihedral angles. The translation vector is defined as the coordinates of the nitrogen atom on the first residue of a chain, while the rotation matrix is defined by the Euler angles. Since the pocket is considered to be rigid, the only variables are the amino nitrogen coordinates, Euler angles, and dihedral angles of the amino acid binder.

Let $k = 1, \ldots, K$, where K is the total number of side-chain angles of the amino acid residue that attempts to bind the pocket. Then the set of variable dihedral angles includes the backbone angles ($\phi$, $\psi$, and $\omega$) and the side-chain angles ($\chi_k$). The cartesian coordinates of the amino translation vector are defined by the variables $N_x$, $N_y$, and $N_z$. Similarly, the cartesian coordinates of the backbone carboxyl carbon are represented by $C'_x$, $C'_y$, and $C'_z$. Finally, the Euler angles are represented by $\epsilon_1$, $\epsilon_2$, and $\epsilon_3$. Utilizing the above definitions, the potential energy minimization problem can be formulated as follows:

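$$
\begin{aligned}
\min\ & E(\phi, \psi, \omega, \chi_k, N_x, N_y, N_z, \epsilon_1, \epsilon_2, \epsilon_3) && (3) \\
\text{s.t.}\ & -\pi \le \phi \le \pi && (4) \\
& -\pi \le \psi \le \pi && (5) \\
& -\pi \le \omega \le \pi && (6) \\
& -\pi \le \chi_k \le \pi, \quad k = 1, \ldots, K && (7) \\
& -\pi \le \epsilon_1 \le \pi && (8) \\
& -\pi \le \epsilon_2 \le \pi && (9) \\
& -\pi \le \epsilon_3 \le \pi && (10) \\
& N_x^{l} \le N_x \le N_x^{u} && (11) \\
& N_y^{l} \le N_y \le N_y^{u} && (12) \\
& N_z^{l} \le N_z \le N_z^{u} && (13) \\
& C_x'^{\,l} \le C'_x(\phi, \psi, \omega, \chi_k, N_x, N_y, N_z, \epsilon_1, \epsilon_2, \epsilon_3) \le C_x'^{\,u} && (14) \\
& C_y'^{\,l} \le C'_y(\phi, \psi, \omega, \chi_k, N_x, N_y, N_z, \epsilon_1, \epsilon_2, \epsilon_3) \le C_y'^{\,u} && (15) \\
& C_z'^{\,l} \le C'_z(\phi, \psi, \omega, \chi_k, N_x, N_y, N_z, \epsilon_1, \epsilon_2, \epsilon_3) \le C_z'^{\,u} && (16)
\end{aligned}
$$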

Note that the superscripts u and l denote upper and lower bounds, respectively, for the cartesian coordinates of both the amino nitrogen and the carboxyl carbon. In addition to the constraints on the amino nitrogen, the above formulation contains constraints on the carboxyl carbon. Because the decomposition employed enables the consideration of each pocket separately, it has been assumed that the conformational movements of the binding peptide are constrained only by the locations of these two atoms. In the original problem, though, the binding residue is part of a longer antigen peptide. The rest of the binding peptide is assumed to bind normally, so even if the binding residue is changed these backbone atoms will be relatively confined to their initial positions due to their peptidic linkages. Although the constraints on the amino nitrogen can be considered directly in the above formulation, since they correspond to problem variables, the $C'$ coordinates are not explicit variables and consequently must be defined as a function of the other variables (see Appendix B).

Note that E is a nonconvex function involving numerous local minima that correspond to metastable states of the specific amino acid binding to pocket 1. A single global minimum defines the energetically most favorable peptide conformation. In establishing a rank-ordered list of binding peptides, one needs to identify rigorously the best conformation of (i) the binding residue far from the pocket and (ii) the complex of pocket 1 with the binding residue. Consequently, there is a need for a method that can guarantee convergence to the global minimum potential energy conformation and that is capable of solving large-scale constrained optimization problems. In this paper, the global optimization approach αBB [3,4,25], described in more detail in the next section, has been extended to peptide systems interacting with realistic atomistic potential energy models (e.g., ECEPP/3 [36]) that include solvation contributions via the solvent-accessible surface area (e.g., MSEED [33]).

Fig. 3. Determination of solvent-accessible area.
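The variable set and simple bounds of Eqs. (3) to (13) can be assembled as in the following Python sketch, which makes the problem size explicit: a residue with K side-chain angles gives K + 9 variables. The names are hypothetical, and the nonlinear constraints (14) to (16) on the carboxyl carbon are deliberately omitted because they depend on Appendix B.

```python
import math
from typing import Dict, List, Sequence, Tuple

Bound = Tuple[str, float, float]  # (variable name, lower bound, upper bound)


def variable_bounds(n_side_chain: int, n_lower: Sequence[float],
                    n_upper: Sequence[float]) -> List[Bound]:
    """Box constraints (4)-(13): angles in [-pi, pi], nitrogen position in a user box."""
    bounds: List[Bound] = [(name, -math.pi, math.pi) for name in ("phi", "psi", "omega")]
    bounds += [(f"chi_{k}", -math.pi, math.pi) for k in range(1, n_side_chain + 1)]
    bounds += [(f"eps_{i}", -math.pi, math.pi) for i in (1, 2, 3)]
    bounds += [(f"N_{axis}", lo, hi) for axis, lo, hi in zip("xyz", n_lower, n_upper)]
    return bounds


def is_feasible(values: Dict[str, float], bounds: List[Bound]) -> bool:
    """Check the simple bounds only; Eqs. (14)-(16) are not checked here."""
    return all(lo <= values[name] <= hi for name, lo, hi in bounds)


# Example: a residue with two side-chain angles and a placeholder nitrogen box.
toy_bounds = variable_bounds(2, (-1.0, -1.0, -1.0), (1.0, 1.0, 1.0))
toy_point = {name: 0.0 for name, _, _ in toy_bounds}
```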

DETERMINISTIC GLOBAL OPTIMIZATION

Global Optimization Approach

The global optimization scheme αBB [3,4,25] is a deterministic branch and bound algorithm for locating the global optimum based on the construction of converging lower and upper bounds. Upper bounds can be obtained simply by minimizing E using local methods. Lower bounds can be evaluated by constructing the convex underestimator, L, of the original function E and evaluating the single global minimum of the resulting convex problem.

A convex lower bounding function L of the potential energy function E can be defined by augmenting E using the ideas of the approach introduced in Ref. 25:

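$$
\begin{aligned}
L = E + \alpha \Big\{ & (\phi^{l} - \phi)(\phi^{u} - \phi) + (\psi^{l} - \psi)(\psi^{u} - \psi) \\
& + (\omega^{l} - \omega)(\omega^{u} - \omega) + \sum_{k=1}^{K} (\chi_{k}^{l} - \chi_k)(\chi_{k}^{u} - \chi_k) \\
& + (N_x^{l} - N_x)(N_x^{u} - N_x) + (N_y^{l} - N_y)(N_y^{u} - N_y) \\
& + (N_z^{l} - N_z)(N_z^{u} - N_z) + (\epsilon_1^{l} - \epsilon_1)(\epsilon_1^{u} - \epsilon_1) \\
& + (\epsilon_2^{l} - \epsilon_2)(\epsilon_2^{u} - \epsilon_2) + (\epsilon_3^{l} - \epsilon_3)(\epsilon_3^{u} - \epsilon_3) \Big\}
\end{aligned}
$$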

where α is a nonnegative parameter that must be greater than or equal to the negative one half of the minimum eigenvalue of the Hessian of E over the rectangle under consideration, which is described by the lower and upper bounds of the involved variables, denoted by the superscripts l and u, respectively. The following properties of the function L enable the construction of a global optimization algorithm. These properties, whose proof is given in Ref. 25, demonstrate that

1. L is always a valid underestimator of E.

2. L matches E at all corner points of the box constraints.

3. L is convex.

4. The maximum separation between L and E is bounded and proportional to α and to the square of the diagonal of the current box constraints. This property ensures that the ε_f feasibility and ε_c convergence tolerances can be reached for a finite-size partition element.

5. The underestimators L constructed over supersets of the current set are always less tight than the underestimator constructed over the current box constraints for every point within the current box constraints.
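The properties listed above can be checked numerically on a toy problem. The sketch below is only an illustration under stated assumptions: the one-variable function `energy` is a hypothetical stand-in for E (not the ECEPP/3 potential), and `ALPHA` is chosen from the known second derivative of that toy function. Each quadratic term adds 2α to the second derivative, which is why α must be at least minus one half of the smallest second derivative.

```python
import math

# Assumption: for the toy E below, E'' = -9*sin(3x) >= -9, so alpha = 4.5
# satisfies alpha >= -0.5 * min(E'') on any interval.
ALPHA = 4.5

def energy(x):
    """Hypothetical nonconvex stand-in for E (not the ECEPP/3 potential)."""
    return math.sin(3.0 * x) + 0.5 * x

def underestimator(x, lo, hi, alpha=ALPHA):
    """L(x) = E(x) + alpha * (lo - x) * (hi - x) on the box [lo, hi]."""
    return energy(x) + alpha * (lo - x) * (hi - x)

def max_separation(lo, hi, alpha=ALPHA):
    """Largest value of E - L on [lo, hi]; it is attained at the midpoint."""
    return alpha * (hi - lo) ** 2 / 4.0

def check_properties(lo, hi, n=300):
    """Return (L <= E everywhere, L == E at corners, L convex, max gap) on a grid."""
    xs = [lo + (hi - lo) * i / n for i in range(n + 1)]
    ls = [underestimator(x, lo, hi) for x in xs]
    below = all(l <= energy(x) + 1e-12 for x, l in zip(xs, ls))
    corners = abs(ls[0] - energy(lo)) < 1e-12 and abs(ls[-1] - energy(hi)) < 1e-12
    # Nonnegative second differences on a uniform grid indicate convexity.
    convex = all(ls[i - 1] - 2.0 * ls[i] + ls[i + 1] >= -1e-9 for i in range(1, n))
    gap = max(energy(x) - l for x, l in zip(xs, ls))
    return below, corners, convex, gap
```

Halving the box width reduces `max_separation` by a factor of four, which is the mechanism behind property 4.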

These bounds are successively refined by iteratively partitioning the initial feasible region into smaller ones. The feasible region partition is achieved by subdivision of a rectangle into two subrectangles by halving along the longest side of the initial rectangle (bisection). At each iteration the lower bound is the minimum over all the minima in every subrectangle composing the original domain. Therefore, a simple way to produce a nondecreasing sequence of lower bounds is to halve only the subrectangle responsible for the infimum of the minima. A nonincreasing sequence of upper bounds can also be produced by solving the nonconvex problem locally and selecting the minimum over the previously recorded upper bounds. Based on this procedure, a fathoming step of the algorithm leads to no further consideration of a subrectangle where the minimum is greater than the current upper bound. The proof of convergence to an ε-global solution in a finite number of steps is given in Ref. 25.
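The partitioning, fathoming, and selection rules described in this paragraph can be sketched as three small helpers. This is a minimal illustration, not the αBB code: the function names and the dictionary key `lower_bound` are hypothetical, and boxes are represented simply as lists of lower and upper variable bounds.

```python
def bisect_longest(lower, upper):
    """Split the box [lower, upper] into two halves along its longest side."""
    widths = [u - l for l, u in zip(lower, upper)]
    axis = widths.index(max(widths))
    mid = 0.5 * (lower[axis] + upper[axis])
    left_upper = list(upper)
    left_upper[axis] = mid
    right_lower = list(lower)
    right_lower[axis] = mid
    return (list(lower), left_upper), (right_lower, list(upper))

def fathom(nodes, best_upper):
    """Drop boxes whose lower bound exceeds the best upper bound found so far."""
    return [node for node in nodes if node["lower_bound"] <= best_upper]

def select_next(nodes):
    """Pick the box holding the smallest lower bound for the next bisection."""
    return min(nodes, key=lambda node: node["lower_bound"])
```

Always refining the box returned by `select_next` is what makes the sequence of global lower bounds nondecreasing.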

Algorithmic Description

The proposed approach for the determination of the global minimum of E that corresponds to the peptide conformation binding to pocket 1 of HLA-DR1, as posed in the section Problem Definition and Proposed Approach, necessitates the development of an optimization interface that combines the global optimization program αBB, the conformational energy program PACK, which utilizes ECEPP/3, the solvation program MSEED, and the local optimization solver NPSOL. Additional program files serve to link these programs. A schematic diagram of the interface between the programs used is shown in Figure 4.

The following steps are required in the calculation of the global minimum:

1. The local solver NPSOL (Ref. 11) obtains a local minimum of the potential function supplied by PACK in a domain (rectangle) defined by the original lower and upper bounds of the variables (bounds are supplied by αBB). PACK determines the energies of individual chains through repeated calls to ECEPP/3.

2. The solvation energy at this local minimum is calculated by MSEED. This hydration energy is added to the potential function to yield E_Total, which will serve as an upper bound on the global minimum solution in the current rectangle.

3. The current best upper bound is updated to be the minimum of those stored thus far.

4. The current rectangle is partitioned by bisection along the longest side.

5. The convex function L is minimized in each rectangle and the solvation energy is added at the minimum. If a solution is greater than the best upper bound it will be eliminated; otherwise it will be kept on the stack.

6. The rectangle with the current minimum solution for L is selected for further partitioning.

7. If the best upper and lower bounds are within ε the program will terminate; otherwise it will proceed to step 1.
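The loop formed by steps 1 to 7 can be sketched on a one-variable toy problem. The sketch makes several simplifying assumptions that are not part of the paper: `energy` is a hypothetical stand-in for E_Total, the solvation term is omitted, the upper bound is obtained by evaluating the energy at the minimizer of L instead of calling a local solver such as NPSOL, and the convex subproblem is solved by ternary search.

```python
import heapq
import math

ALPHA = 4.5  # valid for the toy E below: alpha >= -0.5 * min(E'') = 4.5

def energy(x):
    """Hypothetical stand-in for E_Total (the paper uses PACK/ECEPP/3 plus MSEED)."""
    return math.sin(3.0 * x) + 0.5 * x

def minimize_underestimator(lo, hi, iters=100):
    """Minimize the convex L on [lo, hi] by ternary search; return (L_min, x_min)."""
    def lower(x):
        return energy(x) + ALPHA * (lo - x) * (hi - x)
    a, b = lo, hi
    for _ in range(iters):
        m1 = a + (b - a) / 3.0
        m2 = b - (b - a) / 3.0
        if lower(m1) <= lower(m2):
            b = m2
        else:
            a = m1
    x = 0.5 * (a + b)
    return lower(x), x

def alpha_bb_1d(lo, hi, eps=1e-6, max_iter=100000):
    """Return (x_best, upper_bound, lower_bound) with upper - lower <= eps."""
    lb, x = minimize_underestimator(lo, hi)
    best_ub, best_x = energy(x), x           # any feasible point gives an upper bound
    heap = [(lb, lo, hi)]                     # boxes ordered by their lower bound
    for _ in range(max_iter):
        if not heap:
            break
        lb, a, b = heapq.heappop(heap)        # box with the smallest lower bound
        if best_ub - lb <= eps:               # termination test of step 7
            return best_x, best_ub, lb
        mid = 0.5 * (a + b)                   # bisection of the selected box
        for c, d in ((a, mid), (mid, b)):
            child_lb, x = minimize_underestimator(c, d)
            ub = energy(x)
            if ub < best_ub:                  # update the best upper bound
                best_ub, best_x = ub, x
            if child_lb <= best_ub:           # otherwise the box is fathomed
                heapq.heappush(heap, (child_lb, c, d))
    raise RuntimeError("no epsilon-global solution found within max_iter")
```

Calling `alpha_bb_1d(-3.0, 3.0)` brackets the global minimum of the toy function between the returned lower and upper bounds, even though the function has several local minima on that interval.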

It should be noted that ECEPP/3 calculates the potential energy function for a polypeptide chain (Ref. 36), while PACK is a program that inputs multiple-chain data and makes the appropriate calls to ECEPP/3 for calculation of the interaction energies (Ref. 37). Special penalty functions had to be added to the upper bound function, E, in order to implement the previously discussed constraints on C′ (Equations 14–16). The modified objective function then takes the following form:

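\[
\begin{aligned}
E' = E + \beta \Big\{ &\langle C_{x}^{\prime l} - C'_{x} \rangle + \langle C'_{x} - C_{x}^{\prime u} \rangle \\
&+ \langle C_{y}^{\prime l} - C'_{y} \rangle + \langle C'_{y} - C_{y}^{\prime u} \rangle \\
&+ \langle C_{z}^{\prime l} - C'_{z} \rangle + \langle C'_{z} - C_{z}^{\prime u} \rangle \Big\}
\end{aligned}
\]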

The ⟨ ⟩ function is defined as follows: ⟨A⟩ equals A if A is greater than zero; otherwise ⟨A⟩ equals zero. Thus, as long as the coordinates are within the defined bounds the objective function will not be modified. Yet, if a particular coordinate falls outside of the bounds, the function will be increased by the value of the transgression multiplied by the arbitrarily large constant β.
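A minimal sketch of this penalty term follows. The C′ bounds are those listed in Table II (x from −10.0 to −9.0, y from 25.0 to 27.0, z from 16.0 to 18.0); the value of `BETA` and the function names are assumptions made for illustration, since the text only states that β is arbitrarily large.

```python
def bracket(a):
    """<A> equals A if A is greater than zero, otherwise zero."""
    return a if a > 0.0 else 0.0

# (lower, upper) bounds on the C' coordinates for pocket 1, from Table II.
C_PRIME_BOUNDS = {"x": (-10.0, -9.0), "y": (25.0, 27.0), "z": (16.0, 18.0)}

# Hypothetical value; the paper only calls beta "arbitrarily large".
BETA = 1.0e4

def penalized_energy(energy, c_prime, bounds=C_PRIME_BOUNDS, beta=BETA):
    """E' = E + beta * (sum of bound violations of the C' coordinates)."""
    violation = 0.0
    for axis, coord in zip("xyz", c_prime):
        lo, hi = bounds[axis]
        violation += bracket(lo - coord) + bracket(coord - hi)
    return energy + beta * violation
```

With C′ inside the box the energy is returned unchanged; moving C′ 0.5 units past the upper x bound adds 0.5·β to it.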

Since the pocket is assumed rigid, the optimization variables are the dihedral angles, the translation vector, and the Euler angles of the amino acid under consideration. These variables are partitioned into three sets. The first one (i.e., global variables) consists of the variables where branching occurs; the second set (i.e., local variables) consists of the variables where branching is not performed; and the third set (i.e., fixed variables) includes the variables for which there exists sufficient experimental evidence for keeping them fixed.

The information required by the user is provided in four different files. The first one is required by PACK and contains information about the different protein chains to be considered. The second one is needed by ECEPP/3 and contains information about the sequence and number of the amino acid residues and the type of end groups. It also initializes the dihedral angles, translation vector, and Euler angles. The third file provides the bounds on the amino nitrogen and carboxyl carbon of the binding residue. These bounds are defined around the Cartesian coordinates of these two atoms of the influenza virus peptide (Ref. 38). Thus, for each pocket, bounds were set in the x, y, z directions around the coordinates of the corresponding atoms of the influenza virus peptide presented by Stern and colleagues (Ref. 38). These bounds are given in Table II.

Fig. 4. The interface for global optimization.

TABLE II. Bounds on N and C′ for Pocket 1 of HLA-DRB1

Atom   Coordinate   Lower    Upper
N      x            −9.2     −8.4
N      y            23.3     24.1
N      z            17.3     18.1
C′     x            −10.0    −9.0
C′     y            25.0     27.0
C′     z            16.0     18.0


COMPUTATIONAL STUDIES AND DISCUSSION

In this section, each of the naturally occurring amino acids is examined and assessed regarding its binding affinity for pocket 1 of the HLA-DR1 molecule. Before presenting the results obtained by the proposed approach, the following point regarding amino acid polarity should be made. The aliphatic side chains of the amino acids Ala, Val, Ile, and Leu can be clearly considered nonpolar, whereas at the opposite end of the polarity scale are the charged residues Glu, Asp, Arg, and Lys. Asn and Gln, which have amide side chains, as well as the hydroxylic amino acids Ser and Thr, are polar and are expected to interact strongly with water and have high solubility. The polarity of the rest of the amino acids is more ambiguous. Cys and His have pKa values close to 7 and may actually be charged in many proteins under physiological conditions. In our computational studies we consider Arg+, His+, and Lys+ as positively charged residues and Asp− and Glu− as negatively charged residues, using the parameters of ECEPP/3 for evaluating their energy contributions. In Tyr the aromatic ring compensates for the hydroxyl and makes Tyr a nonpolar residue. Gly and Pro are special, but from the way they behave in proteins they can be considered nonpolar and polar, respectively.

Individual Solvated Residues Away From Pocket 1

As mentioned in the section Problem Definition and Proposed Approach, in order to evaluate the energy of interaction of the 20 naturally occurring amino acids within a pocket, the intramolecular energy due to atomic interactions between the atoms of the single residue far away from the pocket has to be calculated. Thus, the global minimum energy for each residue in isolation is found with the consideration of the solvation contribution, as described in the subsection Potential Energy Model. The results obtained by applying the global optimization algorithm αBB (Refs. 3, 4, 25) are shown in Table III, where, based on the previous remark regarding amino acid polarity, some of the residues are considered charged.

Complex of Pocket 1 and Binding Amino Acids

As mentioned earlier, ΔE, defined as the difference E_Total − E^0_Res, has been considered to represent the binding potential of a specific naturally occurring amino acid to the pocket considered. As shown in Table I, the number of amino acids included in pocket 1 increases as R increases. Specifically, from 4.5 to 5.0 Å the amino acids that were added are threonine and glutamic acid. Note that Glu is negatively charged and is an important factor in evaluating the interactions with positively charged residues, as illustrated in Table V.

The results for pocket 1 with R = 5.0 Å are presented in Tables IV and V and in Figure 5a. Trp, Tyr, and Phe are found to have the strongest binding affinities, with interaction energies in the range of −20.0 to −16.950 kcal/mol. In the middle of the list are Leu, Ile, and Val, with interaction energies between −12.481 and −11.209 kcal/mol. At the bottom of the list are the negatively charged residues Glu− and Asp−, with 40% smaller interaction energy than that of Val. An interesting result of the theoretical studies is the one obtained for the positively charged residues, which appear to be the most unfavorable binders for this pocket.
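The ranking follows directly from the definition ΔE = E_Total − E^0_Res. As a small illustration, the sketch below recomputes ΔE for a subset of residues using the E_Total and E^0_Res values reported in Table IV and sorts them from strongest to weakest binder; the dictionary layout and function names are choices made for this example only.

```python
# (E_total, E_res0) in kcal/mol for a subset of residues, copied from Table IV.
ENERGIES = {
    "Tyr": (-198.950, -178.950),
    "Phe": (-180.475, -160.850),
    "Trp": (-201.180, -184.230),
    "Leu": (-35.647, -23.166),
    "Ile": (-29.539, -17.074),
    "Val": (-36.264, -25.055),
    "Ala": (-52.498, -42.143),
    "Asp-": (-69.847, -67.416),
}

def interaction_energy(e_total, e_res0):
    """Delta E: energy of the residue in the pocket minus its energy in isolation."""
    return e_total - e_res0

def rank_binders(energies):
    """Return (residue, Delta E) pairs sorted from most to least negative Delta E."""
    delta = {name: interaction_energy(*pair) for name, pair in energies.items()}
    return sorted(delta.items(), key=lambda item: item[1])
```

Note that the ordering is by ΔE, not by E_Total: Trp has the lowest total energy of the three aromatic residues but ranks third because its isolated-state energy is also the lowest.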

A series of competitive binding assays was performed that involved analogs of the HA peptide (306–318) and the DR1 molecule (Ref. 28). Since the HA (306–318) peptide residue that interacts with pocket 1 is Y(308), a number of analog peptides were synthesized that substituted Y(308) with 11 different amino acids. The relative binding affinity was derived from the reciprocal of the 50% inhibitory concentration (IC50) of each analog peptide on a logarithmic scale. Figures 5 and 6 show the theoretical and the experimental results, respectively.

Based on the competitive binding assays shown in Figure 6, three groups of binding affinities have been identified. The first group includes the amino acids Trp, Tyr, and Phe, which are the residues with the highest affinity for DR1. The second group includes the amino acids Ile, Leu, and Val, which are characterized by an intermediate level of affinity for DR1. The third group consists of low-affinity amino acids. This group involves the charged residues Asp−, Glu−, Arg+, and His+ as well as the amino acids serine and threonine.

TABLE III. Standard Energies for Individual Solvated Residues

Residue   Code   E^0,S_Res (kcal/mol)
Ala       A      −42.143
Asn       N      −94.376
Cys       C      −74.667
Gln       Q      −86.964
Gly       G      −56.260
Ile       I      −17.074
Leu       L      −23.166
Met       M      −46.269
Phe       F      −160.850
Ser       S      −82.476
Thr       T      −69.423
Trp       W      −184.230
Tyr       Y      −178.950
Val       V      −25.055
Glu−      E−     −56.607
Asp−      D−     −67.416
His+      H+     −125.460
Arg+      R+     −105.800
Lys+      K+     −27.706

Based on the theoretical predictions shown in Figure 5, Trp, Tyr, and Phe are at the top positions of the rank-ordered list of the examined naturally occurring amino acids, a result that is further supported by the strong preference of this pocket for large hydrophobic side chains. Furthermore, the amino acids Leu, Ile, and Val were found by the optimization studies to be characterized by potential energies that correspond to the 7th, 8th, and 11th positions on the ordered list, respectively. The binding assays resulted in intermediate affinities for these amino acids. The ΔE value of −11.209 kcal/mol for the binding of Val reflects an approximate increase of 43% as compared to Tyr. Provided that an increase of 15% in potential energy defines the group of strong binders (Trp, Tyr, Phe), an increase of up to 43% could very well reflect a group of intermediate-level binders. At the bottom of Table IV, the global optimization studies place the charged residues, which is also in agreement with the experimental data. An increase of approximately 31% between the ΔE values of Val and Glu− reflects the low-affinity group of amino acids.

The laboratory studies present serine and threonine as relatively weak binders. The hydroxyl groups on both of these residues would favor interaction with polar molecules. Thus, weak interactions with the hydrophobic pocket 1 would be a predictable consequence. Although valine appears to have a binding energy comparable to those of these two residues, the optimization results again support the observation that these residues are weaker binders than the small aliphatic residues.

Finally, there are a number of amino acids, including Gln, Lys+, Met, Asn, Cys, Gly, and Ala, for which no analogs were synthesized. However, it has been reported in Refs. 18 and 13 that peptides with Met are intermediate binders, while peptides with Ala result in loss of peptide binding. The ΔE values of −13.943 and −10.355 kcal/mol for Met and Ala, respectively, found from our global optimization studies are consistent with the reported binding studies.

TABLE IV. Relative Energies for Solvated Residues in Pocket 1 (R = 5.0 Å)

Residue   E^S_Total (kcal/mol)   E^0,S_Res (kcal/mol)   ΔE (kcal/mol)
Tyr       −198.950               −178.950               −20.00
Phe       −180.475               −160.850               −19.625
Trp       −201.180               −184.230               −16.950
Gln       −102.360               −86.964                −15.396
Met       −60.212                −46.269                −13.943
Asn       −108.160               −94.376                −13.784
Thr       −82.718                −69.423                −13.297
Leu       −35.647                −23.166                −12.481
Ile       −29.539                −17.074                −12.465
Ser       −94.033                −82.476                −11.557
Cys       −85.947                −74.667                −11.280
Val       −36.264                −25.055                −11.209
Ala       −52.498                −42.143                −10.355
Gly       −66.351                −56.260                −10.091
Glu−      −64.531                −56.607                −7.744
Asp−      −69.847                −67.416                −2.431

TABLE V. Relative Energies for Solvated Positively Charged Residues in Pocket 1 (R = 5.0 Å)

Residue   E^S_Total (kcal/mol)   E^0,S_Res (kcal/mol)   ΔE (kcal/mol)
His+      −58.374                −124.870               +66.496
Lys+      +196.15                −27.706                +223.856
Arg+      +182.78                −105.640               +288.420

Fig. 5. ΔE (kcal/mol) of the naturally occurring amino acids.

Fig. 6. Experimental data for the naturally occurring amino acids.

Therefore, the theoretical results are in excellent agreement with those obtained by the experimental approach of competitive binding assays (Ref. 28).

Moreover, since the optimization interface produces a PDB file of the coordinates of the minimum energy conformation of the binder, a direct comparison with the crystallographic data can be made for the tyrosine residue that binds to pocket 1. Figure 7 shows the HA peptide binder (tyrosine 308) in white and the minimum energy conformation of tyrosine for pocket 1 in gray. An almost identical orientation, differing by 1.28 Å, is observed. Figures 9 and 10 illustrate the orientations of phenylalanine and tryptophan in comparison to the virus peptide binding shown in Figure 11 and suggest that these residues are in fact very strong binders.

The need for determining the global minimum conformation is illustrated in Figure 8, where a local minimum conformation of tyrosine corresponding to −196.637 kcal/mol, which differs by only 1.16% from the global minimum of −198.95 kcal/mol, is shown in gray, whereas the global minimum conformation is shown in white. Note that the local minimum energy conformation of tyrosine is very different from the global minimum conformation despite its proximity to the global minimum energy value.

Fig. 7. Comparison of tyrosine binding to pocket 1.

Fig. 8. Local vs global minimum configuration of tyrosine.

Fig. 9. Phenylalanine binding to pocket 1.

Fig. 10. Tryptophan binding to pocket 1.

Fig. 11. Influenza virus peptide binding to pocket 1.

For the charged residues, experimental data suggest weak binding affinity. The fact that pocket 1 is an extremely hydrophobic region intuitively supports this result. Charged residues are not stabilized by the weak van der Waals interactions that stabilize the conglomeration of hydrophobic residues. The hypothesis that results from this knowledge is that the introduction of charge on the five aforementioned residues should greatly destabilize their interactions with pocket 1, which corresponds to an increase in their overall conformational energies (ΔE). The theoretical results obtained for the negatively charged residues support this idea. Positively charged residues have a large binding energy due to the large electrostatic contribution from the interaction with the negatively charged glutamic acid within the pocket, which, however, does not make them favorable binders, as indicated by their orientations shown in Figures 12–14 for arginine+, histidine+, and lysine+, respectively. Forcing these residues inside the pocket gives rise to large (positive) energies (Table V, Fig. 5b) that indicate highly unfavorable residues (see Figs. 15–17).

SUMMARY AND CONCLUSIONS

In this paper, a novel predictive method is proposed for modeling and studying the binding affinity of different naturally occurring amino acids for pocket 1 of the HLA-DR1 protein. First, the composition of the pocket is identified together with the Cartesian coordinates of the atoms that participate in each amino acid of pocket 1 of the HLA-DR1 protein. Second, explicit relations for all the energetic inter- and intra-interactions between the atoms of the residues that define the pocket of the HLA-DR1 protein and the atoms of the considered naturally occurring amino acid were derived. Moreover, solvation energy was also taken into account based on the solvent-accessible area method. Then, the docking problem is formulated as a nonconvex optimization problem on a set of independent dihedral angles, Euler angles, and translation variables. The deterministic global optimization method αBB is then adopted for the solution of the resulting problem. It is based on the generation of a sequence of converging upper and lower bounds found from the local solution of the nonconvex problem and from the convex lower bounding problem, which is constructed based on eigenvalue analysis of the nonconvex potential energy function. The final step of the proposed approach consists of evaluating the interaction energy for all naturally occurring amino acids and generating a rank-ordered list.

Fig. 12. Arginine+ binding to pocket 1.

Fig. 13. Histidine+ binding to pocket 1.

Fig. 14. Lysine+ binding to pocket 1.

Fig. 15. Arginine+ forced within pocket 1.

The results of the proposed approach were found to agree very well with the experimental competitive binding assays. It should be emphasized that, although in this paper only one of the binding sites of the HLA-DR1 protein is examined, the approach is applicable to predicting the binding affinity of the amino acid residues in the different pockets. Our current work focuses on studying the binding information for the different pockets in order to be able to predict the binding of the whole peptide to the HLA-DR1 protein.

ACKNOWLEDGEMENTS

Financial support from the National Science Foundation and the Air Force Office of Scientific Research is gratefully acknowledged.

REFERENCES

1. Allinger, N.L. Conformational analysis. MM2: a hydrocarbon force field utilizing V1 and V2 torsional terms. J. Am. Chem. Soc. 99:8127, 1977.

2. Allinger, N.L., Yuh, Y.H., Liu, J.-H. Molecular mechanics: the MM3 force field for hydrocarbons. J. Am. Chem. Soc. 111:8551, 1989.

3. Androulakis, I.P., Maranas, C.D., Floudas, C.A. αBB: A global optimization method for general constrained nonconvex problems. J. Global Optimiz. 7:337–363, 1995.

4. Androulakis, I.P., Maranas, C.D., Floudas, C.A. Prediction of oligopeptide conformations via deterministic global optimization. J. Global Optimiz. 11:1–34, 1997.

5. Bacon, D.J., Moult, J. Docking by least-square fitting of molecular surface patterns. J. Mol. Biol. 225:849–858, 1992.

6. Brooks, B., Bruccoleri, R., Olafson, B., States, D., Swaminathan, S., Karplus, M. CHARMM: a program for macromolecular energy minimization and dynamics calculation. J. Comp. Chem. 8:132, 1983.

7. Caflisch, A., Niederer, P., Anliker, M. Monte Carlo docking of oligopeptides to proteins. Proteins 13:223–230, 1992.

8. Connolly, M.L. Solvent-accessible surfaces of proteins and nucleic acids. Science 221:709, 1983.

9. Dauber-Osguthorpe, P., Roberts, V.A., Osguthorpe, D.J., Wolff, J., Genest, M., Hagler, A.T. Structure and energetics of ligand binding to peptides: Escherichia coli dihydrofolate reductase-trimethoprim, a drug receptor system. Proteins 4:31, 1988.

10. Fu, X., Bono, C., Woulfe, S., Swearingen, C., Summers, N., Sinigaglia, F., Sette, A., Schwartz, B., Carr, R.W. Pocket 4 of the HLA-DR molecule is a major determinant of T cell recognition of peptide. J. Exp. Med. 181:915–926, 1995.

11. Gill, P., Murray, W., Saunders, M., Wright, M. User’s Guide for NPSOL (Version 4.0): A Fortran Package for Nonlinear Programming. Stanford University Department of Operations Research, January 1986.

12. Goodsell, D.S., Olson, A.J. Automated docking of substrates to proteins by simulated annealing. Proteins 8:195–202, 1990.

13. Hammer, J., Bolin, C., Papadopoulos, D., Walsky, J., Higelin, J., Danho, W., Sinigaglia, F., Nagy, Z.A. High-affinity binding of short peptides to major histocompatibility complex class II molecules by anchor combinations. Proc. Natl. Acad. Sci. U.S.A. 91:4456, 1994.

14. Hammer, J., Bono, E., Gallazzi, F., Belunis, C., Nagy, Z., Sinigaglia, F. Precise prediction of major histocompatibility complex class II-peptide interaction based on peptide side chain scanning. J. Exp. Med. 180:2353–2358, 1994.

15. Hammer, J., Gallazzi, F., Bono, E., Karr, R., Guenot, J., Valsasnini, Nagy, Z. Peptide binding specificity of HLA-DR4 molecules: correlation with Rheumatoid Arthritis Association. J. Exp. Med. 181:1847–1855, 1995.

16. Hart, T.N., Read, R.J. A multiple-start Monte-Carlo docking method. Proteins 13:206–222, 1992.

17. Kuntz, I.D., Blaney, J.M., Oatley, S.J., Langridge, R., Ferrin, T.E. A geometric approach to macromolecule-ligand interactions. J. Mol. Biol. 161:269–288, 1982.

18. Jardetzky, T.S., Gorga, J.C., Bush, R., Rothbard, J., Strominger, J.L., Wiley, D.C. Peptide binding to HLA-DR1: a peptide with most residues substituted to alanine retains MHC binding. EMBO J. 9:1797, 1990.

19. Jiang, F., Kim, S.H. Soft docking: Matching of molecular surface cubes. J. Mol. Biol. 219:79–102, 1991.

20. Nauss, J.L., Reid, R.H., Sadegh-Nasseri, S. Accuracy of a structural homology model for a class II histocompatibility protein, HLA-DR1: comparison to the crystal structure. J. Biomol. Struct. Dyn. 12:1213–1233, 1995.

21. Gulukota, K., Vajda, S., Delisi, C. Peptide docking using dynamic programming. J. Comp. Chem. 17:418–428, 1996.

22. Lee, B., Richards, F.M. The interpretation of protein structures: estimation of static accessibility. J. Mol. Biol. 55:379–400, 1971.

23. Levitt, M. Protein folding by restrained energy minimization and molecular dynamics. J. Mol. Biol. 170:723, 1983.

Fig. 16. Histidine+ forced within pocket 1.

Fig. 17. Lysine+ forced within pocket 1.

24. Maranas, C.D., Androulakis, I.P., Floudas, C.A. A deterministic global optimization approach for the protein folding problem. DIMACS Ser. Discrete Math. Theor. Comput. Sci. 23:133–150, 1995.

25. Maranas, C.D., Floudas, C.A. Global minimum potential energy conformations of small molecules. J. Global Optimiz. 4:135–170, 1994.

26. Momany, F.A., Burgess, A.W., McGuire, R.F., Scheraga, H.A. Energy parameters in polypeptides. VII. Geometric parameters, partial atomic charges, nonbonded interactions, hydrogen bond interactions, and intrinsic torsional potentials for the naturally occurring amino acids. J. Phys. Chem. 79:2361, 1975.

27. Momany, F.A., Carruthers, L.M., McGuire, R.F., Scheraga, H.A. Energy parameters in polypeptides. J. Phys. Chem. 78:1595, 1974.

28. Monos, D., Soulika, A., Argyris, E., Gorga, J., Stern, L., Magafa, V., Cordopatis, P., Androulakis, I.P., Floudas, C.A. HLA diversity functional and medical implications. Proc. Int. Histocomp. World Conf. 12:xx–xx, 1997.

29. Nemethy, G., Gibson, K.D., Palmer, K.A., Yoon, C.N., Paterlini, G., Zagari, A., Rumsey, S., Scheraga, H.A. Energy parameters in polypeptides. 10. Improved geometrical parameters and nonbonded interactions to use in the ECEPP/3 algorithms, with applications to proline-containing peptides. J. Phys. Chem. 96:6472, 1992.

30. Nemethy, G., Pottle, M.S., Scheraga, H.A. Energy parameters in polypeptides. 9. Updating of geometrical parameters, nonbonded interactions and hydrogen bond interactions for the naturally occurring amino acids. J. Phys. Chem. 89:1883, 1983.

31. Nicholls, A. GRASP: Graphical Representation and Analysis of Surface Properties, October 1992.

32. Paul, W.E. "Fundamental Immunology." Philadelphia: Lippincott-Raven, 1993.

33. Perrot, G., Cheng, B., Gibson, K.D., Palmer, K.A., Vila, J., Nayeem, A., Maigret, B., Scheraga, H.A. MSEED: a program for the rapid analytical determination of accessible surface areas and their derivatives. J. Comput. Chem. 13:1–11, 1992.

34. Rognan, D., Scapozza, L., Folkers, G., Daser, A. Molecular dynamics simulation of MHC-peptide complexes as a tool for predicting potential T cell epitopes. Biochemistry 33:11476–11485, 1994.

35. Rosenfeld, R., Zheng, Q., Vajda, S., DeLisi, C. Computing the structure of bound peptides: applications to antigen recognition by class I major histocompatibility complex receptors. J. Mol. Biol. 234:515–521, 1993.

36. Scheraga, H.A. ECEPP/3 USER GUIDE. Cornell University Department of Chemistry, January 1993.

37. Scheraga, H.A. PACK: Programs for Packing Polypeptide Chains (online documentation), 1996.

38. Stern, L., Brown, J., Jardetzky, T., Gorga, J., Urban, R., Strominger, L., Wiley, D. Crystal structure of the human class II MHC protein HLA-DR1 complexed with an influenza virus peptide. Nature 368:215–221, 1994.

39. van Gunsteren, W.F., Berendsen, H.J.C. "GROMOS: Groningen Molecular Simulation." Groningen: The Netherlands, 1987.

40. Vila, J., Williams, R.L., Vasquez, M., Scheraga, H.A. Empirical solvation models can be used to differentiate native from non-native conformations of bovine pancreatic trypsin inhibitor. Proteins 10:199–218, 1991.

41. Weiner, S., Kollman, P., Case, D.A., Singh, U.C., Ghio, C., Alagona, G., Profeta, S., Weiner, P. A new force field for molecular mechanical simulation of nucleic acids and proteins. J. Am. Chem. Soc. 106:765, 1984.

42. Weiner, S., Kollman, P., Nguyen, D., Case, D. An all-atom force field for simulations of proteins and nucleic acids. J. Comp. Chem. 7:230, 1986.

APPENDIX A

Determination of Euler Angles

The calculation of the Euler angles is complicated by the fact that the location of the hydrogen atom in (x, y, z) space is not known. For this reason, a discussion of the method of hydrogen position determination will precede the Euler angle theory.

Determination of Hydrogen Location

The basic steps behind finding the location of a hydrogen are the following: (1) define a basis system, (2) find the relative position of the hydrogen in this system, and (3) translate and rotate this position to a new basis defined by a particular molecule. This will become clear as the explanation proceeds.

The diagram in Figure 18 shows how the axes are defined in relation to the prime carbon, nitrogen, and alpha carbon. The phi angle is always approximately equal to 180°. Hence the N–Cα bond defines the x-axis, while the Cα–C′ and N–H bonds lie in the xy plane.

The position of the hydrogen is initially unknown. The position of the nitrogen is taken to be at the origin. The position of the alpha carbon is known because the N–Cα bond length is approximately 1.435 Å. By this definition, the positions of the atoms are N(0, 0, 0) and Cα(1.435, 0, 0). The position of the hydrogen can be found from the knowledge that the H–N–Cα bond angle is 121°. This angle is illustrated by the arrow in Figure 18. Since the hydrogen lies in the xy plane, it will lie in the quadrant defined by negative x and positive y. Hence the position is easily found, remembering that the N–H bond length is approximately 1.0 Å. The x and y coordinates are just the cosine and sine of the angle of 121°, respectively. Solving this system yields H(−0.5150, 0.8570, 0), or, since nitrogen is defined as the origin in this basis, the following vector is defined:

\[
\mathbf{NH} = -0.5150\,\mathbf{i} + 0.8570\,\mathbf{j} + 0.00\,\mathbf{k} \tag{17}
\]

where i, j, and k are the defined unit vectors for the basis.

Fig. 18. Axes defined by the relative positions of C′, N, and Cα.

Now if N, C′, and Cα are at a random orientation in space, the magnitudes (or distances from the defined origin, nitrogen) do not change, but the directions of the unit normals do change. Hence if i, j, and k are defined in the new orientation, the vector will still be defined as above but will have a different value due to the change in the unit vectors.

The final step is to define these unit vectors as the basis of the orientation of a particular molecule, where the coordinates of N, C′, and Cα are known. The vectors NCα and CαC′ are easily defined by subtraction of coordinates. Then the unit vectors in the axis directions can be defined as follows.

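In the x direction,

\[
\mathbf{i} = \frac{\mathbf{NC^{\alpha}}}{|\mathbf{NC^{\alpha}}|} \tag{18}
\]

In the z direction,

\[
\mathbf{z} = (\mathbf{C^{\alpha}C'}) \times (\mathbf{NC^{\alpha}}) \tag{19}
\]

\[
\mathbf{k} = \frac{\mathbf{z}}{|\mathbf{z}|} \tag{20}
\]

In the y direction,

\[
\mathbf{j} = \mathbf{k} \times \mathbf{i} \tag{21}
\]

Then for the particular orientation:

\[
\mathbf{NH} = -0.5150\,\mathbf{i} + 0.8570\,\mathbf{j} + 0.00\,\mathbf{k} \tag{22}
\]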

Finally, the exact coordinates of the hydrogen can be determined by adding NH to the given coordinates of nitrogen for the system.
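The construction in Eqs. 18–22 can be sketched in a few lines of plain Python. This is an illustrative sketch only: the helper names are hypothetical, and the C′ coordinates used in any call are whatever the caller supplies (the procedure needs C′ only to fix the plane that contains the backbone atoms).

```python
import math

def sub(a, b):
    return tuple(x - y for x, y in zip(a, b))

def add(a, b):
    return tuple(x + y for x, y in zip(a, b))

def scale(a, s):
    return tuple(x * s for x in a)

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def unit(a):
    norm = math.sqrt(sum(x * x for x in a))
    return tuple(x / norm for x in a)

# Components of NH along (i, j, k), as given in Eq. 22.
NH_LOCAL = (-0.5150, 0.8570, 0.0)

def hydrogen_position(n, ca, c_prime):
    """Place the amide hydrogen from the coordinates of N, C-alpha, and C'."""
    n_ca = sub(ca, n)
    ca_c = sub(c_prime, ca)
    i = unit(n_ca)                   # Eq. 18: x direction along N-C(alpha)
    k = unit(cross(ca_c, n_ca))      # Eqs. 19-20: normal to the backbone plane
    j = cross(k, i)                  # Eq. 21: completes the right-handed basis
    nh = add(add(scale(i, NH_LOCAL[0]), scale(j, NH_LOCAL[1])),
             scale(k, NH_LOCAL[2]))
    return add(n, nh)                # add NH to the nitrogen coordinates
```

In the reference orientation (N at the origin, Cα on the positive x-axis, and C′ at negative y in the xy plane) the function returns H(−0.5150, 0.8570, 0); for any rigidly rotated and translated copy of the three atoms, the hydrogen moves with them.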

The Euler Angle Theory

The Euler angles are found by comparing the angles between the unit normals defining the coordinate axes. The basis coordinate system is defined as follows and is subscripted with a 1 when mentioned later: N(0, 0, 0), Cα(1.435, 0, 0), H(−0.515, 0.857, 0).

So initially, for given coordinates of the H, N, and Cα atoms, the unit vectors defining the coordinate axes must be found. First, the vectors NCα and NH are found by subtracting the respective coordinates. These vectors define the axes as shown in Figure 18. Once again the N–Cα bond lies on the x-axis, and the N–H bond is defined as lying in the xy plane. This orientation is shown on the coordinate axes in Figure 18. The unit vectors on the axes are easily described by the following equations.

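In the x direction,

\[
\mathbf{i} = \frac{\mathbf{NC^{\alpha}}}{|\mathbf{NC^{\alpha}}|} \tag{23}
\]

In the z direction,

\[
\mathbf{k} = \frac{\mathbf{n}}{|\mathbf{n}|} \tag{24}
\]

where

\[
\mathbf{n} = \mathbf{i} \times (\mathbf{NH}) \tag{25}
\]

In the y direction,

\[
\mathbf{j} = \mathbf{k} \times \mathbf{i} \tag{26}
\]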

After performing the above operations on the initial basis and a residue, there are two coordinate systems to compare. By definition, the following relations hold for the Euler angles in the basis system defined above:

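\[
\sin\theta_{1} = -\mathbf{i}_{1} \cdot \mathbf{k}_{2} \tag{27}
\]

\[
\cos\theta_{1} = -\mathbf{j}_{1} \cdot \mathbf{k}_{2} \tag{28}
\]

\[
\cos\theta_{2} = \mathbf{k}_{1} \cdot \mathbf{k}_{2} \tag{29}
\]

\[
\sin\theta_{3} = -\mathbf{k}_{1} \cdot \mathbf{i}_{2} \tag{30}
\]

\[
\cos\theta_{3} = \mathbf{k}_{1} \cdot \mathbf{j}_{2} \tag{31}
\]

\[
\sin^{2}\theta_{2} = 1 - \cos^{2}\theta_{2} \tag{32}
\]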

The appropriate Euler angles (θ1, θ2, θ3) can easily be found by taking the arctangent of the ratio of sine to cosine. The signs of the Euler angles are determined by using the signs of the sine and cosine of the angle to determine the quadrant where the angle is defined.
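The quadrant rule described here is what a two-argument arctangent implements. The short sketch below, with a hypothetical helper name, shows the idea; it assumes only that the two inputs are proportional to the sine and cosine of the angle with a common positive factor, so the dot products on the right-hand sides of Eqs. 27–31 could be passed in directly under that assumption.

```python
import math

def angle_from_sin_cos(sin_term, cos_term):
    """Recover an angle in (-pi, pi] from terms proportional to its sine and cosine.

    The signs of both arguments select the quadrant, so a common positive
    scale factor on the two terms does not change the result.
    """
    return math.atan2(sin_term, cos_term)

def naive_angle(sin_term, cos_term):
    """Single-argument arctangent of the ratio; loses the quadrant information."""
    return math.atan(sin_term / cos_term)
```

For an angle of 2.5 rad, `angle_from_sin_cos` returns 2.5, whereas `naive_angle` returns 2.5 − π because the ratio alone cannot distinguish the second quadrant from the fourth.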

Fig. 19. Axes defined by the relative positions of H, N, and Cα.


APPENDIX B

Determination of C′ Location

As mentioned in the subsection Problem Formulation, the coordinates of the backbone carboxyl carbon have to be expressed as a function of other variables since they do not correspond to explicit optimization variables. This can be done using analytical geometry and linear algebra. In particular, for each amino acid the coordinates of the C′ atom can be expressed as a function of the coordinates of the N atom (i.e., the translation vector), denoted as n_x, n_y, n_z, the Euler angles θ1, θ2, θ3, and the φ dihedral angle. This requires knowledge of the following parameters: the bond lengths NCα and CαC′ as well as the value of the angle NCαC′, taken from the literature (Ref. 27).

Given the above information and based on the graphic representation of the protein shown in Figure 7, the following expressions are derived for the coordinates of the C′ atom:

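\[
\begin{aligned}
x = {} & n_{x} + sc \times temp \times \sin(\theta_{1}) \times \sin(\theta_{2}) \\
& + sa \times sc \times \cos(\phi) \times temp \times \frac{-\cos(\theta_{2}) \cos(\theta_{3}) \sin(\theta_{1}) - \cos(\theta_{1}) \sin(\theta_{3})}{\sqrt{1 - \cos^{2}(\phi)}} \\
& + \left[ (NC^{\alpha}) + (C^{\alpha}C') \cos(\theta) \right] \times \left[ \cos(\theta_{1}) \cos(\theta_{3}) - \cos(\theta_{2}) \sin(\theta_{1}) \sin(\theta_{3}) \right]
\end{aligned}
\]

\[
\begin{aligned}
y = {} & n_{y} - sc \times temp \times \cos(\theta_{1}) \times \sin(\theta_{2}) \\
& + sa \times sc \times \cos(\phi) \times temp \times \frac{\cos(\theta_{1}) \cos(\theta_{2}) \cos(\theta_{3}) - \sin(\theta_{1}) \sin(\theta_{3})}{\sqrt{1 - \cos^{2}(\phi)}} \\
& + \left[ (NC^{\alpha}) + (C^{\alpha}C') \cos(\theta) \right] \times \left[ \sin(\theta_{1}) \cos(\theta_{3}) + \cos(\theta_{1}) \cos(\theta_{2}) \sin(\theta_{3}) \right]
\end{aligned}
\]

\[
\begin{aligned}
z = {} & n_{z} + sc \times temp \times \cos(\theta_{2}) \\
& + sa \times sc \times \cos(\phi) \times temp \times \frac{\cos(\theta_{3}) \sin(\theta_{2})}{\sqrt{1 - \cos^{2}(\phi)}} \\
& + \left[ (NC^{\alpha}) + (C^{\alpha}C') \cos(\theta) \right] \times \left[ \sin(\theta_{2}) \sin(\theta_{3}) \right]
\end{aligned}
\]

where

\[
temp = \sqrt{ \frac{ (C^{\alpha}C')^{2} - (C^{\alpha}C')^{2} \cos^{2}(\theta) }{ 1 + \dfrac{\cos^{2}(\phi)}{1 - \cos^{2}(\phi)} } }
\]

\[
sa = 1.0, \quad sc = -1.0 \quad \text{if } \phi > 0
\]

\[
sa = -1.0, \quad sc = 1.0 \quad \text{if } \phi \le 0
\]

and θ is the angle NCαC′.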
