Site-Specific Thermodynamics:  Understanding Cooperativity in Molecular Recognition

Enrico Di Cera

Department of Biochemistry and Molecular Biophysics, Washington University School of Medicine, Box 8231, St. Louis, Missouri 63110

Received December 16, 1997 (Revised Manuscript Received April 8, 1998)

Contents

I. Introduction
II. Site-Specific Thermodynamics
    A. Difference between Global and Site-Specific Cooperativity
    B. The Wegscheider Principle and the Analysis of Ionization Reactions
    C. Using Mutants To Resolve Site-Specific Parameters: Ca²⁺ Binding to Calbindin
    D. Practical Limitations
III. Structural Mapping of Energetics
    A. A Basic Analogy
    B. Ala Scans
    C. Double-Mutant Cycles
IV. Site-Specific Dissection of Thrombin Specificity
    A. Substrate Recognition by Serine Proteases
    B. Thrombin Structure and Function
    C. Library of Site-Specific Probes
    D. Cooperativity in Substrate Recognition
    E. Origin of the Higher Specificity of the Fast Form
    F. Molecular Origin of the Cooperativity among the P1-P3 Sites
    G. How Thrombomodulin Really Works
V. New Formalism for the Analysis of Mutational Effects
VI. Conclusions
VII. Acknowledgments
VIII. References

Chem. Rev. 1998, 98, 1563-1591. S0009-2665(96)00135-5 CCC: $30.00. © 1998 American Chemical Society. Published on Web 05/09/1998.

Enrico Di Cera was born in Palermo, Italy, in 1960. He received his M.D. degree from the Catholic University School of Medicine in Rome, Italy, in 1985. He was introduced to biological thermodynamics by Stan Gill and Jeffries Wyman during his postdoctoral experience in Boulder, CO. He joined Washington University School of Medicine in 1990, where he is currently Associate Professor. His research interests involve the thermodynamics of molecular recognition, as well as enzyme structure, function, and evolution.

I. Introduction

A prominent feature of biological macromolecules is the ability to accomplish diverse functions using cooperative interactions among structural domains. The best known example of this behavior is offered by hemoglobin, in which binding of oxygen to one heme affects the binding properties of other hemes in the molecule and oxygen release to the tissues is allosterically controlled by the uptake of protons and organic phosphates at other sites.¹ Cooperativity is not limited to ligand binding processes and allosteric proteins. It is an inherent component of protein stability, providing the necessary communication among residues of the protein to maintain the folded structure.² It is also a key player in determining secondary structure, as illustrated by the helix-coil transitions of biopolymers.³ More recently, the developments of recombinant DNA technology have enabled a dissection of ligand recognition at the level of individual residues.⁴ Systematic mutagenesis studies of binding epitopes have fostered the notion that cooperativity may be a fundamental ingredient of any recognition event.⁵

The existence of cooperative interactions in macromolecular systems raises the question of how to decipher the mechanism underlying the communication among structural domains. One can imagine that a cooperative property, $F$, subject to experimental investigation is the result of the contribution of a number of individual structural domains of the macromolecule, so that

$$F = f_1 + f_2 + \cdots + f_N \tag{1}$$

where the $f$'s encapsulate the individual contributions. Relevant examples of such properties are the following: protein stability, where the $f$'s represent the contributions of particular folding units to the macroscopic free energy of folding; helix-coil transitions, where the $f$'s represent the helix propensities of individual residues and their contribution to the helix state of the peptide as a whole; binding and linkage phenomena, where the $f$'s denote the probabilities of binding to individual sites and $F$ is the average number of ligated sites; molecular recognition, where the $f$'s denote contributions arising at each residue of the binding epitope to the binding free energy $F$.

An important consequence of cooperativity is that the properties of individual components embodied by the functions $f$ cannot be inferred from the behavior of each structural component separate from the system. Cooperativity changes the behavior of each domain when assembled into the whole macromolecule. Most often, the only quantity amenable to experimental investigation is the global quantity $F$ in eq 1. This quantity has obvious limitations because it cannot define uniquely the individual components $f$. The close connection between structure and function is embodied by the site-specific properties $f$ that, once specified, uniquely define $F$. The global quantity $F$ may not reflect the true cooperativity pattern operating at the site-specific level. When individual components are summed, the exact nature of each particular contribution may be obscured by other terms that define the quantity $F$. Cooperativity can only be understood fully when the contribution of the site-specific components is sorted out.

The need for a description of cooperativity in terms of site-specific properties has been recognized for a long time and first emerged in the pioneering studies of Wegscheider⁶ on the ionization reactions of polybasic acids. For many years, however, our understanding of cooperativity has been confined to the global description due to the limitations imposed by experimental techniques. Likewise, previous theoretical treatments of cooperativity have focused on the analysis of global effects.⁷,⁸ Recent advances in various areas, and especially in structural biology and recombinant DNA technology, have made it possible to access information at the site-specific level. Global phenomena can now be dissected in terms of the contribution of individual binding sites, folding units, amino acid residues, or even atoms, thereby revealing the true and extraordinary complexity of cooperative effects in biology. These new advances have fostered the development of a thermodynamic description of site-specific effects.⁹ Site-specific thermodynamics expands previous analyses of global effects and provides the conceptual and methodological tools to study cooperativity in a variety of systems. Much of the theory was developed to dissect ligand binding cooperativity.⁹ Subsequent developments have encompassed the analysis of mutational effects in proteins⁵ and have proved the general applicability of concepts and analytical methods originally introduced for the study of ligand binding processes. Relevant applications of the theory are summarized in this review.

II. Site-Specific Thermodynamics

A. Difference between Global and Site-Specific Cooperativity

The simplest and most convincing argument to demonstrate the need for a site-specific description of cooperativity comes from consideration of a macromolecule M composed of two binding sites for a ligand L. We shall assume that temperature and pressure are constant and that the macromolecule and the ligand do not change their aggregation state when free or bound. There are two ways to describe the binding equilibria in the system. The global description focuses on the overall behavior of the two sites,⁷,⁸ whereas the site-specific description takes into account how binding occurs at each site.⁹ In the global description there are two reactions to be considered. These reactions define uniquely the binding isotherm accessible to experimental measurements, from which the average number of ligated sites is obtained as a function of ligand concentration. In the first reaction, $M + L \rightleftharpoons ML$, the free macromolecule interacts with one molecule of ligand to form the singly ligated intermediate ML. The equilibrium binding constant for this reaction is $2k_1$, where $k_1$ is the binding affinity for the ligation of a site and the factor 2 accounts for the possible ways of generating the singly ligated intermediate from the unligated form of the macromolecule. In the second reaction, $ML + L \rightleftharpoons ML_2$, the singly ligated intermediate binds a second molecule of ligand to form the doubly ligated species $ML_2$. The equilibrium constant for this reaction is $k_2/2$, where $k_2$ is the binding affinity for the ligation of the second site and the factor at the denominator accounts for the number of ways one can generate the singly ligated species from the doubly ligated species. In general, for the reaction $ML_{j-1} + L \rightleftharpoons ML_j$, the binding constant is $[(N - j + 1)/j]\,k_j$, where $k_j$ is the binding affinity for ligation of the $j$th site, the factor at the numerator accounts for the number of ways of generating $ML_j$ from $ML_{j-1}$, and that at the denominator accounts for the number of ways of generating $ML_{j-1}$ from $ML_j$. The equilibrium constants measuring the affinity of each ligation step are called stepwise binding constants. They do not distinguish between the sites, 1 and 2, although they depend on the properties of both sites. Cooperativity is observed when $k_1 \neq k_2$, in which case binding of the second ligand molecule takes place with an affinity different from that of the first ligand molecule. Positive cooperativity implies $k_1 < k_2$, whereas negative cooperativity demands $k_1 > k_2$. The case $k_1 = k_2$ reflects the absence of cooperativity.

We shall not discuss the graphical manifestations of cooperativity that are dealt with in detail elsewhere.⁷⁻⁹ These signatures are useful in the analysis of experimental data but bear little on the conceptual framework that we are interested in discussing here. If the system shows cooperative binding of ligand L ($k_1 \neq k_2$), what is the underlying mechanism that produces this effect? On the other hand, if the system shows no cooperativity ($k_1 = k_2$), does it imply that the two sites are independent? These simple and important questions may be difficult to answer in the global description due to the lack of information on the behavior of sites 1 and 2. The stepwise binding constants reflect the average properties of sites 1 and 2, whereas one needs to know the detailed behavior of each site as a function of the ligand concentration. Unraveling this information pertains to the site-specific description. The
binding reactions of ligand L with the macromolecule M in the site-specific description directly identify the two sites in their ligation state. This makes it necessary to introduce two site-specific binding constants, $K_1$ and $K_2$, referring respectively to the reaction of ligand L with site 1 when site 2 is unligated and binding to site 2 when site 1 is unligated. The interaction between the sites is expressed by a third independent parameter, $c_{12}$, that reflects the presence of positive ($c_{12} > 1$), negative ($c_{12} < 1$), or no ($c_{12} = 1$) cooperativity. By virtue of this interaction, binding to site 1 when site 2 is ligated occurs with a binding constant $c_{12}K_1$, and for site 2, when site 1 is ligated, the binding constant is $c_{12}K_2$.

The nature of these parameters and the reactions in the site-specific description are best understood from the thermodynamic cycle

$$
\begin{array}{ccc}
M_{00} & \overset{K_1}{\rightleftharpoons} & M_{10} \\
{\scriptstyle K_2}\,\updownarrow & & \updownarrow\,{\scriptstyle c_{12}K_2} \\
M_{01} & \underset{c_{12}K_1}{\rightleftharpoons} & M_{11}
\end{array}
\tag{2}
$$

where $M_{00}$ is the free macromolecule, $M_{10}$ the singly ligated form with site 1 bound and site 2 free, $M_{01}$ the analogous intermediate with site 2 bound and site 1 free, and $M_{11}$ the doubly ligated form. Energy conservation in the cycle gives rise to only three independent parameters to describe the four possible reactions. The reciprocity of site-site interactions is a consequence of energy conservation in the cycle. So, if site 1 affects site 2, site 2 must affect site 1 and to the same extent.

Analysis of binding in terms of the site-specific description raises a seemingly paradoxical issue. In the global description, only two independent parameters ($k_1$ and $k_2$) are sufficient to define the properties of the system. In the site-specific description, on the other hand, there are three independent parameters ($K_1$, $K_2$, and $c_{12}$) to be taken into account. What is the origin of this apparent discrepancy? Simple considerations on the equilibria involving the two sites lead to the following relationships between global and site-specific parameters:

$$2k_1 = \frac{[ML]}{[M]\,x} = \frac{[M_{10}] + [M_{01}]}{[M_{00}]\,x} = K_1 + K_2 \tag{3}$$

$$\frac{k_2}{2} = \frac{[ML_2]}{[ML]\,x} = \frac{[M_{11}]}{([M_{10}] + [M_{01}])\,x} = \frac{c_{12}K_1K_2}{K_1 + K_2} \tag{4}$$

where $x$ is the ligand concentration. The apparent discrepancy arises because it is not possible to uniquely derive the site-specific parameters $K_1$, $K_2$, and $c_{12}$ from knowledge of the global parameters $k_1$ and $k_2$. On the other hand, if the site-specific parameters are known, then the global parameters can be determined uniquely. Hence, the global description is incapable of deciphering what goes on at the level of individual sites. This fact has been recognized for a long time, and the need for a local description of binding processes finds its origin in the early work on the dissociation of polyvalent substances.⁶,¹⁰⁻¹³

The limitations of the global description become even more apparent when we examine the nature of cooperativity. The condition for cooperativity in the global description is $k_1 \neq k_2$ and can be formally represented using eqs 3 and 4 as the difference:

$$k_2 - k_1 = \frac{4c_{12}K_1K_2 - (K_1 + K_2)^2}{2(K_1 + K_2)} \tag{5}$$

The sign of the expression

$$\Delta = 4c_{12}K_1K_2 - (K_1 + K_2)^2 = K_2^2\left[4c_{12}\frac{K_1}{K_2} - \left(1 + \frac{K_1}{K_2}\right)^2\right] = K_1^2\left[4c_{12}\frac{K_2}{K_1} - \left(1 + \frac{K_2}{K_1}\right)^2\right] \tag{6}$$

defines the nature of cooperativity. $\Delta = 0$ denotes absence of cooperativity and provides the cutoff between positive ($\Delta > 0$) and negative ($\Delta < 0$) cooperativity. The condition for the absence of cooperativity in the global description does not necessarily coincide with the condition $c_{12} = 1$ that reflects the true absence of interactions between sites 1 and 2. Only when $K_1 = K_2$ are the two conditions identical. If the binding sites have different affinities, there is always a value of $c_{12} > 1$ such that $\Delta = 0$. This means that positive interactions between two sites that bind with different affinities may not manifest themselves in the global description as positive cooperativity.

A direct illustration of this fact is given in Figure 1, where the logarithm of $c_{12}$ is plotted versus the logarithm of the ratio $K_1/K_2$. Plotting versus the logarithm of the ratio $K_2/K_1$ is completely equivalent because of the symmetry of eq 6. The continuous line represents the relation between $c_{12}$ and the ratio $K_1/K_2$ such that $\Delta = 0$ in eq 6. On this line, any combination of site-specific binding constants $K_1$ and $K_2$ and interaction constant $c_{12}$ yields $k_1 = k_2$ in the global description. The region above this line is characterized by positive cooperativity in the global description ($k_1 < k_2$), whereas the region below the line characterizes negative cooperativity ($k_1 > k_2$). The discontinuous line gives the condition for the absence of true interactions between the sites ($c_{12} = 1$). Above this line the sites are positively linked and below it they are negatively linked. The two lines in the plot define three regions. In region I, defined by $c_{12} < 1$, there is no ambiguity between global and site-specific cooperativity. When binding to one site opposes binding to the other site at the site-specific level, the result is negative cooperativity in the global description. At the boundary $c_{12} = 1$, the system is always negatively cooperative in the global description, unless $K_1 = K_2$ and the two sites bind with the same affinity. Independent sites binding with different affinities therefore mimic negative cooperativity in the global description. In region II negative cooperativity in the global description is observed even though the two sites interact in a positive manner. This is a consequence of the heterogeneity
of the binding affinities that opposes the favorable coupling between the sites. Region III again leaves no ambiguity between global and site-specific cooperativity, in this case both positive.

The conclusion to be drawn from Figure 1 is that cooperativity in the global description is at most a crude approximation of the true pattern of interaction between the sites. In the case of positive cooperativity, the true extent of interactions is always underestimated if $K_1 \neq K_2$. In fact, for any given value of $c_{12} > 1$, there is always a value of $K_1/K_2$ such that $\Delta = 0$, or even $\Delta < 0$. No matter how strongly two sites are positively coupled at the site-specific level, positive cooperativity in the global description is only seen in region III. Positive cooperativity can be made arbitrarily small and even turned into negative cooperativity in the global description if the binding affinities of the two sites differ significantly. For example, in a system where the affinities of the two sites differ by a factor of 100 ($K_1/K_2 = 0.01$), but binding to one site increases the affinity of the other site by a factor of 10 ($c_{12} = 10$), one has $\Delta < 0$ and negative cooperativity appears in the global description even though the sites are strongly coupled in a positive manner. For this system to show positive cooperativity in the global description, the value of $c_{12}$ must exceed 25. In a system where $K_1/K_2 = 0.001$ the value of $c_{12}$ must exceed 250, and so on. The limitations of the global description are not confined to the case of positive cooperativity. When negative cooperativity is observed in the global description, the system can actually be positively or negatively cooperative at the site-specific level. Particularly interesting is the case $c_{12} = 1$, which always leads to negative cooperativity in the global description if $K_1 \neq K_2$. The heterogeneity of the binding affinities of the sites generates per se a misleading pattern of negative interactions.

In summary, cooperativity as assessed by the global parameters $k_1$ and $k_2$ does not reflect the true pattern of interaction between the sites, unless the sites have the same affinity. If the sites bind with different affinities, then positive cooperativity in the global description always underestimates the coupling between the sites. Negative cooperativity can be totally misleading, since it may be associated at the site-specific level with positive coupling between the sites or absence of interactions. In the case of negative coupling between the sites, on the other hand, the interaction may be overestimated in the global description. Absence of cooperativity in the global description can be misleading as well. Positive coupling between the sites can be exactly countered by heterogeneity of their binding affinity. The true cooperative nature of the interactions between the sites can only be resolved from knowledge of the value of $c_{12}$, but this requires information on how binding occurs at each site of the system.
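
The relations between global and site-specific parameters in eqs 3-6 lend themselves to a quick numerical check. The following is a minimal sketch in Python, not part of the original treatment: the function names `global_constants`, `c12_threshold`, and `region` are hypothetical, and the binding constants are in arbitrary units because only the ratio $K_1/K_2$ matters for the sign of $\Delta$.

```python
def global_constants(K1, K2, c12):
    """Return (k1, k2, delta) for a two-site system.

    k1 and k2 follow from eqs 3 and 4; delta is the quantity of eq 6,
    whose sign gives the cooperativity seen in the global description.
    """
    k1 = (K1 + K2) / 2.0
    k2 = 2.0 * c12 * K1 * K2 / (K1 + K2)
    delta = 4.0 * c12 * K1 * K2 - (K1 + K2) ** 2
    return k1, k2, delta


def c12_threshold(ratio):
    """Value of c12 at which delta = 0 for a given ratio K1/K2 (from eq 6)."""
    return (1.0 + ratio) ** 2 / (4.0 * ratio)


def region(K1, K2, c12):
    """Region of the c12 versus K1/K2 plane: 'I', 'II', 'III', or 'boundary'."""
    _, _, delta = global_constants(K1, K2, c12)
    if c12 == 1.0 or delta == 0.0:
        return "boundary"
    if c12 < 1.0:
        return "I"            # negative coupling, negative global cooperativity
    return "III" if delta > 0.0 else "II"


# Example from the text: affinities differing 100-fold, 10-fold positive coupling.
K1, K2 = 0.01, 1.0            # arbitrary units (assumption for illustration)
k1, k2, delta = global_constants(K1, K2, 10.0)
print("k1 =", k1, "k2 =", k2, "delta =", delta)
print("region:", region(K1, K2, 10.0))
print("threshold c12 for K1/K2 = 0.01: ", c12_threshold(0.01))
print("threshold c12 for K1/K2 = 0.001:", c12_threshold(0.001))
```

With these inputs the sketch gives $k_2 < k_1$ and places the system in region II, and the thresholds come out near 25.5 and 250.5, in line with the values of 25 and 250 quoted above.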

Figure 1. Difference between global and site-specific cooperativity for a two-site system. The logarithm of the interaction constant $c_{12}$ is plotted versus the logarithm of the ratio between the site-specific binding parameters $K_1$ and $K_2$. The continuous line depicts the relation between $c_{12}$ and the ratio $K_1/K_2$ such that $\Delta = 0$ in eq 6. This line defines the boundary between negative (below the line) and positive (above the line) cooperativity in the global description. The discontinuous line, on the other hand, defines the boundary between negative (below the line) and positive (above the line) cooperativity in the site-specific description. The two lines define three regions. In region I, there is no ambiguity between global and site-specific cooperativity. A value of $c_{12} < 1$ always results in negative cooperativity. Likewise, in region III positive interactions between the sites always result in positive global cooperativity. Region II, defined symmetrically between the lines, is ambiguous because positive interactions between the sites ($c_{12} > 1$) result in negative global cooperativity because of the heterogeneity of the sites ($K_1 \neq K_2$). The filled circle depicts the values of site-specific parameters for Ca²⁺ binding to calbindin (see section II.C).

B. The Wegscheider Principle and the Analysis of Ionization Reactions

The importance of dissecting cooperativity at the site-specific level was obvious even to early investigators of ionization equilibria of polybasic acids. In 1895, the Austrian chemist Wegscheider introduced an ingenious strategy to structurally perturb a system to mimic the properties of reduced systems containing fewer sites.⁶ The comparison of the original system and its reduced versions would then be used to extract information on the behavior of individual ionizable groups. It is quite instructive to comment on Wegscheider's strategy because it helps understand more elaborate, but conceptually similar, strategies currently employed in the study of proteins and nucleic acids.

The question that Wegscheider posed was as follows. If only global properties of a system can be accessed experimentally, is it possible to derive relevant information on site-specific parameters? Wegscheider thought that replacement of the ionizable carboxylate in a polybasic acid with methyl or ethyl esters could mimic the protonated state of the group and reduce the number of sites to be studied by direct titration. By selectively replacing groups at each of the carboxylates of interest he could in turn study how protonation of these groups in the original
molecule would affect protonation of other ionizable groups and therefore extract important site-specific information from the system.

Neuberger¹⁴ exploited Wegscheider's idea to completely resolve the three ionization reactions of glutamic acid into their site-specific components.¹⁵ For such a system there are three binding sites for the proton: the α-carboxyl (site 1), amino (site 2), and γ-carboxyl (site 3) groups. Titration of glutamic acid yields three pKa's (Table 1) that reflect the composite behavior of the three sites. The stepwise binding constants and the site-specific parameters are directly defined by these pKa's as follows:⁹,¹⁵

$$10^{9.960} = \frac{[M_{100}] + [M_{010}] + [M_{001}]}{[M_{000}]}\,\frac{1}{h} = 3k_1 = K_1 + K_2 + K_3 \tag{7}$$

$$10^{4.324} = \frac{[M_{110}] + [M_{101}] + [M_{011}]}{[M_{100}] + [M_{010}] + [M_{001}]}\,\frac{1}{h} = k_2 = \frac{c_{12}K_1K_2 + c_{13}K_1K_3 + c_{23}K_2K_3}{K_1 + K_2 + K_3} \tag{8}$$

$$10^{2.155} = \frac{[M_{111}]}{[M_{110}] + [M_{101}] + [M_{011}]}\,\frac{1}{h} = \frac{k_3}{3} = \frac{c_{123}K_1K_2K_3}{c_{12}K_1K_2 + c_{13}K_1K_3 + c_{23}K_2K_3} \tag{9}$$

The $M$'s refer to the various protonated intermediates of glutamic acid, and $h$ is the proton concentration. Knowledge of the three stepwise binding constants derived from direct titration of glutamic acid does not suffice to define uniquely the set of seven independent site-specific parameters necessary to completely solve the ionization reactions at the level of individual sites. In the site-specific description there are three site-specific binding constants, $K_1$, $K_2$, and $K_3$, three second-order coupling constants, $c_{12}$, $c_{13}$, and $c_{23}$, and one third-order coupling constant, $c_{123}$. Resolution of seven parameters demands at least seven independent constraints from experimental data. To this end, Neuberger synthesized esters of either the α- or γ-carboxyl groups and titrated the other groups (Table 1). The assumption embodied by the Wegscheider principle is that the ethyl ester mimics the protonated state of the carboxyl group. Under this assumption, the following two relations for α-ethyl glutamate apply:

$$10^{7.838} = \frac{[M_{110}] + [M_{101}]}{[M_{100}]}\,\frac{1}{h} = 2k_1' = c_{12}K_2 + c_{13}K_3 \tag{10}$$

$$10^{3.846} = \frac{[M_{111}]}{[M_{110}] + [M_{101}]}\,\frac{1}{h} = \frac{k_2'}{2} = \frac{c_{123}K_2K_3}{c_{12}K_2 + c_{13}K_3} \tag{11}$$

They provide two additional constraints for the solution of the problem. Analogous expressions for γ-ethyl glutamate

$$10^{9.190} = \frac{[M_{101}] + [M_{011}]}{[M_{001}]}\,\frac{1}{h} = 2k_1'' = c_{13}K_1 + c_{23}K_2 \tag{12}$$

$$10^{2.148} = \frac{[M_{111}]}{[M_{101}] + [M_{011}]}\,\frac{1}{h} = \frac{k_2''}{2} = \frac{c_{123}K_1K_2}{c_{13}K_1 + c_{23}K_2} \tag{13}$$

complete the set of seven constraints needed to resolve the seven independent site-specific parameters. The results are shown in Table 1. The uniqueness of the results can be tested by calculating the pKa of the amino group in ethyl glutamate as

$$\mathrm{p}K_a = \log_{10}\frac{[M_{111}]}{[M_{101}]\,h} = \log_{10}\left(\frac{c_{123}}{c_{13}}K_2\right) \tag{14}$$

The predicted value of 7.035 is identical to that found experimentally.

Table 1. pKa's of Glutamic Acid and Its Ethyl Esters(a)

    compound             ¹pKa     ²pKa     ³pKa
    glutamic acid        2.155    4.324    9.960
    α-ethyl glutamate    3.846    7.838
    γ-ethyl glutamate    2.148    9.19
    ethyl glutamate      7.035

(a) The site-specific parameters for proton binding to glutamic acid, derived from analysis of these pKa's, are (log10 values are given in parentheses) $K_1 = 5.69 \times 10^4\ \mathrm{M^{-1}}$ (4.755), $K_2 = 9.19 \times 10^9\ \mathrm{M^{-1}}$ (9.960), $K_3 = 1.26 \times 10^5\ \mathrm{M^{-1}}$ (5.101), $c_{12} = 0.00755$ (-2.122), $c_{13} = 0.353$ (-0.452), $c_{23} = 0.17$ (-0.770), and $c_{123} = 0.00042$ (-3.377).

It is from knowledge of the site-specific parameters for the ionization reactions of glutamic acid that a closer connection with the structure of the amino acid can be drawn. When all groups are deprotonated, binding of the proton occurs with high affinity to the amino group and with low affinity to the carboxyl groups. The α-carboxyl group binds the proton with slightly lower affinity compared to the γ-carboxyl group, due to the proximity of the amino group and the unfavorable electrostatic coupling experienced when this group is protonated. The interaction constants are all less than 1, indicating the presence of site-specific negative cooperativity in the protonation reactions. This effect acts in concert with the extreme heterogeneity of the sites to produce a very pronounced negatively cooperative proton binding curve for glutamic acid, as shown in Figure 2. Negative coupling among the sites is expected from electrostatic considerations and decreases with the distance between neighbor groups. The two carboxyl groups are located far enough away that $c_{13} \approx 1$. On the other hand, protonation of the amino group has an effect almost 3 orders of magnitude larger on the α- than the γ-carboxyl group, due to the proximity of the former group.

An alternative solution to eqs 7-9 can be found by assuming $c_{12} = c_{13} = c_{23} = c_{123} = 1$ and solving for the three independent site-specific binding constants.
The result is $K_1 = 10^{2.158}$, $K_2 = 10^{9.960}$, and $K_3 = 10^{4.321}$, which suggests that the pKa’s measured from titration of glutamic acid correspond quite closely to the three site-specific pKa’s. This erroneous conclusion would make the affinity of the α-carboxyl group 2 orders of magnitude lower than that of the γ-carboxyl group and nearly 3 orders of magnitude lower than the value correctly derived from the analysis of Neuberger’s results. The large discrepancy found between the simplifying assumption and the correct result in a simple molecule like glutamic acid represents a serious warning for more complex cases involving ionization reactions in proteins. It has become common practice to assume that ionization of protein residues occurs almost independently of other proton binding events in the protein in an attempt to simplify the calculations. This assumption is likely to be wrong, and the pKa’s estimated from independent ionization reactions may have little bearing on the actual binding affinities of the ionizable groups.

Can the strategy embodied by the Wegscheider principle be extended to macromolecular systems? Consider the problem of calculating the pKa of ionizable groups in a protein.¹⁶,¹⁷ In this case, experimental measurements of the site-specific proton binding curve of each ionizable group may be unfeasible. For a protein containing 20 such groups, there are a total of $2^{20} \approx 10^6$ total configurations and as many site-specific parameters to be resolved from experimental data. No currently available technique can provide such information. However, many residues such as Asp, Glu, Arg, Lys, and Tyr do not ionize in the pH range of physiological interest and can be treated as fully protonated or deprotonated. The remaining groups of the protein form a reduced system in the Wegscheider sense whose description requires a significantly reduced number of parameters. The idea of fixing the ionization state of selected residues in a protein to calculate more efficiently the ionization properties of other groups has been implemented by Bashford and Karplus and is known as the “reduced-site” model.¹⁸ Reduced systems can also be generated empirically by replacing the residue of interest, say a His, with groups that mimic the protonated or unprotonated state of that residue. However, if these substitutions affect other properties of the macromolecule, such as the ionization of other groups, or the coupling of His with other residues, the assumption central to the entire approach is invalidated. In general, the assumption that a given substitution actually mimics a particular ligation state for the His may be questioned. More importantly, even though this strategy may be successful in the case of ionization reactions, it will certainly fail in the case of other ligands such as metal ions, peptides, or nucleic acids, whose recognition by proteins entails extended structural domains. For example, in the specific case of Ca²⁺ binding to an EF-hand,¹⁹ it is difficult to envision simple substitutions that can exactly mimic the Ca²⁺-bound or the Ca²⁺-free form of the site. When binding of a ligand involves several protein residues, any perturbation is likely to be extensive and the expectation that it mimics a specific ligation state becomes unrealistic. Nonetheless, a refined version of the Wegscheider principle is still applicable under certain circumstances and provides a powerful approach to the study of site-specific energetics.
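The counting argument behind a reduced system can be sketched in a few lines of Python. This is only an illustration of the bookkeeping, not the Bashford and Karplus implementation: the group names, the choice of which groups are held fixed, and the helper `enumerate_reduced_states` are all hypothetical.

```python
from itertools import product


def count_configurations(n_groups):
    """Number of protonation configurations for n two-state groups."""
    return 2 ** n_groups


def enumerate_reduced_states(groups):
    """Yield the configurations of a reduced system.

    `groups` maps a (hypothetical) group name to 0 or 1 when its state is
    held fixed, or to None when the group is left free to titrate.
    """
    free = [name for name, state in groups.items() if state is None]
    for combo in product((0, 1), repeat=len(free)):
        config = dict(groups)            # start from the fixed assignments
        config.update(zip(free, combo))  # fill in one state per free group
        yield config


# Hypothetical example: five groups, three held fixed, two His free to titrate.
example = {"Asp10": 0, "Lys25": 1, "Tyr40": 1, "His15": None, "His48": None}
reduced = list(enumerate_reduced_states(example))

print(count_configurations(20))  # full system with 20 ionizable groups
print(len(reduced))              # reduced system with two free groups
```

Only the free groups contribute to the combinatorial growth, which is why fixing the state of most residues shrinks the number of parameters so sharply.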

C. Using Mutants To Resolve Site-Specific Parameters: Ca²⁺ Binding to Calbindin

Under suitable conditions, site-specific parameters can be obtained from analysis of global properties. Although these conditions may be difficult to reproduce in general for any system of interest, it is instructive to consider the potential advantages of the approach when feasible. Consider a system containing $N$ sites, each existing in two possible states, free or bound. There are a total of $2^N$ possible configurations in this system, $2^N - 1$ of which are independent if one is chosen as reference. The sum of the concentrations of all possible configurations relative to the concentration of the reference species, the unligated state, defines the partition function of the system $\Psi$.⁹ The partition function is a polynomial expansion in the ligand concentration $x$ of degree $N$.⁷⁻⁹ From the partition function, all of the relevant global properties of the system can be derived. For example, the average number of ligated sites, $X$, accessible to experimental measurements through titration is $X = d\ln\Psi/d\ln x$. The configurations defining the partition function can be split in two sets: one containing all configurations with a given site, say site $j$, unligated and the other containing all configurations with site $j$ ligated. These sets define partition functions of reduced or contracted systems. Let ${}^0\Psi_j$ and ${}^1\Psi_j$ be the partition functions of the two sets constrained by the particular ligation, free (0) or bound (1), of site $j$. The partition function of the system can then be written as⁹

$$\Psi = {}^0\Psi_j + {}^1\Psi_j K_j x \tag{15}$$

The factor $K_j x$ arises because all intermediates with site $j$ bound contain this term in the partition function. The binding probability to site $j$ is evidently

$$X_j = \frac{K_j x\,{}^1\Psi_j}{\Psi} = 1 - \frac{{}^0\Psi_j}{\Psi} \tag{16}$$

with the conservation relation analogous to eq 1

$$X = X_1 + X_2 + \dots + X_N \tag{17}$$

The quantity $X_j$ cannot be accessed experimentally if only global properties such as $X$ are measured, as already pointed out in the Introduction. However, the function $X_j$ can be reconstructed indirectly using ad hoc substitutions at site $j$. This assists in the resolution of some of the site-specific parameters in the system.

Figure 2. Proton binding curves of the three ionizable groups of glutamic acid: α-carboxyl group (discontinuous curve at right), amino group (discontinuous curve at left), and γ-carboxyl group (discontinuous curve in the middle). The sum of these three curves, divided by the number of sites, gives the global proton binding curve measured experimentally (continuous line). Curves were drawn using the parameter values listed in Table 1.

Consider a perturbation of site $j$, say a chemical modification induced by a site-directed mutation. The perturbation either can affect the properties of the site where it applies or can carry over to other sites. Assume that the perturbation remains localized at site $j$, with a negligible secondary effect on other sites. We speak in this case of a first-order perturbation that changes the value of $K_j$ to $K_j'$, while it leaves the value of all other parameters unchanged. The smaller the perturbation, the more likely it will cause a first-order effect. Then, for the perturbed system, we have

$$\Psi' = {}^0\Psi_j + {}^1\Psi_j K_j' x \tag{18}$$

The contracted partition functions are the same as those for the wild-type, or unperturbed system, since they do not depend on $K_j$. Both $\Psi$ and $\Psi'$ are accessible experimentally from integration of measurements of the average number of ligated sites as a function of ligand concentration. Hence, the function

$$f_j = 1 - \frac{\Psi'}{\Psi} = \left(1 - \frac{K_j'}{K_j}\right)X_j = \eta_j X_j \tag{19}$$

can be constructed from measurements on the wild-type and mutant systems. Except for a constant factor $\eta_j$, which is easily obtained from $f_j$ in the limit $x \to \infty$, this function is the same as the quantity of interest $X_j$ in the unperturbed, wild-type system. In the limiting case where the mutation abolishes binding to site $j$, eq 19 yields eq 16. In general, eq 19 only requires $\eta_j$ to be finite and therefore generalizes the Wegscheider approach that strictly demands the perturbation to mimic specifically a ligated state of the site.

An approach analogous to that embodied by eqs 15-18 has been used by Qian²⁰ in the analysis of the effects of single-residue substitutions on the stability of α-helices in homopolypeptides²¹ and by Wrabl and Shortle²² in the analysis of the effects of site-directed mutations on the unfolded state of a protein. Particularly important for these approaches is to verify the uniqueness of the solution obtained. In the case of ligand binding, this can be done by examining a number of possible perturbations at site $j$ to guarantee a robust reconstruction of the site-specific binding probability $X_j$. Expressions equivalent to eq 19 can be derived when the perturbation at site $j$ carries over to other sites,⁹ but the parameters defining these expressions are difficult to resolve experimentally. Therefore, a successful use of this strategy should be expected only in the case of first-order perturbations.

An application of the approach embodied by eq 19 has been reported for the analysis of cooperative Ca²⁺ binding to calbindin,⁹ one of the smallest members of the calmodulin superfamily.²³ Calbindin is composed of two helix-loop-helix motifs responsible for Ca²⁺ binding (Figure 3). The C-terminal site (site 2) has the amino acid sequence and fold of an archetypal EF-hand, while the N-terminal site (site 1) differs from the usual EF-hand in that it contains two additional residues.²⁴ The protein binds Ca²⁺ with positive cooperativity and a difference between $k_1$ and $k_2$ of a factor of 8²⁵⁻²⁷ (Table 2). Due to their structural differences, the extent of coupling between the sites as revealed by $k_1$ and $k_2$ may be underestimated. Assessment of the exact extent of coupling between the sites is important for the mechanism of transduction of signals from one site to the other and for correctly reproducing the energetics of Ca²⁺ binding to calbindin using computational approaches. The arguments discussed in section II.A assume particular relevance in the case of this protein.

Figure 3. Ribbon representation of calbindin. The two bound Ca²⁺ are depicted by circles.

Table 2. Stepwise Binding Constants (M⁻¹) for Ca²⁺ Binding to Calbindin and Its Mutants

                      k1           k2           k2/k1
wild-type             1.0 × 10⁸    7.9 × 10⁸    7.9
P20G                  3.3 × 10⁷    1.6 × 10⁷    0.48
P20G, ΔN21            8.5 × 10⁷    1.0 × 10⁶    0.012
ΔP20                  4.0 × 10⁷    1.0 × 10⁶    0.25
Y13F                  1.9 × 10⁸    4.9 × 10⁸    2.6
E17Q                  1.3 × 10⁷    2.5 × 10⁸    19
D19N                  2.0 × 10⁷    2.0 × 10⁸    10
E26Q                  3.2 × 10⁷    5.0 × 10⁸    16
E60Q                  5.0 × 10⁷    6.4 × 10⁸    13
E17Q, D19N            1.3 × 10⁷    2.5 × 10⁷    1.9
E17Q, E26N            3.2 × 10⁶    1.3 × 10⁸    41
D19N, E26Q            4.0 × 10⁶    8.0 × 10⁷    20
E17Q, D19N, E26Q      3.2 × 10⁶    1.0 × 10⁷    3.1

Experimental evidence from NMR data suggests that Cd²⁺ binds with higher affinity to site 2 and that the binding pathways involving site 1 or site 2 as possible singly ligated intermediates elicit distinct structural transitions in the molecule.²⁸⁻³⁰ Spectroscopic and kinetic studies on the Ca²⁺ binding properties of calbindin suggest that site heterogeneity may be a factor of 4-6.²⁵⁻²⁷,³¹,³² No direct evidence has been provided for Ca²⁺ binding with higher affinity to site 1 or site 2 and direct determination of Ca²⁺ binding to either site in the wild-type protein has been lacking. Electrostatic calculations suggest that site 2 may have only a slightly higher affinity.³³ On the other hand, valence maps indicate that site 1, rather than site 2, may bind Ca²⁺ with higher affinity.³⁴ A number of mutants of residues in and around site 1, and partially site 2, have been made to assess the role of electrostatic contributions to the binding of Ca²⁺, and the global binding parameters have been resolved for all of them²⁵⁻²⁷,³² (Table 2). Some of the residues mutated in the wild-type are shown in Figure 4. Particularly interesting is the observation that mutations around site 1 remain localized at this site and do not propagate to site 2.²⁵,³² This set of mutants was used to resolve the site-specific parameters for calbindin according to eq 19.

Mutations around site 1 can be assumed to produce a first-order perturbation. The assumption is supported by NMR data showing that the mutants E17Q, E26Q, and E60Q have practically the same structure as wild-type protein.²⁸,²⁹ The mutant P20G and the deletion mutants P20G, ΔN21, and ΔP20 show drastic perturbation of the global cooperativity pattern (Table 2). Nonetheless, there is evidence that the structural perturbation is confined to site 1.²⁵ The perturbation of the global binding constants in these mutants may indeed reflect a perturbation of the site-specific binding constant $K_1$ only. Construction of the predicted binding curve for site 1 according to eq 19 is shown in Figure 5. The mutant Y13F predicts a binding curve that is physically implausible ($X_1 > 1$). The Y13F replacement violates the assumption of first-order perturbation and eq 19 based on it. The mutation induces changes that must propagate to site 2 and may also involve the communication between the sites. Other mutations, like ΔP20, produce physically plausible results, but five mutants in particular produce a consensus binding curve for site 1. These mutations are all isosteric substitutions of negatively charged residues around site 1 (Figure 4). The predicted site-specific binding curves are practically identical for all of these mutants, although the perturbation of the overall binding constants is quite different in each case (Table 2). Site 1 binds Ca²⁺ with an affinity of about 1.7 × 10⁸ M⁻¹, which is nearly 6-fold the affinity of site 2. As a result of cooperative coupling between the sites, the affinity of either site increases by a factor of 18 when the other site is ligated. These parameters map on the point in region III in Figure 1, corresponding to positive cooperativity in the global description. The site heterogeneity is not large enough to overcome the positive interaction between the sites.
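The reconstruction described above can be sketched numerically for a two-site system with site 1 perturbed. The Python sketch below is an illustration under stated assumptions, not the original analysis. It assumes that the global partition function carries the statistical factor 2 in its linear term, $\Psi = 1 + 2k_1x + k_1k_2x^2$, a convention adopted here because it is the one under which the Table 2 constants give back a site 1 affinity near the value quoted in the text. It also writes the site-specific form as $\Psi = 1 + (K_1 + K_2)x + c_{12}K_1K_2x^2$, so that ${}^0\Psi_1 = 1 + K_2x$ and ${}^1\Psi_1 = 1 + c_{12}K_2x$. The function names are hypothetical.

```python
# Global stepwise constants (M^-1) for wild-type and five mutants, from Table 2.
WILD_TYPE = (1.0e8, 7.9e8)
MUTANTS = {
    "E17Q": (1.3e7, 2.5e8),
    "D19N": (2.0e7, 2.0e8),
    "E26Q": (3.2e7, 5.0e8),
    "E60Q": (5.0e7, 6.4e8),
    "E17Q/D19N": (1.3e7, 2.5e7),
}


def psi(x, k1, k2):
    """Two-site partition function. ASSUMPTION: statistical factor 2."""
    return 1.0 + 2.0 * k1 * x + k1 * k2 * x * x


def f_site1(x, wt, mut):
    """Eq 19: f1 = 1 - Psi'/Psi, built from global quantities only."""
    return 1.0 - psi(x, *mut) / psi(x, *wt)


def resolve_site1(wt, mut):
    """Return (eta, K1, K2, c12) assuming a first-order perturbation of site 1."""
    a1, a2 = 2.0 * wt[0], wt[0] * wt[1]      # K1 + K2 and c12*K1*K2
    b1, b2 = 2.0 * mut[0], mut[0] * mut[1]   # K1' + K2 and c12*K1'*K2
    eta = 1.0 - b2 / a2                      # limit of f1 as x -> infinity
    K1 = (a1 - b1) / eta                     # a1 - b1 = K1 - K1' = eta*K1
    K2 = a1 - K1
    c12 = a2 / (K1 * K2)
    return eta, K1, K2, c12


def x_site1(x, K1, K2, c12):
    """Eq 16 written out for site 1 of a two-site system."""
    numerator = K1 * x * (1.0 + c12 * K2 * x)
    return numerator / (1.0 + (K1 + K2) * x + c12 * K1 * K2 * x * x)


for name, mut in MUTANTS.items():
    eta, K1, K2, c12 = resolve_site1(WILD_TYPE, mut)
    print(f"{name:10s} eta={eta:.3f} K1={K1:.2e} K2={K2:.2e} c12={c12:.1f}")
```

Under these assumptions each of the five mutants should return a $K_1$ close to the consensus value quoted above, even though their global constants differ widely, which is the behavior expected of a first-order perturbation. Dividing `f_site1` by `eta` gives the same curve as `x_site1` evaluated with the resolved parameters.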

Figure 4. Molecular environment of Ca²⁺ binding sites 1 and 2 of calbindin, showing the side chains of residues whose mutation produces a first-order perturbation of site 1.

Figure 5. Site-specific binding curve of site 1 of calbindin derived from the approach based on the first-order perturbation hypothesis and eq 19 in the text, using the parameters listed in Table 2 and the partition functions $\Psi = 1 + k_1x + k_1k_2x^2$ and $\Psi' = 1 + k_1'x + k_1'k_2'x^2$. The five mutants E17Q (○), D19N (●), E26Q (*), E60Q (□), and E17Q/D19N (■) predict a consensus binding curve (continuous line) that yields site-specific parameter values: $K_1$ = 1.7 × 10⁸ M⁻¹, $K_2$ = 2.7 × 10⁷ M⁻¹, and $c_{12}$ = 18. The mutant ΔP20 yields the discontinuous dotted line significantly different from the consensus curve. The mutant Y13F predicts a physically implausible curve (discontinuous line) for which $X_1 > 1$.


These results are consistent with the crystal structure of calbindin.²⁴ The structure of site 1 (Figure 4) suggests that it is unlikely that mutation of E26 or E17 would be carried over to site 2. These residues are about 10 Å closer to the Ca²⁺ in site 1 than that in site 2. Residues D19 and E60 are close to both sites and, in principle, mutations of these residues should perturb sites 1 and 2. In practice, however, mutation of these residues yields effects similar to mutation of E26 and E17. A molecular dynamics simulation of calbindin shows that E60 may be part of the coordination sphere of Ca²⁺ in site 1.³³ This residue, and possibly D19, may be closer to site 1 in solution, contrary to the conclusions drawn from the crystal structure.²⁴ It is likely that these residues are more strongly coupled to site 1 than site 2, so that perturbation of site 2 can be neglected for all practical purposes. Finally, comparison of the structure of apo-calbindin with the full Ca²⁺ form shows minor changes at the level of site 1 and more significant rearrangements of the side chains around site 2.²³ Given the similar coordination geometry at the two sites, it is expected that the enthalpy of binding will be similar for Ca²⁺ binding to site 1 or site 2. However, the preformed structure of site 1 should reduce the entropy loss of binding to this site compared to site 2, thereby making site 1 the high-affinity site.

D. Practical Limitations

Given the importance of site-specific parameters in deciphering cooperativity, it is desirable to have a general strategy of approach that works for any system of interest. Measuring site-specific binding curves is one way to obtain information on these parameters. In the case of cytochromes, each redox center has distinct spectral properties and enables direct site-specific measurements.³⁵,³⁶ In the case of λ cI repressor binding to its operator, footprint titrations yield binding isotherms for the three individual sites of the operator.³⁷ Although site-specific probes can be exploited in many systems to obtain information on the behavior of individual sites, it should be noted that knowledge of site-specific binding curves may be insufficient to resolve all independent parameters in the system. Hence, the possibility of experimentally measuring binding events at individual sites by no means guarantees that the site-specific energetics of the system will be fully dissected. For a system of $N$ sites, the $N$ independent site-specific binding constants $K_j$’s can be resolved from the $N$ site-specific binding curves.⁹ However, the partition function of the system also contains $N(N - 1)/2$ second-order coupling constants, $N(N - 1)(N - 2)/6$ third-order coupling constants, and in general $\binom{N}{m}$ $m$th-order coupling constants that need to be resolved from experimental data. As soon as $\binom{N}{m} > N$ for a particular value of $m$, measurements of the $N$ site-specific binding curves become insufficient to resolve all site-specific parameters. This limitation arises already for $N = 4$. Even if one could measure all the site-specific binding curves in a macromolecule containing four binding sites, unique resolution of the six second-order coupling constants would be impossible. The only way to overcome this problem is to determine directly the ligated intermediates in the partition function, but this cannot be done in the vast majority of cases. The cryogenic quenching technique developed by Perrella³⁸ is unique in its ability to resolve all ligated intermediates of hemoglobin³⁹ ($N = 4$), but this experimental strategy exploits peculiar properties of this protein and has no applicability to other cooperative systems.

The foregoing considerations may lead one to conclude that site-specific thermodynamics is a theory of limited applicability to small systems ($N < 4$) or to particular cases such as hemoglobin where all intermediates of the partition function can be accessed directly. This is indeed the case when the theory is applied to ligand binding cooperativity. The limitations vanish when dealing with another class of processes where the theory finds its most ideal and general applicability. These processes include site-directed mutagenesis of residues aimed at understanding the molecular signatures of stability and ligand recognition. The various intermediates of the system, whose characterization is so problematic in ligand binding studies, are generated directly from the perturbations introduced in the system in the form of site-directed mutations. Analytical and conceptual tools developed for ligand binding processes can be exploited in the analysis of mutational effects using a basic analogy to be described in the next section. This brings site-specific thermodynamics into the mainstream of current studies of structure-function relations, protein stability, and ligand recognition.
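The counting argument of this section is easy to check numerically. The following Python sketch tabulates the number of coupling constants of each order and reports the lowest order at which they outnumber the $N$ site-specific binding curves; the function names are hypothetical and serve only this illustration.

```python
from math import comb


def coupling_constants_by_order(n_sites):
    """Map each order m (2..N) to the number of mth-order coupling constants."""
    return {m: comb(n_sites, m) for m in range(2, n_sites + 1)}


def smallest_unresolvable_order(n_sites):
    """Lowest order m with more coupling constants than binding curves, or None."""
    for m, count in coupling_constants_by_order(n_sites).items():
        if count > n_sites:
            return m
    return None


for n in (2, 3, 4, 5):
    counts = coupling_constants_by_order(n)
    total = n + sum(counts.values())  # all independent parameters, 2**n - 1
    print(n, counts, total, smallest_unresolvable_order(n))
```

For two or three sites no order exceeds the number of available curves, whereas for four sites the six second-order constants already do, in agreement with the statement in the text.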

III. Structural Mapping of Energetics

A. A Basic Analogy

Site-directed mutagenesis⁴ has made it possible to perturb the structure of a protein at the level of individual residues and study the origin of protein stability and ligand recognition with unprecedented detail. Residues in a protein can be replaced by any of the 20 natural amino acids. Extension of this strategy to include unnatural amino acids has further expanded the ability to manipulate protein structure and function.⁴⁰,⁴¹ Site-directed mutagenesis is a modern incarnation of the Wegscheider principle and uses the structural perturbation created by the site-directed substitution as a source of information on the properties of the residue being substituted. A significant advantage of this technique is that it can directly generate all intermediates of interest in a site-specific analysis, overcoming the intrinsic limitations imposed by the number of binding sites seen in ligand binding cooperativity (see section II.D).

Although mutational effects are intrinsically different from ligand binding processes, their thermodynamic treatment is extraordinarily similar to that of cooperative ligand binding once a basic analogy is considered. The free → bound transition at a given binding site is analogous in energetic terms to the wild-type → mutant transition of a given residue. This enables use of the same formalism developed for the analysis of ligand binding cooperativity in the analysis of mutational effects. Each residue subject to mutational perturbation represents a site in the system. The energetic balance of the wild-type → mutant transition at site $j$, when all other sites are wild-type, specifies the site-specific free energy $\Delta G_j$ analogous to the site-specific binding free energy $\Delta G_j = -RT \ln K_j$ ($R$ is the gas constant and $T$ the absolute temperature) for binding to site $j$ when all other sites are free. Cooperativity in mutational effects can be expected when substitutions are made at multiple sites and is treated in a manner analogous to that of ligand binding processes. The coupling free energy $\Delta G_{ij}$ between mutations at site $i$ and $j$ is defined from the thermodynamic cycle analogous to eq 2 that involves the intermediates with both sites mutated or wild-type and the two singly mutated species (see section III.C). This quantity is analogous to the coupling free energy $\Delta G_{ij} = -RT \ln c_{ij}$ for the binding of two ligand molecules to sites $i$ and $j$.⁵,⁹ The number of intermediates to be characterized is set by the number of residues subject to site-directed mutagenesis. Unlike ligand binding, this number is not imposed by intrinsic properties of the system but is determined entirely by the experimentalist. When $N$ sites are targeted with a single substitution, $2^N$ possible intermediates are to be considered. Of these, only $2^N - 1$ are independent if one is chosen as reference. This gives rise to $N$ independent site-specific free energies $\Delta G_j$’s and a total of $2^N - N - 1$ coupling free energies from second up to $N$th order. Resolution of all these independent parameters provides information relevant to the behavior of each residue in the process under investigation and the nature of cooperative interactions.
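The two sides of the analogy can be put on a common numerical footing with a short Python sketch. It converts an equilibrium or coupling constant into a free energy through $\Delta G = -RT \ln K$, counts the parameters generated by $N$ mutated sites, and evaluates a pairwise coupling free energy from a double-mutant cycle. Several choices are assumptions made for illustration: the temperature of 298.15 K, the units of kcal/mol, the sign convention of the pairwise cycle (double mutant minus the two single mutants, all relative to wild-type), and every function name.

```python
from math import log

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def free_energy(constant, temperature=298.15):
    """Delta G = -RT ln K, for a binding constant K_j or a coupling constant c_ij."""
    return -R_KCAL * temperature * log(constant)


def parameter_count(n_sites):
    """Independent parameters when n_sites are each targeted by one substitution."""
    return {
        "intermediates": 2 ** n_sites,
        "site_specific": n_sites,
        "coupling": 2 ** n_sites - n_sites - 1,
    }


def pairwise_coupling(dg_i, dg_j, dg_double):
    """ASSUMED cycle bookkeeping: coupling = double mutant - sum of single mutants."""
    return dg_double - dg_i - dg_j


# Ligand binding side: a coupling constant of 18, as found for calbindin.
print(round(free_energy(18.0), 2))

# Mutational side: hypothetical free energy changes in kcal/mol.
print(parameter_count(3))
print(pairwise_coupling(dg_i=1.0, dg_j=0.5, dg_double=2.3))
```

A coupling constant larger than 1 maps onto a negative coupling free energy, so the same sign conventions can be carried over from binding to mutational cycles once the reference state is fixed.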

B. Ala Scans

Targets for site-directed mutagenesis are often identified from available structural information. In the analysis of protein stability, particular attention is devoted to residues buried in the interior of the protein and defining hydrophobic cores.⁴²⁻⁴⁴ Other targets are found in residues involved in ionic interactions,⁴⁵ especially if screened from the solvent.⁴⁶,⁴⁷ In the analysis of ligand recognition, targets are identified from residues involved in polar and hydrophobic interactions in the bound complex.⁴⁸,⁴⁹ In the absence of structural information on the bound protein, solvent accessibility can successfully guide a mutagenesis screen.⁵⁰⁻⁵²

There are a number of questions to be addressed when identifying epitopes for protein stability or ligand recognition. First, one would like to know what are the residues important for the energetics of binding or stability. Identification of these residues then raises the question of whether they act independently or in a cooperative manner. Finally, if cooperativity is involved, one would like to know what are the factors responsible for it. Answers to all these questions can be found by application of the principles of site-specific thermodynamics to mutational effects.⁵,⁹

The first question is addressed by replacing residues that are thought to be involved in stability or recognition. Definition of a structural epitope specifies the degrees of freedom of the system and the number of residues that are energetically relevant to the phenomenon under study. There are 20 possible choices for any given residue in a protein, and therefore, if one specific residue is to be replaced, there are 19 possibilities. In practice, the residue of choice for the replacement is Ala. The rationale behind Ala-scanning mutagenesis is that all interactions of a side chain, except for the Cβ atom, are eliminated.⁵³,⁵⁴ The contribution of the deleted groups relative to the methyl moiety of Ala is assessed from the difference between the properties of the wild-type relative to the Ala mutant. For this strategy to be effective, it is necessary that the Ala substitution eliminates interactions without introducing new properties. In principle, this should be the case for almost all amino acids except Gly, for which the Ala substitution can introduce new nonpolar interactions, and Cys, for which the Ala substitutions can disrupt an important disulfide bond, generating global destabilizing effects on the protein. In addition, for Gly and Pro, the Ala substitution can introduce perturbations of the protein backbone that becomes less flexible (Gly → Ala substitution) or less rigid (Pro → Ala substitution). Ala scanning mutagenesis has found myriad applications in the identification and energetic characterization of structural epitopes recognizing specific ligands⁵⁰⁻⁵² or the structural determinants of protein stability,⁴²⁻⁴⁴,⁵⁵⁻⁵⁷ enzyme mechanism,⁵⁸ and specificity.⁵⁹

Free energies of binding in the ground or transition state, or free energies of unfolding are used to quantify the effect of the Ala substitution at any given site. In the case of ligand recognition, the effect of Ala replacements is quantified from the properties of the following thermodynamic cycle:⁵

$$
\begin{array}{ccc}
M & \xrightarrow{\;\Delta G_{wt}\;} & ML \\
{\scriptstyle {}^0\Delta G_j}\,\big\downarrow & & \big\downarrow\,{\scriptstyle {}^1\Delta G_j} \\
M_{mut} & \xrightarrow{\;\Delta G_{mut}\;} & M_{mut}L
\end{array}
\tag{20}
$$

$\Delta G_{wt}$ measures the free energy of binding L to the wild-type macromolecule. The same process in the mutant gives $\Delta G_{mut}$, and the difference $\Delta\Delta G = \Delta G_{mut} - \Delta G_{wt} = \Delta G_c$ is a measure of the effect of the site-directed mutation on the binding process.⁵ This difference is the coupling free energy of the cycle and measures the linkage between the binding of L and the mutation. The same cycle applies to binding in the transition state, where the free energy is directly related to the specificity constant $s = k_{cat}/K_m$ and $\Delta\Delta G = RT \ln(s_{wt}/s_{mut}) = \Delta G_c$. When $\Delta G_c > 0$, the mutation reduces specificity, whereas enhanced specificity is reflected by $\Delta G_c < 0$ and no effect is seen for $\Delta G_c = 0$.

In the case of protein stability, a thermodynamic cycle analogous to eq 20 can be constructed as follows:

$$
\begin{array}{ccc}
U & \xrightarrow{\;\Delta G_{wt}\;} & F \\
{\scriptstyle {}^0\Delta G_j}\,\big\downarrow & & \big\downarrow\,{\scriptstyle {}^1\Delta G_j} \\
U_{mut} & \xrightarrow{\;\Delta G_{mut}\;} & F_{mut}
\end{array}
\tag{21}
$$

The unfolded state of the protein, U, replaces M and the folded state, F, replaces ML.⁵ The value of $\Delta G_c$ measures the linkage between the mutation and the folding of the protein. When $\Delta G_c$ is positive, the mutation reduces stability. Enhanced stability is reflected by a negative value of $\Delta G_c$ and no effect is seen for $\Delta G_c = 0$.

Definition of $\Delta G_c$ in the cycles in eqs 20 and 21 implies that the effect of the mutation cannot be attributed entirely to the bound or folded state of the protein, as far too often assumed. In fact, $\Delta G_c$ measures the difference in free energy between the folded mutant and wild-type relative to the same difference in the unfolded state. In the case of ligand recognition, $\Delta G_c$ measures the difference ${}^1\Delta G_j - {}^0\Delta G_j$ and reflects the perturbation introduced by the mutation on the ML complex relative to the free macromolecule M. To assign $\Delta G_c$ entirely as a perturbation of the folded state, one must assume that the free energy of the unfolded state is not affected by the mutation. However, this is in contrast with a large body of experimental data obtained in a variety of systems.⁵ Likewise, identification of a structural epitope strictly demands that the mutation perturbs specifically the bound state of the macromolecule. However, if a mutation has $\Delta G_c > 0$ and destabilizes the binding of L, the effect is not necessarily due to destabilization of the complex ML. A mutation that stabilizes the free form of the macromolecule (${}^0\Delta G_j < 0$) and has no effect on the bound form (${}^1\Delta G_j = 0$) also gives $\Delta G_c > 0$ and can be confused with a mutation that directly affects recognition of the ligand. In this case, the residue mutated is mistakenly associated with the epitope recognizing the ligand L, although it plays no role in the binding event. A value of $\Delta G_c > 0$ only means that the effect of the mutation has reduced the stability of the complex more than that of the free form. Assignment of the perturbation to the bound complex requires experimental demonstration that the free form of the macromolecule is not affected by the mutation (${}^0\Delta G_j = 0$). In the absence of this information, interpretation of the results may be problematic and must rely on other criteria like the spatial proximity of residues affecting ligand binding or the involvement of these residues in ligand recognition based on structural information.
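The ambiguity described here can be made concrete with a minimal Python sketch of the cycle in eq 20. Two hypothetical mutations are compared, one that destabilizes the complex and one that stabilizes the free form; both give the same coupling free energy. All numerical values are invented for illustration, the temperature of 298.15 K and the units of kcal/mol are assumptions, and the function names are hypothetical.

```python
from math import log

R_KCAL = 1.987e-3  # gas constant in kcal mol^-1 K^-1


def coupling_from_horizontal(dg_wt, dg_mut):
    """Delta G_c from the two binding (or folding) free energies of the cycle."""
    return dg_mut - dg_wt


def coupling_from_vertical(dg_free, dg_bound):
    """Delta G_c from the mutational free energies of the free and bound forms."""
    return dg_bound - dg_free


def coupling_from_specificity(s_wt, s_mut, temperature=298.15):
    """Delta G_c = RT ln(s_wt / s_mut) for binding in the transition state."""
    return R_KCAL * temperature * log(s_wt / s_mut)


# Hypothetical mutational effects (kcal/mol) on the free and bound forms.
cases = {
    "complex destabilized": (0.0, 1.5),
    "free form stabilized": (-1.5, 0.0),
}
dg_wt = -10.0  # hypothetical wild-type binding free energy

for label, (dg_free, dg_bound) in cases.items():
    dg_mut = dg_wt + dg_bound - dg_free  # closure of the thermodynamic cycle
    print(label,
          coupling_from_horizontal(dg_wt, dg_mut),
          coupling_from_vertical(dg_free, dg_bound))
```

Both cases return the same positive coupling free energy, so a measurement of $\Delta G_{wt}$ and $\Delta G_{mut}$ alone cannot tell them apart; only independent information on ${}^0\Delta G_j$ can.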
Only whenthe Ala substitution does not alter the properties ofthe unfolded state, or removes contacts important forinteraction with the ligand, can maps of the regionsinvolved in stability and ligand recognition be con-structed from the effect of the mutation on ∆Gc.In addition to the potential problems outlined

above, single-site Ala replacements neglect a priorithe contribution of possible site-site interactions toprotein stability and ligand recognition. The analogywith ligand binding reveals the limits of such anapproach. Single-site Ala scans only provide infor-mation on the equivalent of the site-specific bindingconstants Kj’s, and there is no way one can assessthe binding properties of a cooperative system fromknowledge of these parameters alone. In the absenceof interactions, these constants are indeed sufficientto characterize the properties of the system. In a

cooperative system where interactions are predomi-nant, these parameters represent only a small frac-tion of the total number of independent parametersneeded to characterize the energetics. Results fromthe limited number of studies where the importanceof site-site interactions in mutational effects hasbeen addressed experimentally have fostered thesomewhat misleading notion that residues tend toparticipate independently in stability and recog-nition60-62 and that interactions only occur amongresidues close in space.56,58,62-64 It has now beenrecognized that interactions may involve residues asfar as 30 Å away from each other.65-73 Hence, thereis good reason to believe that interactions are presentin nearly every system and provide the most impor-tant ingredient to protein stability and ligand rec-ognition.A compelling argument in favor of the existence of

cooperativity in ligand recognition and protein stabil-ity is as follows. The key assumption of Ala-scanningmutagenesis is that the Ala replacement has the onlyeffect of eliminating the interactions of the side chainbeyond the Câ.53,54 If this assumption is at all validand the Ala replacement is an unbiased probe of theenergetic contribution of a given residue to binding,then the Ala mutation at any position of the epitopeshould convert the free energy contribution to zero.If this is not the case, then the Ala replacement hasintroduced new properties at the site, thereby invali-dating the assumption. If a functional epitope forbinding or stability were composed exclusively ofindependent residues, then these residues wouldcontribute to the energetics in an additive mannerand their contribution would be unraveled by single-site Ala scans. Furthermore, the sum of the freeenergy changes due to Ala replacement over all sitesin the epitope, with changed sign, would be close tothe actual free energy of binding or stability mea-sured experimentally for the wild-type. Inspectionof the results in Table 3 for a number of systemsshows that this is not the case.A large discrepancy exists between the calculated

and experimentally determined values. In the caseof human growth hormone48 or granulocyte colonystimulating factor76 binding to their receptors, thebinding affinity calculated from the results of the Alascan is greatly overestimated, and so is the stabilityof Arc repressor46 and staphylococcal nuclease.74 Inthe cases of BPTI binding to trypsin,49 tissue factorbinding to coagulation factor VIIa,51 or linolenatebinding to intestinal fatty acid binding protein,75 thebinding affinity is grossly underestimated. When theaffinity is underestimated, it may be argued that thefunctional epitope might have been incompletelycharacterized thereby missing important interac-tions. This can hardly be the case for the interactionof tissue factor with VIIa, where 112 residues weretargeted by mutagenesis, or intestinal fatty acidbinding protein, where 23 important residues in thebinding cavity were replaced. On the other hand,when the affinity is overestimated, it may be arguedthat the functional epitope might have included sitesof marginal importance. Again, this can hardly bethe case in the interaction of human growth hormone

Site-Specific Thermodynamics Chemical Reviews, 1998, Vol. 98, No. 4 1573

Page 12: Site-Specific Thermodynamics:  Understanding Cooperativity in Molecular Recognition

with its receptor where the functional epitope is a small hot spot, or for Arc repressor, where the calculated value of stability was taken from the sum of only 11 out of 52 mutated residues, or else for staphylococcal nuclease where only the effect of Ala replacements of 14 large hydrophobic side chains was considered. The results of intestinal fatty acid binding protein are particularly instructive insofar as they show that the discrepancy between calculated and experimentally determined values depends on the particular ligand examined. The difference changes from -11.5 kcal/mol for linolenate to 1.4 kcal/mol for stearate. Given the comparable size of the fatty acids listed in Table 3 and their comparable binding affinity, this large difference cannot be due to intrinsic properties of the ligand. Rather, it suggests the presence of communication among the protein residues that is sensitive to the particular ligand bound. In the case of insulin binding to its receptor, the affinity is underestimated when 26 residues are mutated to Ala.77 However, when the results are combined with other Ala scans under identical conditions78-83 to cover a total of 38 residues, the affinity is grossly overestimated. A similar situation is encountered in the binding of a monoclonal antibody to coagulation factor VIII.84 Again, when the Ala scan involves most of the residues responsible for binding, a large discrepancy is seen between calculated and experimentally determined values for the binding of the ligand to the wild-type, underscoring the important role that interactions among residues play in the recognition process.

It may seem paradoxical that an epitope containing all residues replaced by Ala should bind a ligand with a ∆G = 0, regardless of the system studied, if the residues are truly independent. A binding free energy of zero means that the ligand experiences no net energetic change in going from the free to the bound state and that the all-Ala binding epitope is energetically neutral. Similar arguments apply to protein stability. Although this scenario is hypothetical, its validity within reasonable energetic terms is key to the approach based on Ala scans. If the large discrepancy in Table 3 is the result of specific favorable or unfavorable contributions to stability and recognition introduced by the presence of Ala at any given site, the assignment of epitopes with Ala-scanning mutagenesis becomes context dependent and highly questionable. It is possible that Ala replacements may introduce additional properties at the site of mutation and that these properties may bias the energetic balance of the substitution. However, this bias is likely to be small. We propose that the large discrepancy documented in Table 3 is indicative of a more general problem, i.e., the neglect of energetic contributions arising from possible site-site interactions that cannot be quantified by single-site Ala scans.

In the case of ligand binding, the presence of cooperativity in the recognition event may be the result of some general rules through which biological specificity is encoded into the structure of a protein. A similar scenario may apply to protein stability, where recognition involves domains of the same protein. The stability of a protein is thought to result from the balance of two large and opposite forces, one favorable due to the hydrophobic and electrostatic effects and the other unfavorable due to conformational entropy loss.2 The balance is usually comparable in magnitude to the free energy involved in only a few polar or charged interactions. In view of this well-established fact, it may be argued that the nonadditivity documented in Table 3 for the perturbation of protein stability may be due to the disruption of favorable interactions, without compromising the unfavorable contributions. Hence, the disruption of a few contacts independent of one another may result in a loss of stability comparable to that of the entire protein, and the balance of similar perturbations over a large number of residues will necessarily

Table 3. Comparison of Free Energy Values (kcal/mol) for Stability and Ligand Recognition Measured Experimentally and Calculated from Single-Site Ala Scans

system                        process    Ala replacements  ∆Gcalc(i)   ∆Gexp  δ∆Gcoop  ref
Arc repressor                 unfolding  51(a)                  58.2    13.8    -44.4  46
Staphylococcal nuclease       unfolding  14(b)                  39.1     5.5    -33.6  74
hGH-hGHbp(c)                  binding    30                    -25.9   -12.3     13.6  48
BPTI-chymotrypsin             binding    15                     -6.4   -10.7     -4.3  49
VIIa-TF(d)                    binding    112                    -9.7   -15.4     -5.7  51
I-FABP(e) (palmitate)         binding    23                     -6.8   -10.9     -4.1  75
I-FABP(e) (stearate)          binding    23                    -13.1   -11.7      1.4  75
I-FABP(e) (oleate)            binding    23                     -8.5   -10.7     -2.2  75
I-FABP(e) (linoleate)         binding    23                     -5.4   -10.0     -4.6  75
I-FABP(e) (linolenate)        binding    23                      2.4    -9.1    -11.5  75
I-FABP(e) (arachidonate)      binding    23                     -3.6    -9.5     -5.9  75
GCSF-GCSF receptor(f)         binding    27                    -14.5   -11.3      3.2  76
RANTES-CCR1(g)                binding    16                     -5.2   -12.3     -7.1  52
RANTES-CCR3(g)                binding    16                    -10.5   -13.0     -2.5  52
insulin receptor              binding    26                     -8.9   -11.2     -2.3  77
insulin receptor              binding    38                    -23.4   -11.2     12.1  77-83
VIII-mAb413(h)                binding    10                    -16.6   -13.6      3.0  84

(a) Only the Ala replacements of residues W14, N29, R31, S32, E36, R40, S44, K47, E48, and R50 forming hydrogen bonds and ion pairs protected from the solvent were included in the calculations. (b) Only the Ala replacements of large hydrophobic residues were included in the calculations. (c) Human growth hormone (hGH) binding to the extracellular domain of its first bound receptor (hGHbp). (d) Tissue factor (TF) binding to coagulation factor VIIa. (e) Intestinal fatty acid binding protein. (f) Granulocyte colony stimulating factor (GCSF). (g) CC-chemokine regulated upon activation normal T-cell expressed and secreted interacting with its receptors. (h) Monoclonal antibody 413 binding to coagulation factor VIII. (i) ∆Gcalc, ∆Gexp, and δ∆Gcoop are defined in eq 22.


exceed the stability of the protein by a large factor. If this argument is correct, it should be possible to find a significant number of mutations in a protein that only affect the unfavorable contributions and produce large increases in stability. However, stabilizing mutations are rather exceptional, whereas destabilizing mutations are very common. We propose that some of the unfavorable contributions to protein stability result from the negative interactions among residues that contribute to protein stability. In the absence of these important interactions, underlying the complex cooperative nature of the folding process, proteins would be orders of magnitude more stable. This hypothesis explains the results in Table 3 and accounts for the large prevalence of destabilizing effects observed in single-site Ala scans of proteins.

An approximate measure of the extent of interactions among residues is given by the difference, δ∆Gcoop, between the experimentally determined, ∆Gexp, and calculated, ∆Gcalc, values of the free energy of binding or stability, as defined in eq 22.

The value of ∆Gexp is the same as the free energy of the wild-type, ∆Gwt. The calculated ∆Gcalc is the sum of the differences between the free energy of the mutant and wild-type for all N mutants in the epitope, with changed sign. In the absence of interactions among the residues being mutated to Ala, and under the assumption that the Ala substitution is energetically neutral, δ∆Gcoop should be as close as possible to zero. Hence, eq 23 is the expected result for an epitope composed of independent residues.

The presence of interactions invalidates the energetic assignments derived from single-site Ala scans because the contribution of a given residue to stability or ligand binding will depend on the state (wild-type or mutated) of other residues. The extent to which interactions affect the assignments based on single-site Ala scans must be evaluated in each case and complicates the identification of epitopes. In cooperative processes such as protein stability or ligand recognition, the contribution of a given residue involves effects of multiple order. A first-order contribution comes from contacts made directly with the ligand or with another residue in the protein. Higher-order contributions may come from the coupling between the residue and other structural components. The residue recognizing the ligand may be involved in a number of interactions with other residues via short-range van der Waals coupling, long-range electrostatic coupling, or large-scale conformational transitions. For interactions of second order, the construction of double mutations becomes necessary to assess the energetic contribution to stability and ligand recognition, and so forth for higher-order interactions. If an epitope contains N residues, a complete single-site Ala scan requires N mutations and a double-site Ala scan requires N(N - 1)/2 mutations. The problem of correctly assessing the energetic contribution of residues in a functional epitope using site-directed mutagenesis is combinatorially challenging and demands elucidation of site-site coupling patterns. This calls for a new method of analysis of mutational effects in proteins where the role of interactions is explicitly taken into account.
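The cooperativity term of eq 22 and the mutant counts quoted above can be checked with a short calculation. The sketch below is illustrative only: the function names are hypothetical, and the numerical rows are a subset of Table 3 copied as (∆Gcalc, ∆Gexp) pairs in kcal/mol.

```python
from math import comb

def delta_g_coop(dg_exp, dg_calc):
    """Eq 22: cooperativity term = experimental minus calculated free energy (kcal/mol)."""
    return dg_exp - dg_calc

def n_single_mutants(n_sites):
    """A complete single-site Ala scan needs one mutant per residue."""
    return n_sites

def n_double_mutants(n_sites):
    """A complete double-site Ala scan needs N(N - 1)/2 mutants."""
    return comb(n_sites, 2)

# (system, dG_calc, dG_exp) taken from Table 3; the selection of rows is arbitrary.
TABLE3_ROWS = [
    ("Arc repressor (unfolding)", 58.2, 13.8),
    ("hGH-hGHbp (binding)", -25.9, -12.3),
    ("I-FABP, stearate (binding)", -13.1, -11.7),
    ("I-FABP, linolenate (binding)", 2.4, -9.1),
]

if __name__ == "__main__":
    for system, dg_calc, dg_exp in TABLE3_ROWS:
        print(f"{system:30s} ddG_coop = {delta_g_coop(dg_exp, dg_calc):+6.1f} kcal/mol")
    # Size of a double-site scan for the 23-residue I-FABP epitope.
    print("double mutants for N = 23:", n_double_mutants(23))
```

A value of δ∆Gcoop far from zero in this calculation signals that the single-site contributions do not add up to the wild-type free energy, which is the nonadditivity discussed in the text.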

C. Double-Mutant Cycles

Cooperativity between single-site mutations is typically assessed from the properties of double-mutant cycles. Consider the general case of a system composed of N sites that can exist in two states, 0 (wild-type) and 1 (mutant). ∆Gj is the free energy change associated with the 0 → 1 transition at site j when all other sites are in state 0. This term is the difference in free energy between the configuration with site j perturbed and the wild-type, resulting in the loss (∆Gj > 0) or gain (∆Gj < 0) of specificity or stability due to perturbation of that site. There are N such terms to be taken into account, one for each site. Consider, then, the double perturbation at sites i and j. The free energy change for such perturbation can be written as the sum ∆Gi + ∆Gj + ∆Gij, where ∆Gij is the interaction free energy between sites i and j when the perturbation is applied at both sites. ∆Gij is the same as the coupling free energy in the thermodynamic cycle of eq 24, which is analogous to eq 2 and where the suffix denotes the state, wild-type or mutant, of sites i and j. A negative value of ∆Gij indicates positive coupling between the perturbations at sites i and j in enhancing specificity or stability, or negative coupling in reducing it, and vice versa for a positive value. A value of ∆Gij = 0 indicates the absence of coupling between the perturbations.

Some properties of double-mutant cycles have been discussed previously.62-65,73,85,86 Horovitz and Fersht86 pointed out that these cycles can also be used to dissect more complex interactions involving multiple sites. This becomes necessary if one wants to understand the origin of the coupling between two sites. The cycle in eq 24 can help establish the presence of coupling between mutations introduced at two sites, but it cannot reveal the origin of the coupling. Once the existence of coupling is established, is this the result of direct interactions between the sites or is it mediated indirectly via other sites? Horovitz and Fersht86 suggested that comparison of the values of the coupling free energy obtained in the two states, wild-type and mutated, of a third residue

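$$\delta\Delta G_{\mathrm{coop}} = \Delta G_{\mathrm{exp}} - \Delta G_{\mathrm{calc}} = \Delta G_{\mathrm{wt}} + \sum_{j=1}^{N}\left({}^{j}\Delta G_{\mathrm{mut}} - \Delta G_{\mathrm{wt}}\right) = \Delta G_{\mathrm{wt}} + \sum_{j=1}^{N}{}^{j}\Delta\Delta G_{\mathrm{mut}} \tag{22}$$

$$\Delta G_{\mathrm{wt}} = -\sum_{j=1}^{N}{}^{j}\Delta\Delta G_{\mathrm{mut}} \tag{23}$$

$$\begin{array}{ccc}
M_{00} & \xrightarrow{\;\Delta G_i\;} & M_{10} \\
{\scriptstyle \Delta G_j}\,\big\downarrow & & \big\downarrow\,{\scriptstyle \Delta G_j + \Delta G_{ij}} \\
M_{01} & \xrightarrow{\;\Delta G_i + \Delta G_{ij}\;} & M_{11}
\end{array} \tag{24}$$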


can determine whether the third residue affects the interaction between the two sites. This approach can be extended to an arbitrary number of sites by constructing a hierarchy of perturbed cycles. First, the effect of a third site is examined on the coupling between two sites. Then the effect of a fourth site is studied on the coupling between the third site and the first two sites, and so forth.

A more straightforward and informative method to assess the origin of coupling between mutations at two different sites exploits a key property of the coupling free energy.5,9 The mechanism of coupling is unraveled by studying how the coupling between any two sites is affected by the configuration of other sites. The coupling free energy between two mutations at sites i and j is defined in the double-mutant cycle in eq 24 by implicitly assuming that all other sites are in state 0. A cycle analogous to that in eq 24 can be constructed for any configuration of the other N - 2 sites. There are 2^(N-2) such configurations and N(N - 1)/2 distinct pairs of sites, i and j, leading to a total of N(N - 1)2^(N-3) possible thermodynamic cycles and coupling free energies. Not all cycles are independent because the system only contains 2^N - 1 independent terms, N of which are site-specific free energies of perturbation ∆Gj's and the remainder are coupling free energy terms from second up to Nth order. Construction of double-mutant cycles cannot generate more information than that contained in the independent coupling terms. Hence, of the N(N - 1)2^(N-3) possible cycles, only 2^N - 1 - N are necessarily independent. However, once any pair of sites i and j is chosen, the 2^(N-2) coupling free energy values generated by all configurations of the other N - 2 sites are all independent. Therefore, there are two alternative and equivalent ways to characterize the interactions of a system. One is based on the second- and higher-order interaction free energies that define the intermediates of the system; the other casts these free energies in terms of the coupling between two sites in any possible configurations of the other sites.

Once coupling free energies are calculated for all possible configurations of the system, it is possible to decipher the code for site-site interactions using the following property of a thermodynamic cycle whose mathematical proof is given elsewhere:9

THEOREM: If the coupling between two sites is direct and involves only second-order interactions, then the coupling free energy is independent of the configuration of other sites. Otherwise, the coupling is indirect and involves interactions higher than second order.

To understand the significance of this property, it is useful to consider two key examples of direct and indirect coupling. Direct coupling is characteristic of models of nearest-neighbor interactions, like the Koshland-Nemethy-Filmer model of ligand binding cooperativity.87 In this model, interactions are all pairwise and second order. Coupling of higher order is simply the result of additive contributions from second-order coupling terms. No matter how two sites are linked to each other and to the rest of the system, the coupling between them remains energetically the same regardless of the configuration of other sites. This has the nontrivial consequence that, when the coupling between a pair of sites is not affected by a third site, one cannot conclude that the third site is not coupled to the pair as the Horovitz-Fersht approach would mistakenly imply.86 In fact, in any nearest-neighbor model where the third site is coupled to each site in the pair, the state of the third site is inconsequential for the coupling free energy of the pair. Though somewhat counterintuitive, this conclusion can be proved mathematically9 and provides an important reference point for the correct interpretation of coupling free energy profiles.
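The counterintuitive point about direct coupling can be made concrete with a toy calculation. The sketch below assumes a three-site system whose free energy is a sum of site terms and pairwise terms only; the parametrization, the function names, and all numerical values are hypothetical and are not taken from the source. The coupling between sites 0 and 1 is computed from the cycle of eq 24 with site 2 held in either state.

```python
def free_energy(config, g_site, g_pair):
    """Free energy of a 0/1 configuration in a purely pairwise (second-order) model."""
    total = sum(g * x for g, x in zip(g_site, config))
    for (i, j), g in g_pair.items():
        total += g * config[i] * config[j]
    return total

def coupling(i, j, others, g_site, g_pair):
    """Coupling free energy of the i,j double-mutant cycle.

    `others` maps every remaining site index to the state (0 or 1)
    in which it is held while the cycle is constructed.
    """
    def level(xi, xj):
        config = [0] * len(g_site)
        for k, state in others.items():
            config[k] = state
        config[i], config[j] = xi, xj
        return free_energy(config, g_site, g_pair)

    return level(1, 1) - level(1, 0) - level(0, 1) + level(0, 0)

# Hypothetical parameters (kcal/mol): site 2 is coupled to both site 0 and site 1.
G_SITE = [1.0, 2.0, -0.5]
G_PAIR = {(0, 1): 0.8, (0, 2): -1.2, (1, 2): 0.4}

if __name__ == "__main__":
    for state in (0, 1):
        value = coupling(0, 1, {2: state}, G_SITE, G_PAIR)
        print(f"site 2 in state {state}: coupling(0,1) = {value:.3f}")
```

Both cycles return the same coupling even though site 2 interacts with each member of the pair, which is why an unchanged coupling free energy does not prove that the third site is uncoupled.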

The case of the ionization reactions of glutamic acid dealt with in section II.B is particularly relevant in this regard. The third-order coupling constant c123 is the same as the product c12c13c23 (Table 1). Hence, the third-order coupling free energy ∆G123 = -RT ln c123 is the sum of the three second-order coupling free energies ∆G12 = -RT ln c12, ∆G13 = -RT ln c13, and ∆G23 = -RT ln c23. As a result, the coupling between any two ionizable groups in glutamic acid is not influenced by the ionization state of the third group, although all groups are coupled. The amino group influences protonation of the α- and γ-carboxyl groups but has no influence on the negative coupling between these groups, which remains the same whether the amino group is protonated or not.

Indirect coupling manifests itself in a more obvious manner. An example is provided by the Monod-Wyman-Changeux model of concerted allosteric transitions88 where interactions involve all sites through a linked global conformational change. In this model, sites are always positively coupled and the order of coupling changes according to the state of other sites as the protein switches from one state to another. Combination of the Koshland-Nemethy-Filmer and Monod-Wyman-Changeux models into a more general hybrid model accounts for arbitrarily complex mechanisms of coupling.5,9
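Indirect coupling can be illustrated in the same spirit with a minimal two-state concerted model. The sketch below is an assumption-laden illustration, not the treatment of refs 5 and 9: each site contributes a dimensionless binding weight K_R or K_T (association constant times ligand activity), L is the allosteric equilibrium constant, the perturbation at a site is taken to be ligation, and all parameter values are hypothetical. With these conventions the statistical weight of any configuration with k ligated sites is K_R^k + L K_T^k.

```python
import math

RT = 0.593  # kcal/mol near 25 °C (assumed)

def mwc_weight(k, k_r, k_t, l_const):
    """Statistical weight of a configuration with k ligated sites (R state + T state)."""
    return k_r**k + l_const * k_t**k

def mwc_coupling(m, k_r, k_t, l_const):
    """Coupling free energy between two sites when m of the other sites are ligated.

    Derived from the cycle G(m+2) - 2 G(m+1) + G(m) with G(k) = -RT ln weight(k).
    """
    ratio = (mwc_weight(m + 2, k_r, k_t, l_const) * mwc_weight(m, k_r, k_t, l_const)
             / mwc_weight(m + 1, k_r, k_t, l_const) ** 2)
    return -RT * math.log(ratio)

if __name__ == "__main__":
    # Hypothetical parameters for a four-site protein.
    k_r, k_t, l_const = 100.0, 1.0, 1.0e4
    for m in range(3):
        print(f"{m} other sites ligated: coupling = {mwc_coupling(m, k_r, k_t, l_const):.2f} kcal/mol")
```

In this sketch the coupling is never positive in sign (the sites always reinforce one another), yet its magnitude depends on how many other sites are ligated, in contrast to the pairwise case where it is constant.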

The mechanism of coupling can be identified from analysis of double-mutant cycles but requires the availability of a high-dimensional manifold of perturbations where the coupling between two sites can be studied as a function of a relatively large number of configurations of other sites. This poses challenging tasks from an experimental standpoint because construction and expression of triple or higher order mutants in a protein may be problematic. The analysis based on the properties of the coupling free energy appears to be ideally suited for the site-specific dissection of ligand recognition when most of the perturbations are introduced in small peptides that bind to the protein. Large libraries of peptides containing all the relevant mutant forms can be constructed with ease and when combined with perturbations in the protein generate the complexity necessary to dissect all interactions in the system. An example of how this new and powerful approach based on the principles of site-specific thermodynamics can be implemented in practice is offered in the next section.
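The size of such a manifold grows quickly with the number of sites. The sketch below enumerates every double-mutant cycle explicitly (a pair of sites plus a fixed state for each remaining site) and compares the count with the closed-form expressions quoted earlier in this section; the function names are hypothetical.

```python
from itertools import combinations, product

def enumerate_cycles(n_sites):
    """List every double-mutant cycle as ((i, j), {other_site: state})."""
    cycles = []
    for i, j in combinations(range(n_sites), 2):
        others = [k for k in range(n_sites) if k not in (i, j)]
        for states in product((0, 1), repeat=len(others)):
            cycles.append(((i, j), dict(zip(others, states))))
    return cycles

def manifold_counts(n_sites):
    """Return (possible cycles, independent terms, independent coupling terms)."""
    n_cycles = len(enumerate_cycles(n_sites))
    n_terms = 2**n_sites - 1          # all species other than the reference
    n_coupling = n_terms - n_sites    # second- up to Nth-order coupling terms
    return n_cycles, n_terms, n_coupling

if __name__ == "__main__":
    for n in (3, 4, 5):
        cycles, terms, coupling_terms = manifold_counts(n)
        print(f"N = {n}: {cycles} cycles, {terms} independent terms, "
              f"{coupling_terms} independent coupling terms")
```

For N = 5, the explicit enumeration gives 80 cycles, of which only 26 carry independent information.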


IV. Site-Specific Dissection of Thrombin Specificity

A. Substrate Recognition by Serine Proteases

The principles outlined in the previous sections find an ideal application to the study of enzyme specificity. Understanding the molecular origin of enzyme specificity is important for structure-function and evolutionary studies and also bears on rational drug design. One of the best characterized classes of enzymes is that of serine proteases of the chymotrypsin family.89,90 These enzymes participate in key physiological functions such as digestion, blood coagulation, fibrinolysis, complement, and development. Proteases involved in digestive processes, like trypsin, have wide specificity and are also found in bacteria. In contrast, proteases involved in blood coagulation, fibrinolysis, and complement have narrow specificity and are found almost exclusively in vertebrates.91-93 Among these more specialized proteases, activity and specificity are controlled allosterically by the binding of Na+, whereas more primitive proteases and those involved in fibrinolysis are apparently devoid of this important property.94,95

Serine proteases of the chymotrypsin family share a common fold composed of two six-stranded β-barrels of similar structure that pack together asymmetrically to host at their interface the residues of the catalytic triad H57, D102, and S195.96 Although they have a common catalytic mechanism,97 these enzymes differ widely in specificity. The exact molecular origin of this difference remains for the most part elusive. The preference of trypsin-like enzymes for cleavage at Arg residues is due to the presence of D189 at the bottom of the catalytic pocket. In chymotrypsin, residue 189 is a Ser and the preference is for bulky aromatic side chains. However, the D189S replacement in trypsin does not result in a chymotrypsin-like specificity. This is instead obtained by more substantial replacements of the surface loops 185-188 and 221-225 with the homologous regions of chymotrypsin,98 though none of the residues in these loops contacts the bound substrate. These observations suggest a molecular origin of protease specificity that depends on multiple critical sites.

The classical approach to the study of protease specificity takes into account interactions made by the enzyme with the substrate at the level of individual sites.99 This approach lends itself to application of the principles of site-specific thermodynamics developed for the study of binding cooperativity.9 Residues of the substrate interacting with the enzyme are labeled with a P and a number from 1 to N, starting from the scissile bond and moving to the N-terminus. Residues of the enzyme making contacts with the substrate are called specificity sites and are labeled with an S. The amino acid at P1 of the substrate makes contacts with the specificity site S1 of the enzyme, P2 contacts S2, and so forth. Residues on the C-terminal portion of the scissile bond of the substrate are numbered P1′, P2′, and so forth and the corresponding specificity sites on the enzyme are S1′, S2′, and so on. The scissile bond is positioned between P1 and P1′. The existence of multiple recognition sites effectively narrows down specificity by reducing the probability that the required sequence is found in a random sample of potential substrates. The longer the consensus sequence interacting with the enzyme, the smaller the probability that it will occur in another potential substrate.

The recognition model based on binding to multiple specificity sites brings about a number of important questions, including the assessment of the free energy cost of a replacement made at a P or S site and whether the P or S sites contribute to recognition additively or cooperatively. These questions are central to the analysis of mutational effects discussed in section III and are addressed below in the specific case of thrombin-substrate interactions.

B. Thrombin Structure and Function

The serine protease thrombin plays two important and opposite roles that underlie the efficiency of blood coagulation. The procoagulant role entails the conversion of fibrinogen into the insoluble fibrin clot, the promotion of platelet aggregation, the stabilization of the ensuing clot by activation of factor XIII and inhibition of fibrinolysis, and the feedback enhancement of its own generation from prothrombin by activation of factors V, VIII, and XI. The anticoagulant role involves the thrombomodulin-assisted conversion of protein C into an active component that cleaves and inactivates factors VIIIa and Va together with protein S, thereby limiting the conversion of prothrombin into thrombin catalyzed by the prothrombinase complex.100,101 In addition to its primary roles in coagulation, thrombin elicits a variety of important effects on a number of cell lines upon binding to its receptors.102,103

Na+ is required for the optimal conversion of fibrinogen into fibrin monomers, which is catalyzed by the procoagulant fast (Na+-bound) form with high specificity.104 The slow (Na+-free) form of thrombin performs the same task with lower specificity. This form, on the other hand, has higher specificity than the fast form toward protein C105,106 and plays predominantly an anticoagulant role. As a result of the different affinity of the two allosteric forms, Na+ is actively exchanged in the transition state upon binding of fibrinogen or protein C. Fibrinogen binds to the fast form with higher affinity and promotes the slow → fast conversion and Na+ binding. On the other hand, binding of protein C promotes the fast → slow conversion and Na+ release. Hence, Na+ binding and dissociation are important molecular components of substrate recognition by thrombin.

Thrombin is composed of two polypeptide chains of 36 (A chain) and 259 (B chain) residues that are covalently linked through a disulfide bond.107 The B chain carries the functional epitopes of the enzyme and has an overall architecture similar to that of pancreatic serine proteases (Figure 6). The extraordinary specificity of thrombin toward fibrinogen arises not only from contacts made in the interior of the active site (see below) but also from interactions with exosite I located about 20 Å away from the


active site.108,109 Exosite I serves as an extended primed recognition site. Binding of hirudin derivatives or thrombomodulin to this site also enhances allosterically Na+ binding and switches the enzyme to the fast form, thereby changing activity and specificity.110-112 Another factor that influences thrombin specificity is the W60d insertion loop that is unique to thrombin and shapes the apolar specificity site S2. This loop narrows significantly the access to the active site by protruding into the solvent. Replacement of W60d with the less bulky Ala or Ser profoundly affects the interaction of thrombin with the natural inhibitor antithrombin III113 or fibrinogen.111,114 A similar function has been hypothesized for the autolysis loop shaping the lower rim of the access to the active site. Deletion of the entire loop results in a selective loss of fibrinogen binding.115

The Na+ binding site (Figure 7) displays octahedral coordination involving the carbonyl O atoms of R221a and K224 and four buried water molecules tetrahedrally coordinated by protein atoms and other water molecules116,117 that altogether define a complex hydrogen-bonding network within the catalytic pocket.118 Some of the hydrogen bonds in the network are conserved with trypsin.119 Others are specific to thrombin and are associated with Na+ and its coordination shell. The bound Na+ is located 15-20 Å away from the catalytic triad and lies within 5 Å from D189 in the specificity site S1 with a water molecule mediating a hydrogen-bonding interaction with Oδ2 of D189. The Na+ site also appears to be stabilized by three ion pairs. R221a is ion-paired to E146 of the autolysis loop, K224 is ion-paired to E217, while D221 and D222 form a bidentate ion pair with R187. Altering the bidentate ion pair with the double substitution D221A/D222K results in reduced activity toward fibrinogen but enhanced activity toward protein C.116 Perturbation of the ion pair in the R187Q thrombin Greenville produces a reduced clotting activity, consistent with reduced Na+ binding.120 Similar effects of reduced clotting activity due to reduced Na+ binding are seen upon disruption of the R221aA-E146106,121 or the K224-E217106,122 ion pairs.

C. Library of Site-Specific Probes

The molecular strategy used by thrombin to achieve specificity toward fibrinogen and protein C is deeply rooted in the mechanism through which Na+ binding affects the environment of the active site of the enzyme. The main question is how the Na+-induced slow → fast conversion enhances specificity toward fibrinogen and small chromogenic substrates. A related question is which allosteric form should be targeted with active-site inhibitors to guarantee optimal specificity. In both cases, the answer resides primarily in the properties of the specificity sites of the enzyme and warrants a quantitative assessment of their energetic contribution in the transition state.

Substrate libraries generated from combinatorial chemistry or phage display to identify consensus sequences for binding123,124 can be used as powerful probes of the molecular environment of the specificity sites of the enzyme to elucidate how they contribute to recognition in the transition state. If perturbations are made in the sequence of a substrate to generate a library containing all species required for a site-specific analysis, much information can be derived on the energetic contributions of the specificity sites that is difficult to obtain from mutagenesis of the enzyme. To understand the molecular origin of the higher specificity of the fast form toward fibrinogen, the chromogenic tripeptide substrate FPR (Table 4) was synthesized59 to mimic the interaction of the natural substrate with the active site of the enzyme.108,109 Like fibrinogen, FPR is cleaved by the fast form with a specificity 30-fold higher than that of the slow form95 (Table 5). The crystal structure

Figure 6. Ribbon representation of thrombin showing the residues of the catalytic triad. Important regions of the enzyme are noted.

Figure 7. Molecular environment of the Na+ binding site of thrombin. The bound Na+ (black circle) is coordinated octahedrally by the carbonyl O atoms of K224 and R221a and four water molecules (gray circles). The site seems to be stabilized by three ion pairs: R221a-E146, D221,D222-R187, and K224-E217.


of thrombin inhibited with H-D-Phe-Pro-Arg-CH2Cl107 provides information on the interactions of the P1-P3 groups of FPR with the enzyme. Arg at P1 makes an ion pair with D189 at S1 at the bottom of the catalytic pocket, Pro at P2 interacts with the apolar moiety of S2 defined by P60b, P60c, and W60d, whereas Phe at P3 forms a favorable edge-to-face interaction with the aromatic ring of W215 at S3 (Figure 8). The D enantiomer at P3 mimics the interaction of F8 at P9 of fibrinogen with W215 of thrombin.125 The chromogenic group p-nitroanilide attached to the C-terminus enables quantitative spectroscopic measurements of the released p-nitroaniline upon cleavage by thrombin at the P1-p-nitroanilide scissile bond.

Starting from FPR, seven substitutions were made to generate the library in Table 4.59 The rationale behind these substitutions was to introduce enough perturbation at P1, P2, and P3 while retaining sufficient specificity for accurate experimental measurements. The perturbation would then act as the source of information on the environment of the specificity sites of the enzyme S1, S2, and S3. H-D-Phe was replaced with H-D-Val in VPR, VPK, VGR, and VGK, to replace the aromatic moiety with a hydrophobe. Pro was replaced with Gly in FGR, FGK, VGR, and VGK, to avoid steric hindrance with S2 and relieve the rigidity of the P2-P3 bond. Arg was replaced with Lys in FPK, FGK, VPK, and VGK, to preserve the positive charge at P1 needed to contact D189 at S1. The substitutions were combined to generate all possible intermediates from the parent substrate FPR: the three singly substituted substrates FPK, FGR, and VPR, the three doubly substituted substrates FGK, VPK, and VGR, and the triply substituted substrate VGK.

To obtain the relevant free energy changes associated with the perturbations, the specificity constant s = kcat/Km for substrate hydrolysis was measured in all cases (Table 5) to estimate the free energy of stability of the transition state. The value for FPR was used to scale energetically all others to obtain

Table 4. Substrate Library

abbrev  substrate                          site(s) perturbed
FPR     H-D-Phe-Pro-Arg-p-nitroanilide     none
FPK     H-D-Phe-Pro-Lys-p-nitroanilide     P1
FGR     H-D-Phe-Gly-Arg-p-nitroanilide     P2
VPR     H-D-Val-Pro-Arg-p-nitroanilide     P3
FGK     H-D-Phe-Gly-Lys-p-nitroanilide     P1 and P2
VPK     H-D-Val-Pro-Lys-p-nitroanilide     P1 and P3
VGR     H-D-Val-Gly-Arg-p-nitroanilide     P2 and P3
VGK     H-D-Val-Gly-Lys-p-nitroanilide     P1, P2, and P3

Table 5. Specificity Constants kcat/Km (µM^-1 s^-1) for the Hydrolysis of Synthetic Substrates by Thrombin, Trypsin, and Plasmin(a)

                     FPR     FPK      FGR      VPR     FGK       VPK      VGR       VGK

Thrombin Fast Form
wild type            90      7.9      2.0      100     0.021     2.1      0.34      0.0047
R221aA               80      4.6      0.75     36      0.011     0.96     0.14      0.0024
K224A                44      7.7      0.93     24      0.027     1.4      0.17      0.0044
R221aA/K224A         26      3.2      0.33     13      0.011     0.70     0.049     0.0017

Thrombin Slow Form
wild type            3.0     0.35     0.86     6.7     0.0026    0.11     0.17      0.00079
R221aA               1.6     0.040    0.042    1.0     0.00038   0.0097   0.0086    0.00013
K224A                0.47    0.034    0.012    0.28    0.00039   0.0063   0.0020    0.00013
R221aA/K224A         0.34    0.010    0.0025   0.077   0.00021   0.0018   0.00063   0.000063

trypsin              8.9     0.95     2.2      6.9     0.22      0.75     0.67      0.069
plasmin              0.031   0.047    0.0018   0.028   0.0048    0.058    0.0016    0.0037

(a) Experimental conditions: 5 mM Tris, I = 200 mM, 0.1% PEG, pH 8.0, at 25 °C. The slow form was studied in the presence of 200 mM choline chloride. The properties of the fast form refer to the limit [Na+] → ∞, at constant I = 200 mM. Errors are typically ±2%.

Figure 8. Contacts between the irreversible inhibitor H-D-Phe-Pro-Arg-CH2Cl and the active site of thrombin. Shown are thrombin residues D189, P60c, W60d, L99, and W215 that interact with the inhibitor. The guanidyl group of the Arg at P1 makes an ion pair with the carboxyl group of D189 at S1 at the bottom of the active site. Pro at P2 packs in the S2 apolar cavity provided by the W60d loop. H-D-Phe at P3 makes favorable hydrophobic contacts in the cleft with L99 and especially a perpendicular aryl-aryl edge-on interaction with W215 at S3.


the relevant free energy changes in the transition state (Table 6) as follows:59

∆G1, ∆G2, and ∆G3 are the changes in specificity due to the single-site substitutions at P1, P2, and P3. ∆G12, ∆G13, and ∆G23 are the second-order coupling free energies for substitutions made at the three possible pairs of sites, and ∆G123 is the third-order coupling free energy for the triple substitution. These terms reflect interactions between substitutions made at different sites that may reduce (∆G > 0) or enhance (∆G < 0) specificity beyond simple additivity. The terms in eqs 25-31 define the free energy level of any substrate in the library relative to FPR. For example, the relative free energy level of FGK is ∆G1 + ∆G2 + ∆G12 and that of VGK is ∆G1 + ∆G2 + ∆G3 + ∆G123. Similar measurements were carried out with the three thrombin mutants R221aA, K224A, and R221aA/K224A to assess the role of the ion pairs that seem to stabilize the Na+ binding environment (Figure 7). This resulted in the complete dissection of a five-dimensional manifold of species in both the slow and fast forms of the enzyme from which detailed information can be derived on how perturbations of the substrate are coupled to each other and to perturbations in the enzyme. The five sites perturbed are P1, P2, and P3 in the substrate and R221a and K224 in the enzyme. The site-specific parameters relative to the 32 possible intermediates in the manifold are listed in Table 6 for each thrombin form.
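The way these terms follow from the specificity constants can be sketched numerically. Eqs 25-31 are not reproduced in this part of the text, so the definitions below are assumptions inferred from the prose: the free energy level of a substrate relative to FPR is taken as -RT ln(s/s_FPR), with s = kcat/Km, so that a positive value corresponds to a loss of specificity; the coupling terms are then obtained by subtraction, with ∆G123 defined so that the level of VGK equals ∆G1 + ∆G2 + ∆G3 + ∆G123, as stated above. The input values are the wild-type rows of Table 5; the function and variable names are hypothetical.

```python
import math

R_KCAL = 1.987e-3      # gas constant, kcal mol^-1 K^-1
T_KELVIN = 298.15      # 25 °C, as in the conditions of Table 5
RT = R_KCAL * T_KELVIN

# kcat/Km (uM^-1 s^-1) for wild-type thrombin, copied from Table 5.
WT_FAST = {"FPR": 90, "FPK": 7.9, "FGR": 2.0, "VPR": 100,
           "FGK": 0.021, "VPK": 2.1, "VGR": 0.34, "VGK": 0.0047}
WT_SLOW = {"FPR": 3.0, "FPK": 0.35, "FGR": 0.86, "VPR": 6.7,
           "FGK": 0.0026, "VPK": 0.11, "VGR": 0.17, "VGK": 0.00079}

def level(s, name, ref="FPR"):
    """Assumed definition: transition-state free energy relative to FPR (kcal/mol)."""
    return -RT * math.log(s[name] / s[ref])

def site_specific_terms(s):
    """Single-site and coupling terms for substitutions at P1, P2, and P3."""
    dg1, dg2, dg3 = level(s, "FPK"), level(s, "FGR"), level(s, "VPR")
    return {
        "dG1": dg1,
        "dG2": dg2,
        "dG3": dg3,
        "dG12": level(s, "FGK") - dg1 - dg2,
        "dG13": level(s, "VPK") - dg1 - dg3,
        "dG23": level(s, "VGR") - dg2 - dg3,
        "dG123": level(s, "VGK") - dg1 - dg2 - dg3,
    }

if __name__ == "__main__":
    for label, data in (("fast", WT_FAST), ("slow", WT_SLOW)):
        terms = site_specific_terms(data)
        print(label, {k: round(v, 2) for k, v in terms.items()})
```

Under these assumed definitions, a positive coupling term means that the doubly substituted substrate is cleaved with lower specificity than expected from the two single substitutions taken independently. The values printed here are not guaranteed to match Table 6, which is not shown in this part of the text.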

D. Cooperativity in Substrate Recognition

Inspection of Table 6 reveals the presence of large and significant cooperativity in the effects induced by perturbations of the P1-P3 sites in the case of wild-type thrombin. The extent of cooperativity changes for each pair of substitutions and is also affected by the allosteric state of the enzyme and mutations made around the Na+ binding environment. In contrast, no interactions are seen for trypsin and plasmin, two cognate proteases. The different response elicited by the substrate library in different enzymes lends validity to the strategy of probing the environment of the specificity sites.

The free energy change due to replacing Arg with

Lys at P1 in all possible combinations of the state of P2 and P3 is summarized in Table 7. The values are all positive in both the slow and fast forms, for wild-type and mutant thrombins, indicating that the Arg → Lys replacement at P1 always causes a loss of specificity. The cost of this replacement is about 1 kcal/mol in both the slow and fast forms when no replacement is made at P2 and P3, which suggests that the same mechanism may cause the loss of specificity in both allosteric forms. The changes in catalytic parameters observed in the fast → slow conversion of thrombin for both synthetic substrates and fibrinogen involve a decrease in kcat and an increase in Km.104,126 This would suggest that binding of Na+ orients the side chain of D189 for optimal coordination of the guanidinium group of Arg at P1, perhaps using water 447 that bridges the bound Na+ and the Oδ2 atom of D189.118 In this case, however, the loss of specificity with the Arg → Lys substitution at P1 would be more pronounced in the fast form. The similarity of effects seen for the two forms argues against a direct influence of the allosteric switch on the position of the side chain of D189. This conclusion is consistent with the observation that water 447 is also present in trypsin,118,119 which does not bind Na+,94 where it bridges the Oδ2 atom of D189 to the carbonyl O atom of K224. The origin of the increased specificity of the fast form must therefore reside at other specificity sites.

Due to the strong interactions among the P1-P3 sites, the cost of replacing Arg with Lys at P1 depends on the residue at P2 and P3 (Table 7) and reveals the importance of cooperativity in substrate recognition. With Gly at P2, the cost of the Arg → Lys replacement at P1 increases by 1.3 kcal/mol in the fast form and 2.1 kcal/mol in the slow form, introducing a significant difference of -0.7 kcal/mol between the two forms. This difference measures the coupling between the replacement at P1 and the slow → fast transition. A negative value indicates that the replacement promotes the slow → fast conversion in the transition state or that the replaced residue binds preferentially to the slow form. A positive value signals a stabilization of the slow form or that the replaced residue binds preferentially to the fast form. The presence of a small but significant coupling when Gly is present at P2 suggests that the environment around D189 in the transition state may be different in the slow and fast forms. When P3 is substituted, the energetic penalty for the P1 substitution increases by nearly 1 kcal/mol in both thrombin forms. The extent of interaction of P2 and P3 with P1 is significant. When Gly is present at P2,

Table 6. Free Energy Values (kcal/mol) Due to Perturbation of the P1-P3 Sites of FPR (a)

                    ∆G1    ∆G2    ∆G3   ∆G12   ∆G13   ∆G23  ∆G123

Thrombin Fast Form
wt                  1.4    2.3   -0.1    1.3    0.8    1.1    2.2
R221aA              1.7    2.8    0.5    0.8    0.5    0.5    1.2
K224A               1.0    2.3    0.4    1.1    0.7    0.6    1.8
R221aA/K224A        1.2    2.6    0.4    0.8    0.5    0.7    1.5

Thrombin Slow Form
wt                  1.3    0.7   -0.5    2.2    1.2    1.4    3.3
R221aA              2.2    2.2    0.3    0.6    0.6    0.7    1.0
K224A               1.6    2.2    0.3    0.5    0.7    0.8    0.8
R221aA/K224A        2.1    2.9    0.9   -0.6    0.1   -0.1   -0.8

trypsin             1.3    0.8    0.2    0.0   -0.0    0.6    0.6
plasmin            -0.3    1.7    0.1   -0.3   -0.2    0.0   -0.2

(a) Values were obtained from the specificity constants in Table 5 using eqs 25-31 in the text. Errors are ±0.1 kcal/mol or less.

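$$\Delta G_{1} = -RT \ln\frac{s_{\mathrm{FPK}}}{s_{\mathrm{FPR}}} \tag{25}$$

$$\Delta G_{2} = -RT \ln\frac{s_{\mathrm{FGR}}}{s_{\mathrm{FPR}}} \tag{26}$$

$$\Delta G_{3} = -RT \ln\frac{s_{\mathrm{VPR}}}{s_{\mathrm{FPR}}} \tag{27}$$

$$\Delta G_{12} = -RT \ln\frac{s_{\mathrm{FGK}}\, s_{\mathrm{FPR}}}{s_{\mathrm{FPK}}\, s_{\mathrm{FGR}}} \tag{28}$$

$$\Delta G_{13} = -RT \ln\frac{s_{\mathrm{VPK}}\, s_{\mathrm{FPR}}}{s_{\mathrm{FPK}}\, s_{\mathrm{VPR}}} \tag{29}$$

$$\Delta G_{23} = -RT \ln\frac{s_{\mathrm{VGR}}\, s_{\mathrm{FPR}}}{s_{\mathrm{FGR}}\, s_{\mathrm{VPR}}} \tag{30}$$

$$\Delta G_{123} = -RT \ln\frac{s_{\mathrm{VGK}}\, s_{\mathrm{FPR}}^{2}}{s_{\mathrm{FPK}}\, s_{\mathrm{FGR}}\, s_{\mathrm{VPR}}} \tag{31}$$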
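As a hedged illustration of how eqs 25-31 operate on a set of eight specificity constants, the sketch below codes them directly. Because Table 5 is not reproduced in this section, the input is taken from Table 10 (fast form in the presence of thrombomodulin) purely to have real numbers to work with; the temperature of 25 °C is the one quoted in the footnote of Table 10 and is assumed here. The function and variable names are hypothetical.

```python
import math

R_KCAL = 1.987e-3        # gas constant in kcal/(mol K)
T_KELVIN = 298.15        # 25 degrees C (assumed; quoted in the footnote of Table 10)
RT = R_KCAL * T_KELVIN   # about 0.59 kcal/mol, the "RT (0.6 kcal/mol)" of the text


def site_specific_terms(s):
    """Apply eqs 25-31 to a dict of specificity constants keyed by substrate name."""
    def dg(ratio):
        return -RT * math.log(ratio)

    return {
        "dG1": dg(s["FPK"] / s["FPR"]),                                   # eq 25
        "dG2": dg(s["FGR"] / s["FPR"]),                                   # eq 26
        "dG3": dg(s["VPR"] / s["FPR"]),                                   # eq 27
        "dG12": dg(s["FGK"] * s["FPR"] / (s["FPK"] * s["FGR"])),          # eq 28
        "dG13": dg(s["VPK"] * s["FPR"] / (s["FPK"] * s["VPR"])),          # eq 29
        "dG23": dg(s["VGR"] * s["FPR"] / (s["FGR"] * s["VPR"])),          # eq 30
        "dG123": dg(s["VGK"] * s["FPR"] ** 2
                    / (s["FPK"] * s["FGR"] * s["VPR"])),                  # eq 31
    }


# Illustrative input: Table 10, fast form with thrombomodulin (uM^-1 s^-1).
s_tm_fast = {"FPR": 94, "FPK": 10, "FGR": 2.1, "VPR": 96,
             "FGK": 0.035, "VPK": 3.7, "VGR": 0.44, "VGK": 0.010}

terms = site_specific_terms(s_tm_fast)
for label, value in terms.items():
    print(f"{label}: {value:+.2f} kcal/mol")
```

A useful consistency check is that the terms add back up to the free energy level of each substrate relative to FPR, for example ∆G1 + ∆G2 + ∆G3 + ∆G123 = -RT ln(sVGK/sFPR).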


the interaction with P1 actually exceeds the cost of the replacement at P1 itself in the slow form.

The free energy change due to replacing Pro with Gly at P2 in all possible combinations of the state of P1 and P3 is summarized in Table 7. As for the substitution at P1, the values are significantly positive. In this case, the effects tend to be more pronounced in the fast form, underscoring an obvious change in the environment of the S2 site in the slow → fast transition. The significant difference is conducive to stabilization of the slow form in the transition state when Pro is replaced by Gly. The apolar site S2 of thrombin is formed by residues in the W60d loop, which has no counterpart in other serine proteases. Residues in the apolar site must be oriented differently in the slow and fast forms, causing a better discrimination of the residue at P2 in the fast form. W60d may play a key role in this respect because replacement of the bulky side chain with Ser in W60dS abolishes the differences between the slow and fast forms in recognizing substrates with Pro or Gly at P2.111 The indole ring of W60d likely produces steric hindrance in the slow form, but not in the fast form. The perturbation at P2 depends strongly on the residue present at P1 and P3. The cost of the Pro → Gly replacement increases by 1.2 kcal/mol in the fast form and 2.2 kcal/mol in the slow form as a result of the substitution at P1. This effect is exactly (taking into account roundoff error) the same as that seen for the perturbation at P1 when P2 is perturbed, as a consequence of the reciprocity of the linkage between the perturbations at P1 and P2.

The free energy change due to replacing Phe with Val at P3 in all possible combinations of the state of P1 and P2 is summarized in Table 7. The unexpected finding is that Val at P3 does not cause a loss of specificity. Rather, it increases specificity slightly in the slow form. The hydrophobic group at P3 may interact favorably with the hydrophobic moiety of L99 (Figure 8), which is close to the apolar site S2. Interestingly, residue Y3 of hirudin contacts W215 of thrombin in a manner similar to Phe at P9 of the fibrinogen Aα chain,127 but replacement of Y3 with more hydrophobic residues significantly enhances the binding affinity,128,129 consistent with the enhanced specificity of VPR compared to FPR. The energetic effect linked to replacement of the residue at P3 is of the same magnitude in both forms and excludes a direct involvement of the S3 site in the slow ⇌ fast equilibrium. The perturbation at P3 depends strongly on the state of P1 and P2. The cost of the Phe → Val replacement increases by 0.9 kcal/mol in the fast form and 1.2 kcal/mol in the slow form as a result of the substitution at P1 and is the reciprocal of the effect seen for the perturbation at P1 when P3 is perturbed.

The data in Tables 6 and 7 reveal the presence of

coupling among perturbations at P1, P2, and P3. The coupling is the result of constraints imposed by the enzyme on the bound substrate in the transition state and is therefore revealing of the molecular environment underlying the recognition process. The coupling free energies for the three possible pairs of P sites in the two possible states of the third site are listed in Table 8. The values are constructed from the specificity constants pertaining to the four species in the double-mutant cycle in eq 24, where the mutations are replaced by substitutions at the P sites. For example, the coupling between P1 and P2 is 0∆G12 = -RT ln[sFGK sFPR/(sFPK sFGR)] in the absence of perturbation at P3 and 1∆G12 = -RT ln[sVGK sVPR/(sVPK sVGR)] when P3 is perturbed. The value of 0∆G12 is the same as ∆G12 in Table 6. The coupling free energies in the case of wild-type thrombin are mostly positive and quite significant, demonstrating that perturbations at the P1, P2, and P3 sites are negatively coupled in enhancing specificity and that the residues at P1-P3 are negatively coupled in the binding to the S1-S3 sites. When a site is perturbed, perturbation at a second site reduces specificity beyond simple additivity. Furthermore, the coupling between any two sites is enhanced by more than 1 kcal/mol when the third site is perturbed, underlying an even stronger cooperative effect in reducing specificity that progresses with the extent of perturbation in the substrate. There are six possible coupling free energy values for the three pairs, but only four are independent. Hence, the difference between any two values for each pair is exactly the same for all pairs. From the property of the coupling free energy (section III.C), we conclude that the sites are coupled indirectly through interactions higher than second order.

Table 7. Free Energy Change (kcal/mol) in Specificity Due to Perturbation of the P1-P3 Sites of FPR (a)

                 fast form                          slow form                          coupling
                               R221aA/                            R221aA/                            R221aA/
            wt  R221aA   K224A   K224A         wt  R221aA   K224A   K224A         wt  R221aA   K224A   K224A

Replacement at P1 (Arg → Lys)
FPX        1.4     1.7     1.0     1.2        1.3     2.2     1.6     2.1        0.2    -0.5    -0.5    -0.8
FGX        2.7     2.5     2.1     2.0        3.4     2.8     2.0     1.5       -0.7    -0.3     0.1     0.5
VPX        2.3     2.1     1.7     1.7        2.4     2.7     2.2     2.2       -0.1    -0.6    -0.6    -0.5
VGX        2.5     2.4     2.2     2.0        3.2     2.5     1.6     1.4       -0.6    -0.1     0.5     0.6

Replacement at P2 (Pro → Gly)
FXR        2.3     2.8     2.3     2.6        0.7     2.2     2.2     2.9        1.5     0.6     0.1    -0.3
FXK        3.5     3.6     3.3     3.4        2.9     2.8     2.6     2.3        0.6     0.8     0.7     1.1
VXR        3.4     3.3     2.9     3.3        2.2     2.8     2.9     2.8        1.2     0.5     0.0     0.5
VXK        3.6     3.5     3.4     3.6        2.9     2.6     2.3     2.0        0.7     1.0     1.1     1.6

Replacement at P3 (Phe → Val)
XPR       -0.1     0.5     0.4     0.4       -0.5     0.3     0.3     0.9        0.4     0.2     0.1    -0.5
XPK        0.8     0.9     1.0     0.9        0.7     0.8     1.0     1.0        0.1     0.1     0.0    -0.1
XGR        1.0     1.0     1.0     1.1        1.0     0.9     1.1     0.8        0.1     0.1    -0.1     0.3
XGK        0.9     0.9     1.1     1.1        0.7     0.6     0.7     0.7        0.2     0.3     0.4     0.4

(a) wt = wild type. Errors are ±0.1 kcal/mol or less. Values were obtained from the data in Table 6. The difference between the values for the fast and slow forms gives the coupling between the substitution and the slow → fast transition. Positive values are indicative of stabilization of the slow form in the transition state, whereas negative values signal stabilization of the fast form. Values of the coupling in excess of ±RT (0.6 kcal/mol) are in bold type.
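The footnote of Table 7 states that its entries were obtained from Table 6. A minimal sketch of that step for the Arg → Lys replacement at P1 in wild-type thrombin is given below, under the assumption that each cost is the difference between the free energy levels (relative to FPR) of the substrate with Lys and the substrate with Arg at P1, with levels built from eqs 25-31. The dictionary and function names are hypothetical, and because the Table 6 entries are themselves rounded, the recomputed values agree with Table 7 only to within about 0.1 kcal/mol for the costs and about 0.2 kcal/mol for the fast/slow coupling.

```python
# Table 6, wild-type thrombin (kcal/mol); only the terms needed for the P1 replacement.
TABLE6_WT = {
    "fast": {"dG1": 1.4, "dG12": 1.3, "dG13": 0.8, "dG23": 1.1, "dG123": 2.2},
    "slow": {"dG1": 1.3, "dG12": 2.2, "dG13": 1.2, "dG23": 1.4, "dG123": 3.3},
}


def p1_costs(t):
    """Cost of Arg -> Lys at P1 in the four P3/P2 contexts (X marks the P1 position)."""
    return {
        "FPX": t["dG1"],                           # level(FPK) - level(FPR)
        "FGX": t["dG1"] + t["dG12"],               # level(FGK) - level(FGR)
        "VPX": t["dG1"] + t["dG13"],               # level(VPK) - level(VPR)
        "VGX": t["dG1"] + t["dG123"] - t["dG23"],  # level(VGK) - level(VGR)
    }


fast = p1_costs(TABLE6_WT["fast"])
slow = p1_costs(TABLE6_WT["slow"])

# Coupling between the replacement and the slow -> fast transition (fast minus slow).
coupling = {context: fast[context] - slow[context] for context in fast}

for context in fast:
    print(f"{context}: fast {fast[context]:.1f}  slow {slow[context]:.1f}  "
          f"coupling {coupling[context]:+.1f}")
```

The negative coupling obtained with Gly at P2 (row FGX) is the case discussed in the text, where the replacement at P1 favors the fast form in the transition state.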

E. Origin of the Higher Specificity of the Fast Form

The Arg → Lys replacement at P1 slightly promotes the slow → fast transition when Gly is present at P2. On the other hand, the Pro → Gly replacement at P2 strongly stabilizes the slow form. The replacement at P3 is inconsequential for the allosteric equilibrium. Hence, the slow → fast transition affects mostly the environment of the S2 site, with modest effects on the S1 site and no effect on the S3 site. Constraints at the S2 site account for the lower specificity of the slow form compared to the fast form and become inconsequential if the substrate acquires flexibility with a Gly at P2 and can readjust in the active site to compensate for the increased steric hindrance of the S2 site in the slow form. These findings explain why the thrombin mutant W60dS cleaves FPR with the same specificity in the slow and fast forms111 and suggest the bulky side chain of W60d as the likely origin of the constraints at S2.

The dominant factors that control specificity are the rigidity of the P2-P3 bond and the strength of the P1-S1 interaction. When the P2-P3 bond is rigid, the substrate finds a more favorable S2 environment in the fast form. Flexibility of the P2-P3 bond relaxes the optimal interaction of Arg at P1 with D189 at S1, this effect being favored by a more accessible active site in the fast form.104,110 The coupling between substitutions at P1 and P2 comes partially from an intrinsic effect on the substrate, the loss of rigidity of the P2-P3 bond, and partially from the different environment of the enzyme in the slow and fast forms. The less constrained environment of the specificity sites in the fast form also acts to reduce the extent of negative coupling among the various perturbations in the substrate, causing the interactions to essentially disappear as more substitutions are introduced at the P sites.

The two ion pairs R221a-E146 and K224-E217

stabilizing the Na+ binding environment (Figure 7) provide other constraints in the slow form. The R221aA mutant has a reduced Na+ affinity,106 suggesting that disruption of the R221a-E146 ion pair may destabilize the fast form. However, disruption of the R221a-E146 ion pair affects specificity more in the slow than the fast form. The parameters pertaining to the fast form are practically unchanged relative to wild-type, while those in the slow form show enhanced sensitivity to perturbation at P1 and P2. This perturbation is also less dependent on the state of other groups, indicating a reduction in the coupling among substitutions at the P1-P3 sites (Tables 6 and 7).

Disruption of the R221a-E146 ion pair has a direct influence on the specificity sites S1 and S2 of the enzyme in the slow form and affects the way these sites discriminate between Arg and Lys at P1 or Pro and Gly at P2. This ion pair maintains the correct architecture of the S1 site, especially in the slow form, but also influences the S2 site located some 17 Å away. This effect may be due to enhanced mobility of the autolysis loop on the Glu side of the ion pair upon disruption of the contact. The enhanced mobility may interfere with substrate recognition in the slow form. The R221a-E146 ion pair contributes to the integrity of the S1 environment in the slow form, but not in the fast form because the perturbation is practically abolished by Na+ binding.

As for the R221aA mutant, mutation of K224 to Ala reduces the Na+ affinity,106 suggesting that disruption of the K224-E217 ion pair may destabilize the fast form, but again, this proposal is contradicted by the experimental data that document a larger perturbation of the slow form (Tables 6 and 7). Disruption of the K224-E217 ion pair produces effects very similar to those seen for the R221aA mutant, with a reduction of the coupling among the P1-P3 sites especially in the slow form. The ion pair between K224 and E217 bridges two residues on the last two β-strands of the B chain and contributes to the integrity of the S1 and S2 environments in the slow form. The region in immediate proximity to K224 and E217 plays a key role in substrate selectivity and is absolutely conserved in thrombin from different species.130 The state of this ion pair can therefore control the access of substrates into the bottom of the catalytic pocket where the specificity site S1 is located.

The two ion pairs interact slightly in the slow form, but not in the fast form, as demonstrated by the results on the double mutant R221aA/K224A (Tables 6 and 7). The perturbation induced by the double mutation is more drastic and almost abolishes Na+ binding.106 The mutation affects the response to

Table 8. Coupling Free Energies (kcal/mol) for Perturbation of the P1-P3 Sites of FPR (a)

                      fast form                          slow form
                                    R221aA/                            R221aA/
                 wt  R221aA   K224A   K224A         wt  R221aA   K224A   K224A

0∆G12           1.3     0.8     1.1     0.8        2.2     0.6     0.5    -0.6
1∆G12           0.2     0.3     0.6     0.3        0.7    -0.3    -0.6    -0.9
0∆G13           0.8     0.5     0.6     0.5        1.2     0.6     0.7     0.1
1∆G13          -0.2    -0.1     0.1    -0.0       -0.2    -0.3    -0.4    -0.1
0∆G23           1.1     0.5     0.6     0.7        1.4     0.7     0.7    -0.1
1∆G23           0.1    -0.0     0.1     0.2        0.0    -0.2    -0.3    -0.3

(a) Listed are the two possible configurations of the third P site (0 = wild-type (wt), 1 = mutant). Errors are ±0.1 kcal/mol or less.
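The entries of Table 8 can be related to those of Table 6 with a little algebra. Writing each specificity constant in the definition of 1∆G12 as a free energy level relative to FPR (eqs 25-31) gives 1∆G12 = ∆G123 - ∆G13 - ∆G23, and similarly for the other pairs, so that 0∆G - 1∆G equals ∆G12 + ∆G13 + ∆G23 - ∆G123 for every pair. The sketch below, which is an illustration under that derivation rather than the author's code, applies it to the wild-type rows of Table 6; names are hypothetical and agreement with Table 8 is expected only to within the roundoff of the tabulated values (about 0.1 kcal/mol).

```python
# Table 6, wild-type thrombin: coupling terms only (kcal/mol).
TABLE6_WT = {
    "fast": {"dG12": 1.3, "dG13": 0.8, "dG23": 1.1, "dG123": 2.2},
    "slow": {"dG12": 2.2, "dG13": 1.2, "dG23": 1.4, "dG123": 3.3},
}

# For each pair, the two other pairwise terms that enter the perturbed-state coupling.
OTHER_PAIRS = {"12": ("13", "23"), "13": ("12", "23"), "23": ("12", "13")}


def conditional_couplings(t):
    """Pairwise couplings with the third P site unperturbed (0) or perturbed (1)."""
    out = {}
    for pair, (a, b) in OTHER_PAIRS.items():
        out["0dG" + pair] = t["dG" + pair]
        out["1dG" + pair] = t["dG123"] - t["dG" + a] - t["dG" + b]
    return out


def third_site_shift(t):
    """Common value of 0dG - 1dG, identical for all three pairs."""
    return t["dG12"] + t["dG13"] + t["dG23"] - t["dG123"]


for form, t in TABLE6_WT.items():
    c = conditional_couplings(t)
    print(form, {k: round(v, 1) for k, v in c.items()},
          "shift:", round(third_site_shift(t), 1))
```

The single shared shift is what reduces the six tabulated values to four independent ones.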


perturbations at the P1-P3 sites, with an effect more pronounced in the slow form. The site-specific parameters are profoundly altered in the slow form and, interestingly, the pairwise coupling pattern shows the disappearance of indirect coupling in both the slow and fast forms, with the onset of positive second-order direct coupling between P1 and P2 (Table 8). This effect is peculiar to the double substitution, though it is somewhat anticipated by the single substitutions. The molecular basis for the synergism between the R221a-E146 and K224-E217 ion pairs in the slow form is in the participation of residues R221a and K224 in Na+ and water coordination. In the fast form, the carbonyl O atoms of R221a and K224 directly ligate the Na+. Mutation of these residues reduces the Na+ affinity, but high concentrations of Na+ oppose the structural perturbation induced by the mutation, restoring a molecular environment for the specificity sites that is essentially that of the fast form of wild-type. When Na+ is released, the carbonyl O atom of K224 may reorient as seen in the structure of trypsin and may hydrogen bond to water 447 in concert with the carbonyl O atom of R221a. Water 447 hydrogen bonds to the side chain of D189 in the specificity pocket S1, and through the switching mechanism any perturbation of R221a and K224 changing the orientation of the carbonyl O atoms will not be compensated as in the case of the fast form and therefore may lead to more drastic structural changes.118

We conclude that the more constrained environment in the slow form of thrombin is partially due to stronger ion pairs formed by R221a and K224 in the Na+ binding loop with E146 in the autolysis loop and E217 in the penultimate β-strand of the B chain. The integrity of these ion pairs is essential for maintaining the correct architecture of the specificity sites through the effect on the water molecules in the channel that embeds the specificity site S1. The role of the ion pairs in the fast form appears to be less critical and their disruption can be compensated by the binding of Na+. The origin of the reduced Na+ affinity in these mutants should be seen in a perturbation of the slow form leading to an impaired ability to switch to the fast form.131 The foregoing analysis is invaluable to structure-function studies and to practical issues revolving around the design of better active-site inhibitors. Improvement in the potency of these molecules can be obtained by reducing the negative coupling among the P1-P3 sites. This effect is obtained by keeping a rigid backbone around the P2-P3 position that facilitates the coordination with D189 at S1 and by breaking the ion pairs R221a-E146 and K224-E217.

F. Molecular Origin of the Cooperativity among the P1-P3 Sites

The coupling pattern that emerged from the analysis of the substrate library (Table 8) is conducive to negatively cooperative interactions higher than second order. To elucidate the origin of this coupling, derived from the property of the coupling free energy, the entire five-dimensional manifold of species should be considered. This manifold is composed of the sites P1, P2, P3, R221a, and K224, and the relevant free energies are calculated by operating on the values listed in Table 6. Analysis of the coupling pattern involving all possible pairs (Table 9) shows how interactions change with the state of other sites. Considering only differences of at least ±RT (0.6 kcal/mol) in the coupling free energy, the patterns can be analyzed to identify the nature of the interaction.

In the fast form, only the P1-P3 sites are significantly coupled and in an indirect way. Perturbation of any P site influences the coupling at other sites.

Table 9. Coupling Free Energies (kcal/mol) for Perturbation of the P1-P3 Sites of FPR and Residues R221a and K224 of Thrombin (a)

               000   100   010   001   110   101   011   111   coupling   mediated by

Fast Form
P1-P2          1.3   0.2   0.8   1.1   0.3   0.5   0.8   0.3   indirect   P3
P1-P3          0.8  -0.2   0.5   0.6  -0.1   0.1   0.5  -0.0   indirect   P2
P1-R221a       0.2  -0.2  -0.1   0.2  -0.1  -0.1   0.0  -0.2   none
P1-K224       -0.4  -0.6  -0.6  -0.4  -0.4  -0.5  -0.4  -0.4   none
P2-P3          1.1   0.1   0.5   0.6  -0.0   0.1   0.7   0.2   indirect   P1
P2-R221a       0.5   0.1  -0.1   0.3  -0.1   0.0   0.4   0.1   none
P2-K224        0.0  -0.2  -0.4  -0.2  -0.2  -0.2   0.0   0.0   none
P3-R221a       0.2   0.1  -0.1   0.0   0.0  -0.1   0.1   0.0   none
P3-K224        0.4   0.2  -0.0  -0.1   0.2  -0.0   0.1   0.2   none
R221a-K224     0.2   0.2   0.0  -0.2   0.1  -0.0   0.2   0.2   none

Slow Form
P1-P2          2.2   0.7   0.6   0.5  -0.3  -0.6  -0.6  -0.9   indirect   P3, R221a, K224
P1-P3          1.2  -0.2   0.6   0.7  -0.3  -0.4   0.1  -0.1   indirect   P2
P1-R221a       0.9  -0.6   0.3   0.5  -0.7  -0.6  -0.0  -0.3   indirect   P2
P1-K224        0.3  -1.4  -0.2  -0.1  -1.6  -1.3  -0.5  -1.1   indirect   P2
P2-P3          1.4   0.0   0.7   0.7  -0.2  -0.3  -0.1  -0.3   indirect   P1, R221a, K224
P2-R221a       1.4  -0.1   0.6   0.7  -0.4  -0.4  -0.1  -0.3   indirect   P1, P3, K224
P2-K224        1.4  -0.3   0.7   0.7  -0.6  -0.5   0.0  -0.6   indirect   P1, P3, R221a
P3-R221a       0.7   0.1  -0.0   0.6  -0.1   0.0  -0.2   0.1   indirect   P2
P3-K224        0.8   0.3   0.1   0.6  -0.0   0.2  -0.1   0.1   indirect   P2
R221a-K224    -0.2  -0.6  -0.9  -0.4  -0.8  -0.7  -1.1  -0.6   indirect   P2

(a) Listed are all possible configurations of the other sites (0 = wild-type, 1 = mutant) in the order P1, P2, P3, R221a, and K224. Errors are ±0.1 kcal/mol or less. Indirect coupling requires values that differ by at least ±RT (0.6 kcal/mol). Direct coupling of less than ±RT on the average is considered zero.


In the slow form all sites are strongly coupled. Each coupling can be dissected to identify the element perturbing the interaction. A direct way to illustrate the effect of a third site on the coupling between two sites is to calculate the difference in coupling free energy of a pair due to the 0 → 1 transition of a third site, in all possible configurations of the remaining sites. The P2 site emerges as a major node of interaction. In the slow form, the state of P2 influences all interactions (Table 9). The state of R221a and K224 influences the P1-P2 and P2-P3 interactions but has no effect on the P1-P3 coupling that is influenced by P2. Finally, the coupling between R221a and K224 is influenced by P2 only. As a result, the Ala replacements at these thrombin residues produce additive effects on specificity when Pro is at P2 but are positively linked when Pro is replaced by Gly.

It is of interest to note that the molecular determinants of cooperativity among the specificity sites in thrombin, like the region around W60d in the S2 site and the R221a-E146 and K224-E217 ion pairs, are not present in trypsin and plasmin. These proteases, unlike thrombin, show simple additivity of the effects of perturbing individual sites in the substrate (Table 6). Disruption of the R221a-E146 and K224-E217 ion pairs in thrombin produces a trypsin-like energetic profile. The coupling free energies reflect the strain imposed by the enzyme on the substrate in the transition state. More constrained environments, like thrombin in the slow form, tend to couple the substitutions made at different P sites more strongly. In more relaxed environments, like thrombin in the fast form or trypsin, the coupling is greatly reduced or absent. The energetic signatures of substrate recognition in these proteases correlate well with the known structural features of the enzymes. Trypsin has a more accessible environment in the specificity sites than thrombin.119 The information is also valuable when the structure is not known, as in the case of plasmin. The results in Table 6 suggest that the environment of the specificity sites of plasmin is more similar to that of trypsin than thrombin.

The approach based on site-specific thermodynamics is capable of effectively probing the environment of the specificity sites of the enzyme in the transition state. Extension to other proteases, or to other mutant forms of thrombin, may further elucidate the structural determinants of enzyme specificity and the role of cooperativity in substrate recognition. The approach can also be extended to the analysis of ligand binding coupled to mutational effects. The substrate library provides an exceptionally sensitive probe of the molecular environment of the specificity sites of the enzyme and can be used to assess the effect on these sites caused by the binding of allosteric ligands. In the case of thrombin, the library can unravel the effect of thrombomodulin on the specificity sites of the enzyme and help understand the mechanism of action of this important cofactor.
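The third-site analysis described in this section (the change in a pair's coupling free energy when a third site goes 0 → 1, in every configuration of the remaining sites) can be sketched as follows for the P1-P2 row of Table 9. Two points are assumptions of this sketch rather than statements from the text: the three digits of each column label are read as the states of P3, R221a, and K224 in that order, and a site is flagged as mediating the coupling when the mean shift it produces reaches the ±RT (0.6 kcal/mol) threshold. All names are hypothetical.

```python
from statistics import mean

CONFIGS = ("000", "100", "010", "001", "110", "101", "011", "111")
OTHERS = ("P3", "R221a", "K224")  # sites outside the P1-P2 pair, assumed column order

# P1-P2 coupling free energies from Table 9 (kcal/mol), one value per configuration.
P1_P2 = {
    "fast": (1.3, 0.2, 0.8, 1.1, 0.3, 0.5, 0.8, 0.3),
    "slow": (2.2, 0.7, 0.6, 0.5, -0.3, -0.6, -0.6, -0.9),
}

RT = 0.6  # kcal/mol, threshold quoted in the text


def third_site_effects(values):
    """Shift in the pair coupling when each remaining site goes 0 -> 1."""
    by_config = dict(zip(CONFIGS, values))
    effects = {}
    for i, site in enumerate(OTHERS):
        shifts = []
        for cfg, g in by_config.items():
            if cfg[i] == "0":
                flipped = cfg[:i] + "1" + cfg[i + 1:]
                shifts.append(by_config[flipped] - g)
        effects[site] = shifts
    return effects


def mediators(values, threshold=RT):
    """Sites whose mean shift is at least the threshold in magnitude (assumed rule)."""
    effects = third_site_effects(values)
    return [site for site, shifts in effects.items() if abs(mean(shifts)) >= threshold]


for form, values in P1_P2.items():
    print(form, "P1-P2 coupling mediated by:", mediators(values))
```

With these assumptions the sketch flags P3 alone in the fast form and P3, R221a, and K224 in the slow form, matching the last column of Table 9 for this pair.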

G. How Thrombomodulin Really Works

Thrombomodulin is a cofactor present on the surface of endothelial cells that markedly (∼1000-fold) increases the ability of thrombin to activate protein C while it inhibits fibrinogen binding in a competitive manner.132 It has been proposed that such an effect is borne out by a thrombomodulin-induced change in thrombin conformation,132,133 but convincing experimental support for this hypothesis has been lacking. The substrate library (Table 4) was therefore used to dissect the effect of thrombomodulin on the specificity sites of thrombin. The site-specific approach reveals important new information on the molecular mechanism of thrombomodulin function.

When thrombomodulin binds to the fast form, there is at most a 2-fold enhancement of specificity for all substrates in the library that differ up to 5 orders of magnitude in specificity (Table 10). Binding of thrombomodulin to the slow form produces a consistently higher increase in specificity by as much as 15-fold (Table 10). As a result, thrombomodulin binding tends to abolish the differences between the slow and fast forms, consistent with the observation that the cofactor binds to the fast form with higher affinity.105,112 This effect is also seen with the natural substrate protein C, which is cleaved by the slow form with significantly higher specificity in the absence but not in the presence of thrombomodulin.105,106 Interestingly, the slow form becomes more specific in the case of substrates such as FGR and VGR when thrombomodulin binds, suggesting that the cofactor may elicit other effects in addition to the slow → fast transition of thrombin. All of these effects, however, are not peculiar to thrombomodulin because the hirudin C-terminal fragment 55-65 (hir55-65) reproduces them almost identically, whereas it has no effect on protein C activation. Thrombomodulin and hir55-65 share common epitopes on exosite I of thrombin.134,135 These epitopes may provide the structural basis for the allosteric effects observed on the hydrolysis of chromogenic substrates.

These results have a bearing on the mechanism that leads to the drastic (∼1000-fold) enhancement of thrombin specificity toward protein C upon thrombomodulin binding, which is seen in both the slow

Table 10. Specificity Constants kcat/Km (µM-1 s-1) for the Hydrolysis of Synthetic Substrates by Thrombin in the Presence of Thrombomodulin or hir55-65 (a)

                      FPR     FPK     FGR     VPR     FGK     VPK     VGR     VGK

Fast Form
thrombomodulin         94      10     2.1      96   0.035     3.7    0.44   0.010
hir55-65              117     6.3     1.7      98   0.023     2.0    0.32  0.0064
rTM (b)               1.0     1.3     1.0     1.0     1.7     1.8     1.3     2.1
rhir (c)              1.3     0.8     0.8     1.0     1.1     1.0     0.9     1.4

Slow Form
thrombomodulin         20     3.4     4.4      27   0.020     1.6    0.73  0.0061
hir55-65               21     1.6     3.1      24  0.0084    0.57    0.66  0.0027
rTM (b)               6.7     9.7     5.1     4.0     7.7      15     4.3     7.7
rhir (c)              7.0     4.6     3.6     3.6     3.2     5.2     3.9     3.4

(a) Experimental conditions: 5 mM Tris, I = 200 mM, 0.1% PEG, pH 8.0, at 25 °C, 100 nM thrombomodulin or 100 µM hir55-65. The slow form was studied in the presence of 200 mM choline chloride. The properties of the fast form refer to the limit [Na+] → ∞, at constant I = 200 mM. Errors are typically ±2%. (b) Ratio of specificity relative to the absence of thrombomodulin (see Table 5). (c) Ratio of specificity relative to the absence of hir55-65 (see Table 5).
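Two statements in the text about Table 10 can be checked with a short script: that thrombomodulin raises specificity more in the slow form than in the fast form for every substrate, and that with thrombomodulin bound the slow form becomes the more specific one for FGR and VGR. The sketch below simply transcribes the thrombomodulin and rTM rows; the variable names are hypothetical and nothing beyond the tabulated values is assumed.

```python
SUBSTRATES = ("FPR", "FPK", "FGR", "VPR", "FGK", "VPK", "VGR", "VGK")

# Specificity constants with thrombomodulin bound (uM^-1 s^-1), Table 10.
TM_FAST = (94, 10, 2.1, 96, 0.035, 3.7, 0.44, 0.010)
TM_SLOW = (20, 3.4, 4.4, 27, 0.020, 1.6, 0.73, 0.0061)

# Fold change in specificity caused by thrombomodulin (rTM rows), Table 10.
R_TM_FAST = (1.0, 1.3, 1.0, 1.0, 1.7, 1.8, 1.3, 2.1)
R_TM_SLOW = (6.7, 9.7, 5.1, 4.0, 7.7, 15, 4.3, 7.7)

# Fast/slow specificity ratio with the cofactor bound; values below 1 mean the
# slow form is the more specific one for that substrate.
fast_over_slow = {name: f / s for name, f, s in zip(SUBSTRATES, TM_FAST, TM_SLOW)}
slow_preferred = [name for name, ratio in fast_over_slow.items() if ratio < 1]

# Largest enhancement in each form, and whether the slow form always gains more.
max_gain = {"fast": max(R_TM_FAST), "slow": max(R_TM_SLOW)}
slow_always_gains_more = all(s > f for f, s in zip(R_TM_FAST, R_TM_SLOW))

print("fast/slow ratio with thrombomodulin:",
      {k: round(v, 2) for k, v in fast_over_slow.items()})
print("slow form more specific for:", slow_preferred)
print("largest enhancement:", max_gain, "| slow > fast for all:", slow_always_gains_more)
```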


and fast forms.105 The effect of thrombomodulin on the specificity sites S1, S2, and S3 of the enzyme produces a change in specificity that is either small (fast form) or at most 15-fold (slow form). Hence, thrombin would have to enhance its specificity toward protein C using sites other than those probed by the library of chromogenic substrates, but this has little experimental support.95 A more reasonable hypothesis is that thrombomodulin exerts its physiologically important function by influencing the conformation of the bound protein C in the thrombin-thrombomodulin-protein C ternary complex, thereby enhancing the specificity of the enzyme by turning protein C into a better substrate. It is unlikely that the structural domains responsible for the enhancement in specificity are entirely located in regions of the enzyme other than the critical sites within the catalytic pocket that can be probed with the substrate library. It is also unlikely that thrombomodulin would induce a large conformational transition in thrombin not linked to a large change in heat capacity.112 Thrombomodulin makes extensive contacts with thrombin through its EGF domains 5 and 6, whereas its EGF domain 4 may contact protein C.136 The thrombin-thrombomodulin complex would also have the W60d loop and especially the Na+ binding loop available for contacting protein C to form the ternary complex. It is conceivable that the bound protein C would make contacts with the bound thrombomodulin, perhaps at the level of the external portion of the W60d loop of thrombin. If this were the case, a chromogenic substrate contacting only the interior of the catalytic pocket would not experience the large change in specificity observed for protein C because it would lack the critical direct interaction with the cofactor. This model explains the similarity of effects seen on the chromogenic substrates with thrombomodulin and hir55-65 but the lack of effect of hir55-65 on protein C hydrolysis. It also predicts that it should be possible to find mutations of thrombomodulin that do not affect binding to thrombin but reduce the ability of thrombin to cleave protein C, or mutations of protein C that affect cleavage by thrombin to a different extent in the presence and absence of thrombomodulin. A number of such mutations have been reported recently for protein C137-139 and provide much support for our proposed mechanism for thrombomodulin function. Thrombomodulin is therefore a competitive inhibitor of fibrinogen binding to thrombin and a cofactor of protein C.

V. New Formalism for the Analysis of Mutational Effects

The analogy between binding and mutational effects introduced in section III can be extended further to develop a new formalism for the analysis of mutational effects. In the case of ligand binding, the quantities accessible to experimental measurements, like the binding isotherm, are continuous. Much can be learned from the shape and properties of the binding isotherm,7-9 and its analysis yields discrete, site-specific parameters as shown in section II. In the case of mutational effects, on the other hand, the discrete site-specific parameters are determined directly without the need for analyzing continuous quantities that are functions of these parameters. Although this is certainly advantageous, a continuous representation of the energetics may come in quite handy when some general properties of the system are to be illustrated. For example, cooperativity is known to profoundly affect the shape of the binding isotherm.7-9 Therefore, the analysis of mutational effects in terms of quantities equivalent to the binding isotherm may help elucidate features of the cooperative nature of the process that are difficult to grasp from inspection of the site-specific parameters alone.

A pivotal quantity in the analysis of ligand binding cooperativity is the partition function of the system, Ψ, that lists all intermediates involved in the binding equilibria as defined by the law of mass action.7-9

The partition function is the sum of the concentrations of all intermediates relative to the concentration of the unligated species used as reference. Once the partition function is correctly defined, the important quantities of the system can be derived by differentiation.

A partition function can also be defined for mutational effects. To this end, we define a general equilibrium reaction M + jP = Mj, where M is the wild-type macromolecule, P is a generic site-directed mutation applied to it, and Mj is the macromolecule bearing j such site-specific mutations. The reaction so defined is analogous to the binding equilibrium involving M and j ligand molecules. The free energy for the equilibrium reaction is defined from the difference in chemical potential between the product Mj and the parent species M and P. The difference between the chemical potentials of Mj and M is given by the difference in stability or specificity calculated experimentally. For example, in the case of substrate binding, the ratio of the concentrations of macromolecular species, [Mj]/[M], is given by sj/s, where s refers to the specificity. Consideration of the site-specific intermediates in the system gives the partition function

$$\Psi = \sum_{\alpha=0}^{1}\sum_{\beta=0}^{1}\cdots\sum_{\omega=0}^{1}\frac{[M_{\alpha\beta\ldots\omega}]}{[M_{00\ldots0}]}\,x^{\alpha+\beta+\cdots+\omega} \tag{32}$$

M00...0 is the reference wild-type macromolecule to which mutations at sites 1, 2, ...N can be introduced. The variable 0 ≤ x ≤ ∞ is a dummy quantity analogous to the ligand concentration and can be thought of as the driving force responsible for the 0 → 1 or wild-type → mutant transition taking place at each site of the macromolecule. The discrete nature of the process describing the perturbations at each site is given a continuous description through the variable x, just like ligand binding to discrete sites of a macromolecule is given a continuous description in terms of the polynomial expansion analogous to eq 32. In general, any coefficient of the partition function can be written in terms of a coupling free energy of order α + β + ... + ω plus the sum of α + β + ... + ω site-specific perturbation free energies. For example, the partition function for the case of mutations at two sites characterized in terms of enzyme

Site-Specific Thermodynamics Chemical Reviews, 1998, Vol. 98, No. 4 1585

Page 24: Site-Specific Thermodynamics:  Understanding Cooperativity in Molecular Recognition

specificity is

where the s’s refer to specificity constants of thevarious intermediates. Alternatively, eq 33 candescribe the effect of mutating two sites on thestability of a protein (see below). Identification of xwith the Ca2+ concentration turns eq 33 into thepartition function for ligand binding to calbindin (seesection II.C).Differentiation of the logarithm of the partition

function relative to the logarithm of x gives aquantity, X, analogous to the average number ofligated sites in ligand binding processes. Moreinformative for mutational effects is however thederivative of X analogous to the binding capacity.9The quantity

defines the global susceptibility of the system, or theprobability density that a given perturbation in thesystem will cause a certain free energy change inspecificity or stability. Specifically, the product øN-1

d ln x is the probability density that a mutationproduces a free energy perturbation of specificity orstability comprised between RT ln x and RT(ln x + dln x). This information is obtained directly from aplot of ø versus ∆G ) RT ln x and is of immediatepractical relevance. In the susceptibility plot, thedummy variable x assumes physical meaning throughthe free energy change ∆G caused by a given per-turbation introduced in the system. The response ofthe system to the perturbation is proportional to thevalue of ø for a given value of ∆G.The information generated at the site-specific level

with mutational perturbations, as in the case ofthrombin dealt with in section IV, is sufficient todefine susceptibilities for each perturbed site. As forligand binding cooperativity, the quantities definedin the global description can be decomposed into theirsite-specific contributions as implied by the basic eq1. To define Xj, we make use of contracted forms ofthe partition function containing all configurationswith site j perturbed, 1Ψj, or wild-type, 0Ψj, as shownin section II for ligand binding. From the definitionof Xj it also follows that

with the conservation relationship

The site-specific susceptibility defines the probability density χj d ln x that a given perturbation or mutation in the system will cause a free energy perturbation at site j between RT ln x and RT(ln x + d ln x).

Integration of the susceptibility profile yields important information on the energetics that is difficult to obtain from the site-specific parameters. The information is particularly relevant in the site-specific case when cooperativity is present. The first moment of the global susceptibility defines the mean free energy of perturbation, ∆Gm, as given by eq 37. ∆Gm measures the average perturbation per site, defined as the free energy change in going from the wild-type to the fully perturbed configuration, divided by the number of sites. In fact, it can be shown⁹ that ∆Gm is given by eq 38.

The quantity xm is the analogue of Wyman's median ligand activity in ligand binding processes⁸,¹⁴⁰ that defines the value of x where the unligated and fully ligated configurations are equally populated.

The global susceptibility for a system composed of two identical and independent sites is illustrated in Figure 9. A steeper distribution is indicative of positive interactions among the sites, whereas a broader distribution reflects negative interactions or site heterogeneity in response to the mutational perturbations. The area under the curve is constant and gives the number of sites N.

The site-specific susceptibilities define quantities analogous to ∆Gm, given by eq 39.

Figure 9. Global susceptibility profiles for the case of two identically perturbed and independent sites (continuous line; ∆G1 = 2 kcal/mol, ∆G2 = 2 kcal/mol, and ∆G12 = 0 kcal/mol). The cases of two positively (discontinuous line; ∆G1 = 3.5 kcal/mol, ∆G2 = 3.5 kcal/mol, and ∆G12 = -3 kcal/mol) or negatively (discontinuous-dotted line; ∆G1 = 0.5 kcal/mol, ∆G2 = 0.5 kcal/mol, and ∆G12 = 3 kcal/mol) linked sites are also shown for comparison. The three cases are constructed so as to have the same value of ∆Gm = 2 kcal/mol, using the partition function in eq 33 in the text.

$$
\begin{aligned}
\Psi &= \frac{[M_{00}]}{[M_{00}]} + \left(\frac{[M_{10}]}{[M_{00}]} + \frac{[M_{01}]}{[M_{00}]}\right)x + \frac{[M_{11}]}{[M_{00}]}\,x^{2} \\
&= 1 + \left(\frac{s_{10}}{s_{00}} + \frac{s_{01}}{s_{00}}\right)x + \frac{s_{11}}{s_{00}}\,x^{2} \\
&= 1 + \left[\exp\left(-\frac{\Delta G_{1}}{RT}\right) + \exp\left(-\frac{\Delta G_{2}}{RT}\right)\right]x + \exp\left(-\frac{\Delta G_{1} + \Delta G_{2} + \Delta G_{12}}{RT}\right)x^{2}
\end{aligned}
\tag{33}
$$

$$
\chi = \frac{dX}{d\ln x} = \frac{d^{2}\ln\Psi}{d\ln^{2}x}
\tag{34}
$$

$$
\chi_{j} = \frac{dX_{j}}{d\ln x}
\tag{35}
$$

$$
\chi = \sum_{j=1}^{N}\chi_{j}
\tag{36}
$$

$$
\Delta G_{m} = RT\ln x_{m} = \frac{\int_{-\infty}^{+\infty}\chi\,\Delta G\,d\Delta G}{\int_{-\infty}^{+\infty}\chi\,d\Delta G}
\tag{37}
$$

$$
\Delta G_{m} = \frac{\Delta G_{1} + \Delta G_{2} + \cdots + \Delta G_{N} + \Delta G_{12\ldots N}}{N}
\tag{38}
$$
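The construction of Figure 9 can be checked numerically. The following is a minimal Python sketch, not the author's code: it evaluates the two-site partition function of eq 33 and the global susceptibility of eq 34 for the three parameter sets quoted in the Figure 9 caption, then integrates each profile. The value RT = 0.593 kcal/mol (about 25 °C) is an assumption, since the temperature is not stated here, and the function names, integration limits, and grid are illustrative choices.

```python
import math

RT = 0.593  # kcal/mol; assumes about 25 °C, which the text does not state

# (dG1, dG2, dG12) in kcal/mol, as quoted in the caption of Figure 9
FIGURE_9_CASES = {
    "independent": (2.0, 2.0, 0.0),
    "positively linked": (3.5, 3.5, -3.0),
    "negatively linked": (0.5, 0.5, 3.0),
}


def chi_global(dG, dG1, dG2, dG12):
    """Global susceptibility (eq 34) of the two-site model (eq 33) at dG = RT ln x."""
    x = math.exp(dG / RT)
    a = (math.exp(-dG1 / RT) + math.exp(-dG2 / RT)) * x   # singly perturbed terms
    b = math.exp(-(dG1 + dG2 + dG12) / RT) * x * x        # doubly perturbed term
    psi = 1.0 + a + b
    X = (a + 2.0 * b) / psi              # X = d ln(psi) / d ln(x)
    return (a + 4.0 * b) / psi - X * X   # chi = dX / d ln(x)


def profile_summary(dG1, dG2, dG12, lo=-15.0, hi=20.0, n=7000):
    """Trapezoid-rule area, mean, and peak height of the chi-versus-dG profile."""
    h = (hi - lo) / n
    area = moment = peak = 0.0
    for i in range(n + 1):
        g = lo + i * h
        w = 0.5 if i in (0, n) else 1.0
        c = chi_global(g, dG1, dG2, dG12)
        area += w * c * h
        moment += w * c * g * h
        peak = max(peak, c)
    # area / RT is the area on the ln x axis, i.e. the number of sites N;
    # moment / area is the first moment normalized by the area under the profile
    return area / RT, moment / area, peak


for name, params in FIGURE_9_CASES.items():
    n_sites, mean_dG, peak = profile_summary(*params)
    print(f"{name}: N = {n_sites:.3f}, mean dG = {mean_dG:.3f} kcal/mol, peak chi = {peak:.3f}")
```

Under these assumptions each case should return an area close to N = 2 and a mean close to the 2 kcal/mol expected from eq 38, while the peak height distinguishes the steeper (positively linked) profile from the broader (negatively linked) one.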

The value of ∆Gm,j is the mean free energy of perturbation at site j when a given mutation is introduced in the system. This value is particularly informative in the presence of interactions among the sites because of the potential ambiguity in defining the cost of a mutation at site j (Table 7). Unlike ∆Gm, ∆Gm,j cannot be expressed as a simple function of the site-specific parameters, except in the trivial case of independent sites. The value of ∆Gm,j must be obtained by integration of the susceptibility profile. We also note the conservation relationship in eq 40, which is a direct consequence of eq 36. The sum of the mean free energies of perturbation for the N individual sites gives the mean free energy of perturbation in the global description times the number of sites. Alternatively, the mean free energy of perturbation in the global description is the average of the site-specific mean free energies of perturbation.

As an illustrative example, we calculate the global and site-specific susceptibility profiles for the slow and fast forms of thrombin interacting with the substrate library discussed in section IV. The results are shown in Figure 10. The partition function for the system carrying perturbations at the P1-P3 sites of the substrate FPR is given by eq 41, where the suffix denotes perturbation at site 1 (P1), 2 (P2), and 3 (P3) in order. For example, sFPR is s000 and sVGR is s110 (see eqs 25-31). The various ∆G's are the same as those listed in Table 6. The expressions for the site-specific quantities X1, X2, and X3 are given by eqs 42-44, and the site-specific susceptibilities are derived from differentiation of eqs 42-44 according to eq 35.

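$$
\Delta G_{m,j} = \frac{\int_{-\infty}^{+\infty}\chi_{j}\,\Delta G\,d\Delta G}{\int_{-\infty}^{+\infty}\chi_{j}\,d\Delta G}
\tag{39}
$$

$$
\Delta G_{m} = \frac{1}{N}\sum_{j=1}^{N}\Delta G_{m,j}
\tag{40}
$$

$$
\begin{aligned}
\Psi &= \frac{[M_{000}]}{[M_{000}]} + \left(\frac{[M_{100}]}{[M_{000}]} + \frac{[M_{010}]}{[M_{000}]} + \frac{[M_{001}]}{[M_{000}]}\right)x + \left(\frac{[M_{110}]}{[M_{000}]} + \frac{[M_{101}]}{[M_{000}]} + \frac{[M_{011}]}{[M_{000}]}\right)x^{2} + \frac{[M_{111}]}{[M_{000}]}\,x^{3} \\
&= 1 + \left(\frac{s_{100}}{s_{000}} + \frac{s_{010}}{s_{000}} + \frac{s_{001}}{s_{000}}\right)x + \left(\frac{s_{110}}{s_{000}} + \frac{s_{101}}{s_{000}} + \frac{s_{011}}{s_{000}}\right)x^{2} + \frac{s_{111}}{s_{000}}\,x^{3} \\
&= 1 + \left[\exp\left(-\frac{\Delta G_{1}}{RT}\right) + \exp\left(-\frac{\Delta G_{2}}{RT}\right) + \exp\left(-\frac{\Delta G_{3}}{RT}\right)\right]x \\
&\quad + \left[\exp\left(-\frac{\Delta G_{1} + \Delta G_{2} + \Delta G_{12}}{RT}\right) + \exp\left(-\frac{\Delta G_{1} + \Delta G_{3} + \Delta G_{13}}{RT}\right) + \exp\left(-\frac{\Delta G_{2} + \Delta G_{3} + \Delta G_{23}}{RT}\right)\right]x^{2} \\
&\quad + \exp\left(-\frac{\Delta G_{1} + \Delta G_{2} + \Delta G_{3} + \Delta G_{123}}{RT}\right)x^{3}
\end{aligned}
\tag{41}
$$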

Figure 10. Global susceptibility profiles of the slow (discontinuous line) and fast (continuous line) forms of thrombin, constructed from eqs 41-44 in the text using the parameters listed in Table 6. The profiles are decomposed into the site-specific components showing the susceptibility of S1 to perturbation in P1 (○,●), S2 to perturbation in P2 (□,■), and S3 to perturbation in P3 (△,▲) in the slow (○,□,△) and fast (●,■,▲) forms. The values of mean free energy of perturbation are (slow form) ∆Gm = 1.6 kcal/mol, ∆Gm,1 = 2.9 kcal/mol, ∆Gm,2 = 2.2 kcal/mol, and ∆Gm,3 = -0.2 kcal/mol and (fast form) ∆Gm = 1.9 kcal/mol, ∆Gm,1 = 2.2 kcal/mol, ∆Gm,2 = 3.6 kcal/mol, and ∆Gm,3 = 0.0 kcal/mol.

X1 ) 1 - [1 + (s010s000+s001s000)x +

s011s000

x2]/[1 + (s100s000

+s010s000

+s001s000)x + (s110s000

+s101s000

+s011s000)x2 +

s111s000

x3] (42)

X2 ) 1 - [1 + (s100s000+s001s000)x +

s101s000

x2]/[1 + (s100s000

+s010s000

+s001s000)x + (s110s000

+s101s000

+s011s000)x2 +

s111s000

x3] (43)

X3 ) 1 - [1 + (s100s000+s010s000)x +

s110s000

x2]/[1 + (s100s000

+s010s000

+s001s000)x + (s110s000

+s101s000

+s011s000)x2 +

s111s000

x3] (44)
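The decomposition in eqs 42-44 and the conservation relationship of eq 36 can be illustrated with a short Python sketch. It enumerates the eight configurations of eq 41, builds each Xj as 1 minus the contracted partition function with site j wild-type divided by Ψ, and compares the sum of the site-specific susceptibilities with the global one. The free energies below are hypothetical values chosen only for illustration (they are not the Table 6 parameters), RT = 0.593 kcal/mol assumes about 25 °C, and all names are illustrative.

```python
import math
from itertools import product

RT = 0.593  # kcal/mol; assumes about 25 °C (not stated in the text)

# Hypothetical free energies in kcal/mol, for illustration only (NOT Table 6).
SITE = {1: 2.0, 2: 1.5, 3: 0.3}                                       # dG1, dG2, dG3
COUPLING = {(1, 2): 1.0, (1, 3): 0.2, (2, 3): 0.4, (1, 2, 3): 1.2}    # dG12 ... dG123

CONFIGS = list(product((0, 1), repeat=len(SITE)))  # (0,0,0), (0,0,1), ..., (1,1,1)


def weight(config, x):
    """One term of eq 41, e.g. config (1, 1, 0) gives (s110/s000) x^2."""
    hit = tuple(j for j, c in zip(sorted(SITE), config) if c)
    dG = sum(SITE[j] for j in hit) + COUPLING.get(hit, 0.0)
    return math.exp(-dG / RT) * x ** len(hit)


def X_site(j, x):
    """X_j = 1 - (contracted Psi with site j wild-type) / Psi, as in eqs 42-44."""
    psi = sum(weight(c, x) for c in CONFIGS)
    psi_j_wild = sum(weight(c, x) for c in CONFIGS if c[j - 1] == 0)
    return 1.0 - psi_j_wild / psi


def chi_site(j, dG, h=1e-3):
    """chi_j = dX_j / d ln x (eq 35), central difference at dG = RT ln x."""
    u = dG / RT
    return (X_site(j, math.exp(u + h)) - X_site(j, math.exp(u - h))) / (2.0 * h)


def chi_global(dG, h=1e-3):
    """chi = d^2 ln Psi / d ln^2 x (eq 34), central second difference."""
    def ln_psi(u):
        return math.log(sum(weight(c, math.exp(u)) for c in CONFIGS))
    u = dG / RT
    return (ln_psi(u + h) - 2.0 * ln_psi(u) + ln_psi(u - h)) / (h * h)


for dG in (0.0, 2.0, 4.0):
    parts = [chi_site(j, dG) for j in sorted(SITE)]
    print(f"dG = {dG:.1f}: sum of chi_j = {sum(parts):.6f}, global chi = {chi_global(dG):.6f}")
```

The finite-difference derivatives are a convenience of this sketch; analytic differentiation of eqs 42-44 would serve equally well. The two printed columns should agree to within the finite-difference error, as eq 36 requires.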


The global susceptibility (Figure 10) spans 9 kcal/mol, consistent with the presence of negative coupling among the sites and a heterogeneous response of the P1-P3 sites to structural perturbation. There is very little difference in the profiles of the two forms, as also indicated by the similar values of ∆Gm (1.9 kcal/mol for the fast form and 1.6 kcal/mol for the slow form). However, the similarity in the global susceptibility contrasts with significant differences in the site-specific susceptibilities. There is a profound difference in the response to perturbation at P1, with the fast form being less susceptible by 0.7 kcal/mol. The slow → fast transition affects the environment of D189 at S1 by making it less susceptible to the Arg → Lys replacement at P1. This difference is not borne out by a difference in the site-specific perturbation free energy ∆G1 (Table 6), but rather by a different coupling between S1 and S2 in the two forms. The susceptibility profile is in this case very informative because it reveals an important property of the system that is not easily anticipated upon inspection of the site-specific parameters. Likewise, there is a profound difference in the response to perturbation at P2, with the fast form being more susceptible by 1.4 kcal/mol. The shape of the susceptibility is in this case very different in the two forms. The slow → fast transition affects the environment of the apolar site S2 by making it more susceptible to the Pro → Gly replacement at P2. The cooperativity at this site is also higher in the fast form, indicating a reduction of the negative linkage with S1 and S3. These effects are primarily accounted for by the differences in the perturbation free energy ∆G2, which is smaller in the slow form, and by the stronger negative coupling between P1 and P2 in this form. The response to perturbation at P3 shows no significant differences in the two forms.
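As a quick arithmetic check, using only the rounded values quoted in the caption of Figure 10, the conservation relationship of eq 40 and the site-specific differences cited above can be reproduced as follows. Agreement is expected only to within the rounding of the quoted values.

$$
\begin{aligned}
\text{slow form:}\quad \Delta G_{m} &= \tfrac{1}{3}\,(2.9 + 2.2 - 0.2) = 1.63 \approx 1.6\ \text{kcal/mol} \\
\text{fast form:}\quad \Delta G_{m} &= \tfrac{1}{3}\,(2.2 + 3.6 + 0.0) = 1.93 \approx 1.9\ \text{kcal/mol} \\
\text{P1:}\quad \Delta G_{m,1}^{\text{slow}} - \Delta G_{m,1}^{\text{fast}} &= 2.9 - 2.2 = 0.7\ \text{kcal/mol} \\
\text{P2:}\quad \Delta G_{m,2}^{\text{fast}} - \Delta G_{m,2}^{\text{slow}} &= 3.6 - 2.2 = 1.4\ \text{kcal/mol}
\end{aligned}
$$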

Figure 11. Effect of thrombomodulin on the global susceptibility profiles of the slow (discontinuous line) and fast (continuous line) forms of thrombin, constructed from eqs 41-44 in the text using the specificity values listed in Table 10. The profiles are decomposed into the site-specific components showing the susceptibility of S1 to perturbation in P1 (○,●), S2 to perturbation in P2 (□,■), and S3 to perturbation in P3 (△,▲) in the slow (○,□,△) and fast (●,■,▲) forms. Comparison with the data in Figure 10 shows how the profiles tend to become similar in the two forms at both the global and site-specific level. The values of mean free energy of perturbation are (slow form) ∆Gm = 1.6 kcal/mol, ∆Gm,1 = 1.9 kcal/mol, ∆Gm,2 = 2.7 kcal/mol, and ∆Gm,3 = 0.0 kcal/mol and (fast form) ∆Gm = 1.8 kcal/mol, ∆Gm,1 = 1.9 kcal/mol, ∆Gm,2 = 3.5 kcal/mol, and ∆Gm,3 = 0.1 kcal/mol.

Figure 12. Susceptibility profiles for the Ala substitution of the Glu-Arg pair at positions 44 and 53 of a variant of the immunoglobulin G binding domain of streptococcal protein G. The global susceptibility (continuous line) is decomposed into the site-specific components showing the susceptibility of E44 (●) and R53 (○) to the Ala substitution. Comparison with the data in Figure 13 shows the difference in cooperativity between the pairs. Curves were drawn using eqs 45-47 in the text with parameter values ∆G1 = 0.84 kcal/mol, ∆G2 = 1.47 kcal/mol, and ∆G12 = -1.09 kcal/mol.

Figure 13. Susceptibility profiles for the Ala substitution of the Thr-Ile pair at positions 44 and 53 of a variant of the immunoglobulin G binding domain of streptococcal protein G. The global susceptibility (continuous line) is decomposed into the site-specific components showing the susceptibility of T44 (●) and I53 (○) to the Ala substitution. Comparison with the data in Figure 12 shows the difference in cooperativity between the pairs. Curves were drawn using eqs 45-47 in the text with parameter values ∆G1 = 0.24 kcal/mol, ∆G2 = 0.49 kcal/mol, and ∆G12 = 0.74 kcal/mol.


The susceptibility profiles also help understand the effect of thrombomodulin on the two forms of thrombin (Figure 11). The most notable effect is that the profiles for perturbation at P1 and P2 become more similar in the slow and fast forms, in contrast to what is seen in the absence of cofactor (Figure 10). The overall effect of thrombomodulin binding is to change the environment of the slow form around the S1 and S2 sites into a fast-like conformation, so that the differences between the two allosteric states are reduced. The susceptibility profiles illustrate directly the mechanism of action of this important cofactor.

As another example we analyze the elegant work of Smith and Regan on the energetics of β-sheet side chain interactions.¹⁴¹ They determined the effect on protein stability resulting from the substitution of cross-strand pairs of side chains on an antiparallel β-sheet of a variant of the immunoglobulin G binding domain. Several residues were substituted at positions 44 and 53, facing each other on the opposite strands, and the results were expressed relative to the stability of the Ala residues at either position. These data can be analyzed using the same formalism as developed for the analysis of enzyme catalysis and ligand binding. The partition function for the system carrying Ala perturbations at positions 44 (site 1) and 53 (site 2) is given by eq 45.

In eq 45, ∆G1 is the free energy change in stability due to replacement of the residue at position 44 with Ala, ∆G2 is the analogous free energy change in stability due to replacement of the residue at position 53 with Ala, and ∆G12 is the coupling free energy between the substitutions. The expressions for the site-specific quantities X1 and X2 are given by eqs 46 and 47, and the site-specific susceptibilities are derived from differentiation according to eq 35.

The susceptibility profiles are shown in Figures 12 and 13 for the Glu-Arg and Thr-Ile pairs of residues 44-53. In the case of the Glu-Arg pair, the coupling is positive and the susceptibility profile is peaked around the mean free energy of perturbation. The existence of positive coupling makes the site-specific susceptibilities look very similar, although the site-specific free energies of perturbation differ by nearly 0.7 kcal/mol. In the case of the Thr-Ile pair, on the other hand, the coupling is negative and the susceptibilities spread out over a range of 4 kcal/mol. As a result, the site-specific susceptibilities look different and peak at free energy values about 1 kcal/mol apart, although the intrinsic free energies of perturbation are within 0.2 kcal/mol.
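The site-specific mean free energies of eq 39 for these two pairs can be estimated with a short Python sketch built on eqs 45-47 and the parameter values quoted in the captions of Figures 12 and 13. Because χj d ln x = dXj (eq 35), the first moment in eq 39 is evaluated here as a sum of ∆G times the increment of Xj over a fine grid, which avoids numerical differentiation. RT = 0.593 kcal/mol (about 25 °C) is an assumption, as are the function names and the integration limits.

```python
import math

RT = 0.593  # kcal/mol; assumes about 25 °C (not stated in the text)

# (dG1, dG2, dG12) in kcal/mol, as quoted in the captions of Figures 12 and 13
PAIRS = {
    "Glu-Arg": (0.84, 1.47, -1.09),
    "Thr-Ile": (0.24, 0.49, 0.74),
}


def X_sites(dG, dG1, dG2, dG12):
    """X1 and X2 from eqs 46 and 47, evaluated at dG = RT ln x."""
    x = math.exp(dG / RT)
    k1, k2 = math.exp(-dG1 / RT), math.exp(-dG2 / RT)
    k12 = math.exp(-(dG1 + dG2 + dG12) / RT)
    psi = 1.0 + (k1 + k2) * x + k12 * x * x            # eq 45
    return 1.0 - (1.0 + k2 * x) / psi, 1.0 - (1.0 + k1 * x) / psi


def mean_site_free_energies(dG1, dG2, dG12, lo=-12.0, hi=16.0, n=5600):
    """Site-specific means of eq 39, using chi_j d ln x = dX_j (midpoint rule)."""
    h = (hi - lo) / n
    first = prev = X_sites(lo, dG1, dG2, dG12)
    moment = [0.0, 0.0]
    for i in range(1, n + 1):
        cur = X_sites(lo + i * h, dG1, dG2, dG12)
        mid = lo + (i - 0.5) * h
        for j in range(2):
            moment[j] += mid * (cur[j] - prev[j])       # dG times the increment of X_j
        prev = cur
    # normalize by the total change in X_j, which is close to 1 over this range
    return tuple(moment[j] / (prev[j] - first[j]) for j in range(2))


for name, (dG1, dG2, dG12) in PAIRS.items():
    m1, m2 = mean_site_free_energies(dG1, dG2, dG12)
    print(f"{name}: dGm,1 = {m1:.2f}, dGm,2 = {m2:.2f}, average = {(m1 + m2) / 2:.3f} kcal/mol")
```

By eqs 38 and 40, the average of the two site-specific means should match (∆G1 + ∆G2 + ∆G12)/2, that is, 0.61 kcal/mol for the Glu-Arg pair and 0.735 kcal/mol for the Thr-Ile pair, which provides a consistency check on the integration.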

$$
\begin{aligned}
\Psi &= \frac{[M_{00}]}{[M_{00}]} + \left(\frac{[M_{10}]}{[M_{00}]} + \frac{[M_{01}]}{[M_{00}]}\right)x + \frac{[M_{11}]}{[M_{00}]}\,x^{2} \\
&= 1 + \left[\exp\left(-\frac{\Delta G_{1}}{RT}\right) + \exp\left(-\frac{\Delta G_{2}}{RT}\right)\right]x + \exp\left(-\frac{\Delta G_{1} + \Delta G_{2} + \Delta G_{12}}{RT}\right)x^{2}
\end{aligned}
\tag{45}
$$

$$
X_{1} = 1 - \frac{1 + \exp\left(-\dfrac{\Delta G_{2}}{RT}\right)x}{1 + \left[\exp\left(-\dfrac{\Delta G_{1}}{RT}\right) + \exp\left(-\dfrac{\Delta G_{2}}{RT}\right)\right]x + \exp\left(-\dfrac{\Delta G_{1} + \Delta G_{2} + \Delta G_{12}}{RT}\right)x^{2}}
\tag{46}
$$

$$
X_{2} = 1 - \frac{1 + \exp\left(-\dfrac{\Delta G_{1}}{RT}\right)x}{1 + \left[\exp\left(-\dfrac{\Delta G_{1}}{RT}\right) + \exp\left(-\dfrac{\Delta G_{2}}{RT}\right)\right]x + \exp\left(-\dfrac{\Delta G_{1} + \Delta G_{2} + \Delta G_{12}}{RT}\right)x^{2}}
\tag{47}
$$

VI. Conclusions

Site-specific thermodynamics⁵,⁹ provides a general theory of cooperativity and extends previous treatments based exclusively on global effects.⁷,⁸ The contribution of individual sites or residues to protein stability and ligand recognition can be dissected by introducing site-specific perturbations by means of recombinant DNA technologies. The combination of this experimental tool with the principles of site-specific thermodynamics results in a powerful new strategy that can detect the extent, nature, and origin of cooperativity in the system. Structural perturbations should be introduced in the system in a rational manner to generate all intermediates for a site-specific dissection of the energetics. We have illustrated how to implement this strategy in practice using substrate recognition by thrombin. The amount of information and the degree of detail on the recognition process to be gained from such a novel approach is unprecedented. Extension of the same strategy to other systems is possible and highly desirable. The conceptual framework discussed in this review article will certainly appeal to biochemists and biophysicists involved in studies of structure-function relationships and molecular recognition in proteins and nucleic acids.

VII. Acknowledgments

I am grateful to Prof. Peter Lollar for providing unpublished data on the interaction of factor VIII with the monoclonal antibody 413 (Table 3). This work was supported by NIH Research Grants HL49413 and HL58141 and was carried out under the tenure of an Established Investigator Award in Thrombosis from the American Heart Association and Genentech.

VIII. References

(1) Perutz, M. F. Q. Rev. Biophys. 1989, 22, 139-236.
(2) Creighton, T. E. Protein Folding; Freeman: New York, 1992.
(3) Zimm, B. H.; Bragg, J. K. J. Chem. Phys. 1959, 31, 526-535.
(4) Smith, M. Annu. Rev. Genet. 1985, 19, 423-462.
(5) Di Cera, E. Adv. Protein Chem. 1998, 51, 59-119.
(6) Wegscheider, R. Monatsh. Chem. 1895, 16, 153-158.
(7) Hill, T. L. Cooperativity Theory in Biochemistry; Springer-Verlag: Berlin, 1984.
(8) Wyman, J.; Gill, S. J. Binding and Linkage; University Science Books: Mill Valley, CA, 1990.
(9) Di Cera, E. Thermodynamic Theory of Site-Specific Binding Processes in Biological Macromolecules; Cambridge University: Cambridge, U.K., 1995.
(10) Adams, E. Q. J. Am. Chem. Soc. 1916, 38, 1503-1510.
(11) Simms, H. S. J. Am. Chem. Soc. 1926, 48, 1239-1261.

(12) Edsall, J. T.; Blanchard, M. H. J. Am. Chem. Soc. 1933, 55, 2337-2353.
(13) Edsall, J. T.; Wyman, J. Biophysical Chemistry; Academic: New York, 1958.
(14) Neuberger, A. Biochem. J. 1936, 30, 2085-2094.
(15) Hill, T. L. J. Chem. Phys. 1944, 12, 56-61.
(16) Linderstrøm-Lang, K. Compt. Rend. Lab. Carlsberg 1924, 15, 1-29.
(17) Tanford, C.; Kirkwood, J. G. J. Am. Chem. Soc. 1957, 79, 5333-5339.
(18) Bashford, D.; Karplus, M. J. Phys. Chem. 1991, 95, 9556-9561.
(19) Kretsinger, R. H. CRC Crit. Rev. Biochem. 1980, 8, 119-174.
(20) Qian, H. Biopolymers 1993, 33, 1605-1616.
(21) Chakrabarty, A.; Schellman, J. A.; Baldwin, R. L. Nature 1991, 351, 586-588.
(22) Wrabl, J. O.; Shortle, D. Protein Sci. 1996, 5, 2343-2352.
(23) Skelton, N. J.; Kordel, J.; Akke, M.; Forsen, S.; Chazin, W. J. Nat. Struct. Biol. 1994, 1, 239-245.
(24) Szebenyi, D. M. E.; Moffat, J. J. Biol. Chem. 1986, 261, 8761-8777.
(25) Linse, S.; Brodin, P.; Drakenberg, T.; Thulin, E.; Sellers, P.; Elmden, K.; Grundstrom, T.; Forsen, S. Biochemistry 1987, 26, 6723-6735.
(26) Linse, S.; Brodin, P.; Johansson, C.; Thulin, E.; Grundstrom, T.; Forsen, S. Nature 1988, 335, 651-652.
(27) Linse, S.; Johansson, C.; Brodin, P.; Grundstrom, T.; Drakenberg, T.; Forsen, S. Biochemistry 1991, 30, 154-162.
(28) Akke, M.; Forsen, S.; Chazin, W. J. J. Mol. Biol. 1991, 220, 173-189.
(29) Carlstrom, G.; Chazin, W. J. J. Mol. Biol. 1993, 231, 415-430.
(30) Linse, S.; Chazin, W. J. Protein Sci. 1995, 4, 1038-1044.
(31) Brodin, P.; Johansson, C.; Forsen, S.; Drakenberg, T.; Grundstrom, T. J. Biol. Chem. 1990, 265, 11125-11130.
(32) Martin, S. R.; Linse, S.; Johansson, C.; Bayley, P. M.; Forsen, S. Biochemistry 1990, 29, 4188-4193.
(33) Ahlstrom, P.; Teleman, O.; Kordel, J.; Forsen, S.; Jonsson, B. Biochemistry 1989, 28, 3205-3211.
(34) Nayal, M.; Di Cera, E. Proc. Natl. Acad. Sci. U.S.A. 1994, 91, 817-821.
(35) Kojima, N.; Palmer, G. J. Biol. Chem. 1983, 258, 14908-14913.
(36) Hendler, R. W.; Subba Reddy, K. V.; Shrager, R. I.; Caughey, W. S. Biophys. J. 1986, 49, 717-729.
(37) Senear, D. F.; Ackers, G. K. Biochemistry 1990, 29, 6568-6577.
(38) Perrella, M.; Rossi-Bernardi, L. Methods Enzymol. 1981, 76, 133-143.
(39) Ackers, G. K.; Doyle, M. L.; Myers, D.; Daugherty, M. A. Science 1992, 255, 54-63.
(40) Judice, J. K.; Gamble, T. R.; Murphy, E. C.; de Vos, A. M.; Schultz, P. G. Science 1993, 261, 1578-1581.
(41) Cornish, V. W.; Schultz, P. G. Curr. Opin. Struct. Biol. 1994, 4, 601-607.
(42) Matthews, B. W. Annu. Rev. Biochem. 1993, 62, 139-160.
(43) Yu, M.-H.; Weissman, J. S.; Kim, P. S. J. Mol. Biol. 1995, 249, 388-397.
(44) Shortle, D. FASEB J. 1996, 10, 27-34.
(45) Meeker, A. L.; Garcia-Moreno, B.; Shortle, D. Biochemistry 1996, 35, 6443-6449.
(46) Milla, M. E.; Brown, B. M.; Sauer, R. T. Nat. Struct. Biol. 1994, 1, 518-523.
(47) Garcia-Moreno, E., B.; Dwyer, J. J.; Gittis, A. G.; Lattman, E. E.; Spencer, D. S.; Stites, W. E. Biophys. Chem. 1997, 64, 211-224.

(48) Clackson, T.; Wells, J. A. Science 1995, 267, 383-386.
(49) Castro, M. J. M.; Anderson, S. Biochemistry 1996, 35, 11435-11446.
(50) Tsiang, M.; Jain, A. K.; Dunn, K. E.; Rojas, M. E.; Leung, L. L. K.; Gibbs, C. S. J. Biol. Chem. 1995, 270, 16854-16863.
(51) Dickinson, C. D.; Kelly, C. R.; Ruf, W. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 14379-14384.
(52) Pakianathan, D. R.; Kuta, E. G.; Artis, D. R.; Skelton, N. J.; Hebert, C. A. Biochemistry 1997, 36, 9642-9648.
(53) Lau, F. T.-K.; Fersht, A. R. Nature 1987, 326, 811-812.
(54) Cunningham, B. C.; Wells, J. A. Science 1989, 244, 1081-1085.
(55) Green, S. M.; Meeker, A. K.; Shortle, D. Biochemistry 1992, 31, 5717-5728.
(56) Horovitz, A.; Fersht, A. R. J. Mol. Biol. 1992, 224, 733-740.
(57) Fersht, A. R.; Serrano, L. Curr. Opin. Struct. Biol. 1993, 3, 75-83.
(58) Carter, P.; Wells, J. A. Nature 1988, 332, 564-568.
(59) Vindigni, A.; Dang, Q. D.; Di Cera, E. Nat. Biotechnol. 1997, 15, 891-895.
(60) Sandberg, W. S.; Terwilliger, T. C. Science 1989, 245, 54-57.
(61) Shirley, B. A.; Stanssen, P.; Steyaert, J.; Pace, C. N. J. Biol. Chem. 1989, 264, 11621-11625.
(62) Wells, J. A. Biochemistry 1990, 29, 8509-8517.
(63) Carter, P. J.; Winter, G.; Wilkinson, A. J.; Fersht, A. R. Cell 1984, 38, 835-840.
(64) Mildvan, A. S.; Weber, D. J.; Kuliopulos, A. Arch. Biochem. Biophys. 1992, 294, 327-340.
(65) Shortle, D.; Meeker, A. L. Proteins: Struct., Funct., Genet. 1986, 1, 81-89.
(66) Perry, K. M.; Onuffer, J. J.; Gittelman, M. S.; Barmat, L.; Matthews, C. R. Biochemistry 1989, 28, 7961-7970.
(67) Howell, E. E.; Booth, C.; Farnum, M.; Kraut, J.; Warren, M. S. Biochemistry 1990, 29, 8561-8568.
(68) LiCata, V. J.; Speros, P. C.; Rovida, E.; Ackers, G. K. Biochemistry 1990, 29, 9771-9783.
(69) Scrutton, N. S.; Berry, A.; Perham, R. N. Nature 1990, 343, 38-43.
(70) Green, S. M.; Shortle, D. Biochemistry 1993, 32, 10131-10139.
(71) Jackson, S. E.; Fersht, A. R. Biochemistry 1993, 32, 13909-13918.
(72) Robinson, C. R.; Sligar, S. G. Protein Sci. 1993, 2, 826-832.
(73) LiCata, V. J.; Ackers, G. K. Biochemistry 1995, 34, 3133-3159.
(74) Shortle, D.; Stites, W. E.; Meeker, A. L. Biochemistry 1990, 29, 8033-8041.
(75) Richieri, G. V.; Low, P. J.; Ogata, R. T.; Kleinfeld, A. M. J. Biol. Chem. 1997, 272, 16737-16740.
(76) Young, D. C.; Zhan, H.; Cheng, Q.-L.; Hou, J.; Matthews, D. J. Protein Sci. 1997, 6, 1228-1236.
(77) Kristensen, C.; Kjeldsen, T.; Wiberg, F. C.; Schaffer, L.; Hach, M.; Havelund, S.; Bass, J.; Steiner, D. F.; Andersen, A. S. J. Biol. Chem. 1997, 272, 12978-12983.
(78) Cosmatos, A.; Cheng, K.; Okada, Y.; Katsoyannis, P. G. J. Biol. Chem. 1978, 253, 6586-6590.
(79) Nakagawa, S. H.; Tager, H. S. Biochemistry 1992, 31, 3204-3214.
(80) Marki, F.; Gasparo, M. D.; Eisler, K.; Kambler, B.; Riniker, B.; Rittel, W.; Sieber, P. Hoppe-Seyler's Z. Physiol. Chem. 1979, 360, 1619-1632.
(81) Nakagawa, S. H.; Tager, H. S. J. Biol. Chem. 1991, 266, 11502-11509.
(82) Kobayashi, M.; Ohgaku, S.; Iwasaki, M.; Maegawa, H.; Watanabe, N.; Takada, Y.; Shigeta, Y.; Inouye, K. Biomed. Res. 1984, 5, 267-272.

(83) Mirmira, R. G.; Tager, H. S. Biochemistry 1991, 30, 8222-8229.
(84) Lubin, I. M.; Healey, J. F.; Barrow, R. T.; Scandella, D.; Lollar, P. J. Biol. Chem. 1997, 272, 30191-30195.
(85) Ackers, G. K.; Smith, F. R. Annu. Rev. Biochem. 1985, 54, 597-629.
(86) Horovitz, A.; Fersht, A. R. J. Mol. Biol. 1990, 214, 613-617.
(87) Koshland, D. E.; Nemethy, G.; Filmer, D. Biochemistry 1966, 5, 365-385.
(88) Monod, J.; Wyman, J.; Changeux, J. P. J. Mol. Biol. 1965, 12, 88-118.
(89) Rawlings, R. D.; Barrett, A. J. Biochem. J. 1993, 290, 205-218.
(90) Rawlings, R. D.; Barrett, A. J. Methods Enzymol. 1994, 244, 19-61.
(91) Neurath, H. Science 1984, 224, 350-357.
(92) Doolittle, R. F.; Feng, D. F. Cold Spring Harbor Symp. Quant. Biol. 1987, 52, 869-874.
(93) Patthy, L. Blood Coagulation Fibrinolysis 1990, 1, 153-166.
(94) Dang, Q. D.; Di Cera, E. Proc. Natl. Acad. Sci. U.S.A. 1996, 93, 10253-10256.
(95) Di Cera, E.; Dang, Q. D.; Ayala, Y. M. Cell. Mol. Life Sci. 1997, 53, 701-730.
(96) Lesk, A. M.; Fordham, W. D. J. Mol. Biol. 1996, 258, 501-537.
(97) Warshel, A.; Naray-Szabo, G.; Sussman, F.; Hwang, J. K. Biochemistry 1989, 28, 3629-3637.
(98) Hedstrom, L.; Szilagyi, L.; Rutter, W. J. Science 1992, 255, 1249-1253.
(99) Schechter, I.; Berger, A. Biochem. Biophys. Res. Commun. 1967, 27, 157-162.
(100) Mann, K. G.; Nesheim, M. E.; Church, W. R.; Haley, P.; Krishnaswamy, S. Blood 1990, 76, 1-16.
(101) Davie, E. W.; Fujikawa, K.; Kisiel, W. Biochemistry 1991, 30, 10363-10370.
(102) Grand, R. J. A.; Turnell, A. S.; Grabham, P. W. Biochem. J. 1996, 313, 353-368.
(103) Ishihara, H.; Connolly, A. J.; Zeng, D.; Kahn, M. L.; Zheng, Y. W.; Timmons, C.; Tram, T.; Coughlin, S. R. Nature 1997, 386, 502-506.
(104) Wells, C. M.; Di Cera, E. Biochemistry 1992, 31, 11721-11730.
(105) Dang, Q. D.; Vindigni, A.; Di Cera, E. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 5977-5981.
(106) Dang, Q. D.; Guinto, E. R.; Di Cera, E. Nat. Biotechnol. 1997, 15, 146-149.
(107) Bode, W.; Turk, D.; Karshikov, A. Protein Sci. 1992, 1, 426-471.
(108) Martin, P. D.; Robertson, W.; Turk, D.; Huber, R.; Bode, W.; Edwards, B. F. P. J. Biol. Chem. 1992, 267, 7911-7920.
(109) Stubbs, M.; Oschkinat, H.; Mayr, I.; Huber, R.; Angliker, H.; Stone, S. R.; Bode, W. Eur. J. Biochem. 1992, 206, 187-195.
(110) Ayala, Y. M.; Di Cera, E. J. Mol. Biol. 1994, 235, 733-746.
(111) Guinto, E. R.; Di Cera, E. Biophys. Chem. 1997, 64, 103-109.
(112) Vindigni, A.; White, C. E.; Komives, E. A.; Di Cera, E. Biochemistry 1997, 36, 6674-6681.
(113) Rezaie, A. R. Biochemistry 1996, 35, 1918-1924.


(114) Guinto, E. R.; Vindigni, A.; Ayala, Y.; Dang, Q. D.; Di Cera, E. Proc. Natl. Acad. Sci. U.S.A. 1995, 92, 11185-11189.
(115) Dang, Q. D.; Sabetta, M.; Di Cera, E. J. Biol. Chem. 1997, 272, 19649-19651.
(116) Di Cera, E.; Guinto, E. R.; Vindigni, A.; Dang, Q. D.; Ayala, Y. M.; Wuyi, M.; Tulinsky, A. J. Biol. Chem. 1995, 270, 22089-22092.
(117) Zhang, E.; Tulinsky, A. Biophys. Chem. 1997, 63, 185-200.
(118) Krem, M. M.; Di Cera, E. Proteins: Struct., Funct., Genet. 1998, 30, 34-42.
(119) Bartunik, H. D.; Summers, L. J.; Bartsch, H. H. J. Mol. Biol. 1989, 210, 813-828.
(120) Henriksen, R. A.; Dunham, C. K.; Miller, L. D.; Casey, J. T.; Menke, J. B.; Knupp, C. L.; Usala, S. J. Blood 1998, 91, 2026-2031.
(121) Miyata, T.; Aruga, R.; Umeyama, H.; Bezeaud, A.; Guillin, M. C.; Iwanaga, S. Biochemistry 1992, 31, 7457-7462.
(122) Gibbs, C. S.; Coutre, S. E.; Tsiang, M.; Li, W.-X.; Jain, A. K.; Dunn, K. E.; Law, V. S.; Mao, C. T.; Matsumura, S. Y.; Mejza, S. J.; Paborsky, L. R.; Leung, L. L. K. Nature 1995, 378, 413-416.
(123) Smith, G. P.; Petrenko, V. A. Chem. Rev. 1997, 97, 391-410.
(124) Babine, R. E.; Bender, S. L. Chem. Rev. 1997, 97, 1359-1472.
(125) Ni, F.; Ripoll, D. R.; Martin, P. D.; Edwards, B. F. P. Biochemistry 1992, 31, 11551-11557.
(126) Vindigni, A.; Di Cera, E. Biochemistry 1996, 35, 4417-4426.
(127) Rydel, T. J.; Tulinsky, A.; Bode, W.; Huber, R. J. Mol. Biol. 1991, 221, 583-601.
(128) De Filippis, V.; Vindigni, A.; Altichieri, L.; Fontana, A. Biochemistry 1995, 34, 9552-9564.
(129) De Filippis, V.; Quarzago, D.; Vindigni, A.; Di Cera, E.; Fontana, A. Submitted for publication.
(130) Banfield, D. K.; MacGillivray, R. T. A. Proc. Natl. Acad. Sci. U.S.A. 1992, 89, 2779-2783.
(131) Lai, M.-T.; Di Cera, E.; Shafer, J. A. J. Biol. Chem. 1997, 272, 30275-30282.
(132) Esmon, C. T. J. Biol. Chem. 1989, 264, 4743-4746.
(133) Ye, J.; Esmon, N. L.; Esmon, C. T.; Johnson, A. E. J. Biol. Chem. 1991, 266, 23016-23021.
(134) Mathews, I. I.; Padmanabhan, K. P.; Tulinsky, A.; Sadler, J. E. Biochemistry 1994, 33, 13547-13552.
(135) Vijayalakshmi, J.; Padmanabhan, K. P.; Mann, K. G.; Tulinsky, A. Protein Sci. 1994, 3, 2254-2271.
(136) Sadler, J. E. Thromb. Haemostasis 1997, 78, 392-395.
(137) Richardson, M. A.; Gerlitz, B.; Grinnell, B. W. Nature 1992, 360, 261-264.
(138) Grinnell, B. W.; Gerlitz, B.; Berg, D. T. Biochem. J. 1994, 303, 929-933.
(139) Gerlitz, B.; Grinnell, B. W. J. Biol. Chem. 1996, 271, 22285-22288.
(140) Wyman, J. Adv. Protein Chem. 1964, 19, 223-286.
(141) Smith, C. K.; Regan, L. Science 1995, 270, 980-982.

CR960135G
