doi:10.1016/j.jmb.2008.09.023 J. Mol. Biol. (2008) 384, 512–530

Available online at www.sciencedirect.com

Probing Possible Downhill Folding: Native Contact Topology Likely Places a Significant Constraint on the Folding Cooperativity of Proteins with ∼40 Residues

Artem Badasyan, Zhirong Liu and Hue Sun Chan⁎

Department of Biochemistry and Department of Molecular Genetics, Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada M5S 1A8

Received 16 July 2008; received in revised form 6 September 2008; accepted 10 September 2008
Available online 17 September 2008

Experiments point to appreciable variations in folding cooperativity among natural proteins with approximately 40 residues, indicating that the behaviors of these proteins are valuable for delineating the contributing factors to cooperative folding. To explore the role of native topology in a protein's propensity to fold cooperatively and how native topology might constrain the degree of cooperativity achievable by a given set of physical interactions, we compared folding/unfolding kinetics simulated using three classes of native–centric Cα chain models with different interaction schemes. The approach was applied to two homologous 45-residue fragments from the peripheral subunit-binding domain family and a 39-residue fragment of the N-terminal domain of ribosomal protein L9. Free-energy profiles as functions of native contact number were computed to assess the heights of thermodynamic barriers to folding. In addition, chevron plots of folding/unfolding rates were constructed as functions of native stability to facilitate comparison with available experimental data. Although common Gō-like models with pairwise Lennard-Jones-type interactions generally fold less cooperatively than real proteins, the rank ordering of cooperativity predicted by these models is consistent with experiment for the proteins investigated, showing increasing folding cooperativity with increasing nonlocality of a protein's native contacts. Models that account for water-expulsion (desolvation) barriers and models with many-body (nonadditive) interactions generally entail higher degrees of folding cooperativity indicated by more linear model chevron plots, but the rank ordering of cooperativity remains unchanged. A robust, experimentally valid rank ordering of model folding cooperativity independent of the multiple native–centric interaction schemes tested here argues that native topology places significant constraints on how cooperatively a protein can fold.

© 2008 Elsevier Ltd. All rights reserved.

Keywords: contact order; chevron plots; Gō model; desolvation barriers; many-body interactions

Edited by M. Levitt

*Corresponding author. E-mail address: [email protected].

Present address: Z. Liu, College of Chemistry and Molecular Engineering, Peking University, Beijing 100871, China.

Abbreviations used: CI2, chymotrypsin inhibitor 2; NTL9, N-terminal domain of ribosomal protein L9; PDB, Protein Data Bank; PSBD, peripheral subunit-binding domain; db, desolvation barriers; CO, contact order; RCO, relative contact order.

0022-2836/$ - see front matter © 2008 Elsevier Ltd. All rights reserved.

Introduction

Behaviors of natural proteins are consequences of an interplay between physics and biology. The structural and dynamic properties of natural proteins are dictated by physical forces acting in accordance with the proteins' amino acid sequences, which, in turn, are products of evolutionary selection for biological functions. Naturally occurring proteins encompass only a tiny fraction of all possible amino acid sequences. Many natural protein properties, even generic trends, are atypical among random amino acid sequences (reviewed in Refs. 1,2). For many natural globular proteins that fold and unfold reversibly, the kinetic accessibility of the folded state suggests strongly that their energy landscapes³⁻⁶ are funnel-like,³,⁶,⁷ constituting a special class of landscape shapes that does not necessarily apply to a random amino acid sequence. Traditionally, folding kinetics data of natural proteins were rationalized in terms of activation processes over free-energy barriers.⁸ Although the progress variables for free energy were mostly unspecified in such a phenomenological approach, the method has provided many useful insights. In this view, the folding/unfolding kinetics of many single-domain proteins with ∼60–100 residues is apparently limited by only a single barrier.⁹,¹⁰ This simple parametrization¹¹ contrasts markedly with that for the more complex folding kinetics of larger proteins, the interpretation of which often invoked transiently accumulating intermediates¹² and multiple barriers.¹³,¹⁴

In the energy landscape perspective, traditional barrier-limited folding is readily accounted for by funnel-like landscapes.¹⁵,¹⁶ Our horizon of possibilities was broadened by the landscape perspective, however, as it proposed that the folding of some proteins may be downhill, or barrierless, along certain well-defined physical progress variables.³ In apparent confirmation of this prediction, recent experimental studies suggested that the 37-residue peripheral subunit-binding domain from Escherichia coli's 2-oxoglutarate dehydrogenase multienzyme complex (BBL) undergoes downhill folding.¹⁷,¹⁸ Kinetic behaviors observed in earlier temperature jump experiments also indicated that the formation of a compact globular intermediate from the cold-denatured state of the 415-residue two-domain enzyme yeast phosphoglycerate kinase (PGK)¹⁹ likely proceeds in a downhill manner.²⁰ Subsequent to the BBL thermodynamic study,¹⁷ the cooperatively folding 80-residue λ₆₋₈₅ part of lambda repressor (wild type)²¹ was found to be “tunable” by mutations towards downhill folding under a variety of stability conditions.²²⁻²⁴ These discoveries sparked intense interest. Notwithstanding the ongoing controversy regarding whether BBL is indeed a downhill folder,²⁵⁻²⁹ investigations of BBL, λ₆₋₈₅ mutants, and other putative downhill-folding proteins have led to significant experimental³⁰⁻³³ as well as theoretical³⁴⁻⁴² advances (see Refs. 43,44 for reviews). More recently, experimental folding data on a 62-residue version of the gene product protein W from bacteriophage λ (gpW)⁴⁵ provided evidence that gpW is yet another example of a protein “poised toward downhill folding”.⁴⁶

The phenomenon of downhill folding is a valuable window into general principles of protein energetics. It serves to bridge our understanding of conventional barrier-limited folding with other paradigms of functional protein behavior such as that of intrinsically disordered proteins (Refs. 47,48 and references therein). Fundamentally, barrier-limited two-state-like folding¹⁰,¹¹ implies that a protein's folded and unfolded states are separated by a region in conformational space with drastically reduced population. As such, whether a protein folds in a barrier-limited or downhill manner is intimately related to its folding cooperativity. The high degree of cooperativity of many barrier-limited proteins is a physically remarkable property,⁴⁹ suggesting that cooperativity itself is a consequence of stringent biological selection.⁵⁰⁻⁵² This recent finding from theory⁴⁹,⁵³,⁵⁴ and experiment⁵⁰ poses a serious conceptual challenge since natural two-state proteins are significantly more cooperative than coarse-grained protein chain models⁵⁵ constructed to embody common intuitive notions of pairwise additive interactions⁵⁶ (e.g., the hydrophobic–polar interaction scheme⁵⁷,⁵⁸). Apparently, only models that explicitly⁵⁹,⁶⁰ or tacitly⁶¹ (see comment on pp. 554–557 of Ref. 62 regarding the COREX algorithm⁶¹) involve nonadditive many-body effects⁶³ are capable of producing cooperativity comparable to that observed experimentally.⁵¹,⁶⁴,⁶⁵ In this light, downhill folders may be amino acid sequences that did not undergo a rigorous selection for cooperativity or whose folded structures are, for some reason, not conducive to cooperative folding. A comparison of the structural and energetic differences between downhill and barrier-limited folding should, therefore, elucidate both noncooperative and cooperative folding.

Studies of downhill folding so far have uncovered a tangible relationship between native topology and folding cooperativity. In line with results calculated from Ising-like constructs⁶⁶⁻⁶⁸ that did not consider explicit representations of the protein chain, recent investigations of explicit-chain models indicated that the contact pattern of native structure alone can yield substantial information about a protein's folding cooperativity. Using a continuum⁵³,⁶⁹⁻⁷¹ native–centric Gō⁷² model for 16 proteins and a 16-residue peptide, Zuo et al. found that folding cooperativity was well correlated with the number of nonlocal contacts per residue in the protein's native folded structure.³⁶ Among the set of proteins they considered, which included some clearly two-state proteins such as chymotrypsin inhibitor 2 (CI2), BBL was the least cooperative and barrierless in the model.³⁶ Using a similar modeling approach, Knott and Chan compared four proteins, all with ∼40 residues so as to minimize chain-length effects that might otherwise obscure the impact of native topology on folding cooperativity.³⁷ While recognizing⁵³,⁵⁶ that the additive interactions used in their Gō model tend to underestimate the cooperativity of the proteins it seeks to model,³⁷ they found clear evidence that even such models could provide a crude prediction for the rank ordering of cooperativity of real proteins. Specifically, their model for a 37-residue version of BBL [Protein Data Bank (PDB) ID: 1BBL] was determined to be barrierless and, by comparison, much less cooperative than the corresponding model for a 39-residue fragment of the N-terminal domain of ribosomal protein L9 (NTL9; PDB ID: 1CQU), consistent with a trend that had been indicated by experiments.¹⁷,⁷³

The general trend observed in these two studies was confirmed subsequently by Prieto and Rey using a harmonic-well instead of a Lennard-Jones form for the favorable native contact interaction potential. They examined the 20-residue Trp-cage miniprotein 1L2Y and seven other proteins with 36–67 residues, including some of the proteins such as BBL and CI2 examined previously. These authors also found that 1BBL was a downhill folder in their simulations.³⁸ More recently, Cho et al. used Gō-like models augmented by charge and overall chain collapse terms to examine the folding cooperativity of a 39-residue (PDB ID: 2CYU) and a 45-residue (PDB ID: 1W4H) form of BBL as well as a 39-residue truncated version of 1W4H. They discovered that 2CYU models were significantly less cooperative than 1W4H models,⁴⁰ thus offering a reconciliation of the seemingly contradictory experimental data obtained for 2CYU and 1W4H, respectively, by the Muñoz¹⁷,¹⁸ and Fersht³⁰,³² groups.

These modeling successes raise a fundamental question: Does native topology of a folded structure play a significant role in the degree of cooperativity it can potentially achieve? In other words, are some folded structures intrinsically more likely to be cooperative folders than others? Strictly speaking, to answer this question, one needs to first identify the convergence set of all amino acid sequences that encode a given target structure, test these sequences' folding cooperativity, then repeat the procedure for many target structures, and finally compare the average or maximum achievable cooperativity in the convergence sets for different target structures. However, this exhaustive approach is currently not feasible. To address the above-posed question computationally, even a modest sampling of sequence space using atomic modeling would entail prohibitively more effort than the insightful atomic simulations of a few downhill-folding candidates performed thus far.⁴⁰,⁴² To make progress, we take a heuristic approach instead. Our method is based upon the rank ordering of cooperativity³⁷ of natural proteins modeled by multiple native–centric interaction schemes.⁷⁴ Model cooperativity of a given structure depends on the interaction scheme.⁴⁰,⁵⁹ Some common model interaction schemes produce cooperativity lower than that of natural proteins⁵³,⁵⁶ whereas interaction schemes with many-body effects afford higher cooperativity.⁵⁹,⁷⁵ We adopt the intuitive working assumption that the diversity in predicted behaviors of a given structure from models with different interaction schemes should be reflective of (though not identical with) the range of the corresponding physical properties allowed by the convergence set of amino acid sequences encoding the structure. Accordingly, we posit that if the computed rank ordering of cooperativity for a set of proteins is independent of model interaction scheme, such robustness is a strong indication that the experimentally achievable cooperativity of the proteins would be ranked in the same order.

Comparison of multiple interaction schemes has been used recently to investigate possible downhill folding by Cho et al.⁴⁰ In this independent study, they found that the rank ordering of the model folding cooperativity of three versions of BBL (see above) was invariant under various interaction schemes with different degrees of nonadditivity based upon a specific mathematical form.⁷⁵ Building on these advances and the conceptual framework above, the present study compares folding cooperativity computed using native–centric interactions with (i) pairwise additive Lennard-Jones-type potentials,⁵³,⁷¹ (ii) pairwise potentials incorporating desolvation barriers (dbs),⁵³,⁷⁶⁻⁷⁸ and (iii) a general nonadditive scheme similar to but not identical with that of Ref. 40. Because of the importance of solvation/desolvation effects in biomolecular assembly,¹⁶,⁷⁹⁻⁸² we deem it instructive to consider models with elementary dbs because they capture key energetic aspects of water exclusion in protein folding. Indeed, these models have been demonstrated to have protein-like cooperative behaviors.⁵³,⁷⁶,⁷⁸ Notably, they also provide amplitudes of native structural fluctuations that are more realistic experimentally, in contrast to the significantly larger amplitudes predicted by common Gō models⁷⁸ and even models with certain formal nonadditive effects (see below).

Here, we elect to focus on three proteins of ∼40 residues because there are tantalizing clues that their behaviors may be more revealing about the relationship between native topology and folding cooperativity than the corresponding behaviors of longer protein chains. As far as effects of chain length on native stability are concerned, proteins of size range ∼40 are considered marginal in that shorter peptides are not likely to undergo two-state-like folding to a stable globular state.⁸³ As investigations of BBL exemplified, natural proteins with ∼40 residues often have appreciable differences in apparent folding cooperativity. In contrast, folding is quite ideally two-state for many of the larger natural single-domain proteins,⁹,¹¹ making it difficult experimentally to discriminate among their already high degrees of cooperativity. For related reasons, some proteins of the ∼40 size range⁸⁴ are useful for probing the general “speed limit” of folding as well.⁸⁵,⁸⁶ In the present study, we compare the folding cooperativity of NTL9 and two homologous peripheral subunit-binding domain (PSBD) proteins (PDB IDs: 1W4E and 2BTH), for which experimental structural and folding data are available.³⁰,³²,⁷³,⁸⁷ Among these three proteins, each of the two PSBD proteins has a clearly different native topology from that of NTL9, whereas the two PSBD homologs have essentially identical folds with only subtle differences in packing. Application of our approach outlined above suggests strongly that native topology is predictive of and probably constraining on folding cooperativity. Our computed free-energy profiles and directly simulated folding/unfolding kinetics showed not only a clear difference in cooperativity between NTL9 and each of the two PSBD proteins but also a robustly higher cooperativity for 1W4E than for 2BTH. Quite remarkably, the latter theoretical rank ordering of PSBD folding cooperativity is consistent with that revealed by experimental chevron plots,³² as will be discussed below.

†We have previously used $r_{ij} < r_{db}^{n}$ to define contacts in models with db, where $r_{db}^{n}$ is the Cα–Cα separation between residues i and j at the db peak of their interaction potential function.⁵³ Differences in native contact count obtained by using this criterion and the $r_{ij} < 1.2\,r_{ij}^{n}$ criterion are largely negligible.


Theory

Salient features of the three native–centric model interaction schemes used in our investigation are outlined in this section. Based on these interaction schemes, coarse-grained Cα protein chain models were constructed as before. As in Ref. 88, a pair of residues i,j is deemed to be in the native contact set if they are separated by at least three residues, that is, |i − j| > 3, and a pair of non-hydrogen atoms, one from each residue, is less than 4.5 Å apart⁸⁹ in the PDB structure chosen for modeling. For the 1–39 fragment of NTL9 (1CQU₁₋₃₉), we used the same native conformation we have modeled previously,³⁷ which is the fifth among the 18 PDB NMR structures reported for 1CQU, noting that no single one of these NMR structures is deemed most representative by the PDB. For the 126–170 fragments of the two PSBD proteins (1W4E₁₂₆₋₁₇₀ and 2BTH₁₂₆₋₁₇₀), we used the first conformation among the 20 PDB NMR structures, respectively, for 1W4E and 2BTH. These conformations were selected because each of them is listed in the PDB as “the best representative conformation” in its respective NMR ensemble. The native contact sets chosen here for simulations will be further assessed below in the context of detailed contact statistics of the PDB NMR ensembles in Results and Discussion. Thermodynamic sampling and dynamic simulation of folding and unfolding kinetics were conducted by using Langevin dynamics.⁹⁰ Bias potentials were employed for selected modeling conditions for which direct simulations were inefficient (outlined below in the last part of this section). Model time in folding/unfolding kinetics was measured in units of number of Langevin dynamics time steps. Unless specified otherwise, the Langevin parameters used here are identical with those in previous works of our group.³⁷,⁵³,⁷⁷,⁸⁸
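The native contact criterion stated above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' code: the function name `native_contact_set`, the input layout (one array of non-hydrogen atom coordinates per residue), and the toy coordinates are assumptions introduced here. Only the sequence-separation rule |i − j| > 3 and the 4.5 Å cutoff are taken from the text.

```python
import numpy as np

def native_contact_set(heavy_atoms, cutoff=4.5):
    """Return native contact pairs (i, j) with j - i > 3.

    heavy_atoms: list with one (n_atoms, 3) array of non-hydrogen atom
    coordinates (in angstroms) per residue; this layout is hypothetical.
    A pair is a contact if any two atoms, one from each residue,
    are closer than `cutoff`.
    """
    contacts = []
    n_res = len(heavy_atoms)
    for i in range(n_res):
        for j in range(i + 4, n_res):  # enforces |i - j| > 3
            diff = heavy_atoms[i][:, None, :] - heavy_atoms[j][None, :, :]
            d_min = np.sqrt((diff ** 2).sum(axis=-1)).min()
            if d_min < cutoff:
                contacts.append((i, j))
    return contacts

# Toy hairpin with one atom per residue (made-up coordinates).
toy = [np.array([[x, y, 0.0]]) for x, y in
       [(0.0, 0.0), (3.8, 0.0), (7.6, 0.0), (7.6, 3.8), (3.8, 3.8), (0.0, 3.8)]]
print(native_contact_set(toy))  # [(0, 5)]
```

In the toy chain, residues 1 and 4 are also 3.8 Å apart but are excluded because they are separated by fewer than three intervening residues.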

Common Gō-like interaction scheme

The present formulation of the common continuum Gō model follows previous works of our group,⁵³,⁸⁸ which, in turn, were based on formalisms proposed in several earlier studies.⁷¹,⁹¹ As in Refs. 53,71,88,91, a 12–10 form is used here for the interaction between every pair of residues that belong to the native contact set. The sum total of these pairwise additive interactions is given by

$$U(\{r_{ij}\}) = \sum_{i<j-3}^{\rm native} \epsilon \left[ 5\left(\frac{r_{ij}^{n}}{r_{ij}}\right)^{12} - 6\left(\frac{r_{ij}^{n}}{r_{ij}}\right)^{10} \right] \tag{1}$$

where the summation over i,j is restricted to native contact pairs. The variable distance $r_{ij}$ is the Cα–Cα separation between residues i and j in a given conformation, and $r_{ij}^{n}$ is the corresponding separation in the PDB structure. All aspects of the present common Gō model are identical with those in Ref. 88. In particular, during dynamic simulation, if a residue pair i,j belongs to the native contact set, residues i and j are considered to be in contact when $r_{ij} < 1.2\,r_{ij}^{n}$.
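As a rough illustration of Eq. (1) and the accompanying contact criterion, the sketch below evaluates the 12–10 sum for a set of Cα coordinates. The function names (`pair_distances`, `go_energy`, `in_contact`) and the toy coordinates are assumptions made for this example. Excluded-volume terms for non-native pairs and the other parts of the model in Ref. 88 are not included, so this covers only the native-contact sum written in Eq. (1).

```python
import numpy as np

def pair_distances(coords, contacts):
    """C-alpha separations r_ij for the listed residue pairs."""
    i, j = np.asarray(contacts).T
    return np.linalg.norm(coords[i] - coords[j], axis=1)

def go_energy(coords, contacts, r_native, eps=1.0):
    """Sum over native pairs of eps * [5 (rn/r)^12 - 6 (rn/r)^10], as in Eq. (1)."""
    x = r_native / pair_distances(coords, contacts)
    return float(np.sum(eps * (5.0 * x ** 12 - 6.0 * x ** 10)))

def in_contact(coords, contacts, r_native):
    """Boolean mask: r_ij < 1.2 * r_ij^n for each native pair."""
    return pair_distances(coords, contacts) < 1.2 * r_native

# Toy C-alpha trace (made-up coordinates) with two hypothetical native pairs.
native = np.array([[0.0, 0.0, 0.0], [3.8, 0.0, 0.0], [7.6, 0.0, 0.0],
                   [7.6, 3.8, 0.0], [3.8, 3.8, 0.0], [0.0, 3.8, 0.0]])
pairs = [(0, 5), (0, 4)]
r_n = pair_distances(native, pairs)

print(go_energy(native, pairs, r_n))        # -2.0: each native pair contributes -eps
print(in_contact(2.0 * native, pairs, r_n))  # both False once distances are doubled
```

Each term has its minimum, of depth ε, exactly at the native separation, which is why the native conformation of the toy chain has energy −ε times the number of contacts.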

Interaction scheme with pairwise dbs

Implicit-water interaction potentials with dbs were motivated by potentials of mean force computed using explicit-water simulations (see, e.g., Ref. 92 and references therein). The present formulation for native–centric interactions with pairwise db follows that of Cheung et al.⁷⁶ and previous works from our group.⁵³,⁷⁷,⁷⁸ The model potential used here for the interaction scheme with pairwise db is identical with that in Eq. (4) of Ref. 77, with the native interaction term $U(r_{ij}; r_{ij}^{n}, \epsilon, \epsilon_{db}, \epsilon_{ssm})$ provided by Eqs. (2) and (3) of Liu and Chan.⁷⁸ The latter reference also contains a correction of prior typographical errors (page S77 of Ref. 78). During dynamic simulations of these models, a residue pair i,j belonging to the native contact set is considered to be in contact when $r_{ij} < 1.2\,r_{ij}^{n}$, as defined also for the common Gō model above†. Figure 1 shows examples of interaction potentials with db and their effects on folding cooperativity. Using the well-studied CI2 as an example, Fig. 1b compares chevron plots of the common Gō model and db models on the same footing. As has been reported,⁵⁴,⁷⁸ these results indicate clearly that folding/unfolding cooperativity, as manifested kinetically by the extent of the quasi-linear chevron regime,⁹³ increases with increasing model db height.

Interaction scheme with nonadditive many-body effects

To explore many-body effects, here we consider a formally nonadditive interaction scheme by augmenting the pairwise sum of native–centric contact-like interactions in Eq. (1) for the common Gō-like formulation. Aside from the short-distance excluded-volume repulsions, Eq. (1) entails a favorable energy approximately linear in the fractional number of native contacts Q:

$$U(\{r_{ij}\}) \approx -\epsilon\,\tilde{Q}_n\,Q(\{r_{ij}\}) \tag{2}$$

where $\tilde{Q}_n$ is the total number of contacts in the native (PDB) structure. We now construct a nonadditive (many-body, mb) native–centric interaction scheme that is essentially quadratic in Q by making the following change:

$$U(\{r_{ij}\}) \rightarrow U(\{r_{ij}\})_{mb} \equiv U(\{r_{ij}\}) + \epsilon\,\tilde{Q}_n\,Q(\{r_{ij}\}) - \epsilon\,\tilde{Q}_n\,[Q(\{r_{ij}\})]^2 \tag{3}$$

It follows from Eq. (2) that the attractive part of $U(\{r_{ij}\})$ and $+\epsilon\,\tilde{Q}_n\,Q(\{r_{ij}\})$ essentially cancel on the right-hand side of Eq. (3), while the excluded-volume repulsion in $U(\{r_{ij}\})$ remains operative. Hence, the attractive part of our new native–centric interaction function $U(\{r_{ij}\})_{mb}$ is approximately equal to $-\epsilon\,\tilde{Q}_n\,[Q(\{r_{ij}\})]^2$.

Fig. 1. dbs enhance folding/unfolding cooperativity in coarse-grained protein chain models. (a) Pairwise implicit-solvent model interaction potentials U(r)'s with different db heights $\epsilon_{db}$ = 0.1ϵ, 0.2ϵ, and 5ϵ/9 (red, green, and blue curves), where ϵ (> 0) is the depth of the minimum value of U(r). A general expression for these potentials was provided by Eqs. (2) and (3) of Ref. 78. The cartoon above the dbs illustrates that they originate physically as an energetic cost of water (dashed circle) exclusion (arrows) when a protein's constituent parts (yellow circles) come together during the folding process. These three model potentials are identical with those used in Fig. 3 of Ref. 78, with depths $\epsilon_{ssm}$ at the solvent-separated minima being equal to 0.2ϵ, 0.2ϵ, and ϵ/3, respectively. Included for comparison in (a) is a 10–12 potential (black curve) with the same minimum energy at contact but no db [Eq. (1) for the “without-solvation” model in Kaya and Chan⁵³]. (b) Four model chevron plots for a 64-residue truncated form of CI2 (2CI2₂₀₋₈₃) simulated using the four pairwise potentials in (a), with the same color code. To highlight the different degrees of folding/unfolding cooperativity exhibited by the models, folding (open symbols, left) and unfolding (filled symbols, right) rates of a given model are expressed in units of the folding (and unfolding) rate $k_m$ at the transition midpoint of the same model. By the same token, the interaction strength ϵ is provided by a scale centered at the interaction strength $\epsilon_m$ at the transition midpoint of the model; RT is gas constant times absolute temperature. Rates are given in units of reciprocal number of Langevin time steps. The chevron plots for the three with-db models were constructed using the same data as that in Fig. 4 of Liu and Chan,⁷⁸ whereas each of the data points for the folding and unfolding rates of the without-db model (black symbols at different ϵ values) is determined here from 200 trajectories simulated at T = 0.82. The dotted–dashed V shape is a hypothetical two-state chevron plot for all four models.

The mathematical definition of the fractional number of native contacts Q was also adjusted for our many-body interaction scheme. In the common Gō interaction scheme described above, Q does not appear in the potential function and is determined by a discrete criterion:

$$Q(\{r_{ij}\}) = \frac{\sum_{i<j-3}^{\rm native} \theta\!\left(1.2\,r_{ij}^{n} - r_{ij}\right)}{\tilde{Q}_n} \tag{4}$$

where the function θ(x) = 0 for x ≤ 0 and θ(x) = 1 for x > 0. However, since Q now appears in the many-body potential function $U(\{r_{ij}\})_{mb}$ in Eq. (3), which has to be differentiated to yield the conformational forces for molecular/Langevin dynamics simulations, a smooth rather than discrete criterion for contact is needed for $Q(\{r_{ij}\})$ to be differentiable with respect to the conformational coordinates (Cα positions) in the chain model. To this end, we modify the θ function in Eq. (4) as follows:

$$\frac{\theta\!\left(1.2\,r_{ij}^{n} - r_{ij}\right)}{\tilde{Q}_n} \rightarrow \frac{\theta_s\!\left(1.2\,r_{ij}^{n}, r_{ij}\right)}{\sum_{i<j-3}^{\rm native} \theta_s\!\left(1.2\,r_{ij}^{n}, r_{ij} = r_{ij}^{n}\right)} \tag{5}$$

where

$$\theta_s\!\left(1.2\,r_{ij}^{n}, r_{ij}\right) = \frac{1}{1 + \exp\!\left[10\left(r_{ij} - 1.2\,r_{ij}^{n}\right)/r_{ij}^{n}\right]} \tag{6}$$

is a smoothed version of the θ function. Accordingly, the fractional number of native contacts that enters into our many-body interaction potential in Eq. (3) is given by

$$Q(\{r_{ij}\}) = \frac{\sum_{i<j-3}^{\rm native} \theta_s\!\left(1.2\,r_{ij}^{n}, r_{ij}\right)}{\sum_{i<j-3}^{\rm native} \theta_s\!\left(1.2\,r_{ij}^{n}, r_{ij} = r_{ij}^{n}\right)} \tag{7}$$

wherein the denominator $\sum_{i<j-3}^{\rm native} \theta_s(1.2\,r_{ij}^{n}, r_{ij} = r_{ij}^{n}) \approx 0.88\,\tilde{Q}_n$ is a normalization factor ensuring that Q = 1 for the PDB native structure. The definition in Eqs. (5) and (7) does not preclude Q > 1 because $r_{ij} < r_{ij}^{n}$ is possible. However, $r_{ij} < r_{ij}^{n}$ is unfavorable because of excluded-volume repulsion. Thus, the population of Q > 1 conformations is insignificant and the possibility of Q > 1 is inconsequential to the physical interpretation of our results.

Before proceeding with more detailed analysis, it should be noted that models with $U_{mb}$ and similarly postulated many-body energy functions are useful but are fundamentally limited. One obvious limitation is that physical features such as dbs are not explored in conjunction with the explicit many-body terms in these models. Moreover, the simple form of $U_{mb}$ does not capture more intricate energetic effects such as local–nonlocal coupling, effects that are more likely to underlie cooperative folding in real proteins⁵⁹,⁶⁰,⁹⁴ than ad hoc schemes such as $U_{mb}$. Nonetheless, in the context of our very limited current understanding of folding cooperativity, $U_{mb}$ is valuable for providing rudimentary insights about how properties of chain models with a formally nonadditive, many-body energy function differ from those predicted using a pairwise additive Gō-like construct. In this regard, it serves an important investigative purpose similar to that afforded by other many-body schemes introduced by Eastwood and Wolynes⁷⁵ and Jewett et al.,⁹⁵ respectively, for continuum and lattice protein models. For example, the Eastwood–Wolynes energy function is a sum of terms each of which raises the number of native contacts made by an individual residue to a power p > 1 (whereas our $U_{mb}$ essentially squares the total number of native contacts as a whole). Similar to our $U_{mb}$, both of these earlier potential functions⁷⁵,⁹⁵ led to more cooperative folding because their energetic favorability increases faster than Q.
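The smoothed contact function of Eq. (6), the normalized Q of Eq. (7), and the many-body energy of Eq. (3) can be checked numerically with a short sketch. The function names (`theta_s`, `smooth_q`, `u_mb`) and the toy native distances are assumptions made for illustration. The pairwise term U is taken to be only the native-contact sum of Eq. (1), without the excluded-volume repulsion of non-native pairs, and the forces (which would require differentiating these expressions) are not computed here.

```python
import numpy as np

def theta_s(r, r_native):
    """Smoothed step function of Eq. (6), evaluated per native pair."""
    return 1.0 / (1.0 + np.exp(10.0 * (r - 1.2 * r_native) / r_native))

def smooth_q(r, r_native):
    """Fractional native contact number of Eq. (7); equals 1 when r == r_native."""
    return theta_s(r, r_native).sum() / theta_s(r_native, r_native).sum()

def u_mb(r, r_native, eps=1.0):
    """Eq. (3): U + eps*Qn*Q - eps*Qn*Q**2, with U the 12-10 sum of Eq. (1)."""
    n_native = len(r_native)  # total number of native contacts
    x = r_native / r
    u_pair = np.sum(eps * (5.0 * x ** 12 - 6.0 * x ** 10))
    q = smooth_q(r, r_native)
    return float(u_pair + eps * n_native * q - eps * n_native * q ** 2)

# Three hypothetical native C-alpha separations (angstroms).
r_n = np.array([5.0, 6.0, 7.0])
print(theta_s(r_n, r_n))    # about 0.88 for every pair, i.e. 1 / (1 + exp(-2))
print(smooth_q(r_n, r_n))   # 1.0 for the native structure
print(u_mb(r_n, r_n))       # -3.0: same native energy as Eq. (1) with eps = 1
```

The per-pair value 1/(1 + e⁻²) ≈ 0.88 at the native separation is the origin of the normalization factor ≈ 0.88 times the number of native contacts quoted in the text.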

Thermodynamic and kinetic sampling using bias potentials

To determine thermodynamic cooperativity, we obtained some of the free-energy profiles as functions of Q in the present work using the umbrella sampling method that involves histogram techniques and bias potentials.⁹⁶,⁹⁷ In this approach, a series of independent simulations was performed, each under a harmonic bias potential

$$V_s^{\rm bias}(Q) = a\,(Q - Q_s)^2 \tag{8}$$

where a is a force constant and $Q_s$ varies from simulation to simulation (labeled by s) so as to cover the entire Q range of interest. From the simulations, we obtained a series of conformational population distributions, $P_s^{*}(Q)$, over Q. The relation between the actual free energy, G(Q), of the original system (without the bias potential) and the population simulated using bias potential is given by

$$G(Q) = -RT \ln\left[P_s^{*}(Q)\right] - V_s^{\rm bias}(Q) + C_s \tag{9}$$

hereCs is a constant yet to be determined. To enhancesampling, we parametrized Vs

bias such that only a


small patch of Q values is sampled with high accuracy in each individual simulation. Then, to arrive at an accurate free-energy profile over the entire range of Q, we combined the free-energy patches, wherein the values for $C_s$ were determined by requiring that the common Q values between neighboring patches should have the same free energy.

Direct simulation of folding can be very time consuming when the folding/unfolding free-energy barrier is high. To overcome this bottleneck, we applied a bias potential to accelerate some of our folding kinetics simulations. Our approach is based on the observation that kinetic behaviors of conformations around the denatured free-energy minimum have little effect on folding time because the conformations in the denatured basin are in quasi-equilibrium. Therefore, we may add a bias potential $V^{s}_{\mathrm{bias}}(Q)$ centered at the bottom of the denatured basin so as to upshift the free energy of the denatured state and thus lower the folding barrier. The folding time, $t_f$, of the original system can then be extracted approximately from the (shortened) trajectory simulated under the bias potential:

$$t_f \approx \int e^{V^{s}_{\mathrm{bias}}(Q(t^{*}))/RT}\, dt^{*}, \tag{10}$$

where RT has the same meaning as in Fig. 1, $t^{*}$ is the time in the bias simulation, and the integral is over a given folding trajectory. A similar method called hyperdynamics was proposed by Voter.98,99 The main difference between our method and hyperdynamics is that we constrained the bias potential to be far away from the transition state. This restriction was designed to avoid errors in Eq. (10) that are expected to increase if the changes caused by the bias potential have significant effects near the peak of the folding barrier.
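The two uses of bias potentials described in this section can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used for this work: histograms are keyed by integer native-contact counts n (with Q = n / `n_native`), each C_s is taken as the mean offset over the Q values shared with the already-assembled profile (one simple way to make neighboring patches agree), and the integral of Eq. (10) is approximated by a sum over equally spaced time steps. The names `unbiased_patch`, `stitch_patches`, and `folding_time`, and the callable `bias`, are hypothetical.

```python
import math


def unbiased_patch(histogram, a, q_center, n_native, rt=1.0):
    """Eq. (9) without C_s for one biased run.

    histogram maps native-contact count n -> number of sampled conformations.
    Returns n -> -RT ln P*_s(Q) - a (Q - Q_s)^2, with Q = n / n_native.
    """
    total = sum(histogram.values())
    patch = {}
    for n, count in histogram.items():
        if count > 0:
            q = n / n_native
            patch[n] = -rt * math.log(count / total) - a * (q - q_center) ** 2
    return patch


def stitch_patches(patches):
    """Join neighboring patches; each C_s is the mean offset over shared Q values."""
    combined = dict(patches[0])
    for patch in patches[1:]:
        common = [n for n in patch if n in combined]
        if not common:
            raise ValueError("neighboring patches must share at least one Q value")
        c_s = sum(combined[n] - patch[n] for n in common) / len(common)
        for n, g in patch.items():
            # Keep the value already assembled for shared Q; add shifted new ones.
            combined.setdefault(n, g + c_s)
    return combined


def folding_time(q_trajectory, dt, bias, rt=1.0):
    """Eq. (10) as a sum: each biased step of length dt counts exp(V_bias(Q)/RT) * dt."""
    return sum(math.exp(bias(q) / rt) * dt for q in q_trajectory)
```

A typical use would build one patch per biased run with `unbiased_patch`, order the patches by their window centers, and pass the list to `stitch_patches`. For kinetics, `bias` should vanish away from the denatured basin, so that steps near the barrier contribute their actual duration.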

Results and Discussion

Figure 2 applies the three interaction schemes described above to the 45-residue PSBD protein 1W4E126–170 and compares the cooperativity of the resulting folding/unfolding processes. The results indicate that model cooperativity increases in the following order: common Gō → db → many-body effects. The trajectory in Fig. 2a implies that folding/unfolding of the common Gō model for 1W4E126–170 is noncooperative as the kinetics shows no clear separation between a folded (high-Q) and an unfolded (low-Q) population. In contrast, the trajectories in Fig. 2b and c show that the model with db and the model with many-body effects for the same protein are both more cooperative, exhibiting two-state-like folding/unfolding transitions. Here, the model with many-body effects is seen to be more cooperative because its transitions are significantly sharper than those for the model with db.

Fig. 2. Kinetic behaviors of the three classes of native–centric interaction schemes described in the text. This figure shows typical simulated folding/unfolding trajectories for the 45-residue 1W4E126–170 using (a) the common Gō model, (b) the model with db, and (c) the model with many-body effects, computed at the models' respective transition midpoints. Q is fractional number of native contacts; t is model time. For this and all subsequent simulations in this work with db, we used db height ɛdb=0.1ɛ and solvent-separated minimum depth ɛssm=0.5ɛ (see Refs. 77,78).

Fig. 3. 1W4E126–170 has more native contacts than 2BTH126–170. Both the native structures for 2BTH (a and c) and 1W4E (b and d) shown in this figure are the “best representative conformer” (conformer 1) among these proteins' respective ensemble of 20 NMR-determined structures provided in the PDB. Thick lines in (a) and (b) are the native Cα traces for 2BTH126–170 (a, green) and 1W4E126–170 (b, red). Thin lines joining a pair of Cα positions indicate native contacts (as defined in the text). (a) shows the 14 contacts that are present between native chain positions (residues i,j) in 2BTH126–170 but are absent between the corresponding native chain positions in 1W4E126–170. Conversely, (b) depicts the 23 native contacts in 1W4E126–170 that are not in 2BTH126–170. (c) and (d) contrast the spatial separations between the residues at the same pair of positions 143 and 166 in 2BTH and 1W4E (both with a tryptophan at the mutated position 166). The residue pairs are in contact in the native structure of 1W4E in (d) but not in the native structure of 2BTH in (c).

Fig. 4. Comparing the native contact maps of 1W4E126–170 and 2BTH126–170. The two PDB structures analyzed here are the same as those in Fig. 3. (a) Contacts (i,j) marked in black, red, and green are present, respectively, in both 1W4E and 2BTH, in 1W4E only, and in 2BTH only. The dashed diagonal is given by i − j = 5 and may be viewed as a demarcation between local and nonlocal contacts. (b) The structures of 1W4E126–170 (thick red trace) and 2BTH126–170 (thick green trace) are superposed to minimize their root-mean-square deviation, with all their respective native contacts indicated by thin red and green lines.

Differences in the native contact patterns of 2BTH and 1W4E

Figures 3 and 4 compare the native contact patterns of the two PSBD proteins 2BTH126–170 and 1W4E126–170 with the goal of providing a rationalization for the significantly different degrees of cooperativity indicated by experimental folding data from the Fersht group32 (Fig. 5). Figure 3a and b show that there are notably more contacts that can serve to tie the two helices together in 1W4E126–170 (Fig. 3b) than in 2BTH126–170 (Fig. 3a). This pattern is underscored by the lack of contact between the mutated residue Trp166 and position 143 in 2BTH126–170 (Fig. 3c), in contrast to the existence of such a contact in 1W4E126–170 (Fig. 3d). The number of native contacts for 2BTH126–170 is 66, which is smaller than the corresponding number 75 for 1W4E126–170. Nonetheless, the differences in their topologies are quite subtle. Figure 4 shows that their native contact patterns have much in common (Fig. 4a), and they have very similar native folds (Fig. 4b).

Fig. 5. Logarithmic folding/unfolding relaxation rates (kobs measured in s−1) as functions of free energy of folding ΔGf. Data plotted for (a) 39-residue fragment of NTL9 (1CQU1–39, open and filled squares) and (b) E3BD F166W (1W4E126–170, filled circles) and BBL H166W (2BTH126–170, open circles) are deduced, respectively, from the experimental measurements of Horng et al.73 and Ferguson et al.32 using the formulation described in the text. The relaxation rates shown by open and filled squares in (a) were taken, respectively, from the urea- and GuHCl-dependent chevron plots for wild-type NTL91–39 at T0 = 298.15 K (25 °C) in Figs. 5 and 7 of Ref. 73. Those shown by filled and open circles in (b) were taken, respectively, from the urea-dependent chevron plots for E3BD F166W and BBL H166W at T0 = 298 K in Fig. 9c of Ref. 32. In (a), the ΔGf0 and meq values needed to relate urea concentration to ΔGf in our formulation [Eq. (13)] were set by the experimental equilibrium ΔGUo(H2O) and m values listed for NTL91–39 in Table 3 of Ref. 73 [ΔGf0 = −ΔGUo(H2O), meq/RT0 = m], but the experimental equilibrium m value for GuHCl was not available (see discussion in the text). In (b), the ΔGf scale was set by the experimental equilibrium ΔGD–N and mD–N values for urea determined using fluorescence and provided in Table 2 of Ref. 32 for E3BD F166W and BBL H166W [ΔGf0 = −ΔGD–N, meq/RT0 = mD–N]. In both (a) and (b), each solid V shape is a hypothetical two-state chevron plot consistent with the given protein's native thermodynamic stability [satisfying ln(kf([d])/ku([d])) = −ΔGf([d])/RT0]. Dashed V shapes are least-square two-state fits to the relaxation rates; thus, they do not necessarily coincide with the hypothetical two-state chevron plots constructed by assuming two-state folding/unfolding cooperativity (see the text for details).

Criteria for folding cooperativity

Folding cooperativity is fundamentally a thermodynamic property, characterized traditionally by a van't Hoff-to-calorimetric enthalpy ratio ΔHvH/ΔHcal≈1. This condition implies a two-state-like enthalpy distribution at the transition midpoint, with a high free-energy barrier separating the folded and unfolded states of a given protein.49 Recent analyses56,62 showed that the ΔHvH/ΔHcal≈1 condition is intimately related to a high Z score (commonly used in protein structure prediction) as well as a large Tf/Tg ratio between folding and glass-transition temperatures in energy landscape theories (see Eqs. (5) and (6) on pp. 361–362 of Ref. 49)‡. In lieu of or in addition to equilibrium data on whether the conformational distribution of a protein is two-state-like, one means to assess folding/unfolding cooperativity is by ascertaining the linearity, or lack thereof, of the arms of the protein's chevron plot.49,53,54,93 Model studies indicated that a cooperatively folding protein with a single high free-energy barrier should produce approximately, though not perfectly, linear chevron arms within the range of native stability that is experimentally accessible.60,93 To facilitate comparison between experimental and model chevron plots, we found it useful to express experimental folding and unfolding rates as functions of experimental native stability instead of denaturant concentration. Such a change in variable is desirable because the corresponding model rates are also readily expressed in terms of model native stability, thus allowing experimental and model folding/unfolding rates to be compared on the same footing.53,77,93 In accordance with this investigative logic, Fig. 5 replots the experimental chevron plots from Horng et al.73 for NTL9 (1CQU) and from Ferguson et al.32 for the two PSBD mutants E3BD F166W (1W4E) and BBL H166W (2BTH) as functions of native stability, in anticipation of comparison with the corresponding simulated model chevron plots in Figs. 6 and 7. The formulation we have used to analyze the experimental data to arrive at Fig. 5 is as follows.

‡Note that the following sentence on p. 361 of Ref. 49 needs to be corrected: “Z is defined as the width of energy variation among a given set of denatured structures (nonnative decoy conformations) divided by the average energy separation between the native and the denatured structures” should read “Z is defined as the average energy separation between the native and a given set of denatured structures (nonnative decoy conformations) divided by the width of energy variation among the denatured structures.”

Chevron plots as functions of native stability: Analysis of existing experimental data

Chevron analyses conducted under constant temperature T=T0 provide the

logarithmic folding rate ($k_f$) or unfolding rate ($k_u$) as functions of denaturant (urea or GuHCl) concentration [d]. When folding/unfolding is cooperative, the relationship between logarithmic folding or unfolding rate and [d] is approximately linear:

$$\ln k_f([d]) = \ln k_f^{0} + \frac{m_f}{RT_0}\,[d] \tag{11}$$

$$\ln k_u([d]) = \ln k_u^{0} + \frac{m_u}{RT_0}\,[d] \tag{12}$$

where $k_f^{0}$ and $k_u^{0}$ are the folding and unfolding rates, respectively, at zero denaturant concentration. For chevron plots with rollovers, however, one or both of these linear relationships would not hold. Nonetheless, irrespective of whether Eqs. (11) and (12) hold, the dependence of $k_f$, $k_u$ on [d] may be replaced by dependence on native stability (free energy of folding $\Delta G_f$), as discussed in our previous studies.54,77 In many instances, even including certain situations with chevron rollovers as for the 2BTH case considered below, an approximate linear relation holds between $\Delta G_f$ and [d]:

$$\Delta G_f([d]) = \Delta G_f^{0} + m_{\mathrm{eq}}\,[d] \tag{13}$$

where $\Delta G_f^{0}$ is the free energy of folding in zero denaturant. For all cases in Fig. 5 except NTL9 in GuHCl, this procedure is readily applied because Eq. (13) holds approximately and $m_{\mathrm{eq}}$ is available from experiment. In those situations, hypothetical two-state chevron plots, shown as solid V shapes in Fig. 5, are constructed by first making a linear fit to the folding arm of the ln kobs chevron data to provide a fitted function $k_f^{(\mathrm{fit})}(\Delta G_f/RT_0)$. Then, a straight line for the fitted unfolding rate $k_u^{(\mathrm{fit})}(\Delta G_f/RT_0)$ is drawn to (i) pass through the intersection point of $k_f^{(\mathrm{fit})}(\Delta G_f/RT_0)$ with $\Delta G_f = 0$ (vertical dotted lines in Fig. 5; i.e., $k_f^{(\mathrm{fit})} = k_u^{(\mathrm{fit})}$ at $\Delta G_f = 0$) and (ii) possess a slope that satisfies $\Delta G_f = -RT_0 \ln(k_f^{(\mathrm{fit})}/k_u^{(\mathrm{fit})})$. In contrast, the dashed V shapes in Fig. 5 are fits to the experimental kinetic chevron data alone by assuming that $k_{\mathrm{obs}} = k_f + k_u$ and that $\ln k_f$ and $\ln k_u$ are linear in [d] but without imposing the $\Delta G_f = -RT_0 \ln(k_f/k_u)$ constraint. Therefore, the $\Delta G_f$ values at the vertices of dashed V shapes do not necessarily coincide with $\Delta G_f = 0$, although they are often close (NTL9 in urea in Fig. 5a and BBL H166W in Fig. 5b) or approximately equal (E3BD F166W in Fig. 5b) to it. For the case of NTL9 in GuHCl for which experimental $m_{\mathrm{eq}}$ is not available, no solid V shape is provided and $\Delta G_f([d])$ used for plotting this data set in Fig. 5a is taken to equal $-RT_0 \ln[k_f([d])/k_u([d])]$ with $k_f([d])$ and $k_u([d])$ from the kinetic fit.

The resulting plots in Fig. 5a show that 1CQU1–39

exhibits an extensive linear chevron regime indicative of cooperative folding. It is noteworthy as well that 1CQU1–39 chevron behaviors from urea and GuHCl data (open and filled squares) become very similar when both are plotted as functions of native stability, suggesting a basic relationship that isothermal folding/unfolding rates in the two denaturants are essentially equal under isostability conditions.54,77 In comparison with the cooperative behavior of 1CQU1–39, folding cooperativity of 1W4E126–170 may be less definitive. Its chevron (lower plot in Fig. 5b) shows some sign of cooperative behavior as it has an extended linear folding arm, but data are limited on the unfolding arm. Additionally, Fig. 5 shows that 1W4E126–170 generally folds considerably faster than 1CQU1–39, implying that the overall free-energy barrier at the transition midpoint to folding is lower (and thus folding is less cooperative) for 1W4E126–170 than for 1CQU1–39. Even so, 1W4E126–170 is clearly more cooperative than 2BTH126–170, as is evident from the latter's chevron (upper plot in Fig. 5b), which exhibits the signature noncooperative behavior of a sizable mismatch between experimental folding data and a hypothetical two-state chevron. Figure 5b shows further that 2BTH126–170 folds much faster than 1W4E126–170, implying that 2BTH126–170 has even lower overall free-energy barriers to folding than 1W4E126–170. Based on these experimental observations, it is reasonable to conclude that the degree of folding cooperativity of 2BTH126–170 is lowest among the three proteins studied here.

Fig. 6. The rank ordering of folding/unfolding cooperativity of 1CQU1–39, 1W4E126–170, and 2BTH126–170 is invariant across three different native–centric model interaction schemes. Results for the three proteins are provided, respectively, in the left, center, and right columns of this figure. Top: Free-energy profiles as functions of Q. As in our previous studies,53,78,88 ΔG/RT = −ln P(Q), where P(Q) is the probability of the chain adopting a conformation with fractional number of native contacts equal to Q. In each of the top panels, the free-energy profiles with the lowest, middle, and highest overall barrier heights were computed, respectively, by using the common Gō model, the model with db, and the model with many-body effects. Bottom: Model chevron plots for the three proteins. Here, ln(rate) = −ln(MFPT), where MFPT is the kinetically simulated mean first passage time of folding or unfolding, and ΔGf was determined from thermodynamic sampling of the models using the method described in the text. Kinetics at different ΔGf/RT values was simulated by varying T while keeping ɛ constant. For the present models with temperature-independent potential functions, the resulting rates should be essentially identical with those obtained by varying ɛ while keeping T constant.54 Consistent with the results in the top panels, in each of the bottom panels, the chevron plots with the fastest (circles), middle (squares), and slowest (triangles) folding and unfolding rates were simulated, respectively, using the common Gō model, the model with db, and the model with many-body effects. Each V shape is a hypothetical two-state chevron plot consistent with the thermodynamic stability of the given protein model. For clarity, note that the ΔG/RT (top panels) and ln(rate) (bottom panels) values for different proteins were plotted in different vertical scales except that the same ΔG/RT scale was used for 1W4E and 2BTH. The free-energy profile for the 1CQU model with many-body effects was computed using bias sampling. Each of the other eight free-energy profiles in the top panels was obtained from direct simulation for a duration of 10^9 time steps encompassing ∼20–100 unfolding/folding transition events. Each data point in the chevron plots in the bottom panels was averaged from at least 100 folding or unfolding trajectories of direct Langevin dynamics simulation, except for the several slowest rates [ln(rate) < −19] on the folding arm for which sampling with bias potentials was used.
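The construction of a hypothetical two-state chevron on the native-stability axis, as described in the formulation above, can be sketched as follows. This is an illustrative reading of Eqs. (11)–(13) and of conditions (i) and (ii), not the analysis script used for Fig. 5. The function names, the plain least-squares helper, and the choice to return ln kobs = ln(kf + ku) are assumptions made for the example.

```python
import math


def stability_axis(d, dg0, m_eq, rt0):
    """Eq. (13) in units of RT0: Delta G_f([d]) / RT0 for each concentration."""
    return [(dg0 + m_eq * di) / rt0 for di in d]


def linear_fit(x, y):
    """Ordinary least-squares line; returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def two_state_chevron(x_fold, ln_k_fold, x_grid):
    """Hypothetical two-state chevron versus x = Delta G_f / RT0.

    The folding arm is fitted linearly. The unfolding arm then follows from
    Delta G_f = -RT0 ln(kf/ku), i.e. ln ku = ln kf + x, so the two arms
    cross at x = 0 and the unfolding slope is the folding slope plus one.
    """
    slope, intercept = linear_fit(x_fold, ln_k_fold)
    rows = []
    for x in x_grid:
        ln_kf = intercept + slope * x
        ln_ku = ln_kf + x
        hi = max(ln_kf, ln_ku)  # numerically stable ln(kf + ku)
        ln_kobs = hi + math.log(math.exp(ln_kf - hi) + math.exp(ln_ku - hi))
        rows.append((x, ln_kf, ln_ku, ln_kobs))
    return rows
```

Experimental ln kobs points mapped through `stability_axis` can then be overlaid on the rows returned by `two_state_chevron`. A mismatch between the two corresponds to the deviation from two-state behavior discussed in the text.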

Fig. 7. Further illustration of the invariance of rank ordering of folding/unfolding cooperativity. The data in Fig. 6 are replotted here such that results along each column are for the same native–centric interaction scheme (as indicated). In each of the top panels, the free-energy profiles with the lowest, middle, and highest overall barrier heights are, respectively, for 2BTH126–170, 1W4E126–170, and 1CQU1–39. Their chevron plots correspond, respectively, to those with the fastest (circles), middle (squares), and slowest (triangles) folding and unfolding rates in the bottom panels.


Multiple-model simulations show a clear rank ordering of cooperativity consistent with experiments

Figure 6 compares the folding cooperativity of the three proteins modeled by our three interaction schemes. The thermodynamic free-energy profiles (Fig. 6, upper panels) of a given protein modeled by different interaction schemes vary, but the barrier height in every interaction scheme considered in Fig. 6 consistently follows the order 1CQU1–39 > 1W4E126–170 > 2BTH126–170, in agreement with the above experimental assessment. Kinetically, to express the model folding and unfolding rates as functions of ΔGf/RT (Fig. 6, lower panels), we first determined the Q values for the minimum for the native state (Qmin,N) and the minimum for the denatured state (Qmin,D) along a given model protein's free-energy profile. We then computed ΔGf/RT = ln[P(Q < Qmin,D)/P(Q > Qmin,N)]. In all cases studied in this work, the relationship between the interaction strength over temperature, ɛ/T, and ΔGf/RT determined in this manner was essentially linear. The resulting kinetic chevron plots in the lower panels of Fig. 6 show that the common Gō models led to severe rollovers with very little or no overlap between hypothetical two-state chevrons and directly simulated folding and unfolding rates (top chevron in each of the lower panels). The extent of corresponding overlaps between simulated rates and hypothetical two-state chevrons is larger for db models and largest for models with many-body terms (see middle and bottom chevrons, respectively, in each of the lower panels). At least for the many-body interaction scheme, the extent of the quasi-linear model chevron regime is significantly larger for 1CQU1–39 than for the two PSBD proteins, a trend that is again consistent with the higher degree of experimental folding cooperativity stipulated above for 1CQU1–39.

Figure 7 replots the data in Fig. 6 to provide additional perspectives. It shows that folding cooperativity of the model interaction schemes follows the order common Gō < db < many-body. Thus, the trend observed in Fig. 2 for 1W4E126–170 is now seen to be applicable to all three proteins studied. Moreover, consistent with the heights of the barriers along the model free-energy profiles (Fig. 7, upper panels), the rank ordering of directly simulated folding rates of the three proteins (Fig. 7, lower panels) is 1CQU1–39 < 1W4E126–170 < 2BTH126–170 for all three model interaction schemes. This trend agrees with the experimental rank ordering of folding rates in Fig. 5.
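The model native stability used as the horizontal axis of the simulated chevron plots, ΔGf/RT = ln[P(Q < Qmin,D)/P(Q > Qmin,N)], can be computed from a sampled population distribution along the following lines. The sketch assumes a discretized P(Q) and locates the two minima as the most populated Q values on either side of a user-supplied dividing value `q_divide`. That parameter and the function name `folding_free_energy` are hypothetical conveniences for this example, and the text does not specify how the minima were located.

```python
import math


def folding_free_energy(q, p, q_divide):
    """Return (Delta G_f / RT, Q_min_D, Q_min_N) from a population profile.

    q: sorted Q values; p: corresponding populations P(Q).
    Free-energy minima are population maxima, since G/RT = -ln P(Q).
    """
    denatured = [(pi, qi) for qi, pi in zip(q, p) if qi < q_divide]
    native = [(pi, qi) for qi, pi in zip(q, p) if qi >= q_divide]
    q_min_d = max(denatured)[1]
    q_min_n = max(native)[1]
    # Strict inequalities, following the expression quoted in the text.
    p_d = sum(pi for qi, pi in zip(q, p) if qi < q_min_d)
    p_n = sum(pi for qi, pi in zip(q, p) if qi > q_min_n)
    return math.log(p_d / p_n), q_min_d, q_min_n


# Hypothetical bimodal population profile.
q = [0.1, 0.2, 0.3, 0.5, 0.7, 0.8, 0.9]
p = [0.10, 0.20, 0.15, 0.02, 0.13, 0.25, 0.15]
dg_over_rt, q_d, q_n = folding_free_energy(q, p, q_divide=0.5)
print(q_d, q_n, round(dg_over_rt, 3))
```

A negative value indicates that the native side is more populated than the denatured side under the sampled conditions.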

dbs limit native-state conformational fluctuations

Figure 7 also highlights the merits and limitations of the three models, especially with regard to using the db versus many-body modeling schemes to achieve a higher degree of folding cooperativity. Beginning with the common Gō model, we note that although this particular form of native–centric potential was designed to fold to a given PDB structure, each of the native minima along the given protein's free-energy profile has at most Q≈85% of the native contacts. Similar trends have been observed in previous applications of the common Gō formulation to larger proteins.53,71 This somewhat paradoxical feature suggests that the native fluctuations in the common Gō models are likely too large to be quantitatively realistic. In contrast, db models tend to entail significantly smaller native fluctuations, that is, less floppy native states, an apparent advantage we have noted elsewhere.78 For the three proteins studied here, the native free-energy minima in the upper middle panel of Fig. 7 for the db models are either at or very near Q=100%. Interestingly, even though the present many-body interaction scheme is more cooperative than our db interaction scheme, native fluctuations are wider in the many-body scheme (free-energy minima at Q≈90%) than in the db scheme. Considerable native fluctuations are also present in the recent many-body interaction scheme of Cho et al.40 These observations underline that the form of cooperativity of a model (e.g., whether it arises from db,54,76 local–nonlocal coupling mechanisms60 such as hydrophobic burial of hydrogen bonds,59,100 energy as certain simple nonlinear functions of Q or contact number,40,75,95 or a combination of these effects) can be critical in the success or failure in capturing certain experimental properties of cooperative protein folding. As is evident above and also from the rationalization of native topology-dependent folding rates,49,94 particular forms of many-body interaction, not merely the overall cooperativity of a model, are expected to be required to capture many of the real protein properties associated with cooperative folding.

Fig. 8. Dependence of experimental and simulated logarithmic folding rates on RCO. The experimental folding rates of (from left to right) 2BTH126–170, 1W4E126–170, and 1CQU1–39 at their respective denaturant-induced transition midpoints (filled diamonds) and their extrapolated folding rates at zero denaturant (filled circles) were taken from Horng et al.73 and Ferguson et al.32 The corresponding simulated folding rates of these proteins are given in the lower part of this figure for the interaction schemes of the common Gō model (asterisks), model with db (open circles), and model with many-body effects (filled squares). These model rates were simulated at each model's transition midpoint. Lines are least-square fits to the experimental and simulated folding rates of the three proteins we studied. Included for comparison are folding rate data (open squares) for the 12 proteins of lengths 63–104 analyzed originally in Fig. 1a and Table 1 of Plaxco et al.101 All RCO values here were calculated by the method in this reference. The unit for the experimental folding rates is s−1, whereas each simulated folding rate is the reciprocal of mean first passage time measured by the number of Langevin time steps.

Cooperativity increases folding speed diversity

Figure 8 compares the diversity of folding rates predicted by the three interaction schemes with that observed experimentally. The experimental folding rates at zero denaturant of the three small proteins studied here (filled circles) are significantly faster than those of larger single-domain proteins with 63–104 amino acid residues (open squares) and comparable values of relative contact order (RCO).101 The experimental folding rates of the three small proteins at their respective transition midpoints are expectedly slower (filled diamonds), but the range spanned by the midpoint folding rates is essentially the same as that spanned by the zero-denaturant folding rates. The simulated logarithmic folding rates in the lower part of Fig. 8 correlate well with RCO, in agreement with experiment. Moreover, at least for the three interaction schemes explored in this work, diversity in folding rates increases with increasing folding cooperativity. The range of transition-midpoint folding rates predicted by the common Gō model is significantly narrower than that observed experimentally (compare the asterisks with the filled diamonds in Fig. 8), consistent with results from several previous studies using common Gō modeling.91,94,95 In comparison, folding rates from models with db afford a wider range (open circles), as we have also observed among db models for a collection of larger proteins.102 Among the three model interaction schemes here, however, the one with many-body effects is the only one that led to a diversity in midpoint folding rates (filled squares) quantitatively similar to that measured experimentally. The trend observed here is consistent with several previous studies finding that various forms of many-body interactions can enhance folding speed diversity as well as improve the correlation between RCO and logarithmic folding rate.94,95,103,104

Robustness of cooperativity rank-ordering predictions

As specified above, each of the native–centric chain models we employed to rank order cooperativity was based upon one chosen (target) conformation among the multiple NMR structures deposited in the PDB for the given protein. Consequently, questions might be raised as to whether the target conformations can adequately reflect the behaviors of their respective NMR ensembles, even recognizing that the native conformations chosen to model the two PSBD proteins have already been identified as “best representative” by the PDB. Table 1 places the contact information of the target conformations in a broader context of native contact statistics for the three proteins' NMR ensembles to address this issue. Table 1 provides data for four contact definitions, namely, residue and atomic contacts each with two different cutoff criteria. The data show that even taking into account the variation within each ensemble, the contact properties of the three proteins are well separated, and thus, their rank ordering should be robust. For the contact definitions with a 4.5-Å cutoff, there is essentially no overlap in the numbers of either residue or atomic contacts of the three proteins except for a small overlap (residue contact counts 81–83) between 1CQU1–39 and 1W4E126–170. The native contact number ranges have more overlaps for the contact definitions with a 6.5-Å cutoff when local contacts are not excluded (no i < j − 3 restriction). Nonetheless, the rank ordering of average contact density (average contact number per residue) is consistently 2BTH126–170 < 1W4E126–170 < 1CQU1–39 across all four native contact definitions considered, echoing the rank ordering of cooperativity determined above.
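Contact statistics of the kind compared in Table 1 can be sketched with a short routine. The code below is an illustration under stated assumptions rather than the procedure used to build the table: each residue is given as a list of non-hydrogen atom coordinates in Å, the contact order is computed with the normalized form of Plaxco et al. (sum of sequence separations divided by chain length times number of contacts), and the same normalized form is assumed for the residue-contact CO. The per-residue density is taken as 2N/L, which matches the ratios of the average contact numbers to chain lengths listed in Table 1. The function name `contact_statistics` and its parameters are hypothetical.

```python
import math


def contact_statistics(residues, cutoff=4.5, min_sep=4):
    """Count native contacts for one structure.

    residues: list (one entry per residue) of lists of (x, y, z) heavy-atom
    coordinates. Residues i and j (j - i >= min_sep) are in contact if at
    least one heavy-atom pair, one atom from each residue, is closer than
    cutoff. min_sep=4 corresponds to i < j - 3; min_sep=1 counts all i < j.
    """
    length = len(residues)
    n_res = n_atom = 0
    sep_res = sep_atom = 0
    for i in range(length):
        for j in range(i + min_sep, length):
            close = sum(
                1
                for a in residues[i]
                for b in residues[j]
                if math.dist(a, b) < cutoff
            )
            if close:
                n_res += 1
                n_atom += close
                sep_res += j - i
                sep_atom += close * (j - i)
    return {
        "residue_contacts": n_res,
        "atomic_contacts": n_atom,
        "residue_contacts_per_residue": 2.0 * n_res / length,
        "co_residue": sep_res / (length * n_res) if n_res else 0.0,
        "rco_atomic": sep_atom / (length * n_atom) if n_atom else 0.0,
    }


# Hypothetical six-residue hairpin with one heavy atom per residue.
toy = [[(0.0, 0.0, 0.0)], [(3.8, 0.0, 0.0)], [(7.6, 0.0, 0.0)],
       [(7.6, 4.0, 0.0)], [(3.8, 4.0, 0.0)], [(0.0, 4.0, 0.0)]]
stats = contact_statistics(toy)
```

Calling the routine with `cutoff=6.0, min_sep=1` corresponds to the second contact definition in the footnote to Table 1. Running it over every model in an NMR ensemble would give ranges and averages analogous to those tabulated.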

Table 1. Native contact statistics among NMR structures of t

No. of NMR structures in PDBib j−3, 4.5 Å cutoff Residue contacts Contact no. in ou

Range of contacAverage conta

Average contact no.COa

Average CAtomic contacts Contact no. in ou

Range of contacAverage conta

Average contact no.ib j, 6.0 Å cutoff Residue contacts Contact no. in ou

Range of contacAverage conta

Average contact no.Atomic contacts Contact no. in ou

Range of contacAverage conta

Average contact no.RCOb

Average RC

In the present consideration, for two amino acid residues (i,j) to be ineach residue, need to be separated by bdc in the PDB structure. Insimulations, dc=4.5 Å, and only contacts with ib j−3 are counted. In tcounted. In this table, the number of residue contacts for a given strucnative contact definition. For each such residue pair, we also enumeratthat are bdc apart. The number of atomic contacts for a given structurover all residue pairs that belong to the native contact set. Numbers osimulations. Range and average refer to the multiple NMR structures

a CO is calculated using only residue contact counts as in Wallin ab RCO was computed using atomic contact counts according to the

Table 1 also provides the target conformations'contact order (CO) values from residue contactcounts88 and their RCOs from atomic contactcounts,101,105 as well as ensemble averages of COand RCO. All four of these parameters for nativetopological complexity are higher for 1CQU1–39 thanfor the two PSBD proteins, whereas three of theseparameters are higher for 1W4E126–170 than for2BTH126–170. The former trend rationalizes theslower simulated and experimental folding rates of1CQU1–39 than that of the other two proteins. Thelatter trend rationalizes the faster simulated and ex-perimental folding rates for 2BTH126–170 than for1W4E126–170 (Fig. 8),101 notwithstanding that theresidue-contact CO value—one of the topologicalcomplexity parameters—is identical for the two tar-get PSBD conformations used for model simulations.

Variation in solvent viscosity causes nosignificant change in the shape of modelchevron plot

As stated above, the kinetic simulations for the model chevron plots in Figs. 6 and 7 were conducted using the same parameters for Langevin dynamics, with viscosity coefficient γ=0.0125, as in our previous studies.53,88 This choice of γ, which corresponds to a low-friction regime and fast dynamics,90 was motivated primarily by the computational tractability it entails. However, the solvent viscosity modeled by this γ is orders of magnitude lower than that of water.90 Since certain properties of folding dynamics may depend on solvent viscosity106 in a complex107 manner, a question arises about the robustness of our theoretical predictions: are cooperativity-related features in model chevron plots, such as their linear and rollover behaviors, sensitive to solvent viscosity?

Figure 9 addresses this issue by comparing model chevron plots simulated under different coefficients of viscosity, using 1CQU1–39 models with the common Gō interaction scheme as an example. Folding rates naturally depend on solvent viscosity. The dependence of two sets of representative folding rates on γ is provided in Fig. 10. The plots show the expected Kramers rate turnover108,109 (the rate is maximal at a certain intermediate viscosity), as has been observed in earlier viscosity-dependent simulations of protein folding.110,111 Nonetheless, despite this overall dependence of rates on γ, Fig. 9 demonstrates that, aside from some minor variations (which may warrant further analysis in future work), the shape of the chevron plot with its rollover features remains essentially unchanged over more than 5 orders of magnitude of variation in viscosity. Physically, according to the analysis of Veitshans et al., the range of γ values covered by Fig. 9 includes a high-friction regime comparable to that of liquid water (γ≈5.0–22.5).90 Taken together, these results support our contention that the rollovers in common Gō model chevron plots observed above and previously by us53,54,112 as well as by other groups113,114 are a robust consequence of the low levels of folding cooperativity of these models, not an artifact of the low-viscosity conditions used for the simulations.

[…] the three proteins studied

[…] | 1CQU1–39 | 1W4E126–170 | 2BTH126–170
[…] | 18 | 20 | 20
[…]r model | 93 | 75 | 66
[…]t nos. | 81–95 | 74–83 | 59–69
[…]ct no. | 88.1 | 78.0 | 64.2
[…]per residue | 4.52 | 3.47 | 2.85
[…] | 0.411 | 0.278 | 0.278
[…]O | 0.415 | 0.297 | 0.253
[…]r model | 810 | 485 | 393
[…]t nos. | 712–849 | 485–576 | 363–427
[…]ct no. | 745.9 | 537.2 | 398.4
[…]per residue | 38.3 | 23.9 | 17.7
[…]r model | 244 | 246 | 233
[…]t nos. | 233–244 | 244–255 | 220–239
[…]ct no. | 238.2 | 250.3 | 227.1
[…]per residue | 12.2 | 11.1 | 10.1
[…]r model | 4956 | 4660 | 4422
[…]t nos. | 4698–5062 | 4660–4916 | 4372–4690
[…]ct no. | 4788.6 | 4776.0 | 4490.6
[…]per residue | 245.6 | 212.3 | 199.6
[…] | 0.218 | 0.140 | 0.112
[…]O | 0.215 | 0.143 | 0.111

[…] native contact, at least one pair of non-hydrogen atoms, one from […] the first definition (top), which is the one used in our model […] the second definition (bottom), dc=6.0 Å, and all contacts (i<j) are […]ture counts the number of (i,j) residue pairs that satisfy the given […]e the number of non-hydrogen atom pairs (one from each residue) […]e is then the sum total of these atomic counts for individual (i,j)'s […]f contacts in boldface are for the native conformations used in our […] in the PDB for the given protein. […]nd Chan [Eq. (4) in Ref. 88]. […] original definition of Plaxco et al. (Ref. 101; see also Ref. 105).

Fig. 9. Model chevron plots for 1CQU1–39 simulated using the common Gō interaction scheme at various γ values (as marked) for the viscosity term in the Langevin dynamics. As in Figs. 6 and 7, ln(rate)=−ln(MFPT), where each folding (filled symbols) or unfolding (open symbols) MFPT data point was averaged from 500 trajectories. Fitted curves are a guide for the eye.
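The rate convention used for the model chevron plots, ln(rate)=−ln(MFPT), can be illustrated with a minimal Python sketch. The function name and the first-passage times below are hypothetical and serve only to show the arithmetic; they are not data from the simulations reported here.

```python
import math

def ln_rate_from_first_passage_times(first_passage_times):
    """Return ln(rate) = -ln(MFPT) for a set of first-passage times.

    MFPT is the arithmetic mean of the first-passage times, one value
    per trajectory (the text averages 500 trajectories per data point).
    """
    if not first_passage_times:
        raise ValueError("at least one first-passage time is required")
    if any(t <= 0 for t in first_passage_times):
        raise ValueError("first-passage times must be positive")
    mfpt = sum(first_passage_times) / len(first_passage_times)
    return -math.log(mfpt)

# Hypothetical first-passage times in simulation time steps (illustrative only).
example_times = [1.0e5, 2.0e5, 3.0e5]
print(ln_rate_from_first_passage_times(example_times))  # equals -ln(2.0e5)
```

Because the logarithm is taken of the mean time rather than averaged over individual trajectories, a few very long trajectories can dominate a data point, which is one reason a large number of trajectories per point is useful.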

Fig. 10. Nonmonotonic dependence of model folding rate on viscosity. Logarithmic folding rate of the common Gō model for 1CQU1–39 at the transition midpoint (ΔGf=0, lower curve) and under strongly folding conditions (ΔGf=−16RT as an example, upper curve) versus logarithm of the viscosity coefficient γ for the Langevin dynamics in the model. Folding rates (symbols) were from Fig. 9. Curves through the data points are a guide for the eye.
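A nonmonotonic rate versus viscosity trend of this kind is what Kramers' theory (Refs. 108 and 109) leads one to expect. As a rough illustration only, the Python sketch below combines the textbook low-friction (energy-diffusion) and moderate-to-high-friction Kramers limits for escape over a one-dimensional barrier. The one-dimensional picture, the harmonic-mean interpolation between the two limits, the function name, and all parameter values are assumptions made for illustration; none of them is taken from the chain models used in this work.

```python
import math

def kramers_rate(gamma, barrier=5.0, omega_well=1.0, omega_barrier=1.0):
    """Approximate escape rate over a 1D barrier as a function of friction.

    barrier is in units of kT; gamma and the two frequencies share one
    arbitrary inverse-time unit. All default values are illustrative.
    """
    if gamma <= 0:
        raise ValueError("gamma must be positive")
    boltzmann = math.exp(-barrier)
    # Low-friction (energy-diffusion) limit: rate grows linearly with gamma.
    k_low = gamma * barrier * boltzmann
    # Moderate-to-high friction limit, written in a numerically stable form:
    # sqrt(g^2/4 + wb^2) - g/2 == wb^2 / (sqrt(g^2/4 + wb^2) + g/2).
    denom = math.sqrt(0.25 * gamma**2 + omega_barrier**2) + 0.5 * gamma
    k_high = (omega_well / (2.0 * math.pi)) * (omega_barrier / denom) * boltzmann
    # Harmonic-mean interpolation between the two limits (an assumption).
    return 1.0 / (1.0 / k_low + 1.0 / k_high)

# Scan friction over several orders of magnitude.
gammas = [10.0**p for p in range(-4, 4)]
for g in gammas:
    print(f"gamma={g:g}  ln(rate)={math.log(kramers_rate(g)):.3f}")
```

With these illustrative parameters the rate rises with γ at low friction, peaks at an intermediate γ, and falls roughly as 1/γ at high friction, which is the qualitative turnover shape.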

Concluding Remarks

Using multiple-interaction-scheme native–centric modeling, we have shown that variation in folding cooperativity may be rationalized by native topology. Our coarse-grained modeling approach succeeded in accounting for the difference in folding cooperativity not only between proteins with drastically different native topologies (NTL9 and the two PSBDs) but also between homologous proteins with very similar native structures (1W4E and 2BTH). The match achieved above between experimental and model cooperativity rank order is quite remarkable in view of the fact that folding cooperativity is a thermodynamic property determined not only by the native structure but also by the free energies of all conformations. Native–centric models do not consider nonnative interactions, which can be important in protein folding.115–117 Our native–centric models also ignored possible significantly nonuniform distributions of contact energies in natural proteins, a phenomenon that was revealed by studies of circular permutants118,119 and remains to be better elucidated.

In spite of these caveats, our modeling success suggests that the requirement for amino acid sequences to encode a given structure placed rather stringent physical restrictions on evolutionary choice. Consequently, the folded structure can, on its own, provide ample information about the intrachain interactions that are operative in the folded as well as unfolded conformations. This very consideration is essentially the same as that used to justify Gō-type native–centric modeling itself.53,71,120 Our approach provides additional rigor by requiring consistent results across several different native–centric interaction schemes. It would be interesting to extend this methodology to other natural proteins as well as artificially designed proteins, to explore, for example, whether the differences in folding cooperativity among the recently studied λ6–85 mutants24 are reflected at all by subtle differences in these mutants' folded structures in aqueous solution.

A theoretical connection may be drawn between our computational results and the effect of native topology on native stability of model proteins. It has long been known that protein chain models with folded native structures containing more nonlocal contacts tend to be more stable thermodynamically than those with native structures containing predominantly local contacts. Based on this observation, some of the earlier modeling studies stipulated that folding should be faster for proteins with more nonlocal contacts than for those with more local contacts.121 However, this prediction turned out to be opposite to the experimental trend discovered subsequently for the folding rates of natural single-domain proteins.101 As was commented at the time, the fundamental reason for the contradiction between experiment and some of the earlier theories was that, by construction, folding of those earlier models was strongly affected by kinetic traps, whereas in reality significant kinetic trapping does not exist in many single-domain proteins.122 Here, computational results based upon native–centric models (which, in contrast with some of the earlier models, have minimal though not nonexistent trapping effects112) indicate that native structures containing more nonlocal contacts tend to fold more cooperatively and, thus, fold more slowly123 (Fig. 8 and Table 1) as a result of their higher overall free-energy barriers to folding (Figs. 6 and 7). This trend is consistent with existing experiments (Fig. 5). The above considerations suggest that native stability and folding cooperativity are correlated in many, though not necessarily all, native–centric chain models (even though the corresponding relationship in real proteins remains to be better elucidated101) and that both model properties tend to be enhanced by nonlocal native contacts.

The conceptual underpinning of our analysis of possible constraints imposed by native topology on folding cooperativity may be viewed as a natural extension of earlier ideas of encodability124 or designability125 of folded protein structures. In this respect, the present findings point to a plausible designability principle for folding cooperativity, namely, that proteins with a folded structure of higher topological complexity are more conducive to being encoded by sequences that would undergo cooperative folding. As an example, we note that this hypothesis appears to be consistent with the finding of downhill-folding behaviors for the λD14A mutant22 because the D14A mutation on λ6–85 reduced native topological complexity by abolishing a stabilizing nonlocal side chain–side chain hydrogen bond between an aspartic acid (position 14) and a serine (position 77)126,127 while strengthening local interactions (alanine has a higher helical propensity than aspartic acid128). As another illustration, inasmuch as the ultrafast folding of the 73-residue α3D is an indication of a lower degree of folding cooperativity (faster folding should be associated with a lower free-energy barrier), this designed protein's significant propensity toward formation of local helical structure in the unfolded state85 may also be seen as lending support to our hypothesis.

As a demonstration of principle, we have focused primarily on small proteins with ∼40 amino acid residues because experimentally observed differences in their folding cooperativity, or lack thereof, can be striking. In contrast, discerning variations in folding cooperativity among larger natural proteins by common experimental techniques can be more challenging because many of the proteins studied so far fold and unfold with a high degree of cooperativity. Nonetheless, as suggested by the above discussion, we expect the existence of a constraint imposed by native topology on folding cooperativity to be generally applicable to larger proteins as well. Accordingly, a native–centric, multiple-interaction-scheme modeling approach along the lines developed here should be useful as an efficient tool for rank ordering the folding cooperativity of proteins with known structures and for evaluating targets for artificial protein design.
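One simple way to quantify how nonlocal a set of native contacts is, in the spirit of the contact order of Plaxco et al. (Ref. 101), is the mean sequence separation of the contacts divided by the chain length. The Python sketch below is a residue-level simplification under stated assumptions: the function name, the representation of contacts as (i, j) residue pairs, and the two toy contact lists are hypothetical, and the sketch is not the contact definition or the code used for the models in this work.

```python
def relative_contact_order(contacts, n_residues):
    """Mean sequence separation |i - j| over contacts, divided by chain length.

    contacts is an iterable of (i, j) residue-index pairs; n_residues is
    the number of residues in the chain.
    """
    contacts = list(contacts)
    if not contacts:
        raise ValueError("at least one contact is required")
    if n_residues <= 0:
        raise ValueError("n_residues must be positive")
    total_separation = sum(abs(i - j) for i, j in contacts)
    return total_separation / (len(contacts) * n_residues)

# Hypothetical 10-residue chains (illustrative only).
local_contacts = [(1, 4), (2, 5), (3, 6)]      # short sequence separations
nonlocal_contacts = [(1, 10), (2, 9), (3, 8)]  # long sequence separations

print(relative_contact_order(local_contacts, 10))
print(relative_contact_order(nonlocal_contacts, 10))
```

In this toy comparison the predominantly nonlocal contact set has the larger relative contact order, which is the sense in which a topology is described as more nonlocal in the discussion above.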

Acknowledgements

We thank Yawen Bai, Allison Ferguson, Neil Ferguson, Michael Knott, Victor Muñoz, Jose Sanchez-Ruiz, Stefan Wallin, and Zhuqing Zhang for helpful discussions and Zhuqing Zhang for carefully rechecking some of the results. We are grateful to two anonymous referees for their insightful comments, which have contributed to an improved exposition, and we acknowledge the Canadian Institutes of Health Research (CIHR grant no. MOP-84281) and the Canada Research Chairs Program for financial support.

References

1. Wolynes, P. G. (1997). As simple as can be? Nat. Struct. Biol. 4, 871–874.

2. Chan, H. S. (1999). Folding alphabets. Nat. Struct. Biol. 6, 994–996.

3. Bryngelson, J. D., Onuchic, J. N., Socci, N. D. & Wolynes, P. G. (1995). Funnels, pathways, and the energy landscape of protein folding: a synthesis. Proteins: Struct., Funct., Genet. 21, 167–195.

4. Dill, K. A., Bromberg, S., Yue, K., Fiebig, K. M., Yee, D. P., Thomas, P. D. & Chan, H. S. (1995). Principles of protein folding—a perspective from simple exact models. Protein Sci. 4, 561–602.

5. Thirumalai, D. & Woodson, S. A. (1996). Kinetics of folding of proteins and RNA. Acc. Chem. Res. 29, 433–439.

6. Dill, K. A. & Chan, H. S. (1997). From Levinthal to pathways to funnels. Nat. Struct. Biol. 4, 10–19.

7. Wallin, S. & Chan, H. S. (2005). A critical assessment of the topomer search model of protein folding using a continuum explicit-chain model with extensive conformational sampling. Protein Sci. 14, 1643–1660.

8. Matthews, C. R. & Hurle, M. R. (1987). Mutant sequences as probes of protein folding mechanisms. BioEssays, 6, 254–257.

9. Jackson, S. E. & Fersht, A. R. (1991). Folding of chymotrypsin inhibitor 2. 1. Evidence for a two-state transition. Biochemistry, 30, 10428–10435.

10. Jackson, S. E. (1998). How do small single-domain proteins fold? Folding Des. 3, R81–R91.

11. Baker, D. (2000). A surprising simplicity to protein folding. Nature, 405, 39–42.

12. Kim, P. S. & Baldwin, R. L. (1990). Intermediates in the folding reactions of small proteins. Annu. Rev. Biochem. 59, 631–660.

13. Matthews, C. R. (1993). Pathways of protein folding. Annu. Rev. Biochem. 62, 653–683.

14. Bilsel, O. & Matthews, C. R. (2000). Barriers in protein folding reactions. Adv. Protein Chem. 53, 153–207.

15. Onuchic, J. N. & Wolynes, P. G. (2004). Theory of protein folding. Curr. Opin. Struct. Biol. 14, 70–75.

16. MacCallum, J. L., Sabaye Moghaddam, M., Chan, H. S. & Tieleman, D. P. (2007). Hydrophobic association of α-helices, steric dewetting and enthalpic barriers to protein folding. Proc. Natl Acad. Sci. USA, 104, 6206–6210.

17. Garcia-Mira, M. M., Sadqi, M., Fischer, N., Sanchez-Ruiz, J. M. & Muñoz, V. (2002). Experimental identification of downhill protein folding. Science, 298, 2191–2195.

18. Sadqi, M., Fushman, D. & Muñoz, V. (2006). Atom-by-atom analysis of global downhill protein folding. Nature, 442, 317–321.

19. Damaschun, G., Damaschun, H., Gast, K., Misselwitz, R., Muller, J. J., Pfeil, W. & Zirwer, D. (1993). Cold denaturation-induced conformational changes in phosphoglycerate kinase from yeast. Biochemistry, 32, 7739–7746.

20. Sabelko, J., Ervin, J. & Gruebele, M. (1999). Observation of strange kinetics in protein folding. Proc. Natl Acad. Sci. USA, 96, 6031–6036.

21. Myers, J. K. & Oas, T. G. (2001). Preorganized secondary structure as an important determinant of fast protein folding. Nat. Struct. Biol. 8, 552–558.

22. Yang, W. Y. & Gruebele, M. (2003). Folding at the speed limit. Nature, 423, 193–197.

23. Ma, H. & Gruebele, M. (2005). Kinetics are probe-dependent during downhill folding of an engineered λ6–85 protein. Proc. Natl Acad. Sci. USA, 102, 2283–2287.

24. Liu, F. & Gruebele, M. (2007). Tuning λ6–85 towards downhill folding at its melting temperature. J. Mol. Biol. 370, 574–584.

25. Naganathan, A. N., Doshi, U., Fung, A., Sadqi, M. & Muñoz, V. (2006). Dynamics, energetics, and structure in protein folding. Biochemistry, 45, 8466–8475.

26. Huang, F., Sato, S., Sharpe, T. D., Ying, L. & Fersht, A. R. (2007). Distinguishing between cooperative and unimodal downhill protein folding. Proc. Natl Acad. Sci. USA, 104, 123–127.

27. Ferguson, N., Sharpe, T. D., Johnson, C. M., Schartau, P. J. & Fersht, A. R. (2007). Structural biology: analysis of ‘downhill’ protein folding. Nature, 445, E14–E15.

28. Zhou, Z. & Bai, Y. (2007). Structural biology: analysis of protein-folding cooperativity. Nature, 445, E16–E17.

29. Sadqi, M., Fushman, D. & Muñoz, V. (2007). Structural biology: analysis of protein-folding cooperativity—reply. Nature, 445, E17–E18.

30. Ferguson, N., Schartau, P. J., Sharpe, T. D., Sato, S. & Fersht, A. R. (2004). One-state downhill versus conventional protein folding. J. Mol. Biol. 344, 295–301.

31. Naganathan, A. N., Perez-Jimenez, R., Sanchez-Ruiz, J. M. & Muñoz, V. (2005). Robustness of downhill folding: guidelines for the analysis of equilibrium folding experiments on small proteins. Biochemistry, 44, 7435–7449.

32. Ferguson, N., Sharpe, T. D., Schartau, P. J., Sato, S., Allen, M. D., Johnson, C. M. et al. (2005). Ultrafast barrier-limited folding in the peripheral subunit-binding domain family. J. Mol. Biol. 353, 427–446.

33. Liu, F., Du, D., Fuller, A. A., Davoren, J. E., Wipf, P., Kelly, J. W. & Gruebele, M. (2008). An experimental survey of the transition between two-state and downhill protein folding scenarios. Proc. Natl Acad. Sci. USA, 105, 2369–2374.

34. Hagen, S. J. (2003). Exponential decay kinetics in “downhill” protein folding. Proteins: Struct., Funct., Genet. 50, 1–4.

35. Hagen, S. J., Qiu, L. & Pabit, S. A. (2005). Diffusional limits to the speed of protein folding: fact or friction? J. Phys.: Condens. Matter, 17, S1503–S1514.

36. Zuo, G., Wang, J. & Wang, W. (2006). Folding with downhill behavior and low cooperativity of proteins. Proteins: Struct., Funct., Bioinform. 63, 165–173.

37. Knott, M. & Chan, H. S. (2006). Criteria for downhill protein folding: calorimetry, chevron plot, kinetic relaxation, and single-molecule radius of gyration in chain models with subdued degrees of cooperativity. Proteins: Struct., Funct., Bioinform. 65, 373–391.

38. Prieto, L. & Rey, A. (2007). Influence of the native topology on the folding barrier for small proteins. J. Chem. Phys. 127, 175101.

39. Bruscolini, P., Pelizzola, A. & Zamparo, M. (2007). Downhill versus two-state protein folding in a statistical mechanical model. J. Chem. Phys. 126, 215103.

40. Cho, S. S., Weinkam, P. & Wolynes, P. G. (2008). Origins of barriers and barrierless folding in BBL. Proc. Natl Acad. Sci. USA, 105, 118–123.

41. Yu, W., Chung, K., Cheon, M., Heo, M., Han, K.-H., Ham, S. & Chang, I. (2008). Cooperative folding kinetics of BBL protein and peripheral subunit-binding domain homologues. Proc. Natl Acad. Sci. USA, 105, 2397–2402.

42. Zhang, J., Li, W., Wang, J., Qin, M. & Wang, W. (2008). All-atom replica exchange molecular simulation of protein BBL. Proteins: Struct., Funct., Bioinform. 72, 1038–1047.

43. Gruebele, M. (2005). Downhill protein folding: evolution meets physics. C. R. Biol. 328, 701–712.

44. Liu, F. & Gruebele, M. (2008). Downhill dynamics and the molecular rate of protein folding. Chem. Phys. Lett. 461, 1–8.


45. Maxwell, K. L., Yee, A. A., Booth, V., Arrowsmith, C. H., Gold, M. & Davidson, A. R. (2001). The solution structure of bacteriophage λ protein W, a small morphogenetic protein possessing a novel fold. J. Mol. Biol. 308, 9–14.

46. Fung, A., Li, P., Godoy-Ruiz, R., Sanchez-Ruiz, J. M. & Muñoz, V. (2008). Expanding the realm of ultrafast protein folding: gpW, a midsize natural single-domain with α+β topology that folds downhill. J. Am. Chem. Soc. 130, 7489–7495.

47. Mittag, T. & Forman-Kay, J. D. (2007). Atomic-level characterization of disordered protein ensembles. Curr. Opin. Struct. Biol. 17, 3–14.

48. Tompa, P. & Fuxreiter, M. (2007). Fuzzy complexes: polymorphism and structural disorder in protein–protein interactions. Trends Biochem. Sci. 33, 2–8.

49. Chan, H. S., Shimizu, S. & Kaya, H. (2004). Cooperativity principles in protein folding. Methods Enzymol. 380, 350–379.

50. Scalley-Kim, M. & Baker, D. (2004). Characterization of the folding energy landscapes of computer generated proteins suggests high folding free energy barriers and cooperativity may be consequences of natural selection. J. Mol. Biol. 338, 573–583.

51. Bai, Y. (2006). Energy barriers, cooperativity, and hidden intermediates in the folding of small proteins. Biochem. Biophys. Res. Commun. 340, 976–983.

52. Watters, A. L., Deka, P., Corrent, C., Callender, D., Varani, G., Sosnick, T. & Baker, D. (2007). The highly cooperative folding of small naturally occurring proteins is likely the result of natural selection. Cell, 128, 613–624.

53. Kaya, H. & Chan, H. S. (2003). Solvation effects and driving forces for protein thermodynamic and kinetic cooperativity: how adequate is native–centric topological modeling? J. Mol. Biol. 326, 911–931. Corrigendum: 337, 1069–1070 (2004).

54. Kaya, H., Liu, Z. & Chan, H. S. (2005). Chevron behavior and isostable enthalpic barriers in protein folding: successes and limitations of simple Gō-like modeling. Biophys. J. 89, 520–535.

55. Levitt, M. & Warshel, A. (1975). Computer simulation of protein folding. Nature, 253, 694–698.

56. Kaya, H. & Chan, H. S. (2000). Polymer principles of protein calorimetric two-state cooperativity. Proteins: Struct., Funct., Genet. 40, 637–661. Erratum: 43, 523 (2001).

57. Dill, K. A. (1985). Theory for the folding and stability of globular proteins. Biochemistry, 24, 1501–1509.

58. Lau, K. F. & Dill, K. A. (1989). A lattice statistical mechanics model of the conformational and sequence spaces of proteins. Macromolecules, 22, 3986–3997.

59. Kaya, H. & Chan, H. S. (2000). Energetic components of cooperative protein folding. Phys. Rev. Lett. 85, 4823–4826.

60. Kaya, H. & Chan, H. S. (2005). Explicit-chain model of native-state hydrogen exchange: implications for event ordering and cooperativity in protein folding. Proteins: Struct., Funct., Bioinform. 58, 31–44.

61. Hilser, V. J. & Freire, E. (1996). Structure-based calculation of the equilibrium folding pathway of proteins. Correlation with hydrogen exchange protection factors. J. Mol. Biol. 262, 756–772.

62. Chan, H. S. (2000). Modeling protein density of states: additive hydrophobic effects are insufficient for calorimetric two-state cooperativity. Proteins: Struct., Funct., Genet. 40, 543–571.

63. Plotkin, S. S., Wang, J. & Wolynes, P. G. (1997). Statistical mechanics of a correlated energy landscape model for protein folding funnels. J. Chem. Phys. 106, 2932–2948.

64. Bai, Y., Sosnick, T. R., Mayne, L. & Englander, S. W. (1995). Protein folding intermediates: native-state hydrogen exchange. Science, 269, 192–197.

65. Bai, Y. (2006). Protein folding pathways studied by pulsed- and native-state hydrogen exchange. Chem. Rev. 106, 1757–1768.

66. Alm, E. & Baker, D. (1999). Prediction of protein-folding mechanisms from free-energy landscapes derived from native structures. Proc. Natl Acad. Sci. USA, 96, 11305–11310.

67. Muñoz, V. & Eaton, W. A. (1999). A simple model for calculating the kinetics of protein folding from three-dimensional structures. Proc. Natl Acad. Sci. USA, 96, 11311–11316.

68. Godoy-Ruiz, R., Henry, E. R., Kubelka, J., Hofrichter, J., Muñoz, V., Sanchez-Ruiz, J. M. & Eaton, W. A. (2008). Estimating free-energy barrier heights for an ultrafast folding protein from calorimetric and kinetic data. J. Phys. Chem. B, 112, 5938–5949.

69. Micheletti, C., Banavar, J. R., Maritan, A. & Seno, F. (1999). Protein structures and optimal folding from a geometrical variational principle. Phys. Rev. Lett. 82, 3372–3375.

70. Shea, J.-E., Onuchic, J. N. & Brooks, C. L. (1999). Exploring the origins of topological frustration: design of a minimally frustrated model of fragment B of protein A. Proc. Natl Acad. Sci. USA, 96, 12512–12517.

71. Clementi, C., Nymeyer, H. & Onuchic, J. N. (2000). Topological and energetic factors: what determines the structural details of the transition state ensemble and “en-route” intermediates for protein folding? An investigation for small globular proteins. J. Mol. Biol. 298, 937–953.

72. Taketomi, H., Ueda, Y. & Gō, N. (1975). Studies on protein folding, unfolding and fluctuations by computer simulation. 1. The effect of specific amino acid sequence represented by specific inter-unit interactions. Int. J. Pept. Protein Res. 7, 445–459.

73. Horng, J.-C., Moroz, V. & Raleigh, D. P. (2003). Rapid cooperative two-state folding of a miniature α–β protein and design of a thermostable variant. J. Mol. Biol. 326, 1261–1270.

74. Badasyan, A., Liu, Z. & Chan, H. S. (2007). Downhill and two-state: which factors may affect these scenarios of folding? In Abstracts for the Conference on Structure and Dynamics in Soft Matter and Biomolecules: From Single Molecules to Ensemble; held at the Abdus Salam International Centre for Theoretical Physics, Trieste, Italy, June 4–8, 2007. http://cdsagenda5.ictp.trieste.it/full_display.php?smr=0&ida=a06200

75. Eastwood, M. P. & Wolynes, P. G. (2001). Role of explicitly cooperative interactions in protein folding funnels: a simulation study. J. Chem. Phys. 114, 4702–4716.

76. Cheung, M. S., García, A. E. & Onuchic, J. N. (2002). Protein folding mediated by solvation: water expulsion and formation of the hydrophobic core occur after the structural collapse. Proc. Natl Acad. Sci. USA, 99, 685–690.

77. Liu, Z. & Chan, H. S. (2005). Desolvation is a likely origin of robust enthalpic barriers to protein folding. J. Mol. Biol. 349, 872–889.

78. Liu, Z. & Chan, H. S. (2005). Solvation and desolvation effects in protein folding: native flexibility, kinetic cooperativity, and enthalpic barriers under isostability conditions. Phys. Biol. 2, S75–S85.

79. Sheinerman, F. B. & Brooks, C. L. (1998). Molecular picture of folding of a small α/β protein. Proc. Natl Acad. Sci. USA, 95, 1562–1567.

80. Moghaddam, M. S., Shimizu, S. & Chan, H. S. (2005). Temperature dependence of three-body hydrophobic interactions: potential of mean force, enthalpy, entropy, heat capacity, and nonadditivity. J. Am. Chem. Soc. 127, 303–316. Correction: 127, 2363 (2005).

81. Levy, Y. & Onuchic, J. N. (2006). Water mediation in protein folding and molecular recognition. Annu. Rev. Biophys. Biomol. Struct. 35, 389–415.

82. Krone, M. G., Hua, L., Soto, P., Zhou, R., Berne, B. J. & Shea, J.-E. (2008). Role of water in mediating the assembly of Alzheimer amyloid-β Aβ16–22 protofilaments. J. Am. Chem. Soc. 130, 11066–11081.

83. Yang, W. Y., Pitera, J. W., Swope, W. C. & Gruebele, M. (2004). Heterogeneous folding of the trpzip hairpin: full atom simulation and experiment. J. Mol. Biol. 336, 241–251.

84. Kubelka, J., Hofrichter, J. & Eaton, W. A. (2004). The protein folding ‘speed limit.’ Curr. Opin. Struct. Biol. 14, 76–88.

85. Zhu, Y., Alonso, D. O. V., Maki, K., Huang, C.-Y., Lahr, S. J., Daggett, V. et al. (2003). Ultrafast folding of α3D: a de novo designed three-helix bundle protein. Proc. Natl Acad. Sci. USA, 100, 15486–15491.

86. Zhu, Y., Fu, X., Wang, T., Tamura, A., Takada, S., Saven, J. G. & Gai, F. (2004). Guiding the search for a protein's maximum rate of folding. Chem. Phys. 307, 99–109.

87. Luisi, D. L., Kuhlman, B., Sideras, K., Evans, P. A. & Raleigh, D. P. (1999). Effects of varying the local propensity to form secondary structure on the stability and folding kinetics of a rapidly folding mixed α/β protein: characterization of a truncation mutant of the N-terminal domain of the ribosomal protein L9. J. Mol. Biol. 289, 167–174.

88. Wallin, S. & Chan, H. S. (2006). Conformational entropic barriers in topology-dependent protein folding: perspectives from a simple native–centric polymer model. J. Phys.: Condens. Matter, 18, S307–S328.

89. Skolnick, J., Jaroszewski, L., Kolinski, A. & Godzik, A. (1997). Derivation and test of pair potentials for protein folding. When is the quasichemical approximation correct? Protein Sci. 6, 676–688.

90. Veitshans, T., Klimov, D. & Thirumalai, D. (1997). Protein folding kinetics: timescales, pathways and energy landscapes in terms of sequence-dependent properties. Folding Des. 2, 1–22.

91. Koga, N. & Takada, S. (2001). Roles of native topology and chain-length scaling in protein folding: a simulation study with a Gō-like model. J. Mol. Biol. 313, 171–180.

92. Shimizu, S. & Chan, H. S. (2000). Temperature dependence of hydrophobic interactions: a mean force perspective, effects of water density, and nonadditivity of thermodynamic signatures. J. Chem. Phys. 113, 4683–4700. Erratum: 116, 8636 (2002).

93. Kaya, H. & Chan, H. S. (2003). Simple two-state protein folding kinetics requires near-Levinthal thermodynamic cooperativity. Proteins: Struct., Funct., Genet. 52, 510–523.

94. Kaya, H. & Chan, H. S. (2003). Contact order dependent protein folding rates: kinetic consequences of a cooperative interplay between favorable nonlocal interactions and local conformational preferences. Proteins: Struct., Funct., Genet. 52, 524–533.

95. Jewett, A. I., Pande, V. S. & Plaxco, K. W. (2003). Cooperativity, smooth energy landscapes and the origins of topology-dependent protein folding rates. J. Mol. Biol. 326, 247–253.

96. Valleau, J. P. & Torrie, G. M. (1977). A guide to Monte Carlo for statistical mechanics: 2. Byways. In Statistical Mechanics, Part A: Equilibrium Techniques (Berne, B. J., ed), pp. 169–194, Plenum Press, New York, NY; chapt. 5.

97. Beveridge, D. L. & DiCapua, F. M. (1989). Free-energy via molecular simulation—applications to chemical and biomolecular system. Annu. Rev. Biophys. Biophys. Chem. 18, 431–492.

98. Voter, A. F. (1997). A method for accelerating the molecular dynamics simulation of infrequent events. J. Chem. Phys. 106, 4665–4677.

99. Voter, A. F. (1997). Hyperdynamics: accelerated molecular dynamics of infrequent events. Phys. Rev. Lett. 78, 3908–3911.

100. Takada, S., Luthey-Schulten, Z. & Wolynes, P. G. (1999). Folding dynamics with nonadditive forces: a simulation study of a designed helical protein and a random heteropolymer. J. Chem. Phys. 110, 11616–11629.

101. Plaxco, K. W., Simons, K. T. & Baker, D. (1998). Contact order, transition state placement and the refolding rates of single domain proteins. J. Mol. Biol. 277, 985–994.

102. Ferguson, A., Liu, Z. & Chan, H. S. (2007). Desolvation effects and topology-dependent protein folding. 2007 American Physical Society March Meeting Abstract BAPS.2007.MAR.D26.3. http://meetings.aps.org/link/BAPS.2007.MAR.D26.3

103. Ejtehadi, M. R., Avall, S. P. & Plotkin, S. S. (2004). Three-body interactions improve the prediction of rate and mechanism in protein folding models. Proc. Natl Acad. Sci. USA, 101, 15088–15093.

104. Qi, X. & Portman, J. J. (2007). Excluded volume, local structural cooperativity, and the polymer physics of protein folding rates. Proc. Natl Acad. Sci. USA, 104, 10841–10846.

105. Kamagata, K., Arai, M. & Kuwajima, K. (2004). Unification of the folding mechanisms of non-two-state and two-state proteins. J. Mol. Biol. 339, 951–965.

106. Qiu, L. & Hagen, S. J. (2004). A limiting speed for protein folding at low solvent viscosity. J. Am. Chem. Soc. 126, 3398–3399.

107. Rhee, Y. M. & Pande, V. S. (2008). Solvent viscosity dependence of the protein folding dynamics. J. Phys. Chem. B, 112, 6221–6227.

108. Kramers, H. A. (1940). Brownian motion in a field of force and the diffusion model of chemical reactions. Physica, 7, 284–304.

109. Hänggi, P., Talkner, P. & Borkovec, M. (1990). Reaction-rate theory: 50 years after Kramers. Rev. Mod. Phys. 62, 251–341.

110. Klimov, D. K. & Thirumalai, D. (1997). Viscosity dependence of the folding rates of proteins. Phys. Rev. Lett. 79, 317–320.

111. Best, R. B. & Hummer, G. (2006). Diffusive model of protein folding dynamics with Kramers turnover in rate. Phys. Rev. Lett. 96, 228104.

112. Kaya, H. & Chan, H. S. (2003). Origins of chevron rollovers in non-two-state protein folding kinetics. Phys. Rev. Lett. 90, 258104.

113. Zhou, Y., Zhang, C., Stell, G. & Wang, J. (2003). Temperature dependence of the distribution of the first passage time: results from discontinuous molecular dynamics simulations of an all-atom model of the second β-hairpin fragment of protein G. J. Am. Chem. Soc. 125, 6300–6305.

114. Wang, J. (2004). The complex kinetics of protein folding in wide temperature ranges. Biophys. J. 87, 2164–2171.

115. Feng, H., Takei, J., Lipsitz, R., Tjandra, N. & Bai, Y. (2003). Specific non-native hydrophobic interactions in a hidden folding intermediate: implications for protein folding. Biochemistry, 42, 12461–12465.

116. Capaldi, A. P., Kleanthous, C. & Radford, S. E. (2002). Im7 folding mechanism: misfolding on a path to the native state. Nat. Struct. Biol. 9, 209–216.

117. Zarrine-Afsar, A., Wallin, S., Neculai, A. M., Neudecker, P., Howell, P. L., Davidson, A. R. & Chan, H. S. (2008). Theoretical and experimental demonstration of the importance of specific nonnative interactions in protein folding. Proc. Natl Acad. Sci. USA, 105, 9999–10004.

118. Lindberg, M., Tangrot, J. & Oliveberg, M. (2002). Complete change of the protein folding transition state upon circular permutation. Nat. Struct. Biol. 9, 818–822.

119. Miller, E. J., Fischer, K. F. & Marqusee, S. (2002). Experimental evaluation of topological parameters determining protein folding rates. Proc. Natl Acad. Sci. USA, 99, 10359–10363.

120. Takada, S. (1999). Gō-ing for the prediction of protein folding mechanisms. Proc. Natl Acad. Sci. USA, 96, 11698–11700.

121. Abkevich, V. I., Gutin, A. M. & Shakhnovich, E. I. (1995). Impact of local and nonlocal interactions on thermodynamics and kinetics of protein folding. J. Mol. Biol. 252, 460–471.

122. Chan, H. S. (1998). Matching speed and locality. Nature, 392, 761–763.

123. Gō, N. & Taketomi, H. (1978). Respective roles of short- and long-range interactions in protein folding. Proc. Natl Acad. Sci. USA, 75, 559–563.

124. Chan, H. S. & Dill, K. A. (1996). Comparing folding codes for proteins and polymers. Proteins: Struct., Funct., Genet. 24, 335–344.

125. Li, H., Helling, R., Tang, C. & Wingreen, N. (1996). Emergence of preferred structures in a simple model of protein folding. Science, 273, 666–669.

126. Marqusee, S. & Sauer, R. T. (1994). Contributions of a hydrogen bond/salt bridge network to the stability of secondary and tertiary structure in λ repressor. Protein Sci. 3, 2217–2225.

127. Myers, J. K. & Oas, T. G. (1999). Contribution of a buried hydrogen bond to λ repressor folding kinetics. Biochemistry, 38, 6761–6768.

128. Pace, C. N. & Scholtz, J. M. (1998). A helix propensity scale based on experimental studies of peptides and proteins. Biophys. J. 75, 422–427.

