    arXiv:1208.4885v1 [cond-mat.stat-mech] 24 Aug 2012

    Implicit Ligand Theory: Rigorous Binding Free Energies and Thermodynamic Expectations from Molecular Docking

    David D. L. Minh

    Department of Chemistry, Duke University, Durham NC 27708 USA
    (Dated: August 27, 2012)

    A rigorous formalism for estimating noncovalent binding free energies and thermodynamic expectations from calculations in which receptor configurations are sampled independently from the ligand is derived. Due to this separation, receptor configurations only need to be sampled once, facilitating the use of binding free energy calculations in virtual screening. Demonstrative calculations on a host-guest system yield good agreement with previous free energy calculations and isothermal titration calorimetry measurements. Implicit ligand theory provides guidance on how to improve existing molecular docking algorithms and insight into the concepts of induced fit and conformational selection in noncovalent macromolecular recognition.

    INTRODUCTION

    The goal of molecular docking is to predict the most stable configuration of a noncovalent complex between a ligand and receptor. Based on this configuration, the complex is assigned a score which may be used to approximately rank the binding affinity of one ligand to the receptor versus another. Molecular docking has many potential applications, and has been most prominently applied to the virtual screening [1, 2] of chemical libraries to aid the development of pharmaceuticals.

    Given the three-dimensional structure of a protein receptor, docking algorithms have proven reasonably adept at sampling stable conformations of small organic ligands in the complex. Unfortunately, current scoring functions perform poorly at predicting binding free energies [3-5]. Hence, docking is typically used to filter a large library of potential ligands to a smaller binder-enriched library that may be pursued experimentally or by more accurate and expensive computational methods [6-10]. Even in this capacity, however, scoring functions are inconsistent, frequently presenting false positives (ligands predicted to bind that actually have weak or no affinity) and false negatives (ligands predicted not to bind that actually have significant affinity). For example, docking programs often have difficulty distinguishing binding compounds from decoys in which the chemical connectivity has been randomized [3, 11]. Improved scoring functions would increase the capability to discern binders from non-binders.

    The improvement of scoring functions, however, has been hindered by the lack of a rigorous formalism for obtaining binding free energies from molecular docking. While molecular docking calculations are usually performed with a rigid receptor, existing formalisms for binding free energies require a flexible receptor. Here, I derive a formalism, implicit ligand theory, for estimating binding free energies and thermodynamic expectations based on docking ligands to rigid receptor structures. I also describe practical aspects of statistical estimation, present example calculations, and discuss how physics-based (as opposed to empirical or knowledge-based) docking algorithms (see [12]) may be modified to exploit it. Beyond molecular docking, implicit ligand theory provides insight into the concepts of induced fit and conformational selection in noncovalent macromolecular recognition.

    THEORY

    The standard binding free energy, the free energy of a noncovalent association between a receptor R and ligand L to form a complex RL, $R + L \rightleftharpoons RL$, is,

    \[ \Delta G^\circ = -\beta^{-1} \ln \left( \frac{C^\circ C_{RL}}{C_R C_L} \right), \tag{1} \]

    where $\beta = (k_B T)^{-1}$ is the inverse of the product of Boltzmann's constant, $k_B$, and the temperature in Kelvin, $T$; $C^\circ$ is the standard concentration (typically 1 M); and $C_X$ is the equilibrium concentration of species $X \in \{R, L, RL\}$ [13].
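As a numerical illustration of Eq. (1), the following minimal sketch converts equilibrium concentrations into a standard binding free energy. The function name, the default temperature, and the example concentrations are assumptions made for illustration only; they are not taken from the paper.

```python
import math

K_B = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def standard_binding_free_energy(c_rl, c_r, c_l, temperature=300.0, c_standard=1.0):
    """Eq. (1): dG = -kT ln(C0 * C_RL / (C_R * C_L)), concentrations in mol/L."""
    kT = K_B * temperature
    return -kT * math.log(c_standard * c_rl / (c_r * c_l))


# Hypothetical equilibrium concentrations (mol/L), chosen only for illustration.
dg = standard_binding_free_energy(c_rl=9.0e-6, c_r=1.0e-6, c_l=1.0e-6)
print(f"dG = {dg:.2f} kcal/mol")
```

A negative value indicates that the complex is favored at the standard concentration.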

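    Statistical thermodynamics relates the standard binding free energy to a ratio of configurational partition functions [14],

    \begin{align}
    \Delta G^\circ &= -\beta^{-1} \ln \left( \frac{Z_{RL,N} Z_N}{Z_{R,N} Z_{L,N}} \frac{C^\circ}{8\pi^2} \right) \tag{2} \\
    Z_{RL,N} &= \int I \, e^{-\beta U(r_{RL}, r_S)} \, dr_{RL} \, dr_S \tag{3} \\
    Z_{Y,N} &= \int e^{-\beta U(r_Y, r_S)} \, dr_Y \, dr_S \tag{4} \\
    Z_N &= \int e^{-\beta U(r_S)} \, dr_S, \tag{5}
    \end{align}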

    in which symmetry numbers and a small pressure-volume term have been omitted from Eq. (2). $Z_{RL,N}$ and $Z_{Y,N}$ are configurational partition functions of the complex and of the species $Y \in \{R, L\}$, respectively, in $N$ molecules of solvent. The potential energy $U(r_X, r_S)$ depends on $r_X$, the internal coordinates of the receptor, ligand, or both in complex (the external degrees of freedom have been analytically integrated), and $r_S$, the coordinates of $N$ molecules of solvent. The complex coordinates $r_{RL}$ may be decomposed into the internal coordinates of the receptor, $r_R$, and of the ligand, $r_L$, and six degrees of freedom describing their relative translation and rotation, $\xi_L$. For simplicity, Jacobians for the transformation from Cartesian coordinates to a system with separated internal and external degrees of freedom are not shown in Eqs. (3) and (4). In $Z_{RL,N}$, the indicator function $I \equiv I(\xi_L)$ takes values between 0 and 1 and determines whether the receptor and ligand are complexed or not. For tight-binding complexes, the binding free energy is insensitive to the precise definition of $I$ [14].

    Implicit Solvent Theory

    The configurational integrals in Eq. (2) may be expressed in a formally equivalent but simpler form using implicit solvent theory [14]. In implicit solvent theory, the interaction energy is defined as $\psi(r_X, r_S) = U(r_X, r_S) - U(r_X) - U(r_S)$, where $U(r_X)$ is the potential energy of species X by itself and $U(r_S)$ the potential energy of the solvent by itself. By integrating the configurational partition functions over $r_S$, we may define the ratios,

    \begin{align}
    Z_{RL} &\equiv \frac{Z_{RL,N}}{Z_N} = \int I \, e^{-\beta [U(r_{RL}) + W(r_{RL})]} \, dr_{RL} \tag{6} \\
    Z_Y &\equiv \frac{Z_{Y,N}}{Z_N} = \int e^{-\beta [U(r_Y) + W(r_Y)]} \, dr_Y, \tag{7}
    \end{align}

    where,

    \[ W(r_X) = -\beta^{-1} \ln \frac{\int e^{-\beta \psi(r_X, r_S)} e^{-\beta U(r_S)} \, dr_S}{\int e^{-\beta U(r_S)} \, dr_S}, \tag{8} \]

    is a potential of mean force that can be interpreted as the constant-pressure reversible work of transferring the species X from the gas phase into the solvent. In biomolecular modeling, $W(r_X)$ is frequently estimated as the sum of an electrostatic term from the Poisson-Boltzmann equation [15] (or the Generalized Born approximation [16]), and a non-electrostatic term, which to a first approximation is proportional to the molecular surface area.

    In terms of implicit solvent configurational integrals, the standard binding free energy is,

    \[ \Delta G^\circ = -\beta^{-1} \ln \left( \frac{Z_{RL}}{Z_R Z_L} \frac{C^\circ}{8\pi^2} \right). \tag{9} \]

    As most implicit solvent models fail to account for specific interactions, such as hydrogen bonding, that can have important structural and energetic consequences, binding free energy calculations in implicit solvent are generally expected to be less accurate than those in explicit solvent [17]. Nevertheless, binding free energy calculations in implicit solvent have yielded promising agreement with experimental results (e.g. [18-21]).

    Implicit Ligand Theory

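    The development of implicit ligand theory is very similar to that of implicit solvent theory. It involves defining the effective potential as $\mathcal{U}(r_X) = U(r_X) + W(r_X)$, the effective interaction energy as $\Psi(r_{RL}) = \mathcal{U}(r_{RL}) - \mathcal{U}(r_R) - \mathcal{U}(r_L)$, and,

    \begin{align}
    B(r_R) &= -\beta^{-1} \ln \frac{\int I \, e^{-\beta \Psi(r_{RL})} e^{-\beta \mathcal{U}(r_L)} \, dr_L \, d\xi_L}{\int I \, e^{-\beta \mathcal{U}(r_L)} \, dr_L \, d\xi_L} \\
    &\equiv -\beta^{-1} \ln \left\langle e^{-\beta \Psi} \right\rangle^{r_L, \xi_L}_{L,I}, \tag{10}
    \end{align}

    which is a potential of mean force that will subsequently be referred to as the binding PMF. Throughout this paper, angled brackets $\langle \ldots \rangle^{r, \ldots}_{X, \ldots}$ will be used to denote an ensemble average over the coordinates $r$ listed in the superscript with respect to the density proportional to $q_{X, \ldots}$, where X describes the coordinates in the effective potential $\mathcal{U}(r_X)$, and "..." are labels. Here, $q_{L,I}(r_L, \xi_L) = I \, e^{-\beta \mathcal{U}(r_L)}$. Within angled brackets, I will use a shorthand notation in which functions implicitly depend on coordinates, e.g. $\Psi \equiv \Psi(r_{RL})$.

    In terms of the binding PMF, Eq. (9) may be written as,

    \begin{align}
    \Delta G^\circ &= -\beta^{-1} \ln \left( \frac{\int I \, e^{-\beta \mathcal{U}(r_{RL})} \, dr_{RL}}{\int e^{-\beta \mathcal{U}(r_R)} \, dr_R \int e^{-\beta \mathcal{U}(r_L)} \, dr_L} \frac{C^\circ}{8\pi^2} \right) \\
    &= -\beta^{-1} \ln \left( \frac{\int I \, e^{-\beta [\mathcal{U}(r_R) + \Psi(r_{RL}) + \mathcal{U}(r_L)]} \, dr_{RL}}{\int e^{-\beta \mathcal{U}(r_R)} \, dr_R \int e^{-\beta \mathcal{U}(r_L)} \, dr_L} \frac{C^\circ}{8\pi^2} \right) \\
    &= -\beta^{-1} \ln \left( \frac{\int e^{-\beta [B(r_R) + \mathcal{U}(r_R)]} \, dr_R}{\int e^{-\beta \mathcal{U}(r_R)} \, dr_R} \frac{\Omega C^\circ}{8\pi^2} \right) \\
    &\equiv -\beta^{-1} \ln \left\langle e^{-\beta B} \right\rangle^{r_R}_{R} + \Delta G_\xi, \tag{11}
    \end{align}

    where $\Omega = \int I \, d\xi_L$ (which may be analytically tractable) is the binding site volume, $\Delta G_\xi = -\beta^{-1} \ln \left( \Omega C^\circ / 8\pi^2 \right)$ is the free energy of confining the ligand external degrees of freedom to the binding site, and $q_R(r_R) = e^{-\beta \mathcal{U}(r_R)}$. Eqs. (10) and (11) are the central theoretical results of this paper.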

    Implicit ligand theory provides a rigorous framework for binding free energies that separates the sampling of receptor and ligand configurations. In Eq. (11), the receptor probability density is independent of any ligand configuration. Likewise, the probability density of ligand internal coordinates in Eq. (10) is independent from the receptor configuration. In practice, however, sampling from this ligand distribution may lead to slow convergence (this point will later be discussed in greater detail). The primary benefit of implicit ligand theory is that the computationally expensive step of sampling receptor configurations only needs to be performed once. Predicting binding free energies for a chemical library is then limited by the much faster process of sampling ligand conformations.


    Thermodynamic Expectations

    In addition to estimating the binding free energy, implicit ligand theory may also be used to estimate expected values of observables in the bound ensemble. Observables may include, for example, the mean potential energy, interaction energy, or distance between a ligand and receptor atom. Towards this end, it is useful to define a rigid-receptor expectation of an observable $O(r_{RL})$, weighted by the interaction energy,

    \begin{align}
    \bar{O}(r_R) &= \frac{\int I \, O(r_{RL}) \, e^{-\beta \Psi(r_{RL})} e^{-\beta \mathcal{U}(r_L)} \, dr_L \, d\xi_L}{\int I \, e^{-\beta \mathcal{U}(r_L)} \, dr_L \, d\xi_L} \\
    &\equiv \left\langle O \, e^{-\beta \Psi} \right\rangle^{r_L, \xi_L}_{L,I}. \tag{12}
    \end{align}

    If the observable is solely a function of the receptor configuration, then $\bar{O}(r_R)$ reduces to $O(r_R) \, e^{-\beta B(r_R)}$.

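    In terms of Eqs. (10) and (12), the expectation of $O(r_{RL})$ with respect to the density proportional to $q_{RL,I}(r_{RL}) = I \, e^{-\beta \mathcal{U}(r_{RL})}$ is,

    \begin{align}
    \langle O \rangle^{r_{RL}}_{RL,I} &\equiv \frac{\int I \, O(r_{RL}) \, e^{-\beta \mathcal{U}(r_{RL})} \, dr_{RL}}{\int I \, e^{-\beta \mathcal{U}(r_{RL})} \, dr_{RL}} \\
    &= \frac{\int I \, O(r_{RL}) \, e^{-\beta [\mathcal{U}(r_R) + \Psi(r_{RL}) + \mathcal{U}(r_L)]} \, dr_{RL}}{\int I \, e^{-\beta [\mathcal{U}(r_R) + \Psi(r_{RL}) + \mathcal{U}(r_L)]} \, dr_{RL}} \\
    &= \frac{\int \bar{O}(r_R) \, e^{-\beta \mathcal{U}(r_R)} \, dr_R}{\int e^{-\beta [B(r_R) + \mathcal{U}(r_R)]} \, dr_R} \\
    &= \frac{\langle \bar{O} \rangle^{r_R}_{R}}{\langle e^{-\beta B} \rangle^{r_R}_{R}} \\
    &= \langle \bar{O} \rangle^{r_R}_{R} \, e^{\beta [\Delta G^\circ - \Delta G_\xi]}. \tag{13}
    \end{align}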

    Eqs. (12) and (13) significantly generalize implicit ligand sampling [22], a method to estimate the potential of mean force for the ligand center of mass. The results of Cohen et al. [22] may be obtained by choosing the observable as a Dirac delta function for the ligand center of mass, taking a natural logarithm, and multiplying by $-\beta^{-1}$. Cohen et al. [22] applied implicit ligand sampling to study gas migration pathways in myoglobin, but the possibility of estimating other observables and binding free energies has not been previously recognized.

    ESTIMATION

    Applying implicit ligand theory to predicting binding free energies involves three steps:

    1. Sampling receptor configurations.

    2. Estimating the binding PMF, $B(r_R)$, for each receptor configuration.

    3. Estimating $\Delta G^\circ$ from the $B(r_R)$ estimates.

    In this section, I present several ways, roughly in order of increasing complexity, that these steps may be accomplished. A variant of one approach will be demonstrated later in the paper.

    Receptor Configurations

    Receptor configurations can be drawn from $q_R(r_R)$, any (possibly unnormalized) distribution $q_{R,w}(r_R)$ on the same support as $q_R(r_R)$ and for which $w(r_R) = q_R(r_R)/q_{R,w}(r_R)$ may be calculated, or from multiple distributions satisfying these conditions. Regardless of the sampling method, however, convergence of free energy estimates requires representative sampling of both the bound and unbound receptor configuration space. A particularly straightforward protocol is to sample from the distribution proportional to $q_R(r_R) = e^{-\beta \mathcal{U}(r_R)}$; one conducts a molecular dynamics (MD) simulation in the implicit solvent used for $W(r_R)$, collecting snapshots at evenly spaced intervals that are longer than the statistical correlation time. This protocol may be satisfactory if receptor fluctuations are minimal and the ligand does not significantly perturb the receptor configurational ensemble.

    For a receptor that undergoes larger structural fluctuations, sampling from multiple energetic minima may be facilitated by applying an external biasing potential (e.g. a harmonic bias) on one or more order parameters. If it is known that a ligand significantly perturbs the receptor configurational ensemble, it can be useful to introduce multiple alchemical intermediates into a simulation. Alchemical calculations may involve a coupling parameter, $\lambda$, defined such that the two groups (e.g. the receptor and ligand) are non-interacting at $\lambda = 0$ and fully interacting at $\lambda = 1$. Simulations are conducted with $\lambda$ at these end points and at multiple values in between. Sampling in each stage may be enhanced by Hamiltonian replica exchange (e.g. Gallicchio et al. [21], Jiang et al. [23], Gallicchio and Levy [24]), which entails stochastically swapping the coordinates of different simulations with a probability that preserves the Boltzmann distribution. Receptor configurations obtained through a flexible-receptor Hamiltonian replica exchange with a single ligand may subsequently be used for implicit ligand free energy calculations with other ligands in the chemical library.
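The swap step in Hamiltonian replica exchange can be sketched with the standard Metropolis criterion. This is a generic illustration, not the implementation used in the cited works; the function and argument names are hypothetical, and all replicas are assumed to share one inverse temperature `beta`.

```python
import math
import random


def swap_acceptance(beta, u_i_xi, u_j_xj, u_i_xj, u_j_xi):
    """Metropolis probability of exchanging configurations between states i and j.

    u_a_xb denotes the potential energy of configuration x_b evaluated with the
    Hamiltonian of state a (for example, a particular value of lambda).
    """
    delta = (u_i_xj + u_j_xi) - (u_i_xi + u_j_xj)
    if delta <= 0.0:
        return 1.0
    return math.exp(-beta * delta)


def attempt_swap(beta, u_i_xi, u_j_xj, u_i_xj, u_j_xi, rng=random):
    """Return True if the proposed exchange is accepted."""
    return rng.random() < swap_acceptance(beta, u_i_xi, u_j_xj, u_i_xj, u_j_xi)
```

Accepting swaps with this probability leaves the joint Boltzmann distribution of the pair of states unchanged.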

    As a caveat, implicit ligand theory does not provide a formal justification for docking to multiple experimentally determined structures (e.g. [25]) or any other set of structures in which $w(r_R)$ is unknown (e.g. homology modeling or flexible docking). One potential way to use information about multiple structures is to conduct multiple MD simulations with external potentials biased towards one or more of the structures. To facilitate later analysis, the external potentials should be set up to promote overlap in the configuration space of different simulations.


    Estimating a Binding PMF

    The binding PMF $B(r_R)$ may be expressed in terms of a ratio of partition functions,

    \[ B(r_R) = -\beta^{-1} \ln \frac{\int I \, e^{-\beta \mathcal{U}(r_{RL})} \, dr_L \, d\xi_L}{\int I \, e^{-\beta [\mathcal{U}(r_L) + \mathcal{U}(r_R)]} \, dr_L \, d\xi_L}, \tag{14} \]

    which clarifies that $B(r_R)$ is a special type of free energy difference in which the receptor configuration $r_R$ is rigid. Thus, $B(r_R)$ may be calculated using any one of many available methods to estimate free energy differences [26], including free energy perturbation (FEP) [27], thermodynamic integration (TI) [28], and the Bennett Acceptance Ratio (BAR) [29]. While formally equivalent, free energy methods can have dramatically different convergence properties.

    Based on the form of Eq. (10), the most straightforward estimation protocol is FEP. One can, for example, draw ligand configurations from the distribution proportional to $q_L(r_L, \xi_L) = e^{-\beta \mathcal{U}(r_L)}$ by conducting an MD simulation of the ligand in the appropriate implicit solvent and collecting snapshots at sufficiently long intervals. Because $q_L$ is independent of $\xi_L$, the external degrees of freedom sampled from the simulation may be replaced by a new $\xi_L$ sampled from the distribution proportional to $q_{\xi,I} = I$. The expectation in Eq. (10) may then be estimated by the sample mean,

    \[ \hat{B}(r_R) = -\beta^{-1} \ln \frac{1}{N} \sum_{n=1}^{N} e^{-\beta \Psi(r_{RL,n})}, \tag{15} \]

    where $r_{RL,n}$ is the nth of $N$ samples of the complex. Throughout this paper, $\hat{A}$ will denote a statistical estimator, i.e. an equation used to calculate a quantity based on sampled data.
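The exponential average in Eq. (15) is prone to floating-point overflow and underflow when interaction energies are large, so it is usually evaluated with a log-sum-exp shift. The sketch below shows one way to do this with the Python standard library; the function names are hypothetical and the input is assumed to be a list of interaction energies already evaluated for sampled ligand poses.

```python
import math


def log_mean_exp(values):
    """Numerically stable ln[(1/N) * sum(exp(v))]."""
    shift = max(values)
    return shift + math.log(sum(math.exp(v - shift) for v in values) / len(values))


def binding_pmf_fep(psi, beta):
    """Eq. (15): B = -(1/beta) * ln[(1/N) * sum_n exp(-beta * psi_n)]."""
    return -log_mean_exp([-beta * p for p in psi]) / beta


# One clashing pose (very high energy) and one favorable pose, in units of 1/beta.
b_hat = binding_pmf_fep([1.0e4, -5.0], beta=1.0)
```

In this toy case the clashing pose contributes essentially nothing, which illustrates why a few low-energy samples can dominate the estimate.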

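    In exponential averages such as Eq. (15), a small subset of samples may contribute a large portion of the sum. The limiting case of an individual important sample inspires the severe dominant state approximation, in which a single value of $\Psi(r_{RL})$ is used to estimate $B(r_R)$. Exponential averages may also be estimated via a cumulant expansion [30], here shown for Eq. (10) to the fourth order,

    \begin{align}
    B(r_R) \approx{} & \langle \Psi \rangle^{r_L, \xi_L}_{L,I} - \frac{\beta}{2!} \left\langle \delta\Psi^2 \right\rangle^{r_L, \xi_L}_{L,I} + \frac{\beta^2}{3!} \left\langle \delta\Psi^3 \right\rangle^{r_L, \xi_L}_{L,I} \\
    & - \frac{\beta^3}{4!} \left[ \left\langle \delta\Psi^4 \right\rangle^{r_L, \xi_L}_{L,I} - 3 \left( \left\langle \delta\Psi^2 \right\rangle^{r_L, \xi_L}_{L,I} \right)^2 \right], \tag{16}
    \end{align}

    where $\delta\Psi = \Psi(r_{RL}) - \langle \Psi \rangle^{r_L, \xi_L}_{L,I}$. Each expectation in the cumulant expansion may be estimated by the sample mean.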
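A truncated cumulant expansion in the form of Eq. (16) can be evaluated directly from sample central moments. The following is a minimal sketch; the function name and the `order` argument are assumptions for illustration, and the moments are computed as plain sample means without bias correction.

```python
def binding_pmf_cumulant(psi, beta, order=4):
    """Cumulant expansion of -(1/beta) ln<exp(-beta psi)>, truncated at `order` (1 to 4)."""
    n = len(psi)
    mean = sum(psi) / n
    dev = [p - mean for p in psi]
    m2 = sum(d ** 2 for d in dev) / n  # second central moment
    m3 = sum(d ** 3 for d in dev) / n  # third central moment
    m4 = sum(d ** 4 for d in dev) / n  # fourth central moment
    terms = [
        mean,
        -beta / 2.0 * m2,
        beta ** 2 / 6.0 * m3,
        -beta ** 3 / 24.0 * (m4 - 3.0 * m2 ** 2),
    ]
    return sum(terms[:order])


# Two equally weighted samples, in units where beta = 1.
second_order = binding_pmf_cumulant([0.0, 2.0], beta=1.0, order=2)
fourth_order = binding_pmf_cumulant([0.0, 2.0], beta=1.0, order=4)
```

Truncation is exact only for Gaussian-distributed interaction energies at second order, so the result should be compared against the full exponential average when possible.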

    While formally correct, this approach to ligand sampling can converge slowly if most ligand configurations placed in the binding site have overlapping atoms and high values of $\Psi(r_{RL})$. One potential solution to this problem is to sample the external degrees of freedom from a distribution biased towards energetically favorable orientations by a confining potential $U_c(\xi_L)$. Multiplying and dividing Eq. (10) by $\Omega_c = \int I \, e^{-\beta U_c(\xi_L)} \, d\xi_L$ and the integrand in the numerator by $e^{-\beta U_c(\xi_L)}$ leads to,

    \[ B(r_R) = -\beta^{-1} \ln \left\langle e^{-\beta [\Psi - U_c]} \right\rangle^{r_L, \xi_L}_{L,I_c} - \beta^{-1} \ln \frac{\Omega_c}{\Omega}, \tag{17} \]

    where $q_{L,I_c} = I \, e^{-\beta [\mathcal{U}(r_L) + U_c(\xi_L)]}$. Good choices for $U_c(\xi_L)$, which may be ascertained from existing molecular docking algorithms (as will be discussed later in the paper), will favor the sampling of poses with low $\Psi(r_{RL})$.

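    Alternatively, the binding PMF may be calculated using the inverse form of Eq. (10),

    \begin{align}
    B(r_R) &= \beta^{-1} \ln \frac{\int I \, e^{\beta \Psi(r_{RL})} e^{-\beta \mathcal{U}(r_{RL})} \, dr_L \, d\xi_L}{\int I \, e^{-\beta \mathcal{U}(r_{RL})} \, dr_L \, d\xi_L} \\
    &= \beta^{-1} \ln \left\langle e^{\beta \Psi} \right\rangle^{r_L, \xi_L}_{RL,I}. \tag{18}
    \end{align}

    Ligand configurations from the distribution proportional to $q_{RL,I}(r_L, \xi_L) = I \, e^{-\beta \mathcal{U}(r_{RL})}$ may be sampled, for example, from an implicit-solvent MD simulation in which the receptor is held rigid and the ligand is allowed to move, and the expectation estimated using the sample mean estimator.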

    This straightforward procedure is also problematic because of the rarity of sampling configurations in which the ligand is separated from the receptor or in which they overlap. While these configurations are insignificant in the conformational ensemble in which receptor and ligand are fully interacting, they are relevant to the ensemble of noninteracting ligand and receptor, and the convergence of free energy differences requires phase space overlap between adjacent thermodynamic states [26]. The phase space overlap problem may also be alleviated by calculating the free energy difference with a reference state in which the external degrees of freedom are confined,

    \[ B(r_R) = \beta^{-1} \ln \left\langle e^{\beta [\Psi - U_c]} \right\rangle^{r_L, \xi_L}_{RL,I} - \beta^{-1} \ln \frac{\Omega_c}{\Omega}. \tag{19} \]

    The binding PMF may be estimated from the same samples as with Eq. (18), and will be more accurate the more closely $e^{-\beta U_c(\xi_L)}$ resembles the distribution of $\xi_L$ in the complex.

    As discussed, phase space overlap problems are often resolved by introducing multiple alchemical stages into a calculation, and sampling may be enhanced by Hamiltonian replica exchange. With multiple stages, the total free energy difference between states with $\lambda = 0$ and $\lambda = 1$ is the sum of free energy differences between adjacent stages, each of which may be estimated by FEP [27], TI [28], or BAR [29]. Alternatively, the total free energy difference may be estimated by the multistate Bennett Acceptance Ratio (MBAR) [31].


    Estimating the Binding Free Energy

    Once $B(r_R)$ is evaluated for each receptor configuration, the binding free energy may be calculated by estimating an ensemble average. The appropriate method for estimating $\Delta G^\circ$ depends on how the receptor configurations $r_R$ are sampled. If they are drawn from the distribution $q_R(r_R)$, then the expectation in Eq. (11) may be estimated by the sample mean,

    \[ \widehat{\Delta G^\circ} = -\beta^{-1} \ln \frac{1}{N} \sum_{n=1}^{N} e^{-\beta \hat{B}(r_{R,n})} + \Delta G_\xi, \tag{20} \]

    in which $\hat{B}(r_{R,n})$ is the estimated binding PMF for the nth of $N$ receptor configurations. Because the implicit-ligand expression for the binding free energy, Eq. (11), has the same form as Eq. (10), the dominant state approximation and cumulant expansion may also be applied.

    If receptor configurations are drawn from a biased distribution, the importance sampling identity,

    \[ \langle O \rangle_T = \frac{\int O(r) \, q_T(r) \, dr}{\int q_T(r) \, dr} = \frac{\int O(r) \, w(r) \, q_S(r) \, dr}{\int w(r) \, q_S(r) \, dr} = \frac{\langle w O \rangle_S}{\langle w \rangle_S}, \tag{21} \]

    may be applied. In this generic expression, $w(r) = q_T(r)/q_S(r)$ is a ratio of unnormalized densities $q_T(r)$ for the target distribution and $q_S(r)$ for the sampling distribution. Using the sample mean estimator and importance sampling identity for the expectation in Eq. (11) leads to,

    \[ \widehat{\Delta G^\circ} = -\beta^{-1} \ln \frac{\sum_{n=1}^{N} w(r_{R,n}) \, e^{-\beta \hat{B}(r_{R,n})}}{\sum_{n=1}^{N} w(r_{R,n})} + \Delta G_\xi. \tag{22} \]

    If receptor configurations are drawn from multiple biased distributions, then the expectation may be estimated using MBAR [31].
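Eqs. (20) and (22) differ only in the weights, so both can be covered by one small function. The sketch below is a hedged illustration using the standard library; the function name and argument names are hypothetical, and unit weights reproduce Eq. (20). The exponentials are shifted by the lowest binding PMF for numerical stability.

```python
import math


def binding_free_energy(b_values, beta, dg_xi, weights=None):
    """Eq. (22); with weights=None every weight is 1, which gives Eq. (20)."""
    if weights is None:
        weights = [1.0] * len(b_values)
    shift = min(b_values)  # factor out the dominant Boltzmann weight
    numerator = sum(w * math.exp(-beta * (b - shift)) for w, b in zip(weights, b_values))
    denominator = sum(weights)
    return shift - math.log(numerator / denominator) / beta + dg_xi


# Two receptor snapshots with very different binding PMFs (units of 1/beta).
dg_hat = binding_free_energy([0.0, 100.0], beta=1.0, dg_xi=0.0)
```

With these inputs the snapshot with the lower binding PMF dominates the average, analogous to the dominant state approximation.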

    Thermodynamic Expectations

    Thermodynamic expectations may be estimated from the same data as the binding free energy. The appropriate estimator for $\bar{O}(r_R)$ will depend on how the ligand configurations were sampled. Once $\bar{O}(r_R)$ is estimated for every sampled receptor configuration, the appropriate estimator for the expectation in Eq. (13) similarly depends on how the receptor configurations were sampled. In the simplest case for $\bar{O}(r_R)$, if ligand configurations are sampled from $q_{\xi,I}$, then $\bar{O}(r_R)$ may be estimated by a sample mean. In other cases, $\bar{O}(r_R)$ and $\langle O \rangle^{r_{RL}}_{RL,I}$ may be estimated using importance sampling, MBAR [31], or a combination thereof.
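The ratio of receptor averages in Eq. (13), combined with the importance sampling identity of Eq. (21), can be sketched as follows. The function and argument names are hypothetical; `o_bar` is assumed to hold the rigid-receptor expectations of Eq. (12) and `b_values` the matching binding PMFs for the same receptor snapshots.

```python
import math


def bound_state_expectation(o_bar, b_values, beta, weights=None):
    """Eq. (13) as a ratio of (optionally importance-weighted) receptor averages."""
    if weights is None:
        weights = [1.0] * len(b_values)
    numerator = sum(w * o for w, o in zip(weights, o_bar))
    denominator = sum(w * math.exp(-beta * b) for w, b in zip(weights, b_values))
    return numerator / denominator


# Receptor-only observable: O = 2 and 6 on two snapshots with B = 0 and ln(3),
# so that o_bar_n = O_n * exp(-beta * B_n).
b = [0.0, math.log(3.0)]
o = [2.0, 6.0]
o_bar = [o_n * math.exp(-b_n) for o_n, b_n in zip(o, b)]
mean_o = bound_state_expectation(o_bar, b, beta=1.0)
```

For an observable that depends only on the receptor, this reduces to a Boltzmann-weighted mean with weights proportional to exp(-beta B), which is what the toy example checks.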

    DEMONSTRATION

    As a demonstration, implicit ligand theory calculations were performed to estimate the standard binding free energy of various ligands to Cucurbit[7]uril (CB[7]) in water. The binding of CB[7] to a number of ferrocenes, adamantanes, and bicyclooctanes has been well-characterized by both isothermal titration calorimetry and second-generation mining minima (M2) [18, 19] free energy calculations [32, 33]. Receptor configurations were sampled by molecular dynamics, binding PMFs estimated with a multi-stage alchemical calculation and MBAR [31], and the binding free energy calculated using Eq. (20) or the dominant state approximation.

    Methods

    Molecular dynamics simulations at 300 K were performed with a slightly modified [34] compilation of NAMD [35] version 2.9. When appropriate, CB[7] was fixed using the fixedAtoms parameter. The commercial force field parameters and topologies from Moghaddam et al. [33] were used for both CB[7] and its ligands. To match the force field from Moghaddam et al. [33] as closely as possible, 1-4 electrostatics were scaled by 0.5 and the nonbonded cutoff was set to 999 Å, which effectively turns off cutoffs. Water was represented with the Generalized Born Surface Area (GBSA) implicit solvent model without ions and a surface tension of 0.006 kcal/mol/Å$^2$. The receptor dielectric was 1.0 and solvent dielectric was 78.5. A time step of 1 fs (using a 2 fs time step with fixed atoms led to unstable trajectories) was used with Langevin dynamics.

    CB[7] was minimized for 2500 steps and thermalized by increasing the temperature by 10 K and reinitializing velocities every 100 steps from 0 to 300 K. Receptor snapshots were saved every 0.1 ns from a trajectory of 10 ns.

    Binding PMFs for every ligand in Moghaddam et al. [33] with the minimized CB[7] structure (15 repetitions each) and 100 receptor simulation snapshots (1 repetition each) were estimated using Hamiltonian replica exchange, which can simultaneously dock a ligand and compute its binding free energy [21, 24]. The implementation is similar to that from Gallicchio and Levy [24], except that the receptor configuration is fixed. A reservoir of ligand configurations [24] was generated by simulating the ligand for up to 10 ns and saving snapshots every 10 ps. Simulations of the complex in which $\lambda$ controls the extent of interaction between the ligand and receptor were run with $\lambda \in \{0, 10^{-5}, 10^{-4}, 10^{-3}, 10^{-2}, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 0.95, 1.0\}$. As implemented in NAMD, intermediate values of $\lambda$ used a soft-core potential with a van der Waals shift coefficient of 5. Electrostatic interactions were turned on when $\lambda = 0.5$. Using the colvars module, a flat-bottom harmonic potential with a spring constant of 10 kcal mol$^{-1}$ Å$^{-1}$ and starting at 0.75 Å was used to restrain the center-of-mass distance between the ligand core (heavy atoms except for the R groups in Moghaddam et al. [33]) and the receptor heavy atoms. This potential keeps the ligand within the binding site when interactions are turned off. The binding site volume, $\Omega = \int I \, d\xi_L$, is approximated as $\frac{4}{3}\pi (0.75^3)(8\pi^2)$. Because NAMD does not allow the simultaneous use of alchemical decoupling and implicit solvent, simulations were conducted in vacuum.
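For the spherical binding site described above, the confinement term $\Delta G_\xi = -\beta^{-1} \ln(\Omega C^\circ / 8\pi^2)$ can be evaluated in a few lines. The sketch below assumes a 1 M standard concentration expressed as molecules per cubic ångström and a temperature of 300 K; the function name and constants are illustrative choices rather than anything specified in the paper.

```python
import math

K_B = 0.0019872041             # Boltzmann constant in kcal/(mol K)
AVOGADRO = 6.02214076e23       # molecules per mole
ANGSTROM3_PER_LITER = 1.0e27   # cubic angstroms in one liter


def confinement_free_energy(radius, temperature=300.0, c_standard_molar=1.0):
    """dG_xi = -kT ln(Omega * C0 / (8 pi^2)) for a spherical site of the given radius (angstroms)."""
    kT = K_B * temperature
    c_standard = c_standard_molar * AVOGADRO / ANGSTROM3_PER_LITER  # molecules per cubic angstrom
    omega = (4.0 / 3.0) * math.pi * radius ** 3 * 8.0 * math.pi ** 2
    return -kT * math.log(omega * c_standard / (8.0 * math.pi ** 2))


dg_xi = confinement_free_energy(0.75)
print(f"dG_xi = {dg_xi:.2f} kcal/mol")
```

Under these assumptions the term comes to roughly 4.1 kcal/mol; the rotational factor $8\pi^2$ cancels, leaving only the translational volume of the site.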

    The replica exchange simulation was initiated by taking a random ligand configuration, applying a random rotation, and randomly placing it within the binding site. This initial configuration was minimized and thermalized with the same protocol as with CB[7], except that it was done in vacuum. The thermalized structure was used to start each replica. Occasionally, the random placement of the ligand led to high forces that caused the simulations to crash; in this case, the simulation was restarted with a different random initial configuration.

    After every 5 ps of simulation for every value of $\lambda$, 1000 replica exchanges were attempted between each pair of adjacent windows. After each set of replica exchange attempts, the ligand configuration for $\lambda = 0$ was replaced with a random ligand configuration from the reservoir, randomly rotated, and placed in the binding site. (This type of reservoir swap satisfies detailed balance.) The simulation was conducted for 25 cycles, saving snapshots every 0.5 ps, for a total of 2 ns of simulation for each binding PMF. The docking and equilibration period, defined as the time before the potential energy of the fully coupled state is within 20 $k_B T$ of its energy for the final snapshot, was ignored in subsequent analysis.

    Because alchemical coupling calculations were performed in vacuum, binding PMFs were estimated based on a decomposition of $B(r_R)$,

    \begin{align}
    B(r_R) &= B_{cpl} + B_{RL} - B_L - \Delta U(r_R) \tag{23} \\
    B_{cpl} &= -\beta^{-1} \ln \frac{\int I \, e^{-\beta U(r_{RL})} \, dr_L \, d\xi_L}{\int I \, e^{-\beta [U(r_L) + U(r_R)]} \, dr_L \, d\xi_L} \\
    B_{RL} &= -\beta^{-1} \ln \frac{\int I \, e^{-\beta \Delta U(r_{RL})} e^{-\beta U(r_{RL})} \, dr_L \, d\xi_L}{\int I \, e^{-\beta U(r_{RL})} \, dr_L \, d\xi_L} \\
    B_L &= -\beta^{-1} \ln \frac{\int I \, e^{-\beta \Delta U(r_L)} e^{-\beta U(r_L)} \, dr_L \, d\xi_L}{\int I \, e^{-\beta U(r_L)} \, dr_L \, d\xi_L}.
    \end{align}

    $B_{cpl}$ is the free energy of turning on the interactions between the ligand and the rigid receptor in vacuum. $B_{RL}$, $B_L$, and $\Delta U(r_R)$ are free energies of transferring the complex, ligand, and receptor, respectively, from vacuum to the target state (in implicit solvent). They are based on $\Delta U(r_X) = U_T(r_X) - U(r_X)$, the potential energy difference between $r_X$ in the target state versus the state from which configurations were sampled (in vacuum). $B_{cpl}$ was estimated by applying MBAR [31] to snapshots from every 0.5 ps of simulation, and $B_{RL}$ and $B_L$ by single-step FEP (evaluating transfer free energies by MBAR would require calculating target-state potential energies for every snapshot using computationally expensive force fields).
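The decomposition in Eq. (23) can be assembled from a coupling free energy and two single-step FEP transfer terms. The sketch below is illustrative only: the function names are hypothetical, `b_cpl` is assumed to be supplied by a separate MBAR analysis, and `du_complex` and `du_ligand` are assumed to be lists of $\Delta U$ values evaluated on snapshots of the fully interacting complex and of the non-interacting ligand, respectively.

```python
import math


def single_step_fep(delta_u, beta):
    """-(1/beta) ln<exp(-beta * dU)> over snapshots from the sampled (vacuum) state."""
    shift = min(delta_u)  # factor out the largest Boltzmann weight for stability
    mean = sum(math.exp(-beta * (d - shift)) for d in delta_u) / len(delta_u)
    return shift - math.log(mean) / beta


def binding_pmf_from_components(b_cpl, du_complex, du_ligand, du_receptor, beta):
    """Eq. (23): B = B_cpl + B_RL - B_L - dU(r_R)."""
    b_rl = single_step_fep(du_complex, beta)
    b_l = single_step_fep(du_ligand, beta)
    return b_cpl + b_rl - b_l - du_receptor


# Toy numbers in units of 1/beta, chosen only to exercise the arithmetic.
b = binding_pmf_from_components(-20.0, [-10.0] * 3, [-4.0] * 3, -5.0, beta=1.0)
```

Because the receptor is rigid, its transfer term is a single energy difference rather than an average.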

    This decomposition makes it straightforward to evaluate $B(r_R)$ for a variety of force fields using the same configurational samples. In this work, four are compared:

    1. NAMD: the total potential energy from using GBSA in NAMD [35];

    2. M2: the total potential energy from using the GBSA model in the M2 program [18, 19];

    3. PB: Poisson-Boltzmann electrostatic solvation free energies from UHBD [36] and bond, angle, dihedral, coulomb, and van der Waals energies from the M2 program [18, 19];

    4. PBSA: Poisson-Boltzmann electrostatic solvation free energies from UHBD [36] and bond, angle, dihedral, coulomb, van der Waals, and nonpolar surface area energies from M2 [18, 19], the combination used in Moghaddam et al. [33].

    During this step, the NAMD, M2, and UHBD programs are used strictly for single-point energy evaluations, not for minimization or dynamics. Poisson-Boltzmann energies were calculated with a grid spacing of 0.18 Å with dimensions such that the maximum dimensions of the molecule are 0.7 (or less) of the final grid [33]. For comparison, binding PMFs were also calculated from the dominant state approximation with PBSA energies, using the lowest value of $\Psi(r_{RL})$ observed in the simulations with $\lambda = 0$ or $\lambda = 1$.

    Because receptor configurations were sampled from a simulation in GBSA implicit solvent, binding free energies were estimated by using Eq. (22). Binding free energies were also estimated with the dominant state approximation: using the lowest observed value of $B(r_R)$ to estimate $-\beta^{-1} \ln \langle e^{-\beta B} \rangle^{r_R}_{R}$.

    To demonstrate the calculation of thermodynamic expectations and for comparison with results from Moghaddam et al. [33], the mean values of six PBSA energies (van der Waals, coulomb, electrostatic solvation, valence (bond + angle + dihedral), nonpolar solvation, and total) were estimated for the complex, the receptor, and the ligand. Mean PBSA energies for the ligand and receptor were estimated by applying the importance sampling identity to the ligand from the non-interacting system in vacuum and to the receptor from the GBSA simulation, respectively. Occasionally, energies in the ligand trajectory briefly spiked to very high values. In estimating the mean PBSA energies, these spikes were filtered out by removing data points in which the total PBSA energy is at least 100 $k_B T$ larger than the PBSA energy of the final snapshot. As the spikes were likely caused by the finite molecular dynamics time step, they would probably be avoided by using a propagator that exactly preserves the Boltzmann distribution, e.g. Hybrid Monte Carlo [37].
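The spike filter described above amounts to a simple threshold relative to the final snapshot. A minimal sketch follows; the function name and the example energies are hypothetical, and energies and `kT` are assumed to share the same units.

```python
def filter_energy_spikes(energies, kT, threshold_kT=100.0):
    """Drop points at least threshold_kT * kT above the energy of the final snapshot."""
    cutoff = energies[-1] + threshold_kT * kT
    return [e for e in energies if e < cutoff]


# Made-up trajectory energies (kcal/mol) with one spike; kT is about 0.6 kcal/mol at 300 K.
kept = filter_energy_spikes([1.0, 500.0, 2.0, 1.5], kT=0.6)
print(kept)
```

In practice the same mask would also be applied to every other per-snapshot quantity so that the arrays stay aligned.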

    Towards estimating the mean PBSA energies of the complex, rigid-receptor expectations were estimated by applying MBAR [31] to snapshots from the non-interacting and fully interacting states,

    $$\hat{O}(r_R) = \frac{\sum_{n=1}^{N} w(r_{RL,n})\, O(r_{RL,n})}{\sum_{n=1}^{N} w(r_{RL,n})} \qquad (24)$$

    $$w(r_{RL}) = \frac{e^{-\beta \left( U_{PBSA}(r_{RL}) - U_0(r_{RL}) \right)}}{1 + \frac{N_1}{N_0}\, e^{-\beta \left( U_1(r_{RL}) - B_{cpl} - U_0(r_{RL}) \right)}},$$

    where $U_0(r_{RL})$ and $U_1(r_{RL})$ are the potential energies of the non-interacting and fully interacting complexes, respectively, $U_{PBSA}(r_L)$ is the PBSA energy of only the ligand, and $r_{RL,n}$ is the $n$th of $N$ snapshots of either the non-interacting ($N_0$ snapshots) or fully interacting complex ($N_1$ snapshots). $B_{cpl}$ was estimated by using MBAR [31] with all replicas. While it would be possible to estimate the mean PBSA energies using all snapshots from all replicas, this was avoided because of the computational expense of Poisson-Boltzmann calculations, which can take over a minute per snapshot. After obtaining $\hat{O}(r_R)$, the importance sampling identity, Eq. (21), was used to estimate the expectations in Eq. (13). To ensure consistency of the estimator (an estimate of a constant yields the same constant), $\hat{O}(r_R)$ was calculated for $O = 1$, in which case $\langle \hat{O}(r_R) \rangle_{r_R}^{R} = \langle e^{-\beta B} \rangle_{r_R}^{R}$. This estimate of $\langle e^{-\beta B} \rangle_{r_R}^{R}$ was used in the denominator of Eq. (13).
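    As a rough illustration of the two-state reweighting in Eq. (24), the sketch below evaluates the weights in log space and forms the self-normalized average. It is not the MBAR implementation used in the paper; the function names, the snapshot energies, and the values of `n0`, `n1`, and `b_cpl` are hypothetical, and all energies are assumed to be expressed in units of $k_B T$ (so `beta = 1`).

```python
import math


def log_weight(u_target, u0, u1, b_cpl, n0, n1, beta=1.0):
    """Log of the two-state mixture weight w for one snapshot.

    u_target is the energy of the state being reweighted to, u0 and u1
    are the energies in the two sampled states, and b_cpl is the free
    energy difference between them. All share the units of 1/beta.
    """
    log_numerator = -beta * (u_target - u0)
    x = math.log(n1 / n0) - beta * (u1 - b_cpl - u0)
    # log(1 + exp(x)), evaluated without overflow for large |x|
    log_denominator = max(x, 0.0) + math.log1p(math.exp(-abs(x)))
    return log_numerator - log_denominator


def weighted_expectation(observable, log_w):
    """Self-normalized weighted mean, sum(w * O) / sum(w)."""
    shift = max(log_w)  # subtract the largest log weight before exponentiating
    w = [math.exp(lw - shift) for lw in log_w]
    return sum(wi * oi for wi, oi in zip(w, observable)) / sum(w)


# Hypothetical snapshots: (U_target, U_0, U_1, observable), energies in kBT.
snapshots = [
    (-12.0, -3.0, -20.0, 1.5),
    (-10.5, -2.5, -18.0, 2.0),
    (-11.0, -4.0, -19.5, 1.0),
    (-9.0, -3.5, -17.0, 2.5),
]
n0, n1, b_cpl = 2, 2, -15.0  # illustrative sample counts and coupling free energy

log_w = [log_weight(ut, u0, u1, b_cpl, n0, n1) for ut, u0, u1, _ in snapshots]
estimate = weighted_expectation([s[3] for s in snapshots], log_w)
constant_check = weighted_expectation([1.0] * len(snapshots), log_w)
print(estimate, constant_check)
```

    Working with log weights avoids overflow when energy differences span tens of $k_B T$, which is plausible for the strongly charged ligands discussed in the Results.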

    Results

    Highlighting the importance of an accurate molecular mechanics model, binding PMF estimates are strongly dependent on the force field, as shown in Table I. For the large and highly charged bicyclooctane B11, switching the force field changes the binding PMF by nearly 40 kcal/mol. With increasing magnitude of charge, larger Coulomb energies lead to larger values of $B_{cpl}$, and larger electrostatic solvation free energies increase the magnitude of $B_{RL}$, $B_L$, and $U(r_R)$ (for estimates of $B_{cpl}$, $B_{RL}$, and $B_L$, see Table SI of the Supplemental Material). Thus, estimating the binding PMF with Eq. (23) entails the difficult task of computing a relatively small difference between large values. The importance of the force field has also been noted for M2 calculations [18, 19]. An alternate implementation, e.g. conducting replica exchange within implicit solvent rather than vacuum, may not require the implicit solvent model to be as accurate.

    With 2 ns of total simulation for all replicas, the standard deviation of binding PMF estimates ranges from 0.12 to 1.63 kcal/mol (Table I), with most estimates at the lower end of this range. For all of the components of Eq. (23), the mean estimate does not appear to shift after about 0.75 ns, and additional sampling reduces the standard deviation of the estimate (see Fig. 1 and Fig. S1 in the Supplemental Material). There is no unique component that limits the convergence of $B(r_R)$; the slowest converging component varies from ligand to ligand. The binding PMF estimate $B(r_R)$ and the minimal interaction energy $\min\{\Psi(r_{RL})\}$ converge at about the same rate, suggesting that the limiting factor for convergence is finding a configuration with the lowest interaction energy. This interpretation is corroborated by the fact that the largest ligands with the most rotatable bonds (see Moghaddam et al. [33] for structures) also have the most variance in $B(r_R)$, as the flexibility increases the challenge of finding configurations with low $\Psi(r_{RL})$.

    [Figure 1: five stacked panels for ligand B02 showing $B(r_R)$, $B_{cpl}$, $B_{RL}$, $B_L$, and $\min\{\Psi(r_{RL})\}$ versus Total Simulation Time (ns), from 0 to 2 ns.]

    FIG. 1. The mean and standard deviation of 15 independent estimates of $B(r_R)$, $B_{cpl}$, $B_{RL}$, $B_L$, and $\min\{\Psi(r_{RL})\}$ (kcal/mol) based on PBSA energies as a function of total MD simulation time for the ligand B02. Analogous plots for the other ligands in this study are available as Fig. S1 in the Supplemental Material.

    The accuracy of binding free energy estimates was assessed with the correlation coefficient and root mean square error,

    $$\mathrm{RMSE}(m_1, m_2) = \sqrt{\frac{1}{L} \sum_{l=1}^{L} \left( \Delta G^\circ_{l,m_1} - \Delta G^\circ_{l,m_2} \right)^2} \qquad (25)$$

    between methods $m_1$ and $m_2$, where $\Delta G^\circ_{l,m}$ is the binding free energy estimate for ligand $l$ of $L$ ligands using method $m$ (Tables I and II).
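    The two accuracy metrics can be computed as follows. This is a small sketch with made-up free energies, chosen so that the two methods differ by a constant offset; the function names are illustrative and $R^2$ is taken here to be the square of the Pearson correlation coefficient, which is an assumption about how the paper computes it.

```python
import math


def rmse(g1, g2):
    """Root mean square error between two lists of free energies, Eq. (25)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)) / len(g1))


def r_squared(x, y):
    """Square of the Pearson correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)


# Hypothetical binding free energies (kcal/mol) from two methods.
method_1 = [-10.0, -12.0, -14.0]
method_2 = [-8.0, -10.0, -12.0]

error = rmse(method_1, method_2)
correlation = r_squared(method_1, method_2)
print(error, correlation)
```

    The example shows why both metrics are reported: a constant offset leaves the correlation perfect while the RMSE stays at the size of the offset.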

    Binding free energy estimates based on the binding PMF for a minimized receptor structure suffice to provide high correlation with experiment ($R^2 = 0.884$ for NAMD) and M2 free energy calculations ($R^2 = 0.827$ for NAMD) (see Table I). Surprisingly, binding free energies from NAMD GBSA calculations are more highly correlated to these benchmarks than $\Delta G^\circ$ from PBSA calculations. Ironically, the high correlation may be explained by inaccurately large binding PMF values resulting from highly charged ligands, as the molecules in this set with the strongest charges also tend to have stronger binding affinities. Although the correlation coefficient is high, the RMSE is also considerable, over 10 kcal/mol. Similar performance ($R^2$ and RMSE) is observed by using the dominant state approximation with PBSA calculations. In contrast, using Eq. (23) with PBSA leads to less correlated (lower $R^2$) but more accurate (lower RMSE) estimates of the binding free energy.

    Even for this simple system, binding free energy estimates are substantially improved by using multiple receptor structures (Table II). With binding PMFs from PBSA energies for 100 receptor structures, there is both higher correlation and lower RMSE with respect to experiment ($R^2_{Exp} = 0.704$, $\mathrm{RMSE}_{Exp} = 4.5$) and especially with respect to M2 free energy calculations ($R^2_{Gilson} = 0.925$, $\mathrm{RMSE}_{Gilson} = 2.4$).

    While there are some variations on the order of a few kcal/mol, mean potential energy changes upon complexation are also consistent with results from Moghaddam et al. [33] (Table III). Minor discrepancies between M2 and implicit ligand free energy and mean potential energy calculations may be explained by a combination of imperfect sampling in the current calculations and the approximations in M2. As the described calculations were performed in vacuum, the samples may not be from the same configurational space as those in implicit solvent. On the other hand, M2 assumes that the energy landscapes of the ligand, receptor, and complex are truncated harmonic wells with anharmonicity corrections.

    Compared to the full procedure for estimating the binding PMF, applying the dominant state approximation to the binding PMF leads to a reduction in the correlation with M2 results and an increase in the RMSE (Table IV). In contrast, applying the dominant state approximation to calculate $\Delta G^\circ$ from $B(r_R)$ leads to a near-constant reduction of about 3 kcal/mol in the estimated binding free energy. While the RMSE increases, the correlation with M2 results remains nearly identical. Given the same $B(r_R)$ results, however, there is essentially no reason to apply the dominant state approximation rather than Eq. (22).

    [Figure 2: (a) histogram of Binding PMF (kcal/mol) versus Count; (b) and (c) binding free energy estimate versus Number of Receptor Snapshots, for up to 20 and up to 100 snapshots, respectively.]

    FIG. 2. (a) Histogram of binding PMF estimates $B(r_R)$ (kcal/mol) of B02 to 100 snapshots of CB[7], using PBSA energies. The vertical line shows the mean binding PMF for the minimized receptor structure. (b) and (c) Estimates of the binding free energy $\Delta G^\circ$ of B02 to CB[7] (kcal/mol), using PBSA energies, as a function of the number of receptor snapshots. The line and error bars denote the mean and standard deviation from bootstrapping: the binding free energy is estimated 100 times using random selections of $N$ out of 100 binding PMFs. Analogous plots for the other ligands in this study are available as Figs. S2 and S3 in the Supplemental Material.

    There is considerable variation in the binding PMFs for the 100 receptor structures (Fig. 2 and Fig. S2 in the Supplemental Material). For most of the ligands, the range of binding PMFs spans 10 to 20 kcal/mol. While the binding PMF of the minimized structure is often near the lower end of the binding PMF distribution, this is not always the case. In larger ligands, the binding PMF appears to be lower for other receptor structures. The fact that a single structure does not always lead to the lowest binding PMF shows a major limitation of using a single receptor structure to estimate binding free energies.

    In spite of the variability of binding PMFs, for the ligands in the test set, the average value of $\Delta G^\circ$ appears to stabilize after a relatively small number (about 15) of receptor snapshots (Fig. 2 and Fig. S3 in the Supplemental Material). Using a greater number of snapshots slightly reduces the variance of binding free energy estimates. After a certain point, however, further reduction in the variance of $\Delta G^\circ$ is limited by the variance in binding PMF estimates.
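    The convergence test with respect to the number of receptor snapshots can be mimicked with synthetic data. The sketch below is an assumption-laden illustration: the binding PMFs are random numbers rather than results from the paper, the temperature of 298.15 K is assumed, subsets are drawn without replacement (the text does not say which scheme was used), and the standard-state term that completes the binding free energy is left out because it is a constant.

```python
import math
import random
import statistics

KT = 0.0019872041 * 298.15  # kcal/mol; the temperature is an assumption


def exp_average(pmfs, kT=KT):
    """-kT ln <exp(-B/kT)>, shifted by min(B) for numerical stability."""
    b_min = min(pmfs)
    total = sum(math.exp(-(b - b_min) / kT) for b in pmfs)
    return b_min - kT * math.log(total / len(pmfs))


def subsample_stats(pmfs, n, repeats=100, seed=0):
    """Mean and standard deviation of the estimate over random subsets of size n."""
    rng = random.Random(seed)
    estimates = [exp_average(rng.sample(pmfs, n)) for _ in range(repeats)]
    return statistics.mean(estimates), statistics.stdev(estimates)


# Synthetic binding PMFs (kcal/mol) standing in for 100 receptor snapshots.
rng = random.Random(1)
pmfs = [rng.gauss(-17.0, 3.0) for _ in range(100)]

for n in (5, 15, 50, 100):
    mean_g, std_g = subsample_stats(pmfs, n)
    print(f"N = {n:3d}: mean = {mean_g:7.2f}, std = {std_g:5.2f} kcal/mol")
```

    Because the exponential average is dominated by the lowest binding PMFs, the spread across subsets mostly reflects how often the few most favorable snapshots are included.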

    DISCUSSION

    While the good agreement between implicit ligand and M2 calculations provides a proof of principle, the convergence and accuracy of implicit ligand calculations will differ with other classes of receptor-ligand pairs. With protein-ligand pairs, for example, representative sampling of receptors and finding low-energy poses of the ligand will likely require much more MD simulation time. On the other hand, many protein-ligand systems are not as strongly charged and may be less sensitive to the electrostatic solvation free energy. Due to these variabilities, assessments of the feasibility of implicit ligand calculations in different classes of systems will prove valuable. Tests for convergence and accuracy may be similar to those performed for the CB[7] system.

    Numerous opportunities remain for further methodological improvement and optimization of implicit ligand free energy calculations. The accuracy of implicit ligand calculations (and M2 calculations) may be limited by the quality of the force field. The decomposition of the binding PMF in Eq. (23) provides a facile means to integrate alternate and potentially more expensive potential energies, e.g. quantum mechanical calculations or more sophisticated nonpolar solvation free energies. Modeling may also be improved by the inclusion of a few explicit water molecules (see Supplemental Material). Another potential avenue for improvement is the fine-tuning of the replica exchange protocol (e.g. using implicit solvent or optimizing the number of stages and the values of the coupling parameter for a particular system) or implementing alternative methods to estimate the binding PMF and binding free energy.

    Even without modifying the replica exchange protocol, computations may be accelerated by optimizing existing MD simulation packages for implicit ligand theory. Few modern MD simulation programs take full advantage of rigid degrees of freedom by skipping the calculation of pairwise interactions between rigid atoms. Even fewer implicit solvent models are designed with rigid receptors in mind [38]; implicit ligand theory may inspire the development of such models.

    Implicit ligand theory also provides guidance on how to understand and improve existing molecular docking algorithms. The definition of $\Psi(r_{RL})$ provides a straightforward functional form that can be used to account for solvation free energies and ligand internal energies (strain), which have been noted to be important factors in binding free energies [39], but are frequently ignored in the interaction energy functions used by docking packages. Implicit ligand theory also delineates how to improve the ranking of different ligands. Molecular docking packages currently rank receptor-ligand binding free energies based on a single low-energy configuration. As such, they apply the crudest form of implicit ligand theory, the dominant state approximation, to estimate both the binding PMF and the binding free energy. With important modifications to existing algorithms and the application of more complex estimators, the accuracy of scoring functions should be enhanced.

    One important potential change to molecular docking is the inclusion of multiple receptor configurations. While most modern docking packages account for the orientation and flexibility of the ligand, the large number of coordinates makes the treatment of receptor flexibility challenging. A number of groups have improved docking performance by treating receptor flexibility using multiple structures from crystallography [25, 40, 41] or MD simulations [42] (the relaxed complex method [7, 43, 44]). Molecular dynamics simulations have also revealed binding sites not discovered by crystallography [45, 46]. In the case of HIV integrase, insight into a new binding site even inspired the development of a new drug [47].

    Despite this success, it has hitherto remained unclear how to combine information from docking to different receptor snapshots. While averaging strategies have been empirically compared [48], the default strategy has been to rank the ligand using the minimal energy from docking to all the snapshots. With implicit ligand theory, it is clear that the binding free energy may also be estimated by using an exponential average or cumulant expansion of the binding PMF (which may still come from the dominant state approximation) for different snapshots.
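    The three ways of combining per-snapshot binding PMFs mentioned above can be compared side by side. This is a sketch under stated assumptions: the five PMF values are invented, the temperature is assumed to be 298.15 K, the cumulant expansion is truncated at second order, and the constant standard-state term is omitted.

```python
import math
import statistics

KT = 0.0019872041 * 298.15  # kcal/mol; the temperature is an assumption


def minimum_energy(pmfs):
    """Default docking strategy: keep only the most favorable snapshot."""
    return min(pmfs)


def exponential_average(pmfs, kT=KT):
    """-kT ln <exp(-B/kT)> over snapshots, shifted by min(B) for stability."""
    b_min = min(pmfs)
    total = sum(math.exp(-(b - b_min) / kT) for b in pmfs)
    return b_min - kT * math.log(total / len(pmfs))


def second_order_cumulant(pmfs, kT=KT):
    """Cumulant expansion truncated at second order: <B> - var(B) / (2 kT)."""
    return statistics.mean(pmfs) - statistics.pvariance(pmfs) / (2.0 * kT)


# Hypothetical binding PMFs (kcal/mol) for five receptor snapshots.
pmfs = [-19.0, -17.5, -16.0, -14.0, -12.5]

print("minimum:            ", minimum_energy(pmfs))
print("exponential average:", exponential_average(pmfs))
print("2nd-order cumulant: ", second_order_cumulant(pmfs))
```

    The exponential average always lies between the minimum and the arithmetic mean. The truncated cumulant expansion has no such bound and can fall below the minimum when the spread of binding PMFs is large compared with $k_B T$, as it does for these made-up values.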

    The computational expense of the relaxed complex approach may be reduced by clustering snapshots and selecting a representative snapshot from each cluster [44]. Assuming that the binding PMF is constant within the cluster, estimated averages may be weighted by the cluster size. Using a clustering algorithm based on QR factorization to select 33 representative structures, Amaro et al. [44] were able to accurately reproduce a histogram of docking scores to over 400 structures. Further research will be necessary to develop and validate algorithms that reliably cluster receptor configurations in which the binding PMF is nearly constant.
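    Weighting by cluster size can be written as a weighted exponential average. The sketch below assumes, as the text does, that the binding PMF is constant within each cluster; the representative PMFs, the cluster sizes, the function name, and the temperature are all illustrative.

```python
import math

KT = 0.0019872041 * 298.15  # kcal/mol; the temperature is an assumption


def cluster_weighted_average(rep_pmfs, cluster_sizes, kT=KT):
    """-kT ln sum_c (n_c / n_total) exp(-B_c / kT) over cluster representatives."""
    n_total = sum(cluster_sizes)
    b_min = min(rep_pmfs)  # shift for numerical stability
    total = sum((n / n_total) * math.exp(-(b - b_min) / kT)
                for b, n in zip(rep_pmfs, cluster_sizes))
    return b_min - kT * math.log(total)


# Hypothetical representatives: binding PMF (kcal/mol) and cluster population.
rep_pmfs = [-18.0, -16.0, -13.0]
cluster_sizes = [50, 30, 20]

weighted = cluster_weighted_average(rep_pmfs, cluster_sizes)

# Consistency check: expanding each cluster into identical snapshots and
# giving every snapshot unit weight should reproduce the same number.
expanded = [b for b, n in zip(rep_pmfs, cluster_sizes) for _ in range(n)]
unweighted = cluster_weighted_average(expanded, [1] * len(expanded))
print(weighted, unweighted)
```

    Only three binding PMF calculations are needed here instead of one hundred, which is the saving the clustering strategy aims for.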

    Compared to the inclusion of multiple receptor structures, a more difficult task is the estimation of $B(r_R)$ using molecular docking, as this involves a paradigm shift from searching for a minimum to sampling from a distribution. Docking algorithms may be broadly classified into two categories: matching and docking simulation [49]. Matching algorithms such as DOCK [50] attempt to match a ligand into a model of the binding site. DOCK models both the ligand and the binding site as sets of spheres and uses algorithms from graph theory to align the ligand spheres into the binding site spheres. In docking simulation methods such as AutoDock [49, 51-53] and MCDOCK [54], the ligand starts outside the binding site and its configuration and orientation are progressively modified to search for the lowest energy configuration of the complex.

    Matching algorithms can estimate $B(r_R)$ by a postprocessing algorithm. That is, after low-energy complexes are found, they may be used to bias receptor-independent random sampling of the ligand orientation by a confining potential $U_c(\xi_L)$, for use in Eq. (17). With a harmonic potential for $U_c(\xi_L)$, the ligand orientation will come from a Gaussian distribution. An alternative postprocessing algorithm is to use the lowest-energy structure from a matching algorithm as a starting point for a rigid-receptor MD simulation. This is not prohibitively expensive; Graves et al. [8] even used MD simulations with flexibility near the binding site as a postprocessing step for molecular docking. Samples from this simulation would be used to estimate $B(r_R)$ based on Eq. (18) or Eq. (19).
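    The link between a harmonic confining potential and Gaussian sampling can be made concrete. For a potential $\frac{1}{2} k |x - x_0|^2$, the Boltzmann distribution of each Cartesian component is a Gaussian centered on $x_0$ with variance $k_B T / k$. The sketch below draws only a three-component translational coordinate; the spring constant, center, temperature, and function name are illustrative assumptions, and rotational degrees of freedom would need their own treatment.

```python
import math
import random
import statistics


def sample_confined(center, k_spring, kT, n, seed=0):
    """Draw n positions from the Boltzmann distribution of 0.5 * k * |x - center|^2."""
    rng = random.Random(seed)
    sigma = math.sqrt(kT / k_spring)  # standard deviation per Cartesian component
    return [tuple(rng.gauss(c, sigma) for c in center) for _ in range(n)]


KT = 0.0019872041 * 298.15  # kcal/mol at an assumed 298.15 K
K_SPRING = 10.0             # kcal/mol/A^2, illustrative value
CENTER = (1.0, -2.0, 0.5)   # A, e.g. taken from a low-energy docked pose

samples = sample_confined(CENTER, K_SPRING, KT, n=20000)
x_values = [s[0] for s in samples]
x_mean = statistics.mean(x_values)
x_var = statistics.pvariance(x_values)
print(x_mean, x_var, KT / K_SPRING)
```

    Because the samples come from a known, normalized distribution, their density can be evaluated exactly, which is what an importance sampling estimator requires.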

    Docking simulation methods, on the other hand, will need to be modified to sample from a known distribution rather than to search for the minimum energy. This change may not require a complete revamp. Docking simulation algorithms are often based on Monte Carlo approaches, which preserve a desired distribution or may be readily modified to do so. For example, MCDOCK [54] and early generations of AutoDock [51, 52] use simulated annealing, a procedure for which it is possible to calculate the importance sampling weight [55].

    In addition to providing a path to rigorous binding free energies from molecular docking, implicit ligand theory also quantifies existing notions [56] about whether molecular recognition proceeds by induced fit or conformational selection [57, 58]. As all receptor configurations have finite Boltzmann probability, the issue is a matter of degree. Suppose that a receptor binds to two different ligands with the same binding free energy, one by conformational selection and the other by induced fit. If, to a good approximation, the complex is dominated by a single structure with receptor configuration $r_R^*$ such that $B(r_R) = \infty$ for all other receptor configurations, then Eq. (11) simplifies to $\Delta G^\circ = U(r_R^*) + B(r_R^*) + \beta^{-1} \ln Z_R + \Delta G_\xi$. For the ligand that binds by conformational selection, the probability $p(r_R^*) = e^{-\beta U(r_R^*)}/Z_R$ is reasonably high. In the induced fit complex, $U(r_R^*)$ is much less favorable and $B(r_R^*)$ must compensate accordingly to achieve the same $\Delta G^\circ$.
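    The simplification quoted above can be traced in two steps. The block below is a sketch that assumes Eq. (11) has the exponential-average form $\Delta G^\circ = -\beta^{-1} \ln \langle e^{-\beta B(r_R)} \rangle_{r_R} + \Delta G_\xi$, with the average taken over the Boltzmann distribution $e^{-\beta U(r_R)}/Z_R$ of the receptor alone. The labels "sel" and "ind" for the two ligands are introduced here for illustration, and the final line further assumes that the term $\Delta G_\xi$ is the same for both ligands.

```latex
\begin{align}
\left\langle e^{-\beta B(r_R)} \right\rangle_{r_R}
  &= \frac{1}{Z_R} \int e^{-\beta \left[ U(r_R) + B(r_R) \right]} \, dr_R
   \approx \frac{e^{-\beta \left[ U(r_R^*) + B(r_R^*) \right]}}{Z_R}, \\
\Delta G^\circ
  &\approx U(r_R^*) + B(r_R^*) + \beta^{-1} \ln Z_R + \Delta G_\xi, \\
% Equal binding free energies to the same receptor (same Z_R) then require
U(r_{R,\mathrm{sel}}^*) + B_{\mathrm{sel}}(r_{R,\mathrm{sel}}^*)
  &= U(r_{R,\mathrm{ind}}^*) + B_{\mathrm{ind}}(r_{R,\mathrm{ind}}^*).
\end{align}
```

    Read this way, every unit of receptor energy that the induced fit ligand pays to reach an improbable configuration has to be recovered through a more favorable binding PMF at that configuration.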

    Source code and data used in this paper are available at https://simtk.org/home/implicit_ligand.

    ACKNOWLEDGEMENT

    The author thanks David Beratan for being a supportive postdoctoral advisor, Aaron Virshup and Shahar Keinan for helpful discussions, John Chodera and David Mobley for comments on the manuscript, Yi Wang for suggesting CB[7] as a test case, Michael Gilson for providing parameters for CB[7] and its ligands, Clayton Jarratt for pinpointing the cause of negative surface areas in NAMD, and Emilio Gallicchio for sharing source code for BEDAM. Calculations were performed using Duke Shared Computing Resources (DSCR). This research was funded by NSF CHE10-57953, NIH 2P50 GM-067082-06-10, and N00014-11-1-0729.

    Electronic Address: [email protected]

    [1] B. K. Shoichet, Nature 432, 862 (2004).
    [2] G. Klebe, Drug Discov Today 11, 580 (2006).
    [3] R. Kim and J. Skolnick, J Comput Chem 29, 1316 (2008).
    [4] N. Moitessier, P. Englebienne, D. Lee, J. Lawandi, and C. R. Corbeil, Br J Pharmacol 153, S7 (2009).
    [5] D. Plewczynski, M. Lazniewski, R. Augustyniak, and K. Ginalski, J Comput Chem 32, 742 (2010).
    [6] J. Wang, P. Morin, W. Wang, and P. Kollman, J Am Chem Soc 123, 5221 (2001).
    [7] J. Lin, A. Perryman, J. Schames, and J. McCammon, Biopolymers 68, 47 (2003).
    [8] A. P. Graves, D. M. Shivakumar, S. E. Boyce, M. P. Jacobson, D. A. Case, and B. K. Shoichet, J Mol Biol 377, 914 (2008).
    [9] D. C. Thompson, C. Humblet, and D. Joseph-McCarthy, J Chem Inf Model 48, 1081 (2008).
    [10] T. Hou, J. Wang, Y. Li, and W. Wang, J Comput Chem 32, 866 (2010).
    [11] M. W. Chang, C. Ayeni, S. Breuer, and B. E. Torbett, PLoS ONE 5, e11955 (2010).
    [12] N. Huang, C. Kalyanaraman, K. Bernacki, and M. P. Jacobson, Phys Chem Chem Phys 8, 5166 (2006).
    [13] Activities have been assumed to be unity, a reasonable approximation in the limit of low concentrations.
    [14] M. K. Gilson, J. Given, B. Bush, and J. McCammon, Biophys J 72, 1047 (1997).
    [15] J. Wang, C. Tan, Y.-H. Tan, Q. Lu, and R. Luo, Commun Comput Phys 3, 1010 (2008).
    [16] M. Feig and C. Brooks, Curr Opin Struc Biol 14, 217 (2004).
    [17] J. Michel and J. W. Essex, J Med Chem 51, 6654 (2008).
    [18] C.-E. Chang and M. K. Gilson, J Comput Chem 24, 1987 (2003).
    [19] C.-E. Chang, M. J. Potter, and M. K. Gilson, J Phys Chem B 107, 1048 (2003).
    [20] M. Lee, Biophys J 90, 864 (2005).
    [21] E. Gallicchio, M. Lapelosa, and R. M. Levy, J Chem Theory Comput 6, 2961 (2010).
    [22] J. Cohen, A. Arkhipov, R. Braun, and K. Schulten, Biophys J 91, 1844 (2006).
    [23] W. Jiang, M. Hodoscek, and B. Roux, J Chem Theory Comput 5, 2583 (2009).
    [24] E. Gallicchio and R. M. Levy, J Comput-Aided Mol Des , 1 (2012).


    [25] S. Rao, P. C. Sanschagrin, J. R. Greenwood, M. P. Repasky, W. Sherman, and R. Farid, J Comput Aided Mol Des 22, 621 (2008).
    [26] C. Chipot and A. Pohorille, eds., Free Energy Calculations, Vol. 86 (Springer, Berlin, 2007).
    [27] R. Zwanzig, J Chem Phys 22, 1420 (1954).
    [28] J. G. Kirkwood, J Chem Phys 3, 300 (1935).
    [29] C. H. Bennett, J Comput Phys 22, 245 (1976).
    [30] R. Zwanzig, J Chem Phys 22, 1420 (1954).
    [31] M. R. Shirts and J. D. Chodera, J Chem Phys 129, 124105 (2008).
    [32] S. Moghaddam, Y. Inoue, and M. K. Gilson, J Am Chem Soc 131, 4012 (2009).
    [33] S. Moghaddam, C. Yang, M. Rekharsky, Y. H. Ko, K. Kim, Y. Inoue, and M. K. Gilson, J Am Chem Soc 133, 3570 (2011).
    [34] Using the linear combination of pairwise overlap [59] algorithm, NAMD 2.9 calculates a negative surface area for CB[7]. NAMD directly uses Appendix B of Weiser et al. [59], in which the P1 parameter for the N sp3 atom type with 1 bonded neighbor is $7.8602 \times 10^{-2}$, which is smaller than other P1 values and the corresponding P2 parameter. By definition, P1 should be larger than P2. To bring this parameter in line with other P1 values and to make it larger than P2, this parameter was multiplied by 10. The modified code yields a positive surface area for CB[7].
    [35] J. C. Phillips, R. Braun, W. Wang, J. Gumbart, E. Tajkhorshid, E. Villa, C. Chipot, R. D. Skeel, L. Kale, and K. Schulten, J Comput Chem 26, 1781 (2005).
    [36] M. E. Davis, J. D. Madura, B. A. Luty, and J. McCammon, Comput Phys Commun 62, 187 (1991).
    [37] S. Duane, A. D. Kennedy, B. J. Pendleton, and D. Roweth, Phys Lett B 195, 216 (1987).
    [38] O. Guvench, J. Weiser, P. Shenkin, I. Kolossváry, and W. Still, J Comput Chem 23, 214 (2001).
    [39] D. L. Mobley and K. A. Dill, Structure/Folding and Design 17, 489 (2009).
    [40] I. R. Craig, J. W. Essex, and K. Spiegel, J Chem Inf Model 50, 511 (2010).
    [41] G. Bottegoni, W. Rocchia, M. Rueda, R. Abagyan, and A. Cavalli, PLoS ONE 6, e18845 (2011).
    [42] S. E. Nichols, R. Baron, A. Ivetac, and J. A. McCammon, J Chem Inf Model 51, 1439 (2011).
    [43] J. Lin, A. Perryman, J. Schames, and J. McCammon, J Am Chem Soc 124, 5632 (2002).
    [44] R. E. Amaro, R. Baron, and J. A. McCammon, J Comput Aided Mol Des 22, 693 (2008).
    [45] J. Schames, R. Henchman, J. Siegel, C. Sotriffer, H. Ni, and J. McCammon, J Med Chem 47, 1879 (2004).
    [46] R. E. Amaro, D. D. L. Minh, L. S. Cheng, W. M. Lindstrom, A. J. Olson, J.-H. Lin, W. W. Li, and J. A. McCammon, J Am Chem Soc 129, 7764 (2007).
    [47] J. D. Durrant and J. A. McCammon, BMC Biol 9, 71 (2011).
    [48] J. L. Paulsen and A. C. Anderson, J Chem Inf Model 49, 2813 (2009).
    [49] G. Morris, D. Goodsell, R. Halliday, R. Huey, W. Hart, R. Belew, and A. Olson, J Comput Chem 19, 1639 (1998).
    [50] I. D. Kuntz, J. M. Blaney, S. J. Oatley, R. Langridge, and T. E. Ferrin, J Mol Biol 161, 269 (1982).
    [51] D. Goodsell and A. Olson, Proteins 8, 195 (1990).
    [52] G. Morris, D. Goodsell, R. Huey, and A. Olson, J Comput Aided Mol Des 10, 293 (1996).
    [53] R. Huey, G. M. Morris, A. J. Olson, and D. S. Goodsell, J Comput Chem 28, 1145 (2007).
    [54] M. Liu and S. Wang, J Comput Aided Mol Des 13, 435 (1999).
    [55] R. Neal, Stat Comput 11, 125 (2001).
    [56] H. Carlson, Curr Opin Chem Biol 6, 447 (2002).
    [57] Y. Xu, J. P. Colletier, H. Jiang, I. Silman, J. L. Sussman, and M. Weik, Protein Sci 17, 601 (2008).
    [58] D. Bucher, B. J. Grant, and J. A. McCammon, Biochemistry-Us 50, 10530 (2011).
    [59] J. Weiser, P. S. Shenkin, and W. C. Still, J Comput Chem 20, 217 (1999).
    [60] J. A. Wagoner and V. S. Pande, J Chem Phys 134, 214103 (2011).
    [61] S. Wong, R. E. Amaro, and J. A. McCammon, J Chem Theory Comput 5, 422 (2009).

ordhttp://dx.doi.org/10.1002/jcc.20634http://dx.doi.org/10.1021/ci9003078http://dx.doi.org/10.1186/1741-7007-9-71http://dx.doi.org/10.1021/ja0723535http://dx.doi.org/10.1007/s10822-007-9159-2http://dx.doi.org/10.1021/ja0260162http://dx.doi.org/%2010.1021/ci200117nhttp://dx.doi.org/%2010.1371/journal.pone.0018845.t005http://dx.doi.org/10.1021/ci900407chttp://dx.doi.org/10.1016/j.str.2009.02.010%20papers://f7841d7d-9771-4011-86e7-7e86ad060f2b/Paper/p4877http://papers//f7841d7d-9771-4011-86e7-7e86ad060f2b/Paper/p8040http://dx.doi.org/10.1016/0010-4655(91)90094-2http://dx.doi.org/10.1002/jcc.20289http://pubs.acs.org/doi/abs/10.1021/ja109904u%20papers://f7841d7d-9771-4011-86e7-7e86ad060f2b/Paper/p8255http://pubs.acs.org/doi/abs/10.1021/ja808175m%20papers://f7841d7d-9771-4011-86e7-7e86ad060f2b/Paper/p4484http://papers//f7841d7d-9771-4011-86e7-7e86ad060f2b/Paper/p1443http://papers//f7841d7d-9771-4011-86e7-7e86ad060f2b/Paper/p5518http://apps.webofknowledge.com/InboundService.do?SID=2FoOLNp6PbDoOGkko7M&product=WOS&UT=000201218900011&SrcApp=CR&DestFail=http%253A%252F%252Fwww.webofknowledge.com&Init=Yes&action=retrieve&Func=Frame&customersID=mekentosj&SrcAuth=mekentosj&IsProductCode=Yes&mode=FullRecord%20papers://f7841d7d-9771-4011-86e7-7e86ad060f2b/Paper/p7507http://papers//f7841d7d-9771-4011-86e7-7e86ad060f2b/Paper/p6113http://dx.doi.org/10.1007/s10822-008-9182-y
  • 7/27/2019 1208.4885(1)

    12/22

    12

    SUPPLEMENTAL MATERIAL

    This supplemental material contains one theoretical section, two tables, and three figures. The theoretical section describes a hybrid implicit-explicit solvent model. One table is for components of the binding PMF and the other for average potential energies. The figures show the convergence of binding PMF and binding free energy estimates, as well as histograms of binding PMF estimates for different receptor snapshots.

    Hybrid Implicit-Explicit Solvent

    The desire to combine the speed of implicit solvent with the molecular detail and accuracy of explicit solvent has inspired interest in hybrid implicit-explicit solvent models (see Wagoner and Pande [60] and references therein). In the context of implicit ligand theory, a small number of explicit solvent molecules can be considered as a part of the receptor [61] during binding PMF calculations.

    A simple formalism for a hybrid implicit-explicit solvent model may be derived by separating the coordinates of $N$ solvent molecules $r_S$ into explicitly represented coordinates $r_E$ and implicitly represented coordinates $r_I$. Partition functions analogous to and formally equivalent to Eqs. (6) and (7) are then defined as,

    $$Z_{RL} = \int I_\xi \, e^{-\beta [U(r_{RL}, r_E) + W(r_{RL}, r_E)]} \, dr_{RL} \, dr_E \qquad (26)$$

    $$Z_R = \int e^{-\beta [U(r_R, r_E) + W(r_R, r_E)]} \, dr_R \, dr_E. \qquad (27)$$

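    Defining the effective interaction energy as $\Psi(r_{RL}, r_E) = \mathcal{U}(r_{RL}, r_E) - \mathcal{U}(r_R, r_E) - \mathcal{U}(r_L)$, where $\mathcal{U} = U + W$ is the effective potential energy that appears in the exponents of Eqs. (26) and (27), and the binding PMF as,

    $$B(r_R, r_E) = -\beta^{-1} \ln \frac{\int I_\xi \, e^{-\beta [\Psi(r_{RL}, r_E) + \mathcal{U}(r_L)]} \, dr_L \, d\xi_L}{\int I_\xi \, e^{-\beta \mathcal{U}(r_L)} \, dr_L \, d\xi_L},$$

    the binding free energy may be written as,

    $$\begin{aligned}
    \Delta G^\circ &= -\beta^{-1} \ln \left( \frac{Z_{RL}}{Z_R Z_L} \frac{C^\circ}{8\pi^2} \right) \\
    &= -\beta^{-1} \ln \left( \frac{\int e^{-\beta [B(r_R, r_E) + \mathcal{U}(r_R, r_E)]} \, dr_R \, dr_E}{\int e^{-\beta \mathcal{U}(r_R, r_E)} \, dr_R \, dr_E} \frac{\Omega C^\circ}{8\pi^2} \right) \\
    &= -\beta^{-1} \ln \left\langle e^{-\beta B(r_R, r_E)} \right\rangle_{R,E} + \Delta G_\xi, \qquad (28)
    \end{aligned}$$

    where the average $\langle \cdot \rangle_{R,E}$ is taken over receptor and explicit solvent configurations with weight $q_{R,E} = e^{-\beta \mathcal{U}(r_R, r_E)}$, $\Omega = \int I_\xi \, d\xi_L$ is the factor left over when the ligand integral in the denominator of $B$ is divided by $Z_L$, and $\Delta G_\xi = -\beta^{-1} \ln (\Omega C^\circ / 8\pi^2)$. The main text focuses on describing calculations in implicit solvent, with the understanding that explicit solvent may be readily included.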
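As a rough numerical illustration of the last line of Eq. (28), the sketch below evaluates the exponential average of a set of binding PMFs and compares it with the dominant state approximation, which keeps only the smallest binding PMF. The function names, the 300 K default temperature, the sample values, and the `delta_g_xi` argument (standing in for the constant term added in Eq. (28)) are assumptions made for illustration; they are not taken from the calculations reported here.

```python
import math

KB_KCAL_PER_MOL_K = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def exponential_average_free_energy(binding_pmfs, temperature=300.0, delta_g_xi=0.0):
    """Return -(1/beta) ln <exp(-beta B)> + delta_g_xi over sampled binding PMFs."""
    beta = 1.0 / (KB_KCAL_PER_MOL_K * temperature)
    exponents = [-beta * b for b in binding_pmfs]
    # Shift by the largest exponent so that exp() cannot overflow (log-sum-exp).
    shift = max(exponents)
    total = sum(math.exp(e - shift) for e in exponents)
    log_mean = shift + math.log(total / len(exponents))
    return -log_mean / beta + delta_g_xi


def dominant_state_free_energy(binding_pmfs, delta_g_xi=0.0):
    """Approximate the average by its largest term, i.e. the minimum binding PMF."""
    return min(binding_pmfs) + delta_g_xi


# Hypothetical binding PMFs (kcal/mol) for four receptor snapshots.
sample_pmfs = [-20.0, -18.5, -22.0, -15.0]
print(exponential_average_free_energy(sample_pmfs))
print(dominant_state_free_energy(sample_pmfs))
```

For any set of values, the exponential average lies between the minimum and the arithmetic mean of the binding PMFs, so the dominant state approximation always gives the more negative of the two estimates.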


    Ligand  VDW              Coul             PB               Val              NP               Total
    AD1     -32.5 (0.471)    0.1 (1.509)      4.8 (1.547)      -5.0 (2.785)     -2.5 (0.011)     -35.2 (2.547)
    AD2     -33.6 (0.931)    -65.8 (1.032)    64.9 (0.783)     -5.9 (1.910)     -2.5 (0.017)     -42.9 (1.693)
    AD3     -32.8 (0.718)    -64.4 (0.693)    62.2 (0.855)     -5.7 (2.128)     -2.6 (0.009)     -43.4 (2.388)
    AD4     -38.1 (1.400)    -125.2 (3.283)   124.4 (1.003)    1.9 (4.475)      -2.7 (0.070)     -39.9 (2.817)
    AD5     -33.3 (1.374)    -65.1 (1.549)    64.8 (1.415)     -4.9 (1.782)     -2.5 (0.025)     -40.9 (1.834)
    B02     -33.3 (0.622)    -5.8 (1.067)     9.7 (0.770)      1.4 (3.379)      -2.7 (0.022)     -30.6 (2.187)
    B05     -32.9 (0.896)    -138.2 (1.231)   138.0 (1.151)    -2.3 (1.236)     -2.8 (0.013)     -38.1 (1.673)
    B11     -39.9 (1.192)    -199.3 (2.280)   212.0 (1.066)    -5.6 (5.431)     -3.4 (0.075)     -36.2 (4.475)
    F01     -26.2 (0.497)    -8.2 (1.824)     14.2 (1.119)     8.7 (4.842)      -2.7 (0.017)     -14.3 (4.079)
    F02     -26.9 (1.518)    -65.7 (2.078)    65.9 (0.923)     -0.9 (2.237)     -3.0 (0.012)     -30.6 (2.253)
    F03     -28.7 (0.832)    -58.0 (0.987)    64.2 (0.649)     -0.4 (3.552)     -3.0 (0.015)     -26.1 (3.428)
    F06     -35.1 (1.154)    -116.1 (0.810)   120.9 (0.651)    -8.8 (4.862)     -3.5 (0.013)     -42.6 (4.132)

    TABLE III. Estimates of the mean potential energy changes (kcal/mol) upon the binding of various ligands to CB[7]. The columns refer to van der Waals (VDW), coulomb (Coul), electrostatic solvation (PB), valence (Val, bond + angle + dihedral), nonpolar solvation (NP), and total energies. The value in the parentheses is the standard deviation from bootstrapping: the observable is estimated based on 1000 random selections of 100 values of . In Table SII of the Supplemental Material, mean potential energies for the ligand, receptor, and complex are also shown.
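The bootstrap uncertainty described in the caption can be sketched as follows. Resampling with replacement, the use of the mean as the observable, the function name, and the synthetic input values are all assumptions; the caption only specifies 1000 random selections of 100 values.

```python
import random
import statistics


def bootstrap_std(values, estimator=statistics.mean, n_resamples=1000, seed=0):
    """Standard deviation of an estimator over bootstrap resamples of `values`."""
    rng = random.Random(seed)
    estimates = [
        # Assumed scheme: draw len(values) items with replacement per resample.
        estimator(rng.choices(values, k=len(values)))
        for _ in range(n_resamples)
    ]
    return statistics.stdev(estimates)


# Synthetic stand-in for 100 per-snapshot energies (kcal/mol); not data from the tables.
rng = random.Random(1)
synthetic_energies = [rng.gauss(-35.0, 20.0) for _ in range(100)]
print(f"{statistics.mean(synthetic_energies):.1f} ({bootstrap_std(synthetic_energies):.3f})")
```

The printed line mimics the "value (standard deviation)" layout of the table entries.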

    $B(r_R)$              $\min\{\Psi(r_R)\}$  $\min\{\Psi(r_R)\}$  HREX                 HREX
    $\Delta G^\circ$      $\min B(r_R)$        EXP                  $\min B(r_R)$        EXP
    Ligand
    AD1                   -28.6                -27.2                -22.0                -20.1
    AD2                   -36.4                -34.6                -27.6                -25.4
    AD3                   -38.1                -36.8                -27.6                -26.2
    AD4                   -43.1                -40.4                -29.8                -27.1
    AD5                   -35.8                -33.6                -26.8                -24.4
    B02                   -29.8                -27.9                -21.0                -18.1
    B05                   -37.9                -35.6                -23.7                -21.4
    B11                   -48.5                -45.7                -23.1                -20.5
    F01                   -22.7                -21.3                -10.2                -7.6
    F02                   -30.9                -28.8                -17.0                -14.6
    F03                   -28.7                -27.0                -14.5                -13.2
    F06                   -35.6                -33.8                -21.3                -19.7
    $R^2$ (ITC)           0.849                0.855                0.684                0.704
    RMSE (ITC)            17.3                 15.3                 5.8                  4.5
    $R^2$ (Gilson)        0.787                0.795                0.926                0.925
    RMSE (Gilson)         15.8                 13.9                 3.5                  2.4
    $R^2$ (Exp)           0.723                0.736                0.996
    RMSE (Exp)            15.5                 13.6                 2.3

    TABLE IV. Estimates of the binding free energy $\Delta G^\circ$ (kcal/mol) using the PBSA model. First, the binding PMF $B(r_R)$ is estimated with the dominant state approximation ($\min\{\Psi(r_R)\}$) or based on Eq. (23) (HREX). Then, $\Delta G^\circ$ is from the dominant state approximation ($\min B(r_R)$) or based on Eq. (22) (EXP). The bottom rows show the correlation coefficient ($R^2$) and root mean square error (RMSE, Eq. (25)) with respect to isothermal titration calorimetry experiments (ITC) and mining minima calculations (Gilson) from Moghaddam et al. [33], and with respect to the fourth column (Exp).
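To make the summary rows concrete, the sketch below recomputes a correlation and an RMSE between the third and fourth data columns of Table IV (HREX binding PMFs combined with the dominant state approximation and with Eq. (22), respectively). The plain root mean square difference is an assumption, since Eq. (25) is not reproduced in this section, and the sketch reports both the Pearson coefficient and its square because the caption does not spell out which one the $R^2$ rows contain. Values recomputed from the rounded table entries may differ from the tabulated statistics in the last digit.

```python
import math

# Table IV, third and fourth data columns (kcal/mol), ligands AD1 through F06.
HREX_MIN = [-22.0, -27.6, -27.6, -29.8, -26.8, -21.0,
            -23.7, -23.1, -10.2, -17.0, -14.5, -21.3]
HREX_EXP = [-20.1, -25.4, -26.2, -27.1, -24.4, -18.1,
            -21.4, -20.5, -7.6, -14.6, -13.2, -19.7]


def rmse(x, y):
    """Root mean square difference between paired values (assumed form of the RMSE)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def pearson_r(x, y):
    """Pearson correlation coefficient of paired values."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)


r = pearson_r(HREX_MIN, HREX_EXP)
print(f"r = {r:.3f}, r^2 = {r * r:.3f}, RMSE = {rmse(HREX_MIN, HREX_EXP):.2f} kcal/mol")
```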


    Average Potential Energy of Complexes

    Ligand  VDW              Coul             PB               Val              NP               Total
    AD1     -90.2 (0.417)    50.7 (1.415)     -131.6 (1.500)   328.2 (2.651)    5.8 (0.011)      162.8 (2.400)
    AD2     -91.9 (0.905)    29.7 (0.888)     -122.9 (0.687)   328.5 (1.710)    5.8 (0.017)      149.2 (1.464)
    AD3     -91.7 (0.684)    23.7 (0.452)     -124.3 (0.768)   333.8 (1.950)    5.8 (0.009)      147.4 (2.231)
    AD4     -98.1 (1.382)    79.7 (3.240)     -191.4 (0.930)   348.1 (4.394)    6.1 (0.070)      144.4 (2.685)
    AD5     -91.5 (1.357)    19.7 (1.457)     -123.1 (1.364)   331.6 (1.565)    5.8 (0.025)      142.5 (1.624)
    B02     -87.2 (0.582)    39.8 (0.929)     -131.2 (0.672)   340.6 (3.270)    5.8 (0.021)      167.8 (2.014)
    B05     -90.5 (0.869)    29.4 (1.113)     -173.0 (1.087)   340.4 (0.896)    5.8 (0.012)      112.1 (1.440)
    B11     -103.0 (1.171)   336.9 (2.218)    -404.0 (0.997)   379.8 (5.364)    7.3 (0.075)      217.0 (4.393)
    F01     -88.1 (0.446)    84.0 (1.747)     -127.1 (1.054)   515.7 (4.767)    5.8 (0.016)      390.4 (3.989)
    F02     -89.4 (1.502)    56.6 (2.010)     -119.3 (0.843)   514.2 (2.068)    5.9 (0.012)      368.1 (2.086)
    F03     -90.1 (0.802)    55.2 (0.835)     -117.4 (0.529)   518.1 (3.448)    5.9 (0.015)      371.7 (3.320)
    F06     -97.3 (1.133)    63.0 (0.616)     -153.2 (0.532)   525.4 (4.787)    6.1 (0.012)      343.9 (4.044)

    Average Potential Energy of Ligands

    Ligand  VDW              Coul             PB               Val              NP               Total
    AD1     -1.8 (0.0063)    -8.5 (0.0012)    -4.6 (0.0014)    52.6 (0.0232)    2.0 (0.0001)     39.7 (0.0236)
    AD2     -2.5 (0.0049)    36.3 (0.0036)    -55.9 (0.0028)   53.8 (0.0246)    2.0 (0.0001)     33.7 (0.0248)
    AD3     -3.0 (0.0058)    29.0 (0.0028)    -54.6 (0.0028)   58.8 (0.0259)    2.1 (0.0001)     32.4 (0.0264)
    AD4     -4.1 (0.0056)    145.8 (0.0124)   -183.9 (0.0089)  65.6 (0.0291)    2.5 (0.0001)     25.9 (0.0270)
    AD5     -2.4 (0.0052)    25.6 (0.0032)    -56.1 (0.0030)   55.9 (0.0245)    2.0 (0.0001)     25.0 (0.0248)
    B02     1.9 (0.0103)     -13.6 (0.0032)   -9.1 (0.0025)    58.7 (0.0257)    2.2 (0.0001)     40.1 (0.0255)
    B05     -1.7 (0.0069)    108.3 (0.0081)   -179.2 (0.0055)  62.1 (0.0270)    2.3 (0.0001)     -8.2 (0.0273)
    B11     -7.2 (0.0116)    477.0 (0.0132)   -484.1 (0.0106)  104.8 (0.0268)   4.4 (0.0002)     94.8 (0.0232)
    F01     -6.0 (0.0036)    33.1 (0.0059)    -9.5 (0.0036)    226.5 (0.0251)   2.2 (0.0001)     246.4 (0.0258)
    F02     -6.5 (0.0055)    63.1 (0.0069)    -53.4 (0.0037)   234.6 (0.0268)   2.6 (0.0002)     240.4 (0.0270)
    F03     -5.5 (0.0078)    54.1 (0.0069)    -49.8 (0.0035)   238.0 (0.0286)   2.6 (0.0001)     239.4 (0.0284)
    F06     -6.4 (0.0101)    120.0 (0.0109)   -142.3 (0.0064)  253.6 (0.0341)   3.3 (0.0002)     228.1 (0.0339)

    Average Potential Energy of the Receptor

    VDW              Coul             PB               Val              NP               Total
    -55.9 (0.2198)   59.2 (0.5253)    -131.8 (0.3756)  280.6 (0.8513)   6.3 (0.0039)     158.4 (0.8516)

    Average Potential Energy Changes

    Ligand  VDW              Coul             PB               Val              NP               Total
    AD1     -32.5 (0.471)    0.1 (1.509)      4.8 (1.547)      -5.0 (2.785)     -2.5 (0.011)     -35.2 (2.547)
    AD2     -33.6 (0.931)    -65.8 (1.032)    64.9 (0.783)     -5.9 (1.910)     -2.5 (0.017)     -42.9 (1.693)
    AD3     -32.8 (0.718)    -64.4 (0.693)    62.2 (0.855)     -5.7 (2.128)     -2.6 (0.009)     -43.4 (2.388)
    AD4     -38.1 (1.400)    -125.2 (3.283)   124.4 (1.003)    1.9 (4.475)      -2.7 (0.070)     -39.9 (2.817)
    AD5     -33.3 (1.374)    -65.1 (1.549)    64.8 (1.415)     -4.9 (1.782)     -2.5 (0.025)     -40.9 (1.834)
    B02     -33.3 (0.622)    -5.8 (1.067)     9.7 (0.770)      1.4 (3.379)      -2.7 (0.022)     -30.6 (2.187)
    B05     -32.9 (0.896)    -138.2 (1.231)   138.0 (1.151)    -2.3 (1.236)     -2.8 (0.013)     -38.1 (1.673)
    B11     -39.9 (1.192)    -199.3 (2.280)   212.0 (1.066)    -5.6 (5.431)     -3.4 (0.075)     -36.2 (4.475)
    F01     -26.2 (0.497)    -8.2 (1.824)     14.2 (1.119)     8.7 (4.842)      -2.7 (0.017)     -14.3 (4.079)
    F02     -26.9 (1.518)    -65.7 (2.078)    65.9 (0.923)     -0.9 (2.237)     -3.0 (0.012)     -30.6 (2.253)
    F03     -28.7 (0.832)    -58.0 (0.987)    64.2 (0.649)     -0.4 (3.552)     -3.0 (0.015)     -26.1 (3.428)
    F06     -35.1 (1.154)    -116.1 (0.810)   120.9 (0.651)    -8.8 (4.862)     -3.5 (0.013)     -42.6 (4.132)

    TABLE SII. Estimates of the mean potential energy (kcal/mol) of the ligand, receptor, and complex for different CB[7] ligands. The columns refer to van der Waals (VDW), coulomb (Coul), electrostatic solvation (PB), valence (Val, bond + angle + dihedral), nonpolar solvation (NP), and total energies. The value in the parentheses is the standard deviation from bootstrapping: the observable is estimated based on 1000 random selections of 100 values of .
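The tabulated totals appear to be related by subtraction: the change upon binding matches the complex energy minus the ligand and receptor energies to within rounding. The short check below uses the Total columns of Table SII; the function and variable names are illustrative, and the subtraction rule is inferred from the numbers rather than stated in the caption.

```python
RECEPTOR_TOTAL = 158.4  # kcal/mol, Total column of the receptor sub-table

# ligand: (complex total, ligand total, reported change), all in kcal/mol
TOTALS = {
    "AD1": (162.8, 39.7, -35.2), "AD2": (149.2, 33.7, -42.9),
    "AD3": (147.4, 32.4, -43.4), "AD4": (144.4, 25.9, -39.9),
    "AD5": (142.5, 25.0, -40.9), "B02": (167.8, 40.1, -30.6),
    "B05": (112.1, -8.2, -38.1), "B11": (217.0, 94.8, -36.2),
    "F01": (390.4, 246.4, -14.3), "F02": (368.1, 240.4, -30.6),
    "F03": (371.7, 239.4, -26.1), "F06": (343.9, 228.1, -42.6),
}


def binding_change(complex_total, ligand_total, receptor_total=RECEPTOR_TOTAL):
    """Mean energy change upon binding: complex minus separated ligand and receptor."""
    return complex_total - ligand_total - receptor_total


# Largest disagreement between the recomputed and reported changes.
max_gap = max(
    abs(binding_change(cpl, lig) - reported) for cpl, lig, reported in TOTALS.values()
)
print(f"largest discrepancy: {max_gap:.2f} kcal/mol")
```

Discrepancies of about 0.1 kcal/mol are expected because each table entry is rounded to one decimal place.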

    [Figure S1(a): convergence plots for ligands AD1, AD2, AD3, and AD4. Each ligand has panels for $B(r_R)$, $B_{cpl}$, $B_{RL}$, $B_L$, and $\min\{\Psi(r_{RL})\}$; the horizontal axis is Total Simulation Time (ns), from 0 to 2.]

    FIG. S1. (a) The mean and standard deviation of 15 independent estimates of $B(r_R)$, $B_{cpl}$, $B_{RL}$, $B_L$, and $\min\{\Psi(r_{RL})\}$ (kcal/mol) based on PBSA energies as a function of total MD simulation time.

    [Figure S1(b): convergence plots for ligands AD5, B02, B05, and B11. Each ligand has panels for $B(r_R)$, $B_{cpl}$, $B_{RL}$, $B_L$, and $\min\{\Psi(r_{RL})\}$; the horizontal axis is Total Simulation Time (ns), from 0 to 2.]

    FIG. S1. (b) The mean and standard deviation of 15 independent estimates of $B(r_R)$, $B_{cpl}$, $B_{RL}$, $B_L$, and $\min\{\Psi(r_{RL})\}$ (kcal/mol) based on PBSA energies as a function of total MD simulation time.

    [Figure S1(c): convergence plots for ligands F01, F02, F03, and F06. Each ligand has panels for $B(r_R)$, $B_{cpl}$, $B_{RL}$, $B_L$, and $\min\{\Psi(r_{RL})\}$; the horizontal axis is Total Simulation Time (ns), from 0 to 2.]

    FIG. S1. (c) The mean and standard deviation of 15 independent estimates of $B(r_R)$, $B_{cpl}$, $B_{RL}$, $B_L$, and $\min\{\Psi(r_{RL})\}$ (kcal/mol) based on PBSA energies as a function of total MD simulation time.

    [Figure S2: one histogram panel per ligand (AD1, AD2, AD3, AD4, AD5, B02, B05, B11, F01, F02, F03, F06).]

    FIG. S2. Histogram of binding PMF estimates $B(r_R)$ (kcal/mol) of various ligands to 100 snapshots of CB[7], using PBSA energies. The vertical line shows the mean binding PMF for the minimized receptor structure.

    [Figure S3: two panels per ligand (AD1, AD2, AD3, AD4, AD5, B02, B05, B11, F01, F02, F03, F06); the horizontal axis is Number of Receptor Snapshots, from 0 to 20 in the left panels and from 20 to 100 in the right panels.]

    FIG. S3. Estimates of the binding free energy $\Delta G^\circ$ of various ligands to CB[7] (kcal/mol), using PBSA energies, as a function of the number of receptor snapshots. The line and error bars denote the mean and standard deviation from bootstrapping: the binding free energy is estimated 100 times using random selections of $N$ out of 100 binding PMFs.
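The convergence analysis in this caption can be sketched as below. Drawing the $N$ binding PMFs without replacement, the 300 K temperature, the omission of any constant offset in the free energy, the function names, and the synthetic binding PMFs are assumptions made for illustration only.

```python
import math
import random
import statistics

KB = 0.0019872041    # Boltzmann constant, kcal/(mol K)
TEMPERATURE = 300.0  # assumed temperature in K


def exp_average(pmfs):
    """-(1/beta) ln <exp(-beta B)>; any constant offset is left out."""
    beta = 1.0 / (KB * TEMPERATURE)
    exponents = [-beta * b for b in pmfs]
    shift = max(exponents)  # log-sum-exp shift for numerical stability
    total = sum(math.exp(e - shift) for e in exponents)
    return -(shift + math.log(total / len(pmfs))) / beta


def convergence_curve(pmfs, sizes, n_repeats=100, seed=0):
    """Mean and standard deviation of the estimate for each subset size N."""
    rng = random.Random(seed)
    curve = {}
    for n in sizes:
        # Assumed scheme: N distinct binding PMFs are drawn for every repeat.
        estimates = [exp_average(rng.sample(pmfs, n)) for _ in range(n_repeats)]
        curve[n] = (statistics.mean(estimates), statistics.pstdev(estimates))
    return curve


# Synthetic binding PMFs (kcal/mol) standing in for 100 receptor snapshots.
rng = random.Random(1)
synthetic_pmfs = [rng.gauss(-20.0, 2.0) for _ in range(100)]
for n, (mean, std) in convergence_curve(synthetic_pmfs, [5, 20, 50, 100]).items():
    print(f"N={n:3d}  mean={mean:7.2f}  std={std:5.2f}")
```

With sampling without replacement the spread vanishes at $N = 100$, since every repeat then uses the full set of binding PMFs; small subsets tend to miss the lowest values and give less negative estimates.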
