
Langevin Dipoles Model for ab Initio Calculations of Chemical Processes in Solution: Parametrization and Application to Hydration Free Energies of Neutral and Ionic Solutes and Conformational Analysis in Aqueous Solution

Jan Florián and Arieh Warshel*

Department of Chemistry, University of Southern California, Los Angeles, California 90089-1062

Received: February 10, 1997; In Final Form: May 2, 1997 X

A new parametrization of the Langevin dipole (LD) model is developed for ab initio calculations of chemical processes in aqueous solution. This parametrization is implemented in both the iterative (ILD) and noniterative (NLD) versions of the LD model. The training set for the new parametrization encompassed solvation free energies of 44 neutral and 39 ionic solutes that contained the C, O, N, P, S, F, and Cl atoms. The performance of the model is assessed by examining its ability to represent the overall training set, pKa differences of structurally related compounds, and conformation-related changes in the solvation energy of 1,2-ethanediol (glycol). The effects of solute polarization and electron correlation are also discussed. The overall performance of the model is found to be comparable to or slightly better than that of the PCM continuum model of Tomasi and co-workers. However, the simplified explicit representation of solvent molecules in the LD model may allow one to gain a somewhat clearer insight into the molecular origin of different solvent effects than that obtained by continuum models. The present version of the model can be used as a stand-alone program with the standard output from the Gaussian or related programs.

X Abstract published in Advance ACS Abstracts, June 15, 1997.

J. Phys. Chem. B 1997, 101, 5583-5595

S1089-5647(97)00507-5 CCC: $14.00 © 1997 American Chemical Society

1. Introduction

Quantum mechanical studies of chemical processes in solutions and proteins should take into account the effect of the environment around the reacting fragments (ref 1). A number of computational approaches were developed to accomplish this task, ranging from continuum to all-atom models (for review see, for example, ref 2). One of these methods is the Langevin dipoles (LD) model (refs 3-6), which was introduced soon after the emergence of the early combined quantum mechanical/continuum dielectric approaches (refs 7-10). In fact, the LD model probably represents the first attempt to obtain quantitative estimates of solvation energies of molecules in solutions and enzymes. The term “quantitative” needs to be stressed here, because continuum methods available at that time involved the use of a spherical solute-solvent boundary. As the uncertain radius of this sphere was explicitly included in solvation free energy calculations, the predictive capabilities of early continuum methods were quite limited. In contrast, the LD model used transferable atomic parameters (van der Waals radii) calibrated using observed solvation energies. This approach has provided an effective way of treating solutes of arbitrary shape without assuming any idealized cavity boundary. The same type of atom-parametrized boundary was later applied in new generations of continuum dielectric models (refs 11-17). In addition, the seemingly simple idea of using van der Waals radii as adjustable parameters has been extensively used in subsequent all-atom solvent models (for review see ref 18) and represents one of the main reasons for their reliability.

The development of the LD model was motivated by the realization that dipolar models can capture the main physics of polar solvents and that reproducing the average polarization of the solvent should suffice for a reasonable evaluation of solvation effects. This assumption was later confirmed by the success of modern continuum models. The LD model and its protein dipoles Langevin dipoles (PDLD) version have been applied extensively to calculations of chemical processes in solution (see for example refs 19-24). The close relationship between the LD model and more rigorous microscopic models has been demonstrated (refs 6, 25). In addition, the formal relations between the continuum models, explicit dipoles models, and LD models have been established (refs 26, 27). All these studies have shown that dipolar models provide very effective ways for evaluating solvation energies. In addition, they showed that many different models can perform reasonably well and that what really counts is the computational efficiency and the consistency of their parametrization. Naturally, a general reliability regardless of the exact details of the model is characteristic also of continuum models provided that they use atom-based parametrization. Apparently this was not understood by some scientists who criticized the minor details of the LD implementation while having no difficulties in accepting the oversimplifications of continuum models (refs 28, 29). The explicit dipoles present in the LD model allow one to form a direct bridge to molecular solvent models and to address problems that are difficult to formulate in a unique way in the framework of continuum treatments (e.g. the treatment of induced dipoles in excited state calculations; ref 30).

As far as quantum chemical calculations are concerned, the LD model and its protein dipoles Langevin dipoles (PDLD) version can be considered the earliest hybrid quantum mechanical/molecular mechanical (QM/MM) solvation models (ref 3). This early QM/LD model consistently considered the coupling between the solute semiempirical QCFF/ALL or MINDO/2 Hamiltonian and the solvent. A subsequent study parametrized the LD model with the semiempirical MNDO approach (ref 30). Unfortunately, current semiempirical approaches do not always provide correct dipole moments. Consequently, it is difficult to obtain consistent solvation energies from semiempirical quantum mechanical models without using unrealistic van der Waals radii. Therefore we continued to use empirically adjusted charges in LD and PDLD calculations (ref 31), and this is in fact the approach used in refining the LD parameters in the POLARIS program (ref 20) and in related parametrizations of continuum models (e.g. ref 32). However, recent progress in methodology and computer power made it feasible to obtain reliable charges from ab initio calculations. Such atomic charges may enable one to use more physically justified van der Waals radii in QM/MM models and to assess accuracy limits of simplified solvent models. An attempt to implement the LD model in ab initio calculations was reported recently by Malcolm and McDouall (ref 33). This work used Mulliken atomic charges and solute cavities represented by Bondi’s atomic radii (ref 34) plus a value of 0.75 and 0.54 Å for neutral and ionic solutes, respectively. The increased computational demands associated with the use of ab initio techniques were compensated by neglecting grid-averaging procedures and interactions among solvent dipoles. Despite the resulting grid-related uncertainty, their calculations correctly reproduced the solvent-induced change of the reaction profile for the $\mathrm{CH_3Cl + Cl^-}$ $\mathrm{S_N2}$ reaction. Nevertheless, for more quantitative studies of biochemical processes, a new parametrization of the LD solvent model for solutes described by ab initio quantum mechanical methods is needed.

In this paper we develop and parametrize an LD model for ab initio calculations of solvation free energies and thoroughly analyze the performance of this model. Both the noniterative and iterative versions of the LD model are considered. For comparison purposes, solvation free energies are evaluated by using the polarized continuum model (PCM) of Miertus, Scrocco, and Tomasi (refs 11, 12) implemented in the Gaussian 94 program (ref 35). For the LD calculations we use atomic charges fitted to the ab initio molecular electrostatic potential (MEP) of the gas phase solute. Because the charges are not considered as adjustable parameters, the correlation between them and the van der Waals parameters is largely removed, which enables a consistent evaluation of solvation properties for different solute structures. Since we are primarily concerned with establishing a simple way of using the LD model with standard ab initio programs, we evaluate solute polarization by using ab initio MEP charges of hydrated solutes that were obtained during the foregoing PCM calculations. Such a use of the PCM model is reasonable because the increase in solvation free energy due to the changes in solute charge distribution upon solvation represents in many cases only a second-order effect (ref 36). This approach merely reflects a pragmatic decision based on the interest in having a convenient and easily portable model.

The paper is arranged into three parts. First, we describe the main features of the new version of the LD model and its parametrization for aqueous solutions. Next, we present solvation free energies of a representative set of 44 neutral and 39 ionic molecules calculated by the LD and PCM methods. In addition, the corresponding experimental solvation energies of ionic solutes are carefully reevaluated. Finally, the conformational dependence of the solvation free energy of 1,2-ethanediol (glycol) is examined.

2. Ab Initio Methods

Solvation free energies were calculated by using the Langevin dipoles (LD) method (see below), and the solute charge distribution was approximated by atom-centered point charges. Although the integration of the HF wave function is a more accurate and straightforward approach for generating the electric field, which determines the orientation of Langevin dipoles, we abandoned this approach after an initial testing period. During this period it was found that the changes in the solvation energy due to the use of wave-function-generated fields are negligible compared to the differences between the calculated and experimental solvation energies. In addition, the more accurate solute charge distribution did not always result in improved agreement with experiment. Obviously, the errors in solvation energy associated with replacing solvent molecules by point dipoles are comparable to or larger than the errors originating from the multipole expansion of the solute charge distribution. Thus, no major improvements are expected from having an exact electric field at the positions of these dipoles. In addition, the use of the exact solute charge distribution results in a significant increase of the CPU time. Finally, the use of atomic charges simplifies the interpretation of the calculated solvation energies in terms of group contributions and also the separation of the ab initio and solvation calculations without neglecting the solute polarization term (see section 3.6). The latter feature is especially useful for the development of a new parametrization of our solvation model.

Atomic charges were evaluated by fitting to the molecular electrostatic potential (MEP) obtained from the HF/6-31G* and MP2/6-31+G** electron densities and HF/6-31G* molecular geometries using the Gaussian 94 program (ref 35). For the MEP calculations, the default Merz-Kollman atomic radii and the grid construction procedure were used.

Gas phase proton affinities and charge distributions needed for the evaluation of electron correlation effects were calculated at the MP2/6-31+G**//HF/6-31G* level. Zero-point energies (ZPE) were determined from HF/6-31G* harmonic vibrational frequencies scaled by 0.9.

For comparison and as a source of atomic charges of hydrated solutes we used the polarized continuum method (PCM) (ref 11) implemented in the Gaussian 94 program (ref 35). HF/6-31G* wave functions and default Pauling’s (Merz-Kollman’s) van der Waals radii scaled by 1.2 were used for the PCM calculations.

3. Description and Parametrization of the Langevin Dipole Model

The Langevin dipole (LD) solvation model is based on the evaluation of interactions between the electrostatic field of the solute and point dipoles placed on a grid surrounding the solute. This grid of dipoles is surrounded by a dielectric continuum. The solute electrostatic field is generated from the point charges placed at the atomic nuclei. The total electrostatic field at a given dipole and the magnitude of this dipole are related via the Langevin function (see below) that exhibits saturation for large fields. The resulting electrostatic part of the solvation energy ($\Delta G_{\mathrm{ES}}$) is augmented by the solvation contribution from the outer continuum dielectric ($\Delta G_{\mathrm{bulk}}$), van der Waals ($\Delta G_{\mathrm{vdW}}$), hydrophobic ($\Delta G_{\mathrm{phob}}$), and solute-polarization ($\Delta G_{\mathrm{relax}}$) terms to give the solvation free energy, $\Delta G_{\mathrm{solv}}$:

$$\Delta G_{\mathrm{solv}} = \Delta G_{\mathrm{ES}} + \Delta G_{\mathrm{bulk}} + \Delta G_{\mathrm{vdW}} + \Delta G_{\mathrm{phob}} + \Delta G_{\mathrm{relax}} \tag{1}$$

In the current implementation, atomic charges are determined by fitting to the ab initio electrostatic potential. Because the atomic charges are not empirically adjustable (in contrast to the previous versions of the model), the other parameters of the LD solvation model also had to be reoptimized. These changes include the introduction of field-dependent grid points and surface-constrained dipoles on the outer solvent surface, a new relation between the magnitudes of dipoles placed at the inner and outer grid points, reparametrized vdW and hydrophobic terms, an increased extent of dipole-dipole interactions, and a newly added solute-polarization term. The details of this new parametrization for aqueous solution are presented below.

3.1. Grid Construction. The LD model represents solvent molecules by a grid of point dipoles. Our previous experience has shown that the exact nature of this lattice does not affect the calculated solvation energy in a major way. Thus, the computational efficiency rather than the reproduction of the actual solvent structure is the decisive factor in choosing the best grid. In the present model, point dipoles are centered at the points of a 3 Å simple cubic grid that is transformed into a denser 1 Å cubic grid near the van der Waals (vdW) surface of the solute (defined by the vdW radii of Table 1). The boundary between the inner (1 Å) and outer (3 Å) grids is formed by points that lie at a distance of 2 Å from the vdW surface of the solute. The outer grid points are constructed up to a distance of 20 Å from the vdW surface of the solute, but those outer grid points that are located in regions of very small field are subsequently discarded. This selection is done by using the screened electric field, $\vec{\xi}_j^{\,D}$, defined by eq 2 with the screening function $d(r_{ij})$ of eq 3, where the symbols $Q_i$ and $r_{ij} = |\vec{r}_{ij}| = |\vec{r}_j - \vec{r}_i|$ denote atomic charges and the distance between the ith atom and the jth grid point, respectively. The screened electrostatic field defined by eqs 2 and 3 was also used in the noniterative LD calculations (see below). The criterion $|\vec{\xi}_j^{\,D}| < 0.0015$ e/Å² was used for the selection procedure described above. In addition, Langevin dipoles at grid points with a magnitude of the electrostatic field in the $0.0015 < |\vec{\xi}_j^{\,D}| < 0.0021$ e/Å² range were constrained at the noniterative values. These dipoles were kept constant during the iterative Langevin dipole calculation. Using this field-dependent selection of grid points, there are typically about 1400 grid points for the ions treated in this paper, whereas for neutral solutes only several hundred grid points are used. This procedure improves the computational efficiency of the LD model, since Langevin dipoles are placed only in locations where they can really contribute to the electrostatic part of the solvation free energy.
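The field-dependent selection of grid points can be sketched as follows. This is a minimal illustration rather than the authors' code: the function names, the tuple layout of the atoms, the labels returned, and the treatment of values lying exactly on a threshold are assumptions. The screening function is passed in as an argument, and its default assumes that eq 3 reads d(r) = sqrt(2 + r)/1.7 with r in Å.

```python
import math


def screening(r):
    """Distance-dependent screening d(r); assumed form of eq 3 (r in Å)."""
    return math.sqrt(2.0 + r) / 1.7


def screened_field(point, atoms, d=screening):
    """Screened field of eq 2 at one grid point, in e/Å^2.

    atoms is a list of (x, y, z, charge) tuples (hypothetical layout).
    """
    fx = fy = fz = 0.0
    for ax, ay, az, q in atoms:
        # r_ij points from atom i to grid point j
        dx, dy, dz = point[0] - ax, point[1] - ay, point[2] - az
        r = math.sqrt(dx * dx + dy * dy + dz * dz)
        scale = q / (d(r) * r ** 3)
        fx += scale * dx
        fy += scale * dy
        fz += scale * dz
    return fx, fy, fz


def classify_grid_point(point, atoms, discard_below=0.0015, constrain_below=0.0021):
    """Label an outer grid point by the magnitude of its screened field."""
    magnitude = math.sqrt(sum(c * c for c in screened_field(point, atoms)))
    if magnitude < discard_below:
        return "discarded"      # field too small, no dipole is placed
    if magnitude < constrain_below:
        return "constrained"    # dipole frozen at its noniterative value
    return "iterated"           # dipole updated in the iterative cycle


# A single unit charge at the origin, probed at three distances along x.
ion = [(0.0, 0.0, 0.0, 1.0)]
for distance in (3.0, 15.0, 20.0):
    print(distance, classify_grid_point((distance, 0.0, 0.0), ion))
```

Passing the screening function as a parameter keeps the sketch usable if a different functional form of d(r) is preferred.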

Since the values of $\Delta G_{\mathrm{solv}}$ depend upon the position of the center of the grid, it is essential to carry out LD calculations for several different grids to obtain a stable mean value of $\Delta G_{\mathrm{solv}}$. In this study, $\Delta G_{\mathrm{solv}}$ was averaged over 22 grids. This procedure decreased the grid-related uncertainty of our results to below 1.5%.

3.2. Electrostatic Contribution to Solvation Free Energies.

For a given grid, the magnitudes and orientations of Langevin dipoles are calculated in two ways. In the iterative calculation, further denoted as ILD, Langevin dipoles are allowed to interact with each other. The jth dipole, $\vec{\mu}_j$, becomes polarized (i.e. changes its size and orientation) along the vector of the total electrostatic field, $\vec{\xi}_j$, evaluated as a sum of the unscreened contributions from solute charges $Q_i$ and from Langevin dipoles determined in the preceding, (n-1)th, iteration (eq 4), where $\vec{\xi}_j^{\,0}$ is the field of the solute in vacuum (eq 5). In the noniterative Langevin dipole calculation (NLD), the field that determines the magnitude of the Langevin dipoles is given by eqs 2 and 3.

The extent of dipole polarization is given by the Langevin function, $L(x)$, which exhibits linear behavior for smaller fields and reaches saturation for larger fields (Figure 1). The dipole $\vec{\mu}_j$, the function $L(x)$, and its argument $x$ are defined by eqs 6-8, where the field $\vec{\xi}_j^{\,L}$ is taken as $\vec{\xi}_j$ (eq 4) in the ILD approach and as $\vec{\xi}_j^{\,D}$ (eq 2) in the NLD approach. The vector $\vec{\mu}_0$, with a magnitude of 0.26 and 0.05 e Å for outer and inner grid dipoles, respectively, points in the direction of $\vec{\xi}_j^{\,L}$. Furthermore, the symbols $k$ and $T$ in eq 8 denote the Boltzmann constant and the thermodynamic temperature, respectively. The magnitudes of $\mu_0$ for inner and outer grid centered dipoles are not chosen arbitrarily, but are related by eq 9, where $\vartheta_{\mathrm{in}} = 1$ Å³ and $\vartheta_{\mathrm{out}} = 27$ Å³ denote the volumes of the inner and outer grid cells, respectively. This relationship can be derived from the requirement that the macroscopic dielectric constant ($\epsilon$) of the inner and outer grid dipoles is the same, assuming the Clausius-Mossotti relationship between $\epsilon$ and $\mu$.
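A short numerical sketch may help make the Langevin function and the relation between the inner and outer dipole magnitudes concrete. The function names are illustrative, and the series expansion used near x = 0 is an implementation choice made here to avoid the cancellation between coth(x) and 1/x; neither is taken from the paper.

```python
import math


def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x (eq 7)."""
    if abs(x) < 1e-4:
        # Series expansion near zero: L(x) = x/3 - x^3/45 + ...
        return x / 3.0 - x ** 3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x


def inner_dipole(mu_outer, v_inner=1.0, v_outer=27.0):
    """Inner-grid dipole magnitude implied by eq 9:
    mu_in^2 / v_in = mu_out^2 / v_out."""
    return mu_outer * math.sqrt(v_inner / v_outer)


# Linear regime, intermediate regime, and near saturation.
for x in (0.01, 1.0, 50.0):
    print(x, langevin(x))

# Outer dipole of 0.26 e Å with cell volumes of 1 and 27 Å^3.
print(inner_dipole(0.26))
```

With the outer value of 0.26 e Å, the relation returns a value that rounds to the 0.05 e Å quoted for the inner grid dipoles.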

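The electrostatic part of the solvation energy, $\Delta G_{\mathrm{ES}}$, is evaluated as the energy of the Langevin dipoles $\vec{\mu}_j$ in the electrostatic field generated by the solute atoms (eq 10 for the ILD model and eq 11 for the NLD model). The parameter $k_{\mathrm{ES}}$ implicitly includes the energy needed to polarize the solvent molecules. It should be 1/2 within the linear response approximation (refs 6, 37, 38), but recent simulations utilizing all-atom solvent models indicated that its magnitude can vary for different solutes (ref 39). In our LD implementation, $k_{\mathrm{ES}}$ is assumed to be solute-independent and amounts to 0.52 and 0.47 for the ILD and NLD models, respectively.

Figure 1. Plot of the Langevin function (eq 8).

$$\vec{\xi}_j^{\,D} = \sum_i \frac{Q_i\, \vec{r}_{ij}}{d(r_{ij})\, r_{ij}^3}, \quad i \in \text{solute},\ j \in \text{grid points} \tag{2}$$

$$d(r_{ij}) = \frac{\sqrt{2 + r_{ij}}}{1.7} \tag{3}$$

$$\vec{\xi}_j = \vec{\xi}_j^{\,0} + \sum_{k \neq j} \frac{r_{jk}^2\, \vec{\mu}_k^{\,(n-1)} - (\vec{r}_{jk} \cdot \vec{\mu}_k^{\,(n-1)})\, \vec{r}_{jk}}{r_{jk}^5} \tag{4}$$

$$\vec{\xi}_j^{\,0} = \sum_i \frac{Q_i\, \vec{r}_{ij}}{r_{ij}^3} \tag{5}$$

$$\vec{\mu}_j = \vec{\mu}_0\, L(x) \tag{6}$$

$$L(x) = \coth(x) - \frac{1}{x} \tag{7}$$

$$x = \frac{\mu_0\, |\vec{\xi}_j^{\,L}|}{kT} \tag{8}$$

$$\frac{(\mu_{0,\mathrm{in}})^2}{\vartheta_{\mathrm{in}}} = \frac{(\mu_{0,\mathrm{out}})^2}{\vartheta_{\mathrm{out}}} \tag{9}$$

$$\Delta G_{\mathrm{ES}}^{\mathrm{ILD}} = 332\, k_{\mathrm{ES}}^{\mathrm{ILD}} \sum_j^N \left( \vec{\mu}_j^{\,\mathrm{ILD}} \cdot \vec{\xi}_j^{\,0} \right) \tag{10}$$

$$\Delta G_{\mathrm{ES}}^{\mathrm{NLD}} = 332\, k_{\mathrm{ES}}^{\mathrm{NLD}} \sum_j^N \left( \vec{\mu}_j^{\,\mathrm{NLD}} \cdot \vec{\xi}_j^{\,D} \right) \tag{11}$$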

3.3. Calculation of Dipole-Dipole Interactions. As mentioned above, the vector of the local electrostatic field, $\vec{\xi}_j$, at the jth Langevin dipole is calculated iteratively in the ILD model. The iterative cycle is terminated when oscillations in $\Delta G_{\mathrm{ES}}$ fall below 0.1% for 10 successive steps. Usually, about 60 iterations are needed before this criterion is met. The converged $\Delta G_{\mathrm{ES}}$ is independent of the starting configuration of the Langevin dipoles. However, for faster convergence, this configuration is generated so that part of the outer grid dipoles are oriented randomly, whereas all inner grid dipoles and the remaining outer grid dipoles are constructed using a screening function as in the NLD method. Subsequently, the jth Langevin dipole at the nth iteration is calculated according to eq 12, where the damping factor $\omega$ is gradually decreased from 0.9 to 0.5 as n increases from 1 to 80. The value of $x^{(n-1)}$ is calculated according to eq 8, using the local electrostatic field, $\vec{\xi}_j$, evaluated from the values of $\vec{\mu}^{\,(n-1)}$ by using eq 4. In the latter formula, a cutoff distance of 20 Å is used in the first through ninth iterations and in the 45th iteration. In the other iterative steps, the cutoff distance is decreased to 6 Å and augmented by the local reaction field approximation (ref 40). In addition, interactions between dipoles separated by less than 2.5 Å are excluded.
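The damped update and the stopping rule described above can be sketched as below. Several details are assumptions made only for illustration: the text says that the damping factor is decreased gradually but does not give the schedule, so a linear ramp is used; the update is written for a scalar dipole magnitude, whereas eq 12 acts on vectors; and the convergence test interprets "oscillations below 0.1%" as the relative change between successive energies. All function names are hypothetical.

```python
import math


def langevin(x):
    """Langevin function L(x) = coth(x) - 1/x, with a series near zero."""
    if abs(x) < 1e-4:
        return x / 3.0 - x ** 3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x


def damping_factor(n, start=0.9, end=0.5, last=80):
    """Damping factor for iteration n; the linear ramp is an assumption."""
    if n <= 1:
        return start
    if n >= last:
        return end
    return start + (end - start) * (n - 1) / (last - 1)


def damped_dipole(mu_prev, mu0, x_prev, omega):
    """Scalar form of eq 12: mix the previous dipole with the new estimate."""
    return omega * mu_prev + (1.0 - omega) * mu0 * langevin(x_prev)


def has_converged(energies, tolerance=1e-3, window=10):
    """True if the last `window` steps each changed the energy by < 0.1%."""
    if len(energies) < window + 1:
        return False
    recent = energies[-(window + 1):]
    return all(abs(b - a) <= tolerance * abs(b) for a, b in zip(recent, recent[1:]))


# One damped step for an outer-grid dipole (mu0 = 0.26 e Å) at x = 1.
print(damped_dipole(0.1, 0.26, 1.0, damping_factor(80)))
```

Heavier damping early on (ω close to 0.9) keeps most of the previous dipole and suppresses oscillations, while the smaller ω used later lets the dipoles relax faster toward the self-consistent values.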

3.4. Bulk Correction. The contributions to the solvation free energy originating from the solvent region that lies outside the region filled with Langevin dipoles are evaluated by a continuum approximation. This is done by using Born’s and Onsager’s formulas for charges and dipoles, respectively, and expressing the relevant bulk free energies of eq 1 by eqs 13a and 13b, where $R$ is the radius of the Langevin dipoles region, while $Q$ and $\mu$ are the charge and the dipole of ionic and neutral solutes, respectively, and $\epsilon$ is the dielectric constant of the solvent ($\epsilon = 80$ for water). The cavity radius was calculated from the estimated solute radius plus 2 Å for the inner grid layer plus a term determined from the number and spacing of outer grid dipoles. The resulting bulk correction, $\Delta G_{\mathrm{bulk}}$, is typically about -10 kcal/mol for ionic solutes and about -0.1 kcal/mol for polar neutral solutes.
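The Born and Onsager expressions of eqs 13a and 13b are simple enough to evaluate directly. In the sketch below the function names and the sample inputs (a radius of 16.6 Å, a dipole of 0.5 e Å, a radius of 10 Å) are illustrative assumptions, not values from the paper; charges are in units of e, dipoles in e Å, radii in Å, and energies in kcal/mol.

```python
def bulk_ionic(charge, radius, epsilon=80.0):
    """Born-type bulk correction for an ionic solute (eq 13a), kcal/mol."""
    return -166.0 * (1.0 - 1.0 / epsilon) * charge ** 2 / radius


def bulk_neutral(dipole, radius, epsilon=80.0):
    """Onsager-type bulk correction for a neutral solute (eq 13b), kcal/mol."""
    prefactor = (2.0 * epsilon - 2.0) / (2.0 * epsilon + 1.0)
    return -166.0 * prefactor * dipole ** 2 / radius ** 3


# Illustrative inputs only: a monovalent ion and a polar neutral solute.
print(bulk_ionic(charge=1.0, radius=16.6))
print(bulk_neutral(dipole=0.5, radius=10.0))
```

For a singly charged ion, a radius of roughly 16 to 17 Å gives a correction near the typical -10 kcal/mol quoted above.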

3.5. Hydrophobic and van der Waals Interactions. To reproduce observed solvation free energies of nonpolar solutes, the electrostatic part of the solvation energy must be augmented by additional terms approximating hydrophobic and van der Waals (vdW) contributions. The hydrophobic free energy, $\Delta G_{\mathrm{phob}}$, is assumed to be related to the magnitude of the nonpolar molecular surface, which is proportional to the number of Langevin dipoles (grid points) that lie within 1.5 Å from the vdW surface of the solute (eq 14). Here, the magnitude of the electrostatic potential, $V_j$, at the jth grid point is used as the criterion for the local surface polarity, according to the formula given in eq 15. The empirical parameters $k_{\mathrm{phob}} = 0.012$ kcal/mol, $V_{\mathrm{min}} = 0.002$ e/Å, $V_{\mathrm{max}} = 0.0015$ e/Å, and $\chi = 0.08$ were adjusted to obtain the correct solute-size dependence of solvation free energies of hydrocarbons and alkyl alcohols.
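The polarity-dependent weighting of surface grid points (eqs 14 and 15) can be sketched as a piecewise-linear function. The function names are hypothetical, and the thresholds are passed in as arguments: the values of `v_min` and `v_max` in the usage lines are placeholders chosen only so that `v_min < v_max`, as the piecewise form requires, and should not be read as the fitted parameters.

```python
def surface_weight(potential, v_min, v_max, chi):
    """Weight f(V_j) of eq 15 for one surface grid point."""
    if v_max <= v_min:
        raise ValueError("v_max must be larger than v_min")
    magnitude = abs(potential)
    if magnitude <= v_min:
        return 1.0  # nonpolar surface: full hydrophobic contribution
    if magnitude >= v_max:
        return chi  # polar surface: residual contribution only
    # Linear interpolation between 1 (at v_min) and chi (at v_max)
    return 1.0 - (magnitude - v_min) / (v_max - v_min) * (1.0 - chi)


def hydrophobic_energy(potentials, k_phob, v_min, v_max, chi):
    """Hydrophobic term of eq 14: k_phob times the sum of weights."""
    return k_phob * sum(surface_weight(v, v_min, v_max, chi) for v in potentials)


# Placeholder thresholds (illustrative only) with chi = 0.08, k_phob = 0.012.
sample_potentials = [0.0, 0.002, -0.01]
print(hydrophobic_energy(sample_potentials, 0.012, v_min=0.001, v_max=0.003, chi=0.08))
```

The interpolation makes the weight continuous at both thresholds, so small changes in the local potential cannot cause jumps in the hydrophobic term.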

The van der Waals energy, $\Delta G_{\mathrm{vdW}}$, was calculated as the sum of $r^{-9}$ repulsion and London dispersion terms (eq 16). In this formula, $r_i^*$ and $C_i$ denote atomic vdW radii and London coefficients, respectively, and $r_{ij}$ is the distance between the ith atom and the jth grid point. The normalization factor $N_j$, which equals 1 and 1/27 for outer and inner grid points, respectively, was introduced in eq 16 because the densities of Langevin dipoles placed on inner and outer grid points differ by a factor of 27 (see above). The parameter $k_{\mathrm{vdW}}$ is equal to 0.84 kcal/mol. The values of the atomic radii ($r^*$) are given in Table 1. Note that these parameters represent the distance from the given solute to the center of the nearest solvent molecule, and thus they are not directly comparable to vdW atomic radii obtained from molecular and ionic crystals (e.g. Bondi’s or Pauling’s sets), molecular mechanics force fields, or those obtained by fitting solvation energies by using Born’s formula. However, the overall trend in the magnitudes of the LD vdW radii is similar to those obtained by using other methods.

The London coefficients (the $C$’s) were chosen to be the same for the atoms from the same row of the periodic table. Namely, $C$ = 0.7, 1.5, and 2.0 for hydrogen and atoms located in the first and second row of the periodic table of elements, respectively. The increase in the London coefficients reflects the fact that atomic polarizability increases with the atom size.
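A single atom-grid-point contribution to eq 16 can be evaluated as in the sketch below, which also applies the hydrogen-radius rule of Table 1 (footnote d). The function and dictionary names are illustrative assumptions; the numerical constants are those quoted in the text and in Table 1.

```python
# London coefficients: hydrogen, first-row atoms, second-row atoms
LONDON_C = {"H": 0.7, "row1": 1.5, "row2": 2.0}

# Scaling constants k_H for hydrogen radii (Table 1, footnote d)
K_H = {"row1": 0.88, "row2": 0.78}


def hydrogen_radius(r_star_heavy, row):
    """r*(H) = k_H * r*(X), with X the closest heavy atom."""
    return K_H[row] * r_star_heavy


def vdw_term(r_star, c, distance, inner, k_vdw=0.84):
    """One (atom i, grid point j) term of eq 16, in kcal/mol."""
    n_j = 1.0 / 27.0 if inner else 1.0  # inner grid is 27 times denser
    s = r_star / distance
    return k_vdw * c * n_j * (2.0 * s ** 9 - 3.0 * s ** 6)


# C(sp3) atom (r* = 2.65 Å) and an outer grid point at three distances.
for distance in (2.0, 2.65, 4.0):
    print(distance, vdw_term(2.65, LONDON_C["row1"], distance, inner=False))

# Radius of a hydrogen bound to that carbon.
print(hydrogen_radius(2.65, "row1"))
```

The bracket in eq 16 has its minimum value of -1 at $r_{ij} = r_i^*$, so each term reaches a well depth of $k_{\mathrm{vdW}} C_i N_j$ at that separation and becomes repulsive at shorter distances.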

3.6. Solute Polarization.   Because ∆GES is evaluated fromthe gas phase charges, some correction is required to accountfor the increase in the solvation free energy due to thepolarization of the solute electron density as it interacts withLangevin dipoles. This correction, which is denoted here as∆Grelax, is evaluated in the framework of the linear response

TABLE 1: vdW Radii of the LD Model

atom typea r * [Å] atom typea r * [Å]

C(sp3) 2.65 S 3.2C(sp2) 3.0 P 3.1C(sp) 3.25 F 2.45O(sp3) 2.2 Cl 3.15O(sp2) 2.65 H   d O(inorg)b 2.8 H(inorg)c 2.0N 2.65

a C(sp)  )  C(sp2), O(sp)  )  O(sp2).   b Inorganic vdW radius is used

for oxygen atoms that are not directly bonded to carbon. Exceptionsinclude oxygen atoms in nitro groups attached to carbon, H2O, andtheir ions.   c H covalently bound to O(inorg).   d  vdW radius of hydrogenis assumed to be linearly dependent on the   r * of the closest heavyatom X:   r *(H) ) k Hr *(X). The constant  k H equals 0.88 and 0.78 foratom X from the first and second row of the periodic table, respectively.

 µb j(n) ) ωµb j

(n-1) + (1 - ω) µb0 L( x(n-1)) (12)

∆Gbulk(ionic solute) ) -166(1 - 1/ )Q2 /  R   (13a)

∆Gbulk(neutral solute) )-166[(2 - 2))/(2 + 1)] µ2 /  R3

(13b)

∆Gphob ) k phob∑ j

 f (V  j),   j ∈ surface grid points (14)

 f (V  j) ) {1,   for |V  j| e V min

1 -  |V  j|-V min

V max - V min(1 - χ),   for V max > |V  j| > V min

 χ,   for |V  j| g V max}

(15)

∆GvdW ) k vdW∑i, j

C i N  j[2(r *i

r ij)

9

- 3(r *i

r ij)

6

],

i ∈ solute atoms,   j ∈ grid points (16)

5586   J. Phys. Chem. B, Vol. 101, No. 28, 1997    Florian and Warshel

Page 5: Jp 9705075

8/12/2019 Jp 9705075

http://slidepdf.com/reader/full/jp-9705075 5/13

approximation (LRA).6 Within this approximation, the energy invested in polarizing the solute is negative half of the energy gained by the interaction between the polarizing potential and the induced changes in solute atomic charges (∆qi). This leads in the case of combined QM/MM treatments to the expression41 given in eq 17,

where the values of solvent-induced electrostatic potentials at solute atoms, V_i, are evaluated according to eq 18,

and the screening factor d is given by eq 3 for the NLD method, and d = 1 for the ILD method. Here we decided to replace the factor of 1/2 in eq 17 by an adjustable constant k_relax (eq 19).

To determine the constant k_relax, we calculated ∆Grelax for H2O, OH-, and CH3O- molecules, which represent systems with small, medium, and large polarization contributions, and compared the resulting ∆Grelax with the values obtained previously by a self-consistent quantum mechanical approach.36 Accordingly, we adjusted k_relax to the values 0.8 and 0.6 for the NLD and ILD methods, respectively. Note that the value 0.6 is close to 0.5 predicted by the LRA. The values of ∆qi were obtained by evaluating the change in potential-derived atomic charges when going from the gas phase solute to the solute solvated using the polarized continuum method (PCM) of Tomasi. Because the values of ∆qi obtained in this way are not dependent on the LD vdW radii, we found this approach to be most convenient for the purpose of adjusting parameters of the LD model.
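The double sum behind eqs 18 and 19 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: dipole moments are taken in e·Å and distances in Å, the vector r_ij is assumed to point from solute atom i to dipole j (a sign convention the text does not spell out), the function names are hypothetical, and the coordinates and charge shifts are made-up inputs rather than data from this work.

```python
import math

def solvent_potential(atom_pos, dipoles, d=1.0):
    """Eq 18: V_i = -332 * sum_j (mu_j . r_ij) / (d * r_ij**3).

    dipoles is a list of (position, moment) pairs. r_ij is taken as the
    vector from atom i to dipole j (an assumed convention).
    """
    total = 0.0
    for pos, mu in dipoles:
        r = [p - a for p, a in zip(pos, atom_pos)]
        dist = math.sqrt(sum(c * c for c in r))
        total += sum(m * c for m, c in zip(mu, r)) / (d * dist ** 3)
    return -332.0 * total

def delta_g_relax(atoms, dipoles, k_relax=0.6, d=1.0):
    """Eq 19: k_relax * sum_i V_i * dq_i; atoms is a list of (position, dq)."""
    return k_relax * sum(
        solvent_potential(pos, dipoles, d) * dq for pos, dq in atoms
    )

# Illustrative input only: one atom whose charge rises by 0.05 e, and one
# dipole 3 A away that points away from the atom.
atoms = [((0.0, 0.0, 0.0), 0.05)]
dipoles = [((0.0, 0.0, 3.0), (0.0, 0.0, 0.3))]
print(round(delta_g_relax(atoms, dipoles), 3))  # -0.332
```

With k_relax = 0.6 the sketch follows the ILD choice quoted above; passing k_relax=0.8 together with a screening factor d from eq 3 would correspond to the NLD variant.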

In the future we intend to evaluate ∆Grelax of larger systems empirically, whereas for smaller solutes we will incorporate the potential from Langevin dipoles into the solute quantum mechanical Hamiltonian. The strategy for the implementation of the QM/LD coupling into the semiempirical MNDO Hamiltonian has been described previously.30 This approach assumes that differential overlap of atomic orbitals can be partly neglected and that the SCF Hamiltonian is Löwdin orthogonalized. Such a procedure is entirely consistent within the framework of the semiempirical model.42 The same procedure can be considered only approximate for ab initio methods, so that its performance needs to be carefully assessed. A successful implementation of this QM/LD coupling was reported by Malcolm et al.33 A more rigorous approach involves the conversion of the Langevin dipoles into point charges, which are subsequently introduced as external charges in the ab initio Hamiltonian (see also the related treatment on the all-atom water model43). These and other effective approaches are being developed in our laboratory, but they are beyond the scope of the present work. It is important to point out in this respect that if the solvent is treated classically (including, of course, by continuum models), there is no rigorous way of obtaining the solute polarization energy (see section 2.2 and the appendix of ref 30), and the LRA is one of the most reasonable approximations. This applies also to the effect of the electron correlation part of ∆Grelax (see section 3.7).

The ∆Grelax contributions evaluated by eq 19 are compared with the results of Chen et al.36 in Table 2. Although there are apparent method-dependent variations in calculated values, overall trends are predicted in a consistent way. Such a performance seems to be sufficient in standard solvation calculations since ∆Grelax usually does not exceed 15% of the total solvation energy. The polarization of anionic solutes was found to be significantly larger than for their neutral or positively charged counterparts. The largest ∆Grelax value of -8.6 kcal/mol was obtained by us for the phenolate anion. These findings can be rationalized by a more diffuse character of the wave function of anions and consequently their larger static electronic polarizability. Therefore it is quite surprising that the ∆Grelax values reported by Chen et al.36 for the formic acid/formate system do not conform to the mentioned trend.

3.7. Higher Order Corrections. The essential feature of the presented parametrization of the LD solvation model is the use of the solute point charges that approximate the molecular electrostatic potential determined from the Hartree-Fock (HF) wave function expressed in terms of the 6-31G* set of atomic orbitals. This quantum chemical method is frequently considered as a method of choice for various chemical and biochemical applications involving medium to large molecules. However, the HF approximation neglects electron correlation effects that are known to play an important role in systems such as transition states for breaking or forming covalent bonds. Also, the use of more extensive basis sets may be necessary for the proper description of solute charge distribution in some anionic systems. Thus, in these cases additional corrections may be required to amend these deficiencies. This is accomplished here as in the case of the HF solute polarization by using the LRA and adding to ∆Gsolv the term ∆Gcor (eq 20),

where ∆q_i^cor represents the additional solute polarization due to the higher order effects, and k_ES = 0.52 and 0.47 for the ILD and NLD methods, respectively. (Note that the LRA constant k_ES is identical to that used in eqs 10 and 11.) Equation 20 is analogous to eq 19 in that it involves the electric potentials V_i at the positions of solute atoms and differences in atomic charges. Here, the values of ∆q_i^cor were obtained by subtracting charges on the ith solute atom calculated from the HF and MP2 electrostatic potential, i.e. ∆q_i^cor = Q_i^HF - Q_i^MP2. By MP2 we denote the many-body perturbation theory of second order that treats the electron correlation as a perturbation to the HF Hamiltonian. Consequently, MP2 electron density and electrostatic potential include part of the electron correlation effects. The MP2/6-31+G** atomic charges are used in this paper to evaluate the higher order corrections according to eq 20, but

TABLE 2: Solute Polarization Term (∆Grelax (kcal/mol)) Evaluated by Various Computational Methods

molecule     ILD     SCRF-P^a    SCRF-P+D^a
H2O          -0.8    -1.6        -0.9
OH-          -2.8    -3.7        -1.9
CH3OH        -0.6    -1.4        -1.1
CH3O-        -6.8    -5.9        -6.1
CH3NH2       -0.3    -1.2        -0.6
CH3NH3+      -0.3    -1.1        -0.6
Im           -0.8    -3.4        -2.4
ImH+         -0.3    -0.6        -0.2
HCOOH        -0.6    -1.5        -1.6
HCOO-        -1.8    -1.0        -1.2
CH3COOH      -0.6    -1.8        -1.5
CH3COO-      -3.5    -4.2        -3.4

^a Results obtained by self-consistent coupling of the reaction potential of the solvent, obtained by solving the Poisson-Boltzmann equation, with the DFT quantum mechanical description of the solute.36 The molecular charge distribution was represented by a multicenter multipole expansion formed either by atomic charges (SCRF-P) or by atomic charges and dipoles (SCRF-P+D).

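$$\Delta G_{\mathrm{relax}}^{\mathrm{LRA}} = \sum_i V_i\,\Delta q_i - 0.5\sum_i V_i\,\Delta q_i = 0.5\sum_i V_i\,\Delta q_i \tag{17}$$

$$V_i = -332\sum_j \frac{\vec{\mu}_j\cdot\vec{r}_{ij}}{d\,r_{ij}^{3}} \tag{18}$$

$$\Delta G_{\mathrm{relax}} = k_{\mathrm{relax}}\sum_i V_i\,\Delta q_i \tag{19}$$

$$\Delta G_{\mathrm{cor}} = k_{\mathrm{ES}}\sum_i V_i\,\Delta q_i^{\mathrm{cor}} \tag{20}$$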


other extended basis sets or other correlated methods, including density functional theory (DFT), could be used as a reasonable alternative. We use the notation MP2-LD and LD to distinguish between ∆Gsolv results calculated by using the LD model with and without the ∆Gcor term, respectively.

Effects that cannot be treated by the LRA, for example the variation of the electron correlation energy upon solvation, were neglected. This approach is partly justified by the small values of ∆Gcor obtained for the molecules studied by us and a small effect of the electric field on the electron correlation contribution to the dipole moment of the OH- ion (see section 4.4). Also, we would like to emphasize that there is no rigorous way of obtaining the correlation contribution to the solute polarization energy for classical solvent models (see section 2.2 and the appendix of ref 30).

4. Hydration Free Energies of Neutral and Ionic Solutes

In Tables 3 and 4, we present an overview of the calculated and experimental hydration free energies (∆Gsolv) for a number of neutral and ionic solutes. The first attempt to obtain such a quantitative comparison was reported in early studies of solvent models.4,6 More extensive compilations were reported by Cramer and Truhlar15,16 as a part of the development of the SM family of solvation models based on the semiempirical AM1 and PM3 calculations, by Sitkoff et al., who developed atomic parameters for the macroscopic solvent models,32 and by Stefanovich and Truong44 for ab initio PCM calculations. Other computational studies of aqueous solvation thermodynamics were focused on implementation of various solvation models into quantum mechanical programs30,36,45-49 or on detailed analysis of solvation phenomena for a smaller range of molecules, e.g. neutral molecules.50-53

Table 3 lists individual components of the calculated solvation free energy for selected solutes. In Table 4, results obtained by using iterative (ILD) and noniterative (NLD) Langevin dipole models described in the preceding section are compared with experimental data. In addition, LD solvation free energies can be compared in Table 4 with results obtained by the PCM model. In the PCM calculations we used Pauling's hybridization-independent atomic radii that were parametrized only through a single multiplicative scale factor.

The experimental data presented in Tables 3 and 4 vary in their reliability, depending on the way in which they were derived. For neutral solutes, experimental ∆Gsolv can be obtained directly by measuring partition coefficients of solutes between gas phase and dilute aqueous solutions in equilibrium. Following the suggestion of Ben-Naim,54 the experimental ∆Gsolv presented in this paper corresponds to a transfer of the single solute molecule from gas to dilute aqueous solution of the same molarity. We used the data compiled by Cabani et al.55 and by Ben-Naim and Marcus56 for the temperature of 298 K and atmospheric pressure. The uncertainty of measured partition coefficients increases with increasing |∆Gsolv|.57 Eventually, the concentration of the solute molecules in the gas phase falls below the experimental detection limit. This usually occurs for solutes with |∆Gsolv| above 12 kcal/mol. Thus, ∆Gsolv cannot be obtained by direct experiment for any charged solute. Instead, for a positively charged (protonated) solute, experimental ∆Gsolv can be obtained indirectly from the values of gas phase basicity, B_g(A) = G°(A) + G°(H+) - G°(AH+), the solvation free energy of the proton, ∆Gsolv(H+), ∆Gsolv of its conjugate base, and its pKa constant (eq 21a).6

By using an analogous thermodynamic cycle, ∆Gsolv of a negatively charged base B- can be evaluated by eq 21b.

In cases where the gas phase basicity is unknown, it can be obtained from the value of gas phase proton affinity, ∆H_g [kcal/mol], by using the approximate formula of eq 22.58,59

The value of ∆Gsolv(H+) can be derived from the absolute potential of the hydrogen electrode.60 Because the value of this potential has been determined with only ∼1% accuracy,61 the values for solvation energy of the proton that are used by different authors vary considerably. Here we assume that ∆Gsolv(H+) is -259.5 kcal/mol and that the uncertainty of this value is ±2.5 kcal/mol. ∆Gsolv(H+) cancels out when relative solvation energies of equally charged solutes are evaluated, but its magnitude does affect relative solvation energies of solutes carrying different charges. Therefore, the uncertainty in ∆Gsolv(H+) must be considered when LD and experimental ∆Gsolv are compared.
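The thermodynamic cycle of eqs 21a, 21b, and 22 is simple arithmetic, and a minimal Python sketch may make the bookkeeping easier to follow. The function names are hypothetical. In the NH4+ example, ∆Gsolv(NH3) = -4.3 kcal/mol is the experimental value from Table 3, whereas the gas phase basicity (195.7 kcal/mol) and the pKa (9.25) are assumed illustrative inputs that are not quoted in the text.

```python
R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def dg_solv_cation(b_gas_a, dg_solv_a, pka_ah, dg_solv_h=-259.5, temp=298.15):
    """Eq 21a: solvation free energy of AH+ in kcal/mol."""
    return b_gas_a + dg_solv_h + dg_solv_a - 2.303 * R_KCAL * temp * pka_ah

def dg_solv_anion(b_gas_b, dg_solv_bh, pka_bh, dg_solv_h=-259.5, temp=298.15):
    """Eq 21b: solvation free energy of B- in kcal/mol."""
    return -b_gas_b - dg_solv_h + dg_solv_bh + 2.303 * R_KCAL * temp * pka_bh

def basicity_from_proton_affinity(proton_affinity):
    """Eq 22: approximate gas phase basicity from the proton affinity."""
    return proton_affinity - 7.5

# NH4+ with assumed gas phase basicity and pKa (see the note above).
print(round(dg_solv_cation(195.7, -4.3, 9.25), 1))  # -80.7
```

Under these assumed inputs the result falls within the -81 ± 5 kcal/mol listed for NH4+ in Table 4. Shifting dg_solv_h by its ±2.5 kcal/mol uncertainty moves the result by the same amount, which is why that uncertainty matters when ions of different charge are compared.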

The accuracy of other terms entering eq 21 was determined as follows. Experimental pKa values are highly accurate if they fall in the 0-14 range, but their uncertainty increases beyond this region. Moreover, solvation free energies determined by using eq 21 for acids and bases with pKa constants below -3 and above 16 refer to solvation by solvents other than neutral water. The relative gas phase basicities are determined directly from the equilibrium constants for proton transfer reactions, and as such they are usually accurate within 0.5 kcal/mol. However,

TABLE 3: Solvation Free Energies [kcal/mol] of Hydrocarbons and Neutral Alkyl Alcohols, Ethers, and Amines

solute^a             ∆GES^b    ∆GvdW    ∆Gphob    ∆Grelax    ∆Gsolv(calc)    ∆Gsolv(exp)
methane              -0.1      -1.5     3.4       0.0        1.8             1.9
ethane               0.0       -2.2     4.2       0.0        2.0             1.8
propane              0.0       -2.9     4.9       0.0        2.0             2.0
butane               0.0       -3.5     5.5       0.0        2.0             2.2
pentane              0.0       -4.1     6.1       0.0        2.0             2.3
hexane               0.0       -4.7     6.8       0.0        2.1             2.6
cyclohexane          0.0       -4.1     5.9       0.0        1.8             1.2
ethene               -0.4      -2.3     4.0       0.0        1.3             1.3
cyclopentene         -0.2      -3.8     5.4       0.0        1.4             0.6
cyclopentadiene      -0.8      -3.7     4.7       0.0        0.2
cyclopentadiene-^c   -50       -4.5     0.5       -0.6       -65             -66 ± 5
benzene              -1.1      -4.4     4.9       -0.1       -0.7            -0.9
naphthalene          -1.7      -6.2     5.2       0.0        -2.8            -2.4
ethyne               -0.4      -2.3     4.0       0.0        0.3             0.0
methanol             -4.8      -1.9     1.5       -0.6       -6.0            -5.1
ethanol              -4.5      -2.6     2.2       -0.6       -5.3            -5.0
propanol             -4.3      -3.2     3.1       -0.6       -5.0            -4.8
butanol              -4.4      -3.8     3.6       -0.6       -5.1            -4.7
ethanediol           -7.3      -2.9     1.5       -0.9       -9.7            -9.6
tetrahydrofuran      -2.7      -3.6     3.0       -0.6       -3.9            -3.5
1,4-dioxane          -3.9      -3.8     3.3       -0.6       -5.1            -5.1
dimethyl ether       -2.1      -2.6     2.6       -0.4       -2.6            -1.9
diethyl ether        -1.3      -3.9     4.3       -0.2       -1.1            -1.6
ammonia              -3.1      -1.4     1.1       -0.4       -4.0            -4.3
methylamine          -2.5      -2.2     2.1       -0.3       -2.9            -4.6
ethylamine           -2.4      -2.8     2.9       -0.3       -2.6            -4.5
dimethylamine        -1.5      -2.8     3.3       -0.2       -1.2            -4.3
trimethylamine       -1.0      -3.3     4.4       -0.1       -0.1            -3.2

^a All-trans conformers of alkanes, Cs (trans) conformers of alkyl alcohols, all-gauche conformer of ethanediol.
^b Obtained by the ILD method.
^c Deprotonated at the sp3 carbon atom.

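$$\Delta G_{\mathrm{solv}}(\mathrm{AH^+}) = B_g(\mathrm{A}) + \Delta G_{\mathrm{solv}}(\mathrm{H^+}) + \Delta G_{\mathrm{solv}}(\mathrm{A}) - 2.303\,RT\,\mathrm{p}K_a(\mathrm{AH^+}) \tag{21a}$$

$$\Delta G_{\mathrm{solv}}(\mathrm{B^-}) = -B_g(\mathrm{B^-}) - \Delta G_{\mathrm{solv}}(\mathrm{H^+}) + \Delta G_{\mathrm{solv}}(\mathrm{BH}) + 2.303\,RT\,\mathrm{p}K_a(\mathrm{BH}) \tag{21b}$$

$$B_g(\mathrm{A}) = \Delta H_g(\mathrm{A}) - 7.5\ \text{kcal/mol} \tag{22}$$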


the error range of absolute free energies of the molecular standards for these measurements is significantly larger.62 Therefore, gas phase basicities were assumed to be accurate to within ±2.5 and ±1.5 kcal/mol for molecules with proton affinities above and below 202 kcal/mol, respectively.59,62 For basicities of neutral molecules we used the scale of Lias et al.,62 and for basicities of anions (acidities of neutral molecules) we used a more recent compilation by the same group.59 In a few cases when experimental gas phase basicities were not available, we substituted them by ab initio MP2/6-31+G**//HF/6-31G* proton affinities using eq 22. We assumed that these ab initio results were accurate within ±4 kcal/mol. Similarly, if experimental ∆Gsolv of neutral compounds were not available, we used in the right-hand side of eqs 21a and 21b values of ∆Gsolv that were calculated by the ILD method. This replacement is reasonable since the accuracy of calculated solvation energies of neutral solutes is significantly better than for ionic solutes.

Experimental solvation energies for a number of ionic solutes were reported previously by Pearson,58 who used the same formula (eq 21a,b) for their derivation. These data differ from ours in that Pearson's values of |∆Gsolv| refer to the transfer of the solute from the ideal gas (1/22.4 M) to 1 M aqueous solution. For example, for methanol ∆Gsolv of -3.2 and -5.1 kcal/mol was used in Pearson's and our compilations, respectively. There are also other nonsystematic differences between our and Pearson's sets of experimental data, especially in the choice of gas phase free energies. Therefore, we converted Pearson's values to our standard state and listed them in Table 4 along with the experimental solvation energies derived by us and their uncertainties.
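The standard-state shift between the two conventions can be sketched as follows, assuming it consists only of the ideal-gas concentration term RT ln(22.4) that separates a 1/22.4 M gas from a 1 M gas. The function name is hypothetical, and the sketch deliberately ignores the other, nonsystematic differences mentioned above.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol K)

def pearson_to_equal_molarity(dg_pearson, temp=298.15, gas_dilution=22.4):
    """Shift a (1/22.4 M gas -> 1 M solution) value to equal molarities."""
    return dg_pearson - R_KCAL * temp * math.log(gas_dilution)

# Methanol: -3.2 kcal/mol in Pearson's convention
print(round(pearson_to_equal_molarity(-3.2), 2))  # -5.04
```

The shift of about 1.84 kcal/mol brings the methanol value to within rounding of the -5.1 kcal/mol used in this compilation.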

The generally large error ranges of experimental solvation energies of ionic solutes raise the question of whether one should include these data in the training set for the parametrization of solvation models. We believe that despite their uncertainty, these data are valuable for determination of atomic vdW radii. This is because calculated solvation energies of ionic solutes are highly sensitive to atomic radii. In addition, variations in ∆Gsolv due to the substituent effects are usually determined experimentally with substantially better accuracy since gas phase data refer to the same standard and ∆Gsolv(H+) cancels out.

4.1. Hydrocarbons. The electrostatic contributions to the ∆Gsolv of neutral hydrocarbons are negligible. Because the experimental ∆Gsolv of linear alkanes increases only slowly with the number of methylene groups (Table 3), van der Waals (∆GvdW) and hydrophobic (∆Gphob) terms that have opposite signs must nearly compensate each other. The magnitude of ∆GvdW per -CH3 group was adjusted to be close to -0.9 kcal/mol determined from the measured substituent effects on

TABLE 4: Comparison of Calculated and Experimental Solvation Free Energies of Neutral and Ionic Solutes in Aqueous Solution^a

                 calculated                experimental
solute^b         NLD     ILD     PCM^c     this work^d     Pearson^e
H2O              -7.8    -8.8    -6.0      -6.4
H3O+             -108    -101    -94       -105 ± 5        -104
OH-              -118    -115    -121      -110 ± 5        -106
MeOH2+           -88     -83     -80       -87 ± 5^j       -83
MeO-             -99     -97     -105      -98 ± 5         -95
EtOH2+           -80     -77     -73       -81 ± 6^j
EtO-             -93     -93     -102      -94 ± 5
PhOH             -4.7    -5.7    -2.9      -6.6
PhO-^d           -77     -80     -83       -75 ± 5         -72
acetaldehyde     -4.7    -5.1    -4.8      -3.5
formic acid      -5.3    -6.2    -6.5
formate          -85     -81     -88       -80 ± 5
acetic acid      -6.1    -7.0    -7.0      -6.7
acetate          -83     -82     -89       -82 ± 5         -77
CHF2COOH         -5.9    -6.8    -5.7
CHF2COO-         -78     -75     -77       -70 ± 6
CHCl2COOH        -4.2    -5.0    -4.8
CHCl2COO-        -71     -71     -77       -66 ± 6
NH4+             -85     -81     -87       -81 ± 5         -79
MeNH3+           -78     -74     -76       -73 ± 5         -70
Me2NH2+          -70     -67     -68       -66 ± 5         -63
Me3NH+           -65     -62     -64       -59 ± 5         -59
aniline          -4.3    -5.4    -2.2      -4.9
anilineH+^m      -64     -64     -62       -68 ± 6         -68
pyridine         -3.7    -4.3    -3.4      -4.7
pyridineH+^m     -60     -59     -61       -58 ± 5         -59
imidazole        -7.3    -8.3    -5.6      -10.3^g
imidazoleH+^m    -65     -62     -67       -64 ± 5         -62
formamide        -8.2    -8.7    -7.9
formamideH+^k    -80     -76     -77       -78 ± 5
acetamide        -8.2    -8.9    -9.4      -9.7
acetamideH+^k    -72     -69     -70       -70 ± 5
cytosine         -17     -18     -15
cytosineH+^f     -67     -66     -67       -67 ± 6
HCN              -4.3    -4.7    -3.5
CN-              -77     -75     -89       -75 ± 5         -77
acetonitrile     -6.7    -6.9    -4.1      -3.9
CH3CNH+          -72     -68     -74       -69 ± 5         -69
nitromethane     -7.4^i  -7.9^i  -5.2      -3.7^h
CH2NO2-          -83^i   -82^i   -84       -80 ± 6
HNO2             -4.1    -4.5    -2.5
NO2-             -77     -74     -85       -73 ± 7         -72
HNO3             -5.2    -6.0    -2.7
NO3-             -72     -68     -77       -66 ± 5         -65
H2S              -0.3    -0.4    -0.2      -0.7
HS-              -77     -75     -87       -76 ± 5         -76
MeSH             -1.3    -1.5    -0.5      -1.2
MeS-             -74     -74     -84       -76 ± 5
EtSH             -1.4    -1.6    -0.6      -1.2
EtS-             -71     -72     -81       -74 ± 5
PhSH             -1.6    -1.9    -0.3      -2.6
PhS-             -66     -69     -68.0     -65 ± 7^l       -67
PH3              0.8     0.8     -0.1      0.6
PH4+             -75     -72     -80                       -73
MePH2            0.2     0.1     -0.2
MePH3+           -69     -68     -72       -63 ± 5         -66
Me2PH            -0.1    -0.2    -0.4
Me2PH2+          -64     -63     -64       -57 ± 5         -57
Me3P             -0.4    -0.6    -0.6
Me3PH+           -61     -59     -60       -53 ± 5         -53
H3PO4            -10     -12     -13
H2PO4-           -71     -70     -81       -68 ± 8^l
HPO4(2-)         -225    -247    -275      -245 ± 15^l
PO4(3-)          -439    -536    -594      -536 ± 20^l
CH3F             -2.6    -2.8    -2.6      -0.2
HF               -4.5    -4.9    -4.7
F-               -109    -104              -107 ± 6        -107
CH3Cl            -2.0    -2.1    0.0       -0.6
ClH              -0.6    -0.8    0.0
Cl-              -81     -78               -78 ± 7         -77

^a Solvation free energies refer to dilute aqueous solutions, standard state of 298 K, and equal molar concentrations of solute in both the gas phase and solution.
^b Me, Et, and Ph denote methyl, ethyl, and phenyl residues, respectively.
^c PCM HF/6-31G* method using Merz-Kollman/Pauling vdW radii (Å) of 1.2 (H), 1.50 (C), 1.50 (N), 1.40 (O), 1.75 (S), 1.80 (P), 1.8 (Cl) scaled by a factor of 1.2.
^d Experimental solvation free energies of neutral molecules were taken from Cabani et al.55 For the determination of "experimental" data for ionic solutes and their uncertainties see the text.
^e Reference 58.
^f Cytosine protonated at N3.
^g 4-Methylimidazole.84
^h Nitroethane.
^i Calculated by assuming H3C(sp3)NO(sp2)O(sp2) and H2C(sp2)NO(sp2)O(-)(sp3) structures for the neutral and deprotonated nitromethane, respectively.
^j pKa = -2.1 and -1.9 for MeOH2+ and EtOH2+, respectively.85
^k Protonated at the oxygen atom.
^l The ab initio MP2/6-31+G**//HF/6-31G* method was used to determine the gas phase proton affinity.
^m Protonated at the nitrogen atom.


enthalpies of transfer of benzene derivatives from vapor to methanol.63 The resulting calculated ∆Gsolv slightly underestimated experimental ∆Gsolv of linear alkanes and overestimated experimental ∆Gsolv of cyclic alkanes, which are about 1.4 kcal/mol less positive compared to linear alkanes.55 This finding could be explained neither by continuum dielectric theory50 nor by the current LD model. Because the ∆GvdW term is quite insensitive to the magnitudes of atomic van der Waals (vdW) radii, these radii have to be determined from polar or ionic solutes, the solvation energy of which is dominated by electrostatic interactions. With increasing aromatic character, ∆GES contributes more significantly to the total solvation energy, but this is still insufficient for adjusting vdW radii of carbon atoms. Therefore, these parameters were mainly determined from solvation properties of alkyl alcohols and amines (see below). Except for the cyclopentadiene anion (Table 3), experimental data from carbocations and carbanions were not used because the relevant pKa constants fall in the highly acidic or basic regions. The cyclopentadiene anion could be included in our training set since it is strongly stabilized by resonance. Indeed, the pKa value for cyclopentadiene (16 ± 0.2) is considered to be the most reliable pKa for a hydrocarbon in the aqueous standard state.64
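The near cancellation of the van der Waals and hydrophobic terms along the linear alkane series can be checked directly from the Table 3 entries. The short Python sketch below (helper names are hypothetical) sums the four tabulated components for each alkane and lists the change in ∆GvdW and ∆Gphob per added CH2 group.

```python
# (dG_ES, dG_vdW, dG_phob, dG_relax, dG_solv calc) in kcal/mol, from Table 3
ALKANES = {
    "methane": (-0.1, -1.5, 3.4, 0.0, 1.8),
    "ethane": (0.0, -2.2, 4.2, 0.0, 2.0),
    "propane": (0.0, -2.9, 4.9, 0.0, 2.0),
    "butane": (0.0, -3.5, 5.5, 0.0, 2.0),
    "pentane": (0.0, -4.1, 6.1, 0.0, 2.0),
    "hexane": (0.0, -4.7, 6.8, 0.0, 2.1),
}

def term_sum(name):
    """Sum of the four tabulated components for one alkane."""
    return sum(ALKANES[name][:4])

def increments(column):
    """Change of one component between successive alkanes (per CH2 group)."""
    names = list(ALKANES)
    return [
        round(ALKANES[b][column] - ALKANES[a][column], 1)
        for a, b in zip(names, names[1:])
    ]

for name, row in ALKANES.items():
    print(f"{name:8s} sum of terms = {term_sum(name):4.1f}  tabulated = {row[4]:4.1f}")
print("vdW step per CH2: ", increments(1))
print("phob step per CH2:", increments(2))
```

For these alkanes the four listed terms add up to the tabulated ∆Gsolv(calc), and the two increments are of similar size and opposite sign, which is why the total changes so little from methane to hexane.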

4.2. Compounds Containing Oxygen or Sulfur Atoms. Oxygen, as a highly electronegative element, largely determines solvation properties of organic molecules. Because the calculated solvation energy is strongly dependent on the magnitudes of vdW radii, we used three different atom types for oxygen in our current LD model. For organic sp3 oxygen, a small 2.2 Å vdW radius was used to account for large |∆Gsolv| of aliphatic alcohols and their ions (Tables 3 and 4). By analogy, the same oxygen vdW radius was used for H3O+, H2O, and OH- molecules, which resulted in underestimated |∆Gsolv| of H3O+ and overestimated |∆Gsolv| for OH- and H2O. Therefore, for obtaining more accurate results, we suggest that the oxygen vdW radii in H2O and OH- be increased to 2.35 Å, as in our study of phosphate ester hydrolysis.65 Unfortunately, the use of such hybridization- or group-dependent vdW radii may result in discontinuities on the free energy surface of a studied chemical reaction. To our knowledge, little attention was paid to this complication before, as many solvation models tend to introduce different vdW parameters for the same atoms in neutral and ionic solutes, or use many group-dependent parameters. On the other hand, the use of hybridization- and group-independent atomic parameters as in the PCM model results in large deviations from experimental data (Table 4). Because the current LD model uses vdW radii that are independent of the total solute charge, the number of such discontinuities is largely limited. Thus, this LD model and a linear interpolation of vdW radii along the reaction path were recently used to obtain smooth and quantitatively correct free energy surfaces for phosphate ester hydrolysis in aqueous solution.65

Low accuracy of ∆Gsolv of ions containing sp3 oxygen atoms not only is characteristic of the LD model but is even more pronounced in continuum-based models. For example, the PCM model with generic atomic radii largely overestimates |∆Gsolv| of the OH- ion and underestimates |∆Gsolv| for protonated water, methanol, and ethanol molecules (Table 4). Alternatively, by using the PCM model with density functional and MP2 electron densities and refined atomic radii, Stefanovich and Truong obtained a reasonable value of ∆Gsolv for the OH- ion, but ∆Gsolv reported by them for CH3O- fell in the -78 to -83 kcal/mol range.44 Similarly, a solvation energy of -85 kcal/mol was obtained for CH3O- by solving the Poisson-Boltzmann equation for density functional charge distribution on the solute.36 Lim et al.66 obtained a correct solvation energy of CH3O- (-97 kcal/mol), but only at the expense of different vdW radii of the oxygen atom in methanol and methoxide anion and freely adjustable atomic charges.

The agreement with experimental data improves for solutes containing sp2 oxygen, such as organic acids. However, the hybridization-dependent vdW radii for O and C atoms require the knowledge of the prevalent resonance form for a given solute. This is straightforward, for example, for R-COO- molecules, in which two resonance structures have equal "weight" and consequently either oxygen can be defined as sp2 (and the other as sp3). However, the situation becomes more difficult in the phenoxide anion, for which the contribution of the minor resonance forms 2 and 3 is unknown. The solvation energy of -80 kcal/mol reported in Table 4 was calculated using the resonance form 1. Its magnitude is overestimated by 5 kcal/mol, whereas ∆Gsolv calculated with the sp2 radius on the O atom and the smaller sp3 radius on the opposite C atom, as in the resonance form 2, amounts to -71 kcal/mol at the ILD level. Thus, averaging over energies obtained for the resonance structures 1 and 2 results in a significant improvement of the calculated result. Alternatively, averaged vdW radii of 2.43 and 2.83 Å, respectively, could be used for the O and C atoms affected by the resonance. The same overestimation of the calculated |∆Gsolv| occurs for the thiophenolate anion. Thus, the vdW parameter of the sp2-hybridized sulfur atom should be somewhat larger than for the sp3 one, in analogy with the trend occurring for C and O atoms. Otherwise, LD results for thiols are in excellent agreement with available experimental data.

4.3. Compounds Containing Nitrogen or Phosphorus Atoms. Unlike solvation of cations of aliphatic alcohols, ∆Gsolv of protonated methyl amines tends to be overestimated by PCM (Table 4) and other continuum dielectric methods.36,46 Therefore, we have chosen a rather large (2.65 Å) vdW radius for nitrogen. This however did not lead to sufficient agreement with the experiment, as large contributions to ∆Gsolv in these molecules have their origin in charged hydrogen atoms. To keep the number of adjustable parameters small, we have chosen the vdW radii of H atoms to be a linear function of the radii of adjacent heavy atoms (with the exception of the H atom bonded to inorganic oxygen). The resulting ILD values of ∆Gsolv agree well with their experimental counterparts and also reproduce the decreasing |∆Gsolv| upon going from NH4+ to the trimethylammonium ion.

The solvation of protonated methyl phosphines turned out to be an even more difficult case. Here, the experimental solvation energies are systematically overestimated by the ILD method by about 6 kcal/mol, but relative energies of protonated mono-, di-, and trimethyl phosphine are predicted correctly.
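The hydrogen radius rule used for the protonated amines and phosphines (Table 1, footnote d) can be written as a small lookup. The Python sketch below is illustrative only: the names are hypothetical, only a subset of the Table 1 radii is included, and the footnote's "first row" and "second row" are read here as C/N/O/F and P/S/Cl, which is an interpretation rather than something the table states explicitly.

```python
# Heavy-atom radii r* in angstroms, a subset of Table 1
R_STAR = {"C(sp3)": 2.65, "N": 2.65, "O(sp3)": 2.2, "S": 3.2, "P": 3.1}

# k_H from Table 1, footnote d. The row assignment below is an assumed reading
# of "first row" (C, N, O, F) and "second row" (P, S, Cl).
K_H = {"first": 0.88, "second": 0.78}
ROW = {"C": "first", "N": "first", "O": "first", "F": "first",
       "P": "second", "S": "second", "Cl": "second"}

def hydrogen_radius(heavy_atom_type):
    """r*(H) = k_H * r*(X) for a hydrogen bonded to heavy atom type X."""
    element = heavy_atom_type.split("(")[0]
    return K_H[ROW[element]] * R_STAR[heavy_atom_type]

print(round(hydrogen_radius("N"), 3))  # 2.332
print(round(hydrogen_radius("P"), 3))  # 2.418
```

Hydrogens bound to inorganic oxygen are the stated exception and keep the fixed 2.0 Å radius of Table 1, so they are not covered by this helper.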

A similar decrease of |∆Gsolv| upon methylation on an electronegative center was observed and calculated for neutral and ionized water and alkyl alcohols. However, for neutral methyl amines, the experimental ∆Gsolv is approximately constant,56,57 whereas LD (Table 3), continuum dielectric,68,69 and MD FEP calculations70,71 provide a decrease in |∆Gsolv| of about 1 kcal/mol for each methylation upon going from


ammonia to trimethylamine. Several computational studies68-71 were devoted to this topic without solving this enigma. Recently, Marten et al.50 attributed this discrepancy to the incorrect description of short-range hydrogen-bonding effects by methods that use electrostatic models.

In contrast to amines, |∆Gsolv| of neutral acetonitrile calculated by using the same vdW radius on nitrogen is overestimated by 3 kcal/mol. This deviation is not critical, given that the very limited parameter set in our model is capable of describing consistently many experimental results for both neutral and ionic solutes. It cannot be corrected by a better choice of vdW radius on the nitrogen, as this would increase the deviation between LD and experimental solvation energy for protonated acetonitrile. In contrast, the SCRF model, which was designed for neutral molecules by Marten et al.,50 provided the solvation energy of acetonitrile within 0.4 kcal/mol from its experimental value. However, this model involved a separate vdW parameter for the sp-hybridized nitrogen and for two types of sp carbon, one of them for the CC and the other for the CN type of triple bonds. The vdW radius for N(sp) was chosen by these authors to be 0.5 Å larger than for N(sp2). Obviously, if this method and these parameters were applied to the prediction of the solvation energy of protonated acetonitrile, they would result in a largely underestimated |∆Gsolv|. Thus, in addition to amines, solvation properties of nitriles seem to be a case in which proper treatment of both the quantum and statistical nature of the solute-solvent interactions becomes important.

Solvation free energies of aromatic N-containing solutes could be predicted accurately enough without any additional parameter adjustments. This is especially important for formamide and acetamide, which represent model systems for peptidic linkages, and also for cytosine. Various aspects of aqueous solvation of these important biomolecules have been recently scrutinized by several groups.23,69,70,72-75

Compounds containing oxygen atoms bonded to nitrogen or phosphorus have rather small |∆Gsolv|. Thus, the value of the oxygen vdW radius in these compounds was adjusted to 2.8 Å, which is 0.6 Å larger than for O(sp3). The ILD model with this parametrization reproduces correctly solvation energies of NO2-, NO3-, and phosphate mono-, di-, and trianion. The noniterative LD (NLD) and the PCM models seem to fail for highly charged phosphates, providing too small and too large values of |∆Gsolv|, respectively. Considerable improvement in the PCM results for phosphate anions could be expected if a larger vdW radius for inorganic oxygen was used. In addition, overestimated PCM hydration energies of 2- and 3- ions can be rationalized by the lack of saturation in continuum methods. In contrast, the nonlinear dependence of induced dipoles upon the magnitude of the solute electrostatic field is expressed in terms of the Langevin function (Figure 1) in LD-based approaches.
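The saturation argument can be made concrete with the Langevin function itself, L(x) = coth(x) - 1/x, compared against its weak-field linear limit x/3. In the Python sketch below, x stands for the dimensionless field strength whose exact definition is given with the Langevin equation earlier in the paper; the function name and the sample values of x are illustrative choices.

```python
import math

def langevin(x):
    """L(x) = coth(x) - 1/x, using a series expansion close to x = 0."""
    if abs(x) < 1e-4:
        return x / 3.0 - x ** 3 / 45.0
    return 1.0 / math.tanh(x) - 1.0 / x

for x in (0.1, 1.0, 3.0, 10.0):
    print(f"x = {x:5.1f}  L(x) = {langevin(x):.4f}  linear x/3 = {x / 3.0:.4f}")
```

For small x the two columns agree, whereas for large x the Langevin function levels off below 1 while the linear response keeps growing. A model without this saturation would therefore be expected to overestimate the solvent polarization around highly charged ions.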

4.4. Electron Correlation and Basis Set Effects. As described in section 3.7, the current LD implementation enables one to easily evaluate electron correlation and basis set effects upon the calculated solvation energies from the differences between the HF and MP2 atomic charges, and the values of Langevin dipoles calculated from the HF/6-31G* charges. (Note that the MP2-LD results were not used for the parametrization of the model nor were they included in Tables 3 and 4.) In this section we will discuss the trends in ∆Gcor values calculated by using eqs 19 and 20 and MP2/6-31+G** atomic charges.

The HF/6-31G* method is generally believed to overestimate the gas phase molecular dipole moment by about 10-20%, so that it effectively mimics the solute polarization in polar solvents. This advantage of HF/6-31G* charges was utilized in a recent refinement of the AMBER molecular mechanics force field.76 Thus, the more accurate solute charge distribution obtained by using correlated quantum chemical methods can be expected to provide less negative solvation energies. We found this expectation to apply for the majority of conjugated or partly conjugated systems, regardless of the total solute charge. However, the magnitude of the MP2/6-31+G** corrections in these solutes was found to be generally small. For example, the largest ∆Gcor of 0.6 kcal/mol was calculated for neutral nitromethane.

The combined electron correlation and basis set corrections are also small for methyl amines, where they amount to only a few tenths of a kcal/mol. Interestingly, these contributions help to slightly improve the agreement with the experimental solvation energies. On the other hand, for methyl alcohols, the calculated ∆Gcor corrections tend to increase their already too high |∆Gsolv| values. This is noticeable especially for OH- and CH3- ions, for which ∆Gcor amounts to -1.2 and -1.0 kcal/mol, respectively. However, due to the small size of these ions, the electronic electron correlation effects might be partly offset by the correlation-related increase in the CO bond length. To estimate the importance of such geometrical effects we calculated ∆Gsolv of the OH- ion by using both the HF and MP2 geometries with the same HF set of atomic charges. The resulting change in ∆Gsolv corresponding to the 0.008 Å increase in the OH bond length turned out to be only -0.2 kcal/mol, which is below the grid-related uncertainty. In addition, we evaluated (at the MP2/6-311++G(2d,p) level) how the correlation contribution to the dipole moment of the OH- ion changes when OH- is placed into a homogeneous electrostatic field. We applied a field of 0.01 au, which increased the dipole moment from 1.77 to 2.04 D at the HF level and from 1.76 to 2.06 D at the MP2 level. Thus, in this case the electron correlation contribution to the induced dipole moment is negligible.
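The last step of this argument is simple arithmetic, sketched below in Python using only the four dipole moments quoted in the text. The dictionary layout and variable names are illustrative assumptions, not part of the LD program.

```python
# Dipole moments of OH- in debye, as quoted in the text,
# without a field and in a homogeneous field of 0.01 au.
dipoles = {
    "HF": {"zero_field": 1.77, "in_field": 2.04},
    "MP2": {"zero_field": 1.76, "in_field": 2.06},
}

# Field-induced part of the dipole moment at each level of theory.
induced = {
    method: round(values["in_field"] - values["zero_field"], 2)
    for method, values in dipoles.items()
}

# Electron correlation contribution to the induced dipole moment.
correlation_part = round(induced["MP2"] - induced["HF"], 2)

print(induced)
print(correlation_part)
```

The induced dipole is 0.27 D at the HF level and 0.30 D at the MP2 level, so correlation changes it by about 0.03 D.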

4.5. Substituent-Dependent Changes in Solvation Free Energies.   As illustrated in Table 4, the absolute accuracy of both the experimental and LD solvation free energies of ionic solutes is low. This is rather unfortunate since the importance of solvation contributions from ionized groups for the kinetics and equilibrium energetics of heterolytic chemical reactions in solutions is enormous. However, experimental information often becomes more reliable (up to ±0.5 kcal/mol) for solvation properties within a group of chemically related compounds. The substituent-dependent changes in ∆Gsolv of charged solutes usually amount to several kcal/mol, so the proper prediction of these trends by the theory is in fact a more meaningful test of the quality of the given computational method than the magnitude of the mean deviation of absolute experimental and calculated solvation energies. Even more importantly, if the solvation-dependent changes in equilibrium constants and kinetics of chemical reactions are correctly reproduced by a solvation model, it is usually easy to change a few parameters of the model in such a way that both the absolute and relative energetics will agree with the experimental data.

In Table 5, we present such a comparison for several groups of related ionic compounds. Members of each group have common total charge and type of central atom. For the same molecules, variations in gas phase basicity calculated at the ab initio MP2/6-31+G**//HF/6-31G* level are compared with the corresponding experimental results. The ab initio method chosen includes a significant amount of correlation energy contributions while remaining applicable to relatively large solutes (∼40 atoms), so it represents a reliable counterpart of the LD model for a wide range of chemical processes occurring in aqueous

Langevin Dipoles Model   J. Phys. Chem. B, Vol. 101, No. 28, 1997    5591


solution. The data presented in Table 5 enable one to compare the accuracy of such standard ab initio methods with the performance of LD and PCM models. Finally, a most critical check of the predictive capabilities of theoretical approaches for the treatment of solution energetics is presented in the two rightmost columns of Table 5. Here, the equilibrium constants for proton transfer reactions (characterized by pKa differences) calculated by using the ILD method are compared with corresponding experimental quantities.
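To relate kcal/mol uncertainties to pKa units, the generic thermodynamic relation ∆pKa = ∆∆G/(RT ln 10) can be used. The Python sketch below applies it at 298.15 K. It is a standard conversion written under stated assumptions, not a transcription of eq 21 of this paper; the function name and the sign convention (a positive ∆∆G of deprotonation in water gives a positive ∆pKa) are assumptions made for illustration.

```python
import math

R_KCAL = 1.987204e-3  # molar gas constant in kcal/(mol K)


def delta_pka(ddg_kcal: float, temperature: float = 298.15) -> float:
    """Convert a difference in aqueous deprotonation free energy (kcal/mol)
    into a pKa difference using Delta pKa = DDG / (RT ln 10)."""
    return ddg_kcal / (R_KCAL * temperature * math.log(10.0))


# How many pKa units correspond to 0.5, 1.0, and 3.0 kcal/mol at room temperature?
for ddg in (0.5, 1.0, 3.0):
    print(f"{ddg:4.1f} kcal/mol  ->  {delta_pka(ddg):.2f} pKa units")
```

Since RT ln 10 is about 1.36 kcal/mol at room temperature, an error of a few kcal/mol in relative solvation free energies translates into an error of a few pKa units, which is why the pKa comparison is a demanding test.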

4.6. Overall Performance.   Overall, the ILD method slightly outperforms both the NLD and PCM methods. Such a good performance indicates that the ILD method can be fully competitive with the continuum dielectric methods in studies of reactivity and equilibrium properties of medium-sized systems in aqueous solution. Here, the existence of several alternative theoretical approaches should enable one to critically evaluate the inherent accuracy of calculated solvation free energies. An example of such a comparison will be given in the following section of our paper.

The results provided by the NLD method have an accuracy comparable to that of the PCM calculations. The NLD method performs especially well for neutral and monoanionic solutes, whereas for solutes with larger localized charges NLD hydration energies are significantly underestimated. However, one should realize that the NLD method is about 2 orders of magnitude faster than the ILD and PCM methods. Therefore, a good performance of the noniterative approach is promising for the calculations of large systems. For such systems, the HF/6-31G* atomic charges needed for the LD calculations may not be directly obtainable, as the whole system may be too big for ab initio treatment. However, the necessary charge distribution could be built from separate ab initio calculations of individual constituents, for example amino acids. In fact, the potential-derived HF/6-31G* charges and this buildup strategy have been used for a long time in the AMBER force field.76 Thus, if the free energy contribution corresponding to solute polarization (∆Grelax) could be determined empirically, e.g. from the parametrization of atomic polarizabilities, one could use, for example, the AMBER library of atomic charges to calculate hydration energies of macromolecules within the framework of the LD parametrization presented in this paper. Such calculations are currently feasible only by using the ENZYMIX library of group charges and the previous LD model implemented in the POLARIS program.20
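The ranking of the three methods rests on the mean deviations listed in Table 5, defined there as the arithmetic average of the absolute differences between calculated and experimental data. The Python sketch below recomputes this quantity for the ∆∆Gsolv columns, with the values transcribed from Table 5 in row order (amines, phosphines, alcohols, oxoanions, thiols). The helper name `mean_abs_dev` is an illustrative choice.

```python
# Experimental substituent effects on solvation free energies (kcal/mol), Table 5.
expt = [8, 13, 15, 22, 6, 10, 18, 26, 12, 16, 35, 30, 28, 40, 44, 0, 2]

# Calculated values for the same 17 rows, one list per method.
calc = {
    "NLD": [7, 21, 15, 20, 5, 8, 20, 28, 19, 25, 41, 33, 35, 40, 47, 3, 6],
    "ILD": [7, 17, 14, 19, 5, 9, 18, 24, 18, 22, 35, 34, 33, 40, 44, 1, 3],
    "PCM": [11, 25, 19, 23, 8, 12, 14, 21, 16, 19, 38, 33, 32, 44, 44, 3, 6],
}


def mean_abs_dev(values, reference):
    """Arithmetic average of the absolute differences between two equal-length lists."""
    if len(values) != len(reference):
        raise ValueError("lists must have the same length")
    return sum(abs(v - r) for v, r in zip(values, reference)) / len(values)


mad = {method: round(mean_abs_dev(vals, expt), 1) for method, vals in calc.items()}
print(mad)
```

With the tabulated integers this gives 3.5 (NLD), 2.1 (ILD), and 3.6 (PCM) kcal/mol, matching the mean deviation row of Table 5.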

5. Solvation Effects on Conformational Equilibria in 1,2-Ethanediol

Rotations around single bonds constitute important degrees of freedom of biological macromolecules. The corresponding energy differences are usually small, but because of a large number of such degrees of freedom, they add up to significant driving forces influencing macromolecular secondary structures. Thus, it would be very useful if simple theoretical solvation models such as the LD model were able to predict correctly how hydration affects these conformational equilibria. To assess the performance of our LD model in evaluating conformation-related solvation energy differences, we studied the g+G+g-, tG+g-, and tTt rotamers of 1,2-ethanediol (Figure 2). We use the spectroscopic notation of the rotational isomers that denotes torsional angles near 60°, -60°, and 180° as gauche+ (g+), gauche- (g-), and trans (t), respectively. The g+G+g- and tG+g- conformers enable intramolecular hydrogen bonding. Consequently, they represent preferred conformations in the gas

TABLE 5: Comparison of Calculated and Experimental Substituent Effects on Gas Phase Basicities, Solvation Free Energies of Charged Solutes, and Corresponding pKa Differences

                   ∆∆Bg a (kcal/mol)        ∆∆Gsolv b (kcal/mol)               ∆pKa
substituents
(R)(R′)(R′′)          MP2 c   expt d      NLD      ILD      PCM     expt   calc e   expt f

Amines, N(R)(R′)(R′′)(H)+, Relative to NH4+/NH3
(-H)2(-CH3)            10.4     10.1        7        7       11        8        3      1.4
(-H)2(-Ph)              4.8      6.9       21       17       25       13      -10     -4.6
(-H)(-CH3)2            17.3     17.2       15       14       19       15        4      1.5
(-CH3)3                21.5     21.7       20       19       23       22        4      0.6

Phosphines, P(R)(R′)(R′′)(H)+, Relative to MePH3+/MePH2
(-H)(-CH3)2            14.5     12.3        5        5        8        6        7      3.9
(-CH3)3                24.2     23.0        8        9       12       10       11      8.7

Alcohols, ROH2+, Relative to H3O+/H2O
(-H)2(-CH3)            15.9     15.1       20       18       14       18        0        0
(-H)2(-C2H5)           20.2     21.1       28       24       21       26        0        0

Oxoanions, RO-, Relative to HO-/H2O
(-CH3)                 -5.2    -10.1       19       18       16       12       -7     -0.1
(-C2H5)                -9.0    -13.4       25       22       19       16       -7      0.2
(-Ph)                 -39.5    -41.8       41       35       38       35       -6     -5.7
(-CHO)                -45.9    -45.9       33       34       33       30      -11    -12.0
(-CCH3O)              -42.0    -42.6       35       33       32       28       -8    -10.9
(-CCHF2O)             -61.4    -60.5       40       40       44       40      -17    -14.4
(-CCHCl2O)            -62.1    -62.2       47       44       44       44      -16    -14.4

Thiols, RS-, Relative to HS-/H2S
(-CH3)                  4.6      5.7        3        1        3        0        5        5
(-C2H5)                 3.0      4.0        6        3        6        2        5        5

mean deviation g        1.4               3.5      2.1      3.6               2.5

a Free energy change for the gas phase proton transfer reaction AH+ + B → A + BH+ or AH + B- → A- + BH, where B or B- is the reference base.
b For absolute solvation energies and method descriptions see Table 4.
c Determined from MP2/6-31+G**//HF/6-31G* proton affinities at 298 K, using eq 22. Calculated ∆Bg values for the reference bases: NH3, 198.0; MePH2, 195.0; H2O, 157.1; OH-, 379.7; SH-, 344.5 kcal/mol.
d Experimental ∆Bg values for the reference bases: NH3, 195.6; MePH2, 196.3; H2O, 159.0; OH-, 384.1; SH-, 344.9 kcal/mol (refs 59, 62).
e MP2 + ILD method. Absolute pKa constants for acids calculated from MP2 gas phase energies and ILD solvation free energies using eq 21: NH4+, 11; MePH3+, 3; H3O+, -7; H2O, 10; H2S, 8.
f Reference 86 for phosphines, ref 85 for water and alcohols, ref 87 for others. Experimental pKa constants for reference systems: NH4+, 9.2; MePH3+, 0; H3O+, -1.5; H2O, 15.7; H2S, 7.0.
g Arithmetic average of the absolute values of the differences between calculated and experimental data.


phase. When only the electronic, vibrational, and rotational contributions to the free energies are considered, the tG+g- conformer is preferred by 0.5 kcal/mol to the g+G+g- conformer and by 2.2 kcal/mol to the tTt one.77 In addition to the tTt conformer, seven other minima that lack internal hydrogen bonds were located on the ab initio MP2/cc-pVDZ potential energy surface.77 For the investigation of all stereochemically unique conformers, which exceeds the scope of the present paper, we refer the reader to the work of Cramer and Truhlar77 and references cited therein.

The relative LD solvation free energies of the different conformers are presented in Table 6 along with the results obtained by other computational methods. These include PCM and Monte Carlo calculations of Alagona and co-workers,78,79 results obtained with AM1-SM1a, AM1-SM2, and PM3-SM3 solvation models,77 and classical molecular dynamics simulations.80 All the mentioned theoretical methods agree that both the g+G+g- and tTt conformers are stabilized upon solvation. However, the predicted magnitudes of solvation effects vary considerably among different methods. The SM2 and SM3 solvation models were claimed by Cramer and Truhlar77 to be more appropriate for the calculation of conformational equilibria in aqueous solution than the PCM model because the SM2 and SM3 models are parametrized to include first solvation shell effects. Indeed, the major part of the relative stabilization of the tTt conformer in the SM2 and SM3 calculations was found to have a nonelectrostatic origin.77 In contrast, the more negative ∆Gsolv for g+G+g- and tTt than for the tG+g- conformer originates mainly from the ∆GES term in both the iterative and noniterative LD solvation models. This electrostatic stabilization of the all-trans conformer is increased upon inclusion of the electron correlation. On the other hand, it is partly compensated by the larger (positive) hydrophobic term. The effects of repulsion-dispersion (∆GvdW), solute polarization (∆Grelax), and Born (∆GBorn) terms were found to be negligible. The above-mentioned competition of different components of solvation free energy results in slightly larger stabilization of the g+G+g- conformer than the tTt conformer. At the highest MP2-ILD level, this stabilization with respect to tG+g- amounts to 0.8 and 0.6 kcal/mol, respectively. This result is in reasonable agreement with calculations using all-atom solvent models and also with results of our PCM calculation. Interestingly, the PCM calculations of Alagona and Ghio that were carried out at the similar ab initio HF/6-31G*//HF/4-31G level resulted in notably different relative energies. Most of this difference probably originates from the differences in atomic vdW radii. Unfortunately, information about vdW radii was not included in ref 78.
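To give a feeling for what such small free energy differences mean for conformer populations, the following Python sketch converts relative free energies into Boltzmann weights at 298.15 K. It is an illustration under several assumptions that are not made in the paper: only the three benchmark conformers are included, conformer degeneracies are ignored, the 0.5 kcal/mol gas phase gap is taken to refer to g+G+g-, and the aqueous free energies are obtained by simply adding the MP2-ILD relative solvation free energies of Table 6 to the gas phase values.

```python
import math

RT = 1.987204e-3 * 298.15  # thermal energy in kcal/mol at 298.15 K

# Gas phase free energies relative to tG+g- (kcal/mol), as quoted in the text.
gas = {"tG+g-": 0.0, "g+G+g-": 0.5, "tTt": 2.2}

# MP2-ILD solvation free energies relative to tG+g- (kcal/mol), from Table 6.
ddg_solv = {"tG+g-": 0.0, "g+G+g-": -0.8, "tTt": -0.6}


def populations(rel_g):
    """Normalized Boltzmann populations for a dict of relative free energies."""
    weights = {name: math.exp(-g / RT) for name, g in rel_g.items()}
    total = sum(weights.values())
    return {name: w / total for name, w in weights.items()}


# Assumed additive estimate of the relative free energies in aqueous solution.
aqueous = {name: gas[name] + ddg_solv[name] for name in gas}

for label, rel_g in (("gas", gas), ("aqueous", aqueous)):
    pops = populations(rel_g)
    print(label, {name: round(p, 2) for name, p in pops.items()})
```

Under these assumptions the g+G+g- conformer overtakes tG+g- in water, and the tTt population grows but remains small, which shows how sub-kcal/mol solvation differences can shift a conformational equilibrium.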

6. Concluding Remarks

Approximate treatments of solute-solvent interactions provide at present the only practical way for quantitative evaluation of solvation energies. Several strategies are currently used, including continuum treatments, LD models, and all-atom QM/MM approaches. In this paper, a new parametrization of the LD model is presented, and the resulting solvation energies are compared to the results obtained by using current continuum models. The performance of both methods is comparable and depends to a large extent on the effort invested in the parametrization.81 However, since the LD model is based on an explicit representation of the solvent molecules, it provides a clearer connection to the underlying molecular picture. For example, it is not entirely clear how to represent the electronic polarizability of solvent molecules in continuum approaches. On the other hand, the dipolar models enable one to deduce the direct analogy between the behavior of the inductive components of these dipoles and the behavior of the electrons in real solvent molecules. This can be done by considering the actual solvent molecules in a quantum mechanical supermolecular treatment and then asking what is the corresponding classical dipolar analog.41 To illustrate this point, one can start with a solute molecule and a single helium atom and treat them as a supermolecular quantum mechanical system within a double CI treatment41 and then reproduce the effect of the helium atom by an induced dipole. The same can be done with many helium atoms or any other system of polarizable atoms.3 In this case, it is possible to form a clear bridge between a quantum mechanical description of the solvent molecules and our practical classical model. An identical philosophy can be used in modeling polar solvent molecules. The solvent molecules can of course be approximated by continuum approaches, and there are even powerful formulations for the treatment of the electronic polarization within the continuum theory.82 However, it is not clear yet how to relate such treatments to the quantum treatment of the solvent, since the basic model does not reflect a quantum mechanical representation of the solvent. This is perhaps the reason for controversies about the correct implementations of such models. No such problems exist in dipolar models due to their clear connection to the microscopic world. Therefore, we believe that explicit dipolar models may be useful in future attempts to improve solvent models, such as modeling charge transfer to and from the solvent or explicit hydrogen bonds. In cases like this, it is quite straightforward to study the properties of real molecular models and to construct the corresponding dipolar models. Finally, the advantage of dipolar

Figure 2.   Benchmark conformers of 1,2-ethanediol. The calculated gas phase HF/6-31G* torsional angles (deg) are given below each conformer in the order (H1O1C1C2, O1C1C2O2, C1C2O2H2). The intramolecular hydrogen bond is indicated by the dashed line.

TABLE 6: Relative a Solvation Free Energies (kcal/mol) of the g+G+g- and tTt Conformers of 1,2-Ethanediol in Aqueous Solution

                                conformer
method                      g+G+g-       tTt

NLD                           -0.7      -0.2
ILD                           -0.9      -0.5
MP2-NLD b                     -0.6      -0.3
MP2-ILD b                     -0.8      -0.6
PCM c                         -0.4      -0.5
PCM(HF/4-31G) d               -1.1      -0.2
PCM(HF/6-31G*) e              -0.7      -0.2
SM1a f                        -0.4      -0.1
SM2 f                         -0.1      -1.0
SM3 f                         -0.1      -1.2
MC (OPLS/TIP4P) g             -1.0      -1.2
MD-FEP (GROMOS/SPC) h         -0.7      -1.1

a With respect to the tG+g- conformer.
b LD solvation free energies corrected for electron correlation effects (see section 3.7).
c This work.
d Reference 78.
e HF/6-31G*//HF/4-31G PCM calculation.78
f AM1-SM1a, AM1-SM2, and PM3-SM3 relative solvation free energies, evaluated at the MP2/cc-pVDZ geometry.77
g Monte Carlo free energy perturbation method.79
h Molecular dynamics restrained dihedral sampling.80


models becomes more apparent when one moves to heterogeneous environments such as proteins, where the correct use of continuum representation involves major conceptual traps.24

Those who do not share our belief in the advantage of using dipolar models, which is in part a matter of taste, can simply consider our LD model as a tool in the arsenal of modern computational chemistry that augments the more widespread continuum dielectric and all-atom solvation models. The current version of our LD program83 can be used along with any standard ab initio program and can be obtained upon request from the authors.

Acknowledgment.  We thank Dr. Arno Papazyan for suggesting the Clausius-Mossotti scaling and for valuable discussions. This work was supported by the NIH Grant GM24492 and by the Tobacco Grant 4RT-0002.

References and Notes

(1) Warshel, A. Computer Modeling of Chemical Reactions in Enzymes and Solutions; John Wiley & Sons: New York, 1991.

(2) (a) Cramer, C. J.; Truhlar, D. G. Structure and Reactivity in Aqueous Solution; American Chemical Society: Washington, 1994; Vol. 568. (b) Tomasi, J.; Mennucci, B.; Cammi, R.; Cossi, M. In Computational Approaches to Biochemical Reactivity; Naray-Szabo, G., Warshel, A., Eds.; Kluwer Academic Publishers: Dordrecht, 1997; pp 1-102.

(3) Warshel, A.; Levitt, M. J. Mol. Biol. 1976, 103, 227.

(4) Warshel, A. J. Phys. Chem. 1979, 83, 1640.

(5) Russell, S. T.; Warshel, A. J. Mol. Biol. 1985, 185, 389.

(6) Warshel, A.; Russell, S. T. Q. Rev. Biol. 1984, 17, 283.

(7) Rinaldi, D.; Rivail, J.-L. Theor. Chim. Acta 1973, 32, 57.

(8) Tapia, O.; Goscinski, O. Mol. Phys. 1975, 29, 1653.

(9) Rivail, J.-L.; Rinaldi, D. Chem. Phys. 1976, 18, 223.

(10) McCreery, J. H. C., R. E.; Hall, G. G. J. Am. Chem. Soc. 1976, 98, 7191.

(11) Miertus, S.; Scrocco, E.; Tomasi, J. Chem. Phys. 1981, 55, 117.

(12) Miertus, S.; Tomasi, J. Chem. Phys. 1982, 65, 239.

(13) Rashin, A. A. J. Phys. Chem. 1990, 94, 1725.

(14) Sharp, K. A.; Honig, B. J. Phys. Chem. 1990, 94, 7684.

(15) Cramer, C. J.; Truhlar, D. G. J. Am. Chem. Soc. 1991, 113, 8305.

(16) Cramer, C. J.; Truhlar, D. G. J. Comput.-Aided Mol. Des. 1992, 6, 629.

(17) Dillet, V.; Rinaldi, D.; Rivail, J. L. J. Phys. Chem. 1994, 98, 5034.

(18) Kollman, P. Chem. Rev. 1993, 93, 2395.

(19) Warshel, A.; Åqvist, J. Annu. Rev. Biophys. Biophys. Chem. 1991, 20, 267.

(20) Lee, F. S.; Chu, Z. T.; Warshel, A. J. Comput. Chem. 1993, 14, 161.

(21) Alden, R. G.; Parson, W. W.; Chu, Z. T.; Warshel, A. J. Am. Chem. Soc. 1995, 117, 12284.

(22) Stephens, P. J.; Jollie, D. R.; Warshel, A. Chem. Rev. 1996, 96, 2491.

(23) Florian, J.; Baumruk, V.; Leszczynski, J. J. Phys. Chem. 1996, 100, 5578.

(24) Sham, Y. Y.; Chu, Z. T.; Warshel, A. J. Phys. Chem. B 1997, 101, 4458.

(25) King, G.; Warshel, A. J. Chem. Phys. 1989, 91, 3647.

(26) Papazyan, A.; Warshel, A. In preparation.

(27) Coalson, R. D.; Duncan, A. J. Phys. Chem. 1996, 100, 2612.

(28) Rogers, N. K. Prog. Biophys. Mol. Biol. 1986, 48, 37.

(29) It is important to point out that past criticisms of the LD model were either irrelevant or incorrect. This point was partially clarified recently.24 However, in the interest of those who are unfamiliar with the model and find it computationally appealing, we will further clarify this issue by considering the specific criticism of Rogers28 point by point. (i) It was argued28 that the LD model has only permanent dipoles, while it is well-known that electronic polarizability is very important for a proper dielectric constant. However, many LD versions include explicitly induced dipoles (e.g. ref 4). Also, the effect of induced dipoles is rather unimportant when one deals with ground state solvation properties and adjusts the permanent dipoles to account for the missing electronic polarizability. Moreover, as far as the dielectric constant (ε) is concerned, it is well-known that the proper ε can be obtained with a properly adjusted permanent dipole without any induced dipole. (ii) Rogers pointed out the obvious fact that the cubic grid of the LD model does not have the correct structure of real water, which he assumed to be very important for obtaining the dielectric constant of water. However, the LD and other simple dipolar models (or their simplification, the continuum models) have no pretense of reproducing the details of the solvent structure, and their use is justified by virtue of capturing solvation effects. (iii) It was pointed out that the reaction field from the bulk is very important and that it is neglected in the LD model. This criticism is rather puzzling since the LD model does include the reaction field effect on solvation energy (see e.g. ref 6), and a simple version of this correction is the bulk correction presented in section 3.4. Finally, it was implied that the model is problematic because it was criticized by Krishtalik and Topolev88 in particular with respect to ion pairs. The problem with such a second-hand assessment is that the original criticism was baseless on every single point (see a discussion of some of the points in footnote 56 of ref 89). The specific criticism of the energetics of ion pairs was based on the incorrect assumption that |∆Gsolv| of ion pairs in solution should stay constant with distance rather than increase with distance. (This obvious point is demonstrated for example in Figure 2 of ref 4.)

(30) Luzhkov, V.; Warshel, A. J. Comput. Chem. 1992, 13, 199.

(31) Warshel, A.; Chu, Z. T. In ACS Symposium Series: Structure and Reactivity in Aqueous Solution. Characterization of Chemical and Biological Systems; Cramer, D. G., Ed.; American Chemical Society: Washington, DC, 1994.

(32) Sitkoff, D.; Sharp, A. A.; Honig, B. J. Phys. Chem. 1994, 98, 1978.

(33) Malcolm, N. O. J.; McDouall, J. J. W. J. Mol. Struct. (THEOCHEM) 1996, 366, 1.

(34) Bondi, A. J. Phys. Chem. 1964, 68, 441.

(35) Frisch, M. J.; Trucks, G. W.; Schlegel, H. B.; Gill, P. M. W.; Johnson, B. G.; Robb, M. A.; Cheeseman, J. R.; Keith, T.; Petersson, G. A.; Montgomery, J. A.; Raghavachari, K.; Al-Laham, M. A.; Zakrzewski, V. G.; Ortiz, J. V.; Foresman, J. B.; Cioslowski, J.; Stefanov, B. B.; Nanayakkara, A.; Challacombe, M.; Peng, C. Y.; Ayala, P. Y.; Chen, W.; Wong, M. W.; Andres, J. L.; Replogle, E. S.; Gomperts, R.; Martin, R. L.; Fox, D. J.; Binkley, J. S.; Defrees, D. J.; Baker, J.; Stewart, J. P.; Head-Gordon, M.; Gonzalez, C.; Pople, J. A. Gaussian 94, Revision D.2; Gaussian, Inc.: Pittsburgh, PA, 1995.

(36) Chen, J. L.; Noodleman, L.; Case, D. A.; Bashford, D. J. Phys. Chem. 1994, 98, 11059.

(37) Levy, R. M.; Belhadj, M.; Kitchen, D. B. J. Chem. Phys. 1991, 95, 3627.

(38) Lee, F. S.; Chu, Z. T.; Bolger, M. B.; Warshel, A. Protein Eng. 1992, 5, 215.

(39) Åqvist, J.; Hansson, T. J. Phys. Chem. 1996, 100, 9512.

(40) Lee, F. S.; Warshel, A. J. Chem. Phys. 1992, 97, 3100.

(41) Luzhkov, V.; Warshel, A. J. Am. Chem. Soc. 1991, 113, 4491.

(42) In a recent study90 it was argued that the LD approach of ref 30 and, in fact, ref 4 is inconsistent in its implementation with QM calculations since it uses point charges for the solute and that this leads to problems in MD simulations (since the force is presumably not consistent with the energy). These assertions involve several misunderstandings, which are clarified below: (i) The LD method has been developed and implemented within the Löwdin orthogonalized orbitals formulation (where the overlap is treated implicitly) for the solute-solvent coupling, and this approximation is fully consistently and rigorously obtained using the solute point charges (see e.g. ref 30). (ii) Dealing with quantum/classical models, one has to define the interaction potential semiempirically, and thus there is no "correct" or "incorrect" approximation. That is, already taking a coupling of 1/R rather than an electron repulsion integral is an approximation or a model. The LD model defines fully and consistently a solute-solvent electrostatic coupling. Obviously this coupling is not identical to the coupling obtained using ab initio wave functions or electrostatic potential, but it does not involve any inconsistency and its first derivatives are the exact derivatives of the assumed potential. In fact, referring to rigorous QM coupling when the solvent is treated classically as dipoles or as a continuum is an oxymoron. (iii) The LD is not an MD model, and it does not involve any analytical gradients. Consistent energy minimization with semiempirical QM forces and induced dipoles on the solvent (or protein) has been implemented by us as early as 1976.3 Again, using solute point charges within semiempirical models provides a completely consistent and valid model for MD simulations with semiempirical models. This is particularly true when one uses approaches such as the QCFF/PI model in MD studies (see e.g. ref 41). (iv) Finally, as far as moving to more rigorous solute-solvent coupling on the ab initio level is concerned, obviously the LD model does not represent the solvent quantum mechanically and the same is true for continuum models. However, we have already introduced approaches that treat the solvent quantum mechanically.91,92

(43) Muller, R. P.; Warshel, A. J. Phys. Chem. 1995, 99, 17516.

(44) Stefanovich, E. V.; Truong, T. N. Chem. Phys. Lett. 1995, 244, 65.

(45) Ford, G. P.; Wang, B. J. Am. Chem. Soc. 1992, 114, 10563.

(46) Andzelm, J.; Klamt, A. J. Chem. Phys. 1995, 103, 9312.

(47) Truong, T. N.; Stefanovich, E. V. Chem. Phys. Lett. 1995, 240, 253.

(48) Tawa, G. J.; Martin, R. L.; Pratt, L. R.; Russo, T. V. J. Phys. Chem. 1996, 100, 1515.

(49) (a) Cossi, M.; Barone, V.; Cammi, R.; Tomasi, J. Chem. Phys. Lett. 1996, 255, 327. (b) York, D. M.; Lee, T.-S.; Yang, W. Chem. Phys. Lett. 1996, 263, 297.

(50) Marten, B.; Kim, K.; Cortis, C.; Friesner, R. A.; Murphy, R. B.; Ringnalda, M. N.; Sitkoff, D.; Honig, B. J. Phys. Chem. 1996, 100, 11775.


(51) Tunon, I.; Ruiz-Lopez, M. F.; Rinaldi, D.; Bertran, J. J. Comput. Chem. 1996, 17, 148.

(52) Rashin, A. A.; Young, L.; Topol, I. A. Biophys. Chem. 1994, 51, 359.

(53) Bachs, M.; Luque, F. J.; Orozco, M. J. Comput. Chem. 1994, 15, 446.

(54) Ben-Naim, A. J. Phys. Chem. 1978, 82, 792.

(55) Cabani, S.; Gianni, P.; Mollica, V.; Lepori, L. J. Solution Chem. 1981, 10, 563.

(56) Ben-Naim, A.; Marcus, Y. J. Chem. Phys. 1984, 81, 2016.

(57) Wolfenden, R. Science 1983, 222, 1087.

(58) Pearson, R. G. J. Am. Chem. Soc. 1986, 108, 6109.

(59) Lias, S. G.; Bartmess, J. E.; Liebman, J. F.; Holmes, J. L.; Levin, R. D.; Mallard, W. G. J. Phys. Chem. Ref. Data 1988, 17, Suppl. 1.

(60) Farrell, J. F.; McTigue, P. J. Electroanal. Chem. 1982, 139, 37.

(61) Reiss, H.; Heller, A. J. Phys. Chem. 1985, 89, 4207.

(62) Lias, S. G.; Liebman, J. F.; Levin, R. D. J. Phys. Chem. Ref. Data 1984, 13, 695.

(63) Fuchs, R.; Young, T. M.; Rodewald, R. F. J. Am. Chem. Soc. 1974, 96, 4705.

(64) Streitwieser, A., Jr.; Nebenzahl, L. L. J. Am. Chem. Soc. 1976, 98, 2188.

(65) Florian, J.; Warshel, A. J. Am. Chem. Soc. 1997, 119, 5473.

(66) Lim, C.; Bashford, D.; Karplus, M. J. Phys. Chem. 1991, 95, 5610.

(67) Wolfenden, R. Biochemistry 1978, 17, 201.

(68) Orozco, M.; Jorgensen, W. L.; Luque, F. J. J. Comput. Chem. 1993, 14, 1498.

(69) Tannor, D. J.; Marten, B.; Murphy, R.; Friesner, R. A.; Nicholls, A.; Honig, B. J. Am. Chem. Soc. 1994, 116, 11875.

(70) Morgantini, P. Y.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 6057.

(71) Meng, E. C.; Caldwell, J. W.; Kollman, P. A. J. Phys. Chem. 1996, 100, 2367.

(72) Rick, S. W.; Berne, B. J. J. Am. Chem. Soc. 1996, 118, 672.

(73) Cramer, C. J.; Truhlar, D. G. Chem. Phys. Lett. 1992, 198, 74.

(74) Miller, J. L.; Kollman, P. A. J. Phys. Chem. 1996, 100, 8587.

(75) Colominas, C.; Luque, F. J.; Orozco, M. J. Am. Chem. Soc. 1996, 118, 6811.

(76) Cornell, W. D.; Cieplak, P.; Bayly, C. I.; Gould, I. R.; Merz, K. M.; Ferguson, D. M.; Spellmeyer, D. C.; Fox, T.; Caldwell, J. W.; Kollman, P. A. J. Am. Chem. Soc. 1995, 117, 5179.

(77) Cramer, C. J.; Truhlar, D. G. J. Am. Chem. Soc. 1994, 116, 3892.

(78) Alagona, G.; Ghio, C. J. Mol. Struct. (THEOCHEM) 1992, 254, 287.

(79) Nagy, P. I.; Dunn, W. J., III; Alagona, G.; Ghio, C. J. Am. Chem. Soc. 1991, 113, 6719.

(80) Hooft, R. W. W.; van Eijck, B. P.; Kroon, J. J. Chem. Phys. 1992, 97, 3639.

(81) It had been sometimes assumed that continuum models are more general than the LD model since they involve fewer parameters. Of course, early continuum models used only two parameters: the dielectric constant and the cavity size. Therefore, they were nonquantitative. Eventually, it was realized that it is essential to have different parameters (Born or van der Waals radii) for different atom types. This is given by the physics of solvation and does not need any justification now. The fact that continuum models contain the experimental dielectric constant (ε) does not make them more (or less) general since the interest here is in solvation energy that mainly depends on the representation of the solute/solvent boundaries whose microscopic origin is not related to ε.

(82) McRae, E. G. J. Phys. Chem. 1957, 61, 562.

(83) (a) Florian, J.; Warshel, A. ChemSol, Version 1.0; University of Southern California: Los Angeles, 1997. (b) Program ChemSol can be downloaded from the anonymous ftp server usc.edu, directory /pub/warshel/cs.

(84) Wolfenden, R. Biochemistry 1981, 20, 849.

(85) Lowry, T. H.; Richardson, K. S. Mechanism and Theory in Organic Chemistry; HarperCollins Publishers: New York, 1987.

(86) Kirby, J. A.; Warren, S. G. The Organic Chemistry of Phosphorus; Elsevier: Amsterdam, 1967.

(87) Dean, J. A. Lange's Handbook of Chemistry; McGraw-Hill, Inc.: New York, 1992.

(88) Krishtalik, V. I.; Topolev, V. V. Mol. Biol. 1984, 18, 892.

(89) Yadav, A.; Jackson, R. M.; Holbrook, J. J.; Warshel, A. J. Am. Chem. Soc. 1991, 113, 4800.

(90) Thompson, M. A. J. Phys. Chem. 1996, 14492.

(91) Wesolowski, T.; Warshel, A. J. Phys. Chem. 1994, 98, 5183.

(92) Wesolowski, T.; Muller, R. P.; Warshel, A. J. Phys. Chem. 1996, 100, 15444.
