Citation: J. Chem. Phys. 79, 3938 (1983); doi: 10.1063/1.446262. Published by the American Institute of Physics.
Downloaded 12 Sep 2012 to 128.143.23.241. Redistribution subject to AIP license or copyright; see http://jcp.aip.org/about/rights_and_permissions
A simple new way to help speed up Monte Carlo convergence rates: Energy-scaled displacement Monte Carlo
Saul Goldman
The Guelph-Waterloo Centre for Graduate Work in Chemistry, Guelph Campus, Guelph, Ontario, Canada N1G 2W1
(Received 4 March 1983; accepted 15 July 1983)
A method we call energy-scaled displacement Monte Carlo (ESDMC), whose purpose is to improve sampling efficiency and thereby speed up convergence rates in Monte Carlo calculations, is presented. The method involves scaling the maximum displacement a particle may make on a trial move to the particle's configurational energy. The scaling is such that, on average, the most stable particles make the smallest moves and the most energetic particles the largest moves. The method is compared to Metropolis Monte Carlo (MMC) and force-bias Monte Carlo (FBMC) by applying all three methods to a dense Lennard-Jones fluid at two temperatures, and to hot ST2 water. The functions monitored as the Markov chains developed were, for the Lennard-Jones case: melting, radial distribution functions, internal energies, and heat capacities. For hot ST2 water, we monitored energies and heat capacities. The results suggest that ESDMC samples configuration space more efficiently than either MMC or FBMC in these systems for the biasing parameters used here. The benefit from using ESDMC seemed greatest for the Lennard-Jones systems.
I. INTRODUCTION
Within the last few years, a variety of methods have appeared whose purpose is to increase the efficiency of Monte Carlo simulations relative to what is obtainable with the original method of Metropolis et al.¹ These include umbrella sampling,²⁻⁴ preferential particle selection,⁵ Smart Monte Carlo,⁶ and force-biased⁷⁻⁹ Monte Carlo. Also, combinations of some of these methods have been used.¹⁰ Because the effectiveness of these methods is problem and system dependent, and because Monte Carlo calculations are never cheap (for many problems they are still impossibly long), there remains a strong continuing need to develop new ways to improve the method. The purpose of this article is to outline a new method designed to do that. We call the method "energy-scaled displacement Monte Carlo," or ESDMC for short, for reasons discussed below.
II. THEORY
It has been known since the original paper of Metropolis et al.¹ that the maximum displacement allowed in the trial move has a bearing on the convergence rate of the simulation. If it is too large, too small a fraction of the trial moves is accepted, which leads to computational inefficiency; if it is too small, most moves are accepted, but the configurations do not change enough over the course of the Monte Carlo walk for an efficient sampling of configuration space. So some intermediate maximum allowed displacement has to be used.
Metropolis et al. studied a system of two-dimensional hard spheres, in which all the particles have the same configurational energy, zero. For this system, it is entirely reasonable to treat all the particles democratically, by allowing them all the same maximum box size. However, any system for which the pair potential varies more gradually with the configurational variables will have its particles distributed over a range of energies, so that here it may prove wasteful to impose on all the particles the same maximum move size. More specifically, one of the systems considered in this study was ST2 water¹¹ at 600 K and 1.00 g/cm³. The total potential energy per particle was of course available and was periodically printed out for all the particles. For this system, in which the ensemble average energy per particle (reduced with respect to $kT$) was $-5.9$, the range of reduced energies spanned by the 216 particles was typically $\approx -14$ to $\approx 0$ [see also Eqs. (4) to (6) and Table I]. In view of this it is unreasonable to impose on all the particles the same maximum move size. Doing this would have the effect of rejecting low energy particle moves too often, and constraining the high energy particles to maximum displacements that are unnecessarily small. These objections are similar in spirit to those against the use of box sizes too large or too small on systems whose particles all have the same energy. Thus, the underlying idea behind this work is,
TABLE I. Relative reduced dispersions for Lennard-Jones and ST2 fluids.

System          ρ*ᵃ       T (K)     −β⟨e⟩ᵇ      C_v^c/Rᶜ    RRDᵈ
Lennard-Jones   0.85      86.136    8.25        6.4         0.31
Lennard-Jones   0.85      179.70    3.42        1.5         0.36
ST2 water       0.99678   283       18.9ᵉ,ᶠ     12.1ᵍ       0.18
ST2 water       0.99678   600       5.89        7.1         0.45

ᵃ $\rho^* = \rho\sigma^3$, where $\rho$ is the particle number density and $\sigma$ is the Lennard-Jones distance parameter.
ᵇ $e$ is here the mean configurational energy per particle with no tail correction [see Eq. (4)]; $\langle\ \rangle$ means ensemble average.
ᶜ $C_v^c$ is the configurational constant-volume molar heat capacity.
ᵈ RRD means relative reduced dispersion. It was obtained by Eq. (6) and the entries in this table.
ᵉ Reference 8. ᶠ Reference 17. ᵍ Reference 11.
J. Chem. Phys. 79(8), 15 Oct. 1983 0021-9606/83/203938-10$02.10 © 1983 American Institute of Physics
by exploiting the dispersion of energy among the particles, to improve Monte Carlo convergence rates by allowing a range of maximum move sizes. The lowest energy particles are to be constrained to the smallest maximum move sizes, and the highest energy particles allowed the largest move sizes. This will have the effect of scaling the average displacement a particle executes to its energy. Small trial moves by low energy particles should improve the sampling efficiency within the important low potential energy regions. Large moves by high energy particles should help ensure substantial configurational changes when these moves are accepted. These large moves, therefore, also serve the purpose of disrupting cages, thereby permitting low energy particles to periodically escape. Of course it is undesirable for the biasing in ESDMC to be made too strong; then low energy particles will remain trapped in their cages for long periods of time, and high energy particle moves will only rarely be accepted.
This idea is very easy to implement. We want to scale the average displacement that any particle executes to the particle's potential energy, so that the most stable particles make the smallest moves, and the least stable particles, the largest moves on average. We have done this by scaling the maximum move a particle is allowed to make to its potential energy, through the equation:
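$$\Delta_i = \Delta_0 \exp\left[A\beta\left(e_i - \langle e\rangle\right)\right] \tag{1}$$
with
$$e_i = \frac{1}{2}\sum_{j\neq i} u_{ij}; \qquad \beta = (kT)^{-1}.$$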
In Eq. (1), $u_{ij}$ is the pair potential, $\langle e\rangle$ is the running ensemble average of $(U^c/N)$, where $U^c$ and $N$ are, respectively, the total configurational energy and the number of particles. $\Delta_0$ and $A$ are adjustable constants, and $\Delta_i$ is the maximum displacement particle "i" is allowed on any particular trial move. For isotropic particles, $\Delta_i$ is the maximum translational move allowed in each of the $x$, $y$, and $z$ coordinates ($2\Delta_i$ is the length of the cube's edge). For anisotropic particles we used Eq. (1) for the translational move, and the analogous equations:
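$$\Delta\alpha_i = \Delta\gamma_i = \Delta\omega_0 \exp\left[A\beta\left(e_i - \langle e\rangle\right)\right] \tag{2}$$
and
$$\Delta\cos\beta_i = \Delta\cos\beta_0 \exp\left[A\beta\left(e_i - \langle e\rangle\right)\right] \tag{3}$$
for the maximum change allowed in the trial move, for the space-fixed frame Euler angles $\alpha_i$, $\beta_i$, and $\gamma_i$ of particle "i" ($2\Delta\alpha_i$, $2\Delta\gamma_i$, and $2\Delta\cos\beta_i$ are the limits in any one move on the changes in $\alpha_i$, $\gamma_i$, and $\cos\beta_i$). $\Delta\omega_0$ and $\Delta\cos\beta_0$ are constants. In this preliminary study the value of $A$ used in Eqs. (2) and (3) was taken to be the same as that used in Eq. (1).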
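As a rough illustration of the scaling described above, the following Python sketch computes an energy-scaled maximum displacement. It assumes the exponential form $\Delta_i = \Delta_0 \exp[A\beta(e_i - \langle e\rangle)]$ implied by the surrounding text. The function name `max_displacement`, its arguments, and the cap on the scale factor (read here as $e^2$, the limit mentioned later for the production runs) are illustrative assumptions, not the author's program.

```python
import math

def max_displacement(beta_e_i, beta_e_mean, delta0, A, cap_factor=math.e ** 2):
    """Energy-scaled maximum trial displacement for one particle.

    beta_e_i    : reduced energy of the particle, beta * e_i (hypothetical input)
    beta_e_mean : running ensemble average, beta * <e>
    delta0      : maximum move for a particle whose energy equals the mean
    A           : biasing strength; A = 0 gives every particle the same move size
    cap_factor  : assumed upper limit on the scale factor (read as e**2)
    """
    # Clamp the exponent so the scale factor never exceeds cap_factor.
    exponent = min(A * (beta_e_i - beta_e_mean), math.log(cap_factor))
    return delta0 * math.exp(exponent)

# Reduced energies spanning roughly -14 to 0 around a mean of -5.9,
# the range quoted in the text for hot ST2 water.
for beta_e in (-14.0, -5.9, 0.0):
    print(beta_e, max_displacement(beta_e, -5.9, delta0=0.075, A=0.5))
```

With these inputs the most stable particle receives a box far smaller than $\Delta_0$, the particle at the mean energy receives exactly $\Delta_0$, and the least stable particle is limited by the assumed cap.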
Several features of ESDMC are now more or less apparent. First, with $A = 0$ we recover Metropolis Monte Carlo (MMC). Second is the obviously desirable feature that there is essentially no extra computation required for ESDMC relative to MMC, because the particle potentials are always known in a simulation and because the time required to work out the exponential in Eqs. (1) to (3) is completely negligible relative to the other calculations required in the trial step. Third is that even for reasonable choices of $\Delta_0$, $\Delta\omega_0$, and $\Delta\cos\beta_0$, some care is needed in selecting a value of $A$. If $A$ is too small, we will not have changed the walk significantly relative to MMC; if it is too large, the moves attempted by the higher energy particles will almost never be accepted, and those attempted by the lower energy particles, while largely accepted, will mostly be too small to alter the configurations generated significantly. Our results will demonstrate, however, that there exist intermediate values of $A$, for physically interesting systems, that significantly improve convergence rates.
Last, we point out that ESDMC requires no modification of the original Metropolis acceptance-rejection procedure to ensure sampling on a Boltzmann distribution. This follows from the fact that we do not bias the direction of our trial moves in ESDMC. That is, our trial moves are made at random, relative to the force and torque vectors on the molecule. Consequently, trial moves are as likely to be to higher as to lower energy states, so that the move size for the reverse step will, on average, be the same as that for the forward step. Or, in the terminology of Metropolis et al., the a priori transition probability (which is proportional to the reciprocal of the number of states) will be, for the forward trial move, the same, on average, as for the reverse move. Consequently, no correction to the Metropolis gate arises.
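The single-particle trial step described in this section can be sketched as follows for a Lennard-Jones system in reduced units. This is a minimal illustration under stated assumptions, not the author's code: the names `particle_energy` and `esdmc_trial_move` are hypothetical, the move size is assumed to follow $\Delta_i = \Delta_0 \exp[A\beta(e_i - \langle e\rangle)]$ with the scale factor capped at $e^2$, and the acceptance test is the unmodified Metropolis test, as the text states.

```python
import math
import random

def particle_energy(pos, i, box, rc):
    """Sum over j != i of the Lennard-Jones pair energy u_ij (reduced units)."""
    total = 0.0
    for j, pj in enumerate(pos):
        if j == i:
            continue
        r2 = 0.0
        for a in range(3):
            d = pj[a] - pos[i][a]
            d -= box * round(d / box)          # minimum-image convention
            r2 += d * d
        if r2 < rc * rc:                       # spherical cutoff
            inv6 = 1.0 / r2 ** 3
            total += 4.0 * (inv6 * inv6 - inv6)
    return total

def esdmc_trial_move(pos, i, beta, e_mean, delta0, A, box, rc, rng=random):
    """Attempt one energy-scaled move of particle i; return True if accepted."""
    u_old = particle_energy(pos, i, box, rc)
    e_i = 0.5 * u_old                          # e_i = (1/2) * sum_j u_ij
    # Assumed form of the scaling; exponent capped at 2, i.e. delta0 * e**2.
    delta_i = delta0 * math.exp(min(A * beta * (e_i - e_mean), 2.0))
    old = pos[i]
    # Uniform random point in a cube of edge 2 * delta_i centred on the particle.
    pos[i] = [(x + rng.uniform(-delta_i, delta_i)) % box for x in old]
    du = particle_energy(pos, i, box, rc) - u_old
    # Unmodified Metropolis test.
    if du <= 0.0 or rng.random() < math.exp(-beta * du):
        return True
    pos[i] = old                               # rejected: restore old position
    return False
```

Setting `A = 0` in this sketch gives every particle the box half-width `delta0`, which is the MMC limit noted above.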
III. SYSTEM SELECTION
Since ESDMC scales the range of a trial move to a particle's relative configurational energy, it will be useless for hard spheres, and of little or no value for systems with very narrow energy dispersions. It should be useful for systems with significant energy dispersions and its usefulness should increase as this dispersion increases. The shape and skewness of the distribution will also have some bearing, although we have not considered this here. An index of the extent of energy dispersion among the particles in a system at equilibrium can be constructed from the configurational constant volume molar heat capacity, $C_v^c$. From the equation⁶:
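$$\frac{C_v^c}{R} = N\beta^2\langle\delta e^2\rangle, \tag{4}$$
where, as before, $e = (U^c/N)$, $N$ = Avogadro's number, $\delta e = (e - \langle e\rangle)$, and the statistical result¹²:
$$\langle\delta e^2\rangle = \frac{\langle\delta e_i^2\rangle}{N}, \tag{5}$$
where
$$\delta e_i = e_i - \langle e\rangle,$$
we get:
$$\mathrm{RRD} \equiv \frac{\langle\delta e_i^2\rangle^{1/2}}{|\langle e\rangle|} = \frac{1}{\beta|\langle e\rangle|}\left(\frac{C_v^c}{R}\right)^{1/2}. \tag{6}$$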
In Eq. (6) RRD means relative reduced dispersion, and $\langle\delta e_i^2\rangle$ is the ensemble average of the square of the configurational energy difference of any particle from the mean. Equation (6) suggests that for a particular system, at a given density, the usefulness of ESDMC will generally increase with temperature since the product $T (C_v^c)^{1/2}$ generally increases with $T$. In Table I we list the values of RRD for Lennard-Jones and ST2 systems. The entries are in line with the expected temperature dependence of the method.
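The RRD entries of Table I can be checked against Eq. (6) with a few lines of Python. The sketch below reads Eq. (6) as RRD $= (C_v^c/R)^{1/2} / (\beta|\langle e\rangle|)$; the function name `rrd` is illustrative, and the inputs are the legible Table I entries.

```python
import math

def rrd(neg_beta_e, cv_over_R):
    """Relative reduced dispersion from -beta<e> and C_v^c / R, per Eq. (6)."""
    return math.sqrt(cv_over_R) / neg_beta_e

# (system, -beta<e>, C_v^c/R) taken from Table I
rows = [
    ("Lennard-Jones, 179.70 K", 3.42, 1.5),
    ("ST2 water, 283 K", 18.9, 12.1),
    ("ST2 water, 600 K", 5.89, 7.1),
]
for name, neg_beta_e, cv in rows:
    print(name, round(rrd(neg_beta_e, cv), 2))
```

Rounded to two decimals, these reproduce the tabulated RRD values of 0.36, 0.18, and 0.45.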
The systems selected for this study were the Lennard-Jones liquid at the state conditions shown in Table I, and ST2 water at 600 K, $\rho^* = 0.99678$. The Lennard-Jones systems were selected for the usual kinds of reasons: the isotropic potential, the fact that it has been studied by others,¹³,¹⁴ and because the RRD values for this system are sizeable. ST2 water was studied only at this elevated temperature, because at ordinary temperatures convergence rates are too slow to make the kind of comparative study we planned practicable, and because at ordinary temperatures the RRD values of ST2 water are inauspiciously small (Table I).
IV. COMPUTATIONAL DETAILS
A. Acceptance rates and parameter assignments
We are chiefly interested in finding out how various values of $A$ in Eqs. (1) to (3) affect the convergence rate of ESDMC relative to MMC and FBMC. But before this can be looked at, we have to assign values to $\Delta_0$, $\Delta\omega_0$, and $\Delta\cos\beta_0$ in these equations. These parameters represent the maximum move size for particles whose energy equals the ensemble average energy. In previous studies, in which all the particles were allowed the same maximum move size, Berne et al.⁸,⁹ advocated that the maximum move size be selected so as to maximize the mean-squared displacement of the particles over some finite portion of the walk. This criterion was not adopted here because of our skepticism over the proposition that maximizing the mean-squared displacement in an absolute coordinate frame should increase the efficiency with which configuration space is sampled, the latter depending entirely on the relative configurational coordinates of particles. This skepticism led us to test the above proposition by running standard Metropolis Monte Carlo on 108 Lennard-Jones particles at $\rho^* = 0.85$, $T^* = 0.719$, with an fcc solid as the starting configuration, for a range of values of the maximum move size $\Delta$. The results are displayed in Fig. 1, from which it is seen that the value of $\Delta$ that brings the energy up the fastest over the first $10^4$ or $10^5$ trial moves is $\approx 0.10$, but the value that maximizes $\langle r^2\rangle$ is $\approx 0.15$. In terms of mean acceptance rates, Fig. 1 shows that while the fairly low acceptance rate of $\approx 0.30$ does indeed maximize the mean-squared displacement, an acceptance rate of $\approx 0.42$ moves the ensemble energy toward the equilibrium value the quickest. [For 108 particles with no tail correction, the asymptotic value of $\langle e\rangle/kT$ is here $\approx -8.0$.] While it may well be true that there are systems, or perhaps other convergence criteria, for which maximizing the mean-squared displacement also optimizes the efficiency
[Figure 1. Upper panel: $-\langle e\rangle/kT$ versus $\Delta$. Lower panel: $\langle r^2\rangle \times 10^2$ versus $\Delta$. Both panels span $\Delta = 0.05$ to $0.25$.]
FIG. 1. The results of a series of simulations for five different maximum move lengths, $\Delta$, on a 108-particle Lennard-Jones system at $T^* = 0.719$, $\rho^* = 0.85$. The starting configuration in every case was the fcc lattice. $e$ is defined with Eq. (4), and $r^2$ is the square of the move size executed by each particle. Distances are in units of $\sigma$. The acceptance rates (AR) for several values of $\Delta$ are indicated with arrows. (○) and (●) designate runs of $10^4$ and $10^5$ trial moves, respectively. The ensemble averages for both upper and lower figures are running averages over all values, including the initial ones in the run. Therefore, the maxima in the upper figure at $\Delta = 0.10$, which represent cumulative averages closest to the asymptotic value of $\approx -8.0$, are, on the basis of this criterion, best.
of the Monte Carlo walk, the results in Fig. 1 demonstrate that this optimizing prescription is not general. The decay of an energy-energy correlation function, which would involve relative rather than absolute displacements, would be more reasonable, but perhaps too expensive to be worthwhile.⁸
Largely because of the results of this calculation, we selected the values of $\Delta_0$, $\Delta\omega_0$, and $\Delta\cos\beta_0$ so that the average acceptance rate, after the initial transients died off, was around 0.42. However, changes in the acceptance rates as a run proceeded made it hard to guess values that would result in an acceptance rate of exactly 0.42. In practice, the equilibrium acceptance rates were in the range $0.42 \pm 0.08$.
It turns out that the average acceptance rate is only slightly affected by the value of the parameter $A$. This is because as $A$ increases, the acceptance rate for trial moves of particles with energies less than the mean increases (because of the smaller displacements here) while that for particles with energies greater than the mean decreases (because of the larger displacements here) so that these two effects largely nullify one another with respect to the overall acceptance rate. This effect, together with the values selected for $\Delta_0$, $\Delta\omega_0$, $\Delta\cos\beta_0$, and $A$, are shown in Table II.
We had no theoretical basis for the values of A that were tried in this work. The only guidelines used were
TABLE II. Values of parameters used to set maximum move sizes and the corresponding mean acceptance rates.

                                  Parametersᶜ                        Acceptance rates
Methodᵃ         System,ᵇ T (K)    Δ₀      Δω₀    Δcos β₀   A         Overallᵈ   Low energyᵉ   High energyᶠ
MMC and ESDMC   LJ, 86.136        0.10                     0         0.46       0.46          0.46
                                  0.10                     0.35      0.44       0.53          0.32
                                  0.10                     0.50      0.45       0.58          0.28
                                  0.10                     1.0       0.49       0.72          0.16
MMC and ESDMC   LJ, 179.7         0.12                     0         0.43       0.46          0.39
                                  0.12                     0.25      0.45       0.52          0.34
                                  0.12                     0.35      0.45       0.54          0.32
MMC and ESDMC   ST2, 600          0.075   0.25   0.075     0         0.49       0.45          0.53
                                  0.075   0.25   0.075     0.40      0.49       0.66          0.31
                                  0.075   0.25   0.075     0.60      0.49       0.74          0.22
                                  0.075   0.25   0.075     0.75      0.49       0.78          0.17

                                  Parametersᶜ                        Acceptance rates
Methodᵃ         System,ᵇ T (K)    Δ₀      Δα₀              λ         Overallᵈ   Low energyᵉ   High energyᶠ
FBMC            LJ, 86.136        0.15                     1.0       0.34       0.37          0.31
FBMC            LJ, 179.7         0.17                     1.0       0.38       0.45          0.29
FBMC            ST2, 600          0.10    0.35             1.0       0.45       0.46          0.43

ᵃ MMC, ESDMC, and FBMC mean Metropolis Monte Carlo, energy-scaled displacement Monte Carlo, and force-biased Monte Carlo, respectively.
ᵇ The Lennard-Jones (LJ) calculations were for 256 particles at $\rho^* = 0.85$. The ST2 calculations were for 216 particles at $\rho^* = 0.99678$.
ᶜ The parameters $\Delta_0$, $\Delta\omega_0$, $\Delta\cos\beta_0$, and $A$ are defined in Eqs. (1) to (3). The $\Delta_0$ values are reduced with respect to $\sigma$; the $\Delta\omega_0$ values are in radians. $\lambda$ is a parameter that regulates the strength of the force and torque biasing (Ref. 8). $\Delta\alpha_0$ is in radians. It is defined with Eq. (A1).
ᵈ These are acceptance rates for attempted moves made with particles whose energy had any value.
ᵉ These are acceptance rates for attempted moves made with particles whose energy was below the mean.
ᶠ These are acceptance rates for attempted moves made with particles whose energy was above the mean.
that A had to be bigger than 0, to ensure a significant departure from MMC, but not so large as to essentially stop accepting high energy particle moves and/or limit low energy particle moves to very small regions. As will be seen, a value of A = 1 was too large for any of our systems, but values in the range 0.25 to 0.50 for the Lennard-Jones systems, and 0.40 to 0.60 for hot ST2 water worked out very well.
B. Lennard-Jones systems
Apart from the runs shown in Fig. 1, all the Lennard-Jones simulations were for a 256-particle $N, V, T$ ensemble, at $\rho^* = 0.85$ with spherical cutoff at half the initial fcc box length ($3.3515\,\sigma$), and the usual periodic boundary conditions. All moves were single-particle moves with the trial particle picked at random. An fcc solid was the initial configuration. The maximum move size in the ESDMC calculations was not allowed to exceed $\Delta_0 e^2$ since trial moves beyond this limit are almost invariably rejected. The bin size (see Ref. 15 and below) was 1000 in all the runs.
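The quoted cutoff follows directly from the particle number and the reduced density, assuming a cubic box of volume $N/\rho^*$ in units of $\sigma^3$. The short Python check below uses a hypothetical helper name, `half_box_length`.

```python
def half_box_length(n_particles, rho_star):
    """Half the edge of a cubic box holding n_particles at reduced density rho_star.

    The result is in units of sigma, since rho_star = rho * sigma**3.
    """
    box_edge = (n_particles / rho_star) ** (1.0 / 3.0)
    return 0.5 * box_edge

print(round(half_box_length(256, 0.85), 4))
```

For 256 particles at $\rho^* = 0.85$ this agrees with the value of $3.3515\,\sigma$ given in the text.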
C. ST2 water
These runs were at $T = 600$ K, $\rho^* = 0.99678$ for a 216-particle $N, V, T$ ensemble, with the usual periodic boundary conditions, and spherical cutoff at 8.46 Å. As with the Lennard-Jones calculations, the trial particle was picked at random, and in the ESDMC calculations the maximum trial move size was restricted to $\Delta_0 e^2$. The bin size was kept to the relatively low value of 1000 in order to avoid the effect of roundoff on the calculated heat capacities. Also, keeping the bin size the same in all these runs helps to ensure that comparisons of heat capacities obtained with the different runs will be meaningful.
Thus the cumulative heat capacities were obtained by:
(7)
with
$$\frac{C_v^c}{R} = \frac{1}{mR}\sum_m C_v^{c\prime} + N\left[\frac{\sum_m (\beta e')^2}{m} - \left\{\frac{\sum_m (\beta e')}{m}\right\}^2\right] \tag{8}$$
In Eqs. (7) and (8) the superscripts "T" and "c" denote "total" and "configurational," respectively, $m$ is the number of blocks the run was comprised of, and the primes mean that the indicated function refers to these blocks. Also $\beta e$ means $\beta U^c/N$. For example, suppose a run of $10^6$ trial configurations consisted of 20 blocks, each with $5 \times 10^4$ trial configurations. Here Eq. (8) is used twice; first with $m = 50$ to get the $5 \times 10^4$ trial block
ensemble averages. These are the primed quantities in Eq. (8). Then Eq. (8) is applied again, with $m = 20$, to get the cumulative heat capacity for the $10^6$ trials. As indicated above, this double decomposition was made necessary by our use of a basic bin size of $10^3$.
Equations (7) and (8) were applied only after the configurational energy had settled down to within 1% to 2% of the asymptotic value.
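The double decomposition described above can be illustrated with a short Python sketch. It reads Eq. (8) as the usual variance decomposition for equal-sized blocks (the mean of the block heat capacities plus $N$ times the variance of the block means of $\beta e$) and takes $C_v^c/R = N\beta^2\langle\delta e^2\rangle$ within a bin. The function names and the synthetic energy series are illustrative assumptions, not data from the paper.

```python
import random

def bin_stats(beta_e_samples, n_particles):
    """Return (C_v^c/R, mean of beta*e) for one bin of configurations."""
    m = len(beta_e_samples)
    mean = sum(beta_e_samples) / m
    var = sum((x - mean) ** 2 for x in beta_e_samples) / m
    return n_particles * var, mean

def combine_blocks(cv_blocks, mean_blocks, n_particles):
    """Combine equal-sized blocks: mean block C_v plus N * variance of block means."""
    m = len(cv_blocks)
    within = sum(cv_blocks) / m
    mean = sum(mean_blocks) / m
    mean_sq = sum(x * x for x in mean_blocks) / m
    return within + n_particles * (mean_sq - mean ** 2), mean

# Synthetic series of beta*e values: 20 blocks x 50 bins x 10 configurations.
rng = random.Random(1)
N = 256
series = [rng.gauss(-8.0, 0.1) for _ in range(20 * 50 * 10)]

# First pass: bins -> blocks (m = 50). Second pass: blocks -> run (m = 20).
blocks = []
for b in range(20):
    bins = [bin_stats(series[(b * 50 + k) * 10:(b * 50 + k + 1) * 10], N) for k in range(50)]
    blocks.append(combine_blocks([c for c, _ in bins], [e for _, e in bins], N))
cv_run, mean_run = combine_blocks([c for c, _ in blocks], [e for _, e in blocks], N)

cv_direct, mean_direct = bin_stats(series, N)
print(cv_run, cv_direct)
```

Under these assumptions the two-pass result matches the heat capacity computed directly from the full series, up to floating-point roundoff. The decomposition is exact only when all blocks contain the same number of configurations.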
The ESDMC and MMC calculations were done with the same program; the MMC results were obtained by setting $A$ to zero in Eqs. (1) to (3). Once a particle was picked, the translational trial moves were made by selecting a box size with Eq. (1), and then moving to a location, picked randomly, within this box. The rotational trial moves were made by using Eqs. (2) and (3) to compute a domain for $\alpha_i$, $\cos\beta_i$, and $\gamma_i$, and then rotating to a randomly selected orientation within this domain.
The FBMC calculations were all for fixed maximum move sizes. The force biasing for the translational moves was done as in Refs. 8 or 9, but the torque biasing for the rotations was done differently. Specifically, we torque biased our rotations by picking a space-fixed axis with a probability proportional to the square of the torque along it, and then rotating the molecule around this axis, with the rotation biased in the direction of the torque. We found that this method yields a slightly weaker bias, but involves less overhead than that described in Ref. 8, and that these effects roughly nullify one another.
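The axis-selection step of this torque-biasing scheme can be sketched in Python as below. Only the choice of axis is shown; the biased rotation angle and the corresponding acceptance test are not reproduced here, since the text does not give their form. The function names, and the fallback to equal probabilities when the torque vanishes, are assumptions made for illustration.

```python
import random

def axis_probabilities(torque):
    """Probability of picking each space-fixed axis, proportional to torque_k**2."""
    squares = [t * t for t in torque]
    total = sum(squares)
    if total == 0.0:
        # Assumed fallback: no torque, so no axis is preferred.
        return [1.0 / 3.0] * 3
    return [s / total for s in squares]

def pick_axis(torque, rng=random):
    """Return 0, 1, or 2 (x, y, or z) drawn with the probabilities above."""
    return rng.choices([0, 1, 2], weights=axis_probabilities(torque))[0]

print(axis_probabilities([1.0, 0.0, 3.0]))
print(pick_axis([1.0, 0.0, 3.0], random.Random(0)))
```

An axis with no torque component is never selected in this sketch, so rotations concentrate about the axes along which the torque is largest.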
Our ESDMC and FBMC programs were tested on previously equilibrated ST2 water at $\rho^* = 0.99678$, $T = 283.15$ K. Our results for the ensemble-averaged energy per particle agreed well with published values. For 30 000 trial moves, the ESDMC program (with $A = 0$) gave $\beta\langle e\rangle = -18.82 \pm 0.08$; the FBMC program over 50 000 trial moves gave $\beta\langle e\rangle = -18.88 \pm 0.09$ (vs $-18.88$⁸ and $-18.87 \pm 0.11$¹⁷). These values are all for 216 particles and the same spherical cutoff of 8.46 Å, with no tail correction.
V. RESULTS
Since the functions that were monitored converged toward their equilibrium values at very different rates, this section is divided according to system and then subdivided according to the function monitored.
LENNARD-JONES SYSTEMS
A. Melting
Certainly, the speed with which a particular simulation procedure melts a solid that is initially a little above its melting point is an interesting test of the efficiency of that procedure. Using molecular dynamics on an 864-particle Lennard-Jones system, Verlet¹³ estimated that the melting point of this system at $\rho^* = 0.85$ was $T^* = 0.704$. Verlet's criterion for melting was based on the time evolution of the function:
$$S = \sum_{i=1}^{N} \cos\left(\frac{4\pi x_i}{a}\right), \tag{9}$$
[Figure 2: $S$ versus $N/(7.5 \times 10^4)$.]
FIG. 2. $S$, defined by Eq. (9), is an index of the extent to which a solid has melted. A value of zero typifies a liquid. (-) MMC; (- - -) ESDMC $A = 0.35$; (-●-) ESDMC $A = 1.0$; (--) FBMC $\lambda = 1.0$. $N$ is the number of trial moves. See text for other symbols.
where $x_i$ is the absolute $x$ coordinate of particle "i," "a" is the side of the unit cell in the initial fcc lattice, with similar expressions of course for the $y$ and $z$ directions. $S$ will equal $N$ exactly in an fcc lattice, it will be of the order of $N$ in any lattice, and it will oscillate around zero with an amplitude of the order of $N^{1/2}$ for a liquid.¹³
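The behaviour of this melting index can be illustrated with a small Python sketch that evaluates $S = \sum_i \cos(4\pi x_i/a)$ for a perfect 256-particle fcc lattice and for uniformly random coordinates. The helper names are hypothetical, and the random coordinates merely stand in for a liquid-like configuration.

```python
import math
import random

def fcc_x_coordinates(n_cells, a):
    """x coordinates of all sites of an fcc lattice with n_cells**3 cubic cells of edge a."""
    basis = [(0.0, 0.0, 0.0), (0.5, 0.5, 0.0), (0.5, 0.0, 0.5), (0.0, 0.5, 0.5)]
    xs = []
    for ix in range(n_cells):
        for _iy in range(n_cells):
            for _iz in range(n_cells):
                for bx, _by, _bz in basis:
                    xs.append((ix + bx) * a)
    return xs

def melting_index(xs, a):
    """S = sum over particles of cos(4*pi*x_i / a)."""
    return sum(math.cos(4.0 * math.pi * x / a) for x in xs)

a = (4.0 / 0.85) ** (1.0 / 3.0)          # fcc cell edge at rho* = 0.85 (4 atoms per cell)
lattice = fcc_x_coordinates(4, a)        # 4 x 4 x 4 cells -> 256 particles
rng = random.Random(0)
disordered = [rng.uniform(0.0, 4.0 * a) for _ in range(256)]

print(len(lattice), melting_index(lattice, a))
print(melting_index(disordered, a))
```

Every fcc site lies at a multiple of $a/2$ along $x$, so each cosine equals one and $S = N$; for the random coordinates the terms largely cancel.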
So our first test involved seeing how MMC, ESDMC, and FBMC fared in bringing down the value of $S$ from 256 to 0, starting from a 256-particle fcc lattice at $\rho^* = 0.85$ and $T^* = 0.719$. This temperature is 1.8 K above Verlet's estimate of the melting point of this system, so that it should be high enough to produce a liquid in a reasonable time, but low enough to allow any spread between the different methods to make itself manifest.
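The reduced and absolute temperatures quoted in this paper can be cross-checked with a few lines of Python. The energy parameter $\epsilon/k = 119.8$ K used below is not stated in the text; it is an assumption inferred from Table I, where $T^* = 0.719$ corresponds to 86.136 K.

```python
EPS_OVER_K = 119.8   # kelvin; assumed Lennard-Jones energy parameter, inferred from Table I

def to_kelvin(t_star):
    """Convert a reduced temperature T* = kT/epsilon to kelvin."""
    return t_star * EPS_OVER_K

print(to_kelvin(0.719), to_kelvin(1.5))
print(to_kelvin(0.719) - to_kelvin(0.704))
```

With this value, the two simulation temperatures agree with the Table I entries, and the run at $T^* = 0.719$ sits about 1.8 K above the melting estimate of $T^* = 0.704$.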
The results are shown in Fig. 2. The values of $S$ were calculated from configurations which we periodically stored; the values are averages over the $x$, $y$, and $z$ directions. As seen from the figure, the MMC procedure has not completely melted the solid by $1.05 \times 10^6$ trials, ESDMC with $A = 0.35$ gives the liquid after about $7.5 \times 10^5$ trials, and ESDMC with $A = 1$, and FBMC with $\lambda = 1$,⁸ both do the job within about $5 \times 10^5$ trials. In order to keep this figure uncluttered, we do not show the ESDMC results for $A = 0.50$; they were very similar to the ESDMC $A = 0.35$ results, but were for a shorter run.
B. Radial distribution functions
The results obtained for these functions at $T^*$ values of 0.719 and 1.5 are displayed in Figs. 3 and 4, respectively.
[Figure 3: radial distribution function (rdf) versus $r^*$.]
FIG. 3. Radial distribution functions for a Lennard-Jones fluid at $T^* = 0.719$, $\rho^* = 0.85$. (-) MD data from Refs. 14 and 18. (○) MD data from Ref. 13. (- - -) ESDMC $A = 0.35$; (--) FBMC $\lambda = 1.0$. The ESDMC and FBMC curves are averages for the ranges $6.75 \times 10^5$ to $1.8 \times 10^6$ and $2.25 \times 10^5$ to $9.0 \times 10^5$ trial moves, respectively. Averages over these ranges were taken since shifts within these ranges as the walk proceeded were not significant. Error bars represent one standard deviation. $r^*$ is distance reduced by $\sigma$.
First consider Fig. 3, in which our results are compared with the molecular dynamics data of Verlet¹³ and of Nicolas et al.¹⁴ Again, in the interest of clarity, we show an abridged form of our results. The MMC results were for an incompletely melted solid (Fig. 2) and so are omitted. We omit the ESDMC $A = 1$ results, since we will find that these results (because of overbiasing) did not yet converge to the proper internal energies by the time the run was terminated. Also, our ESDMC distribution functions with $A = 0.35$ and $A = 0.50$ were almost indistinguishable on the scale of Fig. 3, so only the $A = 0.35$ curve is shown.
Our ESDMC curve is close to but not totally coincident with the MD results. The FBMC curve's first peak is shifted to the right, and is significantly lower than the first peak in either the ESDMC or the MD results. After the first minimum the FBMC and ESDMC curves coalesce.
We believe, for a number of reasons, that the ESDMC curve is the most accurate of the ones shown. First, this curve was essentially the same as the one obtained with $A = 0.50$. Second, the results of Nicolas et al.¹⁴ in this temperature region were found, in subsequent testing, to be somewhat inaccurate (see footnote in Ref. 14). This was probably due to insufficient equilibration (500 time steps) at these temperatures. Verlet's MD results¹³ were for a very short experiment (1200 time steps), so they also may have been inadequately equilibrated. Beyond the first minimum the ESDMC and FBMC curves agree very closely with Verlet's results, but are a little out of phase with the result of Nicolas et al. This last discrepancy is probably due to the fact that the equation used to represent the data of Nicolas et al.¹⁸ contained only one adjustable parameter for handling the oscillations in the tail, and so the fit to the tail provided by this equation was a bit rough.
The most significant discrepancy in Fig. 3 is that the FBMC first peak is significantly lower and shifted a bit to the right of either the MD or the ESDMC first peak. That this is not attributable to noise is clear from the error bars. We believe this is caused by the tendency of FBMC (with $\lambda = 1$) to focus the sampling in the early part of the walk on the low energy and low force regions of configuration space, thereby missing, in an insufficiently long run, some of the higher energy and higher force configurations that contribute to the configurational properties. This point will come up again in this article in our heat capacity results and will be dealt with at length in a later article. For now, though, in support of this view, we point out that the FBMC first maximum is at $r^* = 1.12$, which corresponds to the minimum in the Lennard-Jones pair potential. Also the FBMC curve indicates a slightly higher density in the region $r^* \approx 1.24$ (which corresponds to the region of minimum force for a pair of Lennard-Jones particles) at the expense of density in the higher force region of $r^* \approx 1.08$.
Figure 4 shows that the pattern of discrepancies found at $T^* = 0.719$ is repeated at $T^* = 1.5$, but the magnitude of the discrepancies is now somewhat reduced. Here again, two ESDMC curves, one with $A = 0.25$, the other with $A = 0.35$, were virtually indistinguishable on the scale of Fig. 4, so only one curve is shown. Also the MMC curve was virtually indistinguishable from the ESDMC curve except at the first peak ($r^* = 1.08$) where the MMC curve was lower by 0.11 units. The FBMC and ESDMC curves coalesce after the first minimum. The smaller discrepancies at the higher temperature are attributable to the fact that the data of Nicolas et al. improve in quality with increasing temperature,¹⁴ and to the fact that the degree of biasing in FBMC will, for a fixed value of $\lambda$, decrease with temperature, making FBMC increasingly resemble MMC as the temperature rises. The MMC curve was virtually indistinguishable from the ESDMC curves except as indicated at the first peak. Again, we believe that, with respect to the r.d.f.'s, the MMC walk was not quite for the fully equilibrated sample at termination.
[Figure 4: radial distribution function (rdf) versus $r^*$.]
FIG. 4. Radial distribution functions for a Lennard-Jones fluid at $T^* = 1.5$, $\rho^* = 0.85$. (-) MD data from Refs. 14 and 18. (- - -) ESDMC $A = 0.35$; (--) FBMC $\lambda = 1.0$. The ESDMC and FBMC curves are averages for the range $1.5 \times 10^5$ to $1.8 \times 10^6$ trial moves. Shifts within this range were negligible. Standard deviations are 2 or 3 times the thickness of the curves.
[Figure 5: $-\langle e\rangle/kT$ versus $N/(1.5 \times 10^5)$.]
FIG. 5. Ensemble averages of the reduced configurational energy [defined with Eq. (4)] for a Lennard-Jones fluid at $T^* = 0.719$, $\rho^* = 0.85$. Each discontinuity is an ensemble average over the preceding $7.5 \times 10^4$ trial moves. (-) MMC; (- - -) ESDMC $A = 0.35$; (-.-) ESDMC, $A = 1.0$; (--) FBMC $\lambda = 1.0$.
C. Energies and heat capacities
These results are given in Figs. 5-7 and in Table III. The energy profile for ESDMC with $A = 0.50$ at $T^* = 0.719$ is similar to but shorter than the one shown for $A = 0.35$ and so was left out from Fig. 5. Also, the energy tracings at $T^* = 1.5$ are basically featureless and so are not shown. All the walks have their equilibrium energies included in Table III.
FIG. 6. Cumulative, total, constant volume molar heat capacities, Cv (cal/mol K), versus N/(1.5 × 10^5) for a Lennard-Jones fluid at T* = 0.719, ρ* = 0.85. See Eqs. (7), (8), Table III, and text. (---) ESDMC, A = 0.35; (-o-) ESDMC, A = 0.50; (--) FBMC, λ = 1.0.

FIG. 7. See legend for Figure 6. Here T* = 1.5, ρ* = 0.85. (-) MMC; (---) ESDMC, A = 0.35; (-o-) ESDMC, A = 0.25; (--) FBMC, λ = 1.0.

There are a few interesting things demonstrated by the energy tracings in Fig. 5. First, the internal energy from the MMC walk has not yet reached the equilibrium value. This is not surprising, since the MMC walk at termination had not yet completely melted the starting fcc lattice (Fig. 2 and text). We see also from Fig. 5 and Table III that the value of the internal energy for the FBMC walk is statistically indistinguishable from that obtained with ESDMC using A = 0.35 (or A = 0.50). So it is sobering to look back to Fig. 3, where we clearly see that the correlation functions obtained in the two walks are noticeably different. Evidently, integrating over the product of the correlation function obtained with FBMC and the Lennard-Jones potential results in so much error cancellation that internal energies are here virtually useless as indicators of convergence to equilibrium.
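The error cancellation described above can be explored numerically. The following minimal sketch evaluates the standard energy equation, U/(Nε) = 2πρ* ∫ g(r*) u*(r*) r*² dr*, for a tabulated r.d.f. and the Lennard-Jones pair potential. The function names, the trapezoidal quadrature, and the absence of a tail correction are illustrative assumptions, not the procedure used in this work.

```python
import numpy as np

def lj_reduced(r):
    """Lennard-Jones pair potential u/epsilon at reduced distance r = r/sigma."""
    inv6 = 1.0 / r**6
    return 4.0 * (inv6**2 - inv6)

def energy_from_rdf(r, g, rho):
    """Reduced configurational energy per particle, U/(N*epsilon).

    r   : increasing grid of reduced distances (hypothetical tabulation)
    g   : r.d.f. values on that grid
    rho : reduced number density rho*
    No tail correction beyond r[-1] is applied.
    """
    r = np.asarray(r, dtype=float)
    g = np.asarray(g, dtype=float)
    integrand = g * lj_reduced(r) * r**2
    # Trapezoidal rule written out explicitly.
    integral = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(r))
    return 2.0 * np.pi * rho * integral
```

Dividing the result by T* gives e/kT in the units of Table III. Feeding in two r.d.f.s that differ visibly near the first peak is one way to see how little the integrated energy may change.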
The ESDMC curve for A = 1 is included in Fig. 5 to illustrate what happens when we overbias. Here the energies are about 2% too high, and this happens because with A = 1.0 in Eq. (1) the boxes drawn around the most energetic particles are so large that very few of the moves attempted by these particles are accepted. In fact, the average acceptance rate for particles with energy above the mean is 0.16 for A = 1 (Table II). To dispel the disturbing idea that the ESDMC A = 1 walk might be the only equilibrated one of all the walks tried (say, because only here did we escape fully from low-energy traps), we used the final configuration of this walk as the input to our FBMC program and continued from this configuration with FBMC. As seen in Fig. 5, the energy quickly jumped back to the region into which the other walks had settled. So we conclude that it was the ESDMC A = 1 walk that was out of line, for the reasons given.
The cumulative, total, constant volume molar heat capacities are shown in Figs. 6 and 7, and our best estimates (the terminal values of the graphs) are given in Table III. For the Lennard-Jones systems, these were obtained from Eqs. (7) and (8), with a factor of 3/2 rather than 3 in Eq. (7).
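As a rough illustration of how a cumulative heat capacity of this kind can be built up from block subaverages, the sketch below assumes the standard fluctuation formula, Cv = R[3/2 + N(⟨u²⟩ - ⟨u⟩²)], with u = U/(NkT) the reduced configurational energy per particle. Eqs. (7) and (8) are given earlier in the paper and may differ in detail; the function name, the equal-size blocks, and the argument layout are assumptions made for this example.

```python
import numpy as np

R_CAL = 1.987204  # gas constant in cal/(mol K)

def cumulative_cv(block_mean_u, block_mean_u2, n_particles, kinetic_factor=1.5):
    """Cumulative total molar Cv (cal/mol K) after each block.

    block_mean_u   : block averages of u = U/(N k T)
    block_mean_u2  : block averages of u**2 over the same blocks
    n_particles    : number of particles N in the simulation cell
    kinetic_factor : 3/2 for point particles (assumed kinetic contribution)
    All blocks are assumed to contain the same number of trials.
    """
    u = np.asarray(block_mean_u, dtype=float)
    u2 = np.asarray(block_mean_u2, dtype=float)
    counts = np.arange(1, len(u) + 1)
    running_u = np.cumsum(u) / counts      # running <u>
    running_u2 = np.cumsum(u2) / counts    # running <u^2>
    variance = running_u2 - running_u**2   # running energy dispersion
    return R_CAL * (kinetic_factor + n_particles * variance)
```

The last element of the returned array plays the role of the terminal point of a curve such as those in Figs. 6 and 7. Because the configurational part depends on the breadth of the energy distribution, a walk that undersamples one wing of that distribution would be expected to give a low value.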
As seen from Figs. 6 and 7 and Table III, at both T* = 0.719 and T* = 1.5 our ESDMC walks gave the same cumulative heat capacities at termination, but FBMC at both temperatures, and MMC at the higher temperature, fell short of this value. We do not report heat capacities for MMC at T* = 0.719, because here the energies had not yet converged.
TABLE III. Average configurational energy and total constant volume heat capacity of the equilibrated Lennard-Jones liquid at ρ* = 0.85.

| T* (a) | -⟨e/kT⟩ (b) | Cv (cal/mol K) (c) | Type of simulation (d) | Source |
|---|---|---|---|---|
| 0.719 | 8.58 ± 0.14 | | MD, 5 K time steps | Ref. 14 |
| | 8.51 | | MD, 1.2 K time steps | Ref. 13 |
| | 8.51 ± 0.04 | 14.5 | FBMC, λ = 1, 675 K trials | This work |
| | 8.47 ± 0.05 | 15.6 | ESDMC, A = 0.35, 1125 K trials | This work |
| | 8.44 ± 0.06 | 15.7 | ESDMC, A = 0.50, 525 K trials | This work |
| 1.5 | 3.55 ± 0.07 | | MD, 5 K time steps | Ref. 14 |
| | 3.56 ± 0.03 | 5.5 | MMC, 1725 K trials | This work |
| | 3.56 ± 0.03 | 5.6 | FBMC, λ = 1, 1725 K trials | This work |
| | 3.55 ± 0.03 | 6.0 | ESDMC, A = 0.25, 1725 K trials | This work |
| | 3.54 ± 0.03 | 6.0 | ESDMC, A = 0.35, 1725 K trials | This work |

(a) T* = kT/ε, where ε is the Lennard-Jones energy parameter and k is the Boltzmann constant.
(b) The entries for Ref. 14 were obtained by using an equation fitted to the r.d.f.'s in Ref. 14 in the integrand of the energy equation with a Lennard-Jones pair potential, i.e., by using Eq. (5) of Ref. 18. The uncertainties given with these entries were calculated from Table IV of Ref. 18. The entries for "this work" are ensemble averages, calculated over that portion of the walk for which the energy had settled down to within about one standard deviation of the reported average. The uncertainties in this table represent two standard deviations. They, like the means, were obtained by combining block averages (of 7.5 × 10^4 trials) for (e/kT). Tail corrections of -0.263 and -0.126 at T* = 0.719 and T* = 1.5, respectively, are included in energies from "this work." Entries from Refs. 13 and 14 include appropriate tail corrections of their own.
(c) These are total (i.e., configurational plus kinetic) constant volume molar heat capacities. The configurational part was obtained by appropriately combining subaverages [see Eqs. (7), (8), and text]; these are the terminal points in Figs. 6 and 7; no tail correction was applied in getting the heat capacities.
(d) See text for the meaning of all the symbols.

For a couple of reasons, we think that these discrepancies are important. First, the result that the FBMC cumulative heat capacities fall below those obtained with ESDMC is consistent with the discrepancies found previously between the correlation functions obtained with the two methods (Figs. 3 and 4). Thus, if FBMC, in the early part of a walk, causes the low-energy wing of the energy distribution function to be filled at the expense of the high-energy wing, we would expect this to result in a relatively low heat capacity, because the latter is related to the breadth of the energy distribution function. Also, we feel that the heat capacity results obtained at T* = 1.5 are especially significant. The two ESDMC walks converge to 6.0 cal/mol K, while MMC and FBMC gave 5.5 and 5.6 cal/mol K, respectively. While these differences are small, it should be remembered that these walks, each 1.8 × 10^6 trials at termination, are very long for a Lennard-Jones fluid at this temperature. We believe these discrepancies indicate that ESDMC (with the A values used here) samples configuration space more efficiently than FBMC with λ = 1 or than MMC.

ST2 WATER AT 600 K

We used the identical configuration as the starting point for each of these simulations. This configuration was one that resulted from running ST2 water (initially at 283 K) for 10^6 trials at 391 K using ESDMC with A = 0.30. For all the walks at 600 K (MMC; FBMC with λ = 1; ESDMC with A = 0.40, 0.60, and 0.75) the internal energies came up to within 1% to 2% of their equilibrium values fairly quickly, i.e., always within 1.5 × 10^5 trials, so that we could not meaningfully discriminate between methods on this basis.

The internal energies, obtained from the equilibrium portions of the walks, are entered in Table IV, from which we see that the ensemble-averaged energies all fall within each other's uncertainties. Clearly, internal energy is here too insensitive to show up differences between the methods of simulation.
TABLE IV. Average configurational energy and total constant volume heat capacity for ST2 water at 600 K and 1.00 g/cm^3.

| Method | Parameter strength (a) | -⟨e/kT⟩ (b) | Cv (cal/mol K) (c) |
|---|---|---|---|
| MMC | | 5.877 ± 0.065 | 20.1 |
| ESDMC | A = 0.40 | 5.987 ± 0.072 | 20.1 |
| ESDMC | A = 0.60 | 5.880 ± 0.061 | 19.4 |
| ESDMC | A = 0.75 | 5.839 ± 0.070 | 18.5 |
| FBMC | λ = 1.0 | 5.852 ± 0.050 | 17.2 |

(a) See Eqs. (1) to (3) and text for meaning of these symbols. Additional details are given in Table II.
(b) e is defined with Eq. (4). These are all ensemble averages for a walk of 1.05 × 10^6 trials, after rejecting the first 1.5 × 10^5 trials for the MMC and ESDMC walks, and the first 7.5 × 10^4 trials for the FBMC walk. The uncertainties mean one standard deviation. These standard deviations, like the means, resulted from combining 21 values of the 5 × 10^4 trial block subaverages of (e/kT) (see the text).
(c) These are the terminal points in Fig. 8. They were obtained by Eqs. (7) and (8) and the results of our simulations.

FIG. 8. Cumulative, total, constant volume molar heat capacity, Cv (cal/mol K), versus N/10^5 for ST2 water at 600 K, ρ* = 0.99678. (-) MMC; (---) ESDMC, A = 0.40; (-o-) ESDMC, A = 0.60; (-●-) ESDMC, A = 0.75; (--) FBMC, λ = 1.0.

We see from Fig. 8, however, that as with the Lennard-Jones liquid, heat capacities are sensitive to the method of simulation. Our ESDMC method with A = 0.40 or A = 0.60 brings in the cumulative heat capacity, about 20 cal/mol K, a little more quickly than MMC, but all three walks soon converge to this same limiting value. As was found with the Lennard-Jones systems, FBMC with λ = 1 is slow in coming up to the limiting value. By termination (1.15 × 10^6 trials) the FBMC walk's cumulative heat capacity had reached only 17.2 cal/mol K, although, as seen from Fig. 8, this curve is slowly rising. The relatively low value of the cumulative heat capacity for FBMC is consistent with the smaller standard deviation obtained with FBMC for the internal energy (Table IV). We believe that the heat capacities come in slowly with FBMC because of a distortion in the underlying distribution functions, analogous to what was found in the Lennard-Jones systems.
As is apparent from Fig. 8, ESDMC with A = 0.75 does not converge to the limiting heat capacity as quickly as MMC or as ESDMC with A = 0.40 or 0.60. Again, we believe this is an example of overbiasing, although not as extreme as our previous example, because here the internal energies are not significantly different. We see from Table II that with A = 0.75 the high-energy particle acceptance rate is down to 0.17, which is apparently too low for an efficient walk. Again, though, this curve is gradually rising toward the MMC and the other ESDMC curves, as of course it must.
DISCUSSION AND SUMMARY
The results of this study suggest that ESDMC, when properly parametrized and applied to a dense Lennard-Jones liquid or hot ST2 water, samples the configuration space in these systems more efficiently than either MMC or FBMC used with λ = 1. While FBMC with λ = 1 always brings in the energies quickly, this seems to happen at the expense of the distribution functions and heat capacities. Also, ESDMC has an advantage over FBMC in not requiring the calculation of forces or torques.
This is not to say that either MMC or FBMC should not be used. When considered in perspective, each approach has specific advantages, so that these three methods are somewhat complementary. Thus, if internal energy is the principal quantity sought, FBMC with λ = 1 is useful. Also, λ = 1/2 (for FBMC) is now generally believed to provide faster convergence than λ = 1. Furthermore, only MMC can be used for hard-sphere-like systems; ESDMC loses its usefulness as the energy dispersion of the sample narrows; and FBMC loses its usefulness as the mean kinetic energy of the particles rises relative to their mean potential energy.
Perhaps the most important aspect of this work is that it produced more questions than answers. Specifically, we would like to understand and explain each of the following issues on a more fundamental basis than was here possible. (1) What exactly are we doing in ESDMC, in terms of the underlying Markov chain, that renders it more efficient than MMC? What are we doing when we "overbias"? (2) Why did FBMC with λ = 1 result in an apparent distortion of the r.d.f.s in the Lennard-Jones systems, and a slow development of the heat capacities in both the Lennard-Jones and the ST2 systems? Is this because λ = 1 is, for these systems, a form of overbiasing, and would a smaller value of λ have been better? (3) Is it possible to combine the ideas underlying both FBMC and ESDMC and so create a more nearly optimized sampling procedure that would result (modifying a previously coined phrase) in the Smartest Monte Carlo of all?
In a subsequent article, we will deal with these questions. It turns out to be possible to study these matters fundamentally by applying elementary Markov Chain theory to hypothetical systems that contain the salient features of the real systems that were looked at here. These calculations are now being done and should be published in the near future.
ACKNOWLEDGMENTS
It is a pleasure to thank Vice President, Academic, Howard Clark for his initiatives in setting up the "Special Computing Project" at the University of Guelph under which these calculations were carried out. We also thank the Institute of Computer Science at the University of Guelph for their cooperation. Thanks are also extended to Professor David Beveridge and Dr. M. Mezei for providing the 283 K equilibrated ST2 configuration, and the Natural Sciences and Engineering Research Council of Canada for financial support.
APPENDIX
Here we provide the details of how we torque biased the moves in the force-biased simulations on ST2 water.
The projections of the torque vector along each of the space-fixed x, y, and z axes in each molecule were obtained from the vector cross product of the total force at each site due to the other molecules in the system and the space-fixed vector between that site and the molecular center, followed by summing over the four sites.
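The torque evaluation just described can be sketched as follows, assuming the usual convention that the torque is the sum over sites of (site position minus molecular center) crossed with the force on that site. The function and argument names are hypothetical, and the site forces are taken as already computed elsewhere.

```python
import numpy as np

def molecular_torque(site_positions, site_forces, center):
    """Space-fixed torque components (x, y, z) about the molecular center.

    site_positions : (n_sites, 3) space-fixed coordinates of the interaction sites
    site_forces    : (n_sites, 3) total force on each site from the other molecules
    center         : (3,) space-fixed coordinates of the molecular center
    """
    arms = np.asarray(site_positions, dtype=float) - np.asarray(center, dtype=float)
    forces = np.asarray(site_forces, dtype=float)
    # Cross product site by site, then sum the contributions of all sites.
    return np.cross(arms, forces).sum(axis=0)
```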
To pick an axis of rotation with a probability proportional to the square of the torque along it, we evaluated the ratios

$$R_i = \frac{\tau_i^2}{\sum_j \tau_j^2}, \qquad i = x, y, z,$$

where $\tau_i$ is the projection of the torque vector along axis $i$. Then, using a random number scheme, we simply picked one of the x, y, or z axes with a probability of $R_i$. To do this, let $\xi$ be a random number between 0 and 1. Then, if

$$\xi \le R_x, \quad \text{pick the } x \text{ axis},$$

$$R_x < \xi \le (R_x + R_y), \quad \text{pick the } y \text{ axis},$$

$$\xi > (R_x + R_y), \quad \text{pick the } z \text{ axis}.$$

Of course the $R_i$'s are updated and the process is repeated for each trial move.
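The axis-selection rule above can be sketched in a few lines. The function name, the optional `xi` argument (useful for testing), and the fallback to equal weights when the torque vanishes are assumptions made for this example only.

```python
import random

def pick_rotation_axis(torque, xi=None):
    """Pick axis 0 (x), 1 (y) or 2 (z) with probability R_i = tau_i**2 / sum_j tau_j**2.

    torque : sequence of the three space-fixed torque components
    xi     : optional uniform random number in [0, 1); drawn if omitted
    Returns the chosen axis index and the tuple of ratios (R_x, R_y, R_z).
    """
    squares = [t * t for t in torque]
    total = sum(squares)
    if total == 0.0:
        # Assumption: with no torque, fall back to an unbiased choice.
        ratios = (1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0)
    else:
        ratios = tuple(s / total for s in squares)
    if xi is None:
        xi = random.random()
    if xi <= ratios[0]:
        return 0, ratios
    if xi <= ratios[0] + ratios[1]:
        return 1, ratios
    return 2, ratios
```

The ratios are returned together with the axis because the ratio for the chosen axis is needed later when the forward and reverse transition probabilities are compared.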
Once an axis $i$ is picked, the rotation around it is biased in the direction of the torque. We set the probability of a rotation around a selected axis ($i$) to be proportional to $\exp(\lambda\beta\tau_i\Delta\alpha_i)$, where $\Delta\alpha_i$ is the amount we rotate by and $\lambda$ is a constant. Following Berne et al. (Ref. 8), the extent of any rotation is provided by

$$\Delta\alpha_i = \frac{\ln\{\xi[\exp(\lambda\beta\tau_i\Delta\alpha_0) - \exp(-\lambda\beta\tau_i\Delta\alpha_0)] + \exp(-\lambda\beta\tau_i\Delta\alpha_0)\}}{\lambda\beta\tau_i}, \tag{A1}$$

where $\xi$ is a random number between 0 and 1. The values used for $\lambda$ and $\Delta\alpha_0$ are given in Table II. Note that in the present notation, the allowed range for the rotation is $2\Delta\alpha_0$ radians.
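Equation (A1) is the inverse of the cumulative distribution for a density proportional to exp(λβτ_i Δα_i) on the interval from -Δα_0 to +Δα_0. A minimal sketch of the sampling step follows; the function name, the optional `xi` argument, and the uniform fallback for a vanishing torque (where the formula would divide by zero) are assumptions of this example.

```python
import math
import random

def biased_rotation(tau_i, lam, beta, dalpha0, xi=None):
    """Sample a rotation angle in [-dalpha0, +dalpha0] biased along the torque.

    tau_i   : torque component along the chosen axis
    lam     : biasing parameter lambda
    beta    : 1/(kT), in units consistent with tau_i
    dalpha0 : half-width of the allowed rotation range (radians)
    xi      : optional uniform random number in [0, 1]; drawn if omitted
    """
    if xi is None:
        xi = random.random()
    c = lam * beta * tau_i
    if abs(c) < 1e-12:
        # Assumed limit of zero bias: uniform rotation over the allowed range.
        return (2.0 * xi - 1.0) * dalpha0
    low = math.exp(-c * dalpha0)
    high = math.exp(c * dalpha0)
    return math.log(xi * (high - low) + low) / c
```

Setting `xi` to 0 or 1 returns the two ends of the allowed range, which is a quick check that the sampled angle never leaves the interval of width 2Δα_0.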
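The normalization constant for the rotational part of the move is now readily obtained. From the fact that the $R_i$'s are normalized, and the requirement

$$\int_{-\Delta\alpha_0}^{\Delta\alpha_0} C_i \exp(\lambda\beta\tau_i\Delta\alpha_i)\, d(\Delta\alpha_i) = 1,$$

where $C_i$ is the normalization constant for the rotation, we obtain

$$Q_i = \frac{R_i\,\lambda\beta\tau_i\,\exp(\lambda\beta\tau_i\Delta\alpha_i)}{\exp(\lambda\beta\tau_i\Delta\alpha_0) - \exp(-\lambda\beta\tau_i\Delta\alpha_0)}, \qquad i = x, y, z. \tag{A2}$$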
In Eq. (A2), $Q_i$ is the total rotational contribution to the forward-step transition probability prior to discriminating on the basis of $\exp(-\beta\Delta u)$. Similar expressions are obtained for the reverse step, $Q_i^{\mathrm{rev}}$. Thus, when the Boltzmann test is done, our rotational torque-biasing scheme contributes the factor $(Q_i^{\mathrm{rev}}/Q_i)$ to the product of the translational force-biasing factor (Ref. 8) and $\exp(-\beta\Delta u)$.
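The way Eq. (A2) enters the acceptance test can be sketched as below. This is an illustration under stated assumptions rather than the program used here: the function names are hypothetical, the translational force-biasing factor is passed in as a precomputed number, and the reverse-step density is assumed to be evaluated by the caller with the torque and axis ratio of the trial configuration and the rotation -Δα_i.

```python
import math

def rotation_proposal_density(r_i, tau_i, dalpha, lam, beta, dalpha0):
    """Q_i of Eq. (A2): axis ratio times the normalized exponential density."""
    c = lam * beta * tau_i
    if abs(c) < 1e-12:
        # Assumed zero-torque limit: uniform density over a range of 2*dalpha0.
        return r_i / (2.0 * dalpha0)
    norm = math.exp(c * dalpha0) - math.exp(-c * dalpha0)
    return r_i * c * math.exp(c * dalpha) / norm

def acceptance_probability(q_forward, q_reverse, beta, delta_u,
                           translational_factor=1.0):
    """min(1, translational factor * (Q_rev / Q_fwd) * exp(-beta * delta_u))."""
    ratio = translational_factor * (q_reverse / q_forward)
    return min(1.0, ratio * math.exp(-beta * delta_u))
```

A quick consistency check is that the density integrates to R_i over the allowed range, so that summing over the three axes gives unity.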
1. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, J. Chem. Phys. 21, 1087 (1953).
2. G. M. Torrie and J. P. Valleau, J. Comp. Phys. 23, 187 (1977).
3. J. P. Valleau and S. G. Whittington, in Statistical Mechanics, Part A: Equilibrium Techniques, edited by B. J. Berne (Plenum, New York, 1977), p. 137.
4. J. P. Valleau and G. M. Torrie, in Statistical Mechanics, Part A: Equilibrium Techniques, edited by B. J. Berne (Plenum, New York, 1977), p. 169.
5. J. C. Owicki and H. A. Scheraga, Chem. Phys. Lett. 47, 600 (1977).
6. P. J. Rossky, J. D. Doll, and H. L. Friedman, J. Chem. Phys. 69, 4628 (1978).
7. C. Pangali, M. Rao, and B. J. Berne, Chem. Phys. Lett. 55, 413 (1978).
8. M. Rao, C. Pangali, and B. J. Berne, Mol. Phys. 37, 1773 (1979).
9. M. Rao and B. J. Berne, J. Chem. Phys. 71, 129 (1979).
10. P. K. Mehrotra, M. Mezei, and D. L. Beveridge, J. Chem. Phys. 78, 3156 (1983).
11. F. H. Stillinger and A. Rahman, J. Chem. Phys. 60, 1545 (1974).
12. J. E. Freund, Mathematical Statistics (Prentice-Hall, New Jersey, 1962), p. 177.
13. L. Verlet, Phys. Rev. 159, 98 (1967).
14. J. J. Nicolas, K. E. Gubbins, W. B. Streett, and D. J. Tildesley, Mol. Phys. 37, 1429 (1979).
15. The bin size is both the number of entries used to compute ensemble subaverages for the energy, square of the energy, and heat capacity, and the number of energy increments added together before recalling the last configuration to recalculate the energy on the basis of this configuration. Too large a bin size results in erroneously high heat capacities; for this reason the Cv reported in Ref. 16 is now known to be too high.
16. S. Goldman, J. Chem. Phys. 74, 5851 (1981).
17. M. Mezei, S. Swaminathan, and D. L. Beveridge, J. Chem. Phys. 71, 3366 (1979).
18. S. Goldman, J. Phys. Chem. 83, 3033 (1979).