Multiscale modelling of biological systems

Christopher J. Woods and Adrian J. Mulholland
DOI: 10.1039/b608778g

Centre for Computational Chemistry, School of Chemistry, University of Bristol, Bristol, UK BS8 1TS

Chem. Modell., 2008, 5, 13–50
This journal is © The Royal Society of Chemistry 2008
Downloaded by Stanford University on 23 August 2012. Published on 19 November 2008 on http://pubs.rsc.org | doi:10.1039/B608778G

1. Introduction

At what point does a collection of molecules become a biomolecular system? At what length scale does biology begin, and chemistry end? Biological phenomena involve the flow of information across a range of length and timescales. For example, a cell may be placed under physical stress at the macroscopic level, which causes an increase in pressure within its protective membrane. This pressure has the effect of opening [1] or closing [2] mechanosensitive ion channels, thereby changing the flow of individual ions into the cell. This changes the ionic concentration within the cell, which then acts as the trigger for a signal sent via a protein signalling pathway. A chemist would look at this as a molecular system that is capable of converting mechanical forces into electrical signals. A biologist, however, would look at this as the mechanism a cell uses to adapt to stress, and thereby stay alive. Biology is full of such examples. Every thought we have involves the passage of signals between neurons, which itself requires the conversion of electrical signals into flows of ions. These ions trigger the release of neurotransmitter molecules, which cross the synaptic gap between neurons, and bind to individual receptor proteins at the synapse. This causes a change in protein conformation, which opens nearby ion channels, causing ions to rush in or out of the neuron, thereby continuing the signal. Information is constantly flowing between the macroscopic world and the atomic, chemical world. Indeed it is this interplay between the chemical and macroscopic worlds that is a real beauty of biology, and it is the recent advances made by the science of biochemistry that have revealed the elegance of the chemicals of life to all. However, while it is possible to use a microscope to watch how an individual cell responds to external stimuli, it is not possible to ‘zoom in’ further and observe what is occurring at the chemical level. Experiments can infer what is happening, and can provide supporting evidence for a particular hypothesis, but there is no experimental technique or microscope that allows us to watch a chemical reaction within an enzyme active site. Until such techniques are developed, the most appealing route that currently exists is to use computers to create models of the biochemical world. Computational scientists can create virtual enzymes, and models of cell membranes, and then use these to provide a window through which the interactions of biomolecules can be observed. If the models are constructed on the firm foundations of physics and chemistry, and if their predictions are carefully compared and validated against experiment, then simulations using these models can provide the valuable insight necessary to link the chemical and biological worlds.

Computational scientists have developed many tools for modelling molecules. Computer models are not perfect recreations of reality. Instead, approximations and assumptions have to be made, and the model compromised for the sake of computational efficiency. As the size of the system gets larger, and so the size and number of molecules increases, so too does the computational expense of the calculation. This means that the larger the system, the more compromises and approximations must be made. This act of compromise has led computational scientists to develop four main levels of biomolecular modelling:

1. Quantum mechanics (QM). Quantum chemical calculations model the fine detail of the electrons in the molecule. They achieve this by modelling the electrons as a quantum mechanical wavefunction that interacts with the electrostatic potential field generated by the atomic nuclei. Quantum chemical calculations provide the most physically realistic and accurate models of molecules, but this accuracy comes at a cost. While methods have been developed that allow QM calculations on complete proteins [3], in general the high computational expense of QM methods limits their application to small molecular systems.

2. Molecular mechanics (MM). Atomistic molecular mechanics calculations apply the assumption that the fine detail of the behaviour of the electrons can be ignored, and instead approximate the effects of the electrons using simple descriptors such as atomic partial charges or polarisabilities. By modelling the electrons implicitly, MM methods are much less expensive than QM methods, and so they are able to model significantly larger systems. Because they include atomic detail, MM models are still limited to the molecular level, and even today’s largest applications can only achieve the modelling of hundreds of thousands of atoms over hundreds of nanoseconds.

3. Coarse grain (CG). Coarse grain (or coarse grained) calculations apply the assumption that the fine detail of the position of each atom in the molecule can be ignored, and instead groups of atoms are approximated by smearing them out into single ‘beads’. So, for example, rather than modelling each atom in a protein, a CG representation would portray each residue as a single bead. This approximation allows CG simulations to achieve length and timescales that are far beyond those possible using atomistic MM models.

4. Continuum. Continuum models apply the assumption that the fine detail of the location of any particles or groups can be ignored, and instead systems are modelled as continuum regions. For example, implicit solvent models ignore the location of each individual solvent molecule, and instead represent the complete solvent as a fuzzy dielectric continuum. Equally, continuum models of a cell membrane ignore the individual locations of each lipid molecule, and instead model the membrane as a homogeneous elastic sheet. By ignoring particles, and instead modelling biological systems as continuous fields or homogeneous assemblies, continuum models are able to simulate the largest length scales and longest timescales of any of the four levels.

These four levels of biomolecular modelling are each well-suited to modelling phenomena at the length and timescales for which they were designed. However, what makes biology work, and what makes it scientifically interesting, is the interplay and flow of information across the different length and timescales. Simulations at any one of these biomolecular modelling levels cannot represent these complex, multiscale biological phenomena on their own, and so methods that combine different levels of biomolecular model must be sought. Multiscale modelling, in which calculations at multiple length and/or timescales are combined into a single simulation, is now becoming popular, and its development is the focus of significant research effort. Multiscale modelling is not new: for example, combined QM/MM methods and MM/continuum implicit solvent methods have been used for over 30 years, and multiscale methods have a rich heritage of applications in the fields of materials modelling and nanomaterials [4], and modelling fluid and gas flow [5,6]. Recently, there has been a huge increase in the development and application of multiscale methods for biomolecular modelling. This review focuses on these developments, in particular the application of multiscale methods to biomolecules, covering the period from 2005 to 2007. Coveney [7] has produced a review of biological multiscale modelling that covers the period up to 2005.

Before starting this review, there first needs to be a definition of what is meant by a multiscale method. There are several different definitions that vary depending on the type of coupling between the different modelling levels. This review will adopt perhaps the broadest definition of a multiscale method, namely that it is any method that involves a flow of information from one modelling level to another. By definition, if there is a flow of information from one level to another, then there must be an interface between the levels through which this information will flow. Throughout this review it will become clear that there are four main classes of interface:

1. One-way, bottom-up interfaces. These involve a one-way, often one-time transfer of information from a lower level of modelling to a higher level. Examples include using a QM calculation to parameterize an MM forcefield, or using an MM simulation to parameterize a CG potential.

2. One-way, top–down interfaces. These involve a one-way transfer of information from a higher level of modelling to a lower level. Examples include using a CG model to reconstruct an atomistic model of a protein, or using a continuum model to provide the boundary conditions for an atomistic simulation.

3. Two-way parallel interfaces. These involve a two-way dynamic transfer of information between two simulations running in parallel at two different modelling levels. An example is running both an MM and a CG simulation of a system and using replica exchange [8–10] moves to exchange coordinates between the two levels.

4. Two-way embedded interfaces. These involve embedding a low modelling level region within a simulation at a higher level, e.g. embedding a QM model of a substrate and active site within an MM model of the enzyme, or embedding an MM model of an ion channel within a CG model of a membrane.

This review is therefore organised according to the different interfaces between levels (QM/MM, atomistic/CG, particle/continuum), and then by the different classes of interface that are used between these levels.
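The coarse grain idea from the Introduction, in which each residue of a protein is portrayed as a single bead, can be illustrated with a minimal sketch. The input format, the function name `coarse_grain`, and the choice to place each bead at the residue's centre of mass are assumptions made for illustration only; the text above does not prescribe where a bead sits, and published CG models differ on this point.

```python
from collections import defaultdict


def coarse_grain(atoms):
    """Collapse atoms into one bead per residue (illustrative mapping only).

    `atoms` is a list of (residue_id, mass, (x, y, z)) tuples; this input
    format is hypothetical. Each bead carries the summed mass of its residue
    and is placed at the residue's centre of mass (an assumed convention).
    Returns {residue_id: (bead_mass, (x, y, z))}.
    """
    total_mass = defaultdict(float)
    weighted_sum = defaultdict(lambda: [0.0, 0.0, 0.0])
    for residue_id, mass, xyz in atoms:
        total_mass[residue_id] += mass
        for axis in range(3):
            weighted_sum[residue_id][axis] += mass * xyz[axis]
    return {
        residue_id: (m, tuple(w / m for w in weighted_sum[residue_id]))
        for residue_id, m in total_mass.items()
    }


# Two toy "residues" with made-up masses and coordinates.
atoms = [
    (1, 12.0, (0.0, 0.0, 0.0)),
    (1, 12.0, (2.0, 0.0, 0.0)),
    (2, 16.0, (5.0, 1.0, 0.0)),
    (2, 1.0, (5.0, 1.0, 1.7)),
]
beads = coarse_grain(atoms)
for residue_id, (mass, position) in sorted(beads.items()):
    print(residue_id, mass, position)
```

Four atoms become two beads, which is the loss of detail, and the gain in reach, that separates the CG level from the atomistic MM level.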

2. Interfacing QM with MM models

The most accurate physical description of atoms and molecules is provided by quantum chemical calculations. Quantum chemical calculations are capable of correctly predicting the energetics and conformations of small molecules from first principles, using broadly applicable approximations (e.g. the Born-Oppenheimer approximation) and nothing more than fundamental physical constants as input [11]. Quantum chemical calculations model electrons as a quantum mechanical (QM) wavefunction that interacts with the electrostatic potential field created by the atomic nuclei of the molecule. QM provides the most exact physical model of matter at the atomic scale, and QM calculations are capable of predicting chemical bonding and chemical reactivity. There are several recent reviews of quantum chemical methods [11–13], and QM methods may now be used across a length and timescale that ranges from modelling the femtosecond interactions of infra-red laser light with carbon monoxide [14], to modelling the sub-nanosecond dynamics of a complete protein [15]. There is a range of QM methods available with varying degrees of approximation, from fast semi-empirical Hamiltonians such as AM1 [16] or PM3 [17,18] to highly accurate coupled cluster methods such as LCCSD(T) [19]. Because QM methods include an explicit representation of electrons, they are able to model chemical processes such as charge transfer, bond breaking and formation, and changes of molecular polarisation. However, the high computational expense of QM methods prevents their application to the large length and timescales that are required to understand complex biomolecular processes.

MM methods provide a simpler representation of molecules, in which the fine detail of the electrons is represented implicitly via partial charges and, in some cases, molecular polarisabilities [20,21]. MM models represent molecules as a collection of atoms interacting through classical potentials. There are several MM models (or forcefields), and they differ in the functional forms of the interaction potential used between atoms, and in the means by which these interaction potentials are parameterized. Several good recent reviews of MM forcefields have been produced [22–26]. Several MM forcefields have been developed for application to biomolecular systems. The most popular of these are the CHARMM [27], AMBER [28,29], GROMOS [30–32] and OPLS [33,34] forcefields. Each of these forcefields has evolved over time, with different versions produced periodically. However, despite the proliferation and evolution of MM forcefields for biomolecular modelling, their functional forms all remain broadly similar. Each atom is modelled as a single point in space. Pairs of atoms in separate molecules interact through a pairwise non-bonded potential, $E_{\mathrm{nb}}$, which depends only on the distance between the atoms, $r$. The electrostatic part of this non-bonded potential, $E_{\mathrm{elec}}$, is modelled using Coulomb’s law, assigning a fixed partial charge to each atom in the molecule. The non-bonded potential must also model the van der Waals (vdW) forces between the molecules. These result from the combination of two effects: the Pauli repulsion that arises from the inability of two electrons to occupy the same space with the same set of quantum numbers, and the attractive dispersion (London) forces, whose physical basis lies in the ‘instantaneous dipoles’ that result from the wavefunctions of close atoms moving in phase. These vdW forces have their origin in the behaviour of electrons, which are not explicitly modelled in MM forcefields. These forces must therefore be approximated. The most common approximation used for biomolecular applications is the Lennard-Jones (LJ) potential. This approximates the vdW interactions using a 12–6 repulsive–attractive potential, $E_{\mathrm{LJ}}$,

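$$E_{\mathrm{LJ}}(r_{ij}) = 4\varepsilon_{ij}\left[\left(\frac{\sigma_{ij}}{r_{ij}}\right)^{12} - \left(\frac{\sigma_{ij}}{r_{ij}}\right)^{6}\right], \qquad (1)$$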

where $E_{\mathrm{LJ}}(r_{ij})$ is the Lennard-Jones energy between atom $i$ and atom $j$, $r_{ij}$ is the distance between the pair of atoms, and $\sigma_{ij}$ and $\varepsilon_{ij}$ are parameters that are tuned to reproduce the strength of the vdW forces between this pair of atoms, often by fitting to macroscopic properties. Note that this is a pairwise potential that acts only between pairs of atoms. This is despite the fact that, unlike the permanent electrostatic forces, vdW forces are not pairwise in nature. Indeed, while permanent electrostatic forces are pairwise, MM forcefields use Coulomb’s law and fixed atomic partial charges to model both the permanent electrostatics of the molecule and, implicitly, its polarisation. Charge polarisation is also not a pairwise phenomenon. The non-bonded potential energy between two molecules is given by the sum of the Coulomb and LJ energies between all pairs of atoms in the molecules. It is therefore an effective pair potential, as the derivation of the partial charges and LJ parameters must account for the errors implicit in only using a pairwise sum over atoms, and must therefore include 3-, 4- to n-body effects implicitly.

As well as providing an explicit representation of polarisation and vdW forces, modelling the electronic detail of a molecule is what gives a correct representation of chemical bonding. As MM forcefields do not explicitly model electrons, they must include classical interaction potentials that mimic the effects of chemical bonding. MM forcefields include classical intramolecular interaction potentials, e.g. a harmonic bond potential, $E_{\mathrm{bond}}$, that acts between bonded atoms (called 1–2 atoms), a harmonic angle potential, $E_{\mathrm{angle}}$, that acts on the angle between a series of three bonded atoms (1–3 atoms), and a torsional potential, $E_{\mathrm{torsion}}$, that acts about the dihedral formed by four bonded atoms (1–4 atoms). Atoms that are separated by more than three bonds (1–5+ atoms) are treated as being non-bonded, and so their interaction energy is calculated using the sum of their Coulomb and LJ interaction energies. The total intramolecular energy of a molecule is then given by the sum of the bond energy between all 1–2 atoms, the sum of the angle energy between all 1–3 atoms, the sum of the torsion energy between all 1–4 atoms, and the sum of the non-bonded Coulomb and LJ energies between all pairs of 1–5+ atoms. The total energy of a system of molecules can then be calculated as the sum of the intramolecular energies of all of the molecules, together with the sum of the non-bonded potential energies between all pairs of molecules.

By using classical potentials, MM models allow for a very rapid evaluation of the energy and forces acting on each atom within a large biomolecular system. This rapid evaluation allows these forces and energies to be used by statistical conformational sampling methods, such as molecular dynamics (MD) [35,36] or Monte Carlo (MC) [36,37], to generate large ensembles of configurations of the system, from which macroscopic (thermodynamic) properties may be evaluated. However, by not explicitly modelling electrons, MM models struggle to model many chemically important phenomena, such as chemical bond breaking and formation, electronic polarisation and charge transfer. There is therefore a strong motivation to combine MM models with quantum mechanics (QM) calculations within a multiscale modelling framework, and combined QM/MM biomolecular simulation methods have a rich history of application and evolution since their original inception in the early 1970s [38,39].
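The classical energy terms described above are simple enough to sketch directly. The following is a minimal illustration of the Lennard-Jones potential of eqn (1), Coulomb's law with fixed partial charges, and a harmonic bond term, summed over all atom pairs of two molecules. Several details are assumptions that the text does not specify: the atom tuple layout, the Lorentz-Berthelot combining rules for mixing per-atom LJ parameters (some forcefields use a geometric mean for both), the approximate unit conversion constant (kcal mol^-1, Å and elementary charges), and the harmonic form without a factor of one half.

```python
import math

# Approximate Coulomb conversion factor for energies in kcal/mol with
# distances in angstroms and charges in units of e (assumed unit system).
COULOMB_CONSTANT = 332.06371


def lennard_jones(r, sigma, epsilon):
    """12-6 Lennard-Jones energy of eqn (1) for one atom pair."""
    s6 = (sigma / r) ** 6
    return 4.0 * epsilon * (s6 * s6 - s6)


def coulomb(r, qi, qj):
    """Coulomb energy between two fixed partial charges."""
    return COULOMB_CONSTANT * qi * qj / r


def harmonic_bond(r, k, r0):
    """Harmonic bond energy; the k*(r - r0)**2 convention is an assumption."""
    return k * (r - r0) ** 2


def nonbonded_energy(mol_a, mol_b):
    """Sum Coulomb + LJ energies over all atom pairs of two molecules.

    Each atom is a (charge, sigma, epsilon, (x, y, z)) tuple (hypothetical
    layout). Pair parameters use Lorentz-Berthelot mixing (assumed).
    """
    energy = 0.0
    for qi, sigma_i, eps_i, xyz_i in mol_a:
        for qj, sigma_j, eps_j, xyz_j in mol_b:
            r = math.dist(xyz_i, xyz_j)
            sigma = 0.5 * (sigma_i + sigma_j)
            epsilon = math.sqrt(eps_i * eps_j)
            energy += coulomb(r, qi, qj) + lennard_jones(r, sigma, epsilon)
    return energy


# Toy two-atom "molecules" with made-up parameters, 4 angstroms apart.
mol_a = [(-0.4, 3.2, 0.15, (0.0, 0.0, 0.0)), (0.4, 2.5, 0.05, (1.0, 0.0, 0.0))]
mol_b = [(0.4, 2.5, 0.05, (4.0, 0.0, 0.0)), (-0.4, 3.2, 0.15, (5.0, 0.0, 0.0))]
print(nonbonded_energy(mol_a, mol_b))
```

Each pair costs only a few floating point operations, which is why MM energies can be evaluated quickly enough for the conformational sampling methods mentioned above.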

Using a broad definition, QM/MM multiscale methods are those that involve a transfer of information across an interface between the QM and MM levels of modelling. There are several different types of interface in use, which fall into four categories:

1. One-way bottom-up methods. These involve a single transfer of information from a QM calculation to a classical simulation, e.g. by using QM calculations to parameterize the classical potentials used in MM forcefields.

2. Two-way dynamic parameterisation methods. These involve a dynamic transfer of information between separate classical and quantum calculations, e.g. using successive QM calculations to dynamically re-parameterize the classical potentials for the QM-treated atoms of an MM forcefield during a live simulation.

3. Two-way embedded methods. These involve embedding molecules, or parts of molecules, modelled using QM into a system of molecules modelled using MM, e.g. using QM to model a substrate, and MM to model the enzyme and solvent.

4. Two-way parallel methods. These involve running classical and quantum simulations in parallel, and dynamically sharing information between them at runtime.

Examples of each of these different types of interface, and recent developments in their methodology, will now be discussed in turn.

2.1 One-way bottom-up QM/MM interfacing methods

As described in the last section, molecular mechanics (MM) forcefields use classical potentials to calculate the interaction energy between pairs of atoms. Several types of interaction potential are required to fully describe the MM energy of a set of molecules:

1. Non-bonded potentials. These typically take the form of a Coulomb potential between non-bonded pairs of atoms to describe polarisation and permanent electrostatics, and a Lennard-Jones (LJ) potential between non-bonded atom pairs to describe the van der Waals (vdW) interactions.

2. Bonded potentials. These typically involve harmonic terms that are applied between 1–2 bonded and 1–3 bonded pairs of atoms, and cosine terms between 1–4 bonded pairs. These potentials try to model the effect of chemical bonding.

These classical interaction potentials must be parameterized, e.g. the magnitude of the partial charges on each atom in the molecule must be assigned, and the equilibrium bond length and size of the harmonic force constant must be attached to each bond. In the early biomolecular MM forcefields, these parameters were developed to produce molecular models that could reproduce known experimental properties of the bulk system. For example, several MM water models have been developed [26,40,41]. One of the earliest successful models, TIP3P [42], was parameterized such that simulations of boxes of TIP3P molecules reproduced known thermodynamic properties of water, such as the liquid density and heat of vaporisation. Such a parameterisation scheme is to be applauded, as it ties the molecular model closely to experiment. Many of the common MM models of amino acids were also developed by comparison to experiment, e.g. OPLS [33]. Indeed it is such a good scheme that even some modern water models, like TIP5P [43], are still parameterized in this way. However, it was quickly realised that parameterisation against experiment required large amounts of physical data that were simply not available for the novel molecules being conceived during rational drug design. The developers of biomolecular forcefields therefore created recipes that allowed the parameters of new molecules to be derived from quantum chemical calculations. One popular example of such a recipe is GAFF (generalised AMBER forcefield) [44], which is an MM forcefield for small drug-like ligands that is compatible with AMBER. GAFF uses generic atom types that are assigned to each atom in the drug-like molecule, e.g. aliphatic carbon or aromatic hydrogen. These atom types are used to assign LJ, bond, angle and dihedral parameters to the molecule from a large parameter library. The partial charges for the atoms are derived by performing an AM1 [16] semiempirical QM calculation, and calculating BCC [45,46] charges. In a very broad sense, this parameterisation is a multiscale method, as information (the charge distribution) from a QM calculation is transferred to an MM simulation via the parameterisation of the atomic partial charges. This multiscale parameterisation therefore represents a one-time, one-way flow of information up from the QM level to the MM level. A similar scheme is available for the OPLS forcefield [47], which uses CM1A charges [48] that are also derived from semiempirical AM1 QM calculations. Wang and Sandberg [49] used a more complex multiscale parameterisation to derive intramolecular CHARMM parameters for the interaction of DNA bonded to a gold surface. The parameters were calculated by fitting to density functional theory (DFT) QM calculations.
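As a concrete illustration of a one-way, bottom-up transfer of information, the sketch below fits the two parameters of a harmonic bond term (force constant and equilibrium length) to a scan of energies against bond length by least squares. It is not the procedure of any of the works cited above. The energies here are synthetic stand-ins generated from a known harmonic curve, not real QM output, and the function names and the `k*(r - r0)**2` convention are assumptions made for the example.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def fit_harmonic_bond(lengths, energies):
    """Least-squares fit of E(r) = e_min + k*(r - r0)**2 to scan data.

    Fits a quadratic a*x**2 + b*x + c in the centred variable x = r - mean(r)
    via the normal equations (solved with Cramer's rule), then converts the
    coefficients to (k, r0, e_min).
    """
    n = len(lengths)
    mean = sum(lengths) / n
    x = [r - mean for r in lengths]
    s = [sum(xi ** p for xi in x) for p in range(5)]  # s[p] = sum of x**p
    t = [sum(e * xi ** p for xi, e in zip(x, energies)) for p in range(3)]
    matrix = [[s[4], s[3], s[2]],
              [s[3], s[2], s[1]],
              [s[2], s[1], s[0]]]
    rhs = [t[2], t[1], t[0]]
    d = det3(matrix)
    coeffs = []
    for col in range(3):
        m = [row[:] for row in matrix]
        for row in range(3):
            m[row][col] = rhs[row]
        coeffs.append(det3(m) / d)
    a, b, c = coeffs
    return a, mean - b / (2.0 * a), c - b * b / (4.0 * a)


# Synthetic "QM scan": made-up reference values, used only to check the fit.
true_k, true_r0, true_e = 340.0, 1.09, -40.5
lengths = [true_r0 + 0.02 * i for i in range(-4, 5)]
energies = [true_e + true_k * (r - true_r0) ** 2 for r in lengths]

k, r0, e_min = fit_harmonic_bond(lengths, energies)
print(k, r0, e_min)
```

Once fitted, the two numbers `k` and `r0` are all that the MM level retains of the QM calculation, which is what makes the information flow one-time and one-way.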

2.2 Two-way dynamic parameterisation methods

QM calculations are now used routinely as the source of parameters for MM forcefields. Indeed this application is now so routine that most workers would not consider forcefield parameterisation to be an example of a multiscale method. However, there is a drawback with using QM calculations to provide MM forcefield parameters. The problem is that the information flow is only in one direction, from the QM to the MM level. This means that the QM-derived parameters for a molecule have to be very general, and be able to represent the molecule in a variety of conformations and environments. This is an unreasonable requirement, as it is clear that the charge distribution and polarisation of the molecule depend on both its conformation and its environment, e.g. whether it is in bulk solvent or whether it is bound to the active site of a protein. Because information only flows from the QM to the MM level, there is no mechanism that allows information about the environment and conformation of the molecule experienced during the MM simulation to be fed back to the QM calculation. A solution is to modify multiscale parameterisation so that there is a two-way interface between the QM and MM levels of modelling. There have been two recent applications that have made such a modification: one targeted at creating a QM/MM multiscale docking method [50], and another targeted at QM/MM multiscale free energy calculations [47,51,52].

Docking is one of the primary tools used during the process of rational drugdesign.53 The aim of docking is to predict the binding mode of a ligand with aprotein. Because docking calculations are typically used to study how libraries ofthousands of ligands bind to a protein, the calculations involved must be simple andefficient. This means that the interaction potentials used in docking tend to be basedon molecular mechanics forcefields.53 MM forcefields struggle to model the changesin polarisation upon protein-ligand binding, an effect that is thought to account foras much as 10–40% of the binding affinity.54 Multiscale docking methods thatattempt to use QM calculations to overcome this problem have therefore beendeveloped.50,55 Cho et al.50 have developed a QM/MM docking method that usesmultiscale parameterisation dynamically throughout a docking calculation. Theelectrostatic interaction energy between the ligand being docked and the protein is

18 | Chem. Modell., 2008, 5, 13–50

This journal is c The Royal Society of Chemistry 2008

Dow

nloa

ded

by S

tanf

ord

Uni

vers

ity o

n 23

Aug

ust 2

012

Publ

ishe

d on

19

Nov

embe

r 20

08 o

n ht

tp://

pubs

.rsc

.org

| do

i:10.

1039

/B60

8778

G

View Online

Page 7: [Chemical Modelling] Chemical Modelling Volume 5 ||

calculated using Coulomb’s law, with atomic partial charges placed on the protein and ligand atoms. Cho et al. first showed that the docking prediction is improved if the atomic partial charges on the ligand are derived from a QM calculation of the ligand in the bound geometry. They demonstrated this by selecting several test protein-ligand systems whose bound geometries were known from crystal structures available in the Protein Data Bank (PDB). They calculated the partial charges by performing a density functional theory (DFT) calculation of the ligand in the bound geometry. The wavefunction was polarised by embedding the partial charges from the protein within the Hamiltonian of the QM calculation. By using the bound geometry, and by embedding the partial charges of the protein, information about the environment of the ligand in the protein active site was made available to the QM calculation. The polarised wavefunction was used to obtain the molecular electrostatic potential (MEP) surface around the ligand. Partial charges were generated using an electrostatic potential fitting procedure to reproduce the QM MEP. Because these partial charges came from a QM calculation that had information about the bound geometry, it could be argued that these atomic partial charges were optimised for the bound geometry. Cho et al. demonstrated this⁵⁰ by running re-docking calculations in which the ligands were docked using both optimised and generic partial charges. The results demonstrated that docking calculations using the optimised charges were significantly more likely to rediscover the known experimental binding modes. To turn this observation into an effective docking algorithm, Cho et al. created an iterative dynamic reparameterisation algorithm:

1. Dock the ligand using the default partial charges taken from the docking MM forcefield.
2. Perform a QM calculation on the predicted binding mode to obtain optimised partial charges for the ligand.
3. Dock the ligand again, this time using the optimised partial charges.
4. Keep iterating until the charges and predicted binding mode converge to within a set limit.

By dynamically reparameterising the ligand throughout the docking calculation, Cho et al. allow information to flow both ways between the QM and MM levels. A similar idea has been developed by Jorgensen and co-workers to create a QM/MM multiscale method for free energy calculations.⁴⁷,⁵¹,⁵² Jorgensen and co-workers developed a method to obtain atomic partial charges efficiently from a QM calculation that were compatible with the partial charges from the standard OPLS all-atom forcefield.⁴⁷ The charges were calculated using the charge model 1 (CM1A)⁴⁸ analysis of an AM1 semiempirical QM calculation. However, as CM1A charges were parameterized to reproduce gas-phase dipole moments,⁴⁸ they had to be scaled by a factor of 1.2 so that an implicit account could be made for the extra polarising effects of polar solvents. Jorgensen and co-workers first used this method to perform QM/MM hydration free energy calculations. The solute molecule was modelled using QM (AM1), while the solvent molecules were modelled using MM (OPLS). The QM calculation was used to obtain the intramolecular energy of the solute. The QM calculation was also used to obtain the atomic partial charges on the solute using AM1/CM1A. These partial charges were used to calculate the electrostatic interaction energy between the solute and solvent via Coulomb’s law. The LJ equation was used to obtain the vdW interaction between the solute and solvent using pre-assigned OPLS ε and σ LJ parameters. The QM calculation in this method was used only to dynamically re-parameterize the MM atomic partial charges during the simulation. The only information flow from the MM to QM level was the change in conformation of the solute. The solvent environment around the solute was not passed explicitly, as it was not included within the QM calculation. The effect of the solvent was only felt implicitly in the QM calculation via the application of the scale factor.
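The solute-solvent interaction just described (scaled QM-derived charges in Coulomb’s law, plus an LJ term) can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of Jorgensen and co-workers: the function names, the tuple layout for atoms, the unit conversion constant and the geometric-mean combining rules for ε and σ are all choices made for this example.

```python
import math

# Assumed unit convention: energies in kcal/mol, distances in angstroms,
# charges in units of e. 332.06 is the usual Coulomb conversion factor.
COULOMB = 332.06


def scale_charges(gas_phase_charges, scale=1.2):
    """Scale QM-derived charges to mimic solvent polarisation implicitly."""
    return [scale * q for q in gas_phase_charges]


def pair_energy(qi, qj, eps_i, eps_j, sig_i, sig_j, r):
    """Coulomb plus Lennard-Jones energy for one solute-solvent atom pair."""
    coulomb = COULOMB * qi * qj / r
    eps = math.sqrt(eps_i * eps_j)  # geometric-mean combining rule (assumed)
    sig = math.sqrt(sig_i * sig_j)  # geometric-mean combining rule (assumed)
    sr6 = (sig / r) ** 6
    return coulomb + 4.0 * eps * (sr6 * sr6 - sr6)


def solute_solvent_energy(solute, solvent):
    """Sum pair energies; each atom is (x, y, z, charge, epsilon, sigma)."""
    total = 0.0
    for (xi, yi, zi, qi, ei, si) in solute:
        for (xj, yj, zj, qj, ej, sj) in solvent:
            r = math.dist((xi, yi, zi), (xj, yj, zj))
            total += pair_energy(qi, qj, ei, ej, si, sj, r)
    return total
```

In this sketch the solvent never enters the QM calculation: the only place it influences the solute charges is the fixed `scale` argument, which mirrors the implicit treatment described in the text.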
Despite the lack of explicit modelling of the solvent environment in the QM calculation, Jorgensen and co-workers have successfully used this method to study solution-phase Diels-Alder reactions,⁵⁶ and to study the enzyme-catalysed Claisen rearrangement reaction of chorismate to prephenate.⁵¹ Jorgensen and co-workers have since adapted this method⁵⁷–⁵⁹ to use the PDDG/PM3 semiempirical QM Hamiltonian,⁶⁰ using the CM3⁶¹ method to extract charges, which are then scaled by a factor of 1.14,⁶² again to provide an implicit account for the extra polarising effects of the solvent.
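To close this discussion of periodic reparameterisation, the four-step docking loop of Cho et al. described earlier in this section can be written as a short control loop. The sketch below is only a skeleton under stated assumptions: `dock`, `derive_qm_charges` and `pose_change` are hypothetical callables supplied by the user (standing in for a docking program, a QM charge-fitting step and a pose comparison such as an RMSD), and the tolerance and cycle limit are illustrative.

```python
def iterative_docking(ligand, default_charges, dock, derive_qm_charges,
                      pose_change, tol=0.01, max_cycles=20):
    """Re-dock with QM-derived charges until charges and pose stop changing.

    dock(ligand, charges) -> pose                  (hypothetical callable)
    derive_qm_charges(ligand, pose) -> charges     (hypothetical callable)
    pose_change(pose_a, pose_b) -> float           (hypothetical callable)
    """
    charges = list(default_charges)
    pose = dock(ligand, charges)                        # step 1
    for cycle in range(1, max_cycles + 1):
        new_charges = derive_qm_charges(ligand, pose)   # step 2
        new_pose = dock(ligand, new_charges)            # step 3
        charge_shift = max(abs(a - b) for a, b in zip(new_charges, charges))
        pose_shift = pose_change(new_pose, pose)
        charges, pose = new_charges, new_pose
        if charge_shift < tol and pose_shift < tol:     # step 4
            return pose, charges, cycle
    raise RuntimeError("charges and pose did not converge")
```

A loop of this kind has a single trajectory, so a poor first pose can bias every later cycle; that is the weakness the parallel, survival-of-the-fittest variant discussed in the next section was designed to address.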

2.3. Two-way embedded methods

Periodic multiscale reparameterisation, i.e. reparameterisation performed periodically during a simulation, provides a two-way conduit for information flow between the QM and MM levels of modelling. However, the coupling between levels is not particularly strong. The interchange of information between levels occurs only periodically, which can lead to the MM level falling out of step with the QM. This problem was experienced during the development of the QM/MM docking method of Cho et al.⁵⁰ If the ligand was initially docked in a poor configuration, then the partial charges derived for that conformation could bias the subsequent docking runs to rediscover the poor configuration in preference to the correct binding mode. Cho et al. developed a survival of the fittest algorithm⁵⁰ that ran multiple docking runs in parallel, thereby preventing one poor result from biasing the rest of the calculation.

A closer coupling between the QM and MM levels can be achieved using an embedded interface method. A QM region of the biomolecular system is embedded within a larger MM simulation. One of the primary application areas for embedded QM/MM methods is computational enzymology (the computational modelling of enzyme-catalysed reaction mechanisms), where typically a QM model of the substrate and part of the enzyme active site is embedded within an MM model of the rest of the enzyme and solvent.⁶³ Embedding a QM model within an MM simulation creates a dynamic and permanent interface between the two levels, with information flow across that interface having to be managed for each configuration of the simulation. The ONIOM method, developed by Morokuma and co-workers,⁶⁴ provides such an interface via the use of multilevel corrections. The ONIOM method partitions the system into multiple layers. For example, consider a two-layer system, where a low-level QM region, A, is embedded within a high-level MM region, B. The energy of both regions, A + B, is first calculated using only the MM Hamiltonian, giving $E_{\mathrm{MM}}(A+B)$. This total energy is corrected by calculating the difference in energy between the QM and MM energies of the QM region, $E_{\mathrm{QM}}(A) - E_{\mathrm{MM}}(A)$. The total ONIOM energy of the system is therefore $E_{\mathrm{MM}}(A+B) + E_{\mathrm{QM}}(A) - E_{\mathrm{MM}}(A)$. The generalisation of this algorithm to multiple levels is straightforward, e.g. a system can be divided into an ab initio QM region, A, which is embedded within a semiempirical QM region, B, which is itself embedded within an MM system, C. The ONIOM energy in this case would be $E_{\mathrm{MM}}(A+B+C) + [E_{\mathrm{semiempirical}}(A+B) - E_{\mathrm{MM}}(A+B)] + [E_{\mathrm{ab\,initio}}(A) - E_{\mathrm{semiempirical}}(A)]$.

The ONIOM method allows the facile combination of QM and MM levels of modelling. However, the electrostatic interaction between the QM and MM regions in the original ONIOM implementation is handled at the MM level only, via $E_{\mathrm{MM}}(A+B)$, based on partial atomic charges derived from the QM calculation. The use of this method, called classical or mechanical embedding,⁶⁵ means that information about the electrostatics of the system flows only from the QM up to the MM level. There is no conduit by which the electrostatic environment of the MM atoms is able to flow down to the QM region, to polarise the QM wavefunction.⁶⁵

An alternative method of interfacing QM and MM calculations, called electronic embedding, solves this problem. In electronic embedding, the partial charges of the MM atoms are embedded within the QM Hamiltonian. This allows the QM wavefunction to be polarised by the MM atoms, thereby providing a two-way conduit for electrostatic information between the QM and MM regions. The ONIOM method has since been extended to use electronic embedding,⁶⁵ thereby overcoming one of the fundamental weaknesses of the algorithm.
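The subtractive ONIOM energy expressions given above for two and three layers reduce to simple arithmetic once the component energies are known. The sketch below only combines numbers that a QM or MM program would supply; the function and argument names are hypothetical and chosen for this example.

```python
def oniom2(e_mm_ab, e_qm_a, e_mm_a):
    """Two-layer energy: E_MM(A+B) + E_QM(A) - E_MM(A)."""
    return e_mm_ab + (e_qm_a - e_mm_a)


def oniom3(e_mm_abc, e_semi_ab, e_mm_ab, e_abinitio_a, e_semi_a):
    """Three-layer energy: MM for A+B+C, corrected first by the
    semiempirical-minus-MM difference for A+B, then by the
    ab initio-minus-semiempirical difference for A."""
    return (e_mm_abc
            + (e_semi_ab - e_mm_ab)
            + (e_abinitio_a - e_semi_a))
```

A useful sanity check is that each correction vanishes when the two methods agree on the embedded region, e.g. `oniom2(e, x, x)` returns `e`.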


Electronic embedding has a long and rich history of application in other QM/MM schemes. The first application of electronically embedded QM/MM to a biomolecular system was in the ground-breaking work of Warshel and Levitt in 1976.³⁹ They developed the method to study the reaction mechanism of hen egg-white lysozyme. Warshel has recently produced a detailed, clear and very interesting review⁶³ of the algorithm and of the developments in QM/MM methodology since his pioneering work in the 1970s. Acceptance of this approach took a long time,⁶³ but there is now a large body of QM/MM applications that use such methodology (see one of the very many reviews of QM/MM methods available in the literature⁶⁶–⁶⁹). The underlying theory of this method has been covered by many different authors⁶⁸,⁶⁹ so only a brief description will be given here. The biomolecular system is divided into a QM region, for example a substrate, and an MM region, e.g. an enzyme and surrounding solvent molecules. The total energy of the system is the sum of the energy of the MM region, evaluated using a standard MM forcefield, the energy of the QM region, and the QM/MM interaction energy between the two regions. The QM/MM interaction energy is split into two parts: an electrostatic part and a vdW part. The vdW part is calculated by assigning LJ parameters to all of the QM atoms and calculating the interaction between the QM and MM atoms using the LJ equation. The electrostatic part of the QM/MM interaction is calculated by bundling it together with the calculation of the QM energy of the QM region. This is achieved by embedding within the QM Hamiltonian the locations and partial charges of all of the MM atoms that are within a pre-determined cutoff distance of the QM atoms. Normally the MM atoms are represented as point charges in the QM calculation, but Gaussian charge distributions, with a width similar to the covalent radius of each MM atom, may be used instead.⁷⁰ The MM partial charges act to polarise the QM wavefunction of the QM atoms, and therefore the evaluation of the energy of this wavefunction returns both the intramolecular energy of the QM atoms and the electrostatic interaction between the QM and MM atoms.
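The bookkeeping of this additive scheme can be sketched in a few lines. The example below selects the MM point charges that would be passed to the QM program (those within a cutoff of any QM atom) and assembles the total energy from its three parts. It does not perform a QM calculation: the polarised QM energy is assumed to come from an external program, and all names, the tuple layout and the cutoff value are hypothetical.

```python
import math


def embedded_charges(qm_coords, mm_atoms, cutoff):
    """Return the MM point charges lying within `cutoff` of any QM atom.

    qm_coords: list of (x, y, z); mm_atoms: list of (x, y, z, charge).
    """
    selected = []
    for (x, y, z, q) in mm_atoms:
        if any(math.dist((x, y, z), qm) <= cutoff for qm in qm_coords):
            selected.append((x, y, z, q))
    return selected


def total_energy(e_mm_region, e_qm_polarised, e_vdw_qm_mm):
    """Additive QM/MM energy. e_qm_polarised already contains the QM/MM
    electrostatic interaction, because the selected MM charges were
    included in the QM Hamiltonian."""
    return e_mm_region + e_qm_polarised + e_vdw_qm_mm
```

Because the electrostatic coupling is folded into `e_qm_polarised`, adding a separate Coulomb term between the QM and embedded MM atoms would count that interaction twice.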
The simple split of the QM/MM interaction into electrostatic and vdW parts is, however, only possible if the interface between regions lies between molecules, i.e. all molecules are either QM or MM, and there are no molecules that sit across the interface. It is desirable (e.g. within computational enzymology) for a single molecule to be able to bridge this interface, e.g. while the majority of an enzyme is modelled at the MM level, it is usually necessary to represent some (e.g. catalytic) active site residues at the QM level. The problem with having a single molecule straddle the interface is that the QM/MM interaction energy, as well as modelling electrostatic polarisation, must now also include terms that account for the chemical bonding between atoms modelled at the QM level and atoms modelled at the MM level. This is a particular problem with the QM side of the calculation, as when the boundary bisects a covalent bond, the electron density is terminated abruptly at the end of the QM region and electrons of the bonded atom are missing, potentially leading to unpaired electrons.⁶⁶ The three most popular methods for solving this problem are the link atom method,⁶⁶,⁷¹,⁷² the local self-consistent field (LSCF) method⁷³,⁷⁴ and the generalized hybrid orbital (GHO) method.⁷⁵,⁷⁶

The “dummy junction atom” or link atom approach introduces so-called link atoms to satisfy the valence of the atoms on the QM side of the QM/MM interface.⁶⁶,⁷¹,⁷² Usually this atom is a hydrogen, but other atom types have also been used, e.g. halogens such as fluorine or chlorine.⁷⁷ The link atom method can be used with both the Warshel-type QM/MM methods and ONIOM methods.⁶⁵,⁷⁸ The link atom method has been criticised because it introduces extra unphysical atoms to the system, which come with associated extra degrees of freedom. Another problem is that a C–H bond is clearly not exactly chemically equivalent to a C–C bond. Despite these problems, the simplicity of the link atom method means that it is used widely in the QM/MM modelling of proteins and other biological molecules.⁷⁹
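One common way to position a hydrogen link atom is to place it on the line joining the QM boundary atom to the MM boundary atom, at a fixed distance from the QM atom. The sketch below illustrates that geometric step only. It is an assumption-laden example: the function name is hypothetical, the default distance of 1.09 Å is a typical C–H bond length chosen for illustration, and individual QM/MM codes may position and constrain link atoms differently.

```python
import math


def place_link_atom(qm_atom, mm_atom, bond_length=1.09):
    """Place a link atom along the QM->MM bond vector.

    qm_atom, mm_atom: (x, y, z) of the bonded boundary atoms.
    bond_length: distance of the link atom from the QM atom (assumed value).
    """
    direction = [m - q for q, m in zip(qm_atom, mm_atom)]
    norm = math.sqrt(sum(c * c for c in direction))
    if norm == 0.0:
        raise ValueError("boundary atoms coincide")
    return tuple(q + bond_length * c / norm
                 for q, c in zip(qm_atom, direction))
```

Tying the link atom to the two boundary atoms in this way means that its position is fully determined by theirs, which is one route to avoiding the extra degrees of freedom criticised in the text.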


Zhang et al.⁸⁰,⁸¹ solved these problems with their pseudobond method. If X and Y are the bonded QM and MM boundary atoms, respectively, then the link atom method effectively replaces the Y atom with a hydrogen, thereby changing X–Y into X–H. This has the problem that H is not necessarily chemically similar to Y, and also the X–H bond length may not be the same as the X–Y bond length. In the pseudobond method, Y is instead replaced by Y_eff. This is a boundary atom with one free valence that has a parameterized effective core potential that mimics the strength of the X–Y bond. This method has been tested successfully⁸⁰ with Hartree-Fock (HF), density functional theory (DFT) and MP2⁸² QM/MM calculations. Antes and Thiel⁸³ have also introduced a conceptually similar approach, which they call “adjusted connection atom”, that works at the semiempirical QM level. Parameterisation of these effective link atoms is not straightforward, as they must minimally perturb the electron density compared to a QM calculation of the entire molecule. Rothlisberger and co-workers have developed a sophisticated scheme that derives parameters for these atoms via density functional perturbation theory.⁸⁴

A second approach to define the bonding interface between the QM and MM regions is the local self-consistent field (LSCF) algorithm developed by Rivail and co-workers.⁷³,⁷⁴ In the LSCF method the bonds between the QM and MM atoms are represented by strictly localised bond orbitals, which are parameterized by separate QM calculations on small molecules.⁷⁵ These localised orbitals are assumed to be transferable to the protein system, and are used, and kept constant, throughout the self-consistent field (SCF) QM calculation. An elegant feature of the LSCF method is that it does not require the use of link atoms, and a comparison of the LSCF and link atom methods⁸⁵ showed that both give equivalent energy results. However, the parameters for the localised bond orbitals have to be determined for each new system studied.⁷⁵ Fornili et al.⁸⁶ have recently shown that it is possible to use frozen core orbitals on the MM frontier atom within the LSCF scheme. This provides an explicit description of the core electrons of the MM frontier atom, thereby improving the physical description of the interface and reducing the error in the calculation.

While the LSCF method is elegant, and has been shown to work well, the parameters for the localised orbitals are not very portable. Gao et al. addressed this problem by developing generalised hybrid orbitals (GHOs).⁷⁵,⁷⁶ The QM boundary atom at the QM/MM interface has the same standard valence s and p orbitals as all of the other non-hydrogen atoms in the QM region. These four s and p orbitals are transformed into a set of orthogonal hybrid orbitals, which can be defined by the bound geometry of the atom.⁷⁵ These hybrid orbitals are used, along with the atomic orbitals of the QM region, to determine the QM energy. However, only one hybrid orbital points along the bond between the QM and MM boundary atoms, and it is this active hybrid orbital that needs to be optimised. This one active hybrid orbital, together with all of the atomic orbitals from the other QM atoms, forms the active set that is optimised during the SCF calculation. The remaining three hybrid orbitals, called auxiliary orbitals, act, together with the nuclear charge, to generate an effective core potential for the QM boundary atom. Gao et al. realised that these auxiliary orbitals may be parameterized to mimic the effective core potential for the active electrons from the MM region. Therefore, rather than parameterising the charge density of the hybrid orbitals for each specific system, as is the case for the LSCF method, they instead optimise the semiempirical parameters for the boundary atom to reproduce the bonding properties of full QM systems. As a result, the parameters for this GHO method are expected to be general and transferable in the same way as all the semiempirical parameters.⁷⁵ The method has since been extended by Gao and co-workers⁸⁷ to work with the self-consistent charge density-functional tight binding (SCC-DFTB) method.⁸⁸ The accuracy and efficiency of SCC-DFTB is making it a popular choice to model the QM region, and its use in computational enzymology has increased greatly in recent years.⁸⁸ Gao and co-workers⁷⁶,⁸⁹ have also developed a GHO method that is suitable for ab initio Hartree-Fock (HF)⁸⁹ and density functional theory (DFT) calculations.⁷⁶

König et al.⁹⁰ recently compared several different algorithms to model chemical bonds between QM and MM atoms, and concluded that while none of the approaches was perfect, error cancellation meant that enzyme-catalysed reaction free energies using an SCC-DFTB QM region were only marginally affected by the choice of algorithm, if total charge was conserved during the reaction. The conclusion contrasts with a comparison study in which we were involved,⁹¹ in which we compared the link atom and GHO boundary methods. While we found that both methods, when properly applied, can lead to similar behaviour, the inclusion of conformational sampling amplified the effects of the differences between the methods. This led to the two methods returning different reaction free energy profiles despite being applied to the same enzyme system. It is important to make clear that while the same enzyme system was used, the QM region was not the same for the two boundary methods. This was because of the different constraints on the partitioning of the two methods (the GHO method requires that the QM atom bound to an MM atom must be an sp3 carbon⁹¹). The origin of the observed difference may well be in the different QM and MM systems, rather than in the partitioning schemes themselves. Despite these questions over the fine detail of how QM/MM methods are applied, their development and application has now matured to a point where they can provide near-quantitative results for activation enthalpies and free energies of reaction.⁹² It is now possible to perform electronic structure calculations on large systems approaching chemical accuracy, thus allowing quantitative studies of reaction mechanisms in enzymes.⁹²

2.4. Two-way parallel QM/MM methods

Quantum mechanics calculations are computationally demanding. This has caused difficulties with their application to the calculation of thermodynamic properties, such as free energies. This is because thermodynamic properties are calculated as an average over a large ensemble of conformations of a system. Sampling methods, such as molecular dynamics (MD)³⁵,³⁶ or Monte Carlo (MC),³⁶,³⁷ must be used for the rigorous generation of such ensembles. The computational expense of QM calculations means that it is only practical to generate small ensembles via standard MD or MC, e.g. current methods are limited to picoseconds of molecular dynamics. Two-way parallel QM/MM interfaces have therefore been developed in an attempt to overcome this problem. We use the term parallel methods to mean those that run a QM or QM/MM simulation in parallel with a standard MM simulation, using only periodic exchange of information between the two simulation levels.

Warshel and co-workers developed a successful QM/MM parallel method in a set of pioneering papers in the late 1990s.⁹³,⁹⁴ The aim of this method was to calculate the free energy difference between two systems, A and B. For example, system A could be a substrate bound to an enzyme, while system B could be the transition state. The free energy difference between these two corresponds to the activation free energy of the enzyme-catalysed reaction. Warshel and co-workers calculated the relative free energy of A and B by first using a molecular mechanics type potential. Because an MM potential was used, molecular dynamics sampling was efficient, and therefore a large ensemble, and a well-converged relative free energy, were calculated. This relative free energy, $\Delta G_{\mathrm{MM}}(A \rightarrow B)$, can only be as good as the MM potential used during the calculation. Ideally, this free energy should be calculated using a QM or QM/MM representation of A and B, giving $\Delta G_{\mathrm{QM}}(A \rightarrow B)$. However, the computational expense of the QM calculation prevents the efficient generation of the large ensembles necessary to evaluate converged free energies. Rather than calculating $\Delta G_{\mathrm{QM}}(A \rightarrow B)$ directly, Warshel and co-workers used the MM ensembles to calculate the difference in free energy between the QM and MM representations of A and B. In essence, Warshel and co-workers calculated the free energy error associated with using the MM forcefield. By calculating these errors, Warshel and co-workers were able to correct $\Delta G_{\mathrm{MM}}(A \rightarrow B)$ so that it was formally equal to $\Delta G_{\mathrm{QM}}(A \rightarrow B)$⁹³,⁹⁴ (see Fig. 1). The correction free energies were calculated by generating ensembles for systems A and B using the MM model. The difference in energy between the QM and MM models was calculated for a subset of each ensemble, and the difference between these energies used as input to a single-step free energy perturbation (FEP)⁹⁵,⁹⁶ between the MM model (the FEP reference state) and the QM model (the FEP perturbed state). As long as the MM model is a good approximation of the QM model, i.e. the phase space overlap of the two models is good, then the average calculated via the FEP equation will converge to an accurate estimate of the correction free energy. The key advantage of this method is that all of the thermodynamic sampling is performed using only the MM model of the system. QM or QM/MM calculations are run in parallel with the MM sampling to estimate the correction free energies. The disadvantage of this method is the requirement of good overlap between the QM and MM models. Warshel and co-workers mitigate this disadvantage through the development of the empirical valence bond (EVB)⁹³,⁹⁴,⁹⁷ forcefield, which has been designed to give energies that are in good agreement with experiment and QM calculations. In addition, the EVB potential has been developed so that it can be used to study chemical reactions, something that is not possible using most of the biological MM forcefields. The EVB potential and the Warshel parallel QM/MM method have been very successful, and have been used to study a variety of systems.⁹⁸–¹⁰²
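The correction scheme described above can be sketched with the standard single-step FEP (Zwanzig) estimator, $\Delta G_{\mathrm{MM} \rightarrow \mathrm{QM}} = -k_{\mathrm{B}}T \ln \langle \exp[-(E_{\mathrm{QM}} - E_{\mathrm{MM}})/k_{\mathrm{B}}T] \rangle_{\mathrm{MM}}$, followed by closure of the free energy cycle. This is an illustrative sketch rather than the authors' code: the function names, the units (kcal/mol) and the default temperature are assumptions, and the energies are taken to be pre-computed for the same MM-generated configurations.

```python
import math

KB = 0.0019872041  # Boltzmann constant in kcal/(mol K)


def fep_correction(e_qm, e_mm, temperature=298.15):
    """Single-step FEP free energy from the MM model to the QM model,
    averaged over configurations sampled with the MM model."""
    kt = KB * temperature
    x = [-(q - m) / kt for q, m in zip(e_qm, e_mm)]
    x_max = max(x)  # shift exponents for numerical stability
    log_mean = x_max + math.log(sum(math.exp(v - x_max) for v in x) / len(x))
    return -kt * log_mean


def corrected_free_energy(dg_mm_ab, corr_a, corr_b):
    """Close the cycle: dG_QM(A->B) = dG_MM(A->B) + corr(B) - corr(A)."""
    return dg_mm_ab + corr_b - corr_a
```

If the QM and MM energies differ by the same constant in every sampled configuration, the estimator returns exactly that constant; the wider the spread of energy differences, the more the exponential average is dominated by a few configurations, which is the overlap problem noted in the text.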

Fig. 1 The free energy cycle⁹³,⁹⁴ used to calculate the QM/MM free energy difference between systems A and B, $\Delta G_{\mathrm{QM/MM}}(A \rightarrow B)$. The free energy difference between A and B is first estimated using an approximate potential (e.g. an MM potential), giving $\Delta G_{\mathrm{MM}}(A \rightarrow B)$. This is then corrected to the QM/MM value by calculating the free energy necessary to perturb system A from MM to QM/MM ($\Delta G_{\mathrm{MM} \rightarrow \mathrm{QM/MM}}(A)$) and the free energy to perturb system B from MM to QM/MM ($\Delta G_{\mathrm{MM} \rightarrow \mathrm{QM/MM}}(B)$).

Warshel and co-workers developed their method to avoid the problem of poor sampling of a QM or QM/MM Hamiltonian. MD methods are currently limited to picoseconds of QM/MM dynamics for typical biomolecular applications, even using relatively low levels of QM theory. Monte Carlo (MC) methods suffer from even greater problems. MC works by performing typically millions of small random moves of the biomolecular system, each of which is tested according to the change in energy associated with that move. MC sampling of a QM/MM Hamiltonian would potentially require millions of QM energy evaluations, which is impractical using current methods and computers. A second class of parallel QM/MM methods attempts to solve this problem. These methods use a novel Monte Carlo algorithm developed by Hastings in 1970.¹⁰³ This is a multiscale sampling method that uses MC sampling at one modelling level to generate an ensemble which is formally correct for a different modelling level. The algorithm works by creating a new type of Monte Carlo move. The move starts at configuration i. The energy of this configuration is evaluated using both the fast, high-level model, giving $E_{\mathrm{fast}}(i)$, and the slower, low-level forcefield, giving $E_{\mathrm{slow}}(i)$. A block of MC moves is then performed using the fast forcefield. This results in a new configuration, j. The energy of this configuration is evaluated using both forcefields, giving $E_{\mathrm{slow}}(j)$ and $E_{\mathrm{fast}}(j)$. These energies are used to test configuration j according to a new MC acceptance test. Configuration j is accepted into the ensemble if this test is passed (see Fig. 2). Otherwise the whole block of sampling is rejected and the simulation is reset to configuration i. The form of the MC test is such that even though the trial configurations are generated using the fast forcefield, they are accepted into the ensemble of the slow forcefield with the correct Boltzmann probability. This algorithm was popularised for applications to QM and QM/MM systems by Schofield and co-workers,¹⁰⁴,¹⁰⁶ who coined the phrase “molecular mechanics-based importance function” (MMBIF). One of the problems with this algorithm is that the acceptance ratio of the MC test will be low if there is poor overlap between the fast and slow forcefields. This is similar to the problem encountered in Warshel’s correction free energy method. Effort may therefore need to be spent optimising the fast forcefield such that it is a better match to the slow forcefield.

We have developed a parallel QM/MM method¹⁰⁵ that combines the advantages of both the Warshel and MMBIF algorithms. The method works by using the MMBIF method to generate QM/MM ensembles from which the Warshel correction free energies can be calculated in full. The correction free energy is calculated using thermodynamic integration (TI)¹⁰⁷,¹⁰⁸ over a fictitious $\lambda$ scaling parameter that maps from the QM model to the MM model. This $\lambda$ parameter allows the QM model to be changed over a series of windows into the MM model. The MMBIF algorithm can then be used to generate ensembles of conformations of the system at different values of $\lambda$, so that the gradient of the free energy with respect to $\lambda$ can be calculated at several points between the QM and MM models. These gradients can then be integrated across $\lambda$ to return the correction free energy. We overcome the problems associated with potentially poor overlap between the QM and MM models by using replica exchange⁸,⁹,¹⁰⁹ moves across the $\lambda$ coordinate during the simulation. Replica exchange moves are additional Monte Carlo moves that lightly couple multiple trajectories together. All of the MC simulations at different $\lambda$ values are run in parallel. Neighbouring pairs of simulations are tested periodically according to a replica exchange MC acceptance test. If this test is passed, then the $\lambda$ values of the neighbouring pairs are swapped. This has the effect of allowing each MC simulation to sample multiple $\lambda$ values during the simulation. This enhances convergence of the free energy averages. We have used this method¹⁰⁵ to calculate converged relative hydration free energies of water and methane, using an MP2 ab initio QM model of water and methane, solvated by a periodic box of MM waters. While this was a non-biological application, we are currently using this method to calculate QM/MM relative binding free energies of protein-ligand systems (using a DFT QM model of the ligand and an MM model of the protein and explicit solvent), and are planning to use it to perform some computational enzymology calculations.
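Two numerical ingredients of the method just described can be sketched compactly: integrating free energy gradients across λ, and testing a replica exchange swap between neighbouring λ windows. The code below is a generic illustration and not the published implementation. It assumes trapezium-rule quadrature for the TI integral and the standard Hamiltonian replica exchange criterion; the function and argument names are hypothetical.

```python
import math


def integrate_ti(lambdas, gradients):
    """Trapezium-rule integral of dG/dlambda over the sampled windows.
    The sign of the result follows the direction in which lambda runs."""
    total = 0.0
    for k in range(len(lambdas) - 1):
        width = lambdas[k + 1] - lambdas[k]
        total += 0.5 * (gradients[k] + gradients[k + 1]) * width
    return total


def swap_probability(e_a_xa, e_a_xb, e_b_xa, e_b_xb, kt):
    """Acceptance probability for swapping configurations xa and xb
    between Hamiltonians a and b (two neighbouring lambda values).
    e_a_xb is the energy of configuration xb under Hamiltonian a, etc."""
    delta = (e_a_xb + e_b_xa) - (e_a_xa + e_b_xb)
    if delta <= 0.0:
        return 1.0
    return math.exp(-delta / kt)
```

In a production calculation each gradient would itself be an ensemble average at fixed λ, so the quality of the integral depends on both the number of windows and the convergence of each average.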

Fig. 2 Application of the Metropolis-Hastings¹⁰³ algorithm to accelerate sampling of a system represented using a QM/MM Hamiltonian.¹⁰⁴,¹⁰⁵ The Monte Carlo move starts at configuration i. The energy of this configuration is evaluated using the target QM/MM Hamiltonian (giving $E_{\mathrm{QM/MM}}(i)$) and an approximate (MM) Hamiltonian (giving $E_{\mathrm{MM}}(i)$). Standard Metropolis Monte Carlo moves are then attempted from configuration i using only the approximate MM Hamiltonian, until after a set number of moves, the system is in configuration j. The energy of configuration j is evaluated using both the QM/MM and MM Hamiltonians (giving $E_{\mathrm{QM/MM}}(j)$ and $E_{\mathrm{MM}}(j)$). Configuration j is then accepted into the QM/MM ensemble according to the probability $\min\{1, \exp(-\Delta\Delta E / k_{\mathrm{B}}T)\}$, where $\Delta\Delta E = (E_{\mathrm{QM/MM}}(j) - E_{\mathrm{MM}}(j)) - (E_{\mathrm{QM/MM}}(i) - E_{\mathrm{MM}}(i))$.
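The acceptance test in the caption above translates directly into code. The sketch below is a minimal illustration under stated assumptions: `run_mm_block`, `e_qmmm` and `e_mm` are hypothetical callables standing in for a block of ordinary MM Monte Carlo moves and for the two energy functions, and `rng` is any source of uniform random numbers in [0, 1).

```python
import math
import random


def acceptance_probability(e_qmmm_i, e_mm_i, e_qmmm_j, e_mm_j, kt):
    """min{1, exp(-ddE/kT)} with ddE = (E_QM/MM(j) - E_MM(j))
    - (E_QM/MM(i) - E_MM(i))."""
    dde = (e_qmmm_j - e_mm_j) - (e_qmmm_i - e_mm_i)
    if dde <= 0.0:
        return 1.0
    return math.exp(-dde / kt)


def mmbif_move(config_i, run_mm_block, e_qmmm, e_mm, kt, rng=random.random):
    """One composite move: sample with the MM model, then accept or
    reject the whole block against the QM/MM model."""
    config_j = run_mm_block(config_i)  # block of standard MM MC moves
    p = acceptance_probability(e_qmmm(config_i), e_mm(config_i),
                               e_qmmm(config_j), e_mm(config_j), kt)
    return config_j if rng() < p else config_i
```

Only two QM/MM energy evaluations are needed per composite move, however long the MM block is; the price is that a long block between poorly matched models makes a large ΔΔE, and hence a rejection, more likely.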


3. Interfacing atomistic with coarse grain models

In mid-2007, Leontiadou, Mark and Marrink¹¹⁰ produced a paper in which they used atomistic molecular dynamics simulations to model the effects of ionic concentration on the transport of ionic species across a pore in a lipid membrane. This work involved several large simulations, involving 128 dipalmitoylphosphatidylcholine (DPPC) lipids (each containing 130 atoms) and about 6000 water molecules. This work pushed the limits of what is achievable with current atomistic molecular dynamics, and, by using approximations such as modelling long-range electrostatics using a reaction field, and using bond constraints such that a 5 fs integration timestep could be used, they were able to run several simulations of between 50 ns and 100 ns in length. Despite the impressive size of these simulations, they are still limited to biologically small length and time scales. 128 lipids is merely an 8 × 8 membrane bilayer, which is too small to model effects such as membrane curvature or membrane waves.¹¹¹,¹¹² 100 ns is also too short a time to capture events such as membrane protein aggregation or lipid raft formation within a membrane.¹¹¹

Coarse grain models provide a route to longer time and length scales in biomolecular simulations. Coarse grain models are a class of mesoscale model that work by grouping several atoms together and modelling them as a single interaction site. In effect, groups of atoms are smeared together into beads. For example, a coarse grain model could be constructed that represents an amino acid residue as a single bead, and a protein as a string of beads. The use of coarse grain models reduces the computational expense of a simulation, as coarse graining reduces the number of interaction sites. In addition, CG models contain fewer degrees of freedom, and use forcefields that lead to smoother potential energy surfaces. The smoother potential energy surface reduces the problems associated with frustration or non-ergodic trapping, thereby leading to improved sampling and a lower correlation time. Coarse graining also tends to remove the stiffest degrees of freedom from the model (e.g. the C–H bond vibrational modes), thereby allowing a CG model to use a larger timestep. All of these effects mean that CG simulations provide a route to modelling length and time scales that are far beyond those practically achievable via atomistic molecular dynamics. Coarse grain modelling is currently undergoing a renaissance, and there is now significant international effort being spent developing and applying coarse grain methods to model biological systems. It is not the purpose of this review to cover all of these recent developments in depth, so the interested reader is directed to several excellent modern reviews of the development and application of CG methods.¹¹³–¹¹⁵
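The idea of smearing groups of atoms into beads can be illustrated with the simplest possible mapping, in which each bead sits at the centre of mass of its atoms and carries their total mass. This is only one of several mapping choices used in practice, and the function name and data layout below are hypothetical.

```python
def coarse_grain(coords, masses, groups):
    """Map atoms onto beads.

    coords: list of (x, y, z) atomic positions.
    masses: list of atomic masses, same order as coords.
    groups: list of lists of atom indices, one list per bead.
    Returns a list of (bead_position, bead_mass) tuples.
    """
    beads = []
    for group in groups:
        bead_mass = sum(masses[i] for i in group)
        position = tuple(
            sum(masses[i] * coords[i][k] for i in group) / bead_mass
            for k in range(3)
        )
        beads.append((position, bead_mass))
    return beads
```

For example, mapping each amino acid residue to one bead would pass one index list per residue in `groups`, reducing the number of interaction sites from the number of atoms to the number of residues.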

Coarse grain models allow simulators to routinely access length and time scales that are not practically possible using atomistic modelling methods. However, in smearing out the atomistic detail, CG models run the risk of missing important atomistic effects, in much the same way that molecular mechanics models, in smearing out all of the electronic detail, can fail to model important electronic effects such as polarisation. There is now significant interest in interfacing atomistic and coarse grain models within a multiscale framework, so that this problem may be overcome. Just as there is significant variation in the type and strength of interaction in the different methods developed to interface QM and MM models, so too is there significant variation in the type and strength of interaction of the different methods of interfacing atomistic and CG models. The types of interface broadly fall into four categories, which are similar in nature and definition to those that have been developed for QM/MM methods:
1. One-way, bottom-up methods. These involve a single transfer of information from atomistic simulations or calculations to the CG simulation, e.g. by using an atomistic simulation to parameterize a CG model.
2. One-way, top–down methods. These involve a transfer of information from the CG simulation to the atomistic simulation, e.g. using a CG model to enhance
sampling, and then using an atomistic model to explore interesting conformations that are discovered.
3. Two-way parallel methods. These involve the running, in parallel, of both a CG and atomistic model, and dynamically exchanging information between them at runtime.
4. Two-way embedded methods. These involve the running of simulations in which an atomistic region (e.g. a membrane protein) is embedded within a CG model (e.g. a lipid membrane).

Examples of each of these different types of interface will now be presented and discussed in turn.

3.1 One-way, bottom-up interfacing methods

Coarse grain models can be interfaced using a one-way, bottom-up scheme. In this scheme, an atomistic model of the system (be it MM or QM) is used to parameterize a coarse grain model of the system. This idea is not new, and was the parameterisation method used by Levitt and Warshel116,117 when they first introduced coarse grain models in their pioneering 1970s papers. Levitt and Warshel’s 1975 paper introduced the first coarse grain model of a globular protein. It used two CG particles per residue: one that represented the α-carbon of an amino acid (Cα), and one that represented the side-chain atoms. A torsion potential acted about Cα particles, while a Lennard-Jones type potential acted between pairs of side chain particles. This CG model was built to represent bovine pancreatic trypsin inhibitor (BPTI), and was very successful, being able to correctly refold the protein starting from a completely denatured configuration. This work is interesting not only because it represents the first coarse grain biomolecular model, but also because it represents the first multiscale MM/CG simulation. Levitt and Warshel generated the parameters for their coarse grain model by averaging over interaction energies calculated using an atomistic potential. The parameters for the interaction potential between side chain particles were calculated by assuming that the side chains had spherical symmetry. The effective potential between identical side chain particles was calculated at various distances apart by summing the interaction energies of all of the atoms in one sphere with all of the atoms in the other sphere using a molecular mechanics style potential. This interaction potential between like particles was then used to parameterize a Lennard-Jones type function, and the parameters between different side chain particles were then obtained using a geometric combining rule. The torsion potential between Cα atoms was also calculated by fitting to atomistic calculations, with the torsion potential between a pair of residues based on a time average of the energy calculated from atomistic simulations of a set of dipeptides.
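The step from like-particle to unlike-particle parameters can be sketched as follows. The text gives neither the exact functional form nor the parameter values used by Levitt and Warshel, so this sketch assumes a standard 12-6 Lennard-Jones function and a geometric mean applied to both ε and σ; the function names are hypothetical.

```python
import math


def lennard_jones(r, epsilon, sigma):
    """12-6 Lennard-Jones energy at separation r (ASSUMED functional form)."""
    sr6 = (sigma / r) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def geometric_combine(eps_i, sigma_i, eps_j, sigma_j):
    """Unlike-pair parameters as geometric means of the like-pair parameters."""
    return math.sqrt(eps_i * eps_j), math.sqrt(sigma_i * sigma_j)
```

With such a rule, only the like-particle potentials need to be fitted to the summed atomistic energies; every cross term then follows without further fitting.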

3.1.1. Coarse grain models of lipid membranes. Levitt and Warshel’s method of obtaining CG parameters from underlying atomistic calculations can now be recognised as an example of a multiscale parameterisation scheme, where there is a one-way, bottom-up flow of information from the atomistic to the CG calculation. Levitt and Warshel were ahead of their time in using a multiscale parameterisation scheme for their CG model. Coarse grain models only became popular for biomolecular modelling in the 1990s, which saw the beginning of the development of CG representations of lipids. Because the description of the development of coarse grain lipid models is outside the scope of this review (excellent reviews of this subject113,115 exist already), only a brief history of CG lipid models will be presented, in particular to highlight how CG lipid models have recently moved to using multiscale parameterisation methods.

The first applications of CG lipid models looked for qualitative, rather than quantitative predictions, and therefore did not use any information from atomistic-level calculations in their parameterisation. One of the first models produced during this time was by Smit et al.118 This model was used to investigate the phases of an oil/
water/surfactant system. This was a qualitative model that used just two types of particle: an oil (o) particle and a water (w) particle. An oil molecule was represented using a single o particle, water using a single w, and a surfactant was modelled as a chain of two w particles followed by five o particles, all held together with harmonic springs. All particles interacted using a Lennard-Jones potential, with attractive o–o and w–w interactions, and a purely repulsive o–w interaction. This simple representation was qualitatively able to model micelle formation. Goetz et al.119 and Noguchi and Takasu120 also developed qualitative CG models of amphiphiles, and used them to investigate phenomena such as bilayer and vesicle formation. One of the problems with a CG model is that the form of the interaction potential between CG particles is not clear, and there are questions over whether simple potentials are sufficient. Noguchi and Takasu developed their qualitative model of amphiphiles using custom potentials: a pairwise exponential term to give a soft-core repulsive shape to the CG particles, and a many-body attractive potential that aimed to model the hydrophobic effect. Using these potential forms, and a small number of adjustable parameters, they were able to simulate the self-assembly of bilayer vesicles. Brannigan and Brown121 also developed a pure CG model with a complex form. They modelled lipids as soft spherocylinders that interacted via a short-range r⁻⁸ repulsive term and an anisotropic attractive alignment potential, which was based on an r⁻² attractive term that was mediated by an angular dependence. These terms gave the lipids shape and encouraged them to form an aligned lamellar phase. To encourage bilayer formation, they designated one end of the spherocylinder as the lipid tail, and added an r⁻⁶ Lennard-Jones type attractive term between tails. This model resulted in a very small number of parameters, which allowed its phase behaviour to be investigated with a parameter search. With the right parameters, this model was observed to self-assemble into a bilayer. Michel and Cleaver122 were still investigating the correct functional form to use for coarse grain amphiphile molecules in 2007, in a paper that investigated the use of the Gay-Berne123 potential. The Gay-Berne potential is effectively an anisotropic version of the Lennard-Jones potential, where the r⁻¹² repulsive and r⁻⁶ attractive terms are attenuated by the angle of interaction between two particles (using the dot product of the particles’ alignment vectors). This has the effect of elongating the Lennard-Jones sphere along an alignment vector and turning it into an ellipsoidal rod. Michel and Cleaver performed a parameter space search for Gay-Berne particles, and were able to qualitatively observe several different liquid crystal phases.

Despite Levitt and Warshel’s demonstration of the effectiveness of using quantitatively parameterized CG models, it took until 2001 before other workers began developing quantitative CG models for biomolecules. The Shelley model, developed by Shelley et al.,124,125 was the first quantitative CG biomolecular representation that was used to model lipid membranes. Shelley et al. developed a CG model of the lipid dimyristoylphosphatidylcholine (DMPC). They represented the 46 non-hydrogen atoms of DMPC with just 13 spherical interaction sites. Four sites were used for each hydrocarbon chain (three chain SM sites, and one terminating ST site), one site for each ester link (E1), one site for the glycerol backbone (GL), one site for the choline (CH) and one site for the phosphate (PH) in the headgroup (see Fig. 3). In addition, they also used a spherical W particle that represented a grouping of three water molecules. The interaction potentials between these spherical groups were based on Lennard-Jones potentials, e.g. interactions between W particles used a LJ 6–4 potential (an r⁻⁶ repulsive term coupled with an r⁻⁴ attractive term). The parameters for these LJ potentials were obtained by reproducing thermodynamic properties, e.g. the σ_WW parameter was set to reproduce the experimental density of water at 303.15 K, while calculations of the vapour pressure of a box of W particles were used to set ε_WW such that the experimental boiling point of water was reproduced. A Lennard-Jones 9–6 potential was used between non-bonded hydrocarbon particle sites (SM and ST), with the LJ parameters obtained by matching the densities and vapour pressures of boxes of nonane and dodecane. The Shelley CG model was mostly
parameterized from experimental and thermodynamic data, similarly to how the original atomistic molecular mechanics forcefields were parameterized from experimental and thermodynamic data. However, due to the importance of correctly modelling the interactions between head group particles, Shelley et al. did use a multiscale parameterisation method to generate the parameters for the head group particles from an underlying atomistic simulation. The interactions between head group particles were based on radial distribution functions (RDFs) between head group particles calculated from atomistic simulations. These RDFs were tabulated, and then used to create a potential of mean force (PMF) which was used as the interaction potential between the head group particles. Use of this PMF led to RDFs that disagreed with the atomistic simulation, so corrections to the interaction potential were calculated by comparing the atomistic and CG RDFs, and running an iterative scheme that updated the interaction potential until the differences between the atomistic and CG RDFs were minimised. The model semi-quantitatively reproduced the density profile of an aqueous DMPC bilayer, and could be used to simulate the self-assembly of the bilayer. Shelley et al. used this model to investigate diffusion of halothane through the membrane.125 Halothane was modelled using a LJ 6–4 potential, which was parameterized by matching the experimental density and boiling point of the molecule, but a lack of experimental data prevented the detailed parameterisation of the interactions between halothane and the particles in the lipids, and this led to a model which was shown to not be truly predictive. Lopez et al.129 used existing beads from the Shelley DMPC model to model a synthetic antimicrobial polymer, which was inspired by a natural antibiotic. They then qualitatively investigated the interaction between the antimicrobial model and a bilayer of Shelley DMPC molecules. Despite the quantitative foundations of the Shelley model, it has several limiting features which arise from the use of multiscale parameterization:126 the model is optimized for the bilayer phase of DMPC only (as it was an atomistic simulation of this phase that led directly to the head group parameters), the interaction potentials are more complicated, so short time steps must be used in the dynamics integrator, and it evaluates long-range interactions, so it is not possible to reduce the computational expense of the calculation by using a short-range cutoff.

Fig. 3 A comparison of different coarse grain lipid models. The Shelley model124,125 of DMPC, and Marrink126 and Essex127 models of DPPC are compared to their atomistic equivalents (for ease of comparison, hydrogen atoms of the atomistic models are not shown). Solid lines represent harmonic bonds connecting CG particles, and the CG particle types for the Shelley and Marrink models are labelled (the labels are the same as those used in the main text). The point charges (represented by + and −) and point dipoles (represented by arrows) are shown for the Essex model (the charges and dipoles are located at the centre of their associated CG particle). The Shelley and Marrink models use LJ particles (represented by spheres), while the Essex model uses a combination of LJ particles (spheres) and Gay-Berne particles (ellipsoids). Finally, the ‘blob’ model proposed by Chao et al.128 is also shown for comparison. This model represents groups of atoms as rigid non-spherical ‘blobs’ that use interaction potentials based on multipole expansions.

These disadvantages of multiscale parameterisation, the lack of portability and the
complexity of the interaction potential, led Marrink and co-workers to develop a simple, experimentally parameterized CG lipid model.126 The Marrink model is semi-quantitative, and was developed to target four goals: speed, accuracy, applicability and versatility. Marrink et al. achieved speed by using only short-range potentials, which were also very smooth, so that large integration timesteps may be used. Accuracy was maintained by deriving the parameters to match experimental results and by comparing results calculated using their model to results from equivalent atomistic simulations. The simplicity of the forcefield (it is composed of a set of generic particles), together with its flexibility to represent chemical groups, achieved the aims of applicability and versatility. The coarse grain model is constructed as follows: on average, four atoms are represented by a single interaction site. To keep the model simple, only four main types of interaction site are used: polar (P), nonpolar (N), apolar (C) and charged (Q). The nonpolar and charged groups are then further divided into subtypes, which depend on whether or not hydrogen bonding is possible to these groups. Fig. 3 shows how a DPPC lipid is represented using this model. The coarse grain particles interact using a Lennard-Jones function. However, only five levels of interaction are defined: attractive (ε = 5 kJ mol⁻¹), semi-attractive (ε = 4.2 kJ mol⁻¹), intermediate (ε = 3.4 kJ mol⁻¹), semi-repulsive (ε = 2.6 kJ mol⁻¹) and repulsive (ε = 1.8 kJ mol⁻¹). The level of interaction between a particular pair of coarse grain particles is chosen from a table look-up based on the types of the interacting sites, e.g. a pair of P particles have an attractive level of interaction, while a P and a C have a repulsive level of interaction. The complete lookup table is provided in the original paper describing this model.126 These parameters were obtained by trial and error, running multiple oil and water CG simulations and modifying the small number of parameters to reproduce experimental properties (e.g. density of water and alkanes at room temperature, mutual solubility and diffusion rates). For all interaction levels, the same value of σ was used (0.47 nm). To keep the interactions short-ranged, they were force shifted to zero between 0.9 nm and the user-defined parameter r_cut using the standard Gromacs130,131 shift function. All coarse grain particles except for nearest neighbours interact through this Lennard-Jones potential. Nearest neighbours are connected via a weak harmonic spring, while next-nearest neighbours are connected via a harmonic angle potential. In addition to these interactions, and unusually for CG lipid models, charged (Q) particles were included, which interact via a Coulomb potential that uses a relative dielectric constant of ε_r = 20 to implicitly account for charge screening. This electrostatic interaction is also shifted using the standard
GROMACS130,131 shift function between 0 nm and r_cut. The Q particles were designed to represent groups that have a full charge, and the group’s full charge should be used. The exception is small hydrated ions, which use a reduced charge to take into account the effect of an implicit hydration shell.

Energy, density, length, temperature and pressure all have the same meaning in the Marrink CG model as they do for any atomistic MM model. Time is different though, as the CG interaction potential is very smooth, and therefore conformational sampling is faster as the system is not trapped in local minima. This means that CG time is probably 3–6 times faster than atomistic MD time. To achieve parity, Marrink et al. scale time by four. Part of the reason for the success of the model is that the smooth potential surface allows large timesteps to be used with the dynamics integrator: timesteps up to 50 fs can be used, though 40 fs gives more stable results. The Marrink model is implemented in Gromacs,130,131 and the input parameters for this model can be downloaded from http://md.chem.rug.ml/marrink/coarsegrain.html. The Marrink model has been used in several applications110,132–134 and has been compared against a united atom atomistic force field (GROMOS)135–137 (a united atom MM forcefield is one in which non-polar hydrogen atoms are not modelled explicitly, but are instead combined into the representation of their bound heavy atom, e.g. ethane is represented as a pair of CH3 particles). This comparison included an estimate of the loss of configurational entropy incurred in moving from the atomistic to coarse grain model.135,136 These comparisons showed that while the CG model compared well, it failed to fully represent some thermodynamic properties, e.g. oil/oil interactions were too weak in the CG model, and water/oil repulsion appeared to be overestimated.137 In addition, the CG model showed less enthalpy/entropy compensation than an atomistic model for the mixing of oil and water, leading the authors of the comparison to suggest that accurate solvation thermodynamics should be employed to help in the reparameterisation of CG models.137 Marrink et al. have now performed this reparameterisation and have created the MARTINI forcefield.138 The MARTINI forcefield was parameterized in a consistent manner to reproduce the partitioning free energies between polar and apolar media of a large number of chemical compounds.

The advantage of the Marrink model is that it provides a set of basic CG bead building blocks that may be used to create different types of molecules. Bond et al.139,140 used these building blocks to construct CG models of membrane proteins, so as to investigate membrane-protein insertion. They compared their results to fully atomistic simulations, and found qualitative agreement. Marrink and co-workers have also now parameterized CG beads to represent amino acid residues in proteins,134 and have used these parameters to perform a CG study of the self-assembly of G protein-coupled receptors in membrane bilayers.
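The building-block character of the Marrink model, with its small set of interaction levels and single value of σ described earlier in this section, can be illustrated with a short sketch. Several assumptions are made: a standard 12-6 Lennard-Jones form is used (the text says only "a Lennard-Jones function"), the shift function and Coulomb term are omitted, and only the two pairings quoted in the text are entered in the lookup table, since the complete table is given in ref. 126. All names are hypothetical.

```python
SIGMA_NM = 0.47  # same sigma for every interaction level (from the text)

# Well depths in kJ/mol for the five interaction levels (from the text).
EPSILON_KJ_MOL = {
    "attractive": 5.0,
    "semi-attractive": 4.2,
    "intermediate": 3.4,
    "semi-repulsive": 2.6,
    "repulsive": 1.8,
}

# Only the two pairings quoted in the text; the full table is in ref. 126.
LEVELS = {
    frozenset(["P"]): "attractive",      # P with P
    frozenset(["P", "C"]): "repulsive",  # P with C
}


def interaction_level(type_a, type_b):
    """Look up the interaction level for a pair of particle types."""
    return LEVELS[frozenset([type_a, type_b])]


def lj_energy(r_nm, type_a, type_b):
    """Unshifted 12-6 Lennard-Jones energy in kJ/mol (ASSUMED form)."""
    epsilon = EPSILON_KJ_MOL[interaction_level(type_a, type_b)]
    sr6 = (SIGMA_NM / r_nm) ** 6
    return 4.0 * epsilon * (sr6 * sr6 - sr6)


def effective_time(simulated_time, scale=4.0):
    """Convert CG simulation time to effective time using the factor of four."""
    return simulated_time * scale
```

Adding a new molecule then amounts to assigning each group of atoms one of the existing particle types, with no new non-bonded parameters to fit.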

3.1.2. Multiscale CG parameterisation. Despite the problems of multiscale parameterisation, e.g. the lack of portability of the parameters and the potentially complex form of the interaction potentials, and the success of the experimentally parameterized Marrink CG model, a trend over the last few years is that CG models of lipids have begun to move towards using parameters that are derived systematically from atomistic simulations. This is because multiscale parameterisation provides a route to systematically obtain CG potentials for mixed or complex systems, or for cases where experimental data is difficult to obtain. Parameterisation methods are appearing that use the type of multiscale one-way atomistic to CG information flows that were first used by Levitt and Warshel in the 1970s. This trend parallels the historical development of atomistic MM force fields, which also moved away from complex experiment-based parameterisation schemes towards using well-validated recipes (e.g. GAFF44) to get MM parameters from QM calculations. CG parameterisation schemes now seem to be following this trend, and systematic
methods to obtain CG parameters from atomistic MM simulations are now becoming popular.

Izvekov and Voth141,142 have developed a method that automates the generation of CG potentials and parameters from an atomistic MM simulation. Their method, which they term “multiscale coarse graining” (MS-CG), automatically generates pairwise interaction potentials for a defined CG model based on results from an atomistic simulation. They have developed a force matching method, which is an extension of the least-squares force matching approach developed by Ercolessi and Adams.143 The method is used to coarse-grain an underlying atomistic simulation trajectory. As in other coarse grain models, the atoms are combined together into beads, and the system as a whole is therefore represented as a set of interacting beads. The potential of mean force is estimated between each pair of beads based on the average forces experienced during the atomistic trajectory. This is evaluated using the atomistic potential energy surface, and CG potentials are derived that minimise the difference between the CG force and the average atomistic force. An analytical derivation of the method has been published,144,145 and a statistical mechanical justification of using pair potentials for the CG particles has been developed.146 If the CG potential depends linearly on the fitting parameters, then the fitting of the parameters can be written as a series of over-determined linear equations, the least-squares solution to which can be found by orthogonal matrix triangularisation (QR decomposition).141 The efficiency of the fitting can be enhanced by breaking the entire atomistic trajectory into small chunks and generating the CG parameters for each chunk independently. The final CG parameters are then found by averaging the parameters over each chunk. This method is very efficient, but it requires that the CG potential depends linearly on the CG parameters. This is achieved by using a series of splines, which are linear in the spline parameters, and combining this with a Coulomb potential for the electrostatics. The effect of this coarse-graining is to return a model of the forces between pairs of CG beads that accurately reflects the average forces experienced between the atoms contained within those beads during the course of the atomistic molecular dynamics trajectory. Izvekov et al. applied this method to generate a CG potential for a DMPC bilayer. They collapsed the forces over a 400 ps atomistic trajectory onto a bead representation of DMPC that was similar to the Shelley125 or Marrink126 models. These forces were input to the force matching equations to yield the CG potentials. Despite the fact that some of the CG beads were charged, they were able to omit the Coulomb term from the fitting, as it was found that, due to charge screening, the effect of electrostatics could be handled sufficiently by the short-range spline functions of the MS-CG forcefield. This method has since been used to derive CG models of carbonaceous nanoparticles,147 monosaccharides in water148 and ionic liquids.149 Because this method systematically derives CG parameters using only an atomistic simulation, it is much more straightforward to develop parameters for mixed systems, for which experimental data is either unavailable or difficult to obtain. Izvekov et al. demonstrated this advantage by deriving a CG model for a mixed DMPC/cholesterol bilayer.150
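The linear least-squares step at the heart of force matching, together with the chunk averaging described above, can be sketched in a few lines. This is a toy illustration and not the MS-CG implementation: it fits a scalar pair force as a function of distance, it solves the normal equations by Gaussian elimination rather than by the QR decomposition used in the original work, and any basis functions that are linear in their coefficients may be supplied in place of the splines used by Izvekov and Voth. All names are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, n):
            factor = m[row][col] / m[col][col]
            for k in range(col, n + 1):
                m[row][k] -= factor * m[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        s = sum(m[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (m[row][n] - s) / m[row][row]
    return x


def fit_forces(distances, forces, basis):
    """Coefficients c minimising sum_i (f_i - sum_k c_k * basis[k](r_i))**2."""
    design = [[phi(r) for phi in basis] for r in distances]
    n = len(basis)
    ata = [[sum(row[i] * row[j] for row in design) for j in range(n)]
           for i in range(n)]
    atf = [sum(row[i] * f for row, f in zip(design, forces)) for i in range(n)]
    return solve_linear(ata, atf)


def fit_in_chunks(distances, forces, basis, chunk_size):
    """Fit each chunk of the data independently, then average the coefficients."""
    fits = []
    for start in range(0, len(distances), chunk_size):
        stop = start + chunk_size
        fits.append(fit_forces(distances[start:stop], forces[start:stop], basis))
    return [sum(c[k] for c in fits) / len(fits) for k in range(len(basis))]
```

Each chunk must contain at least as many independent samples as there are basis functions, otherwise its normal equations are singular. The requirement that the potential be linear in its parameters is what allows the fit to be written as a single linear solve per chunk.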

MS-CG effectively provides a CG model of a system that approximates the free energy surface. However, dynamics simulations using the MS-CG model cannot be related to the true dynamics of the atomistic model. Izvekov et al. have tried to overcome this problem by using the atomistic simulation to parameterize friction coefficients which can be used with a Brownian dynamics integrator.151 While this has so far been applied only to liquid methanol, they plan to apply this method to membrane systems.

Lyubartsev152 has also developed a multiscale parameterisation method that has been used to systematically build a CG model of a DMPC bilayer. Lyubartsev uses an inverse Monte Carlo method153 to generate the CG parameters from an underlying atomistic simulation. The atomistic simulation trajectory is analysed to generate the radial distribution functions (RDFs) for the CG bead model. These RDFs can be converted into pairwise interaction potentials between the beads. The
RDFs calculated from a CG simulation using these initial interaction potentials differ from those calculated from the atomistic simulation. An inverse Monte Carlo algorithm153 is therefore used to iteratively refine these interaction potentials by correcting them by the difference between the CG and atomistic RDFs. This is essentially the same method that was used by Shelley et al.125 to derive the parameters between the CG lipid head group particles, and is also similar to the Boltzmann inversion method,154 which also uses an iterative procedure that uses RDFs measured from atomistic simulations to derive CG interaction potentials. Lyubartsev has used this method to fully parameterize his own CG model of DMPC.

Arkhipov et al. have taken multiscale coarse graining to an extreme in their recent work,155 in which they use atomistic models to parameterize CG models of complete virus capsids. Each CG particle in the simulation represented about 200 atoms. Each CG particle interacted via a Lennard-Jones potential, which was parameterized to match the size of the domain the CG particle represented, as calculated from the radius of gyration of that domain measured from an atomistic model. The resulting CG model was able to simulate complete virus capsids (of dimensions 10 nm to 100 nm) over timescales of 1 μs to 10 μs.

Chao et al.128 have developed a CG model that is based on the concept of soft rigid blobs. A blob is a grouping of atoms within a molecule that interacts with other blobs using a non-spherical, soft potential. A molecule is divided into blobs, e.g. see Fig. 3 for a proposed blob model of a phospholipid.128 Each blob represents a rigid grouping of atoms, and the interaction potential between a pair of blobs is calculated based on the underlying atomistic potential energy of interaction between the atoms within those blobs. However, unlike other CG models, which try to approximate this interaction potential using high symmetry beads, the rigid blob model expands this interaction potential as a Taylor series to produce interaction moment tensors between blobs. These are akin to the multipole tensors in a standard multipole expansion. The result is that the blob potential more accurately represents the three-dimensional shape of the group of atoms that it represents. A further advantage is that the quality of the model can be increased by adding higher-order moments to the interaction potentials.
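Several of the parameterisation schemes in this section share the same RDF-driven loop: an initial potential is taken from the atomistic RDF, a CG simulation is run, and the potential is corrected according to the mismatch between the CG and atomistic RDFs. A minimal sketch of one such correction step is given below. It uses the update rule commonly associated with iterative Boltzmann inversion, which is an assumption: the text does not give the exact update used by Shelley et al., and the inverse Monte Carlo method uses a different correction. The CG simulation that produces `rdf_cg` is not shown, and all names are hypothetical.

```python
import math

KB_KJ_MOL_K = 0.0083144626  # Boltzmann constant in kJ mol^-1 K^-1


def pmf_from_rdf(rdf, temperature):
    """Initial guess V(r) = -kT ln g(r), evaluated on each tabulated bin."""
    kt = KB_KJ_MOL_K * temperature
    return [-kt * math.log(g) if g > 0.0 else float("inf") for g in rdf]


def refine_potential(potential, rdf_cg, rdf_target, temperature):
    """One correction step: V_new(r) = V(r) + kT ln(g_cg(r) / g_target(r))."""
    kt = KB_KJ_MOL_K * temperature
    updated = []
    for v, g_cg, g_target in zip(potential, rdf_cg, rdf_target):
        if g_cg > 0.0 and g_target > 0.0:
            updated.append(v + kt * math.log(g_cg / g_target))
        else:
            updated.append(v)  # leave bins that were not sampled unchanged
    return updated
```

Where the CG model places too many particles at a given separation (g_cg > g_target) the potential is raised, and where it places too few the potential is lowered; the step is repeated until the two RDFs agree to within a chosen tolerance.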

3.2 One-way top–down interfacing methods

The multiscale atomistic/CG methods discussed so far all involve a single, one-way transfer of information from the atomistic model to the CG model. Essentially the atomistic models are used to parameterize a CG model, so that the CG model can then be used to explore the conformational space of the system more rapidly. However, in building the CG model, the atomistic fine detail of the system has been lost, as it has been smeared out into a set of beads. Recovering the atomistic fine detail would however be very useful, as this would allow the simulator to zoom back in and observe interesting configurations revealed during the CG trajectory at an atomistic level of detail. Knecht and Marrink133 realised the advantages of this approach, and have used it together with the Marrink CG lipid model to model vesicle formation. The CG model is used to self-assemble a vesicle, and this CG structure is then used as a template for the starting structure for an atomistic simulation. The conversion from the CG model to the atomistic model is straightforward. The coordinates from the CG simulation were scaled by a factor of 1.6. Atomistic models of the lipid, taken from an atomistic simulation of the planar bilayer, were fitted on top of their CG counterparts. The CG model had to be scaled to prevent steric clashes caused by the curvature of the CG vesicle bilayer. Molecular dynamics simulations were run on the atomistic model, allowing vesicle membrane fusion to be studied at the atomistic level.

It is relatively straightforward to move from a coarse grain model of a lipid to a fully atomistic model, as the main features of the lipid, namely head group and two tails, are still present in the CG model. It is less straightforward to reconstruct the
atomistic coordinates of a protein from a CG model. This is because many CG protein models represent only the backbone of the protein, and if the side chain is represented at all, it is represented using only one or two beads. A lot of atomistic structural information is missing from a CG protein model, and reconstructing this fine detail is not trivial. McCammon and co-workers156 attempted, in their multiscale exploration of ligand binding to HIV protease, to map from a CG protein model with a single bead per residue back to an atomistic representation. The binding site of HIV protease is covered by two large flaps, which open and close on a timescale beyond that accessible to atomistic molecular dynamics. McCammon and co-workers have, over a series of papers,157–162 developed a very successful CG model of HIV protease. This model uses a single bead per residue, and was parameterized based on a statistical analysis of the crystal structure of the protein. This CG model is capable of simulating the opening and closing motion of the large flaps that gate the entrance to the active site.158,162 McCammon and co-workers used their CG model within a multiscale simulation to investigate the binding pathways of an HIV protease inhibitor. The CG simulation was used to investigate the opening and closing of the flaps which gate the binding site of the enzyme, to see how the presence of the inhibitor affected flap dynamics, and to test whether the inhibitor could bind with the flaps closed. Because the CG model lacks atomistic detail, McCammon and co-workers selected two configurations from the CG trajectory to act as starting points for fully atomistic, implicit solvent Brownian dynamics simulations of the protease plus inhibitor. While it was straightforward to rebuild a fully atomistic model of the inhibitor from the CG model, it was not simple to rebuild an atomistic model of the enzyme. This is because the CG model is missing a lot of important structural information, e.g. the orientation and location of the amino acid side chains. To avoid the difficulty of rebuilding the atomistic detail of the enzyme, they instead used crystal structure configurations of the protease that had backbone geometries similar to those observed in the two snapshots from the CG trajectory.

McCammon and co-workers156 used the CG simulation to provide input to a pair
of atomistic simulations, thereby performing a pair of one-way, top-down information transfers from the CG to the atomistic level. They sidestepped the great challenge of top-down transfers, namely the difficulty of recovering fine atomistic detail, by using the CG model only to locate the binding geometry of the inhibitor, and as a means to select which crystal structures corresponded to that particular binding geometry. Other workers have directly tackled the problem of reconstructing atomistic detail from CG models. The RACOGS algorithm (Reconstruction Algorithm for Coarse Grain Structures)163 is an example of a recently developed algorithm that attempts to solve the reconstruction problem. The algorithm is designed to rebuild an atomistic protein model given only the positions of the Cα atoms. The algorithm has several stages:

1. The backbone atoms (C, O, N) are located based on the input positions of the Cα atoms using the algorithm developed by Feig et al.,164 which is itself based on the work by Milik et al.165 The algorithm places the backbone atoms into average positions based on a statistical analysis of the positions of backbone atoms taken from 4013 non-redundant protein structures from the PDB.

2. The next step is the positioning of the side chain atoms. These are positioned using the algorithm developed by Xiang and Honig,166 which uses a rotamer library to initially place the side chain atoms. The algorithm starts by adding the side chain atoms to a residue by selecting the rotamer for that residue that has the lowest energy of interaction with the backbone atoms of the other residues. The interaction energy is evaluated using the vdW and dihedral energy terms from the AMBER99 forcefield.167 In performing this step, the rotamer library is pruned of any rotamers that involve a steric clash with the backbone. This improves the efficiency of the algorithm by removing structurally unrealistic conformations as early as possible.

3. Once all of the side chains have been constructed, an iterative procedure is employed whereby each residue is considered in turn, and the energy of interaction of each possible rotamer of its side chain with the side chains and backbone atoms of all of the other residues is calculated. If another rotamer in the library has a lower energy, then it is selected in place of the existing rotamer. This iteration is repeated until an entire cycle over all of the residues in the protein is completed without a single rotamer being replaced.

4. The next step involves a minimisation of any high-energy side chains (defined by whether the energy of interaction of that side chain with the rest of the protein is greater than a specified cutoff). Each high-energy side chain is, in turn, energy minimised while holding the rest of the protein fixed. The minimisation is performed using the bond, angle, dihedral and vdW terms from the AMBER99 forcefield.

5. Finally, hydrogen atoms are added using the LEAP module of AMBER 8,168 and the entire protein is energy minimised using the conjugate gradient method and the full AMBER99 forcefield with the generalized Born/surface area (GB/SA)169 implicit solvent model.

Heath et al.163 were able to use this algorithm to hop between CG and atomistic models of the proteins src-SH3, S6wt and S6Alz. One of the problems of hopping from a CG to an atomistic model is that, because different forcefields are used, the free energy surfaces of the two models could be very different. Thus configurations that are sampled by the CG model may not be important for the atomistic model. Heath et al.163 investigated this for the CG model (the Das et al. model170) and atomistic model (AMBER99167) that they used. They found good agreement between the free energy surfaces, and were able to successfully use the CG model to investigate the folding and misfolding of these proteins, and to use their RACOGS algorithm to zoom in to obtain the atomistic detail of the misfolded structures.

Ding et al.171 have also developed a method to reconstruct the atomistic fine detail
of a protein structure from a CG model. They use a CG model based on the positions of the Cα and Cβ atoms to investigate domain swapping in seven proteins. The CG model enables the running of simulations that cover the timescale on which the domain swapping is seen to occur. A fully atomistic model of the protein is then reconstructed from the configurations generated by the CG simulation. The reconstruction algorithm is composed of the following stages:

1. The positions of the Cα and Cβ atoms from the CG model are used to position the N and C backbone atoms so that the correct chirality is maintained (L-amino acids). To restrict the rotational freedom, a strong constraint was used, namely that neighbouring residues must have a planar peptide bond. Harmonic potentials were added between neighbouring backbone atoms, the equilibrium distances of which were taken from average distances calculated from PDB structures. The Cα atoms were then immobilised by setting their masses to infinity, and a short molecular dynamics simulation was run to relax the system.

2. Backbone hydrogens were added to create non-specific backbone hydrogen bonds. A CG potential was added between the Cβ atoms, and a short molecular dynamics simulation was run to optimise the backbone hydrogen bond network.

3. The side chain and backbone O atoms were added using rotamers and a Monte Carlo simulated annealing algorithm. The scoring function for the simulated annealing used the vdW interactions from the Cedar forcefield,172 the EEF1 implicit solvation model173 and a statistical potential for hydrogen bonds, as proposed by Kortemme et al.174

This reconstruction algorithm was also used by Sharma, Ding and Dokholyan175 to reconstruct the atomistic detail from CG models of nucleosomes. This algorithm is very useful because, like the RACOGS algorithm, it allows the CG model to be used to sample conformational change on timescales beyond those accessible to standard atomistic methods, yet it then allows the simulator to zoom in and reconstruct the atomistic detail of interesting configurations.
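
The iterative rotamer refinement used in stage 3 of the RACOGS algorithm (cycle over the residues, swap in any lower-energy rotamer, stop when a full cycle makes no replacement) can be illustrated with a minimal Python sketch. The function names, the toy energy function and the three-residue rotamer library are hypothetical and chosen only for illustration; a real implementation would evaluate forcefield energies rather than the toy penalty used here.

```python
from typing import Callable, List, Sequence

# energy(res, rot, choice): interaction energy of residue `res` placed in
# rotamer `rot` with all other residues held in their current rotamers.
EnergyFn = Callable[[int, int, List[int]], float]


def refine_rotamers(library: Sequence[Sequence[int]],
                    energy: EnergyFn,
                    initial: List[int],
                    max_cycles: int = 100) -> List[int]:
    """Greedy refinement: repeat full cycles over the residues until one
    complete cycle finishes without any rotamer being replaced."""
    choice = list(initial)
    for _ in range(max_cycles):
        replaced = False
        for res, rotamers in enumerate(library):
            best = min(rotamers, key=lambda rot: energy(res, rot, choice))
            # Only swap on a strict improvement, so the loop terminates.
            if energy(res, best, choice) < energy(res, choice[res], choice):
                choice[res] = best
                replaced = True
        if not replaced:
            break
    return choice


def toy_energy(res: int, rot: int, choice: List[int]) -> float:
    """Hypothetical energy: a rotamer-dependent self term plus a clash
    penalty when a sequence neighbour sits in the same rotamer index."""
    self_term = [0.0, 0.5, 1.0][rot]
    clash = sum(2.0 for other, other_rot in enumerate(choice)
                if abs(other - res) == 1 and other_rot == rot)
    return self_term + clash


if __name__ == "__main__":
    library = [[0, 1, 2]] * 3          # three residues, three rotamers each
    print(refine_rotamers(library, toy_energy, [0, 0, 0]))
```

Because each swap strictly lowers the energy seen by the residue being updated, the procedure is a local descent and can become trapped in a local minimum, which is one reason why a stochastic search such as simulated annealing may be preferred.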

3.3 Two-way parallel interfacing methods

Multiscale modelling across the atomistic/CG boundary is not limited to single, one-way transfers of information. Two-way interfacing methods, whereby the CG and atomistic levels exchange information dynamically throughout a simulation, are in active development and use. Two main schemes for linking CG and atomistic simulations have become popular: parallel methods, which use loosely coupled atomistic and CG simulations running in parallel, and embedded methods, whereby atomistic regions are embedded within a CG model.

3.3.1. Parallel atomistic/CG methods. The replica exchange method, discussed in section 2.4 in terms of atomistic and QM calculations, has also been applied with CG models. For example, Nanias et al.176 have used replica exchange moves over temperature to enhance sampling within a CG protein simulation. Just as there has been interest in developing replica exchange moves that map between QM and MM models,105 so too has there been recent interest in developing replica exchange methodology that can map between atomistic and CG representations. Such replica exchange moves represent a parallel, two-way interface between the atomistic and CG levels, as atomistic and CG simulations are run in parallel, and information is only periodically exchanged between them.

Lyman and Zuckerman177 have developed an atomistic/CG replica exchange method that they call “resolution exchange” (ResEx). The aim of this method is to run both an atomistic and a CG simulation of a system in parallel, and to periodically attempt replica exchange moves between the simulations. The idea is that the enhanced sampling of the CG model will be shared with the atomistic model via the replica exchange moves. For this method to work, the coarse grain model must use a subset of the coordinates of the atomistic model. In this case, Lyman and Zuckerman achieved this by using a united atom model at the CG level (OPLS united atom forcefield33), and an all-atom model for the atomistic level (OPLS all atom forcefield34). While technically a CG model, a united atom forcefield is more normally considered to be an atomistic representation. This is because a united atom forcefield does not use beads to represent residues or parts of residues, but instead merely removes aliphatic and aromatic hydrogens from the model by collapsing them into their bonded carbon. Despite this, Lyman and Zuckerman’s method can still be considered a multiscale method, and this application points to some of the difficulties that more ambitious applications would have to overcome. One major difficulty is that the replica exchange moves between the united atom and all atom representations had a very low acceptance ratio. This is because the CG model uses only a subset of the atomistic coordinates, so replica swap moves can lead to unphysical conformations as this subset is moved out of step with the rest of the atomistic coordinates. To overcome this problem, Lwin and Luo,178 in a similar method, minimised the energy of the atomistic model after each swap, but before the Monte Carlo test that was used to accept the configuration. While this does successfully increase the acceptance ratio of the move, it violates detailed balance, and so does not sample the required ensemble correctly. Lyman and Zuckerman’s solution to the problem is to use incremental coarsening. With this method, instead of switching the whole protein between the CG and atomistic models in one go, individual residues in the protein are switched. This leads to the problem that CG residues then need to interact with atomistic residues, so CG/atomistic cross-terms need to be included in the Hamiltonian. Fortunately the OPLS united atom and all atom forcefields are compatible, and so can be combined in this way, at least for this application. Multiple replicas of the system may now be used, with each replica representing a different percentage of the protein using the CG forcefield. Lyman and Zuckerman tested their method by application to the pentapeptide met-enkephalin. They used six replicas, the first replica using a fully atomistic model, the second replica using one CG residue and four atomistic, the third using two
CG residues and three atomistic, and so on. In addition, they also increased the temperature of each replica, such that while the fully atomistic model was simulated at 298 K, the fully CG model was simulated at 700 K. However, despite the small size of the system, the small difference between the CG and atomistic models (OPLS-UA vs. OPLS-AA) and the use of incremental coarsening, the acceptance ratio of the swap moves was still very low, running between 2.5% and 5.8%. Applications of this method to larger systems, or using a greater difference between the atomistic and CG models, therefore look problematic.

Christen and van Gunsteren10 have developed a novel multiscale method that they call “multigraining”, which aims to use the CG model to enable both relaxation of large molecular systems and sampling of slow processes with concurrent atomic detail representation of the results. In this method, both an atomistic and a CG model of a molecule are used simultaneously. Each molecule in the simulation has both a set of atomistic (fine grain) coordinates, r_FG, and a set of coarse grain coordinates, r_CG. The coarse grain coordinates are constrained to map to the atomistic coordinates. The Hamiltonian of the system is composed of atomistic interaction terms, U_FG(r_FG), and CG interaction terms, U_CG(r_CG). Molecular dynamics is then performed using both the atomistic and CG potentials simultaneously. This is referred to as multigraining MD, and the algorithm is as follows:

1. First the coarse grain coordinates are determined by mapping them from the atomistic coordinates (e.g. by placing CG beads at the geometric centre of a residue).

2. Calculate the forces on the CG particles using the CG potential energy terms, and the forces on the atoms using the atomistic potential energy terms.

3. Distribute the forces from the CG particles back onto the atoms.

4. Propagate the coordinates of the atoms using the leap-frog integrator.

By running both the CG and atomistic models in parallel, and by mapping the CG coordinates from the atomistic coordinates, this method avoids the problems associated with information loss when moving back and forth between the CG and atomistic representations. The atomistic and CG potential energy terms are also separate, which avoids the problem of parameterising atomistic/CG cross-terms. Keeping the atomistic and CG potentials separate also allows Christen and van Gunsteren to use a λ scaling parameter to switch between the atomistic and CG potentials, e.g. at λ = 0 only the pure atomistic potential is used, at λ = 1 only the pure CG potential is used, while at λ = 0.5 a 50–50 mix of the CG and atomistic potentials is used. Christen and van Gunsteren then employ replica exchange moves over this λ coordinate, thereby allowing several parallel trajectories to swap dynamically between CG and atomistic representations. They tested the method by using it to simulate liquid octane but, like Lyman and Zuckerman, they experienced problems with a low acceptance ratio for the replica exchange moves, and needed to use 24 replicas across the λ coordinate. Using 24 replicas, they achieved an average acceptance ratio of just 24%. This figure masks the very poor acceptance ratio for replicas at λ = 0, where the purely atomistic potential was used. Just 8% of the 300 replica exchange moves were accepted at λ = 0, which is disappointing, given that it is the sampling at λ = 0 which is important for the calculation of correct thermodynamic properties. However, this method is useful as a means of performing a rapid equilibration or conformational search at the CG level, and then smoothly switching back to the atomistic model to explore interesting configurations.
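
The four steps of multigraining MD listed above can be sketched in Python with NumPy. This is only an illustration under stated assumptions: the function names are hypothetical, the CG beads are placed at the geometric centre of each group of atoms, the bead forces are spread equally over the atoms of the group (which follows from the chain rule for that mapping), and the two sets of forces are mixed linearly in the scaling parameter. The text gives only the end points and the 50–50 midpoint of the scaling, so the linear form is an assumption rather than the published scheme.

```python
import numpy as np


def map_to_cg(r_fg, groups):
    """Step 1: place each CG bead at the geometric centre of its atoms."""
    return np.array([r_fg[idx].mean(axis=0) for idx in groups])


def distribute_cg_forces(f_cg, groups, n_atoms):
    """Step 3: spread each bead force equally over the atoms it maps from.

    For a geometric-centre mapping, dR/dr_i = 1/n for each of the n atoms
    in the group, so every atom receives F_bead / n.
    """
    f_fg = np.zeros((n_atoms, 3))
    for bead, idx in enumerate(groups):
        f_fg[idx] += f_cg[bead] / len(idx)
    return f_fg


def multigrain_forces(r_fg, groups, fg_force, cg_force, lam):
    """Steps 1-3: forces on the atoms from both levels, mixed linearly
    (assumed form): lam = 0 is purely atomistic, lam = 1 is purely CG."""
    r_cg = map_to_cg(r_fg, groups)
    f_atomistic = fg_force(r_fg)          # step 2, atomistic potential
    f_beads = cg_force(r_cg)              # step 2, CG potential
    f_from_cg = distribute_cg_forces(f_beads, groups, len(r_fg))
    return (1.0 - lam) * f_atomistic + lam * f_from_cg


def leapfrog_step(r, v_half, forces, masses, dt):
    """Step 4: leap-frog update of half-step velocities, then positions."""
    v_new = v_half + dt * forces / masses[:, None]
    r_new = r + dt * v_new
    return r_new, v_new
```

Here `fg_force` and `cg_force` are user-supplied callables that return an array of forces for the atomistic and CG coordinates respectively. Only the atomistic coordinates are ever propagated, so no information is lost when the scaling parameter is changed.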

3.3.2. Embedded atomistic/CG methods. Parallel atomistic/CG methods still use a simple interface between the CG and atomistic models, with information exchange occurring only during the replica exchange moves. This represents a loose coupling between the atomistic and CG models. An alternative to this loosely coupled method is to develop a tightly coupled method whereby an atomistic model is linked directly to a CG model. This allows for an atomistic region, e.g. an atomistic model of an ion channel, to be embedded within a CG simulation, e.g. a CG model of a cell
membrane. This requires that the atomistic and CG models are closely coupled together, with information exchange occurring every timestep via atomistic/CG cross-terms in the Hamiltonian. This is challenging, as it is not immediately obvious how to obtain the parameters that describe the atomistic/CG interaction, nor what form the interaction potential should take. In section 3.1.2 the multiscale coarse graining (MS-CG) method, developed by Izvekov and Voth, was presented as a method that could be used to systematically derive the functional form and parameters for MM-CG interactions. Shi,179 working with Izvekov and Voth, has used the MS-CG method to systematically derive the functional form and parameters for the atomistic/CG interaction of an atomistic gramicidin A polypeptide ion channel immersed in a CG membrane. The DMPC membrane and solvent water were modelled using the CG model developed by Izvekov and Voth,150 which was developed using the MS-CG method. This method derives the parameters for the CG interactions from the average forces experienced by the atoms during an atomistic dynamics simulation. These forces are then coarse-grained and a CG potential is fitted. Shi ran a short all-atom simulation of the whole membrane/ion channel system and used this to develop the CG/atomistic forcefield using the MS-CG method. The forcefield was obtained by treating the atomistic and CG regions equally within the force matching algorithm. However, to reduce computational cost, some simplifications and approximations were made. First, the forcefield was split into three parts: atom-atom, atom-CG and CG-CG. The atom-atom forces were assumed to be the same as those from the wholly atomistic simulation, while the CG-CG forces were taken from an MS-CG application to a pure lipid bilayer embedded in water. This meant that only the atom-CG forces had to be derived from the fully atomistic simulation of the entire system. The atom-CG forces were obtained by subtracting the atom-atom and CG-CG forces from the reference forces from the fully atomistic simulation, and then fitting the residual forces using the MS-CG method. This method is not as rigorous as applying the full MS-CG method in a single step, but it is much more efficient, and Shi et al. found that this strategy worked quite well for this system.

Essex and co-workers have developed a CG model that is designed from the outset to interact with an atomistic forcefield.127,180,181 They have developed a CG lipid model of DMPC127,180 that uses six Gay-Berne123 ellipsoids to represent the hydrocarbon tails, and two Lennard-Jones (LJ) particles to represent the head group. The glycerol region was modelled by two further Gay-Berne units, thereby allowing the entire DMPC model to be represented by just ten interaction sites (see Fig. 3). The key advance of the Essex model is that it was designed from the outset to be compatible with atomistic forcefields. The Gay-Berne potential is essentially an anisotropic form of the Lennard-Jones potential, which is the most common potential used to model van der Waals (vdW) non-bonded forces between atoms. This means that it is possible to interface the Gay-Berne and LJ potentials so that the vdW interaction energies between Gay-Berne beads and LJ atoms can be calculated. However, to correctly handle the electrostatic interaction between the CG and atomistic levels, the CG model has to explicitly include a representation of the charge distribution of the lipid. The Essex model achieves this by including point charges in the two head group particles, and point dipoles in the two Gay-Berne particles that represent the glycerol region. Water was modelled explicitly using the soft sticky dipole (SSD) model,182 which, despite using only a single interaction site, is capable of accurately reproducing the structural, thermodynamic, dielectric and temperature-dependent properties of water.182–185 The lipid model was parameterized by trial-and-error molecular dynamics, whereby the parameter space was searched and molecular dynamics simulations were run to test whether the experimental structure of DMPC bilayers was reproduced. The point charges and dipoles of the model were chosen to reproduce the net electrostatics (charge and dipole) of each bead from the underlying atomistic level model. To parameterize the strength of interaction between atomistic
and CG molecules, Essex and co-workers181 calculated solvation free energies of atomistic amino acid side chain analogues in boxes of CG octane, and compared these to experiment. The atomistic and CG interactions were found to be unbalanced, but Essex and co-workers found that simple scaling of the cross-terms led to agreement with experiment of comparable quality to that of fully atomistic simulations. To calibrate the scaling factor for the interaction of an atomistic inclusion with the CG water model, a similar set of calculations was made of the hydration free energies of atomistic amino acid side chain analogues in the SSD water model. The calculated hydration free energies were also compared with experiment. The calculation of these scaling factors removes the need for the atomistic forcefield to be reparameterized for use with the CG model. This is an interesting approach to the solution of the cross-term parameterisation problem of atomistic/CG multiscale modelling, and has the promise to be transferable across a range of solutes and proteins, thereby removing the need to reparameterize the model for each new system under study.

The challenges in creating atomistic/CG multiscale methods parallel those of creating QM/MM methods. The definition of the QM/MM interface is relatively straightforward if the divide is between molecules. However, the definition of the interface becomes more complicated if the divide is within a molecule, as then the Hamiltonian has to somehow include bonded terms between QM atoms and MM atoms. In much the same way, the definition of the atomistic/CG interface is more straightforward if it falls between molecules, rather than within a molecule, and that is why the majority of atomistic/CG multiscale simulations avoid dividing a molecule. The exceptions are the parallel method developed by Lyman and Zuckerman177 (the ResEx method discussed in the last section) and an embedded CG/atomistic method developed by Neri et al.186 Neri et al. have developed a method where a protein is modelled using a CG representation, but the active site is modelled using an atomistic representation. The aim is to allow efficient sampling of protein conformational change, due to the use of the CG model, but to still capture the fine-detail atomistic motions in the active site, which is the area of interest in protein-ligand binding or computational enzymology calculations. They tested the method by application to two proteins of active pharmaceutical interest: HIV protease, which is a target for anti-HIV medication, and human β-secretase (BACE), which plays a role in the onset of Alzheimer’s disease. The active site atoms were modelled using the GROMOS9630 atomistic forcefield, while the rest of the protein was modelled at the CG level by only considering Cα centroids. Solvent-protein interactions were modelled in terms of viscosity and the addition of random forces in the framework of a stochastic dynamics simulation. The interaction between the atomistic and CG regions was handled by the addition of an interface region. Several residues at the boundary of the atomistic and CG regions were modelled using both a CG and an atomistic representation. The atomistic region residues interacted with the interface residues using the standard atomistic potential. To maintain backbone integrity between the interface and CG regions, harmonic bonds are added between the Cα atoms of the interface residues and the Cα atoms of the neighbouring CG residues. In addition, an exponential-form non-bonded potential is added between CG and interface residues, and the parameters for the CG model and for the interface were chosen so as to reproduce the root mean square fluctuations (RMSF) of HIV protease as observed during fully atomistic and fully CG simulations. A key parameter is the thickness of the interface region: the interface region has to guarantee the correct geometry of the atomistic residues, it has to get the local electrostatics right, and it has to be able to transmit the modes of vibration experienced by the rest of the protein, which are mimicked by the CG model. Neri et al. found that for both HIV protease and BACE they needed to use an interface region that included all residues that had at least one atom within 6 Å of any of the residues in the atomistic region.

There is a clear separation in the methods presented so far between the atomistic and CG regions. One molecule, or part of one molecule (the protein), is treated at the
atomistic level, while the remaining molecules (lipid membrane and solvent) are modelled using CG techniques. An alternative to defining the atomistic and CG levels by molecule is instead to define the levels by region, namely to divide the simulation space into an atomistic region and a coarse grain region. This creates a more dynamic split between the two models, as individual molecules are free to diffuse from one region to another, and therefore algorithms must be developed that account for the interconversion of CG and atomistic models. Praprotnik et al. have, over a series of insightful papers,187–191 developed just such algorithms. They have created a multiscale simulation method whereby the simulation space is divided into an atomistic region and a CG region, and the molecules in the system are free to diffuse between these two regions. The methods developed allow on-the-fly interchange between the molecules’ coarse grain and atomistic representations, enabling large length and time scales to be achieved, while still retaining the atomistic fine detail of parts of the system. The key development in this work is the methodology used to allow molecules to switch dynamically between CG and atomistic representations. They call these methods AdResS (adaptive resolution scheme). The original application of the method was to the multiscale dynamics of liquid methane.187 A CG model of methane was constructed based on the radial distribution function of methane obtained from a simulation of the reference atomistic model. The simulation box was split in half, creating an atomistic region and a CG region, and an identical number of methane molecules were placed in each half of the box. To ensure that the transition between the two regions was smooth, a ‘handshaking zone’ was created between the atomistic and CG regions. To achieve a smooth transition from the atomistic to CG forcefields, the forces are scaled in the handshaking zone. As the switching of the resolution can be considered as a first order phase transition, it is necessary for this scheme to be used in combination with a thermostat. Because latent heat is generated at the interface region,187 it is important that a local thermostat is used that couples directly to local particle motion, e.g. Langevin or dissipative particle dynamics (DPD) thermostats. The other main challenge of an adaptive resolution method is how to handle the loss and reintroduction of atomic degrees of freedom at the atomistic/CG interface. It is important that the vibrational and rotational degrees of freedom of the atomistic model, which are missing in the CG model, are slowly reintroduced as a molecule diffuses across the interface between the CG and atomistic regions. To achieve this, when a molecule moves from the CG to the atomistic region it is mapped onto an atomistic representation that has the same centre of mass and linear momentum as the CG representation. In addition, each atom is given rotational and vibrational velocities that come from a random molecule already in the atomistic region.

Praprotnik et al. used AdResS successfully to simulate a multiscale box of liquid methane.187 While the application worked well, there was a small problem with a pressure imbalance at the interface zone between the CG and atomistic regions. Praprotnik et al. solved this problem in a new version of their method,188 in which they also extended the scope of application to include spherical boundaries between the atomistic and CG regions. Since then, they have used the method to run multi-resolution simulations of liquid water191 (using the CG parameterisation method of Lyubartsev152 to obtain the CG model of water) and have investigated the thermodynamic implications of running simulations that have a dynamic, and indeed fractional, number of degrees of freedom.189
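
The scaling of forces across the handshaking zone can be illustrated with a short Python sketch. The text above does not give the functional form, so both parts of the sketch are assumptions: the pair force is interpolated using the product of the two molecules' resolution weights, which is the form usually quoted for AdResS, and the weight itself is taken to be a smooth sin² ramp along one Cartesian axis. The function names and the one-dimensional geometry are hypothetical.

```python
import math


def resolution_weight(x, x_cg, x_at):
    """Resolution weight of a molecule whose centre of mass is at x.

    Returns 0 in the CG region (x <= x_cg), 1 in the atomistic region
    (x >= x_at) and a smooth sin^2 ramp across the handshaking zone in
    between. Assumes x_cg < x_at; the ramp shape is an illustrative choice.
    """
    if x <= x_cg:
        return 0.0
    if x >= x_at:
        return 1.0
    s = (x - x_cg) / (x_at - x_cg)
    return math.sin(0.5 * math.pi * s) ** 2


def hybrid_pair_force(f_atomistic, f_cg, w_a, w_b):
    """Interpolate the force between molecules a and b.

    The atomistic force is used in full only when both molecules are fully
    atomistic (w_a = w_b = 1); if either is fully CG the CG force is used.
    Because the same scaled force (with opposite sign) acts on both
    molecules, Newton's third law is preserved in the handshaking zone.
    """
    s = w_a * w_b
    return s * f_atomistic + (1.0 - s) * f_cg


if __name__ == "__main__":
    w_a = resolution_weight(1.5, x_cg=1.0, x_at=2.0)   # middle of the zone
    w_b = resolution_weight(2.5, x_cg=1.0, x_at=2.0)   # fully atomistic
    print(hybrid_pair_force(10.0, 2.0, w_a, w_b))
```

Interpolating forces rather than potential energies means that no single conserved Hamiltonian is defined across the handshaking zone, which is consistent with the need for a thermostat noted above.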

Ensing et al.192 have also developed a new multiscale method that allows for an atomistic region to be embedded within a CG simulation. Like AdResS, the method uses a spherical atomistic region that is interfaced to the CG simulation via a handshaking buffer zone. To ensure a smooth transition from the CG to atomistic representations, the potential energies, rather than the forces, are smoothly scaled from the CG to atomistic Hamiltonians as the molecule crosses the handshaking zone. Additional terms are also added to the Hamiltonian that account for the kinetic energy that is lost when the number of degrees of freedom is decreased when
40 | Chem. Modell., 2008, 5, 13–50

This journal is c The Royal Society of Chemistry 2008

Dow

nloa

ded

by S

tanf

ord

Uni

vers

ity o

n 23

Aug

ust 2

012

Publ

ishe

d on

19

Nov

embe

r 20

08 o

n ht

tp://

pubs

.rsc

.org

| do

i:10.

1039

/B60

8778

G

View Online

Page 29: [Chemical Modelling] Chemical Modelling Volume 5 ||

a molecule moves from the atomistic to the CG region. These additional termsensure that the total energy of the system is conserved even as the number of degreesof freedom change. In addition to using a different method to switch between the CGand atomistic regions, Ensing et al. also use a different method to handle the changein dynamics between the CG and atomistic regions. Instead of destroying andrecreating the atomistic degrees of freedom as the molecules move across theboundary, they instead use the RESPA193 multiple timestep molecular dynamicsalgorithm to freeze out the intramolecular degrees of freedom of molecules as theymove into the CG region. RESPA is a molecular dynamics method whereby differenttimesteps can be used to sample different parts of the Hamiltonian. It hastraditionally been used to separate the sampling of ‘‘fast’’ forces, such as bondpotentials, from ‘‘slow’’ forces, such as those arising from the non-bonded poten-tials. Ensing et al. cleverly use RESPA to separate the sampling of the CG andatomistic Hamiltonians by region, and thereby effectively freeze the atomisticrepresentation as it crosses into the CG region. This allows them to tune themolecular dynamics integrator optimally across the entire multiscale system. Whilethis method has been applied only to a model system and liquid methane, it showsgreat promise as a method that could find widespread use in biomolecular modelling.
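The multiple timestep idea behind RESPA can be sketched for a single particle in one dimension. The example below shows only the traditional fast/slow splitting in a velocity Verlet form; it does not reproduce the region-based splitting of Ensing et al. The two harmonic forces, the parameter values and all function names are assumptions made purely for illustration.

```python
def respa_step(x, v, mass, fast_force, slow_force, dt_outer, n_inner):
    """Advance one outer timestep, sampling the fast force n_inner times."""
    dt_inner = dt_outer / n_inner
    v += 0.5 * dt_outer * slow_force(x) / mass        # half kick, slow force
    for _ in range(n_inner):
        v += 0.5 * dt_inner * fast_force(x) / mass    # half kick, fast force
        x += dt_inner * v                             # drift
        v += 0.5 * dt_inner * fast_force(x) / mass    # half kick, fast force
    v += 0.5 * dt_outer * slow_force(x) / mass        # half kick, slow force
    return x, v


# Hypothetical test system: a stiff "bond-like" spring plus a soft spring.
K_FAST, K_SLOW, MASS = 100.0, 1.0, 1.0


def fast_force(x):
    return -K_FAST * x


def slow_force(x):
    return -K_SLOW * x


def total_energy(x, v):
    return 0.5 * MASS * v * v + 0.5 * (K_FAST + K_SLOW) * x * x
```

In this sketch the slow force is evaluated once per outer step while the fast force is evaluated `n_inner` times, which is where the saving comes from when the slow force is the expensive one.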

4. Interfacing particle with continuum models

Continuum methods provide the largest time and space ranges considered in this review. While continuum models have been popular for the modelling of solids194 or of liquid or gas flow,195 they have only recently been applied to the study of biomolecular systems. This is because continuum methods tend to be best suited to periodic or smoothly varying systems, and biomolecular systems are distinctly inhomogeneous, and exhibit physical behaviours that require the modelling of fine-level atomistic detail. By far the most popular continuum models used in biomolecular modelling are those that replace atomistic waters with an implicit solvent model. Many implicit solvent models, which include the Poisson–Boltzmann (PB)196,197 and Generalized Born (GB)198,199 methods, model the solvent as a dielectric continuum, and calculate the polarisation of that continuum caused by the charge distribution of the atomistic molecule model. This polarisation leads to an electrostatic reaction field with which the atomistic molecular model then interacts. Implicit solvent models have been used for many years in combination with QM,200,201 MM202,203 and CG204,205 calculations. These represent a large class of truly multiscale methods, and have been reviewed in detail many times.204,206–209
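As a concrete illustration of how a dielectric continuum returns a reaction field energy to an atomistic solute, the sketch below evaluates a Generalized Born polarisation energy. It assumes the widely used pairwise form with f_GB = sqrt(r² + R_i R_j exp(-r² / (4 R_i R_j))), takes the Born radii as given inputs, and uses an approximate Coulomb constant in kcal Å mol⁻¹ e⁻². The function name and default dielectric constants are choices made for this example only.

```python
import math

COULOMB_KCAL = 332.0637  # approximate Coulomb constant, kcal*Angstrom/(mol*e^2)


def gb_polarisation_energy(charges, positions, born_radii,
                           eps_in=1.0, eps_out=78.5):
    """Generalized Born polarisation energy in kcal/mol.

    The double sum runs over all pairs including i == j, so the self terms
    reduce to the Born energy of each isolated charge.
    """
    prefactor = -0.5 * COULOMB_KCAL * (1.0 / eps_in - 1.0 / eps_out)
    total = 0.0
    n = len(charges)
    for i in range(n):
        for j in range(n):
            r2 = sum((a - b) ** 2 for a, b in zip(positions[i], positions[j]))
            rr = born_radii[i] * born_radii[j]
            f_gb = math.sqrt(r2 + rr * math.exp(-r2 / (4.0 * rr)))
            total += charges[i] * charges[j] / f_gb
    return prefactor * total
```

For a single ion the expression collapses to the Born formula, which gives a quick sanity check on any implementation.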

Implicit solvent models have been the dominant class of multiscale continuum methods over recent years. However, exciting new classes of multiscale continuum models have recently been developed. These new methods fall into the following categories, which are of similar definition and type to the interfaces used between the QM/MM and atomistic/CG levels:

1. One-way bottom-up interfacing methods. These involve a single transfer of information from the atomistic or CG level to the continuum level, e.g. by using the atomistic level to provide parameters for the continuum model.
2. Two-way embedded interfacing methods. These involve embedding an atomistic or CG model within a continuum representation. Implicit solvent models fall into this category. New multiscale methods, which capture hydrodynamic and mechanical effects, have now also been developed.
3. Two-way parallel interfacing methods. These involve the running of an atomistic or coarse grain level simulation in parallel with a continuum calculation, and interfacing the two using a lightly coupled interface, e.g. by using properties calculated from the continuum model to update the boundary conditions of the atomistic or CG simulation.

Examples of each of these categories of multiscale continuum interfaces will now be discussed in turn.


4.1 One-way bottom-up interfacing methods

Continuum models can be interfaced with atomistic models using a one-way bottom-up multiscale parameterisation scheme. In this scheme, an atomistic level calculation is used to provide the input parameters for the continuum model. This scheme is very simple, as it involves a single transfer of information from the atomistic level to the continuum level. Tang et al.210 have recently employed this scheme to get the parameters for a continuum model of a mechanosensitive ion channel embedded in a lipid membrane. The membrane was modelled as a homogeneous elastic sheet of thickness 35 Å and area 400 Å × 400 Å, using a finite element (FEM) model. The transmembrane helices were represented by homogeneous cylindrical elastic rods of diameter 5 Å with spherical caps at each end. The helices were embedded within the membrane, and were also modelled using a FEM model. What makes this a multiscale simulation is that the parameters for this model were derived from atomistic-level models. This represents a one-way, bottom-up information flow from the atomistic level to the continuum level. Elastic parameters for the model were calculated from the mechanical properties of the membrane and helices measured by experiment and calculated from atomistic molecular dynamics simulations. The interactions between the helices and the membrane were modelled using a Lennard-Jones style attractive/repulsive potential. The parameters for this potential were determined based on molecular mechanics calculations run using the CHARMM forcefield.27 The channel comprises ten helices. The interaction potential between helices was parameterized by fitting to CHARMM potential energy calculations. For each pair of helices, the interaction energy in vacuum was calculated using the CHARMM19 forcefield, with the coordinates set to either X-ray or homology structures. This calculation was performed for different combinations of helix pairs, which effectively samples different orientations, and a range of centre of mass inter-helix separations that varied between −20 Å and 20 Å. The interactions between each helix and the membrane were parameterized based on calculating the insertion energy profile of moving an atomistic model of the helix from an implicit solvent water model into an implicit lipid membrane model. In the implicit membrane models the membrane thickness was taken to be 23.5 Å, which corresponds to the thickness of the hydrophobic part of the membrane. While this model was constructed as a proof of concept of FEM methods, it was able to reproduce the experimental gating tension of the channel, and the structural variations along the gating pathway as observed from biased atomistic molecular dynamics simulations. This application successfully demonstrated the potential of continuum models of biomolecular systems to reach length (sub-μm) and time (multi-μs) scales that are not accessible to atomistic level calculations.
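The one-way, bottom-up step of fitting a pair potential to atomistic energies can be sketched as a small least-squares problem. The example assumes a 12-6 Lennard-Jones form, E(r) = A/r¹² - B/r⁶, which is linear in A and B; the text above says only that a "Lennard-Jones style" potential was used, so the functional form, the function names and the synthetic data in the usage note are all assumptions for illustration, not the procedure of Tang et al.

```python
def fit_lj_12_6(separations, energies):
    """Least-squares fit of E(r) = A / r**12 - B / r**6; returns (A, B).

    The model is linear in A and B, so the 2x2 normal equations are solved
    directly with Cramer's rule.
    """
    s_uu = s_uw = s_ww = s_ue = s_we = 0.0
    for r, e in zip(separations, energies):
        u = r ** -12       # repulsive basis function
        w = -(r ** -6)     # attractive basis function
        s_uu += u * u
        s_uw += u * w
        s_ww += w * w
        s_ue += u * e
        s_we += w * e
    det = s_uu * s_ww - s_uw * s_uw
    a = (s_ue * s_ww - s_uw * s_we) / det
    b = (s_uu * s_we - s_uw * s_ue) / det
    return a, b


def lj_sigma_epsilon(a, b):
    """Convert (A, B) into the equivalent sigma and well depth epsilon."""
    sigma = (a / b) ** (1.0 / 6.0)
    epsilon = b * b / (4.0 * a)
    return sigma, epsilon
```

In practice the `energies` list would hold interaction energies computed at the atomistic level for a series of separations; fitting synthetic data generated from known parameters and recovering them is a simple way to check the routine.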

4.2 Two-way embedded interfacing methods

Continuum models can be directly interfaced with atomistic or coarse grain models using a two-way embedded interface. In this scheme, the atomistic or CG model is embedded within a continuum model. Implicit solvent methods, in which an atomistic or CG model of a solute is embedded within a continuum model of the solvent, are popular and well-established examples of this type of interface. Implicit solvent models represent the solvent as a dielectric continuum, and allow the electrostatics of the atomistic or CG solute to polarise the continuum, which then results in an electrostatic reaction field that returns to interact with the solute. Implicit solvent models have been reviewed in detail many times before,204,206–209 and enable the dynamic transfer of electrostatic information across the atomistic/continuum or CG/continuum interfaces. Recently, new multiscale continuum methods have been developed that allow for the dynamic transfer of mechanical and hydrodynamic information across these interfaces. One example is the work by Villa et al.,211,212 who have developed a method that allows the transfer of mechanical information across the atomistic/continuum boundary. They have developed a multiscale model to investigate protein/DNA complexes. The method uses an atomistic model of the lac repressor protein and a continuum model of a large loop of DNA that is bound to the protein. The DNA loop is represented using a continuum elastic rod model, using elasticity theory. Protein MD simulations are used to position the DNA loop end points, and these end points then serve as crucial boundary conditions to elasticity theory to determine the shape of the loop. Elasticity theory then in turn provides the forces acting at the end points, which can be fed back into the protein molecular dynamics, completing the loop.

Most atomistic/continuum embedded multiscale methods split the system into continuum and atomistic regions by molecule, e.g. the solvent is defined as being in the continuum, while the solute is atomistic. While being straightforward to implement, such a split does not allow a complete representation of the physical detail of the system. One problem with continuum solvent models is that by representing all of the solvent as a uniform dielectric, they fail to capture atomic details, such as hydrogen bonding between the solute and solvent, which can be important factors to consider in protein-ligand binding or computational enzymology applications. It is therefore preferable to be able to split the system spatially into atomistic and continuum regions, e.g. to model the solvent in the active site of the protein using an atomistic model, with the remainder modelled as a continuum. However, such a model must account for the diffusion of solvent between the atomistic and continuum regions. Schemes must be developed that allow for the conservation of momentum, mass and energy across the interface. While we are unaware of any such methods that have been developed for application to protein systems, Coveney et al. have, over a series of papers,6,213–219 developed an ambitious and highly promising multiscale method that allows the embedding of condensed phase atomistic simulations within continuum models for the purpose of studying hydrodynamics.

The Coveney method works by dividing space into two main regions: an atomistic (particle) region, called P, and a continuum region, called C, modelled using computational fluid dynamics (CFD). An interface lies between these two regions. This interface is split into two parts: a C to P part, which sits on the continuum side of the interface, and a P to C part, which sits on the particle side. In the P region, the motion of the atomistic particles is solved using molecular dynamics. In the C region the motion of the continuum solvent is solved using standard continuum fluid dynamics (CFD). The hybrid scheme applied at the interface is designed to exchange fluxes of conserved quantities, namely energy, momentum and mass. In the C to P region, the fluxes from the continuum region are imposed on the particles. In the P to C region, the atomistic fluxes are coarse grained in time and space to form boundary conditions for the continuum domain.213 The details of how this is achieved are described elsewhere.213,215,220 In essence, the stress induced in the P region by flow in the C region can be calculated and converted into a local momentum flux at the C to P interface. This flux can be converted into a force which is added to all of the particles in the C to P region. This pressure force from the continuum liquid is enough to stop the escape of particles from the C to P region, and, at a basic level, is responsible for the transfer of information from the continuum to the particle region. The P to C interface, on the other hand, is used to transfer information back to the continuum model. As the continuum model is a mesoscale representation of the solvent, the particle model must be averaged in both space and time. An averaging time of Δt_avg is used, and the average momentum flux is calculated. This can then be used to establish the boundary condition of the continuum domain at the C to P interface. In addition to controlling the flow of momentum across the interface, the flow of mass must also be managed to ensure that the total mass in the whole system is conserved. This is achieved by calculating the mass flux at the interface and using that to determine whether particles need to be deleted (to represent flow of mass from the particle to continuum region), or whether particles need to be created (to represent flow back from the continuum region). Removing particles is simply a case of choosing those that are closest to the top of the molecular dynamics simulation and thus closest to the continuum region. Inserting particles is more difficult, as sufficient space has to be found for the new particle amongst the existing particles, and atomic fine detail (coordinates and momenta) has to be reconstructed. Coveney et al. have developed the USHER algorithm221 which efficiently finds a location and configuration of the new solvent molecule that releases an energy equal to the mean energy per molecule. The velocity of the new particle is obtained randomly from a Maxwell distribution with a zero mean velocity and temperature equal to the simulation temperature. One of the original applications of this method213 used a simple model system, with a model polymer attached to a wall, solvated by a single-particle model solvent. They were able to use this setup to study the dynamics of this polymer within a solvent which was undergoing shear flow. Coveney et al. refined their method214,219 to correctly couple hydrodynamics, using a fluctuating hydrodynamics (FH) continuum model,222 and were able to show that fluctuations at both the atomistic and continuum level were thermodynamically consistent. They demonstrated their method by simulating sound waves passing through a continuum, then atomistic, model of liquid water, and then being reflected by an atomistic model of a lipid membrane bilayer.214 The light coupling between the different types of simulation (MD and CFD or FH) makes it well-suited to application over a distributed computing cluster, and Coveney et al. have demonstrated the method’s successful deployment over computational GRIDs.6
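The velocity assignment for a newly inserted particle can be sketched in a few lines. The example assumes that each Cartesian component is drawn independently from a Gaussian with zero mean and variance k_B T / m, which is the standard way to sample a Maxwell distribution. The function name, the choice of SI units and the water mass used in the usage note are assumptions for illustration; finding the insertion site itself (the job of USHER) is not shown.

```python
import math
import random

K_B = 1.380649e-23        # Boltzmann constant, J/K
AMU = 1.66053906660e-27   # atomic mass unit, kg


def maxwell_velocity(mass_kg, temperature_k, rng):
    """Draw a 3D velocity (m/s) with zero mean at the given temperature.

    rng is a random.Random instance, so that runs can be made reproducible.
    """
    sigma = math.sqrt(K_B * temperature_k / mass_kg)
    return tuple(rng.gauss(0.0, sigma) for _ in range(3))
```

For example, `maxwell_velocity(18.015 * AMU, 300.0, random.Random(1))` returns a velocity for a water-mass particle at 300 K. Averaged over many draws, the kinetic temperature m⟨v²⟩ / (3 k_B) should approach the requested temperature.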

4.3 Two-way parallel interfacing methods

Creating a multiscale method that directly embeds an atomistic region within a continuum model introduces a physical interface between the atomistic and continuum models. This complicates the implementation, as techniques must be developed that allow for the transfer of information across this physical interface, e.g. the mass, momentum and energy flux across the interface must be controlled. Chang, Ayton and Voth223 have developed a multiscale atomistic/continuum method that avoids the creation of a physical interface between the two regions. They achieve this by running both a continuum and an atomistic model of the system in parallel, and only loosely coupling this pair of simulations by periodically transferring structural information between them. Chang et al. developed the method, called multiscale coupling (MSC),112,223,224 originally to study the effect of long range dynamics of a lipid bilayer membrane at the atomistic level. They loosely couple together an atomistic model of a bilayer with a continuum-based mesoscopic model. The continuum model is based on an elastic membrane (EM) implicit solvent model originally proposed by Ayton and Voth.225 This model was parameterized from mechanical properties calculated from atomistic MD simulations of a DMPC bilayer. To better capture the dynamics of a membrane, this model has been updated to include an explicit mesoscale model of water.223 The elastic membrane model was used to simulate a bilayer of dimension 890 Å × 890 Å. To make this a multiscale simulation, a point on the elastic sheet representing the membrane is chosen, and an atomistic model of the membrane at that point is constructed. The atomistic model comprised 64 DMPC lipid molecules solvated by 1312 TIP3P waters. A real biological membrane is not subject to external constraints, and therefore it adopts a tensionless configuration.226 However, simulations using the canonical (NVT) ensemble use a fixed number of particles and fixed membrane area. This results in a simulation of the membrane with a non-zero interfacial tension.227 This observation has resulted in the development of a new ensemble, whereby the membrane tension, γ, is kept constant.226 This new ensemble, NPγT, allows the membrane tension to become part of the definition of the thermodynamic state of the system, as γ has become a thermodynamic parameter. Chang et al. use this fact to couple their continuum-based mesoscopic membrane model to the atomistic model. They run the atomistic model using the NPγT ensemble, and update the value of γ every 1 ps in response to the changes in surface tension measured at the corresponding location in the continuum-based mesoscopic model. This allows a dynamic, one-way transfer of information from the continuum model to the atomistic model throughout the simulation. The advantage of this approach is that the atomistic level simulation does not directly interface with the continuum level simulation, so standard approaches can be used, e.g. periodic boundaries and the use of the particle mesh Ewald sum to model long range electrostatics. Also, while this application only allows the dynamic flow of information from the continuum to atomistic level, in theory, mechanical properties calculated from the atomistic simulation could then be used to reparameterize the continuum model, thereby completing the cycle.

Ayton and Voth228 have employed this multiscale coupling (MSC) method to study the effect of long range membrane motion on the dynamics of an ion channel. An atomistic ion channel was embedded within an atomistic simulation of the solvated bilayer, and this was coupled to the continuum-based EM model using the MSC method. Because formally an ensemble of atomistic simulations should be connected to the continuum model,228 they ran eight atomistic simulations in parallel, and connected each of these to the same point in the continuum model. They compared the dynamics of the ion channel using both the MSC boundary conditions and standard NPT boundary conditions and found that the orientation of histidine residues in the channel appeared more disordered when the atomistic level simulation was coupled to the continuum. This suggested that the fluctuations in stress caused by the long range dynamics of the membrane, modelled at the continuum level, could indeed affect the dynamics of the ion channel at the atomistic level.
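The loose coupling used in the MSC method, in which the tension γ of the atomistic simulation is refreshed from the continuum model at fixed intervals, can be outlined as a simple driver loop. Everything in the sketch is a toy stand-in: the two classes, their method names and the sinusoidal tension signal are hypothetical, the tension is in arbitrary units, and no molecular dynamics is actually performed. Only the 1 ps update interval is taken from the description above.

```python
import math


class ToyContinuumMembrane:
    """Hypothetical stand-in for the continuum-based mesoscopic membrane."""

    def local_tension(self, time_ps):
        # Arbitrary smooth fluctuation about zero tension, for illustration.
        return 5.0 * math.sin(0.3 * time_ps)


class ToyAtomisticSimulation:
    """Hypothetical stand-in for an atomistic NPγT simulation."""

    def __init__(self):
        self.gamma = 0.0
        self.time_ps = 0.0
        self.history = []

    def set_tension(self, gamma):
        self.gamma = gamma

    def run(self, duration_ps):
        # A real code would integrate the equations of motion here.
        self.time_ps += duration_ps
        self.history.append((self.time_ps, self.gamma))


def run_coupled(continuum, atomistic, total_ps, update_interval_ps=1.0):
    """One-way coupling: continuum tension -> atomistic γ at each interval."""
    n_updates = int(round(total_ps / update_interval_ps))
    for _ in range(n_updates):
        atomistic.set_tension(continuum.local_tension(atomistic.time_ps))
        atomistic.run(update_interval_ps)
    return atomistic.history
```

Because the two models exchange only a scalar at each interval, the atomistic side can keep its usual periodic boundaries, which is the practical advantage noted in the text.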

5. Beyond continuum models

Continuum models allow biological simulators to access extremely large length and timescales, e.g. there are continuum models of a complete human heart.229 There is, however, a level of modelling that addresses even larger length and time scales. At this level, biological network models are used to represent cells,230,231 organs232–234 or creatures.235,236 Detailed reviews230,237,238 of how to build models at this level are available. In brief, the methods work by constructing differential equations that describe the flow of material through a biological system, e.g. using a differential equation to represent an ionic current.237 Several equations can be coupled together to form a model, e.g. of the electric current flow across the human heart.232 The whole network of equations can be run together to produce an output signal that describes the biological property that the simulator is interested in modelling, e.g. modelling the oscillations in concentration of cellular metabolites.238

Multiple levels of biological network models can be combined together. The Physiome project239 is an attempt to define standard interfaces between these levels so that they can be combined together easily. To make these models fully multiscale, they could also be interfaced with atomistic, CG, or continuum representations. The conduits for information flow are the kinetic parameters that calibrate the differential equations that make up the network.238 For example, QM/MM computational enzymology calculations can be used to obtain the rate constants of enzyme catalysed reactions, or MM/CG calculations can be used to obtain the diffusion constants of ions through membrane pores. In this way, biological multiscale modelling will allow the coupling of simulations of electrons all the way up to models of a human heart.
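The way a kinetic parameter carries information from a molecular calculation into a network model can be illustrated with a two-step sketch. It assumes that a computed activation free energy is converted into a rate constant with the Eyring equation, k = (k_B T / h) exp(-ΔG‡ / RT), and that this rate constant then acts as k_cat in a single Michaelis–Menten rate equation integrated with a fourth-order Runge–Kutta scheme. The barrier height, concentrations and function names are hypothetical values chosen for illustration and do not come from the works cited above.

```python
import math

K_B = 1.380649e-23          # Boltzmann constant, J/K
H_PLANCK = 6.62607015e-34   # Planck constant, J*s
R_GAS = 8.314462618         # molar gas constant, J/(mol*K)
KCAL_TO_J = 4184.0


def eyring_rate_constant(barrier_kcal_per_mol, temperature_k=298.15):
    """Rate constant (1/s) from an activation free energy (kcal/mol)."""
    exponent = -barrier_kcal_per_mol * KCAL_TO_J / (R_GAS * temperature_k)
    return (K_B * temperature_k / H_PLANCK) * math.exp(exponent)


def simulate_substrate(k_cat, enzyme_conc, k_m, s0, t_end, dt):
    """Integrate dS/dt = -k_cat * E * S / (K_M + S) with RK4.

    Returns a list of (time, substrate concentration) pairs.
    """
    def rate(s):
        return -k_cat * enzyme_conc * s / (k_m + s)

    s = s0
    trajectory = [(0.0, s)]
    for step in range(int(round(t_end / dt))):
        k1 = rate(s)
        k2 = rate(s + 0.5 * dt * k1)
        k3 = rate(s + 0.5 * dt * k2)
        k4 = rate(s + dt * k3)
        s += dt * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        trajectory.append(((step + 1) * dt, s))
    return trajectory
```

A call such as `simulate_substrate(eyring_rate_constant(15.0), 1e-6, 1e-4, 1e-3, 10.0, 0.01)` then gives the substrate concentration over ten seconds for a hypothetical 15 kcal/mol barrier. A full network model would couple many such equations together.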


6. Conclusion

This review has outlined the techniques and methods involved in building computer models of biological systems that bridge between different length and time scales. Biomolecular multiscale modelling is not new: Levitt and Warshel pioneered much of the development of the foundations of the QM/MM and MM/CG interfaces in the 1970s. However, there has been an explosion of interest and activity in this area over the last few years that has been prompted both by the increases in available computer power, and the maturing of methodology at each of the different biomolecular modelling levels. As each level has matured, and become predictive, simulators are now rightly asking how can these methods be combined to answer the bigger questions of biology such as how does a cell maintain membrane cohesion, how do small changes in protein configuration initiate cell signalling pathways, and what is it about this collection of molecules that makes them a living system? While multiscale modelling methods are not yet capable of providing answers to these questions, the huge progress made over the last few years is now providing an unprecedented insight into the interplay between the biological and chemical worlds. Multiscale modelling methods will certainly be an exciting growth area of increasing importance in biology.

References

1 L. E. Bilston and K. Mylvaganam, FEBS Lett., 2002, 512(1–3), 185.
2 P. J. Booth, Curr. Opin. Struct. Biol., 2005, 15(4), 435.
3 A. Ababou, A. van der Wart, V. Gogonea and K. M. Merz, Biophys. Chem., 2007, 125(1), 221.
4 D. D. Vvedensky, J. Phys.-Condes. Matter, 2004, 16(50), R1537.
5 P. Koumoutsakos, Annu. Rev. Fluid Mech., 2005, 37, 457.
6 R. Delgado-Buscalioni, P. V. Coveney, G. D. Riley and R. W. Ford, Philos. Trans. R. Soc. A-Math. Phys. Eng. Sci., 2005, 363(1833), 1975.
7 P. V. Coveney and P. W. Fowler, J. R. Soc. Interface, 2005, 2(4), 267.
8 Y. Sugita, A. Kitao and Y. Okamoto, J. Chem. Phys., 2000, 113(15), 6042.
9 H. Fukunishi, O. Watanabe and S. Takada, J. Chem. Phys., 2002, 116(20), 9058.
10 M. Christen and W. F. van Gunsteren, J. Chem. Phys., 2006, 124(15), 154106.
11 R. J. Bartlett and M. Musiał, Rev. Mod. Phys., 2007, 79(1), 291.
12 W. A. Lester and R. Salomon-Ferrer, Theochem.-J. Mol. Struct., 2006, 771(1–3), 51.
13 J. Z. Wu and Z. D. Li, Annu. Rev. Phys. Chem., 2007, 58, 85.
14 S. Y. Zou, G. G. Balint-Kurti and F. R. Manby, J. Chem. Phys., 2007, 127(4), 044107.
15 H. Y. Liu et al., Proteins, 2001, 44(4), 484.
16 M. J. S. Dewar, E. G. Zoebisch, E. F. Healy and J. J. P. Stewart, J. Am. Chem. Soc., 1985, 107(13), 3902.
17 J. J. P. Stewart, J. Comput. Chem., 1989, 10(2), 209.
18 J. J. P. Stewart, J. Comput. Chem., 1989, 10(2), 221.
19 M. Schutz and H. J. Werner, Chem. Phys. Lett., 2000, 318(4–5), 370.
20 S. Patel and C. L. Brooks, Mol. Simul., 2006, 32(3–4), 231.
21 A. Warshel, M. Kato and A. V. Pisliakov, J. Chem. Theory Comput., 2007, 3(6), 2034.
22 D. J. Price and C. L. Brooks, J. Comput. Chem., 2002, 23(11), 1045.
23 Y. G. Mu, D. S. Kosov and G. Stock, J. Phys. Chem. B, 2003, 107(21), 5064.
24 A. D. Mackerell, J. Comput. Chem., 2004, 25(13), 1584.
25 M. Patra and M. Karttunen, J. Comput. Chem., 2004, 25(5), 678.
26 B. Hess and N. F. A. van der Vegt, J. Phys. Chem. B, 2006, 110(35), 17616.
27 A. D. MacKerell et al., J. Phys. Chem. B, 1998, 102(18), 3586.
28 D. A. Case et al., J. Comput. Chem., 2005, 26(16), 1668.
29 L. J. Yang et al., J. Phys. Chem. B, 2006, 110(26), 13166.
30 X. Daura, A. E. Mark and W. F. van Gunsteren, J. Comput. Chem., 1998, 19(5), 535.
31 W. R. P. Scott et al., J. Phys. Chem. A, 1999, 103(19), 3596.
32 L. D. Schuler, X. Daura and W. F. Van Gunsteren, J. Comput. Chem., 2001, 22(11), 1205.
33 W. L. Jorgensen, J. D. Madura and C. J. Swenson, J. Am. Chem. Soc., 1984, 106(22), 6638.
34 W. L. Jorgensen, D. S. Maxwell and J. Tirado-Rives, J. Am. Chem. Soc., 1996, 118(45), 11225.
35 J. A. McCammon, B. R. Gelin and M. Karplus, Nature, 1977, 267(5612), 585.


36 M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press, 1987.
37 N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller and E. Teller, J. Chem. Phys., 1953, 21(6), 1087.
38 A. Warshel and M. Karplus, J. Am. Chem. Soc., 1972, 94(16), 5612.
39 A. Warshel and M. Levitt, J. Mol. Biol., 1976, 103(2), 227.
40 J. Zielkiewicz, J. Chem. Phys., 2005, 123(10), 104501.
41 Q. Zhang and Z. Z. Yang, Acta Phys.-Chim. Sin., 2007, 23(10), 1565.
42 W. L. Jorgensen, J. Chandrasekhar, J. D. Madura, R. W. Impey and M. L. Klein, J. Chem. Phys., 1983, 79(2), 926.
43 M. W. Mahoney and W. L. Jorgensen, J. Chem. Phys., 2000, 112(20), 8910.
44 J. M. Wang, R. M. Wolf, J. W. Caldwell, P. A. Kollman and D. A. Case, J. Comput. Chem., 2004, 25(9), 1157.
45 A. Jakalian, B. L. Bush, D. B. Jack and C. I. Bayly, J. Comput. Chem., 2000, 21(2), 132.
46 A. Jakalian, D. B. Jack and C. I. Bayly, J. Comput. Chem., 2002, 23(16), 1623.
47 G. A. Kaminski and W. L. Jorgensen, J. Phys. Chem. B, 1998, 102(10), 1787.
48 J. W. Storer, D. J. Giesen, C. J. Cramer and D. G. Truhlar, J. Comput.-Aided Mol. Des., 1995, 9(1), 87.
49 G. M. Wang and A. C. Sandberg, Nanotechnology, 2007, 18(13), 135702.
50 A. E. Cho, V. Guallar, B. J. Berne and R. Friesner, J. Comput. Chem., 2005, 26(9), 915.
51 C. R. W. Guimaraes, M. Udier-Blagovic, I. Tubert-Brohman and W. L. Jorgensen, J. Chem. Theory Comput., 2005, 1(4), 617.
52 C. R. W. Guimaraes, M. Udier-Blagovic and W. L. Jorgensen, J. Am. Chem. Soc., 2005, 127(10), 3577.
53 R. D. Taylor, P. J. Jewsbury and J. W. Essex, J. Comput.-Aided Mol. Des., 2002, 16(3), 151.
54 C. Hensen et al., J. Med. Chem., 2004, 47(27), 6673.
55 F. Beierlein, H. Lanig, G. Schurer, A. H. C. Horn and T. Clark, Mol. Phys., 2003, 101(15), 2469.
56 J. Chandrasekhar, S. Shariffskul and W. L. Jorgensen, J. Phys. Chem. B, 2002, 106(33), 8078.
57 O. Acevedo and W. L. Jorgensen, J. Am. Chem. Soc., 2006, 128(18), 6141.
58 O. Acevedo and W. L. Jorgensen, J. Org. Chem., 2006, 71(13), 4896.
59 O. Acevedo and W. L. Jorgensen, J. Chem. Theory Comput., 2007, 3(4), 1412.
60 M. P. Repasky, J. Chandrasekhar and W. L. Jorgensen, J. Comput. Chem., 2002, 23(16), 1601.
61 J. D. Thompson, C. J. Cramer and D. G. Truhlar, J. Comput. Chem., 2003, 24(11), 1291.
62 M. Udier-Blagovic, P. M. De Tirado, S. A. Pearlman and W. L. Jorgensen, J. Comput. Chem., 2004, 25(11), 1322.
63 A. Warshel, Annu. Rev. Biophys. Biomolec. Struct., 2003, 32, 425.
64 M. Svensson et al., J. Phys. Chem., 1996, 100(50), 19357.
65 T. Vreven et al., J. Chem. Theory Comput., 2006, 2(3), 815.
66 K. P. Eurenius, D. C. Chatfield, B. R. Brooks and M. Hodoscek, Int. J. Quantum Chem., 1996, 60(6), 1189.
67 J. L. Gao and D. G. Truhlar, Annu. Rev. Phys. Chem., 2002, 53, 467.
68 D. Riccardi et al., J. Phys. Chem. B, 2006, 110(13), 6458.
69 H. Lin and D. G. Truhlar, Theor. Chem. Acc., 2007, 117(2), 185.
70 T. Laino, F. Mohamed, A. Laio and M. Parrinello, J. Chem. Theory Comput., 2005, 1(6), 1176.
71 U. C. Singh and P. A. Kollman, J. Comput. Chem., 1986, 7(6), 718.
72 P. A. Bash, M. J. Field and M. Karplus, J. Am. Chem. Soc., 1987, 109(26), 8092.
73 V. Thery, D. Rinaldi, J. L. Rivail, B. Maigret and G. G. Ferenczy, J. Comput. Chem., 1994, 15(3), 269.
74 L. G. Gorb, J. L. Rivail, V. Thery and D. Rinaldi, Int. J. Quantum Chem., 1996, 60(7), 1525.
75 J. L. Gao, P. Amara, C. Alhambra and M. J. Field, J. Phys. Chem. A, 1998, 102(24), 4714.
76 J. Z. Pu, J. L. Gao and D. G. Truhlar, ChemPhysChem, 2005, 6(9), 1853.
77 Hypercube, Inc., Hyperchem Users Manual, Computational Chemistry, Hypercube, Inc., Waterloo, Ontario, Canada, 1994.
78 E. Derat, J. Bouquant and S. Humbel, Theochem.-J. Mol. Struct., 2003, 632, 61.
79 P. D. Lyne, M. Hodoscek and M. Karplus, J. Phys. Chem. A, 1999, 103(18), 3462.
80 Y. K. Zhang, T. S. Lee and W. T. Yang, J. Chem. Phys., 1999, 110(1), 46.
81 Y. K. Zhang, Theor. Chem. Acc., 2006, 116(1–3), 43.
82 Y. K. Zhang, J. Chem. Phys., 2005, 122(2), 024114.


83 I. Antes and W. Thiel, J. Phys. Chem. A, 1999, 103(46), 9290.

84 O. A. von Lilienfeld, I. Tavernelli, U. Rothlisberger and D. Sebastiani, J. Chem. Phys., 2005, 122(1), 014113.

85 N. Reuter, A. Dejaegere, B. Maigret and M. Karplus, J. Phys. Chem. A, 2000, 104(8), 1720.

86 A. Fornili, P. F. Loos, M. Sironi and X. Assfeld, Chem. Phys. Lett., 2006, 427(1–3), 236.

87 J. Z. Pu, J. L. Gao and D. G. Truhlar, J. Phys. Chem. A, 2004, 108(25), 5454.

88 M. Elstner, Theor. Chem. Acc., 2006, 116(1–3), 316.

89 J. Z. Pu, J. L. Gao and D. G. Truhlar, J. Phys. Chem. A, 2004, 108(4), 632.

90 P. H. Konig, M. Hoffmann, T. Frauenheim and Q. Cui, J. Phys. Chem. B, 2005, 109(18), 9082.

91 A. Rodriguez, C. Oliva, M. Gonzalez, M. van der Kamp and A. J. Mulholland, J. Phys. Chem. B, 2007, 111(44), 12909.

92 F. Claeyssens et al., Angew. Chem.-Int. Edit., 2006, 45(41), 6856.

93 M. Strajbl, G. Y. Hong and A. Warshel, J. Phys. Chem. B, 2002, 106(51), 13333.

94 R. P. Muller and A. Warshel, J. Phys. Chem., 1995, 99(49), 17516.

95 R. W. Zwanzig, J. Chem. Phys., 1954, 22(8), 1420.

96 W. L. Jorgensen, J. F. Blake and J. K. Buckner, Chem. Phys., 1989, 129(2), 193.

97 A. Yadav, R. M. Jackson, J. J. Holbrook and A. Warshel, J. Am. Chem. Soc., 1991, 113(13), 4800.

98 R. H. Wood, E. M. Yezdimer, S. Sakane, J. A. Barriocanal and D. J. Doren, J. Chem. Phys., 1999, 110(3), 1329.

99 M. Strajbl, A. Shurki, M. Kato and A. Warshel, J. Am. Chem. Soc., 2003, 125(34), 10228.

100 Y. Ming, G. L. Lai, C. H. Tong, R. H. Wood and D. J. Doren, J. Chem. Phys., 2004, 121(2), 773.

101 T. H. Rod and U. Ryde, J. Chem. Theory Comput., 2005, 1(6), 1240.

102 E. Rosta, M. Klahn and A. Warshel, J. Phys. Chem. B, 2006, 110(6), 2934.

103 W. K. Hastings, Biometrika, 1970, 57(1), 97.

104 R. Iftimie, D. Salahub and J. Schofield, J. Chem. Phys., 2003, 119(21), 11285.

105 C. J. Woods, F. R. Manby and A. J. Mulholland, J. Chem. Phys., 2008, 128(1), 014109.

106 R. Iftimie, D. Salahub, D. Q. Wei and J. Schofield, J. Chem. Phys., 2000, 113(12), 4852.

107 D. A. Pearlman, J. Phys. Chem., 1994, 98(5), 1487.

108 M. Mezei, J. Chem. Phys., 1987, 86(12), 7084.

109 C. J. Woods, J. W. Essex and M. A. King, J. Phys. Chem. B, 2003, 107(49), 13703.

110 H. Leontiadou, A. E. Mark and S. J. Marrink, Biophys. J., 2007, 92(12), 4209.

111 G. S. Ayton and G. A. Voth, Biophys. J., 2004, 87(5), 3299.

112 J. W. Chu, G. S. Ayton, S. Izvekov and G. A. Voth, Mol. Phys., 2007, 105(2–3), 167.

113 S. O. Nielsen, C. F. Lopez, G. Srinivas and M. L. Klein, J. Phys.-Condes. Matter, 2004, 16(15), R481.

114 V. Tozzini, Curr. Opin. Struct. Biol., 2005, 15(2), 144.

115 M. Venturoli, M. M. Sperotto, M. Kranenburg and B. Smit, Phys. Rep.-Rev. Sec. Phys. Lett., 2006, 437(1–2), 1.

116 M. Levitt and A. Warshel, Nature, 1975, 253(5494), 694.

117 M. Levitt, J. Mol. Biol., 1976, 104(1), 59.

118 B. Smit et al., Nature, 1990, 348(6302), 624.

119 R. Goetz, G. Gompper and R. Lipowsky, Phys. Rev. Lett., 1999, 82(1), 221.

120 H. Noguchi and M. Takasu, Phys. Rev. E, 2001, 6404(4), 041913.

121 G. Brannigan and F. L. H. Brown, J. Chem. Phys., 2004, 120(2), 1059.

122 D. J. Michel and D. J. Cleaver, J. Chem. Phys., 2007, 126(3), 034506.

123 J. G. Gay and B. J. Berne, J. Chem. Phys., 1981, 74(6), 3316.

124 J. C. Shelley, M. Y. Shelley, R. C. Reeder, S. Bandyopadhyay and M. L. Klein, J. Phys. Chem. B, 2001, 105(19), 4464.

125 J. C. Shelley et al., J. Phys. Chem. B, 2001, 105(40), 9785.

126 S. J. Marrink, A. H. de Vries and A. E. Mark, J. Phys. Chem. B, 2004, 108(2), 750.

127 M. Orsi, D. Y. Haubertin, W. Sanderson and J. W. Essex, J. Phys. Chem. B, in press.

128 S. D. Chao, J. D. Kress and A. Redondo, J. Chem. Phys., 2005, 122(23), 234912.

129 C. F. Lopez, S. O. Nielsen, G. Srinivas, W. F. DeGrado and M. L. Klein, J. Chem. Theory Comput., 2006, 2(3), 649.

130 D. Van der Spoel et al., J. Comput. Chem., 2005, 26(16), 1701.

131 E. Lindahl, B. Hess and D. van der Spoel, J. Mol. Model., 2001, 7(8), 306.

132 S. J. Marrink, J. Risselada and A. E. Mark, Chem. Phys. Lipids, 2005, 135(2), 223.

133 V. Knecht and S. J. Marrink, Biophys. J., 2007, 92(12), 4254.

134 X. Periole, T. Huber, S. J. Marrink and T. P. Sakmar, J. Am. Chem. Soc., 2007, 129(33), 10126.


135 R. Baron, A. H. de Vries, P. H. Hunenberger and W. F. van Gunsteren, J. Phys. Chem. B, 2006, 110(31), 15602.

136 R. Baron, A. H. de Vries, P. H. Hunenberger and W. F. van Gunsteren, J. Phys. Chem. B, 2006, 110(16), 8464.

137 R. Baron et al., ChemPhysChem, 2007, 8(3), 452.

138 S. J. Marrink, H. J. Risselada, S. Yefimov, D. P. Tieleman and A. H. de Vries, J. Phys. Chem. B, 2007, 111(27), 7812.

139 P. J. Bond and M. S. P. Sansom, J. Am. Chem. Soc., 2006, 128(8), 2697.

140 P. J. Bond, J. Holyoake, A. Ivetac, S. Khalid and M. S. P. Sansom, J. Struct. Biol., 2007, 157(3), 593.

141 S. Izvekov and G. A. Voth, J. Phys. Chem. B, 2005, 109(7), 2469.

142 S. Izvekov and G. A. Voth, J. Chem. Phys., 2005, 123(13), 134105.

143 F. Ercolessi and J. B. Adams, Europhys. Lett., 1994, 26(8), 583.

144 S. Izvekov, M. Parrinello, C. J. Burnham and G. A. Voth, J. Chem. Phys., 2004, 120(23), 10896.

145 J. W. Chu, S. Izvekov and G. A. Voth, Mol. Simul., 2006, 32(3–4), 211.

146 W. G. Noid, J. W. Chu, G. S. Ayton and G. A. Voth, J. Phys. Chem. B, 2007, 111(16), 4116.

147 S. Izvekov, A. Violi and G. A. Voth, J. Phys. Chem. B, 2005, 109(36), 17019.

148 P. Liu, S. Izvekov and G. A. Voth, J. Phys. Chem. B, 2007, 111(39), 11566.

149 Y. T. Wang, S. Izvekov, T. Y. Yan and G. A. Voth, J. Phys. Chem. B, 2006, 110(8), 3564.

150 S. Izvekov and G. A. Voth, J. Chem. Theory Comput., 2006, 2(3), 637.

151 S. Izvekov and G. A. Voth, J. Chem. Phys., 2006, 125(15), 151101.

152 A. P. Lyubartsev, Eur. Biophys. J. Biophys. Lett., 2005, 35(1), 53.

153 A. P. Lyubartsev and A. Laaksonen, Phys. Rev. E, 1995, 52(4), 3730.

154 D. Reith, M. Putz and F. Muller-Plathe, J. Comput. Chem., 2003, 24(13), 1624.

155 A. Arkhipov, P. L. Freddolino and K. Schulten, Structure, 2006, 14(12), 1767.

156 C. E. A. Chang, J. Trylska, V. Tozzini and J. A. McCammon, Chem. Biol. Drug Des., 2007, 69(1), 5.

157 J. Trylska, V. Tozzini and J. A. McCammon, Biophys. J., 2005, 89(3), 1455.

158 V. Tozzini and J. A. McCammon, Chem. Phys. Lett., 2005, 413(1–3), 123.

159 C. E. Chang, T. Shen, J. Trylska, V. Tozzini and J. A. McCammon, Biophys. J., 2006, 90(11), 3880.

160 V. Tozzini, W. Rocchia and J. A. McCammon, J. Chem. Theory Comput., 2006, 2(3), 667.

161 V. Tozzini, J. Trylska, C. E. Chang and J. A. McCammon, J. Struct. Biol., 2007, 157(3), 606.

162 J. Trylska, V. Tozzini, C. A. Chang and J. A. McCammon, Biophys. J., 2007, 92(12), 4179.

163 A. P. Heath, L. E. Kavraki and C. Clementi, Proteins, 2007, 68(3), 646.

164 M. Feig, P. Rotkiewicz, A. Kolinski, J. Skolnick and C. L. Brooks, Proteins, 2000, 41(1), 86.

165 M. Milik, A. Kolinski and J. Skolnick, J. Comput. Chem., 1997, 18(1), 80.

166 Z. X. Xiang and B. Honig, J. Mol. Biol., 2001, 311(2), 421.

167 J. M. Wang, P. Cieplak and P. A. Kollman, J. Comput. Chem., 2000, 21(12), 1049.

168 D. A. Case et al., AMBER 8, University of California, 2004.

169 A. Onufriev, D. Bashford and D. A. Case, Proteins, 2004, 55(2), 383.

170 P. Das, S. Matysiak and C. Clementi, Proc. Natl. Acad. Sci. USA, 2005, 102(29), 10141.

171 F. Ding, K. C. Prutzman, S. L. Campbell and N. V. Dokholyan, Structure, 2006, 14(1), 5.

172 J. Hermans, H. J. C. Berendsen, W. F. van Gunsteren and J. P. M. Postma, Biopolymers, 1984, 23(8), 1513.

173 T. Lazaridis and M. Karplus, Proteins, 1999, 35(2), 133.

174 T. Kortemme, A. V. Morozov and D. Baker, J. Mol. Biol., 2003, 326(4), 1239.

175 S. Sharma, F. Ding and N. V. Dokholyan, Biophys. J., 2007, 92(5), 1457.

176 M. Nanias, C. Czaplewski and H. A. Scheraga, J. Chem. Theory Comput., 2006, 2(3), 513.

177 E. Lyman and D. M. Zuckerman, J. Chem. Theory Comput., 2006, 2(3), 656.

178 T. Z. Lwin and R. Luo, J. Chem. Phys., 2005, 123(19), 194904.

179 Q. Shi, S. Izvekov and G. A. Voth, J. Phys. Chem. B, 2006, 110(31), 15045.

180 L. Whitehead, C. M. Edge and J. W. Essex, J. Comput. Chem., 2001, 22(14), 1622.

181 J. Michel, M. Orsi and J. W. Essex, J. Phys. Chem. B, in press.

182 Y. Liu and T. Ichiye, J. Phys. Chem., 1996, 100(7), 2723.

183 A. Chandra and T. Ichiye, J. Chem. Phys., 1999, 111(6), 2701.

184 M. L. Tan, B. R. Brooks and T. Ichiye, Chem. Phys. Lett., 2006, 421(1–3), 166.

185 C. J. Fennell and J. D. Gezelter, J. Chem. Phys., 2004, 120(19), 9175.

186 M. Neri, C. Anselmi, M. Cascella, A. Maritan and P. Carloni, Phys. Rev. Lett., 2005, 95(21), 218102.


187 M. Praprotnik, L. Delle Site and K. Kremer, J. Chem. Phys., 2005, 123(22), 224106.

188 M. Praprotnik, L. Delle Site and K. Kremer, Phys. Rev. E, 2006, 73(6), 066701.

189 M. Praprotnik, K. Kremer and L. Delle Site, J. Phys. A-Math. Theor., 2007, 40(15), F281.

190 M. Praprotnik, K. Kremer and L. Delle Site, Phys. Rev. E, 2007, 75(1), 017701.

191 M. Praprotnik, S. Matysiak, L. Delle Site, K. Kremer and C. Clementi, J. Phys.-Condes. Matter, 2007, 19(29), 292201.

192 B. Ensing, S. O. Nielsen, P. B. Moore, M. L. Klein and M. Parrinello, J. Chem. Theory Comput., 2007, 3(3), 1100.

193 M. Tuckerman, B. J. Berne and G. J. Martyna, J. Chem. Phys., 1992, 97(3), 1990.

194 R. E. Rudd and J. Q. Broughton, Phys. Rev. B, 2005, 72(14), 144104.

195 S. A. Orszag and I. Staroselsky, Comput. Phys. Commun., 2000, 127(1), 165.

196 N. A. Baker, J. Comput. Chem., 2005, 21, 349.

197 N. A. Baker, Curr. Opin. Struct. Biol., 2005, 15(2), 137.

198 J. H. Chen, W. P. Im and C. L. Brooks, J. Am. Chem. Soc., 2006, 128(11), 3728.

199 Z. Y. Yu, M. P. Jacobson and R. A. Friesner, J. Comput. Chem., 2006, 27(1), 72.

200 C. Park, M. J. Carlson and W. A. Goddard, J. Phys. Chem. A, 2000, 104(11), 2498.

201 Y. H. Jang et al., J. Phys. Chem. B, 2003, 107(1), 344.

202 N. J. English, J. Mol. Model., 2007, 13(10), 1081.

203 S. P. Brown and S. W. Muchmore, J. Chem. Inf. Model., 2007, 47(4), 1493.

204 G. Brannigan, L. C. L. Lin and F. L. H. Brown, Eur. Biophys. J. Biophys. Lett., 2006, 35(2), 104.

205 I. Lotan and T. Head-Gordon, J. Chem. Theory Comput., 2006, 2(3), 541.

206 W. Im, J. H. Chen and C. L. Brooks, Adv. Protein Chem., 2006, 72, 173.

207 A. Warshel, P. K. Sharma, M. Kato and W. W. Parson, BBA-Proteins Proteomics, 2006, 1764(11), 1647.

208 J. Carlsson, M. Ander, M. Nervall and J. Aqvist, J. Phys. Chem. B, 2006, 110(24), 12034.

209 P. Koehl, Curr. Opin. Struct. Biol., 2006, 16(2), 142.

210 Y. Y. Tang et al., Biophys. J., 2006, 91(4), 1248.

211 E. Villa, A. Balaeff, L. Mahadevan and K. Schulten, Multiscale Model. Simul., 2004, 2(4), 527.

212 E. Villa, A. Balaeff and K. Schulten, Proc. Natl. Acad. Sci. USA, 2005, 102(19), 6783.

213 S. Barsky, R. Delgado-Buscalioni and P. V. Coveney, J. Chem. Phys., 2004, 121(5), 2403.

214 G. De Fabritiis, R. Delgado-Buscalioni and P. V. Coveney, Phys. Rev. Lett., 2006, 97(13), 134501.

215 R. Delgado-Buscalioni and P. V. Coveney, Philos. Trans. R. Soc. Lond. Ser. A-Math. Phys. Eng. Sci., 2004, 362(1821), 1639.

216 R. Delgado-Buscalioni, E. G. Flekkoy and P. V. Coveney, Europhys. Lett., 2005, 69(6), 959.

217 R. Delgado-Buscalioni and P. V. Coveney, Physica A, 2006, 362(1), 30.

218 E. G. Flekkoy, R. Delgado-Buscalioni and P. V. Coveney, Phys. Rev. E, 2005, 72(2), 026703.

219 G. Giupponi, G. De Fabritiis and P. V. Coveney, J. Chem. Phys., 2007, 126(15), 154903.

220 R. Delgado-Buscalioni and P. V. Coveney, Phys. Rev. E, 2003, 67(4), 046704.

221 R. Delgado-Buscalioni and P. V. Coveney, J. Chem. Phys., 2003, 119(2), 978.

222 G. De Fabritiis, M. Serrano, R. Delgado-Buscalioni and P. V. Coveney, Phys. Rev. E, 2007, 75(2), 026307.

223 R. Chang, G. S. Ayton and G. A. Voth, J. Chem. Phys., 2005, 122(24), 244716.

224 G. S. Ayton, W. G. Noid and G. A. Voth, Curr. Opin. Struct. Biol., 2007, 17(2), 192.

225 G. Ayton and G. A. Voth, Biophys. J., 2002, 83(6), 3357.

226 M. Kranenburg, J. P. Nicolas and B. Smit, Phys. Chem. Chem. Phys., 2004, 6(16), 4142.

227 M. Kranenburg, C. Laforge and B. Smit, Phys. Chem. Chem. Phys., 2004, 6(19), 4531.

228 G. S. Ayton and G. A. Voth, J. Struct. Biol., 2007, 157(3), 570.

229 G. Zini, A. Sarti and C. Lamberti, IEEE Trans. Ultrason. Ferroelectr. Freq. Control, 1997, 44(2), 297.

230 D. Noble, Biochem. Soc. Trans., 2003, 31, 156.

231 S. H. G. Khoo and M. Al-Rubeai, Biotechnol. Appl. Biochem., 2007, 47, 71.

232 A. Michailova, J. Saucerman, M. E. Belik and A. D. McCulloch, Biophys. J., 2005, 88(3), 2234.

233 D. Noble, Biochem. Soc. Trans., 2005, 33, 539.

234 D. Noble, Biosystems, 2006, 83(2–3), 75.

235 C. V. Forst, Drug Discov. Today, 2006, 11(5–6), 220.

236 D. B. Kell, IUBMB Life, 2007, 59(11), 689.

237 J. B. Bassingthwaighte, H. J. Chizeck and L. E. Atlas, Proc. IEEE, 2006, 94(4), 819.

238 M. Stein, R. R. Gabdoulline and R. C. Wade, Curr. Opin. Struct. Biol., 2007, 17(2), 166.

239 P. J. Hunter, W. W. Li, A. D. McCulloch and D. Noble, Computer, 2006, 39(11), 48.
