E. Darve, ICME, 2/5/2007
Computational models of bio-molecules
Eric Darve
Mechanical Engineering
Stanford
Protein models
Life begins with cells
Single 200 micrometer cell (the human egg), with sperm. From the union of an egg and sperm will arise the 10 trillion cells of a human body.
Cells are the building blocks of the body.
What is the cell filled with?
A cell is filled with molecules: from ions and small molecules to macromolecules
Proteins give cells structure and perform most cellular tasks.
Water-accessible surface of proteins; notice the complex three-dimensional shape.
Enzyme, hormone, antibody, blood’s oxygen carrier.
[Figure labels: enzyme, hormone, oxygen carrier, antibody, enzyme, cell membrane]
The code for this machinery is in the DNA
The DNA stores a code in the form of a succession of four letters A, G, T, C.
A section is copied into a ribonucleic acid (RNA).
The ribosome performs the translation: amino acids get linked together to form a protein.
The order is specified by the RNA; a universal genetic code is followed.
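The two steps above (copying DNA into RNA, then reading the RNA three letters at a time) can be sketched in a few lines. This is only an illustration: the codon table below lists a handful of entries from the standard genetic code rather than all 64, and the function names and the input sequence are made up for the example.

```python
# Partial codon table (standard genetic code); only a few of the 64 codons are listed.
CODON_TABLE = {
    "AUG": "Met",  # also the usual start codon
    "UUU": "Phe", "UUC": "Phe",
    "GGU": "Gly", "GGC": "Gly", "GGA": "Gly", "GGG": "Gly",
    "GCU": "Ala", "GCC": "Ala",
    "AAA": "Lys", "AAG": "Lys",
    "UGG": "Trp",
    "UAA": "STOP", "UAG": "STOP", "UGA": "STOP",
}


def transcribe(dna_coding_strand):
    """Copy a DNA coding-strand sequence into RNA (T is replaced by U)."""
    return dna_coding_strand.upper().replace("T", "U")


def translate(rna):
    """Read codons three letters at a time until a stop codon or the end."""
    protein = []
    for i in range(0, len(rna) - 2, 3):
        residue = CODON_TABLE[rna[i:i + 3]]  # KeyError for codons missing from the partial table
        if residue == "STOP":
            break
        protein.append(residue)
    return protein


# Hypothetical input sequence, chosen so every codon is in the partial table.
rna = transcribe("ATGTTTGGCAAATGGTAA")
protein = translate(rna)
print(rna)
print("-".join(protein))
```

The ribosome does the equivalent of `translate`: each codon selects one amino acid, and the chain grows until a stop codon is reached.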
Biology is a multiscale problem
DNA double helix: 2 nm
Eight cells in an embryo: 200 micrometers
Wolf spider: 15 mm
Emperor penguin: 1 m
Atomistic computer models
Structure and function of proteins are tightly coupled
Proteins are defined by a unique sequence of amino acids.
There are 20 amino acids.
A hierarchy of folding processes gives rise to large complexes or assemblies.
Modeling becomes increasingly harder as the size increases.
Proteins are polypeptides formed by chaining amino acids
Tripeptide: peptide bonds (yellow) link the amide nitrogen atom (blue) of one amino acid with the carbonyl carbon atom (gray) of an adjacent one in the linear polymers known as polypeptides.
Proteins are polypeptides (100s to 1000s of amino acids) that have folded into a defined 3D shape.
The side chain (R group, green) determines the properties of each amino acid.
The simplest structure is the alpha helix
The alpha helix: the most basic secondary structure.
The backbone (red) is folded into a spiral that is held in place by hydrogen bonds between backbone oxygen and hydrogen atoms.
Side chain R groups cover the outside of the helix.
The helix has a directionality because all the hydrogen-bond donors have the same orientation.
Some diseases are caused by proteins which misfold
Alzheimer’s: caused by the formation of insoluble plaques composed of amyloid protein.
Conformation changes from alpha-helix to beta-sheet.
This leads to an aggregation into filaments (amyloid) found in plaques.
Ion channels allow molecules to come in and out of the cell
Protein sits in the membrane of the cell.
Two conformations: open and closed.
Hydrophilic groups face the inside of the channel while hydrophobic groups face the lipid bilayer.
The selectivity filter determines the ion selectivity of the channel.
Ion channel: Bacterial K+ channel
[Figure panels: top view, side view. Labels: potassium ion, selectivity loop, pore helix, vestibule, inner helix, outer helix]
Color code: acidic=red, basic=blue, polar=green, non-polar=white
The Art of Water Transport in Aquaporins: UIUC, theoretical and computational biophysics group
Aquaporins are membrane water channels that play critical roles in controlling the water contents of cells. These channels are widely distributed in all kingdoms of life, including bacteria, plants, and mammals.
They form tetramers in the cell membrane, and facilitate the transport of water and, in some cases, other small solutes across the membrane.
Ion channels are gated by different mechanisms
Channels are often gated, i.e., they don’t stay open or closed but open briefly and close again.
Gating mechanisms: voltage-gated, binding of a ligand, mechanically gated.
Example: gated ion channels mediate most forms of electrical signaling in the nervous system.
Project with the School of Medicine: sense of touch
Free energy is used to understand these changes of conformation
Free energy: used to describe systems at constant temperature and pressure.
Under these conditions, a system evolves so that its free energy is minimized.
On the right: a typical free energy profile for a reaction.
A reaction is favorable if the free energy of the products is lower than that of the reactants.
High-energy transition state must be crossed: activation energy.
In mechanically gated channels, a force applied to the channel lowers the barrier, enabling the channel to open.
The adaptive biasing force is a numerical technique to efficiently calculate such profiles.
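In the adaptive biasing force method, the mean force along the reaction coordinate is accumulated in bins during the simulation, and the free energy profile follows by integrating it, since A'(x) = -<F>(x). The sketch below shows only that integration step. The double-well landscape, the grid, and the function names are assumptions made for the example; no simulation or biasing is performed here.

```python
def integrate_mean_force(xs, mean_forces):
    """Return A(x) - A(xs[0]) from A'(x) = -<F>(x), using the trapezoidal rule."""
    profile = [0.0]
    for i in range(1, len(xs)):
        dx = xs[i] - xs[i - 1]
        profile.append(profile[-1] - 0.5 * (mean_forces[i] + mean_forces[i - 1]) * dx)
    return profile


# Hypothetical double-well landscape A(x) = (x^2 - 1)^2, used only to generate
# "mean force" data; in a real calculation these values come from the simulation.
def force(x):
    return -4.0 * x * (x * x - 1.0)


n = 400
xs = [-1.0 + 2.0 * i / n for i in range(n + 1)]  # from one minimum (x = -1) to the other (x = +1)
profile = integrate_mean_force(xs, [force(x) for x in xs])

barrier = max(profile)              # activation free energy seen from x = -1
reaction_free_energy = profile[-1]  # products minus reactants
print(barrier, reaction_free_energy)
```

For this symmetric test landscape the barrier is close to 1 and the reaction free energy is close to 0, which is what the analytic profile gives.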
Symplectic time integrators
Symplectic integrators are a special class of geometric integrators
They conserve area in phase space.
Importantly, the energy error stays bounded (no drift) over long time integrations.
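A small experiment makes the difference visible. The sketch below integrates a harmonic oscillator (unit mass and unit stiffness, both assumed for the example) with explicit Euler, which is not symplectic, and with velocity Verlet, which is, and records the largest energy error seen along the trajectory.

```python
def energy(q, p):
    return 0.5 * p * p + 0.5 * q * q  # unit mass, unit spring constant (assumed)


def explicit_euler(q, p, h):
    # Not symplectic: for this oscillator each step multiplies phase-space area by 1 + h^2.
    return q + h * p, p - h * q


def velocity_verlet(q, p, h):
    # Symplectic: half kick, drift, half kick.
    p_half = p - 0.5 * h * q
    q_new = q + h * p_half
    p_new = p_half - 0.5 * h * q_new
    return q_new, p_new


def max_energy_error(step, h=0.05, n_steps=20000):
    """Largest |E - E0| observed along the trajectory."""
    q, p = 1.0, 0.0
    e0 = energy(q, p)
    worst = 0.0
    for _ in range(n_steps):
        q, p = step(q, p, h)
        worst = max(worst, abs(energy(q, p) - e0))
    return worst


euler_err = max_energy_error(explicit_euler)
verlet_err = max_energy_error(velocity_verlet)
print("explicit Euler  :", euler_err)
print("velocity Verlet :", verlet_err)
```

With explicit Euler the energy grows without bound, while with velocity Verlet the error only oscillates at a small amplitude set by the time step.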
The discrete Hamilton’s principle allows constructing symplectic integrators
Hamilton’s principle: a trajectory is an extremum of the action integral:
Discrete principle: extremum of the discrete action integral
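The formulas from the slide did not survive extraction. For reference, the standard textbook statement of the two principles is given below. The notation (Lagrangian L, discrete Lagrangian L_d, time step h, slot derivatives D_1 and D_2) is an assumption and may differ from the original slide.

```latex
% Continuous Hamilton's principle
S[q] = \int_{t_0}^{t_1} L\bigl(q(t), \dot q(t)\bigr)\, dt,
\qquad \delta S = 0 .

% Discrete action and discrete Euler-Lagrange equations
S_d(q_0, \dots, q_N) = \sum_{k=0}^{N-1} L_d(q_k, q_{k+1}, h),
\qquad
\frac{\partial S_d}{\partial q_k}
  = D_2 L_d(q_{k-1}, q_k, h) + D_1 L_d(q_k, q_{k+1}, h) = 0,
\quad k = 1, \dots, N-1 .
```

Solving the discrete Euler-Lagrange equations for q_{k+1} given q_{k-1} and q_k defines the time-stepping map, and maps obtained this way are symplectic.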
This class of integrator can be extended to asynchronous integrators
Independent choice of time steps for each potential:
This variational principle leads necessarily to a symplectic integrator
The second order method can be implemented very easily
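A minimal sketch of an asynchronous variational integrator (AVI) for a single particle in one dimension is shown below, assuming the usual event-driven formulation: each potential has its own time step, a priority queue holds the next update time of each potential, and every event is a drift to the event time followed by a kick from that potential alone. The function name, the two-spring test case, and the step sizes are made up for the example.

```python
import heapq


def avi(q, v, mass, potentials, t_end):
    """Asynchronous variational integrator for one particle in 1D.

    potentials: list of (grad_V, h) pairs; potential k is kicked with its own step h.
    Simplification: the last kick of each potential is not shortened at t_end.
    """
    t_q = 0.0  # time at which q was last updated
    events = [(0.0, k) for k in range(len(potentials))]
    heapq.heapify(events)
    while events[0][0] < t_end:
        t, k = heapq.heappop(events)
        grad_v, h = potentials[k]
        q += v * (t - t_q)           # drift to the event time
        t_q = t
        v -= h * grad_v(q) / mass    # kick from potential k only
        heapq.heappush(events, (t + h, k))
    q += v * (t_end - t_q)           # final drift so that q is known at t_end
    return q, v


# Hypothetical test case: a stiff spring with a small step, a soft spring with a step 10x larger.
k_stiff, k_soft = 100.0, 1.0
potentials = [(lambda x: k_stiff * x, 0.001), (lambda x: k_soft * x, 0.01)]

q0, v0 = 1.0, 0.0
e0 = 0.5 * v0 * v0 + 0.5 * (k_stiff + k_soft) * q0 * q0
q1, v1 = avi(q0, v0, 1.0, potentials, t_end=5.0)
e1 = 0.5 * v1 * v1 + 0.5 * (k_stiff + k_soft) * q1 * q1
relative_energy_error = abs(e1 - e0) / e0
print(relative_energy_error)
```

The soft spring is evaluated ten times less often than the stiff one, yet the energy error stays small. In a molecular system the same loop would run over all potentials, each touching only the atoms it involves.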
For molecular dynamics, a time step is chosen for each type of potential
Chemical bonds
Bond angle, torsion angle, dihedral angle
Lennard-Jones
Short-range and long-range electrostatics
A model problem allows studying the stability of synchronous integrators
Model problem:
r-RESPA corresponds to the choice:
Stability condition:
The integrator is unstable when:
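The formulas on this slide were lost in extraction, so the sketch below uses an assumed model problem: one particle of unit mass attached to a fast spring and a slow spring. One outer r-RESPA step (half kick with the slow force, n inner velocity Verlet steps with the fast force, half kick with the slow force) is a linear map on (q, p), and the step is unstable when the spectral radius of its 2x2 matrix exceeds 1. The stiffness values and function names are illustrative only.

```python
import cmath
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def kick(tau, stiffness):
    return [[1.0, 0.0], [-tau * stiffness, 1.0]]  # p -= tau * stiffness * q


def drift(tau):
    return [[1.0, tau], [0.0, 1.0]]               # q += tau * p (unit mass)


def respa_step_matrix(H, n_inner, k_fast, k_slow):
    """Matrix of one outer r-RESPA step acting on the column vector (q, p)."""
    h = H / n_inner
    inner = matmul(kick(h / 2, k_fast), matmul(drift(h), kick(h / 2, k_fast)))
    m = kick(H / 2, k_slow)            # applied first
    for _ in range(n_inner):
        m = matmul(inner, m)
    return matmul(kick(H / 2, k_slow), m)


def spectral_radius(m):
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))


def is_unstable(H, n_inner=100, k_fast=1.0, k_slow=0.1, tol=1e-9):
    return spectral_radius(respa_step_matrix(H, n_inner, k_fast, k_slow)) > 1.0 + tol


# Scan the outer step; with k_fast = 1 the fast spring has half-period pi.
for H in (0.5, 2.0, math.pi - 0.15, 3.5):
    print(H, is_unstable(H))
```

For these parameters the scan finds an unstable band just below H = pi, that is, close to half the period of the fast spring, while the other outer steps in the scan are stable.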
This analysis can be extended to the asynchronous case
Rational ratio:
We define the following matrix:
The integrator is unstable if one of the eigenvalues has a modulus larger than 1.
This allows a numerical investigation of unstable time steps.
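One way to carry out such a numerical investigation is sketched below, under assumptions: two springs act on one particle of unit mass, their time steps are a*tau and b*tau with integers a and b (a rational ratio), and the matrix is the product of the drift and kick maps over one synchronization time. The slide's actual matrix definition was lost, so this construction is an illustration rather than the authors' formula.

```python
import cmath
import math


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def kick(tau, stiffness):
    return [[1.0, 0.0], [-tau * stiffness, 1.0]]


def drift(tau):
    return [[1.0, tau], [0.0, 1.0]]


def avi_period_matrix(a, b, tau, k1, k2):
    """Map over one synchronization time for steps h1 = a*tau and h2 = b*tau."""
    period = a * b // math.gcd(a, b)       # synchronization time in units of tau
    m = [[1.0, 0.0], [0.0, 1.0]]
    last = 0
    for tick in range(period):
        due = [(n, k) for n, k in ((a, k1), (b, k2)) if tick % n == 0]
        if not due:
            continue
        m = matmul(drift((tick - last) * tau), m)   # drift to the event time
        last = tick
        for n, k in due:
            m = matmul(kick(n * tau, k), m)         # kick with that potential's own step
    return matmul(drift((period - last) * tau), m)  # drift to the end of the period


def spectral_radius(m):
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    disc = cmath.sqrt(tr * tr - 4.0 * det)
    return max(abs((tr + disc) / 2), abs((tr - disc) / 2))


# Two hypothetical parameter sets: one stable, one near a half-period resonance.
rho_stable = spectral_radius(avi_period_matrix(2, 5, 0.005, 100.0, 1.0))
rho_unstable = spectral_radius(avi_period_matrix(1, 100, (math.pi - 0.15) / 100, 1.0, 0.1))
print(rho_stable, rho_unstable)
```

Scanning `tau` and the ratio a/b with this function and marking the points where the spectral radius exceeds 1 gives a stability diagram for the assumed model.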
The stability diagram shows many structures
Instability if the synchronization time is a multiple of the half-period
Proved:
These equations lead to a finite set of points.
Conjecture:
There exists a family of curves composed of unstable points only
Red curves:
Four curves are clearly visible on this plot
Green curves:
Magenta curves:
Cyan curves:
This integrator can be stabilized using a Langevin dynamics equation
Langevin dynamics is used to model a system at constant temperature.
It’s a stochastic equation given by:
The previous study can be used to determine the smallest value of γ which guarantees a stable integrator.
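The equation itself was lost in extraction. One common form of the Langevin equation is given below for reference. The notation (mass m, potential V, friction coefficient γ with units of inverse time, Boltzmann constant k_B, temperature T, white noise η) is an assumption and may differ from the slide.

```latex
m \ddot q = -\nabla V(q) - \gamma\, m\, \dot q + \sqrt{2 \gamma\, m\, k_B T}\; \eta(t),
\qquad
\langle \eta(t) \rangle = 0,
\quad
\langle \eta(t)\, \eta(t') \rangle = \delta(t - t') .
```

The friction term removes energy and the noise term injects it. The noise amplitude is tied to γ and T so that the system samples the canonical distribution at temperature T.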
AVI is faster than r-RESPA
AVI is even faster when the time scales are close to one another
The gap in performance between conventional processors and graphics cards increases
The computing performance is incredible
                 Nvidia 8800 GTX          3 GHz Intel Core 2 Duo
Performance      330 Gflops (measured)    48 Gflops (maximum)
Price            $599                     $874
Annual growth    1.7x                     1.4x
A speed-up of 70x is obtained on atomistic simulations
Results on ATI X1900XTX
This will enable simulations of larger systems over realistic time scales, i.e., time scales relevant to biologists.
High-performance computing is not just for gamers anymore!
Students and collaborators
Free Energy:
– Andrew Pohorille, NASA Ames
– David Rodriguez-Gomez, NASA Ames
Symplectic integrators:
– Adrian Lew, ME Department, Stanford
– William Fong, ICME program, Stanford
GPU:
– Vijay Pande, Chemistry Department, Stanford
– Pat Hanrahan, Computer Science Department, Stanford
– Erich Elsen, ME Department, Stanford
Classes
Spring 2007: ME 436, Computational Molecular Modeling and Parallel Computing
Summer 2007: ME 438, Computational Molecular Modeling Project