
Physics and structure of biomacromolecules Konstantin Zeldovich LRB 1004, x62354.

Date post: 15-Jan-2016
Page 1

Physics and structure of biomacromolecules

Konstantin Zeldovich, LRB 1004, x62354

Page 2

Protein structure
• PDB, the Protein Data Bank: ~63,000 structures
• Primary, secondary, tertiary, … structure
• Domains
• Methods: X-ray and NMR
• Computational approaches

• Diverse structures: from globular to knotted and intrinsically disordered, but a limited repertoire of ~1000 folds

Branden & Tooze, Introduction to protein structure

Page 3

Interactions within a protein
• Van der Waals
• Hydrophobic forces
• Electrostatic
• Hydrogen bonds
• Role of solvent
• Hierarchy of energies (bond strength)

Many interactions of a similar energy scale (except chemical bonds).
Overall, a 300-residue protein has ΔG ~ 5 kcal/mol
- per residue, a very small difference between folded and unfolded states
- a SUBTLE BALANCE

Hydrophobic interactions drive folding to the compact structure

Page 4

Thermodynamics of folding

Privalov, J Chem Thermodyn 29: 447 (1997)

Methods: calorimetry, thermal or chemical denaturation
Small proteins fold in a two-state fashion; folding is reversible

[Figure: heat capacity of lysozyme]

[Figure: free energy G vs. reaction coordinate, showing the unfolded state (U), the transition state, and the native state (N)]
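The two-state picture can be turned into numbers with a short sketch. It assumes only two populated states, N and U, separated by an unfolding free energy ΔG = G_U - G_N, so that [U]/[N] = exp(-ΔG/RT). The function name and the temperature of 298 K are illustrative assumptions; the value of 5 kcal/mol is the figure quoted on the earlier slide.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def fraction_native(delta_g_unfold, temperature):
    """Native-state population in a two-state model.

    delta_g_unfold = G_U - G_N in kcal/mol (positive means N is more stable).
    """
    # Equilibrium ratio [U]/[N] from the Boltzmann factor
    k_unfold = math.exp(-delta_g_unfold / (R_KCAL * temperature))
    return 1.0 / (1.0 + k_unfold)


if __name__ == "__main__":
    # Assumed temperature: 298 K
    for dg in (0.0, 1.0, 5.0):
        print(f"dG = {dg:4.1f} kcal/mol -> fraction native = {fraction_native(dg, 298.0):.5f}")
```

At ΔG = 0 the two states are equally populated; a few kcal/mol is already enough to make the native state dominate, which is why a small total ΔG is compatible with a well-defined fold.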

Page 5

Kinetics of folding

Plaxco et al, JMB 277:985 (1998); Biochemistry 39:11177 (2000)

For many proteins, folding rate is determined by their topology (contact order)

However: newer research suggests strong outliers; C.R. Matthews lab.

Contact order (CO) = average sequence separation between contacting residue pairs

Relative CO: normalized by chain length
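A minimal sketch of the contact order definition above. The contact list is assumed to be given already (how contacts are detected, for example by a distance cutoff, is not specified on the slide), and the function name and toy contact lists are hypothetical.

```python
def contact_order(contacts, chain_length):
    """Return (absolute CO, relative CO) for a list of contacting residue pairs.

    contacts: iterable of (i, j) residue indices that are in contact.
    chain_length: number of residues in the chain.
    """
    pairs = list(contacts)
    if not pairs:
        raise ValueError("at least one contact is required")
    # Absolute CO: average sequence separation |i - j| over all contacts
    absolute = sum(abs(i - j) for i, j in pairs) / len(pairs)
    # Relative CO: absolute CO normalized by chain length
    return absolute, absolute / chain_length


if __name__ == "__main__":
    local = [(0, 3), (1, 4), (2, 5)]       # helix-like, short-range contacts
    nonlocal_ = [(0, 9), (1, 8), (2, 7)]   # hairpin-like, long-range contacts
    print(contact_order(local, 10))
    print(contact_order(nonlocal_, 10))
```

Topologies dominated by local contacts give a low contact order; long-range contacts raise it.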

Page 6

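Most proteins are densely packed
Radius of gyration vs. chain length

$$V \sim N a^3, \qquad V \sim R^3 \quad\Rightarrow\quad R_g \sim a N^{1/3}$$

[Figure: radius of gyration vs. chain length for all bacterial proteins from the PDB, June 2009]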
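The dense-packing scaling can be checked on an idealized object. The sketch below computes the radius of gyration of a fully compact cube of lattice sites and compares it with a·N^(1/3). The cubic "protein" is an assumption made for illustration, not data from the PDB.

```python
import math
from itertools import product


def radius_of_gyration(coords):
    """Root-mean-square distance of the points from their centroid."""
    n = len(coords)
    center = [sum(c[k] for c in coords) / n for k in range(3)]
    mean_sq = sum(
        sum((c[k] - center[k]) ** 2 for k in range(3)) for c in coords
    ) / n
    return math.sqrt(mean_sq)


def compact_cube(side, a=1.0):
    """All sites of a side x side x side cubic lattice with spacing a."""
    return [(a * x, a * y, a * z) for x, y, z in product(range(side), repeat=3)]


if __name__ == "__main__":
    for side in (3, 5, 8, 12):
        n = side ** 3
        rg = radius_of_gyration(compact_cube(side))
        # For a compact object the ratio Rg / N^(1/3) stays roughly constant
        print(n, round(rg, 3), round(rg / n ** (1 / 3), 3))
```

For this cube the ratio approaches a constant as N grows, which is the signature of compact packing; an expanded random coil would instead grow faster with N.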

Page 7

Anfinsen’s thermodynamic hypothesis
• Native state is entirely defined by sequence
• Native state is a minimum of free energy
  - Unique
  - Stable
  - Kinetically accessible

All computational efforts depend on these ideas

Anfinsen, Science 181: 223 (1973)

Page 8

How does sequence define structure?

• Protein is a heteropolymer
• How can a specific structure arise at all?
• Protein-like sequences and energy gap
• Folding landscape and “funnels”

Review papers:

Dill et al., Annu. Rev. Biophys. 2008 37:289-316
Shakhnovich, Chem. Rev. 2006 106:1559-1588
Onuchic, Luthey-Schulten, Wolynes, Annu. Rev. Phys. Chem. 1997 48:545-600

Page 9

Toy models address basic questions

27-residue compact chain on a 3x3x3 lattice
Conformational space is discrete: 103346 structures
Pairwise contact potentials: only nearest neighbors interact
Simulations are very quick

Lau & Dill, Macromolecules 22, 3986 (1989)
Shakhnovich & Gutin, J Chem Phys 93, 5967 (1990)

Discrete conformational space -> we can calculate the energy of the toy protein in each and every possible configuration.
The configuration with the lowest energy is the native state
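The exhaustive-enumeration idea can be sketched on a scaled-down analog: a 9-residue chain filling a 3x3 square lattice in two dimensions, instead of the 27-mer on the 3x3x3 cube. The sequence and the two-letter contact potential (H-H contacts worth -1, everything else 0) are assumptions chosen for illustration, not the potentials used in the cited papers.

```python
def compact_conformations(size=3):
    """All self-avoiding walks that visit every site of a size x size square lattice."""
    sites = [(x, y) for x in range(size) for y in range(size)]
    found = []

    def extend(path, visited):
        if len(path) == len(sites):
            found.append(tuple(path))
            return
        x, y = path[-1]
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            inside = 0 <= nxt[0] < size and 0 <= nxt[1] < size
            if inside and nxt not in visited:
                visited.add(nxt)
                path.append(nxt)
                extend(path, visited)
                path.pop()
                visited.remove(nxt)

    for start in sites:
        extend([start], {start})
    return found


def contact_energy(sequence, path, pair_energy):
    """Sum pair energies over lattice neighbors that are not chain neighbors."""
    energy = 0.0
    for i in range(len(path)):
        for j in range(i + 2, len(path)):
            (xi, yi), (xj, yj) = path[i], path[j]
            if abs(xi - xj) + abs(yi - yj) == 1:
                energy += pair_energy(sequence[i], sequence[j])
    return energy


def hp_energy(a, b):
    """Hypothetical potential: only H-H contacts are favorable."""
    return -1.0 if a == "H" and b == "H" else 0.0


if __name__ == "__main__":
    conformations = compact_conformations(3)
    sequence = "HPPHPPHPH"  # hypothetical 9-residue sequence
    spectrum = sorted(contact_energy(sequence, c, hp_energy) for c in conformations)
    print(len(conformations), "conformations; lowest energies:", spectrum[:5])
```

Every conformation is counted together with its reversed and symmetry-related copies, so the lowest level is degenerate here; the lattice studies cited above treat such symmetry images as the same structure.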

Page 10

Proteins have a large energy gap

$$P_i(T) = \frac{e^{-E_i/T}}{\sum_{j=0}^{103345} e^{-E_j/T}}$$

[Figure: energy spectra (E)]

WHPCECQLLRYGNNDFRNLDMLFISFR

WEDNMIQAGWYCPLTRRHIFQFYCHFY

compact lattice 27-mers with 10,000 possible conformations

Gap! Also, a sparse spectrum at low E

Page 11

Energy gap leads to stability

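$$P_N(T) = \frac{p_N}{\sum_i p_i} = \frac{e^{-E_N/T}}{\sum_{i=0}^{M} e^{-E_i/T}} = \frac{1}{\sum_{i=0}^{M} e^{-(E_i - E_N)/T}} = \frac{1}{1 + \sum_{i \ne N} e^{-(E_i - E_N)/T}}$$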

What is the probability to find a protein in its native state?

Gap!

The larger the gap, the more populated the native state is compared to other states

[Figure: $p_N$ vs. T for a protein and for a random polypeptide]

$P_N$ vs. T is roughly equivalent to CD spectra of thermal denaturation
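The effect of the gap on the native-state population can be illustrated numerically. The sketch below evaluates the Boltzmann probability of the lowest-energy state for two made-up spectra that share the same band of 1000 excited states and differ only in how far the ground state lies below it. Energies and temperature are in the same arbitrary units (k_B = 1); all numbers are assumptions.

```python
import math


def native_probability(energies, temperature):
    """Boltzmann probability of the lowest-energy state.

    Energies are shifted by the ground-state energy so the native weight is 1.
    If the ground state is degenerate, this is the probability of one copy.
    """
    e_native = min(energies)
    total = sum(math.exp(-(e - e_native) / temperature) for e in energies)
    return 1.0 / total


# Hypothetical band of excited states, identical for both sequences
excited = [0.01 * k for k in range(1000)]
protein_like = [-8.0] + excited      # large gap below the band
random_like = [-0.5] + excited       # almost no gap

if __name__ == "__main__":
    for t in (0.5, 1.0, 2.0, 4.0):
        print(t,
              round(native_probability(protein_like, t), 3),
              round(native_probability(random_like, t), 3))
```

Scanning the temperature reproduces the qualitative shape of a melting curve: the gapped spectrum keeps the native state populated up to a much higher temperature.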

Page 12

Kinetics of folding and “funnels”
How does the protein find its native state?
Levinthal paradox: a brute-force search of all possible configurations would take outrageously long. In reality, proteins fold in milliseconds.
Answer: the native state must be kinetically accessible

Dill et al., Annu. Rev. Biophys. 2008 37:289

The lower the energy, the more similar the conformations are. Folding thus converges to the single native state

Empirically (from simulations), a large gap is necessary for fast folding
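A back-of-the-envelope version of the Levinthal estimate, as a sketch. The inputs (100 residues, 3 conformations per residue, 10^-13 s to sample one conformation) are commonly used illustrative assumptions and are not taken from the slides.

```python
SECONDS_PER_YEAR = 365.25 * 24 * 3600


def brute_force_search_time(n_residues, states_per_residue=3, seconds_per_state=1e-13):
    """Time in seconds to visit every conformation once, under the assumptions above."""
    n_conformations = states_per_residue ** n_residues
    return n_conformations * seconds_per_state


if __name__ == "__main__":
    seconds = brute_force_search_time(100)
    print(f"{seconds:.2e} s = {seconds / SECONDS_PER_YEAR:.2e} years")
```

Even with these generous sampling rates the exhaustive search takes vastly longer than any observed folding time, so folding cannot proceed by random enumeration.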

Page 13

To crystallize or to simulate?

• Protein structure prediction
• Homology modeling vs molecular simulations
• Structural genomics
• CASP competition

To crystallize is hard, to sequence is cheap. Structure from sequence?

In a perfect world: knowing all of the interactions, find the conformation corresponding to the minimum energy. Voilà, this is the native state.

Practical challenges:
- Interactions are not known exactly
- Interactions with solvent
- Very large parameter space (# bond angles ~ # of atoms ~ $10^5$)
- Rugged energy landscape with deep local minima: search algorithms are inefficient

Page 14

Threading using energies

Jones, Taylor, Thornton, Nature 1992

Given a set of structures, determine which one is the best match for the given sequence
Rationale: the number of folds is limited

Thread the sequence into each structure (possibly with gaps), then evaluate the energy of amino acid contacts.

Select the threading which yields the lowest energy (cf. the gap)

Works well even at low sequence homology
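The threading procedure can be sketched in its simplest, gapless form: each candidate structure is reduced to a list of contacting positions, the sequence is placed on it residue by residue, and the contact energies are summed. The two toy contact maps, their labels, the sequence and the H/P potential are all hypothetical; real threading uses alignment with gaps and knowledge-based potentials.

```python
def threading_energy(sequence, contacts, pair_energy):
    """Energy of a sequence placed, without gaps, onto a structure's contact list."""
    return sum(pair_energy(sequence[i], sequence[j]) for i, j in contacts)


def rank_folds(sequence, folds, pair_energy):
    """Return (energy, fold name) pairs sorted from lowest to highest energy."""
    return sorted(
        (threading_energy(sequence, contacts, pair_energy), name)
        for name, contacts in folds.items()
    )


def hp_energy(a, b):
    """Hypothetical potential: only H-H contacts are favorable."""
    return -1.0 if a == "H" and b == "H" else 0.0


# Hypothetical six-residue "fold library"
FOLDS = {
    "helix-like": [(0, 3), (1, 4), (2, 5)],
    "hairpin-like": [(0, 5), (1, 4)],
}

if __name__ == "__main__":
    ranking = rank_folds("HPPHPP", FOLDS, hp_energy)
    best, runner_up = ranking[0], ranking[1]
    print("best fold:", best[1], "energy:", best[0])
    print("gap to next fold:", runner_up[0] - best[0])
```

The difference between the best and the second-best energy plays the role of the gap: the larger it is, the more confident the fold assignment.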

Page 15

Threading using profiles

Bowie, Luthy, Eisenberg, Science 1991

For each position, assess:
- secondary structure
- fraction polar
- buried area, …

Profile: position (rows) × residue type (columns)

    A    C    D    E   …
   32   84  -92   23
   -6   87   34   -5
    …

Average over homologous sequences with known structures

Create profiles for different folds (using known structures with homologous sequences)

For a given sequence with unknown structure, match it to all profiles (with gaps)

Select the profile with best score.
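A gapless sketch of profile matching: the score of a sequence against a profile is the sum of the position-specific scores of its residues, and the profile with the highest total wins. The first profile reuses the two example rows from the table above, treated here as positions 0 and 1; the second profile and its name are made up for contrast. Real profile threading also allows gaps via dynamic programming.

```python
def profile_score(sequence, profile):
    """Sum of position-specific scores; profile is a list of {residue: score} dicts."""
    if len(sequence) != len(profile):
        raise ValueError("gapless matching needs equal lengths")
    # Residues missing from a position's dict contribute 0 (an assumption)
    return sum(column.get(residue, 0) for residue, column in zip(sequence, profile))


def best_profile(sequence, profiles):
    """Return (score, name) of the best-scoring profile."""
    return max((profile_score(sequence, p), name) for name, p in profiles.items())


PROFILES = {
    # Two rows copied from the example table
    "fold_1": [
        {"A": 32, "C": 84, "D": -92, "E": 23},
        {"A": -6, "C": 87, "D": 34, "E": -5},
    ],
    # Hypothetical second profile
    "fold_2": [
        {"A": 10, "C": -50, "D": 60, "E": 40},
        {"A": 70, "C": -20, "D": -30, "E": 15},
    ],
}

if __name__ == "__main__":
    for seq in ("CD", "DA"):
        print(seq, best_profile(seq, PROFILES))
```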

Page 16

Homology modeling

Marti-Renom,… Sali, Annu. Rev. Biophys. Biomol. Struct. 2000. 29:291–325

- Pairwise sequence alignment with PDB (BLAST)
- Match to multiple sequence alignment (PSI-BLAST)
- Threading, or 3D template matching to PDB

- Fold correctness? (by sequence similarity?)
- Stereochemistry
- Solvent accessibility
- Positions of charged and hydrophobic groups
- …

- Rigid-body assembly
- Segment matching (aligning conserved atoms)
- Satisfaction of spatial restraints

Page 17

ab initio structure prediction

Anfinsen’s hypothesis:
- native structure is entirely determined by the sequence
- native structure is a unique energy minimum

Assuming we know interactions between the amino acids, can we just look for this minimum?

Polymer modeling is extensively used in materials science. Is it applicable to proteins?

Two main methods:
- molecular dynamics: deterministic, reflects dynamics
- Monte Carlo: stochastic, no dynamics

Karplus, Scheraga, …

Page 18

Force fields and potentials
How do we know the strength of each interaction between atoms in a protein?

Ab initio approach: quantum chemistry can calculate the electron density profiles, and thus the energy (isn’t a protein just one big Schrödinger equation?)

Statistical approach: learn from the PDB by counting the contacts

Potentials optimized to correctly predict known structures of small molecules: CHARMM, AMBER

Miyazawa & Jernigan 1985, 1996

Boltzmann law: $p_{ij} = e^{-U_{ij}/RT}$

Inverting: $U_{ij} = -RT \log p_{ij} = -RT \log \frac{N_{ij}}{N_i N_j}$

where $N_{ij}$ is the number of contacts and $N_i$, $N_j$ are the molar fractions.

Training set must be carefully chosen: various folds, no homology, …
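The inverse-Boltzmann recipe can be sketched directly from the formula above. Here the contact counts are normalized to fractions before being divided by the product of molar fractions, which is one possible convention; published potentials such as Miyazawa-Jernigan use a more careful reference state. The two-letter alphabet and all counts are made-up illustration data.

```python
import math


def statistical_potential(contact_counts, composition, rt=1.0):
    """U_ij = -RT * log(N_ij / (N_i * N_j)), with N_ij and N_i expressed as fractions.

    contact_counts: {(type_i, type_j): number of observed contacts}
    composition:    {type: number of residues of that type}
    """
    total_contacts = sum(contact_counts.values())
    total_residues = sum(composition.values())
    potential = {}
    for (a, b), count in contact_counts.items():
        if count == 0:
            continue  # no observations: the log is undefined, so skip the pair
        observed = count / total_contacts
        expected = (composition[a] / total_residues) * (composition[b] / total_residues)
        potential[(a, b)] = -rt * math.log(observed / expected)
    return potential


if __name__ == "__main__":
    # Hypothetical counts from a hypothetical training set
    counts = {("H", "H"): 60, ("H", "P"): 30, ("P", "P"): 10}
    residues = {"H": 50, "P": 50}
    for pair, u in statistical_potential(counts, residues).items():
        print(pair, round(u, 3))
```

Pairs seen in contact more often than their abundance suggests come out with negative (attractive) energies, and under-represented pairs come out repulsive.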

Page 19

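Molecular dynamics: $F = ma$

For the i-th atom, repeated at every time step for all i:

$$a_i = \frac{1}{m_i} \sum_j F_{ij}, \qquad v_i \leftarrow v_i + a_i \Delta t, \qquad x_i \leftarrow x_i + v_i \Delta t$$

[Figure: x vs. time, trajectories of all atoms]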

Pros:
- Most detailed, most realistic
- True dynamics

Cons:
- Time-consuming

$$U_{ij} = U_{ij}^{bond} + U_{ij}^{VdW} + U_{ij}^{electr} + U_{ij}^{HB} + \dots$$

Force: $F_{ij} = -\frac{dU_{ij}(x_i - x_j)}{dx}$

Main issue: needs $\Delta t \sim 10^{-12}$ s (picosecond) to reproduce bond vibrations, but folding occurs on microsecond-to-seconds timescales, so at least $10^7$ iterations are needed

Tools: AMBER, CHARMM, GROMACS, NAMD, …
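The update rule for each atom (acceleration from the summed forces, then velocity, then position) can be sketched on the smallest possible system: two atoms in one dimension joined by a harmonic bond, U = k(r - r0)^2 / 2. The potential, the parameters and the reduced units are assumptions for illustration; production packages use full force fields and more accurate integrators.

```python
def simulate_bond(steps, dt=0.01, k=1.0, r0=1.0, mass=1.0):
    """Integrate two bonded atoms in 1D; returns the list of (x0, x1) positions."""
    x = [0.0, 1.5]   # start with the bond stretched beyond its rest length r0
    v = [0.0, 0.0]
    trajectory = [tuple(x)]
    for _ in range(steps):
        # Forces from F = -dU/dx, evaluated before any position is updated
        stretch = (x[1] - x[0]) - r0
        forces = [k * stretch, -k * stretch]
        for i in range(2):
            a = forces[i] / mass   # a_i = F_i / m_i
            v[i] += a * dt         # v_i <- v_i + a_i * dt
            x[i] += v[i] * dt      # x_i <- x_i + v_i * dt
        trajectory.append(tuple(x))
    return trajectory


if __name__ == "__main__":
    traj = simulate_bond(1000)
    bond_lengths = [b - a for a, b in traj]
    print("bond length range:", round(min(bond_lengths), 3), round(max(bond_lengths), 3))
```

The time step must stay well below the vibration period of the stiffest bond, which is what makes long folding trajectories so expensive.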

Page 20

Applications of molecular dynamics

• Protein-ligand interactions
• Dynamics of protein folding
• Membrane proteins and ion channels
• Sidechain packing

D.E. Shaw Research has developed a dedicated hardware supercomputer, Anton, to run MD simulations much faster than any commodity cluster

Hardware designed to run MD, using custom-built chips (ASIC and FPGA)

milliseconds are becoming accessible!

D.E.Shaw et al 2009, Proceedings of the ACM/IEEE Conference on Supercomputing (SC09)

Page 21

Monte-Carlo simulation
Sacrifices information about dynamics to better explore the full energy landscape

[Figure: a trial move takes the system from energy $E_{old}$ to energy $E_{new}$]

Elementary step: make a trial move, and accept or reject the new configuration (Metropolis sampling)

- $E_{new} \le E_{old}$: always accept
- $E_{new} > E_{old}$: accept with probability $p = e^{-(E_{new} - E_{old})/k_B T}$

Different conformations are visited with the same frequency as in molecular dynamics.
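A minimal sketch of the Metropolis acceptance rule, tested on a two-level system where the exact Boltzmann populations are known. The two-level "protein", the step count and the random seed are assumptions for illustration; in a real simulation the trial move would be a conformational change of the chain.

```python
import math
import random


def metropolis_accept(e_old, e_new, kt, rng=random):
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    if e_new <= e_old:
        return True
    return rng.random() < math.exp(-(e_new - e_old) / kt)


def sample_two_level(gap, kt, steps, seed=0):
    """Fraction of steps spent in the upper level of a two-level system."""
    rng = random.Random(seed)
    energies = (0.0, gap)
    state = 0
    visits_upper = 0
    for _ in range(steps):
        trial = 1 - state  # the only possible trial move: hop to the other level
        if metropolis_accept(energies[state], energies[trial], kt, rng):
            state = trial
        visits_upper += state
    return visits_upper / steps


if __name__ == "__main__":
    gap, kt = 1.0, 1.0
    exact = math.exp(-gap / kt) / (1.0 + math.exp(-gap / kt))
    print("sampled:", sample_two_level(gap, kt, 200_000), "exact:", round(exact, 4))
```

The sampled occupancy approaches the Boltzmann value as the number of steps grows, even though the sequence of moves carries no information about real dynamics.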

Page 22

Monte-Carlo simulation (cont’d)
Typical moves are rotations around bonds
- local move: rotation of one atom relative to its two neighbors
- global move: pivoting of the entire chain around a bond

Advantage over MD: no small/large timescale problem
However,
- no direct information about dynamics
- calculating rotations is expensive (trigonometry!)

Often used in coarse-grained simulations to explore large conformational space and find basins of attraction (energy valleys).

If needed, these valleys can then be further explored by molecular dynamics

Tools: ProFASi

Page 23

Hybrid techniques: I-TASSER

Wu, Skolnick, Zhang, BMC Biology 5:17 (2007)

Page 24

Hybrid techniques: ROBETTA

Kim, Chivian, Baker, NAR 2004, vol. 32 W526–W531

Sequences parsed into putative domains

If homology is found, comparative modeling

If low homology, ab initio folding

Fragments from 3- and 9-residue libraries are assembled

Selected decoys are clustered, cluster centroids used as models

Sidechains repacked by MC simulations using a rotamer library

Page 25

Structural databases: SCOP, CATH

SCOP: http://scop.mrc-lmb.cam.ac.uk/scop/
• Hierarchical structural classification
• Class: all-alpha, all-beta, alpha/beta, alpha+beta, multidomain, membrane, small
• Fold
• Superfamily
• Family

CATH: http://www.cathdb.info/
• Hierarchical domain classification
• Class: mainly-alpha, mainly-beta and alpha-beta
• Architecture
• Topology (fold family)
• Homologous superfamily

Murzin et al, JMB 247:536 (1995)
Orengo et al, Structure 5:1093 (1997)

Page 26

Tools & servers

PDB: www.rcsb.org

Structure prediction servers and tools (just a few)
- I-TASSER: http://zhanglab.ccmb.med.umich.edu/I-TASSER/
- ROBETTA: http://robetta.bakerlab.org/
- MODELLER: http://salilab.org/modeller/

Molecular dynamics packages (general)
- AMBER: http://ambermd.org/
- CHARMM: http://www.charmm.org/
- GROMACS: http://www.gromacs.org/
- NAMD: http://www.ks.uiuc.edu/Research/namd/

Monte Carlo protein modeling
- ProFASi: http://cbbp.thep.lu.se/activities/profasi/

Structural biology software database
- http://www.ks.uiuc.edu/Development/biosoftdb/

