Page 1:

Exploring the Universe of Protein Structures beyond the Protein Data Bank

Flavio Seno

Dipartimento di Fisica e Astronomia

Università di Padova

Means, Methods and Results in the Statistical Mechanics of Polymeric Systems.

Toronto 21-22 June 2012

Page 2:

Protein structures

What are their distinctive properties?

Secondary structures stabilized by hydrogen bonds

Folds: arrangements of secondary structures in space

There is a limited set of folds: the same folds are used to perform different functions

There is no macroscopic evolution: multiple separate discoveries during the course of evolution

Page 3:

Platonic folds: intrinsic features of the order of nature (Denton and Marshall, Nature 2002)

~ 7000 structures (new sequences) determined every year

Page 4:

Are protein folds determined only by physical and geometrical laws (like crystal structures) and not by the chemistry of the amino-acid sequence?

Is it possible to reproduce them in terms of general principles? Maybe through a homopolymer that captures the main common features of all the amino acids?

Are the observed folds in one-to-one correspondence with the whole possible fold universe?

If not, why? Is there a selection principle?

“SIMILARITY OF PROTEIN STRUCTURES IMPOSED BY SOME PHYSICAL REGULARITIES” (Finkelstein-Ptitsyn 2002)

Page 5:

Minimal Coarse-Grained Model

T.X. Hoang, L. Marsella, A. Trovato, J.R. Banavar, A. Maritan, F.S., PNAS 103, 6883 (2006)

Cα representation

• Excluded volume (self-avoiding tube)

• Hydrogen bonding geometric constraint

• Hydrophobic interaction

• Local bending penalty

Page 6:

Ground State Phase Diagram

Homopolypeptide structures in the ‘marginally compact’ phase (compact + H-bonds) are protein-like

Page 7:

METADYNAMICS

A. Laio, M. Parrinello, Escaping free-energy minima, Proceedings of the National Academy of Sciences of the United States of America, 99, 12562 (2002)

HOW TO FIND STABLE MINIMA WHICH ARE SEPARATED BY BARRIERS THAT CANNOT BE CLEARED IN THE AVAILABLE SIMULATION TIME

THE METHOD IS BASED ON AN ARTIFICIAL DYNAMICS (METADYNAMICS)

1) IDENTIFY COLLECTIVE VARIABLES S WHICH ARE ASSUMED TO PROVIDE A RELEVANT COARSE-GRAINED DESCRIPTION OF THE SYSTEM

2) BIAS THE DYNAMICS ALONG THESE VARIABLES

3) RUN IN PARALLEL SEVERAL MOLECULAR DYNAMICS SIMULATIONS, EACH BIASED WITH A METADYNAMICS POTENTIAL

4) SWAP THE CONFIGURATIONS
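
A toy illustration of steps 1 and 2 may help. The sketch below is not the setup used in the talk (which relies on GROMACS and six collective variables); it assumes a single collective variable s, a hypothetical double-well potential V(s) = (s^2 - 1)^2, overdamped Langevin dynamics, and arbitrary values for the hill height, hill width and deposition stride. Gaussian hills are deposited at the current value of s, so the bias gradually fills the starting well and pushes the walker over the barrier.

```python
import numpy as np


def double_well_force(s):
    """Physical force -dV/ds for the assumed toy potential V(s) = (s^2 - 1)^2."""
    return -4.0 * s * (s * s - 1.0)


def run_metadynamics(n_steps=50_000, dt=1e-3, kT=0.1,
                     height=0.05, width=0.2, stride=50, seed=0):
    """Overdamped Langevin dynamics on s with a history-dependent Gaussian bias.

    All parameter values are illustrative assumptions, not taken from the talk.
    Returns the trajectory of s and the centers of the deposited hills.
    """
    rng = np.random.default_rng(seed)
    centers = np.empty((n_steps + stride - 1) // stride)
    n_hills = 0
    s = -1.0                       # start at the bottom of the left well
    traj = np.empty(n_steps)
    noise_scale = np.sqrt(2.0 * kT * dt)
    for step in range(n_steps):
        if step % stride == 0:     # deposit a new hill at the current position
            centers[n_hills] = s
            n_hills += 1
        d = s - centers[:n_hills]
        # Force from the bias V_G(s) = sum_k height * exp(-(s - c_k)^2 / (2 width^2))
        f_bias = np.sum(height * d / width**2 * np.exp(-0.5 * (d / width) ** 2))
        s += dt * (double_well_force(s) + f_bias) + noise_scale * rng.normal()
        traj[step] = s
    return traj, centers[:n_hills]
```

Calling run_metadynamics() returns the trajectory of s together with the hill centers; with these assumed parameters the barrier of height 1 is ten times kT, so unbiased dynamics would rarely leave the starting well on this time scale.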

Page 8:

ATOMISTIC MODEL

• Why VAL? (It is small, but not too small)

• MD simulations with the AMBER force field and the GROMACS package

• Bias-exchange METADYNAMICS with 6 replicas

• Six collective variables linked to secondary structure elements

60-AMINO-ACID POLYVALINE (VAL60)

Page 9:

50-microsecond molecular dynamics simulation

We generate an ensemble of 30,000 all-atom conformations with significant secondary structure content and small radius of gyration
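
For reference, the radius of gyration used as a compactness measure can be computed as in the sketch below. It assumes equal masses (for instance one Cα position per residue) and coordinates given as an (N, 3) array; the function name is illustrative.

```python
import numpy as np


def radius_of_gyration(coords):
    """Radius of gyration of an (N, 3) set of positions, assuming equal masses."""
    coords = np.asarray(coords, dtype=float)
    centered = coords - coords.mean(axis=0)      # shift to the center of mass
    return float(np.sqrt((centered ** 2).sum(axis=1).mean()))


# Toy check: the eight corners of a cube of side 2 centered at the origin,
# each at distance sqrt(3) from the center.
cube = np.array([[x, y, z] for x in (-1, 1) for y in (-1, 1) for z in (-1, 1)])
rg_cube = radius_of_gyration(cube)
```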

Page 10:

We verify that they are also local minima for ALA60

Structural quality resembles that of real proteins

H-BOND ENERGY COMPUTED WITH PROCHECK

RAMACHANDRAN PLOT

QUALITY MEASURE: G-FACTOR

FRAGMENT DISTANCE < 0.6 Å

0.7 Å

Page 12:

The Class, Architecture, Topology and Homologous superfamily (CATH) protein structure classification is one of the main databases providing a hierarchical classification of protein domain structures.

RELATION BETWEEN VAL60 AND REAL PROTEINS

300 FOLDS

40 < L < 75

SIMILARITY: TM-SCORE (Zhang and Skolnick 2005)

ALIGNMENT OF SECONDARY STRUCTURES ALLOWING INSERTIONS AND DELETIONS (COVERAGE)

MINIMIZATION OF THE RELATIVE DISTANCE BETWEEN ALIGNED RESIDUES (RMSD)

TM = 0.45
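
As a reminder of how the score balances coverage and distance, the sketch below evaluates the TM-score for one fixed alignment and superposition, using the standard definition with the length-dependent scale d0(L) = 1.24 (L - 15)^(1/3) - 1.8 Å. The full TM-score also maximizes over superpositions, which is not done here; the function names are illustrative and the target length is assumed to be larger than 15.

```python
import numpy as np


def tm_d0(target_length):
    """Length-dependent distance scale d0 (in Å); valid for target_length > 15."""
    return 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8


def tm_score(aligned_distances, target_length):
    """TM-score of one given alignment and superposition (no optimization).

    aligned_distances: distances (Å) between aligned residue pairs.
    Unaligned residues contribute nothing, so low coverage lowers the score.
    """
    d = np.asarray(aligned_distances, dtype=float)
    d0 = tm_d0(target_length)
    return float(np.sum(1.0 / (1.0 + (d / d0) ** 2)) / target_length)
```

For a 60-residue target, a perfect superposition of all 60 residues gives 1, while a perfect superposition of only 30 residues gives 0.5, just above the TM = 0.45 threshold quoted above.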

Page 13:

COMPARISON

VAL60 VS CATH

40 < L < 75

300 FOLDS

PDB IDs shown: 1ib8, 1g29, 1x9b, 1uxy

Page 14:

SECOND RESULT

THE COMPUTATIONAL SETUP USED IN THIS WORK ALLOWS US TO EXPLORE THE MAJORITY OF THE FOLDS IN NATURE (AT LEAST FOR THESE LENGTHS)

Page 15:

COMPARISON

POLYVAL VS CATH

NOT ALL VAL60 STRUCTURES ARE PRESENT IN CATH!

TM = 0.45

VAL60: 7000

CATH: 300
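
The two-way comparison can be phrased as a coverage calculation on a matrix of pairwise TM-scores. The sketch below is only an illustration of that bookkeeping: it assumes a matrix whose rows are generated (VAL60) structures and whose columns are reference (CATH) folds, and the toy numbers are invented for the example, not data from the talk.

```python
import numpy as np


def coverage(tm_matrix, threshold=0.45):
    """Fractions of rows and of columns that have at least one match.

    tm_matrix[i, j] is assumed to hold the TM-score between generated
    structure i and reference fold j; a match means score >= threshold.
    Returns (fraction of generated structures with a reference match,
             fraction of reference folds with a generated match).
    """
    tm = np.asarray(tm_matrix, dtype=float)
    matched = tm >= threshold
    return float(matched.any(axis=1).mean()), float(matched.any(axis=0).mean())


# Invented toy matrix: 4 generated structures against 2 reference folds.
toy = [[0.60, 0.20],
       [0.30, 0.50],
       [0.10, 0.20],
       [0.40, 0.30]]
frac_generated, frac_reference = coverage(toy)
```

In this toy case every reference fold is found among the generated structures, while half of the generated structures have no reference counterpart, which is the qualitative situation described on this slide.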

Page 16:

THIS MIGHT JUST DEPEND ON THE CHOSEN SIMILARITY THRESHOLD

DO STRUCTURAL DESCRIPTORS DISCRIMINATE BETWEEN CATH AND VAL60?

CONTACT ORDER: average sequence separation between contacting residues (related to folding rates; Plaxco, Simons, Baker 1998)

- Real protein structures were selected under a bias towards low CO

- Protein structures are selected to be topologically less entangled
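
The contact order defined on this slide can be sketched as follows. The contact definition is an assumption made for the example (two residues are in contact when their Cα atoms are closer than a cutoff, 8 Å by default), as is the minimum sequence separation; the original definition by Plaxco, Simons and Baker uses atomic contacts.

```python
import numpy as np


def contact_order(ca_coords, cutoff=8.0, min_separation=1):
    """Absolute and relative contact order from an (L, 3) array of Cα positions.

    Absolute CO: mean sequence separation |i - j| over contacting pairs.
    Relative CO: absolute CO divided by the chain length L.
    The Cα cutoff and min_separation are illustrative assumptions.
    """
    coords = np.asarray(ca_coords, dtype=float)
    n = len(coords)
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    i, j = np.triu_indices(n, k=min_separation)   # pairs with j - i >= min_separation
    in_contact = dist[i, j] < cutoff
    if not in_contact.any():
        return 0.0, 0.0
    absolute = float((j - i)[in_contact].mean())
    return absolute, absolute / n


# Toy chain: 10 residues on a straight line, 3.8 Å apart.
line = np.array([[3.8 * k, 0.0, 0.0] for k in range(10)])
abs_co, rel_co = contact_order(line, cutoff=4.0)   # only nearest neighbors touch
```

A fully extended chain has the lowest possible contact order, since only sequence neighbors are in contact; compact structures with many long-range contacts score higher.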

Page 17:

THIRD RESULT

THERE IS NO ONE-TO-ONE CORRESPONDENCE BETWEEN THE PDB LIBRARY AND THE ENSEMBLE OF COMPACT STRUCTURES WITH SIGNIFICANT SECONDARY STRUCTURE CONTENT (VAL60)

Page 18:

SUMMARY

• VAL60 SET IS REPRESENTATIVE OF REAL PROTEINS

(PROTEIN FOLDS ARE SELECTED BY GEOMETRY AND SYMMETRY AND NOT BY THE CHEMISTRY OF THE SEQUENCE)

• KNOWN FOLDS FORM ONLY A SMALL FRACTION OF THE FULL DATABASE

• NATURAL FOLDS ARE CHARACTERIZED BY SMALL CONTACT ORDER

WHY?

KINETIC ACCESSIBILITY

HIGHER CO, HIGHER TENDENCY TO AGGREGATE?

Page 19:

APPLICATIONS

• REALISTIC DECOYS

• DESIGN NEW PROTEINS

• CHECK PREDICTIONS IN SYNTHETIC BIOLOGY

• MODELS FOR MISFOLDED STRUCTURES RELATED TO NEURODEGENERATIVE DISEASES

Page 20:

COLLABORATORS

• PILAR COSSIO (NIH WASHINGTON)

• ALESSANDRO LAIO (SISSA TRIESTE)

• DANIELE GRANATA (SISSA TRIESTE)

• FABIO PIETRUCCI (CECAM – LAUSANNE)

• AMOS MARITAN (PADOVA)

• ANTONIO TROVATO (PADOVA)

PLoS Computational Biology 6, e1000957 (2010)

Scientific Reports 2, Art. No. 351 (2012)

Page 21:

CORRELATION BETWEEN POTENTIAL ENERGY AND CONTACT ORDER FOR VAL60 AND ALA60 STRUCTURES

Page 22:

Similarity between the VAL60 and CATH databases

Page 23:

CATH and VAL60 are explored with equal probability

Page 26:

Distribution of the radius of gyration for the VAL60, VAL60+WATER, ALA60 and CATH 55–65 sets of structures.

Page 27:

Cα RMSD distributions for the 30,000 VAL60 structures and the 1500 ALA60 structures minimized through SD.

Page 28:

Probability of finding a structure in the VAL60 trajectory for different CO classes.

Page 29:

Number of independent structures

Page 30:

Bias Exchange Metadynamics

S. Piana, A. Laio, A bias-exchange approach to protein folding, Journal of Physical Chemistry B, 111, 4553 (2007)

1) List all the collective variables

2) Run in parallel several molecular dynamics simulations, each biased with a metadynamics potential

3) Swap the configurations

IT IS AN APPROACH DESIGNED FOR ACCELERATING RARE EVENTS IN VERY COMPLEX CASES IN WHICH THE VARIABLES THAT ARE RELEVANT FOR THE PROCESS ARE MORE THAN 2 OR 3
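
The swap step can be sketched with a Metropolis test on the bias potentials alone, since in bias exchange all replicas share the same physical potential and temperature and differ only in the collective variable being biased. The code below is a minimal sketch under that assumption; the function names and the way the bias energies are passed in are illustrative, not the interface of any simulation package.

```python
import math
import random


def swap_probability(va_xa, vb_xb, va_xb, vb_xa, kT):
    """Metropolis acceptance probability for exchanging configurations xa and xb.

    va_xa: bias of replica a evaluated on its own configuration xa
    vb_xb: bias of replica b evaluated on its own configuration xb
    va_xb, vb_xa: the same biases evaluated on the exchanged configurations
    """
    delta = (va_xb + vb_xa) - (va_xa + vb_xb)   # bias energy after minus before
    if delta <= 0.0:
        return 1.0                               # downhill swaps are always accepted
    return math.exp(-delta / kT)


def attempt_swap(va_xa, vb_xb, va_xb, vb_xa, kT, rng=random):
    """Return True if the swap is accepted."""
    return rng.random() < swap_probability(va_xa, vb_xb, va_xb, vb_xa, kT)
```

A swap that leaves the total bias energy unchanged is always accepted, while one that raises it by kT is accepted with probability exp(-1), about 0.37.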

Page 31:

Are compact hydrogen-bonded polypeptide structures in one-to-one correspondence with protein structures from the Protein Data Bank (PDB)?

YES!?

PNAS 103, 2605-2610 (2006)

Homopolypeptide (side chain: C-beta atoms) with a very minimal potential consisting of H-bonding, excluded volume, and a uniform, pairwise attractive potential between side chains.

