Dissertation Adam

Date post: 05-Apr-2018

  • 7/31/2019 Dissertation Adam

    1/29

    UNIVERSITEIT VAN AMSTERDAM

    A Coarse-Grained Model for Self-Assembling collagen-silk-like

    block-copolymer

    Bruno Barbosa Rodrigues

    Advisor: Peter G. Bolhuis

    Co-Advisor: Marieke Schor

    July 2009


    Abstract

    We present the sequence design of an off-lattice minimalist model, adapted from the original
    Head-Gordon (HG) model, for a monodisperse, biodegradable, biocompatible and hydrophilic
    collagen-like sequence. This sequence is part of a block copolymer made of two hydrophilic
    collagen-like blocks flanking a hydrophobic silk-like block. We start with atomistic simulations
    of short peptides to extract the distributions of the bond distances, bend angles and dihedral
    angles, which are used to fit the coarse-grained (CG) force field (FF) and to adapt the HG model
    to this collagen-like sequence. Owing to the high concentration of proline and of charged groups
    at low pH in the sequence, different distributions were found for the dihedral angles depending
    on the relative position of proline in the chain. We therefore extended the original three-letter
    minimalist model and combined it with a previously adapted shifted and rescaled non-bonded
    potential developed for the silk-block sequence. We compared the radius of gyration (Rg) obtained
    for different numbers of residues with the experimental value measured for a 400-residue
    sequence, and derived a scaling law between Rg and the number of residues in the collagen-like
    sequence. We conclude that the exponent is close to the Flory exponent.

    Keywords: Collagen-silk-like block copolymers, protein fibers, Molecular Dynamics,
    Coarse-Grained model.


    Contents

    1 Introduction

    2 Methods

    3 Results

    4 Conclusions and Future Perspectives

    Acknowledgements

    Appendix A -- Dihedral Angles

    Bibliography


    1 Introduction

    Protein-based block copolymers that form fibers upon a certain stimulus (e.g. a change in pH) [16]
    are potentially very useful as biocompatible nanomaterials. One example is a block copolymer made
    of two hydrophilic collagen-like blocks flanking a hydrophobic silk-like block. The collagen block
    limits the growth direction of the silk block and drives it to form fibers with improved
    transversal strength. The sequence of the collagen block is chosen such that, unlike the gelatin
    most common in nature, which forms a gel in the presence of water, it remains soluble and
    unstructured under various conditions of pH and temperature. Previously, a coarse-grained model
    of the very regular, highly structured silk-like block was developed [14]. However, a different
    strategy is needed to develop such a model for the unstructured collagen-like block.

    In the first section of this chapter we give an overview of biocompatible materials and their
    applications; the second section is dedicated to protein polymers; the third introduces computer
    simulation techniques as tools to understand and predict protein dynamics, together with their
    limitations and possible solutions; and the last section gives an overview of this work.

    1.1 Biocompatible Materials

    Nature produces a wide range of materials, all of them for its own purposes, such as actin, the
    main constituent of the cytoskeleton, or collagen, the major component of the extracellular
    matrix. Genes can encode identical amino acid sequences (also called primary structures) with
    absolute control over molecular weight, composition, sequence and stereochemistry. Controlling
    this protein production allows us to build materials tailored to human needs [4], with
    applications such as chemo-mechanical fibers [5], tissue engineering [6, 7] and drug delivery
    [8, 9].

    Biocompatible materials are synthetic or natural materials intended to interface with biological
    systems in intimate contact with living tissue. One way to build such high-value materials is via
    genetic and protein engineering, both components of a new polymer chemistry that provides the
    tools for producing macromolecular polyamide copolymers of diversity and precision far beyond the
    current capabilities of synthetic polymer chemistry.


    1.2 Protein Polymers

    Controlling and understanding the behavior of biocompatible materials requires a close look at
    their atomic composition. The function of a protein is intrinsically connected to its spatial
    conformation (or tertiary structure), and the process that drives the protein towards this state
    is called protein folding.

    Proteins can be designed as blocks of repetitive amino acid sequences, also called block
    copolymers. This term was introduced in the 1990s by Capello [2], who built diblock copolymers
    containing silk-like and elastin-like amino acid sequences. Since then, many other block
    copolymers have been produced [3], ranging from diblock to multiblock sequences. The
    self-assembly of proteins can be used to tune organization at the molecular level [10], which can
    build organized nanoscopic objects that end up as macroscopic materials with interesting
    properties.

    Figure 1.1: Triblock copolymers consisting of a hydrophobic, pH-responsive, silk-like block
    flanked by hydrophilic, non-responsive collagen blocks, self-assembling into μm-long fibrils. The
    collagen block limits the assembly direction of the silk block [11].

    The system studied in this dissertation is a collagen-like polypeptide [12] that flanks a
    silk-like sequence [13], which self-assembles into a roll that forms stacks at low pH [14]
    (fig. 1.1). The triblock was produced experimentally by gene expression in the yeast Pichia
    pastoris [15, 16]. The collagen-like part [12] has a hydrophilic sequence and, unlike normal
    collagens, which form gels, remains soluble under many conditions of pH and temperature. With
    this random-coil character, the collagen-like part can drive the silk block along the growth
    direction, forming biocompatible fibers with improved transversal strength upon a stimulus
    (e.g. a change in pH).

    1.3 Computer Simulations

    During the last two decades, computer simulations have gradually been recognized as a source of
    complementary information about the properties of such new materials and about conformational
    transitions in proteins [17, 18]. Most of the relevant dynamics in proteins, such as folding
    processes, occur on time scales of milliseconds and involve large molecular aggregates, whereas
    atomistic simulations can only address hundreds of nanoseconds for small proteins in explicit
    solvent. They therefore fall short by at least four orders of magnitude in time and one order of
    magnitude in degrees of freedom.

    Within this framework, many techniques have been developed to overcome the limits on system size
    and simulation time. Simplifying the system by integrating out details that are not important to
    the overall result allows longer length and time scales to be addressed. In this work we focus on
    adapting the Head-Gordon (HG) model [25, 26] to the collagen-like part of the
    collagen-silk-collagen block copolymer. With this adaptation, it is possible to simulate large
    systems long enough to predict the properties of the collagen-like block and to understand its
    role in driving the growth direction of the silk fiber.

    1.4 Outline of this Dissertation

    The HG model, developed to study protein folding and aggregation, was adapted for the
    collagen-like sequence in such a way that the dihedral angles are treated with either a cosine
    expansion or a harmonic potential. The resulting parameters can be combined with previous results
    for the silk-like block in order to enable a complete description of the collagen-silk-collagen
    block copolymer.

    In Chapter 2 we describe the molecular dynamics methods applied to the system under study, as
    well as details of the simulation procedure. We also introduce the adapted Head-Gordon model,
    which coarse-grains the polypeptide by lumping an entire amino acid into one bead. In Chapter 3
    we examine the results of the atomistic and the coarse-grained simulations, present the
    distributions that were used to fit the adapted HG model, and show the scaling law between the
    radius of gyration and the number of residues, which appears to be in very good agreement with
    the experimental value. Finally, in Chapter 4 we conclude the work and give future perspectives.


    2 Methods

    In this chapter we discuss the atomistic and CG Molecular Dynamics (MD) simulation techniques
    that were used to study the collagen-silk block copolymers. We performed the atomistic
    simulations using GROMACS version 4.0 [30], which provided the distributions of the bond
    distances and of the bend and dihedral angles between three and four consecutive Cα atoms,
    respectively. With these distributions we fitted the adapted HG model, which gives access to
    longer time and length scales. The CG simulations were performed with the CM3D code [31].

    2.1 Molecular Dynamics

    The key to MD simulations is to integrate Newton's equations of motion for the N interacting
    particles,

    \[ m \frac{d^2 \mathbf{r}_i}{dt^2} = \sum_{j \neq i} \mathbf{F}(\mathbf{r}_{ij}) \tag{2.1} \]

    accurately and in such a way that the evaluation of the pairwise additive interactions does not
    scale as $N^2$. Several techniques are available to speed up the evaluation of both short-range
    and long-range forces. Following the standard procedure of calculating the force as the
    derivative of a potential, we can integrate Newton's equations of motion with one of the many
    available algorithms. A very simple one is the leap-frog integrator [32, 33], a Verlet-like
    second-order algorithm that evaluates the velocities at half-integer time steps and uses these
    velocities to compute the new positions:

    \[ \mathbf{r}(t+\Delta t) = \mathbf{r}(t) + \Delta t\, \mathbf{v}(t+\Delta t/2) \tag{2.2} \]

    \[ \mathbf{v}(t+\Delta t/2) = \mathbf{v}(t-\Delta t/2) + \frac{\Delta t}{m}\, \mathbf{f}(t) \tag{2.3} \]

    The more sophisticated Gear predictor-corrector algorithm follows the general finite-difference
    pattern, in which estimates of the positions, velocities etc. at time $t+\Delta t$ are obtained by
    Taylor expansion about time $t$. These predicted values do not represent the true trajectory.
    After the forces are calculated at the predicted position $\mathbf{r}^p(t+\Delta t)$, the
    trajectory is corrected, and the correction step can be iterated with the new information, so
    that $\mathbf{r}^c(t+\Delta t)$ becomes a better approximation to the true position.
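
    The leap-frog update of eqs. 2.2 and 2.3 can be sketched in a few lines. This is only an
    illustration under stated assumptions: the function name `leapfrog`, the one-dimensional
    harmonic test force and all numerical values are hypothetical and are not taken from GROMACS
    or CM3D.

```python
import math


def leapfrog(r, v_half, force, mass, dt, n_steps):
    """Advance r(t) and v(t - dt/2) by n_steps leap-frog steps."""
    for _ in range(n_steps):
        v_half = v_half + dt * force(r) / mass  # eq. 2.3: v(t + dt/2)
        r = r + dt * v_half                     # eq. 2.2: r(t + dt)
    return r, v_half


# Illustrative test system (assumption): 1D harmonic oscillator with k = m = 1,
# whose exact period is 2*pi.
k, mass, dt = 1.0, 1.0, 0.01


def harmonic_force(x):
    return -k * x


r0, v0 = 1.0, 0.0
# Leap-frog needs v(-dt/2) to start; estimate it with half a backward step.
v_start = v0 - 0.5 * dt * harmonic_force(r0) / mass
n_steps = round(2.0 * math.pi / dt)  # roughly one period
r_final, v_final = leapfrog(r0, v_start, harmonic_force, mass, dt, n_steps)
```

    After about one period the position returns close to its starting value, which is a quick
    sanity check on the time step.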


    2.2 Atomistic Models

    As described in the previous section, the key point of MD simulations is to solve Newton's
    equations of motion. Usually, however, a system is defined by its potential energy rather than by
    the forces, which are then easily calculated as the negative gradient of the potential:
    $\mathbf{F}(\mathbf{r}_{ij}) = -\nabla V(\mathbf{r}_{ij})$. To this end, many potential energy
    functions, the so-called force fields (FF) [35], were developed to simulate protein systems. The
    basic idea of a FF is to map all the possible physical interactions in the system onto a
    potential, as presented in eqs. 2.4 to 2.9:

    \[ V = V_{noncov} + V_{cov} = (V_{LJ} + V_{C}) + (V_{bond} + V_{bend} + V_{dih}) \tag{2.4} \]

    \[ V_{LJ}(r_{ij}) = 4\epsilon_{ij} \left[ C^{(12)}_{ij} \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{12} - C^{(6)}_{ij} \left( \frac{\sigma_{ij}}{r_{ij}} \right)^{6} \right] \tag{2.5} \]

    \[ V_{C}(r_{ij}) = \frac{1}{4\pi\epsilon_0} \frac{q_i q_j}{\epsilon_r r_{ij}} \tag{2.6} \]

    \[ V_{bond}(r_{ij}) = \frac{1}{2} k^{(bond)}_{ij} \left( r_{ij} - b_{ij} \right)^2 \tag{2.7} \]

    \[ V_{bend}(\theta_{ijk}) = \frac{1}{2} k^{bend}_{ijk} \left( \theta_{ijk} - \theta^{0}_{ijk} \right)^2 \tag{2.8} \]

    \[ V_{dih}(\phi_{ijkl}) = \frac{1}{2} \left[ C_1 (1 + \cos\phi) + C_2 (1 - \cos 2\phi) + C_3 (1 + \cos 3\phi) + C_4 (1 - \cos 4\phi) \right] \tag{2.9} \]

    where the potential is divided into bonded (or covalent) and non-bonded (or non-covalent) parts.
    The non-bonded interactions contain a repulsion term, a dispersion term and a Coulomb term. The
    repulsion and dispersion terms are combined in the Lennard-Jones (or 6-12) interaction. In
    addition, (partially) charged atoms interact through the Coulomb term. Bonded interactions are
    based on a fixed list of atoms. They are not exclusively pair interactions, but include 3- and
    4-body interactions as well: bond stretching (2-body), bond angle (3-body) and dihedral angle
    (4-body) interactions, given by eqs. 2.7, 2.8 and 2.9, respectively.
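
    As an illustration of how the terms of eqs. 2.5 to 2.9 translate into code, the sketch below
    evaluates each contribution for a single pair, angle or dihedral. The function names are
    hypothetical, and the value used for 1/(4πε0) (about 138.935 kJ mol^-1 nm e^-2, i.e. distances
    in nm and charges in units of e) is an assumption about units; it is not a reproduction of any
    particular FF implementation.

```python
import math


def v_lj(r, eps, sigma, c12=1.0, c6=1.0):
    """Lennard-Jones term, eq. 2.5."""
    sr6 = (sigma / r) ** 6
    return 4.0 * eps * (c12 * sr6 * sr6 - c6 * sr6)


def v_coulomb(r, qi, qj, eps_r=1.0, k_e=138.935):
    """Coulomb term, eq. 2.6; k_e = 1/(4 pi eps0) in kJ mol^-1 nm e^-2 (assumed units)."""
    return k_e * qi * qj / (eps_r * r)


def v_bond(r, k_bond, b):
    """Harmonic bond stretching, eq. 2.7."""
    return 0.5 * k_bond * (r - b) ** 2


def v_bend(theta, k_bend, theta0):
    """Harmonic bond-angle bending, eq. 2.8 (angles in radians)."""
    return 0.5 * k_bend * (theta - theta0) ** 2


def v_dih(phi, c1, c2, c3, c4):
    """Dihedral term, eq. 2.9 (phi in radians)."""
    return 0.5 * (c1 * (1.0 + math.cos(phi)) + c2 * (1.0 - math.cos(2.0 * phi))
                  + c3 * (1.0 + math.cos(3.0 * phi)) + c4 * (1.0 - math.cos(4.0 * phi)))


def force_from_potential(v, r, h=1e-6):
    """Radial force F = -dV/dr, estimated by a central finite difference."""
    return -(v(r + h) - v(r - h)) / (2.0 * h)


# The LJ force vanishes at the minimum r = 2^(1/6) sigma, where V = -eps.
r_min = 2.0 ** (1.0 / 6.0)
f_at_min = force_from_potential(lambda r: v_lj(r, eps=1.0, sigma=1.0), r_min)
```

    In a production code the forces would be derived analytically; the finite difference is used
    here only to make the relation between force and potential explicit.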

    Many FFs are available nowadays; the most common are AMBER [36], CHARMM [37], GROMOS [38] and
    OPLS-AA [39]. Their potential energy functions are parametrized against experiments and ab initio
    quantum mechanical calculations.

    2.2.1 Setup of the atomistic simulations

    Atomistic simulations can reveal many details of the system, as they treat both the protein under consideration and the water. However, it is very difficult to reach long time and length scales

    within this framework. In this case, the atomistic simulations were carried out only to extract the pa-

    [...]

    that time on many approaches emerged. The CG models evolved in different directions, which differ
    mainly in the balance between the complexity of the representation and the complexity of the
    parametrization. Harmonic models represent the system by beads (usually one per amino acid)
    connected by elastic springs [21]; they are used mainly for the analysis of the principal modes
    [22] and require prior knowledge of an equilibrium reference configuration. Go-like models [23]
    also require a priori knowledge of the native state and fail to represent the most intriguing
    aspect of protein folding: its dependence on the primary sequence. A lower level of reference
    dependence is found in the Head-Gordon model [25], which represents each amino acid as one bead.
    Two-bead models [27] add a second bead at the centroid of the sidechain, which reduces the
    dependence on a reference configuration but increases the complexity of the energy terms and
    introduces correlations between dihedral angles. Four- to six-bead models [28, 29] represent the
    sidechain as one bead but explicitly consider the coordinates of the three heavy atoms of the
    backbone.

    We chose the Head-Gordon (HG) model [25] to coarse-grain the collagen-like block, as it had been
    successfully applied to the silk-like sequence before [14]. In this model, in contrast with the
    Go model [23], we do not need to know anything about the tertiary structure of the native state.
    However, we must face the more difficult aspect of the protein folding problem, namely its
    dependence on the amino acid sequence. The Cα trace is taken to represent the protein backbone,
    and the structural details of the amino acids and the aqueous solvent are integrated out and
    replaced by effective bead-bead interactions.

    2.3.1 Head-Gordon Model

    The original Head-Gordon (HG) model is an improvement on previous efforts of Thirumalai and
    coworkers [53] and is more general, covering α-helical, β-sheet and α/β protein topologies. The
    20-letter amino acid sequence is converted to a three-letter code defined by the flavours
    hydrophilic (L), hydrophobic (B) and neutral (N). The idea of describing an amino acid as one
    bead is illustrated in the figure below:

    Figure 2.2: Schematic description of an amino acid as a bead in the CG model.


    The force field in the original HG model is defined as:

    \[ H = \sum_{\theta} \frac{1}{2} k_{\theta} (\theta - \theta_0)^2
         + \sum_{\phi} \left[ A (1 + \cos\phi) + B (1 - \cos\phi) + C (1 + \cos 3\phi) + D \left( 1 + \cos\left( \phi + \frac{\pi}{4} \right) \right) \right]
         + \sum_{i,\, j \geq i+3} 4 \epsilon_H S_1 \left[ \left( \frac{\sigma}{r_{ij}} \right)^{12} - S_2 \left( \frac{\sigma}{r_{ij}} \right)^{6} \right] \tag{2.10} \]

    where $\theta$ is the bond angle between three consecutive beads, with $\theta_0 = 105^\circ$ the
    equilibrium angle and $k_{\theta} = 20\,\epsilon_H/\mathrm{rad}^2$. The dihedral angle $\phi$
    between four consecutive beads can assume different conformations depending on the region to be
    described, with the constants A, B, C and D defining the shape of the distribution. The LJ
    potential determines the attraction or repulsion between beads of size $\sigma$ with flavours i
    and j separated by $r_{ij}$: B-B interactions are attractive and represented by $S_1 = S_2 = 1$;
    $S_1 = 1/3$ and $S_2 = 0$ apply for L-L and L-B interactions; and N-L, N-B and N-N interactions
    have the constants $S_1 = 1$ and $S_2 = 0$. In the original HG model the bond lengths are
    constrained by the RATTLE algorithm [54]. The non-bonded potentials are plotted in the figure
    below:

    Figure 2.3: Non-bonded potential between neutral, hydrophilic, hydrophobic and proline (treated as

    neutral) amino acids.

    2.4 Adapted HG model

    Based on the original Head-Gordon model and the version adapted for the silk part [14] of the
    block copolymer, we developed a four-letter minimalist model for the collagen-like block:
    hydrophilic (L), hydrophobic (B), neutral (N) and proline (P). Proline is defined as a separate
    flavour because of its key role in the stiffness of the dihedral angles. Table 2.2 below shows
    the sequence mapping between the 20-letter amino acid code and the adapted four-letter CG code,
    which generates the minimalist full sequence shown in fig. 2.4.

    Figure 2.4: Full collagen sequence transcription from the 20-letter amino acid code to the
    adapted four-letter minimalist HG code based on table 2.2.

    Name            20-letter   4-letter    Name          20-letter   4-letter
    Glycine         GLY / G     N           Asparagine    ASN / N     L
    Alanine         ALA / A     B           Proline       PRO / P     P
    Glutamic acid   GLU / E     L           Glutamine     GLN / Q     L
    Lysine          LYS / K     L           Serine        SER / S     N

    Table 2.2: Sequence mapping between 20-letter amino acid and adapted CG four letter code.
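
    The mapping of Table 2.2 is a simple lookup, sketched below. The names `CG_CODE`, `to_cg` and
    `dihedral_quadruplets` are hypothetical helpers introduced only for this illustration; the
    example input is the 30-residue sequence quoted in Chapter 3.

```python
# One-letter amino acid code -> four-letter minimalist flavour (Table 2.2).
CG_CODE = {
    "G": "N",  # glycine: neutral
    "A": "B",  # alanine: hydrophobic
    "E": "L",  # glutamic acid: hydrophilic
    "K": "L",  # lysine: hydrophilic
    "N": "L",  # asparagine: hydrophilic
    "P": "P",  # proline: its own flavour
    "Q": "L",  # glutamine: hydrophilic
    "S": "N",  # serine: neutral
}


def to_cg(sequence):
    """Translate a one-letter amino acid sequence into the four-letter CG code."""
    try:
        return "".join(CG_CODE[aa] for aa in sequence.upper())
    except KeyError as err:
        raise ValueError(f"residue {err.args[0]!r} is not listed in Table 2.2") from None


def dihedral_quadruplets(cg_sequence):
    """List the four-bead windows that define each dihedral angle along the chain."""
    return [cg_sequence[i:i + 4] for i in range(len(cg_sequence) - 3)]


cg = to_cg("GNEGQPGQPGQNGQPGEPGSNGPQGSQGNP")
quads = dihedral_quadruplets(cg)
```

    Each window returned by `dihedral_quadruplets` identifies the flavour pattern, and thus the
    position of proline, for one dihedral angle.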

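    The FF also needs to be modified to account for the changes in the dihedral angles, the bond
    distances (which are no longer constrained) and the non-local interactions between beads. The new
    FF is given by eqs. 2.11, 2.12 and 2.13 below:

    \[ H_{adap} = \sum_{b} \frac{1}{2} k_b (b - b_0)^2
                + \sum_{\theta} \frac{1}{2} k_{\theta} (\theta - \theta_0)^2
                + \sum_{\phi} \left[ V_{weak}(\phi) + V_{stiff}(\phi) \right]
                + \sum_{i,\, j \geq i+3} 4 \epsilon_H S_1 \left[ \left( \frac{\sigma}{r_{ij} - \sigma_0} \right)^{12} - S_2 \left( \frac{\sigma}{r_{ij} - \sigma_0} \right)^{6} \right] \tag{2.11} \]

    \[ V_{weak}(\phi) = \epsilon_h \sum_{k=0}^{6} A_k \cos^k(\phi) \tag{2.12} \]

    \[ V_{stiff}(\phi) = B_0 - \frac{1}{2} \epsilon_h B_1 (\phi - \phi_0)^2 \tag{2.13} \]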

    where the bond distances are now described explicitly by a spring potential, because the CM3D
    package used for the CG simulations employs a reversible multiple time-step integrator. The bond
    stiffness is $k_b = 33\,\epsilon_h$ and $b_0 = 3.84$ Å, consistent with the measured all-atom
    Cα-Cα distance distributions. The bond angles are treated as in the original HG model, with the
    parameters $k_{\theta} = 20\,\epsilon_h$ and $\theta_0 = 105^\circ$ extracted from the atomistic
    simulations. The parameters $A_k$, $B_0$, $B_1$ and $\phi_0$ set the dihedral angles between four
    subsequent Cα atoms, which show either periodic behaviour with two minima (weak) or harmonic
    behaviour with one minimum (stiff), depending on the four-bead sequence, since the new flavour
    (proline) plays an important role in the stiffness of the dihedral angle. The L-J potential is
    shifted and scaled in order to coincide with the previous model optimized for the silk part [14]:
    the scaled parameter $\sigma$ sets the range of the interaction and $\sigma_0$ shifts the
    potential to match the size of the bead, as can be seen in fig. 2.5. The strength of the
    non-bonded interaction, $k_{LJ} = 4\,\epsilon_h$, was previously optimized for the silk part by
    M. Schor [14] by comparing the potentials of mean force (PMF) from steered MD (SMD) simulations of
    the atomistic and the coarse-grained models. The PMFs from SMD were calculated following the
    method of Park and Schulten [63, 64].

    Figure 2.5: Shifting the Lennard-Jones potential has the effect of shortening the range of the
    potential. The positions of the nearest neighbours and the second nearest neighbours between
    strands in the silk part are given by the dashed vertical lines. In the traditional L-J potential
    the second nearest neighbours still feel the interaction.
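
    The effect of shifting and rescaling the L-J term can be illustrated with a short sketch. It
    assumes that the non-bonded term of eq. 2.11 has the form 4 ε S1 [(σ/(r - σ0))^12 - S2
    (σ/(r - σ0))^6]; the symbol names `sigma` and `sigma0`, the function name and every numerical
    value below are assumptions made for illustration, not the parameters optimized in [14].

```python
def shifted_scaled_lj(r, s1, s2, eps=1.0, sigma=1.0, sigma0=0.0):
    """Shifted and rescaled 12-6 potential; only meaningful for r > sigma0.

    With sigma0 = 0 this reduces to the unshifted 12-6 form.
    """
    x = sigma / (r - sigma0)
    return 4.0 * eps * s1 * (x ** 12 - s2 * x ** 6)


def minimum_position(sigma, sigma0):
    """Position of the minimum for an attractive pair (s2 = 1)."""
    return sigma0 + 2.0 ** (1.0 / 6.0) * sigma


# Hypothetical comparison at r = 2 (reduced units): the shifted, rescaled
# potential keeps the same well depth but decays faster with distance.
plain = shifted_scaled_lj(2.0, s1=1.0, s2=1.0, sigma=1.0, sigma0=0.0)
short = shifted_scaled_lj(2.0, s1=1.0, s2=1.0, sigma=0.5, sigma0=0.5)
depth = shifted_scaled_lj(minimum_position(0.5, 0.5), s1=1.0, s2=1.0,
                          sigma=0.5, sigma0=0.5)
```

    With these illustrative numbers the well depth is unchanged, while the attraction at r = 2 is
    roughly an order of magnitude weaker than for the unshifted potential, which is the
    range-shortening effect described in the caption of fig. 2.5.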

    Simulations with CM3D were carried out for different sequence lengths. We started from the
    30-residue structure built in MolMol and kept the positions of the Cα atoms. From this basic
    structure we built sequences of 45, 60, 75, 90, 120, 150 and 200 beads by joining pieces of short
    peptides. We used VMD to manipulate the PDB files of the collagen-like sequences of different
    lengths. The sequences were then placed in a cubic box (as the CG simulation employs implicit
    solvent, the shape of the box is not relevant) with periodic boundary conditions in the x, y and
    z directions. The time step used was 2 fs.

    We started by relaxing the protein at low temperature. The first simulation was carried out with
    velocities drawn from a Boltzmann distribution at 30 K, for 1 ns in the NVT ensemble, with a
    Nosé-Hoover chain with 4 units and time step of 1.6 ps. We then performed further 1 ns
    simulations, each starting from the previous velocities and positions, gradually raising the
    temperature to 60 K, 100 K, 200 K and, finally, 300 K. We then equilibrated the system for 1 ns
    at 300 K and started a 50 ns simulation to sample the averages. The 1 ns simulations were carried
    out on a local machine and took at most 30 minutes for the 200-residue sequence. The 50 ns
    simulations were run on the LISA cluster with 1 processor 2 Intel Xeon 3.4 GHz and took at most
    20 hours for the longest sequence (200 beads).
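
    The heating protocol above can be written down as a small schedule, which makes the number of
    integration steps per stage explicit. The sketch assumes the 2 fs time step quoted for the CG
    runs; the names `STAGES` and `n_steps` are hypothetical and do not correspond to CM3D input
    keywords.

```python
DT_FS = 2.0  # integration time step in femtoseconds, as quoted in the text

# (temperature in K, duration in ns): five 1 ns heating stages,
# 1 ns of equilibration at 300 K, then the 50 ns production run.
STAGES = [
    (30, 1.0),
    (60, 1.0),
    (100, 1.0),
    (200, 1.0),
    (300, 1.0),
    (300, 1.0),
    (300, 50.0),
]


def n_steps(duration_ns, dt_fs=DT_FS):
    """Number of integration steps needed to cover duration_ns."""
    return round(duration_ns * 1.0e6 / dt_fs)  # 1 ns = 1e6 fs


steps_per_stage = [n_steps(duration) for _, duration in STAGES]
total_steps = sum(steps_per_stage)
```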

    2.5 Order Parameters

    The configuration space is very high-dimensional, and a direct visualization of its coordinates
    is meaningless. To overcome this problem, we project the phase space onto one-dimensional
    representations, the order parameters. They can monitor the (un)folding transitions, characterize
    native and unfolded states, and some of them can be compared directly with experimental
    measurements.

    Since a protein chain is not a regular object and is subject to a dynamic structural equilibrium
    that involves motion, it is necessary to consider a statistical measure of the chain size. The
    end-to-end distance is a key descriptor of the statistical behavior of the chain.

    A commonly used order parameter is the Root Mean Square Deviation (RMSD) from a reference
    structure, usually obtained experimentally by X-ray or NMR. The RMSD was calculated after
    minimization over rotations and translations and is defined as:

    \[ RMSD = \left[ \frac{1}{M} \sum_{i=1}^{N} m_i \left| \mathbf{r}_i - \mathbf{r}^{ref}_i \right|^2 \right]^{1/2} \tag{2.14} \]

    where $M = \sum_i m_i$ is the total mass.

    A second-order moment about the mean chain position is the radius of gyration. It describes the
    overall spread of the molecule and is defined as the root mean square distance of the collection
    of atoms from their common centre of gravity:

    \[ R_g = \left[ \frac{1}{M} \sum_{i=1}^{N} m_i \left| \mathbf{r}_i - \mathbf{r}_{cm} \right|^2 \right]^{1/2} \tag{2.15} \]

    where rcm denotes the position of center of mass of the protein. This measure gives a valuable way to

    compare our CG method with the experimental data available for the collagen-like system.

    The Root Mean Square Fluctuation (RMSF) is a measure of the deviation between the position of
    particle i and some reference position:

    \[ RMSF = \sqrt{ \frac{1}{T} \sum_{t_j=1}^{T} \left( \mathbf{r}_i(t_j) - \tilde{\mathbf{r}}_i \right)^2 } \tag{2.16} \]

    where T is the time over which one wants to average and $\tilde{\mathbf{r}}_i$ is the reference
    position of particle i. Typically this reference position is the time-averaged position of the
    same particle i, i.e. $\langle \mathbf{r}_i \rangle$. Note that, instead of averaging over the
    particles (as in the RMSD), the RMSF averages over the simulation time, giving a value for each
    particle i, usually the Cα atoms.
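
    The three order parameters of eqs. 2.14 to 2.16 can be computed from coordinate arrays as in the
    NumPy sketch below. It rests on two assumptions that should be kept in mind: the RMSD function
    takes structures that have already been superimposed (the fit over rotations and translations
    is not performed here), and the RMSF uses the time-averaged position as reference. Function
    names and array layouts are hypothetical.

```python
import numpy as np


def radius_of_gyration(r, m):
    """Mass-weighted radius of gyration, eq. 2.15; r has shape (N, 3), m has shape (N,)."""
    r = np.asarray(r, dtype=float)
    m = np.asarray(m, dtype=float)
    r_cm = (m[:, None] * r).sum(axis=0) / m.sum()
    sq_dist = ((r - r_cm) ** 2).sum(axis=1)
    return np.sqrt((m * sq_dist).sum() / m.sum())


def rmsd(r, r_ref, m):
    """Mass-weighted RMSD, eq. 2.14, for structures that are already superimposed."""
    r = np.asarray(r, dtype=float)
    r_ref = np.asarray(r_ref, dtype=float)
    m = np.asarray(m, dtype=float)
    sq_dist = ((r - r_ref) ** 2).sum(axis=1)
    return np.sqrt((m * sq_dist).sum() / m.sum())


def rmsf(traj):
    """Per-particle RMSF, eq. 2.16; traj has shape (T, N, 3).

    The reference position is taken to be the time average of each particle.
    """
    traj = np.asarray(traj, dtype=float)
    mean_pos = traj.mean(axis=0)
    return np.sqrt(((traj - mean_pos) ** 2).sum(axis=2).mean(axis=0))


# Toy data: two unit masses at x = +1 and x = -1.
coords = [[1.0, 0.0, 0.0], [-1.0, 0.0, 0.0]]
masses = [1.0, 1.0]
rg = radius_of_gyration(coords, masses)
# One particle hopping between x = +1 and x = -1 over two frames.
fluct = rmsf([[[1.0, 0.0, 0.0]], [[-1.0, 0.0, 0.0]]])
```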


    3 Results

    The results of the MD simulations described in Chapter 2 are presented here. We analyse the bond
    distances, bend and dihedral angles and order parameters obtained from the atomistic simulations
    and use them to fit the adapted CG model for the collagen-like block copolymer. We then present
    the improved CG model and compare its results with the atomistic simulations. Lastly, we
    summarize the results obtained from the adapted model by analysing the order parameters.

    3.1 Atomistic Simulations

    Atomistic simulations of the short peptides provided sufficient information about the bond, bend
    and dihedral distributions, while the simulation of the 30-residue collagen-like peptide revealed
    several details about the dynamics of the protein. The bond distance distributions (see fig. 3.1
    (left)), strongly peaked around 3.84 ± 0.12 Å, the distance between Cα atoms of subsequent amino
    acids, justify the use of a stiff harmonic potential for the bonds. Likewise, the rather narrow
    distribution of the bend angles (see fig. 3.1 (right)) calculated from the atomistic simulations
    justifies the same treatment as in the original HG model. The dihedral angles between four
    subsequent Cα atoms show either periodic behaviour with two minima (flexible) or harmonic
    behaviour with one minimum (stiff), which led us to extend the model so that these details are
    taken into account.

    Figure 3.1: Bond distance (left) and bend angle (right) distributions obtained from an all-atom
    simulation of the short peptide A1.


    The dihedral distributions in fig. 3.2 show three examples of rather different potentials, which
    were fitted with either a cosine expansion or a harmonic potential depending on the position of
    proline in the dihedral angle.

    Figure 3.2: Negative logarithm of the dihedral distributions of the sequences LNPL (left), PNLP
    (center) and LLNL (right) in the minimalist four-letter description, obtained from all-atom
    simulations of the short peptides. The distribution clearly changes with the relative position of
    the proline residue in the sequence.

    After obtaining all the parameters required for the CG model, we present the results for the
    30-residue peptide, which was simulated for 60 ns to provide a reference system for comparison
    with the new minimalist model. This sequence was simulated following the same procedure adopted
    for the small pieces of the collagen-like sequence. We chose an intermediate sequence,
    GNEGQPGQPGQNGQPGEPGSNGPQGSQGNP, to sample as many different amino acids as possible, and
    calculated the order parameters RMSD, Rg and RMSF (see fig. 3.3).

    Figure 3.3: RMSD (left), Rg (center) and RMSF (right) calculated for a 60 ns simulation of the
    30-residue sequence. The RMSD reaches a plateau after 45 ns, while Rg remains flat during the
    whole simulation. The x-axis of the RMSF plot indexes the Cα atoms; none of the atoms occupies a
    markedly more stable position than the others.

    [...]

    Figure 3.4: Comparison between the atomistic short-peptide simulations and the 30-residue CG
    simulations for the negative logarithm of the dihedral distributions of the sequences LNPL
    (left), PNLP (center) and LLNL (right). The fits for the PNLP and LNPL sequences are good, but
    the agreement for the LLNL potential is poorer, which may be due to steric hindrance or to
    insufficient sampling of the phase space.

    3.3 Analysis of the collagen-like block

    The analysis of all the dihedral angles in the atomistic simulations showed a strong sequence
    dependence of the distributions, leading us to adapt the original Head-Gordon model to achieve a
    more accurate description of our collagen-like proteins. From our simulations, we concluded that
    the high concentration of proline randomly spread through the sequence plays an important role in
    the stiffness of the dihedral angles between four consecutive Cα atoms, and therefore proline
    must be treated as a separate flavour. We thus characterized the dihedral distributions using
    four flavours: hydrophilic (L), hydrophobic (B), neutral (N) and proline (P). The relation
    between the amino acids present in the collagen-like sequence and their four-letter minimalist
    codes is given in Table 2.2.

    After taking the negative logarithm of the dihedral distributions, we fitted them with either a
    cosine expansion ($\sum_{k=0}^{6} A_k \cos^k(\phi)$) or a parabolic function
    ($B_0 - \frac{1}{2} B_1 (\phi - \phi_0)^2$), according to the position of the proline. The
    observations on the dihedral stiffness are summarized in the table below, where the distributions
    are divided into groups according to their main characteristics. The first group, in which
    proline is absent, shows a flexible and smooth logarithmic distribution of the dihedral angles.
    The second group has a proline at the third position, which makes the dihedral angles stiff, so
    that the logarithmic distribution is very stiff with one minimum. The third group has a proline
    at the second position and, despite its stiffness, can still be fitted by a cosine expansion. The
    fourth and last group has proline at one or both flanks, which makes the dihedral angles very
    flexible and the distribution therefore smooth.
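
    The fitting procedure described above (negative logarithm of a dihedral histogram, followed by
    a fit with powers of cos φ or with a parabola) can be sketched as follows. This is a generic
    least-squares illustration on synthetic data, not the fitting script used for the thesis; the
    function names, the bin count and the test coefficients are assumptions. The negative logarithm
    of a normalized distribution gives the effective potential in units of kB T, up to an additive
    constant.

```python
import numpy as np


def neg_log_distribution(angles, bins=72):
    """Histogram dihedral angles (radians, in [-pi, pi]) and return -ln P at the bin centres."""
    hist, edges = np.histogram(angles, bins=bins, range=(-np.pi, np.pi), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    populated = hist > 0  # empty bins would give an infinite value
    return centers[populated], -np.log(hist[populated])


def fit_cosine_expansion(phi, u, order=6):
    """Least-squares coefficients A_0..A_order of sum_k A_k cos^k(phi)."""
    basis = np.cos(phi)[:, None] ** np.arange(order + 1)
    coeffs, *_ = np.linalg.lstsq(basis, u, rcond=None)
    return coeffs


def fit_parabola(phi, u):
    """Fit a parabola and return (position of extremum, curvature, value at extremum)."""
    a, b, c = np.polyfit(phi, u, 2)
    phi0 = -b / (2.0 * a)
    return phi0, 2.0 * a, c - a * phi0 ** 2


# Synthetic "flexible" profile with known coefficients A_0 = 0.2, A_1 = 1.0, A_2 = -0.5.
phi_grid = np.linspace(-np.pi, np.pi, 181)
u_flexible = 0.2 + 1.0 * np.cos(phi_grid) - 0.5 * np.cos(phi_grid) ** 2
a_k = fit_cosine_expansion(phi_grid, u_flexible)

# Synthetic "stiff" profile: a single well centred at phi = 1.0 rad.
phi_well = np.linspace(0.5, 1.5, 21)
u_stiff = 0.3 + 0.5 * 4.0 * (phi_well - 1.0) ** 2
phi0, curvature, u_min = fit_parabola(phi_well, u_stiff)
```

    How the fitted curvature and offset map onto $B_0$ and $B_1$ depends on the sign convention
    adopted for the parabolic function, so the sketch returns the geometric quantities instead.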

    From table 3.1 and fig. 3.4 above, we can draw some conclusions about the role that proline plays
    in the dihedral distributions:

    1. Proline makes the dihedral angles stiffer when it is on the second or third position from the first

    [...]

    3.4 Dependence of the Rg with the number of residues

    We then calculated the radius of gyration from the output of CG simulations run in CM3D for
    several collagen lengths and plotted Rg as a function of the number of residues on a logarithmic
    scale in fig. 3.6. There is very good agreement, within the statistical error, with the
    experimental value for 400 residues, which serves to validate the adapted HG model for the
    collagen part of the block copolymer. From the slope of the curve, the dependence of the radius
    of gyration on the number of residues is $R_g = 1.391\,N^{0.528}$, in good agreement with Flory's
    exponent (0.583) [62] for a good solvent (in which the particles effectively repel each other),
    where N denotes the number of residues.
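
    The exponent of such a scaling law is obtained from a straight-line fit on a log-log scale. The
    sketch below shows the procedure on synthetic data generated from the fitted law quoted above;
    the chain lengths are those simulated in this work, but the Rg values are not the simulation
    output, and the function name `fit_power_law` is hypothetical.

```python
import numpy as np


def fit_power_law(n, rg):
    """Fit rg = a * n**nu by linear regression of ln(rg) against ln(n)."""
    nu, log_a = np.polyfit(np.log(n), np.log(rg), 1)
    return np.exp(log_a), nu


# Chain lengths used in the CG simulations.
n_residues = np.array([30, 45, 60, 75, 90, 120, 150, 200])
# Synthetic radii of gyration generated from the quoted law (assumption, for illustration only).
rg_synthetic = 1.391 * n_residues ** 0.528

prefactor, exponent = fit_power_law(n_residues, rg_synthetic)
```

    With real simulation data the points scatter around the line, and the uncertainty of the slope
    should be estimated together with the statistical error of each Rg value.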

    Figure 3.6: Log-log dependence of the radius of gyration (in nm) on the number of residues in the
    collagen-like sequence. The simulation values are plotted together with the experimental result
    for the 400-residue sequence and fitted with a linear function.


    4 Conclusions and Future Perspectives

    Finally, we conclude that the adapted HG model developed for the collagen-like protein can
    predict the experimentally observed order parameter value. Therefore, as stated at the beginning,
    it is confirmed that the high concentration of proline and of charged/hydrophilic residues in the
    sequence plays an important role in preventing the structure from folding into any specific state,
    so that it retains its randomness instead. The radius of gyration agrees very well with experiment,
    following a power-law dependence on the number of residues (linear on a log-log scale) with an
    exponent close to the Flory exponent.

    In the future, this adapted collagen-like CG model will be combined with the adapted CG model
    previously developed for the silk-like part and applied to the whole collagen-silk-collagen block
    copolymer. This will enable us to study the effect of the collagen-like blocks on the folding and
    self-assembly of the silk-like block.


    Acknowledgements

    Many people contributed to the accomplishment of this work. Here I give special mention to some
    of them, not necessarily the most important, but those who were essential, at precise moments, in
    consolidating this achievement.

    First of all I thank God, for having given me the ability to learn and understand, for always
    supplying me with vitality to face the challenges and keep achieving my goals, and for never
    allowing me to surrender, but instead keeping me humble.

    I thank my supervisor Peter Bolhuis for giving me the opportunity to start this MSc project at the
    University of Amsterdam. I also thank Marieke Schor, who co-supervised me during the project and
    made my life much easier with her expertise in protein folding and coarse-grained simulation. I also
    thank the Molsim group, Bernd, Francesco, Anna, Grisell, Murat, Wolfgang, Rosanne and Zerihum,
    my friend, with whom I had a great time in the course of this project. I also acknowledge the Sara
    cluster for the computing power provided for the simulations.

    I also thank my friends in Amsterdam, Pedro, Dimas, Max, Igor, Raquel, Vinicius, Girry, Anthony,
    Adrien, and many other students: thank you all for the great time we had here in Amsterdam. I also
    thank my friends in Lyon, Roberto, Diego, Rodrigo, Franck, Dorian, Jakub, Alex, Aion and Jana, who
    made that short stay in France one of the best periods of my life.

    I would like to thank the coordinators of the AtoSim Programme for accepting my application to

    this course and Erasmus Mundus for the scholarship.

    Finally, I thank all those who contributed directly or indirectly to the accomplishment of this project.

    Thank you very much!


    APPENDIX A -- Dihedral Angles

    Here we present all the dihedral angle distributions analyzed in the collagen-like sequence,
    together with the fitting parameters. Some of them show almost the same behavior and can be
    grouped into four categories defined by the position of the proline. These groups are shown in
    table 3.1. All the sequences are listed in the following figures, with the fitting parameters given
    in the captions.
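    To make the fitting procedure concrete, the sketch below shows one way such coefficients could be obtained. It rests on several assumptions that are not confirmed by this appendix: that the "logarithmic distribution" is the Boltzmann-inverted histogram -kT ln P(φ), that the cosine expansion is a power series in cos φ with coefficients A0 to A5, and that angles are in radians with kT = 1. All function names are hypothetical, and NumPy is assumed.

```python
import numpy as np


def boltzmann_invert(phi, n_bins=72, kT=1.0):
    """Return bin centers and -kT*ln P(phi) for dihedral samples in radians."""
    hist, edges = np.histogram(phi, bins=n_bins, range=(-np.pi, np.pi), density=True)
    centers = 0.5 * (edges[:-1] + edges[1:])
    visited = hist > 0  # skip empty bins, where log(0) is undefined
    return centers[visited], -kT * np.log(hist[visited])


def cosine_expansion(phi, coeffs):
    """Evaluate sum_n A_n * cos(phi)**n (assumed functional form)."""
    powers = np.arange(len(coeffs))
    return (np.cos(phi)[:, None] ** powers) @ np.asarray(coeffs, dtype=float)


def fit_cosine_expansion(phi, energy, order=5):
    """Least-squares estimate of A_0..A_order from (phi, energy) pairs."""
    design = np.cos(phi)[:, None] ** np.arange(order + 1)
    coeffs, *_ = np.linalg.lstsq(design, energy, rcond=None)
    return coeffs


# Round trip on a regular grid: build a curve from the NNLN coefficients listed
# in the caption of Figure A.1 (under the assumed form) and recover them by fitting.
phi_grid = np.linspace(-np.pi, np.pi, 72, endpoint=False)
nnln = [-3.34, 1.51, 2.19, -1.22, -3.34, -0.69]
recovered = fit_cosine_expansion(phi_grid, cosine_expansion(phi_grid, nnln))
```

    With real data, the output of `boltzmann_invert` applied to the atomistic dihedral samples would be passed to `fit_cosine_expansion`. The stiff single-minimum distributions of Figure A.3 would need a different functional form, which this sketch does not cover.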

    Figure A.1: The fitting coefficients are NNLN: A0=-3.34, A1=1.51, A2=2.19, A3=-1.22, A4=-3.34,

    A5=-0.69; LLNL: A0=-3.31, A1=-1.99, A2=-0.64, A3=2.17, A4=-0.14, A5=-0.69; NLLN: A0=-4.25,

    A1=0.69, A2=2.27, A3=-0.29, A4=-1.32, A5=0.36.

    Figure A.2: The fitting coefficients are BNNL: A0=-3.98, A1=0.63, A2=-0.42, A3=-1.98, A4=0.70,

    A5=0.89; NLNB: A0=-1.61, A1=-0.33, A2=-4.59, A3=2.08, A4=1.73, A5=-0.81; PBNL: A0=-2.68,

    A1=0.08, A2=1.72, A3=0.55, A4=0.33, A5=-0.11.


    Figure A.3: The fitting coefficients are LNPL - NNPL: B0=-5.81, B1=0.051, 0=-50.24; NNPN:

    B0=-6.73, B1=0.23, 0=-109.86; NBPN: B0=-6.54, B1=0.28, 0=-115.27.

    Figure A.4: The fitting coefficients are BPNL: A0=-2.34, A1=-2.73, A2=-1.08, A3=10.75, A4=-12.81,

    A5=5.07; LPNL - LPNN: A0=-2.33, A1=-0.23, A2=-2.84, A3=-0.02; LNLP: A0=-3.73, A1=-2.24,

    A2=1.02, A3=1.02, A4=-1.41, A5=2.03.

    Figure A.5: The fitting coefficients are NLNP: A0=-4.59, A1=-1.02, A2=2.63, A3=-0.18, A4=-0.43,

    A5=0.91; LLNP: A0=-1.61, A1=-0.33, A2=-4.59, A3=2.08, A4=1.73, A5=-0.81; PNLP: A0=-3.72,

    A1=-2.25, A2=1.02, A3=1.01, A4=-1.41, A5=2.03.


    Bibliography

    [1] I. W. Lyo, P. Avouris, Field-Induced Nanometer-Scale to Atomic-Scale Manipulation of Silicon

    Surfaces with the STM. Science 253, 173 (1991).

    [2] J. Cappello, J. Crissman, M. Dorman, M. Mikolajczak, G. Textor, M. Marquet and F. Ferrari,

    Genetic Engineering of Structural Protein Polymers. Biotechnol. Prog. 6, 198 (1990).

    [3] M. Haider, Z. Megeed and H. Ghandehari, Genetically engineered polymers: status and

    prospects for controlled release. J. Control. Rel. 95, 1 (2004).

    [4] R. Langer and D. A. Tirrell, Designing materials for biology and medicine. Nature 428, 487 (2004).

    [5] G. A. Silva, C. Czeisler, K. L. Niece, E. Beniash, D. A. Harrington, J. A. Kessler and S. I.

    Stupp, Selective Differentiation of Neural Progenitor Cells by High-Epitope Density Nanofibers.

    Science, 303, 1352 (2004).

    [6] D.W. Urry, Elastic molecular machines in metabolism and soft-tissue restoration. Trends

    Biotechnol. 17, 249 (1999).

    [7] D. A. Harrington, E. Y. Cheng, M. O. Guler, L. K. Lee, J. L. Donovan, R. C. Claussen, S. I. Stupp,
    Branched peptide-amphiphiles as self-assembling coatings for tissue engineering scaffolds.
    J. Biomed. Mater. Res. A 78A, 157 (2006).

    [8] J. Cappello, H. Ghandehari, Engineered Protein Polymers for Drug Delivery and Biomedical

    Applications. Adv. Drug Deliv. Rev. 54, 1053 (2002).

    [9] D. Chitkara, A. Shikanov, N. Kumar, A. J. Domb, Biodegradable Injectable In Situ Depot-

    Forming Drug Delivery Systems. Macromol. Biosc. 6, 977 (2006).

    [10] C. Park, J. Yoon and E. L. Thomas, Enabling nanotechnology with self assembled block

    copolymer patterns. Polymer 44, 6725 (2003).

    [11] A. A. Martens, Silk-Collagen-like Block Copolymers with Charged Blocks, self-assembly into
    nanosized ribbons and macroscopic gels. PhD Thesis, Wageningen Universiteit, The Netherlands (2008).

    [12] M. W. T. Werten, W. H. Wisselink, T. J. J. van den Bosch, E. C. de Bruin and F. A. de Wolf,

    Secreted production of a custom-designed, highly hydrophilic gelatin in Pichia pastoris. Protein

    Engineering 14, 447 (2001).

    [13] M. T. Krejchi, E. D. T. Atkins, A. J. Waddon, M. J. Fournier, T. L. Mason and D. A. Tirrell,

    Chemical Sequence Control Of β-Sheet Assembly In Macromolecular Crystals Of Periodic
    Polypeptides. Science 265, 1427 (1994).

    [14] M. Schor, B. Ensing and P. G. Bolhuis, A simple coarse-grained model for self-assembling
    silk-like protein fibers. Soft Matter 5, 2658 (2009). DOI: 10.1039/b902952d


    [15] M. W. T. Werten, T. J. van den Bosch, R. D. Wind, H. Mooibroek and F. A. de Wolf, High-yield

    secretion of recombinant gelatins by Pichia pastoris. Yeast 15, 1087 (1999).

    [16] A. A. Martens, G. Portale, M. W. T. Werten, R. J. de Vries, G. Eggink, M. A. C. Stuart and F. A.

    de Wolf, Triblock Protein Copolymers Forming Supramolecular Nanotapes and pH-Responsive

    Gels. Macromolecules 42, 1002 (2009).

    [17] M. P. Allen and D. J. Tildesley, Computer Simulation of Liquids, Oxford University Press.

    [18] D. Frenkel and B. Smit, Understanding Molecular Simulation - From Algorithms to
    Applications, Academic Press.

    [19] D. Bhella, A. Ralph and R. P. Yeo, Conformational Flexibility in Recombinant Measles Virus

    Nucleocapsids Visualised by Cryo-negative Stain Electron Microscopy and Real-space Helical

    Reconstruction. J. Mol. Biol. 340, 319 (2004).

    [20] M. Levitt, A simplified representation of protein conformations for rapid simulation of protein

    folding. J. Mol. Biol. 104, 59 (1976).

    [21] M. M. Tirion, Large amplitude elastic motions in proteins from a single-parameter, atomic
    analysis. Phys. Rev. Lett. 77, 1905 (1996).

    [22] S. Kundu, R. L. Jernigan, Molecular mechanism of domain swapping in proteins: an analysis of

    slower motions. Biophys. J. 86, 3846 (2004).

    [23] Y. Ueda, H. Taketomi and N. Go, Studies on protein folding, unfolding, and fluctuations by

    computer simulation. II. A three-dimensional lattice model of lysozyme. Biopolymers 17, 1531

    (1978).

    [24] J. A. McCammon, S. H. Northrup, M. Karplus, R. M. Levy, Helix-coil transitions in a simple
    polypeptide model. Biopolymers 19, 2033 (1980).

    [25] S. Brown, N. J. Fawzi and T. Head-Gordon, Coarse-grained sequences for protein folding and
    design. Proc. Natl. Acad. Sci. USA 100, 10712-10717 (2003).

    [26] N. L. Fawzi, E. H. Yap, Y. Okabe, K. L. Kohlstedt, S. P. Brown and T. Head-Gordon, Contrasting

    Disease and Nondisease Protein Aggregation by Molecular Simulation. Acc. Chem. Res. 41 (8),
    1037-1047 (2008).

    [27] I. Bahar, R. L. Jernigan, Inter-residue potentials in globular proteins and the dominance of

    highly specific hydrophilic interactions at close separation. J. Mol. Biol. 266, 195 (1997).

    [28] A. V. Smith, C. K. Hall, α-Helix formation: discontinuous molecular dynamics on an
    intermediate-resolution protein model. Proteins 44, 344 (2001).

    [29] A. V. Smith, C. K. Hall, Assembly of a tetrameric α-helical bundle: computer simulations on an

    intermediate-resolution protein model. Proteins 44, 376 (2001).

    [30] B. Hess, C. Kutzner, D. van der Spoel and E. Lindahl, GROMACS 4: Algorithms for Highly
    Efficient, Load-Balanced, and Scalable Molecular Simulation. J. Chem. Theory Comput. 4, 435-447
    (2008).

    [31] http://www.cmm.upenn.edu/resources/indexsoft.html

    [32] R. W. Hockney, S. P. Goel, J. Eastwood, Quiet high-resolution computer models of a plasma. J.

    Comp. Phys. 14, 148 (1974).


    [33] R. W. Hockney and J. W. Eastwood, Computer Simulations Using Particles. McGraw Hill, New

    York (1981).

    [34] S. Auerbach and A. Friedman. Long-term behaviour of numerically computed orbits: Small and

    intermediate timestep analysis of one-dimensional systems. J. Comput. Phys. 93(1), 189 (1991).

    [35] W. Wang, O. Donini, C. M. Reyes, P. A. Kollman, Biomolecular Simulations: Recent
    Developments in Force Fields, Simulations of Enzyme Catalysis, Protein-Ligand, Protein-Protein,
    and Protein-Nucleic Acid Noncovalent Interactions. Annu. Rev. Biophys. Biomol. Struct. 30,
    211 (2001).

    [36] W. D. Cornell, P. Cieplak, C. I. Bayly, I. R. Gould, K. M. Merz, D. M. Ferguson, D. C.

    Spellmeyer, T. Fox, J. W. Caldwell, P. A. Kollman, A Second Generation Force Field for the

    Simulation of Proteins, Nucleic Acids, and Organic Molecules. J. Am. Chem. Soc. 117, 5179

    (1995).

    [37] A. D. MacKerell Jr., D. Bashford, M. Bellott, R. L. Dunbrack Jr., J. D. Evanseck, M. J. Field, S.

    Fischer, J. Gao, H. Guo, S. Ha, D. Joseph-McCarthy, L. Kuchnir, K. Kuczera, F. T. K. Lau, C. Mattos,
    S. Michnick, T. Ngo, D. T. Nguyen, B. Prodhom, W. E. Reiher, B. Roux, M. Schlenkrich,

    J. C. Smith, R. Stote, J. Straub, M. Watanabe, J. Wiorkiewicz-Kuczera, D. Yin and M. Karplus,

    All-Atom Empirical Potential for Molecular Modeling and Dynamics Studies of Proteins. J.

    Phys. Chem. B 102, 3586 (1998).

    [38] M. Christen, P. H. Hünenberger, D. Bakowies, R. Baron, R. Bürgi, D. P. Geerke, T. N. Heinz,
    M. A. Kastenholz, V. Kräutler, C. Oostenbrink, C. Peter, D. Trzesniak, W. F. van Gunsteren,
    The GROMOS software for biomolecular simulation: GROMOS05. J. Comput. Chem. 26, 1719 (2005).

    [39] G. A. Kaminski, R. A. Friesner, J. Tirado-Rives and W. L. Jorgensen, Evaluation and
    Reparametrization of the OPLS-AA Force Field for Proteins via Comparison with Accurate
    Quantum Chemical Calculations on Peptides. J. Phys. Chem. B 105, 6474 (2001).

    [40] R. Koradi, M. Billeter and K. Wüthrich, MOLMOL: A program for display and analysis of
    macromolecular structures. J. Mol. Graphics 14, 51 (1996).

    [41] V. Humblot, C. Methivier and C. M. Pradier. Adsorption of L-Lysine on Cu(110): A RAIRS Study

    from UHV to the Liquid Phase. Langmuir 22, 3089 (2006).

    [42] D. van der Spoel, P. J. van Maaren and H. J. C. Berendsen, A systematic study of water models

    for molecular simulation: Derivation of water models optimized for use with a reaction field. J.

    Chem. Phys. 108, 10220 (1998).

    [43] B. Hess, H. Bekker, H. J. C. Berendsen, J. G. E. M. Fraaije, LINCS: A Linear Constraint Solver

    for Molecular Simulations. J. Comp. Chem. 18, 1463 (1997).

    [44] T. Darden, D. York, L. Pedersen, Particle mesh Ewald: An N-log(N) method for Ewald sums in

    large systems. J. Chem. Phys. 98, 10089 (1993).

    [45] U. Essmann, L. Perera, M. L. Berkowitz, T. Darden, H. Lee, L. G. Pedersen, A smooth particle

    mesh Ewald method. J. Chem. Phys. 103, 8577 (1995).

    [46] K. Zimmerman, All purpose molecular mechanics simulator and energy minimizer. J. Comp.
    Chem. 12, 310 (1991).


    [47] M. Parrinello, A. Rahman, Polymorphic transitions in single crystals: A new molecular dynamics

    method. J. Appl. Phys. 52, 7182 (1981).

    [48] S. Nosé and M. L. Klein, Constant pressure molecular dynamics for molecular systems. Mol. Phys.
    50, 1055-1076 (1983).

    [49] S. Nosé, A unified formulation of the constant temperature molecular dynamics methods. J.
    Chem. Phys. 81, 511 (1984).

    [50] W. G. Hoover, Canonical dynamics: Equilibrium phase-space distributions. Phys. Rev. A 31,

    1695 (1985).

    [51] W. G. Hoover, Constant-pressure equations of motion. Phys. Rev. A, 34, 2499 (1986).

    [52] J. Juraszek and P. G. Bolhuis, Sampling the multiple folding mechanisms of Trp-cage in explicit

    solvent. Proc. Natl. Acad. Sci. 103, 15859 (2006).

    [53] Z. Guo and D. Thirumalai, Kinetics and Thermodynamics of Folding of a de novo Designed four

    Helix Bundle. J. Mol. Biol. 263, 323 (1996).

    [54] H. C. Andersen, Rattle: A velocity version of the shake algorithm for molecular dynamics

    calculations. J. Comput. Phys. 52, 24 (1983).

    [55] A. V. Smith and C. K. Hall, Protein refolding versus aggregation: computer simulations on an
    intermediate-resolution protein model. J. Mol. Biol. 312, 187-202 (2001).

    [56] V. Tozzini, Coarse-grained models for proteins. Curr. Opin. Struct. Biol., 2005, 15, 144-50.

    [57] H. M. König and A. F. M. Kilbinger, Learning from Nature: β-Sheet-Mimicking Copolymers Get
    Organized. Angew. Chem. Int. Ed. 46, 8334-8340 (2007).

    [58] Nomenclature and Symbolism for Amino Acids and Peptides.

    IUPAC-IUB Joint Commission on Biochemical Nomenclature. 1983.

    http://www.chem.qmul.ac.uk/iupac/AminoAcid/AA1n2.html. Retrieved on 2008-11-17.

    [59] I. W. Lyo and P. Avouris. Field-Induced Nanometer- to Atomic-Scale Manipulation of Silicon

    Surfaces with the STM. Science 253, 173 (1991).

    [60] Galo J. de A. A. Soler-Illia, Clément Sanchez, Bénédicte Lebeau, and Joël Patarin, Chemical
    Strategies To Design Textured Materials: from Microporous and Mesoporous Oxides to Nanonetworks
    and Hierarchical Structures. Chem. Rev. 102, 4093 (2002).

    [61] M. W. T. Werten, W. H. Wisselink, T. J. Jansen-van den Bosch, E. C. de Bruin and F. A. de Wolf,

    Secreted production of a custom-designed, highly hydrophilic gelatin in Pichia pastoris. Protein

    Engineering 14(6), 447 (2001).

    [62] P. J. Flory, Principles of Polymer Chemistry. Cornell University Press, Ithaca, New York (1953).

    [63] S. Park, F. Khalili-Araghi, E. Tajkhorshid and K. Schulten, Free energy calculation from steered
    molecular dynamics simulations using Jarzynski's equality. J. Chem. Phys. 119, 3559 (2003).

    [64] S. Park and K. Schulten, Calculating potentials of mean force from steered molecular dynamics
    simulations. J. Chem. Phys. 120, 5946 (2004).

