    A Priori Crystal Structure Prediction of Native Celluloses

    Remco J. Viëtor¹, Karim Mazeau², Miles Lakin², Serge Pérez¹,²

    ¹ Ingénierie Moléculaire, Institut National de la Recherche Agronomique, Rue de la Géraudière, BP 71627, 44316 Nantes Cédex, France

    ² Centre de Recherches sur les Macromolécules Végétales,* Centre National de la Recherche Scientifique, BP53, 38041 Grenoble Cédex, France

    Received 14 May 1999; accepted 29 March 2000

    Abstract: The packing of β-1,4-glucopyranose chains has been modeled to further elaborate the molecular structures of native cellulose microfibrils. A chain pairing procedure was implemented that evaluates the optimal interchain distance and energy for all possible settings of the two chains. Starting with a rigid model of an isolated chain, its interaction with a second chain was studied at various helix-axis translations and mutual rotational orientations while keeping the chains at van der Waals separation. For each setting, the sum of the van der Waals and hydrogen-bonding energy was calculated. No energy minimization was performed during the initial screening, but the energy and interchain distances were mapped to a three-dimensional grid, with evaluation of parallel settings of the cellulose chains. The emergence of several energy minima suggests that parallel chains of cellulose can be paired in a variety of stable orientations. A further analysis considered all possible parallel arrangements occurring between a cellulose chain pair and a further cellulose chain. Among all the low-energy three-chain models, only a few yield closely packed three-dimensional arrangements. From these, unit-cell dimensions as well as lattice symmetry were derived; interestingly, two of them correspond closely to the observed allomorphs of crystalline native cellulose. The most favorable structural models were then optimized using a minicrystal procedure in conjunction with the MM3 force field. The two best crystal lattice predictions were for a triclinic (P1) and a monoclinic (P2₁) arrangement with unit-cell dimensions a = 0.63, b = 0.69, c = 1.036 nm, α = 113.0°, β = 121.1°, γ = 76.0°, and a = 0.87, b = 0.75, c = 1.036 nm, γ = 94.1°, respectively. They correspond closely to the respective lattice symmetry and unit-cell dimensions that have been reported for the cellulose Iα and cellulose Iβ allomorphs. The suitability of the modeling protocol is endorsed by the agreement between the predicted and experimental unit-cell dimensions. The results provide pertinent information toward the construction of macromolecular models of microfibrils. © 2000 John Wiley & Sons, Inc. Biopoly 54: 342–354, 2000

    Keywords: β-1,4-glucopyranose chains; packing; molecular structure; native cellulose microfibrils; crystal structure prediction

    Correspondence to: Serge Pérez; email: [email protected]
    Contract grant sponsor: INRA, CARENET-2, and CNRS
    * Associated with University Joseph Fourier, Grenoble.

    Biopolymers, Vol. 54, 342–354 (2000) © 2000 John Wiley & Sons, Inc.

    INTRODUCTION

    Many recent advances in the theory and application of molecular modeling to the structural elucidation of carbohydrates and carbohydrate polymers have produced a wide range of useful results.¹⁻⁴ In combination with experimental methods, computer modeling has become an integral part of the strategy for revealing three-dimensional structures, both in solution and in the condensed phase. Nevertheless, the realm of carbohydrate modeling has tended to emphasize intramolecular rather than intermolecular aspects. When dealing with materials in condensed phases, the modeling technique can be combined with information derived from electron and fiber diffraction to enable quantitative solution of the three-dimensional crystalline structure.⁵ In addition to rationalizing why the observed crystalline arrangement is the preferred form, a further goal is the prediction of all stable three-dimensional organizations accessible to the polysaccharide in a given conformation. It would also be desirable to extend this predictive methodology to less ordered systems such as gels, where chain–chain interactions may occur to promote the formation of the so-called junction zones.

    These topics require the development of general rules for analyzing the stability of certain interhelix arrangements. Several authors have proposed methods for investigating the interhelix structure and energy through nonbonded forces.⁶⁻¹⁵ These procedures involve a minimization of the interhelix energy. In contrast, we have developed a method where the helices are positioned so as to allow contact but not interpenetration of the van der Waals surfaces of the two helices. After the helices are placed at the position of van der Waals contact for a given helix–helix rotation and translation, the energy is calculated. This procedure takes considerably less computer time than methods involving energy minimization, and has been successfully applied to synthetic polymers¹⁶ and to the polysaccharides chitin and starch.¹⁷ In the latter example, the structure predicted to be most stable corresponds to a duplex of parallel double helices, as found in both the crystalline A and B allomorphs.¹⁸,¹⁹ From these results, an explanation of the transition from the A to the B allomorph has been proposed.¹⁷

    Native cellulosic materials are organized into microfibrils in which crystalline domains coexist with amorphous zones. Little is known about the ultrastructures of the amorphous zones. Incidentally, the detailed crystal structure of native cellulose (cellulose I) is still a matter of debate despite more than 70 years of research effort. X-ray fiber diffraction experiments initially led to models based on two-chain or eight-chain unit cells depending upon the source of the sample,²⁰,²¹ with the eight-chain unit cell invoked to account for weak signals in the diffraction pattern. Later experiments using cross-polarization–magic angle spinning proton NMR indicated the presence of two allomorphs in samples of native cellulose, designated Iα and Iβ.²²⁻²⁴ Subsequent results from electron diffraction²⁵ verified the presence of these two allomorphs and provided data on crystal symmetry and unit-cell size for each allomorph. The Iα allomorph crystallizes in the triclinic P1 space group and contains one cellobiose unit per unit cell, with a parallel arrangement of the chains as would be expected. The Iβ form crystallizes in the monoclinic space group P2₁ with two cellobiose moieties per unit cell. Since the two allomorphs are found within the same microfibril, parallel packing for Iβ is the inescapable conclusion. The chain repeat length is found to be invariant at 1.036 nm.

    Building realistic macromolecular models of cellulosic microfibrils starting solely with information derived from the fiber repeat distance is still a difficult task. One needs to predict both crystalline allomorphic phases of cellulose together with less ordered regions, which could occur in the amorphous phase. This requires an exhaustive exploration of the low-energy three-dimensional arrangements of cellulose chains. The present work assesses the feasibility of the generation methodology. A very important aspect of the work is the adequacy of the proposed models. Therefore, prior to the generation of densely packed chains, one needs a proper validation of the method used; this can be achieved by a careful prediction of the two allomorphs of cellulose I. In addition to refining the methodology for predicting solid-state polymorphism, the study should yield the crystal structure of the two cellulose I allomorphs, and provide some insight into the nature of each form and into transitions between them.


    COMPUTATIONAL METHOD

    Nomenclature

    The atom coordinates for the glucose residue used in this study were taken from the MONOBANK database of monosaccharide structures.²⁶ The atom numbering and the angle definitions used are shown in Figure 1. The relative orientation of two glucose residues was determined by three angles: the glycosidic bond angle τ, defined by the atoms C1–O4–C4′, and the two torsion angles φ (O5–C1–O1–C4′) and ψ (C1–O1–C4′–C5′). In addition, the torsion angle ω (O5–C5–C6–O6) was used to describe the position of the primary hydroxyl group.

    Computer Generation of Cellulose Chains

    The PFOS program²⁷ was used to generate cellobiose units and to calculate the conformational energy and the helical parameters n (number of residues per turn) and h (advance along the helicoidal axis per residue) as a function of φ and ψ for the primary hydroxyl conformations gg (ω = −60°), gt (ω = 180°), and tg (ω = 60°; see Table I). The calculations used the 1.036 nm cellulose fiber repeat distance established by X-ray diffraction together with a glycosidic bond angle τ of 117.5°. A typical potential energy surface is represented in Figure 2.

    Strict helical symmetry of a macromolecular chain requires that equivalent chemical units occupy equivalent positions about the molecular axis. Such model chains were prepared with glycosidic torsion angles falling at minima on the potential energy surface and generating helices having n = 2 and h = 0.518 nm. This type of chain will be referred to as helical, with τ = 117.5°, φ = −90°, and ψ = −153°.

    Cellulose chains exhibiting a repeat distance of 1.036 nm can also be constructed by regular alternation between two sets of glycosidic torsion angles. Such chains, without internal symmetry, will be referred to as translational. They were obtained by setting one glycosidic linkage to one of the minimum energy conformations determined with PFOS, with the next linkage manipulated until translational symmetry with the required period of 1.036 nm was obtained between residues i and (i + 2). Relative chain energies for the conformations obtained in this way were estimated by averaging the energies determined with PFOS for the individual sets (φ, ψ). The chain having the lowest energy was selected for further study. Conformational parameters for the generated chains are collected in Table I. Coordinates of both cellulose chains are available upon request from the authors.

    FIGURE 1 Schematic representation of the cellobiose unit. The atoms belonging to the reducing residue are primed. The relative arrangement of the two glucose residues is determined by three angles: the glycosidic bond angle τ, defined by the atoms C1–O4–C4′, and the two torsion angles φ (O5–C1–O1–C4′) and ψ (C1–O1–C4′–C5′). In addition, the torsion angle ω (O5–C5–C6–O6) was used to describe the orientation of the primary hydroxyl group, which normally adopts one of the three staggered positions: gauche–gauche (gg), gauche–trans (gt), and trans–gauche (tg).

    Table I Conformation Parameters of Studied Chains (for ω = −60°, 60°, 180°)

    Chain            τ (°)    Φ (°)    Ψ (°)    c (nm)
    Helix            117.5     −90     −153     1.036
    Translational    117.5     −77     −141     1.036
                     117.5    −102     −164     1.036

    FIGURE 2 Rigid residue potential energy surface of cellobiose. Iso-energy contours are shown at 1 kcal/mol intervals relative to the global minimum. Superimposed upon these contours are those calculated for the helical parameters n = 2 and h = 0.518 nm. Where the cellulose chain adopts true 2₁ helical symmetry, (φ, ψ) must assume the values (−90°, −153°). This requirement may be relaxed if the glycosidic torsion angles alternate between (φ, ψ)₁ = (−77°, −141°) and (φ, ψ)₂ = (−102°, −164°).

    Prediction of the Stable Chain–Chain Arrangements

    The relative orientation of two commensurate chains (chain A and chain B), oriented either parallel or antiparallel, requires a set of four interhelical parameters: μA, a rotation of chain A about the helical axis from 0 to 360°; μB, a rotation of chain B about its axis from 0 to 360°; and Δx and Δz, which are taken as positive and represent positional shifts normal and parallel to the identity axis, respectively. Δz is bounded between 0 and t (fiber repeat). The spatial description of these parameters is shown in Figure 3.

    The minimum energy arrangement of the two polymer chains with respect to a displacement will tend to bring the molecules as close as possible without interpenetration of van der Waals radii. In reality, a small amount of repulsive energy resulting from interpenetration of some atom pairs can be compensated by additional attractions from the remaining atom pairs. However, nonbonded contact distances deviate by only 10% (0.02–0.03 nm) in molecular solids.²⁸ The contacting procedure²⁸ involves describing the surface of each chain by circumscribing a hard sphere of appropriate van der Waals radius R around each of the constituent atoms. Then, for a given orientation of the chains (as dictated by the rotation angles μA and μB and an increment along the chain axis Δz), the relative translation Δx is found that will bring the surfaces into a position where they are in contact, but without interpenetration. In general, the final position of the two initially penetrating polymer chains is characterized by the following conditions:

    1. For at least one atom pair (i, j), the ith atom of chain A is separated from the jth atom of chain B by the sum of the associated van der Waals radii (Rij = Ri + Rj). The atom pair i, j that satisfies these conditions is referred to as the determining contact.

    2. For all atom pairs in the two separate chains, there is no pair at a distance closer than the sum of their van der Waals radii (Dij ≥ Ri + Rj).

    3. Condition (2) is violated any time that an atom pair i, j is involved in hydrogen bonding, for which there is no a priori optimum value of the interatomic distance. This limiting case is treated by identifying all atom pairs potentially eligible for participation in interchain hydrogen bonding and omitting them from the procedure. Thus hydrogen bonding will not violate principle (1).

    Using this contacting procedure for chain–chain construction, the search space is reduced to three geometric variables. The resulting interchain interaction energy (EAB) can then be calculated to the required degree of approximation.
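The contacting conditions above can be illustrated with a minimal sketch. It is not the CHACHA implementation: the function name `contact_offset`, the `(x, y, z, radius)` tuple layout, and the `skip_pairs` argument are all assumptions made for illustration. Chain B is slid along +x, approached from large separation, and the offset at which the first pair of hard spheres touches is found analytically.

```python
import math

def contact_offset(chain_a, chain_b, skip_pairs=frozenset()):
    """Return (dx, (i, j)) for hard-sphere contact of two rigid chains.

    chain_a, chain_b: lists of (x, y, z, radius) tuples in nm (hypothetical layout).
    skip_pairs: (i, j) index pairs regarded as potential hydrogen bonds and
    therefore left out of the contact test, as in condition 3.
    dx is the shift of chain B along +x at which the first pair touches when
    B approaches from large dx; (i, j) is the determining contact.
    """
    best_dx, best_pair = None, None
    for i, (xa, ya, za, ra) in enumerate(chain_a):
        for j, (xb, yb, zb, rb) in enumerate(chain_b):
            if (i, j) in skip_pairs:
                continue
            r_sum = ra + rb
            perp2 = (yb - ya) ** 2 + (zb - za) ** 2
            if perp2 >= r_sum ** 2:
                continue  # these two spheres never touch while sliding along x
            # Shift at which |r_b + dx*e_x - r_a| equals r_sum (B on the +x side).
            dx = (xa - xb) + math.sqrt(r_sum ** 2 - perp2)
            if best_dx is None or dx > best_dx:
                best_dx, best_pair = dx, (i, j)
    return best_dx, best_pair
```

Taking the largest per-pair touching offset guarantees that every other pair is at or beyond its own contact distance, which is conditions 1 and 2 taken together. The rotations μA, μB and the shift Δz would be applied to the coordinates before this call.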

    For the simulation of cellulose, the interaction energy of the two chains was considered to be the sum of all pairwise atom–atom interactions and was calculated using a 6–12 potential function,²⁹,³⁰ with an additional term to cover the stabilization arising from the interchain hydrogen bonding. This was based on the distance between the oxygen atoms that can interact through hydrogen bonding (0.25–0.30 nm) without recourse to the hydroxyl hydrogens.
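A minimal sketch of such a pairwise energy sum follows. The text does not give the functional form of the hydrogen-bond term or any parameter values, so everything numeric here is an assumption: the well depth `epsilon`, the constant `hb_energy`, and the choice to replace (rather than add to) the 6–12 term for O···O pairs inside the 0.25–0.30 nm window are illustrative only, as is the simplification that every oxygen is treated as hydrogen-bond capable.

```python
import math

def lj_6_12(r, epsilon, r_min):
    """6-12 potential with minimum -epsilon at separation r_min."""
    ratio = r_min / r
    return epsilon * (ratio ** 12 - 2.0 * ratio ** 6)

def interchain_energy(chain_a, chain_b, epsilon=0.1, hb_energy=-2.0,
                      hb_range=(0.25, 0.30)):
    """Sum pairwise energies between two chains.

    Atoms are (element, x, y, z, r_vdw) tuples, distances in nm.
    epsilon and hb_energy are hypothetical values in kcal/mol.
    """
    total = 0.0
    for el_a, xa, ya, za, ra in chain_a:
        for el_b, xb, yb, zb, rb in chain_b:
            r = math.dist((xa, ya, za), (xb, yb, zb))
            if el_a == "O" and el_b == "O" and hb_range[0] <= r <= hb_range[1]:
                # Assumed: a flat stabilization based only on the O...O distance.
                total += hb_energy
            else:
                total += lj_6_12(r, epsilon, ra + rb)
    return total
```

Only the oxygen positions enter the hydrogen-bond test, which mirrors the statement that the hydroxyl hydrogens are not needed.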

    FIGURE 3 Interhelical parameters used to define the geometric orientation of the two parallel cellulose chains (A and B): chain rotations μA and μB, interchain contact distance Δx, and longitudinal offset Δz.

    The CHACHA program¹⁷ was used to map Δx and EAB as a function of the structural variables μA, μB, and Δz. The analysis was performed by incrementing μA and μB over the whole angular range by increments of a few degrees, and the relative translation (Δz) between the two chains was studied over the length of the whole fiber repeat, typically by increments of 0.01–0.05 nm (i.e., h/20). The rotations μA and μB were treated both as independent and as coupled (μA = μB + α). For each setting of the chains as a function of μA, μB, and Δz, the magnitude of the perpendicular offset Δx was derived according to the contact procedure described above. The value of the energy EAB was then computed. The mapping procedure was used to search for low-energy regions. In order to pinpoint the energy minima, regions containing a local minimum were searched a second time using intervals of 1° (for μ) and 0.005 nm (for Δz). This procedure provided a complete overview of the symmetry (or lack of symmetry) of the chain–chain interactions. The set of interhelical parameters relates to the lattice symmetry that characterizes the three-dimensional organization as follows:

    1. μA ≠ μB: Chain A and chain B are not related by any symmetry operation. They are independent and both form the asymmetric unit of the cell.

    2. μA = μB: Chain B is derived from chain A by a simple translational operation.

    3. μA = μB + 180° and Δz = 0: The two chains are parallel and related by a twofold operation. For a twofold screw axis operation, the conditions will be μA = μB + 180° and Δz = t/2.

    4. μA = −μB + 180°: The two chains are antiparallel and related by a twofold or by a twofold screw axis symmetry operation.
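The four cases listed above can be written as a small classifier. This is only a sketch: the function name `classify_pair`, the `antiparallel` flag, the tolerance, and the returned labels are assumptions, and angles are compared modulo 360°.

```python
def classify_pair(mu_a, mu_b, dz, repeat=1.036, antiparallel=False, tol=1e-6):
    """Classify the symmetry relation between chains A and B.

    mu_a, mu_b in degrees; dz and repeat (fiber repeat t) in nm.
    """
    def same_angle(x, y):
        # True when x and y differ by a multiple of 360 degrees.
        return abs((x - y + 180.0) % 360.0 - 180.0) <= tol

    if antiparallel:
        if same_angle(mu_a, -mu_b + 180.0):
            return "twofold or twofold screw axis (antiparallel)"
        return "independent"
    if same_angle(mu_a, mu_b):
        return "translation"
    if same_angle(mu_a, mu_b + 180.0):
        if abs(dz) <= tol:
            return "twofold axis"
        if abs(dz - repeat / 2.0) <= tol:
            return "twofold screw axis"
    return "independent"
```

For example, `classify_pair(210.0, 30.0, 0.518)` falls under case 3 with Δz = t/2 and is reported as a twofold screw axis.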

    Several protocols were used for the grid search. In the simplest case, the rotations of the two chains were coupled (μA = μB). For the helical chains, the internal 2₁ screw symmetry of the chains reduced the range to be searched to between 0 and 180°, and Δz between 0.0 and 0.518 nm. This reduction of the search space was not applicable to the translational chains.
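The coupled-rotation search over the reduced range can be sketched as a coarse scan followed by a refinement at 1° and 0.005 nm around the best coarse point. The energy function below is a smooth, hypothetical stand-in with a known minimum; it is not the CHACHA energy, and the coarse step sizes are assumed values.

```python
import math

def toy_energy(mu_deg, dz_nm):
    """Hypothetical stand-in for E_AB(mu, dz); minimum at mu = 77, dz = 0.10."""
    return (-math.cos(math.radians(mu_deg - 77.0))
            - math.cos(2.0 * math.pi * (dz_nm - 0.10) / 0.518))

def frange(start, stop, step):
    """Evenly spaced values from start up to (at most) stop."""
    count = int(math.floor((stop - start) / step + 1e-9))
    return [start + i * step for i in range(count + 1)]

def grid_min(energy, mus, dzs):
    """Lowest (energy, mu, dz) triple on the given grid."""
    return min((energy(m, z), m, z) for m in mus for z in dzs)

def coarse_then_fine(energy, mu_max=180.0, dz_max=0.518,
                     mu_step=5.0, dz_step=0.025):
    # Coarse scan over the symmetry-reduced range for coupled rotations.
    _, m, z = grid_min(energy, frange(0.0, mu_max, mu_step),
                       frange(0.0, dz_max, dz_step))
    # Refinement around the coarse minimum at 1 degree and 0.005 nm.
    # Near the range boundaries this window may extend slightly outside it.
    return grid_min(energy, frange(m - mu_step, m + mu_step, 1.0),
                    frange(z - dz_step, z + dz_step, 0.005))
```

In a real search, each energy evaluation would first determine Δx by the contact procedure for the given (μ, Δz) and only then compute the interaction energy.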

    In the study of the monoclinic arrangement, two cases need to be considered. In the first and most general case, the chain axes do not coincide with the 2₁ screw axes of the lattice. This implies that the chains have to be related through 2₁ symmetry, i.e., μA = μB + 180° and Δz = 0.518 nm (= c/2). In the alternative case, where the chains do coincide with the screw axis, two groups of chains can be distinguished, with no a priori relation between the chains of different groups; the relative orientations of two such chains are independent. The chains within a group are related by translation in the xy plane. For the helical chains, both of these situations are possible, the first being a subset of the second. For the translational chains, only the first situation is possible.

    Expansion to a Three-Dimensional Lattice

    Given two independent orientations of B chains with respect to the A chain, a three-dimensional lattice is described. The required parameters, μ1A, μ1B, Δx1, Δz1 and μ2A, μ2B, Δx2, Δz2, are illustrated in Figure 4. Such arrangements were determined by combining a further single chain with a chain pair from one of the stable conformations obtained in the previous step. The rotations of the single chain and the duplex were always coupled, since the symmetry of the systems under investigation does not allow more than two orientations of the cellulose chains. The vertical displacement Δz was free for triclinic lattices, but fixed at 0.00 nm for monoclinic lattices.

    The three-chain arrangements obtained in this way were screened to exclude those with a space-filling array of unit-cell volume per cellobiose outside the range of the experimental values, i.e., outside 0.3–0.4 nm³. The corresponding arrangements were checked for stability using a Simplex optimization. Finally, unit-cell parameters were determined for the refined arrangements that gave the smallest unit-cell volumes (i.e., the highest densities). The detailed procedure to calculate the unit-cell parameters has been described previously,¹⁶ and the set of equations to be used is given in the caption for Figure 4.
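The volume screen can be sketched from the Figure 4 relations for triclinic lattices. The sketch assumes that Eqs. (1) and (2) give the cell sides as square roots of the summed squares, and it uses only Eqs. (1), (2), (5), and (7); the function names and argument order are hypothetical.

```python
import math

def triclinic_cell_summary(dx1, dz1, dx2, dz2, mu1a_deg, mu2a_deg, c=1.036):
    """Return (a, b, gamma_star, volume) from two stable chain settings.

    Lengths in nm, angles in degrees; c is the fiber repeat.
    """
    a = math.hypot(dx1, dz1)                      # Eq. (1), assumed sqrt form
    b = math.hypot(dx2, dz2)                      # Eq. (2), assumed sqrt form
    gamma_star = 180.0 - mu1a_deg + mu2a_deg      # Eq. (5)
    volume = dx1 * dx2 * c * math.sin(math.radians(gamma_star))  # Eq. (7)
    return a, b, gamma_star, volume

def passes_volume_screen(volume, low=0.3, high=0.4):
    """Keep arrangements whose volume per cellobiose lies in 0.3-0.4 nm^3."""
    return low <= volume <= high
```

With illustrative inputs Δx1 = Δx2 = 0.6 nm and γ* = 90°, the volume per cellobiose is 0.6 × 0.6 × 1.036 ≈ 0.373 nm³, which lies inside the screening window.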

    Up to five unit cells per series were retained. The general procedure to predict the occurrence of all possible three-dimensional arrangements is summarized in Figure 5.

    Simplex Calculations

    The CHACHA chain–chain algorithm allows three-chain arrays to be considered stable even where two of the chains do not interact. Such arrays would lead to channel-like voids in the corresponding crystal lattice and were therefore excluded.

    FIGURE 4 Building up crystalline structures. The calculations for three chains gave the relative positions of the chains in stable triplet interactions. For triclinic lattices, the sides a and b, the unit-cell angles α, β, and γ, the projection of γ on the xy plane (γ*), and the lattice volume per cellobiose unit can be calculated from the parameters for the stable configurations using Eqs. (1)–(7). For the monoclinic lattices, Eqs. (8), (9), and (10) were used instead of (1), (2), and (6), respectively, to calculate a, b, and γ.

    $$a = \sqrt{\Delta x_1^2 + \Delta z_1^2} \tag{1}$$

    $$b = \sqrt{\Delta x_2^2 + \Delta z_2^2} \tag{2}$$

    $$\tan\alpha = \frac{\Delta z_2}{\Delta x_2} \tag{3}$$

    $$\tan\beta = \frac{\Delta z_1}{\Delta x_1} \tag{4}$$

    $$\gamma^* = 180^\circ - \mu_{1,A} + \mu_{2,A} \tag{5}$$

    $$\cos\gamma = \cos\alpha \times \cos\beta - \sin\alpha \times \sin\beta \times \cos\gamma^* \tag{6}$$

    $$V = \Delta x_1 \times \Delta x_2 \times c \sin\gamma^* \tag{7}$$

    $$a = \sqrt{\Delta x_2^2 + (2\,\Delta x_1)^2 - 4 \times \Delta x_1 \times \Delta x_2 \times \cos\gamma^*} \tag{8}$$

    $$b = \Delta x_2 \tag{9}$$

    $$\sin(180^\circ - \gamma) = \frac{2\,\Delta x_1}{a}\,\sin\gamma^* \tag{10}$$

    In order to confirm their viability, Simplex optimizations were performed on the CHACHA-derived three-chain arrangements. The interchain distances (Δx), the chain orientations (μ), the relative shifts (Δz), and the angle in the xy plane determined by the three-chain arrangement were varied (eight parameters). In order to retain the required translational symmetry for the triclinic cases, the rotation around the helical axis was held identical for the three chains. This gave 6 degrees of freedom in total. The Simplex was generated by multiplying the initial value of each variable in turn by 1.05 and using the resulting tuple as an additional vertex for the simplex, giving 7 vertices in total. Simplex optimization was continued until the minimum for the total interchain energy was reached. Both nonbonding and hydrogen-bonding interactions were taken into account for these energy calculations. Structures that showed large displacements or rotations were rejected.
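The simplex construction described here (each variable multiplied in turn by 1.05, giving N + 1 vertices for N variables) can be sketched as follows, together with a compact Nelder–Mead style loop. The loop is a simplified textbook variant with reflection, expansion, inside contraction, and shrink; the exact variant, stopping rule, and tolerances used by the authors are not stated, so these are assumptions.

```python
def initial_simplex(x0, factor=1.05):
    """Start vertex plus one vertex per variable, that variable scaled by factor.

    Note: a variable whose initial value is exactly 0 yields a degenerate vertex.
    """
    vertices = [list(x0)]
    for k in range(len(x0)):
        vertex = list(x0)
        vertex[k] *= factor
        vertices.append(vertex)
    return vertices

def nelder_mead(f, simplex, max_iter=2000, tol=1e-10):
    """Minimize f starting from the given simplex (simplified variant)."""
    pts = [list(p) for p in simplex]
    n = len(pts) - 1
    for _ in range(max_iter):
        pts.sort(key=f)
        best, worst = pts[0], pts[-1]
        if abs(f(worst) - f(best)) < tol:
            break
        centroid = [sum(p[k] for p in pts[:-1]) / n for k in range(n)]

        def along(t):
            # Point on the line through the centroid and the worst vertex.
            return [centroid[k] + t * (worst[k] - centroid[k]) for k in range(n)]

        reflected = along(-1.0)
        if f(reflected) < f(best):
            expanded = along(-2.0)
            pts[-1] = expanded if f(expanded) < f(reflected) else reflected
        elif f(reflected) < f(pts[-2]):
            pts[-1] = reflected
        else:
            contracted = along(0.5)
            if f(contracted) < f(worst):
                pts[-1] = contracted
            else:
                # Shrink every vertex halfway toward the best one.
                pts = [best] + [[(p[k] + best[k]) / 2.0 for k in range(n)]
                                for p in pts[1:]]
    return min(pts, key=f)
```

For the six degrees of freedom mentioned in the text, `initial_simplex` returns the seven vertices described; `f` would be the total interchain energy as a function of those six variables.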

    Minicrystal Generation and Energy Minimization

    The most favorable structural models (i.e., those having the lowest unit-cell volume) were optimized using a minicrystal procedure as described by French et al.³¹ Minicrystals consisting of 7 cellotetraose units were generated using the calculated chain orientations and lattice parameters, in such a way that a central cellobiose unit was completely surrounded. The resulting model was allowed to relax under the MM3 force field.³²,³³ The total steric energy calculated by MM3 includes intramolecular terms (bond stretching and bending, forming and deforming pyranose rings), as well as nonbonded forces that can apply to both intra- and intermolecular interactions. The more stable arrangement will have a lower total energy, regardless of whether the stability comes from a more stable isolated molecule or from a better intermolecular arrangement. Instead of energy contributions estimated from explicit atomic charges, the dipole–dipole energy is evaluated. This is very dependent on the dielectric constant, as is the hydrogen-bonding term. A value of 4 was used to mimic the effect of a crystalline environment on isolated molecules. After relaxation, deformation of the central cellobiose unit was measured by calculating the rms atom displacement of all nonhydrogen atoms and of the final values for τ, φ, and ψ. The orientations of the primary hydroxyl groups after relaxation were also determined. The lattice energy of the central disaccharide was estimated as the sum of all intermolecular interactions involving this disaccharide.
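The rms atom displacement of the nonhydrogen atoms can be computed as in the sketch below. It assumes that the coordinates before and after relaxation are expressed in the same frame (no superposition step is applied); the function name and the input layout are hypothetical.

```python
import math

def rms_displacement(before, after, elements):
    """Root-mean-square displacement over nonhydrogen atoms.

    before, after: lists of (x, y, z) coordinates in the same order and frame.
    elements: element symbols, used to leave hydrogens out of the average.
    """
    squared = [
        (p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2 + (p[2] - q[2]) ** 2
        for p, q, element in zip(before, after, elements)
        if element != "H"
    ]
    return math.sqrt(sum(squared) / len(squared))
```

A small value indicates that the central cellobiose unit kept the geometry assumed during the rigid-chain packing search.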

    RESULTS

    Chain Construction

    Among the (φ, ψ) combinations compatible with the 1.036 nm fiber repeat, only the one giving n = 2 and h = 0.518 nm falls within the fully allowed zone of the potential energy surface shown in Figure 2. If an ideal twofold helical structure is assumed for the cellulose chain, there exists only one set of glycosidic torsion angles that can generate the helical parameters. For such a helical chain, minimal glycosidic bond energy is obtained with φ = −90° and ψ = −153°. Further stabilization is possible through interresidue hydrogen bonds between O5 and O3′ (oxygen–oxygen distance, dO–O, approximately 0.25 nm) and, depending on the orientation of the primary hydroxyl group, between O6 (orientation gt) and O3′ (dO–O = 0.31 nm) or O2 and O6′ (orientation tg; dO–O = 0.28 nm). Consideration of the potential energy surface suggests that slight deviations from the ideal helical structure could not only be accommodated but also be energetically more stable. A chain not conforming to helical symmetry is characterized by a set of alternating torsional glycosidic angles (φ, ψ)₁ and (φ, ψ)₂, and is termed a translational chain. Minimum energy for such a chain was found for the combination of (φ, ψ)₁ = (−77°, −141°) and (φ, ψ)₂ = (−102°, −164°).

    Packing of Helical Cellulose Chains

    FIGURE 5 Flow chart showing the general procedure used to investigate all possible three-dimensional parallel and antiparallel arrangements of cellulose chains.

    Two-chain Arrangements. Both coupled chain rotations and independent rotations were considered, and a number of stable arrangements were found. As a first step, all possible arrangements occurring between two parallel cellulose chains were examined. This was performed by rotating μA and μB over the whole angular range from 0 to 360°, and the relative displacement of the two chains was investigated over the whole length of the fiber repeat and with the different orientations of the primary hydroxyl groups. For each setting of the chains as a function of μA, μB, and Δz, the magnitude of the offset perpendicular to the chain axis, Δx, was computed according to the contact procedure. The value of the energy corresponding to each set of chain orientations was evaluated as a function of the set of four interhelical parameters. The search for energy minima was performed within the three-dimensional (μA, μB, Δz) space. Seventeen energy minima were found within an energy window of 2 kcal/mol/cellobiose. The results indicate that the significant energy minima occurred for values of μA lying in the vicinity of μB. This suggests that appropriate packing of neighboring helical chains can be achieved with operations of simple translation.
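Selecting the minima that fall inside an energy window can be sketched on a two-dimensional periodic grid (for example μ against Δz with coupled rotations). The strict eight-neighbor test and the function names are assumptions; the actual search was carried out in the three-dimensional (μA, μB, Δz) space.

```python
def local_minima(grid):
    """Return sorted (energy, i, j) for strict local minima of a periodic 2D grid."""
    rows, cols = len(grid), len(grid[0])
    found = []
    for i in range(rows):
        for j in range(cols):
            e = grid[i][j]
            neighbours = [
                grid[(i + di) % rows][(j + dj) % cols]
                for di in (-1, 0, 1) for dj in (-1, 0, 1)
                if (di, dj) != (0, 0)
            ]
            if all(e < value for value in neighbours):
                found.append((e, i, j))
    return sorted(found)

def within_window(minima, window=2.0):
    """Keep minima no more than `window` (kcal/mol) above the lowest one."""
    if not minima:
        return []
    lowest = minima[0][0]
    return [m for m in minima if m[0] - lowest <= window]
```

The periodic indexing reflects that both the rotation and the shift along the fiber repeat wrap around.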

    FIGURE 6 (a) Interchain potential energy surface as a function of coupled variation of μA and μB with the perpendicular offset Δx. Contours are drawn at intervals of 5 kcal/mol/cellobiose; (b) Interchain potential energy map at the optimum perpendicular offset Δx, as a function of the translation Δz along the chain direction and coupled rotations of μA and μB. Contours are drawn at 5 kcal/mol/cellobiose intervals.

    The following steps were conducted assuming μA = μB. This allowed for a straightforward two-dimensional study. The contour maps calculated as a function of the translation Δz along the fiber axis and the coupled rotation angles μA = μB are shown in Figure 6. Figure 6a is a representation of interchain energy in relation to coupled variations of μA and μB, with the perpendicular offset Δx. Figure 6b shows the interchain potential energy map at the optimum perpendicular offset Δx, as a function of the translation Δz along the chain direction and coupled rotations of μA and μB. In this example, all primary hydroxyl groups of the cellulose chain are tg. This map exhibits an obvious symmetry; four equivalent orientations (μ, Δz), (μ + 180°, Δz), (μ + 180°, c − Δz), and (μ, c − Δz) are observed due to the helical nature of the cellulose chain used as the model. Consequently, only one section needs to be described. The most stable area of the map, as delineated by the first contour, covers a 20° angular range in μ and a 0.15 nm range in Δx. This indicates that there are multiple interchain arrangements and that librational motion within those limits could occur. Domains are found that correspond to a somewhat restricted range of μA. Vertical displacements are 0.00 or 0.279 nm (slightly more than h/2 = 0.259 nm); Δx appears to be mainly dependent on μA, and much less on Δz. Within these domains, 17 energy minima were found within an energy window of 2 kcal/mol/cellobiose. Characteristics of the most energetically stable arrangements are given in Table II. These arrangements have the chains stacked with ring surfaces touching.
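The four equivalent orientations stated for the helical chain can be enumerated directly, which is one way to check that two grid points describe the same arrangement. The function name is hypothetical, and the wrapping of angles to [0, 360) and of shifts to [0, c) is an assumed convention.

```python
def equivalent_settings(mu_deg, dz, c=1.036):
    """Return the four symmetry-equivalent (mu, dz) settings for a helical chain.

    mu in degrees, dz and the fiber repeat c in nm.
    """
    mu0 = mu_deg % 360.0
    mu1 = (mu_deg + 180.0) % 360.0
    dz0 = dz % c
    dz1 = (c - dz) % c
    return [(mu0, dz0), (mu1, dz0), (mu1, dz1), (mu0, dz1)]
```

For μ = 77° and Δz = 0.2 nm, this gives (77°, 0.2), (257°, 0.2), (257°, 0.836), and (77°, 0.836), so only one of the four map sections has to be searched.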

    Three-chain Arrangements. Three-chain arrangements were calculated starting from each of the stable two-chain arrangements, by combining the chain duplex with a third identical chain. Only stable arrangements with low energy were retained. In most cases, the global minimum for the three-chain searches gave arrangements with the three chains in one plane. Since such arrangements do not correspond to space-filling lattices, they were excluded. As an example, Figure 7 shows the energy and lateral displacement as functions of μ and Δz using a single chain combined with a chain duplex. The presence of a pair of chains instead of a single chain removes a number of equivalent positions that were visible in Figure 6. Here again, the most stable areas extend in both directions, suggesting that there are many chain arrangements with comparable energy.

    With the freely oriented chains, differences in chain orientation (μA − μB) of more than ca. 10° caused the appearance of channels in the corresponding lattice, thereby decreasing the chain–chain interaction and the lattice density. Despite their importance in the description of the amorphous structures, such arrangements were not suitable for generating three-dimensional crystalline arrangements. They were therefore discarded.

    Packing of Translational Cellulose Chains

    Two-chain Arrangements. Stable two-chain and three-chain arrangements were determined and selected as described for the helical chains. Since the translational chains do not contain an internal 2₁ axis, only coupled rotations needed to be considered. Hydrogen bonding was not taken into account at this stage. For the stable configurations the energies found were higher than for the helical chains. Also, the distance between the chains was larger, and the number of chain-chain contacts lower.

    Table II  Optimum Values for the Chain-Chain Interactions for Coupled Rotations of Helical Cellulose Chains

    ω    μA (°)   Δz (nm)   Δx (nm)   E(vdW)           E(HB)            E(Tot)           Contacts
                                      (kcal/mol/dis)   (kcal/mol/dis)   (kcal/mol/dis)
    gg    −77     0.00      0.481     −7.24             0               −7.24            154
    gg    −77     0.520     0.488     −6.42             0               −6.42            145
    gg    −85     0.226     0.501     −5.50             0               −5.50            132
    gg    −86     0.219     0.503     −5.47             0               −5.47            131
    tg    −77     0.00      0.482     −7.03             0               −7.03            159
    tg    −60     0.274     0.468     −6.17             0               −6.17            150
    tg    −96     0.324     0.544     −6.89             0               −5.89            135
    tg    −58     0.000     0.477     −5.60             0               −5.60            120
    tg    −88     0.270     0.522     −5.18             0               −5.18            135
    tg    −74     0.520     0.494     −5.07             0               −5.07            127
    tg    −50     0.179     0.487     −4.97             0               −4.97            130
    tg   −111     0.000     0.635     −4.91             0               −4.91             94
    tg     −9     0.269     0.642     −4.57             0               −4.57            100
    gt   −126     0.279     0.669     −5.61            −1.95            −7.56            101
    gt   −169     0.000     0.809     −3.35            −3.95            −7.30             62
    gt    −77     0.000     0.480     −6.11             0               −6.11            140
    gt   −101     0.219     0.550     −5.68             0               −5.68            139
    gt    −61     0.279     0.468     −5.65             0               −5.65            138
    gt    −77     0.520     0.488     −5.39             0               −5.39            138
    gt    −58     0.00      0.477     −5.13             0               −5.13            114
    gt    −88     0.209     0.508     −5.08             0               −5.08            119
    gt     −9     0.264     0.646     −4.76             0               −4.76            100
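The selection of low-energy pairings from a table such as Table II can be sketched in a few lines of Python. This is only an illustration, not the authors' program: the helper names are hypothetical, only a handful of the helical-chain entries are used, and the energies are taken as negative (attractive) values in kcal/mol per disaccharide.

```python
# (omega, mu_A in degrees, E_vdW, E_HB, E_tot); a small subset of the
# helical-chain entries of Table II (the dz = 0.00 rows where duplicates exist).
TABLE_II_SUBSET = [
    ("gg", -77, -7.24, 0.0, -7.24),
    ("tg", -77, -7.03, 0.0, -7.03),
    ("tg", -9, -4.57, 0.0, -4.57),
    ("gt", -126, -5.61, -1.95, -7.56),
    ("gt", -169, -3.35, -3.95, -7.30),
]


def totals_consistent(rows, tol=0.005):
    """Check that E(Tot) equals E(vdW) + E(HB) for every row, within rounding."""
    return all(abs(vdw + hb - tot) <= tol for _, _, vdw, hb, tot in rows)


def within_window(rows, window=2.0):
    """Keep rows whose total energy lies within `window` of the lowest one."""
    best = min(row[4] for row in rows)
    kept = [row for row in rows if row[4] - best <= window]
    return sorted(kept, key=lambda row: row[4])


if __name__ == "__main__":
    print("totals consistent:", totals_consistent(TABLE_II_SUBSET))
    for omega, mu_a, _, _, e_tot in within_window(TABLE_II_SUBSET):
        print(f"{omega}  mu_A = {mu_a:5d} deg  E_tot = {e_tot:6.2f}")
```

Only the two gt entries carry a hydrogen-bond term; in this subset they give the lowest totals even though their van der Waals energies are the least favorable.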

    Three-chain Arrangements. No stable three-chain arrangement resulting in a viable space-filling lattice compatible with a monoclinic lattice could be determined. Several arrangements compatible with a triclinic lattice were found, though these resulted in rather large unit-cell volumes compared to those obtained for the helical chains.

    FIGURE 7 (a) Chain : chain-pair potential energy surface as a function of chain rotation μ and the perpendicular offset Δx. Contours are drawn at 5 kcal/mol/cellobiose intervals above the global minimum; (b) chain : chain-pair potential energy map at the optimum perpendicular offset Δx, as a function of the translation Δz along the chain direction and the coupled chain rotations μ. Contours are drawn at 5 kcal/mol/cellobiose intervals above the global minimum.

    Final Selection

    Unit-cell parameters were calculated for the selected three-chain arrangements obtained, allowing a space-filling lattice to be generated. The structures resulting in the lowest cell volumes were retained. This gave two models for the triclinic phase: one with helical chains and one with translational chains. The triclinic model with helical chains was considered to be the preferred candidate in view of its smaller unit-cell volume. Only a model with helical chains could be retained for the monoclinic phase; the translational chains did not give acceptable three-chain configurations.

    The cellulose chains of both models are arranged in sheets that can be stabilized by hydrogen bonding. For the triclinic model, hydrogen bonding between sheets is not possible, whereas in the monoclinic model one hydrogen bond per cellobiose unit can participate in an intersheet link.

    Description of the Selected Models

    Two models were selected as representative of the crystalline structure of native cellulose. For both models, the cellulose chains were taken as parallel to the c axis of the unit cell.

    Triclinic Model. The triclinic unit cell, space group P1, contains one cellobiose unit and is shown in Figure 8. Unit-cell parameters are a = 0.63, b = 0.69, c = 1.036 nm, α = 113.0°, β = 121.1°, γ = 76.0°. As the chains are related by translational symmetry only, all have the same orientation about the chain axis. The chains of this structure are organised into sheets oriented along the (1, −1, 0) plane. Within the sheets, the relative positions of the chains are constant, with a distance between helix axes of 0.82 nm and a displacement of 0.055 nm along the chain axis. These sheets are stabilized by an interchain hydrogen bond between the OH6 of one chain and the OH3 of the closest neighbor within the sheet (dO···O = 0.30 nm). The distance between the sheets is 0.433 nm, with a displacement of 0.279 nm along the chain axis. No hydrogen bonding was evident between chains located in different sheets. An intrachain hydrogen bond can be deduced between the tg O6 and O2 in the next ring along the chain (dO···O = 0.27 nm).

    FIGURE 8 The triclinic model of cellulose I, viewed (a) along the chain axis and (b) perpendicular to the chain axis.

    Monoclinic Model. The predicted monoclinic arrangement corresponds to space group P2₁ and contains two cellobiose units, as shown in Figure 9. The unit-cell parameters are a = 0.87, b = 0.75, c = 1.036 nm, γ = 94.1°. The chain axes coincide with symmetry axes of the unit cell, with one chain placed at the corner of the unit cell and the other at the center. As in the triclinic lattice, the chains are organized into sheets. In the present model, these sheets are arranged parallel to the bc (1, 0, 0) plane with a distance of 0.435 nm between the sheets. Two types of sheets can be distinguished based on a 5° difference in the orientation of the constituent cellulose chains. Interchain hydrogen bonds are possible between O6 of a selected chain and the O2 of the closest neighboring chain (dO···O = 0.25-0.28 nm, depending on the sheet) and not O3 as in the triclinic case. A further hydrogen bond may occur between the O6 of a chain in one sheet and the O4 of the closest chain in a neighboring sheet (dO···O = 0.25-0.32 nm). O6 of each unit exhibits a gt orientation and lies at 0.31 nm from O3 of the next pyranose ring along the chain.

    The difference between our model and some of those suggested for the monoclinic phase lies in the orientation of the O6 primary hydroxyl group. A tg conformation has been proposed on the basis of NMR measurements of the C6 chemical shift, using an empirical correlation between chemical shifts and the conformation of the primary hydroxyl groups. In that conformation, intersheet hydrogen bonds cannot be formed. Our calculations do reveal some possible monoclinic arrangements for cellulose chains having their primary hydroxyl groups in the tg orientation. However, the corresponding cell dimensions differ substantially from those derived experimentally. Besides, the energy of such arrangements is slightly higher than that of our selected model.
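Since the models were ranked by unit-cell volume, it can help to see how the quoted cell parameters translate into volumes. The sketch below applies the standard formula for the volume of a general (triclinic) cell, V = abc·sqrt(1 − cos²α − cos²β − cos²γ + 2 cosα cosβ cosγ), to the parameters given for the two selected models. The function name is hypothetical, and the monoclinic angles α and β are assumed to be 90°, as implied by a cell with unique axis c.

```python
import math


def cell_volume(a, b, c, alpha_deg=90.0, beta_deg=90.0, gamma_deg=90.0):
    """Volume of a unit cell from its edge lengths (nm) and angles (degrees)."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha_deg, beta_deg, gamma_deg))
    return a * b * c * math.sqrt(1.0 - ca * ca - cb * cb - cg * cg + 2.0 * ca * cb * cg)


# Cell parameters as quoted in the text for the two selected models.
triclinic = cell_volume(0.63, 0.69, 1.036, 113.0, 121.1, 76.0)  # one cellobiose per cell
monoclinic = cell_volume(0.87, 0.75, 1.036, gamma_deg=94.1)     # two cellobiose per cell

if __name__ == "__main__":
    print(f"triclinic : {triclinic:.3f} nm^3 per cellobiose")
    print(f"monoclinic: {monoclinic / 2:.3f} nm^3 per cellobiose")
```

With these rounded parameters, both cells come out at roughly 0.34 to 0.35 nm³ per cellobiose unit, in line with the similar packing density of the two models.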

    A monoclinic model has been proposed recently (ref 34). It is based on the refinement of two independent sets of x-ray fiber diffraction data. Comparison of the proposed models with ours shows very satisfactory agreement. The unit-cell parameters and the space-group symmetry, along with the chain conformation, are the same except for the conformation of the hydroxymethyl group. A small difference lies in the relative orientation of the two chains. The variation of the cylindrical polar angle between the two chains has been reported to be either null or slightly negative (−3.5°), whereas the magnitude of this angle is 5° in our predicted three-dimensional structure. However, it was shown that variations of this angle in the range −10° to 10° have only a minor influence on the calculated agreement factor.

    Not surprisingly, the calculated cell dimensions exhibit some discrepancies with respect to those that have been determined experimentally. In the case of the monoclinic allomorph, the largest deviations amount to 0.03 nm and to 2.2°. When compared to the average experimental data, the maximum deviations in the unit-cell dimensions amount to 6% in lengths and 7% in angles for the triclinic model, and 8% in lengths and only 3% in angles for the monoclinic model. Those variations reflect the occurrence of some flexibility in the interchain arrangements, as revealed in the present study. They also indicate that while our modeling protocol provides satisfactory models, there is still room for improvement in the methodology. In particular, taking into account the possible conformational changes that the hydroxymethyl pendant group may undergo, along with possible variations in the puckering parameters of the pyranose rings, might improve the end results in terms of the cell dimensions. It should nevertheless be pointed out that our simulation protocol yields predictions of both the space-group symmetry and the unit-cell dimensions without any constraint other than the polysaccharide fiber repeat. It is therefore remarkable to reach such agreement for the dimensions of the unit cells.

    Lattice Energy Calculations

    As a final refinement, seven-chain minicrystals were generated based on the unit cells obtained that had the lowest cell volume. The conformational energy was determined after minimisation with MM3. Lattice energy was determined as the sum of all intermolecular energy contributions involving the central cellobiose unit of the minicrystal.

    The results of these calculations are shown in Table III. The structures based on helical chains showed only minimal deformation after minicrystal optimization. The structure based on a translational chain, on the other hand, was severely deformed toward a symmetrical helical chain. As a result of this deformation, the lattice energy could not be usefully estimated.

    FIGURE 9 The monoclinic model of cellulose I, viewed (a) along the chain axis and (b) perpendicular to the chain axis.

    Allomorphic Transitions

    Comparison of the triclinic and monoclinic models indicates that the interchain distances are quite similar. Superposition of the two structures suggests that a transition between them would not give rise to gross crystal deformations.

    A transition from the triclinic to the monoclinic form would require the following longitudinal shifts and small rotations of some cellulosic chains:

    1. Reorient the conformation of OH-6 from tg to gt.

    2. Rotate the chains in every second sheet by 5° about c.

    3. Vertically shift every fourth sheet by 0.518 nm, i.e., c/2, or rotate the chains in this sheet by 180° (these changes are equivalent due to the twofold helical axis).

    4. As a consequence of these reorientations, a vertical shift of the third layer will result.
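Step 3 relies on the fact that, for a chain with a twofold screw axis, a shift of c/2 along the axis and a 180° rotation about it produce the same set of atomic positions. The toy check below illustrates this with two arbitrary points and their screw-axis images; the coordinates and helper names are hypothetical and carry no structural meaning.

```python
C_NM = 1.036  # chain repeat c (nm), as quoted in the text


def two_fold_helix(atoms, c=C_NM):
    """Build a toy chain with a 2_1 screw axis along z: each atom plus its image,
    rotated by 180 degrees about z and shifted by c/2."""
    images = [(-x, -y, (z + c / 2.0) % c) for x, y, z in atoms]
    return [(x, y, z % c) for x, y, z in atoms] + images


def shift_half_repeat(atoms, c=C_NM):
    """Translate all atoms by c/2 along the chain axis (periodic in c)."""
    return [(x, y, (z + c / 2.0) % c) for x, y, z in atoms]


def rotate_180(atoms):
    """Rotate all atoms by 180 degrees about the chain (z) axis."""
    return [(-x, -y, z) for x, y, z in atoms]


def same_positions(a, b, c=C_NM, tol=1e-9):
    """True if both lists hold the same points, comparing z modulo the repeat c."""
    def close(p, q):
        dz = abs(p[2] - q[2]) % c
        dz = min(dz, c - dz)
        return abs(p[0] - q[0]) < tol and abs(p[1] - q[1]) < tol and dz < tol
    return len(a) == len(b) and all(any(close(p, q) for q in b) for p in a)


if __name__ == "__main__":
    chain = two_fold_helix([(0.25, 0.10, 0.10), (0.05, -0.30, 0.40)])
    print(same_positions(shift_half_repeat(chain), rotate_180(chain)))
```

The equivalence holds only for the helical (2₁) chain; a translational chain without the screw axis would not map onto itself in this way.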

    Given the absence of hydrogen bonding between the sheets in the triclinic model, the vertical shifts along the c axis should be relatively accessible at elevated temperature. The formation of intersheet hydrogen bonds would then stabilize the structure by fixing the chains in place. No lateral shifts would be required for the transition, leaving the layers intact.

    CONCLUSION

    The present work has established a computational procedure to predict the different ways in which a polysaccharide chain of known conformation is able to interact with other chain-like molecules. The procedure has been applied to cellulose, for which stable parallel chain pairings have been generated.

    Few of these arrangements are capable of generating an efficiently packed three-dimensional array, but the others may be pertinent to situations such as the amorphous state or the surface of cellulose crystalline domains.

    Structures with parallel chains were used for comparison with experimentally derived data. They should provide sound starting models for refinement against observed structure factors derived from x-ray or electron diffraction, but should not be considered as complete descriptions of the two allomorphs of native cellulose. Agreement between the predicted unit-cell dimensions and the published dimensions has provided some degree of validation of the methodology.

    The two most favorable predicted crystalline arrangements correspond to a triclinic lattice, space group P1, and to a monoclinic form, space group P2₁. These structures correspond closely to those which have been reported for cellulose Iα and Iβ, respectively. The cellulose chains in the selected models form layers, stabilized by interchain hydrogen bonds. Stacking of the layers gives rise to the complete crystal lattice. Layer stacking in the triclinic model is stabilized only by van der Waals interactions. For the monoclinic model, the layers are linked through two interplane hydrogen bonds per cellobiose unit, one to each neighbouring layer.

    The present algorithm is limited by its two-stage determination of stable three-chain configurations, and by its acceptance of minimal interchain distances where all three chains are not in contact. A new algorithm that directly determines all stable three-chain interactions is being developed in our laboratory.

    The authors gratefully acknowledge financial support from INRA to RJV. The work was also conducted within the framework of CARENET-2, a European-funded network within the Training and Mobility of Researchers 1994-1998 (MTL). The provision of financial support by INRA and CNRS is acknowledged.

    Table III  Chain Conformation Parameters after MM3 Seven-Chain Minicrystal Minimization (see Table I for Original Values)

    Lattice           τ (°)   Φ (°)   Ψ (°)   ω (°)   ω' (°)   c (nm)   Lattice Energy (kJ/mol)   Change(a) (nm)
    Triclinic
      Helix           116     −92     −149    170     169      1.036    −20.0                     0.0114
      Translational   115.3   −85     −136                     1.036    ND(b)                     0.0562
    Monoclinic
      Helix chain A   116.1   −91     −157    63      61       1.036    −15.7                     0.0136
      Helix chain B   115.2   −92     −149    66      66       1.036    −18.0                     0.0095

    (a) Root mean square displacement of the non-hydrogen atoms of the central cellobiose unit (except O-6 and O-6') after fitting to the original conformation.

    (b) Lattice energy could not be determined due to the large lattice deformation.


    REFERENCES

    1. Perez, S.; Kouwijzer, M. L. C. E.; Mazeau, K.; Engelsen, S. B. E. J Mol Graphics 1997, 14, 307-321.
    2. O'Sullivan, A. C. Cellulose 1997, 4, 173-207.
    3. Kroon-Batenburg, L. M. J.; Kroon, J. Glycoconjugate J 1997, 14, 677-690.
    4. Kroon-Batenburg, L. M. J.; Bouma, B.; Kroon, J. Macromolecules 1996, 29, 5695-5699.
    5. Perez, S. Methods Enzymol 1991, 203, 510-556.
    6. Aabloo, A.; French, A. D. Macromol Theory Simul 1994, 3, 185-191.
    7. Aabloo, A.; French, A. D.; Mikelssar, R. H.; Perstin, A. Cellulose 1994, 1, 161-168.
    8. Cousins, S. K.; Malcom Brown, R., Jr. Polymer 1995, 36, 3885-3888.
    9. Heiner, A. P.; Sugiyama, J.; Telleman, O. Carbohydr Res 1995, 273, 207-223.
    10. Hopfinger, A. K. Biopolymers 1971, 10, 1299-1315.
    11. Hopfinger, A. J.; Walron, A. G. J Macromol Sci Phys 1969, B3, 195-208.
    12. Hopfinger, A. J.; Walron, A. G. J Macromol Sci Phys 1970, B4, 185-199.
    13. Marhofer, R. J.; Relling, S.; Brickman, J. Ber Bunsenges Phys Chem 1996, 100, 1350-1354.
    14. Tai, K.; Kobayashi, M.; Tadokoro, H. J Polym Sci Polym Phys Eds 1976, 14, 783-797.
    15. Woodcock, C.; Sarko, A. Macromolecules 1980, 13, 1183.
    16. Perez, S. In Electron Crystallography of Organic Molecules; Fryer, J.; Dorset, D. L., Eds.; NATO ASI Series; Kluwer Academic: New York, 1990; pp 33-53.
    17. Perez, S.; Imberty, A.; Scaringe, R. P. In Computer Modeling of Carbohydrate Molecules; French, A. D.; Brady, J. W., Eds.; ACS Symposium Series; American Chemical Society: Washington, DC, 1990; pp 281-299.
    18. Imberty, A.; Chanzy, H.; Perez, S.; Buleon, A.; Tran, V. J Mol Biol 1988, 201, 365-378.
    19. Imberty, A.; Perez, S. Biopolymers 1988, 27, 1205-1221.
    20. Gardner, K. H.; Blackwell, J. Biopolymers 1974, 13, 1975-2001.
    21. Sarko, A.; Muggli, R. Macromolecules 1974, 7, 486-494.
    22. Atalla, R. H.; VanderHart, D. L. Science 1984, 223, 283.
    23. VanderHart, D. L.; Atalla, R. H. Macromolecules 1984, 17, 1465-1472.
    24. VanderHart, D. L.; Atalla, R. H. In The Structure of Cellulose; ACS Symposium Series 1987; American Chemical Society: Washington, DC, 1987; pp 88-118.
    25. Sugiyama, J.; Vuong, R.; Chanzy, H. Macromolecules 1991, 24, 4168-4175.
    26. Perez, S.; Delage, M. M. Carbohydr Res 1992, 212, 253-259.
    27. Tvaroska, I.; Perez, S. Carbohydr Res 1986, 149, 389-410.
    28. Scaringe, R. P.; Perez, S. J Phys Chem 1987, 91, 2394-2403.
    29. Chou, K. C.; Nemethy, G.; Scheraga, H. A. J Phys Chem 1983, 87, 2869-2881.
    30. Chou, K. C.; Nemethy, G.; Scheraga, H. A. J Am Chem Soc 1984, 106, 3161-3170.
    31. French, A. D.; Miller, D. P.; Aabloo, A. Int J Biol Macromol 1993, 15, 30-36.
    32. Allinger, N. L.; Yuh, Y. H.; Lii, J.-H. J Am Chem Soc 1989, 111, 8551-8566.
    33. Allinger, N. L.; Rahman, M.; Lii, J.-H. J Am Chem Soc 1990, 112, 8293-8307.
    34. Finkenstadt, V. L.; Millane, R. P. Macromolecules 1998, 31, 7776-7783.
