2012
MolBio PhD Programme / GGNB Course A57 2012
Macromolecular Structure Determination
Part II: Space Groups, Data Integration, and Phasing
Tim Grüne
University of Göttingen
Dept. of Structural Chemistry
http://[email protected]
Tim Grüne Macromolecular Structure Determination 1/90
So Far, So Good . . .
Crystals produce a “regular” pattern of spots, the diffraction pattern, when held in an X-ray beam.
With some effort (a) these spots can be turned into a beautiful model of the molecule inside the crystal.
The first step is data integration, i.e. the determination of the spot locations (which correspond to the unit cell parameters by means of the Laue conditions) and of their intensities.
(a) How: that is what this lecture is all about.
Symmetry and Space Groups
Crystallography and Symmetry
Steve Lower, http://www.chem1.com/acad/webtext/virtualtextbook.html
Historically, crystallographers described the appearance of minerals and their regularities. E.g. Nicolaus Steno formulated the law of constant angles in 1669, long before the advent of X-rays.
1801: René-Just Haüy describes the symmetry of crystals (after group theory had been developed).
1850: Auguste Bravais describes the 14 different Bravais lattices.
1890/1891: Arthur Moritz Schönflies and Evgraf Stepanovich Fedorov derive the 230 possible space groups.
1912: Max von Laue, Walter Friedrich, and Paul Knipping carry out the first diffraction experiment and show the wave nature of X-rays and the lattice structure of crystals.
The Use of Symmetry
Historically it was certainly a matter of curiosity to realise that crystals obey certain rules of repetition and regularity (that’s what symmetry is about).
In principle one could solve a structure without taking symmetry into account.
There are two important advantages of taking symmetry into account:
1. Improvement of data quality by increasing the accuracy of the measurement.
2. Reduction of work. E.g. ignoring a 4-fold symmetry one would have to refine four molecules which are basically identical.
The aim of this section: understanding both aspects.
Symmetry in Nature
Symmetry is a mathematical concept with its origin in nature:
Butterflies: mirror plane
Flower: 5-fold rotational symmetry
Symmetry is often associated with beauty.
Symmetry in Molecules
Benzene: 6-fold rotational symmetry, mirror planes
Single macromolecules (protein, DNA, RNA) are never symmetric.
Symmetric Arrangements
Any object, symmetric or not, can be arranged in a symmetric way.
Five ribosomes arranged with a 5-fold rotation axis.
Note that the ribosome cannot be arranged to have a mirror plane, because it consists of chiral compounds.
Symmetry Operations
Loosely speaking, a symmetry operation is a movement that leaves (at least the appearance of) the object unchanged.
There are three basic types of symmetry operations:
rotation, mirror plane, inversion
We speak of an n-fold symmetry (axis) when the movement is a rotation by $360^\circ/n$. E.g., the angle between one ribosome and the next on the previous slide is $360^\circ/5 = 72^\circ$.
Combination of Symmetry
One can combine symmetry operations. This often generates additional symmetries:
• Mirror the butterfly to create a second one.
• Rotate both butterflies by 180°: now there are four butterflies.
• The whole composition contains a new mirror plane, generated by the combination of the first mirror plane and the rotation.
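The butterfly example can be checked with a little matrix arithmetic. The sketch below is only an illustration and assumes a two-dimensional picture with the first mirror plane at x = 0; the names `MIRROR_X`, `ROT_180` and `matmul` are made up for this example.

```python
# 2x2 matrices acting on points (x, y) of a flat picture.
MIRROR_X = ((-1, 0), (0, 1))    # mirror plane x = 0: (x, y) -> (-x, y)
ROT_180 = ((-1, 0), (0, -1))    # 2-fold rotation:    (x, y) -> (-x, -y)


def matmul(a, b):
    """Product a*b of two 2x2 matrices (b is applied first, then a)."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


combined = matmul(ROT_180, MIRROR_X)
print(combined)  # ((1, 0), (0, -1)), i.e. (x, y) -> (x, -y)
```

The product maps (x, y) to (x, -y), which is a second mirror plane at y = 0, perpendicular to the first one.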
Screw Axes
One special type of symmetry element is the screw axis. It combines a rotation by $360^\circ/n$ with a translation along the unit cell axis by $k/n$ of the axis length. We speak of an $n_k$ screw axis.
The figure shows an example of a $4_1$ screw axis: a rotation by 1/4 · 360°, i.e. 90°, is combined with a translation of 1/4 of the length of the unit cell axis along which the screw axis runs. After four such screws, one comes to a point in the next unit cell which is the starting point translated by the cell axis.
[Figure: $4_1$ screw axis, side view and top view]
We are going to meet screw axes again when we deal with space group determination.
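As a small sanity check of the $4_1$ description, the following sketch applies the screw operation four times. It assumes fractional coordinates and a screw axis running along c through the origin; the function name `screw_4_1` and the starting point are hypothetical.

```python
from fractions import Fraction


def screw_4_1(point):
    """One 4_1 step along c: rotate by 90 degrees about c, then shift by c/4."""
    x, y, z = point
    return (-y, x, z + Fraction(1, 4))


start = (Fraction(1, 10), Fraction(2, 10), Fraction(0))
p = start
for _ in range(4):
    p = screw_4_1(p)

# After four steps x and y are back at their start values and z has grown by 1,
# i.e. by one full unit cell edge.
print(p == (start[0], start[1], start[2] + 1))  # True
```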
Symmetry in Crystals
There seems to be an infinite number of possible combinations of symmetry operations.
In crystallography, however, the possible number is restricted: the symmetry must be compatible with the crystal lattice, and this imposes some restrictions, e.g.:
• Start with an arbitrary unit cell.
• Apply a 90° rotation to it (4-fold rotation axis).
• The gap between the two unit cells cannot be filled by this unit cell. But crystals are not allowed to have gaps.
Possible Symmetries
Because the symmetry operations must match the lattice, the only symmetry operations available for crystals are:
• rotations (only 2-, 3-, 4- and 6-fold axes)
• mirror planes and centres of inversion (only small molecules!)
and their combinations.
[Figure: 2-fold, 3-fold, 4-fold and 6-fold axes; mirror plane; centre of inversion]
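Why only 2-, 3-, 4- and 6-fold axes? One standard argument, which is not spelled out on the slide and is added here only as an illustration, is that a rotation mapping a lattice onto itself has an integer matrix in the lattice basis, so its trace $2\cos(360^\circ/n)$ (in two dimensions) must be an integer. The sketch below simply tests this condition; the function name is made up.

```python
import math


def compatible_with_lattice(n, tol=1e-9):
    """True if an n-fold rotation has an integer trace 2*cos(360 deg / n)."""
    trace = 2 * math.cos(2 * math.pi / n)
    return abs(trace - round(trace)) < tol


allowed = [n for n in range(1, 13) if compatible_with_lattice(n)]
print(allowed)  # [1, 2, 3, 4, 6]
```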
Space Groups and Naming Conventions
There are 230 different possibilities for symmetric arrangements within a lattice. They are called the space groups.
There are two different notations for space groups:
1. Hermann-Mauguin notation, e.g. $P1$, $I4_132$, $F\bar{4}3c$. The first letter describes the lattice type (primitive, face centred, . . . ), the rest the symmetries per axis.
2. Schönflies notation, e.g. $C_1^1$, $O^8$, $T_d^5$, which is derived from the mathematical group names.
This course uses the Hermann-Mauguin notation (if at all . . . ).
Symmetry of Macromolecules
Because macromolecules are chiral, a macromolecule cannot crystallise in a space group which contains an inversion centre or a mirror plane.
This leaves “only” 65 chiral space groups in macromolecular crystallography.
Interestingly, macromolecules tend to crystallise in a high symmetry space group (with many possible symmetry operations), whereas small molecules tend to crystallise in a low symmetry space group.
The International Tables
All space groups with their properties (e.g. symmetry operators) are listed in the International Tables for X-Ray Crystallography.

[Reproduced page: International Tables for Crystallography (2006). Vol. A, Space group 16, pp. 200–201. Copyright 2006 International Union of Crystallography]

P222    $D_2^1$    222    Orthorhombic
No. 16    P222    Patterson symmetry Pmmm
Origin at 222
Asymmetric unit: 0 ≤ x ≤ 1/2; 0 ≤ y ≤ 1/2; 0 ≤ z ≤ 1

Symmetry operations
(1) 1    (2) 2 0,0,z    (3) 2 0,y,0    (4) 2 x,0,0

Generators selected: (1); t(1,0,0); t(0,1,0); t(0,0,1); (2); (3)

Positions (multiplicity, Wyckoff letter, site symmetry, coordinates)
General (no reflection conditions):
4 u 1      (1) x,y,z    (2) -x,-y,z    (3) -x,y,-z    (4) x,-y,-z
Special (no extra conditions):
2 t ..2    1/2,1/2,z    1/2,1/2,-z
2 s ..2    0,1/2,z      0,1/2,-z
2 r ..2    1/2,0,z      1/2,0,-z
2 q ..2    0,0,z        0,0,-z
2 p .2.    1/2,y,1/2    1/2,-y,1/2
2 o .2.    1/2,y,0      1/2,-y,0
2 n .2.    0,y,1/2      0,-y,1/2
2 m .2.    0,y,0        0,-y,0
2 l 2..    x,1/2,1/2    -x,1/2,1/2
2 k 2..    x,1/2,0      -x,1/2,0
2 j 2..    x,0,1/2      -x,0,1/2
2 i 2..    x,0,0        -x,0,0
1 h 222    1/2,1/2,1/2
1 g 222    0,1/2,1/2
1 f 222    1/2,0,1/2
1 e 222    1/2,1/2,0
1 d 222    0,0,1/2
1 c 222    0,1/2,0
1 b 222    1/2,0,0
1 a 222    0,0,0

Symmetry of special projections
Along [001]: p2mm, a′ = a, b′ = b, origin at 0,0,z
Along [100]: p2mm, a′ = b, b′ = c, origin at x,0,0
Along [010]: p2mm, a′ = c, b′ = a, origin at 0,y,0

Maximal non-isomorphic subgroups
I    [2] P112 (P2, 3) 1; 2
     [2] P121 (P2, 3) 1; 3
     [2] P211 (P2, 3) 1; 4
IIa  none
IIb  [2] $P2_122$ (a′ = 2a) ($P222_1$, 17); [2] $P22_12$ (b′ = 2b) ($P222_1$, 17); [2] $P222_1$ (c′ = 2c) (17);
     [2] A222 (b′ = 2b, c′ = 2c) (C222, 21); [2] B222 (a′ = 2a, c′ = 2c) (C222, 21); [2] C222 (a′ = 2a, b′ = 2b) (21);
     [2] F222 (a′ = 2a, b′ = 2b, c′ = 2c) (22)

Maximal isomorphic subgroups of lowest index
IIc  [2] P222 (a′ = 2a or b′ = 2b or c′ = 2c) (16)

Minimal non-isomorphic supergroups
I    [2] Pmmm (47); [2] Pnnn (48); [2] Pccm (49); [2] Pban (50); [2] P422 (89); [2] $P4_222$ (93); [2] $P\bar{4}2c$ (112); [2] $P\bar{4}2m$ (111); [3] P23 (195)
II   [2] A222 (C222, 21); [2] B222 (C222, 21); [2] C222 (21); [2] I222 (23)
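To see how the entries of such a table are used, the following sketch generates the symmetry mates of a point from the four general-position operators of P222: (x,y,z), (-x,-y,z), (-x,y,-z), (x,-y,-z). It is only an illustration; the names `P222_OPS` and `equivalent_positions` and the example coordinates are made up.

```python
from fractions import Fraction

# The four symmetry operations of P222, written as sign patterns on (x, y, z).
P222_OPS = [(1, 1, 1), (-1, -1, 1), (-1, 1, -1), (1, -1, -1)]


def equivalent_positions(xyz):
    """Return the symmetry mates of xyz, mapped back into the unit cell."""
    return [tuple((s * c) % 1 for s, c in zip(op, xyz)) for op in P222_OPS]


general = equivalent_positions((Fraction(1, 10), Fraction(1, 5), Fraction(3, 10)))
special = equivalent_positions((Fraction(0), Fraction(0), Fraction(3, 10)))

print(len(set(general)))  # 4 distinct positions (general position, multiplicity 4)
print(len(set(special)))  # 2 distinct positions (on a 2-fold axis, multiplicity 2)
```

A point on the 2-fold axis 0,0,z is mapped onto itself by one of the rotations, which is why its multiplicity drops from 4 to 2.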
Choosing the Unit Cell
• An artificial crystal made from the ribosome.
• It has 2-fold symmetry about the marked axes (not 4-fold!).
• One possible unit cell.
• Another possible unit cell that shows the symmetry.
By convention the unit cell is chosen as small as possible but should also reflect the symmetry of the lattice. In this example, the 90° angles make the 2-fold axis (and the two mirror planes) more apparent.
Asymmetric Unit
The unit cell is the smallest volume required to build up the whole crystal using only translation.
The asymmetric unit is the smallest volume we need to know in order to reconstruct the whole crystal using both translation and the symmetry operators of the crystal.
We only need to find the atoms inside the asymmetric unit in order to describe the molecule; all other atoms can be found by symmetry operations.
Crystal Systems
The unit cell parameters a, b, c, α, β, γ can be classified according to their degree and type of regularity. One speaks of the seven crystal systems:
[Figure: unit cells of the seven crystal systems (triclinic, monoclinic, orthorhombic, tetragonal, trigonal, hexagonal, cubic), each labelled with its relations between a, b, c and α, β, γ]
Lattice type    Restrictions on sides    Restrictions on angles
triclinic       none                     none
monoclinic      none                     α = γ = 90°
trigonal        a = b                    α = β = 90°, γ = 120° (hexagonal axes)
                a = b = c                α = β = γ (rhombohedral axes)
hexagonal       a = b                    α = β = 90°, γ = 120°
orthorhombic    none                     α = β = γ = 90°
tetragonal      a = b                    α = β = γ = 90°
cubic           a = b = c                α = β = γ = 90°
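The restrictions in the table can be turned into a rough check of which crystal system a set of cell parameters is compatible with. This is only a sketch under simplifying assumptions: conventional settings (monoclinic unique axis b, hexagonal axes for trigonal cells), one tolerance for lengths and angles, and a made-up function name. A matching cell metric only suggests a crystal system; the true symmetry must be confirmed from the intensities.

```python
def close(x, y, tol=0.05):
    """True if two cell parameters agree within the tolerance."""
    return abs(x - y) <= tol


def metric_crystal_system(a, b, c, alpha, beta, gamma):
    """Highest-symmetry crystal system compatible with the cell metric."""
    right = [close(angle, 90.0) for angle in (alpha, beta, gamma)]
    if all(right):
        if close(a, b) and close(b, c):
            return "cubic"
        if close(a, b):
            return "tetragonal"
        return "orthorhombic"
    if close(a, b) and right[0] and right[1] and close(gamma, 120.0):
        return "hexagonal/trigonal"
    if right[0] and right[2]:
        return "monoclinic"
    return "triclinic"


print(metric_crystal_system(92.3, 92.3, 127.9, 90.0, 90.0, 120.0))
print(metric_crystal_system(50.0, 60.0, 70.0, 90.0, 100.0, 90.0))
```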
Bravais Lattices
The restriction based on “Choosing the Unit Cell” and the seven “Crystal Systems”, i.e. the combination of crystal symmetry with lattice types, leads to the 14 Bravais Lattices.
[Figure: the 14 Bravais lattices aP, mP, mC, oP, oC, oI, oF, tP, tI, hP, hR, cP, cI, cF (W. Massa)]
Bravais Lattices - Key
The “dots” in the previous presentation represent special positions, i.e. they mark locations of symmetry operators. There do not need to be atoms at these positions.
These are the meanings of the abbreviations of the Bravais lattices:
Choice of unit cell:
P primitive
C C-centred
F face-centred
I body-centred
R rhombohedral
Crystal system:
a triclinic
m monoclinic
o orthorhombic
t tetragonal
h hexagonal / trigonal
c cubic
P, C, F, I, and R appear in the Hermann-Mauguin symbols of the space groups.
Symmetry and X-ray Diffraction
The symmetry of the crystal can be observed on the diffraction pattern:
Diffraction image of lysozyme (nearly) oriented along its 4-fold axis: especially at the centre the symmetry becomes visible (Z. Dauter).
The symmetry of the reflections is imposed on the data and used to correct for systematic errors during data collection and hence to improve the data quality.
Further Reading: Symmetry and Space Groups
• International Tables for Crystallography, Volume A (www.iucr.org)
• W. Massa, Crystal Structure Determination (Springer, 2004)
Predicting X-Ray Spots:
The Ewald Sphere Construction
The Ewald Sphere Construction
Before we continue with “Data Collection”, we have to introduce the reciprocal lattice and the Ewald Sphere.
While the Laue conditions are merely helpful for computational purposes, the Ewald Sphere is very educational and a powerful tool to understand an X-ray diffraction experiment.
To understand the Ewald sphere construction, we first must introduce reciprocal space.
The Reciprocal Lattice — Orthorhombic Case
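For an orthorhombic lattice, i.e. with all three angles α = β = γ = 90°, the term reciprocal lattice is fairly easy to understand:
• $|\vec{a}^*| = 1/|\vec{a}|$
• $|\vec{b}^*| = 1/|\vec{b}|$
• $|\vec{c}^*| = 1/|\vec{c}|$
• $\vec{a}^* \parallel \vec{a}$
• $\vec{b}^* \parallel \vec{b}$
• $\vec{c}^* \parallel \vec{c}$
[Figure: the axes a, b, c in “direct or real space” next to a*, b*, c* in “reciprocal space”, drawn for $|\vec{b}| = 1$]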
The Reciprocal Lattice: Formal Definition
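In general, the vectors $\vec{a}^*, \vec{b}^*, \vec{c}^*$, which span the reciprocal space, are mathematically defined as:
• $\vec{a}^* = \frac{\vec{b}\times\vec{c}}{V}$, i.e. $\vec{a}^* \perp \mathrm{plane}(\vec{b},\vec{c})$
• $\vec{b}^* = \frac{\vec{c}\times\vec{a}}{V}$, i.e. $\vec{b}^* \perp \mathrm{plane}(\vec{c},\vec{a})$
• $\vec{c}^* = \frac{\vec{a}\times\vec{b}}{V}$, i.e. $\vec{c}^* \perp \mathrm{plane}(\vec{a},\vec{b})$
The volume $V$ of the unit cell and the volume $V^*$ of the reciprocal unit cell (the box spanned by $\vec{a}^*, \vec{b}^*, \vec{c}^*$) always fulfil $V = 1/V^*$.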
A long “real space vector” corresponds to a short “reciprocal vector”. Does this ring a bell?
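The definitions above are easy to try out numerically. The sketch below computes the reciprocal vectors for a hypothetical monoclinic cell (the cell parameters are invented for the example) and checks that $V \cdot V^* = 1$ and that $\vec{a}^*$ is perpendicular to $\vec{b}$ and $\vec{c}$.

```python
import math


def cross(u, v):
    """Cross product of two 3-vectors."""
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])


def dot(u, v):
    """Scalar product of two 3-vectors."""
    return sum(p * q for p, q in zip(u, v))


def reciprocal(a, b, c):
    """Return a*, b*, c* and the cell volume V = (a x b) . c."""
    volume = dot(cross(a, b), c)
    a_star = tuple(x / volume for x in cross(b, c))
    b_star = tuple(x / volume for x in cross(c, a))
    c_star = tuple(x / volume for x in cross(a, b))
    return a_star, b_star, c_star, volume


# Hypothetical monoclinic cell: a = 50, b = 60, c = 70 Angstrom, beta = 100 degrees
beta = math.radians(100.0)
a = (50.0, 0.0, 0.0)
b = (0.0, 60.0, 0.0)
c = (70.0 * math.cos(beta), 0.0, 70.0 * math.sin(beta))

a_star, b_star, c_star, V = reciprocal(a, b, c)
V_star = dot(cross(a_star, b_star), c_star)

print(V * V_star)                       # close to 1
print(dot(a_star, a), dot(a_star, b))   # close to 1 and 0
```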
The Reciprocal Lattice
The reciprocal lattice consists of all the points that can be described as
$h\vec{a}^* + k\vec{b}^* + l\vec{c}^*$
with integers h, k, l.
These integers h, k, l turn out, again, to be the Miller indices of a reflection (h, k, l).
Ewald Sphere Construction
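[Figure: X-ray source and incoming beam on the left; two-dimensional reciprocal lattice with axes a* and b* and the labelled points (0,0), (1,0), (1,1) and (−3,0)]
Reciprocal lattice:
$\vec{a}^* = \frac{\vec{b}\times\vec{c}}{(\vec{a}\times\vec{b})\cdot\vec{c}}$
$\vec{b}^* = \frac{\vec{c}\times\vec{a}}{(\vec{a}\times\vec{b})\cdot\vec{c}}$
$\vec{c}^* = \frac{\vec{a}\times\vec{b}}{(\vec{a}\times\vec{b})\cdot\vec{c}}$
Lattice points at $h\vec{a}^* + k\vec{b}^* (+ l\vec{c}^*)$ (hollow circles).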
The crystal rotates about the origin of the reciprocal lattice.
Ewald Sphere Construction
[Figure: reciprocal lattice with the Ewald sphere of radius $|\vec{S}_{in}| = 1/\lambda$; X-ray source on the left]
Draw a sphere with radius 1/λ that touches the lattice origin. The sphere centre lies aligned with the X-ray source.
This sphere is the Ewald Sphere.
Ewald Sphere Construction
[Figure: Ewald sphere with the scattering vector $\vec{S}$ and the lattice points (0,−2), (−1,2), (−5,−3) and (−7,−1) on its surface; X-ray source on the left]
The scattering vector $\vec{S}$ points from the origin to the lattice point.
Exactly those lattice points on the surface of the Ewald sphere fulfil the Laue conditions.
They are the recordable reflections.
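The geometric condition "lattice point on the surface of the sphere" can be written down in a few lines. The sketch below assumes a two-dimensional, orthogonal reciprocal lattice, a beam travelling along +x, and a sphere centre at (−1/λ, 0) so that the sphere touches the lattice origin; the function name, the tolerance and the numbers are invented for the illustration.

```python
import math


def on_ewald_sphere(h, k, a_star, b_star, wavelength, phi_deg, tol=1e-3):
    """True if the point h*a_star + k*b_star, rotated by phi, lies on the sphere."""
    phi = math.radians(phi_deg)
    # lattice point before rotation (2D orthogonal reciprocal lattice assumed)
    x, y = h * a_star, k * b_star
    # rotating the crystal rotates the reciprocal lattice about its origin
    sx = x * math.cos(phi) - y * math.sin(phi)
    sy = x * math.sin(phi) + y * math.cos(phi)
    # beam along +x: the sphere centre sits at (-1/lambda, 0), radius 1/lambda
    radius = 1.0 / wavelength
    distance_to_centre = math.hypot(sx + radius, sy)
    return abs(distance_to_centre - radius) < tol


# Reflection (0, 1) for a* = b* = 0.1 and lambda = 1.0 (arbitrary units)
print(on_ewald_sphere(0, 1, 0.1, 0.1, 1.0, phi_deg=0.0))
phi_hit = math.degrees(math.asin(0.05))  # rotation that brings (0, 1) onto the sphere
print(on_ewald_sphere(0, 1, 0.1, 0.1, 1.0, phi_deg=phi_hit))
```

The origin (0,0) fulfils the condition for every rotation angle, which mirrors the statement that the sphere stays attached to the lattice origin.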
Ewald Sphere Construction
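[Figure: Ewald sphere with the lattice points (0,−2), (−1,2), (−5,−3) and (−7,−1) on its surface; diffracted beams at scattering angles 2θ and 2θ′; the spots (−1,2), (0,0) and (0,−2) are recorded on the detector]
Some of these spots hit the detector.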
Ewald Sphere Construction
[Figure: rotated reciprocal lattice; the reflection (0,2) now lies on the Ewald sphere and hits the detector]
Crystal rotation = lattice rotation = new spots
(Rotation axis perpendicular to the slide)
Use of the Ewald Sphere
The Ewald Sphere construction allows us to understand the diffraction patterns we observe during data collection. The so-called lunes, the reflection spots arranged in a circular pattern, are the intersection of the lattice points with the surface of the sphere.
The reciprocal lattice is constructed from the unit cell such that the reciprocal lattice has the same point symmetry as the direct lattice. This is why the diffraction pattern shows the (point) symmetry of the crystal.
The point symmetry is the crystal’s symmetry without any translational parts, because the Ewald sphere always stays attached to the (0,0,0) lattice point (by construction).
Data Integration
Goal of Data Collection
From an X-ray diffraction experiment we learn the intensities of a large number of reflections (∗).
Every reflection is identified by its Miller index, and the measurement results in a long list of intensities I(hkl). Typically a dataset for a macromolecule contains 10,000-1,000,000 reflections.
The target of data collection and data integration is to determine the intensities of as many reflections as possible, as accurately as possible.
Why?
(∗) and an error estimate of the intensities and the unit cell parameters
Goal of Data Collection
Before we can create a model of the molecule(s) inside the crystal we have to determine the electron density map ρ(x, y, z).
The intensity I(hkl) of a reflection can be calculated from the electron density map as
$$I(hkl) = \mathrm{const} \cdot \left| \int_{\text{unit cell}} \rho(x, y, z)\, e^{2\pi i (hx + ky + lz)} \right|^2$$
We are, though, in the opposite situation: we can measure many of the I(hkl) and want to calculate ρ(x, y, z). Therefore we would have to invert the above equation.
Intensity to Density
The actual inversion of the equation on the previous slide is mostly the topic of phasing, which will be dealt with later.
For now, bear in mind:
• the more measured reflections I(hkl)
→ the more accurate the electron density map ρ(x, y, z)
→ the more accurate the model
What we want to collect
• As many reflections as possible.
• In reciprocal space this means: make as many lattice points as possible traverse the Ewald Sphere.
• This is achieved by rotating the crystal.
Standard set-ups, e.g. at a synchrotron, allow the crystal to be rotated around one axis. More sophisticated machines allow rotation around more than one axis, so that one can reach a better completeness of the data.
Caveat to the Ewald Sphere
The Ewald sphere construction shows the reciprocal lattice. Rotating the crystal also rotates the reciprocal lattice, which lets us picture how and which reflections can be collected.
Bear in mind: translating (shifting) the crystal does not move reflections through the Ewald sphere: the Ewald sphere always stays attached to the reflection (000).
Therefore the diffraction pattern only shows the symmetry of the point group of the crystal and not its full symmetry.
How Data are Collected: Frames
Our detector is planar, only two-dimensional. The reflections we want to collect are distributed in three-dimensional space.
If one rotated the crystal by 360° and recorded everything on the detector at once, one would not know when each reflection was recorded.
Data are collected as slices, or frames.
Frames
Diffraction images are like computer tomography at a hospital: many slices are taken of the tissue (brain, leg, etc.), from which the three-dimensional object can be reconstructed.
Frame Width
In X-ray crystallography the same is achieved by rotating the crystal by a small angle while the detector records the signal. Typically the angle for each image (its frame width) ranges between 0.1° and 2°. One data set consists of a hundred to several thousand images.
Optimal Framewidth
In general, the finer the slices, the better the data. However, it takes 1800 images to cover a crystal rotation of 180° with a frame width of 0.1°, ten times more than with 1° slices. This also increases the radiation dose the crystal is exposed to and therefore the risk of radiation damage.
Even though data are routinely collected at 100 K, every crystal suffers from radiation damage: the X-rays produce free radicals that in turn break bonds and thus destroy the crystal.
On average, synchrotron data are collected with a frame width of 0.5°-1°; on in-house sources one often collects with ≈ 0.2°, because the less intense beam causes less radiation damage.
Data Integration
Determination of the Spot Intensities
1. Cell / Orientation
2. (Preliminary) Spacegroup
3. Integration: background, spot area, summation
4. Corrections
5. Spacegroup
6. Scaling
Integration Programs
Popular and less popular programs for data processing (= data integration) include
• XDS
• HKL2000
• Mosflm
• Saint
• Eval
• d*trek
• automar
None of these programs is superior to the others, and it is often worth trying at least two in order to get the best integrated data set.
1. Unit Cell Dimension & Orientation = Indexing
1. The scattering vector $\vec{S}$ and the scattering angle θ for each reflection (hkl) are “macroscopic” quantities. They can be calculated from
   (a) the spot position on the detector
   (b) the distance between crystal and detector
2. The Laue Conditions and Bragg’s Law relate them to the unit cell parameters $\vec{a}, \vec{b}, \vec{c}$.
3. There are enough reflections on 1-2 images to determine the unit cell and its orientation.
This step of determining the unit cell dimensions and orientation is called indexing, because it is equivalent to assigning to each reflection its Miller index.
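Point 1 of the list above can be sketched as follows. The example assumes an idealised geometry that is not specified on the slide: a flat detector perpendicular to the beam, the beam along +z, and spot coordinates given relative to the beam centre. The function name and the numbers are invented.

```python
import math


def scattering_vector(x_mm, y_mm, distance_mm, wavelength):
    """Return (S, two_theta in degrees, d) for a spot at (x, y) on the detector."""
    # unit vector along the diffracted beam, from the crystal to the spot
    norm = math.sqrt(x_mm ** 2 + y_mm ** 2 + distance_mm ** 2)
    s_out = (x_mm / norm, y_mm / norm, distance_mm / norm)
    s_in = (0.0, 0.0, 1.0)  # unit vector along the incoming beam
    S = tuple((o - i) / wavelength for o, i in zip(s_out, s_in))
    two_theta = math.atan2(math.hypot(x_mm, y_mm), distance_mm)
    # Bragg's law, lambda = 2 d sin(theta), solved for d
    d = wavelength / (2.0 * math.sin(two_theta / 2.0))
    return S, math.degrees(two_theta), d


S, two_theta_deg, d = scattering_vector(100.0, 0.0, 100.0, wavelength=1.0)
length_S = math.sqrt(sum(component ** 2 for component in S))
print(two_theta_deg)      # 45 degrees for this spot
print(length_S, 1.0 / d)  # |S| equals 1/d
```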
2. Spacegroup
The spacegroup that best matches the unit cell dimensions and has high symmetry (many symmetry elements) is chosen:

         Bravais   Score      a      b      c  alpha   beta  gamma
    * 31   aP        0.0   92.3   92.4  127.9   90.0   90.0   60.0
    * 44   aP        0.0   92.3   92.4  127.9   90.0   90.0  120.0
    * 39   mC        0.0  160.0   92.3  127.9   90.0   90.0   90.0
    * 10   mC        0.3  160.0   92.3  127.9   90.0   90.0   90.0
    * 34   mP        0.5   92.3  127.9   92.4   90.0  120.0   90.0
    * 29   mC        0.5   92.3  160.0  127.9   90.0   90.0   90.0
    * 38   oC        0.5   92.3  160.0  127.9   90.0   90.0   90.0
    * 13   oC        0.8   92.4  160.0  127.9   90.0   90.0   90.0
    * 14   mC        0.8   92.4  160.0  127.9   90.0   90.0   90.0
    * 12   hP        0.8   92.3   92.4  127.9   90.0   90.0  120.0
      35   mP      250.0   92.4   92.3  127.9   90.0   90.0  120.0

XDS example output for $P6_122$
Actually, only the Laue group is of interest during integration. The Laue group is similar to, but not identical to, the point group which was mentioned above.
3. Integration
Magnified spot on detector. To measure its intensity:
• estimate the average background (grey)
• estimate the spot area
• count the pixel values of the spot
• subtract the background
Correctly estimating the background and spot area are the difficult parts, especially for weak reflections.
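The four steps above amount to a simple summation integration. The sketch below uses an invented 5×5 pixel box with a known spot mask; real integration programs estimate the spot area and the background in far more elaborate ways, so this is only meant to show the arithmetic.

```python
def integrate_spot(image, spot_pixels):
    """Summation integration: sum of the spot pixels minus the mean background."""
    spot = set(spot_pixels)
    background = [
        value
        for r, row in enumerate(image)
        for c, value in enumerate(row)
        if (r, c) not in spot
    ]
    mean_background = sum(background) / len(background)
    raw_sum = sum(image[r][c] for r, c in spot)
    # the background under the spot is assumed to equal the mean around it
    return raw_sum - mean_background * len(spot)


image = [
    [10, 10, 10, 10, 10],
    [10, 10, 30, 10, 10],
    [10, 30, 90, 30, 10],
    [10, 10, 30, 10, 10],
    [10, 10, 10, 10, 10],
]
spot_pixels = [(1, 2), (2, 1), (2, 2), (2, 3), (3, 2)]
intensity = integrate_spot(image, spot_pixels)
print(intensity)  # 160.0 = 210 counts in the spot minus 5 * 10 background
```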
3.1 2D- and 3D-spots
Spots have a certain volume and appear on more than one frame.
Some integration programs, e.g. Mosflm and HKL2000, treat each frame separately and write the fraction of each spot per frame to the output file. They leave it to a separate scaling program to put the fractions together. These are 2D-integration programs.
Other programs like XDS and Saint integrate over all frames that contribute to a reflection and only write out the final total intensity per spot. These are 3D-integration programs.
4. Corrections
The integration step basically consists of counting the pixel values and subtracting the background. Once all measurable reflections are processed, certain corrections must be applied:
• technical corrections like the Lorentz and polarisation corrections
• improved estimate of the unit cell dimensions using all data
• improvement of experimental parameters like crystal-to-detector distance, distortions of the detector, . . .
It is often worth repeating the whole integration process with the improved parameters.
5. Spacegroup Determination
With all reflections processed and the settings of the experiment (unit cell dimensions, detector distance, . . . )improved and refined, the spacegroup can now be determined more reliably than before.
In particular, spacegroups with screw axes show so-called extinctions.
E.g. in spacegroup $P2_1$ (here with the screw axis along c), the reflections (001), (003), (005), . . . are mathematically zero, because the screw axis leads to systematic destructive interference for these reflections. This is the only way to distinguish between $P2_1$ and $P2$.
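The destructive interference can be reproduced with two point scatterers. The sketch below assumes a $2_1$ axis along c, so that an atom at height z has a symmetry mate at z + 1/2, and compares it with a plain 2-fold axis, which leaves z unchanged. For (00l) reflections only the z coordinate enters the structure factor. The z value and the function name are arbitrary choices for the illustration.

```python
import cmath


def structure_factor_00l(l, z_positions):
    """F(00l) for equal point atoms at the given fractional z coordinates."""
    return sum(cmath.exp(2j * cmath.pi * l * z) for z in z_positions)


z = 0.137                     # arbitrary fractional coordinate
screw_pair = [z, z + 0.5]     # 2_1 along c: z -> z + 1/2
rotation_pair = [z, z]        # plain 2-fold along c: z -> z

for l in range(1, 6):
    print(l,
          round(abs(structure_factor_00l(l, screw_pair)), 6),
          round(abs(structure_factor_00l(l, rotation_pair)), 6))
```

With the screw axis the amplitude vanishes for odd l and equals 2 for even l, whereas the plain 2-fold axis gives 2 for every l.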
6. Scaling
Scaling is a second type of correction. It takes into account that
• the crystal is not spherical: the volume of irradiated crystal changes with crystal orientation (larger volume = higher intensities)
• radiation damage leads to a reduction in the scattering power of the crystal
• CCD detectors are made of several “chips”. Each chip may react slightly differently to the impact of X-rays.
Scaling adjusts the data as far as possible so that they look as though they came from an ideal crystal measured with an ideal instrument, because this is what the subsequent steps (refinement, building) assume.
6.1 Symmetry Related Reflections
Every symmetry operation can be expressed by a matrix multiplication and a vector addition (translation).
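E.g. one of the symmetry operators of the space group $P4_1$ can be written as
$$\begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} x \\ y \\ z \end{pmatrix} + \begin{pmatrix} 0 \\ 0 \\ \tfrac{1}{4} \end{pmatrix}$$
This means that the reflections (∗)
$$\begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \\ 3 \end{pmatrix} = \begin{pmatrix} -2 \\ 1 \\ 3 \end{pmatrix}$$
should (mathematically) have identical intensities.
(∗) Because the Ewald sphere is attached to the (000) reflection, there is no translational part in reciprocal space.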
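The matrix product above, and the further symmetry mates generated by applying the same rotation repeatedly, can be listed with a few lines of code. The sketch follows the calculation on the slide (rotation matrix times index column); the names `ROT_4` and `apply` are made up.

```python
# Rotational part of the P4_1 operator shown above
ROT_4 = ((0, -1, 0), (1, 0, 0), (0, 0, 1))


def apply(matrix, hkl):
    """Multiply a 3x3 matrix with the index triple (h, k, l)."""
    return tuple(sum(m * v for m, v in zip(row, hkl)) for row in matrix)


mates = [(1, 2, 3)]
for _ in range(3):
    mates.append(apply(ROT_4, mates[-1]))

print(mates)  # [(1, 2, 3), (-2, 1, 3), (-1, -2, 3), (2, -1, 3)]
```

A fourth application returns to (1, 2, 3). During scaling, the intensities of such equivalent reflections are compared and merged.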
Result of Integration: the hkl-file
At the end of the integration step, all hundreds or thousands of images are reduced to the reflections they contain. We end up with a reflection file containing a list of Miller indices, each with its intensity and the error estimate:

     2  2  0  10.9258  0.81100
     3  0  0  0.86624  0.53398
     0  3  0  0.09921  0.79861
     1  3  0  59.3246  3.54304
     3  1  0  68.3514  3.82527
    -1  3  0  53.2978  3.36226
     2  3  0  39.5588  2.47039

(Example of a Thaumatin data set in space group $P4_12_12$, maximum resolution 1.6 Å, 283,862 reflections in total.)
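Reading such a file is straightforward. The sketch below parses the seven lines of the example, assuming five whitespace-separated columns (h, k, l, intensity, error estimate), and prints the signal-to-noise ratio of each reflection. Real hkl files may use fixed-width columns or extra fields; `parse_hkl` is a made-up helper, not part of any integration program.

```python
HKL_TEXT = """\
 2  2  0  10.9258  0.81100
 3  0  0  0.86624  0.53398
 0  3  0  0.09921  0.79861
 1  3  0  59.3246  3.54304
 3  1  0  68.3514  3.82527
-1  3  0  53.2978  3.36226
 2  3  0  39.5588  2.47039
"""


def parse_hkl(text):
    """Return a list of ((h, k, l), intensity, sigma) tuples."""
    reflections = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 5:
            continue  # skip blank or malformed lines
        h, k, l = (int(field) for field in fields[:3])
        reflections.append(((h, k, l), float(fields[3]), float(fields[4])))
    return reflections


reflections = parse_hkl(HKL_TEXT)
for hkl, intensity, sigma in reflections:
    print(hkl, round(intensity / sigma, 2))
```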
Resolution of the Data Set
The intensity of the reflections fades as we move towards the edge of the detector (i.e., as we increase the scattering angle θ). There is a maximal angle up to which a crystal diffracts. This is the resolution limit of the crystal.
[Figure: scattering angle 2θ between the direct beam and the diffracted beam]
• For each reflection we know the angle 2θ it forms with the line between detector and crystal.
• From Bragg’s Law $\lambda = 2d \sin\theta$ we can calculate d, the resolution of the reflection.
• The smallest distance d to which reasonable data can be measured is called the resolution of the dataset.
Reasonable Data: Determination of the Resolution
There is a problem with the resolution of a dataset:
The integration program does not really distinguish between background and reflections: it calculates the location of the reflections (from the Laue conditions), sums up the pixels in that area and subtracts the background.
The crystallographer has to decide on the resolution cut-off.
A good guide for the resolution cut-off is where the average signal divided by its error, $I/\sigma_I$, drops below 2.0.
Example Statistics from the program xprep
❘❡s♦❧✉t✐♦♥ ★❉❛t❛ ★❚❤❡♦r② ✪❈♦♠♣❧❡t❡ ❘❡❞✉♥❞❛♥❝② ▼❡❛♥ ■ ▼❡❛♥ ■✴s ❘✭✐♥t✮ ❘s✐❣♠❛
■♥❢ ✲ ✷✳✶✺ ✻✸✹ ✶✺✵✵ ✹✷✳✸ ✵✳✹✷ ✷✻✶✳✺ ✼✳✻✶ ✵✳✷✸✼✶ ✵✳✶✷✷✺✷✳✶✺ ✲ ✶✳✽✹ ✻✸✹ ✽✺✻ ✼✹✳✶ ✵✳✼✺ ✷✻✶✳✸ ✼✳✷✵ ✵✳✶✵✵✾ ✵✳✶✷✻✼✶✳✽✹ ✲ ✶✳✻✻ ✻✺✷ ✾✶✹ ✼✶✳✸ ✵✳✼✸ ✶✺✹✳✼ ✼✳✶✼ ✵✳✵✺✹✽ ✵✳✶✷✻✽✶✳✻✻ ✲ ✶✳✺✷ ✻✼✽ ✾✸✻ ✼✷✳✹ ✵✳✼✸ ✾✹✳✺ ✻✳✼✵ ✵✳✶✷✽✹ ✵✳✶✸✶✶✶✳✺✷ ✲ ✶✳✹✷ ✻✾✽ ✶✵✵✽ ✻✾✳✷ ✵✳✼✶ ✼✾✳✸ ✻✳✻✷ ✵✳✵✻✾✸ ✵✳✶✸✺✵✶✳✹✷ ✲ ✶✳✸✹ ✻✼✵ ✾✼✻ ✻✽✳✻ ✵✳✼✶ ✻✵✳✷ ✺✳✽✽ ✵✳✶✵✼✻ ✵✳✶✹✹✾✶✳✸✹ ✲ ✶✳✷✽ ✻✸✽ ✾✵✹ ✼✵✳✻ ✵✳✼✸ ✹✾✳✻ ✺✳✸✸ ✵✳✶✷✷✾ ✵✳✶✺✼✵✶✳✷✽ ✲ ✶✳✷✷ ✼✷✻ ✶✶✵✷ ✻✺✳✾ ✵✳✻✼ ✹✼✳✾ ✺✳✸✵ ✵✳✶✹✻✾ ✵✳✶✻✷✷✶✳✷✷ ✲ ✶✳✶✼ ✻✾✶ ✶✶✸✷ ✻✶✳✵ ✵✳✻✹ ✹✻✳✻ ✺✳✵✼ ✵✳✶✸✸✽ ✵✳✶✻✺✻✶✳✶✼ ✲ ✶✳✶✸ ✻✺✻ ✶✵✸✽ ✻✸✳✷ ✵✳✻✻ ✹✹✳✽ ✺✳✵✷ ✵✳✶✼✼✾ ✵✳✶✼✵✷✶✳✶✸ ✲ ✶✳✵✾ ✼✵✸ ✶✶✻✹ ✻✵✳✹ ✵✳✻✹ ✸✹✳✼ ✹✳✸✵ ✵✳✶✼✹✼ ✵✳✶✾✹✷✶✳✵✾ ✲ ✶✳✵✺ ✽✶✽ ✶✹✵✷ ✺✽✳✸ ✵✳✻✷ ✷✸✳✼ ✸✳✻✺ ✵✳✷✵✻✷ ✵✳✷✹✺✶✶✳✵✺ ✲ ✶✳✵✷ ✻✹✶ ✶✶✾✽ ✺✸✳✺ ✵✳✺✽ ✶✾✳✽ ✸✳✶✺ ✵✳✶✾✸✸ ✵✳✷✽✽✽✶✳✵✷ ✲ ✵✳✾✾ ✼✷✻ ✶✸✽✵ ✺✷✳✻ ✵✳✺✼ ✶✸✳✾ ✷✳✸✽ ✵✳✷✸✺✻ ✵✳✸✽✹✸✵✳✾✾ ✲ ✵✳✾✻ ✼✾✶ ✶✹✻✵ ✺✹✳✷ ✵✳✺✾ ✶✶✳✼ ✲✲❃✷✳✶✺ ✵✳✷✸✻✼ ✵✳✹✸✽✺✵✳✾✻ ✲ ✵✳✾✸ ✽✷✻ ✶✻✾✻ ✹✽✳✼ ✵✳✺✹ ✾✳✽ ✲✲❃✶✳✽✷ ✵✳✸✹✻✻ ✵✳✺✹✶✺✵✳✾✸ ✲ ✵✳✾✵ ✼✸✽ ✶✾✾✽ ✸✻✳✾ ✵✳✹✶ ✶✸✳✻ ✶✳✽✸ ✵✳✸✶✷✺ ✵✳✹✹✽✼✵✳✾✵ ✲ ✵✳✽✺ ✺✻✾ ✸✸✾✷ ✶✻✳✽ ✵✳✶✾ ✻✳✸ ✶✳✶✶ ✵✳✸✸✼✺ ✵✳✽✹✹✺✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✲✵✳✾✹ ✲ ✵✳✽✺ ✶✽✼✷ ✻✺✼✵ ✷✽✳✺ ✵✳✸✷ ✶✵✳✷ ✶✳✻✵ ✵✳✸✹✹✷ ✵✳✺✺✷✾■♥❢ ✲ ✵✳✽✺ ✶✷✹✽✾ ✷✹✵✺✻ ✺✶✳✾ ✵✳✺✺ ✻✺✳✺ ✹✳✺✵ ✵✳✶✻✻✷ ✵✳✶✺✽✼
The third-to-last column (Mean I/s) suggests cutting the resolution at about 0.95 Å, where the mean I/σ drops below 2.0.
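The rule of thumb can be applied mechanically to the shell statistics. The following Python sketch is only an illustration: the shell limits and Mean I/s values are copied from the high-resolution end of the xprep table, while the function name `suggest_cutoff` and its `threshold` argument are hypothetical and not part of xprep.

```python
# (low-resolution limit, high-resolution limit, mean I/sigma) per shell,
# ordered from low to high resolution, copied from the xprep statistics.
SHELLS = [
    (1.05, 1.02, 3.15),
    (1.02, 0.99, 2.38),
    (0.99, 0.96, 2.15),
    (0.96, 0.93, 1.82),
    (0.93, 0.90, 1.83),
    (0.90, 0.85, 1.11),
]


def suggest_cutoff(shells, threshold=2.0):
    """Return the high-resolution limit of the last shell before the mean
    I/sigma first drops below `threshold` (None if the first shell is
    already below it)."""
    cutoff = None
    for _d_max, d_min, i_over_sigma in shells:
        if i_over_sigma < threshold:
            break
        cutoff = d_min
    return cutoff


print(suggest_cutoff(SHELLS))  # prints 0.96
```

The result, 0.96 Å, agrees with the cut at roughly 0.95 Å read off the table by eye. The final decision still rests with the crystallographer.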
Further Reading: Data Integration
W. Kabsch, Integration, scaling, space-group assignment and post-refinement (Acta Cryst. D66, 2010)
Summary and Outlook
So far we have ended up with a long list of reflections, i.e., with one Miller index triple (hkl) for each reflection together with its intensity and error estimate.
This does not suffice to determine the electron density, which we need in order to start building a model of the molecule.
We still require the phases for each reflection.
This is the topic of the next part.
Phasing
The equation
$$I(hkl) = \text{const} \cdot \left| \int_{\text{unit cell}} \rho(x, y, z)\, e^{2\pi i (hx + ky + lz)} \right|^2$$
connects X-ray crystallography with chemistry because it shows how the (measured) reflection spots I(hkl)
are connected to the electron density ρ(x, y, z) in the crystal.
Unfortunately, this equation reads the wrong way: We want to calculate the electron density from the intensities, because the electron density is needed in order to construct an atomic model for the molecules.
Inverting this equation is the subject of the section on phasing.
The Structure Factor
The reflections are the result of small waves from the electrons in the crystal. This notion leads (after some calculations . . . ) to the concept of the structure factor F(hkl). It is a complex number and builds a two-way bridge between intensities and density:
$$I(hkl) = \text{const} \cdot |F(hkl)|^2$$
$$F(hkl) = \text{const} \cdot \int \rho(x, y, z)\, e^{2\pi i (hx + ky + lz)}$$
The latter equation can be inverted:
$$\rho(x, y, z) = \text{const} \cdot \sum_{h,k,l} F(hkl)\, e^{-2\pi i (hx + ky + lz)}$$
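The pair of equations can be checked numerically on a toy example. The sketch below is a one-dimensional illustration under stated assumptions: the "unit cell" is a grid of N points, the integral becomes a sum, the density is made up of two Gaussian "atoms" with invented positions and widths, and the constants are chosen as 1/N (forward) and 1 (inverse). None of these choices come from the lecture.

```python
import cmath
import math

N = 32                               # grid points along a 1D "unit cell"
xs = [n / N for n in range(N)]       # fractional coordinates 0 <= x < 1
hs = range(N)                        # indices h of the toy 1D crystal

# Toy electron density: two Gaussian "atoms" (positions and widths made up).
rho = [math.exp(-((x - 0.30) / 0.05) ** 2)
       + 0.5 * math.exp(-((x - 0.70) / 0.08) ** 2) for x in xs]


def structure_factor(h):
    """F(h) = const * sum_x rho(x) * exp(+2 pi i h x), with const = 1/N."""
    return sum(r * cmath.exp(2j * math.pi * h * x)
               for r, x in zip(rho, xs)) / N


def density(x, F):
    """rho(x) = sum_h F(h) * exp(-2 pi i h x); real up to rounding."""
    return sum(F_h * cmath.exp(-2j * math.pi * h * x)
               for h, F_h in zip(hs, F)).real


F = [structure_factor(h) for h in hs]
intensity = [abs(F_h) ** 2 for F_h in F]      # I(h) = |F(h)|^2, const = 1
rho_back = [density(x, F) for x in xs]

# Largest deviation between original and back-transformed density.
max_error = max(abs(a - b) for a, b in zip(rho, rho_back))
print(max_error)                              # rounding error only
```

With the complex structure factors F(h) the density is recovered to within rounding error. The list `intensity` contains only |F(h)|², which is the quantity actually measured.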
The Phase Problem
Unfortunately, the structure factor F(hkl) is a complex number. As such it consists of an amplitude |F(hkl)| and a phase φ(hkl) and can be written as $F(hkl) = |F(hkl)|\, e^{i\phi(hkl)}$.
The square root of the intensity delivers the structure factor amplitude |F(hkl)|.
The phase angle φ(hkl) cannot be measured directly. This fact is called the phase problem of crystallography.
Without knowing the phases that belong to each reflection, we cannot proceed to calculate the electron density map ρ(x, y, z).
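A small numerical experiment illustrates what is lost. The sketch below is a one-dimensional toy model, not taken from the lecture: the density, the grid size and the helper names `forward` and `inverse` are assumptions for illustration. It takes the square root of the "measured" intensities to get amplitudes, then rebuilds the density once with the true phases and once with all phases guessed as zero.

```python
import cmath
import math

N = 32
xs = [n / N for n in range(N)]
# Made-up 1D density with two Gaussian "atoms".
rho = [math.exp(-((x - 0.30) / 0.05) ** 2)
       + 0.5 * math.exp(-((x - 0.70) / 0.08) ** 2) for x in xs]


def forward(density):
    """F(h) = (1/N) * sum_x rho(x) * exp(+2 pi i h x)."""
    return [sum(r * cmath.exp(2j * math.pi * h * x)
                for r, x in zip(density, xs)) / N for h in range(N)]


def inverse(F):
    """rho(x) = real part of sum_h F(h) * exp(-2 pi i h x)."""
    return [sum(F_h * cmath.exp(-2j * math.pi * h * x)
                for h, F_h in enumerate(F)).real for x in xs]


def max_diff(a, b):
    return max(abs(p - q) for p, q in zip(a, b))


F = forward(rho)
intensity = [abs(F_h) ** 2 for F_h in F]         # what the experiment delivers
amplitude = [math.sqrt(i) for i in intensity]    # |F(h)| = sqrt(I(h))
true_phase = [cmath.phase(F_h) for F_h in F]     # not measurable in practice

rho_true = inverse([cmath.rect(a, p) for a, p in zip(amplitude, true_phase)])
rho_zero = inverse([cmath.rect(a, 0.0) for a in amplitude])  # phases set to 0

print(max_diff(rho_true, rho))   # rounding error only
print(max_diff(rho_zero, rho))   # clearly non-zero: a different map
```

The amplitudes are identical in both reconstructions. Only the phases differ, and that alone is enough to produce a different map.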
Illustrating the Phase Problem
The fact that the phases do not show up in the diffraction pattern is comparable to drawing a three-dimensional object:
Which side of the cube is the front side? We cannot decide without further information.
Important Notice
It is (computationally) straightforward to calculate/predict the reflections from a model (which is the final representative of the electron density ρ(x, y, z)).
A good match between calculated and measured amplitudes indicates that we have, e.g., a good model or good phases. (This is not fool-proof, though, which is why validation is an important step in structure determination.)
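One common way to put a number on this match is the crystallographic R-factor. It is not named on this slide, so the sketch below is an illustration under that assumption. The function name `r_factor` and the simple overall scale factor k (ratio of the amplitude sums) are choices made here, and real refinement programs use more elaborate scaling.

```python
def r_factor(f_obs, f_calc):
    """R = sum | |F_obs| - k * |F_calc| | / sum |F_obs|.

    f_obs:  measured amplitudes (square roots of the intensities)
    f_calc: structure factors calculated from the model (real or complex)
    k:      overall scale factor, here simply the ratio of the amplitude sums
    """
    amp_obs = [abs(f) for f in f_obs]
    amp_calc = [abs(f) for f in f_calc]
    k = sum(amp_obs) / sum(amp_calc)
    return sum(abs(o - k * c) for o, c in zip(amp_obs, amp_calc)) / sum(amp_obs)


# Calculated amplitudes that differ only by a scale factor match perfectly.
print(r_factor([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))  # prints 0.0
```

An R-factor of 0 means perfect agreement, and larger values mean a poorer match. As the slide stresses, a low value alone does not prove that the model is right.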
Limits of Phasing
All phasing methods provide only an estimate of the phases, and once found, the phases must be further improved to get closer to the real phases. Finding this initial phase estimate is phasing.
It may not be obvious, even to more experienced crystallographers, that the improvement of these phases is the role of model building and refinement.
Overview of Phasing Methods
The most common methods to solve the macromolecular phase problem are:
Molecular Replacement
Isomorphous Replacement
Anomalous Dispersion
Molecular Replacement
Structural Similarity
Proteins are alike! Proteins consist mainly of helices and beta sheets. Although the possible sequences of amino acids are nearly endless, the variation in tertiary structure is rather limited.
Proteins with homologous sequences are considered to share a similar tertiary structure, too. An identity of only 30% can be sufficient for structural similarity so that Molecular Replacement works.
Sometimes, even 100% sequence identity is not enough to find a solution by molecular replacement (domain movements, conformational changes upon ligand binding, etc.).
Molecular Replacement - Flow Chart
The steps of Molecular Replacement are:
1. Find a similar structure, e.g. by sequence comparison against all known structures in the Protein Data Bank (PDB). The PDB can be used free of charge and can be accessed e.g. at www.pdb.org or www.pdbe.org. 30% sequence identity is considered the minimum.
2. Correctly place this search prototype in the unit cell.
3. Combine the phases φ(hkl) calculated from the placed prototype with the structure factor amplitudes |F(hkl)| derived from the measured intensities I(hkl).
Step (2) is the tricky step.
Why does this work?
If the search prototype is sufficiently similar and correctly placed within the unit cell, the calculated phases are close enough to the real phases to give an interpretable map.
Molecular Replacement Programs
Phaser: probably the program of choice for molecular replacement.
1. Easy to use
2. Tolerant of clashes
3. Offers a choice of space groups in ambiguous cases
4. Fast (approx. 30 minutes for an average structure)
MrBump: model search and preparation based on sequence
There are also the programs Amore, Molrep, and EPMR, but I have little or no experience with them.
MR: Model Preparation
Every difference between the search prototype and the molecule inside the crystal reduces the chance for a good molecular replacement solution.
There is some advice as to how to prepare the search model before carrying out the search:
• Remove solvent and ligands
• Remove flexible parts, mostly loop regions.
• Split the molecule into domains and search one after the other
• Try several copies per asymmetric unit (oligomeric proteins)
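The first two points of this advice can be sketched as a simple filter over the lines of a PDB file. This is a minimal illustration, not the way MrBump or any other program prepares models. The function name `prepare_search_model` and its arguments are hypothetical, and the only thing taken from the PDB file format is the fixed column layout (chain ID in column 22, residue number in columns 23-26).

```python
def prepare_search_model(pdb_lines, chain_id, remove_ranges=()):
    """Keep the ATOM records of one chain, dropping HETATM records
    (solvent, ligands) and the residues in `remove_ranges`, given as
    inclusive (first, last) residue-number pairs, e.g. flexible loops.

    Caveat: modified residues such as selenomethionine (MSE) are also
    stored as HETATM and would be removed by this naive filter.
    """
    kept = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue                       # HETATM, header, TER, END, ...
        if line[21] != chain_id:           # chain identifier, column 22
            continue
        res_seq = int(line[22:26])         # residue number, columns 23-26
        if any(first <= res_seq <= last for first, last in remove_ranges):
            continue
        kept.append(line)
    return kept
```

A call could look like `prepare_search_model(open("model.pdb"), "A", remove_ranges=[(120, 135)])`, where the file name and the residue range are placeholders. Splitting into domains amounts to calling the function once per domain with the other domains listed in `remove_ranges`.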
MR: Example Model Preparation
PDB-ID: 1OFC
Molecules like this one with several domains tend to be flexible and might crystallise in slightly different ways.
• Separate into three domains: blue, green + half of red linker helix, yellow + half of red linker helix
• Remove loop region in yellow domain
• Remove disconnected (disordered) helix in blue domain
MR: Number of Molecules
Macromolecules often crystallise as oligomers. For molecular replacement this can become an obstacle since the search program must know how many copies of the molecule it should be looking for.
← Realistic packing for a small protein in a large unit cell (space group I4₁22)
← Molecule crystallised as heptamer (7-mer) ⇒ 112 molecules in unit cell.
← It could easily be one molecule more or less without steric clashes.
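The figure of 112 follows from simple counting, sketched here under the assumption that the asymmetric unit holds exactly one heptamer. The general position of this space group has multiplicity 16 (a fact taken from the space-group tables, not from the slide), i.e. the unit cell contains 16 symmetry-equivalent asymmetric units, so

$$N_{\text{cell}} = 16 \times 7 = 112 .$$

One molecule more or less per asymmetric unit would change this to 128 or 96 molecules per unit cell.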
Estimating Number of Molecules: the Solvent Content
The previous unit cell shows large “white” areas. These are filled with solvent molecules (water, salt), which are disordered and therefore cannot be seen in the crystal structure.
<--- relative frequency --->
4.17 -
3.85 --
3.33 ------
2.94 ----------------
2.78 -----------------------
2.50 -----------------------------------------
2.38 ------------------------------------------------
2.27 --------------------------------------------------
2.08 --------------------------------------
1.92 ---------------
1.79 ---
1.67 -
1.61 * (COMPOSITION*1)
1.56 -
Macromolecules typically crystallise with 30-70% solvent content, centred around 50%. With these statistics the example on the left says that it is very unlikely (but not impossible) that there are 13 molecules in the asymmetric unit.
Phaser log-file with 13 copies
According to statistics there are probably 9 copies of the molecule in the asymmetric unit (in this case there were 7).
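The numbers in such a histogram can be related to the solvent content with the Matthews coefficient V_M (volume of the asymmetric unit per dalton of macromolecule) and Matthews' estimate of the solvent fraction for proteins, 1 − 1.23/V_M. Neither formula appears on the slide, so the sketch below is an illustration under stated assumptions: the volume and molecular weight are hypothetical values chosen so that 13 copies give V_M = 1.61 Å³/Da, on the assumption that the histogram axis shows V_M and that the marked value 1.61 belongs to the 13-copy composition.

```python
def matthews_coefficient(asu_volume, molecular_weight, n_copies):
    """V_M in A^3/Da: asymmetric-unit volume per dalton of macromolecule."""
    return asu_volume / (molecular_weight * n_copies)


def solvent_fraction(v_m):
    """Matthews' estimate for protein crystals: 1 - 1.23 / V_M."""
    return 1.0 - 1.23 / v_m


# Hypothetical numbers, chosen so that 13 copies give V_M = 1.61 A^3/Da.
ASU_VOLUME = 209_300.0   # A^3
MOL_WEIGHT = 10_000.0    # Da

for n in (7, 9, 13):
    v_m = matthews_coefficient(ASU_VOLUME, MOL_WEIGHT, n)
    print(f"{n:2d} copies: V_M = {v_m:.2f} A^3/Da, "
          f"solvent = {100 * solvent_fraction(v_m):.0f} %")
```

Under these assumptions 13 copies would leave only about 24% solvent, outside the typical 30-70% range, whereas 9 copies give about 47% and 7 copies about 59%. Both of the latter are plausible, which is why the statistics can suggest 9 copies when the crystal actually holds 7.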
Model Bias - the Main Risk
Molecular replacement uses “foreign” phases in order to calculate the electron density map. One hopes that these phases are close enough to the real phases that the electron density is correct, or at least close enough to allow for improvements.
Thought Experiment (Kevin Cowtan, http://www.ysbl.york.ac.uk/~cowtan/fourier/fourier.html) :
[Figure: FT and inverse FT between the search model, the crystal content, and their |F(hkl)|, φ(hkl).]
The phase of the duck determines the shape.
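The thought experiment can be imitated in one dimension. The sketch below is an illustration only: the two densities (three Gaussian "atoms" each, at invented positions), the grid size and all helper names are assumptions and have nothing to do with the actual duck images. It combines the amplitudes of the "crystal content" with the phases of a deliberately wrong "search model" and correlates the resulting map with both densities.

```python
import cmath
import math

N = 32
xs = [n / N for n in range(N)]


def gaussians(centres, width=0.05):
    """Toy 1D density: equal Gaussians at the given fractional positions."""
    return [sum(math.exp(-((x - c) / width) ** 2) for c in centres) for x in xs]


def forward(density):
    """F(h) = (1/N) * sum_x rho(x) * exp(+2 pi i h x)."""
    return [sum(r * cmath.exp(2j * math.pi * h * x)
                for r, x in zip(density, xs)) / N for h in range(N)]


def inverse(F):
    """rho(x) = real part of sum_h F(h) * exp(-2 pi i h x)."""
    return [sum(F_h * cmath.exp(-2j * math.pi * h * x)
                for h, F_h in enumerate(F)).real for x in xs]


def correlation(a, b):
    """Pearson correlation coefficient of two equally long lists."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((p - ma) * (q - mb) for p, q in zip(a, b))
    var_a = sum((p - ma) ** 2 for p in a)
    var_b = sum((q - mb) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


rho_crystal = gaussians([0.20, 0.45, 0.80])   # hypothetical crystal content
rho_model = gaussians([0.10, 0.55, 0.65])     # hypothetical, wrong search model

F_crystal = forward(rho_crystal)
F_model = forward(rho_model)

# "Measured" amplitudes combined with model phases, as in molecular replacement.
F_hybrid = [cmath.rect(abs(fc), cmath.phase(fm))
            for fc, fm in zip(F_crystal, F_model)]
rho_hybrid = inverse(F_hybrid)

print("hybrid vs. model:  ", correlation(rho_hybrid, rho_model))
print("hybrid vs. crystal:", correlation(rho_hybrid, rho_crystal))
```

Comparing the two printed correlation coefficients shows which of the two densities the hybrid map resembles more. The hybrid map carries exactly the measured amplitudes, so agreement of amplitudes alone cannot reveal how strongly the map is biased towards the model.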
A Simple Test for Model Bias
The risk of model bias is particularly high at medium or low resolution.
To check whether a molecular replacement solution is correct or just a random solution, do the following:
1. Before running the MR program, remove part of the search model, e.g. half an α-helix.
2. Carry out the MR.
3. Look at the resulting map: If there is electron density for the removed part, the solution is certainly correct. If not, you are most likely (but not certainly) looking at a false solution.
Example Test
Some residues of the helix in this search model were removed before MR. The resulting map does not show any signs of density for these residues. Therefore, this is most likely a false solution.