Anomalous Differences
• Bijvoet differences
• (hkl) vs (-h-k-l)
Dispersive Differences
• λ1(hkl) vs λ2(hkl)
• From merged (hkl)’s
X-ray Anomalous Scattering
Dependency on f’ and f”
• Anomalous differences are proportional to 2f”
• Dispersive differences depend largely on the difference in f’ (Δf’) between the wavelengths
Se Absorption Edge
[Figure: f’ and f” (electrons) vs. X-ray energy (eV) at the Se absorption edge, with the Peak, Inflection point and Remote (high) wavelengths marked]
Anomalous differences
[Figure: f’ and f” (electrons) vs. X-ray energy (eV); f” curve labeled]
Hg Anomalous signal
• 2.5x stronger signal than Se
[Figure: f’ and f” (electrons) vs. X-ray energy (eV)]
Dispersive differences
[Figure: f’ and f” (electrons) vs. X-ray energy (eV); f’ and f” curves labeled]
Crick and Magdoff Equation (1956)
$$\text{Perturbation due to } f'' = \left(\frac{N_A}{2N_T}\right)^{1/2} \frac{2f''_A}{Z_{\mathrm{eff}}}$$
$$\text{Perturbation due to } f' = \left(\frac{N_A}{2N_T}\right)^{1/2} \frac{f'_{A,\lambda_i} - f'_{A,\lambda_j}}{Z_{\mathrm{eff}}}$$
Where
• N_A = number of anomalous scatterers, with anomalous scattering factors f’ and f”
• N_T = total number of atoms in the structure
• Z_eff = effective normal scattering power for all atoms (6.7 e for protein atoms at 2θ = 0)
Since perturbations to f’ and f” are orthogonal, the net expected signal is the root mean square of these two quantities.
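As a rough worked example, the two Crick and Magdoff expressions can be evaluated in a few lines of Python. This is a minimal sketch: the function names, the atom count and the f’ and f” values in the demo are illustrative assumptions rather than values from the slides; only Z_eff = 6.7 e is taken from the text. The dispersive function takes the magnitude of the f’ difference so that the ratio is non-negative.

```python
import math

Z_EFF_PROTEIN = 6.7  # effective scattering power per protein atom at 2-theta = 0 (from the text)


def bijvoet_ratio(n_anom, n_total, f_dprime, z_eff=Z_EFF_PROTEIN):
    """Expected anomalous signal: sqrt(N_A / 2 N_T) * 2 f'' / Z_eff."""
    return math.sqrt(n_anom / (2.0 * n_total)) * (2.0 * f_dprime) / z_eff


def dispersive_ratio(n_anom, n_total, f_prime_i, f_prime_j, z_eff=Z_EFF_PROTEIN):
    """Expected dispersive signal: sqrt(N_A / 2 N_T) * |f'(i) - f'(j)| / Z_eff."""
    return math.sqrt(n_anom / (2.0 * n_total)) * abs(f_prime_i - f_prime_j) / z_eff


if __name__ == "__main__":
    # Hypothetical inputs: 18 Se sites, 6400 non-hydrogen atoms,
    # f'' = 6 e at the peak, f' = -10 e (inflection) and -3 e (remote).
    print(f"anomalous:  {bijvoet_ratio(18, 6400, 6.0):.3f}")
    print(f"dispersive: {dispersive_ratio(18, 6400, -10.0, -3.0):.3f}")
```

Both ratios scale as the square root of the number of anomalous scatterers, so doubling the number of sites raises the expected signal by only about 40%.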
Crick and Magdoff Equation
“Pessimistic” means 60% occupancy and 60% of optimal f’ and f”
Met8p
• Space group C2
• 6 SeMet x 3 molecules (273 aa)
• 2.8Å: 6.7%
COPOX
• C2 crystal form
• 32 Se in ASU (8 SeMet x 4 molecules, 333 aa)
• 2.8Å: 9.8%
COPOX
• P3 crystal form
• 16 Se (8 SeMet x 2 molecules, 333 aa)
• 3.5Å: 10%
To determine the quality of data required to see the achievable signal you need to evaluate the required intensity over background.
• I = variance(I) = σ²(I), or σ = sqrt(I)
• If we want the signal to be larger than 2σ, and the anomalous signal is 0.03*I (3%), then we want:
• 0.03*I > 2σ(I)
– 0.03*I > 2*sqrt(I)
– sqrt(I) > 2/0.03
– I > 4444
• So, each measured intensity must be at least about 4500 counts above background.
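The same arithmetic can be repeated for other signal levels. The sketch below assumes pure counting statistics, σ(I) = sqrt(I), exactly as in the derivation above; the function name `required_counts` is a hypothetical helper introduced for illustration.

```python
def required_counts(signal_fraction, n_sigma=2.0):
    """Smallest intensity I (counts) for which signal_fraction * I exceeds n_sigma * sqrt(I).

    Assumes pure counting (Poisson) statistics, sigma(I) = sqrt(I).
    """
    return (n_sigma / signal_fraction) ** 2


if __name__ == "__main__":
    for signal in (0.03, 0.05, 0.10):
        print(f"{signal:.0%} signal: I > {required_counts(signal):.0f} counts")
```

Because the requirement grows as the inverse square of the signal, halving the expected signal quadruples the counts needed per reflection.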
...or look at Rmerge
• If the signal is 3% you need Rmerge < 0.03
• Ethan Merritt’s site suggests that you in fact require Rmerge < signal in the resolution shell where you are comparing the signal.
– This is not what I have just shown in real-life cases, which have both worked and not worked.
Analysis: Anomalous and Dispersive differences
• Differences should not be higher than the theoretical maximum
• Trend in differences should follow wavelength expectations
– Peak should have the highest anomalous signal
– Inflection vs. Remote should have the biggest dispersive difference
• A good Se data set will have anomalous differences of around 6-8%
column 1: bin number
column 5: average resolution in bin
column 6: <|f_p| - |f_r|>/<(|f_r|+|f_p|)/2> (signed difference)
column 7: <||f_p| - |f_r||>/<(|f_r|+|f_p|)/2> (absolute difference)
column 8: sqrt(<(|f_p| - |f_r|)^2>)/sqrt(1/2(<|f_r|>^2+<|f_p|>^2))
column 9: fraction of theoretically complete data

col 1   col 5    col 6    col 7    col 8    col 9
  1    7.8832   0.0031   0.0693   0.0760   0.8844
  2    4.9338  -0.0032   0.0747   0.0843   0.9116
  3    4.1407   0.0063   0.0822   0.1206   0.9023
  4    3.6961   0.0087   0.1014   0.1201   0.8801
  5    3.3963   0.0153   0.1338   0.1646   0.8259
  6    3.1757   0.0129   0.1728   0.2134   0.7640
  7    3.0063   0.0038   0.2203   0.2643   0.6660
  8    2.8659  -0.0187   0.2616   0.3096   0.4418
-----------------------------------------------------------------
#bin | resolution range | #refl |
  1    2.800    0.1100   0.1288   0.7845
CNS-Analyse.inp (copox)
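The three difference statistics defined in columns 6 to 8 can be reproduced for any pair of matched amplitude lists. The following is a minimal sketch that follows the column definitions literally; it is not the CNS code, the function name `difference_stats` is hypothetical, and the inputs are assumed to be already scaled to each other and matched reflection by reflection.

```python
import math


def difference_stats(f_a, f_b):
    """Return (signed, absolute, rms) fractional differences between two amplitude lists."""
    n = len(f_a)
    if n == 0 or n != len(f_b):
        raise ValueError("need two non-empty lists of equal length")
    mean_a = sum(f_a) / n
    mean_b = sum(f_b) / n
    mean_avg = (mean_a + mean_b) / 2.0  # <(|f_a| + |f_b|)/2>
    diffs = [a - b for a, b in zip(f_a, f_b)]
    signed = (sum(diffs) / n) / mean_avg                      # column 6
    absolute = (sum(abs(d) for d in diffs) / n) / mean_avg    # column 7
    rms = (math.sqrt(sum(d * d for d in diffs) / n)
           / math.sqrt(0.5 * (mean_a ** 2 + mean_b ** 2)))    # column 8
    return signed, absolute, rms
```

A signed difference near zero alongside a large absolute difference, as in the table, indicates that the two data sets are on the same scale and that the differences are not a systematic offset.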
CNS analyse.inp “analyse.matrix”
        f_p     f_r     f_i
f_p     0.###   0.xxx   0.xxx
f_r             0.###   0.xxx
f_i                     0.###
• Diagonal (0.###): anomalous differences
• Off-diagonal (0.xxx): dispersive differences
Overall values for resolution range 500-2.8Å.
sqrt(<(|f_p| - |f_r|)^2>)/sqrt(1/2(<|f_r|>^2+<|f_p|>^2))
= rms(ΔF_i,k) / rms(|F_i|+|F_k|)
= rms(ΔF_i,k) / rms((|F_i|+|F_k|)/2)
FliG/C  f_p     f_r     f_i
f_p     0.1169  0.093   0.1184
f_r             0.0948  0.1118
f_i                     0.1062

Met8p   f_p     f_r     f_i
f_p     0.0695  0.0536  0.0597
f_r             0.0558  0.0643
f_i                     0.0600

Copox   f_p     f_r     f_i
f_p     0.1117  0.1288  0.1804
f_r             0.1101  0.1511
f_i                     0.1408

A Local Comparison
Capsid  f_p     f_r     f_i
f_p     0.0960  0.0656  0.0531
f_r             0.0627  0.0865
f_i                     0.0832
Experimental vs. calculated
To obtain a usable signal, the data must be measured with a significantly better (lower) noise level
Copox   f_p     f_r     f_i
f_p     0.1117  0.1288  0.1804
f_r             0.1101  0.1511
f_i                     0.1408

Calc    f_p     f_r     f_i
f_p     0.060   0.037   0.004
f_r             0.040   0.041
f_i                     0.045
Friedel differences vs Centric differences
One way to analyze the noise in the data is to compare the merging statistics of the centric reflections to the Bijvoet differences.
Centric reflections are reflections whose Friedel mates are related to them through the space group’s point symmetry (Laue symmetry).
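For a two-fold axis:
$$\begin{pmatrix} h & k & l \end{pmatrix} \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & -1 \end{pmatrix} = \begin{pmatrix} -h & k & -l \end{pmatrix}$$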
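To make the two-fold example concrete, the sketch below applies the rotation to a reflection and checks whether the result coincides with the Friedel mate (-h, -k, -l). The names `TWOFOLD_B`, `apply_rotation` and `is_centric` are hypothetical helpers, and only rotational symmetry is considered (translations do not affect which reflections are centric).

```python
# Two-fold rotation about b, written to act on a row vector (h k l).
TWOFOLD_B = ((-1, 0, 0),
             (0, 1, 0),
             (0, 0, -1))


def apply_rotation(hkl, rot):
    """Multiply the row vector (h, k, l) by a 3x3 rotation matrix."""
    return tuple(sum(hkl[i] * rot[i][j] for i in range(3)) for j in range(3))


def is_centric(hkl, rotations):
    """True if some rotation maps (h, k, l) onto its Friedel mate (-h, -k, -l)."""
    friedel = tuple(-x for x in hkl)
    return any(apply_rotation(hkl, rot) == friedel for rot in rotations)
```

With this single two-fold, (-h, k, -l) equals (-h, -k, -l) only when k = 0, so the h0l reflections are the centric ones. These carry no Bijvoet difference, which is why their merging statistics estimate the noise alone.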
N 1/resol^2 Dmin Nmeas %poss Cm%poss Mlplcty AnomCmpl AnomFrc Rmeas Rmeas0 (Rsym) PCV PCV0
1 0.008 11.04 8865 99.3 99.3 16.8 100.0 100.0 0.085 0.097 0.081 0.104 0.116
2 0.016 7.80 16919 100.0 99.7 19.3 100.0 100.0 0.086 0.094 0.082 0.104 0.114
3 0.025 6.37 22195 100.0 99.9 20.0 100.0 100.0 0.114 0.125 0.109 0.141 0.152
4 0.033 5.52 26126 100.0 99.9 20.4 100.0 100.0 0.148 0.156 0.141 0.183 0.191
5 0.041 4.94 29854 100.0 99.9 20.6 100.0 100.0 0.181 0.185 0.173 0.224 0.229
6 0.049 4.51 33098 100.0 100.0 20.8 100.0 100.0 0.192 0.193 0.183 0.236 0.234
7 0.057 4.17 35794 100.0 100.0 20.8 100.0 100.0 0.227 0.227 0.217 0.282 0.285
8 0.066 3.90 38552 100.0 100.0 20.8 100.0 100.0 0.298 0.297 0.285 0.374 0.369
9 0.074 3.68 31885 99.4 99.9 16.6 99.5 100.0 0.409 0.406 0.384 0.467 0.462
10 0.082 3.49 22214 95.6 99.2 11.4 96.4 99.8 0.605 0.596 0.551 0.657 0.652
Overall 265502 99.2 99.2 18.6 99.4 100.0 0.190 0.194 0.180 0.247 0.247
"Improved R-factors for diffraction data analysis in macromolecular crystallography" Kay Diederichs & P. Andrew Karplus, Nature Structural Biology, 4, 269-275
(1997)
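For orientation, the conventional Rmerge and the multiplicity-weighted Rmeas described by Diederichs and Karplus can be sketched as follows. This is a simplified illustration rather than the SCALA implementation: each inner list is assumed to hold the repeated intensity measurements of one unique reflection, and reflections measured only once are skipped. Computing the statistic once with I+ and I- kept in separate groups and once with them pooled gives a comparison that appears to correspond to the Rmeas and Rmeas0 columns above.

```python
import math


def r_merge(groups):
    """Conventional Rmerge (Rsym): sum |I_i - <I>| / sum I_i over multiply measured reflections."""
    num = 0.0
    den = 0.0
    for g in groups:
        n = len(g)
        if n < 2:
            continue  # a single measurement carries no agreement information
        mean = sum(g) / n
        num += sum(abs(i - mean) for i in g)
        den += sum(g)
    return num / den


def r_meas(groups):
    """Rmeas: as Rmerge, but each reflection's deviations are scaled by sqrt(n / (n - 1))."""
    num = 0.0
    den = 0.0
    for g in groups:
        n = len(g)
        if n < 2:
            continue
        mean = sum(g) / n
        num += math.sqrt(n / (n - 1)) * sum(abs(i - mean) for i in g)
        den += sum(g)
    return num / den
```

The sqrt(n/(n-1)) factor removes the dependence on multiplicity, so Rmeas is never smaller than Rmerge for the same data.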
Wavelength Correlation
• How well do the anomalous pairs correlate between the different wavelengths?
– Good data should have an overall correlation between 0.6 and 0.8
– The data are only really good out to the resolution where the correlation falls to 0.3
– Diagonal: self correlation (by definition = 1)
– Off-diagonal: overall correlation
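The correlation in question is an ordinary Pearson correlation coefficient between the Bijvoet differences measured at two wavelengths. A minimal sketch, assuming the reflections have already been matched between wavelengths (the function names are hypothetical):

```python
import math


def anomalous_differences(f_plus, f_minus):
    """Bijvoet differences |F+| - |F-| for matched reflections."""
    return [p - m for p, m in zip(f_plus, f_minus)]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two lists of equal length with at least two values")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)
```

Calling `pearson` on the difference lists from two wavelengths gives one off-diagonal entry of the matrix; repeating it per resolution shell gives the correlation vs. resolution tables.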
Mannose Binding Protein – 4 Yb
In these examples the wavelength correlation is in the off-diagonal.
        Inf1    Peak    Inf2    High
Inf1    0.1346  0.9441  0.9321  0.8893
Peak            0.2654  0.9745  0.9309
Inf2                    0.1953  0.9277
High                            0.1142
Data have high correlation beyond 1.8Å. Anomalous differences are stronger than Se differences, as Yb is a stronger scatterer. The trend is correct here, with the peak wavelength giving the strongest anomalous signal. Very high correlations in anomalous signal across wavelengths.
Protein “X” – 9 Se
In these examples the correlation is in the off-diagonal.
        Inf1    Peak    High    Low
Inf1    0.0867  0.0552  0.0665  0.0637
Peak            0.0713  0.4565  0.3237
High                    0.0885  0.4966
Low                             0.0765
The correlation shows that the data are only good to 4.2Å, but the data (and these numbers) extend to 2.5Å. Also note that the peak wavelength doesn’t have the highest anomalous signal. The numbers are in the right range though. (Off-diagonal numbers are wavelength correlations.)
Met8p   f_p     f_r     f_i
f_p     1.000   0.5014  0.4237
f_r             1.000   0.3295
f_i                     1.000

Capsid  f_h     f_i     f_p
f_h     1.000   0.6604  0.6883
f_i             1.000   0.7799
f_p                     1.000
Capsid res Corr
1 3.60 0.9717
2 2.86 0.9424
3 2.50 0.8404
4 2.27 0.6977
5 2.11 0.5096
6 1.98 0.3116
7 1.88 0.2280
8 1.80 0.1440
All 1.80 0.7799
Met8p res corr
1 6.00 0.7935
2 4.76 0.4352
3 4.16 0.2919
4 3.78 0.2493
5 3.51 0.2945
6 3.30 0.2380
7 3.14 0.2833
8 3.00 0.2795
all 3.00 0.4237
Copox   f_p     f_r     f_i
f_p     1.000   0.1912  0.1008
f_r             1.000   0.0882
f_i                     1.000
Copox res corr
1 6.00 0.5331
2 4.76 0.2673
3 4.16 0.1352
4 3.78 0.1225
5 3.51 0.0414
6 3.30 -0.0049
7 3.14 -0.0257
8 3.00 0.0460
All 3.00 0.1008
FliG/C  f_p     f_r     f_e
f_p     1.000   0.3775  0.2614
f_r             1.000   0.2128
f_e                     1.000
FliG/C res Corr
1 6.60 0.6290
2 5.24 0.3715
3 4.58 0.3665
4 4.16 0.2640
5 3.86 0.2344
6 3.63 0.2181
7 3.45 0.0909
8 3.30 -0.0394
All 3.30 0.2128
• Correlation of anomalous differences at different wavelengths. SOLVE nicely puts all three wavelength pairs in a little table vs. resolution. SOLVE suggests that data with a correlation below 0.5 will contribute little to phasing, even though you will probably use data out to a correlation of 0.3.
SOLVE

Met8p
CORRELATION FOR WAVELENGTH PAIRS
DMIN   1 VS 2   1 VS 3   2 VS 3
5.40   0.89     0.89     0.84
4.05   0.76     0.73     0.64
3.78   0.63     0.56     0.44
3.58   0.54     0.46     0.34
3.38   0.47     0.43     0.29
3.24   0.39     0.31     0.21
3.11   0.26     0.24     0.16
2.97   0.22     0.21     0.10
2.84   0.18     0.14     0.09
2.70   0.09     0.08     0.02
ALL    0.52     0.46     0.35

COPOX
CORRELATION FOR WAVELENGTH PAIRS
DMIN   1 VS 2
5.60   0.65
4.20   0.36
3.92   0.26
3.71   0.16
3.50   0.13
3.36   0.13
3.22   0.09
3.08   0.07
2.94   0.04
2.80   0.06
ALL    0.23

FliG/C - 2/3
CORRELATION FOR WAVELENGTH PAIRS
DMIN   1 VS 2   1 VS 3   2 VS 3
8.00   0.85     0.71     0.69
6.00   0.67     0.50     0.42
5.60   0.58     0.36     0.23
5.30   0.51     0.22     0.24
5.00   0.46     0.31     0.10
4.80   0.54     0.31     0.15
4.60   0.54     0.38     0.25
4.40   0.54     0.33     0.28
4.20   0.46     0.26     0.23
4.00   0.38     0.32     0.13
ALL    0.58     0.38     0.28
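One simple way to turn such a table into a resolution cutoff is to walk the shells from low to high resolution and stop at the first shell that drops below the chosen threshold. The sketch below does this for the Met8p 1 VS 2 column; the function name `resolution_cutoff` and the stop-at-first-failure rule are assumptions made for illustration, and the 0.5 and 0.3 thresholds are the ones quoted in the text.

```python
def resolution_cutoff(shells, threshold):
    """Return Dmin of the last shell, going from low to high resolution, whose
    correlation is still at or above the threshold (None if the first shell fails)."""
    cutoff = None
    for dmin, corr in shells:
        if corr < threshold:
            break
        cutoff = dmin
    return cutoff


# Met8p, wavelength pair 1 vs 2: (DMIN, correlation) copied from the SOLVE output.
MET8P_1_VS_2 = [(5.40, 0.89), (4.05, 0.76), (3.78, 0.63), (3.58, 0.54), (3.38, 0.47),
                (3.24, 0.39), (3.11, 0.26), (2.97, 0.22), (2.84, 0.18), (2.70, 0.09)]

if __name__ == "__main__":
    for threshold in (0.5, 0.3):
        print(threshold, resolution_cutoff(MET8P_1_VS_2, threshold))
```

Stopping at the first failing shell is deliberately conservative: a noisy shell that creeps back above the threshold at higher resolution is not counted.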
Summary
• Know how much signal you need
– Rule of thumb is 1 Se per 17 kDa
– Two Mets were incorporated into U3S to solve the structure.
• Know the quality of the data you expect to collect.
– COPOX data might not be solvable by SeMet. Might have to return to heavier atoms.
• Analyze the data to decide on what wavelengths to use, and to what resolution.
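The 1 Se per 17 kDa rule of thumb is easy to check for a given construct. The sketch below assumes an average residue mass of 110 Da, a common approximation that is not taken from the slides; the function names are hypothetical.

```python
MEAN_RESIDUE_MASS_DA = 110.0  # assumed average residue mass, an approximation


def kda_per_se(n_residues, n_se):
    """Approximate protein mass (kDa) carried by each Se site."""
    return n_residues * MEAN_RESIDUE_MASS_DA / 1000.0 / n_se


def meets_rule_of_thumb(n_residues, n_se, limit_kda=17.0):
    """True if there is at least 1 Se per `limit_kda` kDa of protein."""
    return kda_per_se(n_residues, n_se) <= limit_kda


if __name__ == "__main__":
    # Residue and SeMet counts per molecule as quoted earlier for Met8p and COPOX.
    for name, residues, se in (("Met8p", 273, 6), ("COPOX", 333, 8)):
        print(name, round(kda_per_se(residues, se), 1), meets_rule_of_thumb(residues, se))
```

Both Met8p and COPOX pass this check comfortably, which suggests the rule is a minimum requirement on Se content and says nothing about whether the data quality is sufficient.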
Rmerge update
Improved R-factors for diffraction data analysis in macromolecular crystallography (1997) Diederichs and Karplus NSB: 4(4) p269
A discussion of the effect of redundancy on R-factors. These R-factors can also be used to compare centric merging statistics with overall merging statistics. If there is a good anomalous signal then the overall merging statistics will be higher than the centric stats. Incorporated in SCALA.