Structure solution by Direct Methods

Date post: 13-Jan-2016
Description:
Structure solution by Direct Methods. Carmelo Giacovazzo, Istituto di Cristallografia, CNR, Bari University, Italy.
Page 1: Structure solution by  Direct Methods

Structure solution by Direct Methods

Carmelo Giacovazzo

Istituto di Cristallografia, CNR, Bari University, Italy

Page 2: Structure solution by  Direct Methods
Page 3: Structure solution by  Direct Methods

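Let us answer the following questions:

crystal structure ⇒ {|F|²} ?

{|F|²} ⇒ crystal structure ?

$$
\begin{aligned}
|F_{\mathbf h}|^2 &= \sum_{j=1}^{N} f_j \exp(2\pi i\,\mathbf h\cdot\mathbf r_j)\;\sum_{j=1}^{N} f_j \exp(-2\pi i\,\mathbf h\cdot\mathbf r_j) \\
&= \sum_{i,j=1}^{N} f_i f_j \exp[2\pi i\,\mathbf h\cdot(\mathbf r_i-\mathbf r_j)] \\
&= \sum_{j=1}^{N} f_j^2 + \sum_{i\neq j=1}^{N} f_i f_j \exp[2\pi i\,\mathbf h\cdot(\mathbf r_i-\mathbf r_j)]
\end{aligned}
$$

As a consequence:

$$\rho(\mathbf r) \Leftrightarrow \{|F|^2\}$$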

Page 4: Structure solution by  Direct Methods

A third question: structure ?

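$$
\begin{aligned}
F_{\mathbf h} &= \sum_{j=1}^{N} f_j \exp(2\pi i\,\mathbf h\cdot\mathbf r_j) \\
F'_{\mathbf h} &= \sum_{j=1}^{N} f_j \exp(2\pi i\,\mathbf h\cdot\mathbf r'_j)
 = \sum_{j=1}^{N} f_j \exp[2\pi i\,\mathbf h\cdot(\mathbf r_j-\mathbf X_0)] \\
&= \exp(-2\pi i\,\mathbf h\cdot\mathbf X_0)\sum_{j=1}^{N} f_j \exp(2\pi i\,\mathbf h\cdot\mathbf r_j)
 = F_{\mathbf h}\exp(-2\pi i\,\mathbf h\cdot\mathbf X_0) \\
F'_{\mathbf h} &= |F_{\mathbf h}|\exp(i\varphi'_{\mathbf h})
 = |F_{\mathbf h}|\exp[i(\varphi_{\mathbf h}-2\pi\,\mathbf h\cdot\mathbf X_0)]
\end{aligned}
$$

[Figure: the origin O is moved to O′ by the shift X₀; atom j lies at r_j from O and at r′_j = r_j − X₀ from O′.]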

Page 5: Structure solution by  Direct Methods

A fourth basic question

How can we derive phases from diffraction moduli? This seems contradictory: indeed

Phase values depend on the origin chosen by the user; moduli are independent of the user.

The moduli are structure invariants; the phases are not structure invariants.

Evidently, from the moduli we can derive information only on those combinations of phases (if they exist) which are structure invariants.

Page 6: Structure solution by  Direct Methods

The simplest invariant : the triplet invariant

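Use the relation $F'_{\mathbf h} = F_{\mathbf h}\exp(-2\pi i\,\mathbf h\cdot\mathbf X_0)$

to check that the invariant $F_{\mathbf h}F_{\mathbf k}F_{-\mathbf h-\mathbf k}$ does not depend on the origin:

$$
\begin{aligned}
F'_{\mathbf h}F'_{\mathbf k}F'_{-\mathbf h-\mathbf k}
&= |F_{\mathbf h}|\exp[i(\varphi_{\mathbf h}-2\pi\,\mathbf h\cdot\mathbf X_0)]\;
   |F_{\mathbf k}|\exp[i(\varphi_{\mathbf k}-2\pi\,\mathbf k\cdot\mathbf X_0)] \\
&\quad\times |F_{-\mathbf h-\mathbf k}|\exp\{i[\varphi_{-\mathbf h-\mathbf k}+2\pi(\mathbf h+\mathbf k)\cdot\mathbf X_0]\} \\
&= |F_{\mathbf h}F_{\mathbf k}F_{-\mathbf h-\mathbf k}|\exp[i(\varphi_{\mathbf h}+\varphi_{\mathbf k}+\varphi_{-\mathbf h-\mathbf k})]
 = F_{\mathbf h}F_{\mathbf k}F_{-\mathbf h-\mathbf k}.
\end{aligned}
$$

The sum $(\varphi_{\mathbf h}+\varphi_{\mathbf k}+\varphi_{-\mathbf h-\mathbf k})$ is called the triplet phase invariant.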
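The origin independence of the triplet can also be checked numerically. The following is a minimal sketch, assuming a hypothetical structure of five equal point atoms with a constant scattering factor and an arbitrary origin shift X₀; the function names are illustrative only.

```python
import cmath
import math
import random

def structure_factor(h, atoms):
    """F_h = sum_j f_j exp(2*pi*i h.r_j); atoms is a list of (f, (x, y, z))."""
    return sum(f * cmath.exp(2j * math.pi * sum(hi * xi for hi, xi in zip(h, r)))
               for f, r in atoms)

def shift_origin(atoms, x0):
    """Refer the atoms to a new origin at x0: r_j' = r_j - X0."""
    return [(f, tuple(xi - x0i for xi, x0i in zip(r, x0))) for f, r in atoms]

def triplet(h, k, atoms):
    """Product F_h F_k F_{-h-k}; its indices sum to zero."""
    minus_hk = tuple(-(a + b) for a, b in zip(h, k))
    return (structure_factor(h, atoms) * structure_factor(k, atoms)
            * structure_factor(minus_hk, atoms))

rng = random.Random(0)                       # fixed seed, hypothetical toy structure
atoms = [(6.0, (rng.random(), rng.random(), rng.random())) for _ in range(5)]
x0 = (0.13, 0.27, 0.41)                      # arbitrary origin shift (assumption)
shifted = shift_origin(atoms, x0)
h, k = (1, 2, 0), (2, -1, 3)

# A single structure factor changes by the factor exp(-2*pi*i h.X0) ...
single_change = abs(structure_factor(h, shifted) - structure_factor(h, atoms))
# ... while the triplet product is unchanged within rounding error.
triplet_change = abs(triplet(h, k, shifted) - triplet(h, k, atoms))
print("change in F_h:", single_change)
print("change in F_h F_k F_-h-k:", triplet_change)
```

The modulus of each structure factor is unchanged by the shift, while its phase moves by −2π h·X₀; only in the product do the three phase shifts cancel.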

Page 7: Structure solution by  Direct Methods

Structure invariants

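Any invariant satisfies the condition that the sum of the indices is zero:

doublet invariant: $F_{\mathbf h}F_{-\mathbf h} = |F_{\mathbf h}|^2$

triplet invariant: $F_{\mathbf h}F_{\mathbf k}F_{-\mathbf h-\mathbf k}$

quartet invariant: $F_{\mathbf h}F_{\mathbf k}F_{\mathbf l}F_{-\mathbf h-\mathbf k-\mathbf l}$

quintet invariant: $F_{\mathbf h}F_{\mathbf k}F_{\mathbf l}F_{\mathbf m}F_{-\mathbf h-\mathbf k-\mathbf l-\mathbf m}$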

Page 8: Structure solution by  Direct Methods

The prior information we can use for deriving the phase estimates may be summarised as follows:

1) atomicity: the electron density is concentrated in atoms:

$$\rho(\mathbf r) = \sum_{j=1}^{N} \rho_{aj}(\mathbf r-\mathbf r_j)$$

[Figure: atomic densities ρ_a1 and ρ_a2 centred at r₁ and r₂.]

2) positivity of the electron density: ρ(r) > 0 ⇒ f > 0

3) uniform distribution of the atoms in the unit cell.

Page 9: Structure solution by  Direct Methods

The Wilson statistics

• Under the above conditions Wilson (1942, 1949) derived the structure factor statistics. The main results were:

$$\langle |F_{\mathbf h}|^2 \rangle = \sum_{j=1}^{N} f_j^2 \qquad (1)$$

• Eq. (1) is:

• a) resolution dependent ($f_j$ varies with θ),

• b) temperature dependent: $f_j = f_j^0 \exp(-B_j \sin^2\theta/\lambda^2)$

• From eq. (1) the concept of normalized structure factor arises:

$$E_{\mathbf h} = F_{\mathbf h} \Big/ \Big(\sum_{j=1}^{N} f_j^2\Big)^{1/2}$$

Page 10: Structure solution by  Direct Methods

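The Wilson Statistics

• |E|-distributions:

$$P_1(|E|) = 2|E|\exp(-|E|^2) \qquad \text{(non-centrosymmetric case)}$$

and

$$P_{\bar 1}(|E|) = \sqrt{\frac{2}{\pi}}\,\exp(-|E|^2/2) \qquad \text{(centrosymmetric case)}$$

with $\langle |E|^2 \rangle = 1$ in both cases. The statistics may be used to evaluate the average thermal factor and the absolute scale factor.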
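The normalization ⟨|E|²⟩ = 1 can be illustrated on a toy model. The sketch below assumes a hypothetical non-centrosymmetric structure of 40 equal point atoms placed at random (so that f_j = 1 and Σf_j² = N); it compares the sample averages with the values expected from P₁(|E|), for which ⟨|E|⟩ = √π/2 ≈ 0.886.

```python
import cmath
import math
import random

rng = random.Random(1)          # fixed seed so the sketch is reproducible
N_ATOMS = 40                    # hypothetical equal-atom structure (assumption)
positions = [(rng.random(), rng.random(), rng.random()) for _ in range(N_ATOMS)]

def normalized_E(h, k, l):
    """E_hkl = F_hkl / sqrt(sum_j f_j^2) with unit point atoms, f_j = 1."""
    F = sum(cmath.exp(2j * math.pi * (h * x + k * y + l * z))
            for x, y, z in positions)
    return F / math.sqrt(N_ATOMS)

# Reflections with l > 0 only, so that no Friedel pair is counted twice.
moduli = [abs(normalized_E(h, k, l))
          for h in range(-6, 7) for k in range(-6, 7) for l in range(1, 7)]

mean_E2 = sum(m * m for m in moduli) / len(moduli)
mean_E = sum(moduli) / len(moduli)
print("reflections:", len(moduli))
print("<|E|^2> =", round(mean_E2, 3), "(Wilson value 1)")
print("<|E|>   =", round(mean_E, 3), "(acentric value", round(math.sqrt(math.pi) / 2, 3), ")")
```

The agreement is statistical, not exact: with about a thousand reflections the averages fluctuate by a few percent around the Wilson values.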

Page 11: Structure solution by  Direct Methods

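The Wilson plot

$$
\begin{aligned}
F_{\mathbf h} &= \sum_{j=1}^{N} f_j \exp(2\pi i\,\mathbf h\cdot\mathbf r_j)
 = \sum_{j=1}^{N} f_j^0 \exp(-B_j \sin^2\theta/\lambda^2)\exp(2\pi i\,\mathbf h\cdot\mathbf r_j) \\
&\cong \exp(-B \sin^2\theta/\lambda^2)\sum_{j=1}^{N} f_j^0 \exp(2\pi i\,\mathbf h\cdot\mathbf r_j)
 = \exp(-B s^2)\,F_{\mathbf h}^0, \qquad s^2 = \sin^2\theta/\lambda^2
\end{aligned}
$$

$$|F_{\mathbf h}|^2_{obs} = K|F_{\mathbf h}|^2 = K|F_{\mathbf h}^0|^2 \exp(-2Bs^2)$$

$$\langle |F_{\mathbf h}|^2_{obs} \rangle = K\langle |F_{\mathbf h}^0|^2 \rangle \exp(-2Bs^2) = K\,\Sigma_s^0 \exp(-2Bs^2), \qquad \Sigma_s^0 = \sum_{j=1}^{N} (f_j^0)^2$$

$$\ln\!\left(\frac{\langle |F_{\mathbf h}|^2_{obs} \rangle}{\Sigma_s^0}\right) = \ln K - 2Bs^2$$

[Figure: Wilson plot of y = ln(⟨|F_h|²_obs⟩/Σ_s⁰) against x = s²; the points are fitted by a straight line with intercept ln K and slope −2B.]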
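The straight-line relation of the Wilson plot translates directly into a least-squares fit. The following is a minimal sketch, assuming the convention |F|²_obs = K|F|² used above; the function name `wilson_fit` and the synthetic shell data (including the fall-off chosen for Σ_s⁰) are illustrative assumptions, not measured values.

```python
import math

def wilson_fit(s2, mean_I_obs, sigma0):
    """Fit y = ln(<I_obs>/Sigma0) against x = s^2 by ordinary least squares.

    Returns (K, B): the intercept is ln K and the slope is -2B.
    """
    xs = list(s2)
    ys = [math.log(i / s0) for i, s0 in zip(mean_I_obs, sigma0)]
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return math.exp(intercept), -slope / 2.0

# Synthetic, noise-free resolution shells (assumed values for illustration).
K_true, B_true = 2.5, 3.0
s2 = [0.02 * i for i in range(1, 11)]                  # s^2 = sin^2(theta)/lambda^2
sigma0 = [100.0 * math.exp(-4.0 * x) for x in s2]      # stand-in for sum_j (f_j^0)^2
mean_I = [K_true * s0 * math.exp(-2.0 * B_true * x) for x, s0 in zip(s2, sigma0)]

K_fit, B_fit = wilson_fit(s2, mean_I, sigma0)
print("K =", K_fit, " B =", B_fit)
```

With real data the shell averages scatter about the line, especially at low resolution, so the fitted K and B are estimates only.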

Page 12: Structure solution by  Direct Methods
Page 13: Structure solution by  Direct Methods

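The Cochran formula

$$\Phi_{\mathbf h,\mathbf k} = \varphi_{\mathbf h}+\varphi_{\mathbf k}+\varphi_{-\mathbf h-\mathbf k} = \varphi_{\mathbf h}+\varphi_{\mathbf k}-\varphi_{\mathbf h+\mathbf k}$$

$$P(\Phi_{\mathbf h\mathbf k}) \approx [2\pi I_0(G)]^{-1}\exp(G\cos\Phi_{\mathbf h\mathbf k})$$

where $G = 2|E_{\mathbf h}E_{\mathbf k}E_{\mathbf h+\mathbf k}|/N^{1/2}$ and $I_0$ is the modified Bessel function of order zero.

Accordingly:

$\varphi_{\mathbf h}+\varphi_{\mathbf k}-\varphi_{\mathbf h+\mathbf k} \approx 0$, with $G = 2|E_{\mathbf h}E_{\mathbf k}E_{\mathbf h+\mathbf k}|/N^{1/2}$

$\varphi_{\mathbf h}-\varphi_{\mathbf k}-\varphi_{\mathbf h-\mathbf k} \approx 0$, with $G = 2|E_{\mathbf h}E_{\mathbf k}E_{\mathbf h-\mathbf k}|/N^{1/2}$

$\varphi_{\mathbf h} \approx \varphi_{\mathbf k}+\varphi_{\mathbf h-\mathbf k}$, with $G = 2|E_{\mathbf h}E_{\mathbf k}E_{\mathbf h-\mathbf k}|/N^{1/2}$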
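To see how sharply the Cochran distribution peaks at Φ = 0, it can be evaluated directly. The sketch below is an illustration under assumptions: the |E| values and the number of atoms are hypothetical, and I₀ is computed from its power series because the Python standard library has no Bessel functions.

```python
import math

def bessel_i0(x, terms=60):
    """Modified Bessel function I0(x) from the series sum_k (x^2/4)^k / (k!)^2."""
    q = x * x / 4.0
    term, total = 1.0, 1.0
    for k in range(1, terms):
        term *= q / (k * k)
        total += term
    return total

def cochran_G(E_h, E_k, E_hk, n_atoms):
    """Reliability parameter G = 2|E_h E_k E_h+k| / sqrt(N) for N equal atoms."""
    return 2.0 * abs(E_h * E_k * E_hk) / math.sqrt(n_atoms)

def cochran_pdf(phi, G):
    """P(Phi) = exp(G cos Phi) / (2*pi*I0(G)), Phi in radians."""
    return math.exp(G * math.cos(phi)) / (2.0 * math.pi * bessel_i0(G))

# Hypothetical strong triplet in a 50-atom structure.
G = cochran_G(2.5, 2.2, 2.0, 50)

# Numerical check that the density integrates to 1 over one period.
n_steps = 2000
step = 2.0 * math.pi / n_steps
integral = sum(cochran_pdf(-math.pi + i * step, G) for i in range(n_steps)) * step

print("G =", round(G, 3))
print("P(0) / P(pi) =", round(cochran_pdf(0.0, G) / cochran_pdf(math.pi, G), 1))
print("integral over (-pi, pi) =", round(integral, 6))
```

Because G scales as N^(-1/2), the same |E| values give a much flatter distribution for a large structure, which is why triplets become less reliable as the number of atoms grows.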

Page 14: Structure solution by  Direct Methods

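The tangent formula

A reflection can enter into several triplets. Accordingly

$\varphi_{\mathbf h} \approx \varphi_{\mathbf k_1}+\varphi_{\mathbf h-\mathbf k_1} = \theta_1$ with $P_1(\varphi_{\mathbf h})$, $G_1 = 2|E_{\mathbf h}E_{\mathbf k_1}E_{\mathbf h-\mathbf k_1}|/N^{1/2}$

$\varphi_{\mathbf h} \approx \varphi_{\mathbf k_2}+\varphi_{\mathbf h-\mathbf k_2} = \theta_2$ with $P_2(\varphi_{\mathbf h})$, $G_2 = 2|E_{\mathbf h}E_{\mathbf k_2}E_{\mathbf h-\mathbf k_2}|/N^{1/2}$

……

$\varphi_{\mathbf h} \approx \varphi_{\mathbf k_n}+\varphi_{\mathbf h-\mathbf k_n} = \theta_n$ with $P_n(\varphi_{\mathbf h})$, $G_n = 2|E_{\mathbf h}E_{\mathbf k_n}E_{\mathbf h-\mathbf k_n}|/N^{1/2}$

Then

$$P(\varphi_{\mathbf h}) \approx \prod_j P_j(\varphi_{\mathbf h}) \approx L^{-1}\prod_j \exp[G_j\cos(\varphi_{\mathbf h}-\theta_j)] = L^{-1}\exp[\alpha_{\mathbf h}\cos(\varphi_{\mathbf h}-\theta_{\mathbf h})]$$

where

$$\tan\theta_{\mathbf h} = \frac{\sum_j G_j\sin\theta_j}{\sum_j G_j\cos\theta_j} = \frac{T}{B}, \qquad \alpha_{\mathbf h} = (T^2+B^2)^{1/2}$$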
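The tangent formula amounts to a weighted vector sum of the individual triplet estimates. A minimal sketch follows; the function name and the example values are illustrative assumptions. It uses `atan2` so that the quadrant of θ_h is resolved from the signs of T and B.

```python
import math

def tangent_formula(triplets):
    """Combine triplet estimates of one phase.

    triplets: iterable of (G_j, theta_j) with theta_j = phi_kj + phi_(h-kj) in radians.
    Returns (theta_h, alpha_h): the most probable phase and its reliability.
    """
    triplets = list(triplets)
    T = sum(G * math.sin(theta) for G, theta in triplets)
    B = sum(G * math.cos(theta) for G, theta in triplets)
    return math.atan2(T, B), math.hypot(T, B)

# Three hypothetical triplets: two agree roughly, one disagrees weakly.
estimates = [(3.0, 0.40), (2.0, 0.55), (0.5, 2.80)]
theta_h, alpha_h = tangent_formula(estimates)
print("theta_h =", round(theta_h, 3), "rad, alpha_h =", round(alpha_h, 3))
```

When the individual estimates agree, α_h approaches the sum of the G_j; when they scatter, the vectors partly cancel and α_h, hence the reliability of the phase, is small.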

Page 15: Structure solution by  Direct Methods

A geometric interpretation of $\alpha_{\mathbf h}$

Page 16: Structure solution by  Direct Methods

The random starting approach

To apply the tangent formula we need to know one or more pairs $(\varphi_{\mathbf k}+\varphi_{\mathbf h-\mathbf k})$. Where can such information be found?

The simplest approach is the random starting approach. Random phases are assigned to a chosen set of reflections. The tangent formula should drive these phases to the correct values. The procedure is cyclic (up to convergence).

How can the correct solution be recognized? Figures of merit may or may not be applicable.

Page 17: Structure solution by  Direct Methods

Tangent cycles

• φ1 → φ’1 → φ’’1 → … → φc1

• φ2 → φ’2 → φ’’2 → … → φc2

• φ3 → φ’3 → φ’’3 → … → φc3

• …

• φn → φ’n → φ’’n → … → φcn
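The cyclic refinement from random starting phases can be sketched on a toy problem. Everything below is an assumption made for illustration: a one-dimensional cell with six equal point atoms, reflections h = 1..10, Friedel's law for negative indices, in-place updating of each phase, and a simple "largest shift" convergence test. It is not the procedure of any specific program, and convergence does not guarantee the correct solution.

```python
import cmath
import math
import random

def tangent_cycles(E_mod, n_atoms, rng, max_cycles=50, tol=1e-3):
    """Refine random starting phases with the tangent formula (1D toy model).

    E_mod[h-1] = |E_h| for h = 1..H. Returns (phases, cycles_used).
    """
    H = len(E_mod)
    phi = {h: rng.uniform(-math.pi, math.pi) for h in range(1, H + 1)}

    def modulus(h):
        return E_mod[abs(h) - 1]

    def phase(h):
        return phi[h] if h > 0 else -phi[-h]          # Friedel law

    for cycle in range(1, max_cycles + 1):
        largest_shift = 0.0
        for h in range(1, H + 1):
            T = B = 0.0
            for k in range(-H, H + 1):
                if k == 0 or k == h or abs(h - k) > H:
                    continue                           # indices outside the data set
                G = 2.0 * modulus(h) * modulus(k) * modulus(h - k) / math.sqrt(n_atoms)
                theta = phase(k) + phase(h - k)
                T += G * math.sin(theta)
                B += G * math.cos(theta)
            new_phase = math.atan2(T, B)
            # Phase shift wrapped into (-pi, pi] before comparing with tol.
            shift = abs(cmath.phase(cmath.exp(1j * (new_phase - phi[h]))))
            largest_shift = max(largest_shift, shift)
            phi[h] = new_phase
        if largest_shift < tol:
            break
    return [phi[h] for h in range(1, H + 1)], cycle

# Hypothetical toy structure and its |E| values.
structure_rng = random.Random(3)
n_atoms, H = 6, 10
xs = [structure_rng.random() for _ in range(n_atoms)]
E_mod = [abs(sum(cmath.exp(2j * math.pi * h * x) for x in xs)) / math.sqrt(n_atoms)
         for h in range(1, H + 1)]

# Multisolution approach: several random starts, each refined independently.
trials = [tangent_cycles(E_mod, n_atoms, random.Random(seed)) for seed in range(5)]
for seed, (phases, cycles) in enumerate(trials):
    print("seed", seed, "stopped after", cycles, "cycles")
```

Each trial yields one candidate phase set; in practice a figure of merit would then be needed to rank the trials, as noted above.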

Page 18: Structure solution by  Direct Methods

• Ab initio phasing

• SIR2011 is able to solve
  - small-size structures (up to 80 atoms in the a.u.);
  - medium-size structures (up to 200);
  - large-size structures (no upper limit).

• It uses Patterson deconvolution techniques (multiple implication transformations) as well as Direct Methods to obtain a starting set of phases. They are extended and refined via electron density modification techniques.

Page 19: Structure solution by  Direct Methods

The VLD (vive la difference) method

• Historically, the difference Fourier synthesis is calculated via the Fourier coefficients

$$(|F|-|F_p|)\exp(i\varphi_p)$$

• In a modern way it is calculated via the coefficients

$$(m|F|-D|F_p|)\exp(i\varphi_p)$$

• where the parameters m and D take into account the correlation between the model and the target structure.

• m = D = 0 for an uncorrelated model; m = D = 1 when the model and the target are identical.

Page 20: Structure solution by  Direct Methods

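• According to our recent publication (2010), the best coefficient for a difference Fourier synthesis carries the model phase, $\exp(i\varphi_p)$, and is the sum of two terms:

• the classical coefficient, $(m|F|-D|F_p|)$, written $(mR-\sigma_A R_p)$ in normalized form;

• the flipping term, which is built from $|F_p|$, $D$, $e$ and $\sigma_A^2$.

[The explicit expressions of the best coefficient and of the flipping term are garbled in the transcript and are not reproduced here.]

• The flipping term is dominant when the model is poor and goes to zero when the model coincides with the target.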

Page 21: Structure solution by  Direct Methods

• A difference Fourier synthesis calculated with such coefficients will have:

• large negative minima where the atoms of the model structure are in wrong positions;

• medium positive maxima at the positions of the target-structure atoms that are not part of the model.

Page 22: Structure solution by  Direct Methods

PDB     CORRqST  DPHIST  CORRqNEW  DPHIqNEW
1kf3     0.41      63     0.63       51
1zs0     0.38      64     0.61       54
1a6m     0.29      64     0.61       56
1s31     0.33      63     0.60       55
2p0g     0.30      66     0.60       59
2sar     0.36      61     0.62       53
1lys     0.26      64     0.58       55
1cgn     0.19      65     0.62       53
6ebx     0.25      63     0.62       52
2bpy     0.12      66     0.64       52
1yxa     0.18      62     0.64       50
2iff     0.18      69     0.62       57
6ebx’    0.00      70     0.68       46
9pti     0.00      72     0.69       48
9pti’   -0.02      73     0.68       48

Page 23: Structure solution by  Direct Methods

• Steps of the VLD algorithm (Burla, Giacovazzo, Polidori (2010), J. Appl. Cryst. 43, 825-836)

• A random E-electron density map is calculated.

• The model structure $\rho_p$ is obtained by selecting the 2.5% of pixels with the largest intensity.

• A difference electron density map $\rho_q$ is calculated via the best coefficients.

• $\rho_q$ is modified (by selecting the 4% of pixels with the largest positive values and the 4% of pixels with the largest negative values) and added to $\rho_p$ to obtain a new estimate of $\rho$:

$$\rho_{new} = \rho_p + \rho_q$$

Page 24: Structure solution by  Direct Methods

• The corresponding E-map is calculated and submitted to cycles of EDM. In each cycle the parameter $\sigma_A$ is updated.

• At the end a new model is obtained and the next iteration starts.
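The pixel-selection steps of the algorithm can be sketched on flat lists of map values. This is an illustration under stated assumptions, not the published implementation: the maps are plain Python lists, unselected pixels are set to zero, and the difference map ρ_q is passed in directly, whereas in the real algorithm it is computed from the model by a Fourier synthesis with the best coefficients. All function names are hypothetical.

```python
def keep_largest(values, fraction):
    """Keep the given fraction of pixels with the largest values; zero the rest."""
    n_keep = max(1, round(fraction * len(values)))
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    kept = set(order[:n_keep])
    return [v if i in kept else 0.0 for i, v in enumerate(values)]

def keep_extremes(values, fraction):
    """Keep the given fraction of most positive and of most negative pixels."""
    n_keep = max(1, round(fraction * len(values)))
    order = sorted(range(len(values)), key=lambda i: values[i])
    kept = set(order[:n_keep]) | set(order[-n_keep:])
    return [v if i in kept else 0.0 for i, v in enumerate(values)]

def vld_update(rho, rho_q, model_fraction=0.025, diff_fraction=0.04):
    """One map update: rho_new = rho_p + modified rho_q."""
    rho_p = keep_largest(rho, model_fraction)          # model: strongest 2.5% of pixels
    rho_q_mod = keep_extremes(rho_q, diff_fraction)    # 4% positive and 4% negative tails
    return [p + q for p, q in zip(rho_p, rho_q_mod)]

# Tiny synthetic maps of 50 pixels (values chosen only to make the result easy to read).
rho = [float(i) for i in range(50)]
rho_q = [i - 24.5 for i in range(50)]
rho_new = vld_update(rho, rho_q)
print([(i, v) for i, v in enumerate(rho_new) if v != 0.0])
```

The negative pixels retained from ρ_q are what allows misplaced model density to be removed, while the positive ones add density that the model lacks.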

Page 25: Structure solution by  Direct Methods

• Work in progress

• We have combined VLD with RELAX.

• RELAX is a procedure for shifting a model from a wrong position into the correct one.

• All the calculations are transparent to the user.

• The results are described below.

Page 26: Structure solution by  Direct Methods

SMALL STRUCTURES (up to 80 atoms in the asymmetric unit)

• 33 test structures

• Solution found in 7 sec (on average);

• 1.1 seeds necessary for finding the solution.

Page 27: Structure solution by  Direct Methods

code     NAS  seedR  fFOM2    code     NAS  seedR  fFOM2
WINTER    90   2 R    3.9     C8NEw    133   1 R    8.0
DEXTK     92   1 R    5.1     BULGE    138   1 R    7.0
AMPHIS   102  10      7.0     TA       142   7 R   12.0
BCDIMP   102   2      8.5     GNA      146   1 R    6.6
TENSIN   107   1 R    7.5     LASSO    146   1 R    6.0
ERGO21   111   1 R   11.4     TP       162   2      7.2
CEPHBc   112   2     15.5     HELIX    164   2 R    5.7
CEPHBa   113  11 R    8.0     CAHK     168   1 R    3.4
PNIB     114   1      2.3     TBn      190   3      7.8
DASCO6   124   1 R    7.7     DODE     205   2      2.3
MOR59    126   6      1.1     CYCLA    219   1 R    2.7
MACRO    129   2 R   10.0     TRIP04   223   6 R    7.4

Page 28: Structure solution by  Direct Methods

• For the medium-size structures listed above:

• <RES> = 0.16,

• <T> = 4.6 min,

• <Δφ> = 16°,

• <seed> = 2.8

Page 29: Structure solution by  Direct Methods

pdb    NAS  seedS  fFOM2    pdb    NAS   seedS  fFOM2
1AA5   200    2     4.0     1IGD    468   ---    ---
1SH0   207   48             1C75    528     7    7.7
1HHY   208    4     6.4     1B0Y    593    16    3.1
1P9I   213   19     1.2     1CTJ    672     2    2.4
1ICK   250    7     7.1     2PVB    814   ---    ---
1AOM   255    1     5.4     1A6N    892   ---    ---
2BF9   305    1     1.4     1D4T    911   ---    ---
2ERL   305    2     3.3     3PYP    983   ---    ---
1A7Z   307    1     5.1     1MFM   1106    13    5.1
1A1Z   313    1     1.1     1A6G   1224   ---    ---
1CBN   330    1     1.9     1CKU   1229     2    2.3
1HHZ   354   31             1SWZ   1254   ---    ---
1BX7   365    3     1.2     1I76   1315   ---    ---
2FDN   373    8     6.5     1JM1   1496   ---    ---
8RXN   392    1     2.3     1FY2   1685   ---    ---
1IRO   411    1     5.0     1NLS   1810   ---    ---
1IRN   411    1     2.9     1GY0   2006     3    1.6
1NKD   439    9     5.5

Page 30: Structure solution by  Direct Methods

• For the solved proteins (12 out of 35 remain unsolved in default mode, 10 or fewer if a larger number of seeds is used):

• <RES> = 0.28,

• <fFOM2> = 3.75,

• <T> = 0.5 hours,

• <Δφ> = 21°,

• <seed> = 2.4

