
Phenix Tools for Cryo-EM: Refinement and Validation

Date post: 28-Jan-2021
Phenix Tools for Cryo-EM: Refinement and Validation. Pavel Afonine, Phenix software developer, LBNL, Berkeley, California, USA. July 11, 2019, Cryo-EM Workshop, Stanford.
Transcript
  • Phenix Tools for Cryo-EM: Refinement and Validation

    Pavel Afonine

    Phenix software developer

    LBNL, Berkeley, California, USA

    July 11, 2019, Cryo-EM Workshop, Stanford

  • Cryo-EM tools in Phenix

    Starting map, Map improvement, Map symmetry

    Map manipulation

    Extract unique part

    Docking, model building, Refinement, Validation

    Complete set of tools for cryo-EM structure solution: from initial reconstruction to final validated model

  • Structure refinement

    Initial (poor) model

    Improved (refined) model

    Refinement

  • Refinement tools in Phenix

    Initial model, Experimental data

    Score

    Modify model parameters

    Improved model

    A priori knowledge

    Refinement: optimization process of fitting model to experimental data

    Crystallography: phenix.refine, available since 2005

    Cryo-EM: phenix.real_space_refine, available since 2013

  • Automated model refinement: phenix.real_space_refine

    •  Direct refinement against the map
    •  No Fourier space involved

  • Automated model refinement: phenix.real_space_refine

    •  Best model-map fit. Any map: X-ray, neutron, EM. Any resolution
    •  Refined models: no poor validation metrics
    •  Fast (minutes to a few hours, not days or many hours)
    •  Make use of multiple CPUs: as many as available
    •  Large convergence radius
    •  Easy to use: map and model in, refined model out
    •  Accessible: no special hardware requirements

  • Real-space refinement

    •  PDB: 5VKU, 3720 chains | 1,872,060 residues | 14,917,620 atoms

    •  Calculate one set of Fcalc: never finished on my laptop (ran out of memory)

    •  Calculate real-space refinement target: several seconds

    $T = -\sum_{\mathrm{atoms}} \rho(x_{\mathrm{atom}}, y_{\mathrm{atom}}, z_{\mathrm{atom}})$
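The target on this slide is just the negated sum of map values at the atomic positions, which is why it is cheap to evaluate. The following is a minimal sketch of that sum, not the Phenix implementation: it assumes the map is a nested list sampled on a unit-spaced grid, that atom coordinates are already expressed in grid units, and that trilinear interpolation is an acceptable way to read the map between grid points. The names `trilinear` and `real_space_target` are hypothetical.

```python
import math


def trilinear(grid, x, y, z):
    """Trilinearly interpolate grid[i][j][k] at fractional grid index (x, y, z).

    Assumes 0 <= x, y, z <= (grid size - 1); no periodic wrapping is done.
    """
    i, j, k = math.floor(x), math.floor(y), math.floor(z)
    fx, fy, fz = x - i, y - j, z - k
    value = 0.0
    for di, wx in ((0, 1.0 - fx), (1, fx)):
        for dj, wy in ((0, 1.0 - fy), (1, fy)):
            for dk, wz in ((0, 1.0 - fz), (1, fz)):
                weight = wx * wy * wz
                if weight == 0.0:
                    continue  # skip corners that do not contribute (also avoids stepping off the upper edge)
                value += weight * grid[i + di][j + dj][k + dk]
    return value


def real_space_target(grid, atoms):
    """T = -sum over atoms of rho(x_atom, y_atom, z_atom); lower is better."""
    return -sum(trilinear(grid, x, y, z) for (x, y, z) in atoms)


# Toy 2x2x2 map whose value is a linear function of the grid index.
toy_map = [[[i + 2 * j + 4 * k for k in range(2)] for j in range(2)] for i in range(2)]
toy_atoms = [(0.5, 0.5, 0.5), (1.0, 0.0, 0.0)]
print(real_space_target(toy_map, toy_atoms))
```

The cost grows linearly with the number of atoms and needs no structure-factor calculation, which matches the contrast drawn on the slide.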

  • Automated model refinement: phenix.real_space_refine

    Inputs

    Refinement macro-cycle:

    Rigid body

    Model idealization

    Morphing

    Weight calculation

    Minimization

    Refine NCS operators

    Simulated annealing

    Rotamer fitting

    Refined model, Trajectory, Log file

  • Morphing

    Start model before refinement | After phenix.real_space_refine

  • Model regularization

    versus

  • Model regularization

    •  Goal
       •  Eliminate all geometry outliers
       •  Move atoms as little as possible from start position
       •  Idealized model within convergence of refinement

    •  Why?
       •  Refinement may not be able to refine a model with lots of bad geometries
       •  Low-res data cannot validate geometry outliers

  • Model regularization

    Before and after idealization: RMSD between two models less than 1.5 Å
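The RMSD quoted here measures how far idealization moved the atoms. A minimal sketch of that comparison follows, assuming both models list the same atoms in the same order and are already in the same coordinate frame (no superposition is performed, since both models sit in the same map). The function name `rmsd` is hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally ordered coordinate lists (in Å)."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))


before = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
after = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]  # every atom shifted by 1 Å along z
print(rmsd(before, after))
```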

  • Model regularization

    Before…

    …after model idealization

  • Restraints

    1 Å | 2 Å | 3 Å

    •  The lower the resolution, the less detailed the map
    •  Need extra information to keep correct geometry during refinement

    $T = T_{DATA} + w\,T_{RESTRAINTS}$

    $T_{RESTRAINTS} = T_{BOND} + T_{ANGLE} + T_{DIHEDRAL} + T_{PLANARITY} + T_{NONBONDED} + T_{CHIRALITY}$

    $T_{BOND} = \sum_{\text{all bonded pairs}} w\,(d_{ideal} - d_{model})^2$
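To make the restraint formulas concrete, here is a minimal sketch of the bond term and of the combined target. It is an illustration under stated assumptions, not Phenix code: each bond is given as a hypothetical tuple `(i, j, d_ideal, weight)`, coordinates are plain tuples in Å, and the data term is passed in as a precomputed number.

```python
import math


def bond_target(coords, bonds):
    """T_BOND = sum over bonded pairs of w * (d_ideal - d_model)**2."""
    total = 0.0
    for i, j, d_ideal, weight in bonds:
        d_model = math.dist(coords[i], coords[j])  # current bond length in the model
        total += weight * (d_ideal - d_model) ** 2
    return total


def total_target(t_data, t_restraints, w):
    """T = T_DATA + w * T_RESTRAINTS, with w balancing data against geometry."""
    return t_data + w * t_restraints


coords = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
bonds = [(0, 1, 2.0, 4.0)]  # one bond, ideal length 2.0 Å, weight 4.0 (toy values)
t_bond = bond_target(coords, bonds)
print(t_bond, total_target(-10.0, t_bond, 0.5))
```

The other terms (angles, dihedrals, planarity, nonbonded, chirality) follow the same pattern of a weighted squared deviation from an ideal value, so a larger `w` pulls the model toward ideal geometry at the expense of map fit.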

  • Restraints

    •  Low resolution map is not sufficient to maintain secondary structure

    2 Å | 4-5 Å | 6 Å and lower

  • Restraints

    •  Example: refinement of a perfect α-helix into low-res map
    •  Using standard restraints on covalent geometry is insufficient
    •  Model geometry deteriorates as a result of refinement

  • Restraints

    Images from PumMa web site (http://www.pumma.nl)

    Mainchain distributions

    Sidechain distributions

    Covalent geometry

    Related structures

    Secondary structure

    Internal symmetry

    $T_{RESTRAINTS} = T_{BOND} + T_{ANGLE} + \dots + T_{NCS} + T_{RAMACHANDRAN} + T_{REFERENCE} + \dots$

  • Validation

    Model | Data | Model-to-data fit

    Cryo-EM or Diffraction

  • Validation tools: Crystallography vs Cryo-EM

    Model: exactly the same

    Data (cryo-EM or diffraction): different

    Model-to-data fit: similar

  • Validation

    •  Helps to save time later

    •  Helps to produce better models

    •  Helps to set correct expectations

    •  Minimize fraud or true mistakes

  • Validation

    Page 2, Full wwPDB X-ray Structure Validation Report, 1JH7

    1 Overall quality at a glance

    The following experimental techniques were used to determine the structure: X-RAY DIFFRACTION

    The reported resolution of this entry is 2.40 Å.

    Percentile scores (ranging between 0-100) for global validation metrics of the entry are shown in the following graphic. The table shows the number of entries on which the scores are based.

    Metric                  Whole archive (#Entries)   Similar resolution (#Entries, resolution range (Å))
    Rfree                   111664                     3481 (2.40-2.40)
    Clashscore              122126                     3956 (2.40-2.40)
    Ramachandran outliers   120053                     3897 (2.40-2.40)
    Sidechain outliers      120020                     3898 (2.40-2.40)
    RSRZ outliers           108989                     3386 (2.40-2.40)

    The table below summarises the geometric issues observed across the polymeric chains and their fit to the electron density. The red, orange, yellow and green segments on the lower bar indicate the fraction of residues that contain outliers for >=3, 2, 1 and 0 types of geometric quality criteria. A grey segment represents the fraction of residues that are not modelled. The numeric value for each fraction is indicated below the corresponding segment, with a dot representing fractions

  • Validation

    Validation for crystallography (X-ray, neutron) and cryo-EM

  • Validation

  • Phenix tools for validation

    research papers

    Acta Cryst. (2018). D74, 814–840. https://doi.org/10.1107/S2059798318009324

    ISSN 2059-7983

    Received 16 August 2017

    Accepted 27 June 2018

    Edited by G. J. Kleywegt, EMBL-EBI, Hinxton, England

    Keywords: cryo-EM; atomic models; model quality; data quality; validation; resolution.

    New tools for the analysis and validation of cryo-EM maps and atomic models

    Pavel V. Afonine,a,b* Bruno P. Klaholz,c Nigel W. Moriarty,a Billy K. Poon,a Oleg V. Sobolev,a Thomas C. Terwilliger,d,e Paul D. Adamsa,f and Alexandre Urzhumtsevc,g

    aMolecular Biophysics and Integrated Bioimaging Division, Lawrence Berkeley National Laboratory, Berkeley, CA 94720, USA, bDepartment of Physics and International Centre for Quantum and Molecular Structures, Shanghai University, Shanghai, 200444, People’s Republic of China, cCentre for Integrative Biology, Institut de Génétique et de Biologie Moléculaire et Cellulaire, CNRS–INSERM–UdS, 1 Rue Laurent Fries, BP 10142, 67404 Illkirch, France, dBioscience Division, Los Alamos National Laboratory, Los Alamos, NM 87545, USA, eNew Mexico Consortium, Los Alamos, NM 87544, USA, fDepartment of Bioengineering, University of California Berkeley, Berkeley, CA 94720, USA, and gFaculté des Sciences et Technologies, Université de Lorraine, BP 239, 54506 Vandoeuvre-lès-Nancy, France.

    *Correspondence e-mail: [email protected]

    Recent advances in the field of electron cryomicroscopy (cryo-EM) have resulted in a rapidly increasing number of atomic models of biomacromolecules that have been solved using this technique and deposited in the Protein Data Bank and the Electron Microscopy Data Bank. Similar to macromolecular crystallography, validation tools for these models and maps are required. While some of these validation tools may be borrowed from crystallography, new methods specifically designed for cryo-EM validation are required. Here, new computational methods and tools implemented in PHENIX are discussed, including d99 to estimate resolution, phenix.auto_sharpen to improve maps and phenix.mtriage to analyze cryo-EM maps. It is suggested that cryo-EM half-maps and masks should be deposited to facilitate the evaluation and validation of cryo-EM-derived atomic models and maps. The application of these tools to deposited cryo-EM atomic models and maps is also presented.

    1. Introduction

    While crystallography is still the predominant method for obtaining the three-dimensional atomic structures of macromolecules, the number of near-atomic resolution structures from electron cryomicroscopy (cryo-EM) is growing exponentially (Fig. 1; Orlov et al., 2017). Since the introduction of direct electron detectors (see, for example, Faruqi et al., 2003; Milazzo et al., 2005; Deptuch et al., 2007), cryo-EM is increasingly becoming the method of choice for many macromolecules, particularly since these detectors have been standardized for routine usage. Crystallographic structure determination is a multi-step process that includes sample preparation, obtaining a crystal of the sample, measuring experimental data from that crystal, solving the phase problem and building an atomic model, followed by model refinement and validation (Rupp, 2010). As an imaging technique, the collection and processing of experimental data is significantly different in structure determination using cryo-EM because there is no phase problem to solve (Frank, 2006). However, it is very similar to crystallography in the subsequent stages of the process, such as model building, refinement and validation.

    It has been widely accepted that model validation (Chen et al., 2010) is critical in assessing the correctness of a model from chemical, physical and crystallographic viewpoints, which in turn helps to ensure that the result, the atomic model of a

  • Ramachandran plot facts

    •  A protein structure should conform to prior expectations

    •  Most (98%+) residues should have a main-chain conformation consistent with the Ramachandran distribution

    •  A small percentage (0.2%) of residues may show Ramachandran outliers (they are not necessarily errors!)

    •  Outliers can be seen in strained regions of the structure (e.g. in the active site)

    •  Any outliers need to be confirmed by detailed analysis

  • Ramachandran plot facts

    local backbone conformation. For this, a Conformation-Dependent Library (CDL) has been developed [46, 47] and implemented in Phenix [48] for protein refinement. The CDL relates the expected covalent bond geometry to local backbone Ramachandran conformation. Because the expected bond geometry values in the CDL differ from those in the single-value library (especially for the N-Cα-C τ angle), MolProbity validation now uses the CDL values for structures refined with the CDL, as detected from the REMARK 3 information of a submitted file. Similarly, for RNA, geometry targets are dependent on ribose pucker.

    Cis or twisted non-trans peptides

    The peptide bond that joins adjacent amino-acid residues in a protein has partial double-bond character and therefore assumes a trans, or more rarely a cis, configuration. The cis configuration is significantly more common preceding a proline and results in a unique Ramachandran distribution for cis-proline. To maintain this special relationship, we associate peptide bonds with their following residue. About 5% of prolines are cis, while only about 0.03% of all non-proline residues are genuinely cis.

    Recently, we were alerted to a surprising and improbable increase in the number of cis non-proline peptide bonds being modeled [49], as shown in the plot (updated) of Figure 9(A). These are due to model-building without consideration of prior probabilities, but also in part due to the lack of validation that flagged cis-nonPro peptides, in MolProbity or other systems. We have therefore implemented a new validation and visual markup for non-trans peptides. Matching the PDB definition, we define a cis peptide as one with an ω angle between −30° and +30°, and a trans peptide as one with an ω angle > +150° or
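The peptide classification quoted in this excerpt can be written as a small rule on the ω dihedral. The sketch below takes the cis window as −30° to +30° and the trans threshold as 150°; because the excerpt is cut off after "or", the symmetric lower bound (ω < −150°) is an assumption, as is labelling everything in between "twisted" after the section heading. The function name `classify_peptide` is hypothetical and this is not MolProbity code.

```python
def classify_peptide(omega_deg):
    """Classify a peptide bond from its omega dihedral angle in degrees."""
    # Wrap any input angle into the interval [-180, 180).
    omega = (omega_deg + 180.0) % 360.0 - 180.0
    if -30.0 <= omega <= 30.0:
        return "cis"
    if abs(omega) > 150.0:  # the < -150 branch is assumed; the source text is truncated
        return "trans"
    return "twisted"


for angle in (180.0, -175.0, 5.0, 90.0):
    print(angle, classify_peptide(angle))
```

Because the peptide bond is associated with the following residue, a caller would typically report the result against residue i+1 and check whether that residue is a proline before flagging a cis bond as improbable.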

  • Ramachandran plot

  • Ramachandran plot examples

    Good | Good

    Poor | Suspicious

  • Ramachandran plot

    PDB code 3NOQ, 1 Å

    Outliers:

    (A, ILE, 152), (B, ILE, 154)

    (A, ILE, 152)

    Valid Ramachandran plot outliers: justified by the data (density map)

  • Example: Ramachandran plot outliers

    3zx9
    Clashscore: 245
    Rama outliers: 23%
    Rotamer outliers: 17%
    Year: 2011
    Resolution: 17 Å

    5a9z
    Clashscore: 197
    Rama outliers: 25%
    Rotamer outliers: 28%
    Year: 2015
    Resolution: 4.7 Å

  • Ramachandran plot

    3JA8 | 6EYC: re-refined (Tristan Croll)

  • Ramachandran plot

    PDB code: 5a9z

    Original | Refined with Ramachandran plot restraints

  • Ramachandran plot Z-score

    •  Ramachandran Z-score is good at identifying odd-looking Ramachandran plots!
    •  Used in PDB-REDO and WHAT_CHECK. Implemented in Phenix (Oleg Sobolev)
    •  Criteria:
       •  Z < -3: Poor
       •  -3 < Z < -2: Suspicious
       •  Z > -2: Good
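The Z-score criteria amount to a simple threshold rule. In the sketch below only the −2 cutoff for "Good" is taken directly from the slide text; the −3 cutoff and the "Poor" and "Suspicious" labels are assumptions inferred from the example slides, and the handling of values exactly on a boundary is arbitrary. The function name `classify_rama_z` is hypothetical and is not a Phenix API.

```python
def classify_rama_z(z):
    """Map a Ramachandran Z-score to a qualitative label (thresholds partly assumed)."""
    if z < -3.0:
        return "Poor"        # assumed cutoff
    if z < -2.0:
        return "Suspicious"  # assumed label for the band between the two cutoffs
    return "Good"            # Z above -2, as stated on the slide


# Z-scores quoted on the example slides of this talk.
for z in (-4.55, 0.1, -3.5, -2.27):
    print(z, classify_rama_z(z))
```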

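  • Ramachandran plot Z-score: examples

    6DZV: Z-score = -4.55

    1US0 (0.66 Å): Z-score = 0.1

    •  Z < -3: Poor; -3 < Z < -2: Suspicious; Z > -2: Good

  • Ramachandran plot Z-score: examples

    3JA8: Z-score = -3.5

    6EYC (re-refined by Tristan Croll): Z-score = -2.27

    •  Z < -3: Poor; -3 < Z < -2: Suspicious; Z > -2: Good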

  • Example: side-chain rotamer outliers

    4btg
    Clashscore: 329
    Rama outliers: 9%
    Rotamer outliers: 46%
    Year: 2013
    Resolution: 4.4 Å

  • Validation: model-to-map fit

    3j9e (emd_6240) | 3.3 Å | CC = 0.85 | Year: 2015

  • Validation: model-to-map fit

    3a5x (emd_1641) | 4.0 Å | CC

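  • Model-map correlation coefficient (CC)

    •  Definition
       •  With or w/o subtracting mean
    •  How model map is calculated
       •  Approximation (e.g. N-gaussian)
       •  Form-factors (electron, X-ray, neutron)
       •  Fourier map
       •  Box or sphere of Fourier map coefficients
    •  Region in the map used to calculate CC
       •  Whole box
       •  Mask around atoms
       •  Atom radius

    With the mean subtracted:

    $CC(\rho_1, \rho_2) = \left[\sum_n \left(\rho_1(n) - \bar{\rho}_1\right)^2\right]^{-1/2} \left[\sum_n \left(\rho_2(n) - \bar{\rho}_2\right)^2\right]^{-1/2} \left[\sum_n \left(\rho_1(n) - \bar{\rho}_1\right)\left(\rho_2(n) - \bar{\rho}_2\right)\right]$

    Without subtracting the mean:

    $CC(\rho_1, \rho_2) = \left[\sum_n \rho_1(n)^2\right]^{-1/2} \left[\sum_n \rho_2(n)^2\right]^{-1/2} \left[\sum_n \rho_1(n)\,\rho_2(n)\right]$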
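Both forms of the correlation coefficient differ only in whether the map means are subtracted first. The following is a minimal sketch of the two definitions, assuming the two maps have already been sampled on the same grid points (whole box or masked region) and flattened into equal-length lists. The function name `map_cc` and its `subtract_mean` flag are hypothetical.

```python
import math


def map_cc(rho1, rho2, subtract_mean=True):
    """Correlation coefficient between two maps given as flat lists of grid values."""
    if len(rho1) != len(rho2) or not rho1:
        raise ValueError("maps must be non-empty and sampled on the same grid points")
    mean1 = sum(rho1) / len(rho1) if subtract_mean else 0.0
    mean2 = sum(rho2) / len(rho2) if subtract_mean else 0.0
    numerator = sum((a - mean1) * (b - mean2) for a, b in zip(rho1, rho2))
    norm1 = sum((a - mean1) ** 2 for a in rho1)
    norm2 = sum((b - mean2) ** 2 for b in rho2)
    if norm1 == 0.0 or norm2 == 0.0:
        raise ValueError("CC is undefined for a map with zero variance (or all zeros)")
    return numerator / math.sqrt(norm1 * norm2)


experimental = [1.0, 2.0, 3.0]
shifted_model = [2.0, 3.0, 4.0]  # same shape, constant offset of +1
print(map_cc(experimental, shifted_model, subtract_mean=True))
print(map_cc(experimental, shifted_model, subtract_mean=False))
```

The toy maps show why the choice matters: a constant offset leaves the mean-subtracted CC at 1 but lowers the CC computed without mean subtraction, so reported values are only comparable when the same definition and map region are used.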

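  • Model map

    •  Gaussian IAM (Independent Atom Model)

    •  Universally used in crystallography (X-ray, Neutron, Electron)

    •  Isotropic:

       $\rho_{atom}(\mathbf{r}, \mathbf{r}_0, B, q) = q \sum_{k=1}^{5} a_k \left(\frac{4\pi}{b_k + B}\right)^{3/2} \exp\left(-\frac{4\pi^2 |\mathbf{r} - \mathbf{r}_0|^2}{b_k + B}\right)$

    •  Anisotropic:

       $\rho_{atom}(\mathbf{r}, U, q) = q \sum_{j=1}^{5} \frac{a_j (4\pi)^{3/2}}{\left|8\pi^2 U_{cart} + b_j I\right|^{1/2}} \exp\left(-4\pi^2 (\mathbf{r} - \mathbf{r}_0)^T A^T \left[8\pi^2 U_{cart} + b_j I\right]^{-1} A (\mathbf{r} - \mathbf{r}_0)\right)$

    •  Whole model:

       $\rho_{MODEL}(\mathbf{r}) = \sum_{i=1}^{N_{atoms}} \rho_{atom,i}(\mathbf{r})$

    •  To account for finite resolution:
       •  FT model map
       •  Remove terms up to specified resolution
       •  FT back to real space to get Fourier image = “Model map”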
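The isotropic Gaussian atom and the whole-model sum can be evaluated directly from the formulas on this slide. The sketch below is an illustration only: the coefficient lists `a` and `b` are placeholders supplied by the caller (real values come from scattering-factor tables, which are not reproduced here), each atom is a hypothetical dict with keys `r0`, `a`, `b`, `B`, `q`, and the finite-resolution step (Fourier transform, truncation, back-transform) is not included.

```python
import math


def atom_density(r, r0, a, b, B, q=1.0):
    """Isotropic Gaussian atom: q * sum_k a_k (4pi/(b_k+B))^(3/2) exp(-4pi^2 |r-r0|^2/(b_k+B))."""
    dist2 = sum((x - x0) ** 2 for x, x0 in zip(r, r0))
    rho = 0.0
    for a_k, b_k in zip(a, b):
        width = b_k + B  # the B-factor broadens every Gaussian term
        rho += a_k * (4.0 * math.pi / width) ** 1.5 * math.exp(-4.0 * math.pi ** 2 * dist2 / width)
    return q * rho


def model_density(r, atoms):
    """Whole-model density: sum of the individual atomic densities at point r."""
    return sum(atom_density(r, **atom) for atom in atoms)


# Toy one-term atom with placeholder coefficients chosen so the peak height is 1.
toy_atom = {"r0": (0.0, 0.0, 0.0), "a": [1.0], "b": [4.0 * math.pi], "B": 0.0, "q": 1.0}
print(model_density((0.0, 0.0, 0.0), [toy_atom]))
print(model_density((1.0, 0.0, 0.0), [toy_atom]))
```

In practice each atom contributes only within a cutoff radius, so the sum over atoms stays cheap even for very large models.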

  • Examples: 3J5Q, resolution: 3.8 Å

    Residues/atoms: 2,324/17,424

    Refinement: 20 min

    METRIC                  Original     Phenix
    Map CC                  0.650        0.714
    RMSD (bonds/angles)     0.01/1.34    0.01/1.31
    Clashscore              100.9        32.84
    Rama. outl., %          0.52         0
    Rotamer outl., %        27.99        0
    C-beta deviations       0            0

  • Examples: 3J6P, resolution: 8.2 Å

    Residues/atoms: 949/7,501

    Refinement: 15 min

    METRIC                  Original     Phenix
    Map CC                  0.596        0.743
    RMSD (bonds/angles)     0.03/2.34    0.00/1.11
    Clashscore              92.37        34.73
    Rama. outl., %          2.03         0.54
    Rotamer outl., %        26.21        0
    C-beta deviations       2            0

  • Examples: 3ZEE, resolution: 6.1 Å

    Residues/atoms: 4,116/32,830

    Refinement: 45 min

    METRIC                  Original     Phenix
    Map CC                  0.709        0.647
    RMSD (bonds/angles)     0.04/4.05    0.01/1.23
    Clashscore              18.34        18.59
    Rama. outl., %          3.66         0
    Rotamer outl., %        24.64        0
    C-beta deviations       637          0

  • Resources

  • User support

    •  Feedback, questions, help

    [email protected]@[email protected]

    •  Reporting a bug or asking for help:
       •  We can’t help you if you don’t help us to understand your problem

    •  Do:
       1) Make sure you can reproduce the problem using the latest Phenix version
       2) Command and parameters used (series of GUI clicks that lead to the problem)
       3) Input and output files
       4) Clearly explain the problem/question

    PHENIX mailing list: www.phenix-online.org

