Cryo-EM validation tools in CCP-EM
Martyn Winn
20 November 2020
CCP-EM & CCP4 | RCaH
eBIC | DLS
Collaborative Computational Project for Electron cryo-Microscopy
Core team hosted by Research Complex at Harwell, alongside CCP4 team.
Wide network of collaborators.
What is CCP-EM?
Tom Burnley
Colin Palmer
Agnel Joseph
Jola Mirecka
Matt Iadanza
Build UK cryo-EM community
• Annual conference: CCP-EM Spring Symposium
• Talks recorded, available on YouTube
Software training workshops
Mailing list
Workshops for developers
Benchmarking
CCP-EM software suite
• Data processing tools for cryo-EM
• Free for academic use, charge for industrial
Support for UK national facility at eBIC
Highlights
Replace job and metadata handling with new modular, flexible
Python layer. Unified approach for RELION and CCP-EM suite.
Sjors Scheres
Standards / validation
Training schools
Machine learning for segmentation of molecular maps
Support for eBIC
Integrates tools for EM data processing
Collection of programs with common look-and-feel
Download from ccpem.ac.uk (Linux & Mac)
Stable v1.5. Use nightly for latest updates
Bugs & requests: [email protected]
Jobs run via:
Task GUI windows
Python API
Individual program executables
CCP-EM software suite
> ccpem main GUI
> ccpem-mrc-to-mtz task GUI
> ccpem-python scripting
> refmac5 executables
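A quick way to see which of these entry points are available in a given shell is to look them up on the PATH. The sketch below uses only the Python standard library and the four command names listed above; the helper name find_commands is made up for this example, and it assumes the suite's bin directory has already been added to PATH.

```python
import shutil

# Command names taken from the list above.
SUITE_COMMANDS = ["ccpem", "ccpem-mrc-to-mtz", "ccpem-python", "refmac5"]


def find_commands(names=SUITE_COMMANDS):
    """Map each command name to its full path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_commands().items():
        print(f"{name:18s} {path or 'not found on PATH'}")
```

A command reported as not found usually means the suite's setup script has not been sourced in that terminal.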
Validation overview
Of course, validation should be part of the structure determination process!
CCP-EM GUI allows sanity checking of input and interpretation of output.
Some tasks are more focussed on validation.
Designed for off-line validation during structure determination, prior to deposition.
Complementary to EMDB / PDB.
Confidence maps
Carsten’s talk yesterday
Is there map support for atomic models?
Model validation.
CryoEF
Naydenova, K and Russo, CJ "Measuring the effects of particle orientation to improve the efficiency of electron cryomicroscopy" Nature Communications, 8, Article number: 629 (2017).
e.g. run_data.star from Relion Refine3D
V1.1 in CCP-EM. Quantifies the spread of the particle angle distribution and recommends tilt angles to minimize bias.
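To make "spread of the particle angle distribution" concrete, the sketch below bins viewing directions on the sphere and reports a normalised entropy (0 for a single view, 1 for perfectly even coverage). This is only an illustrative stand-in, not the efficiency measure that CryoEF computes. It assumes the rot and tilt angles (the _rlnAngleRot and _rlnAngleTilt columns of a Relion STAR file) have already been read into a list of (rot, tilt) pairs in degrees; the function names and bin counts are arbitrary choices for this example.

```python
import math
from collections import Counter


def direction(rot_deg, tilt_deg):
    """Unit vector for a view given as (rot, tilt) in degrees."""
    rot, tilt = math.radians(rot_deg), math.radians(tilt_deg)
    return (math.sin(tilt) * math.cos(rot),
            math.sin(tilt) * math.sin(rot),
            math.cos(tilt))


def orientation_entropy(angles, n_rot=12, n_z=6):
    """Normalised Shannon entropy of views over equal-area bins on the sphere.

    Bins are uniform in azimuth and in z = cos(tilt), which gives equal areas.
    """
    counts = Counter()
    for rot, tilt in angles:
        x, y, z = direction(rot, tilt)
        phi = math.atan2(y, x) % (2 * math.pi)
        i = min(int(phi / (2 * math.pi) * n_rot), n_rot - 1)
        j = min(int((z + 1) / 2 * n_z), n_z - 1)
        counts[(i, j)] += 1
    total = sum(counts.values())
    entropy = -sum(c / total * math.log(c / total) for c in counts.values())
    return entropy / math.log(n_rot * n_z)


if __name__ == "__main__":
    # Hypothetical data set with a strongly preferred orientation.
    preferred = [(10.0, 5.0)] * 90 + [(200.0, 90.0)] * 10
    print(round(orientation_entropy(preferred), 3))
```

A low value flags a strongly preferred orientation, which is the situation where collecting tilted data is usually considered.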
TEMPy: DiffMap
Map-Map or Map-Model
Maps are scaled based on the average resolution-dependent amplitude fall-offs of the maps
[Figure panels]
ADP-AlFx bound to kinesin-6 motor domain
Figure labels: W1, 4N68, K11, L83, L80, H72
EMD-3622 (4.4 Å) vs EMD-3621 (6.1 Å)
EMD-3488 (3.2 Å) vs 5NI1_mut
Global or local scaling.
Latter can minimize effect of variable local resolution
Joseph et al. JCIM (2020)
Hemoglobin rotamer errors
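The global scaling idea can be sketched in a few lines of NumPy: match the spherically averaged Fourier amplitudes of one map to the other, shell by shell, then subtract. This is a generic illustration and not TEMPy's implementation. It assumes both maps are on the same cubic grid and already aligned, and the function names (radial_shells, scale_amplitudes, difference_map) are made up for this example.

```python
import numpy as np


def radial_shells(shape):
    """Integer shell index (radius in Fourier voxels) for each FFT coefficient."""
    freqs = [np.fft.fftfreq(n) * n for n in shape]
    grid = np.meshgrid(*freqs, indexing="ij")
    radius = np.sqrt(sum(g ** 2 for g in grid))
    return np.rint(radius).astype(int)


def scale_amplitudes(reference, target):
    """Rescale target so its mean amplitude per shell matches the reference."""
    shells = radial_shells(reference.shape)
    flat = shells.ravel()
    n_shells = flat.max() + 1
    f_ref = np.fft.fftn(reference)
    f_tgt = np.fft.fftn(target)
    counts = np.maximum(np.bincount(flat, minlength=n_shells), 1)
    mean_ref = np.bincount(flat, weights=np.abs(f_ref).ravel(),
                           minlength=n_shells) / counts
    mean_tgt = np.bincount(flat, weights=np.abs(f_tgt).ravel(),
                           minlength=n_shells) / counts
    # Leave a shell unscaled if the target has no signal there.
    ratio = np.divide(mean_ref, mean_tgt,
                      out=np.ones_like(mean_ref), where=mean_tgt > 0)
    return np.real(np.fft.ifftn(f_tgt * ratio[shells]))


def difference_map(map1, map2):
    """map1 minus the amplitude-scaled map2."""
    return map1 - scale_amplitudes(map1, map2)
```

Local scaling would apply the same matching inside a sliding window rather than over the whole box, which is how the effect of variable local resolution can be reduced.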
EMDA
Developed by Rangana, see following talk.
Importable Python3 library for EM map and model manipulations:
• Resolution related functionalities
• Map statistics calculation in Image and Fourier space
• Local correlation map calculation for map and model validation
• FSC based map-model validation
Included in CCPEM v1.5
No task, command line only (for now).
In terminal window:
emda -h
emda fsc --map1 foo_half_map_1.map --map2 foo_half_map_2.map
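For readers unfamiliar with what an FSC calculation between two half maps involves, here is a minimal NumPy sketch of the standard Fourier shell correlation. It does not use EMDA's API and is not EMDA's implementation. It assumes two real-space maps on the same cubic grid, held as NumPy arrays, and the function name fsc_curve is made up for this example.

```python
import numpy as np


def fsc_curve(map1, map2):
    """Fourier shell correlation per integer-radius shell (index 0 is the origin)."""
    f1 = np.fft.fftn(map1)
    f2 = np.fft.fftn(map2)
    # Shell index: distance from the Fourier origin, rounded to whole voxels.
    freqs = [np.fft.fftfreq(n) * n for n in map1.shape]
    grid = np.meshgrid(*freqs, indexing="ij")
    shells = np.rint(np.sqrt(sum(g ** 2 for g in grid))).astype(int).ravel()
    n_shells = shells.max() + 1
    cross = np.bincount(shells, weights=np.real(f1 * np.conj(f2)).ravel(),
                        minlength=n_shells)
    power1 = np.bincount(shells, weights=(np.abs(f1) ** 2).ravel(),
                         minlength=n_shells)
    power2 = np.bincount(shells, weights=(np.abs(f2) ** 2).ravel(),
                         minlength=n_shells)
    denom = np.sqrt(power1 * power2)
    # Shells with no signal are reported as 0.
    return np.divide(cross, denom, out=np.zeros_like(cross), where=denom > 0)
```

Converting a shell index to resolution in Å requires the box size and voxel size, which this sketch deliberately leaves out.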
ProSHADE
Symmetry/pseudo-symmetry detection
TEMPy-LocScore
Segment based Manders’ Overlap Coefficient (SMOC)
An overlap coefficient is calculated over the voxels covered by each residue (and its local neighbourhood).
Neighbourhood in sequence or space.
Joseph et al. 2016, Farabella et al. 2015
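The Manders' overlap coefficient itself is a simple ratio, sum(a*b) / sqrt(sum(a^2) * sum(b^2)), evaluated over a chosen set of voxels. The sketch below applies it to a sequence-window neighbourhood. It is a toy illustration and not TEMPy's code: the flat voxel lists, the residue-to-voxel mapping and the window size are all assumptions made for this example.

```python
import math


def manders_overlap(map_values, model_values):
    """Manders' overlap coefficient between two equal-length density lists."""
    num = sum(a * b for a, b in zip(map_values, model_values))
    den = math.sqrt(sum(a * a for a in map_values)
                    * sum(b * b for b in model_values))
    return num / den if den > 0 else 0.0


def segment_scores(map_values, model_values, residue_voxels, window=1):
    """Score each residue over its own voxels plus those of sequence neighbours.

    residue_voxels: list, in sequence order, of voxel-index lists (hypothetical
    input; a real tool would derive this from atom positions and a mask).
    """
    scores = []
    for i in range(len(residue_voxels)):
        lo = max(0, i - window)
        hi = min(len(residue_voxels), i + window + 1)
        voxels = sorted({v for res in residue_voxels[lo:hi] for v in res})
        scores.append(manders_overlap([map_values[v] for v in voxels],
                                      [model_values[v] for v in voxels]))
    return scores
```

Swapping the sequence window for a distance cutoff around each residue would give the neighbourhood-in-space variant.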
TEMPy: Scores
Difference from best
Joseph et al. JSB, 2017
Several global scores (volume / surface / overlap).
Compare alternative atomic models.
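When comparing alternative models with one score, a "difference from best" view is easy to compute. The snippet below is a small sketch with invented model names and score values, purely to show the bookkeeping; it does not call TEMPy.

```python
def difference_from_best(scores, higher_is_better=True):
    """Return each model's absolute distance from the best score in the set."""
    values = scores.values()
    best = max(values) if higher_is_better else min(values)
    return {name: abs(best - value) for name, value in scores.items()}


if __name__ == "__main__":
    # Hypothetical scores for three candidate models.
    example = {"model_a": 0.82, "model_b": 0.79, "model_c": 0.64}
    for name, diff in sorted(difference_from_best(example).items(),
                             key=lambda item: item[1]):
        print(f"{name}: {diff:.2f}")
```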
Refmac: Half-map cross-validation
Brown et al. 2015
Reciprocal space model refinement.
CCP-EM task includes half-map validation.
Test for overfitting of model to map.
weight matrix = 0.001
weight matrix = 0.1
Increase weight on fit to map
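One way to read the half-map test is as a comparison of two model-map FSC curves: one against the half map used in refinement and one against the half map left out. The sketch below summarises each curve as a shell-weighted average and reports the gap. It assumes that protocol and that the per-shell FSC values and shell counts have already been produced by the refinement program; the function names and numbers are illustrative only.

```python
def fsc_average(fsc, shell_counts):
    """Average FSC weighted by the number of Fourier coefficients per shell."""
    total = sum(shell_counts)
    return sum(f * n for f, n in zip(fsc, shell_counts)) / total


def overfitting_gap(fsc_work, fsc_free, shell_counts):
    """Weighted-average FSC to the refinement half map minus that to the other."""
    return (fsc_average(fsc_work, shell_counts)
            - fsc_average(fsc_free, shell_counts))


if __name__ == "__main__":
    # Hypothetical curves for a tight and a loose refinement weight.
    counts = [10, 40, 90, 160]
    print(overfitting_gap([0.99, 0.95, 0.80, 0.55],
                          [0.99, 0.94, 0.78, 0.52], counts))
    print(overfitting_gap([0.99, 0.97, 0.90, 0.80],
                          [0.99, 0.93, 0.70, 0.35], counts))
```

A gap that grows as the weight on the fit to the map is increased is the usual sign that the model is starting to fit noise.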
Validation: model task
Dedicated validation task with several metrics
Agnel’s talk
Geometry (bond/angle/dihedral): MolProbity
CA geometry / peptides: (CaBLAM)
B-factor distribution
Local fit in density: (TEMPy SMOC)
Model map FSC / FSCavg: (Refmac)
Secondary structure prediction (Jpred)
Uses local programs (except last)
Williams et al. 2018, Chen et al. 2015, Brown et al. 2015, Joseph et al. 2016
Atomic B-factor distribution
From input model – no refinement performed.
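A B-factor distribution can be pulled straight from the input coordinates. The sketch below reads the temperature-factor field (columns 61-66) of ATOM/HETATM records in PDB-format text and prints summary statistics. It is a standard-library illustration rather than what the CCP-EM task runs; the helper atom_line only exists to build a correctly aligned example record.

```python
import statistics


def b_factors(pdb_text):
    """Collect B-factors from ATOM/HETATM records (PDB columns 61-66)."""
    values = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            values.append(float(line[60:66]))
    return values


def summarise(values):
    """Basic statistics describing the B-factor distribution."""
    return {
        "n": len(values),
        "min": min(values),
        "median": statistics.median(values),
        "mean": statistics.mean(values),
        "max": max(values),
    }


def atom_line(serial, name, res, chain, seq, x, y, z, occ, b):
    """Format one fixed-width ATOM record (hypothetical helper for the demo)."""
    return ("ATOM  {:5d} {:<4s} {:3s} {:1s}{:4d}    "
            "{:8.3f}{:8.3f}{:8.3f}{:6.2f}{:6.2f}").format(
                serial, name, res, chain, seq, x, y, z, occ, b)


if __name__ == "__main__":
    text = "\n".join([
        atom_line(1, "N", "MET", "A", 1, 27.340, 24.430, 2.614, 1.00, 9.67),
        atom_line(2, "CA", "MET", "A", 1, 26.266, 25.413, 2.842, 1.00, 10.38),
    ])
    print(summarise(b_factors(text)))
```

For mmCIF input a proper parser is the safer route, since the fixed-column slicing above applies only to the legacy PDB format.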
CCPEM - Model validation
Local fit to map: SMOC
Fixing in Coot
Launches Coot with map, model and to-do list
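A to-do list of this kind can be thought of as the residues whose local score falls well below the rest of the model. The sketch below flags residues by z-score of a per-residue fit score such as SMOC. The residue identifiers, scores and the cutoff of -1.5 are invented for this example and are not the criteria used by the CCP-EM task.

```python
import statistics


def flag_outliers(scores, z_cutoff=-1.5):
    """Return residue ids whose score z-value is below the cutoff, worst first."""
    values = list(scores.values())
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []  # all residues score the same, nothing stands out
    z_values = {res: (value - mean) / spread for res, value in scores.items()}
    flagged = [res for res, z in z_values.items() if z < z_cutoff]
    return sorted(flagged, key=lambda res: z_values[res])


if __name__ == "__main__":
    # Hypothetical per-residue scores keyed by residue number.
    example = {1: 0.9, 2: 0.9, 3: 0.9, 4: 0.9, 5: 0.3}
    print(flag_outliers(example))
```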
Future directions
Machine learning
Automated annotation, unusual features
Haruspex (in alpha mode). Deep learning approach to identify secondary structures in maps. Thorn A et al. Angewandte (2020)
Deposition
Gemmi project.
Data harvesting.
Collation of deposition files / metadata.
Validation reports.
Integration with Relion
Improve data / metadata flow.
Make intermediate steps available for validation.
Acknowledgements
CCP-EM core team
● Tom Burnley
● Colin Palmer
● Agnel Praveen Joseph
● Jola Mirecka
● Matt Iadanza
CCP4 core team
STFC SCD
● Alan Kyffin
DLS / eBIC staff
Birkbeck
● Maya Topf & group
University of Manchester
● Alan Roseman & group
Imperial College London
● Chris Aylett
Francis Crick Institute
● Peter Rosenthal
University of Leeds
● Neil Ranson
● Becky Thompson
EBI
● Gerard Kleywegt
● Ardan Patwardhan
● Zhe Wang
MRC-LMB
● Garib Murshudov
● Sjors Scheres
● Paul Emsley
● Rob Nicholls
● Rangana Warshamanage
● Katerina Naydenova
University of York
● Kevin Cowtan
● Soon Wen ‘Scott’ Hoh
● Jon Agirre
TU Delft
● Arjen Jakobi
EMBL / FZ Jülich
● Max Beckers
● Carsten Sachse
University of Hamburg
● Andrea Thorn
… and others!