High-resolution cryo-EM reconstructions in the presence of substantial aberrations
Raquel Bromberg^a, Yirui Guo^a,b, Dominika Borek^a,*, Zbyszek Otwinowski^a,*
^a Department of Biophysics, The University of Texas Southwestern Medical Center, Dallas, Texas 75390, USA
^b Current Address: Ligo Analytics, Dallas, TX 75206, USA

*Corresponding authors
Dominika Borek
Department of Biophysics
The University of Texas Southwestern Medical Center
Dallas, Texas 75390, USA
Phone: (214)645-9577
Fax: (214)645-6353
Email: [email protected]

Zbyszek Otwinowski
Department of Biophysics
The University of Texas Southwestern Medical Center
Dallas, Texas 75390, USA
Phone: (214)645-6385
Fax: (214)645-6353
Email: [email protected]

Keywords: cryo-EM, axial aberrations, coma, trefoil, resolution, validation
bioRxiv preprint doi: https://doi.org/10.1101/798280; this version posted October 8, 2019. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. It is made available under a CC-BY-NC 4.0 International license.
The beam-image shift method accelerates data acquisition in cryo-EM single particle reconstruction (cryo-EM SPR) by fast repositioning of the imaging area, but at the cost of more severe and complex optical aberrations.
We analyze here how uncorrected antisymmetric aberrations, such as coma and trefoil, affect cryo-EM SPR results, and then derive an analytical formula quantifying information loss due to their presence that explains why Fourier shell correlation (FSC)-based statistics may report significantly overestimated resolution if these aberrations are not fully corrected. We validate our analysis with reference-based aberration refinement for two cryo-EM SPR datasets acquired with a 200 kV microscope in the presence of coma exceeding 40 µm, and obtained 2.3 and 2.7 Å reconstructions for 144 and 173 kDa particles, respectively.
Our results provide a description of an efficient approach for assessing information loss in cryo-EM SPR data acquired in the presence of higher-order aberrations and address inconsistent guidelines regarding the level of aberrations acceptable in cryo-EM SPR experiments.
Introduction
In a typical cryo-EM SPR experiment, some aberrations such as defocus are introduced intentionally while others such as spherical aberration are unavoidable for a given setup (1-3). The significance of the remaining aberrations is evaluated on a case-by-case basis (4, 5).
In the analysis of aberrations in cryo-EM SPR, where in one image we record only a small part of the focal plane, we can use an isoplanatic approximation in which aberrations are represented by convolutions, and in Fourier space they depend only on the angle of the scattered electrons. In phase contrast illumination mode, used in cryo-EM SPR, axial aberrations can be divided into two distinct categories depending on the symmetry properties of the image phase shift as a function of a scattering vector. If the phase shift is centrosymmetric, then aberrations will result in modulations of the image power spectrum. If the phase shift is antisymmetric, the power spectrum will not be modulated because aberrations will not affect the amplitude of the image but only its phase. The lowest order of antisymmetric phase shift is a translation, which due to the lack of an absolute coordinate system, can be set to zero. Antisymmetric aberrations of the next, third order are called axial coma and trefoil and these are important for cryo-EM SPR data quality in practice (4-6). Alignment procedures used in cryo-EM SPR minimize coma and trefoil indirectly, for instance by analyzing changes in the image power spectrum due to interactions between beam tilt and spherical aberration (5). The success of this approach and similar ones requires co-alignment of the optical axes for multiple lenses, and if the alignment procedures are not properly executed, coma and trefoil may be present and affect the quality of the SPR results, and yet manifest only in specialized analyses (5-8). Furthermore, these and additional, higher-order, antisymmetric aberrations are induced when data are acquired with the beam-image shift method (6-8), in which coordinated electronic shifts of an illuminating beam and an image are used to navigate away from the optical axis. We restrict our discussion here to axial aberrations, but with the understanding that these aberrations will have different values in different positions of an optical system for data acquired with the beam-image shift method.
The acceleration in cryo-EM data acquisition enabled by the beam-image shift method (6-9) has resulted in discussions of the best experimental strategies for data collection, approaches to compensate for or to correct for optical aberrations, and methods to assess modulation and the loss of signal due to the presence of uncorrected optical aberrations (4, 6, 10, 11). Axial coma can be corrected by applying compensating beam tilt during data collection; however, beam tilting does not correct other aberrations (4), and thus the extent to which one can apply beam-image shift without compromising data quality remains open.
We found that large values of axial aberrations can be precisely estimated and accurately corrected for, leading to large, case-specific improvements of SPR results. We provide a formula for assessing how the levels of uncorrected coma and trefoil affect the resolution of SPR and discuss their impact on the validation statistics.

Results
CTF determination by power spectrum analysis is an inherent part of high-resolution cryo-EM SPR and provides estimates of the magnitude of symmetric aberrations (2). However, an antisymmetric phase shift does not modulate the power spectrum, so determination of antisymmetric aberrations has to be performed by other methods (4).
In our analysis of antisymmetric aberrations, we utilize an associative property of pure phase shift that also holds when it is combined with image translation (12). Thus, corrections for the presence of antisymmetric aberrations can be split into separate steps and applied in any order, without loss of accuracy in retrieving information. Consequently, for both large and small magnitudes of antisymmetric aberrations, their impact is defined only by the difference between their true and their assumed or refined values. If their estimates are inaccurate, the difference generates a component of the image phase shift that, after averaging over multiple particles, produces the signal modulation that we analyze here.
An antisymmetric aberration acts on the signal as a convolution, so in the frequency domain it is represented as the multiplication of the signal by the Fourier representation of the aberration. The impact of aberrations on 3D cryo-EM SPR has been theoretically analyzed by considering how images are affected (5, 13). However, cryo-EM SPR relies on averaging multiple particles to increase the SNR. Thus, we investigated how an aberration’s impact will propagate to averaged representations of particles.
To this end, we assume a large number of particles which are randomly oriented on a grid with respect to rotation around the beam axis but with potential preferred orientation dependence on the other two Eulerian angles. Averaging all such particles removes the dependence of the average signal on the angle of the aberration in the microscope frame (6). Therefore, the final consequence of an aberration is the resolution-dependent modulation of signal amplitude (6). For each unique projection that is conceptually equivalent to a 2D class average, we can define an angle $\alpha$ of particle orientation in the microscope and the difference $\Delta\alpha$ between $\alpha$ and the characteristic direction of a particular antisymmetric aberration (Fig. S1). Single particles have no force aligning them with this angle, so we can assume that their distribution is uniform with respect to this angle. We then express the maximum phase shift $\varphi^{\max}_{3,k}$ for a particular aberration and resolution, which for coma and trefoil is:
$$\varphi^{\max}_{3,k}(d) = \frac{2\pi}{3}\, C_{3,k}\, \frac{\lambda^{2}}{d^{3}}, \qquad (1)$$
where $d$ represents resolution and $\lambda$ represents electron wavelength. The first index of the aberration coefficients $C_{3,k}$ identifies the power dependence on resolution and the other defines angular periodicity in the microscope frame of the phase shift, and so for coma $k = 1$ and for trefoil $k = 3$ (14). Coma can also be derived from interactions between beam tilt $\tau$ and spherical aberration $C_s$: $C_{3,1} = 3\tau C_s$, so $\tau = C_{3,1}/(3 C_s)$. This representation may be convenient when one is interested in expressing coma with respect to beam tilt rather than coma values.
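As a rough numerical illustration of the relation between beam tilt, coma, and phase shift, the following Python sketch evaluates the maximum coma-induced phase shift at a given resolution. It assumes the convention in which the maximum phase shift is $(2\pi/3)\,C_{3,1}\lambda^2/d^3$ and $C_{3,1} = 3\tau C_s$. The spherical aberration value of 2.7 mm is an assumption that is not stated in this text, and all function names are hypothetical.

```python
import math

# Physical constants (SI units).
H = 6.62607015e-34       # Planck constant, J s
M0 = 9.1093837015e-31    # electron rest mass, kg
E = 1.602176634e-19      # elementary charge, C
C = 299792458.0          # speed of light, m/s


def electron_wavelength_angstrom(voltage_kv: float) -> float:
    """Relativistic electron wavelength in Å for an accelerating voltage in kV."""
    v = voltage_kv * 1e3
    wavelength_m = H / math.sqrt(2 * M0 * E * v * (1 + E * v / (2 * M0 * C**2)))
    return wavelength_m * 1e10


def coma_from_beam_tilt_um(cs_mm: float, tilt_mrad: float) -> float:
    """Coma coefficient in µm, assuming C_31 = 3 * C_s * tau."""
    return 3.0 * (cs_mm * 1e3) * (tilt_mrad * 1e-3)


def max_phase_shift(c3k_um: float, wavelength_a: float, resolution_a: float) -> float:
    """Maximum phase shift in radians, assuming (2*pi/3) * C_3k * lambda^2 / d^3."""
    c3k_a = c3k_um * 1e4  # µm -> Å
    return (2 * math.pi / 3) * c3k_a * wavelength_a**2 / resolution_a**3


if __name__ == "__main__":
    lam = electron_wavelength_angstrom(200.0)
    coma = coma_from_beam_tilt_um(cs_mm=2.7, tilt_mrad=7.2)  # C_s = 2.7 mm is assumed
    phi = max_phase_shift(coma, lam, resolution_a=2.32)
    print(f"wavelength = {lam:.5f} Å, coma = {coma:.1f} µm")
    print(f"max phase shift = {phi:.1f} rad = {phi / (2 * math.pi):.1f} cycles")
```

Under these assumptions, a 7.2 mrad beam tilt corresponds to a coma of about 58 µm, and the phase shift at 2.32 Å amounts to roughly ten full cycles of $2\pi$.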
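In weak phase approximation, third order aberrations can be expressed using terms resulting from the Taylor expansion of a wave front: $\varphi = \varphi^{\max}_{3,1}\cos(\Delta\alpha_{3,1}) + \varphi^{\max}_{3,3}\cos(3\Delta\alpha_{3,3})$, where $\varphi^{\max}_{3,k}$ is the maximum phase shift of Eq. 1 and $\Delta\alpha_{3,k}$ is the angle between the particle orientation and the characteristic direction of the aberration. To obtain a modulation term for the signal at a given resolution, we average the wave front over the angular distribution of all possible particles, with the result being the Bessel function of order zero (Fig. 1):
$$\left\langle \exp\!\left[i\,\varphi^{\max}_{3,k}\cos\!\left(k\,\Delta\alpha_{3,k}\right)\right]\right\rangle_{\Delta\alpha_{3,k}} = J_0\!\left(\varphi^{\max}_{3,k}\right). \qquad (2)$$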
To our knowledge, this analytical result has not been noticed before in assessing aberrations despite having highly important consequences for their analysis. As discussed in detail later, the foremost consequence is that the structure factors of the reconstruction may become anticorrelated with the true structure factors in some resolution shells, an effect which can be missed in the standard, FSC-based half-map assessment of resolution (15-17), implying much higher resolution than that achieved.
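The angular average behind Eq. 2 can be checked numerically. The sketch below, which uses only the Python standard library, averages $\exp[i\varphi\cos(k\Delta)]$ over uniformly distributed angles and compares the result with a power-series evaluation of $J_0$. The function names are hypothetical and the series is intended only for moderate arguments.

```python
import cmath
import math


def bessel_j0(x: float, terms: int = 40) -> float:
    """Power series J0(x) = sum_m (-1)^m (x/2)^(2m) / (m!)^2, for moderate x."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((m + 1) ** 2)
    return total


def angular_average(phi_max: float, k: int, n_angles: int = 3600) -> complex:
    """Average exp(i * phi_max * cos(k * delta)) over uniformly spaced delta."""
    total = 0j
    for j in range(n_angles):
        delta = 2.0 * math.pi * j / n_angles
        total += cmath.exp(1j * phi_max * math.cos(k * delta))
    return total / n_angles


if __name__ == "__main__":
    for phi in (0.5, 2.0, 3.8, 7.0):
        for k in (1, 3):  # k = 1 for coma, k = 3 for trefoil
            avg = angular_average(phi, k)
            print(f"phi_max={phi:4.1f} k={k} "
                  f"average={avg.real:+.4f} J0={bessel_j0(phi):+.4f}")
```

The average is real, does not depend on the angular periodicity $k$, and becomes negative once the maximum phase shift passes the first zero of $J_0$ near 2.405 rad.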
We expect that data acquired with the beam-image shift method will be affected by more than one type of axial aberration. In such a case, aberrations that have the same angular dependence order (second index) but different radial order (first index) will be strongly correlated (5) and these correlations have to be considered when values of aberrations are refined. If, in refinement, the data were to have uniform information content across resolutions, the refinement would be orthogonalized by Zernike polynomials with $\rho = 1$ corresponding to the limiting resolution (18). For coma, the corresponding Zernike polynomial is $3\rho^{3} - 2\rho$, with the first term representing coma and the second representing translation. Consequently, at the resolution limit, two-thirds of the coma-produced phase shifts will be compensated for by image translation in refinement. This translation can be executed on the whole image or at the level of particles by shifting their positions in the image by the same value, with such an effect not being noticeable in a typical refinement. However, the compensation factor is reduced from the value of two-thirds because the signal is much stronger at low resolution than at the resolution limit. We determined for the analyzed high-resolution cryo-EM SPR datasets (Table 1) that translation produced only about a two-fifths compensating contribution, reducing the maximum value of the coma-induced phase shift to ~0.6 of its uncompensated value. The compensation due to resolution dependence explains some of the discrepancies in the literature discussing the acceptable limits of coma. For instance, one proposed limit was a phase shift of $\pi/4$ in the direction of coma distortion, and without translational compensation (4, 5). Using our formula (Eq. 2), we found that this will preserve $J_0(0.6 \cdot \pi/4) \approx 0.94$ of the original signal, a reduction in the SNR that is barely significant, and so the limit is too conservative, as was postulated by Cheng et al. (6). In Cheng et al.’s analysis, the impact of coma was derived from FSC curves calculated from numerical experiments, where a particular value of beam tilt (and corresponding coma) was assumed. FSC-based resolution analysis in the presence of uniform large coma can be misleading because the $J_0$ function oscillates (Fig. 1). When the signal modulation term defined by $J_0$ (Eq. 2) is negative, both halves of the split data are affected equally, and so the correlation coefficient between halves is positive even if the resulting reconstruction has a negative correlation with the truth. If the coma causes very strong modulation in the FSC (11), then it is easy to recognize the correct resolution limit corresponding to the first zero of the Bessel function $J_0$ (Eq. 1, 2). However, oscillations in the FSC curve may be pronounced or smoothed to different degrees (Fig. 2), even when data sets with very similar values of coma (Table 2) were analyzed. The method that we employed from cisTEM (19) uses only a smooth spherical mask to calculate FSC curves, and so the oscillations could not have come from a molecular mask effect (20). Thus, in the absence of pronounced oscillations, the resolution limit may be significantly overestimated compared to the consideration based on the first zero of the Bessel function. Consequently, simulation-based procedures relying solely on FSC (6) may grossly underestimate the significance of coma.
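The two-thirds compensation factor and the preserved signal fraction can be reproduced with a short calculation. The sketch below fits the linear (translation) term that best absorbs a cubic (coma) term over the Fourier disk, first with uniform weighting and then with a weight that favors low resolution. The Gaussian weight is purely hypothetical and is not fitted to any dataset; the $\pi/4$ limit is taken here as an assumption because it reproduces the ~0.94 value quoted above.

```python
import math


def bessel_j0(x: float, terms: int = 40) -> float:
    """Power series J0(x) = sum_m (-1)^m (x/2)^(2m) / (m!)^2, for moderate x."""
    total, term = 0.0, 1.0
    for m in range(terms):
        total += term
        term *= -((x / 2.0) ** 2) / ((m + 1) ** 2)
    return total


def translation_compensation(weight, n: int = 20000) -> float:
    """Least-squares coefficient a minimizing the weighted integral of
    (rho^3 - a * rho)^2 over the unit disk, i.e. the fraction of the coma
    phase at rho = 1 that a pure translation (linear in rho) absorbs."""
    num = den = 0.0
    for i in range(n):
        rho = (i + 0.5) / n        # midpoint rule on [0, 1]
        w = weight(rho) * rho      # extra rho: area element of the disk
        num += w * rho**4          # overlap of rho^3 with rho
        den += w * rho**2          # overlap of rho with rho
    return num / den


if __name__ == "__main__":
    uniform = translation_compensation(lambda rho: 1.0)
    # Hypothetical signal falloff that emphasizes low resolution.
    decaying = translation_compensation(lambda rho: math.exp(-6.0 * rho**2))
    print(f"uniform weighting: {uniform:.4f}")
    print(f"low-resolution-weighted: {decaying:.4f}")
    print(f"J0(0.6 * pi/4) = {bessel_j0(0.6 * math.pi / 4):.4f}")
```

With uniform weighting the fit recovers the Zernike value of two-thirds; any weight that decreases with resolution lowers the compensation, in line with the smaller value determined for the analyzed datasets.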
We tested our approach for two small proteins with particle size 144 kDa and 173 kDa (Fig. 2, Table 1), and we obtained reconstructions of 2.32 Å and 2.70 Å, respectively, in the presence of very high coma, with data acquired on a 200 kV Talos Arctica microscope with a K2 Gatan camera. An objective aperture, an energy filter, and a phase plate were not used, and numbers of micrographs and particles in data analysis were moderate (Table 1). To our knowledge, these are the highest resolution reconstructions obtained with a 200 kV instrument for particles having molecular mass below 200 kDa and the first high resolution reconstruction for a molecule with mass below 150 kDa. We have not found any detrimental effects of correcting coma, even in cases where the generated phase shift is very large, on the order of 10 × 2π (7.2 mrad beam tilt). We reprocessed data from EMPIAR for 200 kV instruments, applying the same aberration estimation approach to data for the larger molecules of the proteasome (700 kDa, EMPIAR 10185 and 10186) (10) and β-galactosidase (430 kDa, EMPIAR 10204) (Table 1) (21). We refined coma and trefoil independently on each micrograph. We noticed that coma can fluctuate far above refinement uncertainty and we attribute this observation to differences in the stability of the beam tilt direction between micrographs. In addition, coma refinement has a strong correlation with overall image shift, affecting the accuracy of coma determination. We found that trefoil, on the other hand, was remarkably stable and had no significant correlation with other parameters of refinement, so variations in its refined value can be used as a good indicator of statistical uncertainty for third order aberrations (Fig. 3).
The problem generated by the presence of large coma was analyzed in a recent publication (11) which described the refinement of EMPIAR 10263, dataset III, with coma refinement performed by Relion 3.0 (13) and JSPR (11, 22), with results presented in Fig. 6B in (11). We reprocessed this dataset and obtained a much tighter clustering of coma values, similar to the clustering observed for our datasets. Relion 3.0 (13) seriously underestimated coma by a factor of ~4 compared to our estimate, and the average of the JSPR individual refinements per micrograph had a value that was underestimated by twenty-five percent compared to our values. However, the target function in our refinement procedure is similar to those of JSPR (11, 22) and Relion 3.0 (13), and so the different outcomes are most probably due to differing refinement schemes (11). The results presented in Fig. 6B from reference (11) are consistent with the behavior of our procedure when compared to the results from our first cycle.
We observed that only the presence of a large fraction of incorrect particles or particles with grossly incorrect orientation in a micrograph would bias the coma refinement toward the starting point. When starting with coma refined by the previous cycle, re-refining the particle orientation attenuated the bias in the next cycle, so even the presence of a moderate number of bad particles did not affect the convergence of the procedure. We achieved additional improvement, in terms of resolution and spread of coma values, when we changed the null hypothesis regarding coma from a value of zero to the average of the refined coma values. This more appropriate null hypothesis was applied by taking advantage of the associative properties of coma, which allowed us to apply coma phase shift correction to the images; therefore, all the steps in the final round of the data analysis, starting from particle picking, were performed on images corrected by the average value of coma. Subsequent coma refinement, representing a difference from the previous average used in image correction, was still performed independently for all micrographs, but the residual bias resulting from the initial coma value being zero was eradicated. Coma or beam tilt refinement has to use the same type of reference-based target function irrespective of implementation. We would expect that small corrections would be equally well-characterized by all programs. However, local refinements and restraints may have different convergence ranges depending on the implementation, and this is most likely the explanation for the differences in results between programs.
Uncorrected aberrations create systematic patterns of phase shift and if such aberration error is constant across a dataset, then the reconstruction will also be altered in a systematic way, not only by losing amplitude but also by flipping its sign at some resolutions. In this case, the FSC curve may undergo oscillations, with the first minimum being the effective resolution limit of the result. Oscillations in the FSC curve are recognized as a qualitative sign of problems with the quality of cryo-EM SPR results (20). We provide here (Eq. 2) an explanation for another possible source of these oscillations. In our glucose isomerase (GI) data acquired with 58 µm coma, corresponding to 7.2 mrad beam tilt, before correcting coma we observed four oscillations in the FSC curve resulting from phase shift (Fig. 2), with the map interpretability being inconsistent with the FSC-based resolution indicator. Therefore, if FSC oscillations are encountered, refining coma is highly recommended to diagnose the problem, with the potential outcome being substantial reconstruction improvement by correcting the aberration.
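The sign-flip effect on FSC can be illustrated with a toy simulation of a single resolution shell. In the sketch below, both half-maps carry the same modulation of the true structure factors plus independent noise; the modulation value, noise level, and function names are hypothetical and serve only to show the qualitative behavior.

```python
import math
import random


def shell_correlation(a, b):
    """Normalized correlation of two lists of complex Fourier coefficients."""
    num = sum((x * y.conjugate()).real for x, y in zip(a, b))
    den = math.sqrt(sum(abs(x) ** 2 for x in a) * sum(abs(y) ** 2 for y in b))
    return num / den


def simulate_shell(modulation, n_coeff=2000, noise_sigma=0.2, seed=0):
    """Return (half-map correlation, correlation of the full map with truth)
    for one shell whose signal is multiplied by a common modulation factor."""
    rng = random.Random(seed)
    truth = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(n_coeff)]

    def half_map():
        return [modulation * f
                + complex(rng.gauss(0, noise_sigma), rng.gauss(0, noise_sigma))
                for f in truth]

    half1, half2 = half_map(), half_map()
    full = [(x + y) / 2 for x, y in zip(half1, half2)]
    return shell_correlation(half1, half2), shell_correlation(full, truth)


if __name__ == "__main__":
    for m in (0.4, -0.4):
        fsc, cc_truth = simulate_shell(m)
        print(f"modulation={m:+.1f}  half-map FSC={fsc:+.3f}  "
              f"correlation with truth={cc_truth:+.3f}")
```

For a negative modulation, the half-map correlation stays positive while the correlation with the true structure factors is negative, which is the situation in which an FSC-based resolution estimate is misleading.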

Discussion
Although instruments may be aligned quite accurately before data collection, our analyses and those of others (4, 6, 10, 11) indicate that coma can vary significantly between images and the extent of the variation is dataset dependent. For this reason, we recommend refinement of not only overall coma but also coma for individual images. Trefoil typically is not important, but for one pair of EMPIAR datasets (EMPIAR 10185 and EMPIAR 10186) and also our datasets (not shown), it was highly significant. However, even in these cases, only the overall value of trefoil for the dataset was important, with an insignificant level of variations between individual images.
Correction for antisymmetric aberrations can be performed during reconstruction (11, 13), but the consequence of the associative property is that it can just as well be performed at the whole micrograph level before reconstruction commences. In the presence of large coma, correcting coma at the image level reduces the point spread function of the imaging system so it may improve particle masking operations. If the value of these aberrations is known from external calibration, then the image-based correction may simplify applying these corrections to the data without modifying downstream analysis programs.
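A micrograph-level correction of this kind amounts to multiplying the image Fourier transform by a pure phase factor. The NumPy sketch below is a minimal illustration under stated assumptions: the phase is taken as $(2\pi/3)\,C\,\lambda^2 s^3$ times the angular term, with $s = 1/d$; the sign convention of the phase factor is arbitrary here; and all function and parameter names are hypothetical rather than taken from cisTEM or any other program.

```python
import numpy as np


def antisymmetric_phase(shape, pixel_size_a, wavelength_a,
                        c31_um=0.0, coma_angle=0.0,
                        c33_um=0.0, trefoil_angle=0.0):
    """Coma plus trefoil phase (radians) on the FFT frequency grid."""
    fy = np.fft.fftfreq(shape[0], d=pixel_size_a)[:, None]  # cycles/Å
    fx = np.fft.fftfreq(shape[1], d=pixel_size_a)[None, :]
    s = np.hypot(fx, fy)                  # spatial frequency, 1/d
    alpha = np.arctan2(fy, fx)            # direction of the scattering vector
    radial = (2 * np.pi / 3) * wavelength_a**2 * s**3
    angular = (c31_um * np.cos(alpha - coma_angle)
               + c33_um * np.cos(3 * (alpha - trefoil_angle)))
    return radial * 1e4 * angular         # 1e4 converts µm to Å


def apply_phase(image, phase):
    """Multiply the image Fourier transform by exp(i * phase).

    For odd image dimensions an antisymmetric phase keeps the result real;
    for even dimensions the Nyquist row and column have no antisymmetric
    partner, and taking the real part discards that inconsistency.
    """
    return np.fft.ifft2(np.fft.fft2(image) * np.exp(1j * phase)).real


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image = rng.normal(size=(63, 65))     # stand-in for a micrograph
    total = antisymmetric_phase(image.shape, 0.91, 0.02508,
                                c31_um=58.0, coma_angle=0.3, c33_um=2.0)
    average = antisymmetric_phase(image.shape, 0.91, 0.02508,
                                  c31_um=50.0, coma_angle=0.3)
    aberrated = apply_phase(image, total)
    # Correct first by an average value, then by the residual.
    restored = apply_phase(apply_phase(aberrated, -average), -(total - average))
    print("max restoration error:", np.abs(restored - image).max())
```

Because the phases add, correcting first by an average aberration and then by the per-image residual is equivalent to a single correction by the total, which is what allows the correction to be split across processing steps.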
What affects cryo-EM SPR results is the error in the aberration model used and not the magnitude of the aberrations themselves, at least up to the theoretical limit where the image of a point source (e.g. coma) extends outside of the detector. Therefore, it is important to calibrate all aberrations and not only the ones that affect the power spectrum. This can be accomplished prior to data collection or a posteriori by reference-based refinement using the structure being solved. Once appropriate coma calibrations and corrections are in place, procedures in which the beam is intentionally tilted can be used on a larger scale. This can provide limited but additional three-dimensional particle information on top of the projection image without any time and precision penalties associated with mechanical rotations.

Materials and methods
Protein expression, purification, and grid preparation: Glucose isomerase (GI), also called xylose isomerase, from Streptomyces rubiginosus was purchased from Hampton Research (23). Protein slurry was dialyzed three times against an excess of dH2O and concentrated to ~40 mg/ml with an Amicon filter.
HemQ protein was one of the structural genomics targets (MCSG APC35880). We have solved its X-ray crystallographic structure (PDB code: 1T0T.pdb) and recently others determined its function (24). Expression and purification of HemQ, encoded by the GYMC52_3505 plasmid in the pMCSG7 vector with a Tobacco Etch Virus (TEV)-cleavable N-terminal His6-tag (25), followed a previously established protocol (26). After purification and tag cleavage, the protein was extensively dialyzed against 20 mM HEPES pH 7.5 and used at ~28 mg/ml concentration for grid preparation. The plasmid GYMC52-3505 is available from the DNASU Plasmid Repository (https://dnasu.org).
Cryo-EM grids for both proteins were prepared with an FEI Vitrobot Mark IV. In each case, 3 µl of protein solution was applied to the grid at 4 °C and 100% humidity, followed by 6 s of blotting with blot force 20 before grids were plunged into liquid ethane cooled with liquid nitrogen (Fig. S2).

Data acquisition and analysis: The cryo-EM dataset for GI was collected with a 200 kV Talos Arctica microscope equipped with a K2 Gatan camera, with a physical pixel size of 0.91 Å. The phase plate was not used and the objective aperture was not inserted. A total of 202 movies with an exposure time of 100 s/movie were collected. Each movie contains 200 frames with an exposure time of 0.5 s/frame and an electron dose of 140 e/Å² per movie (Table 1).
Both the HemQ-57K and HemQ-45K datasets were also collected in the same alignment conditions on a 200 kV Talos Arctica microscope with a K2 Gatan camera run in super-resolution mode, with a physical pixel size of 0.72 Å for HemQ-57K and 0.91 Å for HemQ-45K. For HemQ-57K, 268 movies were collected with an exposure time of 40 s/movie. Each movie contains 100 frames with an exposure time of 0.4 s/frame and an electron dose of 90 e/Å² per movie. For HemQ-45K, 257 movies were collected with an exposure time of 40 s/movie. Each movie contains 100 frames with an exposure time of 0.4 s/frame and an electron dose of 90 e/Å² per movie.
Complete datasets for EMPIAR deposits 10204, 10185 and 10186 were processed as examples of datasets collected at 200 kV. EMPIAR 10185 and EMPIAR 10186 were collected consecutively on the same instrument with EMPIAR 10185 collected with a traditional setup by moving only the stage, and EMPIAR 10186 collected with the beam-image shift method. We performed image-specific correction for coma in both datasets, producing a material improvement in resolution (Table 2). EMPIAR 10263, dataset III, served as an example of a dataset collected at 300 kV with a large coma value that was partially corrected by alternative methods.
We processed all datasets with cisTEM (19). We modified the cisTEM pipeline by adding reference-based refinement of aberrations, including coma and trefoil, following the same design as implemented in Relion 3.0 (11, 13). As discussed in the Results and Discussion sections, we performed multiple cycles that included aberration refinement, orientation refinement, and creation of a new reference (19). In these cycles, the resolution limit of data used for the orientation refinement was selected based on manual assessment of cisTEM’s SNR estimate exceeding a threshold value, typically ~4. The data collection and analysis statistics are summarized in Table 1. The cryo-EM movies and maps used in data analysis of GI and HemQ proteins are deposited in EMPIAR under [code 1] and [code 2] codes. For figure preparation, we placed models generated by crystallography (5VR0.pdb for GI and 1T0T.pdb for HemQ) into cryo-EM maps, using the rigid body refinement option available in Coot (27, 28).

Acknowledgements: We thank Tabitha Emde for protein purification and grid preparation for the HemQ protein. We thank the Cryo-Electron Microscopy Facility at UT Southwestern Medical Center, which is supported by grant RP170644 from the Cancer Prevention & Research Institute of Texas (CPRIT), for maintaining the Talos Arctica microscope. This project has been funded in part with federal funds from the National Institute of Allergy and Infectious Diseases, National Institutes of Health, Department of Health and Human Services, under Contract No. HHSN272201700060C. This project was also supported by the National Institutes of Health (R21GM126406 to DB, and R01GM117080 and R01GM118619 to ZO) and the Department of Energy (DE-SC0019600 to YG).

Author Contributions: RB, DB, and ZO developed an approach to aberration analysis; RB and ZO implemented the approach; DB acquired data; RB, YG and ZO analyzed data; RB, YG, DB and ZO wrote the manuscript.
Conflict of interest statement: RB, YG, DB, and ZO are co-founders of Ligo Analytics. YG serves as the CEO of Ligo Analytics. ZO is a co-founder of HKL Research. RB, ZO, and DB are co-inventors listed on a provisional patent application that has been filed on this work.
References
1. Cheng Y, Grigorieff N, Penczek PA, & Walz T (2015) A primer to single-particle cryo-electron microscopy. Cell 161(3):438-449.
2. Wade RH (1992) A Brief Look at Imaging and Contrast Transfer. Ultramicroscopy 46(1-4):145-156.
3. Scherzer O (1936) Über einige Fehler von Elektronenlinsen. Zeitschrift für Physik 101(9):593-603.
4. Glaeser RM, Typke D, Tiemeijer PC, Pulokas J, & Cheng A (2011) Precise beam-tilt alignment and collimation are required to minimize the phase error associated with coma in high-resolution cryo-EM. J Struct Biol 174(1):1-10.
5. Uhlemann S & Haider M (1998) Residual wave aberrations in the first spherical aberration corrected transmission electron microscope. Ultramicroscopy 72(3-4):109-119.
6. Cheng A, et al. (2018) High resolution single particle cryo-electron microscopy using beam-image shift. J Struct Biol 204(2):270-275.
7. Mastronarde DN (2005) Automated electron microscope tomography using robust prediction of specimen movements. J Struct Biol 152(1):36-51.
8. Suloway C, et al. (2005) Automated molecular microscopy: the new Leginon system. J Struct Biol 151(1):41-60.
9. Cheng A, Tan YZ, Dandey VP, Potter CS, & Carragher B (2016) Strategies for Automated CryoEM Data Collection Using Direct Detectors. Methods Enzymol 579:87-102.
10. Herzik MA, Jr., Wu M, & Lander GC (2017) Achieving better-than-3-Å resolution by single-particle cryo-EM at 200 keV. Nat Methods 14(11):1075-1078.
11. Li K, et al. (2019) Sub-3 Å apoferritin structure determined with full range of phase shifts using a single position of volta phase plate. J Struct Biol 206(2):225-232.
12. Hopkins HH (1984) Image Shift, Phase-Distortion and the Optical Transfer-Function. Opt Acta 31(3):345-368.
13. Zivanov J, et al. (2018) New tools for automated high-resolution cryo-EM structure 339
determination in RELION-3. Elife 7. 340
14. Barthel J (2007) Ultra-precise measurement of optical aberrations for Sub-Ångström 341
transmission electron microscopy. (Universitätsbibliothek). 342
15. Saxton WO & Baumeister W (1982) The correlation averaging of a regularly arranged 343
bacterial cell envelope protein. J Microsc 127(Pt 2):127-138. 344
16. van Heel M (1987) Similarity Measures between Images. Ultramicroscopy 21(1):95-99. 345
17. Scheres SH & Chen S (2012) Prevention of overfitting in cryo-EM structure determination. 346
Nat Methods 9(9):853-854. 347
18. Bhatia AB & Wolf E (1954) On the circle polynomials of Zernike and related orthogonal 348
sets. Mathematical Proceedings of the Cambridge Philosophical Society 50(1):40-48. 349
19. Grant T, Rohou A, & Grigorieff N (2018) cisTEM, user-friendly software for single-particle 350
image processing. Elife 7. 351
20. Penczek PA (2010) Resolution measures in molecular electron microscopy. Methods 352
Enzymol 482:73-100. 353
21. Iudin A, Korir PK, Salavert-Torres J, Kleywegt GJ, & Patwardhan A (2016) EMPIAR: a 354
public archive for raw electron microscopy image data. Nat Methods 13(5):387-388. 355
22. Guo F & Jiang W (2014) Single particle cryo-electron microscopy and 3-D reconstruction 356
of viruses. Methods Mol Biol 1117:401-443. 357
23. Borek D, Bromberg R, Hattne J, & Otwinowski Z (2018) Real-space analysis of radiation-358
induced specific changes with independent component analysis. J Synchrotron Radiat 359
25(Pt 2):451-467. 360
24. Celis AI, et al. (2017) Structure-Based Mechanism for Oxidative Decarboxylation 361
Reactions Mediated by Amino Acids and Heme Propionates in Coproheme Decarboxylase 362
(HemQ). J Am Chem Soc 139(5):1900-1911. 363
.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 8, 2019. . https://doi.org/10.1101/798280doi: bioRxiv preprint
17
25. Stols L, et al. (2002) A new vector for high-throughput, ligation-independent cloning 364
encoding a tobacco etch virus protease cleavage site. Protein Expr Purif 25(1):8-15. 365
26. Kim Y, et al. (2011) High-throughput protein purification and quality assessment for 366
crystallization. Methods 55(1):12-28. 367
27. Emsley P & Cowtan K (2004) Coot: model-building tools for molecular graphics. Acta 368
Crystallogr D Biol Crystallogr 60(Pt 12 Pt 1):2126-2132. 369
28. Emsley P, Lohkamp B, Scott WG, & Cowtan K (2010) Features and development of Coot. 370
Acta Crystallogr D Biol Crystallogr 66(Pt 4):486-501. 371
372
.CC-BY-NC 4.0 International licensecertified by peer review) is the author/funder. It is made available under aThe copyright holder for this preprint (which was notthis version posted October 8, 2019. . https://doi.org/10.1101/798280doi: bioRxiv preprint
18
Figure legends
Figure 1. Resolution dependence of the effect of third-order aberrations on single-particle reconstruction. The X-axis represents reciprocal-space resolution, scaled to a = 1 for x = 1 (Eq. 1). The Y-axis represents the reciprocal-space signal modulation of the reconstruction that results from averaging the aberration.

Figure 2. FSC plots before (black) and after (red) coma correction and the corresponding final map fragment for three experiments with high beam tilt (coma) values. The statistics from each experiment are presented in Table 1. The vertical dotted grey line represents the first zero of the modulation function (Eq. 1) and the solid line represents the first zero of the modulation function with the assumption that 40% of the coma impact was compensated by image translation. The resolutions corresponding to the first zero values are listed in Table 2.

Figure 3. Heat maps for coma and trefoil values refined separately per image from the HemQ-57K dataset. The leftmost panel shows tight clustering of the coma values, with the center panel magnifying the region of clustering. The right panel shows the values of trefoil, which for this dataset were insignificant.

Supplemental Figure S1. Definition of the orientation angle for particles coming from the same projection. This angle is not affected by forces generating preferred orientation in the typical setup of the sample grid being perpendicular to the beam.

Table 1. Data collection and processing

Instrument: Talos Arctica 200 kV; Cryo-Arm 200 kV

| Dataset | GI | HemQ-57K | HemQ-45K | EMPIAR 10185 | EMPIAR 10186 | EMPIAR 10204 |
|---|---|---|---|---|---|---|
| Phase plate | No | No | No | No | No | No |
| Energy filter | No | No | No | No | No | No |
| Objective aperture | No | No | No | Yes | Yes | Not known |
| Frames per movie | 200 | 100 | 100 | 68 | 68 | 49 |
| Electron dose (e⁻/Å²/frame) | 0.7 | 0.9 | 0.9 | 0.99ᵃ | 1.0ᵃ | 1.38 |
| Exposure time (s/frame) | 0.5 | 0.4 | 0.4 | 0.25 | 0.25 | Not known |
| K2 super-resolution mode | No | Yes | Yes | Yes | Yes | No |
| Detector pixel size (Å) | 0.91 | 0.72 | 0.91 | 0.91 | 0.91 | 0.885 |
| Data pixel size (Å) | N/A | 0.36 | 0.455 | 0.455 | 0.455 | N/A |
| Movies collected/deposited | 202 | 268 | 257 | 315 | 260 | 2161 |
| Movies used for processing | 149 | 258 | 173 | 315 | 260 | 415 |
| Molecular weight (kDa) | 173 | 144 | 144 | 659 | 659 | 465 |
| Particle symmetry | D2 | C5 | C5 | D7 | D7 | D2 |
| Total picked particles | 114522 | 156210 | 236091 | 109695 | 91186 | 157513 |
| Particles after 2D averaging | 85527 | 145966 | 174776 | No 2D classification | No 2D classification | 82721 |
| Particles used in refinement | 61909 | 81302 | 129446 | 85847 | 78689 | 52340 |

ᵃ Count-based estimation.
Table 2. Resolution without and with correction for coma for the analyzed datasets.

| Dataset | GI | HemQ-45 | HemQ-57 | EMPIAR 10204 | EMPIAR 10185 | EMPIAR 10186 |
|---|---|---|---|---|---|---|
| Reported resolution [Å] | NA | NA | NA | NA | 3.1 | 3.3 |
| FSC0.143-based resolution before correction [Å] | 4.1 | 3.8 | 4.3 | 2.5 | 3.1 | 3.2 |
| FSC0.143-based resolution after correction [Å] | 2.7 | 2.6 | 2.3 | 2.5 | 2.5 | 2.4 |
| Coma [µm] / beam tilt [mrad] | 42.7/5.3 | 56.9/7.0 | 56.2/6.9 | Variedᵃ | Substantial trefoil and variable comaᵇ | Substantial trefoil and variable comaᵇ |
| Trefoil [µm] | 0.62 | 0.79 | 0.49 | 0.09 | 2.74 | 3.06 |
| Resolution at the first zero of the modulation function (Eq. 1) with 40% compensation [Å] | 5.2 | 5.7 | 5.7 | ND | ND | ND |

ᵃ Coma varied between movies by more than 10 µm, indicating ~1.2 mrad beam-tilt variation.
ᵇ EMPIAR 10185 and 10186 were collected consecutively and share the same stable value of trefoil. EMPIAR 10185 was collected with stage shift and has coma variation similar to that of EMPIAR 10204. EMPIAR 10186 was collected with beam-image shift, which induced additional coma variation.
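
The last row of Table 2 can be approximated with a short calculation. Eq. 1 is not reproduced in this part of the manuscript, so the sketch below rests on assumptions that are not taken from the text: the modulation is modelled as the Bessel function J0 of the coma phase amplitude, that amplitude is taken as (2π/λ)(B/3)α³ with B the coma coefficient and α = λ/d the scattering angle, and "40% compensation" is read as 60% of the coma remaining. The function names and the 200 kV default are illustrative only. The coma and beam-tilt values come from Table 2.

```python
import math

# Physical constants (SI units).
H = 6.62607015e-34      # Planck constant, J s
M_E = 9.1093837015e-31  # electron rest mass, kg
Q_E = 1.602176634e-19   # elementary charge, C
C = 299792458.0         # speed of light, m/s

J0_FIRST_ZERO = 2.404826  # first zero of the Bessel function J0


def electron_wavelength_angstrom(voltage_v):
    """Relativistic electron wavelength (Å) for an accelerating voltage in volts."""
    energy = Q_E * voltage_v
    momentum = math.sqrt(2.0 * M_E * energy * (1.0 + energy / (2.0 * M_E * C**2)))
    return H / momentum * 1e10


def first_zero_resolution(coma_um, voltage_v=200e3, compensated=0.4):
    """Resolution (Å) at which the ASSUMED J0 modulation first reaches zero."""
    wavelength = electron_wavelength_angstrom(voltage_v)
    effective_coma = (1.0 - compensated) * coma_um * 1e4  # µm -> Å
    # Assumed condition: (2*pi/lambda) * (B/3) * alpha**3 == J0_FIRST_ZERO
    alpha_cubed = 3.0 * J0_FIRST_ZERO * wavelength / (2.0 * math.pi * effective_coma)
    alpha = alpha_cubed ** (1.0 / 3.0)  # scattering angle, rad
    return wavelength / alpha


# (coma in µm, beam tilt in mrad), copied from Table 2.
TABLE2 = {"GI": (42.7, 5.3), "HemQ-45": (56.9, 7.0), "HemQ-57": (56.2, 6.9)}

for name, (coma, tilt) in TABLE2.items():
    print(f"{name}: first zero near {first_zero_resolution(coma):.1f} Å, "
          f"coma per mrad of beam tilt = {coma / tilt:.2f} µm")
```

Under these assumptions the three datasets give first zeros close to the tabulated 5.2, 5.7 and 5.7 Å. The coma-to-tilt ratio of roughly 8.1 µm per mrad is also consistent with footnote a, where a coma variation of 10 µm corresponds to about 1.2 mrad of beam tilt.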