Log In Sign Up

Orthogonal Matrix Retrieval in Cryo-Electron Microscopy

by   Tejal Bhamre, et al.

In single particle reconstruction (SPR) from cryo-electron microscopy (cryo-EM), the 3D structure of a molecule needs to be determined from its 2D projection images taken at unknown viewing directions. Zvi Kam showed already in 1980 that the autocorrelation function of the 3D molecule over the rotation group SO(3) can be estimated from 2D projection images whose viewing directions are uniformly distributed over the sphere. The autocorrelation function determines the expansion coefficients of the 3D molecule in spherical harmonics up to an orthogonal matrix of size (2l+1)× (2l+1) for each l=0,1,2,.... In this paper we show how techniques for solving the phase retrieval problem in X-ray crystallography can be modified for the cryo-EM setup for retrieving the missing orthogonal matrices. Specifically, we present two new approaches that we term Orthogonal Extension and Orthogonal Replacement, in which the main algorithmic components are the singular value decomposition and semidefinite programming. We demonstrate the utility of these approaches through numerical experiments on simulated data.


Mahalanobis Distance for Class Averaging of Cryo-EM Images

Single particle reconstruction (SPR) from cryo-electron microscopy (EM) ...

Orthogonal Matrix Retrieval with Spatial Consensus for 3D Unknown-View Tomography

Unknown-view tomography (UVT) reconstructs a 3D density map from its 2D ...

3D Unwrapped Phase Retrieval with Coded Aperture is Reducible to Projection Tomography

A discrete framework of 3D tomographic phase retrieval with a coded aper...

Representation Theoretic Patterns in Multi-Frequency Class Averaging for Three-Dimensional Cryo-Electron Microscopy

We develop in this paper a novel intrinsic classification algorithm -- m...

Determining a rotation of a tetrahedron from a projection

The following problem, arising from medical imaging, is addressed: Suppo...

1 Introduction

SPR from cryo-EM is an increasingly popular technique in structural biology for determining 3D structures of macromolecular complexes that resist crystallization [1, 2, 3]. In the basic setup of SPR, the data collected are 2D projection images of ideally assumed identical, but randomly oriented, copies of a macromolecule. In cryo-EM, the sample of molecules is rapidly frozen in a thin layer of vitreous ice, and maintained at liquid nitrogen temperature throughout the imaging process [4]. The electron microscope provides a top view of the molecules in the form of a large image called a micrograph. The projections of the individual particles can be picked out from the micrograph, resulting in a set of projection images. Datasets typically range from to projection images whose size is roughly pixels.

Mathematically, ignoring the effects of the microscope’s contrast transfer function and noise, a 2D projection image corresponding to rotation is given by the integral of the Coulomb potential that the molecule induces


where . The 3D reconstruction problem in cryo-EM is a non-linear inverse problem in which needs to be estimated from multiple noisy discretized projection images of the form (1) for which the rotations are unknown.

Radiation damage limits the maximum allowed electron dose. As a result, the acquired 2D projection images are extremely noisy with poor signal-to-noise ratio (SNR). Estimating and the unknown rotations at very low SNR is a major challenge.

The 3D reconstruction problem is typically solved by guessing an initial structure and then performing an iterative refinement procedure, where iterations alternate between estimating the rotations given a structure and estimating the structure given rotations [1, 5, 6]. When the particles are too small and images too noisy, the final result of the refinement process depends heavily on the choice of the initial model, which makes it crucial to have a good initial model. If the molecule is known to have a preferred orientation, then it is possible to find an ab-initio 3D structure using the random conical tilt method  [7, 8]

. There are two known approaches to ab initio estimation that do not involve tilting: the method of moments 

[9, 10], and common-lines based methods [11, 12, 13].

Using common-lines based approaches,  [14] was able to obtain three-dimensional ab-initio reconstructions from real microscope images of large complexes that had undergone only rudimentary averaging. However, researchers have so far been unsuccessful in obtaining meaningful 3D ab-initio models directly from raw images that have not been averaged, especially for small complexes.

We present here two new approaches for ab-initio modelling that are based on Kam’s theory [15] and that can be regarded as a generalization of the molecular replacement method from X-ray crystallography to cryo-EM. The only requirement for our methods to succeed is that the number of collected images is large enough for accurate estimation of the covariance matrix of the 2D projection images.

2 Kam’s theory and the Orthogonal matrix retrieval problem

Kam showed [15] using the Fourier projection slice theorem (see, e.g., [16, p. 11]) that if the viewing directions of the projection images are uniformly distributed over the sphere, then the autocorrelation function of the 3D volume with itself over the rotation group SO(3) can be directly computed from the covariance matrix of the 2D images. Let

be the 3D Fourier transform of

and consider its expansion in spherical coordinates


where is the radial frequency and are the real spherical harmonics. Kam showed that


can be estimated from the covariance matrix of the 2D projection images. For images sampled on a Cartesian grid, each matrix is of size , where is the maximum frequency (dictated by the experimental setting). In matrix notation, eq.(3) can be rewritten as


where is a matrix size whose ’th column is . The factorization (4) of , also known as the Cholesky decomposition, is not unique: If satisfies (4), then also satisfies (4) for any unitary matrix (i.e., ).

Since , the electric potential induced by the molecule, is real-valued, its Fourier transform satisfies , or equivalently, . Together with properties of the real spherical harmonics, it follows that (and therefore ) is real for even

and purely imaginary for odd

. Then is unique up to a orthogonal matrix , where


Originally, functions of the radial frequency are required for each in order to completely characterize . With the additional knowledge of the parameter space is reduced to . We refer to the problem of recovering the missing orthogonal matrices as the orthogonal matrix retrieval problem in cryo-EM.

2.1 Analogy with X-ray crystallography

The orthogonal matrix retrieval problem is akin to the phase retrieval problem in X-ray crystallography. In crystallography, the measured diffraction patterns contain information about the modulus of the 3D Fourier transform of the structure but the phase information is missing and needs to be obtained by other means. Notice that in crystallography, the particle’s orientations are known but the phases of the Fourier coefficient are missing, while in electron microscopy, the projection images contain phase information but the orientations of the particles are missing. Kam’s theory converts the cryo-EM problem to one akin to the phase retrieval problem in crystallography. From a mathematical standpoint, the phase retrieval problem in crystallography is perhaps more challenging than the orthogonal matrix retrieval problem in cryo-EM, because in crystallography each Fourier coefficient is missing its phase, while in cryo-EM only a single orthogonal matrix is missing per several radial components.

3 Orthogonal Extension (OE)

A classical solution to the missing phase problem in crystallography is molecular replacement, which relies upon the existence of a previously solved structure which is similar to the unknown structure from which the diffraction data is obtained. The structure is then estimated using the Fourier magnitudes from the diffraction data with the phases from the homologous structure. We mimic this approach in cryo-EM, by grafting the orthogonal matrices of the already resolved similar structure onto the unknown structure.

Let be the unknown structure, and suppose is a known homologous structure, whose 3D Fourier transform has the following expansion in spherical harmonics


We can obtain the auto-correlation matrices from the cryo-EM images of the unknown structure using Kam’s method. Let be any matrix satisfying , determined from the Cholesky decomposition of . Then


where . Requiring , in orthogonal extension we determine as the solution to the least squares problem


where denotes the Frobenius norm.

Although the orthogonal group is non-convex, there is a closed form solution to (8) (see, e.g., [17]) given by




is the singular value decomposition (SVD) of . Thus, we estimate by


In analogy with crystallography, the phase information () from the resolved homologous structure appends the experimentally measured intensity information (). We note that other magnitude correction schemes have been used in crystallography. For example, setting the magnitude to be twice the magnitude from the desired structure minus the magnitude from the known structure, has the desired effect of properly weighting the difference between the two structures, but also the undesired effect of doubling the noise level. The cryo-EM analog in this case would be estimating by


4 Orthogonal Replacement (OR)

We move on to describe Orthogonal Replacement, our approach for resolving structures for which there does not exist a homologous structure. Suppose and are two unknown structures for which we have cryo-EM images. We assume that their difference is known. This can happen, for example, when an antibody fragment of a known structure binds to a protein. We have two sets of cryo-EM images, one from the protein alone, and another from the protein plus the antibody, . Let be the matrices computed from the sample covariance matrices of the 2D projection images of . Let be any matrix satisfying . We have , where . The matrices need to be determined for and . The difference is known from the 3D Fourier transform of the binding structure . We have


4.1 Relaxation to a Semidefinite Program

The least squares problem


is a non-convex optimization problem with no closed form solution. We find and using convex relaxation in the form of semidefinite programming (SDP). We first homogenize (13) by introducing a slack unitary variable and consider the augmented linear system


If the triplet is a solution to (15), then the pair is a solution to the original linear system (. The corresponding least squares problem


is still non-convex. But it can be relaxed to an SDP. Let be a symmetric matrix, which can be expressed as a block matrix with block size , and the ’th block is given by


It follows that is positive semidefinite (denoted ). Moreover, the three diagonal blocks of are () and . The cost function in (16) is quadratic in , so it is linear in . The problem can be equivalently rewritten as


over , subject to , and , where the matrix can be written in terms of , and . Here, we have only one non-convex constraint – the rank constraint. Upon dropping the rank constraint we arrive at an SDP that can be solved efficiently in polynomial time in . We extract the orthogonal matrices from the decomposition (17) of . If the solution matrix has rank greater than (which is possible since we dropped the rank constraint), then we employ the rounding procedure of [18].

4.2 Exact Recovery and Resolution Limit

We have the following theoretical guarantee on recovery of and using the SDP relaxation in the noiseless case:

Theorem 1.

Assume that and are elementwise sampled from i.i.d. Gaussian , and , then the SDP method recovers and almost surely.

The proof of Theorem 1 is beyond the scope of this paper and is deferred to a separate publication. Theorem 1

shows that the SDP method almost achieves the theoretical information limit, since by counting the degrees of freedom in (

13) it is impossible to recover and if . Indeed, the number of free parameters associated with an orthogonal matrix in is , while the number of equations in (13) is . This introduces a natural resolution limit on structures that can be resolved. Only angular frequencies for which can be determined using OR.

5 Numerical experiments

We present the results of numerical experiments on simulated images ( pixels) of the Kv1.2 potassium channel complex (Fig. 1 A and B) with clean and noisy projection images. The experiments were performed in MATLAB in UNIX environment on an Intel (R) Xeon(R) X7542 with 2 CPUs, having 6 cores each, running at 2.67 GHz, and with 256 GB RAM in total. To solve the SDP we used the MATLAB package CVX [19], and to compute the covariance matrix of the 2D images we used the steerable PCA procedure [20].

Kv1.2 is a dumbbell-shaped particle consisting of two subunits - a small subunit and a larger subunit, connected by a central connector. We performed experiments using OE and OR, assuming one of the subunits (e.g., ) is known, while the other is unknown. In the case of OR, we additionally used projection images of the unknown subunit.

Figure 1: Kv1.2 potassium channel: A) Volume visualization in UCSF Chimera [21]. B) Image from Protein Data Bank Japan (PDBj). C through F show reconstructions from clean images - C) OE with known, D) OE with known, E) OR with known, and F) OE with known. G through J show reconstructions from noisy images using OR - G) SNR=0.7 with known, H) SNR=0.7 with known, I) SNR=0.35 with known, and J) SNR=0.35 with known.
Figure 2: Projection images at different values of SNR: A) Clean image, B) SNR=0.7, and C) SNR=0.35.

5.1 Clean and Noisy Projections

We reconstruct the structure from both clean and noisy projection images. The reconstruction of Kv1.2 obtained from clean images using OE and OR is shown in Fig. 1 C through F. We used the true matrices for the known subunit, and a maximum of 30. We tested OR to reconstruct Kv1.2 from noisy projections at various values of SNR. A sample projection image at different values of SNR is shown in Fig. 2. The matrices were estimated from the noisy projection images. In Fig. 1 G through J we show the reconstructions obtained from 10000 projections using OR at SNR=0.7, and from 40000 projections using OR at SNR=0.35. In our simulations with 10000 images, it takes seconds to perform steerable PCA, seconds to calculate the matrices using the maximum as 30, and the time to solve the SDP as a function of ranges from 5 seconds for to seconds for .

5.2 Comparison between OE and OR

We quantify the ‘goodness’ of the reconstruction using the Fourier Cross Resolution (FCR) [22]. In Fig. 3 we show the FCR curves for the reconstruction from the complex using OE and OR. The additional information in OR, from the projection images of , results in a better reconstruction, as seen from the FCR curve. The Kv1.2 complex has C4 symmetry, which reduces the rank of the matrices. Our experiment thus benefits from the reduced size of the orthogonal matrices to be recovered.

Figure 3: FCR curve for reconstruction from (clean images)

6 Summary

We presented two new approaches based on Kam’s theory for ab-initio modelling of macromolecules for SPR from cryo-EM. Ab-initio modelling of small complexes is a challenging problem in cryo-EM because it is difficult to detect common lines between noisy projection images at low SNR. Our methods only require reliable estimation of the covariance matrix of the projection images which can be met even at low SNR if the number of images is sufficiently large. In future work we plan to estimate the covariance matrix when images are not centered, and to apply our methods to experimental datasets.


  • [1] J. Frank, Three-Dimensional Electron Microscopy of Macromolecular Assemblies : Visualization of Biological Molecules in Their Native State: Visualization of Biological Molecules in Their Native State, Oxford University Press, USA, 2006.
  • [2] W. Kühlbrandt, “The resolution revolution,” Science, vol. 343, pp. 1443–1444, 2014.
  • [3] X. Bai, G. McMullan, and S.H.W Scheres, “How cryo-em is revolutionizing structural biology,” Trends in Biochemical Sciences, in press.
  • [4] L. Wang and F. J. Sigworth, “Cryo-EM and single particles,” Physiology (Bethesda), vol. 21, pp. 13–18, 2006.
  • [5] M. van Heel, B. Gowen, R. Matadeen, E. V. Orlova, R. Finn, T. Pape, D. Cohen, H. Stark, R. Schmidt, and A. Patwardhan, “Single particle electron cryo-microscopy: Towards atomic resolution,” Q. Rev. Biophys., vol. 33, pp. 307–369, 2000.
  • [6] P. Penczek, R. Renka, and H. Schomberg, “Gridding-based direct Fourier inversion of the three-dimensional ray transform,” J. Opt. Soc. Am. A, vol. 21, pp. 499–509, 2004.
  • [7] A. Verschoor M. Radermacher, T. Wagenknecht and J. Frank, “Three-dimensional reconstruction from a single-exposure, random conical tilt series applied to the 50s ribosomal subunit of escherichia coli,” Journal of Microscopy, vol. 146, no. 2, pp. 113–136, 1987.
  • [8] A. Verschoor M. Radermacher, T. Wagenknecht and J. Frank, “Three-dimensional structure of the large ribosomal subunit from Escherichia coli,” EMBO J, vol. 6, no. 4, pp. 1107–14, 1987.
  • [9] A.B. Goncharov, “Integral geometry and three-dimensional reconstruction of randomly oriented identical particles from their electron microphotos,” Acta Applicandae Mathematicae, vol. 11, pp. 199–211, 1988.
  • [10] D.B. Salzman, “A method of general moments for orienting 2D projections of unknown 3D objects,” Comput. Vision Graph. Image Process., vol. 50, no. 2, pp. 129–156, May 1990.
  • [11] B.K. Vainstein and A.B. Goncharov, “Determining the spatial orientation of arbitrarily arranged particles given their projections,” Dokl. Acad. Sci. USSR, vol. 287, no. 5, pp. 1131–1134, 1986, English translation: Soviet Physics Doklady, Vol. 31, p.278.
  • [12] M. Van Heel, “Angular reconstitution: a posteriori assignment of projection directions for 3d reconstruction,” Ultramicroscopy, vol. 21, no. 2, pp. 111–123, 1987.
  • [13] A. Singer and Y. Shkolnisky,

    “Three-dimensional structure determination from common lines in cryo-EM by eigenvectors and semidefinite programming,”

    SIAM Journal on Imaging Sciences, vol. 4, pp. 543–572, 2011.
  • [14] Z. Zhao and A. Singer, “Rotationally invariant image representation for viewing direction classification in cryo-EM,” Journal of Structural Biology, vol. 186, no. 1, pp. 153 – 166, 2014.
  • [15] Z. Kam, “The reconstruction of structure from electron micrographs of randomly oriented particles,” Journal of Theoretical Biology, vol. 82, no. 1, pp. 15 – 39, 1980.
  • [16] F. Natterer, The Mathematics of Computerized Tomography, Classics in Applied Mathematics. SIAM: Society for Industrial and Applied Mathematics, 2001.
  • [17] J. B. Keller, “Closest unitary, orthogonal and Hermitian operators to a given operator,” Mathematics Magazine, vol. 48, no. 4, pp. 192–197, 1975.
  • [18] A. S. Bandeira, A. Singer, and D. A. Spielman, “A Cheeger inequality for the graph connection Laplacian,” SIAM Journal on Matrix Analysis and Applications, vol. 34, no. 4, pp. 1611–1630, 2013.
  • [19] M. Grant and S. Boyd, “CVX: Matlab software for disciplined convex programming, version 2.1,”, Mar. 2014.
  • [20] Z. Zhao and A. Singer, “Fourier Bessel rotational invariant eigenimages,” J. Opt. Soc. Am. A, vol. 30, no. 5, pp. 871–877, May 2013.
  • [21] E. F. Pettersen, T. D. Goddard, C. C. Huang, G. S. Couch, D. M. Greenblatt, E. C. Meng, and T. E. Ferrin, “UCSF Chimera–a visualization system for exploratory research and analysis,” Journal of Computational Chemistry, vol. 25, no. 13, pp. 1605–1612, 2004.
  • [22] P. A. Penczek, “Chapter three - resolution measures in molecular electron microscopy,” in Cryo-EM, Part B: 3-D Reconstruction, Grant J. Jensen, Ed., vol. 482 of Methods in Enzymology, pp. 73 – 100. Academic Press, 2010.