I Introduction
I-a Problem statement
We consider the problem of estimating a target image from a large
noisy measurement that contains randomly rotated copies of the target. More precisely, suppose that is a target image supported on the open unit disc, and is the rotation of by angle about the origin. Let be a discrete version of defined by(1) |
where is a fixed positive integer. We assume the measurement is of the form
(2) |
where are uniformly random angles, are translations of the target images, and is i.i.d. Gaussian noise with mean
and variance
. For simplicity, we assume that images in the measurement are separated by at least one image diameter, see (5). Recent work on a related problem suggests that this separation restriction can be alleviated [13].Our goal is to recover the image from the noisy measurement . We emphasize that with respect to this goal, the rotations and translations are nuisance parameters that do not necessarily need to be estimated. If the signal-to-noise ratio (SNR) is high as in Figure 1(a), then estimating these rotations and translations is straightforward, and the image can be recovered by aligning and averaging its different copies in the measurement. However, if the SNR is low as in Figure 1(b), which is the case of interest in this study, then this approach may be problematic as even detecting the image occurrences becomes challenging [1, 4].

We are interested in the following question: given a large enough measurement , can we recover the target image regardless of the level of noise? More precisely, if the number of target image occurrences grows at a constant rate with the size of the measurement , can we recover the image for any fixed noise variance as ?
I-B Motivation
We are motivated by challenges in single-particle reconstruction using cryo-electron microscopy (cryo-EM). The measurements in cryo-EM consist of 2-D tomographic projections of identical biomolecules of unknown 3-D orientations embedded in a large noisy image, called a micrograph. In the current analysis workflow of cryo-EM data [2, 7, 8, 17, 19, 20], the 2-D projections are first detected and extracted from the micrograph, and later rotationally and translationally aligned to reconstruct the 3-D molecular structure. This approach is problematic for small molecules, which are difficult to detect due to their lower SNR. This difficulty of detection in turn sets a lower bound on the usable molecule size in the current analysis workflow of cryo-EM data [9]. To circumvent this fundamental barrier, recent papers [3, 4] suggest to directly estimate the 3-D structure from the micrograph, without an intermediate detection stage; this approach was inspired by Kam [11] who introduced autocorrelation analysis to cryo-EM.
The estimation problem described in Section I-A is a simplified version of the cryo-EM reconstruction problem: the tomographic projection operator is omitted and we observe the same 2-D image multiple times with random in-plane rotations. This image recovery problem is an instance of the multi-target detection statistical model, in which a set of signals appear multiple times at unknown locations in a noisy measurement [3, 4, 13]. Here, we extend previous works by taking in-plane rotations into account, which forms an important step towards the analysis of the full cryo-EM problem.
I-C Main contribution
The principal contribution of this paper is to demonstrate that a target image can be recovered from a measurement of the form (2) without estimating the rotations and translations of the images in the measurement. As a consequence, limits on estimating these nuisance parameters do not impose limits on estimating the target image . In particular, we empirically demonstrate that the estimation of is feasible at any noise level given a large enough measurement. From a computational perspective, this approach is highly efficient as it requires only one pass over the measurement to calculate the autocorrelations; this is in contrast to the likelihood-based techniques that may become prohibitive as the data size increases. For a more detailed discussion, see [13].
Ii Rotationally and translationally invariant features
We begin by studying invariance in the continuous setting. The simplest invariant to both rotations and translations is
which is the mean of the image; however, clearly more information is needed to recover the image. Motivated by autocorrelation analysis we next consider the rotationally-averaged second-order autocorrelation by
By a change of variables of integration, it is clear that is only a function of , that is, it is a 1-D function, and thus does not contain enough information to encode a 2-D image. Thus, we proceed to consider the rotationally-averaged third-order autocorrelation defined by
It is straightforward to verify that is a function of three parameters: , and , where
denotes the angle between the vectors
. In this work, we show empirically that can be recovered from , which is a discrete version of defined by(3) |
We claim that the function can be approximated from the measurement by computing the third-order autocorrelation defined by
(4) |
Indeed, recall that the images are assumed to be separated in the measurement by one image diameter:
(5) |
and that the rotations in the measurement are chosen uniformly at random. Under these assumptions it is straightforward to show, see for example [3], that if , then
(6) |
as for any fixed noise level , where
and if and is zero otherwise. Crucially, equation (6) relates , a quantity that can be estimated accurately from the data, with functions of the target: and , while averaging out the effects of the nuisance variables. In practice, and can be estimated from . More specifically, can be estimated by the variance of the pixel values of in the low SNR regime, while can be estimated by the empirical mean of . As a result, can be estimated from up to a constant factor .
The analysis above motivates the following question: can the image be robustly reconstructed from ? We demonstrate empirically that the answer to this question is positive. For simplicity of exposition, we describe a method of recovering that just involves , although we note that and , which represent the discrete analogs of and , may be used to aid in the reconstruction process.
The problem of estimating from , as described in Section I-A, should not be confused with the related problem of recovering an image from a set of measurements
, each of which has exactly one shifted and rotated observation. Several moment-based techniques have been proposed to address this problem at high noise levels, see for instance
[12, 15, 16, 18], but none can be applied directly to our problem where only a single, large observation is available.Iii Image recovery from invariants
Iii-a Steerable basis
Recall that the target image is supported on the unit disc. We assume that
is bandlimited in the basis of Dirichlet Laplacian eigenfunctions on the unit disk. In polar coordinates
, these eigenfunctions are of the form(7) |
where is the -th order Bessel function of the first kind, and is the -th positive root of
. The eigenvalue associated with
is , and thus, the assumption that is bandlimited can be written as(8) |
where is the bandlimit frequency, and are the associated expansion coefficients. If we further define
then we can write
(9) |
where . An advantage of expressing a function in Dirichlet Laplacian eigenfunctions is that the basis is steerable in the sense that it diagonalizes the rotation operator: rotating by corresponds to multiplying each term of by :
(10) |
Iii-B Computing invariants
Recall that is the discretely sampled version of , which is defined by for . In the following, we consider as a function on
Since is supported on the open unit disc, it follows that
Let by
denote the discrete Fourier transform (DFT) of
. Similarly, we can consider as a function on and define its DFT by(11) |
for . By substituting (3) into (11), it is straightforward to show that
(12) |
The triple product in (12) corresponds to the Fourier transform of the third-order autocorrelation. This triple product is called the bispectrum [21] and many of its analytical and computational properties have been studied; see for instance [5, 18]. We note that the bispectrum can be generalized to more involved operations, such as 2-D rotations [14], 3-D rotations [12], and general compact groups [10].
Define as the discrete samples of the Dirichlet Laplacian eigenfunctions:
where is considered as a function supported on the unit disc. With this notation, we can express as
As a consequence, the products that appear in (3) are bandlimited by with respect to . Therefore, we can replace the integral over in (12) by a summation over angles sampled at the Nyquist rate, that is,
where . By linearity of the DFT, we have
where denotes the DFT of .
Let denote the set of all the pairs in the expansion above. We define the column vector by , and the column vector by ; the latter encodes the parameters that describe the target image. With this notation, can be expressed compactly as and it follows that
(13) |
where we write to emphasize the dependence on . This expression for is particularly convenient for computational purposes. In particular, the gradient of with respect to is easy to compute, which is important for solving the optimization problem defined in Section III-D.
Iii-C Leveraging symmetries
Recall that is a discrete version of , which only depends on the three parameters: , and . Let be the estimate of from the noisy measurement via (6), that is, the de-biased and normalized autocorrelation of the measurement (4), and denote the DFT of . Since the noise is assumed to be i.i.d. Gaussian with mean , its Fourier transform is also i.i.d. Gaussian with mean in Fourier space. Therefore, we can reduce the effect of noise by binning the entries of with similar values of , and . Numerically, the binning is done through the mapping by
where represents the set of the bins, and determine the bin sizes.
Iii-D Optimization problem
Suppose that we are given an estimate of —the de-biased and normalized third-order autocorrelation of the measurement. For a fixed model and any coefficient vector , we can compute via (13). In order to estimate the coefficient vector that is consistent with , we define the cost function
(14) |
Minimizing this cost function is a non-convex (polynomial of degree ) least squares optimization problem, and thus, a priori, there is no reason to suspect that the global minimum of this problem can be attained; however, our numerical results indicate that standard gradient-based methods result in accurate and stable recovery of the parameters . We note that computing the cost and gradient involves operations in each iteration of the optimization.
Iv Numerical results
We first consider the problem of recovering a model image from a noiseless set of —namely, from the exact rotationally-averaged third-order autocorrelation. This reflects the case where the noise level is fixed and . The model image is generated by expanding a image of a tiger (Figure 2) into a linear combination of the first Dirichlet Laplacian eigenfunctions, sorted by the eigenvalues in ascending order. We denote this model image by . Since is noiseless in this case, we accelerate the optimization by only choosing one entry from each bin in instead of summing all the entries to calculate the cost and gradient, which reduces the computational cost by .
Let be the image formed by the coefficients that were recovered by minimizing the least squares (14). Since is invariant under in-plane rotations of the model image, we only expect to reconstruct the image up to some arbitrary rotation . As a result, we define the reconstruction error by
where is the rotation of by an angle . Using the BFGS optimization algorithm, we recovered the image in Figure 2 with . The optimization took seconds parallelized over 100 CPUs in total.

In the second experiment, we study the robustness of the reconstruction to noise. Limited by the computational resources, we downsample the image of tiger to pixels, and expand the image into the first Dirichlet Laplacian eigenfunctions. From the expansion we compute the rotationally-averaged third-order autocorrelation , which is further contaminated by i.i.d. Gaussian noise with mean and variance . A total of 10 different values of are used to model the noisy counterparts estimated from measurements of different SNR. The relative error of is quantified by
The relative errors of the reconstructed images from the 10 sets of are shown in Figure 3. We observe that the images can be reliably recovered over a wide range of noise levels. The high correlation between and might indicate that the optimization landscape is benign.
V Discussion
This work serves as a proof of concept for the feasibility of estimating a target image from its rotational and translational invariants. We have demonstrated the reconstruction of a target image from its noiseless invariants, and showed that our algorithmic approach is robust to noise. In future work, we intend to extend the framework to include the recovery of the target image from the observation , to mitigate the separation condition [13], and to allow the recovery of multiple images simultaneously, in a similar fashion to [4, 6]. From a theoretical perspective, we wish to complement the empirical results of this work by proving that indeed a generic image is determined uniquely by its rotationally-averaged third-order autocorrelation.
Acknowledgment
The authors thank Iris Rukshin for stimulating discussions. NFM was supported by NSF DMS-1903015. TYL and AS were supported in part by Award Number FA9550-17-1-0291 from AFOSR, Simons Foundation Math+X Investigator Award, the Moore Foundation Data-Driven Discovery Investigator Award, and NSF BIGDATA Award IIS-1837992.
References
- [1] C. Aguerrebere, M. Delbracio, A. Bartesaghi, and G. Sapiro, “Fundamental limits in multi-image alignment,” IEEE Transactions on Signal Processing, vol. 64, no. 21, pp. 5707–5722, 2016.
- [2] T. Bendory, A. Bartesaghi, and A. Singer, “Single-particle cryo-electron microscopy: Mathematical theory, computational challenges, and opportunities”, arXiv preprint arXiv:1908.00574, 2019.
- [3] T. Bendory, N. Boumal, W. Leeb, E. Levin, and A. Singer, “Toward single particle reconstruction without particle picking: Breaking the detection limit,” arXiv preprint arXiv:1810.00226, 2018.
- [4] T. Bendory, N. Boumal, W. Leeb, E. Levin, and A. Singer, “Multi-target detection with application to cryo-electron microscopy,” Inverse Problems, 2019.
- [5] T. Bendory, N. Boumal, C. Ma, Z. Zhao, and A. Singer, “Bispectrum inversion with application to multireference alignment,” IEEE Transactions on Signal Processing, vol. 66, no. 4, pp. 1037–1050, 2017.
- [6] N. Boumal, T. Bendory, R. R. Lederman, and A. Singer, “Heterogeneous multireference alignment: A single pass approach,” in 2018 52nd Annual Conference on Information Sciences and Systems (CISS). IEEE, 2018, pp. 1–6.
- [7] J. Frank, Three-dimensional electron microscopy of macromolecular assemblies: visualization of biological molecules in their native state. Oxford University Press, 2006.
- [8] T. Grant, A. Rohou, and N. Grigorieff, “cisTEM, user-friendly software for single-particle image processing,” Elife, vol. 7, p. e35383, 2018.
- [9] R. Henderson, “The potential and limitations of neutrons, electrons and X-rays for atomic resolution microscopy of unstained biological molecules,” Quarterly reviews of biophysics, vol. 28, no. 2, pp. 171–193, 1995.
- [10] R. Kakarala, “The bispectrum as a source of phase-sensitive invariants for Fourier descriptors: a group-theoretic approach,” Journal of Mathematical Imaging and Vision, vol. 44, no. 3, pp. 341–353, 2012.
- [11] Z. Kam, “The reconstruction of structure from electron micrographs of randomly oriented particles,” Journal of Theoretical Biology, vol. 82, no. 1, pp. 15–39, 1980.
- [12] R. Kondor, “A novel set of rotationally and translationally invariant features for images based on the non-commutative bispectrum,” arXiv preprint arXiv:0701127, 2007.
- [13] T.-Y. Lan, T. Bendory, N. Boumal, and A. Singer, “Multi-target detection with an arbitrary spacing distribution,” arXiv preprint arXiv:1905.03176, 2019.
- [14] C. Ma, T. Bendory, N. Boumal, F. Sigworth, and A. Singer, “Heterogeneous multireference alignment for images with application to 2-D classification in single particle reconstruction,” To appear in IEEE Transactions on Image Processing, 2019.
- [15] S. Mallat, “Group invariant scattering”, Communications on Pure and Applied Mathematics vol. 65, no. 10, pp. 1331–1398, 2012.
- [16] O. Özyeşil, N. Sharon, and A. Singer, “Synchronization over cartan motion groups via contraction”, SIAM Journal on Applied Algebra and Geometry, vol. 2, no. 2, pp. 207–241, 2018.
- [17] A. Punjani, J. L. Rubinstein, D. J. Fleet, and M. A. Brubaker, “cryoSPARC: algorithms for rapid unsupervised cryo-EM structure determination,” Nature methods, vol. 14, no. 3, p. 290, 2017.
- [18] B. M. Sadler and G. B. Giannakis, “Shift-and rotation-invariant object reconstruction using the bispectrum,” JOSA A, vol. 9, no. 1, pp. 57–69, 1992.
- [19] S. H. Scheres, “RELION: implementation of a Bayesian approach to cryo-EM structure determination,” Journal of structural biology, vol. 180, no. 3, pp. 519–530, 2012.
- [20] G. Tang, L. Peng, P. R. Baldwin, D. S. Mann, W. Jiang, I. Rees, and S. J. Ludtke, “EMAN2: an extensible image processing suite for electron microscopy,” Journal of structural biology, vol. 157, no. 1, pp. 38–46, 2007.
- [21] J. Tukey, “The spectral representation and transformation properties of the higher moments of stationary time series,” Reprinted in The Collected Works of John W. Tukey, vol. 1, pp. 165–184, 1953.