. Single molecule XFEL imaging collects two-dimensional diffraction patterns of single particles at random orientations. The images are very noisy due to the low photon counts and the detector count-noise follows an approximately Poisson distribution. Since we only capture one diffraction pattern per particle and the particle orientations are unknown, it is challenging to reconstruct the 3-D structure at low signal-to-noise ratio (SNR). One approach is to use expectation-maximization (EM)[6, 7]
, which has no guarantee of the global optimality. Alternatively, assuming that the orientations are uniformly distributed over the special orthogonal group, Kam’s correlation analysis [8, 9, 10, 11, 12, 13, 14, 15] bypasses maximum likelihood estimation. Instead, it requires a robust estimation of the covariance matrix of the noiseless 2-D images. This serves as our main motivation for developing efficient and accurate covariance estimation and denoising methods for Poisson data.
Principal Component Analysis (PCA) is widely used for dimension reduction and denoising of large datasets [16, 17]. However, it is most naturally designed for Gaussian data, and there is no commonly agreed upon extension to non-Gaussian settings such as exponential families [16, Sec. 14.4]. For denoising with non-Gaussian noise, popular approaches reduce it to the Gaussian case by a wavelet transform such as a Haar transform ; by adaptive wavelet shrinkage [18, 19]
; or by approximate variance stabilization such as the Anscombe transform. The latter is known to work well for Poisson signals with large parameters, due to approximate normality. However, the normal approximation breaks down for Poisson distributions with a small parameter, such as photon-limited XFEL [21, Sec. 6.6]
. Other methods are based on singular value thresholding (SVT)[22, 23] and Bayesian techniques . In addition, many existing methods are computationally intractable for large datasets. Recently,  introduced exponential family PCA (
PCA), which extends PCA to a wider class of distributions. It involves the eigendecomposition of a new covariance matrix estimator, constructed in a deterministic and non-iterative way using moment calculations and shrinkage.
PCA was shown to be more accurate than PCA and its alternatives for exponential families. Its computational cost is similar to that of PCA, it has substantial theoretical justification building on random matrix theory, and its output is interpretable.
In XFEL imaging, the orientations are uniformly distributed over , and it is therefore equally likely to observe any planar rotation of the given diffraction pattern. Therefore, it makes sense to include all possible in-plane rotations of the images when performing PCA. To this end, we incorporate steerability in PCA, by adapting the steerable PCA algorithm . We take into account the action of the group on diffraction patterns by in-plane rotation and reflection. The resulting principal components are invariant to any transformation of the input images. In the new algorithm, we first estimate the rotational invariant sample mean of the image dataset and use it to “prewhiten" the images (we refer to this step as prewhitening, since the mechanism is similar to that of whitening colored noise). We then compute the truncated Fourier-Bessel expansion coefficients of the prewhitened images. Since the truncated Fourier-Bessel transform is almost unitary , we can compute the rotational invariant sample covariance matrix in terms of the expansion coefficients. The resulting covariance matrix has block diagonal structure. For each covariance sub-block, we apply shrinkage, recoloring, and scaling as in  to estimate the underlying clean covariance matrix.
We illustrate the improvement in estimating the covariance matrix through image denoising. Specifically, we introduce a Wiener-type filtering of the truncated Fourier-Bessel expansion coefficients. Rotation invariance enhances the effectiveness of PCA in covariance estimation and thus achieves better denoising. In addition, the denoised expansion coefficients are useful in building rotationally invariant image features (i.e. bispectrum-like features ). We verify the benefits of steerable PCA through numerical experiments with the simulated XFEL diffraction patterns. Similar to the case with normal PCA, the computational complexity of the steerable PCA is lower than PCA.
The paper is organized as follows: Section II contains the observation model (II-A) and the computation of the steerable exponential principal components (II-B). Various numerical examples comparing the steerable PCA and PCA are presented in Section III. An implementation of steerable PCA in MATLAB is publicly available at github.com/zhizhenz/sepca/.
Ii-a The observation model
We adopt the same observation model as in . The observed noisy images (i.e.,
is drawn from a probability distributionwith mean and covariance matrix . Conditional on , the coordinates of are drawn independently from a canonical one-parameter exponential family , with density
The mean and variance of the random variableis and . Therefore, the mean of conditional on is
so the noisy data vector can be expressed as , with and the marginal mean of is . Thus, one can think of as a noisy realization of the clean vector . However, the latent vector is also random and varies from sample to sample. In the XFEL application, are the unobserved noiseless images, and their randomness stems for the random (and unobserved) orientation of the molecule. We may write , where the coordinates of are conditionally independent and standardized given . The covariance of is given by the law of total covariance:
For the special case of Poisson observations , where is random, we can write . The natural parameter is the vector with and . Therefore, we have , and . In words, while the mean of the noisy images agrees with the mean of the clean images, their covariance matrices differ by a diagonal matrix that depends solely on the mean image.
Ii-B Fourier-Bessel Expansion
Under the observation model in Sec II-A, we develop a method that estimates the rotational invariant efficiently and accurately from the image dataset . We assume that a digital image is composed of discrete samples from a continuous function with bandlimit
. The Fourier transform of, denoted , can be expanded in any orthogonal basis for the class of squared-integrable functions in a disk of radius . For the purpose of steerable PCA, it is beneficial to choose a basis whose elements are products of radial functions with Fourier angular modes, such as the Fourier-Bessel functions. The scaled Fourier-Bessel functions are given by
where are polar coordinates in the Fourier domain (i.e., , , , and ; is the normalization factor; is the Bessel function of the first kind of integer order ; and is the th root of the Bessel function . We also assume that the functions of interest are concentrated in a disk of radius in real domain. In order to avoid aliasing, we only use Fourier-Bessel functions that satisfy the following criterion [25, 3]
For each angular frequency , we denote by the number of components satisfying Eq. (5). The total number of components is , where is the maximal possible value of satisfying Eq. (5). We also denote for and .
The inverse Fourier transform (IFT) of is
Therefore, we can approximate using the truncated expansion
We evaluate the Fourier-Bessel expansion coefficients numerically as in  using a quadrature rule that consists of equally spaced points in the angular direction , with and a Gaussian quadrature rule in the radial direction for . To evaluate , we need to sample the discrete Fourier transform of the image , denoted , at the quadrature nodes,
which can be evaluated efficiently using the the nonuniform discrete Fourier transform , and we get
where is the 1D FFT of on each concentric circle of radius . For real-valued images, it is sufficient to evaluate the coefficients with , since . In addition, the coefficients have the following properties: under counter-clockwise rotation by an angle , changes to ; and under reflection, changes to .
Suppose are discretized input images sampled from . Here, the observational vectors for PCA are simply . In PCA, the first step is to prewhiten the data using the sample mean. However, the sample mean is not necessarily rotationally invariant. The rotationally invariant sample mean can be evaluated from , the expansion coefficients of ,
The rotationally invariant sample mean is circularly symmetric. We denote by a vector that contains all the coefficients ordered by the radial index . Although the input images are non-negative, the finite truncation may result in small negative values in the mean estimation, so we threshold any negative entries to zero.
We prewhiten the images by the estimated mean image to create new images as , when , and is 0 otherwise. Combining Eqs. (5) and (9), we compute the truncated Fourier-Bessel expansion coefficients of . Let us denote by the matrix of expansion coefficients with angular frequency , obtained by putting into a matrix, where the columns are indexed by the image number and the rows are ordered by the radial index . The coefficient matrix is of size .
Since the truncated Fourier-Bessel transform is almost unitary , the rotationally invariant covariance kernel built from the original image data with all possible in-plane rotations and reflections can be computed in terms of the IFT of the Fourier-Bessel basis and the associated expansion coefficients. Subtracting the sample mean is equivalent to subtracting from the coefficients , while keeping other coefficients unchanged. Therefore, we first update the zero angular frequency coefficients by . In terms of the expansion coefficients, the rotational invariant homogenized sample covariance matrix is , with
be the eigenvalues and eigenvectors of, respectively, that is
Ii-D Eigenvalue Shrinkage
is above the the Baik-Ben Arous-Péché (BBP) phase transition, then the top sample spike pops out from the Marčenko-Pastur distribution of the “noise” eigenvalues. The top eigenvalue will converge to the value given bythe spike forward map:
The underlying clean population covariance eigenvalues can be estimated by solving the quadratic equation in Eq. (13). Shrinking the eigenvalues helps to denoise the block sample covariance matrices .
The shrinkers set all noise eigenvalues to zero for within the support of the Marčenko-Pastur distribution and reduce other eigenvalues according to Eq. (14). Then the denoised covariance matrices are
We denote by an estimate of using the estimated clean covariance eigenvalues in Eq. (14) and .
Homogenization changes the direction of the clean eigenvectors. Therefore, after eigenvalue shrinkage, we recolor (heterogenize) the covariance matrix by conjugating the recoloring matrix with : . The recoloring matrix is evaluated with numerical integration,
which has a block diagonal structure and is decoupled for each angular frequency, .
Therefore, the recoloring step is also decoupled for each angular frequency sub-block. The heterogenized covariance estimators are
The eigendecomposition of gives . The empirical eigenvalues are which is a biased estimate of the true eigenvalue . In [1, Sec. 4.2.3], a scaling rule was proposed to correct the bias. We extend it to the steerable case and scale each eigenvalue of by a parameter ,
where the parameter . The scaled covariance matrices are
The rotational invariant covariance kernel is well approximated by , where contains IFT of all with angular frequency . The computational complexity of steerable PCA is , same as steerable PCA, and it is lower than the complexity of PCA which is .
As an application of steerable PCA, we develop a method to denoise the photon-limited images. Since the Fourier-Bessel expansion coefficients are computed from the prewhitened images, we first recolor the coefficients by multiplying with and then we apply Wiener-type filtering to denoise the coefficients. For the steerable basis expansion coefficients with angular frequency
For , we need to take into account the rotational invariant mean expansion coefficients,
The denoised image sampled on the Cartesian grid in real domain are computed from the filtered expansion coefficients ,
where and .
Iii Numerical Results
We apply steerable PCA and PCA to a simulated XFEL dataset and compare the results for covariance estimation and denoising. The algorithms are implemented in MATLAB on a machine with 60 cores, running at 2.3 GHz, with total RAM of 1.5TB. For steerable PCA and steerable PCA, but only 8 cores were used in our experiments. We simulate noiseless XFEL diffraction intensity maps of a lysozyme (Protein Data Bank 1AKI) with Condor . The average pixel intensity is rescaled to be 0.01 for image size pixels such that shot noise dominates . To sample an arbitrary number of noisy diffraction patterns, we sample an intensity map at random, in-plane rotate the sample by a random angle following a uniform distribution over , and then sample the photon count of each detector pixel from a Poisson distribution whose mean is the pixel intensity. Figures (a)a and (b)b illustrate the clean intensity maps and the resulting noisy diffraction patterns.
Iii-a Covariance estimation and principal components
For experiments using PCA, we use a permutation bootstrap-based method to estimate the rank of the covariance matrix, following e.g. . By randomly permuting each column of the mean-subtracted data matrix, we completely destroy structural information including linear structure, while the noise statistics remain the same (see [35, 36] for an analysis). Singular values of the randomly permuted matrices reveal what should be the largest covariance matrix eigenvalues that correspond to noise, up to a user-selected confidence level
. This can replace the other rank estimation methods that assume Gaussian distribution when the noise model is non-Gaussian, such as in our case. Empirically, we observe thatgives the best performance in covariance estimation. For steerable PCA, we estimate the number of components using the right edge of the Marčenko-Pastur distribution for homogenized covariance matrices and include only the components whose scaling factor are above zero.
Steerable PCA is able to recover more signal principal components from noisy images than PCA, steerable PCA, and PCA (see Fig. (a)a). When the sample size , the mean number of estimated components is and for PCA and steerable PCA respectively. For , the estimated number of components is and for PCA and steerable PCA respectively. Fig. (b)b shows that steerable PCA is more efficient than PCA and PCA. Because steerable PCA contains extra steps such as prewhitening, recoloring, and scaling, its runtime is slightly larger than steerable PCA. When , steerable PCA is 8 times faster than PCA.
Fig. 3 shows the top 12 eigenimages for clean XFEL diffraction patterns (Fig. (e)e), and noisy diffraction patterns with mean photon count 0.01 (Figs. (c)c–(i)i) using PCA, steerable PCA, PCA, and steerable PCA. The true eigenimages in Fig. (e)e are computed from 70000 clean diffraction patterns whose orientations are uniformly distributed over . Figs. (a)a–(d)d are computed from noisy images. Since the number of samples is much smaller than the size of the image and the noise type is non-Gaussian, PCA can only recover the first two or three components. PCA improves the estimation and is able to extract the top 7 eigenimages. Moreover, steerable PCA and steerable PCA achieve much better estimation of the underlying true eigenimages for a given sample size. Steerable PCA achieves the best performance in estimating both the eigenvalues and eigenimages.
Furthermore, we compare the operator norms and Frobenius norms of the difference between the covariance estimates and the true covariance matrix. Fig. 4 shows that steerable PCA significantly improves the covariance estimation, especially when the sample size is small.
We compare the denoising effects of PCA and steerable PCA by the mean squared error, MSE. We perform “PCA denoising” using empirical best linear predictor (EBLP) , which had been shown to outperform “PCA denoising”, i.e., orthogonal projection onto sample or PCA/heterogenized eigenimages, as well as the exponential family PCA method proposed by . Note that in our implementation, to avoid inverting a singular matrix (when some coordinates have 0 sample mean and sample variance), we compute with ‘regularization’ , where , , and is the by identity matrix. The number of components are estimated by the permutation rank estimation as described in the previous section.
Steerable PCA is able to recover the images with lower MSEs compared to PCA, steerable PCA, and PCA (see Fig. 6), especially when the sample size is small. Fig. 5 shows some examples of denoised images for sample size and . For robustness, we repeated the numerical experiment for another dataset simulated from the small protein chignolin (Protein Data Bank entry 1UAO) and obtained qualitatively similar results.
We presented steerable PCA, a method for principal component analysis of a set of low-photon count images and their uniform in-plane rotations. This work has been mostly motivated by its application to XFEL. The computational complexity of the new algorithm is , whereas that of PCA is . Incorporating rotational invariance allows more robust estimation of the true eigenvalues and eigenvectors. Our numerical experiments showed that steerable PCA is able to recover more signal components than PCA and achieves better covariance estimation and denoising results. Finally, we remark that the Fourier-Bessel basis can be replaced with other suitable bases, for example, the 2-D prolate spheroidal wave functions (PSWF) on a disk [38, 39].
The authors would like to thank Edgar Dobriban and Boris Landa for discussions. ZZ is partially supported by National Center for Supercomputing Applications Faculty Fellowship and University of Illinois at Urbana-Champaign College of Engineering Strategic Research Initiative. AS is partially supported by Award Number R01GM090200 from the NIGMS, FA9550-17-1-0291 from AFOSR, Simons Investigator Award, the Moore Foundation Data-Driven Discovery Investigator Award, and NSF BIGDATA Award IIS-1837992.
-  L. T. Liu, E. Dobriban, and A. Singer, “PCA: High dimensional exponential family PCA,” Annals of Applied Statistics, vol. 12, no. 4, pp. 2121–2150, 2018.
-  Z. Zhao and A. Singer, “Rotationally invariant image representation for viewing direction classification in cryo-em,” Journal of Structural Biology, vol. 186, no. 1, pp. 153–166, 2014.
-  Z. Zhao, Y. Shkolnisky, and A. Singer, “Fast steerable principal component analysis,” IEEE Transactions on Computational Imaging, vol. 2, no. 1, pp. 1–12, 2016.
-  V. Favre-Nicolin, J. Baruchel, H. Renevier, J. Eymery, and A. Borbély, “XTOP: high-resolution X-ray diffraction and imaging,” Journal of Applied Crystallography, vol. 48, no. 3, pp. 620–620, 2015.
-  F. R. N. C. Maia and J. Hajdu, “The trickle before the torrent—diffraction data from X-ray lasers,” Scientific Data, vol. 3, 2016.
-  S. H. W. Scheres, H. Gao, M. Valle, G. T. Herman, P. P. B. Eggermont, J. Frank, and J.-M. Carazo, “Disentangling conformational states of macromolecules in 3D-EM through likelihood optimization,” Nature Methods, vol. 4, no. 1, pp. 27–29, 2007.
-  D. N.-T. Loh and V. Elser, “Reconstruction algorithm for single-particle diffraction imaging experiments,” Phys. Rev. E, vol. 80, p. 026705, Aug 2009.
-  Z. Kam, “Determination of macromolecular structure in solution by spatial correlation of scattering fluctuations,” Macromolecules, vol. 10, no. 5, pp. 927–934, 1977.
-  D. K. Saldin, V. L. Shneerson, R. Fung, and A. Ourmazd, “Structure of isolated biomolecules obtained from ultrashort X-ray pulses: exploiting the symmetry of random orientations,” Journal of Physics: Condensed Matter, vol. 21, no. 13, 2009.
-  D. Starodub, A. Aquila, S. Bajt, M. Barthelmess, A. Barty, C. Bostedt, J. D. Bozek, N. Coppola, R. B. Doak, S. W. Epp et al., “Single-particle structure determination by correlations of snapshot X-ray diffraction patterns,” Nature communications, vol. 3, p. 1276, 2012.
-  K. Pande, P. Schwander, M. Schmidt, and D. K. Saldin, “Deducing fast electron density changes in randomly orientated uncrystallized biomolecules in a pump–probe experiment,” Phil. Trans. R. Soc. B, vol. 369, no. 1647, p. 20130332, 2014.
-  R. P. Kurta, J. J. Donatelli, C. H. Yoon, P. Berntsen, J. Bielecki, B. J. Daurer, H. DeMirci, P. Fromme, M. F. Hantke et al., “Correlations in scattered X-ray laser pulses reveal nanoscale structural features of viruses,” Physical Review Letters, vol. 119, no. 15, p. 158102, 2017.
-  J. J. Donatelli, J. A. Sethian, and P. H. Zwart, “Reconstruction from limited single-particle diffraction data via simultaneous determination of state, orientation, intensity, and phase,” Proceedings of the National Academy of Sciences, vol. 114, no. 28, pp. 7222–7227, 2017.
-  B. von Ardenne, M. Mechelke, and H. Grubmüller, “Structure determination from single molecule X-ray scattering with three photons per image,” Nat Commun., vol. 9, no. 1, p. 2375, 2018.
-  K. Pande, J. J. Donatelli, E. Malmerberg, L. Foucar, C. Bostedt, I. Schlichting, and P. H. Zwart, “Ab initio structure determination from experimental fluctuation x-ray scattering data,” Proceedings of the National Academy of Sciences, vol. 115, no. 46, pp. 11 772–11 777, 2018. [Online]. Available: http://www.pnas.org/content/115/46/11772
-  I. Jolliffe, Principal Component Analysis. Wiley Online Library, 2002.
-  T. W. Anderson, An Introduction to Multivariate Statistical Analysis. Wiley New York, 2003.
-  R. D. Nowak and R. G. Baraniuk, “Wavelet-domain filtering for photon imaging systems,” IEEE Transactions on Image Processing, vol. 8, no. 5, pp. 666–678, 1999.
-  F. Luisier, T. Blu, and M. Unser, “A new sure approach to image denoising: Interscale orthonormal wavelet thresholding,” IEEE Transactions on image processing, vol. 16, no. 3, pp. 593–606, 2007.
-  F. J. Anscombe, “The transformation of poisson, binomial and negative-binomial data,” Biometrika, vol. 35, no. 3-4, p. 246, 1948.
-  J.-L. Starck, F. Murtagh, and J. M. Fadili, Sparse image and signal processing: wavelets, curvelets, morphological diversity. Cambridge university press, 2010.
-  T. Furnival, R. K. Leary, and P. A. Midgley, “Denoising time-resolved microscopy image sequences with singular value thresholding,” Ultramicroscopy, 2016.
-  Y. Cao and Y. Xie, “Low-rank matrix recovery in Poisson noise,” in Signal and Information Processing (GlobalSIP), 2014 IEEE Global Conference on. IEEE, 2014, pp. 384–388.
-  M. Sonnleitner, J. Jeffers, and S. M. Barnett, “Local retrodiction models for photon-noise-limited images,” in SPIE Photonics Europe. International Society for Optics and Photonics, 2016, pp. 98 960V–98 960V.
-  A. Klug and R. A. Crowther, “Three-dimensional image reconstruction from the viewpoint of information theory,” Nature, vol. 238, pp. 435–440, 1972.
-  L. Greengard and J. Lee, “Accelerating the Nonuniform Fast Fourier Transform,” SIAM Review, vol. 46, no. 3, pp. 443–454, 2004.
-  J. Baik, G. Ben Arous, and S. Péché, “Phase transition of the largest eigenvalue for nonnull complex sample covariance matrices,” Annals of Probability, vol. 33, no. 5, pp. 1643–1697, 2005.
J. Baik and J. W. Silverstein, “Eigenvalues of large sample covariance
matrices of spiked population models,”
Journal of Multivariate Analysis, vol. 97, no. 6, pp. 1382–1408, 2006.
-  D. Paul, “Asymptotics of sample eigenstructure for a large dimensional spiked covariance model,” Statistica Sinica, vol. 17, no. 4, pp. 1617–1642, 2007.
-  F. Benaych-Georges and R. R. Nadakuditi, “The eigenvalues and eigenvectors of finite, low rank perturbations of large random matrices,” Advances in Mathematics, vol. 227, no. 1, pp. 494–521, 2011.
-  D. L. Donoho, M. Gavish, and I. M. Johnstone, “Optimal shrinkage of eigenvalues in the Spiked Covariance Model,” Annals of Statistics, vol. 46, no. 4, p. 1742, 2018.
-  M. F. Hantke, T. Ekeberg, and F. R. N. C. Maia, “Condor: A simulation tool for flash X-ray imaging,” Journal of Applied Crystallography, vol. 49, no. 4, p. 1356–1362, 2016.
-  P. Schwander, D. Giannakis, C. H. Yoon, and A. Ourmazd, “The symmetries of image formation by scattering. II. Applications,” Opt. Express, vol. 20, no. 12, pp. 12 827–12 849, Jun 2012.
-  J. Landgrebe, W. Wurst, and G. Welzl, “Permutation-validated principal components analysis of microarray data,” Genome Biology, vol. 3, no. 4, 2002.
-  E. Dobriban, “Permutation methods for factor analysis and PCA,” arXiv preprint arXiv:1710.00479, Oct. 2017.
-  E. Dobriban and A. B. Owen, “Deterministic parallel analysis: an improved method for selecting factors and principal components,” Journal of the Royal Statistical Society: Series B (Statistical Methodology), 2018.
-  M. Collins, S. Dasgupta, and R. E. Schapire, “A generalization of principal component analysis to the exponential family,” Advances in Neural Information Processing Systems (NIPS), 2001.
-  D. Slepian, “Prolate spheroidal wave functions, Fourier analysis, and uncertainty–IV: extensions to many dimensions, generalized prolate spheroidal wave functions,” Bell System Technical Journal, vol. 43, pp. 3009–3057, 1964.
-  B. Landa and Y. Shkolnisky, “Steerable principal components for space-frequency localized images,” SIAM Journal on Imaging Sciences, vol. 10, no. 2, pp. 508–534, 2017.