DeepAI

# Multitaper estimation on arbitrary domains

Multitaper estimators have enjoyed significant success in providing spectral density estimates from finite samples. Unfortunately, highly accurate methods exist only for certain symmetric acquisition domains, such as rectangles or disks. Methods proposed for arbitrary domains suffer from the instability inherent in calculating the necessary Slepian tapers to high precision. Finally, no performance bounds are currently available to measure the mean squared error of the estimate. This is inadequate for applications such as cryo-electron microscopy, where accurate noise models must be estimated from irregular domains with small sample sizes. To remedy this, we prove a set of performance bounds for the multitaper estimator on arbitrary domains. We also introduce randomized tapers, a proxy for the Slepian tapers yielding the same spectral estimate but which can be calculated to arbitrary precision. The method is demonstrated on synthetic and experimental datasets from cryo-electron microscopy, where it reduces mean squared error by a factor of two or more compared to traditional methods.

• 14 publications
• 5 publications
04/16/2018

### Optimal mean squared error bandwidth for spectral variance estimators in MCMC simulations

This paper proposes optimal mean squared error bandwidths for a family o...
07/01/2021

### How many samples are needed to reliably approximate the best linear estimator for a linear inverse problem?

The linear minimum mean squared error (LMMSE) estimator is the best line...
12/07/2011

### A recursive procedure for density estimation on the binary hypercube

This paper describes a recursive estimation procedure for multivariate b...
02/11/2022

### POT-flavored estimator of Pickands dependence function

This work proposes an estimator with both Peak-Over-Threshold and Block-...
10/21/2020

### Minimum Mean-Squared-Error Autocorrelation Processing in Coprime Arrays

Coprime arrays enable Direction-of-Arrival (DoA) estimation of an increa...
03/03/2022

### Online Balanced Experimental Design

e consider the experimental design problem in an online environment, an ...
04/11/2021

### Hyperspectral Pigment Analysis of Cultural Heritage Artifacts Using the Opaque Form of Kubelka-Munk Theory

Kubelka-Munk (K-M) theory has been successfully used to estimate pigment...

## 1 Introduction

Estimating the frequency content of a stochastic process is crucial to many data processing tasks. For example, in several inverse problems, such as denoising, access to a good noise model is necessary for accurate reconstruction. With other tasks, such as system identification, this frequency structure is itself the object of interest.

The frequency content of a stationary process over is characterized by its spectral density , defined as the Fourier series of its autocovariance function [30]. Spectral estimation is the task of recovering given one or more realizations of over a finite subset of . The challenge in the estimation problem is two-fold: (i) stochastic fluctuations in

introduce variance into estimates of

, and (ii) spatial constraints restrict us to using only samples from the domain . When many realizations are available, the first point may be addressed through ensemble averaging. For several applications, however, such as geosciences, cosmology, and electron microscope imaging, only a single realization is available, requiring accurate single-shot estimators.

One approach to reducing error in single-shot spectral estimation is to divide the domain into disjoint subsets, compute spectral density estimates for each, and average the result. In one dimension, with for some , this is typically done by partitioning into blocks of samples each, computing a spectral estimate, such as a periodogram, for each subset, and averaging. The parameter controls the trade-off between the two challenges raised previously: while a higher value of mitigates the stochastic fluctuations (reducing variance), each periodogram is computed from only samples, reducing their resolution (increasing bias) [5]. One problem with this approach, however, is that it introduces artifacts due to the boundaries imposed by the partitioning. The multitaper estimator, introduced by Thomson [42], refines this method by instead computing periodograms over all of , but with the samples first multiplied by a set of sequences called tapers. Optimal trade-off between bias and variance is obtained for specific sequences known as discrete prolate spheroidal sequences, or Slepian tapers [40].

Originally defined for , multitaper estimators naturally generalize to higher dimensions [38]

. The Slepian tapers, however, are defined as solutions to an ill-posed eigenvalue problem. Special geometries, including rectangular grids

[20, Chapter 2] and domains involving other symmetries [15, 20, 37, 36], may be analyzed by means of commuting differential operators, yielding well-posed problems that allow for accurate calculation of the tapers. There are other stable algorithms for calculating Slepian tapers, but these are defined only for specific domains [24, 25, 21, 23] or for certain modified tapers [32].

For general domains , however, all available methods inherit the instability of the underlying eigenvalue problem [37]. As a result, high-precision multitaper estimators are not applicable to domains such as the complement of the disk (see Figure 0(c)). This geometry arises in cryo-electron microscopy (cryo-EM), where noisy projection images contain signal on a central disk and clean samples of the noise process are found only outside of that disk. In geosciences, a similar problem occurs when a physical quantity needs to be sampled on a subregion of the earth [36, 37, 31].

One solution to this problem is to partition this irregular domain into smaller rectangular regions and apply the tensor Slepian multitaper estimator to those. This strategy, however, lacks the simplicity Thomson’s multitaper estimator, and may typically only be implemented suboptimally.

In this work, we show that the multitaper estimator only depends on the linear span of the tapers used. Consequently, we may calculate it with any set of tapers that span the same subspace as the Slepian tapers. The ill-posed problem of computing the leading eigenvectors is therefore replaced with computing a basis for that eigenspace. This problem is well-posed, because of the large spectral gap between the first eigenvalues and the rest of the spectrum, which we validate.

Furthermore, we provide a mean squared error bound for the multitaper estimator on arbitrary domains. This generalizes previous results of Abreu and Romero [3] on performance bounds for the one-dimensional case.

The proposed method is evaluated on synthetic data, where it is shown to perform comparably to the Slepian multitaper method for rectangular domains. However, we also show that the method performs equally well on the complement of a disk. We also evaluate the method on cryo-EM images, both synthesized and from experimental datasets, where we also achieve good performance compared to previous approaches.

Section 2 introduces the spectral estimation problem for general domains. The multitaper spectral estimator is introduced in Section 3. Section 4 provides a bound for the mean squared error of the multitaper estimator for arbitrary domains. We show that the estimator only depends on the linear span of the tapers in Section 5 and use this to provide an implementation using randomized tapers. Finally, Section 6 illustrates the performance of the proposed estimator using numerical experiments.

Notation

For a vector

, we let be its Euclidean norm. The -norms are denoted . For two non-negative functions , we write is there exists a constant such that . We also write , if and . For a function , we denote its -norm by .

We say that a function is 1-periodic if for all . The convolution of two periodic functions is defined as

 f∗g(x)=∫[−1/2,1/2]2f(y)g(x−y)dy. (1)

The -norm of a measurable, 1-periodic function is , and its -norm is . For clarity, we will sometimes write .

The -norm of a twice continuously differentiable function is

 ∥f∥C2:=max{∥f∥∞,∥∂xjf∥∞,∥∂xj∂xj′f∥∞:j,j′=1,…,d}.

## 2 Spectral estimation on irregular domains

Let us consider a real-valued Gaussian, zero-mean, stationary process defined on an infinite grid . Our goal is to estimate the covariance matrix

 Cov[q,q′]=E{X[q]X[q′]}q,q′∈Zd.

Since is stationary, only depends on the difference . We therefore rewrite the matrix in terms of the autocovariance function , giving

 Cov[q,q′]=r[q−q′]q,q′∈Zd.

The covariance information is equivalently encoded in the spectral density

 S(ξ)=∑q∈Zdr[q]e2πiqξ,ξ∈[−1/2,1/2]d.

Given one or several realizations of on a subset of cardinality , we wish to estimate . This is known as the spectral estimation problem [30].

An instance of this problem arises in imaging. Suppose that an image is defined on a grid

 QN={0,…,N−1}2

of pixels, and represents additive noise at the pixel . An accurate estimate of the noise distribution is then needed for denoising and other tasks. For example, in single-particle cryo-EM, molecules are imaged by freezing them in a thin layer of ice and recording their tomographic projections using an electron microscope [13]. To reduce specimen damage, electron dose is kept low, resulting in exceptionally noisy images, as shown in Figures 0(a) and 0(b). The projection of the molecule is expected to lie in the center of the image, so pure noise samples are only found outside of a central disk. A reasonable domain for noise estimation is therefore the complement of that disk

 Ω:={q∈QN:|q−(N/2,N/2)|>R}, (2)

as illustrated in Figure 0(c).

Another instance is found in geosciences, where is a physical quantity on an approximately flat portion of the earth, and measurements are only available on a subregion , corresponding, for example, to a continent [37]. A more sophisticated model replaces the Cartesian grid for a grid on the sphere [36, 31].

For both of these applications, we only have access to a single realization of the process whose spectral density we wish to estimate. In cryo-EM, this is due to changing experimental conditions between projection images, such as non-stationary optical parameters or variation in ice thickness, while in geosciences, only one realization (i.e., one planet) exists. We therefore require single-shot estimators which provide accurate estimates from a single realization of . The aim of the present work is to study the performance one such method, the multitaper estimator, and provide a high-precision implementation.

## 3 Multitaper estimators

The inherent variability of induces error into any estimate of its spectral density

. With access to only a single realization, ensemble averaging cannot be used to reduce the error. Instead, we must impose some constraint on the estimate. This often involves post-processing of the spectral density estimate by smoothing or fitting parametric models

[9, 30].

The multitaper spectral estimator provides another solution to this challenge [42]. Initially introduced for one-dimensional signals, we present it here for signals of arbitrary dimension. Let be a function of unit -norm supported on , that is, satisfying

 m[q]=0,if q∉Ω. (3)

We now define the tapered periodogram of with taper as

 ˆSm(ξ):=∣∣ ∣∣∑q∈Ωm[q]X[q]e2πiqξ∣∣ ∣∣2,ξ∈[−1/2,1/2]d. (4)

Given an orthonormal family of tapers supported on , we define the corresponding multitaper estimator as the average

 ˆSmt(ξ):=1KK∑k=1ˆSmk(ξ)ξ∈[−1/2,1/2]d. (5)

The accuracy of in estimating depends on the choice of tapers .

In the traditional multitaper estimator, the tapers are defined using the following spectral concentration problem:

 maximize ∫[−W/2,W/2]d∣∣ ∣∣∑q∈Ωm[q]e2πiqξ∣∣ ∣∣2dξ, (6) subject to:∑q∈Ω|m[q]|2=1, and supp(m)⊆Ω,

where

 W=(K/nΩ)1/d, (7)

and is the number of elements in . The parameter is known as the bandwidth of the estimator. It is chosen so that

 K=⌈nΩ×Wd⌉, (8)

and controls the amount of smoothness imposed on the estimate .

The Slepian tapers are defined as the set of mutually orthogonal solutions to (6). Specifically, the first taper is the solution to (6), while is the solution to (6) with the constraint that , and so on. In general, is the solution to (6) subject to .

Alternatively, (6) may be formulated as the maximization of the quadratic form corresponding to the -Toeplitz matrix

 TΩ,W[q,q′]=Wd1Ω(q)sinc(W(q−q′))1Ω(q′),q,q′∈Zd, (9)

where is the -dimensional normalized sinc function

 sinc(u):=d∏k=1sin(πuk)πuk,u=(u1,…,ud)∈Zd. (10)

The Slepian tapers are thus the top eigenvectors of , satisfying

 TΩ,Wmk=λkmk,k∈{1,…,K}, (11)

where .

Unfortunately, serious numerical difficulties arise when attempting solve (11) numerically. Indeed, the first eigenvalues form a plateau profile, all clustering around (see Figure 2), resulting in small spectral gaps between successive eigenvalues. For increasing , small spectral gaps result in an ill-posed eigenvector problem [41, 14], making direct calculation of in finite precision very challenging.

Under certain circumstances, may be replaced with a commuting differential operator whose spectrum does not exhibit the sample plateau. This is the case in one dimension when , yielding a well-posed eigenvector problem for the Slepian sequences [20, Chapter 2]. For higher dimensions, we may similarly take as a subgrid of , that is, . In this case, Slepian tapers are tensor products of one-dimensional Slepian sequences and the resulting estimators achieve good performance [17].

The existence of a commuting differential operator was referred to as a “lucky accident” by Slepian [39]. Similar devices in two dimensions are only available for special cases involving radial symmetries [15, 20, 37] or for polar caps in spherical geometry [36]. For non-symmetric domains, there are no adequate commuting differential operators [8, 16, 29]. Other stable numerical strategies exist only for particular domains [24, 25, 21, 23], or for modified tapers that have analytic expressions [32]. For more general domains, such as the complement of the disk within a rectangular grid (2), no useful symmetries seem to be available. Consequently, the calculation of the Slepian tapers for multitaper estimators on irregular domains is affected by numerical instability. However, as a consequence of the analysis in this article, such instabilities do not preclude the a stable implementation of the multitaper estimator. Indeed, even if the calculation of the Slepian tapers is severely ill-posed, effective numerical proxies can be calculated efficiently (see Section 5).

Multitaper estimators on irregular domains have been studied previously in the literature. Bronez referred to the problem of “irregularly sampling of multidimensional processes” (in reference to the geometry of the set of available samples) and named the corresponding tapers “generalized prolate spheroidal sequences” [10]. More recently, multitaper estimators associated with irregular domains, including the more challenging setting of spherical geometries, have been instrumental in geosciences and climate analysis [20, 18]. In the absence of a commuting operator, practitioners often calculate the tapers by a direct eigenvalue decomposition of , an operation that is admittedly unstable [31, 36, 37, 18], although effective in practice (see Remark 2). Irregular spectra are also relevant in the one-dimensional setting, for example in the field of cognitive radio [19], where the opportunistic occupation of transmission frequencies leads to complex geometries that can be leveraged through carefully designed irregular sampling patterns [12].

## 4 Analysis of multitaper estimator

Let us now consider the performance of the multitaper estimator . Previous analysis for the one-dimensional case [3] relies on the following aggregated measure for the spectral resolution of the tapers, known as the accumulated spectral window:

 ρ(ξ):=1KK∑k=1∣∣ ∣∣∑q∈Ωmk[q]e2πiqξ∣∣ ∣∣2ξ∈[−1/2,1/2]d. (12)

This window determines the smoothness imposed on and therefore controls the bias, variance, and consequently mean squared error (MSE) of the estimator. Indeed, for a multitaper estimator based on Slepian tapers, we have the following estimate, which is a minor extension of a result from Lii and Rosenblatt [26] (the proof is provided in Appendix A).

###### Proposition 1

Suppose that the spectral density of is a -periodic bounded function. If is defined from the Slepian tapers by (12), the multitaper estimator satisfies:

 Var{ˆSmt(ξ)}:=E{∣∣ˆSmt(ξ)−E{ˆSmt(ξ)}∣∣2}≲∥S∥2∞K, Bias{ˆSmt(ξ)}:=E{∣∣ˆSmt(ξ)−S(ξ)∣∣}=|S(ξ)−S∗ρ(ξ)|.

Note that the above result can also be proved for an arbitrary set of tapers. Abreu & Romero [3] showed that the spectral window associated with Thomson’s classical (that is, one-dimensional) multitaper estimator resembles a bump function localized in the interval and provided concrete error estimates [3]. Such description of the spectral window, combined with Proposition 1, leads to concrete MSE bounds for the classical multitaper as a function of , thus elaborating on the more qualitative analysis by Lii and Rosenblatt [26]. These types of estimates are also instrumental in the analysis of multitapering for slowly evolving spectral densities [44].

As a first contribution, we extend the description of the spectral window to arbitrary dimension and general acquisition domains. We let denote the digital perimeter of :

 n∂Ω=∑q∈Zdd∑j=1∣∣1Ω(q+ej)−1Ω(q)∣∣,

where is the canonical basis of . In the following, we will assume that to avoid degenerate cases.

###### Theorem 1

Consider the spectral window defined by the Slepian tapers obtained from (6). Assume that , where and are related by (7). Then

 ∥∥ρ−1Wd1[−W/2,W/2]d∥∥L1([−1/2,1/2]d)≲n∂ΩWd−1K[1+log(nΩn∂Ω)]. (13)

Related results in the context of the short-time Fourier transform can be found in

[1, 2]. The proof of Theorem 1 is postponed to Appendix A.

Combining Proposition 1 and Theorem 1, we obtain MSE bounds for the multitaper estimator on general domains . We also present a simplified expression for the bound, valid in the so-called fine-scale regime. In this regime, one considers a class of acquisition domains for which there is a constant such that . An instance of this regime occurs, for example, if arises from increasingly fine-scale discretizations of a certain continuous subset of , where depends on the smoothness of the subset.

###### Theorem 2

Suppose that the spectral density of is a -periodic function. Assume that . Then the multitaper estimator with tapers satisfies the following mean squared error bound:

 MSE{ˆSmt(ξ)} =E{∣∣ˆSmt(ξ)−S(ξ)∣∣2} (14) ≲∥S∥2C2(K4/dnΩ4/d+n2∂ΩnΩ2−2/dK2/d[1+log(nΩn∂Ω)]2+1K). (15)

In particular, when , and, in the fine-scale regime (where ), the choice (or, equivalently, ) gives

 MSE{ˆSmt(ξ)}≲nΩ−2/3⋅log2(nΩ)⋅∥S∥2C2. (16)

The proof of Theorem 2 is postponed to Appendix A.

In dimension , the choice in (14) (or, equivalently, ) gives

 MSE{ˆSmt(ξ)}≲nΩ−4/5∥S∥2C2, (17)

recovering the result in [3]. In this setting, the minimax risk corresponding to the stronger error measure given by the expected operator norm of the covariance matrix satisfies

 infˆSsupSE{supξ∣∣ˆS(ξ)−S(ξ)∣∣2}≍n−4/5log(n)4/5. (18)

Here, the infimum is taken among all estimators based on consecutive samples, and the first is over all spectral densities with Hölder exponent satisfying a certain smoothness bound, which determines the implied constant [6, 11]; see also [22]. We are unaware of benchmarks for the spectral estimation problem related to two-dimensional acquisition domains.

The estimates leading to Theorem 1 are also relevant in numerical analysis. For example Proposition 4 below improves on [27].

The above results may be generalized to cover arbitrary frequency profiles for the tapers instead of squares . For example, in applications where the spectral density does not display any particular anisotropy related to the axes, a disk might be a better choice. For simplicity we do not pursue such generalizations in the present work.

[Shannon number] Theorem 2 and the technical lemmas in Appendix A

provide non-asymptotic bounds for certain heuristic calculations concerning the so-called

Shannon number [37]. These involve the sum of the most significant eigenvalues of the spectral concentration problem for irregular domains, and are typically formulated in the large-scale asymptotic regime (see also [36, Section 7]).

## 5 Randomized implementation of the multitaper estimator

As discussed in Section 3, calculating the Slepian tapers for arbitrary domains is a difficult task due to the inherent instability of the underlying eigenproblem. To avoid this difficulty, we replace the Slepian tapers with another set of tapers which are easier to calculate but have the same linear span. While these tapers can be very different, it turns out that they yield the same multitaper estimator.

###### Proposition 2

Let and be two orthonormal sets that span the same linear space . Then the corresponding multitaper estimators coincide, that is,

 1LL∑k=1ˆSgk(ξ)=1LL∑k=1ˆS˜gk(ξ)ξ∈[−1/2,1/2]d. (19)

The proof is found in Appendix A.

The above result significantly simplifies our task, since while computing the top eigenvectors of (i.e., the Slepian tapers) is unstable, this is not the case when computing their linear span. Indeed, there is a large spectral gap between the first eigenvalues the rest of the spectrum (which Theorem 1 and the estimates in Appendix A validate). As a result, computing the associated subspace is a well-posed problem [41, 14].

We propose to compute these randomized tapers through a block power method applied to . These are then used to compute the multitaper spectral estimator for a given realization of a stationary process. The resulting algorithm is presented as Procedure (P1).

Procedure P1: The randomized multitaper estimator. Data: Samples , for A bias-variance trade-off parameter , Number of power method iterations , A target frequency . Result: , an estimator for . Procedure: [itemindent=0pt] Let be defined by .

Draw a random matrix

, where , for and , are drawn independently. Repeat times:

Compute a QR decomposition:

.
Set .
Let be the columns of and

In exact arithmetic, the column space of converges to the span of the Slepian tapers as we increase . Consequently, by Proposition 2, converges to . Since the gap between the th and the th eigenvalues of is typically non-negligible [46, 47], convergence is relatively fast and only a moderate number of iterations is necessary. In fact, a single iteration is often sufficient for many applications. Furthermore, the only step in Procedure (P1) that depends on the data is the last one. As a result, we may precompute the randomized tapers ahead of time and then apply them to the data when needed.

The matrix is supported on ; in Procedure (P1) it is treated as an element of .

Applying to a vector involves an insertion followed by a convolution and a truncation. Insertion and truncation are diagonal operators, while the convolution is a Toeplitz operator that can be applied using a fast Fourier transform. Multiplication of a vector by is thus achieved in time.

[Relation to common practice] The numerical calculation of the Slepian tapers on general domains as eigenvectors of is a severely ill-posed problem. Indeed, the condition number for the calculation of a single eigenvector is inversely proportional to its distance to the rest of the spectrum–see, for example, [33, Equation 3.45] and [35, Section 2.5]–and this spectral gap is small because of the plateau spectral profile of when is large. However, practitioners resort to such direct methods with remarkable results [31, 36, 37, 18]. Our analysis partly explains this success: unless is small enough, a standard finite-precision eigenvalue routine will fail to compute the true Slepian tapers, but will still compute an orthonormal basis for their linear span. In other words, the resulting tapers will be linear combinations of the true Slepian tapers, and therefore yield an equally useful multitaper estimator.

## 6 Numerical results

To empirically evaluate the performance of the proposed randomized multitaper estimator, we perform a few numerical experiments on synthetic data. The randomized multitaper is shown to perform comparably to the standard Slepian multitaper, but with applicability to irregular domains. We also evaluate its accuracy on single-particle cryo-EM data, for both synthetic and experimental images.

### 6.1 Comparison with tensor Slepian tapers

As discussed in Section 3, tensor products of Slepian functions may be used when is a subgrid of . To illustrate the performance of the randomized multitaper estimator, we therefore first compare it with that of the tensor Slepian multitaper estimator defined on the subgrid shown in Figure 2(a).

The top row Figure 4 shows a few tensor Slepian functions defined on the subgrid domain of Figure 2(a) with target frequency profile given by 2(b), which corresponds to . Below are randomized tapers defined on the same subgrid with the same frequency profile and calculated using Procedure (P1) for . Both sets of tapers are properly supported on given domain, but their appearance is quite different. Nonetheless, their accumulated spectral windows, shown in Figures 4(a) and 4(b), both closely agree with the target profile in Figure 2(b).

To evaluate the performance of these tapers in a multitaper estimation setting, we generate a Gaussian process with spectral density given by Figure 5(a) on a -by--pixel square. The estimated density of the tensor Slepian multitaper estimator is given in Figure 5(b) while that of the randomized tapers for iterations is in Figure 5(c). Both agree quite well with the true density. Indeed, the normalized root mean squared error of the tensor Slepian estimator is approximately while for the randomized tapers, we have of about . The deviation between the two estimators is . If we increase the number of iterations to , we obtain equality up to machine precision with .

### 6.2 Illustration for irregular domain

We now replace the subgrid domain with the disk complement domain shown in Figure 2(c). Following the discussion of Section 3, it is computationally intractable to solve the eigenvalue (11) for the desired tapers. We can, however, compute randomized tapers over this domain using Procedure (P1) with . A few sample tapers are shown in the bottom row Figure 4. Again, their appearance is quite different from the Slepian tapers in the top row, but their accumulated spectral window shown in Figure 4(c) agrees well with the target of Figure 2(b).

Applying these tapers to estimate the spectral density of Figure 5(a) from one realization gives the density depicted in Figure 5(d). The normalized root mean squared error is approximately .

### 6.3 Cryo-EM: Synthetic data

To evaluate our approach in a real-world application, we consider the estimation of noise power spectra in cryo-EM. In cryo-EM imaging, a solution containing macromolecules of interest are frozen in a thin layer of vitreous ice which is then exposed to an electron beam. A sensor records the transmitted electrons, resulting in a set of tomographic projections depicting the molecules from various viewing angles [13]. To reduce specimen damage, the electron dose is kept low, resulting in exceptionally noisy images, with noise power often exceeding that of the signal by a factor of ten or more. Three-dimensional reconstruction of the molecules, the goal of cryo-EM, therefore requires accurate characterization of the noise model. Since the noise characteristics vary with microscope configuration, ice thickness, and other experimental factors, we cannot rely on ensemble averages for accurate estimation of the noise power spectrum.

To evaluate our approach for the cryo-EM application, we generate a number of projection images using a 70S ribosome density map on a grid with . These are shown in Figures 6(a) and 6(b). We then generate Gaussian noise with a spectral density given in Figure 6(c). Adding the noise to the simulated projections, we obtain the images shown in Figure 6(d) and 6(e).

Since the central disk of the images contains both the projected molecular density and the noise, we would like to estimate outside of this disk. Specifically, we would like to restrict our estimator to a set of samples like the one shown in Figure 7(a). A common approach is to define the mask taper and use (4) to calculate the tapered periodogram [45, 7]. As only taper is used, the result has high variance, so the estimates are averaged over a set of images to obtain an adequate estimate. However, this fails to account for any variability in the noise models of the individual images.

A more accurate estimator is obtained by replacing with a union of rectangular subgrids

 Ωgrid:=⋃a∈{0,1}2{q∈QN:|q−Na|1

where is the -norm. We then apply a standard tensor multitaper estimator to each subgrid and average the results. These tapers are tensor products of one-dimensional Slepian sequences, as described in Section 3. The corresponding to the of Figure 7(a) is shown in Figure 7(b). Depending on the geometry of , may discard many points in , increasing variance of the estimate. For small , this also increases bias, as fewer points in this regime leads to a wider accumulated spectral window .

Performance. We now compare the performance of these baseline estimators, the tapered periodogram and the tensor multitaper estimator , to that of our proposed estimator . For a given , we define by (2) and by (20). When is low, we expect a large bias as the samples are contaminated by the projected density maps, while increasing results in lower bias but higher variance due to the lower number of available samples. The bias, variance, and mean squared error of the three estimators as a function of are shown in Figures 7(c), 7(d), and 7(e), respectively.

The bias of the randomized multitaper estimator is higher than the tensor multitaper for low . This is due to the original mask overlapping with the support of the projection images to a greater extent than at these radii. As increases above , however, the bias for both estimators drops down to the same level. Since it has , no smoothness is imposed on the tapered periodogram which therefore has lower bias compared to the multitaper estimators.

At low , the influence of the clean projection images yields high variance for all the estimators, since the images vary according to viewing angle. The effect is exacerbated for the randomized multitaper estimators and the tapered periodogram since has greater overlap with the projections compared to . As increases, however, the randomized multitaper estimator enjoys a lower variance, since it draws upon a larger number of samples compared to the tensor multitaper estimator , reducing the variance by a factor of two. The tapered periodogram, meanwhile, has high variance for all since it only employs a single taper, providing no variance reduction.

The low bias and variance for high combine to yield a lower MSE for the randomized multitaper estimator compared to the other estimators. On average, the randomized multitaper estimator gives an error a factor of two lower than the tensor multitaper, and beats the tapered periodogram by a factor of three.

### 6.4 Cryo-EM: Experimental data

We now evaluate performance on images consisting of experimental projections of an 80S ribosome complex from the EMPIAR-10028 dataset [43]. The images are defined on a grid with . For our evaluation, we choose a subset of the first images from the dataset. Two sample images from this subset are shown in Figure 8(a) and 8(b).

To obtain a reasonable approximation of the true power spectrum for each projection image, we process the entire dataset through the RELION software package [34]. This yields estimates of the underlying molecular density and allows us to simulate clean projection images for each of the noisy images. By subtracting the estimated clean images from the noisy images, we obtain a set of images consisting mostly of noise. We may then estimate their power spectra by applying a tensor multitaper estimator over the entire grid . Since these estimates incorporate the noise in the center of the image, they should provide an accurate estimate of the spectral density of the noise in the whole image. We shall therefore use these as a standard against which we compare the estimates obtained from the complement of the disk on the original projection images.

As before, we calculate the tapered periodogram on , the tensor multitaper estimator on , and the randomized multitaper estimator on . Each estimated power spectrum is compared to the “ground truth” power spectrum obtained for the corresponding image as described above. The resulting MSE’s are plotted for each estimator in Figure 8(c).

As before, the tapered periodogram performs badly since it has , resulting in high variance and therefore high MSE. The tensor multitaper estimator performs well for low , but increasing reduces the area of , yielding higher variance and higher MSE. A similar behavior is observed for the randomized multitaper estimates, but the reduction in area for is not as drastic for increasing , so the variance remains small compared to the tensor multitaper estimator. At , the error is minimized and the randomized multitaper outperforms the tensor multitaper by a factor of .

## 7 Conclusion

We have analyzed the multitaper estimator on arbitrary acquisition domains, providing performance bounds on the mean squared error. Furthermore, we use the fact that the multitaper estimation only depends on the linear span of the tapers to introduce a new implementation based on randomized tapers. These span the same linear space as the Slepian tapers, but are easier to calculate since they do not involve calculating eigenvectors associated with closely spaced eigenvalues. The performance of the randomized multitaper estimator is shown to be comparable to that using Slepian tapers when these are available for rectangular domains. For more general domains, we demonstrate the improved performance of the randomized multitaper estimator compared to Slepian multitaper estimators defined on rectangular subgrids. This is shown for both synthetic examples and on experimental data obtained from cryo-EM imaging.

Future directions of research involve adapting the factor analysis framework proposed for periodogram estimators [4] to multitaper estimators. This would allow for greater variance reduction when a linear structure exists in the variability of spectral density between independent realizations of the point process. Such a situation arises, for example, in the cryo-EM noise estimation task, where a set of underlying noise sources combine at arbitrary strengths to yield the noise process in a given image.

## Acknowledgment

The authors are very grateful to Luís Daniel Abreu, Tomasz Hrycak, and Amit Singer, who motivated this article and provided valuable input.

## Appendix A Proofs

### a.1 Convolution estimates

First, we note that

 ∥f∗g∥∞≤∥f∥∞∥g∥L1([−1/2,1/2]d). (21)

Furthermore, for a parameter

we consider the periodic extension of the normalized characteristic function

, and, by a slight abuse of notation, we define its convolution with a 1-periodic function as

 f∗W−d1[−W/2,W/2]d(x)=1Wd∫[−W/2,W/2]df(x−y)dy.

An estimate on a second-order Taylor expansion shows that

 ∥f−1Wdf∗1[−W/2,W/2]d∥∞≲∥f∥C2W2. (22)

### a.2 Trace and norm of the Toeplitz operators

The estimates derived in this section are also relevant in numerical analysis of Fourier extensions. In particular, they improve on the results in [27] by avoiding assumptions on the (digital) topology of the set . See also [28] for related estimates.

###### Proposition 3

Let with . Then

 ∥a∗b−b∥ℓ1(Zd)≤∥∇b∥1∑q|q|∞|a(q)|,

where for , , and .

###### Proof

For and we let be the truncated vector , and also . For , let us write

 (a∗b)[q′]−b[q′]=∑q∈Zd(b[q+q′]−b[q′])a[−q] =∑q∈Zdd∑k=1(b[χkq+q′]−b[χk−1q+q′])a[−q] =∑q∈Zdd∑k=1(b[χk−1q+q′+qkek]−b[χk−1q+q′])a[−q] =∑q∈Zdd∑k=1|qk|−1∑l=0(b[χk−1q+q′+sgn(qk)(l+1)ek]−b[χk−1q+q′+sgn(qk)lek])a[−q] =∑q∈Zdd∑k=1|qk|−1∑l=0∇kb[χk−1q+q′+sgn(qk)lek]a[−q].

Therefore,

 ∥a∗b−b∥1 ≤∑q∈Zdd∑k=1|qk|−1∑l=0∑q′∈Zd∣∣∇kb[χk−1q+q′+sgn(qk)lek]∣∣|a[−q]| ≤∑q∈Zdd∑k=1∥∇kb∥1|qk||a[−q]|≤∥∇b∥1∑q∈Zd|q|∞|a[q]|,

as desired, since . (See [28, 1] for related estimates.)

###### Proposition 4

Let be the matrix in (9). Then

 trace[TΩ,W]−trace[(TΩ,W)2]≲n∂ΩWd−1[1+log(nΩn∂Ω)]. (23)

###### Proof

Step 1. (Computations). The function

 h[q]:=d∏j=1sin(πWqj)πqj=Wdsinc(Wq),q=(q1,…,qd)∈Zd, (24)

determines the Fourier series

Note that, in terms of , (9) reads: . We first compute

 trace[TΩ,W] =∑q∈Ωh[q−q]=nΩh0=nΩ∥1[−W/2,W/2]d∥1 (25) =WdnΩ=Wd∑q∈Zd1Ω[q]. (26)

Second,

 trace[(TΩ,W)2] =∑q,q′∈Zd1Ω[q]∣∣h[q−q′]∣∣21Ω[q′] (27) =Wd∑q∈Zd(1Ω∗a)[q]1Ω[q], (28)

where

 a[q]:=W−d|h[q]|2.

Step 2. (Truncation errors). Let and consider the function

 ~a[q]:=a[q]1|q|≤L.

We claim that, for ,

 ∥a−~a∥1=W−d∑q∈Zd,|q|>L|h[q]|2≲(WL)−1, (29) ∑q∈Zd|q||~a[q]|=W−d∑q∈Zd,|q|≤L|q||h[q]|2≲logLW. (30)

To show these estimates, we write with

 r[q1]=sin(πWq1)πq1=Wsinc(Wq1),q1∈Z.

We first note the following:

 ∑q1∈Z,|q|>L|r[q1]|2≲∑q1∈Z,|q1|>L1|q1|2≲1L, ∑q1∈Z,|q1|≤L|q1||r[q1]|2≲∑q1∈Z1|q1|≲log(L),L