1.1 Problem definition
Consider the matrix
is a vector of distinct nodes with , and
is the normalized bandwidth. The scaling of the smallest eigenvalue is of interest in applied harmonic analysis, and in particular in the theory of super-resolution, where this quantity controls the worst-case stability of recovering an atomic measure from bandlimited data (see sub:discussion below). (Footnote 1: it is well known that is positive-definite, for instance because is a positive-definite function.) Since
we see that is the limit as of the matrix
where is the Dirichlet (periodic sinc) kernel
For each , let
be the rectangular Vandermonde matrix with complex nodes where . Clearly , and so .
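As a minimal numerical sketch of the relation between the Dirichlet-kernel matrix and the Vandermonde matrix, the snippet below builds a rectangular Vandermonde matrix with nodes on the unit circle and compares its Gram matrix with the kernel evaluated at the node differences. The row range k = -N..N, the absence of any normalization factor, and the helper names `vandermonde_unit_circle`, `dirichlet_gram` and `sigma_min` are assumptions made for illustration; the normalization used in the text may differ by a constant.

```python
import numpy as np

def vandermonde_unit_circle(t, N):
    """Matrix with entries exp(1j*k*t[j]), rows k = -N..N (unnormalized; an assumption)."""
    k = np.arange(-N, N + 1)
    return np.exp(1j * np.outer(k, np.asarray(t, dtype=float)))

def dirichlet_gram(t, N):
    """Matrix [D_N(t_j - t_i)] with D_N(d) = sin((N + 1/2) d) / sin(d / 2)."""
    t = np.asarray(t, dtype=float)
    d = t[None, :] - t[:, None]
    with np.errstate(divide="ignore", invalid="ignore"):
        D = np.sin((N + 0.5) * d) / np.sin(d / 2)
    D[np.isclose(d, 0.0)] = 2 * N + 1   # value of the kernel at d = 0
    return D

def sigma_min(A):
    """Smallest singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[-1]

t = np.array([-1.0, 0.2, 1.3])          # hypothetical node vector
N = 20
V = vandermonde_unit_circle(t, N)
gram = V.conj().T @ V                   # Hermitian Gram matrix V^H V
lam_min = np.linalg.eigvalsh(gram)[0]   # eigvalsh returns ascending eigenvalues
```

With this convention the smallest eigenvalue of the Gram matrix coincides with the square of the smallest singular value of the Vandermonde matrix, which is why bounds for one transfer to the other.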
For , we denote
where is the principal value of the argument of , taking values in .
Given as above, we define the minimal separation (in the wrap-around sense) as
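The wrap-around distance and the minimal separation can be sketched as follows. The function names `wrap_abs` and `min_separation` are hypothetical, and the sketch assumes that the distance is the absolute value of the principal argument of exp(it), as described above.

```python
import numpy as np

def wrap_abs(t):
    """Absolute value of the principal argument of exp(1j*t), a value in [0, pi]."""
    return np.abs(np.angle(np.exp(1j * np.asarray(t, dtype=float))))

def min_separation(t):
    """Smallest wrap-around distance between two distinct nodes."""
    t = np.asarray(t, dtype=float)
    d = wrap_abs(t[:, None] - t[None, :])
    np.fill_diagonal(d, np.inf)          # ignore the distance of a node to itself
    return d.min()

# Hypothetical example: the first and last node are close only in the wrap-around sense.
nodes = np.array([0.0, 0.5, 2 * np.pi - 0.2])
delta = min_separation(nodes)
```

In this example the plain distance between the first and last node is almost 2π, while the wrap-around distance is 0.2, which is the value returned.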
It is well known that there are two very different scaling regimes for , depending on the quantity which is frequently called the “super-resolution factor” (see sub:discussion below):
If and is fixed, the matrix is well-conditioned, and in fact it can be shown that in this case
The case is somewhat more relevant to super-resolution applications; however, all known results provide sharp bounds only in the particular case when all the nodes are clustered together, or approximately equispaced. In this setting we have the fast decay
1.2 Main results
It turns out that the bound (5) is too pessimistic if only some of the nodes are known to be clustered. Consider for instance the configuration . As can be seen in fig:sigma.min.first.simulation, we have in fact , which decays much more slowly than , the rate that the bound (5) would give.
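The phenomenon can be probed numerically with a small experiment. The sketch below is not the configuration of the figure: the two node vectors, the bandwidth parameter `N`, and the row range k = -N..N are all hypothetical choices. It halves the cluster spacing and records by what factor the smallest singular value drops, once for three clustered nodes and once for two clustered nodes plus one distant node.

```python
import numpy as np

def sigma_min(t, N):
    """Smallest singular value of the matrix [exp(1j*k*t_j)], k = -N..N (assumed convention)."""
    k = np.arange(-N, N + 1)
    V = np.exp(1j * np.outer(k, t))
    return np.linalg.svd(V, compute_uv=False)[-1]

N = 50  # hypothetical bandwidth parameter

def halving_ratio(make_nodes, delta=1e-3):
    """Factor by which sigma_min drops when the cluster spacing is halved."""
    return sigma_min(make_nodes(delta), N) / sigma_min(make_nodes(delta / 2), N)

def single_cluster(d):
    return np.array([0.0, d, 2 * d])     # all three nodes clustered

def partial_cluster(d):
    return np.array([0.0, d, 1.5])       # two clustered nodes, one far away

r_single = halving_ratio(single_cluster)    # quadratic decay would give a factor near 4
r_partial = halving_ratio(partial_cluster)  # linear decay would give a factor near 2
```

A factor near 4 corresponds to decay with exponent 2 (cluster of three nodes), and a factor near 2 to exponent 1 (cluster of two nodes), in line with the claim that the exponent is governed by the largest cluster rather than by the total number of nodes.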
In this paper we bridge this theoretical gap. We consider the partially clustered regime where at most neighboring nodes can form a cluster (there can be several such clusters), with two additional parameters controlling the distance between the clusters and the uniformity of the distribution of nodes within the clusters.
The node vector is said to form a -clustered configuration for some , , and , if for each , there exist at most distinct nodes
such that the following conditions are satisfied:
For any , we have
For any , we have
Our main result is the following generalization of (5) for clustered configurations.
There exists a constant such that for any , any forming a -clustered configuration, and any satisfying
The proof of thm:main-theorem is presented in sub:theproof below. It is based on the “decimation” technique, previously used in the context of super-resolution in [1, 2, 4, 5, 6] and references therein.
The same node vector can be regarded as a clustered configuration with different choices of the parameters . For example, the vector from the beginning of this section (and also fig:sigma.min.first.simulation) is both -clustered and -clustered, with any . To obtain as tight a bound as possible, one should choose the minimal such that the condition (6) is satisfied for within the range of interest. For instance, might be too small if is small enough; however, by choosing one is able to increase without bound. See fig:breakdown for a numerical example.
The constant is given explicitly in (30), and it decays in like . We do not know whether this rate can be substantially improved; however, it is plausible that the best possible bound would scale like for some absolute constant .
For the case of finite , one might be interested to consider the rectangular Vandermonde matrix without any reference to , i.e.
for some node vector . Our next result is the analogue of (7) in this setting, albeit under the extra assumption that the nodes are restricted to the interval .
There exists a constant such that for any , any forming a -clustered configuration, and any satisfying
Let us choose so that for all we have
Further define , and . We immediately obtain that the vector forms a -clustered configuration according to def:partial-cluster, and the rectangular Vandermonde matrix in (9) is precisely . Clearly, , and also
Using (10), we obtain precisely the conditions (6) with in place of respectively. Therefore the conditions of thm:main-theorem are satisfied for , and so (11) follows immediately from (12) and (7), with . ∎
Returning to thm:main-theorem, it turns out that the bound (8) is asymptotically optimal.
There exists an absolute constant and a constant such that for any and any satisfying , there exists a -clustered configuration with nodes and certain depending only on , for which
The proof of thm:optimality is presented in sub:optimality. Numerical experiments validating the above results are presented in sec:Numerical-evidence.
1.3 Related work and discussion
Our main result has direct implications for the problem of super-resolution under sparsity constraints. For simplicity suppose that the nodes must belong to the grid of step size . As demonstrated in [11, 18] and several other works, the minimax error rate for recovery of sparse point measures from the bandlimited and inexact measurements is directly proportional to where is any vector of length . Moreover, it is established in those works that without any further constraints on the support of , the bound (5) holds and it is the best possible.
It is fairly straightforward to extend the results of  and  to our setting: if the support of is known to be partially clustered (as in def:partial-cluster), then the minimax error rate will satisfy
for any estimator and the norm , and it will be attained by the intractable sparse -minimization, with the additional restriction that the solutions should exhibit the appropriate clustered sparsity pattern instead of unconstrained sparsity.
A different but closely related setting was considered in the seminal paper , where the measure was assumed to have an infinite number of spikes on a grid of size , with one spike per unit of time on average, but whose local complexity was constrained to have no more than spikes in any interval of length . is called the “Rayleigh index”, being the maximal number of spikes which can be clustered together (a related notion of Rayleigh regularity was introduced in ). It was shown in  that the minimax recovery rate for such measures essentially scales like (13) where is replaced with (the work  had a small gap in the exponents between the lower and upper bounds, which was later closed in  for the finite sparse case). Our partial cluster model can therefore be regarded as the finite-dimensional version of these “sparsely clumped” measures with finite Rayleigh index, showing the same scaling of the error: polynomial in and exponential in the “local complexity” of the signal.
If the grid assumption is relaxed, then one might wish to measure the accuracy of recovery by comparing the locations of the recovered signal with the true ones . In this case, additional considerations are required to derive the minimax rate, and it is possible to do so under the partial clustering assumptions. See [2, 6] for details, where we prove (13) in this scenario, for a uniform bound on the noise . The extreme case has been treated recently in [4, 5].
In the case of well-separated spikes (i.e. clusters of size ), a recent line of work using minimization ([9, 8, 13, 10] and the great number of follow-up papers) has shown that the problem is stable and tractable.
Therefore, the partial clustering case is somewhat mid-way between the extremes and , and while our results in this paper (and also in ) show that it is much more stable than in the unconstrained sparse case, it is an intriguing open question whether provably tractable solution algorithms exist.
Several candidate algorithms for sparse super-resolution are well known: MUSIC, ESPRIT/matrix pencil, and variants; these have roots in parametric spectral estimation . In recent years, the super-resolution properties of these algorithms have been a subject of ongoing interest, see e.g. [14, 19, 25] and references therein. Smallest singular values of the partial Fourier matrices , for finite , play a major role in these works, and therefore we hope that our results and techniques may be extended to analyze these algorithms as well.
2 Known bounds
2.1 Well-separated regime
Consider the well-separated case , and let be as defined in (3), i.e. a rectangular Vandermonde matrix with nodes on the unit circle with , so that .
Several more or less equivalent bounds on are available in this case, using various results from analysis and number theory such as Ingham and Hilbert inequalities, large sieve inequalities and Selberg’s majorants [17, 20, 24, 3, 21, 22, 15, 7].
The tightest bound was obtained by Moitra in , where he showed that if then
In our setting, we have and so as we obtain
which is exactly (4).
2.2 Single clustered regime
Let us now assume , i.e. or, equivalently, .
The same scaling was shown using Szegő’s theory of Toeplitz forms in  – see also sub:discussion. The authors showed that there exist and such that for
Essentially the same result was obtained in , where the authors considered partial discrete Fourier matrices
obtained from the un-normalized Discrete Fourier Transform matrix of size by taking the first rows and an arbitrary set of columns, with and . The authors showed that as with the ratio fixed, we have the bound
which is attained for the configuration of consecutive columns. In our equispaced setting, it is easy to see that the matrix for large is precisely with and . Therefore the above result reduces to
which is the same as (5).
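The identification of a partial DFT matrix with a Vandermonde matrix on equispaced nodes can be checked directly. In the sketch below the sizes `M`, `N`, the column set `cols`, and the sign convention exp(-2πi jk/M) of `numpy.fft.fft` are assumptions for illustration; the convention in the cited work may differ by a complex conjugation, which does not change singular values.

```python
import numpy as np

M, N = 64, 10                      # hypothetical DFT size and number of rows kept
cols = np.array([3, 4, 5, 6])      # hypothetical set of consecutive columns

# Un-normalized DFT matrix: F[j, k] = exp(-2j*pi*j*k/M).
F = np.fft.fft(np.eye(M))
partial = F[:N, cols]              # first N rows, selected columns

# Vandermonde matrix with equispaced nodes on the unit circle, entries z_k ** j.
z = np.exp(-2j * np.pi * cols / M)
V = np.vander(z, N, increasing=True).T

same = np.allclose(partial, V)
spacing = 2 * np.pi / M            # angular distance between neighboring nodes
```

Consecutive columns therefore correspond to a single cluster of equispaced nodes with spacing 2π/M, which is the setting of the single-cluster bound.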
Here we introduce the uniform blowup of a node vector by a positive parameter , and study the effect of such a blowup mapping on the minimal wrap-around distance between the mapped nodes.
Let form a cluster, and suppose that . Then, for any there exists a set of total measure such that for every the following holds for every :
Furthermore, the set is a union of at most intervals.
We begin with (15). Let , then and since we immediately conclude that
To show (16), let
be the uniform probability measure on . Let and be fixed and put . For , let
be the random variable on , defined by
We now show that for any
Since , we can write where is an integer and . We break up the probability (17) as follows:
Now, consider the number . As varies between and , the number traverses the unit circle exactly once, and therefore the variable traverses the interval exactly twice. Consequently,
Similarly, when varies between and , we have
It is clear from the above that is a union of intervals, each of length , repeating with the period of . Consequently the set is a union of at most intervals. Since we have , and so the set is a union of at most intervals.
Now we put and apply (17) for every pair where and . By the union bound, we obtain
Fixing as the complement of the above set, , we have that is of total measure greater than or equal to , and for every the estimate (16) holds. Clearly is a union of at most intervals. ∎
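The two mechanisms used in the proof can be illustrated numerically. The sketch below makes several assumptions: the blowup acts as multiplication of the nodes by λ, and the values `delta`, `eps`, `periods` and the cluster are hypothetical. It first estimates, over a whole number of periods, the fraction of blowup factors for which two nodes at distance `delta` land within wrap-around distance `eps` of each other, and then checks that within a cluster the wrap-around distances simply scale by λ as long as the blown-up cluster has diameter at most π.

```python
import numpy as np

def wrap_abs(t):
    """Absolute value of the principal argument of exp(1j*t)."""
    return np.abs(np.angle(np.exp(1j * t)))

# Two nodes from different clusters at (hypothetical) distance delta.
delta, eps, periods = 1.0, 0.3, 100
lam = np.linspace(0.0, periods * 2 * np.pi / delta, 200_000, endpoint=False)
bad_fraction = np.mean(wrap_abs(lam * delta) <= eps)   # close to eps / pi

# Nodes inside one (hypothetical) cluster: no wrap-around while lam * diameter <= pi.
cluster = np.array([0.0, 0.01, 0.025])
lam0 = 0.9 * np.pi / (cluster.max() - cluster.min())
diffs = cluster[:, None] - cluster[None, :]
scales_linearly = np.allclose(wrap_abs(lam0 * diffs), lam0 * np.abs(diffs))
```

The first quantity is the single-pair probability that enters the union bound; summing it over all pairs of nodes from different clusters gives the measure of the excluded set of blowup factors.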
Fix and consider the set given by lem:blowup-lemma. Let us also fix a finite and positive integer , and consider the set of equispaced points in :
If , then .
By lem:blowup-lemma, the set consists of intervals, and by (19) the total length of is at most . Denote the lengths of those intervals by . The distance between neighboring points in is , and therefore each interval contains at most points. Overall, the interval contains at most
points from , and since the total number of points in is at least , we have
3.2 Square Vandermonde matrices
Let be a vector of pairwise distinct complex numbers. Consider the square Vandermonde matrix
Theorem 3.1 (Gautschi, ).
For a matrix , let denote the induced matrix norm
Then we have
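For a quick sanity check, the sketch below takes the bound in its classical form, namely that the ∞-norm of the inverse is at most the maximum over i of the product over j ≠ i of (1 + |x_j|) / |x_j - x_i|; that this is the form intended here is an assumption. The function name `gautschi_bound`, the node vector, and the convention that the matrix has entries x_j^k with k indexing rows are likewise assumptions for illustration.

```python
import numpy as np

def gautschi_bound(x):
    """max_i prod_{j != i} (1 + |x_j|) / |x_j - x_i| (classical form, assumed)."""
    x = np.asarray(x)
    vals = []
    for i in range(len(x)):
        others = np.delete(x, i)
        vals.append(np.prod((1 + np.abs(others)) / np.abs(others - x[i])))
    return max(vals)

# Hypothetical pairwise distinct nodes on the unit circle.
x = np.exp(1j * np.array([0.0, 0.4, 0.9, 1.7, 2.8]))

# Square Vandermonde matrix with entries V[k, j] = x[j] ** k.
V = np.vander(x, increasing=True).T

# Induced infinity norm of the inverse (maximum absolute row sum).
inv_norm = np.linalg.norm(np.linalg.inv(V), np.inf)
bound = gautschi_bound(x)
```

With this row and column convention, row i of the inverse holds the coefficients of the i-th Lagrange basis polynomial, which is where the product over the remaining nodes comes from.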
Suppose that is a vector of pairwise distinct complex numbers with , , and let be arbitrary. Let
For , denote by the angular distance between and :
3.3 Proof of thm:main-theorem
We shall bound defined as in (3) for sufficiently large . For any subset let be the submatrix of containing only the rows in . By the Rayleigh characterization of singular values, it follows immediately that if is any partition of the rows of , then
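The row-partition step can be checked on a random example. The sketch assumes the inequality takes the form "the squared smallest singular value of the full matrix is at least the sum of the squared smallest singular values of the row blocks", which follows from splitting the squared norm of the product of the matrix with a vector over the blocks; the matrix and the blocks below are hypothetical.

```python
import numpy as np

def smin(A):
    """Smallest singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[-1]

rng = np.random.default_rng(0)
A = rng.standard_normal((12, 3)) + 1j * rng.standard_normal((12, 3))

# A partition of the 12 rows into three blocks of four rows each.
blocks = [np.arange(0, 4), np.arange(4, 8), np.arange(8, 12)]

lhs = smin(A) ** 2
rhs = sum(smin(A[b]) ** 2 for b in blocks)
holds = lhs >= rhs - 1e-12
```

Leaving some rows out of the blocks only decreases the right-hand side, so the inequality also holds for a collection of disjoint row subsets that does not cover all rows.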
Let be the set from lem:blowup-lemma for . By prop:finite-n-blowup we have that for all , will contain a rational multiple of of the form for some .
Consider the “new” nodes
Since , we conclude by lem:blowup-lemma that for every
Since it follows that . Now consider the particular interleaving partition of the rows by blocks of rows each, separated by rows between them (some rows might be left out):
For , each is a square Vandermonde-type matrix as in (22),
with node vector
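The decimation step can be sketched as follows. Under the assumption that the rectangular matrix has entries z_j^r with rows r = 0..N-1, taking every λ-th row starting from row k gives a square Vandermonde matrix in the blown-up nodes z_j^λ, multiplied on the right by a unitary diagonal matrix. The node vector, `lam`, `k` and `N` below are hypothetical.

```python
import numpy as np

def smin(A):
    """Smallest singular value of A."""
    return np.linalg.svd(A, compute_uv=False)[-1]

t = np.array([0.0, 0.01, 0.02, 1.5])      # hypothetical nodes: one cluster and a far node
s, lam, N = len(t), 7, 40                  # number of nodes, blowup factor, number of rows
z = np.exp(1j * t)

V = np.vander(z, N, increasing=True).T     # V[r, j] = z[j] ** r, rows r = 0..N-1

k = 3                                      # offset of the block
rows = k + lam * np.arange(s)              # rows k, k + lam, ..., k + (s - 1) * lam
block = V[rows]

# Square Vandermonde matrix in the nodes z ** lam, times the diagonal of z ** k.
square = np.vander(z ** lam, s, increasing=True).T
factorized = square * (z ** k)[None, :]

identical = np.allclose(block, factorized)
```

Since the diagonal factor is unitary, each block has the same singular values as the square Vandermonde matrix on the blown-up nodes, so bounds for square Vandermonde matrices apply to every block.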
Now we use (26) to aggregate the bounds on for each square matrix and obtain
Since and since by assumption , we have that and so