1 Introduction
1.1 Problem definition
Consider the matrix
(1) 
where
is a vector of distinct nodes
with , and is the normalized bandwidth. The scaling of the smallest eigenvalue
(footnote 1: it is well-known that is positive-definite, for instance because is a positive-definite function) is of interest in applied harmonic analysis, and in particular in the theory of superresolution, where this quantity controls the worst-case stability of recovering an atomic measure from bandlimited data (see sub:discussion below). Since we see that is the limit as of the matrix
where is the Dirichlet (periodic sinc) kernel
(2) 
For each , let
(3) 
be the rectangular Vandermonde matrix with complex nodes where . Clearly , and so .
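The relation between the kernel matrix and the rectangular Vandermonde matrix can be checked numerically. The sketch below is only illustrative: the displayed formulas did not survive in the text above, so it assumes the sinc kernel sin(Ωt)/(Ωt), Vandermonde rows indexed by k = -N, ..., N with entries exp(i k x_j Ω / N), and the normalization 1/(2N+1). The node values and the names `sinc_kernel_matrix` and `vandermonde` are hypothetical.

```python
import numpy as np

def sinc_kernel_matrix(x, omega):
    """Matrix with entries sin(omega*(x_i - x_j)) / (omega*(x_i - x_j))."""
    d = omega * (x[:, None] - x[None, :])
    # np.sinc(t) is sin(pi*t)/(pi*t), so the argument is rescaled by 1/pi.
    return np.sinc(d / np.pi)

def vandermonde(x, omega, N):
    """(2N+1) x s matrix, rows k = -N..N, entries exp(1j*k*x_j*omega/N)."""
    k = np.arange(-N, N + 1)[:, None]
    return np.exp(1j * k * x[None, :] * omega / N)

x = np.array([-1.0, -0.2, 0.5, 1.3])    # hypothetical distinct nodes
omega, N = 3.0, 2000

V = vandermonde(x, omega, N)
G_N = (V.conj().T @ V) / (2 * N + 1)     # assumed normalization
G = sinc_kernel_matrix(x, omega)

# Entrywise gap between the finite-N matrix and its assumed limit.
gap = np.max(np.abs(G_N - G))
# Smallest eigenvalue of G_N versus the squared smallest singular value of V.
lam_min = np.linalg.eigvalsh(G_N)[0]
sig_min = np.linalg.svd(V, compute_uv=False)[-1]
print(gap, lam_min, sig_min**2 / (2 * N + 1))
```

Under these assumptions the gap shrinks roughly like 1/N, and the last two printed numbers agree up to rounding, which is the eigenvalue/singular-value identity used in the text.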
The question of lower bounds for (or, equivalently, ) has received much attention in the literature; see e.g. [3, 7, 19, 20, 15, 18, 5, 11].
For , we denote
where is the principal value of the argument of , taking values in .
Given as above, we define the minimal separation (in the wraparound sense) as
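A minimal sketch of the wrap-around distance and the minimal separation, assuming the distance of t is the absolute value of the principal argument of exp(it), as the sentence above describes. The function names `wrap_dist` and `min_separation` are hypothetical.

```python
import numpy as np

def wrap_dist(t):
    """|Arg(exp(1j*t))|: distance from t to the nearest multiple of 2*pi."""
    return np.abs(np.angle(np.exp(1j * np.asarray(t, dtype=float))))

def min_separation(x):
    """Smallest wrap-around distance over all pairs of distinct nodes."""
    x = np.asarray(x, dtype=float)
    d = wrap_dist(x[:, None] - x[None, :])
    upper = np.triu_indices(len(x), k=1)   # each unordered pair once
    return d[upper].min()

print(wrap_dist(2 * np.pi - 0.1))
print(min_separation([0.0, 0.1, 3.0]))
print(min_separation([-3.1, 3.1]))         # close in the wrap-around sense
```

The last call shows why the wrap-around sense matters: the two nodes differ by 6.2 on the line but are much closer on the circle.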
It is well-known that there are two very different scaling regimes for , depending on the quantity which is frequently called the “superresolution factor” (see sub:discussion below)
If and is fixed, the matrix is well-conditioned, and in fact it can be shown that in this case
(4) 
The case is somewhat more relevant to superresolution applications; however, all known results provide sharp bounds only in the particular case when all the nodes are clustered together, or approximately equispaced. In this setting we have the fast decay
(5) 
1.2 Main results
It turns out that the bound (5) is too pessimistic if only some of the nodes are known to be clustered. Consider for instance the configuration ; then, as can be seen in fig:sigma.min.first.simulation, we have in fact , decaying much more slowly than , which would be the bound given by (5).
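The difference between a fully clustered and a partially clustered configuration can be observed in a small experiment. The sketch below is an illustration under stated assumptions, not the simulation behind the figure: it assumes the sinc kernel matrix, uses two hypothetical three-node configurations (all three nodes spaced Δ apart, versus two nodes Δ apart plus one distant node), and fits the log-log slope of the smallest eigenvalue against ΔΩ. The fitted slopes are empirical outputs of this sketch only.

```python
import numpy as np

def lambda_min(x, omega):
    """Smallest eigenvalue of the (assumed) sinc kernel matrix."""
    d = omega * (x[:, None] - x[None, :])
    return np.linalg.eigvalsh(np.sinc(d / np.pi))[0]

def fitted_slope(make_nodes, omega, eps_values):
    """Least-squares slope of log(lambda_min) against log(delta*omega)."""
    lams = [lambda_min(make_nodes(eps / omega), omega) for eps in eps_values]
    return np.polyfit(np.log(eps_values), np.log(lams), 1)[0]

omega = 20.0
eps_values = np.geomspace(0.02, 0.2, 12)   # values of delta*omega

# Hypothetical configurations; neither is taken from the text.
full = lambda delta: np.array([0.0, delta, 2 * delta])   # one cluster of 3
partial = lambda delta: np.array([0.0, delta, 1.5])      # cluster of 2 + far node

slope_full = fitted_slope(full, omega, eps_values)
slope_partial = fitted_slope(partial, omega, eps_values)
print(slope_full, slope_partial)
```

In this experiment the partially clustered configuration shows a visibly smaller decay exponent than the fully clustered one, which is the qualitative gap the paragraph above describes.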
In this paper we bridge this theoretical gap. We consider the partially clustered regime where at most neighboring nodes can form a cluster (there can be several such clusters), with two additional parameters controlling the distance between the clusters and the uniformity of the distribution of nodes within the clusters.
Definition 1.1.
The node vector is said to form a clustered configuration for some , , and , if for each , there exist at most distinct nodes
such that the following conditions are satisfied:

For any , we have

For any , we have
Our main result is the following generalization of (5) for clustered configurations.
Theorem 1.1.
There exists a constant such that for any , any forming a clustered configuration, and any satisfying
(6) 
we have
(7)  
(8) 
The proof of thm:maintheorem is presented in sub:theproof below. It is based on the “decimation” technique, previously used in the context of superresolution in [1, 2, 4, 5, 6] and references therein.
Remark 1.1.
The same node vector can be regarded as a clustered configuration with different choices of the parameters . For example, the vector from the beginning of this section (and also fig:sigma.min.first.simulation) is both clustered and clustered, with any . To obtain as tight a bound as possible, one should choose the minimal such that the condition (6) is satisfied for within the range of interest. For instance, might be too small if is small enough; however, by choosing one is able to increase without bound. See fig:breakdown for a numerical example.
Remark 1.2.
The constant is given explicitly in (30), and it decays in like . We do not know whether this rate can be substantially improved, however it is plausible that the best possible bound would scale like for some absolute constant .
For the case of finite , one might be interested in considering the rectangular Vandermonde matrix without any reference to , i.e.
(9) 
for some node vector . Our next result is the analogue of (7) in this setting, albeit under an extra assumption that the nodes are restricted to the interval .
Corollary 1.1.
There exists a constant such that for any , any forming a clustered configuration, and any satisfying
(10) 
we have
(11) 
Proof.
Let us choose so that for all we have
Further define , and . We immediately obtain that the vector forms a clustered configuration according to def:partialcluster, and the rectangular Vandermonde matrix in (9) is precisely . Clearly, , and also
(12) 
Using (10), we obtain precisely the conditions (6) with in place of respectively. Therefore the conditions of thm:maintheorem are satisfied for , and so (11) follows immediately from (12) and (7), with . ∎
Returning to thm:maintheorem, it turns out that the bound (8) is asymptotically optimal.
Theorem 1.2.
There exists an absolute constant and a constant such that for any and any satisfying , there exists a clustered configuration with nodes and certain depending only on , for which
The proof of thm:optimality is presented in sub:optimality. Numerical experiments validating the above results are presented in sec:Numericalevidence.
1.3 Related work and discussion
Our main result has direct implications for the problem of superresolution under sparsity constraints. For simplicity suppose that the nodes must belong to the grid of step size . As demonstrated in [11, 18] and several other works, the minimax error rate for recovery of sparse point measures from the bandlimited and inexact measurements is directly proportional to where is any vector of length . Moreover, it is established in those works that without any further constraints on the support of , the bound (5) holds and it is the best possible.
It is fairly straightforward to extend the results of [18] and [11] to our setting: if the support of is known to be partially clustered (as in def:partialcluster), then the minimax error rate will satisfy
(13) 
for any estimator
and the norm , and it will be attained by the intractable sparse minimization, with the additional restriction that the solutions should exhibit the appropriate clustered sparsity pattern instead of the unconstrained sparsity.
A different but closely related setting was considered in the seminal paper [12], where the measure was assumed to have an infinite number of spikes on a grid of size , with one spike per unit of time on average, but whose local complexity was constrained to have no more than spikes per any interval of length . is called the “Rayleigh index”, being the maximal number of spikes which can be clustered together (a related notion of Rayleigh regularity was introduced in [23]). It was shown in [12] that the minimax recovery rate for such measures essentially scales like (13) where is replaced with (the work [12] had a small gap in the exponents between the lower and upper bounds, which was later closed in [11] for the finite sparse case). Our partial cluster model can therefore be regarded as the finite-dimensional version of these “sparsely clumped” measures with finite Rayleigh index, showing the same scaling of the error: polynomial in and exponential in the “local complexity” of the signal.
If the grid assumption is relaxed, then one might wish to measure the accuracy of recovery by comparing the locations of the recovered signal with the true ones . In this case, there are additional considerations which are required to derive the minimax rate, and it is possible to do so under the partial clustering assumptions. See [2, 6] for details, where we prove (13) in this scenario, for a uniform bound on the noise . The extreme case has been treated recently in [4, 5].
In the case of well-separated spikes (i.e. clusters of size ), a recent line of work using minimization ([9, 8, 13, 10] and the great number of follow-up papers) has shown that the problem is stable and tractable.
Therefore, the partial clustering case is somewhat midway between the extremes and , and while our results in this paper (and also in [6]) show that it is much more stable than in the unconstrained sparse case, it is an intriguing open question whether provably tractable solution algorithms exist.
Several candidate algorithms for sparse superresolution are well-known: MUSIC, ESPRIT/matrix pencil, and variants; these have roots in parametric spectral estimation [27]. In recent years, the superresolution properties of these algorithms have been a subject of ongoing interest, see e.g. [14, 19, 25] and references therein. Smallest singular values of the partial Fourier matrices , for finite , play a major role in these works, and therefore we hope that our results and techniques may be extended to analyze these algorithms as well.
2 Known bounds
2.1 Well-separated regime
Consider the well-separated case , and let be as defined in (3), i.e. a rectangular Vandermonde matrix with nodes on the unit circle with , so that .
Several more or less equivalent bounds on are available in this case, using various results from analysis and number theory such as Ingham and Hilbert inequalities, large sieve inequalities and Selberg’s majorants [17, 20, 24, 3, 21, 22, 15, 7].
The tightest bound was obtained by Moitra in [20], where he showed that if then
2.2 Single clustered regime
Let us now assume , i.e. or, equivalently, .
If all the nodes are equispaced, say , then the matrix is the so-called prolate matrix, whose spectral properties are known exactly [28, 26]. Indeed, we have in this case
and therefore where is the matrix defined in [26, eq. (21)]. The smallest eigenvalue of , denoted by in the same paper, has the exact asymptotics for small, given in [26, eqs. (64,65)]:
(14) 
which gives
proving (5).
The same scaling was shown using Szegő’s theory of Toeplitz forms in [11]; see also sub:discussion. The authors showed that there exist and such that for
Essentially the same result was obtained in [18], where the authors considered partial discrete Fourier matrices
obtained from the unnormalized Discrete Fourier Transform matrix of size
by taking the first rows and an arbitrary set of columns, with and . The authors showed that as with the ratio fixed, we have the bound
which is attained for the configuration of consecutive columns. In our equispaced setting, it is easy to see that the matrix for large is precisely with and . Therefore the above result reduces to
which is the same as (5).
3 Proofs
3.1 Blowup
Here we introduce the uniform blowup of a node vector by a positive parameter , and study the effect of such a blowup mapping on the minimal wraparound distance between the mapped nodes.
Lemma 3.1.
Let form a cluster, and suppose that . Then, for any there exists a set of total measure such that for every the following holds for every :
(15)  
(16) 
Furthermore, the set is a union of at most intervals.
Proof.
We begin with (15). Let , then and since we immediately conclude that
To show (16), let be the uniform probability measure on . Let and be fixed and put . For , let be the random variable on , defined by
We now show that for any
(17) 
Since , we can write where is an integer and . We break up the probability (17) as follows:
(18) 
Now, consider the number . As varies between and , the number traverses the unit circle exactly once, and therefore the variable traverses the interval exactly twice. Consequently,
It is clear from the above that is a union of intervals, each of length , repeating with the period of . Consequently the set is a union of at most intervals. Since we have , and so the set is a union of at most intervals.
Now we put and apply (17) for every pair where and . By the union bound, we obtain
(19) 
Fixing as the complement of the above set, , we have that is of total measure greater than or equal to , and for every the estimate (16) holds. Clearly is a union of at most intervals. ∎
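The counting step in the proof above can be illustrated numerically for a single pair of nodes. The sketch below is a simplified stand-in, not the lemma itself: the interval for the blow-up parameter, the threshold `alpha`, and the node difference `delta` are all hypothetical, since the actual quantities did not survive in the text. It relies only on the elementary fact that, over a whole number of periods, the wrap-around distance of λ·δ is at most α on a fraction α/π of the parameter range.

```python
import numpy as np

def wrap_dist(t):
    """|Arg(exp(1j*t))|: distance from t to the nearest multiple of 2*pi."""
    return np.abs(np.angle(np.exp(1j * t)))

delta = 1.0      # hypothetical difference x_i - x_j of two nodes
alpha = 0.3      # hypothetical threshold on the blown-up distance
periods = 5      # the parameter range covers exactly this many periods

# Uniform grid for the blow-up parameter over a whole number of periods.
lam = np.linspace(0.0, 2 * np.pi * periods / abs(delta), 200_000, endpoint=False)

# "Bad" parameters: the blown-up nodes land closer than alpha on the circle.
bad = wrap_dist(lam * delta) <= alpha
frac_bad = bad.mean()

# Number of maximal runs of bad grid points, i.e. intervals in the bad set.
n_runs = int(bad[0]) + int(np.count_nonzero(~bad[:-1] & bad[1:]))
print(frac_bad, alpha / np.pi, n_runs)
```

Applying the same count to every relevant pair and summing the bad fractions is the union-bound step used in the proof; the complement of the bad set then plays the role of the set of admissible blow-up parameters.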
Fix and consider the set given by lem:blowuplemma. Let us also fix a finite and positive integer , and consider the set of equispaced points in :
Proposition 3.1.
If , then .
Proof.
By lem:blowuplemma, the set consists of intervals, and by (19) the total length of is at most . Denote the lengths of those intervals by . The distance between neighboring points in is , and therefore each interval contains at most points. Overall, the interval contains at most
points from , and since the total number of points in is at least , we have
∎
3.2 Square Vandermonde matrices
Let be a vector of pairwise distinct complex numbers. Consider the square Vandermonde matrix
(20) 
Proposition 3.2.
Suppose that is a vector of pairwise distinct complex numbers with , , and let be arbitrary. Let
(22) 
For , denote by the angular distance between and :
Then
(23) 
3.3 Proof of thm:maintheorem
We shall bound defined as in (3) for sufficiently large . For any subset let , be the submatrix of containing only the rows in . By the Rayleigh characterization of singular values, it is immediately obvious that if is any partition of the rows of then
(26) 
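The row-partition inequality can be sanity-checked numerically. The sketch below assumes the inequality has the standard form: the squared smallest singular value of the full matrix is at least the sum of the squared smallest singular values over disjoint row blocks (rows left out only help). The nodes, bandwidth, and decimation step `m` are hypothetical; in the proof the step comes from the blow-up lemma, whereas here it is chosen by hand.

```python
import numpy as np

def sigma_min(A):
    """Smallest singular value of a matrix with at least as many rows as columns."""
    return np.linalg.svd(A, compute_uv=False)[-1]

x = np.array([0.0, 0.01, 0.02, 1.2])   # hypothetical: one cluster plus a far node
omega, N = 10.0, 40
k = np.arange(-N, N + 1)
V = np.exp(1j * k[:, None] * x[None, :] * omega / N)   # (2N+1) x s

s = len(x)
m = 20   # hypothetical decimation step
# Block n takes rows n, n+m, ..., n+(s-1)m, so each block is s x s.
blocks = [n + m * np.arange(s) for n in range(m)]
assert max(b.max() for b in blocks) <= 2 * N           # blocks fit inside V

lhs = sigma_min(V) ** 2
rhs = sum(sigma_min(V[b, :]) ** 2 for b in blocks)
print(lhs, rhs)
```

Each block here is a square Vandermonde-type matrix whose nodes are the original ones raised to the power `m`, which is why controlling the separation of the blown-up nodes is enough to bound every block.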
Let be the set from lem:blowuplemma for . By prop:finitenblowup we have that for all , will contain a rational multiple of of the form for some .
Consider the “new” nodes
(27) 
Since , we conclude by lem:blowuplemma that for every
(28)  
(29) 
Since it follows that . Now consider the particular interleaving partition of the rows by blocks of rows each, separated by rows between them (some rows might be left out):
For , each is a square Vandermondetype matrix as in (22),
with node vector
where are given by (27). We apply prop:vandsingestimate with the crude bound obtained from (28) and (29) above:
and obtain
Now we use (26) to aggregate the bounds on for each square matrix and obtain
Since and since by assumption , we have that and so