1. Introduction
In their seminal work on sparse recovery [5], Candès and Tao were led to the notion of the restricted isometry property (RIP). A matrix $A \in \mathbb{R}^{m \times n}$ has the restricted isometry property of order $k$ with constant $\delta \in (0,1)$ if for all $k$-sparse vectors $x$ (i.e. vectors with at most $k$ nonzero entries) we have
$$(1 - \delta)\|x\|_2^2 \le \|Ax\|_2^2 \le (1 + \delta)\|x\|_2^2.$$
The significance of this property is that it guarantees that one can recover an approximately sparse vector $x$ from the measurements $Ax$ via a convex program [5]. Specifically, they showed that if a matrix $A$ satisfies the RIP, then the minimizer
$$x^{\#} = \operatorname*{argmin}_{z \,:\, Az = Ax} \|z\|_1$$
satisfies
$$\|x^{\#} - x\|_2 \le \frac{C}{\sqrt{k}}\,\|x - x_k\|_1,$$
where $x_k$ is the best $k$-sparse approximation of $x$; in particular, when $x$ is exactly $k$-sparse, it can be efficiently recovered from $Ax$ without any error.
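As an illustration, the convex program above (basis pursuit) can be run numerically. The sketch below uses SciPy's linear-programming interface; the dimensions, sparsity level, and gaussian measurement matrix are arbitrary choices for the example, not values from the paper.

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n, k = 40, 80, 4            # illustrative sizes, not from the paper

# Gaussian measurement matrix and an exactly k-sparse signal.
A = rng.standard_normal((m, n)) / np.sqrt(m)
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
b = A @ x

# Basis pursuit: min ||z||_1 subject to Az = b, written as an LP
# with z = u - w and u, w >= 0.
c = np.ones(2 * n)
A_eq = np.hstack([A, -A])
res = linprog(c, A_eq=A_eq, b_eq=b, bounds=[(0, None)] * (2 * n))
z = res.x[:n] - res.x[n:]

print(np.linalg.norm(z - x))   # small: exact recovery up to solver tolerance
```

With $m$ comfortably larger than $k \log(n/k)$, the $\ell_1$ minimizer coincides with the sparse signal up to solver tolerance.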
In applications, the number of rows $m$ is the number of measurements needed to recover a sparse signal. Therefore, it is of interest to understand the minimal number of rows needed in a matrix with the RIP property.
It is known that for a properly normalized $m \times n$ matrix with independent gaussian entries,
$$m = O\big(k \log(n/k)\big)$$
suffices to generate a RIP matrix with high probability (e.g. [8]). Yet, it is often beneficial to have more structure in the matrix [13]. For example, if the matrix is a submatrix of the discrete Fourier transform matrix, then the fast Fourier transform algorithm allows fast matrix–vector multiplication, speeding up the run time of the recovery algorithm
[8, Chapter 12]. Additionally, generating a random submatrix requires fewer random bits and less storage space. The first bound on the number of subsampled rows from a Fourier matrix necessary for recovery appeared in the groundbreaking work [5]. They showed that if one randomly subsamples rows so that the expected number of rows is $O(k \log^6 n)$, then concatenating these rows forms a RIP matrix with high probability, after appropriate normalization. Rudelson and Vershynin later improved this bound to $O(k \log^2 k \log(k \log n) \log n)$ via a gaussian process argument involving chaining techniques [14]. Their proof was then streamlined and their probability bounds strengthened [7, 13]. Cheraghchi, Guruswami, and Velingker then proved a bound of $O(k \log^3 n)$ [6], and Bourgain established the bound $O(k \log k \log^2 n)$ [4]. The sharpest result in this direction is due to Haviv and Regev, who showed the upper bound $O(k \log^2 k \log n)$ through a delicate application of the probabilistic method [10]. It is widely conjectured that $O(k \log n)$ rows suffice for the discrete Fourier transform [14].
It turns out that all proofs in this line of work, including the strongest known upper bound [10], apply in a more general setting where the random matrix is constructed by subsampling rows of any bounded orthonormal matrix, that is, an orthonormal matrix with all entries bounded in magnitude by $K/\sqrt{n}$ for some constant $K$. The matrix of the discrete Fourier transform satisfies this property with $K = 1$. This paper addresses the problem of determining a necessary number of samples for reconstruction. Our contribution is that, surprisingly, for general bounded orthonormal matrices, and for a certain range of $k$, $\Omega(k \log k \log(n/k))$ samples are needed. In particular, only a gap of $O(\log k)$ remains between our bound and the best known upper bound. We improve the previous best lower bound due to Bandeira, Lewis, and Mixon [3], which applied to the DFT matrix. Their results in turn improved upon more general lower bounds on the number of rows of any matrix that satisfies the RIP property [2, 9, 11, 12].
In the proof we consider an example of a bounded orthonormal matrix, the Hadamard matrix (i.e. the matrix of the Fourier transform on the additive group $\mathbb{F}_2^n$), and we show that for this specific matrix at least
$$\Omega\big(k \log k \log(N/k)\big)$$
samples are required, where $N = 2^n$. More concretely, by a second moment argument, we demonstrate that for fewer than this many
subsampled rows, with high probability there exists a $k$-sparse vector in the kernel, ruling out both the RIP property and, in general, any hope for a sparse recovery algorithm with those matrices. The same proof can be applied more generally to show that for any prime $p$ one needs to subsample at least as many rows of the matrix corresponding to the Fourier transform on the additive group $\mathbb{Z}_p^n$; for the sake of simplicity of the argument we do not elaborate on this.

Acknowledgments
J.B. has been supported by the Siebel Foundation. P.L. has been partially supported by the NSF Graduate Research Fellowship Program under grant DGE-1144152. K.L. has been partially supported by NSF postdoctoral fellowship DMS-1702533. S.R. has been supported by the Simons Collaboration on Algorithms and Geometry.
2. Preliminaries
Throughout this note, we use $\log$ to denote the base $2$ logarithm. For an integer $n$, we set $N = 2^n$ and fix a bijection between $\{1, \dots, N\}$ and $\mathbb{F}_2^n$; this identification remains in force for the rest of the paper.
We say a function $\chi \colon \mathbb{F}_2^n \to \{-1, 1\}$ is a character if it is a group homomorphism. To an element $t \in \mathbb{F}_2^n$, we associate the character
$$\chi_t(x) = (-1)^{\langle t, x \rangle}$$
for all $x \in \mathbb{F}_2^n$. The Fourier transform of a function $f \colon \mathbb{F}_2^n \to \mathbb{R}$ is defined to be
$$\hat{f}(t) = \frac{1}{\sqrt{N}} \sum_{x \in \mathbb{F}_2^n} f(x) \chi_t(x)$$
for all $t \in \mathbb{F}_2^n$. Let $H$ be the $N \times N$ matrix representing the Fourier transform on the group $\mathbb{F}_2^n$. In other words,
$$H_{t,x} = \frac{1}{\sqrt{N}}(-1)^{\langle t, x \rangle}.$$
When normalized to have $\pm 1$ entries, the matrix $\sqrt{N} H$ is also known as a Hadamard matrix. We refer the reader to [15] for a thorough discussion of Fourier analysis on finite groups.
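As a quick numerical sanity check (separate from the argument), one can build this matrix directly from the characters. The indexing of $\{0, \dots, N-1\}$ by binary strings below is one arbitrary choice of the bijection mentioned above.

```python
import itertools
import numpy as np

n = 3
N = 2 ** n
# Identify {0, ..., N-1} with F_2^n via binary digit strings.
vecs = list(itertools.product([0, 1], repeat=n))

# H[t, x] = (-1)^{<t, x>} / sqrt(N): the normalized Fourier (Hadamard) matrix.
H = np.array([[(-1) ** (sum(ti * xi for ti, xi in zip(t, x)) % 2)
               for x in vecs] for t in vecs]) / np.sqrt(N)

# Orthonormality (H is symmetric here, so H H^T = I).
assert np.allclose(H @ H.T, np.eye(N))
# Every entry has magnitude 1/sqrt(N): a bounded orthonormal matrix with K = 1.
assert np.allclose(np.abs(H), 1 / np.sqrt(N))
```

The two assertions confirm orthonormality and the bounded-entry property that the introduction attributes to Fourier-type matrices.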
The Grassmannian $\mathrm{Gr}(d)$ is defined as the collection of vector subspaces of $\mathbb{F}_2^n$ of dimension $d$. Our proof uses the following well-known result about the Fourier transform.
Lemma 2.1.
For a subspace $V \subset \mathbb{F}_2^n$, we let $u_V$ be the vector corresponding to the indicator function for $V$ with the normalization $\|u_V\|_2 = 1$. Then
$$H u_V = u_{V^\perp},$$
where $V^\perp = \{t \in \mathbb{F}_2^n : \langle t, x \rangle = 0 \text{ for all } x \in V\}$ is the orthogonal complement of $V$.
In this way, $V \mapsto V^\perp$ implements a bijection between $\mathrm{Gr}(d)$ and $\mathrm{Gr}(n-d)$. We also make use of the following bounds on the size of $\mathrm{Gr}(d)$.
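Lemma 2.1 can be verified numerically on a small example; the subspace $V$ below, and the normalization of the transform, are illustrative choices of ours.

```python
import itertools
import numpy as np

n = 4
N = 2 ** n
vecs = [np.array(v) for v in itertools.product([0, 1], repeat=n)]
H = np.array([[(-1) ** (int(t @ x) % 2) for x in vecs] for t in vecs]) / np.sqrt(N)

def span(gens):
    """All F_2-linear combinations of the given generators."""
    out = set()
    for coeffs in itertools.product([0, 1], repeat=len(gens)):
        v = np.zeros(n, dtype=int)
        for c, g in zip(coeffs, gens):
            v = (v + c * g) % 2
        out.add(tuple(v))
    return out

# A 2-dimensional subspace V and its orthogonal complement V-perp.
V = span([np.array([1, 0, 1, 0]), np.array([0, 1, 0, 0])])
Vperp = {tuple(t) for t in vecs
         if all(int(t @ np.array(x)) % 2 == 0 for x in V)}

def unit_indicator(S):
    """Indicator vector of S, normalized to unit Euclidean norm."""
    u = np.array([1.0 if tuple(x) in S else 0.0 for x in vecs])
    return u / np.linalg.norm(u)

# Lemma 2.1: the Fourier transform maps the normalized indicator of V
# to the normalized indicator of V-perp.
assert np.allclose(H @ unit_indicator(V), unit_indicator(Vperp))
```

The nonzero entries of $u_V$ number $|V| = 2^d$, which is why these vectors serve as the sparse test vectors in Section 3.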
Lemma 2.2.
The size of $\mathrm{Gr}(d)$ is bounded by
$$2^{d(n-d)} \le |\mathrm{Gr}(d)| \le 2^{d(n-d)+d}. \qquad (2.1)$$
Proof.
A standard counting argument gives the explicit formula
$$|\mathrm{Gr}(d)| = \prod_{i=0}^{d-1} \frac{2^{n-i} - 1}{2^{d-i} - 1}. \qquad (2.2)$$
Using the inequalities
$$2^{n-d} \le \frac{2^{n-i} - 1}{2^{d-i} - 1} \le 2^{n-d+1} \qquad (2.3)$$
on each factor individually gives the result. ∎
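A short script can confirm both the product formula and the two-sided bound for small parameters, assuming the bounds take the standard form $2^{d(n-d)} \le |\mathrm{Gr}(d)| \le 2^{d(n-d)+d}$.

```python
from math import prod

def grassmannian_size(n, d):
    """Number of d-dimensional subspaces of F_2^n (the Gaussian binomial)."""
    num = prod(2 ** (n - i) - 1 for i in range(d))
    den = prod(2 ** (d - i) - 1 for i in range(d))
    return num // den

# Sanity checks: 7 lines in F_2^3, 35 planes in F_2^4.
assert grassmannian_size(3, 1) == 7
assert grassmannian_size(4, 2) == 35

# Check 2^{d(n-d)} <= |Gr(d)| <= 2^{d(n-d)+d} exhaustively for small n.
for n in range(1, 10):
    for d in range(n + 1):
        g = grassmannian_size(n, d)
        assert 2 ** (d * (n - d)) <= g <= 2 ** (d * (n - d) + d)
```

The exact division is safe because the Gaussian binomial is always an integer; computing numerator and denominator separately avoids intermediate non-integer quotients.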
We also make use of the following trivial counting lemma.
Lemma 2.3.
For $V, W \in \mathrm{Gr}(d)$,
$$\dim(V^\perp \cap W^\perp) = n - 2d + \dim(V \cap W).$$
3. Main Result
For a subset $\Omega \subset \mathbb{F}_2^n$, we let $H_{(\Omega)}$ denote the matrix generated from the rows of $H$ indexed by $\Omega$. Let $\{\delta_t\}_{t \in \mathbb{F}_2^n}$
be a set of independent Bernoulli random variables which take the value $1$
with probability $p$. The random variables $\delta_t$ will indicate which rows to include in our measurement matrix, $H_{(\Omega)}$, meaning
$$\Omega = \{t \in \mathbb{F}_2^n : \delta_t = 1\}.$$
Note that $\Omega$ has average cardinality $pN$, and standard concentration arguments can be used to obtain sharp bounds on its size. We say that a vector is $k$-sparse if it has at most $k$ nonzero entries. The following theorem is our main technical result.
Theorem 3.1.
For $k = 2^d$, where $d \ge C \log n$ for a sufficiently large constant $C$ and $d \le n/2$, there exists a positive constant $c$ such that for $p \le c\, d(n-d)\, 2^{-(n-d)}$, there exists a $k$-sparse vector in the kernel of $H_{(\Omega)}$ with probability $1 - o(1)$.^1

^1 Here $o(1)$ indicates a quantity that tends to zero as $N \to \infty$. All asymptotic notation is applied under the assumption that $N \to \infty$.
Proof.
We define $q = 1 - p$ for future convenience, and note that $q^{2^{n-d}} \ge 2^{-2c\,d(n-d)}$ for $c$ small enough.
We restrict our attention to the $k$-sparse vectors that correspond to $u_V$ for $V \in \mathrm{Gr}(d)$, the indicator functions of subspaces of dimension $d$. For such $V$, set $X_V$ to be the indicator function for the event that $\Omega \cap V^\perp = \emptyset$. Define
$$X = \sum_{V \in \mathrm{Gr}(d)} X_V. \qquad (3.1)$$
Observe that by Lemma 2.1, if $X$ is nonzero then there exists a $k$-sparse vector in the kernel of $H_{(\Omega)}$. We proceed by the second moment method to show that $X$ is nonzero with high probability. By the second moment method (e.g. [1]),
$$\mathbb{P}(X > 0) \ge \frac{(\mathbb{E}[X])^2}{\mathbb{E}[X^2]}. \qquad (3.2)$$
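The second moment inequality can be checked exactly on a toy probability space; the two overlapping "survival" events below, driven by three independent Bernoulli selectors, are our own illustrative stand-ins for the survival events in the proof.

```python
import itertools
from math import prod

# Three independent Bernoulli(p) selectors; enumerate the whole sample space.
p = 0.3
outcomes = list(itertools.product([0, 1], repeat=3))

def pr(omega):
    """Probability of a particular on/off pattern of the selectors."""
    return prod(p if b else 1 - p for b in omega)

def X(omega):
    # X counts surviving sets: set 1 needs selectors 0,1 off; set 2 needs 1,2 off.
    return int(omega[0] == 0 and omega[1] == 0) + int(omega[1] == 0 and omega[2] == 0)

EX = sum(pr(w) * X(w) for w in outcomes)
EX2 = sum(pr(w) * X(w) ** 2 for w in outcomes)
P_pos = sum(pr(w) for w in outcomes if X(w) > 0)

# Second moment method: P(X > 0) >= (E X)^2 / E[X^2].
assert EX ** 2 / EX2 <= P_pos + 1e-12
print(round(EX ** 2 / EX2, 4), round(P_pos, 4))
```

Here the bound is not tight (roughly $0.58$ against a true probability of $0.637$), but it suffices whenever the right-hand side tends to $1$, which is how it is used below.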
We can easily obtain an expression for the first moment:
$$\mathbb{E}[X] = |\mathrm{Gr}(d)|\,(1-p)^{2^{n-d}}.$$
The second moment requires a more delicate calculation. We partition the sum into pairs of orthogonal complements with the same dimension of intersection. By Lemma 2.3, and letting $j$ denote $\dim(V \cap W)$, we have
$$\mathbb{E}[X^2] = \sum_{j=0}^{d} \sum_{\substack{V, W \in \mathrm{Gr}(d) \\ \dim(V \cap W) = j}} \mathbb{E}[X_V X_W]. \qquad (3.3)$$
We can explicitly compute each term in the sum above as follows.
Claim 3.2.
For $V, W \in \mathrm{Gr}(d)$ such that $\dim(V \cap W) = j$, we have
$$\mathbb{E}[X_V X_W] = (1-p)^{2^{n-d+1} - 2^{n-2d+j}}.$$
Proof.
Observe that $X_V X_W$ is the indicator of the event $\Omega \cap (V^\perp \cup W^\perp) = \emptyset$. By Lemma 2.3, $\dim(V^\perp \cap W^\perp) = n - 2d + j$, so by inclusion–exclusion
$$|V^\perp \cup W^\perp| = 2 \cdot 2^{n-d} - 2^{n-2d+j} = 2^{n-d+1} - 2^{n-2d+j},$$
and therefore $\mathbb{E}[X_V X_W] = (1-p)^{|V^\perp \cup W^\perp|}$, which is the claimed quantity. ∎
We plug this expression back into the sum (3.3), in order to arrive at
$$\mathbb{E}[X^2] = \sum_{j=0}^{d} \sum_{\substack{V, W \in \mathrm{Gr}(d) \\ \dim(V \cap W) = j}} (1-p)^{2^{n-d+1} - 2^{n-2d+j}}.$$
Let us use $N_j$ to denote the number of pairs $(V, W) \in \mathrm{Gr}(d)^2$ such that $\dim(V \cap W) = j$. With this notation, the entire sum simplifies to
$$\mathbb{E}[X^2] = \sum_{j=0}^{d} N_j\,(1-p)^{2^{n-d+1} - 2^{n-2d+j}}.$$
We will split this sum into two parts and bound them separately:
$$\mathbb{E}[X^2] = \underbrace{N_0\,(1-p)^{2^{n-d+1} - 2^{n-2d}}}_{S_1} + \underbrace{\sum_{j=1}^{d} N_j\,(1-p)^{2^{n-d+1} - 2^{n-2d+j}}}_{S_2}.$$
The first part of the summation is easy to control: for $j = 0$ we have $N_0 \le |\mathrm{Gr}(d)|^2$, which implies $S_1 \le (\mathbb{E}[X])^2 (1-p)^{-2^{n-2d}}$, and
$$S_1 \le (1 + o(1))\,(\mathbb{E}[X])^2, \qquad (3.4)$$
since $p\,2^{n-2d} \le c\,d(n-d)\,2^{-d} = o(1)$.
We can now turn our attention to bounding the second part of the sum.
Claim 3.3.
For $1 \le j \le d$, we have
$$N_j \le 2^{2d - j(n - 2d + j)}\,|\mathrm{Gr}(d)|^2.$$
Proof.
First, we have the bound $N_j \le |\mathrm{Gr}(j)| \cdot G_{n-j}(d-j)^2$, where $G_{n-j}(d-j)$ denotes the number of $(d-j)$-dimensional subspaces of an $(n-j)$-dimensional space over $\mathbb{F}_2$. Indeed, to choose two subspaces $V, W$ of dimension $d$ with $\dim(V \cap W) = j$, we can first choose $U = V \cap W$ as a $j$-dimensional subspace of $\mathbb{F}_2^n$ (there are $|\mathrm{Gr}(j)|$ ways of doing this), and then we can consider the quotient space $\mathbb{F}_2^n / U$ and count the number of pairs of $(d-j)$-dimensional subspaces intersecting trivially there; the number of such choices is upper bounded by $G_{n-j}(d-j)^2$, the number of all pairs of $(d-j)$-dimensional subspaces of the quotient.
Applying Lemma 2.2 to $|\mathrm{Gr}(j)|$ and $G_{n-j}(d-j)$, we obtain
$$N_j \le 2^{j(n-j)+j} \cdot 2^{2(d-j)(n-d) + 2(d-j)} = 2^{-j^2 + j(2d - n - 1) + 2d(n-d+1)}.$$
The quadratic in the exponent is maximized for $j = (2d - n - 1)/2 < 0$, hence in the range $1 \le j \le d$, the maximum is attained exactly at $j = 1$ and the exponent decreases in $j$. This yields
$$N_j \le 2^{2d(n-d) + 2d - j(n - 2d + j) - j} \le 2^{2d(n-d)}\, 2^{2d - j(n - 2d + j)},$$
where the second inequality follows from the fact that $j \ge 1$.
On the other hand, using Lemma 2.2 again, we have $|\mathrm{Gr}(d)|^2 \ge 2^{2d(n-d)}$, and the statement of the claim follows by combining these two inequalities. ∎
Combining Claim 3.3 with the explicit expression for $\mathbb{E}[X_V X_W]$, one checks that for $c$ small enough the second part of the sum is $o\big((\mathbb{E}[X])^2\big)$. Together with (3.4), this gives $\mathbb{E}[X^2] \le (1 + o(1))(\mathbb{E}[X])^2$, so by (3.2), $\mathbb{P}(X > 0) = 1 - o(1)$, completing the proof of Theorem 3.1. ∎
We can now state our main result in terms of sparse recovery.
Theorem 3.4.
Let $k$, $N$, and $p$ be as in Theorem 3.1. For there to exist a method to recover every $s$-sparse vector $x$ from $H_{(\Omega)} x$, for any $s$ such that $s \ge k/2$, the expected number of rows of $H_{(\Omega)}$ must be $\Omega(k \log k \log(N/k))$. Further, for any constant $\delta < 1$, the expected number of rows of $H_{(\Omega)}$ must be $\Omega(k \log k \log(N/k))$ for $H_{(\Omega)}$ to have the RIP property of order $k$ with constant $\delta$.
Proof.
By Theorem 3.1, there exists a $k$-sparse vector $v$ in the kernel of $H_{(\Omega)}$ with high probability if the expected number of rows of $H_{(\Omega)}$ is $o(k \log k \log(N/k))$. Let us write $v = x - y$, where $x$ and $y$ are both $\lceil k/2 \rceil$-sparse vectors. Then $H_{(\Omega)} x = H_{(\Omega)} y$, which proves that $H_{(\Omega)}$ is not injective when restricted to the set of all $\lceil k/2 \rceil$-sparse vectors. The statement about the RIP property follows directly from the definition. ∎
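The splitting step in the proof can be made concrete on a small Hadamard matrix. The subspace, its complement, and the row set below are hand-picked for illustration, using SciPy's Sylvester ordering, where row and column indices are identified with $\mathbb{F}_2^4$ via binary digits.

```python
import numpy as np
from scipy.linalg import hadamard

N = 16
H = hadamard(N) / np.sqrt(N)   # H[t, x] = (-1)^{popcount(t & x)} / sqrt(N)

# V = {0,1,2,3} is a 2-dimensional subspace of F_2^4 (low two bits free),
# with orthogonal complement V-perp = {0,4,8,12}. If Omega misses V-perp
# entirely, the indicator of V lies in the kernel of the subsampled matrix.
Omega = [t for t in range(N) if t not in (0, 4, 8, 12)]
v = np.zeros(N)
v[[0, 1, 2, 3]] = 1.0
assert np.allclose(H[Omega] @ v, 0)

# Split v = x - y into two 2-sparse vectors with identical measurements.
x, y = np.zeros(N), np.zeros(N)
x[[0, 1]] = 1.0
y[[2, 3]] = -1.0
assert np.allclose(v, x - y)
assert np.allclose(H[Omega] @ x, H[Omega] @ y)  # recovery cannot tell them apart
```

No algorithm, convex or otherwise, can distinguish $x$ from $y$ given these measurements, which is exactly the obstruction the theorem quantifies.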
References
 [1] Noga Alon and Joel H. Spencer, The probabilistic method, fourth ed., Wiley Series in Discrete Mathematics and Optimization, John Wiley & Sons, Inc., Hoboken, NJ, 2016. MR 3524748
 [2] Khanh Do Ba, Piotr Indyk, Eric Price, and David P. Woodruff, Lower bounds for sparse recovery, Proceedings of the twenty-first annual ACM-SIAM symposium on Discrete Algorithms, SIAM, 2010, pp. 1190–1197.
 [3] Afonso S. Bandeira, Megan E. Lewis, and Dustin G. Mixon, Discrete uncertainty principles and sparse signal processing, Journal of Fourier Analysis and Applications (2017), 1–22.

 [4] Jean Bourgain, An improved estimate in the restricted isometry problem, Geometric aspects of functional analysis, Lecture Notes in Math., vol. 2116, Springer, Cham, 2014, pp. 65–70. MR 3364679
 [5] Emmanuel J. Candès and Terence Tao, Near-optimal signal recovery from random projections: Universal encoding strategies?, IEEE Transactions on Information Theory 52 (2006), no. 12, 5406–5425.
 [6] Mahdi Cheraghchi, Venkatesan Guruswami, and Ameya Velingker, Restricted isometry of Fourier matrices and list decodability of random linear codes, SIAM J. Comput. 42 (2013), no. 5, 1888–1914. MR 3108113
 [7] Sjoerd Dirksen, Tail bounds via generic chaining, Electronic Journal of Probability 20 (2015).
 [8] Simon Foucart and Holger Rauhut, A mathematical introduction to compressive sensing, Applied and Numerical Harmonic Analysis, Birkhäuser/Springer, New York, 2013. MR 3100033
 [9] Andrej Y. Garnaev and Efim D. Gluskin, On widths of the euclidean ball, Soviet Mathematics Doklady, vol. 277, 1984, pp. 1048–1052.
 [10] Ishay Haviv and Oded Regev, The restricted isometry property of subsampled Fourier matrices, Geometric Aspects of Functional Analysis, Springer, 2017, pp. 163–179.
 [11] Boris Sergeevich Kashin, Diameters of some finite-dimensional sets and classes of smooth functions, Izvestiya Rossiiskoi Akademii Nauk. Seriya Matematicheskaya 41 (1977), no. 2, 334–351.

 [12] Jelani Nelson and Huy L. Nguyen, Sparsity lower bounds for dimensionality reducing maps, Proceedings of the forty-fifth annual ACM symposium on Theory of Computing, ACM, 2013, pp. 101–110.
 [13] Holger Rauhut, Compressive sensing and structured random matrices, Theoretical foundations and numerical methods for sparse recovery 9 (2010), 1–92.
 [14] Mark Rudelson and Roman Vershynin, On sparse reconstruction from Fourier and Gaussian measurements, Communications on Pure and Applied Mathematics 61 (2008), no. 8, 1025–1045.
 [15] Audrey Terras, Fourier analysis on finite groups and applications, London Mathematical Society Student Texts, vol. 43, Cambridge University Press, Cambridge, 1999. MR 1695775