I Introduction
In estimation and recovery problems related to empirical second moments, e.g. covariance matching, one often observes a matrix which is a noisy linear combination of outer products $a_i a_i^T$ of random but known vectors $a_i$. The goal is then to estimate the unknown coefficients from this observed matrix. In the prototypical example this is an empirical covariance matrix. This problem appears in matrix and tensor recovery problems and many recent applications, see e.g.
[Romera:2016, Dasarathy:2015, Ma:2010, Duan:2016, Haghighatshoar:MVV:arxiv:18] and massive MIMO [Fengler:MassiveAccess:arxiv:2019]. In the case of sparse linear combinations this yields a compressed sensing [Candes2005, Donoho2006a] problem with a random but structured measurement matrix. There are several known properties which ensure robust and stable recovery guarantees for the vector of unknown but sparse (or compressible) coefficients. Among them is the restricted isometry property (RIP) of order $s$, which ensures that a measurement matrix $\Phi$ maps $s$-sparse vectors almost isometrically, i.e., there exists $\delta_s \in [0,1)$ such that
$(1-\delta_s)\|x\|_2^2 \;\le\; \|\Phi x\|_2^2 \;\le\; (1+\delta_s)\|x\|_2^2$   (1)
holds for all $s$-sparse vectors $x$. Several upper bounds on $\delta_s$ have been established to ensure stable and robust recovery for certain algorithms, see e.g. [candes:rip2008, Foucart2013]. For example, it is known that $\ell_1$-based convex recovery algorithms succeed if $\delta_{2s} < 1/\sqrt{2}$ [Cai2014]. For a random $m \times n$ matrix with iid subGaussian components it is known that this property holds with overwhelming probability for $m \gtrsim \delta_s^{-2}\, s \log(n/s)$, see for example [Foucart2013] and also [dirksen:ripgap] for a further discussion of its relation to, e.g., the nullspace property. When imposing additional structure on a random measurement matrix, often more measurements are required to ensure robust recovery guarantees. However, for many random ensembles it has been shown that it is still possible to achieve, up to log factors, a linear relation between the sparsity $s$ and the number of measurements $m$. In this work we show that the structure imposed by the problem above indeed also allows robust and stable recovery in a regime where the number of measurements $m = d^2$ scales linearly in the sparsity $s$, up to logarithmic factors. This addresses a conjecture raised in [Khanna:correction] for the non-centered KR product. Some of the essential proof steps have been sketched already in [Hag:isit18].
More precisely, let $A = (a_1, \dots, a_n)$ be a random $d \times n$ matrix with independent columns $a_i \in \mathbb{R}^d$, $i = 1, \dots, n$. The (columnwise) self Khatri-Rao product of the matrix $A$ is defined as
$A \odot A := (a_1 \otimes a_1, \dots, a_n \otimes a_n) \in \mathbb{R}^{d^2 \times n}$   (2)
where the $d \times d$ matrix $a_i a_i^T$ is the outer product of the vector $a_i$ with itself and $a_i \otimes a_i = \mathrm{vec}(a_i a_i^T) \in \mathbb{R}^{d^2}$.^{1}
^{1}$\mathrm{vec}$ is the vectorization operation identifying a $d \times d$ matrix with a $d^2$-dimensional vector.
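For concreteness, the following minimal NumPy sketch (illustrative only; the helper name `self_khatri_rao` is ours, not from the paper) builds the columnwise self KR product, where column $i$ equals $\mathrm{vec}(a_i a_i^T)$:

```python
import numpy as np

def self_khatri_rao(A):
    """Columnwise self Khatri-Rao product: column i is a_i (x) a_i = vec(a_i a_i^T)."""
    d, n = A.shape
    outer = np.einsum('ik,jk->ijk', A, A)  # all outer products a_i a_i^T, shape (d, d, n)
    return outer.reshape(d * d, n)         # vectorize each outer product, shape (d^2, n)

rng = np.random.default_rng(0)
d, n = 8, 5
A = rng.standard_normal((d, n))            # iid N(0,1) columns are isotropic: E[a a^T] = I_d
K = self_khatri_rao(A)
# sanity check against an explicit Kronecker product of the first column with itself
assert np.allclose(K[:, 0], np.kron(A[:, 0], A[:, 0]))
```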
We assume the columns to be normalized in expectation such that $\mathbb{E}\|a_i\|_2^2 = d$ and to be drawn from an isotropic distribution, i.e.,
$\mathbb{E}[a_i a_i^T] = I_d .$   (3)
First results^{2} on the RIP for self KR products have been established in [Khanna:KRRIP:arxivv3, Khanna:KRRIP], [Khanna:correction]. In this work it has been shown that for $d \gtrsim s$ (meaning that $m = d^2 \gtrsim s^2$) the $d^2$-dimensional KR product of a centered iid subGaussian matrix has RIP with high probability, see [Khanna:KRRIP:arxivv3, Theorem 3]. Thus, the number of measurements scales quadratically in the sparsity $s$. However, we will show below that the scaling is indeed linear when centering the KR product.
^{2}Note that the results have been corrected in v3 of the preprint.
We will use the work of [Ada2011] to prove a bound on the RIP constant of the centered and normalized KR product $\tilde{A}$:
$\tilde{A} := \frac{d}{\sqrt{V}}\,\bigl(A \odot A - \mathbb{E}[A \odot A]\bigr)$   (4)
It is easy to see that $\mathbb{E}[a_i \otimes a_i] = \mathrm{vec}(I_d)$. The factor $d/\sqrt{V}$ is a normalization to ensure that the columns $\tilde{a}_i$ of $\tilde{A}$ are still normalized after centering: $\mathbb{E}\|\tilde{a}_i\|_2^2 = d^2$. In general
$V := \mathbb{E}\,\bigl\|a_i \otimes a_i - \mathbb{E}[a_i \otimes a_i]\bigr\|_2^2 = \mathbb{E}\|a_i\|_2^4 - d .$   (5)
Note that for a vector $a \in \mathbb{R}^d$ one has
$\mathbb{E}\|a\|_2^4 = \sum_{j,k=1}^{d} \mathbb{E}\bigl[a_j^2 a_k^2\bigr] .$   (6)
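The identity (5) follows in one line from isotropy, since (3) implies $\mathbb{E}[a \otimes a] = \mathrm{vec}(I_d)$ with $\|\mathrm{vec}(I_d)\|_2^2 = d$, together with $\|a \otimes a\|_2 = \|a\|_2^2$; a worked version of this step:

```latex
\mathbb{E}\bigl\|a\otimes a-\mathbb{E}[a\otimes a]\bigr\|_2^2
  = \mathbb{E}\|a\otimes a\|_2^2-\bigl\|\mathbb{E}[a\otimes a]\bigr\|_2^2
  = \mathbb{E}\|a\|_2^4-\|\operatorname{vec}(I_d)\|_2^2
  = \mathbb{E}\|a\|_2^4-d .
```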
Example 1 (iid components with normalized second moment).
Let $a$ be a random vector with components being independent copies of a random variable $\rho$ with $\mathbb{E}\rho^2 = 1$. Then
$V = d\,\mathbb{E}[\rho^4] + d(d-1) - d = d\bigl(\mathbb{E}[\rho^4] + d - 2\bigr) .$   (7)
Example 2 (Constant Amplitude).
Let $a$ be a random vector such that its components
are independent copies of a Rademacher random variable
$\rho$, i.e., $\rho = \pm 1$ with fixed amplitude and uniformly distributed sign. From the previous example it follows that
$V = d(d-1) .$   (8)
Example 3 (Spherical Distribution).
Let $a$ be drawn uniformly from the sphere with radius $\sqrt{d}$. Then $\|a\|_2^2 = d$ and it can easily be checked that (6) gives:
$V = d(d-1) .$   (9)
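The three expressions (7)–(9) for $V$ are easy to confirm by Monte-Carlo simulation; a small sketch (the samplers and parameter choices are ours, purely illustrative):

```python
import numpy as np

rng = np.random.default_rng(1)
d, trials = 10, 200_000

def V_estimate(sample_cols):
    """Monte-Carlo estimate of V = E||a (x) a - E[a (x) a]||_2^2 = E||a||_2^4 - d."""
    A = sample_cols()                                   # shape (d, trials)
    return (np.sum(A**2, axis=0) ** 2).mean() - d

gaussian   = lambda: rng.standard_normal((d, trials))
rademacher = lambda: rng.choice([-1.0, 1.0], size=(d, trials))
def sphere():                                           # uniform on the radius-sqrt(d) sphere
    G = rng.standard_normal((d, trials))
    return np.sqrt(d) * G / np.linalg.norm(G, axis=0)

print(V_estimate(gaussian),   d * (3 + d - 2))          # (7) with E[rho^4] = 3 for N(0,1)
print(V_estimate(rademacher), d * (d - 1))              # (8)
print(V_estimate(sphere),     d * (d - 1))              # (9)
```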
II RIP for Centered KR Products
The $\psi_\alpha$-norm, for $\alpha \ge 1$, of a real-valued random variable $X$ can^{3} formally be defined as:
^{3}These definitions are not unique in the literature, such that these norms may differ in constants.
$\|X\|_{\psi_\alpha} := \inf\bigl\{ t > 0 : \mathbb{E}\exp(|X|^\alpha / t^\alpha) \le 2 \bigr\} .$   (10)
Note that $\|X\|_{\psi_\alpha} \lesssim \|X\|_{\psi_\beta}$ for $\alpha \le \beta$. If $\|X\|_{\psi_\alpha} < \infty$ is satisfied, the random variable is called subGaussian for $\alpha = 2$ and subexponential for $\alpha = 1$. The definitions above extend in a canonical way to random vectors. The $\psi_\alpha$-norm of a random vector is defined as the best uniform bound on the $\psi_\alpha$-norm of its marginals:
$\|X\|_{\psi_\alpha} := \sup_{\|x\|_2 = 1} \|\langle X, x\rangle\|_{\psi_\alpha} .$   (11)
$\psi_\alpha$ random variables and vectors with $\alpha < 2$ are often called heavy tailed. Note that this terminology is also important if the $\psi_2$-norm of a random vector grows with its dimension.
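Up to universal constants, a $\psi_\alpha$-norm can also be read off from moment growth via the standard equivalence $\|X\|_{\psi_\alpha} \asymp \sup_{p \ge 1} p^{-1/\alpha} (\mathbb{E}|X|^p)^{1/p}$, see e.g. [vershynin_2018, Ch. 2]. A rough Monte-Carlo sketch of this characterization (the estimator and sample sizes are our illustrative choices):

```python
import numpy as np

def psi_alpha_estimate(samples, alpha, p_max=8):
    """Estimate ||X||_{psi_alpha} up to universal constants via the moment
    characterization ||X||_{psi_alpha} ~ sup_p p^(-1/alpha) (E|X|^p)^(1/p)."""
    ps = np.arange(1, p_max + 1)
    moments = np.array([np.mean(np.abs(samples) ** p) ** (1.0 / p) for p in ps])
    return float(np.max(moments / ps ** (1.0 / alpha)))

rng = np.random.default_rng(2)
g = rng.standard_normal(1_000_000)
print(psi_alpha_estimate(g, alpha=2))           # O(1): a Gaussian is subGaussian
print(psi_alpha_estimate(g**2 - 1, alpha=1))    # O(1): a centered chi-square is subexponential
```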
II-A RIP for Independent Heavy-Tailed Columns
As introduced above, the KR product of a random matrix with independent subGaussian isotropic columns is itself a matrix with heavy-tailed (subexponential) independent columns having a special structure. The RIP properties for the column-independent model with normalized subGaussian isotropic columns have been established in [vershynin_2018]. In a series of works [Ada2011, Guedon2014:heavy:columns, Guedon2017] the heavy-tailed column-independent model has been further investigated and concrete results can be found for various ensembles. However, the previously investigated ensembles do not explicitly cover the structure imposed by KR products. Thus, we make use of the following generic RIP result from [Ada2011, Theorem 3.3] for matrices with iid subexponential columns:
Theorem 1 (Theorem 3.3 in [Ada2011]).
Let $m$ and $n$ be integers and let $s \le \min(m, n)$. Let $\tilde{a}_1, \dots, \tilde{a}_n \in \mathbb{R}^m$ be independent random vectors normalized such that $\mathbb{E}\|\tilde{a}_i\|_2^2 = m$ and let $\psi = \max_i \|\tilde{a}_i\|_{\psi_1}$. Let $\theta \in (0,1)$ and $\lambda \ge 0$, and set $\Gamma = \frac{1}{\sqrt{m}}(\tilde{a}_1, \dots, \tilde{a}_n)$. Then for the matrix $\Gamma$ with columns $\tilde{a}_i/\sqrt{m}$,
$\delta_s(\Gamma) \le \theta$   (12)
holds with probability larger than
(13)  
(14) 
where the $c_i$ are universal constants.
We shall use this theorem for $m = d^2$ and the columns $\tilde{a}_i$ of the centered self KR product $\tilde{A}$. The key to getting a good bound from Theorem 1 is to
•
show that the marginals of the columns of $\tilde{A}$ have subexponential tails with a $\psi_1$-norm which is independent of the dimension $d$;
•
show that the norms of the columns of $\tilde{A}$ concentrate well around their mean.
If the columns of $\tilde{A}$ are exactly normalized, then the second point is trivially fulfilled: the latter two terms of (14) vanish and the corresponding deviation parameters can be chosen arbitrarily small. We can use the following corollary for matrices with columns of constant norm:
Corollary 1.
Let all parameters be as in Theorem 1, with the additional requirement that the columns are exactly normalized, $\|\tilde{a}_i\|_2^2 = m$ almost surely. Then the RIP constant $\delta_s$ of order $s$ of $\Gamma$ satisfies
(15) 
with probability at least as long as
(16) 
where $c$ and $C$ are some universal constants.
Proof.
Let us abbreviate $\Gamma = \frac{1}{\sqrt{m}}(\tilde{a}_1, \dots, \tilde{a}_n)$. Since $\|\tilde{a}_i\|_2^2 = m$ almost surely, the last two terms in (14) vanish for every $\lambda > 0$, and we obtain that
(17) 
with probability larger than
(18) 
Fix $\theta$ accordingly for any admissible $s$. Note that the conditions in (16) guarantee that $\theta < 1$. Plugging into (17) we see that the RIP constant $\delta_s$ satisfies
(19)  
(20)  
(21)  
(22) 
where in the first line we made use of the choice of $\theta$ and in the last line we used (16). This bound fails with probability:
(23)  
(24)  
(25) 
where in the second line we used the assumption on $s$. The statement of the corollary follows by choosing the constants small enough. ∎
II-B The Case of SubGaussian iid Columns
We will show here that Corollary 1 holds almost unchanged if $\tilde{a}_1, \dots, \tilde{a}_n$ are the columns of the centered self KR product $\tilde{A}$ of a matrix $A$ with subGaussian iid entries, as defined in (4) and Example 1. First we need to show that the columns are subexponential with a $\psi_1$-norm independent of $d$. This is a consequence of the Hanson-Wright inequality, which states that every centered quadratic form of independent subGaussian random variables is subexponential:
Theorem 2 (Hanson-Wright inequality).
Let $X = (X_1, \dots, X_d)$ be a random vector with independent components which satisfy $\mathbb{E}X_i = 0$ and $\|X_i\|_{\psi_2} \le K$. Let $M$ be a $d \times d$ matrix. Then, for every $t \ge 0$,
$P\bigl(\,|X^T M X - \mathbb{E}[X^T M X]| > t\,\bigr) \;\le\; 2\exp\Bigl(-c\,\min\Bigl(\frac{t^2}{K^4\|M\|_F^2},\; \frac{t}{K^2\|M\|}\Bigr)\Bigr) .$   (26)
Proof.
See [Rud2013]. ∎
With $\|M\|$ and $\|M\|_F$ we denote here the operator norm and the Frobenius norm of the matrix $M$. Note that a random variable with such a mixed tail behavior is in particular subexponential. This can be seen by bounding its moments. Let $Z$ be a random variable with
$P(|Z| > t) \le 2\exp\bigl(-c\min\bigl(t^2/a^2,\; t/b\bigr)\bigr) .$   (27)
Since $\min(t^2/a^2, t/b) \ge t/\max(a,b)$ for $t \ge \max(a,b)$, we have $P(|Z| > t) \le 2\exp\bigl(-ct/\max(a,b)\bigr)$ for $t \ge \max(a,b)$. It follows
$\mathbb{E}|Z|^p = \int_0^\infty p\,t^{p-1}\,P(|Z| > t)\,dt \;\le\; \max(a,b)^p + 2p\Bigl(\frac{\max(a,b)}{c}\Bigr)^{p}\Gamma(p)$   (28)
where $\Gamma(\cdot)$ is the Gamma function. So
$\bigl(\mathbb{E}|Z|^p\bigr)^{1/p} \;\le\; C\,\max(a,b)\,p \quad \text{for all } p \ge 1,$   (29)
which is equivalent to $\|Z\|_{\psi_1} \le C'\max(a,b)$ by elementary properties of subexponential random variables.
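The mixed tail of Theorem 2 can also be observed in simulation; the sketch below (Rademacher entries, a fixed Gaussian test matrix, and the constant $c = 1$ are our illustrative choices, since (26) does not specify $c$) compares the empirical tail of a centered quadratic form with the shape of the mixed bound:

```python
import numpy as np

rng = np.random.default_rng(3)
d, trials = 50, 100_000
M = rng.standard_normal((d, d))
M /= np.linalg.norm(M)                         # Frobenius normalization: ||M||_F = 1

X = rng.choice([-1.0, 1.0], size=(trials, d))  # independent Rademacher components, K = O(1)
q = np.einsum('ti,ti->t', X @ M, X)            # quadratic forms X^T M X
q -= np.trace(M)                               # center: E[X^T M X] = tr(M)

op = np.linalg.svd(M, compute_uv=False)[0]     # operator norm ||M||
for t in [0.5, 1.0, 2.0, 4.0]:
    emp = np.mean(np.abs(q) > t)
    ref = 2 * np.exp(-min(t**2, t / op))       # mixed-tail shape with c = 1 (illustrative)
    print(t, emp, ref)
```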
Theorem 3.
Let $A$ be a random $d \times n$ matrix with iid entries distributed as a subGaussian random variable $\rho$ with $\mathbb{E}\rho = 0$, $\mathbb{E}\rho^2 = 1$ and $\|\rho\|_{\psi_2} \le K$. Then the columns $\tilde{a}_i$ of the centered self KR product $\tilde{A}$ defined in (4) satisfy $\|\tilde{a}_i\|_{\psi_1} \le C K^2$ for a universal constant $C$, independent of the dimensions $d$ and $n$.
Proof.
Fix a unit vector $x \in \mathbb{R}^{d^2}$ and let $X \in \mathbb{R}^{d \times d}$ be the matrix with $\mathrm{vec}(X) = x$, so that $\|X\|_F = \|x\|_2 = 1$ and $\|X\| \le \|X\|_F = 1$. Then $\langle \tilde{a}_i, x \rangle = \frac{d}{\sqrt{V}}\bigl(a_i^T X a_i - \mathbb{E}[a_i^T X a_i]\bigr)$ is a centered quadratic form in the independent subGaussian entries of $a_i$, so Theorem 2 yields a mixed tail as in (27) with $a, b \lesssim K^2 d/\sqrt{V}$. Since $\mathbb{E}[\rho^4] \ge (\mathbb{E}[\rho^2])^2 = 1$, Example 1 gives $V \ge d(d-1)$, hence $d/\sqrt{V} \le \sqrt{2}$ for $d \ge 2$, and the claim follows from (29). ∎
To apply Theorem 1 to $\tilde{A}$ we additionally need to show that the norms of its columns concentrate well around their mean. This is the subject of the following theorem.
Theorem 4.
Let $\tilde{a}_1, \dots, \tilde{a}_n$ be the columns of the centered self KR product of a centered, normalized subGaussian iid matrix $A$ as in Theorem 3. Let $\rho$ denote the distribution of the entries of $A$. Then for $\varepsilon \in (0,1)$ it holds:
(34) 
if $d$ satisfies
(35) 
Proof.
By the union bound we have that
(36) 
Furthermore we have (see Example 1)
(37) 
which can be rewritten as
(38) 
Example 1 shows that $V = d\bigl(\mathbb{E}[\rho^4] + d - 2\bigr)$, thus
(39)  
(40) 
We can estimate the one-sided tail by
(41) 
Therefore this is a sum of independent zero-mean subexponential random variables, as a centering argument and the identity $\|\rho^2\|_{\psi_1} = \|\rho\|_{\psi_2}^2$ for subGaussian random variables show, see e.g. [vershynin_2018, Ch. 2.7]. Therefore the elementary Bernstein inequality gives that
(42) 
Then the same argument as in (29) shows that
(43) 
for some constant $C$. So in particular
(44)  
(45) 
where in the last step we used the moment bounds above. The probability of deviation can be bounded as follows:
(46)  
(47)  
(48) 
Finally
(49) 
For the other tail, notice that both quantities involved are nonnegative; for the first this is obvious, for the second it follows from Jensen's inequality and $\mathbb{E}\rho^2 = 1$. ∎
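Numerically, the concentration asserted by Theorem 4 is already visible for moderate $d$; a sketch for Gaussian entries (so $V = d(d+1)$ by Example 1), using the normalization convention of (4) as stated above:

```python
import numpy as np

rng = np.random.default_rng(4)
d, n = 40, 2000
A = rng.standard_normal((d, n))

V = d * (d + 1)                                    # Example 1 with E[rho^4] = 3
mean_col = np.eye(d).reshape(-1, 1)                # E[a (x) a] = vec(I_d)
K = np.einsum('ik,jk->ijk', A, A).reshape(d*d, n)  # self KR product
A_tilde = (d / np.sqrt(V)) * (K - mean_col)        # centered, rescaled as in (4)

norms = np.sum(A_tilde**2, axis=0) / d**2          # normalized column norms, cluster near 1
print(norms.mean(), norms.std(), np.abs(norms - 1).max())
```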
The third term in (14) is the probability of a one-sided deviation of the column norms and therefore we can bound it by the same term as in Theorem 4:
(50) 
Now we can state that the result of Corollary 1 holds almost unchanged, except for different constants, for the centered self KR product of an iid subGaussian matrix:
Theorem 5.
Let $d$, $n$ and $s$ be integers such that $s \le \min(d^2, n)$. Let $A$ be a random $d \times n$ matrix with subGaussian iid entries, distributed according to $\rho$, with $\mathbb{E}\rho = 0$, $\mathbb{E}\rho^2 = 1$ and $\|\rho\|_{\psi_2} \le K$. Let $\tilde{A}$ be the centered and rescaled self KR product of $A$ as defined in (4). Then the RIP constant of order $s$ of $\frac{1}{d}\tilde{A}$ satisfies
$\delta_s\bigl(\tfrac{1}{d}\tilde{A}\bigr) \le \delta$   (51)
for any $\delta \in (0,1)$ with probability larger than
(52) 
as long as
(53) 
and
(54) 
where $c$ and $C$ are some universal constants.
Proof.
Theorem 3 shows that the columns of $\tilde{A}$ are subexponential. So the prerequisites of Theorem 1 are fulfilled with $\psi \le C_1 K^2$, for some absolute constant $C_1$, and $m = d^2$. Theorem 4, together with (50), shows that there exist constants
(55) 
if $d$ satisfies (35) (the latter is simply a constant, since $\mathbb{E}[\rho^4]$ is bounded by $CK^4$ for subGaussian $\rho$). Choosing the constant in condition (54) large enough, we can guarantee that
(56) 
So Theorem 1 gives that
(57) 
holds with probability larger than
(58) 
Then the same calculation as in the proof of Corollary 1 shows that there is a constant such that the statement of this theorem follows. ∎
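Since computing $\delta_s$ exactly is NP-hard, the statement can only be probed empirically; the following sketch (random $s$-sparse test vectors and all parameter choices are ours) produces a crude lower bound on the RIP constant of $\frac{1}{d}\tilde{A}$, not a certificate:

```python
import numpy as np

rng = np.random.default_rng(5)
d, n, s = 16, 400, 20                               # m = d^2 = 256 "measurements"

A = rng.standard_normal((d, n))
V = d * (d + 1)                                     # Example 1, Gaussian entries
K = np.einsum('ik,jk->ijk', A, A).reshape(d*d, n)
A_tilde = (d / np.sqrt(V)) * (K - np.eye(d).reshape(-1, 1))
Gamma = A_tilde / d                                 # Gamma = A_tilde / sqrt(m) with m = d^2

dev = 0.0
for _ in range(2000):                               # random s-sparse probes
    S = rng.choice(n, size=s, replace=False)
    x = np.zeros(n)
    x[S] = rng.standard_normal(s)
    ratio = np.linalg.norm(Gamma @ x)**2 / np.linalg.norm(x)**2
    dev = max(dev, abs(ratio - 1.0))
print("empirical lower bound on delta_s:", dev)
```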
II-C Spherical Columns
Let $A$ be a matrix such that its columns are drawn iid from a sphere with radius $\sqrt{d}$, see Example 3. Since the columns are now exactly normalized, we can apply Corollary 1 if we can show that the columns of the centered self KR product have subexponential marginals, with a subexponential norm independent of the dimension. For this we can use the following result from [Ada2015], which states that a random vector which satisfies the convex concentration property also satisfies the Hanson-Wright inequality:
Theorem 6 (Theorem 2.5 in [Ada2015]).
Let $X$ be a mean zero random vector in $\mathbb{R}^d$ which satisfies the convex concentration property with constant $K$. Then for any $d \times d$ matrix $M$ and every $t > 0$,
$P\bigl(\,|X^T M X - \mathbb{E}[X^T M X]| > t\,\bigr) \le 2\exp\Bigl(-c\,\min\Bigl(\frac{t^2}{K^4\|M\|_F^2},\; \frac{t}{K^2\|M\|}\Bigr)\Bigr) .$   (59)
The convex concentration property is defined as follows:
Definition 1 (Convex Concentration Property).
Let $X$ be a random vector in $\mathbb{R}^d$. $X$ has the convex concentration property with constant $K$ if for every 1-Lipschitz convex function $\varphi : \mathbb{R}^d \to \mathbb{R}$, we have $\mathbb{E}|\varphi(X)| < \infty$ and for every $t > 0$,
$P\bigl(|\varphi(X) - \mathbb{E}\varphi(X)| \ge t\bigr) \le 2\exp\bigl(-t^2/K^2\bigr) .$   (60)
A classical result states that a spherical random vector has the even stronger (non-convex) concentration property (e.g. [vershynin_2018, Theorem 5.1.4]):
Theorem 7 (Concentration on the Sphere).
Let $X$ be uniformly distributed on the Euclidean sphere of radius $\sqrt{d}$. Then there is an absolute constant $C$ such that for every 1-Lipschitz function $f : \mathbb{R}^d \to \mathbb{R}$
$\|f(X) - \mathbb{E}f(X)\|_{\psi_2} \le C .$   (61)
So in particular $X$ has the convex concentration property with constant $C$, and it follows by Theorem 6 that it also satisfies the tail bound of the Hanson-Wright inequality. As shown in (29), this implies that the columns of $\tilde{A}$ are subexponential with $\|\tilde{a}_i\|_{\psi_1} \le C'$ for some absolute constant $C'$. With this we can apply Corollary 1.
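A quick numerical illustration of why Corollary 1 applies directly here: for columns drawn uniformly from the radius-$\sqrt{d}$ sphere, $\|a\|_2^2 = d$ holds exactly, and a short computation gives $\|\tilde{a}_i\|_2^2 = d^2$ deterministically (the sampler below, normalizing Gaussian vectors, is a standard way to draw from the sphere; parameter choices are ours):

```python
import numpy as np

rng = np.random.default_rng(6)
d, n = 30, 1000
G = rng.standard_normal((d, n))
A = np.sqrt(d) * G / np.linalg.norm(G, axis=0)     # iid columns, uniform on radius-sqrt(d) sphere

V = d * (d - 1)                                    # Example 3
K = np.einsum('ik,jk->ijk', A, A).reshape(d*d, n)
A_tilde = (d / np.sqrt(V)) * (K - np.eye(d).reshape(-1, 1))

# exact normalization: ||a||_2^2 = d a.s. implies every column of A_tilde has norm^2 = d^2
spread = np.ptp(np.sum(A_tilde**2, axis=0)) / d**2
print(spread)                                      # ~1e-15, i.e. zero up to float error
```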
Remark 1.
In this section we did not specifically use the property that the columns of $A$ are drawn iid from the sphere, but only their convex concentration property. So the results also hold for the larger class of normalized columns with dependent entries, i.e., those which satisfy the convex concentration property. E.g., it is known that a vector satisfies the convex concentration property if its entries are drawn without replacement from some fixed set of numbers with the proper normalization. For more examples see [Ada2015]. Also note that the subGaussian iid case of Section II-B is not covered by Theorem 6, since a vector with subGaussian iid entries does not, in general, have the convex concentration property with a constant independent of the dimension [Ada2015].
Acknowledgments
We thank Fabian Jänsch, Radoslaw Adamczak, Saeid Haghighatshoar and Giuseppe Caire for fruitful discussions. PJ has been supported by DFG grant JU 2795/3.