
# Quantized Compressive Sensing with RIP Matrices: The Benefit of Dithering

In Compressive Sensing theory and its applications, quantization of signal measurements, as integrated into any realistic sensing model, impacts the quality of signal reconstruction. In fact, there even exist incompatible combinations of quantization functions (e.g., the 1-bit sign function) and sensing matrices (e.g., Bernoulli) that cannot lead to an arbitrarily low reconstruction error when the number of observations increases. This work shows that, for a scalar and uniform quantization, provided that a uniform random vector, or "random dithering", is added to the compressive measurements of a low-complexity signal (e.g., a sparse or compressible signal, or a low-rank matrix) before quantization, a large class of random matrix constructions known to respect the restricted isometry property (RIP) are made "compatible" with this quantizer. This compatibility is demonstrated by the existence of (at least) one signal reconstruction method, the "projected back projection" (PBP), whose reconstruction error is proved to decay when the number of quantized measurements increases. Despite the simplicity of PBP, which amounts to projecting the back projection of the compressive observations (obtained from their multiplication by the adjoint sensing matrix) onto the low-complexity set containing the observed signal, we also prove that given a RIP matrix and for a single realization of the dithering, this reconstruction error decay is also achievable uniformly for the sensing of all signals in the considered low-complexity set. We finally confirm empirically these observations in several sensing contexts involving sparse signals, low-rank matrices, and compressible signals, with various RIP matrix constructions such as sub-Gaussian random matrices and random partial Discrete Cosine Transform (DCT) matrices.

05/11/2018

## 1 Introduction

Compressive sensing (CS) theory [1, 2, 3] has shown us how to compressively and non-adaptively sample low-complexity signals, such as sparse vectors or low-rank matrices, in high-dimensional domains. In this framework, accurate estimation of such signals from their compressive measurements is still possible thanks to non-linear reconstruction algorithms (e.g., ℓ₁-norm minimization, greedy algorithms) exploiting the signal's low-complexity nature. In other words, by generalizing the concepts of sampling and reconstruction, CS has somehow extended the Shannon-Nyquist theory, initially restricted to the class of band-limited signals.

Specifically, given a sensing (or measurement) matrix Φ ∈ ℝ^{m×n} with m < n, CS describes how one can recover a signal x ∈ ℝⁿ from the measurements y ∈ ℝᵐ associated with the underdetermined linear model (some of the mathematical notations and conventions used below are defined at the end of this section)

 y=Φx+n, (1)

where y ∈ ℝᵐ is the measurement vector, n ∈ ℝᵐ stands for a possible additive measurement noise, and x is assumed restricted to a low-complexity signal set K ⊂ ℝⁿ, e.g., the set Σ^n_k of k-sparse vectors in some orthonormal basis. In particular, it has been shown that the recovery of x is guaranteed if Φ respects the restricted isometry property (RIP) over K, which essentially states that Φ behaves as an approximate isometry for all elements of K (see Sec. 3.1). Interestingly, many random constructions of sensing matrices have been proved to respect the RIP with high probability (w.h.p.; hereafter, we write w.h.p. if the probability of failure of the considered event decays exponentially with respect to the number of measurements) [4, 5, 6, 3]. For instance, if Φ is a Gaussian random matrix with entries identically and independently distributed (i.i.d.) as a standard normal distribution, Φ respects the RIP over Σ^n_k with very high probability provided m ≳ k log(n/k). For a more general set K, the RIP is verified as soon as m is sufficiently large compared to the intrinsic complexity of K in ℝⁿ, e.g., as measured by the squared Gaussian mean width or the Kolmogorov entropy of K [7, 5, 4] (see Sec. 3.1 and Sec. 7).

Under the satisfiability of the RIP, many signal reconstruction methods (e.g., basis pursuit denoise [1], greedy algorithms such as the orthogonal matching pursuit [8], or iterative hard thresholding [3]) achieve a stable and robust estimate of x from the sensing model (1), e.g., for K = Σ^n_k. They typically display the following reconstruction error bound, or instance optimality [9, 1, 2],

 ∥x−^x∥ ⩽ C∥x−xk∥1/√k + Dϵ, (2)

where ^x is the signal estimate, xk the best k-term approximation of x, ϵ estimates the noise energy ∥n∥, and the constants C, D > 0 only depend on Φ.

In this brief overview of CS theory, we thus see that, at least in the noiseless setting, the measurement vector y is assumed represented with infinite precision. However, any realistic device model imposes digitization and finite-precision data representations, e.g., to store, transmit, or process the acquired observations. In particular, (1) must be turned into a quantized CS (QCS) formalism where the objective is to reliably estimate a low-complexity signal x from the quantized measurements

 y=Qg(Φx). (3)

In (3), Qg is a general quantization function, or quantizer, mapping m-dimensional vectors to vectors of a discrete set, or codebook, Q ⊂ ℝᵐ.

While initial approaches modeled the quantization distortion as an additive, bounded noise in (1) (see, e.g., [10]), thus inducing a constant error bound in (2) [11, 12], various kinds of quantizers have since then been studied more deeply in the context of QCS [12]. This includes ΣΔ-quantization [11], non-regular scalar quantizers [13], non-regular binned quantization [14, 15], and even vector quantization by frame permutation [16]. These quantizers, when combined with an appropriate signal reconstruction procedure, achieve different decay rates of the reconstruction error when the number of measurements increases. For instance, for a ΣΔ-quantizer combined with Gaussian or sub-Gaussian sensing matrices [11], or with random partial circulant matrices generated by a sub-Gaussian random vector [17], this error can decay polynomially in m for an appropriate reconstruction procedure, and, in the case of a 1-bit quantizer, adapting the sign quantizer by inserting in it adaptive thresholds can even lead to an exponential decay [18].

In this paper, however, we do not focus on optimizing the quantizer to achieve the best decay rate for the reconstruction error of some appropriate algorithm when m increases. Our main objective is to show that a simple scalar quantization procedure, i.e., a uniform quantizer, applied componentwise on vectors (or entrywise on matrices), is compatible with the large class of sensing matrices known to satisfy the RIP, provided that we combine the quantization with a random, uniform pre-quantization dithering [13, 19, 20]. Accessing a broader set of sensing matrices for QCS, i.e., one not restricted to unstructured sub-Gaussian random constructions, is desirable in many CS applications where specific, structured sensing matrices are imposed by technology or physics. For instance, (random) partial Fourier (or DCT) matrices are ubiquitous in magnetic resonance imaging [21], radio astronomy [22], radar, and communication applications [23, 24].

In this context, we also study the estimation of signals belonging to a general low-complexity set K ⊂ ℝⁿ, e.g., the set of sparse or compressible vectors or the set of low-rank matrices. In fact, we will consider any low-dimensional set, i.e., any set with small Kolmogorov entropy (see Sec. 3.1 and Sec. 7), that supports the RIP of Φ. This last requirement ensures that the reconstruction guarantees of QCS reduce to those of CS when the quantization disappears (e.g., when its precision becomes infinite).

Mathematically, our work deals with the problem of estimating a signal from the QCS model

 y=A(x)=A(x;Φ,ξ):=Q(Φx+ξ), (4)

where A is a quantized random mapping, i.e., A : ℝⁿ → δℤᵐ, Q is a uniform scalar quantizer of resolution δ > 0 (in this work, the term “resolution” does not refer to the number of bits used to encode the quantized values [25]), applied componentwise as Q(t) = δ⌊t/δ⌋, and ξ ∈ ℝᵐ is a uniform random dithering vector whose components are i.i.d. as a uniform distribution over [0, δ], i.e., ξ_i ∼ U([0, δ]) for i ∈ [m], or, more briefly, ξ ∼ U^m([0, δ]).
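With these definitions in hand, the mapping in (4) is straightforward to simulate; the sketch below assumes the flooring quantizer Q(t) = δ⌊t/δ⌋ and is only an illustration of the model, not part of the paper's material:

```python
import numpy as np

def quantize(t, delta):
    """Uniform scalar quantizer of resolution delta, applied componentwise:
    Q(t) = delta * floor(t / delta)."""
    return delta * np.floor(t / delta)

def qcs_map(Phi, x, delta, rng):
    """Dithered QCS map A(x) = Q(Phi x + xi) with xi ~ U^m([0, delta])."""
    xi = rng.uniform(0.0, delta, size=Phi.shape[0])
    return quantize(Phi @ x + xi, delta)
```

By construction, every output of `qcs_map` lies on the grid δℤᵐ, i.e., in the discrete codebook of the quantizer.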

The compatibility mentioned above between the QCS model (4) and the class of RIP matrices is demonstrated by showing that a simple (often non-iterative) reconstruction method, the projected back projection (PBP) of the quantized measurements onto the set K, i.e., finding the closest point of K to the back projection (1/m)Φ⊤y (see Sec. 4), achieves a reconstruction error that decays polynomially when m increases, at a rate depending only on the nature of K.

For instance, we prove in Sec. 7 that, given a RIP matrix Φ and a fixed signal x, if the dithering is random and uniform, then one achieves, w.h.p., a polynomially decaying reconstruction error when K is the set of sparse vectors, the set of low-rank matrices (up to the identification of these matrices with their vector representation), or any finite union of low-dimensional subspaces, as with model-based CS schemes [26] or group-sparse signal models [27]. Interestingly, for these specific sets, the same error decay rate is proved to hold uniformly, up to extra log factors in the involved dimensions, i.e., when the randomly generated mapping A allows the estimation of all vectors of K w.h.p. More generally, if K is a convex and bounded set of ℝⁿ, e.g., the set of compressible signals, we observe slower (but still polynomial) error decay rates in the non-uniform and in the uniform settings, respectively.

Whether other reconstruction algorithms can reach a faster error decay in our QCS model and for general RIP matrices is an open question. In this regard, PBP can be seen as a good initialization for more advanced reconstruction algorithms that iteratively enforce the consistency of the estimate with the observations [28, 29, 30, 31, 32].

In all our developments, the importance of the random dithering in the QCS model (4) finds its origin in the simple observation that, for ξ ∼ U([0, δ]), E Q(λ + ξ) = λ for all λ ∈ ℝ (see Lemma A.1 in Appendix A). By the law of large numbers, this means that for m different i.i.d. r.v.'s ξ_i ∼ U([0, δ]) and m increasingly large, the projection of the vector A(u) − Φu, for some vector u, onto a fixed direction tends to zero. Moreover, this effect should persist for all vectors and directions selected in a set whose dimension is small compared to m, and, in our case of interest, if these vectors are selected in the image of a low-complexity set K by a RIP matrix Φ.
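This unbiasedness of dithered quantization (the content of Lemma A.1) is easy to verify numerically; a minimal check, assuming the flooring quantizer Q(t) = δ⌊t/δ⌋:

```python
import numpy as np

rng = np.random.default_rng(1)
delta, lam = 0.5, 0.817                       # resolution and an arbitrary scalar input
xi = rng.uniform(0.0, delta, size=200_000)    # uniform dithering samples
q = delta * np.floor((lam + xi) / delta)      # dithered quantization of lam
print(abs(q.mean() - lam))                    # small: E Q(lam + xi) = lam
```

Without dithering, Q(lam) carries a deterministic bias of up to δ that no amount of averaging removes; the uniform random dithering is precisely what makes the quantizer unbiased in expectation.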

In order to accurately bound the deviation between these projections and zero, we prove that, given a resolution δ and for m large compared to the intrinsic complexity of K (as measured by its Kolmogorov entropy), the quantized random mapping A associated with a RIP matrix Φ and a random dithering ξ respects, w.h.p., the limited projection distortion property over K, or LPD, defined by

 (1/m)|⟨A(u),Φv⟩−⟨Φu,Φv⟩| ⩽ ν,  ∀u,v∈K∩Bn,

where ν is a distortion depending on m, δ, and K. This is established using tools from measure concentration theory and some extra care to deal with the discontinuities of Q. In fact, we will see in Sec. 6 that, if the dithering is random and uniform, ν can be driven arbitrarily close to any small target distortion, at the price of a stronger requirement on m; for instance, forgetting all other dependencies, the sample complexity for the set of sparse vectors is then comparable to the one classically established for ensuring the RIP of a Gaussian random matrix [4] (see Sec. 6 and Sec. 7.3). Moreover, by localizing the LPD on a fixed signal, the impact of quantization is reduced, as deduced in Sec. 5.

Postponing the detailed proof of this fact to Sec. 4, it is easy to understand why the LPD property is useful to characterize the reconstruction error of PBP in the estimation of bounded k-sparse signals. We note first that a standard use of the polarization identity proves that, if Φ satisfies the RIP over the set of sparse vectors with distortion ϵ, then |(1/m)⟨Φu,Φv⟩ − ⟨u,v⟩| ⩽ ϵ for all sparse u, v such that ∥u∥, ∥v∥ ⩽ 1 (see, e.g., [3]). Therefore, under the LPD of A over the same set and for the same vectors u and v, the triangle inequality provides

 |(1/m)⟨A(u),Φv⟩−⟨u,v⟩| ⩽ ϵ+ν. (5)

Consequently, for a bounded k-sparse signal x, its PBP estimate ^x, i.e., the closest element of Σ^n_k to a := (1/m)Φ⊤A(x), is the best k-sparse approximation of both a and the (at most 2k-sparse) vector ¯a obtained by zeroing all the entries of a but those indexed in T := supp(x) ∪ supp(^x). Leveraging this fact we thus get ∥^x−x∥ ⩽ 2∥¯a−x∥, and, by the definition of the ℓ₂-norm,

 ∥¯a−x∥ = supv∈Σ^n_T∩Bn ⟨¯a−x,v⟩ = supv∈Σ^n_T∩Bn ((1/m)⟨A(x),Φv⟩−⟨x,v⟩),

with Σ^n_T the set of vectors supported on T. Consequently, since any such v is at most 2k-sparse, (5) finally provides the bound ∥^x−x∥ ⩽ 2(ϵ+ν) on the reconstruction error of PBP.
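The sparse case sketched above can be exercised end to end; the script below is only an illustrative sketch (Gaussian Φ, flooring quantizer, hard thresholding as the projection), not the paper's experimental protocol:

```python
import numpy as np

def hard_threshold(z, k):
    """Best k-term approximation: keep the k largest-magnitude entries of z."""
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-k:]
    out[keep] = z[keep]
    return out

def pbp(y, Phi, k):
    """Projected back projection onto the set of k-sparse vectors."""
    return hard_threshold(Phi.T @ y / Phi.shape[0], k)

rng = np.random.default_rng(2)
n, k, delta = 100, 3, 0.2
x = np.zeros(n); x[:k] = rng.standard_normal(k); x /= np.linalg.norm(x)

for m in (200, 5000):
    Phi = rng.standard_normal((m, n))
    xi = rng.uniform(0.0, delta, size=m)
    y = delta * np.floor((Phi @ x + xi) / delta)   # dithered QCS observations
    print(m, np.linalg.norm(pbp(y, Phi, k) - x))   # error typically shrinks with m
```

Despite the coarse quantization, the back projection concentrates around x on sparse directions, which is exactly the mechanism captured by the LPD.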

The rest of the paper is structured as follows. We present in Sec. 2 a few related works, namely, former usages of the PBP method in 1-bit CS, variants of the LPD property in 1-bit CS and non-linear CS, and certain known reconstruction error bounds of PBP and related algorithms in certain QCS and non-linear sensing contexts. Most of these works are based on sub-Gaussian random projections of signals altered by quantization or other non-linear disturbances, with two noticeable exceptions using a subsampled Gaussian circulant sensing matrix and a bounded orthogonal ensemble [33, 34]. After having introduced a few preliminary concepts in Sec. 3, such as the characterization of low-complexity spaces, the PBP method, and the formal definition of the (L)LPD for a general mapping D, Sec. 4 establishes the reconstruction error bound of PBP when the LPD of D and the RIP of Φ are both verified. We realize this analysis for three kinds of low-complexity sets, namely, finite unions of low-dimensional subspaces (e.g., the set of (group) sparse signals), the set of low-rank matrices, and the (unstructured) case of a general bounded convex set. In Sec. 5, we prove that the L-LPD holds w.h.p. over low-complexity sets when D represents a linear sensing model corrupted by additive sub-Gaussian noise. This analysis will later simplify the characterization of the PBP of QCS observations in the non-uniform case, when the observed signal is fixed prior to the generation of the random dithering. In Sec. 6, we prove that the quantized random mapping A defined in (4) from a uniform random dithering is sure to respect the (uniform) LPD w.h.p. provided m is large compared to the complexity of K. From the results of these two last sections, we instantiate in Sec. 7 the general bounds found in Sec. 4 and establish the decay rate of the PBP reconstruction error when m increases for the same low-complexity sets considered in Sec. 4 and for several classes of RIP matrices, including sub-Gaussian and structured random sensing matrices. Finally, in Sec. 8, we validate the PBP reconstruction error numerically for the particular sets discussed in Sec. 7 and for several structured and unstructured random sensing matrices.

#### Conventions and notations:

We find it useful to finish this introduction with the conventions and notations used throughout this paper. We denote vectors and matrices with bold symbols, e.g., Φ or x, while lowercase light letters are associated with scalar values. The identity matrix in ℝⁿ reads In and the zero vector 0, their dimensions being clear from the context. The i-th component of a vector (or of a vector function) u reads either u_i or (u)_i, while the vector u_j may refer to the j-th element of a set of vectors. The set of indices in ℝᵈ is [d] := {1, …, d} and the support of u ∈ ℝⁿ is supp(u) := {i ∈ [n] : u_i ≠ 0}. The Kronecker symbol is denoted by δ_{ij} and is equal to 1 if i = j and to 0 otherwise, while the indicator of a set S is equal to 1 on S and to 0 otherwise. For any S ⊂ [n] of cardinality |S|, u_S denotes the restriction of u to the entries indexed in S, while Φ_S is the matrix obtained by restricting the columns of Φ to those indexed by S. The complement of a set S reads S^c. For any p ⩾ 1, the ℓ_p-norm of u is ∥u∥_p := (Σ_i |u_i|^p)^{1/p}, with ∥·∥ := ∥·∥₂. The ℓ₂-sphere in ℝⁿ is the set of unit-norm vectors, and the unit ℓ₂-ball reads Bn := {u ∈ ℝⁿ : ∥u∥ ⩽ 1}. By extension, the Frobenius unit ball of matrices is defined analogously from the Frobenius norm ∥·∥_F, which is associated with the scalar product ⟨U, V⟩_F := tr(U⊤V) through ∥U∥²_F = ⟨U, U⟩_F for two matrices U, V. The common flooring operator is denoted ⌊·⌋.

An important feature of our study is that we do not pay particular attention to constants in the many bounds developed in this paper. For instance, the symbols C, c, C′, c′ denote positive and universal constants whose values can change from one line to the other. We also use the ordering notations a ≲ b (or a ≳ b) if there exists a C > 0 such that a ⩽ Cb (resp. a ⩾ Cb) for two quantities a and b.

Concerning statistical quantities, Φ ∼ 𝒫^{m×n} and ξ ∼ 𝒫^m denote an m × n random matrix and an m-length random vector, respectively, whose entries are identically and independently distributed (or i.i.d.) as the probability distribution 𝒫; e.g., N^{m×n}(0, 1) (or U^m([0, δ])) is the distribution of a matrix (resp. vector) whose entries are i.i.d. as the standard normal distribution N(0, 1) (resp. the uniform distribution U([0, δ])). We also use extensively the sub-Gaussian and sub-exponential characterizations of random variables (or r.v.s) and of random vectors detailed in [35]. The sub-Gaussian and the sub-exponential norms of a random variable X are denoted by ∥X∥_{ψ₂} and ∥X∥_{ψ₁}, respectively, with ∥X∥_{ψ_α} := inf{t > 0 : E exp(|X|^α/t^α) ⩽ 2} the Orlicz norm for α ⩾ 1. The random variable X is therefore sub-Gaussian (resp. sub-exponential) if ∥X∥_{ψ₂} < ∞ (resp. ∥X∥_{ψ₁} < ∞).

## 2 Related works

We now provide a comparison of our work with the most relevant literature. This literature is connected to 1-bit CS, QCS, or other non-linear sensing models applied componentwise (as for scalar quantization) on compressive sensing measurements, possibly with a random or adaptive pre-quantization dithering, and to algorithms similar or related to PBP. Most of the works presented below are summarized in Table 1, which reports, amongst other aspects, the sensing model, the algorithm, the type of admissible sensing matrices, and the low-complexity sets chosen in each of the referenced works.

#### PBP in 1-bit CS:

Recently, signal reconstruction via projected back projection has been studied in the context of 1-bit compressive sensing (1-bit CS), an extreme QCS scenario where only the sign of the compressive measurements is retained [41, 18, 31, 40]. In this case, (3) turns into

 y=sign(Φx). (6)

It has been shown that if the sensing matrix Φ satisfies the sign product embedding property (SPE) over the set of sparse vectors [40, 31], that is, up to some distortion ϵ and some universal normalization μ > 0,

 |(μ/m)⟨sign(Φu),Φv⟩−⟨u,v⟩| ⩽ ϵ, (SPE)

for all sparse unit-norm u, v, then the reconstruction error of the PBP of y is controlled by the distortion ϵ [31, Prop. 2]. In other words, for a signal x with unknown norm (as implied by the invariance of (6) to positive signal rescaling [41]), the PBP method allows us to estimate the direction of a sparse signal. This remains true for all methods assuming x to be of unit norm, such as those explained below.
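The scale invariance of (6) just mentioned, i.e., sign(Φ(λx)) = sign(Φx) for any λ > 0, is immediate to check numerically; a small illustration:

```python
import numpy as np

rng = np.random.default_rng(3)
Phi = rng.standard_normal((64, 16))
x = rng.standard_normal(16)

# 1-bit observations are invariant to positive rescaling of the signal,
# so the norm of x is not identifiable from y = sign(Phi x).
y1 = np.sign(Phi @ x)
y2 = np.sign(Phi @ (5.0 * x))
print(np.array_equal(y1, y2))   # True: only the direction of x is encoded
```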

So far, the SPE property has only been proved for Gaussian random sensing matrices with i.i.d. standard normal entries, for which μ = √(2/π). Such matrices respect the SPE with high probability provided m is large compared to the sparsity level, conferring to PBP a (uniform) polynomial reconstruction error decay when m increases for all sparse signals. Besides, by localizing the SPE to a given signal x, a non-uniform variant of the previous result, i.e., where Φ is randomly drawn conditionally to the knowledge of x, gives a faster error decay [31, Prop. 2] when m increases.

For a more general low-complexity set K, provided Φ is a random sub-Gaussian matrix (with i.i.d. centered sub-Gaussian random entries of unit variance [35]), a variant of the PBP method, which amounts to finding the vector of K maximizing its scalar product with the back projection (on the principle, this is similar, but not equivalent, to finding the closest point of K, in the Euclidean sense, to the back projection, as in the PBP), is proved to reach, with high probability, a small reconstruction error [39], and this even if the binary sensing model (6) is noisy (e.g., with possible random sign flips on a small percentage of the measurements). In fact, this error decays polynomially when m increases, up to an additive bias: the decaying part depends only on the level of measurement noise, on the distribution of Φ, and on K (actually, on its Gaussian mean width, see Sec. 7), while the bias is associated with the non-Gaussian nature of the sub-Gaussian random matrix (it vanishes if Φ is Gaussian) [39, Thm 1.1]. Therefore, as a price to pay for having a 1-bit sub-Gaussian sensing (e.g., 1-bit Bernoulli sensing), the reconstruction error is no longer guaranteed to decrease below a certain floor level, and this level is driven by the sparsity of x (i.e., it is high if x is very sparse). In fact, in the case of 1-bit Bernoulli sensing, there exist counterexamples of sparse signals that cannot be reconstructed, i.e., with constant reconstruction error when m increases, showing that the bound above is tight [40].

More recently, [33] has shown that, if Φ is a subsampled Gaussian circulant matrix in the binary observation model (6), PBP can reconstruct the direction of any sparse vector up to an error decaying polynomially in m (see [33, Thm 4.1]). Moreover, by adding a random dithering to the linear random measurements before their binarization, the same authors proved that a second-order cone program (SOCP) can fully estimate the vector x, i.e., not only its direction. For the same subsampled circulant sensing matrix, they also proved that their results extend to the dithered, uniformly quantized CS model expressed in (4). In fact, with high probability, and for all effectively sparse signals x, i.e., such that the ratio ∥x∥₁/∥x∥ is small, the same SOCP achieves a polynomially decaying reconstruction error when m increases. Their only requirements are that, first, the dithering is made of the addition of a Gaussian random vector with a uniform random vector adjusted to the quantization resolution, and second, that the norm of x is appropriately bounded [33, Thm 6.2].

In the same vein, [42, 18] have also shown that, for Gaussian random sensing matrices, adding an adaptive or random dithering to the compressive measurements of a signal before their binarization allows an accurate reconstruction of this signal, i.e., of both its norm and its direction, using either PBP or the same SOCP as in [33]. Additionally, for random observations altered by an adaptive dithering before their 1-bit quantization, i.e., in a process close to noise shaping or ΣΔ-quantization [11, 12], an appropriate reconstruction algorithm can achieve an exponential decay of its error in terms of the number of measurements. This is demonstrated, however, only in the case of Gaussian sensing matrices and for sparse signals.

#### QCS and other non-linear sensing models:

The (scalar) QCS model (4) can be seen as a special case of the more general non-linear sensing model y = f(Φx), with f a random non-linear function applied componentwise, i.e., such that y_i = f_i((Φx)_i) for some random functions f_i [36, 38]. In the QCS context defined in (4), this non-linear sensing model corresponds to setting f_i(t) = Q(t + ξ_i) with ξ_i ∼ U([0, δ]).

In [38], the authors proved that, for a Gaussian random matrix Φ and for a bounded, star-shaped set K (the set K is star-shaped if, for any u ∈ K and λ ∈ [0, 1], λu ∈ K), provided that a few low-order moments of the non-linearity applied to a standard Gaussian r.v. are finite, and provided that the associated centered corruption is sub-Gaussian with finite sub-Gaussian norm [35], one can estimate x with high probability from the solution of the PBP program given later in (59) (see [38, Thm 9.1]). In the specific case where f matches the QCS model (4), this analysis proves that, for a Gaussian random matrix Φ, the PBP of QCS observations estimates the direction of x with a polynomially decaying reconstruction error when m increases (the details of this analysis are given in App. B).

A similar result is obtained in [36] for the estimate provided by a K-Lasso program, which finds the element of K minimizing the ℓ₂-cost function ∥y − Φz∥², under hypotheses on the non-linear corruption similar to those above (i.e., finite low-order moments and sub-Gaussianity). Of interest for this work, [36] introduced a form of the (local) LPD (given in Sec. 1 and Sec. 3) in the case where the mapping is f(Φ·) and Φ is a Gaussian random matrix (with possibly unknown covariance between rows). The authors indeed analyzed when, for some scaling μ,

 (1/m)(⟨f(Φx),Φv⟩−⟨Φμx,Φv⟩) ≲ ϵ, ∀v∈D∗=D∩Bn, (7)

with D being the tangent cone of K at μx. An easy rewriting of [36, Proof of Thm 1.4] then essentially shows that the RIP of Φ over D combined with (7) controls the estimation error of K-Lasso. In particular, thanks to the Gaussianity of Φ, they prove that, with large probability, (7) holds (as implied by Markov's inequality combined with [36, Lem. 4.3]) with a distortion ϵ decaying with m and driven by the Gaussian mean width of D, which measures its intrinsic complexity (see Sec. 3.1). Correspondingly, if the functions f_i are selected to match the QCS model with a Gaussian sensing matrix Φ, this shows that K-Lasso achieves a non-uniform, polynomially decaying reconstruction error if the Gaussian mean width of the tangent cone can be bounded (e.g., for sparse or compressible signals, or for low-rank matrices). In other words, when instantiated to our specific QCS model, but only in the context of a Gaussian random matrix and with some restrictions on the norm of x, the non-uniform reconstruction error decays of PBP and K-Lasso in [38] and [36], respectively, are similar to the one achieved in our work (see Sec. 7).

More recently, in the context of ΣΔ and noise-shaping quantization, [34] has extended initial works restricted to the use of Gaussian random matrices [11] by proving that, if Φ is either a bounded orthogonal ensemble (e.g., a random partial Fourier matrix) or a partial circulant ensemble (also known as a subsampled circulant matrix), a convex program proposed in the paper achieves a uniform reconstruction of any x with an error decaying polynomially fast in m for the ΣΔ quantization, or exponentially fast in m for the noise-shaping quantization. However, the analysis of [34] is limited to the estimation of sparse signals, although the authors also characterize binary embeddings of finite and general low-complexity sets from the same quantization schemes.

## 3 Preliminaries

### 3.1 Low-complexity spaces

In this work, our ability to estimate a signal from the QCS model (4) is developed on the hypothesis that this signal belongs to a “low-complexity” set K ⊂ ℝⁿ. More precisely, we first suppose that, for any radius η > 0, the restriction of K to the unit ℓ₂-ball (our developments can be rescaled to accept any bounded low-complexity set, instead of considering K ∩ Bn) can be covered by a relatively small number of translated ℓ₂-balls of radius η. In other words, we assume that K ∩ Bn has a small Kolmogorov entropy compared to n [43], with, for any bounded set S,

 H(S,η) := log min{|G| : G ⊂ S ⊂ G + ηBn},

where the addition is the Minkowski sum between sets. Most of the time, e.g., if K is a low-dimensional subspace of ℝⁿ (or a finite union of such subspaces, as for the set of sparse vectors), or the set of low-rank matrices, H(K ∩ Bn, η) is well controlled by standard covering arguments [44]. In fact, as explained in Sec. 7 and summarized in [20, Table 1], for most of these sets, H(K ∩ Bn, η) is comparable, up to logarithmic factors in 1/η, to the squared Gaussian mean width w(K ∩ Bn)², where w(S) is the Gaussian mean width of a bounded set S [40, 45] defined by

 w(S) := E supu∈S |⟨g,u⟩|, g ∼ N(0,In).

Interestingly, such a control indeed holds for a large number of low-complexity sets K, such as those mentioned above. For instance, w(Σ^n_k ∩ Bn)² ≲ k log(2n/k), and the squared Gaussian mean width of the set of bounded, square rank-r matrices of size n × n is bounded, up to a constant, by rn (see, e.g., [46, 45] and [20, Table 1]).

When S does not belong to these easy cases, Sudakov minoration provides the (generally) looser bound H(S, η) ≲ w(S)²/η² [47, Thm 3.18]. The analysis of both the Kolmogorov entropy and the Gaussian mean width for dithered QCS will be further investigated in Sec. 7.
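For K = Σ^n_k, the supremum defining w(·) has a closed form, since the maximizing unit-norm k-sparse vector aligns with the k largest-magnitude entries of g; this makes a Monte Carlo sketch of w(Σ^n_k ∩ Bn)² straightforward (an illustration only):

```python
import numpy as np

def width_sq_sparse(n, k, trials, rng):
    """Monte Carlo estimate of w(Sigma_k^n ∩ B^n)^2: for each Gaussian g,
    the sup over unit-norm k-sparse u of <g, u> equals the l2-norm of the
    k largest-magnitude entries of g."""
    G = rng.standard_normal((trials, n))
    topk = np.sort(np.abs(G), axis=1)[:, -k:]
    return np.mean(np.linalg.norm(topk, axis=1)) ** 2

rng = np.random.default_rng(4)
print(width_sq_sparse(1000, 5, 2000, rng))  # grows like k log(n/k), up to constants
```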

Another implicit assumption we make on the set K, or on a suitably rescaled version of it (see Sec. 7), is that it is compatible with the RIP of Φ. In other words, given a distortion ϵ > 0, we assume Φ respects the RIP over K defined by

 ∣(1/m)∥Φu∥²−∥u∥²∣ ⩽ ϵ, ∀u∈K∩Bn. (8)

This assumption is backed up by a growing literature in the field of compressive sensing, and we will refer to it in many places. In particular, it is known that sensing matrices with i.i.d. centered sub-Gaussian random entries satisfy it w.h.p. provided m is large compared to the typical dimension of K, as measured by the squared Gaussian mean width of K ∩ Bn [7, 5, 3]. Note that in the case where K = K₀ is a cone, i.e., λu ∈ K₀ for all u ∈ K₀ and λ ⩾ 0, a simple rescaling argument provides the usual formulation of the RIP, i.e., (8) implies

 (1−ϵ)∥u∥² ⩽ (1/m)∥Φu∥² ⩽ (1+ϵ)∥u∥², ∀u∈K₀. (9)
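The conic formulation (9) is easy to visualize numerically for the k-sparse cone and a Gaussian Φ; note that sampling random sparse directions, as below, only illustrates the concentration and does not certify the RIP (which requires a supremum over the whole cone):

```python
import numpy as np

rng = np.random.default_rng(5)
n, k, m = 200, 4, 500
Phi = rng.standard_normal((m, n))

# Ratios ||Phi u||^2 / (m ||u||^2) over random k-sparse directions u:
# for m large enough, they concentrate around 1, as in (9).
ratios = []
for _ in range(500):
    u = np.zeros(n)
    u[rng.choice(n, size=k, replace=False)] = rng.standard_normal(k)
    ratios.append(np.linalg.norm(Phi @ u) ** 2 / (m * np.linalg.norm(u) ** 2))
print(min(ratios), max(ratios))
```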

### 3.2 Projected back projection

As announced in the Introduction, the standpoint of this work is to show the compatibility of a RIP matrix Φ with the dithered QCS model (4), provided that the dithering is random and uniform, through the possibility to estimate x via the projected back projection (PBP) onto K of the quantized observations y.

More generally, given the distorted CS model

 y=D(x),x∈K∩Bn, (10)

associated with a general (random) mapping D (e.g., D = A with A as in (4)), the PBP of y onto K is mathematically defined by

 ^x := PK((1/m)Φ⊤y), (11)

where PK is the (minimal distance) projector on K (note that there exist cases where PK could have several equivalent minimizers, e.g., if K is non-convex; if this happens, we just assume PK picks one of them, arbitrarily), i.e.,

 PK(z) ∈ argminu∈K∥z−u∥.

Throughout this work, we assume that PK can be computed efficiently, i.e., in polynomial complexity with respect to m and n. For instance, if K = Σ^n_k, PK is the standard best k-term hard thresholding operator, and if K is convex and bounded, PK is the orthogonal projection onto this set.
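For the two examples just given, PK is explicit; a minimal sketch (the k-sparse case and, as a simple convex instance, an ℓ₂-ball):

```python
import numpy as np

def proj_sparse(z, k):
    """P_K for K = set of k-sparse vectors: best k-term hard thresholding."""
    out = np.zeros_like(z)
    keep = np.argsort(np.abs(z))[-k:]
    out[keep] = z[keep]
    return out

def proj_l2_ball(z, radius=1.0):
    """P_K for the (convex, bounded) l2-ball of a given radius."""
    nrm = np.linalg.norm(z)
    return z if nrm <= radius else (radius / nrm) * z

z = np.array([0.5, -2.0, 1.0, 0.1])
print(proj_sparse(z, 2))    # keeps the entries -2.0 and 1.0
print(proj_l2_ball(z))      # z rescaled to unit norm
```

Both projectors run in near-linear time (up to the sort), in line with the polynomial-complexity assumption above.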

###### Remark 3.1.

Hereafter, the analysis of the PBP method is divided into two categories: uniform estimation, where, with high probability on the generation of the mapping (e.g., through the random dithering in the case where D = A in (4)), all signals in the set K can be estimated using the same mapping; and non-uniform estimation (or fixed signal estimation), where the mapping is randomly generated for each observed signal.

### 3.3 Limited projection distortion

We already sketched at the end of the Introduction that a crucial element of our analysis is the combination of the RIP of Φ with another property jointly verified by Φ and the considered mapping, e.g., the quantized random mapping A defined in (4). As will be made clear later, this property, the (local) limited projection distortion (or (L)LPD), together with the RIP, allows us to bound the reconstruction error of the PBP. We define it as follows for a general mapping D.

###### Definition 3.2 (Limited projection distortion).

Given a matrix Φ ∈ ℝ^{m×n} and a distortion ν > 0, we say that a general mapping D respects the limited projection distortion property over a set K observed by Φ, or LPD, if

 (1/m)|⟨D(u),Φv⟩−⟨Φu,Φv⟩| ⩽ ν, ∀u,v∈K∩Bn. (12)

In particular, when u is fixed in (12), we say that D respects the local limited projection distortion on u, or L-LPD.

As explained in Sec. 2, the LPD property was (implicitly) introduced in [36] in the special case where D(u) = f(Φu), with f a non-linear function applied componentwise on the image of Φ. The LPD is also connected to the SPE introduced in [40] for the specific case of a 1-bit sign quantizer, if we combine the LPD property with the RIP of Φ in order to approximate (1/m)⟨Φu,Φv⟩ by ⟨u,v⟩ in (12) (see Lemma 3.4). This literature was, however, restricted to the analysis of Gaussian random matrices.

###### Remark 3.3.

In the case where is the quantized random mapping introduced in (4), if is a random uniform dithering, an arbitrary low-distortion is expected in (12) for large values of since, in expectation, from Lemma A.1 (see Sec. 5 and Sec. 6). Note also that for such a random dithering, if tends to 0, then the quantizer tends to the identity operator and must vanish. In fact, by Cauchy-Schwarz and the triangular inequality, this is sustained by the deterministic bound

 |⟨A(u), Φv⟩ − ⟨Φu, Φv⟩| = |⟨A(u) − Φu, Φv⟩| ⩽ ∥Φv∥ (∥A(u) − (Φu + ξ)∥ + ∥ξ∥) ⩽ 2δ√m ∥Φv∥.

As developed in Sec. 5, it is easy to prove the L-LPD of D if this mapping is the linear mapping Φ corrupted by an additive noise composed of i.i.d. sub-Gaussian random components, i.e., D(u) = Φu + w with w_i i.i.d. sub-Gaussian. Standard tools of measure concentration theory then show that (1/m)⟨D(u) − Φu, Φv⟩ concentrates, w.h.p., around 0. As will be clear later, such a scenario includes the case D = A since, given a fixed u, the i.i.d. r.v.'s (A(u) − Φu)_i are bounded and thus sub-Gaussian. However, proving the uniform LPD property of A requires us to probe its geometrical nature. We need in particular to control the impact of the discontinuities introduced by the quantizer in A (see Sec. 6).
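The concentration argument above can be illustrated numerically. In the sketch below (our illustration, assuming a Gaussian Φ, a midrise quantizer δ⌊·/δ⌋ + δ/2, and a symmetric uniform dithering), u and v are fixed and the deviation (1/m)|⟨A(u) − Φu, Φv⟩| is averaged over independent draws of Φ and the dithering:

```python
import numpy as np

rng = np.random.default_rng(1)
n, delta = 50, 0.5

# Fixed unit-norm signals u and v.
u = rng.standard_normal(n); u /= np.linalg.norm(u)
v = rng.standard_normal(n); v /= np.linalg.norm(v)

def lpd_deviation(m):
    """(1/m)|<A(u) - Phi u, Phi v>| for fresh draws of Phi and the dithering."""
    Phi = rng.standard_normal((m, n))
    xi = rng.uniform(-delta / 2, delta / 2, size=m)
    Au = delta * np.floor((Phi @ u + xi) / delta) + delta / 2
    return abs(np.dot(Au - Phi @ u, Phi @ v)) / m

for m in (100, 1000, 10000):
    devs = [lpd_deviation(m) for _ in range(50)]
    print(f"m = {m:5d}: average deviation = {np.mean(devs):.4f}")
```

The average deviation is observed to decay roughly like m^(−1/2), as expected from sub-Gaussian concentration.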

We observe in (12) that the (L)LPD characterizes the proximity of scalar products between distorted and undistorted random observations in the compressed domain ℝᵐ. In order to assess how (1/m)⟨Φu, Φv⟩ approximates ⟨u, v⟩ in the case where Φ respects the RIP, we can consider this standard lemma from the CS literature (see, e.g., [3]).

###### Lemma 3.4.

Given two symmetric subsets K₁, K₂ ⊂ ℝⁿ, if Φ is RIP(½(K₁ + K₂), ϵ), then

 |(1/m)⟨Φu, Φv⟩ − ⟨u, v⟩| ⩽ 2ϵ,  ∀u ∈ K₁ ∩ Bⁿ, ∀v ∈ K₂ ∩ Bⁿ. (13)

In particular, if K₁ and K₂ are two cones, we have

 |(1/m)⟨Φu, Φv⟩ − ⟨u, v⟩| ⩽ ϵ∥u∥∥v∥,  ∀u ∈ K₁, ∀v ∈ K₂. (14)
###### Proof.

Note that since K₁ and K₂ are symmetric, ½(K₁ − K₂) = ½(K₁ + K₂). Given u ∈ K₁ ∩ Bⁿ and v ∈ K₂ ∩ Bⁿ, if Φ is RIP(½(K₁ + K₂), ϵ), then, from the polarization identity, the fact that ∥(u ± v)/2∥ ⩽ 1, and from (8),

 (1/m)⟨Φu, Φv⟩ = (1/m)∥Φ((u+v)/2)∥² − (1/m)∥Φ((u−v)/2)∥² ⩽ ¼∥u+v∥² − ¼∥u−v∥² + 2ϵ = ⟨u, v⟩ + 2ϵ.

The lower bound is obtained similarly. If K₁ and K₂ are also conic, then ½(K₁ + K₂) is conic, and (9) provides, for all unit-norm u ∈ K₁ and v ∈ K₂,

 (1/m)⟨Φu, Φv⟩ = (1/m)∥Φ((u+v)/2)∥² − (1/m)∥Φ((u−v)/2)∥² ⩽ ¼∥u+v∥² − ¼∥u−v∥² + ¼ϵ(∥u+v∥² + ∥u−v∥²) = ⟨u, v⟩ + ϵ,

with a similar development for the lower bound. A simple rescaling argument provides (14). ∎
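As a quick numerical sanity check of Lemma 3.4 (ours, not part of the original text), one can draw a Gaussian Φ, which is RIP w.h.p. for sparse models, and verify that (1/m)⟨Φu, Φv⟩ stays close to ⟨u, v⟩ over random sparse pairs; the set of s-sparse vectors is a typical conic choice for K₁ = K₂:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, s = 5000, 200, 5

Phi = rng.standard_normal((m, n))  # Gaussian matrices are RIP w.h.p.

def sparse_unit(n, s):
    """Random unit-norm s-sparse vector."""
    w = np.zeros(n)
    w[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
    return w / np.linalg.norm(w)

# Largest deviation |(1/m)<Phi u, Phi v> - <u, v>| over random sparse pairs,
# an empirical proxy for the distortion in (13)-(14).
dev = 0.0
for _ in range(300):
    u, v = sparse_unit(n, s), sparse_unit(n, s)
    dev = max(dev, abs(np.dot(Phi @ u, Phi @ v) / m - np.dot(u, v)))

print(f"max deviation over 300 sparse pairs: {dev:.4f}")
```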

Therefore, applying the triangle inequality, it is easy to verify the following corollary.

###### Corollary 3.5.

Given ν > 0, two symmetric subsets K₁, K₂ ⊂ ℝⁿ, and ϵ > 0. If Φ respects the RIP(½(K₁ + K₂), ϵ) and D verifies the LPD with distortion ν over these sets, then

 |(1/m)⟨D(u), Φv⟩ − ⟨u, v⟩| ⩽ 2ϵ + ν,  ∀u ∈ K₁ ∩ Bⁿ, ∀v ∈ K₂ ∩ Bⁿ. (15)

The same observation holds when u is fixed if the L-LPD is invoked instead of the LPD. Note also that the bound reduces to ϵ + ν if K₁ and K₂ are conic.

Note that we recover Lemma 3.4 if D is identified with the linear mapping u ↦ Φu, for which the LPD holds with ν = 0.

## 4 PBP reconstruction error in distorted CS

In this section, we provide a general analysis of the reconstruction error of the estimate provided by the PBP of the general distorted CS model (10). This is achieved in the context where D is only assumed to respect the (L)LPD property, which, in a certain sense, characterizes the proximity of this (possibly non-linear) mapping with a RIP matrix Φ.

Note that the results of this general study will be applied to the quantized random mapping A introduced in (4) (as explained in Sec. 5 and Sec. 6), but they could potentially concern other distorted sensing models, provided that the associated mapping meets the (L)LPD property.

Hereafter, we analyze the cases where the low-complexity signal set K is a union of low-dimensional subspaces, the set of low-rank matrices, or a convex subset of ℝⁿ. Sec. 7 will later specialize these general results to the case where D is the quantized random mapping A introduced in (4).
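To fix ideas before the formal analysis, here is a minimal PBP sketch (our illustration) for the special case where K is the set of s-sparse vectors: the back projection (1/m)Φ⊤y is projected onto K by hard thresholding. The quantizer conventions (midrise with symmetric uniform dithering) are an assumption of this sketch.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, s, delta = 4000, 100, 5, 0.25

Phi = rng.standard_normal((m, n))
xi = rng.uniform(-delta / 2, delta / 2, size=m)  # uniform dithering

# Unit-norm s-sparse signal x.
x = np.zeros(n)
x[rng.choice(n, size=s, replace=False)] = rng.standard_normal(s)
x /= np.linalg.norm(x)

# Quantized, dithered compressive observations y = A(x).
y = delta * np.floor((Phi @ x + xi) / delta) + delta / 2

# PBP: back projection, then projection onto the s-sparse set
# (hard thresholding keeps the s largest entries in magnitude).
bp = Phi.T @ y / m
x_hat = np.zeros(n)
keep = np.argsort(np.abs(bp))[-s:]
x_hat[keep] = bp[keep]

print(f"PBP reconstruction error: {np.linalg.norm(x - x_hat):.3f}")
```

Increasing m makes this error decay, which is the behavior quantified by Theorem 4.1 for general ULS models.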

### 4.1 Union of low-dimensional subspaces

We first study the reconstruction error of PBP in the estimation of vectors in a union of low-dimensional subspaces K ⊂ ℝⁿ, also known as a ULS model. This model encompasses, e.g., sparse signals in an orthonormal basis or in a dictionary [48, 45], co-sparse signal models [49], group-sparse signals [27], and model-based sparsity [26].

The next theorem states that the PBP reconstruction error is bounded by the sum of the distortion induced by the RIP of Φ (as in CS) and the one provided by the (L)LPD of D.

###### Theorem 4.1 (PBP for ULS).

Let us consider the ULS model K ⊂ ℝⁿ. Given two distortions ϵ, ν > 0, if Φ respects the RIP and if the mapping D satisfies the LPD, then, for all x ∈ K ∩ Bⁿ, the estimate x̂ obtained by the PBP of D(x) onto K satisfies

 ∥x − x̂∥ ⩽ 2(ϵ + ν).

Moreover, if x is fixed, then the same result holds if D respects the L-LPD.

###### Proof.

The proof generalizes the proof sketch given at the end of the Introduction for the reconstruction error of PBP in the case of sparse vectors. Since x ∈ K and x̂ ∈ K, there must exist two subspaces in the union, for some