# Cramér-Rao-Type Bounds for Sparse Bayesian Learning

In this paper, we derive Hybrid, Bayesian and Marginalized Cramér-Rao lower bounds (HCRB, BCRB and MCRB) for the single and multiple measurement vector Sparse Bayesian Learning (SBL) problem of estimating compressible vectors and their prior distribution parameters. We assume the unknown vector to be drawn from a compressible Student-t prior distribution. We derive CRBs that encompass the deterministic or random nature of the unknown parameters of the prior distribution and the regression noise variance. We extend the MCRB to the case where the compressible vector is distributed according to a general compressible prior distribution, of which the generalized Pareto distribution is a special case. We use the derived bounds to uncover the relationship between the compressibility and Mean Square Error (MSE) in the estimates. Further, we illustrate the tightness and utility of the bounds through simulations, by comparing them with the MSE performance of two popular SBL-based estimators. It is found that the MCRB is generally the tightest among the bounds derived and that the MSE performance of the Expectation-Maximization (EM) algorithm coincides with the MCRB for the compressible vector. Through simulations, we demonstrate the dependence of the MSE performance of SBL based estimators on the compressibility of the vector for several values of the number of observations and at different signal powers.

## I Introduction

Recent results in the theory of compressed sensing have generated immense interest in sparse vector estimation problems, resulting in a multitude of successful practical signal recovery algorithms. In several applications, such as the processing of natural images, audio, and speech, signals are not exactly sparse, but compressible, i.e., the magnitudes of the sorted coefficients of the vector follow a power law decay [1]. In [2] and [3], the authors show that random vectors drawn from a special class of probability density functions (pdf) known as compressible priors result in compressible vectors. Assuming that the vector to be estimated (henceforth referred to as the unknown vector) has a compressible prior distribution enables one to formulate the compressible vector recovery problem in the Bayesian framework, thus allowing the use of Sparse Bayesian Learning (SBL) techniques [4]. In his seminal work, Tipping proposed an SBL algorithm for estimating the unknown vector, based on the Expectation Maximization (EM) and MacKay updates [4]. Since these update rules are known to be slow, fast update techniques are proposed in [5]. A duality based algorithm for solving the SBL cost function is proposed in [6], and reweighting schemes are explored in [7]. Such algorithms have been successfully employed for image/visual tracking [8], neuro-imaging [9, 10], beamforming [11], and joint channel estimation and data detection for OFDM systems [12].

Many of the aforementioned papers study the complexity, convergence and support recovery properties of SBL based estimators (e.g., [5, 6]). In [3], the general conditions required for the so-called instance optimality of such estimators are derived. However, it is not known whether these recovery algorithms are optimal in terms of the Mean Square Error (MSE) in the estimate, or by how much their performance can be improved. In the context of estimating sparse signals, Cramér-Rao lower bounds on the MSE performance are derived in [13, 14, 15]. However, to the best of our knowledge, none of the existing works provide a lower bound on the MSE performance of compressible vector estimation. Such bounds are necessary, as they provide absolute yardsticks for comparative analysis of estimators, and may also be used as a criterion for minimization of MSE in certain problems [16]. In this paper, we close this gap in theory by providing Cramér-Rao type lower bounds on the MSE performance of estimators in the SBL framework.

As our starting point, we consider a linear Single Measurement Vector (SMV) SBL model given by

$$\mathbf{y} = \mathbf{\Phi}\mathbf{x} + \mathbf{n}, \tag{1}$$

where the observations $\mathbf{y} \in \mathbb{R}^{N}$ and the measurement matrix $\mathbf{\Phi} \in \mathbb{R}^{N \times L}$ are known, and $\mathbf{x} \in \mathbb{R}^{L}$ is the unknown sparse/compressible vector to be estimated [17]. Each component of the additive noise $\mathbf{n} \in \mathbb{R}^{N}$ is white Gaussian, distributed as $\mathcal{N}(0, \sigma^2)$, where the variance $\sigma^2$ may be known or unknown. The SMV-SBL system model in (1) can be generalized to a linear Multiple Measurement Vector (MMV) SBL model given by

$$\mathbf{T} = \mathbf{\Phi}\mathbf{W} + \mathbf{V}. \tag{2}$$

Here, $\mathbf{T} \in \mathbb{R}^{N \times M}$ represents the $M$ observation vectors, the columns of $\mathbf{W} \in \mathbb{R}^{L \times M}$ are the $M$ sparse/compressible vectors with a common underlying distribution, and each column of $\mathbf{V}$ is modeled similar to $\mathbf{n}$ in (1) [18].

In typical compressible vector estimation problems, $\mathbf{\Phi}$ is underdetermined ($N < L$), rendering the problem ill-posed. Bayesian techniques circumvent this problem by using a prior distribution on the compressible vector as a regularization, and computing the corresponding posterior estimate. To incorporate a compressible prior in (1) and (2), SBL uses a two-stage hierarchical model on the unknown vector, as shown in Fig. 1. Here, $\mathbf{x} \sim \mathcal{N}(\mathbf{0}, \mathbf{\Upsilon})$, where the diagonal matrix $\mathbf{\Upsilon}$ contains the hyperparameters $\boldsymbol{\gamma} = [\gamma_1, \ldots, \gamma_L]^T$ as its diagonal elements. Further, an Inverse Gamma (IG) hyperprior is assumed for $\boldsymbol{\gamma}$ itself, because it leads to a Student-$t$ prior on the vector $\mathbf{x}$, which is known to be compressible [4]. (The IG hyperprior is conjugate to the Gaussian pdf [4].) In scenarios where the noise variance is unknown and random, an IG prior is used for the distribution of the noise variance as well. For the system model in (2), every compressible vector $\mathbf{w}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{\Upsilon})$, i.e., the $M$ compressible vectors are governed by a common $\mathbf{\Upsilon}$.
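As an illustration of the two-stage hierarchical model, the following minimal sketch draws one realization of (1). The function name `sample_smv_sbl`, the choice of a random sign measurement matrix, and the IG parameterization with shape $\nu/2$ and scale $\nu/(2\lambda)$ are assumptions made for this example, not a prescription from the text.

```python
import numpy as np

def sample_smv_sbl(N, L, nu, lam, sigma2, rng):
    """Draw one realization (y, Phi, x, gamma) of the hierarchical model."""
    # Assumed measurement matrix: i.i.d. +/-1 entries (illustrative choice).
    Phi = rng.choice([-1.0, 1.0], size=(N, L))
    # gamma_i ~ IG(nu/2, nu/(2*lam)): the scale divided by a unit-scale Gamma draw.
    gamma = (nu / (2.0 * lam)) / rng.gamma(nu / 2.0, 1.0, size=L)
    # x_i | gamma_i ~ N(0, gamma_i); marginally, x_i is Student-t distributed.
    x = rng.normal(0.0, np.sqrt(gamma))
    # White Gaussian noise with variance sigma2, then the linear model (1).
    y = Phi @ x + rng.normal(0.0, np.sqrt(sigma2), size=N)
    return y, Phi, x, gamma

rng = np.random.default_rng(0)
y, Phi, x, gamma = sample_smv_sbl(N=20, L=50, nu=3.0, lam=2.0, sigma2=0.1, rng=rng)
```

Sorting `np.abs(x)` for a large `L` gives a quick visual check of the decay of the coefficient magnitudes.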

It is well known that the Cramér-Rao Lower Bound (CRLB) provides a fundamental limit on the MSE performance of unbiased estimators [19] for deterministic parameter estimation. For the estimation problem in SBL, an analogous bound known as the Bayesian Cramér-Rao Bound (BCRB) is used to obtain lower bounds [20], by incorporating the prior distribution on the unknown vector. If the unknown vector consists of both deterministic and random components, Hybrid Cramér-Rao Bounds (HCRB) are derived [21].

In SBL, the unknown vector estimation problem can also be viewed as a problem involving nuisance parameters. Since the assumed hyperpriors are conjugate to the Gaussian likelihood, the marginalized distributions have a closed form and the Marginalized Cramér-Rao Bounds (MCRB) [22] can be derived. For example, in the SBL hyperparameter estimation problem, $\mathbf{x}$ itself can be considered a nuisance variable and marginalized from the joint distribution, $p_{Y,X|\Gamma}(\mathbf{y}, \mathbf{x}|\boldsymbol{\gamma})$, to obtain the log likelihood as

$$\log \int_{\mathbf{x}} p_{Y,X|\Gamma}(\mathbf{y}, \mathbf{x}|\boldsymbol{\gamma})\, d\mathbf{x} = -\frac{\log|\mathbf{\Sigma}_y| + \mathbf{y}^T \mathbf{\Sigma}_y^{-1} \mathbf{y}}{2}, \tag{3}$$

where $\mathbf{\Sigma}_y = \sigma^2 \mathbf{I}_{N \times N} + \mathbf{\Phi}\mathbf{\Upsilon}\mathbf{\Phi}^T$ [23].
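The marginalized log likelihood in (3) is simple to evaluate numerically. The sketch below is one way to do so; the function name `marginal_loglik` is hypothetical, and the additive constant that does not depend on $\boldsymbol{\gamma}$ is dropped, as in (3).

```python
import numpy as np

def marginal_loglik(y, Phi, gamma, sigma2):
    """Evaluate the right-hand side of (3) for given hyperparameters gamma."""
    N = len(y)
    # Sigma_y = sigma2 * I + Phi * diag(gamma) * Phi^T
    Sigma_y = sigma2 * np.eye(N) + (Phi * gamma) @ Phi.T
    # slogdet is numerically safer than log(det(.)) for larger N.
    _, logdet = np.linalg.slogdet(Sigma_y)
    quad = y @ np.linalg.solve(Sigma_y, y)
    return -0.5 * (logdet + quad)

# Scalar sanity case: Sigma_y = 1 + 2^2 * 3 = 13 and y^2 = 4.
val = marginal_loglik(np.array([2.0]), np.array([[2.0, 0.0]]),
                      np.array([3.0, 5.0]), 1.0)
```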

The goal of this paper is to derive Cramér-Rao type lower bounds on the MSE performance of estimators based on the SBL framework. Our contributions are as follows:

• Under the assumption of known noise variance, we derive the HCRB and the BCRB for the unknown vector $\boldsymbol{\theta} = [\mathbf{x}^T, \boldsymbol{\gamma}^T]^T$, as indicated in the left half of Fig. 2.

• When the noise variance is known, we marginalize nuisance variables ($\mathbf{x}$ or $\boldsymbol{\gamma}$) and derive the corresponding MCRB, as indicated in the right half of Fig. 2. Since the MCRB is a function of the parameters of the hyperprior (and hence is an offline bound), it yields insights into the relationship between the MSE performance of the estimators and the compressibility of $\mathbf{x}$.

• In the unknown noise variance case, we derive the BCRB, HCRB and MCRB for the unknown vector $\boldsymbol{\theta} = [\mathbf{x}^T, \boldsymbol{\gamma}^T, \xi]^T$, where $\xi$ denotes the noise variance, as indicated in Fig. 3.

• We derive the MCRB for a general parametric form of the compressible prior [3] and deduce lower bounds for two of the well-known compressible priors, namely, the Student-$t$ and generalized double Pareto distributions.

• Similar to the SMV-SBL case, we derive the BCRB, HCRB and MCRB for the MMV-SBL model in (2).

Through numerical simulations, we show that the MCRB on the compressible vector $\mathbf{x}$ is the tightest lower bound, and that the MSE performance of the EM algorithm achieves this bound at high SNR and as the number of observations $N$ increases. The techniques used to derive the bounds can be extended to handle different compressible prior pdfs used in the literature [2]. These results provide a convenient and easy-to-compute benchmark for comparing the performance of the existing estimators, and in some cases, for establishing their optimality in terms of the MSE performance.

The rest of this paper is organized as follows. In Sec. II, we provide the basic definitions and describe the problem set up. In Secs. III and IV, we derive the lower bounds for the cases shown in Figs. 2 and 3, respectively. The bounds are extended to the MMV-SBL signal model in Sec. V. The efficacy of the lower bounds is graphically illustrated through simulation results in Sec. VI. We provide some concluding remarks in Sec. VII. In the Appendix, we provide proofs for the Propositions and Theorems stated in the paper.

Notation: In the sequel, boldface small letters denote vectors and boldface capital letters denote matrices. The symbols $(\cdot)^T$ and $|\cdot|$ denote the transpose and determinant of a matrix, respectively. The empty set is represented by $\emptyset$, and $\Gamma(\cdot)$ denotes the Gamma function. The function $p_X(\mathbf{x})$ represents the pdf of the random variable $X$ evaluated at its realization $\mathbf{x}$. Also, $\mathrm{diag}(\mathbf{a})$ stands for a diagonal matrix with entries on the diagonal given by the vector $\mathbf{a}$. The symbol $\nabla_{\boldsymbol{\theta}}$ is the gradient with respect to (w.r.t.) the vector $\boldsymbol{\theta}$. The expectation w.r.t. a random variable $X$ is denoted as $E_X(\cdot)$. Also, $\mathbf{A} \succeq \mathbf{B}$ denotes that $\mathbf{A} - \mathbf{B}$ is positive semidefinite, and $\mathbf{A} \otimes \mathbf{B}$ is the Kronecker product of the two matrices $\mathbf{A}$ and $\mathbf{B}$.

## II Preliminaries

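As a precursor to the sections that follow, we define the MSE matrix and the Fisher Information Matrix (FIM) [19], and state the assumptions under which we derive the lower bounds in this paper. Consider a general estimation problem where the unknown vector $\boldsymbol{\theta}$ can be split into sub-vectors $\boldsymbol{\theta} = [\boldsymbol{\theta}_r^T, \boldsymbol{\theta}_d^T]^T$, where $\boldsymbol{\theta}_r$ consists of random parameters distributed according to a known pdf, and $\boldsymbol{\theta}_d$ consists of deterministic parameters. Let $\hat{\boldsymbol{\theta}}(\mathbf{y})$ denote the estimator of $\boldsymbol{\theta}$ as a function of the observations $\mathbf{y}$. The MSE matrix $\mathbf{E}^{\theta}$ is defined as

$$\mathbf{E}^{\theta} \triangleq E_{Y,\Theta_r}\left[(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}(\mathbf{y}))(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}}(\mathbf{y}))^T\right], \tag{4}$$

where $\Theta_r$ denotes the random parameters to be estimated, whose realization is given by $\boldsymbol{\theta}_r$. The first step in obtaining Cramér-Rao type lower bounds is to derive the FIM $\mathbf{I}^{\theta}$ [19]. Typically, $\mathbf{I}^{\theta}$ is expressed in terms of the individual blocks of submatrices, where the $(ij)^{\text{th}}$ block is given by

$$\mathbf{I}^{\theta}_{ij} \triangleq -E_{Y,\Theta_r}\left[\nabla_{\boldsymbol{\theta}_i} \nabla^T_{\boldsymbol{\theta}_j} \log p_{Y,\Theta_r;\Theta_d}(\mathbf{y}, \boldsymbol{\theta}_r; \boldsymbol{\theta}_d)\right]. \tag{5}$$

In this paper, we use the notation $\mathbf{I}^{\theta}$ to represent the FIM under the different modeling assumptions. For example, when $\boldsymbol{\theta}_r \neq \emptyset$ and $\boldsymbol{\theta}_d \neq \emptyset$, $\mathbf{I}^{\theta}$ represents a Hybrid Information Matrix (HIM). When $\boldsymbol{\theta}_r \neq \emptyset$ and $\boldsymbol{\theta}_d = \emptyset$, $\mathbf{I}^{\theta}$ represents a Bayesian Information Matrix (BIM). Assuming that the MSE matrix $\mathbf{E}^{\theta}$ exists and the FIM is non-singular, a lower bound on the MSE matrix is given by the inverse of the FIM:

$$\mathbf{E}^{\theta} \succeq (\mathbf{I}^{\theta})^{-1}. \tag{6}$$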

It is easy to verify that the underlying pdfs considered in the SBL model satisfy the regularity conditions required for computing the FIM (see Sec. 5.2.3 in [22]).

We conclude this section by making one useful observation about the FIM in the SBL problem. An assumption in the SMV-SBL framework is that $\mathbf{x}$ and $\mathbf{n}$ are independent of each other (for the MMV-SBL model, $\mathbf{W}$ and $\mathbf{V}$ are independent). This assumption is reflected in the graphical model in Fig. 1, where the compressible vector $\mathbf{x}$ (and its attribute $\boldsymbol{\gamma}$) and the noise component $\mathbf{n}$ (and its attribute $\xi$, the noise variance) are on unconnected branches. Due to this, a submatrix of the FIM is of the form

$$\mathbf{I}^{\theta}_{\gamma\xi} = -E_{X,Y,\Gamma,\Xi}\left[\nabla_{\boldsymbol{\gamma}} \nabla_{\xi} \left\{\log p_{Y|X,\Xi}(\mathbf{y}|\mathbf{x}, \xi) + \log p_{X,\Gamma}(\mathbf{x}, \boldsymbol{\gamma}) + \log p_{\Xi}(\xi)\right\}\right], \tag{7}$$

where there are no terms in which both $\boldsymbol{\gamma}$ and $\xi$ are jointly present. Hence, the corresponding terms in the above mentioned submatrix are always zero. This is formally stated in the following Lemma.

###### Lemma 1

When $\boldsymbol{\theta}_i = \boldsymbol{\gamma}$ and $\boldsymbol{\theta}_j = \xi$, the $(ij)^{\text{th}}$ block matrix of the FIM given by (5) simplifies to $\mathbf{I}^{\theta}_{ij} = \mathbf{0}_{L \times 1}$, i.e., to an all zero vector.

## III SMV-SBL: Lower Bounds when $\sigma^2$ is Known

In this section, we derive lower bounds for the system model in (1) for the scenarios in Fig. 2, where the unknown vector is $\boldsymbol{\theta} = [\mathbf{x}^T, \boldsymbol{\gamma}^T]^T$. We examine different modeling assumptions on $\boldsymbol{\gamma}$ and derive the corresponding lower bounds.

### III-A Bounds from the Joint pdf

#### III-A1 HCRB for $\boldsymbol{\theta} = [\mathbf{x}^T, \boldsymbol{\gamma}^T]^T$

In this subsection, we consider the unknown variables as a hybrid of a deterministic vector $\boldsymbol{\gamma}$ and a random vector $\mathbf{x}$ distributed according to a Gaussian distribution parameterized by $\boldsymbol{\gamma}$. Using the assumptions and notation in the previous section, we obtain the following proposition.

###### Proposition 1

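For the signal model in (1), the HCRB on the MSE matrix $\mathbf{E}^{\theta}$ of the unknown vector $\boldsymbol{\theta} = [\mathbf{x}^T, \boldsymbol{\gamma}^T]^T$ with the parameterized distribution of the compressible signal given by $\mathcal{N}(\mathbf{0}, \mathbf{\Upsilon})$, and with $\boldsymbol{\gamma}$ modeled as unknown and deterministic, is given by $\mathbf{E}^{\theta} \succeq (\mathbf{H}^{\theta})^{-1}$, where

$$\mathbf{H}^{\theta} \triangleq \begin{bmatrix} \mathbf{H}^{\theta}(\mathbf{x}) & \mathbf{H}^{\theta}(\mathbf{x}, \boldsymbol{\gamma}) \\ (\mathbf{H}^{\theta}(\mathbf{x}, \boldsymbol{\gamma}))^T & \mathbf{H}^{\theta}(\boldsymbol{\gamma}) \end{bmatrix} = \begin{bmatrix} \left(\dfrac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \mathbf{\Upsilon}^{-1}\right) & \mathbf{0}_{L \times L} \\ \mathbf{0}_{L \times L} & \mathrm{diag}(2\gamma_1^2, 2\gamma_2^2, \ldots, 2\gamma_L^2)^{-1} \end{bmatrix}. \tag{8}$$

Proof: See Appendix -A.

Note that the lower bound on the estimate of $\mathbf{x}$ depends on the prior information through the diagonal matrix $\mathbf{\Upsilon}$. In the SBL problem, the realization of the parameter $\boldsymbol{\gamma}$ has to be used to compute the bound above, and hence, it is referred to as an online bound. Also, the lower bound on the MSE matrix of $\mathbf{x}$ is $\left(\frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \mathbf{\Upsilon}^{-1}\right)^{-1}$, which is the same as the lower bound on the error covariance of the Bayesian vector estimator for a linear model (see Theorems 10.2 and 10.3 in [19]), and is achievable by the MMSE estimator when $\mathbf{\Upsilon}$ is known.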
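To make the online nature of the HCRB concrete, the sketch below assembles the matrix in (8) for a given realization of $\boldsymbol{\gamma}$ and inverts it. The function name `hcrb` and the random test matrices are assumptions for illustration only.

```python
import numpy as np

def hcrb(Phi, gamma, sigma2):
    """Inverse of the hybrid information matrix in (8), theta = [x; gamma]."""
    L = Phi.shape[1]
    H_x = Phi.T @ Phi / sigma2 + np.diag(1.0 / gamma)   # block for x
    H_g = np.diag(1.0 / (2.0 * gamma**2))               # block for gamma
    Z = np.zeros((L, L))
    return np.linalg.inv(np.block([[H_x, Z], [Z, H_g]]))

rng = np.random.default_rng(0)
Phi = rng.standard_normal((4, 6))
gamma = rng.uniform(0.5, 2.0, size=6)
bound = hcrb(Phi, gamma, sigma2=0.1)
# bound[:6, :6] bounds the MSE matrix of x; the last 6 diagonal entries equal 2*gamma**2.
```

By the Woodbury identity, the upper-left block coincides with the familiar linear MMSE error covariance $\mathbf{\Upsilon} - \mathbf{\Upsilon}\mathbf{\Phi}^T\mathbf{\Sigma}_y^{-1}\mathbf{\Phi}\mathbf{\Upsilon}$, which is a convenient numerical cross-check.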

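#### III-A2 BCRB for $\boldsymbol{\theta} = [\mathbf{x}^T, \boldsymbol{\gamma}^T]^T$

For deriving the BCRB, a hyperprior distribution is considered on $\boldsymbol{\gamma}$, and the resulting $\mathbf{x}$ is viewed as being drawn from a compressible prior distribution. The most commonly used hyperprior distribution in the literature is the IG distribution [4], where $\gamma_i$, $i = 1, \ldots, L$, are distributed as $\mathcal{IG}\left(\frac{\nu}{2}, \frac{\nu}{2\lambda}\right)$, given by

$$p_{\Gamma}(\gamma_i) \triangleq \frac{1}{\Gamma(\nu/2)} \left(\frac{\nu}{2\lambda}\right)^{\nu/2} \gamma_i^{-\left(\frac{\nu}{2}+1\right)} \exp\left\{-\frac{\nu}{2\lambda\gamma_i}\right\}, \tag{9}$$

where $\gamma_i \in (0, \infty)$ and $\nu, \lambda > 0$. Using the definitions and notation in the previous section, we state the following proposition.

###### Proposition 2

For the signal model in (1), the BCRB on the MSE matrix $\mathbf{E}^{\theta}$ of the unknown random vector $\boldsymbol{\theta} = [\mathbf{x}^T, \boldsymbol{\gamma}^T]^T$, where the conditional distribution of the compressible signal is $\mathbf{x}|\boldsymbol{\gamma} \sim \mathcal{N}(\mathbf{0}, \mathbf{\Upsilon})$, and the hyperprior distribution on $\boldsymbol{\gamma}$ is $\mathcal{IG}\left(\frac{\nu}{2}, \frac{\nu}{2\lambda}\right)$, is given by $\mathbf{E}^{\theta} \succeq (\mathbf{B}^{\theta})^{-1}$, where

$$\mathbf{B}^{\theta} \triangleq \begin{bmatrix} \mathbf{B}^{\theta}(\mathbf{x}) & \mathbf{B}^{\theta}(\mathbf{x}, \boldsymbol{\gamma}) \\ (\mathbf{B}^{\theta}(\mathbf{x}, \boldsymbol{\gamma}))^T & \mathbf{B}^{\theta}(\boldsymbol{\gamma}) \end{bmatrix} = \begin{bmatrix} \left(\dfrac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \lambda \mathbf{I}_{L \times L}\right) & \mathbf{0}_{L \times L} \\ \mathbf{0}_{L \times L} & \dfrac{\lambda^2(\nu+2)(\nu+7)}{2\nu} \mathbf{I}_{L \times L} \end{bmatrix}. \tag{10}$$

Proof: See Appendix -B.

It can be seen from $\mathbf{B}^{\theta}$ that the lower bound on the MSE of $\boldsymbol{\gamma}$ is a function of the parameters of the IG prior on $\boldsymbol{\gamma}$, i.e., a function of $\nu$ and $\lambda$, and it can be computed without knowledge of the realization of $\boldsymbol{\gamma}$. Thus, it is an offline bound.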
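Because the BCRB depends only on $\nu$, $\lambda$, $\sigma^2$ and $\mathbf{\Phi}$, it can be tabulated offline. The sketch below computes the two diagonal blocks of (10) and cross-checks the $\boldsymbol{\gamma}$ block against the inverse moments of an IG variable, $E[\gamma^{-k}] = \Gamma(\alpha+k)/(\Gamma(\alpha)\beta^k)$ for shape $\alpha$ and scale $\beta$. The helper names and the moment-based check are assumptions introduced for this example, not taken from the proof in the Appendix.

```python
import math
import numpy as np

def ig_inverse_moment(k, alpha, beta):
    """E[gamma^(-k)] for gamma ~ IG(alpha, beta), with shape alpha and scale beta."""
    return math.gamma(alpha + k) / (math.gamma(alpha) * beta**k)

def bcrb_blocks(Phi, sigma2, nu, lam):
    """Diagonal blocks of the Bayesian information matrix in (10)."""
    L = Phi.shape[1]
    B_x = Phi.T @ Phi / sigma2 + lam * np.eye(L)
    b_gamma = lam**2 * (nu + 2.0) * (nu + 7.0) / (2.0 * nu)  # scalar multiplying I
    return B_x, b_gamma

nu, lam = 3.0, 2.0
alpha, beta = nu / 2.0, nu / (2.0 * lam)
B_x, b_gamma = bcrb_blocks(np.eye(2), 1.0, nu, lam)
# Per-coordinate information on gamma, written via inverse moments of the hyperprior:
# (nu/lam) * E[gamma^-3] - ((nu+1)/2) * E[gamma^-2]
b_check = (nu / lam) * ig_inverse_moment(3, alpha, beta) \
          - 0.5 * (nu + 1.0) * ig_inverse_moment(2, alpha, beta)
```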

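### III-B Bounds from Marginalized Distributions

#### III-B1 MCRB for $\boldsymbol{\theta} = [\boldsymbol{\gamma}]$

Here, we derive the MCRB for $\boldsymbol{\theta} = [\boldsymbol{\gamma}]$, where $\boldsymbol{\gamma}$ is an unknown deterministic parameter. This requires the marginalized distribution $p_{Y;\gamma}(\mathbf{y}; \boldsymbol{\gamma})$, which is obtained by considering $\mathbf{x}$ as a nuisance variable and marginalizing it out of the joint distribution $p_{X,Y;\gamma}(\mathbf{x}, \mathbf{y}; \boldsymbol{\gamma})$, to obtain (3). Since $\boldsymbol{\gamma}$ is a deterministic parameter, the pdf $p_{Y;\gamma}(\mathbf{y}; \boldsymbol{\gamma})$ must satisfy the regularity condition in [19]. We have the following theorem.

###### Theorem 1

For the signal model in (1), the log likelihood function $\log p_{Y;\gamma}(\mathbf{y}; \boldsymbol{\gamma})$ satisfies the regularity conditions in [19]. Further, the MCRB on the MSE matrix $\mathbf{E}^{\gamma}$ of the unknown deterministic vector $\boldsymbol{\theta} = [\boldsymbol{\gamma}]$ is given by $\mathbf{E}^{\gamma} \succeq (\mathbf{M}^{\gamma})^{-1}$, where the $(ij)^{\text{th}}$ element of $\mathbf{M}^{\gamma}$ is given by

$$M^{\gamma}_{ij} = \frac{1}{2}\left(\mathbf{\Phi}_j^T \mathbf{\Sigma}_y^{-1} \mathbf{\Phi}_i\right)^2, \tag{11}$$

for $1 \le i, j \le L$, where $\mathbf{\Phi}_i$ is the $i^{\text{th}}$ column of $\mathbf{\Phi}$, and $\mathbf{\Sigma}_y = \sigma^2 \mathbf{I}_{N \times N} + \mathbf{\Phi}\mathbf{\Upsilon}\mathbf{\Phi}^T$, as defined earlier.

Proof: See Appendix -C.

To intuitively understand (11), we consider a special case of $\mathbf{\Phi}^T\mathbf{\Phi} = N\mathbf{I}_{L \times L}$, and use the Woodbury formula to simplify $\mathbf{\Sigma}_y^{-1}$, to obtain the $(ii)^{\text{th}}$ entry of the matrix $\mathbf{M}^{\gamma}$ as

$$M^{\gamma}_{ii} = \frac{1}{2}\left(\frac{\sigma^2}{N} + \gamma_i\right)^{-2}. \tag{12}$$

Hence, the error in $\gamma_i$ is bounded as $E^{\gamma}_{ii} \ge 2\left(\frac{\sigma^2}{N} + \gamma_i\right)^2$. As $N \to \infty$, the bound reduces to $2\gamma_i^2$, which is the same as the lower bound on the estimate of $\boldsymbol{\gamma}$ obtained as the lower-right submatrix in (8). For finite $N$, the MCRB is tighter than the HCRB.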
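The special case above can be checked numerically. The following sketch evaluates (11) for a matrix with orthogonal columns scaled so that $\mathbf{\Phi}^T\mathbf{\Phi} = N\mathbf{I}$, and compares the diagonal with $\frac{1}{2}(\sigma^2/N + \gamma_i)^{-2}$. The function name `mcrb_gamma_fim` and the QR-based construction of $\mathbf{\Phi}$ (which needs $N \ge L$) are assumptions made only for this check.

```python
import numpy as np

def mcrb_gamma_fim(Phi, gamma, sigma2):
    """Matrix M^gamma of (11): entries 0.5 * (Phi_j^T Sigma_y^{-1} Phi_i)^2."""
    N = Phi.shape[0]
    Sigma_y = sigma2 * np.eye(N) + (Phi * gamma) @ Phi.T
    G = Phi.T @ np.linalg.solve(Sigma_y, Phi)   # G[j, i] = Phi_j^T Sigma_y^{-1} Phi_i
    return 0.5 * G**2

rng = np.random.default_rng(0)
N, L, sigma2 = 8, 3, 0.2
Q, _ = np.linalg.qr(rng.standard_normal((N, L)))   # orthonormal columns
Phi = np.sqrt(N) * Q                               # so that Phi^T Phi = N * I
gamma = np.array([0.5, 1.0, 2.0])
M = mcrb_gamma_fim(Phi, gamma, sigma2)
mcrb_diag = 1.0 / np.diag(M)                       # equals 2 * (sigma2/N + gamma)^2
```

In this setting `mcrb_diag` exceeds the HCRB value `2 * gamma**2` by an amount that vanishes as `N` grows.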

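#### III-B2 MCRB for $\boldsymbol{\theta} = [\mathbf{x}]$

In this subsection, we assume a hyperprior on $\boldsymbol{\gamma}$, which leads to a joint distribution of $\mathbf{x}$ and $\boldsymbol{\gamma}$, from which $\boldsymbol{\gamma}$ can be marginalized. Further, assuming specific forms for the hyperprior distribution can lead to a compressible prior on $\mathbf{x}$. For example, assuming an IG hyperprior on $\boldsymbol{\gamma}$ leads to an $\mathbf{x}$ with a Student-$t$ distribution. Sampling from a Student-$t$ distribution with parameters $\nu$ and $\lambda$ results in a compressible $\mathbf{x}$ [2]. The Student-$t$ prior is given by

$$p_X(\mathbf{x}) \triangleq \left(\frac{\Gamma((\nu+1)/2)}{\Gamma(\nu/2)}\right)^{L} \left(\frac{\lambda}{\pi\nu}\right)^{L/2} \prod_{i=1}^{L} \left(1 + \frac{\lambda x_i^2}{\nu}\right)^{-\frac{\nu+1}{2}}, \tag{13}$$

where $x_i \in (-\infty, \infty)$, $\nu > 0$ represents the number of degrees of freedom and $\lambda > 0$ represents the inverse variance of the distribution. Using the notation developed so far, we state the following theorem.

###### Theorem 2

For the signal model in (1), the MCRB on the MSE matrix $\mathbf{E}^{x}$ of the unknown compressible random vector $\boldsymbol{\theta} = [\mathbf{x}]$ distributed as (13), is given by $\mathbf{E}^{x} \succeq (\mathbf{M}^{x})^{-1}$, where

$$\mathbf{M}^{x} = \frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \frac{\lambda(\nu+1)}{\nu+3} \mathbf{I}_{L \times L}. \tag{14}$$

Proof: See Appendix -D.

We see that the bound derived depends on the parameters of the Student-$t$ pdf. From [3], the prior is “somewhat” compressible only for a limited range of $\nu$, and (14) is nonnegative and bounded for every $\nu > 0$, i.e., the bound is meaningful in the range of $\nu$ used in practice. Note that, by choosing $\lambda$ to be large (or the variance of $\mathbf{x}$ to be small), the bound is dominated by the prior information, rather than the information from the observations, as expected in Bayesian bounds [19].

It is conjectured in [22] that, in general, the MCRB is tighter than the BCRB. Analytically comparing the MCRB (14) with the BCRB (10), we see that for the SBL problem of estimating a compressible vector, the MCRB is indeed tighter than the BCRB, since $\frac{\lambda(\nu+1)}{\nu+3} \le \lambda$ and hence

$$\left(\frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \frac{\lambda(\nu+1)}{\nu+3} \mathbf{I}_{L \times L}\right)^{-1} \succeq \left(\frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \lambda \mathbf{I}_{L \times L}\right)^{-1}.$$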
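The ordering between the two bounds on $\mathbf{x}$ can also be verified numerically. The sketch below forms both bound matrices for a random sign matrix and checks that their difference is positive semidefinite; the function names and the test dimensions are assumptions chosen for illustration.

```python
import numpy as np

def bcrb_x(Phi, sigma2, lam):
    """Bound on x from the upper-left block of (10)."""
    L = Phi.shape[1]
    return np.linalg.inv(Phi.T @ Phi / sigma2 + lam * np.eye(L))

def mcrb_x(Phi, sigma2, nu, lam):
    """Bound on x from (14), with the Student-t prior term lam*(nu+1)/(nu+3)."""
    L = Phi.shape[1]
    prior_info = lam * (nu + 1.0) / (nu + 3.0)
    return np.linalg.inv(Phi.T @ Phi / sigma2 + prior_info * np.eye(L))

rng = np.random.default_rng(0)
Phi = rng.choice([-1.0, 1.0], size=(10, 25))       # underdetermined, N < L
gap = mcrb_x(Phi, 0.1, nu=3.0, lam=2.0) - bcrb_x(Phi, 0.1, lam=2.0)
# The smallest eigenvalue of the symmetrized gap should be nonnegative (up to round-off).
min_eig = np.linalg.eigvalsh(0.5 * (gap + gap.T)).min()
```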

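The techniques used to derive the bounds in this subsection can be applied to any family of compressible distributions. In [3], the authors propose a parametric form of the Generalized Compressible Prior (GCP) and prove that such a prior is compressible for certain values of its parameters. In the following subsection, we derive the MCRB for the GCP.

### III-C General Marginalized Bounds

In this subsection, we derive MCRBs for the parametric form of the GCP. The GCP encompasses the double Pareto shrinkage type prior [24] and the Student-$t$ prior (13) as its special cases. We consider the GCP on $\mathbf{x}$ as follows

$$p_X(\mathbf{x}) \triangleq K^{L} \prod_{i=1}^{L} \left(1 + \frac{\lambda |x_i|^{\tau}}{\nu}\right)^{-(\nu+1)/\tau}, \tag{15}$$

where $\tau, \nu, \lambda > 0$, and the normalizing constant $K = \frac{\tau}{2}\left(\frac{\lambda}{\nu}\right)^{1/\tau} \frac{\Gamma((\nu+1)/\tau)}{\Gamma(1/\tau)\Gamma(\nu/\tau)}$. When $\tau = 2$, (15) reduces to the Student-$t$ prior in (13), and when $\tau = 1$, it reduces to a generalized double Pareto shrinkage prior [24, 25]. Also, the expression for the GCP in [3] can be obtained from (15) by a suitable choice and relabeling of the parameters. The following theorem provides the MCRB for the GCP.

###### Theorem 3

For the signal model in (1), the MCRB on the MSE matrix $\mathbf{E}^{x}_{\tau}$ of the unknown random vector $\boldsymbol{\theta} = [\mathbf{x}]$, where $\mathbf{x}$ is distributed as the GCP in (15), is given by $\mathbf{E}^{x}_{\tau} \succeq (\mathbf{M}^{\theta}_{\tau})^{-1}$, where

$$\mathbf{M}^{\theta}_{\tau} = \frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \mathbf{T}_{\tau}, \tag{16}$$

where $\mathbf{T}_{\tau}$ collects the contribution of the GCP prior.

Proof: See Appendix -E.

It is straightforward to verify that for $\tau = 2$, (16) reduces to the MCRB derived in (14) for the Student-$t$ distribution. For $\tau = 1$, the inverse of the MCRB can be reduced to

$$\mathbf{M}^{\theta}_{\tau} = \frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \frac{\lambda^2(\nu+1)^2}{\nu(\nu+2)} \mathbf{I}_{L \times L}. \tag{17}$$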

In Fig. 4, we plot the expression in (16). We observe that, in general, the bounds predict an increase in MSE for higher values of . Also, for given value of , the lower bounds at different signal to noise ratios (SNRs) converge as the value of increases, indicating that increasing renders the bound insensitive to the SNR. The lower bounds also predict a smaller value of MSE for a lower value of .
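The prior-dependent term in (16) can be explored numerically for any $\tau$. The sketch below estimates the per-coordinate Fisher information of the GCP in (15), $E[(\frac{d}{dx}\log p_X(x))^2]$, by trapezoidal integration on a logarithmic grid, and can be compared against the closed forms in (14) and (17) for $\tau = 2$ and $\tau = 1$. The function name, the grid limits, and the use of the score-squared form of the Fisher information are assumptions of this example rather than the derivation used in the Appendix.

```python
import numpy as np

def gcp_fisher_information(tau, nu, lam, n_grid=200001):
    """Numerically integrate E[(d/dx log p_X(x))^2] for one coordinate of (15)."""
    x = np.logspace(-8, 8, n_grid)                  # positive half-line; density is even
    base = 1.0 + lam * x**tau / nu
    q = base ** (-(nu + 1.0) / tau)                 # unnormalized density
    score = -(nu + 1.0) * (lam / nu) * x ** (tau - 1.0) / base
    dx = np.diff(x)

    def trap(f):
        # Trapezoidal rule on the non-uniform grid.
        return np.sum(0.5 * (f[1:] + f[:-1]) * dx)

    return trap(score**2 * q) / trap(q)             # normalizing constant cancels

nu, lam = 3.0, 2.0
info_student_t = gcp_fisher_information(2.0, nu, lam)   # compare with lam*(nu+1)/(nu+3)
info_pareto = gcp_fisher_information(1.0, nu, lam)      # compare with lam^2 (nu+1)^2 / (nu (nu+2))
```

Sweeping `tau`, `nu` and `lam` with this function is one way to reproduce the qualitative trends of the bound without a closed form.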

Thus far, we have presented the lower bounds on the MSE in estimating the unknown parameters of the SBL problem when the noise variance is known. In the next section, we extend the results to the case of unknown noise variance.

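## IV SMV-SBL: Lower Bounds when $\sigma^2$ is Unknown

Let us denote the unknown noise variance as $\xi = \sigma^2$. In the Bayesian formulation, the noise variance is associated with a prior, and since the IG prior is conjugate to the Gaussian likelihood $p_{Y|X,\Xi}(\mathbf{y}|\mathbf{x}, \xi)$, it is assumed that $\xi \sim \mathcal{IG}(c, d)$ [4], i.e., $\xi$ is distributed as

$$p_{\Xi}(\xi) \triangleq \frac{d^{c}}{\Gamma(c)} \xi^{(-c-1)} \exp\left\{-\frac{d}{\xi}\right\}; \quad \xi \in (0, \infty),\ c, d > 0. \tag{18}$$

Under this assumption, one can marginalize the unknown noise variance and obtain the likelihood as

$$p(\mathbf{y}|\mathbf{x}) \triangleq \int_{\xi=0}^{\infty} p(\mathbf{y}, \xi|\mathbf{x})\, d\xi = \frac{(2d)^{c}\, \Gamma\left(\frac{N}{2} + c\right)}{\Gamma(c)\, \pi^{N/2}} \left((\mathbf{y} - \mathbf{\Phi}\mathbf{x})^T(\mathbf{y} - \mathbf{\Phi}\mathbf{x}) + 2d\right)^{-\left(\frac{N}{2} + c\right)}, \tag{19}$$

which is a multivariate Student-$t$ distribution. It turns out that the straightforward approach of using the above multivariate likelihood to directly compute lower bounds for the various cases given in the previous section is analytically intractable, and that the lower bounds cannot be computed in closed form. Hence, we compute lower bounds from the joint pdf, i.e., we derive the HCRB and BCRBs for the unknown vector $\boldsymbol{\theta} = [\mathbf{x}^T, \boldsymbol{\gamma}^T, \xi]^T$ with the MSE matrix $\mathbf{E}^{\theta}_{\xi}$ defined by (4). (We use the subscript $\xi$ to indicate that the error matrices and bounds are obtained for the case of unknown noise variance.) Using the assumptions and notation from the previous sections, we obtain the following proposition.

###### Proposition 3

For the signal model in (1), the HCRB on the MSE matrix $\mathbf{E}^{\theta}_{\xi}$ of the unknown vector $\boldsymbol{\theta} = [\boldsymbol{\theta}'^T, \xi]^T$, where $\boldsymbol{\theta}' = [\mathbf{x}^T, \boldsymbol{\gamma}^T]^T$, with the distribution of the compressible vector given by $\mathcal{N}(\mathbf{0}, \mathbf{\Upsilon})$, where $\boldsymbol{\gamma}$ is modeled as a deterministic parameter or as a random parameter distributed as $\mathcal{IG}\left(\frac{\nu}{2}, \frac{\nu}{2\lambda}\right)$, and $\xi$ is modeled as a deterministic parameter, is given by $\mathbf{E}^{\theta}_{\xi} \succeq (\mathbf{H}^{\theta}_{\xi})^{-1}$, where

$$\mathbf{H}^{\theta}_{\xi} = \begin{bmatrix} \mathbf{H}^{\theta'} & \mathbf{0}_{2L \times 1} \\ \mathbf{0}_{1 \times 2L} & \dfrac{N}{2\xi^2} \end{bmatrix}. \tag{20}$$

In the above expression, with a slight abuse of notation, $\mathbf{H}^{\theta'}$ is the FIM given by (8) when $\boldsymbol{\gamma}$ is unknown deterministic and by (10) when $\boldsymbol{\gamma}$ is random.

Proof: See Appendix -F.

The lower bound on the estimation of $\xi$ matches the known lower bounds on noise variance estimation (see Sec. 3.5 in [19]). One disadvantage of such a bound on $\xi$ is that the knowledge of the noise variance is essential to compute the bound, and hence, it cannot be computed offline. Instead, assigning a hyperprior to $\xi$ would result in a lower bound that only depends on the parameters of the hyperprior, which are assumed to be known, allowing the bound to be computed offline. We state the following proposition in this context.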
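The bound $2\xi^2/N$ implied by the lower-right entry of (20) can be illustrated with a small Monte Carlo experiment. The sketch below is a simplified setting assumed for illustration: the observations contain noise only (the signal part is taken as known and removed), and the noise variance is estimated by the sample second moment.

```python
import numpy as np

def noise_variance_crb(N, xi):
    """Inverse of the lower-right entry N / (2 xi^2) in (20)."""
    return 2.0 * xi**2 / N

def ml_noise_variance_mse(N, xi, trials, rng):
    """Empirical MSE of the sample second-moment estimator of xi."""
    n = rng.normal(0.0, np.sqrt(xi), size=(trials, N))   # noise-only observations
    xi_hat = np.mean(n**2, axis=1)                       # ML estimate with known mean
    return np.mean((xi_hat - xi)**2)

rng = np.random.default_rng(0)
crb = noise_variance_crb(N=50, xi=0.5)
mse = ml_noise_variance_mse(N=50, xi=0.5, trials=40000, rng=rng)
# In this idealized setting the estimator is unbiased with variance 2*xi^2/N,
# so mse should be close to crb up to Monte Carlo error.
```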

###### Proposition 4

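For the signal model in (1), the HCRB on the MSE matrix $\mathbf{E}^{\theta}_{\xi}$ of the unknown vector $\boldsymbol{\theta} = [\boldsymbol{\theta}'^T, \xi]^T$, where $\boldsymbol{\theta}' = [\mathbf{x}^T, \boldsymbol{\gamma}^T]^T$, with the distribution of the vector $\mathbf{x}$ given by $\mathcal{N}(\mathbf{0}, \mathbf{\Upsilon})$, where $\boldsymbol{\gamma}$ is modeled as a deterministic parameter or as a random parameter distributed as $\mathcal{IG}\left(\frac{\nu}{2}, \frac{\nu}{2\lambda}\right)$, and with the random parameter $\xi$ distributed as $\mathcal{IG}(c, d)$, is given by $\mathbf{E}^{\theta}_{\xi} \succeq (\mathbf{H}^{\theta}_{\xi})^{-1}$, where

$$\mathbf{H}^{\theta}_{\xi} = \begin{bmatrix} \mathbf{H}^{\theta'} & \mathbf{0}_{2L \times 1} \\ \mathbf{0}_{1 \times 2L} & \dfrac{c(c+1)\left(\frac{N}{2} + c + 3\right)}{d^2} \end{bmatrix}. \tag{21}$$

In (21), $\mathbf{H}^{\theta'}$ is the FIM given in (8) when $\boldsymbol{\gamma}$ is unknown deterministic and by (10) when $\boldsymbol{\gamma}$ is random.

Proof: See Appendix -G.

In SBL problems, a non-informative prior on $\xi$ is typically preferred, i.e., the distribution of the noise variance is modeled to be as flat as possible. In [4], it was observed that a non-informative prior is obtained when $c, d \to 0$. However, as $c, d \to 0$, the bound in (21) is indeterminate. In Sec. VI, we illustrate the performance of the lower bound in (21) for practical values of $c$ and $d$.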
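The lower-right entry of (21) is easy to tabulate for candidate hyperprior parameters. The sketch below evaluates the closed form and cross-checks it against an expression written in terms of the inverse moments of $\xi \sim \mathcal{IG}(c, d)$. The helper names and the moment-based decomposition are assumptions of this example, not the proof in the Appendix.

```python
import math

def ig_inverse_moment(k, c, d):
    """E[xi^(-k)] for xi ~ IG(c, d), with shape c and scale d."""
    return math.gamma(c + k) / (math.gamma(c) * d**k)

def noise_info_closed_form(N, c, d):
    """Lower-right entry of (21)."""
    return c * (c + 1.0) * (N / 2.0 + c + 3.0) / d**2

def noise_info_from_moments(N, c, d):
    """Same quantity via (N/2 - c - 1) E[xi^-2] + 2 d E[xi^-3]."""
    return ((N / 2.0 - c - 1.0) * ig_inverse_moment(2, c, d)
            + 2.0 * d * ig_inverse_moment(3, c, d))

# Offline bound on the MSE of the noise variance for a few (c, d) pairs, N = 50.
for c, d in [(2.0, 3.0), (0.5, 0.5), (1e-2, 1e-2)]:
    print(c, d, 1.0 / noise_info_closed_form(50, c, d))
```

Evaluating the bound along a sequence with `c` and `d` shrinking together shows how strongly the value depends on the path taken toward zero.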

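### IV-A Marginalized Bounds

In this subsection, we obtain lower bounds on the MSE of the estimator $\hat{\boldsymbol{\theta}}(\mathbf{y})$, in the presence of nuisance variables in the joint distribution. To start with, we consider the marginalized distribution $p_{Y;\gamma,\xi}(\mathbf{y}; \boldsymbol{\gamma}, \xi)$, i.e., the case where both $\boldsymbol{\gamma}$ and $\xi$ are deterministic variables. Since the unknowns are deterministic, the regularity condition has to be satisfied for $\boldsymbol{\theta} = [\boldsymbol{\gamma}^T, \xi]^T$. We state the following theorem.

###### Theorem 4

For the signal model in (1), the log likelihood function $\log p_{Y;\gamma,\xi}(\mathbf{y}; \boldsymbol{\gamma}, \xi)$ satisfies the regularity condition [19]. Further, the MCRB on the MSE matrix $\mathbf{E}^{\theta}_{\xi}$ of the unknown deterministic vector $\boldsymbol{\theta} = [\boldsymbol{\gamma}^T, \xi]^T$ is given by $\mathbf{E}^{\theta}_{\xi} \succeq (\mathbf{M}^{\theta}_{\xi})^{-1}$, where

$$\mathbf{M}^{\theta}_{\xi} \triangleq \begin{bmatrix} \mathbf{M}^{\theta}_{\xi}(\boldsymbol{\gamma}) & \mathbf{M}^{\theta}_{\xi}(\boldsymbol{\gamma}, \xi) \\ \mathbf{M}^{\theta}_{\xi}(\xi, \boldsymbol{\gamma}) & \mathbf{M}^{\theta}_{\xi}(\xi) \end{bmatrix}, \tag{22}$$

where the $(ij)^{\text{th}}$ entry of the matrix $\mathbf{M}^{\theta}_{\xi}(\boldsymbol{\gamma})$ is given by $\frac{1}{2}\left(\mathbf{\Phi}_j^T \mathbf{\Sigma}_y^{-1} \mathbf{\Phi}_i\right)^2$, and $\mathbf{\Sigma}_y = \xi \mathbf{I}_{N \times N} + \mathbf{\Phi}\mathbf{\Upsilon}\mathbf{\Phi}^T$. Further, the $i^{\text{th}}$ entry of $\mathbf{M}^{\theta}_{\xi}(\boldsymbol{\gamma}, \xi)$ is $\frac{1}{2}\mathbf{\Phi}_i^T \mathbf{\Sigma}_y^{-2} \mathbf{\Phi}_i$, and $\mathbf{M}^{\theta}_{\xi}(\xi) = \frac{1}{2}\mathrm{tr}\left(\mathbf{\Sigma}_y^{-2}\right)$.

Proof: See Appendix -H.

Remark: From the graphical model in Fig. 1, it can be seen that the branches consisting of $\boldsymbol{\gamma}$ and $\xi$ are independent conditioned on $\mathbf{x}$. However, when $\mathbf{x}$ is marginalized, the nodes $\boldsymbol{\gamma}$ and $\xi$ are connected, and hence, Lemma 1 is no longer valid. Due to this, the lower bound on $\boldsymbol{\gamma}$ depends on $\xi$ and vice versa, i.e., $\mathbf{M}^{\theta}_{\xi}(\boldsymbol{\gamma})$ and $\mathbf{M}^{\theta}_{\xi}(\xi)$ depend on both $\boldsymbol{\gamma}$ and $\xi$ through $\mathbf{\Sigma}_y$.

Thus far, we have presented several bounds for the MSE performance of the estimators $\hat{\mathbf{x}}(\mathbf{y})$, $\hat{\boldsymbol{\gamma}}(\mathbf{y})$ and $\hat{\xi}(\mathbf{y})$ in the SMV-SBL framework. In the next section, we derive Cramér-Rao type lower bounds for the MMV-SBL signal model.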
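The coupling between $\boldsymbol{\gamma}$ and $\xi$ after marginalization can be seen numerically. The sketch below builds the information matrix for $[\boldsymbol{\gamma}^T, \xi]^T$ from the standard expression for a zero-mean Gaussian with parameterized covariance, $\frac{1}{2}\mathrm{tr}(\mathbf{\Sigma}_y^{-1}\partial_a\mathbf{\Sigma}_y\,\mathbf{\Sigma}_y^{-1}\partial_b\mathbf{\Sigma}_y)$, with $\partial\mathbf{\Sigma}_y/\partial\gamma_i = \mathbf{\Phi}_i\mathbf{\Phi}_i^T$ and $\partial\mathbf{\Sigma}_y/\partial\xi = \mathbf{I}$. Using this generic formula, and the function name `marginal_fim`, are assumptions of this example.

```python
import numpy as np

def marginal_fim(Phi, gamma, xi):
    """FIM of y ~ N(0, xi*I + Phi diag(gamma) Phi^T) w.r.t. [gamma; xi]."""
    N, L = Phi.shape
    Sigma_y = xi * np.eye(N) + (Phi * gamma) @ Phi.T
    Sinv = np.linalg.inv(Sigma_y)
    # Covariance derivatives: rank-one terms for each gamma_i, identity for xi.
    derivs = [np.outer(Phi[:, i], Phi[:, i]) for i in range(L)] + [np.eye(N)]
    M = np.empty((L + 1, L + 1))
    for a, Da in enumerate(derivs):
        for b, Db in enumerate(derivs):
            M[a, b] = 0.5 * np.trace(Sinv @ Da @ Sinv @ Db)
    return M

rng = np.random.default_rng(0)
N, L, xi = 5, 7, 0.3
Phi = rng.standard_normal((N, L))
gamma = rng.uniform(0.5, 2.0, size=L)
M = marginal_fim(Phi, gamma, xi)
cross_terms = M[:L, L]      # nonzero: gamma and xi are coupled once x is marginalized
```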

## V Lower Bounds for the MMV-SBL

In this section, we provide Cramér-Rao type lower bounds for the estimation of unknown parameters in the MMV-SBL model given in (2). We consider the estimation of the compressible vector $\mathbf{w}$ from the vector of observations $\mathbf{t}$, which contain the stacked columns of $\mathbf{W}$ and $\mathbf{T}$, respectively. In the MMV-SBL model, each column of $\mathbf{W}$ is distributed as $\mathbf{w}_i \sim \mathcal{N}(\mathbf{0}, \mathbf{\Upsilon})$, for $i = 1, \ldots, M$, and the likelihood is given by $\mathcal{N}\left((\mathbf{I}_{M \times M} \otimes \mathbf{\Phi})\mathbf{w}, \sigma^2 \mathbf{I}_{NM \times NM}\right)$. The modeling assumptions on $\boldsymbol{\gamma}$ and $\xi$ are the same as in the SMV-SBL case, given by (9) and (18), respectively [18].

Using the notation developed in Sec. II, we derive the bounds for the MMV-SBL case similar to the SMV-SBL cases considered in Secs. III and IV. Since the derivation of these bounds follows along the same lines as in the previous sections, we simply state the results in Table I.

We see that the lower bounds on and are reduced by a factor of compared to the SMV case, which is intuitively satisfying. It turns out that it is not possible to obtain the MCRB on in the MMV-SBL setting, since closed form expressions for the FIM are not available.

In the next section, we consider two popular algorithms for SBL and graphically illustrate the utility of the lower bounds.

## VI Simulations and Discussion

The vector estimation problem in the SBL framework typically involves the joint estimation of the hyperparameter $\boldsymbol{\gamma}$ and the unknown compressible vector $\mathbf{x}$. Since the hyperparameter estimation problem cannot be solved in closed form, iterative estimators are employed [4]. In this section, we consider the iterative updates based on the EM algorithm first proposed in [4]. We also consider the algorithm proposed in [6] based on the Automatic Relevance Determination (ARD) framework. We plot the MSE performance in estimating $\mathbf{x}$, $\boldsymbol{\gamma}$ and $\xi$ with the linear model in (1) and (2), for the EM algorithm, labeled EM, and the ARD based reweighted algorithm, labeled ARD-SBL. We compare the performance of the estimators against the derived lower bounds.
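For readers who want to reproduce the EM curve, the following is a minimal sketch of the commonly used EM iteration for SBL with known noise variance and deterministic $\boldsymbol{\gamma}$: the E-step computes the Gaussian posterior of $\mathbf{x}$, and the M-step sets $\gamma_i$ to the posterior second moment of $x_i$. The function name `em_sbl`, the all-ones initialization, the fixed iteration count, and the test problem are assumptions of this example; the exact variant used in the simulations may differ.

```python
import numpy as np

def em_sbl(y, Phi, sigma2, n_iter=50):
    """EM updates for gamma with known noise variance; returns (x_hat, gamma)."""
    N, L = Phi.shape
    gamma = np.ones(L)                               # assumed initialization
    mu = np.zeros(L)
    for _ in range(n_iter):
        PG = Phi * gamma                             # Phi @ diag(gamma)
        Sigma_y = sigma2 * np.eye(N) + PG @ Phi.T
        K = np.linalg.solve(Sigma_y, PG)             # Sigma_y^{-1} Phi diag(gamma)
        # E-step: posterior mean and the diagonal of the posterior covariance of x.
        mu = K.T @ y
        post_var = np.maximum(gamma - np.sum(PG * K, axis=0), 0.0)
        # M-step: gamma_i <- E[x_i^2 | y] = mu_i^2 + posterior variance.
        gamma = mu**2 + post_var
    return mu, gamma

rng = np.random.default_rng(0)
Phi = rng.standard_normal((15, 30))
x_true = np.zeros(30)
x_true[[2, 11, 25]] = [3.0, -2.0, 4.0]
y = Phi @ x_true + 0.1 * rng.standard_normal(15)
x_hat, gamma_hat = em_sbl(y, Phi, sigma2=0.01, n_iter=20)
```

The covariance-form updates avoid inverting $\mathbf{\Upsilon}$, which matters because many entries of `gamma` shrink toward zero as the iterations proceed.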

We simulate the lower bounds for a random underdetermined () measurement matrix , whose entries are i.i.d. and standard Bernoulli distributed. A compressible signal of dimension is generated by sampling from a Student- distribution with the value of ranging from to , which is the range in which the signal is “somewhat” compressible, for high dimensional signals [3]. Figure 5 shows the decay profile of the sorted magnitudes of i.i.d. samples drawn from a Student- distribution for different and with the value of fixed at .

### VI-A Lower Bounds on the MSE Performance of $\hat{\mathbf{x}}(\mathbf{y})$

In this subsection, we compare the MSE performance of the ARD-SBL estimator and the EM based estimator . Figure 6 depicts the MSE performance of for different SNRs and and , with . We compare it with the HCRB/BCRB derived in (8), which is obtained by assuming the knowledge of the realization of the hyperparameters . We see that the MCRB derived in (14) is a tight lower bound on the MSE performance at high SNR and .

Figure 7 shows the comparative MSE performance of the ARD-SBL estimator and EM based estimator as a function of varying degrees of freedom , at an SNR of  dB and and . As expected, the MSE performance of the algorithms is better at low values of since the signal is more compressible, and the MCRB and BCRB also reflect this behavior. The MCRB is a tight lower bound, especially for high values of . Figure 8 shows the MSE performance of the ARD-SBL estimator and EM based estimator as a function of , at an SNR of  dB and for two different values of . The MSE performance of the EM algorithm converges to that of the MCRB at higher .

### VI-B Lower Bounds on the MSE Performance of $\hat{\boldsymbol{\gamma}}(\mathbf{y})$

In this subsection, we compare the different lower bounds for the MSE of the estimator for the SMV and MMV-SBL system model. Figure 9 shows the MSE performance of as a function of SNR and , when is a random parameter, and . In this case, it turns out that there is a large gap between the performance of the EM based estimate and the lower bound.

When is deterministic, we first note that the EM based ML estimator for is asymptotically optimal and the lower bounds are practical for large data samples [19]. The results are listed in Table II. We see that for and , the MCRB and BCRB are tight lower bounds, with MCRB being marginally tighter than the BCRB. However, as increases, the gap between the MSE and the lower bounds increases.

### VI-C Lower Bounds on the MSE Performance of $\hat{\xi}(\mathbf{y})$

In Fig. 10, we compare the lower bounds on the MSE of the estimator in the SMV and MMV-SBL settings, for different values of and . Here, is sampled from the IG pdf (18), with parameters and .

When is deterministic, the EM based ML estimator for is asymptotically optimal and the lower bounds are practical for large data samples [19]. Table III lists the MSE values of , the corresponding HCRB and MCRB for deterministic but unknown noise variance, while the true noise variance is fixed at . We see that for and , the MCRB is marginally tighter than the HCRB. However, when the noise variance is random, we see from Fig. 10 that there is a large gap between the MSE performance and the HCRB.

## VII Conclusion

In this work, we derived Cramér-Rao type lower bounds on the MSE, namely, the HCRB, BCRB and MCRB, for the SMV-SBL and the MMV-SBL problem of estimating compressible signals. We used a hierarchical model for the compressible priors to obtain the bounds under various assumptions on the unknown parameters. The bounds derived by assuming a hyperprior distribution on the hyperparameters themselves provided key insights into the MSE performance of SBL and the values of the parameters that govern these hyperpriors. We derived the MCRB for the generalized compressible prior distribution, of which the Student-$t$ and generalized double Pareto prior distributions are special cases. We showed that the MCRB is tighter than the BCRB. We compared the lower bounds with the MSE performance of the ARD-SBL and the EM algorithm using Monte Carlo simulations. The numerical results illustrated the near-optimality of EM based updates for SBL, which makes it attractive for practical implementations.

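### -A Proof of Proposition 1

Using the graphical model of Fig. 1 in (5),

$$\begin{aligned} \mathbf{H}^{\theta}(\mathbf{x}) &\triangleq -E_{Y,X;\gamma}\left[\nabla^2_{\mathbf{x}} \log p_{Y,X;\gamma}(\mathbf{y}, \mathbf{x}; \boldsymbol{\gamma})\right] \\ &= -E_{Y,X;\gamma}\left[\nabla_{\mathbf{x}}\left(\frac{\mathbf{\Phi}^T(\mathbf{y} - \mathbf{\Phi}\mathbf{x})}{\sigma^2} - \mathbf{\Upsilon}^{-1}\mathbf{x}\right)\right] \\ &= \frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \mathbf{\Upsilon}^{-1}. \end{aligned} \tag{23}$$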

Similarly, it is straightforward to show that . Since are zero mean random variables,

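$$\mathbf{H}^{\theta}(\boldsymbol{\gamma}, \mathbf{x}) = -E_{Y,X;\gamma}\left[\nabla_{\boldsymbol{\gamma}} \nabla_{\mathbf{x}} \log p_{Y,X;\gamma}(\mathbf{y}, \mathbf{x}; \boldsymbol{\gamma})\right] = \mathbf{0}_{L \times L}.$$

Now, since $\log p_{X;\gamma}(\mathbf{x}; \boldsymbol{\gamma}) = -\frac{1}{2}\sum_{i=1}^{L}\left(\log(2\pi\gamma_i) + \frac{x_i^2}{\gamma_i}\right)$, we get

$$\frac{\partial^2 \log p_{X;\gamma}(\mathbf{x}; \boldsymbol{\gamma})}{\partial\gamma_i\, \partial\gamma_j} = \begin{cases} \dfrac{1}{2\gamma_i^2} - \dfrac{x_i^2}{\gamma_i^3} & \text{if } i = j, \\ 0 & \text{if } i \neq j. \end{cases} \tag{24}$$

Taking $-E_{X;\gamma}(\cdot)$ on both sides of the above equation and noting that $E_{X;\gamma}(x_i^2) = \gamma_i$, we obtain

$$\mathbf{H}^{\theta}(\boldsymbol{\gamma}) = \mathrm{diag}\left(-E_{X;\gamma}\left[\frac{\partial^2 \log p_{X;\gamma}(\mathbf{x}; \boldsymbol{\gamma})}{\partial\gamma_i^2}\right]\right) = \mathrm{diag}\left(\left[\frac{1}{2\gamma_1^2}, \ldots, \frac{1}{2\gamma_L^2}\right]\right). \tag{25}$$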

This completes the proof.
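The last step of the proof can be sanity-checked by simulation. The sketch below averages the negative of the diagonal second derivative, $\frac{1}{2\gamma_i^2} - \frac{x_i^2}{\gamma_i^3}$, over draws of $x_i \sim \mathcal{N}(0, \gamma_i)$ and compares the result with $\frac{1}{2\gamma_i^2}$. The function name and the sample size are assumptions of this example.

```python
import numpy as np

def fim_gamma_monte_carlo(gamma_i, n_samples, rng):
    """Monte Carlo estimate of -E[d^2 log p(x; gamma) / d gamma_i^2]."""
    x = rng.normal(0.0, np.sqrt(gamma_i), size=n_samples)
    # Diagonal (i = j) second derivative of the Gaussian log prior w.r.t. gamma_i.
    second_deriv = 1.0 / (2.0 * gamma_i**2) - x**2 / gamma_i**3
    return -np.mean(second_deriv)

rng = np.random.default_rng(0)
estimate = fim_gamma_monte_carlo(gamma_i=1.5, n_samples=400000, rng=rng)
closed_form = 1.0 / (2.0 * 1.5**2)     # diagonal entry 1 / (2 gamma_i^2)
```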

### -B Proof of Proposition 2

Using the graphical model of Fig. 1 in (5),

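$$\begin{aligned} \mathbf{B}^{\theta}(\mathbf{x}) &\triangleq -E_{Y,X,\Gamma}\left[\nabla^2_{\mathbf{x}} \log p_{Y,X,\Gamma}(\mathbf{y}, \mathbf{x}, \boldsymbol{\gamma})\right] \\ &= -E_{Y,X,\Gamma}\left[\nabla_{\mathbf{x}}\left(\frac{\mathbf{\Phi}^T(\mathbf{y} - \mathbf{\Phi}\mathbf{x})}{\sigma^2} - \mathbf{\Upsilon}^{-1}\mathbf{x}\right)\right] \\ &= E_{\Gamma}\left[\frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + \mathbf{\Upsilon}^{-1}\right] = \frac{\mathbf{\Phi}^T\mathbf{\Phi}}{\sigma^2} + E_{\Gamma}\left[\mathbf{\Upsilon}^{-1}\right]. \end{aligned} \tag{27}$$

The expression for $E_{\Gamma}\left[\gamma_i^{-1}\right]$ is given by,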

 EΓ[1γi] = Kγ∫∞γi=0γ(−ν2−2)iexp{−ν2λγi}dγi = KγΓ(ν2+1)(