I Introduction
Compressed sensing, a new paradigm for data acquisition and reconstruction, has drawn much attention over the past few years [1, 2, 3]. The main purpose of compressed sensing is to recover a high-dimensional sparse signal from a low-dimensional linear measurement vector. In practice, measurements are inevitably contaminated by noise due to hardware imperfections, quantization errors, or transmission errors. Most existing studies (e.g. [4, 5, 6]) assume that measurements are corrupted with noise that is evenly distributed across the observations, such as independent and identically distributed (i.i.d.) Gaussian, thermal, or quantization noise. This assumption is valid for many cases. Nevertheless, in some scenarios, measurements may be corrupted by outliers that deviate significantly from their nominal values. For example, during the data acquisition process, outliers can be caused by sensor failures or calibration errors [7, 8], and it is usually unknown which measurements have been corrupted. Outliers can also arise as a result of signal clipping/saturation or impulse noise [9, 10]. Conventional compressed sensing techniques may incur severe performance degradation in the presence of outliers. To address this issue, previous works (e.g. [7, 8, 9, 10]) model the outliers as a sparse error vector and express the observed data as

$$y = Ax + e + w \qquad (1)$$
where $A \in \mathbb{R}^{M \times N}$ is the sampling matrix with $M < N$, $x$ denotes an $N$-dimensional sparse vector with only $K$ nonzero coefficients, $e$ denotes the outlier vector consisting of $L$ nonzero entries with arbitrary amplitudes, and $w$ denotes the additive multivariate Gaussian noise with zero mean and covariance matrix $\sigma^2 I$. The above model can be formulated as a conventional compressed sensing problem as

$$y = \begin{bmatrix} A & I \end{bmatrix} \begin{bmatrix} x \\ e \end{bmatrix} + w \qquad (2)$$

where the augmented unknown $[x^T \; e^T]^T$ is itself sparse, with $K + L$ nonzero entries.
Efficient compressed sensing algorithms can then be employed to estimate the sparse signal as well as the outliers. Recovery guarantees for the sparse signal and the outlier vector were also analyzed in [7, 8, 9, 10].
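The compensation-based reformulation (2) can be illustrated with a small numerical sketch. The dimensions, sparsity levels, and outlier amplitudes below are hypothetical choices for illustration only; the point is that stacking the identity next to the sampling matrix turns the corrupted model (1) into a standard sparse recovery problem with a $(K+L)$-sparse unknown.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K, L = 40, 80, 3, 2                # hypothetical problem sizes

A = rng.standard_normal((M, N)) / np.sqrt(M)   # sampling matrix, M < N
x = np.zeros(N)                                 # K-sparse signal
x[rng.choice(N, K, replace=False)] = rng.standard_normal(K)
e = np.zeros(M)                                 # L-sparse outlier vector
e[rng.choice(M, L, replace=False)] = 10 * rng.standard_normal(L)
w = 0.01 * rng.standard_normal(M)               # dense Gaussian noise

y = A @ x + e + w                        # observation model (1)

B = np.hstack([A, np.eye(M)])            # augmented matrix [A  I] of (2)
theta = np.concatenate([x, e])           # (K + L)-sparse augmented unknown
```

Any conventional sparse solver applied to the pair $(y, B)$ then estimates $x$ and $e$ jointly, which is exactly the compensation-based strategy.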
The rationale behind the above approach is to detect and compensate for these outliers simultaneously. Besides the above method, another more direct approach is to identify and exclude the outliers from sparse signal recovery. Although it may seem preferable to compensate for rather than simply reject outliers, inaccurate estimation of the compensation (i.e. the outlier vector) could have a destructive effect on sparse signal recovery, particularly when the number of measurements is limited. In this case, identifying and rejecting outliers could be a more sensible strategy. Motivated by this insight, we develop a Bayesian framework for robust compressed sensing, in which a set of binary indicator variables are employed to indicate which observations are outliers. These variables are assigned a beta-Bernoulli hierarchical prior such that their values are confined to be binary. Also, a Gaussian-inverse Gamma prior is placed on the sparse signal to promote sparsity. A variational Bayesian method is developed to find the approximate posterior distributions of the indicators, the sparse signal, and other latent variables. Simulation results show that the proposed method achieves a substantial performance improvement over the compensation-based robust compressed sensing method.
II Hierarchical Prior Model
We develop a Bayesian framework which employs a set of binary indicator variables $\{z_n\}$ to indicate which observations are outliers, i.e. $z_n = 1$ indicates that $y_n$ is a normal observation; otherwise $y_n$ is an outlier. More precisely, for a normal observation we can write

$$y_n = a_n^T x + w_n \qquad (3)$$

where $a_n^T$ denotes the $n$th row of $A$, and $y_n$ and $w_n$ are the $n$th entries of $y$ and $w$, respectively. The probability of the observed data conditional on these indicator variables can be expressed as

$$p(y \mid x, z, \beta) = \prod_{n=1}^{M} \mathcal{N}(y_n \mid a_n^T x, \beta^{-1})^{z_n} \qquad (4)$$
in which those “presumed outliers” are automatically disabled when calculating the probability. To infer the indicator variables, a beta-Bernoulli hierarchical prior [11, 12] is placed on $z$, i.e. each component of $z$ is assumed to be drawn from a Bernoulli distribution parameterized by $\pi_n$

$$p(z_n \mid \pi_n) = \pi_n^{z_n} (1 - \pi_n)^{1 - z_n} \qquad (5)$$

and each $\pi_n$ follows a beta distribution

$$p(\pi_n) = \mathrm{Beta}(\pi_n \mid e_0, f_0) \qquad (6)$$

where $e_0$ and $f_0$ are parameters characterizing the beta distribution. Note that the beta-Bernoulli prior assumes the random variables $\{z_n\}$ are mutually independent, and so are the random variables $\{\pi_n\}$.
To encourage a sparse solution, a Gaussian-inverse Gamma hierarchical prior, which has been widely used in sparse Bayesian learning (e.g. [13, 14, 15, 16]), is employed. Specifically, in the first layer, $x$ is assigned a Gaussian prior distribution

$$p(x \mid \alpha) = \prod_{i=1}^{N} \mathcal{N}(x_i \mid 0, \alpha_i^{-1}) \qquad (7)$$

where $\alpha \triangleq \{\alpha_i\}$, and the $\alpha_i$ are nonnegative hyperparameters controlling the sparsity of the signal $x$. The second layer specifies Gamma distributions as hyperpriors over the precision parameters $\{\alpha_i\}$, i.e.

$$p(\alpha) = \prod_{i=1}^{N} \mathrm{Gamma}(\alpha_i \mid a, b) \qquad (8)$$

where the parameters $a$ and $b$ are set to small values in order to provide non-informative (over a logarithmic scale) hyperpriors over $\{\alpha_i\}$. Also, to estimate the noise variance, we place a Gamma hyperprior over the noise precision $\beta$, i.e.

$$p(\beta) = \mathrm{Gamma}(\beta \mid c, d) \qquad (9)$$

where the parameters $c$ and $d$ are set to small values. The graphical model of the proposed hierarchical prior is shown in Fig. 1.
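The full generative model (4)–(9) can be exercised by ancestral sampling. The sketch below uses hypothetical, moderate hyperparameter values (the near-zero values used for inference would make prior draws numerically extreme); the beta parameters are chosen to favor inliers, matching the expectation that outliers are rare. Note that `rng.gamma` takes a scale, not a rate, parameter.

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 50, 100                             # hypothetical sizes

# Layer of indicator variables: pi_n ~ Beta (6), z_n ~ Bernoulli(pi_n) (5).
pi = rng.beta(9.0, 1.0, size=M)            # prior mass near 1: most z_n = 1 (inliers)
z = rng.binomial(1, pi)

# Sparsity-promoting layer: alpha_i ~ Gamma (8), x_i ~ N(0, 1/alpha_i) (7).
alpha = rng.gamma(1.0, 1.0, size=N)        # shape=1, scale=1 for a well-behaved demo
x = rng.normal(0.0, 1.0 / np.sqrt(alpha))

# Noise precision: beta ~ Gamma (9).
beta = rng.gamma(1.0, 1.0)

# Observations: normal measurements follow (3); outliers take arbitrary values.
A = rng.standard_normal((M, N)) / np.sqrt(M)
y = np.where(z == 1,
             A @ x + rng.normal(0.0, 1.0 / np.sqrt(beta), size=M),
             rng.uniform(-10.0, 10.0, size=M))
```

With the near-zero settings of $a, b$ used for inference, sparsity emerges through the posterior rather than through typical prior draws, which is why the demo uses moderate values.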
III Variational Bayesian Inference
We now proceed to perform Bayesian inference for the proposed hierarchical model. Let $\theta \triangleq \{x, z, \pi, \alpha, \beta\}$ denote the hidden variables in our hierarchical model. Our objective is to find the posterior distribution $p(\theta \mid y)$, which is usually computationally intractable. To circumvent this difficulty, observe that the marginal probability of the observed data can be decomposed into two terms

$$\ln p(y) = L(q) + \mathrm{KL}(q \,\|\, p) \qquad (10)$$

where

$$L(q) = \int q(\theta) \ln \frac{p(y, \theta)}{q(\theta)} \, d\theta \qquad (11)$$

and

$$\mathrm{KL}(q \,\|\, p) = -\int q(\theta) \ln \frac{p(\theta \mid y)}{q(\theta)} \, d\theta \qquad (12)$$

in which $q(\theta)$ is any probability density function and $\mathrm{KL}(q \,\|\, p)$ is the Kullback-Leibler divergence between $q(\theta)$ and $p(\theta \mid y)$. Since $\mathrm{KL}(q \,\|\, p) \geq 0$, it follows that $L(q)$ is a rigorous lower bound on $\ln p(y)$. Moreover, notice that the left hand side of (10) is independent of $q(\theta)$. Therefore maximizing $L(q)$ is equivalent to minimizing $\mathrm{KL}(q \,\|\, p)$, and thus the posterior distribution $p(\theta \mid y)$ can be approximated by $q(\theta)$ through maximizing $L(q)$. Specifically, we could assume some specific parameterized functional form for $q(\theta)$ and then maximize $L(q)$ with respect to the parameters of the distribution. A particular form of $q(\theta)$ that has been widely used with great success is the factorized form over the component variables in $\theta$ [17]. For our case, the factorized form of $q(\theta)$ can be written as

$$q(\theta) = q(x)\, q(z)\, q(\pi)\, q(\alpha)\, q(\beta) \qquad (13)$$
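The decomposition (10)–(12) can be checked numerically on a toy problem with a single binary latent variable, where all the integrals reduce to two-term sums. The probability values below are arbitrary illustrative choices.

```python
import numpy as np

prior = np.array([0.6, 0.4])        # p(theta) over a binary latent variable
lik   = np.array([0.2, 0.7])        # p(y | theta) for the observed y
joint = prior * lik                 # p(y, theta)
p_y   = joint.sum()                 # marginal p(y)
post  = joint / p_y                 # exact posterior p(theta | y)

q = np.array([0.5, 0.5])            # an arbitrary trial distribution q(theta)

Lq = np.sum(q * np.log(joint / q))  # lower bound L(q), eq. (11)
KL = np.sum(q * np.log(q / post))   # KL(q || p(theta|y)), eq. (12)
```

For any valid `q`, `Lq + KL` equals `log(p_y)` exactly, which is the identity (10) that justifies maximizing $L(q)$ in place of minimizing the (uncomputable) KL term.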
We can compute the posterior distribution approximation by finding $q(\theta)$ of the factorized form that maximizes the lower bound $L(q)$. The maximization can be conducted in an alternating fashion for each latent variable, which leads to [17]

$$\ln q(\theta_i) = \langle \ln p(y, \theta) \rangle_{k \neq i} + \text{constant} \qquad (14)$$

where $\langle \cdot \rangle_{k \neq i}$ denotes an expectation with respect to the distributions $\{q(\theta_k)\}_{k \neq i}$ specified in the subscript. More details of the Bayesian inference are provided below.
1) Update of $q(x)$: We first consider the calculation of $q(x)$. Keeping those terms that are dependent on $x$, we have

$$\ln q(x) \propto \langle \ln p(y \mid x, z, \beta) + \ln p(x \mid \alpha) \rangle_{q(z) q(\alpha) q(\beta)} \qquad (15)$$

which takes the quadratic form

$$\ln q(x) \propto -\frac{\langle \beta \rangle}{2} \sum_{n=1}^{M} \langle z_n \rangle (y_n - a_n^T x)^2 - \frac{1}{2} x^T \langle D \rangle x \qquad (16)$$

where $D \triangleq \mathrm{diag}(\alpha_1, \ldots, \alpha_N)$, and $\langle \beta \rangle$ and $\langle D \rangle$ denote the expectations of $\beta$ and $D$, respectively. It is easy to show that $q(x)$ follows a Gaussian distribution with its mean and covariance matrix given respectively by

$$\mu = \langle \beta \rangle \Sigma A^T Z y \qquad (17)$$

$$\Sigma = \left( \langle \beta \rangle A^T Z A + \langle D \rangle \right)^{-1} \qquad (18)$$

in which $Z \triangleq \mathrm{diag}(\langle z_1 \rangle, \ldots, \langle z_M \rangle)$.
2) Update of $q(\alpha)$: Keeping only the terms that depend on $\alpha$, the variational optimization of $q(\alpha)$ yields

$$\ln q(\alpha) \propto \langle \ln p(x \mid \alpha) \rangle_{q(x)} + \ln p(\alpha) \qquad (19)$$

The posterior $q(\alpha)$ therefore follows a Gamma distribution

$$q(\alpha) = \prod_{i=1}^{N} \mathrm{Gamma}(\alpha_i \mid \tilde{a}, \tilde{b}_i) \qquad (20)$$

in which $\tilde{a}$ and $\tilde{b}_i$ are given respectively as

$$\tilde{a} = a + \frac{1}{2}, \qquad \tilde{b}_i = b + \frac{\langle x_i^2 \rangle}{2}$$
3) Update of $q(\beta)$: The variational approximation of $q(\beta)$ can be obtained as

$$\ln q(\beta) \propto \langle \ln p(y \mid x, z, \beta) \rangle_{q(x) q(z)} + \ln p(\beta) \qquad (21)$$

Clearly, the posterior $q(\beta)$ obeys a Gamma distribution

$$q(\beta) = \mathrm{Gamma}(\beta \mid \tilde{c}, \tilde{d}) \qquad (22)$$

where $\tilde{c}$ and $\tilde{d}$ are given respectively as

$$\tilde{c} = c + \frac{1}{2} \sum_{n=1}^{M} \langle z_n \rangle \qquad (23)$$

$$\tilde{d} = d + \frac{1}{2} \sum_{n=1}^{M} \langle z_n \rangle \langle (y_n - a_n^T x)^2 \rangle \qquad (24)$$

in which $\langle (y_n - a_n^T x)^2 \rangle = (y_n - a_n^T \mu)^2 + a_n^T \Sigma a_n$.
4) Update of $q(z)$: The posterior approximation of $q(z)$ yields

$$\ln q(z) \propto \langle \ln p(y \mid x, z, \beta) + \ln p(z \mid \pi) \rangle_{q(x) q(\beta) q(\pi)} \qquad (25)$$

Clearly, each $q(z_n)$ still follows a Bernoulli distribution with its probability given by

$$q(z_n = 1) = \frac{1}{C} \exp\left( \langle \ln \pi_n \rangle + \frac{1}{2} \langle \ln \beta \rangle - \frac{1}{2} \ln(2\pi) - \frac{\langle \beta \rangle}{2} \langle (y_n - a_n^T x)^2 \rangle \right) \qquad (26)$$

$$q(z_n = 0) = \frac{1}{C} \exp\left( \langle \ln(1 - \pi_n) \rangle \right) \qquad (27)$$

where $C$ is a normalizing constant such that $q(z_n = 1) + q(z_n = 0) = 1$, and

$$\langle \ln \beta \rangle = \psi(\tilde{c}) - \ln \tilde{d}, \quad \langle \ln \pi_n \rangle = \psi(\tilde{e}_n) - \psi(\tilde{e}_n + \tilde{f}_n), \quad \langle \ln(1 - \pi_n) \rangle = \psi(\tilde{f}_n) - \psi(\tilde{e}_n + \tilde{f}_n) \qquad (28)$$

The last two equalities can also be found in [12], in which $\psi(\cdot)$ represents the digamma function.
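The digamma expressions in (28) are easy to sanity-check by Monte Carlo: for a Beta-distributed variable, the expectations of $\ln \pi$ and $\ln(1-\pi)$ should match the digamma formulas. The Beta parameters below are hypothetical values chosen only for the check.

```python
import numpy as np
from scipy.special import digamma

e_t, f_t = 2.0, 3.0                                # hypothetical Beta parameters

# Closed-form expectations under Beta(e_t, f_t), as in (28).
Eln_pi   = digamma(e_t) - digamma(e_t + f_t)       # <ln pi>
Eln_1mpi = digamma(f_t) - digamma(e_t + f_t)       # <ln(1 - pi)>

# Monte Carlo estimates of the same expectations.
rng = np.random.default_rng(0)
s = rng.beta(e_t, f_t, size=1_000_000)
mc_pi, mc_1mpi = np.log(s).mean(), np.log1p(-s).mean()
```

The Monte Carlo means agree with the digamma formulas to within sampling error, which is the property exploited when evaluating the Bernoulli update (26)–(27).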
5) Update of $q(\pi)$: The posterior approximation of $q(\pi)$ can be calculated as

$$\ln q(\pi) \propto \langle \ln p(z \mid \pi) \rangle_{q(z)} + \ln p(\pi) \qquad (29)$$

It can be easily verified that each $q(\pi_n)$ follows a Beta distribution, i.e.

$$q(\pi_n) = \mathrm{Beta}(\pi_n \mid \tilde{e}_n, \tilde{f}_n), \quad \tilde{e}_n = e_0 + \langle z_n \rangle, \quad \tilde{f}_n = f_0 + 1 - \langle z_n \rangle \qquad (30)$$
In summary, the variational Bayesian inference involves updates of the approximate posterior distributions for the hidden variables $x$, $\alpha$, $\beta$, $z$, and $\pi$ in an alternating fashion. Some of the expectations and moments used during the updates are summarized as

$$\langle \alpha_i \rangle = \frac{\tilde{a}}{\tilde{b}_i}, \quad \langle \beta \rangle = \frac{\tilde{c}}{\tilde{d}}, \quad \langle z_n \rangle = q(z_n = 1), \quad \langle x_i^2 \rangle = \mu_i^2 + \Sigma_{ii}$$

where $\Sigma_{ii}$ denotes the $i$th diagonal element of $\Sigma$.
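The alternating updates above can be collected into a compact sketch. This is a minimal reimplementation under the assumptions stated in the text (standard conjugate updates, real-valued data); the default hyperparameter values, iteration count, and the synthetic example at the bottom are hypothetical choices, not the paper's exact settings.

```python
import numpy as np
from scipy.special import digamma, expit

def vb_robust_cs(y, A, a=1e-6, b=1e-6, c=1e-6, d=1e-6, e0=1.0, f0=1.0, iters=50):
    """Sketch of the variational updates for the beta-Bernoulli robust model."""
    M, N = A.shape
    Ez = np.ones(M)                    # <z_n>: start by trusting every measurement
    Ealpha = np.ones(N)                # <alpha_i>
    Ebeta = 1.0                        # <beta>
    for _ in range(iters):
        # q(x): Gaussian, rows of A weighted by <z_n>  (mean/covariance updates)
        Sigma = np.linalg.inv(Ebeta * (A.T @ (Ez[:, None] * A)) + np.diag(Ealpha))
        mu = Ebeta * (Sigma @ (A.T @ (Ez * y)))
        Ex2 = mu**2 + np.diag(Sigma)
        # q(alpha): Gamma with shape a + 1/2 and rate b + <x_i^2>/2
        Ealpha = (a + 0.5) / (b + 0.5 * Ex2)
        # posterior second moment of the residual, <(y_n - a_n^T x)^2>
        r2 = (y - A @ mu) ** 2 + np.einsum('ij,jk,ik->i', A, Sigma, A)
        # q(beta): Gamma; only "trusted" measurements contribute
        c_t = c + 0.5 * Ez.sum()
        d_t = d + 0.5 * (Ez * r2).sum()
        Ebeta, Elnbeta = c_t / d_t, digamma(c_t) - np.log(d_t)
        # q(pi): Beta(e0 + <z_n>, f0 + 1 - <z_n>), then digamma expectations
        e_t, f_t = e0 + Ez, f0 + 1.0 - Ez
        Eln_pi = digamma(e_t) - digamma(e_t + f_t)
        Eln_1mpi = digamma(f_t) - digamma(e_t + f_t)
        # q(z): Bernoulli, via the posterior log-odds of "inlier"
        logit = Eln_pi - Eln_1mpi + 0.5 * (Elnbeta - np.log(2 * np.pi)) \
                - 0.5 * Ebeta * r2
        Ez = expit(logit)
    return mu, Ez

# Small synthetic example (hypothetical sizes and outlier magnitudes).
rng = np.random.default_rng(0)
M, N, K, L = 60, 80, 4, 3
A = rng.standard_normal((M, N)) / np.sqrt(M)
x = np.zeros(N); x[rng.choice(N, K, replace=False)] = rng.standard_normal(K) + 2.0
y = A @ x + 0.01 * rng.standard_normal(M)
y[rng.choice(M, L, replace=False)] += 10.0      # inject outliers
mu, Ez = vb_robust_cs(y, A)
```

Measurements the model deems outliers end up with $\langle z_n \rangle$ near zero, so they are effectively removed from the Gaussian update of $q(x)$, which is the "reject rather than compensate" strategy advocated in the text.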
IV Simulation Results
We now carry out experiments to illustrate the performance of our proposed method, which is referred to as the beta-Bernoulli prior model-based robust Bayesian compressed sensing method (BP-RBCS).¹ As discussed earlier, another robust compressed sensing approach is compensation-based and can be formulated as a conventional compressed sensing problem (2). For comparison, the sparse Bayesian learning method [18, 13] is employed to solve (2); this method is referred to as the compensation-based robust Bayesian compressed sensing method (C-RBCS). Also, we consider an “ideal” method which assumes knowledge of the locations of the outliers. The outliers are then removed and the sparse Bayesian learning method is employed to recover the sparse signal. This ideal method is referred to as RBCS-ideal, and serves as a benchmark for the performance of BP-RBCS and C-RBCS. Note that both C-RBCS and RBCS-ideal use the sparse Bayesian learning method for sparse signal recovery. The parameters of the sparse Bayesian learning method are set to small values. Our proposed method involves the parameters $a$, $b$, $c$, $d$, $e_0$, and $f_0$; the first four are also set to small values. The beta-Bernoulli parameters $e_0$ and $f_0$ are chosen to favor inliers, since we expect that the number of outliers is usually small relative to the total number of measurements. Our simulation results suggest that stable recovery is ensured as long as $f_0$ is set to a value within a suitable range.

¹Codes are available at http://www.junfanguestc.net/codes/RBCS.rar
We consider the problem of direction-of-arrival (DOA) estimation where $K$ narrowband far-field sources impinge on a uniform linear array of $M$ sensors from different directions. The received signal can be expressed as

$$y = Ax + w$$

where $w$ denotes i.i.d. Gaussian observation noise with zero mean and variance $\sigma^2$, and $A$ is an overcomplete dictionary constructed from $N$ evenly-spaced angular points $\{\theta_1, \ldots, \theta_N\}$, with the $(m, n)$th entry of $A$ given by $e^{j 2\pi (m-1)(d/\lambda) \sin \theta_n}$, in which $d$ denotes the distance between two adjacent sensors, $\lambda$ represents the wavelength of the source signal, and $\{\theta_n\}$ are evenly-spaced grid points covering the angular field of view. The signal $x$ contains $K$ nonzero entries that are independently drawn from the unit circle. Suppose that $L$ out of $M$ measurements are corrupted by outliers; the values of those corrupted measurements are chosen uniformly at random.
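The DOA measurement setup above can be generated as follows. The array size, grid density, source count, angular interval, noise level, and outlier magnitudes are hypothetical values for illustration; the dictionary construction follows the standard uniform-linear-array steering-vector form described in the text.

```python
import numpy as np

M, N, K, L = 30, 90, 3, 4                 # sensors, grid points, sources, outliers
d_over_lam = 0.5                          # sensor spacing d / wavelength lambda

# Overcomplete steering dictionary over an evenly-spaced angular grid (degrees).
theta = np.linspace(-90.0, 90.0, N)       # assumed field of view
m = np.arange(M)[:, None]
A = np.exp(1j * 2 * np.pi * m * d_over_lam * np.sin(np.deg2rad(theta)))

rng = np.random.default_rng(1)
x = np.zeros(N, dtype=complex)
sup = rng.choice(N, K, replace=False)
x[sup] = np.exp(1j * 2 * np.pi * rng.random(K))   # unit-circle coefficients

noise = 0.01 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
y = A @ x + noise
out = rng.choice(M, L, replace=False)             # corrupt L measurements
y[out] += 10.0 * (rng.standard_normal(L) + 1j * rng.standard_normal(L))
```

A robust recovery method applied to $(y, A)$ then seeks the support of $x$, i.e. the source directions, despite the $L$ corrupted sensors.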
We first consider a noiseless case, i.e. the noise variance $\sigma^2$ is set to zero. Fig. 2 depicts the success rates of different methods vs. the number of measurements and the number of outliers, respectively, where the number of outliers is fixed in Fig. 2(a) and the number of measurements is fixed in Fig. 2(b). The success rate is computed as the ratio of the number of successful trials to the total number of independent runs. A trial is considered successful if the normalized reconstruction error of the sparse signal is no greater than a prescribed threshold. From Fig. 2, we see that our proposed BP-RBCS achieves a substantial performance improvement over C-RBCS. This result corroborates our claim that rejecting outliers is a better strategy than compensating for them, particularly when the number of measurements is small, because inaccurate estimation of the compensation vector could have a destructive, instead of a constructive, effect on sparse signal recovery. Next, we consider a noisy case. Fig. 3 plots the normalized mean square errors (NMSEs) of the recovered sparse signal by different methods vs. the number of measurements and the number of outliers, respectively, where the number of outliers is fixed in Fig. 3(a) and the number of measurements is fixed in Fig. 3(b). This result, again, demonstrates the superiority of our proposed method over C-RBCS.
V Conclusions
We proposed a new Bayesian method for robust compressed sensing. The rationale behind the proposed method is to identify the outliers and exclude them from sparse signal recovery. To this end, a set of indicator variables were employed to indicate which observations are outliers, and a beta-Bernoulli prior was assigned to these indicator variables. A variational Bayesian inference method was developed to find the approximate posterior distributions of the latent variables. Simulation results show that our proposed method achieves a substantial performance improvement over the compensation-based robust compressed sensing method.
References
[1] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM J. Sci. Comput., vol. 20, no. 1, pp. 33–61, 1998.
[2] E. Candès and T. Tao, “Decoding by linear programming,” IEEE Trans. Information Theory, vol. 51, no. 12, pp. 4203–4215, Dec. 2005.
[3] D. L. Donoho, “Compressed sensing,” IEEE Trans. Information Theory, vol. 52, no. 4, pp. 1289–1306, Apr. 2006.
[4] E. Candès, “The restricted isometry property and its implications for compressed sensing,” Comptes Rendus de l’Académie des Sciences, Paris, Série I, vol. 346, pp. 589–592, 2008.
 [5] M. J. Wainwright, “Informationtheoretic limits on sparsity recovery in the highdimensional and noisy setting,” IEEE Trans. Information Theory, vol. 55, no. 12, pp. 5728–5741, Dec. 2009.
 [6] T. Wimalajeewa and P. K. Varshney, “Performance bounds for sparsity pattern recovery with quantized noisy random projections,” IEEE Journal on Selected Topics in Signal Processing, vol. 6, no. 1, pp. 43–57, Feb. 2012.
[7] J. N. Laska, M. A. Davenport, and R. G. Baraniuk, “Exact signal recovery from sparsely corrupted measurements through the pursuit of justice,” in Proc. 43rd Asilomar Conference on Signals, Systems and Computers, Pacific Grove, CA, USA, Nov. 2009.
[8] K. Mitra, A. Veeraraghavan, and R. Chellappa, “Analysis of sparse regularization based robust regression approaches,” IEEE Trans. Signal Processing, vol. 61, no. 5, pp. 1249–1257, Mar. 2013.
[9] R. E. Carrillo, K. E. Barner, and T. C. Aysal, “Robust sampling and reconstruction methods for sparse signals in the presence of impulsive noise,” IEEE Journal of Selected Topics in Signal Processing, vol. 4, no. 2, pp. 392–408, Apr. 2010.
[10] C. Studer, P. Kuppinger, G. Pope, and H. Bölcskei, “Recovery of sparsely corrupted signals,” IEEE Trans. Information Theory, vol. 58, no. 5, pp. 3115–3130, May 2012.
 [11] L. He and L. Carin, “Exploiting structure in waveletbased Bayesian compressive sensing,” IEEE Trans. Signal Processing, vol. 57, no. 9, pp. 3488–3497, Sept. 2009.

[12] J. Paisley and L. Carin, “Nonparametric factor analysis with beta process priors,” in Proc. 26th Annual International Conference on Machine Learning, Montreal, Canada, June 14–18, 2009.
[13] S. Ji, Y. Xue, and L. Carin, “Bayesian compressive sensing,” IEEE Trans. Signal Processing, vol. 56, no. 6, pp. 2346–2356, June 2008.
 [14] Z. Zhang and B. D. Rao, “Extension of SBL algorithms for the recovery of block sparse signals with intrablock correlation,” IEEE Trans. Signal Processing, vol. 61, no. 8, pp. 2009–2015, Apr. 2013.
 [15] Z. Yang, L. Xie, and C. Zhang, “Offgrid direction of arrival estimation using sparse Bayesian inference,” IEEE Trans. Signal Processing, vol. 61, no. 1, pp. 38–42, Jan. 2013.
[16] J. Fang, Y. Shen, H. Li, and P. Wang, “Pattern-coupled sparse Bayesian learning for recovery of block-sparse signals,” IEEE Trans. Signal Processing, vol. 63, no. 2, pp. 360–372, Jan. 2015.
 [17] D. G. Tzikas, A. C. Likas, and N. P. Galatsanos, “The variational approximation for Bayesian inference,” IEEE Signal Processing Magazine, pp. 131–146, Nov. 2008.
 [18] M. Tipping, “Sparse Bayesian learning and the relevance vector machine,” Journal of Machine Learning Research, vol. 1, pp. 211–244, 2001.