I. Introduction
Sparse linear regression, also referred to as compressed sensing (CS) [1, 2], has been widely studied in many fields, ranging from denoising [3] to super-resolution restoration [4, 5]. A signal vector is called sparse if it has at most a given number of nonzero elements. The objective of sparse linear regression is then to recover a sparse signal vector from a given measurement vector, which equals the product of a known underdetermined sensing matrix (with fewer rows than columns) and the signal, plus a noise vector.
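Because the symbols of the measurement model were lost in extraction, the following minimal numpy sketch restates the standard CS setup; the dimension values and the names `A`, `x`, `y`, `m`, `n`, `K` are illustrative assumptions, not the paper's notation.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n, K = 64, 128, 8   # measurements, ambient dimension, sparsity (assumed sizes)

# K-sparse signal: exactly K nonzero entries at random positions
x = np.zeros(n)
support = rng.choice(n, size=K, replace=False)
x[support] = rng.uniform(1.0, 2.0, size=K)

# Underdetermined (m < n) sensing matrix with i.i.d. Gaussian entries
A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))

# Noiseless measurements; add a noise vector here for the noisy case
y = A @ x
```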
To improve recovery of the sparse signal and its support (the set of indices of its nonzero elements), a method called MAP support detection [6] was proposed to recover a partial support by exploiting maximum a posteriori (MAP) estimation. This method decides whether an index belongs to the support through a likelihood ratio test, under the true and null hypotheses, on the correlation between the residual vector and the corresponding normalized column of the sensing matrix. By applying this method to existing algorithms such as generalized OMP (gOMP) [7], compressive sampling matching pursuit (CoSaMP) [8], and subspace pursuit (SP) [9], several greedy methods were proposed and shown to outperform other existing methods [6].
We propose a new MAP-based method to estimate a partial support by modifying MAP support detection to use the correlation term of rank-aware order recursive matching pursuit (RA-ORMP) [10]. The proposed scheme improves on the existing MAP support detection by projecting each column of the sensing matrix onto the orthogonal complement of the span of the already selected columns before computing the correlation, since the correlation is better represented under this projection.
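As an illustration of this projected correlation (the RA-ORMP-style term), the sketch below scores each unselected column by correlating the residual with the column's projection onto the orthogonal complement of the selected columns' span, normalized by the projected column's length. Function and variable names are assumptions for illustration, not the paper's code.

```python
import numpy as np

def projected_correlations(A, y, S):
    """RA-ORMP-style scores: correlate the residual with each column projected
    onto the orthogonal complement of span(A[:, S]), then normalize by the
    projected column's length (a sketch of the correlation term, not the
    paper's exact implementation)."""
    m, n = A.shape
    if len(S) > 0:
        As = A[:, S]
        P = As @ np.linalg.pinv(As)        # orthogonal projector onto span(A_S)
    else:
        P = np.zeros((m, m))
    r = y - P @ y                          # residual lies in the orthogonal complement
    scores = np.zeros(n)
    for j in range(n):
        if j in S:
            continue                       # already-selected columns are skipped
        q = A[:, j] - P @ A[:, j]          # column projected off the current span
        nq = np.linalg.norm(q)
        if nq > 1e-12:
            scores[j] = abs(q @ r) / nq    # normalized correlation
    return scores
```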
Based on this method for estimating a partial support, we develop an algorithm called Bayesian multiple matching pursuit (BMMP) to recover the signal and its support. BMMP also uses the following four techniques to enhance performance further. Simulation results show that BMMP outperforms other state-of-the-art methods in both the noiseless and noisy cases.

(Generating an extended support estimate of bounded size) The extended support is defined as an index set whose size is larger than the actual sparsity and that includes the true support. gOMP [7], SP [9], and CoSaMP [8] all exploit an extended support estimate in the process of estimating the support. To improve support recovery, BMMP allows the extended support estimate to grow up to the number of measurements, whereas in gOMP, SP, and CoSaMP its maximum size is a constant multiple of the sparsity. The reason for this choice is explained in the section describing BMMP.

(Generating multiple support candidates) BMMP exploits diversity by generating several support candidates, each built by iteratively selecting a different number of indices per iteration through the proposed partial support estimation. Related studies have generated multiple support candidates through a tree structure: multipath matching pursuit [11] and multi-branch matching pursuit [12]. However, the complexity of these tree-based methods scales with the sparsity and exceeds that of other CS algorithms, since the number of support candidates grows exponentially with the depth of the tree. In contrast, BMMP avoids the tree-based approach, so the number of support candidates is independent of the sparsity and the complexity grows only linearly with the number of candidates.
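A minimal sketch of this candidate-generation idea follows, with a plain residual-correlation score standing in for the proposed MAP-based partial support estimation. Because the number of candidates `L` is fixed in advance, the cost grows linearly in `L` rather than exponentially as in a tree search; all names here are assumptions.

```python
import numpy as np

def candidate_support(A, y, K, per_step):
    """Greedily grow one support candidate, picking `per_step` indices per
    iteration by residual correlation (a stand-in for the MAP-based score)."""
    S = []
    while len(S) < K:
        r = y - A[:, S] @ np.linalg.pinv(A[:, S]) @ y if S else y
        scores = np.abs(A.T @ r)
        scores[S] = -np.inf                       # never reselect an index
        take = min(per_step, K - len(S))
        S.extend(np.argsort(scores)[::-1][:take].tolist())
    return sorted(S)

def multiple_candidates(A, y, K, L):
    """L candidates, the l-th selecting l indices per iteration; the number
    of candidates is fixed (L), unlike tree search, so cost is linear in L."""
    return [candidate_support(A, y, K, l) for l in range(1, L + 1)]
```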

(Updating the extended support estimate by replacing a subset) To improve support recovery, SP [9] and CoSaMP [8] update an extended support estimate at each iteration by retaining some of the selected indices and replacing the rest. BMMP modifies this technique by iteratively selecting indices within the current extended support estimate and updating its complement.
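The replacement idea can be sketched generically in the SP/CoSaMP style: fit coefficients on the extended support, keep the strongest indices, and let the remainder be replaced in the next round. This is a simplified stand-in, not BMMP's exact update rule.

```python
import numpy as np

def refine_support(A, y, T, K):
    """One SP/CoSaMP-style refinement step: least-squares fit on the extended
    support T, then keep the K strongest coefficients; the dropped indices
    of T are the ones open to replacement in the next iteration."""
    T = sorted(T)
    coef = np.linalg.pinv(A[:, T]) @ y           # least-squares fit on T
    keep = np.argsort(np.abs(coef))[::-1][:K]    # K largest magnitudes
    return sorted(T[i] for i in keep)
```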

(Using ridge regression for robustness to noise) A ridge regression proposed by Wipf and Rao [13] is used to improve robustness to noise.
II. Notation
The set is denoted by . The submatrix of a matrix , where is its th column, with columns indexed by is denoted by . denotes the submatrix of with rows indexed by . denotes the range space spanned by the columns of . denotes the transpose of . and denote the orthogonal projection onto and its orthogonal complement, respectively. denotes the Frobenius norm. For a set and a space of , , and . Similarly, .
III. Partial support estimation by MAP
The proposed MAP-based algorithm for estimating a partial true support is introduced in this section. We assume the following: the support is uniformly distributed; each nonzero element of the signal is i.i.d. and follows an arbitrary distribution with known mean and variance; and each element of the sensing matrix and of the noise follows a Gaussian distribution with mean zero and known variance. Suppose that a partial support estimate is given. The goal of the proposed partial support detection is to find an index set belonging to the true support by using the inner product employed in RA-ORMP, which correlates the residual vector with each remaining column of the sensing matrix after that column is projected onto the orthogonal complement of the span of the already selected columns and normalized.
Although MAP support detection [6] estimates a partial true support from the likelihood ratio of the inner product between the residual and each normalized column, the proposed method uses the likelihood ratio of the projected and normalized correlation described above. This quantity can be interpreted as the correlation term of a normalized version of OMP: the column is first projected onto the orthogonal complement of the selected columns' span and then normalized, whereas OMP correlates the residual with the raw column. Compared to MAP support detection, the proposed method thus additionally considers the orthogonal complement when representing the correlation. Since each projected column is guaranteed to lie in a space extended from the residual space, index selection based on the projected correlation may find indices of the true support more successfully than selection based on the plain correlation.¹ (¹It is not guaranteed that each unprojected column lies in the extended residual space.)
The proposed partial support detection uses an estimate of . This is obtained by the following derivations.
where , (a) follows from the independence assumption, and (b) follows from Lemma A.1 (with and or ) in that and . Then, the estimate for is given as
(1) 
Note that the following equalities hold for
(2) 
where is the th element of and (a) follows from . Our derivation in the rest of this section is based on [6]. From the equalities in (2), the correlation becomes the right-hand side terms in (3) and (4) under the null and true hypotheses, respectively: under the null hypothesis the index does not belong to the true support, while under the true hypothesis it does.
(3)  
(4) 
From (2) and (3), the expectation and variance of
under the null hypothesis can be respectively approximated as follows:
(5) 
(6) 
where (a) and (b) follow from the circular symmetry of the Gaussian distribution and (1), respectively. Similarly, by (4) and Lemma A.1 (with and ), the expectation and variance of under the true hypothesis are obtained respectively as
(7) 
and
(8) 
where , and is the Gamma function.
Then, using (5)–(8), we obtain the log-likelihood ratio as follows.
(9) 
where (a) follows from the Gaussian approximation of the conditional distributions of given () by using (5) and (6) and given () by using (7) and (8).
Therefore, the proposed method estimates a partial true support as the index set of the largest likelihood ratios among the remaining candidate indices. Note that the ratio is proportional to (10) when each nonzero element of the signal follows the uniform distribution
(10) 
where .
IV. Algorithm description
The proposed method, BMMP (Algorithm 1), for estimating the signal and its support is introduced in this section. BMMP returns estimates of the signal and support given a tuple of inputs that includes the number of support candidates from which the final support is estimated and a threshold used to terminate BMMP when the residual error falls below it. The final estimate is obtained in Steps 23–24 by selecting, among the support candidates, the one with the minimum residual error. Each support candidate in Step 23 is obtained by an iteration consisting of the following four stages.
In the first stage (generate an extended support estimate) in Steps 4–10, a set of indices obtained in Step 8 through the proposed partial support estimation² is added to the running set in Step 9 until the set reaches its maximum size or the residual error of the resulting estimate falls below the threshold. (²The likelihood ratio in (10) is calculated from the values specified as inputs in Step 6 of Algorithm 1.) In the second stage (generate a temporary signal estimate) in Step 11, an estimate of the signal vector supported on the extended support estimate is obtained by the ridge regression described in Step 11. In the third stage, in Steps 12–15, a temporary support estimate is obtained by selecting the largest elements of this signal estimate. The fourth stage, in Steps 16–17, generates a subset of the extended support estimate, and the extended support estimate is reinitialized to this subset at the next iteration. The iteration terminates, and the support candidate is returned in Step 19, when the residual error obtained from the temporary support estimate in Step 15 no longer decreases in Step 3.
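The four stages above can be condensed into the following schematic, assuming a plain correlation score in place of the MAP-based partial support estimation and ordinary least squares in place of the ridge regression; the names and the simple keep-the-K-largest subset rule in stage 4 are illustrative assumptions, not the exact steps of Algorithm 1.

```python
import numpy as np

def bmmp_candidate(A, y, K, per_step, max_iter=20):
    """Schematic of one BMMP candidate following the four stages described
    above, with a plain correlation selector and least squares standing in
    for the MAP score and the ridge regression."""
    m, n = A.shape
    T = []                                        # extended support estimate
    best_S, best_err = None, np.inf
    for _ in range(max_iter):
        # Stage 1: grow T (up to size m) by repeated index selection
        while len(T) < m:
            r = y - A[:, T] @ np.linalg.pinv(A[:, T]) @ y if T else y
            scores = np.abs(A.T @ r)
            scores[T] = -np.inf
            take = min(per_step, m - len(T))
            T += np.argsort(scores)[::-1][:take].tolist()
            if np.linalg.norm(r) < 1e-9:          # residual already negligible
                break
        # Stage 2: temporary coefficient estimate on T (least squares here)
        coef = np.linalg.pinv(A[:, T]) @ y
        # Stage 3: temporary support = K largest coefficients
        S = sorted(np.array(T)[np.argsort(np.abs(coef))[::-1][:K]].tolist())
        err = np.linalg.norm(y - A[:, S] @ np.linalg.pinv(A[:, S]) @ y)
        if err >= best_err:
            break                                 # residual no longer decreases
        best_S, best_err = S, err
        # Stage 4: reinitialize T as a subset of itself for the next round
        T = list(S)
    return best_S
```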
To enhance robustness to noise, the ridge regression proposed by Wipf and Rao [13] is used in Step 11, with parameters obtained by minimizing the cost in (11).
(11) 
where is a submatrix of with columns indexed by , , and is the diagonal matrix whose th diagonal element is in . We obtain an approximate solution by using sparse Bayesian learning (SBL) [13] to reduce the complexity.
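As a sketch of this ridge step, the posterior-mean form that underlies SBL-style ridge regression can be written as below; here the per-coefficient variances `gamma` and the noise variance `sigma2` are treated as given inputs, whereas the Wipf-Rao procedure learns them iteratively.

```python
import numpy as np

def ridge_estimate(A_T, y, gamma, sigma2):
    """Weighted ridge (SBL posterior-mean form): with per-coefficient
    variances gamma (diagonal of Gamma) and noise variance sigma2,
    the estimate is Gamma A^T (sigma2 I + A Gamma A^T)^{-1} y.
    As sigma2 -> 0 this reduces to least squares on the support."""
    m = A_T.shape[0]
    Gamma = np.diag(gamma)
    C = sigma2 * np.eye(m) + A_T @ Gamma @ A_T.T
    return Gamma @ A_T.T @ np.linalg.solve(C, y)
```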
In the noiseless case, the parameters in the ridge regression are set to zero, so the ridge regression reduces to least-squares regression. The least-squares method has lower complexity than SBL, and it is well known that it provides a unique solution in the noiseless case when the restricted sensing submatrix has full column rank. Whenever the extended support estimate contains the true support, the inverse problem of CS is simplified: the sensing matrix can be replaced by its submatrix, yielding an easier problem than the original. The larger the extended support estimate, the higher the probability that it contains the true support. For this reason, we set its maximal size to the number of measurements. Moreover, Lemma A.2 guarantees that, in the noiseless case, the true support is contained in the extended support estimate whenever the corresponding residual is zero. Based on this fact, to minimize the size of the extended support estimate satisfying this criterion, we grow the estimate and stop as soon as the criterion is met in Steps 4–10. Figure 1 illustrates the procedure of BMMP.
V. Numerical experiments
We compare the performance of BMMP with that of existing state-of-the-art algorithms: MAP-gOMP³, MAP-SP, MAP-CoSaMP, MAP-OMP [6], and Reg-L1 (norm minimization with a non-negative constraint).⁴ (³Two indices are selected at each step in MAP-gOMP. ⁴Reg-L1 outputs the signal estimate by minimizing the norm of the signal subject to the residual being bounded by the noise magnitude.) We note that MAP-SP, MAP-CoSaMP, and MAP-OMP are algorithms based on MAP support detection [6]. Each entry of the sensing matrix is i.i.d. and follows a Gaussian distribution with mean zero. The nonzero elements of the signal are independently and uniformly sampled from a fixed range.
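The experimental setup described above can be reproduced along these lines; the Gaussian variance `1/m` and the sampling range of the nonzero elements are assumed stand-ins for the values lost from the text.

```python
import numpy as np

def make_instance(m, n, K, lo=1.0, hi=2.0, seed=0):
    """One trial of the experiment described above: an i.i.d. Gaussian sensing
    matrix and a K-sparse signal with nonzeros drawn uniformly from [lo, hi]."""
    rng = np.random.default_rng(seed)
    A = rng.normal(0.0, 1.0 / np.sqrt(m), size=(m, n))
    x = np.zeros(n)
    support = np.sort(rng.choice(n, size=K, replace=False))
    x[support] = rng.uniform(lo, hi, size=K)
    return A, x, support

def support_recovery_rate(estimates, truths):
    """Fraction of trials whose estimated support matches the true one exactly,
    i.e., the success metric plotted in Fig. 2(a)."""
    hits = sum(set(e) == set(t) for e, t in zip(estimates, truths))
    return hits / len(truths)
```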
The rate of successful support recovery and the execution time of each algorithm in the noiseless case are shown in Fig. 2(a) and Fig. 2(b), respectively; the problem dimensions are fixed throughout Fig. 2. We evaluate BMMP with its termination threshold set according to the signal-to-noise ratio (SNR) in decibels. It has been shown [14] that, in the noiseless case, there is a maximal sparsity up to which an ideal approach can recover any sparse signal from the given number of measurements; we denote this bound as the ideal limit of the signal sparsity. Figs. 2(a) and 2(b) show that BMMP has lower complexity than Reg-L1, outperforms the other state-of-the-art methods, and approaches this ideal limit.
Fig. 2(c) illustrates the mean squared error of each algorithm as the SNR is varied⁵, with the sparsity fixed. (⁵In the noisy case, the signal is randomly sampled such that each of its nonzero elements lies within a fixed range.) It is observed that the signal reconstruction performance of BMMP is better than that of the other methods in the noisy case.
Fig. 2(d) shows how each of the following three techniques used in BMMP contributes to its performance: exploiting the extended support estimate with maximal size equal to the number of measurements, using multiple support estimates, and iteratively updating the support estimate by replacing a subset of it. Three ablated variants of BMMP are formed by disabling one technique at a time: skipping the subset-replacement update in Step 16, reducing the number of support candidates to one, and restricting the maximum size of the extended support estimate in Steps 4–10, with the remaining settings unchanged in each case. The performance improves as the disabled techniques are restored one by one, and Fig. 2(d) shows that each of them contributes a noticeable performance improvement to BMMP.
To compare the proposed partial support estimation with MAP support detection [6], we compare MAP-gOMP with its variant obtained by replacing the MAP support detection in MAP-gOMP with the proposed partial support estimation. Fig. 3 shows the rate of successful support recovery of these two algorithms in the noiseless case as the sparsity is varied. It is observed that the proposed partial support estimation outperforms MAP support detection [6].
Fig. 4 shows the performance for reconstructing a grayscale sparse image in the noisy case. This image is compressed by using a sensing matrix whose elements are i.i.d. samples from a Gaussian distribution. It is observed in Fig. 4 that BMMP recovers the image better than the other methods.
VI. Conclusion
We presented BMMP, which updates multiple extended support estimates, each with maximal size equal to the number of measurements, and performs a likelihood ratio test given the correlation term of RA-ORMP. The numerical results show that BMMP improves upon existing state-of-the-art methods, even with noisy measurements. Future studies will consider applying techniques for estimating the relevant moments to BMMP, and developing BMMP for the case where the distribution of the signal is designable or a priori information about it is available, e.g., in communication systems for non-orthogonal multiple access [15, 16].
Appendix A Lemmas
Lemma A.1.
Let be a vector such that its elements are i.i.d. and follow the Gaussian distribution , and let be a matrix with rank . Then, it is guaranteed that and , where , and is the Gamma function.
Proof.
Let an orthonormal basis of the subspace be given, and expand the projected vector in this basis. Each coefficient is i.i.d. Gaussian by the isotropic property of the Gaussian vector [17]. It follows that the norm of the projection follows the chi distribution with degrees of freedom equal to the rank. The proof is then completed by taking the means of the chi and chi-squared distributions. ∎
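The moments invoked in Lemma A.1 were lost in extraction; with the assumed symbols $g \sim \mathcal{N}(0, \sigma^2 I)$ and $P_V$ the orthogonal projector onto an $r$-dimensional subspace $V$, the standard chi-distribution facts give:

```latex
% ||P_V g|| / sigma follows the chi distribution with r degrees of freedom, so
\mathbb{E}\,\|P_V g\|_2
  = \sigma \sqrt{2}\,
    \frac{\Gamma\!\left(\tfrac{r+1}{2}\right)}{\Gamma\!\left(\tfrac{r}{2}\right)},
\qquad
\mathbb{E}\,\|P_V g\|_2^2 = r\,\sigma^2 .
```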
Lemma A.2.
Suppose that every set of columns in exhibits full rank, is uniformly sampled from , and the sparsity is smaller than , i.e., . Then, in the noiseless case, the following statements (a) and (b) hold for any set such that .

holds if .

holds almost surely if .
Proof.
The following subspace is defined as
(12) 
For any set such that , holds if . If , includes so that . Thus, implies that so that the statement (a) is satisfied.
Given that, from the assumption, the rank of is strictly larger than that of for any index set such that and , the event region satisfying has Lebesgue measure zero on the range space . Thus, the condition holds almost surely. Given that the condition implies that , holds almost surely. This condition then implies that from the definition of . Thus, statement (b) is satisfied. ∎
References
 [1] E. J. Candès et al., “Compressive sampling,” in Proceedings of the International Congress of Mathematicians, vol. 3. Madrid, Spain, 2006, pp. 1433–1452.
 [2] D. L. Donoho, “Compressed sensing,” IEEE Transactions on Information Theory, vol. 52, no. 4, pp. 1289–1306, 2006.
 [3] C. A. Metzler, A. Maleki, and R. G. Baraniuk, “From denoising to compressed sensing,” IEEE Transactions on Information Theory, vol. 62, no. 9, pp. 5117–5144, 2016.
 [4] J. Yang, J. Wright, T. S. Huang, and Y. Ma, “Image super-resolution via sparse representation,” IEEE Transactions on Image Processing, vol. 19, no. 11, pp. 2861–2873, 2010.
 [5] R. Heckel, V. I. Morgenshtern, and M. Soltanolkotabi, “Super-resolution radar,” Information and Inference: A Journal of the IMA, vol. 5, no. 1, pp. 22–75, 2016.
 [6] N. Lee, “MAP support detection for greedy sparse signal recovery algorithms in compressive sensing,” IEEE Transactions on Signal Processing, vol. 64, no. 19, pp. 4987–4999, 2016.
 [7] J. Wang, S. Kwon, and B. Shim, “Generalized orthogonal matching pursuit,” IEEE Transactions on Signal Processing, vol. 60, no. 12, pp. 6202–6216, 2012.
 [8] D. Needell and J. A. Tropp, “CoSaMP: Iterative signal recovery from incomplete and inaccurate samples,” Applied and Computational Harmonic Analysis, vol. 26, no. 3, pp. 301–321, 2009.
 [9] W. Dai and O. Milenkovic, “Subspace pursuit for compressive sensing signal reconstruction,” IEEE Transactions on Information Theory, vol. 55, no. 5, pp. 2230–2249, 2009.
 [10] M. E. Davies and Y. C. Eldar, “Rank awareness in joint sparse recovery,” IEEE Transactions on Information Theory, vol. 58, no. 2, pp. 1135–1146, 2012.
 [11] S. Kwon, J. Wang, and B. Shim, “Multipath matching pursuit,” IEEE Transactions on Information Theory, vol. 60, no. 5, pp. 2986–3001, 2014.
 [12] M. Rossi, A. M. Haimovich, and Y. C. Eldar, “Spatial compressive sensing in MIMO radar with random arrays,” in 46th Annual Conference on Information Sciences and Systems (CISS), 2012, pp. 1–6.
 [13] D. P. Wipf and B. D. Rao, “Sparse Bayesian learning for basis selection,” IEEE Transactions on Signal Processing, vol. 52, no. 8, pp. 2153–2164, 2004.
 [14] S. Foucart and H. Rauhut, A mathematical introduction to compressive sensing. Springer, 2013.
 [15] B. Wang, L. Dai, T. Mir, and Z. Wang, “Joint user activity and data detection based on structured compressive sensing for NOMA,” IEEE Communications Letters, vol. 20, no. 7, pp. 1473–1476, 2016.
 [16] A. C. Cirik, N. M. Balasubramanya, and L. Lampe, “Multiuser detection using ADMMbased compressive sensing for uplink grantfree NOMA,” IEEE Wireless Communications Letters, 2017.
 [17] D. Tse and P. Viswanath, Fundamentals of wireless communication. Cambridge university press, 2005.