I Introduction
Sparse approximation techniques have found wide use in many image and signal processing applications owing to their flexibility [1][2]. Sparse representation can efficiently extract the most important features of a signal, so it provides very promising results in data compression [3], denoising [4], blind source separation [5], signal classification [6], and so on. Methods that exploit signal sparsity have two main steps. First, an overcomplete dictionary [1] is selected or learned according to the structural characteristics of the set of signals, and then the target signal is decomposed over the dictionary to obtain a compact representation. Representation in terms of a few designed or learned bases can accurately capture the signal structure, which in turn improves the distinction between noise/interference and structured signals.
In some signal processing applications, the task is to detect the presence of a signal from its noisy measurements. For example, in speech processing, Voice Activity Detection (VAD) is performed to distinguish speech segments from non-speech segments in an audio stream. VAD plays a critical role in increasing the capacity of transmission and speech storage by reducing the average bit rate [7].
Signal detection is a classical problem in signal processing, and there are several traditional detectors, including the energy detector, the matched filter and the matched subspace detector [8]. The matched filter is the most basic framework for signal detection and needs a bank of matched signals to build a detector. However, in many applications it is preferable to replace rank-1 signals by a multi-rank matched subspace [8]. The matched subspace detector takes the span of a subspace as the set of desired signals and rejects the part of the signal that lies in the orthogonal complement of the assumed subspace. The generalized likelihood ratio test (GLRT) for the matched subspace detector is the uniformly most powerful invariant (UMP-invariant) statistic for detection [8]. The subspace model for detection needs a set of bases spanning the desired signals, which can be fixed bases such as the discrete Fourier transform (DFT) or data-dependent bases such as principal component analysis (PCA). Although the subspace model is adaptive for signal analysis, it needs several parameters that must be either known or estimated: for example, the set of bases spanning the desired signals, the coefficients of the bases, the noise covariance and the signal-to-noise ratio (SNR). Depending on the knowledge about these parameters, the optimum statistic is suggested in [9] for four situations. In the case of unknown coefficients, the orthogonal projection of the observation is used to determine the contribution of each basis. The present paper assumes a more general model for signals, namely a union of subspaces. Sparsity has been exploited widely for detection purposes, e.g., abnormal event detection [10] and voice activity detection [11]
[12]. A multi-criteria detection scheme based on intelligent switching between traditional detection and sparse detection is proposed in [13]. In these works, sparsity has been used to extract features or to define a heuristic criterion for detection. The compressive detector is another application of sparsity to signal detection: it is able to detect signals using only a few measurements of the original samples while the performance is not degraded dramatically [14], [15]. The goal of the compressive detector is to keep the performance the same as that of the original detector. In this paper we use sparsity from a different point of view. The traditional detectors are generalized to incorporate sparsity in the optimum decision rule, and a new trade-off is suggested between sparsity (the rank of a subspace) and the error of projection (the distance to a low-rank subspace). We propose a new signal detection method based on the union of low-rank subspaces (ULRS) model [16], [17]. This model is able to reveal the intrinsic structure of a set of signals. The proposed detector is a generalized version of the traditional detectors: imposing a union of rank-1 subspaces as the model for the desired signals yields nothing other than the traditional matched filter bank. We investigate our detector from different points of view in order to show the relation between our method and classical detectors. We also derive a robust version of the proposed detector that provides robustness against outliers and gross errors. We provide theoretical investigations as well as experimental results on VAD.
The rest of the paper is organized as follows. Section 2 provides a brief background on sparse representation theory and basic concepts of detection theory. In Section 3 we describe our new signal detection method, study its performance and provide its robust version. Section 4 experimentally demonstrates the effectiveness of our proposed signal detection method. Finally, Section 5 concludes the paper with a summary of the proposed work.
II Theoretical Background and Review
II-A Basic Theory of Sparse Decomposition
Sparse decomposition of signals over basis functions has attracted much attention during the last decade [1]. In this approach, one wants to approximate a given signal as a linear combination of as few basis functions as possible. Each basis function is called an atom and their collection is called a dictionary [18]. The dictionary is usually overcomplete, i.e., the number of atoms is much larger than their dimension. Specifically, let $y \in \mathbb{R}^n$ be the signal to be sparsely represented over the dictionary $D \in \mathbb{R}^{n \times m}$ with $m > n$. This amounts to the following problem,
$\min_{x} \|x\|_0 \quad \text{s.t.} \quad \|y - Dx\|_2 \le \epsilon \qquad (1)$
where $\|\cdot\|_0$ stands for the so-called $\ell_0$ pseudo-norm, which counts the number of nonzero elements. Many algorithms have been introduced to find the sparsest approximation of a signal over a given overcomplete dictionary (for a good review see [19]). For a specified class of signals, e.g. the class of natural images, the dictionary should be capable of sparsely representing the signals. In some applications there is a predefined, fixed dictionary that is well matched to the content of the specific class of signals; the overcomplete DCT dictionary for natural images is an example. These non-adaptive dictionaries are favorable because of their simplicity. On the other hand, learned dictionaries match the content of the signals better [1]. Most dictionary learning algorithms are in fact generalizations of clustering algorithms: while in clustering each training signal is assigned to exactly one atom (cluster center), in dictionary learning each signal is allowed to use more than one atom, provided it uses as few atoms as possible. The general dictionary learning problem can be stated as follows,
$\min_{D, X} \|Y - DX\|_F^2 \quad \text{s.t.} \quad \|x_i\|_0 \le s, \ \ i = 1, \dots, N \qquad (2)$
where the columns of $Y$ contain the observed data and the columns $x_i$ of $X$ are the sparse representations of the observed data. Most dictionary learning algorithms solve the above problem by alternately minimizing it over $D$ and $X$; they differ mainly in how the minimization over the dictionary is performed. Dictionary learning plays an important role in sparse decomposition based methods, and a subsection of the proposed method section is devoted to it.
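As an illustration, problem (1) is often attacked greedily. The following is a minimal Orthogonal Matching Pursuit sketch in Python (NumPy); the dictionary size, sparsity level and seed are toy choices, not values from the paper:

```python
import numpy as np

def omp(D, y, k):
    """Greedy sketch of problem (1): pick up to k atoms of D that explain y."""
    m = D.shape[1]
    support, x = [], np.zeros(m)
    residual = y.astype(float).copy()
    for _ in range(k):
        # atom most correlated with the current residual
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j not in support:
            support.append(j)
        # least-squares re-fit of the coefficients on the chosen support
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        x[:] = 0.0
        x[support] = coef
        residual = y - D @ x
    return x

rng = np.random.default_rng(0)
D = rng.standard_normal((20, 50))
D /= np.linalg.norm(D, axis=0)          # unit-norm atoms
y = 1.5 * D[:, 3] - 2.0 * D[:, 17]      # a 2-sparse synthetic signal
x_hat = omp(D, y, k=2)
```

Each iteration selects the atom most correlated with the residual and re-fits the coefficients by least squares, so the residual never grows.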
II-B Basic Theory of Detection
In this section we review signal detection theory and some detectors related to the proposed one. First consider the following model for detection,
$\mathcal{H}_0: \ y = n, \qquad \mathcal{H}_1: \ y = s + n \qquad (3)$
where $y$ is the observation vector, $s$ is the signal of interest and $n$ is the observation noise of the model. First we assume that the probability density functions of $s$ and $n$ are known. In this case the likelihood ratio test (LRT) gives
$\frac{p(y \mid \mathcal{H}_1)}{p(y \mid \mathcal{H}_0)} \ \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \ \gamma \qquad (4)$
where $\gamma$ is a threshold that yields the desired probability of false alarm. Under the Gaussian assumption on the noise with covariance matrix $R$, the LRT simplifies to,
$y^T R^{-1} s \ \underset{\mathcal{H}_0}{\overset{\mathcal{H}_1}{\gtrless}} \ \gamma' \qquad (5)$
where $R$ is the noise covariance matrix. If it is not known in advance, it must be estimated, e.g. by the sample covariance matrix, in the above test. The probability of detection is then equal to [20],
$P_D = Q\!\left(Q^{-1}(P_{FA}) - \sqrt{s^T R^{-1} s}\,\right) \qquad (6)$
in which $P_{FA}$ is the probability of false alarm and $Q(\cdot)$ is the right tail probability of the standard normal distribution. Another well-known detector is the generalized LRT (GLRT) [21], which is derived by maximizing the conditional densities constituting the likelihood ratio with respect to the unknown parameters. The following detection criterion is obtained by assuming the covariance matrix to be unknown,
(7) 
where $N$ is the number of snapshots available for estimation. In [21] no optimality was claimed for the GLRT. However, Scharf and Friedlander have shown that the GLRT is uniformly most powerful (UMP) invariant [8], the strongest statement of optimality derived for a detector. The GLRT detector may be interpreted as a projection onto the null space of the interference followed by a matched subspace detector [8]. Consider the following model for the hypothesis test,
$\mathcal{H}_0: \ y = B\theta + n, \qquad \mathcal{H}_1: \ y = Sx + B\theta + n \qquad (8)$
where the columns of $B$ span the background or interference subspace and $\theta$ determines the contribution of each column of $B$; the columns of $S$ span the signal subspace to be detected, and $x$ determines the contribution of each column of $S$. Obviously, if either subspace has full rank, then $B$ or $S$ spans the whole signal space; in that case $B$ or $S$ may be overfitted for background detection or signal detection, respectively. On the other hand, restricting the ranks of $B$ and $S$ too much may result in unreliable subspaces that are unable to provide a suitable matched subspace. The matched subspace detector operates as follows,
(9) 
where $P_B^{\perp}$ is the orthogonal projection matrix onto the orthogonal complement of the interference subspace, and the statistic measures the part of the projected observation that is accounted for by the subspace spanned by $S$. Figure 1 shows the block diagram of this detector.
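The two-stage structure of the matched subspace detector (interference nulling followed by measuring the energy falling in the signal subspace) can be sketched numerically as follows; the subspace dimensions, coefficients and noise level are illustrative assumptions, not the paper's setup:

```python
import numpy as np

def proj(A):
    """Orthogonal projection matrix onto the column space of A."""
    return A @ np.linalg.pinv(A)

def matched_subspace_statistic(y, S, B):
    """Null the interference subspace B, then measure the energy of the
    remaining observation that falls in the deflected signal subspace."""
    P_B_perp = np.eye(len(y)) - proj(B)        # projector that removes B
    y_clean = P_B_perp @ y                     # interference nulled
    G = P_B_perp @ S                           # signal subspace after nulling
    return float(y_clean @ proj(G) @ y_clean)  # energy matched to the subspace

rng = np.random.default_rng(1)
n = 32
S = rng.standard_normal((n, 2))                # rank-2 signal subspace
B = rng.standard_normal((n, 3))                # rank-3 interference subspace
noise = 0.1 * rng.standard_normal(n)
y1 = S @ np.array([1.0, -0.5]) + B @ np.array([2.0, 1.0, 0.3]) + noise
y0 = B @ np.array([2.0, 1.0, 0.3]) + noise     # interference + noise only
t1 = matched_subspace_statistic(y1, S, B)
t0 = matched_subspace_statistic(y0, S, B)
```

The interference component is removed identically by the projection, so the statistic separates the two hypotheses even though the interference energy dwarfs the signal.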
In the conclusion of [8], the authors mention that the bases can be extracted from the discrete cosine transform or the wavelet transform, or learned by a data-dependent analysis such as principal component analysis (PCA). Such bases provide a matched subspace for the whole set of desired signals to be detected. For an illustration, refer to Figure 2, which shows 3-D data composed of signal and non-signal (interference and noise) parts. Two low-rank subspaces obtained by PCA are shown, corresponding to a rank-1 and a rank-2 subspace (the low-rank matched subspaces).
The main contribution of [8] may be answering 'no' to the question 'Can the GLRT be improved upon?', although no prior information on the structure of the low-rank matched filter was assumed there. Such a structural assumption can be imposed through a sparse prior on the coefficients of $S$ and $B$. The method proposed in this paper adopts the ULRS model for signals because of its good fit, demonstrated in many signal processing applications. Instead of traditional analyses such as PCA, modern methods such as those proposed in [22], [23] and [24] can be exploited to recover suitable bases spanning these low-rank subspaces. Figure 3 shows a union of matched low-rank subspaces corresponding to the data of Figure 2.
Compressive detection is another application of sparse theory to signal detection, studied in [25] and [15]. Instead of dealing with all the samples of the signal, the compressive detector works with a few measurements. This detector distinguishes between the two hypotheses,
$\mathcal{H}_0: \ z = \Phi n, \qquad \mathcal{H}_1: \ z = \Phi (s + n) \qquad (10)$
where $\Phi$ is the measurement matrix and $z$ is the measurement vector. If no further prior is known about $s$, no optimal $\Phi$ can be designed, and random measurements yield a detector with the following performance [25],
(11) 
in which the performance of the detector is degraded, by a factor depending on the compression ratio, compared to the traditional matched filter. Having knowledge of $s$ results in a better compressive detector, as shown in [25],
(12) 
in which the performance of the detector is improved compared to the random measurement detector. Reference [15] studied two cases regarding the knowledge of $s$: the first assumes that $s$ is known, and the second assumes that $s$ consists of a set of parametric bases, where the active bases can be recovered by a sparse coding algorithm. Recently, [26] investigated the problem of detecting a union of low-rank subspaces from compressed measurements. The compressive detector still performs worse than the matched filter by a compression-dependent factor.
In this paper we exploit the low-rank structure of the signals to design a new detector. Our detector is not compressive; the goal is to design a generalized detector using the sparsity (that is, the structure) that implicitly exists in the signals. The proposed detector, presented in Section 3, first assumes a model matched to sparse signals and then derives an optimum decision rule.
III The Proposed Approach
In this section, we introduce our model for signal detection. We want to distinguish between the two hypotheses $\mathcal{H}_0$ and $\mathcal{H}_1$:
$\mathcal{H}_0: \ y = n, \qquad \mathcal{H}_1: \ y = Dx + e + n \qquad (13)$
where $D$ is the dictionary, which can be interpreted as a bank of matched filters, and $e$ is the error vector of the model, denoting the mismatch between the exact matched filter and the union of subspaces spanned by the columns of $D$. Assume that $n$ is zero-mean white Gaussian noise with variance $\sigma^2$, i.e. $n \sim \mathcal{N}(0, \sigma^2 I)$. In our method, the signal ($Dx$) matched to the observed signal ($y$) is unknown, so it must be determined. This section is divided into four subsections. In the first subsection, we analyze the role of the coefficients of the linear combination ($x$) and describe our approach for coefficient estimation. In the second subsection, the performance of the proposed detection method is analyzed. Since dictionary learning is a critical issue in the model, the third subsection is devoted to it. In the last subsection we explain how our method can be made robust for detecting signals contaminated by gross errors.
III-A A discussion on the coefficients ($x$)
A linear combination of the dictionary atoms generates the matched signal for detection. Three cases are considered for estimating $x$: first, the unconstrained solution; second, the matched filter bank; and third, a Gaussian prior. First assume that there is no constraint on $x$, i.e., we take the orthogonal projection of the signal onto the span of the desired subspace. This is used in the matched subspace method to identify the part of the signal that accounts for the desired signals [8]. The solution is,
$\hat{x} = D^{\dagger} y \qquad (14)$
where $D^{\dagger}$ denotes the Moore-Penrose pseudo-inverse.
This answer suffers from overfitting, as some signals that do not contain the target signal may still be decomposed in terms of the atoms. More restrictive constraints may alleviate this problem. Now assume that just one element of $x$ is allowed to be nonzero. This constraint helps reduce overfitting, and the problem becomes,
$\min_{x} \|y - Dx\|_2^2 \quad \text{s.t.} \quad \|x\|_0 = 1 \qquad (15)$
The solution is zero except in the position corresponding to the atom with maximum correlation. This is nothing but the traditional matched filter bank: the filter with the highest correlation provides the matched signal. All the correlations together form a sufficient statistic for the decision; if every correlation is below a threshold, no detection is declared.
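Since the single-nonzero solution of (15) amounts to picking the maximally correlated atom, the matched filter bank admits a direct sketch (toy dictionary and threshold, chosen only for illustration):

```python
import numpy as np

def filter_bank_detect(D, y, threshold):
    """Solution of (15): the only nonzero coefficient sits at the atom with
    maximum |correlation|; detection is declared if it beats the threshold."""
    correlations = D.T @ y                    # bank of matched filter outputs
    j = int(np.argmax(np.abs(correlations)))
    detected = bool(np.abs(correlations[j]) > threshold)
    x = np.zeros(D.shape[1])
    if detected:
        x[j] = correlations[j]                # LS coefficient for a unit-norm atom
    return detected, j, x

rng = np.random.default_rng(2)
D = rng.standard_normal((16, 8))
D /= np.linalg.norm(D, axis=0)                # unit-norm matched filters
y = 3.0 * D[:, 5] + 0.05 * rng.standard_normal(16)
detected, j, x = filter_bank_detect(D, y, threshold=1.0)
```

For unit-norm atoms, the correlation itself is the least-squares coefficient of the selected atom, so no extra fit is needed.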
The third scenario assumes a Gaussian prior on $x$. The motivation for this assumption is to avoid overlearning and to obtain less sensitive coefficients. The estimate of $x$ under Gaussian distributions on $x$ and $n$ is,
$\hat{x} = (D^T D + \lambda I)^{-1} D^T y \qquad (16)$
This solution for the coefficients of the linear combination is ridge regression [27]. Solution (15) is the least overlearned and solution (14) is the most overlearned one. It is instructive to see how each solution covers the signal space for learning: solution (15) provides high learning for a few one-dimensional subspaces, one per atom, while solutions (14) and (16) provide high learning for many subspaces corresponding to arbitrary selections of the atoms. Involving all the atoms to form the matched signal causes undesired signals to be detected as the target, due to the expansion of the matched subspaces. To keep the number of involved atoms limited, we suggest modifying problem (16) as follows,
(17)
There is a large enough value of the regularization parameter for which the solution of the above problem coincides with (15). We now show that this problem is the MAP estimate of $x$ under a multivariate independent Gaussian prior,
(18) 
where $W$ is a diagonal matrix. Under this assumption, two unknowns must be estimated. First we obtain the ML estimate of $W$,
(19) 
Setting the derivative with respect to $W$ equal to zero gives an equation with no solution; however, we need only the diagonal elements of $W$, due to the independence assumption on the entries of $x$. Taking the derivative with respect to only the diagonal elements of $W$ results in,
(20) 
where $\epsilon$ is a small positive constant that avoids division by zero. We then insert the obtained $W$ into (18):
(21) 
Actually, $W$ is an auxiliary parameter used only to better adapt the distribution of the coefficients. The obtained $W$ yields a distribution that assigns higher probability to orthogonal low-rank subspaces in the space to which $x$ belongs (for an illustration see Fig. 4). Corresponding to these orthogonal low-rank subspaces there are non-orthogonal low-rank subspaces in the observation domain, to which $Dx$ belongs. The MAP estimate of $x$ under the prior of (21) results in the suggested problem (17), which generalizes (15) with respect to the sparsity level of the coefficients, and generalizes (16) with respect to the prior distribution on the coefficients. In [4], it is proved that under a certain condition, problem (17) leads to the same solution as the following regularized problem:
(22) 
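For intuition, the three coefficient estimators discussed in this subsection can be compared on toy data: the unconstrained projection (14), the single-atom solution (15), and the ridge solution (16). The data and the value of the regularization parameter are arbitrary illustrations:

```python
import numpy as np

rng = np.random.default_rng(3)
D = rng.standard_normal((20, 40))
D /= np.linalg.norm(D, axis=0)                   # unit-norm atoms
y = D @ np.where(np.arange(40) == 7, 2.0, 0.0)   # signal living on atom 7

# (14): unconstrained orthogonal projection (minimum-norm least squares)
x_proj = np.linalg.pinv(D) @ y

# (15): best single atom (matched filter bank)
j = int(np.argmax(np.abs(D.T @ y)))
x_mf = np.zeros(40)
x_mf[j] = D[:, j] @ y

# (16): ridge regression, the MAP estimate under a Gaussian prior on x
lam = 0.5
x_ridge = np.linalg.solve(D.T @ D + lam * np.eye(40), D.T @ y)
```

The projection solution spreads energy over many atoms (overfitting), the single-atom solution is maximally sparse, and ridge shrinks all coefficients; problem (17) interpolates between the last two.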
III-B Performance analysis
First, we define the false alarm rate and the detection rate,
(23) 
The threshold parameter is chosen to satisfy the desired probability of false alarm, $P_{FA}$,
(24) 
Solving this equation yields the threshold for the decision rule,
(25) 
where the threshold is a constant depending on the model parameters and the desired $P_{FA}$, and the decision is made by comparing the sufficient statistic with it. It is easy to show that,
(26) 
where the parameters are determined by the noise and model-error statistics. As can be seen, the performance of the detector is degraded by a factor related to the model error. But our detector has learned a suitable space for the signals to be detected; in other words, we accept a small deterioration of the performance due to the generalization of the detector. The flexibility of the sparse representation based detector is its most distinguished advantage. Dictionary learning [23] is the most important issue for methods based on sparse representation. In the sparse detector, the dictionary should be learned such that the ESR (the error of the ULRS model) is small enough to avoid performance deterioration, and at the same time not too small, to avoid overlearning. In Section III-C we explain how to learn an appropriate dictionary. In (25), sparsity has no effect on the performance. We now introduce a decision rule for detection that exploits the sparsity of the coefficients. To this end, we solve the detection problem using the prior obtained in (21). The new decision rule is,
(27) 
where the added term is weighted by a positive constant. As the sparsity of the representation increases, $\mathcal{H}_1$ becomes more probable, because the representation in terms of the dictionary is sparse only for the learned signals. Similarly to (26), it is easy to show that,
(28) 
where the mapping is an increasing homogeneous function. As can be seen, the probability of detection increases (decreases) when sparsity increases (decreases) for small false alarm rates. Since the desired false alarm rates are usually small, the probability of detection increases in the operating region (it is favorable for a detector that the top-left region of its ROC be close to the ideal ROC). If the representation of a signal is sparse, the signal lies in the desired low-rank subspace, i.e., it meets our assumed model for the target signals. Thus the probability of detection increases for signals that have a sparse representation in terms of the dictionary atoms, which is exactly what we expect from sparsity. Figure 5 shows the ROC of (22) at SNR = +20 dB for different sparsity levels.
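When closed-form thresholds are unavailable, the threshold achieving a desired false alarm rate can be calibrated by Monte Carlo simulation on noise-only data. This generic sketch uses a plain matched filter statistic as a stand-in, not the statistics derived above; the signal, amplitude and trial counts are illustrative:

```python
import numpy as np

def calibrate_threshold(statistic, noise_sampler, p_fa, n_trials=20000):
    """Pick the threshold whose empirical false-alarm rate equals p_fa:
    the (1 - p_fa) quantile of the noise-only statistic."""
    t0 = np.array([statistic(noise_sampler()) for _ in range(n_trials)])
    return np.quantile(t0, 1.0 - p_fa)

rng = np.random.default_rng(4)
n = 16
s = np.ones(n) / np.sqrt(n)                    # toy unit-norm target signal
stat = lambda y: s @ y                         # matched filter statistic
noise = lambda: rng.standard_normal(n)

gamma = calibrate_threshold(stat, noise, p_fa=0.05)
# empirical detection rate when the signal has amplitude 2
hits = np.mean([stat(2.0 * s + noise()) > gamma for _ in range(2000)])
```

Under white unit-variance noise the statistic is standard normal, so `gamma` should land near the 95% Gaussian quantile, and the detection rate exceeds the false alarm rate whenever the signal amplitude is positive.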
A traditional matched filter bank has the highest sparsity level, but it is not practical; for instance, in voice activity detection it is not feasible to collect all possible voices in a bank. A small number of filters results in a high ESR and low performance. Our proposed detector makes a trade-off between ESR and sparsity in order to achieve good detection performance. Dictionary learning plays a critical role in this trade-off and is studied in the following.
III-C Learning the Dictionary
In this section we explain the role of dictionary learning in the proposed detection method. In many detection problems, the number of training signals may not be as large as the number of matched filters needed to cover the whole target signal space. In the proposed approach, we search for a dictionary, learned from a finite set of signals, that represents those signals efficiently. The dictionary should also be general enough to deal with signals that have not been seen before. Assume that we have a set of training signals $Y$. Dictionary learning is a function that maps $Y$ to a dictionary $D$ with many fewer atoms than training signals. An appropriate dictionary should have a small enough ESR to represent the training data well, but the ESR should not be too small, so that the dictionary remains general and is not overlearned on the training data alone. Two algorithms for dictionary learning are presented.
III-C1 K-means algorithm
The K-means method characterizes the training data by $K$ cluster centroids [28]. They are determined by minimizing the sum of squared errors,
$\min_{D} \sum_{i} \min_{k} \|y_i - d_k\|_2^2 \qquad (29)$
where the columns of $D$ are the centroids $d_k$, $k = 1, \dots, K$. The resulting dictionary assigns a centroid to each training sample, and $K$ should be large enough to reach the desired ESR. Problem (15) then has to be solved to determine the coefficients, so that only one of them is nonzero. This dictionary learns a set of points in the signal space; as the distance from these points increases, the level of learning decreases. In other words, this dictionary corresponds to a union-of-spheres model, which may not be a suitable choice for ordinary signals. The next algorithm agrees with a more appropriate model for the data: K-SVD learns the signal space as a union of low-rank subspaces.
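A minimal sketch of (29) via Lloyd iterations; the data, cluster count and seeds are toy choices:

```python
import numpy as np

def kmeans_dictionary(Y, K, n_iter=50, seed=0):
    """Learn K centroids (dictionary atoms) by minimizing the sum of
    squared distances of each training column of Y to its nearest centroid."""
    rng = np.random.default_rng(seed)
    n, N = Y.shape
    D = Y[:, rng.choice(N, K, replace=False)].copy()         # init from data
    for _ in range(n_iter):
        # assignment step: nearest centroid for every training vector
        d2 = ((Y[:, None, :] - D[:, :, None]) ** 2).sum(axis=0)  # K x N
        labels = d2.argmin(axis=0)
        # update step: recompute each centroid as the mean of its cluster
        for k in range(K):
            if np.any(labels == k):
                D[:, k] = Y[:, labels == k].mean(axis=1)
    return D, labels

rng = np.random.default_rng(5)
# two well-separated toy clusters in R^2
Y = np.hstack([rng.normal(0, 0.1, (2, 30)), rng.normal(5, 0.1, (2, 30))])
D, labels = kmeans_dictionary(Y, K=2)
```

Each training vector ends up represented by exactly one atom, which is precisely the one-nonzero coefficient constraint of (15).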
III-C2 K-SVD algorithm
By extending the union of spheres to a union of low-dimensional subspaces, the K-means algorithm is generalized to the K-SVD algorithm [23]. This flexible model agrees with many signal classes, such as images and audio. For example, natural images have sparse representations in terms of the DCT dictionary: a combination of only a few DCT bases approximates the blocks of an image. The following problem yields the dictionary learned by K-SVD,
$\min_{D, X} \|Y - DX\|_F^2 \quad \text{s.t.} \quad \|x_i\|_0 \le T_0, \ \ \forall i \qquad (30)$
This algorithm is based on atom-by-atom updating of the columns of $D$; recently, more efficient algorithms for atom-by-atom updating have been suggested [29]. Each arbitrary selection of a few columns characterizes a cluster corresponding to a subspace. The dictionary learned by K-SVD is in agreement with the proposed problem (22): after learning, test signals that lie on the learned low-dimensional subspaces can be reconstructed and detected. In addition to learning the dictionary from training signals, it is possible to design it using parametric functions [30]; the kernels of the FFT and the DCT, whose bases sweep the frequency parameter, are two examples. Figure 6 shows the block diagram of the proposed detection method.
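A compact sketch of the K-SVD alternation, greedy sparse coding followed by a rank-1 SVD update of each atom together with its active coefficients, on synthetic 2-sparse data (sizes and iteration counts are illustrative):

```python
import numpy as np

def sparse_code(D, Y, k):
    """Column-wise greedy (OMP-style) coding with up to k atoms per signal."""
    X = np.zeros((D.shape[1], Y.shape[1]))
    for i in range(Y.shape[1]):
        y, support, r = Y[:, i], [], Y[:, i].copy()
        for _ in range(k):
            j = int(np.argmax(np.abs(D.T @ r)))
            if j not in support:
                support.append(j)
            coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
            r = y - D[:, support] @ coef
        X[support, i] = coef
    return X

def ksvd(Y, n_atoms, k, n_iter=10, seed=0):
    """Sketch of K-SVD [23]: alternate sparse coding and rank-1 (SVD)
    updates of each atom and its nonzero coefficients."""
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((Y.shape[0], n_atoms))
    D /= np.linalg.norm(D, axis=0)
    for _ in range(n_iter):
        X = sparse_code(D, Y, k)
        for j in range(n_atoms):
            users = np.nonzero(X[j])[0]          # signals using atom j
            if users.size == 0:
                continue
            X[j, users] = 0.0                    # residual without atom j
            E = Y[:, users] - D @ X[:, users]
            U, s, Vt = np.linalg.svd(E, full_matrices=False)
            D[:, j] = U[:, 0]                    # best rank-1 fit of E
            X[j, users] = s[0] * Vt[0]
    return D, X

rng = np.random.default_rng(6)
D_true = rng.standard_normal((16, 8)); D_true /= np.linalg.norm(D_true, axis=0)
codes = np.zeros((8, 200))
for i in range(200):                             # 2-sparse synthetic signals
    idx = rng.choice(8, 2, replace=False)
    codes[idx, i] = rng.standard_normal(2)
Y = D_true @ codes
D_hat, X_hat = ksvd(Y, n_atoms=8, k=2, n_iter=15)
err = np.linalg.norm(Y - D_hat @ X_hat) / np.linalg.norm(Y)
```

The SVD step keeps each atom unit-norm while jointly refitting its coefficients, which is what distinguishes K-SVD from a plain alternating least-squares update.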
III-D Robustness
Assume that a dictionary has been learned to detect face images without sunglasses. If a face image with sunglasses is presented for detection, the gross error in the eye region may lead to a wrong decision. To solve this problem, an error distribution with a heavier tail than the Gaussian has to be assumed; we suggest the Laplace distribution. Thus $\mathcal{H}_1$ implies that the observed signal is a combination of a few atoms of $D$, a Laplace-distributed error and a Gaussian-distributed noise,
(31) 
The problem of coefficient estimation for (22) under this new prior has already been studied in robust statistics [31],
(32) 
where,
(33) 
In other words, small errors are penalized by the $\ell_2$ norm and large errors by the $\ell_1$ norm; the mixture parameter of the Gaussian and Laplace distributions controls the transition. Let us rewrite (32) as follows,
(34) 
Let us define the stacked coefficient vector $z = [x^T, e^T]^T$ and the extended dictionary $\tilde{D} = [D, \lambda I]$.
(35) 
By this substitution, we have,
(36) 
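As a toy illustration of the extended-dictionary formulation above: a greedy pursuit over [D, I] lets the identity atoms absorb gross errors so that the remaining coefficients recover the clean signal. The scaling of the identity block, the sizes and the corruption pattern are assumptions for illustration:

```python
import numpy as np

def greedy_code(D, y, k):
    """Simple OMP-style pursuit, used only for illustration."""
    support, r = [], y.copy()
    for _ in range(k):
        j = int(np.argmax(np.abs(D.T @ r)))
        if j not in support:
            support.append(j)
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        r = y - D[:, support] @ coef
    x = np.zeros(D.shape[1])
    x[support] = coef
    return x

rng = np.random.default_rng(7)
n, m = 32, 64
D = rng.standard_normal((n, m)); D /= np.linalg.norm(D, axis=0)
clean = 2.0 * D[:, 10] - 1.0 * D[:, 42]       # 2-sparse clean signal
y = clean.copy()
y[[3, 20]] += np.array([5.0, -4.0])           # gross errors on two samples

# extended dictionary: identity atoms can absorb the gross errors
D_ext = np.hstack([D, np.eye(n)])
z = greedy_code(D_ext, y, k=4)                # 2 signal atoms + 2 error spikes
x, e = z[:m], z[m:]                           # split back into signal / error
recovered = D @ x
```

The identity atoms soak up the two corrupted samples, so the dictionary part of the code stays close to the clean signal instead of bending toward the outliers.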
This problem is similar to (22), except that its dictionary is extended by a scaled identity matrix. The identity matrix projects the inappropriate parts of the signals onto the corresponding coefficients; such parts may be large errors, interference lying outside the desired subspaces, or outlier data. The authors of [32] intuitively used the same extended dictionary to obtain a robust framework for face recognition. A similar procedure can be pursued to learn a robust dictionary from a set of unreliable data [33].
IV Experimental Results
We evaluated the performance of the proposed method in the case study of VAD. To construct the learned dictionary, clean speech signals from the NOIZEUS database were used [34]. NOIZEUS contains thirty sentences, covering all phonemes of American English, produced by three male and three female speakers, originally sampled at 25 kHz and downsampled to 8 kHz. We divided the clean speech signals into 25 ms frames with a 10 ms frame shift. After removing the silent frames, we extracted from each speech frame standard Mel-frequency cepstral coefficients (MFCC) using 10 Mel triangular filters, the energy values computed at each of the 10 Mel triangular filters, the total energy (the first cepstral coefficient) and the entropy. MFCC features capture the most relevant information of the speech signal and are widely used in speech and speaker recognition, which makes the VAD method easy to integrate with existing applications. Our feature vector was thus 24-dimensional, and the total number of vectors was about 6300. Using the K-SVD algorithm, we obtained a learned dictionary with 100 atoms, which was used in the following experiments; sparse representations were obtained with the OMP method.
To evaluate the performance of the proposed method, the speech detection probability PD and the false alarm probability PF were measured against a reference decision. A clean test utterance (sp10.wav) from the NOIZEUS database, downsampled to 8 kHz, was used for the reference decisions. To simulate noisy environments, several noise signals from the NOIZEUS database were used, including recordings from different places (Babble (crowd of people), Car, …) at SNRs of 0 dB, 5 dB, 10 dB and 15 dB. The ROC curves of the proposed VAD method are illustrated in Fig. 7, which shows PD versus PF.
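Frame-wise PD and PF against a reference decision reduce to conditional means; a small sketch with hypothetical decision arrays (not the paper's data):

```python
import numpy as np

def pd_pf(decisions, reference):
    """Frame-wise detection and false-alarm probabilities against a
    reference labeling (1 = speech, 0 = non-speech)."""
    decisions, reference = np.asarray(decisions), np.asarray(reference)
    pd = np.mean(decisions[reference == 1])   # hits among speech frames
    pf = np.mean(decisions[reference == 0])   # alarms among non-speech frames
    return pd, pf

ref = np.array([1, 1, 1, 0, 0, 0, 1, 0])
dec = np.array([1, 1, 0, 0, 1, 0, 1, 0])
pd, pf = pd_pf(dec, ref)                      # pd = 0.75, pf = 0.25
```

Sweeping the detector threshold and recording (pf, pd) pairs traces the ROC curve shown in Fig. 7.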
Sparsity has already been exploited in voice activity detection; e.g., in [11] a sparse feature extraction is used to derive a decision rule for detection. We compared our method with the sparsity-based VAD method proposed in [11]. As can be seen in Fig. 8, our method shows better performance in low-SNR conditions.
V Conclusion
This paper presented a new sparsity-based detector. Its performance was evaluated in a realistic application: voice activity detection in speech processing. The detector introduces a new trade-off for detector design under the union of low-rank subspaces model: the trade-off between sparsity and the error of the ULRS model, denoted ESR. In our detector, the number of filters in the bank is proportional to the size of the dictionary, and an appropriate dictionary regularizes both the sparsity and the introduced ESR parameter. Simulation results showed that the proposed method is effective and highly noise-robust, thanks to the projection of signals onto reliable learned low-rank subspaces.
References
 [1] M. Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing. Springer Publishing Company, Incorporated, 1st ed., 2010.
 [2] A. M. Bruckstein, D. L. Donoho, and M. Elad, “From sparse solutions of systems of equations to sparse modeling of signals and images,” SIAM Rev., vol. 51, pp. 34–81, Feb. 2009.
 [3] A. Rahmoune, P. Vandergheynst, and P. Frossard, “Sparse approximation using mterm pursuit and application in image and video coding,” IEEE Transactions on Image Processing, vol. 21, no. 4, pp. 1950–1962, 2012.
 [4] M. Elad and M. Aharon, “Image denoising via sparse and redundant representations over learned dictionaries,” Trans. Img. Proc., vol. 15, pp. 3736–3745, Dec. 2006.

 [5] P. Comon and C. Jutten, Handbook of Blind Source Separation: Independent Component Analysis and Applications. Academic Press, 1st ed., 2010.
 [6] J. Mairal, F. Bach, J. Ponce, G. Sapiro, and A. Zisserman, “Supervised dictionary learning,” in Advances in Neural Information Processing Systems, 2008.
 [7] A. Benyassine, E. Shlomot, H.-Y. Su, D. Massaloux, C. Lamblin, and J.-P. Petit, “ITU-T recommendation G.729 annex B: a silence compression scheme for use with G.729 optimized for V.70 digital simultaneous voice and data applications,” Communications Magazine, IEEE, vol. 35, pp. 64–73, Sep 1997.
 [8] L. Scharf and B. Friedlander, “Matched subspace detectors,” Signal Processing, IEEE Transactions on, vol. 42, pp. 2146–2157, Aug 1994.
 [9] S. Kraut, L. Scharf, and L. McWhorter, “Adaptive subspace detectors,” Signal Processing, IEEE Transactions on, vol. 49, pp. 1–16, Jan 2001.
 [10] P. Ahmadi, S. Khoram, M. Joneidi, I. Gholampour, and M. Tabandeh, “Discovering motion patterns in traffic videos using improved group sparse topical coding,” in Telecommunications (IST), 2014 7th International Symposium on, pp. 343–348, Sept 2014.
 [11] P. Teng and Y. Jia, “Voice activity detection via noise reducing using nonnegative sparse coding,” Signal Processing Letters, IEEE, vol. 20, pp. 475–478, May 2013.

 [12] T. Le, K. Luu, and M. Savvides, “Sparcles: Dynamic sparse classifiers with level sets for robust beard/moustache detection and segmentation,” Image Processing, IEEE Transactions on, vol. 22, pp. 3097–3107, Aug 2013.
 [13] B. Shim and B. Song, “Multiuser detection via compressive sensing,” Communications Letters, IEEE, vol. 16, pp. 972–974, July 2012.
 [14] M. Duarte, M. Davenport, M. Wakin, and R. Baraniuk, “Sparse signal detection from incoherent projections,” in Acoustics, Speech and Signal Processing, 2006. ICASSP 2006 Proceedings. 2006 IEEE International Conference on, vol. 3, pp. III–III, May 2006.
 [15] M. Davenport, P. Boufounos, M. Wakin, and R. Baraniuk, “Signal processing with compressive measurements,” Selected Topics in Signal Processing, IEEE Journal of, vol. 4, pp. 445–460, April 2010.
 [16] Y. Lu and M. Do, “A theory for sampling signals from a union of subspaces,” Signal Processing, IEEE Transactions on, vol. 56, pp. 2334–2345, June 2008.
 [17] G. Liu, Z. Lin, S. Yan, J. Sun, Y. Yu, and Y. Ma, “Robust recovery of subspace structures by lowrank representation,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 35, pp. 171–184, Jan 2013.
 [18] S. Mallat and Z. Zhang, “Matching pursuits with timefrequency dictionaries,” Signal Processing, IEEE Transactions on, vol. 41, pp. 3397–3415, Dec 1993.
 [19] J. Tropp and S. Wright, “Computational methods for sparse solution of linear inverse problems,” Proceedings of the IEEE, vol. 98, pp. 948–958, June 2010.
 [20] M. A. Davenport, M. B. Wakin, and R. G. Baraniuk, “Detection and estimation with compressive measurements,” tech. rep., 2006.
 [21] E. Kelly, “An adaptive detection algorithm,” Aerospace and Electronic Systems, IEEE Transactions on, vol. AES22, pp. 115–127, March 1986.
 [22] M. Sadeghi, M. Joneidi, M. BabaieZadeh, and C. Jutten, “Sequential subspace finding: A new algorithm for learning lowdimensional linear subspaces,” in Signal Processing Conference (EUSIPCO), 2013 Proceedings of the 21st European, pp. 1–5, Sept 2013.
 [23] M. Aharon, M. Elad, and A. Bruckstein, “K-SVD: An algorithm for designing overcomplete dictionaries for sparse representation,” Signal Processing, IEEE Transactions on, vol. 54, pp. 4311–4322, Nov 2006.
 [24] R. Rubinstein, T. Peleg, and M. Elad, “Analysis ksvd: A dictionarylearning algorithm for the analysis sparse model,” Signal Processing, IEEE Transactions on, vol. 61, pp. 661–677, Feb 2013.
 [25] Z. Wang, G. Arce, and B. Sadler, “Subspace compressive detection for sparse signals,” in Acoustics, Speech and Signal Processing, 2008. ICASSP 2008. IEEE International Conference on, pp. 3873–3876, March 2008.
 [26] Y. Eldar and M. Mishali, “Robust recovery of signals from a structured union of subspaces,” Information Theory, IEEE Transactions on, vol. 55, pp. 5302–5316, Nov 2009.
 [27] A. E. Hoerl and R. W. Kennard, “Ridge regression: Biased estimation for nonorthogonal problems,” Technometrics, vol. 12, pp. 55–67, 1970.
 [28] A. K. Jain, “Data clustering: 50 years beyond k-means,” Pattern Recogn. Lett., vol. 31, pp. 651–666, June 2010.
 [29] M. Sadeghi, M. BabaieZadeh, and C. Jutten, “Learning overcomplete dictionaries based on atombyatom updating,” Signal Processing, IEEE Transactions on, vol. 62, pp. 883–891, Feb 2014.
 [30] M. Yaghoobi, L. Daudet, and M. Davies, “Parametric dictionary design for sparse coding,” Signal Processing, IEEE Transactions on, vol. 57, pp. 4800–4810, Dec 2009.
 [31] P. Huber, J. Wiley, and W. InterScience, Robust statistics. Wiley New York, 1981.
 [32] J. Wright, A. Yang, A. Ganesh, S. Sastry, and Y. Ma, “Robust face recognition via sparse representation,” Pattern Analysis and Machine Intelligence, IEEE Transactions on, vol. 31, pp. 210–227, Feb 2009.
 [33] S. Amini, M. Sadeghi, M. Joneidi, M. BabaieZadeh, and C. Jutten, “Outlieraware dictionary learning for sparse representation,” in Machine Learning for Signal Processing (MLSP), 2014 IEEE International Workshop on, pp. 1–6, Sept 2014.
 [34] P. C. Loizou, Speech Enhancement: Theory and Practice. Boca Raton, FL, USA: CRC Press, Inc., 2nd ed., 2013.