Estimating the number of sources efficiently and accurately is important to many applications that involve array signal processing. Such applications assume this parameter to be known a priori, and further processing depends on it. These applications include direction of arrival (DoA) estimation, blind source separation and channel order determination. In DoA algorithms such as MUSIC or ESPRIT, knowing the number of sources impinging on the array is critical for separating the noise and signal subspaces after eigenvalue decomposition. DoA estimation in turn feeds many further applications, including localization and tracking of objects, steering the signal to a desired user in wireless networks, and sound and speech processing. Hence, many algorithms have been proposed to detect the number of sources, including information theoretic criterion based, eigenvector-based, and threshold-based estimation.
Information theoretic approaches such as Akaike’s information criterion (AIC) and minimum description length (MDL) are the most widely used methods for estimating the number of sources. These criterion-based estimation algorithms are mostly computationally complex and perform poorly with a low number of samples and at low SNR. The complexity of both methods comes from minimizing the criterion to search for the minimum AIC or MDL value, in addition to the eigenvalue decomposition (EVD) of the auto covariance matrix of the observed data. The poor performance is due to inaccurate estimation of the auto covariance matrix at low SNR, especially with a low number of samples, which leaves no clear difference between the eigenvalues needed for estimating the number of sources. Moreover, such methods assume the noise to be sparse-like and uncorrelated with the signal, and hence fail in practical scenarios such as underwater and indoor office environments. As a result, much research has tried to solve these problems by modifying the traditional algorithms or by proposing other ways of estimation.
Some works tried to reduce the complexity by avoiding the EVD and solving the problem with a Multi-Stage Wiener Filter (MSWF). However, applications such as DoA estimation mostly involve the EVD anyway, so performing it adds no extra complexity. Another work presented a threshold-based estimation algorithm built on peak-to-average ratio (PAR) characteristics. The algorithm calculates the PAR values of the received data and takes the differences between adjacent values, which are compared to a threshold. If a difference exceeds the threshold, the number of sources is given by the location of that point. The threshold is set based on the gradient of the PAR values, the number of array elements and the minimum PAR value. Results showed better performance than AIC and MDL under low SNR conditions; however, the threshold needs to be adjusted and the probability of detection is still affected drastically by the number of samples.
In this paper, we note that most information theoretic approaches are computationally complex, while threshold-based approaches need a threshold that must be reconfigured for different parameters. Moreover, almost all previous work considered the eigenvalues of the auto covariance matrix without considering the eigenvalues of the auto correlation coefficient matrix, which can result in a much simpler detection approach. Hence, this paper proposes a simple estimation algorithm that uses the eigenvalues of the auto correlation coefficient matrix and estimates the number of sources by looking for the maximum moving increment or moving standard deviation between eigenvalues. The moving standard deviation here is the difference between two consecutive biased standard deviations, each computed over two eigenvalues only. The algorithm is compared, in terms of error rate, to information theoretic approaches under different scenarios and setups.
The rest of the paper is organized as follows. Section II presents the system model, while Section III presents how the eigenvalues are calculated. Section IV presents some existing techniques and Section V presents the proposed algorithm. Section VI presents simulation results and comparisons. Finally, the conclusion and future work are presented in Section VII.
II System Model
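In our system model, we assume that the receiver is equipped with an $M$-sensor uniform antenna array. Considering that $L$ signals are impinging on the receiver’s array, the received signal at time instant $t$ can be expressed as:

$$\mathbf{x}(t) = \sum_{l=1}^{L} \mathbf{a}(\phi_l)\, s_l(t) + \mathbf{w}(t), \tag{1}$$

where $\mathbf{a}(\phi_l)$ is the steering vector for the signal arriving at azimuth angle $\phi_l$, $s_l(t)$ is the impinging signal from the $l$-th source at time $t$, and $\mathbf{w}(t)$ is the additive white Gaussian noise (AWGN). In matrix notation, (1) can be represented as:

$$\mathbf{X} = \mathbf{A}\mathbf{S} + \mathbf{W},$$

where $\mathbf{X} \in \mathbb{C}^{M \times N}$, $\mathbf{S} \in \mathbb{C}^{L \times N}$, $\mathbf{A} \in \mathbb{C}^{M \times L}$, $\mathbf{W} \in \mathbb{C}^{M \times N}$, with $N$ being the total number of collected samples and $\mathbb{C}$ the set of complex numbers. The matrix of steering vectors is:

$$\mathbf{A} = \left[\mathbf{a}(\phi_1), \mathbf{a}(\phi_2), \dots, \mathbf{a}(\phi_L)\right].$$

The steering vector for a uniform circular array (UCA) can be represented as:

$$\mathbf{a}(\phi_l) = \left[e^{j\kappa r\cos(\phi_l-\gamma_1)}, e^{j\kappa r\cos(\phi_l-\gamma_2)}, \dots, e^{j\kappa r\cos(\phi_l-\gamma_M)}\right]^T,$$

with wavenumber $\kappa = 2\pi/\lambda_c$, where $\lambda_c$ is the carrier wavelength, radius $r$, and $\gamma_m = 2\pi(m-1)/M$ the angular position of the $m$-th element.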
The auto covariance matrix of the received data can be expressed as:

$$\mathbf{R}_{xx} = E\left[\mathbf{x}(t)\,\mathbf{x}^H(t)\right] = \mathbf{A}\mathbf{R}_{ss}\mathbf{A}^H + \mathbf{R}_{ww},$$

where $E[\cdot]$ denotes the expectation operation, $(\cdot)^H$ denotes the Hermitian operation, $\mathbf{R}_{ss}$ is the auto covariance matrix of the impinging signals, and $\mathbf{R}_{ww} = \sigma_w^2\mathbf{I}$ is the auto covariance matrix of the receiver’s AWGN, with $\sigma_w^2$ the noise variance and $\mathbf{I}$ the identity matrix. It is worth noting that the auto covariance matrix of the impinging signals, $\mathbf{R}_{ss}$, is assumed to be a full rank matrix. This implies that its columns are linearly independent or, in other words, that the impinging signals are not correlated. Consequently, if the impinging signals are correlated, $\mathbf{R}_{ss}$ will be rank deficient.
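The system model above can be illustrated with a short simulation. The following is a minimal sketch, not the authors' simulation code: the function names `uca_steering`, `simulate_snapshots` and `sample_covariance` are hypothetical, and it assumes a UCA radius of half a wavelength, elements placed at angles $2\pi m/M$, unit-power QPSK sources and an SNR defined per source and per sensor. None of these values is stated in the text.

```python
import numpy as np

def uca_steering(phi, M, radius_wavelengths=0.5):
    """Steering vector of an M-element UCA for azimuth phi in radians.
    The radius of 0.5 wavelengths is an assumed value."""
    gamma = 2.0 * np.pi * np.arange(M) / M           # assumed element positions
    return np.exp(1j * 2.0 * np.pi * radius_wavelengths * np.cos(phi - gamma))

def simulate_snapshots(doas, M=8, N=1024, snr_db=0.0, seed=0):
    """Return the M x N data matrix X = A S + W for sources at azimuths `doas`."""
    rng = np.random.default_rng(seed)
    A = np.column_stack([uca_steering(phi, M) for phi in doas])   # M x L
    symbols = rng.integers(0, 4, size=(len(doas), N))
    S = np.exp(1j * (np.pi / 4 + np.pi / 2 * symbols))            # unit-power QPSK
    noise_var = 10.0 ** (-snr_db / 10.0)                          # SNR per source
    W = np.sqrt(noise_var / 2.0) * (rng.standard_normal((M, N))
                                    + 1j * rng.standard_normal((M, N)))
    return A @ S + W

def sample_covariance(X):
    """Sample estimate of the auto covariance matrix from N snapshots."""
    return X @ X.conj().T / X.shape[1]

X = simulate_snapshots([0.3, 1.2], M=8, N=1024, snr_db=10.0)
R = sample_covariance(X)
print(np.linalg.eigvalsh(R))   # ascending: noise eigenvalues first, two signal eigenvalues last
```

With two uncorrelated sources, the two largest eigenvalues of the sample covariance stand well above the remaining six, which is the structure that the estimators discussed later rely on.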
III Problem Formulation
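The auto covariance matrix of the received signal from the antenna array is typically estimated when estimating the DoA [13, 14]. For subspace-based techniques such as MUltiple SIgnal Classification (MUSIC), which is widely used and known for its superb performance particularly at low SNR levels, the EVD is applied to $\mathbf{R}_{xx}$ as a step in estimating the DoA. In other words, estimating $\mathbf{R}_{xx}$ and its EVD is a conventional step in most DoA estimation algorithms. Applying the EVD to $\mathbf{R}_{xx}$ leads to:

$$\mathbf{R}_{xx} = \mathbf{U}_s\boldsymbol{\Lambda}_s\mathbf{U}_s^H + \mathbf{U}_n\boldsymbol{\Lambda}_n\mathbf{U}_n^H,$$

where $\mathbf{U}_s$ and $\mathbf{U}_n$ are the unitary matrices of the signal and noise subspaces, and $\boldsymbol{\Lambda}_s$ and $\boldsymbol{\Lambda}_n$ are diagonal matrices of the eigenvalues of the signal and noise, respectively. This decomposition can be expressed as:

$$\mathbf{R}_{xx} = \sum_{i=1}^{L}\lambda_i\mathbf{u}_i\mathbf{u}_i^H + \sum_{i=L+1}^{M}\lambda_i\mathbf{u}_i\mathbf{u}_i^H. \tag{7}$$

The eigenvalues $\lambda_i$ with their corresponding eigenvectors $\mathbf{u}_i$ define the signal and noise subspaces for $i = 1, \dots, L$ and $i = L+1, \dots, M$, respectively. The problem is then to estimate the value of $L$, i.e., the number of impinging signals, given the estimated $\mathbf{R}_{xx}$.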
IV Existing Techniques
AIC and MDL are the most widely used algorithms for estimating the number of sources. They are information theoretic order determination models that use the eigenvalues of the sample auto covariance matrix to determine how many of the smallest eigenvalues are approximately equal. Those eigenvalues lie in the noise subspace, while the others lie in the signal subspace. Both algorithms minimize a log-likelihood-based criterion over the number of detectable signals. The derivation of these criteria is not given here; the details can be found in the literature. When the eigenvalues are ordered in descending order, i.e., $\lambda_1 \geq \lambda_2 \geq \dots \geq \lambda_M$, the AIC criterion can be expressed as:

$$\mathrm{AIC}(k) = -2N(M-k)\log\left(\frac{\prod_{i=k+1}^{M}\lambda_i^{1/(M-k)}}{\frac{1}{M-k}\sum_{i=k+1}^{M}\lambda_i}\right) + 2k(2M-k), \tag{8}$$

while the MDL criterion can be expressed as:

$$\mathrm{MDL}(k) = -N(M-k)\log\left(\frac{\prod_{i=k+1}^{M}\lambda_i^{1/(M-k)}}{\frac{1}{M-k}\sum_{i=k+1}^{M}\lambda_i}\right) + \frac{1}{2}k(2M-k)\log N, \tag{9}$$

where $i$ is the index of the eigenvalues, $k \in \{0, \dots, M-1\}$ is the candidate number of sources, $M$ is the number of array elements and $N$ is the number of samples. The estimated number of sources is the value of $k$ that minimizes the criterion. For the rest of the paper, we use AIC and MDL as references against which to compare the performance of our proposed algorithms.

Another approach for estimating the number of sources is based on setting a threshold for the eigenvalue increment. It was noted that the eigenvalues of the noise subspace are close to each other and that the difference between them does not exceed a certain threshold. Hence, the increment between eigenvalues is compared with a threshold to estimate the number of sources. The threshold in that approach is computed from the estimated signal power, the last eigenvalue (the eigenvalue with index $M$), and a coefficient that is found through extensive computer simulation for each particular pair of parameter values. In other words, each time either of these parameters or both of them change, a comprehensive simulation must have been run beforehand to find the best value of the coefficient.
AIC and MDL are more computationally expensive than the eigenvalue increment threshold approach, since the minimization problems in (8) and (9) must be solved each time an estimate of the number of sources is needed. On the other hand, the eigenvalue increment threshold approach requires extensive iterations a priori to adjust the coefficient accordingly. In addition, the coefficient depends on several parameters, including the SNR, which makes its adjustment a tedious process.
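To make the cost of the information theoretic criteria concrete, the following is a minimal sketch of AIC and MDL in the standard form given by Wax and Kailath, evaluated for every candidate number of sources. The function name `aic_mdl` is hypothetical, and the sketch assumes strictly positive eigenvalues and `N` snapshots; it is an illustration, not the implementation used for the results in this paper.

```python
import math

def aic_mdl(eigenvalues, N):
    """Return (k_aic, k_mdl): the candidate source counts minimizing AIC and MDL.

    eigenvalues: positive eigenvalues of the sample auto covariance matrix.
    N: number of snapshots used to estimate that matrix.
    """
    lam = sorted(eigenvalues, reverse=True)      # descending order
    M = len(lam)
    aic, mdl = [], []
    for k in range(M):                           # candidate number of sources
        tail = lam[k:]                           # the M - k smallest eigenvalues
        m = M - k
        log_arith = math.log(sum(tail) / m)      # log of arithmetic mean
        log_geo = sum(math.log(v) for v in tail) / m   # log of geometric mean
        neg_loglik = N * m * (log_arith - log_geo)     # always >= 0
        aic.append(2.0 * neg_loglik + 2.0 * k * (2 * M - k))
        mdl.append(neg_loglik + 0.5 * k * (2 * M - k) * math.log(N))
    return aic.index(min(aic)), mdl.index(min(mdl))

# Two dominant eigenvalues and six nearly equal ones (illustrative values only)
print(aic_mdl([9.0, 7.0, 0.11, 0.1, 0.1, 0.1, 0.09, 0.1], N=1024))
```

Each call evaluates all $M$ candidates, each requiring an arithmetic and a geometric mean of the trailing eigenvalues, in addition to the EVD that produced the eigenvalues.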
V Proposed Algorithm
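In our proposed algorithm, we exploit the auto correlation coefficient matrix rather than the auto covariance matrix to estimate the number of impinging sources. To define the auto correlation coefficient matrix, we first redefine the auto covariance matrix of Section II as:

$$\mathbf{R}_{xx} = E\left[\left(\mathbf{x}(t)-\boldsymbol{\mu}_x\right)\left(\mathbf{x}(t)-\boldsymbol{\mu}_x\right)^H\right],$$

where $\boldsymbol{\mu}_x = E[\mathbf{x}(t)]$. The elements on the diagonal of $\mathbf{R}_{xx}$ are the variances of the elements of $\mathbf{x}(t)$. The auto correlation coefficient matrix $\boldsymbol{\rho}_{xx}$ is then given element-wise by:

$$\left[\boldsymbol{\rho}_{xx}\right]_{ij} = \frac{\left[\mathbf{R}_{xx}\right]_{ij}}{\sqrt{\left[\mathbf{R}_{xx}\right]_{ii}\left[\mathbf{R}_{xx}\right]_{jj}}}.$$

We then apply the EVD to $\boldsymbol{\rho}_{xx}$, which leads to:

$$\boldsymbol{\rho}_{xx} = \tilde{\mathbf{U}}_s\tilde{\boldsymbol{\Lambda}}_s\tilde{\mathbf{U}}_s^H + \tilde{\mathbf{U}}_n\tilde{\boldsymbol{\Lambda}}_n\tilde{\mathbf{U}}_n^H. \tag{14}$$

The eigenvalues on the diagonals of $\tilde{\boldsymbol{\Lambda}}_s$ and $\tilde{\boldsymbol{\Lambda}}_n$, with their corresponding eigenvectors in $\tilde{\mathbf{U}}_s$ and $\tilde{\mathbf{U}}_n$, define the signal and noise subspaces, respectively. As before, the problem is to estimate the value of $L$ given the estimated $\boldsymbol{\rho}_{xx}$. We first arrange the eigenvalues in ascending order, rather than in descending order as in the case of AIC and MDL. Hence, the eigenvalues are arranged as $\lambda_1 \leq \lambda_2 \leq \dots \leq \lambda_M$, where $\lambda_1, \dots, \lambda_{M-L}$ lie in the noise subspace while $\lambda_{M-L+1}, \dots, \lambda_M$ lie in the signal subspace.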
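The construction described above can be sketched in a few lines of NumPy. This is a minimal illustration under the usual definition of a correlation coefficient, i.e., each covariance entry divided by the product of the two corresponding standard deviations; the function name `correlation_coefficient_eigenvalues` is hypothetical and the expectation is replaced by a sample average over the available snapshots.

```python
import numpy as np

def correlation_coefficient_eigenvalues(X):
    """Eigenvalues, in ascending order, of the sample auto correlation
    coefficient matrix of X (M sensors x N snapshots)."""
    Xc = X - X.mean(axis=1, keepdims=True)     # remove the per-sensor mean
    C = Xc @ Xc.conj().T / X.shape[1]          # sample auto covariance, M x M
    d = np.sqrt(np.real(np.diag(C)))           # per-sensor standard deviations
    rho = C / np.outer(d, d)                   # normalized: unit diagonal
    return np.linalg.eigvalsh(rho)             # eigvalsh returns ascending real values

rng = np.random.default_rng(1)
X = rng.standard_normal((4, 500)) + 1j * rng.standard_normal((4, 500))
print(correlation_coefficient_eigenvalues(X))
```

Because the normalized matrix has a unit diagonal, its eigenvalues always sum to $M$, so the eigenvalues are on a fixed scale regardless of the received power.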
It can be inferred from (14) and (7) that, since the eigenvalues of the signal subspace contain both signal and noise power, the signal eigenvalues are expected to be higher than the noise eigenvalues at moderate and high SNR values. At the same time, the noise eigenvalues are expected to be comparable to one another. The main benefit of using the EVD of the auto correlation coefficient matrix in (14) rather than the EVD of the auto covariance matrix in (7) is that the difference between the signal eigenvalues and the noise eigenvalues is more accentuated, which leads to an easier and more efficient estimation of the number of sources, particularly at low SNR values. Moreover, the decision statistic used to decide on the number of sources can be as simple as our proposed moving increment or moving standard deviation, rather than the complicated decision statistics of AIC and MDL given in (8) and (9).
To illustrate the advantage of using the EVD of the auto correlation coefficient matrix, as we propose, over the EVD of the auto covariance matrix, which is conventionally used in most existing techniques, we plot the moving increment and moving standard deviation of the estimated eigenvalues of both matrices for different SNR values in Fig. 1 and for different numbers of collected samples in Fig. 2. The simulation parameters for Fig. 1 are an 8-element antenna array, 2 impinging signals, 1024 samples and different SNR values. The simulation parameters for Fig. 2 are the same, except that the SNR is fixed at -7 dB and the number of samples varies from 128 to 1024. From these figures, one can see that with the eigenvalues of the auto correlation coefficient matrix, the jump in the decision statistic when first moving from the noise subspace to the signal subspace is always the highest, after which the decision statistic starts to decrease. On the contrary, when using the same two decision statistics with the eigenvalues of the auto covariance matrix, the first jump between the noise and signal subspaces is not necessarily the largest, and the decision statistics in the signal subspace increase monotonically. This implies that with the auto correlation coefficient matrix the problem is transformed into a simple maximization problem, in which the index of the highest jump is searched for. With the auto covariance matrix, in contrast, the decision statistics must be compared to a threshold to decide on the number of sources. As stated in the work that uses covariance eigenvalues in its algorithm, estimating the threshold is a tedious process that requires extensive simulation and iteration to find the appropriate threshold for each particular set of parameters.
V-A Moving Increment of the Auto Correlation Coefficient Matrix Eigenvalues
Our first proposed decision statistic ($D^{\mathrm{inc}}$), used as a metric to decide on the number of sources, is the moving increment of the estimated eigenvalues of the auto correlation coefficient matrix. With the eigenvalues arranged in ascending order, the moving increment is estimated as the difference between each two consecutive eigenvalues:

$$D^{\mathrm{inc}}_i = \lambda_{i+1} - \lambda_i, \quad i = 1, \dots, M-1.$$

The highest increment then marks the shift from noise eigenvalues to signal eigenvalues. The index at which this shift happens can be estimated as:

$$\hat{i} = \arg\max_{i} D^{\mathrm{inc}}_i.$$

In this case, the number of sources is given by $\hat{L} = M - \hat{i}$.
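The moving increment rule reduces to a single pass over the sorted eigenvalues. The sketch below is one possible reading of it and not necessarily the authors' exact indexing: it assumes the increment with 1-based index $i$ is the step from the $i$-th to the $(i+1)$-th eigenvalue in ascending order, so that the estimate is the number of eigenvalues $M$ minus the index of the largest step. The function name `estimate_sources_moving_increment` is hypothetical.

```python
def estimate_sources_moving_increment(eigenvalues):
    """Estimate the number of sources from the largest step between
    consecutive eigenvalues of the auto correlation coefficient matrix."""
    lam = sorted(eigenvalues)                    # ascending: noise first, signal last
    M = len(lam)
    inc = [lam[i + 1] - lam[i] for i in range(M - 1)]   # M - 1 increments
    i_hat = inc.index(max(inc)) + 1              # 1-based index of the largest step
    return M - i_hat                             # eigenvalues above the step

# Illustrative eigenvalues (not simulation output): six small, two large
print(estimate_sources_moving_increment(
    [0.05, 0.06, 0.05, 0.07, 0.06, 0.05, 3.5, 4.16]))
```

No criterion has to be minimized over candidate models and no threshold has to be tuned; the only operations are a sort, $M-1$ subtractions and one maximum search.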
V-B Moving STD of the Auto Correlation Coefficient Matrix Eigenvalues
Our second proposed decision statistic ($D^{\mathrm{std}}$), used as a metric to decide on the number of sources, is the moving standard deviation of the estimated eigenvalues of the auto correlation coefficient matrix. The biased sample standard deviation is, in general, a measure of the spread of a sample around its mean. It can be calculated by:

$$\sigma = \sqrt{\frac{1}{n}\sum_{j=1}^{n}\left(x_j-\mu\right)^2},$$

where $\mu$ is the mean and $n$ is the size of the sample or, in our case, the number of eigenvalues involved in the standard deviation calculation.

The biased standard deviation of two consecutive eigenvalues can then be found by:

$$\sigma_i = \sqrt{\frac{1}{2}\left[\left(\lambda_i-\mu_i\right)^2 + \left(\lambda_{i+1}-\mu_i\right)^2\right]},$$

where $\mu_i$ is the mean of the two eigenvalues involved, which is:

$$\mu_i = \frac{\lambda_i + \lambda_{i+1}}{2}.$$

We define our second decision statistic, the moving STD ($D^{\mathrm{std}}$), as the difference between two consecutive STDs:

$$D^{\mathrm{std}}_i = \sigma_{i+1} - \sigma_i, \quad i = 1, \dots, M-2.$$

Similarly to the moving increment, the index at which the shift between the noise eigenvalues and the signal eigenvalues happens can be estimated as:

$$\hat{i} = \arg\max_{i} D^{\mathrm{std}}_i.$$

Consequently, the number of sources is given by $\hat{L} = M - \hat{i} - 1$.
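The moving STD rule can be sketched in the same way. The indexing below is an assumption: the STD with 1-based index $i$ is taken over the $i$-th and $(i+1)$-th eigenvalues in ascending order, the moving STD with index $i$ is the $(i+1)$-th STD minus the $i$-th, and the estimate is $M$ minus the index of the largest moving STD minus one. The function names `biased_std` and `estimate_sources_moving_std` are hypothetical, and at least three eigenvalues are required.

```python
import math

def biased_std(values):
    """Biased (divide-by-n) standard deviation of a short list of numbers."""
    mu = sum(values) / len(values)
    return math.sqrt(sum((v - mu) ** 2 for v in values) / len(values))

def estimate_sources_moving_std(eigenvalues):
    """Estimate the number of sources from the largest change between the
    biased STDs of consecutive eigenvalue pairs (needs at least 3 eigenvalues)."""
    lam = sorted(eigenvalues)                    # ascending: noise first, signal last
    M = len(lam)
    std = [biased_std(lam[i:i + 2]) for i in range(M - 1)]   # M - 1 pairwise STDs
    mstd = [std[i + 1] - std[i] for i in range(M - 2)]       # M - 2 differences
    i_hat = mstd.index(max(mstd)) + 1            # 1-based index of the largest change
    return M - i_hat - 1

# Same illustrative eigenvalues as for the moving increment
print(estimate_sources_moving_std(
    [0.05, 0.06, 0.05, 0.07, 0.06, 0.05, 3.5, 4.16]))
```

For two values the biased STD equals half their absolute difference, so the moving STD is effectively half the second difference of the sorted eigenvalues. Under this indexing there are only $M-2$ decision values, so at most $M-2$ sources can be reported.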
VI Simulation Results
Simulations were carried out in different scenarios to test the algorithms’ performance under different SNR values, different numbers of samples, different numbers of impinging signals, and different array configurations. The performance metric used for comparison was the percentage error rate, which can be expressed as:

$$\text{Error rate} = \frac{\text{number of runs with an incorrect estimate}}{\text{total number of runs}} \times 100\%.$$
Except for the last simulation, the array used was a uniform circular array with 8 elements. The original signal was a QPSK signal and the added noise was white Gaussian noise at the different SNR values. The number of simulation runs was 10000.
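As a small illustration of the performance metric, the sketch below computes a percentage error rate from a list of per-run estimates, assuming a run counts as an error whenever the estimated number of sources differs from the true number. The function name `percentage_error_rate` and the sample values are hypothetical.

```python
def percentage_error_rate(estimates, true_count):
    """Percentage of runs whose estimate differs from the true source count."""
    wrong = sum(1 for e in estimates if e != true_count)
    return 100.0 * wrong / len(estimates)

# Ten hypothetical runs with two true sources; two runs are wrong
print(percentage_error_rate([2, 2, 1, 2, 3, 2, 2, 2, 2, 2], true_count=2))
```

Underestimates and overestimates are counted alike under this definition, so it does not by itself show whether an algorithm tends to miss sources or to report spurious ones.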
VI-A Algorithms Performance at Various SNR
The first simulation tested the performance of AIC, MDL and the proposed algorithms at different SNR values. The SNR ranged from -20 to 15 dB, the number of samples was fixed at 1024 and the actual number of sources was 2.
As shown in Fig. 3, the proposed algorithms performed better than MDL at low SNR values and better than AIC at high SNR values. Below -10 dB, the proposed algorithms performed comparably to AIC and better than MDL. However, the estimation was inaccurate for all algorithms below -12 dB SNR. For the proposed algorithms, this inaccuracy is due to the inconsistent change in eigenvalues that results from high noise; the standard deviation or increment then changes randomly and the detection can happen at different indices. MDL performs badly because it underestimates the number of sources, which it detected to be 1. Above -10 dB SNR, MDL and the proposed algorithms performed the same, with an error rate of almost 0, while AIC kept an error rate of about 10%. AIC does not reach a lower error rate because it overestimates the number of sources at relatively high SNR values. This overestimation is probably due to the AIC penalty term, as has been shown in prior analysis.
VI-B Algorithms Performance at Various Number of Samples
One of the important parameters to consider in any algorithm design is the number of samples the algorithm needs to estimate correctly. This matters for practical implementation, where the number of samples needs to be minimized. Hence, MDL, AIC and the proposed algorithms were tested with different numbers of samples at an SNR of -5 dB with 2 impinging signals.
As can be seen in Fig. 4, the proposed algorithms performed better than MDL and similarly to AIC for a low number of samples. MDL underestimated the number of sources with a low number of samples, as the eigenvalues were not distributed in a way that could be detected by the algorithm's criterion and its added penalty term. MDL and the proposed algorithms had the same performance, an error rate of almost 0, for more than 256 samples. AIC, on the other hand, overestimated the number of sources and hence kept its 10% error rate, which was found in almost all test cases conducted in this paper.
VI-C Algorithm Performance with Different Number of Impinging Sources
Different algorithms might differ in the number of sources they can estimate. Hence, the algorithms were tested with different numbers of impinging sources at an SNR of -5 dB and with 1024 samples.
As can be seen in Fig. 5, AIC outperformed all other algorithms in the maximum number of sources it can estimate. MDL and the moving STD algorithm could estimate up to 5 sources with less than 20% error and failed to estimate more, while AIC could estimate 6 and 7 sources but with a high error rate. This drawback arises because the separation between the DoA angles was not large enough to estimate the number of sources correctly at this point. In general, the moving STD algorithm could estimate up to 6 signals with an 8-element array and could not estimate more regardless of the separation or SNR value. The moving increment algorithm could estimate up to 5 sources with this configuration. However, this performance drawback can safely be neglected, since receiving 6 sources at the same time is almost impossible in practical wireless scenarios. Besides, even if the number of sources were estimated correctly, further applications that use such an estimate, such as DoA estimation, would not be able to estimate more than 5 sources, and hence the difference in performance would have no effect.
VI-D Algorithm Performance with Different Array Elements
This simulation examines the effect of increasing the number of elements in the array. The test was done at an SNR of -5 dB with 100 samples and 2 signals impinging on the array.
As shown in Fig. 6, as the number of elements increases the error rate decreases until it reaches almost 0 when the number of elements is 8 for the moving STD based algorithm and 12 for MDL and the moving increment based algorithm. Below 16 elements, the moving STD algorithm showed the best performance among the simulated algorithms, which can be attributed to the low number of samples that caused MDL and moving increment to perform worse than the others at that stage. Above 16 elements, both MDL and the proposed algorithms showed an error rate of almost 0%.
VII Conclusion
This paper presented a new algorithm for estimating the number of sources based on eigenvalue decomposition. The algorithm uses the auto correlation coefficient matrix instead of the auto covariance matrix to obtain the eigenvalues and finds the number of sources at the maximum moving increment or moving standard deviation. In general, moving STD performed better than moving increment and had a more stable performance that is comparable to AIC and MDL. Results showed better performance than MDL at low SNR values and better than AIC at high SNR. In addition, the proposed algorithm is much simpler than the information theoretic approaches, as it involves only a simple maximization problem.
This publication was made possible by the support of the NPRP grant 5-559-2-227 from the Qatar National Research Fund (QNRF). The statements made herein are solely the responsibility of the authors.
-  T. E. Tuncer and B. Friedlander, Classical and modern direction-of-arrival estimation. Academic Press, 2009.
-  A. J. Bell and T. J. Sejnowski, “An information-maximization approach to blind separation and blind deconvolution,” Neural computation, vol. 7, no. 6, pp. 1129–1159, 1995.
-  A. Liavas, P. Regalia, and J.-P. Delmas, “Blind channel approximation: effective channel order determination,” Signal Processing, IEEE Transactions on, vol. 47, no. 12, pp. 3336–3344, Dec 1999.
-  T. W. Anderson, “Asymptotic theory for principal component analysis,” Annals of Mathematical Statistics, pp. 122–148, 1963.
-  H. Akaike, “A new look at the statistical model identification,” Automatic Control, IEEE Transactions on, vol. 19, no. 6, pp. 716–723, Dec 1974.
-  M. Wax and T. Kailath, “Detection of signals by information theoretic criteria,” Acoustics, Speech and Signal Processing, IEEE Transactions on, vol. 33, no. 2, pp. 387–392, Apr 1985.
-  A. Di and L. Tian, “Matrix decomposition and multiple source location,” in Acoustics, Speech, and Signal Processing, IEEE International Conference on ICASSP ’84., vol. 9, Mar 1984, pp. 722–725.
-  J.-S. Jiang and M.-A. Ingram, “Robust detection of number of sources using the transformed rotational matrix,” in Wireless Communications and Networking Conference, 2004. WCNC. 2004 IEEE, vol. 1, March 2004, pp. 501–506 Vol.1.
-  M. Kotanchek and E. Dzielski, “Subspace stability in high resolution direction finding and signal enumeration,” in Autonomous Underwater Vehicle Technology, 1996. AUV ’96., Proceedings of the 1996 Symposium on, Jun 1996, pp. 192–199.
-  J.-S. Jiang and M. A. Ingram, “Path models and MIMO capacity for measured indoor channels at 5.8 GHz,” ANTEM, pp. 603–609, 2002.
-  Z. An, H. Su, and Z. Bao, “A new method for fast estimation of number of signals,” in Image and Signal Processing, 2008. CISP ’08. Congress on, vol. 5, May 2008, pp. 390–393.
-  J.-F. Gu, P. Wei, and H.-M. Tai, “Detection of the number of sources at low signal-to-noise ratio,” Signal Processing, IET, vol. 1, no. 1, pp. 2–8, March 2007.
-  M. S. Bartlett, “Periodogram analysis and continuous spectra,” Biometrika, vol. 37, no. 1-2, pp. 1–16, 1950. [Online]. Available: http://biomet.oxfordjournals.org/content/37/1-2/1.short
-  J. Capon, “High-resolution frequency-wavenumber spectrum analysis,” Proceedings of the IEEE, vol. 57, no. 8, pp. 1408–1418, Aug 1969.
-  R. Schmidt, A Signal Subspace Approach to Multiple Emitter Location and Spectral Estimation. Stanford University, 1981. [Online]. Available: http://books.google.com.qa/books?id=mLKUnQEACAAJ
-  O. Hu, F. Zheng, and M. Faulkner, “Detecting the number of signals using antenna array: a single threshold solution,” in Signal Processing and Its Applications, 1999. ISSPA ’99. Proceedings of the Fifth International Symposium on, vol. 2, 1999, pp. 905–908 vol.2.
-  E. Fishler, M. Grosmann, and H. Messer, “Detection of signals by information theoretic criteria: general asymptotic performance analysis,” Signal Processing, IEEE Transactions on, vol. 50, no. 5, pp. 1027–1036, May 2002.