I Introduction
With the rapid development of 5G new radio technologies, research on enhanced mobile broadband (eMBB), massive machine-type communications (mMTC), and ultra-reliable low-latency communications (URLLC) has drawn rapidly increasing attention from both academia and industry [1, 2, 3]. To realize the prospects of 5G, tremendous improvements in these new radio techniques are needed, and the harmonious and fair coexistence of heterogeneous networks, as well as the compatibility between 4G and 5G systems, must also be ensured [4]. Because spectrum suitable for wireless transmission is scarce, many existing and emerging communication systems are deployed close to each other, or even overlap in spectrum, which inevitably results in intensive interference [5]. As a typical example, the narrowband Internet-of-Things (NB-IoT) system is deployed by reusing the spectrum of long term evolution (LTE), occupying the LTE spectrum when operating in the “in-band” mode [6, 7, 8]. NB-IoT is a promising emerging technology for supporting mMTC in 5G new radio, capable of interconnecting a large number of nodes with very low power consumption and narrow bandwidth [9, 10, 11]. Since LTE and LTE-Advanced (LTE-A) with cyclic-prefixed orthogonal frequency division multiplexing (CP-OFDM) modulation are the dominant technologies of the 4G era [12, 13, 14], the interference from NB-IoT systems should be properly handled so that a smooth transition from 4G to 5G can be achieved [15, 16]. During the deployment of 5G eMBB facilities, it is also important to mitigate the interference from NB-IoT if the utilized spectrum overlaps.
However, how to mitigate or eliminate the interference between NB-IoT and LTE systems remains an open issue that has not been sufficiently investigated in the literature. Since the bandwidth of NB-IoT is small compared with that of LTE, the interference from NB-IoT can be regarded as a kind of narrowband interference (NBI). Although there are plenty of conventional methods to combat NBI in the literature [17, 18, 19, 20, 21], they may lose useful data, require prior knowledge of the statistics or locations of the NBI, or consume a large number of virtual subcarriers, which limits their efficiency and applicability.
Recently, sparse recovery methods that exploit the sparsity of the NBI have been introduced to NBI estimation, and methods based on compressed sensing (CS) theory in particular are drawing great attention [22]. Nevertheless, the state-of-the-art CS-based methods are mostly designed for non-CP-OFDM systems, or the estimation is carried out at the preamble, which might turn out to be inaccurate for the payload data frames. Besides, it is difficult to design a practical observation matrix with the satisfactory restricted isometry property (RIP) required by CS-based methods [22]. Thus, the performance is limited when the background noise or the sparsity level is unfavorable. Sparse Bayesian learning (SBL), another sparse recovery theory, was proposed in [23] to solve block sparse recovery problems, but it requires prior information on the block partition and the statistics of the unknown signal, and its stringent parametric assumptions on the NBI are impractical.
Different from the aforementioned existing schemes, machine learning theory and techniques, which have drawn tremendous research attention recently, can inspire an NBI recovery method that is both efficient and reliable. In machine learning research, cross-entropy (CE) has been exploited as the loss function to train deep neural networks [24]. Nevertheless, the conventional CE method was not designed for sparse approximation. Moreover, research on sparse machine learning based NBI recovery using iterative cross-entropy guided training is insufficient in the literature. To fill this gap, a sparse machine learning inspired probabilistic framework is formulated, and a novel algorithm called sparse CE minimization (SCEM) is proposed to iteratively learn the support distribution. The proposed method is capable of learning and recovering the NBI more efficiently and more accurately than state-of-the-art counterparts, supporting the harmonious coexistence of NB-IoT and LTE systems.
The main contributions are listed as follows:

The theory of sparse machine learning with CE-guided training is introduced to the area of NBI recovery for the first time. A novel probabilistic framework of sparse machine learning is formulated to recover and eliminate the NB-IoT interference to the LTE system, with higher spectral efficiency and recovery accuracy than the existing methods.

A novel algorithm called SCEM based on sparse machine learning is proposed for NBI recovery, which iteratively learns the NBI support distribution guided by the CE as the loss function. An enhanced algorithm called regularized SCEM (RSCEM) is proposed by regularizing the loss function, which achieves better recovery accuracy and a faster convergence rate.

The proposed framework is extended to MIMO systems to utilize the spatial correlation of the NBI across multiple antennas. Thus the simultaneous SCEM (SSCEM) algorithm is formulated, which combines the contributions from multiple antennas and simultaneously recovers the common support of the NBI to further improve the spectral efficiency and accuracy.
The rest of this paper is organized as follows: The related works are presented in Section II. The system model is presented in Section III. The main contribution of this paper, the proposed probabilistic framework formulation and the proposed algorithms of sparse machine learning for NBI recovery, are described in detail in Section IV. The performance of the proposed algorithms is evaluated through computer simulations in Section V, which is followed by the conclusions in Section VI.
Notation. Matrices and column vectors are denoted by boldface letters; frequency-domain vectors carry a tilde and time-domain vectors do not. Dedicated operators are used for the pseudo-inversion operation, the conjugate transpose, the norm operation, the cardinality of a set, the entries of a vector restricted to an index set, the sub-matrix comprised of the columns of a matrix indexed by a set, the complementary set, and the support of a vector.
Abbreviations. BSBL (Block Sparse Bayesian Learning). CE (Cross-Entropy). CP (Cyclic Prefix). CRLB (Cramér-Rao Lower Bound). CS (Compressed Sensing). FTE (Frequency Threshold Excision). IBI (Inter-Block Interference). INR (Interference-to-Noise Ratio). LTE (Long Term Evolution). MIMO (Multiple-Input Multiple-Output). MSE (Mean Square Error). NBI (Narrow-Band Interference). NB-IoT (Narrow-Band Internet-of-Things). NLL (Negative Logarithm Likelihood). OFDM (Orthogonal Frequency Division Multiplexing). PA-SAMP (Priori Aided Sparsity Adaptive Matching Pursuit). RIP (Restricted Isometry Property). RSCEM (Regularized Sparse Cross-Entropy Minimization). SCEM (Sparse Cross-Entropy Minimization). SSCEM (Simultaneous Sparse Cross-Entropy Minimization). SAMP (Sparsity Adaptive Matching Pursuit).
| Symbol | Concept | Symbol | Concept |
|  | block sparse NBI vector |  | differential NBI vector |
|  | NBI measurement vector |  | AWGN error norm threshold |
|  | selection matrix |  | IDFT matrix |
|  | observation matrix |  | current support distribution |
|  | candidate support |  | favorable support |
|  | residue error norm |  | weighted average residue error norm |
|  | candidate supports number |  | favorable supports number |
|  | regularization weight parameter |  | maximum iteration number |
|  | measurement vector length |  | OFDM subcarrier number |
|  | MIMO receive antenna number |  | NBI sparsity level |
II Related Works
Coexistence simulation results for in-band and guard-band scenarios between NB-IoT and legacy systems are provided for initial analysis in the 3GPP technical document [25], which shows significant interference between NB-IoT and LTE systems. Ratasuk et al. analyzed the impact of the NB-IoT signal on the link budget and block error rate performance of the LTE system [26]. Kim et al. investigated the interference between NB-IoT and LTE systems in the “in-band” mode [27]. Wang and Wu analyzed the coexistence between NB-IoT and LTE for the stand-alone mode, and studied the effects of NB-IoT on the performance of uplink LTE transmission [28].
Since the coexistence between NB-IoT and LTE systems is vital, several conventional methods have been developed to combat NBI. A commonly adopted approach, called frequency threshold excision (FTE), is to directly null out the subcarriers where NBI is present [17]. Nilsson proposed a linear minimum mean square error based method to estimate NBI [18]. A successive interference cancellation approach mitigating the NBI in a recursive manner was introduced in [19]. A soft decision based successive NBI cancellation method was further proposed by Darsena et al. in [20]. Coulson designed a time-domain notch filter for NBI suppression based on a linear prediction criterion before the discrete Fourier transform at the transmitter [21]. The limitation of conventional methods mainly lies in that useful data might be lost, and that statistical information or plenty of virtual subcarriers are required.
To overcome the limitations of conventional methods, CS theory, a powerful recent approach to sparse recovery, can be utilized for the NBI estimation problem. CS-based methods were first investigated by Al-Dhahir et al., utilizing the null space to obtain the measurements of NBI for OFDM systems [29, 30]. In this work, the NBI could be recovered by using CS-based greedy algorithms. There have been studies on different CS-based greedy algorithms, such as subspace pursuit (SP) [31] proposed by W. Dai et al. and sparsity adaptive matching pursuit (SAMP) [32] proposed by T. Do et al. The SP algorithm is able to recover sparse signals with or without noise disturbance at low complexity [31]. The SAMP algorithm is designed to adapt to varying sparsity levels of the NBI. By dividing the iteration process into multiple stages, the SAMP algorithm is able to recover the sparse signal by iterative matching pursuit of the support basis without knowing its sparsity level [32].
Other CS-based methods were proposed to estimate the NBI by exploiting the time-domain training guard interval of time-domain synchronous OFDM (TDS-OFDM) systems [33] or the preamble in the frame header [34]. In [33], the priori aided SAMP (PA-SAMP) algorithm was proposed as an improvement of the classical SAMP algorithm [32], making use of prior information on the partial NBI support acquired by a coarse power threshold method. This prior information was exploited in the initialization and iteration process to reduce the complexity and improve the accuracy. The two-dimensional correlation of the NBI was exploited in the framework of multiple measurements and structured CS in [34]. The two-dimensional measurement data were obtained from the preambles at multiple receive antennas, and then utilized for the structured CS based recovery of the NBI. Another sparse recovery theory, sparse Bayesian learning (SBL), was proposed in [23] and has been utilized to effectively estimate impulsive noise [35]. A block SBL (BSBL) based method of estimating the NBI generated by NB-IoT was proposed in [36], which is an improvement of the SBL-based method in [23]. The BSBL-based method employed parametric Bayesian inference iteratively to estimate the unknown deterministic parameters of the block sparse NBI [36]. However, the major limitation of CS-based methods is that CS theory requires an observation matrix with satisfactory RIP to be designed [22], which is difficult in practice. Furthermore, the performance is limited when the background noise is intense or the sparsity level is large.
Machine learning has become a popular research trend in recent years, with many applications in sparse composite regularization [37], anti-jamming [38, 39], and wireless communications [40]. A reinforcement learning based scheme was proposed in [38] for ultra-dense networks, which adaptively learns the power control policy to improve the efficiency while mitigating the inter-cell interference. A two-dimensional anti-jamming mobile communication scheme based on reinforcement learning was proposed in [39], where a mobile device can achieve an optimal communication policy in a dynamic game framework without knowing the jamming and interference model. As an important method in machine learning, the CE method is usually utilized for training deep neural networks and machine learning models, and has successfully solved many learning tasks such as pattern recognition and object classification [41, 42]. Recently, a machine learning based method exploiting CE was proposed in [43] to improve hybrid precoding performance for mmWave massive MIMO systems, which introduced it to wireless communications research. Previously, the CE method was also adopted to solve combinatorial optimization problems, where it outperforms the brute-force approach [24, 44]. Different from the state-of-the-art methods, the proposed solution in this work introduces sparse machine learning to NBI estimation, and a novel algorithm based on CE minimization is proposed to efficiently learn the NBI support, which improves both the spectral efficiency and the estimation accuracy compared with existing approaches.
III System Model
III-A Signal Model of LTE
As adopted in the 3GPP standards of LTE [12, 13], the CP-OFDM frame structure is composed of the length OFDM block, where is the number of subcarriers with the subcarrier spacing of , and the length CP in front, which is formed by the last samples of the OFDM block, as illustrated in Fig. 1.
After transmission over the wireless multipath fading channel with the channel impulse response (CIR) in the presence of the NBI generated by the NB-IoT signal, the received th CP before the th OFDM block is represented as
(1) 
where denotes the time-domain NBI vector located at the CP, denotes the additive white Gaussian noise (AWGN) vector with zero mean and variance of , and denotes the received CP, with the matrix represented as
The entries in the matrix above represent the last samples of the preceding th OFDM block , which cause inter-block interference (IBI) on the following th CP. Since only causes IBI on the first samples of the th CP as illustrated in Fig. 1, the last samples of will form the IBI-free region given by
(2) 
where denotes the selection matrix composed of the last rows of the identity matrix . The IBI-free region exists in practical broadband transmission systems because a common system design rule is to configure the guard interval to be much longer than the worst-case maximum channel delay spread, so as to avoid IBI between OFDM symbols; this rule is specified in standards and supported in the literature [45, 12, 36].
For simplicity of notation, the subscript denoting the frame number is omitted in the remainder of this paper when there is no ambiguity about the current frame number, unless otherwise stated. Then the IBI-free region can be rewritten as
(3) 
where , , and consist of the last entries of , , and in (1), respectively, while is composed of the last rows of without the IBI component. Since the CP is identical to the last samples of the OFDM block, there is a duplicate of the IBI-free region at the last samples of its subsequent OFDM block, which can be denoted by given by
(4) 
where and denote the length time-domain NBI and AWGN vectors at the end of the OFDM block, respectively.
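The existence of the IBI-free region and of its duplicate at the end of the OFDM block can be checked numerically. The following is a minimal sketch, not the paper's simulation setup: the sizes `N`, `M`, `L`, the random samples and the channel taps are all hypothetical, and noise and NBI are left out so that only the cyclic data component remains.

```python
import random

def convolve(x, h):
    """Linear convolution, modelling a multipath channel with impulse response h."""
    y = [0j] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def random_block(n, rng):
    """A block of n random samples (stand-in for a time-domain OFDM block)."""
    return [complex(rng.choice((-1, 1)), rng.choice((-1, 1))) for _ in range(n)]

rng = random.Random(0)
N, M, L = 16, 6, 3                       # hypothetical block, CP and channel lengths
h = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(L)]
prev_block, block = random_block(N, rng), random_block(N, rng)

# Each transmitted frame is the CP (last M samples of the block) followed by the block.
tx = prev_block[-M:] + prev_block + block[-M:] + block
rx = convolve(tx, h)

cp_start = M + N                         # first CP sample of the second frame
received_cp = rx[cp_start:cp_start + M]
received_block = rx[cp_start + M:cp_start + M + N]

G = M - L + 1                            # CP samples untouched by the previous block
ibi_free = received_cp[-G:]
duplicate = received_block[-G:]
max_diff = max(abs(a - b) for a, b in zip(ibi_free, duplicate))
print(G, max_diff)
```

Only the first `L - 1` received CP samples depend on the previous block, so the remaining `G` samples coincide with the tail of the received OFDM block when no noise or NBI is present.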
III-B NBI Model Generated by NB-IoT
In LTE systems, the NB-IoT signal working in the “in-band” mode within the LTE spectrum generates NBI at the receivers of the LTE system [46]. The widely adopted frequency-domain model of the NBI is the superposition of several tone interferers, each modeled by band-limited Gaussian noise (BLGN) with the power spectral density (PSD) of [47]. The frequency-domain locations of the tone interferers can be randomly distributed among all subcarriers [48, 47], and different tone interferers are mutually independent [48]. Let denote the frequency-domain NBI vector associated with the CP, and then each entry of the corresponding time-domain NBI signal can be represented as
(5) 
where is the set of the indices of nonzero entries, which is defined as the support. The sparsity level is defined by the number of nonzero entries, which is much smaller than the signal dimension, i.e., . The interference-to-noise ratio (INR) is used to represent the intensity of the NBI, defined by , where denotes the average power. Since the tone interferers are BLGN as described, the average power is , yielding the INR .
Since the bandwidth of NB-IoT is small compared with that of LTE [49], the NBI generated by NB-IoT can be modeled as a sparse vector in the frequency domain, which has only a few nonzero entries compared with the number of subcarriers. The nonzero entries of the NBI are not necessarily located exactly at the frequencies of the OFDM subcarriers in LTE, because in practice there might be a fractional frequency offset (FO) of the NB-IoT working frequency with respect to the OFDM subcarriers. Thus, the generalized NBI model becomes a block sparse vector due to spectral leakage [50]. Then the frequency-domain block sparse NBI vector associated with the CP can be represented as
(6) 
where denotes the inverse discrete Fourier transform (IDFT) matrix with the entry , and is the FO matrix, whose offset frequency can be modeled by a uniformly distributed variable [50]. Transforming the frequency-domain NBI signal (6) to the time domain by partial IDFT, the NBI vector associated with the IBI-free region in (3) is obtained as
(7)
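The effect of a fractional FO on sparsity can be illustrated with a single tone. This is a minimal sketch with hypothetical values (`N = 64`, tone index `K0 = 10`, offset `0.3`); it only shows the spectral leakage that motivates the block sparse model and does not reproduce the FO matrix of the paper.

```python
import cmath

def dft(x):
    """Normalized DFT so that an on-grid unit tone has a single bin of magnitude 1."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)) / n
            for k in range(n)]

def tone(n, k0, offset):
    """Unit-amplitude tone at subcarrier k0 plus a fractional frequency offset."""
    return [cmath.exp(2j * cmath.pi * (k0 + offset) * t / n) for t in range(n)]

def energy_fraction_near(spectrum, k0, halfwidth):
    """Share of the total energy inside the block of bins k0-halfwidth .. k0+halfwidth."""
    n = len(spectrum)
    total = sum(abs(v) ** 2 for v in spectrum)
    near = sum(abs(spectrum[(k0 + d) % n]) ** 2 for d in range(-halfwidth, halfwidth + 1))
    return near / total

N, K0 = 64, 10                            # hypothetical sizes
on_grid = dft(tone(N, K0, 0.0))           # no offset: a single nonzero entry
off_grid = dft(tone(N, K0, 0.3))          # fractional offset: energy leaks to neighbours

print(energy_fraction_near(on_grid, K0, 0))
print(energy_fraction_near(off_grid, K0, 0), energy_fraction_near(off_grid, K0, 2))
```

With the offset, a noticeable share of the energy leaves bin `K0` but stays within a short block of adjacent bins, which is why a block sparse rather than strictly sparse vector is the natural model.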
A useful feature of the NBI, called temporal correlation, can be utilized for measuring the NBI from the compound received signal containing both the NBI and the data components. Temporal correlation means that the NBI signal usually has invariant support and amplitude over one received OFDM frame of interest. This is because, according to experiments and observations, the coherence time of the NBI signal is normally much larger than the duration of one OFDM symbol, and the working band of an NBI source such as NB-IoT does not change rapidly [51, 52, 36]. It is observed that the NB-IoT signal working in-band in the LTE spectrum is usually located at fixed frequency locations [8, 11]. Temporal correlation is also verified by substantial field tests and experimental observations in real houses and apartments [53].
Because of the temporal correlation, the frequency-domain NBI vectors associated with the CP part and the following OFDM block part share the same support and amplitude, with only a phase shift in between. Let denote the frequency-domain NBI vector associated with the CP’s duplicate in the OFDM block given by (4), where the time-domain representation of is given by
(8) 
Hence, can be derived by the phase shift of associated with the CP in (3), which can be represented as
(9) 
where the value of FO determines the phase shift, and is the corresponding time-domain distance between the CP and its duplicate in the OFDM block.
Note that as illustrated in Fig. 1, so it can be further derived that , which yields a simpler relation only related with given by
(10) 
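The reason the phase shift depends only on the FO can be checked on one tone. In this minimal sketch the block length `N`, the tone index `k` and the fractional offset `eps` are hypothetical, and the distance between the CP sample and its duplicate is taken to be one OFDM block length, as suggested by the frame structure described above.

```python
import cmath

N = 32                 # hypothetical OFDM block length = distance from CP to its duplicate
k, eps = 5, 0.25       # hypothetical subcarrier index and fractional frequency offset

def tone_sample(t):
    """Time-domain sample of a single tone interferer at frequency (k + eps) / N."""
    return cmath.exp(2j * cmath.pi * (k + eps) * t / N)

# Ratio between a sample in the duplicate region and the matching CP sample.
ratios = [tone_sample(t + N) / tone_sample(t) for t in range(4)]

# The integer part k contributes a full turn, so only the fractional offset remains.
expected = cmath.exp(2j * cmath.pi * eps)
print(ratios[0], expected)
```

Because the integer subcarrier index contributes a whole number of turns over `N` samples, the remaining rotation is set by the fractional offset alone.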
IV Probabilistic Sparse Machine Learning Based Framework Formulation and Algorithms for NBI Recovery
In this section, the probabilistic framework of sparse machine learning and the sparse combinatorial optimization problem for NBI recovery are first formulated in Section IV-A. Then the proposed sparse machine learning based iterative algorithm called SCEM is introduced in detail in Section IV-B, followed by the enhanced RSCEM algorithm, which imposes regularization on the loss function, in Section IV-C. Afterwards, the extension of the proposed method to MIMO systems is presented in Section IV-D.
IV-A Probabilistic Sparse Machine Learning Framework Formulation for NBI Recovery
The ultimate goal of this work is to accurately recover the NBI vector located at the OFDM data block and eliminate it from the data, which can be done by estimating and using the relation in (10). Hence, firstly the measurement of the NBI should be obtained, and a probabilistic sparse machine learning based framework can be formulated to efficiently recover the NBI using the proposed algorithms.
The measurement vector of the NBI can be obtained using the temporal differential measuring operation [36]. Specifically, as illustrated in Fig. 1, the measurement vector can be obtained by the differential operation between the received IBI-free region in (3) and its duplicate in (4) at the end of the OFDM block, which nulls out the cyclic data component , yielding the measurement vector of the NBI
(11) 
where and . Thus by substituting (7) and (8) into (11), the measurement vector can be rewritten as
(12) 
where is given by
(13) 
whose support is the same as that of and .
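The differential measurement can be sketched for a single tone interferer. All values here (`N`, `G`, `k`, `eps`, `amp`) are hypothetical, noise is omitted, and the sign convention (IBI-free region minus its duplicate) is an assumption made for illustration.

```python
import cmath
import random

rng = random.Random(1)
N, G = 32, 8                    # hypothetical block length and IBI-free region length
k, eps, amp = 5, 0.25, 2.0      # hypothetical tone interferer

def nbi(t):
    """Time-domain NBI sample of one tone with a fractional frequency offset."""
    return amp * cmath.exp(2j * cmath.pi * (k + eps) * t / N)

# The cyclic data component is identical in the IBI-free region and in its duplicate.
data = [complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(G)]
received_cp = [data[t] + nbi(t) for t in range(G)]
received_dup = [data[t] + nbi(t + N) for t in range(G)]

# The difference nulls out the data and keeps a scaled copy of the NBI.
measurement = [a - b for a, b in zip(received_cp, received_dup)]
scale = 1 - cmath.exp(2j * cmath.pi * eps)
predicted = [scale * nbi(t) for t in range(G)]
max_err = max(abs(m - p) for m, p in zip(measurement, predicted))
print(max_err)
```

The scaling is a per-tone complex factor, so the differential NBI keeps the support of the original NBI, consistent with the statement above; a tone with zero fractional offset would cancel in this measurement.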
After obtaining the measurement of the NBI in (12), the probabilistic sparse machine learning framework of NBI recovery can be formulated, by which the support distribution of the NBI can be learnt using the proposed algorithms. Because of the sparsity of the frequencydomain NBI vector, it is crucial to recover its support, i.e., the set of the indices of the nonzero entries. Since the sparsity level of the NBI is , it is required that the unknown NBI vector to be reconstructed in (12) should satisfy
(14) 
where denotes the norm, i.e., the number of nonzero entries. To recover the optimal NBI vector based on the measurement in (12), we should solve the optimization problem given by
(15) 
where denotes the optimal NBI vector to be recovered from the measurement in (12) that minimizes the residue error norm , with given by
(16) 
From the conventional perspective of signal processing, the problem in (15) is intractable because of the nonconvex norm constraint. Since the constraint is a sparse one, it can be regarded as a sparse combinatorial optimization problem. Let denote the set of all possible supports of sparse vectors satisfying the constraint in (14); then we have
(17) 
so the size of the set of possible solutions is given by
(18) 
It can be noted from (18) that the number of possible supports in the solution space grows combinatorially with the problem size and the sparsity level.
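A quick count shows how fast the search space grows. This sketch assumes that each support contains exactly as many entries as the sparsity level; the sizes are hypothetical and chosen only for illustration.

```python
import math

def count_supports(num_subcarriers, sparsity):
    """Number of distinct supports with exactly `sparsity` entries."""
    return math.comb(num_subcarriers, sparsity)

# Hypothetical problem sizes; exhaustive search quickly becomes infeasible.
for n, k in [(64, 4), (256, 8), (1024, 16)]:
    print(n, k, count_supports(n, k))
```

Even the smallest case already has hundreds of thousands of supports, each requiring a least-squares fit under exhaustive search.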
Some sparse approximation methods, including the popular CS theory, have been exploited in the literature to relax the nonconvex optimization problem to a tractable one. For instance, the nonconvex $\ell_0$-norm constraint in (15) can be relaxed to the convex $\ell_1$-norm minimization problem [22] as
(19) 
where denotes the error norm bound due to the background AWGN , and thus convex programming can be exploited to solve it [54]. However, the performance of the CS-based methods depends on the RIP of the observation matrix [22, 55]. Besides, performance degradation can be caused by intensive background noise and a large sparsity level [22]. The spectral efficiency could still be improved because many measurement samples have to be reserved in the guard interval for CS-based methods [33].
To overcome the difficulties of state-of-the-art methods, a probabilistic sparse machine learning based approach called SCEM is proposed for NBI recovery. It is able to efficiently solve the nonconvex sparse combinatorial optimization problem in (15) without strict prior RIP requirements on the observation matrix , and is much more spectrum-efficient because it reduces the cost of measurement data. The proposed algorithm significantly extends the conventional CE method [24] to accommodate the sparse recovery problem, so that the unknown sparse NBI signal can be accurately recovered, as described in detail in the next subsection.
IV-B Proposed Sparse Machine Learning Inspired Algorithm: Sparse Cross-Entropy Minimization
Based on the probabilistic framework of sparse learning, the purpose of the SCEM algorithm proposed in this paper is to efficiently solve the sparse combinatorial optimization problem in (15) by iteratively minimizing the cross-entropy between the current support distribution and the one minimizing the residue error norm. The pseudocode of the proposed SCEM algorithm is summarized in Algorithm 1, and the flowchart of the essential computing modules, parameters, nodes, and data flows of the algorithm is illustrated in Fig. 2.
It can be observed from Fig. 2 that the proposed sparse machine learning algorithm iteratively learns the probability distribution of the NBI support by minimizing the loss function (i.e., the cross-entropy). In each iteration within the algorithm loop, the algorithm randomly generates a set of candidate supports based on the current support distribution (initialized by ), and computes the corresponding residue error norms using the measurement vector from the input. After sorting the residue error norms, the set of favorable supports is selected, which serves as the training data set. Then, the loss function is computed by calculating the cross-entropy between the training data set and the estimated output. By minimizing the loss function using gradient descent, the support distribution is backward updated to for the next iteration. This process gradually trains the support distribution towards the one with minimum estimation error. The iterations continue until the halting condition of the algorithm is met, at which point the output of the algorithm is obtained.
The overall structure and explanations of Algorithm 1 are described as follows:
Phase 1: Input. The measurement vector , the observation matrix , the residue error norm threshold given in (19), and the numbers of candidate supports and favorable supports, i.e., and , are input to the algorithm.
Phase 2: Initialization. The initial probability distribution of the NBI support is set as , where , and denotes the probability that the th entry is in the NBI support , i.e.,
(20) 
Since the nonzero entries can be randomly located, assuming that each entry has an initial probability of 0.5 of being nonzero is reasonable without loss of generality.
Phase 3: Main iterations. The main process is composed of multiple iterations and terminates when the halting condition of the algorithm is met. The main process includes the following steps:
1) Candidate supports generation (Line 4): candidate supports are generated based on the support distribution . Each candidate support is generated in an efficient and simple recursive manner to obtain a sparse support. Let denote the current temporary support in the recursive generation process, where the initial temporary support . Then, based on the current temporary support and its corresponding probability derived from the current support distribution , a sparser temporary support can be generated by a Bernoulli trial on each entry as
(21) 
where the binary-valued parameter is the outcome of the Bernoulli trial on entry with Bernoulli probability . The thinning is repeated until , and then the candidate support is set as .
2) Computing NBI and residue (Lines 5-6): the estimated NBI vectors corresponding to the candidate supports are calculated by least squares restricted to the candidate supports , and the corresponding residue error norms are calculated by (16) using the estimated NBI vectors.
3) Favorable supports selection (Lines 7-8): the candidate supports are sorted by their residue error norms in ascending order to pick out the best candidate supports with the smallest estimation error, which are closest to the real NBI support and are regarded as the favorable supports . The implicit probability distribution implied by the favorable supports is the training target of the current support distribution , which is gradually driven towards the ground-truth distribution by iteratively minimizing the CE between them.
4) Learning support distribution by minimizing CE (Line 9): The CE is utilized as the loss function in the perspective of machine learning theory, which is given by
(22) 
where is the negative logarithm likelihood (NLL) of the favorable support conditioned on the current probability distribution . By minimizing the loss function in (22), the current support distribution is updated to , which is given by
(23) 
Let a binary-valued length vector denote the favorable support , where its th entry satisfies
(24) 
Then the conditional probability in the CE in (23) is given by
(25) 
where is a Bernoulli random variable given by
(26) 
Thus, one can derive that
(27) 
By substituting (27) into (23), the first derivative of the CE with respect to can be derived as
(28) 
To minimize the CE, the first derivative (28) is set to zero, so the updated support distribution can be learnt by
(29) 
5) Iteration switching (Lines 10-11): if the halting condition is satisfied, i.e., when or , the algorithm ends. Otherwise, the algorithm proceeds to the next iteration.
Phase 4: Output. The output of the algorithm includes the learnt support probability distribution , the recovered NBI support , and the recovered sparse NBI vector , which provides the solution of the sparse combinatorial optimization problem in (15) as .
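The four phases can be put together in a compact sketch. It is an illustrative reading of the steps above, not the authors' Algorithm 1: the function names, the round limit with its fallback, the probability clipping, and the demo sizes and values are all assumptions, and the update line is the closed-form CE minimizer obtained by setting the first derivative to zero.

```python
import cmath
import random

def solve_linear(a, b):
    """Solve the square complex system a x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [list(row) + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0j] * n
    for r in reversed(range(n)):
        x[r] = (aug[r][n] - sum(aug[r][c] * x[c] for c in range(r + 1, n))) / aug[r][r]
    return x

def ls_residual(phi, y, support):
    """Least-squares fit of y on the columns of phi in `support`; returns (coef, residue norm)."""
    rows = range(len(y))
    cols = [[phi[m][n] for m in rows] for n in support]
    gram = [[sum(ci[m].conjugate() * cj[m] for m in rows) for cj in cols] for ci in cols]
    rhs = [sum(ci[m].conjugate() * y[m] for m in rows) for ci in cols]
    coef = solve_linear(gram, rhs)
    err = [y[m] - sum(c * col[m] for c, col in zip(coef, cols)) for m in rows]
    return coef, sum(abs(e) ** 2 for e in err) ** 0.5

def draw_support(probs, sparsity, rng, max_rounds=1000):
    """Thin the full index set by repeated Bernoulli trials until it is sparse enough."""
    support = list(range(len(probs)))
    for _ in range(max_rounds):
        if len(support) <= sparsity:
            return support
        thinned = [n for n in support if rng.random() < probs[n]]
        if thinned:                       # assumption: redo a round that empties the support
            support = thinned
    # Fallback (assumption): keep the most probable entries.
    return sorted(sorted(support, key=lambda n: -probs[n])[:sparsity])

def scem(phi, y, sparsity, num_candidates, num_favorable, threshold, max_iter, rng):
    n = len(phi[0])
    probs = [0.5] * n                     # Phase 2: initial support distribution
    best = (float("inf"), [], [])
    for _ in range(max_iter):             # Phase 3: main iterations
        scored = []
        for _ in range(num_candidates):
            support = draw_support(probs, sparsity, rng)
            coef, res = ls_residual(phi, y, support)
            scored.append((res, support, coef))
        scored.sort(key=lambda item: item[0])
        favorable = scored[:num_favorable]
        best = min(best, favorable[0], key=lambda item: item[0])
        # CE-minimizing update: fraction of favorable supports that contain each index.
        probs = [sum(idx in s for _, s, _ in favorable) / num_favorable for idx in range(n)]
        # Clipping is an added safeguard (assumption) so that sampling keeps exploring.
        probs = [min(max(p, 0.02), 0.98) for p in probs]
        if best[0] <= threshold:          # halting condition on the residue error norm
            break
    return probs, best[1], best[2], best[0]   # Phase 4: output

rng = random.Random(7)
N, M, K = 32, 12, 3                       # hypothetical sizes
phi = [[cmath.exp(2j * cmath.pi * (N - M + m) * n / N) / N ** 0.5 for n in range(N)]
       for m in range(M)]                 # last M rows of an IDFT matrix
true_support, true_coef = [4, 13, 27], [2 + 1j, -1.5j, 1 - 2j]
y = [sum(c * phi[m][n] for c, n in zip(true_coef, true_support)) for m in range(M)]
probs, support, coef, residual = scem(phi, y, K, 200, 20, 1e-6, 30, rng)
print(support, residual)
```

The demo is noise-free and strictly sparse; whether the true support is found within the iteration budget depends on the random draws, so the printed result should be read as one sample run rather than a performance claim.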
Afterwards, can be calculated by (13) and the NBI associated with the OFDM block can be calculated through (10). Then, the NBI can be directly eliminated from the information data in the frequency domain by subtracting from the received frequency-domain OFDM subcarriers , which is given by
(30) 
where is the DFT of the received OFDM block as illustrated in Fig. 1, while is the frequency-domain OFDM data block free from the NB-IoT interference. Thus, the NBI-free OFDM data block can then be used for information demapping and decoding.
IV-C Enhanced Sparse Machine Learning Based Algorithm: Regularized SCEM
In the proposed SCEM algorithm where the CE plays the important role of loss function, each NLL corresponding to each favorable support has an average contribution to the CE given in (23), so the favorable supports with different residue error norms contribute the same to the loss function. In fact, different supports should reflect different contributions on the loss function so as to encourage the algorithm to learn the support with less error. Out of this insight, an enhanced sparse learning algorithm of RSCEM is proposed, in which the loss function in (22) is regularized by multiplying with the weighting parameter to generate the regularized loss function given by
(31) 
where the regularization weighting parameter is given by
(32) 
where is the average residue error norm over the favorable supports given by
(33) 
Note that a smaller residue error norm leads to a larger weighting parameter in (32). Hence, the NLL corresponding to a more accurate support will have a larger contribution to the regularized loss function in (31), which will drive the support distribution to converge to the ground-truth support more accurately and more efficiently. The pseudocode of RSCEM is thus similar to that of SCEM given in Algorithm 1 except for the procedure of minimizing the loss function in Line 9, where the regularized loss function is now adopted to update the distribution as given by
(34) 
To calculate the minimum regularized loss function in (34), the same notation as in the previous subsection, i.e. the Bernoulli vector in (24) denoting the favorable support , is inherited. Through similar deduction from (24) to (27), and substituting (27) into (34), the first derivative of the regularized loss function with respect to can be obtained, represented as
(35) 
Setting the first derivative of the regularized loss function given in (35) to zero, the regularized loss function can be minimized, yielding the updated support probability distribution given by
(36) 
Comparing (36) with (29), it can be observed that, for SCEM, all the entries contribute equally to the update of in (29), so the differences in accuracy among favorable supports are not taken into consideration. In contrast, for the enhanced RSCEM algorithm, a more accurate support imposes a larger weighting parameter on and contributes more to the update of , as implied by (36). In fact, (29) can be regarded as a special case of (36) when . Consequently, it can be inferred that the enhanced RSCEM algorithm will learn the ground-truth support distribution more accurately and more efficiently than SCEM, which is also validated by the simulation results in the next section.
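The difference between the two updates can be sketched as a weighted versus unweighted average of the support indicators. The weight used here (average residue divided by each residue) is an assumed form that only satisfies the stated property that a smaller residue gives a larger weight; the exact expression in (32) may differ, and the example supports and residues are made up.

```python
def residue_weights(residues, floor=1e-12):
    """Assumed weighting: average residue norm divided by each residue norm."""
    mean_res = sum(residues) / len(residues)
    return [mean_res / max(r, floor) for r in residues]

def update_distribution(favorable_supports, n, weights=None):
    """Weighted share of favorable supports containing each index (all weights 1 gives SCEM)."""
    if weights is None:
        weights = [1.0] * len(favorable_supports)
    total = sum(weights)
    return [sum(w for w, s in zip(weights, favorable_supports) if idx in s) / total
            for idx in range(n)]

# Hypothetical favorable supports with their residue error norms (best first).
favorable = [[1, 4], [1, 5], [2, 4]]
residues = [0.1, 0.2, 0.4]

plain = update_distribution(favorable, 6)
weighted = update_distribution(favorable, 6, residue_weights(residues))
print(plain)
print(weighted)
```

In the weighted update, indices that appear in the most accurate support gain probability at the expense of indices that only appear in less accurate ones, while equal weights recover the unweighted update.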
IV-D Extension to MIMO: Simultaneous Multi-Antenna NBI Recovery Algorithm
The proposed method can be extended to MIMO systems to further improve the estimation accuracy by exploiting the spatial correlation of the NBI. Due to the spatial correlation, the received NBI signals at different receive antennas in the MIMO system share the same support, i.e., the locations of nonzero entries are the same, although their amplitudes might be different [34]. One can make use of the spatial correlation in the iterations of the proposed sparse machine learning algorithm to simultaneously recover the NBI signals contaminating multiple receive antennas.
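One way to exploit the common support is to score each candidate support jointly over all receive antennas. The combination rule below (root of the summed squared residue norms) is an assumption made for illustration, since the text above does not specify how the per-antenna contributions are merged; the residue values are made up.

```python
def rank_candidates_jointly(residues_per_antenna):
    """Rank candidate supports by a joint residue over antennas.

    residues_per_antenna[a][c] is the residue error norm of candidate c at antenna a.
    The joint score (assumed here) is the root of the summed squared residues.
    """
    num_candidates = len(residues_per_antenna[0])
    joint = [sum(r[c] ** 2 for r in residues_per_antenna) ** 0.5
             for c in range(num_candidates)]
    order = sorted(range(num_candidates), key=lambda c: joint[c])
    return order, joint

# Hypothetical residues for three candidate supports at two receive antennas.
residues = [
    [0.1, 0.9, 0.5],   # antenna 0
    [0.2, 0.8, 0.1],   # antenna 1
]
order, joint = rank_candidates_jointly(residues)
print(order, joint)
```

A candidate that fits well at only one antenna is ranked below one that fits well at all of them, which is the intended benefit of using the shared support.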