I Introduction
Spectrum sensing enables cognitive radios to discover unused spectrum of primary users (PUs) in the time, frequency and spatial domains, so that secondary users (SUs) can access these unused spectral bands to increase the spectral utilization of the network [1]-[3]. Spectrum sensing is therefore of critical importance for the realization of cognitive radio.
In recent years, deep learning (DL) techniques have achieved great success on many complex tasks in computer vision, speech recognition and synthesis, and natural language processing. Experience in these areas has shown that the best performance is usually obtained with end-to-end models [4, 5, 6], where a DL system learns appropriate features for the task in a data-driven fashion, instead of using engineered features handcrafted by domain experts. Such models may also have potential in spectrum sensing.
A DL model was proposed in [7] for cooperative spectrum sensing, where the cognitive radio network (CRN) combines the individual sensing results from each SU. Measured received signal strength (RSS) or binary sensing decisions were used as the input to a deep neural network (DNN). A recent work on modulation recognition [8], using raw samples of the in-phase and quadrature-phase components of the received temporal signals as input to a DNN, shows significant gains compared to using conventional features such as higher-order moments. However, deep learning-based approaches require significant amounts of labeled training data that follow the same distribution as the test data. In [9] and [10], the authors propose generative adversarial networks to augment a limited number of labeled training examples, as well as domain adaptation to switch between signal types.
In this letter, we propose a DL-based spectrum sensing system, called deep sensing hereafter. Unlike existing DL-based spectrum sensing using expert features, the proposed method uses raw signals as inputs to a DNN. We observe that a DNN trained using data obtained under one set of conditions may not perform well when wireless conditions change, e.g., under variations in wireless propagation or different PU signals. To improve the robustness, we propose to incorporate transfer learning [11], which uses small amounts of additional data to adapt the learned models to new communications settings. Results show that transfer learning significantly improves the robustness of deep spectrum sensing.
To our knowledge, this is the first attempt at directly using signal samples rather than expert features for spectrum sensing with DL in cognitive radio, and the first exploration of transfer learning, considering both the case of no labeled training examples and the case of a small number of labeled training examples, toward more robust DL-based spectrum sensing. The rest of this letter is organized as follows. Section II presents the deep spectrum sensing algorithm and its performance. Section III analyzes robustness and examines two transfer learning frameworks.
II Deep Spectrum Sensing
Received radio signals pass through a rectangular band-limited filter to limit noise and are then sampled, producing a discrete-time sequence. A subsequence of complex-valued samples, collected during a single sensing interval, is decomposed into a real-valued array with two rows, the first and second row being the in-phase and quadrature components respectively, and forms a single input $\mathbf{x}$ to a DNN. The DNN outputs a binary class label with value $1$ when the PU is detected and $0$ when it is not.
We use a convolutional neural network (CNN) with two convolutional layers, followed by two dense layers (Table I). The two convolutional layers share the same stride and zero padding. Rectified linear unit (ReLU) activations are used as the nonlinearity in each layer. Dropout is used to regularize the fully connected and convolutional layers, to reduce overfitting. The Adam optimizer is utilized, and the last layer uses the logistic function. Given a training set of $N$ sensing-interval examples $\mathbf{x}_i$ and their class labels $y_i$, denoted $\mathcal{D} = \{(\mathbf{x}_i, y_i)\}_{i=1}^{N}$, the network parameters $\theta$ are learned by minimizing the empirical risk
$$\theta^{*} = \arg\min_{\theta} R_{\mathrm{emp}}(\theta), \tag{1}$$
where
$$R_{\mathrm{emp}}(\theta) = \frac{1}{N} \sum_{i=1}^{N} L\big(y_i, f(\mathbf{x}_i; \theta)\big),$$
$f(\mathbf{x}; \theta)$ is the network output, and the empirical risk uses the binary cross-entropy loss function
$$L\big(y, f(\mathbf{x}; \theta)\big) = -y \log f(\mathbf{x}; \theta) - (1 - y) \log\big(1 - f(\mathbf{x}; \theta)\big). \tag{2}$$
This is the set of network parameters that maximizes the likelihood $\prod_{i=1}^{N} P(y_i \mid \mathbf{x}_i; \theta)$.
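The link between the empirical risk in (1), the binary cross-entropy loss in (2), and the likelihood can be checked numerically. The following is a minimal sketch, assuming labels coded as 1 (PU present) and 0 (PU absent) and network outputs interpreted as probabilities; the function names and the example values are hypothetical and are not taken from the experiments in this letter.

```python
import numpy as np

def binary_cross_entropy(y_true, y_prob, eps=1e-12):
    """Per-example binary cross-entropy between labels and network outputs."""
    y_true = np.asarray(y_true, dtype=float)
    # Clip so that log() never receives exactly 0 or 1.
    y_prob = np.clip(np.asarray(y_prob, dtype=float), eps, 1.0 - eps)
    return -(y_true * np.log(y_prob) + (1.0 - y_true) * np.log(1.0 - y_prob))

def empirical_risk(y_true, y_prob):
    """Average loss over the training set (the quantity being minimized)."""
    return float(np.mean(binary_cross_entropy(y_true, y_prob)))

def log_likelihood(y_true, y_prob, eps=1e-12):
    """Bernoulli log-likelihood of the labels given the network outputs."""
    return -float(np.sum(binary_cross_entropy(y_true, y_prob, eps)))

# Hypothetical labels (1 = PU present, 0 = PU absent) and network outputs.
labels = np.array([1, 0, 1, 1, 0])
outputs = np.array([0.9, 0.2, 0.7, 0.6, 0.1])
risk = empirical_risk(labels, outputs)
loglik = log_likelihood(labels, outputs)
```

Since `risk` equals `-loglik / len(labels)`, minimizing the empirical risk and maximizing the likelihood select the same parameters.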
The reasons for choosing a CNN are (a) relatively low complexity, (b) operation of a CNN kernel can be thought of as related to filtering operations that occur in communications receivers, and (c) the modulation recognition work by O’Shea [8] used a CNN.
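The input formatting described at the start of this section, in which the complex-valued samples of one sensing interval become a two-row real-valued array, can be sketched as follows. This is an illustrative sketch only: the function name `iq_to_real_input` and the interval length of 64 samples are assumptions made for the example, not values from this letter.

```python
import numpy as np

def iq_to_real_input(samples):
    """Stack in-phase (row 0) and quadrature (row 1) parts of complex samples."""
    samples = np.asarray(samples, dtype=np.complex128)
    return np.stack([samples.real, samples.imag], axis=0)

# Hypothetical sensing interval of 64 complex baseband samples.
rng = np.random.default_rng(0)
interval = rng.normal(size=64) + 1j * rng.normal(size=64)
x_input = iq_to_real_input(interval)  # real-valued array of shape (2, 64)
```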
Layer  Output dimensions  # of kernels  Kernel size 

Input  
Conv1  
Conv2  
Dense1  
Dense2  
Output 
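To evaluate the performance of spectrum sensing using deep learning, we adopt a setting where an analytical expression for the optimal sensing algorithm is available. We consider detecting a narrowband Gaussian-distributed signal in additive white Gaussian noise (AWGN), in which case the optimal sensing algorithm according to the log-likelihood ratio is [12]
$$T(\mathbf{x}) = \mathbf{x}^{H} \left( \boldsymbol{\Sigma}_{n}^{-1} - \boldsymbol{\Sigma}_{x}^{-1} \right) \mathbf{x}, \tag{3}$$
where $T(\mathbf{x})$ is the test statistic that is compared with a threshold, $\mathbf{x}$ is a vector of received samples within one sensing duration, $\boldsymbol{\Sigma}_{x}$ is the covariance matrix of $\mathbf{x}$, and $\boldsymbol{\Sigma}_{n}$ is the covariance matrix of the additive noise after the filter.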
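For zero-mean Gaussian signal and noise, the log-likelihood ratio reduces, up to constants absorbed into the threshold, to a quadratic form in the received vector that depends only on the two covariance matrices. The sketch below computes such a statistic alongside a plain energy statistic. The covariance model (exponentially correlated signal, white noise), the variances, and the vector length are assumptions chosen for illustration and do not reproduce the simulation settings of this letter.

```python
import numpy as np

def llr_statistic(x, cov_x, cov_n):
    """Quadratic form x^H (cov_n^{-1} - cov_x^{-1}) x for a Gaussian-in-Gaussian test."""
    diff = np.linalg.inv(cov_n) - np.linalg.inv(cov_x)
    return float(np.real(np.conj(x) @ diff @ x))

def energy_statistic(x):
    """Energy detector statistic: total energy of the received samples."""
    return float(np.sum(np.abs(x) ** 2))

# Hypothetical setup: correlated (narrowband-like) signal plus white noise.
n = 16
sigma_s2, sigma_n2 = 1.0, 2.0
idx = np.arange(n)
cov_s = sigma_s2 * 0.9 ** np.abs(idx[:, None] - idx[None, :])
cov_n = sigma_n2 * np.eye(n)
cov_x = cov_s + cov_n  # covariance of the received vector when the PU is present

rng = np.random.default_rng(0)
x = rng.normal(size=n) + 1j * rng.normal(size=n)
t_opt = llr_statistic(x, cov_x, cov_n)
t_ed = energy_statistic(x)
```

When the signal is itself white, the quadratic form is a scaled version of the energy statistic, so the two detectors coincide; the correlation structure of the signal is what lets the likelihood ratio test improve on energy detection.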
We compare sensing performance using a narrowband Gaussian PU signal with zero mean, corrupted by AWGN. Each sensing interval contains a fixed number of samples, and the signal-to-noise ratio (SNR) $\sigma_s^2 / \sigma_n^2$ is 4 dB, where $\sigma_s^2$ is the PU signal variance and $\sigma_n^2$ is the noise variance after the filter. The PU signal bandwidth is a fraction of the filter bandwidth. The network is trained on a training set and tested on an independent test set of the same size, generated with the same transmitter, channel and receiver characteristics. Fig. 1 shows the receiver operating characteristic (ROC) curves for optimal and deep sensing, as well as the performance of an energy detector [2]. The optimal sensing result was obtained with (3). The deep sensing result was obtained by computing the probabilities of detection and false alarm on the test set, using different thresholds on the network output. Deep sensing, which does not require feature extraction from the received samples, outperforms energy detection (ED) and is close to the optimal.
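The threshold sweep used to obtain the deep sensing ROC curve can be sketched as below: for each threshold on the network output, the probability of detection is the fraction of PU-present examples declared occupied, and the probability of false alarm is the fraction of PU-absent examples declared occupied. The scores, labels and thresholds are hypothetical toy values, not outputs of the trained network.

```python
import numpy as np

def roc_points(scores, labels, thresholds):
    """Return (P_fa, P_d) arrays obtained by thresholding the network output."""
    scores = np.asarray(scores, dtype=float)
    present = np.asarray(labels).astype(bool)
    p_fa, p_d = [], []
    for t in thresholds:
        decide_pu = scores > t                     # declare "PU present"
        p_d.append(np.mean(decide_pu[present]))    # detections among PU-present examples
        p_fa.append(np.mean(decide_pu[~present]))  # false alarms among PU-absent examples
    return np.array(p_fa), np.array(p_d)

# Hypothetical network outputs on a tiny test set (1 = PU present, 0 = PU absent).
scores = [0.9, 0.8, 0.3, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
p_fa, p_d = roc_points(scores, labels, thresholds=[0.0, 0.5, 1.0])
```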
The optimal scheme for a particular sensing scenario is only optimal if it has perfect information on the required parameters. For example, the optimal scheme in Fig. 1 requires the covariance matrices of the received samples and of the additive noise after the receive filter. With estimation error in the required information, the performance degrades. Also, for different sensing scenarios, the optimal sensing scheme differs, so a dedicated sensing receiver is required for every scenario, which is costly.
III Robust Deep Sensing with Transfer Learning
Robustness was shown to be a problem when applying DL for automatic modulation recognition [13]. We examine deep sensing robustness by considering two different PU signals: narrowband zero-mean Gaussian signals in AWGN at a fixed SNR, and QPSK signals that use a square-root raised-cosine filter for pulse shaping. The QPSK signals experience path loss, with the average SNR varying over a range, and frequency-selective Rayleigh fading with 3 discrete paths. The data is obtained from simulations in MATLAB. Datasets collected under these different characteristics belong to different, but related, distributions. We say that these datasets have been obtained in different domains. The source domain is used to train the network, and the target domain is used for testing. The training and test sets have the same size. Results are in Fig. 2, where the probability of detection ($P_d$) versus the probability of false alarm ($P_{fa}$) is plotted. In Fig. 2 (left), we use QPSK as the source domain and Gaussian as the target domain. The resulting sensing performance, marked “QPSK-Gaussian”, is significantly worse than the case where we use examples of Gaussian signals to both train and test the network (curve labeled “Gaussian-Gaussian”).
Similar observations can be made from Fig. 2 (right), where the curve “Gaussian-QPSK” is obtained using Gaussian signals in the source domain and QPSK signals in the target domain, and the curve “QPSK-QPSK” is plotted for reference. Figs. 1 and 2 show that when the source and target domains are the same, deep sensing performance can be close to optimal, whereas when they are mismatched, deep sensing performance can degrade significantly. As transmitted signals can vary in several ways (e.g., alphabet sizes, coding schemes) and signal propagation depends on many factors (e.g., frequency, terrain profile), obtaining enough ground-truth labeled training data across all possible scenarios is difficult. Experience in other problems such as object recognition shows that no system is ever robust enough to address all possible operating conditions. Transfer learning procedures are therefore important.
III-A Transfer learning with no labeled data
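The transfer approaches in this category are referred to as unsupervised domain adaptation. Let $X_S$ and $X_T$ denote the data in the source and target domains. As shown above, directly applying the neural network (NN) trained with $X_S$ may not work well for $X_T$. To leverage the knowledge learned by the NN from $X_S$, we use the transfer learning method of [14]. This aims to discover a latent space described by a kernel-induced feature transformation function $\phi$ such that the marginal distributions of $\phi(X_S)$ and $\phi(X_T)$ are close. A nonparametric distance estimate, referred to as the Maximum Mean Discrepancy (MMD) [14], is defined by embedding distributions in a reproducing kernel Hilbert space (RKHS) and is calculated by
$$\mathrm{Dist}(X_S, X_T) = \left\| \frac{1}{n_1} \sum_{i=1}^{n_1} \phi(\mathbf{x}_{S_i}) - \frac{1}{n_2} \sum_{i=1}^{n_2} \phi(\mathbf{x}_{T_i}) \right\|_{\mathcal{H}}^{2},$$
where $n_1$ and $n_2$ are the numbers of source and target examples and $\|\cdot\|_{\mathcal{H}}$ is the RKHS norm. Making the distributions of the source and target data close is equivalent to minimizing the MMD distance [14]. Let $\mathbf{K} = \begin{bmatrix} \mathbf{K}_{S,S} & \mathbf{K}_{S,T} \\ \mathbf{K}_{T,S} & \mathbf{K}_{T,T} \end{bmatrix}$ be the kernel matrix on the combined source and target data, and $L_{ij} = 1/n_1^2$ if $\mathbf{x}_i, \mathbf{x}_j \in X_S$, else $L_{ij} = 1/n_2^2$ if $\mathbf{x}_i, \mathbf{x}_j \in X_T$, otherwise $L_{ij} = -1/(n_1 n_2)$. The MMD distance can then be written as $\mathrm{tr}(\mathbf{K}\mathbf{L})$, and the learning problem formulated as [14]
$$\min_{\mathbf{W}} \; \mathrm{tr}\left(\mathbf{W}^{T}\mathbf{K}\mathbf{L}\mathbf{K}\mathbf{W}\right) + \mu\, \mathrm{tr}\left(\mathbf{W}^{T}\mathbf{W}\right) \quad \text{s.t.} \quad \mathbf{W}^{T}\mathbf{K}\mathbf{H}\mathbf{K}\mathbf{W} = \mathbf{I}, \tag{4}$$
where $\mathrm{tr}(\cdot)$ stands for the trace operation, $\mathbf{H} = \mathbf{I} - \frac{1}{n_1 + n_2}\mathbf{1}\mathbf{1}^{T}$ is the centering matrix, $\mathbf{1}$ is a column vector with all $1$’s, the regularization term $\mathrm{tr}(\mathbf{W}^{T}\mathbf{W})$ controls the complexity of $\mathbf{W}$, $\mu$ is a tradeoff factor between the MMD distance between distributions and complexity, and $\mathbf{I}$ is the identity matrix. The data in the latent space is $\mathbf{W}^{T}\mathbf{K}$, and the solution $\mathbf{W}$ of (4) corresponds to the $m$ leading eigenvectors of $(\mathbf{K}\mathbf{L}\mathbf{K} + \mu\mathbf{I})^{-1}\mathbf{K}\mathbf{H}\mathbf{K}$, where $m$ is the dimension of the latent space.
We use the probability of detection $P_d$ and the probability of false alarm $P_{fa}$ as the sensing performance metrics. Fig. 2 (left) shows that when QPSK data is used as source data and Gaussian data as target data, the transfer learning algorithm improves the sensing, compared to directly using the NN trained on QPSK data for sensing Gaussian PU signals. However, the improved deep sensing is still worse than ED. Further, interchanging source and target data, Fig. 2 (right) shows that unsupervised domain adaptation does not improve performance, although in this case deep sensing outperforms ED with or without adaptation. These results indicate that transfer with no labeled target domain data is not robust.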
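The kernel-based construction of [14] used in this subsection can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the implementation used for Fig. 2: it uses a linear kernel for simplicity, the value of the tradeoff factor `mu`, the latent dimension, and the synthetic data are hypothetical, and the latent data is returned with one row per example (the transpose of the projected kernel matrix).

```python
import numpy as np

def tca_matrices(x_src, x_tgt):
    """Kernel matrix K, MMD coefficient matrix L and centering matrix H."""
    x_all = np.vstack([x_src, x_tgt])
    n1, n2 = len(x_src), len(x_tgt)
    k_mat = x_all @ x_all.T  # linear kernel (a simplifying assumption)
    # L has 1/n1^2 and 1/n2^2 on the diagonal blocks and -1/(n1 n2) elsewhere.
    e = np.concatenate([np.full(n1, 1.0 / n1), np.full(n2, -1.0 / n2)])
    l_mat = np.outer(e, e)
    n = n1 + n2
    h_mat = np.eye(n) - np.ones((n, n)) / n
    return k_mat, l_mat, h_mat

def tca(x_src, x_tgt, dim, mu=1.0):
    """Project source and target data onto the leading transfer components."""
    k_mat, l_mat, h_mat = tca_matrices(x_src, x_tgt)
    n = k_mat.shape[0]
    # Leading eigenvectors of (K L K + mu I)^{-1} K H K.
    m_mat = np.linalg.solve(k_mat @ l_mat @ k_mat + mu * np.eye(n),
                            k_mat @ h_mat @ k_mat)
    eigvals, eigvecs = np.linalg.eig(m_mat)
    order = np.argsort(-eigvals.real)[:dim]
    w_mat = eigvecs[:, order].real
    z = k_mat @ w_mat  # latent data, one row per example
    return z[: len(x_src)], z[len(x_src):]

# Hypothetical source and target data with shifted means.
rng = np.random.default_rng(0)
x_src = rng.normal(size=(30, 4))
x_tgt = rng.normal(size=(20, 4)) + 1.0
z_src, z_tgt = tca(x_src, x_tgt, dim=2)
```

With a linear kernel, the trace of the product of the kernel matrix and the coefficient matrix equals the squared distance between the source and target sample means, which is the MMD estimate for that kernel.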
III-B Transfer learning with a small amount of labeled data
When we have a small amount of labeled data, we can use fine-tuning, the dominant transfer learning procedure in computer vision [11]. The deep sensing system, trained on a large source dataset, is a starting point for further training using data from the target dataset. For training the baseline network, it is assumed that simulation data is used. For the transfer learning, we also use simulation data, but in practice the SU would need to acquire some real labeled data in its actual environment. One way to accomplish this is through cooperation between PUs and SUs. With a small loss of throughput, the PUs could use occasional sensing intervals to provide ON and OFF periods so that each SU can acquire labeled data. Alternatively, by listening and comparing across consecutive sensing and data transmission intervals, an SU could develop estimates of the labels.
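The fine-tuning procedure can be illustrated with a deliberately small stand-in model. The sketch below swaps the CNN for a logistic-regression classifier trained by gradient descent on the binary cross-entropy, and uses synthetic two-dimensional features; all names, sizes, step counts and the domain shift are assumptions for illustration. It shows only the mechanics: pretrain on a large source set, then continue training from those weights on a small labeled target set, versus starting from random weights.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(x, y, w):
    """Mean binary cross-entropy of a logistic model with weights w."""
    p = np.clip(sigmoid(x @ w), 1e-12, 1.0 - 1e-12)
    return float(-np.mean(y * np.log(p) + (1.0 - y) * np.log(1.0 - p)))

def train(x, y, w_init, steps, lr=0.1):
    """Gradient descent on the binary cross-entropy, starting from w_init."""
    w = np.array(w_init, dtype=float)
    for _ in range(steps):
        w -= lr * x.T @ (sigmoid(x @ w) - y) / len(y)
    return w

def make_domain(rng, n, shift):
    """Synthetic two-class data; `shift` moves the whole domain (hypothetical)."""
    y = rng.integers(0, 2, size=n).astype(float)
    feats = rng.normal(size=(n, 2)) + (2.0 * y[:, None] - 1.0) + shift
    return np.hstack([feats, np.ones((n, 1))]), y  # last column is a bias term

rng = np.random.default_rng(0)
x_src, y_src = make_domain(rng, 2000, shift=0.0)  # large source dataset
x_tgt, y_tgt = make_domain(rng, 20, shift=0.5)    # small labeled target dataset

w_pre = train(x_src, y_src, np.zeros(3), steps=500)            # pretraining
w_ft = train(x_tgt, y_tgt, w_pre, steps=50)                    # fine-tuning
w_scratch = train(x_tgt, y_tgt, rng.normal(size=3), steps=50)  # training from scratch
```

Comparing `bce(x_tgt, y_tgt, w_ft)` with `bce(x_tgt, y_tgt, w_scratch)` on held-out target data, averaged over several random seeds, mirrors the kind of comparison reported in this subsection.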
We start with an NN pretrained on examples of QPSK data and fine-tune it using a variable number of examples of Gaussian signals. The fine-tuned network is then applied to sensing zero-mean Gaussian signals. We also plot the ED performance and the performance of DL-based sensing trained from scratch, which initializes the NN randomly and trains it using a variable number of Gaussian examples. To account for the randomness of stochastic gradient descent with random weight initialization, the network is trained 10 times and the results are averaged. Fig. 3 (top) shows the probability of detection vs. the number of examples of Gaussian signals, at a fixed probability of false alarm. With no labeled Gaussian data, the network trained on QPSK data achieves a higher probability of detection than the randomly initialized network, showing that QPSK-trained initialization is beneficial. When the number of training examples is larger than roughly 300, the DL-based sensing outperforms ED. Fine-tuning outperforms training from scratch on the Gaussian data. Given enough training data, the performance of random initialization approaches that of the pretrained network.
Next we interchange the training and test data. We use Gaussian signals for pretraining and fine-tune with a limited number of examples of QPSK signals. We test by sensing QPSK signals. As in Fig. 3 (top), simulations are rerun 10 times and the results are averaged. Fig. 3 (bottom) shows the probability of detection versus the number of labeled QPSK examples, at a fixed probability of false alarm. We observe a similar pattern as before: when only a small amount of QPSK training data is available, fine-tuning achieves better performance than random initialization. Further, fine-tuning outperforms ED over the whole curve, and DL-based sensing trained from scratch also outperforms ED when the number of training examples exceeds roughly 100.
In addition to the narrowband Gaussian and QPSK signals, we tested several other signals and channel models. For curves of the type shown in Fig. 3, the areas under the curves over the x-axis range, for both fine-tuning and training from scratch, are given in Table II. All results were consistent with Fig. 3, in that fine-tuning outperformed training from scratch.
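The area-under-the-curve summary used for Table II can be computed with the trapezoidal rule applied to a curve of detection probability versus number of labeled training examples. The sketch below assumes this integration rule, and the sample counts and detection probabilities are hypothetical toy values rather than points from Fig. 3.

```python
import numpy as np

def area_under_curve(x, y):
    """Trapezoidal-rule area under a curve sampled at increasing x values."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

# Hypothetical curve: detection probability vs. number of labeled examples.
num_examples = [0, 100, 200]
p_detect = [0.5, 0.7, 0.8]
auc = area_under_curve(num_examples, p_detect)  # 60 + 75 = 135
```

A larger area indicates higher detection probability across the range of training-set sizes, which is how fine-tuning and training from scratch are compared.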
Source domain → target domain  Fine-tune  Train from scratch 

BPSK + PL → QPSK + PL, R  845.64  673.98 
QPSK + PL, R → BPSK + PL  938.72  849.61 
QPSK + PL → 16QAM + PL, R  816.55  655.63 
16QAM + PL → BPSK + PL, R  870.26  760.05 
Conclusion: We demonstrate the application of deep learning to spectrum sensing. The approach does not require feature extraction from the received signals at the SU. As deep spectrum sensing is not robust when applied in a different communications scenario from the training data, we incorporate transfer learning to ensure robustness. With no labeled target data, the transfer is unreliable and depends on whether QPSK or Gaussian signals are the source or target. When there is a small amount of labeled target data, fine tuning is shown to be robust for transferring into a variety of domains.
References
 [1] Y. Zeng, Y. Liang, A. T. Hoang, and R. Zhang, “A review on spectrum sensing for cognitive radio: challenges and solutions,” EURASIP Journal on Advances in Signal Processing, doi:10.1155/2010/381465.
 [2] D. Cabric, S. M. Mishra, and R. W. Brodersen, “Implementation issues in spectrum sensing for cognitive radios,” the 38th Asilomar Conference on Signals, Systems and Computers, Nov. 7-10, 2004.

 [3] Y. Li and Q. Peng, “Achieving secure spectrum sensing in presence of malicious attacks utilizing unsupervised machine learning,” IEEE MILCOM, Baltimore, USA, Nov. 1-3, 2016.
 [4] M. Bojarski, D. Testa, et al., “End to End Learning for Self-Driving Cars,” CoRR abs/1604.07316, 2016.
 [5] A. Graves and N. Jaitly, “Towards End-To-End Speech Recognition with Recurrent Neural Networks,” in Proceedings of the 31st International Conference on Machine Learning, 2014.
 [6] Y. Wu et al., “Google’s Neural Machine Translation System: Bridging the Gap between Human and Machine Translation,” CoRR abs/1609.08144, 2016.
 [7] W. Lee, M. Kim, D. Cho, and R. Schober, “Deep sensing: cooperative spectrum sensing based on convolutional neural networks,” arXiv:1705.08164.
 [8] T. J. O’Shea, T. Roy, and T. C. Clancy, “Over-the-air deep learning based radio signal classification,” IEEE Journal of Selected Topics in Signal Processing, vol. 12, no. 1, Feb. 2018.
 [9] K. Davaslioglu and Y. E. Sagduyu, “Generative adversarial learning for spectrum sensing,” IEEE International Conference on Communications, Kansas City, MO, USA, May 20-24, 2018.
 [10] T. Erpek, Y. E. Sagduyu, and Y. Shi, “Deep learning for launching and mitigating wireless jamming attacks,” IEEE Trans. on Cognitive Communications and Networking, vol. 5, no. 1, pp. 2-14, Mar. 2019.
 [11] J. Donahue, Y. Jia, O. Vinyals, J. Hoffman, N. Zhang, E. Tzeng, and T. Darrell, “Decaf: A deep convolutional activation feature for generic visual recognition,” in Proceedings of the 31st International Conference on Machine Learning, 2014.
 [12] T. A. Schonhoff and A. A. Giordano, Detection and Estimation Theory and Its Applications, Prentice Hall, 2006.
 [13] B. Luo, Q. Peng, P. C. Cosman, and L. B. Milstein, “Robustness of deep modulation recognition under AWGN and Rician fading,” Asilomar Conference on Signals, Systems, and Computers, Oct. 2018.
 [14] S. J. Pan, I. W. Tsang, J. T. Kwok, and Q. Yang, “Domain adaptation via transfer component analysis,” IEEE Trans. on Neural Networks, vol. 22, no. 2, pp. 199-210, Feb. 2011.