1 Introduction
Nowadays, developments in deep convolutional neural networks (CNNs) have led to remarkable achievements in the area of signal recognition, improving the state of the art significantly, e.g., [16, 7, 11]. Generally, a vast majority of existing learning methods follow a closed-set assumption [6, 13], that is, all of the test classes are assumed to be the same as the training classes. However, in real-world applications new signal categories often appear, while the model is only trained for the current dataset with a limited number of known classes. Open-set learning [22, 1] was proposed to partially tackle this issue (i.e., test samples may come from unknown classes). The goal of an open-set recognition system is to reject test samples from unknown classes while maintaining the performance on known classes. However, in some cases, the learned model should be able not only to differentiate the unknown classes from known classes, but also to distinguish among different unknown classes. Zero-shot learning (ZSL) [18, 23] is one way to address the above challenges and has been applied in image tasks. For images, it is easy to extract human-specified high-level descriptions as semantic attributes. For example, from a picture of a zebra, we can extract the following semantic attributes: 1) color: white and black; 2) stripes: yes; 3) size: medium; 4) shape: horse; 5) land: yes. However, for a real-world signal it is almost impossible to obtain such a high-level description due to obscure signal semantics. Therefore, although ZSL has been widely used in image tasks, to the best of our knowledge it has not yet been studied for signal recognition.
In this paper, unlike the conventional signal recognition task where a classifier is learned to distinguish only known classes (i.e., the labels of test data and training data are all within the same set of classes), we aim to propose a learning framework that can classify not only known classes but also unknown classes without annotations. To do so, a key issue that needs to be addressed is to automatically learn a representation of the semantic attribute space of signals. In our scheme, a CNN combined with an autoencoder is exploited to extract semantic attribute features. Afterwards, the semantic attribute features are classified using a suitably defined distance metric. An overview of the proposed scheme is illustrated in Fig. 1.
In addition, to make the learning model self-evolving, incremental learning [3, 20] needs to be considered when the algorithm runs continuously. The goal of incremental learning is to dynamically adapt the model to new knowledge from newly arriving data without forgetting what has already been learned. Based on incremental learning, the obtained model will gradually improve its performance over time.
In summary, the main contributions of this paper are threefold:

First, we propose a deep CNN-based zero-shot learning framework, called SR2CNN, for open-set signal recognition. SR2CNN is trained to extract semantic features while maintaining the performance of the decoder and classifier. Afterwards, the semantic features are exploited to discriminate signal classes.

Second, extensive experiments on various signal datasets show that the proposed SR2CNN can discriminate not only known classes but also unknown classes, and that it can gradually improve itself.

Last but not least, we provide a new signal dataset SIGNAL202002 including eight digital and three analog modulation classes.
The code and dataset of this paper will be published upon acceptance.
2 Related Work
In recent years, signal recognition via deep learning has achieved a series of successes. The work
[15] proposed the Convolutional Radio Modulation Recognition Networks, which adapt to the complex temporal radio signal domain and also work well at low SNRs. Another paper [12] proposed an ensemble model of deep convolutional networks to recognize 7 classes of signals from real-life data in the fiber optic field. Moreover, [17] used a Residual Neural Network [8] to perform signal recognition tasks across a range of configurations and channel impairments, offering referable statistics. These experiments basically follow the closed-set assumption, namely, their deep models are expected to, and are only capable of, distinguishing among already-known signal classes. When considering the recognition of unknown signal classes, some traditional machine learning methods such as anomaly (also called outlier or novelty) detection can more or less provide some guidance. Isolation Forest
[10] constructs a binary search tree to preferentially isolate anomalies. Elliptic Envelope [21] fits an ellipse enveloping the central data points while rejecting the outsiders. One-class SVM [5], an extension of SVM, finds a decision hyperplane to separate the positive samples from the outliers. Local Outlier Factor
[2] uses distance and density to determine whether a data point is abnormal. The above open-set learning methods can indeed identify known samples (positive samples) and detect unknown ones (outliers). However, a common and inevitable defect of these methods is that they can never carry out any further classification among the unknown signal classes. Zero-shot learning is well known for its ability to classify unknown classes and has already been widely used in image tasks. For example, the work [18] proposed a ZSL framework that can predict unknown classes omitted from a training set by leveraging a semantic knowledge base. Another paper [23] proposed a novel model for jointly doing standard and ZSL classification based on deeply learned word and image representations. The efficiency of ZSL in the image processing field largely profits from explicit semantic attributes that can be manually defined by high-level descriptions. However, it is almost impossible to give such high-level descriptions for signals, and thus the corresponding semantic attributes cannot be easily acquired beforehand. This may be the main reason why ZSL has not yet been studied in signal recognition.
Fig. 3. (a) Max unpooling, where the stride and padding are 2 and 0. (b) Average unpooling, where the stride and padding are 2 and 0. (c) Deconvolution, where the stride and padding are 1 and 0, respectively.

3 Problem Definition
We begin by formalizing the problem. Let $\mathcal{X}$ and $\mathcal{Y}$ be the signal input space and output space, respectively. The set $\mathcal{Y}$ is partitioned into $\mathcal{Y}_k$ and $\mathcal{Y}_u$, denoting the collection of known class labels and unknown class labels, respectively.
Given training data $\{(x_i, y_i)\}$ with labels $y_i \in \mathcal{Y}_k$, the task is to extrapolate and recognize signal classes belonging to $\mathcal{Y}$. Specifically, when we obtain an input signal $x$, the proposed learning framework, elaborated in the sequel, can rightly predict its label $\hat{y} \in \mathcal{Y}$. Notice that our learning framework differs from open-set learning in that we not only classify $x$ into either $\mathcal{Y}_k$ or $\mathcal{Y}_u$, but also predict the label $\hat{y}$. Note that $\mathcal{Y}$ includes both the known classes $\mathcal{Y}_k$ and the unknown classes $\mathcal{Y}_u$.
We restrict our attention to ZSL that uses semantic knowledge to recognize the classes in $\mathcal{Y}_k$ and extrapolate to $\mathcal{Y}_u$. To this end, we first map $x$ from $\mathcal{X}$ into the semantic space $\mathcal{Z}$, and then map this semantic encoding to a class label. Mathematically, we can use a nonlinear mapping $\varphi: \mathcal{X} \to \mathcal{Y}$ to describe our scheme as follows: $\varphi$ is the composition of two other functions, $\psi: \mathcal{X} \to \mathcal{Z}$ and $\nu: \mathcal{Z} \to \mathcal{Y}$, defined below, such that

$\varphi = \nu \circ \psi \quad (1)$

Hence, our task is left to find proper $\psi$ and $\nu$ to build up a learning framework that can identify both known and unknown signal classes.
4 Proposed Approach
This section formally presents an annotation-free zero-shot learning framework for signal recognition. Overall, the proposed framework is composed of four main modules:

Feature Extractor,

Classifier,

Decoder, and

Discriminator
Fig. 2 shows the architecture of the feature extractor, classifier and decoder. The feature extractor is modeled by a CNN architecture that projects the input signal onto a latent semantic space representation. The classifier, modeled by a fully connected neural network, takes the latent semantic representation as input and determines the label of the data. The decoder, modeled by another CNN architecture, aims to produce a reconstructed signal that is as similar as possible to the input signal. Finally, the discriminator is devised to discriminate among all classes, both known and unknown.
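To make the data flow among these modules concrete, here is a minimal numpy sketch in which toy linear maps stand in for the convolutional stacks (all dimensions and weights are hypothetical, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): flattened I/Q signal, semantic space, known classes.
IN_DIM, Z_DIM, N_KNOWN = 256, 32, 8

# Stand-ins for three of the four modules: linear maps instead of conv/deconv stacks.
W_psi = rng.normal(scale=0.1, size=(Z_DIM, IN_DIM))    # feature extractor: X -> Z
W_cls = rng.normal(scale=0.1, size=(N_KNOWN, Z_DIM))   # classifier: Z -> class scores
W_dec = rng.normal(scale=0.1, size=(IN_DIM, Z_DIM))    # decoder: Z -> reconstructed X

def softmax(s):
    e = np.exp(s - s.max())
    return e / e.sum()

x = rng.normal(size=IN_DIM)        # one input signal
z = W_psi @ x                      # latent semantic representation
probs = softmax(W_cls @ z)         # classifier output over known classes
x_hat = W_dec @ z                  # decoder reconstruction

print(z.shape, probs.shape, x_hat.shape)  # (32,) (8,) (256,)
```

The discriminator then operates purely on `z` and the per-class semantic centers, as described in Section 4.2.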
4.1 Feature Extractor, Classifier and Decoder
The feature extractor network can be represented by a mapping $\psi$ from the input space $\mathcal{X}$ to the latent semantic space $\mathcal{Z}$. It consists of four convolutional layers and two fully connected layers. In order to minimize the intra-class variations in $\mathcal{Z}$ while keeping the semantic features of different classes well separated, the center loss [24] is used. Let $z_i = \psi(x_i)$ and let $y_i$ be the label of $x_i$. Assuming that the batch size is $m$, the center loss is expressed as follows:
$\mathcal{L}_c = \frac{1}{2} \sum_{i=1}^{m} \| z_i - c_{y_i} \|_2^2 \quad (2)$
where $c_j$ denotes the semantic center vector of class $j$ in $\mathcal{Z}$, and $c_j$ needs to be updated as the semantic features of class $j$ change. Ideally, the entire training dataset should be taken into account and the features of each class should be averaged in every iteration. In practice, $c_j$ can be updated for each batch according to $c_j \leftarrow c_j - \alpha\, \Delta c_j$, where $\alpha$ is the learning rate and $\Delta c_j$ is computed via

$\Delta c_j = \frac{\sum_{i=1}^{m} \delta(y_i = j)\,(c_j - z_i)}{1 + \sum_{i=1}^{m} \delta(y_i = j)} \quad (3)$

where $\delta(\cdot) = 1$ if the condition inside holds true, and $\delta(\cdot) = 0$ otherwise.
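As a concrete, hedged illustration of Eqs. (2) and (3), the following numpy sketch computes the batch center loss and the per-class center update direction (the batch values and the center learning rate are made up for the example):

```python
import numpy as np

def center_loss(z, y, centers):
    """Eq. (2): 1/2 * sum_i ||z_i - c_{y_i}||^2 over a batch."""
    diff = z - centers[y]
    return 0.5 * np.sum(diff ** 2)

def center_deltas(z, y, centers):
    """Eq. (3): Delta c_j = sum_i delta(y_i=j)(c_j - z_i) / (1 + sum_i delta(y_i=j))."""
    deltas = np.zeros_like(centers)
    for j in range(len(centers)):
        mask = (y == j)
        n_j = mask.sum()
        deltas[j] = (n_j * centers[j] - z[mask].sum(axis=0)) / (1 + n_j)
    return deltas

# Toy batch: 4 samples, 2 classes, 3-dim semantic features.
z = np.array([[1.0, 0.0, 0.0], [0.8, 0.2, 0.0], [0.0, 1.0, 0.0], [0.0, 0.9, 0.1]])
y = np.array([0, 0, 1, 1])
centers = np.zeros((2, 3))

alpha = 0.5  # center learning rate (illustrative value)
centers = centers - alpha * center_deltas(z, y, centers)  # c_j <- c_j - alpha * Delta c_j
print(center_loss(z, y, centers))
```

After the update, each center has moved toward the mean of its class's batch features, which is exactly the behavior Eq. (3) encodes.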
The classifier discriminates the label of a sample based on its semantic features. It consists of several fully connected layers. Furthermore, the cross-entropy loss is utilized to control the error of the classifier, which is defined as

$\mathcal{L}_e = -\sum_{i=1}^{m} \log p_{i, y_i} \quad (4)$

where $p_i = \nu(z_i)$ is the prediction of $z_i$ and $p_{i, y_i}$ denotes its component for the true class $y_i$.
Further, an autoencoder [4, 9, 14] is used in order to retain the effective semantic information in $z$. As shown in the right part of Fig. 2, the decoder is used to reconstruct $x$ from $z$. It is made up of deconvolution, unpooling and fully connected layers. Among them, unpooling is the reverse of pooling and deconvolution is the reverse of convolution. Specifically, max unpooling keeps the positions of the maxima during max pooling, and then restores the maximum values to the corresponding positions while setting the rest to zero, as shown in Fig. 3(a). Analogously, average unpooling expands the feature map by copying values, as shown in Fig. 3(b). The deconvolution, also called transposed convolution, recovers the shape of the input from the output, as shown in Fig. 3(c). See Appendix A for the detailed convolution and deconvolution operations, as well as toy examples.
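The unpooling operations described above can be sketched in a few lines of numpy. This toy implementation (2×2 grid, stride 2, no padding, as in Fig. 3) records the argmax positions during pooling and reuses them for max unpooling:

```python
import numpy as np

def max_pool_2x2(x):
    """2x2 max pooling with stride 2; also record flat argmax positions for unpooling."""
    h, w = x.shape
    out = np.zeros((h // 2, w // 2))
    pos = np.zeros((h // 2, w // 2), dtype=int)  # flat index of each max in x
    for i in range(0, h, 2):
        for j in range(0, w, 2):
            block = x[i:i+2, j:j+2]
            r, c = np.unravel_index(block.argmax(), (2, 2))
            out[i // 2, j // 2] = block[r, c]
            pos[i // 2, j // 2] = (i + r) * w + (j + c)
    return out, pos

def max_unpool_2x2(pooled, pos, shape):
    """Restore maxima to their recorded positions; zeros elsewhere (Fig. 3(a))."""
    out = np.zeros(shape).ravel()
    out[pos.ravel()] = pooled.ravel()
    return out.reshape(shape)

def avg_unpool_2x2(pooled):
    """Average unpooling: expand the map by copying each value into its 2x2 block (Fig. 3(b))."""
    return np.kron(pooled, np.ones((2, 2)))

x = np.array([[1.0, 2, 5, 6],
              [3, 4, 7, 8],
              [1, 0, 2, 1],
              [0, 2, 1, 3]])
pooled, pos = max_pool_2x2(x)
print(max_unpool_2x2(pooled, pos, x.shape))
print(avg_unpool_2x2(pooled))
```

Max unpooling yields a sparse map with the maxima back in place; average unpooling yields a dense block-constant map.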
In addition, the autoencoder loss is utilized to evaluate the difference between the original signal and the reconstructed signal:

$\mathcal{L}_{ae} = \sum_{i=1}^{m} \| \hat{x}_i - x_i \|_2^2 \quad (5)$

where $\hat{x}_i$ is the reconstruction of signal $x_i$. Intuitively, the more completely the signal is reconstructed, the more valid information is carried within $z_i$. Thus, the autoencoder greatly helps the model generate appropriate semantic features.
As a result, the total loss function combines the cross-entropy loss, the center loss and the autoencoder loss as

$\mathcal{L} = \mathcal{L}_e + \beta_1 \mathcal{L}_c + \beta_2 \mathcal{L}_{ae} \quad (6)$

where the weights $\beta_1$ and $\beta_2$ are used to balance the three loss functions. The whole learning process with loss $\mathcal{L}$ is summarized in Algorithm 1, where $\theta_f$, $\theta_c$ and $\theta_d$ denote the model parameters of the feature extractor, the classifier and the decoder, respectively.
4.2 Discriminator
The discriminator is the tail but the core of the proposed framework. It discriminates among known and unknown classes based on the latent semantic space $\mathcal{Z}$. For each known class $j \in \mathcal{Y}_k$, the feature extractor extracts the semantic features and computes the corresponding semantic center vector as:

$c_j = \frac{1}{N_j} \sum_{i:\, y_i = j} \psi(x_i) \quad (7)$

where $N_j$ is the number of data points in class $j$. When a test signal $x$ appears and $z = \psi(x)$ is obtained, the difference between the vector $z$ and $c_j$ can be measured for each $j$. Specifically, the generalized distance between $z$ and $c_j$ is used, which is defined as follows:

$d(z, c_j) = \sqrt{(z - c_j)^\top M_j^{-1} (z - c_j)} \quad (8)$
where $M_j$ is the transformation matrix associated with class $j$ and $M_j^{-1}$ denotes the inverse of matrix $M_j$. When $M_j$ is the covariance matrix $\Sigma_j$ of the semantic features of signals of class $j$, $d$ is called the Mahalanobis distance. When $M_j$ is the identity matrix¹, $d$ reduces to the Euclidean distance. $M_j$ can also be $\Lambda_j$ or $\Lambda_j / n$, where $\Lambda_j$ is a diagonal matrix formed by taking the diagonal elements of $\Sigma_j$ and $n$ is the dimension of $z$. The corresponding distances based on $\Lambda_j$ and $\Lambda_j / n$ are called the second distance and the third distance, respectively. Note that when the Mahalanobis distance, second distance or third distance is applied, the covariance matrix of each known class needs to be computed in advance.

¹This is also the only possible choice in the case when the covariance matrix is not available, which happens for example when the signal set of some class is a singleton.

With the above distance metric, we can establish our discriminant model, which is divided into two steps. First, distinguish between known and unknown classes. Second, discriminate which known class or unknown class the test signal belongs to. The first step is done by comparing a threshold $\theta$ with the minimal distance given by
$d_{\min} = \min_{c_j \in C_k} d(z, c_j) \quad (9)$

where $C_k$ is the set of known semantic center vectors. Let us denote by $\hat{y}$ the prediction of $x$. If $d_{\min} \le \theta$, then $\hat{y} \in \mathcal{Y}_k$; otherwise $\hat{y} \in \mathcal{Y}_u$. Owing to utilizing the center loss in training, the semantic features of signals of class $j$ are assumed to obey an $n$-dimensional Gaussian distribution. Thus, $\theta$ can be set according to the three-sigma rule [19], i.e.,

$\theta = 3\lambda\sigma \quad (10)$

where $\sigma$ characterizes the standard deviation of the semantic features of the class under consideration and $\lambda$ is a control parameter. We also refer to $\lambda$ as the discrimination coefficient.
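A minimal numpy sketch of the distance metrics and the known/unknown test might look as follows. The data are synthetic, and the exact scalar form of the three-sigma threshold in Eq. (10) is our assumption here:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy semantic features for one known class: 200 points around a center.
Z = rng.normal(loc=[2.0, -1.0], scale=[0.5, 0.2], size=(200, 2))
c = Z.mean(axis=0)                   # semantic center (Eq. (7))
Sigma = np.cov(Z, rowvar=False)      # class covariance

def generalized_distance(z, c, M):
    """Eq. (8): sqrt((z - c)^T M^{-1} (z - c))."""
    d = z - c
    return float(np.sqrt(d @ np.linalg.inv(M) @ d))

# The four choices of M discussed in the text:
M_maha = Sigma                        # Mahalanobis distance
M_eucl = np.eye(2)                    # Euclidean distance
M_diag = np.diag(np.diag(Sigma))      # "second distance"
M_diag_n = M_diag / 2                 # "third distance" (n = 2 here)

z_test = np.array([2.3, -0.9])
for M in (M_maha, M_eucl, M_diag, M_diag_n):
    print(generalized_distance(z_test, c, M))

# Known/unknown decision (first step of the discriminator), with a hedged
# three-sigma-style threshold; the paper's exact Eq. (10) may differ.
lam = 0.4
theta = 3 * lam * np.sqrt(np.trace(Sigma))
d_min = generalized_distance(z_test, c, M_maha)
known = d_min <= theta
print("known" if known else "unknown")
```

With several known classes, `d_min` would be the minimum of Eq. (8) over all known centers, as in Eq. (9).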
The second step is more complicated. If $x$ belongs to the known classes, its label can be easily obtained via

$\hat{y} = \arg\min_{j \in \mathcal{Y}_k} d(z, c_j) \quad (11)$
Obviously, the main difficulty lies in dealing with the case where $x$ is classified as unknown in the first step. To illustrate, let us denote by $\mathcal{Y}'_u$ the set of recorded unknown classes and define $C_u$ to be the set of the semantic center vectors of the classes in $\mathcal{Y}'_u$. In this difficult case, if $C_u = \emptyset$, a new signal label is added to $\mathcal{Y}'_u$ with $z$ as its semantic center vector, and the unknown signal is saved in the set associated with this new label. If instead $C_u \neq \emptyset$, the threshold $\theta_u$ is compared to the minimal distance

$d_{\min}^{u} = \min_{c_j \in C_u} d(z, c_j) \quad (12)$

Here, the threshold $\theta_u$ is set as

$\theta_u = \gamma\, d_{\min} + (1 - \gamma)\, d_{\mathrm{med}} \quad (13)$

where $d_{\mathrm{med}}$ is the median distance between $z$ and the known centers in $C_k$, and $\gamma$ is used to balance the two distances. The above formula follows the intuition that $\theta_u$ is closely related to $d_{\min}$ and $d_{\mathrm{med}}$. To proceed, let $K$ denote the number of recorded signal labels in $\mathcal{Y}'_u$. Then, if $d_{\min}^{u} > \theta_u$, a new ($K{+}1$)th signal label is added to $\mathcal{Y}'_u$ with center $z$. Otherwise, we set

$\hat{y} = \arg\min_{j \in \mathcal{Y}'_u} d(z, c_j) \quad (14)$

and save the signal in the set of class $\hat{y}$. Accordingly, $c_{\hat{y}}$ is updated via

$c_{\hat{y}} \leftarrow \frac{N_{\hat{y}}\, c_{\hat{y}} + z}{N_{\hat{y}} + 1} \quad (15)$

where $N_{\hat{y}}$ denotes the number of signals in the set of class $\hat{y}$. As a result, as the number of predictions for unknown signals grows, the model gradually improves itself by refining the centers.
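The center refinement of Eq. (15) is just a running mean; a tiny sketch (values are made up):

```python
import numpy as np

def update_center(c, n, z):
    """Running-mean refinement of an unknown-class center (cf. Eq. (15)):
    the center of a class holding n signals absorbs a new feature z."""
    return (n * c + z) / (n + 1), n + 1

c = np.array([1.0, 1.0])   # current center of a recorded unknown class
n = 4                      # number of signals already assigned to it
z = np.array([2.0, 0.0])   # newly assigned semantic feature

c, n = update_center(c, n, z)
print(c, n)  # [1.2 0.8] 5
```

Each absorbed sample nudges the center toward the true class mean, which is why the discriminator improves as more unknown predictions accumulate.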
To summarize, we present the whole procedure of the discriminator in Algorithm 2.
5 Experiments and Results
In this section, we demonstrate the effectiveness of the proposed SR2CNN approach by conducting extensive experiments with the dataset 2016.10A, as well as its two counterparts, 2016.10B and 2016.04C [15]. The data description is presented in Table I. All types of modulations are numbered with class labels from left to right.
Sieve samples. Samples with SNR less than 16 are first filtered out, leaving only a purer, higher-quality portion (one tenth of the original) to serve as the overall dataset in our experiments.
Choose unknown classes.
Empirically, a class whose features are hard to learn is an arduous challenge for a standard supervised learning model, let alone when it plays an unknown role in our ZSL scenario. Hence, a completely supervised learning stage is necessarily carried out beforehand to help us nominate suitable unknown classes. As shown in Table V, the ultimate candidates are AMSSB(3) and GFSK(6) for 2016.10A and 2016.04C, and CPFSK(5) and GFSK(6) for 2016.10B.

Split training and test data. 80% of the samples from the known classes make up the overall training set, while the remaining 20% make up the known test set. For the unknown classes, only a test set is needed, which consists of 20% of the unknown samples.
Through these three preprocessing steps, we obtain a reduced copy of each dataset (e.g., 2016.10A), containing a training set, a known test set and an unknown test set.
All of the networks in SR2CNN are computed on a single GTX Titan X graphics processor, implemented in Python, and trained using the Adam optimizer. Generally, we allow our model to learn and update itself for at most 250 epochs. However, interestingly, we find in our experiments that the best performance is always achieved at exactly 150 epochs.
5.1 Intraining Views
Basically, the average softmax accuracy of the known test set converges on both 2016.10A and 2016.10B, as well as on 2016.04C, as indicated in Fig. 4. Note that there is almost no perceptible loss in accuracy when using the clustering approach (i.e., the distance-measure-based classification method described in Section 4) for prediction instead of softmax, meaning that the semantic feature space established by our SR2CNN functions very well. For ease of exposition, we will refer to the known cluster accuracy as the upper bound (UB).
It can be inferred that the cross-entropy loss remains the decisive factor that affects accuracy the most, as the curves of these two indicators in Fig. 4 roughly imply a negative correlation on the whole. During the training course, the cross-entropy loss undergoes sharp and violent oscillations. This phenomenon makes sense, since the extra center loss and autoencoder loss will intermittently shift the learning focus of SR2CNN.
5.2 Critical Results
The most critical results are presented in Table V. To better illustrate them, we first make a few definitions in analogy to the binary classification problem. By superseding the binary conditions positive and negative with known and unknown respectively, we can similarly elicit true known (TK), true unknown (TU), false known (FK) and false unknown (FU). Subsequently, we get two important indicators: the true known rate TKR = TK / (TK + FU) and the true unknown rate TUR = TU / (TU + FK).
Furthermore, we define the known precision (KP) and unknown precision (UP) likewise, where KP is based on the total number of known samples that are classified to their exact known classes correctly, while UP is based on the total number of unknown samples that are classified to their exact newly-identified unknown classes correctly. Note that sometimes, unexpectedly, our SR2CNN may classify a small portion of signals into different unknown classes although their real labels are actually identical and correspond to one certain unknown class (we name these identified classes isotopic classes). In this rare case, we only count the identified unknown class with the highest accuracy when calculating UP.
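Under our reading of the analogy above, the discrimination indicators can be computed as follows (toy labels; the 0.4/0.6 weighting of the true rates follows the requirement stated later in this section):

```python
import numpy as np

def discrimination_indicators(is_known_true, is_known_pred):
    """True known/unknown rates from the known-vs-unknown confusion counts,
    by analogy with the binary confusion matrix (known = positive)."""
    tk = np.sum(is_known_true & is_known_pred)    # true known
    tu = np.sum(~is_known_true & ~is_known_pred)  # true unknown
    fk = np.sum(~is_known_true & is_known_pred)   # false known
    fu = np.sum(is_known_true & ~is_known_pred)   # false unknown
    tkr = tk / (tk + fu)   # true known rate
    tur = tu / (tu + fk)   # true unknown rate
    return tkr, tur

# Toy run: 10 test samples, the first 6 truly known.
truth = np.array([1, 1, 1, 1, 1, 1, 0, 0, 0, 0], dtype=bool)
pred  = np.array([1, 1, 1, 1, 1, 0, 0, 0, 0, 1], dtype=bool)
tkr, tur = discrimination_indicators(truth, pred)
wtr = 0.4 * tkr + 0.6 * tur   # weighted true rate used for model selection
print(tkr, tur, wtr)
```

This separates the discrimination quality (known vs. unknown) from the subsequent per-class classification quality measured by KP and UP.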
For ZSL, we test our SR2CNN with several different combinations of the aforementioned loss weights and the discrimination coefficient $\lambda$, hoping to obtain a satisfying result out of multiple trials. Fixing the loss weights to 1 simply leads to fair performance, though still, we adjust $\lambda$ in a range between 0.05 and 1.0. Here, the predefined indicators above play an indispensable part in helping us sift the results. Generally, a well-chosen result is supposed to meet the following requirements: 1. the weighted true rate (WTR), 0.4·TKR + 0.6·TUR, is as great as possible; 2. KP is close to UB, where UB is the upper bound defined as the known cluster accuracy; 3. the number of isotopic classes corresponding to any unknown class is at most 2.
To better make a transverse comparison, we compute two extra indicators: the average total accuracy in the ZSL scenario, and the average known accuracy in completely supervised learning, shown in italics in Table V.
On the whole, the results are promising. However, we have to admit that ZSL incurs some performance loss as compared with the fully supervised model, reflected especially in the class AMDSB among all modulations, and in dataset 2016.10B compared with the other two datasets. After all, when losing sight of the two unknown classes, SR2CNN can only acquire a segment of the intact knowledge that would be totally learned in a supervised case. It is this imperfection that presumably leads to a fluctuation (better or worse) in each class's accuracy when compared with supervised learning. Among these classes, the poorest victim is always AMDSB, with a considerable portion of its samples rejected as unknown. Besides, the features, especially those of the unknown classes, are not at the same difficulty level of learning across the three datasets. Some unknown features may even be akin to known ones, which can consequently cause confusion in the discrimination tasks. There is no doubt that these uncertainties and differences in the feature domain matter a lot. Take 2016.10B: compared with its two counterparts, it suffers the greatest loss (more than 10%) in average accuracy (both total and known), and also a pair of inferior true rates. Moreover, it is indeed the single case where both unknown classes are each identified as two isotopic classes.
It is obvious that average accuracy strongly depends on the weighted true rate (WTR): the clearer the discrimination between known and unknown, the more accurate the further classification and identification. Therefore, to better study this discrimination ability, we depict Fig. 5 to elucidate its variation trends with respect to the discrimination coefficient $\lambda$. At the same time, we introduce a new concept, the discrimination interval, defined as an interval of $\lambda$ in which the weighted true rate is always greater than 80%. The width of this interval is used to quantify the discrimination ability.
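The discrimination interval can be computed directly from the true-rate curves; here is a sketch with synthetic monotone curves standing in for the measured ones in Fig. 5 (the 80% level follows the definition above):

```python
import numpy as np

def discrimination_interval(lams, tkr, tur, level=0.8):
    """Endpoints and width of the lambda-interval where the weighted true rate
    WTR = 0.4*TKR + 0.6*TUR stays above `level`."""
    wtr = 0.4 * tkr + 0.6 * tur
    ok = lams[wtr > level]
    if ok.size == 0:
        return None, None, 0.0
    return ok.min(), ok.max(), ok.max() - ok.min()

# Synthetic curves: TKR rises with lambda, TUR falls with lambda, as in Fig. 5.
lams = np.linspace(0.05, 1.0, 96)
tkr = 1 / (1 + np.exp(-12 * (lams - 0.2)))
tur = 1 / (1 + np.exp(12 * (lams - 0.6)))

lo, hi, width = discrimination_interval(lams, tkr, tur)
print(lo, hi, width)
```

A wider interval means the known/unknown decision tolerates a larger range of thresholds, i.e., the two populations are better separated in the semantic space.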
Apparently, the curves for the two primary kinds of true rate are monotonic: increasing for the known and decreasing for the unknown. The maximum points of the weighted true rate curves for the three datasets are at about 0.4, 0.2 and 0.4, respectively, which exactly correspond to the results shown in Table V. Besides, the width of the discrimination interval of 2016.10B is only approximately one third of those of 2016.10A and 2016.04C. This implies that the features of 2016.10B are more difficult to learn, which accounts for its relatively poor performance.
Fig. 6 indicates that the usage of the center loss on 2016.10A indeed helps our model to discriminate more distinctly, resulting in a notably broader discrimination interval. Still, there is a thought-provoking point regarding 2016.10B that is worth discussing. Referring back to Fig. 4, we can clearly see that the center loss curve for 2016.10B is the smoothest and converges to the lowest value. However, the performance of the discrimination task on 2016.10B is actually the poorest, as if 2016.10B had never benefited from the center loss. Thus, regarding this irregular phenomenon, we conjecture that it is still the features of the data that dominantly determine the discrimination performance, while the center loss only works secondarily, helping to cluster samples of known classes.
5.3 Other Extensions
We tentatively change several unknown classes on 2016.10A, seeking to probe deeper into the feature domain of the data. As shown in Table III, both the known precision (KP) and the unknown precision (UP) are insensitive to the change of unknown classes, indicating that the classification ability of SR2CNN is consistent and well preserved for the considered dataset. Nevertheless, the unknown class CPFSK is clearly always the hardest obstacle in the course of discrimination, since its accuracy is always the lowest and some isotopic classes are observed in this case. When the classes CPFSK and GFSK simultaneously play the unknown roles, the performance loss (on both TKR and TUR) is quite striking. We attribute this phenomenon to the resemblances among the classes in the feature domain. Specifically, the unknown CPFSK and GFSK may share a considerable number of similarities with their known counterparts, which misleads SR2CNN in the further discrimination task.
To justify SR2CNN's superiority, we compare it with a couple of traditional methods prevailing in the field of outlier detection. The results are presented in Table IV. Concretely, when exploiting these methods, a sample that is said to be an outlier for every known class is regarded as an unknown sample. Note that no unknown-class identification tasks are launched; only discrimination tasks are considered. Hence, for a certain unknown class, we compute its unknown rate, instead of accuracy, as the ratio of the number of its samples discriminated as unknown to its total number of samples. The aforementioned requirement 1 (the weighted true rate WTR = 0.4·TKR + 0.6·TUR is as great as possible) is employed to help choose several standard results. As expected, SR2CNN stands out unquestionably, while the other traditional methods all suffer a destructive performance loss and fail to discriminate normally. Only Elliptic Envelope comes somewhat close: its true unknown rate can indeed exceed 90%, though at the cost of badly losing its true known rate.

6 Dataset SIGNAL202002
We synthesize a new dataset, named SIGNAL202002, which we hope will be of use for further research in the signal recognition field. Basically, the dataset consists of 11 modulation types: BPSK, QPSK, 8PSK, 16QAM, 64QAM, PAM4, GFSK, CPFSK, BFM, AMDSB and AMSSB. Each type is composed of 20000 frames. Data is modulated at a rate of 8 samples per symbol, with 128 samples per frame. The channel impairments are modeled by a combination of additive white Gaussian noise, Rayleigh fading, multipath channel and clock offset. We pass each frame of our synthetic signals independently through the above channel model, seeking to emulate the real-world case, which must consider translation, dilation, impulsive noise, etc. The configuration is set as follows:
20000 samples per modulation type
feature dimension
20 different SNRs: even values in [2 dB, 40 dB]
The complete dataset is stored as a Python pickle file of about 450 MB in complex 32-bit floating point type. The code for the generation process is implemented in MATLAB.
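For reference, the stated configuration can be collected into a single config fragment (values taken from the text above; anything not stated there is omitted):

```python
# SIGNAL202002 configuration, as described in Section 6.
SIGNAL202002 = {
    "modulations": ["BPSK", "QPSK", "8PSK", "16QAM", "64QAM", "PAM4",
                    "GFSK", "CPFSK", "BFM", "AMDSB", "AMSSB"],
    "frames_per_modulation": 20000,
    "samples_per_symbol": 8,
    "samples_per_frame": 128,
    "snrs_db": list(range(2, 41, 2)),   # 20 even values in [2 dB, 40 dB]
}

assert len(SIGNAL202002["modulations"]) == 11
assert len(SIGNAL202002["snrs_db"]) == 20
print(SIGNAL202002["snrs_db"][0], SIGNAL202002["snrs_db"][-1])  # 2 40
```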
We conduct zero-shot learning experiments on our newly generated dataset and report the results here. As mentioned above, a supervised learning trial is similarly carried out first, to give us an overview of the regular performance for each class of SIGNAL202002. Unfortunately, as Table V shows, the original two candidates for 2016.10A, AMSSB and GFSK, both fail to stay on top. Therefore, here we relocate the unknown roles to two other modulations: CPFSK, which has the highest accuracy overall, and BFM, which stands out among the three analog modulation types (BFM, AMSSB and AMDSB).
According to Table V, an apparent loss in discrimination ability is observed, as both the TKR and the TUR only slightly exceed 80%. However, our SR2CNN still maintains its classification ability, as the accuracy for each class remains encouraging compared with the completely supervised model. The most interesting fact is that the known precision (KP) is remarkably high, exceeding the KPs on 2016.10A by almost 10%, as shown in Table III. To account for this, we speculate that the absence of two unknown classes may allow SR2CNN to better focus on the features of the known ones, which consequently leads to superior performance in the known classification task.
7 Future Directions
It is worth mentioning that there is still room for our SR2CNN to improve and mature. For example, an obvious limitation is that the randomness in the arrival order of the unknown test samples (each time, we set a random seed to shuffle the test data) may sometimes greatly derail SR2CNN during the unknown classification task. To be clearer, consider that the first sample discriminated as unknown is actually an anomaly of its corresponding unknown class (namely, it does not represent the typical features of its class). In this case, SR2CNN is completely unaware of this abnormality and will still routinely record this improper sample as a newly identified semantic center, which can inevitably mess up the classification of subsequent test samples.
Therefore, in further research, our focus falls on handling the uncertainty of the unknown samples, as demonstrated above. Hopefully, in the near future, we can devise an approach to strengthen SR2CNN so that it becomes more robust and can ultimately be widely applied to ZSL in the signal recognition field.
8 Conclusion
In this paper, we have proposed a ZSL framework, SR2CNN, which can successfully extract precise semantic features of signals and discriminate both known and unknown classes. SR2CNN works very well in situations where sufficient training data is not available for certain classes. Moreover, SR2CNN can generally improve itself by updating its semantic center vectors. Extensive experiments demonstrate the effectiveness of SR2CNN. In addition, we provide a new signal dataset SIGNAL202002, including eight digital and three analog modulation classes, for further research.
Appendix A Convolution and Deconvolution Operation
Let $x$ and $y$ denote the vectorized input and output matrices, respectively. Then the convolution operation can be expressed as

$y = Cx \quad (16)$

where $C$ denotes the convolutional matrix, which is sparse. With back propagation of the convolution, $\partial L / \partial y$ is obtained, thus

$\frac{\partial L}{\partial x_j} = \sum_i \frac{\partial L}{\partial y_i} \frac{\partial y_i}{\partial x_j} = \sum_i \frac{\partial L}{\partial y_i}\, C_{i,j} = C_{*,j}^\top \frac{\partial L}{\partial y} \quad (17)$

where $x_j$ denotes the $j$th element of $x$, $y_i$ denotes the $i$th element of $y$, $C_{i,j}$ denotes the element in the $i$th row and $j$th column of $C$, and $C_{*,j}$ denotes the $j$th column of $C$. Hence,

$\frac{\partial L}{\partial x} = C^\top \frac{\partial L}{\partial y} \quad (18)$
Similarly, the deconvolution operation can be written as

$y = \tilde{C}^\top x \quad (19)$

where $\tilde{C}$ denotes a convolutional matrix that has the same shape as $C$ and needs to be learned. Then the back propagation of the deconvolution can be formulated as follows:

$\frac{\partial L}{\partial x} = \tilde{C}\, \frac{\partial L}{\partial y} \quad (20)$
For example, suppose the sizes of the input and output matrices are $4 \times 4$ and $2 \times 2$, as shown in Fig. 3(c). Then $x$ is a 16-dimensional vector and $y$ is a 4-dimensional vector. Define the convolutional kernel as

$K = \begin{pmatrix} k_{1,1} & k_{1,2} & k_{1,3} \\ k_{2,1} & k_{2,2} & k_{2,3} \\ k_{3,1} & k_{3,2} & k_{3,3} \end{pmatrix} \quad (21)$

It is not hard to see that $C$ is a $4 \times 16$ matrix, which can be represented as follows:

$C = \begin{pmatrix}
k_{1,1} & k_{1,2} & k_{1,3} & 0 & k_{2,1} & k_{2,2} & k_{2,3} & 0 & k_{3,1} & k_{3,2} & k_{3,3} & 0 & 0 & 0 & 0 & 0 \\
0 & k_{1,1} & k_{1,2} & k_{1,3} & 0 & k_{2,1} & k_{2,2} & k_{2,3} & 0 & k_{3,1} & k_{3,2} & k_{3,3} & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & k_{1,1} & k_{1,2} & k_{1,3} & 0 & k_{2,1} & k_{2,2} & k_{2,3} & 0 & k_{3,1} & k_{3,2} & k_{3,3} & 0 \\
0 & 0 & 0 & 0 & 0 & k_{1,1} & k_{1,2} & k_{1,3} & 0 & k_{2,1} & k_{2,2} & k_{2,3} & 0 & k_{3,1} & k_{3,2} & k_{3,3}
\end{pmatrix} \quad (22)$

Hence, deconvolution is expressed as left-multiplying by $\tilde{C}^\top$ in forward propagation, and left-multiplying by $\tilde{C}$ in back propagation.
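The construction of $C$ and the claim that convolution is left-multiplication by $C$ (and deconvolution by the transpose) can be checked numerically; a sketch for the 4×4 → 2×2 example (stride 1, no padding):

```python
import numpy as np

def conv_matrix(kernel, in_h, in_w):
    """Build the sparse matrix C such that vec(conv2d(X, K)) = C @ vec(X),
    for stride 1 and no padding (the Appendix-A construction)."""
    kh, kw = kernel.shape
    out_h, out_w = in_h - kh + 1, in_w - kw + 1
    C = np.zeros((out_h * out_w, in_h * in_w))
    for oi in range(out_h):
        for oj in range(out_w):
            row = oi * out_w + oj
            for ki in range(kh):
                for kj in range(kw):
                    C[row, (oi + ki) * in_w + (oj + kj)] = kernel[ki, kj]
    return C

rng = np.random.default_rng(0)
K = rng.normal(size=(3, 3))          # 3x3 kernel, as in Eq. (21)
X = rng.normal(size=(4, 4))          # 4x4 input -> 2x2 output, so C is 4x16

C = conv_matrix(K, 4, 4)
y = C @ X.ravel()                    # convolution as left-multiplication by C

# Direct 2D cross-correlation for comparison.
y_ref = np.array([[np.sum(X[i:i+3, j:j+3] * K) for j in range(2)] for i in range(2)])
print(np.allclose(y, y_ref.ravel()))   # True

# Deconvolution forward pass: left-multiply by the transpose (4-dim -> 16-dim).
x_up = C.T @ y
print(C.shape, x_up.shape)             # (4, 16) (16,)
```

The same matrix also verifies Eq. (18): back-propagating a gradient through the convolution is just `C.T @ grad_y`.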
Acknowledgments
The authors would like to thank…
References

[1] (2016) Towards open set deep networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1563–1572.
[2] (2000) LOF: identifying density-based local outliers. In Proceedings of the 2000 ACM SIGMOD International Conference on Management of Data, pp. 93–104.
[3] (2001) Incremental and decremental support vector machine learning. In Advances in Neural Information Processing Systems, pp. 409–415.
[4] (2016) Variational lossy autoencoder. arXiv preprint arXiv:1611.02731.
[5] (2001) One-class SVM for learning in image retrieval. In Proceedings 2001 International Conference on Image Processing, Vol. 1, pp. 34–37.
[6] (2008) Closed sets for labeled data. Journal of Machine Learning Research 9 (Apr), pp. 559–580.
[7] (2017) Signal detection effects on deep neural networks utilizing raw IQ for modulation classification. In MILCOM 2017 — 2017 IEEE Military Communications Conference (MILCOM), pp. 121–127.
[8] (2016) Deep residual learning for image recognition. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 770–778.
[9] (2017) beta-VAE: learning basic visual concepts with a constrained variational framework. ICLR 2 (5), pp. 6.
[10] (2008) Isolation forest. In 2008 Eighth IEEE International Conference on Data Mining, pp. 413–422.
[11] (2018) Deep learning based on batch normalization for P300 signal detection. Neurocomputing 275, pp. 288–297.
[12] (2016) Deep learning algorithms for signal recognition in long perimeter monitoring distributed fiber optic sensors. In 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP), pp. 1–6.
[13] (2010) Evaluation of SVM kernels and conventional machine learning algorithms for speaker identification. International Journal of Hybrid Information Technology 3 (3), pp. 23–34.
[14] (2011) Sparse autoencoder. CS294A Lecture Notes 72 (2011), pp. 1–19.
[15] (2016) Convolutional radio modulation recognition networks. In International Conference on Engineering Applications of Neural Networks, pp. 213–226.
[16] (2016) Unsupervised representation learning of structured radio communication signals. In 2016 First International Workshop on Sensing, Processing and Learning for Intelligent Machines (SPLINE), pp. 1–5.
[17] (2018) Over-the-air deep learning based radio signal classification. IEEE Journal of Selected Topics in Signal Processing 12 (1), pp. 168–179.
[18] (2009) Zero-shot learning with semantic output codes. In Advances in Neural Information Processing Systems, pp. 1410–1418.
[19] (1994) The three sigma rule. The American Statistician 48 (2), pp. 88–91.
[20] (2008) Incremental learning for robust visual tracking. International Journal of Computer Vision 77 (1–3), pp. 125–141.
[21] (1999) A fast algorithm for the minimum covariance determinant estimator. Technometrics 41 (3), pp. 212–223.
[22] (2012) Toward open set recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence 35 (7), pp. 1757–1772.
[23] (2013) Zero-shot learning through cross-modal transfer. In Advances in Neural Information Processing Systems, pp. 935–943.
[24] (2016) A discriminative feature learning approach for deep face recognition. In European Conference on Computer Vision, pp. 499–515.