Adversarial Deep Learning for Over-the-Air Spectrum Poisoning Attacks

11/01/2019 ∙ by Yalin E. Sagduyu, et al.

An adversarial deep learning approach is presented to launch over-the-air spectrum poisoning attacks. A transmitter applies deep learning on its spectrum sensing results to predict idle time slots for data transmission. In the meantime, an adversary learns the transmitter's behavior (exploratory attack) by building another deep neural network to predict when transmissions will succeed. The adversary falsifies (poisons) the transmitter's spectrum sensing data over the air by transmitting during the short spectrum sensing period of the transmitter. Depending on whether the transmitter uses the sensing results as test data to make transmit decisions or as training data to retrain its deep neural network, either it is fooled into making incorrect decisions (evasion attack), or the transmitter's algorithm is retrained incorrectly for future decisions (causative attack). Both attacks are energy efficient and hard to detect (stealth) compared to jamming the long data transmission period, and substantially reduce the throughput. A dynamic defense is designed for the transmitter that deliberately makes a small number of incorrect transmissions (selected by the confidence score on channel classification) to manipulate the adversary's training data. This defense effectively fools the adversary (if any) and helps the transmitter sustain its throughput with or without an adversary present.


1 Introduction

Machine learning provides wireless communications with automated means to learn from and adapt to a dynamic spectrum environment that includes a variety of topology, channel, traffic, and interference effects [1, 2, 3]. Examples of machine learning applications in wireless communications include spectrum sensing [4], channel estimation [5], spectrum access [6], power control [7], signal classification [8], and augmentation [9].

Wireless communication is vulnerable to different types of attacks such as jamming [10] and eavesdropping [11] due to its open broadcast nature. Dynamic spectrum access (DSA) is especially sensitive to attacks as it involves various tunable parameters that can be manipulated by adversaries [12]. One example is the primary user emulation (PUE) attack, where an adversary pretends to be a primary user and aims to decrease the spectrum access opportunities of cognitive radios [13]. In a collaborative sensing environment, another example is the spectrum sensing data falsification (SSDF) attack that targets the spectrum sensing operation by falsifying spectrum sensing reports [14].

As machine learning starts finding more applications in wireless communications, the safe use of machine learning algorithms is emerging as a major security concern. In particular, machine learning itself may become the target of the adversary. Such security issues have been studied in other data domains (e.g., computer vision) in the emerging field of adversarial machine learning. Examples include exploratory (inference) attacks to infer how a machine learning algorithm operates (e.g., learn a classifier’s decision boundaries) [15], evasion attacks to fool a machine learning algorithm into making wrong decisions (e.g., fool a trained filter into misclassifying spam emails) [16, 17], and causative attacks to provide incorrect information (e.g., training data in supervised learning) to a machine learning algorithm [18].

When adversarial machine learning is applied to wireless communications, the objective is no longer to attack wireless communications directly but to manipulate the underlying cognitive engine based on machine learning algorithms. One important difference from other data domains (e.g., computer vision classifier APIs) is that the adversary and the target in wireless communications observe different features (due to different channel and interference effects) and use different classification labels (as they perceive different events). In [6, 7], an adversary builds a deep neural network to mimic how the DSA algorithm works, runs this surrogate classifier to identify likely successful transmission opportunities, and then jams the data transmissions. While this attack decreases the throughput, it incurs major energy consumption and leaves a large footprint for easy detection.

In this paper, we develop a new type of wireless attack based on adversarial machine learning, namely the over-the-air spectrum poisoning attack that targets the sensing period of a transmitter under attack. This is a stealth attack that is energy-efficient and does not leave a large footprint compared with previous attacks. Unlike traditional denial of service attacks [19], where the adversary transmits to jam data transmissions, the adversary aims to manipulate the spectrum sensing data by jamming the spectrum sensing period so that the target transmitter makes wrong decisions by using the unreliable spectrum sensing results. This attack also differs from the SSDF attack, since the adversary does not participate in cooperative spectrum sensing and does not try to change the estimated channel labels directly as in the SSDF attack. Instead, the adversary injects adversarial perturbations over the air (in terms of jamming the spectrum sensing period) to the channel in order to fool the transmitter into making wrong transmit decisions or make the transmitter’s re-training process fail. To counteract such attacks, we develop a defense mechanism that uses the classification outputs of the transmitter’s deep neural network to add controlled errors into channel access decisions of the transmitter and consequently mislead the adversary.

We consider a canonical wireless communication scenario with a transmitter, its corresponding receiver, an adversary, and some other background traffic. We apply different channel models, including Gaussian, Rayleigh, Rician, and log-normal channels. Note that the proposed attacks are independent of the network topology and can be directly applied to other network topologies. We also show results for multiple background traffic sources; in this case, the aggregate traffic is observed through spectrum sensing, and the aggregated signal is input to the transmitter’s and adversary’s algorithms. If there are multiple adversaries, each of them can individually apply the proposed algorithms (while treating interference from the others the same way as background traffic). The transmitter builds a machine learning model (based on a deep neural network) to predict the busy and idle states of the channel. The adversary applies adversarial deep learning to launch various attacks, including an exploratory attack, an evasion attack in the test phase, a causative attack in the training phase, and their combinations. As a defense strategy, the transmitter launches an attack back on the cognitive engine of the adversary and aims to degrade the adversary’s inference stage.

The main contributions of this paper are on stealth and energy-efficient attacks on wireless communications built upon adversarial machine learning and a corresponding defense scheme. We present novel techniques for

  • exploratory (inference) attacks by observing the spectrum and feedback on transmission outcomes in communications (see Section 1.1),

  • evasion attacks on spectrum sensing of wireless communications in test phase (see Section 1.2),

  • causative attacks on spectrum sensing of wireless communications in training phase (see Section 1.3), and

  • defense scheme against all these attacks (see Section 1.4).

1.1 Exploratory (Inference) Attack

The adversary applies adversarial deep learning to launch an exploratory attack. For that purpose, the adversary trains a deep neural network. Subsequently, it intentionally changes the transmitter’s sensing results by transmitting when it predicts that there will be a successful transmission if there were no attacks.

The training data of the transmitter consists of time-series of spectrum sensing results as features and channel idle/busy status based on the ground truth (the background transmitter’s on/off state) as labels. Using this training data, the transmitter builds a deep neural network to make transmit decisions. If a transmission is successful (i.e., the signal-to-interference-plus-noise ratio (SINR) exceeds a threshold), the receiver sends an acknowledgement (ACK) to the transmitter and the adversary can detect the presence of this ACK (without decoding its content).
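As a concrete illustration of this data layout, the sketch below builds such training samples from a time series of sensing results; the window length `m` and the array layout are illustrative assumptions, not values stated in the paper.

```python
import numpy as np

def build_training_data(sensed_power, channel_busy, m=10):
    """Build training samples: the m most recent sensing results (ending at
    time t) form the feature vector, and the ground-truth idle/busy status
    at time t is the label. m = 10 is an illustrative choice."""
    features, labels = [], []
    for t in range(m - 1, len(sensed_power)):
        features.append(sensed_power[t - m + 1:t + 1])  # sensing time series
        labels.append(channel_busy[t])                  # ground-truth status
    return np.array(features), np.array(labels)
```

Each feature vector overlaps the previous one by m − 1 samples, which is what lets the classifier exploit the temporal correlation of the channel states.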

The adversary first determines the time slot structure used by the transmitter and then performs an exploratory attack to build a classifier that can predict the outcome of transmissions, i.e., whether there will be an ACK if there is no attack. The adversary uses time-series of its own spectrum sensing results as features and presence/absence of ACKs as labels in its training and test data. Note that this is not a standard exploratory attack and the classifier built by the adversary will not be the same as (or similar to) the classifier used by the transmitter, due to the following two differences.

  • The transmitter and the adversary are in different locations and thus their sensing results will vary based on the channel environment and differ from each other. As a result, the input data (features) to their classifiers will differ.

  • The adversary predicts the outcome of the transmissions (‘ACK’ or ‘no ACK’) while the transmitter predicts channel status (‘idle’ or ‘busy’). As a result, the output data of their classifiers will differ.

Once the adversary develops its deep neural network model as part of an exploratory attack, it uses this classifier to perform either an evasion attack in the test phase or causative attack in the training phase.

1.2 Evasion Attack in Test Phase

After building its classifier, the adversary predicts when the transmitter will have a successful transmission (if there was no attack) and performs the evasion attack in test phase, i.e., the adversary transmits to change the channel status in order to poison (i.e., falsify) the transmitter’s input (spectrum sensing data) to the machine learning algorithm. The attack considered in this paper is similar to that in [6, 7], where the adversary also first learns the transmitter’s behavior (ACK or not) by an exploratory attack and then performs subsequent attacks. The difference is that in [6, 7], the adversary performs a standard jamming attack during the data transmission period to make a transmission fail while in this paper the adversary performs an evasion attack in the sensing period such that the transmitter is provided with incorrect input data (manipulated over the air) to its classifier and makes the wrong decision of not transmitting. This attack is harder to detect since it does not directly jam the transmitter’s signal but it changes the input data to the decision mechanism. Moreover, this attack is more energy efficient since the adversary makes a very short transmission in the sensing period.

We show that this adversarial deep learning approach significantly reduces the transmitter’s performance. In particular, for the scenario studied in the numerical results, only a few transmission attempts are made when the evasion attack is launched, and the achieved throughput (normalized by the best throughput of an ideal algorithm that detects every idle channel) drops to a small fraction of its no-attack value. For comparison purposes, we consider the same energy budget (namely, the energy consumption of the spectrum poisoning attack) and study an attack that jams the data transmission period (much longer than the spectrum sensing period). With this small energy budget, an adversary cannot jam the data transmission period of all time slots that would have a successful transmission. Thus, the normalized throughput is reduced much less than under the evasion attack on the spectrum sensing period.

1.3 Causative Attack in Training Phase

For the case where the transmitter collects additional training data and retrains its classifier, the adversary can also apply a causative attack on the training data after determining the start and end of the retraining phase. The transmitter’s classifier is then updated using some incorrect data and thus becomes worse than before retraining. As a result, even if there is no further attack, the transmitter will make incorrect decisions in the future and its performance will drop. This attack is even harder to detect and more energy efficient than the evasion attack, since the adversary’s transmissions are limited to the sensing periods of only the retraining phase. For the scenario studied in the numerical results, the normalized throughput drops sharply while a defender that monitors data transmissions cannot find a jamming signal. A causative attack can be followed by other attacks after the transmitter’s classifier is updated, such as an evasion attack on the spectrum sensing period and a jamming attack on the transmission period. Both can further reduce the transmitter’s throughput; when combined with the causative attack, the evasion attack reduces throughput to a smaller value than jamming does.

The overview of these attacks is shown in Figure 1.

Fig. 1: Overview of adversarial machine learning proposed in this paper.

1.4 Defense Scheme

Since the adversary can significantly reduce the transmitter’s throughput, it is necessary to develop a defense scheme. One approach to protect a neural network against adversarial machine learning is adding randomness to the neural network structure (namely, its weights and biases) [20]. This defense is effective if an adversary can access the output layer of a neural network (labels and scores). However, it cannot be applied in wireless communications, since the adversary collects its training data indirectly by observing the outcome of transmissions (without obtaining a score). Also, small randomness may not change the outcome of transmissions, while large randomness will impact the transmitter’s performance. Instead, we design a defense scheme where the transmitter intentionally makes some incorrect transmit decisions to manipulate the training data of the adversary so that the adversary cannot build a reliable deep learning model. This corresponds to a causative attack by the transmitter on the adversary’s inference attack stage. These incorrect transmit decisions should be made in a carefully selected set of time slots to balance the trade-off between a large impact on the adversary’s classifier and a small loss in the transmitter’s performance due to incorrect transmit decisions. We select these time slots from those with a classification score (provided by the transmitter’s deep learning classifier) that is far away from the decision boundary. We show that this defense mechanism substantially recovers the normalized throughput lost to evasion attacks, and can be effectively applied against the other attacks as well by adapting the level of defense, without knowing whether an adversary is present or not.
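A minimal sketch of this slot-selection rule, assuming the classifier outputs a probability-like score in [0, 1] with decision boundary 0.5; the defense ratio below is an illustrative knob, not a value from the paper.

```python
import numpy as np

def select_defense_slots(scores, defense_ratio=0.1, boundary=0.5):
    """Return indices of the slots whose classification score is farthest
    from the decision boundary; the transmitter deliberately takes the
    wrong transmit action in these slots to poison the adversary's
    training data. defense_ratio (fraction of slots flipped) is an
    illustrative parameter."""
    confidence = np.abs(np.asarray(scores) - boundary)  # distance from boundary
    k = max(1, int(defense_ratio * len(scores)))        # number of slots to flip
    return np.argsort(confidence)[-k:]                  # k most confident slots
```

Flipping decisions only on high-confidence slots keeps the transmitter’s own loss small, since those slots would rarely be misclassified anyway, while still corrupting the ACK labels that the adversary observes.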

1.5 Paper Organization

The rest of the paper is organized as follows. Section 2 reviews related work on wireless attacks and adversarial machine learning. Section 3 describes the system model. Section 4 describes the transmitter’s algorithm and shows the performance without an attack. Section 5 describes the adversary’s algorithm and shows the performance under different attacks. Section 6 presents a defense mechanism and shows how it improves the performance. Section 7 concludes the paper.

2 Related Work

There are various security concerns regarding the safe use of machine learning algorithms. For example, if the input data to a machine learning algorithm is manipulated during the training or operation (test) time, the output will be very different compared to the expected results. These particular security threats are addressed in the emerging field of adversarial machine learning, which studies learning in the presence of adversaries and aims to enable safe adoption of machine learning to the emerging applications.

Attacks under adversarial machine learning are divided into three broad categories, namely exploratory (or inference) attacks, evasion attacks, and causative attacks.

  • In exploratory attacks [21, 22, 15], the adversary aims to understand how the underlying machine learning works for an application (e.g., inferring sensitive and/or proprietary information).

  • In evasion attacks [17, 16], the adversary attempts to fool the machine learning algorithm into making a wrong decision (e.g., fooling a security algorithm into accepting an adversary as legitimate).

  • In causative attacks [18], the adversary provides incorrect information such as training data to machine learning.

These attacks can be launched separately or combined, i.e., causative and evasion attacks can make use of the inference results from an exploratory attack [23]. For wireless applications, the evasion attack was considered in [24, 25, 26, 27] by adding adversarial perturbations to fool receivers into misclassifying signal types (such as modulations). Adversarial distortions were considered in [28] to support anti-jamming by deceiving the jammer’s learning algorithms in a game-theoretic framework. Built upon exploratory attacks, deep learning was studied in [6, 7] to launch jamming attacks on data transmissions. This paper focuses on attacks during spectrum sensing of wireless communications. In [29], we performed a preliminary study of exploratory and evasion attacks on spectrum sensing for wireless communications and corresponding defense strategies.

Apart from adversarial machine learning, there are different types of attacks on spectrum sensing decisions studied in the literature [12, 30, 46]. In a collaborative sensing environment, some users may send falsified reports to each other or to a decision center. This corresponds to a spectrum sensing data falsification (SSDF) attack that aims to degrade the performance of spectrum sensing [31, 32]. The attacks proposed in this paper are different from SSDF attacks, since the adversary does not participate in collaborative spectrum sensing and does not falsify estimated spectrum sensing results; rather, it transmits in the spectrum sensing period to change the inputs to the spectrum classifier over the air. Another type of attack, the primary user emulation (PUE) attack, aims to decrease the spectrum access opportunities of cognitive radios. A defense technique for PUE attacks using belief propagation was studied in [13]. Cognitive radio networks are also susceptible to conventional security threats such as jamming [33], eavesdropping [11], and noncooperation [34]. These threats on wireless communications extend from the physical layer to higher layers, e.g., attacks on routing in the network layer [35] and network flow inference attacks [36]. Wireless security finds rich applications of deep learning. Deep learning has been applied to authenticate signals [37], detect and classify jammers of different types [38, 39, 40], and control communications to mitigate jamming effects [41, 42, 47]. Using wireless sensors, deep learning has also been used to infer private information, in analogy to exploratory attacks [43].

In this paper, we study adversarial machine learning attacks on spectrum sensing under a small energy budget. Following an exploratory attack, we consider an evasion attack in the test phase and a causative attack in the training phase. We also consider combinations of these attacks along with jamming of data transmissions. Moreover, we propose a defense scheme to counteract these new types of attacks.

3 System Model

We consider a communication system that includes a transmitter, a receiver, an adversary, and a background traffic source that may transmit its own data. These nodes operate on a single channel. The network topology used to generate numerical results is shown in Figure 2. As noted in Section 1, the proposed attacks can be applied to other network topologies. We assume that the transmitter and the adversary are cognitive radios that can run the algorithms developed in this paper and can perform spectrum sensing and transmit and receive data and feedback, as specified in the algorithm solutions. We mostly focus on fixed locations in this paper and discuss the impact of mobile nodes in Section 4.

Fig. 2: The network topology.

The transmission pattern of the background traffic source is not known by the transmitter or the adversary a priori, and can be detected via spectrum sensing. Packets arrive at the background traffic source randomly according to a Bernoulli process with a fixed rate (packet/slot). If the background traffic source is not transmitting, it becomes active with a certain probability when its queue is not empty. Once activated, it keeps transmitting until its queue becomes empty. Thus, there may be a continuous period of busy slots whose length depends on the number of packets in the background traffic source’s queue, which is related to the number of previous idle slots. Therefore, channel busy/idle states are correlated over time, and both the transmitter and the adversary need to observe the past channel status over a time period to predict the current channel status.
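The queueing dynamics described above can be sketched as follows; the arrival rate, activation probability, and seed are illustrative stand-ins for the paper’s (unstated) simulation values.

```python
import random

def simulate_channel(num_slots, arrival_rate=0.2, activate_prob=0.5, seed=0):
    """Generate correlated busy/idle channel states: packets arrive at the
    background source as a Bernoulli process; once active, the source
    transmits until its queue empties, producing runs of busy slots."""
    rng = random.Random(seed)
    queue, active, states = 0, False, []
    for _ in range(num_slots):
        queue += 1 if rng.random() < arrival_rate else 0  # Bernoulli arrival
        if not active and queue > 0 and rng.random() < activate_prob:
            active = True                                  # start transmitting
        if active:
            states.append('busy')
            queue -= 1
            if queue == 0:
                active = False                             # queue drained
        else:
            states.append('idle')
    return states
```

Because busy slots come in queue-draining runs, the current state is predictable from recent history, which is exactly what the deep learning classifiers of the transmitter and the adversary exploit.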

Symbol Definition
The adversary
The background traffic source
A node’s classifier
Updated classifier
A classifier’s output on given features
Distance between two nodes
Test data set
Test data set under the transmitter’s defense actions
Training data set
Training data set under the transmitter’s defense actions
Error probability for a classifier
False alarm probability for a classifier
Misdetection probability for a classifier
Features for a time slot
Channel gain between two nodes at a given time
Set of deep learning hyperparameter values
Feasible region for deep learning hyperparameters
The label (ACK or not) at a given time
Function for the deep learning process of a node, given its hyperparameters and training data
Success ratio among all transmissions
Normalized throughput
Transmission ratio
Number of busy time slots in the test data
Number of idle time slots in the test data
Number of false alarms for a classifier
Number of misdetections for a classifier
Number of most recent sensing results in each sample
Noise
Sensed power at a given time
Classification score for a sample
Transmit power
Ratio of the transmitter’s defense actions
The receiver
A sample
The idle/busy status at a given time slot
An inter-arrival time between two ACKs
A transmitter
SNR or SINR
SNR or SINR threshold for a successful transmission
Arrival rate of the background traffic source
Classification threshold
TABLE I: Notation.

Time is divided into slots. Within each slot, an initial short period is allocated for spectrum sensing and a short period at the end is allocated for feedback (i.e., ACK). The rest of a slot is used for data transmission if the channel is detected as idle. The transmitter’s decision is based on a classifier (trained by deep learning) that analyzes sensing results and then determines the time slot status, such that a time slot is busy if background traffic is detected and idle otherwise. The background traffic source is independent of the transmitter’s actions. Each sensing result is either

  • noise (idle time slot) or

  • noise plus the received power from background traffic (busy time slot), where the received power is the product of the channel gain from the background traffic source to the transmitter at that time and the background traffic source’s transmit power.

Data transmission is successful if the SNR or the SINR at the receiver is not less than a threshold, where the received signal power depends on the channel gain from the transmitter to the receiver. We assume Gaussian noise at the receiver and a Gaussian channel gain from the transmitter to the receiver. Results for other channel models, i.e., Rayleigh, Rician, and log-normal channels, are also presented in Section 4. Channel quality changes over time. The mean value of the channel gain is calculated based on the free-space propagation loss model. Note that the algorithms in this paper are not tied to any channel model. The receiver sends an ACK for each successful transmission.
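A sketch of this sensing model under the Gaussian assumptions above; all numeric values (gain mean and spread, powers) are illustrative, not the paper’s settings.

```python
import numpy as np

rng = np.random.default_rng(1)

def sense(busy, gain_mean=1.0, gain_std=0.1, tx_power=1.0, noise_std=1.0):
    """One sensing result: Gaussian noise alone on an idle slot, or noise
    plus the background signal scaled by a Gaussian channel gain on a
    busy slot."""
    noise = rng.normal(0.0, noise_std)
    if not busy:
        return noise                          # idle: noise only
    gain = rng.normal(gain_mean, gain_std)    # Gaussian channel gain
    return noise + gain * tx_power            # busy: noise + received power
```

The overlap between the two resulting distributions is what makes single-slot detection unreliable and motivates classifying a window of recent sensing results instead.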

Before launching an attack, the adversary first determines the length of a time slot and the lengths of the sensing, transmission, and feedback periods in a time slot based on its spectrum sensing results. For that purpose, the adversary senses the channel over a period of time to collect data. The adversary can detect ACKs reliably because of their unique properties. First, an ACK always follows an active data transmission period and is followed by an inactive sensing period. Second, an ACK itself is a short transmission with a reliable modulation and coding scheme, which differs from the scheme used for data transmissions. The inter-arrival time between two ACKs is an integer multiple of the length of a time slot, since some time slots do not carry an ACK. The problem of determining the time slot length from multiple observations of such inter-arrival times is solved by Algorithm 1. Once the length of a time slot is determined, the adversary can further determine the sensing and transmission periods in a time slot. In a time slot with an ACK, there is a successful transmission, and thus the starting point of such a transmission (with higher sensed power than idle cases) separates the sensing period (before this point) from the data transmission period (after this point).

1:  The adversary observes the spectrum over a time period and identifies ACKs.
2:  Initialize a list L of the inter-arrival times of these ACKs.
3:  Find ℓ as the smallest number in L.
4:  for each element x of L other than ℓ do
5:     r = x mod ℓ
6:     if r = 0 then
7:        Remove x from L.
8:     else
9:        Replace x by r.
10:     end if
11:  end for
12:  if L has only one element ℓ then
13:     Return ℓ.
14:  else
15:     Go to Step 3.
16:  end if
Algorithm 1 Determine the length of a time slot
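Since each inter-arrival time is an integer multiple of the slot length, the remainder-elimination loop of Algorithm 1 is, in effect, computing the greatest common divisor of the observed inter-arrival times. A compact equivalent sketch:

```python
from math import gcd
from functools import reduce

def slot_length(inter_arrival_times):
    """Estimate the time-slot length as the GCD of ACK inter-arrival times
    (each measured in some base time unit and assumed to be an exact
    integer multiple of the slot length)."""
    return reduce(gcd, inter_arrival_times)
```

With noiseless measurements this recovers the true slot length whenever the observed multiples do not all share a common factor larger than one slot.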

Then, for each time slot, the adversary aims to predict whether there will be a successful transmission (ACK) if there is no attack. Note that the adversary only detects the presence of the ACK message and does not need to decode it. The adversary’s prediction is based on another classifier that is trained by deep learning. If the adversary predicts that there will be a successful transmission, it performs an attack to reduce the transmitter’s throughput. In this paper, we consider the attack of transmitting in the initial short sensing period to change the transmitter’s sensing result for the current time slot. Since this sensing result is an input to the transmitter’s classifier on time slot status, the transmitter may make a wrong decision, even if its classifier was trained properly.

The advantage of this attack, compared with a continuous jamming attack, is that the initial sensing period is much shorter than the data transmission period. As a result, the power consumption of this attack is much lower than that of continuous jamming. In addition, this attack is harder to detect than continuous jamming due to its small footprint.

The transmitter may also apply a defense mechanism to mitigate such attacks. For that purpose, the transmitter takes wrong actions in a controlled manner such that the ‘ACK’ or ‘no ACK’ outcomes (namely, the labels observed by the adversary) are changed. As a consequence, the adversary’s classifier cannot be reliably trained and the attack performance drops. However, the transmitter needs to minimize the number of these wrong actions so that the performance loss due to wrong channel access decisions remains small. In Section 6, we will show how to carefully select a small set of time slots (depending on the classification score of the transmitter’s classifier) and take wrong actions only in these slots to better mislead the adversary. Table I lists the notation used in this paper.

4 Transmitter’s Algorithm

The transmitter senses the spectrum, identifies an idle time slot (when the background traffic source is not transmitting), and then decides whether to transmit or not. The transmitter applies a deep learning classifier to identify idle time slots. This classifier is pre-trained using a number of samples, where a sample for a given time has the most recent sensing results as features and the current busy/idle status as the label. The number of recent sensing results per sample is a design parameter that the transmitter can tune to optimize its performance. Each sensing result is either Gaussian noise with normalized power (idle time slot) or noise plus the received power from the background traffic source (busy time slot), where the noise and the channel gain are random variables with Gaussian distributions. After observing a certain period of time, the transmitter collects a number of samples to be used as training data to build its deep learning classifier. The transmitter’s training algorithm is summarized in Algorithm 2.

1:  The transmitter collects sensing data over a time period to build its training data.
2:  The transmitter builds a training sample for each time, consisting of the most recent sensed power values as features and the busy/idle status at that time as the label.
3:  The transmitter trains a deep learning classifier using this training data.
Algorithm 2 The transmitter’s training algorithm
Fig. 3: The transmitter’s classifier in test time.

Once the classifier is built, the transmitter uses it to predict the channel status of each time slot and transmits if it predicts a given time slot as idle. The block diagram in Figure 3 shows the transmitter’s operation in test time. Note that there is an optional defense block, which will be discussed later in Section 6. This prediction algorithm is summarized in Algorithm 3. In this algorithm, two types of errors may be incurred:

  • Misdetection. A busy time slot is detected as idle (true label ‘busy’, predicted label ‘idle’).

  • False alarm. An idle time slot is detected as busy (true label ‘idle’, predicted label ‘busy’).

The transmitter aims to minimize its classifier’s error probability, which accounts for both the misdetection probability and the false alarm probability. This objective is important, especially when data is imbalanced among labels. These error probabilities are calculated as the number of misdetections divided by the number of busy time slots in the test data, and the number of false alarms divided by the number of idle time slots in the test data, respectively. There are many hyperparameters in deep learning, e.g., the number of layers in the neural network and the number of neurons per layer. Consider a set of hyperparameter values chosen from a feasible region. In addition to training the deep neural network (namely, determining its weights and biases), these hyperparameters should also be optimized to minimize the error probability. Hyperparameter selection leads to the following optimization problem.

OptHyper:
minimize the error probability of the transmitter’s classifier over the deep learning hyperparameters
subject to the hyperparameters lying in the feasible region

where the error probability is the outcome of the transmitter’s deep learning process on the given hyperparameters and training data. Note that a closed-form expression of this error is unknown due to the complex neural network built in deep learning. Therefore, standard optimization techniques such as convex optimization cannot be applied to solve OptHyper. In this paper, we find a locally optimal solution to OptHyper by applying a greedy sequential-fixing algorithm that starts with an initial set of parameter values and optimizes one parameter value (while keeping the others unchanged) in each round until all parameter values are optimized. In addition, we solve OptHyper with Hyperband [48], which starts with a number of settings of parameter values and checks their performance after a limited number of training epochs. Based on the current performance results, bad settings are removed; in the next round, the remaining settings continue for more epochs, and the more accurate performance results are used to remove further settings. After several rounds, a final solution is obtained, likely with good performance since many settings are considered, yet with low complexity since most settings are discarded without a complete training process. Alternatively, the random search approach in [44] could be used for low complexity, but its performance is not as good since only a small random portion of the large search space is covered. OptHyper could also be solved by a genetic algorithm, which can find good parameter solutions at the expense of high time complexity: the set of hyperparameter values is used as the chromosome, and the algorithm starts with a number of initial solutions, forming the first generation. Once a termination condition (e.g., no significant improvement over some generations) is met, the best solution in the current generation is returned.
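A sketch of the greedy sequential-fixing idea, with a stand-in `evaluate` function in place of actually training the deep network and measuring its error; the candidate grids below are illustrative.

```python
def sequential_fixing(param_grid, evaluate):
    """Greedy sequential fixing: optimize one hyperparameter at a time,
    holding the others fixed, until every parameter has been set.
    evaluate(setting) returns the validation error for a full setting
    (a stand-in here for training the network and measuring its error)."""
    # Start from the first candidate value of each hyperparameter.
    setting = {name: values[0] for name, values in param_grid.items()}
    for name, values in param_grid.items():
        best_val, best_err = setting[name], float('inf')
        for v in values:
            candidate = dict(setting, **{name: v})  # vary one parameter
            err = evaluate(candidate)
            if err < best_err:
                best_val, best_err = v, err
        setting[name] = best_val                    # fix the best value
    return setting
```

This explores only the sum, not the product, of the per-parameter grid sizes, which is why it yields a locally (rather than globally) optimal solution.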

1:  At time t, the transmitter senses the channel and obtains a power measurement.
2:  The transmitter builds a test sample from its most recent sensing results.
3:  The transmitter uses its classifier to decide on a label ('busy' or 'idle') for the test sample at time slot t.
Algorithm 3 The transmitter's prediction algorithm

We use TensorFlow [45] to build the transmitter's classifier with the feedforward neural network (FNN) structure shown in Figure 4. The following hyperparameters are selected as a locally optimal solution by solving OptHyper for the transmitter's deep neural network:

  • The FNN is trained with the backpropagation algorithm using cross-entropy as the loss function. The cross-entropy loss is given by L(θ) = −Σ_j y_j log(ŷ_j), where θ is the set of neural network parameters, x is the training data vector, y is the corresponding (one-hot) label vector, and ŷ is the output of the neural network at the last (output) layer.

  • Number of hidden layers is 3.

  • Number of neurons per hidden layer is 100.

  • Rectified linear unit (ReLU) is used as the activation function at the hidden layers. ReLU performs the operation max(0, x) on input x.

  • Softmax is used as the activation function at the output layer. Softmax performs the operation e^{x_i} / Σ_j e^{x_j} on each input x_i, producing a probability distribution over the output classes.

  • Batch size is 100.

  • Number of training steps is 1000.

Fig. 4: The structure of a feedforward neural network.
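The activation and loss functions listed above can be sketched in plain Python (outside TensorFlow) as follows; the helper names are illustrative.

```python
import math

def relu(x):
    """ReLU: max(0, x), applied elementwise at the hidden layers."""
    return [max(0.0, v) for v in x]

def softmax(x):
    """Softmax: exp(x_i) / sum_j exp(x_j), applied at the output layer
    to turn the last layer's values into class probabilities
    ('idle' vs. 'busy')."""
    m = max(x)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in x]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(probs, label):
    """Cross-entropy loss for a one-hot label given by its class index."""
    return -math.log(probs[label])
```

In the actual classifier, these operations are of course supplied by TensorFlow; the sketch only makes the bullet-point definitions concrete.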

In the simulation, background traffic arrives at the background transmitter at a fixed packet arrival rate per time slot. When the background transmitter has a queued data packet, it may decide to transmit, and once it starts transmitting, it continues until its queue is empty. The channel gain is a Gaussian random variable whose expected value decays with the distance between the background transmitter and the transmitter. The locations of the transmitter and the background transmitter, and the transmit power of the background transmitter (normalized with respect to the unit noise power), are fixed in the simulation setting.

The transmitter collects a set of samples, each consisting of its most recent spectrum sensing results and a label ('idle' or 'busy'). Half of these samples are used as training data and the other half as test data. The optimized deep learning classifier minimizes the sum of its misdetection and false-alarm errors. In the test phase, only a small number of busy time slots are identified as idle and no idle time slot is identified as busy, so both error probabilities are small. This small error shows that the transmitter can reliably predict the channel status of a given time slot when there is no attack. Note that this small error is achieved by optimizing the deep learning hyperparameters; other hyperparameter choices result in worse performance. For example, changing the number of hidden layers and the number of neurons per hidden layer yields error probabilities that are worse than what can be achieved with hyperparameter tuning. When Hyperband is used for hyperparameter optimization instead, the resulting deep neural network has three hidden layers with different numbers of neurons per layer, and achieves comparably small errors.

The classifier with the best set of hyperparameters is implemented on an embedded GPU platform, the Nvidia Jetson Nano. The run time to obtain one classification result at test time (namely, to run one sample through the deep neural network) is measured on the order of milliseconds. The run time of the adversary's algorithm, developed later, is similar.

The transmitter transmits in the idle time slots detected by its classifier. If the SNR (or SINR) at the receiver is no less than a threshold, the receiver confirms a successful transmission by sending an ACK to the transmitter. The location of the receiver and the transmitter's transmit power (again normalized with respect to the unit noise power) are fixed in the simulation. The transmitter applies its deep learning classifier over a number of time slots and makes transmission decisions. In this phase, a small number of busy time slots are misidentified as idle (and the transmissions in these slots fail), while almost all idle time slots are correctly identified as idle (and the transmissions in these slots are successful).

We evaluate the achieved normalized throughput, defined as the ratio of the number of successful transmissions to the number of idle time slots; in simulations, we measure 98.96%. We also evaluate the success ratio, defined as the ratio of the number of successful transmissions to the number of all transmissions; in simulations, we measure 96.94%. Due to the small errors in detecting busy/idle time slots, both the normalized throughput and the success ratio achieved by the transmitter's algorithm are high. Finally, we evaluate the overall transmission ratio, defined as the ratio of the number of all transmissions to the number of all time slots; in simulations, we measure 19.60%.
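The three ratios defined above can be computed from per-slot simulation records; the record format here is an assumption for illustration.

```python
def spectrum_metrics(slots):
    """Compute the three performance ratios used in the paper from a list of
    per-slot records. Each record is a tuple
        (channel_idle, transmitted, success):
      - normalized throughput = successful transmissions / idle slots
      - success ratio         = successful transmissions / all transmissions
      - transmission ratio    = all transmissions / all slots
    """
    idle = sum(1 for c, _, _ in slots if c)
    tx = sum(1 for _, t, _ in slots if t)
    ok = sum(1 for _, _, s in slots if s)
    return {
        "normalized_throughput": ok / idle if idle else 0.0,
        "success_ratio": ok / tx if tx else 0.0,
        "transmission_ratio": tx / len(slots),
    }
```

For example, a perfect transmitter would transmit in exactly the idle slots and succeed in all of them, giving a normalized throughput and success ratio of 1.0.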

In this paper, we focus on deep learning based algorithms, which perform better than other machine learning algorithms. For example, the transmitter could also use an SVM-based classifier to analyze its sensing data; we found that the performance of this classifier is worse, with both error probabilities much larger than those of the deep learning classifier. Also note that the search space of OptHyper includes the case of a single hidden layer (i.e., a conventional, non-deep neural network), and the search finds that the deep learning solution with three hidden layers has higher accuracy than a neural network with a single hidden layer.

Next, we consider the impact of different channel models. Three additional models, namely the Rayleigh, Rician, and log-normal channel models, are studied with all other factors unchanged. Table II shows that deep learning builds an accurate classifier for each of these channel models (all with small errors), although the error probabilities differ across channel models.

Channel model Misdetection False alarm
Gaussian
Rayleigh
Rician
log-normal
TABLE II: Results under different channel models.

To consider the impact of location, we change the location of the background transmitter to several alternative positions. Results in Table III show that the error probabilities become smaller as the background transmitter moves closer to the transmitter, since the sensed signal is stronger.

Location Misdetection False alarm
TABLE III: Results under different background transmitter locations.

We also consider the impact of mobility: the classifier is built when the background transmitter is at its original location, but the background transmitter then moves to a new location. Results in Table IV show that the error probabilities become smaller if the background transmitter moves closer to the transmitter (since the sensed signal is stronger) and larger as it moves farther away.

Location in test phase Misdetection False alarm
TABLE IV: Results under background transmitter mobility (different locations in the test phase).

Finally, we consider the case of multiple background sources. Two additional background transmitters are placed at different locations with the same transmit power as the original one. To keep a similar number of idle time slots, the traffic rate at each of these transmitters is reduced accordingly. If any background transmitter is sending data, the channel is busy. Spectrum sensing observes the aggregate signal from all of these transmitters as the input to the transmitter's classifier. We find that the trained classifier has smaller error probabilities, i.e., better performance than the classifier for the case of a single background source. The reason is that multiple active sources generate a larger aggregate signal, which makes it easier to predict the channel status.

5 Adversary’s Algorithm

There is an adversary that aims to reduce the transmitter's performance. As a first step, the adversary needs to determine the transmitter's time slot structure (start point, end point, and duration) and its decomposition into sensing, transmission, and feedback periods; this step is discussed in detail in Section 3. With the knowledge of the transmitter's slot structure, the adversary launches an exploratory attack to infer the transmitter's classifier. It then analyzes the transmitter's behavior and launches different attacks (using the same energy budget). In this section, we consider two types of attacks.

  • Evasion attack. The adversary jams the transmitter's sensing period so that the transmitter collects wrong channel data samples and thus makes wrong decisions when it runs its classifier on them.

  • Causative attack. Suppose that the transmitter collects additional training data and retrains its classifier. The adversary jams the transmitter's sensing period so that the transmitter collects wrong training data; the updated classifier then fails to improve and may even become worse.

5.1 Exploratory Attack

For the exploratory attack, the adversary senses the spectrum, predicts whether the transmitter would have a successful transmission (if there were no attack), and attacks accordingly (if it predicts a successful transmission). There are four cases:

  1. the time slot is idle (true channel status 'idle') and the transmitter is transmitting,

  2. the time slot is busy (true status 'busy') and the transmitter is not transmitting,

  3. the time slot is idle (true status 'idle') and the transmitter is not transmitting, or

  4. the time slot is busy (true status 'busy') and the transmitter is transmitting.

Since the transmitter transmits if and only if its classifier outputs 'idle', the last two cases correspond to false alarms and misdetections of the transmitter's classifier, respectively. Our results in Section 4 show that these are rare cases. The adversary uses its most recent sensing results as the features and the current feedback ('ACK' vs. 'no ACK') as the label to build one training sample; for numerical results, a fixed number of recent sensing results is assumed. Note that the transmitter and the adversary do not know each other's classifier parameters. After observing a certain period of time, the adversary collects a number of samples as training data to build a deep learning classifier that outputs one of two labels: 'ACK' (namely, a successful transmission) or 'no ACK' (namely, a failed transmission). Figure 5 shows the input data and the labels used while building the adversary's classifier. The adversary's training algorithm is summarized in Algorithm 4.

Fig. 5: The input and output (label) data while training the adversary's classifier.
1:  The adversary collects data over a time period to build its training data.
2:  The adversary builds one training sample for each time t, with its most recent sensed powers as features and the feedback (ACK or not) at time t as the label.
3:  The adversary trains a deep learning classifier using this training data.
Algorithm 4 The adversary's training algorithm
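Algorithm 4's sample construction (the most recent sensing results as features, the observed feedback as the label) can be sketched with a sliding window; the window length and function names are illustrative, since the paper leaves the exact feature length open here.

```python
def build_training_samples(powers, acks, window=10):
    """Build (features, label) pairs as in Algorithm 4: the features for
    time t are the `window` most recent sensed powers, and the label is
    the observed feedback (1 = ACK, 0 = no ACK) at time t.
    `window` is an illustrative choice, not the paper's value.
    """
    samples = []
    # The first full window is available only from t = window - 1 onward.
    for t in range(window - 1, len(powers)):
        features = powers[t - window + 1: t + 1]
        samples.append((features, acks[t]))
    return samples
```

The same sliding-window construction serves both training (with labels) and prediction (features only), which is why Algorithms 4 and 5 look so similar.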

The process of building this classifier can be regarded as an exploratory attack, since the adversary aims to infer the operation of the transmitter's classifier. There are two differences between these classifiers.

  • Due to the different locations of the transmitter and the adversary, and random channels, the sensing results at the transmitter and the adversary differ. Thus, the features for the same sample are different at the two classifiers.

  • The labels (classes) of the two classifiers are different: the labels are 'busy' or 'idle' in the transmitter's classifier, and 'ACK' or 'no ACK' in the adversary's classifier.

1:  At time t, the adversary senses the channel and collects the received power.
2:  The adversary builds a test sample from its most recent sensing results.
3:  The adversary uses its classifier to decide on a label ('ACK' or 'no ACK') for the test sample at time slot t.
Algorithm 5 The adversary's prediction algorithm

Once the adversary's classifier is built, the adversary uses it to predict whether the transmitter would have a successful transmission (if there were no attack). This prediction algorithm is given in Algorithm 5, and it may make two types of errors:

  • Misdetection. There will be a successful transmission but the adversary predicts that there will not be one, i.e., the true outcome is 'ACK' but the predicted label is 'no ACK'.

  • False alarm. There will not be a successful transmission but the adversary predicts that there will be one, i.e., the true outcome is 'no ACK' but the predicted label is 'ACK'.

The adversary aims to minimize its error probability, the sum of its misdetection and false-alarm probabilities. For that purpose, it trains its deep neural network and selects its hyperparameters. The underlying optimization problem is similar to the one for the transmitter's classifier (discussed in Section 4), so its discussion is omitted here.

We use TensorFlow to build the adversary's classifier. In the simulation, the adversary's location is fixed. The adversary collects a set of labeled samples (each sample corresponding to its most recent sensing results) over a number of time slots; half of these samples are used as training data and the other half as test data. Only a small number of the successful transmissions in the test data are predicted as failed, and only a small number of the failed transmissions are predicted as successful, so both error probabilities are small. This small error shows that the adversary can reliably predict the transmitter's successful transmissions. The inferred classifier is further used by the adversary for two additional attacks, the evasion and causative attacks discussed next.

5.2 Evasion Attack

With its inferred classifier, the adversary can perform an evasion attack (which targets the test time of the transmitter's classifier) as follows. If the adversary predicts that a time slot would carry an ACK in the absence of an attack, it transmits during the initial sensing period to change the transmitter's sensing result for the current time slot. This sensing result is one feature of the transmitter's classifier, so the transmitter may make a wrong decision (namely, misclassify the status of the time slot), even if its classifier was built successfully to predict idle/busy channel states in the absence of attacks. Compared with a continuous jamming attack, this attack targets the initial sensing period, which is much shorter than the data transmission period; hence, its power consumption is much lower. This attack has two important properties compared to jamming data transmissions. First, it is more energy-efficient and can be sustained over a longer period of time (assuming the adversary is battery-operated). Second, it is harder for the transmitter to detect, since the adversary does not jam the transmitter's data transmissions (so DoS detection mechanisms cannot be readily applied).

Fig. 6: Using the adversary's classifier for evasion attacks.

Figure 6 illustrates the adversary's operation for the evasion attack. In the simulation, the adversary's transmit power is normalized with respect to the unit noise power. For the classifier built in Section 4 and the time slots considered for transmissions under the attack, only a few of the idle time slots are identified as idle (and the transmissions in these slots are all successful), while a busy time slot is identified as idle (and the transmission in this slot fails). Thus, the achieved normalized throughput is 3.13% and the overall success ratio is 75.00%, while only very few transmission attempts are made, so that the all-transmission ratio is 0.80%. As a result, the adversary significantly reduces the transmitter's throughput from 98.96% to 3.13%, the success ratio from 96.94% to 75.00%, and the ratio of transmissions from 19.60% to 0.80%.

We compare the evasion attack with a traditional jamming attack that targets data transmissions (as studied in [6]), where the adversary jams when it predicts that the transmitter may have a successful transmission (if there were no attack). We consider the optimistic case that the prediction accuracy is the same as that of the adversary's deep learning classifier. For a fair comparison, we impose the same energy budget on both attacks: we measure the energy consumption of the adversary (namely, the fraction of time in which it transmits) under the spectrum poisoning attack, and use this budget for every other attack considered in this paper. The sensing period is a small fraction of the entire time slot and the transmission period takes up the rest (we ignore the short end period for feedback). We also assume that the energy budget allows the adversary to transmit during the entire sensing period of every time slot. Then, under the same energy budget, the jamming attack can jam only a correspondingly small fraction of all time slots; the adversary selects the time slots with the highest predicted probabilities of carrying an ACK if there were no attack. The transmitter's transmission decisions do not change, and only some of its transmissions succeed under the jamming attack, yielding a normalized throughput of 41.67% and a success ratio of 40.82% for the jamming attack (see Table V). Under the same energy budget, the jamming attack is thus not as effective as the evasion attack considered in this paper.
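The energy-budget argument above can be captured in a back-of-the-envelope helper; the sensing and data-period fractions are assumed inputs, since the paper's exact slot structure is not reproduced here.

```python
def jammable_slot_fraction(sense_frac, data_frac, attacked_frac):
    """Equal-energy comparison of the two attacks. The spectrum-poisoning
    adversary transmits during the sensing period (length `sense_frac` of a
    slot) in a fraction `attacked_frac` of all slots; its energy budget is
    therefore sense_frac * attacked_frac (in slot-length units per slot).
    A jammer of equal energy that must cover the data period (length
    `data_frac` of a slot) can then jam only the returned fraction of slots.
    All fractions here are assumptions for illustration.
    """
    energy_budget = sense_frac * attacked_frac
    return min(1.0, energy_budget / data_frac)
```

For example, with a sensing period taking 10% of the slot and the data period the remaining 90%, an adversary that poisons the sensing period of every slot leaves an equal-energy jammer able to cover only about one ninth of the slots.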

5.3 Causative Attack

The adversary can also launch a causative attack (which targets the training process of the transmitter's classifier) if the transmitter updates its classifier using additional training data. To attack the re-training process, the adversary identifies the transmitter's re-training phase, namely when it starts and ends, as follows. We assume that the re-training process is launched periodically. The adversary can then identify the re-training phase in two steps (see Algorithm 6). In the first step, the adversary observes the accuracy of using its classifier to predict ACKs. Once the transmitter updates its classifier, the adversary observes a change in this accuracy; the time instances of such changes identify when the classifier is updated, i.e., the ending times of re-training phases. In the second step, the adversary launches causative attacks of adjustable length, corresponding to different estimates of the length of a re-training phase. If increasing this length does not increase the impact of the causative attack, the current length is at least the length of a re-training phase; otherwise, it is at most the length of a re-training phase. Thus, the adversary can adjust its estimated length to determine the actual length. Together, these two steps determine the re-training phase of the transmitter's classifier, as formulated in Algorithm 6.

1:  The adversary observes the accuracy of its classifier on whether there will be an ACK and identifies the time instances when this accuracy changes.
2:  The adversary estimates the period between consecutive accuracy changes, which is the time between two re-training phases.
3:  The initial lower and upper bounds on the re-training length are set to zero and the period length, respectively.
4:  The adversary launches causative attacks with two estimated re-training lengths: the midpoint of the current bounds and the upper bound.
5:  if the causative attack with the midpoint length has the same performance as that with the upper-bound length then
6:     Update the upper bound to the midpoint.
7:  else
8:     Update the lower bound to the midpoint.
9:  end if
10:  if the bounds are close enough then
11:     The re-training length is the current upper bound.
12:  else
13:     Go to Step 4.
14:  end if
Algorithm 6 Determine the start and end of the re-training phase
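The length-estimation loop in Steps 4-14 of Algorithm 6 is effectively a bisection on the re-training length; a sketch, where `same_impact` stands for the measurement the adversary performs (it compares the attack's impact for two candidate lengths, which the adversary realizes by observation, not by any closed form).

```python
def estimate_retraining_length(lower, upper, same_impact, tol=1):
    """Bisection as in Algorithm 6: if attacking with length `mid` has the
    same impact as attacking with length `upper`, the true re-training
    length is at most `mid`; otherwise it exceeds `mid`. The invariant is
    lower < true_length <= upper, so the loop converges on the true length.
    """
    while upper - lower > tol:
        mid = (lower + upper) // 2
        if same_impact(mid, upper):
            upper = mid   # true length <= mid
        else:
            lower = mid   # true length > mid
    return upper
```

With integer slot counts the search needs only logarithmically many probing rounds, which keeps the adversary's extra energy spent on probing small.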

Once the re-training phases are determined, the adversary performs the causative attack by transmitting during the initial sensing period whenever its classifier predicts an ACK. To retrain its classifier, the transmitter collects additional training data, but its sensing results are changed by the adversary's transmissions. Hence, the causative attack changes the training data and thereby the transmitter's classifier. The transmitter's performance drops even if the adversary does not transmit later to change sensing results at test time. Compared with the evasion attack, the power consumption of this attack is even smaller, since the adversary only needs to transmit during re-training phases rather than continuously jamming the sensing period.

Fig. 7: Using the adversary's classifier for causative attacks.

Figure 7 illustrates the adversary's operation for causative attacks. In the simulation, the adversary's transmit power is the same as in the evasion attack. For the classifier built in Section 4 and the time slots considered for transmissions after the attack, most idle time slots are still identified as idle (and the transmissions in these slots are successful), but a number of busy time slots are now also identified as idle, and the transmissions in those slots fail. Thus, the achieved normalized throughput is 87.27%, the overall success ratio is 60.76%, and the all-transmission ratio is 31.60%. As a result, the adversary increases the ratio of transmissions from 19.60% to 31.60%; however, these additional transmissions do not improve performance. The adversary reduces the transmitter's throughput from 98.96% to 87.27% and the success ratio from 96.94% to 60.76%, without needing any further transmissions.

Attack — Normalized throughput / Success ratio / All transmission ratio
no attack 98.96% 96.94% 19.60%
evasion attack 3.13% 75.00% 0.80%
jamming 41.67% 40.82% 19.60%
causative attack 87.27% 60.76% 31.60%
causative + evasion attack 2.72% 75.00% 0.80%
causative + jamming attack 37.27% 25.95% 31.60%
TABLE V: Results under various attacks.

5.4 Causative Attack followed by Evasion or Jamming Attack

The causative attack can be followed by an evasion attack. That is, the adversary first launches the causative attack so that the transmitter's classifier is updated with wrong samples of additional training data, and then launches the evasion attack so that the input features to the updated classifier are also wrong. As a result, the adversary reduces the transmitter's throughput from 98.96% to 2.72%, the success ratio from 96.94% to 75.00%, and the ratio of transmissions from 19.60% to 0.80%.

The causative attack can also be followed by a jamming attack (targeting data transmissions) under the same energy budget. As discussed in Section 5.2, the adversary can then jam only a limited fraction of all time slots. Under this setting, the causative attack increases the ratio of transmissions from 19.60% to 31.60%; again, the additional transmissions do not improve performance. The adversary reduces the transmitter's throughput from 98.96% to 37.27% and the success ratio from 96.94% to 25.95%.

Table V summarizes the performance of the transmitter without an attack and under the various attacks considered in this paper, and demonstrates the success of these attacks. Overall, the proposed attacks cause a major loss in the transmitter's performance, and their impact is much more substantial than that of typical jamming attacks targeting data transmissions under the same energy budget.

6 Defense Strategy

The first step of the proposed attacks is an exploratory attack to understand how the transmitter's classifier works and to build the adversary's classifier. One approach to protect a deep learning algorithm against such attacks is to add randomness to the target's deep neural network, making it harder for the adversary to learn its structure [20]. However, this approach is not effective here, since the adversary does not access the last layer of the transmitter's neural network; it only observes the outcomes (ACK or not) of the transmitter's actions (transmissions). Therefore, an alternative approach is to add randomness directly to the transmitter's transmissions, which in turn changes the input to the adversary's training (namely, the labels the adversary collects to build its classifier in the exploratory attack). Note that a small level of randomness may not change the ACKs much, so the adversary can still perform an exploratory attack, while a large level of randomness changes the transmitter's actions so much that its performance degrades even without an attack. We consider a single-channel system, so spectrum handoff to other channels is not available as a strategy to confuse the adversary.

In this paper, we consider a defense strategy that selectively changes the transmitter's actions, i.e., makes the transmitter transmit in a time slot identified as busy[2] or stay silent in a time slot identified as idle. Such changes should make the adversary's observations incorrect, so that it cannot build a good classifier and consequently cannot perform the subsequent attacks effectively, either. At the same time, these changes should not reduce the transmitter's performance significantly. This defense mechanism therefore involves a fundamental trade-off between degrading the accuracy of the adversary's classifier and preserving the transmitter's own performance. The problem is to select the time slots in which taking defense actions achieves the maximum (negative) impact on the accuracy of the adversary's classifier. This can be formulated as the following optimization problem.

Footnote 2: If there is an attack, this defense improves performance, as we show later in this section. However, the impact of this defense action on throughput is not obvious if there are multiple transmitters but no attack: the transmitter's classifier is not perfect, so an idle channel may be identified as busy. This issue can be resolved by an alternative defense at the receiver that sends an ACK although no packet is received. As discussed later in this section, this approach achieves the same defense performance on average without causing additional interference to other nodes.

OptDefense:
maximize the error of the adversary's classifier
subject to the ratio of defense actions not exceeding a maximum allowed ratio,

where the adversary's training and test data sets are both affected by the transmitter's defense actions, the ratio of defense actions is measured over all time slots, and the maximum allowed ratio bounds how often the transmitter deviates from its classifier's decisions. The adversary's deep learning process depends not only on its collected data but also on the defense actions, since the transmitter does not know when the adversary collects training data and when it collects test data. Thus, the transmitter takes the same ratio of defense actions over both data sets (assumed equal for the numerical results).

Defense ratio (# of defense operations / # of all samples) — Adversary misdetection / Adversary false alarm / Transmitter normalized throughput / Transmitter success ratio
0% (no defense) 1.98% 4.21% 3.13% 75.00%
10% 6.99% 10.59% 15.63% 15.31%
20% 8.92% 35.29% 41.67% 28.78%
40% 10.12% 42.67% 51.04% 18.22%
60% 17.06% 69.44% 76.04% 18.07%
80% 10.88% 93.22% 56.25% 13.30%
TABLE VI: Results for defense strategy under evasion attack.

OptDefense is solved by analyzing the output of the transmitter's classifier as follows. The classifier provides not only a label for each sample but also a score that measures the confidence of the classification. The classifier uses a decision boundary on this score: if the score is below the boundary, the sample is classified as idle; otherwise, it is classified as busy. The decision boundary is a hyperparameter of the deep learning model and is selected (along with the other hyperparameters) to minimize the classification error. If the score is far from the decision boundary, the confidence of the classification is high; otherwise, it is low. Therefore, to maximize the impact of its defense actions, the transmitter should select the time slots (samples) whose scores are far from the decision boundary. This decision algorithm with defense is summarized in Algorithm 7. The lower and upper thresholds in Step 4 can be determined from the collected data;[3] for example, the transmitter can select a given fraction of the time slots by choosing the thresholds so that the scores below the lower threshold and above the upper threshold together cover that fraction. The probability in Step 5 randomizes the transmitter's defense actions among the selected time slots, which makes the adversary's learning more challenging.

Footnote 3: These thresholds can be determined from either the training or the test data; we make the choice that is consistent with the other results in the paper, i.e., a classifier is always evaluated on test data to obtain performance results.

1:  At time t, the transmitter senses the channel and obtains a power measurement.
2:  The transmitter builds a sample with its most recent sensing results as features.
3:  The transmitter uses its classifier to decide on a label (busy or idle) and a confidence score.
4:  if the score is below the lower threshold or above the upper threshold then
5:     The transmitter flips the label with a certain probability.
6:  end if
7:  The transmitter transmits if the time slot is still classified as idle.
Algorithm 7 The transmitter's defense algorithm
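Algorithm 7 can be sketched as follows; the decision boundary, margin, and flip probability are illustrative values, not the paper's tuned hyperparameters.

```python
import random

def defense_decisions(scores, boundary=0.5, margin=0.3, p=0.5, seed=0):
    """Algorithm 7 sketch: classify each slot by its score, and with
    probability `p` flip the decision on high-confidence slots, i.e. slots
    whose score is at least `margin` away from the decision `boundary`.
    Returns the final per-slot decisions (True = classified idle, so the
    transmitter transmits). All parameter values here are illustrative.
    """
    rng = random.Random(seed)  # seeded only to keep the sketch reproducible
    actions = []
    for s in scores:
        idle = s < boundary  # score below the boundary -> 'idle'
        if abs(s - boundary) >= margin and rng.random() < p:
            idle = not idle  # deliberate wrong action on a confident slot
        actions.append(idle)
    return actions
```

Flipping only confident decisions is what limits the transmitter's own loss: a confidently-idle slot it gives up was very likely usable, but there are few such sacrifices, while each one plants a maximally misleading label in the adversary's training data.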

Table VI shows the results for different defense ratios under an evasion attack. With more frequent defense actions (a larger defense ratio), the achieved normalized throughput increases from 3.13% (no defense) up to 76.04% (at a 60% defense ratio). However, further increasing the ratio beyond 60% reduces the throughput (to 56.25% at 80%), as the transmitter's own channel access becomes excessively unreliable. Results for the defense strategy against the causative attack are similar and thus omitted. A search process over the defense ratio can be designed to maximize throughput, according to the adversary's actions.

The search process over the defense ratio also works when there is no adversary. In this extreme case, the transmitter will find that any defense action decreases its throughput, so the search ends with a defense ratio of zero, without the transmitter needing to know whether an adversary is present.

The above defense is performed by the transmitter. Alternatively, the receiver can also take defense actions to fool the adversary, i.e., the receiver can refrain from sending an ACK when a packet is received, or send an ACK when no packet is received. The adversary will then observe incorrect labels when building its classifier. The receiver's defense strategy can be realized to produce the same outcomes (ACK or no ACK) as the transmitter's strategy by using one bit of overhead. There are three cases for the transmitter's defense strategy.

  • Case I: The transmitter takes a defense action of not transmitting when the channel is detected as idle. The receiver then does not send an ACK, since there is no transmission.

  • Case II: The transmitter takes a defense action of transmitting when the channel is detected as busy. The receiver then likely does not send an ACK, since the transmission is likely to fail; but if such a transmission is successful, the receiver sends an ACK.

  • Case III: The transmitter does not take a defense action. The receiver then sends an ACK if there is a successful transmission.

To ensure the same outcomes, the receiver's defense strategy is implemented for the above three cases as follows.

  • Case I: The transmitter transmits data with a one-bit flag indicating "defense action". The receiver does not send an ACK even if the transmission is successful.

  • Case II: The transmitter transmits only the one-bit "defense action" flag. The receiver sends an ACK if the flag is successfully received.

  • Case III: If the transmitter transmits data, it also transmits a one-bit "no defense action" flag. The receiver sends an ACK if the transmission is successful.

From the adversary's point of view, the two defense strategies produce the same outcomes, so the adversary builds the same classifier under the exploratory attack, and the performance under the subsequent attacks (discussed in Sections 5.2, 5.3, and 5.4) is similar. The only difference is that under the receiver's defense strategy there are more transmissions (in Case I), so the throughput can be further improved if these transmissions are successful.
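The receiver-side implementation of the three cases reduces to a small ACK rule keyed on the one-bit flag; the encoding below ('data', 'flag', None for nothing decoded) is a hypothetical illustration of that rule, not a specified protocol.

```python
def receiver_ack(received, defense_flag):
    """Receiver-side defense: decide whether to send an ACK.
    `received` is what the receiver successfully decoded in this slot:
    'data' (a data packet), 'flag' (only the one-bit flag), or None.
    `defense_flag` is True when the decoded flag says 'defense action'.
      - Case I:   flag set, data received  -> suppress the ACK.
      - Case II:  flag set, flag-only received -> send an ACK.
      - Case III: flag not set -> normal operation, ACK iff data received.
    """
    if defense_flag:
        return received == "flag"  # ACK only the bare flag (Case II)
    return received == "data"      # Case III: ACK a successful transmission
```

Because the ACK pattern the adversary overhears is identical to the pattern produced by the transmitter-side defense, the adversary ends up training the same (poisoned) classifier either way.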

7 Conclusion

We applied adversarial machine learning (based on deep neural networks) to design over-the-air spectrum poisoning attacks that target the spectrum sensing period and manipulate the transmitter's input data in the test and training phases (in the form of evasion and causative attacks, respectively). An adversary launches these attacks either to fool the transmitter into making wrong transmit decisions (an evasion attack) or to manipulate its retraining process (a causative attack). Since the adversary only needs to transmit for a short period of time to manipulate the transmit decisions, these attacks are more energy-efficient and harder to detect than directly jamming data transmissions. We showed that these attacks substantially decrease the transmitter's throughput and are more effective than conventional jamming attacks. We also combined evasion, causative, and jamming attacks and measured their combined impact. To mitigate these attacks, we developed an effective defense strategy in which the transmitter intentionally takes wrong actions in selected time slots to mislead the adversary; these time slots are selected among those whose classification scores are far from the decision boundary. We showed that the proposed defense mechanism significantly increases the errors in the adversary's decisions and prevents major losses in the performance of the transmitter.

Acknowledgements

A preliminary version of the material in this paper was partially presented at IEEE Military Communications Conference (MILCOM), 2018. This effort is supported by the U.S. Army Research Office under contract W911NF-17-C-0090. The content of the information does not necessarily reflect the position or the policy of the U.S. Government, and no official endorsement should be inferred.

References

  • [1] C. Clancy, H. J. Stuntebeck, and T. O’Shea, “Applications of machine learning to cognitive radio networks,” IEEE Wireless Communications, 2007.
  • [2] M. Chen, U. Challita, W. Saad, C. Yin, and M. Debbah, “Machine Learning for Wireless Networks with Artificial Intelligence: A Tutorial on Neural Networks,” arXiv preprint arXiv:1710.02913, 2017.
  • [3] O. Simeone, “A very short introduction to machine learning with applications to communication systems,” IEEE Transactions on Cognitive Communications and Networking, 2018.
  • [4] W. Lee, M. Kim, D. Cho, and R. Schober, “Deep Sensing: Cooperative Spectrum Sensing Based on Convolutional Neural Networks,” arXiv preprint arXiv:1705.08164, 2017.
  • [5] H. Ye, G. Y. Li, and B.-H. Juang, “Power of Deep Learning for Channel Estimation and Signal Detection in OFDM Systems,” IEEE Wireless Communications Letters, 2018.
  • [6] Y. Shi, Y. E. Sagduyu, T. Erpek, K. Davaslioglu, Z. Lu, and J. Li, “Adversarial deep learning for cognitive radio security: Jamming attack and defense strategies,” IEEE International Conference on Communications (ICC) Workshop on Promises and Challenges of Machine Learning in Communication Networks, 2018.
  • [7] T. Erpek, Y. E. Sagduyu, and Y. Shi, “Deep learning for launching and mitigating wireless jamming attacks,” IEEE Transactions on Cognitive Communications and Networking, 2019.
  • [8] T. O’Shea, J. Corgan, and C. Clancy, “Convolutional radio modulation recognition networks,” International Conference on Engineering Applications of Neural Networks, 2016.
  • [9] K. Davaslioglu and Y. E. Sagduyu, “Generative adversarial learning for spectrum sensing,” IEEE International Conference on Communications (ICC), 2018.
  • [10] W. Xu, W. Trappe, Y. Zhang, and T. Wood, “The Feasibility of Launching and Detecting Jamming Attacks in Wireless Networks,” ACM International Symposium on Mobile Ad Hoc Networking and Computing (MobiHoc), 2005.
  • [11] Y. Zou, J. Zhu, L. Yang, Y.-C. Liang, and Y.-D. Yao, “Securing physical-layer communications for cognitive radio networks,” IEEE Communications Magazine, 2015.
  • [12] T. C. Clancy and N. Goergen, “Security in cognitive radio networks: Threats and mitigation,” IEEE Conference on Cognitive Radio Oriented Wireless Networks and Communications (CrownCom), 2008.
  • [13] Z. Yuan, D. Niyato, H. Li, J. B. Song, and Z. Han, “Defeating primary user emulation attacks using belief propagation in cognitive radio networks,” IEEE Journal on Selected Areas in Communications, 2012.
  • [14] R. Chen, J. Park, and K. Bian, “Robust distributed spectrum sensing in cognitive radio networks,” IEEE Conference on Computer Communications (INFOCOM), 2008.
  • [15] M. Fredrikson, S. Jha, and T. Ristenpart, “Model inversion attacks that exploit confidence information and basic countermeasures,” ACM SIGSAC Conference on Computer and Communications Security, 2015.
  • [16] A. Kurakin, I. Goodfellow, and S. Bengio, “Adversarial examples in the physical world,” arXiv preprint arXiv:1607.02533, 2016.
  • [17] B. Biggio, I. Corona, D. Maiorca, B. Nelson, N. Srndic, P. Laskov, G. Giacinto, and F. Roli, “Evasion attacks against machine learning at test time,” European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, 2013.
  • [18] B. Biggio, B. Nelson, and P. Laskov, “Poisoning attacks against support vector machines,” International Conference on Machine Learning, 2012.
  • [19] Y. E. Sagduyu and A. Ephremides, “A game-theoretic analysis of denial of service attacks in wireless random access,” Journal of Wireless Networks, 2009.
  • [20] A. Kurakin, et al., “Adversarial attacks and defences competition,” arXiv preprint arXiv:1804.00097, 2018.
  • [21] G. Ateniese, L. Mancini, A. Spognardi, A. Villani, D. Vitali, and G. Felici, “Hacking smart machines with smarter ones: How to extract meaningful data from machine learning classifiers,” International Journal of Security and Networks, 2015.
  • [22] F. Tramer, F. Zhang, A. Juels, M. Reiter, and T. Ristenpart, “Stealing machine learning models via prediction APIs,” USENIX Security, 2016.
  • [23] Y. Shi, Y. E. Sagduyu, K. Davaslioglu, and J. Li, “Active Deep Learning Attacks under Strict Rate Limitations for Online API Calls,” IEEE Symposium on Technologies for Homeland Security, 2018.
  • [24] M. Sadeghi and E. G. Larsson, “Adversarial attacks on deep-learning based radio signal classification,” IEEE Wireless Communications Letters, 2018.
  • [25] B. Flowers, R. M. Buehrer, and W. C. Headley, “Evaluating adversarial evasion attacks in the context of wireless communications,” arXiv preprint arXiv:1903.01563, 2019.
  • [26] M. Z. Hameed, A. Gyorgy, and D. Gunduz, “Communication without interception: defense against deep-learning-based modulation detection,” arXiv preprint arXiv:1902.10674, 2019.
  • [27] S. Kokalj-Filipovic and R. Miller, “Adversarial examples in RF deep learning: detection of the attack and its physical robustness,” arXiv preprint arXiv:1902.06044, 2019.
  • [28] S. Weerasinghe, T. Alpcan, S. M. Erfani, C. Leckie, P. Pourbeik, and J. Riddle, “Deep learning based game-theoretical approach to evade jamming attacks,” International Conference on Decision and Game Theory for Security (GameSec), 2018.
  • [29] Y. Shi, T. Erpek, Y. E. Sagduyu, and J. H. Li, “Spectrum data poisoning with adversarial deep learning,” in Proc. IEEE Military Communications Conference (MILCOM), 2018.
  • [30] Y. Zou, J. Zhu, L. Yang, Y. Liang, and Y. Yao, “Securing physical-layer communications for cognitive radio networks,” IEEE Communications Magazine, 2015.
  • [31] F. Penna, Y. Sun, L. Dolecek, and D. Cabric, “Detecting and counteracting statistical attacks in cooperative spectrum sensing,” IEEE Transactions on Signal Processing, 2012.
  • [32] F. R. Yu, H. Tang, M. Huang, Z. Li, and P. C. Mason, “Defense against spectrum sensing data falsification attacks in mobile ad hoc networks with cognitive radios,” IEEE Military Communications Conference (MILCOM), 2009.
  • [33] Y. E. Sagduyu, R. Berry, and A. Ephremides, “Jamming games in wireless networks with incomplete information,” IEEE Communications Magazine, 2011.
  • [34] Y. E. Sagduyu, R. Berry, and A. Ephremides, “MAC games for distributed wireless network security with incomplete information of selfish and malicious user types,” IEEE International Conference on Game Theory for Networks (GameNets), 2009.
  • [35] Z. Lu, Y. E. Sagduyu, and J. Li, “Securing the backpressure algorithm for wireless networks,” IEEE Transactions on Mobile Computing, 2017.
  • [36] Z. Lu and C. Wang, “Enabling network anti-inference via proactive strategies: a fundamental perspective,” IEEE/ACM Transactions on Networking, 2017.
  • [37] A. Ferdowsi and W. Saad, “Deep learning for signal authentication and security in massive internet of things systems,” arXiv preprint arXiv:1803.00916, 2018.
  • [38] G. Han, L. Xiao, and H. V. Poor, “Two-dimensional anti-jamming communication based on deep reinforcement learning,” IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), 2017.
  • [39] Z. Wu, Y. Zhao, Z. Yin, and H. Luo, “Jamming signals classification using convolutional neural network,” IEEE International Symposium on Signal Processing and Information Technology (ISSPIT), 2017.
  • [40] O. A. Topal, S. Gecgel, E. M. Eksioglu, and G. Karabulut Kurt, “Identification of smart jammers: Learning based approaches using wavelet representation,” arXiv preprint arXiv:1901.09424, 2019.
  • [41] L. Xiao, D. Jiang, D. Xu, H. Zhu, Y. Zhang, and H. V. Poor, “Two-dimensional anti-jamming mobile communication based on reinforcement learning,” IEEE Transactions on Vehicular Technology, 2018.
  • [42] L. Xiao, C. Xie, M. Min, and W. Zhuang, “User-centric view of unmanned aerial vehicle transmission against smart attacks,” IEEE Transactions on Vehicular Technology, 2018.
  • [43] Y. Liang, Z. Cai, J. Yu, Q. Han, and Y. Li, “Deep learning based inference of private information using embedded sensors in smart devices,” IEEE Network, 2018.
  • [44] J. Bergstra and Y. Bengio, “Random search for hyper-parameter optimization,” Journal of Machine Learning Research, 2012.
  • [45] M. Abadi, et al., “TensorFlow: Large-scale machine learning on heterogeneous systems,” 2015. www.tensorflow.org
  • [46] L. Xiao, D. Jiang, D. Xu, W. Su, N. An, and D. Wang, “Secure mobile crowdsensing based on deep learning,” China Communications, vol. 15, no. 10, pp. 1–11, 2018.
  • [47] L. Xiao, X. Wan, W. Su, and Y. Tang, “Anti-jamming underwater transmission with mobility and learning,” IEEE Communications Letters, vol. 22, no. 3, pp. 542–545, 2018.
  • [48] L. Li, K. Jamieson, G. DeSalvo, A. Rostamizadeh, and A. Talwalkar, “Hyperband: A novel bandit-based approach to hyperparameter optimization,” arXiv preprint arXiv:1603.06560, 2016.