Detecting ADS-B Spoofing Attacks using Deep Neural Networks

The Automatic Dependent Surveillance-Broadcast (ADS-B) system is a key component of the Next Generation Air Transportation System (NextGen) that manages the increasingly congested airspace. It provides accurate aircraft localization and efficient air traffic management and also improves the safety of billions of current and future passengers. While the benefits of ADS-B are well known, the lack of basic security measures like encryption and authentication introduces various exploitable security vulnerabilities. One practical threat is the ADS-B spoofing attack that targets the ADS-B ground station, in which a ground-based or aircraft-based attacker manipulates the International Civil Aviation Organization (ICAO) address (a unique identifier for each aircraft) in the ADS-B messages to fake the appearance of non-existent aircraft or masquerade as a trusted aircraft. As a result, this attack can confuse the pilots or the air traffic control personnel and cause dangerous maneuvers. In this paper, we introduce SODA - a two-stage Deep Neural Network (DNN)-based spoofing detector for ADS-B that consists of a message classifier and an aircraft classifier. It allows a ground station to examine each incoming message based on the PHY-layer features (e.g., IQ samples and phases) and flag suspicious messages. Our experimental results show that SODA detects ground-based spoofing attacks with a probability of 99.34%, with a small false alarm rate (i.e., 0.43%). It outperforms other machine learning techniques such as XGBoost, Logistic Regression, and Support Vector Machine. It further identifies individual aircraft with an average F-score of 96.68% and an accuracy of 96.66%, with a significant improvement over the state-of-the-art detector.


I Introduction

Rapid growth in air traffic expected for the coming years requires innovative applications and methodologies to guarantee the efficiency of the transportation infrastructure and the safety of passengers and crew. In order to monitor and manage the increasingly congested airspace, the Automatic Dependent Surveillance-Broadcast (ADS-B) system is deployed as a key component of the Next Generation Air Transportation System [1]. The airborne ADS-B OUT devices (transmitters) allow aircraft to periodically broadcast their identifications and current positions, determined from global navigation satellite systems, to ground stations or other aircraft over the 1090 MHz band. All aircraft will need to be equipped with ADS-B capabilities to be operated in the European and U.S. airspace by 2020, as per the regulations of the European Aviation Safety Agency (EASA) and the Federal Aviation Administration (FAA).

Despite the advantages of ADS-B, its security vulnerabilities have been widely acknowledged [2, 3, 4, 5, 6]. In particular, the source of the problem is the lack of basic security mechanisms, such as encryption and authentication. As a result, a variety of wireless attacks can be launched against the ADS-B system, such as eavesdropping, jamming, message injection (spoofing), message deletion, and message modification, as summarized in [3]. These security vulnerabilities are further exacerbated by the prevalence of low-cost Software Defined Radios (SDRs) (e.g., USRP [7]). It has been demonstrated in [6, 5] that the above attacks are practically feasible and easy to launch for a moderately sophisticated attacker with an SDR.

In this work, we focus on the ADS-B message injection or spoofing attacks that target a ground station. As illustrated in Fig. 1, the ADS-B receiver receives messages broadcast by aircraft within signal range and feeds them to the Air Traffic Control (ATC) system. We classify the ADS-B spoofing attacks into two categories: ground-based and aircraft-based. In the first category, the ground-based attacker uses an SDR to retransmit previously recorded messages or transmit newly generated and correctly modulated fake messages, which are referred to as the replay attack and the ghost aircraft injection attack, respectively. The main objective is to fake the presence of non-existent (ghost) aircraft and confuse the ATC system. In the second category, an aircraft-based attacker modifies the ICAO address in the ADS-B messages transmitted by the airborne ADS-B transponder, masquerading as a known or trusted aircraft to bypass the surveillance.

Fig. 1: Illustration of ADS-B message injection or spoofing attacks that target the ground station. The ground-based attacker uses an SDR device to replay previously recorded ADS-B messages or transmit newly generated and correctly modulated fake ADS-B messages, whereas an aircraft-based attacker modifies the ICAO address of transmitted messages and masquerades as a known or trusted aircraft. SODA is deployed between the ADS-B receiver and the ATC system and flags suspicious messages based on the PHY-layer features using a DNN.

Use of cryptographic primitives (e.g., Message Authentication Code) is one way to defend against ADS-B spoofing [8, 9, 10, 11, 12], but it requires protocol modifications or additional overhead and shifts the problem to the design of secure and scalable key management and distribution schemes. Alternative non-cryptographic approaches aim to identify and exploit unique characteristics of ADS-B transponders for fingerprinting. In [13], Strohmeier et al. analyzed message transmission patterns (inter-arrival time distribution) and identified distinct patterns between commercial ADS-B transponder types and their implementations. Nevertheless, an attacker with an SDR-based spoofer has full control of inter-transmission times and can mimic the transmission pattern to evade detection. A malicious aircraft can also masquerade as another aircraft equipped with an ADS-B transponder of a similar implementation (e.g., from the same manufacturer) to avoid detection. In [14], Leonardi et al. proposed to fingerprint aircraft through phase patterns and applied neural networks for multi-class classification. Due to the limited number of predefined classes for phase patterns (seven in total), there will be a significant number of aircraft in each class (e.g., 100+ aircraft in the class of non-coherent phase patterns), leaving a lot of freedom to the attacker.

In this work, we present SODA – a Deep Neural Network (DNN)-based spoofing detector for ADS-B. SODA takes PHY-layer features (e.g., IQ samples and phases) as input and performs two-stage detection to detect malicious messages and identify malicious aircraft.

Contributions. Throughout this paper, we make the following contributions:

  • We introduce SODA, a DNN-based detector that consists of a message classifier and an aircraft classifier to detect the ground-based and aircraft-based spoofing attacks.

  • We build an SDR-based spoofer and conduct Over-The-Air (OTA) experiments in an anechoic chamber to emulate realistic ground-based spoofing attacks. Our experimental results show that the message classifier detects malicious messages from a ground-based attacker with a detection probability of 99.34% and a negligible false alarm rate (i.e., 0.43%). It outperforms baseline machine learning techniques including XGBoost, Logistic Regression, and Support Vector Machine.

  • We build an aircraft classifier that performs multi-class classification with each aircraft being its own class. Our experimental results show that a fine-tuned aircraft classifier classifies a total of 238 aircraft with an average F-score of 96.68% and an accuracy of 96.66%, with a significant performance improvement over the state-of-the-art detector in [14].

Organization. The remainder of this paper is organized as follows. Section II reviews related work and Section III provides relevant background on ADS-B and DNN. In Section IV, we describe our system and adversary models. Section V presents the proposed detection system, and Section VI presents the experimental evaluation. This study is concluded in Section VII.

II Related Work

The security vulnerabilities of ADS-B have been analyzed in many studies [2, 3, 4, 5, 6]. One way to secure wireless ADS-B communication is through cryptographic measures [8, 9, 10, 11, 12]. Finke et al. compared various encryption schemes and supported a symmetric cipher using the FFX algorithm [9]. Nevertheless, secure key management and distribution is acknowledged to be a challenging task. More recently, Wesson et al. proposed the use of Public Key Infrastructure (PKI) and ECDSA (Elliptic Curve Digital Signature Algorithm) signatures as the cryptographic solution. In [12], Feng et al. presented a PKI solution for ADS-B message authentication based on Elliptic Curve Cipher and X.509 certificates. However, the proposed scheme would require additional messages to carry the signature and timestamps, which is hardly scalable in practice, and efficient certificate distribution remains an open question.

Alternative non-cryptographic approaches can be deployed to mitigate the attacks on the ADS-B system. In [15], the authors evaluated ADS-B jamming attacks and proposed a solution based on PHY-layer signal separation. In [16], Schäfer et al. proposed secure track verification based on timing. In [17], Ghose and Lazos exploited the Doppler spread phenomenon to estimate actual velocities and cross-checked them against the claimed velocities. In [18], Schäfer et al. exploited the Doppler shift measurements from multiple ground stations (verifiers) to verify the motion of the aircraft (prover). While it was implicitly assumed that an attacker could modify the transmission frequency to mimic the Doppler shift and deceive a single ground station, we show that this could be a challenging task for the attacker if the detector leverages the PHY-layer features for detection, as SODA does. In [19], the authors proposed an enhanced ADS-B protocol called ADS-BT, which embeds transmit timestamps in the ADS-B messages, but such an approach would require protocol modifications.

In the more recent study in [14], the authors identified 7 classes of phase patterns and exploited them for aircraft classification. They applied a neural network (one hidden layer with nodes) and reported an accuracy of for the 7-class classification task. While the results are promising, the effectiveness of the proposed detector is limited by the number of classes. In other words, more aircraft in one class imply greater freedom for an attacker to launch the aircraft spoofing attack. In the ideal case, there should be only a single aircraft in each class. Moreover, a ground-based attacker in possession of an SDR-based spoofer can also mimic the phase pattern to evade detection.

In this work, we develop SODA, a DNN-based two-stage spoofing detector. In the first stage, the message classifier examines each incoming message and labels it as malicious or non-malicious. Messages considered non-malicious in the first stage are further processed by the aircraft classifier, which predicts the ICAO address based on the PHY-layer features and compares the prediction against the claimed address to detect aircraft spoofing attacks. Our experimental results demonstrate the effectiveness of the proposed SODA with a detection probability of 99.34% in the first stage and an accuracy of 96.66% for a total of 238 aircraft (with each aircraft being its own class) in the second stage.

III Background

In this section, we provide relevant background on the ADS-B protocol (Section III-A) and a brief introduction to DNN (Section III-B).

III-A ADS-B Background

Aircraft use the ADS-B technology to periodically transmit their identifications (ICAO addresses) and status data (e.g., position, speed, and heading) to ground stations and nearby aircraft. There are two types of ADS-B devices: 1) ADS-B OUT devices for transmitting broadcasts to ADS-B receivers, and 2) ADS-B IN devices for receiving broadcasts, weather data, and ATC reports. The data link standards for ADS-B are the 978 MHz Universal Access Transceiver (UAT) and the 1090 MHz Extended Squitter (1090ES). The former was created specifically for aviation services such as ADS-B and requires new hardware, whereas the latter integrates the ADS-B function into traditional Mode S transponders. In this work, we focus on the commercially used 1090ES data link.

Fig. 2 illustrates the ADS-B message structure. Each message contains an 8 μs preamble for synchronization and a 56-bit (short) or 112-bit (extended) data block. The first 5 bits of the data block contain the downlink format (the message type). The subsequent 3-bit capability field serves as an additional identifier. The 24-bit ICAO address is a unique identifier issued to each aircraft by the ICAO. The 56-bit extended ADS-B data field contains surveillance information such as identification, position, velocity, and emergency codes. The last field is a 24-bit parity check that receivers use to validate the correctness of the preceding message. Within the 56-bit data field, the first 5 bits represent the Type Code (TC), indicating the type of information contained in the subsequent bits. While there are a total of 31 TCs, we focus on messages that include airborne position data with barometric altitude (TCs 9-18) and airborne velocities (TC 19).
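The field layout described above can be illustrated with a short parsing sketch. It is not part of SODA; the function name `parse_adsb_fields` and the sample hex string are assumptions chosen for illustration, and the bit offsets simply follow the 5/3/24/56/24-bit layout given in the text.

```python
def parse_adsb_fields(hex_msg: str) -> dict:
    """Split a 112-bit (extended) ADS-B data block into its fields."""
    if len(hex_msg) != 28:
        raise ValueError("expected 28 hex characters (112 bits)")
    value = int(hex_msg, 16)
    # 56-bit ADS-B data field sits between the ICAO address and the parity
    me = (value >> 24) & ((1 << 56) - 1)
    return {
        "df": value >> 107,                          # 5-bit downlink format
        "ca": (value >> 104) & 0x7,                  # 3-bit capability
        "icao": f"{(value >> 80) & 0xFFFFFF:06X}",   # 24-bit ICAO address
        "tc": me >> 51,                              # first 5 bits of data field
        "parity": value & 0xFFFFFF,                  # 24-bit parity check
    }


# Sample data block used purely for illustration (hypothetical input).
SAMPLE = "8D4840D6202CC371C32CE0576098"
fields = parse_adsb_fields(SAMPLE)
print(fields)
```

A receiver would typically keep only messages with a downlink format of 17 and then filter on the Type Code, for example TCs 9-18 and 19 as in this work.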

Fig. 2: Illustration of ADS-B message structure. Each message consists of an 8 μs preamble and a 56 μs or 112 μs data block.

ADS-B messages are transmitted every s on average using the Pulse Position Modulation (PPM) scheme. Since the data rate is 1 Mbps, the total duration is 120 μs for an extended ADS-B message (an 8 μs preamble followed by a 112 μs data block).
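To make the timing concrete, the following sketch builds the on/off envelope of an extended message at two samples per microsecond. Several details are assumptions not stated in the text: the preamble pulse positions (0, 1, 3.5, and 4.5 μs, as commonly described for Mode S), the convention that a 1 bit is a pulse in the first half of the bit period, the 2 MHz sample rate, and the sample hex string.

```python
# Preamble at 2 samples per microsecond: pulses at 0, 1, 3.5 and 4.5 us
# (assumed Mode S preamble shape), 16 samples = 8 us in total.
PREAMBLE = [1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 0, 0, 0, 0, 0, 0]


def ppm_envelope(hex_msg: str) -> list:
    """Return the ideal on/off envelope of a message sampled at 2 MHz."""
    n_bits = 4 * len(hex_msg)
    value = int(hex_msg, 16)
    samples = list(PREAMBLE)
    for k in range(n_bits - 1, -1, -1):      # most significant bit first
        bit = (value >> k) & 1
        # Assumed PPM convention: 1 -> pulse in first half, 0 -> second half
        samples += [1, 0] if bit else [0, 1]
    return samples


envelope = ppm_envelope("8D4840D6202CC371C32CE0576098")
duration_us = len(envelope) / 2.0            # two samples per microsecond
print(len(envelope), duration_us)
```

Each bit occupies 1 μs, so the 112-bit data block contributes 224 samples on top of the 16 preamble samples.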

III-B Deep Neural Network

DNN is a widely used supervised learning technique, which requires a large amount of labeled training data. It consists of input, hidden, and output layers. Each layer consists of a certain number of nodes (neurons). The number of input nodes is equal to the number of features of a sample, and the number of output nodes usually corresponds to the number of labels or classes. The number of hidden layers and the number of nodes per layer are tunable hyper-parameters.

In a feed-forward DNN with multiple hidden layers, each hidden node $j$ maps its input, $x_j$, to the output state, $y_j$, through an activation function. In Eq. (1), $x_j$ is equal to $b_j$, the bias of node $j$, plus the sum of $y_i w_{ij}$ over all nodes $i$ in the preceding layer, where $y_i$ is the output state of node $i$ in the previous layer, and $w_{ij}$ is the connecting weight from node $i$ to node $j$. The sigmoid function is often used in binary classification models, whereas the softmax function is used for multi-class classification problems and returns the probability of each class, $p_j$, over $K$ total classes.

$$x_j = b_j + \sum_{i} y_i w_{ij}, \qquad p_j = \frac{\exp(x_j)}{\sum_{k=1}^{K} \exp(x_k)} \tag{1}$$

A cost function (e.g., cross entropy) is defined to measure the discrepancy between the target outputs and the actual outputs produced for each training sample. The derivatives of the cost function are then back-propagated through the network to update the weights, thus training the DNN classifier. In order to increase the efficiency of back-propagation on large training sets, mini-batches, i.e., small random sets of training cases, are used in place of the entire training data. The term epoch refers to a single forward and backward pass over the entire training data. More details on DNN are available in the literature, such as [20].
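The forward computation and the cost function described in this subsection can be sketched in a few lines of plain Python. This is a toy illustration only: the layer sizes, weights, and helper names are hypothetical, and ReLU is assumed for the hidden layer with softmax at the output.

```python
import math


def dense(inputs, weights, biases):
    """x_j = b_j + sum_i y_i * w_ij, with weights[i][j] linking node i to j."""
    return [b + sum(y * row[j] for y, row in zip(inputs, weights))
            for j, b in enumerate(biases)]


def relu(xs):
    return [max(0.0, x) for x in xs]


def softmax(xs):
    m = max(xs)                               # subtract max for stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(probs, target_index):
    """Cost for one sample whose true class is target_index."""
    return -math.log(probs[target_index])


# Toy network (hypothetical weights): 3 features -> 2 hidden nodes -> 2 classes
W1 = [[0.5, -0.2], [0.1, 0.4], [-0.3, 0.2]]
b1 = [0.0, 0.1]
W2 = [[1.0, -1.0], [0.5, 0.5]]
b2 = [0.0, 0.0]

features = [1.0, 2.0, 3.0]
hidden = relu(dense(features, W1, b1))
probs = softmax(dense(hidden, W2, b2))
loss = cross_entropy(probs, target_index=0)
print(probs, loss)
```

Training would repeat this forward pass over mini-batches and adjust `W1`, `b1`, `W2`, and `b2` along the negative gradient of the loss; that step is omitted here.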

IV System and Adversary Model

In this section, we present our system model (Section IV-A) and our adversary model (Section IV-B) for ADS-B spoofing attacks against a ground station.

IV-A System Model

We consider a ground station equipped with an ADS-B receiver that has access to the raw IQ data. We assume that the ground station has sufficient resources (e.g., processing power or storage) for various machine learning tasks. We also assume that the ground station has a database of previously received authentic ADS-B messages (including both message contents and IQ data) for each aircraft, which can be used to train a supervised learning-based detector. In practice, such a database may be created by a ground station equipped with secondary surveillance systems (e.g., the radar surveillance system) such that the authenticity of each message can be cross-validated. This database can be later shared among other ground stations. Attack messages can be collected by emulating the spoofing attacks using various SDR-based spoofers. While a database created by a single ground station tends to be regional and does not cover all aircraft, multiple ground stations can collaborate and build a nation-wide dataset offline.

IV-B Adversary Model

As illustrated in Fig. 1, we consider two types of attackers: 1) a ground-based attacker that uses a low-cost SDR-based spoofer, and 2) an aircraft-based attacker that uses its ADS-B transponder with a modified ICAO address. While it is possible for an onboard attacker to transmit spoofed messages with the SDR-based spoofer from the aircraft, the signal may not reach the ground station due to the low transmit power of SDRs. For example, the output power of the USRP B210/B200 is +20 dBm (100 mW) max [7], whereas airborne ADS-B transponders are required to have a minimum transmission power of 75 W for smaller aircraft and 125 W or 200 W for larger aircraft [21]. Spoofing attacks launched by an onboard attacker against the aircraft are beyond the scope of this paper.
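The power gap quoted above can be checked with a short unit conversion. The helper names are illustrative only; the figures are the +20 dBm SDR output and the 75 W minimum transponder power from the text.

```python
import math


def dbm_to_watts(dbm: float) -> float:
    """Convert a power level in dBm to watts."""
    return 10 ** (dbm / 10) / 1000


def watts_to_dbm(watts: float) -> float:
    """Convert a power level in watts to dBm."""
    return 10 * math.log10(watts * 1000)


sdr_dbm = 20.0                        # USRP B210/B200 maximum output
transponder_dbm = watts_to_dbm(75.0)  # minimum for smaller aircraft
gap_db = transponder_dbm - sdr_dbm
print(dbm_to_watts(sdr_dbm), transponder_dbm, gap_db)
```

Under these numbers the transponder transmits roughly 29 dB more power than the SDR, which is consistent with the argument that an onboard SDR signal may not reach the ground station.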

We assume that attackers have full knowledge of the ADS-B protocol and the location of the targeted ground station. Since SDR devices are highly flexible and configurable through software, the attacker has control over signal characteristics (e.g., power level, phase, and transmitting frequency). In particular, we assume that the attacker is aware of the Doppler effect (caused by aircraft movements) and can mimic the Doppler shift by shifting the transmitting frequency of the generated signal (or equivalently, by modifying the phases of the transmitted IQ samples). In contrast, commercial ADS-B OUT devices have custom circuits and do not provide the flexibility for users to access their physical layer or modify the signal characteristics.
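The equivalence between a frequency shift and a per-sample phase rotation can be sketched as follows. This is an illustrative model rather than the spoofer used in this work: the narrowband approximation f_d = (v / c) f_c, the sign convention (positive radial velocity means approaching), the 2 MHz sample rate, and the function names are all assumptions.

```python
import cmath
import math

C = 299_792_458.0   # speed of light in m/s
F_C = 1090e6        # ADS-B carrier frequency in Hz
F_S = 2e6           # assumed baseband sample rate in Hz


def doppler_shift_hz(radial_velocity_mps: float) -> float:
    """Narrowband Doppler shift for a given radial velocity."""
    return radial_velocity_mps / C * F_C


def apply_frequency_shift(iq, shift_hz, fs=F_S):
    """Rotate the n-th complex sample by 2*pi*shift*n/fs radians."""
    return [s * cmath.exp(2j * math.pi * shift_hz * n / fs)
            for n, s in enumerate(iq)]


baseband = [1 + 0j] * 8               # constant-envelope placeholder signal
f_d = doppler_shift_hz(250.0)         # aircraft approaching at 250 m/s
shifted = apply_frequency_shift(baseband, f_d)
phase_step = cmath.phase(shifted[1] / shifted[0])
print(f_d, phase_step)
```

The rotation leaves the sample magnitudes unchanged and only adds a linear phase ramp, which is why phase-based features are sensitive to this kind of manipulation.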

Depending on the way the spoofed messages are generated, we further divide the ADS-B spoofing attacks into three types: 1) message or IQ data replay attack, 2) ghost aircraft injection attack, and 3) aircraft spoofing attack. The first two attacks can only be launched by a ground-based attacker, while the last one is launched by an aircraft-based attacker. Here we consider the replay attack as a special case of the spoofing attack, in which the attacker transmits previously recorded messages pretending to be the legitimate aircraft, without bothering to manipulate the message contents. We now describe each attack in detail.

Message or IQ data replay attack. In this attack, the ground-based attacker records the message contents or the IQ data of the received authentic ADS-B messages using the SDR device, and then transmits the same messages at a later time without changing the message contents. Compared to the message replay attack, the IQ data replay attack is much stealthier, as the recorded IQ data incorporates a lot of information about the Doppler effect, the transmitter characteristics (e.g., carrier frequency offset), and the channel characteristics (e.g., multi-path and fading effects), which are difficult to mimic otherwise.

Ghost aircraft injection attack. Unlike the replay attacks, the ground-based attacker transmits fake ADS-B messages with arbitrary contents of its choice using the SDR device. In particular, the attacker can simulate the trajectories of non-existent aircraft and generate corresponding ADS-B messages with carefully chosen Doppler shifts, causing ghost aircraft to appear on the console of the ground station.

Aircraft spoofing attack. In this attack, an aircraft-based attacker (malicious aircraft) attempts to masquerade as a known or trusted aircraft by spoofing the ICAO address to hide its true identity. Since the aircraft is physically present, the masquerading attack will not be detected even if the secondary radar surveillance system is deployed.

V Proposed System – SODA

In this section, we first describe the architecture of SODA (Section V-A). We then present the DNN-based message classifier (Section V-B) and aircraft classifier (Section V-C).

V-A System Architecture

Fig. 3 illustrates the architecture of SODA, which consists of a message classifier and an aircraft classifier. The message classifier decides whether the message is malicious or not. A message considered non-malicious was not transmitted by the SDR-based spoofer, but it may still come from a malicious aircraft rather than the legitimate transmitter indicated by the ICAO address in the message. Hence, the aircraft classifier further determines the transmitting aircraft of the message and compares the output ICAO address against the claimed ICAO address to detect the aircraft spoofing attack.

Since both the message and aircraft classifiers are based on DNN, they need to be trained on a large labeled dataset. At runtime, SODA takes each incoming ADS-B message (IQ data) as input and decides whether this message is suspicious or not. Since SODA is a passive detection system that only flags possibly spoofed messages, it is up to the ATC system to take further actions (e.g., drop the flagged messages or display the ghost aircraft on the console). In the rest of this section, we describe the two stages in more detail.
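The two-stage decision flow described above can be summarized in a minimal sketch. The function `soda_decision`, its return strings, and the two stub classifiers are hypothetical placeholders; in the actual system both stages are trained DNNs.

```python
from typing import Callable, Sequence


def soda_decision(iq: Sequence[float],
                  claimed_icao: str,
                  is_malicious: Callable[[Sequence[float]], bool],
                  predict_icao: Callable[[Sequence[float]], str]) -> str:
    """Run stage 1 (message classifier), then stage 2 (aircraft classifier)."""
    if is_malicious(iq):
        return "ground-based spoofing suspected"
    if predict_icao(iq) != claimed_icao:
        return "aircraft spoofing suspected"
    return "not flagged"


# Stub classifiers standing in for the trained models (assumptions).
def stub_message_classifier(iq):
    return sum(iq) < 0          # arbitrary rule, for demonstration only


def stub_aircraft_classifier(iq):
    return "4840D6"             # always predicts the same aircraft


sample_iq = [0.1, 0.2, 0.3, 0.4]
print(soda_decision(sample_iq, "4840D6",
                    stub_message_classifier, stub_aircraft_classifier))
```

The second stage only runs on messages that pass the first, so a flagged message never reaches the aircraft classifier.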

Fig. 3: Illustration of SODA architecture. The message classifier determines whether the ADS-B message is malicious or not so as to detect ground-based spoofing attacks. A non-malicious message is further passed to the aircraft classifier. By comparing the output ICAO address with the claimed ICAO address, the detector can detect the aircraft-based spoofing attack.

V-B Message Classification

Message classification is treated as a binary classification problem. In SODA, the raw IQ samples are used as features. Since each extended message lasts 120 μs, a sampling rate of 2 MHz produces 240 interleaved pairs of IQ samples.

Data collection. To successfully train the message classifier, a large amount of realistic ground-based spoofing attack data is needed. As illustrated in Fig. 4(a), we constructed an ADS-B receiver (the ground station) that consists of a laptop, an RTL-SDR dongle [22], a 1090 MHz bandpass filter, and an ADS-B antenna to collect real ADS-B messages in an open area. We then built an SDR-based ADS-B spoofer consisting of a USRP B210 [7] and a 5-dBi 1090 MHz antenna (Fig. 4(b)) to emulate ADS-B spoofing attacks.

For the message replay attack, the spoofer transmits correctly modulated ADS-B signals using the message contents decoded from authentic ADS-B messages. For the IQ data replay attack, the spoofer simply replays the IQ data of the recorded ADS-B messages. This represents the worst-case IQ data replay attack, because the replayed IQ data directly comes from the ground station, while in practice the attacker would record the IQ data using different hardware at a different location. In the ghost aircraft injection attack, a number of aircraft are simulated with random ICAO addresses, velocities, altitudes, and headings. For simplicity, we assume that each aircraft broadcasts an ADS-B message every second.

Fig. 4: Illustration of data collection using the constructed ADS-B receiver and SDR-based spoofer. (a) The ADS-B receiver collects authentic messages in an open area at the coordinates of . (b) The same ADS-B receiver receives malicious messages transmitted by the SDR-based spoofer in an anechoic chamber.

In order to simulate different received signal strengths of messages from the same aircraft when it is approaching or flying away from the receiver, attack messages are transmitted multiple times with different USRP gains. Since the attacker may be aware of the Doppler effect and the existence of carrier frequency offsets, the calculated or random Doppler shifts and frequency offsets are incorporated into the transmitted attack messages. More details are provided in Section VI.

Message classification example. Fig. 5 illustrates authentic and malicious messages under different attacks. Recall that both message and IQ data replay attacks reuse the original message contents, whereas the ghost aircraft injection attack generates fake message contents. Comparing Fig. 5(b) with Fig. 5(a), we notice that despite the same message contents, the message replay attack leads to very different IQ samples. In contrast, the IQ data replay attack (Fig. 5(c)) can produce a very similar but shifted phase pattern, as compared to the original message (Fig. 5(a)).

Fig. 5: Examples of IQ samples of received ADS-B messages under different ground-based ADS-B spoofing attacks. (a) No attack (authentic message). (b) Message replay attack. (c) IQ data replay attack. (d) Ghost aircraft injection attack.

V-C Aircraft Classification

The aircraft classifier predicts the source ICAO address of the received ADS-B message and compares it against the claimed ICAO address. Since many aircraft transmit ADS-B messages, this becomes a multi-class classification problem. It is worth noting that through aircraft classification, SODA can not only detect the aircraft spoofing attack, but also identify the masquerading aircraft.

Features. Unlike the message classifier, the aircraft classifier does not use IQ samples or their magnitudes as features, because they encode the (possibly spoofed) ICAO address and could be used to deceive the classifier. Instead, it uses the phases, which are independent of the claimed ICAO address, as features. The phase of the $n$-th pair of IQ samples, $(I_n, Q_n)$, is computed as

$$\phi_n = \operatorname{atan2}(Q_n, I_n).$$

From communication theory [23], we know that the phases encode information about the TX and RX carrier frequency offsets and the Doppler shift. To see why, denote the passband ADS-B signal as

$$s(t) = \mathrm{Re}\left\{ x(t)\, e^{j 2\pi f_c t} \right\},$$

where $x(t)$ is the complex baseband signal and $f_c$ is the carrier frequency (1090 MHz for ADS-B signals). With a carrier frequency offset of $\Delta f$ and a phase offset $\theta$, the resulting signal becomes

$$\tilde{s}(t) = \mathrm{Re}\left\{ x(t)\, e^{j \left( 2\pi (f_c + \Delta f) t + \theta \right)} \right\} = \mathrm{Re}\left\{ x(t)\, e^{j \phi(t)}\, e^{j 2\pi f_c t} \right\},$$

where $\phi(t) = 2\pi \Delta f\, t + \theta$. Hence, the rate of change of the phase indicates the carrier frequency offset, which is the sum of the TX/RX frequency offsets and the Doppler shift, and is also affected by the propagation channel. (Since the ADS-B signal is a narrowband signal, the impact of the Doppler effect is equivalent to shifting the carrier frequency by a certain amount.) As a result, the phases encode rich information and thus can be used as features for classification.
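The relation between the phase slope and the frequency offset can be illustrated with a small sketch. It is not the feature extractor used in SODA: it assumes interleaved samples ordered as I0, Q0, I1, Q1, ..., a 2 MHz sample rate, and a continuous synthetic tone rather than a real PPM signal, where the phase is only meaningful during pulses. All names are hypothetical.

```python
import cmath
import math


def phases_from_interleaved(samples):
    """Phase of each IQ pair from samples = [I0, Q0, I1, Q1, ...]."""
    return [math.atan2(q, i) for i, q in zip(samples[0::2], samples[1::2])]


def estimate_offset_hz(samples, fs):
    """Estimate the frequency offset from the average phase rotation."""
    iq = [complex(i, q) for i, q in zip(samples[0::2], samples[1::2])]
    # Summing x[n] * conj(x[n-1]) averages the per-sample rotation and
    # avoids explicit phase unwrapping.
    acc = sum(b * a.conjugate() for a, b in zip(iq, iq[1:]))
    return cmath.phase(acc) * fs / (2 * math.pi)


FS = 2e6             # assumed sample rate in Hz
TRUE_OFFSET = 10e3   # synthetic carrier frequency offset in Hz
THETA = 0.3          # synthetic phase offset in radians

tone = []
for n in range(240):
    phi = 2 * math.pi * TRUE_OFFSET * n / FS + THETA
    tone += [math.cos(phi), math.sin(phi)]

phases = phases_from_interleaved(tone)
estimate = estimate_offset_hz(tone, FS)
print(phases[0], estimate)
```

In this synthetic case the first phase equals the phase offset and the estimated slope recovers the frequency offset, mirroring the linear phase term derived above.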

Aircraft classification example. A comparison of phase patterns of received messages from different aircraft is provided in Fig. 6. On one hand, we observe that aircraft could have distinguishable phase patterns from each other, which could be learned by the DNN-based aircraft classifier. On the other hand, it is possible for two aircraft to have a similar phase pattern, such as Aircraft 7 and 8, which might cause potential misclassifications.

Fig. 6: Comparison of message phase patterns of different aircraft.

VI Experimental Evaluation

In this section, we describe the procedure for data collection and then evaluate the performance of both the message and aircraft classifiers of SODA, implemented using Keras [24], on real-world datasets.

VI-A Datasets

The ADS-B receiver and SDR-based spoofer in Fig. 4 were used for data collection. The ADS-B antenna was mounted on a tripod and the overall height is about 1.5 meters. A modified version of dump1090 [25] was used to collect IQ samples with a gain of 49.6 and a sampling rate of 2 MHz. Both wired and OTA experiments were performed to collect realistic spoofing attack data. The former were conducted by directly wiring the transmitter and receiver. The latter were performed in an anechoic chamber located in the basement level of a building to avoid possible signal emission.

Datasets for message classification. Table I summarizes the collected datasets. A total of 18675 authentic ADS-B messages (extended, downlink format of 17) were collected in an open area within half an hour. The first 10000 messages were used for message and IQ data replay attacks. A total of ghost aircraft were simulated with a velocity randomly drawn from a normal distribution . The altitude follows . The headings (in degrees) are drawn from a uniform distribution .

The Doppler shift and the carrier frequency offset are simulated in five different cases: i) , ii) calculated and , iii) calculated and random , iv) random and , and v) random and random . The distributions of random and are and , respectively.

Dataset                           Label   Wired   OTA     Total
Authentic messages                A0      N/A     18675   18675
Message replay attack             A1      8101    8318    16419
IQ data replay attack             A2      5961    5806    11767
Ghost aircraft injection attack   A3      8770    8832    17602
TABLE I: Collected datasets for message classification.

Datasets for aircraft classification. In order to provide a sufficient amount of data for aircraft classification, we collected additional data using the constructed ADS-B receiver on the rooftop of a building for two consecutive days. There are a total of 300,152 and 326,122 extended ADS-B messages from 157 and 210 aircraft on the two days, respectively.

Description                  Value
Input Layer                  Number of features
Output Layer                 Number of classes
Weight Initialization        Xavier normal
Weight Regularizer           L2
Activation Function          ReLU
Output Activation Function   Softmax
Cost Function                Cross entropy
Optimizer                    Adam
Epochs, Batch Size           50, 32
TABLE II: Basic DNN setup for SODA

VI-B Performance of Message Classification

In this experiment, we evaluate the performance of the DNN-based message classifier and compare it against three baseline classifiers: XGBoost [26], Logistic Regression (LR) [27], and Support Vector Machine (SVM) [28].

Setup. We first consider all attacks and split the dataset into 60%, 20%, and 20% for training, validation, and testing, respectively. The DNN parameters are summarized in Table II. We consider the following three DNN models:

  1. D1: one hidden layer with 128 nodes,

  2. D2: one hidden layer with 256 nodes, and

  3. D3: two hidden layers with 128 nodes per layer.

When training XGBoost/LR/SVM, the attack dataset is subsampled so that the attack and authentic datasets have the same number of messages, avoiding data imbalance.
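The data preparation steps in this setup can be sketched as follows. The code is a generic illustration under stated assumptions: the helper names, the fixed random seed, and the toy integer "messages" are hypothetical, and the paper does not specify how the split or the subsampling was actually implemented.

```python
import random


def split_60_20_20(items, seed=0):
    """Shuffle and split into training, validation, and test subsets."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = (6 * len(shuffled)) // 10    # integer math avoids rounding issues
    n_val = (2 * len(shuffled)) // 10
    return (shuffled[:n_train],
            shuffled[n_train:n_train + n_val],
            shuffled[n_train + n_val:])


def balance_by_subsampling(authentic, attack, seed=0):
    """Randomly subsample the larger class so both classes have equal size."""
    rng = random.Random(seed)
    n = min(len(authentic), len(attack))
    return rng.sample(authentic, n), rng.sample(attack, n)


# Toy stand-ins for messages: integers instead of IQ feature vectors.
authentic_msgs = list(range(100))
attack_msgs = list(range(100, 350))

train, val, test = split_60_20_20(authentic_msgs + attack_msgs)
balanced_authentic, balanced_attack = balance_by_subsampling(
    authentic_msgs, attack_msgs)
print(len(train), len(val), len(test), len(balanced_attack))
```

With the dataset sizes in Table I, the attack messages outnumber the authentic ones, so the subsampling step would reduce the attack set to the size of the authentic set.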

Metrics. Since each classifier acts as a detector, we adopt the following two metrics:

  1. Detection probability (denoted as $P_D$): the percentage of malicious messages classified as malicious, and

  2. False alarm probability (denoted as $P_{FA}$): the percentage of authentic messages classified as malicious.

In practice, a classifier that achieves a higher $P_D$ and a lower $P_{FA}$ (that is within a given limit) is preferred.
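The two metrics can be computed directly from the true and predicted labels. The sketch below assumes a binary encoding (1 for malicious, 0 for authentic) and a hypothetical helper name `detection_metrics`; the toy labels are made up for illustration only.

```python
def detection_metrics(y_true, y_pred):
    """Return (detection probability, false alarm probability).

    Labels are assumed to be 1 for malicious and 0 for authentic.
    """
    on_malicious = [p for t, p in zip(y_true, y_pred) if t == 1]
    on_authentic = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not on_malicious or not on_authentic:
        raise ValueError("need at least one malicious and one authentic message")
    p_detect = sum(on_malicious) / len(on_malicious)       # malicious flagged as malicious
    p_false_alarm = sum(on_authentic) / len(on_authentic)  # authentic flagged as malicious
    return p_detect, p_false_alarm

# Toy data: 4 malicious messages (3 detected), 5 authentic messages (1 false alarm).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0]
p_detect, p_false_alarm = detection_metrics(y_true, y_pred)
print(f"detection = {p_detect:.2%}, false alarm = {p_false_alarm:.2%}")
```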

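Results. The results are provided in Table III. We first observe that a simple DNN model such as D1 already achieves very good performance, with a detection probability of 99.39% and a false alarm probability of 2.60%, compared to the other machine learning techniques. In particular, LR and SVM do not perform well in this task (detection probabilities near 50% with false alarm probabilities above 40%), and XGBoost only achieves a detection probability of 78.37% with a false alarm probability of 5.17%. If we increase the complexity of the DNN model by doubling the number of nodes, the message classifier (D2) achieves a detection probability of 99.50% and a false alarm probability of 2.09%. Interestingly, if the hidden nodes of D2 are split equally into two hidden layers, the resulting classifier (D3) further reduces the false alarm probability to 0.43%, while keeping the detection probability at 99.34%.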

Model  Detection probability  False alarm probability
D1 99.39% 2.60%
D2 99.50% 2.09%
D3 99.34% 0.43%
XGBoost 78.37% 5.17%
LR 54.83% 46.93%
SVM 51.52% 43.64%
TABLE III: Comparison of different models for message classification

Impact of attack dataset diversity. As shown in the previous experiment, the DNN model performs well when trained with all attack datasets. A natural question is whether a DNN model trained on only a subset of the attack datasets still performs well when tested against unknown attacks. In this experiment, we study the impact of attack dataset diversity on D3, and the results are provided in Table IV.

As we can see, the performance of the DNN depends on the attack datasets used for training. An interesting observation is that when trained with A1 only, the classifier detects A3 with a probability of 96.31%. In contrast, when trained with A3 only, the classifier detects A1 with a probability of only 21.80%. In general, the model does not generalize to unknown attacks, suggesting the necessity of creating a diverse attack dataset for message classification.

Attack datasets used for training  Detection of A1  Detection of A2  Detection of A3  False alarm
{A1} 99.67% 11.68% 96.31% 0.35%
{A2} 38.57% 99.07% 55.21% 0.40%
{A3} 21.80% 0.30% 100.00% 0.03%
{A1, A2} 99.03% 98.85% 96.28% 0.59%
{A1, A3} 99.88% 13.59% 100.0% 0.70%
{A2, A3} 72.91% 98.94% 99.97% 0.27%
{A1, A2, A3} 99.36% 98.34% 100.00% 0.43%
TABLE IV: Detection performance of D3 message classifier when trained with different combinations of attack datasets
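To make the generalization gap easier to see, the sketch below transcribes the detection probabilities of Table IV and lists only the cases where the tested attack was absent from training. The dictionary layout and the helper name `unseen_attack_detection` are illustrative assumptions; the numbers are copied from the table.

```python
# Detection probabilities (%) transcribed from Table IV.
# Key: attack datasets used for training; value: detection probability per tested attack.
TABLE_IV = {
    ("A1",): {"A1": 99.67, "A2": 11.68, "A3": 96.31},
    ("A2",): {"A1": 38.57, "A2": 99.07, "A3": 55.21},
    ("A3",): {"A1": 21.80, "A2": 0.30, "A3": 100.00},
    ("A1", "A2"): {"A1": 99.03, "A2": 98.85, "A3": 96.28},
    ("A1", "A3"): {"A1": 99.88, "A2": 13.59, "A3": 100.00},
    ("A2", "A3"): {"A1": 72.91, "A2": 98.94, "A3": 99.97},
    ("A1", "A2", "A3"): {"A1": 99.36, "A2": 98.34, "A3": 100.00},
}

def unseen_attack_detection(table):
    """Return {(training set, tested attack): detection %} for attacks not seen in training."""
    unseen = {}
    for trained_on, row in table.items():
        for attack, detection in row.items():
            if attack not in trained_on:
                unseen[(trained_on, attack)] = detection
    return unseen

unseen = unseen_attack_detection(TABLE_IV)
worst = min(unseen, key=unseen.get)
best = max(unseen, key=unseen.get)

# Print the unseen-attack cases from worst to best detection.
for (trained_on, attack), detection in sorted(unseen.items(), key=lambda kv: kv[1]):
    print(f"trained on {{{', '.join(trained_on)}}}, tested on {attack}: {detection:.2f}%")
```

Of the nine unseen-attack cases, the lowest detection is for A2 when training on A3 only, and the highest is for A3 when training on A1 only, which matches the asymmetry noted in the text.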

VI-C Performance of Aircraft Classification

In this section, we evaluate the DNN-based aircraft classifier. We then study the impact of the size of the training set and the number of aircraft on the classification performance.

Setup. We post-processed the collected data and obtained a dataset that consists of 238 aircraft with 1000 messages per aircraft, totalling 238,000 messages. Each message has a total of 240 phase values (features) computed from the IQ samples. We split the data into three portions: 60% for training, 20% for validation, and 20% for testing. Since the DNN outperforms the other machine learning techniques (Section VI-B), we focus on DNNs in this experiment. Specifically, we consider five DNN models of varying complexity:

  1. M1: one hidden layer with 512 nodes,

  2. M2: one hidden layer with 1024 nodes,

  3. M3: two hidden layers with 512 nodes per layer,

  4. M4: three hidden layers with 512 nodes per layer, and

  5. M5: a fine-tuned model with two hidden layers (512 and 256 nodes), batch normalization [29], and 200 epochs.

The parameters in Table II are used for constructing DNNs.
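One way to compare the complexity of M1 to M4 is to count trainable parameters. The sketch below assumes fully connected layers with bias terms, 240 input features, and 238 output classes (one per aircraft in the dataset); these are assumptions about the architecture, and the helper `dense_param_count` is hypothetical. M5 is left out because batch normalization adds parameters whose count depends on where the normalization layers are placed.

```python
def dense_param_count(n_inputs, hidden_layers, n_classes):
    """Count weights and biases of a fully connected feed-forward network."""
    sizes = [n_inputs, *hidden_layers, n_classes]
    # Each layer contributes fan_in * fan_out weights plus fan_out biases.
    return sum(fan_in * fan_out + fan_out
               for fan_in, fan_out in zip(sizes, sizes[1:]))

N_FEATURES = 240  # phase values per message
N_CLASSES = 238   # aircraft in the dataset (assumed to equal the number of classes)

MODELS = {
    "M1": [512],
    "M2": [1024],
    "M3": [512, 512],
    "M4": [512, 512, 512],
}

PARAMS = {name: dense_param_count(N_FEATURES, hidden, N_CLASSES)
          for name, hidden in MODELS.items()}

for name, count in PARAMS.items():
    print(f"{name}: {count:,} trainable parameters")
```

Under these assumptions M2 and M3 have a similar parameter budget (roughly 0.49 and 0.51 million), so the comparison between them is mostly about width versus depth rather than model size.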

Metrics. We adopt the following standard metrics for multi-class classification: precision, recall, and F-score [30]. Let $TP_i$ be the true positive rate at which Aircraft $i$'s messages are correctly identified as coming from Aircraft $i$. Similarly, the false positive rate $FP_i$ and the false negative rate $FN_i$ are the rates of incorrectly accepted and incorrectly rejected identifications for Aircraft $i$, respectively. Hence, we have

$$Pr_i = \frac{TP_i}{TP_i + FP_i}, \qquad Re_i = \frac{TP_i}{TP_i + FN_i}.$$

Note that high precision and high recall imply low FPR and low FNR, respectively.

To further quantify the overall performance of the classifier, we compute the macro-averaging precision (AvgPr) and recall (AvgRe) by averaging the $Pr_i$'s and $Re_i$'s over all aircraft, and the average F-score as

$$\text{AvgF-Score} = \frac{2 \cdot \text{AvgPr} \cdot \text{AvgRe}}{\text{AvgPr} + \text{AvgRe}}.$$

In addition, the accuracy metric is often used to measure the overall effectiveness of a classifier, defined as the percentage of messages that are correctly classified.
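The sketch below computes these quantities from a confusion matrix of message counts. It assumes the standard macro-averaging definitions from [30], with the average F-score taken as the harmonic mean of AvgPr and AvgRe; the function name `macro_metrics` and the three-aircraft toy matrix are illustrative only.

```python
def macro_metrics(confusion):
    """Return (AvgPr, AvgRe, AvgF, accuracy).

    confusion[i][j] is the number of messages from aircraft i classified as aircraft j.
    """
    n = len(confusion)
    precisions, recalls = [], []
    for i in range(n):
        tp = confusion[i][i]
        fp = sum(confusion[k][i] for k in range(n)) - tp  # other aircraft accepted as i
        fn = sum(confusion[i]) - tp                       # aircraft i rejected
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    avg_pr = sum(precisions) / n
    avg_re = sum(recalls) / n
    avg_f = 2 * avg_pr * avg_re / (avg_pr + avg_re) if avg_pr + avg_re else 0.0
    accuracy = sum(confusion[i][i] for i in range(n)) / sum(map(sum, confusion))
    return avg_pr, avg_re, avg_f, accuracy

# Toy confusion matrix: 3 aircraft, 10 test messages each.
toy = [
    [8, 2, 0],
    [1, 9, 0],
    [0, 0, 10],
]
avg_pr, avg_re, avg_f, accuracy = macro_metrics(toy)
print(f"AvgPr={avg_pr:.4f} AvgRe={avg_re:.4f} AvgF={avg_f:.4f} Acc={accuracy:.4f}")
```

When every aircraft contributes the same number of test messages, as in the toy matrix, the macro-averaged recall equals the accuracy.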

Results. The results are provided in Table V. Compared to M1, which achieves an average F-score of 93.64% and an accuracy of 93.60%, increasing the model complexity by adding more nodes or one more layer (e.g., M2 and M3) improves the overall performance. Adding further layers (e.g., M4) does not necessarily yield a performance gain. However, with careful tuning of hyper-parameters and longer training, a fine-tuned model like M5 achieves an average F-score of 96.68% and an accuracy of 96.66%.

Model AvgPr AvgRe AvgF-Score Accuracy
M1 93.69% 93.60% 93.64% 93.60%
M2 94.62% 94.51% 94.57% 94.51%
M3 94.88% 94.68% 94.78% 94.68%
M4 94.52% 94.28% 94.40% 94.28%
M5 96.69% 96.66% 96.68% 96.66%
TABLE V: Performance of DNN-based aircraft classifiers
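As a quick consistency check on Table V, the sketch below recomputes the average F-score of each model as the harmonic mean of its AvgPr and AvgRe, which is the assumed definition of the macro-averaged F-score following [30]. The values are transcribed from the table; small differences are expected because the published figures are rounded to two decimals.

```python
# (AvgPr, AvgRe, AvgF-Score) in percent, transcribed from Table V.
TABLE_V = {
    "M1": (93.69, 93.60, 93.64),
    "M2": (94.62, 94.51, 94.57),
    "M3": (94.88, 94.68, 94.78),
    "M4": (94.52, 94.28, 94.40),
    "M5": (96.69, 96.66, 96.68),
}

def harmonic_mean(a, b):
    """Harmonic mean of two positive numbers."""
    return 2 * a * b / (a + b)

# Absolute gap between the recomputed and the reported F-score, per model.
gaps = {name: abs(harmonic_mean(pr, re) - f)
        for name, (pr, re, f) in TABLE_V.items()}
max_gap = max(gaps.values())

for name, gap in gaps.items():
    print(f"{name}: recomputed F-score differs from the table by {gap:.4f} points")
```

All gaps stay below 0.01 percentage points, which is consistent with rounding of the reported precision and recall.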

Fig. 7 plots the empirical cumulative distribution function (CDF) of the F-scores. We can see that over of aircraft have an F-score of for all models. In addition, increasing model complexity tends to improve the F-score of individual aircraft. For example, there are of aircraft with an F-score lower than with M1, and this ratio drops to and with M3 and M4, respectively. For the fine-tuned model, this ratio is as low as .

Fig. 7: Empirical CDF of F-scores of DNN-based aircraft classifiers.

Fig. 8 plots the confusion matrix of M5, in which the $(i, j)$-th entry represents the probability of Aircraft $i$'s messages being classified as Aircraft $j$'s. With messages correctly identified, there exist only non-diagonal entries (out of ) with a rate larger than or equal to ( misclassified messages). The largest misclassification rate is between Aircraft 91 and 92, suggesting similarity in their phase patterns.

Fig. 8: Normalized confusion matrix of the fine-tuned aircraft classifier (M5). There are non-diagonal entries larger than or equal to (blue squares).

Impact of the size of the training set. In order to understand how the size of the training set affects the DNN performance, we vary the training ratio from to with a step of and plot the average F-score and accuracy of M3. As shown in Fig. 9, with only of the original training set (i.e., messages per aircraft), the classifier achieves an accuracy of over . While the overall performance increases as the training ratio increases, the performance gain of more training samples tends to gradually decrease. Similar trends are observed with other models and thus not reported due to the space limit.

Fig. 9: Performance of the aircraft classifier (M3) in terms of average F-score and accuracy as a function of the training ratio. The performance improves with more training samples and tends to saturate after a certain point.

Impact of the number of aircraft. In this experiment, we vary the number of aircraft from to with a step of and plot the performance of M3, as shown in Fig. 10. With only aircraft, the classification accuracy is as high as . With more aircraft, the performance drops and comes to a level of around .

Fig. 10: Performance of the aircraft classifier (M3) in terms of average F-score and accuracy as a function of the number of aircraft (classes). When the number of aircraft increases, the performance slightly drops to around .

VII Conclusion and Future Work

In this paper, we proposed SODA to detect ADS-B spoofing attacks using DNN, which consists of a message classifier and an aircraft classifier. The message classifier takes IQ samples as input and detects malicious messages from the SDR-based ground attacker, including the message/IQ data replay attack and the ghost aircraft injection attack. The aircraft classifier takes the phases of received messages as input and detects spoofed messages by comparing the claimed ICAO address against the predicted ICAO address. We collected real-world ADS-B messages and generated realistic attack data using our ADS-B testbed through OTA experiments in an anechoic chamber. Our experimental results show that SODA detects ground-based spoofing attacks with a detection probability of 99.34% and a false alarm probability of 0.43%. It detects aircraft spoofing attacks with an average F-score of 96.68% and an accuracy of 96.66%.

As future work, we will study the scalability of SODA and explore the use of clustering to identify aircraft with similar features and detect ADS-B spoofing attacks launched by aircraft in different clusters. We will also perform post-processing of the IQ samples to extract more stable features (e.g., frequency offsets) to improve the robustness of SODA against various factors, including radio propagation, receiver characteristics, and measurement noise.

Acknowledgments

This work was supported in part by NSF grant CNS-1446866, ONR grants N00014-16-1-2710 and N00014-17-1-2946, and ARO grant W911NF-16-1-0485. Views and conclusions expressed are those of the authors and should not be interpreted as those of the NSF, ONR, or ARO. This work was also supported by a grant of the Italian Presidency of the Council of Ministers.

References

  • [1] FAA, “Modernization of U.S. airspace,” accessed: Dec 14, 2018. [Online]. Available: https://www.faa.gov/nextgen/
  • [2] K. Sampigethaya and R. Poovendran, “Security and privacy of future aircraft wireless communications with offboard systems,” in Communication Systems and Networks (COMSNETS), 2011 Third International Conference on.   IEEE, 2011, pp. 1–6.
  • [3] D. McCallie et al., “Security analysis of the ADS-B implementation in the next generation air transportation system,” International Journal of Critical Infrastructure Protection, vol. 4, no. 2, pp. 78–87, 2011.
  • [4] M. Strohmeier, V. Lenders, and I. Martinovic, “On the security of the automatic dependent surveillance-broadcast protocol,” IEEE Communications Surveys & Tutorials, vol. 17, no. 2, pp. 1066–1087, 2015.
  • [5] A. Costin and A. Francillon, “Ghost in the air (traffic): On insecurity of ADS-B protocol and practical attacks on ADS-B devices,” Black Hat USA, pp. 1–12, 2012.
  • [6] M. Schäfer et al., “Experimental analysis of attacks on next generation air traffic communication,” in International Conference on Applied Cryptography and Network Security.   Springer, 2013, pp. 253–271.
  • [7] Ettus Research, “USRP B210,” accessed: Dec 14, 2018. [Online]. Available: https://www.ettus.com/product/details/UB210-KIT
  • [8] K. D. Wesson et al., “Can cryptography secure next generation air traffic surveillance?” IEEE Security and Privacy Magazine, 2014.
  • [9] C. Finke et al., “Enhancing the security of aircraft surveillance in the next generation air traffic control system,” International Journal of Critical Infrastructure Protection, vol. 6, no. 1, pp. 3–11, 2013.
  • [10] M. J. Viggiano et al., “Secure ADS-B authentication system and method,” Jun. 1 2010, US Patent 7,730,307.
  • [11] L. Schuchman, “Automatic dependent surveillance system secure ADS-S,” Jan. 25 2011, US Patent 7,876,259.
  • [12] Z. Feng, W. Pan, and Y. Wang, “A data authentication solution of ADS-B system based on X.509 certificate,” in 27th International Congress of the Aeronautical Sciences, ICAS, 2010, pp. 1–6.
  • [13] M. Strohmeier and I. Martinovic, “On passive data link layer fingerprinting of aircraft transponders,” in Proceedings of the First ACM Workshop on Cyber-Physical Systems-Security.   ACM, 2015, pp. 1–9.
  • [14] M. Leonardi et al., “Air traffic security: Aircraft classification using ADS-B message’s phase-pattern,” Aerospace, vol. 4, no. 4, p. 51, 2017.
  • [15] M. Leonardi, E. Piracci, and G. Galati, “ADS-B jamming mitigation: a solution based on a multichannel receiver,” IEEE Aerospace and Electronic Systems Magazine, vol. 32, no. 11, pp. 44–51, 2017.
  • [16] M. Schäfer et al., “Secure track verification,” in Security and Privacy (SP), 2015 IEEE Symposium on.   IEEE, 2015, pp. 199–213.
  • [17] N. Ghose and L. Lazos, “Verifying ADS-B navigation information through Doppler shift measurements,” in Digital Avionics Systems Conference (DASC), 2015 IEEE/AIAA 34th.   IEEE, 2015, pp. 1–27.
  • [18] M. Schäfer et al., “Secure motion verification using the Doppler effect,” in Proceedings of the 9th ACM Conference on Security & Privacy in Wireless and Mobile Networks.   ACM, 2016, pp. 135–145.
  • [19] Y. Kim, J.-Y. Jo, and S. Lee, “ADS-B vulnerabilities and a security solution with a timestamp,” IEEE Aerospace and Electronic Systems Magazine, vol. 32, no. 11, pp. 52–61, 2017.
  • [20] G. Hinton et al., “Deep neural networks for acoustic modeling in speech recognition: The shared views of four research groups,” IEEE Signal processing magazine, vol. 29, no. 6, pp. 82–97, 2012.
  • [21] RTCA, “Minimum operational performance standards for 1090 MHz extended squitter automatic dependent surveillance-broadcast (ADS-B) and traffic information services-broadcast (TIS-B),” RTCA DO-260B.
  • [22] RTL-SDR.COM, “RTL-SDR (RTL2832U) and software defined radio news and projects,” accessed: Dec 14, 2018. [Online]. Available: https://www.rtl-sdr.com/
  • [23] U. Madhow, Fundamentals of digital communication.   Cambridge University Press, 2008.
  • [24] F. Chollet et al., “Keras,” https://keras.io, 2015.
  • [25] “Dump1090 is a simple Mode S decoder for RTLSDR devices,” accessed: Dec 14, 2018. [Online]. Available: https://github.com/antirez/dump1090
  • [26] T. Chen and C. Guestrin, “XGBoost: A scalable tree boosting system,” in Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining.   ACM, 2016, pp. 785–794.
  • [27] D. W. Hosmer Jr, S. Lemeshow, and R. X. Sturdivant, Applied logistic regression.   John Wiley & Sons, 2013, vol. 398.
  • [28] J. A. Suykens and J. Vandewalle, “Least squares support vector machine classifiers,” Neural processing letters, vol. 9, no. 3, pp. 293–300, 1999.
  • [29] S. Ioffe and C. Szegedy, “Batch normalization: Accelerating deep network training by reducing internal covariate shift,” arXiv preprint arXiv:1502.03167, 2015.
  • [30] M. Sokolova and G. Lapalme, “A systematic analysis of performance measures for classification tasks,” Information Processing & Management, vol. 45, no. 4, pp. 427–437, 2009.