ORACLE: Optimized Radio clAssification through Convolutional neuraL nEtworks

12/03/2018 ∙ by Kunal Sankhe, et al. ∙ 0

This paper describes the architecture and performance of ORACLE, an approach for detecting a unique radio from a large pool of bit-similar devices (same hardware, protocol, physical address, MAC ID) using only IQ samples at the physical layer. ORACLE trains a convolutional neural network (CNN) that balances computational time and accuracy, showing 99% classification accuracy for a 16-node USRP X310 SDR testbed and an external database of >100 COTS WiFi devices. Our work makes the following contributions: (i) it studies the hardware-centric features within the transmitter chain that cause IQ sample variations; (ii) for an idealized static channel environment, it proposes a CNN architecture requiring only raw IQ samples accessible at the front-end, without channel estimation or prior knowledge of the communication protocol; (iii) for dynamic channels, it demonstrates a principled method of feedback-driven transmitter-side modifications that uses channel estimation at the receiver to increase differentiability for the CNN classifier. The key innovation here is to intentionally introduce controlled imperfections on the transmitter side through software directives, while minimizing the change in bit error rate. Unlike previous work that imposes constant environmental conditions, ORACLE adopts the 'train once deploy anywhere' paradigm with near-perfect device classification accuracy.


I Introduction

Sensing the wireless spectrum and identifying active radios within the bands of interest directly impacts spectrum usage. This paper takes the first step in distinguishing radios in a shared spectrum environment by using machine learning to detect characteristic reference signatures embedded in their transmitted electromagnetic waves, a process known as RF fingerprinting. Our goal is to achieve this with information that can be leveraged at the radio hardware front-end. We separately consider situations where the channel is unchanging between training and validation (idealized) and when the channel is dynamic (practical). The key innovation in our approach, termed ORACLE, is that it learns the unique modifications present within the in-phase (I) and quadrature-phase (Q) samples that are introduced in the signal as it passes through the transmitter chain. ORACLE uses Convolutional Neural Networks (CNNs) to learn and then identify individual radios through device-specific variations contributed by the inherent randomness in the manufacturing process. These so-called imperfections are present within the analog components (digital-to-analog converters, band-pass filters, frequency mixers and power amplifiers) that compose a typical transmission chain, differentiating radio devices even if their manufacturer and make/model are identical.

Figure 1: Typical transceiver chain with various sources of RF impairments.

I-A Signatures contained within IQ samples

Radio fingerprinting involves extracting unique patterns (or features) across the protocol stack that can be used as device signatures. Indeed, physical (PHY) layer, medium access control (MAC) layer, and upper layers have been utilized for radio fingerprinting [1]. However, simple unique identifiers such as IP addresses, MAC addresses, international mobile station equipment identity (IMEI) numbers can easily be spoofed. Location-based features such as radio signal strength (RSS), angle of arrival (AoA) and channel state information (CSI) are susceptible to mobility and environmental changes. ORACLE, instead, focuses on those transmitter features that are inherent to a device’s hardware makeup, which are unchanging and cannot be easily replicated by malicious agents.

Fig. 1 shows an example of these so-called transmitter signatures (rigorously studied in Sec. III) for a 16-QAM constellation. The red circles indicate the ideal constellation points formed by the I (x-axis) and Q (y-axis) samples, and the black crosses indicate actual constellation points that are shifted due to a specific type of hardware imperfection. Practical transmitters have a combination of these shifts that form their unique signatures, though we show only three plots, caused by IQ imbalance, nonlinear distortion and phase noise, in the figure. ORACLE aims to learn and intentionally modify some of these features on the transmitter through USRP Hardware Driver (UHD) software API commands, thereby enhancing identifiability/classifier efficiency. We note that ORACLE can be easily used in conjunction with other existing and higher layer classification approaches.

I-B Machine learning for RF fingerprinting in ORACLE

Machine learning (ML) techniques have shown great promise in image and speech identification problems, and are steadily gaining traction in applications within the wireless domain. ORACLE is solely built on a convolutional neural network architecture that has not only seen success in the above areas, but has also been previously used for modulation [2] and protocol identification [3]. ORACLE adopts a stagewise approach towards achieving practical classification. We attain this in the first step by demonstrating 99% accuracy on an externally obtained data set of 100+ COTS WiFi radios (not all of which are bit-similar), as well as on our testbed of 16 bit-similar USRP X310 radios that we configure to generate identical waveforms (same 802.11a PHY frame, modulation/protocol/MAC ID).

I-C The ORACLE approach

For radios operating in a channel-invariant environment (henceforth referred to as a static channel), ORACLE identifies radios by using only raw IQ samples. It neither estimates the channel, nor does it use any prior knowledge of the protocol being used. However, its performance degrades if the operating environment of the radio is changed. This is because the wireless channel often has a dominant impact on the transformation of the IQ samples in the complex plane. When the channel is varying (henceforth referred to as a dynamic channel), ORACLE is trained with complex demodulated symbols instead of raw IQ samples. This approach negates the effect of the channel while retaining the effect of hardware impairments only. Here we make an interesting observation: training with demodulated symbols makes low-end SDRs (such as the Ettus N210 USRP) robust to channel variations. However, high-performance SDRs (such as the X310 USRP) that are manufactured with components with lower variability need an additional step. For such high-end bit-similar devices, ORACLE has a principled method for intentionally introducing impairments to increase differentiability while minimizing the bit error rate (BER) for each transmitter. The key insight here is that controlled addition of impairments in a bit-similar radio generates a unique pattern in the demodulated signal at the receiver, which is independent of channel variations.

In summary, the main contributions of this paper are:

We study the different causes of transmitter-side reference signatures, and visualize their impact on the IQ constellation space. We identify specific features that are amenable to fine tuning by the receiver feedback using software APIs.

Using an SDR testbed and external database of 100+ devices, we propose the design of ORACLE, which includes a robust CNN architecture returning 99% device classification accuracy on static channels using only raw IQ samples.

We propose and implement an enhanced design of ORACLE on USRP X310 radios that systematically introduces controlled impairments to increase differentiability in high-end bit-similar SDRs, while ensuring the added BER at a common receiver is minimized. This is a critical step towards the 'train once deploy anywhere' paradigm that allows robust CNN learning under realistic channel variations.

II Related work

Publication | Approach
Franklin et al. [4] | Master DB of signatures for wireless device driver fingerprinting
Gao et al. [5] | Master DB of signatures for AP fingerprinting
Kennedy et al. [6] | k-NN based transmitter fingerprinting
Brik et al. [7] | SVM based NIC identification
Radhakrishnan et al. [8] | ANN based wireless device identification
O'Shea et al. [2] | CNN based modulation recognition
Chen et al. [9] | Infinite Hidden Markov Random Field based classification
Nguyen et al. [10] | Infinite Gaussian Mixture Model based device classification

Table I: Machine learning approaches for device fingerprinting.

While there exists a vast literature on the theory and applications of ML, we only review works that are directly relevant to the problem of RF fingerprinting, and within it, mainly supervised learning. Unsupervised learning, on the other hand, is effective when there is no prior label information about devices. For example, in [9], an infinite Hidden Markov Random Field (iHMRF)-based online classification algorithm is proposed for wireless fingerprinting using unsupervised clustering techniques and batch updates. Transmitter characteristics are used in [10], where a non-parametric Bayesian approach (namely, an infinite Gaussian Mixture Model) classifies multiple devices in an unsupervised, passive manner. However, in our approach we generate real data for each device independently; hence, labeling the device-specific dataset is an inexpensive task. Given the ground truth to facilitate model creation, we follow the supervised learning paradigm, where a large collection of labeled samples is used for training, prior to network deployment. There are two main approaches in this form of learning:

II-A Similarity-based

Similarity measurements involve comparing the observed signature of the given device with the references present in a master database. In [4], a passive fingerprinting technique is proposed that identifies the wireless device driver running on an IEEE 802.11 compliant node by collecting traces of probe request frames from the devices. A supervised Bayesian approach is used to analyze the collected traces and generate the device driver fingerprint. Gao et al. [5] describe a passive blackbox technique that uses TCP or UDP packet inter-arrival time to determine the type of access points using wavelet analysis. However, these techniques rely on prior knowledge of vendor-specific features.

II-B Classification-based

II-B1 Conventional

This form of classification examines a match with pre-selected features using domain knowledge of the system, i.e., the dominant feature(s) must be known a priori. Kennedy et al. [6] propose classification by extracting the known preamble within a packet and computing spectral components. A set of log-spectral-energy features are given as input to the k-nearest neighbors (k-NN) discriminatory classifier. PARADIS [7] fingerprints 802.11 devices based on modulation-specific errors in the frame using SVM and k-NN algorithms with an accuracy of 99%. In [8], a technique for physical device and device-type classification called GTID using artificial neural networks is proposed that exploits variations in clock skews as well as hardware compositions of the devices. However, as multiple different features are used, selecting the right set of features is a challenge. This also causes scalability problems when a large number of devices are present, leading to increased computational complexity in training.

II-B2 Deep Learning

Deep learning offers a powerful framework for learning complex functions, leverages large datasets, and greatly increases the number of layers, in addition to neurons within a layer. O'Shea and Corgan [2] and O'Shea and Hoydis [11] apply deep learning at the physical layer, specifically focusing on modulation recognition using IQ samples and convolutional neural networks. They classify 11 different modulation schemes. However, unlike ORACLE, this approach does not identify a device, but only the modulation type used by the transmitter.

To the best of our knowledge, ORACLE is the first work that allows training a CNN for bit-similar device identification such that the same classifier may operate in unknown/dynamic channel conditions without the need for new trials.

III A closer look at device signatures

In this section, we first study RF hardware impairments that cause variations in IQ samples, resulting in a unique signature for each device. We focus on IQ imbalance and DC offset, the two impairments that (i) are independent of the environment, and (ii) do not apply only in context of a specific transmitter-receiver pair (as opposed to, say, relative phase offset). Then, we present a method of introducing controlled impairments using GNU Radio UHD API at the receiver. Subsequently, we explain the experimental testbed setup for trace data collection.

III-A RF impairments

Using the MATLAB Communications System Toolbox, we simulate a typical wireless communications processing chain (see Fig. 1, with the shifts in the received complex valued IQ samples), and then modify the ideal operational blocks to introduce RF impairments, typically seen in actual hardware implementations. This allows us to individually study the IQ imbalance, DC offset, phase noise, carrier frequency offset and nonlinear distortions of power amplifier. In this paper, we focus on the two impairments (IQ imbalance and DC offset) owing to space constraints, though our approach can be trivially extended for others as well.

IQ imbalance: Quadrature mixers are often impaired by gain and phase mismatches between the parallel sections of the RF chain dealing with the I and Q signal paths. The mismatch in their gains causes amplitude imbalance, whereas phase deviation from $90^\circ$ in the quadrature signal results in phase imbalance. IQ imbalance varies only with frequency due to frequency-dependent low pass filters, and thus, it carries a unique signature of a transmitter for that frequency.
DC offset: This is caused within the quadrature mixers due to the finite isolation between Local Oscillator (LO) and RF ports of a mixer, and a direct feedthrough from the LO signal often gets coupled to the output.

III-B Software-based control of impairments

We first explain the use of self-calibration utilities provided by Ettus to set IQ imbalance and DC offset in the transmitter chain using GNU Radio functions.

IQ imbalance compensation: Let $x(t)$ be the transmitted baseband complex signal at time $t$ before being distorted by IQ imbalance. Then, the distorted baseband signal in the time domain is:

$$\tilde{x}(t) = \mu_t\, x(t) + \nu_t\, x^{*}(t) \qquad (1)$$

where the distortion parameters $\mu_t$ and $\nu_t$ are related to amplitude and phase imbalances in the I and Q paths of the quadrature mixer in the transmitter chain.

The simplified model of these distortion parameters can be written as $\mu_t = \cos(\theta_t/2) + j\epsilon_t \sin(\theta_t/2)$ and $\nu_t = \epsilon_t \cos(\theta_t/2) - j \sin(\theta_t/2)$, where $\epsilon_t$ and $\theta_t$ are the amplitude and phase imbalance between the I and Q signal paths at the transmitter, respectively. The phase imbalance is any phase deviation from the ideal $90^\circ$. The amplitude imbalance is defined as $\epsilon_t = (a_I - a_Q)/(a_I + a_Q)$, where $a_I$ and $a_Q$ are the respective gain amplitudes on the I and Q paths.

IQ imbalance causes interference in the signal by generating its image at a mirror frequency. It is quantified by measuring the power of the image with respect to the desired signal, also called the Image Rejection Ratio (IMRR), as shown in Fig. 2. The IMRR is calculated by sending a complex sinusoid $e^{j 2\pi f_0 t}$, and by taking the ratio of the power of the signal at the image frequency $-f_0$ and the desired frequency $f_0$. Thus, the IMRR for amplitude imbalance $\epsilon_t$ and phase difference $\theta_t$ is given by:

$$\mathrm{IMRR} = \frac{|\nu_t|^2}{|\mu_t|^2} = \frac{\gamma^2 + 1 - 2\gamma\cos\theta_t}{\gamma^2 + 1 + 2\gamma\cos\theta_t} \qquad (2)$$

where $\gamma = a_I/a_Q = (1+\epsilon_t)/(1-\epsilon_t)$.
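The link between the imbalance parameters and the image power can be checked numerically. The following is a minimal sketch, assuming the common widely-linear baseband model in which the distorted signal is mu*x(t) + nu*conj(x(t)), with mu = cos(theta/2) + j*eps*sin(theta/2) and nu = eps*cos(theta/2) - j*sin(theta/2). The function names, the tone frequency bin and the FFT length are illustrative choices and are not taken from the paper or from UHD.

```python
import numpy as np

def iq_imbalance_params(eps, theta):
    """Distortion parameters (mu, nu) of the assumed model, for amplitude
    imbalance eps and phase imbalance theta (in radians)."""
    mu = np.cos(theta / 2) + 1j * eps * np.sin(theta / 2)
    nu = eps * np.cos(theta / 2) - 1j * np.sin(theta / 2)
    return mu, nu

def apply_iq_imbalance(x, eps, theta):
    """Distort complex baseband samples x: mu * x + nu * conj(x)."""
    mu, nu = iq_imbalance_params(eps, theta)
    return mu * x + nu * np.conj(x)

def imrr_db_analytic(eps, theta):
    """Image-to-desired power ratio |nu|^2 / |mu|^2 in dB."""
    mu, nu = iq_imbalance_params(eps, theta)
    return 10 * np.log10(abs(nu) ** 2 / abs(mu) ** 2)

def imrr_db_measured(eps, theta, n=1024, k0=100):
    """Send a complex tone at FFT bin k0 through the impairment and compare
    the power at the image bin (-k0) with the power at the desired bin (+k0)."""
    t = np.arange(n)
    tone = np.exp(2j * np.pi * k0 * t / n)
    spectrum = np.fft.fft(apply_iq_imbalance(tone, eps, theta)) / n
    p_desired = abs(spectrum[k0]) ** 2
    p_image = abs(spectrum[-k0]) ** 2  # index -k0 is the bin at frequency -k0
    return 10 * np.log10(p_image / p_desired)

# Hypothetical impairment: 5% amplitude imbalance, 2 degrees phase imbalance.
eps, theta = 0.05, np.deg2rad(2.0)
print(imrr_db_analytic(eps, theta), imrr_db_measured(eps, theta))
```

Under these assumptions the simulated tone measurement and the closed-form ratio agree, which mirrors how a single-tone calibration sweep can estimate the image power for a given correction factor.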

Figure 2: Effect of IQ imbalance quantified through IMRR.

While many theoretical time and frequency domain methods allow compensation for the IQ imbalance, we use the Ettus provided UHD calibration utility uhd_cal_tx_iq_balance. It performs a calibration sweep over a range of frequencies checking the transmission path signal leakage into the receive path.

At runtime, the UHD software automatically applies the correction, typically a single complex factor, to the transmit chain of the RF daughterboard. SampIn is the complex sample input to the block, SampOut is the complex sample output and Corr is the correction factor. For a given value of the correction factor, a single frequency tone is transmitted, and the power of the desired tone and the image tone are measured to compute the IMRR.

We modified this utility to record the correction factors and the corresponding IMRR. Table II shows a snapshot of the recorded IMRR levels for a USRP X310 radio at a center frequency of 2.45 GHz.

Figure 3: Experimental setup for data collection using SDR.

Correction real | Correction imag. | Power of main tone | Power of image tone | IMRR (dB)
Table II: A snapshot of IMRR levels of IQ imbalance recorded using the uhd_cal_tx_iq_balance utility.

DC offset compensation: DC offset results in a large spike in the center of the spectrum. By measuring the power of the main tone at the DC frequency, we can measure the amount of DC offset. The UHD calibration utility uhd_cal_tx_dc_offset uses a single complex factor to correct the DC offset level. It finds the best correction factor that minimizes the power of the DC tone. Again, by modifying the utility, we record the DC offset level for each correction factor.
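The idea of searching for the complex correction that minimizes the DC tone can be illustrated in simulation. This is only a sketch under stated assumptions: the offset value, the candidate grid and the helper names are hypothetical, and the code operates on synthetic samples rather than calling the UHD utility or any radio hardware.

```python
import numpy as np

def dc_power_db(samples):
    """Power of the zero-frequency (DC) component in dB.
    The mean of the samples equals the DC bin of a length-normalized DFT."""
    dc = np.mean(samples)
    return 10 * np.log10(abs(dc) ** 2 + 1e-30)  # small floor avoids log(0)

def best_dc_correction(samples, candidates):
    """Return the candidate complex correction that leaves the least DC power."""
    return min(candidates, key=lambda c: dc_power_db(samples + c))

# Synthetic test tone with a hypothetical LO-leakage offset added.
n, k0 = 1024, 64
tone = np.exp(2j * np.pi * k0 * np.arange(n) / n)
offset = 0.02 - 0.01j
impaired = tone + offset

# Coarse sweep over candidate corrections, loosely analogous to a calibration sweep.
axis = np.linspace(-0.05, 0.05, 11)
grid = [complex(re, im) for re in axis for im in axis]
corr = best_dc_correction(impaired, grid)
print(corr, dc_power_db(impaired), dc_power_db(impaired + corr))
```

Choosing a correction other than the minimizing one leaves a residual DC component, which is the mechanism by which a controlled level of DC offset can be introduced deliberately.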

We use the open-source GNU Radio Companion (GRC) to transmit standard-compliant IEEE 802.11a WiFi packets through the SDR. Using the set_iq_balance and set_dc_offset functions in GRC, these two separate complex correction factors can be set to intentionally introduce the required level of impairments in the radio.

III-C Experimental setup for Trace Data collection

We study the performance of the CNN using IQ samples collected from an experimental setup of USRP SDRs, as shown in Fig. 3, with a fixed USRP B210 as the receiver. All transmitters are bit-similar USRP X310 radios that emit IEEE 802.11a standards compliant frames generated via a MATLAB WLAN System toolbox. The data frames generated contain random payload but have the same address fields, and are then streamed to the selected SDR for over-the-air wireless transmission. The receiver SDR samples the incoming signals at sampling rate at center frequency of for WiFi. The collected complex IQ samples are partitioned into subsequences. For our experimental study, we set a fixed subsequence length of , i.e., the length of contiguous samples that will be used at a time for training and classification. Overall, we collect over million samples for each radio, subsequently divided into training, validation and test set.

IV CNN architecture for static channels

IV-A Classifier architecture

Figure 4: Our proposed CNN architecture with two convolution and two fully connected layers.

For static channels, we design a CNN architecture that uses raw time-series IQ samples generated from the 16-node USRP X310 SDR testbed and the external database of COTS WiFi devices. Our proposed CNN architecture, as shown in Fig. 4, is partly inspired by AlexNet [12], a deep CNN architecture specifically designed to classify 1.2 million high-resolution images available in the ImageNet dataset into 1000 different classes. Unlike AlexNet, which is made up of 8 layers (5 convolution and 3 fully connected), our CNN architecture consists of four layers, with two convolution layers and two fully connected (or dense) layers. The input to our CNN is a windowed sequence of raw IQ samples with length 128. We choose a sliding window approach to partition the training samples, which enhances the shift invariance of the features learned by the CNN. Each complex value is represented as two real values (i.e., I and Q are two real-valued streams), which results in the dimension of our input data growing to $2 \times 128$. This is then fed to the first convolution layer. The convolution layer consists of a set of spatial filters, also called kernels, that perform a convolution operation over input data to extract the features. The first convolution layer consists of 50 filters, each of size $1 \times 7$, in which each filter learns a 7-sample variation in time over the I or Q dimension separately, to generate 50 distinct feature maps over the complete input sample. Similarly, the second convolution layer has 50 filters, each of size $2 \times 7$, and each filter learns variations, again of 7 activation values, over both I and Q dimensions of the 50-dimensional activation volume obtained after the first convolution layer. Each convolution layer is followed by a Rectified Linear Unit (ReLU) activation, which performs a pre-determined non-linear transformation on each element of the convolved output.
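The sliding-window partitioning and the split of each complex sample into I and Q rows can be sketched as follows. The window length of 128 comes from the text; the stride, the function name and the (windows, 2, length) array layout are assumptions made for illustration.

```python
import numpy as np

def sliding_windows(iq, length=128, stride=1):
    """Partition a 1-D complex IQ sequence into overlapping windows and split
    each window into an I row and a Q row.
    Returns a real array of shape (num_windows, 2, length)."""
    iq = np.asarray(iq)
    if iq.size < length:
        raise ValueError("sequence is shorter than the window length")
    starts = np.arange(0, iq.size - length + 1, stride)
    idx = starts[:, None] + np.arange(length)[None, :]
    windows = iq[idx]                      # (num_windows, length), complex
    return np.stack([windows.real, windows.imag], axis=1)

# Toy sequence: I counts up, Q counts down, so window offsets are easy to read.
iq = np.arange(300) - 1j * np.arange(300)
batch = sliding_windows(iq, length=128, stride=64)
print(batch.shape)
```

A stride smaller than the window length makes consecutive training examples overlap, so the same hardware-induced pattern appears at many positions within the input, which is one way to encourage shift-invariant features.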

The output of the second convolution layer is then provided as input to the first fully connected layer, which has 256 neurons. A second fully connected layer of 80 neurons is added to extract higher level non-linear combinations of the features extracted from previous layers, which are finally passed to a classifier layer. A softmax classifier is used in the last layer to output, for each sample fed to the CNN, the probability of each device class. The choice of hyperparameters such as filter size, number of filters in the convolution layers and the depth of the CNN is of high importance to ensure that our CNN model generalizes well. These are chosen carefully through cross validation. In order to overcome overfitting, we set the dropout rate to 50% at the dense layers. We also use an regularization parameter . The weights of the network are trained using the Adam optimizer with a learning rate of . We minimize the prediction error through back-propagation, using categorical cross-entropy as a loss function computed on the classifier output. We implement our CNN architecture in Keras running on top of TensorFlow on a system with 8 NVIDIA CUDA-enabled Tesla K80m GPUs.
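To make the layer dimensions concrete, the forward pass of the described architecture can be walked through in a framework-agnostic way. This is a shape sketch, not the authors' Keras model: it assumes 'valid' convolutions with stride 1, filter sizes of 1x7 and 2x7, ReLU on the dense layers, no bias terms and random weights, and it leaves out the training-time pieces (dropout, regularization, Adam). All function names are illustrative.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def conv2d_valid(x, w):
    """Cross-correlation without padding.
    x: (C_in, H, W); w: (C_out, C_in, kH, kW) -> (C_out, H-kH+1, W-kW+1)."""
    patches = sliding_window_view(x, w.shape[2:], axis=(1, 2))
    return np.einsum("chwij,ocij->ohw", patches, w)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    e = np.exp(z - z.max())  # subtract max for numerical stability
    return e / e.sum()

def init_params(num_devices, rng, scale=0.05):
    """Random weights with the shapes implied by the assumed layer sizes."""
    return {
        "conv1": scale * rng.standard_normal((50, 1, 1, 7)),
        "conv2": scale * rng.standard_normal((50, 50, 2, 7)),
        "fc1": scale * rng.standard_normal((256, 50 * 1 * 116)),
        "fc2": scale * rng.standard_normal((80, 256)),
        "out": scale * rng.standard_normal((num_devices, 80)),
    }

def forward(x, params):
    """x: one (2, 128) window with I and Q rows. Returns class probabilities."""
    h = relu(conv2d_valid(x[None], params["conv1"]))  # (50, 2, 122)
    h = relu(conv2d_valid(h, params["conv2"]))        # (50, 1, 116)
    h = h.reshape(-1)                                 # 5800 features
    h = relu(params["fc1"] @ h)                       # 256
    h = relu(params["fc2"] @ h)                       # 80
    return softmax(params["out"] @ h)                 # one entry per device

rng = np.random.default_rng(0)
probs = forward(rng.standard_normal((2, 128)), init_params(16, rng))
print(probs.shape, probs.sum())
```

With these assumptions the first layer slides along time within each of the I and Q rows, and the second layer collapses the two rows into one, so the dense layers see features that combine both components.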

IV-B Preliminary results

Figure 5: Box plot for the classification of WiFi devices using CNN.

Our preliminary evaluation aims to demonstrate the accuracy of ORACLE’s CNN architecture for classifying radios under static conditions, and it also motivates the need for receiver-feedback driven modifications for dynamic channels using techniques described in Sec. III-B.

IV-B1 Accuracy in static channel conditions

First, we verify the performance of our proposed CNN to classify COTS WiFi devices using an external database, which contains labeled IQ samples collected from 140 devices (phones/tablets/laptops/drones) of 122 manufacturers. For each device, we use 4.5K windowed examples as training set and 1K examples as test set, based on available samples in the database. A validation set of 300 examples for each device is used at each training epoch to monitor the performance on unseen data, and the training process is stopped if the validation accuracy does not increase for 10 consecutive training epochs. The training time for this experiment using all 140 devices is . ORACLE's performance is shown in Fig. 5 with the minimum accuracy, first quartile, median, third quartile, and maximum accuracy for each dataset. Here, the X-axis represents the number of randomly chosen devices, whereas the classification accuracy is shown on the Y-axis. Up to 100 different devices, we obtain a median accuracy of 99%, whereas it is 96% for 140 devices. We note that while the number of radios is large, these devices are not bit-similar. Hence, we 'stress-test' our classifier using IQ samples collected from 16 high-end X310 USRP SDRs that present a narrower range of impairments, with the same B210 radio as a receiver. Our training set for this experiment consists, per radio, of windowed training examples and examples for validation. We use another examples for each device to test the performance of our trained model. It takes with our current setup to train the model for 16 radios. Also for this setup, we obtained 98.6% accuracy on the test set, shown in Fig. 6(a).
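The stopping rule described above (halt when validation accuracy has not improved for 10 consecutive epochs) can be sketched independently of any training framework. The callback name run_epoch and the returned tuple are hypothetical, and "does not increase" is interpreted here as "does not exceed the best value seen so far".

```python
def train_with_early_stopping(run_epoch, max_epochs=1000, patience=10):
    """run_epoch(epoch) trains for one epoch and returns validation accuracy.
    Stops once `patience` consecutive epochs fail to beat the best accuracy.
    Returns (best_epoch, best_accuracy, epochs_run)."""
    best_acc, best_epoch, epochs_run = float("-inf"), -1, 0
    for epoch in range(max_epochs):
        acc = run_epoch(epoch)
        epochs_run = epoch + 1
        if acc > best_acc:
            best_acc, best_epoch = acc, epoch
        elif epoch - best_epoch >= patience:
            break
    return best_epoch, best_acc, epochs_run

# Toy validation curve: improves for three epochs, then plateaus below the best.
history = [0.5, 0.6, 0.7] + [0.65] * 50
result = train_with_early_stopping(lambda e: history[e], max_epochs=len(history))
print(result)
```

In practice one would also keep a copy of the weights from the best epoch, since the final epochs before stopping are by construction no better than that checkpoint.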

IV-B2 Limitations of raw IQ samples in dynamic channels

Figure 6: Confusion matrix relative to two experiments with same devices and different locations: (a) overall accuracy is ; (b) overall accuracy is .
Figure 7: (a) Estimated channel gain for subcarrier for each radio (b) Magnitude of estimated channel for all radios (ordered from lower to higher).

Multipath reflection and fading have a considerable impact on received IQ samples, at times distorting them to the point where the classifier no longer correctly identifies the radios. Typically, the effect of the channel is compensated by channel estimation and equalization techniques to correctly retrieve over-the-air transmitted data. Thus, as we show next, classification performance degrades severely when either (i) classifiers are trained on raw IQ samples under a given channel and then tested on IQ samples obtained under different channels, or (ii) transmitters experience very similar channel conditions.

Fig. 6(a) shows the classification accuracy of 16 X310 radios, with near-perfect results for all the devices. However, Fig. 6(b) shows the same setup in a different location where several outliers exist, as the confusion matrix shows, e.g., see radio pairs (5,15), (10, 14). The reason is that the similarity in the wireless channel experienced by certain transmitter pairs dominates subtle hardware variations. Given a set of radios, represents the average channel gain in subcarrier of each radio , estimated over WiFi packets belonging to the training dataset.

Fig. 7(a) and 7(b) reveal how received samples from transmitters with smaller differences in channel estimation are more likely to be misclassified by ORACLE during testing. This shows that the wireless channel state affects the distribution of complex symbols captured by the receiver in a non-negligible manner, and therefore becomes a discriminating factor when the classifier is trained with raw IQ samples. If we take a pre-trained model and use it to classify samples collected from the same devices but at different times or locations, the classification result is unpredictable. See Fig. 8(a), 8(b) and 8(c) for the classification results showing the time and location dependence of the trained classifier.

Figure 8: (a) Classification accuracy for 4 devices tested at time and location ; (b) time and same location ; (c) time and different location .

V ORACLE with Feedback for Dynamic Channels

This section describes the enhancements in ORACLE that allow it to robustly classify transmitters in unseen environments. The two main assumptions here are: (i) instead of raw IQ samples, ORACLE works with demodulated symbols, and (ii) in a pre-deployment phase, the receiver provides feedback to the transmitter to incorporate controlled impairments.

V-A Impact of impairments on demodulated symbols

ORACLE modifies the transmitter chain of the SDRs such that their respective demodulated symbols acquire unique characteristics that make the CNN robust to channel changes, i.e., it makes the transmitter hardware dominate channel induced variations. We first validate the hypothesis that a given combination of impairments results in repeatability in the outcome of the classification. To demonstrate this, consider demodulated symbols received from two X310 radios, over cable and air channels, as shown in Fig. 9, for three different levels of IQ imbalance. The first row shows slight differences in the demodulated samples when the channel is completely changed (i.e., air to cable) for the same transmitter. In the second row, when the same channel is maintained, but the transmitters themselves are different, adding the same level of IQ imbalance results in virtually the same pattern in each case, ensuring repeatability and robustness.

Figure 9: Patterns generated by 3 impairments on 2 devices under 2 channel conditions. First and second row show the channel- and device- invariance of the patterns respectively.

We also quantitatively analyze the channel- and device-invariance of the patterns with the Earth Mover's Distance (EMD), a widely used metric to measure similarities between two multi-dimensional distributions. More precisely, suppose we have two sets of points in $\mathbb{R}^2$. Let $A$ and $B$ be two subsets of equal size, i.e., $|A| = |B|$. Let $\Pi$ be the set of all possible bijections (one-to-one and onto mappings) from $A$ to $B$. The EMD between $A$ and $B$ is given by:

$$\mathrm{EMD}(A, B) = \min_{\pi \in \Pi} \sum_{a \in A} \lVert a - \pi(a) \rVert_2 \qquad (3)$$

In other words, EMD is given by the smallest possible sum of Euclidean distances between points in $A$ and $B$, over all possible valid bijections $\pi \in \Pi$. Smaller EMD indicates more similarity between two patterns and vice versa. Fig. 10 (a) and (b) show the EMD matrix of patterns generated on different channel conditions and devices, respectively, with the same set of impairments as in Fig. 9. We see that the EMD values on the matrix diagonal, which represent the patterns generated by the same impairments, are much lower than the EMD of patterns generated by different impairments. We further evaluate the EMD for the demodulated signal collected under 3 different channel conditions, 4 devices and 32 different levels of impairments. We see that the average EMD remains around 0.1 and 0.2 for patterns generated by the same and different levels of impairments, respectively, despite the variations caused by channel conditions. This result matches closely with Fig. 10 and verifies our intuition.
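The definition of EMD as a minimum over bijections can be written down directly for small point sets. The sketch below enumerates every bijection, so it is only practical for a handful of points; the function name and the toy point sets are illustrative, and any normalization the authors may apply (for example averaging per point) is not known from the text and is not included.

```python
from itertools import permutations
import math

def emd(a, b):
    """Earth Mover's Distance between two equal-size lists of 2-D points,
    computed as the smallest total Euclidean distance over all bijections."""
    if len(a) != len(b):
        raise ValueError("point sets must have equal size")
    return min(
        sum(math.dist(p, b[j]) for p, j in zip(a, perm))
        for perm in permutations(range(len(b)))
    )

# Toy constellations on the IQ plane.
pattern_1 = [(0.0, 0.0), (1.0, 0.0)]
pattern_2 = [(1.0, 0.0), (0.0, 1.0)]
print(emd(pattern_1, pattern_2))
```

Because the number of bijections grows factorially, larger sets call for an assignment solver instead, for example scipy.optimize.linear_sum_assignment applied to the pairwise distance matrix, which finds the same minimum in polynomial time.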

Figure 10: The EMD matrix of patterns generated (a) under different channel conditions; (b) on different devices.

V-B Identifying feasible impairments

The naive approach of introducing random combinations of impairments before training the CNN has three problems:

  1. Scalability: If a new transmitter is introduced in the network, then we have to re-train the entire CNN, which is a time- and computation-heavy process.

  2. Accuracy: It is possible that demodulated samples originating from two different transmitters (previously, easily differentiable) now appear clustered together owing to the modification in their placement on the IQ plane. This may reduce the performance of the classifier.

  3. Communication impact: Adding impairments naturally increases the BER. Hence judicious and controlled addition is needed to limit any adverse impact on BER.

To solve these issues, ORACLE automatically selects feasible impairments that produce IQ sample constellation points that are significantly different from each other, while minimizing the influence on the BER for the transmitter. This step allows ORACLE to pre-train on virtual radio transmitter chains (constructed in GNU Radio), as the impairments dominate other variations introduced by the radio's own hardware and the wireless channel. Thus, ORACLE learns the impairment patterns, which we have shown in Fig. 9 to be both device and channel agnostic, i.e., two different radios will result in a similar demodulated IQ pattern at the receiver under the same impairment. This approach greatly increases the flexibility of ORACLE: if a new transmitter is added, we simply assign it one of the feasible and uncommitted impairments, without any need to re-train the CNN.

Figure 11: (a) BER vs. IMRR value of IQ imbalance; (b) BER vs. DC offset level for different SNRs.

We use a generic X310 USRP radio that operates in a loop while automatically adding impairments to its hardware through the utilities uhd_cal_tx_iq_balance and uhd_cal_tx_dc_offset for IQ imbalance and DC offset, respectively. Then the transmitter sends a stream of known data over cable to the B210 USRP receiver that checks the BER. For our experiment, we consider different levels of IQ imbalance with IMRR value ranging from  dB to  dB and levels of DC offset ranging from  dB to  dB. The BER plots are shown in Fig. 11(a) and Fig. 11(b) for different SNR levels, which we concisely refer to as an impairment map , and use it later in Sec. V-D. The bounds on the impairments depend on the SNR that the radios operate in. For example, our lab has a noise floor of  dBm, for which we assume an average  dB SNR level with the constraint on BER of . Accordingly, we choose upper bound  dB on IMRR for IQ imbalance and  dB for DC offset level.

We next explain how to identify the feasible set of impairments, out of all impairment combinations, that satisfy the BER constraint. Specifically, consider the vector of candidate IQ imbalance levels, ordered so that the corresponding BER values form an increasing sequence. The last level that still meets the constraint is therefore the maximum IQ imbalance we can add without exceeding the BER constraint. Note that the BER constraint is evaluated under the ideal SNR level. We start from the first level, since it has the smallest impact on the communication, and add successive levels to the feasible set one at a time. However, a new level is eligible to be added only if the EMD between the pattern it generates and the pattern of every level already in the set is larger than a threshold. As we have seen in Sec. V-A, this threshold allows for an acceptable buffer in evaluating how close a given IQ pattern is to another. After we have reached the maximum level, we configure the radio with a different type of impairment and continue until the feasible set holds as many impairments as there are bit-similar radios.
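The selection procedure above can be sketched as a short greedy loop. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `select_feasible_levels` and its parameters are hypothetical, the BER model `toy_ber` is invented, and the "pattern" of a level is reduced to a single number with an absolute difference standing in for the EMD between IQ patterns.

```python
def select_feasible_levels(levels, ber_of, pattern_of, distance, ber_limit, min_distance):
    """Greedily build a set of mutually distinguishable impairment levels.

    Levels are visited in order of increasing BER. A level is kept only if its
    pattern is farther than min_distance from every pattern already kept.
    """
    selected = []
    for level in sorted(levels, key=ber_of):
        if ber_of(level) > ber_limit:
            break  # all remaining levels have an even higher BER
        if all(distance(pattern_of(level), pattern_of(s)) > min_distance for s in selected):
            selected.append(level)
    return selected


# Toy stand-ins (assumptions, not measured data).
candidate_levels = [-21.0 + 0.25 * i for i in range(31)]  # -21.0 ... -13.5 dB


def toy_ber(level_db):
    return 10 ** (level_db / 4.0)  # BER grows with the impairment level


def toy_pattern(level_db):
    return level_db  # a real pattern would be a 2-D IQ histogram


def toy_distance(a, b):
    return abs(a - b)  # placeholder for the EMD between two patterns


feasible = select_feasible_levels(
    candidate_levels, toy_ber, toy_pattern, toy_distance,
    ber_limit=2e-4, min_distance=0.4,
)
```

In this toy setting every second candidate is rejected because its pattern is too close to one already kept, and the walk stops at the first level whose BER exceeds the limit.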

V-C CNN classifier using transmitter-side impairments

In this section, we discuss how to train the classifier for the patterns (see Sec. V-B). We reuse the same CNN architecture and input data format as described in Sec. IV. Note that all IQ samples for training are collected over the cable, i.e., we remove the influence of the wireless channel so that the CNN can capture the pattern generated solely by hardware impairments.

ORACLE deliberately introduces random noise into the original data to increase the size and variability of the initial dataset before it is input to the classifier, a technique commonly used in deep learning. Since low SNR of the received samples results in scattering around the ideal constellation point location within the IQ plane, the noise is modeled as a Gaussian variable. We note that noise may result in an altered demodulated IQ sample pattern that is different from the original one, as shown in Fig. 12. To finely control the possible variations, we maintain the EMD under 0.1 after adding noise, since two sample patterns up to this level are still similar to each other (see Sec. V-A). Thus, adding noise power less than -13 dB ensures that the EMD between original and altered patterns is below this threshold.

Figure 12: Pattern generated with (a) original (demodulated) data; (b) data after adding -17 dB noise, EMD with (a): 0.07; (c) data after adding -9 dB noise, EMD with (a): 0.18.
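The noise-based augmentation described in this section might look like the following sketch. It assumes that the noise power in dB is expressed relative to a unit average symbol power, which the text does not state, and it uses ideal QPSK points as stand-in demodulated data. The function name `add_gaussian_noise` is hypothetical, and the EMD check against the original pattern is left out.

```python
import math
import random


def add_gaussian_noise(iq_samples, noise_power_db, rng):
    """Return a noisy copy of iq_samples.

    noise_power_db is the total complex noise power in dB, assumed to be
    relative to unit signal power. The power is split equally between I and Q.
    """
    noise_power = 10 ** (noise_power_db / 10.0)
    sigma = math.sqrt(noise_power / 2.0)  # standard deviation per component
    return [s + complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma)) for s in iq_samples]


# Stand-in "demodulated data": ideal unit-power QPSK symbols (an assumption).
scale = 1.0 / math.sqrt(2.0)
qpsk = [complex(i * scale, q * scale) for i in (-1, 1) for q in (-1, 1)]
clean = qpsk * 5000

rng = random.Random(0)
noisy = add_gaussian_noise(clean, -13.0, rng)

# Empirical noise power, to compare against the requested 10 ** (-13 / 10).
measured_power = sum(abs(n - c) ** 2 for n, c in zip(noisy, clean)) / len(clean)
```

Calling the function several times with different random draws yields multiple augmented copies of the same capture; in a fuller pipeline each copy would be kept only if its EMD to the original pattern stays under the chosen threshold.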

V-D Allocation of specific transmitters to impairments

The main challenge in adding impairments is that doing so increases the BER and degrades the quality of service. In addition, the degradation caused by impairments differs across radios operating at different SNR levels (as shown in Fig. 11). The lower the SNR, the less impairment we may add to a radio while still meeting the required BER. We discuss how to solve this problem in this section, assuming the SNR measurements at the receiver side are quasi-static over a fixed-duration time slot, so that the SNR levels can be averaged within each such slot.

Problem formulation: Given a set of radios and the average SNR level of each radio, we need to select impairments that minimize the BER of each transmitter, taking into account its average SNR level at the receiver.

We solve this problem using a greedy heuristic similar to the one we used in Sec. V-A to generate unique patterns. Without loss of generality, consider IQ imbalance, with the set of selected IMMR levels and the impairment map giving the mapping from SNR levels to the maximum IQ imbalance that maintains the BER (see Sec. V-B). Then, for each radio, we use its average SNR to look up in the impairment map the maximum IQ imbalance it can tolerate.

Following this step, we sort the radios in increasing order of this maximum, i.e., according to the maximum IQ imbalance that can be added to each. We then create two empty sets, which denote classifiable and unclassifiable radios, respectively. Next, we iteratively allocate the selected IMMR levels, from lowest to highest, to the radios in sorted order. As long as the allocated level does not exceed the radio's maximum, we place the radio in the classifiable set. Otherwise, no feasible IQ imbalance can be added to the radio without exceeding the BER limit, and we put the radio in the unclassifiable set. After we have explored all radios, if the unclassifiable set is not empty, we repeat the above process with a second type of impairment (e.g., DC offset) until all radios have been put in the classifiable set.

In summary, allocating the impairment from low to high makes sure that we are minimizing the degradation in the BER.
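One possible reading of this allocation step is sketched below. It is an illustration under assumptions rather than the authors' code: the function name `allocate_impairments`, the radio labels, and all numeric limits are hypothetical, and a larger (less negative) IMMR value is taken to mean a stronger impairment.

```python
def allocate_impairments(max_level_by_radio, levels):
    """Assign distinct impairment levels to radios, mildest level first.

    max_level_by_radio maps a radio id to the strongest level it tolerates
    under the BER limit (None if no level is tolerable).
    Returns (assignment, unclassifiable).
    """
    available = sorted(levels)  # mildest impairment first

    def tolerance(radio):
        limit = max_level_by_radio[radio]
        return float("-inf") if limit is None else limit

    assignment, unclassifiable = {}, []
    # Radios that tolerate the least impairment are served first.
    for radio in sorted(max_level_by_radio, key=tolerance):
        limit = max_level_by_radio[radio]
        if available and limit is not None and available[0] <= limit:
            assignment[radio] = available.pop(0)
        else:
            unclassifiable.append(radio)
    return assignment, unclassifiable


# Hypothetical per-radio limits (dB) derived from an impairment map.
limits = {"A": -13.5, "B": -18.0, "C": -20.5, "D": -20.5}
iq_levels = [-21.0, -20.0, -19.0, -18.0]

assignment, unclassifiable = allocate_impairments(limits, iq_levels)
```

Here radio D cannot take the next available IQ imbalance level without exceeding its limit, so it lands in the unclassifiable list. A second pass over that list with a different impairment type, such as DC offset, would reuse the same function with a different set of levels and limits.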

VI Performance evaluation

Figure 13: Two different experimental environments: (a) closed lab area (location 1); (b) open recreation area with far fewer reflections (location 2).

In this section, we present the performance of ORACLE showing: (1) it increases the classification accuracy for bit-similar radios, and that accuracy is not influenced by variation in wireless channel conditions (Sec. VI-A); (2) it minimizes the BER changes due to the hardware impairments without sacrificing classification performance (Sec. VI-B).

Experiment setup: We first identify a set of 32 impairments that generate unique patterns, as discussed in Sec. V-B. Next, we collect demodulated data from WiFi packets that are transmitted over a cable from a single radio, after introducing these impairments through the GNU Radio API. We replicate and augment the demodulated data by adding random Gaussian noise. We limit the noise power to under -13 dB to ensure that the EMD between patterns generated from the original and altered data lies below the threshold of 0.1. Finally, we train the classifier with the augmented dataset using the same CNN architecture as described in Sec. IV.

VI-A Classification accuracy with different channel conditions

We test the performance of the trained CNN classifier with X310 radios. To do so, we first collect samples from these radios through cable. Each radio is configured with one of the impairments selected from the feasible set, according to the approach described in Sec. V-D. As shown in Fig. 13(a), ORACLE easily distinguishes bit-similar radios into which the selected impairments are intentionally introduced, achieving a classification accuracy of %. This indicates that our pre-trained classifier is able to identify bit-similar radios accurately.

Next, we evaluate the performance of ORACLE with data collected over the wireless channel. To show robustness to variation in channel conditions, we conduct the experiments in two different locations: (1) our lab, which represents a typical indoor environment (Fig. 12(a)), and (2) a more open recreation area, which has fewer reflections (Fig. 12(b)). The confusion matrices are shown in Fig. 13(b) and Fig. 13(c), respectively. In both environments, ORACLE achieves higher than % accuracy, which shows that the unique patterns created by the impairments can still be detected, even with random noise.

In comparison, training the same classifier with these X310 devices without any kind of artificially introduced hardware impairments results in poor classification performance. As shown in Fig. 13(d), the classification accuracy is only 35.96% for these bit-similar radios, which shows the benefits of the careful impairment allocation process.

Figure 14: Classification accuracy (a) via cable; (b) over air in location 1 (Fig. 12(a)); (c) over air in location 2 (Fig. 12(b)). (d) shows the accuracy without ORACLE (data collected in location 2).

VI-B Reduced BER with heuristic impairments selection

We use the average total sum of BER over all transmitters as the metric and compare the results of allocating impairments (i) randomly and (ii) greedily, using the algorithm described in Sec. V-D. We consider radios to have average SNR values selected randomly among {20,25,30} dB. Let IQ imbalance be the only impairment added, bounded by an IMMR value of -13.5 dB. We consider 16 available impairment levels that range from an IMMR of -13.5 dB to -21 dB with 0.5 dB separation. At each selection we ensure that the CNN classifies with these impairment levels at accuracy.

Under the random allocation approach, each radio is randomly allocated one of the 16 selected impairment levels. In contrast, our greedy heuristic algorithm iteratively assigns the lowest available impairment level to the radio with the lowest average SNR. The BER value of each radio is computed from the curves for different SNR levels shown in Fig. 10(a). We run 1000 iterations, in each of which every radio is randomly assigned one SNR level. In each iteration, the random allocation strategy assigns a unique, randomly chosen impairment level to each radio. We repeat this allocation 500 times and average the total sum of BER of all the radios over these 500 repetitions for the given SNR assignment. This is then averaged again over the 1000 SNR assignments. Similarly, we compute the total sum of BER of all the radios obtained using the greedy heuristic algorithm, averaged over the 1000 SNR assignments. Table III shows the BER of all radios, confirming that ORACLE’s approach of allocating impairments always outperforms random allocation.

Table III: BER comparison between random and greedy heuristic impairments allocation. Columns: number of radios; average total sum of BER under random allocation; average total sum of BER under the greedy heuristic.
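The averaging procedure behind this comparison can be mimicked with a small Monte Carlo sketch. The BER model `toy_ber` below is entirely invented (a product of a term that grows with the impairment level and a term that shrinks with SNR) and stands in for the measured curves. For a model of that product form, pairing the mildest levels with the lowest SNRs is guaranteed to give the smallest total, so the sketch only illustrates the procedure and does not reproduce the reported numbers. All function names are hypothetical.

```python
import random

LEVELS_DB = [-21.0 + 0.5 * i for i in range(16)]  # 16 IMMR levels, -21.0 ... -13.5 dB
SNR_CHOICES_DB = [20, 25, 30]


def toy_ber(level_db, snr_db):
    """Invented BER model: increases with impairment level, decreases with SNR."""
    return 10 ** (level_db / 10.0) * 10 ** (-snr_db / 10.0)


def total_ber(levels, snrs):
    return sum(toy_ber(level, snr) for level, snr in zip(levels, snrs))


def greedy_total(snrs):
    # The mildest levels go to the radios with the lowest SNR.
    return total_ber(sorted(LEVELS_DB)[:len(snrs)], sorted(snrs))


def random_total(snrs, rng, repeats):
    # Average over repeated random assignments of distinct levels.
    totals = [total_ber(rng.sample(LEVELS_DB, len(snrs)), snrs) for _ in range(repeats)]
    return sum(totals) / repeats


def compare(num_radios, snr_draws=1000, repeats=500, seed=0):
    """Return (random average, greedy average) of the total BER."""
    rng = random.Random(seed)
    random_sum = greedy_sum = 0.0
    for _ in range(snr_draws):
        snrs = [rng.choice(SNR_CHOICES_DB) for _ in range(num_radios)]
        random_sum += random_total(snrs, rng, repeats)
        greedy_sum += greedy_total(snrs)
    return random_sum / snr_draws, greedy_sum / snr_draws


# Smaller run counts than the defaults keep this example quick.
random_avg, greedy_avg = compare(num_radios=8, snr_draws=200, repeats=50)
```

The default arguments of `compare` mirror the 1000 SNR assignments and 500 random allocations described above. Replacing `toy_ber` with a lookup into measured BER curves would be the natural next step.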

VII Conclusion

We presented ORACLE, a fingerprinting technique for identification of specific radios based on the hardware-centric features within the transmitter chain. We showed that our CNN classifier achieves an accuracy of using raw IQ samples for COTS WiFi devices and 16 X310 USRP radios in a static environment. To further improve the classification accuracy in dynamic environments, we showed how feedback-driven transmitter-side modifications can increase differentiability for bit-similar devices. The key innovation lies in its ‘train once and deploy anywhere’ feature. We demonstrate experimental accuracy with bit-similar X310 radios, regardless of different channel conditions and wireless transmission environments.

Acknowledgment

This work is supported by DARPA under RFMLS program contract N00164-18-R-WQ80. We are grateful to Paul Tilghman, program manager at DARPA, and Esko Jaska for their insightful comments and suggestions.

References

  • [1] Q. Xu, R. Zheng, W. Saad, and Z. Han, “Device fingerprinting in wireless networks: Challenges and opportunities,” IEEE Communications Surveys & Tutorials, vol. 18, no. 1, pp. 94–104, Firstquarter 2016.
  • [2] T. J. O’Shea and J. Corgan, “Convolutional radio modulation recognition networks,” 2016. [Online]. Available: http://arxiv.org/abs/1602.04105
  • [3] A. Selim, F. Paisana, J. A. Arokkiam, Y. Zhang, L. Doyle, and L. A. DaSilva, “Spectrum monitoring for radar bands using deep convolutional neural networks,” in IEEE GLOBECOM 2017
  • [4] J. Franklin, D. McCoy, P. Tabriz, V. Neagoe, J. Van Randwyk, and D. Sicker, “Passive data link layer 802.11 wireless device driver fingerprinting,” in ACM USENIX Security Symposium - Volume 15, 2006
  • [5] K. Gao, C. Corbett, and R. Beyah, “A passive approach to wireless device fingerprinting,” in IEEE DSN 2010, June 2010, pp. 383–392.
  • [6] I. O. Kennedy, P. Scanlon, F. J. Mullany, M. M. Buddhikot, K. E. Nolan, and T. W. Rondeau, “Radio transmitter fingerprinting: A steady state frequency domain approach,” in IEEE VTC, Sept 2008, pp. 1–5.
  • [7] V. Brik, S. Banerjee, M. Gruteser, and S. Oh, “Wireless device identification with radiometric signatures,” in ACM MOBICOM 2008
  • [8] S. V. Radhakrishnan, A. S. Uluagac, and R. Beyah, “Gtid: A technique for physical device and device type fingerprinting,” IEEE Transactions on Dependable and Secure Computing, Sept 2015.
  • [9] F. Chen, Q. Yan, C. Shahriar, C. Lu, W. Lou, and T. C. Clancy, “On passive wireless device fingerprinting using infinite hidden markov random field,” submitted for publication.
  • [10] N. T. Nguyen, G. Zheng, Z. Han, and R. Zheng, “Device fingerprinting to enhance wireless security using nonparametric bayesian method,” in IEEE INFOCOM, April 2011, pp. 1404–1412.
  • [11] T. J. O’Shea and J. Hoydis, “An introduction to machine learning communications systems,” 2017. [Online]. Available: http://arxiv.org/abs/1702.00832
  • [12] A. Krizhevsky, I. Sutskever, and G. E. Hinton, “Imagenet classification with deep convolutional neural networks,” in NIPS 2012