With continuing advances in sensors and low-power communication technologies, human-centric Internet of Things (IoT) has gained increasing momentum in both industrial and academic communities by unobtrusively providing smart user-centered services [1, 2]. The hardware miniaturization of IoT devices is a double-edged sword: it empowers IoT devices to communicate via ultra low-power radios while making communication links vulnerable to malicious intrusions. Since on-body IoT devices are generally attached to users’ bodies to continuously record fine-grained vital signs, security breaches of these devices pose a serious threat to users’ everyday privacy and safety.
Extensive efforts have been devoted to thwarting malicious masqueraders on hardware-constrained wearable devices. It has been shown that radio channel characteristics in body area networks (BANs) can be exploited to perform device authentication. Recent efforts have also leveraged dedicated sensors [5, 6, 7, 8], such as accelerometers and gyroscopes, to verify wearable devices. However, hardly any of them have achieved widespread acceptance. They limit themselves to either special motion scenarios or fitness related wearables [5, 6, 7, 8]. To embrace the coming wave of human-centric IoT, it is critical for a device authentication solution to support various on-body IoT devices under diverse user motions.
The salient physical layer (PHY) signatures underlying different BANs present us with an exciting opportunity. In BAN channels, off-body signals are mainly comprised of line-of-sight (LOS) and multi-path components, while on-body signals are governed by creeping waves [9, 10]. The distinct radio propagation patterns potentially enable a general security solution relying on prevalent wireless chips. However, radio signals in BAN channels are severely affected by IoT users’ body motions. As a consequence, on- and off-body signals can exhibit significantly different patterns under a specific user motion, and their patterns tend to vary dramatically across multiple motion states. Furthermore, users’ frequent motion changes in daily life make it a highly challenging task to manually select features to represent propagation patterns from real-world radio traces.
To address this challenge, we propose a motion invariant authentication framework for on-body IoT devices. The proposed system performs device authentication by exploiting BAN radio signatures in two steps. In the first step, our system abstracts representative time and frequency features from noisy received signal strength (RSS) segments to characterize fine-grained radio propagation characteristics. In the second step, to learn robust feature representations from abstracted radio features, an adversarial multi-player network is customized to effectively remove motion specific features and thereafter accurately recognize the identities of IoT devices. To achieve this goal, during training, an adversarial training criterion is implemented, which leads to the emergence of transferable features that generalize well in unseen motion states. We implement a working prototype of our system on universal software radio peripheral (USRP) devices and conduct experiments with various body motions in different real-world environments. Experimental results show that our system achieves an authentication accuracy of 90.4% on average.
The main contributions of this work are summarized as follows.
We propose a general authentication system that supports various on-body IoT devices under diverse body motions. The crux of the proposed system is to construct reliable radio propagation profiles from RSS segments and to develop an adversarial network to essentially identify IoT devices based on underlying propagation patterns.
We theoretically analyze our adversarial multi-player network and demonstrate that at equilibrium, the learned feature representation contains all information about BAN radio propagation patterns, and becomes invariant to user body motions.
We build a prototype of our system on USRP platform and conduct extensive experiments with various frequently appearing body motions in a variety of indoor and outdoor environments. The experimental results demonstrate the effectiveness and generalizability of our system.
II Exploiting Distinct Radio Propagation Patterns between On- and Off-Body Channels
Since the human body is basically a low-loss dielectric at microwave frequencies, including Wi-Fi and Bluetooth frequency bands, radio propagation between a transmitter (Tx) and a receiver (Rx) carried by a user is significantly influenced by the user’s body. As shown in Fig. 1, off-body links are dominated by LOS and multi-path propagation. On the other hand, on-body links are governed by creeping waves, which are diffracted by human tissues and spread out along the human body. Previous measurements indicate that creeping waves are rarely disturbed by multi-path fading (small-scale fading) or large-scale fading caused by Tx-Rx distance changes or shadowing, but are largely influenced by body motions. Thus, we see that distinct propagation patterns exist between on- and off-body signals.
Fig. 2 depicts the RSS and the cumulative distribution function (CDF) of different BAN signals that were collected in standing and walking states, respectively. In the standing state, we observe that on-body signals are more stable in the time domain and off-body signals contain more components in the high frequency band. In the walking state, on-body signals have a larger RSS variance and fall into a low frequency range with a very high probability. The experimental observations verify that differentiable propagation patterns exist in on- and off-body channels in each motion state. This supports our premise that we can rely upon PHY signatures to authenticate various on-body IoT devices.
III Adversarial Network Based Device Authentication
III-A Design Rationale
It is, however, non-trivial to reliably capture propagation patterns from real-world radio traces. As shown in Fig. 2, although on- and off-body signals show distinguishable propagation patterns in each case, their patterns are remarkably different between the two cases. Consequently, an authentication model that is trained under a specific user motion will typically not generalize well in different motion scenarios.
To deal with this dilemma, we resort to adversarial networks, which have recently surfaced as a popular tool to discover transferable features in the deep learning field and have proven their advantages in many real-world applications [11, 12, 13]. As a branch of deep learning approaches, adversarial networks facilitate automatic extraction of complex and latent feature representations by adopting a hierarchical structure. Different from traditional approaches, they are able to find and exclude irrelevant features in the learned representations with an adversarial training criterion.
Therefore, we can reap the benefits of adversarial networks to recognize underlying on- and off-body propagation patterns. In our application, a customized adversarial network can be leveraged to autonomously extract feature representations about BAN radio propagation patterns and selectively eliminate motion specific features from the representations. To this end, we propose an adversarial network based security system to seamlessly authenticate various on-body IoT devices.
III-B Design Overview
Our system takes advantage of an adversarial network to extract distinct radio propagation patterns for on-body device authentication. Fig. 3 illustrates the framework of our authentication system. It takes as input RSS time series and outputs the corresponding device authentication results. It is worth noting that, to verify RSSs from various low-end embedded IoT devices, our authentication system resides on users’ smartphones, which have sufficient capability to perform low-latency and accurate learning based inferences.
The core of our authentication system includes two components: Propagation Profile Characterization and Propagation Pattern Recognition.
Propagation Profile Characterization. This component first divides RSS time series into multiple basic segments. Then, radio features are extracted from both the time and frequency domains of RSS segments for fine-grained characterization of potential propagation patterns. Finally, the extracted features are integrated into radio propagation profiles for future pattern recognition by the adversarial network.
Propagation Pattern Recognition. Upon receiving a propagation profile, the adversarial network first utilizes a functional block to abstract a feature representation in terms of on- and off-body propagations. Subsequently, the model infers the identity of a connected IoT device through an on-off prediction block. Moreover, an adversary block is added to eliminate motion specific features in the feature representation in the training phase. All blocks are learned through an adversarial training process to promote the emergence of features that are resilient to motion changes.
III-C Propagation Profile Characterization
Signal Segmentation. Our system first partitions RSS measurements into multiple segments. As an RSS segment is the basic unit for device authentication, the segment interval needs to be carefully determined. If the interval is too long, on- and off-body signals will probably both be included in the same segment. If it is too short, a segment will not contain enough information for the system to recognize it. We empirically find that a time interval of 5 s is sufficient to correctly differentiate on- and off-body IoT devices in most cases.
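The segmentation step can be sketched as follows. This is a minimal illustration rather than the prototype's code: it assumes consecutive, non-overlapping segments and drops an incomplete tail, neither of which is specified above. The function name `segment_rss` and the synthetic trace are hypothetical.

```python
from typing import List, Sequence

SAMPLING_RATE_HZ = 500  # sampling rate reported for the prototype
SEGMENT_SECONDS = 5     # segment interval chosen empirically in the text


def segment_rss(rss: Sequence[float],
                rate_hz: int = SAMPLING_RATE_HZ,
                seconds: int = SEGMENT_SECONDS) -> List[List[float]]:
    """Split an RSS trace into consecutive, non-overlapping segments.

    Assumption: trailing samples that do not fill a whole segment are dropped.
    """
    seg_len = rate_hz * seconds
    n_full = len(rss) // seg_len
    return [list(rss[i * seg_len:(i + 1) * seg_len]) for i in range(n_full)]


# Synthetic 12.4 s trace (6200 samples), used only to exercise the function.
trace = [-60.0 + 0.001 * k for k in range(6200)]
segments = segment_rss(trace)
```

With a 500 Hz sampling rate, each 5 s segment holds 2500 RSS samples, which is the unit that the subsequent feature extraction operates on.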
Time Domain Feature Extraction. Since on- and off-body signals are affected to different degrees by body motions and by large- and small-scale fading, we first decompose each RSS segment into multi-scale variations by using filters. As creeping waves are sensitive to body motions and their frequencies fall into relatively low frequency bands with a high probability, a band-pass filter is leveraged to extract motion-induced variations. Based on our experimental observations, most fluctuations caused by body motions fall between 0.5 Hz and 15 Hz. Variations in the residual low and high frequency bands are also extracted by a low-pass filter and a high-pass filter, respectively, as large- and small-scale variations.
With multi-scale variations, we select six time domain features, including maximum, minimum, median, variance, kurtosis and skewness, to characterize propagation signatures from each kind of variation. The maximum, minimum, median and variance are chosen to describe the impact from the human body, because dramatic body vibration typically contributes to rapid changes in the maximum, minimum and median and also results in a large variance. Kurtosis and skewness describe the shape of the RSS distribution (its tailedness and asymmetry, respectively) and can potentially capture propagation patterns, since both symmetric and asymmetric components are common in radio waves. For finer-grained feature extraction, we divide each kind of variation into ten chunks and extract six features from each chunk. Therefore, a total of 3 × 10 × 6 = 180 feature points are extracted to describe radio propagation signatures from the time domain of an RSS segment.
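The feature count above can be checked with a short sketch. It assumes the three filtered variations are already available (the filter design is not reproduced here) and uses population moments for variance, skewness and kurtosis, which is an assumption. The names `six_features` and `time_domain_features` and the synthetic signals are hypothetical.

```python
import math
import statistics
from typing import List, Sequence


def six_features(chunk: Sequence[float]) -> List[float]:
    """Return [max, min, median, variance, kurtosis, skewness] of one chunk."""
    n = len(chunk)
    mean = sum(chunk) / n
    var = sum((v - mean) ** 2 for v in chunk) / n  # population variance
    if var == 0:
        skew = kurt = 0.0  # constant chunk: higher moments are undefined
    else:
        std = math.sqrt(var)
        skew = sum(((v - mean) / std) ** 3 for v in chunk) / n
        kurt = sum(((v - mean) / std) ** 4 for v in chunk) / n
    return [max(chunk), min(chunk), statistics.median(chunk), var, kurt, skew]


def time_domain_features(variations: Sequence[Sequence[float]],
                         n_chunks: int = 10) -> List[float]:
    """Concatenate six features per chunk for every multi-scale variation."""
    features: List[float] = []
    for series in variations:
        size = len(series) // n_chunks
        for c in range(n_chunks):
            features.extend(six_features(series[c * size:(c + 1) * size]))
    return features


# Synthetic stand-ins for the large-scale, motion-induced and small-scale
# variations of one 5 s segment sampled at 500 Hz (2500 samples each).
n = 2500
variations = [
    [math.sin(2 * math.pi * 0.1 * t / 500) for t in range(n)],
    [math.sin(2 * math.pi * 2.0 * t / 500) for t in range(n)],
    [0.1 * math.sin(2 * math.pi * 40.0 * t / 500) for t in range(n)],
]
features = time_domain_features(variations)
```

Three variations, ten chunks each and six features per chunk give the 180 time domain feature points stated in the text.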
Frequency Domain Feature Extraction. To abstract frequency domain features, we start by performing the Short-Time Fourier Transform (STFT) on each RSS segment to obtain its two-dimensional spectrogram. Specifically, with a signal sampling rate of 500 Hz, we conduct a 1000-point Fast Fourier Transform (FFT) within a 2 s sliding window, shifting 1 s each time to make full use of the sampled data. To summarize information in the frequency domain, the frequency band of each spectrogram, i.e., [0, 250] Hz, is partitioned into 40 intervals, each of which is associated with a frequency component of the segment. To effectively indicate propagation signatures, we equally segment the low frequency band, i.e., [0, 15] Hz, into 30 intervals and the residual high frequency band into 10 intervals, and we sum up the magnitudes in each interval in every FFT result. In this way, we transform a two-dimensional spectrogram into a $4 \times 40$ matrix $\mathbf{M}$. Then we take two frequency domain features from $\mathbf{M}$: the component magnitude (or each element $m_{i,j}$ in $\mathbf{M}$) and the proportion of each component (PC), such that $PC_j = \sum_{i=1}^{4} m_{i,j} \big/ \sum_{i=1}^{4} \sum_{k=1}^{40} m_{i,k}$, where $j = 1, \dots, 40$. Finally, a total of 200 feature points (160 magnitudes and 40 proportions) are extracted from the frequency domain of an RSS segment.
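The following sketch illustrates how a 5 s segment could be mapped to 200 frequency domain feature points. Several choices are assumptions made only for illustration: a rectangular window, a plain DFT in place of an optimized FFT, and a per-component proportion computed as each component's share of the total magnitude, which is one reading that is consistent with the stated total of 200. All function names are hypothetical.

```python
import bisect
import cmath
import math
from typing import List, Sequence

RATE_HZ = 500   # sampling rate
WINDOW = 1000   # 2 s window, i.e. a 1000-point transform (0.5 Hz resolution)
HOP = 500       # 1 s shift between consecutive windows


def band_edges() -> List[float]:
    """41 edges: 30 equal intervals on [0, 15] Hz, 10 on [15, 250] Hz."""
    low = [15.0 * i / 30 for i in range(30)]
    high = [15.0 + 235.0 * i / 10 for i in range(11)]
    return low + high


def dft_magnitudes(window: Sequence[float]) -> List[float]:
    """Magnitudes of the non-negative-frequency DFT bins (naive O(N^2) DFT)."""
    n = len(window)
    tw = [cmath.exp(-2j * math.pi * k / n) for k in range(n)]
    return [abs(sum(window[t] * tw[(k * t) % n] for t in range(n)))
            for k in range(n // 2 + 1)]


def spectrogram_matrix(segment: Sequence[float]) -> List[List[float]]:
    """Sum DFT magnitudes per frequency interval for every sliding window."""
    edges = band_edges()
    rows = []
    for start in range(0, len(segment) - WINDOW + 1, HOP):
        row = [0.0] * 40
        for k, mag in enumerate(dft_magnitudes(segment[start:start + WINDOW])):
            freq = k * RATE_HZ / WINDOW
            idx = min(bisect.bisect_right(edges, freq) - 1, 39)
            row[idx] += mag
        rows.append(row)
    return rows


def frequency_features(matrix: Sequence[Sequence[float]]) -> List[float]:
    """Flattened magnitudes followed by one proportion per component."""
    total = sum(sum(row) for row in matrix)
    magnitudes = [m for row in matrix for m in row]
    proportions = [sum(row[j] for row in matrix) / total for j in range(40)]
    return magnitudes + proportions


# Synthetic 5 s segment containing a single 5 Hz tone.
segment = [math.sin(2 * math.pi * 5 * t / RATE_HZ) for t in range(2500)]
matrix = spectrogram_matrix(segment)
freq_features = frequency_features(matrix)
```

A 5 s segment yields four 2 s windows at a 1 s shift, so the matrix has 4 rows and 40 columns. The 0.5 Hz width of the low-band intervals equals the resolution of a 1000-point transform at 500 Hz, so each low-band interval holds exactly one DFT bin.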
III-D Propagation Pattern Recognition
We formulate the propagation pattern recognition as a binary classification task, where $\mathcal{X}$ is the sample space and $\mathcal{Y}$ is the target label set. In our context, each $x \in \mathcal{X}$ is a radio propagation profile sample, and $y \in \mathcal{Y}$ indicates the corresponding on- or off-body IoT device. Moreover, for each $x$, $s$ denotes an auxiliary label that refers to the body motion that $x$ is sampled from.
We develop an adversarial multi-player network for propagation pattern recognition. As depicted in Fig. 4, our model encompasses three blocks: a Feature Extractor $E$, an On-Off Predictor $P$ and a Motion Discriminator $D$. Since simply wiping out all dependencies between feature representations and domains (e.g., motions in our application) could degrade the accuracy of target label prediction, our model adopts a conditional adversarial architecture for better generalization performance.
Feature Extractor $E$. An extractor is the front block of the adversarial model. It takes as input a propagation profile $x$ and returns a latent feature representation as $E(x)$.
On-Off Predictor $P$. A predictor acts as the end block of the model. It takes as input a learned feature representation $E(x)$ and outputs a two-dimensional probability vector $P(E(x))$ in terms of on- and off-body devices.
Motion Discriminator $D$. A discriminator serves as an adversary in our model in the training phase. It takes a feature representation $E(x)$ and the associated on-off probability vector $P(E(x))$ together as input, and discriminates which motion state $x$ is sampled from as $D(E(x), P(E(x)))$.
Adversarial Training. Before performing authentication, our adversarial model needs to be trained on training data, which follows the distribution $p(x, y, s)$. We define the loss of $P$ as the cross-entropy between $P(E(x))$ and the true posterior target label distribution over $p(x, y, s)$, which is given as
$$\mathcal{L}_P(P; E) = \mathbb{E}_{(x, y) \sim p(x, y)}\left[-\log P(y \mid E(x))\right], \tag{1}$$
where $P(y \mid E(x))$ denotes the entry of $P(E(x))$ that corresponds to label $y$.
Similarly, the loss of $D$ is defined as the cross-entropy between $D(E(x), P(E(x)))$ and the true conditional distribution over $p(x, y, s)$, which is expressed as
$$\mathcal{L}_D(D; E, P) = \mathbb{E}_{(x, s) \sim p(x, s)}\left[-\log D(s \mid E(x), P(E(x)))\right]. \tag{2}$$
Note that to effectively learn parameters of our multi-player model, the flow from $P$ to $D$ is a one-way link (i.e., the black arrow line in Fig. 4), along which gradients do not propagate back. Thus, the parameters of $P$ are not updated through the optimization of the loss $\mathcal{L}_D$.
To robustly authenticate IoT devices under diverse body motions, it is critical for our model to implement an adversarial training criterion. The basic idea is that to generalize well in unseen scenarios, a predictive model must discriminate well between on- and off-body devices, but it must not be able to distinguish the body motions associated with input samples. To achieve this goal, we use minimax games between $E$, $P$ and $D$ in the training phase. Particularly, $E$ plays a cooperative game with $P$ to minimize the loss $\mathcal{L}_P$. At the same time, $E$ and $D$ together play a minimax game, where $D$ aims to minimize the loss $\mathcal{L}_D$ and $E$ tries to maximize it.
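The two cross-entropy losses and the opposing roles of the blocks can be illustrated numerically. This is a toy sketch on hand-written probability vectors, not a trainable model. The `weight` that balances the two losses in `extractor_objective` is a hypothetical parameter, and all names and values below are illustrative assumptions.

```python
import math
from typing import Sequence


def cross_entropy(prob_vectors: Sequence[Sequence[float]],
                  labels: Sequence[int]) -> float:
    """Mean negative log-probability assigned to the true labels."""
    return -sum(math.log(p[k]) for p, k in zip(prob_vectors, labels)) / len(labels)


def extractor_objective(loss_p: float, loss_d: float, weight: float = 1.0) -> float:
    """Quantity the extractor would minimize: a low predictor loss and a high
    discriminator loss. The unit default weight is an assumption."""
    return loss_p - weight * loss_d


# Toy batch of four samples.
predictor_out = [[0.9, 0.1], [0.2, 0.8], [0.7, 0.3], [0.4, 0.6]]  # [on, off]
device_labels = [0, 1, 0, 1]
# A discriminator that cannot tell the five controlled motions apart
# outputs a uniform distribution over them.
discriminator_out = [[0.2] * 5 for _ in range(4)]
motion_labels = [0, 3, 4, 1]

loss_p = cross_entropy(predictor_out, device_labels)
loss_d = cross_entropy(discriminator_out, motion_labels)
objective = extractor_objective(loss_p, loss_d)
```

When the feature representation carries no usable motion information, the discriminator can do no better than guessing. With five equally likely motions, its cross-entropy then sits at log 5, which is the kind of high plateau an adversarially trained discriminator loss is expected to approach.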
III-E Theoretical Analysis of Adversarial Model
We prove that the output of our adversarial model becomes invariant to motion changes through the adversarial training. Specifically, we first present the optimal predictor and optimal discriminator in Proposition 1 and Proposition 2, respectively, without proving them, and refer the reader to  (Proposition 2) for details. Then, we illustrate the virtual training criterion, optimal extractor and optimal output, respectively, in Corollary 1, Proposition 3 and Corollary 2. Differing from the theoretical efforts in the prior work , our analysis focuses on a practical adversarial model.
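(Optimal predictor) For a fixed extractor $E$, the output of the optimal predictor $P^*$ over $p(x, y, s)$ achieves
$$P^*(y \mid E(x)) = p(y \mid E(x)), \tag{5}$$
and the loss of $P^*$ is
$$\mathcal{L}_P(P^*; E) = H(y \mid E(x)), \tag{6}$$
where $H(\cdot \mid \cdot)$ denotes the conditional entropy function.
Note that given $E$, the equality (5) indicates the maximal predictive capability that a predictor can learn from $E(x)$.
(Optimal discriminator) Given any extractor $E$ and any predictor $P$, the optimal discriminator $D^*$ over $p(x, y, s)$ obtains
$$D^*(s \mid E(x), P(E(x))) = p(s \mid E(x), P(E(x))), \tag{7}$$
and its loss is
$$\mathcal{L}_D(D^*; E, P) = H(s \mid E(x), P(E(x))). \tag{8}$$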
With the optimal predictor and optimal discriminator, we proceed to simplify the minimax training criterion (4).
(Virtual training criterion) If and have enough capacity and are trained to be optimal over , the minimax optimization (4) is equivalent to the minimization of a virtual value function , which is expressed as
According to the losses of the optimal predictor (6) and optimal discriminator (8), the initial value function (3) can be simplified as the virtual version (9). Thus, solving the minimax optimization (4) is equivalent to minimizing the virtual value function.
Then, we obtain the optimal extractor by minimizing the virtual value function.
(Optimal extractor) If , and have enough capability and are trained to be optimal over , any optimal extractor satisfies
When is fixed, and . Therefore, we obtain a lower bound of , that is
We note that the lower bound is achievable by considering a special case, where , an extractor with the best representative ability. In this case, we can check that
Proposition 3 indicates that when all blocks are trained to be optimal and our adversarial model reaches equilibrium, the extractor is able to extract all information about the device label $y$ from the training samples and eliminate any information about the motion label $s$ except what is also related to $y$.
(Optimal output) If the extractor $E$, the predictor $P$ and the discriminator $D$ have enough capacity and are trained to be optimal over the training distribution $p(x, y, s)$, the output of our adversarial model achieves
$$P^*(y \mid E^*(x)) = p(y \mid x).$$
IV Evaluation in Real Environments
IV-A Experimental Methodology
Implementation. We build a proof-of-concept prototype of the proposed system with three GNURadio/USRP B210 devices, which work at 2.4 GHz with a sampling rate of 500 Hz. Two USRP devices are placed on a volunteer, referred to as a legitimate user, and are considered to be two on-body devices. The remaining device is placed on another volunteer, referred to as a malicious attacker, and is regarded as an off-body device.
Data Collection. We collect radio traces in both controlled and uncontrolled user motion scenarios. In the controlled scenario, the user is confined to five frequently appearing motions, which are comprised of two static motions, sitting and standing, and three dynamic ones, arm moving, rotating and walking. In the uncontrolled scenario, the user is permitted to behave casually. In both scenarios, the attacker is allowed to move freely in the vicinity of the user to try to fool the legitimate devices. Moreover, to verify the robustness of our system under various environments, we collect wireless signals in five indoor and outdoor settings, i.e., a lab, a meeting room, a corridor, a rooftop and a park. We conduct the experiments over seven days and collect a total of ten hours of radio traces.
Dataset. Our dataset includes a total of 7200 samples that are extracted from collected radio traces. Therein, 6000 samples are from the controlled user motion scenario, and 1200 are from the uncontrolled scenario. When evaluating our model, we randomly take out 4800 samples from the controlled scenario for training and combine the leftover 1200 ones and all 1200 samples from the uncontrolled scenario for testing. Additionally, in both the training and testing sets, the numbers of on- and off-body samples are equal.
Parameterization. As shown in Fig. 4, we parameterize our multi-player model as a deep neural network. Specifically, the feature extractor is a convolutional neural network with eight convolutional layers to abstract latent feature representations from input samples. Furthermore, the on-off predictor and the motion discriminator are configured with three fully-connected layers to facilitate their own predictions.
Evaluation Metrics. We use the following metrics to illustrate the performance of our system.
Accuracy. It is computed as the ratio of the number of RSS segments that are correctly recognized to the total number of on- and off-body RSS segments.
True positive (TP) rate. It is denoted as the ratio of the number of on-body RSS segments that are correctly predicted to the total number of on-body segments.
False positive (FP) rate. It is defined as the ratio of the number of off-body RSS segments that are mistakenly accepted to the total number of off-body segments.
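The three metrics can be computed from per-segment decisions as sketched below. The label encoding (1 for on-body, 0 for off-body), the function name `authentication_metrics` and the toy decisions are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def authentication_metrics(y_true: Sequence[int],
                           y_pred: Sequence[int]) -> Tuple[float, float, float]:
    """Return (accuracy, TP rate, FP rate); 1 = on-body, 0 = off-body."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # on-body accepted
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # off-body accepted
    correct = sum(1 for t, p in pairs if t == p)
    n_on = sum(1 for t in y_true if t == 1)
    n_off = len(y_true) - n_on
    return correct / len(y_true), tp / n_on, fp / n_off


# Toy decisions for eight RSS segments (four on-body, four off-body).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]
accuracy, tp_rate, fp_rate = authentication_metrics(y_true, y_pred)
```

When on- and off-body segments are equally numerous, as in the dataset described above, accuracy equals the mean of the TP rate and one minus the FP rate.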
IV-B Performance Results
We first illustrate the overall performance of our authentication system on all testing data. As shown in Table I, our system is able to correctly identify 90.4% of on- and off-body devices on average. Specifically, it correctly recognizes on-body devices with a ratio of 89.0% and successfully mitigates 91.8% of attacks from off-body devices. In addition, we report the receiver operating characteristic (ROC) curve of our system, which depicts the tradeoff between FP and TP rates as the discrimination threshold varies over the interval [0, 1]. As depicted in Fig. 7, the system’s ROC curve first rises steeply and then quickly levels off as the FP rate increases. Moreover, the area under the ROC curve (AUROC) reaches 0.958, which is close to 1, i.e., the AUROC of the ideal case. The above results indicate that our system achieves good authentication ability.
| Accuracy | TP Rate | FP Rate |
| 90.4% ± 1.9% | 89.0% ± 2.4% | 8.2% ± 1.7% |
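For readers who want to reproduce an AUROC figure from predictor scores, one threshold-free way is the rank-based formulation sketched below. It equals the area under the ROC curve obtained by sweeping the discrimination threshold. The function name `auroc` and the toy scores are hypothetical, and this is not necessarily how the reported value was computed.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Probability that a random on-body sample scores higher than a random
    off-body sample, counting ties as one half (1 = on-body, 0 = off-body)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy on-body probabilities for three on-body and three off-body segments.
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]
labels = [1, 1, 1, 0, 0, 0]
area = auroc(scores, labels)
```

In the toy example, eight of the nine on-body/off-body pairs are ranked correctly, so the area is 8/9. A perfect ranking gives 1 and a random one gives 0.5.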
Then, we elaborate on the authentication performance for each frequently appearing motion. In general, each motion has a unique movement pattern of the human body, and thus exhibits different effects on BAN radio waves. As plotted in Fig. 7, the proposed system achieves better performance for the static motions than for the dynamic ones. The same observations are also shown in Fig. 7. Therein, higher TP rates and lower FP rates are clearly present in the static states, because there are fewer disturbances caused by body movements in radio signals when the user sits or stands still with IoT devices, which makes it much easier for the system to recognize on- and off-body propagation patterns. Despite the above differences, the system still achieves average TP and FP rates of 90.8% and 6.9%, respectively, in the controlled user motion scenario.
Next, we compare the system performance in the uncontrolled scenario with that in the controlled one. As illustrated in Fig. 10, the system shows performance degradation in each metric in the uncontrolled scenario. The reason for the degradation is that more irregular and complicated body movements are present when the user behaves casually, which causes the extractor to extract noisier features and thus hampers the prediction ability of the predictor. More specifically, the system has a TP rate reduction of 4.0% and an FP rate increase of 2.6% for uncontrolled motions. The larger drop in the TP rate arises because, compared with off-body signals, on-body signals, dominated by creeping waves, are more sensitive to user motion dynamics, which results in more on-body RSS segments being mistakenly classified as off-body ones.
We further illustrate the benefits of adopting an adversarial discriminator in our multi-player model. Our discriminator aims at helping the extractor to discover transferable features and thus boosts the generalization ability of the predictor. To illustrate these merits, we set up a version of our model with a non-adversarial discriminator as a baseline. Note that in the baseline, the update of the extractor’s parameters relies solely on the minimization of the predictor’s loss.
Fig. 10 plots the training losses of discriminators in our and baseline models. The loss of the non-adversarial discriminator declines quickly and then stabilizes at a very low level. However, ours first fluctuates dramatically and finally converges to a high value. The initial fluctuations of the adversarial loss are caused by its minimax optimization, and they subside gradually as motion specific features irrelevant to the predictor fade out of the feature representation. The above observations reveal that the extractor in our model abstracts more transferable features than that in the baseline. Furthermore, comparing the performance of the two predictors in Fig. 10, we see that both loss curves decrease at first and then increase after a certain number of iterations. However, the adversarial curve rises more slowly than the non-adversarial one, which suggests that our adversarial discriminator works as a regularizer that alleviates over-fitting and improves the predictor’s generalization ability.
V Related Work
Dedicated sensors, including accelerometers, bioimpedance sensors, motion sensors and capacitive touch sensors, have been used to differentiate on- and off-body devices. Additionally, various sensors in smartphones [18, 19] have also been exploited to identify devices or users. However, sensor-based approaches limit themselves to specified user motions or fitness related wearables.
Existing measurements [9, 10] have shown that essential differences exist between on- and off-body radio propagation. Based on the above studies, radio propagation characteristics were examined to identify legitimate wearable devices. In comparison to the prior work, our work develops a customized adversarial network to extract underlying propagation patterns and obtains better generalized authentication performance in various motion scenarios.
This paper presents a motion invariant authentication system to secure on-body IoT device pairing and data transmission by harnessing an adversarial multi-player network to effectively recognize underlying radio propagation patterns. Our system takes one step forward to embrace the advent of human-centric IoT by supporting various wearable devices under diverse user motions. Our theoretical analysis indicates that at equilibrium, our adversarial model is resilient to motion variations. We extensively evaluate the proposed system with various static and dynamic user motions in indoor and outdoor settings. The results show that our system can recognize 89.0% of legitimate devices while at the same time mitigating 91.8% of impersonation attack attempts.
The work was supported in part by the NSFC under Grant 61871441, 91738202, 61729101, the RGC under Contract CERG 16203215, Young Elite Scientists Sponsorship Program by CAST under Grant 2018QNRC001, National Key R&D Program of China under Grant 2017YFE0121500, the Key Laboratory of Dynamic Cognitive System of Electromagnetic Spectrum Space (Nanjing Univ. Aeronaut. Astronaut.), MIIT, China under Grant KF20181911.
[1] D. Uckelmann, M. Harrison, and F. Michahelles, “An architectural approach towards the future Internet of Things,” in Springer Architecting the Internet of Things, 2011, pp. 1–24.
[2] L. Mainetti, V. Mighali, and L. Patrono, “An IoT-based user-centric ecosystem for heterogeneous smart home environments,” in Proc. IEEE ICC, 2015, pp. 704–709.
[3] S. Gollakota et al., “They can hear your heartbeats: non-invasive security for implantable medical devices,” in Proc. ACM SIGCOMM, vol. 41, no. 4, 2011, pp. 2–13.
[4] L. Shi et al., “BANA: body area network authentication exploiting channel characteristics,” IEEE J. Sel. Areas Commun., vol. 31, no. 9, pp. 1803–1816, 2013.
[5] G. Revadigar et al., “Accelerometer and fuzzy vault-based secure group key generation and sharing protocol for smart wearables,” IEEE Trans. Inf. Forensics Security, vol. 12, no. 10, pp. 2467–2482, 2017.
[6] W. Xu et al., “Gait-key: A gait-based shared secret key generation protocol for wearable devices,” ACM Trans. Sensor Networks, vol. 13, no. 1, p. 6, 2017.
[7] C. Cornelius et al., “A wearable system that knows who wears it,” in Proc. ACM MobiSys, 2014, pp. 55–67.
[8] T. Vu et al., “Distinguishing users with capacitive touch communication,” in Proc. ACM MobiCom, 2012, pp. 197–208.
[9] F. Di Franco et al., “On-body to on-body channel characterization,” in IEEE Sensors J., 2011, pp. 908–911.
[10] J. Ryckaert et al., “Channel model for wireless communication around human body,” IET Electronics Letters, vol. 40, no. 9, pp. 543–544, 2004.
[11] Y. Ganin et al., “Domain-adversarial training of neural networks,” MIT press Journal of Machine Learning Research, vol. 17, no. 1, pp. 2030–2096, 2016.
[12] M. Zhao et al., “Learning sleep stages from radio signals: A conditional adversarial architecture,” in Proc. ACM ICML, 2017, pp. 4100–4109.
[13] Y. Shinohara, “Adversarial multi-task learning of deep neural networks for robust speech recognition,” in Interspeech, 2016, pp. 2369–2372.
[14] I. Goodfellow et al., Deep learning. MIT press, 2016, vol. 1.
[15] “MobileNetV2: The Next Generation of On-Device Computer Vision Networks,” Google Research, April 3, 2018. [Online]. Available: https://ai.googleblog.com/2018/04/mobilenetv2-next-generation-of-on.html
[16] Y. Xiong and F. Quek, “Hand motion gesture frequency properties and multimodal discourse analysis,” Springer International Journal of Computer Vision, vol. 69, no. 3, pp. 353–371, 2006.
[17] W. Xu et al., “Walkie-talkie: Motion-assisted automatic key generation for secure on-body device communication,” in Proc. ACM/IEEE IPSN, 2016, pp. 1–12.
[18] Y. Ren et al., “Smartphone based user verification leveraging gait recognition for mobile healthcare systems,” in Proc. IEEE SECON, 2013, pp. 149–157.
[19] A. Das, N. Borisov, and M. Caesar, “Do you hear what I hear?: Fingerprinting smart devices through embedded acoustic components,” in Proc. ACM CCS, 2014, pp. 441–452.
[20] W. Wang et al., “Securing on-body IoT devices by exploiting creeping wave propagation,” IEEE J. Sel. Areas Commun., vol. 36, no. 4, pp. 696–703, 2018.