I Introduction
With the rapid proliferation of the Internet of Things (IoT), securing IoT networks has become more challenging. Wireless devices in these networks, such as sensors, are typically constrained in power and computational capability, rendering traditional cryptography-based authentication systems unsuitable. To address this, passive Physical Layer Authentication (PLA) has been proposed, since it imposes no overhead on the transmitter [1]. To identify transmitters, PLA uses channel state information and fingerprints embedded in transmitted signals due to hardware impairments.
Typically, such an authentication system needs to differentiate among transmitters in the authorized set while rejecting unauthorized transmitters (outliers). Since the unauthorized set is practically infinite, this problem has been posed as open-set classification, as opposed to closed-set classification where all classes are known. Recently, a number of efforts have evaluated open-set classification models based on deep learning (DL) in this regard [2], [3]. They have become the state-of-the-art in PLA, owing to their high accuracy and reasonable robustness in the face of channel variations [3].
To the best of our knowledge, these authentication systems have all been evaluated with a static authorized set, meaning that the authorized set of transmitters was assumed to be fixed during training, testing and deployment. However, in most practical situations, needs change after deployment, resulting in changes to the authorized set: some authorized transmitters might need to be invalidated while others might need to be added. For example, a malfunctioning sensor in an IoT network might need to be replaced with a new sensor. In such cases, it is critical that the authentication system be adapted quickly to the updated authorized set to avoid long downtimes. Despite the existence of efficient strategies for retraining DL models, they are still too time-intensive for critical real-time applications like authorization, especially in situations where high availability is key.
In this paper, we propose to use similarity search techniques from information retrieval applications for open-set transmitter authorization. The neural network (NN) of a DL-based authenticator is used to extract feature vectors from a training dataset consisting of authorized, and possibly unauthorized, signals. Using the feature vector of each signal sample as its RF fingerprint, we formulate the task of authenticating a query signal as a nearest-neighbor search over the database of RF fingerprints. Since the inference latency associated with an exact nearest-neighbor search is prohibitive for real-time authentication, locality-sensitive hashing (LSH) is used to partition the database, allowing a much faster approximate nearest neighbor (ANN) search to be performed. This authorization scheme by design allows new authorized transmitters to be added by simply indexing signal samples from those transmitters into the database. Removing authorized transmitters can be accommodated without requiring any changes to the database. Our results show that the proposed LSH scheme is able to achieve retraining times orders of magnitude lower than DL models, with a negligible impact on outlier detection accuracy and inference latency.
Several previous works have used hashing methods to solve the open-set face recognition problem. In [4], the authors paired LSH with fully-connected neural networks, but their approach differs significantly from ours since LSH was used for model selection and not for nearest-neighbor search. The closest approach to ours is in [5], where LSH was used to identify the most similar faces and thereby solve open-set face identification. However, neither of these approaches considered a dynamic authorized set.
The rest of the paper is organized as follows: we start by formulating the problem in Section II. Section III discusses how state-of-the-art DL models could be adapted to changes in the authorized set. Section IV presents our LSH-based authorization scheme. An empirical validation of the proposed methods is included in Section V. Section VI concludes the paper.
II System Model and Problem Formulation
We consider a finite set A of transmitters that are authorized to access a system through receiver R. The signal y received at R when some transmitter t sends a set of symbols x is y = f_t(x); f_t models the channel effect, as well as the transmitter fingerprint imprinted on x by t due to the variability of its internal circuitry. The authentication problem can then be formulated as the following binary hypothesis test: based on y, R should determine whether t belongs to the authorized set (t ∈ A) or to the set of outliers (t ∈ O). This is visualized in Fig. 1.
An additional set K, where K ⊂ O, of known outliers may be used to improve the outlier detection [3]. So typically, a dataset of signal samples captured from transmitters in A and a similar dataset captured from transmitters in K will be used during training to help the outlier detector differentiate between authorized and unauthorized transmitters.
Our task is to adapt, as quickly as possible, to a change in A after deploying the authentication system. Denote by A_0, O_0 and K_0, respectively, the initial values of A, O and K. Then, some set A' of transmitters could be added to A, or some set could be removed from A and added to O (although both an addition to and a removal from A could happen together, this can be thought of as an addition followed by a removal).
III Adapting deep-learning-based classifiers
In [3], we explored several neural network architectures that could be used for the authentication problem, such as Disc, DClass and OvA. In this section, we demonstrate how each of these architectures could be adapted to accommodate changes in A, without retraining the underlying model from scratch.
[Table I: High-level architectures of Disc, DClass and OvA, and the strategies for adapting each to changes in the authorized set]
The high-level architectures of Disc, DClass and OvA are given in Table I (within dashed boxes), where each could be broken into three building blocks: input, feature extractor and output. The input and feature extractor blocks are similar in all three architectures. In Disc, the output block produces a scalar output through a sigmoid activation, indicating its binary authentication decision. OvA has |A| parallel output blocks, each identical to the output block in Disc, where the i-th block is tasked with independently determining whether the input signal belongs to transmitter i. DClass has one output block with |A| + 1 outputs emerging through a softmax activation: the first |A| outputs correspond to authorized transmitters while the last output corresponds to outliers.
Adding transmitters to the authorized set requires a modification of the output block in some form for all three architectures, as summarized in Table I. If A' is the set of transmitters newly added to A, in the case of OvA this modification could be achieved by adding |A'| more output blocks in parallel, and retraining the new output blocks while keeping the rest of the NN frozen. Since there is only a single scalar output block in Disc, we could simply retrain that output block. With DClass, a similar approach to Disc is possible, where a new output block with |A| + |A'| + 1 outputs could be trained; however, a more efficient approach is to utilize the cascaded architecture shown in Table I. First we train a secondary network, using A' as the authorized set, with the same input and feature extractor blocks as the original network, but with a new output block with |A'| + 1 outputs. A query signal is then judged to be unauthorized only if it is rejected by both NNs. The transmitter-level granularity of OvA and DClass output blocks makes removing transmitters from A relatively straightforward: during inference, we simply treat the outputs corresponding to the invalidated transmitters as unauthorized. However, Disc does not offer this flexibility, requiring a retraining of the output block as in Table I.
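The cascaded DClass decision rule can be sketched in a few lines; this is a minimal illustration assuming each network outputs a softmax vector whose last entry is the outlier class (the function and variable names are our own, not from the original models):

```python
def argmax(probs):
    # index of the largest softmax output
    return probs.index(max(probs))

def cascaded_authorized(primary_probs, secondary_probs):
    """primary_probs: softmax outputs of the original DClass network over
    the initial authorized transmitters plus one outlier class (last entry).
    secondary_probs: softmax outputs of the secondary network trained with
    the newly added transmitters as its authorized set.
    A query is judged unauthorized only if BOTH networks reject it."""
    rejected_by_primary = argmax(primary_probs) == len(primary_probs) - 1
    rejected_by_secondary = argmax(secondary_probs) == len(secondary_probs) - 1
    return not (rejected_by_primary and rejected_by_secondary)
```

For example, a signal from a newly added transmitter would be rejected by the primary network but accepted by the secondary one, so the cascade authorizes it.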
Note that except in the case of Disc, adapting the NN models to additions to the authorized set could be computationally very expensive, especially for large |A'|, even with the strategies highlighted in Table I (we will demonstrate this empirically in Section V). This is our motivation to explore alternative authorization schemes that are more adept at efficiently adapting to changes in A.
IV Information retrieval-based transmitter authorization
Information retrieval is a broad term that refers to the organization, storage and retrieval of information with respect to a repository of data objects such as signals, documents or images. A typical use case is the task of finding a similar object to a given query object in a repository of objects. More formally, assume we have a repository of N objects, each of dimensionality n; given a query object q, the task is to find an object similar to q, based on some similarity metric, as efficiently as possible. In practice, evaluating similarity between objects in the raw data space is ineffective, as proximity in data space does not typically correspond to semantic similarity. Therefore, each data object is mapped to a d-dimensional feature vector, and the similarity search is achieved by performing a nearest-neighbor search over a database consisting of those feature vectors.
Assuming we have a training dataset D containing sufficient signals from both A and K, a simple algorithm to solve the open-set transmitter authorization problem is to find the signal in D most similar to the query signal y: if that signal was transmitted by an authorized transmitter we can infer t ∈ A, and t ∈ O otherwise. A straightforward solution to the similarity problem is to perform an exact nearest-neighbor search over the entire database. If the distance between two feature vectors can be computed in O(d) time (e.g., Euclidean distance), this process takes O(Nd) time, i.e., linear in N. A per-query linear-time solution is prohibitive, considering that such an authorization system is expected to serve multiple authorization requests per second. Therefore a sublinear-time search is required.
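As a point of reference, the exact nearest-neighbor authorizer described above can be sketched in a few lines (the vectors and labels here are illustrative placeholders, not data from the paper):

```python
import math

def euclidean(u, v):
    # O(d) distance between two feature vectors
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def authorize_exact(query_vec, database):
    """database: list of (feature_vector, label) pairs, where label is
    'auth' for signals from the authorized set and 'outlier' for known
    outliers. Returns the label of the exact nearest neighbor.
    Cost is O(N*d) per query, which motivates the sublinear ANN search."""
    best_label, best_dist = None, float("inf")
    for vec, label in database:
        dist = euclidean(query_vec, vec)
        if dist < best_dist:
            best_dist, best_label = dist, label
    return best_label

db = [([0.0, 0.0], "auth"), ([1.0, 1.0], "auth"), ([5.0, 5.0], "outlier")]
print(authorize_exact([0.9, 1.1], db))  # nearest fingerprint is authorized
```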
Approximate nearest-neighbor search algorithms allow us to perform the similarity search in sublinear time by making the compromise that the returned item need not be the strict nearest neighbor, only one whose distance to the query object is sufficiently close to that of the strict nearest neighbor. A common approach to achieving sublinearity is to eliminate the need for an exhaustive search by partitioning the database into “buckets” such that a query q and its true nearest neighbor fall in the same bucket with high probability; then, the exhaustive search need only be done inside that bucket, and not over the entire database. Locality Sensitive Hashing (LSH) [6] can be used to perform the partitioning such that this property holds.
Cryptographic hash functions attempt to create a large deviation in the hash value when there is a slight deviation in the input; conversely, LSH functions try to create hash values that preserve locality. In particular, LSH functions ensure that inputs that are close in the input space receive the same hash value with high probability. Although a number of LSH functions have been proposed in the literature, in this paper we chose the function based on random projections, mainly due to its simplicity and ease of implementation. For an input v, the hash value h(v) is a binary string calculated as follows: k hyperplanes are randomly generated in the d-dimensional feature space; then, the i-th bit of h(v) is set to 1 or 0 depending on whether the point v is above or below the i-th hyperplane. Here, k is the length of the hash value, called the hash size (note that there are 2^k possible hash values). With h defined this way, the indexing process simply places each signal in the bucket labeled with its hash value, as visualized in Fig. 2.
IV-A Using the LSH database to perform authorization
Assume we have indexed a set of training signals D into an LSH database; D includes signal samples from A and possibly samples from K. For a query signal y, we can use the LSH database to determine whether or not t ∈ A in a two-step inference process:

Step 1: Determine y*, the approximate nearest neighbor of y. If y* does not exist, we infer that t ∈ O. Otherwise, we move to the next step.

Step 2: Let t* be the transmitter of y*. If t* ∈ A, we infer that t ∈ A, and that t ∈ O otherwise.
Note that the existence of y* in Step 1 is not guaranteed, since the randomization involved means that similar items are not guaranteed to be grouped correctly. This shortcoming can be overcome by creating L LSH databases instead of one, where the set of hyperplanes is generated independently for each. The exact nearest-neighbor search is then performed on all buckets mapped to y over all L databases, increasing the chance that a nearest neighbor is found. Furthermore, it should be noted that the two-step process above does not require D to contain samples from K; in that case, intuitively, y* should not exist for unauthorized queries as long as the hash size is large enough (there are enough buckets).
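Putting the pieces together, the two-step inference over L independently generated hash tables might look like the following sketch (function and variable names are our own):

```python
import math
import random

def _hash(vec, planes):
    # k-bit random-projection hash of a feature vector
    return "".join("1" if sum(p * v for p, v in zip(pl, vec)) >= 0 else "0"
                   for pl in planes)

def build_lsh_index(signals, L, k, d, seed=0):
    """signals: list of (feature_vector, transmitter_id) pairs.
    Builds L independent hash tables, each with its own k hyperplanes."""
    tables = []
    for i in range(L):
        rng = random.Random(seed + i)
        planes = [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(k)]
        buckets = {}
        for vec, tx in signals:
            buckets.setdefault(_hash(vec, planes), []).append((vec, tx))
        tables.append((planes, buckets))
    return tables

def authorize(query, tables, authorized_ids):
    # Step 1: gather candidates from the matching bucket of every table
    candidates = []
    for planes, buckets in tables:
        candidates.extend(buckets.get(_hash(query, planes), []))
    if not candidates:
        return False  # no approximate nearest neighbor exists -> outlier
    # Step 2: exact nearest-neighbor search within the candidates only
    _, tx = min(candidates, key=lambda c: math.dist(query, c[0]))
    return tx in authorized_ids
```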
IV-B Feature extraction
It has been shown that the activations produced by deeper layers of convolutional neural networks trained for image classification tasks can be used as a high-level image descriptor [7]. Inspired by this, we propose to use the activations produced by the feature extractor block of a trained transmitter authorization NN model as the feature vector for a signal in our LSH authorization scheme. We call this NN our embedding model, since it is used to extract a feature vector, or embedding. Although this creates a dependence on a standard DL-based classifier, the expectation is that as long as the initial embedding model is expressive enough (trained on a sufficiently large dataset), it does not need to be retrained when the authorized set changes.
IV-C Adapting to changes in A
With the authorization scheme described above, it is straightforward to adapt to changes in A. If transmitters in A' are added to A, then we simply need to index signal samples collected from transmitters in A' into the LSH database. If some transmitters are removed from A, then no modification to the LSH database is necessary: during Step 2 of the inference process, it should simply be noted that if t* is one of the removed transmitters, then in fact t* ∉ A.
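Both adaptation paths reduce to cheap bookkeeping on the database and the authorized-ID set; a minimal sketch with hypothetical helper names (single hash table shown for brevity):

```python
class LSHAuthorizer:
    def __init__(self, planes):
        self.planes = planes      # k hyperplane normal vectors
        self.buckets = {}         # hash value -> [(vec, tx_id), ...]
        self.authorized = set()   # current authorized set A

    def _hash(self, vec):
        return "".join("1" if sum(p * v for p, v in zip(pl, vec)) >= 0 else "0"
                       for pl in self.planes)

    def add_transmitter(self, tx_id, sample_vectors):
        # adding to A = indexing its samples; no model retraining needed
        self.authorized.add(tx_id)
        for vec in sample_vectors:
            self.buckets.setdefault(self._hash(vec), []).append((vec, tx_id))

    def remove_transmitter(self, tx_id):
        # removing from A needs no database change: its fingerprints stay
        # indexed, but matches against them are now judged unauthorized
        self.authorized.discard(tx_id)
```

Note that `remove_transmitter` touches only the authorized-ID set, which is what makes removals effectively free.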
IV-D Computational complexity and feature vector compression
It is easy to see that the indexing process has a cost of O(LkNd) (the cost of k d-dimensional dot products for each of the N datapoints, repeated for all L databases). The computational complexity of the two-step inference process is essentially that of the approximate nearest-neighbor search:

Calculating h(v) has a cost of O(kd), since a d-dimensional dot product needs to be calculated k times.

If all N datapoints are distributed evenly over the 2^k buckets, then the exact nearest-neighbor search constitutes calculating the distance metric over N/2^k datapoints, for a total cost of O(Nd/2^k).

Since the above is repeated over all L databases, the total inference cost is C_inf = O(L(kd + Nd/2^k)).
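Under the even-bucket assumption above, the hash size that minimizes the per-query cost can be derived by treating k as continuous; a sketch:

```latex
C_{\mathrm{inf}}(k) \propto L\left(kd + \frac{Nd}{2^{k}}\right)
\;\Longrightarrow\;
\frac{\partial}{\partial k}\left(k + N\,2^{-k}\right)
 = 1 - N \ln 2 \cdot 2^{-k} = 0
\;\Longrightarrow\;
k^{*} = \log_2(N \ln 2).
```

So the cost-optimal hash size grows only logarithmically with the database size N; this derivation is revisited when discussing Fig. 4.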
Note that both the indexing cost and the inference cost have a linear dependence on the dimensionality d of the feature vectors. Therefore we could also add a dimensionality-reduction step during indexing as well as during inference; in this paper, we tested the use of an autoencoder model for this purpose. Note that, similar to the embedding model used for feature extraction, the encoder does not need to be retrained when A changes, as long as the initial autoencoder was trained on a sufficiently large dataset.
V Experimental Evaluation
TABLE II: Authorization schemes used in the experiments

Auth. scheme     | Description                                            | Trained on               | Retrained on
DClass           | Initial DClass model                                   | D_train and D_val        | D'_train and D'_val (adapted as in Table I)
DClass sep       | Initial DClass model retrained from scratch            | D_train and D_val        | D'_train and D'_val
LSH              | Standard LSH scheme                                    | D_train                  | samples from A' in D'_train
LSH small        | LSH scheme with a smaller database                     | 300 samples from D_train | 300 samples from D'_train
LSH dimred       | LSH scheme with dimensionality-reduced feature vectors | D_train                  | samples from A' in D'_train
LSH dimred small | Similar to LSH dimred but with a smaller database      | 300 samples from D_train | 300 samples from D'_train
We start by introducing the dataset and evaluation procedure, and then discuss the results obtained for different experiments.
V-A Dataset
A dataset consisting of 71 transmitters was captured on the Orbit testbed [8]. The receiver was a software-defined radio (USRP N210) and each transmitter was an off-the-shelf Atheros WiFi module allowed to transmit over Channel 11 (with a center frequency of 2462 MHz and bandwidth of 20 MHz). Energy detection was used to extract packets after an IQ capture at a rate of 25 Msps for 1 second. Without any synchronization or further preprocessing, we used the first 256 IQ samples of each packet, containing the preamble, as the signal sample.
V-B Evaluation Procedure
As explained in Section III, removing transmitters from the authorized set is a relatively inexpensive procedure for all the NN architectures in Table I; therefore, we will focus only on the case of adding transmitters to the authorized set. Also, we will only use DClass for comparisons with the LSH scheme, since it has better outlier detection accuracy than Disc while being less computationally intensive to train than OvA [3], offering a fairer comparison.
A, K and O will be chosen randomly, subject to the constraints specified for each evaluation; however, when comparing different authorization schemes, the same A, K and O will be kept. For chosen A, K and O, the dataset split is as follows: for the training dataset D_train and the validation dataset D_val, we use 70% of the samples belonging to A, and all the samples belonging to K. The shuffled combination of this data is split into 80% for D_train and 20% for D_val. The test set D_test contains all samples from O and the remaining 30% of the samples from A. We will denote this method of splitting the dataset for some A, K and O as split(A, K, O).
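The split procedure above can be sketched as follows (a simplified illustration; names such as split_dataset are our own):

```python
import random

def split_dataset(samples_by_tx, authorized, known_outliers, outliers, seed=0):
    """samples_by_tx: dict mapping transmitter id -> list of signal samples.
    70% of each authorized transmitter's samples plus all known-outlier
    samples are shuffled and split 80/20 into train/validation; the test
    set holds the remaining 30% of authorized samples and all outliers."""
    rng = random.Random(seed)
    pool, test = [], []
    for tx in authorized:
        s = list(samples_by_tx[tx])
        rng.shuffle(s)
        cut = int(0.7 * len(s))
        pool += [(x, tx) for x in s[:cut]]   # 70% of authorized samples
        test += [(x, tx) for x in s[cut:]]   # remaining 30% go to the test set
    for tx in known_outliers:
        pool += [(x, tx) for x in samples_by_tx[tx]]  # all known-outlier samples
    for tx in outliers:
        test += [(x, tx) for x in samples_by_tx[tx]]  # all outlier samples
    rng.shuffle(pool)
    cut = int(0.8 * len(pool))
    return pool[:cut], pool[cut:], test  # D_train, D_val, D_test
```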
For each A, K and O, we start with a DClass model trained on D_train and D_val, and an LSH authorization scheme where D_train is used to create the initial LSH database. The composition of the DClass feature extractor block is the same as that used in [3]. A frozen copy of the initial DClass model is used as the embedding model for all LSH authorization schemes. An autoencoder is also trained on D_train; the resulting encoder is isolated and frozen to be used as the encoder for any dimensionality reduction. Then, for a given value of |A'|, a set of transmitters is randomly chosen from O as A', and the dataset is split again to form D'_train and D'_val.
Table II details the set of authorization schemes we use in our experiments, including the datasets on which they are trained and retrained. The small datasets are considered because the inference cost is positively correlated with N, so a smaller database should reduce the inference latency. Note that Euclidean distance was used as the distance metric for all LSH schemes.
Different authorization schemes will be evaluated on D_test with respect to:

Accuracy: outlier detection accuracy on D_test.

Inference latency: the time to output the authorization decision per query signal, averaged across D_test.

Retraining time: the total time required to adapt the deployed authorization system to the change in A.
It should be stressed that as long as the LSH scheme does not significantly compromise the accuracy and inference latency compared to DClass, retraining time is the critical metric of interest. Training time, which is the total time required to train each authorization system, is not analyzed, as it is predictably higher for LSH schemes due to the indexing overhead; this is, however, a good compromise to make, as the training phase occurs before the deployment of the authorization system.
V-C Adding transmitters to A
In this experiment, we fix K, start with an initial authorized set A_0, and then add transmitters from O to A. The variation of retraining time, outlier detection accuracy and inference latency versus |A'| is given in Fig. 3. Fig. 3(a) provides strong evidence that LSH authorization schemes are able to adapt to the change in the authorized set much faster than the DL models; in particular, we see a roughly 100x improvement in retraining time (note that the time axis is in logarithmic scale). Furthermore, from Fig. 3(b) we can immediately see that LSH schemes are able to match or even outperform the DClass models in terms of outlier detection accuracy. Also note that DClass matches the performance of DClass sep, justifying the freeze-and-train method proposed in Table I. Fig. 3(c) paints a contrasting picture: DClass models are able to make authorization decisions much faster than the standard LSH scheme. This justifies building smaller LSH databases with dimensionality-reduced features. Note in particular that LSH dimred small is able to match the latency of DClass while still slightly outperforming it in accuracy. Therefore, it is clear that LSH authorization schemes are a viable alternative to DL models, especially when A is expected to evolve over the lifetime of the authorization system.
V-D Effect of L and k
Understanding the performance impact of the two hyperparameters L and k (the number of LSH databases and the hash size) can help design LSH authorization systems to fit individual needs and flexibilities. To evaluate this, we fixed A, K and O, and varied L and k to obtain the results in Fig. 4.
Recall from Section IV-D that the indexing cost is directly proportional to both L and k; therefore, as expected, we observed that the retraining time grew with both L and k (not displayed in Fig. 4 for the sake of brevity). More interestingly, from Fig. 4(b) and Fig. 4(c) we see that for large k, as L is increased, the precision increases but the recall decreases. Increasing L amortizes the effect of bad hyperplane selections, ensuring that true nearest neighbors “collide” (fall into the same bucket) in at least one of the databases. This results in a decrease of false positives (authorized signals being flagged as unauthorized) and hence an increase in precision, as it prevents Step 1 of the two-step inference process from failing erroneously. However, increasing L also has the side effect of increasing the probability that an unauthorized signal collides with authorized signals (imagine a case where D exclusively has samples from A), thereby increasing false negatives and hence decreasing the recall. Decreasing k increases the likelihood of false collisions, resulting in increased false negatives and hence lower recall, as seen in Fig. 4(c). However, if k is too high at low L, similar points may fail to collide, resulting in false positives and hence low precision; this can actually be seen in Fig. 4(b), where for small L, the precision increases at first but then decreases as k grows. Because the false negatives vary over a larger range (higher range of recall in Fig. 4(c)) than the false positives (lower range of precision in Fig. 4(b)), it is unsurprising that in Fig. 4(a) the accuracy follows the same trend as the recall.
Arguably the most surprising result in Fig. 4 is that the higher accuracy in Fig. 4(a) does not come at the cost of higher latency in Fig. 4(d); in fact, it seems that higher accuracy is attainable with lower latency. Although this might seem counterintuitive, it is explainable from the inference cost formula we derived in Section IV-D: C_inf = O(L(kd + Nd/2^k)). As it dictates, we can clearly see the linear variation of the latency with L in Fig. 4(d). However, the variation of the latency with k very much depends on the particular value of N; in fact, assuming the formula for C_inf holds, it can be theoretically shown that k = log2(N ln 2) minimizes C_inf. This prediction is seemingly contradicted in Fig. 4(d), with the latency continuing to drop as k is increased up to 25. The discrepancy is most likely due to the assumption made in deriving C_inf that datapoints in the LSH database are evenly divided across the buckets, which may not hold due to the nature of the data involved. In fact, as k is increased beyond 25, the latency eventually starts to increase as the cost of calculating h(v) becomes dominant.
The takeaway from this experiment is that the performance impact of L and k is hard to predict, due to its dependence on factors like N, the composition of D, and the nature of the data involved. Therefore, it is advisable to use a validation split of the dataset to calibrate them for the specific use case.
VI Conclusion
In this paper, we considered the problem of adapting to a dynamic authorized set in RF transmitter authorization. First, we demonstrated how state-of-the-art DL models could be adapted to changes in the authorized set. Then we described how locality sensitive hashing, used in information retrieval to facilitate approximate nearest-neighbor search, could be applied to the transmitter authorization problem by building an LSH database. With this approach, incorporating changes to the authorized set, both additions and removals, was shown to be manageable with simple changes to the underlying LSH scheme. From empirical results, we showed that LSH schemes offer dramatically reduced retraining times compared to DL models when A is changed, while matching their accuracy; although LSH schemes tended to have higher inference latencies, it was shown that the latency gap can be bridged by building smaller databases with dimensionality-reduced features. Furthermore, we showed how the number of LSH databases and the hash size interplay to trade off precision, recall and latency.
Even though we demonstrated many promising features of LSH-based authorization, these results are preliminary, and hence our message in this paper is not that they should replace DL models as the state-of-the-art. Since the LSH scheme we evaluated relied on a DL-based authenticator as its feature extractor by design, our proposition is that it be used as a quick-adapting backup to DL models in the face of sudden changes in the authorized set: this is depicted in Fig. 5. As the LSH scheme can be adapted quickly, we can use it as a backup authenticator while the DL model is down, and retrain the DL model in the background. This ensures that the authorization system experiences minimal downtime while not compromising much in terms of accuracy or latency.
References
 [1] W. Wang, Z. Sun, S. Piao, B. Zhu, and K. Ren, “Wireless Physical-Layer Identification: Modeling and Validation,” IEEE Transactions on Information Forensics and Security, vol. 11, pp. 2091–2106, Sept. 2016.
 [2] S. Riyaz, K. Sankhe, S. Ioannidis, and K. Chowdhury, “Deep Learning Convolutional Neural Networks for Radio Identification,” IEEE Communications Magazine, vol. 56, pp. 146–152, Sept. 2018.
 [3] S. Hanna, S. Karunaratne, and D. Cabric, “Open Set Wireless Transmitter Authorization: Deep Learning Approaches and Dataset Considerations,” IEEE Transactions on Cognitive Communications and Networking, vol. 7, pp. 59–72, Mar. 2021.
 [4] R. Vareto, S. Silva, F. Costa, and W. R. Schwartz, “Towards open-set face recognition using hashing functions,” in 2017 IEEE International Joint Conference on Biometrics (IJCB), pp. 634–641, Oct. 2017.
 [5] X. Dong, S. Kim, Z. Jin, J. Y. Hwang, S. Cho, and A. B. J. Teoh, “Open-set face identification with index-of-max hashing by learning,” Pattern Recognition, vol. 103, p. 107277, July 2020.
 [6] A. Gionis, P. Indyk, and R. Motwani, “Similarity Search in High Dimensions via Hashing,” in Proceedings of the 25th International Conference on Very Large Data Bases, VLDB ’99, (San Francisco, CA, USA), pp. 518–529, Morgan Kaufmann Publishers Inc., Sept. 1999.
 [7] A. Babenko, A. Slesarev, A. Chigorin, and V. Lempitsky, “Neural Codes for Image Retrieval,” in Computer Vision – ECCV 2014 (D. Fleet, T. Pajdla, B. Schiele, and T. Tuytelaars, eds.), Lecture Notes in Computer Science, (Cham), pp. 584–599, Springer International Publishing, 2014.
 [8] D. Raychaudhuri, I. Seskar, M. Ott, S. Ganu, K. Ramachandran, H. Kremo, R. Siracusa, H. Liu, and M. Singh, “Overview of the ORBIT radio grid testbed for evaluation of next-generation wireless network protocols,” in Wireless Communications and Networking Conference, 2005 IEEE, vol. 3, pp. 1664–1669, IEEE, 2005.