I Introduction
Mass digitization of people and things has opened the door to the Internet of Things (IoT), which is critical to many social and industrial applications. In particular, IoT is envisaged to deliver the vital data that inform Digital Twins and improve infrastructure maintenance and safety. In many cases, wireless IoT sensors that measure physical signals (e.g. gas pressure, water contamination) are buried underground. Encrypting critical infrastructure information is important for national security, commercial sensitivity, and anti-tampering requirements. Many current IoT wireless transmissions (e.g. LoRaWAN, ZigBee) are vulnerable to eavesdropping. Authentication (e.g. over-the-air activation session keys in LoRaWAN, Elliptic Curve Diffie-Hellman in Bluetooth) verifies the user’s identity and prevents malicious users from accessing the network. Encrypted wireless transmission protects data integrity and confidentiality [Poor19].
I-A From Public Key Cryptography to Physical Layer Security
Conventional encrypted communications employ symmetric encryption such as the advanced encryption standard (AES), which relies on a secret key shared between the communicating parties beforehand. Public key cryptography (PKC) is the de facto key distribution protocol. Although effective, conventional PKC schemes are complex and computationally unsuitable for IoT devices with limited capability. This not only introduces a computational cost challenge, but also puts IoT devices at a disadvantage against more powerful eavesdroppers with orders of magnitude more computational power.
Physical layer security (PLS) has been proposed in recent years as a way to overcome many of the aforementioned challenges by using the wireless channel properties to create symmetrical keys without the need for a PKC distribution protocol [Poor19]. PLS removes the risk of key interception and the high computational requirement of PKC schemes, which makes it very suitable for IoT devices. PLS does, however, require the wireless channel between nodes to be reciprocal (true for most propagation environments), dynamic (fading), and unique. This ensures robust symmetric key generation and resists brute force attempts. PLS has been applied to a variety of embedded wireless and wired communication systems [Rothe20]. The shortfall of PLS is that it requires the IoT devices to make accurate estimations of the wireless channel statistics [Zhang19]. Accurate estimation requires reasonably powerful signal processing units and a reasonably high communication signal-to-noise ratio (SNR). Many embedded or underground IoT devices operate in low communication SNR regimes, for which PLS is therefore not suitable.
I-B Introducing Graph Layer Security (GLS): Encryption using Networked Physics
To overcome the PLS requirement for a high communication SNR for accurate wireless channel estimation, we identify and exploit a different physical attribute that is common to many IoT sensors. The general idea of exploiting common physics has been proposed before, for example using the common heartbeat measured by different medical IoT devices across a body. However, those devices typically do not suffer from the aforementioned low communication SNR challenges. Here, we consider IoT devices placed in underground or embedded networked systems, such as oil/gas/water pipes, electrical networks, optical fibre networks, and other underground connected systems. In networked physical systems, a common continuity equation connects all the dynamics (e.g. Navier-Stokes for flow, nonlinear Schrödinger for optical transmission, power flow for electricity). We propose to exploit the common networked physical signals at different IoT monitoring points to encrypt the IoT device’s wireless data. This has two advantages: 1) IoT sensors usually measure the physical signals with very high precision, and 2) no specific knowledge of, or requirement on, the wireless channel or public key distribution is needed. This novel physics-driven security is distinct from both PKC and PLS, and its independence from the digital environment makes it more resilient against digital attacks.
The rest of the paper is organised as follows. We first give a brief overview of how nonlinear networked physical dynamics can be exploited to encode digital IoT data. We then present results from demonstrations premised on real industrial networks, and finally explain the methods in detail.
II Results
II-A Brief Description of Approach
In this work, we propose a novel digital encryption paradigm over a physical network, called graph layer security (GLS). The process is data driven and model agnostic. As shown in Fig. 1, we exploit the correlation (dependency) of the underlying networked physical dynamics to enable encryption amongst digital transceivers without sharing a key or drawing from a public key pool. As such, a key can be generated from the physical dynamic on one node and reconciled by other nodes or by a centralized hub. To analyze the dependency among nodes, we sparsify the high dimensional networked nonlinear dynamics using the Koopman linearization operator and the Graph Fourier Transform (GFT), in order to convert the dependency analysis into one over a small subset of dominant and fixed basis vectors. How this is achieved is shown in the Methods, and proven in the SI with extended validation results and theorem proofs.
Using the correlated physical dynamics, and with the aid of the GFT operator to reconcile the dynamics (these form the encryption basis in Fig. 1), we generate symmetrical keys through the standard PLS steps of [Zhang19, Liu17]: feature extraction and information reconciliation, quantization [Zenger15], privacy amplification, and key agreement [Zeng15]. Two types of secure GLS communication are then designed: an ad hoc multi-hop node-node network and a centralised node-hub network.
II-B Experimental Setting
The underlying network and the physical dynamics are configured as follows. For the network topology, 1000 random samples are configured via the Erdős–Rényi (ER) topological model, with nodes and independent edges included (i.e., ) by probability of .
For the physical dynamics, four general nonlinear ODEs commonly used to model real systems [Barzel13, Gao16] are considered separately below. We emphasize that the ODE models in Eq. (1) are used only to generate the dynamics, and are unknown to the encryption and communication processes (i.e., the process is data driven and model agnostic).
Our algorithm is model agnostic and more sophisticated models of higher dimensional ODEs or PDEs can be used, but for the sake of a demonstration, we use the classic ODE models provided in [barzel2013universality], i.e.,
(1)  
where is the physical dynamic on node , and the parameters , , are all set to for demonstrative purposes. The value of is detected at each node by an IoT sensor with a physical SNR (not to be confused with communication SNR), which is used to generate cipher keys.
In the physical network, any node can be a transmitter (Tx), and the receiver (Rx) can be either a different node in the physical network (node-node) or a digital hub (node-hub). The channels for such communications are wireless and are independent of the physical dynamics.
An eavesdropper is assumed to be able to intercept the same encrypted wireless information as Rx. We assume that the eavesdropper may have some knowledge of the underlying model in Eq. (1) and of the network topology, but does not have access to the physical sensor data across the network. Our goal is to minimize successful eavesdropping, and we minimize the usefulness of the model knowledge through the random signal perturbations described in the Methods. We now present the results.
II-C Performance of GLS
II-C1 Node-node Communication
One illustration of multi-hop node-node communication is provided in Fig. 2(a). The information is encrypted at the Tx node by its underlying physical dynamic, and then transmitted and processed via a group of selected relays and their dynamics (e.g., R1 and R2 in Fig. 2(a)). At the final Rx node, the encrypted information is received and decoded. Encryption and decoding rely on the linear dependency of the underlying physical dynamics at the Tx, relay, and Rx nodes. The dependency analysis and relay selection are carried out offline using the Koopman linearization and the GFT operator described in the Methods.
Fig. 2(b) shows two security benefits, 1) communication security and 2) encryption reliability, by plotting the bit error rate (BER) between (Tx,Rx) versus the physical SNR of the sensors used for dynamic extraction. This further highlights that the security depends on sensor accuracy rather than on wireless channel estimation quality or diversity (PLS) or on public key security. The BERs of (Tx,Rx) stay smaller than those of the eavesdropper (constant ). For high physical SNRs (as one expects of good IoT systems), the encrypted communication channel can achieve a decryption BER of , 3-4 orders of magnitude better than the eavesdropper. This indicates the encryption reliability of GLS, which can be ensured solely by cheap but accurate physical sensors (for key generation and encryption), as opposed to existing wireless-based encryption (PLS), which requires complex and unreliable channel estimation technologies.
We also demonstrate the two benefits via Fig. 2(c), which provides the distribution of (Tx,Rx) pairs across BER regimes, i.e., an order of (including ), an order of , and an order of . It is seen that when physical SNRdB, the ratio of (Tx,Rx) pairs with BER on an order of approaches , suggesting 1) communication security, as the BERs of most eavesdroppers are , and 2) encryption reliability that depends only on the accuracy of the sensors that collect the physical dynamics.
II-C2 Node-hub Communication
An uplink communication from a Tx node of the network to the hub is illustrated in Fig. 2(d), whereby information is encrypted by the physical dynamic of that node, and the decoding process at the hub relies on the dynamic recovered from the samples of the sampling node set. As shown, the recovery error, in terms of the normalized root mean square error (NRMSE), decreases as the physical sensor SNR increases, demonstrating the feasibility of using common features for encryption and decoding.
In Fig. 2(e), the encryption reliability and communication security are shown respectively by the decreasing BER with rising physical SNR, and by the BER gap between node-hub and eavesdropper. The security performance is further demonstrated by the distribution of node-hub uplink channels into different BER regimes in Fig. 2(f). It is observed that when physical SNRdB, the ratio of uplink channels with BER approaches , whereas the BERs of most eavesdroppers remain constant at an order of .
III Discussion and Conclusion
Graph Layer Security (GLS) is proposed for the first time here, as a way to encode networked physical assets’ information via their graph signal processing properties. Our approach is premised on the exploration of correlation and dependency of nonlinear physical dynamics for encryption and decryption. The advantage of this approach, as described and demonstrated, is to rely solely on the IoT sensors’ accuracy in measuring the physical dynamics (e.g. water flow rate, contamination, gas pressure, voltage) of a networked system. Over the past few decades, we have developed cheap and accurate sensors. Therefore, encrypting the digital information by exploiting this accuracy makes sense compared to continuously and accurately estimating the wireless environment in PLS, which remains challenging for small IoT devices.
The challenge with GLS is to develop representative GSP operators that can reflect the complex dynamics, especially those that involve PDEs. Our prior work on sparse sensing has shown that GSP can be applied successfully to the Navier-Stokes PDEs in water distribution networks [8839864]. The data-driven approach generalizes well because it does not require knowledge of the underlying physical model; indeed, many real world systems do not have one, or involve couplings between ODEs and PDEs (e.g. an electricity grid connected to thermal energy storage).
IV Methodology
In this section, we elaborate the graph layer encryption of node-node and node-hub communications, which exploits correlated physical dynamics over the network. The key steps are:
1) artificially induce randomness into the underlying dynamics, to ensure that the model cannot be guessed by brute force;
2) extract the sparse dependency of the dynamics, to generate cipher keys for encryption and decoding.
From these steps, the dependency analysis allows for the selection of relays in the node-node case, and for the dynamic recovery at the hub in the node-hub case.
IV-A Step 1: Dynamic Model with Random Perturbations
To generate the random dynamics, we induce groups of random-time harmless perturbations (e.g. harmless noise) into the physical network. As such, a discrete version of the ODE models in Eq. (1) is expressed as:
(2) 
where is the discrete dynamic matrix of size (representing nodes and total time indices sampled by [8865055]), and (the th column of ) denotes the vector stacked by the dynamics of all nodes at the th time index. is the nonlinear evolution function derived from the continuous differential equations. is the artificial random-time injection, specified as:
(3) 
Here, we assign as a known injection amplitude, and represents the Dirac function governed by the random injection time (i.e., ), which generates the randomness of the physical dynamics (see Fig. 3(a)).
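As a minimal sketch of the structure of Eqs. (2) and (3), impulses of a known amplitude can be injected at random time indices into a discrete nonlinear evolution. The step function `step_fn`, the choice of injecting at randomly chosen nodes, the convention that the injection at index k enters the state at index k+1, and all numerical values are illustrative assumptions, not the settings used in this work.

```python
import numpy as np

def random_time_injection(n_nodes, n_steps, amplitude, n_injections, rng):
    """Build an (n_nodes, n_steps) matrix of impulses at random time indices."""
    u = np.zeros((n_nodes, n_steps))
    # Distinct random injection times; the node hit by each impulse is an assumption.
    times = rng.choice(n_steps, size=n_injections, replace=False)
    nodes = rng.integers(0, n_nodes, size=n_injections)
    u[nodes, times] = amplitude
    return u

def simulate(x0, step_fn, u):
    """Iterate x[:, k+1] = step_fn(x[:, k]) + u[:, k]."""
    n_nodes, n_steps = u.shape
    x = np.empty((n_nodes, n_steps))
    x[:, 0] = x0
    for k in range(n_steps - 1):
        x[:, k + 1] = step_fn(x[:, k]) + u[:, k]
    return x

rng = np.random.default_rng(0)
u = random_time_injection(n_nodes=5, n_steps=200, amplitude=1.0,
                          n_injections=4, rng=rng)
# Placeholder nonlinear map, NOT one of the ODE models of Eq. (1).
x = simulate(np.ones(5), lambda v: 0.9 * v + 0.05 * np.tanh(v), u)
```

Because the injection times are drawn at random, an observer who knows `step_fn` but not the sensor data cannot reproduce `x` from the model alone, which is the purpose of the perturbation.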
IV-B Step 2: Generate Symmetrical Keys: Linearization and Sparsification
GLS encryption requires a dependency analysis for relay selection (node-node) and for dynamic recovery (node-hub). The difficulty lies in the nonlinearity and randomness driven by the evolution function and the random-time injection , which make a direct dependency analysis of the random, nonlinear node dynamics hard to pursue. Instead, we characterize and analyze the dynamics by their dominant and fixed basis (orthogonal vectors). Specifically, we first generate a linearized evolution model using the Koopman operator. Then, a graph Fourier transform (GFT) operator is designed to convert the dependency analysis of the random dynamics into one over the rows of the dominant basis vectors.
IV-B1 Koopman Linearization
Koopman linearization of a nonlinear dynamic system is pursued by the Koopman operator, a linear operator that evolves selected observable functions defined on the original dynamic space. By defining the space of all observable functions as , and stacking such observable functions as with and , the Koopman operator is specified as [williams2015data, 8431738, 9029917]:
(4) 
Existing designs of observable functions rely on the Taylor series of the evolution function , and require observable functions for the multiplicative terms [8431738, 9029917]. This makes them less attractive for large-scale networks (e.g., ), given the high computational and storage cost of processing a Koopman operator of size . To reduce the computational overhead for low power IoT devices, we resort to a logarithmic observable function, which converts the multiplicative terms into a summation of logarithmic terms. Here, we provide the formulation of the logarithm observable functions, i.e.,
(5)  
where is a matrix of size of which the th column is denoted as . is a large constant to make sure close to , and we assign for this work. The detailed derivation and explanation are provided in [wei2020sampling] and in the SI. One illustration of Koopman linearization based on logarithm observable functions is provided in Fig. 3(b), in which we convert the nonlinear dynamics into a linearized evolution by expanding the dynamic on each node via observable functions, therefore using only observable functions. Then, an approximated Koopman operator, denoted as , can be specified for the evolution of the observable functions from time index to , i.e.,
(6) 
The derivation of is from the groups of simulated training dynamics denoted as and the corresponding from Eq. (5), , i.e.,
(7) 
where represents the submatrix of by selecting all rows and second to th columns, and denotes the Moore-Penrose pseudoinverse of the matrix.
The accuracy of the Koopman linearization is provided in Fig. 3(c), where the x-axis represents the number of observable functions, and the y-axis is the normalized RMSE of the Koopman evolved dynamics with respect to the original nonlinear dynamics. It is observed that the normalized RMSE of the proposed logarithm observable design can reach a small value (e.g., ) with only observable functions, as opposed to the existing polynomial design in [8431738, 9029917] that requires . This demonstrates (i) the ability of the proposed logarithm observable functions to linearize the dynamics, and (ii) the feasibility of further processing over a large-scale network (i.e., ).
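The pseudoinverse fit described for Eq. (7) can be sketched as a least-squares problem between time-shifted copies of the observable matrix. The example below is a minimal illustration under stated assumptions: it uses a single trajectory rather than several groups of training dynamics, the observables are generated by a hypothetical linear map `K_true` (not the logarithm observables of Eq. (5)), and the names `fit_koopman`, `Y` and `K_hat` are placeholders.

```python
import numpy as np

def fit_koopman(Y):
    """Least-squares K such that Y[:, 1:] is approximately K @ Y[:, :-1]."""
    return Y[:, 1:] @ np.linalg.pinv(Y[:, :-1])

# Hypothetical linear ground truth, used only to check the fit.
theta = 0.3
K_true = np.array([
    [0.99 * np.cos(theta), -0.99 * np.sin(theta), 0.0],
    [0.99 * np.sin(theta),  0.99 * np.cos(theta), 0.0],
    [0.0,                   0.0,                  0.95],
])

n_steps = 30
Y = np.empty((3, n_steps))        # rows: observables, columns: time indices
Y[:, 0] = [1.0, 0.5, 1.0]
for k in range(n_steps - 1):
    Y[:, k + 1] = K_true @ Y[:, k]

K_hat = fit_koopman(Y)            # recovers K_true up to numerical precision
```

With noisy or genuinely nonlinear data the fit is only approximate, and its quality depends on the choice and number of observable functions.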
IV-B2 Graph Fourier Transform for Sparsification
Based on the linearized dynamic evolution model, we develop a GFT operator to pursue the dependency analysis of the random physical dynamics on the nodes. In essence, the GFT operator, denoted as , is a matrix that transforms the dynamic (observable) space into a space spanned by the graph basis (orthogonal vectors) [Chen15, anis2016efficient, Chen16, ortega2018graph, 7979500, 7480396]. By denoting the graph basis as , we give the GFT and inverse GFT processes via the following equations and via Fig. 3(d):
(8)  
where represents the graph frequency response of .
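The forward and inverse transforms can be sketched with any orthonormal basis. In the example below the basis `U` is taken from the SVD of a synthetic low-rank data matrix, which is an assumption made for illustration and is not the GFT operator design of this work; `gft`, `igft`, `rank` and the matrix sizes are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_nodes, n_steps, rank = 8, 40, 3

# Synthetic "dynamics": a few latent modes mixed over the nodes (rank-3 matrix).
X = rng.standard_normal((n_nodes, rank)) @ rng.standard_normal((rank, n_steps))

# Orthonormal basis from the left singular vectors (illustrative choice).
U, s, _ = np.linalg.svd(X, full_matrices=False)

def gft(U, X):
    """Forward transform: coefficients of X on the columns of U."""
    return U.T @ X

def igft(U, X_tilde):
    """Inverse transform: rebuild the node-domain signal from coefficients."""
    return U @ X_tilde

X_back = igft(U, gft(U, X))            # full basis: exact round trip
U_r = U[:, :rank]                      # keep only the leading basis vectors
X_lowrank = igft(U_r, gft(U_r, X))     # still exact, because X has rank 3
```

The second reconstruction shows why a small number of leading basis vectors can suffice when the dynamics are sparse in the graph frequency domain.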
For this work, we aim to characterize the random dynamics by the leading basis vectors, and therefore convert the dependency analysis to one over the rows of this leading basis. For this purpose, we give our design of the GFT operator below, but leave the detailed derivation to the SI, i.e.,
(9) 
where , and generates a diagonal matrix from a vector. The number of dominant basis vectors, i.e., , is selected to minimize the error of recovering from , which involves a trade-off between the physical noise of the sensors and the coefficient on the th graph basis (as shown in Fig. 3(e)). We give the computation of in the following, but leave the derivation to the SI, i.e.,
(10) 
where is the expected number of injections, denotes the variance of the physical sampling noise of the sensor, and denotes the th singular value of .
IV-C Step 3: Encrypted Communication
For any node to act as the Tx, we encrypt the desired information (a time series of length ) via the physical dynamic on node , denoted as , i.e.,
(11) 
where represents the encrypted information to be transmitted.
IV-C1 Node-node Communication with Relay Selection
For any Rx node , the node-node communication is pursued via selected multi-hop relay nodes whose physical dynamics are linearly dependent on each other, i.e.,
(12) 
In Eq. (12), are the selected relay nodes, and is the corresponding coefficient, which is determined in advance. As such, can be transmitted via the relay nodes , each of which processes the received data via and transmits the processed data to the next relay. Finally, the Rx node decodes the received data via and derives the decoded information , i.e.,
(13)  
We now elaborate how to determine the appropriate relay nodes and their corresponding coefficients for each (Tx,Rx) pair in advance. Recall that relay selection for node-node encryption amounts to finding the nodes whose physical dynamics are linearly dependent on each other. Given the merit of the GFT operator , we convert this linear dependency analysis of the random dynamics into the selection of dependent rows in . Here, we resort to orthogonal matching pursuit (OMP) [karahanoglu2012orthogonal, donoho2012sparse]. Define vector as the positions of in . Then, for any node-node pair (Tx,Rx)=, the relay nodes can be selected via the computation of the weight vector as follows:
(14)  
where denotes the submatrix of by selecting the row with index as (the first element in vector ) and columns with indices spanned from to .
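To illustrate how OMP can pick dependent rows, the sketch below expresses the row of a Tx node as a sparse combination of the rows of other nodes. This is one possible reading of the relay selection step and not the exact form of Eq. (14): the matrix `U_r` is a random stand-in for the leading rows of the GFT operator, and `omp`, `tx`, `candidates` and `relays` are hypothetical names.

```python
import numpy as np

def omp(D, y, n_nonzero):
    """Orthogonal matching pursuit: approximate y with n_nonzero columns of D."""
    norms = np.linalg.norm(D, axis=0)
    support, residual = [], y.copy()
    coef = np.zeros(0)
    for _ in range(n_nonzero):
        # Pick the column most correlated with the current residual.
        scores = np.abs(D.T @ residual) / norms
        if support:
            scores[support] = -np.inf   # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Re-fit all selected coefficients jointly (the "orthogonal" step).
        coef, *_ = np.linalg.lstsq(D[:, support], y, rcond=None)
        residual = y - D[:, support] @ coef
    w = np.zeros(D.shape[1])
    w[support] = coef
    return w, support

rng = np.random.default_rng(2)
n_nodes, r = 10, 4
U_r = rng.standard_normal((n_nodes, r))   # stand-in for leading GFT rows
tx = 0
candidates = [i for i in range(n_nodes) if i != tx]

# Columns of the dictionary are the rows of the candidate nodes.
w, picked = omp(U_r[candidates].T, U_r[tx], n_nonzero=r)
relays = [candidates[j] for j in picked]
```

Here `w` holds the coefficients of the selected nodes, so that the Tx row is reproduced by the weighted sum of the relay rows; in practice fewer than `r` relays may already give an acceptable residual.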
IV-C2 Node-hub Communication
For node-hub encryption, the idea is to select a subset of sampling nodes whose physical dynamics, once collected by the hub, guarantee complete recovery of the dynamics. As such, the information encrypted via the physical dynamic on the Tx node can be decoded at the hub via the recovery of that dynamic, denoted as , i.e.,
(15) 
The selection of sampling nodes and the dynamic recovery rely on graph sampling theory [Chen15, anis2016efficient, Chen16, ortega2018graph, 7979500, 7480396], as specified in the following.
Selection of Sampling Node: We define as the sampling node set. According to [Chen15, anis2016efficient, Chen16, ortega2018graph, 7979500, 7480396], the guarantee of complete signal recovery from samples is:
(16) 
Eq. (16) is implemented via a greedy algorithm that maximizes the minimum singular value of by finding and adding row index to , i.e., , such that .
Physical Dynamic Recovery: After collecting the samples from the nodes in , the hub recovers the complete dynamics for decoding the received information. We denote the samples as . The recovered dynamic observables, denoted as , are computed as [Chen15, anis2016efficient, Chen16, ortega2018graph, 7979500, 7480396]:
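A greedy selection of this kind can be sketched as follows: at each step, the node whose row most increases the smallest singular value of the sampled rows is added. The basis `U_r` is a random orthonormal stand-in, and `greedy_sampling_set` and its arguments are hypothetical names introduced for illustration.

```python
import numpy as np

def greedy_sampling_set(U_r, n_samples):
    """Greedily add the row that maximizes the smallest singular value."""
    selected = []
    remaining = list(range(U_r.shape[0]))
    for _ in range(n_samples):
        best_node, best_sigma = remaining[0], -1.0
        for i in remaining:
            # Smallest singular value of the rows sampled so far plus row i.
            sigma_min = np.linalg.svd(U_r[selected + [i], :],
                                      compute_uv=False)[-1]
            if sigma_min > best_sigma:
                best_node, best_sigma = i, sigma_min
        selected.append(best_node)
        remaining.remove(best_node)
    return selected

rng = np.random.default_rng(3)
# Random matrix with orthonormal columns (12 nodes, 3 leading basis vectors).
U_r, _ = np.linalg.qr(rng.standard_normal((12, 3)))
S = greedy_sampling_set(U_r, n_samples=3)
```

Selecting as many nodes as there are leading basis vectors is the minimum needed for the sampled rows to have full column rank; adding more nodes generally improves robustness to sensor noise at the cost of more samples.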
(17) 
Then, for any time index, the recovered dynamics can be derived via .
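The recovery step can be sketched as a pseudoinverse of the sampled rows of the leading basis, followed by a projection back to all nodes. This is a minimal illustration assuming signals that lie exactly in the span of a random orthonormal basis `U_r`; the sampling set `S`, the noise level and all sizes are hypothetical, and the exact operator of Eq. (17) may differ.

```python
import numpy as np

rng = np.random.default_rng(4)
n_nodes, r, n_steps = 12, 3, 50

# Orthonormal stand-in for the leading graph basis, and signals in its span.
U_r, _ = np.linalg.qr(rng.standard_normal((n_nodes, r)))
X = U_r @ rng.standard_normal((r, n_steps))

S = [0, 3, 5, 8, 10]                      # hypothetical sampling node set
recover = U_r @ np.linalg.pinv(U_r[S, :]) # maps samples on S to all nodes

# Noise-free samples: recovery is exact when U_r[S, :] has full column rank.
X_exact = recover @ X[S, :]

# Noisy samples: recovery error grows with the sensor noise level.
noise_std = 1e-3
X_noisy = X[S, :] + noise_std * rng.standard_normal((len(S), n_steps))
X_hat = recover @ X_noisy
nrmse = np.linalg.norm(X_hat - X) / np.linalg.norm(X)
```

The quantity `nrmse` plays the role of the normalized recovery error discussed in the results, which decreases as the physical sensor SNR increases.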