Graph Layer Security: Encrypting Information via Common Networked Physics

06/05/2020
by   Zhuangkun Wei, et al.

The proliferation of low-cost Internet of Things (IoT) devices demands new encryption mechanisms over their wireless communication channels. Traditional public key cryptography (PKC) demands high computational power and is not suitable for low-power IoT devices, making them vulnerable to more powerful eavesdroppers. Recent advances in physical layer security (PLS) exploit common wireless channel statistics to generate symmetrical keys, but require accurate channel estimation and a high communication signal-to-noise ratio (SNR). As most embedded and underground IoT devices operate in low communication SNR regimes, they cannot reliably use either PKC or PLS encryption. Many IoT devices monitor underground networked assets such as water, oil, gas, and electrical networks. Here, we propose to exploit the monitored physical dynamics data as a basis for encrypting the digital information. Graph Layer Security (GLS) is proposed for the first time here, as a way to encode networked physical assets' information via their graph signal processing properties. Our approach is premised on the exploitation of networked correlation in nonlinear physical dynamics for encryption and decryption. We achieve this using Koopman operator linearisation and Graph Fourier Transform (GFT) sparsification. The resulting GLS encryption scheme, like PLS, does not require the exchange of keys or a public key, and is not reliant on wireless channel properties. Using real world examples, we demonstrate remarkably secure wireless communication encryption. We believe the technology has widespread applicability in secure health monitoring for Digital Twins in challenging radio environments and conclude our seminal paper with a discussion on future development challenges.

I Introduction

Mass digitization of people and things has opened the doorway to the Internet of Things (IoT), critical in many social and industrial applications. In particular, IoT is envisaged to deliver the vital data to inform Digital Twins and improve infrastructure maintenance and safety. In many cases, wireless IoT sensors that measure physical signals (e.g. gas pressure, water contamination) are buried underground. Encrypting the critical infrastructure information is important for national security, commercial sensitivity, and anti-tampering requirements. Many current IoT wireless transmissions (e.g. LoRaWAN, ZigBee) are vulnerable to eavesdropping. Authentication (e.g. over-the-air activation session keys in LoRaWAN, Elliptic Curve Diffie-Hellman in Bluetooth) verifies the user’s identity and prevents malicious users from accessing the network. Encrypted wireless transmission protects data integrity and confidentiality [Poor19].

I-A From Public Key Cryptography to Physical Layer Security

Conventional encrypted communications employ symmetric encryption such as the advanced encryption standard (AES), which relies on a secret key shared between the communicating parties beforehand. Public key cryptography (PKC) is the de facto key distribution protocol. Although efficient, conventional PKC schemes are complex and computationally unsuitable for IoT devices with limited capability. This not only introduces a computational cost challenge, but also puts the IoT devices at a disadvantage against more powerful eavesdroppers with orders of magnitude more computational power.

Physical layer security (PLS) has been proposed in recent years as a way to overcome many of the aforementioned challenges by using the wireless channel properties to create symmetrical keys without the need for a PKC distribution protocol [Poor19]. PLS removes the risk of key interception and the high computational requirement of PKC schemes, which makes it very suitable for IoT devices. PLS does, however, require the wireless channel between nodes to be reciprocal (true for most propagation environments), dynamic (fading), and unique. This ensures robust symmetric key generation and resists brute force attempts. PLS has been applied to a variety of embedded wireless and wired communication systems [Rothe20]. The shortfall of PLS is that it requires the IoT devices to make accurate estimations of the wireless channel statistics [Zhang19]. Accurate estimation requires reasonably powerful signal processing units and a reasonably high communication signal-to-noise ratio (SNR). Many embedded or underground IoT devices operate in low communication SNR regimes, and therefore PLS is not suitable for them.

Fig. 1: Illustration of the digital Graph Layer Security (GLS) encryption scheme using common physical dynamics. (a) Physical dynamics at different points of a utility network can be correlated yet unique, which digital transceivers can exploit to encode their data without the need for public key exchange. (b) The GLS algorithm steps and operators (from networked nonlinear dynamics to orthogonal basis vectors to symmetrical cipher keys).

I-B Introducing Graph Layer Security (GLS): Encryption using Networked Physics

To overcome the PLS requirement for a high communication SNR for accurate wireless channel estimation, we identify and exploit a different physical attribute that is common to many IoT sensors. The general idea of exploiting common physics has been proposed before, such as a common heartbeat in different medical IoT devices across a body. However, those devices typically do not suffer from the aforementioned low communication SNR challenges. Here, we consider IoT devices placed in underground or embedded networked systems, such as oil/gas/water pipes, electrical networks, optical fibre networks, and other underground connected systems. In networked physical systems, a common continuity equation connects all the dynamics (e.g. Navier-Stokes for flow, nonlinear Schrödinger for optic transmission, power flow for electricity). We propose to exploit the common networked physical signals at different IoT monitoring points to encrypt the IoT device’s wireless data. This has two advantages: 1) IoT sensors usually measure the physical signals with very high precision, and 2) no specific knowledge of, or requirement on, the wireless channel or public key distribution is needed. This novel physics-driven security is distinct from both PKC and PLS, and its independence from the digital environment makes it more resilient against digital attacks.

The rest of the paper is organised as follows. First we give a brief overview of how we can exploit nonlinear networked physical dynamics to encode digital IoT data. Then we go to the results to showcase demonstrations premised on real industrial networks. We then explain the methods in detail.

Fig. 2: Performance of GLS. (a)-(c) are for multi-hop node-node communication. (a) gives one illustration of (Tx,Rx) with relays, whereby encryption and decoding are leveraged on the linear dependency of physical dynamics. (b) shows the BER for node-node communications versus the physical SNR of sensors for dynamic collection. (c) provides the distributions of (Tx,Rx) pairs with different levels of BER. (d)-(f) are for node-hub uplink communication. (d) illustrates node-hub uplinks, whereby the encryption and decoding processes depend on the dynamic recovery at the hub using the sampling node set. (e) gives the BER for node-hub communications versus physical SNR. (f) provides the distributions of uplinks with different levels of BER.

II Results

II-A Brief Description of Approach

In this work, we propose a novel digital encryption paradigm over a physical network, called graph layer security (GLS). The process is data driven and model agnostic. As shown in Fig. 1, we exploit the correlation (dependency) of underlying networked physical dynamics to enable encryption amongst digital transceivers without sharing a key or drawing from a public key pool. As such, a key can be generated from the physical dynamic on one node and reconciled by other nodes or a centralized hub. To analyze the dependency among nodes, we sparsify the high dimensional networked nonlinear dynamics using the Koopman linearization operator and the Graph Fourier Transform (GFT), which converts the dependency analysis into one over a small subset of dominant and fixed basis vectors. The way this is achieved is shown in Methods and proven in the SI with extended validation results and theorem proofs.

Using the correlated physical dynamics and with the aid of the GFT operator to reconcile the dynamics (these form the encryption basis in Fig. 1), we generate symmetrical keys through the standard PLS steps of [Zhang19, Liu17]: feature extraction and information reconciliation, quantization [Zenger15], privacy amplification and key agreement [Zeng15]. Two types of secure GLS communications are then designed: an ad-hoc multi-hop node-node network and a centralised node-hub communication network.
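As a rough illustration of the quantization step in such a key-generation pipeline, the sketch below turns a shared measurement into key bits and uses them to mask a message. The one-bit median quantiser, the XOR cipher, and all names and sizes are assumptions chosen for illustration; they are not taken from the paper, whose own encryption rule is given later in the Methods.

```python
import numpy as np

def quantize_to_bits(samples):
    """One-bit quantisation: 1 where a sample exceeds the median, else 0."""
    return (samples > np.median(samples)).astype(np.uint8)

def xor_cipher(bits, key_bits):
    """Mask (or unmask) a bit string with a key of the same length."""
    return np.bitwise_xor(bits, key_bits)

rng = np.random.default_rng(1)
shared = rng.normal(size=64)                       # stand-in for a common physical signal
key_tx = quantize_to_bits(shared)                  # key derived at the transmitter
key_rx = quantize_to_bits(shared + 1e-3 * rng.normal(size=64))  # receiver sees a noisy copy

message = rng.integers(0, 2, size=64, dtype=np.uint8)
cipher = xor_cipher(message, key_tx)
decoded = xor_cipher(cipher, key_rx)
bit_error_rate = np.mean(decoded != message)       # nonzero only where the two keys disagree
```

Any mismatch between the two keys shows up directly as bit errors, which is why the reconciliation and privacy amplification steps matter in practice.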

II-B Experimental Setting

The underlying network and the physical dynamics are configured as follows. For the network topology, 1000 random samples are generated via the Erdős–Rényi (ER) topological model, with a fixed number of nodes and each edge included independently with a fixed probability.
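A minimal sketch of how such random topologies could be drawn is shown below. It assumes an undirected graph without self-loops, and the node count, edge probability, and number of samples are hypothetical placeholders rather than the settings used in the experiments.

```python
import numpy as np

def erdos_renyi_adjacency(n_nodes, edge_prob, rng):
    """Symmetric 0/1 adjacency matrix; each undirected edge is drawn independently."""
    draws = rng.random((n_nodes, n_nodes)) < edge_prob
    upper = np.triu(draws, k=1)          # strictly upper triangle, so no self-loops
    return (upper | upper.T).astype(int)

rng = np.random.default_rng(0)
# Hypothetical values: 20 nodes, edge probability 0.2, 5 samples.
samples = [erdos_renyi_adjacency(20, 0.2, rng) for _ in range(5)]
```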

For the physical dynamics, four general nonlinear ODEs commonly used to model real systems [Barzel13, Gao16] are considered separately below. We emphasize that the ODE models in Eq. (1) are only used to generate the dynamics, and are unknown to the encryption and communication processes (i.e., the process is data driven and model agnostic).

Our algorithm is model agnostic and more sophisticated models of higher dimensional ODEs or PDEs can be used, but for the sake of a demonstration, we use the classic ODE models provided in [barzel2013universality], i.e.,

(1)

where the state variable is the physical dynamic on each node, and the model parameters are all set to a common value for demonstrative purposes. The dynamic is detected at each node by an IoT sensor with a finite physical SNR (not to be confused with communication SNR), and the sensed value is used to generate cipher keys.
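To make the notion of physical SNR concrete, the sketch below adds white Gaussian sensing noise to a clean signal so that the signal-to-noise power ratio matches a target in dB. The Gaussian noise model, the sinusoidal test signal, and the 20 dB value are assumptions for illustration only.

```python
import numpy as np

def add_sensor_noise(signal, snr_db, rng):
    """Return signal plus white Gaussian noise at the requested SNR (in dB)."""
    signal_power = np.mean(signal ** 2)
    noise_var = signal_power / (10.0 ** (snr_db / 10.0))
    return signal + rng.normal(0.0, np.sqrt(noise_var), size=signal.shape)

rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 100_000)
clean = np.sin(2.0 * np.pi * 5.0 * t)          # stand-in for a physical dynamic
noisy = add_sensor_noise(clean, 20.0, rng)     # hypothetical 20 dB physical SNR
```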

In the physical network, any node can be a transmitter (Tx), and the receiver (Rx) can be either a different node in the physical network (node-node) or a digital hub (node-hub). The channels for such communications are wireless and are independent of the physical dynamics.

An eavesdropper is assumed to be able to intercept the same encrypted wireless information as Rx. We assume that the eavesdropper may have some knowledge of the underlying model in Eq. (1) and the network topology, but does not have access to the physical sensor data across the network. Our goal is to minimize successful eavesdropping, and we minimize the usefulness of the model knowledge through the random signal perturbations described in the Methods. We now show the results.

II-C Performance of GLS

II-C1 Node-node Communication

One illustration of multi-hop node-node communication is provided by Fig. 2(a). The information is encrypted at Tx node by its underlying physical dynamic, and then transmitted and processed via a group of selected relays and their dynamics (e.g., R1 and R2 in Fig. 2(a)). At the final Rx node, the encrypted information will be received and decoded. The idea of encryption and decoding is leveraged on the linear dependency of the underlying physical dynamics in Tx, relays, and Rx nodes. Such dependency analysis and relay selections are pursued in an off-line manner by the Koopman linearization and GFT operator described in Methods.

Fig. 2(b) shows two security benefits: 1) communication security, and 2) encryption reliability, by providing the bit error rate (BER) between (Tx,Rx) versus the physical SNR of sensors for dynamic extraction. This further highlights that security depends on sensor accuracy rather than on wireless channel estimation quality or diversity (PLS) or public key security. The BERs of (Tx,Rx) stay smaller than those of the eavesdropper (constant ). For sensitive physical SNRs (as one expects of good IoT systems), the encrypted communication channel can achieve a decryption BER of , 3-4 orders of magnitude better than the eavesdropper. This indicates the encryption reliability of GLS, which can be ensured solely by cheap but accurate physical sensors (for key generation and encryption), as opposed to existing wireless-based encryption (PLS), which requires complex and unreliable channel estimation technologies.

We also demonstrate the two benefits via Fig. 2(c), which provides the distribution of (Tx,Rx) pairs’ BER regimes, i.e., an order of (including ), an order of , and an order of ). It is seen that when physical SNRdB, the ratio of (Tx,Rx) with BER has an order of approach to , suggesting 1) the communication security as BERs of most eavesdroppers are , and 2) the encryption reliability that depends on only the accuracy of sensors for physical dynamic collections.

II-C2 Node-Hub Communication

An uplink communication from a Tx node of the network to the hub is illustrated in Fig. 2(d), whereby information is encrypted by the physical dynamic of that node, and the decoding process at the hub relies on the dynamic recovered from the samples of the sampling node set. As shown, the recovery error in terms of the normalized root mean square error (N-RMSE) decreases as the physical sensor SNR increases, demonstrating the feasibility of using common features for encryption and decoding.

In Fig. 2(e), the communication security and encryption reliability are shown respectively by the decreasing BER with rising physical SNR, and by the BER gap between node-hub and eavesdropper. The security performance can be further demonstrated by the distribution of node-hub uplink channels into different BER regimes in Fig. 2(f). It is observed that when physical SNRdB, the ratio of uplink channels with BER approaches to , whereas the BER of most eavesdroppers are constant to an order of .

III Discussion and Conclusion

Graph Layer Security (GLS) is proposed for the first time here, as a way to encode networked physical assets’ information via their graph signal processing properties. Our approach is premised on the exploration of correlation and dependency of nonlinear physical dynamics for encryption and decryption. The advantage of this approach, as described and demonstrated, is to rely solely on the IoT sensors’ accuracy in measuring the physical dynamics (e.g. water flow rate, contamination, gas pressure, voltage) of a networked system. Over the past few decades, we have developed cheap and accurate sensors. Therefore, encrypting the digital information by exploiting this accuracy makes sense compared to continuously and accurately estimating the wireless environment in PLS, which remains challenging for small IoT devices.

The challenge with GLS is to develop representative GSP operators that can reflect the complex dynamics, especially those that involve PDEs. Our prior work in sparse sensing of water distribution networks has shown that GSP can be applied successfully to Navier-Stokes PDEs in water distribution networks [8839864]. The generality of this data-driven approach is strong as it does not require knowledge of the underlying physical model, and indeed many real world systems do not have one or involve couplings between ODEs and PDEs (e.g. electricity grid connected to a thermo energy storage).

IV Methodology

In this section, we will elaborate the graph layer encryption of node-node and node-hub communications, which exploits correlated physical dynamics over the network. To do so, the key steps are:

  • artificially induce randomness into the underlying dynamics in order to ensure that the model cannot be guessed by brute force

  • extract the sparse dependency of the dynamics in order to generate cipher keys for encryption and decoding.

From these steps, the dependency analysis allows for the selection of relays in the node-node case, and for the dynamic recovery at the hub in the node-hub case.

IV-A Step 1: Dynamic Model with Random Perturbations

To generate the random dynamics, we induce groups of random-time harmless perturbations (e.g. a harmless noise) into the physical network. As such, a discrete version of the ODE models in Eq. (1) is expressed as:

(2)

where is the discrete dynamic matrix of size (representing nodes and total time-indices sampled by [8865055]), and (the th column of ) denotes the vector stacked by the dynamics of total nodes at th time-index. is the nonlinear evolution function derived from the continuous differential equations. is the artificially random-time injection, specified as:

(3)

Here, we assign as a known injection amplitude, and represents the Dirac function governed by the random injection-time (i.e., ) aiming to generate the randomness of physical dynamics, - see Fig. 3(a).
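The sketch below illustrates the idea of a discrete nonlinear evolution perturbed by fixed-amplitude impulses at random time indices. The particular evolution function (linear decay plus saturating neighbour coupling), the forward-Euler step, the choice to inject on every node at once, and all numerical values are hypothetical; they stand in for Eqs. (1)-(3) and are not the models used in the paper.

```python
import numpy as np

def simulate_with_injections(adjacency, n_steps, amplitude, n_injections, rng, dt=0.01):
    """Simulate node dynamics with impulses of known amplitude at random time indices."""
    n_nodes = adjacency.shape[0]
    x = np.empty((n_nodes, n_steps))
    x[:, 0] = 1.0
    # Random injection times; only the amplitude is assumed to be known in advance.
    picks = rng.choice(np.arange(1, n_steps), size=n_injections, replace=False)
    inject_at = set(picks.tolist())
    for k in range(n_steps - 1):
        # Hypothetical nonlinear evolution: decay plus saturating neighbour coupling.
        drift = -x[:, k] + adjacency @ (x[:, k] / (1.0 + x[:, k]))
        x[:, k + 1] = x[:, k] + dt * drift
        if k + 1 in inject_at:
            x[:, k + 1] += amplitude     # discrete stand-in for the Dirac impulse
    return x, sorted(inject_at)

# Usage on a 6-node ring (hypothetical topology).
ring = np.zeros((6, 6))
for i in range(6):
    ring[i, (i + 1) % 6] = ring[(i + 1) % 6, i] = 1.0
rng = np.random.default_rng(3)
x, times = simulate_with_injections(ring, n_steps=200, amplitude=0.5, n_injections=3, rng=rng)
```

Because the injection times differ between runs, two simulations of the same model produce different trajectories, which is the source of randomness the scheme relies on.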

Given the modelling of Eqs. (2)-(3), the purpose of this work is to encrypt the wireless communications of node-node and node-hub, using the artificial physical dynamics of the graph layer.

IV-B Step 2: Generate Symmetrical Keys: Linearization and Sparsification

The GLS encryption requires dependency analysis for the relay selections (node-node) and the dynamic recovery (node-hub). The difficulties lie in the non-linearity and randomness driven by the evolution function and the random-time injection, since direct dependency analysis of the random and nonlinear dynamics on nodes is hard to pursue. Instead, we characterize and analyze the dynamics by their dominant and fixed basis (orthogonal vectors). Specifically, we first generate a linearized evolution model using the Koopman operator. Then, a graph Fourier transform (GFT) operator is designed to convert the dependency analysis of random dynamics to the rows of dominant basis vectors.

Fig. 3: Performance of Koopman linearization and GFT operator, which are designed to analyze dependency between random physical dynamics for node-node and node-hub encryption. (a) shows the random physical dynamics generated by artificial random-time injection. (b) illustrates Koopman linearization, which generates a linearized evolution model of observables defined on the original physical dynamics. (c) provides the accuracy of the Koopman linearized model versus the number of observables. (d) shows the GFT process of observables and the graph frequency response on the graph basis domain. (e) gives the recovery performance versus the number of selected graph basis vectors.

IV-B1 Koopman Linearization

Koopman linearization of a nonlinear dynamic system is pursued by the Koopman operator, which is referred to as a linear operator that evolves the selected observable functions defined on the original dynamic-space. By defining the space of all observable functions as , and stacking such observable functions as with and , the Koopman operator is specified as [williams2015data, 8431738, 9029917]:

(4)

Existing designs of observable functions are leveraged on the Taylor series of the evolution function , and requires observable functions for the multiplicative terms [8431738, 9029917]. This makes them less attractive for large-scale networks (e.g., ), given the high computational and storage consumption to process the Koopman operator of size . To reduce the computational overhead for low power IoT devices, we resort to a logarithm form observable function which is able to convert the multiplicative terms by the summation of logarithm terms. Here, we provide the formulation of the logarithm observable functions, i.e.,

(5)

where is a matrix of size of which the th column is denoted as . is a large constant to make sure close to , and we assign for this work. The detailed deduction and explanation are provided in [wei2020sampling], and in SI. One illustration of Koopman linearization based on logarithm observable functions is provided by Fig. 3(b), in which we convert the nonlinear dynamics as a linearized evolution, by expanding the dynamic on each node via observable functions, therefore using only observable functions. Then, an approximated Koopman operator, denoted as , can be specified for the evolution of observable functions from time-index to , i.e.,

(6)

The derivation of is from the groups of simulated training dynamics denoted as and the corresponding from Eq. (5), , i.e.,

(7)

where represents the sub-matrix of by selecting all rows and second to th columns, and denotes the Moore-Penrose pseudoinverse of the matrix.
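The pseudoinverse fit described here can be sketched as follows: the columns from the second onward are regressed onto the columns up to the second-to-last, giving a least-squares one-step operator. The function name and the toy linear system are assumptions for illustration; in the scheme above the input matrix would hold observables computed from training dynamics rather than raw states.

```python
import numpy as np

def fit_linear_operator(observables):
    """Least-squares K such that observables[:, 1:] ~= K @ observables[:, :-1]."""
    past = observables[:, :-1]       # first to second-to-last columns
    future = observables[:, 1:]      # second to last columns
    return future @ np.linalg.pinv(past)

# Toy check on a known linear system (hypothetical values).
a_true = np.array([[0.9, 0.1],
                   [0.0, 0.8]])
traj = np.empty((2, 30))
traj[:, 0] = [1.0, 2.0]
for k in range(29):
    traj[:, k + 1] = a_true @ traj[:, k]
k_hat = fit_linear_operator(traj)
```

On exactly linear data the fit recovers the generating matrix; on nonlinear data its accuracy depends on how well the chosen observables linearize the evolution.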

The accuracy of the Koopman linearization is provided in Fig. 3(c), where x-axis represents the number of observable functions, and y-axis is the Normalized-RMSE of the Koopman evolved dynamics to the original nonlinear dynamics. It is observed that the Normalized-RMSE of the proposed logarithm observable design can reach a small value (e.g., ) with only observable functions, as opposed to the existing polynomial design in [8431738, 9029917] that requires . This demonstrates (i) the ability of dynamic linearization by the proposed logarithm observable functions, and (ii) the feasibility for further processing over large-scale network (i.e., ).

IV-B2 Graph Fourier Transform for Sparsification

Based on the linearized dynamic evolution model, we develop a GFT operator to pursue dependency analysis of random physical dynamics on nodes. In essence, the GFT operator, denoted as , is a matrix that transforms the dynamic (observable) space into a space spanned by graph basis (orthogonal vectors) [Chen15, anis2016efficient, Chen16, ortega2018graph, 7979500, 7480396]. By denoting the graph basis as , we give the GFT and inverse GFT processes via following equations and via Fig. 3(d):

(8)

where represents the graph frequency response of .
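The forward and inverse transforms can be sketched with any orthonormal basis: the forward step projects a signal onto the basis vectors, and the inverse step rebuilds it from the resulting coefficients. In the sketch below the basis is taken from the leading left singular vectors of a data matrix, which is an illustrative choice and not necessarily the operator design of Eq. (9); the names and sizes are hypothetical.

```python
import numpy as np

def graph_basis(data, n_basis):
    """Leading left singular vectors of a (nodes x time) matrix, plus all singular values."""
    u, s, _ = np.linalg.svd(data, full_matrices=False)
    return u[:, :n_basis], s

def gft(basis, signal):
    """Forward transform: coefficients of the signal on each basis vector."""
    return basis.T @ signal

def inverse_gft(basis, response):
    """Inverse transform: rebuild the node-domain signal from its coefficients."""
    return basis @ response

rng = np.random.default_rng(2)
data = rng.normal(size=(10, 3)) @ rng.normal(size=(3, 40))   # rank-3 data on 10 nodes
basis, singular_values = graph_basis(data, n_basis=3)
response = gft(basis, data)
reconstructed = inverse_gft(basis, response)
```

When the data truly lie in the span of the retained basis vectors, as in this rank-3 toy example, the round trip is lossless; with noisy data the number of retained vectors trades noise rejection against truncation error.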

For this work, we aim to characterize the random dynamics by the leading basis vectors, and therefore convert the dependency analysis to the rows of such leading basis vectors. For this purpose, we give our design of the GFT operator here and leave the detailed derivation to the SI, i.e.,

(9)

where , and is to generate the diagonal matrix by vector. The number of dominant basis, i.e., , is selected to minimize the recovering error of from , which has a trade-off between the physical noise of sensors and the coefficient on th graph basis (as is shown by Fig. 3(e)). We give the computation of in the following, but leave the deduction in SI, i.e.,

(10)

where is the expected number of injections, denotes the variance of the physical sampling noise of the sensor, and denotes the th singular value of .

IV-C Step 3: Encrypted Communication

For any node to act as the Tx, we encrypt the desirable information (a time-series of length ) via the physical dynamic on node , denoted as , i.e.,

(11)

where represents the encrypted information to be transmitted.

IV-C1 Node-node Communication with Relay Selection

For any Rx node , the node-node communication is pursued by the selected multi-hop relay nodes whose physical dynamics are linearly dependent with each other, i.e.,

(12)

In Eq. (12), are the selected relay nodes, and is the corresponding coefficient that will be determined in advance. As such, can be transmitted via the relay nodes , each of which processes the received data via , and transmits the processed data to the next relay. Finally, Rx node decodes the received data via and derive the decoded information , i.e.,

(13)

Then, we will elaborate how to determine the appropriate relays nodes and their corresponding coefficients for each (Tx,Rx) in advance. Recall that the relay selection for node-node encryption is to find the nodes whose physical dynamics are linearly dependent with each other. Given the merit of the GFT operator , we convert such linear dependency analysis on random dynamics to the selection of dependent rows in . Here, we resort to the orthogonal matching pursuit (OMP) [karahanoglu2012orthogonal, donoho2012sparse]. Define vector as the positions of in . Then, for any node-node pair (Tx,Rx)=, the relay nodes can be selected via the computation of weight vector as follows:

(14)

where denotes the sub-matrix of by selecting the row with index as (the first element in vector ) and columns with indices spanned from to .
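A generic orthogonal matching pursuit routine, applied to the rows of a basis matrix, gives a feel for this relay selection: the row of the Rx node is expressed as a sparse weighted sum of rows of candidate nodes, and the chosen rows identify the relays. The sketch below is a textbook OMP, not the exact formulation of Eq. (14); the random basis matrix, node indices, and sparsity level are hypothetical.

```python
import numpy as np

def omp(dictionary, target, max_atoms, tol=1e-10):
    """Greedy sparse solve of dictionary @ w ~= target; returns (support, weights)."""
    norms = np.linalg.norm(dictionary, axis=0)
    norms = np.where(norms > 0, norms, 1.0)
    target = np.asarray(target, dtype=float)
    residual = target.copy()
    support, weights = [], np.zeros(0)
    for _ in range(max_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        scores = np.abs(dictionary.T @ residual) / norms
        if support:
            scores[support] = -np.inf            # never pick the same column twice
        support.append(int(np.argmax(scores)))
        # Re-fit all selected columns jointly, then update the residual.
        weights, *_ = np.linalg.lstsq(dictionary[:, support], target, rcond=None)
        residual = target - dictionary[:, support] @ weights
    return support, weights

rng = np.random.default_rng(4)
basis_rows = rng.normal(size=(8, 3))             # stand-in: one row per node
rx = 0
candidates = [1, 2, 3, 4, 5, 6, 7]
dictionary = basis_rows[candidates].T            # columns are candidate-node rows
support, weights = omp(dictionary, basis_rows[rx], max_atoms=3)
relay_nodes = [candidates[j] for j in support]
```

The returned weights play the role of the relay coefficients determined in advance, and the support gives the selected nodes.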

IV-C2 Node-hub Communication

For node-hub encryption, the idea is to select a subset of sampling nodes, whose physical dynamics collected by the hub can guarantee the complete dynamic recovery. As such, the information encrypted via the physical dynamic on Tx node can be decoded at the hub, via the recovery of such dynamic, denoted as , i.e.,

(15)

The processes of selecting sampling node and dynamic recovery are leveraged on the graph sampling theory [Chen15, anis2016efficient, Chen16, ortega2018graph, 7979500, 7480396], which is specified in the following.

Selection of Sampling Node: We define as the sampling node set. According to [Chen15, anis2016efficient, Chen16, ortega2018graph, 7979500, 7480396], the guarantee of complete signal recovery from samples is:

(16)

Eq. (16) is implemented via a greedy algorithm that maximizes the minimum singular of by finding and adding row index to , i.e., , such that .
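A greedy selection of this kind can be sketched as follows: at each step, the node whose row most increases the smallest singular value of the selected sub-matrix is added to the sampling set. The orthonormal test basis, the set size, and the function name are assumptions for illustration.

```python
import numpy as np

def greedy_sampling_set(basis, n_samples):
    """Greedily add the node that maximises the smallest singular value of basis[S, :]."""
    chosen = []
    remaining = list(range(basis.shape[0]))
    for _ in range(n_samples):
        def smallest_singular_value(node):
            sub = basis[chosen + [node], :]
            return np.linalg.svd(sub, compute_uv=False)[-1]
        best = max(remaining, key=smallest_singular_value)
        chosen.append(best)
        remaining.remove(best)
    return chosen

rng = np.random.default_rng(5)
basis, _ = np.linalg.qr(rng.normal(size=(12, 3)))   # 12 nodes, 3 orthonormal basis vectors
sampling_nodes = greedy_sampling_set(basis, n_samples=3)
```

A larger smallest singular value means the sampled rows are better conditioned, so the later recovery step amplifies sensor noise less.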

Physical Dynamic Recovery: After collecting the samples from nodes in , the hub then recovers the complete dynamics for decoding the received information. We denote the samples as . The recovered dynamic observables, denoted as is computed as [Chen15, anis2016efficient, Chen16, ortega2018graph, 7979500, 7480396]:

(17)

Then, for any time-index, the recovered dynamics can be derived via .
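The recovery step can be sketched as a least-squares fit: the coefficients on the retained basis vectors are estimated from the sampled nodes via a pseudoinverse, and the full signal is rebuilt from them. The sketch assumes a signal that lies exactly in the span of the basis and a hand-picked sampling set; both, along with the names, are illustrative assumptions rather than the form of Eq. (17).

```python
import numpy as np

def recover_from_samples(basis, sample_nodes, samples):
    """Estimate basis coefficients from sampled nodes, then rebuild all node values."""
    coeffs = np.linalg.pinv(basis[sample_nodes, :]) @ samples
    return basis @ coeffs

rng = np.random.default_rng(6)
basis, _ = np.linalg.qr(rng.normal(size=(12, 3)))   # 12 nodes, 3 orthonormal basis vectors
full_signal = basis @ rng.normal(size=(3, 50))      # signal spanned by the basis, 50 time steps
sample_nodes = [0, 3, 7, 10]                        # hypothetical sampling set
recovered = recover_from_samples(basis, sample_nodes, full_signal[sample_nodes, :])
```

The hub would then use the recovered dynamic of the Tx node to decode the received ciphertext.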

References