Network Information Theoretic Security

01/15/2020 · Hongchao Zhou, et al. · Shandong University, Stanford University

Shannon showed that to achieve perfect secrecy in point-to-point communication, the message rate cannot exceed the shared secret key rate, giving rise to the simple one-time pad encryption scheme. In this paper, we extend this work from point-to-point to networks. We consider a connected network with pairwise communication between the nodes. We assume that each node is provided with a certain amount of secret bits before communication commences. An eavesdropper with unlimited computing power has access to all communication and can hack a subset of the nodes not known to the rest of the nodes. We investigate the limits on information-theoretic secure communication for this network. We establish a tradeoff between the secure channel rate (for a node pair) and the secure network rate (sum over all node pair rates) and show that perfect secrecy can be achieved if and only if the sum rate of any subset of unhacked channels does not exceed the shared unhacked-secret-bit rate of these channels. We also propose two practical and efficient schemes that achieve a good balance of network and channel rates with perfect secrecy guarantee. This work has a wide range of potential applications for which perfect secrecy is desired, such as cyber-physical systems, distributed-control systems, and ad-hoc networks.


I Introduction

Information-theoretic security, introduced by Shannon [1] and widely accepted as the strictest notion of security, is becoming increasingly attractive for cyber-physical systems, distributed-control systems, and wireless ad-hoc networks, among other applications. Secure network coding [2] has been well studied to guarantee information-theoretic security when a subset of the channels is wiretapped [3, 4] or in the presence of Byzantine adversaries [5, 6]. In this paper, we make a stronger assumption: all the channels are eavesdropped and some nodes are hacked without the knowledge of the rest of the nodes. This assumption is realistic, for example, in wireless networks, where an eavesdropper can sense the transmitted signals. Under this assumption, pure network-coding approaches cannot work without the help of common randomness shared among the network nodes. Physical layer security [8, 7, 9, 11, 10] can be used to distribute secret keys among network nodes; however, the channel advantage required by the receivers over any eavesdropper is not easy to guarantee in a wireless network. In many scenarios, it is feasible and much cheaper to pre-distribute a very large number of secret bits to network nodes to support future secure communication. In this paper, we are interested in a fundamental problem: if every node in the network is allowed to carry a certain large number of secret bits, how much information can be securely transmitted over the network under the information-theoretic security criterion?

We consider a connected network of nodes, with at most nodes hacked without knowledge of the other nodes. We assume pairwise communication with end-to-end encryption, that is, each sender node encrypts its message using a secret key generated from the common randomness shared with the intended receiver node, and the receiver decrypts it using the same key. Through the process of key pre-distribution, each node has a sequence of secret bits, and the secret bits from different nodes may be correlated according to a symmetric joint probability distribution which does not depend on future communications. If a node is hacked, all its secret bits are disclosed to the eavesdropper. Before two nodes communicate, they generate a secret key by purifying their common randomness with privacy amplification [13, 14, 12, 15], and then use the one-time pad scheme to communicate the message. As the total length of the messages to be transmitted over a channel (i.e., from a source to a destination) cannot exceed the secret-bit length of each node, we define their ratio as the channel rate, and the sum rate of all the channels as the network rate.

The secure-communication limit of a network depends on the secret bits distributed and the method of utilizing them to deliver information. As an example, consider a network with nodes and nodes being hacked. A straightforward way to pre-distribute the secret bits is to assign each pair of nodes common secret bits as the secret key, which they can use with the one-time pad scheme. In this case, the messages are secure if and only if the rate of each channel does not exceed . As the network size increases, this approach can only reach a channel rate of at most , limiting its applications for large networks. Another way to pre-distribute the secret bits in the -node example is to assign every three nodes common secret bits, and hence there are four sequences of secret bits denoted by , where is the secret bits distributed to nodes . When two nodes, say nodes and , communicate with each other, they use as the secret key. In this case, no matter whether node or node is hacked, the messages are secure as long as the channel rate between node and does not exceed . In fact, for a larger network of size , by allowing each secret bit to be distributed to multiple nodes instead of only two nodes, the maximum channel rate (channel capacity) can be improved to more than from . Our scheme can be viewed as another application of linear network coding: it achieves higher secure-communication rates by utilizing the secret bits shared by multiple channels, while perfect secrecy is still guaranteed.
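This small example can be sketched in a few lines of code. The construction below uses illustrative block sizes, and it takes the pair key to be the XOR of the group blocks shared by the communicating pair, an assumption consistent with the XOR-based privacy amplification described later; it shows why hacking any single third node leaves the pair key unknown to the eavesdropper.

```python
import secrets
from itertools import combinations

BLOCK = 16  # bytes of secret bits per group (illustrative size)

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Key pre-distribution: one independent random block per group of three nodes.
nodes = (1, 2, 3, 4)
group_secret = {g: secrets.token_bytes(BLOCK) for g in combinations(nodes, 3)}

def pair_key(i: int, j: int) -> bytes:
    """XOR together the blocks of all groups containing both i and j.
    If any single other node is hacked, at least one of the XORed blocks
    remains unknown to the eavesdropper, so the key stays uniformly random."""
    key = bytes(BLOCK)
    for g, block in group_secret.items():
        if i in g and j in g:
            key = xor(key, block)
    return key

# One-time pad between nodes 1 and 2.
message = secrets.token_bytes(BLOCK)
ciphertext = xor(message, pair_key(1, 2))
assert xor(ciphertext, pair_key(2, 1)) == message
```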

We address several basic questions about our network setting: (i) what is the limit on the network rate and the channel rate for secure communication considering all the symmetric ways of pre-distributing secret bits? (ii) given an arbitrary distribution on the secret bits, how do we determine the security of a network with given channel rates? and (iii) how to design efficient and practical network-communication schemes that have both high network rate and high channel rate? These questions are addressed in the rest of this paper.

II Definitions

We consider a network consisting of a set of nodes , where every two nodes can find a path (channel) connecting them, and we use to denote the set of all the channels. Each node in the network is able to store a large number of secret bits. To guarantee the network security, the total message length of a channel cannot exceed . We call the number of message bits transmitted through a channel per secret bit of a node the channel rate , and the sum of the channel rates the network rate . Mathematically,
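One plausible way to write these definitions in symbols (the notation L_{ij} for the message length of channel (i,j), K for the number of secret bits per node, and E for the channel set is assumed here for illustration) is:

```latex
R_{ij} = \frac{L_{ij}}{K}, \qquad R = \sum_{(i,j)\in E} R_{ij}.
```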

We assume that up to nodes could be hacked by an eavesdropper, and in this case, all the secret bits stored in these nodes are revealed to the eavesdropper. We use to denote the set of hacked nodes, and to denote the set of unhacked (secure) nodes. As a channel is insecure if one of its terminals is hacked, our goal is to protect the channels between secure nodes, denoted by .

A secure network-communication scheme consists of two phases. In the key pre-distribution phase, the scheme provides a sequence of secret bits to the network nodes, with each bit possibly distributed to multiple nodes. We use to denote the sequence of secret bits distributed to node . In the communication phase, we let be the message transmitted in channel (we assume there is a single message for simplicity), and the corresponding ciphertext is . Both the transmitted ciphertexts and the hacked secret bits are known to the eavesdropper.

The network communication is information-theoretically secure if and only if for any possible set of hacked nodes and for any channel , we have

(1)

for small , with for perfect secrecy. Note that the security of the network depends on the message lengths (i.e. the channel rates for a given ) but not on their exact values or which of the two terminals is the source. We say that a set of channel rates is achievable using a scheme if any messages with these channel rates can be securely communicated with high probability (arbitrarily close to ) for sufficiently large. We denote the set of achievable rates using a scheme by .

We define the maximum channel rate and the maximum network rate of a scheme as the supremum of all the achievable channel rates and network rates of the scheme, and denote them by

The communication demand is typically unknown during the key pre-distribution phase. To ensure that the designed schemes have enough flexibility, the secret bits are pre-distributed to the network nodes in a symmetric fashion (thus permuting the indices of the nodes does not change the joint probability distribution of the secret bits). In this case, the maximum network rate can be achieved with equal channel rates, i.e., for all the channels, following the symmetry of the network as well as of the distributed randomness. On the other hand, the maximum channel rate can be achieved when only one channel carries message transmissions, i.e., for a specific single channel and for all the other channels.

To investigate the communication limit, we define the channel capacity and the network capacity of a network as the maximum over all the maximum channel rates and maximum network rates of any scheme with symmetric key distribution, and denote them by

III Summary of Main Results

There is a certain tradeoff between the maximum network rate and the maximum channel rate of a scheme. In particular, they have the following relationship for , implying that one may sacrifice the maximum network rate to obtain a better maximum channel rate, and vice versa.

Theorem 1.

Given a network of nodes without any nodes being hacked, any scheme with symmetric key distribution satisfies

(2)

The equality is achievable for any .

The maximum channel rate of a scheme is at most . But this upper bound cannot be reached when . The following result provides the capacities of a general network, as the theoretical limit for all the schemes with symmetric key distribution. From this result, a network with and has channel capacity of . Interestingly, if we further increase the size of the network to , the channel capacity is still .

Theorem 2.

Given a network of nodes with at most nodes being hacked, its network capacity and channel capacity are

Given a scheme, it is crucial to determine whether a network with channel rates is perfectly secure or not, as the communication phase has to be terminated to guarantee perfect secrecy when the channel rates get very close to the limit. The difficulty arises from the fact that different channels may share some common secret bits and hence "interfere" with each other. Given an arbitrary (symmetric or not) distribution of the secret bits, we prove that perfect secrecy is achievable if and only if the sum rate of any subset of unhacked channels does not exceed the shared unhacked-secret-bit rate of these channels. Here, given the set of hacked nodes , the shared unhacked-secret-bit rate of a set of channels is defined by

Theorem 3.

Given a network of size , the channel rates are achievable if and only if for any possible set of hacked nodes and for any subset of channels , it satisfies either or¹

(3)

¹ It is assumed that privacy amplification is applied to every channel. Otherwise, the equality holds if the common secret bits shared by two nodes are unknown to the rest of the nodes and are used as the secret key directly.

Achievability uses a simple method for privacy amplification that generates each secret-key bit by computing the XOR of randomly sampled common secret bits with . For a network with nodes, assume that the secret bits are distributed as follows: each secret bit is distributed to different nodes, and every nodes share common secret bits. If no nodes are hacked, then the network is secure for sufficiently large if and only if

holds for any node permutation. None of the inequalities can be dismissed. If one of the nodes is hacked, then the network is secure for sufficiently large if and only if

(4)

holds for any node permutation.

Note that if the size of the network is large, the number of constraints in the above criteria becomes prohibitively large. One can reduce the computational complexity by relaxing the conditions to hold only for any set of channels of some size ,

where the left term is easy to calculate and the right term can be computed explicitly (see subsection VI-A) for symmetric key distribution.

The following result provides an alternative approach to check the security of a network, in which we let be the set of secret bits distributed only to all the nodes in . The shared unhacked-secret-bit rate of is defined by .

Theorem 4.

Given a network of size , the channel rates are achievable if for any possible set of hacked nodes , there exists a non-negative feasible solution for such that

We continue using the network of nodes as an example. If node is hacked, then only is not hacked. The network is secure for sufficiently large if there exists a feasible solution for such that

reaching the same condition as (4).

IV Network Schemes

Maximum Network Rate   Maximum Channel Rate
Combinational Key Scheme
Random Key Scheme
TABLE I: The maximum rates of the combinational key scheme and the random key scheme.

We wish to develop schemes that can securely communicate as many message bits as possible not only over the entire network but also through a single channel.

As defined earlier, a secure network-communication scheme consists of a key pre-distribution phase and a communication phase, in which a secret key between two nodes is established from their common secret bits via privacy amplification and the one-time pad scheme is used to achieve secure communication. There are a variety of methods for privacy amplification, such as universal hashing [16], random linear transformations [17], and polar codes [18]. Given a large number of common secret bits between two nodes, one straightforward idea is to divide the shared secret bits into blocks, and then apply privacy amplification to each block. However, this approach is not appropriate for our applications, as it requires knowledge of the message length (channel rate) before communication as well as sophisticated coordination among the nodes. We adopt a simple method for privacy amplification: each secret-key bit is generated by computing the XOR of randomly sampled common secret bits from with a large integer, for example, . We can repeat this process whenever more secret-key bits are needed. The performance of this method is very close to optimal. To further improve the computational efficiency, one can generate secret-key bits simultaneously by packing secret bits together at the same location and performing the same operations on them.
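A minimal sketch of this privacy-amplification step, assuming the shared pool is given as a list of bits and using an illustrative sample size d (all names and parameters here are hypothetical):

```python
import random

def generate_key_bits(common_bits, num_key_bits, d, seed):
    """Each secret-key bit is the XOR (parity) of d positions sampled at random
    from the pool of common secret bits. Both terminals run this with the same
    public seed, so they derive identical key bits without extra coordination."""
    rng = random.Random(seed)           # the sampling pattern itself may be public
    key = []
    for _ in range(num_key_bits):
        positions = rng.sample(range(len(common_bits)), d)
        bit = 0
        for pos in positions:
            bit ^= common_bits[pos]
        key.append(bit)
    return key

# Example: derive 128 key bits from a pool of 10,000 shared secret bits.
pool = [random.getrandbits(1) for _ in range(10_000)]
key = generate_key_bits(pool, num_key_bits=128, d=25, seed=2020)
```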

IV-A Key Pre-Distribution

We study two different key pre-distribution schemes. The first is the combinational key scheme, which distributes each secret bit to exactly nodes. The second scheme is the random key scheme, which distributes each secret bit to every node with some probability .

The combinational key scheme distributes the same number of distinct secret bits to each combination of nodes with . Hence, we divide all the secret bits into groups, each of size , and assign the secret bits of each group to a unique combination of nodes. For every two nodes, their shared common secret bits consist of the secret bits from groups. There is a problem with this scheme: when and are large, the scheme becomes less practical as there are too many groups of secret bits. To address this problem, we suggest using only random groups. This subset of groups can be found based on an random matrix with each row containing ones (corresponding to a group) and each column containing ones (corresponding to a node), whose construction has been studied for the parity-check matrices of regular LDPC codes [19].
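A sketch of this pre-distribution, assuming n nodes, group size m, and an equal number of secret bits per group (function and parameter names are illustrative):

```python
from itertools import combinations
import secrets

def combinational_keys(n, m, bits_per_group):
    """Assign an independent block of secret bits to every m-node combination.
    Returns each node's storage as a dict mapping group -> secret block."""
    storage = {v: {} for v in range(n)}
    for group in combinations(range(n), m):
        block = secrets.token_bytes(bits_per_group // 8)
        for v in group:
            storage[v][group] = block
    return storage

def common_pool(storage, i, j):
    """Concatenate the blocks of all groups shared by nodes i and j; this is the
    pool from which the XOR-based privacy amplification above draws its samples."""
    return b"".join(block for group, block in sorted(storage[i].items()) if j in group)

storage = combinational_keys(n=6, m=3, bits_per_group=1024)
pool_12 = common_pool(storage, 1, 2)   # nodes 1 and 2 share C(4, 1) = 4 groups
```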

The random key scheme generates a random-bit sequence of length and assigns each of its bits to every network node with a predetermined probability . Similar ideas were explored for key management in sensor networks under computational security [20, 21]. In contrast, we study key distribution for information-theoretic security, which directly affects the communication rates. With the random key scheme, each node obtains a sequence of secret bits with length around . A difficulty with this scheme is helping every two nodes identify their shared common secret bits for generating a secret key. It would be too storage-inefficient if each node stored not only the values of its secret bits but also their locations in . Our observation is that the secret bits need to be truly random, but the way of distributing them does not. One can use a pseudo-random permutation for key pre-distribution, which helps to identify the common secret bits between nodes. Specifically, given the network size and the total number of secret bits , we construct a pseudo-random permutation [22] . We distribute the th secret bit in to node at location if . This pseudo-random permutation is publicly known to all the network nodes.
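A sketch of the random key scheme; a seeded pseudo-random generator stands in for the pseudo-random permutation of [22], so that any two nodes can recompute which positions of the global sequence they share without storing the locations (all names and parameters are illustrative):

```python
import random, secrets

def assigned_positions(node, total_bits, p, seed):
    """Positions of the global secret sequence assigned to `node`.  The
    assignment rule is pseudo-random and publicly reproducible; only the
    bit values at those positions are secret."""
    rng = random.Random(f"{seed}:{node}")
    return {t for t in range(total_bits) if rng.random() < p}

N, P, SEED = 100_000, 0.05, 42
global_bits = [secrets.randbelow(2) for _ in range(N)]   # the truly random source

# Each node stores only the values at its own positions, not the positions.
node_store = {v: {t: global_bits[t] for t in assigned_positions(v, N, P, SEED)}
              for v in range(4)}

# Nodes 0 and 1 recompute their common positions and agree on the shared bits.
common = assigned_positions(0, N, P, SEED) & assigned_positions(1, N, P, SEED)
assert all(node_store[0][t] == node_store[1][t] for t in common)
```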

Iv-B Maximum Rates

Table I lists the maximum rates of the combinational key scheme and the random key scheme. The maximum rates of the combinational key scheme share a common term , which is a decreasing function of with . This term captures the effect of the number of hacked nodes on the maximum rates of the scheme. From this term, we can estimate the number of hacked nodes that the scheme can tolerate. For instance, when , , and one can tolerate relatively large . When is large, the scheme can only tolerate a very small number of hacked nodes. We observe similar behaviors for the maximum rates of the random key scheme, which have a common term of approximately that captures the effect of .

Further comparing the maximum rates of the two schemes, there is a rough mapping between the parameter in the combinational key scheme and the in the random key scheme, which is roughly the expected number of nodes that each secret bit is distributed to. The intuition behind this mapping is as follows: compared to the scheme that distributes each secret bit to nodes, the proposed schemes distribute each secret bit to around nodes. As a result, the maximum channel rates of the schemes increase by a factor of . Meanwhile, the usage efficiency of each secret bit (corresponding to the maximum network rates) is reduced by a factor of roughly , as each secret bit can be used only once, by two nodes among the nodes.

Iv-C Hybrid Schemes

For any combinational/random key scheme, the product of its maximum network rate and its maximum channel rate does not exceed . This implies that with a pure combinational/random key scheme, a high network rate and a high channel rate cannot be achieved at the same time.

We denote the combinational key scheme with as the pairwise key scheme, which reaches the network capacity. To better balance the maximum network rate and the maximum channel rate, we consider a hybrid scheme, in which each node uses a fraction of its storage space to run the pairwise key scheme and the rest to run the combinational key scheme with (or a random key scheme with ). The maximum network rate of this hybrid scheme is the weighted sum of the component schemes' maximum network rates, and so is its maximum channel rate. For , for example, the maximum rates of the hybrid scheme are

The maximum network rate is strictly larger than , and the maximum channel rate can be adjusted by selecting an appropriate . For example, for a network with nodes and , the maximum network and channel rates of the pairwise key scheme are and , respectively, and those of the hybrid scheme with are and , respectively, which improves the maximum channel rate at the cost of the maximum network rate.
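In general form, with α denoting the storage fraction devoted to the pairwise key scheme (the superscripts are notation assumed here), the weighted sums stated above read

```latex
R_{\mathrm{net}}^{\mathrm{hyb}} = \alpha\,R_{\mathrm{net}}^{\mathrm{pair}} + (1-\alpha)\,R_{\mathrm{net}}^{\mathrm{comb}},
\qquad
R_{\mathrm{ch}}^{\mathrm{hyb}} = \alpha\,R_{\mathrm{ch}}^{\mathrm{pair}} + (1-\alpha)\,R_{\mathrm{ch}}^{\mathrm{comb}}.
```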

V Proofs of Main Results

In this section we provide proofs of some of our results.

V-A Proof of Theorem 1

Given a scheme , let be the sequence of independent random bits from which the distributed secret bits are chosen. Denote the fraction of bits that are distributed to exactly nodes by , with and

The total amount of memory needed to store the distributed secret bits is

(5)

Here, we write for a function .

Firstly, the total number of secure message bits is upper bounded by the total number of secret bits , hence the maximum network rate

(6)

Secondly, the number of message bits that can be communicated in a channel is upper bounded by the number of common secret bits shared by the two terminals. If the key distribution is symmetric, then

(7)

The last step follows since for .

From (6) and (7), we obtain

Let us now prove achievability, starting from two simple schemes. In the first scheme (called the same key scheme), all the nodes share the same set of secret bits, and its maximum rates are

(8)

The second scheme is the pairwise key scheme, whose maximum rates are

(9)

The equality in the theorem holds for both the same key scheme and the pairwise key scheme. Here we construct a scheme as a hybrid of the two simple schemes: each node uses a fraction of its storage space for the same key scheme and the rest for the pairwise key scheme. The maximum rates of the hybrid scheme are

By adjusting the fraction , we can obtain all the maximum network rates and the maximum channel rates meeting the equality in the theorem.

V-B Proof of Theorem 2

The network capacity is easy to derive: The total message length communicated with a node cannot exceed , hence

This leads to , yielding the upper bound on the network capacity. This upper bound is achievable using the simple pairwise key scheme. We continue to prove that the channel capacity is at most with , and that it is achievable.
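The counting behind this bound can be sketched as follows, assuming L_{ij} denotes the message length of channel (i,j), K the secret-bit budget of each node, and n the number of nodes (notation assumed here):

```latex
\sum_{j\neq i} L_{ij} \le K \ \ \text{for every node } i
\;\Longrightarrow\;
2\sum_{(i,j)\in E} L_{ij} \le nK
\;\Longrightarrow\;
R = \sum_{(i,j)\in E} \frac{L_{ij}}{K} \le \frac{n}{2},
```

with equality achieved by the pairwise key scheme, in which each pair holds K/(n-1) dedicated key bits.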

Given the sequence of secret bits stored in node , , for all , the entropy of is at most . Due to the symmetry of the network, without loss of generality, we assume that the first nodes are hacked; then the hacked secret bits are , and the maximal number of message bits that can be securely communicated between node and node is the mutual information between and conditioned on . As a result,

Since is invariant under any permutation of node indices, for simplicity, we can rewrite it as .

We can generalize this concept of conditional mutual information to a higher order as

with the short notations,

(10)

This quantity is the amount of mutual information among all the nodes given the secret bits of any other nodes.

From (10), it can be shown that

(11)

where .

On the other hand,

(12)

From (11) and (12), it can be shown that

This leads to the upper bound on the channel capacity. This upper bound can be achieved with the combinational key scheme with if the underlying privacy amplification is asymptotically optimal.

V-C Proof of Theorem 3

The necessity is easy to prove: if there exists a subset of channels violating (3), then their total message length must be larger than the number of unhacked secret bits they use. As a result, at least one of these messages must be information-theoretically insecure.

To prove achievability, we consider a simple method for privacy amplification: for every channel , given the common secret bits , the secret key is obtained as , with a sparse random matrix of density . The reason for using this method is not only its asymptotic optimality but also its practicality. It is the basis of our proposed network schemes.

Let be the secret keys between unhacked nodes, and let be the distinct secret bits stored in hacked nodes, which are disclosed to the eavesdropper. The network communication is information-theoretically secure if and only if for any possible set of hacked nodes , the secret keys and the hacked secret bits are truly random bits, and are independent. Note that both and can be written as linear transformations of the source sequence .

Let be the concatenation of and , then

(13)

for some matrix . The network is information-theoretically secure if and only if all the rows in matrix are linearly independent.
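Checking this condition amounts to a rank computation over GF(2); below is a small illustrative sketch (not the paper's algorithm), with each row of the matrix packed into an integer:

```python
def rows_independent_gf2(rows):
    """Gaussian elimination over GF(2).  Each row is an int whose binary digits
    are the row's entries; returns True iff the rows are linearly independent."""
    pivots = []                       # reduced rows kept so far, distinct leading bits
    for r in rows:
        for p in pivots:              # clear p's leading bit from r if it is set
            r = min(r, r ^ p)
        if r == 0:                    # r is a GF(2) combination of earlier rows
            return False
        pivots.append(r)
        pivots.sort(reverse=True)     # keep highest leading bit first
    return True

# Example with rows of width 4: the third row below equals the XOR of the first two.
print(rows_independent_gf2([0b1100, 0b0110, 0b1010]))   # False
print(rows_independent_gf2([0b1100, 0b0110, 0b0011]))   # True
```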

We can write the secret key as

for some matrices and , where is an matrix consisting of random columns of density and zero columns.

Then is represented by

(14)

where

is an identity matrix and

consists of all the matrices with . The network is perfectly secure if the rows in are linearly independent. This is equivalent to showing that the rows in are linearly independent, i.e., all the rows in are linearly independent. This can be proved based on the following results.

Lemma 5.

All the rows in are linearly independent if and only if for any subset of channels , there does not exist any subset of rows from that includes at least one row from each matrix such that their sum is a zero-vector.

Lemma 6.

Given any subset of channels , if with a random matrix of density and with , then when , with probability almost there does not exist any subset of rows from that includes at least one row from each matrix such that their sum is a zero-vector.

The proof of Lemma 6 is provided in subsection V-E. Finally, we can conclude that the rows of the security matrix are linearly independent with high probability, and the criteria in Theorem 3 are sufficient.

V-D Proof of Theorem 4

Using the same argument as in the proof of Theorem 3, the network is perfectly secure if and only if the rows of the matrix in (14) are linearly independent. In Theorem 4, this matrix has the following properties: there are columns in corresponding to , in which each column has random entries with corresponding to with .

The rank of the matrix remains unchanged if we perform elementary row or column operations on . The rows of a matrix are linearly independent if and only if the matrix can be reduced to the simplest form

by elementary operations such that it consists of an identity matrix and a zero matrix.

If there exists a feasible solution for , we can divide the columns corresponding to into some groups of sizes with and .

On the other hand, we can divide the rows corresponding to into some groups of sizes with and

According to the inequalities in the theorem, we have either or .

Based on the row groups and the column groups, the matrix is divided into sub-matrices, whose dimensions are . By permuting the rows and columns of the matrix , it can be transformed into a form such that the sub-matrices of dimensions lie on the diagonal of the sub-matrices. We denote the sub-matrices on the diagonal by , and the matrix is transformed to

The sub-matrices are random matrices of density . The dimension of the sub-matrix is for some such that for or .

For the sub-matrix , according to Lemma 7 in subsection V-E, the rows of are linearly independent with high probability when is sufficiently large. The sub-matrix can be reduced to its simplest form, consisting of an identity matrix and a zero matrix, by elementary operations on . Furthermore, all the other entries to the right of (in the same rows as ) can be reduced to by elementary column operations. At this point, each sub-matrix with is transformed to with

for some independent of , and the matrix is reduced to

We continue repeating the above process to handle iteratively. For the sub-matrix , it can be proved that the conclusion of Lemma 7 still holds, and all the rows of are linearly independent with high probability when is sufficiently large.

Finally, all the sub-matrices are reduced to their simplest forms with high probability, and all the other entries on their right are s. In this case, the matrix

is reduced to the reversed row echelon form, and it has full rank. Hence all the rows of the matrix

are linearly independent with high probability if is sufficiently large. This leads to the achievability of the channel rates.

V-E Proof of Lemma 6

We first prove the following result.

Lemma 7.

Let be a random matrix such that the probability of each entry being is . The rows in are linearly independent with high probability for sufficiently large if and only if .

Each row in is an independent random vector. The sum of any rows in is still an independent vector. Denote the probability of its entry being by .

It is easy to show that

from which, by induction, we obtain

Furthermore, since the sum of any rows is an independent vector, the probability for it being a zero-vector is

The rows of are linearly independent if and only if for any subset of the rows, their sum is not a zero-vector. Hence, the probability of the rows of being linearly independent

where is the number of subsets consisting of rows.
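A plausible reconstruction of the quantities in this argument, assuming p_l denotes the probability that a fixed entry of the sum of l chosen rows equals 1, t the number of columns, and k the number of rows (notation assumed here):

```latex
p_{l+1} = p_l(1-p) + (1-p_l)\,p,\qquad p_1 = p
\;\Longrightarrow\;
p_l = \tfrac12\bigl(1-(1-2p)^{l}\bigr),

\Pr[\text{sum of } l \text{ chosen rows} = \mathbf{0}] = (1-p_l)^{t},
\qquad
\Pr[\text{rows linearly dependent}] \;\le\; \sum_{l=1}^{k} \binom{k}{l}\,(1-p_l)^{t}.
```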

This leads to

(15)

When with sufficiently large, we have

When with sufficiently large, we have

When with sufficiently large, we have

Summing up the above results, we obtain

for any when is sufficiently large.

Lemma 8.

Given with , let with be a binary matrix such that each entry in columns is with probability and each entry not in columns is . If and , as , with high probability there does not exist any subset of rows from that includes at least one row from each matrix such that their sum is a zero-vector.

We say that a set of matrices is linearly cross-independent if and only if there does not exist a subset of rows from that includes at least one row from each matrix such that their sum is a zero-vector. If the rows of are linearly cross-independent, this does not necessarily imply that these rows are linearly independent. For example, consider the matrices

The rows in are linearly cross-independent, but not linearly independent, as the rows in are not linearly independent.

One observation is that if are linearly cross-independent on a subset of columns, then are linearly cross-independent on all the columns.

We divide the columns into at most groups depending on which the column belongs to. Two columns are in the same group if and only if they belong to the same subset of . Now, we are only interested in the groups of size (sufficiently large groups), and the union of their columns is denoted by . Then

for sufficiently small , which leads to for sufficiently large . We will prove that the matrices are linearly cross-independent on the columns in .

Given with , we choose rows from with , and we use to denote the probability that the sum of all the chosen rows is a zero-vector on . Then the probability that the matrices are not linearly cross-independent on is

There are two possibilities considering the chosen rows: (1) every column in has more than random entries in the chosen rows; and (2) there exists a group (among the up to groups) of columns in , whose size is at least and in which each column has at most random entries in the chosen rows. We use to denote the probability that the sum of chosen rows is a zero-vector on in the first case, and to denote that in the second case. It can be shown that

for sufficiently small , and

Considering all possible choices of , as , the total probability of the first case is

for sufficiently small .

Considering all possible choices of , as , the total probability of the second case is