Physical layer security exploits the inherent randomness in a communication environment to derive security; this form of security makes no assumptions on the eavesdropper’s capabilities. This is in direct contrast to computational based security which derives security based on the assumption that the eavesdropper has bounded computational resources.
Computational based security has been the de facto security for communication systems since its inception due especially to its ease of implementation; however, the main assumption of computational boundedness has been scrutinized in recent years more than ever. One of the primary reasons for this scrutiny is the potential advent of practical quantum computers in the near future. On the other hand, physical layer security is impervious to advances in computing, in particular quantum computing, because it makes no underlying assumptions on computational resources. Thus, regardless of the technology the eavesdropper possesses, physical layer security maintains its integrity. In this way, physical layer security is inherent security.
Despite this clear advantage, physical layer security is still underutilized in modern communication systems. This is primarily because most proposed schemes to implement physical layer security are impractical. The schemes are most often only shown to exist, with no tangible construction known, i.e., proofs are by existence and not by construction. Moreover, even when a construction is given, it is rarely efficient in block length.
Overcoming these hurdles has been one of the primary aims of the physical layer community for quite some time. But there is yet another reason physical layer security has not found common use in new communication systems; this reason is significantly more subtle. The measure of security provided by most physical layer security schemes is insufficient to be used in a practical setting.
There is no direct analog of this problem that arises from computational based security because in that case the underlying assumption that certain decision problems are computationally hard is unproven anyway. Here, in physical layer security where security is rigorously proven, the choice of how security is measured needs to be consistent with reality if the proof of security is to hold any merit.
If a physical layer scheme could be created that is tangible, efficient, utilizes the most realistic measure of security, and achieves an input/output rate near the theoretical maximum, then physical layer security could potentially rival computational based security as the de facto security of modern communication systems, or at the very least could be an indispensable component. Motivated by this, herein we develop a physical layer coding scheme that aims to satisfy all of these properties and in some cases even does.
I-A Background - Security Metrics
Physical layer security is often modeled by a wiretap channel, which was introduced in the 1970’s by Wyner [Wyner] and later generalized by Csiszár and Körner [csiszarkorner]. The metric used to measure security in these works is now colloquially referred to as the weak security metric. For years, this was the primary metric used to measure security on wiretap channels; however, it was asserted in the 1990’s in [strongsecurity] that the weak metric provided an inadequate measure of security to be deemed practical. This led to the creation of the strong security metric, the unnormalized version of the weak metric.
This metric sufficed for a while, but in 2012 it was again shown to be an inadequate measure of security for realistic communication systems by Bellare, Tessaro, and Vardy [cryptoTreatment]. In addition to showing this, they created three new security metrics provably stronger than the strong security metric and proved them asymptotically equivalent. For the purposes of this paper, due to their equivalence, we will refer to all three of these metrics collectively as the semantic security metric, the name given in [cryptoTreatment]. This metric is now held to be the gold standard of security metrics for the wiretap channel. Moreover, it is argued that a stronger security metric than the semantic security metric does not exist. For these reasons, it is the only measure of security that should be utilized in practice. Admittedly, proving results with this metric tends to be more arduous; therefore many results in the literature still use the strong security metric and even the weak security metric, but in this work we will exclusively use the semantic metric to prove security.
I-B Background - Fading Channels
In addition to focusing on physical layer security schemes that are tangible, efficient, and utilize semantic security, we will be primarily concerned with the most realistic of wiretap channel models: the fading wiretap channel. Fading wiretap channels are commonly used to model security of wireless communications. They assume the input signal is attenuated/amplified and then corrupted by some additive noise. The amount of attenuation/amplification is called the channel state. When the channel state changes frequently and independently, we are in the so-called fast fading regime. This is one of the most practical fading wiretap channel models and is the main focus of our applications.
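As a rough illustration of the model just described, the following sketch simulates a fast fading channel in which every input symbol is scaled by an independently drawn channel state and then corrupted by additive noise. The complex Gaussian (Rayleigh-type) state distribution, the function names, and all parameter values are assumptions made only for illustration; the text above does not fix them.

```python
import random

def complex_gaussian(rng, std):
    """Draw a circularly symmetric complex Gaussian sample with standard deviation std."""
    s = std / 2 ** 0.5  # split the variance evenly between real and imaginary parts
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))

def fast_fading_channel(x, noise_std, rng, state_std=1.0):
    """Return (outputs, states) with y_i = h_i * x_i + n_i and a fresh state h_i per channel use."""
    outputs, states = [], []
    for xi in x:
        h = complex_gaussian(rng, state_std)   # channel state, redrawn independently each use
        n = complex_gaussian(rng, noise_std)   # additive noise
        outputs.append(h * xi + n)
        states.append(h)
    return outputs, states

rng = random.Random(0)
x = [1 + 0j, -1 + 0j, 1 + 0j, 1 + 0j]
# Two point-to-point channels share the same input: one to Bob, one to Eve.
y_bob, h_bob = fast_fading_channel(x, noise_std=0.1, rng=rng)
y_eve, h_eve = fast_fading_channel(x, noise_std=0.5, rng=rng)
```

In this toy setup, the lists `h_bob` and `h_eve` play the role of the two channel-state sequences that may or may not be fed back to the transmitter.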
Due to the nature of wireless systems, fading wiretap channels sometimes assume that the current channel state is fed back from the receiver to the transmitter (this is abbreviated by CSIT - instantaneous channel state information at the transmitter). However, since there are actually two point-to-point channels within a wiretap channel, the transmitter potentially receives both of these channel states, a channel state corresponding to the intended receiver’s channel and a channel state corresponding to the eavesdropper’s channel.
We denote the case when the transmitter knows neither of these channel states by No-CSIT, although we do assume the transmitter knows the statistics of the channel states as random variables. We denote the case when the transmitter knows the intended receiver’s current channel state but not the eavesdropper’s current channel state (only the statistics) by partial CSIT. Lastly, we denote the case when the transmitter knows both current channel states by full CSIT.
The level of CSIT drastically changes which secure rates are achievable. For this reason, we will treat No-CSIT, partial CSIT, and full CSIT as separate wiretap channels entirely.
I-C Related Work
In [cryptoTreatment] and also in [semanticallySecure], a tangible (concrete) and efficient wiretap coding scheme was given that could achieve positive secrecy rates on discrete memoryless wiretap channels under semantic security. In certain cases, this wiretap scheme could also achieve the semantic secrecy capacity [channelupgrading]. In [explicitGaussianWiretap], this scheme was extended for use on the AWGN wiretap channel and was shown to achieve the secrecy capacity; however, the wiretap scheme therein was only able to achieve positive secrecy rates under the strong security metric. In [UHF], this wiretap scheme was shown to achieve the strong secrecy capacity for both continuous and discrete wiretap channels. Their proof is a direct bound on the strong leakage and admits a nice characterization of the secure achievable rates. In [AWGNsemanticRecent], a wiretap scheme was shown to achieve the semantic secrecy capacity of AWGN wiretap channels, albeit in a completely different manner than the previously mentioned five papers. To date, there is no universal wiretap scheme that achieves the semantic secrecy capacity for both discrete memoryless and AWGN wiretap channels.
Physical layer security for fast fading wiretap channels was arguably started with Liang, Poor, and Shamai in [secureoverfading] where they found the weak secrecy capacity of the fast fading wiretap channel with the assumption of full CSIT. This was later improved by Bloch and Laneman in [chresolv] where they determined the secrecy capacity of this channel under the strong secrecy metric. In a different direction, Bloch and Laneman [PartialCSIT_2013] considered the case of fast fading wiretap channels with partial CSIT; they gave a set of achievable secrecy rates under the strong secrecy metric for this channel. Their solution relies on an optimization problem that has no closed form solution; it represents the best known secrecy rate on the fast fading channel with partial CSIT. In the case of fast fading channels with No-CSIT, it was only recently shown in [us], [stochasticdegrade], [mukherjee_ulukus_2013] that positive rates are actually achievable, and an upper bound for the secrecy capacity was also derived. For a special class of fast fading No-CSIT channels, [stochasticdegrade] actually finds the secrecy capacity of these channels under the weak secrecy constraint. In [almostuniversal], a positive semantically secure achievable rate is obtained for fast fading channels with No-CSIT. To date, there are few results involving semantic security on fast fading wiretap channels. In particular, no one has constructed a wiretap scheme that achieves the best possible semantically secure rates for each case of CSIT. Moreover, hardly any wiretap schemes exist for fast fading channels that are tangible and efficient and come close to the best possible rates, even in the lesser weak and strong cases.
I-D Summary of Results
The main purpose of this paper is to extend results of physical layer security into a more practical setting. We prove all of our results using the semantic security metric, the most demanding security metric in this field. The wiretap coding scheme we develop is modular in the sense that it can immediately be adapted to any existing channel to provide semantic security; furthermore, it is shown to be concrete and efficient. (Footnote 1: As will be made clear in Section III, we only prove the preprocessor is concrete and efficient; however, if the error correcting code is also such, then so is the entire wiretap coding scheme.)
To prove our wiretap coding scheme is semantically secure, we bound the semantic leakage asymptotically (LABEL:lem:LHL). We do this by upgrading the strong leakage bounds found in [UHF]. In particular, we optimize over all message distributions. As in [explicitGaussianWiretap, UHF], our wiretap scheme is a modular scheme consisting of a preprocessor based on universal hash families (UHFs). However, in order to guarantee that our scheme is semantically secure, we require the UHF to have additional properties (we dub UHFs with these additional properties semantically secure universal hash families, or SS-UHFs). The additional properties are non-restrictive in general, and we provide a particular implementation of an SS-UHF based on finite field arithmetic that is concrete and quadratic time efficient. In effect, our SS-UHF based preprocessor is a converter that takes in an off-the-shelf error correcting code (ECC) and converts it to a semantically secure wiretap coding scheme (LABEL:thm:LHL3).
In LABEL:procedure below, we outline the necessary steps for using our wiretap scheme on an arbitrary wiretap channel. Use of this procedure attains semantic security for any wiretap channel contingent on certain conditions being satisfied which are derived from the wiretap channel. We show that these conditions are indeed satisfied for the DMC, AWGN, and fast fading wiretap channels where we examine the fading channels with various levels of instantaneous channel state information at the transmitter. In other words, we demonstrate this procedure, in effect, proving that our wiretap coding scheme can achieve semantically secure rates on these channels.
The following are our specific contributions on each of the aforementioned channels.
DMC - In LABEL:thm:DMWC, we reestablish the result given by Tal and Vardy’s upgrade [channelupgrading] of Bellare, Tessaro, and Vardy’s original result [cryptoTreatment, semanticallySecure]; that is, we show our wiretap coding scheme achieves the semantic secrecy capacity of any symmetric, degraded, discrete memoryless wiretap channel. However, we allow any ECC for the main point-to-point channel in our construction. This is in contrast to the previous results that impose certain restrictions on the ECC in order to achieve secrecy capacity.
AWGN - In LABEL:thm:awgnrates, we reestablish [AWGNsemanticRecent] by constructing a concrete, end-to-end efficient wiretap scheme and prove that it can achieve the secrecy capacity on the AWGN wiretap channel under semantic security. The advantage of our wiretap scheme is that it is modular: the same preprocessor used here can be used on any channel without modification.
No-CSIT - In LABEL:thm:ImaxLessCE, we prove that our wiretap scheme achieves the semantic secrecy capacity here for the case when the eavesdropper’s channel is stochastically degraded (cf. [stochasticdegrade]) with respect to the main channel. Furthermore, in other cases, we provide a set of semantically secure achievable rates.
Partial CSIT - In LABEL:thm:ImaxPartialCSIT, we prove that our wiretap scheme achieves the best known achievable secrecy rates to date (cf. [bloch_barros_2011]) with semantic security.
Full CSIT - In LABEL:thm:ImaxFullCSIT, we prove that our scheme can actually achieve the strong secrecy capacity in this setting with semantic security thereby proving that semantic secrecy capacity is equivalent to the strong secrecy capacity and hence also the weak secrecy capacity.
All of the achievable semantically secure rates on these channels can be attained concretely and efficiently (LABEL:prop:SSUHF and LABEL:prop:UHFisEfficient) - since our preprocessor is already such, one only needs to concentrate on finding an error correcting code that is concrete and efficient. Once this is done, the entire wiretap coding scheme is concrete and efficient! In other words, we have converted the problem of finding good wiretap coding schemes into a problem of finding good error correcting coding schemes where good here means concrete and efficient.
To recap, we give in this paper a procedure for attaining semantically secure rates in a concrete and efficient way for arbitrary wiretap channels. We apply this procedure in particular to the five aforementioned channels. Therefore, if the reader desires to attain semantically secure rates on one of these channels, all that remains is to find an error correcting code. As a special case, we have pointed the reader to an ideal error correcting code for the AWGN wiretap channel, thereby completing the procedure in this case in full. If the reader wants to attain semantic security on a wiretap channel not listed above, then the reader must apply LABEL:procedure in its entirety. Specifically, the reader must check that the hypothesis of LABEL:thm:LHL3 is satisfied for that channel.
The remainder of the paper is organized as follows. Section II introduces notation and gives the preliminary mathematical background necessary to proceed through the rest of the paper. Section III presents our modular wiretap coding scheme and gives a concrete and efficient implementation of the preprocessor based on finite field arithmetic. LABEL:sec:LHL analyzes both the security and achievable rates of our proposed wiretap scheme and gives a procedure for how to utilize our main results on an arbitrary (discrete or continuous) wiretap channel. In LABEL:sec:App1, we apply this procedure to the DMC and AWGN wiretap channels as a first application and show how our wiretap scheme replicates the best results from literature. LABEL:section:fading considers fast fading wiretap channels with various levels of CSIT (No-CSIT, partial CSIT, and full CSIT) and gives semantically secure achievable rates for each of these. Moreover, we show how our wiretap scheme in these cases exceeds the best results from literature.
In an attempt to give a more polished presentation, we have assigned nearly all of the proofs to the appendices.
II-A Notation and Conventions
We shall write $x^n$ to denote an $n$-dimensional vector where $x_i$ denotes the $i$-th component, i.e., $x^n = (x_1, \ldots, x_n)$. We use the usual notation $\|x^n\|$ to denote the Euclidean norm. We shall denote the indicator (or characteristic) function by $\mathbb{1}_{A}$ or $\mathbb{1}(\cdot)$ and will take all logarithms in this paper to be base $2$ unless we write $\ln$, by which we mean the logarithm of base $e$. We will write $\mathbb{N}$, $\mathbb{R}$, and $\mathbb{C}$ to denote the sets of natural, real, and complex numbers respectively. With a slight abuse of notation, we will write $\mathbb{R}^+$ to denote the set of non-negative reals. We will write $|\mathcal{A}|$ to denote the cardinality of set $\mathcal{A}$.
We will denote random variables by capital letters and will denote the spaces on which a random variable is defined by a respective scripted letter, e.g., $X$ is a random variable with values in $\mathcal{X}$. As usual we write $X \sim \mathrm{Unif}(\mathcal{A})$ to denote that $X$ is a uniform random variable over some discrete set $\mathcal{A}$; we write $X \sim \mathcal{N}(m, \sigma^2)$ to denote that $X$ is a real Gaussian random variable with mean $m$ and standard deviation $\sigma$; we write $X \sim \mathcal{CN}(m, \sigma^2)$ to denote that $X$ is a circularly symmetric complex Gaussian random variable with mean $m$ and standard deviation $\sigma$.
We shall use the notation of [csiszarkorner, UHF] and let $I(X;Y)$ denote the usual mutual information between random variables $X$ and $Y$. We write $X \perp Y$ when random variable $X$ is independent of $Y$. We write $\mathbb{P}(A)$ to denote the probability of event $A$ and $\mathbb{E}[X]$ to denote the expected value of random variable $X$. When we want to be explicit about which random variable we are taking the probability (resp. expected value) with respect to, we shall denote the random variable by a subscript.
We denote all probability densities by $p$, defined by the Radon-Nikodym derivative with respect to some implicit reference measure; we will almost always denote this reference measure by $\mu$. (Footnote 2: Sometimes when we have a probability mass function, we will instead use the notation $P$ with appropriate subscripts as necessary.) We denote the conditional probability density in an analogous way as $p_{Y|X}$. As an example of our notation, if $X$ and $Y$ are random variables on $\mathcal{X}$ and $\mathcal{Y}$ respectively, then $p_X$ denotes the probability density of $X$ and $p_{Y|X}$ denotes the conditional probability density of $Y$ given $X$.
When algorithms are completed in polynomial time (in the worst case) then we take up the standard convention and call such algorithms efficient.
Let and be sets. We shall denote a stochastic map by . Given , a stochastic map assigns a likelihood that will map to a certain . For each , this induces the random variable . The support of this random variable, , is the elements in that can map to with non-zero likelihood.
Let be some stochastic map, a random variable on , some reference measure on , and . We will call the conditional density the transition density of the stochastic map and we will call the tuple a channel. We will often abuse language/notation and call itself a channel. The transition density probabilistically tells us how the channel is mapping to . Given that some symbol was sent across the channel, the probability that is in some subset is given by .
For the rest of this paper, we will be considering subnormalized channels: channels with transition densities such that . This is a technical condition that allows us to define the following. Given a channel and subset denote . This induces a restricted channel as follows. Given that was sent across the restricted channel, the probability that is in some subset is given by .
II-C Error Correcting Codes
We will always refer to the number of channel uses as the block length (of the code) and denote it by $n$. (Footnote 3: Note that we are only considering discrete-time channels in this work.) As usual, we will mainly be considering the $n$-letter extension of a channel $T$, denoted by $T^n$.
Let be some finite message set. An -length encoder for is an injective function . The image is called the codebook and is denoted . Elements of the codebook are referred to as codewords. An -length decoder for is a function and an -length code is a tuple . The rate of the code is given by . Lastly, a family of codes is called a coding scheme with rate given by , where we assume this limit exists.
The maximum probability of error for code is given by . If is sufficiently small then is called an error correcting code (ECC). If every code in scheme is an ECC, we call an ECC scheme. If as then we say the scheme is reliable. In particular, if for some constants and for every , then we call the ECC scheme exceptionally reliable.
It was noted in [cryptoTreatment] that “good” error correcting coding schemes in practice should satisfy the reliability condition exponentially fast; they called such ECC schemes “strongly reliable.” Due to the plethora of definitions containing the wording “strong” in the literature, we have instead called such ECC schemes here “exceptionally reliable.”
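To make the notions of rate and maximum probability of error concrete, the sketch below evaluates both for a repetition code with majority decoding on a binary symmetric channel. The choice of code and channel, the function names, and the crossover probability are assumptions made for illustration and do not come from the paper. The error probability here decays exponentially in the block length, but the rate tends to zero, so this toy family only illustrates the definitions and is not a good coding scheme.

```python
from math import comb, log2

def repetition_code_rate(n):
    """Rate (1/n) * log2|M| of the n-fold repetition code, which has |M| = 2 messages."""
    return log2(2) / n

def repetition_max_error(n, p):
    """Exact maximum error probability under majority decoding on a BSC(p), for odd n."""
    # Decoding fails exactly when more than half of the n transmitted bits are flipped.
    # By symmetry both messages have the same error probability, so this is the maximum.
    return sum(comb(n, k) * p**k * (1 - p)**(n - k)
               for k in range((n + 1) // 2, n + 1))

p = 0.1  # hypothetical crossover probability
errors = {n: repetition_max_error(n, p) for n in (1, 3, 5, 7, 9)}
rates = {n: repetition_code_rate(n) for n in (1, 3, 5, 7, 9)}
```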
For continuous channels (i.e. ) we shall always impose the average power constraint as usual. In more detail, for some fixed constant , we shall require the code to satisfy for every .
The supremum of reliable achievable rates over all ECC schemes is known as the (point-to-point) channel capacity. We shall denote the channel capacity of a channel by .
II-D Wiretap Codes
Let $T$ be a channel that models the communication between a transmitter Alice and intended receiver Bob. Let $E$ be a channel modeling the unintended communication between Alice and a passive eavesdropper Eve. We call the pair of channels $W = (T, E)$ the wiretap channel.
Note that we have chosen the letters $T$, $E$, and $W$ so as to denote the Transmission channel, Eavesdropper’s channel, and Wiretap channel. We also note that the $n$-letter wiretap channel is given by $W^n = (T^n, E^n)$.
The goal of physical layer security as modeled by a wiretap channel is for Alice to communicate information reliably to Bob while keeping that same information hidden from Eve. Let $M$ be the random variable representing the message Alice wants to impart to Bob yet keep secret from Eve. Let $Z^n$ be the $n$-letter random variable representing Eve’s output. To measure security, we recall the most common security metrics.
[Wyner] Weak: $\frac{1}{n} I(M; Z^n)$, with $M$ uniform.
[maurer_1994] Strong (sometimes referred to as MIS-R, cf. [cryptoTreatment]): $I(M; Z^n)$, with $M$ uniform.
[cryptoTreatment] Semantic: $\max_{p_M} I(M; Z^n)$, where the maximum is taken over all message distributions.
We refer to each of these quantities as leakage and we say that a coding scheme is secure under a given metric if its respective leakage goes to 0 as $n \to \infty$. In a similar fashion to exceptional reliability, we say that a coding scheme is exceptionally secure if the leakage vanishes exponentially fast with $n$.
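The following sketch compares the three leakages numerically on a toy example. It takes the weak, strong, and semantic leakages to be $\frac{1}{n} I(M;Z^n)$ and $I(M;Z^n)$ for uniform $M$, and the maximum of $I(M;Z^n)$ over message distributions, as these metrics are commonly stated. The one-bit message, the end-to-end transition matrix, the grid search, and all function names are assumptions made purely for illustration.

```python
from math import log2

def mutual_information(p_m, channel):
    """I(M;Z) in bits for message pmf p_m and transition matrix channel[m][z] = P(z | m)."""
    num_z = len(channel[0])
    p_z = [sum(p_m[m] * channel[m][z] for m in range(len(p_m))) for z in range(num_z)]
    total = 0.0
    for m, pm in enumerate(p_m):
        for z in range(num_z):
            joint = pm * channel[m][z]
            if joint > 0:  # zero-probability pairs contribute nothing
                total += joint * log2(channel[m][z] / p_z[z])
    return total

def leakages(channel, n, grid=1000):
    """Weak, strong, and grid-searched semantic leakage for a binary message."""
    uniform = [0.5, 0.5]
    strong = mutual_information(uniform, channel)
    weak = strong / n  # normalized by the block length
    semantic = max(mutual_information([t / grid, 1 - t / grid], channel)
                   for t in range(grid + 1))
    return weak, strong, semantic

# Hypothetical end-to-end map from a one-bit message to Eve's observation (a Z-channel).
eve = [[1.0, 0.0], [0.3, 0.7]]
weak, strong, semantic = leakages(eve, n=4)
```

Because the uniform distribution is one of the candidates in the maximization, the semantic leakage can never fall below the strong leakage; for this asymmetric example it is strictly larger.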
The expression for semantic security above is technically called mutual information security (MIS) as originally defined in [cryptoTreatment]. Semantic security (in the wiretap context) is actually defined using guessing probabilities. However, therein it was shown for discrete channels (and in [continuousSemantic] for continuous channels) that MIS was equivalent to semantic security asymptotically. Thus, in the asymptotic regime there is no need to differentiate between the two metrics because each implies the other. Hence, our choice of name is technically justified.
However, one may still ask why we call the definition above “semantic security” when it is actually the definition of MIS; the reasoning is as follows. The definition of semantic security in [cryptoTreatment] is named such to allude to the gold standard definition from computational based security [goldwasser_micali_1984]. However, the definition of semantic security is considerably less tractable than the definition of MIS. In order to get the best of both worlds, we have chosen our naming convention. We note that it is a convention already followed by other works.
Let be a coding scheme for channel (and inherently channel ) using message set . We say is a -wiretap coding scheme, where , if it satisfies each of the following.
Reliability: is a reliable ECC scheme for .
Security: is secure (relative to ) using the -metric.
If these two conditions are satisfied exceptionally, then we say that is an outstanding -wiretap coding scheme.
If is the rate of an wiretap coding scheme, then we say is an achievable secrecy rate. We call the supremum of all achievable secrecy rates the secrecy capacity denoted by or simply when the metric is clear from context. If all secure rates achievable under the weak secrecy metric are also achievable under the semantic secrecy metric, then:
II-E Universal Hashing
Let be the set of binary strings of length , and be finite sets, and a uniform random variable on . Consider now a family of a finite number of functions indexed by :
is called a universal hash family (UHF) if for every ,
is called uniform if for every and for every ,
is called -regular if for every and for every ,
is called invertible if for each there exists some stochastic mapping such that for all and , . If is a uniform random variable for every and then we call evenly invertible.
Lastly, we call a semantically secure universal hash family (SS-UHF) if it is: (i) universal, (ii) uniform, (iii) -regular, and (iv) evenly invertible.
Many of the definitions here coincide with those found in computer science literature. The conditions of being a universal hash family (as introduced in [CARTER]) and uniform are found in most textbooks on hash families. The condition of being -regular and invertible can be found in [semanticallySecure] and [UHF]. That being said, we have invented some terminology. We have dubbed hash families that are universal, uniform, -regular, and evenly invertible as semantically secure universal hash families to emphasize that hash families with these four properties are the proper ones for inducing semantic security (see LABEL:sec:LHL).
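As a small illustration of hashing based on finite field arithmetic, the sketch below builds a toy family over $GF(2^4)$ in which the hash of an input is the most significant bits of its field product with the seed, and then checks universality and regularity by brute force. The field size, the reduction polynomial, the restriction to nonzero seeds, and all function names are assumptions chosen for illustration; this is not claimed to be the SS-UHF constructed in the paper. Note that the input 0 hashes to 0 under every seed, so the uniformity property would need further care that this sketch does not attempt.

```python
from itertools import product

K = 4            # field is GF(2^K); hypothetical toy size
POLY = 0b10011   # x^4 + x + 1, irreducible over GF(2)

def gf_mul(a, b):
    """Multiply a and b in GF(2^K) modulo POLY (carry-less shift-and-add)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:       # degree reached K: reduce modulo POLY
            a ^= POLY
    return result

def hash_fn(s, x, b):
    """f_s(x): the b most significant bits of the field product s * x."""
    return gf_mul(s, x) >> (K - b)

def collision_probability(x1, x2, b):
    """P(f_S(x1) = f_S(x2)) for S uniform over the nonzero seeds."""
    seeds = range(1, 2 ** K)
    hits = sum(hash_fn(s, x1, b) == hash_fn(s, x2, b) for s in seeds)
    return hits / len(seeds)

def preimage_sizes(s, b):
    """Number of inputs x mapped to each output value by f_s."""
    counts = [0] * (2 ** b)
    for x in range(2 ** K):
        counts[hash_fn(s, x, b)] += 1
    return counts

B = 2  # output length in bits
worst = max(collision_probability(x1, x2, B)
            for x1, x2 in product(range(2 ** K), repeat=2) if x1 != x2)
```

In this toy family the worst-case collision probability stays below $2^{-B}$, and every nonzero seed maps the same number of inputs to each output.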
II-F $\epsilon$-smooth $\alpha$-Mutual Information
In order to measure the amount of information leaked to the eavesdropper using our wiretap scheme, we will need to employ a different measure of information, known as $\alpha$-mutual information. $\alpha$-mutual information is defined using Rényi entropy and is actually a generalization of the usual mutual information defined by Shannon.
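For a discrete random variable $X$ over $\mathcal{X}$, the following generalizes Shannon’s entropy and is called Rényi entropy of order $\alpha \in (0,1) \cup (1,\infty)$ [renyi1961measures]: $H_\alpha(X) = \frac{1}{1-\alpha} \log \sum_{x \in \mathcal{X}} P_X(x)^\alpha$. This can be extended by continuity to the cases of $\alpha = 1$ and $\alpha = \infty$ where $H_1(X) = H(X)$ is the usual Shannon entropy and $H_\infty(X) = -\log \max_{x} P_X(x)$ is the usual min-entropy. In particular, when $X$ is uniform, for any $\alpha$ we have $H_\alpha(X) = \log |\mathcal{X}|$, a fact we will use frequently.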
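The sketch below evaluates Rényi entropy numerically, using the standard formula $\frac{1}{1-\alpha}\log_2 \sum_x P_X(x)^\alpha$ together with its limits at $\alpha = 1$ and $\alpha = \infty$. The function name and the two example distributions are assumptions made for illustration. It checks the fact used frequently in the text, namely that a uniform random variable has the same Rényi entropy at every order.

```python
from math import log2, inf

def renyi_entropy(pmf, alpha):
    """Rényi entropy of order alpha (in bits) of a discrete pmf, for alpha > 0."""
    support = [p for p in pmf if p > 0]
    if alpha == 1:                      # Shannon entropy (limit alpha -> 1)
        return -sum(p * log2(p) for p in support)
    if alpha == inf:                    # min-entropy (limit alpha -> infinity)
        return -log2(max(support))
    return log2(sum(p ** alpha for p in support)) / (1 - alpha)

uniform = [0.25] * 4
skewed = [0.7, 0.1, 0.1, 0.1]
orders = [0.5, 1, 2, inf]
uniform_values = [renyi_entropy(uniform, a) for a in orders]  # all equal to log2(4)
skewed_values = [renyi_entropy(skewed, a) for a in orders]    # non-increasing in alpha
```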
In a similar way, one can define conditional Rényi entropy, however, there is no universal notion of such a definition in literature as different definitions can be employed based on the specific properties one desires (cf. [Renyi2017, Berens]). We will be using Arimoto’s definition [arimoto1977information, ConditionalRenyi2018] given as follows.
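Let $Y$ be an arbitrary random variable over $\mathcal{Y}$ (with measure $\mu$ on $\mathcal{Y}$) and $X$ a discrete random variable over $\mathcal{X}$. Then conditional Rényi entropy of order $\alpha \in (0,1) \cup (1,\infty)$ is given by:
$$H_\alpha(X|Y) = \frac{\alpha}{1-\alpha} \log \int_{\mathcal{Y}} \Big( \sum_{x \in \mathcal{X}} p_{XY}(x,y)^\alpha \Big)^{1/\alpha} \, d\mu(y).$$
Just as in the case of (unconditioned) Rényi entropy, this definition can be extended to the cases of $\alpha = 1$ and $\alpha = \infty$ by continuity. For $\alpha = 1$, one easily checks using L’Hôpital’s rule that $H_1(X|Y)$ becomes $H(X|Y)$, the conditional Shannon entropy. For $\alpha = \infty$, the definition becomes
$$H_\infty(X|Y) = -\log \int_{\mathcal{Y}} \max_{x \in \mathcal{X}} p_{XY}(x,y) \, d\mu(y)$$
and is often referred to as conditional min-entropy. Another important case which we would like to emphasize is when $\alpha = 2$:
$$H_2(X|Y) = -2 \log \int_{\mathcal{Y}} \sqrt{\sum_{x \in \mathcal{X}} p_{XY}(x,y)^2} \, d\mu(y)$$
which is often referred to as conditional collision entropy.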
Now let us finally define $\alpha$-mutual information: the Rényi extension to Shannon’s mutual information. Again, there is no universal definition in the literature, but we will be using the definition put forth in [ConditionalRenyi2018] for the special case when $X$ is a uniform random variable.
Let $X$ and $Y$ be random variables as before except now we require $X$ to be uniform over $\mathcal{X}$. For each order $\alpha$ considered above we define the $\alpha$-mutual information between $X$ and $Y$ by
$$I_\alpha(X;Y) = \log |\mathcal{X}| - H_\alpha(X|Y).$$
Notice that $I_1(X;Y)$ is exactly Shannon’s mutual information, so in this case we will drop the subscript. Moreover, for the case of $\alpha = 2$, we will often call $I_2(X;Y)$ collision-information and for the case of $\alpha = \infty$, we will often call $I_\infty(X;Y)$ max-information.
[ConditionalRenyi2018, arimotoProperties] For any $X$ and $Y$ as above, $I_\alpha(X;Y)$ is monotonically increasing in $\alpha$. Note that this fact justifies the name of $I_\infty(X;Y)$ as max-information because it measures the largest amount of information of all of the $\alpha$-mutual informations.
The $\alpha$-mutual information also admits several other desirable properties of an “information measure” which can be found in [arimotoProperties]. Note however that this definition of $\alpha$-mutual information is not symmetric in its arguments and does not satisfy the chain rule in general. This of course is in contrast to Shannon’s mutual information.
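The monotonicity in the order can be checked numerically on a small example. The sketch below assumes the standard form of Arimoto's conditional Rényi entropy, a discrete output alphabet (so the integral against the reference measure becomes a sum), and a uniform input, in which case the information measure is $\log_2|\mathcal{X}|$ minus the conditional entropy. The toy channel matrix and the function names are hypothetical.

```python
from math import log2, inf

def arimoto_conditional_entropy(joint, alpha):
    """Arimoto's H_alpha(X|Y) in bits; joint[x][y] = P(X = x, Y = y), alpha > 0."""
    num_y = len(joint[0])
    columns = [[row[y] for row in joint] for y in range(num_y)]
    if alpha == 1:      # conditional Shannon entropy
        total = 0.0
        for col in columns:
            p_y = sum(col)
            total -= sum(p * log2(p / p_y) for p in col if p > 0)
        return total
    if alpha == inf:    # conditional min-entropy
        return -log2(sum(max(col) for col in columns))
    # Sum over y of the alpha-norm of the joint pmf viewed as a function of x.
    norm_sum = sum(sum(p ** alpha for p in col) ** (1 / alpha) for col in columns)
    return alpha / (1 - alpha) * log2(norm_sum)

def alpha_mutual_information(channel, alpha):
    """I_alpha(X;Y) = log2|X| - H_alpha(X|Y) for X uniform; channel[x][y] = P(y | x)."""
    size = len(channel)
    joint = [[p / size for p in row] for row in channel]
    return log2(size) - arimoto_conditional_entropy(joint, alpha)

channel = [[0.9, 0.1], [0.2, 0.8]]   # hypothetical binary-input, binary-output channel
orders = [0.5, 1, 2, 5, inf]
values = [alpha_mutual_information(channel, a) for a in orders]  # increasing in alpha
```

The entry at order 1 is the ordinary Shannon mutual information for a uniform input, and the last entry is the max-information.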
To facilitate our proofs later on we will also need a concept called $\epsilon$-smooth $\alpha$-mutual information. Basically, we will define $\alpha$-mutual information on a portion of the entire space that probabilistically contains enough content up to some $\epsilon$. To make this rigorous we first introduce the concept of a typical set.
For , we call a subset a -typical set if
Furthermore, we will denote the set of all -typical sets by . Typical sets intuitively contain almost all that there is to know about our space up to some , hence the name typical.
For some typical set , we first define the conditional Rényi entropy of order restricted to . This is simply given by
Given define -smooth -mutual information for uniform over by
where -mutual information evaluated on is given by
Given some threshold , we find the smallest value that -mutual information could possibly be when defined on the subnormalized channels corresponding to those sets that contain enough probability with respect to our threshold. Later, we will bound the leakage between the transmitter and eavesdropper as an increasing function of this metric; thus, defining -smooth -mutual information using the infimum provides the tightest bound we should expect when is our threshold.
Note that when , contains only sets equal to the entire space less a set of measure zero and hence .
Analogous to Section II-F we have the following ordering for -smooth -mutual information, a result we will use in proving our wiretap scheme is secure.
For any and , is monotonically increasing in .
This follows easily from the proof given for [ConditionalRenyi2018, Proposition 1] replacing the densities by and noting that all inequalities still hold. ∎
III A Wiretap Coding Scheme
In this section we will furnish a wiretap coding scheme for an arbitrary wiretap channel which is based on a wiretap scheme put forth in [semanticallySecure], [explicitGaussianWiretap], and [UHF]. (Footnote 5: Here arbitrary indeed means any discrete-time wiretap channel; however, a positive secrecy rate may not be attainable on some wiretap channels.) We will first define each step of this scheme and show that it is reliable (we will show security in the next section). Then we will give a particular implementation and show that this implementation is efficient with respect to the block length $n$.
Over an arbitrary wiretap channel our wiretap coding scheme involves combining an SS-UHF with a reliable ECC already in use over the main point-to-point channel. This modular wiretap scheme is precisely the scheme put forth in [semanticallySecure, explicitGaussianWiretap, UHF] except there the UHF was only required to be -regular and evenly invertible. Here, we are also demanding that our UHF be uniform. The necessity of this extra property will be elucidated in the next section when we prove that our scheme is semantically secure.
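A minimal sketch of this kind of modular composition is given below, under several stated assumptions: the hash is the toy "most significant bits of a field product" family over $GF(2^4)$, the preprocessor draws a uniformly random preimage of the message under that hash, the seed is nonzero and known to the decoder, and a repetition code stands in for the ECC. None of these choices is claimed to be the paper's construction; they only show how a preprocessor can be wrapped around an existing code.

```python
import random

K, B = 4, 2       # hypothetical toy sizes: inputs in GF(2^K), messages of B bits
POLY = 0b10011    # x^4 + x + 1, irreducible over GF(2)
REP = 3           # repetition factor of the stand-in ECC

def gf_mul(a, b):
    """Multiply a and b in GF(2^K) modulo POLY."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a >> K:
            a ^= POLY
    return result

def gf_inv(a):
    """Inverse of a nonzero field element, found by exhaustive search."""
    return next(c for c in range(1, 2 ** K) if gf_mul(a, c) == 1)

def preprocess(message, seed, rng):
    """Draw x uniformly from the preimage of `message` under the hash with this seed."""
    padding = rng.randrange(2 ** (K - B))          # fresh local randomness
    return gf_mul(gf_inv(seed), (message << (K - B)) | padding)

def postprocess(x, seed):
    """Apply the hash: the B most significant bits of seed * x."""
    return gf_mul(seed, x) >> (K - B)

def ecc_encode(x):
    """Stand-in ECC: repeat each of the K bits of x REP times."""
    return [(x >> i) & 1 for i in range(K) for _ in range(REP)]

def ecc_decode(bits):
    """Majority-vote decoder for the repetition code."""
    x = 0
    for i in range(K):
        block = bits[i * REP:(i + 1) * REP]
        x |= (sum(block) > REP // 2) << i
    return x

def wiretap_encode(message, seed, rng):
    return ecc_encode(preprocess(message, seed, rng))

def wiretap_decode(received, seed):
    return postprocess(ecc_decode(received), seed)

rng = random.Random(1)
seed = 0b0110
codeword = wiretap_encode(0b10, seed, rng)
codeword[4] ^= 1                       # a single bit flip on the main channel
recovered = wiretap_decode(codeword, seed)
```

Reliability in this sketch comes entirely from the stand-in code: the single bit flip is corrected by majority voting before the hash is applied, and the hash then returns the original message regardless of which random padding was drawn.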
Consider LABEL:fig:txscheme; this describes our wiretap scheme overall. We will now describe in detail each layer.