Achievable Rates of Attack Detection Strategies in Echo-Assisted Communication

We consider an echo-assisted communication model wherein block-coded messages transmitted by a source reach the destination as multiple noisy copies. We address adversarial attacks on such models wherein a subset of the received copies at the destination is rendered unreliable by an adversary. In particular, we study a non-persistent attack model in which the adversary attacks 50% of the codewords, which requires the destination to detect the attacked copies within every codeword before combining them to recover the information bits. Our main objective is to compute the achievable rates of practical attack-detection strategies as a function of their false-positive and miss-detection rates. However, since closed-form expressions for the mutual information are intractable, we present a new framework to approximate the achievable rates in terms of their false-positive rates under special conditions. We show that the approximate rates offered by our framework are lower bounded by those of conservative countermeasures, thereby giving rise to interesting questions on code-design criteria at the source. Finally, we showcase the approximate rates achieved by traditional as well as neural-network-based attack-detection strategies, and study their applicability to detecting attacks on block-coded messages of short block-lengths.


I Introduction

A number of wireless applications involve echo-assisted communication, wherein messages transmitted by the source arrive at the destination as multiple noisy copies. Typical examples include communication over frequency-selective channels [1], relay networks [2], and multiple receive antennas [3]. In such echo-assisted scenarios, it is well known that suitably combining the noisy received copies at the destination increases the effective signal-to-noise ratio, thereby facilitating higher transmission rates.

In this work, we consider attack models on echo-assisted communication wherein a subset of the copies collected at the destination might have been manipulated by an adversary. Such scenarios are attributed to practical limitations on the adversary to manipulate all the copies. For instance, in the case of frequency-selective channels with delay spreads, the adversary may have processing-delay constraints to manipulate the symbols on the first copy, but not the subsequent ones [1]. We study a specific adversarial attack referred to as the flipping attack [4] wherein the message bits of the attacked copy are flipped at 50% rate independently. With such attacks, the dilemma at the destination is whether to combine the multiple copies or to discard them when recovering the messages. To gain insights on the attack model, we focus on the case of two received copies, out of which the second copy might have been manipulated by an adversary. Although adversarial models on binary channels have been studied in the literature [4, 5], flipping attacks on echo-assisted communication involving binary input and continuous output have not been studied hitherto. Henceforth, throughout the paper, we refer to the source and the destination as Alice and Bob, respectively.

Fig. 1: Plot of of two detectors, where and denote the miss-detection and false-positive rates conditioned on input codewords for . We propose a framework to approximate the achievable rates of detectors which have below the line with slope for some small . To exemplify, given a small , our framework can approximate the rate of the detector marked with symbol in green but not the one with in red.

I-A Motivation and Contributions

Consider an echo-assisted communication, wherein information bits from Alice to Bob are transmitted as a sequence of $n$-length binary codewords. Specifically, each codeword, represented by $x$, is received at Bob as two noisy copies, given by $y_1 = \gamma_1 x + n_1$ and $y_2 = \gamma_2 x + n_2$, where $\gamma_1$ and $\gamma_2$ are constants perfectly known to Bob, and $n_1$ and $n_2$ are statistically independent additive noise components. The adversarial model in our setting is that the second copy is vulnerable to the flipping attack but not the first one. As a result, Bob needs to detect whether the second copy is attacked, and then decide on combining the two copies. In this work, we consider a non-persistent attack model, wherein $y_2$ is vulnerable to the flipping attack on 50% of the codewords chosen at random in an i.i.d. fashion. A conservative strategy to handle this adversarial setting is as follows:

  • Bob discards $y_2$ of every codeword irrespective of the attack, and uses only $y_1$ to recover the information bits.

  • Alice designs an -length codebook (designed for Gaussian channels) which achieves the rate .

In view of the above conservative baseline, we are interested in designing a detection strategy at Bob that can assist Alice in transmitting at a rate higher than that of the conservative strategy. Towards that end, Bob needs to detect the attack within an initial block of samples of every codeword (the length of this block is referred to as the frame length), and then decide whether the second copy can be used to recover the information bits. Subsequently, the decision on the combining strategy has to be fed back to Alice so that any possible rate modifications can be incorporated through the remaining coded symbols. Given that a practical detection strategy is typically imperfect, we are interested in quantifying the achievable rates of a detection strategy by incorporating its associated miss-detection and false-positive rates. However, since the attack model is not memoryless and the input alphabet is finite in size [6], we show that computing the achievable rates for arbitrary miss-detection rates is challenging for large block-lengths. To circumvent this issue, we provide a new framework, as depicted in Fig. 1, to approximate the achievable rates of detectors under special conditions on miss-detection and false-positive rates (see Theorem 1). We also show that the achievable rates offered by our framework are lower bounded by that of the conservative strategy, thereby giving rise to interesting questions on code-design criteria at Alice. We propose a code-design criterion to assist Alice in achieving the rates promised by the detector, and show that the criterion is closely coupled with the frame length, which is the number of samples after which Bob has to feed back his decision on attack detection. Finally, we showcase the results of attack-detection strategies which are motivated by both traditional as well as neural network ideas, and study their applicability to detect attacks on codewords of short block-lengths.

Notations: For an $n$-dimensional random vector $y$ with joint probability distribution function $P(y)$, its differential entropy, denoted by $h(y)$, is represented as $-\mathbb{E}[\log_2 P(y)]$, where the expectation is over $P(y)$. A Gaussian random variable with zero mean and variance $\sigma^2$ is denoted by $\mathcal{N}(0, \sigma^2)$. An $n \times n$ identity matrix, an $n$-length vector of zeros, and an $n$-length vector of ones are denoted by $I_n$, $0_n$, and $1_n$, respectively. For a given $n$-length vector, denoted by $y$, the notation $y^k$, for $k \leq n$, denotes the $k$-length vector containing the first $k$ components of $y$. The notation $\Pr(\cdot)$ denotes the usual probability operator.

II System Model

Alice transmits an $n$-length sequence $x \in \{-1, +1\}^n$ such that the components of $x$ are i.i.d. over the Probability Mass Function (PMF) $\{\alpha, 1 - \alpha\}$ for some $0 < \alpha < 1$. Meanwhile, Bob receives two copies of $x$ over the Additive White Gaussian Noise (AWGN) channels as

$$y_1 = \gamma_1 x + n_1, \qquad y_2 = \gamma_2 (b \circ x) + n_2, \tag{1}$$

where $\gamma_1$ and $\gamma_2$ are constants known to Bob, and $n_1$ and $n_2$ represent the additive Gaussian noise distributed as $\mathcal{N}(0_n, \sigma^2 I_n)$. We assume that $n_1$ and $n_2$ are statistically independent. Between the two copies, we assume that $y_2$ is vulnerable to the flipping attack, whereas $y_1$ is not. To model the flipping attack on $y_2$, we introduce the Hadamard product, denoted by $\circ$, between $b$ and $x$. With attack, the components of $b$ are i.i.d. over the PMF $\{0.5, 0.5\}$ on $\{-1, +1\}$, and are unknown to Bob. However, without attack, $b = 1_n$. In this adversarial setting, the attacker executes the flipping attack on a codeword chosen randomly with probability $0.5$ in an i.i.d. fashion. By using $A = 1$ and $A = 0$ to denote the events of attack and no-attack, respectively, we have $\Pr(A = 1) = \Pr(A = 0) = 0.5$.
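As an illustration of the model above, the following Python sketch generates the two copies seen by Bob for one codeword. It is a minimal simulation under stated assumptions rather than the authors' code: the antipodal alphabet {-1, +1}, the function name `echo_channel`, and the parameter names `gamma1`, `gamma2`, `sigma`, and `b` are all chosen here for illustration.

```python
import numpy as np

def echo_channel(x, gamma1, gamma2, sigma, attacked, rng):
    """Return the two noisy copies of codeword x seen by Bob, plus the flip pattern."""
    n = x.size
    # Flip pattern: +/-1 with equal probability under attack, all ones otherwise.
    b = rng.choice([-1.0, 1.0], size=n) if attacked else np.ones(n)
    y1 = gamma1 * x + sigma * rng.standard_normal(n)        # first copy, never attacked
    y2 = gamma2 * (b * x) + sigma * rng.standard_normal(n)  # b * x is the Hadamard product
    return y1, y2, b

rng = np.random.default_rng(0)
n, alpha = 8, 0.5  # assumed block-length and input PMF parameter
x = rng.choice([-1.0, 1.0], size=n, p=[alpha, 1 - alpha])
attacked = rng.random() < 0.5  # non-persistent attack: each codeword is hit with probability 0.5
y1, y2, b = echo_channel(x, 1.0, 1.0, 0.5, attacked, rng)
```

Because the flip pattern is drawn afresh for every attacked codeword and is hidden from Bob, the second copy carries no usable sign information about the codeword when the attack is active.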

We compute the mutual information (MI) offered by the channel when is perfectly known to Bob, namely, and . We refer to this case as the Genie detector. When , each bit of is flipped by the attacker with probability in an i.i.d. fashion, and as a result, it is straightforward to prove that . As a countermeasure, the following proposition shows that discarding when is the optimal strategy at Bob (we omit the proof due to lack of space).

Proposition 1

When the components of are i.i.d. over the PMF , for , then we have and where and are as given in (1).

(2)

 

Without the flipping attack, i.e., , the mutual information of the channel is given by

Thus, with perfect knowledge of at Bob, the MI offered by the channel is

where in the last equality is obtained by combining and when as such that the additive noise vector is distributed as , where . It is straightforward to verify that , which implies that the combining strategy is optimal without the attack. Furthermore, can be simplified as

(3)

by using the memoryless nature of the channel, attributed to the perfect knowledge of at Bob. Here, are the scalar channels such that the additive noise is distributed as Since takes values from finite input alphabet, in (3) can be numerically computed as a function of the input PMF , constants and , and [6]. Specifically, is given by

(4)

where such that is as given in (2). The conditional entropy can be computed using the distribution given by

for . Similarly, we can also compute . In the next section, we study the MI of the combining strategy when the attack detector at Bob is not perfect.
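The numerical computation described above can be sketched in Python as follows. This is an illustrative sketch under assumptions not stated in the text: the input alphabet is taken to be {-1, +1}, the two copies are merged by maximal-ratio combining (so the effective gain is sqrt(gamma1^2 + gamma2^2) at unchanged noise variance), and the helper name `binary_awgn_mi` and the Riemann-sum integration are our own choices. The Genie rate is formed by weighting the attack and no-attack cases by 0.5 each, following the attack probability of the model.

```python
import numpy as np

def binary_awgn_mi(gain, sigma, alpha=0.5, num=20001):
    """I(x; gain*x + n) in bits for x in {-1,+1}, Pr(x=-1)=alpha, n ~ N(0, sigma^2)."""
    lim = abs(gain) + 10.0 * sigma  # integration range covering both Gaussian modes
    y, dy = np.linspace(-lim, lim, num, retstep=True)

    def gauss(mean):
        return np.exp(-(y - mean) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

    p_y = alpha * gauss(-gain) + (1 - alpha) * gauss(gain)  # output density (Gaussian mixture)
    mask = p_y > 0                                          # skip points that underflow to zero
    h_y = -np.sum(p_y[mask] * np.log2(p_y[mask])) * dy      # h(y) = -E[log2 p(y)]
    h_n = 0.5 * np.log2(2 * np.pi * np.e * sigma ** 2)      # h(y | x) equals the noise entropy
    return h_y - h_n

gamma1, gamma2, sigma = 1.0, 1.0, 1.0                      # assumed example values
mi_single = binary_awgn_mi(gamma1, sigma)                   # Bob uses the first copy only
mi_combined = binary_awgn_mi(np.hypot(gamma1, gamma2), sigma)  # both copies combined, no attack
mi_genie = 0.5 * mi_single + 0.5 * mi_combined              # per-symbol rate with a perfect detector
```

The per-symbol values suffice here because, with perfect knowledge of the attack event, the channel is memoryless and the block mutual information is the block-length times the scalar one.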

III Achievable Rates with Practical Detection Strategy

We consider a practical attack-detection strategy, which uses the received samples to detect the flipping attack on every codeword. Based on the detector’s output, represented by the variable , Bob decides either to combine and , or discard . Note that this detector is typically imperfect, and as a result, it has its associated miss-detection and false-positive rates, defined as and , respectively. When the detector outputs , Bob drops the samples , and only uses the samples to recover the information bits. On the other hand, when the detector outputs , Bob combines and to obtain and then uses it to recover the information bits.
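To make the miss-detection and false-positive rates concrete, the sketch below estimates both by Monte Carlo simulation for a simple correlation detector. The detector is hypothetical and is not one of the strategies evaluated in the paper: it declares an attack when the sample correlation of the two copies over the first `k` samples falls below an assumed midpoint threshold. The alphabet {-1, +1} and all function and parameter names are likewise assumptions.

```python
import numpy as np

def correlation_detector(y1, y2, k, threshold):
    """Return 1 (attack declared) if the correlation over the first k samples is below threshold."""
    stat = np.mean(y1[:k] * y2[:k])
    return int(stat < threshold)

def estimate_rates(n, k, gamma1, gamma2, sigma, trials, rng):
    """Monte Carlo estimates of (miss-detection rate, false-positive rate)."""
    # Midpoint between the mean correlation without attack (gamma1 * gamma2) and
    # with attack (zero); this assumed rule needs gamma1 * gamma2 > 0.
    threshold = 0.5 * gamma1 * gamma2
    misses = 0
    false_alarms = 0
    for _ in range(trials):
        x = rng.choice([-1.0, 1.0], size=n)
        n1 = sigma * rng.standard_normal(n)
        n2 = sigma * rng.standard_normal(n)
        y1 = gamma1 * x + n1
        # Attacked codeword: a miss occurs when the detector outputs 0.
        b = rng.choice([-1.0, 1.0], size=n)
        misses += 1 - correlation_detector(y1, gamma2 * b * x + n2, k, threshold)
        # Clean codeword: a false positive occurs when the detector outputs 1.
        false_alarms += correlation_detector(y1, gamma2 * x + n2, k, threshold)
    return misses / trials, false_alarms / trials

p_md, p_fa = estimate_rates(n=100, k=50, gamma1=1.0, gamma2=1.0, sigma=0.5,
                            trials=2000, rng=np.random.default_rng(0))
```

Under attack the second copy is uncorrelated with the first on average, so a longer frame length `k` sharpens the decision, at the cost of a later feedback to Alice.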

In the event of miss-detection, i.e., when and , we know that is random and unknown to Bob. Therefore, is denoted as , and is given by

(5)

However, when and , we have , and therefore, is denoted as , and is given by

(6)

The MI of this detection strategy, denoted by , is

(7)

where and .

(9)

 

To compute , we have to compute for a given block-length . However, this requires evaluating the differential entropy of the probability distribution function given in (9). Since the input alphabet is finite in size, the corresponding differential entropy can only be computed using numerical methods, and as a result, computing is intractable for sufficiently large (of the order of hundreds). In a nutshell, the above computational issue arises because the equivalent channel when is not memoryless. To circumvent this problem, we show that the MI value of some detectors can be computed using an approximation under special conditions on and .

The following sequence of definitions and lemmas is useful for presenting our results on approximation in Theorem 1.

Definition 1

For , let a set , for some negligible , be defined as

Definition 2

For a given attack detector, we define its performance profile as

where and .

Definition 3

For a given , let denote an -dimensional discrete constellation in obtained by using over . On , we define

  • ,

where and denotes the squared Euclidean distance.

Lemma 1

If are such that and is a negligible number, then we have

Proof:

The convex combination can be written as . This implies that , where when , and when . Since is negligible, for every .

Since the accuracy of the approximation depends on , we henceforth denote by .

Lemma 2

If , and are such that for each then we have

(10)
(11)

for every .

Proof:

We only show the applicability of (10). Since can be written as a weighted sum of over all , (10) can be used to show the applicability of (11). Given , the -dimensional distribution of is given by

When evaluated at , we can upper bound the above term as

(12)

where is as given in Definition 3. Meanwhile, the -dimensional distribution of is given by

(13)

where the first inequality holds since . The second inequality holds because of triangle inequality. Finally, if for each then (13) can be further lower bounded as

(14)
(15)

where the last inequality is due to the bound in (12). This implies that for each . This completes the proof.

Using the results of Lemma 1 and Lemma 2, we are now ready to present our result on approximation.

Theorem 1

If , and are such that for each and if the detection strategy is such that , for a fixed small , then we have , where

(16)

and the notation captures the notion that the approximation on MI is a result of approximating the underlying distributions using .

Proof:

Based on the expression of in (7), it is straightforward to show that . In this proof, we only address the computation of . From first principles, we have

where can be obtained using as

where is as given in (9). When the attack-detection technique operates at , then we can show that , where and such that the expectation is over . By applying the results of Lemma 1 and Lemma 2 on (9), we get

The above approximation holds because plays the role of in Lemma 1, and the condition of Lemma 1 is satisfied because of (11) in Lemma 2. As a result . Furthermore, since each component of is independent across , we have

(17)

where such that is given by (2). Similarly, the conditional differential entropy is given by

(18)

where and such that can be written as

(19)

To arrive at (19), we assume that and are statistically independent. Again, applying the results of Lemma 1 and Lemma 2 on (19), we have the approximation

for every . As a result, we have . Finally, using the above expression in (18), we get

(20)

where the last equality is due to i.i.d. nature of . Overall, using (20) and (17) in (7), we get the expression in (16).


Due to intractability in evaluating , Theorem 1 approximates the achievable rates of a special class of detection strategies that operate in the region on the channel parameters