On Bounds and Closed Form Expressions for Capacities of Discrete Memoryless Channels with Invertible Positive Matrices

01/07/2020 · Thuan Nguyen, et al. · Oregon State University

While capacities of discrete memoryless channels are well studied, it is still not possible to obtain a closed-form expression for the capacity of an arbitrary discrete memoryless channel. This paper describes an elementary technique based on the Karush-Kuhn-Tucker (KKT) conditions to obtain (1) a good upper bound on the capacity of a discrete memoryless channel having an invertible positive channel matrix and (2) a closed-form expression for the capacity if the channel matrix satisfies certain conditions related to its singular values and its Gershgorin disks.


I Introduction

Discrete memoryless channels (DMCs) play a critical role in the early development of information theory and its applications. DMCs are especially useful for studying many well-known modulation/demodulation schemes (e.g., PSK and QAM) in which the continuous inputs and outputs of a channel are quantized into discrete symbols. Thus, there exists a rich literature on the capacities of DMCs [cover2012elements], [blahut1972computation], [arimoto1972algorithm], [muroga1953capacity], [shannon1956zero], [robert1990ash], [nguyen2018closed]. In particular, the capacities of many well-known channels, such as (weakly) symmetric channels, can be written as elementary formulas [cover2012elements]. However, it is often not possible to express the capacity of an arbitrary DMC in closed form [cover2012elements]. Recently, several papers have obtained closed-form expressions for small classes of DMCs with small alphabets. For example, Martin et al. established a closed-form expression for the general binary channel [martin2010algebraic]. Liang showed that the capacity of channels with two inputs and three outputs can be expressed as an infinite series [liang2008algebraic]. Cotae et al. found the capacity of two-input, two-output channels in terms of the eigenvalues of the channel matrices [cotae2010eigenvalue]. On the other hand, the problem of finding the capacity of a discrete memoryless channel can be formulated as a convex optimization problem [grant2008cvx], [sinha2014convex], so efficient algorithmic solutions exist. There are also other algorithms, such as the Arimoto-Blahut algorithm [blahut1972computation], [arimoto1972algorithm], which can be accelerated [dupuis2004blahut], [matz2004information], [yu2010squeezing]. In [meister1967capacity], [jimbo1979iteration], another iterative method is proposed that yields both upper and lower bounds on the channel capacity.
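To make the algorithmic route concrete, the following is a minimal sketch of the Arimoto-Blahut iteration in Python/NumPy. It is our own illustrative implementation (function name, tolerance, and stopping rule are our choices, not taken from the cited papers), and it assumes a strictly positive channel matrix so that no zero-probability corner cases arise.

```python
import numpy as np

def arimoto_blahut(A, tol=1e-12, max_iter=100_000):
    """Capacity (in bits) of a DMC with channel matrix A, where
    A[i, j] = p(y_j | x_i) and each row of A sums to 1.
    Assumes A > 0. Returns (capacity_estimate, optimal_input_pmf)."""
    n = A.shape[0]
    p = np.full(n, 1.0 / n)                  # start from the uniform input pmf
    for _ in range(max_iter):
        q = p @ A                            # induced output pmf: q_j = sum_i p_i A_ij
        D = np.sum(A * np.log2(A / q), axis=1)   # D[i] = KL(A[i, :] || q) in bits
        p_new = p * np.exp2(D)               # multiplicative Arimoto-Blahut update
        p_new /= p_new.sum()
        if np.max(np.abs(p_new - p)) < tol:
            p = p_new
            break
        p = p_new
    q = p @ A
    return float(p @ np.sum(A * np.log2(A / q), axis=1)), p
```

At convergence, the mutual information returned by this iteration equals the capacity, and D[i] is (approximately) constant on the support of the optimal input pmf.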

That said, it is still beneficial to find the channel capacity in closed form, for a number of reasons: (1) formulas often provide good intuition about the relationship between the capacity and different channel parameters, (2) formulas offer a faster way to determine the capacity than running an algorithm, and (3) formulas are useful for analytical derivations in which a closed-form expression for the capacity is needed in intermediate steps. To that end, our paper describes an elementary technique, based on the theory of convex optimization, to find closed-form expressions for (1) a new upper bound on the capacities of discrete memoryless channels with positive invertible channel matrices and (2) optimality conditions on the channel matrix under which the upper bound is precisely the capacity. In particular, the optimality conditions establish a relationship between the singular values and the Gershgorin disks of the channel matrix.

II Preliminaries

II-A Convex Optimization and KKT Conditions

A DMC is characterized by a random variable X ∈ {x_1, x_2, …, x_n} for the inputs, a random variable Y ∈ {y_1, y_2, …, y_m} for the outputs, and a channel matrix A. In this paper, we consider DMCs with an equal number of inputs and outputs, thus m = n. The matrix entry A_{ij} = p(y_j | x_i) represents the conditional probability that y_j is received given that x_i is transmitted. Let p = (p_1, p_2, …, p_n)^T be the input probability mass vector (pmf) of X, where p_i denotes the probability that x_i is transmitted; then the pmf of Y is q = A^T p. The mutual information between X and Y is:

I(X;Y) = H(Y) - H(Y|X),    (1)

where (all logarithms are base 2)

H(Y) = -\sum_{j=1}^{n} q_j \log q_j,    (2)

H(Y|X) = \sum_{i=1}^{n} p_i H_i, \quad H_i = -\sum_{j=1}^{n} A_{ij} \log A_{ij}.    (3)

The mutual information function can therefore be written as:

I(X;Y) = -\sum_{j=1}^{n} q_j \log q_j + \sum_{i=1}^{n} \sum_{j=1}^{n} p_i A_{ij} \log A_{ij},    (4)

where q_j denotes the j-th component of the vector q = A^T p. The capacity C associated with a channel matrix A is the theoretical maximum rate at which information can be transmitted over the channel without error [shannon1956zero], [shannon1998mathematical], [cover1975achievable]. It is attained by an optimal pmf p^* that maximizes I(X;Y). For a given channel matrix A, I(X;Y) is a concave function of p [cover2012elements]. Therefore, maximizing I(X;Y) is equivalent to minimizing -I(X;Y), and finding the capacity can be cast as the following convex problem:

Minimize: -I(X;Y)

Subject to: \sum_{i=1}^{n} p_i = 1 and p_i \ge 0, i = 1, \dots, n.

The optimal p^* can be found efficiently using various algorithms such as gradient methods [boyd2004convex], but in a few cases p^* can be found directly using the Karush-Kuhn-Tucker (KKT) conditions [boyd2004convex]. To explain the KKT conditions, we first state the canonical convex optimization problem below:

Problem P1: Minimize: f_0(x)
Subject to: f_i(x) \le 0, i = 1, \dots, m, and h_j(x) = 0, j = 1, \dots, p,

where f_0(x), f_1(x), \dots, f_m(x) are convex functions and each h_j(x) is a linear function.

Define the Lagrangian function as:

L(x, \lambda, \nu) = f_0(x) + \sum_{i=1}^{m} \lambda_i f_i(x) + \sum_{j=1}^{p} \nu_j h_j(x),    (5)

then the KKT conditions [boyd2004convex] state that the optimal primal and dual points x^*, \lambda^*, \nu^* must satisfy:

f_i(x^*) \le 0, \quad h_j(x^*) = 0, \quad \lambda_i^* \ge 0, \quad \lambda_i^* f_i(x^*) = 0, \quad \nabla f_0(x^*) + \sum_{i=1}^{m} \lambda_i^* \nabla f_i(x^*) + \sum_{j=1}^{p} \nu_j^* \nabla h_j(x^*) = 0,    (6)

for i = 1, \dots, m and j = 1, \dots, p.
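As a sanity check on this formulation (our own sketch, not code from the paper), the capacity problem can also be handed to a generic constrained solver; here we use SciPy's SLSQP and assume a strictly positive channel matrix A.

```python
import numpy as np
from scipy.optimize import minimize

def capacity_convex(A):
    """Maximize I(X;Y) over the probability simplex by minimizing
    -I(X;Y), mirroring the convex program above. Assumes A > 0.
    Returns (capacity_in_bits, optimal_input_pmf)."""
    n = A.shape[0]
    row_entropy = -np.sum(A * np.log2(A), axis=1)      # H_i for each row of A

    def neg_mutual_information(p):
        q = p @ A                                      # output pmf q = A^T p
        h_y = -np.sum(q * np.log2(np.maximum(q, 1e-300)))  # guard log(0)
        return -(h_y - p @ row_entropy)                # -(H(Y) - H(Y|X))

    res = minimize(neg_mutual_information,
                   np.full(n, 1.0 / n),                # start from the uniform pmf
                   method="SLSQP",
                   bounds=[(0.0, 1.0)] * n,            # p_i >= 0 (and <= 1)
                   constraints=[{"type": "eq", "fun": lambda p: p.sum() - 1.0}])
    return -res.fun, res.x
```

Any convex solver (e.g., CVX, as cited later in the text) would serve equally well; SLSQP is used here only because it ships with SciPy.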

II-B Elementary Linear Algebra Results

Definition 1.

Let A be an invertible channel matrix and let H_i = -\sum_{j=1}^{n} A_{ij} \log A_{ij} be the entropy of the i-th row. Define

K_j = \sum_{i=1}^{n} A^{-1}_{ji} H_i, \quad K_{max} = \max_j K_j, \quad K_{min} = \min_j K_j,

where A^{-1}_{ji} denotes the (j,i) entry of the inverse matrix A^{-1}. K_{max} and K_{min} are called the maximum and minimum inverse row entropies of A, respectively.

Definition 2.

Let A be a square matrix. The Gershgorin radius of the i-th row of A [weisstein2003gershgorin] is defined as:

r_i(A) = \sum_{j \ne i} |A_{ij}|.    (7)

The Gershgorin ratio of the i-th row of A is defined as:

c_i(A) = \frac{|A_{ii}|}{r_i(A)},    (8)

and the minimum Gershgorin ratio of A is defined as:

c_{min}(A) = \min_i c_i(A).    (9)

We note that since the channel matrix A is a (row) stochastic matrix, we have

r_i(A) = \sum_{j \ne i} A_{ij} = 1 - A_{ii}.    (10)
Definition 3.

Let A be a square matrix.

(a) A is called a positive matrix if A_{ij} > 0 for all i, j.

(b) A is called a strictly diagonally dominant positive matrix [fiedler1967diagonally] if A is a positive matrix and

A_{ii} > \sum_{j \ne i} A_{ij}, \quad \forall i.    (11)
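For concreteness, the quantities in Definitions 2 and 3 can be computed in a few lines (a small helper sketch; the function name and output layout are ours):

```python
import numpy as np

def gershgorin_stats(A):
    """Row-wise Gershgorin radii and ratios of a nonnegative square matrix A,
    its minimum Gershgorin ratio, and the strict-diagonal-dominance test of
    Definition 3. For a row-stochastic A, radius[i] == 1 - A[i, i] as in (10)."""
    d = np.diag(A)
    radius = A.sum(axis=1) - d           # r_i = sum over j != i of A[i, j]
    ratio = d / radius                   # Gershgorin ratio of each row
    return {
        "radius": radius,
        "ratio": ratio,
        "min_ratio": float(ratio.min()),
        "sdd_positive": bool(np.all(A > 0) and np.all(d > radius)),
    }
```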
Lemma 1.

Let A be a strictly diagonally dominant positive channel matrix. Then (a) A is invertible; (b) the eigenvalues of A^{-1} are 1/\lambda_1, \dots, 1/\lambda_n, where \lambda_1, \dots, \lambda_n are the eigenvalues of A; and (c) the largest absolute element in the j-th column of A^{-1} is the diagonal element A^{-1}_{jj}, i.e., |A^{-1}_{jj}| \ge |A^{-1}_{ij}| for all i.

Proof.

The proof is shown in Appendix A. ∎
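Lemma 1 is easy to verify numerically on a randomly generated strictly diagonally dominant positive stochastic matrix (a self-contained sketch; the construction below is our own):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
A = rng.uniform(0.01, 0.1, (n, n))        # small positive off-diagonal entries
np.fill_diagonal(A, 0.0)
np.fill_diagonal(A, 1.0 - A.sum(axis=1))  # rows sum to 1; diagonal dominates

Ainv = np.linalg.inv(A)                   # (a) A is invertible
# (b) the eigenvalues of A^{-1} are the reciprocals of the eigenvalues of A
assert np.allclose(np.sort_complex(1.0 / np.linalg.eigvals(A)),
                   np.sort_complex(np.linalg.eigvals(Ainv)))
# (c) the largest-magnitude entry of each column of A^{-1} is on the diagonal
assert np.all(np.abs(Ainv).argmax(axis=0) == np.arange(n))
```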

Lemma 2.

Let A be a strictly diagonally dominant positive matrix. Then:

(12)

Moreover, for any two rows i and j,

(13)
Proof.

The proof is shown in Appendix B. ∎

Lemma 3.

Let A be a strictly diagonally dominant positive matrix. Then:

\bar{a} \le \frac{1}{\sigma_{min}},    (14)

where \bar{a} is the largest entry (in absolute value) of A^{-1} and \sigma_{min} is the minimum singular value of A.

Proof.

The proof is shown in Appendix C. ∎

Lemma 4.

Let A be an invertible channel matrix. Then

\sum_{j=1}^{n} A^{-1}_{ij} = 1, \quad \forall i,

i.e., the sum of every row of A^{-1} equals 1. Furthermore, for any probability mass vector q, the entries of the vector q^T A^{-1} sum to 1.

Proof.

The proof is shown in Appendix D. ∎
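A quick numerical illustration of Lemma 4 (the matrix and pmf below are our own example):

```python
import numpy as np

# A row-stochastic A satisfies A @ ones = ones, hence A^{-1} @ ones = ones:
# every row of the inverse sums to 1, and so does q^T A^{-1} for any pmf q.
A = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.1, 0.7]])
Ainv = np.linalg.inv(A)
assert np.allclose(Ainv.sum(axis=1), 1.0)

q = np.array([0.5, 0.25, 0.25])           # an arbitrary pmf
assert np.isclose((q @ Ainv).sum(), 1.0)
```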

III Main Results

Our first main result is an upper bound on the capacity of discrete memoryless channels having invertible positive channel matrices.

Proposition 1 (Main Result 1).

Let A be an invertible positive channel matrix and define

H_i = -\sum_{j=1}^{n} A_{ij} \log A_{ij},    (15)

K_j = \sum_{i=1}^{n} A^{-1}_{ji} H_i;    (16)

then the capacity C associated with the channel matrix A is upper bounded by:

C \le \log \Big( \sum_{j=1}^{n} 2^{-K_j} \Big).    (17)
Proof.

Let q = (q_1, q_2, …, q_n)^T be the pmf of the output Y; then q = A^T p. Thus,

I(X;Y) = -\sum_{j=1}^{n} q_j \log q_j - \sum_{j=1}^{n} q_j K_j.    (18)

We construct the Lagrangian in (5) using -I(X;Y) as the objective function and optimization variable q:

L(q, \lambda, \nu) = \sum_{j=1}^{n} q_j \log q_j + \sum_{j=1}^{n} q_j K_j - \sum_{j=1}^{n} \lambda_j q_j + \nu \Big( \sum_{j=1}^{n} q_j - 1 \Big),    (19)

where the constraints f_i \le 0 and h = 0 in Problem P1 are translated into -q_j \le 0 and \sum_{j=1}^{n} q_j - 1 = 0, respectively.

Using the KKT conditions in (6), the optimal points q_j^*, \lambda_j^*, \nu^*, for all j, must satisfy:

-q_j^* \le 0,    (20)

\sum_{j=1}^{n} q_j^* - 1 = 0,    (21)

\lambda_j^* \ge 0,    (22)

\lambda_j^* q_j^* = 0,    (23)

\log q_j^* + \log e + K_j - \lambda_j^* + \nu^* = 0.    (24)

Since \sum_j q_j^* = 1 and q_j^* \ge 0, there exists at least one q_k^* > 0. Since \lambda_k^* q_k^* = 0, we have:

\lambda_k^* = 0.    (25)

Based on (24) and (25), we must have q_j^* > 0 and \lambda_j^* = 0 for every j (otherwise q_j^* = 0 would force \lambda_j^* = -\infty in (24), contradicting (22)). Therefore, all five KKT conditions (20)-(24) are reduced to the following two conditions:

\sum_{j=1}^{n} q_j^* = 1,    (26)

\log q_j^* + \log e + K_j + \nu^* = 0.    (27)

Next, solving (27) for q_j^* gives:

q_j^* = 2^{-K_j - \nu^* - \log e}.    (28)

Using (27) and (28) in (26), we have:

\nu^* + \log e = \log \Big( \sum_{j=1}^{n} 2^{-K_j} \Big).    (29)

Plugging (29) into (26)-(28) and evaluating (18) at the optimum yields

I^*(X;Y) = \log \Big( \sum_{j=1}^{n} 2^{-K_j} \Big).

From (29),

q_j^* = \frac{2^{-K_j}}{\sum_{k=1}^{n} 2^{-K_k}}.    (30)

If A is such that the induced input vector p^* = (A^T)^{-1} q^* satisfies \sum_i p_i^* = 1 and p_i^* \ge 0, then p^* is a valid pmf and Proposition 1 holds with equality by the KKT conditions. However, these two constraints might not hold in general. On the other hand, maximizing I(X;Y) in terms of q while ignoring these constraints is equivalent to enlarging the feasible region, and will necessarily yield a value that is at least equal to the capacity C. Thus, by plugging (30) into (18), we obtain the proof of the upper bound. ∎
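The computation behind Proposition 1 is short enough to state as code. The sketch below follows our reconstruction of (15)-(17) and (30); the function name and return convention are ours.

```python
import numpy as np

def proposition1_bound(A):
    """Upper bound of Proposition 1 (in bits) for an invertible positive
    channel matrix A, together with the candidate optimal output pmf q*
    of (30) and the induced input vector p*. The bound holds with
    equality when every entry of p* is nonnegative."""
    H = -np.sum(A * np.log2(A), axis=1)    # row entropies H_i, as in (15)
    K = np.linalg.inv(A) @ H               # inverse row entropies K_j, as in (16)
    Z = np.sum(np.exp2(-K))
    q_star = np.exp2(-K) / Z               # (30)
    p_star = np.linalg.inv(A.T) @ q_star   # p* recovered from q* = A^T p*
    return float(np.log2(Z)), q_star, p_star
```

By Lemma 4 the entries of p_star always sum to 1, so only their nonnegativity needs to be checked to decide whether the bound is attained.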

Next, we present some sufficient conditions on the channel matrix such that its capacity can be written in closed form. We note that a closed-form expression for the channel capacity was also discovered in [muroga1953capacity] and [robert1990ash] using the input distribution as variables. However, in both [muroga1953capacity] and [robert1990ash], the sufficient conditions for the closed-form expression to hold are not fully characterized.

Proposition 2 (Main Result 2).

Let A be a strictly diagonally dominant positive channel matrix. If, for all i and j,

(31)

then the capacity of the channel matrix A admits a closed-form expression, which is exactly the upper bound in Proposition 1.

Proof.

Based on the discussion of the KKT conditions in the proof of Proposition 1, it is sufficient to show that if (31) holds, then p^* = (A^T)^{-1} q^* satisfies \sum_i p_i^* = 1 and p_i^* \ge 0 for all i. The condition \sum_i p_i^* = 1 is always true, as shown in Lemma 4 in Appendix D. Thus, we only need to show that if (31) holds, then p_i^* \ge 0 for all i.

Let q_{max}^* and q_{min}^* denote the largest and smallest entries of q^*. We have:

p_i^* = \sum_{j=1}^{n} A^{-1}_{ji} q_j^* \ge A^{-1}_{ii} q_{min}^* - \sum_{j \ne i} |A^{-1}_{ji}| q_{max}^*,    (32)

(33)

with (32) following from Lemma 1-(c), and (33) due to (30). Now, requiring p_i^* \ge 0 in (33) reduces, for all i and j, to condition (31), where K_{max} and K_{min} correspond to q_{min}^* and q_{max}^*, respectively, by (30). Thus, Proposition 2 is proven. ∎

We are now ready to state and prove the third main result, which characterizes sufficient conditions on a channel matrix under which the upper bound in Proposition 1 is precisely the capacity.

Proposition 3 (Main Result 3).

Let A be a strictly diagonally dominant positive channel matrix and let H_{max} = \max_i H_i be its maximum row entropy. The capacity C equals the upper bound in Proposition 1, i.e., (17) holds with equality, if

(35)

where \sigma_{min} is the minimum singular value of the channel matrix A, and

(36)
Proof.

From (12) in Lemma 2 and Proposition 2, if we can show that

(37)

then Proposition 3 is proven. Suppose that K_{max} and K_{min} are attained at rows i and j, respectively; we note that from (30), K_{max} and K_{min} correspond to q_{min}^* and q_{max}^*, respectively. Thus, from Definition 1, we have:

|K_{max} - K_{min}| \le \sum_{k=1}^{n} |A^{-1}_{ik} - A^{-1}_{jk}| \, H_k    (38)

(39)

(40)

(41)

(42)

(43)

where (38) is due to the property of the absolute value function, (39) is due to the Schwarz inequality, (40) is due to H_{max} being the maximum row entropy of A, (41) is due to (13), (42) is due to \bar{a} being the largest entry of A^{-1}, and (43) is due to Lemma 3. Thus,

(44)

From (37) and (44), if

(45)

then the capacity C equals the upper bound in Proposition 1. Since (45) is equivalent to (35), Proposition 3 is proven. ∎

An easy-to-use version of Proposition 3 is stated in Corollary 1.

Corollary 1.

The capacity C equals the upper bound in Proposition 1 if

(46)

Proof.

Similar to Proposition 3,

(47)

(48)

(49)

(50)

(51)

where (47), (48), and (49) are similar to (38), (39), and (40), respectively; (50) is due to \bar{a} being the largest entry of A^{-1}; and (51) is due to Lemma 3. Thus, by replacing the corresponding term in (45), Corollary 1 is proven. ∎

A direct consequence of Proposition 3 that does not use the singular value is shown in Corollary 2.

Corollary 2.

The capacity C equals the upper bound in Proposition 1 if

(52)

where

(53)

(54)

(55)
Proof.

We construct a lower bound for \sigma_{min} and an upper bound for \bar{a}. From Lemma 5 in Appendix E,

(56)

and

(57)

Therefore,

(58)

Thus, by replacing the corresponding term in (35), Corollary 2 is proven.

We note that when the diagonal dominance is large relative to the size of the matrix A, the lower bound tends to 1. We also note that (52) can be checked efficiently, without computing either \bar{a} or \sigma_{min}, at the expense of a looser upper bound compared to (35). ∎

IV Examples and Numerical Results

IV-A Example 1: Reliable Channels

We illustrate the optimality conditions in Proposition 3 using a reliable channel. Its channel matrix satisfies both the conditions in Proposition 3 and those in Corollary 2, so the closed-form channel capacity, together with the optimal input and output probability mass vectors, can be readily computed via Proposition 1; the capacity is 1.2715 bits.

In general, for a good channel with n inputs and n outputs whose symbol error probabilities are small, the channel matrix is likely to satisfy these optimality conditions. This is because each diagonal entry (the probability of receiving the correct symbol) tends to be larger than the sum of the other entries in its row (the probability of error), which is precisely the defining property of a strictly diagonally dominant matrix.
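To illustrate, the snippet below builds a generic reliable channel (correct symbol with probability 1 - e, errors spread uniformly; this construction is ours and is not the matrix of Example 1) and evaluates the closed-form bound. For small e the candidate input vector is a valid pmf, so the bound is the capacity.

```python
import numpy as np

def reliable_channel(n, e):
    """n-ary channel: correct symbol w.p. 1 - e, each wrong symbol w.p. e/(n-1)."""
    A = np.full((n, n), e / (n - 1))
    np.fill_diagonal(A, 1.0 - e)
    return A

A = reliable_channel(4, 0.05)
d = np.diag(A)
assert np.all(d > A.sum(axis=1) - d)       # strictly diagonally dominant

H = -np.sum(A * np.log2(A), axis=1)        # row entropies
K = np.linalg.inv(A) @ H                   # inverse row entropies
q = np.exp2(-K) / np.sum(np.exp2(-K))      # candidate optimal output pmf (30)
p = np.linalg.inv(A.T) @ q                 # induced input vector
assert np.all(p >= 0)                      # bound holds with equality here
print("capacity (bits):", np.log2(np.sum(np.exp2(-K))))
```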

IV-B Example 2: Cooperative Relay-MISO Channels

In this example, we investigate the channel capacity for a class of channels called Relay-MISO (Relay - Multiple Input Single Output) channels. A Relay-MISO channel [thuan_thinh_2018] can be constructed as the combination of a relay channel [cover1979capacity], [rankov2006achievable] and a Multiple Input Single Output channel, as illustrated in Fig. 1.

In a Relay-MISO channel, n senders want to transmit data to the same receiver via n relay base station nodes. The senders' uplinks are wireless links that are prone to transmission errors: each sender transmits bit "0" or "1", which is flipped with some probability. For simplicity, suppose that all relay channels have the same error probability p. Each relay base station node then forwards its received signal over a reliable channel, such as an optical fiber cable, to the common receiver. The receiver adds all the relayed signals (symbols) to produce a single output symbol.

It can be shown that the channel matrix of this Relay-MISO channel is an invertible square matrix whose entries can be computed in closed form [thuan_thinh_2018].
Figure 1: Relay-MISO channel

We note that the inverse of this Relay-MISO channel matrix also has a closed-form expression, which is characterized in [thuan_thinh_2018]. The channel matrix is strictly diagonally dominant when the error probability p is close to 0 or close to 1, and for such values of p it can be shown that the channel matrix satisfies the conditions in Proposition 3; thus, the channel capacity admits the closed-form expression in Proposition 1. For other values of p, e.g. closer to 0.5, the optimality conditions in Proposition 3 no longer hold. In this case, Proposition 1 can still be used as a good upper bound on the capacity.

We show that our upper bound is tighter than existing upper bounds. In particular, Fig. 2 shows the actual capacity and the known upper bounds as functions of the error probability p for Relay-MISO channels. The green curve depicts the actual capacity, computed using a convex optimization algorithm. The red curve is constructed using our closed-form expression in Proposition 1, and the blue dotted curve is constructed using the well-known upper bound on channel capacity from [chiang2004geometric], [boyd2007tutorial]. Specifically, this upper bound is:

C \le \log \Big( \sum_{j=1}^{n} \max_{i} A_{ij} \Big).    (59)

Finally, the red dotted curve shows another well-known upper bound, due to Arimoto [arimoto1972algorithm]:

(60)

We note that the second term of (60) is negative.

Fig. 2 shows that our closed-form upper bound is precisely the capacity (the red and green curves overlap) when p is close to 0 or 1, as predicted by the optimality conditions in Proposition 3. On the other hand, when p is closer to 0.5, our optimality conditions no longer hold, and we can only determine an upper bound. However, it is interesting to note that our upper bound in this case is tighter than both the Boyd-Chiang [chiang2004geometric] and Arimoto [arimoto1972algorithm] upper bounds.
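For readers who want to reproduce this comparison on matrices of their own, the sketch below evaluates the Proposition 1 bound against the bound (59); since the Relay-MISO matrix itself is characterized in [thuan_thinh_2018], a generic positive stochastic matrix stands in here, and all function names are ours.

```python
import numpy as np

def proposition1_bound_bits(A):
    H = -np.sum(A * np.log2(A), axis=1)    # row entropies
    K = np.linalg.inv(A) @ H               # inverse row entropies
    return float(np.log2(np.sum(np.exp2(-K))))

def boyd_chiang_bound_bits(A):
    return float(np.log2(A.max(axis=0).sum()))   # (59): sum of column maxima

A = np.array([[0.90, 0.06, 0.04],          # stand-in positive stochastic matrix
              [0.05, 0.90, 0.05],
              [0.04, 0.06, 0.90]])
print("Proposition 1 bound:", proposition1_bound_bits(A))   # ~1.02 bits
print("Boyd-Chiang bound:  ", boyd_chiang_bound_bits(A))    # ~1.43 bits
```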


Figure 2: Channel capacity and various upper bounds as functions of the error probability p

IV-C Example 3: Symmetric and Weakly Symmetric Channels

Our results confirm the known capacities of symmetric and weakly symmetric channels. In particular, when the channel matrix is symmetric and positive definite, all of our results are applicable: the inverse channel matrix exists and is also symmetric. Since the rows of a symmetric channel matrix are permutations of one another, every row has the same entropy H_r, and by Lemma 4 every row of A^{-1} sums to 1; from Definition 1, all the inverse row entropies are therefore equal, K_j = H_r for all j. Therefore, from Proposition 1, the entries of the optimal output probability mass vector

q_j^* = \frac{2^{-K_j}}{\sum_{k=1}^{n} 2^{-K_k}} = \frac{1}{n}    (61)

are equal to each other for all j. As a result, the input probability mass function is the uniform distribution, and the channel capacity is upper bounded by:

C \le \log \Big( \sum_{j=1}^{n} 2^{-H_r} \Big)    (62)

= \log n - H_r,    (63)

which matches the classical formula [cover2012elements].

Interestingly, our results also yield the capacities of many channels that are not weakly symmetric but that admit the closed-form formula of weakly symmetric channels. In particular, consider a channel matrix, which we call semi-weakly symmetric, whose rows are all permutations of each other but whose column sums are not necessarily equal. If the optimality condition of Proposition 3 is satisfied, then the channel has a closed-form capacity identical to that of a symmetric or weakly symmetric channel:

C = \log n - H_r,    (64)

where H_r is the entropy of any row.

For example, one can construct a channel matrix that is not weakly symmetric, even though its rows are permutations of each other, because its column sums differ. If such a matrix satisfies Proposition 3 and Corollary 2, its capacity has the closed form (64), and the optimal output and input probability mass vectors can be computed in closed form from (30) and Lemma 4, respectively.
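A numerical check of this observation, using our own 3 x 3 semi-weakly symmetric example rather than the matrix from the text:

```python
import numpy as np

# Rows are permutations of one another, but the column sums are 0.96, 1.04,
# and 1.00, so the channel is not weakly symmetric.
A = np.array([[0.80, 0.12, 0.08],
              [0.08, 0.80, 0.12],
              [0.08, 0.12, 0.80]])

H_r = -np.sum(A[0] * np.log2(A[0]))        # common entropy of every row
K = np.linalg.inv(A) @ (-np.sum(A * np.log2(A), axis=1))
assert np.allclose(K, H_r)                 # K_j = H_r for all j, by Lemma 4
print("closed-form capacity (64):", np.log2(len(A)) - H_r)
```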

Another example of a semi-weakly symmetric channel matrix has entries controlled by a single parameter ranging over an interval. Fig. 3 shows the capacity upper bound of this semi-weakly symmetric channel and the actual channel capacity as functions of the parameter. As seen, for most parameter values the upper bound is identical to the actual channel capacity, which is numerically determined using CVX [grant2008cvx].


Figure 3: Channel capacity of a (semi-)weakly symmetric channel as a function of the channel parameter

IV-D Example 4: Unreliable Channels

We now consider an unreliable channel whose channel matrix does not satisfy our optimality conditions. In this case, the Arimoto upper bound is the tightest, as compared to our upper bound (0.19282) and the Boyd-Chiang upper bound (0.848).

IV-E Example 5: Bounds as a Function of Channel Reliability

Since our proposed bounds are tight when the channel is reliable, we want to examine quantitatively how channel reliability affects the various bounds. In this example, we consider a special class of channels whose channel matrix entries are controlled by a reliability parameter.

When the parameter is small, the channel tends to be reliable, and when it is large, the channel tends to be unreliable. Fig. 4 shows the various upper bounds together with the actual capacity as functions of this parameter. The actual channel capacities are numerically computed using a convex optimization algorithm [grant2008cvx]. As seen, our closed-form upper bound from Proposition 1 (red curve) is much closer to the actual capacity (black dashed curve) than the other bounds for most parameter values. When the parameter is small, i.e., the channel is reliable, the closed-form upper bound is precisely the real channel capacity, and we can verify that the optimality conditions in Proposition 3 hold. When the channel becomes unreliable, our upper bound is no longer tight; however, it is still the tightest among the existing upper bounds. We note that for small parameter values the channel matrix becomes nearly diagonally dominant, which is when our upper bound is tightest.


Figure 4: Channel capacity and various upper bounds as functions of the reliability parameter

V Conclusion

In this paper, we described an elementary technique based on the Karush-Kuhn-Tucker (KKT) conditions to obtain (1) a good upper bound on the capacity of a discrete memoryless channel having an invertible positive channel matrix and (2) a closed-form expression for the capacity if the channel matrix satisfies certain conditions related to its singular values and its Gershgorin disks. We provided a number of channels for which the proposed upper bound is precisely the capacity, and demonstrated that our proposed bound is tighter than other existing bounds for these channels.

References