An Explicit Formula for the Zero-Error Feedback Capacity of a Class of Finite-State Additive Noise Channels

by   Amir Saberi, et al.
The University of Melbourne

It is known that for a discrete channel with correlated additive noise, the ordinary capacities with and without feedback both equal log q - ℋ(Z), where ℋ(Z) is the entropy rate of the noise process Z and q is the alphabet size. In this paper, a class of finite-state additive noise channels is introduced. It is shown that the zero-error feedback capacity of such channels is either zero or C_0f = log q - h(Z), where h(Z) is the topological entropy of the noise process. A topological condition is given for when the zero-error capacity is zero, with or without feedback. Moreover, the zero-error capacity without feedback is lower-bounded by log q - 2h(Z). We explicitly compute the zero-error feedback capacity for several examples, including channels with isolated errors and a Gilbert-Elliott channel.




1 Introduction

In his 1956 paper [1], Shannon introduced the concept of zero-error communication. Although a general formula is still missing for the zero-error capacity $C_0$ of a discrete memoryless channel (DMC) without feedback, Shannon derived one for the zero-error capacity $C_{0f}$ of a DMC with noiseless feedback. In recent years, there has been progress towards determining $C_{0f}$ for channels with memory. In [2], Zhao and Permuter introduced a dynamic programming formulation for computing $C_{0f}$ for a finite-state channel modeled as a Markov decision process, assuming state information is available at both encoder and decoder. However, the problem is still open when there is no state information at the decoder.

In this paper, we study the zero-error capacity, with and without feedback, of discrete channels with additive correlated noise. The ordinary capacities with and without feedback of such channels are studied in [3], in which it is proved that

$$C_f = C = \log q - \mathcal{H}(Z), \qquad (1)$$

where $q$ is the input alphabet size and $\mathcal{H}(Z)$ is the entropy rate of the noise process $Z$. In this paper, we consider additive noise channels where the noise is generated by a finite-state machine. We prove a similar formula for the zero-error feedback capacity $C_{0f}$ and a lower bound for the zero-error capacity $C_0$, in terms of topological entropy (Theorem 2). Unlike [2], we do not assume that channel state information is available at the encoder or decoder. In [4], we studied $C_0$ for some special cases of these channels and derived a similar lower bound. In this paper, we extend that result to a more general channel model, and also derive an exact formula for $C_{0f}$. Examples including the well-known Gilbert-Elliott channel are considered, for which the explicit value of $C_{0f}$ is computed. To the best of our knowledge, this has not been done for these channels.

The rest of the paper is organized as follows. In Section 2, the channel model and main results are presented. Proofs are given in Sections 3 and 4, and some examples are discussed in Section 5. Finally, concluding remarks and future extensions are discussed in Section 6.

Throughout the paper, calligraphic letters such as $\mathcal{X}$ denote sets. The cardinality of a set $\mathcal{X}$ is denoted by $|\mathcal{X}|$. The channel input alphabet size is $q$, and logarithms are in base 2. Random variables are denoted by upper case letters such as $X$, and their realizations by lower case letters such as $x$. The vector $(x_i, x_{i+1}, \dots, x_j)$ is denoted by $x_{i:j}$.

2 Channel Model and Main Results

Let the input, output and noise at time $i$ in the channel be $X_i$, $Y_i$, and $Z_i$, respectively. Before we describe the channel, we define the following notion.

Definition 1 (Finite-state machine).

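A finite-state machine is defined as a directed graph $\mathscr{G} = (\mathcal{S}, \mathcal{E})$, where the vertex set $\mathcal{S}$ denotes the states of the machine, and the edge set $\mathcal{E} \subseteq \mathcal{S} \times \mathcal{S}$ denotes possible transitions between two states. We say a process $(S_i)$ is described by $\mathscr{G}$ if a) there is a positive probability that any state is eventually visited, i.e. $\forall s \in \mathcal{S}$, $\exists i$ s.t. $P(S_i = s) > 0$; b) if $(s, s') \in \mathcal{E}$, a transition is always possible for all possible past state sequences, i.e. $P(S_{i+1} = s' \mid S_i = s, S_{0:i-1} = s_{0:i-1}) > 0$ whenever $P(S_i = s, S_{0:i-1} = s_{0:i-1}) > 0$; and c) conversely, if $(s, s') \notin \mathcal{E}$, then $P(S_{i+1} = s' \mid S_i = s, S_{0:i-1} = s_{0:i-1}) = 0$ whenever $P(S_i = s, S_{0:i-1} = s_{0:i-1}) > 0$.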

Remark: Processes described by a finite-state machine are topologically Markov [5, Ch.2], but need not be stochastic Markov chains.

The following channel is studied in this paper.

Definition 2 (Finite-state additive noise channels).

A discrete channel with common input, noise and output $q$-ary alphabet $\mathcal{X}$ is called finite-state additive noise if its output at time $i$ is obtained by

$$Y_i = X_i \oplus Z_i,$$

where $\oplus$ is modulo-$q$ addition and the correlated additive noise $Z_i$ is governed by a state process $(S_i)$ on a finite-state machine such that each outgoing edge from a state corresponds to a different value of the noise. Thus, there are at most $q$ outgoing edges from each state. We assume the state transition diagram of the channel is strongly connected and that is independent of . (Footnote 1: This can be relaxed to qualitative independence [5, Ch.1]; i.e. if and are both positive, then .)

Figure 1 shows a noise process which defines a channel that has no more than two consecutive errors. For example, the transition at time from state to itself corresponds to . Moreover, leads to the transition ending in state (state at next time step). Note that, in , the noise can only take and transits to .

Figure 1: State transition diagram of a noise process for a channel in which no more than two consecutive errors can occur.
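
As an illustration of Definition 2, the following minimal Python sketch simulates a channel of the kind shown in Figure 1. The state labels (0, 1, 2, read as the number of consecutive errors so far), the binary noise values, and the uniform random choice of outgoing edge are assumptions made for this sketch; the definition fixes only the graph structure, not the transition probabilities.

```python
import random

# Hypothetical encoding of the Fig. 1 machine: state = number of consecutive
# errors so far; each outgoing edge is a pair (noise value, next state).
EDGES = {
    0: [(0, 0), (1, 1)],
    1: [(0, 0), (1, 2)],
    2: [(0, 0)],  # after two errors the next symbol must be error-free
}

def sample_noise(n, state=0, rng=random):
    """Draw one admissible noise sequence of length n by walking the machine."""
    noise = []
    for _ in range(n):
        z, state = rng.choice(EDGES[state])  # any allowed edge may be taken
        noise.append(z)
    return noise

def channel(x, z, q):
    """Additive noise channel: y_i = x_i + z_i (mod q)."""
    return [(xi + zi) % q for xi, zi in zip(x, z)]

q = 3
rng = random.Random(0)
x = [rng.randrange(q) for _ in range(20)]  # arbitrary channel input
z = sample_noise(len(x), rng=rng)          # noise generated by the machine
y = channel(x, z, q)                       # channel output
```

Because the addition is modulo $q$, the noise sequence can be recovered from the input and output, which is the bijection the later counting arguments rely on.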
Definition 3 (Coupled graph).

The coupled graph of a finite-state machine (with labeled graph $\mathscr{G}$) is defined as a labeled directed graph $\mathscr{G}_c$ (Footnote 2: This product is called the tensor product, as well as the Kronecker product [6, Ch. 4].) such that it has vertex set $\mathcal{S} \times \mathcal{S}$ and has an edge from node $(s_1, s_2)$ to $(s_1', s_2')$ if and only if there are edges from $s_1$ to $s_1'$ (with a label value $z_1$) and from $s_2$ to $s_2'$ (with a label value $z_2$) in $\mathscr{G}$. Each such edge has a label equal to $z_1 \ominus z_2$, where $\ominus$ is modulo-$q$ subtraction.
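
The construction in Definition 3 can be sketched in a few lines of Python. The dictionary representation of the machine and the function name `coupled_graph` are hypothetical choices for this illustration; the example machine is the "no two consecutive errors" noise process with assumed binary noise values.

```python
from itertools import product

def coupled_graph(edges, q):
    """Build the coupled graph of a labeled machine.

    edges: {state: [(noise_label, next_state), ...]}
    Returns {(s1, s2): {(label, (t1, t2)), ...}} with label = z1 - z2 (mod q).
    """
    coupled = {}
    for s1, s2 in product(edges, repeat=2):
        coupled[(s1, s2)] = {
            ((z1 - z2) % q, (t1, t2))
            for (z1, t1), (z2, t2) in product(edges[s1], edges[s2])
        }
    return coupled

# Assumed machine: state 0 = previous symbol clean, state 1 = previous symbol in error.
golden = {0: [(0, 0), (1, 1)], 1: [(0, 0)]}

g3 = coupled_graph(golden, q=3)
labels_from_00 = sorted({label for label, _ in g3[(0, 0)]})
```

Each node of the coupled graph tracks the states of two noise realizations at once, and the edge label is the difference of the two noise values, which is exactly what decides whether two inputs can collide at the output.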

For a state-dependent channel, the zero-error capacity is defined as follows.

Definition 4.

The zero-error capacity, $C_0$, is the largest block-coding rate that permits zero decoding errors, i.e.,

$$C_0 := \sup_{n \in \mathbb{N},\ \mathcal{F} \in \mathscr{F}_n} \frac{\log |\mathcal{F}|}{n},$$

where $\mathscr{F}_n$ is the set of all block codes of length $n$ that yield zero decoding errors for any channel noise sequence and channel initial state, such that no state information is available at the encoder and decoder. In a zero-error code, any two distinct codewords can never result in the same channel output sequence, regardless of the channel noise and initial state.

The zero-error feedback capacity is defined in the presence of noiseless feedback from the output. In other words, assuming $m$ is the message to be sent and $y_{1:i-1}$ is the output sequence received so far, then $x_i = f_i(m, y_{1:i-1})$, where $f_i$ is the encoding function. Let $\mathcal{F}$ denote the family of encoding functions $f_{1:n}$. The zero-error feedback capacity, $C_{0f}$, is the largest block-coding rate that permits zero decoding errors.

Before presenting the main results, we need some preliminaries from symbolic dynamics. In symbolic dynamics, topological entropy is defined as the asymptotic growth rate of the number of possible state sequences. For a finite-state machine with an irreducible transition matrix $A$, the topological entropy $h$ is known to coincide with $\log \lambda$, where $\lambda$ is the Perron value of $A$ [7]. This is essentially due to the fact that the number of paths from state $s_i$ to state $s_j$ in $n$ steps is the $(i,j)$-th element of $A^n$, which grows at the rate of $\lambda^n$ for large $n$.
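
The relation between path counting and the Perron value can be checked numerically. The sketch below uses NumPy and, as an assumed example, the 2-state adjacency matrix of the "no two consecutive errors" machine; the function names are illustrative only.

```python
import numpy as np

def perron_value(A):
    """Largest-modulus eigenvalue of a nonnegative irreducible matrix."""
    return float(np.max(np.abs(np.linalg.eigvals(np.asarray(A, dtype=float)))))

def topological_entropy(A):
    """Topological entropy in bits: log2 of the Perron value."""
    return float(np.log2(perron_value(A)))

# Assumed example: state 0 may go to 0 or 1, state 1 must return to 0.
A = np.array([[1, 1],
              [1, 0]])

lam = perron_value(A)         # equals the golden ratio for this matrix
h = topological_entropy(A)

# Empirical growth rate of the number of length-n paths (sum of entries of A^n).
n = 20
num_paths = np.linalg.matrix_power(A, n).sum()
growth = float(np.log2(num_paths) / n)
```

For moderate `n` the empirical rate `growth` is already close to `h`; the gap shrinks like $1/n$ because the path count is within a constant factor of $\lambda^n$.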

First, we give a topological condition under which the zero-error capacity is zero, with or without feedback.

Theorem 1.

The zero-error capacity with(out) feedback $C_{0f}$ (resp. $C_0$) of a finite-state additive noise channel [Def. 2] having finite-state machine [Def. 1] graph $\mathscr{G}$ is zero if and only if, for all $n \in \mathbb{N}$ and all $d_{1:n} \in \mathcal{X}^n$, there exists a walk on the coupled graph [Def. 3] of $\mathscr{G}$ with the label sequence $d_{1:n}$.

Remark: This result implies that $C_0 = 0$ if and only if $C_{0f} = 0$ for finite-state additive noise channels.


Sufficiency: We show that for any choice of encoding functions and blocklength there is a common output for , i.e., such that the output sequences, , where . In other words, , and

such that .

First observe that having current states and , for two noise sequences of and , respectively, the label on out-going edges in the coupled graph is belong to . Now consider the first transmission, by choosing any inputs , if there is an edge from any state with the value then there exist that produce a common output for two channel inputs and . By continuing this argument for any having , if is chosen such that there is an edge with value then there is an output shared with two messages. In other words, by choosing any value for , if there is an edge with corresponding value it means there is a pair of noise values such that , therefore . If there is no such an edge for a particular , then there is no pair of noise values that produces the same output, and thus, .

Therefore, if and for any choice of there is a walk on the coupled graph then the corresponding noise sequences of the walk can produce the same output, i.e. which implies and therefore .

Necessity: Assume there is no walk for a sequence of then by choosing any two input sequences such that , two messages and can be transmitted with zero-error which contradict with the assumption that (and also ). ∎
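
The walk condition of Theorem 1 can be tested mechanically. The sketch below assumes the condition is read as "every label sequence over the $q$-ary alphabet, of every length, labels some walk on the coupled graph", with walks allowed to start at any node since the initial state is unknown. It uses a standard subset construction, which is a technique chosen for this illustration and not taken from the proof; the machine encodings and binary noise values are likewise assumptions.

```python
from itertools import product

def every_label_sequence_has_walk(edges, q):
    """Return True iff every label sequence labels some walk on the coupled graph.

    edges: {state: [(noise_label, next_state), ...]}
    """
    def step(pairs, d):
        # Coupled-graph nodes reachable from `pairs` along edges labeled d.
        nxt = set()
        for s1, s2 in pairs:
            for (z1, t1), (z2, t2) in product(edges[s1], edges[s2]):
                if (z1 - z2) % q == d:
                    nxt.add((t1, t2))
        return frozenset(nxt)

    start = frozenset(product(edges, repeat=2))  # unknown initial states
    seen, stack = {start}, [start]
    while stack:
        pairs = stack.pop()
        for d in range(q):
            nxt = step(pairs, d)
            if not nxt:
                return False  # the sequence leading here, followed by d, has no walk
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return True

# Assumed encodings: state = number of consecutive errors so far.
no_two_errors = {0: [(0, 0), (1, 1)], 1: [(0, 0)]}
no_three_errors = {0: [(0, 0), (1, 1)], 1: [(0, 0), (1, 2)], 2: [(0, 0)]}

zero_capacity_q2 = every_label_sequence_has_walk(no_two_errors, q=2)
zero_capacity_q3 = every_label_sequence_has_walk(no_two_errors, q=3)
```

Under these assumptions the check reports that the binary channel with no two consecutive errors has every label sequence realizable (zero capacity), while for $q = 3$ the label sequence $(1, 1)$ has no walk, so the capacity is positive.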

We now relate the zero-error capacities of the channel to the noise process topological entropy.

Theorem 2.

The zero-error feedback capacity of the finite-state additive noise channel [Def. 2] with topological entropy $h(Z)$ of the noise process, where no state information is available at the transmitter and decoder, is either zero or

$$C_{0f} = \log q - h(Z). \qquad (3)$$

Moreover, the zero-error capacity (without feedback) is lower bounded by

$$C_0 \geq \log q - 2h(Z). \qquad (4)$$


  • The zero-error feedback capacity has a similar representation to the ordinary feedback capacity in (1), but with the stochastic noise entropy rate $\mathcal{H}(Z)$ replaced with the topological entropy $h(Z)$.

  • The topological entropy can be viewed as the rate at which the noise dynamics generate uncertainty. Intuitively, this uncertainty cannot increase the capacity, which explains why it appears as a negative term on the right-hand side of (3) and (4). Moreover, when $C_{0f} > 0$, the sum of the zero-error feedback capacity and the topological entropy is always equal to $\log q$, meaning that any increase in the noise uncertainty decreases the capacity by the same amount.

  • The result of (3) is an explicit closed-form solution, which is a notable departure from the iterative, dynamic programming solution in [2].

  • Following Definition 2, the channel states are not assumed to be Markov, just topologically Markov. Thus the transition probabilities in the finite-state machine can be time-varying dependent on previous states. In other words, as long as the graphical structure is not changed, the result is valid.

3 Proof of the Zero-error Feedback Capacity

The conditions under which $C_{0f} = 0$ are given in Theorem 1. Here, we consider $C_{0f} > 0$. Before presenting the rest of the proof, we give the following lemma.

Lemma 1.

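For a finite-state additive noise channel with irreducible adjacency matrix, there exist positive constants $\alpha$ and $\beta$ such that, for any input sequence $x_{1:n}$, the number of all possible outputs satisfies

$$\alpha \lambda^n \leq |\mathcal{Y}(s_0, x_{1:n})| = |\mathcal{Z}(s_0, n)| \leq \beta \lambda^n, \qquad (5)$$

where $\lambda$ is the Perron value of the adjacency matrix. Moreover, $\mathcal{Y}(s_0, x_{1:n})$ and $\mathcal{Z}(s_0, n)$ are the sets of possible output and noise sequences for a given initial state $s_0$ and input sequence $x_{1:n}$.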


The output sequence, , is a function of input sequence, , and channel noise, , which can be represented as the following


where . The set of all output sequences can be obtained as . Since for given , (6) is bijective, we have the following


For a given initial state , define the binary indicator vector consisting of all zeros except for a 1 in the position corresponding to ; e.g. in Fig.1, if starting from state , then . Observe that since each output of the finite-state additive channel triggers a different state transition, each sequence of state transitions has a one-to-one correspondence to the output sequence, given the input sequence.

The total number of state trajectories after -step starting from state is equal to sum of -th row of [7]. Hence, because of a one-to-one correspondence between state sequences and output sequences then .

Next, we show the upper and lower bounds in (5). According to the Perron-Frobenius Theorem, for an irreducible matrix

(or, equivalently, the adjacency matrix for a strongly connected graph), the entries of eigenvector

corresponding to are strictly positive [8, Thm. 8.8.1],[7, Thm. 4.2.3]. Therefore, multiplying by results in for . Left multiplication by the indicator vector, yields


Denote minimum and maximum element of vector by and respectively. Hence, considering that all the elements in both sides of (8) are positive, we have

where is all-one column vector. Therefore, dividing by , we have


where . Moreover, for deriving the lower bound similar to above, we have

Let , hence which combining it with (9) results in (5). ∎
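
The two-sided bound of Lemma 1 can be verified numerically. In the sketch below, the constants are taken as the ratios of the smallest and largest entries of the Perron eigenvector, which is one valid choice that follows from the eigenvector argument above; whether it matches the exact constants of the proof is an assumption. The adjacency matrix is the assumed one for the Figure 1 machine, with states ordered by the number of consecutive errors.

```python
import numpy as np

# Assumed adjacency matrix of the Fig. 1 machine (at most two consecutive errors).
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 0, 0]], dtype=float)

eigvals, eigvecs = np.linalg.eig(A)
k = int(np.argmax(eigvals.real))     # index of the Perron value
lam = float(eigvals[k].real)
v = np.abs(eigvecs[:, k].real)       # Perron eigenvector, strictly positive entries

alpha = float(v.min() / v.max())     # lower-bound constant (assumed choice)
beta = float(v.max() / v.min())      # upper-bound constant (assumed choice)

def count_sequences(A, n):
    """Number of length-n state sequences from each initial state (row sums of A^n)."""
    return np.linalg.matrix_power(A, n).sum(axis=1)

checks = []
for n in range(1, 25):
    counts = count_sequences(A, n)
    lower_ok = np.all(alpha * lam ** n <= counts * (1 + 1e-9))
    upper_ok = np.all(counts <= beta * lam ** n * (1 + 1e-9))
    checks.append(bool(lower_ok and upper_ok))
```

Since state sequences, noise sequences and output sequences are in one-to-one correspondence for a fixed input and initial state, the same row sums count the possible outputs.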

3.1 Converse

We prove no coding method can do better than (3).

Let be the message to be sent and be the output sequence received such that

where is the additive noise and the encoding function. Therefore, the output is a function of encoding function and noise sequence, i.e., . We denote all possible outputs , where is the family of encoding functions.

For having a zero-error code any two and any two must result in . Note that when , (even with feedback) at first position that will result in . Therefore, assuming the initial condition is known at both encoder and decoder,

Therefore, is an upper bound on the number of messages that can be transmitted when initial condition is not available. We know that . Therefore,

Moreover, , which proves the converse in (3).

3.2 Achievability

A coding method is proposed that achieves (3). Consider a code of length such that first symbols are the data to be transmitted and the rest of symbols serve as parity check symbols.

We know that for an input of size there are possible output sequences, which is bounded as follows

The transmitter having the output sequence , sends the receiver which output pattern (e.g. a message from ) was received using the parity check symbols. Assume that the transmitter sends the parity check symbols with a rate slightly below the zero-error feedback capacity, i.e., , where is arbitrarily small. (Footnote 3: The reason to choose is to deal with the situation when is achieved only as the blocklength tends to infinity.) Therefore,

Using the upper bound on size of the output, i.e., and rearranging the inequality, gives

Considering the fact that the total rate of coding is upper-bounded by , we have

Rearranging gives the following.

By choosing small and making large, the last two terms disappear and this concludes the proof.

4 Proof of the Zero-Error Capacity Lower Bound

First, we give the following Lemma.

Lemma 2.

Let be subset of the inputs that can result in output with initial state for the finite-state additive noise channel. The following holds


where and are constants appeared in (5).


The subset of the inputs that can result in output with initial state , is defined as the following

Fixing , the mapping in (6) is bijective, hence . Combining it with (7) yields . Moreover, Lemma 1 gives the bounds on . ∎

Figure 2: State transition diagram of a noise process for a channel in which no two consecutive errors can occur.

Let be the first codeword for which adjacent inputs denoted by . Again, each output sequence is in . Hence,


where, , which gives

Using Lemma 2, we have

According to (5), for any initial state the number of outputs is upper-bounded by . Therefore,

Choosing non-adjacent inputs as the codebook results in error-free transmission. The above argument is true for other codewords, i.e.,

where is the number of codewords in the codebook such that union of corresponding for covers . Then,

As a result, the number of distinguishable inputs is lower bounded by . Therefore, according to the zero-error capacity definition

If is large, the last term vanishes and proves the lower bound in (4).

5 Examples

Here, we provide some examples and compute $C_{0f}$ explicitly for them. Examples 1 and 2 consider channels with isolated and limited runs of errors. In Example 3, we consider a Gilbert-Elliott channel. Moreover, for Examples 1 and 2, we investigate the minimum value of the ordinary feedback capacity over the transition probabilities and observe how far this natural upper bound is from the zero-error feedback capacity.

Figure 3: Markov chain for channel states in Example 3.
Example 1.

Consider a channel with no two consecutive errors (Fig. 2). If $q = 2$, then $C_{0f} = 0$, whereas if $q > 2$ it has a zero-error feedback capacity of $C_{0f} = \log q - \log \frac{1+\sqrt{5}}{2}$ bit/use, where $\frac{1+\sqrt{5}}{2}$ is known as the golden ratio.

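Moreover, assuming Markovianity with the transition probability $p$ from the error-free state to the error state, the ordinary feedback capacity is $C_f = \log q - \frac{H(p)}{1+p}$ from (1), where $H(\cdot)$ is the binary entropy function. It turns out that $\min_p C_f = \log q - \log \frac{1+\sqrt{5}}{2}$, which coincides with $C_{0f}$ when $q > 2$.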
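
The comparison in Example 1 can be reproduced numerically. The sketch below assumes a two-state Markov noise chain in which $p$ is the probability of an error following an error-free symbol and an error is always followed by an error-free symbol; under that assumption the stationary probability of the error-free state is $1/(1+p)$, so the noise entropy rate is $H(p)/(1+p)$. The grid search and the choice $q = 3$ are illustrative.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # golden ratio, Perron value of [[1, 1], [1, 0]]

def binary_entropy(p):
    """Binary entropy function in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def noise_entropy_rate(p):
    """Entropy rate of the assumed two-state Markov noise chain."""
    return binary_entropy(p) / (1 + p)

def ordinary_feedback_capacity(q, p):
    """log q minus the noise entropy rate, as in (1)."""
    return math.log2(q) - noise_entropy_rate(p)

q = 3
c0f = math.log2(q) - math.log2(PHI)         # zero-error feedback capacity for q > 2

grid = [i / 10000 for i in range(1, 10000)]
p_star = max(grid, key=noise_entropy_rate)  # worst-case transition probability
min_cf = ordinary_feedback_capacity(q, p_star)
```

The maximizing probability lands near $1/\varphi^2 \approx 0.382$, where the stochastic entropy rate reaches the topological entropy $\log \varphi$, so the minimum of the ordinary feedback capacity matches the zero-error value.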

Example 2.

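The example of Fig. 1 represents a channel with no more than two consecutive errors, having adjacency matrix (with states ordered by the number of consecutive errors so far)

$$A = \begin{bmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{bmatrix}.$$

If $q = 2$, then $C_{0f} = 0$, and if $q > 2$ it has $C_{0f} = \log q - \log \lambda$, where $\lambda \approx 1.839$ is the Perron value of $A$.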
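
As a worked instance of Theorem 2 for this example, the sketch below evaluates the feedback formula (3) and the no-feedback lower bound (4) for a few alphabet sizes. The adjacency matrix and its state ordering are assumptions based on the description of Figure 1, and the alphabet sizes are arbitrary choices for illustration.

```python
import numpy as np

# Assumed adjacency matrix: state = number of consecutive errors so far (0, 1, 2).
A = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 0, 0]])

lam = float(np.max(np.abs(np.linalg.eigvals(A))))  # Perron value
h = float(np.log2(lam))                            # topological entropy in bits

results = {}
for q in (3, 4, 5):
    c0f = float(np.log2(q) - h)            # formula (3), valid when C_0f > 0
    c0_lower = float(np.log2(q) - 2 * h)   # lower bound (4) on C_0
    results[q] = (c0f, max(c0_lower, 0.0)) # a negative bound carries no information
```

For $q = 3$ the lower bound (4) is negative and therefore vacuous, while for $q = 4$ it already guarantees a positive zero-error rate without feedback.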

If the channel states are Markov with transition probabilities and , it can be shown that

Example 3.

Consider a Gilbert-Elliott channel with input alphabet of size $q = 5$ and two states (Fig. 3). When the state the channel is error-free, i.e., and when state it acts like a noisy typewriter channel (Fig. 4), which is also known as the Pentagon channel [1]. In this state, the probability of error for any input symbol is and thus the probability of error-free transmission is . Figure 3 shows this channel’s state transition diagram. However, this channel does not fit Definition 2, because outgoing edges are not associated with unique noise values. This reflects the fact that the noise process is a hidden Markov model, not a Markov chain, and the same state sequence can yield multiple noise sequences.

Nonetheless, in the following we show an equivalent representation of this channel compatible with Definition 2. The resultant model (shown in Fig. 5) is a state machine that produces the same set of noise sequences, where the edges define the noise values in each transmission.










Figure 4: Pentagon channel.

Note that if the channel is in state , the noise can only take value , but in state , the noise , thus at all times. In the sequel, we show that


whenever the conditioning sequence of occurs with non-zero probability. Therefore, irrespective of past noises the state machine shown in Fig. 5 can produce all noise sequences that occur with nonzero probability. It should be stressed that this noise process may not be a stochastic Markov chain, however, it is a topological Markov chain [7, Ch.2]. First, note by inspection of Fig. 3 that the noise process has zero probability of taking value twice in a row. Thus . Using Bayes rule, it then follows that

whenever .

Next we show (13)-(14). Let be any past noise sequence such that .Therefore, such that


From Fig. 3, . Thus

since the second factor on the RHS is positive, by (15). Therefore, , and (13) holds. Now, we show (14). If , it can be shown from Fig. 3 and the noise probabilities that



Note from Fig. 3 that . Thus,

Consequently, (12)-(14) hold yielding the state machine in Fig. 5. Note that, corresponds to and , to .

Figure 5: State machine generating the noise sequence of Example 3.

Now, we can use the results of Theorem 2, to get

This shows that the zero-error feedback capacity of some channels with a different structure from Definition 2, such as time-varying state transitions (non-homogeneous Markov chains) and even transitions that depend on previous transmissions, can be explicitly obtained.

6 Conclusion

We introduced a formula for computing the zero-error feedback capacity for a class of additive noise channels without state information at the decoder and encoder. This reveals a close connection between the topological entropy of the underlying noise process and zero-error communication. Moreover, a lower bound on the zero-error capacity (without feedback) was given based on the topological entropy.

Future work includes extending these results to a more general class of channels.


  • [1] C. Shannon, “The zero error capacity of a noisy channel,” IRE Transactions on Information Theory, vol. 2, no. 3, pp. 8–19, 1956.
  • [2] L. Zhao and H. H. Permuter, “Zero-error feedback capacity of channels with state information via dynamic programming,” IEEE Transactions on Information Theory, vol. 56, no. 6, pp. 2640–2650, 2010.
  • [3] F. Alajaji, “Feedback does not increase the capacity of discrete channels with additive noise,” IEEE Transactions on Information Theory, vol. 41, no. 2, pp. 546–549, 1995.
  • [4] A. Saberi, F. Farokhi, and G. N. Nair, “State estimation via worst-case erasure and symmetric channels with memory,” in 2019 IEEE International Symposium on Information Theory (ISIT). IEEE, 2019, pp. 3072–3076.
  • [5] A. Rényi, Foundations of Probability. Holden-Day, 1970.
  • [6] R. Hammack, W. Imrich, and S. Klavžar, Handbook of Product Graphs. CRC Press, 2011.
  • [7] D. Lind and B. Marcus, An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, 1995.
  • [8] C. Godsil and G. Royle, Algebraic Graph Theory. Springer, New York, 2001.