1 Introduction
In his 1956 paper [1], Shannon introduced the concept of zero-error communication. Although a general formula is still missing for the zero-error capacity $C_0$ of a discrete memoryless channel (DMC) without feedback, Shannon derived one for the zero-error capacity of a DMC with noiseless feedback. In recent years, there has been progress towards determining these capacities for channels with memory. In [2], Zhao and Permuter introduced a dynamic programming formulation for computing the zero-error feedback capacity $C_{0f}$ of a finite-state channel modeled as a Markov decision process, assuming state information is available at both encoder and decoder. However, the problem is still open when there is no state information at the decoder.
In this paper, we study the zero-error capacity, with and without feedback, of discrete channels with additive correlated noise. The ordinary capacities with and without feedback of such channels are studied in [3], in which it is proved that
$C = C_f = \log q - \bar{H}(N)$,  (1)
where $q$ is the input alphabet size and $\bar{H}(N)$ is the entropy rate of the noise process $\{N_t\}$. In this paper, we consider additive noise channels where the noise is generated by a finite-state machine. We prove a similar formula for the zero-error feedback capacity $C_{0f}$ and a lower bound for the zero-error capacity $C_0$, in terms of topological entropy (Theorem 2). Unlike [2], we do not assume that channel state information is available at the encoder or decoder. In [4], we studied $C_0$ of some special cases of these channels and derived a similar lower bound. In this paper, we extend that result to a more general channel model, and also derive an exact formula for $C_{0f}$. Examples including the well-known Gilbert-Elliott channel are considered, for which the explicit value of $C_{0f}$ is computed. To the best of our knowledge, this has not been done for these channels.
The rest of the paper is organized as follows. In Section 2, the channel model and main results are presented. Proofs are given in Sections 3 and 4, and some examples are discussed in Section 5. Finally, concluding remarks and future extensions are discussed in Section 6.
Throughout the paper, calligraphic letters such as $\mathcal{A}$ denote sets. The cardinality of a set $\mathcal{A}$ is denoted by $|\mathcal{A}|$. The channel input alphabet size is $q$, and logarithms are in base 2. Random variables are denoted by upper case letters such as $X$, and their realizations by lower case letters such as $x$. The vector $(X_1, \ldots, X_n)$ is denoted by $X^n$.
2 Channel Model and Main Results
Let the input, output and noise at time $t$ in the channel be $X_t$, $Y_t$, and $N_t$, respectively. Before we describe the channel, we define the following notion.
Definition 1 (Finite-state machine).
A finite-state machine is defined as a directed graph $\mathcal{G} = (\mathcal{S}, \mathcal{E})$, where the vertex set $\mathcal{S}$ denotes the states of the machine, and the edge set $\mathcal{E} \subseteq \mathcal{S} \times \mathcal{S}$ denotes the possible transitions between two states. We say a process $\{S_t\}$ is described by $\mathcal{G}$ if a) there is a positive probability that any state is eventually visited, i.e. $\forall s \in \mathcal{S}$, $\exists t \in \mathbb{N}$ s.t. $P(S_t = s) > 0$; b) if $(s, s') \in \mathcal{E}$, a transition from $s$ to $s'$ is always possible for all possible past state sequences, i.e. $P(S_{t+1} = s' \mid S_t = s, S^{t-1} = s^{t-1}) > 0$ whenever $P(S_t = s, S^{t-1} = s^{t-1}) > 0$; and c) conversely, if $(s, s') \notin \mathcal{E}$, then $P(S_{t+1} = s' \mid S_t = s, S^{t-1} = s^{t-1}) = 0$ whenever $P(S_t = s, S^{t-1} = s^{t-1}) > 0$.
Remark: Processes described by a finite-state machine are topologically Markov [7, Ch. 2], but need not be stochastic Markov chains.
The following channel is studied in this paper.
Definition 2 (Finite-state additive noise channels).
A discrete channel with common $q$-ary input, noise and output alphabet $\mathcal{Q} = \{0, 1, \ldots, q-1\}$ is called a finite-state additive noise channel if its output at time $t$ is obtained by
$Y_t = X_t \oplus N_t$,
where $\oplus$ is modulo-$q$ addition and the correlated additive noise $\{N_t\}$ is governed by a state process $\{S_t\}$ on a finite-state machine $\mathcal{G}$ such that each outgoing edge from a state corresponds to a different value of the noise. Thus, there are at most $q$ outgoing edges from each state. We assume the state transition diagram of the channel is strongly connected and that $X^n$ is independent¹ of $N^n$.
¹ This can be relaxed to qualitative independence [5, Ch. 1]; i.e., if $P(A)$ and $P(B)$ are both positive, then $P(A \cap B) > 0$.
Figure 1 shows a noise process which defines a channel that has no more than two consecutive errors. For example, the transition at time $t$ from the error-free state back to itself corresponds to $N_t = 0$, whereas $N_t = 1$ leads to the transition ending in the one-error state (the state at the next time step). Note that, in the two-error state, the noise can only take the value $0$, and the process transits back to the error-free state.
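For illustration, the following Python sketch (ours, not from the paper) simulates such a channel for the machine just described; the state encoding, the edge list and the random choice among admissible noise values are illustrative assumptions, since in the zero-error setting the noise is adversarial rather than random.

```python
# A minimal sketch of Definition 2: noise produced by walking a finite-state
# machine whose outgoing edges carry distinct noise values. The 3-state
# machine below is a hypothetical encoding of Fig. 1 (no more than two
# consecutive errors) with q = 2.
import random

q = 2
# state -> list of (noise_value, next_state); at most q outgoing edges
machine = {0: [(0, 0), (1, 1)],   # no recent error
           1: [(0, 0), (1, 2)],   # one error just occurred
           2: [(0, 0)]}           # two consecutive errors: next use error-free

def channel(x, state=0):
    """Return one admissible output sequence, y_t = x_t + n_t (mod q)."""
    y = []
    for xt in x:
        n, state = random.choice(machine[state])  # one admissible realization
        y.append((xt + n) % q)
    return y

print(channel([0, 1, 1, 0, 1, 0]))
```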
Definition 3 (Coupled graph).
The coupled graph of a finite-state machine (with labeled graph $\mathcal{G}$) is defined as the labeled directed graph² $\bar{\mathcal{G}} = \mathcal{G} \times \mathcal{G}$, with vertex set $\mathcal{S} \times \mathcal{S}$ and an edge from node $(s_1, s_2)$ to $(s_1', s_2')$ if and only if there are edges from $s_1$ to $s_1'$ (with a label value $n_1$) and from $s_2$ to $s_2'$ (with a label value $n_2$) in $\mathcal{G}$; each such edge has the label $n_1 \ominus n_2$, where $\ominus$ is modulo-$q$ subtraction.
² This product is called the tensor product, as well as the Kronecker product [6, Ch. 4].
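The construction is easy to mechanize. Below is a short sketch (using our own edge-list convention) that enumerates the coupled graph of a small hypothetical machine.

```python
# A sketch of Definition 3: the coupled graph of a labeled machine, with
# edge labels n1 - n2 (mod q). The machine is a hypothetical two-state
# example with no two consecutive errors (cf. Fig. 2).
from itertools import product

q = 2
# labeled edges of G: (from_state, to_state, noise_label)
edges = [(0, 0, 0), (0, 1, 1), (1, 0, 0)]

def coupled_graph(edges, q):
    """Return edges ((s1, s2), (s1', s2'), (n1 - n2) mod q) of G x G."""
    return [((u1, u2), (v1, v2), (n1 - n2) % q)
            for (u1, v1, n1), (u2, v2, n2) in product(edges, repeat=2)]

for e in coupled_graph(edges, q):
    print(e)
```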
For a state-dependent channel, the zero-error capacity is defined as follows.
Definition 4.
The zero-error capacity, $C_0$, is the largest block-coding rate that permits zero decoding errors, i.e.,
$C_0 = \lim_{n \to \infty} \max_{\mathcal{C} \in \mathscr{C}_n} \frac{1}{n} \log |\mathcal{C}|$,  (2)
where $\mathscr{C}_n$ is the set of all block codes of length $n$ that yield zero decoding errors for any channel noise sequence and channel initial state, given that no state information is available at the encoder and decoder. In a zero-error code, two distinct codewords can never result in the same channel output sequence, regardless of the channel noise and initial state.
The zero-error feedback capacity is defined in the presence of noiseless feedback from the output. In other words, assuming $m$ is the message to be sent and $y^{t-1}$ is the output sequence received so far, the channel input is $x_t = f_t(m, y^{t-1})$, where $f_t$ is the encoding function at time $t$. Let $\mathcal{F}_n = \{f_t\}_{t=1}^{n}$ denote the family of encoding functions. The zero-error feedback capacity, $C_{0f}$, is the largest block-coding rate that permits zero decoding errors with such encoders.
Before presenting the main results, we need some preliminaries from symbolic dynamics. In symbolic dynamics, topological entropy is defined as the asymptotic growth rate of the number of possible state sequences. For a finite-state machine with an irreducible adjacency matrix $A$, the topological entropy is known to coincide with $h = \log \lambda$, where $\lambda$ is the Perron value of $A$ [7]. This is essentially due to the fact that the number of paths from state $i$ to state $j$ in $n$ steps is the $(i,j)$-th element of $A^n$, which grows at the rate $\lambda^n$ for large $n$.
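For concreteness, the following sketch computes $h$ numerically from an adjacency matrix; the two-state matrix used here is the hypothetical no-two-consecutive-errors machine, whose Perron value is the golden ratio.

```python
# A small sketch computing the topological entropy h = log2(lambda) from the
# adjacency matrix A of a finite-state machine, where lambda is the Perron
# value of A.
import numpy as np

A = np.array([[1, 1],
              [1, 0]])                    # edge counts between states
lam = max(np.linalg.eigvals(A).real)      # Perron value (real, dominant)
print(lam, np.log2(lam))                  # ~1.618 (golden ratio), h ~ 0.694

# Sanity check: n-step path counts sum_j (A^n)[i, j] grow like lambda^n.
n = 20
print(np.linalg.matrix_power(A, n).sum(axis=1) / lam**n)
```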
First we give a topological condition for when the zero-error capacity is zero, with or without feedback.
Theorem 1.
For a finite-state additive noise channel, $C_{0f} = 0$ if and only if, for every $n \in \mathbb{N}$ and every label sequence $v^n \in \mathcal{Q}^n$, there is a walk in the coupled graph $\bar{\mathcal{G}}$ whose edge labels are $v_1, \ldots, v_n$.
Remark: This result implies that $C_0 = 0$ if and only if $C_{0f} = 0$ for finite-state additive noise channels.
Proof.
Sufficiency: We show that for any choice of encoding functions $\mathcal{F}_n$ and blocklength $n$, there is a common output for two distinct messages $m \neq m'$, i.e., there exist noise sequences $n^n$ and $n'^n$ such that the output sequences coincide: $y_t = f_t(m, y^{t-1}) \oplus n_t$ and $y_t = f_t(m', y^{t-1}) \oplus n'_t$ for all $t \le n$.
First observe that, for a pair of current states $(s, s')$ reached by two noise sequences $n^{t-1}$ and $n'^{t-1}$, respectively, the labels on the outgoing edges of $(s, s')$ in the coupled graph belong to $\mathcal{Q}$. Now consider the first transmission: choosing any inputs $x_1, x'_1$, if there is an edge from some state pair with the label $v_1 = x'_1 \ominus x_1$, then there exist noise values $n_1, n'_1$ that produce a common output for the two channel inputs $x_1$ and $x'_1$. Continue this argument for any $t$, with $v_t = x'_t \ominus x_t$: if $v_t$ is such that there is an edge with label $v_t$, then there is an output shared by the two messages. In other words, for any value of $v_t$, if there is an edge with the corresponding label, it means there is a pair of noise values such that $n_t \ominus n'_t = x'_t \ominus x_t$, and therefore $y_t = x_t \oplus n_t = x'_t \oplus n'_t$. If there is no such edge for a particular $v_t$, then there is no pair of noise values that produces the same output, and thus $y_t \neq y'_t$.
Therefore, if for any choice of $v^n$ there is a walk on the coupled graph with labels $v_1, \ldots, v_n$, then the corresponding noise sequences of the walk can produce the same output, i.e. $y^n = y'^n$, which implies $C_{0f} = 0$ and therefore $C_0 = 0$.
Necessity: Assume there is no walk for some label sequence $v^n$. Then, by choosing any two input sequences such that $x'_t \ominus x_t = v_t$, two messages $m$ and $m'$ can be transmitted with zero error, which contradicts the assumption that $C_{0f} = 0$ (and also $C_0 = 0$). ∎
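The walk condition in Theorem 1 can be checked mechanically. The sketch below (ours) decides, by dynamic programming over state pairs of the coupled graph, whether a given label sequence $v^n$ admits a walk; if every $v^n$ does, two messages can always be confused and $C_{0f} = 0$.

```python
# A sketch of the condition in Theorem 1: does a walk with labels v^n exist
# in the coupled graph, from some initial state pair? Machine is the
# hypothetical two-state example (cf. Fig. 2) with q = 2.
from itertools import product

q = 2
edges = [(0, 0, 0), (0, 1, 1), (1, 0, 0)]

# transition map of the coupled graph: (state_pair, label) -> next pairs
step = {}
for (u1, v1, n1), (u2, v2, n2) in product(edges, repeat=2):
    step.setdefault(((u1, u2), (n1 - n2) % q), set()).add((v1, v2))

def walk_exists(labels):
    pairs = {(a, b) for a in (0, 1) for b in (0, 1)}   # any initial pair
    for v in labels:
        pairs = set().union(*[step.get((p, v), set()) for p in pairs])
        if not pairs:
            return False
    return True

# For q = 2 every label sequence admits a walk, so this channel has C_0f = 0.
print(all(walk_exists(v) for v in product(range(q), repeat=6)))
```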
We now relate the zero-error capacities of the channel to the topological entropy of the noise process.
Theorem 2.
The zero-error feedback capacity of the finite-state additive noise channel [Def. 2] with topological entropy $h$ of the noise process, where no state information is available at the transmitter and decoder, is either zero or
$C_{0f} = \log q - h$.  (3)
Moreover, the zero-error capacity (without feedback) is lower bounded by
$C_0 \ge \log q - 2h$.  (4)
Remarks:
1. The zero-error feedback capacity has a representation similar to the ordinary feedback capacity in (1), but with the stochastic noise entropy rate $\bar{H}(N)$ replaced by the topological entropy $h$.
2. The topological entropy can be viewed as the rate at which the noise dynamics generate uncertainty. Intuitively, this uncertainty cannot increase the communication rate, which explains why it appears as a negative term on the right-hand side of (3) and (4). Moreover, the sum of the zero-error feedback capacity and the topological entropy is always equal to $\log q$, meaning that if the noise uncertainty increases, the capacity decreases by the same amount.
3. Following Definition 2, the channel states are not assumed to be Markov, just topologically Markov. Thus the transition probabilities in the finite-state machine can be time-varying and dependent on previous states. In other words, as long as the graphical structure is unchanged, the result remains valid.
3 Proof of the Zero-error Feedback Capacity
The condition under which $C_{0f} = 0$ is given in Theorem 1. Here, we consider the case $C_{0f} > 0$. Before presenting the rest of the proof, we give the following lemma.
Lemma 1.
For a finite-state additive noise channel with irreducible adjacency matrix $A$, there exist positive constants $c_1$ and $c_2$ such that, for any input sequence $x^n$ and initial state $s_0$, the number of all possible outputs satisfies
$c_1 \lambda^n \le |\mathcal{Y}(x^n, s_0)| \le c_2 \lambda^n$,  (5)
where $\lambda$ is the Perron value of the adjacency matrix. Here, $\mathcal{Y}(x^n, s_0)$ and $\mathcal{N}(s_0)$ denote the sets of possible output and noise sequences for a given initial state $s_0$ and input sequence $x^n$.
Proof.
The output sequence $y^n$ is a function of the input sequence $x^n$ and the channel noise $n^n$, which can be represented as
$y^n = x^n \oplus n^n$,  (6)
where $\oplus$ is applied element-wise and $n^n \in \mathcal{N}(s_0)$. The set of all output sequences can be obtained as $\mathcal{Y}(x^n, s_0) = \{x^n \oplus n^n : n^n \in \mathcal{N}(s_0)\}$. Since, for given $x^n$, (6) is bijective in $n^n$, we have
$|\mathcal{Y}(x^n, s_0)| = |\mathcal{N}(s_0)|$.  (7)
For a given initial state $s_0$, define the binary indicator (row) vector $e_{s_0}$ consisting of all zeros except for a 1 in the position corresponding to $s_0$; e.g. in Fig. 1, if starting from the second state, then $e_{s_0} = (0, 1, 0)$. Observe that, since each output of the finite-state additive noise channel triggers a different state transition, each sequence of state transitions is in one-to-one correspondence with the output sequence, given the input sequence.
The total number of state trajectories after $n$ steps starting from state $s_0$ is equal to the sum of the $s_0$th row of $A^n$ [7]. Hence, by the one-to-one correspondence between state sequences and output sequences, $|\mathcal{N}(s_0)| = e_{s_0} A^n \mathbf{1}$.
Next, we show the upper and lower bounds in (5). According to the Perron-Frobenius Theorem, for an irreducible matrix $A$ (or, equivalently, the adjacency matrix of a strongly connected graph), the entries of the eigenvector $v$ corresponding to $\lambda$ are strictly positive [8, Thm. 8.8.1], [7, Thm. 4.2.3]. Therefore, multiplying $A^n$ by $v$ results in $A^n v = \lambda^n v$ for $n \ge 1$. Left multiplication by the indicator vector yields
$e_{s_0} A^n v = \lambda^n e_{s_0} v = \lambda^n v_{s_0}$.  (8)
Denote the minimum and maximum elements of $v$ by $v_{\min}$ and $v_{\max}$, respectively. Hence, considering that all the elements on both sides of (8) are positive, we have
$v_{\min}\, e_{s_0} A^n \mathbf{1} \le e_{s_0} A^n v = \lambda^n v_{s_0} \le \lambda^n v_{\max}$,
where $\mathbf{1}$ is the all-one column vector. Therefore, dividing by $v_{\min}$, we have
$|\mathcal{N}(s_0)| = e_{s_0} A^n \mathbf{1} \le \frac{v_{\max}}{v_{\min}} \lambda^n = c_2 \lambda^n$,  (9)
where $c_2 = v_{\max}/v_{\min}$. Moreover, deriving the lower bound similarly, we have
$v_{\max}\, e_{s_0} A^n \mathbf{1} \ge e_{s_0} A^n v = \lambda^n v_{s_0} \ge \lambda^n v_{\min}$,
so that $|\mathcal{N}(s_0)| \ge \frac{v_{\min}}{v_{\max}} \lambda^n = c_1 \lambda^n$ with $c_1 = v_{\min}/v_{\max}$. ∎
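Lemma 1 is easy to verify numerically. The sketch below (ours) checks the bounds for a hypothetical 3-state machine, using the constants $c_1 = v_{\min}/v_{\max}$ and $c_2 = v_{\max}/v_{\min}$ from the proof.

```python
# Numerical sanity check of Lemma 1: row sums of A^n (the number of n-step
# noise/output sequences from each initial state) stay between c1*lambda^n
# and c2*lambda^n.
import numpy as np

A = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 0, 0]])                  # hypothetical machine of Fig. 1
w, V = np.linalg.eig(A)
i = np.argmax(w.real)
lam = w[i].real                            # Perron value, ~1.8393
v = np.abs(V[:, i].real)                   # Perron eigenvector, entrywise > 0
c1, c2 = v.min() / v.max(), v.max() / v.min()

for n in (2, 5, 10, 20):
    counts = np.linalg.matrix_power(A, n).sum(axis=1)
    # small slack guards against floating-point rounding at the boundary
    assert all(c1 * lam**n <= c + 1e-6 and c <= c2 * lam**n + 1e-6
               for c in counts)
    print(n, counts)
```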
3.1 Converse
We prove that no coding method can do better than (3).
Let $m$ be the message to be sent and $y^n$ be the output sequence received, such that
$y_t = f_t(m, y^{t-1}) \oplus n_t$,
where $n_t$ is the additive noise and $f_t$ the encoding function. Therefore, the output is a function of the encoding functions and the noise sequence, i.e., $y^n = g(\mathcal{F}_n, n^n)$. We denote the set of all possible outputs by $\mathcal{Y}(\mathcal{F}_n, s_0)$, where $\mathcal{F}_n$ is the family of encoding functions.
For a zero-error code, any two distinct messages $m \neq m'$ must result in $y^n \neq y'^n$, whatever the noise sequences. Note that when $n^n \neq n'^n$, even with feedback, the first position at which $n_t \neq n'_t$ results in $y_t \neq y'_t$, because the inputs coincide up to that position. Hence each message induces at least $|\mathcal{N}(s_0)| \ge c_1 \lambda^n$ distinct outputs. Therefore, assuming the initial condition is known at both encoder and decoder, the number of messages $M(s_0)$ satisfies
$M(s_0) \le \frac{|\mathcal{Y}(\mathcal{F}_n, s_0)|}{c_1 \lambda^n}$.
Therefore, $\min_{s_0} M(s_0)$ is an upper bound on the number of messages that can be transmitted when the initial condition is not available. We know that $|\mathcal{Y}(\mathcal{F}_n, s_0)| \le q^n$. Therefore,
$M \le \frac{q^n}{c_1 \lambda^n}$.
Moreover, $\frac{1}{n} \log M \le \log q - \log \lambda - \frac{1}{n} \log c_1$, which tends to $\log q - h$, proving the converse in (3).
3.2 Achievability
A coding method is proposed that achieves (3). Consider a code of length $n = k + l$ such that the first $k$ symbols are the data to be transmitted and the remaining $l$ symbols serve as parity check symbols.
We know that for an input block of length $k$ there are $|\mathcal{Y}(x^k, s_0)|$ possible output sequences, which is bounded as follows:
$|\mathcal{Y}(x^k, s_0)| \le c_2 \lambda^k$.
The transmitter, having the output sequence $y^k$ via feedback, tells the receiver which output pattern (i.e., a message from a set of at most $c_2 \lambda^k$ possibilities) was received, using the $l$ parity check symbols. Assume that the transmitter sends the parity check symbols at a rate slightly below the zero-error feedback capacity, i.e., $C_{0f} - \epsilon$, where $\epsilon > 0$ is arbitrarily small.³ Therefore,
$l\,(C_{0f} - \epsilon) \ge \log |\mathcal{Y}(x^k, s_0)|$.
³ The reason to choose $C_{0f} - \epsilon$ is to deal with the situation in which $C_{0f}$ is only achieved as the blocklength tends to infinity.
Using the upper bound on the size of the output set, i.e., $|\mathcal{Y}(x^k, s_0)| \le c_2 \lambda^k$, and rearranging the inequality, it suffices to choose $l$ as the smallest integer with
$l \ge \frac{k \log \lambda + \log c_2}{C_{0f} - \epsilon}$.
Considering the fact that the total rate of coding is upper-bounded by $C_{0f}$, we have
$\frac{k \log q}{k + l} \le C_{0f}$.
Rearranging gives the following:
$C_{0f} \ge \log q - \frac{C_{0f}}{C_{0f} - \epsilon}\left(\log \lambda + \frac{\log c_2}{k}\right) - \frac{C_{0f}}{k}$.
By choosing $\epsilon$ small and making $k$ large, the right-hand side tends to $\log q - \log \lambda = \log q - h$, and this concludes the proof.
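The following sketch (ours, with example constants rather than values derived from a specific channel) illustrates numerically how the rate of this scheme approaches $\log q - \log \lambda$ as $k$ grows and $\epsilon$ shrinks.

```python
# Numeric illustration of the achievability argument: with k data symbols
# and l parity symbols as below, the rate k*log2(q)/(k+l) approaches
# log2(q) - log2(lambda).
import math

q = 3
lam = (1 + 5**0.5) / 2        # example entropy: golden ratio, h ~ 0.694
c2 = 2.0                      # example Lemma 1 constant (assumed)
C0f = math.log2(q) - math.log2(lam)

for k in (10, 100, 1000, 10000):
    eps = 1 / k               # let eps shrink as k grows
    l = math.ceil((k * math.log2(lam) + math.log2(c2)) / (C0f - eps))
    print(k, k * math.log2(q) / (k + l))   # -> C0f ~ 0.891
```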
4 Proof of the Zero-Error Capacity Lower Bound
First, we give the following lemma.
Lemma 2.
Let $\mathcal{X}(y^n, s_0)$ be the subset of the inputs that can result in the output $y^n$ with initial state $s_0$ for the finite-state additive noise channel. The following holds:
$c_1 \lambda^n \le |\mathcal{X}(y^n, s_0)| \le c_2 \lambda^n$,  (10)
where $c_1$ and $c_2$ are the constants appearing in (5).
Proof.
Since the channel is additive, $x^n = y^n \ominus n^n$ element-wise, so for fixed $y^n$ and $s_0$ the noise sequences in $\mathcal{N}(s_0)$ are in one-to-one correspondence with the inputs in $\mathcal{X}(y^n, s_0)$. Hence $|\mathcal{X}(y^n, s_0)| = |\mathcal{N}(s_0)|$, and (10) follows from (5) and (7). ∎
We now use Lemma 2 to prove the lower bound in (4).
Let $x^n(1)$ be the first codeword, and denote by $\mathcal{A}(x^n(1))$ the set of inputs adjacent to it, i.e., the inputs that can produce a common output with $x^n(1)$ for some initial state and noise sequences. Again, each output sequence lies in $\mathcal{Y}(x^n(1), s_0)$. Hence,
$\mathcal{A}(x^n(1)) = \bigcup_{s_0 \in \mathcal{S}} \bigcup_{y^n \in \mathcal{Y}(x^n(1), s_0)} \mathcal{X}(y^n, s_0)$,  (11)
which gives
$|\mathcal{A}(x^n(1))| \le \sum_{s_0 \in \mathcal{S}} \sum_{y^n \in \mathcal{Y}(x^n(1), s_0)} |\mathcal{X}(y^n, s_0)|$.
Using Lemma 2, we have
$|\mathcal{A}(x^n(1))| \le \sum_{s_0 \in \mathcal{S}} |\mathcal{Y}(x^n(1), s_0)|\, c_2 \lambda^n$.
According to (5), for any initial state the number of outputs is upper-bounded by $c_2 \lambda^n$. Therefore,
$|\mathcal{A}(x^n(1))| \le |\mathcal{S}|\, c_2^2 \lambda^{2n}$.
Choosing pairwise non-adjacent inputs as the codebook results in error-free transmission. The above argument holds for the other codewords as well, i.e.,
$|\mathcal{A}(x^n(i))| \le |\mathcal{S}|\, c_2^2 \lambda^{2n}$, for $i = 1, \ldots, M$,
where $M$ is the number of codewords in a codebook chosen such that the union of the corresponding sets $\mathcal{A}(x^n(i))$ covers $\mathcal{Q}^n$. Then,
$M\, |\mathcal{S}|\, c_2^2 \lambda^{2n} \ge q^n$.
As a result, the number of distinguishable inputs is lower bounded by $q^n / (|\mathcal{S}| c_2^2 \lambda^{2n})$. Therefore, according to the definition of the zero-error capacity,
$C_0 \ge \frac{1}{n} \log \frac{q^n}{|\mathcal{S}|\, c_2^2 \lambda^{2n}} = \log q - 2 \log \lambda - \frac{\log(|\mathcal{S}| c_2^2)}{n}$.
If $n$ is large, the last term vanishes, which proves the lower bound in (4). ∎
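The resulting rate is easy to evaluate. The sketch below (ours, with example constants) shows the lower bound in (4) emerging as $n$ grows.

```python
# Quick evaluation of the non-feedback lower bound (4): the packing argument
# guarantees at least q^n / (|S| * c2^2 * lam^(2n)) codewords, so the rate
# approaches log2(q) - 2*log2(lambda). Constants S, c2 are example values.
import math

q, S, c2 = 5, 2, 2.0
lam = (1 + 5**0.5) / 2
for n in (10, 100, 1000):
    rate = math.log2(q) - 2 * math.log2(lam) - math.log2(S * c2**2) / n
    print(n, rate)   # -> log2(5) - 2*log2(phi) ~ 0.93
```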
5 Examples
Here, we provide some examples and compute $C_{0f}$ explicitly for them. Examples 1 and 2 consider channels with isolated and limited runs of errors. In Example 3, we consider a Gilbert-Elliott channel. Moreover, for Examples 1 and 2, we investigate the minimum value of the ordinary feedback capacity over the transition probabilities and observe how far this natural upper bound is from the zero-error feedback capacity.
Example 1.
Consider a channel with no two consecutive errors (Fig. 2). If $q = 2$ then $C_{0f} = 0$. Whilst, if $q \ge 3$, it has a zero-error feedback capacity of $\log_2 q - \log_2 \varphi$ bit/use, where $\varphi = (1 + \sqrt{5})/2 \approx 1.618$ is known as the golden ratio.
Moreover, assuming Markovianity with error transition probability $p$, the ordinary feedback capacity from (1) is $C_f = \log_2 q - \frac{h_b(p)}{1 + p}$, where $h_b$ is the binary entropy function. It turns out that $\min_p C_f = \log_2 q - \log_2 \varphi = C_{0f}$.
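The minimization in Example 1 can be reproduced numerically, as in the following sketch (ours); the grid search recovers the minimizer near $p \approx 1/\varphi^2 \approx 0.382$.

```python
# Numeric check of Example 1: min over p of C_f = log2(q) - h_b(p)/(1+p)
# recovers log2(q) - log2(phi), i.e. the zero-error feedback capacity.
import math

def hb(p):  # binary entropy in bits
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

q = 3
p_star = min((i / 100000 for i in range(1, 100000)),
             key=lambda p: math.log2(q) - hb(p) / (1 + p))
phi = (1 + 5**0.5) / 2
print(p_star)                                     # ~0.382
print(math.log2(q) - hb(p_star) / (1 + p_star))   # ~0.891
print(math.log2(q) - math.log2(phi))              # C_0f ~ 0.891
```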
Example 2.
The example of Fig. 1 represents a channel with no more than two consecutive errors, having adjacency matrix
$A = \begin{pmatrix} 1 & 1 & 0 \\ 1 & 0 & 1 \\ 1 & 0 & 0 \end{pmatrix}$.
If $q = 2$ then $C_{0f} = 0$, and if $q \ge 3$ it has $C_{0f} = \log_2 q - \log_2 \lambda \approx \log_2 q - 0.8791$ bit/use, where $\lambda \approx 1.8393$ is the Perron value of $A$, i.e., the largest root of $\lambda^3 = \lambda^2 + \lambda + 1$.
If the channel states are Markov with transition probabilities $p_1$ and $p_2$, it can be shown that
$\min_{p_1, p_2} C_f = \log_2 q - \log_2 \lambda = C_{0f}$.
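A quick computation (ours) confirms the Perron value and the resulting capacities; $\lambda$ here is the so-called tribonacci constant.

```python
# Sketch for Example 2: the Perron value of the 3-state machine with no more
# than two consecutive errors is ~1.8393, so C_0f = log2(q) - log2(1.8393)
# for q >= 3 by (3).
import numpy as np

A = np.array([[1, 1, 0],
              [1, 0, 1],
              [1, 0, 0]])
lam = max(np.linalg.eigvals(A).real)
print(lam)                                   # ~1.8393
for q in (3, 4, 5):
    print(q, np.log2(q) - np.log2(lam))      # e.g. q = 3: ~0.706 bit/use
```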
Example 3.
Consider a Gilbert-Elliott channel with an input alphabet of size $q = 5$ and two states (Fig. 3). When the state is $G$ (good), the channel is error-free, i.e., $N_t = 0$, and when the state is $B$ (bad), it acts like a noisy typewriter channel (Fig. 4), which is also known as the pentagon channel [1]. In this state, the probability of error for any input symbol is $\epsilon$, and thus the probability of error-free transmission is $1 - \epsilon$. Figure 3 shows this channel's state transition diagram. However, this channel does not fit Definition 2, because the outgoing edges are not associated with unique noise values. This reflects the fact that the noise process is a hidden Markov model, not a Markov chain, and the same state sequence can yield multiple noise sequences.
Nonetheless, in the following we show an equivalent representation of this channel compatible with Definition 2. The resultant model (shown in Fig. 5) is a state machine that produces the same set of noise sequences, where the edges define the noise values in each transmission.
Note that if the channel is in state $G$, the noise can only take the value $0$, whereas in state $B$, the noise $N_t \in \{0, 1\}$; thus $N_t \in \{0, 1\}$ at all times. In the sequel, we show that
$P(N_{t+1} = 1 \mid N_t = 1, N^{t-1} = n^{t-1}) = 0$,  (12)
$P(N_{t+1} = 0 \mid N_t = 0, N^{t-1} = n^{t-1}) > 0$,  (13)
$P(N_{t+1} = 1 \mid N_t = 0, N^{t-1} = n^{t-1}) > 0$,  (14)
whenever the conditioning sequence $(n_t, n^{t-1})$ occurs with nonzero probability. Therefore, irrespective of past noises, the state machine shown in Fig. 5 can produce exactly the noise sequences that occur with nonzero probability. It should be stressed that this noise process may not be a stochastic Markov chain; however, it is a topological Markov chain [7, Ch. 2]. First, note by inspection of Fig. 3 that the noise process has zero probability of taking the value $1$ twice in a row. Thus (12) holds. Using Bayes rule, it then follows that $P(N_{t+1} = 0 \mid N_t = 1, N^{t-1} = n^{t-1}) = 1$ whenever $P(N_t = 1, N^{t-1} = n^{t-1}) > 0$.
Next, we show (13)-(14). Let $n^{t-1}$ be any past noise sequence such that $P(N_t = 0, N^{t-1} = n^{t-1}) > 0$. Therefore, there exists a state $s \in \{G, B\}$ such that
$P(S_{t+1} = s, N_t = 0, N^{t-1} = n^{t-1}) > 0$.  (15)
From Fig. 3, $P(N_{t+1} = 0 \mid S_{t+1} = s) \ge 1 - \epsilon > 0$ for both states. Thus
$P(N_{t+1} = 0, N_t = 0, N^{t-1} = n^{t-1}) \ge P(N_{t+1} = 0 \mid S_{t+1} = s)\, P(S_{t+1} = s, N_t = 0, N^{t-1} = n^{t-1}) > 0$,
since the second factor on the RHS is positive, by (15). Therefore, $P(N_{t+1} = 0 \mid N_t = 0, N^{t-1} = n^{t-1}) > 0$, and (13) holds. Now, we show (14). If $N_t = 0$, it can be shown from Fig. 3 and the noise probabilities that
$P(S_{t+1} = B \mid N_t = 0, N^{t-1} = n^{t-1}) > 0$.  (16)
Therefore,
$P(N_{t+1} = 1 \mid N_t = 0, N^{t-1} = n^{t-1}) \ge P(N_{t+1} = 1 \mid S_{t+1} = B)\, P(S_{t+1} = B \mid N_t = 0, N^{t-1} = n^{t-1})$.
Note from Fig. 3 that $P(N_{t+1} = 1 \mid S_{t+1} = B) = \epsilon > 0$. Thus,
$P(N_{t+1} = 1 \mid N_t = 0, N^{t-1} = n^{t-1}) > 0$.
Consequently, (12)-(14) hold, yielding the state machine in Fig. 5. Note that, in Fig. 5, one state corresponds to $N_t = 1$ and the other to $N_t = 0$.
Now, we can use the results of Theorem 2: the equivalent machine in Fig. 5 has the same structure as Example 1 with $q = 5$, so we get
$C_{0f} = \log_2 5 - \log_2 \varphi \approx 1.63$ bit/use.
This shows that the zero-error feedback capacity of some channels with a different structure from Definition 2, such as time-varying state transitions (non-homogeneous Markov chains) and even transitions that depend on previous transmissions, can be explicitly obtained.
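As a numerical check (ours), the equivalent two-state machine of Fig. 5 yields the stated value directly from (3).

```python
# Check for Example 3: the equivalent machine of Fig. 5 forbids two
# consecutive errors, so its topological entropy is log2(phi); with the
# pentagon alphabet q = 5, (3) gives C_0f ~ 1.63 bit/use.
import numpy as np

A = np.array([[1, 1],     # state "N = 0": next noise may be 0 or 1
              [1, 0]])    # state "N = 1": next noise must be 0, by (12)
lam = max(np.linalg.eigvals(A).real)     # golden ratio, ~1.618
print(np.log2(5) - np.log2(lam))         # ~1.6277
```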
6 Conclusion
We introduced a formula for computing the zero-error feedback capacity for a class of additive noise channels without state information at the encoder and decoder. This reveals a close connection between the topological entropy of the underlying noise process and zero-error communication. Moreover, a lower bound on the zero-error capacity (without feedback) was given based on the topological entropy.
Future work includes extending these results to a more general class of channels.
References
[1] C. Shannon, "The zero error capacity of a noisy channel," IRE Transactions on Information Theory, vol. 2, no. 3, pp. 8–19, 1956.
[2] L. Zhao and H. H. Permuter, "Zero-error feedback capacity of channels with state information via dynamic programming," IEEE Transactions on Information Theory, vol. 56, no. 6, pp. 2640–2650, 2010.
[3] F. Alajaji, "Feedback does not increase the capacity of discrete channels with additive noise," IEEE Transactions on Information Theory, vol. 41, no. 2, pp. 546–549, 1995.
[4] A. Saberi, F. Farokhi, and G. N. Nair, "State estimation via worst-case erasure and symmetric channels with memory," in 2019 IEEE International Symposium on Information Theory (ISIT), 2019, pp. 3072–3076.
[5] A. Rényi, Foundations of Probability. Holden-Day, 1970.
[6] R. Hammack, W. Imrich, and S. Klavžar, Handbook of Product Graphs. CRC Press, 2011.
[7] D. Lind and B. Marcus, An Introduction to Symbolic Dynamics and Coding. Cambridge University Press, 1995.
[8] C. Godsil and G. Royle, Algebraic Graph Theory. Springer, New York, 2001.