# Successive-Cancellation Decoding of Linear Source Code

This paper investigates the error probability of several decoding methods for a source code with decoder side information, namely: 1) symbol-wise maximum a posteriori decoding, 2) successive-cancellation decoding, and 3) stochastic successive-cancellation decoding. The proof of the effectiveness of a decoding method is reduced to that for an arbitrary decoding method, where 'effective' means that the error probability goes to zero as the block length $n$ goes to infinity. Furthermore, we revisit the polar source code, showing that stochastic successive-cancellation decoding, as well as successive-cancellation decoding, is effective for this code.


## I Introduction

Successive-cancellation (SC) decoding is one of the elements constituting the polar code introduced by Arıkan [1]. This paper investigates the error probability of SC decoding for a source code with decoder side information by extending the results in [2, 12] to general linear source codes. It is shown that if for a given encoder there is a decoder such that the block error probability is $\varepsilon$, then the block error probability of an SC decoder for the same encoder is at most $n\varepsilon$. Furthermore, we introduce stochastic successive-cancellation (SSC) decoding and show that it is equivalent to the constrained-random-number generator introduced in [7]. It is shown that if for a given encoder there is a decoder such that the block error probability is $\varepsilon$, then the block error probability of an SSC decoder for the same encoder is at most $2\varepsilon$. It is also shown that the error probability of the symbol-wise maximum a posteriori decoding of a linear source code and the SSC decoding of the polar source code goes to zero as the block length goes to infinity.

It should be noted that the results of this paper can be applied to channel coding as introduced in [2, 10, 12]. In particular, syndrome decoding is the case when a channel is additive: a parity check matrix corresponds to a source encoding function, the syndrome of a channel output corresponds to a codeword of the source code without decoder side information, and the kernel of the parity check matrix forms the channel inputs, that is, the codewords of a channel code.
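The correspondence above can be made concrete with a small sketch (our own toy example over GF(2), not from the paper): for an additive channel $y = x + e$, the receiver computes the syndrome $Hy = He$, treats it as the "codeword" of the noise under the source encoder $H$, estimates the most likely noise word in the coset, and subtracts it. The function name and parameters are ours.

```python
import itertools

import numpy as np

def syndrome_decode(H, y, p=0.1):
    """Toy syndrome decoding for an additive GF(2) channel y = x + e:
    the syndrome Hy = He identifies the coset of the noise, and the
    most likely noise word in that coset (BSC prior, flip prob. p) is removed."""
    s = H @ y % 2                                   # syndrome of the channel output
    best, best_w = None, -1.0
    for bits in itertools.product([0, 1], repeat=H.shape[1]):
        e = np.array(bits)
        if np.any(H @ e % 2 != s):
            continue                                # He must match the syndrome
        w = (1 - p) ** np.sum(e == 0) * p ** np.sum(e == 1)  # noise prior
        if w > best_w:
            best, best_w = e, w
    return (y - best) % 2                           # estimate of the channel input
```

With the repetition-style check matrix `H` below, whose kernel is {0000, 1111}, a single bit flip is corrected.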

Throughout the paper, we use the following notation. For a random variable $X$, let $\mathcal{X}$ be the alphabet of $X$, $\mu_X$ be the distribution of $X$, and $\mu_{X|Y}$ be the conditional distribution of $X$ for a given random variable $Y$. Let $H(X|Y)$ be the conditional entropy of $X$ for a given $Y$, where we assume that the base of the logarithm is the cardinality of $\mathcal{X}$. A column vector is denoted by a boldface letter, where its dimension depends on the context. We define $x_i^j \equiv (x_i, \ldots, x_j)$, where $x_i^j$ is the null string when $i > j$. Let $\chi$ be a support function defined as

$$\chi(S) \equiv \begin{cases} 1, & \text{if the statement } S \text{ is true} \\ 0, & \text{if the statement } S \text{ is false.} \end{cases}$$

## II Symbol-wise Maximum A Posteriori Decoding

First, we revisit symbol-wise maximum a posteriori (SMAP) decoding, which is used for the conventional decoding of a low-density parity-check code. Although the symbol error rate (the Hamming distance between a source output and its reproduction divided by the block length $n$) is usually discussed with symbol-wise maximum a posteriori decoding, we focus on the block error probability (an error occurs when a source output and its reproduction are different, that is, the Hamming distance is positive) throughout this paper.

Let $(A, \phi)$ be a pair consisting of a source encoder and a decoder with side information. Let $c_1 \equiv Ax$ be the codeword of a source output $x$. The SMAP decoder $\hat\phi$ is constructed by using functions $\{\hat\phi_i\}_{i=1}^n$ reproducing the $i$-th coordinate as

$$\hat\phi_i(c_1, y) \equiv \arg\max_{x_i} \mu_{X_i|C_1 Y}(x_i \mid c_1, y).$$

It should be noted that when $\mu_{XY}$ is memoryless and $A$ is a sparse matrix, we can use the sum-product algorithm to obtain an approximation of $\hat\phi_i$.
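As a concrete illustration (our own sketch, not the paper's implementation), the SMAP rule can be evaluated by brute force for a toy binary setting: $X$ uniform on $\{0,1\}^n$, side information $Y$ a bit-flipped version of $X$ with flip probability $p$, and the marginal computed by enumerating the coset $\{x : Ax = c_1\}$. The function name and the toy model are our assumptions.

```python
import itertools

import numpy as np

def smap_decode(A, c1, y, p=0.1):
    """Symbol-wise MAP sketch: for each i, pick argmax_{x_i} of the marginal
    mu_{X_i|C_1 Y}(x_i | c1, y), computed by brute force over {x : Ax = c1}."""
    n = A.shape[1]
    marg = np.zeros((n, 2))                      # marg[i, b] ∝ mu_{X_i C_1 Y}(b, c1, y)
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        if np.any(A @ x % 2 != c1):
            continue                             # x lies outside the coset {x : Ax = c1}
        w = np.prod(np.where(x == y, 1 - p, p))  # mu_{X|Y}(x|y) up to a constant (toy model)
        for i in range(n):
            marg[i, x[i]] += w
    return marg.argmax(axis=1)                   # coordinate-wise MAP decisions
```

The paper's point is that when $A$ is sparse, a sum-product pass replaces this exponential marginalisation.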

We have the following theorem.

###### Theorem 1

The error probability of the code $(A, \hat\phi)$ is bounded as

$$\mathrm{Prob}(\hat\phi(AX, Y) \neq X) \le n\,\mathrm{Prob}(\phi(AX, Y) \neq X),$$

where the right-hand side of this inequality goes to zero as $n \to \infty$ when $\mathrm{Prob}(\phi(AX, Y) \neq X) = o(1/n)$.

###### Proof:

Let $X_i$ be the $i$-th coordinate of $X$. Then we have

$$\begin{aligned}
\mathrm{Prob}(\hat\phi(AX,Y) \neq X) &= \mathrm{Prob}(\hat\phi_i(AX,Y) \neq X_i \text{ for some } i) \\
&\le \sum_{i=1}^n \mathrm{Prob}(\hat\phi_i(AX,Y) \neq X_i) \\
&= \sum_{i=1}^n \mathrm{Prob}\Bigl(\arg\max_{x_i} \mu_{X_i|C_1Y}(x_i \mid AX, Y) \neq X_i\Bigr) \\
&\le \sum_{i=1}^n \mathrm{Prob}(\phi_i(AX,Y) \neq X_i) \\
&\le \sum_{i=1}^n \mathrm{Prob}(\phi(AX,Y) \neq X) \\
&= n\,\mathrm{Prob}(\phi(AX,Y) \neq X), \end{aligned} \tag{1}$$

where the first inequality comes from the union bound, the second inequality comes from the fact that the maximum a posteriori decision minimizes the error probability (here $\phi_i$ denotes the $i$-th coordinate of $\phi$), and the third inequality comes from the fact that $\phi_i(AX,Y) \neq X_i$ implies $\phi(AX,Y) \neq X$.

It is known that, when the coding rate is greater than $H(X|Y)$, there is an encoding function $A$ such that the error probability is close to zero for all sufficiently large $n$ [5, 11], where we can use one of the following decoders:

• the typical set decoder defined as

$$\phi(c_1, y) \equiv \begin{cases} \hat x & \text{if there is a unique } \hat x \in \mathcal{T}_{X|Y,\varepsilon}(y) \text{ satisfying } A\hat x = c_1 \\ \text{`error'} & \text{otherwise,} \end{cases}$$

where

$$\mathcal{T}_{X|Y,\varepsilon}(y) \equiv \bigl\{x : \bigl| -\log \mu_{X|Y}(x \mid y) - H(X|Y) \bigr| \le n\varepsilon \bigr\}$$

is a conditional typical set,

• the maximum a posteriori probability decoder (the right-hand side of the third equality of (2) might be called the maximum-likelihood decoder) defined as

$$\begin{aligned}
\phi(c_1, y) &\equiv \arg\max_x \mu_{X|C_1Y}(x \mid c_1, y) \\
&= \arg\max_x \mu_{XC_1Y}(x, c_1, y) \\
&= \arg\max_{x:\, Ax = c_1} \mu_{XY}(x, y) \\
&= \arg\max_{x:\, Ax = c_1} \mu_{X|Y}(x \mid y), \end{aligned} \tag{2}$$

where the third equality comes from the fact that $\mu_{XC_1Y}(x, c_1, y) = \mu_{XY}(x, y)$ when $Ax = c_1$ and $\mu_{XC_1Y}(x, c_1, y) = 0$ when $Ax \neq c_1$.
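The right-hand side of (2) can be sketched for toy sizes by exhaustive search over the coset (our own illustration, with the same assumed binary memoryless model as above; the function name is ours):

```python
import itertools

import numpy as np

def map_decode(A, c1, y, p=0.1):
    """Block MAP sketch of (2): argmax over the coset {x : Ax = c1 (mod 2)}
    of mu_{X|Y}(x|y), for a toy memoryless binary model with flip prob. p."""
    best, best_w = None, -1.0
    for bits in itertools.product([0, 1], repeat=A.shape[1]):
        x = np.array(bits)
        if np.any(A @ x % 2 != c1):
            continue                             # enforce the constraint Ax = c1
        w = np.prod(np.where(x == y, 1 - p, p))  # mu_{X|Y}(x|y) up to normalisation
        if w > best_w:
            best, best_w = x, w
    return best
```

The search space has $|\mathcal{X}|^{n-l}$ elements, which is why the later sections replace this global argmax by successive symbol-wise decisions.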

The following sections show upper bounds on the error probability for several decoders in terms of the error probability of a code $(A, \phi)$, where $\phi$ is an arbitrary decoder. It should be noted that we can use one of the decoders mentioned above. We can reduce the effectiveness of the decoders to that of an arbitrary decoder, where 'effective' means that the error probability goes to zero as $n$ goes to infinity. For example, [9, 10] show that a decoder using a constrained-random-number generator is effective by showing that the maximum a posteriori probability decoder is effective.

## III Decoding Extended Codeword

Let $A$ be an encoder of a source code with decoder side information. Here, we assume that, for a given $A$, there is a function $B$ and a bijection $Q$ such that

$$Q(Ax, Bx) = x \quad \text{for all } x \in \mathcal{X}^n. \tag{3}$$

In particular, this condition is satisfied when the matrix obtained by stacking $A$ and $B$ is a full-rank square matrix. We denote the corresponding inverse bijection by $Q^{-1}$.

Let $\mathcal{I}_0$ and $\mathcal{I}_1$ be a partition of $\{1, \ldots, n\}$, that is, they satisfy $\mathcal{I}_0 \cap \mathcal{I}_1 = \emptyset$ and $\mathcal{I}_0 \cup \mathcal{I}_1 = \{1, \ldots, n\}$. We call $\mathcal{I}_0$ and $\mathcal{I}_1$ ordered when $\mathcal{I}_1 = \{1, \ldots, |\mathcal{I}_1|\}$ and $\mathcal{I}_0 = \{|\mathcal{I}_1| + 1, \ldots, n\}$. For a vector $c$, define $c_{\mathcal{I}_0}$ and $c_{\mathcal{I}_1}$ so that $c_i$ is a symbol in $c_{\mathcal{I}_0}$ when $i \in \mathcal{I}_0$ for every $i$. In the following, we assume that $c_1 = Ax$ and $c_0 = Bx$, where the corresponding index sets $\mathcal{I}_1$ and $\mathcal{I}_0$ may not be ordered in the bijection $Q$. We call $(c_0, c_1)$ the extended codeword of $x$. In the following, we denote $\mathcal{I}_0$ and $\mathcal{I}_1$ omitting the dependence on $n$.

Let $f$ be a function that reproduces the extended codeword by using the side information. For a codeword $c_1$ and side information $y$, the source decoder $\psi$ with side information is defined as

$$\psi(c_1, y) \equiv Q(f(c_1, y)). \tag{4}$$

In the context of the polar source codes, $c_1$ corresponds to the unfrozen symbols and $Q$ corresponds to the final step of SC decoding. We have the following lemma for the general case.

###### Lemma 1

Let $C_1 \equiv AX$ and $C_0 \equiv BX$. Then we have

$$\mathrm{Prob}(\psi(AX, Y) \neq X) = \mathrm{Prob}(f(C_1, Y) \neq (C_0, C_1)).$$
###### Proof:

We have

$$\begin{aligned}
\mathrm{Prob}(\psi(AX,Y) \neq X) &= \sum_{x,y} \mu_{XY}(x,y)\, \chi(\psi(Ax,y) \neq x) \\
&= \sum_{x,y,c_0,c_1} \mu_{XY}(x,y)\, \chi(Ax = c_1)\, \chi(Bx = c_0)\, \chi(\psi(c_1,y) \neq x) \\
&= \sum_{x,y,c_0,c_1} \mu_{XY}(x,y)\, \chi(Ax = c_1)\, \chi(Bx = c_0)\, \chi(f(c_1,y) \neq Q^{-1}(x)) \\
&= \sum_{x,y,c_0,c_1} \mu_{XY}(x,y)\, \chi(Ax = c_1)\, \chi(Bx = c_0)\, \chi(f(c_1,y) \neq (Ax, Bx)) \\
&= \sum_{x,y,c_0,c_1} \mu_{XY}(x,y)\, \chi(Ax = c_1)\, \chi(Bx = c_0)\, \chi(f(c_1,y) \neq (c_0, c_1)) \\
&= \sum_{c_0,c_1,y} \mu_{C_0C_1Y}(c_0,c_1,y)\, \chi(f(c_1,y) \neq (c_0,c_1)) \\
&= \mathrm{Prob}(f(C_1,Y) \neq (C_0,C_1)), \end{aligned} \tag{5}$$

where the third equality comes from the fact that $Q$ is bijective, and in the sixth equality we define

$$\mu_{C_0C_1Y}(c_0, c_1, y) \equiv \mu_{XY}(Q(c_1, c_0), y) \tag{6}$$

and use the fact that for all $c_0$ and $c_1$ there is a unique $x$ satisfying $Ax = c_1$ and $Bx = c_0$.

In the following, we investigate the decoding error probability for an extended codeword.

## IV Successive-Cancellation Decoding

This section investigates the error probability of the (deterministic) SC decoding. For a source encoder $A$, let $B$, $Q$, $\mathcal{I}_0$, and $\mathcal{I}_1$ be defined as in the previous section.

For a codeword $c_1$ and side information $y$, the output $\hat c \equiv (\hat c_1, \ldots, \hat c_n)$ of an SC decoder is defined recursively as

$$\hat c_i \equiv \begin{cases} f_i(\hat c_1^{i-1}, y) & \text{if } i \in \mathcal{I}_0 \\ c_i & \text{if } i \in \mathcal{I}_1 \end{cases}$$

by using functions $\{f_i\}_{i \in \mathcal{I}_0}$ defined as

$$f_i(c_1^{i-1}, y) \equiv \arg\max_{c_i} \mu_{C_i|C_1^{i-1}Y}(c_i \mid c_1^{i-1}, y), \tag{7}$$

which is known as the maximum a posteriori decision rule, where $\mu_{C_i|C_1^{i-1}Y}$ is the conditional probability defined as

$$\mu_{C_i|C_1^{i-1}Y}(c_i \mid c_1^{i-1}, y) \equiv \frac{\sum_{c_{i+1}^n} \mu_{C_0C_1Y}(c_0, c_1, y)}{\sum_{c_i^n} \mu_{C_0C_1Y}(c_0, c_1, y)} \tag{8}$$

by using $\mu_{C_0C_1Y}$ defined by (6).
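To make the recursion (7)-(8) concrete, here is a toy brute-force SC decoder (our own sketch, not the paper's implementation): it enumerates every source word, forms its extended codeword from assumed matrices `A` and `B` and assumed index sets, and evaluates the conditional (8) by summing weights; positions in $\mathcal{I}_1$ are copied from $c_1$, positions in $\mathcal{I}_0$ are decided by argmax.

```python
import itertools

import numpy as np

def sc_decode_ext(A, B, I1, c1, y, p=0.1):
    """Toy SC decoder for the extended codeword: indices in I1 are copied from
    the codeword c1; indices in I0 are decided by the MAP rule (7), with the
    conditional (8) computed by brute-force enumeration over source words x."""
    n = A.shape[1]
    I1s = sorted(I1)
    I0s = sorted(set(range(n)) - set(I1))
    table = []                                   # (extended codeword, weight mu_{XY}(x,y))
    for bits in itertools.product([0, 1], repeat=n):
        x = np.array(bits)
        c = np.empty(n, dtype=int)
        c[I1s] = A @ x % 2
        c[I0s] = B @ x % 2
        w = np.prod(np.where(x == y, 1 - p, p))  # toy memoryless model for mu_{XY}
        table.append((c, w))
    pos1 = {i: j for j, i in enumerate(I1s)}
    chat = np.empty(n, dtype=int)
    for i in range(n):
        if i in pos1:
            chat[i] = c1[pos1[i]]                # known codeword symbol
        else:
            s = [sum(w for c, w in table
                     if c[i] == b and np.array_equal(c[:i], chat[:i]))
                 for b in (0, 1)]                # conditional (8), unnormalised
            chat[i] = int(s[1] > s[0])           # MAP decision (7)
    return chat
```

Note that the decision at position $i$ conditions only on $\hat c_1^{i-1}$ and $y$, exactly as in (8): codeword symbols with indices beyond $i$ are not used.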

To simplify the notation, we define $f_i(c_1^{i-1}, y) \equiv c_i$ when $i \in \mathcal{I}_1$, although $f_i$ does not depend on $c_1^{i-1}$ and $y$ in this case. We have the following lemma.

###### Lemma 2
$$\mathrm{Prob}(f(C_1, Y) \neq (C_0, C_1)) \le \sum_{i \in \mathcal{I}_0} \mathrm{Prob}(f_i(C_1^{i-1}, Y) \neq C_i).$$
###### Proof:

As with the proof in [1], we can express the block error event as $\mathcal{E} \equiv \bigcup_{i=1}^n \mathcal{E}_i$, where

$$\mathcal{E}_i \equiv \bigl\{(c, y) : f_j(c_1^{j-1}, y) = c_j \text{ for all } j \in \{1, \ldots, i-1\},\ f_i(c_1^{i-1}, y) \neq c_i \bigr\}$$

is an event where the first decision error in SC decoding occurs at stage $i$. The decoding error probability for an extended codeword is evaluated as

$$\begin{aligned}
\mathrm{Prob}(f(C_1,Y) \neq (C_0,C_1)) &= \mathrm{Prob}((C_0,C_1,Y) \in \mathcal{E}) \\
&\le \sum_{i=1}^n \mathrm{Prob}((C_0,C_1,Y) \in \mathcal{E}_i) \\
&= \sum_{i \in \mathcal{I}_0} \mathrm{Prob}((C_0,C_1,Y) \in \mathcal{E}_i) \\
&\le \sum_{i \in \mathcal{I}_0} \mathrm{Prob}(f_i(C_1^{i-1},Y) \neq C_i), \end{aligned} \tag{9}$$

where the first inequality comes from the union bound, the second equality comes from the fact that $\mathrm{Prob}((C_0,C_1,Y) \in \mathcal{E}_i) = 0$ when $i \in \mathcal{I}_1$, and the last inequality comes from the fact that $(C_0,C_1,Y) \in \mathcal{E}_i$ implies $f_i(C_1^{i-1},Y) \neq C_i$.

When the index sets $\mathcal{I}_0$ and $\mathcal{I}_1$ are not ordered, as in the polar source codes [2, 12], $f_i$ defined by (7) may not use the full information of a codeword $c_1$. Borrowing words from [1], $f_i$ treats future symbols as random variables rather than as known symbols. In other words, $f_i$ ignores the future symbols in a codeword $c_1$. This implies that $f$ is different from the optimum maximum a posteriori decoder $f_{\mathrm{MAP}}$ defined as

$$f_{\mathrm{MAP}}(c_1, y) \equiv \arg\max_{c_0} \mu_{C_0|C_1Y}(c_0 \mid c_1, y).$$

The following investigates the error probability of the (deterministic) SC decoding by assuming that the index sets $\mathcal{I}_0$ and $\mathcal{I}_1$ are ordered, that is, $\mathcal{I}_1 = \{1, \ldots, l\}$ and $\mathcal{I}_0 = \{l+1, \ldots, n\}$. This implies that for every $i \in \mathcal{I}_0$, $f_i$ defined by (7) uses the full information of a codeword $c_1$.

###### Lemma 3

For a source encoder $A$ and decoder $\phi$ with side information, let $B$, $Q$, $\mathcal{I}_0$, and $\mathcal{I}_1$ be as defined in the previous section, where it is assumed that the index sets $\mathcal{I}_0$ and $\mathcal{I}_1$ are ordered. Then we have

$$\mathrm{Prob}(f_i(C_1^{i-1}, Y) \neq C_i) \le \mathrm{Prob}(\phi(AX, Y) \neq X)$$

for all $i \in \mathcal{I}_0$.

###### Proof:

For $i \in \mathcal{I}_0$, let $f'_i(c_1, y)$ be the $i$-th coordinate of the extended codeword of $\phi(c_1, y)$. Then we have the fact that

$$\begin{aligned}
f'_i(c_1, y) \neq c_i &\Rightarrow Q^{-1}(\phi(c_1, y)) \neq (c_0, c_1) \\
&\Leftrightarrow \phi(c_1, y) \neq Q(c_1, c_0) \\
&\Leftrightarrow \phi(Ax, y) \neq x \end{aligned} \tag{10}$$

for all $x$, $c_0$, and $c_1$ satisfying $Ax = c_1$ and $Bx = c_0$, where the second equivalence comes from the fact that $Q$ is bijective, and the third equivalence comes from (3). Then we have

$$\begin{aligned}
\mathrm{Prob}(f_i(C_1^{i-1}, Y) \neq C_i) &= \mathrm{Prob}\Bigl(\arg\max_{c_i} \mu_{C_i|C_1^{i-1}Y}(c_i \mid C_1^{i-1}, Y) \neq C_i\Bigr) \\
&\le \mathrm{Prob}\Bigl(\arg\max_{c_i} \mu_{C_i|C_1Y}(c_i \mid C_1, Y) \neq C_i\Bigr) \\
&\le \mathrm{Prob}(f'_i(C_1, Y) \neq C_i) \\
&\le \mathrm{Prob}(\phi(AX, Y) \neq X), \end{aligned} \tag{11}$$

where the first inequality comes from Lemma 7 in the Appendix and the fact that $C_1$ is a function of $C_1^{i-1}$ when the index sets are ordered and $i \in \mathcal{I}_0$, the second inequality comes from the fact that the maximum a posteriori decision rule minimizes the decision error probability, and the last inequality comes from (10).

From Lemmas 1-3 and the fact that $|\mathcal{I}_0| \le n$, we have the following theorem, which implies that SC decoding is effective when for a given encoding function $A$ there is an effective decoding function $\phi$.

###### Theorem 2

For a source code with decoder side information, the error probability of the (deterministic) SC decoding is bounded as

$$\mathrm{Prob}(\psi(AX, Y) \neq X) \le n\,\mathrm{Prob}(\phi(AX, Y) \neq X),$$

where the right-hand side of this inequality goes to zero as $n \to \infty$ when $\mathrm{Prob}(\phi(AX, Y) \neq X) = o(1/n)$.

It should be noted again that it is assumed here that the index sets $\mathcal{I}_0$ and $\mathcal{I}_1$ are ordered, while they are not ordered in the original polar source code. In contrast, we can use an arbitrary function $B$ that satisfies the assumption (3) and rearrange the index sets $\mathcal{I}_0$ and $\mathcal{I}_1$ so that they are ordered, while they are fixed in the original polar source code.

## V Stochastic Successive-Cancellation Decoding

This section introduces stochastic successive-cancellation (SSC) decoding, which is known as randomized rounding in the context of polar codes.

When $i \in \mathcal{I}_0$, we replace $f_i$ defined in (7) by the stochastic decision rule generating $c_i$ randomly subject to the probability distribution $\mu_{C_i|C_1^{i-1}Y}(\cdot \mid c_1^{i-1}, y)$ for a given $(c_1^{i-1}, y)$. Let $F_i$ be the stochastic decision rule described above. Let $F$ be the stochastic decoder obtained by using $F_i$ instead of $f_i$ when $i \in \mathcal{I}_0$. We denote the stochastic decoder corresponding to (4) by $\Psi$. An analysis of the error probability will be presented in the next section.

## VI Implementation of Successive-Cancellation Decoding

In this section, we assume that $A$ is an $l \times n$ full-rank (sparse) matrix. Without loss of generality, we can assume that the right $l \times l$ part of $A$ is an invertible matrix. This condition is satisfied for an arbitrary full-rank matrix $A'$ by using a permutation matrix $P$ such that $A \equiv A'P$ satisfies the condition; the codeword is then obtained as $c_1 = A'x = A P^{-1} x$, that is, by encoding the permuted source output $P^{-1}x$ with $A$.

Let $B$ be an $(n-l) \times n$ matrix, where the left $(n-l) \times (n-l)$ part of $B$ is an invertible matrix. Then we have the fact that by concatenating the row vectors of $B$ to $A$, we obtain an invertible $n \times n$ matrix, that is, $x \mapsto (Ax, Bx)$ is bijective. By using $A$ and $B$, we can construct a successive-cancellation decoder that reproduces an extended codeword with $\mathcal{I}_1 = \{1, \ldots, l\}$ and $\mathcal{I}_0 = \{l+1, \ldots, n\}$.

Here, let us assume that the left $(n-l) \times (n-l)$ part of $B$ is the identity matrix and the right $(n-l) \times l$ part of $B$ is the zero matrix. It should be noted that a similar discussion is possible when the identity matrix is replaced by a permutation matrix.
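For intuition, a matrix playing the role of $B$ can be found greedily (our own sketch under toy GF(2) conventions; both function names are ours): add standard basis rows that increase the rank until $A$ is completed to an invertible square matrix, which yields the bijection $x \mapsto (Ax, Bx)$ of (3).

```python
import numpy as np

def gf2_rank(rows):
    """Rank over GF(2) by Gaussian elimination on copies of the rows."""
    rows = [r.copy() % 2 for r in rows]
    rank = 0
    ncols = len(rows[0]) if rows else 0
    for col in range(ncols):
        piv = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if piv is None:
            continue
        rows[rank], rows[piv] = rows[piv], rows[rank]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                rows[r] = (rows[r] + rows[rank]) % 2
        rank += 1
    return rank

def complete_to_bijection(A):
    """Greedily pick standard basis rows e_i so that stacking A on them
    gives an invertible matrix over GF(2); the result plays the role of B."""
    rows = [row for row in A % 2]
    B = []
    for i in range(A.shape[1]):
        e = np.zeros(A.shape[1], dtype=int)
        e[i] = 1
        if gf2_rank(rows + [e]) > gf2_rank(rows):  # e adds a new dimension
            rows.append(e)
            B.append(e)
    return np.array(B)
```

Because the chosen rows are standard basis vectors, the resulting $B$ has (up to a permutation) the identity-plus-zero shape assumed in the text.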

Since the left part of $B$ is the identity matrix, then, for all $j \in \{1, \ldots, n-l\}$, the $(j, j)$-element of $B$ is $1$, which is the only positive element in the $j$-th row of $B$. Then we have the fact that

$$C_{l+j} = X_j \quad \text{for all } j \in \{1, \ldots, n-l\},$$

which implies $C_0 = X_1^{n-l}$.

First, we reduce the conditional probability defined by (8). For $i = l + j$ and $j \in \{1, \ldots, n-l\}$, we have

$$\begin{aligned}
\mu_{C_i|C_1^{i-1}Y}(c_i \mid c_1^{i-1}, y) &= \frac{\mu_{C_1^iY}(c_1^i, y)}{\mu_{C_1^{i-1}Y}(c_1^{i-1}, y)} \\
&= \frac{\mu_{C_1^lC_{l+1}^{l+j}Y}(c_1^l, c_{l+1}^{l+j}, y)}{\mu_{C_1^lC_{l+1}^{l+j-1}Y}(c_1^l, c_{l+1}^{l+j-1}, y)} \\
&= \frac{\mu_{C_{l+1}^{l+j}C_1Y}(c_{l+1}^{l+j}, c_1, y)}{\mu_{C_{l+1}^{l+j-1}C_1Y}(c_{l+1}^{l+j-1}, c_1, y)} \\
&= \frac{\mu_{X_1^jC_1Y}(c_{l+1}^{l+j}, c_1, y)}{\mu_{X_1^{j-1}C_1Y}(c_{l+1}^{l+j-1}, c_1, y)} \\
&= \mu_{X_j|X_1^{j-1}C_1Y}(c_{l+j} \mid c_{l+1}^{l+j-1}, c_1, y), \end{aligned} \tag{12}$$

where the third equality comes from the fact that $c_1^l = c_1$ and the fourth equality comes from Lemma 8 in the Appendix and the fact that $C_{l+k} = X_k$ for all $k \in \{1, \ldots, j\}$. By substituting $x_k = c_{l+k}$, we have

$$\begin{aligned}
\mu_{C_i|C_1^{i-1}Y}(x_j \mid (c_1, x_1^{j-1}), y) &= \mu_{X_j|X_1^{j-1}C_1Y}(x_j \mid x_1^{j-1}, c_1, y) \\
&= \frac{\sum_{x_{j+1}^n} \mu_{X|Y}(x \mid y)\, \chi(Ax = c_1)}{\sum_{x_j^n} \mu_{X|Y}(x \mid y)\, \chi(Ax = c_1)} \end{aligned} \tag{13}$$

for $i = l + j$ and $j \in \{1, \ldots, n-l\}$. It should be noted that the right-hand side of the second equality appears in the constrained-random-number generation algorithm [7, Eq. (41)]. This implies that the constrained-random-number generator can be considered as an SSC decoding of the extended codeword specified in the previous section, where we have assumed that this algorithm uses the full information of the codeword $c_1$ for every $i \in \mathcal{I}_0$.

Next, we assume that $\mu_{XY}$ is memoryless and reduce the condition $Ax = c_1$ to improve the algorithm. This idea has already been presented in [8]. Let $a_k$ be the $k$-th column vector of $A$. Let $A_j^n$ be the sub-matrix of $A$ obtained by using the columns $\{a_j, \ldots, a_n\}$ and $A_1^{j-1}$ be that obtained by using $\{a_1, \ldots, a_{j-1}\}$. At the computation of (13) for $j$, we can assume that $x_1^{j-1}$ has already been determined. Furthermore, we have the fact that the condition $Ax = c_1$ is equivalent to $A_j^n x_j^n = c_1 - A_1^{j-1} x_1^{j-1}$. Then, by letting $c'_1(j) \equiv c_1 - A_1^{j-1} x_1^{j-1}$, we can reduce (13) as follows:

$$\begin{aligned}
\mu_{C_i|C_1^{i-1}Y}(x_j \mid (c_1, x_1^{j-1}), y) &= \frac{\sum_{x_{j+1}^n} \mu_{X|Y}(x \mid y)\, \chi(Ax = c_1)}{\sum_{x_j^n} \mu_{X|Y}(x \mid y)\, \chi(Ax = c_1)} \\
&= \frac{\sum_{x_{j+1}^n} \bigl[\prod_{k=j}^n \mu_{X_k|Y_k}(x_k \mid y_k)\bigr] \chi(A_j^n x_j^n = c'_1(j))}{\sum_{x_j^n} \bigl[\prod_{k=j}^n \mu_{X_k|Y_k}(x_k \mid y_k)\bigr] \chi(A_j^n x_j^n = c'_1(j))}. \end{aligned} \tag{14}$$

It should be noted that we can obtain $A_j^n$ recursively by deleting the left-end column vector of $A_{j-1}^n$. We can obtain the vector $c'_1(j)$ recursively by using the relations

$$\begin{aligned}
c'_1(1) &\equiv c_1 \\
c'_1(j) &\equiv c'_1(j-1) - x_{j-1} a_{j-1} \quad \text{for } j \in \{2, \ldots, n-l\}. \end{aligned}$$

These operations reduce the computational complexity of the algorithm. It should also be noted that the sum-product algorithm is available for the approximate computation of (14) when $A$ is a sparse matrix.
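The recursion above can be checked in a few lines (our own sketch over GF(2); the function name is ours): after deciding $x_{j-1}$, subtracting $x_{j-1} a_{j-1}$ leaves exactly the constraint that the undecided tail must satisfy.

```python
import numpy as np

def syndrome_updates(A, x):
    """Return c'_1(1), ..., c'_1(n), where c'_1(1) = Ax and
    c'_1(j) = c'_1(j-1) - x_{j-1} a_{j-1} (mod 2)."""
    c = A @ x % 2                              # c'_1(1) = c_1 = Ax
    out = [c.copy()]
    for j in range(1, len(x)):
        c = (c - x[j - 1] * A[:, j - 1]) % 2   # strip the decided symbol's column
        out.append(c.copy())
    return out
```

The invariant is $c'_1(j) = A_j^n x_j^n$: the residual syndrome constrains only the undecided tail, which is why (14) can restrict its sums to $x_j^n$.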

Next, we convert the reproduction of an extended codeword to the reproduction of a source output. When $j = n - l$ has been processed, we have obtained the extended codeword $(c_0, c_1)$, where $c_0 = x_1^{n-l}$. We can reproduce the source output by using the relation $x = Q(c_1, c_0)$, where $Q$ is the inverse of the concatenation of $A$ and $B$. Then we have the relations

$$\begin{aligned}
c_1 &= A_1^{n-l} x_1^{n-l} + A_{n-l+1}^n x_{n-l+1}^n \\
c_0 &= x_1^{n-l} \end{aligned}$$

from the assumptions on $A$ and $B$. Since

$$c'_1(n-l+1) = c_1 - A_1^{n-l} x_1^{n-l},$$

we obtain $x_{n-l+1}^n$ as

$$x_{n-l+1}^n = [A_{n-l+1}^n]^{-1} c'_1(n-l+1),$$

where $[A_{n-l+1}^n]^{-1}$ is the inverse of $A_{n-l+1}^n$.

Finally, we summarize the decoding algorithm. We assume that $\mu_{XY}$ is memoryless, $A$ is an $l \times n$ (sparse) matrix such that $A_{n-l+1}^n$ is an invertible matrix, and $B$ is an $(n-l) \times n$ matrix whose left $(n-l) \times (n-l)$ part is the identity matrix and whose right part is zero.

SC/SSC Decoding Algorithm Using Sum-Product Algorithm:

• Let $j \equiv 1$ and $c'_1 \equiv c_1$.

• Calculate the conditional probability distribution $\mu_{C_{l+j}|C_1^{l+j-1}Y}$ as

$$\begin{aligned}
\mu_{C_{l+j}|C_1^{l+j-1}Y}(x_j \mid (c_1, x_1^{j-1}), y) &= \mu_{X_j|X_1^{j-1}C_1Y}(x_j \mid x_1^{j-1}, c_1, y) \\
&= \frac{\sum_{x_{j+1}^n} \bigl[\prod_{k=j}^n \mu_{X_k|Y_k}(x_k \mid y_k)\bigr] \chi(A_j^n x_j^n = c'_1)}{\sum_{x_j^n} \bigl[\prod_{k=j}^n \mu_{X_k|Y_k}(x_k \mid y_k)\bigr] \chi(A_j^n x_j^n = c'_1)} \end{aligned} \tag{15}$$

by using $y$, $A_j^n$, $c'_1$, and the previously decided $x_1^{j-1}$, where $c'_1 = c'_1(j)$ is the current residual syndrome. It should be noted that the sum-product algorithm can be employed to obtain an approximation of (15).

• For the deterministic SC decoding, let $x_j$ be defined as

$$x_j \equiv \arg\max_{x'_j} \mu_{C_{l+j}|C_1^{l+j-1}Y}(x'_j \mid (c_1, x_1^{j-1}), y).$$

For the SSC decoding, generate and record a random number $x_j$ subject to the distribution (15).

• Let $c'_1 \equiv c'_1 - x_j a_j$.

• If $j = n - l$, then compute $x_{n-l+1}^n = [A_{n-l+1}^n]^{-1} c'_1$, output $x_1^n$, and terminate.

• Let $j \equiv j + 1$ and go to Step 2.
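The steps above can be assembled into a toy end-to-end decoder (our own sketch: brute-force marginalisation in place of the sum-product algorithm, binary alphabet, memoryless BSC-style side information with flip probability `p`; all names are ours).

```python
import itertools

import numpy as np

def gf2_inv(M):
    """Invert a square 0/1 matrix over GF(2) by Gauss-Jordan elimination."""
    n = M.shape[0]
    aug = np.concatenate([M % 2, np.eye(n, dtype=int)], axis=1)
    for col in range(n):
        piv = next(r for r in range(col, n) if aug[r, col])
        aug[[col, piv]] = aug[[piv, col]]
        for r in range(n):
            if r != col and aug[r, col]:
                aug[r] = (aug[r] + aug[col]) % 2
    return aug[:, n:]

def sc_ssc_decode(A, l, c1, y, p=0.1, stochastic=False, rng=None):
    """Toy version of the SC/SSC algorithm of Sec. VI: decide x_1..x_{n-l}
    from the conditional (15) (brute force, not sum-product), update the
    residual syndrome c', then solve the invertible right part for the tail."""
    n = A.shape[1]
    c = c1.copy() % 2                       # c'_1(1) = c_1
    x = np.empty(n, dtype=int)
    for j in range(n - l):
        s = np.zeros(2)                     # unnormalised conditional (15)
        for b in (0, 1):
            for tail in itertools.product([0, 1], repeat=n - j - 1):
                xt = np.array((b,) + tail)
                if np.any(A[:, j:] @ xt % 2 != c):
                    continue                # violates A_j^n x_j^n = c'_1(j)
                s[b] += np.prod(np.where(xt == y[j:], 1 - p, p))
        if stochastic:                      # SSC: randomized rounding
            x[j] = rng.choice(2, p=s / s.sum())
        else:                               # deterministic SC: MAP decision
            x[j] = int(s[1] > s[0])
        c = (c - x[j] * A[:, j]) % 2        # c'_1(j+1)
    x[n - l:] = gf2_inv(A[:, n - l:]) @ c % 2   # final step: invertible tail
    return x
```

By construction every output satisfies $Ax = c_1$, whichever decisions the inner loop makes, because the tail is solved against the residual syndrome.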

Since the SSC decoder is equivalent to a constrained-random-number generator generating a random sequence subject to the a posteriori probability distribution [7, Theorem 5], we have the following theorem from the fact that the error probability of a stochastic decision with an a posteriori probability distribution is at most twice that of any decision rule [9, Lemma 3].

###### Theorem 3

For a linear source code with decoder side information, the decoding error probability of the SSC decoding algorithm is bounded as

$$\mathrm{Prob}(\Psi(AX, Y) \neq X) \le 2\,\mathrm{Prob}(\phi(AX, Y) \neq X),$$

where the right-hand side of this inequality goes to zero as $n \to \infty$ when $\mathrm{Prob}(\phi(AX, Y) \neq X)$ goes to zero.
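The factor of two traces back to [9, Lemma 3]; it can be sanity-checked numerically (our own illustration): for a posterior $p$, the stochastic decision errs with probability $1 - \sum_i p_i^2$, the MAP decision with $1 - \max_i p_i$, and the former never exceeds twice the latter.

```python
import numpy as np

rng = np.random.default_rng(1)

# For any posterior p: 1 - sum(p^2) (stochastic-decision error) is at most
# 2 * (1 - max(p)) (twice the MAP error), since
# 1 - sum(p^2) <= 1 - max(p)^2 = (1 - max(p)) * (1 + max(p)) <= 2 * (1 - max(p)).
for _ in range(1000):
    p = rng.dirichlet(np.ones(4))
    assert 1 - np.sum(p ** 2) <= 2 * (1 - np.max(p)) + 1e-12
```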

## VII Analysis When Index Sets Are Not Ordered

In the previous sections, it was assumed that the index sets $\mathcal{I}_0$ and $\mathcal{I}_1$ corresponding to $C_0$ and $C_1$ are ordered, that is, $\mathcal{I}_1 = \{1, \ldots, l\}$ and $\mathcal{I}_0 = \{l+1, \ldots, n\}$. This section investigates the case when they are not ordered. The following lemma asserts that the effectiveness of the decoder is reduced to the condition that the sum of the conditional entropies corresponding to the complement of the codeword goes to zero as $n \to \infty$.

###### Lemma 4

Let $\psi$ and $\Psi$ be the SC and SSC decoding functions, respectively. Then

$$\begin{aligned}
\mathrm{Prob}(\psi(AX, Y) \neq X) &\le \frac{1}{2\log 2} \sum_{i \in \mathcal{I}_0} H(C_i \mid C_1^{i-1}, Y) \\
\mathrm{Prob}(\Psi(AX, Y) \neq X) &\le \frac{1}{\log 2} \sum_{i \in \mathcal{I}_0} H(C_i \mid C_1^{i-1}, Y). \end{aligned}$$
###### Proof:

The first inequality is shown from Lemmas 1 and 2 as

 Prob(ψ(AX,Y)≠X) =Prob(f(C1,Y)≠(C