# Decoding Error Probability of the Random Matrix Ensemble over the Erasure Channel

Using tools developed in a recent work by Shen and the second author, in this paper we carry out an in-depth study of the average decoding error probability of the random matrix ensemble over the erasure channel under three decoding principles, namely unambiguous decoding, maximum likelihood decoding and list decoding. We obtain explicit formulas for the average decoding error probabilities of the random matrix ensemble under these three decoding principles and compute the error exponents. Moreover, for unambiguous decoding, we compute the variance of the decoding error probability of the random matrix ensemble and the error exponent of the variance, which imply a strong concentration result: roughly speaking, the ratio of the decoding error probability of a random code in the ensemble to the average decoding error probability of the ensemble converges to 1 with high probability as the code length goes to infinity.


## I Introduction

### I-A Background

In digital communication, it is common that messages transmitted through a public channel may be distorted by the channel noise. The theory of error-correcting codes is the study of mechanisms to cope with this problem. This is an important research area with many applications in modern life. For example, error-correcting codes are widely employed in cell phones to correct errors arising from fading noise during high frequency radio transmission. One of the major challenges in coding theory remains to construct new error-correcting codes with good properties and to study their decoding and encoding algorithms.

In a binary erasure channel (BEC), a binary symbol is either received correctly or totally erased, the latter occurring with probability $\varepsilon$. The concept of the BEC was first introduced by Elias in 1955 [2]. Together with the binary symmetric channel (BSC), these are frequently used in coding theory and information theory because they are among the simplest channel models, and many problems in communication theory can be reduced to problems in a BEC. Here we consider, more generally, a $q$-ary erasure channel in which a $q$-ary symbol is either received correctly or totally erased with probability $\varepsilon$.

The problem of decoding linear codes over the erasure channel has received renewed attention in recent years due to its wide applications in the Internet and in distributed storage systems for analyzing random packet losses [1, 8, 9]. Three important decoding principles, namely unambiguous decoding, maximum likelihood decoding and list decoding, were studied in recent years for linear codes over the erasure channel, and the corresponding decoding error probabilities under these principles were also investigated (see [3, 6, 11, 13] and the references therein).

In particular, in [11], improving upon previous results, the authors provided a detailed study of the decoding error probabilities of a general $q$-ary linear code over the erasure channel under the three decoding principles. Via the notion of $\ell$-incorrigible sets for linear codes, they showed that all these decoding error probabilities can be expressed explicitly in terms of the support weight distributions of the linear code. As applications they obtained explicit formulas for the decoding error probabilities of some of the most interesting linear codes, such as MDS codes, the binary Golay code, the simplex codes and the first-order Reed–Muller codes, for which the support weight distributions are known. They also computed the average decoding error probabilities of a random code over the erasure channel and obtained the corresponding error exponent under one of the decoding principles.

### I-B Statement of the main results

In this paper we consider a new code ensemble, namely the random matrix ensemble $\mathcal{R}_{m,n}$, that is, the set of all $m\times n$ matrices over $\mathbb{F}_q$ endowed with the uniform probability, each of which is associated with a parity-check code as follows: for each $H\in\mathcal{R}_{m,n}$, the corresponding parity-check code $C_H$ is given by

$$C_H=\left\{\mathbf{x}\in\mathbb{F}_q^n:\ H\mathbf{x}^t=\mathbf{0}\right\}. \tag{1}$$

Here boldface letters such as $\mathbf{x}$ denote row vectors.

As for previous results on the ensemble $\mathcal{R}_{m,n}$, the undetected error probability was studied for the binary symmetric channel by Wadayama [12] (i.e. $q=2$), and some bounds on the error probability under the maximum likelihood decoding principle were obtained for the $q$-ary erasure channel [4, 7]; other than these results, not much is known. It is easy to see that $\mathcal{R}_{m,n}$ contains all linear codes in the random code ensemble considered in [11], but the two ensembles are quite different for two reasons: first, in the random code ensemble considered in [11], each code is counted exactly once, while in $\mathcal{R}_{m,n}$ each code is counted with some multiplicity, as different choices of the matrix $H$ may give rise to the same code; second, some codes in $\mathcal{R}_{m,n}$ may have rate strictly larger than $1-m/n$, as the rows of $H$ may not be linearly independent.

It is conceivable that most of the codes in $\mathcal{R}_{m,n}$ have rate $1-m/n$, and that the average behavior of codes in $\mathcal{R}_{m,n}$ should be similar to that of the random code ensemble considered in [11]. The advantage of studying the ensemble $\mathcal{R}_{m,n}$ is that it is mathematically much easier to deal with than the random code ensemble (an advantage already exploited in [12]), hence we may be able to obtain much stronger results than what was obtained in [11]. We will show that this is indeed the case.

We first obtain explicit formulas for the average decoding error probability of the ensemble $\mathcal{R}_{m,n}$ over the erasure channel under the three different decoding principles. This is comparable to [11, Theorem 2] for the random code ensemble. Such formulas are useful as they allow explicit evaluation of the average decoding error probabilities for any given $m$, $n$ and $\varepsilon$, hence giving us a meaningful guidance as to what to expect from a good code over the erasure channel.

###### Theorem 1.

Let $\mathcal{R}_{m,n}$ be the random matrix ensemble described above. Denote by $\begin{bmatrix}n\\ k\end{bmatrix}_q$ the Gaussian $q$-binomial coefficient and define

$$\psi_m(i):=\prod_{k=0}^{i-1}\left(1-q^{k-m}\right),\qquad 1\le i\le m. \tag{2}$$
1. The average unsuccessful decoding probability of $\mathcal{R}_{m,n}$ under list decoding with list size $q^{\ell}$, where $\ell$ is a non-negative integer, is given by

$$P_{ld}(\mathcal{R}_{m,n},\ell,\varepsilon)=\sum_{i=1}^{n}\sum_{j=\ell+1}^{i}q^{-mj}\,\psi_m(i-j)\begin{bmatrix}i\\ j\end{bmatrix}_q\binom{n}{i}\varepsilon^{i}(1-\varepsilon)^{n-i}; \tag{3}$$
2. The average unsuccessful decoding probability of $\mathcal{R}_{m,n}$ under unambiguous decoding is given by

$$P_{ud}(\mathcal{R}_{m,n},\varepsilon)=\sum_{i=1}^{n}\left(1-\psi_m(i)\right)\binom{n}{i}\varepsilon^{i}(1-\varepsilon)^{n-i}; \tag{4}$$
3. The average decoding error probability of $\mathcal{R}_{m,n}$ under maximum likelihood decoding is given by

$$P_{mld}(\mathcal{R}_{m,n},\varepsilon)=\sum_{i=1}^{n}\sum_{\ell=1}^{i}\left(1-q^{-\ell}\right)q^{-m\ell}\,\psi_m(i-\ell)\begin{bmatrix}i\\ i-\ell\end{bmatrix}_q\binom{n}{i}\varepsilon^{i}(1-\varepsilon)^{n-i}. \tag{5}$$
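
The formulas in Theorem 1 are straightforward to evaluate numerically. The following sketch (an illustrative script of ours, not part of the paper; all function names are assumptions) evaluates formula (4) over $\mathbb{F}_2$ and checks it against a Monte Carlo average of the exact unambiguous-decoding error probability, using the fact that an erasure set $E$ is incorrigible for $C_H$ precisely when the columns of $H_E$ are linearly dependent.

```python
import itertools
import random
from math import comb

def psi(m, i, q=2):
    """psi_m(i) = prod_{k=0}^{i-1} (1 - q^(k-m)); empty product for i = 0."""
    p = 1.0
    for k in range(i):
        p *= 1.0 - float(q) ** (k - m)
    return p

def P_ud_formula(m, n, eps, q=2):
    """Average unambiguous-decoding error probability, formula (4)."""
    return sum((1.0 - psi(m, i, q)) * comb(n, i) * eps**i * (1 - eps)**(n - i)
               for i in range(1, n + 1))

def rank_gf2(vecs):
    """Rank over F_2 of integer-encoded vectors, via xor-basis reduction."""
    basis = []
    for v in vecs:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)

def P_ud_code(cols, n, eps):
    """Exact P_ud(C_H, eps): sum over incorrigible sets E, i.e. rk(H_E) < #E."""
    total = 0.0
    for i in range(1, n + 1):
        for E in itertools.combinations(range(n), i):
            if rank_gf2([cols[j] for j in E]) < i:
                total += eps**i * (1 - eps)**(n - i)
    return total

def P_ud_monte_carlo(m, n, eps, samples=2000, seed=1):
    """Average P_ud(C_H, eps) over uniformly random m x n binary matrices H."""
    rng = random.Random(seed)
    # a matrix is stored as its list of n columns, each an m-bit integer
    return sum(P_ud_code([rng.randrange(2**m) for _ in range(n)], n, eps)
               for _ in range(samples)) / samples
```

For instance, with $q=2$, $m=3$, $n=6$ and $\varepsilon=0.3$, the Monte Carlo average agrees with formula (4) to within sampling error.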

Next, letting $m=(1-R)n$ for fixed $R\in(0,1)$, we compute the error exponents of the average decoding error probabilities of the ensemble sequence $\mathcal{R}_{(1-R)n,n}$ as $n\to\infty$ under these decoding principles.

###### Theorem 2.

Let the rate $R\in(0,1)$ be fixed and let $m=(1-R)n$.

1. For any fixed non-negative integer $\ell$, the error exponent of the average unsuccessful decoding probability of $\mathcal{R}_{(1-R)n,n}$ under list decoding with list size $q^{\ell}$ is given by

$$T_{ld}(\ell,\varepsilon)=\begin{cases}
0 & \left(1-\varepsilon\le R<1\right)\\[4pt]
(1-R)\log_q\dfrac{1-R}{\varepsilon}+R\log_q\dfrac{R}{1-\varepsilon} & \left(\dfrac{1-\varepsilon}{1-\varepsilon+q^{\ell+1}\varepsilon}\le R<1-\varepsilon\right)\\[4pt]
(\ell+1)(1-R)-\log_q\left(1-\varepsilon+q^{\ell+1}\varepsilon\right) & \left(0<R<\dfrac{1-\varepsilon}{1-\varepsilon+q^{\ell+1}\varepsilon}\right)
\end{cases} \tag{6}$$
2. The error exponents of the average unsuccessful decoding probability of $\mathcal{R}_{(1-R)n,n}$ under unambiguous decoding and under maximum likelihood decoding (respectively) are both given by

$$T_{ud}(\varepsilon)=T_{mld}(\varepsilon)=\begin{cases}
0 & \left(1-\varepsilon\le R<1\right)\\[4pt]
(1-R)\log_q\dfrac{1-R}{\varepsilon}+R\log_q\dfrac{R}{1-\varepsilon} & \left(\dfrac{1-\varepsilon}{1-\varepsilon+q\varepsilon}\le R<1-\varepsilon\right)\\[4pt]
(1-R)-\log_q\left(1-\varepsilon+q\varepsilon\right) & \left(0<R<\dfrac{1-\varepsilon}{1-\varepsilon+q\varepsilon}\right)
\end{cases} \tag{7}$$

A plot of the function $T_{ud}(\varepsilon)$ is given in Fig. 1.

It can be checked that the error exponent here under the unambiguous decoding principle coincides with that obtained for the random code ensemble in [11, Theorem 3].
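
For intuition, the exponent $T_{ud}(\varepsilon)$ is easy to tabulate. The sketch below is our own illustration: the branch conditions follow the piecewise form stated in Theorem 2 (the low-rate branch is taken from the standard Gallager-type optimization, so treat its exact form as an assumption). A useful sanity check is that the function is continuous at both breakpoints.

```python
from math import log

def T_ud(R, eps, q=2):
    """Error exponent of P_ud for the ensemble R_{(1-R)n,n} at rate R.

    Piecewise form as in Theorem 2 (low-rate branch reconstructed)."""
    lg = lambda x: log(x) / log(q)          # logarithm base q
    A = 1 - eps + q * eps
    if R >= 1 - eps:                        # rates above 1 - eps: exponent 0
        return 0.0
    if R >= (1 - eps) / A:                  # intermediate rate region
        return (1 - R) * lg((1 - R) / eps) + R * lg(R / (1 - eps))
    return (1 - R) - lg(A)                  # low rate region
```

For example, with $q=2$ and $\varepsilon=0.3$ the breakpoint between the last two branches is at $R=0.7/1.3$, and the two branch formulas agree there.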

Next, we establish a strong concentration result for the unsuccessful decoding probability of a random code in the ensemble $\mathcal{R}_{(1-R)n,n}$ around the mean under unambiguous decoding.

###### Theorem 3.

Let the rate $R\in(0,1)$ be fixed and let $m=(1-R)n$. Then as $H_n$ runs over the ensemble $\mathcal{R}_{(1-R)n,n}$, we have

$$\frac{P_{ud}(H_n,\varepsilon)}{P_{ud}\left(\mathcal{R}_{(1-R)n,n},\varepsilon\right)}\to 1\quad \textit{WHP}, \tag{8}$$

under either of the following conditions:

• if for any , or

• if for .

Here the notation WHP in (8) refers to "with high probability", that is, for any $\delta>0$, there exist $c>0$ and $n_0$ such that

$$P\left(\left|\frac{P_{ud}(H_n,\varepsilon)}{P_{ud}\left(\mathcal{R}_{(1-R)n,n},\varepsilon\right)}-1\right|<\delta\right)>1-q^{-nc}\qquad \forall\, n>n_0.$$

Note that in the range $0<\varepsilon<1-R$ we have $T_{ud}(\varepsilon)>0$ (see Theorem 2), hence

$$P_{ud}\left(\mathcal{R}_{(1-R)n,n},\varepsilon\right)=q^{-n\left(T_{ud}(\varepsilon)+o(1)\right)}\to 0,\quad \text{as } n\to\infty,$$

so (8) shows that $P_{ud}(H_n,\varepsilon)$ also tends to zero exponentially fast with high probability in the ensemble $\mathcal{R}_{(1-R)n,n}$ under either Condition (1) or (2) of Theorem 3.

Finally, we point out a weaker but more general concentration result:

###### Theorem 4.

Let the rate $R\in(0,1)$ be fixed and let $m=(1-R)n$. Then as $H_n$ runs over the ensemble $\mathcal{R}_{(1-R)n,n}$, we have

$$P_{ud}(H_n,\varepsilon)-P_{ud}\left(\mathcal{R}_{(1-R)n,n},\varepsilon\right)\to 0\quad \textit{WHP}. \tag{9}$$

It is known that $1-\varepsilon$ is the capacity of the erasure channel, so if $R>1-\varepsilon$, then by the converse to the coding theorem (see [5, Theorem 4.3.4] or [14, Theorem 1.3.1]) the decoding error probability is bounded away from zero by a positive constant; if $R<1-\varepsilon$, then $T_{ud}(\varepsilon)>0$ and

$$P_{ud}\left(\mathcal{R}_{(1-R)n,n},\varepsilon\right)=q^{-n\left(T_{ud}(\varepsilon)+o(1)\right)}\to 0,\quad \text{as } n\to\infty,$$

hence from (9) we also conclude that $P_{ud}(H_n,\varepsilon)\to 0$ WHP.

### I-C Discussion of Theorem 2

It is interesting to compare Theorem 2 with what can be obtained by Gallager's method for nonlinear code ensembles over the erasure channel (see [5, Exercise 5.20, page 538]): consider the ensemble of all block codes of length $n$ and rate $R$ ($0<R<1$) over the erasure channel in which each letter of each codeword is selected independently as an element of $\mathbb{F}_q$ with equal probability $1/q$; then the average decoding error probability under list decoding with list size $L$ is upper bounded by

$$\overline{P}_{L,e}\le q^{-nT^{(*)}_{ld}(L,\varepsilon)},$$

where the function $T^{(*)}_{ld}(L,\varepsilon)$ is given by

$$T^{(*)}_{ld}(L,\varepsilon)=\begin{cases}
0 & \left(1-\varepsilon\le R<1\right)\\[4pt]
(1-R)\log_q\dfrac{1-R}{\varepsilon}+R\log_q\dfrac{R}{1-\varepsilon} & \left(\dfrac{1-\varepsilon}{1-\varepsilon+q^{L}\varepsilon}\le R<1-\varepsilon\right)\\[4pt]
L(1-R)-\log_q\left(1-\varepsilon+q^{L}\varepsilon\right) & \left(0<R<\dfrac{1-\varepsilon}{1-\varepsilon+q^{L}\varepsilon}\right)
\end{cases}$$

We compare the error exponent $T_{ld}(\ell,\varepsilon)$ given in Theorem 2, which corresponds to list decoding for the random matrix ensemble, with the exponent $T^{(*)}_{ld}(L,\varepsilon)$ above, which corresponds to list decoding with the same list size for the random code ensemble of the same rate described by Gallager. We can observe that in the high rate region the two exponents coincide, but in the low rate region we have $T_{ld}(\ell,\varepsilon)<T^{(*)}_{ld}(L,\varepsilon)$ whenever the list size is at least 3. As illustrations we plot the two exponents as functions of $R$ in Figs. 2 and 3. This shows that under the list decoding principle over the erasure channel, the performance of linear codes on average is as good as that of nonlinear codes in the high rate range, but is inferior in the low rate range if the list size is at least 3. It is well known that they have the same performance when the list size is 1, that is, under the unambiguous decoding and the maximum likelihood decoding principles [5].

The paper is organized as follows. In Section II, we introduce the Gaussian $q$-binomial coefficient in more detail. Then in Section III, we provide three counting results regarding matrices of certain ranks over $\mathbb{F}_q$. Afterwards, in Sections IV, V and VI, we give the proofs of Theorems 1, 2 and 3-4 respectively. The proofs of Theorems 3 and 4 involve some technical calculus computations on the error exponent of the variance. In order to streamline these proofs, we defer some of the arguments to the Appendix (Section VII). Finally we conclude the paper in Section VIII.

## II Preliminaries

For integers $n\ge k\ge 0$, the Gaussian binomial coefficient $\begin{bmatrix}n\\ k\end{bmatrix}_q$ is defined as

$$\begin{bmatrix}n\\ k\end{bmatrix}_q:=\frac{(q)_n}{(q)_k\,(q)_{n-k}},$$

where $(q)_0:=1$ and $(q)_t:=\prod_{j=1}^{t}\left(q^j-1\right)$ for $t\ge 1$. By convention $\begin{bmatrix}n\\ 0\end{bmatrix}_q=\begin{bmatrix}n\\ n\end{bmatrix}_q=1$ for any $n\ge 0$, and $\begin{bmatrix}n\\ k\end{bmatrix}_q=0$ if $k<0$ or $k>n$. The function $\psi_m$ defined in (2) can be written as

$$\psi_m(i)=q^{-mi+i(i-1)/2}\,(q)_i\begin{bmatrix}m\\ i\end{bmatrix}_q.$$

We may further define $\psi_m(0):=1$ for any $m$, and $\psi_m(i):=0$ if $i<0$ or $i>m$. Next, recall the well-known combinatorial interpretation of $\begin{bmatrix}n\\ k\end{bmatrix}_q$:

###### Lemma 1 ([10]).

The number of $k$-dimensional subspaces of an $n$-dimensional vector space over $\mathbb{F}_q$ is $\begin{bmatrix}n\\ k\end{bmatrix}_q$.

The Gaussian binomial coefficient satisfies the property

$$\begin{bmatrix}n\\ k\end{bmatrix}_q=\begin{bmatrix}n\\ n-k\end{bmatrix}_q,\qquad \forall\, n\ge k\ge 0,$$

and the identity

 (11)
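
The definitions above are easy to verify with exact arithmetic. The following sketch (our own illustrative check, using the convention $(q)_t=\prod_{j=1}^{t}(q^j-1)$; function names are assumptions) confirms the symmetry property and the closed form of $\psi_m(i)$ on small examples.

```python
from fractions import Fraction

def q_poch(q, t):
    """(q)_t = prod_{j=1}^{t} (q^j - 1), with (q)_0 = 1."""
    out = 1
    for j in range(1, t + 1):
        out *= q**j - 1
    return out

def gauss_binom(n, k, q):
    """Gaussian binomial [n k]_q = (q)_n / ((q)_k (q)_{n-k}); always an integer."""
    if k < 0 or k > n:
        return 0
    return q_poch(q, n) // (q_poch(q, k) * q_poch(q, n - k))

def psi(m, i, q):
    """psi_m(i) = prod_{k=0}^{i-1} (1 - q^(k-m)), as an exact Fraction."""
    p = Fraction(1)
    for k in range(i):
        p *= 1 - Fraction(q) ** (k - m)
    return p
```

For example, $\begin{bmatrix}2\\ 1\end{bmatrix}_2=3$, matching the three lines through the origin in $\mathbb{F}_2^2$ (Lemma 1), and the closed form $\psi_m(i)=q^{-mi+i(i-1)/2}(q)_i\begin{bmatrix}m\\ i\end{bmatrix}_q$ can be checked exactly for small $m$, $i$.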

## III Three counting results for the ensemble $\mathcal{R}_{m,n}$

In this section we provide three counting results about matrices of certain ranks in the ensemble $\mathcal{R}_{m,n}$. Such results may not be new, but since we cannot locate them in the literature, we prove them here. These results will be used repeatedly in the proofs later on.

For $H\in\mathcal{R}_{m,n}$, denote by $\mathrm{rk}(H)$ the rank of the matrix $H$ over $\mathbb{F}_q$.

###### Lemma 2.

Let $H$ be a random matrix in the ensemble $\mathcal{R}_{m,n}$. Then for any integer $j$, we have

$$P\left(\mathrm{rk}(H)=j\right)=q^{-m(n-j)}\,\psi_m(j)\begin{bmatrix}n\\ j\end{bmatrix}_q. \tag{12}$$
###### Proof.

We may assume that $j$ satisfies $0\le j\le\min\{m,n\}$, because if $j$ is not in this range, then both sides of Equation (12) are obviously zero.

Denote by $\mathrm{Hom}(m,n)$ the set of linear transformations from $\mathbb{F}_q^m$ to $\mathbb{F}_q^n$. Writing vectors in $\mathbb{F}_q^m$ and $\mathbb{F}_q^n$ as row vectors, we see that the random matrix ensemble $\mathcal{R}_{m,n}$ can be identified with the set $\mathrm{Hom}(m,n)$ via the relation

$$H\ \leftrightarrow\ G:\ \mathbb{F}_q^m\to\mathbb{F}_q^n,\qquad \mathbf{x}\mapsto\mathbf{x}H. \tag{13}$$

Since $\mathrm{rk}(H)=j$ if and only if $\dim(\mathrm{Im}\,G)=j$, and $\#\mathrm{Hom}(m,n)=q^{mn}$, we have

$$P\left(\mathrm{rk}(H)=j\right)=q^{-mn}\sum_{\substack{G\in\mathrm{Hom}(m,n)\\ \dim(\mathrm{Im}\,G)=j}}1=q^{-mn}\sum_{\substack{V\le\mathbb{F}_q^n\\ \dim V=j}}\ \sum_{\substack{G\in\mathrm{Hom}(m,n)\\ \mathrm{Im}\,G=V}}1.$$

The inner sum counts the number of surjective linear transformations from $\mathbb{F}_q^m$ onto $V$, a $j$-dimensional subspace of $\mathbb{F}_q^n$. Since $V\cong\mathbb{F}_q^j$, this is also the number of surjective linear transformations from $\mathbb{F}_q^m$ onto $\mathbb{F}_q^j$, or, equivalently, the number of $m\times j$ matrices over $\mathbb{F}_q$ whose $j$ columns are linearly independent. The number of such matrices can be counted as follows: the first column can be any nonzero vector in $\mathbb{F}_q^m$, giving $q^m-1$ choices; given the first column, the second column can be any vector lying outside the space of scalar multiples of the first column, so there are $q^m-q$ choices; inductively, given the first $k$ columns, the $(k+1)$-th column lies outside a $k$-dimensional subspace, so the number of choices for the $(k+1)$-th column is $q^m-q^k$. Thus we have

$$\sum_{\substack{G\in\mathrm{Hom}(m,n)\\ \mathrm{Im}\,G=V}}1=\prod_{k=0}^{j-1}\left(q^m-q^k\right)=q^{mj}\,\psi_m(j),\qquad \dim V=j. \tag{14}$$

Together with Lemma 1, we obtain

$$P\left(\mathrm{rk}(H)=j\right)=q^{-mn}\,q^{mj}\,\psi_m(j)\sum_{\substack{V\le\mathbb{F}_q^n\\ \dim V=j}}1=q^{-m(n-j)}\,\psi_m(j)\begin{bmatrix}n\\ j\end{bmatrix}_q,$$

which is the desired result. ∎
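
Lemma 2 is easy to test empirically. The sketch below (our own illustration for $q=2$; function names are assumptions) compares the rank distribution formula (12) with the empirical rank distribution of uniformly random binary matrices.

```python
import random

def psi(m, i, q=2):
    """psi_m(i) = prod_{k=0}^{i-1} (1 - q^(k-m))."""
    p = 1.0
    for k in range(i):
        p *= 1.0 - float(q) ** (k - m)
    return p

def gauss_binom(n, k, q=2):
    """Gaussian binomial coefficient [n k]_q."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for t in range(k):
        num *= q**(n - t) - 1
        den *= q**(t + 1) - 1
    return num // den

def rank_prob_formula(m, n, j, q=2):
    """P(rk(H) = j) for uniform H in R_{m,n}, as given by Lemma 2."""
    return q ** (-m * (n - j)) * psi(m, j, q) * gauss_binom(n, j, q)

def rank_gf2(rows):
    """Rank over F_2 of integer-encoded row vectors (xor-basis reduction)."""
    basis = []
    for v in rows:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)

def rank_prob_empirical(m, n, j, samples=20000, seed=7):
    """Empirical estimate of P(rk(H) = j) over uniform m x n binary matrices."""
    rng = random.Random(seed)
    hits = sum(rank_gf2([rng.randrange(2**n) for _ in range(m)]) == j
               for _ in range(samples))
    return hits / samples
```

As a consistency check, the formula probabilities over $0\le j\le\min\{m,n\}$ sum to 1, and the empirical frequency of full rank matches the formula to within sampling error.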

###### Lemma 3.

Let $H$ be a random matrix in the ensemble $\mathcal{R}_{m,n}$. Let $A\subset[n]$ be a subset with cardinality $\#A=s$. Denote by $H_A$ the submatrix of $H$ formed by the columns of $H$ with indices from $A$. Then for any integers $j$ and $r$, we have

$$P\left(\mathrm{rk}(H)=j\ \cap\ \mathrm{rk}(H_A)=r\right)=q^{-m(n-j)+r(n-j-s+r)}\,\psi_m(j)\begin{bmatrix}s\\ r\end{bmatrix}_q\begin{bmatrix}n-s\\ j-r\end{bmatrix}_q. \tag{15}$$
###### Proof.

We may assume that $0\le r\le\min\{j,s\}$ and $j-r\le n-s$, because if $j$ or $r$ does not satisfy these conditions, then both sides of Equation (15) are zero.

Using the relations (13) and (14), we can expand the term $P\left(\mathrm{rk}(H)=j\cap\mathrm{rk}(H_A)=r\right)$ as

$$P\left(\mathrm{rk}(H)=j\ \cap\ \mathrm{rk}(H_A)=r\right)=q^{-mn}\sum_{\substack{V\le\mathbb{F}_q^n\\ \dim V=j\\ \dim V_A=r}}\ \sum_{\substack{G\in\mathrm{Hom}(m,n)\\ \mathrm{Im}\,G=V}}1=q^{-m(n-j)}\,\psi_m(j)\sum_{\substack{V\le\mathbb{F}_q^n\\ \dim V=j\\ \dim V_A=r}}1. \tag{16}$$

Here $V_A$ is the subspace of $\mathbb{F}_q^A$ formed by restricting the vectors of $V$ to the coordinates with indices from $A$. We may consider the projection $\pi_A$ given by

$$\pi_A:\ V\to V_A,\qquad (v_k)_{k=1}^{n}\mapsto (v_k)_{k\in A}.$$

The kernel of $\pi_A$ has dimension $j-r$ and is of the form $W\times\{(0)_A\}$ for some subspace $W\le\mathbb{F}_q^{[n]-A}$. So we can further decompose the sum on the right hand side of (16) as

$$\sum_{\substack{V\le\mathbb{F}_q^n\\ \dim V=j\\ \dim V_A=r}}1=\sum_{\substack{W\le\mathbb{F}_q^{[n]-A}\\ \dim W=j-r}}\ \sum_{\substack{V\le\mathbb{F}_q^n\\ \dim V=j\\ \ker(\pi_A)=W\times\{(0)_A\}}}1. \tag{17}$$

Now we compute the inner sum on the right hand side of (17). Suppose we are given an ordered basis $\mathbf{w}_1,\ldots,\mathbf{w}_{j-r}$ of the $(j-r)$-dimensional subspace $W\times\{(0)_A\}$ of $\mathbb{F}_q^n$. We extend it to an ordered basis $\mathbf{w}_1,\ldots,\mathbf{w}_{j-r},\mathbf{v}_1,\ldots,\mathbf{v}_r$ of some $j$-dimensional subspace $V$ as follows: first we need the other $r$ basis vectors $\mathbf{v}_1,\ldots,\mathbf{v}_r$ to be linearly independent. At the same time, they have to be linearly independent from any nonzero vector in $W\times\{(0)_A\}$ due to the kernel condition. This requires the restrictions $(\mathbf{v}_1)_A,\ldots,(\mathbf{v}_r)_A$ to be linearly independent in $\mathbb{F}_q^A$. On the other hand, if this condition is satisfied, then the vectors $\mathbf{v}_1,\ldots,\mathbf{v}_r$ are linearly independent from one another as well as from any nonzero vector in $W\times\{(0)_A\}$. Therefore it reduces to counting the number of ordered linearly independent sets of $r$ vectors in $\mathbb{F}_q^A$. This number is clearly given by $\prod_{k=0}^{r-1}\left(q^s-q^k\right)$, and since the remaining $n-s$ coordinates of each $\mathbf{v}_k$ can be chosen freely, the total number of different ordered bases is given by $q^{r(n-s)}\prod_{k=0}^{r-1}\left(q^s-q^k\right)$.

On the other hand, given a fixed $j$-dimensional subspace $V$ with $\ker(\pi_A)=W\times\{(0)_A\}$, we count the number of ordered bases of $V$ of the form stated in the previous paragraph as follows: we choose $\mathbf{v}_1$ to be any vector in $V$ but not in $W\times\{(0)_A\}$, which gives $q^j-q^{j-r}$ choices for $\mathbf{v}_1$; similarly $\mathbf{v}_2$ is any vector in $V$ but not in the span of $W\times\{(0)_A\}$ and $\mathbf{v}_1$, which gives $q^j-q^{j-r+1}$ choices for $\mathbf{v}_2$; continuing this argument, we see that the number of such ordered bases is given by $\prod_{k=0}^{r-1}\left(q^j-q^{j-r+k}\right)$.

We conclude from the above arguments that

$$\sum_{\substack{V\le\mathbb{F}_q^n\\ \dim V=j\\ \ker(\pi_A)=W\times\{(0)_A\}}}1=\frac{q^{r(n-s)}\prod_{k=0}^{r-1}\left(q^s-q^k\right)}{\prod_{k=0}^{r-1}\left(q^j-q^{j-r+k}\right)}=q^{r(n-j-s+r)}\begin{bmatrix}s\\ r\end{bmatrix}_q.$$

Putting this into the right hand side of (17) and using Lemma 1 again, we get

$$\sum_{\substack{V\le\mathbb{F}_q^n\\ \dim V=j\\ \dim V_A=r}}1=\sum_{\substack{W\le\mathbb{F}_q^{[n]-A}\\ \dim W=j-r}}q^{r(n-j-s+r)}\begin{bmatrix}s\\ r\end{bmatrix}_q=q^{r(n-j-s+r)}\begin{bmatrix}s\\ r\end{bmatrix}_q\begin{bmatrix}n-s\\ j-r\end{bmatrix}_q.$$

The desired result is obtained immediately by plugging this into the right hand side of (16). This completes the proof of Lemma 3. ∎

###### Lemma 4.

Let $H$ be a random matrix in the ensemble $\mathcal{R}_{m,n}$. Let $E,E'$ be subsets of $[n]$ such that

$$i=\#E,\qquad i'=\#E',\qquad s=\#(E\cap E').$$

Then

$$P\left(\mathrm{rk}(H_E)=i\ \cap\ \mathrm{rk}(H_{E'})=i'\right)=\frac{\psi_m(i)\,\psi_m(i')}{\psi_m(s)}. \tag{18}$$
###### Proof.

It is clear that if a matrix $H_E$ has full column rank, then so does the submatrix $H_A$ for any index subset $A\subset E$. Hence we have

$$P\left(\mathrm{rk}(H_E)=i\ \cap\ \mathrm{rk}(H_{E'})=i'\right)=P\left(\mathrm{rk}(H_E)=i\ \cap\ \mathrm{rk}(H_{E'})=i'\ \middle|\ \mathrm{rk}(H_{E\cap E'})=s\right)\,P\left(\mathrm{rk}(H_{E\cap E'})=s\right).$$

It is easy to see that the two events $\{\mathrm{rk}(H_E)=i\}$ and $\{\mathrm{rk}(H_{E'})=i'\}$ are conditionally independent given $\{\mathrm{rk}(H_{E\cap E'})=s\}$, since the columns of $H_{E\setminus E'}$ and those of $H_{E'\setminus E}$ are independent as random vectors over $\mathbb{F}_q$. Hence we get

$$\begin{aligned}
&P\left(\mathrm{rk}(H_E)=i\ \cap\ \mathrm{rk}(H_{E'})=i'\right)\\
&\quad=P\left(\mathrm{rk}(H_E)=i\ \middle|\ \mathrm{rk}(H_{E\cap E'})=s\right)P\left(\mathrm{rk}(H_{E'})=i'\ \middle|\ \mathrm{rk}(H_{E\cap E'})=s\right)P\left(\mathrm{rk}(H_{E\cap E'})=s\right)\\
&\quad=\frac{P\left(\mathrm{rk}(H_E)=i\ \cap\ \mathrm{rk}(H_{E\cap E'})=s\right)\,P\left(\mathrm{rk}(H_{E'})=i'\ \cap\ \mathrm{rk}(H_{E\cap E'})=s\right)}{P\left(\mathrm{rk}(H_{E\cap E'})=s\right)}\\
&\quad=\frac{\psi_m(i)\,\psi_m(i')}{\psi_m(s)}.
\end{aligned}$$

Here we have applied Lemmas 2 and 3 in the last equality, Lemma 3 being applied to $H_E$ (respectively $H_{E'}$) with $A=E\cap E'$ and full ranks $j=i$, $r=s$ (respectively $j=i'$, $r=s$), and Lemma 2 giving $P\left(\mathrm{rk}(H_{E\cap E'})=s\right)=\psi_m(s)$. ∎
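
As a quick empirical sanity check of Lemma 4 (our own illustration for $q=2$; function names are assumptions), one can estimate the joint probability in (18) by sampling and compare it with $\psi_m(i)\psi_m(i')/\psi_m(s)$.

```python
import random

def psi(m, i, q=2):
    """psi_m(i) = prod_{k=0}^{i-1} (1 - q^(k-m)); psi_m(0) = 1."""
    p = 1.0
    for k in range(i):
        p *= 1.0 - float(q) ** (k - m)
    return p

def rank_gf2(vecs):
    """Rank over F_2 of integer-encoded vectors (xor-basis reduction)."""
    basis = []
    for v in vecs:
        for b in basis:
            v = min(v, v ^ b)
        if v:
            basis.append(v)
    return len(basis)

def lemma4_formula(m, E1, E2, q=2):
    """psi_m(i) psi_m(i') / psi_m(s) with i = #E1, i' = #E2, s = #(E1 & E2)."""
    s = len(set(E1) & set(E2))
    return psi(m, len(E1), q) * psi(m, len(E2), q) / psi(m, s, q)

def joint_full_rank_empirical(m, n, E1, E2, samples=20000, seed=3):
    """Estimate P(rk(H_{E1}) = #E1 and rk(H_{E2}) = #E2), H uniform in R_{m,n}."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(samples):
        cols = [rng.randrange(2**m) for _ in range(n)]  # columns of H as m-bit ints
        if (rank_gf2([cols[j] for j in E1]) == len(E1)
                and rank_gf2([cols[j] for j in E2]) == len(E2)):
            hits += 1
    return hits / samples
```

For instance, with $m=6$, $n=5$, $E=\{1,2,3\}$ and $E'=\{3,4,5\}$ (so $i=i'=3$, $s=1$), the empirical estimate matches the closed form to within sampling error.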

## IV Proof of Theorem 1

The background on the three decoding principles, namely unambiguous decoding, maximum likelihood decoding and list decoding of linear codes over the erasure channel, the computation of their decoding error probability functions $P_{ud}$, $P_{ld}$ and $P_{mld}$, and their relation to the concept of $\ell$-incorrigible sets of a linear code were all laid out in detail in [11, II. Preliminaries], so we do not repeat them here. Interested readers may refer to that paper for more details. We focus on what is most relevant to the proof of Theorem 1 in this paper.

Let $C$ be an $[n,k]_q$ linear code, that is, $C$ is a $k$-dimensional subspace of $\mathbb{F}_q^n$. Denote $[n]:=\{1,\ldots,n\}$. For any $E\subset[n]$, define

$$C(E):=\left\{\mathbf{c}=(c_1,\ldots,c_n)\in C:\ c_i=0\ \ \forall\, i\in[n]\setminus E\right\}.$$

Since $C$ is a linear code, $C(E)$ is also a vector space over $\mathbb{F}_q$.

Denote by $I^{(\ell)}_i(C)$ the $\ell$-incorrigible set distribution of $C$, and by $I_i(C)$ the incorrigible set distribution of $C$, which are defined respectively as follows:

$$\begin{aligned}
I^{(\ell)}_i(C)&=\#\left\{E\subset[n]:\ \#E=i,\ \dim C(E)>\ell\right\},\\
I_i(C)&=\#\left\{E\subset[n]:\ \#E=i,\ C(E)\neq\{0\}\right\}.
\end{aligned} \tag{19}$$

It is easy to see that $I^{(0)}_i(C)=I_i(C)$, so if $\ell=0$, then $I^{(\ell)}_i(C)=I_i(C)$. We also define

$$\lambda^{(\ell)}_i(C)=\#\left\{E\subset[n]:\ \#E=i,\ \dim C(E)=\ell\right\}.$$

It is easy to see that $\lambda^{(\ell)}_i(C)=0$ if $\ell>\min\{i,k\}$, and

$$\sum_{\ell=0}^{i}\lambda^{(\ell)}_i(C)=\binom{n}{i}. \tag{20}$$

We also have the identity

$$I^{(\ell)}_i(C)=\sum_{j=\ell+1}^{i}\lambda^{(j)}_i(C),\qquad \forall\, i,\ell. \tag{21}$$

Recall from [11] that the values $P_{ud}(C,\varepsilon)$, $P_{ld}(C,\ell,\varepsilon)$ and $P_{mld}(C,\varepsilon)$ can all be expressed in terms of $I_i(C)$, $I^{(\ell)}_i(C)$ and $\lambda^{(\ell)}_i(C)$ as follows:

$$\begin{aligned}
P_{ud}(C,\varepsilon)&=\sum_{i=1}^{n}I_i(C)\,\varepsilon^{i}(1-\varepsilon)^{n-i}, &\text{(22)}\\
P_{ld}(C,\ell,\varepsilon)&=\sum_{i=1}^{n}I^{(\ell)}_i(C)\,\varepsilon^{i}(1-\varepsilon)^{n-i}, &\text{(23)}\\
P_{mld}(C,\varepsilon)&=\sum_{i=1}^{n}\sum_{\ell=1}^{i}\lambda^{(\ell)}_i(C)\left(1-q^{-\ell}\right)\varepsilon^{i}(1-\varepsilon)^{n-i}. &\text{(24)}
\end{aligned}$$
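
To make these distributions concrete, the following toy example (ours, not from the paper; function names are assumptions) enumerates $\lambda^{(\ell)}_i(C)$ by brute force for the binary repetition code of length 3, with parity-check matrix $H=\begin{pmatrix}1&1&0\\ 0&1&1\end{pmatrix}$, and verifies identity (20).

```python
import itertools
from math import comb

def codewords_from_H(H, n):
    """All x in F_2^n with H x^t = 0 (brute force, binary case)."""
    return [x for x in itertools.product(range(2), repeat=n)
            if all(sum(h[i] * x[i] for i in range(n)) % 2 == 0 for h in H)]

def dim_CE(words, E, n):
    """dim C(E): C(E) collects codewords vanishing outside E; |C(E)| = 2^dim."""
    inside = [w for w in words
              if all(w[i] == 0 for i in range(n) if i not in E)]
    return len(inside).bit_length() - 1

def lambda_dist(words, n):
    """lam[i][l] = #{E : #E = i, dim C(E) = l}, as in the definition above."""
    lam = [[0] * (n + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        for E in itertools.combinations(range(n), i):
            lam[i][dim_CE(words, set(E), n)] += 1
    return lam
```

For the repetition code the only incorrigible set is $E=[3]$: erasing all three symbols leaves $\dim C(E)=1$, so by (22) we get $P_{ud}(C,\varepsilon)=\varepsilon^3$.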

For $H\in\mathcal{R}_{m,n}$, we write $P_{ld}(H,\ell,\varepsilon):=P_{ld}(C_H,\ell,\varepsilon)$ and $P_{*}(H,\varepsilon):=P_{*}(C_H,\varepsilon)$ for $*\in\{ud,mld\}$, where $C_H$ is the parity-check code defined by (1). The average decoding error probabilities over the matrix ensemble $\mathcal{R}_{m,n}$ are given by

$$P_{ld}(\mathcal{R}_{m,n},\ell,\varepsilon):=\mathbb{E}\left[P_{ld}(H,\ell,\varepsilon)\right],$$

and

$$P_{*}(\mathcal{R}_{m,n},\varepsilon):=\mathbb{E}\left[P_{*}(H,\varepsilon)\right],\qquad *\in\{ud,mld\}.$$

Here the expectation is taken over the ensemble $\mathcal{R}_{m,n}$.

Now we can start the proof of Theorem 1. For $H\in\mathcal{R}_{m,n}$, we denote

$$I_i:=I_i(C_H),\qquad I^{(\ell)}_i:=I^{(\ell)}_i(C_H),\qquad \lambda^{(\ell)}_i:=\lambda^{(\ell)}_i(C_H).$$

Taking expectations on both sides of Equations (22)-(24), we obtain

$$\begin{aligned}
P_{ud}(\mathcal{R}_{m,n},\varepsilon)&=\sum_{i=1}^{n}\mathbb{E}[I_i]\,\varepsilon^{i}(1-\varepsilon)^{n-i}, &\text{(25)}\\
P_{ld}(\mathcal{R}_{m,n},\ell,\varepsilon)&=\sum_{i=1}^{n}\mathbb{E}\bigl[I^{(\ell)}_i\bigr]\,\varepsilon^{i}(1-\varepsilon)^{n-i}, &\text{(26)}\\
P_{mld}(\mathcal{R}_{m,n},\varepsilon)&=\sum_{i=1}^{n}\sum_{\ell=1}^{i}\mathbb{E}\bigl[\lambda^{(\ell)}_i\bigr]\left(1-q^{-\ell}\right)\varepsilon^{i}(1-\varepsilon)^{n-i}. &\text{(27)}
\end{aligned}$$

We now compute $\mathbb{E}\bigl[\lambda^{(\ell)}_i\bigr]$. Noting that for $E\subset[n]$ with $\#E=i$ and $H\in\mathcal{R}_{m,n}$ we have $\dim C_H(E)=i-\mathrm{rk}(H_E)$, thus

$$\mathbb{E}\bigl[\lambda^{(\ell)}_i\bigr]=\frac{1}{\#\mathcal{R}_{m,n}}\sum_{H\in\mathcal{R}_{m,n}}\ \sum_{\substack{E\subset[n]\\ \#E=i}}\mathbf{1}_{\dim C_H(E)=\ell}=\frac{1}{\#\mathcal{R}_{m,n}}\sum_{\substack{E\subset[n]\\ \#E=i}}\ \sum_{\substack{H\in\mathcal{R}_{m,n}\\ \mathrm{rk}(H_E)=i-\ell}}1.$$

By the symmetry of the ensemble $\mathcal{R}_{m,n}$, the inner sum on the right hand side depends only on the cardinality of $E$, so we may assume $E=\{1,\ldots,i\}$ to obtain

$$\mathbb{E}\bigl[\lambda^{(\ell)}_i\bigr]=\frac{1}{\#\mathcal{R}_{m,i}}\binom{n}{i}\sum_{\substack{H\in\mathcal{R}_{m,i}\\ \mathrm{rk}(H)=i-\ell}}1.$$

The right hand side is exactly $\binom{n}{i}P\left(\mathrm{rk}(H)=i-\ell\right)$, where the probability is over the ensemble $\mathcal{R}_{m,i}$. So from Lemma 2 we have

$$\mathbb{E}\bigl[\lambda^{(\ell)}_i\bigr]=q^{-m\ell}\,\psi_m(i-\ell)\begin{bmatrix}i\\ i-\ell\end{bmatrix}_q\binom{n}{i}. \tag{28}$$

Using this and (21), we also obtain

$$\mathbb{E}\bigl[I^{(\ell)}_i\bigr]=\sum_{j=\ell+1}^{i}\mathbb{E}\bigl[\lambda^{(j)}_i\bigr]=\sum_{j=\ell+1}^{i}q^{-mj}\,\psi_m(i-j)\begin{bmatrix}i\\ j\end{bmatrix}_q\binom{n}{i}.$$

Inserting the above values $\mathbb{E}\bigl[I^{(\ell)}_i\bigr]$ and $\mathbb{E}\bigl[\lambda^{(\ell)}_i\bigr]$ into (26) and (27) respectively, we obtain explicit expressions of $P_{ld}(\mathcal{R}_{m,n},\ell,\varepsilon)$ and