# Non-binary Code Correcting Single b-Burst of Insertions or Deletions

This paper constructs a non-binary code correcting a single b-burst of insertions or deletions. It also proposes a decoding algorithm for this code and evaluates a lower bound on its cardinality. Moreover, we evaluate an asymptotic upper bound on the cardinality of codes that can correct a single burst of insertions or deletions.


## I Introduction

In communication and storage systems, symbols in a sequence may be inserted or deleted due to synchronization errors. Levenshtein [1] proved that VT codes (constructed by Varshamov and Tenengolts [2] for error correction on the Z-channel) correct a single insertion or deletion. This code has been extended to the non-binary single insertion or deletion case [3] and to two adjacent insertions or deletions [4]. It has also been extended to binary [5] and non-binary [6] multiple insertion or deletion correcting codes.

Cheng et al. [7] constructed a binary $b$-burst insertion or deletion correcting code, which corrects any burst of consecutive insertions or deletions of length $b$. Schoeny et al. [8] improved this construction and showed that the resulting code has larger cardinality than the code constructed by Cheng et al. These constructions have been extended to permutation codes [9, 10]. Recently, Schoeny et al. [11] gave a non-binary $b$-burst insertion or deletion correcting code.

In this paper, we construct a non-binary $b$-burst insertion or deletion correcting code with a larger cardinality. The key idea of the paper is to investigate the correcting capability of the non-binary shifted VT code, which is a component of non-binary $b$-burst insertion or deletion correcting codes. We also derive a lower bound on the number of codewords of the constructed non-binary $b$-burst insertion or deletion correcting code. Moreover, we show an asymptotic upper bound on the cardinality of the best non-binary $b$-burst insertion or deletion correcting code.

## II Preliminaries and Previous Works

This section briefly introduces previous works, i.e., the insertion/deletion codes given in [2, 3, 7, 8, 11] (Section II-A gives the definition of the notation “insertion/deletion”). We use the notation given in this section throughout the paper.

### II-A Notation and Definition

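For integers $i \le j$, define $[i,j] := \{k \in \mathbb{Z} \mid i \le k \le j\}$ and $[i] := [0, i-1]$, where $\mathbb{Z}$ stands for the set of integers. For a sequence $x = (x_1, x_2, \dots, x_n) \in [q]^n$, we denote by $x_{\neg i}$ the subsequence of $x$ whose $i$-th symbol is deleted, i.e., $x_{\neg i} := (x_1, \dots, x_{i-1}, x_{i+1}, \dots, x_n)$. In this case, we say that a single deletion has occurred in $x$. If $y$ is an output of the single insertion channel with an input $x$, there exists $i$ such that $y_{\neg i} = x$. For a sequence $x$, a symbol $s \in [q]$, and an integer $i$, we denote $x_{\vdash(i,s)} := (x_1, \dots, x_{i-1}, s, x_i, \dots, x_n)$.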

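A run of length $\ell$ of a sequence $x$ is a subsequence $(x_i, x_{i+1}, \dots, x_{i+\ell-1})$ of $x$ such that $x_i = x_{i+1} = \cdots = x_{i+\ell-1}$, $x_{i-1} \neq x_i$ (for $i > 1$), and $x_{i+\ell-1} \neq x_{i+\ell}$ (for $i+\ell \le n$).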

###### Remark 1

For a sequence $x = (1,0,0,1,1,1)$, $(x_2, x_3)$ is a run of length 2 and we have

$$x_{\neg 2} = x_{\neg 3} = (1,0,1,1,1).$$

From this, we see that we receive the same subsequences if a symbol in the same run is deleted under the single deletion channel. In other words, in the single deletion channel, even if one can correct a deletion, one cannot detect which symbol in a run is deleted.

Similarly, we get

$$x_{\vdash(2,0)} = x_{\vdash(3,0)} = x_{\vdash(4,0)} = (1,0,0,0,1,1,1).$$

Hence, in the single insertion channel, we receive the same sequence whenever the symbol $s$ is inserted into a run of $s$.
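The two equalities of Remark 1 can be reproduced with a few lines of Python. This is a minimal sketch: the helper names `delete` and `insert` are illustrative, and the sequence $x = (1,0,0,1,1,1)$ is the one consistent with both displayed equalities.

```python
def delete(x, i):
    """Return x with its i-th symbol (1-indexed) deleted."""
    return x[:i - 1] + x[i:]


def insert(x, i, s):
    """Return x with symbol s inserted so that s becomes the i-th symbol."""
    return x[:i - 1] + (s,) + x[i - 1:]


x = (1, 0, 0, 1, 1, 1)

# Deleting either symbol of the run (x_2, x_3) gives the same output.
print(delete(x, 2), delete(x, 3))

# Inserting a 0 anywhere into (or next to) the run of 0s gives the same output.
print({insert(x, i, 0) for i in (2, 3, 4)})
```

Both printed deletions coincide, and the set of insertion results contains a single sequence, which is why a decoder can recover the codeword but not the exact position inside the run.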

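We refer to exactly $b$ consecutive deletions as a single $b$-burst deletion. We define $x_{\neg[i,j]} := (x_1, \dots, x_{i-1}, x_{j+1}, \dots, x_n)$. In words, when the consecutive symbols of $x$ from the $i$-th to the $j$-th are deleted, we denote the result by $x_{\neg[i,j]}$. If $y$ is an output of the single $b$-burst insertion channel with an input $x$, there exists an integer $i$ such that $y_{\neg[i,i+b-1]} = x$.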

A code which corrects single $b$-burst deletions (resp. insertions) is called a single $b$-burst deletion (resp. insertion) correcting code. A code is $b$-burst insertion/deletion correcting if it corrects single $b$-burst insertions or single $b$-burst deletions. Similarly, we define the terms single deletion correcting code, single insertion correcting code, and single insertion/deletion correcting code.

The following theorem given in [8] shows a relationship between single $b$-burst deletion correcting codes and single $b$-burst insertion correcting codes.

###### Theorem 1

[8, Theorem 1] A code is a $b$-burst deletion correcting code if and only if it is a $b$-burst insertion correcting code.

This theorem holds not only in the binary case but also in the non-binary case. Hence, to prove that a code is a $b$-burst insertion/deletion correcting code, we only need to prove that it is a $b$-burst deletion correcting code.

### II-B Single Insertion/Deletion Correcting Code

The VT code is a single insertion/deletion correcting code. The VT code is defined by the code length $n$ and $a \in [n+1]$ as follows:

$$\mathrm{VT}_a(n) = \Big\{ x \in [2]^n \;\Big|\; \sum_{i=1}^{n} i x_i \equiv a \pmod{n+1} \Big\}.$$

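Let $\mathbb{I}[P]$ be the indicator function, which equals 1 if the proposition $P$ is true and equals 0 otherwise. A mapping $\sigma: [q]^n \to [2]^{n-1}$ of a $q$-ary sequence $x$ to a binary sequence $u = \sigma(x)$ is defined by

$$u_i = \mathbb{I}[x_i < x_{i+1}] \qquad (i \in [1, n-1]).$$

We refer to the sequence $\sigma(x)$ as the ascent sequence for $x$. The non-binary VT code is a non-binary single insertion/deletion correcting code defined by the code length $n$, $a \in [n]$, and $c \in [q]$ as follows:

$$q\mathrm{VT}_{a,c}(n,q) = \Big\{ x \in [q]^n \;\Big|\; \sum_{i=1}^{n} x_i \equiv c \pmod{q},\ \sigma(x) \in \mathrm{VT}_a(n-1) \Big\}.$$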
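The definitions of this subsection can be checked by brute force for small parameters. The following is a minimal sketch, not the authors' implementation: it assumes the ascent indicator is $u_i = 1$ exactly when $x_i < x_{i+1}$ (the reading consistent with the monotone chains used later in the proofs), and all function names are illustrative.

```python
from itertools import product


def sigma(x):
    """Ascent sequence: u_i = 1 if x_i < x_{i+1}, else 0 (assumed reading)."""
    return tuple(int(a < b) for a, b in zip(x, x[1:]))


def vt_syndrome(u):
    """Weighted sum sum_i i * u_i with 1-indexed positions."""
    return sum(i * ui for i, ui in enumerate(u, start=1))


def in_vt(x, a):
    """Membership in VT_a(n) with n = len(x)."""
    return vt_syndrome(x) % (len(x) + 1) == a


def in_qvt(x, a, c, q):
    """Membership in qVT_{a,c}(n, q): symbol sum mod q and sigma(x) in VT_a(n-1)."""
    return sum(x) % q == c and in_vt(sigma(x), a)


def corrects_single_deletion(code):
    """True if no two distinct codewords share a single-deletion output."""
    owner = {}
    for x in code:
        for i in range(len(x)):
            y = x[:i] + x[i + 1:]
            if owner.setdefault(y, x) != x:
                return False
    return True


n, q = 5, 3
vt0 = [x for x in product(range(2), repeat=n) if in_vt(x, 0)]
qvt00 = [x for x in product(range(q), repeat=n) if in_qvt(x, 0, 0, q)]
print(len(vt0), corrects_single_deletion(vt0))
print(len(qvt00), corrects_single_deletion(qvt00))
```

Exhaustive search of this kind only scales to very short lengths, but it is a convenient sanity check on the definitions before implementing a real decoder.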

### II-C Binary Burst Insertion/Deletion Correcting Code

This section briefly introduces the binary $b$-burst insertion/deletion correcting codes given in [7, 8]. Roughly speaking, those methods employ interleaving to construct the codes.

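For simplicity, we assume that $n$ is divisible by $b$. The $b \times (n/b)$ matrix representation $A_b(x)$ for a sequence $x$ is given as

$$A_b(x) = \begin{pmatrix} x_1 & x_{b+1} & \cdots & x_{n-b+1} \\ x_2 & x_{b+2} & \cdots & x_{n-b+2} \\ \vdots & \vdots & \ddots & \vdots \\ x_b & x_{2b} & \cdots & x_n \end{pmatrix}. \tag{1}$$

We denote the $i$-th row of this matrix by $A_b(x)_i$.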

###### Example 1

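Consider the 3-burst deletion channel with an input $x = (x_1, x_2, \dots, x_{12})$. Assume that the output is $x_{\neg[6,8]}$. Then, the matrix representations are

$$A_3(x) = \begin{pmatrix} x_1 & x_4 & x_7 & x_{10} \\ x_2 & x_5 & x_8 & x_{11} \\ x_3 & x_6 & x_9 & x_{12} \end{pmatrix}, \qquad A_3(x_{\neg[6,8]}) = \begin{pmatrix} x_1 & x_4 & x_{10} \\ x_2 & x_5 & x_{11} \\ x_3 & x_9 & x_{12} \end{pmatrix}.$$

From these, we see that $A_3(x_{\neg[6,8]})_i$ is the result of a single deletion in $A_3(x)_i$. Moreover, we see that when the $j$-th entry of $A_3(x)_1$ is deleted, the $(j-1)$-th or $j$-th entry of $A_3(x)_i$ is deleted for $i \in [2,3]$.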

From the above example, to recover from a single $b$-burst deletion, one needs to correct a single deletion in each row of the matrix representation. Moreover, once the position of the deletion in the first row is detected, one only needs to correct a deletion within two given adjacent positions in each of the other rows.
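The interleaving view of Example 1 is easy to reproduce. The sketch below uses the illustrative helpers `matrix_rows` and `burst_delete`, and labels the symbol $x_k$ by the integer $k$ so that the deleted entries are visible in the output.

```python
def matrix_rows(x, b):
    """Rows of A_b(x): row i (1-indexed) holds x_i, x_{i+b}, x_{i+2b}, ..."""
    return [x[i::b] for i in range(b)]


def burst_delete(x, i, j):
    """Delete the i-th through j-th symbols (1-indexed, inclusive)."""
    return x[:i - 1] + x[j:]


x = tuple(range(1, 13))      # the label k stands for the symbol x_k
y = burst_delete(x, 6, 8)    # 3-burst deletion of x_6, x_7, x_8

for before, after in zip(matrix_rows(x, 3), matrix_rows(y, 3)):
    print(before, "->", after)
```

Each row loses exactly one entry: the third entry in rows 1 and 2, and the second entry in row 3, which matches the "$(j-1)$-th or $j$-th entry" observation.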

The code in [7, Sect. III-C] embeds a marker in the first row of the matrix representation to detect the deletion position and employs substitution-transposition codes [12] in the other rows to correct a single deletion within two given adjacent positions. Here, note that we are able to regard the marker as a codeword of a VT code with maximum run length 1.

Schoeny et al. [8] improved the construction of this code. The first row of the code in [8] is a run-length-limited VT code, which is a VT code with maximum run length at most $r$. From Remark 1, one detects an interval of length at most $r$ that contains the deletion position. The other rows of the code are shifted-VT codes, which correct a single deletion within $r+1$ given adjacent positions. Let $S_{n,q}(r)$ be the set of sequences in $[q]^n$ with maximum run length at most $r$. Then, the run-length-limited VT code and the shifted-VT (SVT) code are defined as

$$\mathrm{RLL\text{-}VT}_a(n,r) = \mathrm{VT}_a(n) \cap S_{n,2}(r),$$

$$\mathrm{SVT}_{d,e}(n,r) = \Big\{ x \in [2]^n : \sum_{i=1}^{n} i x_i \equiv d \pmod{r},\ \sum_{i=1}^{n} x_i \equiv e \pmod{2} \Big\},$$

for $a \in [n+1]$, $d \in [r]$, and $e \in [2]$. By using those codes, the binary single $b$-burst correcting code is constructed as:

$$C_{2,b} = \big\{ x : A_b(x)_1 \in \mathrm{RLL\text{-}VT}_a(n/b, r),\ \forall i \in [2,b]\ \ A_b(x)_i \in \mathrm{SVT}_{d,e}(n/b, r+1) \big\}.$$
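The two component codes can be exercised on short lengths. This is a hedged sketch with illustrative names: `p` denotes the SVT modulus (the construction above uses `p = r + 1` for rows 2 to $b$), and the window check encodes the assumption, taken from the summary of [8] given here, that an SVT code with modulus `p` resolves a deletion known to lie within `p` adjacent positions.

```python
from itertools import product


def max_run(x):
    """Length of the longest run of identical symbols in a non-empty sequence."""
    best = cur = 1
    for a, b in zip(x, x[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best


def weighted_sum(x):
    """sum_i i * x_i with 1-indexed positions."""
    return sum(i * xi for i, xi in enumerate(x, start=1))


def in_rll_vt(x, a, r):
    """Membership in RLL-VT_a(n, r) with n = len(x)."""
    return weighted_sum(x) % (len(x) + 1) == a and max_run(x) <= r


def in_svt(x, d, e, p):
    """Membership in SVT_{d,e}(n, p)."""
    return weighted_sum(x) % p == d and sum(x) % 2 == e


def window_decodable(code, n, width):
    """True if a deletion known to lie in any window of `width` adjacent
    positions never maps two distinct codewords to the same output."""
    for start in range(n - width + 1):
        owner = {}
        for x in code:
            for i in range(start, start + width):
                y = x[:i] + x[i + 1:]
                if owner.setdefault(y, x) != x:
                    return False
    return True


n, p = 6, 3
words = list(product(range(2), repeat=n))
svt_ok = all(
    window_decodable([x for x in words if in_svt(x, d, e, p)], n, p)
    for d in range(p) for e in range(2)
)
rll = [x for x in words if in_rll_vt(x, 0, 2)]
print(svt_ok, len(rll))
```

The same `window_decodable` helper can be pointed at wider windows to see where the guarantee stops holding for a given modulus.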

### II-D Decoding Algorithm for SVT Codes

In this section, we briefly introduce the decoding algorithm for the SVT codes. The details of decoding algorithms are in [8, Appendix C].

Firstly, we consider the case of deletion correction, assuming that an SVT code is employed. The inputs of the deletion decoder are the received sequence, the code parameters, and the first possible deletion position. The outputs of the deletion decoder are a pair consisting of the estimated codeword and the interval of the run which contains the inserted symbol.

Secondly, we consider the case of insertion correction. The inputs of the insertion decoder are the received sequence, the code parameters, and the first possible insertion position. The outputs of the insertion decoder are a pair consisting of the estimated codeword and the interval of the run which contains the deleted symbol. The notations for these two correcting algorithms will be used in Section III-C.

### II-E Non-binary Burst Insertion/Deletion Correcting Code

This section introduces the non-binary $b$-burst insertion/deletion correcting code given in [11].

By a straightforward construction, one obtains a non-binary $b$-burst insertion/deletion correcting code. As in the construction of the non-binary VT code, we employ the mapping $\sigma$ given in Sect. II-B. The non-binary run-length-limited VT code and the non-binary SVT code are defined as:

$$\mathrm{RLL\text{-}}q\mathrm{VT}_{a,c}(n,r,q) := q\mathrm{VT}_{a,c}(n,q) \cap S_{n,q}(r),$$

$$q\mathrm{SVT}_{d,e,f}(n,r,q) := \Big\{ x \in [q]^n \;\Big|\; \sum_{i=1}^{n} x_i \equiv f \pmod{q},\ \sigma(x) \in \mathrm{SVT}_{d,e}(n-1,r) \Big\},$$

where $a \in [n]$, $c \in [q]$, $d \in [r]$, $e \in [2]$, and $f \in [q]$. Schoeny et al. [11] showed the following lemma:

###### Lemma 1 ( [11, Lemma 1] )

For all $d \in [r]$, $e \in [2]$, and $f \in [q]$, the code $q\mathrm{SVT}_{d,e,f}(n,r,q)$ corrects a single insertion/deletion within $r-1$ given adjacent positions.

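As a result, they constructed the following non-binary single $b$-burst insertion/deletion correcting code:

$$\breve{C}_{q,b} := \big\{ x \;\big|\; A_b(x)_1 \in \mathrm{RLL\text{-}}q\mathrm{VT}_{a,b}(n/b, r, q),\ \forall i \in [2,b]\ \ A_b(x)_i \in q\mathrm{SVT}_{d,e,f}(n/b, r+2, q) \big\}. \tag{2}$$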

## III Main Results

This section constructs a non-binary burst insertion/deletion correcting code with a large cardinality. Section III-A gives the main theorem and construction of the code. Section III-B proves that the code is a non-binary burst insertion/deletion correcting code. Section III-C provides the decoding algorithm for the code. Section IV will evaluate the asymptotic cardinality of the code and show a numerical example.

### III-A Code Construction and Main Theorem

We investigate the correcting capability of the non-binary SVT code. As a result, we find that the code corrects a single insertion/deletion over a longer range, as stated in the following theorem.

###### Theorem 2

For all $d \in [r]$, $e \in [2]$, and $f \in [q]$, the code $q\mathrm{SVT}_{d,e,f}(n,r,q)$ corrects a single insertion/deletion within $r$ given adjacent positions.
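For small parameters the statement of Theorem 2 can be checked exhaustively. The sketch below is an illustration under stated assumptions, not the authors' code: it reads the theorem as "a deletion known to lie within $r$ adjacent positions is correctable" (the range used in the proof, where $t - s \le r - 1$), takes the ascent indicator as $u_i = 1$ iff $x_i < x_{i+1}$, and uses hypothetical helper names.

```python
from itertools import product


def sigma(x):
    """Ascent sequence: u_i = 1 if x_i < x_{i+1}, else 0 (assumed reading)."""
    return tuple(int(a < b) for a, b in zip(x, x[1:]))


def qsvt_index(x, r, q):
    """Return (d, e, f) such that x belongs to qSVT_{d,e,f}(n, r, q)."""
    u = sigma(x)
    d = sum(i * ui for i, ui in enumerate(u, start=1)) % r
    e = sum(u) % 2
    f = sum(x) % q
    return d, e, f


def window_decodable(code, n, width):
    """True if a deletion known to lie in any window of `width` adjacent
    positions never maps two distinct codewords to the same output."""
    for start in range(n - width + 1):
        owner = {}
        for x in code:
            for i in range(start, start + width):
                y = x[:i] + x[i + 1:]
                if owner.setdefault(y, x) != x:
                    return False
    return True


n, r, q = 5, 3, 3
classes = {}
for x in product(range(q), repeat=n):
    classes.setdefault(qsvt_index(x, r, q), []).append(x)

theorem2_holds = all(window_decodable(code, n, r) for code in classes.values())
print(len(classes), theorem2_holds)
```

By Theorem 1, checking deletions is sufficient for the insertion/deletion statement. Calling `window_decodable(code, n, r + 1)` on the same classes is a quick way to probe whether the window could be extended further for a particular choice of parameters.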

Based on this result, we construct a code:

$$C_{q,b} := \big\{ x \;\big|\; A_b(x)_1 \in \mathrm{RLL\text{-}}q\mathrm{VT}_{a,b}(n/b, r, q),\ \forall i \in [2,b]\ \ A_b(x)_i \in q\mathrm{SVT}_{d,e,f}(n/b, r+1, q) \big\}. \tag{3}$$

Moreover, we show the following theorem.

###### Theorem 3

The code $C_{q,b}$ corrects a single $b$-burst insertion/deletion.

### III-B Proof of Theorems

In this section, we prove Theorems 2 and 3. We first derive several lemmas to prove Theorem 2. The following lemma clarifies the effect of a single deletion in a sequence on its ascent sequence.

###### Lemma 2

Denote $u := \sigma(x)$. Then, $\sigma(x_{\neg i}) = u_{\neg(i-1)}$ or $\sigma(x_{\neg i}) = u_{\neg i}$ holds.

###### Proof:

Denote . Obviously, it hold that for and for . Hence, we will show that or holds.

Firstly, we assume . Then, holds. Since , holds. Hence, holds. Secondly, we assume and . Then, and holds. If , equals 1, otherwise equals 0. Hence, or holds.

The other cases are proved in a similar way.

Similarly, for an insertion, we obtain the following lemma.

###### Lemma 3

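Denote $u := \sigma(x)$. Then, $\sigma(x_{\vdash(i,s)}) = u_{\vdash(i-1,s')}$ or $\sigma(x_{\vdash(i,s)}) = u_{\vdash(i,s')}$ holds, where $s'$ equals $0$ or $1$.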

The following lemma is used for the proof of Theorem 2.

###### Lemma 4

Consider such that and for a pair of integers . Denote , , and . Then, the following hold:

1. If , then there exist such that

2. For a pair of integers , if and , there exist such that .

###### Proof:

From Lemma 2, we have or , and or . Hence, holds for a pair of integers . We have

$$0 \equiv \sum_{i=1}^{n} x_i - \sum_{i=1}^{n} y_i \pmod{q} = x_s - y_t,$$

where the first equivalence follows from and the second equation follows from . Since , we get

$$x_s = y_t. \tag{4}$$

From and , we have

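$$x_i = \begin{cases} y_i, & i \in [1, s-1] \cup [t+1, n], \\ y_{i-1}, & i \in [s+1, t], \end{cases} \tag{5}$$

$$u_i = \begin{cases} v_i, & i \in [1, s-\alpha-1] \cup [t-\beta+1, n-1], \\ v_{i-1}, & i \in [s-\alpha+1, t-\beta]. \end{cases} \tag{6}$$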

Firstly, we prove the case 1), i.e, the case of . Let us hypothesize . From (6), we get . Hence, we have

$$x_s \ge x_{s+1} \ge \cdots \ge x_{t+1}, \qquad y_{s-1} \ge y_s \ge \cdots \ge y_t. \tag{7}$$

Note that and follow from (5). From (4), (5) and (7), we have

$$x_s \ge x_{s+1} \ge \cdots \ge x_t = y_{t-1} \ge y_t = x_s,$$

$$x_s \ge x_{s+1} = y_s \ge y_{s+1} \ge \cdots \ge y_t = x_s.$$

Note that both ends of these equations are . Hence, these give

$$x_s = x_{s+1} = \cdots = x_t = y_s = y_{s+1} = \cdots = y_t.$$

From this equation and (5), we get . This contradicts . Next, let us hypothesize . Similarly, we get

$$x_s < x_{s+1} < \cdots < x_{t+1}, \qquad y_{s-1} < y_s < \cdots < y_t.$$

Note that follows from (5). Combining those and (4), we have the following contradiction

$$x_s < x_{s+1} < \cdots < x_t = y_{t-1} < y_t = x_s.$$

Thus, we obtain the case 1).

Secondly, we prove the case 2), i.e, the case of . From the assumption, we have . Now, let us hypothesize for all . Suppose . Then, for all . Hence, we have

$$x_{s-\alpha} \ge x_{s-\alpha+1} \ge \cdots \ge x_{t-\beta+1}, \tag{8}$$

$$y_{s-\alpha} \ge y_{s-\alpha+1} \ge \cdots \ge y_{t-\beta+1}. \tag{9}$$

Combining (4) (5), (8), and (9), we get

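$$\begin{aligned}
&x_{s-\alpha} \ge x_{s-\alpha+1} \ge x_{s-\alpha+2} \ge \cdots \ge x_{t-\beta} \ge x_{t-\beta+1}, \\
&y_{s-\alpha} \ge y_{s-\alpha+1} \ge \cdots \ge y_{t-\beta-1} \ge y_{t-\beta} \ge y_{t-\beta+1}, \\
&x_s = x_{s-\alpha}\ (\alpha=0), \qquad x_s = x_{s-\alpha+1}\ (\alpha=1), \\
&x_{s-\alpha+1} = y_{s-\alpha}\ (\alpha=0), \qquad x_{t-\beta+1} = y_{t-\beta}\ (\beta=1), \\
&y_{t-\beta} = x_s\ (\beta=0), \qquad y_{t-\beta+1} = x_s\ (\beta=1),
\end{aligned}$$

where an equality with a label holds if the labeled condition is satisfied (e.g., the equality labeled with $(\alpha=0)$ holds if $\alpha=0$). The above gives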

$$x_s = x_{s+1} = \cdots = x_t = y_s = y_{s+1} = \cdots = y_t,$$

for all pair of . Combining this and (5), we get . This contradicts . Next, suppose . Then, for all . Similarly, we get

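$$\begin{aligned}
&x_{s-\alpha} < x_{s-\alpha+1} < x_{s-\alpha+2} < \cdots < x_{t-\beta} < x_{t-\beta+1}, \\
&y_{s-\alpha} < y_{s-\alpha+1} < \cdots < y_{t-\beta-1} < y_{t-\beta} < y_{t-\beta+1}, \\
&x_s = x_{s-\alpha}\ (\alpha=0), \qquad x_s = x_{s-\alpha+1}\ (\alpha=1), \\
&x_{s-\alpha+1} = y_{s-\alpha}\ (\alpha=0), \qquad x_{t-\beta+1} = y_{t-\beta}\ (\beta=1), \\
&y_{t-\beta} = x_s\ (\beta=0), \qquad y_{t-\beta+1} = x_s\ (\beta=1).
\end{aligned}$$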

Now we will prove the two theorems.

Proof of Theorem 2: Let us hypothesize that there exists a pair of codewords such that and for two integers and . Here, without loss of generality, we assume . Denote and . From Lemma 2, and holds for a pair of integers . We have

$$0 \equiv \sum_{i=1}^{n-1} u_i - \sum_{i=1}^{n-1} v_i \pmod{2} = u_{s-\alpha} - v_{t-\beta},$$

where the first equivalence follows from , i.e, , and the second equation follows from . Hence, we get

$$u_{s-\alpha} = v_{t-\beta}. \tag{10}$$

Since , we get (6). From (6) and (10), we have

$$\sum_{i=1}^{n-1} i u_i - \sum_{i=1}^{n-1} i v_i = \sum_{i=s-\alpha+1}^{t-\beta} u_i - (t-s+\alpha-\beta)\, u_{s-\alpha}. \tag{11}$$
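The algebra behind (11) is short and can be spelled out as follows; this is a reconstruction of the omitted step from (6) and (10), not the authors' wording. The terms outside $[s-\alpha, t-\beta]$ cancel by the first case of (6), the second case of (6) gives $v_{i-1} = u_i$ for $i \in [s-\alpha+1, t-\beta]$, and (10) gives $v_{t-\beta} = u_{s-\alpha}$:

$$\begin{aligned}
\sum_{i=1}^{n-1} i u_i - \sum_{i=1}^{n-1} i v_i
&= (s-\alpha)\, u_{s-\alpha} + \sum_{i=s-\alpha+1}^{t-\beta} i\, u_i - \sum_{i=s-\alpha+1}^{t-\beta} (i-1)\, v_{i-1} - (t-\beta)\, v_{t-\beta} \\
&= (s-\alpha)\, u_{s-\alpha} + \sum_{i=s-\alpha+1}^{t-\beta} \big(i - (i-1)\big) u_i - (t-\beta)\, u_{s-\alpha} \\
&= \sum_{i=s-\alpha+1}^{t-\beta} u_i - (t-s+\alpha-\beta)\, u_{s-\alpha}.
\end{aligned}$$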

Note that . Hence, the pair of and satisfies the conditions of Lemma 4. Firstly, we assume . Then, case 1) of Lemma 4 derives

$$0 < \sum_{i=s}^{t} u_i \le t-s.$$

Recall that . Combining the above with (11), we obtain for

$$0 < \sum_{i=1}^{n-1} i u_i - \sum_{i=1}^{n-1} i v_i \le t-s$$

and for

$$-r \le -(t-s)-1 < \sum_{i=1}^{n-1} i u_i - \sum_{i=1}^{n-1} i v_i \le -1.$$

However, these contradict which follows from , i.e, . Secondly, we assume . Then, case 2) of Lemma 4 derives

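$$0 < \sum_{i=s-\alpha+1}^{t-\beta} u_i \le t-s+\alpha-\beta \quad (\text{if } u_{s-\alpha}=0), \qquad 0 \le \sum_{i=s-\alpha+1}^{t-\beta} u_i < t-s+\alpha-\beta \quad (\text{if } u_{s-\alpha}=1).$$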

Since and , we have Combining the above and (11), we obtain

 0<∑n−1i=1iui−∑n−1i=1ivi

Similarly, these contradict . Hence, we obtain the theorem.

Theorem 3 is proved in a similar way to [8, Theorem 5].

### III-C Decoding Algorithm

Due to space limitations, we only describe the insertion/deletion correcting algorithm for the non-binary SVT code. In other words, we omit the decoding algorithm for $C_{q,b}$.

We denote the remainder when is divided by , by . Denote the transmitted sequence, by . Algorithms 1 and 2 describe the deletion and insertion correcting algorithm for the SVT code, respectively. The set of inputs of those algorithms is the received sequence , code parameters , and first possible deletion/insertion position . The output of those algorithms is the estimated sequence.

In Algorithm 1, stands the deleted symbol and represents the position of the deleted symbol. Step 1 calculates the deleted symbol since . Step 2 checks whether the -th symbol is deleted. If the condition of Step 2 does not satisfy, then the deletion position is in . In such a case, from Lemma 2, equals to with an integer . Hence, we obtain as in Step 5. The algorithm searches the position of the deleted symbol in Steps 7-20.

In Algorithm 2, stands the inserted symbol and represents the position of the inserted symbol. Step 1 calculates the inserted symbol since . Step 2 checks whether the -th symbol is inserted. If the condition of Step 2 does not satisfy, then the inserted position is in . In such case, from Lemma 3, equals with an integer and . Hence, we obtain as in Step 5. The algorithm searches the position of the inserted symbol in Steps 7-11.

## IV The Number of Codewords

This section evaluates the gap between the lower bound on the cardinality of the constructed code and the upper bound on the cardinality of arbitrary non-binary $b$-burst insertion/deletion correcting codes. Moreover, we evaluate the number of codewords of the SVT codes by a numerical example, as evidence that the code in (3) has a larger cardinality.

### IV-A Lower Bound of Cardinality of Constructed Code

In a similar way to [8, Lemma 2], we have the following lemma.

###### Lemma 5

The following holds

$$|S_{n,q}(r)| \ge (q^r - n)\, q^{n-r}.$$
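The bound of Lemma 5 can be compared with exact counts for short lengths. The sketch below enumerates $[q]^n$ directly; the helper names and the parameter triples are illustrative choices, not values from the paper.

```python
from itertools import product


def max_run(x):
    """Length of the longest run of identical symbols in a non-empty sequence."""
    best = cur = 1
    for a, b in zip(x, x[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best


def count_rll(n, q, r):
    """Exact size of S_{n,q}(r): q-ary length-n sequences with max run <= r."""
    return sum(1 for x in product(range(q), repeat=n) if max_run(x) <= r)


for n, q, r in [(4, 3, 2), (6, 2, 3), (5, 4, 2)]:
    exact = count_rll(n, q, r)
    bound = (q ** r - n) * q ** (n - r)
    print(f"n={n} q={q} r={r}: exact={exact} bound={bound}")
```

The bound follows from a union bound over the at most $n$ starting positions of a run longer than $r$, so it is loose when $n$ is close to $q^r$ and becomes informative once $q^r$ is well above $n$.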

By the pigeonhole principle and this lemma, we get the following two lemmas.

###### Lemma 6

The cardinality of the non-binary run-length-limited VT code is lower bounded as:

$$\max_{a \in [n],\, c \in [q]} |\mathrm{RLL\text{-}}q\mathrm{VT}_{a,c}(n,r,q)| \ge \frac{(q^r - n)\, q^{n-r-1}}{n}.$$

###### Lemma 7

The cardinality of the non-binary SVT code is lower bounded as:

$$\max_{d \in [r],\, e \in [2],\, f \in [q]} |q\mathrm{SVT}_{d,e,f}(n,r,q)| \ge \frac{q^{n-1}}{2r}.$$
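The pigeonhole argument behind Lemmas 6 and 7 can be illustrated numerically: every sequence falls into exactly one parameter class, so the largest class is at least the average. The sketch below is illustrative only; it assumes the ascent indicator $u_i = 1$ iff $x_i < x_{i+1}$, and the parameters $n = 5$, $q = 3$, $r = 3$ are arbitrary small values.

```python
from collections import Counter
from itertools import product


def sigma(x):
    """Ascent sequence: u_i = 1 if x_i < x_{i+1}, else 0 (assumed reading)."""
    return tuple(int(a < b) for a, b in zip(x, x[1:]))


def max_run(x):
    """Length of the longest run of identical symbols in a non-empty sequence."""
    best = cur = 1
    for a, b in zip(x, x[1:]):
        cur = cur + 1 if a == b else 1
        best = max(best, cur)
    return best


n, q, r = 5, 3, 3
rll_qvt_sizes = Counter()   # keyed by (a, c)
qsvt_sizes = Counter()      # keyed by (d, e, f)

for x in product(range(q), repeat=n):
    u = sigma(x)
    w = sum(i * ui for i, ui in enumerate(u, start=1))
    if max_run(x) <= r:
        rll_qvt_sizes[(w % n, sum(x) % q)] += 1
    qsvt_sizes[(w % r, sum(u) % 2, sum(x) % q)] += 1

bound6 = (q ** r - n) * q ** (n - r - 1) / n
bound7 = q ** (n - 1) / (2 * r)
print(max(rll_qvt_sizes.values()), bound6)
print(max(qsvt_sizes.values()), bound7)
```

The counters also show how unevenly the classes are populated, which is useful when choosing the parameters $(a, c)$ and $(d, e, f)$ that maximize the code size in practice.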

From those lemmas, we obtain a lower bound of cardinality of the constructed code.

###### Theorem 4

For all , the cardinality of satisfies