In communication and storage systems, symbols in a sequence may be inserted or deleted because of synchronization errors. Levenshtein  proved that VT codes (constructed by Varshamov and Tenengolts  for error correction on the Z-channel) correct a single insertion or deletion. This code has been extended to non-binary single insertion or deletion correction  and to the correction of two adjacent insertions or deletions . It has also been extended to binary  and non-binary multiple insertion or deletion correcting codes .
Cheng et al.  constructed a binary -burst insertion or deletion correcting code, which corrects any consecutive insertion or deletion of length . Schoeny et al.  improved this construction and showed that the resulting code has larger cardinality than the code constructed by Cheng et al. These constructions have been extended to permutation codes [9, 10]. Recently, Schoeny et al.  gave a non-binary -burst insertion or deletion correcting code.
In this paper, we construct a non-binary -burst insertion or deletion correcting code with a larger cardinality. The key idea of the paper is to investigate the correcting capability of the non-binary shifted VT code, which is a component of non-binary -burst insertion or deletion correcting codes. We also derive a lower bound on the number of codewords of the constructed non-binary -burst insertion or deletion correcting code. Moreover, we show an asymptotic upper bound on the cardinality of the best non-binary -burst insertion or deletion correcting code.
II Preliminaries and Previous Works
This section briefly introduces previous works, i.e., the insertion/deletion codes given in [2, 3, 7, 8, 11] (Section II-A gives the definition of the notation “insertion/deletion”). We use the notation given in this section throughout the paper.
II-A Notation and Definition
For integers , define and , where denotes the set of integers. For a sequence , we denote the subsequence of whose -th symbol is deleted by , i.e., . In this case, we say that a single deletion has occurred in . If is an output of the single insertion channel with an input , there exists such that . For a sequence , a symbol , and an integer , we denote .
A run of length of a sequence is a subsequence of such that , (for ), and (for ).
For a sequence , is a run of length 2 and we have
From this, we see that we receive the same subsequences if a symbol in the same run is deleted under the single deletion channel. In other words, in the single deletion channel, even if one can correct a deletion, one cannot detect which symbol in a run is deleted.
Similarly, we get
Hence, in the single insertion channel, we receive the same sequence if the symbol is inserted into a run of .
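The observation that deletions inside one run are indistinguishable can be checked directly. The following Python sketch is illustrative only: the helper names `run_intervals` and `deletion_classes` and the sample sequence are assumptions introduced here, not taken from the paper, and positions are numbered from 0.

```python
from itertools import groupby

def run_intervals(x):
    """Return the (start, end) index pairs of the maximal runs of x (0-indexed, inclusive)."""
    intervals, start = [], 0
    for _, group in groupby(x):
        length = len(list(group))
        intervals.append((start, start + length - 1))
        start += length
    return intervals

def deletion_classes(x):
    """Group the deletion positions of x by the subsequence they produce."""
    classes = {}
    for i in range(len(x)):
        y = x[:i] + x[i + 1:]  # delete the i-th symbol
        classes.setdefault(y, []).append(i)
    return sorted(classes.values())

x = (0, 1, 1, 0, 0, 0, 1)   # hypothetical sample sequence
print(run_intervals(x))     # one interval per run
print(deletion_classes(x))  # one class of positions per run
```

Each class of deletion positions coincides with one run, so the number of distinct outputs of the single deletion channel equals the number of runs of the input.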
We refer to exactly consecutive deletions as a single -burst deletion. We define . In words, when the consecutive symbols of , namely from the -th to the -th, are deleted, we denote the result by . If is an output of the single -insertion channel with an input , there exists an integer such that .
A code which corrects single -burst deletions (resp. insertions) is called a single -burst deletion (resp. insertion) correcting code. A code is -burst insertion/deletion correcting if it corrects single -burst insertions or single -burst deletions. Similarly, we define the terms: single deletion correcting code, single insertion correcting code, and single insertion/deletion correcting code.
The following theorem given in  shows a relationship between single -burst deletion correcting codes and single -burst insertion correcting codes.
[8, Theorem 1] A code is a -burst deletion correcting code if and only if it is a -burst insertion correcting code.
This theorem holds not only in the binary case but also in the non-binary case. Hence, to prove that a code is a -burst insertion/deletion correcting code, it suffices to prove that it is a -burst deletion correcting code.
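The equivalence can be sanity-checked by brute force on short binary sequences. The sketch below is a minimal illustration, not the proof in [8]: the function names, the choice of length 4 and burst length 2, and the use of pairwise disjoint output sets as the definition of "correcting" are assumptions made for this example.

```python
from itertools import combinations, product

def burst_deletion_ball(x, b):
    """All outputs of a single b-burst deletion applied to the tuple x."""
    return {x[:i] + x[i + b:] for i in range(len(x) - b + 1)}

def burst_insertion_ball(x, b, q=2):
    """All outputs of a single b-burst insertion of q-ary symbols into the tuple x."""
    return {x[:i] + ins + x[i:]
            for i in range(len(x) + 1)
            for ins in product(range(q), repeat=b)}

def corrects(code, ball):
    """A code corrects the error type iff the output sets of its codewords are pairwise disjoint."""
    return all(ball(u).isdisjoint(ball(v)) for u, v in combinations(code, 2))

n, b = 4, 2  # hypothetical small parameters
words = list(product(range(2), repeat=n))
agree = all(
    corrects([u, v], lambda w: burst_deletion_ball(w, b))
    == corrects([u, v], lambda w: burst_insertion_ball(w, b))
    for u, v in combinations(words, 2)
)
print(agree)
```

Since a code is correcting exactly when every pair of its codewords is distinguishable, checking all two-word codes covers every code of this length.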
II-B Single Insertion/Deletion Correcting Code
The VT code is a single insertion/deletion correcting code. The VT code is defined by the code length and as follows:
Let be the indicator function, which equals 1 if the proposition is true and equals 0 otherwise. A mapping of a -ary sequence to a binary sequence is defined by
We refer to the sequence as the ascent sequence for . The non-binary VT code is a non-binary single insertion/deletion correcting code defined by the code length , and as follows:
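For the binary VT code of this subsection, single-deletion decoding can be sketched as follows. This is a minimal sketch under stated assumptions: it uses the standard VT checksum (the sum of i·x_i over positions i = 1, ..., n, taken modulo n + 1, equals a) and Levenshtein's well-known decoding rule; the function names are hypothetical, and the non-binary code is not covered.

```python
from itertools import product

def vt_syndrome(x):
    """Checksum sum_i i*x_i mod (n+1), with positions numbered from 1."""
    return sum(i * bit for i, bit in enumerate(x, start=1)) % (len(x) + 1)

def vt_code(n, a):
    """Enumerate the binary VT code of length n with checksum a (small n only)."""
    return [x for x in product((0, 1), repeat=n) if vt_syndrome(x) == a]

def vt_decode_deletion(y, n, a):
    """Recover the codeword of vt_code(n, a) from y, a tuple of length n-1."""
    w = sum(y)  # weight of the received word
    deficiency = (a - sum(i * bit for i, bit in enumerate(y, start=1))) % (n + 1)
    if deficiency <= w:
        # A 0 was deleted; it had `deficiency` ones to its right.
        p, ones = len(y), 0
        while ones < deficiency:
            p -= 1
            ones += y[p]
        return y[:p] + (0,) + y[p:]
    # A 1 was deleted; it had deficiency - w - 1 zeros to its left.
    p, zeros = 0, 0
    while zeros < deficiency - w - 1:
        zeros += 1 - y[p]
        p += 1
    return y[:p] + (1,) + y[p:]

# Exhaustive check for a small length: every deletion is corrected.
n = 6
ok = all(vt_decode_deletion(x[:i] + x[i + 1:], n, a) == x
         for a in range(n + 1) for x in vt_code(n, a) for i in range(n))
print(ok)
```

The decoder recovers the codeword but not the exact deletion position, because any position within the same run yields the same received word.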
II-C Binary Burst Insertion/Deletion Correcting Code
For simplicity, we assume that is divisible by . The matrix representation for a sequence is given as
We denote the -th row of this matrix by .
Consider the 3-burst deletion channel with an input . Assume that the output is . Then, these matrix representations are
From these, we see that results from a single deletion in . Moreover, we see that when the -th entry of is deleted, the -th or -th entry is deleted for .
From the above example, to recover from a single -burst deletion, one needs to correct a single deletion in each row of the matrix representation. Moreover, once the position of the deletion in the first row is detected, one needs to correct a deletion within two given adjacent positions in each of the other rows.
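The effect of a burst deletion on the rows of the matrix representation can be reproduced with a short script. The sketch below assumes the matrix is filled column by column, so that row r holds every b-th symbol starting from the r-th; the helper names, the 0-indexed positions, and the sample sequence are assumptions made for this illustration.

```python
def rows(x, b):
    """Rows of the b x (n/b) matrix representation, filled column by column."""
    return [x[r::b] for r in range(b)]

def burst_delete(x, i, b):
    """Delete b consecutive symbols of x starting at 0-indexed position i."""
    return x[:i] + x[i + b:]

def deleted_column(i, r, b):
    """Column removed from row r when the burst starts at 0-indexed position i."""
    return i // b if r >= i % b else i // b + 1

x = tuple(range(12))  # distinct symbols make the deleted entries visible
b, i = 3, 4
y = burst_delete(x, i, b)
for r in range(b):
    print(rows(x, b)[r], "->", rows(y, b)[r], "lost column", deleted_column(i, r, b))
```

Every row loses exactly one entry, and the lost columns differ by at most one across rows, which is why locating the deletion in the first row narrows the search in the other rows to two adjacent positions.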
The code in [7, Sect. III-C] embeds a marker in the first row of the matrix representation to detect the deletion position and employs substitution-transposition codes  in the other rows to correct a single deletion within two given adjacent positions. Here, note that we can regard the marker as a codeword of a VT code with maximum run length 1.
Schoeny et al.  improved the construction of this code. The first row of the code in  is a run-length-limited VT code, which is a VT code with maximum run length at most . From Remark 1, one can locate the deletion position within an interval of length at most . The other rows of the code are shifted-VT codes, which correct a single deletion for a given adjacent positions. Let be the set of sequences in with maximum run length at most . Then, the run-length-limited VT code and shifted-VT (SVT) code are defined as
for and . By using those codes, the binary single -burst correcting code is constructed as:
II-D Decoding Algorithm for SVT Codes
In this section, we briefly introduce the decoding algorithm for the SVT codes. The details of decoding algorithms are in [8, Appendix C].
Firstly, we consider the case of deletion correction. Assume that we employ . Let be the received sequence. Denote the first possible deletion position by . The inputs of the deletion decoder are , , and . We denote the estimated codeword by . Let be the interval of the run which contains the inserted symbol. The output of the deletion decoder is the pair of the estimated codeword and the interval . We denote the deletion correcting algorithm for the SVT code by . For example, we have .
Secondly, we consider the case of insertion correction. Let be the received sequence. Denote the first possible insertion position by . We denote the estimated codeword by . Let be the interval of the run which contains the deleted symbol. We denote the insertion correcting algorithm for the SVT code by . For example, we have . The notations and will be used in Section III-C.
II-E Non-binary Burst Insertion/Deletion Correcting Code
This section introduces the non-binary -burst insertion/deletion correcting code given in .
By a straightforward construction, one obtains the non-binary -burst insertion/deletion correcting code. Similar to the construction of non-binary VT code, we employ the mapping given in Sect. II-B. The non-binary run-length-limited VT code and the non-binary SVT code are defined as:
where , and . Schoeny et al.  showed the following lemma:
Lemma 1 ( [11, Lemma 1] )
For all , and , the code corrects a single insertion/deletion for a given adjacent positions.
As a result, they constructed the following non-binary single -burst insertion/deletion correcting code:
III Main Results
This section constructs a non-binary burst insertion/deletion correcting code with a large cardinality. Section III-A gives the main theorem and construction of the code. Section III-B proves that the code is a non-binary burst insertion/deletion correcting code. Section III-C provides the decoding algorithm for the code. Section IV will evaluate the asymptotic cardinality of the code and show a numerical example.
III-A Code Construction and Main Theorem
We investigate the correcting capability of the non-binary SVT code. As a result, we find that the code corrects a single insertion/deletion over a longer range, as stated in the following theorem.
For all , and , the code corrects a single insertion/deletion for a given adjacent positions.
Based on this result, we construct a code:
Moreover, we show the following theorem.
The code corrects a single -burst insertion/deletion.
III-B Proof of Theorems
Denote . Then, or holds.
Denote . Obviously, it holds that for and for . Hence, we will show that or holds.
Firstly, we assume . Then, holds. Since , holds. Hence, holds. Secondly, we assume and . Then, and hold. If , equals 1; otherwise, equals 0. Hence, or holds.
The other cases are proved in a similar way.
Similarly, for an insertion, we obtain the following lemma.
Denote . Then, or holds, where equals or .
The following lemma is used for the proof of Theorem 2.
Consider such that and for a pair of integers . Denote , , and . Then, the following hold:
1) If , then there exist such that
2) For a pair of integers , if and , there exist such that .
From Lemma 2, we have or , and or . Hence, holds for a pair of integers . We have
where the first equivalence follows from and the second equation follows from . Since , we get
From and , we have
Firstly, we prove case 1), i.e., the case of . Let us hypothesize . From (6), we get . Hence, we have
Note that both ends of these equations are . Hence, these give
From this equation and (5), we get . This contradicts . Next, let us hypothesize . Similarly, we get
Thus, we obtain the case 1).
Secondly, we prove case 2), i.e., the case of . From the assumption, we have . Now, let us hypothesize for all . Suppose . Then, for all . Hence, we have
where equality with label holds if the condition is satisfied (e.g., equality labeled with holds if ). The above gives
for all pairs of . Combining this and (5), we get . This contradicts . Next, suppose . Then, for all . Similarly, we get
This leads to the contradiction . Thus, we obtain case 2).
Now we will prove the two theorems.
Proof of Theorem 2: Let us hypothesize that there exists a pair of codewords such that and for two integers and . Here, without loss of generality, we assume . Denote and . From Lemma 2, and hold for a pair of integers . We have
where the first equivalence follows from , i.e., , and the second equation follows from . Hence, we get
Recall that . Combining the above with (11), we obtain for
However, these contradict , which follows from , i.e., . Secondly, we assume . Then, case 2) of Lemma 4 gives
Since and , we have Combining the above and (11), we obtain
Similarly, these contradict . Hence, we obtain the theorem.
III-C Decoding Algorithm
Due to space limitations, we only describe the insertion/deletion correcting algorithm for the non-binary SVT code. In other words, we omit the decoding algorithm for .
We denote the remainder when is divided by by . Denote the transmitted sequence by . Algorithms 1 and 2 describe the deletion and insertion correcting algorithms for the SVT code, respectively. The inputs of those algorithms are the received sequence , the code parameters , and the first possible deletion/insertion position . The output of those algorithms is the estimated sequence.
In Algorithm 1, stands for the deleted symbol and represents the position of the deleted symbol. Step 1 calculates the deleted symbol since . Step 2 checks whether the -th symbol is deleted. If the condition of Step 2 is not satisfied, then the deletion position is in . In such a case, from Lemma 2, equals with an integer . Hence, we obtain as in Step 5. The algorithm searches for the position of the deleted symbol in Steps 7-20.
In Algorithm 2, stands for the inserted symbol and represents the position of the inserted symbol. Step 1 calculates the inserted symbol since . Step 2 checks whether the -th symbol is inserted. If the condition of Step 2 is not satisfied, then the insertion position is in . In such a case, from Lemma 3, equals with an integer and . Hence, we obtain as in Step 5. The algorithm searches for the position of the inserted symbol in Steps 7-11.
IV The Number of Codewords
This section evaluates the gap between the lower bound on the cardinality of the constructed code and the upper bound on the cardinality of arbitrary non-binary -burst insertion/deletion correcting codes. Moreover, we evaluate the number of codewords of the SVT codes by a numerical example, as evidence that the code in (3) has a larger cardinality.
IV-A Lower Bound on the Cardinality of the Constructed Code
In a similar way to [8, Lemma 2], we have the following lemma.
The following holds
By the pigeonhole principle and this lemma, we get the following two lemmas.
The cardinality of the non-binary run-length-limited VT code is lower bounded as:
The cardinality of the non-binary SVT code is lower bounded as:
From those lemmas, we obtain a lower bound of cardinality of the constructed code.
For all , the cardinality of satisfies
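The pigeonhole step behind such lower bounds can be illustrated on the simpler binary VT code: the checksum partitions all binary sequences of length n into n + 1 classes, so the largest class contains at least 2^n / (n + 1) sequences. The sketch below is only an analogy for the argument, assuming the standard binary VT checksum; the function name and the choice n = 10 are hypothetical, and it does not evaluate the bounds for the constructed non-binary code.

```python
from collections import Counter
from itertools import product

def vt_class_sizes(n):
    """Sizes of the n+1 classes induced by the checksum sum_i i*x_i mod (n+1)."""
    counts = Counter(
        sum(i * bit for i, bit in enumerate(x, start=1)) % (n + 1)
        for x in product((0, 1), repeat=n)
    )
    return [counts[a] for a in range(n + 1)]

n = 10  # hypothetical small length
sizes = vt_class_sizes(n)
pigeonhole_bound = -(-2 ** n // (n + 1))  # ceiling of 2^n / (n + 1)
print(max(sizes), pigeonhole_bound)
```

The same reasoning applies to any family of codes obtained by partitioning a set of sequences according to checksum values: the best choice of parameters yields at least the average class size.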