Asymptotically Optimal Codes Correcting Fixed-Length Duplication Errors in DNA Storage Systems

08/30/2018
by Mladen Kovačević, et al.
A (tandem) duplication of length k is an insertion of an exact copy of a substring of length k next to its original position. This and related types of impairments are relevant in modeling communication in the presence of synchronization errors, as well as in several information storage applications. We demonstrate that Levenshtein's construction of binary codes correcting insertions of zeros is, with minor modifications, also applicable to channels with arbitrary alphabets and with duplication errors of arbitrary (but fixed) length k. Furthermore, we derive bounds on the cardinality of optimal q-ary codes correcting up to t duplications of length k, and establish the following corollaries in the asymptotic regime of growing block-length: 1) the presented family of codes is optimal for every q, t, k, in the sense of the asymptotic scaling of code redundancy; 2) the upper bound, when specialized to q = 2, k = 1, improves upon Levenshtein's bound for every t ≥ 3; 3) the bounds coincide for t = 1, thus yielding the exact asymptotic behavior of the size of optimal single-duplication-correcting codes.
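
The definition of a tandem duplication of length k can be made concrete with a small sketch. The code below is an illustration only and is not taken from the paper: the function names `tandem_duplicate` and `descendants` and the example strings are hypothetical. It applies one duplication at a chosen position and enumerates, by brute force, every string reachable from an input by at most t duplications of length k.

```python
def tandem_duplicate(x: str, i: int, k: int) -> str:
    """Insert a copy of the substring x[i:i+k] right after its original position.

    Hypothetical helper for illustration; i is a 0-based start index.
    """
    if k <= 0 or i < 0 or i + k > len(x):
        raise ValueError("substring x[i:i+k] must lie inside x and k must be positive")
    return x[:i + k] + x[i:i + k] + x[i + k:]


def descendants(x: str, k: int, t: int) -> set[str]:
    """Return all strings obtainable from x by at most t duplications of length k.

    Brute-force enumeration, intended only for short strings and small t.
    """
    reached = {x}
    frontier = {x}
    for _ in range(t):
        next_frontier = set()
        for s in frontier:
            # Try every possible start position of the duplicated substring.
            for i in range(len(s) - k + 1):
                next_frontier.add(tandem_duplicate(s, i, k))
        reached |= next_frontier
        frontier = next_frontier
    return reached


if __name__ == "__main__":
    # Duplicating the length-2 substring "CG" of "ACGT" in place.
    print(tandem_duplicate("ACGT", 1, 2))
    # With k = 1, duplicating any symbol within a run gives the same string,
    # so "0011" has only two distinct single-duplication descendants.
    print(sorted(descendants("0011", k=1, t=1)))
```

In the binary example with k = 1, a duplication only lengthens one of the runs of identical symbols, which is presumably why the abstract can relate this setting to codes correcting insertions of zeros. Under the usual error-correction notion, a set of codewords would correct up to t duplications when the sets returned by `descendants` for distinct codewords do not overlap; this check is stated here as an assumption about the standard definition, not as the paper's construction.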

research
08/18/2020

Error-correcting Codes for Noisy Duplication Channels

Because of its high data density and longevity, DNA is emerging as a pro...
research
09/08/2019

Explicit Constructions of Two-Dimensional Reed-Solomon Codes in High Insertion and Deletion Noise Regime

Insertion and deletion (insdel for short) errors are synchronization err...
research
07/05/2021

Improved Asymptotic Bounds for Codes Correcting Insertions and Deletions

This paper studies the cardinality of codes correcting insertions and de...
research
01/15/2018

A Finite Block Length Achievability Bound for Low Probability of Detection Communication

Low probability of detection (or covert) communication refers to the sce...
research
08/03/2022

Low-redundancy codes for correcting multiple short-duplication and edit errors

Due to its higher data density, longevity, energy efficiency, and ease o...
research
03/11/2019

Clustering-Correcting Codes

A new family of codes, called clustering-correcting codes, is presented ...
research
02/05/2021

Function-Correcting Codes

Motivated by applications in machine learning and archival data storage,...
