Capacity-Approaching Constrained Codes with Error Correction for DNA-Based Data Storage

01/09/2020
by Tuan Thanh Nguyen, et al.

We propose coding techniques that limit the length of homopolymer runs, satisfy the GC-content constraint, and correct a single edit error in strands of nucleotides in DNA-based data storage systems. In particular, for given ℓ, ϵ > 0, we propose simple and efficient encoders/decoders that transform binary sequences into DNA base sequences (codewords), namely sequences of the symbols A, T, C and G, that satisfy the following properties: (i) Runlength constraint: the maximum homopolymer run in each codeword is at most ℓ, (ii) GC-content constraint: the GC-content of each codeword is within [0.5-ϵ, 0.5+ϵ], (iii) Error-correction: a single deletion, single insertion, or single substitution error in any codeword can be corrected. For practical values of ℓ and ϵ, we show that our encoders achieve much higher rates than existing results in the literature and approach the capacity. Our methods have low encoding/decoding complexity and limited error propagation.
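To make properties (i) and (ii) concrete, the following minimal sketch checks whether a given strand satisfies the runlength and GC-content constraints. It is only a constraint checker, not the encoders or decoders proposed in the paper, and the function names (`max_homopolymer_run`, `gc_content`, `satisfies_constraints`) are hypothetical names chosen for illustration.

```python
from itertools import groupby


def max_homopolymer_run(strand: str) -> int:
    """Length of the longest run of identical consecutive symbols (0 if empty)."""
    return max((sum(1 for _ in group) for _, group in groupby(strand)), default=0)


def gc_content(strand: str) -> float:
    """Fraction of symbols in the strand that are G or C."""
    if not strand:
        raise ValueError("strand must be non-empty")
    return sum(base in "GC" for base in strand) / len(strand)


def satisfies_constraints(strand: str, max_run: int, eps: float) -> bool:
    """Check property (i), run <= max_run, and property (ii), |GC - 0.5| <= eps."""
    return (
        max_homopolymer_run(strand) <= max_run
        and abs(gc_content(strand) - 0.5) <= eps
    )


if __name__ == "__main__":
    # Illustrative strands only; they are not taken from the paper.
    print(satisfies_constraints("ACGTACGT", max_run=3, eps=0.1))
    print(satisfies_constraints("AAAACGTA", max_run=3, eps=0.1))
```

Here `max_run` plays the role of ℓ and `eps` the role of ϵ. Property (iii), single-edit error correction, is not modeled in this sketch.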


research
07/01/2023

Codes with Biochemical Constraints and Single Error Correction for DNA-Based Data Storage

In DNA-based data storage, DNA codes with biochemical constraints and er...
research
06/24/2019

Survey of Information Encoding Techniques for DNA

Key to DNA storage is encoding the information to a sequence of nucleoti...
research
02/01/2019

Some Enumeration Problems in the Duplication-Loss Model of Genome Rearrangement

Tandem-duplication-random-loss (TDRL) is an important genome rearrangeme...
research
12/31/2019

DNA Linear Block Codes: Generation, Error-detection and Error-correction of DNA Codeword

In modern age, the increasing complexity of computation and communicatio...
research
04/07/2023

Iterative Soft Decoding Algorithm for DNA Storage Using Quality Score and Redecoding

Ever since deoxyribonucleic acid (DNA) was considered as a next-generati...
research
04/06/2022

SPIDER-WEB enables stable, repairable, and encryptible algorithms under arbitrary local biochemical constraints in DNA-based storage

DNA has been considered as a promising medium for storing digital inform...
research
02/27/2015

Error-Correcting Factorization

Error Correcting Output Codes (ECOC) is a successful technique in multi-...
