Coding over Sets for DNA Storage

01/15/2018
by Andreas Lenz, et al.
In this paper, we study error-correcting codes for the storage of data in synthetic DNA. We investigate a storage model in which a data set is represented by an unordered set of M sequences, each of length L. Errors in this model are losses of whole sequences and point errors inside the sequences, such as insertions, deletions, and substitutions. We propose code constructions that correct such errors and that can be encoded and decoded efficiently. By deriving upper bounds on the cardinalities of these codes using sphere-packing arguments, we show that many of our codes are close to optimal.
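The storage model described above can be illustrated with a small simulation. The following is a minimal sketch, not the paper's construction: it assumes the alphabet {A, C, G, T}, restricts point errors to substitutions, and applies at most one substitution per affected sequence. The function names `random_codeword_set` and `channel` and their parameters are hypothetical, chosen only for this illustration.

```python
import random

ALPHABET = "ACGT"  # assumed quaternary DNA alphabet


def random_codeword_set(M, L, rng):
    """Draw M distinct length-L sequences; the stored object is an unordered set."""
    if M > len(ALPHABET) ** L:
        raise ValueError("cannot draw M distinct sequences of length L")
    words = set()
    while len(words) < M:
        words.add("".join(rng.choice(ALPHABET) for _ in range(L)))
    return frozenset(words)


def channel(stored, losses, substitutions, rng):
    """Lose `losses` whole sequences, then substitute one symbol in each of
    `substitutions` different surviving sequences (illustrative error model)."""
    survivors = rng.sample(sorted(stored), len(stored) - losses)
    hit = set(rng.sample(range(len(survivors)), substitutions))
    received = set()
    for i, seq in enumerate(survivors):
        if i in hit:
            pos = rng.randrange(len(seq))
            new_symbol = rng.choice([a for a in ALPHABET if a != seq[pos]])
            seq = seq[:pos] + new_symbol + seq[pos + 1:]
        received.add(seq)
    # The output is again an unordered set: ordering is not preserved, and an
    # erroneous sequence may coincide with another one and merge with it.
    return frozenset(received)


if __name__ == "__main__":
    rng = random.Random(0)
    stored = random_codeword_set(M=8, L=10, rng=rng)
    received = channel(stored, losses=2, substitutions=1, rng=rng)
    print(len(stored), len(received), len(stored & received))
```

Because the receiver sees only a set, it cannot tell which stored sequence a received sequence came from, and it observes at most M minus the number of lost sequences. A code for this model has to tolerate both effects at once.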

research
07/08/2018

Bounds and Constructions for Multi-Symbol Duplication Error Correcting Codes

In this paper, we study codes correcting t duplications of ℓ consecutive...
research
01/21/2019

Anchor-Based Correction of Substitutions in Indexed Sets

Motivated by DNA-based data storage, we investigate a system where digit...
research
08/11/2023

Embracing Errors is More Efficient than Avoiding Them through Constrained Coding for DNA Data Storage

DNA is an attractive medium for digital data storage. When data is store...
research
01/26/2022

Adversarial Torn-paper Codes

This paper studies the adversarial torn-paper channel. This problem is m...
research
02/06/2019

Restriction enzymes use a 24 dimensional coding space to recognize 6 base long DNA sequences

Restriction enzymes recognize and bind to specific sequences on invading...
research
10/21/2022

Non-binary Codes for Correcting a Burst of at Most t Deletions

The problem of correcting deletions has received significant attention, ...
research
01/15/2021

Improved Rank-Modulation Codes for DNA Storage with Shotgun Sequencing

We study permutations over the set of ℓ-grams, that are feasible in the ...
