Clustering-Correcting Codes

03/11/2019
by Tal Shinkar, et al.

This paper presents a new family of codes, called clustering-correcting codes, motivated by the special structure of data stored in DNA-based storage systems. Data in these systems takes the form of unordered sequences, also called strands. Every strand is synthesized thousands to millions of times, and some of these copies are read back during sequencing. Because the strands are unordered, an important task in the decoding process is to place them in their correct order, which is usually accomplished by allocating part of each strand to an index. However, errors in the index field can destroy important information about the order of the strands. Clustering-correcting codes ensure that if the distance between the index fields of two strands is small, then the distance between their data fields is large. It is shown how this property makes it possible to place the strands together in their correct clusters even in the presence of errors. We present lower and upper bounds on the size of clustering-correcting codes, as well as an explicit construction of these codes that uses only a single bit of redundancy.
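
The defining property can be illustrated with a small check. The sketch below is not the paper's construction. It only tests whether a given set of strands satisfies the constraint described in the abstract. It assumes that "distance" means Hamming distance and that each strand is an (index, data) pair of equal-length strings. The names `index_radius` and `min_data_distance` are hypothetical parameters introduced here for illustration.

```python
from itertools import combinations


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def satisfies_clustering_constraint(strands, index_radius, min_data_distance):
    """Return True if every pair of strands whose index fields are within
    index_radius of each other has data fields at least min_data_distance apart.

    strands: list of (index, data) string pairs (an assumed representation).
    """
    for (index_a, data_a), (index_b, data_b) in combinations(strands, 2):
        close_indices = hamming(index_a, index_b) <= index_radius
        close_data = hamming(data_a, data_b) < min_data_distance
        if close_indices and close_data:
            return False
    return True


# Indices "00" and "01" are close, and their data fields are far apart.
good = [("00", "000000"), ("01", "111000"), ("11", "000111")]
# Indices "00" and "01" are close, but their data fields differ in one position.
bad = [("00", "000000"), ("01", "000001"), ("11", "000111")]

print(satisfies_clustering_constraint(good, index_radius=1, min_data_distance=3))
print(satisfies_clustering_constraint(bad, index_radius=1, min_data_distance=3))
```

In the second set, a single error in the index of the first strand could make it indistinguishable from the second strand by index alone, and the nearly identical data fields would not help separate them. That is the situation the constraint rules out.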

