Sequence-Subset Distance and Coding for Error Control in DNA-based Data Storage

09/16/2018
by Wentu Song, et al.

The process of DNA-based data storage (DNA storage for short) can be mathematically modelled as a communication channel, termed the DNA storage channel, whose inputs and outputs are unordered sets of sequences. To design error-correcting codes for the DNA storage channel, we introduce a new metric, termed the sequence-subset distance, which generalizes the Hamming distance to a distance function defined between any two sets of unordered vectors and provides a uniform framework for designing such codes. We further introduce a family of error-correcting codes for DNA storage, referred to as sequence-subset codes, and show that the error-correcting ability of these codes is completely determined by their minimum distance. We derive upper bounds on the size of sequence-subset codes, including a Singleton-like bound and a Plotkin-like bound. We also propose constructions, which imply lower bounds on the size of such codes.
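The abstract does not spell out the metric, so the following is only an illustrative sketch of how a Hamming-style distance between two unordered sets of equal-length sequences could be computed. It assumes, as a labeled assumption rather than a statement of the paper's definition, that the smaller set is matched one-to-one into the larger set so as to minimize the total Hamming distance, and that each unmatched sequence costs the full sequence length `L`. The names `hamming` and `sequence_subset_distance` are hypothetical, and the brute-force search is only practical for very small sets.

```python
from itertools import permutations


def hamming(x, y):
    """Number of positions at which two equal-length sequences differ."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    return sum(a != b for a, b in zip(x, y))


def sequence_subset_distance(X1, X2, L):
    """Illustrative set-to-set distance (ASSUMED definition, see lead-in).

    X1, X2: collections of distinct sequences, each of length L.
    The smaller set is injectively matched into the larger one; the cost is
    the minimum total Hamming distance over all such matchings, plus L for
    every sequence of the larger set that is left unmatched.
    """
    small, large = sorted((list(X1), list(X2)), key=len)
    # Assumed penalty: each unmatched sequence counts as L symbol errors.
    penalty = L * (len(large) - len(small))
    # Brute force: try every ordered choice of len(small) sequences
    # from the larger set as matching partners.
    best = min(
        sum(hamming(s, t) for s, t in zip(small, chosen))
        for chosen in permutations(large, len(small))
    )
    return best + penalty


X1 = {"ACGT", "TTTT"}
X2 = {"ACGA", "TTTT", "GGGG"}
d = sequence_subset_distance(X1, X2, L=4)
```

Under this assumed definition, two singleton sets are at distance equal to the ordinary Hamming distance between their single sequences, which is the sense in which such a metric would generalize the Hamming distance.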


research
09/16/2018

Sequence-Subset Distance and Coding for Error Control in DNA Data Storage

The process of DNA data storage can be mathematically modelled as a comm...
research
09/18/2020

Improved Coding over Sets for DNA-Based Data Storage

Error-correcting codes over sets, with applications to DNA storage, are ...
research
04/20/2023

DNA-Correcting Codes: End-to-end Correction in DNA Storage Systems

This paper introduces a new solution to DNA storage that integrates all ...
research
05/17/2023

Error-Correcting Codes for Nanopore Sequencing

Nanopore sequencers, being superior to other sequencing technologies for...
research
01/04/2019

Efficient and Explicit Balanced Primer Codes

To equip DNA-based data storage with random-access capabilities, Yazdi e...
research
05/11/2022

DNA data storage, sequencing data-carrying DNA

DNA is a leading candidate as the next archival storage media due to its...
research
05/12/2023

Deletion Correcting Codes for Efficient DNA Synthesis

The synthesis of DNA strands remains the most costly part of the DNA sto...
