Unique Reconstruction of Coded Strings from Multiset Substring Spectra

04/12/2018
by Ryan Gabrys, et al.

The problem of reconstructing strings from their substring spectra has a long history; in its simplest form, it asks under which conditions the spectrum uniquely determines the string. We study coded string reconstruction from multiset substring spectra, where the strings are restricted to lie in a codebook. In particular, we consider binary codebooks that allow for unique string reconstruction and propose a new method, termed repeat replacement, to create such codebooks. Our contributions include algorithmic solutions for repeat replacement and constructive redundancy bounds for the underlying coding schemes. We also consider extensions of the problem to noisy settings in which substrings are compromised by burst and random errors. The study is motivated by applications in DNA-based data storage systems that use high-throughput readout sequencers.
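To make the uniqueness question concrete, the following minimal Python sketch assumes that the multiset substring spectrum of a string is the multiset of all its substrings of one fixed length `ell`; the abstract does not spell out the definition, so this is an assumption. The function names `substring_spectrum` and `strings_with_same_spectrum` are hypothetical and not taken from the paper, and the brute-force search only illustrates the problem statement; it is not the repeat replacement method.

```python
from collections import Counter
from itertools import product


def substring_spectrum(s, ell):
    """Multiset of all length-ell substrings of s (assumed definition)."""
    return Counter(s[i:i + ell] for i in range(len(s) - ell + 1))


def strings_with_same_spectrum(s, ell):
    """Brute force: all binary strings of len(s) sharing the spectrum of s."""
    target = substring_spectrum(s, ell)
    candidates = ("".join(bits) for bits in product("01", repeat=len(s)))
    return sorted(c for c in candidates if substring_spectrum(c, ell) == target)


# "0110" shares its length-2 spectrum {01, 11, 10} with two other strings,
# so the spectrum alone cannot identify it.
ambiguous = strings_with_same_spectrum("0110", 2)

# "0011" is the only length-4 binary string with spectrum {00, 01, 11}.
unique = strings_with_same_spectrum("0011", 2)

print(ambiguous)
print(unique)
```

Under this assumed definition, a codebook permitting unique reconstruction could contain "0011" but at most one of the three strings that share the spectrum of "0110". The exhaustive search is exponential in the string length and is meant only for small examples.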


research
10/21/2020

Reconstructing Mixtures of Coded Strings from Prefix and Suffix Compositions

The problem of string reconstruction from substring information has foun...
research
08/31/2022

Reconstruction of a Single String from a Part of its Composition Multiset

Motivated by applications in polymer-based data storage, we study the pr...
research
10/05/2021

Reconstruction of Sets of Strings from Prefix/Suffix Compositions

The problem of reconstructing strings from substring information has fou...
research
03/02/2020

Coding for Polymer-Based Data Storage

Motivated by polymer-based data-storage platforms that use chains of bin...
research
01/21/2022

Insertion and Deletion Correction in Polymer-based Data Storage

Synthetic polymer-based storage seems to be a particularly promising can...
research
08/18/2018

The Capacity of Some Pólya String Models

We study random string-duplication systems, which we call Pólya string m...
research
01/14/2020

Mass Error-Correction Codes for Polymer-Based Data Storage

We consider the problem of correcting mass readout errors in information...
