Reconstruction of Sets of Strings from Prefix/Suffix Compositions

10/05/2021
by Ryan Gabrys, et al.

The problem of reconstructing strings from substring information has found many applications due to its importance in genomic data sequencing and in DNA- and polymer-based data storage. One practically important and challenging paradigm requires reconstructing mixtures of strings from the union of the compositions of their prefixes and suffixes, as generated by mass spectrometry devices. We describe new coding methods that allow for unique joint reconstruction of subsets of strings selected from a code, and we provide upper and lower bounds on the asymptotic rate of the underlying codebooks. Our code constructions combine properties of binary B_h and Dyck strings and can be extended to accommodate missing substrings in the pool. As auxiliary results, we obtain the first known bounds on binary B_h sequences for arbitrary even parameters h, and we describe various error models inherent to mass spectrometry analysis. This paper contains a correction of the prior work by the authors, published in [24]. In particular, the bounds on the prefix codes are now corrected.
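To make the measurement model concrete, here is a minimal sketch in Python, not taken from the paper. It assumes that the composition of a binary string is its pair of counts (number of 0s, number of 1s), and that the observed pool is the multiset union of the compositions of all prefixes and suffixes, with no label saying which string or which end a composition came from. The function names `composition` and `prefix_suffix_compositions` are hypothetical.

```python
from collections import Counter

def composition(s):
    """Assumed definition: (number of 0s, number of 1s) in binary string s."""
    ones = s.count("1")
    return (len(s) - ones, ones)

def prefix_suffix_compositions(strings):
    """Multiset union of prefix and suffix compositions over all strings.

    Assumption: the pool records only compositions, not which string
    or which end (prefix or suffix) produced them.
    """
    pool = Counter()
    for s in strings:
        for k in range(1, len(s) + 1):
            pool[composition(s[:k])] += 1   # prefix of length k
            pool[composition(s[-k:])] += 1  # suffix of length k
    return pool

# Two different strings can yield the same pool, so uncoded strings
# are not always uniquely reconstructible from this information.
print(prefix_suffix_compositions(["10"]) == prefix_suffix_compositions(["01"]))
```

Under these assumptions the strings `10` and `01` produce identical pools, which illustrates why the strings must be drawn from a suitably designed code for unique reconstruction to be possible.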


research
10/21/2020

Reconstructing Mixtures of Coded Strings from Prefix and Suffix Compositions

The problem of string reconstruction from substring information has foun...
research
03/23/2023

On Constant-Weight Binary B_2-Sequences

Motivated by applications in polymer-based data storage we introduced th...
research
04/19/2019

Reconstruction and Error-Correction Codes for Polymer-Based Data Storage

Motivated by polymer-based data-storage platforms that use chains of bin...
research
04/12/2018

Unique Reconstruction of Coded Strings from Multiset Substring Spectra

The problem of reconstructing strings from their substring spectra has a...
research
01/20/2020

Uncertainty of Reconstructing Multiple Messages from Uniform-Tandem-Duplication Noise

A growing number of works have, in recent years, been concerned with in-...
research
12/23/2019

Reconstruction of Strings from their Substrings Spectrum

This paper studies reconstruction of strings based upon their substrings...
research
05/10/2023

Fundamental Limits of Multiple Sequence Reconstruction from Substrings

The problem of reconstructing a sequence from the set of its length-k su...
