Mass Error-Correction Codes for Polymer-Based Data Storage

by Ryan Gabrys, et al.

We consider the problem of correcting mass readout errors in information encoded in binary polymer strings. Our work builds on results for string reconstruction problems using composition multisets [Acharya et al., 2015] and the unique string reconstruction framework proposed in [Pattabiraman et al., 2019]. Binary polymer-based data storage systems [Laure et al., 2016] operate by designing two molecules of significantly different masses to represent the symbols {0,1} and perform readouts through noisy tandem mass spectrometry. Tandem mass spectrometers fragment the strings to be read into shorter substrings and only report their masses, often with errors due to imprecise ionization. Modeling the fragmentation process output in terms of composition multisets allows for designing asymptotically optimal codes capable of unique reconstruction and the correction of a single mass error [Pattabiraman et al., 2019] through the use of derivatives of Catalan paths. Nevertheless, no solutions for multiple-mass error correction are currently known. Our work addresses this issue by describing the first multiple-error correction codes that use the polynomial factorization approach for the Turnpike problem [Skiena et al., 1990] and the related factorization described in [Acharya et al., 2015]. Adding Reed-Solomon type coding redundancy into the corresponding polynomials allows for correcting t mass errors in polynomial time using t^2 log k redundant bits, where k is the information string length. The redundancy can be improved to log k + t. However, no decoding algorithm for this improved scheme that runs in time polynomial in both t and n, where n is the length of the coded string, is currently known.
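
To make the notion of a composition multiset concrete, the following is a minimal sketch, assuming the usual definition from the string reconstruction literature: for every contiguous substring, record only how many 0s and how many 1s it contains, and collect these pairs with multiplicity. The function name `composition_multiset` is illustrative and does not come from the paper. Since the two symbols are assumed to have clearly distinguishable masses, an error-free fragment mass determines such a pair.

```python
from collections import Counter


def composition_multiset(s: str) -> Counter:
    """Return the multiset of (zeros, ones) counts over all contiguous substrings of s.

    Illustrative helper (not from the paper); s is assumed to contain only '0' and '1'.
    """
    n = len(s)
    # prefix_ones[i] holds the number of '1' symbols in s[:i].
    prefix_ones = [0]
    for ch in s:
        prefix_ones.append(prefix_ones[-1] + (ch == "1"))

    comps = Counter()
    for i in range(n):
        for j in range(i + 1, n + 1):
            ones = prefix_ones[j] - prefix_ones[i]
            zeros = (j - i) - ones
            comps[(zeros, ones)] += 1
    return comps


if __name__ == "__main__":
    # A string and its reversal always share the same composition multiset,
    # which is one reason unique reconstruction requires coding.
    print(composition_multiset("001") == composition_multiset("100"))
```

A string of length n has n(n+1)/2 contiguous substrings, so the multiset always has that many elements counted with multiplicity. A mass error in this model corresponds to one of these entries being reported incorrectly.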


Reconstruction and Error-Correction Codes for Polymer-Based Data Storage

Motivated by polymer-based data-storage platforms that use chains of bin...

Insertion and Deletion Correction in Polymer-based Data Storage

Synthetic polymer-based storage seems to be a particularly promising can...

A New Algebraic Approach for String Reconstruction from Substring Compositions

We consider the problem of binary string reconstruction from the multise...

Coding for Polymer-Based Data Storage

Motivated by polymer-based data-storage platforms that use chains of bin...

Unique Reconstruction of Coded Strings from Multiset Substring Spectra

The problem of reconstructing strings from their substring spectra has a...

A QPTAS for Gapless MEC

We consider the problem Minimum Error Correction (MEC). A MEC instance i...

A Short Course on Error-Correcting Codes

When digital data are transmitted over a noisy channel, it is important ...
