Thermodynamically Stable DNA Code Design using a Similarity Significance Model

05/14/2020
by   Yixin Wang, et al.
DNA code design aims to generate a set of DNA sequences (codewords) that minimizes the likelihood of undesired hybridizations (cross-hybridization) among the sequences and their reverse-complement (RC) pairs. Inspired by the distinct hybridization affinities (or stabilities) of the perfect double helices formed by individual single-stranded DNA (ssDNA) sequences and their RC pairs, we propose a novel similarity significance (SS) model to measure the similarity between DNA sequences. In particular, instead of measuring the similarity of two sequences directly with a metric, the proposed SS model evaluates how much more likely undesirable hybridizations are to occur than desirable ones in the presence of the two measured sequences and their RC pairs. With this SS model, we construct thermodynamically stable DNA codes subject to several combinatorial constraints using a sorting-based algorithm. The proposed scheme yields DNA codes with larger code sizes and wider free energy gaps (hence better cross-hybridization performance) than existing methods.
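The abstract does not state the SS formula, the specific combinatorial constraints, or the details of the sorting-based algorithm, so the following is only a minimal sketch of the surrounding vocabulary. It assumes three constraints that are commonly used in DNA code design (a minimum Hamming distance between codewords, a minimum Hamming distance between each codeword and every RC, and a fixed GC content of half the length), and it uses a plain greedy pass as a stand-in for the paper's algorithm. All function names are hypothetical.

```python
# Illustrative sketch only: the constraints and the greedy pass are assumptions,
# not the paper's SS model or its sorting-based algorithm.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse-complement (RC) of a DNA sequence over A, C, G, T."""
    return seq.translate(COMPLEMENT)[::-1]


def hamming(a: str, b: str) -> int:
    """Count the positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def gc_count(seq: str) -> int:
    """Count the G and C bases in a sequence."""
    return seq.count("G") + seq.count("C")


def satisfies_constraints(candidate: str, code: list[str], d_min: int) -> bool:
    """Check the assumed combinatorial constraints against the current code."""
    # Assumed GC-content constraint: exactly half of the bases are G or C.
    if 2 * gc_count(candidate) != len(candidate):
        return False
    # Assumed RC constraint on the candidate itself.
    if hamming(candidate, reverse_complement(candidate)) < d_min:
        return False
    for word in code:
        # Assumed Hamming constraint between distinct codewords.
        if hamming(candidate, word) < d_min:
            return False
        # Assumed RC constraint between the candidate and an existing codeword.
        if hamming(candidate, reverse_complement(word)) < d_min:
            return False
    return True


def greedy_code(candidates: list[str], d_min: int) -> list[str]:
    """Keep each candidate, in the given order, that violates no constraint."""
    code: list[str] = []
    for candidate in candidates:
        if satisfies_constraints(candidate, code, d_min):
            code.append(candidate)
    return code


if __name__ == "__main__":
    print(greedy_code(["AACC", "AACG", "ACGT", "CCAA"], d_min=2))
```

In this sketch the order of `candidates` determines which codewords survive. A similarity score such as the SS value described above could, in principle, be used to choose that order, but how the paper does so is not stated in the abstract.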

research
04/10/2023

Kernel Code for DNA Digital Data Storage

The biggest challenge when using DNA as a storage medium is maintaining ...
research
08/10/2022

Diversifying Design of Nucleic Acid Aptamers Using Unsupervised Machine Learning

Inverse design of short single-stranded RNA and DNA sequences (aptamers)...
research
02/13/2021

DNA codes over two noncommutative rings of order four

DNA codes based on error-correcting codes have been successful in DNA-ba...
research
03/01/2020

Did Sequence Dependent Geometry Influence the Evolution of the Genetic Code?

The genetic code is the function from the set of codons to the set of am...
research
07/16/2021

Ranking labs-of-origin for genetically engineered DNA using Metric Learning

With the constant advancements of genetic engineering, a common concern ...
research
06/06/2019

Evolution of Hierarchical Structure Reuse in iGEM Synthetic DNA Sequences

Many complex systems, both in technology and nature, exhibit hierarchica...
research
07/11/2022

Deep Squared Euclidean Approximation to the Levenshtein Distance for DNA Storage

Storing information in DNA molecules is of great interest because of its...
