Phase transition in the computational complexity of the shortest common superstring and genome assembly

10/18/2022
by L. A. Fernandez, et al.

Genome assembly, the process of reconstructing a long genetic sequence by aligning and merging short fragments, or reads, is known to be NP-hard, either as a version of the shortest common superstring problem or in a Hamiltonian-cycle formulation. That is, the computing time is believed to grow exponentially with the problem size in the worst case. Despite this fact, high-throughput technologies and modern algorithms currently allow bioinformaticians to produce and assemble datasets of billions of reads. Using methods from statistical mechanics, we address this conundrum by demonstrating the existence of a phase transition in the computational complexity of the problem and showing that practical instances always fall in the 'easy' phase (solvable by polynomial-time algorithms). In addition, we propose a Markov-chain Monte Carlo method that outperforms common deterministic algorithms in the hard regime.
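
To make the shortest common superstring formulation concrete, here is a minimal Python sketch of the textbook greedy-merge heuristic, which repeatedly joins the two fragments with the largest suffix-prefix overlap. It illustrates the kind of deterministic approach the abstract alludes to; it is not the authors' algorithm and not their Markov-chain Monte Carlo method. The function names and the example reads are hypothetical, and the heuristic returns a valid superstring that is not guaranteed to be the shortest.

```python
def overlap(a: str, b: str) -> int:
    """Length of the longest suffix of `a` that is also a prefix of `b`."""
    for k in range(min(len(a), len(b)), 0, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_superstring(reads: list[str]) -> str:
    """Greedy heuristic for a common superstring (illustrative, not optimal)."""
    # Drop duplicates and reads fully contained in another read.
    unique = list(dict.fromkeys(reads))
    frags = [r for r in unique if not any(r != s and r in s for s in unique)]

    while len(frags) > 1:
        # Find the ordered pair (i, j) with the largest overlap.
        best_k, best_i, best_j = -1, 0, 1
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b)
                    if k > best_k:
                        best_k, best_i, best_j = k, i, j
        # Merge the pair, writing the overlapping part only once.
        merged = frags[best_i] + frags[best_j][best_k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (best_i, best_j)]
        frags.append(merged)

    return frags[0] if frags else ""


# Hypothetical toy reads, chosen only for illustration.
example_reads = ["ATTAGA", "TAGACC", "GACCAT", "CCATGC"]
assembled = greedy_superstring(example_reads)
print(assembled, len(assembled))
```

The pairwise search costs quadratic time per merge, which is fine for a toy input but far from what production assemblers do on billions of reads.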


research
03/06/2022

A Crowdsourced Gameplay for Whole-Genome Assembly via Short Reads

Next-generation sequencing has revolutionized the field of genomics by p...
research
11/25/2019

Orienting Ordered Scaffolds: Complexity and Algorithms

Despite the recent progress in genome sequencing and assembly, many of t...
research
05/21/2021

GapPredict: A Language Model for Resolving Gaps in Draft Genome Assemblies

Short-read DNA sequencing instruments can yield over 1e+12 bases per run...
research
12/26/2021

Quantum Algorithm for the Shortest Superstring Problem

In this paper, we consider the “Shortest Superstring Problem”(SSP) or th...
research
10/21/2022

Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction

Understanding genetic variation, e.g., through mutations, in organisms i...
research
11/07/2018

Approximability of the Eight-vertex Model

We initiate a study of the classification of approximation complexity of...
research
05/18/2023

On the Computational Complexity of Generalized Common Shape Puzzles

In this study, we investigate the computational complexity of some varia...
