Orienting Ordered Scaffolds: Complexity and Algorithms

11/25/2019
by   Sergey Aganezov, et al.

Despite recent progress in genome sequencing and assembly, many of the currently available assembled genomes come in draft form. Such draft genomes consist of a large number of genomic fragments (scaffolds) whose order and/or orientation (i.e., strand) in the genome are unknown. Various scaffold assembly methods attempt to determine the order and orientation of scaffolds along the genome's chromosomes. Some of these methods (e.g., those based on FISH physical mapping or chromatin conformation capture) can infer the order of scaffolds, but not necessarily their orientation. This leads to a special case of the scaffold orientation problem (i.e., deducing the orientation of each scaffold) in which the order of the scaffolds is known. We address the problem of orienting ordered scaffolds (OOS) as an optimization problem based on given weighted orientations of scaffolds and their pairs (e.g., coming from paired-end sequencing reads, long reads, or homologous relations). We formalize this problem using the notion of a scaffold graph (i.e., a graph in which vertices correspond to the assembled contigs or scaffolds and edges represent connections between them). We prove that this problem is NP-hard, and present a polynomial-time algorithm for solving its special case, in which the orientation of each scaffold is imposed relative to at most two other scaffolds. We further develop a fixed-parameter tractable (FPT) algorithm for the general case of the OOS problem.
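
To make the optimization view concrete, the following is a minimal illustrative sketch in Python, not the algorithm from the paper. It assumes a simplified scoring model that the abstract does not specify: each scaffold takes orientation "+" or "-", a weight is collected whenever a scaffold (or a pair of scaffolds) takes a supported orientation, and the goal is to maximize the total collected weight. The function name best_orientation and both weight dictionaries are hypothetical. The search is exhaustive over all 2^n assignments, so it is only usable for very small inputs, which is in keeping with the NP-hardness of the general problem.

```python
from itertools import product


def best_orientation(n, single_weights, pair_weights):
    """Brute-force search over all 2**n orientation assignments.

    Assumed (hypothetical) input model:
      single_weights: {(i, o): w}           weight gained if scaffold i has orientation o
      pair_weights:   {(i, j, oi, oj): w}   weight gained if scaffolds i and j
                                            have orientations oi and oj
    Scaffolds are indexed 0..n-1 in their known order; orientations are "+" or "-".
    Returns (best total weight, orientation string).
    """
    best_score, best_assignment = float("-inf"), None
    for assignment in product("+-", repeat=n):
        # Weight from single-scaffold orientation evidence.
        score = sum(w for (i, o), w in single_weights.items()
                    if assignment[i] == o)
        # Weight from pairwise orientation evidence.
        score += sum(w for (i, j, oi, oj), w in pair_weights.items()
                     if assignment[i] == oi and assignment[j] == oj)
        if score > best_score:
            best_score, best_assignment = score, assignment
    return best_score, "".join(best_assignment)


# Toy instance with three ordered scaffolds (made-up weights).
single = {(0, "+"): 2}
pairs = {(0, 1, "+", "-"): 3, (0, 1, "+", "+"): 1, (1, 2, "-", "-"): 1}
print(best_orientation(3, single, pairs))
```

In this toy instance the assignment "+--" satisfies the single-scaffold evidence and two of the pairwise entries, for a total weight of 6. A practical method would replace the exhaustive loop with the structured algorithms the abstract refers to.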


research
06/03/2023

Make a graph singly connected by edge orientations

A directed graph D is singly connected if for every ordered pair of vert...
research
08/24/2022

st-Orientations with Few Transitive Edges

The problem of orienting the edges of an undirected graph such that the ...
research
10/18/2022

Phase transition in the computational complexity of the shortest common superstring and genome assembly

Genome assembly, the process of reconstructing a long genetic sequence b...
research
04/11/2018

Weighted proper orientations of trees and graphs of bounded treewidth

Given a simple graph G, a weight function w:E(G)→N∖{0}, and an orientati...
research
11/08/2020

The Hierarchical Chinese Postman Problem: the slightest disorder makes it hard, yet disconnectedness is manageable

The Hierarchical Chinese Postman Problem is finding a shortest traversal...
research
10/21/2022

Graph Coloring via Neural Networks for Haplotype Assembly and Viral Quasispecies Reconstruction

Understanding genetic variation, e.g., through mutations, in organisms i...
research
03/12/2020

Duality pairs and homomorphisms to oriented and unoriented cycles

In the homomorphism order of digraphs, a duality pair is an ordered pair...
