Revisiting the Complexity of and Algorithms for the Graph Traversal Edit Distance and Its Variants

05/17/2023
by Yutong Qiu, et al.
The graph traversal edit distance (GTED) is an elegant distance measure defined as the minimum edit distance between strings reconstructed from Eulerian trails in two edge-labeled graphs. GTED can be used to infer evolutionary relationships between species by comparing de Bruijn graphs directly, without the computationally costly and error-prone process of genome assembly. Ebrahimpour Boroojeny et al. (2018) suggest two ILP formulations for GTED and claim that GTED is polynomially solvable because the linear programming relaxation of one of the ILPs always yields optimal integer solutions. This claim contradicts the complexity results for existing string-to-graph matching problems. We resolve this conflict by proving that GTED is NP-complete and by showing that the ILPs proposed by Ebrahimpour Boroojeny et al. do not solve GTED; instead, they solve for a lower bound of GTED, and they are not solvable in polynomial time. In addition, we provide the first two correct ILP formulations of GTED and evaluate their empirical efficiency. These results provide solid algorithmic foundations for comparing genome graphs and point to the direction of approximation heuristics. The source code to reproduce experimental results is available at https://github.com/Kingsford-Group/gtednewilp/.
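To make the definition of GTED concrete, the following is a minimal brute-force sketch in Python. It is not the authors' ILP formulation and is not taken from their repository. It simply enumerates every trail that uses each edge exactly once in two tiny edge-labeled directed graphs and takes the minimum Levenshtein distance over all pairs of spelled strings. The function names, the `(source, target, label)` edge encoding, and the choice to accept a trail starting at any node are assumptions made for illustration. The enumeration is exponential in the number of edges, so it is only usable on toy inputs.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling one-row table."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                 # delete ca
                            curr[j - 1] + 1,             # insert cb
                            prev[j - 1] + (ca != cb)))   # match or substitute
        prev = curr
    return prev[-1]


def eulerian_trail_strings(edges):
    """Return the set of strings spelled by trails that use every edge once.

    edges: list of (source, target, label) tuples (hypothetical encoding).
    """
    n = len(edges)
    results = set()

    def extend(node, used, labels):
        if len(labels) == n:          # every edge consumed: record the string
            results.add("".join(labels))
            return
        for idx, (u, v, lab) in enumerate(edges):
            if not used[idx] and u == node:
                used[idx] = True
                labels.append(lab)
                extend(v, used, labels)
                labels.pop()          # backtrack
                used[idx] = False

    for start in {u for u, _, _ in edges}:
        extend(start, [False] * n, [])
    return results


def gted_brute_force(edges1, edges2):
    """Minimum edit distance over all pairs of Eulerian-trail strings.

    Raises ValueError if either graph has no Eulerian trail.
    """
    return min(edit_distance(s, t)
               for s in eulerian_trail_strings(edges1)
               for t in eulerian_trail_strings(edges2))


# Two labeled 3-cycles that differ in a single edge label.
g1 = [(0, 1, "A"), (1, 2, "C"), (2, 0, "G")]
g2 = [(0, 1, "A"), (1, 2, "C"), (2, 0, "T")]
print(gted_brute_force(g1, g2))
```

For the two 3-cycles above, the trails spell the rotations of "ACG" and "ACT" respectively, and the closest pair differs by one substitution.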

research
01/30/2023

A Safety Framework for Flow Decomposition Problems via Integer Linear Programming

Many important problems in Bioinformatics (e.g., assembly or multi-assem...
research
01/28/2022

The Complexity of Approximate Pattern Matching on De Bruijn Graphs

Aligning a sequence to a walk in a labeled graph is a problem of fundame...
research
04/26/2018

Edit Distance between Unrooted Trees in Cubic Time

Edit distance between trees is a natural generalization of the classical...
research
12/13/2018

Mind the Independence Gap

The independence gap of a graph was introduced by Ekim et al. (2018) as ...
research
01/07/2020

Complexity Issues of String to Graph Approximate Matching

The problem of matching a query string to a directed graph, whose vertic...
research
07/16/2020

String Sanitization Under Edit Distance: Improved and Generalized

Let W be a string of length n over an alphabet Σ, k be a positive intege...
research
03/20/2023

On the Maximal Independent Sets of k-mers with the Edit Distance

In computational biology, k-mers and edit distance are fundamental conce...
