The Labeled Direct Product Optimally Solves String Problems on Graphs

09/11/2021
by Nicola Rizzo, et al.

Suffix trees are an important data structure at the core of optimal solutions to many fundamental string problems, such as exact pattern matching, longest common substring, matching statistics, and longest repeated substring. Recent lines of research have focused on extending some of these problems to vertex-labeled graphs, albeit with ad hoc approaches that in some cases do not generalize to all input graphs. In the absence of a ubiquitous tool like the suffix tree for labeled graphs, we introduce the labeled direct product of two graphs as a general tool for obtaining optimal algorithms: we obtain conceptually simpler algorithms for the quadratic problems of string matching (SMLG) and longest common substring (LCSP) in labeled graphs. Our algorithms are also more efficient, since they run in time linear in the size of the labeled product graph, which may be smaller than quadratic for some inputs, and their running time is predictable, because the size of the labeled direct product graph can be precomputed efficiently. We also solve LCSP on graphs containing cycles, which was left as an open problem by Shimohira et al. in 2011. To show the power of the labeled product graph, we also apply it to solve the matching statistics (MSP) and the longest repeated string (LRSP) problems in labeled graphs. Moreover, we show that our (worst-case quadratic) algorithms are also optimal, conditioned on the Orthogonal Vectors Hypothesis. Finally, we complete the complexity picture around LRSP by studying it on undirected graphs.
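
To make the central object concrete, here is a minimal sketch of one natural reading of the labeled direct product, which may differ in detail from the paper's formal definition. It assumes that the product keeps only vertex pairs (u, v) with equal labels, and that it has an edge from (u, v) to (x, w) exactly when u→x is an edge of the first graph and v→w is an edge of the second. The function names and the dict/list graph encoding are hypothetical, chosen for illustration. The second function treats the longest common substring as a longest path in the product and reports a cyclic product as unbounded. The construction below is not tuned to meet the linear-time bound stated in the abstract.

```python
from collections import defaultdict


def labeled_direct_product(labels1, edges1, labels2, edges2):
    """Build the label-matching product of two vertex-labeled directed graphs.

    labels*: dict mapping vertex -> label; edges*: list of (source, target).
    Returns (vertices, adj), where vertices are pairs (u, v) with equal labels.
    Assumed definition for illustration, not taken verbatim from the paper.
    """
    by_label = defaultdict(list)          # label -> vertices of graph 2
    for v, lab in labels2.items():
        by_label[lab].append(v)

    vertices = [(u, v) for u, lab in labels1.items() for v in by_label.get(lab, [])]
    vertex_set = set(vertices)

    succ2 = defaultdict(list)             # adjacency lists of graph 2
    for v, w in edges2:
        succ2[v].append(w)

    adj = {p: [] for p in vertices}
    for u, x in edges1:
        for v in by_label.get(labels1[u], []):
            for w in succ2.get(v, []):
                if (x, w) in vertex_set:  # target pair must also match labels
                    adj[(u, v)].append((x, w))
    return vertices, adj


def longest_common_length(vertices, adj):
    """Vertex count of a longest path in the product, or None if it has a cycle.

    A cycle in the product means arbitrarily long common strings exist.
    Uses Kahn's topological ordering with a stack.
    """
    indeg = {p: 0 for p in vertices}
    for p in vertices:
        for q in adj[p]:
            indeg[q] += 1

    stack = [p for p in vertices if indeg[p] == 0]
    best = {p: 1 for p in vertices}       # longest path ending at p
    processed = 0
    while stack:
        p = stack.pop()
        processed += 1
        for q in adj[p]:
            best[q] = max(best[q], best[p] + 1)
            indeg[q] -= 1
            if indeg[q] == 0:
                stack.append(q)

    if processed < len(vertices):
        return None                       # some vertices lie on or after a cycle
    return max(best.values(), default=0)
```

For example, encoding the strings "abab" and "bab" as two path graphs gives a product with six vertices and three edges, and its longest path spells "bab". Because the product's size is known once it is built, the cost of any traversal over it is known in advance, which is the predictability the abstract refers to.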

