On the Complexity of BWT-runs Minimization via Alphabet Reordering

11/08/2019
by Jason Bentley, et al.

We present the first set of results on the computational complexity of minimizing the number of runs in the Burrows-Wheeler Transform (BWT) of a text via alphabet reordering. We prove that the decision version of this problem is NP-complete and cannot be solved in time 2^{o(σ)} n unless the Exponential Time Hypothesis fails, where σ is the size of the alphabet. Moreover, we show that optimization variants of this problem admit strong inapproximability results. In doing so we relate two previously disparate topics: the size of a path cover of a graph and the number of runs in the BWT of a text. This provides a surprising connection between problems on graphs and string compression. As a result we are able to prove (all assuming P ≠ NP): (i) no PTAS exists if we define the cost of a solution as exactly the number of runs exceeding σ; (ii) for all δ > 0, no polynomial-time ϵ n^{1/2}-approximation algorithm exists for ϵ > 0 small enough if we consider the number of runs exceeding (1+δ)σ as the cost of a solution. In this case the problem is APX-hard as well. To the best of our knowledge these are the first inapproximability results pertaining to the BWT. In addition, by drawing on recent results in the field of dictionary compression, we demonstrate that if we define cost purely as the number of runs, we obtain a log^2 n-approximation algorithm. Finally, we provide an efficient algorithm for the more restricted problem of finding an optimal ordering on a subset of symbols (each occurring only once) under ordering constraints; this algorithm runs in optimal time for small values of σ. We also look at a version of the problem on Wheeler graphs, a recently discovered class of graphs with BWT-like properties. Here too we show NP-hardness results for a related problem which we call Source Ordering.
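To make the optimization problem concrete, the following is a minimal Python sketch, not the algorithm of the paper. It computes the BWT of a text under a given alphabet ordering, counts the runs, and searches all σ! orderings by brute force, which is only feasible for very small alphabets. The function names (`bwt`, `count_runs`, `min_runs_bruteforce`) and the convention of appending a sentinel `$` that sorts before every other symbol are assumptions made for illustration; the paper's exact model may differ.

```python
from itertools import permutations


def bwt(text, order):
    """BWT of text + '$', comparing symbols according to `order`.

    Assumption: '$' does not occur in `text` and always sorts first.
    """
    rank = {c: i + 1 for i, c in enumerate(order)}  # the sentinel gets rank 0
    seq = [rank[c] for c in text] + [0]
    n = len(seq)
    # Sort rotation start positions by the rotation read in rank space.
    starts = sorted(range(n), key=lambda i: seq[i:] + seq[:i])
    symbols = ["$"] + list(order)
    # The last symbol of the rotation starting at i is seq[i - 1].
    return "".join(symbols[seq[i - 1]] for i in starts)


def count_runs(s):
    """Number of maximal blocks of equal consecutive symbols in s."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


def min_runs_bruteforce(text):
    """Try every ordering of the alphabet; exponential in the alphabet size."""
    alphabet = sorted(set(text))
    best = min(permutations(alphabet), key=lambda p: count_runs(bwt(text, p)))
    return "".join(best), count_runs(bwt(text, best))


if __name__ == "__main__":
    text = "banana"
    print(bwt(text, "abn"), count_runs(bwt(text, "abn")))
    print(min_runs_bruteforce(text))
```

Under this convention every symbol, including the sentinel, appears in the BWT at least once, so the run count is never below σ + 1. This is one way to see why the cost variants in the abstract measure the number of runs in excess of σ or (1+δ)σ.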

