Fine-Grained Complexity of Analyzing Compressed Data: Quantifying Improvements over Decompress-And-Solve

03/02/2018
by Amir Abboud, et al.

Can we analyze data without decompressing it? As our data keeps growing, understanding the time complexity of problems on compressed inputs, rather than in convenient uncompressed forms, becomes more and more relevant. Suppose we are given a compression of size n of data that originally has size N, and we want to solve a problem with time complexity T(·). The naive strategy of "decompress-and-solve" gives time T(N), whereas "the gold standard" is time T(n): to analyze the compression as efficiently as if the original data were small. We restrict our attention to data in the form of a string (text, files, genomes, etc.) and study the most ubiquitous tasks. While the challenge might seem to depend heavily on the specific compression scheme, most methods of practical relevance (Lempel-Ziv-family, dictionary methods, and others) can be unified under the elegant notion of Grammar Compressions. A vast literature, across many disciplines, has established this as an influential notion for algorithm design. We introduce a framework for proving (conditional) lower bounds in this field, allowing us to assess whether decompress-and-solve can be improved, and by how much. Our main results are:

- The O(nN√(log(N/n))) bound for LCS and the O(min{N log N, nM}) bound for Pattern Matching with Wildcards are optimal up to N^{o(1)} factors, under the Strong Exponential Time Hypothesis. (Here, M denotes the uncompressed length of the compressed pattern.)
- Decompress-and-solve is essentially optimal for Context-Free Grammar Parsing and RNA Folding, under the k-Clique conjecture.
- We give an algorithm showing that decompress-and-solve is not optimal for Disjointness.
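To make the gap between n and N concrete, here is a minimal sketch of a grammar compression (a straight-line program) in Python. It is not taken from the paper: the toy grammar, the rule encoding, and the function names `make_grammar`, `expanded_lengths`, `char_at`, and `decompress` are all hypothetical choices for illustration. It shows a simple task (reading one character of the text) answered by walking the grammar, next to the decompress-and-solve baseline that first materializes all N characters.

```python
# Hypothetical toy encoding of a grammar compression: every symbol maps either
# to a single terminal character or to a pair of earlier symbols to concatenate.
def make_grammar(k):
    """Build X0 -> 'a', X1 -> 'b', X_i -> X_{i-1} X_{i-2} (Fibonacci strings)."""
    rules = {0: "a", 1: "b"}
    for i in range(2, k + 1):
        rules[i] = (i - 1, i - 2)
    return rules


def expanded_lengths(rules):
    """Length of each symbol's expansion, using one pass over the n rules."""
    length = {}
    for sym in sorted(rules):  # rules only reference smaller-numbered symbols
        rhs = rules[sym]
        if isinstance(rhs, str):
            length[sym] = 1
        else:
            length[sym] = length[rhs[0]] + length[rhs[1]]
    return length


def char_at(rules, length, sym, i):
    """Character i of the expansion of sym, found without decompressing."""
    while not isinstance(rules[sym], str):
        left, right = rules[sym]
        if i < length[left]:
            sym = left           # position falls in the left part
        else:
            i -= length[left]    # skip the left part entirely
            sym = right
    return rules[sym]


def decompress(rules, sym):
    """Decompress-and-solve baseline: build the full string of length N."""
    rhs = rules[sym]
    if isinstance(rhs, str):
        return rhs
    return decompress(rules, rhs[0]) + decompress(rules, rhs[1])


if __name__ == "__main__":
    rules = make_grammar(10)
    length = expanded_lengths(rules)
    text = decompress(rules, 10)
    print(len(rules), "rules describe", length[10], "characters")
    print(text[:5], char_at(rules, length, 10, 4))
```

In this toy grammar the number of rules grows linearly in k while the text length grows exponentially, so `char_at` touches at most k rules even when decompression is infeasible. The problems studied in the abstract (LCS, wildcard matching, parsing) are far harder than random access; the sketch only illustrates what "working on the compression" means.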

research
11/09/2021

Pattern Matching on Grammar-Compressed Strings in Linear Time

The most fundamental problem considered in algorithms for text processin...
research
07/17/2023

Grammar Boosting: A New Technique for Proving Lower Bounds for Computation over Compressed Data

Grammar compression is a general compression framework in which a string...
research
04/23/2018

Entropy bounds for grammar compression

In grammar compression we represent a string as a context free grammar. ...
research
10/27/2020

Impossibility Results for Grammar-Compressed Linear Algebra

To handle vast amounts of data, it is natural and popular to compress ve...
research
11/03/2018

Optimal Rank and Select Queries on Dictionary-Compressed Text

Let γ be the size of a string attractor for a string S of length n over ...
research
08/15/2020

Stronger Lower Bounds for Polynomial Time Problems

We introduce techniques for proving stronger conditional lower bounds fo...
research
03/02/2018

Multivariate Fine-Grained Complexity of Longest Common Subsequence

We revisit the classic combinatorial pattern matching problem of finding...
