Approximating binary longest common subsequence in almost-linear time

11/30/2022
by Xiaoyu He, et al.

The Longest Common Subsequence (LCS) is a fundamental string similarity measure, and computing the LCS of two strings is a classic algorithms question. A textbook dynamic programming algorithm computes the LCS exactly in quadratic time, and this is essentially the best possible under plausible fine-grained complexity assumptions, so a natural problem is to find faster approximation algorithms. When the inputs are two binary strings, there is a simple 1/2-approximation in linear time: compute the longest common all-0s or all-1s subsequence. It has been open whether a better approximation is possible even in truly subquadratic time. Rubinstein and Song showed that the answer is yes under the assumption that the two input strings have equal lengths. We settle the question, generalizing their result to strings of unequal lengths: we prove that, for any ε>0, there exist δ>0 and a (1/2+δ)-approximation algorithm for binary LCS that runs in n^(1+ε) time. As a consequence of our result and a result of Akmal and Vassilevska-Williams, for any ε>0, there exist δ>0 and a (1/q+δ)-approximation for LCS over q-ary strings in n^(1+ε) time. Our techniques build on the recent work of Guruswami, He, and Li, who proved new bounds for error-correcting codes tolerating deletion errors. They prove a combinatorial "structure lemma" for strings, which classifies them according to their oscillation patterns. We prove and use an algorithmic generalization of this structure lemma, which may be of independent interest.
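The simple linear-time 1/2-approximation mentioned in the abstract can be sketched as follows. This is only an illustration of that baseline, not the paper's (1/2+δ) algorithm; the function names `half_approx_binary_lcs` and `lcs_length` are hypothetical, and the quadratic dynamic program is included purely as a reference for comparison. The guarantee holds because any common subsequence of length L contains at least L/2 zeros or at least L/2 ones, and that many copies of the majority symbol must occur in both strings.

```python
def half_approx_binary_lcs(x: str, y: str) -> str:
    """Return the longest common all-0s or all-1s subsequence of x and y.

    Runs in linear time; the result is at least half as long as the LCS.
    Assumes x and y contain only the characters '0' and '1'.
    """
    zeros = min(x.count("0"), y.count("0"))  # longest common all-0s run
    ones = min(x.count("1"), y.count("1"))   # longest common all-1s run
    return "0" * zeros if zeros >= ones else "1" * ones


def lcs_length(x: str, y: str) -> int:
    """Textbook quadratic-time dynamic program for the exact LCS length."""
    prev = [0] * (len(y) + 1)
    for a in x:
        cur = [0]
        for j, b in enumerate(y, start=1):
            if a == b:
                cur.append(prev[j - 1] + 1)
            else:
                cur.append(max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


if __name__ == "__main__":
    x, y = "0101", "1100"
    approx = half_approx_binary_lcs(x, y)
    exact = lcs_length(x, y)
    # The approximation is always at least half the exact LCS length.
    print(approx, len(approx), exact)
```

The two strings need not have equal lengths for this baseline; the difficulty the paper addresses is beating the factor 1/2 in that general setting.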

research
12/15/2021

Approximating the Longest Common Subsequence problem within a sub-polynomial factor in linear time

The Longest Common Subsequence (LCS) of two strings is a fundamental str...
research
05/07/2021

Improved Approximation for Longest Common Subsequence over Small Alphabets

This paper investigates the approximability of the Longest Common Subseq...
research
07/02/2022

Approximating Dynamic Time Warping Distance Between Run-Length Encoded Strings

Dynamic Time Warping (DTW) is a widely used similarity measure for compa...
research
03/08/2018

Synchronization Strings: Efficient and Fast Deterministic Constructions over Small Alphabets

Synchronization strings are recently introduced by Haeupler and Shahrasb...
research
05/26/2023

Can You Solve Closest String Faster than Exhaustive Search?

We study the fundamental problem of finding the best string to represent...
research
04/28/2020

Approximating longest common substring with k mismatches: Theory and practice

In the problem of the longest common substring with k mismatches we are ...
research
01/16/2021

Strings-and-Coins and Nimstring are PSPACE-complete

We prove that Strings-and-Coins – the combinatorial two-player game gene...