# Improved Approximation for Longest Common Subsequence over Small Alphabets

This paper investigates the approximability of the Longest Common Subsequence (LCS) problem. The fastest algorithm for solving the LCS problem exactly runs in essentially quadratic time in the length of the input, and it is known that, under the Strong Exponential Time Hypothesis, the quadratic running time cannot be beaten. No such limitations are known for approximating the LCS, however, except in some limited scenarios. Approximation algorithms are also scarce. When the two given strings are over an alphabet of size k, returning the subsequence formed by the most frequent symbol occurring in both strings achieves a 1/k approximation for the LCS. It is an open problem whether a better than 1/k approximation can be achieved in truly subquadratic time (O(n^{2-δ}) time for constant δ>0). A recent result [Rubinstein and Song, SODA 2020] showed that a 1/2+ϵ approximation for the LCS over a binary alphabet is possible in truly subquadratic time, provided the input strings have the same length. In this paper we show that if a 1/2+ϵ approximation (for ϵ>0) is achievable for binary LCS in truly subquadratic time when the input strings may have unequal lengths, then for every constant k, there is a truly subquadratic time algorithm that achieves a 1/k+δ approximation for k-ary alphabet LCS for some δ>0. Thus the binary case is the hardest. We also show that for every constant k, given two strings of equal length over a k-ary alphabet, one can obtain a 1/k+ϵ approximation for some constant ϵ>0 in truly subquadratic time, thus extending the Rubinstein and Song result to all alphabets of constant size.
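The 1/k baseline mentioned in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm. The function names `most_frequent_symbol_lcs` and `lcs_length` are hypothetical, and the textbook quadratic dynamic program is included only to check the bound on small inputs. For each symbol, the sketch counts how often it occurs in both strings and returns a run of the symbol whose smaller count is largest. Since any common subsequence uses each symbol at most that many times, the run is at least 1/k of the LCS length.

```python
from collections import Counter


def most_frequent_symbol_lcs(a: str, b: str) -> str:
    """Return a common subsequence made of a single repeated symbol.

    Illustrative baseline: its length is at least LCS(a, b) / k,
    where k is the alphabet size. Runs in linear time.
    """
    count_a, count_b = Counter(a), Counter(b)
    best_symbol, best_len = "", 0
    for symbol in count_a:
        # A symbol can be used at most min(count in a, count in b) times.
        common = min(count_a[symbol], count_b[symbol])
        if common > best_len:
            best_symbol, best_len = symbol, common
    return best_symbol * best_len


def lcs_length(a: str, b: str) -> int:
    """Exact LCS length via the standard quadratic-time dynamic program."""
    prev = [0] * (len(b) + 1)
    for ch_a in a:
        curr = [0]
        for j, ch_b in enumerate(b, start=1):
            if ch_a == ch_b:
                curr.append(prev[j - 1] + 1)
            else:
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


# Binary alphabet (k = 2): the baseline is within a factor 2 of the optimum.
approx = most_frequent_symbol_lcs("0101", "1100")
print(len(approx), lcs_length("0101", "1100"))
```

The open question described in the abstract is whether this linear-time guarantee can be improved to 1/k+δ without paying the quadratic cost of the exact dynamic program.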

research
11/30/2022

### Approximating binary longest common subsequence in almost-linear time

The Longest Common Subsequence (LCS) is a fundamental string similarity ...
research
10/02/2018

### Sketching, Streaming, and Fine-Grained Complexity of (Weighted) LCS

We study sketching and streaming algorithms for the Longest Common Subse...
research
06/15/2021

### A Linear-Time n^0.4-Approximation for Longest Common Subsequence

We consider the classic problem of computing the Longest Common Subseque...
research
12/22/2017

### Longest common substring with approximately k mismatches

In the longest common substring problem we are given two strings of leng...
research
12/03/2022

### The Chvátal-Sankoff problem: Understanding random string comparison through stochastic processes

Given two equally long, uniformly random binary strings, the expected le...
research
02/10/2021

### All instantiations of the greedy algorithm for the shortest superstring problem are equivalent

In the Shortest Common Superstring problem (SCS), one needs to find the ...
research
03/02/2018

### Multivariate Fine-Grained Complexity of Longest Common Subsequence

We revisit the classic combinatorial pattern matching problem of finding...