Longest common substring with approximately k mismatches

12/22/2017
by Tomasz Kociumaka, et al.

In the longest common substring problem we are given two strings of length n and must find a substring of maximum length that occurs in both strings. It is well known that the problem can be solved in linear time, but the solution is not robust: it can vary greatly when even a single letter of the input strings is changed. To circumvent this, Leimeister and Morgenstern introduced the problem of the longest common substring with k mismatches. Lately, this problem has received a lot of attention in the literature. In this paper we first show a conditional lower bound, based on the Strong Exponential Time Hypothesis (SETH), implying that there is little hope of improving existing solutions. We then introduce a new but closely related problem, the longest common substring with approximately k mismatches, and use computational geometry techniques to show that it admits a solution with strongly subquadratic running time. We also apply these results to obtain a strongly subquadratic approximation algorithm for the longest common substring with k mismatches problem, and we show conditional hardness of improving its approximation ratio.
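To make the problem statement concrete, here is a minimal sketch of a straightforward quadratic-time baseline for the longest common substring with k mismatches. It is not the algorithm from the paper (the abstract gives no algorithmic details); it is the simple diagonal-scan approach, and the function name `lcs_k_mismatches` is a hypothetical name chosen for illustration. For each alignment of the two strings, a sliding window keeps the longest stretch containing at most k mismatching positions.

```python
def lcs_k_mismatches(s: str, t: str, k: int) -> tuple[int, int, int]:
    """Brute-force baseline (illustrative, not the paper's method).

    Returns (length, start_in_s, start_in_t) of a longest pair of
    equal-length substrings of s and t that differ in at most k positions.
    Runs in O(len(s) * len(t)) time.
    """
    n, m = len(s), len(t)
    best_len, best_i, best_j = 0, 0, 0

    # Each diagonal d fixes an alignment: s[i] is compared with t[i - d].
    for d in range(-(m - 1), n):
        i0 = max(d, 0)    # first index in s on this diagonal
        j0 = max(-d, 0)   # first index in t on this diagonal
        length = min(n - i0, m - j0)

        left = 0          # left end of the sliding window (offset on diagonal)
        mismatches = 0    # mismatches currently inside the window
        for right in range(length):
            if s[i0 + right] != t[j0 + right]:
                mismatches += 1
            # Shrink the window from the left until it has at most k mismatches.
            while mismatches > k:
                if s[i0 + left] != t[j0 + left]:
                    mismatches -= 1
                left += 1
            window = right - left + 1
            if window > best_len:
                best_len, best_i, best_j = window, i0 + left, j0 + left

    return best_len, best_i, best_j


if __name__ == "__main__":
    s, t = "abcdef", "zbcxef"
    for k in range(3):
        print(k, lcs_k_mismatches(s, t, k))
```

On the strings `abcdef` and `zbcxef`, allowing one mismatch lets the answer grow from `bc` to the aligned pair `bcdef` / `bcxef`. With k = 0 the function reduces to the classic longest common substring problem, though in quadratic rather than linear time.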

research
06/15/2021

A Linear-Time n^0.4-Approximation for Longest Common Subsequence

We consider the classic problem of computing the Longest Common Subseque...
research
04/28/2020

Approximating longest common substring with k mismatches: Theory and practice

In the problem of the longest common substring with k mismatches we are ...
research
05/07/2021

Improved Approximation for Longest Common Subsequence over Small Alphabets

This paper investigates the approximability of the Longest Common Subseq...
research
09/07/2020

A Fast Randomized Algorithm for Finding the Maximal Common Subsequences

Finding the common subsequences of L multiple strings has many applicati...
research
04/30/2018

On improving the approximation ratio of the r-shortest common superstring problem

The Shortest Common Superstring problem (SCS) consists, for a set of str...
research
02/18/2018

Linear-Time Algorithm for Long LCF with k Mismatches

In the Longest Common Factor with k Mismatches (LCF_k) problem, we are g...
research
10/02/2018

Sketching, Streaming, and Fine-Grained Complexity of (Weighted) LCS

We study sketching and streaming algorithms for the Longest Common Subse...
