Approximating the Median under the Ulam Metric

11/02/2020
by   Diptarka Chakraborty, et al.

We study approximation algorithms for variants of the median string problem, which asks for a string that minimizes the sum of edit distances from a given set of m strings of length n. Only the straightforward 2-approximation is known for this NP-hard problem. The problem is motivated, e.g., by computational biology, and belongs to the class of median problems (over different metric spaces), which are fundamental tasks in data analysis. Our main result is for the Ulam metric, where all strings are permutations over [n] and each edit operation moves a symbol (deletion plus insertion). We devise for this problem an algorithm that breaks the 2-approximation barrier, i.e., computes a (2-δ)-approximate median permutation for some constant δ>0 in time Õ(nm^2+n^3). We further use these techniques to achieve a (2-δ)-approximation for the median string problem in the special case where the median is restricted to length n and the optimal objective is large, namely Ω(mn). We also design an approximation algorithm for the following probabilistic model of the Ulam median: the input consists of m perturbations of an (unknown) permutation x, each generated by moving every symbol to a random position with probability (a parameter) ϵ>0. Our algorithm computes with high probability a (1+o(1/ϵ))-approximate median permutation in time O(mn^2+n^3).
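To make the objective concrete, here is a minimal Python sketch of the Ulam distance and of a baseline 2-approximation. It is not the paper's (2-δ) algorithm. It relies on two standard facts that the abstract does not spell out: with symbol moves as the edit operation, the distance between two permutations of [n] equals n minus the length of their longest common subsequence, and returning the input permutation with the smallest total distance to the others gives a 2-approximate median by the triangle inequality. We assume this is the "straightforward 2-approximation" the abstract refers to. The function names ulam_distance and best_input_median are hypothetical.

```python
from bisect import bisect_left


def ulam_distance(x, y):
    """Minimum number of symbol moves turning permutation x into y.

    Uses the identity distance = n - LCS(x, y). For permutations, the LCS
    is a longest increasing subsequence after relabeling x by positions in y.
    """
    pos_in_y = {sym: i for i, sym in enumerate(y)}
    relabeled = [pos_in_y[sym] for sym in x]

    # Patience-sorting LIS in O(n log n): tails[k] is the smallest possible
    # last value of an increasing subsequence of length k + 1.
    tails = []
    for v in relabeled:
        k = bisect_left(tails, v)
        if k == len(tails):
            tails.append(v)
        else:
            tails[k] = v
    return len(x) - len(tails)


def best_input_median(perms):
    """Baseline (assumed): return the input permutation with the smallest
    sum of Ulam distances to all inputs. This is a 2-approximate median."""
    return min(perms, key=lambda c: sum(ulam_distance(c, p) for p in perms))
```

For example, best_input_median([[1, 2, 3, 4], [2, 1, 3, 4], [1, 2, 4, 3]]) evaluates each of the three inputs as a candidate and returns the cheapest one. This sketch takes O(m^2 n log n) time for m permutations of length n.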

research
07/20/2021

Approximate Trace Reconstruction via Median String (in Average-Case)

We consider an approximate version of the trace reconstruction problem, ...
research
12/06/2021

On Complexity of 1-Center in Various Metrics

We consider the classic 1-center problem: Given a set P of n points in a...
research
12/04/2019

Assessing the best edit in perturbation-based iterative refinement algorithms to compute the median string

Strings are a natural representation of biological data such as DNA, RNA...
research
03/04/2020

Pivot Selection for Median String Problem

The Median String Problem is W[1]-Hard under the Levenshtein distance, t...
research
01/05/2022

Deterministic metric 1-median selection with very few queries

Given an n-point metric space (M,d), metric 1-median asks for a point p∈...
research
12/04/2022

Clustering Permutations: New Techniques with Streaming Applications

We study the classical metric k-median clustering problem over a set of ...
research
11/10/2021

Permute, Graph, Map, Derange

We study decomposable combinatorial labeled structures in the exp-log cl...
