Fitting Distances by Tree Metrics Minimizing the Total Error within a Constant Factor

10/06/2021
by   Vincent Cohen-Addad, et al.

We consider the numerical taxonomy problem of fitting a positive distance function $D:\binom{S}{2}\to\mathbb{R}_{>0}$ by a tree metric. We want a tree T with positive edge weights that includes S among its vertices, such that the distances between points of S in T match those in D. A nice application is in evolutionary biology, where the tree T aims to approximate the branching process leading to the observed distances in D [Cavalli-Sforza and Edwards 1967]. We consider the total error, that is, the sum of distance errors over all pairs of points. We present a deterministic polynomial-time algorithm that minimizes the total error within a constant factor. We can do this both for general trees and for the special case of ultrametrics, where a root has the same distance to all vertices in S. The problems are APX-hard, so a constant factor is the best we can hope for in polynomial time. The best previous approximation factor was O((log n)(log log n)) by Ailon and Charikar [2005], who wrote "Determining whether an O(1) approximation can be obtained is a fascinating question".
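To make the objective concrete, here is a minimal Python sketch that evaluates the total error of a candidate tree against a distance function D, that is, the sum over all pairs of |D(u, v) - dist_T(u, v)|. It only scores a given tree; it is not the paper's approximation algorithm. The function names (`tree_distances`, `total_error`), the edge-list representation, and the toy data are assumptions made for illustration.

```python
def tree_distances(edges, points):
    """Path lengths between all pairs of `points` in a weighted tree.

    `edges` is a list of (u, v, weight) triples describing the tree;
    the tree may contain extra internal vertices that are not in `points`.
    """
    adj = {}
    for u, v, w in edges:
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))

    dist = {}
    for s in points:
        # Traverse from s; in a tree the unique path gives the distance.
        seen = {s: 0.0}
        stack = [s]
        while stack:
            u = stack.pop()
            for v, w in adj.get(u, []):
                if v not in seen:
                    seen[v] = seen[u] + w
                    stack.append(v)
        for t in points:
            if s < t:
                dist[(s, t)] = seen[t]
    return dist


def total_error(D, edges):
    """Sum of |D(u, v) - dist_T(u, v)| over all pairs listed in D.

    `D` maps unordered pairs, given as 2-tuples, to positive distances.
    """
    points = sorted({p for pair in D for p in pair})
    T = tree_distances(edges, points)
    return sum(abs(d - T[tuple(sorted(pair))]) for pair, d in D.items())


# Toy instance (hypothetical data): three points and a star-shaped tree
# with one internal vertex "x".
D = {("a", "b"): 2.0, ("a", "c"): 3.0, ("b", "c"): 3.5}
star = [("a", "x", 1.0), ("b", "x", 1.0), ("c", "x", 2.0)]
print(total_error(D, star))
```

In this toy instance the tree reproduces D exactly on the pairs (a, b) and (a, c) and is off by 0.5 on (b, c), so the total error is 0.5. The problem in the abstract asks for the tree minimizing this quantity over all trees with positive edge weights that contain S.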

