Metric Violation Distance: Revisited and Extended
Metric data plays an important role in various settings such as metric-based indexing, clustering, classification, and approximation algorithms in general. Due to measurement error, noise, or an inability to completely gather all the data, a collection of distances may not satisfy the basic metric requirements, most notably the triangle inequality. Thus last year the authors introduced the Metric Violation Distance (MVD) problem, where the input is an undirected and positively weighted complete graph, and the goal is to identify a minimum-cardinality subset of edges whose weights can be modified such that the resulting graph is its own metric completion. This problem was shown to be APX-hard, and moreover an O(OPT^{1/3})-approximation was shown, where OPT is the size of the optimal solution. In this paper we introduce the Generalized Metric Violation Distance (GMVD) problem, where the goal is the same, but the input graph is no longer required to be complete. For GMVD we prove stronger hardness results, and provide a significantly faster approximation algorithm with an improved approximation guarantee. In particular, we give an approximation-preserving reduction from the well-studied MultiCut problem, which is hard to approximate within any constant factor assuming the Unique Games Conjecture. Our approximation factor depends on deficit values, where the deficit of a given cycle is its largest single edge weight minus the sum of the weights of all its other edges. Note that no cycle has positive deficit in a metric complete graph. We give an O(c log n)-approximation algorithm for GMVD, where c is the number of distinct positive cycle deficit values in the input graph.
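As an illustration of the deficit definition, the following is a minimal sketch, not the authors' algorithm. It assumes a complete graph stored as a dictionary from unordered vertex pairs to positive weights, and it inspects only triangles, whereas the parameter c in the abstract counts distinct positive deficits over all cycles. The names `cycle_deficit` and `positive_triangle_deficits` are hypothetical and chosen for this example.

```python
from itertools import combinations


def cycle_deficit(weights):
    """Deficit of a cycle: its largest edge weight minus the sum of the others."""
    heaviest = max(weights)
    return heaviest - (sum(weights) - heaviest)


def positive_triangle_deficits(w):
    """Return {triangle: deficit} for every triangle with positive deficit.

    w maps frozenset({u, v}) to the positive weight of edge uv in a
    complete graph (an assumed representation, for illustration only).
    A positive deficit means the triangle violates the triangle inequality.
    """
    vertices = sorted({v for edge in w for v in edge})
    violations = {}
    for a, b, c in combinations(vertices, 3):
        deficit = cycle_deficit([
            w[frozenset((a, b))],
            w[frozenset((b, c))],
            w[frozenset((a, c))],
        ])
        if deficit > 0:
            violations[(a, b, c)] = deficit
    return violations


# Toy input: the edge a-c is too heavy relative to the paths around it.
example = {
    frozenset(("a", "b")): 1,
    frozenset(("b", "c")): 1,
    frozenset(("a", "c")): 5,
    frozenset(("a", "d")): 2,
    frozenset(("b", "d")): 2,
    frozenset(("c", "d")): 2,
}
violations = positive_triangle_deficits(example)
distinct_positive = set(violations.values())
```

In this toy input, triangles (a, b, c) and (a, c, d) have deficits 3 and 1 respectively, and both contain the heavy edge a-c. In a complete graph, having no positive-deficit triangle is enough to conclude that the weights satisfy the triangle inequality; for the incomplete graphs of GMVD, longer cycles would have to be examined as well.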