Generalized Metric Repair on Graphs

07/19/2018
by   Anna C. Gilbert, et al.

Many modern data analysis algorithms either assume that the distances between the data points satisfy a metric or are considerably more efficient when they do. These algorithms include metric learning, clustering, and dimensionality reduction. Because real data sets are noisy, the similarity measures often fail to satisfy a metric. For this reason, Gilbert and Jain [11] and Fan et al. [8] introduced the closely related problems of sparse metric repair and metric violation distance. The goal of each problem is to repair as few distances as possible so that the distances between the data points satisfy a metric. We generalize these problems so that not all of the pairwise distances between the data points are required. That is, we consider a weighted graph G with corrupted weights w, and our goal is to find the smallest number of modifications to the weights so that the resulting weighted graph distances satisfy a metric. This problem is a natural generalization of the sparse metric repair problem and is more flexible, as it takes into account different relationships amongst the input data points. As in previous work, we distinguish amongst the types of repairs permitted (decrease, increase, and general repairs). We focus on the increase and general versions, establish hardness results, and show the inherent combinatorial structure of the problem. We then show that if G is restricted to be a chordal graph, the problem is fixed-parameter tractable. We also present several classes of approximation algorithms. These include and improve upon previous metric repair algorithms for the special case when G = K_n.
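To make the problem statement concrete, the following is a minimal sketch of a consistency check, not the authors' algorithm. It assumes one reading of "the weighted graph distances satisfy a metric": no edge weight may exceed the shortest-path distance between its endpoints. It also assumes nonnegative weights and an undirected graph given as an edge list. The function names `shortest_path_lengths` and `violating_edges` are hypothetical and introduced only for illustration.

```python
import heapq


def shortest_path_lengths(n, adj, source):
    """Dijkstra from `source`; adj[u] is a list of (neighbor, weight) pairs.

    Assumes nonnegative weights (an assumption of this sketch).
    """
    dist = [float("inf")] * n
    dist[source] = 0.0
    heap = [(0.0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in adj[u]:
            nd = d + w
            if nd < dist[v]:
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist


def violating_edges(n, edges, tol=1e-12):
    """Return the edges (u, v, w) whose weight exceeds the shortest-path
    distance between u and v, i.e. edges that some other path undercuts.

    `edges` is a list of (u, v, w) triples over vertices 0..n-1.
    """
    adj = [[] for _ in range(n)]
    for u, v, w in edges:
        adj[u].append((v, w))
        adj[v].append((u, w))
    dists = [shortest_path_lengths(n, adj, s) for s in range(n)]
    return [(u, v, w) for u, v, w in edges if w > dists[u][v] + tol]


# Triangle with one corrupted weight: 5 > 1 + 1, so edge (0, 2) is flagged.
triangle = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 5.0)]
print(violating_edges(3, triangle))

# Decreasing that single weight to 2 repairs the graph.
repaired = [(0, 1, 1.0), (1, 2, 1.0), (0, 2, 2.0)]
print(violating_edges(3, repaired))

# A path has no cycles, so under this reading no weights conflict.
path = [(0, 1, 1.0), (1, 2, 100.0)]
print(violating_edges(3, path))
```

The sketch only detects violations. Choosing the smallest set of weights to modify is the optimization problem the abstract describes, and the check above does not attempt to solve it.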


