A Data-Dependent Distance for Regression

04/30/2018
by Jeff M. Phillips, et al.

We develop a new data-dependent distance for regression problems that compares two regressors (hyperplanes that fit or divide a data set). Most distances between objects attempt to capture the intrinsic geometry of these objects and measure how similar that geometry is. However, we argue that existing measures are inappropriate for regressors to a data set. We introduce a family of new distances that measure how similarly two regressors interact with the data they attempt to fit. For the variant we advocate, we show that it is a metric (under mild assumptions), induces metric balls with bounded VC-dimension, is robust to changes in the corresponding data, and can be approximated quickly. We show a simple extension to trajectories that inherits these properties, as well as several other algorithmic applications. Moreover, to develop efficient approximation algorithms for this distance, we formalize the relationship between sensitivity and leverage scores, which may be of independent interest.
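
The abstract does not state the formula for the distance, so the following is only a minimal sketch of the general idea of comparing two regressors by how they interact with the data. It assumes, purely for illustration, that each hyperplane is given as a (normal, offset) pair with a fixed orientation, and that the distance is the root-mean-square difference between the signed distances from each data point to the two hyperplanes. The function names `signed_distance` and `data_dependent_distance` and the sample data are hypothetical and are not taken from the paper.

```python
import math

def signed_distance(point, normal, offset):
    """Signed distance from point to the hyperplane {x : <normal, x> + offset = 0}."""
    norm = math.sqrt(sum(a * a for a in normal))
    return (sum(a * x for a, x in zip(normal, point)) + offset) / norm

def data_dependent_distance(points, h1, h2):
    """Assumed form: RMS gap between each point's signed distances to h1 and h2."""
    total = 0.0
    for p in points:
        gap = signed_distance(p, *h1) - signed_distance(p, *h2)
        total += gap * gap
    return math.sqrt(total / len(points))

# Hypothetical 2-D data and two lines, each written as (normal, offset).
points = [(0.0, 0.0), (1.0, 1.0), (2.0, 2.0)]
line_a = ((0.0, 1.0), 0.0)    # the line y = 0
line_b = ((0.0, 1.0), -1.0)   # the line y = 1

print(data_dependent_distance(points, line_a, line_b))
```

Under this assumed form, the value depends on where the data lie: two hyperplanes that differ mainly far from the data can still be close, which is the behavior the abstract argues purely geometric measures do not capture.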


research
02/25/2022

The k-outlier Fréchet distance

The Fréchet distance is a popular metric for curves; however, its bottle...
research
04/21/2022

Lipschitz (non-)equivalence of the Gromov–Hausdorff distances, including on ultrametric spaces

The Gromov–Hausdorff distance measures the difference in shape between c...
research
06/23/2020

ABID: Angle Based Intrinsic Dimensionality

The intrinsic dimensionality refers to the “true” dimensionality of the ...
research
12/20/2022

Identifying latent distances with Finslerian geometry

Riemannian geometry provides powerful tools to explore the latent space ...
research
06/17/2022

Distances for Comparing Multisets and Sequences

Measuring the distance between data points is fundamental to many statis...
research
10/29/2017

If it ain't broke, don't fix it: Sparse metric repair

Many modern data-intensive computational problems either require, or ben...
research
12/13/2018

The Relationship Between the Intrinsic Cech and Persistence Distortion Distances for Metric Graphs

Metric graphs are meaningful objects for modeling complex structures tha...
