Efficient Classification for Metric Data

06/11/2013
by Lee-Ad Gottlieb, et al.

Recent advances in large-margin classification of data residing in general metric spaces (rather than Hilbert spaces) enable classification under various natural metrics, such as string edit and earthmover distance. A general framework developed for this purpose by von Luxburg and Bousquet [JMLR, 2004] left open the questions of computational efficiency and of providing direct bounds on generalization error. We design a new algorithm for classification in general metric spaces, whose runtime and accuracy depend on the doubling dimension of the data points, and can thus achieve superior classification performance in many common scenarios. The algorithmic core of our approach is an approximate (rather than exact) solution to the classical problems of Lipschitz extension and of Nearest Neighbor Search. The algorithm's generalization performance is guaranteed via the fat-shattering dimension of Lipschitz classifiers, and we present experimental evidence of its superiority to some common kernel methods. As a by-product, we offer a new perspective on the nearest neighbor classifier, which yields significantly sharper risk asymptotics than the classic analysis of Cover and Hart [IEEE Trans. Info. Theory, 1967].
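To make the Lipschitz-extension idea concrete, here is a minimal sketch of a metric-space classifier under string edit distance. It is not the algorithm of the paper: it computes the exact McShane-Whitney style extension by brute force over the whole sample, whereas the abstract describes an approximate solution whose runtime depends on the doubling dimension. The function names (`edit_distance`, `lipschitz_constant`, `lipschitz_extension`, `classify`) and the toy sample are assumptions made for illustration only.

```python
from typing import Callable, List, Tuple

Sample = List[Tuple[str, int]]          # (point, label) pairs, labels in {+1, -1}
Metric = Callable[[str, str], float]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,               # delete ca
                            curr[j - 1] + 1,           # insert cb
                            prev[j - 1] + (ca != cb))) # substitute or match
        prev = curr
    return prev[-1]


def lipschitz_constant(sample: Sample, dist: Metric) -> float:
    """Smallest L consistent with the labels: 2 / (closest opposite-label pair)."""
    gaps = [dist(p, q) for p, yp in sample for q, yq in sample if yp != yq]
    if not gaps or min(gaps) == 0:
        raise ValueError("need both labels and no conflicting duplicate points")
    return 2.0 / min(gaps)


def lipschitz_extension(x: str, sample: Sample, dist: Metric, L: float) -> float:
    """Average of the largest and smallest L-Lipschitz extensions of the labels."""
    upper = min(y + L * dist(x, p) for p, y in sample)
    lower = max(y - L * dist(x, p) for p, y in sample)
    return 0.5 * (upper + lower)


def classify(x: str, sample: Sample, dist: Metric = edit_distance) -> int:
    """Predict the sign of the Lipschitz extension at x (brute force, O(n^2) setup)."""
    L = lipschitz_constant(sample, dist)
    return 1 if lipschitz_extension(x, sample, dist, L) >= 0 else -1


if __name__ == "__main__":
    # Hypothetical toy data, not taken from the paper's experiments.
    sample = [("kitten", 1), ("sitten", 1), ("dog", -1), ("dig", -1)]
    for query in ("mitten", "dug"):
        print(query, classify(query, sample))
```

In this sketch the extension reproduces the training labels exactly on the sample points, and the predicted sign at a new point follows the label of its nearest sample point, which is one way to see the connection to the nearest neighbor classifier mentioned in the abstract. Each query costs a full pass over the sample; replacing that pass with an approximate nearest neighbor structure is where the efficiency gains described above would come from.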

research
07/08/2020

A Nearest Neighbor Characterization of Lebesgue Points in Metric Measure Spaces

The property of almost every point being a Lebesgue point has proven to ...
research
09/22/2017

Intrinsic Metrics: Exact Equality between a Geodesic Metric and a Graph metric

Some researchers have proposed using non-Euclidean metrics for clusterin...
research
10/15/2016

Generalization of metric classification algorithms for sequences classification and labelling

The article deals with the issue of modification of metric classificatio...
research
06/24/2007

Metric Embedding for Nearest Neighbor Classification

The distance metric plays an important role in nearest neighbor (NN) cla...
research
10/21/2021

How can classical multidimensional scaling go wrong?

Given a matrix D describing the pairwise dissimilarities of a data set, ...
research
11/21/2022

Labeled Nearest Neighbor Search and Metric Spanners via Locality Sensitive Orderings

Chan, Har-Peled, and Jones [SICOMP 2020] developed locality-sensitive or...
research
08/16/2023

Two Phases of Scaling Laws for Nearest Neighbor Classifiers

A scaling law refers to the observation that the test performance of a m...
