Learning Nearest Neighbor Graphs from Noisy Distance Samples

05/30/2019
by Blake Mason, et al.

We consider the problem of learning the nearest neighbor graph of a dataset of n items. The metric is unknown, but we can query an oracle to obtain a noisy estimate of the distance between any pair of items. This framework applies to problem domains where one wants to learn people's preferences from responses that are commonly modeled as noisy distance judgments. In this paper, we propose an active algorithm that finds the graph with high probability, and we analyze its query complexity. In contrast to existing work that forces Euclidean structure, our method is valid for general metrics, assuming only symmetry and the triangle inequality. Furthermore, we demonstrate the efficiency of our method both empirically and theoretically: it needs only $O(n \log(n) \Delta^{-2})$ queries in favorable settings, where $\Delta^{-2}$ accounts for the effect of noise. Using crowd-sourced data collected for a subset of the UT Zappos50K dataset, we apply our algorithm to learn which shoes people believe are most similar and show that it beats both an active baseline and ordinal embedding.
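
To make the query model concrete, the following is a minimal sketch of learning a nearest neighbor graph from a noisy distance oracle. It is not the algorithm proposed in the paper: it uses plain successive elimination, a standard bandit technique, run independently for each item, and it does not exploit symmetry or the triangle inequality. The oracle (Euclidean distance plus Gaussian noise), the function names, and the parameters sigma, delta, and max_rounds are all assumptions made for illustration.

```python
import math
import random


def make_noisy_oracle(points, sigma, rng):
    """Hypothetical oracle: true Euclidean distance plus Gaussian noise."""
    def query(i, j):
        return math.dist(points[i], points[j]) + rng.gauss(0.0, sigma)
    return query


def nearest_neighbor(i, n, query, sigma, delta=0.05, max_rounds=20000):
    """Successive elimination over the candidate neighbors of item i (n >= 2)."""
    active = [j for j in range(n) if j != i]
    sums = {j: 0.0 for j in active}
    t = 0
    while len(active) > 1 and t < max_rounds:
        t += 1
        # Query every surviving candidate once per round.
        for j in active:
            sums[j] += query(i, j)
        # Anytime confidence radius for sigma-sub-Gaussian noise, from a
        # union bound over candidates and rounds (an assumed choice).
        radius = sigma * math.sqrt(2.0 * math.log(4.0 * n * t * t / delta) / t)
        best_upper = min(sums[j] / t for j in active) + radius
        # Drop candidates whose lower bound exceeds the best upper bound.
        active = [j for j in active if sums[j] / t - radius <= best_upper]
    # All survivors have the same sample count, so comparing sums suffices.
    return min(active, key=lambda j: sums[j])


def nearest_neighbor_graph(points, sigma=0.1, seed=0):
    """Return a list where entry i is the estimated nearest neighbor of i."""
    rng = random.Random(seed)
    query = make_noisy_oracle(points, sigma, rng)
    n = len(points)
    return [nearest_neighbor(i, n, query, sigma) for i in range(n)]


if __name__ == "__main__":
    pts = [(0.0, 0.0), (1.0, 0.0), (3.0, 0.0), (7.0, 0.0), (7.5, 0.0)]
    print(nearest_neighbor_graph(pts))
```

In this sketch, the number of rounds needed to separate two candidates grows roughly with the inverse square of the gap between their distances, which is the kind of noise dependence the $\Delta^{-2}$ term in the abstract refers to. Running each item independently costs more queries than an approach that shares information across items.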
