Fast Distance Oracles for Any Symmetric Norm

05/30/2022
โˆ™
by   Yichuan Deng, et al.
โˆ™
0
โˆ™

In the Distance Oracle problem, the goal is to preprocess n vectors x_1, x_2, โ‹ฏ, x_n in a d-dimensional metric space (๐•^d, ยท_l) into a cheap data structure, so that given a query vector q โˆˆ๐•^d and a subset SโŠ† [n] of the input data points, all distances q - x_i _l for x_iโˆˆ S can be quickly approximated (faster than the trivial โˆผ d|S| query time). This primitive is a basic subroutine in machine learning, data mining and similarity search applications. In the case of โ„“_p norms, the problem is well understood, and optimal data structures are known for most values of p. Our main contribution is a fast (1+ฮต) distance oracle for any symmetric norm ยท_l. This class includes โ„“_p norms and Orlicz norms as special cases, as well as other norms used in practice, e.g. top-k norms, max-mixture and sum-mixture of โ„“_p norms, small-support norms and the box-norm. We propose a novel data structure with ร•(n (d + mmc(l)^2 ) ) preprocessing time and space, and t_q = ร•(d + |S| ยทmmc(l)^2) query time, for computing distances to a subset S of data points, where mmc(l) is a complexity-measure (concentration modulus) of the symmetric norm. When l = โ„“_p , this runtime matches the aforementioned state-of-art oracles.

READ FULL TEXT

page 1

page 2

page 3

page 4

research
โˆ™ 02/07/2019

On Mean Estimation for General Norms with Statistical Queries

We study the problem of mean estimation for high-dimensional distributio...
research
โˆ™ 05/10/2021

Near Neighbor Search via Efficient Average Distortion Embeddings

A recent series of papers by Andoni, Naor, Nikolov, Razenshteyn, and Wai...
research
โˆ™ 01/22/2021

On the robustness of certain norms

We study a family of norms defined for functions on an interval. These n...
research
โˆ™ 05/28/2015

A Bounded p-norm Approximation of Max-Convolution for Sub-Quadratic Bayesian Inference on Additive Factors

Max-convolution is an important problem closely resembling standard conv...
research
โˆ™ 07/09/2023

Private Data Stream Analysis for Universal Symmetric Norm Estimation

We study how to release summary statistics on a data stream subject to t...
research
โˆ™ 04/10/2002

The Geometric Maximum Traveling Salesman Problem

We consider the traveling salesman problem when the cities are points in...
research
โˆ™ 10/29/2017

If it ain't broke, don't fix it: Sparse metric repair

Many modern data-intensive computational problems either require, or ben...

Please sign up or login with your details

Forgot password? Click here to reset