DEANN: Speeding up Kernel-Density Estimation using Approximate Nearest Neighbor Search

07/06/2021
by   Matti Karppa, et al.
Kernel Density Estimation (KDE) is a nonparametric method for estimating the shape of a density function, given a set of samples from the distribution. Recently, locality-sensitive hashing, originally proposed as a tool for nearest neighbor search, has been shown to enable fast KDE data structures. However, these approaches do not take advantage of the many other advances that have been made in algorithms for nearest neighbor search. We present an algorithm called Density Estimation from Approximate Nearest Neighbors (DEANN), in which we apply Approximate Nearest Neighbor (ANN) algorithms as a black-box subroutine to compute an unbiased estimate of the KDE. The idea is to use ANN to find the points that contribute most to the KDE, compute their contribution exactly, and approximate the remainder with Random Sampling (RS). We present a theoretical argument that supports the idea that an ANN subroutine can speed up the evaluation. Furthermore, we provide a C++ implementation with a Python interface that can use an arbitrary ANN implementation as a subroutine for KDE evaluation. We show empirically that our implementation outperforms state-of-the-art implementations on all high-dimensional datasets we considered, and matches the performance of RS in cases where ANN yields no gain in performance.
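The split described in the abstract (exact contribution from near neighbors, random sampling for the rest) can be illustrated with a minimal Python sketch. This is not the authors' implementation or API: the function names, the Gaussian kernel, the bandwidth parameter `h`, and the brute-force neighbor search standing in for a real ANN index are all assumptions made for illustration. The remainder is sampled uniformly with replacement and rescaled by its size, which keeps the estimate unbiased whenever the neighbor set is fixed before sampling.

```python
import math
import random


def gaussian_kernel(q, x, h):
    """Gaussian kernel with bandwidth h (an assumed kernel choice)."""
    d2 = sum((a - b) ** 2 for a, b in zip(q, x))
    return math.exp(-d2 / (2.0 * h * h))


def exact_kde(q, data, h):
    """Reference value: average kernel value over all data points."""
    return sum(gaussian_kernel(q, x, h) for x in data) / len(data)


def nearest_indices(q, data, k):
    """Brute-force k-nearest neighbors; a stand-in for an ANN index."""
    def sq_dist(i):
        return sum((a - b) ** 2 for a, b in zip(q, data[i]))
    return sorted(range(len(data)), key=sq_dist)[:k]


def deann_style_estimate(q, data, h, k, m, rng):
    """Hypothetical sketch: exact near part plus sampled remainder."""
    n = len(data)
    near = nearest_indices(q, data, k)
    near_sum = sum(gaussian_kernel(q, data[i], h) for i in near)

    # Materializing the remainder costs O(n); done here only for clarity.
    near_set = set(near)
    rest = [i for i in range(n) if i not in near_set]
    if not rest:
        return near_sum / n
    if m < 1:
        raise ValueError("m must be at least 1 when points remain")

    # Sample uniformly with replacement and rescale by the remainder size,
    # so the sampled term has the remainder's true sum as its expectation.
    sample = [rng.choice(rest) for _ in range(m)]
    rest_sum = len(rest) / m * sum(
        gaussian_kernel(q, data[j], h) for j in sample
    )
    return (near_sum + rest_sum) / n


rng = random.Random(0)
data = [(rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)) for _ in range(200)]
query = (0.0, 0.0)
estimate = deann_style_estimate(query, data, h=1.0, k=10, m=20, rng=rng)
reference = exact_kde(query, data, h=1.0)
```

With `k = 0` the sketch reduces to plain random sampling, and with `k = len(data)` it reduces to the exact sum, which is one way to read the abstract's remark that the method falls back to RS performance when the neighbor search brings no gain.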


