The Effect of Points Dispersion on the k-nn Search in Random Projection Forests

02/25/2023
by Mashaan Alshammari, et al.

Partitioning trees are efficient data structures for k-nearest neighbor (k-nn) search. Machine learning libraries commonly use a special type of partitioning tree, the kd-tree, to perform k-nn search. Unfortunately, kd-trees can be ineffective in high dimensions because they need more tree levels to decrease the vector quantization (VQ) error. Random projection trees (rpTrees) solve this scalability problem by using random directions to split the data. A collection of rpTrees is called an rpForest. k-nn search in an rpForest is influenced by two factors: 1) the dispersion of points along the random direction and 2) the number of rpTrees in the rpForest. In this study, we investigate how these two factors affect k-nn search with varying values of k and different datasets. We found that with a larger number of trees, the dispersion of points has a very limited effect on k-nn search. One should therefore use the original rpTree algorithm, which picks a random direction regardless of the dispersion of points.
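
To make the two factors concrete, here is a minimal sketch of k-nn search in an rpForest. It is an illustration under stated assumptions, not the authors' implementation: each node splits at the median of the projections (a simplification), and the names `build_tree`, `find_leaf`, `knn`, `leaf_size`, and `n_candidates` are hypothetical. Setting `n_candidates=1` picks a single random direction, as in the original rpTree; a larger value keeps the candidate direction with the highest dispersion (variance) of projected points.

```python
import math
import random


def random_direction(dim, rng):
    """Draw a random unit vector from an isotropic Gaussian."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v)) or 1.0
    return [x / norm for x in v]


def project(point, direction):
    """Scalar projection of a point onto a direction."""
    return sum(p * d for p, d in zip(point, direction))


def dispersion(points, idx, direction):
    """Variance of the projections of points[idx] along a direction."""
    projs = [project(points[i], direction) for i in idx]
    mean = sum(projs) / len(projs)
    return sum((p - mean) ** 2 for p in projs) / len(projs)


def build_tree(points, idx, rng, leaf_size=10, n_candidates=1):
    """Recursively split the index list along a random direction.

    Assumption: the split point is the median projection. With
    n_candidates > 1, the most dispersed candidate direction is kept.
    """
    if len(idx) <= leaf_size:
        return {"leaf": idx}
    dim = len(points[0])
    candidates = [random_direction(dim, rng) for _ in range(n_candidates)]
    direction = max(candidates, key=lambda d: dispersion(points, idx, d))
    projs = sorted((project(points[i], direction), i) for i in idx)
    mid = len(projs) // 2
    return {
        "dir": direction,
        "thr": projs[mid][0],
        "left": build_tree(points, [i for _, i in projs[:mid]], rng,
                           leaf_size, n_candidates),
        "right": build_tree(points, [i for _, i in projs[mid:]], rng,
                            leaf_size, n_candidates),
    }


def find_leaf(tree, query):
    """Walk from the root to the leaf whose cell contains the query."""
    while "leaf" not in tree:
        side = "left" if project(query, tree["dir"]) < tree["thr"] else "right"
        tree = tree[side]
    return tree["leaf"]


def knn(points, forest, query, k):
    """Approximate k-nn: pool the query's leaf from every tree, then rank."""
    candidates = set()
    for tree in forest:
        candidates.update(find_leaf(tree, query))
    return sorted(candidates, key=lambda i: math.dist(points[i], query))[:k]


# Usage: 200 random 8-dimensional points, a forest of 10 trees.
rng = random.Random(0)
points = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(200)]
forest = [build_tree(points, list(range(len(points))), rng) for _ in range(10)]
print(knn(points, forest, points[0], k=3))
```

Each extra tree adds another leaf to the candidate pool, which gives some intuition for why the choice of direction in any single tree may matter less as the forest grows.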


research
02/25/2023

Random projection tree similarity metric for SpectralNet

SpectralNet is a graph clustering method that uses neural network to fin...
research
12/31/2018

K-nearest Neighbor Search by Random Projection Forests

K-nearest neighbor (kNN) search has wide applications in many areas, inc...
research
05/09/2012

Which Spatial Partition Trees are Adaptive to Intrinsic Dimension?

Recent theory work has found that a special type of spatial partition tr...
research
02/22/2023

Random Projection Forest Initialization for Graph Convolutional Networks

Graph convolutional networks (GCNs) were a great step towards extending ...
research
11/07/2019

Efficient Spatial Nearest Neighbor Queries Based on Multi-layer Voronoi Diagrams

Nearest neighbor (NN) problem is an important scientific problem. The NN...
research
09/23/2015

Fast k-NN search

Efficient index structures for fast approximate nearest neighbor queries...
research
06/18/2012

Approximate Principal Direction Trees

We introduce a new spatial data structure for high dimensional data call...
