Parallel Batch-Dynamic kd-Trees

12/12/2021
by Rahul Yesantharao, et al.

kd-trees are widely used in parallel databases to support efficient neighborhood/similarity queries, so supporting efficient parallel updates to kd-trees is important. In this paper, we present BDL-tree, a parallel, batch-dynamic implementation of a kd-tree that allows for efficient parallel k-NN queries over dynamically changing point sets. BDL-trees consist of a log-structured set of kd-trees, which can be used to efficiently insert or delete batches of points in parallel with polylogarithmic depth. Specifically, given a BDL-tree with n points, each batch of B updates takes O(B log^2(n+B)) amortized work and O(log(n+B) log log(n+B)) depth (parallel time). We provide an optimized multicore implementation of BDL-trees. Our optimizations include parallel cache-oblivious kd-tree construction and parallel Bloom filter construction. Our experiments on a 36-core machine with two-way hyper-threading, using a variety of synthetic and real-world datasets, show that our implementation of BDL-tree achieves a self-relative speedup of up to 34.8× (28.4× on average) for batch insertions, up to 35.5× (27.2× on average) for batch deletions, and up to 46.1× (40.0× on average) for k-nearest neighbor queries. In addition, it achieves throughputs of up to 14.5 million updates/second for batch-parallel updates and 6.7 million queries/second for k-NN queries. We compare to two baseline kd-tree implementations and demonstrate that BDL-trees achieve a good tradeoff between the two baseline options for implementing batch updates.
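
The idea of a log-structured set of kd-trees can be illustrated with a small sequential sketch. It is not the authors' implementation: it assumes the classic logarithmic method, in which level i holds either nothing or a static kd-tree of exactly 2^i points, and a batch insert behaves like binary addition with carries. The names `LogStructuredKdForest`, `build_kdtree`, and `knn_into` are hypothetical, and parallelism, deletions, and the Bloom filter and cache-oblivious optimizations are all omitted.

```python
import heapq


def build_kdtree(points, depth=0):
    """Build a static kd-tree as nested tuples (point, axis, left, right)."""
    if not points:
        return None
    axis = depth % len(points[0])
    pts = sorted(points, key=lambda p: p[axis])
    mid = len(pts) // 2
    return (pts[mid], axis,
            build_kdtree(pts[:mid], depth + 1),
            build_kdtree(pts[mid + 1:], depth + 1))


def knn_into(node, q, k, heap):
    """Update a max-heap of (-squared_distance, point) with the k nearest to q."""
    if node is None:
        return
    point, axis, left, right = node
    d = sum((a - b) ** 2 for a, b in zip(point, q))
    if len(heap) < k:
        heapq.heappush(heap, (-d, point))
    elif d < -heap[0][0]:
        heapq.heapreplace(heap, (-d, point))
    diff = q[axis] - point[axis]
    near, far = (left, right) if diff < 0 else (right, left)
    knn_into(near, q, k, heap)
    # Visit the far side only if the splitting plane is closer than the
    # current k-th best candidate (or we still have fewer than k candidates).
    if len(heap) < k or diff * diff < -heap[0][0]:
        knn_into(far, q, k, heap)


class LogStructuredKdForest:
    """Hypothetical sketch: level i is None or (points, tree) with 2**i points."""

    def __init__(self):
        self.levels = []

    def batch_insert(self, batch):
        pool = list(batch)  # points still waiting for a home
        i = 0
        while pool:
            if i == len(self.levels):
                self.levels.append(None)
            cap = 1 << i
            if len(pool) & cap:
                if self.levels[i] is None:
                    # Empty slot: build a static tree from 2**i pooled points.
                    chunk = pool[-cap:]
                    del pool[-cap:]
                    self.levels[i] = (chunk, build_kdtree(chunk))
                else:
                    # Occupied slot: "carry" its points up to a larger level.
                    pool.extend(self.levels[i][0])
                    self.levels[i] = None
            i += 1

    def sizes(self):
        return [len(level[0]) if level else 0 for level in self.levels]

    def knn(self, q, k):
        heap = []
        for level in self.levels:
            if level is not None:
                knn_into(level[1], q, k, heap)
        # Sort by ascending distance (heap stores negated distances).
        return [p for _, p in sorted(heap, reverse=True)]
```

A query has to visit every non-empty level, which is the price paid for cheap amortized updates: each point is rebuilt into a larger tree only a logarithmic number of times.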
