Parallel Adaptive Sampling with almost no Synchronization

Approximation via sampling is a widespread technique whenever exact solutions are too expensive. In this paper, we present techniques for efficiently parallelizing adaptive (a.k.a. progressive) sampling algorithms on multi-threaded shared-memory machines. Our basic algorithmic technique requires no synchronization except for atomic load-acquire and store-release operations. It does, however, require O(n) memory per thread, where n is the size of the sampling state. We present variants of the algorithm that either reduce this memory consumption to O(1) or ensure that deterministic results are obtained. Using the KADABRA algorithm for approximating betweenness centrality (a popular measure in network analysis) as a case study, we demonstrate the empirical performance of our techniques. In particular, on a 32-core machine, our best algorithm is 2.9x faster than what we could achieve using a straightforward OpenMP-based parallelization and 65.3x faster than the existing implementation of KADABRA.
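To illustrate the kind of synchronization the abstract refers to, the following is a minimal C++ sketch, not the algorithm from the paper. Each sampling thread keeps a private state of size n and publishes immutable snapshots of it with a store-release; a coordinating thread reads them with a load-acquire and decides when to stop. All names (`StateFrame`, `Sampler`, `adaptive_sample`) are hypothetical, the uniform random draws stand in for real samples, and the stopping condition (a fixed sample count) is a placeholder for a real statistical check. For simplicity the sketch keeps every snapshot alive until the end, which a real implementation would avoid.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <random>
#include <thread>
#include <vector>

// Hypothetical sampling state: one counter per item plus the sample count.
struct StateFrame {
    std::vector<std::uint64_t> counts;
    std::uint64_t num_samples = 0;
};

// Per-thread data (assumption: O(n) private memory per thread).
struct Sampler {
    std::atomic<const StateFrame*> published{nullptr};   // latest snapshot
    std::vector<std::unique_ptr<StateFrame>> snapshots;  // owner thread only
    StateFrame local;                                    // owner thread only
};

// Samples until at least `target` samples are visible to the coordinator.
StateFrame adaptive_sample(std::size_t n, unsigned num_threads,
                           std::uint64_t target, std::uint64_t batch = 1000) {
    std::atomic<bool> stop{false};
    std::vector<Sampler> samplers(num_threads);
    std::vector<std::thread> threads;

    for (unsigned t = 0; t < num_threads; ++t) {
        samplers[t].local.counts.assign(n, 0);
        threads.emplace_back([&, t] {
            Sampler& s = samplers[t];
            std::mt19937_64 rng(12345 + t);
            std::uniform_int_distribution<std::size_t> pick(0, n - 1);
            while (!stop.load(std::memory_order_acquire)) {
                for (std::uint64_t i = 0; i < batch; ++i) {
                    ++s.local.counts[pick(rng)];  // placeholder "sample"
                    ++s.local.num_samples;
                }
                // Copy the private state, then publish the copy. The
                // store-release makes the copy's contents visible to any
                // thread that load-acquires the pointer.
                s.snapshots.push_back(std::make_unique<StateFrame>(s.local));
                s.published.store(s.snapshots.back().get(),
                                  std::memory_order_release);
            }
        });
    }

    // Coordinator: aggregate published snapshots without locks or barriers.
    while (true) {
        std::uint64_t total = 0;
        for (auto& s : samplers) {
            const StateFrame* f = s.published.load(std::memory_order_acquire);
            if (f) total += f->num_samples;
        }
        if (total >= target) break;  // placeholder stopping condition
        std::this_thread::yield();
    }
    stop.store(true, std::memory_order_release);
    for (auto& th : threads) th.join();

    // After joining, the private states can be read directly.
    StateFrame result;
    result.counts.assign(n, 0);
    for (auto& s : samplers) {
        for (std::size_t i = 0; i < n; ++i) result.counts[i] += s.local.counts[i];
        result.num_samples += s.local.num_samples;
    }
    return result;
}
```

Calling `adaptive_sample(16, 4, 100000)` would run four samplers over 16 items until the coordinator has seen at least 100000 samples. Because the threads keep sampling while the coordinator checks, the returned count is usually somewhat larger than the target and varies between runs, which is the kind of nondeterminism the deterministic variant mentioned in the abstract is meant to address.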


