Efficient Random Sampling - Parallel, Vectorized, Cache-Efficient, and Online

10/17/2016
by Peter Sanders et al.

We consider the problem of sampling n numbers from the range {1,...,N} without replacement on modern architectures. The main result is a simple divide-and-conquer scheme that makes sequential algorithms more cache efficient and leads to a parallel algorithm running in expected time O(n/p + p) on p processors. The amount of communication between the processors is very small and independent of the sample size. We also discuss modifications needed for load balancing, reservoir sampling, online sampling, sampling with replacement, Bernoulli sampling, and vectorization on SIMD units or GPUs.
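The divide-and-conquer idea can be sketched as follows: to sample n items from a range, split the range in half, draw one hypergeometric variate to decide how many of the n samples fall into the left half, and recurse on the two halves independently (which is what makes the scheme parallelizable and cache friendly). Below is a minimal sequential Python sketch of this idea; the function name, the base-case threshold, and the rejection-sampling base case are illustrative choices of ours, not the paper's implementation:

```python
import numpy as np

def sample_range(n, lo, hi, rng, base=256):
    """Draw a sorted uniform sample of n distinct integers from {lo, ..., hi}."""
    N = hi - lo + 1
    assert 0 <= n <= N
    if N <= base or n <= 2:
        # Small base case: rejection sampling into a set (illustrative choice).
        seen = set()
        while len(seen) < n:
            seen.add(int(rng.integers(lo, hi + 1)))
        return sorted(seen)
    # Split the range in half; the number of samples landing in the
    # left half follows a hypergeometric distribution.
    mid = lo + N // 2
    left = mid - lo  # size of {lo, ..., mid - 1}
    n_left = int(rng.hypergeometric(left, N - left, n))
    # The two halves are now independent subproblems, so a parallel
    # version can hand them to different processors.
    return (sample_range(n_left, lo, mid - 1, rng, base)
            + sample_range(n - n_left, mid, hi, rng, base))
```

Because the two recursive calls cover disjoint subranges and each returns a sorted list, the concatenated result is already sorted, a convenient side effect of the range splitting.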


Related research

03/01/2019 - Parallel Weighted Random Sampling
Data structures for efficient sampling from a set of weighted items are ...

04/11/2021 - Simple, Optimal Algorithms for Random Sampling Without Replacement
Consider the fundamental problem of drawing a simple random sample of si...

01/02/2019 - Massively Parallel Construction of Radix Tree Forests for the Efficient Sampling of Discrete Probability Distributions
We compare different methods for sampling from discrete probability dist...

08/29/2018 - Consistent Sampling with Replacement
We describe a very simple method for `consistent sampling' that allows f...

09/30/2017 - An Efficient Load Balancing Method for Tree Algorithms
Nowadays, multiprocessing is mainstream with exponentially increasing nu...

01/29/2018 - Temporally-Biased Sampling for Online Model Management
To maintain the accuracy of supervised learning models in the presence o...

10/24/2019 - Communication-Efficient (Weighted) Reservoir Sampling
We consider communication-efficient weighted and unweighted (uniform) ra...
