Parallel Weighted Random Sampling

Data structures for efficient sampling from a set of weighted items are an important building block of many applications, yet few parallel solutions are known. We close many of these gaps for both shared-memory and distributed-memory machines. We give efficient, fast, and practical algorithms for sampling single items, sampling k items with or without replacement, sampling permutations, subset sampling, and reservoir sampling. Our output-sensitive algorithm for sampling with replacement also improves on the state of the art for sequential algorithms.
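To make the problem statements concrete, here is a minimal sequential sketch of two of the sampling tasks named above: k items with replacement and k items without replacement. It is not the parallel method of this paper. The first function uses the textbook prefix-sum plus binary-search approach, and the second uses the exponent-key technique of Efraimidis and Spirakis as a stand-in. The function names and the example weights are illustrative assumptions.

```python
import bisect
import heapq
import itertools
import random


def sample_with_replacement(weights, k, rng):
    """Draw k indices, each chosen with probability proportional to its weight."""
    # prefix[i] = weights[0] + ... + weights[i]
    prefix = list(itertools.accumulate(weights))
    total = prefix[-1]
    # bisect_right returns the first index whose prefix sum exceeds the draw,
    # so items with zero weight are never selected.
    return [bisect.bisect_right(prefix, rng.random() * total) for _ in range(k)]


def sample_without_replacement(weights, k, rng):
    """Draw up to k distinct indices using keys u ** (1 / w) (Efraimidis-Spirakis)."""
    # Items with non-positive weight are skipped; they can never be sampled.
    keys = ((rng.random() ** (1.0 / w), i) for i, w in enumerate(weights) if w > 0)
    # The k items with the largest keys form the sample.
    return [i for _, i in heapq.nlargest(k, keys)]


if __name__ == "__main__":
    rng = random.Random(42)           # fixed seed only for reproducibility
    weights = [0.0, 1.0, 3.0, 6.0]    # hypothetical example weights
    print(sample_with_replacement(weights, 5, rng))
    print(sample_without_replacement(weights, 2, rng))
```

Both functions take time at least linear in the number of items per call. A practical data structure would build the prefix sums (or an alias table) once and reuse them across queries.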



Communication-Efficient (Weighted) Reservoir Sampling

We consider communication-efficient weighted and unweighted (uniform) ra...

Efficient Random Sampling - Parallel, Vectorized, Cache-Efficient, and Online

We consider the problem of sampling n numbers from the range {1,...,N} w...

Weighted Reservoir Sampling from Distributed Streams

We consider message-efficient continuous random sampling from a distribu...

Consistent Sampling with Replacement

We describe a very simple method for 'consistent sampling' that allows f...

Analysis of Crowdsourced Sampling Strategies for HodgeRank with Sparse Random Graphs

Crowdsourcing platforms are now extensively used for conducting subjecti...

Differentiable Subset Sampling

Many machine learning tasks require sampling a subset of items from a co...

Weighted Random Sampling on GPUs

An alias table is a data structure that allows for efficiently drawing w...
